[ 
https://issues.apache.org/jira/browse/METRON-735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15878721#comment-15878721
 ] 

Nick Allen commented on METRON-735:
-----------------------------------

~cestella commented on this issue:

https://github.com/apache/incubator-metron/pull/438

I'm of the opinion that flattening should be writer-specific: it should be 
controlled by the writer config, with the default specified by the writer 
implementation. This way we can have our cake and eat it too. Also, we could 
ALREADY be in a situation where messages aren't flat (imagine a Stellar 
function that returns a map or a list, which then gets assigned to a field). 
The only safe way to handle this is to enforce it at the writer, IMO. This is 
one of the stated benefits of extracting writer configs into their own 
structures.
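
To make the idea concrete, here is a rough sketch in Python rather than 
Metron's actual Java code. Everything named here is an assumption for 
illustration only: the "flatten" config key, the two writer classes, and the 
dotted key naming are hypothetical, not existing Metron APIs.

```python
from typing import Any, Dict, Optional


def flatten(message: Dict[str, Any], sep: str = ".") -> Dict[str, Any]:
    """Collapse nested maps and lists into a single-level dict.

    Keys are joined with `sep`; list elements use their index as the key.
    Empty maps and lists produce no output keys.
    """
    flat: Dict[str, Any] = {}

    def walk(prefix: str, value: Any) -> None:
        if isinstance(value, dict):
            for key, child in value.items():
                walk(f"{prefix}{sep}{key}" if prefix else str(key), child)
        elif isinstance(value, (list, tuple)):
            for index, child in enumerate(value):
                walk(f"{prefix}{sep}{index}" if prefix else str(index), child)
        else:
            flat[prefix] = value

    walk("", message)
    return flat


class Writer:
    """Hypothetical base writer: the implementation supplies the default."""

    default_flatten = False

    def __init__(self, config: Optional[Dict[str, Any]] = None) -> None:
        self.config = config or {}

    def should_flatten(self) -> bool:
        # The writer config wins; otherwise fall back to the writer's default.
        return bool(self.config.get("flatten", self.default_flatten))

    def prepare(self, message: Dict[str, Any]) -> Dict[str, Any]:
        return flatten(message) if self.should_flatten() else message


class SolrLikeWriter(Writer):
    """Hypothetical writer for an index that cannot take nested values."""

    default_flatten = True


class NestedCapableWriter(Writer):
    """Hypothetical writer for an index that accepts nested values."""

    default_flatten = False


# A message made non-flat by an enrichment that returned a map.
message = {
    "ip_src_addr": "10.0.0.1",
    "geo": {"city": "Paris", "loc": [48.85, 2.35]},
}

solr_ready = SolrLikeWriter().prepare(message)
untouched = NestedCapableWriter().prepare(message)
forced_flat = NestedCapableWriter({"flatten": True}).prepare(message)
```

With this shape, parsers and enrichments never need to know which index is 
downstream; each writer flattens (or not) at the last possible moment, and a 
user can override the default per writer.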

Regarding existing conventions, this one was around when I joined the project. 
I might be wrong, but it was an early convention. The reasoning, as I recall, 
was multi-fold:

* Solr didn't handle nested structures
* Interacting with complex structures was deemed to be difficult
* Indexing nested structures had some performance implications

> Support Complex Data Types in Telemetry Messages
> ------------------------------------------------
>
>                 Key: METRON-735
>                 URL: https://issues.apache.org/jira/browse/METRON-735
>             Project: Metron
>          Issue Type: Improvement
>            Reporter: Nick Allen
>
> Up until now, we have been working under the assumption that all telemetry 
> messages in Metron must not contain complex data types like lists or maps.  
> Complex data types were 'flattened' so that none remained in the 
> telemetry messages. 
> Most parsers today produce flat telemetry messages; that is, messages with no 
> complex data types.  The problem is that even if I use a parser that 
> generates only flattened data, I could create an enrichment (like using 
> GEO_GET) that appends non-flat data to a message. Thus we have non-flat data 
> coursing through the veins of Metron. ;)
> I think the original reason for flattening data was that one of our Indexers 
> could not handle non-flat, complex data types. At the time, we just decided, 
> well, don't create any non-flat data.
> But now, since we have a completely 'programmable' system, I don't think it 
> is safe to assume that the data will always be flat. A user could create 
> their own Stellar function to use during enrichment. Should we force on them 
> the burden of flattening the data?
> It makes way more sense, in my mind, to have the indexer transform the data 
> however it needs to in order to index it correctly. If the current issue is 
> with the Solr Indexer, then we should fix that to flatten any data that it 
> needs to. There would be one touch point to address this issue rather than 
> many.
> This is a blanket JIRA that might result in multiple sub-tasks of changes 
> needed to allow telemetry messages to contain complex data types like lists 
> and maps.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
