Hi James,

I understand the concern about nested documents and the complexity they bring, but that complexity should not carry over to multi-value attributes stored as a flat list of values. Am I missing something?

Cheers,
Ali

On Thu, Apr 6, 2017 at 6:35 AM, James Sirota <jsir...@apache.org> wrote:

> That is correct, nested JSON structures introduced a high level of
> complexity for Solr. We did this work a while ago, but if I remember
> correctly we had to have child documents for nested pieces, so for one
> ingest of a complex JSON you could end up with multiple documents indexed.
> Template definition and management became more difficult as well and there
> were also some gotchas on the query side. For those reasons it became much
> simpler to just flatten everything.
>
> 04.04.2017, 05:46, "Nick Allen" <n...@nickallen.org>:
>
> I have no knowledge of specifics. I was not involved with the original
> problem. I just know that I had to flatten the output from threat triage
> based on these concerns as part of PR #438 [1].
>
> [1] https://github.com/apache/incubator-metron/pull/438
>
> On Mon, Apr 3, 2017 at 8:11 PM, Ali Nazemian <alinazem...@gmail.com> wrote:
>
> Thanks, Nick.
>
> Can you give me more information on what the problem with Solr indexing
> was in the first place? I've got some experience with Solr so I might be
> able to help to fix that situation.
>
> Regards,
> Ali
>
> On Mon, Apr 3, 2017 at 11:55 PM, Nick Allen <n...@nickallen.org> wrote:
>
> Up to this point, we have been making the assumption that we need to
> "flatten" complex data types like lists and maps before they get indexed.
> For example, a list like this...
>
> {
>   users: [ mary, alice, bob ]
> }
>
> is flattened and ends up looking like this...
>
> {
>   users.0: mary,
>   users.1: alice,
>   users.2: bob
> }
>
> The goal of the JIRA that I referenced is to make each indexer responsible
> for transforming the message in whatever way necessary to correctly index
> the data. This way enrichments and transformations that occur upstream
> don't have to worry about this.
>
> I *think* the specific issue is that Solr indexing may not work with
> complex data types like lists and maps in some scenarios. I *think*
> Elasticsearch indexing may be fine. Others may have more insight, but this
> is what I remember. It is probably worth the effort to validate this in
> your environment and see if any problems arise. It should be fairly simple
> to validate.
>
> On Sun, Apr 2, 2017 at 10:50 PM, Ali Nazemian <alinazem...@gmail.com> wrote:
>
> Thank you very much, Nick. I was not aware of the fact that Metron does
> not support the multi-value attribute. So, in this case, I need to have a
> Stellar function to deal with splitting data and mapping to enrichment CF.
> Is that correct?
>
> Regards,
> Ali
>
> On Mon, Apr 3, 2017 at 6:31 AM, Nick Allen <n...@nickallen.org> wrote:
>
> You could use the programmatic enrichment functions to do this. For
> instance, say you wanted to look up the impacted users in a company
> 'phonebook' to get more information.
>
> "impacted-user-0": ENRICHMENT_GET("phonebook", GET(user_ids, 0), "tb", "cf")
> "impacted-user-1": ENRICHMENT_GET("phonebook", GET(user_ids, 1), "tb", "cf")
> "impacted-user-2": ENRICHMENT_GET("phonebook", GET(user_ids, 2), "tb", "cf")
>
> Also note that there is an open JIRA to ensure that all of the index
> destinations can handle complex types in the message JSON. This may or may
> not impact your use case, but something to keep in mind.
>
> https://issues.apache.org/jira/browse/METRON-735
>
> On Sun, Apr 2, 2017 at 10:26 AM, Ali Nazemian <alinazem...@gmail.com> wrote:
>
> Hi all,
>
> I was wondering how I can achieve the following use case in the current
> version of Metron?
>
> I want to have attributes in the Metron JSON object that are an array.
> For example, if a threat is impacting multiple users, they are all
> contained in an attribute (e.g. user_id:[id1, id2, id3]).
> Now if I want to enrich the event with data that requires the user_id as a
> key in enrichment stored in HBASE, how would I do this?
>
> Cheers,
> Ali
>
> --
> A.Nazemian
>
> --
> A.Nazemian
>
> -------------------
> Thank you,
>
> James Sirota
> PPMC- Apache Metron (Incubating)
> jsirota AT apache DOT org

--
A.Nazemian
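
For reference, the flattening discussed in this thread (a list such as users becoming users.0, users.1, users.2) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Metron's actual implementation: the function name `flatten`, the dotted-key separator for nested maps, and the choice to drop empty lists and maps are all hypothetical.

```python
def flatten(message, prefix=""):
    """Flatten nested dicts and lists into one level with dotted keys.

    Hypothetical helper for illustration; not taken from Metron.
    """
    if isinstance(message, dict):
        items = message.items()
    elif isinstance(message, list):
        # List positions become numeric key segments: users.0, users.1, ...
        items = enumerate(message)
    else:
        # Scalar value: store it under the key built so far.
        return {prefix: message}

    flat = {}
    for key, value in items:
        child = f"{prefix}.{key}" if prefix else str(key)
        flat.update(flatten(value, child))
    # Note: an empty list or map contributes no keys at all.
    return flat


print(flatten({"users": ["mary", "alice", "bob"]}))
# {'users.0': 'mary', 'users.1': 'alice', 'users.2': 'bob'}
```

A multi-value attribute, by contrast, would keep users as a single field holding the list, which is the case Ali is asking about.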