Hi James,

I understand the concern about nested documents and the complexity they
bring, but that complexity should not carry over to multi-valued fields,
that is, attributes whose value is a flat list of scalars. Am I
missing something?
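
To make the distinction concrete, here is a minimal Python sketch of the
flattening described further down the thread. It is an illustration only, not
Metron's implementation: the function name `flatten` and the
`keep_scalar_lists` switch are hypothetical. With the switch on, a list of
scalars is kept as a single multi-valued attribute, while nested maps are
still flattened.

```python
from typing import Any, Dict


def flatten(message: Dict[str, Any], sep: str = ".",
            keep_scalar_lists: bool = False) -> Dict[str, Any]:
    """Flatten nested maps and lists into dotted keys.

    Hypothetical helper for illustration; not part of Metron.
    """
    flat: Dict[str, Any] = {}

    def is_scalar(value: Any) -> bool:
        return not isinstance(value, (dict, list))

    def walk(prefix: str, value: Any) -> None:
        if isinstance(value, dict):
            for key, child in value.items():
                walk(f"{prefix}{sep}{key}" if prefix else str(key), child)
        elif isinstance(value, list):
            if keep_scalar_lists and all(is_scalar(v) for v in value):
                # Keep a flat list of scalars as one multi-valued attribute.
                flat[prefix] = list(value)
            else:
                # Otherwise index each element: users.0, users.1, ...
                for index, child in enumerate(value):
                    walk(f"{prefix}{sep}{index}" if prefix else str(index), child)
        else:
            flat[prefix] = value

    walk("", message)
    return flat


message = {"users": ["mary", "alice", "bob"]}
flattened = flatten(message)
multi_valued = flatten(message, keep_scalar_lists=True)
```

Here `flattened` holds the keys `users.0`, `users.1` and `users.2`, while
`multi_valued` keeps `users` as one list. Whether a given indexer accepts the
second form is exactly the open question in this thread.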

Cheers,
Ali

On Thu, Apr 6, 2017 at 6:35 AM, James Sirota <jsir...@apache.org> wrote:

> That is correct, nested JSON structures introduced a high level of
> complexity for Solr. We did this work a while ago, but if I remember
> correctly we had to have child documents for nested pieces, so one
> ingest of a complex JSON message could end up indexed as multiple documents.
> Template definition and management became more difficult as well and there
> were also some gotchas on the query side. For those reasons it became much
> simpler to just flatten everything.
>
>
> 04.04.2017, 05:46, "Nick Allen" <n...@nickallen.org>:
>
> I have no knowledge of specifics.  I was not involved with the original
> problem.  I just know that I had to flatten the output from threat triage
> based on these concerns as part of PR #438 [1].
>
>
> [1] https://github.com/apache/incubator-metron/pull/438
>
> On Mon, Apr 3, 2017 at 8:11 PM, Ali Nazemian <alinazem...@gmail.com>
> wrote:
>
> Thanks, Nick.
>
> Can you give me more information on what the problem with Solr indexing
> was in the first place? I've got some experience with Solr, so I might be
> able to help fix that situation.
>
> Regards,
> Ali
>
> On Mon, Apr 3, 2017 at 11:55 PM, Nick Allen <n...@nickallen.org> wrote:
>
> Up to this point, we have been making the assumption that we need to
> "flatten" complex data types like lists and maps before they get indexed.
> For example, a list like this...
>
> {
>    users: [ mary, alice, bob ]
> }
>
>
> is flattened and ends up looking like this...
>
> {
>   users.0: mary,
>   users.1: alice,
>   users.2: bob
> }
>
>
> The goal of the JIRA that I referenced is to make each indexer responsible
> for transforming the message in whatever way necessary to correctly index
> the data.  This way enrichments and transformations that occur upstream
> don't have to worry about this.
>
> I *think* the specific issue is that Solr indexing may not work with
> complex data types like lists and maps in some scenarios.  I *think*
> Elasticsearch indexing may be fine.  Others may have more insight, but this
> is what I remember. It is probably worth the effort to validate this in
> your environment and see if any problems arise.  It should be fairly simple
> to validate.
>
>
>
>
>
> On Sun, Apr 2, 2017 at 10:50 PM, Ali Nazemian <alinazem...@gmail.com>
> wrote:
>
> Thank you very much, Nick. I was not aware that Metron does not
> support multi-value attributes. So, in this case, I need a Stellar
> function to split the data and map each value to the enrichment CF.
> Is that correct?
>
> Regards,
> Ali
>
> On Mon, Apr 3, 2017 at 6:31 AM, Nick Allen <n...@nickallen.org> wrote:
>
> You could use the programmatic enrichment functions to do this.  For
> instance, say you wanted to look-up the impacted users in a company
> 'phonebook' to get more information.
>
> "impacted-user-0": ENRICHMENT_GET("phonebook", GET(user_ids, 0), "tb",
> "cf")
>
> "impacted-user-1": ENRICHMENT_GET("phonebook", GET(user_ids, 1), "tb",
> "cf")
>
> "impacted-user-2": ENRICHMENT_GET("phonebook", GET(user_ids, 2), "tb",
> "cf")
>
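
The per-index entries above follow a fixed pattern, so they could be generated
rather than typed by hand. The Python sketch below only builds the expression
strings. `ENRICHMENT_GET`, `GET`, `phonebook`, `tb`, `cf` and `user_ids` come
from the message above; the helper name `build_enrichment_fields` and the
`max_users` cap are hypothetical, and the sketch assumes an upper bound on the
list length is known in advance. Single-quoted Stellar strings are used on the
assumption that the expressions will be embedded in a JSON config.

```python
from typing import Dict


def build_enrichment_fields(max_users: int,
                            enrichment_type: str = "phonebook",
                            list_field: str = "user_ids",
                            table: str = "tb",
                            column_family: str = "cf") -> Dict[str, str]:
    """Build one Stellar lookup expression per list position.

    Hypothetical helper: it only formats strings and does not call Metron.
    """
    return {
        f"impacted-user-{i}": (
            f"ENRICHMENT_GET('{enrichment_type}', "
            f"GET({list_field}, {i}), '{table}', '{column_family}')"
        )
        for i in range(max_users)
    }


fields = build_enrichment_fields(3)
for name, expression in fields.items():
    print(f"{name}: {expression}")
```

How positions beyond the end of a shorter list behave is not covered here and
would need to be checked against the Stellar documentation.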
>
> Also note that there is an open JIRA to ensure that all of the index
> destinations can handle complex types in the message JSON.  This may or may
> not impact your use case, but something to keep in mind.
>
> https://issues.apache.org/jira/browse/METRON-735
>
>
>
>
>
> On Sun, Apr 2, 2017 at 10:26 AM, Ali Nazemian <alinazem...@gmail.com>
> wrote:
>
> Hi all,
>
>
> I was wondering how I can achieve the following use case in the current
> version of Metron?
>
>
>
> I want to have attributes in the Metron JSON object that are arrays.
> For example, if a threat is impacting multiple users, they are all
> contained in one attribute (e.g. user_id:[id1, id2, id3]). Now if I want
> to enrich the event with data that uses the user_id as the key of an
> enrichment stored in HBase, how would I do this?
>
>
> Cheers,
> Ali
>
>
>
>
>
> --
> A.Nazemian
>
>
>
>
>
> --
> A.Nazemian
>
>
>
>
> -------------------
> Thank you,
>
> James Sirota
> PPMC- Apache Metron (Incubating)
> jsirota AT apache DOT org
>
>


-- 
A.Nazemian
