I propose that we simply disallow dots in field names.  Dots have a special 
meaning in some stores (in ES, for example, a dotted name is treated as an 
object field), and as we keep adding data stores we may run into more 
unintended behavior.  Our code should check for dots and either auto-correct 
them (replace with underscores?) or at least raise an error or a warning.  
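
As a rough sketch of the kind of check I have in mind (the class, enum, and 
method names below are made up for illustration and are not existing Metron 
code; the warning goes to stderr only to keep the example self-contained):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Metron: checks message field names for dots.
final class FieldNameCheck {

    // REPLACE rewrites dots to underscores and warns; REJECT throws an error.
    enum Policy { REPLACE, REJECT }

    static boolean hasDot(String fieldName) {
        return fieldName.indexOf('.') >= 0;
    }

    static String sanitize(String fieldName) {
        return fieldName.replace('.', '_');
    }

    // Returns a copy of the message whose field names contain no dots.
    static Map<String, Object> apply(Map<String, Object> message, Policy policy) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : message.entrySet()) {
            String name = entry.getKey();
            if (hasDot(name)) {
                if (policy == Policy.REJECT) {
                    throw new IllegalArgumentException(
                        "Field name contains a dot: " + name);
                }
                String fixed = sanitize(name);
                System.err.println("WARN: renaming field " + name + " to " + fixed);
                name = fixed;
            }
            // Auto-correction can make two names collide (a.b and a_b).
            if (result.containsKey(name)) {
                throw new IllegalStateException("Duplicate field name: " + name);
            }
            result.put(name, entry.getValue());
        }
        return result;
    }
}
```

With Policy.REPLACE a field such as source.type would be written as 
source_type; with Policy.REJECT the same message would fail fast.  The 
collision check is there because silently overwriting a field seems worse 
than either option.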

Thanks,
James 

07.09.2018, 16:33, "Ryan Merriman" <merrim...@gmail.com>:
> Internal means it’s not configurable, doesn’t contain our default separator 
> (dots) and is namespaced with metron. We can definitely improve on DRY but 
> there’s more to it than that. For example, having 2 different versions of 
> this field name (ES and Solr) adds a significant amount of complexity for no 
> real benefit.
>
>>  On Sep 7, 2018, at 5:12 PM, Michael Miklavcic <michael.miklav...@gmail.com> 
>> wrote:
>>
>>  Can you elaborate on what you mean by "convert to internal?" From your
>>  description, it looks like the challenge is from our violations of DRY when
>>  it comes to constants referencing those keys, which would be eliminated by
>>  refactoring.
>>
>>>  On Fri, Sep 7, 2018, 3:50 PM Ryan Merriman <merrim...@gmail.com> wrote:
>>>
>>>  I recently worked on a PR that involved changing the default behavior of
>>>  the ElasticsearchWriter to store data using field names with the default
>>>  Metron separator, dots. One of the unfortunate consequences of this is
>>>  that although dots are allowed in more recent versions of ES, it changes
>>>  how these fields are stored. Having a dot in a field name causes ES to
>>>  treat it as an object field type. We're not quite comfortable with this
>>>  because it could introduce unforeseen side effects that may not be
>>>  obvious. Here's the PR: https://github.com/apache/metron/pull/1181
>>>
>>>  As I worked through it I noticed there are a couple fields that include
>>>  separators where it's not actually necessary. They are not nested by
>>>  nature and are internal to Metron. The fact that they are internal means
>>>  they show up in constants and are hardcoded in several different places.
>>>  That made the work in the PR above much harder and more tedious than it should
>>>  have been. There are 2 in particular that I had to deal with: source:type
>>>  and threat:triage:score in metaalerts.
>>>
>>>  Is it worth considering converting these to internal Metron fields so that
>>>  they stay constant and this isn't a problem in the future? I could see
>>>  these fields following the same pattern as 'metron_alert'. However this
>>>  would cause pain when upgrading because existing data would need to be
>>>  updated with these new fields.
>>>
>>>  Just an idea. Curious if others have an opinion on the subject.

------------------- 
Thank you,

James Sirota
PMC- Apache Metron
jsirota AT apache DOT org
