I propose that we just disallow dots in field names. Dots seem to have a special meaning in some data stores, and as we keep adding data stores we may run into unintended behavior. We should have logic in our code to check for dots and either auto-correct them (replace with underscores?) or at least throw an error or log a warning.
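To make the idea concrete, here is a minimal sketch of such a check. The FieldNames class and its Policy enum are hypothetical names made up for illustration; they are not existing Metron code, and where the check would be called from is left open.

```java
// Hypothetical helper (not part of Metron): sketches the two options
// discussed above, auto-correcting dots or rejecting the field name.
final class FieldNames {

    // REPLACE swaps dots for underscores; REJECT throws on any dot.
    enum Policy { REPLACE, REJECT }

    private FieldNames() {}

    static String check(String fieldName, Policy policy) {
        // Names without dots (and null) pass through unchanged.
        if (fieldName == null || fieldName.indexOf('.') < 0) {
            return fieldName;
        }
        if (policy == Policy.REJECT) {
            throw new IllegalArgumentException(
                "Field name contains a dot: " + fieldName);
        }
        // Auto-correct: replace every dot with an underscore.
        return fieldName.replace('.', '_');
    }
}
```

With this sketch, FieldNames.check("threat.triage.score", FieldNames.Policy.REPLACE) returns "threat_triage_score". One thing to watch with auto-correction: two distinct names such as "a.b" and "a_b" would collapse into the same field, so rejecting or warning may be the safer default.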
Thanks,
James

07.09.2018, 16:33, "Ryan Merriman" <merrim...@gmail.com>:
> Internal means it’s not configurable, doesn’t contain our default separator
> (dots) and is namespaced with metron. We can definitely improve on DRY but
> there’s more to it than that. For example, having 2 different versions of
> this field name (ES and Solr) adds a significant amount of complexity for no
> real benefit.
>
>> On Sep 7, 2018, at 5:12 PM, Michael Miklavcic <michael.miklav...@gmail.com>
>> wrote:
>>
>> Can you elaborate on what you mean by "convert to internal?" From your
>> description, it looks like the challenge is from our violations of DRY when
>> it comes to constants referencing those keys, which would be eliminated by
>> refactoring.
>>
>>> On Fri, Sep 7, 2018, 3:50 PM Ryan Merriman <merrim...@gmail.com> wrote:
>>>
>>> I recently worked on a PR that involved changing the default behavior of
>>> the ElasticsearchWriter to store data using field names with the default
>>> Metron separator, dots. One of the unfortunate consequences of this is
>>> that although dots are allowed in more recent versions of ES, it changes
>>> how these fields are stored. Having a dot in a field name causes ES to
>>> treat it as an object field type. We're not quite comfortable with this
>>> because it could introduce unforeseen side effects that may not be
>>> obvious. Here's the PR: https://github.com/apache/metron/pull/1181
>>>
>>> As I worked through it I noticed there are a couple of fields that include
>>> separators where it's not actually necessary. They are not nested by
>>> nature and are internal to Metron. The fact that they are internal means
>>> they show up in constants and are hardcoded in several different places.
>>> That made the work in the PR above much harder and more tedious than it
>>> should have been. There are 2 in particular that I had to deal with:
>>> source:type and threat:triage:score in metaalerts.
>>>
>>> Is it worth considering converting these to internal Metron fields so that
>>> they stay constant and this isn't a problem in the future? I could see
>>> these fields following the same pattern as 'metron_alert'. However, this
>>> would cause pain when upgrading because existing data would need to be
>>> updated with these new fields.
>>>
>>> Just an idea. Curious if others have an opinion on the subject.

-------------------
Thank you,
James Sirota
PMC- Apache Metron
jsirota AT apache DOT org