Re: [DISCUSS] Field conversions

2018-06-06 Thread Michael Miklavcic
Yeah Otto, pre-0.5.0 (0.4.2) would be ES 2.3 if users were not using
master. The ES upgrade is a big piece of this Apache release. The last
release was 0.4.2 on Fri Dec 22 2017.

I'm +1 on the idea of an example referencing the ES docs and keeping this
as simple as possible.

* https://archive.apache.org/dist/metron/

On Tue, Jun 5, 2018 at 11:06 AM, Otto Fowler 
wrote:

> Aren’t people who are on an old version of ES everyone pre 0.5.0?  Like all
> the metron users?
>
>
> On June 5, 2018 at 12:31:30, Simon Elliston Ball (
> si...@simonellistonball.com) wrote:
>
> Yes, anything using elastic would need the field names changed. That said,
> people who are on such an old version (eol) will need to bite the bullet
> with ES compatibility at some point.
>
> Simon
>
> > On 5 Jun 2018, at 17:17, Otto Fowler  wrote:
> >
> > Are there consequences with Kibana as well? queries, visualizations,
> > templates they may have?
> >
> >
> > On June 5, 2018 at 12:03:44, Nick Allen (n...@nickallen.org) wrote:
> >
> > I just don't know if telling users to do a bulk upgrade of their indices
> is
> > sufficient enough of an upgrade path. I would expect some to have
> > downstream processes dependent on those field names, which would also
> need
> > to be updated.
> >
> > Although, we could tell users to do any field name conversions that they
> > depend on using parser transformations; rather than the
> > `FieldNameConverter` abstractions. I *think* that would be a valid
> upgrade
> > path where we could just revert #1022.
> >
> >> On Tue, Jun 5, 2018 at 10:34 AM, Nick Allen  wrote:
> >>
> >> I am in favor of removing the `FieldNameConverter` abstraction as an end
> >> state. Although, I don't agree with Simon that we could have just done
> >> that directly without providing a backwards compatible solution as was
> > done
> >> in #1022. There are too many touch points that rely on that conversion
> > and
> >> users who expect fields to land in their indices named a certain way (no
> >> matter what version of ES they are running). If I am wrong and there is
> a
> >> better approach that works, then we should just revert #1022.
> >>
> >> On Tue, Jun 5, 2018 at 9:37 AM, Simon Elliston Ball <
> >> si...@simonellistonball.com> wrote:
> >>
> >>> I would definitely agree that the transformation should be removed. We
> >>> have
> >>> now however added a complex generic solution in the backend, which is
> >>> going
> >>> to be noop for most people. This was done I believe for the sake of
> >>> backward compatibility. I would argue however, that there is no need to
> >>> support ES 2.3, and therefore no need to support de-dotting
> >>> transformations. This does seem somewhat over-engineered to me, though
> > it
> >>> does save people re-indexing on upgrades. I suspect in reality that
> this
> >>> is
> >>> a rare edge case, and that we would do far better to settle on one
> >>> solution
> >>> (the dotted version, not the colons, to my mind)
> >>>
> >>> Simon
> >>>

Re: [DISCUSS] Field conversions

2018-06-05 Thread Otto Fowler
Aren’t the people on an old version of ES everyone pre-0.5.0?  That is, all
the Metron users?


On June 5, 2018 at 12:31:30, Simon Elliston Ball (
si...@simonellistonball.com) wrote:

Yes, anything using elastic would need the field names changed. That said,
people who are on such an old version (eol) will need to bite the bullet
with ES compatibility at some point.

Simon

> On 5 Jun 2018, at 17:17, Otto Fowler  wrote:
>
> Are there consequences with Kibana as well? queries, visualizations,
> templates they may have?
>
>
> On June 5, 2018 at 12:03:44, Nick Allen (n...@nickallen.org) wrote:
>
> I just don't know if telling users to do a bulk upgrade of their indices
is
> sufficient enough of an upgrade path. I would expect some to have
> downstream processes dependent on those field names, which would also
need
> to be updated.
>
> Although, we could tell users to do any field name conversions that they
> depend on using parser transformations; rather than the
> `FieldNameConverter` abstractions. I *think* that would be a valid
upgrade
> path where we could just revert #1022.
>
>> On Tue, Jun 5, 2018 at 10:34 AM, Nick Allen  wrote:
>>
>> I am in favor of removing the `FieldNameConverter` abstraction as an end
>> state. Although, I don't agree with Simon that we could have just done
>> that directly without providing a backwards compatible solution as was
> done
>> in #1022. There are too many touch points that rely on that conversion
> and
>> users who expect fields to land in their indices named a certain way (no
>> matter what version of ES they are running). If I am wrong and there is
a
>> better approach that works, then we should just revert #1022.
>>
>> On Tue, Jun 5, 2018 at 9:37 AM, Simon Elliston Ball <
>> si...@simonellistonball.com> wrote:
>>
>>> I would definitely agree that the transformation should be removed. We
>>> have
>>> now however added a complex generic solution in the backend, which is
>>> going
>>> to be noop for most people. This was done I believe for the sake of
>>> backward compatibility. I would argue however, that there is no need to
>>> support ES 2.3, and therefore no need to support de-dotting
>>> transformations. This does seem somewhat over-engineered to me, though
> it
>>> does save people re-indexing on upgrades. I suspect in reality that
this
>>> is
>>> a rare edge case, and that we would do far better to settle on one
>>> solution
>>> (the dotted version, not the colons, to my mind)
>>>
>>> Simon
>>>

Re: [DISCUSS] Field conversions

2018-06-05 Thread Simon Elliston Ball
Yes, anything using Elasticsearch would need the field names changed. That said,
people who are on such an old (EOL) version will need to bite the bullet on ES
compatibility at some point.

Simon 

> On 5 Jun 2018, at 17:17, Otto Fowler  wrote:
> 
> Are there consequences with Kibana as well?  queries, visualizations,
> templates they may have?
> 
> 
> On June 5, 2018 at 12:03:44, Nick Allen (n...@nickallen.org) wrote:
> 
> I just don't know if telling users to do a bulk upgrade of their indices is
> sufficient enough of an upgrade path. I would expect some to have
> downstream processes dependent on those field names, which would also need
> to be updated.
> 
> Although, we could tell users to do any field name conversions that they
> depend on using parser transformations; rather than the
> `FieldNameConverter` abstractions. I *think* that would be a valid upgrade
> path where we could just revert #1022.
> 
>> On Tue, Jun 5, 2018 at 10:34 AM, Nick Allen  wrote:
>> 
>> I am in favor of removing the `FieldNameConverter` abstraction as an end
>> state. Although, I don't agree with Simon that we could have just done
>> that directly without providing a backwards compatible solution as was
> done
>> in #1022. There are too many touch points that rely on that conversion
> and
>> users who expect fields to land in their indices named a certain way (no
>> matter what version of ES they are running). If I am wrong and there is a
>> better approach that works, then we should just revert #1022.
>> 
>> On Tue, Jun 5, 2018 at 9:37 AM, Simon Elliston Ball <
>> si...@simonellistonball.com> wrote:
>> 
>>> I would definitely agree that the transformation should be removed. We
>>> have
>>> now however added a complex generic solution in the backend, which is
>>> going
>>> to be noop for most people. This was done I believe for the sake of
>>> backward compatibility. I would argue however, that there is no need to
>>> support ES 2.3, and therefore no need to support de-dotting
>>> transformations. This does seem somewhat over-engineered to me, though
> it
>>> does save people re-indexing on upgrades. I suspect in reality that this
>>> is
>>> a rare edge case, and that we would do far better to settle on one
>>> solution
>>> (the dotted version, not the colons, to my mind)
>>> 
>>> Simon
>>> 

Re: [DISCUSS] Field conversions

2018-06-05 Thread Otto Fowler
Are there consequences for Kibana as well, such as any queries,
visualizations, or templates users may have?


On June 5, 2018 at 12:03:44, Nick Allen (n...@nickallen.org) wrote:

I just don't know if telling users to do a bulk upgrade of their indices is
sufficient enough of an upgrade path. I would expect some to have
downstream processes dependent on those field names, which would also need
to be updated.

Although, we could tell users to do any field name conversions that they
depend on using parser transformations; rather than the
`FieldNameConverter` abstractions. I *think* that would be a valid upgrade
path where we could just revert #1022.

On Tue, Jun 5, 2018 at 10:34 AM, Nick Allen  wrote:

> I am in favor of removing the `FieldNameConverter` abstraction as an end
> state. Although, I don't agree with Simon that we could have just done
> that directly without providing a backwards compatible solution as was
done
> in #1022. There are too many touch points that rely on that conversion
and
> users who expect fields to land in their indices named a certain way (no
> matter what version of ES they are running). If I am wrong and there is a
> better approach that works, then we should just revert #1022.
>
> On Tue, Jun 5, 2018 at 9:37 AM, Simon Elliston Ball <
> si...@simonellistonball.com> wrote:
>
>> I would definitely agree that the transformation should be removed. We
>> have
>> now however added a complex generic solution in the backend, which is
>> going
>> to be noop for most people. This was done I believe for the sake of
>> backward compatibility. I would argue however, that there is no need to
>> support ES 2.3, and therefore no need to support de-dotting
>> transformations. This does seem somewhat over-engineered to me, though
it
>> does save people re-indexing on upgrades. I suspect in reality that this
>> is
>> a rare edge case, and that we would do far better to settle on one
>> solution
>> (the dotted version, not the colons, to my mind)
>>
>> Simon
>>
>> On 5 June 2018 at 06:29, Ryan Merriman  wrote:
>>
>> > I agree completely. I will leave this thread open for a day or two to
>> give
>> > others a chance to weigh in. If no one opposes, I will create Jiras
>> for
>> > removing field transformations and transforming existing data.
>> >
>> > On Tue, Jun 5, 2018 at 8:21 AM, Casey Stella 
>> wrote:
>> >
>> > > Well, on write it is a transformation, on read it's a translation.
>> This
>> > is
>> > > to say that you're providing a mapping on read to translate field
>> names
>> > > given the index you're using. The other approach that I was
>> considering
>> > > last night is a field transformation REST call which translates
field
>> > names
>> > > that the UI could call. So, the UI would pass 'source.type' to the
>> field
>> > > translation service and in Solr it'd return source.type and in ES
it'd
>> > > return source:type. Underneath the hood the service would use the
>> same
>> > > transformation as the writer uses. That's another way to skin this
>> cat.
>> > >
>> > > Ultimately, I think we should just ditch this field transformation
>> > > business, as Laurens said, as long as we have a utility to transform
>> > > existing data.
>> > >
>> > > On Tue, Jun 5, 2018 at 8:54 AM Ryan Merriman 
>> > wrote:
>> > >
>> > > > Having 2 different patterns for configuring field name
>> transformations
>> > on
>> > > > read vs write is confusing to me. I agree with both of you that
>> > > > normalizing on '.' and not having to do the translation at all
>> would be
>> > > > ideal. Like you both suggested, we would need some utility or
>> script
>> > to
>> > > > convert preexisting data to match this format. There could also be
>> > some
>> > > > adjustments a user would need to make in the UI but I feel like we
>> > could
>> > > > document around that. Are there any objections to doing it this
>> way?
>> > > >
>> > > >
>> > > >
>> > > > On Mon, Jun 4, 2018 at 4:30 PM, Laurens Vets 
>> > wrote:
>> > > >
>> > > > > ES 2.x support officially ended 4 months ago (
>> > > > > https://www.elastic.co/support/eol), so why still support ':' at
>> > all?
>> > > :)
>> > > > > Additionally, 2.x isn't even supported at all on the last 2
Ubuntu
>> > LTS
>> > > > > releases (16.04 & 18.04).
>> > > > >
>> > > > > Therefore, move everything to use '.' and provide a conversion/upgrade
>> > > > > script to change ':' to '.'?
>> > > > >
>> > > > >
>> > > > > On 2018-06-04 13:55, Ryan Merriman wrote:
>> > > > >
>> > > > >> We've been dealing with a reoccurring challenge in Metron. It
is
>> > > common
>> > > > >> for various fields to contain '.' characters for the purpose of
>> > making
>> > > > >> them
>> > > > >> more readable, namespacing, etc. At one point we only supported
>> > > > >> Elasticsearch 2.3 which did not allow dots and forced us to use
>> ':'
>> > > > >> instead. This limitation does not exist in later versions of
>> > > > >> Elasticsearch
>> > > > >> or Solr.
>> > > > >>
>> > > > >> Now we're in a situation where we 

Re: [DISCUSS] Field conversions

2018-06-05 Thread Nick Allen
I just don't know if telling users to do a bulk upgrade of their indices is
a sufficient upgrade path.  I would expect some to have downstream processes
that depend on those field names, which would also need to be updated.

Alternatively, we could tell users to do any field name conversions that they
depend on using parser transformations, rather than the `FieldNameConverter`
abstractions.  I *think* that would be a valid upgrade path, in which case we
could just revert #1022.
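
To make the idea of an opt-in, per-field conversion concrete, here is a rough Python sketch. It is not Metron code: `rename_fields` and the `legacy_names` mapping are hypothetical names, and it only illustrates the kind of explicit rename step a parser-level transformation could apply for users whose downstream processes still expect the colon form.

```python
# Hypothetical helper, not part of Metron. It illustrates an explicit,
# per-field rename of the kind a parser-level transformation could perform.
from typing import Any, Dict


def rename_fields(message: Dict[str, Any], renames: Dict[str, str]) -> Dict[str, Any]:
    """Return a copy of `message` with the keys listed in `renames` renamed."""
    renamed: Dict[str, Any] = {}
    for key, value in message.items():
        # Keys without an entry in `renames` pass through unchanged.
        renamed[renames.get(key, key)] = value
    return renamed


# A user who still depends on the legacy colon form opts in field by field.
# The field names and values below are illustrative only.
legacy_names = {"source.type": "source:type"}
message = {"source.type": "bro", "ip_src_addr": "10.0.0.1"}
print(rename_fields(message, legacy_names))
```

Only the fields a user explicitly lists are touched, so everyone else would get the dotted names with no conversion layer in the write path.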

On Tue, Jun 5, 2018 at 10:34 AM, Nick Allen  wrote:

> I am in favor of removing the `FieldNameConverter` abstraction as an end
> state.  Although, I don't agree with Simon that we could have just done
> that directly without providing a backwards compatible solution as was done
> in #1022.  There are too many touch points that rely on that conversion and
> users who expect fields to land in their indices named a certain way (no
> matter what version of ES they are running).  If I am wrong and there is a
> better approach that works, then we should just revert #1022.
>
> On Tue, Jun 5, 2018 at 9:37 AM, Simon Elliston Ball <
> si...@simonellistonball.com> wrote:
>
>> I would definitely agree that the transformation should be removed. We
>> have
>> now however added a complex generic solution in the backend, which is
>> going
>> to be noop for most people. This was done I believe for the sake of
>> backward compatibility. I would argue however, that there is no need to
>> support ES 2.3, and therefore no need to support de-dotting
>> transformations. This does seem somewhat over-engineered to me, though it
>> does save people re-indexing on upgrades. I suspect in reality that this
>> is
>> a rare edge case, and that we would do far better to settle on one
>> solution
>> (the dotted version, not the colons, to my mind)
>>
>> Simon
>>
>> On 5 June 2018 at 06:29, Ryan Merriman  wrote:
>>
>> > I agree completely.  I will leave this thread open for a day or two to
>> give
>> > others a chance to weigh in.  If no one opposes, I will create Jiras
>> for
>> > removing field transformations and transforming existing data.
>> >
>> > On Tue, Jun 5, 2018 at 8:21 AM, Casey Stella 
>> wrote:
>> >
>> > > Well, on write it is a transformation, on read it's a translation.
>> This
>> > is
>> > > to say that you're providing a mapping on read to translate field
>> names
>> > > given the index you're using.  The other approach that I was
>> considering
>> > > last night is a field transformation REST call which translates field
>> > names
>> > > that the UI could call.  So, the UI would pass 'source.type' to the
>> field
>> > > translation service and in Solr it'd return source.type and in ES it'd
>> > > return source:type.  Underneath the hood the service would use the
>> same
>> > > transformation as the writer uses.  That's another way to skin this
>> cat.
>> > >
>> > > Ultimately, I think we should just ditch this field transformation
>> > > business, as Laurens said, as long as we have a utility to transform
>> > > existing data.
>> > >
>> > > On Tue, Jun 5, 2018 at 8:54 AM Ryan Merriman 
>> > wrote:
>> > >
>> > > > Having 2 different patterns for configuring field name
>> transformations
>> > on
>> > > > read vs write is confusing to me.  I agree with both of you that
>> > > > normalizing on '.' and not having to do the translation at all
>> would be
>> > > > ideal.  Like you both suggested, we would need some utility or
>> script
>> > to
>> > > > convert preexisting data to match this format.  There could also be
>> > some
>> > > > adjustments a user would need to make in the UI but I feel like we
>> > could
>> > > > document around that.  Are there any objections to doing it this
>> way?
>> > > >
>> > > >
>> > > >
>> > > > On Mon, Jun 4, 2018 at 4:30 PM, Laurens Vets 
>> > wrote:
>> > > >
>> > > > > ES 2.x support officially ended 4 months ago (
>> > > > > https://www.elastic.co/support/eol), so why still support ':' at
>> > all?
>> > > :)
>> > > > > Additionally, 2.x isn't even supported at all on the last 2 Ubuntu
>> > LTS
>> > > > > releases (16.04 & 18.04).
>> > > > >
>> > > > > Therefore, move everything to use '.' and provide a conversion/upgrade
>> > > > > script to change ':' to '.'?
>> > > > >
>> > > > >
>> > > > > On 2018-06-04 13:55, Ryan Merriman wrote:
>> > > > >
>> > > > >> We've been dealing with a reoccurring challenge in Metron.  It is
>> > > common
>> > > > >> for various fields to contain '.' characters for the purpose of
>> > making
>> > > > >> them
>> > > > >> more readable, namespacing, etc.  At one point we only supported
>> > > > >> Elasticsearch 2.3 which did not allow dots and forced us to use
>> ':'
>> > > > >> instead.  This limitation does not exist in later versions of
>> > > > >> Elasticsearch
>> > > > >> or Solr.
>> > > > >>
>> > > > >> Now we're in a situation where we need to allow a user to use
>> either
>> > > one
>> > > > >> because they may still be using ES 2.3 or have data with ':'
>> > > characters
>> > > >

Re: [DISCUSS] Field conversions

2018-06-05 Thread Casey Stella
Agreed, we should definitely have a clear picture about how to do that,
maybe even a worked example in the use-cases that we can reference.  I'm
just saying we don't need to migrate ES docs into Metron, but rather
reference them as much as we possibly can.
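
As a rough idea of what such a worked example could contain, the Python sketch below renames colon-delimited keys in a single document to the dotted form and reports any names that would collide after conversion. It is only an illustration under assumptions: `colons_to_dots` is a hypothetical function, the sample field names are made up for the example, and it does not call Elasticsearch at all. The actual reindexing would still follow the ES docs.

```python
# Illustrative sketch only: renames keys in an in-memory document.
# It does not talk to Elasticsearch; `colons_to_dots` is a hypothetical name.
from typing import Any, Dict, List, Tuple


def colons_to_dots(doc: Dict[str, Any]) -> Tuple[Dict[str, Any], List[str]]:
    """Return (converted document, list of dotted names that collided)."""
    converted: Dict[str, Any] = {}
    collisions: List[str] = []
    for key, value in doc.items():
        new_key = key.replace(":", ".")
        if new_key in converted:
            # Two source keys map to the same dotted name; keep the first
            # value seen and report the clash instead of overwriting it.
            collisions.append(new_key)
            continue
        converted[new_key] = value
    return converted, collisions


old_doc = {"source:type": "bro", "guid": "abc-123"}
new_doc, clashes = colons_to_dots(old_doc)
print(new_doc, clashes)
```

Reporting collisions rather than silently overwriting is a deliberate choice here, since an index that already mixes both naming styles is exactly the case a user would need to resolve by hand.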

On Tue, Jun 5, 2018 at 11:38 AM Otto Fowler  wrote:

> It is still our user list and dev list that will have the burden of
> talking folks through that.
>
>
> On June 5, 2018 at 09:58:32, Casey Stella (ceste...@gmail.com) wrote:
>
> To be clear, I'm not even suggesting that we create any tooling here. I'd
> say just a reference to the ES docs and a call-out in Upgrading.md would
> suffice as long as we have some strong reason to believe it'll work. As
> far as I'm concerned, the sooner we're out of the business of transforming
> fields, the better.
>
> On Tue, Jun 5, 2018 at 9:49 AM Justin Leet  wrote:
>
> > ES does have some docs around how this gets handled in upgrades:
> >
> > https://www.elastic.co/guide/en/elasticsearch/reference/2.4/dots-in-names.html
> >
> > Might be worth taking a look to see what conflicts we'd have going from 2.x
> > to 5.x and figuring out where to go from there.
> >
> > On Tue, Jun 5, 2018 at 9:46 AM, Simon Elliston Ball <
> > si...@simonellistonball.com> wrote:
> >
> > > I guess in principle you could use
> > > https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-reindex.html#docs-reindex-change-name
> > > to reindex with the new fields. It wouldn't be hard to script up a bit of
> > > Python to help users out with that, or of course to leave that as an
> > > exercise to the reader. It would be nice to have a script that read and
> > > transformed fields for templates and indices to replace the colons with
> > > dots in ES.
> > >
> > > Simon
> > >
> > > On 5 June 2018 at 06:40, Casey Stella  wrote:
> > >
> > > > +1 to that, Simon. Do we have a sense of if there are utilities
> > provided
> > > > by ES to do this kind of migration transformation easily?
> > > >
> > > > On Tue, Jun 5, 2018 at 9:37 AM Simon Elliston Ball <
> > > > si...@simonellistonball.com> wrote:
> > > >
> > > > > I would definitely agree that the transformation should be
> removed.
> > We
> > > > have
> > > > > now however added a complex generic solution in the backend, which
> is
> > > > going
> > > > > to be noop for most people. This was done I believe for the sake
> of
> > > > > backward compatibility. I would argue however, that there is no
> need
> > to
> > > > > support ES 2.3, and therefore no need to support de-dotting
> > > > > transformations. This does seem somewhat over-engineered to me,
> > though
> > > it
> > > > > does save people re-indexing on upgrades. I suspect in reality
> that
> > > this
> > > > is
> > > > > a rare edge case, and that we would do far better to settle on one
> > > > solution
> > > > > (the dotted version, not the colons, to my mind)
> > > > >
> > > > > Simon
> > > > >
> > > > > On 5 June 2018 at 06:29, Ryan Merriman 
> wrote:
> > > > >
> > > > > > I agree completely. I will leave this thread open for a day or
> two
> > > to
> > > > > give
> > > > > > > others a chance to weigh in. If no one opposes, I will create
> > Jiras
> > > > for
> > > > > > removing field transformations and transforming existing data.
> > > > > >
> > > > > > On Tue, Jun 5, 2018 at 8:21 AM, Casey Stella 
>
> > > > wrote:
> > > > > >
> > > > > > > Well, on write it is a transformation, on read it's a
> > translation.
> > > > > This
> > > > > > is
> > > > > > > to say that you're providing a mapping on read to translate
> field
> > > > names
> > > > > > > given the index you're using. The other approach that I was
> > > > > considering
> > > > > > > last night is a field transformation REST call which
> translates
> > > field
> > > > > > names
> > > > > > > that the UI could call. So, the UI would pass 'source.type' to
> > the
> > > > > field
> > > > > > > translation service and in Solr it'd return source.type and in
> ES
> > > > it'd
> > > > > > > return source:type. Underneath the hood the service would use
> > the
> > > > same
> > > > > > > transformation as the writer uses. That's another way to skin
> > this
> > > > > cat.
> > > > > > >
> > > > > > > Ultimately, I think we should just ditch this field
> > transformation
> > > > > > > business, as Laurens said, as long as we have a utility to
> > > transform
> > > > > > > existing data.
> > > > > > >
> > > > > > > On Tue, Jun 5, 2018 at 8:54 AM Ryan Merriman <
> > merrim...@gmail.com>
> > > > > > wrote:
> > > > > > >
> > > > > > > > Having 2 different patterns for configuring field name
> > > > > transformations
> > > > > > on
> > > > > > > > read vs write is confusing to me. I agree with both of you
> > that
> > > > > > > > normalizing on '.' and not having to do the translation at
> all
> > > > would
> > > > > be
> > > > > > > > ideal. Like you both suggested, we would need some utility
> or
> > > > script
> > > > > > to
> > > > > > > > convert preexisting dat

Re: [DISCUSS] Field conversions

2018-06-05 Thread Otto Fowler
It is still our user list and dev list that will have the burden of talking
folks through that.


On June 5, 2018 at 09:58:32, Casey Stella (ceste...@gmail.com) wrote:

To be clear, I'm not even suggesting that we create any tooling here. I'd
say just a reference to the ES docs and a call-out in Upgrading.md would
suffice as long as we have some strong reason to believe it'll work. As
far as I'm concerned, the sooner we're out of the business of transforming
fields, the better.

On Tue, Jun 5, 2018 at 9:49 AM Justin Leet  wrote:

> ES does have some docs around how this gets handled in upgrades:
>
> https://www.elastic.co/guide/en/elasticsearch/reference/2.4/dots-in-names.html
>
> Might be worth taking a look to see what conflicts we'd have going from 2.x
> to 5.x and figuring out where to go from there.
>
> On Tue, Jun 5, 2018 at 9:46 AM, Simon Elliston Ball <
> si...@simonellistonball.com> wrote:
>
> > I guess in principle you could use
> > https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-reindex.html#docs-reindex-change-name
> > to reindex with the new fields. It wouldn't be hard to script up a bit of
> > Python to help users out with that, or of course to leave that as an
> > exercise to the reader. It would be nice to have a script that read and
> > transformed fields for templates and indices to replace the colons with
> > dots in ES.
> >
> > Simon
> >
> > On 5 June 2018 at 06:40, Casey Stella  wrote:
> >
> > > +1 to that, Simon. Do we have a sense of if there are utilities
> provided
> > > by ES to do this kind of migration transformation easily?
> > >
> > > On Tue, Jun 5, 2018 at 9:37 AM Simon Elliston Ball <
> > > si...@simonellistonball.com> wrote:
> > >
> > > > I would definitely agree that the transformation should be removed.
> We
> > > have
> > > > now however added a complex generic solution in the backend, which
is
> > > going
> > > > to be noop for most people. This was done I believe for the sake of
> > > > backward compatibility. I would argue however, that there is no
need
> to
> > > > support ES 2.3, and therefore no need to support de-dotting
> > > > transformations. This does seem somewhat over-engineered to me,
> though
> > it
> > > > does save people re-indexing on upgrades. I suspect in reality that
> > this
> > > is
> > > > a rare edge case, and that we would do far better to settle on one
> > > solution
> > > > (the dotted version, not the colons, to my mind)
> > > >
> > > > Simon
> > > >
> > > > On 5 June 2018 at 06:29, Ryan Merriman  wrote:
> > > >
> > > > > I agree completely. I will leave this thread open for a day or
two
> > to
> > > > give
> > > > > others a chance to weigh in. If no one opposes, I will creates
> Jiras
> > > for
> > > > > removing field transformations and transforming existing data.
> > > > >
> > > > > On Tue, Jun 5, 2018 at 8:21 AM, Casey Stella 
> > > wrote:
> > > > >
> > > > > > Well, on write it is a transformation, on read it's a
> translation.
> > > > This
> > > > > is
> > > > > > to say that you're providing a mapping on read to translate
field
> > > names
> > > > > > given the index you're using. The other approach that I was
> > > > considering
> > > > > > last night is a field transformation REST call which translates
> > field
> > > > > names
> > > > > > that the UI could call. So, the UI would pass 'source.type' to
> the
> > > > field
> > > > > > translation service and in Solr it'd return source.type and in
ES
> > > it'd
> > > > > > return source:type. Underneath the hood the service would use
> the
> > > same
> > > > > > transformation as the writer uses. That's another way to skin
> this
> > > > cat.
> > > > > >
> > > > > > Ultimately, I think we should just ditch this field
> transformation
> > > > > > business, as Laurens said, as long as we have a utility to
> > transform
> > > > > > existing data.
> > > > > >
> > > > > > On Tue, Jun 5, 2018 at 8:54 AM Ryan Merriman <
> merrim...@gmail.com>
> > > > > wrote:
> > > > > >
> > > > > > > Having 2 different patterns for configuring field name
> > > > transformations
> > > > > on
> > > > > > > read vs write is confusing to me. I agree with both of you
> that
> > > > > > > normalizing on '.' and not having to do the translation at
all
> > > would
> > > > be
> > > > > > > ideal. Like you both suggested, we would need some utility or
> > > script
> > > > > to
> > > > > > > convert preexisting data to match this format. There could
> also
> > be
> > > > > some
> > > > > > > adjustments a user would need to make in the UI but I feel
like
> > we
> > > > > could
> > > > > > > document around that. Are there any objections to doing it
> this
> > > way?
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Mon, Jun 4, 2018 at 4:30 PM, Laurens Vets <
> laur...@daemon.be>
> > > > > wrote:
> > > > > > >
> > > > > > > > ES 2.x support officially ended 4 months ago (
> > > > > > > > https://www.elastic.co/support/eol), so why still support
> ':'
> > at
> > > > > all?
> > > > > >

Re: [DISCUSS] Field conversions

2018-06-05 Thread Nick Allen
I am in favor of removing the `FieldNameConverter` abstraction as an end
state.  Although, I don't agree with Simon that we could have just done
that directly without providing a backwards compatible solution as was done
in #1022.  There are too many touch points that rely on that conversion and
users who expect fields to land in their indices named a certain way (no
matter what version of ES they are running).  If I am wrong and there is a
better approach that works, then we should just revert #1022.

On Tue, Jun 5, 2018 at 9:37 AM, Simon Elliston Ball <
si...@simonellistonball.com> wrote:

> I would definitely agree that the transformation should be removed. We have
> now however added a complex generic solution in the backend, which is going
> to be noop for most people. This was done I believe for the sake of
> backward compatibility. I would argue however, that there is no need to
> support ES 2.3, and therefore no need to support de-dotting
> transformations. This does seem somewhat over-engineered to me, though it
> does save people re-indexing on upgrades. I suspect in reality that this is
> a rare edge case, and that we would do far better to settle on one solution
> (the dotted version, not the colons, to my mind)
>
> Simon
>
> On 5 June 2018 at 06:29, Ryan Merriman  wrote:
>
> > I agree completely.  I will leave this thread open for a day or two to give
> > others a chance to weigh in.  If no one opposes, I will create Jiras for
> > removing field transformations and transforming existing data.
> >
> > On Tue, Jun 5, 2018 at 8:21 AM, Casey Stella  wrote:
> >
> > > Well, on write it is a transformation, on read it's a translation.
> This
> > is
> > > to say that you're providing a mapping on read to translate field names
> > > given the index you're using.  The other approach that I was
> considering
> > > last night is a field transformation REST call which translates field
> > names
> > > that the UI could call.  So, the UI would pass 'source.type' to the
> field
> > > translation service and in Solr it'd return source.type and in ES it'd
> > > return source:type.  Underneath the hood the service would use the same
> > > transformation as the writer uses.  That's another way to skin this
> cat.
> > >
> > > Ultimately, I think we should just ditch this field transformation
> > > business, as Laurens said, as long as we have a utility to transform
> > > existing data.
> > >
> > > On Tue, Jun 5, 2018 at 8:54 AM Ryan Merriman 
> > wrote:
> > >
> > > > Having 2 different patterns for configuring field name
> transformations
> > on
> > > > read vs write is confusing to me.  I agree with both of you that
> > > > normalizing on '.' and not having to do the translation at all would
> be
> > > > ideal.  Like you both suggested, we would need some utility or script
> > to
> > > > convert preexisting data to match this format.  There could also be
> > some
> > > > adjustments a user would need to make in the UI but I feel like we
> > could
> > > > document around that.  Are there any objections to doing it this way?
> > > >
> > > >
> > > >
> > > > On Mon, Jun 4, 2018 at 4:30 PM, Laurens Vets 
> > wrote:
> > > >
> > > > > ES 2.x support officially ended 4 months ago (
> > > > > https://www.elastic.co/support/eol), so why still support ':' at
> > all?
> > > :)
> > > > > Additionally, 2.x isn't even supported at all on the last 2 Ubuntu LTS
> > > > > releases (16.04 & 18.04).
> > > > >
> > > > > Therefore, move everything to use '.' and provide a conversion/upgrade
> > > > > script to change ':' to '.'?
> > > > >
> > > > >
> > > > > On 2018-06-04 13:55, Ryan Merriman wrote:
> > > > >
> > > > >> We've been dealing with a reoccurring challenge in Metron.  It is
> > > common
> > > > >> for various fields to contain '.' characters for the purpose of
> > making
> > > > >> them
> > > > >> more readable, namespacing, etc.  At one point we only supported
> > > > >> Elasticsearch 2.3 which did not allow dots and forced us to use
> ':'
> > > > >> instead.  This limitation does not exist in later versions of
> > > > >> Elasticsearch
> > > > >> or Solr.
> > > > >>
> > > > >> Now we're in a situation where we need to allow a user to use
> either
> > > one
> > > > >> because they may still be using ES 2.3 or have data with ':'
> > > characters
> > > > in
> > > > >> field names.  We've attempted to make this configurable in a
> couple
> > > > >> different PRs:
> > > > >>
> > > > >> https://github.com/apache/metron/pull/1022
> > > > >> https://github.com/apache/metron/pull/1010
> > > > >> https://github.com/apache/metron/pull/1038
> > > > >>
> > > > >> The approaches taken in these are not consistent and fall short in
> > > > >> different ways.  The first (METRON-1569 Allow user to change field
> > > name
> > > > >> conversion when indexing) only applies to indexing and not
> querying.
> > > > The
> > > > >> others only apply to a single field which does not scale well.
> Now
> > we
> > > > >> have
> > > > >> an issu

Re: [DISCUSS] Field conversions

2018-06-05 Thread Simon Elliston Ball
+1 to that. It's a simple problem to solve if you have it, and with a
little docs help I imagine we'll be fine.

On 5 June 2018 at 06:58, Casey Stella  wrote:

> To be clear, I'm not even suggesting that we create any tooling here.  I'd
> say just a reference to the ES docs and a call-out in Upgrading.md would
> suffice as long as we have some strong reason to believe it'll work.  As
> far as I'm concerned, the sooner we're out of the business of transforming
> fields, the better.
>
> On Tue, Jun 5, 2018 at 9:49 AM Justin Leet  wrote:
>
> > ES does have some docs around how this gets handled in upgrades:
> >
> > https://www.elastic.co/guide/en/elasticsearch/reference/2.4/dots-in-names.html
> >
> > Might be worth taking a look to see what conflicts we'd have going from 2.x
> > to 5.x and figuring out where to go from there.
> >
> > On Tue, Jun 5, 2018 at 9:46 AM, Simon Elliston Ball <
> > si...@simonellistonball.com> wrote:
> >
> > > I guess in principle you could use
> > > https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-reindex.html#docs-reindex-change-name
> > > to reindex with the new fields. It wouldn't be hard to script up a bit of
> > > python to help users out with that, or of course to leave that as an
> > > exercise to the reader. It would be nice to have a script that read and
> > > transformed fields for templates and indices to replace the colons with
> > > dots in ES.
> > >
> > > Simon
> > >
> > > On 5 June 2018 at 06:40, Casey Stella  wrote:
> > >
> > > > +1 to that, Simon.  Do we have a sense of if there are utilities
> > provided
> > > > by ES to do this kind of migration transformation easily?
> > > >
> > > > On Tue, Jun 5, 2018 at 9:37 AM Simon Elliston Ball <
> > > > si...@simonellistonball.com> wrote:
> > > >
> > > > > I would definitely agree that the transformation should be removed.
> > We
> > > > have
> > > > > now however added a complex generic solution in the backend, which
> is
> > > > going
> > > > > to be noop for most people. This was done I believe for the sake of
> > > > > backward compatibility. I would argue however, that there is no
> need
> > to
> > > > > support ES 2.3, and therefore no need to support de-dotting
> > > > > transformations. This does seem somewhat over-engineered to me,
> > though
> > > it
> > > > > does save people re-indexing on upgrades. I suspect in reality that
> > > this
> > > > is
> > > > > a rare edge case, and that we would do far better to settle on one
> > > > solution
> > > > > (the dotted version, not the colons, to my mind)
> > > > >
> > > > > Simon
> > > > >
> > > > > On 5 June 2018 at 06:29, Ryan Merriman 
> wrote:
> > > > >
> > > > > > I agree completely.  I will leave this thread open for a day or
> two
> > > to
> > > > > give
> > > > > > others a chance to weigh in.  If no one opposes, I will creates
> > Jiras
> > > > for
> > > > > > removing field transformations and transforming existing data.
> > > > > >
> > > > > > On Tue, Jun 5, 2018 at 8:21 AM, Casey Stella  >
> > > > wrote:
> > > > > >
> > > > > > > Well, on write it is a transformation, on read it's a
> > translation.
> > > > > This
> > > > > > is
> > > > > > > to say that you're providing a mapping on read to translate
> field
> > > > names
> > > > > > > given the index you're using.  The other approach that I was
> > > > > considering
> > > > > > > last night is a field transformation REST call which translates
> > > field
> > > > > > names
> > > > > > > that the UI could call.  So, the UI would pass 'source.type' to
> > the
> > > > > field
> > > > > > > translation service and in Solr it'd return source.type and in
> ES
> > > > it'd
> > > > > > > return source:type.  Underneath the hood the service would use
> > the
> > > > same
> > > > > > > transformation as the writer uses.  That's another way to skin
> > this
> > > > > cat.
> > > > > > >
> > > > > > > Ultimately, I think we should just ditch this field
> > transformation
> > > > > > > business, as Laurens said, as long as we have a utility to
> > > transform
> > > > > > > existing data.
> > > > > > >
> > > > > > > On Tue, Jun 5, 2018 at 8:54 AM Ryan Merriman <
> > merrim...@gmail.com>
> > > > > > wrote:
> > > > > > >
> > > > > > > > Having 2 different patterns for configuring field name
> > > > > transformations
> > > > > > on
> > > > > > > > read vs write is confusing to me.  I agree with both of you
> > that
> > > > > > > > normalizing on '.' and not having to do the translation at
> all
> > > > would
> > > > > be
> > > > > > > > ideal.  Like you both suggested, we would need some utility
> or
> > > > script
> > > > > > to
> > > > > > > > convert preexisting data to match this format.  There could
> > also
> > > be
> > > > > > some
> > > > > > > > adjustments a user would need to make in the UI but I feel
> like
> > > we
> > > > > > could
> > > > > > > > document around that.  Are there any objections to doing it
> > this
> > > > way?
> > > > > > > >
> > > > > > > >
> > > > > > > >
> 

Re: [DISCUSS] Field conversions

2018-06-05 Thread Casey Stella
To be clear, I'm not even suggesting that we create any tooling here.  I'd
say just a reference to the ES docs and a call-out in Upgrading.md would
suffice as long as we have some strong reason to believe it'll work.  As
far as I'm concerned, the sooner we're out of the business of transforming
fields, the better.

On Tue, Jun 5, 2018 at 9:49 AM Justin Leet  wrote:

> ES does have some docs around how this gets handled in upgrades:
>
> https://www.elastic.co/guide/en/elasticsearch/reference/2.4/dots-in-names.html
>
> Might be worth taking a look to see what conflicts we'd have going from 2.x
> to 5.x and figuring out where to go from there.
>
> On Tue, Jun 5, 2018 at 9:46 AM, Simon Elliston Ball <
> si...@simonellistonball.com> wrote:
>
> > I guess in principle you could use
> > https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-reindex.html#docs-reindex-change-name
> > to reindex with the new fields. It wouldn't be hard to script up a bit of
> > python to help users out with that, or of course to leave that as an
> > exercise to the reader. It would be nice to have a script that read and
> > transformed fields for templates and indices to replace the colons with
> > dots in ES.
> >
> > Simon
> >
> > On 5 June 2018 at 06:40, Casey Stella  wrote:
> >
> > > +1 to that, Simon.  Do we have a sense of if there are utilities
> provided
> > > by ES to do this kind of migration transformation easily?
> > >
> > > On Tue, Jun 5, 2018 at 9:37 AM Simon Elliston Ball <
> > > si...@simonellistonball.com> wrote:
> > >
> > > > I would definitely agree that the transformation should be removed.
> We
> > > have
> > > > now however added a complex generic solution in the backend, which is
> > > going
> > > > to be noop for most people. This was done I believe for the sake of
> > > > backward compatibility. I would argue however, that there is no need
> to
> > > > support ES 2.3, and therefore no need to support de-dotting
> > > > transformations. This does seem somewhat over-engineered to me,
> though
> > it
> > > > does save people re-indexing on upgrades. I suspect in reality that
> > this
> > > is
> > > > a rare edge case, and that we would do far better to settle on one
> > > solution
> > > > (the dotted version, not the colons, to my mind)
> > > >
> > > > Simon
> > > >
> > > > On 5 June 2018 at 06:29, Ryan Merriman  wrote:
> > > >
> > > > > I agree completely.  I will leave this thread open for a day or two
> > to
> > > > give
> > > > > others a chance to weigh in.  If no one opposes, I will creates
> Jiras
> > > for
> > > > > removing field transformations and transforming existing data.
> > > > >
> > > > > On Tue, Jun 5, 2018 at 8:21 AM, Casey Stella 
> > > wrote:
> > > > >
> > > > > > Well, on write it is a transformation, on read it's a
> translation.
> > > > This
> > > > > is
> > > > > > to say that you're providing a mapping on read to translate field
> > > names
> > > > > > given the index you're using.  The other approach that I was
> > > > considering
> > > > > > last night is a field transformation REST call which translates
> > field
> > > > > names
> > > > > > that the UI could call.  So, the UI would pass 'source.type' to
> the
> > > > field
> > > > > > translation service and in Solr it'd return source.type and in ES
> > > it'd
> > > > > > return source:type.  Underneath the hood the service would use
> the
> > > same
> > > > > > transformation as the writer uses.  That's another way to skin
> this
> > > > cat.
> > > > > >
> > > > > > Ultimately, I think we should just ditch this field
> transformation
> > > > > > business, as Laurens said, as long as we have a utility to
> > transform
> > > > > > existing data.
> > > > > >
> > > > > > On Tue, Jun 5, 2018 at 8:54 AM Ryan Merriman <
> merrim...@gmail.com>
> > > > > wrote:
> > > > > >
> > > > > > > Having 2 different patterns for configuring field name
> > > > transformations
> > > > > on
> > > > > > > read vs write is confusing to me.  I agree with both of you
> that
> > > > > > > normalizing on '.' and not having to do the translation at all
> > > would
> > > > be
> > > > > > > ideal.  Like you both suggested, we would need some utility or
> > > script
> > > > > to
> > > > > > > convert preexisting data to match this format.  There could
> also
> > be
> > > > > some
> > > > > > > adjustments a user would need to make in the UI but I feel like
> > we
> > > > > could
> > > > > > > document around that.  Are there any objections to doing it
> this
> > > way?
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Mon, Jun 4, 2018 at 4:30 PM, Laurens Vets <
> laur...@daemon.be>
> > > > > wrote:
> > > > > > >
> > > > > > > > ES 2.x support officially ended 4 months ago (
> > > > > > > > https://www.elastic.co/support/eol), so why still support
> ':'
> > at
> > > > > all?
> > > > > > :)
> > > > > > > > Additionally, 2.x isn't even supported at all on the last 2
> > > Ubuntu
> > > > > LTS
> > > > > > > > releases (16.04 & 18.05).
> > > >

Re: [DISCUSS] Field conversions

2018-06-05 Thread Justin Leet
ES does have some docs around how this gets handled in upgrades:
https://www.elastic.co/guide/en/elasticsearch/reference/2.4/dots-in-names.html

Might be worth taking a look to see what conflicts we'd have going from 2.x
to 5.x and figuring out where to go from there.
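
As a rough illustration of the kind of conflict check this implies (a sketch
only, not something taken from the ES docs or from Metron): once ':' is
replaced with '.', ES 5.x treats a dotted name as a path into nested objects,
so the same name cannot be both a leaf field and the parent of another field.
The helper names and the sample field names below are hypothetical.

```python
def to_dotted(name):
    """Replace the ':' separator with '.' (the proposed normalized form)."""
    return name.replace(":", ".")


def find_path_conflicts(field_names):
    """Return (leaf, nested) pairs where a converted name is both a leaf
    field and a path prefix of another converted name."""
    dotted = sorted({to_dotted(n) for n in field_names})
    leaves = set(dotted)
    conflicts = []
    for name in dotted:
        parts = name.split(".")
        # Check every proper prefix of the dotted path.
        for i in range(1, len(parts)):
            prefix = ".".join(parts[:i])
            if prefix in leaves:
                conflicts.append((prefix, name))
    return conflicts


if __name__ == "__main__":
    # Hypothetical field names, for illustration only.
    sample = ["source:type", "ip_src_addr", "threat:triage", "threat:triage:score"]
    for leaf, nested in find_path_conflicts(sample):
        print("'{}' is a leaf field but also a parent of '{}'".format(leaf, nested))
```

Running something like this over the field names in an existing template would
give an early hint of which indices need attention before a reindex, although
it says nothing about type conflicts between fields.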

On Tue, Jun 5, 2018 at 9:46 AM, Simon Elliston Ball <
si...@simonellistonball.com> wrote:

> I guess in principle you could use
> https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-reindex.html#docs-reindex-change-name
> to reindex with the new fields. It wouldn't be hard to script up a bit of
> python to help users out with that, or of course to leave that as an
> exercise to the reader. It would be nice to have a script that read and
> transformed fields for templates and indices to replace the colons with
> dots in ES.
>
> Simon
>
> On 5 June 2018 at 06:40, Casey Stella  wrote:
>
> > +1 to that, Simon.  Do we have a sense of if there are utilities provided
> > by ES to do this kind of migration transformation easily?
> >
> > On Tue, Jun 5, 2018 at 9:37 AM Simon Elliston Ball <
> > si...@simonellistonball.com> wrote:
> >
> > > I would definitely agree that the transformation should be removed. We
> > have
> > > now however added a complex generic solution in the backend, which is
> > going
> > > to be noop for most people. This was done I believe for the sake of
> > > backward compatibility. I would argue however, that there is no need to
> > > support ES 2.3, and therefore no need to support de-dotting
> > > transformations. This does seem somewhat over-engineered to me, though
> it
> > > does save people re-indexing on upgrades. I suspect in reality that
> this
> > is
> > > a rare edge case, and that we would do far better to settle on one
> > solution
> > > (the dotted version, not the colons, to my mind)
> > >
> > > Simon
> > >
> > > On 5 June 2018 at 06:29, Ryan Merriman  wrote:
> > >
> > > > I agree completely.  I will leave this thread open for a day or two to give
> > > > others a chance to weigh in.  If no one opposes, I will create Jiras for
> > > > removing field transformations and transforming existing data.
> > > >
> > > > On Tue, Jun 5, 2018 at 8:21 AM, Casey Stella 
> > wrote:
> > > >
> > > > > Well, on write it is a transformation, on read it's a translation.
> > > This
> > > > is
> > > > > to say that you're providing a mapping on read to translate field
> > names
> > > > > given the index you're using.  The other approach that I was
> > > considering
> > > > > last night is a field transformation REST call which translates
> field
> > > > names
> > > > > that the UI could call.  So, the UI would pass 'source.type' to the
> > > field
> > > > > translation service and in Solr it'd return source.type and in ES
> > it'd
> > > > > return source:type.  Underneath the hood the service would use the
> > same
> > > > > transformation as the writer uses.  That's another way to skin this
> > > cat.
> > > > >
> > > > > Ultimately, I think we should just ditch this field transformation
> > > > > business, as Laurens said, as long as we have a utility to
> transform
> > > > > existing data.
> > > > >
> > > > > On Tue, Jun 5, 2018 at 8:54 AM Ryan Merriman 
> > > > wrote:
> > > > >
> > > > > > Having 2 different patterns for configuring field name
> > > transformations
> > > > on
> > > > > > read vs write is confusing to me.  I agree with both of you that
> > > > > > normalizing on '.' and not having to do the translation at all
> > would
> > > be
> > > > > > ideal.  Like you both suggested, we would need some utility or
> > script
> > > > to
> > > > > > convert preexisting data to match this format.  There could also
> be
> > > > some
> > > > > > adjustments a user would need to make in the UI but I feel like
> we
> > > > could
> > > > > > document around that.  Are there any objections to doing it this
> > way?
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Mon, Jun 4, 2018 at 4:30 PM, Laurens Vets 
> > > > wrote:
> > > > > >
> > > > > > > ES 2.x support officially ended 4 months ago (
> > > > > > > https://www.elastic.co/support/eol), so why still support ':'
> at
> > > > all?
> > > > > :)
> > > > > > > Additionally, 2.x isn't even supported at all on the last 2 Ubuntu LTS
> > > > > > > releases (16.04 & 18.04).
> > > > > > >
> > > > > > > Therefore, move everything to use '.' and provide a conversion/upgrade
> > > > > > > script to change ':' to '.'?
> > > > > > >
> > > > > > >
> > > > > > > On 2018-06-04 13:55, Ryan Merriman wrote:
> > > > > > >
> > > > > > >> We've been dealing with a reoccurring challenge in Metron.  It
> > is
> > > > > common
> > > > > > >> for various fields to contain '.' characters for the purpose
> of
> > > > making
> > > > > > >> them
> > > > > > >> more readable, namespacing, etc.  At one point we only
> supported
> > > > > > >> Elasticsearch 2.3 which did not allow dots and forced us to
> use
> > > ':'
> > > > > > >> instead.  This limitation does not

Re: [DISCUSS] Field conversions

2018-06-05 Thread Simon Elliston Ball
I guess in principle you could use
https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-reindex.html#docs-reindex-change-name
to reindex with the new fields. It wouldn't be hard to script up a bit of
python to help users out with that, or of course to leave that as an
exercise to the reader. It would be nice to have a script that read and
transformed fields for templates and indices to replace the colons with
dots in ES.
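
A minimal sketch of what such a helper might look like, assuming the
`_reindex` endpoint described at the link above and a known list of
colon-delimited field names. It only builds the request body and sends nothing
to a cluster. The function name, index names and field list are hypothetical,
and the Painless snippet has not been run against a real cluster. The `script`
key is `source` in recent ES releases, while 5.x releases used `inline`, so
check it against the version in use.

```python
import json


def build_reindex_body(source_index, dest_index, colon_fields, script_key="source"):
    """Build a _reindex request body that renames 'a:b' style fields to 'a.b'.

    Assumes field names contain no single quotes (they are embedded in the
    script text as-is).
    """
    statements = []
    for old in colon_fields:
        new = old.replace(":", ".")
        if new == old:
            continue  # no colon in the name, nothing to rename
        statements.append(
            "if (ctx._source.containsKey('{old}')) "
            "{{ ctx._source['{new}'] = ctx._source.remove('{old}'); }}".format(old=old, new=new)
        )
    return {
        "source": {"index": source_index},
        "dest": {"index": dest_index},
        "script": {"lang": "painless", script_key: " ".join(statements)},
    }


if __name__ == "__main__":
    # Hypothetical index and field names, for illustration only.
    body = build_reindex_body("bro_index_old", "bro_index_new", ["source:type", "ip_src_addr"])
    print(json.dumps(body, indent=2))
```

The resulting JSON would be POSTed to `_reindex`; index templates would still
need their field names changed separately.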

Simon

On 5 June 2018 at 06:40, Casey Stella  wrote:

> +1 to that, Simon.  Do we have a sense of if there are utilities provided
> by ES to do this kind of migration transformation easily?
>
> On Tue, Jun 5, 2018 at 9:37 AM Simon Elliston Ball <
> si...@simonellistonball.com> wrote:
>
> > I would definitely agree that the transformation should be removed. We
> have
> > now however added a complex generic solution in the backend, which is
> going
> > to be noop for most people. This was done I believe for the sake of
> > backward compatibility. I would argue however, that there is no need to
> > support ES 2.3, and therefore no need to support de-dotting
> > transformations. This does seem somewhat over-engineered to me, though it
> > does save people re-indexing on upgrades. I suspect in reality that this
> is
> > a rare edge case, and that we would do far better to settle on one
> solution
> > (the dotted version, not the colons, to my mind)
> >
> > Simon
> >
> > On 5 June 2018 at 06:29, Ryan Merriman  wrote:
> >
> > > I agree completely.  I will leave this thread open for a day or two to give
> > > others a chance to weigh in.  If no one opposes, I will create Jiras for
> > > removing field transformations and transforming existing data.
> > >
> > > On Tue, Jun 5, 2018 at 8:21 AM, Casey Stella 
> wrote:
> > >
> > > > Well, on write it is a transformation, on read it's a translation.
> > This
> > > is
> > > > to say that you're providing a mapping on read to translate field
> names
> > > > given the index you're using.  The other approach that I was
> > considering
> > > > last night is a field transformation REST call which translates field
> > > names
> > > > that the UI could call.  So, the UI would pass 'source.type' to the
> > field
> > > > translation service and in Solr it'd return source.type and in ES
> it'd
> > > > return source:type.  Underneath the hood the service would use the
> same
> > > > transformation as the writer uses.  That's another way to skin this
> > cat.
> > > >
> > > > Ultimately, I think we should just ditch this field transformation
> > > > business, as Laurens said, as long as we have a utility to transform
> > > > existing data.
> > > >
> > > > On Tue, Jun 5, 2018 at 8:54 AM Ryan Merriman 
> > > wrote:
> > > >
> > > > > Having 2 different patterns for configuring field name
> > transformations
> > > on
> > > > > read vs write is confusing to me.  I agree with both of you that
> > > > > normalizing on '.' and not having to do the translation at all
> would
> > be
> > > > > ideal.  Like you both suggested, we would need some utility or
> script
> > > to
> > > > > convert preexisting data to match this format.  There could also be
> > > some
> > > > > adjustments a user would need to make in the UI but I feel like we
> > > could
> > > > > document around that.  Are there any objections to doing it this
> way?
> > > > >
> > > > >
> > > > >
> > > > > On Mon, Jun 4, 2018 at 4:30 PM, Laurens Vets 
> > > wrote:
> > > > >
> > > > > > ES 2.x support officially ended 4 months ago (
> > > > > > https://www.elastic.co/support/eol), so why still support ':' at
> > > all?
> > > > :)
> > > > > > Additionally, 2.x isn't even supported at all on the last 2 Ubuntu LTS
> > > > > > releases (16.04 & 18.04).
> > > > > >
> > > > > > Therefore, move everything to use '.' and provide a conversion/upgrade
> > > > > > script to change ':' to '.'?
> > > > > >
> > > > > >
> > > > > > On 2018-06-04 13:55, Ryan Merriman wrote:
> > > > > >
> > > > > >> We've been dealing with a reoccurring challenge in Metron.  It
> is
> > > > common
> > > > > >> for various fields to contain '.' characters for the purpose of
> > > making
> > > > > >> them
> > > > > >> more readable, namespacing, etc.  At one point we only supported
> > > > > >> Elasticsearch 2.3 which did not allow dots and forced us to use
> > ':'
> > > > > >> instead.  This limitation does not exist in later versions of
> > > > > >> Elasticsearch
> > > > > >> or Solr.
> > > > > >>
> > > > > >> Now we're in a situation where we need to allow a user to use
> > either
> > > > one
> > > > > >> because they may still be using ES 2.3 or have data with ':'
> > > > characters
> > > > > in
> > > > > >> field names.  We've attempted to make this configurable in a
> > couple
> > > > > >> different PRs:
> > > > > >>
> > > > > >> https://github.com/apache/metron/pull/1022
> > > > > >> https://github.com/apache/metron/pull/1010
> > > > > >> https://github.com/apache/metron/pull/1038
> > > > > >>
> > > > > >> The a

Re: [DISCUSS] Field conversions

2018-06-05 Thread Casey Stella
+1 to that, Simon.  Do we have a sense of whether ES provides utilities to
do this kind of migration transformation easily?

On Tue, Jun 5, 2018 at 9:37 AM Simon Elliston Ball <
si...@simonellistonball.com> wrote:

> I would definitely agree that the transformation should be removed. We have
> now however added a complex generic solution in the backend, which is going
> to be noop for most people. This was done I believe for the sake of
> backward compatibility. I would argue however, that there is no need to
> support ES 2.3, and therefore no need to support de-dotting
> transformations. This does seem somewhat over-engineered to me, though it
> does save people re-indexing on upgrades. I suspect in reality that this is
> a rare edge case, and that we would do far better to settle on one solution
> (the dotted version, not the colons, to my mind)
>
> Simon
>
> On 5 June 2018 at 06:29, Ryan Merriman  wrote:
>
> > I agree completely.  I will leave this thread open for a day or two to give
> > others a chance to weigh in.  If no one opposes, I will create Jiras for
> > removing field transformations and transforming existing data.
> >
> > On Tue, Jun 5, 2018 at 8:21 AM, Casey Stella  wrote:
> >
> > > Well, on write it is a transformation, on read it's a translation.
> This
> > is
> > > to say that you're providing a mapping on read to translate field names
> > > given the index you're using.  The other approach that I was
> considering
> > > last night is a field transformation REST call which translates field
> > names
> > > that the UI could call.  So, the UI would pass 'source.type' to the
> field
> > > translation service and in Solr it'd return source.type and in ES it'd
> > > return source:type.  Underneath the hood the service would use the same
> > > transformation as the writer uses.  That's another way to skin this
> cat.
> > >
> > > Ultimately, I think we should just ditch this field transformation
> > > business, as Laurens said, as long as we have a utility to transform
> > > existing data.
> > >
> > > On Tue, Jun 5, 2018 at 8:54 AM Ryan Merriman 
> > wrote:
> > >
> > > > Having 2 different patterns for configuring field name
> transformations
> > on
> > > > read vs write is confusing to me.  I agree with both of you that
> > > > normalizing on '.' and not having to do the translation at all would
> be
> > > > ideal.  Like you both suggested, we would need some utility or script
> > to
> > > > convert preexisting data to match this format.  There could also be
> > some
> > > > adjustments a user would need to make in the UI but I feel like we
> > could
> > > > document around that.  Are there any objections to doing it this way?
> > > >
> > > >
> > > >
> > > > On Mon, Jun 4, 2018 at 4:30 PM, Laurens Vets 
> > wrote:
> > > >
> > > > > ES 2.x support officially ended 4 months ago (
> > > > > https://www.elastic.co/support/eol), so why still support ':' at
> > all?
> > > :)
> > > > > Additionally, 2.x isn't even supported at all on the last 2 Ubuntu LTS
> > > > > releases (16.04 & 18.04).
> > > > >
> > > > > Therefore, move everything to use '.' and provide a conversion/upgrade
> > > > > script to change ':' to '.'?
> > > > >
> > > > >
> > > > > On 2018-06-04 13:55, Ryan Merriman wrote:
> > > > >
> > > > >> We've been dealing with a reoccurring challenge in Metron.  It is
> > > common
> > > > >> for various fields to contain '.' characters for the purpose of
> > making
> > > > >> them
> > > > >> more readable, namespacing, etc.  At one point we only supported
> > > > >> Elasticsearch 2.3 which did not allow dots and forced us to use
> ':'
> > > > >> instead.  This limitation does not exist in later versions of
> > > > >> Elasticsearch
> > > > >> or Solr.
> > > > >>
> > > > >> Now we're in a situation where we need to allow a user to use
> either
> > > one
> > > > >> because they may still be using ES 2.3 or have data with ':'
> > > characters
> > > > in
> > > > >> field names.  We've attempted to make this configurable in a
> couple
> > > > >> different PRs:
> > > > >>
> > > > >> https://github.com/apache/metron/pull/1022
> > > > >> https://github.com/apache/metron/pull/1010
> > > > >> https://github.com/apache/metron/pull/1038
> > > > >>
> > > > >> The approaches taken in these are not consistent and fall short in
> > > > >> different ways.  The first (METRON-1569 Allow user to change field
> > > name
> > > > >> conversion when indexing) only applies to indexing and not
> querying.
> > > > The
> > > > >> others only apply to a single field which does not scale well.
> Now
> > we
> > > > >> have
> > > > >> an issue with another field in
> > > > >> https://issues.apache.org/jira/browse/METRON-1600.  Rather than
> > > > >> continuing
> > > > >> with a patchwork of different fixes I want to attempt to design a
> > > > >> system-wide solution.
> > > > >>
> > > > >> My first thought is to expand
> > > > https://github.com/apache/metron/pull/1022
> > > > >> to
> > > > >> apply globally.  

Re: [DISCUSS] Field conversions

2018-06-05 Thread Simon Elliston Ball
I would definitely agree that the transformation should be removed. However,
we have now added a complex generic solution in the backend, which is going
to be a no-op for most people. This was done, I believe, for the sake of
backward compatibility. I would argue, however, that there is no need to
support ES 2.3, and therefore no need to support de-dotting
transformations. This does seem somewhat over-engineered to me, though it
does save people re-indexing on upgrades. I suspect that in reality this is
a rare edge case, and that we would do far better to settle on one solution
(the dotted version, not the colons, to my mind).

Simon




-- 
--
simon elliston ball
@sireb


Re: [DISCUSS] Field conversions

2018-06-05 Thread Ryan Merriman
I agree completely.  I will leave this thread open for a day or two to give
others a chance to weigh in.  If no one opposes, I will create Jiras for
removing field transformations and transforming existing data.



Re: [DISCUSS] Field conversions

2018-06-05 Thread Casey Stella
Well, on write it is a transformation, on read it's a translation.  This is
to say that you're providing a mapping on read to translate field names
given the index you're using.  The other approach that I was considering
last night is a field translation REST call that the UI could use to
translate field names.  So, the UI would pass 'source.type' to the field
translation service and in Solr it'd return source.type and in ES it'd
return source:type.  Under the hood the service would use the same
transformation as the writer uses.  That's another way to skin this cat.
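
To make the translation-service idea a bit more concrete, here is a rough
sketch in Python.  It only shows the lookup the service would do; the REST
layer is left out.  The names (dedot, translate_field, CONVERTERS) and the
indexer keys are made up for illustration and are not Metron's actual API.

```python
# Hypothetical sketch: all names here are illustrative assumptions,
# not Metron's real classes or endpoints.

def dedot(field: str) -> str:
    """Replace '.' with ':' in a field name (the ES 2.x style)."""
    return field.replace(".", ":")


def noop(field: str) -> str:
    """Leave the field name unchanged."""
    return field


# Assumed mapping from indexer name to the converter its writer uses.
CONVERTERS = {
    "elasticsearch": dedot,
    "solr": noop,
}


def translate_field(field: str, indexer: str) -> str:
    """Translate a UI field name using the same converter as the writer."""
    try:
        converter = CONVERTERS[indexer]
    except KeyError:
        raise ValueError(f"unknown indexer: {indexer!r}")
    return converter(field)
```

The point of sharing one converter table between the writer and the
service is that read and write cannot drift apart.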

Ultimately, I think we should just ditch this field transformation
business, as Laurens said, as long as we have a utility to transform
existing data.



Re: [DISCUSS] Field conversions

2018-06-05 Thread Ryan Merriman
Having 2 different patterns for configuring field name transformations on
read vs write is confusing to me.  I agree with both of you that
normalizing on '.' and not having to do the translation at all would be
ideal.  Like you both suggested, we would need some utility or script to
convert preexisting data to match this format.  There could also be some
adjustments a user would need to make in the UI but I feel like we could
document around that.  Are there any objections to doing it this way?





Re: [DISCUSS] Field conversions

2018-06-04 Thread Laurens Vets
ES 2.x support officially ended 4 months ago
(https://www.elastic.co/support/eol), so why still support ':' at all?
:) Additionally, 2.x isn't even supported at all on the last 2 Ubuntu
LTS releases (16.04 & 18.04).


Therefore, move everything to use '.' and provide a conversion/upgrade
script to change ':' to '.'?
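
A minimal sketch of what such a conversion script might do per document,
assuming the goal is to end up with '.' everywhere (so existing ':' names
get rewritten to '.') and assuming flat documents.  The function name is
made up, and reading from the old index and writing to a new one is not
shown.

```python
# Hypothetical sketch: convert_field_names is an illustrative name,
# and documents are assumed to be flat (no nested objects).

def convert_field_names(doc: dict) -> dict:
    """Return a copy of doc with ':' replaced by '.' in every field name."""
    converted = {}
    for name, value in doc.items():
        new_name = name.replace(":", ".")
        # Refuse to silently overwrite if two names collapse into one.
        if new_name in converted:
            raise ValueError(f"field name collision on {new_name!r}")
        converted[new_name] = value
    return converted
```

The collision check matters if a document somehow carries both
'source:type' and 'source.type'; failing loudly seems safer than picking
one.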




Re: [DISCUSS] Field conversions

2018-06-04 Thread Casey Stella
Before we construct a super generic solution, can we get an analysis of all
the places in the UI where we're hard-coding fields?  It seems like pulling
the field from the global config is the strategy that we've gone with, and
that it could be expanded upon from https://github.com/apache/metron/pull/1010
(though that didn't quite get the semantics right, as it required
https://github.com/apache/metron/pull/1038).  Is there a reason why we
wouldn't create a PR to refer to all of the hard-coded fields in the same
way?

I guess my perspective is that this seems like a problem contained to the
UI accessing a small number of hard-coded fields, and expansion of those
fields seems pretty contained.  If so, I'd suggest we continue with the
pattern we already have.  If you want to expand it, you might consider
taking advantage of the fact that the global config can use maps and doing
something like:
{
  ...
  "fieldNameTransformations" : {
    "source:type" : "source.type",
    "threat:triage:reason" : "threat.triage.reason"
  }
}

Whereby in the UI when accessing a hard-coded field, it will look up the
field in the fieldNameTransformations map from global config.  If it exists
in the map, then it'll use the translated field.  If it does not, then it
will use the field it passed in (e.g. source:type).  That would allow us to
add new translations easily, but it may be overkill if we're talking about
3 fields.
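
As a rough illustration of that lookup-with-fallback, here is a sketch in
Python (the real UI code would not be Python; this is only to show the
logic).  resolve_field is a made-up name, and the global config is
assumed to already be loaded as a dict.

```python
# Hypothetical sketch: resolve_field is an illustrative name, and the
# config below simply mirrors the example map above.

GLOBAL_CONFIG = {
    "fieldNameTransformations": {
        "source:type": "source.type",
        "threat:triage:reason": "threat.triage.reason",
    }
}


def resolve_field(field: str, global_config: dict) -> str:
    """Return the translated name if one is configured, else the field itself."""
    transformations = global_config.get("fieldNameTransformations", {})
    return transformations.get(field, field)
```

With no fieldNameTransformations entry at all, every field falls through
unchanged, so existing installs would keep behaving as they do today.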

Another question: is there an easy way to bulk change field names in ES
across many indices?  Could we normalize on '.' and not do this translation
at all?  I think in order to do that, we'd need instructions on how to
transition at least selected fields (i.e. those hard-coded fields in the
UI).

Casey



[DISCUSS] Field conversions

2018-06-04 Thread Ryan Merriman
We've been dealing with a recurring challenge in Metron.  It is common
for various fields to contain '.' characters for the purpose of making them
more readable, namespacing, etc.  At one point we only supported
Elasticsearch 2.3, which did not allow dots and forced us to use ':'
instead.  This limitation does not exist in later versions of Elasticsearch
or Solr.

Now we're in a situation where we need to allow a user to use either one
because they may still be using ES 2.3 or have data with ':' characters in
field names.  We've attempted to make this configurable in a couple
different PRs:

https://github.com/apache/metron/pull/1022
https://github.com/apache/metron/pull/1010
https://github.com/apache/metron/pull/1038

The approaches taken in these are not consistent and fall short in
different ways.  The first (METRON-1569 Allow user to change field name
conversion when indexing) only applies to indexing and not querying.  The
others only apply to a single field which does not scale well.  Now we have
an issue with another field in
https://issues.apache.org/jira/browse/METRON-1600.  Rather than continuing
with a patchwork of different fixes I want to attempt to design a
system-wide solution.

My first thought is to expand https://github.com/apache/metron/pull/1022 to
apply globally.  However this is not trivial and would require significant
changes.  It would also make https://github.com/apache/metron/pull/1010
obsolete and we might end up having to revert all of it.

Does anyone have any ideas or opinions?  I am still researching solutions
but would love some guidance from the community.