Re: [DISCUSS] Field conversions
Yeah Otto, anyone on a pre-0.5.0 release (0.4.2) would be on ES 2.3 unless they were running from master. The ES upgrade is a big piece of this Apache release. The last release was 0.4.2, on Fri Dec 22 2017. I'm +1 on the idea of an example referencing the ES docs and keeping this as simple as possible.

* https://archive.apache.org/dist/metron/

On Tue, Jun 5, 2018 at 11:06 AM, Otto Fowler wrote:

> Aren't people who are on an old version of ES everyone pre 0.5.0? Like
> all the metron users?
Re: [DISCUSS] Field conversions
Aren't people who are on an old version of ES everyone pre 0.5.0? Like all the Metron users?

On June 5, 2018 at 12:31:30, Simon Elliston Ball (si...@simonellistonball.com) wrote:

> Yes, anything using elastic would need the field names changed. That
> said, people who are on such an old version (eol) will need to bite the
> bullet with ES compatibility at some point.
>
> Simon
Re: [DISCUSS] Field conversions
Yes, anything using elastic would need the field names changed. That said, people who are on such an old version (eol) will need to bite the bullet with ES compatibility at some point.

Simon

> On 5 Jun 2018, at 17:17, Otto Fowler wrote:
>
> Are there consequences with Kibana as well? queries, visualizations,
> templates they may have?
Re: [DISCUSS] Field conversions
Are there consequences for Kibana as well? Queries, visualizations, and templates that users may have?

On June 5, 2018 at 12:03:44, Nick Allen (n...@nickallen.org) wrote:

> I just don't know if telling users to do a bulk upgrade of their indices
> is sufficient enough of an upgrade path. I would expect some to have
> downstream processes dependent on those field names, which would also
> need to be updated.
>
> Although, we could tell users to do any field name conversions that they
> depend on using parser transformations; rather than the
> `FieldNameConverter` abstractions. I *think* that would be a valid
> upgrade path where we could just revert #1022.
Re: [DISCUSS] Field conversions
I just don't know if telling users to do a bulk upgrade of their indices is a sufficient upgrade path. I would expect some to have downstream processes dependent on those field names, which would also need to be updated.

Although, we could tell users to do any field name conversions that they depend on using parser transformations, rather than the `FieldNameConverter` abstractions. I *think* that would be a valid upgrade path where we could just revert #1022.

On Tue, Jun 5, 2018 at 10:34 AM, Nick Allen wrote:

> I am in favor of removing the `FieldNameConverter` abstraction as an end
> state. Although, I don't agree with Simon that we could have just done
> that directly without providing a backwards compatible solution as was
> done in #1022. There are too many touch points that rely on that
> conversion and users who expect fields to land in their indices named a
> certain way (no matter what version of ES they are running). If I am
> wrong and there is a better approach that works, then we should just
> revert #1022.
>
> On Tue, Jun 5, 2018 at 9:37 AM, Simon Elliston Ball
> <si...@simonellistonball.com> wrote:
>
> > I would definitely agree that the transformation should be removed.
> > We have now however added a complex generic solution in the backend,
> > which is going to be a noop for most people. This was done I believe
> > for the sake of backward compatibility. I would argue however, that
> > there is no need to support ES 2.3, and therefore no need to support
> > de-dotting transformations. This does seem somewhat over-engineered
> > to me, though it does save people re-indexing on upgrades. I suspect
> > in reality that this is a rare edge case, and that we would do far
> > better to settle on one solution (the dotted version, not the colons,
> > to my mind).
> >
> > Simon
> >
> > On 5 June 2018 at 06:29, Ryan Merriman wrote:
> >
> > > I agree completely. I will leave this thread open for a day or two
> > > to give others a chance to weigh in. If no one opposes, I will
> > > create Jiras for removing field transformations and transforming
> > > existing data.
> > >
> > > On Tue, Jun 5, 2018 at 8:21 AM, Casey Stella wrote:
> > >
> > > > Well, on write it is a transformation, on read it's a
> > > > translation. This is to say that you're providing a mapping on
> > > > read to translate field names given the index you're using. The
> > > > other approach that I was considering last night is a field
> > > > transformation REST call which translates field names that the
> > > > UI could call. So, the UI would pass 'source.type' to the field
> > > > translation service and in Solr it'd return source.type and in
> > > > ES it'd return source:type. Underneath the hood the service
> > > > would use the same transformation as the writer uses. That's
> > > > another way to skin this cat.
> > > >
> > > > Ultimately, I think we should just ditch this field
> > > > transformation business, as Laurens said, as long as we have a
> > > > utility to transform existing data.
> > > >
> > > > On Tue, Jun 5, 2018 at 8:54 AM Ryan Merriman wrote:
> > > >
> > > > > Having 2 different patterns for configuring field name
> > > > > transformations on read vs write is confusing to me. I agree
> > > > > with both of you that normalizing on '.' and not having to do
> > > > > the translation at all would be ideal. Like you both
> > > > > suggested, we would need some utility or script to convert
> > > > > preexisting data to match this format. There could also be
> > > > > some adjustments a user would need to make in the UI but I
> > > > > feel like we could document around that. Are there any
> > > > > objections to doing it this way?
> > > > >
> > > > > On Mon, Jun 4, 2018 at 4:30 PM, Laurens Vets wrote:
> > > > >
> > > > > > ES 2.x support officially ended 4 months ago
> > > > > > (https://www.elastic.co/support/eol), so why still support
> > > > > > ':' at all? :) Additionally, 2.x isn't even supported at
> > > > > > all on the last 2 Ubuntu LTS releases (16.04 & 18.04).
> > > > > >
> > > > > > Therefore, move everything to use '.' and provide a
> > > > > > conversion/upgrade script to change ':' to '.'?
> > > > > >
> > > > > > On 2018-06-04 13:55, Ryan Merriman wrote:
> > > > > >
> > > > > > > We've been dealing with a recurring challenge in Metron.
> > > > > > > It is common for various fields to contain '.'
> > > > > > > characters for the purpose of making them more readable,
> > > > > > > namespacing, etc. At one point we only supported
> > > > > > > Elasticsearch 2.3 which did not allow dots and forced us
> > > > > > > to use ':' instead. This limitation does not exist in
> > > > > > > later versions of Elasticsearch or Solr.
> > > > > > >
> > > > > > > Now we're in a situation where we need to allow a user
> > > > > > > to use either one because they may still be using ES 2.3
> > > > > > > or have data with ':' characters
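As a rough illustration of the field translation service Casey describes in this thread (the UI passes a canonical dotted name and gets back the name the index actually uses), here is a minimal Python sketch. The function name `translate_field` and the index type labels `"solr"` and `"elasticsearch"` are hypothetical, chosen only for this example; the one behavior taken from the thread is that 'source.type' stays source.type for Solr and becomes source:type for ES.

```python
def translate_field(name, index_type):
    """Translate a canonical dotted field name into the index's own naming.

    Assumption for illustration only: Solr keeps the dotted name, while
    Elasticsearch uses ':' in place of '.', as in the thread's example.
    """
    converters = {
        "solr": lambda n: n,
        "elasticsearch": lambda n: n.replace(".", ":"),
    }
    if index_type not in converters:
        raise ValueError("unknown index type: {}".format(index_type))
    return converters[index_type](name)


if __name__ == "__main__":
    for index_type in ("solr", "elasticsearch"):
        print(index_type, translate_field("source.type", index_type))
```

If the colon convention were dropped entirely, as most of the thread favors, every converter would be the identity function and the service would not be needed.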
Re: [DISCUSS] Field conversions
Agreed, we should definitely have a clear picture of how to do that, maybe even a worked example in the use-cases that we can reference. I'm just saying we don't need to migrate the ES docs into Metron, but rather reference them as much as we possibly can.

On Tue, Jun 5, 2018 at 11:38 AM Otto Fowler wrote:

> It is still our user list and dev list that will have the burden of
> talking folks through that.
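Simon's note in this thread asks for a script that transforms field names in templates as well as indices. A minimal Python sketch of the template half, assuming the template's field definitions sit under a standard `properties` mapping; the helper name `rename_properties` and the sample field names are hypothetical. It does not touch `dynamic_templates` or match patterns, which would need their own handling.

```python
def rename_properties(properties):
    """Return a copy of a mapping 'properties' dict with ':' replaced by '.'.

    Nested object mappings (those with their own 'properties') are
    handled recursively. The input dict is not modified.
    """
    renamed = {}
    for name, definition in properties.items():
        definition = dict(definition)
        if "properties" in definition:
            definition["properties"] = rename_properties(definition["properties"])
        renamed[name.replace(":", ".")] = definition
    return renamed


if __name__ == "__main__":
    sample = {
        "source:type": {"type": "keyword"},
        "ip_src_addr": {"type": "ip"},
    }
    print(rename_properties(sample))
```

Whether a renamed template loads cleanly depends on the target ES version, since dotted names are treated as object paths there; that is worth checking against the ES docs before relying on it.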
Re: [DISCUSS] Field conversions
It is still our user list and dev list that will have the burden of talking folks through that.

On June 5, 2018 at 09:58:32, Casey Stella (ceste...@gmail.com) wrote:

> To be clear, I'm not even suggesting that we create any tooling here.
> I'd say just a reference to the ES docs and a call-out in Upgrading.md
> would suffice as long as we have some strong reason to believe it'll
> work. As far as I'm concerned, the sooner we're out of the business of
> transforming fields, the better.
>
> On Tue, Jun 5, 2018 at 9:49 AM Justin Leet wrote:
>
> > ES does have some docs around how this gets handled in upgrades:
> >
> > https://www.elastic.co/guide/en/elasticsearch/reference/2.4/dots-in-names.html
> >
> > Might be worth taking a look to see what conflicts we'd have going
> > from 2.x to 5.x and figuring out where to go from there.
> >
> > On Tue, Jun 5, 2018 at 9:46 AM, Simon Elliston Ball
> > <si...@simonellistonball.com> wrote:
> >
> > > I guess in principle you could use
> > > https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-reindex.html#docs-reindex-change-name
> > > to reindex with the new fields. It wouldn't be hard to script up a
> > > bit of python to help users out with that, or of course to leave
> > > that as an exercise to the reader. It would be nice to have a
> > > script that read and transformed fields for templates and indices
> > > to replace the colons with dots in ES.
> > >
> > > Simon
> > >
> > > On 5 June 2018 at 06:40, Casey Stella wrote:
> > >
> > > > +1 to that, Simon. Do we have a sense of if there are utilities
> > > > provided by ES to do this kind of migration transformation
> > > > easily?
Re: [DISCUSS] Field conversions
I am in favor of removing the `FieldNameConverter` abstraction as an end state. However, I don't agree with Simon that we could have just done that directly without providing a backwards-compatible solution, as was done in #1022. There are too many touch points that rely on that conversion, and users who expect fields to land in their indices named a certain way (no matter what version of ES they are running). If I am wrong and there is a better approach that works, then we should just revert #1022.

On Tue, Jun 5, 2018 at 9:37 AM, Simon Elliston Ball <si...@simonellistonball.com> wrote:

> I would definitely agree that the transformation should be removed. However, we have now added a complex generic solution in the backend, which is going to be a no-op for most people. This was done, I believe, for the sake of backward compatibility. I would argue, however, that there is no need to support ES 2.3, and therefore no need to support de-dotting transformations. This does seem somewhat over-engineered to me, though it does save people re-indexing on upgrades. I suspect in reality that this is a rare edge case, and that we would do far better to settle on one solution (the dotted version, not the colons, to my mind).
>
> Simon
Re: [DISCUSS] Field conversions
+1 to that. It's a simple problem to solve if you have it, and with a little docs help I imagine we'll be fine.

On 5 June 2018 at 06:58, Casey Stella wrote:

> To be clear, I'm not even suggesting that we create any tooling here. I'd say just a reference to the ES docs and a call-out in Upgrading.md would suffice as long as we have some strong reason to believe it'll work. As far as I'm concerned, the sooner we're out of the business of transforming fields, the better.
Re: [DISCUSS] Field conversions
To be clear, I'm not even suggesting that we create any tooling here. I'd say just a reference to the ES docs and a call-out in Upgrading.md would suffice as long as we have some strong reason to believe it'll work. As far as I'm concerned, the sooner we're out of the business of transforming fields, the better.

On Tue, Jun 5, 2018 at 9:49 AM Justin Leet wrote:

> ES does have some docs around how this gets handled in upgrades:
>
> https://www.elastic.co/guide/en/elasticsearch/reference/2.4/dots-in-names.html
>
> Might be worth taking a look to see what conflicts we'd have going from 2.x to 5.x and figuring out where to go from there.
Re: [DISCUSS] Field conversions
ES does have some docs around how this gets handled in upgrades:

https://www.elastic.co/guide/en/elasticsearch/reference/2.4/dots-in-names.html

Might be worth taking a look to see what conflicts we'd have going from 2.x to 5.x and figuring out where to go from there.

On Tue, Jun 5, 2018 at 9:46 AM, Simon Elliston Ball <si...@simonellistonball.com> wrote:

> I guess in principle you could use https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-reindex.html#docs-reindex-change-name to reindex with the new fields. It wouldn't be hard to script up a bit of Python to help users out with that, or of course to leave that as an exercise for the reader. It would be nice to have a script that read and transformed fields for templates and indices to replace the colons with dots in ES.
>
> Simon
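One kind of conflict worth checking for when going from 2.x to 5.x: ES 5.x treats a dotted name such as `source.type` as an object `source` with a sub-field `type`, so a renamed field clashes if the index already has a leaf field named `source`. Below is a minimal Python sketch of such a check. It assumes the field names have already been read from the index mapping; the function name and the sample field list are hypothetical and not part of Metron.

```python
def find_dot_conflicts(field_names):
    """Return (renamed_field, blocking_field) pairs that clash after ':' -> '.'."""
    # Leaf field names as they would exist after the rename.
    leaves = {name.replace(":", ".") for name in field_names}
    conflicts = []
    for new_name in sorted(leaves):
        parts = new_name.split(".")
        # Every proper prefix of a dotted name becomes an object in ES 5.x,
        # so it must not also exist as a leaf field.
        for i in range(1, len(parts)):
            prefix = ".".join(parts[:i])
            if prefix in leaves:
                conflicts.append((new_name, prefix))
    return conflicts


# Hypothetical field list, for illustration only.
fields = ["source:type", "ip_src_addr", "enrichments:geo:ip_dst_addr:city", "source"]
print(find_dot_conflicts(fields))
```

The sketch covers only the object-versus-leaf case; it does not detect two different original names that collapse to the same dotted name.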
Re: [DISCUSS] Field conversions
I guess in principle you could use https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-reindex.html#docs-reindex-change-name to reindex with the new fields. It wouldn't be hard to script up a bit of Python to help users out with that, or of course to leave that as an exercise for the reader. It would be nice to have a script that read and transformed fields for templates and indices to replace the colons with dots in ES.

Simon

On 5 June 2018 at 06:40, Casey Stella wrote:

> +1 to that, Simon. Do we have a sense of whether there are utilities provided by ES to do this kind of migration transformation easily?
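A rough Python sketch of the reindex-with-rename idea, under stated assumptions: it only builds the JSON body for a `_reindex` request and leaves sending it to whatever HTTP client is in use. The function name, index names, and field list are hypothetical. The script uses the remove-and-assign pattern from the "change the name of a field" example in the ES reindex docs; the script key is `source` in recent versions, while early 5.x releases used `inline`, so check the docs for the version in use. Field names are not escaped, so names containing quotes would need extra handling.

```python
import json


def build_reindex_body(source_index, dest_index, colon_fields):
    """Build a _reindex request body that renames each ':' field to its '.' form."""
    statements = []
    for old in colon_fields:
        new = old.replace(":", ".")
        # Rename only when the document actually carries the old field.
        statements.append(
            "if (ctx._source.containsKey('{0}')) "
            "{{ ctx._source['{1}'] = ctx._source.remove('{0}'); }}".format(old, new)
        )
    return {
        "source": {"index": source_index},
        "dest": {"index": dest_index},
        "script": {"lang": "painless", "source": " ".join(statements)},
    }


# Hypothetical index and field names, for illustration only.
body = build_reindex_body("bro_index_old", "bro_index_new", ["source:type"])
print(json.dumps(body, indent=2))
```

Templates would still need their mappings updated separately before new data is written to the destination index.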
Re: [DISCUSS] Field conversions
+1 to that, Simon. Do we have a sense of whether there are utilities provided by ES to do this kind of migration transformation easily?

On Tue, Jun 5, 2018 at 9:37 AM Simon Elliston Ball <si...@simonellistonball.com> wrote:

> I would definitely agree that the transformation should be removed. However, we have now added a complex generic solution in the backend, which is going to be a no-op for most people. This was done, I believe, for the sake of backward compatibility. I would argue, however, that there is no need to support ES 2.3, and therefore no need to support de-dotting transformations. This does seem somewhat over-engineered to me, though it does save people re-indexing on upgrades. I suspect in reality that this is a rare edge case, and that we would do far better to settle on one solution (the dotted version, not the colons, to my mind).
>
> Simon
Re: [DISCUSS] Field conversions
I would definitely agree that the transformation should be removed. However, we have now added a complex generic solution in the backend, which is going to be a no-op for most people. This was done, I believe, for the sake of backward compatibility. I would argue, however, that there is no need to support ES 2.3, and therefore no need to support de-dotting transformations. This does seem somewhat over-engineered to me, though it does save people re-indexing on upgrades. I suspect that in reality this is a rare edge case, and that we would do far better to settle on one solution (the dotted version, not the colons, to my mind).

Simon

On 5 June 2018 at 06:29, Ryan Merriman wrote:

> I agree completely. I will leave this thread open for a day or two to give others a chance to weigh in. If no one opposes, I will create Jiras for removing field transformations and transforming existing data.

--
simon elliston ball
@sireb
Re: [DISCUSS] Field conversions
I agree completely. I will leave this thread open for a day or two to give others a chance to weigh in. If no one opposes, I will create Jiras for removing field transformations and transforming existing data.

On Tue, Jun 5, 2018 at 8:21 AM, Casey Stella wrote:

> Well, on write it is a transformation, on read it's a translation. This is to say that you're providing a mapping on read to translate field names given the index you're using. The other approach that I was considering last night is a field transformation REST call which translates field names that the UI could call. So, the UI would pass 'source.type' to the field translation service and in Solr it'd return source.type and in ES it'd return source:type. Underneath the hood the service would use the same transformation as the writer uses. That's another way to skin this cat.
>
> Ultimately, I think we should just ditch this field transformation business, as Laurens said, as long as we have a utility to transform existing data.
Re: [DISCUSS] Field conversions
Well, on write it is a transformation, on read it's a translation. This is to say that you're providing a mapping on read to translate field names given the index you're using.

The other approach that I was considering last night is a field transformation REST call, which the UI could call to translate field names. So, the UI would pass 'source.type' to the field translation service and in Solr it'd return source.type and in ES it'd return source:type. Underneath the hood the service would use the same transformation as the writer uses. That's another way to skin this cat.

Ultimately, I think we should just ditch this field transformation business, as Laurens said, as long as we have a utility to transform existing data.

On Tue, Jun 5, 2018 at 8:54 AM Ryan Merriman wrote:

> Having two different patterns for configuring field name transformations on read vs write is confusing to me. I agree with both of you that normalizing on '.' and not having to do the translation at all would be ideal. As you both suggested, we would need some utility or script to convert preexisting data to match this format. There could also be some adjustments a user would need to make in the UI, but I feel like we could document around that. Are there any objections to doing it this way?
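The field translation service described in this message (the UI passes 'source.type' and gets back the name used by the backing index) could be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the backend keys, and the CONVERTERS registry are hypothetical and are not Metron APIs. The one idea taken from the message is that the lookup reuses the same conversion the writer applies.

```python
# Hypothetical sketch: none of these names are Metron APIs.

def to_dotted(field: str) -> str:
    """Identity conversion, for backends that accept '.' in field names."""
    return field


def to_colons(field: str) -> str:
    """De-dotting conversion, for indices written with ':' separators."""
    return field.replace(".", ":")


# Assumed registry: one converter per backend, shared by the writer and
# by the translation lookup so the two can never disagree.
CONVERTERS = {
    "solr": to_dotted,
    "elasticsearch": to_colons,
}


def translate_field(backend: str, field: str) -> str:
    """Return the name under which `field` is stored in `backend`."""
    try:
        convert = CONVERTERS[backend]
    except KeyError:
        raise ValueError(f"unknown backend: {backend!r}") from None
    return convert(field)
```

A REST endpoint would be a thin wrapper around translate_field; the point of the sketch is only that read-side translation and write-side transformation share one function.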
Re: [DISCUSS] Field conversions
Having two different patterns for configuring field name transformations on read vs write is confusing to me. I agree with both of you that normalizing on '.' and not having to do the translation at all would be ideal. As you both suggested, we would need some utility or script to convert preexisting data to match this format. There could also be some adjustments a user would need to make in the UI, but I feel like we could document around that. Are there any objections to doing it this way?

On Mon, Jun 4, 2018 at 4:30 PM, Laurens Vets wrote:

> ES 2.x support officially ended 4 months ago (https://www.elastic.co/support/eol), so why still support ':' at all? :) Additionally, 2.x isn't even supported at all on the last 2 Ubuntu LTS releases (16.04 & 18.04).
>
> Therefore, move everything to use '.' and provide a conversion/upgrade script to change ':' to '.'?
Re: [DISCUSS] Field conversions
ES 2.x support officially ended 4 months ago (https://www.elastic.co/support/eol), so why still support ':' at all? :) Additionally, 2.x isn't even supported at all on the last 2 Ubuntu LTS releases (16.04 & 18.04).

Therefore, move everything to use '.' and provide a conversion/upgrade script to change ':' to '.'?

On 2018-06-04 13:55, Ryan Merriman wrote:

> We've been dealing with a recurring challenge in Metron. It is common for various fields to contain '.' characters for the purpose of making them more readable, namespacing, etc. At one point we only supported Elasticsearch 2.3, which did not allow dots and forced us to use ':' instead. This limitation does not exist in later versions of Elasticsearch or Solr.
>
> Now we're in a situation where we need to allow a user to use either one because they may still be using ES 2.3 or have data with ':' characters in field names. We've attempted to make this configurable in a few different PRs:
>
> https://github.com/apache/metron/pull/1022
> https://github.com/apache/metron/pull/1010
> https://github.com/apache/metron/pull/1038
>
> The approaches taken in these are not consistent and fall short in different ways. The first (METRON-1569 Allow user to change field name conversion when indexing) only applies to indexing and not querying. The others only apply to a single field, which does not scale well. Now we have an issue with another field in https://issues.apache.org/jira/browse/METRON-1600. Rather than continuing with a patchwork of different fixes, I want to attempt to design a system-wide solution.
>
> My first thought is to expand https://github.com/apache/metron/pull/1022 to apply globally. However, this is not trivial and would require significant changes. It would also make https://github.com/apache/metron/pull/1010 obsolete and we might end up having to revert all of it.
>
> Does anyone have any ideas or opinions? I am still researching solutions but would love some guidance from the community.
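The core of the conversion/upgrade script suggested in this message might look something like the sketch below. It is a minimal illustration, not an actual Metron utility: the function name is hypothetical, it assumes that ':' appears in field names only as the de-dotted separator, and it rewrites the top-level keys of a single document held in memory. Reading documents from an index and writing them back is deliberately left out.

```python
# Hypothetical sketch: not part of Metron. Assumes ':' in a field name
# is always a de-dotted '.' and never a legitimate character.

def normalize_field_names(doc: dict) -> dict:
    """Return a copy of `doc` with ':' replaced by '.' in every top-level key."""
    converted = {}
    for name, value in doc.items():
        new_name = name.replace(":", ".")
        # Refuse to silently overwrite data if both spellings are present.
        if new_name in converted:
            raise ValueError(f"field name collision on {new_name!r}")
        converted[new_name] = value
    return converted


if __name__ == "__main__":
    old = {"source:type": "bro", "threat:triage:reason": "x", "ip_src_addr": "10.0.0.1"}
    print(normalize_field_names(old))
```

Failing on a collision, rather than picking a winner, is a design choice for the sketch: a document that already carries both 'source:type' and 'source.type' probably needs a human to look at it.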
Re: [DISCUSS] Field conversions
Before we construct a super generic solution, can we get an analysis of all the places in the UI where we're hard-coding fields? It seems like pulling the field from the global config is the strategy that we've gone with, and it could be expanded upon from https://github.com/apache/metron/pull/1010 (though that PR didn't quite get the semantics right, as it required https://github.com/apache/metron/pull/1038). Is there a reason why we wouldn't create a PR to refer to all of the hard-coded fields in the same way?

I guess my perspective is that this seems like a problem contained to the UI accessing a small number of hard-coded fields, and expansion of those fields seems pretty contained. If so, I'd suggest we continue with the pattern we already have. If you want to expand it, you might consider taking advantage of the fact that the global config can use maps and doing something like:

    {
      ...
      "fieldNameTransformations" : {
        "source:type" : "source.type",
        "threat:triage:reason" : "threat.triage.reason"
      }
    }

Whereby in the UI, when accessing a hard-coded field, it will look up the field in the fieldNameTransformations map from global config. If it exists in the map, then it'll use the translated field. If it does not, then it will use the field it passed in (e.g. source:type). That would allow us to add new translations easily, but it may be overkill if we're talking about 3 fields.

Another question: is there an easy way to bulk change field names in ES across many indices? Could we normalize on '.' and not do this translation at all? I think in order to do that, we'd need instructions on how to transition at least selected fields (i.e. those hard-coded fields in the UI).

Casey

On Mon, Jun 4, 2018 at 4:55 PM Ryan Merriman wrote:

> We've been dealing with a recurring challenge in Metron. It is common for various fields to contain '.' characters for the purpose of making them more readable, namespacing, etc. At one point we only supported Elasticsearch 2.3, which did not allow dots and forced us to use ':' instead. This limitation does not exist in later versions of Elasticsearch or Solr.
>
> Now we're in a situation where we need to allow a user to use either one because they may still be using ES 2.3 or have data with ':' characters in field names. We've attempted to make this configurable in a few different PRs:
>
> https://github.com/apache/metron/pull/1022
> https://github.com/apache/metron/pull/1010
> https://github.com/apache/metron/pull/1038
>
> The approaches taken in these are not consistent and fall short in different ways. The first (METRON-1569 Allow user to change field name conversion when indexing) only applies to indexing and not querying. The others only apply to a single field, which does not scale well. Now we have an issue with another field in https://issues.apache.org/jira/browse/METRON-1600. Rather than continuing with a patchwork of different fixes, I want to attempt to design a system-wide solution.
>
> My first thought is to expand https://github.com/apache/metron/pull/1022 to apply globally. However, this is not trivial and would require significant changes. It would also make https://github.com/apache/metron/pull/1010 obsolete and we might end up having to revert all of it.
>
> Does anyone have any ideas or opinions? I am still researching solutions but would love some guidance from the community.
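The fieldNameTransformations lookup proposed in this message, with its fall-back to the field as passed in, can be sketched in a few lines. This is an illustration only: the real UI is not written in Python, resolve_field is a hypothetical name, and the config literal simply repeats the two entries from the message (the elided "..." part of the global config is omitted rather than guessed).

```python
import json

# The two mappings quoted in the message; everything else in the global
# config is left out here.
GLOBAL_CONFIG = json.loads("""
{
  "fieldNameTransformations": {
    "source:type": "source.type",
    "threat:triage:reason": "threat.triage.reason"
  }
}
""")


def resolve_field(field: str, config: dict = GLOBAL_CONFIG) -> str:
    """Return the translated field name, or `field` itself if unmapped."""
    # A missing map behaves like an empty one, so an installation with no
    # transformations configured keeps using the hard-coded names.
    transformations = config.get("fieldNameTransformations", {})
    return transformations.get(field, field)
```

With this shape, adding a translation is a config change only, and removing the map entirely restores the original hard-coded behaviour.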
[DISCUSS] Field conversions
We've been dealing with a recurring challenge in Metron. It is common for various fields to contain '.' characters for the purpose of making them more readable, namespacing, etc. At one point we only supported Elasticsearch 2.3, which did not allow dots and forced us to use ':' instead. This limitation does not exist in later versions of Elasticsearch or Solr.

Now we're in a situation where we need to allow a user to use either one because they may still be using ES 2.3 or have data with ':' characters in field names. We've attempted to make this configurable in a few different PRs:

https://github.com/apache/metron/pull/1022
https://github.com/apache/metron/pull/1010
https://github.com/apache/metron/pull/1038

The approaches taken in these are not consistent and fall short in different ways. The first (METRON-1569 Allow user to change field name conversion when indexing) only applies to indexing and not querying. The others only apply to a single field, which does not scale well. Now we have an issue with another field in https://issues.apache.org/jira/browse/METRON-1600. Rather than continuing with a patchwork of different fixes, I want to attempt to design a system-wide solution.

My first thought is to expand https://github.com/apache/metron/pull/1022 to apply globally. However, this is not trivial and would require significant changes. It would also make https://github.com/apache/metron/pull/1010 obsolete and we might end up having to revert all of it.

Does anyone have any ideas or opinions? I am still researching solutions but would love some guidance from the community.