Well, on write it is a transformation, on read it's a translation.  This is
to say that you're providing a mapping on read to translate field names
given the index you're using.  The other approach that I was considering
last night is a field translation REST call that the UI could use to
translate field names.  So, the UI would pass 'source.type' to the field
translation service, and with Solr it'd return source.type and with ES it'd
return source:type.  Under the hood the service would use the same
transformation as the writer uses.  That's another way to skin this cat.
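
As a rough sketch of what that translation call would do underneath (Python
purely for illustration; the SEPARATORS table and the translate_field name
are made up here and are not existing Metron code, and the REST layer is
left out):

```python
# Hypothetical mapping from index backend to the separator it stores in
# field names.  These keys and values are assumptions for illustration.
SEPARATORS = {
    "solr": ".",
    "elasticsearch": ":",
}


def translate_field(name: str, index: str) -> str:
    """Translate a canonical dotted field name into the form used by `index`.

    The same function could back both the writer and a UI-facing service,
    so the two never disagree about how a field is spelled.
    """
    try:
        separator = SEPARATORS[index]
    except KeyError:
        raise ValueError(f"unknown index type: {index!r}") from None
    return name.replace(".", separator)


if __name__ == "__main__":
    for index in ("solr", "elasticsearch"):
        print(index, translate_field("source.type", index))
```

The point is only that the UI and the writer would share one mapping rather
than each carrying its own configuration.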

Ultimately, I think we should just ditch this field transformation
business, as Laurens said, as long as we have a utility to transform
existing data.
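
To make the utility idea concrete, here is a minimal sketch of the renaming
step, assuming documents are plain dicts keyed by field name.  The function
name is hypothetical, and reading from and writing back to the index is
deliberately not shown since that part depends on the backend:

```python
def normalize_field_names(document: dict) -> dict:
    """Return a copy of `document` with ':' in field names replaced by '.'.

    Raises ValueError if two fields would collapse into the same name
    (for example 'source:type' and 'source.type' in one document), since
    silently dropping one of them would lose data.
    """
    normalized = {}
    for key, value in document.items():
        new_key = key.replace(":", ".")
        if new_key in normalized:
            raise ValueError(f"field name collision on {new_key!r}")
        normalized[new_key] = value
    return normalized


if __name__ == "__main__":
    old_doc = {"source:type": "bro", "ip_src_addr": "10.0.0.1"}
    print(normalize_field_names(old_doc))
```

A real upgrade script would also have to deal with index templates and
anything the UI has saved against the old names.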

On Tue, Jun 5, 2018 at 8:54 AM Ryan Merriman <merrim...@gmail.com> wrote:

> Having 2 different patterns for configuring field name transformations on
> read vs write is confusing to me.  I agree with both of you that
> normalizing on '.' and not having to do the translation at all would be
> ideal.  Like you both suggested, we would need some utility or script to
> convert preexisting data to match this format.  There could also be some
> adjustments a user would need to make in the UI but I feel like we could
> document around that.  Are there any objections to doing it this way?
>
>
>
> On Mon, Jun 4, 2018 at 4:30 PM, Laurens Vets <laur...@daemon.be> wrote:
>
> > ES 2.x support officially ended 4 months ago (
> > https://www.elastic.co/support/eol), so why still support ':' at all? :)
> > Additionally, 2.x isn't even supported at all on the last 2 Ubuntu LTS
> > releases (16.04 & 18.04).
> >
> > Therefore, move everything to use '.' and provide a conversion/upgrade
> > script to change ':' to '.'?
> >
> >
> > On 2018-06-04 13:55, Ryan Merriman wrote:
> >
> >> We've been dealing with a recurring challenge in Metron.  It is common
> >> for various fields to contain '.' characters for the purpose of making
> >> them more readable, namespacing, etc.  At one point we only supported
> >> Elasticsearch 2.3, which did not allow dots and forced us to use ':'
> >> instead.  This limitation does not exist in later versions of
> >> Elasticsearch or in Solr.
> >>
> >> Now we're in a situation where we need to allow a user to use either one
> >> because they may still be using ES 2.3 or have data with ':' characters
> >> in field names.  We've attempted to make this configurable in a couple
> >> different PRs:
> >>
> >> https://github.com/apache/metron/pull/1022
> >> https://github.com/apache/metron/pull/1010
> >> https://github.com/apache/metron/pull/1038
> >>
> >> The approaches taken in these are not consistent and fall short in
> >> different ways.  The first (METRON-1569 Allow user to change field name
> >> conversion when indexing) only applies to indexing and not querying.
> >> The others only apply to a single field which does not scale well.  Now
> >> we have an issue with another field in
> >> https://issues.apache.org/jira/browse/METRON-1600.  Rather than
> >> continuing with a patchwork of different fixes I want to attempt to
> >> design a system-wide solution.
> >>
> >> My first thought is to expand https://github.com/apache/metron/pull/1022
> >> to apply globally.  However this is not trivial and would require
> >> significant changes.  It would also make
> >> https://github.com/apache/metron/pull/1010 obsolete and we might end up
> >> having to revert all of it.
> >>
> >> Does anyone have any ideas or opinions?  I am still researching
> >> solutions but would love some guidance from the community.
> >>
> >
>
