What I would really like to see is not a full end-to-end schema, but units that contribute schema. I don't want to see a parser, enrichment and indexing config bundled as one package, because in any given deployment, for any given sensor, I may have a different set of enrichments and so need a different output template.

What I would propose is that parsers and enrichments contribute partial schema (potentially expressed as Avro, but the important thing is just a map of fields to types) which can then be composed, with the Metron platform handling the creation of ES templates / Solr schema / Hive HCat schema / A.N.Other index's schema metadata as the composite of those pieces. So, a parser would contribute a set of fields, the fieldTransformations on the sensor would contribute some fields, and each enrichment block would contribute some fields, at which point we have enough schema definition to generate all the required artefacts for whatever storage it ends up in.

Essentially, composable partial schema units from each component, which add up at the end.

Does that make sense?

Simon

On 22 May 2018 at 14:10, Otto Fowler <ottobackwa...@gmail.com> wrote:

> We have discussed in the past as part of 777 ( moment of silence…. ) the
> idea that parsers/sensors ( or whatever we would call the complete
> parse/enrich path ) could define their ES or Solr schemas so that
> they can be ‘installed’ as part of metron and remove the requirement for a
> separate install by the system or by the user of a specific index template
> or equivalent.
>
> Nifi has settled on Avro schemas to describe their ‘record’ based data, and
> it makes me wonder if we might want to think of using Avro as a universal
> schema or the base for one such that we can define a schema and apply it to
> either ES or Solr.
>
> Thoughts?
>

--
--
simon elliston ball
@sireb
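
To make the composition idea above a bit more concrete, here is a minimal Python sketch of partial schemas as plain field-to-type maps that get merged and then rendered for two index back ends. Everything in it is an assumption for illustration only: the function names (`compose`, `to_es_mappings`, `to_solr_fields`), the logical type names, the type translation tables, and the example fields are hypothetical and are not part of Metron. The Elasticsearch and Solr output shapes are also deliberately simplified rather than complete templates.

```python
from typing import Dict, Iterable, List

# A partial schema is just a map of field name -> logical type.
PartialSchema = Dict[str, str]

# Hypothetical translation tables from logical types to back-end types.
ES_TYPES = {"string": "keyword", "long": "long", "double": "double",
            "ip": "ip", "timestamp": "date"}
SOLR_TYPES = {"string": "string", "long": "plong", "double": "pdouble",
              "ip": "string", "timestamp": "pdate"}


def compose(parts: Iterable[PartialSchema]) -> PartialSchema:
    """Merge partial schemas, rejecting fields declared with conflicting types."""
    combined: PartialSchema = {}
    for part in parts:
        for field, ftype in part.items():
            existing = combined.setdefault(field, ftype)
            if existing != ftype:
                raise ValueError(
                    f"field {field!r} declared as both {existing!r} and {ftype!r}")
    return combined


def to_es_mappings(schema: PartialSchema) -> dict:
    """Render the composite schema as a simplified Elasticsearch mappings body."""
    return {"mappings": {"properties": {
        field: {"type": ES_TYPES[ftype]} for field, ftype in sorted(schema.items())
    }}}


def to_solr_fields(schema: PartialSchema) -> List[dict]:
    """Render the composite schema as a simplified list of Solr field definitions."""
    return [{"name": field, "type": SOLR_TYPES[ftype]}
            for field, ftype in sorted(schema.items())]


# Illustrative contributions from each component of one sensor's pipeline.
parser_fields = {"ip_src_addr": "ip", "ip_dst_addr": "ip", "timestamp": "timestamp"}
transformation_fields = {"full_hostname": "string"}
geo_enrichment_fields = {"geo_country": "string", "geo_latitude": "double"}

schema = compose([parser_fields, transformation_fields, geo_enrichment_fields])
es_template = to_es_mappings(schema)
solr_fields = to_solr_fields(schema)
```

Swapping the enrichment list for a given deployment only changes the inputs to `compose`; the generated artefacts follow from whatever pieces are configured. Raising an error on conflicting types is one possible policy, and a real implementation might prefer a precedence rule instead.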