RE:
* One Module - yes, I think grouping the base parsers is good. I just
don’t want them to stay in -common; they should ‘live’ in the metron lib. I
think a grouped set of the primitive parsers is correct, but still on its own.
* ES Templates - they don’t *have* to be there, but if they are they will
be used. The idea I have is that “someone writing a parser should be
able to produce one thing, in one place”. We are talking with Simon on a
different thread about the types of indexing templates we could have. I
think we could have anything from nothing, to ES- or Solr-specific, to
something new.

As we discuss we can come up with the mv-pr.
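
To make the "optional, but used if present" point about templates concrete, here is a minimal Python sketch. It assumes a parser package is a directory that may or may not contain a template file; the file name `es_template.json` and the function `load_index_template` are hypothetical and do not exist in Metron.

```python
import json
from pathlib import Path
from typing import Optional

# Hypothetical name for the optional template file inside a parser package.
TEMPLATE_FILE = "es_template.json"


def load_index_template(package_dir: Path) -> Optional[dict]:
    """Return the package's index template, or None if it ships without one."""
    template_path = package_dir / TEMPLATE_FILE
    if not template_path.is_file():
        # No template: a non-indexed parser (e.g. a streaming enrichment).
        return None
    return json.loads(template_path.read_text())
```

A caller would deploy the template only when the result is not None, so a parser with no indexing needs simply omits the file.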

On February 17, 2017 at 15:47:57, Casey Stella (ceste...@gmail.com) wrote:

OK, this is a long one, so don't expect a coherent response just yet, but I
will give some initial impressions:

- I strongly agree with the premise of this idea. Making Metron
extensible is and should be among our top priorities, and at the
moment it's painful to develop a new parser.
- One maven module per parser may be overkill here as the shading is
costly and I think it may make some sense to group based on characteristics
in some way (e.g. json and csv may get grouped together).
- The notion of instance vs parser is a good one
- Binding ES templates and parsers may not be a good idea. You can have
non-indexed parsers (e.g. streaming enrichments).

Can we start small here and then iterate toward the complete vision? I'd
recommend

- Splitting the parsers up into some coherent organization with common
bits separated from the parser itself
- Having a maven archetype

These are the two most valuable and achievable parts of this idea, since they
are the bits required to enable users to create parsers without forking Metron.

On Fri, Feb 17, 2017 at 11:54 AM, Otto Fowler <ottobackwa...@gmail.com>
wrote:

> The ability for implementors and developers building on the project to
> ‘side load’, that is to build, maintain, and install, telemetry sources
> into the system without having to actually develop within METRON itself is
> very important.
>
> If done properly it gives developers an easier and more manageable
> proposition for extending METRON to suit their needs in what may be the
> most common extension case. It also may reduce the necessity to create and
> maintain forks of METRON.
>
> I would like to put forward a proposal on a way to move this forward, and
> ask the community for feedback and assistance in reaching an acceptable
> approach and raising the issues that I have surely missed.
>
> Conceptually what I would like to propose is the following:
>
> * What is currently metron-parsers should be broken apart such that each
> parser is its own individual component
> * Each of these components should be completely self contained ( or produce
> a self contained package )
> * These packages will include the shaded jar for the parser, default
> configurations for the parser and enrichment, default elasticsearch
> template, and a default log-rotate script
> * These packages will be deployed to disk in a new library directory under
> metron
> * Zookeeper should have a new telemetry or source area where all
> ‘installed’ sources exist
> * This area would host the default configurations, rules, templates, and
> scripts and metadata
> * Installed sources can be instantiated as named instances
> * Instantiating an instance will move the default configurations to what is
> currently the enrichment and parser areas for the instance name
> * It will also deploy the elasticsearch template for the instance
> name
> * It will deploy the log-rotate scripts
> * Installed and instantiated sources can be ‘redeployed’ from disk to
> upgrade
> * Installed sources are available for selection in ambari
> * question on post selection configuration, but we have that problem
> already
> * Instantiation is exposed through REST
> * the UI can install a new package
> * the UI can allow a workflow to edit the configurations and templates
> before finalizing
> * are there three states here? Installed | Edited | Instantiated?
> * the UI can edit existing and redeploy
> * possibly re-deploy ES template after adding fields or account for fields
> added by enrichment... manually or automatically?
> * a script can be made to instantiate a ‘base’ parser ( json, grok, csv )
> with only configuration
> * The installation and instantiation should be exposed through the Stellar
> management console
> * Starting a topology will now start the parser’s shaded jar found through
> the parser type ( which may need to be added to the configurations ) and the
> library
> * A Maven Archetype should be created for a parser | telemetry source
> project that allows the proper setup of a development project outside the
> METRON source tree
> * should be published
> * should have a useful default set
>
> So the developer’s workflow:
>
> * Create a new project from the archetype outside of the metron tree
> * edit the configurations, templates, rules etc in the project
> * code or modify the sample
> * build
> * run the installer script or the ui to upload/deploy the package
> * use the console or ui to create an instance
>
> QUESTIONS:
> * it seems strange to have this as ‘parsers’ when conceptually parsers are
> a part of the whole, should we introduce something like ‘source’ that is
> all of it?
> * should configurations etc be in ZK or on disk? or HDFS? or All of the
> above?
> * did you read this far? good!
> * I am sure that after hitting send I will think of 10 things that are
> missing from this
>
> I have started a POC of this, and thus far have created
> metron-parsers-common and started breaking out metron-parser-asa.
> I will continue to work through some of this here
> https://github.com/ottobackwards/incubator-metron/tree/METRON-258
>
> Again, thank you for your time and feedback.
>
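
The install-then-instantiate flow in the quoted proposal can be sketched roughly as follows. This is an in-memory illustration only, under the assumption that an installed package carries default configurations and that each named instance gets its own copy of them. The class names `SourcePackage`, `Instance`, and `SourceRegistry`, and the config keys used in the usage note, are hypothetical; none of this is Metron's actual API, and a real implementation would presumably store this state in Zookeeper rather than in dictionaries.

```python
import copy
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class SourcePackage:
    """Defaults shipped with an installed parser package (hypothetical shape)."""
    parser_type: str
    parser_config: dict
    enrichment_config: dict
    es_template: Optional[dict] = None  # optional: non-indexed parsers omit it


@dataclass
class Instance:
    """A named instance created from an installed package."""
    name: str
    parser_type: str
    parser_config: dict
    enrichment_config: dict
    es_template: Optional[dict]


class SourceRegistry:
    """In-memory stand-in for the proposed 'installed sources' area."""

    def __init__(self) -> None:
        self.installed: Dict[str, SourcePackage] = {}
        self.instances: Dict[str, Instance] = {}

    def install(self, package: SourcePackage) -> None:
        # Re-installing the same parser type replaces its defaults.
        self.installed[package.parser_type] = package

    def instantiate(self, parser_type: str, name: str) -> Instance:
        package = self.installed[parser_type]  # KeyError if not installed
        if name in self.instances:
            raise ValueError(f"instance {name!r} already exists")
        # Copy the defaults so edits to the instance never touch the package.
        instance = Instance(
            name=name,
            parser_type=parser_type,
            parser_config=copy.deepcopy(package.parser_config),
            enrichment_config=copy.deepcopy(package.enrichment_config),
            es_template=copy.deepcopy(package.es_template),
        )
        self.instances[name] = instance
        return instance
```

For example, after `registry.install(SourcePackage("asa", {"topic": "asa"}, {}))`, calling `registry.instantiate("asa", "asa-dmz")` yields an instance whose configuration can be edited independently of the installed defaults, which is what makes a later 'redeploy' from the package possible.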
