Re: [DISCUSS] SIDELOADING PARSERS: Packaging and Deployment

2017-03-10 Thread Casey Stella
I really don't like the 2x build size.  I think at this point we can do
something similar to the side-loading of Stellar functions to remove that
concern.  This should be easy now that that work is in master.

What I'd like to see as an MVP is:

   - the maven archetype having a "provided" dependency on metron-parsers
   - a JIRA to:
      - add a "parser.paths" field to the global config
      - modify the ParserBolt to use the VFSClassLoader to instantiate the
        parser
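A minimal sketch of the idea, assuming a "parser.paths" list in the global config; the JDK's URLClassLoader stands in here for the commons-vfs2 VFSClassLoader so the sketch is self-contained, and the class and method names are hypothetical:

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Paths;

public class ParserSideloader {
  // Hypothetical helper: resolve a parser class from jars listed in the
  // global config's "parser.paths" field. The actual change would use
  // commons-vfs2's VFSClassLoader inside ParserBolt; URLClassLoader is a
  // stand-in here to illustrate the same mechanism.
  public static Class<?> loadParserClass(String[] parserPaths, String className)
      throws Exception {
    URL[] urls = new URL[parserPaths.length];
    for (int i = 0; i < parserPaths.length; i++) {
      urls[i] = Paths.get(parserPaths[i]).toUri().toURL();
    }
    // Child-first loading is not needed: with metron-parsers marked
    // "provided" in the archetype, shared classes resolve from the
    // topology's parent classloader.
    ClassLoader loader =
        new URLClassLoader(urls, ParserSideloader.class.getClassLoader());
    return Class.forName(className, true, loader);
  }
}
```

This keeps the parser jar out of the topology uber jar entirely, which is what removes the 2x size concern.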

I'd consider that phase 1 as it solves the question of "how do I easily
create parsers without forking Metron" and it doesn't require shaded jars.

From there, I wouldn't be opposed to splitting the existing parsers into
separate projects, but I'd consider that a phase 2 activity.  We can trim
down metron-parsers at our leisure.

Thoughts?

On Fri, Mar 10, 2017 at 9:53 AM, Otto Fowler wrote:

> As previously discussed here, I have been working on side loading of
> parsers.  The goals of this work are:
> * Make it possible for developers to create, maintain and deploy parsers
> outside of the Metron code tree without having to fork
> * Create maven archetype support for developers of parsers
> * Introduce a parser ‘lifecycle’ to support multiple instances and
> configurations, with states such as installed, under configuration, and
> deployed
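The lifecycle states named above could be sketched as a small state machine. The state names and transitions below are illustrative only; the email does not define an actual API:

```java
// Hypothetical sketch of the parser lifecycle described above.
public enum ParserLifecycleState {
  INSTALLED, UNDER_CONFIGURATION, DEPLOYED;

  // Allowed transitions: installed -> under configuration -> deployed,
  // plus taking a deployed parser back into configuration.
  public boolean canTransitionTo(ParserLifecycleState next) {
    switch (this) {
      case INSTALLED:           return next == UNDER_CONFIGURATION;
      case UNDER_CONFIGURATION: return next == DEPLOYED;
      case DEPLOYED:            return next == UNDER_CONFIGURATION;
      default:                  return false;
    }
  }
}
```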
>
> I would like to have some discussion based on where I am after rebasing
> onto METRON-671, which revamps deployment to be totally ambari based.
>
>
> Packaging and Deployment
>
> I have not changed the packaging methodology that was already there, i.e.
> all the parsers are still shaded uber jars, and all are packaged into a
> tar.gz that includes /lib, /config, and /patterns.
>
> They are all explicitly called out in the copy resources portion of the
> rpm-docker pom, and explicitly configured in the metron.spec for rpm
> generation.
>
> When deployed, they go into a new directory, telemetry, under the metron
> home ( /usr/metron/0.3.1/telemetry ).
> Each parser has its own directory, which gives it an isolated environment:
>
> telemetry/asa
> /config
> /lib
> /patterns
>
> I could see adding a version here as well.  Also, this directory structure
> could change, if not by review then by other follow-on deployment options
> due to ‘thinning’.
>
> All the scripts and ambari services have been changed to account for this, and
> the start parser topology script is changed to find the right parser jar to
> use for the -s option ( as opposed to only loading metron-parsers jar which
> is the root of the issue ).
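As a rough illustration, the lookup the start script now has to do could look like the following, assuming the telemetry/&lt;sensor&gt;/lib layout above; the class and method names are hypothetical:

```java
import java.io.File;
import java.util.Optional;

public class ParserJarLocator {
  // Hypothetical helper mirroring what the start-parser-topology script
  // does for the -s option: pick the sideloaded jar for a sensor from
  // telemetry/<sensor>/lib instead of always using the metron-parsers jar.
  public static Optional<File> findParserJar(File metronHome, String sensor) {
    File libDir = new File(new File(new File(metronHome, "telemetry"), sensor), "lib");
    File[] jars = libDir.listFiles((dir, name) -> name.endsWith(".jar"));
    if (jars == null || jars.length == 0) {
      return Optional.empty();
    }
    return Optional.of(jars[0]); // assumes one uber jar per parser directory
  }
}
```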
>
> The packaging issues here:
>
> * 10 or so new uber jars roughly double the build size
> * Travis needs to be changed from a container build to a vm build to get a
> bigger space to work in
>
> The rpm issues here:
> * explicitly listing like items that *could* be iterated from a list is a
> code smell to me.  With ansible I was able to define a list and use
> with_items to get a nice, clean, maintainable flow.  With rpm and maven
> resources we have to have explicit entries.  My rpm and maven foo was not
> good enough to sort this out, so I just bit the bullet and did it.  I think
> we should explore copying using some kind of script or iteration in maven,
> and possibly generating metron.spec from a template ( this too is easier
> in ansible ).
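For instance, a tiny generator along these lines could emit the repetitive %files entries from a single parser list. The parser names and the %files layout shown here are illustrative, not the actual metron.spec contents:

```java
import java.util.List;

public class SpecFilesGenerator {
  // Hypothetical sketch: derive the repetitive rpm %files entries for each
  // parser from one list, instead of hand-maintaining them in metron.spec.
  public static String filesSection(List<String> parsers) {
    StringBuilder sb = new StringBuilder();
    for (String parser : parsers) {
      sb.append("%dir %{metron_home}/telemetry/").append(parser).append('\n');
      sb.append("%{metron_home}/telemetry/").append(parser).append("/lib/*\n");
      sb.append("%{metron_home}/telemetry/").append(parser).append("/config/*\n");
    }
    return sb.toString();
  }
}
```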
>
> Options here:
>
> 1)  Accept this as the MVP for the PR, with improvements to packaging and
> deployment as a follow on
> 2)  Delay the MVP and go for a possibly smaller, optimized deployment
>
>
> Going for a smaller deployment, that is, slimming the jars, is something I’ll
> talk about in another email :)
>

