Re: [DISCUSS] Metron Parsers in Nifi

Simon Elliston Ball Mon, 13 Aug 2018 06:42:48 -0700

Yep, I'm wondering whether our parser interface should have the ability to
create schema either like that, or well, that, which would be helpful
within Metron as well.
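
To make that idea concrete, here is a rough sketch. It is written in Python for brevity rather than against the real Java interfaces, and every class, method and field name in it is made up for illustration. The assumption is that a parser may optionally declare the fields it produces, and that a parser which declares nothing is treated as "just a map":

```python
from typing import Dict, List, Optional


class Parser:
    """Hypothetical stand-in for a parser interface; not Metron's actual API."""

    def parse(self, raw: bytes) -> List[Dict[str, object]]:
        raise NotImplementedError

    def schema(self) -> Optional[Dict[str, str]]:
        # None means "no declared schema": the output is just a map.
        return None


class KeyValueParser(Parser):
    """Schema-less parser: field names depend entirely on the input."""

    def parse(self, raw: bytes) -> List[Dict[str, object]]:
        pairs = raw.decode("utf-8").split()
        return [dict(pair.split("=", 1) for pair in pairs)]


class FlowParser(Parser):
    """Parser with a fixed layout, so it can declare its fields up front."""

    FIELDS = {"ip_src_addr": "string", "timestamp": "long"}

    def parse(self, raw: bytes) -> List[Dict[str, object]]:
        src, ts = raw.decode("utf-8").split(",")
        return [{"ip_src_addr": src, "timestamp": int(ts)}]

    def schema(self) -> Optional[Dict[str, str]]:
        return dict(self.FIELDS)


def merged_schema(parsers: List[Parser]) -> Dict[str, str]:
    """Combine the declared fragments, rejecting conflicting field types."""
    merged: Dict[str, str] = {}
    for parser in parsers:
        for name, type_name in (parser.schema() or {}).items():
            if merged.setdefault(name, type_name) != type_name:
                raise ValueError(f"conflicting types for field {name!r}")
    return merged
```

A reader could set its schema explicitly from `schema()` when one is declared and fall back to schema-less handling otherwise. The same fragments could in principle feed generated index schemas.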


@Otto, the one thing missing from the record reader api, is that if you
don't emit any records at all for a flow file, it errors, which is not
strictly speaking an error, but yeah, we can certainly control things like
filtering errors aside from this. I would say this was a nifi bug
(debatably) which should be fixed on that side.
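
A minimal sketch of the behaviour being asked for. This is plain Python, not the NiFi record reader API, and all of the names are hypothetical. The point is only that "no records emitted" is a normal result, and that an error is raised only when the input is genuinely malformed:

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple

Record = Dict[str, object]


class ParseFailure(Exception):
    """Raised only when input is genuinely malformed."""


def read_records(
    lines: Iterable[str],
    parse: Callable[[str], Optional[Record]],
) -> Tuple[List[Record], int]:
    """Return (records, skipped). An empty record list is a valid result."""
    records: List[Record] = []
    skipped = 0
    for line in lines:
        record = parse(line)  # may raise ParseFailure for bad input
        if record is None:
            skipped += 1  # consumed for state only, nothing to emit
        else:
            records.append(record)
    return records, skipped


def parse_line(line: str) -> Optional[Record]:
    """Toy parser: 'template:' lines carry state only, 'data:' lines emit."""
    kind, _, payload = line.partition(":")
    if kind == "template":
        return None
    if kind == "data":
        return {"payload": payload}
    raise ParseFailure(f"unknown line type: {kind!r}")
```

With this shape, `read_records(["template:t1"], parse_line)` returns an empty list plus a skip count instead of failing, which is the outcome a template-only flow file would need.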

Simon

On 13 August 2018 at 14:29, Otto Fowler <ottobackwa...@gmail.com> wrote:

> Also, if we are doing the record readers, we can have a reader for a
> parser type and explicitly set the schema, as seen here :
> https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/syslog/Syslog5424Reader.java
>
>
>
> On August 13, 2018 at 09:26:50, Otto Fowler (ottobackwa...@gmail.com)
> wrote:
>
> If we can do the record readers ourselves ( with the parsers inside them )
> we can handle the returns.
> I’ll be doing the net flow 5 readers once the net flow 5 processor PR (
> not mine ) is in.
>
> I don’t think having a generic class loading parsers foo and having to
> manage all that is preferable to having
> an archetype and explicit parsers.
>
> Nifi processors and readers are self documenting, and this approach will
> make that not possible, as another consideration.
>
>
>
> On August 13, 2018 at 06:50:09, Simon Elliston Ball (
> si...@simonellistonball.com) wrote:
>
> Maybe the edge use case will clarify the config issue a little. The reason
> I would want to be able to push Metron parsers into NiFi would be so I can
> pre-parse and filter on the edge to save bandwidth from remote locations. I
> would expect to be able to parse at the edge and use NiFi to prioritise or
> filter on the Metron ready data, then push through to a 'NoOp' parser in
> Metron. For this to happen, we would absolutely not want to connect to
> Zookeeper, so I'm +1 on Otto's suggestion that the config be embeddable in
> NiFi properties. We cannot assume ZK connectivity from NiFi.
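
One way to read "embeddable in NiFi properties" is sketched below. It is Python rather than NiFi code, and the function and parameter names are illustrative assumptions. An inline JSON property wins when it is set, and a remote source such as ZK is consulted only when one has been configured:

```python
import json
from typing import Callable, Dict, Optional


def resolve_parser_config(
    inline_json: Optional[str],
    fetch_remote: Optional[Callable[[], str]] = None,
) -> Dict[str, object]:
    """Prefer the embedded config; fall back to a remote source if given."""
    if inline_json and inline_json.strip():
        # Edge case: config travels with the flow, no ZK connection needed.
        return json.loads(inline_json)
    if fetch_remote is not None:
        # Optional path for deployments that do have ZK connectivity.
        return json.loads(fetch_remote())
    raise ValueError("no parser config: set the inline property or a remote source")
```

Because the remote fetch is only ever called when no inline config is present, an edge deployment that sets the property never attempts a connection.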
>
> I can also see a scenario where NiFi might make it easier to chain parsers,
> which is where it overlaps more with Metron. This is more about the fact
> that NiFi makes it a lot easier to configure and manage complex multi-step
> flows than Metron, and is way more user intuitive from a design and
> monitoring perspective. My main concern around using NiFi in this way is
> about the load on the content repository. We are looking at a lot of
> content level transformation here. You could argue that the same load is
> taken off Kafka in the chaining scenario, but there is still a chance for a
> user to accidentally create a lot of disk access if they go over the top
> with NiFi.
>
> I see this as potentially a chance to make the Metron Parser interface
> compatible with NiFi Record Readers. Then both communities could benefit
> from sharing each other's parsers.
>
> In terms of the NAR approach, I would say we have a base bundle of the NiFi
> bits (https://github.com/simonellistonball/metron/tree/nifi already has
> this for stellar, enrichments and an opinionated publisher, it also has a
> readme with some discussion around this
> https://github.com/simonellistonball/metron/tree/nifi/nifi-metron-bundle).
> We can then use other nar dependencies to side load parser classes into the
> record reader. We would then need to do some fancy property validation in
> NiFi to ensure the classes were available.
>
> Also, Record Readers are much much faster. The only problem I've found with
> them is that they error on blank output, which was a problem for me writing
> a netflow 9 reader (template only records need to live in NiFi cache, but
> not be emitted).
>
> In terms of the schema objection, I'm not sure why schema focus is a
> problem. Our parsers have implicit schema and the output schema formats
> used in NiFi are very flexible and could be "just a map". That said, we
> could also take the opportunity to introduce a method to the parser
> interface to emit traits to contribute the bits of schema that a parser
> produces. This would ultimately lead to us being able to generate output
> schemas (ES, Solr, Hive, whatever), which would take a lot of the pain out
> of setup for sensors.
>
> Simon
>
> On 9 August 2018 at 16:42, Otto Fowler <ottobackwa...@gmail.com> wrote:
>
> > I would say that
> >
> > - For each configuration parameter we want to pull in, it should be
> > explicitly configured through a property as well as through a controller
> > service that accesses the metron zk
> > - Transformations should not be conflated with parsing in those
> > processors or readers
> >
> > There is no on the fly configuration change in nifi ( You can’t change
> > properties once started ).
> >
> > Wouldn’t the simplest minimal start be to say that we expect either nifi
> > or metron and simplify things? Let nifi nifi, let metron metron.
> >
> >
> > On August 9, 2018 at 10:53:24, Justin Leet (justinjl...@gmail.com)
> > wrote:
> >
> > That's definitely good info, thanks for reaching out to them about it.
> >
> > In terms of exposing/sharing, I don't think we have to couple them
> > tightly (in fact, I think we should loosen the coupling as much as
> > possible without forcing reimplementation of things). I think there's
> > definitely a way to do that in terms of the general purpose processor I
> > proposed (or in terms of RecordReader or another implementation).
> >
> > It would definitely be easy enough to configure it to either pull from ZK
> > or to use a parser config json extract as a parameter (to maintain the
> > same formatting and make migration easy). And we can still build specific
> > NiFi-oriented parsers as needed (that manage things like Schema via the
> > registry and other Nifi mechanisms). This keeps parsers entirely
> > decoupled from a metron installation.
> >
> > Alternatively, we extract our config handling to a module and scripts we
> > can package up and easily deploy configs against ZK (or maybe Nifi's
> > StateControllers or whatever). We definitely shouldn't need absolutely
> > everything installed to be able to run just parsers on Nifi.
> >
> > Having said that, right now the easiest way we have to maintain on the
> > fly updatable configs (and updatable is important!) is via ZK. Params in
> > Nifi aren't quite that flexible, to the best of my knowledge (i.e. you
> > have to stop, update config and restart). We might be able to exploit the
> > StateController to manage this for us, but I'm honestly not familiar
> > enough with it and for deployments split between NiFi and Storm, it means
> > configuration gets managed in a couple different ways (which may be fine
> > with users since there is a fairly bright-line delineation which makes it
> > easier to accept). There are some complicated configs like
> > fieldTransforms, which is part of why I would like things to be
> > configured in the same format (if not the same mechanism).
> >
> > Ideally, in my mind, the parsers shared between both NiFi and Storm just
> > implement the very general MessageParser interface (which is pretty
> > minimal, a couple setup methods, validation, and the actual parse). This
> > is pretty lightweight and the split of metron-parsers into
> > metron-parsers-common et al. would loosen the coupling between parsers
> > and the rest of metron into that core needed to support that.
> >
> > IMO, at that point, we'd have a pretty minimal NAR (or NARs depending on
> > config management) that lets us run our set of parsers, lets users build
> > new parsers (and doesn't block specialized NiFi implementations that
> > exploit NiFi's feature set), and lets us get things configured in a
> > relatively consistent manner, without losing features, and hopefully
> > requiring a pretty minimal slice of Metron to be useful.
> >
> > On Thu, Aug 9, 2018 at 10:06 AM Otto Fowler <ottobackwa...@gmail.com>
> > wrote:
> >
> > > I think the benefits are clear. What is unclear is if the goal is to
> > > expose or share or re-use Metron capabilities ( stellar, parsing ) in
> > > nifi in a way that is native to nifi ( configured and managed in nifi ),
> > > where you may not even need metron ( say you just want to parse asa ) or
> > > if the goal is to have a hybrid approach coupling the processors/readers
> > > to the metron installation.
> > >
> > >
> > > On August 9, 2018 at 09:14:58, Justin Leet (justinjl...@gmail.com)
> > > wrote:
> > >
> > > I'll add onto Mike's discussion with the original set of requirements I
> > > had in mind (and apply feedback on these as necessary!). This largely
> > > overlaps with what Mike said, but I want to make sure it's clear where
> > > my proposal was coming from, so we can improve on it as needed. James
> > > and Mike are also right, I think I skipped over the benefits of NiFi in
> > > general a bit, so thanks for chiming in there.
> > >
> > > - Deploy our bundled parsers without needing custom wrapping on all of
> > > them.
> > > - Don't prevent ourselves from building custom wrapping as needed.
> > > - Custom Java parsers with an easy way to hook in, similar to what we
> > > already do in Storm.
> > > - One stop (or at least one format) configuration, for the case when
> > > we're doing some things in NiFi (parsers) and some elsewhere (enrichment
> > > and indexing). I don't think it'll always be "start in NiFi, end in
> > > Storm", especially as we build out Stellar capability, but I also don't
> > > want users learning a different set of configs and config tools for
> > > every platform we run on.
> > > - Ability to build out parsers and other systems fairly easily, e.g.
> > > Spark.
> > > - Support our current use cases (in particular parser chaining as a more
> > > advanced use case).
> > >
> > > It really boils down to providing a relatively simple user path to be
> > > able to migrate to NiFi as needed or desired as simply as possible in a
> > > very general way, while not preventing parser by parser enhancements.
> > >
> > > On Wed, Aug 8, 2018 at 7:14 PM Michael Miklavcic <
> > > michael.miklav...@gmail.com> wrote:
> > >
> > > > I think it also provides customers greater control over their
> > > > architecture by giving them the flexibility to choose where/how to
> > > > host their parsers.
> > > >
> > > > To Justin's point about the API, my biggest concern about the
> > > > RecordReader approach is that it is not stable. We already have a
> > > > similar problem in having the TransportClient in ElasticSearch - they
> > > > are prone to changing it in minor versions with the advent of their
> > > > newer REST API, which is problematic for ensuring a stable
> > > > installation.
> > > >
> > > > From my own perspective, our goal with NiFi, at least in part, should
> > > > be the ability to deploy our core parsing infrastructure, i.e.
> > > >
> > > > - pre-built parsers
> > > > - custom java parsers
> > > > - Stellar transforms
> > > > - custom stellar transforms
> > > >
> > > > And have the ability to configure it similarly to how we configure
> > > > parsers within Storm. Consistent with our recent parser chaining and
> > > > aggregation feature, users should be able to construct and deploy
> > > > similar constructs in NiFi. The core architectural shift would be that
> > > > parser code should be platform agnostic. We provide the plumbing in
> > > > Storm, NiFi, and <Spark Streaming?, other> and platform architects and
> > > > devops teams can choose how and where to deploy.
> > > >
> > > > Best,
> > > > Mike
> > > >
> > > >
> > > > On Wed, Aug 8, 2018 at 9:57 AM James Sirota <jsir...@apache.org>
> > > > wrote:
> > > >
> > > > > Integration with NiFi would be useful for parsing low-volume
> > > > > telemetries at the edge. This is a much more resource friendly way
> > > > > to do it than setting up dedicated storm topologies. The integration
> > > > > would be that the NiFi processor parses the data and pushes it
> > > > > straight into the enrichment topic, saving us the resources of
> > > > > having multiple parsers in storm.
> > > > >
> > > > > Thanks,
> > > > > James
> > > > >
> > > > > 07.08.2018, 11:29, "Otto Fowler" <ottobackwa...@gmail.com>:
> > > > > > Why don't we start over? We are going back and forth on
> > > > > > implementation, and I don’t think we have the same goals or
> > > > > > concerns.
> > > > > >
> > > > > > What would be the requirements or goals of metron integration with
> > > > > > Nifi?
> > > > > > How many levels or options for integration do we have?
> > > > > > What are the approaches to choose from?
> > > > > > Who are the target users?
> > > > > >
> > > > > > On August 7, 2018 at 12:24:56, Justin Leet (justinjl...@gmail.com)
> > > > > > wrote:
> > > > > >
> > > > > > So how does the MetronRecordReader roll into everything? It seems
> > > > > > like it'd be more useful on the reader per format approach, but
> > > > > > otherwise it doesn't really seem like we gain much, and it
> > > > > > requires getting everything linked up properly to be used.
> > > > > > Assuming we looked at doing it that way, is the idea that we'd
> > > > > > set up a ControllerService with the MetronRecordReader and a
> > > > > > MetronRecordWriter and then have the StellarTransformRecord
> > > > > > processor configured with those ControllerServices? How do we
> > > > > > manage the configurations of everything that way? How does the
> > > > > > ControllerService get configured with whatever parser(s) are
> > > > > > needed in the flow? Basically, what's your vision for how
> > > > > > everything would tie together?
> > > > > >
> > > > > > I also forgot to mention this in the original writeup, but there's
> > > > > > another reason to avoid the RecordReader: It's not considered
> > > > > > stable. See:
> > > > > > https://github.com/apache/nifi/blob/master/nifi-commons/nifi-record/src/main/java/org/apache/nifi/serialization/RecordReader.java#L34
> > > > > > That alone makes me super hesitant to use it, if it can shift out
> > > > > > from under us in even an incremental version.
> > > > > >
> > > > > > I'm also unclear on why StellarTransformRecord processor matters
> > > > > > for either approach. With the Processor approach you could simply
> > > > > > follow it up with the Stellar processor, the same way you would in
> > > > > > the RecordReader approach. The Stellar processor should be a
> > > > > > parallel improvement, not a conflicting one.
> > > > > >
> > > > > > On Tue, Aug 7, 2018 at 11:50 AM Otto Fowler <ottobackwa...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > >> A Metron Processor itself isn’t really necessary. A
> > > > > >> MetronRecordReader ( either the megalithic or a reader per format )
> > > > > >> would be a good approach. Then have StellarTransformRecord
> > > > > >> processor that can do Stellar on _any_ record, regardless of
> > > > > >> source.
> > > > > >>
> > > > > >> On August 7, 2018 at 11:06:22, Justin Leet (justinjl...@gmail.com)
> > > > > >> wrote:
> > > > > >>
> > > > > >> Thanks for the comments, Otto, this is definitely great feedback.
> > > > > >> I'd love to respond inline, but the email's already starting to
> > > > > >> lose its formatting, so I'll go with the classic "wall of text".
> > > > > >> Let me know if I didn't address everything.
> > > > > >>
> > > > > >> Loading modules (or jars or whatever) outside of our Processor
> > > > > >> gives us the benefit of making it incredibly easy for users to
> > > > > >> create their own parsers. I would definitely expect our own
> > > > > >> bundled parsers to be included in our base NAR, but loading
> > > > > >> modules enables users to only have to learn how Metron wants our
> > > > > >> stuff lined up and just plug it in. Having said that, I could see
> > > > > >> having a wrapper for our bundled parsers that makes it really
> > > > > >> easy to just say you want a MetronAsaParser or MetronBroParser,
> > > > > >> etc. That would give us the best of both worlds, where it's easy
> > > > > >> to set up our bundled parsers and also trivial to pull in
> > > > > >> non-bundled parsers. What doing this gives us is an easy way to
> > > > > >> support (hopefully) every parser that gets made, right out of the
> > > > > >> box, without us needing to build a specialized version of
> > > > > >> everything until we decide to and without users having to jump
> > > > > >> through hoops.
> > > > > >>
> > > > > >> None of this prevents anyone from creating specialized parsers
> > > > > >> (for perf reasons, or to use the schema registries, or anything
> > > > > >> else). It's probably worthwhile to package up some of the
> > > > > >> built-in parsers and customize them to use more specialized
> > > > > >> features appropriately as we see things get used in the wild.
> > > > > >> Like you said, we could likely provide Avro schemas for some of
> > > > > >> this and give users a more robust experience on what we choose to
> > > > > >> support and provide guidance for other things. I'm also worried
> > > > > >> that building specialized schemas becomes problematic for things
> > > > > >> like parser chaining (where our routers wrap the underlying
> > > > > >> messages and add on their own info). Going down that road
> > > > > >> potentially requires anything wrapped to have a specialized
> > > > > >> schema for the wrapped version in addition to a vanilla version
> > > > > >> (although please correct me if I'm missing something there, I'll
> > > > > >> openly admit to some shakiness on how that would be handled).
> > > > > >>
> > > > > >> I also disagree that this is un-Nifi-like, although I'm
> > > > > >> admittedly not as skilled there. The basis for doing this is
> > > > > >> directly inspired by the JoltTransformer, which is extremely
> > > > > >> similar to the proposed setup for our parsers: Simply take a spec
> > > > > >> (in this case the configs, including the fieldTransformations),
> > > > > >> and delegate a mapping from bytes[] to JSON. The Jolt library
> > > > > >> even has an Expression Language (check out
> > > > > >> https://community.hortonworks.com/articles/105965/expression-language-with-jolt-in-apache-nifi.html
> > > > > >> ), so it's not a foreign concept. I believe Simon Ball has
> > > > > >> already done some experimenting around with getting Stellar
> > > > > >> running in NiFi, and I'd love to see Stellar more readily
> > > > > >> available in NiFi in general.
> > > > > >>
> > > > > >> Re: the ControllerService, I see this as a way to maintain
> > > > > >> Metron's use of ZK as the source of config truth. Users could
> > > > > >> definitely be using NiFi and Storm in tandem (parse in NiFi +
> > > > > >> enrich and index from Storm, for example). Using the
> > > > > >> ControllerService gives us a ZK instance as the single source of
> > > > > >> truth. That way we aren't forcing users to go to two different
> > > > > >> places to manage configs. This also lets us leverage our existing
> > > > > >> scripts and our existing infrastructure around configs and their
> > > > > >> management and validation very easily. It also gives users a way
> > > > > >> to port from NiFi to Storm or vice-versa without having to
> > > > > >> migrate configs as well. We could also provide the option to
> > > > > >> configure the Processor itself with the data (just don't set up a
> > > > > >> controller service and provide the json or whatever as one of our
> > > > > >> properties).
> > > > > >>
> > > > > >> On Tue, Aug 7, 2018 at 10:12 AM Otto Fowler <ottobackwa...@gmail.com>
> > > > > >> wrote:
> > > > > >>
> > > > > >>> I think this is a good idea. As I mentioned in the other thread
> > > > > >>> I’ve been doing a lot of work on Nifi recently.
> > > > > >>> I think the important thing is that what is done should be done
> > > > > >>> the NiFi way, not bolting the Metron composition onto Nifi. Think
> > > > > >>> of it like the Tao of Unix, the parsers and components should be
> > > > > >>> single purpose and simple, allowing exceptional flexibility in
> > > > > >>> composition.
> > > > > >>>
> > > > > >>> Comments inline.
> > > > > >>>
> > > > > >>> On August 7, 2018 at 09:27:01, Justin Leet (justinjl...@gmail.com)
> > > > > >>> wrote:
> > > > > >>>
> > > > > >>> Hi all,
> > > > > >>>
> > > > > >>> There's interest in being able to run Metron parsers in NiFi,
> > > > > >>> rather than inside Storm. I dug into this a bit, and have some
> > > > > >>> thoughts on how we could go about this. I'd love feedback on
> > > > > >>> this, along with anything we'd consider must haves as well as
> > > > > >>> future enhancements.
> > > > > >>>
> > > > > >>> 1. Separate metron-parsers into metron-parsers-common and
> > > > > >>> metron-storm and create metron-parsers-nifi. For this code to be
> > > > > >>> reusable across platforms (NiFi, Storm, and anything else in the
> > > > > >>> future), we'll need to decouple our parsers and Storm.
> > > > > >>>
> > > > > >>> +1. The “parsing code” should be a library that implements an
> > > > > >>> interface ( another library ).
> > > > > >>>
> > > > > >>> The Processors and the Storm things can share them.
> > > > > >>>
> > > > > >>> - There are also some nice fringe benefits around refactoring our
> > > > > >>> code to be substantially more clear and understandable; something
> > > > > >>> which came up while allowing for parser aggregation.
> > > > > >>> 2. Create a MetronProcessor that can run our parsers.
> > > > > >>> - I took a look at how RecordReader could be leveraged (e.g.
> > > > > >>> CSVRecordReader), but this is pretty tightly tied into schemas
> > > > > >>> and is meant to be used by ControllerServices, which are then
> > > > > >>> used by Processors. There's friction involved there in terms of
> > > > > >>> schemas, but also in terms of access to ZK configs and things
> > > > > >>> like parser chaining. We might be able to leverage it, but it
> > > > > >>> seems like it'd be fairly shoehorned in without getting the
> > > > > >>> schema and other benefits.
> > > > > >>>
> > > > > >>> We won’t have to provide our ‘no schema processors’ ( grok, csv,
> > > > > >>> json ).
> > > > > >>>
> > > > > >>> All the remaining processors DO have schemas that we know about.
> > > > > >>> We can just provide the avro schemas the same way we provide the
> > > > > >>> ES schemas.
> > > > > >>>
> > > > > >>> The “parsing” should not be conflated with the transform/stellar
> > > > > >>> in NiFi. We should make that separate. Running Stellar over
> > > > > >>> Records would be the best thing.
> > > > > >>>
> > > > > >>> - This Processor would work similarly to Storm: bytes[] in ->
> > > > > >>> JSON out.
> > > > > >>> - There is a Processor
> > > > > >>> <https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/JoltTransformJSON.java>
> > > > > >>> that handles loading other JARs that we can model a
> > > > > >>> MetronParserProcessor off of that handles classpath/classloader
> > > > > >>> issues (basically just sets up a classloader specific to what's
> > > > > >>> being loaded and swaps out the Thread's loader when it calls to
> > > > > >>> outside resources).
> > > > > >>>
> > > > > >>> There should be no reason to load modules outside the NAR. Why do
> > > > > >>> you expect to? If each Metron Processor equiv of a Metron Storm
> > > > > >>> Parser is just parsing to json it shouldn’t need much. And we
> > > > > >>> could package them in the NAR. I would suggest we have a
> > > > > >>> Processor per Parser to allow for specialization. It should all
> > > > > >>> be in the nar.
> > > > > >>>
> > > > > >>> The Stellar Processor, if you would support the works would
> > > > > >>> possibly need this.
> > > > > >>>
> > > > > >>> 3. Create a MetronZkControllerService to supply our configs to
> > > > > >>> our processors.
> > > > > >>> - This is a pretty established NiFi pattern for being able to
> > > > > >>> provide access to other services needed by a Processor (e.g.
> > > > > >>> databases or large configuration files).
> > > > > >>> - The same controller service can be used by all Processors to
> > > > > >>> manage configs in a consistent manner.
> > > > > >>>
> > > > > >>> I think controller services would make sense where needed, I’m
> > > > > >>> just not sure what you imagine them being needed for?
> > > > > >>>
> > > > > >>> If the user has NiFi, and a Registry etc, are you saying you
> > > > > >>> imagine them using Metron + ZK to manage configurations? Or to be
> > > > > >>> using BOTH storm processors and Nifi Processors?
> > > > > >>>
> > > > > >>> At that point, we can just NAR our controller service and parser
> > > > > >>> processor up as needed, deploy them to NiFi, and let the user
> > > > > >>> provide a config for where their custom parsers can be provided
> > > > > >>> (i.e. their parser jar). This would be 3 nars (processor,
> > > > > >>> controller-service, and controller-service-api in order to bind
> > > > > >>> the other two together).
> > > > > >>>
> > > > > >>> Once deployed, our ability to use parsers should fit well into
> > > > > >>> the standard NiFi workflow:
> > > > > >>>
> > > > > >>> 1. Create a MetronZkControllerService.
> > > > > >>> 2. Configure the service to point at zookeeper.
> > > > > >>> 3. Create a MetronParser.
> > > > > >>> 4. Configure it to use the controller service + parser jar
> > > > > >>> location + any other needed configs.
> > > > > >>> 5. Use the outputs as needed downstream (either writing out to
> > > > > >>> Kafka or feeding into more MetronParsers, etc.)
> > > > > >>>
> > > > > >>> Chaining parsers should ideally become a matter of chaining
> > > > > >>> MetronParsers (and making sure the enveloping configs carry
> > > > > >>> through properly). For parser aggregation, I'd just avoid it
> > > > > >>> entirely until we know it's needed in NiFi.
> > > > > >>>
> > > > > >>> Justin
> > > > >
> > > > > -------------------
> > > > > Thank you,
> > > > >
> > > > > James Sirota
> > > > > PMC- Apache Metron
> > > > > jsirota AT apache DOT org
> > > > >
> > > > >
> > > >
> > >
> > >
> >
>
>
>
> --
> --
> simon elliston ball
> @sireb
>
>


-- 
--
simon elliston ball
@sireb
