Great

I will have PIRK-63 done sometime this weekend, which will help.  Then go
ahead with these suggestions as a base; I may come back with some thoughts
about the CLI.  I'd like new responders not to have to modify pirk-core.
There are a few ways I've done this before, but I need to decide which will
be least intrusive and easiest to maintain.
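
For concreteness, one minimal, non-intrusive option would look something like
this (all names below are hypothetical, nothing that exists in Pirk today): a
small plugin interface in pirk-core, with implementations discovered from jars
on the classpath via java.util.ServiceLoader, so adding a responder never
touches pirk-core.

  import java.util.ServiceLoader;

  // Hypothetical plugin interface; it would live in pirk-core, and each
  // framework module would ship an implementation plus a META-INF/services
  // entry for it in its own jar.
  interface ResponderPlugin
  {
    String getPlatformName(); // e.g. "storm", "spark", "mapreduce"

    void run() throws Exception; // launch the responder for this platform
  }

  // Driver-side lookup: finds implementations on the classpath at runtime,
  // so a new responder is just another jar.
  class ResponderPluginLoader
  {
    static ResponderPlugin load(String platform)
    {
      for (ResponderPlugin plugin : ServiceLoader.load(ResponderPlugin.class))
      {
        if (plugin.getPlatformName().equalsIgnoreCase(platform))
        {
          return plugin;
        }
      }
      throw new IllegalArgumentException("No responder found for platform: " + platform);
    }
  }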

Darin

On Sep 15, 2016 6:17 PM, "Ellison Anne Williams" <eawilli...@apache.org>
wrote:

> On Thu, Sep 15, 2016 at 9:25 AM, Tim Ellison <t.p.elli...@gmail.com>
> wrote:
>
> > On 15/09/16 09:21, Darin Johnson wrote:
> > > So my goal for the submodule refactor is pretty straightforward: I
> > > basically want to separate the project into pirk-core, pirk-hadoop,
> > > pirk-spark, and pirk-storm.  I think separating pirk-core and pirk-hadoop
> > > is very ambitious at this point, as there are a lot of dependencies we'd
> > > need to resolve.
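> > >
> > > Roughly, I'm thinking of a layout along these lines (module contents are
> > > approximate and up for discussion):
> > >
> > >   incubator-pirk/
> > >     pirk-core/     (querier, crypto, schemas, common utilities)
> > >     pirk-hadoop/   (MapReduce responder)
> > >     pirk-spark/    (Spark responder)
> > >     pirk-storm/    (Storm responder)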
> >
> > I think it is quite doable, but agree that it is more work than the
> > others.
> >
> > > pirk-storm and pirk-spark would be much more reasonable starts.  I'd
> > > also recommend we do something about the Elasticsearch dependency; it
> > > seems more like an InputFormat option than part of pirk-core.
> > >
> > > There are a few blockers to this:
> > >
> > > The first is PIRK-63: here the ResponderDriver was calling the Responder
> > > class of each specific framework.  That fix is straightforward - pass the
> > > class as an argument.  I've started that here:
> > > https://github.com/DarinJ/incubator-pirk/tree/Pirk-63 (the PR was expected
> > > earlier, but I had a rebase issue, so I didn't get around to completing a
> > > few bits).  It also allows, at least at a rudimentary level, adding new
> > > responders by putting jars on the classpath instead of recompiling Pirk.
> > > I'm open to suggestions here - I think it's very likely ResponderLauncher
> > > isn't needed and run could instead be a static member of another class.
> > > However, based on what was in ResponderDriver, this seems to be the
> > > approach with the fewest issues, especially for Storm.
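> > >
> > > Roughly, the driver side looks something like this (simplified - the
> > > exact ResponderLauncher signature in the branch may differ):
> > >
> > >   // The driver is handed a launcher class name and loads it
> > >   // reflectively, so it only depends on the interface and each
> > >   // framework's responder can live in its own jar on the classpath.
> > >   public static void launch(String launcherClassName) throws Exception
> > >   {
> > >     ResponderLauncher launcher =
> > >         (ResponderLauncher) Class.forName(launcherClassName).newInstance();
> > >     launcher.run();
> > >   }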
> >
> > Give a shout when you want somebody to take a look.
> >
> > > Another is how we're passing the command-line options in ResponderCLI:
> > > here we're defining framework-specific elements in the Driver, which are
> > > then passed to the underlying framework Driver/Topology/ToolRunner.  This
> > > is more difficult to address cleanly, so it seems like a good place to
> > > start a discussion.  I think this mechanism should be addressed, though,
> > > as putting in options for every framework/InputFormat anyone could want
> > > is untenable.
> >
> > I guess one option is to structure the monolithic CLI around plug-ins, so
> > rather than today's
> >   ResponderDriver <options for everything> ...
> >
> > it would become
> >   ResponderDriver --pir embedSelector=true --storm option=value ...
> >
> > and so on; or, more likely,
> >   ResponderDriver --pir optionsFile=pir.properties --storm optionsFile=storm.properties ...
> >
> > and then the driver can delegate each command-line option group to the
> > correct handler.
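> >
> > A rough sketch of that delegation (the handler interface and names below
> > are made up for illustration, not existing Pirk code):
> >
> >   import java.util.ArrayList;
> >   import java.util.HashMap;
> >   import java.util.List;
> >   import java.util.Map;
> >
> >   // Splits the command line into groups keyed by their "--xxx" flag and
> >   // hands each group to the handler registered for that prefix.
> >   class OptionGroupDispatcher
> >   {
> >     interface OptionGroupHandler
> >     {
> >       void handle(List<String> options); // e.g. ["optionsFile=storm.properties"]
> >     }
> >
> >     private final Map<String,OptionGroupHandler> handlers = new HashMap<>();
> >
> >     void register(String prefix, OptionGroupHandler handler)
> >     {
> >       handlers.put(prefix, handler); // e.g. "pir", "storm", "spark"
> >     }
> >
> >     void dispatch(String[] args)
> >     {
> >       Map<String,List<String>> groups = new HashMap<>();
> >       String current = null;
> >       for (String arg : args)
> >       {
> >         if (arg.startsWith("--"))
> >         {
> >           current = arg.substring(2);
> >           groups.putIfAbsent(current, new ArrayList<>());
> >         }
> >         else if (current != null)
> >         {
> >           groups.get(current).add(arg);
> >         }
> >       }
> >       for (Map.Entry<String,List<String>> entry : groups.entrySet())
> >       {
> >         handlers.get(entry.getKey()).handle(entry.getValue()); // assumes a handler was registered
> >       }
> >     }
> >   }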
> >
>
> Agree with this approach - as the CLI already supports reading all of the
> properties from properties files (both local and in HDFS), it should be
> relatively straightforward to delegate the handling.
>
>
> >
> > > After addressing these two, based on some experiments, it looks like
> > > breaking out Storm is pretty straightforward, and Spark should be about
> > > the same.  I'm still looking at Elasticsearch.  Hadoop would require more
> > > work, and I think it is less important for now.
> >
> > Much of the Hadoop dependency I see is 'services' for storing and
> > retrieving; these could be abstracted out to a provider model.
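> >
> > For example, a minimal sketch of such a provider (names are made up for
> > illustration): pirk-core would code against the interface, and the Hadoop
> > module would supply an HDFS-backed implementation.
> >
> >   import java.io.IOException;
> >
> >   // Hypothetical storage provider interface; concrete implementations
> >   // (local filesystem, HDFS, etc.) would live in the framework modules.
> >   public interface StorageProvider
> >   {
> >     void store(String name, byte[] data) throws IOException;
> >
> >     byte[] retrieve(String name) throws IOException;
> >   }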
> >
>
> Agreed.
>
>
> >
> > > I also realize there are other ways to break the modules apart, and I'm
> > > mostly discussing modularizing the responder package.  However, that's
> > > where most of the dependencies lie, so I think that's where we'll get the
> > > most impact.
> >
> > +1, the responder and CLI.
> >
> > Regards,
> > Tim
> >
> >
> > > On Wed, Sep 14, 2016 at 8:52 AM, Suneel Marthi <suneel.mar...@gmail.com>
> > > wrote:
> > >
> > >> +1 to start a sub-thread. I would suggest starting a shared Google Doc
> > >> for dumping ideas and evolving a structure.
> > >>
> > >> On Wed, Sep 14, 2016 at 2:48 PM, Ellison Anne Williams <
> > >> eawilli...@apache.org> wrote:
> > >>
> > >>> Starting a new thread to discuss the Pirk submodule refactor (so that
> > >>> we don't get too mixed up with the 'Next short term goal?' thread)...
> > >>>
> > >>> Darin - Thanks for jumping in on the last email (I think that we hit
> > >>> send at exactly the same time :)). Can you describe what you have in
> > >>> mind for the submodule refactor so that we can discuss?
> > >>>
> > >>> (No, there is not an umbrella JIRA for producing separate Responder
> > >>> jars - please feel free to go ahead and add one)
> > >>>
> > >>
> > >
> >
>
