On Thu, Sep 15, 2016 at 9:25 AM, Tim Ellison <t.p.elli...@gmail.com> wrote:

> On 15/09/16 09:21, Darin Johnson wrote:
> > So my goal for the submodule refactor is pretty straightforward: I
> > basically want to separate the project into pirk-core, pirk-hadoop,
> > pirk-spark, and pirk-storm.  I think separating pirk-core and pirk-hadoop
> > is very ambitious at this point, as there are a lot of dependencies we'd
> > need to resolve.
>
> I think it is quite do-able, but agree that it is more work than the
> others.
>
> > pirk-storm and pirk-spark would be much more reasonable
> > starts.  I'd also recommend we do something about the Elasticsearch
> > dependency; it seems more of an InputFormat option than part of
> > pirk-core.
> >
> > There are a few blockers to this:
> >
> > The first is PIRK-63: the ResponderDriver was calling the Responder
> > class of each specific framework.  The fix is straightforward: pass the
> > class as an argument.  I've started that here:
> > https://github.com/DarinJ/incubator-pirk/tree/Pirk-63 (a PR was expected
> > earlier, but I had a rebase issue, so I didn't get around to completing a
> > few bits).  It also allows, at least at a rudimentary level, adding new
> > responders by putting jars on the classpath instead of recompiling Pirk.
> > I'm open to suggestions here.  I think it's very likely ResponderLauncher
> > isn't needed and run could instead be a static member of another class;
> > however, based on what was in ResponderDriver, this seems to be the
> > approach with the fewest issues, especially for Storm.
>
> Give a shout when you want somebody to take a look.
>
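
For reference while reviewing, here is a minimal sketch of the "pass the class as an argument" idea, assuming the launcher is named on the command line and loaded from the classpath. The ResponderLauncher interface shape, EchoLauncher, and Launchers are hypothetical names for illustration only; the code on the Pirk-63 branch may well look different.

```java
/** Hypothetical contract; the real ResponderLauncher on the Pirk-63 branch may differ. */
interface ResponderLauncher {
  void run();
}

/** Stand-in for a framework-specific launcher that would ship in its own jar. */
final class EchoLauncher implements ResponderLauncher {
  @Override
  public void run() {
    System.out.println("EchoLauncher running");
  }
}

/** Hypothetical helper: resolves a launcher by class name instead of a compile-time reference. */
final class Launchers {
  private Launchers() {}

  static ResponderLauncher load(String className) {
    try {
      // Look the class up on the classpath, check its type, and call its no-arg constructor.
      return Class.forName(className)
          .asSubclass(ResponderLauncher.class)
          .getDeclaredConstructor()
          .newInstance();
    } catch (ReflectiveOperationException | ClassCastException e) {
      throw new IllegalArgumentException("Cannot load responder launcher: " + className, e);
    }
  }
}
```

With something like this the driver needs no compile-time dependency on any framework: it would call Launchers.load(name).run() with whatever class name the user supplies.
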
> > Another is how we're passing the command line options in ResponderCLI:
> > here we're defining framework-specific elements in the Driver, which are
> > then passed to the underlying framework Driver/Topology/ToolRunner.  This
> > becomes more difficult to address cleanly, so it seems like a good place to
> > start a discussion.  I think this mechanism should be addressed, though, as
> > putting in options for every framework/inputformat everyone could want is
> > untenable.
>
> I guess one option is to structure the monolithic CLI around plug-ins, so
> rather than today's
>   ResponderDriver <options for everything> ...
>
> it would become
>   ResponderDriver --pir embedSelector=true --storm option=value ...
>
> and so on; or more likely
>   ResponderDriver --pir optionsFile=pir.properties --storm optionsFile=storm.properties ...
>
> and then the driver can delegate each command line option group to the
> correct handler.
>

Agree with this approach. As the CLI already supports reading all of the
properties from properties files (both local and in HDFS), it should be
relatively straightforward to delegate the handling.
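
To make the delegation concrete, here is a rough sketch of how the driver could split "--group key=value" arguments into per-group maps and hand each map to its handler. OptionGroups, parse, and dispatch are hypothetical names, not existing Pirk classes, and the key=value syntax simply follows Tim's example above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Consumer;

/** Hypothetical sketch: groups "--name key=value ..." arguments and delegates each group. */
final class OptionGroups {
  private OptionGroups() {}

  static Map<String, Map<String, String>> parse(String[] args) {
    Map<String, Map<String, String>> groups = new LinkedHashMap<>();
    Map<String, String> current = null;
    for (String arg : args) {
      if (arg.startsWith("--")) {
        // A "--name" token opens (or reopens) the option group called "name".
        current = groups.computeIfAbsent(arg.substring(2), k -> new LinkedHashMap<>());
      } else {
        int eq = arg.indexOf('=');
        if (current == null || eq <= 0) {
          throw new IllegalArgumentException("Expected --group before key=value, got: " + arg);
        }
        current.put(arg.substring(0, eq), arg.substring(eq + 1));
      }
    }
    return groups;
  }

  static void dispatch(
      Map<String, Map<String, String>> groups,
      Map<String, Consumer<Map<String, String>>> handlers) {
    for (Map.Entry<String, Map<String, String>> entry : groups.entrySet()) {
      Consumer<Map<String, String>> handler = handlers.get(entry.getKey());
      if (handler == null) {
        // Fail loudly on a group that no plug-in has claimed.
        throw new IllegalArgumentException("No handler registered for --" + entry.getKey());
      }
      handler.accept(entry.getValue());
    }
  }
}
```

Each framework module would register one handler under its group name, and a handler that sees an optionsFile key could load that properties file itself, so the core driver never needs to know the framework-specific option names.
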


>
> > After addressing these two, based on some experiments it looks like
> > breaking out Storm is pretty straightforward and Spark should be about
> > the same.  I'm still looking at Elasticsearch.  Hadoop would require more
> > and is, I think, less important for now.
>
> Much of the Hadoop dependency I see is 'services' for storing and
> retrieving; these could be abstracted out to a provider model.
>

Agreed.
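
As a starting point for discussion, a provider model along these lines might look like the sketch below. StorageProvider and InMemoryStorageProvider are hypothetical names; the assumption is that pirk-core would depend only on the interface, while an HDFS-backed implementation would live in pirk-hadoop.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;

/** Hypothetical storage abstraction that core code would program against. */
interface StorageProvider {
  void store(String location, byte[] data);

  byte[] retrieve(String location);
}

/** Trivial in-memory provider, useful for tests; a Hadoop-backed one would sit in its own module. */
final class InMemoryStorageProvider implements StorageProvider {
  private final Map<String, byte[]> blobs = new HashMap<>();

  @Override
  public void store(String location, byte[] data) {
    // Copy defensively so later changes to the caller's array do not leak in.
    blobs.put(location, data.clone());
  }

  @Override
  public byte[] retrieve(String location) {
    byte[] data = blobs.get(location);
    if (data == null) {
      throw new NoSuchElementException("Nothing stored at " + location);
    }
    return data.clone();
  }
}
```

Code that currently calls Hadoop file system classes directly would take a StorageProvider instead, which is what would let pirk-core build without the Hadoop dependency.
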


>
> > I also realize there are other ways to break the modules apart and I'm
> > mostly discussing modularizing the responder package; however, that's where
> > most of the dependencies lie, so I think that's where we'll get the most
> > impact.
>
> +1, the responder and CLI.
>
> Regards,
> Tim
>
>
> > On Wed, Sep 14, 2016 at 8:52 AM, Suneel Marthi <suneel.mar...@gmail.com>
> > wrote:
> >
> >> +1 to start a sub-thread. I would suggest starting a shared Google Doc
> >> for dumping ideas and evolving a structure.
> >>
> >> On Wed, Sep 14, 2016 at 2:48 PM, Ellison Anne Williams <
> >> eawilli...@apache.org> wrote:
> >>
> >>> Starting a new thread to discuss the Pirk submodule refactor (so that
> >>> we don't get too mixed up with the 'Next short term goal?' thread)...
> >>>
> >>> Darin - Thanks for jumping in on the last email (I think that we hit
> >>> send at exactly the same time :)). Can you describe what you have in mind
> >>> for the submodule refactor so that we can discuss?
> >>>
> >>> (No, there is not an umbrella JIRA for producing separate Responder
> >>> jars - please feel free to go ahead and add one)
> >>>
> >>
> >
>
