So my concern with Elasticsearch isn't really that I want it as a
submodule; it's more that I don't think it belongs in any of the jars.
Otherwise one could argue that every inputformat belongs. It seems more
prudent to have those included via some abstraction devs can add on their
own (Cascading's Tap comes to mind).
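
To make the abstraction idea concrete, here is a rough Java sketch. It is
only an illustration: RecordSource, InMemorySource and InputSourceSketch
are hypothetical names made up for this example, not existing Pirk or
Cascading APIs. The assumption is that the core depends only on a small
source interface, and Elasticsearch (or any other inputformat) is an
implementation that devs supply in their own jar.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical abstraction (not a Pirk API): the only type the core would see.
interface RecordSource extends AutoCloseable {
  // Returns an iterator over records, each modelled here as a flat string map.
  Iterator<Map<String, String>> open();

  // Narrowed so that callers do not have to handle checked exceptions.
  @Override
  void close();
}

// Stand-in for an implementation that a developer would ship in a separate jar
// (an Elasticsearch-backed source would sit in this position).
final class InMemorySource implements RecordSource {
  private final List<Map<String, String>> records;

  InMemorySource(List<Map<String, String>> records) {
    this.records = new ArrayList<>(records);
  }

  @Override
  public Iterator<Map<String, String>> open() {
    return records.iterator();
  }

  @Override
  public void close() {
    // Nothing to release for an in-memory list.
  }
}

final class InputSourceSketch {
  // Core-side code: written against the interface only, never a concrete source.
  static int countRecords(RecordSource source) {
    int count = 0;
    try (RecordSource s = source) {
      Iterator<Map<String, String>> it = s.open();
      while (it.hasNext()) {
        it.next();
        count++;
      }
    }
    return count;
  }

  public static void main(String[] args) {
    RecordSource source = new InMemorySource(Arrays.asList(
        Collections.singletonMap("qname", "a.example"),
        Collections.singletonMap("qname", "b.example")));
    System.out.println(countRecords(source));
  }
}
```

With something along these lines, the core jars would never reference
Elasticsearch classes; only the jar that supplies the implementation would.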

On Sep 19, 2016 5:24 PM, "Ellison Anne Williams" <eawilli...@apache.org>
wrote:

> Yes, ES is just an inputformat (like HDFS, Kafka, etc) - we don't need a
> separate submodule.
>
> Aside from pirk-core, it seems that we would want to break the responder
> implementations out into submodules. This would leave us with something
> along the lines of the following (at this point):
>
> pirk-core (encryption, core responder incl. standalone, core querier,
> query, inputformat, serialization, utils)
> pirk-storm
> pirk-mapreduce
> pirk-spark
> pirk-benchmark
> pirk-distributed-test
>
> Once we add other responder implementations, we can add them as submodules
> - i.e. for Flink, we would have pirk-flink; for Beam, pirk-beam, etc.
>
> We could break 'pirk-core' down further...
>
> On Mon, Sep 19, 2016 at 5:10 PM, Suneel Marthi <suneel.mar...@gmail.com>
> wrote:
>
> > Here's an example from the Flink project for how they go about new features
> > or system-breaking API changes; we could start a similar process. The Flink
> > guys call these FLIPs (Flink Improvement Proposals), and the Kafka community
> > similarly has something called KIP.
> >
> > We could start a PLIP (??? :-) )
> >
> > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65870673
> >
> >
> > On Mon, Sep 19, 2016 at 11:07 PM, Suneel Marthi <suneel.mar...@gmail.com>
> > wrote:
> >
> > > A shared Google doc would be more convenient than a bunch of Jiras. It's
> > > easier to comment and add notes that way.
> > >
> > >
> > > On Mon, Sep 19, 2016 at 10:38 PM, Darin Johnson <dbjohnson1...@gmail.com>
> > > wrote:
> > >
> > >> Suneel, I'll try to put a couple of Jiras on it tonight with my thoughts.
> > >> Based off my pirk-63 I was able to pull Spark and Storm out with no
> > >> issues.  I was planning to pull them out, then tackle Elasticsearch,
> > >> then Hadoop, as it's a little entrenched.  This should keep most PRs to
> > >> manageable chunks. I think once that's done, addressing the configs will
> > >> make more sense.
> > >>
> > >> I'm open to suggestions. But the hope would be:
> > >> Pirk-parent
> > >> Pirk-core
> > >> Pirk-hadoop
> > >> Pirk-storm
> > >> Pirk-parent
> > >>
> > >> Pirk-es is a little weird as it's really just an inputformat; it seems like
> > >> there's a more general solution here than creating submodules for every
> > >> inputformat.
> > >>
> > >> Darin
> > >>
> > >> On Sep 19, 2016 1:00 PM, "Suneel Marthi" <smar...@apache.org> wrote:
> > >>
> > >> >
> > >>
> > >> > Refactoring is definitely a first priority.  Is there a design/proposal
> > >> > draft that we could comment on about how to go about refactoring the
> > >> > code?  I have been trying to keep up with the emails but definitely would
> > >> > have missed some.
> > >> >
> > >> >
> > >> >
> > >> > On Mon, Sep 19, 2016 at 6:57 PM, Ellison Anne Williams
> > >> > <eawilli...@apache.org> wrote:
> > >> >
> > >> > > Agree - let's leave the config/CLI the way it is for now and tackle that as
> > >> > > a subsequent design discussion and PR.
> > >> > >
> > >> > > Also, I think that we should leave the ResponderDriver and the
> > >> > > ResponderProps alone for this PR and push to a subsequent PR (once we
> > >> > > decide if and how we would like to delegate each).
> > >> > >
> > >> > > I vote to remove the 'platform' option and the backwards compatibility in
> > >> > > this PR and proceed with having a ResponderLauncher interface and forcing
> > >> > > its implementation by the ResponderDriver.
> > >> > >
> > >> > > And, I am not so concerned with having one fat jar vs. multiple jars right
> > >> > > now - to me, at this point, it's a 'nice to have' and not a 'must have' for
> > >> > > Pirk functionality. We do need to break out Pirk into more clearly defined
> > >> > > submodules (which is in progress) - via this re-factor, I think that we
> > >> > > will gain some ability to generate multiple jars which is nice.
> > >> > >
> > >> > >
> > >> > >
> > >> > > On Mon, Sep 19, 2016 at 12:19 PM, Tim Ellison <t.p.elli...@gmail.com>
> > >> > > wrote:
> > >> > >
> > >> > > > On 19/09/16 15:46, Darin Johnson wrote:
> > >> > > > > Hey guys,
> > >> > > > >
> > >> > > > > Thanks for looking at the PR, I apologize if it offended anyone's eyes :).
> > >> > > > >
> > >> > > > > I'm glad it generated some discussion about the configuration.  I didn't
> > >> > > > > really like where things were heading with the config.  However, I didn't
> > >> > > > > want to create too much scope creep.
> > >> > > > >
> > >> > > > > I think any hierarchical config (TypeSafe or YAML) would make things much
> > >> > > > > more maintainable; the plugin could simply grab the appropriate part of the
> > >> > > > > config and handle it accordingly.  I'd also cut down the number of command
> > >> > > > > line options to only those that change between runs often (like
> > >> > > > > input/output).
> > >> > > > >
> > >> > > > >> One option is to make Pirk pluggable, so that a Pirk installation could
> > >> > > > >> use one or more of these in an extensible fashion by adding JAR files.
> > >> > > > >> That would still require selecting one by command-line argument.
> > >> > > > >
> > >> > > > > An argument for this approach is for lambda architecture approaches (say
> > >> > > > > spark/spark-streaming) where the contents of the jars would be so similar it
> > >> > > > > seems like too much trouble to create separate jars.
> > >> > > > >
> > >> > > > > Happy to continue working on this given some direction on where you'd like
> > >> > > > > it to go.  Also, it's a bit of a blocker to refactoring the build into
> > >> > > > > submodules.
> > >> > > >
> > >> > > > FWIW my 2c is to not try and fix all the problems in one go, and rather
> > >> > > > take a compromise on the configurations while you tease apart the
> > >> > > > submodules into separate source code trees, poms, etc; then come back
> > >> > > > and fix the runtime configs.
> > >> > > >
> > >> > > > Once the submodules are in place it will open up more work for release
> > >> > > > engineering and tinkering that can be done in parallel with the config
> > >> > > > polishing.
> > >> > > >
> > >> > > > Just a thought.
> > >> > > > Tim
> > >> > > >
> > >> > > >
> > >> > > > > On Mon, Sep 19, 2016 at 9:33 AM, Tim Ellison <t.p.elli...@gmail.com> wrote:
> > >> > > > >
> > >> > > > >> On 19/09/16 13:40, Ellison Anne Williams wrote:
> > >> > > > >>> It seems that it's the same idea as the ResponderLauncher with the service
> > >> > > > >>> component added to maintain something akin to the 'platform'. I would
> > >> > > > >>> prefer that we just did away with the platform notion altogether and made
> > >> > > > >>> the ResponderDriver 'dumb'. We get around needing a platform-aware service
> > >> > > > >>> by requiring the ResponderLauncher implementation to be passed as a CLI
> > >> > > > >>> argument to the ResponderDriver.
> > >> > > > >>
> > >> > > > >> Let me check I understand what you are saying here.
> > >> > > > >>
> > >> > > > >> At the moment, there is a monolithic Pirk that hard codes how to respond
> > >> > > > >> using lots of different backends (mapreduce, spark, sparkstreaming,
> > >> > > > >> storm, standalone), and that is selected by command-line argument.
> > >> > > > >>
> > >> > > > >> One option is to make Pirk pluggable, so that a Pirk installation could
> > >> > > > >> use one or more of these in an extensible fashion by adding JAR files.
> > >> > > > >> That would still require selecting one by command-line argument.
> > >> > > > >>
> > >> > > > >> A second option is to simply pass in the required backend JAR to select
> > >> > > > >> the particular implementation you choose, as a specific Pirk
> > >> > > > >> installation doesn't need to use multiple backends simultaneously.
> > >> > > > >>
> > >> > > > >> ...and you are leaning towards the second option.  Do I have that correct?
> > >> > > > >>
> > >> > > > >> Regards,
> > >> > > > >> Tim
> > >> > > > >>
> > >> > > > >>> Am I missing something? Is there a good reason to provide a service by
> > >> > > > >>> which platforms are registered? I'm open...
> > >> > > > >>>
> > >> > > > >>> On Mon, Sep 19, 2016 at 8:28 AM, Tim Ellison <t.p.elli...@gmail.com> wrote:
> > >> > > > >>>
> > >> > > > >>>> How about an approach like this?
> > >> > > > >>>>    https://github.com/tellison/incubator-pirk/tree/pirk-63
> > >> > > > >>>>
> > >> > > > >>>> The "on-ramp" is the driver [1], which calls upon the service to find a
> > >> > > > >>>> plug-in [2] that claims to implement the required platform responder,
> > >> > > > >>>> e.g. [3].
> > >> > > > >>>>
> > >> > > > >>>> The list of plug-ins is given in the provider's JAR file, so the ones we
> > >> > > > >>>> provide in Pirk are listed together [4], but if you split these into
> > >> > > > >>>> modules, or somebody brings their own JAR alongside, these would be
> > >> > > > >>>> listed in each JAR's services/ directory.
> > >> > > > >>>>
> > >> > > > >>>> [1]
> > >> > > > >>>> https://github.com/tellison/incubator-pirk/blob/pirk-63/src/main/java/org/apache/pirk/responder/wideskies/ResponderDriver.java
> > >> > > > >>>> [2]
> > >> > > > >>>> https://github.com/tellison/incubator-pirk/blob/pirk-63/src/main/java/org/apache/pirk/responder/spi/ResponderPlugin.java
> > >> > > > >>>> [3]
> > >> > > > >>>> https://github.com/tellison/incubator-pirk/blob/pirk-63/src/main/java/org/apache/pirk/responder/wideskies/storm/StormResponder.java
> > >> > > > >>>> [4]
> > >> > > > >>>> https://github.com/tellison/incubator-pirk/blob/pirk-63/src/main/services/org.apache.responder.spi.Responder
> > >> > > > >>>>
> > >> > > > >>>> I'm not even going to dignify this with a WIP PR, it is far from ready,
> > >> > > > >>>> so proceed with caution.  There is hopefully enough there to show the
> > >> > > > >>>> approach, and if it is worth continuing I'm happy to do so.
> > >> > > > >>>>
> > >> > > > >>>> Regards,
> > >> > > > >>>> Tim
> > >> > > > >>>>
> > >> > > > >>>>
> > >> > > > >>>
> > >> > > > >>
> > >> > > > >
> > >> > > >
> > >> > >
> > >>
> > >
> > >
> >
>
