Great, I will write up the doc and link it here, finish pirk-63, then start this.

On Sep 19, 2016 5:34 PM, "Suneel Marthi" <suneel.mar...@gmail.com> wrote:

> +100
>
> On Mon, Sep 19, 2016 at 11:24 PM, Ellison Anne Williams <
> eawilli...@apache.org> wrote:
>
> > Yes, ES is just an inputformat (like HDFS, Kafka, etc) - we don't need a
> > separate submodule.
> >
> > Aside from pirk-core, it seems that we would want to break the responder
> > implementations out into submodules. This would leave us with something
> > along the lines of the following (at this point):
> >
> > pirk-core (encryption, core responder incl. standalone, core querier,
> > query, inputformat, serialization, utils)
> > pirk-storm
> > pirk-mapreduce
> > pirk-spark
> > pirk-benchmark
> > pirk-distributed-test
> >
> > Once we add other responder implementations, we can add them as submodules
> > - i.e. for Flink, we would have pirk-flink; for Beam, pirk-beam, etc.
> >
> > We could break 'pirk-core' down further...
> >
> > On Mon, Sep 19, 2016 at 5:10 PM, Suneel Marthi <suneel.mar...@gmail.com>
> > wrote:
> >
> > > Here's an example from the Flink project of how they go about new features
> > > or system-breaking API changes; we could start a similar process. The Flink
> > > guys call these FLIPs (Flink Improvement Proposals), and the Kafka community
> > > similarly has something called a KIP (Kafka Improvement Proposal).
> > >
> > > We could start a PLIP (??? :-) )
> > >
> > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65870673
> > >
> > >
> > > On Mon, Sep 19, 2016 at 11:07 PM, Suneel Marthi <
> suneel.mar...@gmail.com
> > >
> > > wrote:
> > >
> > > > A shared Google doc would be more convenient than a bunch of Jiras.
> > > > It's easier to comment and add notes that way.
> > > >
> > > >
> > > > On Mon, Sep 19, 2016 at 10:38 PM, Darin Johnson <
> > dbjohnson1...@gmail.com
> > > >
> > > > wrote:
> > > >
> > > >> Suneel, I'll try to put a couple of Jiras on it tonight with my thoughts.
> > > >> Based on my pirk-63 work, I was able to pull Spark and Storm out with no
> > > >> issues.  I was planning to pull them out, then tackle Elasticsearch,
> > > >> then Hadoop, as it's a little entrenched.  This should keep most PRs to
> > > >> manageable chunks. I think once that's done, addressing the configs will
> > > >> make more sense.
> > > >>
> > > >> I'm open to suggestions. But the hope would be:
> > > >> Pirk-parent
> > > >> Pirk-core
> > > >> Pirk-hadoop
> > > >> Pirk-storm
> > > >> Pirk-parent
> > > >>
> > > >> Pirk-es is a little weird, as it's really just an inputformat; it seems like
> > > >> there's a more general solution here than creating submodules for every
> > > >> inputformat.
> > > >>
> > > >> Darin
> > > >>
> > > >> On Sep 19, 2016 1:00 PM, "Suneel Marthi" <smar...@apache.org>
> wrote:
> > > >>
> > > >> >
> > > >>
> > > >> > Refactor is definitely a first priority.  Is there a design/proposal draft
> > > >> > that we could comment on about how to go about refactoring the code?  I
> > > >> > have been trying to keep up with the emails but will definitely have
> > > >> > missed some.
> > > >> >
> > > >> >
> > > >> >
> > > >> > On Mon, Sep 19, 2016 at 6:57 PM, Ellison Anne Williams <
> > > >> > eawilli...@apache.org <eawilli...@apache.org>> wrote:
> > > >> >
> > > >> > > Agree - let's leave the config/CLI the way it is for now and tackle that
> > > >> > > as a subsequent design discussion and PR.
> > > >> > >
> > > >> > > Also, I think that we should leave the ResponderDriver and the
> > > >> > > ResponderProps alone for this PR and push to a subsequent PR (once we
> > > >> > > decide if and how we would like to delegate each).
> > > >> > >
> > > >> > > I vote to remove the 'platform' option and the backwards compatibility in
> > > >> > > this PR and proceed with having a ResponderLauncher interface and forcing
> > > >> > > its implementation by the ResponderDriver.
> > > >> > >
> > > >> > > And, I am not so concerned with having one fat jar vs. multiple jars right
> > > >> > > now - to me, at this point, it's a 'nice to have' and not a 'must have' for
> > > >> > > Pirk functionality. We do need to break out Pirk into more clearly defined
> > > >> > > submodules (which is in progress) - via this re-factor, I think that we
> > > >> > > will gain some ability to generate multiple jars which is nice.
> > > >> > >
> > > >> > >
> > > >> > >
> > > >> > > On Mon, Sep 19, 2016 at 12:19 PM, Tim Ellison <
> > > t.p.elli...@gmail.com>
> > > >> > > wrote:
> > > >> > >
> > > >> > > > On 19/09/16 15:46, Darin Johnson wrote:
> > > >> > > > > Hey guys,
> > > >> > > > >
> > > >> > > > > Thanks for looking at the PR; I apologize if it offended anyone's eyes :).
> > > >> > > > >
> > > >> > > > > I'm glad it generated some discussion about the configuration.  I didn't
> > > >> > > > > really like where things were heading with the config.  However, I didn't
> > > >> > > > > want to create too much scope creep.
> > > >> > > > >
> > > >> > > > > I think any hierarchical config (Typesafe or YAML) would make things much
> > > >> > > > > more maintainable; the plugin could simply grab the appropriate part of the
> > > >> > > > > config and handle it accordingly.  I'd also cut down the number of command
> > > >> > > > > line options to only those that change between runs often (like
> > > >> > > > > input/output).
> > > >> > > > >> One option is to make Pirk pluggable, so that a Pirk installation could
> > > >> > > > >> use one or more of these in an extensible fashion by adding JAR files.
> > > >> > > > >> That would still require selecting one by command-line argument.
> > > >> > > > >
> > > >> > > > > An argument for this approach is for lambda architecture approaches (say
> > > >> > > > > spark/spark-streaming) where the contents of the jars would be so similar
> > > >> > > > > that it seems like too much trouble to create separate jars.
> > > >> > > > >
> > > >> > > > > Happy to continue working on this given some direction on where you'd like
> > > >> > > > > it to go.  Also, it's a bit of a blocker to refactoring the build into
> > > >> > > > > submodules.
> > > >> > > >
> > > >> > > > FWIW my 2c is to not try and fix all the problems in one go, and rather
> > > >> > > > take a compromise on the configurations while you tease apart the
> > > >> > > > submodules into separate source code trees, poms, etc; then come back
> > > >> > > > and fix the runtime configs.
> > > >> > > >
> > > >> > > > Once the submodules are in place it will open up more work for release
> > > >> > > > engineering and tinkering that can be done in parallel with the config
> > > >> > > > polishing.
> > > >> > > >
> > > >> > > > Just a thought.
> > > >> > > > Tim
> > > >> > > >
> > > >> > > >
> > > >> > > > > On Mon, Sep 19, 2016 at 9:33 AM, Tim Ellison <
> > > >> t.p.elli...@gmail.com>
> > > >> > > > wrote:
> > > >> > > > >
> > > >> > > > >> On 19/09/16 13:40, Ellison Anne Williams wrote:
> > > >> > > > >>> It seems that it's the same idea as the ResponderLauncher with the service
> > > >> > > > >>> component added to maintain something akin to the 'platform'. I would
> > > >> > > > >>> prefer that we just did away with the platform notion altogether and made
> > > >> > > > >>> the ResponderDriver 'dumb'. We get around needing a platform-aware service
> > > >> > > > >>> by requiring the ResponderLauncher implementation to be passed as a CLI
> > > >> > > > >>> argument to the ResponderDriver.
> > > >> > > > >>
> > > >> > > > >> Let me check I understand what you are saying here.
> > > >> > > > >>
> > > >> > > > >> At the moment, there is a monolithic Pirk that hard codes how to respond
> > > >> > > > >> using lots of different backends (mapreduce, spark, sparkstreaming,
> > > >> > > > >> storm, standalone), and that is selected by command-line argument.
> > > >> > > > >>
> > > >> > > > >> One option is to make Pirk pluggable, so that a Pirk installation could
> > > >> > > > >> use one or more of these in an extensible fashion by adding JAR files.
> > > >> > > > >> That would still require selecting one by command-line argument.
> > > >> > > > >>
> > > >> > > > >> A second option is to simply pass in the required backend JAR to select
> > > >> > > > >> the particular implementation you choose, as a specific Pirk
> > > >> > > > >> installation doesn't need to use multiple backends simultaneously.
> > > >> > > > >>
> > > >> > > > >> ...and you are leaning towards the second option.  Do I have that correct?
> > > >> > > > >>
> > > >> > > > >> Regards,
> > > >> > > > >> Tim
> > > >> > > > >>
> > > >> > > > >>> Am I missing something? Is there a good reason to provide a service by
> > > >> > > > >>> which platforms are registered? I'm open...
> > > >> > > > >>>
> > > >> > > > >>> On Mon, Sep 19, 2016 at 8:28 AM, Tim Ellison <
> > > >> t.p.elli...@gmail.com>
> > > >> > > > >> wrote:
> > > >> > > > >>>
> > > >> > > > >>>> How about an approach like this?
> > > >> > > > >>>>    https://github.com/tellison/incubator-pirk/tree/pirk-63
> > > >> > > > >>>>
> > > >> > > > >>>> The "on-ramp" is the driver [1], which calls upon the service to find a
> > > >> > > > >>>> plug-in [2] that claims to implement the required platform responder,
> > > >> > > > >>>> e.g. [3].
> > > >> > > > >>>>
> > > >> > > > >>>> The list of plug-ins is given in the provider's JAR file, so the ones we
> > > >> > > > >>>> provide in Pirk are listed together [4], but if you split these into
> > > >> > > > >>>> modules, or somebody brings their own JAR alongside, these would be
> > > >> > > > >>>> listed in each JAR's services/ directory.
> > > >> > > > >>>>
> > > >> > > > >>>> [1]
> > > >> > > > >>>> https://github.com/tellison/incubator-pirk/blob/pirk-63/src/main/java/org/apache/pirk/responder/wideskies/ResponderDriver.java
> > > >> > > > >>>> [2]
> > > >> > > > >>>> https://github.com/tellison/incubator-pirk/blob/pirk-63/src/main/java/org/apache/pirk/responder/spi/ResponderPlugin.java
> > > >> > > > >>>> [3]
> > > >> > > > >>>> https://github.com/tellison/incubator-pirk/blob/pirk-63/src/main/java/org/apache/pirk/responder/wideskies/storm/StormResponder.java
> > > >> > > > >>>> [4]
> > > >> > > > >>>> https://github.com/tellison/incubator-pirk/blob/pirk-63/src/main/services/org.apache.responder.spi.Responder
> > > >> > > > >>>>
> > > >> > > > >>>> I'm not even going to dignify this with a WIP PR, it is far from ready,
> > > >> > > > >>>> so proceed with caution.  There is hopefully enough there to show the
> > > >> > > > >>>> approach, and if it is worth continuing I'm happy to do so.
> > > >> > > > >>>>
> > > >> > > > >>>> Regards,
> > > >> > > > >>>> Tim
> > > >> > > > >>>>
> > > >> > > > >>>>
> > > >> > > > >>>
> > > >> > > > >>
> > > >> > > > >
> > > >> > > >
> > > >> > >
> > > >>
> > > >
> > > >
> > >
> >
>
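
For anyone skimming the thread, the plug-in lookup Tim describes (driver asks a service for a plug-in that claims the required platform) could be sketched roughly as below. This is only a sketch under assumptions: the methods on ResponderPlugin, plus StormPlugin, SparkPlugin and PluginRegistry, are hypothetical names made up for illustration and are not the code on the pirk-63 branch. java.util.ServiceLoader is the standard JDK mechanism; it only finds providers that a JAR lists under META-INF/services, so the demo in main passes an explicit list instead.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.ServiceLoader;

/** Hypothetical plug-in contract; illustrative only, not Pirk's actual SPI. */
interface ResponderPlugin {
  /** Platform name this plug-in claims to implement, e.g. "storm". */
  String getPlatformName();

  /** Starts the responder and returns a status message. */
  String run();
}

/** Hypothetical plug-in that a pirk-storm module might ship. */
class StormPlugin implements ResponderPlugin {
  @Override
  public String getPlatformName() {
    return "storm";
  }

  @Override
  public String run() {
    return "storm responder started";
  }
}

/** Hypothetical plug-in that a pirk-spark module might ship. */
class SparkPlugin implements ResponderPlugin {
  @Override
  public String getPlatformName() {
    return "spark";
  }

  @Override
  public String run() {
    return "spark responder started";
  }
}

/** Hypothetical lookup helper that the driver would call. */
class PluginRegistry {
  /** Returns the first plug-in claiming the platform (case-insensitive). */
  static Optional<ResponderPlugin> find(Iterable<ResponderPlugin> plugins, String platform) {
    for (ResponderPlugin plugin : plugins) {
      if (plugin.getPlatformName().equalsIgnoreCase(platform)) {
        return Optional.of(plugin);
      }
    }
    return Optional.empty();
  }

  /** Same lookup over whatever providers the JARs on the classpath declare. */
  static Optional<ResponderPlugin> findOnClasspath(String platform) {
    return find(ServiceLoader.load(ResponderPlugin.class), platform);
  }

  public static void main(String[] args) {
    // Explicit list stands in for ServiceLoader discovery in this demo.
    List<ResponderPlugin> bundled = Arrays.asList(new StormPlugin(), new SparkPlugin());
    String platform = args.length > 0 ? args[0] : "storm";
    System.out.println(
        find(bundled, platform).map(ResponderPlugin::run).orElse("no plug-in for " + platform));
  }
}
```

With this shape, each submodule JAR would carry its own provider listing, and the driver would still need the platform name from the command line to pick one.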

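The 'dumb' ResponderDriver idea from the thread, where the ResponderLauncher implementation is passed on the command line, might look something like the following. Again this is a sketch under assumptions: the single run() method, StandaloneLauncher and LauncherDriver are hypothetical and chosen only for illustration; the real interface may well differ.

```java
/** Hypothetical launcher contract; the real Pirk interface may differ. */
interface ResponderLauncher {
  /** Starts the responder and returns a status message. */
  String run() throws Exception;
}

/** Hypothetical standalone implementation used only for this sketch. */
class StandaloneLauncher implements ResponderLauncher {
  @Override
  public String run() {
    return "standalone responder started";
  }
}

/** A driver that knows nothing about platforms, only a launcher class name. */
class LauncherDriver {
  /** Loads the named class reflectively and checks it is a ResponderLauncher. */
  static ResponderLauncher load(String className) throws ReflectiveOperationException {
    Class<?> clazz = Class.forName(className);
    if (!ResponderLauncher.class.isAssignableFrom(clazz)) {
      throw new IllegalArgumentException(className + " does not implement ResponderLauncher");
    }
    // Requires a no-argument constructor on the launcher class.
    return (ResponderLauncher) clazz.getDeclaredConstructor().newInstance();
  }

  public static void main(String[] args) throws Exception {
    // The launcher class name would normally come from a CLI option.
    String className = args.length > 0 ? args[0] : StandaloneLauncher.class.getName();
    System.out.println(load(className).run());
  }
}
```

The driver then has no compile-time dependency on any backend; whichever backend JAR is on the classpath supplies the class that gets named.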