On Mon, Jul 18, 2016 at 11:03 AM, Chris Harris <[email protected]>
wrote:

> I agree separating the benchmarking code out might be a good idea. Maybe
> we want to make this a multi-module project with the benchmarking as a
> submodule? Won't JMH still need to be included via the child pom in the
> benchmarking module, though? I don't think it can be marked as "provided"
> or anything, right? Glad to hear if you have a different thought on how
> to go about this, though.
>
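For the benchmarking piece specifically, something along these lines could
work -- the module names below are just placeholders, and JMH would be
declared only in the benchmark module's own pom, so the core artifact never
picks it up:

  <!-- hypothetical parent pom.xml; module names are placeholders -->
  <groupId>org.apache.pirk</groupId>
  <artifactId>pirk-parent</artifactId>
  <packaging>pom</packaging>
  <modules>
    <module>pirk-core</module>       <!-- no JMH dependency here -->
    <module>pirk-benchmark</module>  <!-- declares JMH in its own pom -->
  </modules>
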
> Maybe it's good to create submodules for all the adapters (e.g. Spark, MR,
> Flink, Storm, etc.) too? Then we won't have just one jar needing to carry
> around all those dependencies that aren't used (and we'd reduce potential
> version conflicts). I don't know if this restructuring would shake things
> up too much, or if it is better to do it now before we move too far down
> the line. We'd have to look at what's Pirk "core" vs. not, but I don't
> think that's too tough to discern right now. Most of the adapter code is
> in org.apache.pirk.responder.
>

I have a suggestion here. It's not a good idea to have submodules for every
streaming engine that's out there. In the long run, it's going to be a
maintenance and compatibility nightmare with every release of Flink, Spark,
etc.

I am guilty of having wasted the last two years of my life (and fellow
committers' time) adding support for Spark, Flink, and H2O as distributed
backend engines on the Apache Mahout project.

I wish Apache Beam had been around in 2014; then I would only have had to
support Beam and could have stayed abstracted away from all of the other
streaming frameworks.

Since Pirk is a new project, I would suggest that we look into integrating
with Beam, keep the codebase lean, and not bloat the project with adapters
for all the different streaming engines.

Beam is a unified batch + streaming programming model from Google. All of
the other streaming engines like Flink, Spark, Apex, Storm, etc. are now
vying to provide native runners that can execute Beam pipelines. This
effectively makes Beam an abstraction over every other streaming framework.

As an application developer, I would write my jobs against the Beam API and
have the option of executing the same job as a Spark batch job, a Flink
streaming or batch job, etc. This completely shields the developer from
having to support the plethora of streaming engines out there.
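
Purely as an illustration (this is not Pirk code, and the exact API names
shift a bit between Beam versions), a Beam job looks roughly like the
sketch below; the only place an engine shows up is the --runner flag at
launch time:

  import org.apache.beam.sdk.Pipeline;
  import org.apache.beam.sdk.io.TextIO;
  import org.apache.beam.sdk.options.PipelineOptions;
  import org.apache.beam.sdk.options.PipelineOptionsFactory;
  import org.apache.beam.sdk.transforms.Count;
  import org.apache.beam.sdk.transforms.MapElements;
  import org.apache.beam.sdk.values.KV;
  import org.apache.beam.sdk.values.TypeDescriptors;

  public class BeamSketch {
    public static void main(String[] args) {
      // Runner is picked at launch time (e.g. --runner=FlinkRunner or
      // --runner=SparkRunner); nothing below names a specific engine.
      PipelineOptions opts = PipelineOptionsFactory.fromArgs(args).create();
      Pipeline p = Pipeline.create(opts);

      p.apply(TextIO.read().from("input.txt"))     // read lines of text
       .apply(Count.perElement())                  // count each distinct line
       .apply(MapElements.into(TypeDescriptors.strings())
           .via((KV<String, Long> kv) -> kv.getKey() + "\t" + kv.getValue()))
       .apply(TextIO.write().to("counts"));        // write "line<TAB>count"

      p.run().waitUntilFinish();
    }
  }

The same jar can then be handed unchanged to whichever runner a given
deployment happens to have available.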

On the Mahout project, we built a complete logical layer for coding up ML
algorithms, and a physical layer that translates the logical plan to run on
different execution engines like Spark and Flink. Beam did not exist when
we started that effort in early 2014; I would do it differently today given
Beam.

It would be really cool, by way of publicity and of building a community
for Pirk, if Pirk were one of the few projects out there that support Beam.




> Regards,
> Chris
>
>
>
> On Mon, Jul 18, 2016 at 6:25 AM, Tim Ellison <[email protected]>
> wrote:
>
> > On 17/07/16 16:57, Ellison Anne Williams wrote:
> > > Suneel -- Thanks for creating the JIRA issue and pointing out the
> > > licensing problems. I see that JMH is under the GNU GPL2 (
> > > http://openjdk.java.net/legal/) which is not compatible with the
> > > Apache license (http://www.apache.org/legal/resolved.html).
> > >
> > > It appears that Flink just removed the benchmarking code instead of
> > > porting it to another option.
> > >
> > > I would like us to port it to another license-compatible benchmarking
> > > framework such as Google Caliper (or something similar) instead of
> > > removing the code, as the benchmarking is important for encryption
> > > optimization.
> > >
> > > Thoughts?
> >
> > JMH is GPLv2 with the classpath exception [1], which means that it
> > cannot be distributed as part of the ALv2-licensed work (Pirk), but
> > there is no problem with using this library as a tool / dependency at
> > runtime. After all, there is no Java runtime that allows for
> > redistribution under ALv2 either!
> >
> > That said, running the "mvn package" target *does* put JMH-generated
> > code into the resulting pirk-0.0.1-SNAPSHOT.jar -- which then raises
> > the question: why is Pirk putting test code into the library?
> >
> > So if the benchmark code were not part of the delivery, then you could
> > continue to use JMH; but if it is there for a reason, then we would
> > have to switch to a compatibly licensed framework.
> >
> > [1] http://hg.openjdk.java.net/code-tools/jmh/file/c050a47b2b37/LICENSE
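
If we do keep the benchmarks inside the main module for now, one option
(sketch only -- the version number is a placeholder) would be to move the
benchmark sources under src/test/java and confine JMH to test scope, which
should keep the generated classes out of pirk-0.0.1-SNAPSHOT.jar:

  <dependency>
    <groupId>org.openjdk.jmh</groupId>
    <artifactId>jmh-core</artifactId>
    <version>1.13</version>  <!-- placeholder version -->
    <scope>test</scope>
  </dependency>
  <dependency>
    <groupId>org.openjdk.jmh</groupId>
    <artifactId>jmh-generator-annprocess</artifactId>
    <version>1.13</version>  <!-- placeholder version -->
    <scope>test</scope>
  </dependency>
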
> >
> > Regards,
> > Tim
> >
>
