No problem and good idea to discuss in the Jira.
Actually, I started to experiment a bit with Beam distributions on a
branch (that I can share with anyone interested).
Regards
JB
On 07/07/2016 10:12 PM, Amit Sela wrote:
Thanks JB, I've missed that one.
I suggest we continue this in the ticket comments.
Thanks,
Amit
On Thu, Jul 7, 2016 at 11:05 PM Jean-Baptiste Onofré <[email protected]>
wrote:
Hi Amit,
I think your proposal is related to:
https://issues.apache.org/jira/browse/BEAM-320
As described in the Jira, I'm planning to provide (in dedicated Maven
modules) a Beam distribution including:
- an uber jar to wrap the dependencies (sketched below)
- the underlying runtime backends
- etc
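For the uber jar part, here is a minimal sketch of what such a
distribution module's pom.xml could use. This is only illustrative
(the details are my assumption, not the actual BEAM-320 layout); it
relies on the standard maven-shade-plugin to bundle the dependencies:

  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-shade-plugin</artifactId>
        <executions>
          <execution>
            <!-- build the uber jar during the package phase -->
            <phase>package</phase>
            <goals>
              <goal>shade</goal>
            </goals>
            <configuration>
              <!-- merge META-INF/services files so service
                   registrations from all bundled jars survive -->
              <transformers>
                <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
              </transformers>
            </configuration>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>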
Regards
JB
On 07/07/2016 07:49 PM, Amit Sela wrote:
Hi everyone,
Lately I've encountered a number of issues stemming from the fact that
the Spark runner does not package Spark along with it, forcing people
to do this on their own.
In addition, this seems to get in the way of executing beam-examples
against the Spark runner, again because the examples would have to add
the Spark dependencies themselves.
When running on a cluster (which I guess was the original goal here),
it is recommended to have Spark provided by the cluster - this makes
sense for Spark clusters, and more so for Spark + YARN clusters, where
you might have your Spark built against a specific Hadoop version or
use a vendor distribution.
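Concretely, "provided by the cluster" translates to declaring Spark
with provided scope in the user's pom, so it is compiled against but
never bundled (the artifact/version below are just illustrative):

  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.10</artifactId>
    <version>1.6.2</version>
    <!-- the cluster supplies the Spark jars at runtime -->
    <scope>provided</scope>
  </dependency>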
In order to make the runner more accessible to new adopters, I suggest
we consider releasing a "spark-included" artifact as well.
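To make the idea concrete: with such an artifact, a new adopter could
depend on a single coordinate and get a runnable setup out of the box
(the artifactId below is purely hypothetical):

  <dependency>
    <groupId>org.apache.beam</groupId>
    <!-- hypothetical name for a Spark-bundling artifact -->
    <artifactId>beam-runners-spark-included</artifactId>
    <version>${beam.version}</version>
  </dependency>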
Thoughts?
Thanks,
Amit
--
Jean-Baptiste Onofré
[email protected]
http://blog.nanthrax.net
Talend - http://www.talend.com