I don't think the proposal is to put this into the source release, but rather to have a separate binary artifact that's Beam+Spark.
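For illustration, such a "spark-included" binary artifact could be a separate Maven module that bundles the Spark dependencies into a shaded jar. A minimal sketch follows; the module layout, artifact IDs, and version properties here are hypothetical, not taken from an actual Beam pom:

```xml
<!-- Hypothetical "spark-included" module; names and versions are illustrative. -->
<dependencies>
  <dependency>
    <groupId>org.apache.beam</groupId>
    <artifactId>beam-runners-spark</artifactId>
    <version>${beam.version}</version>
  </dependency>
  <!-- Spark is bundled here (default compile scope) instead of "provided",
       so new adopters get a runnable artifact without supplying Spark. -->
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.10</artifactId>
    <version>${spark.version}</version>
  </dependency>
</dependencies>
<build>
  <plugins>
    <!-- Shade everything into one jar at package time. -->
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <executions>
        <execution>
          <phase>package</phase>
          <goals><goal>shade</goal></goals>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
```

Clusters that already provide Spark would keep using the existing artifact with Spark in `provided` scope; only the bundled module would ship Spark classes.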
On Thu, Jul 7, 2016 at 11:54 AM, Vlad Rozov <v.ro...@datatorrent.com> wrote:

> I am not sure if I read the proposal correctly, but note that it would be
> against Apache policy to include compiled binaries in the source release.
> On the other hand, each runner may include the necessary run-time binaries
> as test-only dependencies in the runner's Maven pom.xml.
>
> On 7/7/16 11:01, Lukasz Cwik wrote:
>
>> That makes a lot of sense. I can see other runners following suit, with
>> packaged-up versions for different scenarios / backend cluster runtimes.
>>
>> Should this be a separate Maven module, a sub-module inside Apache Beam,
>> or something else?
>>
>> On Thu, Jul 7, 2016 at 1:49 PM, Amit Sela <amitsel...@gmail.com> wrote:
>>
>>> Hi everyone,
>>>
>>> Lately I've encountered a number of issues caused by the fact that the
>>> Spark runner does not package Spark along with it, forcing people to
>>> do this on their own. In addition, this seems to get in the way of
>>> executing beam-examples against the Spark runner, again because that
>>> would require adding Spark dependencies.
>>>
>>> When running on a cluster (which I guess was the original goal here),
>>> it is recommended to have Spark provided by the cluster - this makes
>>> sense for Spark clusters, and more so for Spark + YARN clusters, where
>>> you might have Spark built against a specific Hadoop version or use a
>>> vendor distribution.
>>>
>>> To make the runner more accessible to new adopters, I suggest
>>> considering the release of a "spark-included" artifact as well.
>>>
>>> Thoughts?
>>>
>>> Thanks,
>>> Amit