Hey Christian,

I posted an example to my local github repo (word count, of course) of
running Spark 0.9.0 on a cluster, but it's pre-yarn:

https://github.com/jwills/crunch-demo/tree/spark

Use the spark-run.sh script to run it; you need to set -Dspark.master at
the commandline to point at the spark master on the cluster. It would be
cool to integrate it with the instructions here for running Spark under
YARN and see how it came out:

http://spark.apache.org/docs/latest/running-on-yarn.html

Of course, we'd need to commit that patch to upgrade Crunch to Spark 1.0.0:
https://issues.apache.org/jira/browse/CRUNCH-410

J


On Tue, Jun 17, 2014 at 7:47 AM, Christian Tzolov <
[email protected]> wrote:

> Is there an example of Crunch Spark pipeline for hadoop2/yarn cluster
> manager?
>



-- 
Director of Data Science
Cloudera <http://www.cloudera.com>
Twitter: @josh_wills <http://twitter.com/josh_wills>

Reply via email to