Have you looked at https://beam.apache.org/get-started/quickstart-java/?
It suggests using:
mvn compile exec:java -Dexec.mainClass=org.apache.beam.examples.WordCount \
    -Dexec.args="--runner=SparkRunner --inputFile=pom.xml --output=counts" \
    -Pspark-runner
This will launch the job on an embedded local Spark instance (the Spark runner defaults to a local master).
If you want to run on an existing Spark cluster, I believe you'll need to
build an uber jar of your application and submit it with spark-submit.
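Roughly, that would look like the following. This is a sketch, not a tested recipe: the jar name, master URL, and profile name are placeholders that depend on your pom.xml (the uber jar step assumes the Maven Shade plugin is configured there).

```shell
# Build an uber jar containing your pipeline and the Spark runner
# (assumes a shade/spark-runner profile in your pom.xml)
mvn package -Pspark-runner

# Submit the bundled jar to an existing cluster; the jar path and
# master URL below are placeholders for your own setup
spark-submit \
  --class org.apache.beam.examples.WordCount \
  --master spark://your-master:7077 \
  target/word-count-bundled-0.1.jar \
  --runner=SparkRunner --inputFile=pom.xml --output=counts
```

Note that the pipeline options after the jar path (--runner, --inputFile, --output) are passed through to your main class, which is why reading them via PipelineOptionsFactory.fromArgs(args) works unchanged.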
On Mon, Aug 13, 2018 at 11:32 AM Mahesh Vangala <[email protected]>
wrote:
> Hello all -
>
> I have a barebones word_count script that runs locally but I don't have a
> clue how to run this using SparkRunner.
>
> For example,
> Local:
> mvn exec:java -Dexec.mainClass="com.apache.beam.learning.WordCount"
> -Dexec.args="--runner=DirectRunner"
> Spark:
> mvn exec:java -Dexec.mainClass="com.apache.beam.learning.WordCount"
> -Dexec.args="--runner=SparkRunner"
>
> My code takes runner from args,
>
> PipelineOptions opts = PipelineOptionsFactory.fromArgs(args).create();
>
> I have a local Spark cluster, but what additional parameters need to be
> given to make the Beam code run on Spark?
> (Sorry, the documentation for this use case seems sparse, or perhaps I
> overlooked it?)
>
> Thank you for your help.
>
> *--*
> *Mahesh Vangala*
> *(Ph) 443-326-1957*
> *(web) mvangala.com <http://mvangala.com>*
>