> Looks like for Java, a standalone Job Service is not required to run beam
> functions on Spark, and spark-submit handles everything in cluster mode.
> But this is not the case for Python runner.

That's correct.

> Are you aware of any example in Python that runs in a (i.e. Kubernetes)
> cluster?

Not that I'm aware of, but there's been some work on Beam Python+Flink+k8s:
https://github.com/apache/beam/pull/9872. I am planning on doing something
similar for the Spark runner.
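
For reference, here is a rough sketch of how a Python pipeline is usually pointed at a job service: through the `--runner`, `--job_endpoint` and `--environment_type` pipeline options. The helper name `portable_runner_args` and the endpoint `localhost:8099` are assumptions made for illustration; the right endpoint and environment type depend on how the job service is deployed in the cluster.

```python
from typing import List


def portable_runner_args(job_endpoint: str,
                         environment_type: str = "DOCKER") -> List[str]:
    """Build the flags a Beam Python pipeline uses to reach a job service.

    Hypothetical helper, not part of Beam itself.
    """
    if not job_endpoint:
        raise ValueError("job_endpoint must be a non-empty host:port string")
    return [
        "--runner=PortableRunner",
        f"--job_endpoint={job_endpoint}",
        f"--environment_type={environment_type}",
    ]


# Assumed endpoint: wherever the Spark job service is reachable.
args = portable_runner_args("localhost:8099")
print(" ".join(args))

# With apache_beam installed, the flags would be handed to the pipeline
# roughly like this (left commented out so the sketch runs without Beam):
#   import apache_beam as beam
#   from apache_beam.options.pipeline_options import PipelineOptions
#   with beam.Pipeline(options=PipelineOptions(args)) as p:
#       p | beam.Create(["hello"]) | beam.Map(print)
```

This only assembles the options; the job service itself still has to be running and reachable from wherever the pipeline is submitted.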

On Tue, Oct 29, 2019 at 8:57 AM Matthew K. <[email protected]> wrote:

> Thanks Tom,
>
> Looks like for Java, a standalone Job Service is not required to run beam
> functions on Spark, and spark-submit handles everything in cluster mode.
> But this is not the case for Python runner. Are you aware of any example in
> Python that runs in a (i.e. Kubernetes) cluster?
>
> *Sent:* Monday, October 28, 2019 at 6:21 PM
> *From:* "Tom Barber" <[email protected]>
> *To:* [email protected], "Matthew K." <[email protected]>
> *Subject:* Re: Running Python Beam Functions on Spark Kubernetes Cluster
> As my webserver needs to move house tomorrow, here’s a snippet version of
> the post in case the link isn’t available:
> https://gitlab.com/spiculedata/spark-beam-demo/snippets/1908248
>
>
>
>
> On 28 October 2019 at 23:16:20, Tom Barber ([email protected]) wrote:
>
>
>
>
> I spent a while figuring that out a week or two ago and wrote up a blog
> post on it:
> https://www.spicule.co.uk/news/post/2019-09-30-running-an-apache-beam-pipeline-over-spark-on-kubernetes
>
> And some sample code here: https://gitlab.com/spiculedata/spark-beam-demo
>
>
> The actual submit command looks something like this:
>
>
> ./spark-submit --master k8s://https:// --deploy-mode cluster \
>   --name spark-demo --class com.example.beam.ProcessHealth2 \
>   --conf spark.executor.instances=5 \
>   --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
>   --conf spark.kubernetes.container.image=/spark: \
>   local:///opt/wordcount-app-1.0.0-shaded.jar "--runner=SparkRunner" \
>   "--awsKey=" "--awsSecret=" "--outputPath=s3:///" "--awsRegion=us-east-1"
>
>
>
> Tom
>
> On 28 October 2019 at 22:50:51, Matthew K. ([email protected]) wrote:
>
> I would like to run Beam functions on a Spark cluster created on
> Kubernetes using `spark-submit`. However, it is not clear how to integrate
> Beam's Job Service with a non-standalone Spark master (on Kubernetes)
> launched by `spark-submit`.
>
> Any information is appreciated.
>
> Thanks
>
>
>
