I spent a while figuring that out a week or two ago and wrote up a blog
post on it:
https://www.spicule.co.uk/news/post/2019-09-30-running-an-apache-beam-pipeline-over-spark-on-kubernetes

And some sample code here: https://gitlab.com/spiculedata/spark-beam-demo

The actual submit command looks something like this:

./spark-submit \
  --master k8s://https:// \
  --deploy-mode cluster \
  --name spark-demo \
  --class com.example.beam.ProcessHealth2 \
  --conf spark.executor.instances=5 \
  --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
  --conf spark.kubernetes.container.image=/spark: \
  local:///opt/wordcount-app-1.0.0-shaded.jar \
  "--runner=SparkRunner" \
  "--awsKey=" "--awsSecret=" "--outputPath=s3:///" \
  "--awsRegion=us-east-1"
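
If it helps, here is a rough dry-run sketch of the same command with the blanked-out values pulled into variables. It only prints the arguments, one per line, so you can check them before submitting. The variable names (K8S_API, SPARK_IMAGE, APP_JAR, OUTPUT_PATH, AWS_KEY, AWS_SECRET) and their default values are hypothetical placeholders of mine, not values from the blog post or the demo repo, so substitute your own.

```shell
#!/bin/sh
# Dry-run sketch: assemble the spark-submit arguments and print them.
# Every default below is a hypothetical placeholder; replace with real values.
K8S_API="${K8S_API:-https://k8s.example.invalid:6443}"               # assumed API server URL
SPARK_IMAGE="${SPARK_IMAGE:-registry.example.invalid/spark:latest}"  # assumed image name
APP_JAR="${APP_JAR:-local:///opt/wordcount-app-1.0.0-shaded.jar}"    # jar path inside the image
OUTPUT_PATH="${OUTPUT_PATH:-s3://example-bucket/output}"             # assumed output location

build_submit_cmd() {
  # Collect the arguments as this function's positional parameters.
  set -- ./spark-submit \
    --master "k8s://${K8S_API}" \
    --deploy-mode cluster \
    --name spark-demo \
    --class com.example.beam.ProcessHealth2 \
    --conf spark.executor.instances=5 \
    --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
    --conf "spark.kubernetes.container.image=${SPARK_IMAGE}" \
    "${APP_JAR}" \
    --runner=SparkRunner \
    "--awsKey=${AWS_KEY:-}" \
    "--awsSecret=${AWS_SECRET:-}" \
    "--outputPath=${OUTPUT_PATH}" \
    --awsRegion=us-east-1
  # Print one argument per line for review; nothing is executed here.
  printf '%s\n' "$@"
}

build_submit_cmd
```

Everything after the jar path is passed to the pipeline's main class as Beam pipeline options rather than to Spark. To actually submit instead of printing, the final printf line could be replaced with "$@".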


Tom

On 28 October 2019 at 22:50:51, Matthew K. ([email protected]) wrote:

I would like to run Beam functions on a Spark cluster created on Kubernetes
using `spark-submit`. However, it is not clear how to integrate Beam's Job
Service with a non-standalone Spark master (on Kubernetes) launched by
`spark-submit`.

Any information is appreciated.

Thanks

