Hi Buvana,

Running Beam Python on Spark on Kubernetes is more complicated than the
Java case, because Beam runs Python code through its own portability
framework [1]. Unfortunately, I don't know of a guide for Spark yet, but
we do have instructions for Flink [2]. Beam's Flink and Spark runners are
similar, and I assume GCP's (unofficial) Flink and Spark [3] operators are
too, so it shouldn't be too hard to port the YAML from the Flink operator
to the Spark operator. I filed an issue for it [4], but I probably won't
have the bandwidth to work on it myself for a while.
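
For comparison, the Java path Rahul describes in the quoted message below
boils down to a single spark-submit call. Here is a rough sketch of how that
command line might be assembled. The flags (--master with a k8s:// prefix,
--deploy-mode, --class, spark.kubernetes.container.image) come from the Spark
on Kubernetes docs; the API server URL, image name, main class, and jar path
are placeholders I made up, so substitute your own. It does not cover the
Python case.

```python
import shlex


def build_spark_submit(master_url, image, main_class, jar_path, pipeline_args=()):
    """Assemble a spark-submit command for cluster mode on Kubernetes.

    All argument values are supplied by the caller; nothing here is
    specific to a real cluster.
    """
    return [
        "spark-submit",
        # Spark expects the Kubernetes API server URL prefixed with k8s://
        "--master", f"k8s://{master_url}",
        "--deploy-mode", "cluster",
        # Main class inside the uber jar that builds and runs the Beam pipeline
        "--class", main_class,
        # Container image used for the driver and executor pods
        "--conf", f"spark.kubernetes.container.image={image}",
        # Primary resource: the uber jar containing the Beam code
        jar_path,
        # Everything after the jar is passed to the pipeline's main()
        "--runner=SparkRunner",
        *pipeline_args,
    ]


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    cmd = build_spark_submit(
        master_url="https://my-apiserver.example.com:6443",
        image="my-registry.example.com/spark-beam:latest",
        main_class="com.example.MyBeamPipeline",
        jar_path="local:///opt/app/my-pipeline-bundled.jar",
    )
    print(shlex.join(cmd))
```

The local:// scheme assumes the jar is already baked into the container
image; if it lives elsewhere, the jar path would need to change accordingly.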

- Kyle

[1] https://beam.apache.org/roadmap/portability/
[2] https://github.com/GoogleCloudPlatform/flink-on-k8s-operator/blob/master/docs/beam_guide.md
[3] https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/
[4] https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/issues/870

On Sat, Apr 11, 2020 at 4:33 PM Ramanan, Buvana (Nokia - US/Murray Hill) <
[email protected]> wrote:

> Thank you, Rahul, for your very useful response. Could you please extend it
> by commenting on the procedure for a Beam Python pipeline?
>
>
>
> *From: *rahul patwari <[email protected]>
> *Reply-To: *"[email protected]" <[email protected]>
> *Date: *Friday, April 10, 2020 at 10:57 PM
> *To: *user <[email protected]>
> *Subject: *Re: SparkRunner on k8s
>
>
>
> Hi Buvana,
>
>
>
> You can submit a Beam Pipeline to Spark on k8s like any other Spark
> Pipeline using the spark-submit script.
>
>
>
> Create an Uber Jar of your Beam code and provide it as the primary
> resource to spark-submit. Provide the k8s master and the container image to
> use as arguments to spark-submit.
>
> Refer to https://spark.apache.org/docs/latest/running-on-kubernetes.html to
> learn more about how to run Spark on k8s.
>
>
>
> The Beam pipeline will be translated to a Spark pipeline using Spark APIs
> at runtime.
>
>
>
> Regards,
>
> Rahul
>
>
>
> On Sat, Apr 11, 2020 at 4:38 AM Ramanan, Buvana (Nokia - US/Murray Hill) <
> [email protected]> wrote:
>
> Hello,
>
>
>
> I recently joined this group and went through the archive to see if any
> discussion exists on submitting Beam pipelines to a SparkRunner on k8s.
>
>
>
> I run my Spark jobs on a k8s cluster in cluster mode. I would like to
> deploy my Beam pipeline on a SparkRunner with k8s underneath.
>
>
>
> The Beam documentation:
>
> https://beam.apache.org/documentation/runners/spark/
>
> does not discuss k8s (though it does mention Mesos and YARN).
>
>
>
> Can someone please point me to relevant material in this regard? Or,
> provide the steps for running my beam pipeline in this configuration?
>
>
>
> Thank you,
>
> Regards,
>
> Buvana
>
>
