Hi All,
We have Spark Streaming pipelines (written in Java) currently running on YARN in production. We are evaluating moving these streaming pipelines to Kubernetes, and we have set up a working Kubernetes cluster. I have been reading the Spark documentation and a few blogs on migrating to Kubernetes, and I have the following questions:

1. It is not very clear how to migrate existing pipelines to Spark on Kubernetes. Any pointers on this would be helpful.

2. I am trying to run the sample wordcount example using the commands from the documentation (https://spark.apache.org/docs/2.4.0/running-on-kubernetes.html#cluster-mode). However, I cannot figure out how to pass the Spark Docker image as one of the confs (spark.kubernetes.container.image). Our machines have no internet access, so I have manually pre-loaded a Spark Docker image (the one available at gcr.io) into our local Docker images. What should my spark-submit command look like?

3. Would specifying spark.kubernetes.container.image.pullPolicy=IfNotPresent pull the Docker image only if it is not already present in the local Docker image list?

Any help in answering the above questions would be appreciated.
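
For reference, here is a minimal sketch of the kind of spark-submit invocation that question 2 is about, adapted from the cluster-mode example in the linked documentation. The API server URL, the image tag, and the input path are hypothetical placeholders, not values from our cluster, and the jar path assumes the layout of the stock Spark 2.4.0 image. The script only prints the command (a dry run) so it can be checked before submitting.

```shell
#!/bin/sh
# Placeholders only: none of these values come from a real cluster.
K8S_API_SERVER="https://k8s-apiserver.example.internal:6443"  # hypothetical API server URL
SPARK_IMAGE="gcr.io/my-project/spark:v2.4.0"                  # hypothetical tag of the pre-loaded image
# Jar path inside the image (assumes the stock Spark 2.4.0 image layout).
APP_JAR="local:///opt/spark/examples/jars/spark-examples_2.11-2.4.0.jar"
# Hypothetical input file; it must be readable from inside the pods.
INPUT_FILE="file:///path/to/input.txt"

# Collect the arguments once so they can be inspected before submitting.
set -- \
  --master "k8s://${K8S_API_SERVER}" \
  --deploy-mode cluster \
  --name wordcount \
  --class org.apache.spark.examples.JavaWordCount \
  --conf spark.executor.instances=2 \
  --conf "spark.kubernetes.container.image=${SPARK_IMAGE}" \
  --conf spark.kubernetes.container.image.pullPolicy=IfNotPresent \
  "${APP_JAR}" \
  "${INPUT_FILE}"

# Dry run: print the full command. Drop the leading `echo` to submit for real.
echo spark-submit "$@"
```

The image name given in spark.kubernetes.container.image would presumably have to match the name and tag of the pre-loaded image exactly, as listed by `docker images` on each node. Kubernetes documents IfNotPresent as pulling an image only when it is not already present on the node.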