Re: [DISCUSSION] SPIP: An Official Kubernetes Operator for Apache Spark

Zhou Jiang Sun, 12 Nov 2023 10:24:49 -0800

I'd say it's actually the other way round. A user may either
1. Use spark-submit directly; this works with or without the operator. Or,
2. Deploy the operator and create Spark applications with kubectl /
clients, so that the operator runs spark-submit for you.
We may also continue this discussion in the proposal doc.
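
To make option 2 concrete, here is a minimal Python sketch of what "the operator does spark-submit for you" could look like: a declarative application spec is translated into a spark-submit argument list. The spec layout, its field names, and the placeholder values are hypothetical and are not taken from the proposal; only the spark-submit flags used (--master, --deploy-mode, --name, --class, --conf) are the standard ones.

```python
from typing import Dict, List


def build_spark_submit_args(spec: Dict) -> List[str]:
    """Translate a declarative app spec (hypothetical shape) into spark-submit args."""
    args = [
        "spark-submit",
        "--master", spec["master"],
        "--deploy-mode", "cluster",
        "--name", spec["name"],
        "--class", spec["mainClass"],
    ]
    # Each Spark configuration entry becomes one --conf key=value pair.
    for key, value in sorted(spec.get("sparkConf", {}).items()):
        args += ["--conf", f"{key}={value}"]
    # The application jar comes last, followed by the application's own arguments.
    args.append(spec["jar"])
    args += spec.get("arguments", [])
    return args


# Hypothetical spec: every value here is a placeholder, not a real endpoint or image.
example_spec = {
    "name": "spark-pi",
    "master": "k8s://https://<k8s-apiserver-host>:<k8s-apiserver-port>",
    "mainClass": "org.apache.spark.examples.SparkPi",
    "jar": "local:///path/to/examples.jar",
    "sparkConf": {
        "spark.executor.instances": "2",
        "spark.kubernetes.container.image": "<spark-image>",
    },
    "arguments": ["1000"],
}

if __name__ == "__main__":
    print(" ".join(build_spark_submit_args(example_spec)))
```

With option 1 the user types the resulting command by hand; with option 2 the user only writes the spec, and a controller performs the equivalent of this translation whenever a new application resource shows up.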


On Fri, Nov 10, 2023 at 8:57 PM Cheng Pan <pan3...@gmail.com> wrote:

> > Not really - this is not designed to be a replacement for the current
> approach.
>
> That's what I assumed too. But my question is: as a user, how do I write a
> spark-submit command that submits a Spark app so that it leverages this operator?
>
> Thanks,
> Cheng Pan
>
>
> > On Nov 11, 2023, at 03:21, Zhou Jiang <zhou.c.ji...@gmail.com> wrote:
> >
> > Not really - this is not designed to be a replacement for the current
> approach. A Kubernetes operator fits scenarios that need automation and
> application lifecycle management at scale. Users can choose between
> spark-submit and the operator approach based on their specific needs and
> requirements.
> >
> > On Thu, Nov 9, 2023 at 9:16 PM Cheng Pan <pan3...@gmail.com> wrote:
> > Thanks for this impressive proposal. I have a basic question: how does
> spark-submit work with this operator? Or does it require that we use
> `kubectl apply -f spark-job.yaml` (or a K8s client, programmatically) to
> submit a Spark app?
> >
> > Thanks,
> > Cheng Pan
> >
> >
> > > On Nov 10, 2023, at 04:05, Zhou Jiang <zhou.c.ji...@gmail.com> wrote:
> > >
> > > Hi Spark community,
> > > I'm reaching out to initiate a conversation about the possibility of
> developing a Java-based Kubernetes operator for Apache Spark. Following the
> operator pattern (
> https://kubernetes.io/docs/concepts/extend-kubernetes/operator/), Spark
> users may manage applications and related components seamlessly using
> native tools like kubectl. The primary goal is to simplify the Spark user
> experience on Kubernetes, minimizing the learning curve and operational
> complexity, thereby enabling users to focus on Spark application
> development.
> > > Although there are several open-source Spark on Kubernetes operators
> available, none of them are officially integrated into the Apache Spark
> project. As a result, these operators may lack active support and
> development for new features. Within this proposal, our aim is to introduce
> a Java-based Spark operator as an integral component of the Apache Spark
> project. This solution has been employed internally at Apple for multiple
> years, operating millions of executors in real production environments. The
> use of Java in this solution is intended to accommodate a wider user and
> contributor audience, especially those who are familiar with Scala.
> > > Ideally, this operator should have its own dedicated repository, similar
> to Spark Connect Golang or Spark Docker, allowing it to maintain a loose
> connection with the Spark release cycle. This model is also followed by the
> Apache Flink Kubernetes operator.
> > > We believe that this project holds the potential to evolve into a
> thriving community project over the long run. A comparison can be drawn
> with the Flink Kubernetes Operator: Apple open-sourced its internal Flink
> Kubernetes operator, making it part of the Apache Flink project (
> https://github.com/apache/flink-kubernetes-operator). The project has since
> gained wide industry adoption and contributions from the community. In a
> mere year, the Flink operator has garnered more than 600 stars and has
> attracted contributions from over 80 contributors. This showcases the level
> of community interest and collaborative momentum that can be achieved in
> similar scenarios.
> > > More details can be found in the SPIP doc: Spark Kubernetes Operator
> https://docs.google.com/document/d/1f5mm9VpSKeWC72Y9IiKN2jbBn32rHxjWKUfLRaGEcLE
> > > Thanks,
> > > --
> > > Zhou JIANG
> > >
> >
> >
> >
> > --
> > Zhou JIANG
> >
>
>

-- 
*Zhou JIANG*
