[
https://issues.apache.org/jira/browse/SPARK-18278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15734033#comment-15734033
]
Reynold Xin commented on SPARK-18278:
-------------------------------------
In the past few days I've given this a lot of thought.
I'm personally very interested in this work, and would actually use it myself.
That said, based on my experience, the real work starts after the initial thing
works, i.e. the future maintenance and enhancement work will be much larger
than the initial commit. Adding another officially supported scheduler
definitely has some serious (and maybe disruptive) impacts on Spark. Some
examples are:
1. Testing becomes more complicated.
2. Related to 1, releases become more likely to be delayed. In the past many
Spark releases were delayed due to bugs in the Mesos integration or the YARN
integration, because those are harder to test reliably in an automated
fashion.
3. The release process has to change.
Given that Kubernetes is still very young, and it is unclear how successful it
will be in the future (I personally think it will be, but you never know), I
would make the following concrete recommendations on moving this forward:
1. See if we can implement this as an add-on (library) outside Spark. If that
is not possible, what about a fork?
2. Publish some non-official docker images so it is easy to use Spark on
Kubernetes this way.
3. Encourage users to use it and get feedback. Have the contributors that are
really interested in this work maintain it for a couple of Spark releases (this
includes testing the implementation, publishing new docker images, and writing
documentation).
4. Evaluate later (say, after 2 releases) how well this has been received, and
decide whether we make a coordinated effort to merge this into Spark, since it
might become the most popular cluster manager.
> Support native submission of spark jobs to a kubernetes cluster
> ---------------------------------------------------------------
>
> Key: SPARK-18278
> URL: https://issues.apache.org/jira/browse/SPARK-18278
> Project: Spark
> Issue Type: Umbrella
> Components: Build, Deploy, Documentation, Scheduler, Spark Core
> Reporter: Erik Erlandson
> Attachments: SPARK-18278 - Spark on Kubernetes Design Proposal.pdf
>
>
> A new Apache Spark sub-project that enables native support for submitting
> Spark applications to a kubernetes cluster. The submitted application runs
> in a driver executing on a kubernetes pod, and executor lifecycles are also
> managed as pods.