While it may be worth creating the design doc and JIRA ticket so that we at least have a better idea and a record of what you are talking about, I kind of doubt that we are going to want to merge this into the Spark codebase. That's not because of anything specific to this Aurora effort, but because adding more in-tree scheduler implementations is not the direction we want to go. There is already some regret that the YARN scheduler wasn't implemented by means of a scheduler plug-in API, and there is likely to be more regret if we continue forward with the spark-on-kubernetes SPIP in its present form.

I'd guess that we are likely to merge the code associated with that SPIP just because Kubernetes has become such an important resource scheduler, but such a merge wouldn't be without misgivings. We simply can't get into the position of carrying more and more scheduler implementations in the Spark codebase, with more and more maintenance overhead to keep up with the idiosyncrasies of each one. We've really got to get to the kind of plug-in architecture discussed in SPARK-19700, so that scheduler implementations can be done outside of the Spark codebase, release schedule, etc.
My opinion on the subject isn't dispositive on its own, of course, but that is how I'm seeing things right now.

On Sun, Sep 10, 2017 at 8:27 PM, karthik padmanabhan <treadston...@gmail.com> wrote:

> Hi Spark Devs,
>
> We are using Aurora (http://aurora.apache.org/) as our Mesos framework
> for running stateless services. We would like to use Aurora to deploy big
> data and batch workloads as well. For this we have forked Spark and
> implemented the ExternalClusterManager trait.
>
> The reason for doing this, rather than running Spark on Mesos directly,
> is to leverage the existing roles and quotas provided by Aurora for
> admission control, and also to leverage Aurora features such as priority
> and preemption. Additionally, we would like Aurora to be the only
> deploy/orchestration system that our users have to interact with.
>
> We have a working POC where Spark launches jobs through Aurora as the
> ClusterManager. Is this something that can be merged upstream? If so, I
> can create a design document and an associated JIRA ticket.
>
> Thanks,
> Karthik
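For anyone following along who hasn't looked at the trait, here is a minimal sketch of what an implementation along the lines Karthik describes could look like. The AuroraClusterManager and AuroraSchedulerBackend names and the "aurora://" master URL scheme are hypothetical, invented purely for illustration; the actual POC code is not shown in this thread. Note that ExternalClusterManager is private[spark], which is exactly why such an implementation has to live inside the org.apache.spark namespace today, i.e. in a fork:

// A minimal sketch, not the actual POC. All Aurora-specific names here
// are hypothetical placeholders.
package org.apache.spark.scheduler.aurora

import org.apache.spark.SparkContext
import org.apache.spark.scheduler.{ExternalClusterManager, SchedulerBackend, TaskScheduler, TaskSchedulerImpl}

private[spark] class AuroraClusterManager extends ExternalClusterManager {

  // Claim master URLs of the hypothetical form "aurora://...".
  override def canCreate(masterURL: String): Boolean =
    masterURL.startsWith("aurora://")

  // Reuse Spark's standard TaskScheduler implementation.
  override def createTaskScheduler(sc: SparkContext, masterURL: String): TaskScheduler =
    new TaskSchedulerImpl(sc)

  // The backend is where the Aurora-specific logic would live.
  override def createSchedulerBackend(
      sc: SparkContext,
      masterURL: String,
      scheduler: TaskScheduler): SchedulerBackend =
    new AuroraSchedulerBackend(scheduler.asInstanceOf[TaskSchedulerImpl], sc, masterURL)

  // Wire the scheduler to its backend once both have been created.
  override def initialize(scheduler: TaskScheduler, backend: SchedulerBackend): Unit =
    scheduler.asInstanceOf[TaskSchedulerImpl].initialize(backend)
}

// Hypothetical stub: a real backend would translate executor resource
// requests into Aurora jobs, honoring Aurora's roles, quotas, priority,
// and preemption.
private[spark] class AuroraSchedulerBackend(
    scheduler: TaskSchedulerImpl,
    sc: SparkContext,
    masterURL: String) extends SchedulerBackend {
  override def start(): Unit = {}          // register with Aurora, launch executors
  override def stop(): Unit = {}           // tear down Aurora-managed executors
  override def reviveOffers(): Unit = {}   // offer resources to pending tasks
  override def defaultParallelism(): Int =
    sc.conf.getInt("spark.default.parallelism", 2)
}

Since SparkContext discovers ExternalClusterManager implementations via java.util.ServiceLoader, the fork would also need to register the class in a META-INF/services/org.apache.spark.scheduler.ExternalClusterManager resource file. The fact that all of this requires being inside the Spark source tree is the core of the problem SPARK-19700 is meant to solve.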