[ 
https://issues.apache.org/jira/browse/SPARK-19700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15885715#comment-15885715
 ] 

Hamel Ajay Kothari commented on SPARK-19700:
--------------------------------------------

Throwing this here for consistency. It looks like you guys referenced this in 
the Kubernetes ticket: https://github.com/palantir/spark/pull/81

The big problem I see with this is the one you also mentioned: different 
schedulers handle the lifecycle events of executors differently. I don't think 
that having an explicit lifecycle class which can be extended 
(SchedulerBackendExecutorLifecycleManager, as you call it) is too much to ask 
for. CoarseGrainedSchedulerBackend already has something like this through 
createDriverEndpoint, right? So we wouldn't be adding much more complexity by 
having that; in fact, I think it would make handling lifecycle events a little 
clearer, which is something we also do for the Cook Spark scheduler.
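
To make this a bit more concrete, here's a very rough sketch of the kind of 
extension point I'm imagining. All of the names below (ExecutorLifecycleManager, 
the callback signatures) are made up for illustration; this is not an existing 
Spark API or the actual proposal:

// Hypothetical sketch only: a lifecycle hook trait that a scheduler backend
// could extend, in the same spirit as overriding createDriverEndpoint today.
trait ExecutorLifecycleManager {
  /** Called when the cluster manager reports a newly registered executor. */
  def onExecutorRegistered(executorId: String, hostname: String): Unit

  /** Called when an executor exits; the implementation decides how to react. */
  def onExecutorLost(executorId: String, reason: String): Unit
}

// A backend-specific implementation (e.g. for Cook) would plug in its own policy.
class LoggingLifecycleManager extends ExecutorLifecycleManager {
  override def onExecutorRegistered(executorId: String, hostname: String): Unit =
    println(s"executor $executorId registered on $hostname")

  override def onExecutorLost(executorId: String, reason: String): Unit =
    println(s"executor $executorId lost: $reason")
}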

[~mccheah] do you guys have any plans to put together an actual design doc? If 
so, I'd be happy to take a look at it and contribute from the viewpoint of 
another scheduler that could benefit from this. Any idea when you'll be putting 
that together?

> Design an API for pluggable scheduler implementations
> -----------------------------------------------------
>
>                 Key: SPARK-19700
>                 URL: https://issues.apache.org/jira/browse/SPARK-19700
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.1.0
>            Reporter: Matt Cheah
>
> One point that was brought up in discussing SPARK-18278 was that schedulers 
> cannot easily be added to Spark without forking the whole project. The main 
> reason is that much of the scheduler's behavior fundamentally depends on the 
> CoarseGrainedSchedulerBackend class, which is not part of the public API of 
> Spark and is in fact quite a complex module. As resource management and 
> allocation continues to evolve, Spark will need to be integrated with more 
> cluster managers, but maintaining support for all possible allocators in the 
> Spark project would be untenable. Furthermore, it would be impossible for 
> Spark to support proprietary frameworks that are developed by specific users 
> for their own particular use cases.
>
> Therefore, this ticket proposes making scheduler implementations fully 
> pluggable. The idea is that Spark will provide a Java/Scala interface that is 
> to be implemented by a scheduler that is backed by the cluster manager of 
> interest. The user can compile their scheduler's code into a JAR that is 
> placed on the driver's classpath. Finally, as is the case in the current 
> world, the scheduler implementation is selected and dynamically loaded 
> depending on the user's provided master URL.
>
> Determining the correct API is the most challenging problem. The current 
> CoarseGrainedSchedulerBackend handles many responsibilities, some of which 
> will be common across all cluster managers, and some which will be specific 
> to a particular cluster manager. For example, the particular mechanism for 
> creating the executor processes will differ between YARN and Mesos, but, once 
> these executors have started running, the means to submit tasks to them over 
> the Netty RPC is identical across the board.
>
> We must also consider a plugin model and interface for submitting the 
> application as well, because different cluster managers support different 
> configuration options, and thus the driver must be bootstrapped accordingly. 
> For example, in YARN mode the application and Hadoop configuration must be 
> packaged and shipped to the distributed cache prior to launching the job. A 
> prototype of a Kubernetes implementation starts a Kubernetes pod that runs 
> the driver in cluster mode.
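
For anyone skimming the quoted description above, here is a rough sketch of the 
kind of pluggable interface and classpath-based loading it describes. Everything 
below (PluggableSchedulerBackend, the method names, the ServiceLoader-based 
discovery) is hypothetical and only meant to illustrate the idea, not a proposed 
API:

import java.util.ServiceLoader
import scala.collection.JavaConverters._

// Hypothetical plugin interface: cluster-manager-specific behavior (how executor
// processes are launched and torn down) lives behind these methods, while the
// common machinery (e.g. dispatching tasks to running executors over RPC) would
// stay inside Spark itself.
trait PluggableSchedulerBackend {
  /** Whether this backend handles the given master URL. */
  def canCreate(masterUrl: String): Boolean

  /** Cluster-manager-specific: request executor processes from the cluster manager. */
  def requestExecutors(count: Int): Unit

  /** Cluster-manager-specific: tear down executors that are no longer needed. */
  def killExecutors(executorIds: Seq[String]): Unit
}

// Hypothetical discovery: implementations compiled into a JAR on the driver's
// classpath are found via ServiceLoader and matched against the user's master URL.
object PluggableSchedulerBackend {
  def forMasterUrl(masterUrl: String): Option[PluggableSchedulerBackend] =
    ServiceLoader.load(classOf[PluggableSchedulerBackend])
      .asScala
      .find(_.canCreate(masterUrl))
}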



