[ https://issues.apache.org/jira/browse/SPARK-19700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16012011#comment-16012011 ]
Andrew Ash commented on SPARK-19700:
------------------------------------
Found another potential implementation: the Eagle cluster manager from
https://github.com/epfl-labos/eagle, with a PR at
https://github.com/apache/spark/pull/17974
> Design an API for pluggable scheduler implementations
> -----------------------------------------------------
>
> Key: SPARK-19700
> URL: https://issues.apache.org/jira/browse/SPARK-19700
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 2.1.0
> Reporter: Matt Cheah
>
> One point that was brought up in discussing SPARK-18278 was that schedulers
> cannot easily be added to Spark without forking the whole project. The main
> reason is that much of the scheduler's behavior fundamentally depends on the
> CoarseGrainedSchedulerBackend class, which is not part of the public API of
> Spark and is in fact quite a complex module. As resource management and
> allocation continue to evolve, Spark will need to be integrated with more
> cluster managers, but maintaining support for all possible allocators in the
> Spark project would be untenable. Furthermore, it would be impossible for
> Spark to support proprietary frameworks developed by specific users for
> their own particular use cases.
> Therefore, this ticket proposes making scheduler implementations fully
> pluggable. The idea is for Spark to provide a Java/Scala interface that is
> implemented by a scheduler backed by the cluster manager of interest. The
> user can compile their scheduler's code into a JAR that is placed on the
> driver's classpath. Finally, as is the case today, the scheduler
> implementation is selected and dynamically loaded based on the user's
> provided master URL.
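>
> A minimal sketch of what such an SPI could look like, assuming discovery via
> java.util.ServiceLoader (all names here are hypothetical, not an existing
> Spark API):
> {code:scala}
> // Hypothetical SPI sketch. A plugin JAR on the driver's classpath
> // implements SchedulerProvider and is selected by matching the master URL.
> import java.util.ServiceLoader
> import scala.collection.JavaConverters._
>
> // Stand-in for the backend abstraction a plugin would supply.
> trait SchedulerBackendStub {
>   def start(): Unit
>   def stop(): Unit
> }
>
> trait SchedulerProvider {
>   // e.g. returns true for "eagle://host:port" if this plugin wraps Eagle
>   def canCreate(masterUrl: String): Boolean
>   def createBackend(masterUrl: String): SchedulerBackendStub
> }
>
> object SchedulerProviderLoader {
>   // Pick the first plugin on the classpath that claims the master URL.
>   def load(masterUrl: String): SchedulerProvider =
>     ServiceLoader.load(classOf[SchedulerProvider]).asScala
>       .find(_.canCreate(masterUrl))
>       .getOrElse(throw new IllegalArgumentException(
>         s"no scheduler provider found for master URL: $masterUrl"))
> }
> {code}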
> Determining the correct API is the most challenging problem. The current
> CoarseGrainedSchedulerBackend handles many responsibilities, some of which
> will be common across all cluster managers and some of which will be specific
> to a particular cluster manager. For example, the particular mechanism for
> creating the executor processes will differ between YARN and Mesos, but, once
> these executors have started running, the means of submitting tasks to them
> over Netty RPC is identical across the board.
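>
> A minimal sketch of that separation, assuming a template-method style split
> between a common base class and manager-specific subclasses (names invented
> for illustration; this is not Spark's actual class layout):
> {code:scala}
> // The common layer owns task dispatch; each cluster manager only overrides
> // how executor processes are provisioned.
> abstract class BaseBackend {
>   // Cluster-manager-specific: a YARN backend asks the ResourceManager for
>   // containers; a Kubernetes backend would create executor pods; etc.
>   protected def requestExecutors(count: Int): Unit
>
>   // Common path: once an executor has registered, shipping a serialized
>   // task to it looks the same everywhere (in Spark, over the Netty-based
>   // RPC layer).
>   final def launchTask(executorId: String, serializedTask: Array[Byte]): Unit =
>     println(s"shipping ${serializedTask.length}-byte task to $executorId")
> }
>
> // Example of what one cluster-manager-specific piece would override.
> class StubBackend extends BaseBackend {
>   protected def requestExecutors(count: Int): Unit =
>     println(s"asking the cluster manager for $count executors")
> }
> {code}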
> We must also consider a plugin model and interface for submitting the
> application, because different cluster managers support different
> configuration options, and thus the driver must be bootstrapped accordingly.
> For example, in YARN mode the application and Hadoop configuration must be
> packaged and shipped to the distributed cache prior to launching the job,
> while a prototype Kubernetes implementation starts a Kubernetes pod that
> runs the driver in cluster mode.
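>
> The submission side could follow the same provider pattern; a minimal sketch
> under that assumption (again, all names invented for illustration):
> {code:scala}
> // Each cluster manager bootstraps the driver its own way before the job
> // starts; the submission tool depends only on this small interface.
> trait ApplicationSubmitter {
>   // e.g. returns true for "yarn" or "k8s://..." master URLs
>   def supports(masterUrl: String): Boolean
>
>   // Stage files, build the driver environment, and launch the driver.
>   def submit(masterUrl: String, sparkConf: Map[String, String]): Unit
> }
>
> // A YARN-style submitter would stage the app JAR and Hadoop config into the
> // distributed cache here; a Kubernetes-style submitter would instead create
> // a pod that runs the driver in cluster mode.
> class StubSubmitter extends ApplicationSubmitter {
>   def supports(masterUrl: String): Boolean = masterUrl.startsWith("stub://")
>   def submit(masterUrl: String, sparkConf: Map[String, String]): Unit =
>     println(s"bootstrapping driver for $masterUrl with ${sparkConf.size} settings")
> }
> {code}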