GitHub user hbhanawat opened a pull request:
https://github.com/apache/spark/pull/11723
[SPARK-13904][Scheduler]Add support for pluggable cluster manager
## What changes were proposed in this pull request?
This commit adds support for a pluggable cluster manager, and also allows a
cluster manager to clean up tasks without taking the parent process down.
To plug in a new external cluster manager, the ExternalClusterManager trait
should be implemented. It returns the task scheduler and scheduler backend that
SparkContext will use to schedule tasks. An external cluster manager is
registered using the java.util.ServiceLoader mechanism (the same mechanism used
to register data sources such as Parquet, JSON, and JDBC). This allows
implementations of the ExternalClusterManager interface to be auto-loaded.
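The lookup pattern described above can be sketched in plain Scala. This is a self-contained analogue, not the actual Spark API: the `ClusterManagerPlugin` trait, its `canCreate` method, and the in-memory registry standing in for `java.util.ServiceLoader` are all illustrative assumptions.

```scala
// Sketch of the plugin-lookup pattern. In Spark itself, implementations
// would be discovered via java.util.ServiceLoader from entries under
// META-INF/services on the classpath; here a hard-coded list stands in.
trait ClusterManagerPlugin {
  // Return true if this manager handles the given master URL.
  def canCreate(masterURL: String): Boolean
  def name: String
}

class DummyClusterManager extends ClusterManagerPlugin {
  def canCreate(masterURL: String): Boolean = masterURL == "dummy"
  def name: String = "dummy-manager"
}

class OtherClusterManager extends ClusterManagerPlugin {
  def canCreate(masterURL: String): Boolean = masterURL.startsWith("other://")
  def name: String = "other-manager"
}

object PluginRegistry {
  // In the real mechanism this list would come from
  // ServiceLoader.load(classOf[ClusterManagerPlugin]).
  private val candidates: Seq[ClusterManagerPlugin] =
    Seq(new DummyClusterManager, new OtherClusterManager)

  // Pick the unique manager claiming the URL; ambiguity or no match
  // yields None, mirroring a strict lookup on the master string.
  def resolve(masterURL: String): Option[ClusterManagerPlugin] = {
    val matching = candidates.filter(_.canCreate(masterURL))
    if (matching.size == 1) matching.headOption else None
  }
}
```

A caller would resolve the manager once, from the master URL, before constructing any scheduler components.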
Currently, when a driver fails, executors exit by calling System.exit. This
does not bode well for cluster managers that would like to reuse the parent
process of an executor. Hence:
1. System.exit is moved to a function that can be overridden in subclasses
of CoarseGrainedExecutorBackend.
2. Functionality is added to kill all the running tasks in an executor.
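The two points above can be sketched together in a minimal, self-contained example. The class and method names (`ExecutorBackendBase`, `exitExecutor`, `killAllTasks`) are illustrative assumptions, not the actual Spark identifiers.

```scala
import scala.collection.mutable

// Base backend: by default a driver failure tears down the whole JVM.
class ExecutorBackendBase {
  protected val runningTasks = mutable.Set[Long]()

  def launchTask(taskId: Long): Unit = runningTasks += taskId

  // Point 2: kill every running task so the process can be reused.
  def killAllTasks(): Unit = runningTasks.clear()

  // Point 1: the exit path is a method, so subclasses can override it
  // instead of unconditionally killing the JVM.
  protected def exitExecutor(code: Int): Unit = sys.exit(code)

  def onDriverDisconnected(): Unit = exitExecutor(1)
}

// A backend for a cluster manager that shares the parent process:
// clean up tasks on driver failure, but never call sys.exit.
class InProcessExecutorBackend extends ExecutorBackendBase {
  var cleanedUp = false

  override protected def exitExecutor(code: Int): Unit = {
    killAllTasks()
    cleanedUp = true // signal cleanup instead of exiting the JVM
  }

  def taskCount: Int = runningTasks.size
}
```

With this shape, the default backend behaves exactly as before, while an embedding cluster manager substitutes its own cleanup in place of process termination.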
## How was this patch tested?
ExternalClusterManagerSuite.scala was added to test this patch.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/hbhanawat/spark pluggableScheduler
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/11723.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #11723
----
commit 800834f24ad1f0c4a68d8d49f600db6570d100ef
Author: Hemant Bhanawat <[email protected]>
Date: 2016-03-15T09:00:30Z
----