Yogesh Natarajan created SPARK-24075:
----------------------------------------

             Summary: [Mesos] Supervised driver upon failure will be retried 
indefinitely unless explicitly killed
                 Key: SPARK-24075
                 URL: https://issues.apache.org/jira/browse/SPARK-24075
             Project: Spark
          Issue Type: Improvement
          Components: Mesos
    Affects Versions: 2.3.0
            Reporter: Yogesh Natarajan


If supervise is enabled, MesosClusterScheduler will retry a failing driver 
indefinitely. The retries keep consuming cluster resources, which are freed 
only when the driver is explicitly killed.

The proposed solution is to introduce a Spark configuration property, 
"spark.driver.supervise.maxRetries", that sets the maximum number of retries. 
When the property is not set, the default behavior of retrying the driver 
indefinitely is preserved.
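
The intended retry decision can be sketched roughly as follows. This is only
an illustration of the proposal, not MesosClusterScheduler code: the names
should_retry and count_launches are hypothetical, an unset property is modeled
as None, and it is assumed that the limit counts relaunches after the first
attempt.

```python
from typing import Optional


def should_retry(supervise: bool, retries_so_far: int,
                 max_retries: Optional[int] = None) -> bool:
    """Hypothetical helper: decide whether a failed driver is relaunched.

    max_retries=None models the property being unset (assumption), which
    keeps the current behavior of retrying indefinitely.
    """
    if not supervise:
        return False  # unsupervised drivers are never relaunched
    if max_retries is None:
        return True   # default: retry without limit
    return retries_so_far < max_retries


def count_launches(fails_before_success: int, supervise: bool,
                   max_retries: Optional[int] = None,
                   safety_cap: int = 100) -> int:
    """Simulate a driver that fails a fixed number of times, then succeeds.

    Returns how many times the driver was launched. safety_cap only bounds
    the simulation so that the unlimited case terminates.
    """
    launches = 0
    retries = 0
    while launches < safety_cap:
        launches += 1
        if launches > fails_before_success:
            return launches  # this launch succeeded
        if not should_retry(supervise, retries, max_retries):
            return launches  # give up; resources are released
        retries += 1
    return launches
```

Under these assumptions, a driver that always fails and has a limit of 3 is
launched four times in total (the initial attempt plus three retries) instead
of being relaunched until someone kills it.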



