Yogesh Natarajan created SPARK-24075:
----------------------------------------

             Summary: [Mesos] Supervised driver upon failure will be retried indefinitely unless explicitly killed
                 Key: SPARK-24075
                 URL: https://issues.apache.org/jira/browse/SPARK-24075
             Project: Spark
          Issue Type: Improvement
          Components: Mesos
    Affects Versions: 2.3.0
            Reporter: Yogesh Natarajan

If supervise is enabled, MesosClusterScheduler retries a failing driver indefinitely. This ties up cluster resources, which are freed only when the driver is explicitly killed.

The proposed solution is to introduce a Spark configuration property, "spark.driver.supervise.maxRetries", that sets the maximum number of retries while preserving the default behavior of retrying the driver indefinitely.
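
To make the proposed semantics concrete, here is a minimal sketch of a bounded retry decision. It is not the MesosClusterScheduler code: the names `should_relaunch` and `run_supervised` are hypothetical, and representing "no limit configured" as `None` is an assumption made only for illustration. The idea is that an unset limit keeps today's retry-forever behavior, while a set limit stops relaunching once that many retries have been used.

```python
from typing import Callable, Optional


def should_relaunch(retries_so_far: int, max_retries: Optional[int] = None) -> bool:
    """Decide whether a failed supervised driver gets another launch attempt.

    max_retries=None models the existing default (retry indefinitely);
    this encoding is an assumption of the sketch, not Spark's.
    """
    if max_retries is None:
        return True
    return retries_so_far < max_retries


def run_supervised(launch: Callable[[], bool], max_retries: Optional[int] = None) -> int:
    """Call launch() until it succeeds or the retry budget is spent.

    launch() is a hypothetical callback returning True on success.
    Returns the number of retries performed, not counting the first attempt.
    """
    retries = 0
    while not launch():
        if not should_relaunch(retries, max_retries):
            break  # budget exhausted: stop relaunching and release resources
        retries += 1
    return retries
```

With `max_retries=3`, a driver that always fails is launched four times in total (one initial attempt plus three retries) and is then left stopped instead of holding cluster resources.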