GitHub user lins05 commented on the issue: https://github.com/apache/spark/pull/17750

IMO we should not enable checkpointing in fine-grained mode. With checkpointing enabled, Mesos agents persist all status updates to disk, and fine-grained mode uses Mesos status updates to send task results back to the driver, so the I/O cost would be high.

Also, I'm not sure whether it makes sense to set `failover_timeout`. The framework failover timeout is designed for frameworks that can reconcile their existing tasks with the Mesos master when they reconnect, but the Mesos scheduler in Spark doesn't implement that yet. Currently, when the Spark driver disconnects from the Mesos master, the master immediately removes the Spark driver from its list of frameworks.
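
A minimal sketch of the gating suggested above, written in Python for brevity rather than in Spark's Scala. It is an illustration, not the PR's code: the helper `framework_settings` and the `FrameworkSettings` type are hypothetical, and the config key `spark.mesos.checkpoint` is assumed to be the name of the checkpointing switch under discussion. The fields `checkpoint` and `failover_timeout` mirror the Mesos `FrameworkInfo` fields of the same names.

```python
from dataclasses import dataclass


@dataclass
class FrameworkSettings:
    """Hypothetical stand-in for the relevant Mesos FrameworkInfo fields."""
    checkpoint: bool
    # Seconds. 0.0 is the Mesos default: the master removes the framework
    # as soon as the scheduler disconnects.
    failover_timeout: float


def _is_true(conf: dict, key: str, default: str) -> bool:
    """Read a boolean-like string setting from a plain dict of config values."""
    return conf.get(key, default).strip().lower() == "true"


def framework_settings(conf: dict) -> FrameworkSettings:
    """Decide framework settings from a Spark-style config dict (sketch only)."""
    # spark.mesos.coarse defaults to true in Spark 2.x.
    coarse = _is_true(conf, "spark.mesos.coarse", "true")
    # Assumed key name for the checkpointing switch.
    wants_checkpoint = _is_true(conf, "spark.mesos.checkpoint", "false")

    # Fine-grained mode returns task results through status updates, so
    # checkpointing there would write every result to the agent's disk.
    checkpoint = coarse and wants_checkpoint

    # Leave failover_timeout at the default: without task reconciliation in
    # the scheduler, a longer timeout would not let the driver recover tasks.
    return FrameworkSettings(checkpoint=checkpoint, failover_timeout=0.0)
```

With this gating, a fine-grained configuration that asks for checkpointing is simply ignored, while coarse-grained mode honors the request.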