GitHub user lins05 commented on the issue:
https://github.com/apache/spark/pull/17750
IMO we should not enable checkpointing in fine-grained mode. With
checkpointing enabled, Mesos agents persist every status update to disk, and
fine-grained mode uses Mesos status updates to send task results back to the
driver, so the I/O cost would be high.
Also, I'm not sure whether it makes sense to set the `failover_timeout` at
all. The framework failover timeout is designed for frameworks that can
reconcile their existing tasks with the Mesos master after reconnecting, but
the Mesos scheduler in Spark doesn't implement that yet. Currently, when the
Spark driver disconnects from the Mesos master, the master immediately removes
the Spark driver from the frameworks list.
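
To make the suggestion concrete, here is a minimal sketch of the decision
logic in Python. It is not Spark's code and does not call the Mesos API:
`FrameworkSettings`, `choose_settings`, and all parameter names are
hypothetical, and the class only stands in for the two framework fields
discussed here. Gating the timeout on reconciliation support is one possible
reading of the concern above, not a settled decision.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FrameworkSettings:
    """Hypothetical stand-in for the two framework fields under discussion."""
    checkpoint: bool
    failover_timeout: Optional[float]  # seconds; None means "leave unset"


def choose_settings(coarse_grained: bool,
                    checkpoint_requested: bool,
                    failover_timeout_requested: Optional[float],
                    scheduler_reconciles: bool = False) -> FrameworkSettings:
    # Fine-grained mode sends task results back through status updates, so
    # checkpointing would make agents write each of them to disk.
    checkpoint = checkpoint_requested and coarse_grained
    # Assumption: a failover timeout is only useful when the scheduler can
    # reconcile its existing tasks after reconnecting to the master.
    timeout = failover_timeout_requested if scheduler_reconciles else None
    return FrameworkSettings(checkpoint=checkpoint, failover_timeout=timeout)
```

With these rules, a fine-grained job that asks for checkpointing still gets
`checkpoint=False`, and the timeout stays unset until the scheduler supports
reconciliation.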