[ https://issues.apache.org/jira/browse/SPARK-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14201428#comment-14201428 ]
Aaron Davidson commented on SPARK-4287:
---------------------------------------
I think the right solution here might be to supplement the ContextCleaner (or
BlockManagerMaster) with the notion of the external shuffle service, and for it
to make a request to the external shuffle service to clean up the appropriate
shuffle files. This should hopefully be a relatively small change.
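
To make the proposal above concrete, here is a rough sketch of what the service-side half of an explicit cleanup request could look like. It is only an illustration under assumptions: the `ShuffleFileRegistry` class, its method names, and the `(app_id, shuffle_id)` key are hypothetical and are not Spark's actual API. The idea is that the driver-side cleaner sends "remove shuffle N for app A" and the service deletes the files it knows about for that key.

```python
import os
from collections import defaultdict


class ShuffleFileRegistry:
    """Hypothetical service-side index of shuffle files by (app_id, shuffle_id)."""

    def __init__(self):
        # Maps (app_id, shuffle_id) -> set of file paths written for that shuffle.
        self._files = defaultdict(set)

    def register(self, app_id, shuffle_id, path):
        """Record that `path` belongs to the given application's shuffle."""
        self._files[(app_id, shuffle_id)].add(path)

    def remove_shuffle(self, app_id, shuffle_id):
        """Handle a cleanup request; returns the number of files deleted."""
        deleted = 0
        # pop() forgets the entry even if some files are already missing.
        for path in self._files.pop((app_id, shuffle_id), set()):
            try:
                os.remove(path)
                deleted += 1
            except FileNotFoundError:
                pass  # already gone, e.g. removed by some other cleanup path
        return deleted
```

Because the request goes to the service rather than to an executor, it would still work after the executors that wrote the files have been killed. A repeated request for the same shuffle simply deletes nothing.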
> Add TTL-based cleanup in external shuffle service
> -------------------------------------------------
>
> Key: SPARK-4287
> URL: https://issues.apache.org/jira/browse/SPARK-4287
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 1.2.0
> Reporter: Andrew Or
>
> A problem with long-running SparkContexts using the external shuffle service
> is that its shuffle files may never be cleaned. This is because the
> ContextCleaner may no longer have executors to go through to clean up these
> files (they may be killed intentionally). We should have a TTL-based timeout
> that does this as a backup, defaulting to a timeout of a week or something.
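
A minimal sketch of the TTL-based backup described in the issue, assuming that file modification time is an acceptable proxy for "last used" and that all shuffle files live under a single root directory. The function name, the `now` parameter, and the one-week default are hypothetical, chosen only to mirror the wording of the description. This is not how the external shuffle service is implemented.

```python
import os
import time

# Hypothetical default, following the "a week or something" suggestion.
DEFAULT_TTL_SECONDS = 7 * 24 * 60 * 60


def remove_expired_files(root, ttl_seconds=DEFAULT_TTL_SECONDS, now=None):
    """Delete regular files under `root` whose mtime is older than the TTL.

    Returns the sorted list of paths that were removed.
    """
    now = time.time() if now is None else now
    removed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                age = now - os.path.getmtime(path)
                if age > ttl_seconds:
                    os.remove(path)
                    removed.append(path)
            except FileNotFoundError:
                # The file vanished between listing and deletion; skip it.
                continue
    return sorted(removed)
```

A periodic task in the service could call `remove_expired_files(shuffle_root)` on a timer. Passing `now` explicitly makes the expiry rule easy to test without waiting for real time to pass.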