[
https://issues.apache.org/jira/browse/MESOS-396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13607057#comment-13607057
]
Benjamin Mahler commented on MESOS-396:
---------------------------------------
These are the two major components of the change, up for review:
https://reviews.apache.org/r/10028/
https://reviews.apache.org/r/10032/
> Slave GarbageCollector needs to delete the parent executor directories. It
> currently only deletes the executor run directories.
> -------------------------------------------------------------------------------------------------------------------------------
>
> Key: MESOS-396
> URL: https://issues.apache.org/jira/browse/MESOS-396
> Project: Mesos
> Issue Type: Bug
> Reporter: Benjamin Mahler
> Assignee: Benjamin Mahler
> Priority: Blocker
>
> As a result, long-lived slaves accumulate a large number of empty executor
> directories. All that remains in each of these directories is a broken
> link to the 'latest' run.
> Over time, as the number of empty executor directories approaches LINK_MAX,
> the slave will crash when mkdir fails, as was found in MESOS-391.
> The fix is to schedule the parent executor directories for deletion as
> well. However, the GC module does not know whether a parent executor
> directory can safely be deleted, because more tasks may be launched with the
> same executor id after the directory has been scheduled for deletion.
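
One way to picture the problem described above is a GC that lets a pending deletion be cancelled when an executor id is reused. The following is only a minimal sketch under that assumption, not the Mesos GarbageCollector or the change in the linked reviews. DirectoryGC, schedule, unschedule, prune and launch_task are hypothetical names chosen for illustration, and the sketch deletes on demand rather than after a delay.

```python
import os
import shutil
import tempfile


class DirectoryGC:
    """Hypothetical stand-in for a GC module (not the Mesos implementation)."""

    def __init__(self):
        # Paths that are waiting to be deleted.
        self._scheduled = set()

    def schedule(self, path):
        # Mark a directory for later deletion.
        self._scheduled.add(path)

    def unschedule(self, path):
        # Cancel a pending deletion. Returns True if the path was pending.
        if path in self._scheduled:
            self._scheduled.remove(path)
            return True
        return False

    def prune(self):
        # Delete every directory that is still scheduled and report which
        # paths were removed. A real GC would wait for a timeout first.
        removed = sorted(self._scheduled)
        for path in removed:
            shutil.rmtree(path, ignore_errors=True)
        self._scheduled.clear()
        return removed


def launch_task(gc, executor_dir):
    # A task reusing an executor id must cancel any pending deletion of the
    # parent executor directory before a new run directory is created in it.
    gc.unschedule(executor_dir)
    os.makedirs(executor_dir, exist_ok=True)
    return tempfile.mkdtemp(dir=executor_dir)
```

In this sketch the parent executor directory is scheduled together with its run directories, and launch_task is the only place that needs to know that an executor id is in use again.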