Meng Zhu created MESOS-9673:
-------------------------------
Summary: Add timeout mechanism to GC incomplete task.
Key: MESOS-9673
URL: https://issues.apache.org/jira/browse/MESOS-9673
Project: Mesos
Issue Type: Improvement
Components: containerization
Reporter: Meng Zhu
Currently, an executor's meta and sandbox directories are GCed only when a task
is completed, i.e. it is terminal and all of its status updates have been acked.
However, if a status update is never acked, the agent keeps resending it and
keeps the directories forever.
One consequence is that the agent keeps recovering this executor on every
failover. If a later executor happens to reuse the same pid (almost a
certainty, considering the old meta dir will never be GCed), the agent goes
into a crash loop (MESOS-9672).
We should consider introducing a timeout mechanism to GC incomplete tasks.
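
As a rough illustration of what such a timeout could look like, here is a minimal sketch. It is not Mesos code: `ExecutorGcState`, `eligibleForGc`, and `ackTimeout` are hypothetical names, and it assumes the timeout is measured from the moment the task became terminal, using wall-clock time so the timestamp could be checkpointed and survive an agent restart.

```cpp
#include <chrono>

using Clock = std::chrono::system_clock;

// Hypothetical summary of what the agent knows about an executor.
// This is an illustration only, not an actual Mesos type.
struct ExecutorGcState {
  bool terminal;                    // all tasks reached a terminal state
  bool allUpdatesAcked;             // every status update was acknowledged
  Clock::time_point terminalSince;  // when the last task became terminal
};

// Returns true if the meta and sandbox directories may be scheduled for GC.
// `ackTimeout` is an assumed, operator-configurable bound on how long the
// agent waits for outstanding acknowledgements.
bool eligibleForGc(
    const ExecutorGcState& state,
    Clock::time_point now,
    std::chrono::seconds ackTimeout)
{
  if (!state.terminal) {
    return false;  // never GC directories of a running task
  }

  if (state.allUpdatesAcked) {
    return true;   // current behavior: the task is completed
  }

  // Proposed behavior: give up waiting for acks after the timeout.
  return now - state.terminalSince >= ackTimeout;
}
```

With a check like this, a terminal task whose updates are never acked would stop pinning its directories once `ackTimeout` elapses, so the agent would no longer recover the stale executor on later failovers.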