Yan Xu created MESOS-5763:
-----------------------------
Summary: Task stuck in fetching is not cleaned up after
--executor_registration_timeout.
Key: MESOS-5763
URL: https://issues.apache.org/jira/browse/MESOS-5763
Project: Mesos
Issue Type: Bug
Components: containerization
Affects Versions: 0.29.0, 0.28.0, 1.0.0
Reporter: Yan Xu
Assignee: Yan Xu
When the fetching process hangs forever (for example, because of HDFS issues),
the Mesos containerizer attempts to destroy the container and kill the
executor once {{--executor_registration_timeout}} elapses. However, this
reliably fails for us: the launcher destroy kills the executor and the
container is destroyed, but the agent never finds out that the executor has
terminated, so the task is left in the STAGING state forever.
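
To make the expected and the observed behavior concrete, here is a minimal toy
model of the state transition described above. It is only a sketch and not
Mesos code: {{ToyAgent}}, {{executor_terminated}}, {{on_registration_timeout}}
and the {{notify_agent}} switch are hypothetical names, and using TASK_FAILED as
the resulting terminal state is an assumption made for illustration.

```python
# Toy model only: the class and function names are hypothetical, not Mesos APIs.
from enum import Enum


class TaskState(Enum):
    STAGING = "TASK_STAGING"
    FAILED = "TASK_FAILED"  # assumed terminal state, chosen for illustration


class ToyAgent:
    """Tracks one task whose executor is stuck in fetching."""

    def __init__(self):
        self.task_state = TaskState.STAGING
        self.container_alive = True

    def executor_terminated(self):
        # Expected path: the agent learns that the executor is gone and
        # moves the task out of STAGING.
        self.task_state = TaskState.FAILED


def on_registration_timeout(agent, notify_agent):
    """Destroy the container once the registration timeout fires.

    notify_agent=False models the reported behavior: the destroy succeeds,
    but the agent never hears about the executor's termination.
    """
    agent.container_alive = False
    if notify_agent:
        agent.executor_terminated()


expected = ToyAgent()
on_registration_timeout(expected, notify_agent=True)

reported = ToyAgent()
on_registration_timeout(reported, notify_agent=False)

print("expected:", expected.task_state.value, expected.container_alive)
print("reported:", reported.task_state.value, reported.container_alive)
```

In both runs the container is gone; only in the second one does the task remain
in STAGING, which matches the symptom reported here.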
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)