[ https://issues.apache.org/jira/browse/MESOS-5763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15364650#comment-15364650 ]
Yan Xu edited comment on MESOS-5763 at 7/6/16 9:33 PM:
-------------------------------------------------------

https://reviews.apache.org/r/49650/
https://reviews.apache.org/r/49725/
https://reviews.apache.org/r/49651/
https://reviews.apache.org/r/49652/
https://reviews.apache.org/r/49653/
https://reviews.apache.org/r/49726/

We have another patch that adds a test and will post it soon. /cc [~megha.sharma]


was (Author: xujyan):
https://reviews.apache.org/r/49650/
https://reviews.apache.org/r/49651/
https://reviews.apache.org/r/49652/
https://reviews.apache.org/r/49653/

We have another patch which adds a test, will post it soon. /cc [~megha.sharma]

> Task stuck in fetching is not cleaned up after
> --executor_registration_timeout.
> -------------------------------------------------------------------------------
>
>                 Key: MESOS-5763
>                 URL: https://issues.apache.org/jira/browse/MESOS-5763
>             Project: Mesos
>          Issue Type: Bug
>          Components: containerization
>    Affects Versions: 0.28.0, 1.0.0, 0.29.0
>            Reporter: Yan Xu
>            Assignee: Yan Xu
>            Priority: Blocker
>             Fix For: 0.28.3, 1.0.0, 0.27.4
>
>
> When the fetching process hangs forever (for example, because of HDFS issues),
> the Mesos containerizer attempts to destroy the container and kill the
> executor after {{--executor_registration_timeout}}. However, this reliably
> fails for us: the launcher destroy kills the executor and the container is
> destroyed, but the agent never finds out that the executor has terminated,
> leaving the task in the STAGING state forever.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)