[ 
https://issues.apache.org/jira/browse/MESOS-5763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15364650#comment-15364650
 ] 

Yan Xu edited comment on MESOS-5763 at 7/6/16 9:33 PM:
-------------------------------------------------------

https://reviews.apache.org/r/49650/
https://reviews.apache.org/r/49725/
https://reviews.apache.org/r/49651/
https://reviews.apache.org/r/49652/
https://reviews.apache.org/r/49653/
https://reviews.apache.org/r/49726/

We have another patch that adds a test and will post it soon. /cc [~megha.sharma]



was (Author: xujyan):
https://reviews.apache.org/r/49650/
https://reviews.apache.org/r/49651/
https://reviews.apache.org/r/49652/
https://reviews.apache.org/r/49653/

We have another patch which adds a test, will post it soon. /cc [~megha.sharma]


> Task stuck in fetching is not cleaned up after 
> --executor_registration_timeout.
> -------------------------------------------------------------------------------
>
>                 Key: MESOS-5763
>                 URL: https://issues.apache.org/jira/browse/MESOS-5763
>             Project: Mesos
>          Issue Type: Bug
>          Components: containerization
>    Affects Versions: 0.28.0, 1.0.0, 0.29.0
>            Reporter: Yan Xu
>            Assignee: Yan Xu
>            Priority: Blocker
>             Fix For: 0.28.3, 1.0.0, 0.27.4
>
>
> When the fetching process hangs forever (for example, because of HDFS 
> issues), the Mesos containerizer attempts to destroy the container and kill 
> the executor after {{--executor_registration_timeout}}. However, this reliably 
> fails for us: the launcher destroy kills the executor and the container is 
> destroyed, but the agent never finds out that the executor has terminated, 
> so the task is left in the STAGING state forever.
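
The symptom described above can be pictured with a small toy model. This is only a sketch under stated assumptions: `ToyAgent`, `ToyContainerizer`, `simulate`, and the `notify_on_destroy` switch are hypothetical names invented for illustration, not Mesos code, and the "lost notification" is a stand-in for whatever the actual root cause is. The terminal state FAILED is likewise chosen only for the example.

```python
import enum


class TaskState(enum.Enum):
    STAGING = "STAGING"
    FAILED = "FAILED"  # terminal state picked for illustration only


class ToyAgent:
    """Hypothetical stand-in for the agent; not Mesos code."""

    def __init__(self):
        self.task_state = TaskState.STAGING

    def executor_terminated(self):
        # The task only leaves STAGING once the agent is told that the
        # executor has terminated.
        self.task_state = TaskState.FAILED


class ToyContainerizer:
    """Hypothetical stand-in for the containerizer; not Mesos code."""

    def __init__(self, agent, notify_on_destroy):
        self.agent = agent
        # Assumption: models whether the termination reaches the agent.
        self.notify_on_destroy = notify_on_destroy
        self.container_alive = True

    def destroy(self):
        # The container (and the executor in it) is torn down either way.
        self.container_alive = False
        if self.notify_on_destroy:
            self.agent.executor_terminated()


def simulate(notify_on_destroy):
    """Fetch hangs, the registration timeout fires, destroy is called."""
    agent = ToyAgent()
    containerizer = ToyContainerizer(agent, notify_on_destroy)
    containerizer.destroy()
    return containerizer.container_alive, agent.task_state


if __name__ == "__main__":
    print("notification lost:     ", simulate(notify_on_destroy=False))
    print("notification delivered:", simulate(notify_on_destroy=True))
```

In the first case the container is gone but the task is still in STAGING, which matches the reported behaviour; in the second case the task reaches a terminal state, which is the outcome one would expect after the timeout.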



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
