[ https://issues.apache.org/jira/browse/MESOS-9052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16534077#comment-16534077 ]
Vinod Kone commented on MESOS-9052:
-----------------------------------

Instead of suicide, it should shut down the current task group, since one task/container failing to launch shouldn't impact other task groups.

Also, should this be applied more generically to all calls from the executor to the agent, or just to launch?

cc [~gkleiman]

> Default executor should commit suicide if it cannot receive HTTP responses for LAUNCH_NESTED_CONTAINER calls.
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: MESOS-9052
>                 URL: https://issues.apache.org/jira/browse/MESOS-9052
>             Project: Mesos
>          Issue Type: Bug
>          Components: executor
>    Affects Versions: 1.4.0, 1.5.0, 1.6.0, 1.7.0
>            Reporter: Chun-Hung Hsiao
>            Priority: Major
>
> If there is a network problem (e.g., a routing problem), it is possible that
> the agent has received {{LAUNCH_NESTED_CONTAINER}} calls from the default
> executor and launched the nested container, but the executor does not get the
> HTTP response. This would result in tasks stuck at {{TASK_STARTING}} forever.
> We should consider making the default executor commit suicide if it does not
> receive the response in a reasonable amount of time.
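
The behaviour suggested in the comment (time out a launch call, then shut down only the affected task group) could look roughly like the Python sketch below. It is an illustration only, not the default executor's code. `ExecutorSketch`, `fake_agent_call`, the `lost-` container-id prefix, and the 0.1-second timeout are all hypothetical names and values chosen for the example; the issue itself only asks for "a reasonable amount of time".

```python
import concurrent.futures
import time

# Hypothetical value; the issue only says "a reasonable amount of time".
LAUNCH_RESPONSE_TIMEOUT_SECS = 0.1


def fake_agent_call(container_id):
    """Stand-in for a blocking LAUNCH_NESTED_CONTAINER call (assumption).

    Container ids starting with "lost-" simulate a response that arrives
    far too late, as in the routing problem described in the issue.
    """
    if container_id.startswith("lost-"):
        time.sleep(0.5)
    return "200 OK"


class ExecutorSketch:
    """Toy model of an executor that tracks task groups."""

    def __init__(self, send_call, timeout=LAUNCH_RESPONSE_TIMEOUT_SECS):
        self.send_call = send_call
        self.timeout = timeout
        self.task_groups = {}  # task group id -> list of container ids
        self.shut_down = []    # task group ids shut down after a timeout
        self.pool = concurrent.futures.ThreadPoolExecutor(max_workers=8)

    def launch_task_group(self, group_id, container_ids):
        """Send one launch call per container; return True if all were answered."""
        self.task_groups[group_id] = list(container_ids)
        futures = [self.pool.submit(self.send_call, c) for c in container_ids]
        # Wait for every response, but no longer than the timeout in total.
        _, not_done = concurrent.futures.wait(futures, timeout=self.timeout)
        if not_done:
            # At least one response is missing: give up on this group only.
            self.shutdown_task_group(group_id)
            return False
        return True

    def shutdown_task_group(self, group_id):
        # A real executor would presumably also kill the group's containers
        # and send terminal status updates; this sketch only records it.
        self.task_groups.pop(group_id, None)
        self.shut_down.append(group_id)
```

With this sketch, launching a group whose calls are all answered leaves it in `task_groups`, while a group containing a `lost-` container is shut down on its own and other groups are left untouched. Whether the same timeout should wrap every executor-to-agent call, as the comment asks, is left open here.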