[
https://issues.apache.org/jira/browse/MESOS-8568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Vinod Kone reassigned MESOS-8568:
---------------------------------
Shepherd: Gilbert Song (was: Alexander Rukletsov)
Assignee: Qian Zhang
Sprint: Mesosphere Sprint 74, Mesosphere Sprint 75, Mesosphere Sprint
2018-27 (was: Mesosphere Sprint 74, Mesosphere Sprint 75)
Issue Type: Improvement (was: Task)
> Command checks should always call `WAIT_NESTED_CONTAINER` before
> `REMOVE_NESTED_CONTAINER`
> ------------------------------------------------------------------------------------------
>
> Key: MESOS-8568
> URL: https://issues.apache.org/jira/browse/MESOS-8568
> Project: Mesos
> Issue Type: Improvement
> Reporter: Andrei Budnik
> Assignee: Qian Zhang
> Priority: Blocker
> Labels: default-executor, health-check, mesosphere
>
> After a nested container is successfully launched via
> `LAUNCH_NESTED_CONTAINER_SESSION`, the checker library calls
> [waitNestedContainer|https://github.com/apache/mesos/blob/0a40243c6a35dc9dc41774d43ee3c19cdf9e54be/src/checks/checker_process.cpp#L657]
> for the container. Before launching a nested container for a subsequent
> check, the checker library
> [calls|https://github.com/apache/mesos/blob/0a40243c6a35dc9dc41774d43ee3c19cdf9e54be/src/checks/checker_process.cpp#L466-L487]
> `REMOVE_NESTED_CONTAINER` to remove the previous nested container. Hence,
> the `REMOVE_NESTED_CONTAINER` call follows `WAIT_NESTED_CONTAINER`, which
> ensures that the nested container has terminated and can be removed and
> cleaned up.
> If the launch call fails, the library [doesn't
> call|https://github.com/apache/mesos/blob/0a40243c6a35dc9dc41774d43ee3c19cdf9e54be/src/checks/checker_process.cpp#L627-L636]
> `WAIT_NESTED_CONTAINER`. Despite the failure, the container might have been
> launched, and a subsequent attempt to remove it without first calling
> `WAIT_NESTED_CONTAINER` leads to errors like:
> {code:java}
> W0202 20:03:08.895830 7 checker_process.cpp:503] Received '500 Internal
> Server Error' (Nested container has not terminated yet) while removing the
> nested container
> '2b0c542c-1f5f-42f7-b914-2c1cadb4aeca.da0a7cca-516c-4ec9-b215-b34412b670fa.check-49adc5f1-37a3-4f26-8708-e27d2d6cd125'
> used for the COMMAND check for task
> 'node-0-server__e26a82b0-fbab-46a0-a1ea-e7ac6cfa4c91
> {code}
> The checker library should always call `WAIT_NESTED_CONTAINER` before
> `REMOVE_NESTED_CONTAINER`.
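
The ordering problem described above can be illustrated with a small, self-contained sketch. This is not the Mesos checker code: `FakeAgent`, its methods, the `fail_response` flag and the returned status codes are hypothetical stand-ins that only model the behavior reported in the issue (a remove attempted before the container has terminated is rejected).

```python
class FakeAgent:
    """Hypothetical in-memory stand-in for the agent's nested container calls."""

    def __init__(self):
        # Maps container id -> "running" or "terminated".
        self.state = {}

    def launch_nested_container_session(self, container_id, fail_response=False):
        # Assumption from the issue: the container may have been launched
        # even though the caller observes a failed response.
        self.state[container_id] = "running"
        return 500 if fail_response else 200

    def wait_nested_container(self, container_id):
        # Simulates blocking until the container has terminated.
        if container_id in self.state:
            self.state[container_id] = "terminated"
        return 200

    def remove_nested_container(self, container_id):
        if self.state.get(container_id) == "running":
            # Models "Nested container has not terminated yet".
            return 500
        self.state.pop(container_id, None)
        return 200


def remove_without_wait(agent, container_id):
    # Models the failure path in the issue: remove is attempted directly.
    return agent.remove_nested_container(container_id)


def wait_then_remove(agent, container_id):
    # Proposed ordering: always wait for termination before removing.
    agent.wait_nested_container(container_id)
    return agent.remove_nested_container(container_id)


if __name__ == "__main__":
    agent = FakeAgent()
    agent.launch_nested_container_session("check-1", fail_response=True)
    print(remove_without_wait(agent, "check-1"))
    print(wait_then_remove(agent, "check-1"))
```

In this sketch the wait is issued unconditionally, whether or not the launch response indicated success, which is the behavior the issue asks for.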