[
https://issues.apache.org/jira/browse/MESOS-1871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14163768#comment-14163768
]
Alexander Rukletsov commented on MESOS-1871:
--------------------------------------------
I think what happens is that the task process escapes its process tree and is
not killed by {{PosixLauncher}}. Here is an orphaned process after launching
the first test:
{code}
[email protected]: ~ $ ps aux | grep handler
alex 5641 0.0 0.0 2432784 624 s003 S+ 6:52PM 0:00.00 grep handler
alex 5620 0.0 0.0 2447700 688 ?? S 6:52PM 0:00.00 sh -c ( handler() { echo SIGTERM; }; trap 'handler TERM' SIGTERM; echo $$; echo $(which sleep); while true; do date; sleep 1; done; exit 0 )
[email protected]: ~ $ ps -p 5620 -o ppid=
1
{code}
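The orphaning itself can be reproduced outside Mesos with a minimal sketch. It assumes a POSIX system with {{sh}} and {{sleep}} on the PATH. The function name {{orphan_demo}} is hypothetical, {{sleep 30}} stands in for the task, and the task is started as a background job so that the shell forks regardless of which {{sh}} implementation is installed. It does not use {{CommandExecutor}} or {{PosixLauncher}} and does not model how the exit status is reported. It only shows that signalling the wrapper pid alone can leave the child running.

```python
import os
import signal
import subprocess


def orphan_demo():
    """Signal only the `sh -c` wrapper and report whether its child survives."""
    # The wrapper starts a long-running child, prints the child's pid, then waits.
    wrapper = subprocess.Popen(
        ["sh", "-c", "sleep 30 & echo $!; wait"],
        stdout=subprocess.PIPE,
        text=True,
    )
    child_pid = int(wrapper.stdout.readline())

    # Deliver SIGTERM to the wrapper pid only, not to the child.
    wrapper.send_signal(signal.SIGTERM)
    wrapper.wait()

    try:
        os.kill(child_pid, 0)  # signal 0 only checks that the pid exists
        child_alive = True
    except ProcessLookupError:
        child_alive = False

    # Clean up the orphan so the demo leaves nothing behind.
    if child_alive:
        os.kill(child_pid, signal.SIGKILL)
    wrapper.stdout.close()
    return child_alive


if __name__ == "__main__":
    print("child survived wrapper:", orphan_demo())
```

While the orphan is still alive, its new parent can be inspected with {{ps -p <pid> -o ppid=}}, as in the output above.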
> Sending SIGTERM to a task command may render it orphaned
> --------------------------------------------------------
>
> Key: MESOS-1871
> URL: https://issues.apache.org/jira/browse/MESOS-1871
> Project: Mesos
> Issue Type: Bug
> Components: slave
> Reporter: Alexander Rukletsov
> Assignee: Alexander Rukletsov
>
> {{CommandExecutor}} launches tasks by wrapping them in {{sh -c}}. That means
> signals are sent to the top process, that is {{sh -c}}, and not to the task
> directly. Though {{SIGTERM}} is propagated by {{sh -c}} down the process
> tree, if the task is unresponsive to {{SIGTERM}}, {{sh -c}} terminates and
> reports success to the {{CommandExecutor}}, leaving the task detached
> from its parent process and still running. Because the {{CommandExecutor}}
> thinks the command terminated normally, its OS process exits normally and may
> not trigger the containerizer's escalation, which destroys cgroups.
> Here is the test related to the first part:
> [https://gist.github.com/rukletsov/68259dfb02421813f9e6].
> Here is the test related to the second part:
> [https://gist.github.com/rukletsov/3f19ecc7389fa51e65c0].