-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/53418/
-----------------------------------------------------------
(Updated Nov. 4, 2016, 12:44 p.m.)
Review request for Aurora, Joshua Cohen and Stephan Erb.
Changes
-------
Review feedback.
Bugs: AURORA-1808
https://issues.apache.org/jira/browse/AURORA-1808
Repository: aurora
Description
-------
# Problem
Processes can deamonize and escape the supervision of a coordinator. Using the
Docker Containerizer or the Mesos Containerizer with pid isolation means that
the processes will be come reparented to the sh process that launches the
executor. For example:
```
root@aurora:/# ps xf
PID TTY STAT TIME COMMAND
48 ? Ss 0:00 /bin/bash
86 ? R+ 0:00 _ ps xf
1 ? Ss 0:00 /bin/sh -c ${MESOS_SANDBOX=.}/thermos_executor.pex
--announcer-ensemble localhost:2181 --announcer-zookeeper-auth-config
/home/vagrant/aurora/examples/va
5 ? Sl 0:02 python2.7 /mnt/mesos/sandbox/thermos_executor.pex
--announcer-ensemble localhost:2181 --announcer-zookeeper-auth-config
/home/vagrant/aurora/examples/vag
23 ? S 0:00 _ /usr/local/bin/python2.7
/mnt/mesos/sandbox/thermos_runner.pex
--task_id=www-data-devel-hello_docker_engine-0-bde5cdc7-8685-46fd-9078-4a86bd5be152
--
29 ? Ss 0:00 _ /usr/local/bin/python2.7
/mnt/mesos/sandbox/thermos_runner.pex
--task_id=www-data-devel-hello_docker_engine-0-bde5cdc7-8685-46fd-9078-4a86bd5be15
32 ? S 0:00 | _ /bin/bash -c while true; do
echo hello world sleep 10 done
81 ? S 0:00 | _ sleep 10
31 ? Ss 0:00 _ /usr/local/bin/python2.7
/mnt/mesos/sandbox/thermos_runner.pex
--task_id=www-data-devel-hello_docker_engine-0-bde5cdc7-8685-46fd-9078-4a86bd5be15
33 ? S 0:00 _ /bin/bash -c while true; do
echo hello world sleep 10 done
82 ? S 0:00 _ sleep 10
47 ? S 0:00 python ./daemon.py
```
# Solution
Ensure processes that escape the supervision of the coordinator reparent to the
runner who can send signals to them on task tear down. We do this by using the
`PR_SET_CHILD_SUBREAPER` flag of `prctl(2)`.
After this change the process tree looks like:
```
root@aurora:/# ps xf
PID TTY STAT TIME COMMAND
66 ? Ss 0:00 /bin/bash
70 ? R+ 0:00 _ ps xf
1 ? Ss 0:00 /bin/sh -c ${MESOS_SANDBOX=.}/thermos_executor.pex
--announcer-ensemble localhost:2181 --announcer-zookeeper-auth-config
/home/vagrant/aurora/examples/va
5 ? Sl 0:02 python2.7 /mnt/mesos/sandbox/thermos_executor.pex
--announcer-ensemble localhost:2181 --announcer-zookeeper-auth-config
/home/vagrant/aurora/examples/vag
23 ? S 0:00 _ /usr/local/bin/python2.7
/mnt/mesos/sandbox/thermos_runner.pex
--task_id=www-data-devel-hello_docker_engine-0-721406db-00f5-4c0c-915e-1dbc5568b849
--
33 ? Ss 0:00 _ /usr/local/bin/python2.7
/mnt/mesos/sandbox/thermos_runner.pex
--task_id=www-data-devel-hello_docker_engine-0-721406db-00f5-4c0c-915e-1dbc5568b84
40 ? S 0:00 | _ /bin/bash -c while true; do
echo hello world sleep 10 done
63 ? S 0:00 | _ sleep 10
36 ? Ss 0:00 _ /usr/local/bin/python2.7
/mnt/mesos/sandbox/thermos_runner.pex
--task_id=www-data-devel-hello_docker_engine-0-721406db-00f5-4c0c-915e-1dbc5568b84
37 ? S 0:00 | _ /bin/bash -c while true; do
echo hello world sleep 10 done
62 ? S 0:00 | _ sleep 10
55 ? S 0:00 _ python ./daemon.py
```
Now the runner is aware of the reparented procesess can can tear it down
cleanly with a `SIGTERM`.
Diffs (updated)
-----
RELEASE-NOTES.md d89ef2f641373ac229be693a21a2c0111e1f241a
src/main/python/apache/thermos/common/process_util.py
abd2c0ef35858d13971319b0a7436ce2293824ce
src/main/python/apache/thermos/core/helper.py
68855e1e54ba1cd4456e18a36fb237ce6a468c34
src/main/python/apache/thermos/core/process.py
3ec43e2719ef97026f399c4b2aa23002559b3153
src/main/python/apache/thermos/core/runner.py
7b9013d11f6ff4172b6b7bf56e62299b0d11c977
src/test/sh/org/apache/aurora/e2e/test_daemonizing_process.aurora
PRE-CREATION
src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh
67702d2c0f2e18ee10dcb798b6d421050bd7d4ca
Diff: https://reviews.apache.org/r/53418/diff/
Testing
-------
src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh
Thanks,
Zameer Manji