-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/53403/#review154630
-----------------------------------------------------------

Master (13d4861) is red with this patch.
  ./build-support/jenkins/build.sh

virtualenv-15.0.2/virtualenv_support/
virtualenv-15.0.2/virtualenv_support/__init__.py
virtualenv-15.0.2/virtualenv_support/argparse-1.4.0-py2.py3-none-any.whl
virtualenv-15.0.2/virtualenv_support/pip-8.1.2-py2.py3-none-any.whl
virtualenv-15.0.2/virtualenv_support/setuptools-21.2.1-py2.py3-none-any.whl
virtualenv-15.0.2/virtualenv_support/wheel-0.29.0-py2.py3-none-any.whl
+ touch virtualenv-15.0.2/BOOTSTRAPPED
+ popd
/home/jenkins/jenkins-slave/workspace/AuroraBot
+ exec /usr/bin/python2.7 /home/jenkins/jenkins-slave/workspace/AuroraBot/build-support/virtualenv-15.0.2/virtualenv.py --no-download /home/jenkins/jenkins-slave/workspace/AuroraBot/build-support/python/isort.venv
New python executable in /home/jenkins/jenkins-slave/workspace/AuroraBot/build-support/python/isort.venv/bin/python2.7
Also creating executable in /home/jenkins/jenkins-slave/workspace/AuroraBot/build-support/python/isort.venv/bin/python
Installing setuptools, pip, wheel...done.
Collecting isort==4.0.0
/home/jenkins/jenkins-slave/workspace/AuroraBot/build-support/python/isort.venv/local/lib/python2.7/site-packages/pip/_vendor/requests/packages/urllib3/util/ssl_.py:318: SNIMissingWarning: An HTTPS request has been made, but the SNI (Subject Name Indication) extension to TLS is not available on this platform. This may cause the server to present an incorrect TLS certificate, which can cause validation failures. You can upgrade to a newer version of Python to solve this. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#snimissingwarning.
  SNIMissingWarning
/home/jenkins/jenkins-slave/workspace/AuroraBot/build-support/python/isort.venv/local/lib/python2.7/site-packages/pip/_vendor/requests/packages/urllib3/util/ssl_.py:122: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. You can upgrade to a newer version of Python to solve this. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
  InsecurePlatformWarning
  Downloading isort-4.0.0-py2.py3-none-any.whl
Installing collected packages: isort
Successfully installed isort-4.0.0
/home/jenkins/jenkins-slave/workspace/AuroraBot/build-support/python/isort.venv/local/lib/python2.7/site-packages/pip/_vendor/requests/packages/urllib3/util/ssl_.py:122: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. You can upgrade to a newer version of Python to solve this. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
  InsecurePlatformWarning
You are using pip version 8.1.2, however version 9.0.0 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
ERROR: /home/jenkins/jenkins-slave/workspace/AuroraBot/src/main/python/apache/thermos/core/process.py Imports are incorrectly sorted.
--- /home/jenkins/jenkins-slave/workspace/AuroraBot/src/main/python/apache/thermos/core/process.py:before 2016-11-02 20:31:33.690210
+++ /home/jenkins/jenkins-slave/workspace/AuroraBot/src/main/python/apache/thermos/core/process.py:after 2016-11-02 20:38:16.762782
@@ -39,10 +39,7 @@
 from twitter.common.quantity import Amount, Data, Time
 from twitter.common.recordio import ThriftRecordReader, ThriftRecordWriter
 
-from apache.thermos.common.process_util import (
-    setup_child_subreaping,
-    wrap_with_mesos_containerizer
-)
+from apache.thermos.common.process_util import setup_child_subreaping, wrap_with_mesos_containerizer
 from gen.apache.aurora.api.constants import TASK_FILESYSTEM_MOUNT_POINT
 from gen.apache.thermos.ttypes import ProcessState, ProcessStatus, RunnerCkpt
 


I will refresh this build result if you post a review containing "@ReviewBot retry"

- Aurora ReviewBot


On Nov. 2, 2016, 8:32 p.m., Zameer Manji wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/53403/
> -----------------------------------------------------------
>
> (Updated Nov. 2, 2016, 8:32 p.m.)
>
>
> Review request for Aurora, Joshua Cohen, Santhosh Kumar Shanmugham, and
> Stephan Erb.
>
>
> Bugs: AURORA-1808
>     https://issues.apache.org/jira/browse/AURORA-1808
>
>
> Repository: aurora
>
>
> Description
> -------
>
> This is a WIP patch showing a possible fix to AURORA-1808.
>
> # Problem
>
> Processes can daemonize and escape the supervision of a coordinator. Using
> the Docker Containerizer or the Mesos Containerizer with pid isolation means
> that the processes will become reparented to the `sh` process that launches
> the executor. For example:
> ````
> root@aurora:/# ps xf
>   PID TTY      STAT   TIME COMMAND
>    48 ?        Ss     0:00 /bin/bash
>    86 ?        R+     0:00  \_ ps xf
>     1 ?        Ss     0:00 /bin/sh -c ${MESOS_SANDBOX=.}/thermos_executor.pex
> --announcer-ensemble localhost:2181 --announcer-zookeeper-auth-config
> /home/vagrant/aurora/examples/va
>     5 ?        Sl     0:02 python2.7 /mnt/mesos/sandbox/thermos_executor.pex
> --announcer-ensemble localhost:2181 --announcer-zookeeper-auth-config
> /home/vagrant/aurora/examples/vag
>    23 ?        S      0:00  \_ /usr/local/bin/python2.7
> /mnt/mesos/sandbox/thermos_runner.pex
> --task_id=www-data-devel-hello_docker_engine-0-bde5cdc7-8685-46fd-9078-4a86bd5be152
> --
>    29 ?        Ss     0:00      \_ /usr/local/bin/python2.7
> /mnt/mesos/sandbox/thermos_runner.pex
> --task_id=www-data-devel-hello_docker_engine-0-bde5cdc7-8685-46fd-9078-4a86bd5be15
>    32 ?        S      0:00      |   \_ /bin/bash -c while true; do
> echo hello world sleep 10 done
>    81 ?        S      0:00      |       \_ sleep 10
>    31 ?        Ss     0:00      \_ /usr/local/bin/python2.7
> /mnt/mesos/sandbox/thermos_runner.pex
> --task_id=www-data-devel-hello_docker_engine-0-bde5cdc7-8685-46fd-9078-4a86bd5be15
>    33 ?        S      0:00          \_ /bin/bash -c while true; do
> echo hello world sleep 10 done
>    82 ?        S      0:00              \_ sleep 10
>    47 ?        S      0:00 python ./daemon.py
> ````
>
> # Solution
> Ensure processes that escape the supervision of the coordinator reparent to
> the runner, which can send signals to them on task teardown. We do this by
> using the `PR_SET_CHILD_SUBREAPER` flag of `prctl(2)`.
>
> After this change the process tree looks like:
> ````
> root@aurora:/# ps xf
>   PID TTY      STAT   TIME COMMAND
>    66 ?        Ss     0:00 /bin/bash
>    70 ?        R+     0:00  \_ ps xf
>     1 ?        Ss     0:00 /bin/sh -c ${MESOS_SANDBOX=.}/thermos_executor.pex
> --announcer-ensemble localhost:2181 --announcer-zookeeper-auth-config
> /home/vagrant/aurora/examples/va
>     5 ?        Sl     0:02 python2.7 /mnt/mesos/sandbox/thermos_executor.pex
> --announcer-ensemble localhost:2181 --announcer-zookeeper-auth-config
> /home/vagrant/aurora/examples/vag
>    23 ?        S      0:00  \_ /usr/local/bin/python2.7
> /mnt/mesos/sandbox/thermos_runner.pex
> --task_id=www-data-devel-hello_docker_engine-0-721406db-00f5-4c0c-915e-1dbc5568b849
> --
>    33 ?        Ss     0:00      \_ /usr/local/bin/python2.7
> /mnt/mesos/sandbox/thermos_runner.pex
> --task_id=www-data-devel-hello_docker_engine-0-721406db-00f5-4c0c-915e-1dbc5568b84
>    40 ?        S      0:00      |   \_ /bin/bash -c while true; do
> echo hello world sleep 10 done
>    63 ?        S      0:00      |       \_ sleep 10
>    36 ?        Ss     0:00      \_ /usr/local/bin/python2.7
> /mnt/mesos/sandbox/thermos_runner.pex
> --task_id=www-data-devel-hello_docker_engine-0-721406db-00f5-4c0c-915e-1dbc5568b84
>    37 ?        S      0:00      |   \_ /bin/bash -c while true; do
> echo hello world sleep 10 done
>    62 ?        S      0:00      |       \_ sleep 10
>    55 ?        S      0:00      \_ python ./daemon.py
> ````
>
> Now the runner is aware of the reparented processes and can tear them down
> cleanly during teardown.
>
> Note that the man page for `prctl(2)` says that the processes that set
> `PR_SET_CHILD_SUBREAPER` should reap children to get rid of zombies. It is
> important to note that the runner already does this in its run loop via
> `TaskRunnerHelper.reap_children()`. This patch has the side effect of
> ensuring it will reap all of the children launched via coordinators.
>
>
> Diffs
> -----
>
>   src/main/python/apache/thermos/common/process_util.py
> abd2c0ef35858d13971319b0a7436ce2293824ce
>   src/main/python/apache/thermos/core/helper.py
> 68855e1e54ba1cd4456e18a36fb237ce6a468c34
>   src/main/python/apache/thermos/core/process.py
> 3ec43e2719ef97026f399c4b2aa23002559b3153
>   src/main/python/apache/thermos/core/runner.py
> 7b9013d11f6ff4172b6b7bf56e62299b0d11c977
>
> Diff: https://reviews.apache.org/r/53403/diff/
>
>
> Testing
> -------
>
> No automated tests yet.
>
> Validated behaviour with `ps` and `strace`.
>
>
> Thanks,
>
> Zameer Manji
>
>
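
For readers unfamiliar with the mechanism in the quoted description, here is a minimal sketch of how a Python process might mark itself as a child subreaper and then reap orphans that get reparented to it. This is not the code from the patch: `become_subreaper`, `reap_children` and `_libc` are hypothetical names chosen for illustration, the sketch assumes Linux 3.4 or later, and the constant value is taken from `<linux/prctl.h>`.

```python
import ctypes
import os

# Option value from <linux/prctl.h>; the kernel supports it since Linux 3.4.
PR_SET_CHILD_SUBREAPER = 36


def _libc():
    """Load the C library already linked into this process (hypothetical helper)."""
    return ctypes.CDLL(None, use_errno=True)


def become_subreaper():
    """Ask the kernel to reparent orphaned descendants to this process.

    Returns True on success and False if prctl is unavailable or the call fails.
    """
    try:
        prctl = _libc().prctl
    except (AttributeError, OSError, TypeError):
        return False  # not Linux, or libc could not be loaded
    zero = ctypes.c_ulong(0)
    return prctl(PR_SET_CHILD_SUBREAPER, ctypes.c_ulong(1), zero, zero, zero) == 0


def reap_children():
    """Collect every child that has already exited, without blocking.

    Returns the list of reaped PIDs so the caller can log or track them.
    """
    reaped = []
    while True:
        try:
            pid, _status = os.waitpid(-1, os.WNOHANG)
        except OSError:
            break  # typically ECHILD: this process has no children
        if pid == 0:
            break  # children exist, but none has exited yet
        reaped.append(pid)
    return reaped
```

Calling `become_subreaper()` once at start-up and `reap_children()` on each pass of a run loop would mirror the arrangement the description outlines; how the patch wires this into the runner is not shown here.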