Karsten created MESOS-8762:
------------------------------

             Summary: Framework Teardown Leaves Task in Uninterruptible Sleep State D
                 Key: MESOS-8762
                 URL: https://issues.apache.org/jira/browse/MESOS-8762
             Project: Mesos
          Issue Type: Bug
            Reporter: Karsten

Marathon has a test suite that starts a simple Python HTTP server in a task group (a pod, in Marathon terms). After the test run we call {{/teardown}} and wait for the Marathon framework to be completed (see [MesosTest|https://github.com/mesosphere/marathon/blob/master/src/test/scala/mesosphere/marathon/integration/setup/MesosTest.scala#L311]). Our CI checks whether any tasks have leaked after all test runs. It turns out that some do: the process below is left in uninterruptible sleep (state D) and cannot be killed, even with {{kill -9}}:

{code}
Will kill:
root 18084 0.0 0.0 45380 13612 ? D 07:52 0:00 python src/app_mock.py 35477 resident-pod-16322-fail 2018-04-06T07:52:16.924Z http://www.example.com
Running 'sudo kill -9 18084
Wait for processes being killed...
...
Couldn't kill some leaked processes:
root 18084 0.0 0.0 45380 13612 ? D 07:52 0:00 python src/app_mock.py 35477 resident-pod-16322-fail 2018-04-06T07:52:16.924Z http://www.example.com
ammonite.$file.ci.utils$StageException: Stage Compile and Test failed.
{code}
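
For anyone trying to reproduce this, the leaked process can be picked out of {{ps aux}} output by its STAT column. The snippet below is only a rough sketch and is not the check our CI actually runs; the function name {{find_uninterruptible}} is made up for illustration, and it assumes the standard 11-column {{ps aux}} layout. A process in state D does not act on signals, including SIGKILL, until it leaves that state, which would explain why {{kill -9}} has no visible effect above.

```python
def find_uninterruptible(ps_output):
    """Return (pid, command) for each process whose STAT starts with 'D'.

    Hypothetical helper for illustration; assumes `ps aux` columns:
    USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
    """
    stuck = []
    for line in ps_output.splitlines():
        # Split into at most 11 fields so the command keeps its spaces.
        fields = line.split(None, 10)
        if len(fields) < 11 or not fields[1].isdigit():
            continue  # skip the header row and malformed lines
        if fields[7].startswith("D"):
            stuck.append((int(fields[1]), fields[10]))
    return stuck


# Process line taken from the CI log in this report.
sample = (
    "USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND\n"
    "root 18084 0.0 0.0 45380 13612 ? D 07:52 0:00 python src/app_mock.py "
    "35477 resident-pod-16322-fail 2018-04-06T07:52:16.924Z http://www.example.com"
)

for pid, command in find_uninterruptible(sample):
    print(pid, command)
```

Matching on the first character of STAT means that variants with extra flags, such as {{D+}} or {{Ds}}, are reported as well.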