[ https://issues.apache.org/jira/browse/MESOS-6777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15858096#comment-15858096 ]
Benjamin Bannier commented on MESOS-6777:
-----------------------------------------
This same test hangs deterministically for me on OS X when running the test
suite in parallel: the tests finish and the {{\[PASSED\]}} notification for the
test suite is printed, but the run then hangs while joining the test execution
worker threads.
I can make the test suite finish by killing an orphaned {{sleep 1000}} process.
I believe this happens because the test launches an agent, which launches a
task that runs {{sleep 1000}}; when the agent is then terminated, it does not
terminate the running task. We still seem to have to wait for that task.
If there is a difference between the BSDs and, e.g., Linux, I believe it might
be related to how forked child processes are tracked when the parent goes away; see
https://github.com/apache/mesos/blob/948466912dce570f81778ffd9d166f43bf98a206/src/tests/containerizer/io_switchboard_tests.cpp#L1006.
We should make sure that this test doesn't leak orphaned processes even on,
e.g., Linux.
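
To illustrate the general mechanism I suspect here (not the Mesos code itself),
below is a minimal Python sketch. It assumes a POSIX system with {{sh}} and
{{sleep}} available. A shell stands in for the agent and starts {{sleep 1000}}
as its child; killing only the shell leaves the {{sleep}} running as an orphan,
while signalling the whole process group cleans it up. The helper names
({{spawn_parent_with_child}}, {{kill_process_group}}, {{pid_alive}}) are
hypothetical and only exist for this example.

```python
import os
import signal
import subprocess
from typing import Tuple


def pid_alive(pid: int) -> bool:
    """Return True if a process with this PID still exists."""
    try:
        os.kill(pid, 0)  # signal 0 performs the existence check only
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # exists, but belongs to another user
    return True


def spawn_parent_with_child() -> Tuple[subprocess.Popen, int]:
    """Start a stand-in 'agent' shell that launches `sleep 1000`.

    The shell prints the PID of the background `sleep` and then waits.
    """
    parent = subprocess.Popen(
        ["sh", "-c", "sleep 1000 & echo $!; wait"],
        stdout=subprocess.PIPE,
        text=True,
        start_new_session=True,  # shell leads a fresh process group
    )
    child_pid = int(parent.stdout.readline())
    return parent, child_pid


def kill_process_group(parent: subprocess.Popen) -> None:
    """Kill every process in the group led by `parent`, then reap `parent`."""
    try:
        # With start_new_session=True the group ID equals the shell's PID.
        os.killpg(parent.pid, signal.SIGKILL)
    except ProcessLookupError:
        pass  # no process is left in the group
    parent.wait()


if __name__ == "__main__":
    parent, child_pid = spawn_parent_with_child()
    parent.kill()  # terminate only the stand-in agent
    parent.wait()
    print("sleep still running after parent died:", pid_alive(child_pid))
    kill_process_group(parent)  # clean up the orphan via its process group
    parent.stdout.close()
```

Whether putting the task into its own process group (or session) is the right
fix for this test is an open question; the sketch only shows why terminating
the parent alone is not enough to get rid of the {{sleep 1000}}.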
> IOSwitchboardTest.RecoverThenKillSwitchboardContainerDestroyed hangs on
> FreeBSD
> -------------------------------------------------------------------------------
>
> Key: MESOS-6777
> URL: https://issues.apache.org/jira/browse/MESOS-6777
> Project: Mesos
> Issue Type: Bug
> Reporter: David Forsythe
> Assignee: David Forsythe
>