Benjamin Mahler <[email protected]> writes:

> That test is broken on master currently, the ticket is here:
> MESOS-487 <https://issues.apache.org/jira/browse/MESOS-487>

And the fix for the broken test is in: https://reviews.apache.org/r/13034/

Kevin,

The result of your first run of the tests is currently expected until that test is fixed.

With respect to subsequent runs, I have seen that before simply from mounting and unmounting cgroupfs. There are weird races in play and weird checks going on, and the unit tests exercise the kernel bugs quite well.

You can look at /proc/cgroups and /proc/<pid>/cgroup to get some idea of what is going on. For myself, when I did not wind up with unkillable or orphaned processes, I only had to wait a while, possibly coupled with echo 3 > /proc/sys/vm/drop_caches, and then it was possible to mount cgroup filesystems again.

I intend to look into these kernel bugs soonish, but they aren't exactly deterministic.

mesos-slave in a running configuration, as opposed to a test configuration, leaves cgroupfs mounted, so you are not likely to hit these kernel problems if you actually start running Mesos.

Do be careful about running with a fixed balloon test, though. With an unfixed kernel and a system with swap enabled, it creates effectively unkillable processes for me.

If you are curious, you can find out more about how the tests are failing by running them with MESOS_VERBOSE=1 make check.

Eric
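
P.S. In case it helps, here is a rough sketch of the /proc/cgroups check mentioned above. It assumes the usual four-column layout (subsys_name, hierarchy, num_cgroups, enabled) and treats a nonzero hierarchy ID as "still attached to a hierarchy". The function name attached_subsystems is made up for illustration and is not part of Mesos.

```shell
# Hypothetical helper (not part of Mesos): list cgroup subsystems that are
# still attached to a hierarchy, according to a /proc/cgroups-style file.
# Assumed columns: subsys_name  hierarchy  num_cgroups  enabled
# Usage: attached_subsystems [FILE]   (FILE defaults to /proc/cgroups)
attached_subsystems() {
    # Skip the '#' header line; keep rows whose hierarchy ID is nonzero.
    awk '!/^#/ && $2 != 0 { print $1, "hierarchy=" $2, "num_cgroups=" $3 }' \
        "${1:-/proc/cgroups}"
}
```

Running attached_subsystems with no argument after the tests finish should show whether any subsystem is still held by a hierarchy; if a subsystem lingers there, that may be why a fresh mount is refused. For a single process, cat /proc/<pid>/cgroup shows which cgroup it currently sits in.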
