abudnik commented on pull request #388: URL: https://github.com/apache/mesos/pull/388#issuecomment-850982231
IIRC, Mesos leaves a task in the KILLING state if `cgroup::destroy()` fails or hangs. The agent can't force the kernel to deallocate the resources held by the cgroup, so it keeps the task in that state to prevent resource leaks. Otherwise, on a buggy kernel, the agent would treat resources the kernel never released as free and would eventually allocate all of the host's resources. I remember a few JIRA tickets describing similar issues. It might make sense to add links to them, since these problems were investigated some time ago.
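To make the reasoning concrete, here is a toy Python model of the accounting described in this comment. It is not Mesos code: `Agent`, `Task`, `TaskState`, and the `destroy_cgroup` callable are hypothetical names used only for illustration, and the real agent logic is considerably more involved. The sketch only shows why keeping the task in KILLING keeps its resources reserved when the cgroup teardown does not succeed.

```python
from dataclasses import dataclass
from enum import Enum, auto


class TaskState(Enum):
    RUNNING = auto()
    KILLING = auto()
    KILLED = auto()


@dataclass
class Task:
    cpus: int
    state: TaskState = TaskState.RUNNING


class Agent:
    """Toy agent that tracks how many CPUs it believes are free."""

    def __init__(self, total_cpus: int):
        self.free_cpus = total_cpus

    def launch(self, cpus: int) -> Task:
        if cpus > self.free_cpus:
            raise RuntimeError("not enough free CPUs")
        self.free_cpus -= cpus
        return Task(cpus=cpus)

    def kill(self, task: Task, destroy_cgroup) -> None:
        # destroy_cgroup is a hypothetical callable standing in for the
        # real cgroup teardown; it returns True only if the kernel
        # actually released the cgroup.
        task.state = TaskState.KILLING
        if destroy_cgroup():
            task.state = TaskState.KILLED
            self.free_cpus += task.cpus
        # On failure the task stays in KILLING and its CPUs stay reserved,
        # so the agent never hands out resources the kernel still holds.


agent = Agent(total_cpus=4)
task = agent.launch(cpus=2)
agent.kill(task, destroy_cgroup=lambda: False)  # simulate a failed teardown
print(task.state, agent.free_cpus)
```

If the agent instead marked the task as terminated on a failed teardown and returned its CPUs to the free pool, repeated failures would let it hand out more resources than the host can actually provide.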
