Ian Downes created MESOS-2421:
---------------------------------

             Summary: Processes can be stuck in D state and block destroy
                 Key: MESOS-2421
                 URL: https://issues.apache.org/jira/browse/MESOS-2421
             Project: Mesos
          Issue Type: Bug
          Components: isolation
    Affects Versions: 0.21.1
         Environment: CentOS, 3.10 kernel
            Reporter: Ian Downes


We've observed processes getting stuck in D state (uninterruptible sleep) when 
using the cpu isolator. This prevents the MesosContainerizer launcher from 
killing all container processes and blocks destroying the container. 

It appears to be a kernel scheduler bug: the processes can be unstuck by 
modifying cpu.cfs_quota_us for the container's cpu cgroup. Doing so seems to 
let the processes run, receive the kill signal, and exit.
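
For illustration, here is a minimal sketch of the manual step described above. 
The issue does not say which value to write, so this sketch assumes that 
temporarily writing a different quota (-1, i.e. no limit, by default) and then 
restoring the original is enough. The function name and its parameters are 
hypothetical and not part of Mesos.

```python
import os


def nudge_cfs_quota(cgroup_dir, temporary_quota_us=-1):
    """Rewrite cpu.cfs_quota_us with a temporary value, then restore it.

    cgroup_dir is the container's directory in the cpu cgroup hierarchy.
    Returns the original quota in microseconds (-1 means no limit).
    """
    path = os.path.join(cgroup_dir, "cpu.cfs_quota_us")

    with open(path) as f:
        original = int(f.read().strip())

    # Assumption: any modification of the quota is enough to get the stuck
    # processes scheduled again. If the cgroup is already unlimited, this
    # just rewrites -1, and it is unknown whether that has any effect.
    for value in (temporary_quota_us, original):
        with open(path, "w") as f:
            f.write(f"{value}\n")

    return original
```

On a live host this needs root and would be called with something like 
nudge_cfs_quota("/sys/fs/cgroup/cpu/mesos/<container-id>"), where the path is 
an assumed layout that depends on how the slave's cgroups are configured.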

We should implement this workaround in the launcher destroy path when processes 
are observed to be in D state.
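
As a rough sketch of how "observed to be in D state" could be checked, the 
following reads the state field from /proc/<pid>/stat for a list of pids. It 
is written in Python for brevity and is not the launcher's actual code; the 
function names and the proc_root parameter are hypothetical.

```python
import os


def parse_state(stat_line):
    """Return the one-letter state field from a /proc/<pid>/stat line."""
    # The command name is parenthesised and may itself contain spaces or
    # ')', so split on the last ')' before reading the state field.
    return stat_line.rsplit(")", 1)[1].split()[0]


def pids_in_d_state(pids, proc_root="/proc"):
    """Return the pids that are currently in uninterruptible sleep (D)."""
    stuck = []
    for pid in pids:
        try:
            with open(os.path.join(proc_root, str(pid), "stat")) as f:
                state = parse_state(f.read())
        except (FileNotFoundError, ProcessLookupError):
            continue  # The process exited while we were looking.
        if state == "D":
            stuck.append(pid)
    return stuck
```

A destroy path could call pids_in_d_state() on the container's pids after 
sending the kill signal and apply the workaround only if the result is 
non-empty. A process can be in D state briefly during normal I/O, so a real 
implementation would probably want to check more than once before acting.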



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
