Gilbert Song created MESOS-9191:
-----------------------------------

             Summary: Docker command executor may get stuck in an infinite 
unkillable loop.
                 Key: MESOS-9191
                 URL: https://issues.apache.org/jira/browse/MESOS-9191
             Project: Mesos
          Issue Type: Bug
          Components: containerization, docker
            Reporter: Gilbert Song


The change in https://issues.apache.org/jira/browse/MESOS-8574 altered how the 
docker command executor discards the future of docker stop. If killTask() is 
invoked while an earlier docker stop is still pending, the executor now 
discards the pending future and then issues a new docker stop. This is fine in 
most cases.

However, docker stop can take a long time, depending on the grace period and on 
whether the application handles SIGTERM. If the framework retries killTask 
more frequently than the grace period (which depends on the killpolicy API, an 
env var, or agent flags), the executor may be stuck forever with an unkillable 
task, because each incoming killTask discards the future of docker stop before 
the docker stop finishes.
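
The starvation described above can be illustrated with a small simulation. 
This is only a sketch under simplifying assumptions, not the executor's actual 
code: time is simulated, docker stop is assumed to need the full grace period 
to finish, and the function name and its parameters are hypothetical.

```python
def first_completed_stop(grace_period, retry_interval, horizon):
    """Return the simulated time at which a docker stop first completes,
    or None if none completes before `horizon`.

    Assumptions (not taken from the Mesos source): killTask arrives at
    t = 0, retry_interval, 2 * retry_interval, ...; each docker stop needs
    the full grace_period to finish; every killTask that finds a pending
    stop discards it and starts a new one.
    """
    pending_since = None  # start time of the currently pending docker stop
    now = 0.0
    while now <= horizon:
        # A pending stop whose grace period has elapsed has completed.
        if pending_since is not None and pending_since + grace_period <= now:
            return pending_since + grace_period
        # Otherwise the incoming killTask discards the pending stop (if any)
        # and starts a fresh one, resetting the clock.
        pending_since = now
        now += retry_interval
    return None


# Retries every 10s against a 30s grace period: the stop never completes.
print(first_completed_stop(grace_period=30, retry_interval=10, horizon=3600))
# Retries every 60s against a 30s grace period: the first stop completes.
print(first_completed_stop(grace_period=30, retry_interval=60, horizon=3600))
```

In this model, whenever the retry interval is shorter than the grace period, 
every pending stop is discarded before it can finish, which matches the 
unkillable task described in this report.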

We should consider re-using the grace period before calling discard() on a 
pending docker stop future.
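
One possible reading of this suggestion is that a killTask retry should leave 
a pending docker stop alone while that stop is still within its grace period. 
The simulation below sketches that policy under the same simplifying 
assumptions as the report's scenario (simulated time, a stop that needs the 
full grace period); the function name and parameters are hypothetical and this 
is not a proposed patch.

```python
def first_completed_stop_with_guard(grace_period, retry_interval, horizon):
    """Return the simulated time at which a docker stop first completes,
    or None if none completes before `horizon`.

    Assumed policy (an interpretation, not the Mesos implementation):
    a killTask that arrives while a docker stop is pending and still
    within its grace period does not discard it.
    """
    pending_since = None  # start time of the currently pending docker stop
    now = 0.0
    while now <= horizon:
        if pending_since is None:
            # First killTask: start docker stop.
            pending_since = now
        else:
            deadline = pending_since + grace_period
            if deadline <= now:
                # The pending stop had its full grace period and finished.
                return deadline
            # Still within the grace period: ignore this retry instead of
            # discarding the pending stop.
        now += retry_interval
    return None


# Retries every 10s against a 30s grace period now complete at t = 30.
print(first_completed_stop_with_guard(30, 10, 3600))
```

With the guard in place, the completion time no longer depends on how often 
the framework retries, only on the grace period of the first docker stop.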



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
