Gastón Kleiman created MESOS-8666:
-------------------------------------

             Summary: Windows agent fails KILL_NESTED_CONTAINER but kills 
container anyway
                 Key: MESOS-8666
                 URL: https://issues.apache.org/jira/browse/MESOS-8666
             Project: Mesos
          Issue Type: Bug
          Components: agent
            Reporter: Gastón Kleiman
            Assignee: Andrew Schwartzmeyer


While working on [https://reviews.apache.org/r/65962/], I observed that on 
Windows the {{KILL_NESTED_CONTAINER}} call to kill a sleep task fails, and 
both the agent and the default executor log the failure, but the agent actually 
manages to kill the container.

I think this is a bug: the agent should not respond to the kill call with an 
error if it is able to kill the container.
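
A minimal sketch of the expected behaviour, under stated assumptions: the kill 
path reports an error only if the process is still alive after the termination 
attempt. {{KillResult}}, {{killContainer}}, {{terminate}} and {{hasExited}} are 
hypothetical names for illustration, not taken from the Mesos code base, and 
this says nothing about the actual cause of the failure.

```cpp
#include <functional>
#include <string>

// Hypothetical result type for this sketch; not a Mesos type.
struct KillResult {
  bool ok;            // true if the container process is gone
  std::string error;  // empty on success
};

// Hypothetical helper. `terminate` attempts to kill the process and returns
// false if the underlying call failed. `hasExited` reports whether the
// process is no longer running.
KillResult killContainer(
    const std::function<bool()>& terminate,
    const std::function<bool()>& hasExited)
{
  if (terminate()) {
    return {true, ""};
  }

  // The termination call failed. If the process has exited anyway, the
  // caller's goal is met, so report success instead of an error.
  if (hasExited()) {
    return {true, ""};
  }

  return {false, "Unable to terminate container process"};
}
```

With this shape, a failed termination call on a process that has already 
exited would yield a successful kill response, while a process that is still 
running would yield an error.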

Agent logs:
{noformat}
I0308 22:58:20.424288  9556 http.cpp:2984] Processing KILL_NESTED_CONTAINER 
call for container 
'43431305-2580-451c-ac32-2b5f873ed49b.242f0f68-f332-4032-9eda-621506591530'
E0308 22:58:20.426290  1788 kill.hpp:41] os::kill_process(): Failed call to 
TerminateProcess
I0308 22:58:20.523291  9556 containerizer.cpp:2791] Container 
43431305-2580-451c-ac32-2b5f873ed49b.242f0f68-f332-4032-9eda-621506591530 has 
exited
{noformat}

Default executor logs:
{noformat}
I0308 22:58:20.408291  3336 default_executor.cpp:1103] Killing task 
63acc00f-df83-4e4e-9a46-5d0847067639 running in child container 
43431305-2580-451c-ac32-2b5f873ed49b.242f0f68-f332-4032-9eda-621506591530 with 
SIGTERM signal
I0308 22:58:20.408291  3336 default_executor.cpp:1125] Scheduling escalation to 
SIGKILL in 3secs from now
W0308 22:58:20.430287  7280 default_executor.cpp:1261] Failed to kill the task 
'63acc00f-df83-4e4e-9a46-5d0847067639' running in child container 
43431305-2580-451c-ac32-2b5f873ed49b.242f0f68-f332-4032-9eda-621506591530:The 
agent failed to kill the container 
43431305-2580-451c-ac32-2b5f873ed49b.242f0f68-f332-4032-9eda-621506591530: 
Unable to send signal to container: No error
I0308 22:58:20.622292  8284 default_executor.cpp:935] Child container 
43431305-2580-451c-ac32-2b5f873ed49b.242f0f68-f332-4032-9eda-621506591530 of 
task '63acc00f-df83-4e4e-9a46-5d0847067639' completed in state TASK_FAILED: 
Command exited with status 1
{noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
