Stephan Erb created MESOS-9174:
----------------------------------

             Summary: Unexpected containers transition from RUNNING to DESTROYING during recovery
                 Key: MESOS-9174
                 URL: https://issues.apache.org/jira/browse/MESOS-9174
             Project: Mesos
          Issue Type: Bug
          Components: containerization
    Affects Versions: 1.6.1, 1.5.0
            Reporter: Stephan Erb
         Attachments: mesos-agent.log, mesos-executor-stderr.log

I am trying to hunt down a weird issue where sometimes restarting a Mesos agent 
takes down all Mesos containers. The containers die without an apparent cause:

{code}
I0821 13:35:01.486346 61392 linux_launcher.cpp:360] Recovered container 02da7be0-271e-449f-9554-dc776adb29a9
I0821 13:35:03.627367 61362 provisioner.cpp:451] Recovered container 02da7be0-271e-449f-9554-dc776adb29a9
I0821 13:35:03.701448 61375 containerizer.cpp:2835] Container 02da7be0-271e-449f-9554-dc776adb29a9 has exited
I0821 13:35:03.701453 61375 containerizer.cpp:2382] Destroying container 02da7be0-271e-449f-9554-dc776adb29a9 in RUNNING state
I0821 13:35:03.701457 61375 containerizer.cpp:2996] Transitioning the state of container 02da7be0-271e-449f-9554-dc776adb29a9 from RUNNING to DESTROYING
{code}
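
To see how many containers in the full agent log follow the same pattern as the excerpt above, a small script along these lines might help. It is only a sketch: the regular expressions are derived from the message formats visible in the excerpt, the function names `summarize` and `destroyed_after_recovery` are hypothetical, and it assumes each log record sits on a single line in the actual log file.

```python
import re
from collections import defaultdict

# Log prefix as seen in the excerpt: level+MMDD, time, thread id, file:line]
PREFIX = r"^[IWEF](\d{4}) (\d{2}:\d{2}:\d{2}\.\d{6})\s+\d+ (\S+?):\d+\] "
# Container IDs in the excerpt are UUID-shaped.
ID = r"([0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})"

# Only the message formats that appear in the excerpt are recognized.
PATTERNS = {
    "recovered": re.compile(PREFIX + r"Recovered container " + ID),
    "exited": re.compile(PREFIX + r"Container " + ID + r" has exited"),
    "destroying": re.compile(
        PREFIX + r"Destroying container " + ID + r" in (\w+) state"
    ),
}


def summarize(lines):
    """Map container ID -> list of (time, source file, event) in log order."""
    events = defaultdict(list)
    for line in lines:
        for name, pattern in PATTERNS.items():
            m = pattern.match(line)
            if m:
                # Group 2 is the time, group 3 the source file, group 4 the ID.
                events[m.group(4)].append((m.group(2), m.group(3), name))
                break
    return dict(events)


def destroyed_after_recovery(events):
    """Return sorted container IDs that were recovered and later destroyed."""
    result = []
    for cid, evs in events.items():
        names = [e[2] for e in evs]
        if "recovered" in names and "destroying" in names:
            if names.index("recovered") < names.index("destroying"):
                result.append(cid)
    return sorted(result)
```

It could be run against the attached agent log, for example with `destroyed_after_recovery(summarize(open("mesos-agent.log")))`, to list every container that was destroyed after being recovered.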

From the perspective of the executor, there is nothing relevant in the logs. 
Everything just stops directly as if the container gets terminated externally 
without notifying the executor first. For further details, please see the 
attached agent log and one (example) executor log file.

I am aware that this is a long shot, but does anyone have an idea what I should 
be looking at to narrow down the issue?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
