Kevin Klues created MESOS-5799:
----------------------------------

             Summary: docker::inspect() may get wrong output when a docker 
container is not in "running" state
                 Key: MESOS-5799
                 URL: https://issues.apache.org/jira/browse/MESOS-5799
             Project: Mesos
          Issue Type: Bug
          Components: containerization, docker
            Reporter: Kevin Klues
             Fix For: 1.0.0


I (klueska) am copying the text from an email I got about a bug report from 
Yubo Li at IBM.

docker::inspect() may return the wrong output when the Docker container is not 
in the "running" state. In that case, parsing the "docker inspect" data fails, 
and the task cannot enter the TASK_RUNNING state.

I have attached the related stderr logs, in which I printed the docker inspect 
output. The output shows that the container is in the "created" state, not 
"running", so many of the inspect fields are invalid.

Possible fix: check the "State->Running" field, and return success only when 
"State->Running" is true.
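
The proposed check can be sketched as follows. This is only an illustration in 
Python, not the Mesos C++ implementation: the function names, the retry count, 
and the delay are hypothetical, and `run_inspect` stands in for whatever 
actually invokes "docker inspect" and returns its raw output.

```python
import json
import time
from typing import Callable, Optional


def running_container(inspect_output: str) -> Optional[dict]:
    """Return the inspected container if State.Running is true, else None."""
    # `docker inspect` prints a JSON array with one object per container.
    containers = json.loads(inspect_output)
    if not containers:
        return None
    container = containers[0]
    # Guard against a missing or null "State" object.
    state = container.get("State") or {}
    if state.get("Running") is True:
        return container
    return None


def inspect_until_running(
    run_inspect: Callable[[], str],  # hypothetical hook returning raw inspect output
    attempts: int = 10,              # assumed retry budget, not a Mesos setting
    delay_secs: float = 0.5,         # assumed delay between retries
) -> Optional[dict]:
    """Retry the inspect call until the container reports Running == true."""
    for attempt in range(attempts):
        container = running_container(run_inspect())
        if container is not None:
            return container
        # Sleep between attempts, but not after the last one.
        if attempt + 1 < attempts:
            time.sleep(delay_secs)
    return None
```

With a check like this, output captured while the container is still in the 
"created" state (as in the log below) would be treated as "not ready yet" 
rather than parsed for fields such as the PID or IP address.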

{noformat}
I0706 09:01:05.342895  2975 docker.cpp:780] Running docker -H 
unix:///var/run/docker.sock run --cpu-shares 512 --memory 536870912 -e 
MARATHON_APP_VERSION=2016-07-06T08:15:02.610Z -e HOST=9.186.57.67 -e 
MARATHON_APP_RESOURCE_CPUS=0.5 -e MARATHON_APP_RESOURCE_GPUS=1 -e 
MARATHON_APP_DOCKER_IMAGE=cuda_test_v0.1 -e PORT_10000=31435 -e 
MESOS_TASK_ID=ubuntu-gpu-32520.29f083bf-4358-11e6-b886-2ee1446b5607 -e 
PORT=31435 -e MARATHON_APP_RESOURCE_MEM=512.0 -e PORTS=31435 -e 
MARATHON_APP_RESOURCE_DISK=0.0 -e MARATHON_APP_LABELS= -e 
MARATHON_APP_ID=/ubuntu-gpu-32520 -e PORT0=31435 -e 
MESOS_SANDBOX=/mnt/mesos/sandbox -e 
MESOS_CONTAINER_NAME=mesos-1875c0d3-9712-43c3-9d58-572c89fac50b-S1.cfe287a0-8a37-4a0f-8ffb-55eb0e6e4439
 -v 
/var/run/mesos/slaves/1875c0d3-9712-43c3-9d58-572c89fac50b-S1/frameworks/aee07017-f8e6-4ed5-8008-b4ea3a090282-0000/executors/ubuntu-gpu-32520.29f083bf-4358-11e6-b886-2ee1446b5607/runs/cfe287a0-8a37-4a0f-8ffb-55eb0e6e4439:/mnt/mesos/sandbox
 --net host --device=/dev/nvidiactl:/dev/nvidiactl:rwm 
--device=/dev/nvidia-uvm:/dev/nvidia-uvm:rwm 
--device=/dev/nvidia0:/dev/nvidia0:rwm --entrypoint /bin/sh --name 
mesos-1875c0d3-9712-43c3-9d58-572c89fac50b-S1.cfe287a0-8a37-4a0f-8ffb-55eb0e6e4439
 cuda_test_v0.1 -c nvidia-smi && sleep 60s
I0706 09:01:05.345935  2975 docker.cpp:943] Running docker -H 
unix:///var/run/docker.sock inspect 
mesos-1875c0d3-9712-43c3-9d58-572c89fac50b-S1.cfe287a0-8a37-4a0f-8ffb-55eb0e6e4439
I0706 09:01:05.548992  2976 docker.cpp:249] Docker inspect: [
{
    "Id": "5a4dc17e739b60593c04abf310f2485dddea832476e83007387b612839933f5a",
    "Created": "2016-07-06T09:01:05.531216924Z",
    "Path": "/bin/sh",
    "Args": [
        "-c",
        "nvidia-smi \u0026\u0026 sleep 60s"
    ],
    "State": {
        "Status": "created",
        "Running": false,
        "Paused": false,
        "Restarting": false,
        "OOMKilled": false,
        "Dead": false,
        "Pid": 0,
        "ExitCode": 0,
        "Error": "",
        "StartedAt": "0001-01-01T00:00:00Z",
        "FinishedAt": "0001-01-01T00:00:00Z"
    },
    "Image": "8cf6c8da7045ec24b1e561906dfa54ab0276753ec617e139a7b2da3ef72d245e",
    "ResolvConfPath": "",
    "HostnamePath": "",
    "HostsPath": "",
    "LogPath": "",
    "Name": 
"/mesos-1875c0d3-9712-43c3-9d58-572c89fac50b-S1.cfe287a0-8a37-4a0f-8ffb-55eb0e6e4439",
    "RestartCount": 0,
    "Driver": "aufs",
    "ExecDriver": "native-0.2",
    "MountLabel": "",
    "ProcessLabel": "",
    "AppArmorProfile": "",
    "ExecIDs": null,
    "HostConfig": {
        "Binds": null,
        "ContainerIDFile": "",
        "LxcConf": null,
        "Memory": 0,
        "MemoryReservation": 0,
        "MemorySwap": 0,
        "KernelMemory": 0,
        "CpuShares": 0,
        "CpuPeriod": 0,
        "CpusetCpus": "",
        "CpusetMems": "",
        "CpuQuota": 0,
        "BlkioWeight": 0,
        "OomKillDisable": false,
        "MemorySwappiness": null,
        "Privileged": false,
        "PortBindings": null,
        "Links": null,
        "PublishAllPorts": false,
        "Dns": null,
        "DnsOptions": null,
        "DnsSearch": null,
        "ExtraHosts": null,
        "VolumesFrom": null,
        "Devices": null,
        "NetworkMode": "",
        "IpcMode": "",
        "PidMode": "",
        "UTSMode": "",
        "CapAdd": null,
        "CapDrop": null,
        "GroupAdd": null,
        "RestartPolicy": {
            "Name": "",
            "MaximumRetryCount": 0
        },
        "SecurityOpt": null,
        "ReadonlyRootfs": false,
        "Ulimits": null,
        "LogConfig": {
            "Type": "json-file",
            "Config": {}
        },
        "CgroupParent": "",
        "ConsoleSize": [
            0,
            0
        ],
        "VolumeDriver": ""
    },
    "GraphDriver": {
        "Name": "aufs",
        "Data": null
    },
    "Mounts": [],
    "Config": {
        "Hostname": "5a4dc17e739b",
        "Domainname": "",
        "User": "",
        "AttachStdin": false,
        "AttachStdout": true,
        "AttachStderr": true,
        "Tty": false,
        "OpenStdin": false,
        "StdinOnce": false,
        "Env": [
            "MARATHON_APP_VERSION=2016-07-06T08:15:02.610Z",
            "HOST=9.186.57.67",
            "MARATHON_APP_RESOURCE_CPUS=0.5",
            "MARATHON_APP_RESOURCE_GPUS=1",
            "MARATHON_APP_DOCKER_IMAGE=cuda_test_v0.1",
            "PORT_10000=31435",
            
"MESOS_TASK_ID=ubuntu-gpu-32520.29f083bf-4358-11e6-b886-2ee1446b5607",
            "PORT=31435",
            "MARATHON_APP_RESOURCE_MEM=512.0",
            "PORTS=31435",
            "MARATHON_APP_RESOURCE_DISK=0.0",
            "MARATHON_APP_LABELS=",
            "MARATHON_APP_ID=/ubuntu-gpu-32520",
            "PORT0=31435",
            "MESOS_SANDBOX=/mnt/mesos/sandbox",
            
"MESOS_CONTAINER_NAME=mesos-1875c0d3-9712-43c3-9d58-572c89fac50b-S1.cfe287a0-8a37-4a0f-8ffb-55eb0e6e4439",
            "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
        ],
        "Cmd": [
            "-c",
            "nvidia-smi \u0026\u0026 sleep 60s"
        ],
        "Image": "cuda_test_v0.1",
        "Volumes": null,
        "WorkingDir": "",
        "Entrypoint": [
            "/bin/sh"
        ],
        "OnBuild": null,
        "Labels": {},
        "StopSignal": "SIGTERM"
    },
    "NetworkSettings": {
        "Bridge": "",
        "SandboxID": "",
        "HairpinMode": false,
        "LinkLocalIPv6Address": "",
        "LinkLocalIPv6PrefixLen": 0,
        "Ports": null,
        "SandboxKey": "",
        "SecondaryIPAddresses": null,
        "SecondaryIPv6Addresses": null,
        "EndpointID": "",
        "Gateway": "",
        "GlobalIPv6Address": "",
        "GlobalIPv6PrefixLen": 0,
        "IPAddress": "",
        "IPPrefixLen": 0,
        "IPv6Gateway": "",
        "MacAddress": "",
        "Networks": null
    }
}
]
I0706 09:01:05.549659  2976 docker.cpp:335] Unable to detect IP Address at 
'NetworkSettings.Networks..IPAddress', attempting deprecated field
WARNING: Your kernel does not support swap limit capabilities, memory limited 
without swap.
I0706 09:01:52.983609  2973 exec.cpp:486] Agent exited, but framework has 
checkpointing enabled. Waiting 15mins to reconnect with agent 
1875c0d3-9712-43c3-9d58-572c89fac50b-S1
I0706 09:02:06.057607  2978 exec.cpp:549] Executor sending status update 
TASK_FINISHED (UUID: 2cff35f2-9512-4120-b912-74a82c197696) for task 
ubuntu-gpu-32520.29f083bf-4358-11e6-b886-2ee1446b5607 of framework 
aee07017-f8e6-4ed5-8008-b4ea3a090282-0000
I0706 09:02:06.058717  2980 poll_socket.cpp:131] Socket error while connecting
I0706 09:02:06.058815  2980 process.cpp:1799] Failed to send 
'mesos.internal.StatusUpdateMessage' to '127.0.1.1:5051', connect: Socket error 
while connecting
E0706 09:02:06.058931  2980 process.cpp:2104] Failed to shutdown socket with fd 
6: Transport endpoint is not connected
{noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
