Jay Buffington created MESOS-2200:
-------------------------------------

             Summary: bogus docker images result in bad error message to scheduler
                 Key: MESOS-2200
                 URL: https://issues.apache.org/jira/browse/MESOS-2200
             Project: Mesos
          Issue Type: Bug
          Components: containerization
            Reporter: Jay Buffington


When a scheduler specifies a bogus image in ContainerInfo, Mesos doesn't tell 
the scheduler that the docker pull failed or why.

This error is logged in the mesos-slave log, but it isn't given to the 
scheduler (as far as I can tell):

{noformat}
E1218 23:50:55.406230  8123 slave.cpp:2730] Container 
'8f70784c-3e40-4072-9ca2-9daed23f15ff' for executor 
'thermos-1418946354013-xxx-xxx-curl-0-f500cc41-dd0a-4338-8cbc-d631cb588bb1' of 
framework '20140522-213145-1749004561-5050-29512-0000' failed to start: Failed 
to 'docker pull docker-registry.example.com/doesntexist/hello1.1:latest': exit 
status = exited with status 1 stderr = 2014/12/18 23:50:55 Error: image 
doesntexist/hello1.1 not found
{noformat}

If the docker image is not in the registry, the scheduler should give the user 
an error message.  If docker pull fails because of networking issues, the pull 
should be retried.  Mesos should give the scheduler enough information to 
tell these two cases apart and make that decision.
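
To illustrate the decision a scheduler could make if the failure text reached 
it, here is a rough sketch in Python. It is not Mesos code: the function name 
classify_pull_failure and the pattern list are hypothetical, the only pattern 
used is the "image ... not found" wording from the log above, and treating 
every other failure as retryable is an assumption made for the example.

```python
import re

# Hypothetical list of patterns that indicate a permanent failure.
# The single entry mirrors the stderr text in the slave log above.
PERMANENT_PATTERNS = [re.compile(r"image \S+ not found")]


def classify_pull_failure(message: str) -> str:
    """Classify a docker pull failure message.

    Returns "permanent" when the message looks like a missing image
    (report it to the user), otherwise "retryable" (assumed transient,
    for example a networking problem).
    """
    for pattern in PERMANENT_PATTERNS:
        if pattern.search(message):
            return "permanent"
    return "retryable"


if __name__ == "__main__":
    failure = (
        "Failed to 'docker pull "
        "docker-registry.example.com/doesntexist/hello1.1:latest': "
        "exit status = exited with status 1 stderr = 2014/12/18 23:50:55 "
        "Error: image doesntexist/hello1.1 not found"
    )
    print(classify_pull_failure(failure))
```

Matching on stderr text is fragile; a structured failure reason from Mesos 
would be a better basis for this decision, which is really what this issue 
is asking for.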



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
