[ 
https://issues.apache.org/jira/browse/MESOS-2684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14524149#comment-14524149
 ] 

Steven Schlansker commented on MESOS-2684:
------------------------------------------

Sorry if I wasn't clear. I mentioned that these files contain nothing but 
application output.  For completeness:

stderr:

I0421 21:05:14.850749 13546 exec.cpp:132] Version: 0.21.1
I0421 21:05:14.862670 13559 exec.cpp:206] Executor registered on slave 20150327-194449-419644938-5050-1649-S71

stdout:

Registered executor on 10.70.8.160
Starting task pp-request-bookings-teamcity.2015.04.02T15.58.28-1429650229399-2-10.70.8.160-us_west_2b
Forked command at 13575
/bin/sh -c exit `docker wait mesos-8d3b46d5-99d6-4994-a7e4-df66aa34ae89` 
2015-04-21T21:05:15.954Z, LOGGER ERROR, log client is undefined!, {"@timestamp":"2015-04-21T21:05:15.954Z","servicetype":"requestbookings","logname":"LOGGER ERROR","formatversion":"v1","type":"requestbookings-LOGGER ERROR-v1","host":"10.70.8.160","sequencenumber":1,"message":"log client is undefined!"}
2015-04-21T21:05:15.953Z, salesforceConnection, , {"@timestamp":"2015-04-21T21:05:15.953Z","servicetype":"requestbookings","logname":"salesforceConnection","formatversion":"v1","type":"requestbookings-salesforceConnection-v1","host":"10.70.8.160","sequencenumber":1,"durationMs":226}
Connection to redis closed. It will reopen when logs will need sending.
Connection to redis closed. It will reopen when logs will need sending.

Yeah, those errors are not great (those pesky end users...), but I don't see 
any executor output all the same.

> mesos-slave should not abort when a single task has e.g. a 'mkdir' failure
> --------------------------------------------------------------------------
>
>                 Key: MESOS-2684
>                 URL: https://issues.apache.org/jira/browse/MESOS-2684
>             Project: Mesos
>          Issue Type: Bug
>          Components: docker, slave
>    Affects Versions: 0.21.1
>            Reporter: Steven Schlansker
>         Attachments: mesos-slave-restart.txt
>
>
> mesos-slave can encounter a variety of problems while attempting to launch a 
> task.  If the task fails, that is unfortunate, but not the end of the world.  
> Other tasks should not be affected.
> However, if the task failure happens to trigger an assertion, the entire 
> slave comes crashing down:
> F0501 19:10:46.095464  1705 paths.hpp:342] CHECK_SOME(mkdir): No space left on device Failed to create executor directory '/mnt/mesos/slaves/20150327-194449-419644938-5050-1649-S71/frameworks/Singularity/executors/pp-gc-eventlog-teamcity.2015.03.31T23.55.14-1430507446029-2-10.70.8.160-us_west_2b/runs/95a54aeb-322c-48e9-9f6f-5b359bccbc01'
> Immediately afterwards, all tasks on this slave were declared TASK_KILLED 
> when mesos-slave restarted.
> Something as simple as a 'mkdir' failing is not worthy of an assertion 
> failure.
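
To illustrate the behaviour the issue description asks for, here is a minimal 
sketch of treating a failed 'mkdir' as a failure of one task rather than of 
the whole process.  It is written in Python for brevity and is not Mesos 
code: LaunchResult, create_executor_directory and launch_task are 
hypothetical names, and the directory layout is only loosely modelled on the 
path in the log line above.

```python
import os
import tempfile
from dataclasses import dataclass
from typing import Optional


@dataclass
class LaunchResult:
    """Hypothetical per-task outcome; not a Mesos type."""
    task_id: str
    state: str                      # "TASK_RUNNING" or "TASK_FAILED"
    reason: Optional[str] = None


def create_executor_directory(path: str) -> Optional[str]:
    """Return None on success, or an error message instead of aborting."""
    try:
        os.makedirs(path, exist_ok=True)
        return None
    except OSError as e:
        # e.g. "No space left on device" when the disk is full.
        return f"Failed to create executor directory '{path}': {e.strerror}"


def launch_task(task_id: str, work_dir: str) -> LaunchResult:
    """Fail only this task if its directory cannot be created."""
    # Assumed layout for illustration only.
    path = os.path.join(work_dir, "executors", task_id, "runs", "latest")
    error = create_executor_directory(path)
    if error is not None:
        return LaunchResult(task_id, "TASK_FAILED", error)
    return LaunchResult(task_id, "TASK_RUNNING")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as work_dir:
        print(launch_task("example-task", work_dir))
```

The point of the sketch is only that the error is returned to the caller, 
which can mark the one affected task as failed while every other task on the 
same host keeps running.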



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
