It looks like we could have a better error message here. @Jay, mind filing a JIRA ticket with a description, the status update, and your fix attached? Thanks!
On Fri, Aug 21, 2015 at 7:36 PM, Jay Taylor <j...@jaytaylor.com> wrote:

> Eventually I was able to isolate what was going on; in this case
> FrameworkInfo.User was set to an invalid value, and setting it to "root"
> did the trick.
>
> My scheduler is now working [in a basic form]!!!
>
> Cheers,
> Jay
>
> On Thu, Aug 20, 2015 at 4:15 PM, Jay Taylor <j...@jaytaylor.com> wrote:
>
>> Hey Tim,
>>
>> Thank you for the quick response!
>>
>> I just checked the sandbox logs, and they are all empty (stdout and
>> stderr are both 0 bytes).
>>
>> I have discovered a bit more information from the StatusUpdate event
>> posted back to my scheduler:
>>
>> &TaskStatus{
>>   TaskId: &TaskID{
>>     Value: *fluxCapacitor-test-1,
>>     XXX_unrecognized: [],
>>   },
>>   State: *TASK_FAILED,
>>   Message: *Abnormal executor termination,
>>   Source: *SOURCE_SLAVE,
>>   Reason: *REASON_COMMAND_EXECUTOR_FAILED,
>>   Data: nil,
>>   SlaveId: &SlaveID{
>>     Value: *20150804-211459-1407297728-5050-5855-S1,
>>     XXX_unrecognized: [],
>>   },
>>   ExecutorId: nil,
>>   Timestamp: *1.440112075509318e+09,
>>   Uuid: *[102 75 82 85 38 139 68 94 153 189 210 87 218 235 147 166],
>>   Healthy: nil,
>>   XXX_unrecognized: [],
>> }
>>
>> How can I find out why the command executor is failing?
>>
>> On Thu, Aug 20, 2015 at 4:08 PM, Tim Chen <t...@mesosphere.io> wrote:
>>
>>> It received a TASK_FAILED from the executor, so you'll need to look at
>>> your task's stdout and stderr files in the sandbox to see what went
>>> wrong.
>>>
>>> These files should be reachable from the Mesos UI.
>>>
>>> Tim
>>>
>>> On Thu, Aug 20, 2015 at 4:01 PM, Jay Taylor <outtat...@gmail.com> wrote:
>>>
>>>> Hey everyone,
>>>>
>>>> I am writing a scheduler for Mesos, and one of my first goals is to
>>>> get a simple Docker container to run.
>>>>
>>>> The tasks get marked as failed, with the failure messages originating
>>>> from the slave logs, and I'm not sure how to determine exactly what is
>>>> causing the failure.
>>>>
>>>> The most informative log messages I've found were in the slave log:
>>>>
>>>> ==> /var/log/mesos/mesos-slave.INFO <==
>>>> W0820 20:44:25.242230 29639 docker.cpp:994] Ignoring updating unknown
>>>> container: e190037a-b011-4681-9e10-dcbacf6cb819
>>>> I0820 20:44:25.242270 29639 status_update_manager.cpp:322] Received
>>>> status update TASK_FAILED (UUID: 17a21cf7-17d1-42dd-92eb-b281396ebf60)
>>>> for task jay-test-29 of framework
>>>> 20150804-211741-1608624320-5050-18273-0060
>>>> I0820 20:44:25.242377 29639 slave.cpp:2961] Forwarding the update
>>>> TASK_FAILED (UUID: 17a21cf7-17d1-42dd-92eb-b281396ebf60) for task
>>>> jay-test-29 of framework 20150804-211741-1608624320-5050-18273-0060 to
>>>> master@63.198.215.105:5050
>>>> I0820 20:44:25.247926 29636 status_update_manager.cpp:394] Received
>>>> status update acknowledgement (UUID:
>>>> 17a21cf7-17d1-42dd-92eb-b281396ebf60) for task jay-test-29 of framework
>>>> 20150804-211741-1608624320-5050-18273-0060
>>>> I0820 20:44:25.248108 29636 slave.cpp:3502] Cleaning up executor
>>>> 'jay-test-29' of framework 20150804-211741-1608624320-5050-18273-0060
>>>> I0820 20:44:25.248342 29636 slave.cpp:3591] Cleaning up framework
>>>> 20150804-211741-1608624320-5050-18273-0060
>>>>
>>>> And this doesn't really tell me much about *why* it failed.
>>>>
>>>> Is there somewhere else I should be looking, or is there an option
>>>> that needs to be turned on to show more information?
>>>>
>>>> Your assistance is greatly appreciated!
>>>>
>>>> Jay
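Two follow-up notes for anyone who finds this thread later. First, the Reason,
Source, and Message fields in the TaskStatus dump above are delivered straight
to the scheduler's StatusUpdate callback, and logging them is the quickest way
to surface failures like REASON_COMMAND_EXECUTOR_FAILED without digging through
slave logs. A minimal sketch against the mesos-go v0 bindings (the exact
Scheduler interface may vary slightly by version, and myScheduler is a
placeholder type, not Jay's actual code; the remaining Scheduler callbacks,
omitted here, need no-op implementations for this to compile):

    package main

    import (
        "log"

        mesos "github.com/mesos/mesos-go/mesosproto"
        sched "github.com/mesos/mesos-go/scheduler"
    )

    // myScheduler is a placeholder standing in for your scheduler type;
    // only the StatusUpdate callback is shown here.
    type myScheduler struct{}

    // StatusUpdate is invoked by the driver for every task status update
    // forwarded by the master.
    func (s *myScheduler) StatusUpdate(driver sched.SchedulerDriver, status *mesos.TaskStatus) {
        log.Printf("task %s: state=%s reason=%s source=%s message=%q",
            status.GetTaskId().GetValue(),
            status.GetState(),   // e.g. TASK_FAILED
            status.GetReason(),  // e.g. REASON_COMMAND_EXECUTOR_FAILED
            status.GetSource(),  // e.g. SOURCE_SLAVE
            status.GetMessage()) // e.g. "Abnormal executor termination"
    }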
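Second, the actual fix: FrameworkInfo.User must name an account that exists on
every slave, because the command executor is launched as that user. An unknown
user kills the executor before it writes anything, which would explain the
0-byte stdout/stderr files Jay saw. Roughly, under the same mesos-go v0
assumptions (proto.String comes from github.com/gogo/protobuf/proto, and the
framework name is hypothetical):

    func main() {
        framework := &mesos.FrameworkInfo{
            // Must be a user that exists on every slave; an invalid value
            // here reproduces the "Abnormal executor termination" failure
            // described above.
            User: proto.String("root"),
            Name: proto.String("fluxCapacitor"), // hypothetical framework name
        }

        driver, err := sched.NewMesosSchedulerDriver(sched.DriverConfig{
            Scheduler: &myScheduler{}, // the placeholder type sketched above
            Framework: framework,
            Master:    "63.198.215.105:5050", // master address from the logs
        })
        if err != nil {
            log.Fatalf("failed to create scheduler driver: %v", err)
        }
        if _, err := driver.Run(); err != nil {
            log.Fatalf("scheduler driver stopped: %v", err)
        }
    }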