On Thu, May 16, 2013 at 12:28 PM, David Greenberg <[email protected]> wrote:

> Thanks! I didn't realize that the stdout/stderr was being discarded. Turns
> out it was a permission issue (running as the wrong user).

Great to hear you were able to figure out the issue.



> Would you accept a patch to have the mesos-daemon.sh direct stdout/stderr
> to the log_dir of the mesos_slave? (by reading the mesos.conf?)

Sure. We always accept patches :)
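
For reference, the idea of redirecting stdout/stderr to the configured log_dir could look roughly like the sketch below. This is not the actual mesos-daemon.sh. The helper names `conf_value` and `run_logged` are made up for illustration, and it assumes the config file holds one `key=value` pair per line with `#` for comments.

```shell
#!/bin/sh
# Hypothetical helper (not part of Mesos): print the value of a key from a
# key=value config file. Assumes one "key=value" per line; commented-out
# lines do not match. If the key appears more than once, the last one wins.
conf_value() {
  # $1 = config file, $2 = key
  sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*//p" "$1" | tail -n 1
}

# Hypothetical wrapper: run a command with stdout and stderr appended to
# <log_dir>/<name>.log. If log_dir is not set in the config, the command
# runs with its output left untouched.
run_logged() {
  # $1 = config file, $2 = log file base name, remaining args = command
  conf=$1
  name=$2
  shift 2
  log_dir=$(conf_value "$conf" log_dir)
  if [ -n "$log_dir" ]; then
    mkdir -p "$log_dir" || return 1
    "$@" >> "$log_dir/$name.log" 2>&1
  else
    "$@"
  fi
}
```

With that in place, something like `run_logged <path to mesos.conf> mesos-slave mesos-slave --<required flags>` would append both streams to `<log_dir>/mesos-slave.log`, the same effect as the manual redirection quoted further down.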


> btw, I will be finally starting to release patches on review board this
> week or next, after working it out w/ my employer.

Sick. Can't wait!



>
> On Thu, May 16, 2013 at 2:43 PM, Vinod Kone <[email protected]> wrote:
>
> > You could pipe the stderr (and stdout too) into a log file.
> >
> > For example:
> >
> > mesos-slave --<required flags> >> <log_dir>/mesos-slave.log 2>&1
> >
> >
> > On Thu, May 16, 2013 at 11:34 AM, David Greenberg <[email protected]> wrote:
> >
> > > How do I capture the stderr of the slave? I set the work_dir and log_dir in
> > > the configuration.
> > >
> > > I know that /bin/sh exists on the hosts that I'm trying to run the command
> > > on.
> > >
> > >
> > > On Thu, May 16, 2013 at 2:25 PM, Vinod Kone <[email protected]> wrote:
> > >
> > > > From the log, it looks like the command executor was Aborted!
> > > >
> > > > I0515 20:38:49.283588 60936 slave.cpp:1065] Executor 'Task ct:foo:1368650328332:1 (touch /u/dgr...)' of framework chronos has terminated with signal Aborted
> > > >
> > > > Not sure why though. Command executor is just a shell wrapper. Also, looks
> > > > like you are not capturing the stderr of slave? This might also provide
> > > > some useful info, regarding whether the executor died during fork or after
> > > > exec (though the lack of stderr/stdout in executor sandbox suggests the
> > > > former).
> > > >
> > > >
> > > >
> > > > On Thu, May 16, 2013 at 10:20 AM, David Greenberg <[email protected]> wrote:
> > > >
> > > > > When I try to run a job with chronos, if I provide no executor, the task
> > > > > fails with the "failed" state. When I check the executor logs
> > > > > (stdout/stderr) in Mesos, there's nothing there. I'm just trying simple
> > > > > things, like "echo hello world". I've included a short log of the
> > > > > TASK_FAILED message below. Do you have any ideas on where I could look for
> > > > > debugging? I'm using mesos 0.12.
> > > > >
> > > > > I0515 20:38:49.093731 60967 slave.cpp:487] Got assigned task ct:foo:1368650328332:1 for framework chronos
> > > > > I0515 20:38:49.095643 60967 paths.hpp:235] Created executor directory '/data/scratch/local/mesos/work/slaves/201305152037-3261142444-5050-57326-0/frameworks/chronos/executors/Task ct:foo:1368650328332:1 (touch /u/dgr...)/runs/45f66450-3e3f-4a41-af40-4cb33aa33960'
> > > > > I0515 20:38:49.095824 60948 process_based_isolation_module.cpp:108] Launching Task ct:foo:1368650328332:1 (touch /u/dgr...) (/net/hsdgrnbrg.aoa.twosigma.com/userhome/dgrnbrg/black-mesos/chronos-chosen-mesos/install/libexec/mesos/mesos-executor) in /data/scratch/local/mesos/work/slaves/201305152037-3261142444-5050-57326-0/frameworks/chronos/executors/Task ct:foo:1368650328332:1 (touch /u/dgr...)/runs/45f66450-3e3f-4a41-af40-4cb33aa33960 with resources ' for framework chronos
> > > > > I0515 20:38:49.096753 60948 process_based_isolation_module.cpp:153] Forked executor at 61215
> > > > > I0515 20:38:49.097357 60956 slave.cpp:361] Successfully attached file '/data/scratch/local/mesos/work/slaves/201305152037-3261142444-5050-57326-0/frameworks/chronos/executors/Task ct:foo:1368650328332:1 (touch /u/dgr...)/runs/45f66450-3e3f-4a41-af40-4cb33aa33960'
> > > > > I0515 20:38:49.283064 60963 process_based_isolation_module.cpp:344] Telling slave of lost executor Task ct:foo:1368650328332:1 (touch /u/dgr...) of framework chronos
> > > > > I0515 20:38:49.283588 60936 slave.cpp:1065] Executor 'Task ct:foo:1368650328332:1 (touch /u/dgr...)' of framework chronos has terminated with signal Aborted
> > > > > I0515 20:38:49.284939 60936 slave.cpp:842] Status update: task ct:foo:1368650328332:1 of framework chronos is now in state TASK_FAILED
> > > > > I0515 20:38:49.285099 60936 gc.cpp:97] Scheduling /data/scratch/local/mesos/work/slaves/201305152037-3261142444-5050-57326-0/frameworks/chronos/executors/Task ct:foo:1368650328332:1 (touch /u/dgr...)/runs/45f66450-3e3f-4a41-af40-4cb33aa33960 for removal
> > > > > I0515 20:38:49.283115 60963 process_utils.hpp:64] Stopping ... 61215
> > > > > Sent signal to 61215
> > > > > I0515 20:38:49.432054 60936 slave.cpp:739] Got acknowledgement of status update for task ct:foo:1368650328332:1 of framework chronos
> > > > >
> > > >
> > >
> >
>
