Looks like that slave was unavailable for a while, so the master removed its slaveID as 'shutdown'. If you restart the slave, it should reset and register with a new slaveID. But if you want to be extra sure, wipe the contents of `/var/run/mesos/meta` and then restart the slave.
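If you want to confirm this from the slave log before wiping anything, here is a rough Python sketch that looks for the master-initiated shutdown line. It only assumes the glog prefix described in the log header (`[IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg`) and the message wording seen in the log quoted below. The names `GLOG_LINE`, `SHUTDOWN` and `find_shutdown_requests` are made up for illustration and are not part of Mesos, and the sample lines are taken from that log with the mail client's line wrap joined back together.

```python
import re

# glog prefix as described in the log header:
#   [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
GLOG_LINE = re.compile(
    r"^(?P<level>[IWEF])(?P<date>\d{4}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+"
    r"(?P<thread>\d+) (?P<source>[^\]]+)\] (?P<msg>.*)$"
)

# Message wording copied from the quoted slave log (an assumption for
# other Mesos versions, which may phrase it differently).
SHUTDOWN = re.compile(
    r"Slave asked to shut down by (?P<master>\S+) because '(?P<reason>[^']*)'"
)


def find_shutdown_requests(lines):
    """Yield (time, master, reason) for each master-initiated shutdown."""
    for line in lines:
        parsed = GLOG_LINE.match(line.rstrip("\n"))
        if parsed is None:
            continue  # not a glog-formatted line (header, wrapped text, ...)
        hit = SHUTDOWN.search(parsed.group("msg"))
        if hit:
            yield parsed.group("time"), hit.group("master"), hit.group("reason")


# Two lines from the quoted log, unwrapped.
SAMPLE = [
    "I0107 12:07:44.223629 26721 slave.cpp:3466] Finished recovery",
    "I0107 12:07:44.991296 26721 slave.cpp:526] Slave asked to shut down by "
    "master@mymaster:5050 because 'Slave attempted to re-register after removal'",
]

if __name__ == "__main__":
    for when, master, reason in find_shutdown_requests(SAMPLE):
        print(when, master, reason)
```

To run it against a real log, pass an open file object instead of `SAMPLE`; the slave log typically lives at /var/log/mesos/mesos-slave.INFO, but that path depends on how the slave was started.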
On Thu, Jan 8, 2015 at 12:10 PM, Srinivas Murthy <[email protected]> wrote:
> Duh, so much for my diligence :-)
>
> On Thu, Jan 8, 2015 at 12:09 PM, Srinivas Murthy <[email protected]>
> wrote:
>
>> Running on machine:xxxx
>> Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
>> I0107 12:07:44.205533 26720 logging.cpp:172] INFO level logging started!
>> I0107 12:07:44.205888 26720 main.cpp:142] Build: 2014-12-23 10:33:15 by
>> root
>> I0107 12:07:44.205914 26720 main.cpp:144] Version: 0.21.0
>> I0107 12:07:44.206110 26720 containerizer.cpp:100] Using isolation:
>> posix/cpu,posix/mem
>> I0107 12:07:44.207408 26720 main.cpp:165] Starting Mesos slave
>> I0107 12:07:44.210654 26720 slave.cpp:169] Slave started on 1)@
>> 10.122.21.21:5051
>> I0107 12:07:44.211130 26720 slave.cpp:289] Slave resources: cpus(*):4;
>> mem(*):6816; disk(*):19825; ports(*):[31000-32000
>> ]
>> I0107 12:07:44.211328 26720 slave.cpp:318] Slave hostname: xxxx
>> I0107 12:07:44.211362 26720 slave.cpp:319] Slave checkpoint: true
>> I0107 12:07:44.218703 26728 state.cpp:33] Recovering state from
>> '/var/run/mesos/meta'
>> I0107 12:07:44.219048 26725 group.cpp:313] Group process (group(1)@
>> 10.122.21.21:5051) connected to ZooKeeper
>> I0107 12:07:44.219140 26725 group.cpp:790] Syncing group operations:
>> queue size (joins, cancels, datas) = (0, 0, 0)
>> I0107 12:07:44.219173 26725 group.cpp:385] Trying to create path '/mesos'
>> in ZooKeeper
>> I0107 12:07:44.221113 26723 status_update_manager.cpp:197] Recovering
>> status update manager
>> I0107 12:07:44.221750 26721 containerizer.cpp:281] Recovering
>> containerizer
>> I0107 12:07:44.222080 26725 detector.cpp:138] Detected a new leader:
>> (id='8')
>> I0107 12:07:44.222859 26725 group.cpp:659] Trying to get
>> '/mesos/info_0000000008' in ZooKeeper
>> I0107 12:07:44.223629 26721 slave.cpp:3466] Finished recovery
>> I0107 12:07:44.226488 26726 detector.cpp:433] A new leading master (UPID=
>> [email protected]:5050) is detected
>> I0107 12:07:44.226738 26726 slave.cpp:602] New master detected at
>> master@mymaster:5050
>> I0107 12:07:44.226922 26726 slave.cpp:627] No credentials provided.
>> Attempting to register without authentication
>> I0107 12:07:44.227015 26726 slave.cpp:638] Detecting new master
>> I0107 12:07:44.227149 26726 status_update_manager.cpp:171] Pausing
>> sending status updates
>> I0107 12:07:44.991296 26721 slave.cpp:526] Slave asked to shut down by
>> master@mymaster:5050 because 'Slave attempted
>> to re-register after removal'
>> I0107 12:07:44.991412 26721 slave.cpp:484] Slave terminating
>>
>> I have masked some IP addresses from these log entries
>>
>> On Thu, Jan 8, 2015 at 11:53 AM, Adam Bordelon <[email protected]>
>> wrote:
>>
>>> There should be a WARNING log line in the mesos slave log (typically
>>> /var/log/mesos/mesos-slave.INFO) that says "Shutting down executor ...
>>> because ..." probably right after the line that says "Got registration for
>>> executor ..."
>>> Can you post a gist of the relevant slave log lines?
>>>
>>> On Thu, Jan 8, 2015 at 11:39 AM, Srinivas Murthy <[email protected]>
>>> wrote:
>>>
>>>> Its a custom executor, I can see each of the nodes have
>>>> /tmp/mesos/...executors/..runs/../latest with stderr and stdout, along with
>>>> the jar file.
>>>> My stdout, is blank, while the stderr has "Executor asked to shutdown"
>>>> as its last line, after the URI is accessed and the resource jar is
>>>> fetched..
>>>>
>>>>
>>>> On Thu, Jan 8, 2015 at 11:29 AM, Adam Bordelon <[email protected]>
>>>> wrote:
>>>>
>>>>> Is your "adhoc framework" using the default Mesos executor, or does it
>>>>> use a custom executor?
>>>>> You can check the task/executor's sandbox from the Mesos web UI, to
>>>>> see if the custom executor or other URIs were properly downloaded, and to
>>>>> view the stdout/stderr of the executor/task.
>>>>>
>>>>> On Thu, Jan 8, 2015 at 10:14 AM, Srinivas Murthy <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> I am running a cluster with one master node and three slaves.
>>>>>> Just got hold of a tutorial code from Git that runs an adhoc
>>>>>> framework written in Java, nothing fancy.
>>>>>> All I am getting is " Executor asked to shutdown" and the code exits
>>>>>> gracefully, no exceptions. I am trying to put some logging statements in
>>>>>> all the callback functions, but looks like the Executors are invoked but
>>>>>> never run.
>>>>>> Any clues on how to debug this?
>>>>>> I am running Mesos 0.21 and JDK 1.7.55.
>>>>>>
>>>>>> Regards
>>>>>> Srinivas
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>

