It looks like that slave was unavailable for a while, so the master removed
its slaveID and marked it as 'shutdown'. That is why the log ends with
'Slave attempted to re-register after removal'.
If you restart the slave, it should reset and register with a new slaveID.
But if you want to be extra sure, stop the slave, wipe the contents of
`/var/run/mesos/meta`, and then restart it.
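
If you would rather script the wipe than do it by hand, here is a minimal
Python sketch. It assumes the meta directory is the one your log reports
("Recovering state from '/var/run/mesos/meta'") and that the slave is already
stopped. The helper name `wipe_directory_contents` is just something I made up
for illustration; it is not part of Mesos.

```python
import shutil
from pathlib import Path


def wipe_directory_contents(directory):
    """Delete everything inside `directory` but keep the directory itself.

    Returns the sorted names of the top-level entries that were removed.
    """
    root = Path(directory)
    removed = []
    # Snapshot the listing first so we never delete while iterating.
    for entry in list(root.iterdir()):
        if entry.is_dir() and not entry.is_symlink():
            # Real subdirectory: remove it and everything below it.
            shutil.rmtree(entry)
        else:
            # Regular file or symlink: remove only the entry itself.
            entry.unlink()
        removed.append(entry.name)
    return sorted(removed)
```

With the slave stopped, call `wipe_directory_contents("/var/run/mesos/meta")`
as a user with write access to that directory, then start the slave again
however your install normally does it. Keeping the directory itself and
removing only its contents leaves its ownership and permissions untouched.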

On Thu, Jan 8, 2015 at 12:10 PM, Srinivas Murthy <[email protected]>
wrote:

> Duh, so much for my diligence :-)
>
> On Thu, Jan 8, 2015 at 12:09 PM, Srinivas Murthy <[email protected]>
> wrote:
>
>> Running on machine:xxxx
>> Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
>> I0107 12:07:44.205533 26720 logging.cpp:172] INFO level logging started!
>> I0107 12:07:44.205888 26720 main.cpp:142] Build: 2014-12-23 10:33:15 by
>> root
>> I0107 12:07:44.205914 26720 main.cpp:144] Version: 0.21.0
>> I0107 12:07:44.206110 26720 containerizer.cpp:100] Using isolation:
>> posix/cpu,posix/mem
>> I0107 12:07:44.207408 26720 main.cpp:165] Starting Mesos slave
>> I0107 12:07:44.210654 26720 slave.cpp:169] Slave started on 1)@
>> 10.122.21.21:5051
>> I0107 12:07:44.211130 26720 slave.cpp:289] Slave resources: cpus(*):4;
>> mem(*):6816; disk(*):19825; ports(*):[31000-32000]
>> I0107 12:07:44.211328 26720 slave.cpp:318] Slave hostname: xxxx
>> I0107 12:07:44.211362 26720 slave.cpp:319] Slave checkpoint: true
>> I0107 12:07:44.218703 26728 state.cpp:33] Recovering state from
>> '/var/run/mesos/meta'
>> I0107 12:07:44.219048 26725 group.cpp:313] Group process (group(1)@
>> 10.122.21.21:5051) connected to ZooKeeper
>> I0107 12:07:44.219140 26725 group.cpp:790] Syncing group operations:
>> queue size (joins, cancels, datas) = (0, 0, 0)
>> I0107 12:07:44.219173 26725 group.cpp:385] Trying to create path '/mesos'
>> in ZooKeeper
>> I0107 12:07:44.221113 26723 status_update_manager.cpp:197] Recovering
>> status update manager
>> I0107 12:07:44.221750 26721 containerizer.cpp:281] Recovering
>> containerizer
>> I0107 12:07:44.222080 26725 detector.cpp:138] Detected a new leader:
>> (id='8')
>> I0107 12:07:44.222859 26725 group.cpp:659] Trying to get
>> '/mesos/info_0000000008' in ZooKeeper
>> I0107 12:07:44.223629 26721 slave.cpp:3466] Finished recovery
>> I0107 12:07:44.226488 26726 detector.cpp:433] A new leading master (UPID=
>> [email protected]:5050) is detected
>> I0107 12:07:44.226738 26726 slave.cpp:602] New master detected at
>> master@mymaster:5050
>> I0107 12:07:44.226922 26726 slave.cpp:627] No credentials provided.
>> Attempting to register without authentication
>> I0107 12:07:44.227015 26726 slave.cpp:638] Detecting new master
>> I0107 12:07:44.227149 26726 status_update_manager.cpp:171] Pausing
>> sending status updates
>> I0107 12:07:44.991296 26721 slave.cpp:526] Slave asked to shut down by
>> master@mymaster:5050 because 'Slave attempted
>>  to re-register after removal'
>> I0107 12:07:44.991412 26721 slave.cpp:484] Slave terminating
>>
>> I have masked some IP addresses from these log entries
>>
>> On Thu, Jan 8, 2015 at 11:53 AM, Adam Bordelon <[email protected]>
>> wrote:
>>
>>> There should be a WARNING log line in the mesos slave log (typically
>>> /var/log/mesos/mesos-slave.INFO) that says "Shutting down executor ...
>>> because ..." probably right after the line that says "Got registration for
>>> executor ..."
>>> Can you post a gist of the relevant slave log lines?
>>>
>>> On Thu, Jan 8, 2015 at 11:39 AM, Srinivas Murthy <[email protected]>
>>> wrote:
>>>
>>>> It's a custom executor. I can see that each of the nodes has
>>>> /tmp/mesos/...executors/..runs/../latest with stderr and stdout, along with
>>>> the jar file.
>>>> My stdout is blank, while the stderr has "Executor asked to shutdown"
>>>> as its last line, after the URI is accessed and the resource jar is
>>>> fetched.
>>>>
>>>>
>>>> On Thu, Jan 8, 2015 at 11:29 AM, Adam Bordelon <[email protected]>
>>>> wrote:
>>>>
>>>>> Is your "adhoc framework" using the default Mesos executor, or does it
>>>>> use a custom executor?
>>>>> You can check the task/executor's sandbox from the Mesos web UI, to
>>>>> see if the custom executor or other URIs were properly downloaded, and to
>>>>> view the stdout/stderr of the executor/task.
>>>>>
>>>>> On Thu, Jan 8, 2015 at 10:14 AM, Srinivas Murthy <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> I am running a cluster with one master node and three slaves.
>>>>>> I just got hold of some tutorial code from Git that runs an adhoc
>>>>>> framework written in Java, nothing fancy.
>>>>>> All I am getting is "Executor asked to shutdown" and the code exits
>>>>>> gracefully, with no exceptions. I am trying to put some logging
>>>>>> statements in all the callback functions, but it looks like the
>>>>>> executors are invoked but never run.
>>>>>> Any clues on how to debug this?
>>>>>> I am running Mesos 0.21 and JDK 1.7.55.
>>>>>>
>>>>>> Regards
>>>>>> Srinivas
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
