Looks like you're running off the master branch?

Interestingly, the master asked the slave to shut down the framework before
the "Future discarded" log messages appeared:

I0608 02:33:49.211230 24157 slave.cpp:1106] Asked to shut down framework
201306071736-252063498-5050-19065-0044 by [email protected]:5050

Then I see another framework is launching tasks (note the framework ID is
different):
I0608 02:35:07.981590 24170 slave.cpp:824] Launching task Task_Tracker_97
for framework 201306071736-252063498-5050-19065-0000
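
In case it helps when going through the full logs, here is a rough Python
sketch (not part of Mesos) that counts how many log lines mention each
framework ID, which makes it easy to see how many frameworks a slave has
dealt with. The ID pattern is an assumption inferred from the IDs quoted in
this thread, and count_framework_ids is a hypothetical helper name.

```python
import re
import sys
from collections import Counter

# Assumed shape of a framework ID, inferred only from the IDs quoted in this
# thread (e.g. 201306071736-252063498-5050-19065-0000). Adjust if yours differ.
FRAMEWORK_ID = re.compile(r"\b\d{12}-\d+-\d+-\d+-\d{4}\b")


def count_framework_ids(lines):
    """Return a Counter mapping framework ID -> number of lines mentioning it."""
    counts = Counter()
    for line in lines:
        # Use a set so an ID repeated within one line is counted once.
        for framework_id in set(FRAMEWORK_ID.findall(line)):
            counts[framework_id] += 1
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python count_frameworks.py /path/to/slave.log
    with open(sys.argv[1], errors="replace") as log:
        for framework_id, n in count_framework_ids(log).most_common():
            print(framework_id, n)
```

This expects each log entry on a single line, as in the log file itself
rather than as wrapped in this email.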

1. Do you have the master logs available?

2. It seems you're running two or more frameworks in your cluster, correct?

I'm not sure yet how the "Future discarded" failures are occurring; more
information will help. :)

On Wed, Jun 12, 2013 at 7:36 PM, 王国栋 <[email protected]> wrote:

> The slave is not shutting down.
>
> FYI, the final part of the slave's log is attached.
>
> Guodong
>
>
> On Mon, Jun 10, 2013 at 7:21 AM, Benjamin Mahler <
> [email protected]> wrote:
>
>> I've fixed the issue you linked, and what you've mentioned above does not
>> seem related. Can you provide the full logs? Is the slave shutting down?
>>
>>
>> On Fri, Jun 7, 2013 at 10:51 PM, 王国栋 <[email protected]> wrote:
>>
>> > Hi guys,
>> >
>> > One of the slaves in our cluster is blocked.
>> >
>> > I can see a lot of log lines like this:
>> > W0608 03:33:27.404191 24158 monitor.cpp:167] Failed to collect resource
>> > usage for executor 'executor_Task_Tracker_103' of framework
>> > '201306071736-252063498-5050-19065-0000': Future discarded
>> > W0608 03:33:30.907639 24159 monitor.cpp:167] Failed to collect resource
>> > usage for executor 'executor_Task_Tracker_97' of framework
>> > '201306071736-252063498-5050-19065-0000': Future discarded
>> > W0608 03:33:32.405814 24155 monitor.cpp:167] Failed to collect resource
>> > usage for executor 'executor_Task_Tracker_103' of framework
>> > '201306071736-252063498-5050-19065-0000': Future discarded
>> >
>> > I find that the slave did launch executor_Task_Tracker_97 and
>> > executor_Task_Tracker_103, and they worked with the jobtracker normally.
>> > Later, the two executors were killed by mesos-master, and I can see logs
>> > like this:
>> >
>> > I0608 03:33:22.846537 24162 slave.cpp:983] Asked to kill task
>> > Task_Tracker_103 of framework 201306071736-252063498-5050-19065-0000
>> > I0608 03:33:22.846947 24157 slave.cpp:983] Asked to kill task
>> > Task_Tracker_97 of framework 201306071736-252063498-5050-19065-0000
>> > I0608 03:33:22.984558 24160 status_update_manager.cpp:290] Received
>> > status update TASK_FINISHED (UUID: 67a885a6-d121-41e9-9e65-810933533fe3)
>> > for task Task_Tracker_97 of framework
>> > 201306071736-252063498-5050-19065-0000 with checkpoint=false
>> > I0608 03:33:22.984678 24160 status_update_manager.cpp:336] Forwarding
>> > status update TASK_FINISHED (UUID: 67a885a6-d121-41e9-9e65-810933533fe3)
>> > for task Task_Tracker_97 of framework
>> > 201306071736-252063498-5050-19065-0000 to [email protected]:5050
>> > I0608 03:33:22.985399 24160 slave.cpp:1794] Sending acknowledgement for
>> > status update TASK_FINISHED (UUID: 67a885a6-d121-41e9-9e65-810933533fe3)
>> > for task Task_Tracker_97 of framework
>> > 201306071736-252063498-5050-19065-0000 to executor(1)@10.47.6.16:60089
>> > I0608 03:33:22.986699 24163 status_update_manager.cpp:360] Received
>> > status update acknowledgement 67a885a6-d121-41e9-9e65-810933533fe3 for
>> > task Task_Tracker_97 of framework
>> > 201306071736-252063498-5050-19065-0000
>> >
>> > Then the internal state of the slave is wrong. After the executors are
>> > killed, there are logs like this:
>> > W0608 03:33:27.404191 24158 monitor.cpp:167] Failed to collect resource
>> > usage for executor 'executor_Task_Tracker_103' of framework
>> > '201306071736-252063498-5050-19065-0000': Future discarded
>> > W0608 03:33:30.907639 24159 monitor.cpp:167] Failed to collect resource
>> > usage for executor 'executor_Task_Tracker_97' of framework
>> > '201306071736-252063498-5050-19065-0000': Future discarded
>> > W0608 03:33:32.405814 24155 monitor.cpp:167] Failed to collect resource
>> > usage for executor 'executor_Task_Tracker_103' of framework
>> > '201306071736-252063498-5050-19065-0000': Future discarded
>> > W0608 03:33:35.909579 24164 monitor.cpp:167] Failed to collect resource
>> > usage for executor 'executor_Task_Tracker_97' of framework
>> > '201306071736-252063498-5050-19065-0000': Future discarded
>> > W0608 03:33:37.406793 24167 monitor.cpp:167] Failed to collect resource
>> > usage for executor 'executor_Task_Tracker_103' of framework
>> > '201306071736-252063498-5050-19065-0000': Future discarded
>> > W0608 03:33:40.910625 24164 monitor.cpp:167] Failed to collect resource
>> > usage for executor 'executor_Task_Tracker_97' of framework
>> > '201306071736-252063498-5050-19065-0000': Future discarded
>> > W0608 03:33:42.407618 24167 monitor.cpp:167] Failed to collect resource
>> > usage for executor 'executor_Task_Tracker_103' of framework
>> > '201306071736-252063498-5050-19065-0000': Future discarded
>> > W0608 03:33:45.911341 24165 monitor.cpp:167] Failed to collect resource
>> > usage for executor 'executor_Task_Tracker_97' of framework
>> > '201306071736-252063498-5050-19065-0000': Future discarded
>> >
>> > In the end, the slave is blocked and never starts any executor. Although
>> > the log shows the slave launching executors, I cannot find any files in
>> > the executor working directory.
>> >
>> > Is this a known issue? I searched JIRA but only found this:
>> >
>> >
>> http://mail-archives.apache.org/mod_mbox/incubator-mesos-dev/201305.mbox/%3CJIRA.12646311.1367878134469.276019.1367878215622@arcas%3E
>> >
>> > Thanks.
>> >
>> > Guodong
>> >
>>
>
>
