Hey Guodong,

So, it looks like the executor for Task_Tracker_242 did not register with the
slave within 1 minute, so the slave deemed it unhealthy and decided to kill
it. At this point the executor should've received a kill signal from the
slave. Do you see anything of that sort in the slave or executor logs?
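
If it helps with the digging, here is a rough Python sketch for scanning a
slave log for executors that were terminated without ever registering. It is
not part of Mesos; the function names are made up for illustration, and it
assumes the glog line layout and message wording seen in the logs in this
thread, with each log entry on a single line (not wrapped by a mail client).
glog timestamps carry no year, so the year is an assumed parameter.

```python
import re
from datetime import datetime

# glog prefix as seen in this thread: severity, MMDD, HH:MM:SS.micros,
# thread id, source file:line, then the message.
GLOG = re.compile(
    r"^[IWEF](?P<date>\d{4}) (?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+"
    r"\d+ [\w.]+:\d+\] (?P<msg>.*)$"
)
QUEUED = re.compile(r"Queuing task '[^']+' for executor (?P<executor>\S+)")
REGISTERED = re.compile(r"Got registration for executor '(?P<executor>[^']+)'")
TERMINATED = re.compile(
    r"Terminating executor (?P<executor>\S+) of framework \S+ "
    r"because it did not register within"
)


def parse(lines, year=2013):
    """Yield (timestamp, message) for every line that looks like glog output."""
    for line in lines:
        m = GLOG.match(line.strip())
        if m:
            stamp = f"{year}{m['date']} {m['time']}"
            yield datetime.strptime(stamp, "%Y%m%d %H:%M:%S.%f"), m["msg"]


def unregistered_executors(lines, year=2013):
    """Map each executor killed for not registering to the seconds between
    its task being queued and the kill (None if no queue entry was seen)."""
    queued, registered, killed = {}, set(), {}
    for ts, msg in parse(lines, year):
        if (m := QUEUED.search(msg)):
            queued[m["executor"]] = ts
        elif (m := REGISTERED.search(msg)):
            registered.add(m["executor"])
        elif (m := TERMINATED.search(msg)):
            killed[m["executor"]] = ts
    return {
        ex: (ts - queued[ex]).total_seconds() if ex in queued else None
        for ex, ts in killed.items()
        if ex not in registered
    }


# Two entries from Guodong's log, joined back onto single lines.
sample = [
    "I0709 00:46:11.291647 11002 slave.cpp:948] Queuing task "
    "'Task_Tracker_242' for executor executor_Task_Tracker_242 of framework "
    "'201307040929-252063498-5050-27411-0000",
    "I0709 00:47:11.292485 10991 slave.cpp:2463] Terminating executor "
    "executor_Task_Tracker_242 of framework "
    "201307040929-252063498-5050-27411-0000 because it did not register "
    "within 1mins",
]
print(unregistered_executors(sample))
```

On those two entries the gap comes out at about 60 seconds, which matches the
"did not register within 1mins" message. Running it over the whole slave log
should also show whether other executors hit the same timeout.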


On Mon, Jul 8, 2013 at 11:30 PM, 王国栋 <wangg...@gmail.com> wrote:

> Hi vinod.
>
> I am using the code from trunk. I think the latest commit I have is from
> Jul 1st. I will send the relevant master logs in another mail.
>
> The task "Task_Tracker_242" is stuck in STAGING. I think "Task_Tracker_224"
> and "Task_Tracker_230" exited successfully. But it is strange that there are
> a lot of "Failed to collect resource..." warnings.
>
> I0709 00:46:11.288698 11002 slave.cpp:739] Got assigned task
> Task_Tracker_242 for framework 201307040929-252063498-5050-27411-0000
> I0709 00:46:11.289136 11002 slave.cpp:837] Launching task Task_Tracker_242
> for framework 201307040929-252063498-5050-27411-0000
> I0709 00:46:11.291296 11002 paths.hpp:303] Created executor directory
> '/data/mesos-slave-work-dir//slaves/201307041648-252063498-5050-8038-5/frameworks/201307040929-252063498-5050-27411-0000/executors/executor_Task_Tracker_242/runs/5c47ad99-1c78-43c8-9f27-9509f1d39c3d'
> I0709 00:46:11.291647 11002 slave.cpp:948] Queuing task 'Task_Tracker_242'
> for executor executor_Task_Tracker_242 of framework
> '201307040929-252063498-5050-27411-0000
> I0709 00:46:11.292162 11002 slave.cpp:511] Successfully attached file
> '/data/mesos-slave-work-dir//slaves/201307041648-252063498-5050-8038-5/frameworks/201307040929-252063498-5050-27411-0000/executors/executor_Task_Tracker_242/runs/5c47ad99-1c78-43c8-9f27-9509f1d39c3d'
> W0709 00:46:12.197242 10992 monitor.cpp:186] Failed to collect resource
> usage for executor 'executor_Task_Tracker_230' of framework
> '201307040929-252063498-5050-27411-0000': Future discarded
> W0709 00:46:16.100548 10994 monitor.cpp:186] Failed to collect resource
> usage for executor 'executor_Task_Tracker_224' of framework
> '201307040929-252063498-5050-27411-0000': Future discarded
> W0709 00:46:17.197463 11001 monitor.cpp:186] Failed to collect resource
> usage for executor 'executor_Task_Tracker_230' of framework
> '201307040929-252063498-5050-27411-0000': Future discarded
> W0709 00:46:21.101570 11002 monitor.cpp:186] Failed to collect resource
> usage for executor 'executor_Task_Tracker_224' of framework
> '201307040929-252063498-5050-27411-0000': Future discarded
> W0709 00:46:22.198303 11005 monitor.cpp:186] Failed to collect resource
> usage for executor 'executor_Task_Tracker_230' of framework
> '201307040929-252063498-5050-27411-0000': Future discarded
> W0709 00:46:26.102522 11002 monitor.cpp:186] Failed to collect resource
> usage for executor 'executor_Task_Tracker_224' of framework
> '201307040929-252063498-5050-27411-0000': Future discarded
> W0709 00:46:27.199403 10998 monitor.cpp:186] Failed to collect resource
> usage for executor 'executor_Task_Tracker_230' of framework
> '201307040929-252063498-5050-27411-0000': Future discarded
> W0709 00:46:31.103610 10998 monitor.cpp:186] Failed to collect resource
> usage for executor 'executor_Task_Tracker_224' of framework
> '201307040929-252063498-5050-27411-0000': Future discarded
> W0709 00:46:32.200248 11001 monitor.cpp:186] Failed to collect resource
> usage for executor 'executor_Task_Tracker_230' of framework
> '201307040929-252063498-5050-27411-0000': Future discarded
> W0709 00:46:36.104547 11004 monitor.cpp:186] Failed to collect resource
> usage for executor 'executor_Task_Tracker_224' of framework
> '201307040929-252063498-5050-27411-0000': Future discarded
> W0709 00:46:37.201236 10991 monitor.cpp:186] Failed to collect resource
> usage for executor 'executor_Task_Tracker_230' of framework
> '201307040929-252063498-5050-27411-0000': Future discarded
> W0709 00:46:41.105523 10997 monitor.cpp:186] Failed to collect resource
> usage for executor 'executor_Task_Tracker_224' of framework
> '201307040929-252063498-5050-27411-0000': Future discarded
> W0709 00:46:42.202250 10991 monitor.cpp:186] Failed to collect resource
> usage for executor 'executor_Task_Tracker_230' of framework
> '201307040929-252063498-5050-27411-0000': Future discarded
> I0709 00:46:45.283098 11002 slave.cpp:2511] Current usage 57.43%. Max
> allowed age: 2.279812884766227days
> W0709 00:46:46.106760 10994 monitor.cpp:186] Failed to collect resource
> usage for executor 'executor_Task_Tracker_224' of framework
> '201307040929-252063498-5050-27411-0000': Future discarded
> W0709 00:46:47.203474 10993 monitor.cpp:186] Failed to collect resource
> usage for executor 'executor_Task_Tracker_230' of framework
> '201307040929-252063498-5050-27411-0000': Future discarded
> W0709 00:46:51.107544 11006 monitor.cpp:186] Failed to collect resource
> usage for executor 'executor_Task_Tracker_224' of framework
> '201307040929-252063498-5050-27411-0000': Future discarded
> W0709 00:46:52.204280 10997 monitor.cpp:186] Failed to collect resource
> usage for executor 'executor_Task_Tracker_230' of framework
> '201307040929-252063498-5050-27411-0000': Future discarded
> W0709 00:46:56.108530 10995 monitor.cpp:186] Failed to collect resource
> usage for executor 'executor_Task_Tracker_224' of framework
> '201307040929-252063498-5050-27411-0000': Future discarded
> W0709 00:46:57.205417 10997 monitor.cpp:186] Failed to collect resource
> usage for executor 'executor_Task_Tracker_230' of framework
> '201307040929-252063498-5050-27411-0000': Future discarded
> W0709 00:47:01.109284 10997 monitor.cpp:186] Failed to collect resource
> usage for executor 'executor_Task_Tracker_224' of framework
> '201307040929-252063498-5050-27411-0000': Future discarded
> W0709 00:47:02.206368 11002 monitor.cpp:186] Failed to collect resource
> usage for executor 'executor_Task_Tracker_230' of framework
> '201307040929-252063498-5050-27411-0000': Future discarded
> I0709 00:47:05.288517 11002 slave.cpp:2463] Terminating executor
> executor_Task_Tracker_238 of framework
> 201307040929-252063498-5050-27411-0000 because it did not register within
> 1mins
> W0709 00:47:06.110532 11005 monitor.cpp:186] Failed to collect resource
> usage for executor 'executor_Task_Tracker_224' of framework
> '201307040929-252063498-5050-27411-0000': Future discarded
> W0709 00:47:07.207320 10997 monitor.cpp:186] Failed to collect resource
> usage for executor 'executor_Task_Tracker_230' of framework
> '201307040929-252063498-5050-27411-0000': Future discarded
> W0709 00:47:11.111778 10996 monitor.cpp:186] Failed to collect resource
> usage for executor 'executor_Task_Tracker_224' of framework
> '201307040929-252063498-5050-27411-0000': Future discarded
> I0709 00:47:11.292485 10991 slave.cpp:2463] Terminating executor
> executor_Task_Tracker_242 of framework
> 201307040929-252063498-5050-27411-0000 because it did not register within
> 1mins
>
>
> Guodong
>
>
> On Tue, Jul 9, 2013 at 2:21 PM, Vinod Kone <vinodk...@gmail.com> wrote:
>
> > hey guodong, which of these task(s) is stuck in STAGING? also, the
> > corresponding master's logs would also be helpful here. also which
> > version of mesos are you running?
> >
> >
> > On Mon, Jul 8, 2013 at 11:02 PM, 王国栋 <wangg...@gmail.com> wrote:
> >
> > > Interestingly, the slave log also contains these entries:
> > >
> > > I0709 00:33:43.833853 11002 slave.cpp:996] Asked to kill task
> > > Task_Tracker_224 of framework 201307040929-252063498-5050-27411-0000
> > > I0709 00:33:43.835552 11006 slave.cpp:996] Asked to kill task
> > > Task_Tracker_230 of framework 201307040929-252063498-5050-27411-0000
> > > I0709 00:33:43.972771 10994 slave.cpp:1692] Handling status update
> > > TASK_FINISHED (UUID: 372081cc-edf2-4183-a461-9345ab6d279c) for task
> > > Task_Tracker_230 of framework 201307040929-252063498-5050-27411-0000
> > > from executor(1)@10.47.6.21:27786
> > > I0709 00:33:43.973132 10994 status_update_manager.cpp:290] Received
> > > status update TASK_FINISHED (UUID: 372081cc-edf2-4183-a461-9345ab6d279c)
> > > for task Task_Tracker_230 of framework
> > > 201307040929-252063498-5050-27411-0000 with checkpoint=false
> > > I0709 00:33:43.973192 10994 status_update_manager.cpp:336] Forwarding
> > > status update TASK_FINISHED (UUID: 372081cc-edf2-4183-a461-9345ab6d279c)
> > > for task Task_Tracker_230 of framework
> > > 201307040929-252063498-5050-27411-0000 to master@10.47.6.15:5050
> > > I0709 00:33:43.973846 11005 slave.cpp:1809] Sending acknowledgement for
> > > status update TASK_FINISHED (UUID: 372081cc-edf2-4183-a461-9345ab6d279c)
> > > for task Task_Tracker_230 of framework
> > > 201307040929-252063498-5050-27411-0000 to executor(1)@10.47.6.21:27786
> > > I0709 00:33:43.974591 11000 status_update_manager.cpp:360] Received
> > > status update acknowledgement 372081cc-edf2-4183-a461-9345ab6d279c for
> > > task Task_Tracker_230 of framework 201307040929-252063498-5050-27411-0000
> > > I0709 00:33:43.974652 11000 status_update_manager.cpp:481] Cleaning up
> > > status update stream for task Task_Tracker_230 of framework
> > > 201307040929-252063498-5050-27411-0000
> > > I0709 00:33:44.090603 11003 slave.cpp:1692] Handling status update
> > > TASK_FINISHED (UUID: 61d5775a-2375-412a-a5a4-80ab55163d88) for task
> > > Task_Tracker_224 of framework 201307040929-252063498-5050-27411-0000
> > > from executor(1)@10.47.6.21:2310
> > > I0709 00:33:44.090860 11003 status_update_manager.cpp:290] Received
> > > status update TASK_FINISHED (UUID: 61d5775a-2375-412a-a5a4-80ab55163d88)
> > > for task Task_Tracker_224 of framework
> > > 201307040929-252063498-5050-27411-0000 with checkpoint=false
> > > I0709 00:33:44.090973 11003 status_update_manager.cpp:336] Forwarding
> > > status update TASK_FINISHED (UUID: 61d5775a-2375-412a-a5a4-80ab55163d88)
> > > for task Task_Tracker_224 of framework
> > > 201307040929-252063498-5050-27411-0000 to master@10.47.6.15:5050
> > > I0709 00:33:44.091279 11003 slave.cpp:1809] Sending acknowledgement for
> > > status update TASK_FINISHED (UUID: 61d5775a-2375-412a-a5a4-80ab55163d88)
> > > for task Task_Tracker_224 of framework
> > > 201307040929-252063498-5050-27411-0000 to executor(1)@10.47.6.21:2310
> > > I0709 00:33:44.093286 11003 status_update_manager.cpp:360] Received
> > > status update acknowledgement 61d5775a-2375-412a-a5a4-80ab55163d88 for
> > > task Task_Tracker_224 of framework 201307040929-252063498-5050-27411-0000
> > > I0709 00:33:44.093359 11003 status_update_manager.cpp:481] Cleaning up
> > > status update stream for task Task_Tracker_224 of framework
> > > 201307040929-252063498-5050-27411-0000
> > > I0709 00:33:45.259831 10997 slave.cpp:2511] Current usage 57.44%. Max
> > > allowed age: 2.279168852469954days
> > > W0709 00:33:45.949470 10996 monitor.cpp:186] Failed to collect resource
> > > usage for executor 'executor_Task_Tracker_224' of framework
> > > '201307040929-252063498-5050-27411-0000': Future discarded
> > > W0709 00:33:47.063181 11005 monitor.cpp:186] Failed to collect resource
> > > usage for executor 'executor_Task_Tracker_230' of framework
> > > '201307040929-252063498-5050-27411-0000': Future discarded
> > > W0709 00:33:50.950412 11000 monitor.cpp:186] Failed to collect resource
> > > usage for executor 'executor_Task_Tracker_224' of framework
> > > '201307040929-252063498-5050-27411-0000': Future discarded
> > > W0709 00:33:52.063576 10993 monitor.cpp:186] Failed to collect resource
> > > usage for executor 'executor_Task_Tracker_230' of framework
> > > '201307040929-252063498-5050-27411-0000': Future discarded
> > > W0709 00:33:55.951427 11003 monitor.cpp:186] Failed to collect resource
> > > usage for executor 'executor_Task_Tracker_224' of framework
> > > '201307040929-252063498-5050-27411-0000': Future discarded
> > > W0709 00:33:57.064575 10998 monitor.cpp:186] Failed to collect resource
> > > usage for executor 'executor_Task_Tracker_230' of framework
> > > '201307040929-252063498-5050-27411-0000': Future discarded
> > > W0709 00:34:00.952390 11003 monitor.cpp:186] Failed to collect resource
> > > usage for executor 'executor_Task_Tracker_224' of framework
> > > '201307040929-252063498-5050-27411-0000': Future discarded
> > > W0709 00:34:02.065218 10998 monitor.cpp:186] Failed to collect resource
> > > usage for executor 'executor_Task_Tracker_230' of framework
> > > '201307040929-252063498-5050-27411-0000': Future discarded
> > > W0709 00:34:05.953456 10995 monitor.cpp:186] Failed to collect resource
> > > usage for executor 'executor_Task_Tracker_224' of framework
> > > '201307040929-252063498-5050-27411-0000': Future discarded
> > > W0709 00:34:07.066515 10995 monitor.cpp:186] Failed to collect resource
> > > usage for executor 'executor_Task_Tracker_230' of framework
> > > '201307040929-252063498-5050-27411-0000': Future discarded
> > > W0709 00:34:10.954479 10998 monitor.cpp:186] Failed to collect resource
> > > usage for executor 'executor_Task_Tracker_224' of framework
> > > '201307040929-252063498-5050-27411-0000': Future discarded
> > > W0709 00:34:12.067471 11005 monitor.cpp:186] Failed to collect resource
> > > usage for executor 'executor_Task_Tracker_230' of framework
> > > '201307040929-252063498-5050-27411-0000': Future discarded
> > > W0709 00:34:15.955461 10996 monitor.cpp:186] Failed to collect resource
> > > usage for executor 'executor_Task_Tracker_224' of framework
> > > '201307040929-252063498-5050-27411-0000': Future discarded
> > > W0709 00:34:17.068209 10996 monitor.cpp:186] Failed to collect resource
> > > usage for executor 'executor_Task_Tracker_230' of framework
> > > '201307040929-252063498-5050-27411-0000': Future discarded
> > >
> > >
> > >
> > > Guodong
> > >
> > >
> > > On Tue, Jul 9, 2013 at 1:59 PM, 王国栋 <wangg...@gmail.com> wrote:
> > >
> > > > Hi ben,
> > > >
> > > > I ran into the same issue here.
> > > >
> > > > This also happens in our Hadoop framework. The slave log looks like
> > > > this. At that time, I think the workload on the node was very high.
> > > >
> > > > I0708 23:36:44.253880 11005 slave.cpp:739] Got assigned task
> > > > Task_Tracker_224 for framework 201307040929-252063498-5050-27411-0000
> > > > I0708 23:36:44.255221 10999 gc.cpp:84] Unscheduling
> > > > '/data/mesos-slave-work-dir//slaves/201307041648-252063498-5050-8038-5/frameworks/201307040929-252063498-5050-27411-0000'
> > > > for removal
> > > > I0708 23:36:44.256206 11001 slave.cpp:837] Launching task
> > > > Task_Tracker_224 for framework 201307040929-252063498-5050-27411-0000
> > > > I0708 23:36:44.258117 11001 paths.hpp:303] Created executor directory
> > > > '/data/mesos-slave-work-dir//slaves/201307041648-252063498-5050-8038-5/frameworks/201307040929-252063498-5050-27411-0000/executors/executor_Task_Tracker_224/runs/953d3565-424c-4ab3-9926-a3fa71042bf1'
> > > > I0708 23:36:44.258467 10991 process_isolator.cpp:99] Launching
> > > > executor_Task_Tracker_224 (cd hadoop && ./bin/mesos-executor) in
> > > > /data/mesos-slave-work-dir//slaves/201307041648-252063498-5050-8038-5/frameworks/201307040929-252063498-5050-27411-0000/executors/executor_Task_Tracker_224/runs/953d3565-424c-4ab3-9926-a3fa71042bf1
> > > > with resources cpus=1; mem=1280' for framework
> > > > 201307040929-252063498-5050-27411-0000
> > > > I0708 23:36:44.258496 11001 slave.cpp:948] Queuing task
> > > > 'Task_Tracker_224' for executor executor_Task_Tracker_224 of framework
> > > > '201307040929-252063498-5050-27411-0000
> > > > I0708 23:36:44.261446 10991 process_isolator.cpp:161] Forked executor at
> > > > 2220
> > > > I0708 23:36:44.261787 10996 slave.cpp:511] Successfully attached file
> > > > '/data/mesos-slave-work-dir//slaves/201307041648-252063498-5050-8038-5/frameworks/201307040929-252063498-5050-27411-0000/executors/executor_Task_Tracker_224/runs/953d3565-424c-4ab3-9926-a3fa71042bf1'
> > > > I0708 23:36:44.580497 10996 slave.cpp:2511] Current usage 57.21%. Max
> > > > allowed age: 2.295155852123924days
> > > > I0708 23:36:44.750393 11002 slave.cpp:1395] Got registration for
> > > > executor 'executor_Task_Tracker_224' of framework
> > > > 201307040929-252063498-5050-27411-0000
> > > > I0708 23:36:44.751095 11002 slave.cpp:1510] Flushing queued task
> > > > Task_Tracker_224 for executor 'executor_Task_Tracker_224' of framework
> > > > 201307040929-252063498-5050-27411-0000
> > > > I0708 23:36:46.144317 11006 slave.cpp:1692] Handling status update
> > > > TASK_RUNNING (UUID: 364ee347-f6a2-4c7b-8702-460aa0ece579) for task
> > > > Task_Tracker_224 of framework 201307040929-252063498-5050-27411-0000
> > > > from executor(1)@10.47.6.21:2310
> > > > I0708 23:36:46.144745 11006 status_update_manager.cpp:290] Received
> > > > status update TASK_RUNNING (UUID: 364ee347-f6a2-4c7b-8702-460aa0ece579)
> > > > for task Task_Tracker_224 of framework
> > > > 201307040929-252063498-5050-27411-0000 with checkpoint=false
> > > > I0708 23:36:46.144821 11006 status_update_manager.cpp:450] Creating
> > > > StatusUpdate stream for task Task_Tracker_224 of framework
> > > > 201307040929-252063498-5050-27411-0000
> > > > I0708 23:36:46.145076 11006 status_update_manager.cpp:336] Forwarding
> > > > status update TASK_RUNNING (UUID: 364ee347-f6a2-4c7b-8702-460aa0ece579)
> > > > for task Task_Tracker_224 of framework
> > > > 201307040929-252063498-5050-27411-0000 to master@10.47.6.15:5050
> > > > I0708 23:36:46.145882 10997 slave.cpp:1809] Sending acknowledgement for
> > > > status update TASK_RUNNING (UUID: 364ee347-f6a2-4c7b-8702-460aa0ece579)
> > > > for task Task_Tracker_224 of framework
> > > > 201307040929-252063498-5050-27411-0000 to executor(1)@10.47.6.21:2310
> > > > I0708 23:36:46.146870 10993 status_update_manager.cpp:360] Received
> > > > status update acknowledgement 364ee347-f6a2-4c7b-8702-460aa0ece579 for
> > > > task Task_Tracker_224 of framework
> > > > 201307040929-252063498-5050-27411-0000
> > > > I0708 23:36:50.258347 11005 slave.cpp:739] Got assigned task
> > > > Task_Tracker_230 for framework 201307040929-252063498-5050-27411-0000
> > > > I0708 23:36:50.259472 11005 slave.cpp:837] Launching task
> > > > Task_Tracker_230 for framework 201307040929-252063498-5050-27411-0000
> > > > I0708 23:36:50.261641 11005 paths.hpp:303] Created executor directory
> > > > '/data/mesos-slave-work-dir//slaves/201307041648-252063498-5050-8038-5/frameworks/201307040929-252063498-5050-27411-0000/executors/executor_Task_Tracker_230/runs/1dc87acc-d090-469f-ba30-0477139ee7fd'
> > > > I0708 23:36:50.262265 11005 slave.cpp:948] Queuing task
> > > > 'Task_Tracker_230' for executor executor_Task_Tracker_230 of framework
> > > > '201307040929-252063498-5050-27411-0000
> > > > I0708 23:36:50.262557 11005 process_isolator.cpp:99] Launching
> > > > executor_Task_Tracker_230 (cd hadoop && ./bin/mesos-executor) in
> > > > /data/mesos-slave-work-dir//slaves/201307041648-252063498-5050-8038-5/frameworks/201307040929-252063498-5050-27411-0000/executors/executor_Task_Tracker_230/runs/1dc87acc-d090-469f-ba30-0477139ee7fd
> > > > with resources cpus=1; mem=1280' for framework
> > > > 201307040929-252063498-5050-27411-0000
> > > > I0708 23:36:50.265396 10999 slave.cpp:511] Successfully attached file
> > > > '/data/mesos-slave-work-dir//slaves/201307041648-252063498-5050-8038-5/frameworks/201307040929-252063498-5050-27411-0000/executors/executor_Task_Tracker_230/runs/1dc87acc-d090-469f-ba30-0477139ee7fd'
> > > > I0708 23:36:50.265419 11005 process_isolator.cpp:161] Forked executor at
> > > > 2851
> > > > I0708 23:36:50.835607 10995 slave.cpp:1395] Got registration for
> > > > executor 'executor_Task_Tracker_230' of framework
> > > > 201307040929-252063498-5050-27411-0000
> > > > I0708 23:36:50.836174 10995 slave.cpp:1510] Flushing queued task
> > > > Task_Tracker_230 for executor 'executor_Task_Tracker_230' of framework
> > > > 201307040929-252063498-5050-27411-0000
> > > > I0708 23:36:54.617856 10994 slave.cpp:1692] Handling status update
> > > > TASK_RUNNING (UUID: 7753252d-c90b-4b0d-adca-7c97f38f692e) for task
> > > > Task_Tracker_230 of framework 201307040929-252063498-5050-27411-0000
> > > > from executor(1)@10.47.6.21:27786
> > > > I0708 23:36:54.618275 10994 status_update_manager.cpp:290] Received
> > > > status update TASK_RUNNING (UUID: 7753252d-c90b-4b0d-adca-7c97f38f692e)
> > > > for task Task_Tracker_230 of framework
> > > > 201307040929-252063498-5050-27411-0000 with checkpoint=false
> > > > I0708 23:36:54.618326 10994 status_update_manager.cpp:450] Creating
> > > > StatusUpdate stream for task Task_Tracker_230 of framework
> > > > 201307040929-252063498-5050-27411-0000
> > > > I0708 23:36:54.618443 10994 status_update_manager.cpp:336] Forwarding
> > > > status update TASK_RUNNING (UUID: 7753252d-c90b-4b0d-adca-7c97f38f692e)
> > > > for task Task_Tracker_230 of framework
> > > > 201307040929-252063498-5050-27411-0000 to master@10.47.6.15:5050
> > > > I0708 23:36:54.619137 10994 slave.cpp:1809] Sending acknowledgement for
> > > > status update TASK_RUNNING (UUID: 7753252d-c90b-4b0d-adca-7c97f38f692e)
> > > > for task Task_Tracker_230 of framework
> > > > 201307040929-252063498-5050-27411-0000 to executor(1)@10.47.6.21:27786
> > > > I0708 23:36:54.637682 10994 status_update_manager.cpp:360] Received
> > > > status update acknowledgement 7753252d-c90b-4b0d-adca-7c97f38f692e for
> > > > task Task_Tracker_230 of framework
> > > > 201307040929-252063498-5050-27411-0000
> > > > I0708 23:37:44.583014 11002 slave.cpp:2511] Current usage 57.23%. Max
> > > > allowed age: 2.293704423241597days
> > > > I0708 23:38:44.585233 11003 slave.cpp:2511] Current usage 57.23%. Max
> > > > allowed age: 2.293703916528542days
> > > > I0708 23:39:44.599442 11006 slave.cpp:2511] Current usage 57.23%. Max
> > > > allowed age: 2.293639867998055days
> > > > I0708 23:40:44.603996 10997 slave.cpp:2511] Current usage 57.24%. Max
> > > > allowed age: 2.292921551567535days
> > > > I0708 23:41:44.608608 11006 slave.cpp:2511] Current usage 57.26%. Max
> > > > allowed age: 2.291521098018820days
> > > > I0708 23:42:44.609956 10992 slave.cpp:2511] Current usage 57.23%. Max
> > > > allowed age: 2.293668041244063days
> > > > I0708 23:43:44.682621 11000 slave.cpp:2511] Current usage 57.24%. Max
> > > > allowed age: 2.292935638190544days
> > > > I0708 23:44:44.684306 10993 slave.cpp:2511] Current usage 57.24%. Max
> > > > allowed age: 2.292916079066516days
> > > > I0708 23:45:44.686172 11001 slave.cpp:2511] Current usage 57.26%. Max
> > > > allowed age: 2.291485324076945days
> > > > I0708 23:46:44.699095 10995 slave.cpp:2511] Current usage 57.23%. Max
> > > > allowed age: 2.293641894850289days
> > > > I0708 23:47:44.721156 10998 slave.cpp:2511] Current usage 57.23%. Max
> > > > allowed age: 2.293629429709074days
> > > > I0708 23:48:44.779767 10992 slave.cpp:2511] Current usage 57.24%. Max
> > > > allowed age: 2.293525350847025days
> > > > I0708 23:49:44.812389 11004 slave.cpp:2511] Current usage 57.24%. Max
> > > > allowed age: 2.292909289111539days
> > > > I0708 23:50:44.814146 10999 slave.cpp:2511] Current usage 57.27%. Max
> > > > allowed age: 2.291438098419977days
> > > > I0708 23:51:44.814877 11005 slave.cpp:2511] Current usage 57.23%. Max
> > > > allowed age: 2.293635104895313days
> > > > I0708 23:52:44.818620 10998 slave.cpp:2511] Current usage 57.24%. Max
> > > > allowed age: 2.292983775931019days
> > > > I0708 23:53:44.829911 10997 slave.cpp:2511] Current usage 57.33%. Max
> > > > allowed age: 2.286910009194236days
> > > > I0708 23:54:44.831307 10999 slave.cpp:2511] Current usage 57.33%. Max
> > > > allowed age: 2.286909502481169days
> > > > I0708 23:55:44.902858 10994 slave.cpp:2511] Current usage 57.37%. Max
> > > > allowed age: 2.284414244700093days
> > > > I0708 23:56:44.905398 11002 slave.cpp:2511] Current usage 57.42%. Max
> > > > allowed age: 2.280636901540567days
> > > > I0708 23:57:44.933673 10991 slave.cpp:2511] Current usage 57.44%. Max
> > > > allowed age: 2.279481899796968days
> > > > I0708 23:58:44.934840 11004 slave.cpp:2511] Current usage 57.48%. Max
> > > > allowed age: 2.276566475548496days
> > > > I0708 23:59:44.936063 11001 slave.cpp:2511] Current usage 57.49%. Max
> > > > allowed age: 2.275690368671817days
> > > > I0709 00:00:44.937433 11004 slave.cpp:2511] Current usage 57.50%. Max
> > > > allowed age: 2.275057180034989days
> > > > I0709 00:01:44.938940 11001 slave.cpp:2511] Current usage 57.51%. Max
> > > > allowed age: 2.273999467198449days
> > > > I0709 00:02:44.955103 10996 slave.cpp:2511] Current usage 57.52%. Max
> > > > allowed age: 2.273472384275891days
> > > > I0709 00:03:44.956354 10993 slave.cpp:2511] Current usage 57.39%. Max
> > > > allowed age: 2.282894612240220days
> > > > I0709 00:04:44.957926 10997 slave.cpp:2511] Current usage 57.40%. Max
> > > > allowed age: 2.281966516603831days
> > > > I0709 00:05:44.969205 10996 slave.cpp:2511] Current usage 57.40%. Max
> > > > allowed age: 2.281962260214144days
> > > > I0709 00:06:44.969987 10992 slave.cpp:2511] Current usage 57.40%. Max
> > > > allowed age: 2.281791801941551days
> > > > I0709 00:07:44.977504 11004 slave.cpp:2511] Current usage 57.40%. Max
> > > > allowed age: 2.281715288269849days
> > > > I0709 00:08:44.982868 10998 slave.cpp:2511] Current usage 57.40%. Max
> > > > allowed age: 2.281699782850289days
> > > > I0709 00:09:44.997082 11000 slave.cpp:2511] Current usage 57.42%. Max
> > > > allowed age: 2.280776044946192days
> > > > I0709 00:10:44.998754 10994 slave.cpp:2511] Current usage 57.42%. Max
> > > > allowed age: 2.280772193926956days
> > > > I0709 00:11:44.999949 11002 slave.cpp:2511] Current usage 57.44%. Max
> > > > allowed age: 2.279204525069213days
> > > > I0709 00:12:45.001539 10995 slave.cpp:2511] Current usage 57.47%. Max
> > > > allowed age: 2.277132676719109days
> > > > I0709 00:13:45.002728 10992 slave.cpp:2511] Current usage 57.43%. Max
> > > > allowed age: 2.280012428368322days
> > > > I0709 00:14:45.009699 10998 slave.cpp:2511] Current usage 57.48%. Max
> > > > allowed age: 2.276733690857512days
> > > > I0709 00:15:45.013483 10996 slave.cpp:2511] Current usage 57.53%. Max
> > > > allowed age: 2.272715152282546days
> > > > I0709 00:16:45.015496 10998 slave.cpp:2511] Current usage 57.57%. Max
> > > > allowed age: 2.270354274804352days
> > > > I0709 00:17:45.016628 11000 slave.cpp:2511] Current usage 57.62%. Max
> > > > allowed age: 2.266927678423322days
> > > > I0709 00:18:45.032670 11002 slave.cpp:2511] Current usage 57.65%. Max
> > > > allowed age: 2.264218182361482days
> > > > I0709 00:19:45.043442 10998 slave.cpp:2511] Current usage 57.69%. Max
> > > > allowed age: 2.261509598383137days
> > > > I0709 00:20:45.080648 10992 slave.cpp:2511] Current usage 57.72%. Max
> > > > allowed age: 2.259379478031400days
> > > > I0709 00:21:45.081632 10995 slave.cpp:2511] Current usage 57.77%. Max
> > > > allowed age: 2.255819920144039days
> > > > I0709 00:22:45.082593 11005 slave.cpp:2511] Current usage 57.81%. Max
> > > > allowed age: 2.253314528101817days
> > > > I0709 00:23:45.193588 10997 slave.cpp:2511] Current usage 57.85%. Max
> > > > allowed age: 2.250524870034248days
> > > > I0709 00:24:45.220617 10994 slave.cpp:2511] Current usage 57.90%. Max
> > > > allowed age: 2.246784618270532days
> > > > I0709 00:25:45.241602 10992 slave.cpp:2511] Current usage 57.97%. Max
> > > > allowed age: 2.242399422127049days
> > > > I0709 00:26:45.248977 11000 slave.cpp:2511] Current usage 58.00%. Max
> > > > allowed age: 2.240250654734792days
> > > > I0709 00:27:45.250953 10993 slave.cpp:2511] Current usage 57.99%. Max
> > > > allowed age: 2.240516983117894days
> > > > I0709 00:28:45.252694 10996 slave.cpp:2511] Current usage 58.06%. Max
> > > > allowed age: 2.235834143724352days
> > > > I0709 00:29:45.254992 11003 slave.cpp:2511] Current usage 58.10%. Max
> > > > allowed age: 2.233297436815162days
> > > > W0709 00:30:06.753098 10999 monitor.cpp:186] Failed to collect resource
> > > > usage for executor 'executor_Task_Tracker_230' of framework
> > > > '201307040929-252063498-5050-27411-0000': Future discarded
> > > > W0709 00:30:10.715373 10996 monitor.cpp:186] Failed to collect resource
> > > > usage for executor 'executor_Task_Tracker_224' of framework
> > > > '201307040929-252063498-5050-27411-0000': Future discarded
> > > > W0709 00:30:11.754446 11003 monitor.cpp:186] Failed to collect resource
> > > > usage for executor 'executor_Task_Tracker_230' of framework
> > > > '201307040929-252063498-5050-27411-0000': Future discarded
> > > > W0709 00:30:15.719880 11003 monitor.cpp:186] Failed to collect resource
> > > > usage for executor 'executor_Task_Tracker_224' of framework
> > > > '201307040929-252063498-5050-27411-0000': Future discarded
> > > > W0709 00:30:16.755473 11003 monitor.cpp:186] Failed to collect resource
> > > > usage for executor 'executor_Task_Tracker_230' of framework
> > > > '201307040929-252063498-5050-27411-0000': Future discarded
> > > > W0709 00:30:20.720330 11003 monitor.cpp:186] Failed to collect resource
> > > > usage for executor 'executor_Task_Tracker_224' of framework
> > > > '201307040929-252063498-5050-27411-0000': Future discarded
> > > > W0709 00:30:21.766019 11003 monitor.cpp:186] Failed to collect resource
> > > > usage for executor 'executor_Task_Tracker_230' of framework
> > > > '201307040929-252063498-5050-27411-0000': Future discarded
> > > > W0709 00:30:25.721364 11003 monitor.cpp:186] Failed to collect resource
> > > > usage for executor 'executor_Task_Tracker_224' of framework
> > > > '201307040929-252063498-5050-27411-0000': Future discarded
> > > > W0709 00:30:26.768874 11003 monitor.cpp:186] Failed to collect resource
> > > > usage for executor 'executor_Task_Tracker_230' of framework
> > > > '201307040929-252063498-5050-27411-0000': Future discarded
> > > > W0709 00:30:30.722605 11003 monitor.cpp:186] Failed to collect resource
> > > > usage for executor 'executor_Task_Tracker_224' of framework
> > > > '201307040929-252063498-5050-27411-0000': Future discarded
> > > > W0709 00:30:31.770354 11003 monitor.cpp:186] Failed to collect resource
> > > > usage for executor 'executor_Task_Tracker_230' of framework
> > > > '201307040929-252063498-5050-27411-0000': Future discarded
> > > > W0709 00:30:35.724455 10992 monitor.cpp:186] Failed to collect resource
> > > > usage for executor 'executor_Task_Tracker_224' of framework
> > > > '201307040929-252063498-5050-27411-0000': Future discarded
> > > > W0709 00:30:36.788751 10992 monitor.cpp:186] Failed to collect resource
> > > > usage for executor 'executor_Task_Tracker_230' of framework
> > > > '201307040929-252063498-5050-27411-0000': Future discarded
> > > > W0709 00:30:40.745380 10992 monitor.cpp:186] Failed to collect resource
> > > > usage for executor 'executor_Task_Tracker_224' of framework
> > > > '201307040929-252063498-5050-27411-0000': Future discarded
> > > > W0709 00:30:41.789358 10992 monitor.cpp:186] Failed to collect resource
> > > > usage for executor 'executor_Task_Tracker_230' of framework
> > > > '201307040929-252063498-5050-27411-0000': Future discarded
> > > > I0709 00:30:45.256590 11004 slave.cpp:2511] Current usage 58.11%. Max
> > > > allowed age: 2.232469873049410days
> > > >
> > > >
> > > > Guodong
> > > >
> > > >
> > > > On Tue, Jul 9, 2013 at 4:55 AM, Benjamin Mahler <benjamin.mah...@gmail.com> wrote:
> > > >
> > > >> Are these the un-edited logs? I'm expecting to see some logs from the
> > > >> process_isolator or cgroups_isolator in there.
> > > >>
> > > >>
> > > >> On Fri, Jul 5, 2013 at 2:38 PM, Brenden Matthews <
> > > >> brenden.matth...@airbedandbreakfast.com> wrote:
> > > >>
> > > >> > Hey guys,
> > > >> >
> > > > >> > I'm currently having a problem where tasks will get stuck in the
> > > > >> > staging state, though according to the logs they should have been
> > > > >> > terminated. They hang indefinitely, or until I restart the slave.
> > > > >> > Below is a screenshot + logs. Also interesting is the 'Failed to
> > > > >> > collect resource usage ...' messages.
> > > >> >
> > > >> > [image: Inline image 2]
> > > >> >
> > > > >> >> I0705 16:19:51.551512  9706 slave.cpp:739] Got assigned task
> > > > >> >> ct:1373041190990:0:add_latest_reservation_survey_events_partition
> > > > >> >> for framework chronos
> > > > >> >> I0705 16:19:51.552150  9706 slave.cpp:837] Launching task
> > > > >> >> ct:1373041190990:0:add_latest_reservation_survey_events_partition
> > > > >> >> for framework chronos
> > > > >> >> I0705 16:19:51.553956  9706 paths.hpp:303] Created executor directory
> > > > >> >> '/tmp/mesos/slaves/201307030043-2037266954-5050-15277-1517/frameworks/chronos/executors/ct:1373041190990:0:add_latest_reservation_survey_events_partition/runs/611ba128-557f-4b5e-8cf2-4d1ce60d618f'
> > > > >> >> I0705 16:19:51.554576  9706 slave.cpp:948] Queuing task
> > > > >> >> 'ct:1373041190990:0:add_latest_reservation_survey_events_partition'
> > > > >> >> for executor
> > > > >> >> ct:1373041190990:0:add_latest_reservation_survey_events_partition of
> > > > >> >> framework 'chronos
> > > > >> >> I0705 16:19:51.555027  9706 slave.cpp:511] Successfully attached file
> > > > >> >> '/tmp/mesos/slaves/201307030043-2037266954-5050-15277-1517/frameworks/chronos/executors/ct:1373041190990:0:add_latest_reservation_survey_events_partition/runs/611ba128-557f-4b5e-8cf2-4d1ce60d618f'
> > > > >> >> I0705 16:19:54.048754  9724 slave.cpp:2530] Current usage 42.18%. Max
> > > > >> >> allowed age: 22.955009563956388hrs
> > > > >> >> W0705 16:19:54.108963  9724 monitor.cpp:186] Failed to collect resource
> > > > >> >> usage for executor 'executor_Task_Tracker_8023' of framework
> > > > >> >> '201307030043-2037266954-5050-15277-0006': Future discarded
> > > > >> >> W0705 16:19:59.110787  9729 monitor.cpp:186] Failed to collect resource
> > > > >> >> usage for executor 'executor_Task_Tracker_8023' of framework
> > > > >> >> '201307030043-2037266954-5050-15277-0006': Future discarded
> > > > >> >> W0705 16:20:04.112406  9704 monitor.cpp:186] Failed to collect resource
> > > > >> >> usage for executor 'executor_Task_Tracker_8023' of framework
> > > > >> >> '201307030043-2037266954-5050-15277-0006': Future discarded
> > > > >> >> W0705 16:20:09.114367  9705 monitor.cpp:186] Failed to collect resource
> > > > >> >> usage for executor 'executor_Task_Tracker_8023' of framework
> > > > >> >> '201307030043-2037266954-5050-15277-0006': Future discarded
> > > > >> >> W0705 16:20:14.116312  9706 monitor.cpp:186] Failed to collect resource
> > > > >> >> usage for executor 'executor_Task_Tracker_8023' of framework
> > > > >> >> '201307030043-2037266954-5050-15277-0006': Future discarded
> > > > >> >> W0705 16:20:19.118370  9699 monitor.cpp:186] Failed to collect resource
> > > > >> >> usage for executor 'executor_Task_Tracker_8023' of framework
> > > > >> >> '201307030043-2037266954-5050-15277-0006': Future discarded
> > > > >> >> W0705 16:20:24.120311  9701 monitor.cpp:186] Failed to collect resource
> > > > >> >> usage for executor 'executor_Task_Tracker_8023' of framework
> > > > >> >> '201307030043-2037266954-5050-15277-0006': Future discarded
> > > > >> >> W0705 16:20:29.122355  9700 monitor.cpp:186] Failed to collect resource
> > > > >> >> usage for executor 'executor_Task_Tracker_8023' of framework
> > > > >> >> '201307030043-2037266954-5050-15277-0006': Future discarded
> > > > >> >> W0705 16:20:34.123443  9722 monitor.cpp:186] Failed to collect resource
> > > > >> >> usage for executor 'executor_Task_Tracker_8023' of framework
> > > > >> >> '201307030043-2037266954-5050-15277-0006': Future discarded
> > > > >> >> W0705 16:20:39.125660  9718 monitor.cpp:186] Failed to collect resource
> > > > >> >> usage for executor 'executor_Task_Tracker_8023' of framework
> > > > >> >> '201307030043-2037266954-5050-15277-0006': Future discarded
> > > > >> >> W0705 16:20:44.127464  9724 monitor.cpp:186] Failed to collect resource
> > > > >> >> usage for executor 'executor_Task_Tracker_8023' of framework
> > > > >> >> '201307030043-2037266954-5050-15277-0006': Future discarded
> > > > >> >> W0705 16:20:49.129385  9725 monitor.cpp:186] Failed to collect resource
> > > > >> >> usage for executor 'executor_Task_Tracker_8023' of framework
> > > > >> >> '201307030043-2037266954-5050-15277-0006': Future discarded
> > > > >> >> I0705 16:20:51.555174  9703 slave.cpp:2482] Terminating executor
> > > > >> >> ct:1373041190990:0:add_latest_reservation_survey_events_partition of
> > > > >> >> framework chronos because it did not register within 1mins
> > > > >> >> I0705 16:20:54.050434  9717 slave.cpp:2530] Current usage 42.18%. Max
> > > > >> >> allowed age: 22.955009342481944hrs
> > > > >> >> W0705 16:20:54.130730  9699 monitor.cpp:186] Failed to collect resource
> > > > >> >> usage for executor 'executor_Task_Tracker_8023' of framework
> > > > >> >> '201307030043-2037266954-5050-15277-0006': Future discarded
> > > > >> >> W0705 16:20:59.132472  9702 monitor.cpp:186] Failed to collect resource
> > > > >> >> usage for executor 'executor_Task_Tracker_8023' of framework
> > > > >> >> '201307030043-2037266954-5050-15277-0006': Future discarded
> > > > >> >> W0705 16:21:04.134557  9713 monitor.cpp:186] Failed to collect resource
> > > > >> >> usage for executor 'executor_Task_Tracker_8023' of framework
> > > > >> >> '201307030043-2037266954-5050-15277-0006': Future discarded
> > > > >> >> W0705 16:21:09.135619  9701 monitor.cpp:186] Failed to collect resource
> > > > >> >> usage for executor 'executor_Task_Tracker_8023' of framework
> > > > >> >> '201307030043-2037266954-5050-15277-0006': Future discarded
> > > >> >
> > > >> >
> > > >> >
> > > >>
> > > >
> > > >
> > >
> >
>
