hey guodong, which of these task(s) is stuck in STAGING? also, the corresponding master's logs would also be helpful here. also which version of mesos are you running?
On Mon, Jul 8, 2013 at 11:02 PM, 王国栋 <wangg...@gmail.com> wrote: > It is very interesting that there are these logs. > > I0709 00:33:43.833853 11002 slave.cpp:996] Asked to kill task > Task_Tracker_224 of framework 201307040929-252063498-5050-27411-0000 > I0709 00:33:43.835552 11006 slave.cpp:996] Asked to kill task > Task_Tracker_230 of framework 201307040929-252063498-5050-27411-0000 > I0709 00:33:43.972771 10994 slave.cpp:1692] Handling status update > TASK_FINISHED (UUID: 372081cc-edf2-4183-a461-9345ab6d279c) for task > Task_Tracker_230 of framework 201307040929-252063498-5050-27411-0000 > from executor(1)@10.47.6.21:27786 > I0709 00:33:43.973132 10994 status_update_manager.cpp:290] Received status > update TASK_FINISHED (UUID: 372081cc-edf2-4183-a461-9345ab6d279c) for task > Task_Tracker_230 of framework 201307040929-252063498-5 > 050-27411-0000 with checkpoint=false > I0709 00:33:43.973192 10994 status_update_manager.cpp:336] Forwarding > status update TASK_FINISHED (UUID: 372081cc-edf2-4183-a461-9345ab6d279c) > for task Task_Tracker_230 of framework 201307040929-252063498 > -5050-27411-0000 to master@10.47.6.15:5050 > I0709 00:33:43.973846 11005 slave.cpp:1809] Sending acknowledgement for > status update TASK_FINISHED (UUID: 372081cc-edf2-4183-a461-9345ab6d279c) > for task Task_Tracker_230 of framework 201307040929-2520634 > 98-5050-27411-0000 to executor(1)@10.47.6.21:27786 > I0709 00:33:43.974591 11000 status_update_manager.cpp:360] Received status > update acknowledgement 372081cc-edf2-4183-a461-9345ab6d279c for task > Task_Tracker_230 of framework 201307040929-252063498-5050-27 > 411-0000 > I0709 00:33:43.974652 11000 status_update_manager.cpp:481] Cleaning up > status update stream for task Task_Tracker_230 of framework > 201307040929-252063498-5050-27411-0000 > I0709 00:33:44.090603 11003 slave.cpp:1692] Handling status update > TASK_FINISHED (UUID: 61d5775a-2375-412a-a5a4-80ab55163d88) for task > Task_Tracker_224 of framework 201307040929-252063498-5050-27411-0000 > from executor(1)@10.47.6.21:2310 > I0709 00:33:44.090860 11003 status_update_manager.cpp:290] Received status > update TASK_FINISHED (UUID: 61d5775a-2375-412a-a5a4-80ab55163d88) for task > Task_Tracker_224 of framework 201307040929-252063498-5 > 050-27411-0000 with checkpoint=false > I0709 00:33:44.090973 11003 status_update_manager.cpp:336] Forwarding > status update TASK_FINISHED (UUID: 61d5775a-2375-412a-a5a4-80ab55163d88) > for task Task_Tracker_224 of framework 201307040929-252063498 > -5050-27411-0000 to master@10.47.6.15:5050 > I0709 00:33:44.091279 11003 slave.cpp:1809] Sending acknowledgement for > status update TASK_FINISHED (UUID: 61d5775a-2375-412a-a5a4-80ab55163d88) > for task Task_Tracker_224 of framework 201307040929-2520634 > 98-5050-27411-0000 to executor(1)@10.47.6.21:2310 > I0709 00:33:44.093286 11003 status_update_manager.cpp:360] Received status > update acknowledgement 61d5775a-2375-412a-a5a4-80ab55163d88 for task > Task_Tracker_224 of framework 201307040929-252063498-5050-27 > 411-0000 > I0709 00:33:44.093359 11003 status_update_manager.cpp:481] Cleaning up > status update stream for task Task_Tracker_224 of framework > 201307040929-252063498-5050-27411-0000 > I0709 00:33:45.259831 10997 slave.cpp:2511] Current usage 57.44%. Max > allowed age: 2.279168852469954days > W0709 00:33:45.949470 10996 monitor.cpp:186] Failed to collect resource > usage for executor 'executor_Task_Tracker_224' of framework > '201307040929-252063498-5050-27411-0000': Future discarded > W0709 00:33:47.063181 11005 monitor.cpp:186] Failed to collect resource > usage for executor 'executor_Task_Tracker_230' of framework > '201307040929-252063498-5050-27411-0000': Future discarded > W0709 00:33:50.950412 11000 monitor.cpp:186] Failed to collect resource > usage for executor 'executor_Task_Tracker_224' of framework > '201307040929-252063498-5050-27411-0000': Future discarded > W0709 00:33:52.063576 10993 monitor.cpp:186] Failed to collect resource > usage for executor 'executor_Task_Tracker_230' of framework > '201307040929-252063498-5050-27411-0000': Future discarded > W0709 00:33:55.951427 11003 monitor.cpp:186] Failed to collect resource > usage for executor 'executor_Task_Tracker_224' of framework > '201307040929-252063498-5050-27411-0000': Future discarded > W0709 00:33:57.064575 10998 monitor.cpp:186] Failed to collect resource > usage for executor 'executor_Task_Tracker_230' of framework > '201307040929-252063498-5050-27411-0000': Future discarded > W0709 00:34:00.952390 11003 monitor.cpp:186] Failed to collect resource > usage for executor 'executor_Task_Tracker_224' of framework > '201307040929-252063498-5050-27411-0000': Future discarded > W0709 00:34:02.065218 10998 monitor.cpp:186] Failed to collect resource > usage for executor 'executor_Task_Tracker_230' of framework > '201307040929-252063498-5050-27411-0000': Future discarded > W0709 00:34:05.953456 10995 monitor.cpp:186] Failed to collect resource > usage for executor 'executor_Task_Tracker_224' of framework > '201307040929-252063498-5050-27411-0000': Future discarded > W0709 00:34:07.066515 10995 monitor.cpp:186] Failed to collect resource > usage for executor 'executor_Task_Tracker_230' of framework > '201307040929-252063498-5050-27411-0000': Future discarded > W0709 00:34:10.954479 10998 monitor.cpp:186] Failed to collect resource > usage for executor 'executor_Task_Tracker_224' of framework > '201307040929-252063498-5050-27411-0000': Future discarded > W0709 00:34:12.067471 11005 monitor.cpp:186] Failed to collect resource > usage for executor 'executor_Task_Tracker_230' of framework > '201307040929-252063498-5050-27411-0000': Future discarded > W0709 00:34:15.955461 10996 monitor.cpp:186] Failed to collect resource > usage for executor 'executor_Task_Tracker_224' of framework > '201307040929-252063498-5050-27411-0000': Future discarded > W0709 00:34:17.068209 10996 monitor.cpp:186] Failed to collect resource > usage for executor 'executor_Task_Tracker_230' of framework > '201307040929-252063498-5050-27411-0000': Future discarded > > > > Guodong > > > On Tue, Jul 9, 2013 at 1:59 PM, 王国栋 <wangg...@gmail.com> wrote: > > > Hi ben, > > > > I ran into the same issue here. > > > > This also happens in our hadoop framework. The slave log is like these. > At > > that time, I think the work load of the node is very high. > > > > I0708 23:36:44.253880 11005 slave.cpp:739] Got assigned task > > Task_Tracker_224 for framework 201307040929-252063498-5050-27411-0000 > > I0708 23:36:44.255221 10999 gc.cpp:84] Unscheduling > > > '/data/mesos-slave-work-dir//slaves/201307041648-252063498-5050-8038-5/frameworks/201307040929-252063498-5050-27411-0000' > > for removal > > I0708 23:36:44.256206 11001 slave.cpp:837] Launching task > Task_Tracker_224 > > for framework 201307040929-252063498-5050-27411-0000 > > I0708 23:36:44.258117 11001 paths.hpp:303] Created executor directory > > > '/data/mesos-slave-work-dir//slaves/201307041648-252063498-5050-8038-5/frameworks/201307040929-252063498-5050-27411-0000/executors/exe > > cutor_Task_Tracker_224/runs/953d3565-424c-4ab3-9926-a3fa71042bf1' > > I0708 23:36:44.258467 10991 process_isolator.cpp:99] Launching > > executor_Task_Tracker_224 (cd hadoop && ./bin/mesos-executor) in > > > /data/mesos-slave-work-dir//slaves/201307041648-252063498-5050-8038-5/framew > > > orks/201307040929-252063498-5050-27411-0000/executors/executor_Task_Tracker_224/runs/953d3565-424c-4ab3-9926-a3fa71042bf1 > > with resources cpus=1; mem=1280' for framework > > 201307040929-252063498-5050-27411-0 > > 000 > > I0708 23:36:44.258496 11001 slave.cpp:948] Queuing task > 'Task_Tracker_224' > > for executor executor_Task_Tracker_224 of framework > > '201307040929-252063498-5050-27411-0000 > > I0708 23:36:44.261446 10991 process_isolator.cpp:161] Forked executor at > > 2220 > > I0708 23:36:44.261787 10996 slave.cpp:511] Successfully attached file > > > '/data/mesos-slave-work-dir//slaves/201307041648-252063498-5050-8038-5/frameworks/201307040929-252063498-5050-27411-0000/executors/exe > > cutor_Task_Tracker_224/runs/953d3565-424c-4ab3-9926-a3fa71042bf1' > > I0708 23:36:44.580497 10996 slave.cpp:2511] Current usage 57.21%. Max > > allowed age: 2.295155852123924days > > I0708 23:36:44.750393 11002 slave.cpp:1395] Got registration for executor > > 'executor_Task_Tracker_224' of framework > > 201307040929-252063498-5050-27411-0000 > > I0708 23:36:44.751095 11002 slave.cpp:1510] Flushing queued task > > Task_Tracker_224 for executor 'executor_Task_Tracker_224' of framework > > 201307040929-252063498-5050-27411-0000 > > I0708 23:36:46.144317 11006 slave.cpp:1692] Handling status update > > TASK_RUNNING (UUID: 364ee347-f6a2-4c7b-8702-460aa0ece579) for task > > Task_Tracker_224 of framework 201307040929-252063498-5050-27411-0000 f > > rom executor(1)@10.47.6.21:2310 > > I0708 23:36:46.144745 11006 status_update_manager.cpp:290] Received > status > > update TASK_RUNNING (UUID: 364ee347-f6a2-4c7b-8702-460aa0ece579) for task > > Task_Tracker_224 of framework 201307040929-252063498-50 > > 50-27411-0000 with checkpoint=false > > I0708 23:36:46.144821 11006 status_update_manager.cpp:450] Creating > > StatusUpdate stream for task Task_Tracker_224 of framework > > 201307040929-252063498-5050-27411-0000 > > I0708 23:36:46.145076 11006 status_update_manager.cpp:336] Forwarding > > status update TASK_RUNNING (UUID: 364ee347-f6a2-4c7b-8702-460aa0ece579) > for > > task Task_Tracker_224 of framework 201307040929-252063498- > > 5050-27411-0000 to master@10.47.6.15:5050 > > I0708 23:36:46.145882 10997 slave.cpp:1809] Sending acknowledgement for > > status update TASK_RUNNING (UUID: 364ee347-f6a2-4c7b-8702-460aa0ece579) > for > > task Task_Tracker_224 of framework 201307040929-25206349 > > 8-5050-27411-0000 to executor(1)@10.47.6.21:2310 > > I0708 23:36:46.146870 10993 status_update_manager.cpp:360] Received > status > > update acknowledgement 364ee347-f6a2-4c7b-8702-460aa0ece579 for task > > Task_Tracker_224 of framework 201307040929-252063498-5050-27 > > 411-0000 > > I0708 23:36:50.258347 11005 slave.cpp:739] Got assigned task > > Task_Tracker_230 for framework 201307040929-252063498-5050-27411-0000 > > I0708 23:36:50.259472 11005 slave.cpp:837] Launching task > Task_Tracker_230 > > for framework 201307040929-252063498-5050-27411-0000 > > I0708 23:36:50.261641 11005 paths.hpp:303] Created executor directory > > > '/data/mesos-slave-work-dir//slaves/201307041648-252063498-5050-8038-5/frameworks/201307040929-252063498-5050-27411-0000/executors/exe > > cutor_Task_Tracker_230/runs/1dc87acc-d090-469f-ba30-0477139ee7fd' > > I0708 23:36:50.262265 11005 slave.cpp:948] Queuing task > 'Task_Tracker_230' > > for executor executor_Task_Tracker_230 of framework > > '201307040929-252063498-5050-27411-0000 > > I0708 23:36:50.262557 11005 process_isolator.cpp:99] Launching > > executor_Task_Tracker_230 (cd hadoop && ./bin/mesos-executor) in > > > /data/mesos-slave-work-dir//slaves/201307041648-252063498-5050-8038-5/framew > > > orks/201307040929-252063498-5050-27411-0000/executors/executor_Task_Tracker_230/runs/1dc87acc-d090-469f-ba30-0477139ee7fd > > with resources cpus=1; mem=1280' for framework > > 201307040929-252063498-5050-27411-0 > > 000 > > I0708 23:36:50.265396 10999 slave.cpp:511] Successfully attached file > > > '/data/mesos-slave-work-dir//slaves/201307041648-252063498-5050-8038-5/frameworks/201307040929-252063498-5050-27411-0000/executors/exe > > cutor_Task_Tracker_230/runs/1dc87acc-d090-469f-ba30-0477139ee7fd' > > I0708 23:36:50.265419 11005 process_isolator.cpp:161] Forked executor at > > 2851 > > I0708 23:36:50.835607 10995 slave.cpp:1395] Got registration for executor > > 'executor_Task_Tracker_230' of framework > > 201307040929-252063498-5050-27411-0000 > > I0708 23:36:50.836174 10995 slave.cpp:1510] Flushing queued task > > Task_Tracker_230 for executor 'executor_Task_Tracker_230' of framework > > 201307040929-252063498-5050-27411-0000 > > I0708 23:36:54.617856 10994 slave.cpp:1692] Handling status update > > TASK_RUNNING (UUID: 7753252d-c90b-4b0d-adca-7c97f38f692e) for task > > Task_Tracker_230 of framework 201307040929-252063498-5050-27411-0000 f > > rom executor(1)@10.47.6.21:27786 > > I0708 23:36:54.618275 10994 status_update_manager.cpp:290] Received > status > > update TASK_RUNNING (UUID: 7753252d-c90b-4b0d-adca-7c97f38f692e) for task > > Task_Tracker_230 of framework 201307040929-252063498-50 > > 50-27411-0000 with checkpoint=false > > I0708 23:36:54.618326 10994 status_update_manager.cpp:450] Creating > > StatusUpdate stream for task Task_Tracker_230 of framework > > 201307040929-252063498-5050-27411-0000 > > I0708 23:36:54.618443 10994 status_update_manager.cpp:336] Forwarding > > status update TASK_RUNNING (UUID: 7753252d-c90b-4b0d-adca-7c97f38f692e) > for > > task Task_Tracker_230 of framework 201307040929-252063498- > > 5050-27411-0000 to master@10.47.6.15:5050 > > I0708 23:36:54.619137 10994 slave.cpp:1809] Sending acknowledgement for > > status update TASK_RUNNING (UUID: 7753252d-c90b-4b0d-adca-7c97f38f692e) > for > > task Task_Tracker_230 of framework 201307040929-25206349 > > 8-5050-27411-0000 to executor(1)@10.47.6.21:27786 > > I0708 23:36:54.637682 10994 status_update_manager.cpp:360] Received > status > > update acknowledgement 7753252d-c90b-4b0d-adca-7c97f38f692e for task > > Task_Tracker_230 of framework 201307040929-252063498-5050-27 > > 411-0000 > > I0708 23:37:44.583014 11002 slave.cpp:2511] Current usage 57.23%. Max > > allowed age: 2.293704423241597days > > I0708 23:38:44.585233 11003 slave.cpp:2511] Current usage 57.23%. Max > > allowed age: 2.293703916528542days > > I0708 23:39:44.599442 11006 slave.cpp:2511] Current usage 57.23%. Max > > allowed age: 2.293639867998055days > > I0708 23:40:44.603996 10997 slave.cpp:2511] Current usage 57.24%. Max > > allowed age: 2.292921551567535days > > I0708 23:41:44.608608 11006 slave.cpp:2511] Current usage 57.26%. Max > > allowed age: 2.291521098018820days > > I0708 23:42:44.609956 10992 slave.cpp:2511] Current usage 57.23%. Max > > allowed age: 2.293668041244063days > > I0708 23:43:44.682621 11000 slave.cpp:2511] Current usage 57.24%. Max > > allowed age: 2.292935638190544days > > I0708 23:44:44.684306 10993 slave.cpp:2511] Current usage 57.24%. Max > > allowed age: 2.292916079066516days > > I0708 23:45:44.686172 11001 slave.cpp:2511] Current usage 57.26%. Max > > allowed age: 2.291485324076945days > > I0708 23:46:44.699095 10995 slave.cpp:2511] Current usage 57.23%. Max > > allowed age: 2.293641894850289days > > I0708 23:47:44.721156 10998 slave.cpp:2511] Current usage 57.23%. Max > > allowed age: 2.293629429709074days > > I0708 23:48:44.779767 10992 slave.cpp:2511] Current usage 57.24%. Max > > allowed age: 2.293525350847025days > > I0708 23:49:44.812389 11004 slave.cpp:2511] Current usage 57.24%. Max > > allowed age: 2.292909289111539days > > I0708 23:50:44.814146 10999 slave.cpp:2511] Current usage 57.27%. Max > > allowed age: 2.291438098419977days > > I0708 23:51:44.814877 11005 slave.cpp:2511] Current usage 57.23%. Max > > allowed age: 2.293635104895313days > > I0708 23:52:44.818620 10998 slave.cpp:2511] Current usage 57.24%. Max > > allowed age: 2.292983775931019days > > I0708 23:53:44.829911 10997 slave.cpp:2511] Current usage 57.33%. Max > > allowed age: 2.286910009194236days > > I0708 23:54:44.831307 10999 slave.cpp:2511] Current usage 57.33%. Max > > allowed age: 2.286909502481169days > > I0708 23:55:44.902858 10994 slave.cpp:2511] Current usage 57.37%. Max > > allowed age: 2.284414244700093days > > I0708 23:56:44.905398 11002 slave.cpp:2511] Current usage 57.42%. Max > > allowed age: 2.280636901540567days > > I0708 23:57:44.933673 10991 slave.cpp:2511] Current usage 57.44%. Max > > allowed age: 2.279481899796968days > > I0708 23:58:44.934840 11004 slave.cpp:2511] Current usage 57.48%. Max > > allowed age: 2.276566475548496days > > I0708 23:59:44.936063 11001 slave.cpp:2511] Current usage 57.49%. Max > > allowed age: 2.275690368671817days > > I0709 00:00:44.937433 11004 slave.cpp:2511] Current usage 57.50%. Max > > allowed age: 2.275057180034989days > > I0709 00:01:44.938940 11001 slave.cpp:2511] Current usage 57.51%. Max > > allowed age: 2.273999467198449days > > I0709 00:02:44.955103 10996 slave.cpp:2511] Current usage 57.52%. Max > > allowed age: 2.273472384275891days > > I0709 00:03:44.956354 10993 slave.cpp:2511] Current usage 57.39%. Max > > allowed age: 2.282894612240220days > > I0709 00:04:44.957926 10997 slave.cpp:2511] Current usage 57.40%. Max > > allowed age: 2.281966516603831days > > I0709 00:05:44.969205 10996 slave.cpp:2511] Current usage 57.40%. Max > > allowed age: 2.281962260214144days > > I0709 00:06:44.969987 10992 slave.cpp:2511] Current usage 57.40%. Max > > allowed age: 2.281791801941551days > > I0709 00:07:44.977504 11004 slave.cpp:2511] Current usage 57.40%. Max > > allowed age: 2.281715288269849days > > I0709 00:08:44.982868 10998 slave.cpp:2511] Current usage 57.40%. Max > > allowed age: 2.281699782850289days > > I0709 00:09:44.997082 11000 slave.cpp:2511] Current usage 57.42%. Max > > allowed age: 2.280776044946192days > > I0709 00:10:44.998754 10994 slave.cpp:2511] Current usage 57.42%. Max > > allowed age: 2.280772193926956days > > I0709 00:11:44.999949 11002 slave.cpp:2511] Current usage 57.44%. Max > > allowed age: 2.279204525069213days > > I0709 00:12:45.001539 10995 slave.cpp:2511] Current usage 57.47%. Max > > allowed age: 2.277132676719109days > > I0709 00:13:45.002728 10992 slave.cpp:2511] Current usage 57.43%. Max > > allowed age: 2.280012428368322days > > I0709 00:14:45.009699 10998 slave.cpp:2511] Current usage 57.48%. Max > > allowed age: 2.276733690857512days > > I0709 00:15:45.013483 10996 slave.cpp:2511] Current usage 57.53%. Max > > allowed age: 2.272715152282546days > > I0709 00:16:45.015496 10998 slave.cpp:2511] Current usage 57.57%. Max > > allowed age: 2.270354274804352days > > I0709 00:17:45.016628 11000 slave.cpp:2511] Current usage 57.62%. Max > > allowed age: 2.266927678423322days > > I0709 00:18:45.032670 11002 slave.cpp:2511] Current usage 57.65%. Max > > allowed age: 2.264218182361482days > > I0709 00:19:45.043442 10998 slave.cpp:2511] Current usage 57.69%. Max > > allowed age: 2.261509598383137days > > I0709 00:20:45.080648 10992 slave.cpp:2511] Current usage 57.72%. Max > > allowed age: 2.259379478031400days > > I0709 00:21:45.081632 10995 slave.cpp:2511] Current usage 57.77%. Max > > allowed age: 2.255819920144039days > > I0709 00:22:45.082593 11005 slave.cpp:2511] Current usage 57.81%. Max > > allowed age: 2.253314528101817days > > I0709 00:23:45.193588 10997 slave.cpp:2511] Current usage 57.85%. Max > > allowed age: 2.250524870034248days > > I0709 00:24:45.220617 10994 slave.cpp:2511] Current usage 57.90%. Max > > allowed age: 2.246784618270532days > > I0709 00:25:45.241602 10992 slave.cpp:2511] Current usage 57.97%. Max > > allowed age: 2.242399422127049days > > I0709 00:26:45.248977 11000 slave.cpp:2511] Current usage 58.00%. Max > > allowed age: 2.240250654734792days > > I0709 00:27:45.250953 10993 slave.cpp:2511] Current usage 57.99%. Max > > allowed age: 2.240516983117894days > > I0709 00:28:45.252694 10996 slave.cpp:2511] Current usage 58.06%. Max > > allowed age: 2.235834143724352days > > I0709 00:29:45.254992 11003 slave.cpp:2511] Current usage 58.10%. Max > > allowed age: 2.233297436815162days > > W0709 00:30:06.753098 10999 monitor.cpp:186] Failed to collect resource > > usage for executor 'executor_Task_Tracker_230' of framework > > '201307040929-252063498-5050-27411-0000': Future discarded > > W0709 00:30:10.715373 10996 monitor.cpp:186] Failed to collect resource > > usage for executor 'executor_Task_Tracker_224' of framework > > '201307040929-252063498-5050-27411-0000': Future discarded > > W0709 00:30:11.754446 11003 monitor.cpp:186] Failed to collect resource > > usage for executor 'executor_Task_Tracker_230' of framework > > '201307040929-252063498-5050-27411-0000': Future discarded > > W0709 00:30:15.719880 11003 monitor.cpp:186] Failed to collect resource > > usage for executor 'executor_Task_Tracker_224' of framework > > '201307040929-252063498-5050-27411-0000': Future discarded > > W0709 00:30:16.755473 11003 monitor.cpp:186] Failed to collect resource > > usage for executor 'executor_Task_Tracker_230' of framework > > '201307040929-252063498-5050-27411-0000': Future discarded > > W0709 00:30:20.720330 11003 monitor.cpp:186] Failed to collect resource > > usage for executor 'executor_Task_Tracker_224' of framework > > '201307040929-252063498-5050-27411-0000': Future discarded > > W0709 00:30:21.766019 11003 monitor.cpp:186] Failed to collect resource > > usage for executor 'executor_Task_Tracker_230' of framework > > '201307040929-252063498-5050-27411-0000': Future discarded > > W0709 00:30:25.721364 11003 monitor.cpp:186] Failed to collect resource > > usage for executor 'executor_Task_Tracker_224' of framework > > '201307040929-252063498-5050-27411-0000': Future discarded > > W0709 00:30:26.768874 11003 monitor.cpp:186] Failed to collect resource > > usage for executor 'executor_Task_Tracker_230' of framework > > '201307040929-252063498-5050-27411-0000': Future discarded > > W0709 00:30:30.722605 11003 monitor.cpp:186] Failed to collect resource > > usage for executor 'executor_Task_Tracker_224' of framework > > '201307040929-252063498-5050-27411-0000': Future discarded > > W0709 00:30:31.770354 11003 monitor.cpp:186] Failed to collect resource > > usage for executor 'executor_Task_Tracker_230' of framework > > '201307040929-252063498-5050-27411-0000': Future discarded > > W0709 00:30:35.724455 10992 monitor.cpp:186] Failed to collect resource > > usage for executor 'executor_Task_Tracker_224' of framework > > '201307040929-252063498-5050-27411-0000': Future discarded > > W0709 00:30:36.788751 10992 monitor.cpp:186] Failed to collect resource > > usage for executor 'executor_Task_Tracker_230' of framework > > '201307040929-252063498-5050-27411-0000': Future discarded > > W0709 00:30:40.745380 10992 monitor.cpp:186] Failed to collect resource > > usage for executor 'executor_Task_Tracker_224' of framework > > '201307040929-252063498-5050-27411-0000': Future discarded > > W0709 00:30:41.789358 10992 monitor.cpp:186] Failed to collect resource > > usage for executor 'executor_Task_Tracker_230' of framework > > '201307040929-252063498-5050-27411-0000': Future discarded > > I0709 00:30:45.256590 11004 slave.cpp:2511] Current usage 58.11%. Max > > allowed age: 2.232469873049410days > > > > > > Guodong > > > > > > On Tue, Jul 9, 2013 at 4:55 AM, Benjamin Mahler < > benjamin.mah...@gmail.com > > > wrote: > > > >> Are these the un-edited logs? I'm expecting to see some logs from the > >> process_isolator or cgroups_isolator in there. > >> > >> > >> On Fri, Jul 5, 2013 at 2:38 PM, Brenden Matthews < > >> brenden.matth...@airbedandbreakfast.com> wrote: > >> > >> > Hey guys, > >> > > >> > I'm currently having a problem where tasks will get stuck in the > staging > >> > state, though according to the logs they should have been terminated. > >> They > >> > hang indefinitely, or until I restart the slave. Below is a > screenshot > >> + > >> > logs. Also interesting is the 'Failed to collect resource usage ...' > >> > messages. > >> > > >> > [image: Inline image 2] > >> > > >> > I0705 16:19:51.551512 9706 slave.cpp:739] Got assigned task > >> >> ct:1373041190990:0:add_latest_reservation_survey_events_partitio > >> >> n for framework chronos > >> >> I0705 16:19:51.552150 9706 slave.cpp:837] Launching task > >> >> ct:1373041190990:0:add_latest_reservation_survey_events_partition f > >> >> or framework chronos > >> >> I0705 16:19:51.553956 9706 paths.hpp:303] Created executor directory > >> >> '/tmp/mesos/slaves/201307030043-2037266954-5050-15277-1 > >> >> > >> >> > >> > 517/frameworks/chronos/executors/ct:1373041190990:0:add_latest_reservation_survey_events_partition/runs/611ba128-557f-4b5e-8c > >> >> f2-4d1ce60d618f' > >> >> I0705 16:19:51.554576 9706 slave.cpp:948] Queuing task > >> >> 'ct:1373041190990:0:add_latest_reservation_survey_events_partition' f > >> >> or executor > >> >> ct:1373041190990:0:add_latest_reservation_survey_events_partition of > >> >> framework 'c > >> >> hronos > >> >> I0705 16:19:51.555027 9706 slave.cpp:511] Successfully attached file > >> >> > >> > '/tmp/mesos/slaves/201307030043-2037266954-5050-15277-1517/frameworks/chronos/executors/ct:1373041190990:0:add_latest_reservation_survey_events_partition/runs/611ba128-557f-4b5e-8cf2-4d1ce60d618f' > >> >> I0705 16:19:54.048754 9724 slave.cpp:2530] Current usage 42.18%. Max > >> >> allowed age: 22.955009563956388hrs > >> >> W0705 16:19:54.108963 9724 monitor.cpp:186] Failed to collect > resource > >> >> usage for executor 'executor_Task_Tracker_8023' of framework > >> >> '201307030043-2037266954-5050-15277-0006': Future discarded > >> >> W0705 16:19:59.110787 9729 monitor.cpp:186] Failed to collect > resource > >> >> usage for executor 'executor_Task_Tracker_8023' of framework > >> >> '201307030043-2037266954-5050-15277-0006': Future discarded > >> >> W0705 16:20:04.112406 9704 monitor.cpp:186] Failed to collect > >> resource > >> >> usage for executor 'executor_Task_Tracker_8023' of framework > >> >> '201307030043-2037266954-5050-15277-0006': Future discarded > >> >> W0705 16:20:09.114367 9705 monitor.cpp:186] Failed to collect > resource > >> >> usage for executor 'executor_Task_Tracker_8023' of framework > >> >> '201307030043-2037266954-5050-15277-0006': Future discarded > >> >> W0705 16:20:14.116312 9706 monitor.cpp:186] Failed to collect > resource > >> >> usage for executor 'executor_Task_Tracker_8023' of framework > >> >> '201307030043-2037266954-5050-15277-0006': Future discarded > >> >> W0705 16:20:19.118370 9699 monitor.cpp:186] Failed to collect > resource > >> >> usage for executor 'executor_Task_Tracker_8023' of framework > >> >> '201307030043-2037266954-5050-15277-0006': Future discarded > >> >> W0705 16:20:24.120311 9701 monitor.cpp:186] Failed to collect > resource > >> >> usage for executor 'executor_Task_Tracker_8023' of framework > >> >> '201307030043-2037266954-5050-15277-0006': Future discarded > >> >> W0705 16:20:29.122355 9700 monitor.cpp:186] Failed to collect > resource > >> >> usage for executor 'executor_Task_Tracker_8023' of framework > >> >> '201307030043-2037266954-5050-15277-0006': Future discarded > >> >> W0705 16:20:34.123443 9722 monitor.cpp:186] Failed to collect > resource > >> >> usage for executor 'executor_Task_Tracker_8023' of framework > >> >> '201307030043-2037266954-5050-15277-0006': Future discarded > >> >> W0705 16:20:39.125660 9718 monitor.cpp:186] Failed to collect > resource > >> >> usage for executor 'executor_Task_Tracker_8023' of framework > >> >> '201307030043-2037266954-5050-15277-0006': Future discarded > >> >> W0705 16:20:44.127464 9724 monitor.cpp:186] Failed to collect > resource > >> >> usage for executor 'executor_Task_Tracker_8023' of framework > >> >> '201307030043-2037266954-5050-15277-0006': Future discarded > >> >> W0705 16:20:49.129385 9725 monitor.cpp:186] Failed to collect > resource > >> >> usage for executor 'executor_Task_Tracker_8023' of framework > >> >> '201307030043-2037266954-5050-15277-0006': Future discarded > >> >> I0705 16:20:51.555174 9703 slave.cpp:2482] Terminating executor > >> >> ct:1373041190990:0:add_latest_reservation_survey_events_partition of > >> >> framework chronos because it did not register within 1mins > >> >> I0705 16:20:54.050434 9717 slave.cpp:2530] Current usage 42.18%. Max > >> >> allowed age: 22.955009342481944hrs > >> >> W0705 16:20:54.130730 9699 monitor.cpp:186] Failed to collect > resource > >> >> usage for executor 'executor_Task_Tracker_8023' of framework > >> >> '201307030043-2037266954-5050-15277-0006': Future discarded > >> >> W0705 16:20:59.132472 9702 monitor.cpp:186] Failed to collect > resource > >> >> usage for executor 'executor_Task_Tracker_8023' of framework > >> >> '201307030043-2037266954-5050-15277-0006': Future discarded > >> >> W0705 16:21:04.134557 9713 monitor.cpp:186] Failed to collect > resource > >> >> usage for executor 'executor_Task_Tracker_8023' of framework > >> >> '201307030043-2037266954-5050-15277-0006': Future discarded > >> >> W0705 16:21:09.135619 9701 monitor.cpp:186] Failed to collect > resource > >> >> usage for executor 'executor_Task_Tracker_8023' of framework > >> >> '201307030043-2037266954-5050-15277-0006': Future discarded > >> > > >> > > >> > > >> > > > > >