logs? Also what version of mesos?

@vinodkone
Sent from my mobile 

On May 15, 2013, at 12:00 AM, 王瑜 <[email protected]> wrote:

> Hi Ben,
> 
> I think the problem is that Mesos found the executor at
> hdfs://master/user/mesos/hadoop.tar.gz but did not download it, and so did
> not use it.
> Because Mesos found the executor, it did not output an error and just updated
> the task status to lost; but because it did not use the executor, the
> executor directory is empty.
> 
> I am not very familiar with the source code, so I do not know why Mesos
> cannot use the executor, and I do not know whether my analysis is right.
> Thanks very much for your help!
> 
> 
> 
> 
> Wang Yu
> 
> From: 王瑜
> Sent: 2013-05-15 11:04
> To: mesos-dev
> Cc: Benjamin Mahler
> Subject: Re: Re: org.apache.hadoop.mapred.MesosScheduler: Unknown/exited
> TaskTracker: http://slave5:50060
> Hi Ben,
> 
> I have rerun the test and checked the log directory again; it is still empty,
> the same as shown below.
> I think the problem is with my executor, but I do not know how to get the
> executor to work. The logs are below.
> They show "Asked to update resources for an unknown/killed executor". Why
> does it always kill the executor?
> 
> 1. I opened all the executor directories, but all of them are empty. I do not
> know what happened to them...
> [root@slave1 logs]# cd 
> /tmp/mesos/slaves/201305130913-33597632-5050-3893-0/frameworks/201305130913-33597632-5050-3893-0000/executors/executor_Task_Tracker_4/runs/8a4dd631-1ec0-4946-a1bc-0644a7238e3c
> [root@slave1 8a4dd631-1ec0-4946-a1bc-0644a7238e3c]# ls
> [root@slave1 8a4dd631-1ec0-4946-a1bc-0644a7238e3c]# ls -l
> total 0
> [root@slave1 8a4dd631-1ec0-4946-a1bc-0644a7238e3c]# ls -a
> .  ..
> [root@slave1 8a4dd631-1ec0-4946-a1bc-0644a7238e3c]#
> 2. I added "--isolation=cgroups" to the slaves, but it still does not work.
> Tasks are always lost. There are no errors any more, though, so I still do
> not know what happened to the executor. The logs from one slave follow.
> Please help me, thanks very much!
> 
> mesos-slave.INFO
> Log file created at: 2013/05/13 09:12:54
> Running on machine: slave1
> Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
> I0513 09:12:54.170383 24183 main.cpp:124] Creating "cgroups" isolator
> I0513 09:12:54.171617 24183 main.cpp:132] Build: 2013-04-10 16:07:43 by root
> I0513 09:12:54.171656 24183 main.cpp:133] Starting Mesos slave
> I0513 09:12:54.173495 24197 slave.cpp:203] Slave started on 
> 1)@192.168.0.3:36668
> I0513 09:12:54.173578 24197 slave.cpp:204] Slave resources: cpus=24; 
> mem=63356; ports=[31000-32000]; disk=29143
> I0513 09:12:54.174486 24192 cgroups_isolator.cpp:242] Using /cgroup as 
> cgroups hierarchy root
> I0513 09:12:54.179914 24197 slave.cpp:453] New master detected at 
> [email protected]:5050
> I0513 09:12:54.180809 24197 slave.cpp:436] Successfully attached file 
> '/home/mesos/build/logs/mesos-slave.INFO'
> I0513 09:12:54.180817 24207 status_update_manager.cpp:132] New master 
> detected at [email protected]:5050
> I0513 09:12:54.194345 24192 cgroups_isolator.cpp:730] Recovering isolator
> I0513 09:12:54.195453 24189 slave.cpp:377] Finished recovery
> I0513 09:12:54.197798 24206 slave.cpp:487] Registered with master; given 
> slave ID 201305130913-33597632-5050-3893-0
> I0513 09:12:54.198086 24201 gc.cpp:56] Scheduling 
> '/tmp/mesos/slaves/201305081719-33597632-5050-4050-1' for removal
> I0513 09:12:54.198329 24201 gc.cpp:56] Scheduling 
> '/tmp/mesos/slaves/201305100938-33597632-5050-19520-1' for removal
> I0513 09:12:54.198490 24201 gc.cpp:56] Scheduling 
> '/tmp/mesos/slaves/201305081625-33597632-5050-2991-1' for removal
> I0513 09:12:54.198593 24201 gc.cpp:56] Scheduling 
> '/tmp/mesos/slaves/201305081746-33597632-5050-12378-1' for removal
> I0513 09:12:54.198874 24201 gc.cpp:56] Scheduling 
> '/tmp/mesos/slaves/201305090914-33597632-5050-5072-1' for removal
> I0513 09:12:54.199028 24201 gc.cpp:56] Scheduling 
> '/tmp/mesos/slaves/201305081730-33597632-5050-8558-1' for removal
> I0513 09:12:54.199149 24201 gc.cpp:56] Scheduling 
> '/tmp/mesos/slaves/201304131144-33597632-5050-4949-2' for removal
> I0513 09:13:54.176460 24204 slave.cpp:1811] Current disk usage 26.93%. Max 
> allowed age: 5.11days
> I0513 09:14:54.178444 24203 slave.cpp:1811] Current disk usage 26.93%. Max 
> allowed age: 5.11days
> I0513 09:15:54.180680 24203 slave.cpp:1811] Current disk usage 26.93%. Max 
> allowed age: 5.11days
> I0513 09:16:23.051203 24200 slave.cpp:587] Got assigned task Task_Tracker_0 
> for framework 201305130913-33597632-5050-3893-0000
> I0513 09:16:23.054324 24200 paths.hpp:302] Created executor directory 
> '/tmp/mesos/slaves/201305130913-33597632-5050-3893-0/frameworks/201305130913-33597632-5050-3893-0000/executors/executor_Task_Tracker_0/runs/6522748a-9d43-41b7-8f88-cd537a502495'
> I0513 09:16:23.055605 24188 slave.cpp:436] Successfully attached file 
> '/tmp/mesos/slaves/201305130913-33597632-5050-3893-0/frameworks/201305130913-33597632-5050-3893-0000/executors/executor_Task_Tracker_0/runs/6522748a-9d43-41b7-8f88-cd537a502495'
> I0513 09:16:23.056043 24190 cgroups_isolator.cpp:525] Launching 
> executor_Task_Tracker_0 (cd hadoop && ./bin/mesos-executor) in 
> /tmp/mesos/slaves/201305130913-33597632-5050-3893-0/frameworks/201305130913-33597632-5050-3893-0000/executors/executor_Task_Tracker_0/runs/6522748a-9d43-41b7-8f88-cd537a502495
>  with resources cpus=1; mem=1280 for framework 
> 201305130913-33597632-5050-3893-0000 in cgroup 
> mesos/framework_201305130913-33597632-5050-3893-0000_executor_executor_Task_Tracker_0_tag_6522748a-9d43-41b7-8f88-cd537a502495
> I0513 09:16:23.059368 24190 cgroups_isolator.cpp:670] Changing cgroup 
> controls for executor executor_Task_Tracker_0 of framework 
> 201305130913-33597632-5050-3893-0000 with resources cpus=1; mem=1280
> I0513 09:16:23.060478 24190 cgroups_isolator.cpp:841] Updated 'cpu.shares' to 
> 1024 for executor executor_Task_Tracker_0 of framework 
> 201305130913-33597632-5050-3893-0000
> I0513 09:16:23.061101 24190 cgroups_isolator.cpp:979] Updated 
> 'memory.limit_in_bytes' to 1342177280 for executor executor_Task_Tracker_0 of 
> framework 201305130913-33597632-5050-3893-0000
> I0513 09:16:23.061807 24190 cgroups_isolator.cpp:1005] Started listening for 
> OOM events for executor executor_Task_Tracker_0 of framework 
> 201305130913-33597632-5050-3893-0000
> I0513 09:16:23.063297 24190 cgroups_isolator.cpp:555] Forked executor at = 
> 24552
> I0513 09:16:29.055598 24190 slave.cpp:587] Got assigned task Task_Tracker_1 
> for framework 201305130913-33597632-5050-3893-0000
> I0513 09:16:29.058297 24190 paths.hpp:302] Created executor directory 
> '/tmp/mesos/slaves/201305130913-33597632-5050-3893-0/frameworks/201305130913-33597632-5050-3893-0000/executors/executor_Task_Tracker_1/runs/38d83d3a-ef9d-4118-b28c-c6c3cfba6c4b'
> I0513 09:16:29.059012 24203 slave.cpp:436] Successfully attached file 
> '/tmp/mesos/slaves/201305130913-33597632-5050-3893-0/frameworks/201305130913-33597632-5050-3893-0000/executors/executor_Task_Tracker_1/runs/38d83d3a-ef9d-4118-b28c-c6c3cfba6c4b'
> I0513 09:16:29.059865 24200 cgroups_isolator.cpp:525] Launching 
> executor_Task_Tracker_1 (cd hadoop && ./bin/mesos-executor) in 
> /tmp/mesos/slaves/201305130913-33597632-5050-3893-0/frameworks/201305130913-33597632-5050-3893-0000/executors/executor_Task_Tracker_1/runs/38d83d3a-ef9d-4118-b28c-c6c3cfba6c4b
>  with resources cpus=1; mem=1280 for framework 
> 201305130913-33597632-5050-3893-0000 in cgroup 
> mesos/framework_201305130913-33597632-5050-3893-0000_executor_executor_Task_Tracker_1_tag_38d83d3a-ef9d-4118-b28c-c6c3cfba6c4b
> I0513 09:16:29.061282 24200 cgroups_isolator.cpp:670] Changing cgroup 
> controls for executor executor_Task_Tracker_1 of framework 
> 201305130913-33597632-5050-3893-0000 with resources cpus=1; mem=1280
> I0513 09:16:29.062208 24200 cgroups_isolator.cpp:841] Updated 'cpu.shares' to 
> 1024 for executor executor_Task_Tracker_1 of framework 
> 201305130913-33597632-5050-3893-0000
> I0513 09:16:29.062940 24200 cgroups_isolator.cpp:979] Updated 
> 'memory.limit_in_bytes' to 1342177280 for executor executor_Task_Tracker_1 of 
> framework 201305130913-33597632-5050-3893-0000
> I0513 09:16:29.063705 24200 cgroups_isolator.cpp:1005] Started listening for 
> OOM events for executor executor_Task_Tracker_1 of framework 
> 201305130913-33597632-5050-3893-0000
> I0513 09:16:29.065239 24200 cgroups_isolator.cpp:555] Forked executor at = 
> 24628
> I0513 09:16:34.457746 24188 cgroups_isolator.cpp:806] Executor 
> executor_Task_Tracker_0 of framework 201305130913-33597632-5050-3893-0000 
> terminated with status 256
> I0513 09:16:34.457909 24188 cgroups_isolator.cpp:635] Killing executor 
> executor_Task_Tracker_0 of framework 201305130913-33597632-5050-3893-0000
> I0513 09:16:34.459873 24188 cgroups_isolator.cpp:1025] OOM notifier is 
> triggered for executor executor_Task_Tracker_0 of framework 
> 201305130913-33597632-5050-3893-0000 with uuid 
> 6522748a-9d43-41b7-8f88-cd537a502495
> I0513 09:16:34.460028 24188 cgroups_isolator.cpp:1030] Discarded OOM notifier 
> for executor executor_Task_Tracker_0 of framework 
> 201305130913-33597632-5050-3893-0000 with uuid 
> 6522748a-9d43-41b7-8f88-cd537a502495
> I0513 09:16:34.461314 24190 cgroups.cpp:1175] Trying to freeze cgroup 
> /cgroup/mesos/framework_201305130913-33597632-5050-3893-0000_executor_executor_Task_Tracker_0_tag_6522748a-9d43-41b7-8f88-cd537a502495
> I0513 09:16:34.461675 24190 cgroups.cpp:1214] Successfully froze cgroup 
> /cgroup/mesos/framework_201305130913-33597632-5050-3893-0000_executor_executor_Task_Tracker_0_tag_6522748a-9d43-41b7-8f88-cd537a502495
>  after 1 attempts
> I0513 09:16:34.464400 24197 cgroups.cpp:1190] Trying to thaw cgroup 
> /cgroup/mesos/framework_201305130913-33597632-5050-3893-0000_executor_executor_Task_Tracker_0_tag_6522748a-9d43-41b7-8f88-cd537a502495
> I0513 09:16:34.464659 24197 cgroups.cpp:1298] Successfully thawed 
> /cgroup/mesos/framework_201305130913-33597632-5050-3893-0000_executor_executor_Task_Tracker_0_tag_6522748a-9d43-41b7-8f88-cd537a502495
> I0513 09:16:34.477118 24199 cgroups_isolator.cpp:1144] Successfully destroyed 
> cgroup 
> mesos/framework_201305130913-33597632-5050-3893-0000_executor_executor_Task_Tracker_0_tag_6522748a-9d43-41b7-8f88-cd537a502495
> I0513 09:16:34.477439 24190 slave.cpp:1479] Executor 
> 'executor_Task_Tracker_0' of framework 201305130913-33597632-5050-3893-0000 
> has exited with status 1
> I0513 09:16:34.479852 24190 slave.cpp:1232] Handling status update TASK_LOST 
> from task Task_Tracker_0 of framework 201305130913-33597632-5050-3893-0000
> I0513 09:16:34.480123 24190 slave.cpp:1280] Forwarding status update 
> TASK_LOST from task Task_Tracker_0 of framework 
> 201305130913-33597632-5050-3893-0000 to the status update manager
> I0513 09:16:34.480136 24199 cgroups_isolator.cpp:666] Asked to update 
> resources for an unknown/killed executor
> I0513 09:16:34.480480 24185 status_update_manager.cpp:254] Received status 
> update TASK_LOST from task Task_Tracker_0 of framework 
> 201305130913-33597632-5050-3893-0000
> I0513 09:16:34.480716 24185 status_update_manager.cpp:403] Creating 
> StatusUpdate stream for task Task_Tracker_0 of framework 
> 201305130913-33597632-5050-3893-0000
> I0513 09:16:34.480927 24185 status_update_manager.hpp:314] Handling UPDATE 
> for status update TASK_LOST from task Task_Tracker_0 of framework 
> 201305130913-33597632-5050-3893-0000
> I0513 09:16:34.481107 24185 status_update_manager.cpp:289] Forwarding status 
> update TASK_LOST from task Task_Tracker_0 of framework 
> 201305130913-33597632-5050-3893-0000 to the master at [email protected]:5050
> I0513 09:16:34.487007 24194 slave.cpp:979] Got acknowledgement of status 
> update for task Task_Tracker_0 of framework 
> 201305130913-33597632-5050-3893-0000
> I0513 09:16:34.487257 24185 status_update_manager.cpp:314] Received status 
> update acknowledgement for task Task_Tracker_0 of framework 
> 201305130913-33597632-5050-3893-0000
> I0513 09:16:34.487412 24185 status_update_manager.hpp:314] Handling ACK for 
> status update TASK_LOST from task Task_Tracker_0 of framework 
> 201305130913-33597632-5050-3893-0000
> I0513 09:16:34.487547 24185 status_upda
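
A note on the two status values in the log above: the isolator reports "terminated with status 256" while the slave reports "has exited with status 1". Assuming the first value is a raw POSIX wait status (an assumption; the log itself does not say so), the two agree, because the exit code is stored in the high byte of the wait status. The sketch below decodes such a value with Python's standard `os` helpers on a Unix system; `decode_wait_status` is a hypothetical helper name used only for illustration.

```python
import os


def decode_wait_status(status: int) -> str:
    """Describe a raw POSIX wait status, e.g. the 256 seen in the log."""
    if os.WIFEXITED(status):
        # Normal exit: the exit code lives in the high byte of the status.
        return f"exited with code {os.WEXITSTATUS(status)}"
    if os.WIFSIGNALED(status):
        # Abnormal exit: the low bits carry the terminating signal number.
        return f"killed by signal {os.WTERMSIG(status)}"
    return f"unrecognized wait status {status}"


print(decode_wait_status(256))
```

An exit code of 1 would be consistent with the `cd hadoop` step of the launch command failing in an empty run directory, though the log alone does not prove that.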
