Hi Vinod,

The Mesos version is 0.13.0. The master and slave logs are attached; can you see them? Where can I find the scheduler logs?
Thanks very much for your help!

Wang Yu

From: Vinod Kone
Date: 2013-05-15 23:45
To: Wang Yu
CC: [email protected]; Benjamin Mahler
Subject: Re: Tasks always lost when running hadoop test!

What is the git sha of your HEAD? Also can you post the scheduler/master/slave logs?

@vinodkone
Sent from my mobile

On May 15, 2013, at 8:22 AM, "Wang Yu" <[email protected]> wrote:

> 1. There is no log in directories like
> "/tmp/mesos/slaves/201305130913-33597632-5050-3893-0/frameworks/201305130913-33597632-5050-3893-0000/executors/executor_Task_Tracker_4/runs/8a4dd631-1ec0-4946-a1bc-0644a7238e3c"
> 2. I use the git version; I just downloaded it using "git clone git://git.apache.org/mesos.git". This is what you told me before...
>
> 2013-05-15
> Wang Yu
>
> From: Vinod Kone <[email protected]>
> Sent: 2013-05-15 23:14
> Subject: Re: Tasks always lost when running hadoop test!
> To: "[email protected]"<[email protected]>
> CC: "mesos-dev"<[email protected]>,"Benjamin Mahler"<[email protected]>
>
> logs? Also what version of mesos?
>
> @vinodkone
> Sent from my mobile
>
> On May 15, 2013, at 12:00 AM, 王瑜 <[email protected]> wrote:
>
> > Hi Ben,
> >
> > I think the problem is that Mesos found the executor at hdfs://master/user/mesos/hadoop.tar.gz but did not download it, so it was never used.
> > Because Mesos found the executor, it did not output an error and just updated the task status to lost; but since Mesos did not use the executor, the executor directory contains nothing!
> >
> > I am not very familiar with the source code, so I do not know why Mesos cannot use the executor, and I also do not know whether my analysis is right.
> > Thanks very much for your help!
> >
> > Wang Yu
> >
> > From: 王瑜
> > Sent: 2013-05-15 11:04
> > To: mesos-dev
> > CC: Benjamin Mahler
> > Subject: Re: Re: org.apache.hadoop.mapred.MesosScheduler: Unknown/exited TaskTracker: http://slave5:50060
> >
> > Hi Ben,
> >
> > I have rerun the test and checked the log directory again; it is still empty, the same as shown below.
> > I think there is a problem with my executor, but I do not know how to get the executor to work. The logs are as follows...
> > "Asked to update resources for an unknown/killed executor": why does it always kill the executor?
> >
> > 1. I opened all the executor directories, but all of them are empty. I do not know what happened to them...
> > [root@slave1 logs]# cd /tmp/mesos/slaves/201305130913-33597632-5050-3893-0/frameworks/201305130913-33597632-5050-3893-0000/executors/executor_Task_Tracker_4/runs/8a4dd631-1ec0-4946-a1bc-0644a7238e3c
> > [root@slave1 8a4dd631-1ec0-4946-a1bc-0644a7238e3c]# ls
> > [root@slave1 8a4dd631-1ec0-4946-a1bc-0644a7238e3c]# ls -l
> > total 0
> > [root@slave1 8a4dd631-1ec0-4946-a1bc-0644a7238e3c]# ls -a
> > . ..
> > [root@slave1 8a4dd631-1ec0-4946-a1bc-0644a7238e3c]#
> > 2. I added "--isolation=cgroups" for the slaves, but it still does not work. Tasks are always lost. There is no error any more, but I still do not know what happened to the executor... The logs on one slave are as follows. Please help me, thanks very much!
> >
> > mesos-slave.INFO
> > Log file created at: 2013/05/13 09:12:54
> > Running on machine: slave1
> > Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
> > I0513 09:12:54.170383 24183 main.cpp:124] Creating "cgroups" isolator
> > I0513 09:12:54.171617 24183 main.cpp:132] Build: 2013-04-10 16:07:43 by root
> > I0513 09:12:54.171656 24183 main.cpp:133] Starting Mesos slave
> > I0513 09:12:54.173495 24197 slave.cpp:203] Slave started on 1)@192.168.0.3:36668
> > I0513 09:12:54.173578 24197 slave.cpp:204] Slave resources: cpus=24; mem=63356; ports=[31000-32000]; disk=29143
> > I0513 09:12:54.174486 24192 cgroups_isolator.cpp:242] Using /cgroup as cgroups hierarchy root
> > I0513 09:12:54.179914 24197 slave.cpp:453] New master detected at [email protected]:5050
> > I0513 09:12:54.180809 24197 slave.cpp:436] Successfully attached file '/home/mesos/build/logs/mesos-slave.INFO'
> > I0513 09:12:54.180817 24207 status_update_manager.cpp:132] New master detected at [email protected]:5050
> > I0513 09:12:54.194345 24192 cgroups_isolator.cpp:730] Recovering isolator
> > I0513 09:12:54.195453 24189 slave.cpp:377] Finished recovery
> > I0513 09:12:54.197798 24206 slave.cpp:487] Registered with master; given slave ID 201305130913-33597632-5050-3893-0
> > I0513 09:12:54.198086 24201 gc.cpp:56] Scheduling '/tmp/mesos/slaves/201305081719-33597632-5050-4050-1' for removal
> > I0513 09:12:54.198329 24201 gc.cpp:56] Scheduling '/tmp/mesos/slaves/201305100938-33597632-5050-19520-1' for removal
> > I0513 09:12:54.198490 24201 gc.cpp:56] Scheduling '/tmp/mesos/slaves/201305081625-33597632-5050-2991-1' for removal
> > I0513 09:12:54.198593 24201 gc.cpp:56] Scheduling '/tmp/mesos/slaves/201305081746-33597632-5050-12378-1' for removal
> > I0513 09:12:54.198874 24201 gc.cpp:56] Scheduling '/tmp/mesos/slaves/201305090914-33597632-5050-5072-1' for removal
> > I0513 09:12:54.199028 24201 gc.cpp:56] Scheduling '/tmp/mesos/slaves/201305081730-33597632-5050-8558-1' for removal
> > I0513 09:12:54.199149 24201 gc.cpp:56] Scheduling '/tmp/mesos/slaves/201304131144-33597632-5050-4949-2' for removal
> > I0513 09:13:54.176460 24204 slave.cpp:1811] Current disk usage 26.93%. Max allowed age: 5.11days
> > I0513 09:14:54.178444 24203 slave.cpp:1811] Current disk usage 26.93%. Max allowed age: 5.11days
> > I0513 09:15:54.180680 24203 slave.cpp:1811] Current disk usage 26.93%. Max allowed age: 5.11days
> > I0513 09:16:23.051203 24200 slave.cpp:587] Got assigned task Task_Tracker_0 for framework 201305130913-33597632-5050-3893-0000
> > I0513 09:16:23.054324 24200 paths.hpp:302] Created executor directory '/tmp/mesos/slaves/201305130913-33597632-5050-3893-0/frameworks/201305130913-33597632-5050-3893-0000/executors/executor_Task_Tracker_0/runs/6522748a-9d43-41b7-8f88-cd537a502495'
> > I0513 09:16:23.055605 24188 slave.cpp:436] Successfully attached file '/tmp/mesos/slaves/201305130913-33597632-5050-3893-0/frameworks/201305130913-33597632-5050-3893-0000/executors/executor_Task_Tracker_0/runs/6522748a-9d43-41b7-8f88-cd537a502495'
> > I0513 09:16:23.056043 24190 cgroups_isolator.cpp:525] Launching executor_Task_Tracker_0 (cd hadoop && ./bin/mesos-executor) in /tmp/mesos/slaves/201305130913-33597632-5050-3893-0/frameworks/201305130913-33597632-5050-3893-0000/executors/executor_Task_Tracker_0/runs/6522748a-9d43-41b7-8f88-cd537a502495 with resources cpus=1; mem=1280 for framework 201305130913-33597632-5050-3893-0000 in cgroup mesos/framework_201305130913-33597632-5050-3893-0000_executor_executor_Task_Tracker_0_tag_6522748a-9d43-41b7-8f88-cd537a502495
> > I0513&nbs
