Now that you've uploaded the executor, can you send us the master and slave
logs? On a slave, can you also look in an executor run directory to see
what's in stderr?

For example, in the slave you'll see a log line like the following:

I0513 09:16:47.082861 24194 cgroups_isolator.cpp:525] Launching executor_Task_Tracker_4 (cd hadoop && ./bin/mesos-executor) in /tmp/mesos/slaves/201305130913-33597632-5050-3893-0/frameworks/201305130913-33597632-5050-3893-0000/executors/executor_Task_Tracker_4/runs/8a4dd631-1ec0-4946-a1bc-0644a7238e3c with resources cpus=1; mem=1280 for framework 201305130913-33597632-5050-3893-0000 in cgroup mesos/framework_201305130913-33597632-5050-3893-0000_executor_executor_Task_Tracker_4_tag_8a4dd631-1ec0-4946-a1bc-0644a7238e3c

Based on the above, you'll want to check out what's inside the directory:
$ cd /tmp/mesos/slaves/201305130913-33597632-5050-3893-0/frameworks/201305130913-33597632-5050-3893-0000/executors/executor_Task_Tracker_4/runs/8a4dd631-1ec0-4946-a1bc-0644a7238e3c
$ ls
$ cat stderr
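
If there are several run directories to check, something like the following might save some typing. It is only a rough sketch, not part of Mesos: run_dirs and show_stderr are hypothetical helper names, and it assumes each "Launching ..." entry sits on a single line in the slave log file.

```shell
#!/bin/sh
# Hypothetical helpers (not shipped with Mesos).
# Assumption: every "Launching ... in <dir> with resources ..." entry
# occupies exactly one line of the slave log.

# Print the executor run directory named in each "Launching" log line.
run_dirs() {
  sed -n 's|.*Launching .* in \(/[^ ]*\) with resources.*|\1|p' "$1"
}

# For each run directory found in the log, print its stderr file if present.
show_stderr() {
  run_dirs "$1" | while IFS= read -r dir; do
    echo "== $dir"
    if [ -f "$dir/stderr" ]; then
      cat "$dir/stderr"
    else
      echo "(no stderr file)"
    fi
  done
}
```

After sourcing the snippet, you could run, for example, `show_stderr /home/mesos/build/logs/mesos-slave.INFO` (adjust the path to wherever your slave writes its INFO log).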

Thanks!


On Sun, May 12, 2013 at 8:45 PM, 王瑜 <[email protected]> wrote:

> Yes, I also updated mapred-site.xml, but it still does not work.
>
> I am using the git version, downloaded with:
> git clone git://git.apache.org/mesos.git
>
> $ cd mesos
> $ ./bootstrap
> $ ./configure
> $ make
> $ cd hadoop
> $ make hadoop-0.20.205.0
>
> Then I deployed it on the real cluster.
>
> I really do not know where the problem is. Please help me with it.
>
>
>
>
> Wang Yu
>
> From: Vinod Kone
> Sent: 2013-05-13 11:30
> To: [email protected]
> Cc: mesos-dev
> Subject: Re: Re: Re: org.apache.hadoop.mapred.MesosScheduler: Unknown/exited
> TaskTracker: http://slave5:50060
> Hmm. You definitely need the right extension but not the "hadoop" name. I'm
> assuming you also updated the file name in mapred-site.xml?
>
> Also, I'm surprised that the slave logs do not show info about downloading
> the executor. What version of mesos are you running?
>
> @vinodkone
> Sent from my mobile
>
> On May 12, 2013, at 7:59 PM, 王瑜 <[email protected]> wrote:
>
> > I have uploaded the right file using:
> > [root@master hadoop-0.20.205.0]# hadoop fs -mkdir mesos
> > [root@master hadoop-0.20.205.0]# hadoop fs -copyFromLocal /home/mesos/build/hadoop/hadoop-0.20.205.0/build/hadoop.tar.gz /user/mesos/mesos-executor
> >
> > I have tried adding the file extension ("/user/mesos/mesos-executor" ->
> > "/user/mesos/mesos-executor.tar.gz"), but it still does not work. Must the
> > file be named hadoop.tar.gz?
> >
> >
> >
> >
> > Wang Yu
> >
> > From: Vinod Kone
> > Sent: 2013-05-13 10:42
> > To: [email protected]; wangyu
> > Cc: Benjamin Mahler
> > Subject: Re: Re: org.apache.hadoop.mapred.MesosScheduler: Unknown/exited
> > TaskTracker: http://slave5:50060
> >>
> >>  <property>
> >>    <name>mapred.mesos.executor</name>
> >> #    <value>hdfs://hdfs.name.node:port/hadoop.zip</value>
> >>    <value>hdfs://master/user/mesos/mesos-executor</value>
> >>  </property>
> >
> > The mapred.mesos.executor property looks incorrect. The value should be
> > where you have uploaded the "hadoop.tar.gz" bundle generated by
> > TUTORIAL.sh or make hadoop. You can find the generated "hadoop.tar.gz"
> > bundle in the hadoop build directory. Upload the bundle to an HDFS
> > location and set the above property to that location.
> >
> > vinod
> >
> >
> >
> >>
> /tmp/mesos/slaves/201305130913-33597632-5050-3893-0/frameworks/201305130913-33597632-5050-3893-0000/executors/executor_Task_Tracker_4/runs/8a4dd631-1ec0-4946-a1bc-0644a7238e3c
> >>> [root@slave1 8a4dd631-1ec0-4946-a1bc-0644a7238e3c]# ls
> >>> [root@slave1 8a4dd631-1ec0-4946-a1bc-0644a7238e3c]# ls -l
> >>> total 0
> >>> [root@slave1 8a4dd631-1ec0-4946-a1bc-0644a7238e3c]# ls -a
> >>> .  ..
> >>> [root@slave1 8a4dd631-1ec0-4946-a1bc-0644a7238e3c]#
> >>> 2. I added "--isolation=cgroups" for the slaves, but it still does not
> >>> work. Tasks are always lost. There is no error any more, but I still do
> >>> not know what happened to the executor. The logs on one slave are as
> >>> follows. Please help me, thanks very much!
> >>>
> >>> mesos-slave.INFO
> >>> Log file created at: 2013/05/13 09:12:54
> >>> Running on machine: slave1
> >>> Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
> >>> I0513 09:12:54.170383 24183 main.cpp:124] Creating "cgroups" isolator
> >>> I0513 09:12:54.171617 24183 main.cpp:132] Build: 2013-04-10 16:07:43 by
> >>> root
> >>> I0513 09:12:54.171656 24183 main.cpp:133] Starting Mesos slave
> >>> I0513 09:12:54.173495 24197 slave.cpp:203] Slave started on 1)@
> >>> 192.168.0.3:36668
> >>> I0513 09:12:54.173578 24197 slave.cpp:204] Slave resources: cpus=24;
> >>> mem=63356; ports=[31000-32000]; disk=29143
> >>> I0513 09:12:54.174486 24192 cgroups_isolator.cpp:242] Using /cgroup as
> >>> cgroups hierarchy root
> >>> I0513 09:12:54.179914 24197 slave.cpp:453] New master detected at
> >>> [email protected]:5050
> >>> I0513 09:12:54.180809 24197 slave.cpp:436] Successfully attached file
> >>> '/home/mesos/build/logs/mesos-slave.INFO'
> >>> I0513 09:12:54.180817 24207 status_update_manager.cpp:132] New master
> >>> detected at [email protected]:5050
> >>> I0513 09:12:54.194345 24192 cgroups_isolator.cpp:730] Recovering
> isolator
> >>> I0513 09:12:54.195453 24189 slave.cpp:377] Finished recovery
> >>> I0513 09:12:54.197798 24206 slave.cpp:487] Registered with master;
> given
> >>> slave ID 201305130913-33597632-5050-3893-0
> >>> I0513 09:12:54.198086 24201 gc.cpp:56] Scheduling
> >>> '/tmp/mesos/slaves/201305081719-33597632-5050-4050-1' for removal
> >>> I0513 09:12:54.198329 24201 gc.cpp:56] Scheduling
> >>> '/tmp/mesos/slaves/201305100938-33597632-5050-19520-1' for removal
> >>> I0513 09:12:54.198490 24201 gc.cpp:56] Scheduling
> >>> '/tmp/mesos/slaves/201305081625-33597632-5050-2991-1' for removal
> >>> I0513 09:12:54.198593 24201 gc.cpp:56] Scheduling
> >>> '/tmp/mesos/slaves/201305081746-33597632-5050-12378-1' for removal
> >>> I0513 09:12:54.198874 24201 gc.cpp:56] Scheduling
> >>> '/tmp/mesos/slaves/201305090914-33597632-5050-5072-1' for removal
> >>> I0513 09:12:54.199028 24201 gc.cpp:56] Scheduling
> >>> '/tmp/mesos/slaves/201305081730-33597632-5050-8558-1' for removal
> >>> I0513 09:12:54.199149 24201 gc.cpp:56] Scheduling
> >>> '/tmp/mesos/slaves/201304131144-33597632-5050-4949-2' for removal
> >>> I0513 09:13:54.176460 24204 slave.cpp:1811] Current disk usage 26.93%.
> >> Max
> >>> allowed age: 5.11days
> >>> I0513 09:14:54.178444 24203 slave.cpp:1811] Current disk usage 26.93%.
> >> Max
> >>> allowed age: 5.11days
> >>> I0513 09:15:54.180680 24203 slave.cpp:1811] Current disk usage 26.93%.
> >> Max
> >>> allowed age: 5.11days
> >>> I0513 09:16:23.051203 24200 slave.cpp:587] Got assigned task
> >>> Task_Tracker_0 for framework 201305130913-33597632-5050-3893-0000
> >>> I0513 09:16:23.054324 24200 paths.hpp:302] Created executor directory
> >>
> '/tmp/mesos/slaves/201305130913-33597632-5050-3893-0/frameworks/201305130913-33597632-5050-3893-0000/executors/executor_Task_Tracker_0/runs/6522748a-9d43-41b7-8f88-cd537a502495'
> >>> I0513 09:16:23.055605 24188 slave.cpp:436] Successfully attached file
> >>
> '/tmp/mesos/slaves/201305130913-33597632-5050-3893-0/frameworks/201305130913-33597632-5050-3893-0000/executors/executor_Task_Tracker_0/runs/6522748a-9d43-41b7-8f88-cd537a502495'
> >>> I0513 09:16:23.056043 24190 cgroups_isolator.cpp:525] Launching
> >>> executor_Task_Tracker_0 (cd hadoop && ./bin/mesos-executor) in
> >>
> /tmp/mesos/slaves/201305130913-33597632-5050-3893-0/frameworks/201305130913-33597632-5050-3893-0000/executors/executor_Task_Tracker_0/runs/6522748a-9d43-41b7-8f88-cd537a502495
> >>> with resources cpus=1; mem=1280 for framework
> >>> 201305130913-33597632-5050-3893-0000 in cgroup
> >>
> mesos/framework_201305130913-33597632-5050-3893-0000_executor_executor_Task_Tracker_0_tag_6522748a-9d43-41b7-8f88-cd537a502495
> >>> I0513 09:16:23.059368 24190 cgroups_isolator.cpp:670] Changing cgroup
> >>> controls for executor executor_Task_Tracker_0 of framework
> >>> 201305130913-33597632-5050-3893-0000 with resources cpus=1; mem=1280
> >>> I0513 09:16:23.060478 24190 cgroups_isolator.cpp:841] Updated
> >> 'cpu.shares'
> >>> to 1024 for executor executor_Task_Tracker_0 of framework
> >>> 201305130913-33597632-5050-3893-0000
> >>> I0513 09:16:23.061101 24190 cgroups_isolator.cpp:979] Updated
> >>> 'memory.limit_in_bytes' to 1342177280 for executor
> >> executor_Task_Tracker_0
> >>> of framework 201305130913-33597632-5050-3893-0000
> >>> I0513 09:16:23.061807 24190 cgroups_isolator.cpp:1005] Started
> listening
> >>> for OOM events for executor executor_Task_Tracker_0 of framework
> >>> 201305130913-33597632-5050-3893-0000
> >>> I0513 09:16:23.063297 24190 cgroups_isolator.cpp:555] Forked executor
> at
> >> =
> >>> 24552
> >>> I0513 09:16:29.055598 24190 slave.cpp:587] Got assigned task
> >>> Task_Tracker_1 for framework 201305130913-33597632-5050-3893-0000
> >>> I0513 09:16:29.058297 24190 paths.hpp:302] Created executor directory
> >>
> '/tmp/mesos/slaves/201305130913-33597632-5050-3893-0/frameworks/201305130913-33597632-5050-3893-0000/executors/executor_Task_Tracker_1/runs/38d83d3a-ef9d-4118-b28c-c6c3cfba6c4b'
> >>> I0513 09:16:29.059012 24203 slave.cpp:436] Successfully attached file
> >>
> '/tmp/mesos/slaves/201305130913-33597632-5050-3893-0/frameworks/201305130913-33597632-5050-3893-0000/executors/executor_Task_Tracker_1/runs/38d83d3a-ef9d-4118-b28c-c6c3cfba6c4b'
> >>> I0513 09:16:29.059865 24200 cgroups_isolator.cpp:525] Launching
> >>> executor_Task_Tracker_1 (cd hadoop && ./bin/mesos-executor) in
> >>
> /tmp/mesos/slaves/201305130913-33597632-5050-3893-0/frameworks/201305130913-33597632-5050-3893-0000/executors/executor_Task_Tracker_1/runs/38d83d3a-ef9d-4118-b28c-c6c3cfba6c4b
> >>> with resources cpus=1; mem=1280 for framework
> >>> 201305130913-33597632-5050-3893-0000 in cgroup
> >>
> mesos/framework_201305130913-33597632-5050-3893-0000_executor_executor_Task_Tracker_1_tag_38d83d3a-ef9d-4118-b28c-c6c3cfba6c4b
> >>> I0513 09:16:29.061282 24200 cgroups_isolator.cpp:670] Changing cgroup
> >>> controls for executor executor_Task_Tracker_1 of framework
> >>> 201305130913-33597632-5050-3893-0000 with resources cpus=1; mem=1280
> >>> I0513 09:16:29.062208 24200 cgroups_isolator.cpp:841] Updated
> >> 'cpu.shares'
> >>> to 1024 for executor executor_Task_Tracker_1 of framework
> >>> 201305130913-33597632-5050-3893-0000
> >>> I0513 09:16:29.062940 24200 cgroups_isolator.cpp:979] Updated
> >>> 'memory.limit_in_bytes' to 1342177280 for executor
> >> executor_Task_Tracker_1
> >>> of framework 201305130913-33597632-5050-3893-0000
> >>> I0513 09:16:29.063705 24200 cgroups_isolator.cpp:1005] Started
> listening
> >>> for OOM events for executor executor_Task_Tracker_1 of framework
> >>> 201305130913-33597632-5050-3893-0000
> >>> I0513 09:16:29.065239 24200 cgroups_isolator.cpp:555] Forked executor
> at
> >> =
> >>> 24628
> >>> I0513 09:16:34.457746 24188 cgroups_isolator.cpp:806] Executor
> >>> executor_Task_Tracker_0 of framework
> 201305130913-33597632-5050-3893-0000
> >>> terminated with status 256
> >>> I0513 09:16:34.457909 24188 cgroups_isolator.cpp:635] Killing executor
> >>> executor_Task_Tracker_0 of framework
> 201305130913-33597632-5050-3893-0000
> >>> I0513 09:16:34.459873 24188 cgroups_isolator.cpp:1025] OOM notifier is
> >>> triggered for executor executor_Task_Tracker_0 of framework
> >>> 201305130913-33597632-5050-3893-0000 with uuid
> >>> 6522748a-9d43-41b7-8f88-cd537a502495
> >>> I0513 09:16:34.460028 24188 cgroups_isolator.cpp:1030] Discarded OOM
> >>> notifier for executor executor_Task_Tracker_0 of framework
> >>> 201305130913-33597632-5050-3893-0000 with uuid
> >>> 6522748a-9d43-41b7-8f88-cd537a502495
> >>> I0513 09:16:34.461314 24190 cgroups.cpp:1175] Trying to freeze cgroup
> >>
> /cgroup/mesos/framework_201305130913-33597632-5050-3893-0000_executor_executor_Task_Tracker_0_tag_6522748a-9d43-41b7-8f88-cd537a502495
> >>> I0513 09:16:34.461675 24190 cgroups.cpp:1214] Successfully froze cgroup
> >>
> /cgroup/mesos/framework_201305130913-33597632-5050-3893-0000_executor_executor_Task_Tracker_0_tag_6522748a-9d43-41b7-8f88-cd537a502495
> >>> after 1 attempts
> >>> I0513 09:16:34.464400 24197 cgroups.cpp:1190] Trying to thaw cgroup
> >>
> /cgroup/mesos/framework_201305130913-33597632-5050-3893-0000_executor_executor_Task_Tracker_0_tag_6522748a-9d43-41b7-8f88-cd537a502495
> >>> I0513 09:16:34.464659 24197 cgroups.cpp:1298] Successfully thawed
> >>
> /cgroup/mesos/framework_201305130913-33597632-5050-3893-0000_executor_executor_Task_Tracker_0_tag_6522748a-9d43-41b7-8f88-cd537a502495
> >>> I0513 09:16:34.477118 24199 cgroups_isolator.cpp:1144] Successfully
> >>> destroyed cgroup
> >>
> mesos/framework_201305130913-33597632-5050-3893-0000_executor_executor_Task_Tracker_0_tag_6522748a-9d43-41b7-8f88-cd537a502495
> >>> I0513 09:16:34.477439 24190 slave.cpp:1479] Executor
> >>> 'executor_Task_Tracker_0' of framework
> >> 201305130913-33597632-5050-3893-0000
> >>> has exited with status 1
> >>> I0513 09:16:34.479852 24190 slave.cpp:1232] Handling status update
> >>> TASK_LOST from task Task_Tracker_0 of framework
> >>> 201305130913-33597632-5050-3893-0000
> >>> I0513 09:16:34.480123 24190 slave.cpp:1280] Forwarding status update
> >>> TASK_LOST from task Task_Tracker_0 of framework
> >>> 201305130913-33597632-5050-3893-0000 to the status update manager
> >>> I0513 09:16:34.480136 24199 cgroups_isolator.cpp:666] Asked to update
> >>> resources for an unknown/killed executor
> >>> I0513 09:16:34.480480 24185 status_update_manager.cpp:254] Received
> >> status
> >>> update TASK_LOST from task Task_Tracker_0 of framework
> >>> 201305130913-33597632-5050-3893-0000
> >>> I0513 09:16:34.480716 24185 status_update_manager.cpp:403] Creating
> >>> StatusUpdate stream for task Task_Tracker_0 of framework
> >>> 201305130913-33597632-5050-3893-0000
> >>> I0513 09:16:34.480927 24185 status_update_manager.hpp:314] Handling
> >> UPDATE
> >>> for status update TASK_
>
