Guodong,
I am still having problems; I think something is wrong with my executor
setting.
In mapred-site.xml I set the following ("master" is the hostname of the Mesos
master):
<property>
<name>mapred.mesos.executor</name>
<!-- <value>hdfs://hdfs.name.node:port/hadoop.zip</value> -->
<value>hdfs://master/user/mesos/mesos-executor</value>
</property>
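To double-check which value Hadoop would read for mapred.mesos.executor, the property can be read back out of the file. A minimal Python sketch, assuming the usual <configuration>/<property> layout of mapred-site.xml; the helper name read_property and the inline sample are illustrative only:

```python
import xml.etree.ElementTree as ET

# Sample standing in for mapred-site.xml; for a real file, use
# ET.parse(path).getroot() instead of ET.fromstring(...).
SAMPLE = """<configuration>
  <property>
    <name>mapred.mesos.executor</name>
    <value>hdfs://master/user/mesos/mesos-executor</value>
  </property>
</configuration>"""


def read_property(xml_text, key):
    """Return the <value> of the <property> whose <name> equals key, or None."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == key:
            value = prop.findtext("value")
            return value.strip() if value is not None else None
    return None


print(read_property(SAMPLE, "mapred.mesos.executor"))
```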
I uploaded mesos-executor to /user/mesos/mesos-executor on HDFS.
The beginning of the file is as follows:
#! /bin/sh
# mesos-executor - temporary wrapper script for .libs/mesos-executor
# Generated by ltmain.sh (GNU libtool) 2.2.6b
#
# The mesos-executor program cannot be directly executed until all the libtool
# libraries that it depends on are installed.
#
# This wrapper script should never be moved out of the build directory.
# If it is, it will not operate correctly.
# Sed substitution that helps us do robust quoting. It backslashifies
# metacharacters that are still active within double-quoted strings.
Xsed='/bin/sed -e 1s/^X//'
sed_quote_subst='s/\([`"$\\]\)/\\\1/g'
# Be Bourne compatible
if test -n "${ZSH_VERSION+set}" && (emulate sh) >/dev/null 2>&1; then
emulate sh
NULLCMD=:
# Zsh 3.x and 4.x performs word splitting on ${1+"$@"}, which
# is contrary to our usage. Disable this feature.
alias -g '${1+"$@"}'='"$@"'
setopt NO_GLOB_SUBST
else
case `(set -o) 2>/dev/null` in *posix*) set -o posix;; esac
fi
BIN_SH=xpg4; export BIN_SH # for Tru64
DUALCASE=1; export DUALCASE # for MKS sh
# The HP-UX ksh and POSIX shell print the target directory to stdout
# if CDPATH is set.
(unset CDPATH) >/dev/null 2>&1 && unset CDPATH
relink_command="(cd /home/mesos/build/src; { test -z \"\${LIBRARY_PATH+set}\"
|| unset LIBRARY_PATH || { LIBRARY_PATH=; export LIBRARY_PATH; }; }; { test -z
\"\${COMPILER_PATH+set}\" || unset COMPILER_PATH || { COMPILER_PATH=; export
COMPILER_PATH; }; }; { test -z \"\${GCC_EXEC_PREFIX+set}\" || unset
GCC_EXEC_PREFIX || { GCC_EXEC_PREFIX=; export GCC_EXEC_PREFIX; }; }; { test -z
\"\${LD_RUN_PATH+set}\" || unset LD_RUN_PATH || { LD_RUN_PATH=; export
LD_RUN_PATH; }; };
LD_LIBRARY_PATH=/home/wangyu/protobuf/lib:/home/mesos/mesos-0.9.0/build/hadoop/hadoop-0.20.205.0/lib/native/Linux-amd64-64/;
export LD_LIBRARY_PATH;
PATH=/home/wangyu/protobuf/bin:/usr/lib/jvm/java-7-sun/bin:/usr/lib/jvm/java-7-sun/bin:/usr/lib64/qt-3.3/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/home/mesos/mesos-0.10.0/hadoop/hadoop-0.20.205.0/bin:/usr/lib/ant/apache-ant-1.8.4/bin:/opt/scala-2.9.1.final/bin:/home/haidong/zookeeper-3.4.5/bin:/home/hadoop/hive-0.9.0/bin:/home/hadoop/pig-0.10.0/bin:/home/mesos/mpi/build/bin:/home/mesos/torque/torque-4.1.3:/home/mesos/mesos-0.9.0/build/hadoop/hadoop-0.20.205.0/bin:/root/bin;
export PATH; g++ -g -g2 -O2 -o \$progdir/\$file
launcher/mesos_executor-executor.o ./.libs/libmesos.so
-L/usr/lib/jvm/java-7-sun/jre/lib/amd64/server -lpthread -lcurl -lssl -lcrypto
-lz -lrt -pthread -Wl,-rpath -Wl,/home/mesos/build/src/.libs -Wl,-rpath
-Wl,/home/mesos/build/lib)"
...
Did I upload the right file, and did I set it up correctly in the conf file?
Thanks very much!
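The header quoted above identifies the file as a libtool wrapper for .libs/mesos-executor, which by its own comment is not meant to be moved out of the build directory. A quick heuristic check for that case, as a minimal Python sketch; the function name and the marker strings are illustrative choices taken from the header above, not part of Mesos or libtool:

```python
def looks_like_libtool_wrapper(head):
    """Heuristically detect a libtool-generated wrapper script from its first lines."""
    lines = head.splitlines()
    if not lines or not lines[0].startswith("#!"):
        return False  # no shebang, so not a shell script at all
    markers = ("Generated by ltmain.sh", "temporary wrapper script for")
    # Only the first few lines are inspected; the markers sit at the very top.
    return any(marker in line for line in lines[:20] for marker in markers)


HEAD = """#! /bin/sh
# mesos-executor - temporary wrapper script for .libs/mesos-executor
# Generated by ltmain.sh (GNU libtool) 2.2.6b
"""

print(looks_like_libtool_wrapper(HEAD))
```

For a file on disk, the first kilobyte or so read as text would be passed in as head.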
Wang Yu
From: 王国栋
Date: 2013-04-23 13:32
To: wangyu
CC: mesos-dev
Subject: Re: Re: org.apache.hadoop.mapred.MesosScheduler: Unknown/exited
TaskTracker: http://slave5:50060
Hmm, it seems that mapred.mesos.master is set correctly.
If you run Hadoop in local mode, the following setting is fine:
<property>
<name>mapred.mesos.master</name>
<value>local</value>
</property>
If you want to start the cluster, set mapred.mesos.master to
mesos-master-hostname:mesos-master-port.
Make sure that DNS resolves mesos-master-hostname to the right IP.
BTW: when you start the JobTracker, you can check the Mesos web UI to see
whether a Hadoop framework has registered.
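The two accepted forms of mapred.mesos.master described above ("local" or hostname:port) and the DNS check can be sketched in Python. This is only an illustration under stated assumptions: the function names are made up here, and port 5050 in the example is an assumed value, not one confirmed in this thread.

```python
import socket


def parse_mesos_master(value):
    """Split a mapred.mesos.master value into (host, port).

    Returns ("local", None) for local mode and raises ValueError if the
    value is neither "local" nor of the form host:port.
    """
    value = value.strip()
    if value == "local":
        return ("local", None)
    host, sep, port = value.rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError("expected 'local' or host:port, got %r" % value)
    return (host, int(port))


def resolve(host):
    """Return the IPv4 address the local resolver gives for host."""
    return socket.gethostbyname(host)


print(parse_mesos_master("local"))
print(parse_mesos_master("master:5050"))  # 5050 is an assumed port
```

Calling resolve("master") on the JobTracker machine should then return the master's address (192.168.0.2 according to the startup log quoted below) if DNS or /etc/hosts is set up as intended.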
Thanks.
Guodong
On Tue, Apr 23, 2013 at 1:24 PM, 王瑜 <[email protected]> wrote:
> Hi, Guodong,
>
> I started Hadoop as you said, and then I saw this error:
> 13/04/23 13:03:43 ERROR mapred.MesosScheduler: Error from scheduler driver:
> Cannot parse
> '@0.0.0.0:0'
>
> What does this mean? Where should I change the MesosScheduler code to fix
> this? Thanks very much! I am so sorry to interrupt you once again...
>
> The whole log is as follows:
>
> [root@master hadoop-0.20.205.0]# hadoop jobtracker
> 13/04/23 13:21:04 INFO mapred.JobTracker: STARTUP_MSG:
> /************************************************************
> STARTUP_MSG: Starting JobTracker
> STARTUP_MSG: host = master/192.168.0.2
> STARTUP_MSG: args = []
> STARTUP_MSG: version = 0.20.205.0
>
> STARTUP_MSG: build = -r ; compiled by 'root' on Sat Apr 13 11:19:33 CST
> 2013
> ************************************************************/
>
> 13/04/23 13:21:04 INFO impl.MetricsConfig: loaded properties from
> hadoop-metrics2.properties
>
> 13/04/23 13:21:04 INFO impl.MetricsSourceAdapter: MBean for source
> MetricsSystem,sub=Stats registered.
>
> 13/04/23 13:21:04 INFO impl.MetricsSystemImpl: Scheduled snapshot period at
> 10 second(s).
>
> 13/04/23 13:21:04 INFO impl.MetricsSystemImpl: JobTracker metrics system
> started
>
> 13/04/23 13:21:04 INFO impl.MetricsSourceAdapter: MBean for source
> QueueMetrics,q=default registered.
>
> 13/04/23 13:21:04 INFO impl.MetricsSourceAdapter: MBean for source ugi
> registered.
>
> 13/04/23 13:21:04 INFO delegation.AbstractDelegationTokenSecretManager:
> Updating the current master key for generating delegation tokens
>
> 13/04/23 13:21:04 INFO delegation.AbstractDelegationTokenSecretManager:
> Starting expired delegation token remover thread, tokenRemoverScanInterval=60
> min(s)
>
> 13/04/23 13:21:04 INFO mapred.JobTracker: Scheduler configured with
> (memSizeForMapSlotOnJT, memSizeForReduceSlotOnJT, limitMaxMemForMapTasks,
> limitMaxMemForReduceTasks) (-1, -1, -1, -1)
>
> 13/04/23 13:21:04 INFO delegation.AbstractDelegationTokenSecretManager:
> Updating the current master key for generating delegation tokens
>
> 13/04/23 13:21:04 INFO util.HostsFileReader: Refreshing hosts
> (include/exclude) list
>
> 13/04/23 13:21:04 INFO mapred.JobTracker: Starting jobtracker with owner as
> root
> 13/04/23 13:21:04 INFO ipc.Server: Starting SocketReader
>
> 13/04/23 13:21:04 INFO impl.MetricsSourceAdapter: MBean for source
> RpcDetailedActivityForPort9001 registered.
>
> 13/04/23 13:21:04 INFO impl.MetricsSourceAdapter: MBean for source
> RpcActivityForPort9001 registered.
>
> 13/04/23 13:21:04 INFO mortbay.log: Logging to
> org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via
> org.mortbay.log.Slf4jLog
>
> 13/04/23 13:21:05 INFO http.HttpServer: Added global filtersafety
> (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
>
> 13/04/23 13:21:05 INFO http.HttpServer: Port returned by
> webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the
> listener on 50030
>
> 13/04/23 13:21:05 INFO http.HttpServer: listener.getLocalPort() returned
> 50030 webServer.getConnectors()[0].getLocalPort() returned 50030
> 13/04/23 13:21:05 INFO http.HttpServer: Jetty bound to port 50030
> 13/04/23 13:21:05 INFO mortbay.log: jetty-6.1.26
> 13/04/23 13:21:05 INFO mortbay.log: Started
> SelectChannelConnector@0.0.0.0:50030
>
> 13/04/23 13:21:05 INFO impl.MetricsSourceAdapter: MBean for source jvm
> registered.
>
> 13/04/23 13:21:05 INFO impl.MetricsSourceAdapter: MBean for source
> JobTrackerMetrics registered.
> 13/04/23 13:21:05 INFO mapred.JobTracker: JobTracker up at: 9001
> 13/04/23 13:21:05 INFO mapred.JobTracker: JobTracker webserver: 50030
> 13/04/23 13:21:05 INFO mapred.JobTracker: Cleaning up the system directory
>
> 13/04/23 13:21:05 INFO mapred.JobTracker: History server being initialized in
> embedded mode
>
> 13/04/23 13:21:05 INFO mapred.JobHistoryServer: Started job history server
> at: localhost:50030
>
> 13/04/23 13:21:05 INFO mapred.JobTracker: Job History Server web address:
> localhost:50030
>
> 13/04/23 13:21:05 INFO mapred.CompletedJobStatusStore: Completed job store is
> inactive
> 13/04/23 13:21:05 INFO mapred.MesosScheduler: Starting MesosScheduler
> 13/04/23 13:21:05 INFO mapred.JobTracker: Refreshing hosts information
>
> 13/04/23 13:21:05 ERROR mapred.MesosScheduler: Error from scheduler driver:
> Cannot parse '@0.0.0.0:0'
> 13/04/23 13:21:05 INFO util.HostsFileReader: Setting the includes file to
> 13/04/23 13:21:05 INFO util.HostsFileReader: Setting the excludes file to
>
> 13/04/23 13:21:05 INFO util.HostsFileReader: Refreshing hosts
> (include/exclude) list
> 13/04/23 13:21:05 INFO mapred.JobTracker: Decommissioning 0 nodes
> 13/04/23 13:21:05 INFO ipc.Server: IPC Server Responder: starting
> 13/04/23 13:21:05 INFO ipc.Server: IPC Server listener on 9001: starting
> 13/04/23 13:21:05 INFO ipc.Server: IPC Server handler 0 on 9001: starting
> 13/04/23 13:21:05 INFO ipc.Server: IPC Server handler 1 on 9001: starting
> 13/04/23 13:21:05 INFO ipc.Server: IPC Server handler 3 on 9001: starting
> 13/04/23 13:21:05 INFO ipc.Server: IPC Server handler 2 on 9001: starting
> 13/04/23 13:21:05 INFO ipc.Server: IPC Server handler 5 on 9001: starting
> 13/04/23 13:21:05 INFO ipc.Server: IPC Server handler 4 on 9001: starting
> 13/04/23 13:21:05 INFO ipc.Server: IPC Server handler 6 on 9001: starting
> 13/04/23 13:21:05 INFO ipc.Server: IPC Server handler 7 on 9001: starting
> 13/04/23 13:21:05 INFO mapred.JobTracker: Starting RUNNING
> 13/04/23 13:21:05 INFO ipc.Server: IPC Server handler 8 on 9001: starting
> 13/04/23 13:21:05 INFO ipc.Server: IPC Server handler 9 on 9001: starting
>
> 13/04/23 13:21:32 WARN util.NativeCodeLoader: Unable to load native-hadoop
> library for your platform... using builtin-java classes where applicable
>
> 13/04/23 13:21:32 INFO mapred.JobInProgress: job_201304231321_0001: nMaps=0
> nReduces=0 max=-1
>
> 13/04/23 13:21:32 INFO mapred.MesosScheduler: Added job job_201304231321_0001
>
> 13/04/23 13:21:32 INFO mapred.JobTracker: Job job_201304231321_0001 added
> successfully for user 'root' to queue 'default'
>
> 13/04/23 13:21:32 INFO mapred.AuditLogger: USER=root IP=192.168.0.2
> OPERATION=SUBMIT_JOB TARGET=job_201304231321_0001 RESULT=SUCCESS
>
> 13/04/23 13:21:32 INFO mapred.JobTracker: Initializing job_201304231321_0001
>
> 13/04/23 13:21:32 INFO mapred.JobInProgress: Initializing
> job_201304231321_0001
>
> 13/04/23 13:21:32 INFO mapred.JobInProgress: jobToken generated and stored
> with users keys in
> /home/HadoopRun/tmp/mapred/system/job_201304231321_0001/jobToken
>
> 13/04/23 13:21:32 INFO mapred.JobInProgress: Input size for job
> job_201304231321_0001 = 0. Number of splits = 0
>
> 13/04/23 13:21:32 INFO mapred.JobInProgress: Job job_201304231321_0001
> initialized successfully with 0 map tasks and 0 reduce tasks.
>
> ------------------------------
> Wang Yu
>
> *From:* 王国栋 <[email protected]>
> *Date:* 2013-04-23 11:34
> *To:* mesos-dev <[email protected]>;
> wangyu<[email protected]>
> *Subject:* Re: Re: org.apache.hadoop.mapred.MesosScheduler:
> Unknown/exited TaskTracker: http://slave5:50060
> Hi Yu,
>
> Mesos will launch a TaskTracker on each slave node as long as the node has
> enough resources for it. So you have to run the NameNode, JobTracker and
> DataNodes on your own.
>
> Basically, starting Hadoop on Mesos goes like this:
> 1. Start the DFS with hadoop/bin/start-dfs.sh (you should configure
> core-site.xml and hdfs-site.xml). The DFS is no different from a normal one.
> 2. Start the JobTracker with hadoop/bin/hadoop jobtracker (you should
> configure mapred-site.xml; this JobTracker must contain the patch for
> Mesos).
>
> Then you can use the Mesos web UI and the JobTracker web UI to check the
> status of the JobTracker.
>
> Guodong
>
>
> On Tue, Apr 23, 2013 at 11:06 AM, 王瑜 <[email protected]> wrote:
>
>> Oh, yes, I started Hadoop using "start-all.sh". Now I know what my
>> problem is. Thanks very much!
>>
>> PS: Besides the TaskTracker, are there any other roles (like JobTracker or
>> DataNode) that I should stop first?
>>
>>
>>
>>
>> Wang Yu
>>
>> From: Benjamin Mahler
>> Date: 2013-04-23 10:56
>> To: [email protected]; wangyu
>> Subject: Re: Re: org.apache.hadoop.mapred.MesosScheduler: Unknown/exited
>> TaskTracker: http://slave5:50060
>> The scheduler we wrote for Hadoop will start its own TaskTrackers, meaning
>> you do not have to start any TaskTrackers yourself.
>>
>> Are you starting your own TaskTrackers? Are there any TaskTrackers running
>> in your cluster?
>>
>> Looking at your jps output, is there already a TaskTracker running?
>> [root@master logs]# jps
>> 13896 RunJar
>> 14123 Jps
>> 12718 NameNode
>> 12900 DataNode
>> 13374 TaskTracker <--- How was this started?
>> 13218 JobTracker
>>
>>
>> On Mon, Apr 22, 2013 at 7:47 PM, 王瑜 <[email protected]> wrote:
>>
>> > Hi, Ben and Guodong,
>> >
>> > What do you mean by "managing your own TaskTrackers"? How can I tell
>> > whether I am managing my own TaskTrackers? Sorry, I am not very familiar
>> > with Mesos.
>> > Does it mean I do not need to configure hdfs-site.xml and core-site.xml in
>> > Hadoop? I do not want to run my own TaskTrackers; I just want to set up
>> > Hadoop on Mesos and run my MR tasks.
>> >
>> > Thanks very much for your patient reply... Maybe I have a long way to go...
>> >
>> >
>> >
>> > The log messages you see:
>> > 2013-04-18 16:47:19,645 INFO org.apache.hadoop.mapred.MesosScheduler:
>> > Unknown/exited TaskTracker: http://master:50060.
>> >
>> > Are printed when mesos does not know about the TaskTracker. We currently
>> > don't support running your own TaskTrackers, as the MesosScheduler will
>> > launch them on your behalf when needed.
>> >
>> > Are you managing your own TaskTrackers? The purpose of using Hadoop with
>> > mesos is that you no longer have to do that. We will detect that jobs have
>> > pending map / reduce tasks and launch TaskTrackers accordingly.
>> >
>> > Guodong may be able to help further getting set up!
>> >
>> >
>> >
>> >
>> > Wang Yu
>> >
>> > From: 王国栋
>> > Date: 2013-04-18 17:10
>> > To: mesos-dev; wangyu
>> > Subject: Re: org.apache.hadoop.mapred.MesosScheduler: Unknown/exited
>> > TaskTracker: http://slave5:50060
>> > You can check the slave log and the mesos-executor log, which is normally
>> > located in a directory like
>> >
>> > "/tmp/mesos/slaves/201304181115-16842879-5050-4680-13/frameworks/201304181115-16842879-5050-4680-0003/executors/executor_Task_Tracker_16/runs/latest/stderr".
>> > That log is the TaskTracker log.
>> >
>> > I hope it will help.
>> >
>> > Guodong
>> >
>> >
>> > On Thu, Apr 18, 2013 at 5:03 PM, 王瑜 <[email protected]> wrote:
>> >
>> > > Hi All,
>> > >
>> > > I have deployed Mesos on three nodes (master, slave1 and slave5), and it
>> > > works well.
>> > > Then I set up Hadoop on top of it, using master as the NameNode, and
>> > > master, slave1 and slave5 as DataNodes. When I run 'jps', it looks like
>> > > it is working:
>> > > [root@master logs]# jps
>> > > 13896 RunJar
>> > > 14123 Jps
>> > > 12718 NameNode
>> > > 12900 DataNode
>> > > 13374 TaskTracker
>> > > 13218 JobTracker
>> > >
>> > > Then I ran a test benchmark, but it does not make any progress...
>> > > [root@master hadoop-0.20.205.0]# bin/hadoop jar hadoop-examples-0.20.205.0.jar randomwriter -Dtest.randomwrite.bytes_per_map=6710886 -Dtest.randomwriter.maps_per_host=10 rand
>> > > Running 30 maps.
>> > > Job started: Thu Apr 18 16:49:36 CST 2013
>> > > 13/04/18 16:49:36 INFO mapred.JobClient: Running job: job_201304181646_0001
>> > > 13/04/18 16:49:37 INFO mapred.JobClient: map 0% reduce 0%
>> > > It stopped here.
>> > >
>> > > Then I read the log file hadoop-root-jobtracker-master.log, which shows:
>> > > 2013-04-18 16:46:51,724 INFO org.apache.hadoop.mapred.JobTracker: Starting RUNNING
>> > > 2013-04-18 16:46:51,726 INFO org.apache.hadoop.ipc.Server: IPC Server handler 5 on 9001: starting
>> > > 2013-04-18 16:46:51,727 INFO org.apache.hadoop.ipc.Server: IPC Server handler 6 on 9001: starting
>> > > 2013-04-18 16:46:51,727 INFO org.apache.hadoop.ipc.Server: IPC Server handler 9 on 9001: starting
>> > > 2013-04-18 16:46:51,727 INFO org.apache.hadoop.ipc.Server: IPC Server handler 7 on 9001: starting
>> > > 2013-04-18 16:46:51,727 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 9001: starting
>> > > 2013-04-18 16:46:52,557 INFO org.apache.hadoop.net.NetworkTopology: Adding a new node: /default-rack/master
>> > > 2013-04-18 16:46:52,560 INFO org.apache.hadoop.mapred.JobTracker: Adding tracker tracker_master:localhost/127.0.0.1:44997 to host master
>> > > 2013-04-18 16:46:52,568 INFO org.apache.hadoop.mapred.MesosScheduler: Unknown/exited TaskTracker: http://master:50060.
>> > > 2013-04-18 16:46:55,581 INFO org.apache.hadoop.mapred.MesosScheduler: Unknown/exited TaskTracker: http://master:50060.
>> > > 2013-04-18 16:46:58,590 INFO org.apache.hadoop.mapred.MesosScheduler: Unknown/exited TaskTracker: http://master:50060.
>> > > 2013-04-18 16:47:01,600 INFO org.apache.hadoop.mapred.MesosScheduler: Unknown/exited TaskTracker: http://master:50060.
>> > >
>> > > 2013-04-18 16:47:04,609 INFO org.apache.hadoop.mapred.MesosScheduler:
>> > Unknown/exited TaskTracker:
>> > > http://master:50060.
>> > >
>> > > 2013-04-18 16:47:07,618 INFO org.apache.hadoop.mapred.MesosScheduler:
>> > Unknown/exited TaskTracker:
>> > > http://master:50060.
>> > >
>> > > 2013-04-18 16:47:10,625 INFO org.apache.hadoop.mapred.MesosScheduler:
>> > Unknown/exited TaskTracker:
>> > > http://master:50060.
>> > >
>> > > 2013-04-18 16:47:13,632 INFO org.apache.hadoop.mapred.MesosScheduler:
>> > Unknown/exited TaskTracker:
>> > > http://master:50060.
>> > >
>> > > 2013-04-18 16:47:13,686 INFO org.apache.hadoop.net.NetworkTopology:
>> > Adding a new node: /default-rack/slave5
>> > >
>> > > 2013-04-18 16:47:13,686 INFO org.apache.hadoop.mapred.JobTracker:
>> Adding
>> > tracker tracker_slave5:
>> > > 127.0.0.1/127.0.0.1:60621 to host slave5
>> > >
>> > > 2013-04-18 16:47:13,687 INFO org.apache.hadoop.mapred.MesosScheduler:
>> > Unknown/exited TaskTracker:
>> > > http://slave5:50060.
>> > >
>> > > 2013-04-18 16:47:16,638 INFO org.apache.hadoop.mapred.MesosScheduler:
>> > Unknown/exited TaskTracker:
>> > > http://master:50060.
>> > >
>> > > 2013-04-18 16:47:16,697 INFO org.apache.hadoop.mapred.MesosScheduler:
>> > Unknown/exited TaskTracker:
>> > > http://slave5:50060.
>> > >
>> > > 2013-04-18 16:47:19,645 INFO org.apache.hadoop.mapred.MesosScheduler:
>> > Unknown/exited TaskTracker:
>> > > http://master:50060.
>> > >
>> > > 2013-04-18 16:47:19,707 INFO org.apache.hadoop.mapred.MesosScheduler:
>> > Unknown/exited TaskTracker:
>> > > http://slave5:50060.
>> > >
>> > > 2013-04-18 16:47:22,651 INFO org.apache.hadoop.mapred.MesosScheduler:
>> > Unknown/exited TaskTracker:
>> > > http://master:50060.
>> > >
>> > > 2013-04-18 16:47:22,715 INFO org.apache.hadoop.mapred.MesosScheduler:
>> > Unknown/exited TaskTracker:
>> > > http://slave5:50060.
>> > >
>> > > 2013-04-18 16:47:25,658 INFO org.apache.hadoop.mapred.MesosScheduler:
>> > Unknown/exited TaskTracker:
>> > > http://master:50060.
>> > >
>> > > 2013-04-18 16:47:25,725 INFO org.apache.hadoop.mapred.MesosScheduler:
>> > Unknown/exited TaskTracker:
>> > > http://slave5:50060.
>> > >
>> > > 2013-04-18 16:47:28,665 INFO org.apache.hadoop.mapred.MesosScheduler:
>> > Unknown/exited TaskTracker:
>> > > http://master:50060.
>> > >
>> > > Can anybody help me? Thanks very much!
>> > >
>> >
>>
>
>