Did you forget to attach them?
On Wed, May 8, 2013 at 6:48 PM, 王瑜 <[email protected]> wrote:

> OK.
> Logs are attached. I used Ctrl+C to stop the jobtracker when the TASK_LOST
> happened.
>
> Thanks very much for your help!
>
> ------------------------------
> Wang Yu
>
> *From:* Benjamin Mahler <[email protected]>
> *Sent:* 2013-05-09 01:23
> *To:* [email protected]
> *Cc:* wangyu <[email protected]>
> *Subject:* Re: Re: org.apache.hadoop.mapred.MesosScheduler: Unknown/exited
> TaskTracker: http://slave5:50060
>
> Hey Brenden, are there any bugs in particular here that you're referring to?
>
> Wang, can you provide the logs for the JobTracker, the slave, and the
> master?
>
>
> On Tue, May 7, 2013 at 11:50 AM, Brenden Matthews <
> [email protected]> wrote:
>
> > You may want to try Airbnb's dist of Mesos:
> >
> > https://github.com/airbnb/mesos/tree/testing
> >
> > A good number of these Mesos bugs have been fixed but aren't yet merged
> > into upstream.
> >
> >
> > On Mon, May 6, 2013 at 8:34 PM, 王瑜 <[email protected]> wrote:
> >
> > > The log on each slave for the lost task is: No executor found with ID:
> > > executor_Task_Tracker_XXX.
> > >
> > >
> > > Wang Yu
> > >
> > > From: 王瑜
> > > Sent: 2013-05-07 11:13
> > > To: mesos-dev
> > > Subject: Re: Re: org.apache.hadoop.mapred.MesosScheduler: Unknown/exited
> > > TaskTracker: http://slave5:50060
> > > Hi all,
> > >
> > > I have tried adding the file extension when uploading the executor, as well
> > > as in the conf file, but it still does not work.
> > >
> > > And I have looked at
> > > /tmp/mesos/slaves/201304131144-33597632-5050-4949-0/frameworks/201304131144-33597632-5050-4949-0006/executors/executor_Task_Tracker_63/runs/latest,
> > > but it is an empty directory.
> > >
> > > Are there any other logs I can read to find out why the TASK_LOST happened? I
> > > really need your help, thanks very much!
> > >
> > >
> > > Wang Yu
> > >
> > > From: Vinod Kone
> > > Sent: 2013-04-26 01:31
> > > To: [email protected]
> > > Cc: wangyu
> > > Subject: Re: Re: org.apache.hadoop.mapred.MesosScheduler: Unknown/exited
> > > TaskTracker: http://slave5:50060
> > > Also, you could look at the executor logs (default:
> > > /tmp/mesos/slaves/....../executors/../runs/latest/) to see why the
> > > TASK_LOST happened.
> > >
> > >
> > > On Thu, Apr 25, 2013 at 10:19 AM, Benjamin Mahler <
> > > [email protected]> wrote:
> > >
> > > Can you maintain the file extension? That is how mesos knows to extract it:
> > > hadoop fs -copyFromLocal
> > > /home/mesos/build/hadoop/hadoop-0.20.205.0/build/hadoop.tar.gz
> > > /user/mesos/mesos-executor.tar.gz
> > >
> > > Also make sure your mapred-site.xml has the extension as well.
> > >
> > >
> > > On Thu, Apr 25, 2013 at 1:08 AM, 王瑜 <[email protected]> wrote:
> > >
> > > > Hi, Ben,
> > > >
> > > > I have tried as you said, but it still does not work.
> > > > I have uploaded mesos-executor using: hadoop fs -copyFromLocal
> > > > /home/mesos/build/hadoop/hadoop-0.20.205.0/build/hadoop.tar.gz
> > > > /user/mesos/mesos-executor
> > > > Did I do the right thing? Thanks very much!
> > > >
> > > > The log in jobtracker is:
> > > > 13/04/25 16:00:55 INFO mapred.MesosScheduler: Launching task Task_Tracker_82 on http://slave1:31000
> > > > 13/04/25 16:00:55 INFO mapred.MesosScheduler: Satisfied map and reduce slots needed.
> > > > 13/04/25 16:00:55 INFO mapred.MesosScheduler: Status update of Task_Tracker_82 to TASK_LOST with message Executor terminated
> > > > 13/04/25 16:00:56 INFO mapred.MesosScheduler: JobTracker Status
> > > > Pending Map Tasks: 2
> > > > Pending Reduce Tasks: 1
> > > > Idle Map Slots: 0
> > > > Idle Reduce Slots: 0
> > > > Inactive Map Slots: 6 (launched but no hearbeat yet)
> > > > Inactive Reduce Slots: 6 (launched but no hearbeat yet)
> > > > Needed Map Slots: 2
> > > > Needed Reduce Slots: 1
> > > > 13/04/25 16:00:56 INFO mapred.MesosScheduler: Launching task Task_Tracker_83 on http://slave1:31000
> > > > 13/04/25 16:00:56 INFO mapred.MesosScheduler: Satisfied map and reduce slots needed.
> > > > 13/04/25 16:00:56 INFO mapred.MesosScheduler: Status update of Task_Tracker_83 to TASK_LOST with message Executor terminated
> > > > 13/04/25 16:00:57 INFO mapred.MesosScheduler: JobTracker Status
> > > > Pending Map Tasks: 2
> > > > Pending Reduce Tasks: 1
> > > > Idle Map Slots: 0
> > > > Idle Reduce Slots: 0
> > > > Inactive Map Slots: 6 (launched but no hearbeat yet)
> > > > Inactive Reduce Slots: 6 (launched but no hearbeat yet)
> > > > Needed Map Slots: 2
> > > > Needed Reduce Slots: 1
> > > >
> > > >
> > > > Wang Yu
> > > >
> > > > From: Benjamin Mahler
> > > > Sent: 2013-04-24 07:49
> > > > To: [email protected]; wangyu
> > > > Subject: Re: Re: org.apache.hadoop.mapred.MesosScheduler: Unknown/exited
> > > > TaskTracker: http://slave5:50060
> > > > You need to instead upload the hadoop.tar.gz generated by the tutorial.
> > > > Then point the conf file to the hdfs directory (you had the right idea,
> > > > just uploaded the wrong file). :)
> > > >
> > > > Can you try that and report back?
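
Ben's advice to point the conf file at the uploaded archive, together with his note in the later reply about keeping the file extension, can be checked mechanically. The following is only a minimal sketch under stated assumptions, not part of the Hadoop-on-Mesos code: it reads a mapred-site.xml document, looks up `mapred.mesos.executor`, and reports whether the URI ends in an archive suffix. The helper names (`get_property`, `executor_uri_has_archive_suffix`) and the suffix list are hypothetical; the extensions Mesos actually recognizes depend on the Mesos version in use.

```python
import xml.etree.ElementTree as ET

# Assumed archive suffixes for illustration; verify against your Mesos version.
ARCHIVE_SUFFIXES = (".tar.gz", ".tgz", ".tar.bz2", ".tbz2", ".zip")


def get_property(site_xml, name):
    """Return the value of a Hadoop <property> by name, or None if absent."""
    root = ET.fromstring(site_xml)
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == name:
            value = prop.findtext("value")
            return value.strip() if value is not None else None
    return None


def executor_uri_has_archive_suffix(site_xml):
    """True when mapred.mesos.executor looks like it points at an archive."""
    uri = get_property(site_xml, "mapred.mesos.executor")
    return uri is not None and uri.lower().endswith(ARCHIVE_SUFFIXES)


# The setting quoted in this thread (no extension on the HDFS path).
EXAMPLE_BAD = """
<configuration>
  <property>
    <name>mapred.mesos.executor</name>
    <value>hdfs://master/user/mesos/mesos-executor</value>
  </property>
</configuration>
"""

# Same setting with the extension kept, as suggested in the thread.
EXAMPLE_GOOD = EXAMPLE_BAD.replace(
    "mesos-executor</value>", "mesos-executor.tar.gz</value>"
)

if __name__ == "__main__":
    print(executor_uri_has_archive_suffix(EXAMPLE_BAD))   # False
    print(executor_uri_has_archive_suffix(EXAMPLE_GOOD))  # True
```

This only checks the name in the conf file; it says nothing about whether the file uploaded to HDFS is really the hadoop.tar.gz from the tutorial.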
> > > >
> > > >
> > > > On Tue, Apr 23, 2013 at 12:45 AM, 王瑜 <[email protected]> wrote:
> > > >
> > > > > Guodong,
> > > > >
> > > > > I still have problems; I think something is wrong with my
> > > > > executor setting.
> > > > >
> > > > > In mapred-site.xml, I set ("master" is the hostname of
> > > > > mesos-master-hostname):
> > > > > <property>
> > > > >   <name>mapred.mesos.executor</name>
> > > > >   # <value>hdfs://hdfs.name.node:port/hadoop.zip</value>
> > > > >   <value>hdfs://master/user/mesos/mesos-executor</value>
> > > > > </property>
> > > > >
> > > > > And I uploaded mesos-executor to /user/mesos/mesos-executor
> > > > >
> > > > > The head content is as follows:
> > > > >
> > > > > #! /bin/sh
> > > > >
> > > > > # mesos-executor - temporary wrapper script for .libs/mesos-executor
> > > > > # Generated by ltmain.sh (GNU libtool) 2.2.6b
> > > > > #
> > > > > # The mesos-executor program cannot be directly executed until all the libtool
> > > > > # libraries that it depends on are installed.
> > > > > #
> > > > > # This wrapper script should never be moved out of the build directory.
> > > > > # If it is, it will not operate correctly.
> > > > >
> > > > > # Sed substitution that helps us do robust quoting. It backslashifies
> > > > > # metacharacters that are still active within double-quoted strings.
> > > > > Xsed='/bin/sed -e 1s/^X//'
> > > > > sed_quote_subst='s/\([`"$\\]\)/\\\1/g'
> > > > >
> > > > > # Be Bourne compatible
> > > > > if test -n "${ZSH_VERSION+set}" && (emulate sh) >/dev/null 2>&1; then
> > > > >   emulate sh
> > > > >   NULLCMD=:
> > > > >   # Zsh 3.x and 4.x performs word splitting on ${1+"$@"}, which
> > > > >   # is contrary to our usage. Disable this feature.
> > > > >   alias -g '${1+"$@"}'='"$@"'
> > > > >   setopt NO_GLOB_SUBST
> > > > > else
> > > > >   case `(set -o) 2>/dev/null` in *posix*) set -o posix;; esac
> > > > > fi
> > > > > BIN_SH=xpg4; export BIN_SH # for Tru64
> > > > > DUALCASE=1; export DUALCASE # for MKS sh
> > > > >
> > > > > # The HP-UX ksh and POSIX shell print the target directory to stdout
> > > > > # if CDPATH is set.
> > > > > (unset CDPATH) >/dev/null 2>&1 && unset CDPATH
> > > > >
> > > > > relink_command="(cd /home/mesos/build/src; { test -z
> > > > > \"\${LIBRARY_PATH+set}\" || unset LIBRARY_PATH || { LIBRARY_PATH=; export
> > > > > LIBRARY_PATH; }; }; { test -z \"\${COMPILER_PATH+set}\" || unset
> > > > > COMPILER_PATH || { COMPILER_PATH=; export COMPILER_PATH; }; }; { test -z
> > > > > \"\${GCC_EXEC_PREFIX+set}\" || unset GCC_EXEC_PREFIX || { GCC_EXEC_PREFIX=;
> > > > > export GCC_EXEC_PREFIX; }; }; { test -z \"\${LD_RUN_PATH+set}\" || unset
> > > > > LD_RUN_PATH || { LD_RUN_PATH=; export LD_RUN_PATH; }; };
> > > > > LD_LIBRARY_PATH=/home/wangyu/protobuf/lib:/home/mesos/mesos-0.9.0/build/hadoop/hadoop-0.20.205.0/lib/native/Linux-amd64-64/;
> > > > > export LD_LIBRARY_PATH;
> > > > > PATH=/home/wangyu/protobuf/bin:/usr/lib/jvm/java-7-sun/bin:/usr/lib/jvm/java-7-sun/bin:/usr/lib64/qt-3.3/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/home/mesos/mesos-0.10.0/hadoop/hadoop-0.20.205.0/bin:/usr/lib/ant/apache-ant-1.8.4/bin:/opt/scala-2.9.1.final/bin:/home/haidong/zookeeper-3.4.5/bin:/home/hadoop/hive-0.9.0/bin:/home/hadoop/pig-0.10.0/bin:/home/mesos/mpi/build/bin:/home/mesos/torque/torque-4.1.3:/home/mesos/mesos-0.9.0/build/hadoop/hadoop-0.20.205.0/bin:/root/bin;
> > > > > export PATH; g++ -g -g2 -O2 -o \$progdir/\$file
> > > > > launcher/mesos_executor-executor.o ./.libs/libmesos.so
> > > > > -L/usr/lib/jvm/java-7-sun/jre/lib/amd64/server -lpthread -lcurl -lssl
> > > > > -lcrypto -lz -lrt -pthread
> > > > > -Wl,-rpath -Wl,/home/mesos/build/src/.libs
> > > > > -Wl,-rpath -Wl,/home/mesos/build/lib)"
> > > > > ...
> > > > >
> > > > >
> > > > > Did I upload the right file, and did I set it up correctly in the conf file?
> > > > > Thanks very much!
> > > > >
> > > > >
> > > > > Wang Yu
> > > > >
> > > > > From: 王国栋
> > > > > Date: 2013-04-23 13:32
> > > > > To: wangyu
> > > > > CC: mesos-dev
> > > > > Subject: Re: Re: org.apache.hadoop.mapred.MesosScheduler: Unknown/exited
> > > > > TaskTracker: http://slave5:50060
> > > > > Hmm. it seems that the mapred.mesos.master is set correctly.
> > > > >
> > > > > If you run hadoop in local mode, the following setting is OK:
> > > > > <property>
> > > > >   <name>mapred.mesos.master</name>
> > > > >   <value>local</value>
> > > > > </property>
> > > > >
> > > > > If you want to start the cluster, set mapred.mesos.master to
> > > > > mesos-master-hostname:mesos-master-port.
> > > > >
> > > > > Make sure DNS resolves mesos-master-hostname to the right IP.
> > > > >
> > > > > BTW: when you start the jobtracker, you can check the mesos web UI to see
> > > > > whether a hadoop framework is registered.
> > > > >
> > > > > Thanks.
> > > > >
> > > > > Guodong
> > > > >
> > > > >
> > > > > On Tue, Apr 23, 2013 at 1:24 PM, 王瑜 <[email protected]> wrote:
> > > > >
> > > > > > Hi, Guodong,
> > > > > >
> > > > > > I started hadoop as you said, then I saw this error:
> > > > > > 13/04/23 13:03:43 ERROR mapred.MesosScheduler: Error from scheduler driver: Cannot parse '@0.0.0.0:0'
> > > > > >
> > > > > > What does this mean? Where should I change the MesosScheduler code to fix this?
> > > > > > Thanks very much! I am so sorry to interrupt you once again...
> > > > > >
> > > > > > The whole log is as follows:
> > > > > >
> > > > > > [root@master hadoop-0.20.205.0]# hadoop jobtracker
> > > > > > 13/04/23 13:21:04 INFO mapred.JobTracker: STARTUP_MSG:
> > > > > > /************************************************************
> > > > > > STARTUP_MSG: Starting JobTracker
> > > > > > STARTUP_MSG: host = master/192.168.0.2
> > > > > > STARTUP_MSG: args = []
> > > > > > STARTUP_MSG: version = 0.20.205.0
> > > > > > STARTUP_MSG: build = -r ; compiled by 'root' on Sat Apr 13 11:19:33 CST 2013
> > > > > > ************************************************************/
> > > > > > 13/04/23 13:21:04 INFO impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
> > > > > > 13/04/23 13:21:04 INFO impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
> > > > > > 13/04/23 13:21:04 INFO impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
> > > > > > 13/04/23 13:21:04 INFO impl.MetricsSystemImpl: JobTracker metrics system started
> > > > > > 13/04/23 13:21:04 INFO impl.MetricsSourceAdapter: MBean for source QueueMetrics,q=default registered.
> > > > > > 13/04/23 13:21:04 INFO impl.MetricsSourceAdapter: MBean for source ugi registered.
> > > > > > 13/04/23 13:21:04 INFO delegation.AbstractDelegationTokenSecretManager: Updating the current master key for generating delegation tokens
> > > > > > 13/04/23 13:21:04 INFO delegation.AbstractDelegationTokenSecretManager: Starting expired delegation token remover thread, tokenRemoverScanInterval=60 min(s)
> > > > > > 13/04/23 13:21:04 INFO mapred.JobTracker: Scheduler configured with (memSizeForMapSlotOnJT, memSizeForReduceSlotOnJT, limitMaxMemForMapTasks, limitMaxMemForReduceTasks) (-1, -1, -1, -1)
> > > > > > 13/04/23 13:21:04 INFO delegation.AbstractDelegationTokenSecretManager: Updating the current master key for generating delegation tokens
> > > > > > 13/04/23 13:21:04 INFO util.HostsFileReader: Refreshing hosts (include/exclude) list
> > > > > > 13/04/23 13:21:04 INFO mapred.JobTracker: Starting jobtracker with owner as root
> > > > > > 13/04/23 13:21:04 INFO ipc.Server: Starting SocketReader
> > > > > > 13/04/23 13:21:04 INFO impl.MetricsSourceAdapter: MBean for source RpcDetailedActivityForPort9001 registered.
> > > > > > 13/04/23 13:21:04 INFO impl.MetricsSourceAdapter: MBean for source RpcActivityForPort9001 registered.
> > > > > > 13/04/23 13:21:04 INFO mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
> > > > > > 13/04/23 13:21:05 INFO http.HttpServer: Added global filtersafety (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
> > > > > > 13/04/23 13:21:05 INFO http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1.
> > > > > > Opening the listener on 50030
> > > > > > 13/04/23 13:21:05 INFO http.HttpServer: listener.getLocalPort() returned 50030 webServer.getConnectors()[0].getLocalPort() returned 50030
> > > > > > 13/04/23 13:21:05 INFO http.HttpServer: Jetty bound to port 50030
> > > > > > 13/04/23 13:21:05 INFO mortbay.log: jetty-6.1.26
> > > > > > 13/04/23 13:21:05 INFO mortbay.log: Started [email protected]:50030
> > > > > > 13/04/23 13:21:05 INFO impl.MetricsSourceAdapter: MBean for source jvm registered.
> > > > > > 13/04/23 13:21:05 INFO impl.MetricsSourceAdapter: MBean for source JobTrackerMetrics registered.
> > > > > > 13/04/23 13:21:05 INFO mapred.JobTracker: JobTracker up at: 9001
> > > > > > 13/04/23 13:21:05 INFO mapred.JobTracker: JobTracker webserver: 50030
> > > > > > 13/04/23 13:21:05 INFO mapred.JobTracker: Cleaning up the system directory
> > > > > > 13/04/23 13:21:05 INFO mapred.JobTracker: History server being initialized in embedded mode
> > > > > > 13/04/23 13:21:05 INFO mapred.JobHistoryServer: Started job history server at: localhost:50030
> > > > > > 13/04/23 13:21:05 INFO mapred.JobTracker: Job History Server web address: localhost:50030
> > > > > > 13/04/23 13:21:05 INFO mapred.CompletedJobStatusStore: Completed job store is inactive
> > > > > > 13/04/23 13:21:05 INFO mapred.MesosScheduler: Starting MesosScheduler
> > > > > > 13/04/23 13:21:05 INFO mapred.JobTracker: Refreshing hosts information
> > > > > > 13/04/23 13:21:05 ERROR mapred.MesosScheduler: Error from scheduler driver: Cannot parse '@0.0.0.0:0'
> > > > > > 13/04/23 13:21:05 INFO util.HostsFileReader: Setting the includes file to
> > > > > > 13/04/23 13:21:05 INFO util.HostsFileReader: Setting the excludes file
> > > > > > to
> > > > > > 13/04/23 13:21:05 INFO util.HostsFileReader: Refreshing hosts (include/exclude) list
> > > > > > 13/04/23 13:21:05 INFO mapred.JobTracker: Decommissioning 0 nodes
> > > > > > 13/04/23 13:21:05 INFO ipc.Server: IPC Server Responder: starting
> > > > > > 13/04/23 13:21:05 INFO ipc.Server: IPC Server listener on 9001: starting
> > > > > > 13/04/23 13:21:05 INFO ipc.Server: IPC Server handler 0 on 9001: starting
> > > > > > 13/04/23 13:21:05 INFO ipc.Server: IPC Server handler 1 on 9001: starting
> > > > > > 13/04/23 13:21:05 INFO ipc.Server: IPC Server handler 3 on 9001: starting
> > > > > > 13/04/23 13:21:05 INFO ipc.Server: IPC Server handler 2 on 9001: starting
> > > > > > 13/04/23 13:21:05 INFO ipc.Server: IPC Server handler 5 on 9001: starting
> > > > > > 13/04/23 13:21:05 INFO ipc.Server: IPC Server handler 4 on 9001: starting
> > > > > > 13/04/23 13:21:05 INFO ipc.Server: IPC Server handler 6 on 9001: starting
> > > > > > 13/04/23 13:21:05 INFO ipc.Server: IPC Server handler 7 on 9001: starting
> > > > > > 13/04/23 13:21:05 INFO mapred.JobTracker: Starting RUNNING
> > > > > > 13/04/23 13:21:05 INFO ipc.Server: IPC Server handler 8 on 9001: starting
> > > > > > 13/04/23 13:21:05 INFO ipc.Server: IPC Server handler 9 on 9001: starting
> > > > > > 13/04/23 13:21:32 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform...
> > > > > > using builtin-java classes where applicable
> > > > > > 13/04/23 13:21:32 INFO mapred.JobInProgress: job_201304231321_0001: nMaps=0 nReduces=0 max=-1
> > > > > > 13/04/23 13:21:32 INFO mapred.MesosScheduler: Added job job_201304231321_0001
> > > > > > 13/04/23 13:21:32 INFO mapred.JobTracker: Job job_201304231321_0001 added successfully for user 'root' to queue 'default'
> > > > > > 13/04/23 13:21:32 INFO mapred.AuditLogger: USER=root IP=192.168.0.2 OPERATION=SUBMIT_JOB TARGET=job_201304231321_0001 RESULT=SUCCESS
> > > > > > 13/04/23 13:21:32 INFO mapred.JobTracker: Initializing job_201304231321_0001
> > > > > > 13/04/23 13:21:32 INFO mapred.JobInProgress: Initializing job_201304231321_0001
> > > > > > 13/04/23 13:21:32 INFO mapred.JobInProgress: jobToken generated and stored with users keys in /home/HadoopRun/tmp/mapred/system/job_201304231321_0001/jobToken
> > > > > > 13/04/23 13:21:32 INFO mapred.JobInProgress: Input size for job job_201304231321_0001 = 0. Number of splits = 0
> > > > > > 13/04/23 13:21:32 INFO mapred.JobInProgress: Job job_201304231321_0001 initialized successfully with 0 map tasks and 0 reduce tasks.
> > > > > >
> > > > > > ------------------------------
> > > > > > Wang Yu
> > > > > >
> > > > > > *From:* 王国栋 <[email protected]>
> > > > > > *Date:* 2013-04-23 11:34
> > > > > > *To:* mesos-dev <[email protected]>; wangyu <[email protected]>
> > > > > > *Subject:* Re: Re: org.apache.hadoop.mapred.MesosScheduler:
> > > > > > Unknown/exited TaskTracker: http://slave5:50060
> > > > > > Hi Yu,
> > > > > >
> > > > > > Mesos will just launch tasktracker on each slave node as long as the
> > > > > > required resource is enough for the tasktracker.
> > > > > > So you have to run
> > > > > > NameNode, Jobtracker and DataNode on your own.
> > > > > >
> > > > > > Basically, starting hadoop on mesos goes like this.
> > > > > > 1. start the dfs. use hadoop/bin/start-dfs.sh. (you should configure
> > > > > > core-site.xml and hdfs-site.xml). dfs is no different from the normal one.
> > > > > > 2. start jobtracker, use hadoop/bin/hadoop jobtracker (you should
> > > > > > configure mapred-site.xml, this jobtracker should contain the patch for
> > > > > > mesos)
> > > > > >
> > > > > > Then, you can use mesos web UI and jobtracker web UI to check the status
> > > > > > of Jobtracker.
> > > > > >
> > > > > > Guodong
> > > > > >
> > > > > >
> > > > > > On Tue, Apr 23, 2013 at 11:06 AM, 王瑜 <[email protected]> wrote:
> > > > > >
> > > > > >> Oh, yes, I start my hadoop using "start-all.sh". I know what my
> > > > > >> problem is. Thanks very much!
> > > > > >>
> > > > > >> ps: Besides TaskTracker, are there any other roles (like JobTracker,
> > > > > >> DataNode) I should stop first?
> > > > > >>
> > > > > >>
> > > > > >> Wang Yu
> > > > > >>
> > > > > >> From: Benjamin Mahler
> > > > > >> Sent: 2013-04-23 10:56
> > > > > >> To: [email protected]; wangyu
> > > > > >> Subject: Re: Re: org.apache.hadoop.mapred.MesosScheduler: Unknown/exited
> > > > > >> TaskTracker: http://slave5:50060
> > > > > >> The scheduler we wrote for Hadoop will start its own TaskTrackers, meaning
> > > > > >> you do not have to start any TaskTrackers yourself.
> > > > > >>
> > > > > >> Are you starting your own TaskTrackers? Are there any TaskTrackers running
> > > > > >> in your cluster?
> > > > > >>
> > > > > >> Looking at your jps output, is there already a TaskTracker running?
> > > > > >> [root@master logs]# jps
> > > > > >> 13896 RunJar
> > > > > >> 14123 Jps
> > > > > >> 12718 NameNode
> > > > > >> 12900 DataNode
> > > > > >> 13374 TaskTracker <--- How was this started?
> > > > > >> 13218 JobTracker
> > > > > >>
> > > > > >>
> > > > > >> On Mon, Apr 22, 2013 at 7:47 PM, 王瑜 <[email protected]> wrote:
> > > > > >>
> > > > > >> > Hi, Ben and Guodong,
> > > > > >> >
> > > > > >> > What do you mean by "managing your own TaskTrackers"? How should I know
> > > > > >> > whether I am managing my own TaskTrackers? Sorry, I am not very familiar
> > > > > >> > with mesos.
> > > > > >> > Does it mean I do not need to configure hdfs-site.xml and core-site.xml in
> > > > > >> > hadoop? I do not want to run my own TaskTracker, I just want to set up
> > > > > >> > hadoop on mesos, and run my MR tasks.
> > > > > >> >
> > > > > >> > Thanks very much for your patient reply...Maybe I have a long way to go...
> > > > > >> >
> > > > > >> >
> > > > > >> > The log messages you see:
> > > > > >> > 2013-04-18 16:47:19,645 INFO org.apache.hadoop.mapred.MesosScheduler: Unknown/exited TaskTracker: http://master:50060.
> > > > > >> >
> > > > > >> > Are printed when mesos does not know about the TaskTracker. We currently
> > > > > >> > don't support running your own TaskTrackers, as the MesosScheduler will
> > > > > >> > launch them on your behalf when needed.
> > > > > >> >
> > > > > >> > Are you managing your own TaskTrackers? The purpose of using Hadoop with
> > > > > >> > mesos is that you no longer have to do that. We will detect that jobs have
> > > > > >> > pending map / reduce tasks and launch TaskTrackers accordingly.
> > > > > >> >
> > > > > >> > Guodong may be able to help further getting set up!
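
The jps question above can be turned into a quick check before starting the JobTracker. This is a minimal sketch, assuming jps output in the "PID Name" form quoted in this thread; the function name `find_tasktracker_pids` is hypothetical and nothing here is part of Hadoop or Mesos. It cannot tell a TaskTracker started by start-all.sh apart from one launched by the MesosScheduler, so it is only meaningful before any job has been submitted.

```python
def find_tasktracker_pids(jps_output):
    """Return the PIDs of JVMs that jps reports as TaskTracker."""
    pids = []
    for line in jps_output.splitlines():
        parts = line.split()
        # Expect "<pid> <main class name>"; skip anything else.
        if len(parts) >= 2 and parts[0].isdigit() and parts[1] == "TaskTracker":
            pids.append(int(parts[0]))
    return pids


# The jps listing quoted in the thread.
SAMPLE_JPS = """\
13896 RunJar
14123 Jps
12718 NameNode
12900 DataNode
13374 TaskTracker
13218 JobTracker
"""

if __name__ == "__main__":
    print(find_tasktracker_pids(SAMPLE_JPS))  # [13374]
```

An empty list before the first job is submitted is what the replies in this thread describe as the expected state.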
> > > > > >> >
> > > > > >> >
> > > > > >> > Wang Yu
> > > > > >> >
> > > > > >> > From: 王国栋
> > > > > >> > Date: 2013-04-18 17:10
> > > > > >> > To: mesos-dev; wangyu
> > > > > >> > Subject: Re: org.apache.hadoop.mapred.MesosScheduler: Unknown/exited
> > > > > >> > TaskTracker: http://slave5:50060
> > > > > >> > You can check the slave log and the mesos-executor log, which is normally
> > > > > >> > located in a dir like
> > > > > >> > "/tmp/mesos/slaves/201304181115-16842879-5050-4680-13/frameworks/201304181115-16842879-5050-4680-0003/executors/executor_Task_Tracker_16/runs/latest/stderr".
> > > > > >> > The log is the tasktracker log.
> > > > > >> >
> > > > > >> > I hope it will help.
> > > > > >> >
> > > > > >> > Guodong
> > > > > >> >
> > > > > >> >
> > > > > >> > On Thu, Apr 18, 2013 at 5:03 PM, 王瑜 <[email protected]> wrote:
> > > > > >> >
> > > > > >> > > Hi All,
> > > > > >> > >
> > > > > >> > > I have deployed mesos on three nodes: master, slave1, slave5, and it works
> > > > > >> > > well.
> > > > > >> > > Then I set hadoop up over it, using master as namenode, and master, slave1,
> > > > > >> > > slave5 as datanodes. When I use 'jps', it looks like it works well.
> > > > > >> > > [root@master logs]# jps
> > > > > >> > > 13896 RunJar
> > > > > >> > > 14123 Jps
> > > > > >> > > 12718 NameNode
> > > > > >> > > 12900 DataNode
> > > > > >> > > 13374 TaskTracker
> > > > > >> > > 13218 JobTracker
> > > > > >> > >
> > > > > >> > > Then I ran a test benchmark, and it does not make progress...
> > > > > >> > > [root@master hadoop-0.20.205.0]# bin/hadoop jar hadoop-examples-0.20.205.0.jar randomwriter -Dtest.randomwrite.bytes_per_map=6710886 -Dtest.randomwriter.maps_per_host=10 rand
> > > > > >> > > Running 30 maps.
> > > > > >> > > Job started: Thu Apr 18 16:49:36 CST 2013
> > > > > >> > > 13/04/18 16:49:36 INFO mapred.JobClient: Running job: job_201304181646_0001
> > > > > >> > > 13/04/18 16:49:37 INFO mapred.JobClient: map 0% reduce 0%
> > > > > >> > > It stopped here.
> > > > > >> > >
> > > > > >> > > Then I read the log file: hadoop-root-jobtracker-master.log, it shows:
> > > > > >> > > 2013-04-18 16:46:51,724 INFO org.apache.hadoop.mapred.JobTracker: Starting RUNNING
> > > > > >> > > 2013-04-18 16:46:51,726 INFO org.apache.hadoop.ipc.Server: IPC Server handler 5 on 9001: starting
> > > > > >> > > 2013-04-18 16:46:51,727 INFO org.apache.hadoop.ipc.Server: IPC Server handler 6 on 9001: starting
> > > > > >> > > 2013-04-18 16:46:51,727 INFO org.apache.hadoop.ipc.Server: IPC Server handler 9 on 9001: starting
> > > > > >> > > 2013-04-18 16:46:51,727 INFO org.apache.hadoop.ipc.Server: IPC Server handler 7 on 9001: starting
> > > > > >> > > 2013-04-18 16:46:51,727 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 9001: starting
> > > > > >> > > 2013-04-18 16:46:52,557 INFO org.apache.hadoop.net.NetworkTopology: Adding a new node: /default-rack/master
> > > > > >> > > 2013-04-18 16:46:52,560 INFO org.apache.hadoop.mapred.JobTracker: Adding tracker tracker_master:localhost/127.0.0.1:44997 to host master
> > > > > >> > > 2013-04-18 16:46:52,568 INFO org.apache.hadoop.mapred.MesosScheduler: Unknown/exited TaskTracker: http://master:50060.
> > > > > >> > > 2013-04-18 16:46:55,581 INFO org.apache.hadoop.mapred.MesosScheduler: Unknown/exited TaskTracker: http://master:50060.
> > > > > >> > > 2013-04-18 16:46:58,590 INFO org.apache.hadoop.mapred.MesosScheduler: Unknown/exited TaskTracker: http://master:50060.
> > > > > >> > > 2013-04-18 16:47:01,600 INFO org.apache.hadoop.mapred.MesosScheduler: Unknown/exited TaskTracker: http://master:50060.
> > > > > >> > > 2013-04-18 16:47:04,609 INFO org.apache.hadoop.mapred.MesosScheduler: Unknown/exited TaskTracker: http://master:50060.
> > > > > >> > > 2013-04-18 16:47:07,618 INFO org.apache.hadoop.mapred.MesosScheduler: Unknown/exited TaskTracker: http://master:50060.
> > > > > >> > > 2013-04-18 16:47:10,625 INFO org.apache.hadoop.mapred.MesosScheduler: Unknown/exited TaskTracker: http://master:50060.
> > > > > >> > > 2013-04-18 16:47:13,632 INFO org.apache.hadoop.mapred.MesosScheduler: Unknown/exited TaskTracker: http://master:50060.
> > > > > >> > > 2013-04-18 16:47:13,686 INFO org.apache.hadoop.net.NetworkTopology: Adding a new node: /default-rack/slave5
> > > > > >> > > 2013-04-18 16:47:13,686 INFO org.apache.hadoop.mapred.JobTracker: Adding tracker tracker_slave5:127.0.0.1/127.0.0.1:60621 to host slave5
> > > > > >> > > 2013-04-18 16:47:13,687 INFO org.apache.hadoop.mapred.MesosScheduler: Unknown/exited TaskTracker: http://slave5:50060.
> > > > > >> > > 2013-04-18 16:47:16,638 INFO org.apache.hadoop.mapred.MesosScheduler: Unknown/exited TaskTracker: http://master:50060.
> > > > > >> > > 2013-04-18 16:47:16,697 INFO org.apache.hadoop.mapred.MesosScheduler: Unknown/exited TaskTracker: http://slave5:50060.
> > > > > >> > > 2013-04-18 16:47:19,645 INFO org.apache.hadoop.mapred.MesosScheduler: Unknown/exited TaskTracker: http://master:50060.
> > > > > >> > > 2013-04-18 16:47:19,707 INFO org.apache.hadoop.mapred.MesosScheduler: Unknown/exited TaskTracker: http://slave5:50060.
> > > > > >> > > 2013-04-18 16:47:22,651 INFO org.apache.hadoop.mapred.MesosScheduler: Unknown/exited TaskTracker: http://master:50060.
> > > > > >> > > 2013-04-18 16:47:22,715 INFO org.apache.hadoop.mapred.MesosScheduler: Unknown/exited TaskTracker: http://slave5:50060.
> > > > > >> > > 2013-04-18 16:47:25,658 INFO org.apache.hadoop.mapred.MesosScheduler: Unknown/exited TaskTracker: http://master:50060.
> > > > > >> > > 2013-04-18 16:47:25,725 INFO org.apache.hadoop.mapred.MesosScheduler: Unknown/exited TaskTracker: http://slave5:50060.
> > > > > >> > > 2013-04-18 16:47:28,665 INFO org.apache.hadoop.mapred.MesosScheduler: Unknown/exited TaskTracker: http://master:50060.
> > > > > >> > >
> > > > > >> > > Can anybody help me? Thanks very much!
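
For a JobTracker log like the one quoted above, it can help to see which hosts the MesosScheduler is complaining about rather than reading every repeated line. The following is a minimal sketch, assuming each log entry sits on one line in the "Unknown/exited TaskTracker: <url>." form shown in this thread; the function name `count_unknown_tasktrackers` is hypothetical and the sample lines are copied from the log above.

```python
import re
from collections import Counter

# Matches the message text printed by MesosScheduler in the quoted log.
UNKNOWN_TT = re.compile(r"Unknown/exited TaskTracker:\s*(\S+)")


def count_unknown_tasktrackers(log_text):
    """Count 'Unknown/exited TaskTracker' messages per TaskTracker URL."""
    counts = Counter()
    for line in log_text.splitlines():
        match = UNKNOWN_TT.search(line)
        if match:
            # The log ends each message with a period; strip it from the URL.
            counts[match.group(1).rstrip(".")] += 1
    return counts


SAMPLE_LOG = """\
2013-04-18 16:47:16,638 INFO org.apache.hadoop.mapred.MesosScheduler: Unknown/exited TaskTracker: http://master:50060.
2013-04-18 16:47:16,697 INFO org.apache.hadoop.mapred.MesosScheduler: Unknown/exited TaskTracker: http://slave5:50060.
2013-04-18 16:47:19,645 INFO org.apache.hadoop.mapred.MesosScheduler: Unknown/exited TaskTracker: http://master:50060.
"""

if __name__ == "__main__":
    for url, n in count_unknown_tasktrackers(SAMPLE_LOG).most_common():
        print(url, n)
```

Every URL that shows up here corresponds to a TaskTracker the scheduler did not launch itself, which matches the explanation given in the replies.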
