Hi Yunhong,

As per the output it seems the job ran to successful completion (albeit with some failures)...

Devaraj
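As a quick cross-check of the numbers in the quoted output below (my reading of the counters, not anything beyond what is quoted): the MRBench AvgTime of 926333 ms is about 15.4 minutes, which matches the wall time from job launch at 18:53:09 to completion at 19:08:35 -- almost all of it spent stalled in the copy phase before the fetch-failure at 19:08:27 triggered a map re-execution.

```python
# 926333 ms reported as AvgTime by MRBench, converted to minutes.
# 19:08:35 - 18:53:09 is 15 min 26 s (926 s), so the counter is dominated
# by the stalled reduce copy phase, not by actual map/reduce work.
avg_time_ms = 926333
print(round(avg_time_ms / 60000.0, 1), "minutes")  # -> 15.4 minutes
```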
> -----Original Message-----
> From: Yunhong Gu1 [mailto:[EMAIL PROTECTED]
> Sent: Saturday, January 19, 2008 8:56 AM
> To: hadoop-user@lucene.apache.org
> Subject: Re: Reduce hangs
>
> Yes, it looks like HADOOP-1374
>
> The program actually failed after a while:
>
> [EMAIL PROTECTED]:~/hadoop-0.15.2$ ./bin/hadoop jar hadoop-0.15.2-test.jar mrbench
> MRBenchmark.0.0.2
> 08/01/18 18:53:08 INFO mapred.MRBench: creating control file: 1 numLines, ASCENDING sortOrder
> 08/01/18 18:53:08 INFO mapred.MRBench: created control file: /benchmarks/MRBench/mr_input/input_-450753747.txt
> 08/01/18 18:53:09 INFO mapred.MRBench: Running job 0: input=/benchmarks/MRBench/mr_input output=/benchmarks/MRBench/mr_output/output_1843693325
> 08/01/18 18:53:09 INFO mapred.FileInputFormat: Total input paths to process : 1
> 08/01/18 18:53:09 INFO mapred.JobClient: Running job: job_200801181852_0001
> 08/01/18 18:53:10 INFO mapred.JobClient:  map 0% reduce 0%
> 08/01/18 18:53:17 INFO mapred.JobClient:  map 100% reduce 0%
> 08/01/18 18:53:25 INFO mapred.JobClient:  map 100% reduce 16%
> 08/01/18 19:08:27 INFO mapred.JobClient: Task Id : task_200801181852_0001_m_000001_0, Status : FAILED Too many fetch-failures
> 08/01/18 19:08:27 WARN mapred.JobClient: Error reading task outputncdm15
> 08/01/18 19:08:27 WARN mapred.JobClient: Error reading task outputncdm15
> 08/01/18 19:08:34 INFO mapred.JobClient:  map 100% reduce 100%
> 08/01/18 19:08:35 INFO mapred.JobClient: Job complete: job_200801181852_0001
> 08/01/18 19:08:35 INFO mapred.JobClient: Counters: 10
> 08/01/18 19:08:35 INFO mapred.JobClient:   Job Counters
> 08/01/18 19:08:35 INFO mapred.JobClient:     Launched map tasks=3
> 08/01/18 19:08:35 INFO mapred.JobClient:     Launched reduce tasks=1
> 08/01/18 19:08:35 INFO mapred.JobClient:     Data-local map tasks=2
> 08/01/18 19:08:35 INFO mapred.JobClient:   Map-Reduce Framework
> 08/01/18 19:08:35 INFO mapred.JobClient:     Map input records=1
> 08/01/18 19:08:35 INFO mapred.JobClient:     Map output records=1
> 08/01/18 19:08:35 INFO mapred.JobClient:     Map input bytes=2
> 08/01/18 19:08:35 INFO mapred.JobClient:     Map output bytes=5
> 08/01/18 19:08:35 INFO mapred.JobClient:     Reduce input groups=1
> 08/01/18 19:08:35 INFO mapred.JobClient:     Reduce input records=1
> 08/01/18 19:08:35 INFO mapred.JobClient:     Reduce output records=1
> DataLines    Maps    Reduces    AvgTime (milliseconds)
> 1            2       1          926333
>
>
> On Fri, 18 Jan 2008, Konstantin Shvachko wrote:
>
> > Looks like we still have this unsolved mysterious problem:
> >
> > http://issues.apache.org/jira/browse/HADOOP-1374
> >
> > Could it be related to HADOOP-1246? Arun?
> >
> > Thanks,
> > --Konstantin
> >
> > Yunhong Gu1 wrote:
> >>
> >> Hi,
> >>
> >> If someone knows how to fix the problem described below, please help
> >> me out. Thanks!
> >>
> >> I am testing Hadoop on a 2-node cluster and the "reduce" always hangs
> >> at some stage, even if I use different clusters. My OS is Debian
> >> Linux kernel 2.6 (AMD Opteron w/ 4GB Mem). Hadoop version is 0.15.2.
> >> Java version is 1.5.0_01-b08.
> >>
> >> I simply tried "./bin/hadoop jar hadoop-0.15.2-test.jar mrbench", and
> >> when the map stage finishes, the reduce stage will hang somewhere in
> >> the middle, sometimes at 0%. I also tried the other MapReduce programs
> >> in the example jar package, but they all hang.
> >>
> >> The log file simply prints
> >>
> >> 2008-01-18 15:15:50,831 INFO org.apache.hadoop.mapred.TaskTracker:
> >> task_200801181424_0004_r_000000_0 0.0% reduce > copy >
> >> 2008-01-18 15:15:56,841 INFO org.apache.hadoop.mapred.TaskTracker:
> >> task_200801181424_0004_r_000000_0 0.0% reduce > copy >
> >> 2008-01-18 15:16:02,850 INFO org.apache.hadoop.mapred.TaskTracker:
> >> task_200801181424_0004_r_000000_0 0.0% reduce > copy >
> >>
> >> forever.
> >>
> >> The program does work if I start Hadoop only on a single node.
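One frequent culprit for reduces stuck at "0% reduce > copy" on multi-node clusters (an assumption worth checking, not confirmed to be the cause here) is the slave's hostname resolving to a loopback address via /etc/hosts: the JobTracker then advertises an address other nodes cannot fetch map output from, and the reduce sits in the copy phase until the fetch times out. A minimal sketch of that check:

```python
import socket

# Sanity check: if this node's hostname resolves to 127.x.x.x, other
# nodes cannot reach it to fetch map output, which shows up as reduces
# hanging in the copy phase followed by "Too many fetch-failures".
name = socket.gethostname()
try:
    addr = socket.gethostbyname(name)
except socket.gaierror:
    addr = None

if addr is None:
    print(f"{name} does not resolve at all")
elif addr.startswith("127."):
    print(f"WARNING: {name} resolves to {addr}; unreachable from other nodes")
else:
    print(f"{name} resolves to {addr}")
```

Running this on each node and confirming a routable address (and that both nodes agree on each other's names) rules this cause out quickly.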
> >>
> >> Below is my hadoop-site.xml configuration:
> >>
> >> <configuration>
> >>
> >>   <property>
> >>     <name>fs.default.name</name>
> >>     <value>10.0.0.1:60000</value>
> >>   </property>
> >>
> >>   <property>
> >>     <name>mapred.job.tracker</name>
> >>     <value>10.0.0.1:60001</value>
> >>   </property>
> >>
> >>   <property>
> >>     <name>dfs.data.dir</name>
> >>     <value>/raid/hadoop/data</value>
> >>   </property>
> >>
> >>   <property>
> >>     <name>mapred.local.dir</name>
> >>     <value>/raid/hadoop/mapred</value>
> >>   </property>
> >>
> >>   <property>
> >>     <name>hadoop.tmp.dir</name>
> >>     <value>/raid/hadoop/tmp</value>
> >>   </property>
> >>
> >>   <property>
> >>     <name>mapred.child.java.opts</name>
> >>     <value>-Xmx1024m</value>
> >>   </property>
> >>
> >>   <property>
> >>     <name>mapred.tasktracker.tasks.maximum</name>
> >>     <value>4</value>
> >>   </property>
> >>
> >>   <!--
> >>   <property>
> >>     <name>mapred.map.tasks</name>
> >>     <value>7</value>
> >>   </property>
> >>
> >>   <property>
> >>     <name>mapred.reduce.tasks</name>
> >>     <value>3</value>
> >>   </property>
> >>   -->
> >>
> >>   <property>
> >>     <name>fs.inmemory.size.mb</name>
> >>     <value>200</value>
> >>   </property>
> >>
> >>   <property>
> >>     <name>dfs.block.size</name>
> >>     <value>134217728</value>
> >>   </property>
> >>
> >>   <property>
> >>     <name>io.sort.factor</name>
> >>     <value>100</value>
> >>   </property>
> >>
> >>   <property>
> >>     <name>io.sort.mb</name>
> >>     <value>200</value>
> >>   </property>
> >>
> >>   <property>
> >>     <name>io.file.buffer.size</name>
> >>     <value>131072</value>
> >>   </property>
> >>
> >> </configuration>
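Given the addresses in the configuration above, a basic reachability sketch can separate a firewall/connectivity problem from a Hadoop-level one. Note the assumptions: the 10.0.0.2 slave address is a guess at the second node, and 50060 is the default TaskTracker HTTP port of that era -- the reduce copy phase fetches map output over that HTTP server, so it must be open between nodes.

```python
import socket

def can_reach(host, port, timeout=1.0):
    # True if a plain TCP connection to host:port succeeds within the timeout.
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Hypothetical targets based on the config above: the NameNode and
# JobTracker on 10.0.0.1, plus the assumed slave's TaskTracker HTTP
# server (port 50060 by default), which serves map output to reducers.
for host, port in [("10.0.0.1", 60000),
                   ("10.0.0.1", 60001),
                   ("10.0.0.2", 50060)]:
    print(host, port, "reachable" if can_reach(host, port) else "NOT reachable")
```

If the master ports answer but the TaskTracker HTTP port does not, that would line up with the observed pattern of maps finishing locally while remote fetches fail.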