On Fri, Aug 27, 2010 at 3:13 PM, xu cheng <[email protected]> wrote:
> hello guys:
> I'm doing some experiments on my 3-node virtual machine cluster running
> Hadoop 0.21.0: one node for the namenode and jobtracker, and the other two
> for datanodes and tasktrackers.
> When I run a job, I get messages like these:
>
> 10/08/27 17:28:58 INFO mapreduce.Job: map 0% reduce 0%
> 10/08/27 17:29:10 INFO mapreduce.Job: map 50% reduce 0%
> 10/08/27 17:29:12 INFO mapreduce.Job: map 83% reduce 0%
> 10/08/27 17:29:19 INFO mapreduce.Job: map 83% reduce 16%
> 10/08/27 17:29:24 INFO mapreduce.Job: map 100% reduce 16%
>
> The reduce runs while the map tasks haven't finished! (I read in the books
> that the reduce tasks run only after the maps finish!)

The reducer operation consists of three steps - copy, sort, and reduce -
each contributing roughly a third of the reported progress. What you're
seeing at this point is the copy phase, which starts fetching map outputs
as soon as individual maps complete. The 'reduce' phase itself indeed runs
only after all maps finish, that is true.
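To make the breakdown concrete, here is a minimal sketch of my own (not
code from this thread) that polls a submitted job through the
org.apache.hadoop.mapreduce.Job API; the ProgressWatcher name and the
5-second poll interval are arbitrary. A reported "reduce 16%" falls in the
first third, i.e. roughly halfway through the copy phase:

// Minimal sketch: watch a running job and label which reduce-side phase
// the single reduce percentage currently corresponds to.
import org.apache.hadoop.mapreduce.Job;

public class ProgressWatcher {
    public static void watch(Job job) throws Exception {
        while (!job.isComplete()) {
            float reduce = job.reduceProgress(); // 0.0-1.0 across all three phases
            String phase;
            if (reduce < 1f / 3) {
                phase = "copy";    // fetching map outputs as maps complete
            } else if (reduce < 2f / 3) {
                phase = "sort";    // merging the fetched segments
            } else {
                phase = "reduce";  // running the user's reduce()
            }
            System.out.printf("map %3.0f%% reduce %3.0f%% (reduce-side phase: ~%s)%n",
                    job.mapProgress() * 100, reduce * 100, phase);
            Thread.sleep(5000);
        }
    }
}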
> Is there something wrong with the cluster, or with my knowledge?
>
> By the way, the reduce job got stuck at 16.63% progress. I found that
> people on the internet have also hit this problem, but I haven't found a
> solution.
>
> However, after some time, after the system reported the error message, the
> job began to run again, like this:
>
> 10/08/27 17:29:58 INFO mapreduce.Job: map 100% reduce 16%
> 10/08/27 17:30:01 INFO mapreduce.Job: map 50% reduce 16%
> 10/08/27 17:30:07 INFO mapreduce.Job: map 83% reduce 16%
> 10/08/27 17:30:19 INFO mapreduce.Job: map 100% reduce 16%
> 10/08/27 17:30:25 INFO mapreduce.Job: map 100% reduce 66%
> 10/08/27 17:30:31 INFO mapreduce.Job: map 100% reduce 100%
>
> and then it ran pretty well! Does someone know about this?
>
> Below are the messages from the console. If the logs are needed, let me
> know. Thanks.
>
> Any suggestions and references are appreciated.
> best regards
> xu

Are you running a firewall, and if you are, is it configured properly?
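Reducers fetch map outputs over HTTP from each tasktracker (port 50060 by
default), so a blocked port shows up exactly as the "Connection refused"
and Exceeded MAX_FAILED_UNIQUE_FETCHES errors in your log below. As a quick
check, a sketch like this can probe the shuffle port from each node; the
worker host names and the 2-second timeout are placeholders for your setup:

// Minimal sketch: test whether the default TaskTracker HTTP (shuffle) port
// is reachable on each worker host.
import java.net.InetSocketAddress;
import java.net.Socket;

public class ShuffleProbe {
    public static void main(String[] args) {
        String[] hosts = {"worker1", "worker2"}; // substitute your datanode/tasktracker hosts
        int port = 50060; // default tasktracker HTTP port used by the shuffle
        for (String host : hosts) {
            Socket s = new Socket();
            try {
                s.connect(new InetSocketAddress(host, port), 2000);
                System.out.println(host + ":" + port + " reachable");
            } catch (Exception e) {
                System.out.println(host + ":" + port + " NOT reachable: " + e);
            } finally {
                try { s.close(); } catch (Exception ignored) {}
            }
        }
    }
}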
> 10/08/27 17:28:56 INFO security.Groups: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping; cacheTimeout=300000
> 10/08/27 17:28:56 WARN conf.Configuration: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id
> 10/08/27 17:28:56 WARN mapreduce.JobSubmitter: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
> 10/08/27 17:28:56 INFO input.FileInputFormat: Total input paths to process : 1
> 10/08/27 17:28:56 WARN conf.Configuration: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
> 10/08/27 17:28:56 INFO mapreduce.JobSubmitter: number of splits:2
> 10/08/27 17:28:57 INFO mapreduce.JobSubmitter: adding the following namenodes' delegation tokens:null
> 10/08/27 17:28:57 INFO mapreduce.Job: Running job: job_201008271725_0001
> 10/08/27 17:28:58 INFO mapreduce.Job: map 0% reduce 0%
> 10/08/27 17:29:10 INFO mapreduce.Job: map 50% reduce 0%
> 10/08/27 17:29:12 INFO mapreduce.Job: map 83% reduce 0%
> 10/08/27 17:29:19 INFO mapreduce.Job: map 83% reduce 16%
> 10/08/27 17:29:24 INFO mapreduce.Job: map 100% reduce 16%
> 10/08/27 17:29:48 INFO mapreduce.Job: Task Id : attempt_201008271725_0001_r_000000_0, Status : FAILED
> org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#1
>     at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:124)
>     at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:362)
>     at org.apache.hadoop.mapred.Child$4.run(Child.java:217)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:742)
>     at org.apache.hadoop.mapred.Child.main(Child.java:211)
> Caused by: java.io.IOException: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out.
>     at org.apache.hadoop.mapreduce.task.reduce.ShuffleScheduler.checkReducerHealth(ShuffleScheduler.java:253)
>     at org.apache.hadoop.mapreduce.task.reduce.ShuffleScheduler.copyFailed(ShuffleScheduler.java:187)
>     at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:234)
>     at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:149)
> 10/08/27 17:29:48 WARN mapreduce.Job: Error reading task outputConnection refused
> 10/08/27 17:29:48 WARN mapreduce.Job: Error reading task outputConnection refused
> 10/08/27 17:29:49 INFO mapreduce.Job: map 100% reduce 0%
> 10/08/27 17:29:57 INFO mapreduce.Job: Task Id : attempt_201008271725_0001_m_000000_0, Status : FAILED
> Too many fetch-failures
> 10/08/27 17:29:57 WARN mapreduce.Job: Error reading task outputConnection refused
> 10/08/27 17:29:57 WARN mapreduce.Job: Error reading task outputConnection refused
> 10/08/27 17:29:58 INFO mapreduce.Job: map 100% reduce 16%
> 10/08/27 17:30:01 INFO mapreduce.Job: map 50% reduce 16%
> 10/08/27 17:30:07 INFO mapreduce.Job: map 83% reduce 16%
> 10/08/27 17:30:19 INFO mapreduce.Job: map 100% reduce 16%
> 10/08/27 17:30:25 INFO mapreduce.Job: map 100% reduce 66%
> 10/08/27 17:30:31 INFO mapreduce.Job: map 100% reduce 100%
> 10/08/27 17:30:33 INFO mapreduce.Job: Job complete: job_201008271725_0001
> 10/08/27 17:30:33 INFO mapreduce.Job: Counters: 33
>   FileInputFormatCounters
>     BYTES_READ=76420532
>   FileSystemCounters
>     FILE_BYTES_READ=155979268
>     FILE_BYTES_WRITTEN=239598906
>     HDFS_BYTES_READ=76424828
>     HDFS_BYTES_WRITTEN=78386951
>   Shuffle Errors
>     BAD_ID=0
>     CONNECTION=0
>     IO_ERROR=2
>     WRONG_LENGTH=0
>     WRONG_MAP=0
>     WRONG_REDUCE=0
>   Job Counters
>     Data-local map tasks=3
>     Total time spent by all maps waiting after reserving slots (ms)=0
>     Total time spent by all reduces waiting after reserving slots (ms)=0
>     SLOTS_MILLIS_MAPS=47747
>     SLOTS_MILLIS_REDUCES=77241
>     Launched map tasks=3
>     Launched reduce tasks=2
>   Map-Reduce Framework
>     Combine input records=999998
>     Combine output records=994179
>     Failed Shuffles=1
>     GC time elapsed (ms)=667
>     Map input records=499999
>     Map output bytes=80759850
>     Map output records=999998
>     Merged Map outputs=2
>     Reduce input groups=993814
>     Reduce input records=994179
>     Reduce output records=993814
>     Reduce shuffle bytes=83049802
>     Shuffled Maps=2
>     Spilled Records=2861309
>     SPLIT_RAW_BYTES=200

--
Harsh J
www.harshj.com
