Hi,

I tried to write the large amount of data into HBase. The map processes
could be completed, but while processing the reduce tasks, the job is
failed.
The error message is shown as follow:
------------------------------------------------------------------------
...
09/08/07 19:45:47 INFO mapred.JobClient:  map 100% reduce 5%
09/08/07 19:46:21 INFO mapred.JobClient:  map 100% reduce 6%
09/08/07 19:46:27 INFO mapred.JobClient:  map 100% reduce 5%
java.io.IOException: Job failed!
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1232)
    at icl.atc.ites.hbase.PIndexCreator.run(PIndexCreator.java:571)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at icl.atc.ites.hbase.PIndexCreator.main(PIndexCreator.java:598)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:165)
    at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
    at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)

And the error message on web ui:

org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out
trying to locate root region
        at 
org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRootRegion(HConnectionManager.java:786)
        at 
org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:465)
        at 
org.apache.hadoop.hbase.client.HConnectionManager$TableServers.relocateRegion(HConnectionManager.java:447)
        at 
org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegionInMeta(HConnectionManager.java:575)
        at 
org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:474)
        at 
org.apache.hadoop.hbase.client.HConnectionManager$TableServers.relocateRegion(HConnectionManager.java:447)
        at 
org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegionInMeta(HConnectionManager.java:575)
        at 
org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:478)
        at 
org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:440)
        at org.apache.hadoop.hbase.client.HTable.(HTable.java:115)
        at org.apache.hadoop.hbase.client.HTable.(HTable.java:98)
        at 
org.apache.hadoop.hbase.mapred.TableOutputFormat.getRecordWriter(TableOutputFormat.java:90)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:404)
        at org.apache.hadoop.mapred.Child.main(Child.java:158)


-------------------------------------------------------------------
And I checked both the userlog and the tasktracker log in one of the machine
(these two files are located at the machine #4 in my cluster):
Userlog:
-------------------------------------------------------------------------------------------------
...

2009-08-06 18:23:25,712 INFO org.apache.hadoop.mapred.Merger: Down to
the last merge-pass, with 16 segments left of total size: 163329244
bytes
2009-08-06 18:23:26,867 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200908051649_0004_r_000000_1 Scheduled 1 outputs (0 slow hosts
and0 dup hosts)
2009-08-06 18:23:27,531 INFO org.apache.hadoop.mapred.ReduceTask:
Shuffling 15322761 bytes (15322765 raw bytes) into RAM from
attempt_200908051649_0004_m_000155_0
2009-08-06 18:23:31,360 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200908051649_0004_r_000000_1 Merge of the 16 files in-memory
complete. Local file is
/home/chu/hadoop19/mapred/local/taskTracker/jobcache/job_200908051649_0004/attempt_200908051649_0004_r_000000_1/output/map_136.out
of size 163329218
2009-08-06 18:23:31,360 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200908051649_0004_r_000000_1 Thread waiting: Thread for
merging on-disk files
2009-08-06 18:23:33,203 INFO org.apache.hadoop.mapred.ReduceTask: Read
15322761 bytes from map-output for
attempt_200908051649_0004_m_000155_0
2009-08-06 18:23:33,203 INFO org.apache.hadoop.mapred.ReduceTask: Rec
#1 from attempt_200908051649_0004_m_000155_0 -> (32, 26) from
ITeS-clusterPC2
2009-08-06 18:23:33,867 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200908051649_0004_r_000000_1 Scheduled 1 outputs (0 slow hosts
and0 dup hosts)
2009-08-06 18:23:34,232 INFO org.apache.hadoop.mapred.ReduceTask:
Shuffling 14110457 bytes (14110461 raw bytes) into RAM from
attempt_200908051649_0004_m_000153_0
2009-08-06 18:23:42,491 INFO org.apache.hadoop.mapred.ReduceTask: Read
14110457 bytes from map-output for
attempt_200908051649_0004_m_000153_0
2009-08-06 18:23:42,491 INFO org.apache.hadoop.mapred.ReduceTask: Rec
#1 from attempt_200908051649_0004_m_000153_0 -> (32, 26) from
ITeS-clusterPC2
2009-08-06 18:23:43,866 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200908051649_0004_r_000000_1 Scheduled 1 outputs (0 slow hosts
and0 dup hosts)
2009-08-06 18:23:44,050 INFO org.apache.hadoop.mapred.ReduceTask:
Shuffling 13510680 bytes (13510684 raw bytes) into RAM from
attempt_200908051649_0004_m_000151_0
2009-08-06 18:23:48,791 INFO org.apache.hadoop.mapred.ReduceTask: Read
13510680 bytes from map-output for
attempt_200908051649_0004_m_000151_0
2009-08-06 18:23:48,791 INFO org.apache.hadoop.mapred.ReduceTask: Rec
#1 from attempt_200908051649_0004_m_000151_0 -> (32, 26) from
ITeS-clusterPC2
2009-08-06 18:23:49,867 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200908051649_0004_r_000000_1 Scheduled 1 outputs (0 slow hosts
and0 dup hosts)
2009-08-06 18:23:50,194 INFO org.apache.hadoop.mapred.ReduceTask:
Shuffling 14813689 bytes (14813693 raw bytes) into RAM from
attempt_200908051649_0004_m_000154_0
2009-08-06 18:23:57,301 INFO org.apache.hadoop.mapred.ReduceTask: Read
14813689 bytes from map-output for
attempt_200908051649_0004_m_000154_0
2009-08-06 18:23:57,301 INFO org.apache.hadoop.mapred.ReduceTask: Rec
#1 from attempt_200908051649_0004_m_000154_0 -> (32, 26) from
ITeS-clusterPC2
2009-08-06 18:23:57,867 INFO org.apache.hadoop.mapred.ReduceTask:
Closed ram manager
2009-08-06 18:23:57,867 INFO org.apache.hadoop.mapred.ReduceTask:
Interleaved on-disk merge complete: 13 files left.
2009-08-06 18:23:57,867 INFO org.apache.hadoop.mapred.ReduceTask:
In-memory merge complete: 5 files left.
2009-08-06 18:23:57,869 INFO org.apache.hadoop.mapred.ReduceTask:
Keeping 5 segments, 73734381 bytes in memory for intermediate, on-disk
merge
2009-08-06 18:23:57,872 INFO org.apache.hadoop.mapred.ReduceTask:
Merging 13 files, 2343086626 bytes from disk
2009-08-06 18:23:58,671 INFO org.apache.hadoop.mapred.ReduceTask:
Merging 0 segments, 0 bytes from memory into reduce
2009-08-06 18:23:58,671 INFO org.apache.hadoop.mapred.Merger: Merging
18 sorted segments
2009-08-06 18:23:59,962 INFO org.apache.hadoop.mapred.Merger: Merging
9 intermediate segments out of a total of 18
2009-08-06 18:27:44,177 INFO org.apache.hadoop.mapred.Merger: Down to
the last merge-pass, with 10 segments left of total size: 2343086558
bytes
2009-08-06 18:29:04,667 ERROR
org.apache.hadoop.hbase.mapred.TableOutputFormat:
org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out
trying to locate root region
2009-08-06 18:29:04,699 WARN org.apache.hadoop.mapred.TaskTracker:
Error running child
org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out
trying to locate root region
        at 
org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRootRegion(HConnectionManager.java:786)
        at 
org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:465)
        at 
org.apache.hadoop.hbase.client.HConnectionManager$TableServers.relocateRegion(HConnectionManager.java:447)
        at 
org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegionInMeta(HConnectionManager.java:575)
        at 
org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:474)
        at 
org.apache.hadoop.hbase.client.HConnectionManager$TableServers.relocateRegion(HConnectionManager.java:447)
        at 
org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegionInMeta(HConnectionManager.java:575)
        at 
org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:478)
        at 
org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:440)
        at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:115)
        at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:98)
        at 
org.apache.hadoop.hbase.mapred.TableOutputFormat.getRecordWriter(TableOutputFormat.java:90)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:404)
        at org.apache.hadoop.mapred.Child.main(Child.java:158)

2009-08-06 18:29:04,716 INFO org.apache.hadoop.mapred.TaskRunner:
Runnning cleanup for the task


------------------------------------------------------------------------------------------------------------

  And tasklog:

2009-08-06 18:23:25,712 INFO org.apache.hadoop.mapred.Merger: Down to
the last merge-pass, with 16 segments left of total size: 163329244
bytes
2009-08-06 18:23:26,867 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200908051649_0004_r_000000_1 Scheduled 1 outputs (0 slow hosts
and0 dup hosts)
2009-08-06 18:23:27,531 INFO org.apache.hadoop.mapred.ReduceTask:
Shuffling 15322761 bytes (15322765 raw bytes) into RAM from
attempt_200908051649_0004_m_000155_0
2009-08-06 18:23:31,360 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200908051649_0004_r_000000_1 Merge of the 16 files in-memory
complete. Local file is
/home/chu/hadoop19/mapred/local/taskTracker/jobcache/job_200908051649_0004/attempt_200908051649_0004_r_000000_1/output/map_136.out
of size 163329218
2009-08-06 18:23:31,360 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200908051649_0004_r_000000_1 Thread waiting: Thread for
merging on-disk files
2009-08-06 18:23:33,203 INFO org.apache.hadoop.mapred.ReduceTask: Read
15322761 bytes from map-output for
attempt_200908051649_0004_m_000155_0
2009-08-06 18:23:33,203 INFO org.apache.hadoop.mapred.ReduceTask: Rec
#1 from attempt_200908051649_0004_m_000155_0 -> (32, 26) from
ITeS-clusterPC2
2009-08-06 18:23:33,867 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200908051649_0004_r_000000_1 Scheduled 1 outputs (0 slow hosts
and0 dup hosts)
2009-08-06 18:23:34,232 INFO org.apache.hadoop.mapred.ReduceTask:
Shuffling 14110457 bytes (14110461 raw bytes) into RAM from
attempt_200908051649_0004_m_000153_0
2009-08-06 18:23:42,491 INFO org.apache.hadoop.mapred.ReduceTask: Read
14110457 bytes from map-output for
attempt_200908051649_0004_m_000153_0
2009-08-06 18:23:42,491 INFO org.apache.hadoop.mapred.ReduceTask: Rec
#1 from attempt_200908051649_0004_m_000153_0 -> (32, 26) from
ITeS-clusterPC2
2009-08-06 18:23:43,866 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200908051649_0004_r_000000_1 Scheduled 1 outputs (0 slow hosts
and0 dup hosts)
2009-08-06 18:23:44,050 INFO org.apache.hadoop.mapred.ReduceTask:
Shuffling 13510680 bytes (13510684 raw bytes) into RAM from
attempt_200908051649_0004_m_000151_0
2009-08-06 18:23:48,791 INFO org.apache.hadoop.mapred.ReduceTask: Read
13510680 bytes from map-output for
attempt_200908051649_0004_m_000151_0
2009-08-06 18:23:48,791 INFO org.apache.hadoop.mapred.ReduceTask: Rec
#1 from attempt_200908051649_0004_m_000151_0 -> (32, 26) from
ITeS-clusterPC2
2009-08-06 18:23:49,867 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200908051649_0004_r_000000_1 Scheduled 1 outputs (0 slow hosts
and0 dup hosts)
2009-08-06 18:23:50,194 INFO org.apache.hadoop.mapred.ReduceTask:
Shuffling 14813689 bytes (14813693 raw bytes) into RAM from
attempt_200908051649_0004_m_000154_0
2009-08-06 18:23:57,301 INFO org.apache.hadoop.mapred.ReduceTask: Read
14813689 bytes from map-output for
attempt_200908051649_0004_m_000154_0
2009-08-06 18:23:57,301 INFO org.apache.hadoop.mapred.ReduceTask: Rec
#1 from attempt_200908051649_0004_m_000154_0 -> (32, 26) from
ITeS-clusterPC2
2009-08-06 18:23:57,867 INFO org.apache.hadoop.mapred.ReduceTask:
Closed ram manager
2009-08-06 18:23:57,867 INFO org.apache.hadoop.mapred.ReduceTask:
Interleaved on-disk merge complete: 13 files left.
2009-08-06 18:23:57,867 INFO org.apache.hadoop.mapred.ReduceTask:
In-memory merge complete: 5 files left.
2009-08-06 18:23:57,869 INFO org.apache.hadoop.mapred.ReduceTask:
Keeping 5 segments, 73734381 bytes in memory for intermediate, on-disk
merge
2009-08-06 18:23:57,872 INFO org.apache.hadoop.mapred.ReduceTask:
Merging 13 files, 2343086626 bytes from disk
2009-08-06 18:23:58,671 INFO org.apache.hadoop.mapred.ReduceTask:
Merging 0 segments, 0 bytes from memory into reduce
2009-08-06 18:23:58,671 INFO org.apache.hadoop.mapred.Merger: Merging
18 sorted segments
2009-08-06 18:23:59,962 INFO org.apache.hadoop.mapred.Merger: Merging
9 intermediate segments out of a total of 18
2009-08-06 18:27:44,177 INFO org.apache.hadoop.mapred.Merger: Down to
the last merge-pass, with 10 segments left of total size: 2343086558
bytes
2009-08-06 18:29:04,667 ERROR
org.apache.hadoop.hbase.mapred.TableOutputFormat:
org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out
trying to locate root region

-----------------------------------------------------------------------------------------------------------------------------------------------

The error seems occured in the sort stage of the reduce task. It never
occurs if I write the small data into HBase.
Can anyone give me a help? Thank you very much!


chu

Reply via email to