FWIW: HADOOP-3940 is merged into the 0.18 branch and should be part of 0.18.1. -C
On Sep 4, 2008, at 6:33 AM, Devaraj Das wrote:
I started a profile of the reduce-task. I've attached the profiling output.
It seems from the samples that ramManager.waitForDataToMerge() doesn't actually wait.
Has anybody seen this behavior?
This has been fixed in HADOOP-3940.
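A minimal sketch, with invented names, of the busy-wait pattern being described here (illustration only, not the actual Hadoop code or the HADOOP-3940 patch): a "wait" method that spins on its condition instead of blocking on a monitor will peg a core exactly as the profile shows.

    // Hypothetical sketch -- class and field names are invented for illustration.
    public class RamManagerSketch {
        private boolean dataAvailable = false;

        // Buggy pattern: "waiting" by spinning consumes a full core.
        // Without synchronization, the write to dataAvailable may never
        // even become visible to this thread.
        public void waitForDataBusy() {
            while (!dataAvailable) {
                // no wait() here -- the loop just spins
            }
        }

        // Correct pattern: block on the object's monitor until notified.
        public synchronized void waitForDataBlocking() throws InterruptedException {
            while (!dataAvailable) {
                wait(); // releases the lock and sleeps until notifyAll()
            }
        }

        public synchronized void signalDataAvailable() {
            dataAvailable = true;
            notifyAll();
        }
    }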
On 9/4/08 6:36 PM, "Espen Amble Kolstad" <[EMAIL PROTECTED]> wrote:
I have the same problem on our cluster.
It seems the reducer tasks are using all CPU, long before there's anything to shuffle.
I started a profile of the reduce-task. I've attached the profiling output.
It seems from the samples that ramManager.waitForDataToMerge() doesn't actually wait.
Has anybody seen this behavior?
Espen
On Thursday 28 August 2008 06:11:42 wangxu wrote:
Hi all,
I am using hadoop-0.18.0-core.jar and nutch-2008-08-18_04-01-55.jar, and running Hadoop on one namenode and 4 slaves.
Attached is my hadoop-site.xml; I didn't change the file hadoop-default.xml.
When the data in the segments is large, this kind of error occurs:
java.io.IOException: Could not obtain block: blk_-2634319951074439134_1129 file=/user/root/crawl_debug/segments/20080825053518/content/part-00002/data
    at org.apache.hadoop.dfs.DFSClient$DFSInputStream.chooseDataNode(DFSClient.java:1462)
    at org.apache.hadoop.dfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1312)
    at org.apache.hadoop.dfs.DFSClient$DFSInputStream.read(DFSClient.java:1417)
    at java.io.DataInputStream.readFully(DataInputStream.java:178)
    at org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:64)
    at org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:102)
    at org.apache.hadoop.io.SequenceFile$Reader.readBuffer(SequenceFile.java:1646)
    at org.apache.hadoop.io.SequenceFile$Reader.seekToCurrentValue(SequenceFile.java:1712)
    at org.apache.hadoop.io.SequenceFile$Reader.getCurrentValue(SequenceFile.java:1787)
    at org.apache.hadoop.mapred.SequenceFileRecordReader.getCurrentValue(SequenceFileRecordReader.java:104)
    at org.apache.hadoop.mapred.SequenceFileRecordReader.next(SequenceFileRecordReader.java:79)
    at org.apache.hadoop.mapred.join.WrappedRecordReader.next(WrappedRecordReader.java:112)
    at org.apache.hadoop.mapred.join.WrappedRecordReader.accept(WrappedRecordReader.java:130)
    at org.apache.hadoop.mapred.join.CompositeRecordReader.fillJoinCollector(CompositeRecordReader.java:398)
    at org.apache.hadoop.mapred.join.JoinRecordReader.next(JoinRecordReader.java:56)
    at org.apache.hadoop.mapred.join.JoinRecordReader.next(JoinRecordReader.java:33)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:165)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:45)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:227)
    at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2209)
How can I correct this?
Thanks.
Xu
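(A suggested first diagnostic, not from the original thread: hadoop fsck is the standard HDFS health-check tool, and can show whether the block in the trace is genuinely missing or under-replicated rather than a transient read failure. The path below is taken from the stack trace above.)

    bin/hadoop fsck /user/root/crawl_debug/segments/20080825053518/content/part-00002/data -files -blocks -locations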