I am using Hadoop 0.15.1 to index a catalog that has a tree-like
structure, where the leaf nodes are data files.  My main task is a
loop that performs a breadth-first walk of the tree: at each level, a
mapper parses out the URLs of the catalogs and data files at the next
level.  To decide when the loop should terminate, I use a reduce task
that counts the number of new catalogs found, and I stop the loop when
the count is 0.
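
Sketched out, the driver loop looks roughly like the code below.  This is a
simplified stand-in rather than my actual code: CatalogMapper,
CatalogCountReducer, the level-by-level path layout, and the way the count is
read back are all placeholders, and it assumes the 0.15-era
JobConf.setInputPath/setOutputPath calls.

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.KeyValueTextInputFormat;

public class CatalogWalker {

    public static void main(String[] args) throws IOException {
        int level = 0;
        long newCatalogs;
        do {
            Path in  = new Path("catalogs/level-" + level);        // catalogs found at this level
            Path out = new Path("catalogs/level-" + (level + 1));  // catalogs found at the next level

            JobConf conf = new JobConf(CatalogWalker.class);
            conf.setJobName("catalog-walk-" + level);

            conf.setInputFormat(KeyValueTextInputFormat.class);
            conf.setInputPath(in);
            conf.setOutputPath(out);

            conf.setOutputKeyClass(Text.class);
            conf.setOutputValueClass(Text.class);

            conf.setMapperClass(CatalogMapper.class);        // parses child catalog/datafile URLs
            conf.setReducerClass(CatalogCountReducer.class); // counts the new catalogs it sees

            JobClient.runJob(conf);

            // The reducer writes a single "new-catalogs<TAB>N" line; stop when N is 0.
            newCatalogs = readCount(FileSystem.get(conf), new Path(out, "part-00000"));
            level++;
        } while (newCatalogs > 0);
    }

    // Placeholder for however the count gets read back from the reducer's output.
    private static long readCount(FileSystem fs, Path part) throws IOException {
        BufferedReader in = new BufferedReader(new InputStreamReader(fs.open(part)));
        try {
            String line = in.readLine();
            return line == null ? 0 : Long.parseLong(line.split("\t")[1]);
        } finally {
            in.close();
        }
    }
}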

But while running the jobs, I kept getting the exception pasted below
from the logs, and I don't quite understand what it is trying to say.
I never use LongWritable anywhere in my code: only Text for the output
keys and values, and KeyValueTextInputFormat for the input.
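
To be concrete, the only lines in the whole driver that mention key/value
classes or the input format are these (excerpted from the same simplified
sketch as above):

// The only type-related configuration in the driver; nothing sets LongWritable.
conf.setInputFormat(KeyValueTextInputFormat.class);  // lines parsed as Text key <TAB> Text value
conf.setOutputKeyClass(Text.class);
conf.setOutputValueClass(Text.class);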

What's weirder is that the exception occurs at a different point from
job to job: sometimes it is thrown on the 2nd iteration of my loop,
other times on the 3rd, the 4th, etc.  Can someone explain what is
happening here and why?  Also, what is the best way to test/debug a
Hadoop job?  Thanks.


2008-01-16 00:37:19,941 INFO org.apache.hadoop.mapred.ReduceTask:
task_200801160024_0011_r_000000_1 Copying
task_200801160024_0011_m_000000_0 output from ginkgo.mycluster.org
2008-01-16 00:37:19,953 INFO org.apache.hadoop.mapred.ReduceTask:
task_200801160024_0011_r_000000_1 done copying
task_200801160024_0011_m_000000_0 output from ginkgo.mycluster.org
2008-01-16 00:37:19,955 INFO org.apache.hadoop.mapred.ReduceTask:
task_200801160024_0011_r_000000_1 Copying of all map outputs complete.
Initiating the last merge on the remaining files in
ramfs://mapoutput26453615
2008-01-16 00:37:20,088 WARN org.apache.hadoop.mapred.ReduceTask:
task_200801160024_0011_r_000000_1 Final merge of the inmemory files
threw an exception: java.io.IOException: java.io.IOException: wrong
key class: class org.apache.hadoop.io.LongWritable is not class
org.apache.hadoop.io.Text
        at org.apache.hadoop.io.SequenceFile$Sorter$SegmentDescriptor.nextRawKey(SequenceFile.java:2874)
        at org.apache.hadoop.io.SequenceFile$Sorter$MergeQueue.merge(SequenceFile.java:2683)
        at org.apache.hadoop.io.SequenceFile$Sorter.merge(SequenceFile.java:2437)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier.fetchOutputs(ReduceTask.java:1153)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:252)
        at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1760)

        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier.fetchOutputs(ReduceTask.java:1161)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:252)
        at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1760)

2008-01-16 00:37:20,090 WARN org.apache.hadoop.mapred.TaskTracker:
Error running child
java.io.IOException: task_200801160024_0011_r_000000_1The reduce copier failed
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:253)
        at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1760)



-- 
--------------------------------------
Standing Bear Has Spoken
--------------------------------------