I have a job that merges a set of SequenceFiles (BytesWritable keys, LongWritable values) into a single SequenceFile with the same key/value types, and the reduce phase fails with the following error:

java.io.IOException: Value too large for defined data type
        at java.io.FileInputStream.available(Native Method)
        at org.apache.hadoop.fs.LocalFileSystem$LocalFSFileInputStream.available(LocalFileSystem.java:96)
        at java.io.FilterInputStream.available(FilterInputStream.java:169)
        at java.io.FilterInputStream.available(FilterInputStream.java:169)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:332)
        at java.io.DataInputStream.readFully(DataInputStream.java:202)
        at org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:55)
        at org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:89)
        at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:405)
        at org.apache.hadoop.io.SequenceFile$Sorter$MergeStream.next(SequenceFile.java:871)
        at org.apache.hadoop.io.SequenceFile$Sorter$MergeQueue.merge(SequenceFile.java:915)
        at org.apache.hadoop.io.SequenceFile$Sorter$MergePass.run(SequenceFile.java:800)
        at org.apache.hadoop.io.SequenceFile$Sorter.mergePass(SequenceFile.java:738)
        at org.apache.hadoop.io.SequenceFile$Sorter.sort(SequenceFile.java:542)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:218)
        at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1013)


This error message doesn't make much sense to me; a long should be quite sufficient to hold the values, and the BytesWritable keys don't change. Any ideas as to what could be wrong or how I can debug this to locate the problem?
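For reference, the job itself is trivial; it just passes every record through unchanged. A simplified sketch of the driver (written from memory, so the class name and paths are illustrative rather than the exact code):

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.SequenceFileInputFormat;
import org.apache.hadoop.mapred.SequenceFileOutputFormat;
import org.apache.hadoop.mapred.lib.IdentityMapper;
import org.apache.hadoop.mapred.lib.IdentityReducer;

public class MergeSequenceFiles {
    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(MergeSequenceFiles.class);
        conf.setJobName("merge-sequencefiles");

        // Read BytesWritable/LongWritable SequenceFiles and pass records through unchanged.
        conf.setInputFormat(SequenceFileInputFormat.class);
        conf.setOutputFormat(SequenceFileOutputFormat.class);
        conf.setMapperClass(IdentityMapper.class);
        conf.setReducerClass(IdentityReducer.class);

        conf.setOutputKeyClass(BytesWritable.class);
        conf.setOutputValueClass(LongWritable.class);

        // Single reduce task so everything ends up in one output SequenceFile.
        conf.setNumReduceTasks(1);

        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));

        JobClient.runJob(conf);
    }
}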

Thanks,
--
Vetle Roeim
Opera Software ASA <URL: http://www.opera.com/ >
