Hi all -

I am getting the following exception while running an itemsimilarity job:

java.io.IOException: Task: attempt_201106201353_0017_r_000000_0 - The reduce copier failed
       at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:388)
       at org.apache.hadoop.mapred.Child$4.run(Child.java:259)
       at java.security.AccessController.doPrivileged(Native Method)
       at javax.security.auth.Subject.doAs(Subject.java:396)
       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
       at org.apache.hadoop.mapred.Child.main(Child.java:253)
Caused by: java.io.IOException: java.lang.RuntimeException: java.io.EOFException
       at org.apache.hadoop.io.WritableComparator.compare(WritableComparator.java:103)
       at org.apache.hadoop.mapred.Merger$MergeQueue.lessThan(Merger.java:373)
       at org.apache.hadoop.util.PriorityQueue.downHeap(PriorityQueue.java:136)
       at org.apache.hadoop.util.PriorityQueue.adjustTop(PriorityQueue.java:103)
       at org.apache.hadoop.mapred.Merger$MergeQueue.adjustPriorityQueue(Merger.java:335)
       at org.apache.hadoop.mapred.Merger$MergeQueue.next(Merger.java:350)
       at org.apache.hadoop.mapred.Merger.writeFile(Merger.java:156)
       at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$LocalFSMerger.run(ReduceTask.java:2669)
Caused by: java.io.EOFException
       at java.io.DataInputStream.readByte(DataInputStream.java:250)
       at org.apache.mahout.math.Varint.readUnsignedVarInt(Varint.java:159)
       at org.apache.mahout.math.Varint.readSignedVarInt(Varint.java:140)
       at org.apache.mahout.math.hadoop.similarity.SimilarityMatrixEntryKey.readFields(SimilarityMatrixEntryKey.java:64)
       at org.apache.hadoop.io.WritableComparator.compare(WritableComparator.java:97)
       ... 7 more

       at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$LocalFSMerger.run(ReduceTask.java:2673)

The exception only occurs for large data sets (9 GB or more), which makes it difficult to diagnose.
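
To show what the innermost frames of the trace amount to, here is a minimal sketch of a variable-length integer read that runs out of bytes. It assumes a simplified decoder written for illustration; it is not Mahout's actual Varint class, and the names TruncatedVarintDemo and hitsEof are hypothetical. It only demonstrates that DataInputStream.readByte throws an EOFException when the key bytes end before the varint does. It does not explain why the bytes were short in the job above.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;

class TruncatedVarintDemo {

    // Simplified unsigned varint decoder: 7 payload bits per byte,
    // high bit set means "more bytes follow".
    // Illustrative assumption only, not Mahout's implementation.
    static int readUnsignedVarInt(DataInput in) throws IOException {
        int value = 0;
        int shift = 0;
        int b;
        while (((b = in.readByte()) & 0x80) != 0) {
            value |= (b & 0x7F) << shift;
            shift += 7;
            if (shift > 35) {
                throw new IOException("Varint too long");
            }
        }
        return value | (b << shift);
    }

    // Returns true if decoding the given bytes hits end-of-stream
    // before the varint is complete.
    static boolean hitsEof(byte[] bytes) throws IOException {
        DataInput in = new DataInputStream(new ByteArrayInputStream(bytes));
        try {
            readUnsignedVarInt(in);
            return false;
        } catch (EOFException e) {
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] complete = {(byte) 0xAC, 0x02};  // encodes 300
        byte[] truncated = {(byte) 0xAC};       // continuation bit set, next byte missing
        System.out.println(hitsEof(complete));  // prints false
        System.out.println(hitsEof(truncated)); // prints true
    }
}
```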

I am using mahout-distribution-0.4 (0.5 gave me other issues) with hadoop-0.20.203.0.

Has anyone else encountered this problem?

Thanks,

Andrew
