Are you perhaps creating a large number of files and running out of file descriptors in your tasks?
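One way to test that guess is to log the descriptor counts from inside the task JVM. The following is only a minimal sketch, not something taken from your job: the class name FdCheck and the method descriptorUsage() are made up for illustration, and it assumes a Unix JVM that exposes com.sun.management.UnixOperatingSystemMXBean. On other JVMs it simply reports that the counts are unavailable.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

// Hypothetical helper (not part of Hadoop): reports how many file
// descriptors this JVM has open versus the per-process limit.
public class FdCheck {

    /** Returns {open, max}, or null if this JVM does not expose the counts. */
    static long[] descriptorUsage() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            com.sun.management.UnixOperatingSystemMXBean unix =
                    (com.sun.management.UnixOperatingSystemMXBean) os;
            return new long[] {
                unix.getOpenFileDescriptorCount(),
                unix.getMaxFileDescriptorCount()
            };
        }
        return null; // assumption: non-Unix or non-HotSpot-style JVM
    }

    public static void main(String[] args) {
        long[] usage = descriptorUsage();
        if (usage == null) {
            System.out.println("descriptor counts not available on this JVM");
        } else {
            System.out.println("open=" + usage[0] + " max=" + usage[1]);
        }
    }
}
```

If you call descriptorUsage() periodically from the reducer and the open count climbs toward the max around the point where the zero-length files start, that would support the theory. Running `ulimit -n` as the user that runs the tasks shows the same limit from the shell.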

On Wed, Oct 7, 2009 at 1:52 PM, Geoffry Roberts <geoffry.robe...@gmail.com> wrote:

> All,
>
> I have a MapRed job that ceases to produce output about halfway through.
> The obvious question is why?
>
> This job reads a file and uses MultipleTextOutputFormat to generate output
> files named with the output key. At about the halfway point, the job
> continues to create files, but they are all of zero length. I've worked
> with this input file extensively and I know it actually contains the
> required data and that it is clean or at least it was when I copied it in.
>
> My first impulse was to check for a full disk, but there seems to be ample
> free space.
>
> This doesn't appear to have anything to do with my code.
>
> stderror is full of the following entry:
>
> java.io.EOFException
>     at java.io.DataInputStream.readByte(DataInputStream.java:250)
>     at org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:298)
>     at org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:319)
>     at org.apache.hadoop.io.Text.readString(Text.java:400)
>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.createBlockOutputStream(DFSClient.java:2837)
>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2762)
>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2046)
>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2232)
>
> syslog for the reducer starts filling up with the following at what could
> indeed be the halfway point:
>
> 2009-10-07 11:27:50,874 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.EOFException
> 2009-10-07 11:27:50,916 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-1693260904457793456_3495
> 2009-10-07 11:27:56,919 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.EOFException
> 2009-10-07 11:27:56,919 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_7536254999085848659_3495
> 2009-10-07 11:28:02,921 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.EOFException
> 2009-10-07 11:28:02,921 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-7513223558440754487_3495
> 2009-10-07 11:28:08,924 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.EOFException
> 2009-10-07 11:28:08,924 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_2580888829875117043_3495
> 2009-10-07 11:28:14,965 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception: java.io.IOException: Unable to create new block.
>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2781)
>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2046)
>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2232)

--
Pro Hadoop, a book to guide you from beginner to hadoop mastery,
http://www.amazon.com/dp/1430219424?tag=jewlerymall
www.prohadoopbook.com a community for Hadoop Professionals