Jason,

Quite possibly. Here's what I did: I upped "dfs.datanode.max.xcievers" to 512 (double the default), and now the full set of output files is created correctly.
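For anyone who hits this thread later, a minimal sketch of the change (my assumptions: Hadoop 0.20-era layout, so the property goes in hdfs-site.xml on each datanode and takes effect after a datanode restart; note the property name really is misspelled "xcievers"):

```xml
<!-- hdfs-site.xml on each datanode; restart the datanode to apply -->
<property>
  <name>dfs.datanode.max.xcievers</name>
  <!-- upper bound on concurrent block transfer threads per datanode;
       the default of this era is 256 -->
  <value>512</value>
</property>
```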
Thanks for responding. Learning, learning the ins and outs of Hadoop.

On Thu, Oct 8, 2009 at 6:01 AM, Jason Venner <jason.had...@gmail.com> wrote:

> Are you perhaps creating large numbers of files, and running out of file
> descriptors in your tasks?
>
> On Wed, Oct 7, 2009 at 1:52 PM, Geoffry Roberts <geoffry.robe...@gmail.com> wrote:
>
>> All,
>>
>> I have a MapRed job that ceases to produce output about halfway through.
>> The obvious question is why?
>>
>> This job reads a file and uses MultipleTextOutputFormat to generate output
>> files named with the output key. At about the halfway point, the job
>> continues to create files, but they are all of zero length. I've worked
>> with this input file extensively, and I know it actually contains the
>> required data and that it is clean, or at least it was when I copied it in.
>>
>> My first impulse was to check for a full disk, but there seems to be ample
>> free space.
>>
>> This doesn't appear to have anything to do with my code.
>>
>> stderr is full of the following entry:
>>
>> java.io.EOFException
>>     at java.io.DataInputStream.readByte(DataInputStream.java:250)
>>     at org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:298)
>>     at org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:319)
>>     at org.apache.hadoop.io.Text.readString(Text.java:400)
>>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.createBlockOutputStream(DFSClient.java:2837)
>>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2762)
>>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2046)
>>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2232)
>>
>> syslog for the reducer starts filling up with the following at what could
>> indeed be the halfway point:
>>
>> 2009-10-07 11:27:50,874 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.EOFException
>> 2009-10-07 11:27:50,916 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-1693260904457793456_3495
>> 2009-10-07 11:27:56,919 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.EOFException
>> 2009-10-07 11:27:56,919 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_7536254999085848659_3495
>> 2009-10-07 11:28:02,921 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.EOFException
>> 2009-10-07 11:28:02,921 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-7513223558440754487_3495
>> 2009-10-07 11:28:08,924 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.EOFException
>> 2009-10-07 11:28:08,924 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_2580888829875117043_3495
>> 2009-10-07 11:28:14,965 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception: java.io.IOException: Unable to create new block.
>>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2781)
>>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2046)
>>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2232)
>
> --
> Pro Hadoop, a book to guide you from beginner to hadoop mastery,
> http://www.amazon.com/dp/1430219424?tag=jewlerymall
> www.prohadoopbook.com a community for Hadoop Professionals
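Jason's file-descriptor theory is worth checking alongside the xceiver limit, since per-key MultipleTextOutputFormat output can hold many files open at once. A quick sketch for seeing what headroom a datanode host actually has (the jps-based pid lookup and the Linux /proc path are my assumptions — adjust for your install):

```shell
# Per-process open-file limit for the current user/shell
ulimit -n

# Hypothetical pid lookup: find the DataNode via jps, then count its
# currently open descriptors from /proc (Linux only)
DN_PID=$(jps 2>/dev/null | awk '/DataNode/ {print $1}')
if [ -n "$DN_PID" ]; then
  ls /proc/"$DN_PID"/fd | wc -l
fi
```

If the descriptor count is crowding the ulimit, raising the limit in /etc/security/limits.conf (or consolidating output files) would be the fix rather than xceivers.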