I think I hit similar errors when writing via hundreds of open OutputStreams to HDFS - I presume MultipleOutputFormat has to do the same thing in each reducer if you give it many output files to create, as in your use case. I think the problem stems from having many concurrent sockets open.

You can mitigate it by adding dfs.datanode.max.xcievers to hadoop-site.xml and setting a higher value (if you check the DataNode logs you might find errors confirming it, e.g. java.io.IOException: xceiverCount 257 exceeds the limit of concurrent xcievers 256). However, I think this setting carries a memory overhead, so you can't keep increasing the value indefinitely - is there another way you can approach this without needing so many output files per reducer?
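In case it's useful, the entry I mean looks something like the following in hadoop-site.xml on each DataNode (1024 is only an illustrative value - since I believe each xceiver thread costs some memory, pick a value your DataNodes can afford; as far as I know the DataNodes need a restart to pick it up):

  <property>
    <name>dfs.datanode.max.xcievers</name>
    <!-- illustrative value only; increase with care, each xceiver uses memory -->
    <value>1024</value>
  </property>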
2009/9/15 Aviad sela <sela.s...@gmail.com>

> Is anybody interested in, or has anybody addressed, such a problem?
> Or does it seem to be esoteric usage?
>
> On Wed, Sep 9, 2009 at 7:06 PM, Aviad sela <sela.s...@gmail.com> wrote:
>
> > I am using Hadoop 0.19.1.
> >
> > I attempt to split an input into multiple directories.
> > I don't know in advance how many directories exist.
> > I don't know in advance what the directory depth is.
> > I expect that under each such directory a file exists with all available
> > records having the same key permutation found in the job.
> >
> > Where currently each reducer produces a single output, i.e. PART-0001,
> > I would like to create as many directories as necessary, following this
> > pattern:
> >
> >     key1/key2/.../keyN/PART-0001
> >
> > where each "key?" may take different values for each input record.
> > Different records may result in different requested paths:
> >
> >     key1a/key2b/PART-0001
> >     key1c/key2d/key3e/PART-0001
> >
> > To keep it simple, during each job we may expect the same depth from each
> > record.
> >
> > I assume that the input records imply that each reducer will produce
> > several hundred such directories.
> > (Indeed, this strongly depends on the input record semantics.)
> >
> > The MAP part reads a record and, following some logic, assigns a key like:
> >
> >     KEY_A, KEY_B
> >
> > The MAP value is the original input line.
> >
> > For the reducer part I assign the IdentityReducer.
> > However, I have set:
> >
> >     jobConf.setReducerClass(IdentityReducer.class);
> >     jobConf.setOutputFormat(MyTextOutputFormat.class);
> >
> > where MyTextOutputFormat extends MultipleTextOutputFormat and implements:
> >
> >     protected String generateFileNameForKeyValue(K key, V value, String name)
> >     {
> >         String keyParts[] = key.toString().split(",");
> >         Path finalPath = null;
> >
> >         // Build the directory structure comprised of the key parts
> >         for (int i = 0; i < keyParts.length; i++)
> >         {
> >             String part = keyParts[i].trim();
> >             if (false == "".equals(part))
> >             {
> >                 if (null == finalPath)
> >                     finalPath = new Path(part);
> >                 else
> >                 {
> >                     finalPath = new Path(finalPath, part);
> >                 }
> >             }
> >         } // end of for
> >
> >         String fileName = generateLeafFileName(name);
> >         finalPath = new Path(finalPath, fileName);
> >
> >         return finalPath.toString();
> >     } // generateFileNameForKeyValue
> >
> > During execution I have seen that the reduce attempts do create the
> > following path under the output path:
> >
> > "/user/hadoop/test_rep01/_temporary/_attempt_200909080349_0013_r_000000_0/KEY_A/KEY_B/part-00000"
> >
> > However, the file was empty.
> >
> > The job fails at the end with the following exceptions found in the task
> > log:
> >
> > 2009-09-09 11:19:49,653 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink 9.148.30.71:50010
> > 2009-09-09 11:19:49,654 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-6138647338595590910_39383
> > 2009-09-09 11:19:55,659 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception: java.io.IOException: Unable to create new block.
> >     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2722)
> >     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:1996)
> >     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2183)
> >
> > 2009-09-09 11:19:55,659 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_-6138647338595590910_39383 bad datanode[1] nodes == null
> > 2009-09-09 11:19:55,660 WARN org.apache.hadoop.hdfs.DFSClient: Could not get block locations. Source file "/user/hadoop/test_rep01/_temporary/_attempt_200909080349_0013_r_000002_0/KEY_A/KEY_B/part-00002" - Aborting...
> > 2009-09-09 11:19:55,686 WARN org.apache.hadoop.mapred.TaskTracker: Error running child
> > java.io.IOException: Bad connect ack with firstBadLink 9.148.30.71:50010
> >     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.createBlockOutputStream(DFSClient.java:2780)
> >     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2703)
> >     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:1996)
> >     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2183)
> > 2009-09-09 11:19:55,688 INFO org.apache.hadoop.mapred.TaskRunner: Runnning cleanup for the task
> >
> > The command line also writes:
> >
> > 09/09/09 11:24:06 INFO mapred.JobClient: Task Id : attempt_200909080349_0013_r_000003_2, Status : FAILED
> > java.io.IOException: Bad connect ack with firstBadLink 9.148.30.80:50010
> >     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.createBlockOutputStream(DFSClient.java:2780)
> >     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2703)
> >     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:1996)
> >     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2183)
> >
> > Any ideas how to support such a scenario?
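For what it's worth, here is roughly what the MyTextOutputFormat you describe looks like as a complete class against the old (0.19.x) mapred API. This is just a sketch, assuming Text keys and values (your snippet leaves K and V generic) and the comma-separated key layout from your "KEY_A, KEY_B" example:

  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapred.lib.MultipleTextOutputFormat;

  public class MyTextOutputFormat extends MultipleTextOutputFormat<Text, Text> {

      @Override
      protected String generateFileNameForKeyValue(Text key, Text value, String name) {
          // Turn a key such as "KEY_A,KEY_B" into KEY_A/KEY_B/<leaf name>,
          // skipping empty key parts.
          Path dir = null;
          for (String part : key.toString().split(",")) {
              part = part.trim();
              if (!part.isEmpty()) {
                  dir = (dir == null) ? new Path(part) : new Path(dir, part);
              }
          }
          // generateLeafFileName(name) just returns the task's own name
          // (e.g. part-00000) unless you override it as well.
          String leaf = generateLeafFileName(name);
          return (dir == null) ? leaf : new Path(dir, leaf).toString();
      }
  }

Wired up exactly as in your snippet with jobConf.setOutputFormat(MyTextOutputFormat.class). But note that the more distinct key prefixes a single reducer sees, the more files (and hence concurrent DFS output streams) it has open at once, which is what seems to be biting you here.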