Why don't you create and write HDFS files directly from the reduce task, instead of depending on the default reduce output directory and files?
Like the cases where the input is not homogeneous, this seems (at least to me) to be another common pattern: the output is not homogeneous. I have run into this when loading data into Hadoop and wanting to organize different types of records into different directories and files. Just make sure, somehow, that different reduce tasks don't try to write to the same file.

-----Original Message-----
From: C G [mailto:[EMAIL PROTECTED]
Sent: Friday, September 21, 2007 1:20 PM
To: [email protected]
Subject: Multiple output files, and controlling output file name...

Hi All:

In the context of using the aggregation classes, is there any way to send output to multiple files? In my case, I am processing columnar records that are very wide. I have to do a variety of different aggregations, and the results of each type of aggregation are a set of rows suitable for loading into a database. Rather than write all the records to "part-00000", etc., I'd like to write them to a series of files based on the type of aggregation. I don't see an obvious way to do this. Is it possible?

Also, for those of us who don't like "part-00000" and so forth as naming conventions, is there a way to name the output? In my case, incorporating a date/time stamp like "loadA-200709221600" would be very useful.

Thanks for any advice,
C G
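One way to keep reduce tasks from writing to the same file, while still getting names like "loadA-200709221600", is to build each file name from a prefix, a timestamp, and the number of the reduce task that writes it. The following is only a minimal sketch of the naming part: the class OutputNames, the method fileName, and the idea of passing the partition number in as a plain int are all assumptions made for illustration, not part of Hadoop. The actual write to HDFS is left out.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical helper, not a Hadoop class: builds one output file name
// per (record type, load time, reduce task).
public class OutputNames {
    // Timestamp layout matching the "200709221600" example in the thread.
    private static final DateTimeFormatter STAMP =
            DateTimeFormatter.ofPattern("yyyyMMddHHmm");

    // prefix:    label for the kind of records, e.g. "loadA" (assumed)
    // when:      load time shared by all tasks of the job (assumed)
    // partition: number of the reduce task; how it is obtained is
    //            left to the caller (assumed)
    static String fileName(String prefix, LocalDateTime when, int partition) {
        // The zero-padded partition suffix keeps two reduce tasks that
        // share a prefix and timestamp from choosing the same name.
        return String.format(Locale.ROOT, "%s-%s-%05d",
                prefix, when.format(STAMP), partition);
    }

    public static void main(String[] args) {
        LocalDateTime when = LocalDateTime.of(2007, 9, 22, 16, 0);
        System.out.println(fileName("loadA", when, 0)); // loadA-200709221600-00000
        System.out.println(fileName("loadA", when, 1)); // loadA-200709221600-00001
    }
}
```

Each reduce task would then open its own file under that name, so different record types can go to different directories without two tasks colliding on one file.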
