Maybe consider a hierarchy. The first level is one map per file, and the second level is a map/reduce job for the parent level.
YC

On 7/3/08, Jason Venner <[EMAIL PROTECTED]> wrote:
>
> You could also set your input split size to Long.MAX_VALUE.
>
> Goel, Ankur wrote:
>
>> Nope, But if the intent is so then there are 2 ways of doing it.
>>
>> 1. Just extend the input format of your choice and override
>> isSplitable() method to return false.
>>
>> 2. Compress your text file using a compression format supported by
>> hadoop (e.g gzip). This will ensure that one map task processes 1 file
>> since compressed files are not split between processes.
>>
>>
>> -----Original Message-----
>> From: Qiong Zhang [mailto:[EMAIL PROTECTED]
>> Sent: Tuesday, July 01, 2008 9:54 PM
>> To: [email protected]
>> Subject: one input file per map
>>
>> Hi,
>>
>> Is there an existing input format/split which supports one input file
>> (e.g. plain text) per map task?
>>
>> Thanks,
>>
>> James
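
To make the options in the thread concrete, here is a rough, standalone sketch of why both a non-splittable input format and a split size of Long.MAX_VALUE end up with one map task per file. It is only a simplified model, not Hadoop's actual split logic: the class SplitModel and its methods are hypothetical names, and the model ignores block boundaries and any slop factor Hadoop applies. In the old org.apache.hadoop.mapred API, option 1 would, as far as I know, correspond to overriding isSplitable(FileSystem, Path) in a subclass of an input format such as TextInputFormat; check the signature against your Hadoop version.

```java
// Simplified, hypothetical model of per-file split counting.
// It does NOT use or reproduce Hadoop's real FileInputFormat code.
final class SplitModel {

    // Number of splits (one map task each) the model assigns to one file.
    static long splitsForFile(long fileSize, long splitSize, boolean splittable) {
        if (splitSize <= 0) {
            throw new IllegalArgumentException("splitSize must be positive");
        }
        // A non-splittable (or empty) file is always handled as one split.
        if (!splittable || fileSize == 0) {
            return 1;
        }
        // Ceiling division written to avoid overflow when splitSize is Long.MAX_VALUE.
        long full = fileSize / splitSize;
        long remainder = fileSize % splitSize;
        return full + (remainder != 0 ? 1 : 0);
    }

    // Total number of splits over all input files.
    static long countSplits(long[] fileSizes, long splitSize, boolean splittable) {
        long total = 0;
        for (long size : fileSizes) {
            total += splitsForFile(size, splitSize, splittable);
        }
        return total;
    }

    public static void main(String[] args) {
        long[] sizes = {100L, 250L}; // two illustrative file sizes, in bytes

        // Default behaviour in the model: files are cut into 64-byte splits.
        System.out.println(countSplits(sizes, 64L, true));

        // Option 1 (isSplitable() returns false) or option 2 (gzip input):
        // each file becomes exactly one split.
        System.out.println(countSplits(sizes, 64L, false));

        // Jason's suggestion: a split size of Long.MAX_VALUE.
        System.out.println(countSplits(sizes, Long.MAX_VALUE, true));
    }
}
```

In this model the last two calls both give one split per file, which is the behaviour the original question asks for.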
