Re: one input file per map

Yang Chen Thu, 03 Jul 2008 09:13:07 -0700

Maybe consider a hierarchy. The first level is one map per file, and the
second level is a map/reduce job over the parent (first) level.


YC


On 7/3/08, Jason Venner <[EMAIL PROTECTED]> wrote:
>
> You could also set your minimum input split size to Long.MAX_VALUE.
>
> Goel, Ankur wrote:
>
>> Nope. But if that is the intent, there are two ways of doing it.
>>
>> 1. Extend the input format of your choice and override the
>> isSplitable() method to return false.
>>
>> 2. Compress your text file using a compression format supported by
>> Hadoop (e.g. gzip). This ensures that one map task processes one file,
>> since gzip-compressed files are not split across map tasks.
>>
>>
>> -----Original Message-----
>> From: Qiong Zhang [mailto:[EMAIL PROTECTED]]
>> Sent: Tuesday, July 01, 2008 9:54 PM
>> To: [email protected]
>> Subject: one input file per map
>>
>> Hi,
>>
>>
>> Is there an existing input format/split which supports one input file
>> (e.g. plain text) per map task?
>>
>>
>> Thanks,
>>
>> James
>>
>>
>>
>
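
For reference, the two suggestions quoted above (a non-splittable input
format and a very large minimum split size) come down to the same
arithmetic. What follows is a simplified, self-contained model, not
Hadoop's actual code: the class SplitSketch and the method countSplits are
hypothetical names, and the model ignores details such as the slack Hadoop
allows on the last split. The split-size rule max(minSize, min(goalSize,
blockSize)) is, as far as I know, how FileInputFormat sized its splits.

```java
public class SplitSketch {

    // Split-size rule: never smaller than minSize, otherwise the smaller
    // of the per-map goal size and the file system block size.
    static long computeSplitSize(long goalSize, long minSize, long blockSize) {
        return Math.max(minSize, Math.min(goalSize, blockSize));
    }

    // Hypothetical helper: how many splits (and so map tasks) one file of
    // fileLength bytes yields. Assumes fileLength > 0.
    static long countSplits(long fileLength, boolean splitable,
                            long goalSize, long minSize, long blockSize) {
        if (!splitable) {
            return 1; // isSplitable() returned false: whole file, one split
        }
        long splitSize = computeSplitSize(goalSize, minSize, blockSize);
        // Ceiling division written so it cannot overflow for huge splitSize.
        return fileLength / splitSize + (fileLength % splitSize == 0 ? 0 : 1);
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long fileLength = 200 * mb;
        long blockSize = 64 * mb;
        long goalSize = 100 * mb;

        // Default-like settings: the file is cut at block boundaries.
        System.out.println(countSplits(fileLength, true, goalSize, 1L, blockSize));

        // Suggestion 1: input format reports the file as not splittable.
        System.out.println(countSplits(fileLength, false, goalSize, 1L, blockSize));

        // Suggestion 2: minimum split size set to Long.MAX_VALUE.
        System.out.println(countSplits(fileLength, true, goalSize, Long.MAX_VALUE, blockSize));
    }
}
```

With these numbers the first call gives 4 splits and the other two give 1,
which is the "one file per map" behaviour asked about.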
