Re: Setting number of mappers according to number of TextInput lines

Harsh J Sat, 16 Jun 2012 20:03:47 -0700

Ondřej,

While NLineInputFormat will indeed give you N lines per map task, it
does not guarantee that the map tasks it creates for a file will all be
sent to different nodes. Which one do you need exactly: simply N lines
per map task, or map tasks spread more widely across nodes?
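
To make the distinction concrete, here is a minimal plain-Java sketch of
how an N-lines-per-split scheme decides the number of map tasks. It does
not use Hadoop at all; the class and method names (NLineSplitSketch,
countSplits, split) are made up for illustration. It only models the
grouping of lines and says nothing about which node each task runs on,
which is left to the scheduler.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: models grouping input lines into splits of
// at most N lines each. Not Hadoop code, and not a model of scheduling.
class NLineSplitSketch {

    // Number of splits (one map task per split) for a file of totalLines lines.
    static int countSplits(int totalLines, int linesPerSplit) {
        if (totalLines < 0 || linesPerSplit <= 0) {
            throw new IllegalArgumentException(
                "totalLines must be >= 0 and linesPerSplit must be > 0");
        }
        // Ceiling division: a partial final chunk still needs its own task.
        return (totalLines + linesPerSplit - 1) / linesPerSplit;
    }

    // Groups lines into consecutive chunks of at most linesPerSplit lines.
    static List<List<String>> split(List<String> lines, int linesPerSplit) {
        if (linesPerSplit <= 0) {
            throw new IllegalArgumentException("linesPerSplit must be > 0");
        }
        List<List<String>> splits = new ArrayList<>();
        for (int start = 0; start < lines.size(); start += linesPerSplit) {
            int end = Math.min(start + linesPerSplit, lines.size());
            splits.add(new ArrayList<>(lines.subList(start, end)));
        }
        return splits;
    }

    public static void main(String[] args) {
        // The case from the original question: 100 lines, 10 lines per mapper.
        List<String> lines = new ArrayList<>();
        for (int i = 1; i <= 100; i++) {
            lines.add("line " + i);
        }
        List<List<String>> splits = split(lines, 10);
        System.out.println("map tasks: " + splits.size());
        System.out.println("lines in first task: " + splits.get(0).size());
    }
}
```

In a real job using the new API, the corresponding setting should be
NLineInputFormat.setNumLinesPerSplit(job, 10), assuming your Hadoop
version ships org.apache.hadoop.mapreduce.lib.input.NLineInputFormat.
Even then, 10 splits only means 10 map tasks; several of them may still
land on the same node.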


On Sat, Jun 16, 2012 at 3:01 PM, Ondřej Klimpera <klimp...@fit.cvut.cz> wrote:
> I tried this approach, but the job is not distributed among 10 mapper nodes.
> Seems Hadoop ignores this property :(
>
> My first thought is that the small file size is the problem and Hadoop
> doesn't care about splitting it properly.
>
> Thanks for any ideas.
>
>
>
> On 06/16/2012 11:27 AM, Bejoy KS wrote:
>>
>> Hi Ondrej
>>
>> You can use NLineInputFormat with n set to 10.
>>
>> ------Original Message------
>> From: Ondřej Klimpera
>> To: common-user@hadoop.apache.org
>> ReplyTo: common-user@hadoop.apache.org
>> Subject: Setting number of mappers according to number of TextInput lines
>> Sent: Jun 16, 2012 14:31
>>
>> Hello,
>>
>> I have a very small input (kB), but processing it to produce output
>> takes several minutes. Is there a way to say: the file has 100 lines, I
>> need 10 mappers, and each mapper node has to process 10 lines of the
>> input file?
>>
>> Thanks for any advice.
>> Ondrej Klimpera
>>
>>
>> Regards
>> Bejoy KS
>>
>> Sent from handheld, please excuse typos.
>>
>



-- 
Harsh J
