I think I understand that from the last 2 replies :)  But my question is: can
I change this configuration to split the file into 250K chunks so that
multiple mappers are invoked?
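In case it helps, one way this is often done in Pig is to lower the maximum input split size with `set` at the top of the script. The property names below assume 2011-era Pig/Hadoop, so treat this as a sketch and check them against your versions:

```pig
-- Sketch: ask Hadoop for input splits of at most ~250 KB (value in bytes),
-- so a file larger than that is handed to more than one mapper.
set mapred.max.split.size '256000';
-- Pig may combine small splits back into one map task; cap that as well.
set pig.maxCombinedSplitSize '256000';

log = LOAD 'excite-small.log' AS (user, time, query);
sorted = ORDER log BY user;
STORE sorted INTO 'sorted_out';
```

One caveat: the 208348-byte excite-small.log already fits inside a single 250 KB split, so with exactly that setting you would likely still see one mapper; you would need an even smaller split size (or, as James said, more data) to get several.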

On Thu, May 26, 2011 at 3:41 PM, James Seigel <[email protected]> wrote:
> have more data for it to process :)
>
>
> On 2011-05-26, at 4:30 PM, Mohit Anchlia wrote:
>
>> I ran a simple pig script on this file:
>>
>> -rw-r--r-- 1 root root   208348 May 26 13:43 excite-small.log
>>
>> that orders the contents by name. But it only created one mapper. How
>> can I change this to distribute across multiple machines?
>>
>> On Thu, May 26, 2011 at 3:08 PM, jagaran das <[email protected]> wrote:
>>> Hi Mohit,
>>>
>>> No of Maps - It depends on what is the Total File Size / Block Size
>>> No of Reducers - You can specify.
>>>
>>> Regards,
>>> Jagaran
>>>
>>>
>>>
>>> ________________________________
>>> From: Mohit Anchlia <[email protected]>
>>> To: [email protected]
>>> Sent: Thu, 26 May, 2011 2:48:20 PM
>>> Subject: No. of Map and reduce tasks
>>>
>>> How can I tell how the map and reduce tasks were spread across the
>>> cluster? I looked at the jobtracker web page but can't find that info.
>>>
>>> Also, can I specify how many map or reduce tasks I want to be launched?
>>>
>>> From what I understand, it's based on the number of input files
>>> passed to Hadoop. So if I have 4 files there will be 4 map tasks
>>> launched, and the reducer count depends on the HashPartitioner.
>>>
>
>
