I have tried
conf.setInt("mapred.max.split.size", 134217728); // conf is the job's Configuration instance
(setInt is an instance method, not static) and setting mapred.max.split.size
in mapred-site.xml. (dfs.block.size is left unchanged at 67108864.)
But in the job.xml, I am still getting mapred.map.tasks = 242.
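For what it's worth, that matches how the new-API FileInputFormat (org.apache.hadoop.mapreduce.lib.input) picks the split size: max(minSize, min(maxSize, blockSize)). With min at 0, raising the max above the block size still leaves 64 MB splits; raising mapred.min.split.size above the block size is what produces larger splits. (The old org.apache.hadoop.mapred API computes splits differently and ignores mapred.max.split.size entirely.) A small sketch of the arithmetic, plain Java with no Hadoop dependency, using the ~16 GB figure from the original mail:

```java
public class SplitMath {
    // Mirrors FileInputFormat.computeSplitSize in the new (mapreduce) API:
    // splitSize = max(minSize, min(maxSize, blockSize))
    static long computeSplitSize(long minSize, long maxSize, long blockSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    public static void main(String[] args) {
        long blockSize = 67108864L;                    // 64 MB dfs.block.size
        long fileSize = 16L * 1024 * 1024 * 1024;      // ~16 GB input (approximate)

        // What was tried: mapred.max.split.size = 128 MB, min left at 0
        long tried = computeSplitSize(0L, 134217728L, blockSize);
        System.out.println(tried);             // 67108864 -> still 64 MB splits
        System.out.println(fileSize / tried);  // 256 -> roughly the 242 maps observed

        // Raising the *min* split size above the block size instead
        long larger = computeSplitSize(134217728L, Long.MAX_VALUE, blockSize);
        System.out.println(larger);            // 134217728 -> 128 MB splits
        System.out.println(fileSize / larger); // 128 map tasks
    }
}
```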
Shing
________________________________
From: Bejoy Ks <[email protected]>
To: [email protected]; Shing Hing Man <[email protected]>
Sent: Tuesday, October 2, 2012 6:03 PM
Subject: Re: How to lower the total number of map tasks
Sorry for the typo, the property name is mapred.max.split.size.
Also, just to change the number of map tasks, you don't need to modify the
HDFS block size.
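If you set it in mapred-site.xml, the entry would look like the fragment below (inside the <configuration> element; 134217728, i.e. 128 MB, is just an example value):

```xml
<property>
  <name>mapred.max.split.size</name>
  <value>134217728</value>
</property>
```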
On Tue, Oct 2, 2012 at 10:31 PM, Bejoy Ks <[email protected]> wrote:
Hi
>
>
>You need to set mapred.max.split.size to a value larger than
>your block size to get fewer map tasks than the default.
>
>
>
>On Tue, Oct 2, 2012 at 10:04 PM, Shing Hing Man <[email protected]> wrote:
>
>
>>
>>
>>I am running Hadoop 1.0.3 in Pseudo distributed mode.
>>When I submit a map/reduce job to process a file of size about 16 GB, in
>>job.xml, I have the following
>>
>>
>>mapred.map.tasks = 242
>>mapred.min.split.size = 0
>>dfs.block.size = 67108864
>>
>>
>>I would like to reduce mapred.map.tasks to see if it improves performance.
>>I have tried doubling dfs.block.size, but mapred.map.tasks remains unchanged.
>>Is there a way to reduce mapred.map.tasks?
>>
>>
>>Thanks in advance for any assistance!
>>Shing
>>
>>
>