I set the block size on the job's Configuration using

    conf.setInt("dfs.block.size", 134217728);

I have also set it in mapred-site.xml.
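For reference, here is a minimal sketch of the driver settings involved (the
class name is a placeholder, and the mapred.min.split.size line is an untested
suggestion rather than something from my actual job):

    import org.apache.hadoop.mapred.JobConf;

    public class DriverSketch {
        public static void main(String[] args) {
            JobConf conf = new JobConf();

            // 128 MB. HDFS block size is fixed per file at write time, so this
            // setting only applies to files the job writes; the existing input
            // file keeps the 64 MB blocks it was created with.
            conf.setInt("dfs.block.size", 134217728);

            // job.xml currently shows 0. The old-API FileInputFormat computes
            //   splitSize = max(minSize, min(goalSize, blockSize))
            // so raising the minimum split size above 64 MB should force
            // fewer, larger splits (and therefore fewer map tasks).
            conf.setLong("mapred.min.split.size", 134217728L);
        }
    }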
Shing
________________________________
From: Chris Nauroth <[email protected]>
To: [email protected]; Shing Hing Man <[email protected]>
Sent: Tuesday, October 2, 2012 6:00 PM
Subject: Re: How to lower the total number of map tasks
Those numbers make sense, considering one map task per block: at a 64 MB block
size, a file of roughly 16 GB spans about 242 blocks (242 x 64 MB is about
15.1 GB), hence ~242 map tasks.
When you doubled dfs.block.size, how did you accomplish that? Typically, the
block size is selected at file write time, with a default value from system
configuration used if not specified. Did you "hadoop fs -put" the file with
the new block size, or was it something else?
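If it helps, here is a rough sketch of rewriting a file with a 128 MB block
size through the FileSystem API (the class name and argument handling are
placeholders; something like "hadoop fs -D dfs.block.size=134217728 -put
localfile /path" should have the same effect from the shell, since the block
size is applied at write time):

    import java.io.InputStream;
    import java.io.OutputStream;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IOUtils;

    public class ReblockCopy {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);

            Path src = new Path(args[0]); // existing file (64 MB blocks)
            Path dst = new Path(args[1]); // copy created with 128 MB blocks

            InputStream in = fs.open(src);
            // create(path, overwrite, bufferSize, replication, blockSize)
            OutputStream out = fs.create(dst, true, 4096,
                    fs.getFileStatus(src).getReplication(), 134217728L);

            // close=true closes both streams when the copy finishes
            IOUtils.copyBytes(in, out, 4096, true);
        }
    }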
Thank you,
--Chris
On Tue, Oct 2, 2012 at 9:34 AM, Shing Hing Man <[email protected]> wrote:
>
>I am running Hadoop 1.0.3 in pseudo-distributed mode.
>When I submit a map/reduce job to process a file of about 16 GB, job.xml
>contains the following:
>
>mapred.map.tasks = 242
>mapred.min.split.size = 0
>dfs.block.size = 67108864
>
>I would like to reduce mapred.map.tasks to see if it improves performance.
>I have tried doubling dfs.block.size, but mapred.map.tasks remains unchanged.
>Is there a way to reduce mapred.map.tasks?
>
>Thanks in advance for any assistance!
>Shing