On Jul 15, 2010, at 11:40 AM, Syed Wasti wrote:

> Will it matter what the data block size is ? 

Yes.

> It is recommended to have a block size of 64 MB, but if we want to have the 
> data block size to 128 MB, should this affect the performance ?

Yes.

FWIW, we run with 128MB.

> Does the size of the map jobs created on each datanode in any way depend on 
> the block size ?

Yes.

Unless told otherwise, Hadoop will generally create one map per block, i.e. 
# of maps == # of blocks. So if you have fewer (larger) blocks to process, 
you'll have fewer maps, each doing more work. This is not necessarily a bad 
thing; it all depends upon your workload, size of grid, etc.
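
To put rough numbers on that, here is a minimal back-of-the-envelope sketch. 
It assumes a single splittable input file and the default one-map-per-block 
behaviour; it ignores split-size overrides, many-small-file inputs and 
non-splittable compression. The function name and the 10 GB file size are 
made up for illustration and are not part of Hadoop.

```python
MB = 1024 * 1024


def estimated_map_tasks(file_size_bytes, block_size_bytes):
    """Rough estimate only: one map per block of a single splittable file.

    Hypothetical helper, not a Hadoop API. Real jobs can differ because of
    split-size settings, input format, and compression.
    """
    if file_size_bytes <= 0:
        return 0
    # Integer ceiling division: a partial last block still gets its own map.
    return -(-file_size_bytes // block_size_bytes)


if __name__ == "__main__":
    file_size = 10 * 1024 * MB  # hypothetical 10 GB input file
    for block_mb in (64, 128):
        maps = estimated_map_tasks(file_size, block_mb * MB)
        print(f"{block_mb} MB blocks -> about {maps} maps")
```

Under those assumptions, doubling the block size halves the number of maps 
and doubles the data each map reads. Whether that is a win still depends on 
the workload and the size of the grid.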
