Hi,

I need confirmation regarding these two parameters and how they affect
performance.

I have read that *mapred.max.split.size* should always be less than
*dfs.block.size*.
But we always have the option of setting *mapred.max.split.size* greater
than *dfs.block.size*.
What happens in that case? Will FileInputFormat allow it when calculating
splits, or will it take *dfs.block.size* as the split size?

Say the framework does allow it: then one map task will end up
processing more than one block (and those blocks will not always be on the
local machine). How does that impact performance?
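
To make the question concrete: as far as I understand, the new-API
FileInputFormat (org.apache.hadoop.mapreduce.lib.input) picks the split size
as max(minSize, min(maxSize, blockSize)). The sketch below only reproduces
that arithmetic in plain Java; the class name SplitSizeSketch, the 64 MB
block size and the chosen min/max values are my own assumptions for
illustration, not Hadoop code.

```java
public class SplitSizeSketch {

    // Same arithmetic as the split-size rule described above:
    // the block size is capped by maxSize, then raised to at least minSize.
    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long blockSize = 64 * mb; // assumed dfs.block.size

        // max split size (128 MB) larger than the block size: split stays at 64 MB
        System.out.println(computeSplitSize(blockSize, 1L, 128 * mb) / mb);

        // max split size (32 MB) smaller than the block size: split shrinks to 32 MB
        System.out.println(computeSplitSize(blockSize, 1L, 32 * mb) / mb);

        // min split size (128 MB) larger than the block size: split grows to 128 MB
        System.out.println(computeSplitSize(blockSize, 128 * mb, Long.MAX_VALUE) / mb);
    }
}
```

If that formula is right, a larger max split size alone would not make a map
task span several blocks; only a min split size above the block size would.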

It would be a great help if anyone could help me clear up this confusion.

Thanks
sandeep
