Hello again,

On Wed, Mar 30, 2011 at 2:42 AM, Brendan W. <[email protected]> wrote:
> But what would be the benefit of actually changing the DFS block size (to
> say N*64 Mbytes), as opposed to just increasing the inputSplitSize to N
> 64-Mbyte blocks for my job? Both will reduce my number of mappers by a
> factor of N, right? Any benefit to one over the other?

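You'll get a data locality benefit if you change the block size of the files themselves, rather than relying on 'logical' splitting by the framework. When a split spans N 64-MB blocks, those blocks usually sit on different DataNodes, so a mapper is typically local to only one of them and reads the rest over the network. When the file is stored in N*64-MB blocks, each split is a single block and can be read entirely from one node. This should save you a good deal of network transfer in your job.

-- 
Harsh J
http://harshj.com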
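
P.S. Here is a rough sketch of the arithmetic behind the two options, in case it helps. It is plain Java and does not call any Hadoop API; the class name, the helper methods, the 6400 MB file size, and N = 4 are all made up for illustration. It only shows that both options give the same number of mappers, while the number of blocks each mapper has to read differs.

```java
public class SplitLocality {
    static final long MB = 1024L * 1024L;

    // Number of map tasks for one file, assuming one mapper per split.
    static long numSplits(long fileBytes, long splitBytes) {
        return (fileBytes + splitBytes - 1) / splitBytes; // ceiling division
    }

    // Number of DFS blocks that one full-size split covers.
    static long blocksPerSplit(long splitBytes, long blockBytes) {
        return (splitBytes + blockBytes - 1) / blockBytes; // ceiling division
    }

    public static void main(String[] args) {
        long fileBytes = 6400 * MB;   // hypothetical input file
        long smallBlock = 64 * MB;    // original block size
        int n = 4;                    // hypothetical factor N
        long bigSize = n * smallBlock;

        // Option A: keep 64 MB blocks, raise the split size to N blocks.
        System.out.println("A: mappers=" + numSplits(fileBytes, bigSize)
                + ", blocks per mapper=" + blocksPerSplit(bigSize, smallBlock));

        // Option B: store the file with N*64 MB blocks, one block per split.
        System.out.println("B: mappers=" + numSplits(fileBytes, bigSize)
                + ", blocks per mapper=" + blocksPerSplit(bigSize, bigSize));
    }
}
```

In option A each mapper covers 4 blocks that may live on different nodes; in option B it covers 1, so the scheduler can place the task next to all of its input.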
