After observing the speed differential between the fast and slow nodes on my 200 node cluster, as seen in HADOOP-253, I wanted to try running more, smaller reduces. So instead of the default 400 reduces, which could all start at the beginning of execution, I used 700. This means that every node runs at least 2 reduces and the fastest 150 nodes run an additional 2 reduces each. Clearly, in a case where all of the nodes are well balanced, that will lose, because the second round of data shuffling doesn't overlap with the maps. However, in the presence of failures or uneven hardware, it will be a win.
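For anyone who wants to try the same thing, here's a rough sketch using the org.apache.hadoop.mapred.JobConf API (the class and method names around it are just illustrative, not the actual sort driver):

    import org.apache.hadoop.mapred.JobConf;

    public class SortReduceCount {
      // Sketch only: raise the reduce count from 400 to 700.
      // With 200 nodes and (assumed) 2 reduce slots per node, the first
      // wave of 400 reduces starts alongside the maps; the remaining 300
      // run as a second wave of 2 each on the fastest 150 nodes.
      public static JobConf configure() {
        JobConf conf = new JobConf();
        conf.setNumReduceTasks(700);
        return conf;
      }
    }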

That change brought the run time for sorting my 2010 gigabyte dataset down from 8.5 hours to 6.6 hours. For those of you keeping score, the sort benchmark was taking 47 hours at the start of the month and now takes 6.6 hours on the same hardware.

Note that it would have also made sense to double the block size on the inputs, so that the amount of data on each of the M*R data paths stayed constant, but I wanted to try the changes independently. As for my other config choices, the only non-default ones are:

dfs.block.size=134217728
io.sort.factor=100
io.file.buffer.size=65536
mapred.reduce.parallel.copies=10
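
In case it helps, the same settings expressed as Configuration.set calls look roughly like this (normally they'd live in hadoop-site.xml; the class name here is just for illustration):

    import org.apache.hadoop.mapred.JobConf;

    public class NonDefaultSettings {
      // Sketch of the non-default settings listed above.
      public static void apply(JobConf conf) {
        conf.set("dfs.block.size", "134217728");         // 128 MB blocks
        conf.set("io.sort.factor", "100");               // streams merged at once per sort pass
        conf.set("io.file.buffer.size", "65536");        // 64 KB I/O buffers
        conf.set("mapred.reduce.parallel.copies", "10"); // parallel map-output fetches per reduce
      }
    }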

I'm also looking forward to trying out Ben Reed's patches to reduce the number of trips to disk in the reduces.

-- Owen
