The default GridMix2 config increases the number of reducer tasks from the small to the medium to the large data set jobs. For example, for StreamingSorter the counts are 15, 170, and 370, respectively. On a single-node setup, I noticed that decreasing the number of reducer tasks for StreamingSorter.medium from 170 to 15 speeds up this workload by ~9x.
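To give a rough sense of scale, here is a small sketch of how many sequential "waves" of reducers each setting implies on one node. The reducer counts are the ones quoted above. The slot count of 2 and the names `reduce_waves` and `ASSUMED_REDUCE_SLOTS` are my own assumptions for illustration, not values read from my setup or from GridMix2. This is not a measured explanation of the speedup.

```python
import math

# Assumption: concurrent reduce slots available on the single node.
# Substitute the value your own node is actually configured with.
ASSUMED_REDUCE_SLOTS = 2

# Reducer counts quoted above for StreamingSorter (small, medium, large).
STREAMING_SORTER_REDUCERS = {"small": 15, "medium": 170, "large": 370}


def reduce_waves(num_reducers: int, reduce_slots: int) -> int:
    """Return how many sequential waves are needed to run all reducers."""
    if num_reducers < 0 or reduce_slots <= 0:
        raise ValueError("num_reducers must be >= 0 and reduce_slots > 0")
    return math.ceil(num_reducers / reduce_slots)


if __name__ == "__main__":
    for size, reducers in STREAMING_SORTER_REDUCERS.items():
        waves = reduce_waves(reducers, ASSUMED_REDUCE_SLOTS)
        print(f"{size}: {reducers} reducers -> {waves} waves")
```

Under that assumed slot count, 170 reducers need roughly ten times as many waves as 15, so per-task startup overhead is paid many more times. Whether that accounts for the ~9x I observed is something I have not verified.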
Do users tend not to tweak the number of reducers specifically in the context of GridMix? Is GridMix designed on purpose to exercise a higher number of reducer tasks as the data set size increases? Thanks for your time. -Shrinivas
