Currently, only map tasks are load-balanced; reduce tasks can still be skewed, and their timeslices differ widely, which keeps the scheduler from being smart. I have an idea to improve it.
We can break the map output into N*M splits, where N is the number of nodes and M >= 1, and then regroup them into new splits by combining the smaller splits and resplitting the bigger ones, until every split's size is balanced around a specified target value. There are three cases:

1. Too many values for a single key.
2. Too many keys hash to one partition.
3. Every partition is balanced in size.

If a single key has too many values, adding a new MapReduce pass is necessary. If too many keys hash to one partition, resplitting is necessary. If every split is balanced, we can treat each task (map or reduce) as one scheduler timeslice, and the scheduler becomes smart like an OS scheduler.
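To make the regrouping step concrete, here is a minimal sketch of the combine/resplit idea in Python. It is purely illustrative: `balance_splits`, its parameters, and the greedy grouping strategy are my assumptions, not an existing Hadoop API, and it balances by size only (it does not handle the single-hot-key case, which needs the extra MapReduce pass described above).

```python
def balance_splits(split_sizes, target, tolerance=0.5):
    """Regroup split sizes so each group's total is near `target`.

    split_sizes: byte counts of the N*M initial map-output splits.
    target: desired size per balanced split.
    tolerance: accept groups within target*(1 +/- tolerance).
    (Hypothetical sketch; names and strategy are illustrative.)
    """
    low, high = target * (1 - tolerance), target * (1 + tolerance)

    # Resplit: cut any oversized split into target-sized pieces.
    pieces = []
    for s in split_sizes:
        while s > high:
            pieces.append(target)
            s -= target
        if s > 0:
            pieces.append(s)

    # Combine: greedily group pieces (largest first) until each
    # group reaches the lower size bound.
    groups, current = [], []
    for s in sorted(pieces, reverse=True):
        current.append(s)
        if sum(current) >= low:
            groups.append(current)
            current = []
    if current:  # leftover small pieces form one final group
        groups.append(current)
    return groups
```

For example, `balance_splits([100, 10, 10, 10, 50], 60)` resplits the 100-unit split and merges the three 10-unit splits, yielding groups whose totals all fall in the 30..90 range, so every reduce task would get a roughly equal timeslice.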
