Can someone explain how the MapReduce merge is done? As far as I can tell, it pulls all of the spill files into one giant file to send to the reducer. Is this correct? Even if you set smaller spill files and a lower sort factor, the final merged output appears to be the same; it just takes more passes to get there.
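
To make the question concrete, here is a rough sketch of my mental model. It is not the framework's actual code, only a simplified illustration: `multi_pass_merge`, `factor`, and the toy `spills` data are names I made up, with `factor` standing in for the sort factor and each inner list standing in for one sorted spill file.

```python
import heapq


def multi_pass_merge(runs, factor):
    """Merge sorted runs, at most `factor` at a time.

    Returns (merged_run, number_of_passes). This is a simplified model
    of a multi-pass merge, not the real implementation.
    """
    if factor < 2:
        raise ValueError("factor must be at least 2")
    runs = [list(r) for r in runs]
    passes = 0
    while len(runs) > 1:
        # One pass: merge each group of up to `factor` runs into one run.
        runs = [
            list(heapq.merge(*runs[i:i + factor]))
            for i in range(0, len(runs), factor)
        ]
        passes += 1
    return (runs[0] if runs else []), passes


# Hypothetical spill files, each already sorted.
spills = [[1, 9], [2, 8], [3, 7], [4, 6], [0, 5]]

wide, wide_passes = multi_pass_merge(spills, factor=10)
narrow, narrow_passes = multi_pass_merge(spills, factor=2)
```

In this model both calls end with the same single sorted run; the lower factor only increases the number of passes, which is the behavior I am asking about.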
Thanks.