On Nov 9, 2010, at 6:53 AM, Steve Loughran wrote:
> You can get unbalanced disks even without swapping if you are using the same
> set of disks for mapred temp/overspill storage. This gives you good
> bandwidth, but can lead to unbalanced systems, as can deletion of large files.
This is actually the reason I recommend that people create separate file
systems for mapred's temp/spill space. It is the only way to keep MR
'contained' to the point where it doesn't destroy a grid. [Plus it makes it
dirt simple to clean up the MR directories every so often, since Hadoop is
pretty bad at it.]
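
For anyone who wants to check whether a node already has this separation, here
is a rough sketch (not part of any Hadoop tooling). It compares the device IDs
of the HDFS data directories and the MR local directories and reports any pair
that lives on the same file system. The function names are made up for
illustration, and the directory lists are assumed to be whatever you have set
for dfs.data.dir and mapred.local.dir on that node.

```python
import os


def device_of(path):
    """Return the device ID of the file system that holds `path`."""
    return os.stat(path).st_dev


def shared_filesystems(dfs_dirs, mapred_dirs):
    """Return (dfs_dir, mapred_dir) pairs that sit on the same file system.

    Both arguments are lists of existing directories. An empty result
    means MR scratch space is fully separated from HDFS block storage.
    """
    return [
        (d, m)
        for d in dfs_dirs
        for m in mapred_dirs
        if device_of(d) == device_of(m)
    ]
```

Call it with the two directory lists from your own config, e.g.
shared_filesystems(["/data/1/dfs"], ["/mapred/1/local"]) (hypothetical paths).
Any pair it returns is a disk where MR spill can still crowd out HDFS blocks.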