I came up with a nice little hack to trick Hadoop into calculating disk usage with df instead of du:
http://allthingshadoop.com/2011/05/20/faster-datanodes-with-less-wait-io-using-df-instead-of-du/

I am running this in production. It works like a charm and I am already seeing the benefit, woot! I hope it works well for others too.

/* Joe Stein http://www.twitter.com/allthingshadoop */
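
For readers who want a feel for the idea before following the link, here is a minimal Python sketch. It is not the script from the post. It assumes the trick amounts to a stand-in that prints the "kilobytes, tab, path" line that `du -sk PATH` would print, with the number taken from filesystem-level statistics (what df reports) instead of a walk over every file. The function name `du_style_line` is hypothetical, and the assumption that Hadoop reads that line format should be checked against your Hadoop version.

```python
import shutil
import sys


def du_style_line(path: str) -> str:
    """Build a line shaped like `du -sk PATH` output from df-style numbers.

    Assumption: the consumer only needs "<used KB><TAB><path>".
    shutil.disk_usage asks the filesystem for its totals, so no
    directory tree is walked, which is the point of the trick.
    """
    usage = shutil.disk_usage(path)   # named tuple: total, used, free (bytes)
    used_kb = usage.used // 1024      # du -k reports kilobytes
    return f"{used_kb}\t{path}"


if __name__ == "__main__":
    # Hypothetical usage: python df_as_du.py /data/dfs
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    print(du_style_line(target))
```

Note that the figure covers the whole filesystem holding the path, not just that directory, so a sketch like this is only a reasonable approximation when the data directory sits on its own dedicated volume.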
