On Oct 17, 2010, at 10:32 PM, Allen Wittenauer wrote:

> On Oct 17, 2010, at 10:29 PM, shangan wrote:
>
>> then you just write a shell to remove the logs periodically as a workaround?
>> or better ideas ?
>
> We basically have a cron job that does a few things as part of our
> maintenance. We have it rigged up such that it runs on the namenode and
> then, over ssh, runs each of these on the slave nodes:
>
> - purge old logs
> - purge old files out of mapred temp space
> - kill stale/stuck tasks
>
> Hadoop really should manage this stuff on its own, but well, Hadoop should do
> a lot of things to be more operable. ;)

Of course, it is worth mentioning that you could also hand these logs off to logadm/logwatch/rotatelogs/etc. But I like to have them centrally managed so that they are consistent across the grid.
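
For anyone wanting a starting point, here is a rough sketch of what the "purge old logs" step could look like. It is not the actual script from our cron job; the function name `purge_old_files`, the directory argument, and the retention period are all made up for illustration, and you would need to point it at wherever your own logs live.

```python
import os
import time


def purge_old_files(directory, max_age_days, now=None):
    """Delete regular files under `directory` not modified in `max_age_days`.

    Hypothetical helper: returns the list of paths that were removed.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    removed = []
    for root, _dirs, files in os.walk(directory):
        for name in files:
            path = os.path.join(root, name)
            try:
                if os.path.getmtime(path) < cutoff:
                    os.remove(path)
                    removed.append(path)
            except OSError:
                # File vanished or cannot be removed; skip it and move on.
                continue
    return removed
```

Something along these lines could be invoked over ssh on each slave node from the cron job on the namenode, which keeps the retention policy in one place. The same idea would apply to the temp-space cleanup; killing stale tasks is a different problem and is not covered here.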
