[ http://issues.apache.org/jira/browse/HADOOP-386?page=comments#action_12423370 ]

Bryan Pendleton commented on HADOOP-386:
----------------------------------------
I suggest that rather than "percent free space" we use the metric "absolute free space". This would improve interaction with the new code that lets a tasktracker withdraw from the workload when its space gets low, and, especially, improve the odds that machines with much smaller drives can contribute productively.

I have a few really old machines in my cluster, with 15 GB drives. Running datanodes and tasktrackers on them usually makes them end up as (small) datanodes that do very little computation. The reverse would probably be more useful.

It would also be nice to first consider rebalancing within local drives. Moving files between local drives costs no network bandwidth. Plus, you can add a new big drive to one of the old nodes I mention above and have it become productive even more easily.

> Periodically move blocks from full nodes to those with space
> ------------------------------------------------------------
>
>                 Key: HADOOP-386
>                 URL: http://issues.apache.org/jira/browse/HADOOP-386
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs
>    Affects Versions: 0.4.0
>            Reporter: Johan Oskarson
>
> I'm still having *a lot* of problems with some nodes filling up quickly and others hardly being touched, mostly because of the hardware being very different.
> As someone suggested, there should be a thread that periodically checks the dfs for nodes with little or no free space and schedules blocks to be moved off that node.
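
To make the difference between the two metrics concrete, here is a minimal sketch in Python (not Hadoop code). The `Node` class, the helper names, and the 500 GB node are hypothetical and chosen only for illustration; the 15 GB drive size comes from the comment above. It shows how ranking by percent free can favor a nearly empty small drive, while ranking by absolute free space favors the node that can actually hold more blocks.

```python
from dataclasses import dataclass

GB = 1024 ** 3


@dataclass
class Node:
    """Hypothetical view of a datanode's disk usage (not a Hadoop class)."""
    name: str
    capacity: int  # total bytes
    used: int      # bytes in use

    @property
    def free_bytes(self) -> int:
        # Absolute free space in bytes.
        return self.capacity - self.used

    @property
    def free_percent(self) -> float:
        # Free space as a percentage of total capacity.
        return 100.0 * self.free_bytes / self.capacity


def pick_by_percent_free(nodes):
    """Choose the target with the highest percentage of free space."""
    return max(nodes, key=lambda n: n.free_percent)


def pick_by_absolute_free(nodes):
    """Choose the target with the most free bytes."""
    return max(nodes, key=lambda n: n.free_bytes)


# Assumed example cluster: one old small drive, one larger drive.
nodes = [
    Node("old-15gb", capacity=15 * GB, used=3 * GB),       # 12 GB free
    Node("new-500gb", capacity=500 * GB, used=300 * GB),   # 200 GB free
]

print(pick_by_percent_free(nodes).name)
print(pick_by_absolute_free(nodes).name)
```

With these assumed numbers, the percent-based choice sends new blocks to the small old drive even though the larger node has far more room, which is the behavior the comment argues against.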
