[ http://issues.apache.org/jira/browse/HADOOP-297?page=comments#action_12416234 ]
Konstantin Shvachko commented on HADOOP-297:
--------------------------------------------

Implementing a good weight function is a non-trivial problem. It would still be very useful to implement a framework that supports a prioritized queue of nodes, with a simple weight function (= remaining disk space) as a starting point. The function could be fine-tuned later on. The rebalancing/migration thread might be a separate task.

> When selecting node to put new block on, give priority to those with more
> free space/less blocks
> ------------------------------------------------------------------------------------------------
>
>          Key: HADOOP-297
>          URL: http://issues.apache.org/jira/browse/HADOOP-297
>      Project: Hadoop
>         Type: Improvement
>   Components: dfs
>     Versions: 0.3.2
>     Reporter: Johan Oskarson
>     Priority: Minor
>  Attachments: priorityshuffle_v1.patch
>
> As mentioned in a previous bug report:
> We're running a smallish cluster with very different machines, some with only
> 60 GB hard drives.
> This creates a problem when inserting files into the DFS: these machines run
> out of space quickly, while others have plenty of space free.
> So instead of just shuffling the nodes, I've created a quick patch that first
> sorts the target nodes by (free space / blocks).
> It then randomizes the position of the first third of the nodes (so we don't
> put all of a file's blocks on the same machine).
> I'll let you guys figure out how to improve this.
> /Johan

--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira
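
Below is a minimal sketch of the ordering that the description and the comment suggest: sort candidate datanodes by remaining space per stored block, then shuffle the best third so one file's blocks are spread across machines. The NodeInfo class and its fields are illustrative assumptions for this sketch, not the DFS classes actually touched by priorityshuffle_v1.patch.

    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.Comparator;
    import java.util.List;
    import java.util.Random;

    /**
     * Sketch of the placement ordering described above: sort candidate
     * datanodes by (free space / number of blocks) and randomize the top
     * third so that one file's blocks do not all land on the same machine.
     */
    public class PriorityShuffleSketch {

        /** Illustrative datanode descriptor (hypothetical, not a DFS class). */
        public static class NodeInfo {
            private final String name;
            private final long remainingBytes;
            private final long numBlocks;

            public NodeInfo(String name, long remainingBytes, long numBlocks) {
                this.name = name;
                this.remainingBytes = remainingBytes;
                this.numBlocks = numBlocks;
            }

            /** Simple weight: remaining space per stored block (higher is better). */
            public double weight() {
                return (double) remainingBytes / Math.max(1L, numBlocks);
            }

            public String getName() { return name; }
        }

        private static final Random RANDOM = new Random();

        /**
         * Returns the candidates ordered by descending weight, with the order
         * of the top third randomized before they are used as targets.
         */
        public static List<NodeInfo> orderTargets(List<NodeInfo> candidates) {
            List<NodeInfo> sorted = new ArrayList<NodeInfo>(candidates);
            Collections.sort(sorted, new Comparator<NodeInfo>() {
                public int compare(NodeInfo a, NodeInfo b) {
                    return Double.compare(b.weight(), a.weight()); // descending
                }
            });
            if (!sorted.isEmpty()) {
                int third = Math.max(1, sorted.size() / 3);
                Collections.shuffle(sorted.subList(0, third), RANDOM);
            }
            return sorted;
        }
    }

Swapping in a different weight() implementation later is what the comment refers to as fine-tuning the function; a rebalancing/migration thread would be separate work on top of this ordering.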
