On Apr 26, 2010, at 9:45 AM, Steve Loughran wrote:

> Allen Wittenauer wrote:
>> On Apr 22, 2010, at 5:41 AM, Steve Loughran wrote:
>>> that brings up a couple of issues I've been thinking about now that workers
>>> can go to 6+ HDDs/node
>>>
>>> * a way to measure the distribution across disks, rather than just nodes.
>>> DfsClient doesn't provide enough info here yet.
>>
>> What should probably happen is that instead of throwing you to the file
>> browser, clicking on a host from the live nodes page should probably put you
>> on a "stats about this node" page.
>
> I don't want to do any of this by hand. I want machine readable content
> something can aggregate over time.
>
>>
>>> * a way to triger some rebalancing on a single node, to say "position stuff
>>> more fairly". You don't need to worry about network traffic, just local
>>> disk load and CPU time, so it should be simpler.
>>
>>
>> Yup. Working with 8 drives per node, it is interesting to see how
>> unbalanced the data gets after a while. [Luckily, we have MR tmp space
>> segregated off so I'm sure it would be a lot worse if we didn't!]
>>
>> Someone should file a jira. :)
>
> Especially if someone else offers to fix it.
>

It should be trivial to at least make new block allocation choose the target device with a weighted roulette algorithm (weighted by, say, free space per disk) instead of round-robin.
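
To make the weighted roulette idea concrete, here is a rough sketch in Python rather than anything taken from the datanode code. It assumes the weight is free space per data directory; `pick_volume` and `free_bytes` are hypothetical names, not existing Hadoop APIs. Each disk gets a slice of the wheel proportional to its free space, so emptier disks receive new blocks more often and full disks are never chosen.

```python
import random


def pick_volume(free_bytes, rng=random):
    """Return the index of a volume, chosen with probability
    proportional to its free space (weighted roulette selection).

    free_bytes is a hypothetical input: one non-negative free-space
    figure per local data directory.
    """
    total = sum(free_bytes)
    if total <= 0:
        raise ValueError("no volume has free space")

    # Spin the wheel: pick a point in [0, total), then walk the slices
    # until the running sum passes that point.
    spin = rng.random() * total
    running = 0
    for index, free in enumerate(free_bytes):
        running += free
        if spin < running:
            return index

    # Only reachable through floating-point rounding; fall back to the
    # last volume that still has free space.
    return max(i for i, f in enumerate(free_bytes) if f > 0)


if __name__ == "__main__":
    # Three disks: one nearly full, one half full, one nearly empty.
    free = [50, 500, 950]
    counts = [0] * len(free)
    for _ in range(10000):
        counts[pick_volume(free)] += 1
    print(counts)
```

Round-robin would hand each of those three disks a third of the new blocks regardless of how full it is; the roulette version skews allocation toward the emptier disks, which should let an unbalanced node drift back toward balance without any explicit rebalancing pass.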
