Scott Carey wrote:
On Apr 26, 2010, at 9:45 AM, Steve Loughran wrote:

Allen Wittenauer wrote:
On Apr 22, 2010, at 5:41 AM, Steve Loughran wrote:
That brings up a couple of issues I've been thinking about now that workers can
go to 6+ HDDs/node:

* a way to measure the distribution across disks, rather than just nodes. 
DfsClient doesn't provide enough info here yet.
What should probably happen is that clicking on a host from the live nodes page
puts you on a "stats about this node" page, instead of throwing you into the
file browser.
I don't want to do any of this by hand. I want machine-readable content that something can aggregate over time.
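
A minimal sketch of the kind of machine-readable output meant here, assuming the data directories are passed in by hand; the class and its JSON layout are illustrative, not anything DfsClient exposes today:

import java.io.File;

// Walk a set of data directories and print per-volume usage as JSON,
// so an external tool can poll it and aggregate the numbers over time.
public class VolumeUsageReport {
  public static void main(String[] args) {
    String[] dataDirs = args.length > 0 ? args
        : new String[] { "/data/1", "/data/2" };   // stand-ins for dfs.data.dir
    StringBuilder json = new StringBuilder("[");
    for (int i = 0; i < dataDirs.length; i++) {
      File dir = new File(dataDirs[i]);
      json.append(String.format(
          "{\"dir\":\"%s\",\"totalBytes\":%d,\"freeBytes\":%d}",
          dataDirs[i], dir.getTotalSpace(), dir.getUsableSpace()));
      if (i < dataDirs.length - 1) {
        json.append(",");
      }
    }
    json.append("]");
    System.out.println(json);
  }
}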

* a way to trigger some rebalancing on a single node, to say "position stuff more
fairly". You don't need to worry about network traffic, just local disk load and CPU
time, so it should be simpler.

Yup.  Working with 8 drives per node, it is interesting to see how unbalanced 
the data gets after a while.  [Luckily, we have MR tmp space segregated off so 
I'm sure it would be a lot worse if we didn't!]

Someone should file a jira. :)
Especially if someone else offers to fix it.
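
A rough sketch of what that single-node rebalance pass could look like, assuming the DataNode is stopped while it runs; the flat directory layout and method names are illustrative, not the real block storage format:

import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

// Move block files from the fullest data directory to the emptiest one until
// their free space is within a threshold. Files.move falls back to copy+delete
// across filesystems, which is what happens when the dirs are on separate disks.
public class LocalDiskRebalance {
  public static void rebalance(File fullest, File emptiest, long thresholdBytes)
      throws IOException {
    File[] blocks = fullest.listFiles();
    if (blocks == null) {
      return;
    }
    for (File block : blocks) {
      if (emptiest.getUsableSpace() - fullest.getUsableSpace() < thresholdBytes) {
        break;   // the two volumes are close enough; stop moving blocks
      }
      Files.move(block.toPath(), new File(emptiest, block.getName()).toPath());
    }
  }
}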


It should be trivial to at least make new-block allocation choose the target device
with a weighted roulette algorithm instead of round-robin.
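
A minimal sketch of that weighted roulette pick, weighting each volume by its free space; the class and method names are made up for illustration:

import java.io.File;
import java.util.Random;

// Pick the volume for a new block with a roulette-wheel draw weighted by free
// space: emptier disks own a bigger slice of the wheel, so they receive blocks
// more often than fuller ones, instead of every disk getting an equal share.
public class WeightedVolumeChooser {
  private final Random random = new Random();

  public File chooseVolume(File[] dataDirs) {
    long totalFree = 0;
    for (File dir : dataDirs) {
      totalFree += dir.getUsableSpace();
    }
    long pick = (long) (random.nextDouble() * totalFree);
    for (File dir : dataDirs) {
      pick -= dir.getUsableSpace();
      if (pick < 0) {
        return dir;
      }
    }
    return dataDirs[dataDirs.length - 1];  // guard against rounding at the top end
  }
}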


I'd go for another plugin point, with the default impl being round-robin: let people come up with other strategies, and add a way to measure the distribution from DfsClient.

Other strategies could use specific knowledge about the disks: are they RAIDed, can you hot-swap them, is a disk playing up, and so on; feature creep that I encourage people to explore.
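
A sketch of the shape such a plugin point might take, with round-robin as the default impl; the interface name and signature are illustrative, not taken from the Hadoop source:

import java.io.File;
import java.util.List;

// The DataNode would load an implementation named in its configuration;
// round-robin stays the default, and other strategies (weighted by free space,
// RAID- or hot-swap-aware, skip a disk that is playing up) plug in the same way.
interface VolumeChoosingPolicy {
  /** Pick the data directory that should receive the next block replica. */
  File chooseVolume(List<File> volumes, long blockSize);
}

class RoundRobinVolumeChoosingPolicy implements VolumeChoosingPolicy {
  private int next = 0;

  @Override
  public synchronized File chooseVolume(List<File> volumes, long blockSize) {
    File chosen = volumes.get(next % volumes.size());
    next = (next + 1) % volumes.size();
    return chosen;
  }
}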
