Scott Carey wrote:
On Apr 26, 2010, at 9:45 AM, Steve Loughran wrote:
Allen Wittenauer wrote:
On Apr 22, 2010, at 5:41 AM, Steve Loughran wrote:
That brings up a couple of issues I've been thinking about now that workers can
go to 6+ HDDs/node:
* a way to measure how blocks are distributed across the disks within a node,
rather than just across nodes. DfsClient doesn't provide enough info here yet.
What should probably happen is that clicking on a host from the live nodes page
puts you on a "stats about this node" page, instead of throwing you into the
file browser.
I don't want to do any of this by hand. I want machine-readable content that
something can aggregate over time.
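Something as simple as a per-volume usage snapshot, dumped in a form a
monitoring tool can scrape, would probably cover it. A rough sketch of what I
mean, with invented names (none of this is existing DfsClient or DataNode API):

import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Sketch only: the kind of per-volume snapshot I'd want a datanode to expose in
// machine-readable form (one line per volume) so an external tool can aggregate
// it over time. Invented names; this is not existing DfsClient/DataNode API.
public class VolumeUsageReport {

  /** Usage of one data directory (typically one physical disk). */
  static final class VolumeUsage {
    final String path;
    final long capacityBytes;
    final long usedBytes;

    VolumeUsage(String path, long capacityBytes, long usedBytes) {
      this.path = path;
      this.capacityBytes = capacityBytes;
      this.usedBytes = usedBytes;
    }

    double percentUsed() {
      return capacityBytes == 0 ? 0.0 : 100.0 * usedBytes / capacityBytes;
    }

    /** Trivially parseable key=value line, e.g. for a hypothetical /volumeStats page. */
    String toLine() {
      return String.format("path=%s used=%d capacity=%d pct=%.1f",
          path, usedBytes, capacityBytes, percentUsed());
    }
  }

  /** Snapshot every configured data directory with plain java.io.File stats. */
  static List<VolumeUsage> snapshot(List<File> dataDirs) {
    List<VolumeUsage> usages = new ArrayList<VolumeUsage>();
    for (File dir : dataDirs) {
      long capacity = dir.getTotalSpace();
      usages.add(new VolumeUsage(dir.getPath(), capacity,
          capacity - dir.getUsableSpace()));
    }
    return usages;
  }

  /** Spread between the fullest and emptiest volume, in percentage points. */
  static double imbalance(List<VolumeUsage> usages) {
    double min = 100.0, max = 0.0;
    for (VolumeUsage u : usages) {
      min = Math.min(min, u.percentUsed());
      max = Math.max(max, u.percentUsed());
    }
    return usages.isEmpty() ? 0.0 : max - min;
  }
}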
* a way to trigger some rebalancing on a single node, to say "position stuff more
fairly". You don't need to worry about network traffic, just local disk load and CPU
time, so it should be simpler.
Yup. Working with 8 drives per node, it is interesting to see how unbalanced
the data gets after a while. [Luckily, we have MR tmp space segregated off so
I'm sure it would be a lot worse if we didn't!]
Someone should file a jira. :)
Especially if someone else offers to fix it.
It should be trivial to at least make new block allocation choose which device
a block lands on using a weighted roulette algorithm instead of round-robin.
I'd go for another plugin point with the default impl being round-robin,
let people come up with other strategies, and add a way to measure the
distribution from DfsClient.
Other strategies could use specific knowledge about the disks: are they
RAIDed, can you hot swap a drive, is the disk playing up, etc. That's the
kind of feature creep I encourage people to explore.
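Roughly the shape I have in mind for the plugin point, as a sketch only
(invented names, not the actual FSDataset internals), with round-robin as the
default and Scott's weighted roulette as one alternative strategy:

import java.io.File;
import java.util.List;
import java.util.Random;

// Sketch of the plugin point: the interface a datanode could call when picking
// which local volume receives the next block. Invented names throughout; this
// is not the current FSDataset code, just the shape of the idea.
public interface BlockVolumeChooser {
  /** Pick the data directory that should receive a new block of the given size. */
  File chooseVolume(List<File> volumes, long blockSize);
}

// Default implementation: today's behaviour, plain round-robin over the volumes.
class RoundRobinChooser implements BlockVolumeChooser {
  private int next = 0;

  public synchronized File chooseVolume(List<File> volumes, long blockSize) {
    File v = volumes.get(next % volumes.size());
    next = (next + 1) % volumes.size();
    return v;
  }
}

// Weighted roulette as one alternative strategy: each volume's chance of being
// picked is proportional to its free space, so emptier disks fill faster and
// the node drifts back towards balance without any explicit mover.
class FreeSpaceRouletteChooser implements BlockVolumeChooser {
  private final Random random = new Random();

  public File chooseVolume(List<File> volumes, long blockSize) {
    long[] free = new long[volumes.size()];
    long totalFree = 0;
    for (int i = 0; i < volumes.size(); i++) {
      long usable = volumes.get(i).getUsableSpace();
      free[i] = usable >= blockSize ? usable : 0;  // skip volumes with no room
      totalFree += free[i];
    }
    if (totalFree == 0) {
      throw new IllegalStateException("no volume has room for " + blockSize + " bytes");
    }
    // Spin the wheel: pick a point in [0, totalFree) and find the slot it lands in.
    long spin = (long) (random.nextDouble() * totalFree);
    for (int i = 0; i < free.length; i++) {
      spin -= free[i];
      if (spin < 0) {
        return volumes.get(i);
      }
    }
    return volumes.get(volumes.size() - 1);  // only reachable via rounding edge cases
  }
}

The roulette version tends to self-correct, since emptier disks fill faster;
the flip side is that a strategy driven purely by free space will dump almost
everything onto a freshly swapped-in drive, which is exactly the sort of
behaviour a different pluggable strategy might want to damp down.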
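And on the single-node rebalance I asked about earlier in the thread: even
something dumb that walks the fullest volume and pushes block files over to the
emptiest one until the spread drops under a threshold would help. Another
sketch, invented names again, and the real thing would have to coordinate with
the datanode so it never moves a block that is being read or written:

import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.List;

// Sketch of a local, single-node block mover: no network traffic, just shuffling
// block files between data directories on the same host. Invented names; a real
// version would have to lock blocks against concurrent reads/writes and update
// the datanode's volume map as it goes.
class LocalVolumeBalancer {

  private static double percentUsed(File dir) {
    long capacity = dir.getTotalSpace();
    return capacity == 0 ? 0.0 : 100.0 * (capacity - dir.getUsableSpace()) / capacity;
  }

  /**
   * Move block files from the fullest volume to the emptiest one until the
   * spread between them drops below thresholdPct, or there is nothing to move.
   */
  static void rebalance(List<File> volumes, double thresholdPct) throws IOException {
    if (volumes.size() < 2) {
      return;
    }
    while (true) {
      File fullest = volumes.get(0);
      File emptiest = volumes.get(0);
      for (File v : volumes) {
        if (percentUsed(v) > percentUsed(fullest)) fullest = v;
        if (percentUsed(v) < percentUsed(emptiest)) emptiest = v;
      }
      if (percentUsed(fullest) - percentUsed(emptiest) < thresholdPct) {
        return;  // close enough to balanced
      }
      File candidate = null;
      File[] contents = fullest.listFiles();
      if (contents != null) {
        for (File f : contents) {
          if (f.isFile()) {          // only move regular files, not subdirectories
            candidate = f;
            break;
          }
        }
      }
      if (candidate == null) {
        return;  // nothing movable left on the fullest volume
      }
      Files.move(candidate.toPath(), new File(emptiest, candidate.getName()).toPath());
    }
  }
}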