[ https://issues.apache.org/jira/browse/HBASE-55?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12568419#action_12568419 ]
Bryan Duxbury commented on HBASE-55:
------------------------------------
I think the tricky part of trying to incorporate update/read rate and memory
usage is that this could change very quickly, and if we are making balancing
decisions based on this, we could get really bad oscillations in assignments.
Moreover, there's really no such thing as "too busy". Either it's less busy
than average, and it should take on new regions, or more busy than average, and
regions should be taken away. If all of the servers have equal load, but the
average is "too high", then all you get is poor performance. At no point does
it make sense for a region server to "say no" to an assignment, because in
theory the master has decided that assignment is optimal for the known factors.
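
To make the "relative to average" point concrete, here is a minimal sketch. The function name, the server names, and the load numbers are all hypothetical and are not taken from the HBase code; "load" is whatever single number the master ends up tracking per region server.

```python
from statistics import mean

def classify_servers(loads):
    """Split servers by comparing each one's load to the cluster average.

    `loads` maps a (hypothetical) region server name to a load number.
    There is no absolute "too busy" threshold, only a comparison to the mean.
    """
    avg = mean(loads.values())
    # Below average: candidates to take on new regions.
    takers = sorted(s for s, load in loads.items() if load < avg)
    # Above average: candidates to have regions taken away.
    givers = sorted(s for s, load in loads.items() if load > avg)
    return avg, takers, givers

# Uneven cluster: rs2 should receive regions, rs1 should give some up.
avg, takers, givers = classify_servers({"rs1": 10, "rs2": 4, "rs3": 7})

# Evenly loaded cluster with a high average: nothing to rebalance, even
# though every server may be performing poorly.
_, even_takers, even_givers = classify_servers({"rs1": 50, "rs2": 50})
```

In the second call both lists come back empty, which is the "equal load but the average is too high" case: the balancer has nothing useful to do there.
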
I think that calculating the load factor has to be simple, otherwise we can
easily get caught up trying to build a complicated load function that takes
into account every factor the region server can offer, but only provides a
marginal improvement over simpler functions.
Let's try a simple metric and see what happens. If it fails to give us decent
distribution, then we can go back to the drawing board.
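
As one possible "simple metric", the sketch below uses nothing but the number of regions each server currently holds and always hands the next unassigned region to the server with the fewest. This is an illustration under that assumption, not the master's actual assignment code; `assign_regions` and all the names in the example are made up.

```python
def assign_regions(region_counts, new_regions):
    """Assign each new region to the server currently holding the fewest.

    `region_counts` maps a (hypothetical) server name to its region count.
    Ties are broken by server name so the result is deterministic.
    """
    counts = dict(region_counts)  # work on a copy, leave the input untouched
    assignments = {}
    for region in new_regions:
        target = min(counts, key=lambda s: (counts[s], s))
        assignments[region] = target
        counts[target] += 1
    return assignments, counts

assignments, final_counts = assign_regions(
    {"rs1": 3, "rs2": 1, "rs3": 2},
    ["region_a", "region_b", "region_c"],
)
```

With these inputs the three new regions fill in the gaps and every server ends up with three regions. If a metric this simple gives poor distribution in practice, that is the point at which more factors would be worth considering.
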
As for HBASE-70, I see it not so much as a way for us to monitor how much
memory is in use as a way for each region server to work best within the memory
it has. Perhaps we also need to be aware of the "swappiness" of a region server
(how much of its cache is being discarded due to memory pressure), but that
may be a separate issue.
> [hbase] Improve Master region assignment function
> -------------------------------------------------
>
> Key: HBASE-55
> URL: https://issues.apache.org/jira/browse/HBASE-55
> Project: Hadoop HBase
> Issue Type: Improvement
> Components: regionserver
> Reporter: Bryan Duxbury
> Fix For: 0.2.0
>
>
> We would like the master's region assignment function to take into account
> more factors when choosing where to assign regions.
>
> - More advanced accounting of load on regionserver - memory, # requests, etc
> - Don't deploy both daughter regions to the same regionserver
> - Assign regions where the underlying DFS blocks are hosted if possible
>
> Please add additional ideas in comments as they come up.