[ 
https://issues.apache.org/jira/browse/HBASE-55?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12568419#action_12568419
 ] 

Bryan Duxbury commented on HBASE-55:
------------------------------------

I think the tricky part of trying to incorporate update/read rate and memory 
usage is that these values can change very quickly, and if we make balancing 
decisions based on them, we could get really bad oscillations in assignments. 

Moreover, there's really no such thing as "too busy". Either it's less busy 
than average, and it should take on new regions, or more busy than average, and 
regions should be taken away. If all of the servers have equal load, but the 
average is "too high", then all you get is poor performance. At no point does 
it make sense for a region server to "say no" to an assignment, because in 
theory the master has decided that assignment is optimal for the known factors. 
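
To make the "relative to average" idea concrete, here is a minimal sketch. 
The function name, the server names, and the single load number per server 
are all hypothetical, and this is not taken from the HBase master code.

```python
from statistics import mean


def classify_servers(loads):
    """Split servers by how their load compares to the cluster average.

    `loads` maps a server name to one load number (a hypothetical metric).
    Returns (average, servers below average, servers above average).
    """
    avg = mean(loads.values())
    # Below average: candidates to take on new regions.
    takers = sorted(s for s, load in loads.items() if load < avg)
    # Above average: candidates to have regions taken away.
    shedders = sorted(s for s, load in loads.items() if load > avg)
    return avg, takers, shedders


loads = {"rs1": 12, "rs2": 8, "rs3": 10}
avg, takers, shedders = classify_servers(loads)
print(avg, takers, shedders)
```

Note that when every server reports the same load, both lists come back 
empty no matter how high the average is: there is nothing to rebalance, 
only poor performance.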

I think that calculating the load factor has to be simple. Otherwise we can 
easily get caught up trying to build a complicated load function that takes 
into account every factor the region server can offer, but only provides a 
marginal improvement over simpler functions. 

Let's try a simple metric and see what happens. If it fails to give us decent 
distribution, then we can go back to the drawing board.
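
As one possible shape for such a simple metric, the sketch below assumes 
the load of a server is just the number of regions it currently hosts, and 
hands each unassigned region to the least-loaded server. The metric choice, 
the function name, and the region and server names are assumptions for 
illustration only.

```python
def assign_regions(region_counts, new_regions):
    """Greedily assign each new region to the server hosting the fewest regions.

    `region_counts` maps a server name to its current region count (the
    assumed load metric). Returns (assignment plan, resulting counts).
    """
    counts = dict(region_counts)  # copy so the caller's map is left untouched
    plan = {}
    for region in new_regions:
        # Ties are broken by server name to keep the result deterministic.
        target = min(counts, key=lambda server: (counts[server], server))
        plan[region] = target
        counts[target] += 1
    return plan, counts


plan, counts = assign_regions({"rs1": 3, "rs2": 1, "rs3": 2}, ["r-a", "r-b", "r-c"])
print(plan)
print(counts)
```

Because region counts only change when the master itself assigns or moves 
a region, a metric like this should not swing on its own the way request 
rate or memory usage can.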

As for HBASE-70, I see it less as a way for us to monitor how much memory is 
in use and more as a way for each region server to work best within the memory 
it has. Perhaps we also need to be aware of the "swappiness" of a region server 
(how much of its cache is being discarded due to memory pressure), but that 
may be separate.

> [hbase] Improve Master region assignment function
> -------------------------------------------------
>
>                 Key: HBASE-55
>                 URL: https://issues.apache.org/jira/browse/HBASE-55
>             Project: Hadoop HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Bryan Duxbury
>             Fix For: 0.2.0
>
>
> We would like the master's region assignment function to take into account 
> more factors when choosing where to assign regions.
>  
> - More advanced accounting of load on regionserver - memory, # requests, etc
> - Don't deploy both daughter regions to the same regionserver
> - Assign regions where the underlying DFS blocks are hosted if possible
> Please add additional ideas in comments as they come up.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
