[ https://issues.apache.org/jira/browse/HBASE-15547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215531#comment-15215531 ]
Heng Chen commented on HBASE-15547:
-----------------------------------
{quote}
We recently had a cluster get into a bad state where a subset of region servers
consistently could not open new regions
{quote}
I think we should first figure out why the RS could not open new regions
(too many regions on that RS?).
> Balancer should take into account number of PENDING_OPEN regions
> ----------------------------------------------------------------
>
> Key: HBASE-15547
> URL: https://issues.apache.org/jira/browse/HBASE-15547
> Project: HBase
> Issue Type: Improvement
> Components: Balancer, Operability, Region Assignment
> Affects Versions: 0.98.0, 1.0.0
> Reporter: Sean Busbey
> Priority: Critical
> Fix For: 2.0.0, 1.4.0
>
>
> We recently had a cluster get into a bad state where a subset of region
> servers consistently could not open new regions (but could continue serving
> the regions they already hosted).
> Recovering the cluster was just a matter of restarting region servers in
> sequence. However, this led to things getting substantially worse before they
> got better since the bulk assigner continued to place a uniform number of
> recovered regions across all servers, including onto those that could not
> open regions.
> It would be useful if the balancer could penalize regionservers with a
> backlog of pending_open regions and place more work on those regionservers
> that are properly serving.
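
To make the proposal above concrete, here is a minimal sketch of placement that penalizes a PENDING_OPEN backlog. It is not HBase's balancer API: the class name, the method names, and the weight of 10 are all hypothetical, and the per-server counts are assumed to be supplied by the caller.

```java
import java.util.Map;

/** Hypothetical sketch only; none of these names exist in HBase. */
final class PendingOpenAwarePlacement {
    /** Assumed weight: one PENDING_OPEN region counts as ten hosted regions. */
    static final double PENDING_OPEN_WEIGHT = 10.0;

    /** Cost of placing more work on a server: hosted load plus a backlog penalty. */
    static double cost(int hostedRegions, int pendingOpenRegions) {
        return hostedRegions + PENDING_OPEN_WEIGHT * pendingOpenRegions;
    }

    /**
     * Returns the server with the lowest cost, or null if there are no servers.
     * Servers missing from pendingOpen are treated as having no backlog.
     */
    static String pickServer(Map<String, Integer> hosted,
                             Map<String, Integer> pendingOpen) {
        String best = null;
        double bestCost = Double.POSITIVE_INFINITY;
        for (Map.Entry<String, Integer> e : hosted.entrySet()) {
            int pending = pendingOpen.getOrDefault(e.getKey(), 0);
            double c = cost(e.getValue(), pending);
            if (c < bestCost) {
                bestCost = c;
                best = e.getKey();
            }
        }
        return best;
    }
}
```

With these assumed numbers, a server hosting 50 regions with 5 stuck in PENDING_OPEN costs 100 and loses to a healthy server hosting 60, so new regions go to the server that is actually opening them.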
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)