[
https://issues.apache.org/jira/browse/HBASE-20368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16874786#comment-16874786
]
Xiaolin Ha edited comment on HBASE-20368 at 6/28/19 9:08 AM:
-------------------------------------------------------------
[~zghaobac], thanks for your question.
{quote}Why does this patch work, and where did the old implementation get stuck?
{quote}
The balancer skipped region assignment entirely when the group had no online
regionservers, and AM never processed those regions again. As a result, the
RITs stayed stuck.
With this patch, regions are assigned to the BOGUS server when the group has no
online regionservers. AM then checks the assignment plans and, for any plan
that targets BOGUS, adds the regions back to the pending assign queue.
This is equivalent to retrying the assignment until the group has online
regionservers.
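The retry idea can be illustrated with a minimal, self-contained sketch. This is not the patch itself: the class, the method names (roundRobin, processQueue) and the sentinel string are hypothetical stand-ins for the balancer, AM and the BOGUS server, and plain strings are used in place of the real region and server types.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class BogusRetrySketch {
    // Hypothetical sentinel standing in for the BOGUS server.
    static final String BOGUS = "BOGUS_SERVER";

    // Balancer side (simplified): if the group has no online servers,
    // plan every region onto BOGUS instead of silently dropping it.
    static Map<String, List<String>> roundRobin(List<String> regions,
                                                List<String> onlineServers) {
        Map<String, List<String>> plan = new LinkedHashMap<>();
        if (onlineServers.isEmpty()) {
            plan.put(BOGUS, new ArrayList<>(regions));
            return plan;
        }
        for (int i = 0; i < regions.size(); i++) {
            String server = onlineServers.get(i % onlineServers.size());
            plan.computeIfAbsent(server, k -> new ArrayList<>()).add(regions.get(i));
        }
        return plan;
    }

    // AM side (simplified): drain the pending queue, ask the balancer for a
    // plan, and put regions planned onto BOGUS back into the queue so that a
    // later pass retries them. Returns region -> server for real assignments.
    static Map<String, String> processQueue(Deque<String> pending,
                                            List<String> onlineServers) {
        List<String> batch = new ArrayList<>(pending);
        pending.clear();
        Map<String, String> assigned = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e
                : roundRobin(batch, onlineServers).entrySet()) {
            if (BOGUS.equals(e.getKey())) {
                pending.addAll(e.getValue()); // retry later
            } else {
                for (String region : e.getValue()) {
                    assigned.put(region, e.getKey());
                }
            }
        }
        return assigned;
    }

    public static void main(String[] args) {
        Deque<String> pending = new ArrayDeque<>(Arrays.asList("r1", "r2"));
        // First pass: the group has no online servers, so both regions are re-queued.
        System.out.println(processQueue(pending, Collections.emptyList()) + " pending=" + pending);
        // Second pass: a server is back online, so the queue drains.
        System.out.println(processQueue(pending, Arrays.asList("rs1")) + " pending=" + pending);
    }
}
```

In this sketch the queue is only emptied of regions that received a real server, which is the behaviour the patch aims for; the old behaviour corresponds to clearing the queue without re-adding anything.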
> Fix RIT stuck when a rsgroup has no online servers but AM's
> pendingAssginQueue is cleared
> -----------------------------------------------------------------------------------------
>
> Key: HBASE-20368
> URL: https://issues.apache.org/jira/browse/HBASE-20368
> Project: HBase
> Issue Type: Bug
> Components: rsgroup
> Affects Versions: 2.0.0
> Reporter: Xiaolin Ha
> Assignee: Xiaolin Ha
> Priority: Major
> Attachments: HBASE-20368.branch-2.001.patch,
> HBASE-20368.branch-2.002.patch, HBASE-20368.branch-2.003.patch,
> HBASE-20368.branch-2.1.001.patch
>
>
> This error can be reproduced by shutting down all servers in an rsgroup and
> starting them soon afterwards.
> The regions in this rsgroup will be reassigned, but there are no available
> servers in this rsgroup.
> They will be added to AM's pendingAssignQueue, which AM clears regardless of
> the assignment result in this case.