[ 
https://issues.apache.org/jira/browse/HBASE-22767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16898054#comment-16898054
 ] 

Guanghao Zhang commented on HBASE-22767:
----------------------------------------

I think the better fix is to fall back to the default group to find an RS. 
Then there is no need to add a BOGUS_SERVER_NAME and no need to fix these 
special cases.
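
A rough sketch of the fallback idea, to make it concrete. Everything here is 
hypothetical: the class and method names are made up for illustration, servers 
are plain strings, and versions are plain integers. None of this is the actual 
AssignmentManager or RSGroup code. It also assumes the fallback still prefers 
the highest version servers inside the default group.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch only; these names do not exist in HBase.
class SystemTableCandidateSketch {

  // Return the servers of `group` that run the highest version seen
  // anywhere in the cluster (`versions` maps every online server to its version).
  static List<String> highestVersionIn(Set<String> group, Map<String, Integer> versions) {
    List<String> result = new ArrayList<>();
    if (versions.isEmpty()) {
      return result;
    }
    int highest = Collections.max(versions.values());
    for (String server : group) {
      Integer version = versions.get(server);
      if (version != null && version == highest) {
        result.add(server);
      }
    }
    Collections.sort(result); // deterministic order for the example
    return result;
  }

  // Prefer highest version servers in the table's own group. If there are
  // none, fall back to the default group instead of a bogus server.
  static List<String> candidates(Set<String> tableGroup, Set<String> defaultGroup,
      Map<String, Integer> versions) {
    List<String> inGroup = highestVersionIn(tableGroup, versions);
    if (!inGroup.isEmpty()) {
      return inGroup;
    }
    return highestVersionIn(defaultGroup, versions);
  }

  public static void main(String[] args) {
    Map<String, Integer> versions = new HashMap<>();
    versions.put("rs-sys", 1); // system group server, not upgraded yet
    versions.put("rs-a", 2);
    versions.put("rs-b", 2);
    Set<String> systemGroup = new HashSet<>(Arrays.asList("rs-sys"));
    Set<String> defaultGroup = new HashSet<>(Arrays.asList("rs-a", "rs-b"));
    // The system group has no highest version server, so the default group is used.
    System.out.println(candidates(systemGroup, defaultGroup, versions)); // prints [rs-a, rs-b]
  }
}
```

With this shape the region temporarily leaves its own group, but it always 
lands on a live server, so the BOGUS server special cases would not be needed.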

> System table RIT STUCK if their RSGroup has no highest version RSes
> -------------------------------------------------------------------
>
>                 Key: HBASE-22767
>                 URL: https://issues.apache.org/jira/browse/HBASE-22767
>             Project: HBase
>          Issue Type: Bug
>          Components: rsgroup
>            Reporter: Xiaolin Ha
>            Assignee: Xiaolin Ha
>            Priority: Major
>
> AM chooses the highest version region servers as candidates for system 
> tables, including the META table. If the system table group has no highest 
> version region servers, its regions are always reassigned to the BOGUS 
> server defined in RSGroup. 
> In our test environment using branch-2.2, we isolate system tables in an 
> rsgroup containing only one server. While upgrading RSs, we hit the 
> problem that META is always assigned to the BOGUS server even though the 
> group server has already been online for a while. The META RIT is stuck and 
> cannot be recovered by hbck2.
> I wrote a UT that reproduces this problem. The steps are:
> 1. add a group, move 1 server to it;
> 2. move meta table to the group;
> 3. restart the group server and downgrade its version;
> 4. the meta RIT is stuck.
>  
> The root cause is that AM filters for the highest version RSs for system 
> tables. So if we do not change the versions of the system table group 
> servers but upgrade the versions of other group servers, any reassignment 
> of a system table, such as the balancer moving its regions, leaves the RIT 
> stuck.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
