Bryan Beaudreault created HBASE-26298:
-----------------------------------------

             Summary: Downgrading is complicated by refusal to assign system 
tables to lower version
                 Key: HBASE-26298
                 URL: https://issues.apache.org/jira/browse/HBASE-26298
             Project: HBase
          Issue Type: Bug
            Reporter: Bryan Beaudreault


I was doing some rolling downgrades of test clusters and kept getting into a 
state where my automation got stuck trying to drain the final RegionServer in 
the cluster. At that point the RegionServer hosts 3 regions: meta, quota, and 
namespace. The HMaster outputs logs like: "Passed destination servername 
is null/empty so choosing a server at random".

It's very hard to understand what's happening based on that log, so you really 
have to look at the code. Once you track down that log line, it becomes 
somewhat clear that you are getting trapped by 
AssignmentManager.getExcludedServersForSystemTable().
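
To illustrate the trap, here is a minimal standalone sketch. It assumes the 
rule implied by the summary: any server running a version lower than the 
highest version currently online is refused system-table regions. The class 
and method names are hypothetical, the version comparison is simplified, and 
this is not the actual HBase implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; NOT the real AssignmentManager code.
final class SystemTableExclusionSketch {

    // Compares dotted numeric versions such as "2.4.6" and "2.3.7".
    // Simplification: qualifiers like "-SNAPSHOT" are not handled.
    static int compareVersions(String a, String b) {
        String[] as = a.split("\\.");
        String[] bs = b.split("\\.");
        int n = Math.max(as.length, bs.length);
        for (int i = 0; i < n; i++) {
            int x = i < as.length ? Integer.parseInt(as[i]) : 0;
            int y = i < bs.length ? Integer.parseInt(bs[i]) : 0;
            if (x != y) {
                return Integer.compare(x, y);
            }
        }
        return 0;
    }

    // Assumed rule: every server below the highest online version is
    // excluded as a destination for system-table regions.
    static List<String> excludedServers(Map<String, String> versionByServer) {
        String highest = null;
        for (String version : versionByServer.values()) {
            if (highest == null || compareVersions(version, highest) > 0) {
                highest = version;
            }
        }
        List<String> excluded = new ArrayList<>();
        for (Map.Entry<String, String> e : versionByServer.entrySet()) {
            if (compareVersions(e.getValue(), highest) < 0) {
                excluded.add(e.getKey());
            }
        }
        return excluded;
    }
}
```

Under this assumed rule, once every server but one has been downgraded, all 
of the downgraded servers are excluded, so the last server on the higher 
version has no valid destination for meta, quota, or namespace.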

Looking at the code, you can see comments related to the 
"hbase.min.version.move.system.tables" config, but the comments do not make 
clear what value an operator should use. What should I set this to?

This setting was added in https://issues.apache.org/jira/browse/HBASE-22923 
which focuses mostly on RSGroup, but this issue is affecting clusters that do 
not use RSGroup. The release note is also not very clear.

It would be great to clarify the docs to help the operator know what to change 
this to, or perhaps make the config itself more intuitive. For example, could 
we just make it an allowlist of versions that can hold system tables? At that 
point my path is clear: add the version I'm downgrading to to the allowlist.
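
A rough sketch of what the allowlist idea could look like, assuming a 
comma-separated config value listing the versions allowed to hold system 
tables. No such config exists today; the class, method names, and value 
format are all hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the proposed allowlist; not an existing HBase API.
final class SystemTableVersionAllowlistSketch {

    // Parses a value such as "2.4.6, 2.3.7" into a set of version strings.
    static Set<String> parseAllowlist(String raw) {
        Set<String> allowed = new LinkedHashSet<>();
        if (raw == null) {
            return allowed;
        }
        for (String part : raw.split(",")) {
            String version = part.trim();
            if (!version.isEmpty()) {
                allowed.add(version);
            }
        }
        return allowed;
    }

    // Returns the servers whose version appears in the allowlist.
    static List<String> eligibleServers(Map<String, String> versionByServer,
                                        Set<String> allowed) {
        List<String> eligible = new ArrayList<>();
        for (Map.Entry<String, String> e : versionByServer.entrySet()) {
            if (allowed.contains(e.getValue())) {
                eligible.add(e.getKey());
            }
        }
        return eligible;
    }
}
```

With a scheme like this, a downgrade from 2.4.6 to 2.3.7 (example versions 
only) would just mean adding 2.3.7 to the list before starting.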

This issue is also exacerbated by the fact that by the time you realize what 
is happening, you are in a somewhat tricky situation: there is only 1 
RegionServer left, and your only ways around it are to force stop it or to 
push a new config and rolling restart your HMasters. It would be great if this 
setting could be updated via Admin, or at the very least be reloadable with 
ConfigurationObserver.
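
A minimal sketch of the reloadable variant, assuming the setting is kept in a 
field that a reload callback can swap at runtime. To stay self-contained it 
uses a locally defined stand-in interface that takes a plain Map, not HBase's 
ConfigurationObserver itself. All class names here are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for a reload callback, defined here only so the
// sketch compiles without HBase or Hadoop on the classpath.
interface ReloadListenerSketch {
    void onConfigurationChange(Map<String, String> conf);
}

// Holds the current value of the setting and swaps it when config reloads.
final class MinVersionHolderSketch implements ReloadListenerSketch {
    static final String KEY = "hbase.min.version.move.system.tables";

    private final AtomicReference<String> minVersion;

    MinVersionHolderSketch(String initialValue) {
        this.minVersion = new AtomicReference<>(initialValue);
    }

    @Override
    public void onConfigurationChange(Map<String, String> conf) {
        String updated = conf.get(KEY);
        // Keep the previous value when the key is absent from the reload.
        if (updated != null) {
            minVersion.set(updated.trim());
        }
    }

    // Readers always see the most recently loaded value.
    String current() {
        return minVersion.get();
    }
}
```

The point of the design is that the assignment logic reads the value through 
current() on each decision, so a reload takes effect without restarting the 
HMaster.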



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
