[
https://issues.apache.org/jira/browse/HBASE-26298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17420122#comment-17420122
]
Viraj Jasani commented on HBASE-26298:
--------------------------------------
From HBASE-22923 release notes:
_Example: Let's assume the cluster is on version 1.4.0 and we have_
_set "hbase.min.version.move.system.tables" as "2.0.0". Now if we upgrade_
_one RegionServer on 1.4.0 cluster to 1.6.0 (< 2.0.0), then AssignmentManager will_
_not move hbase:meta, hbase:namespace and other system table regions_
_to newly brought up RegionServer 1.6.0 as part of auto-migration._
_However, if we upgrade one RegionServer on 1.4.0 cluster to 2.2.0 (> 2.0.0),_
_then AssignmentManager will move all system table regions to newly brought_
_up RegionServer 2.2.0 as part of auto-migration done by_
_AssignmentManager#checkIfShouldMoveSystemRegionAsync()._
This config prevents system tables from being assigned to a higher-version
regionserver during an upgrade. It is equally useful for preventing system
tables from being kept only on higher-version servers.
Let's assume:
cluster version: 1.4.0, upgrade target version: 2.0.0, and we have not
specified the min version config. As soon as any single regionserver is
upgraded to 2.0.0, all system tables are moved there automatically; this is by
design. To prevent this, we can set the min version config to a value greater
than 2.0.0 (e.g. 2.1.0 or 2.4.0), and system tables will then be treated like
any other user table.
The same rule applies to the downgrade case. If all regionservers are on 2.0.0
and we start downgrading to 1.4.0 one server at a time, the last server that
stays on 2.0.0 will hold all system tables. We can't gracefully shut down that
server, because system tables are not allowed to move to a lower version, so
we will have to either kill the last server or stop it non-gracefully.
However, if we set the min version config to a value greater than 2.0.0,
system tables can be moved to any server (2.0.0 or 1.4.0).
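The rule described above can be sketched roughly as follows. This is a
simplified, standalone illustration and not the actual AssignmentManager code:
the class SystemTableExclusionSketch, its methods, and the server names are
hypothetical, the version comparison handles plain dotted numbers only, and the
behaviour when the config equals the highest live version is an assumption.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the exclusion rule; not HBase's real implementation.
class SystemTableExclusionSketch {

    // Compares plain dotted numeric versions such as "1.4.0" and "2.0.0".
    static int compareVersions(String a, String b) {
        String[] x = a.split("\\.");
        String[] y = b.split("\\.");
        for (int i = 0; i < Math.max(x.length, y.length); i++) {
            int xi = i < x.length ? Integer.parseInt(x[i]) : 0;
            int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
            if (xi != yi) {
                return Integer.compare(xi, yi);
            }
        }
        return 0;
    }

    // Returns the servers that may NOT host system tables.
    // minVersion == null models "hbase.min.version.move.system.tables" being unset.
    static List<String> excludedServers(Map<String, String> serverVersions, String minVersion) {
        List<String> excluded = new ArrayList<>();
        if (serverVersions.isEmpty()) {
            return excluded;
        }
        String highest = Collections.max(serverVersions.values(),
                SystemTableExclusionSketch::compareVersions);
        // Config greater than the highest live version: system tables are
        // treated like user tables, so nothing is excluded.
        // Assumption: a config equal to the highest version still excludes.
        if (minVersion != null && compareVersions(minVersion, highest) > 0) {
            return excluded;
        }
        // Otherwise every server below the highest version is excluded.
        for (Map.Entry<String, String> e : serverVersions.entrySet()) {
            if (compareVersions(e.getValue(), highest) < 0) {
                excluded.add(e.getKey());
            }
        }
        return excluded;
    }

    public static void main(String[] args) {
        // Downgrade in progress: one server still on 2.0.0, the rest on 1.4.0.
        Map<String, String> cluster = new LinkedHashMap<>();
        cluster.put("rs1", "2.0.0");
        cluster.put("rs2", "1.4.0");
        cluster.put("rs3", "1.4.0");

        // Config unset: system tables are pinned to rs1.
        System.out.println(excludedServers(cluster, null));    // prints [rs2, rs3]
        // Config set above 2.0.0: any server may host system tables.
        System.out.println(excludedServers(cluster, "2.1.0")); // prints []
    }
}
```

With the config unset, the last 2.0.0 server is the only eligible host, which
is the stuck-drain situation described in the issue; with 2.1.0 configured, the
exclusion list is empty and the regions can move to the 1.4.0 servers.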
> Downgrading is complicated by refusal to assign system tables to lower version
> ------------------------------------------------------------------------------
>
> Key: HBASE-26298
> URL: https://issues.apache.org/jira/browse/HBASE-26298
> Project: HBase
> Issue Type: Bug
> Reporter: Bryan Beaudreault
> Priority: Minor
>
> I was doing some rolling downgrades of test clusters and kept getting into a
> state where my automation gets stuck trying to drain the final RegionServer
> in the cluster. At this point that RegionServer hosts 3 regions: meta, quota,
> namespace. The HMaster is outputting logs like: "Passed destination
> servername is null/empty so choosing a server at random".
> It's very hard to understand what's happening based on that log, so you really
> have to look at the code. Tracking down that log line, it becomes somewhat
> clear that you are getting trapped by
> AssignmentManager.getExcludedServersForSystemTable().
> Looking at the code, you can see comments related to
> "hbase.min.version.move.system.tables" config, but the comments are very
> unclear. What should I set this to?
> This setting was added in https://issues.apache.org/jira/browse/HBASE-22923
> which focuses mostly on RSGroup, but this issue is affecting clusters that do
> not use RSGroup. The release note also is not super clear.
> It would be great to clarify the docs to help the operator know what to
> change this to, or perhaps make the config itself more intuitive. For
> example, could we just make it an allowlist of versions that can hold system
> tables? At that point my path is clear: add the version I'm downgrading to to
> the allowlist.
> This issue is also exacerbated by the fact that by the time you've realized
> this you're in a somewhat tricky situation where there's only 1 RegionServer
> left and your only way around it is to force stop it or to push a new config
> and rolling restart your HMasters. It would be great if this setting were
> able to be updated via Admin or at the very least reloadable with
> ConfigurationObserver.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)