[
https://issues.apache.org/jira/browse/HBASE-23269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16988400#comment-16988400
]
[~zhangduo] I proposeed a pr, which filters higher version instances in groups.
Is that appropriate?
> Hbase crashed due to two versions of regionservers when rolling upgrading
> -------------------------------------------------------------------------
>
> Key: HBASE-23269
> URL: https://issues.apache.org/jira/browse/HBASE-23269
> Project: HBase
> Issue Type: Improvement
> Components: master
> Affects Versions: 1.4.0, 1.4.2, 1.4.9, 1.4.10, 1.4.11
> Reporter: Jianzhen Xu
> Assignee: Jianzhen Xu
> Priority: Critical
> Attachments: 9.png, image-2019-11-07-14-49-41-253.png,
> image-2019-11-07-14-50-11-877.png, image-2019-11-07-14-51-38-858.png
>
>
> Currently, when HBase has the rs_group feature enabled and is upgraded to a
> higher version, assignment of the meta table may fail, which eventually makes
> the whole cluster unavailable and drops availability to 0. This applies to
> all hbase-1.4.* versions that include the rs_group feature, and also to
> pre-1.4 versions carrying the rs_group patch when they are upgraded to
> version 1.4.
> When this happens during an upgrade:
> * During a rolling upgrade of the regionservers, the problem always occurs if
> the first regionserver upgraded is not in the same rs_group as the meta table.
> The phenomenon is as follows:
> !image-2019-11-07-14-50-11-877.png!
> !image-2019-11-07-14-51-38-858.png!
> The reason is as follows: during the rolling upgrade of the first
> regionserver node (denoted RS1), RS1 starts up and re-registers with ZK; the
> master notices this through the watcher in RegionServerTracker and finally
> reaches the method HMaster.checkIfShouldMoveSystemRegionAsync().
> The logic of this method is as follows:
>
> {code:java}
> public void checkIfShouldMoveSystemRegionAsync() {
>   new Thread(new Runnable() {
>     @Override
>     public void run() {
>       try {
>         synchronized (checkIfShouldMoveSystemRegionLock) {
>           // RS register on ZK after reports startup on master
>           List<HRegionInfo> regionsShouldMove = new ArrayList<>();
>           for (ServerName server : getExcludedServersForSystemTable()) {
>             regionsShouldMove.addAll(getCarryingSystemTables(server));
>           }
>           if (!regionsShouldMove.isEmpty()) {
>             List<RegionPlan> plans = new ArrayList<>();
>             for (HRegionInfo regionInfo : regionsShouldMove) {
>               RegionPlan plan = getRegionPlan(regionInfo, true);
>               if (regionInfo.isMetaRegion()) {
>                 // Must move meta region first.
>                 balance(plan);
>               } else {
>                 plans.add(plan);
>               }
>             }
>             for (RegionPlan plan : plans) {
>               balance(plan);
>             }
>           }
>         }
>       } catch (Throwable t) {
>         LOG.error(t);
>       }
>     }
>   }).start();
> }{code}
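> The version-filter step named above, getExcludedServersForSystemTable(), can
> be illustrated with a minimal self-contained sketch (plain Java, not HBase
> code; the method name, server names, and the version map are illustrative
> assumptions): find the highest version among all regionservers and exclude
> every server strictly below it.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hedged sketch of the "exclude servers below the highest version" rule
// described in the issue; not the actual HBase implementation.
public class ExcludedServersSketch {
    // Return the servers whose version is strictly below the highest
    // version seen across the cluster (the "LowVersionRSList").
    static List<String> excludedForSystemTables(Map<String, Integer> serverVersions) {
        int max = serverVersions.values().stream().max(Integer::compare).orElse(0);
        List<String> excluded = new ArrayList<>();
        for (Map.Entry<String, Integer> e : serverVersions.entrySet()) {
            if (e.getValue() < max) {
                excluded.add(e.getKey());
            }
        }
        return excluded;
    }

    public static void main(String[] args) {
        Map<String, Integer> versions = new LinkedHashMap<>();
        versions.put("rs1", 2); // the single node upgraded so far
        versions.put("rs2", 1);
        versions.put("rs3", 1);
        // As soon as one node is upgraded, every other node is "excluded",
        // so their system-table regions become candidates to move.
        System.out.println(excludedForSystemTables(versions)); // prints [rs2, rs3]
    }
}
```

> Note that the moment a single regionserver is upgraded, every other server
> is excluded, which is what drives the meta region move described below.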
>
> # First, getExcludedServersForSystemTable() is executed: it finds the highest
> version among all regionservers and returns every RS below that version,
> labeled LowVersionRSList.
> # If step 1 returns a non-empty list, iterate over it. If an RS carries a
> system-table region, add that region to the list of regions to move. If the
> first RS upgraded is not in the rs_group where the system tables are located,
> the meta region is added to regionsShouldMove.
> # Get a RegionPlan for each region in regionsShouldMove, with the parameter
> forceNewPlan set to true:
> ## Get all regionservers whose version is below the highest version;
> ## Exclude those regionservers from all online regionservers; only the
> already-upgraded RSs remain, marked as destServers;
> ## Since forceNewPlan is true, the destination server is obtained through
> balancer.randomAssignment(region, destServers). Since the rs_group feature is
> enabled, the balancer here is RSGroupBasedLoadBalancer, whose logic is:
> ### Intersect the destServers from step 3.2 with the online regionservers in
> the current region's rs_group. When the region belongs to a system table and
> the upgraded RS is not in the same rs_group, the result is empty; in that
> case the destination regionserver is hard-coded to BOGUS_SERVER_NAME
> (localhost,1).
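> Steps 3.2 and 3.3.1 above can be sketched together (a hedged plain-Java
> sketch, not the real RSGroupBasedLoadBalancer; only the BOGUS_SERVER_NAME
> value localhost,1 comes from the issue text, all other names are made up):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.Set;

// Sketch of the group-aware random assignment fallback described in step 3.
public class GroupAssignSketch {
    // Per the issue text, the balancer falls back to this unresolvable name.
    static final String BOGUS_SERVER_NAME = "localhost,1";

    // destServers: online servers already at the highest version (step 3.2).
    // groupServers: online servers in the region's rs_group.
    static String randomAssignment(List<String> destServers,
                                   Set<String> groupServers,
                                   Random rnd) {
        List<String> candidates = new ArrayList<>();
        for (String s : destServers) {
            if (groupServers.contains(s)) {
                candidates.add(s);
            }
        }
        if (candidates.isEmpty()) {
            // No upgraded server inside the region's group: the destination
            // is a bogus server, so the assignment can never succeed.
            return BOGUS_SERVER_NAME;
        }
        return candidates.get(rnd.nextInt(candidates.size()));
    }

    public static void main(String[] args) {
        // Only rs1 has been upgraded, and it is outside the meta table's group.
        List<String> upgraded = List.of("rs1");
        Set<String> metaGroup = Set.of("rs2", "rs3");
        System.out.println(randomAssignment(upgraded, metaGroup, new Random()));
        // prints localhost,1 -- the doomed assignment that crashes the cluster
    }
}
```

> The empty intersection is the crux: until one server inside the system
> table's rs_group is upgraded, the fallback destination can never be resolved.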
> Therefore, when the master assigns a system-table region to localhost,1, the
> assignment naturally fails. If this master logic goes unnoticed and the
> problem occurs, upgrading any node in the rs_group that holds the system
> tables will make the cluster recover automatically.
> During an actual upgrade you will rarely know about this problem without
> reading the master code. However, the official documentation does not state
> that, when the rs_group feature is in use, the rs_group holding the system
> tables must be upgraded first, so it is easy to fall into this process and
> eventually crash. (According to the code comments, system tables are assigned
> to the highest-version RSs for compatibility reasons.)
> Therefore, even without changing the code logic, the official documentation
> could note that, when upgrading a cluster with the rs_group feature enabled,
> the rs_group holding the system tables should be upgraded first.
>
>
>
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)