saintstack commented on a change in pull request #1924:
URL: https://github.com/apache/hbase/pull/1924#discussion_r443714241
##########
File path:
hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/TransitRegionStateProcedure.java
##########
@@ -338,6 +345,35 @@ protected Flow executeFromState(MasterProcedureEnv env,
RegionStateTransitionSta
try {
switch (state) {
case REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE:
+
+ // Need to do some sanity check for replica region, if the region
does not exist at file
+ // system, do not try to assign the replica region, log error and
return.
+ // Do not rely on master's in-memory state, primary region got its
own life, it can be
+ // closed, offline for various reasons.
Review comment:
This is a mistake then? Master is the authority. If it can't make a call
on read replicas, then read replicas are not done right?
##########
File path:
hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/TransitRegionStateProcedure.java
##########
@@ -338,6 +345,35 @@ protected Flow executeFromState(MasterProcedureEnv env,
RegionStateTransitionSta
try {
switch (state) {
case REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE:
+
+ // Need to do some sanity check for replica region, if the region
does not exist at file
+ // system, do not try to assign the replica region, log error and
return.
+ // Do not rely on master's in-memory state, primary region got its
own life, it can be
+ // closed, offline for various reasons.
+ if (!RegionReplicaUtil.isDefaultReplica(regionNode.getRegionInfo()))
{
+ MasterFileSystem mfs =
env.getMasterServices().getMasterFileSystem();
+ RegionInfo replicaRI = regionNode.getRegionInfo();
+ Path tableDir = CommonFSUtils.getTableDir(mfs.getRootDir(),
regionNode.getTable());
+ Path regionDir = FSUtils.getRegionDirFromTableDir(tableDir,
replicaRI);
+ FileSystem fs = mfs.getFileSystem();
+ // Check if primary region directory exists
+ if (!fs.exists(regionDir)) {
+ LOG.error(
+ "Cannot assign replica region {} because its primary region {}
does not exist"
+ + " at Filesystem", replicaRI,
+
ServerRegionReplicaUtil.getRegionInfoForDefaultReplica(replicaRI));
+ return Flow.NO_MORE_STATE;
+ }
+ // check if .regionInfo exists in primary region
+ Path regionInfoFile = new Path(regionDir,
HRegionFileSystem.REGION_INFO_FILE);
+ if (!fs.exists(regionInfoFile)) {
+ LOG.error(
+ "Cannot assign replica region {} because region info file does
not exist in its"
+ + " primary region {}", replicaRI,
+
ServerRegionReplicaUtil.getRegionInfoForDefaultReplica(replicaRI));
+ return Flow.NO_MORE_STATE;
+ }
+ }
Review comment:
Ouch.
We need follow-on to fix read replicas so Master is authority?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]