Re: [PR] HBASE-28215: region reopen procedure batching/throttling [hbase]

via GitHub Wed, 29 Nov 2023 03:42:54 -0800


bbeaudreault commented on code in PR #5534:
URL: https://github.com/apache/hbase/pull/5534#discussion_r1409161210



##########
hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/ReopenTableRegionsProcedure.java:
##########
@@ -139,33 +170,57 @@ protected Flow executeFromState(MasterProcedureEnv env, ReopenTableRegionsState
       case REOPEN_TABLE_REGIONS_CONFIRM_REOPENED:
         regions = regions.stream().map(env.getAssignmentManager().getRegionStates()::checkReopened)
           .filter(l -> l != null).collect(Collectors.toList());
-        if (regions.isEmpty()) {
-          return Flow.NO_MORE_STATE;
+        // we need to create a set of region names because the HRegionLocation hashcode is only
+        // based
+        // on the server name
+        Set<byte[]> currentRegionBatchNames = currentRegionBatch.stream()
+          .map(r -> r.getRegion().getRegionName()).collect(Collectors.toSet());
+        currentRegionBatch = regions.stream()
+          .filter(r -> currentRegionBatchNames.contains(r.getRegion().getRegionName()))
+          .collect(Collectors.toList());
+        if (currentRegionBatch.isEmpty()) {
+          if (regions.isEmpty()) {
+            return Flow.NO_MORE_STATE;
+          } else {
+            setNextState(ReopenTableRegionsState.REOPEN_TABLE_REGIONS_REOPEN_REGIONS);
+            if (reopenBatchBackoffMillis > 0) {
+              backoff(reopenBatchBackoffMillis);
+            }
+            return Flow.HAS_MORE_STATE;
+          }
         }
-        if (regions.stream().anyMatch(loc -> canSchedule(env, loc))) {
+        if (currentRegionBatch.stream().anyMatch(loc -> canSchedule(env, loc))) {

Review Comment:
   On second thought, after a discussion on
https://issues.apache.org/jira/browse/HBASE-25549, I wonder if we should take
the safer approach where we require each batch to finish before scheduling
more. This feature could then serve a dual purpose: progressive rollout and
rate limiting.
   
   We could also trivially update this code to do an actual progressive
deploy: first batch size 1, then 2, then 4, etc., up to the current batch size
config you added. At that point it stays at that max concurrency until
completion.
   
   Thoughts on adding that? I think we just need one more field,
`currentBatchSize`, which we increase after each batch up to `reopenBatchSize`.
It might make sense to persist this new field in the proto.
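
To make the ramp-up concrete, here is a minimal standalone sketch of the doubling schedule. `ProgressiveBatchSizer`, `nextBatchSize`, and `batchSizes` are hypothetical names used only for illustration; they are not part of HBase or of this PR. Only `currentBatchSize` and `reopenBatchSize` come from the comment above. The sketch assumes each batch fully completes before the next one is sized.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not HBase code: models the 1, 2, 4, ... ramp-up. */
class ProgressiveBatchSizer {

  /** Doubles the batch size after a completed batch, capped at reopenBatchSize. */
  public static int nextBatchSize(int currentBatchSize, int reopenBatchSize) {
    // long arithmetic so that doubling cannot overflow an int
    long doubled = 2L * currentBatchSize;
    // never drop below 1, otherwise a batch could be empty and make no progress
    return (int) Math.max(1L, Math.min(doubled, (long) reopenBatchSize));
  }

  /** Returns the size of every batch needed to reopen regionCount regions. */
  public static List<Integer> batchSizes(int regionCount, int reopenBatchSize) {
    List<Integer> sizes = new ArrayList<>();
    int currentBatchSize = 1; // the first batch reopens a single region
    int remaining = regionCount;
    while (remaining > 0) {
      int batch = Math.min(currentBatchSize, remaining);
      sizes.add(batch);
      remaining -= batch;
      // grow only after the batch is done, mirroring "finish before scheduling more"
      currentBatchSize = nextBatchSize(currentBatchSize, reopenBatchSize);
    }
    return sizes;
  }

  public static void main(String[] args) {
    // 20 regions with a configured maximum of 8 per batch
    System.out.println(batchSizes(20, 8)); // prints [1, 2, 4, 8, 5]
  }
}
```

In the procedure itself, the equivalent of `currentBatchSize` would presumably be updated when a batch is confirmed reopened. Persisting it in the proto, as suggested above, would let the ramp resume where it left off instead of restarting at 1.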



