frostruan commented on a change in pull request #4115:
URL: https://github.com/apache/hbase/pull/4115#discussion_r814902668



##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/snapshot/SnapshotManager.java
##########
@@ -1264,4 +1424,55 @@ private SnapshotDescription toSnapshotDescription(ProcedureDescription desc)
     builder.setType(SnapshotDescription.Type.FLUSH);
     return builder.build();
   }
+
+  public void registerSnapshotProcedure(SnapshotDescription snapshot, long procId) {
+    snapshotToProcIdMap.put(snapshot, procId);
+    LOG.debug("register snapshot={}, snapshot procedure id = {}",
+      ClientSnapshotDescriptionUtils.toString(snapshot), procId);
+  }
+
+  public void unregisterSnapshotProcedure(SnapshotDescription snapshot, long procId) {
+    snapshotToProcIdMap.remove(snapshot, procId);
+    LOG.debug("unregister snapshot={}, snapshot procedure id = {}",
+      ClientSnapshotDescriptionUtils.toString(snapshot), procId);
+  }
+
+  public boolean snapshotProcedureEnabled() {
+    return master.getConfiguration()
+      .getBoolean(SNAPSHOT_PROCEDURE_ENABLED, SNAPSHOT_PROCEDURE_ENABLED_DEFAULT);
+  }
+
+  public ServerName acquireSnapshotVerifyWorker(SnapshotVerifyProcedure procedure)
+      throws ProcedureSuspendedException {
+    Optional<ServerName> worker = verifyWorkerAssigner.acquire();
+    if (worker.isPresent()) {
+      LOG.debug("{} Acquired verify snapshot worker={}", procedure, worker.get());
+      return worker.get();
+    }
+    verifyWorkerAssigner.suspend(procedure);
+    throw new ProcedureSuspendedException();
+  }
+
+  public void releaseSnapshotVerifyWorker(SnapshotVerifyProcedure procedure,
+      ServerName worker, MasterProcedureScheduler scheduler) {
+    LOG.debug("{} Release verify snapshot worker={}", procedure, worker);
+    verifyWorkerAssigner.release(worker);
+    verifyWorkerAssigner.wake(scheduler);
+  }
+
+  private void restoreWorkers() {
+    master.getMasterProcedureExecutor().getActiveProceduresNoCopy().stream()
+      .filter(p -> p instanceof SnapshotVerifyProcedure)
+      .map(p -> (SnapshotVerifyProcedure) p)
+      .filter(p -> !p.isFinished())
+      .filter(p -> p.getServerName() != null)
+      .forEach(p -> {
+        verifyWorkerAssigner.addUsedWorker(p.getServerName());

Review comment:
I think it's harmless to have dead servers in the WorkerAssigner, for the
following reasons:

1. If the remote server is dead, a FailedRemoteDispatchException is thrown
when we dispatch remote procedures. The procedure-v2 framework is designed to
handle this: the upper-layer procedure can notice the failure and schedule a
new procedure, or the procedure can retry itself.
2. When acquiring a new worker from the WorkerAssigner, we always choose from
the online server list, so a dead server can never become a candidate.
3. If too many dead servers accumulate in the WorkerAssigner, we can restart
the master to clear them. This is a very lightweight operation.
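
To illustrate point 2, here is a minimal standalone sketch. It is not the code in this PR: the class name `WorkerAssignerSketch`, the `String` server names, the `trackedWorkers()` helper, and passing the online list as a parameter are all assumptions made for illustration. It only shows why a stale entry registered through `addUsedWorker` can never be handed out, given that candidates are drawn from the online list.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical stand-in for the worker assigner; names and types are assumed. */
class WorkerAssignerSketch {
  private final int maxTasksPerWorker;
  // server name -> number of tasks currently charged to that server
  private final Map<String, Integer> used = new HashMap<>();

  WorkerAssignerSketch(int maxTasksPerWorker) {
    this.maxTasksPerWorker = maxTasksPerWorker;
  }

  /** Re-registers a worker found in a restored procedure; it may be dead by now. */
  void addUsedWorker(String server) {
    used.merge(server, 1, Integer::sum);
  }

  /**
   * Candidates come only from the online list, so a dead server that is
   * still present in the used map is never returned.
   */
  Optional<String> acquire(Collection<String> onlineServers) {
    for (String server : onlineServers) {
      if (used.getOrDefault(server, 0) < maxTasksPerWorker) {
        used.merge(server, 1, Integer::sum);
        return Optional.of(server);
      }
    }
    return Optional.empty();
  }

  /** Frees one slot; the entry is dropped when its count reaches zero. */
  void release(String server) {
    used.computeIfPresent(server, (k, v) -> v > 1 ? v - 1 : null);
  }

  /** Number of servers currently tracked, including stale ones. */
  int trackedWorkers() {
    return used.size();
  }
}
```

In this sketch a dead server costs one map entry and nothing else, which matches point 3: the entry only goes away when it is released or when the in-memory state is rebuilt.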





