ndimiduk commented on a change in pull request #1786:
URL: https://github.com/apache/hbase/pull/1786#discussion_r431437782



##########
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
##########
@@ -1911,43 +1912,51 @@ public boolean normalizeRegions() throws IOException {
       return false;
     }
 
-    synchronized (this.normalizer) {
+    if (!normalizationInProgressLock.tryLock()) {

Review comment:
       It hangs out here on the master because, sadly, a bunch of logic is 
implemented in `HMaster`. All the stuff about deciding which tables to 
normalize and actual execution of the generated plans happens in 
`Master#normalizeRegions` (under the same lock we were just discussing). I 
don't love this design, but I think it does make sense that a normalizer 
algorithm exist mostly outside of the context of the master context in which it 
runs. If I were to make the master more of a composition, I would implement a 
class called something like `NormalizationService`, which was responsible for 
instantiating the instance of the `Normalizer` algorithm, registering the 
`Chore` and being the delegation target for invocations coming from RPC. The 
lock would be internal to this `NormalizationService`.

##########
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
##########
@@ -400,6 +401,8 @@ public void run() {
   private final LockManager lockManager = new LockManager(this);
 
   private RSGroupBasedLoadBalancer balancer;
+  // a lock to prevent concurrent normalization actions.
+  private final ReentrantLock normalizationInProgressLock = new 
ReentrantLock();

Review comment:
       There is a `RegionNormalizerChore`, yes. It invokes 
`master.normalizeRegions()`. However, the same method can also be called from 
`MasterRpcServices`, so this lock (previously a `synchronized(normalizer) 
block`) prevents the chore and an operator invocation from colliding.

##########
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
##########
@@ -797,7 +800,6 @@ protected void initializeZKBasedSystemTrackers()
     this.balancer.setConf(conf);
     this.normalizer = RegionNormalizerFactory.getRegionNormalizer(conf);
     this.normalizer.setMasterServices(this);
-    this.normalizer.setMasterRpcServices((MasterRpcServices)rpcServices);

Review comment:
       It was using `masterRpcServices` to reach out to check the split and 
merge switches. I've replaced that with direct calls using `masterServices`.

##########
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
##########
@@ -1911,43 +1912,51 @@ public boolean normalizeRegions() throws IOException {
       return false;
     }
 
-    synchronized (this.normalizer) {
+    if (!normalizationInProgressLock.tryLock()) {
       // Don't run the normalizer concurrently
+      LOG.info("Normalization already in progress. Skipping request.");
+    } else {
+      try {
+        List<TableName> allEnabledTables = new ArrayList<>(
+          tableStateManager.getTablesInStates(TableState.State.ENABLED));
+        Collections.shuffle(allEnabledTables);
 
-      List<TableName> allEnabledTables = new ArrayList<>(
-        this.tableStateManager.getTablesInStates(TableState.State.ENABLED));
-
-      Collections.shuffle(allEnabledTables);
-
-      for (TableName table : allEnabledTables) {
-        TableDescriptor tblDesc = getTableDescriptors().get(table);
-        if (table.isSystemTable() || (tblDesc != null &&
-            !tblDesc.isNormalizationEnabled())) {
-          LOG.trace("Skipping normalization for {}, as it's either system"
-              + " table or doesn't have auto normalization turned on", table);
-          continue;
-        }
+        for (TableName table : allEnabledTables) {
+          if (table.isSystemTable()) {
+            continue;
+          }
+          final TableDescriptor tblDesc = getTableDescriptors().get(table);
+          if (tblDesc != null && !tblDesc.isNormalizationEnabled()) {
+            LOG.debug(
+              "Skipping {} because normalization is disabled in its table 
properties.", table);
+            continue;
+          }
 
-        // make one last check that the cluster isn't shutting down before 
proceeding.
-        if (skipRegionManagementAction("region normalizer")) {
-          return false;
-        }
+          // make one last check that the cluster isn't shutting down before 
proceeding.
+          if (skipRegionManagementAction("region normalizer")) {
+            return false;
+          }
 
-        final List<NormalizationPlan> plans = 
this.normalizer.computePlanForTable(table);
-        if (CollectionUtils.isEmpty(plans)) {
-          return true;
-        }
+          final List<NormalizationPlan> plans = 
normalizer.computePlansForTable(table);
+          if (CollectionUtils.isEmpty(plans)) {
+            return true;

Review comment:
       Yikes! good point.

##########
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
##########
@@ -1911,43 +1912,51 @@ public boolean normalizeRegions() throws IOException {
       return false;
     }
 
-    synchronized (this.normalizer) {
+    if (!normalizationInProgressLock.tryLock()) {
       // Don't run the normalizer concurrently
+      LOG.info("Normalization already in progress. Skipping request.");
+    } else {
+      try {
+        List<TableName> allEnabledTables = new ArrayList<>(
+          tableStateManager.getTablesInStates(TableState.State.ENABLED));
+        Collections.shuffle(allEnabledTables);
 
-      List<TableName> allEnabledTables = new ArrayList<>(
-        this.tableStateManager.getTablesInStates(TableState.State.ENABLED));
-
-      Collections.shuffle(allEnabledTables);
-
-      for (TableName table : allEnabledTables) {
-        TableDescriptor tblDesc = getTableDescriptors().get(table);
-        if (table.isSystemTable() || (tblDesc != null &&
-            !tblDesc.isNormalizationEnabled())) {
-          LOG.trace("Skipping normalization for {}, as it's either system"
-              + " table or doesn't have auto normalization turned on", table);
-          continue;
-        }
+        for (TableName table : allEnabledTables) {
+          if (table.isSystemTable()) {
+            continue;
+          }
+          final TableDescriptor tblDesc = getTableDescriptors().get(table);
+          if (tblDesc != null && !tblDesc.isNormalizationEnabled()) {
+            LOG.debug(
+              "Skipping {} because normalization is disabled in its table 
properties.", table);
+            continue;
+          }
 
-        // make one last check that the cluster isn't shutting down before 
proceeding.
-        if (skipRegionManagementAction("region normalizer")) {
-          return false;
-        }
+          // make one last check that the cluster isn't shutting down before 
proceeding.
+          if (skipRegionManagementAction("region normalizer")) {
+            return false;
+          }
 
-        final List<NormalizationPlan> plans = 
this.normalizer.computePlanForTable(table);
-        if (CollectionUtils.isEmpty(plans)) {
-          return true;
-        }
+          final List<NormalizationPlan> plans = 
normalizer.computePlansForTable(table);
+          if (CollectionUtils.isEmpty(plans)) {
+            return true;
+          }
 
-        try (final Admin admin = 
asyncClusterConnection.toConnection().getAdmin()) {
-          for (NormalizationPlan plan : plans) {
-            plan.execute(admin);
-            if (plan.getType() == PlanType.SPLIT) {
-              splitPlanCount++;
-            } else if (plan.getType() == PlanType.MERGE) {
-              mergePlanCount++;
+          try (final Admin admin = 
asyncClusterConnection.toConnection().getAdmin()) {
+            // as of this writing, `plan.execute()` is non-blocking, so 
there's no artificial rate-
+            // limiting of merge requests due to this serial loop.

Review comment:
       Well, sort of you can limit the work done. There no way to say "execute 
at most N actions per table and M actions in a single invocation." You think we 
need something like this?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to