virajjasani commented on a change in pull request #2890:
URL: https://github.com/apache/hbase/pull/2890#discussion_r559461474



##########
File path: hbase-common/src/main/java/org/apache/hadoop/hbase/ChoreService.java
##########
@@ -319,12 +360,18 @@ public synchronized void 
onChoreMissedStartTime(ScheduledChore chore) {
    * shutdown the service. Any chores that are scheduled for execution will be 
cancelled. Any chores
    * in the middle of execution will be interrupted and shutdown. This service 
will be unusable
    * after this method has been called (i.e. future scheduling attempts will 
fail).
+   * <p/>
+   * Notice that, this will only clean the chore from this ChoreService but 
you could still schedule
+   * the chore with other ChoreService.

Review comment:
       nit: other or same ChreService instance?

##########
File path: hbase-common/src/main/java/org/apache/hadoop/hbase/ChoreService.java
##########
@@ -141,28 +140,41 @@ public ChoreService(final String coreThreadPoolPrefix, 
int corePoolSize, boolean
    * @return true when the chore was successfully scheduled. false when the 
scheduling failed
    *         (typically occurs when a chore is scheduled during shutdown of 
service)
    */
-  public synchronized boolean scheduleChore(ScheduledChore chore) {
+  public boolean scheduleChore(ScheduledChore chore) {
     if (chore == null) {
       return false;
     }
-
-    try {
-      if (chore.getPeriod() <= 0) {
-        LOG.info("Chore {} is disabled because its period is not positive.", 
chore);
-        return false;
-      }
-      LOG.info("Chore {} is enabled.", chore);
-      chore.setChoreServicer(this);
-      ScheduledFuture<?> future =
-          scheduler.scheduleAtFixedRate(chore, chore.getInitialDelay(), 
chore.getPeriod(),
-            chore.getTimeUnit());
-      scheduledChores.put(chore, future);
-      return true;
-    } catch (Exception exception) {
-      if (LOG.isInfoEnabled()) {
-        LOG.info("Could not successfully schedule chore: " + chore.getName());
+    // always lock chore first to prevent dead lock
+    synchronized (chore) {

Review comment:
       Other than here, do we have synchronization on `ScheduledChore` 
elsewhere? I think all other methods themselves are synchronized which is good 
but how do we prevent anyone from synchronizing on `ScheduledChore` instance in 
future? Perhaps for now, we can write a Javadoc comment on `ScheduledChore` 
providing this info?

##########
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
##########
@@ -936,7 +936,7 @@ public void startChore() {
    */
   public void stop() {
     if (flushedSeqIdFlusher != null) {
-      flushedSeqIdFlusher.cancel();
+      flushedSeqIdFlusher.shutdown();

Review comment:
       `shutdown(true)` looks better candidate

##########
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/quotas/RegionServerSpaceQuotaManager.java
##########
@@ -98,11 +98,11 @@ public synchronized void start() throws IOException {
 
   public synchronized void stop() {
     if (spaceQuotaRefresher != null) {
-      spaceQuotaRefresher.cancel();

Review comment:
       `cancel()` is nothing but `cancel(true)`, so all `cancel()` should be 
replaced with `shutdown(true)` just like all `cancel(true)` are replaced with 
`shutdown(true)`

##########
File path: hbase-common/src/main/java/org/apache/hadoop/hbase/ChoreService.java
##########
@@ -141,28 +140,41 @@ public ChoreService(final String coreThreadPoolPrefix, 
int corePoolSize, boolean
    * @return true when the chore was successfully scheduled. false when the 
scheduling failed
    *         (typically occurs when a chore is scheduled during shutdown of 
service)
    */
-  public synchronized boolean scheduleChore(ScheduledChore chore) {
+  public boolean scheduleChore(ScheduledChore chore) {
     if (chore == null) {
       return false;
     }
-
-    try {
-      if (chore.getPeriod() <= 0) {
-        LOG.info("Chore {} is disabled because its period is not positive.", 
chore);
-        return false;
-      }
-      LOG.info("Chore {} is enabled.", chore);
-      chore.setChoreServicer(this);
-      ScheduledFuture<?> future =
-          scheduler.scheduleAtFixedRate(chore, chore.getInitialDelay(), 
chore.getPeriod(),
-            chore.getTimeUnit());
-      scheduledChores.put(chore, future);
-      return true;
-    } catch (Exception exception) {
-      if (LOG.isInfoEnabled()) {
-        LOG.info("Could not successfully schedule chore: " + chore.getName());
+    // always lock chore first to prevent dead lock
+    synchronized (chore) {
+      synchronized (this) {
+        try {
+          // Chores should only ever be scheduled with a single ChoreService. 
If the choreService
+          // is changing, cancel any existing schedules of this chore.
+          if (chore.getChoreService() == this) {
+            LOG.warn("Chore {} has already been scheduled with us", chore);
+            return false;
+          }
+          if (chore.getPeriod() <= 0) {
+            LOG.info("Chore {} is disabled because its period is not 
positive.", chore);
+            return false;
+          }
+          LOG.info("Chore {} is enabled.", chore);
+          if (chore.getChoreService() != null) {
+            LOG.info("Cancel chore {} from its previous service", chore);
+            chore.getChoreService().cancelChore(chore);
+          }
+          chore.setChoreService(this);
+          ScheduledFuture<?> future = scheduler.scheduleAtFixedRate(chore, 
chore.getInitialDelay(),
+            chore.getPeriod(), chore.getTimeUnit());
+          scheduledChores.put(chore, future);
+          return true;
+        } catch (Exception exception) {
+          if (LOG.isInfoEnabled()) {
+            LOG.info("Could not successfully schedule chore: " + 
chore.getName());

Review comment:
       Let's convert this to ERROR with exception stacktrace?

##########
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
##########
@@ -1500,11 +1499,9 @@ private void switchSnapshotCleanup(final boolean on) {
     try {
       snapshotCleanupTracker.setSnapshotCleanupEnabled(on);
       if (on) {
-        if (!getChoreService().isChoreScheduled(this.snapshotCleanerChore)) {
-          getChoreService().scheduleChore(this.snapshotCleanerChore);
-        }
+        getChoreService().scheduleChore(this.snapshotCleanerChore);
       } else {
-        getChoreService().cancelChore(this.snapshotCleanerChore);
+        this.snapshotCleanerChore.cancel();

Review comment:
       I know it's a pain but we need to have null check here

##########
File path: hbase-common/src/main/java/org/apache/hadoop/hbase/ChoreService.java
##########
@@ -141,28 +140,41 @@ public ChoreService(final String coreThreadPoolPrefix, 
int corePoolSize, boolean
    * @return true when the chore was successfully scheduled. false when the 
scheduling failed
    *         (typically occurs when a chore is scheduled during shutdown of 
service)
    */
-  public synchronized boolean scheduleChore(ScheduledChore chore) {
+  public boolean scheduleChore(ScheduledChore chore) {
     if (chore == null) {
       return false;
     }
-
-    try {
-      if (chore.getPeriod() <= 0) {
-        LOG.info("Chore {} is disabled because its period is not positive.", 
chore);
-        return false;
-      }
-      LOG.info("Chore {} is enabled.", chore);
-      chore.setChoreServicer(this);
-      ScheduledFuture<?> future =
-          scheduler.scheduleAtFixedRate(chore, chore.getInitialDelay(), 
chore.getPeriod(),
-            chore.getTimeUnit());
-      scheduledChores.put(chore, future);
-      return true;
-    } catch (Exception exception) {
-      if (LOG.isInfoEnabled()) {
-        LOG.info("Could not successfully schedule chore: " + chore.getName());
+    // always lock chore first to prevent dead lock
+    synchronized (chore) {
+      synchronized (this) {
+        try {
+          // Chores should only ever be scheduled with a single ChoreService. 
If the choreService
+          // is changing, cancel any existing schedules of this chore.
+          if (chore.getChoreService() == this) {
+            LOG.warn("Chore {} has already been scheduled with us", chore);
+            return false;
+          }
+          if (chore.getPeriod() <= 0) {
+            LOG.info("Chore {} is disabled because its period is not 
positive.", chore);
+            return false;
+          }
+          LOG.info("Chore {} is enabled.", chore);
+          if (chore.getChoreService() != null) {
+            LOG.info("Cancel chore {} from its previous service", chore);
+            chore.getChoreService().cancelChore(chore);

Review comment:
       I think it makes sense. So what we are doing is we are taking lock on 
give `ScheduledChore` instance, followed by current `ChoreService` itself and 
when we find that this ScheduledChore belongs to different ChoreService 
instance, we call `cancel()` on that instance, which is anyways synchronized 
instance method.

##########
File path: 
hbase-common/src/main/java/org/apache/hadoop/hbase/ScheduledChore.java
##########
@@ -354,7 +320,27 @@ protected boolean initialChore() {
   /**
    * Override to run cleanup tasks when the Chore encounters an error and must 
stop running
    */
-  protected synchronized void cleanup() {
+  protected void cleanup() {
+  }
+
+  /**
+   * Call {@link #shutdown(boolean)} with {@code true}.
+   * @see ScheduledChore#shutdown(boolean)
+   */
+  public synchronized void shutdown() {
+    cancel(false);
+    cleanup();

Review comment:
       This is supposed to call `shutdown(true)` right? Javadoc also says so.

##########
File path: hbase-common/src/main/java/org/apache/hadoop/hbase/ChoreService.java
##########
@@ -319,12 +360,18 @@ public synchronized void 
onChoreMissedStartTime(ScheduledChore chore) {
    * shutdown the service. Any chores that are scheduled for execution will be 
cancelled. Any chores
    * in the middle of execution will be interrupted and shutdown. This service 
will be unusable
    * after this method has been called (i.e. future scheduling attempts will 
fail).
+   * <p/>
+   * Notice that, this will only clean the chore from this ChoreService but 
you could still schedule
+   * the chore with other ChoreService.
    */
   public synchronized void shutdown() {
+    if (isShutdown()) {
+      return;
+    }
     scheduler.shutdownNow();
     if (LOG.isInfoEnabled()) {
-      LOG.info("Chore service for: " + coreThreadPoolPrefix + " had " + 
scheduledChores.keySet()
-          + " on shutdown");
+      LOG.info("Chore service for: " + coreThreadPoolPrefix + " had " + 
scheduledChores.keySet() +
+        " on shutdown");
     }

Review comment:
       I think INFO is quite common level set by majority applications. We 
better remove `isInfoEnabled()` check and also replace String concat with 
placeholders with `{}`?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Reply via email to