EdColeman commented on code in PR #2778:
URL: https://github.com/apache/accumulo/pull/2778#discussion_r934581685


##########
server/base/src/main/java/org/apache/accumulo/server/conf/ServerConfigurationFactory.java:
##########
@@ -140,4 +163,119 @@ public void connectionEvent() {
       // no-op. changes handled by prop store impl
     }
   }
+
+  private class ConfigRefreshRunner {
+    private static final long MIN_JITTER_DELAY = 1;
+    private static final long MAX_JITTER_DELAY = 23;
+    private final ScheduledFuture<?> refreshTaskFuture;
+
+    ConfigRefreshRunner() {
+
+      Runnable refreshTask = this::verifySnapshotVersions;
+
+      ScheduledThreadPoolExecutor executor = ThreadPools.getServerThreadPools()
+          .createScheduledExecutorService(1, "config-refresh", false);
+
+      // staggering the initial delay prevents synchronization of Accumulo servers communicating
+      // with ZooKeeper for the sync process. (Value is 25% -> 100% of the refresh period.)
+      long randDelay = jitter(REFRESH_PERIOD_MINUTES / 4, REFRESH_PERIOD_MINUTES);
+      refreshTaskFuture =
+          executor.scheduleWithFixedDelay(refreshTask, randDelay, REFRESH_PERIOD_MINUTES, MINUTES);
+    }
+
+    /**
+     * Check that the stored version in ZooKeeper matches the version held in the local snapshot.
+     * When a mismatch is detected, a change event is sent to the prop store which will cause a
+     * re-load. If the Zookeeper node has been deleted, the local cache entries are removed.
+     * <p>
+     * This method is designed to be called as a scheduled task, so it does not propagate exceptions
+     * other than interrupted Exceptions so the scheduled tasks will continue to run.
+     */
+    private void verifySnapshotVersions() {
+
+      // short circuit if refresh in progress
+      if (isConfigRefreshRunning.get()) {
+        return;
+      }
+
+      // allow only one thread if missed short circuit check.
+      refreshLock.lock();
+      try {
+        isConfigRefreshRunning.set(true);
+        long refreshStart = System.nanoTime();
+        int keyCount = 0;
+        int keyChangedCount = 0;
+
+        PropStore propStore = context.getPropStore();
+        keyCount++;
+
+        // rely on store to propagate change event if different
+        propStore.validateDataVersion(SystemPropKey.of(context),
+            ((ZooBasedConfiguration) getSystemConfiguration()).getDataVersion());
+        // small yield - spread out ZooKeeper calls
+        jitterDelay();
+
+        for (Map.Entry<NamespaceId,NamespaceConfiguration> entry : namespaceConfigs.entrySet()) {
+          keyCount++;
+          PropStoreKey<?> propKey = NamespacePropKey.of(context, entry.getKey());
+          if (!propStore.validateDataVersion(propKey, entry.getValue().getDataVersion())) {
+            keyChangedCount++;
+            namespaceConfigs.remove(entry.getKey());
+          }
+          // small yield - spread out ZooKeeper calls between namespace config checks
+          jitterDelay();

Review Comment:
   On a large cluster start, all tservers could come online at nearly the same
time. The initial start jitter and the additional per-call jitter delays are
meant to keep this check from issuing ZooKeeper calls in lockstep across the
cluster as the servers keep running.
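
A minimal sketch of the two jitter points described above, for illustration only. The `jitter` and `jitterDelay` helpers are not shown in this hunk, so the bodies below are assumptions, as are the class name `JitterSketch`, the 15-minute refresh period, and the choice of milliseconds for the per-call delay.

```java
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the PR's ConfigRefreshRunner; not the actual implementation.
final class JitterSketch {
  // Assumed value: REFRESH_PERIOD_MINUTES is defined outside the quoted hunk.
  static final long REFRESH_PERIOD_MINUTES = 15;
  // Same bounds as the hunk; the unit (milliseconds) is an assumption.
  static final long MIN_JITTER_DELAY = 1;
  static final long MAX_JITTER_DELAY = 23;

  // Uniform random value in [min, max], inclusive on both ends.
  static long jitter(long min, long max) {
    return ThreadLocalRandom.current().nextLong(min, max + 1);
  }

  // Short random pause between consecutive ZooKeeper calls.
  static void jitterDelay() throws InterruptedException {
    Thread.sleep(jitter(MIN_JITTER_DELAY, MAX_JITTER_DELAY));
  }

  // Random initial delay of 25% to 100% of the period, so servers that start
  // together do not run their first check at the same moment.
  static ScheduledFuture<?> schedule(ScheduledExecutorService executor, Runnable task) {
    long initialDelay = jitter(REFRESH_PERIOD_MINUTES / 4, REFRESH_PERIOD_MINUTES);
    return executor.scheduleWithFixedDelay(task, initialDelay, REFRESH_PERIOD_MINUTES,
        TimeUnit.MINUTES);
  }
}
```

Because the per-call delay adds a random amount to each run's duration and `scheduleWithFixedDelay` measures the next delay from the end of the previous run, servers that happen to line up once would be expected to drift apart again on later runs.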


