keith-turner commented on code in PR #5143:
URL: https://github.com/apache/accumulo/pull/5143#discussion_r1873997910


##########
core/src/main/java/org/apache/accumulo/core/fate/zookeeper/ZooCache.java:
##########
@@ -316,43 +277,46 @@ public List<String> getChildren(final String zPath) {
       public List<String> run() throws KeeperException, InterruptedException {
 
         var zcNode = nodeCache.get(zPath);
-        if (zcNode != null && zcNode.childrenSet) {
+        if (zcNode != null && zcNode.cachedChildren()) {
           return zcNode.getChildren();
         }
 
         log.trace("{} {} was not in children cache, looking up in zookeeper", 
cacheId, zPath);
 
-        try {
-          zcNode = nodeCache.compute(zPath, (zp, zcn) -> {
-            // recheck the children now that lock is held on key
-            if (zcn != null && zcn.childrenSet) {
-              return zcn;
-            }
+        zcNode = nodeCache.compute(zPath, (zp, zcn) -> {
+          // recheck the children now that lock is held on key
+          if (zcn != null && zcn.cachedChildren()) {
+            return zcn;
+          }
 
-            try {
-              final ZooKeeper zooKeeper = getZooKeeper();
-              List<String> children;
-              children = zooKeeper.getChildren(zPath, watcher);
-              if (children != null) {
-                children = List.copyOf(children);
-              }
-              return new ZcNode(zcn, children);
-            } catch (KeeperException e) {
-              throw new ZcException(e);
-            } catch (InterruptedException e) {
-              throw new ZcInterruptedException(e);
+          try {
+            final ZooKeeper zooKeeper = getZooKeeper();
+            // Register a watcher on the node to monitor creation/deletion 
events for the node. It
+            // is possible that an event from this watch could trigger prior 
to calling getChildren.
+            // That is ok because the compute() call on the map has a lock and 
processing the event
+            // will block until compute() returns. After compute() returns the 
event processing
+            // would clear the map entry.
+            Stat stat = zooKeeper.exists(zPath, watcher);

Review Comment:
   In ee80c70 updated to remove entries in zoocache that have not been accessed 
in 3 minutes.  This also removes their watches. I think this covers a lot of 
the issues of creating watches that hang around forever.  Looking into this I 
think the following follow on work could be done to improve the watch situation.
   
    * When something is removed from zoocache also remove the watch.  Currently 
when something is removed from zoocache explicitly its watch is just left 
there. 
    * Only set a watch on things that are accesses more than once. I suspect 
there are things in zoocache that are only accessed once and a watch is set.   
These expire changes will help with this.  We could reduce watches further in a 
follow on by only setting them on the 2nd access.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to