xyuanlu commented on code in PR #2805:
URL: https://github.com/apache/helix/pull/2805#discussion_r1628558328


##########
meta-client/src/main/java/org/apache/helix/metaclient/impl/zk/ZkMetaClient.java:
##########
@@ -207,16 +208,55 @@ public void set(String key, T data, int version) {
 
   @Override
   public T update(String key, DataUpdater<T> updater) {
-    org.apache.zookeeper.data.Stat stat = new org.apache.zookeeper.data.Stat();
-    // TODO: add retry logic for ZkBadVersionException.
-    try {
-      T oldData = _zkClient.readData(key, stat);
-      T newData = updater.update(oldData);
-      set(key, newData, stat.getVersion());
-      return newData;
-    } catch (ZkException e) {
-      throw translateZkExceptionToMetaclientException(e);
-    }
+    return update(key, updater, false);
+  }
+
+  @Override
+  public T update(String key, DataUpdater<T> updater, boolean retryOnFailure) {
+    final int MAX_RETRY_ATTEMPTS = 3;
+    int retryAttempts = 0;
+    boolean retry;
+    T updatedData = null;
+    do {
+      retry = false;
+      retryAttempts++;
+      try {
+        ImmutablePair<T, Stat> tup = getDataAndStat(key);
+        Stat stat = tup.right;
+        T oldData = tup.left;
+        T newData = updater.update(oldData);
+        set(key, newData, stat.getVersion());
+        updatedData = newData;
+      } catch (MetaClientBadVersionException badVersionException) {
+        // If exceeded max retry attempts, re-throw exception
+        if (retryAttempts >= MAX_RETRY_ATTEMPTS) {
+          LOG.error("Failed to update node at " + key + " after " + 
MAX_RETRY_ATTEMPTS + " attempts.");
+          throw badVersionException;
+        }
+        // Retry on bad version
+        retry = true;
+      } catch (MetaClientNoNodeException noNodeException) {
+        // If node does not exist, attempt to create it - pass null to updater
+        T newData = updater.update(null);
+        if (newData != null) {
+          try {
+            create(key, newData);

Review Comment:
   Thanks for creating the PR!
   Question about the retry logic here.
   For example, when user try to update node "/a/b/c", and there is no node 
"/a/b", set in line 228 with throw  NoNodeException, and we reach line 243. 
Since there is no node "/a/b", create will also throw NoNodeException, and we 
keep retry until expire. Should we do recursive create instead? Or we just let 
it expire and forward the NoNodeException?
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@helix.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@helix.apache.org
For additional commands, e-mail: reviews-h...@helix.apache.org

Reply via email to