dmvk commented on a change in pull request #17607:
URL: https://github.com/apache/flink/pull/17607#discussion_r755095766



##########
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/zookeeper/ZooKeeperStateHandleStore.java
##########
@@ -87,18 +88,19 @@
 
     private static final Logger LOG = 
LoggerFactory.getLogger(ZooKeeperStateHandleStore.class);
 
-    @VisibleForTesting
-    static final Set<Class<? extends KeeperException>> PRE_COMMIT_EXCEPTIONS =
-            newHashSet(
-                    KeeperException.NodeExistsException.class,
-                    KeeperException.BadArgumentsException.class,
-                    KeeperException.NoNodeException.class,
-                    KeeperException.NoAuthException.class,
-                    KeeperException.BadVersionException.class,
-                    KeeperException.AuthFailedException.class,
-                    KeeperException.InvalidACLException.class,
-                    KeeperException.SessionMovedException.class,
-                    KeeperException.NotReadOnlyException.class);
+    /** Pre-commit exceptions that don't imply data inconsistency. */
+    private static final Set<Class<? extends KeeperException>> 
SAFE_PRE_COMMIT_EXCEPTIONS =
+            new HashSet<>(
+                    Arrays.asList(
+                            KeeperException.NodeExistsException.class,
+                            KeeperException.BadArgumentsException.class,
+                            KeeperException.NoNodeException.class,
+                            KeeperException.NoAuthException.class,
+                            KeeperException.BadVersionException.class,
+                            KeeperException.AuthFailedException.class,
+                            KeeperException.InvalidACLException.class,
+                            KeeperException.SessionMovedException.class,
+                            KeeperException.NotReadOnlyException.class));

Review comment:
       I'd say in very rare conditions it could happen (if connection loss 
happens right before retries are exhausted). We default to 3 retries with 
exponential backoff.
   
   We could also argue that this could happen `AuthFailed` / 
`InvalidACLException` when there is some manual operation on the broker side. 
Eg. sys-admin changes the ACL
   
   Description of `SessionMovedException`
   ```
   SESSIONMOVED
   Session moved to another server, so operation is ignored
   ```
   
   From `RetryLoop`:
   ```
   public static boolean shouldRetry(int rc) {
       return rc == Code.CONNECTIONLOSS.intValue() || rc == 
Code.OPERATIONTIMEOUT.intValue() || rc == Code.SESSIONMOVED.intValue() || rc == 
Code.SESSIONEXPIRED.intValue() || rc == -13;
   }
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to