nicoweidner commented on a change in pull request #15675:
URL: https://github.com/apache/flink/pull/15675#discussion_r683613166



##########
File path: 
flink-runtime/src/test/java/org/apache/flink/runtime/leaderelection/ZooKeeperLeaderElectionConnectionLossTest.java
##########
@@ -0,0 +1,139 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.leaderelection;
+
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.configuration.HighAvailabilityOptions;
+import org.apache.flink.core.testutils.OneShotLatch;
+import org.apache.flink.runtime.util.ZooKeeperUtils;
+import org.apache.flink.runtime.zookeeper.ZooKeeperResource;
+import org.apache.flink.util.TestLogger;
+
+import 
org.apache.flink.shaded.curator4.org.apache.curator.framework.CuratorFramework;
+import 
org.apache.flink.shaded.curator4.org.apache.curator.framework.state.ConnectionState;
+import 
org.apache.flink.shaded.curator4.org.apache.curator.framework.state.ConnectionStateListener;
+
+import org.apache.zookeeper.KeeperException.ConnectionLossException;
+import org.junit.Rule;
+import org.junit.Test;
+
+import java.time.Duration;
+import java.util.UUID;
+import java.util.concurrent.TimeUnit;
+
+import static org.junit.Assert.assertFalse;
+import static org.junit.Assert.fail;
+
+/** Test behaviors of {@link ZooKeeperLeaderElectionDriver} on {@link 
ConnectionLossException} */
+public class ZooKeeperLeaderElectionConnectionLossTest extends TestLogger {
+
+    private static final String LATCH_PATH = "/latch";
+    private static final String LEADER_PATH = "/leader";
+
+    private static final Duration TIMEOUT = Duration.ofMillis(2000L);
+
+    @Rule public final ZooKeeperResource zooKeeperResource = new 
ZooKeeperResource();
+
+    @Test
+    public void testKeepLeadershipOnConnectionLoss() throws Exception {

Review comment:
       Following @mxm 's naming suggestion, maybe 
`testKeepLeadershipOnConnectionSuspended` would be more precise. Similar for 
`connectionLossLatch` -> `connectionSuspendedLatch` below

##########
File path: 
flink-runtime/src/test/java/org/apache/flink/runtime/leaderelection/ZooKeeperLeaderElectionConnectionLossTest.java
##########
@@ -0,0 +1,139 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.leaderelection;
+
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.configuration.HighAvailabilityOptions;
+import org.apache.flink.core.testutils.OneShotLatch;
+import org.apache.flink.runtime.util.ZooKeeperUtils;
+import org.apache.flink.runtime.zookeeper.ZooKeeperResource;
+import org.apache.flink.util.TestLogger;
+
+import 
org.apache.flink.shaded.curator4.org.apache.curator.framework.CuratorFramework;
+import 
org.apache.flink.shaded.curator4.org.apache.curator.framework.state.ConnectionState;
+import 
org.apache.flink.shaded.curator4.org.apache.curator.framework.state.ConnectionStateListener;
+
+import org.apache.zookeeper.KeeperException.ConnectionLossException;
+import org.junit.Rule;
+import org.junit.Test;
+
+import java.time.Duration;
+import java.util.UUID;
+import java.util.concurrent.TimeUnit;
+
+import static org.junit.Assert.assertFalse;
+import static org.junit.Assert.fail;
+
+/** Test behaviors of {@link ZooKeeperLeaderElectionDriver} on {@link 
ConnectionLossException} */
+public class ZooKeeperLeaderElectionConnectionLossTest extends TestLogger {

Review comment:
       I think it would be nice to add a test verifying behavior on `LOST` 
state as well. We probably want to configure CuratorFramework with a very low 
session timeout for that.
   
   On that note, there are existing connection handling tests in 
`ZooKeeperLeaderElectionConnectionHandlingTest` that seem to overlap. 
`testConnectionSuspendedHandling` in particular could become the new 
`testConnectionLostHandling`.
   I think it would be good to put all such tests in 
`ZooKeeperLeaderElectionConnectionHandlingTest`.

##########
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/util/ZooKeeperUtils.java
##########
@@ -150,6 +151,7 @@ public static CuratorFramework 
startCuratorFramework(Configuration configuration
                         .sessionTimeoutMs(sessionTimeout)
                         .connectionTimeoutMs(connectionTimeout)
                         .retryPolicy(new ExponentialBackoffRetry(retryWait, 
maxRetryAttempts))
+                        .connectionStateErrorPolicy(new 
SessionConnectionStateErrorPolicy())

Review comment:
       Corresponding links:
   https://curator.apache.org/errors.html
   
http://curator.apache.org/apidocs/org/apache/curator/framework/state/SessionConnectionStateErrorPolicy.html
   
   As mentioned in comments on the PR, we'd want to guard this behind a config 
setting `tolerate-suspended-connections` which defaults to false




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Reply via email to