[GitHub] [flink] tillrohrmann commented on a change in pull request #13725: [FLINK-19557] Trigger LeaderRetrievalListener notification upon ZooKeeper reconnection in ZooKeeperLeaderRetrievalService

GitBox Thu, 22 Oct 2020 00:24:50 -0700


tillrohrmann commented on a change in pull request #13725:
URL: https://github.com/apache/flink/pull/13725#discussion_r509935113




##########
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/leaderretrieval/ZooKeeperLeaderRetrievalService.java
##########
@@ -203,6 +207,11 @@ protected void handleStateChange(ConnectionState newState) 
{
                }
        }
 
+       private void onReconnectedConnectionState() {
+               // check whether we find some new leader information in 
ZooKeeper
+               retrieveLeaderInformationFromZooKeeper();

Review comment:
       I think you are right that in some cases we will report a stale leader 
here. However, once the asynchronous fetch from the `NodeCache` completes, the 
listener should be notified about the new leader. What happens in the meantime 
is that the listener will try to connect to the old leader which should either 
be gone or reject all connection attempts since he is no longer the leader.
   
   The problem `notifyLossOfLeader` tried to solve is that a listener thinks 
that a stale leader is still the leader and, thus, continues working for it w/o 
questioning it (e.g. check with the leader) until the connection to the leader 
times out. With the `notifyLossOfLeader` change, once the retrieval service 
loses connection to ZooKeeper, it will tell the listener that the current 
leader is no longer valid. This will tell the listener to stop working for this 
leader (e.g. cancelling all tasks, disconnecting from it, etc.). If the 
listener should shortly after be told that the old leader is still the leader 
because of stale information, then it will first try to connect to the leader 
which will fail (assuming that the old leader is indeed no longer the leader) 
before it can start doing work for the leader.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

[GitHub] [flink] tillrohrmann commented on a change in pull request #13725: [FLINK-19557] Trigger LeaderRetrievalListener notification upon ZooKeeper reconnection in ZooKeeperLeaderRetrievalService

Reply via email to