maxmzkr commented on a change in pull request #13725:
URL: https://github.com/apache/flink/pull/13725#discussion_r509370138



##########
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/leaderretrieval/ZooKeeperLeaderRetrievalService.java
##########
@@ -203,6 +207,11 @@ protected void handleStateChange(ConnectionState newState) 
{
                }
        }
 
+       private void onReconnectedConnectionState() {
+               // check whether we find some new leader information in 
ZooKeeper
+               retrieveLeaderInformationFromZooKeeper();

Review comment:
       When this is retrieved, I believe it will be equal to the last value 
that the node cache had before disconnect except when the NodeCache has had 
time to update the value. On reconnect, the NodeCache sets up an asynchronous 
fetch of the value so the NodeCache will usually not have had time to update. 
If the leader changes when the zookeeper seession is lost (which will always be 
the case if the leader was the one that lost connection), the value that is 
retrieved here will likely be wrong. It seems like notifyLossOfLeader was 
implemented to prevent Flink nodes from having the wrong or stale leader. Is it 
less of a big deal for the Flink node to have the wrong leader in this case 
than it is in the case that notifyLossOfLeader is trying to solve?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to