nicoweidner commented on pull request #15675: URL: https://github.com/apache/flink/pull/15675#issuecomment-891811044
@tisonkun Thanks for looking into this and creating the PR for this long-standing issue, and sorry again that it was sidelined for so long! It would be great to get this merged before the feature freeze for 1.14. I am available to review and to pester @tillrohrmann as much as required to get it done quickly. Would you be willing to reopen?

Some points I noticed on a first pass:

- The current behavior should remain the default, meaning that we add a config option like `tolerate-connection-loss` which defaults to `false` (similar to an earlier version of the PR, but with a different default).
- As mentioned [on the ticket](https://issues.apache.org/jira/browse/FLINK-10052?focusedCommentId=17349943&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17349943), we probably have to remove [this line](https://github.com/apache/flink/blob/6fa6049bc6c022b776163f4e4768d84ac00ab8e6/flink-runtime/src/main/java/org/apache/flink/runtime/leaderretrieval/ZooKeeperLeaderRetrievalDriver.java#L159), so that `ZooKeeperLeaderRetrievalDriver` does not reset the leader address on `SUSPENDED`. I think we are currently also missing tests for that class.
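
To make the two points above concrete, here is a rough sketch of the behavior I have in mind. It is not the actual `ZooKeeperLeaderRetrievalDriver` code: `ConnState`, `LeaderRetrievalSketch`, and the example address are hypothetical stand-ins, and clearing the address on `LOST` regardless of the flag is my assumption. The only point is that `SUSPENDED` clears the known leader address unless the option is enabled, and that the option defaults to the current behavior.

```java
import java.util.Optional;

/** Hypothetical, simplified stand-in for the ZooKeeper client's connection states. */
enum ConnState { CONNECTED, SUSPENDED, RECONNECTED, LOST }

/** Illustrative sketch only; names and structure are assumptions, not Flink's API. */
final class LeaderRetrievalSketch {
    // Mirrors the proposed `tolerate-connection-loss` option (false = current behavior).
    private final boolean tolerateConnectionLoss;
    // Last known leader address; null means "no known leader".
    private String leaderAddress;

    LeaderRetrievalSketch(boolean tolerateConnectionLoss, String initialLeaderAddress) {
        this.tolerateConnectionLoss = tolerateConnectionLoss;
        this.leaderAddress = initialLeaderAddress;
    }

    void onConnectionStateChanged(ConnState newState) {
        switch (newState) {
            case SUSPENDED:
                // Current behavior: forget the leader as soon as the connection is suspended.
                // With the option enabled, keep the last known address instead.
                if (!tolerateConnectionLoss) {
                    leaderAddress = null;
                }
                break;
            case LOST:
                // Assumption: a lost session always invalidates the known leader.
                leaderAddress = null;
                break;
            default:
                // CONNECTED / RECONNECTED: keep the address in this sketch;
                // a real driver would re-read the leader information here.
                break;
        }
    }

    Optional<String> getLeaderAddress() {
        return Optional.ofNullable(leaderAddress);
    }
}
```

A test for the driver could follow the same shape: feed it a `SUSPENDED` event and assert on the leader information it reports, once with the option enabled and once with the default.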
