HuangZhenQiu commented on a change in pull request #11541:
URL: https://github.com/apache/flink/pull/11541#discussion_r442926109



##########
File path: flink-runtime/src/main/java/org/apache/flink/runtime/io/network/netty/PartitionRequestClientFactory.java
##########
@@ -131,6 +134,42 @@ void destroyPartitionRequestClient(ConnectionID connectionId, PartitionRequestCl
                clients.remove(connectionId, client);
        }
 
+       private NettyPartitionRequestClient connectChannelWithRetry(ConnectingChannel connectingChannel,
+                                                                                ConnectionID connectionId, boolean needConnect)
+               throws IOException, InterruptedException {
+               int count = 0;
+               Exception exception = null;
+               do {
+                       try {
+                               if (needConnect) {
+                                       LOG.info("Connecting to {} at {} attempt", connectionId.getAddress(), count);
+                                       nettyClient.connect(connectionId.getAddress()).addListener(connectingChannel);
+                               }
+
+                               NettyPartitionRequestClient client = connectingChannel.waitForChannel();
+                               clients.replace(connectionId, connectingChannel, client);
+                               return client;
+                       } catch (IOException | ChannelException e) {
+                               LOG.error("Failed {} times to connect to {}", count, connectionId.getAddress(), e);
+                               ConnectingChannel newConnectingChannel = new ConnectingChannel(connectionId, this);
+                               clients.replace(connectionId, connectingChannel, newConnectingChannel);

Review comment:
       Yes. That was the cause of the deadlock before synchronization was added around the connectionId change.
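
       For readers following the thread, here is a minimal, self-contained sketch of the retry-and-replace pattern the hunk above uses: on each failed attempt the placeholder stored in the map is swapped for a fresh one with a compare-and-replace, and on success it is swapped for the real client. This is not the PR's code. `RetrySketch`, `Connector`, and `Pending` are hypothetical names invented for illustration, and the locking added in the PR is not modelled here.

```java
import java.io.IOException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class RetrySketch {
    /** Hypothetical stand-in for one connect attempt; not a Flink interface. */
    interface Connector {
        String connect(int attempt) throws IOException;
    }

    /** Hypothetical placeholder for a connection that is still being established. */
    static final class Pending {}

    final ConcurrentMap<String, Object> clients = new ConcurrentHashMap<>();

    String connectWithRetry(String key, Connector connector, int maxAttempts) throws IOException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Pending pending = new Pending();
        clients.put(key, pending);
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                String client = connector.connect(attempt);
                // Publish the client only if our placeholder is still the mapped value.
                clients.replace(key, pending, client);
                return client;
            } catch (IOException e) {
                last = e;
                // Swap the failed placeholder for a fresh one in a single atomic step.
                Pending next = new Pending();
                if (clients.replace(key, pending, next)) {
                    pending = next;
                }
            }
        }
        // All attempts failed: remove our placeholder (if still present) and rethrow.
        clients.remove(key, pending);
        throw last;
    }
}
```

       `ConcurrentMap.replace(key, oldValue, newValue)` only succeeds when the current mapping equals `oldValue`, so a concurrent update by another thread is never silently overwritten. Whether that alone is enough depends on what else touches the entry, which is the point the comment above raises.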




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

