zentol commented on pull request #16915:
URL: https://github.com/apache/flink/pull/16915#issuecomment-903755167


   hmm...that was actually nice hint.
   
   In `Client#handInChannel` we have this sequence:
   ```
   synchronized (connectLock) {
         establishedConnections.put(serverAddress, established);
         pendingConnections.remove(serverAddress);
   }
   ```
   
   In `Client#sendRequest` we have this (without a lock!):
   ```
   EstablishedConnection connection = establishedConnections.get(serverAddress);
   if (connection != null) {
       return connection.sendRequest(request);
   } else {
       PendingConnection pendingConnection = 
pendingConnections.get(serverAddress);
       if (pendingConnection != null) {
           // There was a race, use the existing pending connection.
           return pendingConnection.sendRequest(request);
       } else {
          <create a new connection>
       }
   }
   ```
   
   Since `sendRequest` does not use the `connectLock` there is a race condition 
where  `sendRequest` might not see an existing connection if it is moved from 
pending to established between the `get` calls. That could explain why multiple 
channels are initialized.
   
   We could think about fixing that to ensure that only 1 channel is used. Then 
it would also be fine for the handlers to not be `Shareable`, because we don't 
intend them to be shared.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to