zentol commented on pull request #16915:
URL: https://github.com/apache/flink/pull/16915#issuecomment-903755167
hmm...that was actually nice hint.
In `Client#handInChannel` we have this sequence:
```
synchronized (connectLock) {
establishedConnections.put(serverAddress, established);
pendingConnections.remove(serverAddress);
}
```
In `Client#sendRequest` we have this (without a lock!):
```
EstablishedConnection connection = establishedConnections.get(serverAddress);
if (connection != null) {
return connection.sendRequest(request);
} else {
PendingConnection pendingConnection =
pendingConnections.get(serverAddress);
if (pendingConnection != null) {
// There was a race, use the existing pending connection.
return pendingConnection.sendRequest(request);
} else {
<create a new connection>
}
}
```
Since `sendRequest` does not use the `connectLock` there is a race condition
where `sendRequest` might not see an existing connection if it is moved from
pending to established between the `get` calls. That could explain why multiple
channels are initialized.
We could think about fixing that to ensure that only 1 channel is used. Then
it would also be fine for the handlers to not be `Shareable`, because we don't
intend them to be shared.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]