zhijiangW commented on a change in pull request #9132: [FLINK-13245][network]
Fix the bug of file resource leak while canceling partition request
URL: https://github.com/apache/flink/pull/9132#discussion_r307666311
##########
File path:
flink-runtime/src/main/java/org/apache/flink/runtime/io/network/netty/PartitionRequestQueue.java
##########
@@ -181,19 +179,18 @@ public void userEventTriggered(ChannelHandlerContext ctx, Object msg) throws Exc
                 return;
             }
-            // Cancel the request for the input channel
+            // remove reader from queue of available readers
             int size = availableReaders.size();
             for (int i = 0; i < size; i++) {
                 NetworkSequenceViewReader reader = pollAvailableReader();
-                if (reader.getReceiverId().equals(toCancel)) {
-                    reader.releaseAllResources();
-                    markAsReleased(reader.getReceiverId());
-                } else {
+                if (reader != null && !reader.getReceiverId().equals(toCancel)) {
                     registerAvailableReader(reader);
                 }
             }
-            allReaders.remove(toCancel);
+            // remove reader from queue of all readers and release its resource
+            NetworkSequenceViewReader toRelease = allReaders.remove(toCancel);
Review comment:
`toRelease` can be null if the reader was already released on the server side
(for example, after a network exception) before the cancel message from the
client arrives. I think it is appropriate to ignore non-existing readers here.
It also makes sense to remove `PartitionRequestQueue#released`.
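
To illustrate the suggestion, here is a minimal sketch of a cancel path that tolerates a reader that is already gone. `SketchReader` and `CancelSketch` are hypothetical stand-ins, not Flink's `NetworkSequenceViewReader` or `PartitionRequestQueue`, and `removeIf` is used in place of the poll/re-register loop from the diff purely for brevity.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Hypothetical stand-in for NetworkSequenceViewReader; not Flink's real interface.
class SketchReader {
    final String receiverId;
    boolean released;

    SketchReader(String receiverId) {
        this.receiverId = receiverId;
    }

    void releaseAllResources() {
        released = true;
    }
}

// Hypothetical, simplified stand-in for PartitionRequestQueue.
class CancelSketch {
    final Queue<SketchReader> availableReaders = new ArrayDeque<>();
    final Map<String, SketchReader> allReaders = new HashMap<>();

    void register(SketchReader reader, boolean available) {
        allReaders.put(reader.receiverId, reader);
        if (available) {
            availableReaders.add(reader);
        }
    }

    /** Returns true if a reader was found and released, false if it was already gone. */
    boolean cancel(String toCancel) {
        // Remove the reader from the queue of available readers.
        availableReaders.removeIf(r -> r.receiverId.equals(toCancel));

        // Remove the reader from all readers; null means it was released earlier.
        SketchReader toRelease = allReaders.remove(toCancel);
        if (toRelease == null) {
            return false; // ignore non-existing readers
        }
        toRelease.releaseAllResources();
        return true;
    }

    public static void main(String[] args) {
        CancelSketch queue = new CancelSketch();
        SketchReader a = new SketchReader("a");
        queue.register(a, true);
        System.out.println(queue.cancel("a")); // first cancel releases the reader
        System.out.println(queue.cancel("a")); // second cancel finds nothing and is ignored
    }
}
```

With a null check like this, a repeated or late cancel becomes a no-op, so a separate set of released IDs is not needed just to guard this path.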