Github user NicoK commented on a diff in the pull request:
https://github.com/apache/flink/pull/6272#discussion_r201970654
--- Diff:
flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/consumer/RemoteInputChannel.java
---
@@ -360,32 +360,44 @@ public boolean notifyBufferAvailable(Buffer buffer) {
return false;
}
- boolean needMoreBuffers = false;
- synchronized (bufferQueue) {
- checkState(isWaitingForFloatingBuffers, "This channel
should be waiting for floating buffers.");
+ boolean recycleBuffer = true;
+ try {
+ boolean needMoreBuffers = false;
+ synchronized (bufferQueue) {
+ checkState(isWaitingForFloatingBuffers,
+ "This channel should be waiting for
floating buffers.");
+
+ // Important: double check the isReleased state
inside synchronized block, so there is no
+ // race condition when notifyBufferAvailable
and releaseAllResources running in parallel.
+ if (isReleased.get() ||
+ bufferQueue.getAvailableBufferSize() >=
numRequiredBuffers) {
+ isWaitingForFloatingBuffers = false;
+ buffer.recycleBuffer();
+ return false;
+ }
- // Important: double check the isReleased state inside
synchronized block, so there is no
- // race condition when notifyBufferAvailable and
releaseAllResources running in parallel.
- if (isReleased.get() ||
bufferQueue.getAvailableBufferSize() >= numRequiredBuffers) {
- isWaitingForFloatingBuffers = false;
- buffer.recycleBuffer();
- return false;
- }
+ recycleBuffer = false;
--- End diff --
I don't think we should ignore the `checkState` since a failure there in
our task is of course an implementation error but should not leak buffers for
other tasks still running and working with this buffer pool.
Regarding the `buffer.recycleBuffer()`, you are right: we kind of always
assume this cannot throw, but here we should maybe guard against recycling
twice.
---