[
https://issues.apache.org/jira/browse/FLINK-9636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16529613#comment-16529613
]
Nico Kruber edited comment on FLINK-9636 at 7/2/18 9:59 AM:
------------------------------------------------------------
Actually, {{numRequiredBuffers}} is only a local variable in this method - why
should we bother changing it?
Also, if there is an {{InterruptedException}} when polling memory segments from
the {{availableMemorySegments}} queue, it will be re-thrown and the request
will fail - {{NetworkBufferPool}} should then be restored to the state it was
in before, which it is, isn't it?
I see only one point where the accounting for {{numTotalRequiredBuffers}} can
be wrong: if an exception is thrown in the first of the
{{redistributeBuffers()}} calls. Tracing it further down, this can only happen
if {{SpillableSubpartition#releaseMemory()}} throws, e.g. due to a failure in
creating a {{spillWriter}}. I'm working on a patch...
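
To illustrate the accounting problem described above, here is a minimal sketch. It is a toy model only, not Flink's actual {{NetworkBufferPool}} code and not the patch itself. {{ToyBufferPool}}, {{reserve()}} and {{failNextRedistribute()}} are hypothetical names made up for this example; only {{numTotalRequiredBuffers}} and {{redistributeBuffers()}} mirror names from the discussion. The idea is that if the redistribution step throws after the counter was increased, the increase has to be rolled back before the exception propagates.

```java
import java.io.IOException;

/** Toy stand-in for a buffer pool (hypothetical, not Flink's NetworkBufferPool). */
class ToyBufferPool {
    private final Object lock = new Object();
    private final int totalSegments;
    private int numTotalRequiredBuffers;
    // Hypothetical switch: makes the next redistribution fail, standing in for
    // e.g. a failure while releasing memory further down the call chain.
    private boolean failNextRedistribute;

    ToyBufferPool(int totalSegments) {
        this.totalSegments = totalSegments;
    }

    void failNextRedistribute() {
        synchronized (lock) {
            failNextRedistribute = true;
        }
    }

    int getNumTotalRequiredBuffers() {
        synchronized (lock) {
            return numTotalRequiredBuffers;
        }
    }

    // Simulated redistribution; only throws when the switch above was set.
    private void redistributeBuffers() throws IOException {
        if (failNextRedistribute) {
            failNextRedistribute = false;
            throw new IOException("simulated failure during redistribution");
        }
    }

    /** Reserves buffers; undoes the counter increase if redistribution fails. */
    void reserve(int numRequiredBuffers) throws IOException {
        synchronized (lock) {
            if (numTotalRequiredBuffers + numRequiredBuffers > totalSegments) {
                throw new IOException("Insufficient number of buffers");
            }
            numTotalRequiredBuffers += numRequiredBuffers;
            try {
                redistributeBuffers();
            } catch (IOException | RuntimeException e) {
                // Without this rollback the counter would stay increased
                // although the request failed.
                numTotalRequiredBuffers -= numRequiredBuffers;
                throw e;
            }
        }
    }
}
```

With the rollback in place, a failed {{reserve()}} leaves the counter unchanged, so a later request for the full pool size is still accepted in this toy model.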
> Network buffer leaks in requesting a batch of segments during canceling
> -----------------------------------------------------------------------
>
> Key: FLINK-9636
> URL: https://issues.apache.org/jira/browse/FLINK-9636
> Project: Flink
> Issue Type: Bug
> Components: Network
> Affects Versions: 1.5.0, 1.6.0
> Reporter: zhijiang
> Priority: Major
> Fix For: 1.5.1
>
>
> In {{NetworkBufferPool#requestMemorySegments}}, {{numTotalRequiredBuffers}}
> is first increased by {{numRequiredBuffers}}.
> If an {{InterruptedException}} is thrown while polling segments from the
> available queue, the segments polled so far are recycled back to
> {{NetworkBufferPool}} and {{numTotalRequiredBuffers}} is decreased by the
> number of polled segments, which is now inconsistent with
> {{numRequiredBuffers}}. So {{numTotalRequiredBuffers}} in
> {{NetworkBufferPool}} leaks in this case, and we could also decrease
> {{numRequiredBuffers}} to fix this bug.
>