[
https://issues.apache.org/jira/browse/FLINK-11082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16735597#comment-16735597
]
zhijiang commented on FLINK-11082:
----------------------------------
Thanks for replying, [~pnowojski] :)
If I understood correctly, you described the scenario like this:
# If the backlog is 2 for one input channel, and there are 2 exclusive
available buffers currently, then this input channel would request 2 floating
buffers to maintain total 4 available credits.
# If one exclusive buffer is used for receiving network data, then the current
backlog becomes 1, and there are 3 available credits currently.
# If the above exclusive buffer is consumed by task and then recycle it, there
are total 4 available credits now, then it would trigger return one extra
floating buffer directly. No need to wait for the front another exclusive
buffer. That means we always try to maintain {{backlog+initialCredit}}
available credits, and if some exclusive buffers are recycled, the
corresponding number of floating buffers would be returned immediately as a
result. The related process is in
{{RemoteInputChannel#AvailableBufferQueue#addExclusiveBuffer}}.
So the floating buffers distribution is determined by backlog size currently,
but the backlog size does not accurately reflect the condition that how many
buffers are ready for transporting in network stack.
> Increase backlog only if it is available for consumption
> --------------------------------------------------------
>
> Key: FLINK-11082
> URL: https://issues.apache.org/jira/browse/FLINK-11082
> Project: Flink
> Issue Type: Sub-task
> Components: Network
> Affects Versions: 1.8.0
> Reporter: zhijiang
> Assignee: zhijiang
> Priority: Minor
>
> The backlog should indicate how many buffers are available in subpartition
> for downstream's consumption. The availability is considered from two
> factors. One is {{BufferConsumer}} finished, and the other is flush triggered.
> In current implementation, when the {{BufferConsumer}} is added into the
> subpartition, then the backlog is increased as a result, but this
> {{BufferConsumer}} is not yet available for network transport.
> Furthermore, the backlog would affect requesting floating buffers on
> downstream side. That means some floating buffers are fetched in advance but
> not be used for long time, so the floating buffers are not made use of
> efficiently.
> We found this scenario extremely for rebalance selector on upstream side, so
> we want to change when to increase backlog by finishing {{BufferConsumer}} or
> flush triggered.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)