[
https://issues.apache.org/jira/browse/FLINK-14472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
zhijiang updated FLINK-14472:
-----------------------------
Parent: FLINK-14551
Issue Type: Sub-task (was: Task)
> Implement back-pressure monitor with non-blocking outputs
> ---------------------------------------------------------
>
> Key: FLINK-14472
> URL: https://issues.apache.org/jira/browse/FLINK-14472
> Project: Flink
> Issue Type: Sub-task
> Components: Runtime / Network
> Reporter: zhijiang
> Assignee: Yingjie Cao
> Priority: Minor
> Fix For: 1.10.0
>
>
> Currently the back-pressure monitor relies on detecting task threads that
> are stuck in `requestBufferBuilderBlocking`. There are two cases that cause
> back-pressure at the moment:
> * There are no available buffers in `LocalBufferPool` and the whole quota
> granted by the global pool is also exhausted. Then we need to wait for
> buffers to be recycled to `LocalBufferPool`.
> * There are no available buffers in `LocalBufferPool`, but the quota has not
> been used up. The request to the global pool then blocks because the global
> pool has no available buffers either, so we need to wait for buffers to be
> recycled to the global pool.
> We are implementing the non-blocking network output in FLINK-14396, so the
> back-pressure monitor should be adjusted accordingly once the non-blocking
> output is used in practice.
> Specifically, we want to stop detecting back-pressure by analyzing the task
> thread stack, which has some drawbacks discussed before:
> * If `requestBuffer` is not triggered by the task thread, the current
> monitor does not work in practice.
> * The current monitor is heavy-weight and fragile because it needs to
> understand details of the `LocalBufferPool` implementation.
> We could provide a transparent method for the monitor caller to get the
> back-pressure result directly, and hide the implementation details in
> `LocalBufferPool`.
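
To illustrate the proposal, here is a minimal sketch of a pool that answers the back-pressure question itself instead of being inspected through thread stacks. It is not Flink's actual `LocalBufferPool`: the class `SketchBufferPool`, its fields, and the methods `tryRequestBuffer`, `recycle` and `isBackPressured` are hypothetical names chosen for this example, and the global pool is reduced to a shared counter.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical, simplified stand-in for a local buffer pool (not Flink's class). */
class SketchBufferPool {
    private final AtomicInteger globalFree; // free buffers left in the shared global pool
    private final int quota;                // max buffers this pool may take from the global pool
    private int held;                       // buffers taken from the global pool so far
    private int idle;                       // recycled buffers waiting locally for reuse

    SketchBufferPool(AtomicInteger globalFree, int quota) {
        this.globalFree = globalFree;
        this.quota = quota;
    }

    /** Non-blocking request: returns false instead of parking the calling thread. */
    synchronized boolean tryRequestBuffer() {
        if (idle > 0) {
            idle--;
            return true;
        }
        if (held < quota && tryTakeFromGlobal()) {
            held++;
            return true;
        }
        return false;
    }

    /** Returns one previously requested buffer to the local pool. */
    synchronized void recycle() {
        idle++;
    }

    /**
     * Covers both cases from the description: the quota is exhausted, or
     * quota is left but the global pool has no free buffers.
     */
    synchronized boolean isBackPressured() {
        return idle == 0 && (held >= quota || globalFree.get() == 0);
    }

    /** Takes one buffer from the global counter without blocking. */
    private boolean tryTakeFromGlobal() {
        while (true) {
            int free = globalFree.get();
            if (free == 0) {
                return false;
            }
            if (globalFree.compareAndSet(free, free - 1)) {
                return true;
            }
        }
    }
}
```

With this shape the monitor would only call `isBackPressured()` and would not need to know which thread requested the buffer or which of the two cases applies. How the real monitor samples and aggregates that result is left open here.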
--
This message was sent by Atlassian Jira
(v8.3.4#803005)