[
https://issues.apache.org/jira/browse/FLINK-12576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16936784#comment-16936784
]
David Anderson commented on FLINK-12576:
----------------------------------------
I just did some more careful testing, this time with
taskmanager.network.memory.buffers-per-channel: 1
taskmanager.network.memory.floating-buffers-per-gate: 1
which I think is as low as the buffering can go.
Here are the various input and output metrics, running on Flink 1.9 with 2
single-slot TMs:
!Screen Shot 2019-09-24 at 3.22.53 PM.png!
!Screen Shot 2019-09-24 at 3.22.36 PM.png!
Running on Flink 1.9 with a single two-slot TM looks like this:
!Screen Shot 2019-09-24 at 3.13.05 PM.png!
!Screen Shot 2019-09-24 at 3.11.15 PM.png!
I'll see if I can repeat the case you asked about on Flink 1.8.
> inputQueueLength metric does not work for LocalInputChannels
> ------------------------------------------------------------
>
> Key: FLINK-12576
> URL: https://issues.apache.org/jira/browse/FLINK-12576
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Metrics, Runtime / Network
> Affects Versions: 1.6.4, 1.7.2, 1.8.0, 1.9.0
> Reporter: Piotr Nowojski
> Assignee: Aitozi
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.9.0
>
> Attachments: Screen Shot 2019-09-24 at 3.11.15 PM.png, Screen Shot
> 2019-09-24 at 3.13.05 PM.png, Screen Shot 2019-09-24 at 3.22.36 PM.png,
> Screen Shot 2019-09-24 at 3.22.53 PM.png
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> Currently {{inputQueueLength}} ignores LocalInputChannels
> ({{SingleInputGate#getNumberOfQueuedBuffers}}). This can cause mistakes
> when looking for causes of back pressure (for example, if a task is
> back-pressuring the whole Flink job, but there is data skew and only local
> input channels are being used).
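
To illustrate the effect described in the issue summary, here is a minimal, self-contained sketch. It is a toy model, not Flink's code: the class QueueLengthSketch, its Channel type, and both methods are hypothetical names invented for this example, and the only assumption carried over from the description is that the reported queue length skips local channels. It shows how a gate whose input arrives only over local channels would report an empty queue even though buffers are waiting.

```java
import java.util.Arrays;
import java.util.List;

public class QueueLengthSketch {

    /** Hypothetical stand-in for an input channel; not Flink's actual class. */
    public static final class Channel {
        public final boolean local;      // true if producer and consumer share a TaskManager
        public final int queuedBuffers;  // buffers currently waiting on this channel

        public Channel(boolean local, int queuedBuffers) {
            this.local = local;
            this.queuedBuffers = queuedBuffers;
        }
    }

    /** Models the behaviour in the description: local channels are skipped. */
    public static int remoteOnlyQueueLength(List<Channel> channels) {
        int total = 0;
        for (Channel c : channels) {
            if (!c.local) {
                total += c.queuedBuffers;
            }
        }
        return total;
    }

    /** Models a metric that counts every channel, local or remote. */
    public static int totalQueueLength(List<Channel> channels) {
        int total = 0;
        for (Channel c : channels) {
            total += c.queuedBuffers;
        }
        return total;
    }

    public static void main(String[] args) {
        // Skewed case: all queued data sits on local channels.
        List<Channel> channels = Arrays.asList(
                new Channel(true, 5),
                new Channel(true, 7),
                new Channel(false, 0));

        System.out.println("remote-only: " + remoteOnlyQueueLength(channels));
        System.out.println("all channels: " + totalQueueLength(channels));
    }
}
```

In this model, the remote-only count makes a back-pressured task look idle, which matches the diagnostic mistake the description warns about.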