[
https://issues.apache.org/jira/browse/FLINK-14712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17245192#comment-17245192
]
Piotr Nowojski commented on FLINK-14712:
----------------------------------------
[~mapohl], yes, those changes are definitely worth working on from a usability
perspective. Currently, power users can get those numbers in other ways, but
having them presented in an easy-to-digest way in the web UI would be quite
useful.
> Improve back-pressure reporting mechanism
> -----------------------------------------
>
> Key: FLINK-14712
> URL: https://issues.apache.org/jira/browse/FLINK-14712
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Metrics, Runtime / Network, Runtime / REST
> Reporter: lining
> Assignee: lining
> Priority: Major
> Attachments: image-2019-11-12-14-30-16-130.png
>
>
> h4. (1) The current monitor is heavy-weight.
> * Backpressure monitoring works by repeatedly taking stack trace samples
> of your running tasks.
> h4. (2) It is difficult to find out which vertex is the source of
> backpressure.
> * Users need to know the network metrics of both the current vertex and its
> upstream vertices to judge whether the current vertex is the source of
> backpressure. At present, users have to record the relevant information
> themselves.
> h3. Proposed Changes
> 1. Expose the new mechanism implemented in FLINK-14472 as an "is
> back-pressured" metric.
> 2. Show the vertex that is the source of backpressure for the job.
> 3. Expose network metrics in IOMetricsInfo:
> * SubTask
> ** pool usage: outPoolUsage, inputExclusiveBuffersUsage,
> inputFloatingBuffersUsage.
> *** Shows whether the subtask is not back-pressured itself but is causing
> backpressure (full input, empty output).
> *** By comparing exclusive/floating buffers usage, shows whether all channels
> are back-pressured or only some of them.
> ** back-pressured: shows whether the subtask is back-pressured.
> * Vertex
> ** pool usage: outPoolUsageAvg, inputExclusiveBuffersUsageAvg,
> inputFloatingBuffersUsageAvg
> ** back-pressured: shows whether the vertex is back-pressured (merged over
> all its subtasks)
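
The "full input, empty output" heuristic and the per-vertex aggregation in the
proposal above could be sketched roughly as follows. This is only an
illustration, not Flink code: the class BackPressureSketch, the SubtaskMetrics
holder, the Role enum, and the 0.9 / 0.1 thresholds are all hypothetical, and
"merged over all its subtasks" is assumed here to mean "at least one subtask is
back-pressured". Only the metric names come from the issue description.

```java
import java.util.List;
import java.util.function.ToDoubleFunction;

public class BackPressureSketch {

    /** Hypothetical per-subtask snapshot; field names mirror the metrics in the issue. */
    public static final class SubtaskMetrics {
        final boolean backPressured;
        final double outPoolUsage;
        final double inputExclusiveBuffersUsage;
        final double inputFloatingBuffersUsage;

        public SubtaskMetrics(boolean backPressured, double outPoolUsage,
                              double inputExclusiveBuffersUsage,
                              double inputFloatingBuffersUsage) {
            this.backPressured = backPressured;
            this.outPoolUsage = outPoolUsage;
            this.inputExclusiveBuffersUsage = inputExclusiveBuffersUsage;
            this.inputFloatingBuffersUsage = inputFloatingBuffersUsage;
        }
    }

    public enum Role { BACK_PRESSURED, LIKELY_SOURCE, OK }

    // Assumed thresholds for "full" and "empty"; the issue does not specify any.
    static final double HIGH = 0.9;
    static final double LOW = 0.1;

    /** Not back-pressured itself, but full input and empty output: likely the source. */
    public static Role classify(SubtaskMetrics m) {
        if (m.backPressured) {
            return Role.BACK_PRESSURED;
        }
        // Assumption: input counts as "full" if either input pool is nearly exhausted.
        boolean inputFull =
                Math.max(m.inputExclusiveBuffersUsage, m.inputFloatingBuffersUsage) >= HIGH;
        boolean outputEmpty = m.outPoolUsage <= LOW;
        return (inputFull && outputEmpty) ? Role.LIKELY_SOURCE : Role.OK;
    }

    /** Vertex-level "...Avg" value: the mean of one metric over all subtasks. */
    public static double average(List<SubtaskMetrics> subtasks,
                                 ToDoubleFunction<SubtaskMetrics> metric) {
        return subtasks.stream().mapToDouble(metric).average().orElse(0.0);
    }

    /** Assumed merge rule: the vertex is back-pressured if any subtask is. */
    public static boolean vertexBackPressured(List<SubtaskMetrics> subtasks) {
        return subtasks.stream().anyMatch(m -> m.backPressured);
    }
}
```

With these assumptions, a subtask reporting outPoolUsage = 0.0 and both input
usages at 1.0 while not being back-pressured itself would be flagged as the
likely source, which is the case users currently have to work out by hand from
the raw metrics.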