[ https://issues.apache.org/jira/browse/FLINK-3310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16759781#comment-16759781 ]
Barisa commented on FLINK-3310:
-------------------------------
Hi, is the backpressure operation expensive?
I'm asking because we are considering polling this info once a minute and
exposing it as a Prometheus metric.
This question was already asked on the user mailing list:
[http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Continuous-Monitoring-of-back-pressure-tt25869.html]
I'm currently writing some code to convert the back-pressure REST API data into
Prometheus-compatible output. I was just curious why back-pressure wasn't
already exposed as a metric in the in-built Prometheus exporter? Is it because
the thread-sampling is too intensive? Or too slow (particularly if running
multiple jobs)? In our case we're running a single job per cluster. Any
feedback would be appreciated.
Regards,
Dave
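
For illustration, the conversion described in the quoted message could look roughly like the sketch below. It assumes the back pressure REST response carries a `subtasks` list whose entries have `subtask` and `ratio` fields; the metric name `flink_backpressure_ratio` and the label names are hypothetical, and fetching the JSON over HTTP is left out. It says nothing about how expensive the underlying sampling is.

```python
def backpressure_to_prometheus(job_id, vertex_id, payload):
    """Render one back pressure REST response as Prometheus text exposition lines.

    Assumed payload shape: {"subtasks": [{"subtask": 0, "ratio": 0.25}, ...]}.
    The metric and label names are illustrative, not part of Flink.
    """
    lines = [
        "# HELP flink_backpressure_ratio Share of samples stuck in buffer requests.",
        "# TYPE flink_backpressure_ratio gauge",
    ]
    for subtask in payload.get("subtasks", []):
        labels = 'job_id="{}",vertex_id="{}",subtask="{}"'.format(
            job_id, vertex_id, subtask["subtask"]
        )
        # Doubled braces in the format string emit the literal { } around the labels.
        lines.append(
            "flink_backpressure_ratio{{{}}} {}".format(labels, subtask["ratio"])
        )
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample = {"subtasks": [{"subtask": 0, "ratio": 0.25}, {"subtask": 1, "ratio": 0.0}]}
    print(backpressure_to_prometheus("j1", "v1", sample), end="")
```

A small HTTP handler could serve the returned string for Prometheus to scrape. Polling once a minute would then mean one REST request per job vertex per minute.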
> Add back pressure statistics to web frontend
> --------------------------------------------
>
> Key: FLINK-3310
> URL: https://issues.apache.org/jira/browse/FLINK-3310
> Project: Flink
> Issue Type: Improvement
> Components: Webfrontend
> Reporter: Ufuk Celebi
> Assignee: Ufuk Celebi
> Priority: Minor
> Fix For: 1.0.0
>
>
> When a task is receiving data at a higher rate than it can process, the task
> is back pressuring preceding tasks. Currently, there is no way to tell
> whether this is the case or not. An indicator for back pressure is tasks
> being stuck in buffer requests on the network stack. This means that they
> have filled all their buffers with data, but the following tasks/network are
> not consuming them fast enough.
> A simple way to measure back pressure is to sample running tasks and report
> back pressure if they are stuck in blocking buffer request calls.
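
The sampling idea in the description can be sketched as follows. This is a minimal illustration, not Flink's implementation: each sample is assumed to be a list of stack frame strings, the frame name `requestBufferBlocking` is an assumed marker for the blocking buffer call, and the classification thresholds are illustrative defaults.

```python
def backpressure_ratio(samples, blocking_frame="requestBufferBlocking"):
    """Return the fraction of stack trace samples that contain the blocking frame.

    `samples` is a list of stack traces; each trace is a list of frame strings.
    The default frame name is an assumption made for this sketch.
    """
    if not samples:
        return 0.0
    stuck = sum(
        1 for trace in samples if any(blocking_frame in frame for frame in trace)
    )
    return stuck / len(samples)


def classify(ratio, low=0.10, high=0.50):
    """Map a ratio to a coarse level; the thresholds are illustrative."""
    if ratio <= low:
        return "OK"
    if ratio <= high:
        return "LOW"
    return "HIGH"


if __name__ == "__main__":
    traces = [
        ["LocalBufferPool.requestBufferBlocking", "RecordWriter.emit"],
        ["MyMapper.map", "StreamMap.processElement"],
        ["MyMapper.map", "StreamMap.processElement"],
        ["MyMapper.map", "StreamMap.processElement"],
    ]
    ratio = backpressure_ratio(traces)
    print(ratio, classify(ratio))
```

The more samples are taken per task, the steadier the ratio, but the more stack traces have to be collected, which is the cost the comment above asks about.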