[ https://issues.apache.org/jira/browse/FLINK-17328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17195466#comment-17195466 ]
Piotr Nowojski edited comment on FLINK-17328 at 9/14/20, 1:42 PM:
------------------------------------------------------------------
What I meant by difficult is that if you have ~100 tasks (with hundreds of
parallel subtasks each), it's really difficult to understand what's happening
with the Job without visualising the data in the shape of the job graph. With
the textual form, you are forced to look at the tasks (or subtasks, for data
skew) one by one. Grafana and other metrics visualisers do not help much with
that.
Now compare this to looking at a graph with green, yellow or red dots and with
some similar marker for the average state of the buffer pools. One quick
glance and it becomes immediately obvious:
* what is backpressured and what's not
* if there is some data skew involved and on which edges
Moreover, just for the sake of the sanity of people using Flink or answering
users' problems, it's really good to have some basic functionality built into
the system that allows them to understand what's happening.
was (Author: pnowojski):
What I meant by difficult is that if you have ~100 tasks (with hundreds of
parallel subtasks each), it's really difficult to understand what's happening
with the Job without visualising the data in the shape of the job graph. Have
you tried doing it [~chesnay]? :) With the textual form, you are forced to look
at the tasks (or subtasks, for data skew) one by one. Grafana and other metrics
visualisers do not help much with that.
Now compare this to looking at a graph with green, yellow or red dots and with
some similar marker for the average state of the buffer pools. One quick
glance and it becomes immediately obvious:
* what is backpressured and what's not
* if there is some data skew involved and on which edges
Moreover, just for the sake of the sanity of people using Flink or answering
users' problems, it's really good to have some basic functionality built into
the system that allows them to understand what's happening.
> Expose network metric for job vertex in rest api
> ------------------------------------------------
>
> Key: FLINK-17328
> URL: https://issues.apache.org/jira/browse/FLINK-17328
> Project: Flink
> Issue Type: Sub-task
> Components: Runtime / Metrics, Runtime / REST
> Reporter: lining
> Assignee: lining
> Priority: Major
> Labels: pull-request-available
>
> JobDetailsHandler
> * pool usage: outPoolUsageAvg, inputExclusiveBuffersUsageAvg,
> inputFloatingBuffersUsageAvg
> * back-pressured: shows whether the vertex is back pressured (merged across
> all of its subtasks)
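
As a rough illustration of the per-vertex merge described above, here is a minimal sketch in Java. It is not the JobDetailsHandler implementation: the class VertexNetworkSummary, the SubtaskSample type and both method names are hypothetical, and it assumes that "merged" means a vertex counts as back pressured when any of its subtasks is, and that the "Avg" values are plain arithmetic means over subtasks.

```java
import java.util.List;

// Hypothetical helper, not part of Flink: merges per-subtask values into one
// per-vertex summary, in the spirit of the fields listed in the description.
final class VertexNetworkSummary {

    /** Hypothetical per-subtask sample; the field names are illustrative only. */
    static final class SubtaskSample {
        final double outPoolUsage;   // fraction of output buffers in use, 0.0 to 1.0
        final boolean backPressured; // whether this subtask is back pressured

        SubtaskSample(double outPoolUsage, boolean backPressured) {
            this.outPoolUsage = outPoolUsage;
            this.backPressured = backPressured;
        }
    }

    /** Arithmetic mean of the output pool usage; 0.0 when there are no subtasks. */
    static double averageOutPoolUsage(List<SubtaskSample> samples) {
        return samples.stream()
                .mapToDouble(s -> s.outPoolUsage)
                .average()
                .orElse(0.0);
    }

    /** Assumed merge rule: the vertex is back pressured if any subtask is. */
    static boolean anyBackPressured(List<SubtaskSample> samples) {
        return samples.stream().anyMatch(s -> s.backPressured);
    }
}
```

A large gap between a single subtask's usage and this average would hint at the data skew mentioned in the comment, which is why the per-subtask values remain useful next to the merged ones.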
--
This message was sent by Atlassian Jira
(v8.3.4#803005)