Greg Hogan created FLINK-3160:
---------------------------------
Summary: Aggregate operator statistics by TaskManager
Key: FLINK-3160
URL: https://issues.apache.org/jira/browse/FLINK-3160
Project: Flink
Issue Type: Improvement
Components: Webfrontend
Affects Versions: 1.0.0
Reporter: Greg Hogan
The web client's job info page presents a table of the following per-task
statistics: start time, end time, duration, bytes received, records received,
bytes sent, records sent, attempt, host, status.
Flink supports clusters with thousands of slots, and a job with a high
parallelism makes this job info page unwieldy and difficult to analyze in
real time.
It would be helpful to aggregate statistics by TaskManager, either optionally
or automatically. Each aggregated row could then be expanded to reveal the
current per-task statistics.
Start time, end time, duration, and attempt do not apply to a TaskManager,
since new tasks may be started for repeated attempts. Bytes received, records
received, bytes sent, and records sent would be summed. Any throughput metrics
could be averaged over the total task time or over a time window. Status could
show the number of running tasks on the TaskManager, or an idle state.
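
The aggregation rules above could be sketched roughly as follows. This is only
an illustration in Python, not Flink code: the names TaskStats,
TaskManagerStats, and aggregate_by_task_manager are hypothetical, and it
assumes the host column is enough to identify a TaskManager, which may not
hold if several TaskManagers run on one host.

```python
from dataclasses import dataclass, field


@dataclass
class TaskStats:
    # Hypothetical per-task row; only the columns that aggregate are kept.
    host: str
    status: str
    bytes_received: int
    records_received: int
    bytes_sent: int
    records_sent: int


@dataclass
class TaskManagerStats:
    # Hypothetical aggregated row, one per TaskManager.
    host: str
    bytes_received: int = 0
    records_received: int = 0
    bytes_sent: int = 0
    records_sent: int = 0
    running_tasks: int = 0
    # Per-task rows are retained so the UI can expand the aggregated row.
    tasks: list = field(default_factory=list)

    @property
    def status(self) -> str:
        # Status reports the running task count, or an idle state.
        if self.running_tasks:
            return f"{self.running_tasks} running"
        return "IDLE"


def aggregate_by_task_manager(tasks):
    """Sum the byte and record counters of all tasks sharing a host."""
    rows = {}
    for task in tasks:
        # Assumption: host uniquely identifies a TaskManager.
        row = rows.setdefault(task.host, TaskManagerStats(host=task.host))
        row.bytes_received += task.bytes_received
        row.records_received += task.records_received
        row.bytes_sent += task.bytes_sent
        row.records_sent += task.records_sent
        if task.status == "RUNNING":
            row.running_tasks += 1
        row.tasks.append(task)
    return rows
```

Start time, end time, duration, and attempt are deliberately left out of the
aggregated row and remain visible only in the expanded per-task rows.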