Liu created FLINK-38961:
---------------------------
Summary: Display process metrics (CPU, Memory, I/O) on TaskManager
Web UI
Key: FLINK-38961
URL: https://issues.apache.org/jira/browse/FLINK-38961
Project: Flink
Issue Type: Improvement
Components: Runtime / Web Frontend
Reporter: Liu
### Summary
Add a new "Process Usage" panel on the TaskManager Metrics page of the Flink
Web UI to display process-level metrics, including CPU usage, memory (RSS), and
I/O statistics.
### Motivation
Currently, the TaskManager Metrics page in the Flink Web UI displays only JVM
and Flink-managed memory metrics. Users, however, often need process-level
resource consumption to understand how many resources a TaskManager process
actually uses.
When `metrics.system-resource` is enabled, Flink collects process-level metrics
such as:
- `Process.CPU.Usage` - CPU usage percentage of the process
- `Process.Memory.RSS` - Resident Set Size (physical memory used by the process)
- `Process.IO.Read` / `Process.IO.Write` - I/O read and write bytes
These metrics are already available through the REST API but are not displayed
in the Web UI, making it inconvenient for users to monitor them.
### Proposed Changes
1. **Add a "Process Usage" card** on the TaskManager Metrics page
   (`task-manager-metrics.component.html`) displaying:
   - **CPU**: Process CPU usage percentage
   - **Memory**: Process RSS (Resident Set Size)
   - **I/O**: Combined read and write I/O bytes
2. **Extend the metrics query** in `task-manager-metrics.component.ts` to
   include:
   - `Process.CPU.Usage`
   - `Process.Memory.RSS`
   - `Process.IO.Read`
   - `Process.IO.Write`
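As a rough illustration of the extended query, the sketch below builds the REST path for the four metrics and turns the response into a lookup object. It assumes the existing `GET /taskmanagers/:taskmanagerid/metrics?get=...` endpoint, which returns an array of `{ id, value }` entries. The helper names `buildProcessMetricsPath` and `toProcessMetrics` are hypothetical and are not taken from `task-manager-metrics.component.ts`; the real component may wire this through its own service.

```typescript
// Metric names exactly as listed in this proposal.
const PROCESS_METRICS = [
  'Process.CPU.Usage',
  'Process.Memory.RSS',
  'Process.IO.Read',
  'Process.IO.Write'
] as const;

type ProcessMetricName = (typeof PROCESS_METRICS)[number];

// One entry of the REST metrics response: [{ "id": "...", "value": "..." }].
interface MetricEntry {
  id: string;
  value: string;
}

// Hypothetical helper: builds the metrics query path for one TaskManager.
function buildProcessMetricsPath(taskManagerId: string): string {
  const id = encodeURIComponent(taskManagerId);
  return `/taskmanagers/${id}/metrics?get=${PROCESS_METRICS.join(',')}`;
}

// Hypothetical helper: maps the response to numbers, using null for any
// metric that is missing or not numeric (for example when
// metrics.system-resource is disabled).
function toProcessMetrics(entries: MetricEntry[]): Record<ProcessMetricName, number | null> {
  const result = {} as Record<ProcessMetricName, number | null>;
  for (const name of PROCESS_METRICS) {
    const entry = entries.find(e => e.id === name);
    const parsed =
      entry === undefined || entry.value.trim() === '' ? NaN : Number(entry.value);
    result[name] = Number.isFinite(parsed) ? parsed : null;
  }
  return result;
}
```

Mapping absent metrics to `null` rather than `0` is a design choice of this sketch: it lets the UI distinguish "not collected" from a genuine zero reading.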
### Prerequisites
Users need to enable system resource metrics by setting
`metrics.system-resource: true` in the Flink configuration (it is disabled by
default). If this option is not enabled, the process metrics are not reported
and the panel shows empty or zero values.
### UI Mockup
The new "Process Usage" panel will be placed at the top of the TaskManager
Metrics page, showing three columns:
- CPU (percentage, shown to 6 decimal places)
- Memory (humanized bytes format)
- I/O (sum of read and write bytes, humanized)
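The three columns could be formatted along the following lines. This is only a sketch: `formatCpu`, `totalIo`, and `humanizeBytes` are hypothetical helpers written for illustration, not the Web UI's existing pipes. It also assumes that `Process.CPU.Usage` is already reported as a percentage, that byte units are 1024-based, and that a missing value is rendered as `-`.

```typescript
const UNITS = ['B', 'KB', 'MB', 'GB', 'TB', 'PB'];

// Hypothetical helper: renders a byte count with 1024-based units.
function humanizeBytes(bytes: number | null): string {
  if (bytes === null || !Number.isFinite(bytes) || bytes < 0) {
    return '-';
  }
  let value = bytes;
  let unit = 0;
  while (value >= 1024 && unit < UNITS.length - 1) {
    value /= 1024;
    unit++;
  }
  // Whole bytes need no decimals; larger units get two.
  const text = unit === 0 ? value.toFixed(0) : value.toFixed(2);
  return `${text} ${UNITS[unit]}`;
}

// Hypothetical helper: CPU percentage with 6 decimal places, as in the mockup.
function formatCpu(usage: number | null): string {
  if (usage === null || !Number.isFinite(usage)) {
    return '-';
  }
  return `${usage.toFixed(6)} %`;
}

// Hypothetical helper: sums read and write bytes. Returns null only when
// both values are missing, so a single missing side counts as 0.
function totalIo(read: number | null, write: number | null): number | null {
  if (read === null && write === null) {
    return null;
  }
  return (read === null ? 0 : read) + (write === null ? 0 : write);
}
```

For example, `humanizeBytes(totalIo(1024, 1024))` yields `2.00 KB`, and `formatCpu(12.5)` yields `12.500000 %`.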
### Related Documentation
- [System Resource Metrics](https://nightlies.apache.org/flink/flink-docs-stable/docs/ops/metrics/#system-resources)