[
https://issues.apache.org/jira/browse/FLINK-24402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17583503#comment-17583503
]
Yuan Mei commented on FLINK-24402:
----------------------------------
Currently, `waiting` means blocking waiting.
- Blocking waiting is reported as busy.
- Non-blocking waiting is also reported as busy.
Should at least the blocking case be reported as changelog back-pressure?
Or should it be reported as back-pressure instead of busy?
And if multiple tasks are waiting for the changelog writer to upload, what should be
reported?
> Add a metric for back-pressure from the ChangelogStateBackend
> -------------------------------------------------------------
>
> Key: FLINK-24402
> URL: https://issues.apache.org/jira/browse/FLINK-24402
> Project: Flink
> Issue Type: New Feature
> Components: Runtime / Checkpointing, Runtime / Metrics, Runtime /
> State Backends
> Reporter: Roman Khachatryan
> Priority: Major
>
> FLINK-23381 adds back-pressure; this task is to add monitoring for it.
> See design doc:
> https://docs.google.com/document/d/1k5WkWIYzs3n3GYQC76H9BLGxvN3wuq7qUHJuBPR9YX0/edit#heading=h.ayt6cka7z0qf
> The metric can be reported as time back-pressured by the backend per second,
> similar to how "regular" back-pressure is currently reported
> ([prototype|https://github.com/rkhachatryan/flink/tree/clsb-bp-test]).
> Metric name: stateBackendBlockedTimeMsPerSecond
> Take into account:
> * there is blocking and non-blocking waiting for changelog availability (see
> [https://github.com/apache/flink/pull/17229#discussion_r740111285])
> * UI needs to be adjusted in several places: Task label; Task details
> * Back-pressure status label should probably be adjusted
> * If changelog is disabled then the metric shouldn't be shown
> Consider whether to include changelog back-pressure into overall
> back-pressure
> (https://github.com/apache/flink/pull/17229#discussion_r738322138).
>
> Uploading metrics should be added in FLINK-23486.
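
For illustration only, the "blocked time per second" idea behind
stateBackendBlockedTimeMsPerSecond could be sketched as below. This is a
standalone sketch, not Flink's implementation: the class BlockedTimeGauge and
its methods are hypothetical names, timestamps are passed in explicitly so the
arithmetic is easy to follow, and it assumes that only blocking waits are
bracketed by the start/end calls.

```java
/**
 * Hypothetical sketch of a "blocked milliseconds per second" gauge.
 * Not Flink code; all names here are assumptions made for illustration.
 */
final class BlockedTimeGauge {
    private long blockedStartMs = -1; // -1 means "not currently blocked"
    private long accumulatedMs = 0;   // blocked time seen in the current window
    private long windowStartMs;       // start of the current reporting window
    private long lastValue = 0;       // last computed ms-per-second value

    BlockedTimeGauge(long nowMs) {
        this.windowStartMs = nowMs;
    }

    /** Called when the task starts a blocking wait on the changelog. */
    synchronized void markBlockedStart(long nowMs) {
        if (blockedStartMs < 0) {
            blockedStartMs = nowMs;
        }
    }

    /** Called when the blocking wait ends. */
    synchronized void markBlockedEnd(long nowMs) {
        if (blockedStartMs >= 0) {
            accumulatedMs += nowMs - blockedStartMs;
            blockedStartMs = -1;
        }
    }

    /** Closes the current window and returns blocked ms per second of wall time. */
    synchronized long update(long nowMs) {
        if (blockedStartMs >= 0) {
            // Still blocked: count time up to now, carry the rest into the next window.
            accumulatedMs += nowMs - blockedStartMs;
            blockedStartMs = nowMs;
        }
        long windowMs = Math.max(1, nowMs - windowStartMs); // avoid division by zero
        lastValue = accumulatedMs * 1000 / windowMs;
        accumulatedMs = 0;
        windowStartMs = nowMs;
        return lastValue;
    }

    /** Returns the value computed by the most recent update. */
    synchronized long getValue() {
        return lastValue;
    }
}
```

In this sketch a periodic reporter would call update(...) once per interval and
publish getValue(). Whether non-blocking waiting should also feed this value,
or be counted as busy, is exactly the open question raised in the comment
above; the sketch does not settle it.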
--
This message was sent by Atlassian Jira
(v8.20.10#820010)