Feifan Wang created FLINK-24384:
-----------------------------------
Summary: Count checkpoints failed in trigger phase into
numberOfFailedCheckpoints
Key: FLINK-24384
URL: https://issues.apache.org/jira/browse/FLINK-24384
Project: Flink
Issue Type: Improvement
Components: Runtime / Checkpointing
Reporter: Feifan Wang
h1. *Problem*
In the current implementation, checkpoints that fail in the trigger phase are
not counted in the metric 'numberOfFailedCheckpoints'. As a result, users
cannot tell from this metric that checkpointing has stopped.
Users can still alert with rules like _*'numberOfCompletedCheckpoints' has not
increased in the past few minutes*_ (maybe checkpoint interval + timeout),
but I think this is roundabout and cannot alert in a timely manner.
h1. *Proposal*
As the title says, count checkpoints that fail in the trigger phase in
'numberOfFailedCheckpoints'.
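A minimal sketch of the idea, not Flink's actual code: the class and method names below (CheckpointFailureStats, reportFailedTrigger, reportFailedPendingCheckpoint) are hypothetical and only illustrate both failure paths feeding the same counter. It assumes failures after the trigger phase are already counted, as the problem description implies.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for a checkpoint stats tracker; not a real Flink class. */
class CheckpointFailureStats {
    // Backs the 'numberOfFailedCheckpoints' metric in this sketch.
    private final AtomicLong numberOfFailedCheckpoints = new AtomicLong();

    /** Failure after the checkpoint was triggered (assumed to be counted today). */
    void reportFailedPendingCheckpoint() {
        numberOfFailedCheckpoints.incrementAndGet();
    }

    /** Failure during the trigger phase: the proposal is to count this as well. */
    void reportFailedTrigger() {
        numberOfFailedCheckpoints.incrementAndGet();
    }

    long getNumberOfFailedCheckpoints() {
        return numberOfFailedCheckpoints.get();
    }
}
```

With this, an alert on an increase of 'numberOfFailedCheckpoints' would also fire when checkpoints fail before they are fully triggered.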
--
This message was sent by Atlassian Jira
(v8.3.4#803005)