[
https://issues.apache.org/jira/browse/FLINK-22631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17441125#comment-17441125
]
Li commented on FLINK-22631:
----------------------------
Hi [~gaoyunhaii],
We run a data center where the scheduling system launches a large number of
Flink tasks at 0 AM every day. A monitoring system then tracks the status of
these tasks and, once a task has finished, collects its metrics. These metrics
have two main uses:
1) Displaying the overall state of the current tasks in the data center on a
big screen.
2) Letting the monitoring system judge whether each task actually completed
correctly. As a simple example, for a data synchronization task it compares
the number of records read with the number of records written; if the two
differ, dirty data was generated and an alarm is triggered.
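The read/write comparison in 2) can be sketched roughly as below. This is only
an illustration in plain Java, not our actual monitoring code: the class
SyncJobMetrics, its fields, and the job name in main are hypothetical names
made up for the example.

```java
/** Hypothetical holder for the final record counters of one finished sync task. */
final class SyncJobMetrics {
    final String jobName;
    final long recordsRead;
    final long recordsWritten;

    SyncJobMetrics(String jobName, long recordsRead, long recordsWritten) {
        this.jobName = jobName;
        this.recordsRead = recordsRead;
        this.recordsWritten = recordsWritten;
    }

    /** Records read but not written; negative if more were written than read. */
    long missingRecords() {
        return recordsRead - recordsWritten;
    }

    /** The task is considered dirty when the two counters disagree. */
    boolean hasDirtyData() {
        return recordsRead != recordsWritten;
    }

    public static void main(String[] args) {
        // Example values only; real values would come from the reported metrics.
        SyncJobMetrics m = new SyncJobMetrics("orders-sync", 1000, 998);
        if (m.hasDirtyData()) {
            System.out.println("ALERT " + m.jobName + ": "
                    + m.missingRecords() + " records read but not written");
        }
    }
}
```

A check like this is only meaningful if the last reported values are the final
ones, which is why the missing final report matters to us.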
For failovers, we handle it like this:
We have customized metrics and customized source and sink connectors. During
checkpointing, we save the custom metrics into the checkpoint. This way, the
correct metric values can be restored from the checkpoint after a failover.
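The idea of carrying counters through a checkpoint can be sketched as below.
This is a simplified, Flink-free illustration, not our connector code: the
class CheckpointedCounters and the methods add, get, snapshot, and restore are
hypothetical. In a real job the snapshot and restore steps would presumably be
wired into the operator's checkpoint hooks (for example the snapshotState and
initializeState methods of CheckpointedFunction).

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical set of named counters that can be saved and restored. */
final class CheckpointedCounters {
    private final Map<String, Long> counters = new HashMap<>();

    /** Adds delta to the named counter, creating it if needed. */
    void add(String name, long delta) {
        counters.merge(name, delta, Long::sum);
    }

    /** Returns the current value, or 0 if the counter was never touched. */
    long get(String name) {
        return counters.getOrDefault(name, 0L);
    }

    /** Called at checkpoint time: returns an independent copy of the counters. */
    Map<String, Long> snapshot() {
        return new HashMap<>(counters);
    }

    /** Called after a failover: replaces the counters with the saved copy. */
    void restore(Map<String, Long> saved) {
        counters.clear();
        counters.putAll(saved);
    }

    public static void main(String[] args) {
        CheckpointedCounters c = new CheckpointedCounters();
        c.add("recordsRead", 100);
        Map<String, Long> checkpoint = c.snapshot(); // checkpoint completes here

        c.add("recordsRead", 40);  // progress made after the checkpoint
        c.restore(checkpoint);     // failover: roll back to the checkpoint
        c.add("recordsRead", 40);  // the source replays the same 40 records

        System.out.println("recordsRead = " + c.get("recordsRead"));
    }
}
```

Because the counters roll back together with the source position, replayed
records are not counted twice.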
> Metrics are incorrect when task finished
> ----------------------------------------
>
> Key: FLINK-22631
> URL: https://issues.apache.org/jira/browse/FLINK-22631
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Metrics
> Reporter: Li
> Priority: Minor
> Labels: auto-deprioritized-major
> Attachments: image-2021-05-11-20-13-25-886.png,
> image-2021-05-11-20-14-29-290.png, image-2021-05-19-10-10-29-765.png,
> image-2021-05-19-10-11-02-764.png
>
>
> MetricReporters report periodically, every 10 seconds by default. The
> final metrics may not be reported to a metrics system such as Pushgateway
> when a task finishes. This leaves users unable to obtain the correct metrics.
>
> Maybe MetricReporters should report once manually before being closed.