[
https://issues.apache.org/jira/browse/HUDI-3373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Vinish Reddy updated HUDI-3373:
-------------------------------
Description:
Prometheus [Pushgateway|https://prometheus.io/docs/practices/pushing/] never
forgets series pushed to it and will expose them to Prometheus forever unless
those series are manually deleted via the Pushgateway's API. When deltastreamer
finds no new records to read from the source since the last checkpoint, the
write path is not triggered and no metric is emitted. With the PushGateway
metrics reporter, this leads to a graph like the one below (a straight line at
the last emitted value). With other metrics reporters such as DataDog, the
value would be null and no data would be shown in the graph.
{{22/02/07 11:23:10 INFO DeltaSync: No new data, source checkpoint has not
changed. Nothing to commit. Old checkpoint=(Option\{val=20220207062847034}).
New Checkpoint=(20220207062847034)}}
{{!https://user-images.githubusercontent.com/16958856/152831831-7c4baab1-8b01-4100-8b64-40ec702e9749.png!}}
To better represent the actual metrics with Prometheus PushGateway, the
counter metrics should be updated with a zero value when there is no new data,
rather than continuing to show the stale values from the last checkpoint.
!https://user-images.githubusercontent.com/16958856/152831933-7238666b-b6b4-4ac6-9962-399b6a4cf81b.png!
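The zero-value idea described above can be sketched as follows. This is a minimal illustration in Python, not Hudi's implementation: the counter names, the {{hudi_sketch}} prefix, and both helper functions are hypothetical. It only shows that a round with no commit yields explicit zeros instead of leaving the previously pushed values in place.

```python
from typing import Dict, Optional

# Hypothetical counter names, chosen for illustration only.
COUNTER_NAMES = ("records_written", "bytes_written", "commits")


def metrics_for_round(commit_stats: Optional[Dict[str, int]]) -> Dict[str, int]:
    """Return the counter values to push for one sync round.

    commit_stats is None when the source checkpoint has not changed and
    nothing was committed. In that case every counter is reported as 0,
    so the Pushgateway does not keep exposing the last round's values.
    """
    if commit_stats is None:
        return {name: 0 for name in COUNTER_NAMES}
    # Counters missing from the stats of a real commit default to 0.
    return {name: int(commit_stats.get(name, 0)) for name in COUNTER_NAMES}


def to_exposition_text(metrics: Dict[str, int], prefix: str = "hudi_sketch") -> str:
    """Render the values as lines of the Prometheus text exposition format."""
    lines = [f"{prefix}_{name} {value}" for name, value in sorted(metrics.items())]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # A round with no new data: all counters are pushed as zero.
    print(to_exposition_text(metrics_for_round(None)), end="")
```

The rendered text is the kind of body the Pushgateway accepts via an HTTP PUT or POST to its {{/metrics/job/<job name>}} endpoint. The sketch deliberately stops short of sending it.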
was:
Prometheus [Pushgateway|https://prometheus.io/docs/practices/pushing/] never
forgets series pushed to it and will expose them to Prometheus forever unless
those series are manually deleted via the Pushgateway's API. When deltastreamer
finds no new records to read from the source since the last checkpoint, the
write path is not triggered and no metric is emitted. With the PushGateway
metrics reporter, this leads to a graph like the one below. With other metrics
reporters, the value would be null and no data would be shown in the graph.
{{22/02/07 11:23:10 INFO DeltaSync: No new data, source checkpoint has not
changed. Nothing to commit. Old checkpoint=(Option\{val=20220207062847034}).
New Checkpoint=(20220207062847034)}}
{{!https://user-images.githubusercontent.com/16958856/152831831-7c4baab1-8b01-4100-8b64-40ec702e9749.png!}}
To better represent the actual metrics with Prometheus PushGateway, the
counter metrics should be updated with a zero value when there is no new data,
rather than continuing to show the stale values from the last checkpoint.
!https://user-images.githubusercontent.com/16958856/152831933-7238666b-b6b4-4ac6-9962-399b6a4cf81b.png!
> Add zero value counter metrics because of Prometheus PushGateway stale
> metrics problem
> --------------------------------------------------------------------------------------
>
> Key: HUDI-3373
> URL: https://issues.apache.org/jira/browse/HUDI-3373
> Project: Apache Hudi
> Issue Type: Task
> Components: metrics
> Reporter: Vinish Reddy
> Priority: Minor
>
> Prometheus [Pushgateway|https://prometheus.io/docs/practices/pushing/] never
> forgets series pushed to it and will expose them to Prometheus forever unless
> those series are manually deleted via the Pushgateway's API. When
> deltastreamer finds no new records to read from the source since the last
> checkpoint, the write path is not triggered and no metric is emitted. With
> the PushGateway metrics reporter, this leads to a graph like the one below (a
> straight line at the last emitted value). With other metrics reporters such
> as DataDog, the value would be null and no data would be shown in the graph.
> {{22/02/07 11:23:10 INFO DeltaSync: No new data, source checkpoint has not
> changed. Nothing to commit. Old checkpoint=(Option\{val=20220207062847034}).
> New Checkpoint=(20220207062847034)}}
> {{!https://user-images.githubusercontent.com/16958856/152831831-7c4baab1-8b01-4100-8b64-40ec702e9749.png!}}
>
> To better represent the actual metrics with Prometheus PushGateway, the
> counter metrics should be updated with a zero value when there is no new
> data, rather than continuing to show the stale values from the last
> checkpoint.
> !https://user-images.githubusercontent.com/16958856/152831933-7238666b-b6b4-4ac6-9962-399b6a4cf81b.png!
>