[ 
https://issues.apache.org/jira/browse/AIRFLOW-3177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fokko Driesprong updated AIRFLOW-3177:
--------------------------------------
    Fix Version/s: 1.10.1

> Change scheduler_heartbeat metric from gauge to counter
> -------------------------------------------------------
>
>                 Key: AIRFLOW-3177
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-3177
>             Project: Apache Airflow
>          Issue Type: Improvement
>          Components: scheduler
>    Affects Versions: 2.0.0
>            Reporter: Greg Neiheisel
>            Assignee: Greg Neiheisel
>            Priority: Minor
>             Fix For: 1.10.1
>
>
> Currently, the scheduler_heartbeat metric exposed with the statsd integration 
> is a gauge. I'm proposing to change the gauge to a counter for a better 
> integration with Prometheus via the 
> [statsd_exporter|https://github.com/prometheus/statsd_exporter].
> Rather than pointing Airflow at an actual statsd server, you can point it at 
> this exporter, which will accumulate the metrics and expose them to be 
> scraped by Prometheus at /metrics. The problem is that once this value is set 
> when the scheduler runs its first loop, it will always be exposed to 
> Prometheus as 1. The scheduler can crash or be turned off, and the statsd 
> exporter will keep reporting 1 until the exporter itself is restarted and 
> rebuilds its internal state.
> By turning this metric into a counter, we can detect an issue with the 
> scheduler by graphing and alerting using a rate. If the rate of change of the 
> counter drops below the expected rate (determined by the 
> scheduler_heartbeat_secs setting), we can fire an alert.
> This should be helpful for adoption in Kubernetes environments where 
> Prometheus is pretty much the standard.
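
The difference described in the issue can be illustrated with a small,
self-contained simulation. This is only a sketch under stated assumptions:
ToyExporter, simulate and rates are hypothetical names made up for this
example, not Airflow or statsd_exporter code, and the exporter is modelled
simply as something that remembers the last gauge value and sums counter
increments. The heartbeat and scrape intervals are arbitrary.

```python
# Toy model only: every name here is hypothetical and stands in for the
# behaviour described in the issue, not for any real Airflow/exporter API.

class ToyExporter:
    """Remembers metric state between scrapes, like an accumulating exporter."""

    def __init__(self):
        self.gauges = {}
        self.counters = {}

    def gauge(self, name, value):
        # A gauge keeps only the most recently reported value.
        self.gauges[name] = value

    def incr(self, name, amount=1):
        # A counter accumulates every increment it receives.
        self.counters[name] = self.counters.get(name, 0) + amount


def simulate(total_secs, heartbeat_secs, scheduler_dies_at, scrape_every):
    """Return (time, gauge_value, counter_value) for each scrape."""
    exporter = ToyExporter()
    samples = []
    for t in range(total_secs + 1):
        alive = t < scheduler_dies_at
        if alive and t % heartbeat_secs == 0:
            # Report the heartbeat both ways so the two can be compared.
            exporter.gauge("scheduler_heartbeat", 1)
            exporter.incr("scheduler_heartbeat")
        if t % scrape_every == 0:
            samples.append((
                t,
                exporter.gauges.get("scheduler_heartbeat"),
                exporter.counters.get("scheduler_heartbeat", 0),
            ))
    return samples


def rates(samples):
    """Per-second increase of the counter between consecutive scrapes."""
    return [
        (c1 - c0) / (t1 - t0)
        for (t0, _, c0), (t1, _, c1) in zip(samples, samples[1:])
    ]


# Scheduler heartbeats every 5 seconds and stops at t=60.
samples = simulate(total_secs=120, heartbeat_secs=5,
                   scheduler_dies_at=60, scrape_every=30)
for sample in samples:
    print(sample)
print(rates(samples))
```

In this toy run the gauge column stays at 1 on every scrape, including those
taken after the scheduler has stopped, while the counter's rate falls to zero.
An alert could then compare the observed rate against roughly
1 / scheduler_heartbeat_secs, with whatever tolerance suits the deployment.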



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
