[
https://issues.apache.org/jira/browse/AIRFLOW-3177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Fokko Driesprong updated AIRFLOW-3177:
--------------------------------------
Fix Version/s: 1.10.1
> Change scheduler_heartbeat metric from gauge to counter
> -------------------------------------------------------
>
> Key: AIRFLOW-3177
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3177
> Project: Apache Airflow
> Issue Type: Improvement
> Components: scheduler
> Affects Versions: 2.0.0
> Reporter: Greg Neiheisel
> Assignee: Greg Neiheisel
> Priority: Minor
> Fix For: 1.10.1
>
>
> Currently, the scheduler_heartbeat metric exposed with the statsd integration
> is a gauge. I'm proposing to change the gauge to a counter for a better
> integration with Prometheus via the
> [statsd_exporter|https://github.com/prometheus/statsd_exporter].
> Rather than pointing Airflow at an actual statsd server, you can point it at
> this exporter, which will accumulate the metrics and expose them to be
> scraped by Prometheus at /metrics. The problem is that once the scheduler
> sets this value in its first loop, it is always exposed to Prometheus as 1.
> The scheduler can crash or be turned off, and the statsd exporter will keep
> reporting 1 until the exporter is restarted and rebuilds its internal state.
> By turning this metric into a counter, we can detect an issue with the
> scheduler by graphing and alerting on its rate. If the rate of change of the
> counter drops below the expected rate (determined by the
> scheduler_heartbeat_secs setting), we can fire an alert.
> This should be helpful for adoption in Kubernetes environments where
> Prometheus is pretty much the standard.
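The rate-based check described above can be sketched roughly as follows. This is an illustration only, not Airflow or Prometheus code: the function names, the sample data, the 15-second scrape interval, and the 0.5 tolerance are all assumptions made for the example. It assumes the counter is incremented once per heartbeat, so the expected rate is 1 / scheduler_heartbeat_secs.

```python
# Hypothetical sketch: none of these names come from Airflow or Prometheus.

def heartbeat_rate(samples):
    """Per-second increase between the first and last (timestamp, value) sample."""
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("samples must span a positive time interval")
    return (v1 - v0) / (t1 - t0)


def scheduler_is_healthy(samples, heartbeat_secs, tolerance=0.5):
    """True if the observed rate is at least `tolerance` times the expected rate."""
    expected = 1.0 / heartbeat_secs  # one increment per heartbeat (assumption)
    return heartbeat_rate(samples) >= tolerance * expected


# Assumed scrape every 15 s with scheduler_heartbeat_secs = 5:
# a running scheduler adds about 3 to the counter between scrapes.
healthy = [(0, 0), (15, 3), (30, 6), (45, 9)]
# The scheduler has stopped; the exporter keeps exposing the last value.
stalled = [(60, 12), (75, 12), (90, 12), (105, 12)]

print(scheduler_is_healthy(healthy, heartbeat_secs=5))
print(scheduler_is_healthy(stalled, heartbeat_secs=5))
```

A gauge that stays at 1 carries no such signal: its value looks the same whether the scheduler is running or has stopped, while a counter that stops increasing shows a rate of zero.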
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)