rafidka opened a new issue #13026: URL: https://github.com/apache/airflow/issues/13026
**Apache Airflow version**: 1.10.12 **Kubernetes version (if you are using kubernetes)** (use `kubectl version`): N/A **Environment**: - **Cloud provider or hardware configuration**: Amazon `r5.4xlarge` EC2 instance. - **OS** (e.g. from /etc/os-release): Amazon Linux 2 - **Kernel** (e.g. `uname -a`): 5.4.58-37.125.amzn2int.x86_64 (via `uname -r`) - **Install tools**: N/A - **Others**: N/A **What happened**: I am experiencing some strange issue with scheduler heartbeat. If I [configure it](https://airflow.apache.org/docs/apache-airflow/stable/configurations-ref.html#scheduler-heartbeat-sec) to 5 seconds, somehow the heartbeat metric is received every 6 seconds. I tried different values and what I noticed is that with values of 5 or less, the heartbeat is received one second later than expected. But with 6 or more, the heartbeat is received at the expected time (though for 2 and 6 I experienced an even stranger fluctuating behaviour). Below is a table of the values I tried and the frequencies I received: scheduler_heartbeat_sec value|frequency of scheduler_heartbeat metric -|- 0|2 1|2 2|Fluctuates between 2 and 4 3|4 5|Fluctuates between 6 and 8 6|6 10|10 30|30 **What you expected to happen**: I expect the `airflow.scheduler_heartbeat` metric to be received at the same frequency specified by the `scheduler_heartbeat_sec` configuration. **How to reproduce it**: - Install Airflow locally via pip - Enable StatsD metrics via the airflow.cfg configurations: ``` # Statsd (https://github.com/etsy/statsd) integration settings statsd_on = True statsd_host = localhost statsd_port = 8125 statsd_prefix = airflow ``` - Run Airflow webserver and scheduler and enable some DAGs. - Execute the following Python code and watch the timestamp of the received metrics: ```python import socket from datetime import datetime UDP_IP = "127.0.0.1" UDP_PORT = 8125 sock = socket.socket(socket.AF_INET, # Internet socket.SOCK_DGRAM) # UDP sock.bind((UDP_IP, UDP_PORT)) prev_iter_time = None while True: data, addr = sock.recvfrom(1024) # buffer size is 1024 bytes message = str(data) if 'airflow.scheduler_heartbeat' in message: this_iter_time = datetime.now() diff = (this_iter_time - prev_iter_time).total_seconds() if prev_iter_time is not None else 0 print(f"{this_iter_time} [{diff} seconds] received message: {message}.") prev_iter_time = this_iter_time ``` This is an example output when I set `scheduler_heartbeat_sec` to 5: ``` 2020-12-11 22:18:34.747872 [0 seconds] received message: b'airflow.scheduler_heartbeat:1|c'. 2020-12-11 22:18:40.760118 [6.012246 seconds] received message: b'airflow.scheduler_heartbeat:1|c'. 2020-12-11 22:18:46.765760 [6.005642 seconds] received message: b'airflow.scheduler_heartbeat:1|c'. 2020-12-11 22:18:52.771790 [6.00603 seconds] received message: b'airflow.scheduler_heartbeat:1|c'. 2020-12-11 22:18:58.778018 [6.006228 seconds] received message: b'airflow.scheduler_heartbeat:1|c'. 2020-12-11 22:19:04.784721 [6.006703 seconds] received message: b'airflow.scheduler_heartbeat:1|c'. 2020-12-11 22:19:10.789910 [6.005189 seconds] received message: b'airflow.scheduler_heartbeat:1|c'. ``` And this is another example when I set `scheduler_heartbeat_sec` to 6: ``` 2020-12-11 22:19:56.306896 [0 seconds] received message: b'airflow.scheduler_heartbeat:1|c'. 2020-12-11 22:20:02.317303 [6.010407 seconds] received message: b'airflow.scheduler_heartbeat:1|c'. 2020-12-11 22:20:10.324169 [8.006866 seconds] received message: b'airflow.scheduler_heartbeat:1|c'. 2020-12-11 22:20:18.331488 [8.007319 seconds] received message: b'airflow.scheduler_heartbeat:1|c'. 2020-12-11 22:20:24.338565 [6.007077 seconds] received message: b'airflow.scheduler_heartbeat:1|c'. 2020-12-11 22:20:30.345710 [6.007145 seconds] received message: b'airflow.scheduler_heartbeat:1|c'. 2020-12-11 22:20:38.347814 [8.002104 seconds] received message: b'airflow.scheduler_heartbeat:1|c'. 2020-12-11 22:20:44.358943 [6.011129 seconds] received message: b'airflow.scheduler_heartbeat:1|c'. 2020-12-11 22:20:52.368052 [8.009109 seconds] received message: b'airflow.scheduler_heartbeat:1|c'. ``` Notice that this time it fluctuates between 6 and 8 for reason. Now, setting `scheduler_heartbeat_sec` to 10, here is a much better stable output with the expected frequency: ``` python3 listen_to_statsd.py 2020-12-11 22:22:39.028670 [0 seconds] received message: b'airflow.scheduler_heartbeat:1|c'. 2020-12-11 22:22:49.041901 [10.013231 seconds] received message: b'airflow.scheduler_heartbeat:1|c'. 2020-12-11 22:22:59.052456 [10.010555 seconds] received message: b'airflow.scheduler_heartbeat:1|c'. 2020-12-11 22:23:09.062201 [10.009745 seconds] received message: b'airflow.scheduler_heartbeat:1|c'. 2020-12-11 22:23:19.071650 [10.009449 seconds] received message: b'airflow.scheduler_heartbeat:1|c'. 2020-12-11 22:23:29.084219 [10.012569 seconds] received message: b'airflow.scheduler_heartbeat:1|c'. 2020-12-11 22:23:39.093269 [10.00905 seconds] received message: b'airflow.scheduler_heartbeat:1|c'. 2020-12-11 22:23:49.106148 [10.012879 seconds] received message: b'airflow.scheduler_heartbeat:1|c'. 2020-12-11 22:23:59.113249 [10.007101 seconds] received message: b'airflow.scheduler_heartbeat:1|c'. 2020-12-11 22:24:09.124079 [10.01083 seconds] received message: b'airflow.scheduler_heartbeat:1|c'. 2020-12-11 22:24:19.135419 [10.01134 seconds] received message: b'airflow.scheduler_heartbeat:1|c'. 2020-12-11 22:24:29.143558 [10.008139 seconds] received message: b'airflow.scheduler_heartbeat:1|c'. 2020-12-11 22:24:39.154385 [10.010827 seconds] received message: b'airflow.scheduler_heartbeat:1|c'. 2020-12-11 22:24:49.164710 [10.010325 seconds] received message: b'airflow.scheduler_heartbeat:1|c'. ``` **Anything else we need to know**: Yes, you are awesome, but you might already know this. ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected]
