Hi everyone,

For the past two weeks we've been facing an issue with a LocalExecutor setup of Airflow v1.9 (MySQL as the metastore). In a DAG where retries are configured, if the initial try_number fails, then roughly 8 out of 10 times the task gets stuck in the up_for_retry state; in fact, the task instance never reaches the running state after Scheduled > Queued. The entry in the job table is marked successful within a fraction of a second, and a failed entry is logged in the task_fail table without the task ever reaching the operator code.
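For context, here is a minimal sketch of the kind of retry setup where we see this. The DAG id, task id, and the callable below are made up for illustration; they are not our real pipeline:

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python_operator import PythonOperator


def flaky_task():
    # Placeholder for our real operator code; on the stuck retry
    # attempts, execution never even reaches this function.
    raise ValueError("simulated failure to trigger a retry")


default_args = {
    'owner': 'airflow',
    'start_date': datetime(2018, 1, 1),
    'retries': 3,                         # 4 total tries, hence "Try 2 out of 4"
    'retry_delay': timedelta(minutes=5),
}

dag = DAG('retry_repro_example',          # hypothetical DAG id
          default_args=default_args,
          schedule_interval='@daily')

task = PythonOperator(
    task_id='flaky_task',
    python_callable=flaky_task,
    dag=dag,
)
```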
When this happens, we get an email alert saying:

```
Try 2 out of 4
Exception: Executor reports task instance %s finished (%s) although the task says its %s. Was the task killed externally?
```

However, when the default value of job_heartbeat_sec is changed from 5 to 30 seconds (as suggested by Max back in 2016 for healthy supervision: https://groups.google.com/forum/#!topic/airbnb_airflow/hTXKFw2XFx0), the issue stops arising. But we're still clueless as to how this new configuration actually solved, or merely suppressed, the issue; any key information around it would really help here.

Regards,
Vardan Gupta
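P.S. For anyone who wants to try the same workaround, this is the change we made in airflow.cfg (job_heartbeat_sec lives under the [scheduler] section in 1.9):

```ini
[scheduler]
# Default is 5; raising it to 30 per the thread linked above makes the
# up_for_retry hang stop appearing in our setup.
job_heartbeat_sec = 30
```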