This is an automated email from the ASF dual-hosted git repository.

kaxilnaik pushed a commit to branch v1-10-test
in repository https://gitbox.apache.org/repos/asf/airflow.git

commit 3c81b30b324077b76adb753220b63350842e5f73
Author: Ry Walker <[email protected]>
AuthorDate: Tue Feb 4 03:43:02 2020 -0500

    [AIRFLOW-XXXX] Add scheduler in production section (#7351)
    
    (cherry picked from commit 34f5d6f3c0a6f6f6f2066c70ede0c71245eb6931)
---
 docs/best-practices.rst | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/docs/best-practices.rst b/docs/best-practices.rst
index 9b6b3ae..1a27e73 100644
--- a/docs/best-practices.rst
+++ b/docs/best-practices.rst
@@ -315,3 +315,16 @@ Some configurations such as Airflow Backend connection URI can be derived from b
 .. code::
 
  sql_alchemy_conn_cmd = bash_command_to_run
+
+
+Scheduler Uptime
+-----------------
+
+Airflow users have for a long time been affected by a
+`core Airflow bug <https://issues.apache.org/jira/browse/AIRFLOW-401>`_
+that causes the scheduler to hang without a trace.
+
+Until fully resolved, you can mitigate a few ways:
+
+* Set a reasonable run_duration setting in your ``airflow.cfg``. `Example config <https://github.com/astronomer/airflow-chart/blob/63bc503c67e2cd599df0b6f831d470d09bad7ee7/templates/configmap.yaml#L44>`_.
+* Add an ``exec`` style health check to your helm charts on the scheduler deployment to fail if the scheduler has not heartbeat in a while. `Example health check definition <https://github.com/astronomer/helm.astronomer.io/pull/200/files>`_.
