This is an automated email from the ASF dual-hosted git repository.
potiuk pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/airflow.git
The following commit(s) were added to refs/heads/main by this push:
new 2429d077d8 Trigger gevent monkeypatching via environment variable (#28283)
2429d077d8 is described below
commit 2429d077d8c59299487562c8867cfc63cd969b9d
Author: Jarek Potiuk <[email protected]>
AuthorDate: Wed Dec 21 20:13:58 2022 +0100
Trigger gevent monkeypatching via environment variable (#28283)
Gevent needs to monkeypatch a number of system libraries as soon
as possible after the Python interpreter starts, so that other
libraries do not patch them first. The patching should happen before
any other initialization, and it needs to run only on the webserver.
So far it was done by monkeypatching in local_settings, but that has
been rather brittle: some changes in Airflow made previous attempts
stop working, because the "other" packages could be loaded by
Airflow earlier, depending on installed providers and configuration
(for example, when AWS was configured as the logger, boto could have
been loaded first and could have monkeypatched networking before
gevent had a chance to do so).
This change introduces a different mechanism for triggering the
patching: it is triggered by setting the _AIRFLOW_PATCH_GEVENT
environment variable.
This has the benefit that we do not need to initialize anything
(including reading settings or setting up logging) before we determine
whether gevent patching should be performed.
It also has the drawback that users have to set the environment
variable in their deployment manually. However, this is a small price to
pay for stable and future-proof gevent monkeypatching built into
Airflow.
Fixes: #8212
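The check described above can be sketched in isolation. This is a
minimal illustration, not the code from the commit: maybe_patch and
fake_patch_all are hypothetical names, and a stand-in callable replaces
gevent.monkey.patch_all so the sketch runs without gevent installed.
It shows that the gate is a plain truthiness test on the variable's
value, so any non-empty string (including "0") enables patching.

```python
from typing import Callable, List, Mapping


def maybe_patch(environ: Mapping[str, str], patcher: Callable[[], None]) -> bool:
    """Call ``patcher`` when _AIRFLOW_PATCH_GEVENT holds a non-empty value.

    Hypothetical helper mirroring the ``if os.environ.get(...)`` gate in the diff.
    """
    if environ.get("_AIRFLOW_PATCH_GEVENT"):
        patcher()
        return True
    return False


calls: List[str] = []


def fake_patch_all() -> None:
    # Stand-in for gevent.monkey.patch_all (assumption: gevent may not be installed).
    calls.append("patched")


patched_on_one = maybe_patch({"_AIRFLOW_PATCH_GEVENT": "1"}, fake_patch_all)
patched_on_zero = maybe_patch({"_AIRFLOW_PATCH_GEVENT": "0"}, fake_patch_all)
patched_on_empty = maybe_patch({"_AIRFLOW_PATCH_GEVENT": ""}, fake_patch_all)
patched_on_unset = maybe_patch({}, fake_patch_all)
```

Because the test is on string truthiness, leaving the variable unset
(or empty) is the way to keep patching disabled.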
---
airflow/__init__.py | 9 +++++++++
airflow/config_templates/config.yml | 4 +++-
airflow/config_templates/default_airflow.cfg | 4 +++-
newsfragments/08212.misc.rst | 1 +
4 files changed, 16 insertions(+), 2 deletions(-)
diff --git a/airflow/__init__.py b/airflow/__init__.py
index 38190dc8b8..1ecd10bb88 100644
--- a/airflow/__init__.py
+++ b/airflow/__init__.py
@@ -32,6 +32,14 @@ import os
import sys
from typing import Callable
+if os.environ.get("_AIRFLOW_PATCH_GEVENT"):
+    # If you are using gevents and start airflow webserver, you might want to run gevent monkeypatching
+    # as one of the first thing when Airflow is started. This allows gevent to patch networking and other
+    # system libraries to make them gevent-compatible before anything else patches them (for example boto)
+    from gevent.monkey import patch_all
+
+    patch_all()
+
from airflow import settings
__all__ = ["__version__", "login", "DAG", "PY36", "PY37", "PY38", "PY39", "PY310", "XComArg"]
@@ -41,6 +49,7 @@ __all__ = ["__version__", "login", "DAG", "PY36", "PY37", "PY38", "PY39", "PY310
# lib.)
__path__ = __import__("pkgutil").extend_path(__path__, __name__)  # type: ignore
+
# Perform side-effects unless someone has explicitly opted out before import
# WARNING: DO NOT USE THIS UNLESS YOU REALLY KNOW WHAT YOU'RE DOING.
if not os.environ.get("_AIRFLOW__AS_LIBRARY", None):
diff --git a/airflow/config_templates/config.yml b/airflow/config_templates/config.yml
index c38ef0c3b4..af5f130135 100644
--- a/airflow/config_templates/config.yml
+++ b/airflow/config_templates/config.yml
@@ -1233,7 +1233,9 @@ webserver:
worker_class:
description: |
The worker class gunicorn should use. Choices include
- sync (default), eventlet, gevent
+ sync (default), eventlet, gevent. Note when using gevent you might also want to set the
+ "_AIRFLOW_PATCH_GEVENT" environment variable to "1" to make sure gevent patching is done as
+ early as possible.
version_added: ~
type: string
example: ~
diff --git a/airflow/config_templates/default_airflow.cfg b/airflow/config_templates/default_airflow.cfg
index 4bd2883563..8f704c378f 100644
--- a/airflow/config_templates/default_airflow.cfg
+++ b/airflow/config_templates/default_airflow.cfg
@@ -640,7 +640,9 @@ secret_key = {SECRET_KEY}
workers = 4
# The worker class gunicorn should use. Choices include
-# sync (default), eventlet, gevent
+# sync (default), eventlet, gevent. Note when using gevent you might also want to set the
+# "_AIRFLOW_PATCH_GEVENT" environment variable to "1" to make sure gevent patching is done as
+# early as possible.
worker_class = sync
# Log files for the gunicorn webserver. '-' means log to stderr.
diff --git a/newsfragments/08212.misc.rst b/newsfragments/08212.misc.rst
new file mode 100644
index 0000000000..acce074f10
--- /dev/null
+++ b/newsfragments/08212.misc.rst
@@ -0,0 +1 @@
+If you are using gevent for your webserver deployment and used local settings to monkeypatch gevent, you might want to replace local settings patching with an ``_AIRFLOW_PATCH_GEVENT`` environment variable set to 1 in your webserver. This ensures gevent patching is done as early as possible.
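
For deployments that start the webserver from a Python wrapper, the
newsfragment's advice might look like the sketch below. It is an
illustration under stated assumptions, not part of the commit:
webserver_launch is a hypothetical helper, the wrapper is assumed to
configure Airflow through AIRFLOW__{SECTION}__{KEY} environment
variables, and the actual subprocess call is left commented out so the
sketch has no side effects.

```python
import os
from typing import Dict, List, Mapping, Tuple


def webserver_launch(base_env: Mapping[str, str]) -> Tuple[Dict[str, str], List[str]]:
    """Build the environment and command for a gevent-based webserver.

    Hypothetical helper: copies the given environment, so the caller's
    mapping is left untouched.
    """
    env = dict(base_env)
    # Trigger gevent patching at the very start of ``import airflow``.
    env["_AIRFLOW_PATCH_GEVENT"] = "1"
    # Select the gevent gunicorn worker via the [webserver] worker_class option.
    env["AIRFLOW__WEBSERVER__WORKER_CLASS"] = "gevent"
    return env, ["airflow", "webserver"]


env, cmd = webserver_launch(os.environ)

# In a real deployment the command would then be started with this environment:
# import subprocess
# subprocess.run(cmd, env=env, check=True)
```

The variable only needs to be present in the webserver process's
environment, so setting it per process (as here) avoids patching other
Airflow components that import the same package.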