Kamil Bregula created AIRFLOW-6532:
--------------------------------------

             Summary: Fetch celery states using batch method instead Pool
                 Key: AIRFLOW-6532
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-6532
             Project: Apache Airflow
          Issue Type: Improvement
          Components: executors
    Affects Versions: 1.10.7
            Reporter: Kamil Bregula

One aspect worth checking is how long Celery takes to fetch task statuses:

https://github.com/apache/airflow/blob/77099b876814ec0008fd8da18f35de70deccbe03/airflow/executors/celery_executor.py#L246-L259

My clients use MySQL as the result backend, so Celery sends 100 queries to the database for 100 tasks:

https://github.com/celery/celery/blob/77099b876814ec0008fd8da18f35de70deccbe03/airflow/backends/database/__init__.py#L149-L164

In my opinion, this could be sped up by replacing our code with a call to the Celery method celery.backends.base:BaseKeyValueStoreBackend.get_many:

https://github.com/celery/celery/blob/77099b876814ec0008fd8da18f35de70deccbe03/celery/backends/base.py#L711-L747

Unfortunately, this method is only available on key-value backends such as Redis, so we will have to extend the DatabaseBackend class with an mget / get_many method for this to work with MySQL.
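
To make the difference concrete, here is a minimal sketch of the two access patterns. It is not the Airflow or Celery code: an in-memory SQLite table stands in for the MySQL result backend, the function names are hypothetical, and the table and column names (celery_taskmeta, task_id, status) are assumed for illustration only.

```python
import sqlite3


def fetch_states_one_by_one(conn, task_ids):
    """Current pattern: one SELECT per task, so N round trips for N tasks."""
    states = {}
    for task_id in task_ids:
        row = conn.execute(
            "SELECT status FROM celery_taskmeta WHERE task_id = ?", (task_id,)
        ).fetchone()
        # Assumption: a task with no stored result is reported as PENDING.
        states[task_id] = row[0] if row else "PENDING"
    return states


def fetch_states_batch(conn, task_ids, chunk_size=500):
    """Proposed pattern: one SELECT ... IN (...) per chunk of task ids."""
    task_ids = list(task_ids)
    states = {task_id: "PENDING" for task_id in task_ids}
    for start in range(0, len(task_ids), chunk_size):
        chunk = task_ids[start:start + chunk_size]
        placeholders = ",".join("?" * len(chunk))
        query = (
            "SELECT task_id, status FROM celery_taskmeta "
            f"WHERE task_id IN ({placeholders})"
        )
        # Each row is a (task_id, status) pair, which dict.update accepts.
        states.update(conn.execute(query, chunk).fetchall())
    return states


# Hypothetical data: 100 finished tasks plus one id with no stored result.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE celery_taskmeta (task_id TEXT PRIMARY KEY, status TEXT)")
conn.executemany(
    "INSERT INTO celery_taskmeta VALUES (?, ?)",
    [(f"task-{i}", "SUCCESS") for i in range(100)],
)
task_ids = [f"task-{i}" for i in range(100)] + ["task-unknown"]

one_by_one = fetch_states_one_by_one(conn, task_ids)  # 101 queries
batched = fetch_states_batch(conn, task_ids)          # 1 query
```

Both functions return the same mapping. The batch version issues a single query for up to chunk_size ids, which is where the saving on a remote database would come from. The chunking is there only to keep the number of bound parameters per statement bounded.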