arkadiuszbach opened a new pull request, #58733:
URL: https://github.com/apache/airflow/pull/58733

   **What**
   Derive Celery `sync_parallelism` from the scheduler's `resources.limits.cpu` when it is defined.
   
   **Why**
   The default value of `sync_parallelism` in Airflow is `0`, which causes it to fall back to `multiprocessing.cpu_count()`:
   
https://github.com/apache/airflow/blob/97cd1c9c99030be89cebaddc2e342359fc01b5b8/providers/celery/src/airflow/providers/celery/executors/celery_executor.py#L311
   
   In containerized environments, `cpu_count()` returns the number of host-machine cores rather than the number of cores actually allocated to the container; see: https://github.com/python/cpython/issues/80235
   
   This leads to incorrect behavior. For example, if the scheduler container has a 500m CPU limit but runs on a 16 vCPU node, Airflow will incorrectly spawn 15 processes (`cpu_count() - 1`) instead of 1 when sending Celery tasks via `apply_async`, leading to unnecessary resource contention and overhead.
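   The derivation described above can be sketched as a small helper that translates a Kubernetes CPU quantity into a parallelism value. The helper name and rounding choice are illustrative assumptions, not the chart's actual template logic:

   ```python
   import math


   def parallelism_from_cpu_limit(cpu_limit: str) -> int:
       """Translate a Kubernetes CPU quantity (e.g. "500m", "2") into a
       sync_parallelism value, rounding up so a fractional limit still
       yields at least one worker process."""
       if cpu_limit.endswith("m"):
           cpus = int(cpu_limit[:-1]) / 1000  # millicores, e.g. "500m" -> 0.5
       else:
           cpus = float(cpu_limit)
       return max(1, math.ceil(cpus))
   ```

   With a `500m` limit this yields `1` process rather than the `cpu_count() - 1` (15 on a 16 vCPU node) that the current default produces.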
   
   

