shenoykarthikd opened a new issue #11266:
URL: https://github.com/apache/airflow/issues/11266


   FAQ Documentation for max_threads currently reads as follows:
   
   max_threads: Scheduler will spawn multiple threads in parallel to schedule 
dags. This is controlled by max_threads with default value of 2. User should 
increase this value to a larger value (e.g numbers of cpus where scheduler runs 
- 1) in production.
   
   The example above confuses new developers, who often read it as saying that 
the scheduler's thread count cannot exceed the number of CPUs - 1. I have seen 
many Airflow installations where the value is set to the number of CPUs - 1, 
while the upper limit should actually be determined by the size of the instance 
(CPU + memory) on which the scheduler is installed. Because of this 
misunderstanding, I've heard many new Airflow developers say that Airflow is 
very slow at scheduling DAGs. When I look deeper into their config, I find 
max_threads limited to the number of CPUs.
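   
   As a rough illustration of the check described above, here is a minimal 
sketch that reads max_threads from a config fragment and flags a value that 
matches the "cpus - 1" reading. It assumes the option sits in the [scheduler] 
section of airflow.cfg; the sample config text, the value 3, and the helper 
names read_max_threads and looks_cpu_capped are hypothetical and not part of 
Airflow.

```python
import configparser
import os

# Hypothetical airflow.cfg fragment; the value 3 is illustrative only.
SAMPLE_CFG = """
[scheduler]
max_threads = 3
"""


def read_max_threads(cfg_text, default=2):
    """Return [scheduler] max_threads from config text.

    Falls back to 2, the default quoted in the FAQ text, when the
    section or option is missing.
    """
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    return parser.getint("scheduler", "max_threads", fallback=default)


def looks_cpu_capped(max_threads, cpu_count):
    """True when the value equals cpu_count - 1, i.e. the FAQ example
    was probably taken as a hard limit rather than a starting point."""
    return cpu_count > 1 and max_threads == cpu_count - 1


if __name__ == "__main__":
    cpus = os.cpu_count() or 1
    value = read_max_threads(SAMPLE_CFG)
    print(f"max_threads={value}, cpus={cpus}, "
          f"cpu_capped={looks_cpu_capped(value, cpus)}")
```

   A match does not prove the value is too low; it only suggests that the 
setting deserves a second look against the instance's CPU and memory.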
   
   Kindly consider changing it to the following:
   max_threads: Scheduler will spawn multiple threads in parallel to schedule 
dags. This is controlled by max_threads with default value of 2. User should 
increase this value to a larger value that fits the size of the installed 
hardware in production.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

