luoyuliuyin opened a new pull request, #39484: URL: https://github.com/apache/airflow/pull/39484
PR for https://github.com/apache/airflow/issues/39482

When the scheduler sends tasks to Celery, it takes one of two paths. If there is only one task in the current cycle, the task is sent from the main thread. If there are multiple tasks, a thread pool sized by the number of CPU cores is created, and the pool sends all of the tasks.

The problem with the current implementation is that the scheduler creates a new thread pool on every scheduling cycle, which adds significant overhead. The pool can be reused across cycles instead.

In my tests, sending 32 tasks sometimes took almost 4 seconds. With a reused thread pool, it only takes 10 milliseconds.
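The reuse idea described in this PR can be sketched roughly as follows. This is a minimal illustration and not the code in the PR or in Airflow: `send_task` and `TaskSender` are hypothetical names, `send_task` only stands in for publishing one task to Celery, and the sketch assumes a `ThreadPoolExecutor` from the standard library.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def send_task(task_id):
    """Hypothetical stand-in for publishing one task to Celery."""
    return f"sent:{task_id}"


class TaskSender:
    """Hypothetical sender that keeps one pool alive across scheduling cycles."""

    def __init__(self, max_workers=None):
        # Assumption: size the pool by CPU count, as the description states.
        self._max_workers = max_workers or os.cpu_count() or 1
        self._pool = None  # created lazily on the first multi-task cycle
        self.pools_created = 0  # counter kept only to make the reuse visible

    def send(self, task_ids):
        if not task_ids:
            return []
        if len(task_ids) == 1:
            # A single task is sent directly from the calling thread.
            return [send_task(task_ids[0])]
        if self._pool is None:
            # Create the pool once instead of once per cycle.
            self._pool = ThreadPoolExecutor(max_workers=self._max_workers)
            self.pools_created += 1
        # map() returns results in the same order as task_ids.
        return list(self._pool.map(send_task, task_ids))

    def shutdown(self):
        # Release the worker threads when the scheduler stops.
        if self._pool is not None:
            self._pool.shutdown(wait=True)
            self._pool = None
```

In this sketch a caller would construct one `TaskSender`, call `send()` once per scheduling cycle, and call `shutdown()` on exit, so the cost of starting worker threads is paid once rather than on every cycle.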
