sania-2912 commented on issue #59553: URL: https://github.com/apache/airflow/issues/59553#issuecomment-3682883299
Hi @potiuk, I’ve gone through the reproduction steps and can consistently reproduce the issue on Airflow 3.1.5 with CeleryExecutor. From my view, it looks like task_queued_timeout is not being honored correctly for Celery tasks waiting in a non-default queue with limited worker concurrency, leading to premature retries (~15 min) despite a much higher configured timeout. I’d like to work on a fix for this and submit a PR. My plan is to: 1. Trace where queued task timeout is enforced in the scheduler/executor flow 2. Verify whether the timeout is being overridden or hard-limited elsewhere (e.g. scheduler loop, heartbeat, or executor state handling) 3. Add a regression test covering long queue wait times with CeleryExecutor Please let me know if there are any architectural considerations or preferred areas to start with. Happy to proceed. Thanks! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
