potiuk commented on issue #22350:
URL: https://github.com/apache/airflow/issues/22350#issuecomment-1073945325


   It's an interesting one. I think zombies should be restarted when detected. I 
think you can fine-tune your Celery/Airflow behaviour with 
https://airflow.apache.org/docs/apache-airflow/stable/configurations-ref.html#scheduler-zombie-task-threshold
 and 
[timeout](https://airflow.apache.org/docs/apache-airflow/stable/configurations-ref.html#celery-config-options)
 to configure the hard limit on waiting for tasks when the Celery worker attempts 
to shut down. 
   
   Airflow SHOULD eventually catch up if you wait long enough for all the 
timeouts to pass, but the best way is to configure your timeouts so that the 
termination is not forced by K8S, i.e. K8S grace timeout > Celery worker task 
timeout. Then Celery should have enough time to mark the tasks as failed so 
that they can be retried by the scheduler much faster.
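
   As a rough illustration of the ordering described above, here is a minimal 
sketch of a sanity check you could run against your own deployment values. The 
function name, the constant names and the numbers are hypothetical and are not 
Airflow, Celery or Kubernetes settings; substitute whatever your pod spec and 
Celery configuration actually use.

```python
def termination_is_graceful(k8s_grace_period_s: int, celery_task_timeout_s: int) -> bool:
    """Return True when K8S waits strictly longer than the Celery worker does.

    If this holds, the worker should hit its own timeout first and get a
    chance to mark tasks as failed before K8S force-kills the pod.
    """
    return k8s_grace_period_s > celery_task_timeout_s


# Hypothetical example values (seconds), for illustration only.
K8S_GRACE_PERIOD_S = 600
CELERY_TASK_TIMEOUT_S = 300

if not termination_is_graceful(K8S_GRACE_PERIOD_S, CELERY_TASK_TIMEOUT_S):
    print("Warning: K8S may force-kill the worker before Celery times out")
```

   With the values above the check passes and nothing is printed; swapping the 
two numbers would trigger the warning.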


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

