GitHub user sdaberdaku created a discussion: Why is 
AIRFLOW__CELERY__TASK_ACKS_LATE True by default?

Hello all,

I have a question regarding the `AIRFLOW__CELERY__TASK_ACKS_LATE` 
configuration. According to the documentation:
https://airflow.apache.org/docs/apache-airflow-providers-celery/stable/configurations-ref.html#task-acks-late

> If an Airflow task’s execution time exceeds the visibility_timeout, Celery 
> will re-assign the task to a Celery worker, even if the original task is 
> still running successfully. The new task instance then runs concurrently with 
> the original task and the Airflow UI and logs only show an error message: 
> ‘Task Instance Not Running’ FAILED: Task is in the running state’ Setting 
> task_acks_late to True will force Celery to wait until a task is finished 
> before a new task instance is assigned. This effectively overrides the 
> visibility timeout.

This description suggests that setting the option to `True` makes Celery wait 
until a task has finished before a new task instance is assigned. That is the 
opposite of what the Celery documentation states:
https://docs.celeryq.dev/en/stable/reference/celery.app.task.html#celery.app.task.Task.acks_late

> When enabled messages for this task will be acknowledged after the task has 
> been executed, and not right before (the default behavior).
> 
> Please note that this means the task may be executed twice if the worker 
> crashes mid execution.

If the `visibility_timeout` is exceeded and the task has not been acknowledged 
yet, the message is redelivered and picked up by another Celery worker, even if 
the original run is still in progress. With `task_acks_late` enabled, the 
acknowledgement only happens after execution, so any task that runs longer than 
the timeout is exposed to this. I think the default value of this configuration 
should be `False` instead, since Airflow has its own mechanism for detecting 
failed/zombie tasks and rescheduling them.
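
To make the reasoning concrete, here is a minimal toy simulation of a broker 
with a visibility timeout. It is not Celery's or Airflow's code: `ToyBroker` 
and `count_deliveries` are hypothetical names, time advances in whole-second 
ticks, and only the original worker's run is modelled. It assumes, as the 
paragraph above argues, that an unacknowledged message is redelivered once the 
timeout elapses.

```python
from dataclasses import dataclass, field


@dataclass
class ToyBroker:
    """Toy broker with a visibility timeout (hypothetical, not Celery's API)."""

    visibility_timeout: int
    # message id -> time of the most recent delivery, for unacknowledged messages
    unacked: dict = field(default_factory=dict)
    # every delivery that happened, as (message id, time)
    deliveries: list = field(default_factory=list)

    def deliver(self, msg_id, now):
        self.unacked[msg_id] = now
        self.deliveries.append((msg_id, now))

    def ack(self, msg_id):
        self.unacked.pop(msg_id, None)

    def redeliver_expired(self, now):
        # Redeliver every message that stayed unacknowledged for the full timeout.
        for msg_id, delivered_at in list(self.unacked.items()):
            if now - delivered_at >= self.visibility_timeout:
                self.deliver(msg_id, now)


def count_deliveries(acks_late, task_duration, visibility_timeout):
    """Return how many times one task message is handed to a worker."""
    broker = ToyBroker(visibility_timeout)
    broker.deliver("task-1", now=0)
    if not acks_late:
        broker.ack("task-1")  # early ack: acknowledged right before execution
    for now in range(1, task_duration + 1):  # the task is running
        broker.redeliver_expired(now)
    if acks_late:
        broker.ack("task-1")  # late ack: acknowledged after execution
    return len(broker.deliveries)


# A 10-second task with a 4-second visibility timeout:
late = count_deliveries(acks_late=True, task_duration=10, visibility_timeout=4)
early = count_deliveries(acks_late=False, task_duration=10, visibility_timeout=4)
print(late, early)
```

In this model the late-acknowledged message is delivered again at every 
timeout expiry while the task is still running, whereas the early-acknowledged 
one is delivered exactly once.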

GitHub link: https://github.com/apache/airflow/discussions/57348
