GitHub user sdaberdaku created a discussion: Why is AIRFLOW__CELERY__TASK_ACKS_LATE True by default?
Hello all,

I have a question regarding the `AIRFLOW__CELERY__TASK_ACKS_LATE` configuration. According to the documentation:

https://airflow.apache.org/docs/apache-airflow-providers-celery/stable/configurations-ref.html#task-acks-late

> If an Airflow task’s execution time exceeds the visibility_timeout, Celery
> will re-assign the task to a Celery worker, even if the original task is
> still running successfully. The new task instance then runs concurrently with
> the original task and the Airflow UI and logs only show an error message:
> ‘Task Instance Not Running’ FAILED: Task is in the running state’ Setting
> task_acks_late to True will force Celery to wait until a task is finished
> before a new task instance is assigned. This effectively overrides the
> visibility timeout.

This description suggests that setting this variable to `True` forces Celery to wait until a task is finished before a new task instance is assigned, which is the opposite of what the Celery documentation states:

https://docs.celeryq.dev/en/stable/reference/celery.app.task.html#celery.app.task.Task.acks_late

> When enabled messages for this task will be acknowledged after the task has
> been executed, and not right before (the default behavior).
>
> Please note that this means the task may be executed twice if the worker
> crashes mid execution.

If the `visibility_timeout` is exceeded and the task has not been acknowledged yet, it will be picked up by another Celery worker, regardless of whether the original run has completed or not.

I think the default value of this configuration should be `False` instead, since Airflow has its own mechanism for detecting failed/zombie tasks and rescheduling them.

GitHub link: https://github.com/apache/airflow/discussions/57348
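
To make the concern about late acknowledgement concrete, here is a minimal toy model in Python. It is not Celery's or Airflow's code: the function name `delivery_times`, the integer time units, and the broker behavior (redeliver an unacknowledged message at every multiple of the visibility timeout, stop once the first delivery is acknowledged) are all simplifying assumptions made for illustration only.

```python
def delivery_times(task_duration: int, visibility_timeout: int, acks_late: bool) -> list[int]:
    """Return the times at which a toy broker hands one message to a worker.

    Hypothetical model, not a real broker: the message is first delivered at
    t=0 and redelivered every `visibility_timeout` time units until the first
    delivery has been acknowledged.
    """
    if visibility_timeout <= 0:
        raise ValueError("visibility_timeout must be positive")

    # Early ack: acknowledged as soon as the worker receives the message.
    # Late ack: acknowledged only after the task has finished running.
    ack_time = task_duration if acks_late else 0

    times = [0]  # the initial delivery
    t = visibility_timeout
    while t < ack_time:  # still unacknowledged when the timeout elapses
        times.append(t)
        t += visibility_timeout
    return times


if __name__ == "__main__":
    # A task that runs for 10 units against a visibility timeout of 4 units.
    print(delivery_times(10, 4, acks_late=True))
    print(delivery_times(10, 4, acks_late=False))
```

In this sketch, a task that outlives the visibility timeout is delivered more than once only when acknowledgement is late; with early acknowledgement, or with a task shorter than the timeout, it is delivered exactly once. Whether a real deployment behaves this way depends on the broker transport and its settings, so treat the model as an illustration of the argument, not as evidence.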
