karenbraganz commented on code in PR #43536: URL: https://github.com/apache/airflow/pull/43536#discussion_r1826283883
##########
docs/apache-airflow/core-concepts/tasks.rst:
##########
@@ -167,25 +167,22 @@ These can be useful if your code has extra knowledge about its environment and w
 
 .. _concepts:zombies:
 
-Zombie/Undead Tasks
--------------------
+Zombie Tasks
+------------
 
-No system runs perfectly, and task instances are expected to die once in a while. Airflow detects two kinds of task/process mismatch:
+No system runs perfectly, and task instances are expected to die once in a while.
 
-* *Zombie tasks* are ``TaskInstances`` stuck in a ``running`` state despite their associated jobs being inactive
-  (e.g. their process did not send a recent heartbeat as it got killed, or the machine died). Airflow will find these
-  periodically, clean them up, and either fail or retry the task depending on its settings. Tasks can become zombies for
-  many reasons, including:
+*Zombie tasks* are ``TaskInstances`` stuck in a ``running`` state despite their associated jobs being inactive
+(e.g. their process did not send a recent heartbeat as it got killed, or the machine died). Airflow will find these
+periodically, clean them up, and either fail or retry the task depending on its settings. Tasks can become zombies for
+many reasons, including:
 
-    * The Airflow worker ran out of memory and was OOMKilled.
-    * The Airflow worker failed its liveness probe, so the system (for example, Kubernetes) restarted the worker.
-    * The system (for example, Kubernetes) scaled down and moved an Airflow worker from one node to another.
+* The Airflow worker ran out of memory and was OOMKilled.
+* The Airflow worker failed its liveness probe, so the system (for example, Kubernetes) restarted the worker.
+* The system (for example, Kubernetes) scaled down and moved an Airflow worker from one node to another.
 
-* *Undead tasks* are tasks that are *not* supposed to be running but are, often caused when you manually edit Task
-  Instances via the UI. Airflow will find them periodically and terminate them.
-
-Below is the code snippet from the Airflow scheduler that runs periodically to detect zombie/undead tasks.
+Below is the code snippet from the Airflow scheduler that runs periodically to detect zombie tasks.

Review Comment:
@rawwar Ryan and I discussed removing this because it might be too much detail for documentation. The Airflow source code is available for anyone who wants to understand this. I am also seeing [comments from Ephraim and Jarek](https://github.com/apache/airflow/pull/35825#discussion_r1404011284) (in the original PR where the code was added) recommending omission of the code, but those comments were never resolved.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
