abhijeets25012-tech opened a new pull request, #61642:
URL: https://github.com/apache/airflow/pull/61642
### What
Currently, when a Celery worker task is killed because of an out-of-memory (OOM)
condition (`exit_code=-9`), the Airflow UI does not always show the cause of the
failure. This change ensures that such tasks log a CRITICAL message identifying
the OOM kill.
### Why
Referencing issue #61521:
Some tasks fail silently when they are killed for running out of memory, which
makes debugging difficult. This change improves observability by logging a
CRITICAL message when a task exits with `exit_code=-9`.
### Changes
- Updated `ActivitySubprocess.wait()` in `supervisor.py`:
  - If `exit_code == -9`, log a `CRITICAL` message with `task_instance_id`,
    `duration`, and `final_state`.
  - Otherwise, continue logging task finish as `INFO`.
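
The branching described above can be sketched roughly as follows. This is a minimal, standalone illustration and not the actual diff: the helper name `log_task_exit`, the message wording, and the use of the standard-library `logging` module are assumptions made for the example, and the real `supervisor.py` may structure this differently.

```python
import logging

# A negative exit code means "terminated by that signal number"; -9 is SIGKILL on Linux.
SIGKILL_EXIT_CODE = -9


def log_task_exit(logger, exit_code, task_instance_id, duration, final_state):
    """Hypothetical helper: log a task exit and return the log level that was used."""
    if exit_code == SIGKILL_EXIT_CODE:
        # SIGKILL is what the kernel OOM killer sends, so flag it loudly.
        level = logging.CRITICAL
        message = "Task killed with SIGKILL, possibly by the OOM killer"
    else:
        # Any other exit keeps the existing INFO-level "finished" log.
        level = logging.INFO
        message = "Task finished"
    logger.log(
        level,
        "%s: task_instance_id=%s exit_code=%s duration=%.2f final_state=%s",
        message,
        task_instance_id,
        exit_code,
        duration,
        final_state,
    )
    return level
```

Note that `exit_code == -9` only shows that the process received SIGKILL. An OOM kill is one common cause, but a manual `kill -9` produces the same code, which is why the sketch words the message as "possibly".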
### Testing
- Manually verified that tasks killed by OOM now produce a CRITICAL log.
- Normal task completion still logs as INFO.
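
For anyone repeating the manual check, one way to obtain a `-9` exit code without exhausting memory is to send SIGKILL to a child process directly. The snippet below is a sketch for POSIX systems only; the function name `run_and_sigkill` is made up for the example, and it simulates the signal rather than a real OOM condition.

```python
import subprocess
import sys


def run_and_sigkill():
    """Start a sleeping child process, SIGKILL it, and return its exit code."""
    proc = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    # On POSIX, Popen.kill() sends SIGKILL, the same signal the OOM killer uses.
    proc.kill()
    # subprocess reports "terminated by signal N" as a return code of -N.
    return proc.wait(timeout=10)
```

On Linux the returned value is `-9`, matching the exit code the new CRITICAL log keys on.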