diogosilva30 commented on PR #65943: URL: https://github.com/apache/airflow/pull/65943#issuecomment-4439886330
@wjddn279 Hey, just checked our staging pods to answer this properly.

On one of our workers today (`airflow-worker-6998fbdf9c-mpxrd`, been up 23h), I pulled the logs and found three SIGABRT crashes:

```
2026-05-13T04:41:22.161535Z [info ] Task finished [supervisor] duration=1.5012219889904372 exit_code=<Negsignal.SIGABRT: -6> final_state=failed loc=supervisor.py:2109 task_instance_id=019e1fa3-5c78-712e-bc70-c68310f34cd0
2026-05-13T04:57:57.514458Z [info ] Task finished [supervisor] duration=1.675403744011419 exit_code=<Negsignal.SIGABRT: -6> final_state=failed loc=supervisor.py:2109 task_instance_id=019e1fb2-a889-7646-ac6f-897bcb8e214f
2026-05-13T06:17:03.839441Z [info ] Task finished [supervisor] duration=1.4628011609893292 exit_code=<Negsignal.SIGABRT: -6> final_state=failed loc=supervisor.py:2109 task_instance_id=019e1ff9-6c08-7b8b-8c66-e4a6e45bcfc2
```

Under 2 seconds probably means the child never actually ran. It deadlocked on an import lock right at startup and got aborted.

On the "gradually decreases" question, I see where you're coming from, but the problem here isn't plugin imports in user code. The warning fires on every single fork throughout the worker's lifetime because `supervisor.py` itself keeps a thread pool alive for async log pushing (`aiofiles`/anyio). Add GCS credential refreshes and Secret Manager calls on every task completion and you've got live threads on basically every fork. The race doesn't go away after warmup.

On memory, fair point, the numbers are real. That's exactly why we made it opt-in rather than changing the default. People who aren't hitting this keep the existing behaviour.
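For anyone who wants to check their own workers for the same pattern, here is a rough sketch of how the log lines quoted above could be filtered. It assumes the supervisor log format shown in this comment. The helper name `find_fast_aborts` and the 2-second cutoff are illustrative choices of mine, not anything Airflow ships.

```python
import re

# Assumption: lines follow the "Task finished" format quoted in the comment above.
# The pattern and helper below are hypothetical, not part of Airflow.
LINE_RE = re.compile(
    r"^(?P<timestamp>\S+)\s.*?"
    r"duration=(?P<duration>[0-9.]+)\s+"
    r"exit_code=<Negsignal\.(?P<signal>\w+):\s*(?P<code>-?\d+)>"
    r".*?task_instance_id=(?P<task_instance_id>[0-9a-f-]+)"
)


def find_fast_aborts(lines, max_duration=2.0):
    """Return (timestamp, task_instance_id, duration) for quick SIGABRT exits."""
    hits = []
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # not a signal-terminated "Task finished" line
        duration = float(match["duration"])
        if match["signal"] == "SIGABRT" and duration < max_duration:
            hits.append((match["timestamp"], match["task_instance_id"], duration))
    return hits


# The three lines from the worker log quoted above.
SAMPLE = [
    "2026-05-13T04:41:22.161535Z [info ] Task finished [supervisor] "
    "duration=1.5012219889904372 exit_code=<Negsignal.SIGABRT: -6> "
    "final_state=failed loc=supervisor.py:2109 "
    "task_instance_id=019e1fa3-5c78-712e-bc70-c68310f34cd0",
    "2026-05-13T04:57:57.514458Z [info ] Task finished [supervisor] "
    "duration=1.675403744011419 exit_code=<Negsignal.SIGABRT: -6> "
    "final_state=failed loc=supervisor.py:2109 "
    "task_instance_id=019e1fb2-a889-7646-ac6f-897bcb8e214f",
    "2026-05-13T06:17:03.839441Z [info ] Task finished [supervisor] "
    "duration=1.4628011609893292 exit_code=<Negsignal.SIGABRT: -6> "
    "final_state=failed loc=supervisor.py:2109 "
    "task_instance_id=019e1ff9-6c08-7b8b-8c66-e4a6e45bcfc2",
]

if __name__ == "__main__":
    for timestamp, task_id, duration in find_fast_aborts(SAMPLE):
        print(f"{timestamp} {task_id} aborted after {duration:.2f}s")
```

Something like `kubectl logs <pod> | python this_script.py` would need the script adapted to read stdin. The point is only that a short duration combined with `SIGABRT` is easy to grep for, so others can confirm whether they are seeing the same thing.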
