JzhJay opened a new issue, #54539:
URL: https://github.com/apache/airflow/issues/54539
### Apache Airflow version
Other Airflow 2 version (please specify below)
### If "Other Airflow 2 version" selected, which one?
2.8.4
### What happened?
My Airflow is deployed via k8s in containerized form:
NAME                                 READY   STATUS    RESTARTS       AGE
airflow-scheduler-589f94f6cc-q97hx   2/2     Running   0              13d
airflow-statsd-5fc99b9fc9-bmscm      1/1     Running   10 (69d ago)   207d
airflow-triggerer-5cdd4444fd-l6tpc   2/2     Running   1 (54d ago)    108d
airflow-webserver-664c79778c-ggmcb   1/1     Running   79 (45m ago)   190d
As you can see, the airflow-webserver pod has restarted many times since the
deployment was created. We recently discovered that these restarts most likely
share a single cause: a communication problem between Airflow and the LDAP
server.
The exit logs below, extracted from Elasticsearch, show that the Airflow
webserver shuts down the gunicorn service right after raising the error
{'result': -1, 'desc': "Can't contact LDAP server", 'errno': 4, 'ctrls': [],
'info': 'Interrupted system call'}. However, I have found nothing abnormal on
the LDAP server: there are no error logs, and networking is fine as well.
Could anyone give me a clue for debugging this webserver restart problem? I'm
also curious why the Airflow webserver is so sensitive to LDAP. Even if there
is a communication problem, the webserver should not kill itself; this doesn't
make sense to me.
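The error quoted above carries 'errno': 4. As a minimal sketch (the helper name `describe_errno` is hypothetical and exists only for illustration), Python's standard `errno` module can translate that number into its symbolic name. On Linux, 4 is EINTR, meaning a blocking system call was interrupted by a signal. This may suggest, without proving it, that the LDAP call was cut short by a signal delivered to the process rather than failing because the LDAP server was unreachable.

```python
import errno
import os


def describe_errno(code: int) -> str:
    """Return the symbolic name and the OS message for an errno value."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name}: {os.strerror(code)}"


# The dict reported by the webserver, copied from the error quoted above.
ldap_error = {
    "result": -1,
    "desc": "Can't contact LDAP server",
    "errno": 4,
    "ctrls": [],
    "info": "Interrupted system call",
}

print(describe_errno(ldap_error["errno"]))
```

If the printed name is EINTR, it is worth checking what sent a signal to the webserver process at that moment.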
### What you think should happen instead?
_No response_
### How to reproduce
Aug 15, 2025 @ 16:31:12.369 | [Airflow ASCII-art startup banner]
Aug 15, 2025 @ 16:31:12.369 | Running the Gunicorn Server with:
Aug 15, 2025 @ 16:31:12.369 | Workers: 4 sync
Aug 15, 2025 @ 16:31:12.369 | Host: 0.0.0.0:8080
Aug 15, 2025 @ 16:31:12.369 | Timeout: 120
Aug 15, 2025 @ 16:31:12.369 | Logfiles: - -
Aug 15, 2025 @ 16:31:12.369 | Access Logformat:
Aug 15, 2025 @ 16:31:12.369 | =================================================================
Aug 15, 2025 @ 16:31:12.369 | [2025-08-15T08:31:12.366+0000] {webserver_command.py:429} INFO - Received signal: 15. Closing gunicorn.
Aug 15, 2025 @ 16:31:12.367 | [2025-08-15 08:31:12 +0000] [13] [INFO] Handling signal: term
Aug 15, 2025 @ 16:31:12.367 | [2025-08-15T08:31:12.366+0000] {override.py:2048} ERROR - {'result': -1, 'desc': "Can't contact LDAP server", 'errno': 4, 'ctrls': [], 'info': 'Interrupted system call'}
Aug 15, 2025 @ 16:31:12.367 | [2025-08-15T08:31:12.366+0000] {override.py:2048} ERROR - {'result': -1, 'desc': "Can't contact LDAP server", 'errno': 4, 'ctrls': [], 'info': 'Interrupted system call'}
Aug 15, 2025 @ 16:31:12.367 | [2025-08-15T08:31:12.366+0000] {override.py:2048} ERROR - {'result': -1, 'desc': "Can't contact LDAP server", 'errno': 4, 'ctrls': [], 'info': 'Interrupted system call'}
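The log reports "Received signal: 15" at the same timestamp as the LDAP errors. As a small sketch (the helper name `signal_name` is hypothetical), Python's standard `signal` module can map that number to its name. On Linux, 15 is SIGTERM, a termination request normally sent by something outside the process. One possibility, which these logs alone do not confirm, is that the signal came from outside the webserver (for example from the container runtime on a pod restart) and that the LDAP errors are a side effect of the shutdown.

```python
import signal


def signal_name(number: int) -> str:
    """Return the symbolic name of a signal number, if the platform defines it."""
    try:
        return signal.Signals(number).name
    except ValueError:
        # The number is not a signal known on this platform.
        return f"unknown signal {number}"


# 15 is the value reported in the "Received signal" log line.
print(signal_name(15))
```

Comparing the time of this signal with the pod's last termination reason and probe events in Kubernetes may help narrow down who sent it.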
### Operating System
Ubuntu 22.04
### Versions of Apache Airflow Providers
_No response_
### Deployment
Official Apache Airflow Helm Chart
### Deployment details
_No response_
### Anything else?
_No response_
### Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
### Code of Conduct
- [x] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)