stepanof commented on issue #24731: URL: https://github.com/apache/airflow/issues/24731#issuecomment-1322123564
@potiuk Thank you for such a detailed comment, I see your point.

I have one more question. Sometimes, while the virtual IP and the Postgres endpoint are changing, airflow-worker tries to restart by itself (not via the 'autoheal' service). But it can't restart because of this error:

```
[2022-11-18 18:39:34 +0300] [42] [INFO] Starting gunicorn 20.1.0
[2022-11-18 18:39:34 +0300] [42] [INFO] Listening at: http://[::]:8793 (42)
[2022-11-18 18:39:34 +0300] [42] [INFO] Using worker: sync
[2022-11-18 18:39:35 +0300] [43] [INFO] Booting worker with pid: 43
[2022-11-18 18:39:35 +0300] [44] [INFO] Booting worker with pid: 44
[2022-11-18 18:39:35 +0300] [42] [INFO] Handling signal: term
ERROR: Pidfile (/opt/airflow/airflow-worker.pid) already exists. Seems we're already running? (pid: 1)
[2022-11-18 18:39:35 +0300] [43] [INFO] Worker exiting (pid: 43)
[2022-11-18 18:39:35 +0300] [44] [INFO] Worker exiting (pid: 44)
[2022-11-18 18:39:35 +0300] [42] [INFO] Shutting down: Master
```

The worker can keep restarting endlessly, failing with this error each time. Manually restarting the worker (`docker-compose down && docker-compose up`) fixes the problem (`/opt/airflow/airflow-worker.pid` gets deleted).

Why isn't `/opt/airflow/airflow-worker.pid` deleted during the automatic worker restart? During this endless restart loop, the worker container never reaches the "unhealthy" state (because it dies immediately), so 'autoheal' doesn't detect that the worker should be rebooted. Is it possible to fix this?
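To illustrate the kind of cleanup I would expect on restart, here is a minimal sketch of a stale-pidfile check that could run before the worker starts. It is only an assumption about what would help, not something Airflow or Celery ships: `pid_is_running` and `remove_stale_pidfile` are hypothetical helper names. My guess is that the file survives because the container is restarted rather than recreated, and that the recorded PID 1 always exists in the restarted container, so the sketch also treats "the recorded PID is my own PID" as stale.

```python
import os
from pathlib import Path


def pid_is_running(pid: int) -> bool:
    """Return True if a process with this PID exists (POSIX only)."""
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 only checks existence, nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def remove_stale_pidfile(pidfile: Path) -> bool:
    """Delete the pidfile if it looks stale; return True if it was removed."""
    try:
        text = pidfile.read_text().strip()
    except FileNotFoundError:
        return False  # nothing to clean up

    try:
        pid = int(text)
    except ValueError:
        pid = None  # unreadable contents are treated as stale

    # A recorded PID equal to our own cannot belong to another running
    # worker: it must be left over from an earlier run in this container.
    if pid is None or pid == os.getpid() or not pid_is_running(pid):
        pidfile.unlink(missing_ok=True)
        return True
    return False
```

A wrapper could call `remove_stale_pidfile(Path("/opt/airflow/airflow-worker.pid"))` right before launching the worker. Note the limitation: if the wrapper itself is not the process whose PID is recorded in the file (for example, it does not run as PID 1), a live PID 1 still makes the file look valid, and an unconditional delete at container start may be the simpler option.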
