stepanof commented on issue #24731:
URL: https://github.com/apache/airflow/issues/24731#issuecomment-1322123564

   @potiuk Thank you for such an extended comment, I see your point.
   
   I have one more question.
   Sometimes, when the virtual IP and the Postgres endpoint change, airflow-worker 
tries to restart by itself (not via the 'autoheal' service). 
   But it can't restart because of this error:
   ```
   [2022-11-18 18:39:34 +0300] [42] [INFO] Starting gunicorn 20.1.0
   [2022-11-18 18:39:34 +0300] [42] [INFO] Listening at: http://[::]:8793 (42)
   [2022-11-18 18:39:34 +0300] [42] [INFO] Using worker: sync
   [2022-11-18 18:39:35 +0300] [43] [INFO] Booting worker with pid: 43
   [2022-11-18 18:39:35 +0300] [44] [INFO] Booting worker with pid: 44
   [2022-11-18 18:39:35 +0300] [42] [INFO] Handling signal: term
   ERROR: Pidfile (/opt/airflow/airflow-worker.pid) already exists.
   Seems we're already running? (pid: 1)
   [2022-11-18 18:39:35 +0300] [43] [INFO] Worker exiting (pid: 43)
   [2022-11-18 18:39:35 +0300] [44] [INFO] Worker exiting (pid: 44)
   [2022-11-18 18:39:35 +0300] [42] [INFO] Shutting down: Master
   ```
   
   The airflow-worker restart can repeat endlessly, and this error appears 
each time.
   A manual restart of the worker (`docker-compose down && docker-compose up`) fixes 
the problem (`/opt/airflow/airflow-worker.pid` gets deleted).
   
   Why isn't `/opt/airflow/airflow-worker.pid` deleted during an automatic 
worker restart?
   
   During such an endless automatic restart loop, the worker container never 
enters the "unhealthy" state (because it dies immediately), so 'autoheal' 
doesn't detect that the worker should be restarted. 
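
One workaround I can imagine is clearing the leftover pidfile before the worker starts. Below is a minimal sketch of that idea, not something Airflow ships. The function names `clear_leftover_pidfile` and `start_worker` are hypothetical, and the sketch assumes it runs once at container start, before any worker process exists. It removes the file unconditionally because the error reports `pid: 1`, which in a restarted container is presumably the new main process, so checking whether that PID is alive would not reveal that the file is stale.

```python
import os
from pathlib import Path

# Path taken from the error message above; adjust if your setup differs.
PIDFILE = Path("/opt/airflow/airflow-worker.pid")


def clear_leftover_pidfile(pidfile: Path = PIDFILE) -> bool:
    """Delete a pidfile left behind by a previous run.

    Returns True if a file was removed, False if there was nothing to remove.
    Assumption: this is called once at container start, before any worker
    exists, so any pidfile found at this point must be a leftover.
    """
    try:
        pidfile.unlink()
        return True
    except FileNotFoundError:
        return False


def start_worker() -> None:
    """Replace the current process with the Celery worker (hypothetical helper).

    os.execvp does not return on success, so the worker takes over this
    process instead of running as a child.
    """
    os.execvp("airflow", ["airflow", "celery", "worker"])
```

A wrapper entrypoint could call `clear_leftover_pidfile()` and then `start_worker()`. Whether this is safe depends on the pidfile never being shared with another running worker, which I have not verified.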
   
   Is it possible to fix it?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
