michaelosthege commented on PR #58869:
URL: https://github.com/apache/airflow/pull/58869#issuecomment-3820695349

   > @michaelosthege hypothesis ... if not duplicate hostname could it be that 
the duplicate hostname check sometimes needs longer running into timeout as 
well? Not sure about a specific error message...
   
   No idea how long that code takes to execute, but we had tried increasing 
`celery.concurrency.asynpool.PROC_ALIVE_TIMEOUT` to 60 seconds and all it did 
was delay the killing. If the hostname check had raised `SystemExit`, the child 
process must already have been dead at that point.
   We never saw a log message from those worker child processes. The check from 
https://github.com/apache/airflow/pull/58591 also appears to run before logging 
is configured.
   
   How does the UP message travel to the parent? Via the message broker?
   
   We also observed a "clocks out of sync" warning in the logs, but with the 
worker and Redis containers running on the same host, that doesn't make any 
sense.

