michaelosthege commented on PR #58869:
URL: https://github.com/apache/airflow/pull/58869#issuecomment-3820695349
> @michaelosthege hypothesis ... if not duplicate hostname could it be that the duplicate hostname check sometimes needs longer running into timeout as well? Not sure about a specific error message...

No idea how long that code takes to execute, but we tried increasing `celery.concurrency.asynpool.PROC_ALIVE_TIMEOUT` to 60 seconds, and all it did was delay the killing.

If the hostname check exited via `SystemExit`, the child process must already have been dead at that point. We never saw a log message from those worker child processes. The check from https://github.com/apache/airflow/pull/58591 also appears to run before logging is configured.

How does the UP message travel to the parent? Via the message broker?

We also observed a "clocks out of sync" warning in the logs, but with the worker and Redis containers running on the same host, that doesn't make any sense.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
