potiuk commented on PR #27214:
URL: https://github.com/apache/airflow/pull/27214#issuecomment-1293437980
> Not related to this PR but a bit annoying things.
>
> Is anyone know what might be a nature of this CI error which happen time to time? Is another it another `airflow-test-integration_trino_1` might run in the same time in worker CI?
>
> ```
> Host is already in use by another container
> Creating airflow-test-integration_trino_1 ... error
>
> ERROR: for airflow-test-integration_trino_1  Cannot start service trino: driver failed programming external connectivity on endpoint airflow-test-integration_trino_1 (9718b96633b0c3913bc614cb7fc4577496eae3ae9a0d5872f8aa14333f8816b8): Error starting userland proxy: listen tcp4 0.0.0.0:38080: bind: address already in use
> ```

I have been chasing that one for a long time, and I was never able to form a plausible hypothesis about why it happens or to implement a workaround. Any ideas/input are more than welcome.

This happens intermittently, which makes it very difficult to diagnose, and I was never able to replicate it locally. It is extremely annoying to see. Theoretically it should not happen on GitHub public runners: we should get a clean public runner every time we run a job there, and a busy port is not something that should happen.

What I THINK happens is that docker-compose, which runs several services, hits some race condition when starting our integration tests (with multiple containers), or runs into some resource problem (memory, open sockets, etc.).

One idea I have is that we might eventually want to make some exclusions for the integration tests, and maybe run them on just a single runner (sqlite? on public runners). I might experiment with it and make a PR for that after I re-run those failures.

It's interesting to see it happen in 3 jobs out of 4, as was the case in your build. Theoretically those are independent runners, and yet all of them failed at about the same time; only sqlite "did it".
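
For anyone who wants to gather evidence on the race hypothesis, here is a minimal sketch of a probe that could run right before docker-compose starts the integration containers. It is not something we have in our CI scripts: `is_port_free` is a hypothetical helper, and the only value taken from the log above is the host port 38080. It gives a point-in-time answer only, so a "free" result does not rule out a race on its own.

```python
import socket


def is_port_free(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if a TCP socket can bind to (host, port) right now.

    Hypothetical diagnostic helper, not part of Airflow or Breeze.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically EADDRINUSE: something else already holds the port.
            return False
        return True


if __name__ == "__main__":
    # 38080 is the host port from the Trino error quoted above.
    port = 38080
    state = "free" if is_port_free(port) else "ALREADY IN USE"
    print(f"Port {port} is {state}")
```

If the probe ever reported the port as busy on a fresh runner, that would point at something on the runner itself rather than at our compose setup.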

I will restart those jobs and see whether the failure is reproduced or whether (as usual) it is a race/intermittent failure.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
