potiuk edited a comment on pull request #14531: URL: https://github.com/apache/airflow/pull/14531#issuecomment-798906062
OK. Looks like I got it under control. It will fail occasionally with exit code 137, but in a similar way as in the previous setup.

Summary: it seems that once we are done, the whole test suite will complete in ~13 minutes rather than the current ~53 minutes, and if you have a powerful local machine, you will be able to run the whole test suite in less than 6 minutes.

What I have done now is add a bit more `smarts`:

a) I check how much memory and how many CPUs are available in the Docker engine rather than on the host. Those might differ, especially on Mac.

b) I run most tests in parallel (as many parallel test runs as we have CPUs), but if less than ~32 GB of RAM is available in the Docker engine, I do not run Integration tests in parallel with the other tests. They are run separately, at the end, after cleaning up the Docker engine remnants.

c) If we have more than 32 GB of memory in the Docker engine, we run everything in parallel.

The current numbers (already improved by the fantastic @jhtimmins optimisations in the WWW/API tests):

## Public GitHub runners:

* Before parallelization: ~53 minutes
* After parallelization (2 parallel test streams + Integration tests sequentially): ~34 minutes

**Overall**: We save 20 minutes per test suite, or a ~40% (!) speedup (for the whole test suite)

## Self-hosted runners with 4 Cores/32 GB RAM + tmpfs (current setup)

* Before parallelization: ~42 minutes
* After parallelization (4 parallel test streams + Integration tests sequentially): ~24 minutes

**Overall**: We save 18 minutes per test suite, or a ~43% (!) speedup (for the whole test suite)

## Future self-hosted runners with 8 Cores/64 GB RAM + tmpfs

These are estimates based on earlier measurements on AWS, where I temporarily switched our templates to use such machines.

* Before parallelization: ~40 minutes
* After parallelization (8 parallel test streams): ~13 minutes

**Overall**: We save 12 minutes vs the 4 core/32 GB machines, which is an ADDITIONAL improvement of almost ~50% (!) (for the whole test suite)

I think this is a really nice auto-detecting setup which will not only help with CI but will also let developers with powerful machines run the full test suite locally.

The full test suite on my machine, with 8 CPUs (16 cores) + 64 GB RAM + no tmpfs (but a very fast SSD disk), takes exactly 5 (!!!!) minutes. So we can still improve it.
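
To make the a)/b)/c) decision above concrete, here is a minimal Python sketch of the idea. It is not the code from this PR: the function names, the `RunPlan` structure, and the test type names other than Integration are hypothetical. It also treats the ~32 GB threshold as a hard cut-off, and the `docker info` query is just one possible way to read the engine's resources.

```python
import subprocess
from dataclasses import dataclass
from typing import List, Tuple

# Approximate threshold from the comment; treating it as a hard cut-off is an assumption.
MEMORY_THRESHOLD_GB = 32


@dataclass
class RunPlan:
    parallel_streams: int        # number of parallel test runs
    parallel_types: List[str]    # test types run in parallel
    sequential_types: List[str]  # test types run separately, at the end


def docker_engine_resources() -> Tuple[int, float]:
    """Return (cpus, memory_gb) as seen by the Docker engine, not the host."""
    result = subprocess.run(
        ["docker", "info", "--format", "{{.NCPU}} {{.MemTotal}}"],
        check=True, capture_output=True, text=True,
    )
    cpus, mem_bytes = result.stdout.split()
    return int(cpus), int(mem_bytes) / 1024 ** 3


def plan_test_runs(cpus: int, memory_gb: float, test_types: List[str]) -> RunPlan:
    """Decide which test types run in parallel and which run at the end."""
    if memory_gb > MEMORY_THRESHOLD_GB:
        # Enough memory: everything, Integration included, runs in parallel.
        parallel, sequential = list(test_types), []
    else:
        # Not enough memory: Integration tests are held back and run last.
        parallel = [t for t in test_types if t != "Integration"]
        sequential = [t for t in test_types if t == "Integration"]
    # One parallel stream per CPU available in the Docker engine.
    return RunPlan(max(1, cpus), parallel, sequential)
```

With these assumptions, `plan_test_runs(4, 32, [...])` keeps Integration tests sequential, while `plan_test_runs(8, 64, [...])` runs everything in parallel, which matches the two self-hosted runner setups described above.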
