potiuk edited a comment on pull request #14531:
URL: https://github.com/apache/airflow/pull/14531#issuecomment-798906062


   OK, looks like I got it under control. It will still fail occasionally with 
exit code 137, but in a similar way as in the previous setup.
   
   Summary: it seems that when we are done, the whole test suite will be able 
to complete in ~ 13 minutes rather than the current ~ 53 minutes, and if you 
have a powerful local machine, you will be able to run the whole test suite in 
less than 6 minutes.
   
   
   What I have done now is I added a bit more `smarts`:
   
   a) I check how much memory and how many CPUs are available not on the host 
but in the Docker engine. Those might differ - especially on Mac.
   
   b) I run most tests in parallel (as many parallel test runs as we have 
CPUs), but if less than ~ 32 GB of RAM is available in the Docker engine, I do 
not run Integration tests in parallel with other tests - they are run 
separately, at the end, after cleaning up the Docker engine remnants.
   
   c) If we have > 32 GB of memory in the Docker engine, we run everything in 
parallel.
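
A minimal sketch of the decision logic in a) to c), not the actual CI script. 
The names `parse_docker_info`, `docker_engine_resources`, `plan_tests` and 
`TestPlan` are hypothetical, and treating exactly 32 GiB as "enough" is an 
assumption, since the thresholds above are only approximate. It reads the CPU 
count and memory from `docker info` (the `NCPU` and `MemTotal` fields) rather 
than from the host.

```python
import subprocess
from dataclasses import dataclass
from typing import Tuple

GIB = 1024 ** 3
# Assumption: the "~ 32 GB" figure is interpreted as 32 GiB.
INTEGRATION_PARALLEL_MIN_BYTES = 32 * GIB


@dataclass(frozen=True)
class TestPlan:
    parallel_streams: int          # number of parallel test runs
    integration_in_parallel: bool  # False: run Integration tests separately, at the end


def parse_docker_info(output: str) -> Tuple[int, int]:
    """Parse '<NCPU> <MemTotal in bytes>' as printed by docker info."""
    cpus, mem_bytes = output.split()
    return int(cpus), int(mem_bytes)


def docker_engine_resources() -> Tuple[int, int]:
    """Ask the Docker engine (not the host) for its CPU count and memory."""
    result = subprocess.run(
        ["docker", "info", "--format", "{{.NCPU}} {{.MemTotal}}"],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_docker_info(result.stdout)


def plan_tests(cpus: int, mem_bytes: int) -> TestPlan:
    """One test stream per CPU; Integration tests join only with enough RAM."""
    return TestPlan(
        parallel_streams=max(1, cpus),
        integration_in_parallel=mem_bytes >= INTEGRATION_PARALLEL_MIN_BYTES,
    )
```

With a running Docker engine, `plan_tests(*docker_engine_resources())` would 
give the plan for the current machine.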
   
   The current numbers (which already improved after the fantastic 
@jhtimmins optimisations in the WWW/API tests):
   
   ## Public GitHub runners
   
   * Before parallelization: ~ 53 minutes
   * After parallelization (2 parallel test streams + Integration tests 
sequentially): ~ 34 minutes
   
   **Overall**: We save about 20 minutes per test suite, a ~ 40% (!) speedup 
(for the whole test suite)
   
   ## Self-hosted runners with 4 Cores/32 GB RAM + tmpfs (current setup)
   
   * Before parallelization: ~ 42 minutes
   * After parallelization (4 parallel test streams + Integration tests 
sequentially): ~ 24 minutes
   
   **Overall**: We save 18 minutes per test suite, a ~ 43% speedup (!) (for the 
whole test suite)
   
   ## Future self-hosted runners with 8 Cores/64 GB RAM + tmpfs
   
   Those are estimates based on earlier measurements on AWS, where I 
temporarily switched our templates to use such machines.
   
   * Before parallelization: ~ 40 minutes
   * After parallelization (8 parallel test streams): ~ 13 minutes
   
   **Overall**: We save about 12 minutes vs. the 4 core/32 GB machines, which 
is an ADDITIONAL improvement of almost 50% (!) (for the whole test suite)
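
For reference, the savings can be recomputed from the approximate timings 
listed in the three sections. This is only arithmetic on those rounded 
numbers (the helper name `savings` is made up for the example), so the results 
come out slightly below the rounded figures quoted in the "Overall" lines.

```python
def savings(before_min: float, after_min: float):
    """Return (minutes saved, percent of the 'before' time saved)."""
    saved = before_min - after_min
    return saved, 100 * saved / before_min


# (label, minutes before, minutes after), taken from the timings listed above
comparisons = [
    ("Public GitHub runners", 53, 34),
    ("Self-hosted 4 cores/32 GB", 42, 24),
    ("8 cores vs 4 cores, both parallelized", 24, 13),
]

for label, before, after in comparisons:
    saved, percent = savings(before, after)
    print(f"{label}: {saved} min saved ({percent:.0f}%)")
```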
   
   I think this is a really nice auto-detecting setup, which will not only help 
with CI but will also allow developers who have powerful machines to run the 
full test suite locally. The full test suite on my machine (8 CPUs (16 cores) + 
64 GB RAM + no tmpfs, but a very fast SSD disk) takes exactly 5 (!!!!) minutes.
   
   So we can still improve it. 
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

