The GitHub Actions job "Tests" on airflow.git has failed.
Run started by GitHub user potiuk (triggered by potiuk).

Head commit for run:
7607b26b4584568a8eb5249cfa8145e2500fd79b / Jarek Potiuk <[email protected]>
Optimize parallel test execution for unit tests

We run the tests in parallel, one test type per process, in order
to speed up their execution. However, some test types and subsets
of tests take far longer to execute than other test types.

The longest tests to run are Providers and WWW tests, and the
longest tests from Providers are by far Amazon tests, then
Google. "All Other" Provider tests take about the same time
as Amazon tests. Once the Provider tests are split,
Core tests take the longest time.

When we are running tests in parallel on multiple CPUs, often
the longest running tests remain running on their own while the
other CPUs are not busy. We could run a separate test type
per provider, but the overhead of starting the database and collecting
and initializing tests for them is too big for it to achieve
speedups - especially for Public runners, having 80 separate
databases with 80 subsequent container runs is slower than
running all Provider tests together.

However, we can split the Provider tests into a smaller number of
chunks and prioritize running the long chunks first. This
should improve the effect of parallelisation and improve utilization of
our multi-CPU machines.
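
To illustrate why running the long chunks first helps, here is a minimal
sketch of greedy scheduling on a fixed number of CPUs. The durations are
made-up numbers chosen for illustration, not measurements from our CI,
and makespan() is a hypothetical helper, not part of Airflow.

```python
import heapq


def makespan(durations, workers):
    """Total wall time when each job goes to the worker that frees up first."""
    finish_times = [0] * workers
    heapq.heapify(finish_times)
    for duration in durations:
        earliest = heapq.heappop(finish_times)
        heapq.heappush(finish_times, earliest + duration)
    return max(finish_times)


# Hypothetical durations in minutes (assumed values, not measured ones).
minutes = {"Always": 2, "API": 3, "CLI": 3, "WWW": 5, "Providers": 10}

unsorted_time = makespan(list(minutes.values()), workers=2)
longest_first_time = makespan(sorted(minutes.values(), reverse=True), workers=2)
print(unsorted_time, longest_first_time)
```

With these assumed numbers, starting the longest test type last leaves one
CPU idle while it finishes, whereas starting it first lets the short test
types fill the other CPU in the meantime.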

This PR aims to do that:

* Split Provider tests (if amazon or google are part of the
  provider tests) into amazon, google, all-other chunks

* Move sorting of the test types to selective_check, to sort the
  test types according to expected longest running time (the longest
  tests to run are added first)
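
The two steps above could look roughly like the sketch below. This is
not the actual selective_check code: the function names, the chunk labels
and the expected durations are all assumptions made for illustration.

```python
def split_provider_chunks(providers):
    """Split the selected providers into amazon, google and all-other chunks.

    The chunk labels are illustrative, not the real test type names.
    """
    big = [name for name in ("amazon", "google") if name in providers]
    if not big:
        # Neither amazon nor google is selected: keep a single chunk.
        return ["Providers"]
    chunks = [f"Providers[{name}]" for name in big]
    if any(name not in big for name in providers):
        chunks.append("Providers[all-other]")
    return chunks


def order_longest_first(test_types, expected_minutes):
    """Sort test types so the ones expected to run longest come first."""
    return sorted(
        test_types,
        key=lambda test_type: expected_minutes.get(test_type, 0),
        reverse=True,
    )


# Hypothetical expected durations in minutes (assumed, not measured).
expected = {
    "Core": 12,
    "Providers[amazon]": 10,
    "Providers[all-other]": 10,
    "Providers[google]": 8,
    "API": 3,
}
test_types = ["API", "Core"] + split_provider_chunks(["amazon", "google", "http"])
print(order_longest_first(test_types, expected))
```

Test types missing from the expected-duration table sort last here, which
is one possible default; the real ordering may treat them differently.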

This should improve the CPU utilization of our multi-CPU runners
and make the tests involving the complete Provider set (or even sets
containing amazon, google and a few other providers)
execute quite a few minutes faster on average.

We could also get rid of some sequential processing for the Public PRs
because each test type we will run will be less demanding overall. We
used to get a lot of 137 exit codes (memory errors) but with Providers
split out, the risk of two test types running in parallel
exhausting resources is low.

Report URL: https://github.com/apache/airflow/actions/runs/4735571841

With regards,
GitHub Actions via GitBox

