BTW. I also plan to add a short "best practices" chapter to our TESTING.rst
to deal with some of those "special" cases, based on the lessons learned
from that whole experience.

On Thu, Oct 26, 2023 at 3:46 PM Jarek Potiuk <ja...@potiuk.com> wrote:

> All right.
>
> I think I am getting closer to getting it ready and there are indeed
> significant savings it seems.
>
> I believe we can achieve some ~40-50% improvement overall in the CPU
> time needed to run the tests, and likely some 30%-40% improvement in
> waiting for the build to complete (note that this only impacts the "unit
> tests" we had - there are other things like docs building, etc.).
>
> Rough numbers: we have *~9200* non-DB tests and *~7300* DB tests, so ~*55%
> *of our tests are non-DB tests. On my (fast, 16-processor) machine all
> non-DB tests complete in ~4 minutes with xdist (parallel test running). I
> think we can easily get below 10 minutes per test suite on self-hosted
> runners (today 16-25 minutes). Probably a bit more than ~2x that for
> public runners (but many regular PRs already run a subset of tests - and
> those will also get 2x faster).
>
> I will have more details available once I solve the last few issues, but I
> marked the PR as ready for review and am looking for comments:
> https://github.com/apache/airflow/pull/35160 - the change is quite big (>
> 400 files) but most of it is either changes in breeze/CI env scripts or
> one-line changes marking a test/class/module as a db-test.
>
> There are a few caveats which I would love potential reviewers to take a
> look at, because I had to solve a few issues and maybe there are better
> ways:
>
> * A few tests initialized the DB during test collection, which made
> marker-based exclusion ineffective. I had to adapt those tests slightly to
> accommodate it - there are a few ways you can do it, I tried several of
> them, and would love comments there
> * For non-DB tests we are now using pytest-xdist to achieve a high degree
> of parallelism, and it has a few challenges:
>  - I had to modify some parameterized tests to get a stable sequence of
> parameters
>  - some of our non-DB tests are not "ready" for xdist - they rely on some
> shared state (files etc.) and will randomly fail when tests are running in
> parallel. I am tracking those down and marking them as "DB tests" to make
> them run sequentially - but we will likely have some flakes because of
> those that we will have to fight with (though that kind of flake can be
> easily circumvented by pytest flags that rerun flaky tests automatically)
> * I also moved most of the remaining test logic from bash scripts to
> breeze, so it will be easier to manage and easier for others to fix - I
> already added some "stability" behaviours in the PR.
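[Editor's illustration of the collection-time problem above: `@pytest.mark.parametrize` arguments are evaluated when pytest collects tests, so a DB call there runs even for tests that a marker would deselect. A minimal sketch of one possible fix - the names and the fake "query" are hypothetical, not taken from the actual PR - is to defer the lookup into a fixture:]

```python
import pytest

# Hypothetical illustration: parametrize() arguments are evaluated at
# collection time, so something like
#     @pytest.mark.parametrize("conn_id", get_connection_ids_from_db())
# would hit the DB during collection, even when the test itself is
# deselected by --skip-db-tests. Deferring the lookup into a fixture
# means the DB is only touched when a selected test actually runs.


@pytest.fixture
def connection_ids():
    # Stands in for a real DB query; runs at test time, not collection time.
    return ["aws_default", "postgres_default"]


@pytest.mark.db_test
def test_connections_exist(connection_ids):
    assert "postgres_default" in connection_ids
```

[As for the flaky-test reruns mentioned above: one common option - an assumption on the editor's part, not necessarily what the PR uses - is the `pytest-rerunfailures` plugin, which adds a `--reruns N` flag.]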
>
> Looking forward to reviews and to making it green and merged soon (we
> currently have quite a bit of instability in main that this change should
> address).
>
>
>
>
> On Tue, Oct 24, 2023 at 7:47 PM Aritra Basu <aritrabasu1...@gmail.com>
> wrote:
>
>> +1
>> Definitely a +1 from me, seems like a relatively small effort to get good
>> returns
>>
>> --
>> Regards,
>> Aritra Basu
>>
>> On Tue, Oct 24, 2023, 11:01 PM Vincent Beck <vincb...@apache.org> wrote:
>>
>> > +1 I like this one. I think it is totally worth adding this decorator,
>> > mostly because the effort is not huge. I also think that, as a
>> > contributor, it is a good exercise when writing tests to figure out
>> > whether the tests I am writing use the DB.
>> >
>> > On 2023/10/24 16:31:57 Jarek Potiuk wrote:
>> > > Hello everyone,
>> > >
>> > > *TL;DR; *I have a proposal on how we can probably significantly
>> > > decrease the time needed to run our CI tests - all at the expense of
>> > > a small (I think) effort: the need for contributors to mark tests as
>> > > "db-tests", enforced by our CI (in most cases that won't even be
>> > > needed).
>> > >
>> > > *A bit more context:*
>> > >
>> > > I have started to focus on AIP-44 work (internal API), and part of
>> > > this work is to make it possible to run a subset of our tests (related
>> > > to untrusted components and providers) in db-isolation mode, to see if
>> > > we have not forgotten anything (and to keep db-isolation mode working
>> > > in the future). Part of this task is to see how many tests we have
>> > > that actually use the DB. I had some interesting findings and thought
>> > > I would explore an idea to split our tests into DB and NON-DB tests.
>> > >
>> > > After some initial exploration I believe we can get at least 30-50%
>> > > reduction in our CI test execution time (and build test time) at the
>> > > expense of the necessity of marking tests as "db_tests" when they need
>> > > the DB.
>> > >
>> > > This topic has been brought up occasionally in the past - why do we
>> > > run our tests on multiple versions of multiple databases? The problem
>> > > is that many of our tests use the DB, and we need to test those on all
>> > > of the databases. But it also turns out we have quite a lot of tests
>> > > that do not use the DB. This - for example - does not allow us to use
>> > > "pytest-xdist" to run the tests in parallel freely; we have to use the
>> > > "custom" docker-based parallelism in CI.
>> > >
>> > > So the idea is that if we could separate DB from NON-DB tests, we
>> > > could run the DB tests on multiple databases, but run all the "non-db
>> > > tests" only once. It would only make sense, though, if we have a
>> > > significant number of non-DB tests, if we can clearly separate them,
>> > > and if it is easy to keep them separated in the future (so tests are
>> > > only marked as db_tests when they actually need the DB).
>> > >
>> > > *Findings:*
>> > >
>> > > I tried it out and found that:
>> > >
>> > > a) We have quite a significant number of tests that are non-DB tests
>> > > - likely between 30% and 50% by my estimate, both in core and in
>> > > providers. They are scattered among our code - mostly whole "modules"
>> > > are either db or non-db tests.
>> > >
>> > > b) It is rather easy to separate db from non-db tests without any
>> > > structural change to our test layout (just by using a custom
>> > > @pytest.mark.db_test marker - at test, class, or even module level).
>> > > So we can easily handle cases where a whole module is "mostly non-db"
>> > > and only a few test cases are "db" - there are some cases like that,
>> > > and the custom markers are rather nice to help with them.
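[Editor's note: for readers less familiar with pytest markers, the three levels mentioned above look roughly like this. The test names are made up for illustration, normally you would pick one level per file, and a custom `db_test` marker would also need to be registered (e.g. in `pytest.ini`) to avoid unknown-marker warnings:]

```python
import pytest

# Module level: every test in this file becomes a DB test.
pytestmark = pytest.mark.db_test


# Class level: all tests in the class become DB tests.
@pytest.mark.db_test
class TestSerializedDagRoundtrip:
    def test_roundtrip(self):
        ...


# Test level: only this single test is a DB test.
@pytest.mark.db_test
def test_dag_is_persisted():
    ...
```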
>> > >
>> > > c) It should be easy to keep them separated in the future with only
>> > > very little overhead. Basically, when your test starts failing in CI
>> > > and tells you "mark your test with @pytest.mark.db_test", you will
>> > > have to do it - but in most cases you won't even need to, because your
>> > > test will be added to a module that is already marked as a db_test
>> > > module.
>> > >
>> > > d) We can use pytest-xdist for parallel execution of non-db tests,
>> > > which will make the "non-db" test suites much more straightforward -
>> > > basically `pytest tests --skip-db-tests -n auto` should do the trick.
>> > > And you could do the same locally as well.
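[Editor's note: `--skip-db-tests` is a custom option, not a built-in pytest flag. A minimal `conftest.py` sketch of how such a flag could be wired up - a guess at the mechanism, not necessarily how the actual PR implements it:]

```python
# conftest.py - a minimal sketch of a custom --skip-db-tests flag,
# assuming DB tests are marked with @pytest.mark.db_test.
import pytest


def pytest_addoption(parser):
    parser.addoption(
        "--skip-db-tests",
        action="store_true",
        default=False,
        help="skip all tests marked with @pytest.mark.db_test",
    )


def pytest_collection_modifyitems(config, items):
    if not config.getoption("--skip-db-tests"):
        return
    skip_db = pytest.mark.skip(reason="--skip-db-tests was passed")
    for item in items:
        # item.keywords includes the names of all markers applied to a test.
        if "db_test" in item.keywords:
            item.add_marker(skip_db)
```

[With something like that in place, `pytest tests --skip-db-tests -n auto` runs only the non-DB tests, spread across all cores by pytest-xdist.]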
>> > >
>> > > *Impact on contributors:*
>> > >
>> > > There is absolutely no impact on running the tests locally. Non-db
>> > > tests will run when the DB is there, no problem with that. I think in
>> > > the vast majority of cases it will mean no change for regular
>> > > contributors. Only occasionally (when a new test module is added, for
>> > > example) will their tests fail in CI, with very clear instructions
>> > > from the CI to mark their tests with "@pytest.mark.db_test" or to mark
>> > > the whole module as a "db_test module" via
>> > > `pytestmark = pytest.mark.db_test`. This will happen when their tests
>> > > are treated as "non-db" tests and attempted to be run in CI.
>> > >
>> > > *Summarising:*
>> > >
>> > > I think we can achieve some 30%-50% reduction in time/CPU needed if
>> > > we implement it. Of course, it means again complicating our CI builds
>> > > quite a bit. But with those gains I think it's totally worth it.
>> > >
>> > > I have a draft (it still needs quite some work) - a PR showing how it
>> > > could look: https://github.com/apache/airflow/pull/35160. I think in
>> > > 1-2 days I should have a ready-to-merge change with all the tests
>> > > passing and with a 30%-50% reduction in the resources and time needed.
>> > >
>> > > Would love your comments - is it worth it? Should I proceed with it?
>> > > Any other comments?
>> > >
>> > > J.
>> > >
>> >
>> > ---------------------------------------------------------------------
>> > To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org
>> > For additional commands, e-mail: dev-h...@airflow.apache.org
>> >
>> >
>>
>
