Yep. I think these are all good ideas, and I think this should be part of our big Airflow 2 vs. Airflow 3 discussion. Almost as important as what is in and what is out is where and how development of different components happen. Same repo? Different repos? Different branches? Single monorepo for Airflow2 + Providers, and separate repo for Airflow 3 only? Keeping monorepo for Airflow 3 ? How do we cherry-pick? I think we need to "design" the developer experience as part of our discussion - and it should be a serious discussion considering all the consequences. How do we test things together? How do we test back-compatibility? How do we prevent Airflow 3 PRs breaking providers? Should we separate-out Helm chart as well? There are many many questions and multiple possible answers.
But let's not derail this discussion - my proposal is to use what we have now and simply get back-compatibility working without changing the structure (yet), but as part of Airflow 2 vs. Airflow 3 we should make sure this topic is fully covered and we get to consensus on the answers. J On Fri, May 10, 2024 at 9:17 AM Andrey Anshin <andrey.ans...@taragol.is> wrote: > Great job, Jarek! > > I would have some proposals, which should be considered as a long term > > > We should rework our test structure to fully run provider tests without > touching the Core tests. > The main problem here is that we configure a lot of things into the root > conftest.py which might be a problem in case of running tests on a provider > under a different version of the airflow. Core itself might use something > which was only added in a recent version of Airflow, but this should not be > a case in case of providers. So we should slightly change the test > structure, unless we could decouple providers for the mono repo (i'm not > sure it is even a case in the future). E.g. move tests/providers to > tests/providers/unit and after so w would have > tests/system/{unit|system|integration|conftest.py) maybe also some helpers > for providers should be moved into the tests/providers/helpers (I don't > like name helpers but this only for the reference). In the same momemen > move core related tests to the tests/core (name could be different) and > create structure like > tests/core/{unit|system|integration|helpers|conftest.py}. And move as much > as possible from tests/conftest.py to appropriate in > tests/{core|providers}/conftest.py > > > Providers tests should not be relied on DB backend, and could be easily run > on any of the supported, because providers not extend DB backend support > DB, and if tests pass in core we take an assumption that providers could > use any of them e.g. SQlite (preferable for setup in xdists) or Postgres. > > If we go even further we might want to move specific helpers in the > separate test package, e.g `pytest-apache-airflow`, and move all common > helpers and simple setup/configuration tests airflow environment (really > simple one as first steps) and compatibility level, same as provider I year > after feature version released. We could test this package against > different versions of airflow to make sure that within combination Airflow > (2.7-2.9 + main) + `pytest-apache-airflow` we could run tests against each > provider. > This pytest package also would be released, uploaded into the PyPI and > could be installed via pip/uv however at least for the initial stage it > shouldn't be considered to use outside of Airflow and Airflow Providers CI, > in another word it is no GA for the end users. This might be changed in the > future but let's focus that this package only for Airflow development > internals > > On Fri, 10 May 2024 at 01:08, Jarek Potiuk <ja...@potiuk.com> wrote: > > > Hello everyone, > > > > As part of preparation for the Airflow 3 move and (possible) provider > > separation (I have some ideas how to do it but that should be a separate > > discussion) I took on the task of improving our compatibility tests for > > Providers. I discussed it briefly with Kaxil and Ash and decided to give > it > > a go and see what it takes. > > > > The PR here: https://github.com/apache/airflow/pull/39513 > > > > I extended our "import" checks with checks that also run all provider > unit > > tests for specified airflow versions (for now 2.9.1 - but once we get it > > merged/approved we can make sure the tests are working for 2.7 and 2.8). > We > > will also be able to run "future" compatibility tests in case we decide > to > > leave providers aside from Airflow 3 and will be able to run the tests > for > > both`main` and `pypi`-released versions of airflow. > > > > A number of our tests rely on some internals of Airflow and they > > implicitly rely on the fact that they are run directly in airflow source > > tree - but there are not many of those - after some initial compatibility > > fixes I got 50 or so tests failing for 2.9.1 (probably there will be more > > for 2.8.0 and 2.7.0, but I want to make 2.9.1 works first). > > > > I almost got it working (few tests are still failing) with compatibility > > for 2.9.1 but I will need some help from a few people - around > > openlineage and serialization but also around recently improved > try_number > > :). I will reach out to the relevant people individually if we see that > as > > a good idea. > > > > It requires some care when writing tests to make sure the tests can be > run > > against installed airflow and not from sources. So in the future anyone > > contributing provider changes will have to make sure the tests pass also > > for past airflow versions (there are simple instructions explaining how > to > > do it with breeze). But once we merge it, this will be caught on PR level > > and should be easy to fix any of those problems. > > > > The benefit of having the tests is that we not only do simple import > tests > > but actually run provider tests, the drawback is that sometimes tests > will > > have to be adapted to make sure they work also for installed older > airflow > > versions (which is not always straightforward or easy and will need some > > compatibility code in tests - for example after recent rename of > > airflow.models.ImportError to ParsingImportError we had to add compat.py > to > > test_utils and import ParsingImportError from there rather than from > > Airflow directly in tests. > > > > I don't think it's too controversial - being able to run unit tests for > > providers for old (and future) versions of Airflow is generally quite an > > improvement in stability, but this adds a bit overhead on contributions, > so > > I am letting everyone here know it's coming, so that it's not a surprise > to > > contributors. > > > > J. > > >