OK. Tests should be green now - all the issues are "handled" - there are few follow up tasks from the tests run on 2.9.1 but the PR should be quite ready for final review and merge now and I can attempt to look at 2.8 compatibility once it's done.
On Mon, May 13, 2024 at 1:17 AM Jarek Potiuk <ja...@potiuk.com> wrote: > OK. I think I found and fixed all the compatibility issues in > https://github.com/apache/airflow/pull/39513 - except one last > openlineage plugin enablement fix (but I think reviews for all the other > changes would be great until we fix the issue). There are probably a few > incompatibilities that will need to be addressed before we release 2.1.10 > so I need confirmation / comments from Niko/Daniel if my findings are > correct. > > On Fri, May 10, 2024 at 12:54 PM Jarek Potiuk <ja...@potiuk.com> wrote: > >> > Just for clarification, this is only related to the provider's tests, >> right? >> >> Absolutely. >> >> On Fri, May 10, 2024 at 11:21 AM Andrey Anshin <andrey.ans...@taragol.is> >> wrote: >> >>> > "enable" tests for 2.8 and 2.7 separately >>> >>> Just for clarification, this is only related to the provider's tests, >>> right? >>> >>> >>> >>> On Fri, 10 May 2024 at 13:15, Jarek Potiuk <ja...@potiuk.com> wrote: >>> >>> > > Regarding Airflow 2.7 and Airflow 2.8 in the time we are ready to >>> move >>> > forward to the initial version of Airflow 3 providers might already >>> drop >>> > support of these versions in providers. >>> > Airflow 2.7 in the mid of August 2024 >>> > Airflow 2.8 in the mid of December 2024 >>> > >>> > Yep. But also "here and now" those compatibility tests might help us to >>> > find some hidden incompatibilities (and prevent adding future ones). >>> We can >>> > see how much complexity we are dealing with when we attempt to enable >>> the >>> > tests for 2.8 and then 2.7 and decide if it's worth it. The change I >>> added >>> > makes it easy to just "enable" tests for 2.8 and 2.7 separately. >>> > >>> > On Fri, May 10, 2024 at 11:10 AM Andrey Anshin < >>> andrey.ans...@taragol.is> >>> > wrote: >>> > >>> > > BTW, forget to mention that we should also check Pytest: Good >>> Integration >>> > > Practices from >>> > > https://docs.pytest.org/en/stable/explanation/goodpractices.html >>> > > >>> > > >>> > > >>> > > >>> > > >>> > > On Fri, 10 May 2024 at 13:07, Andrey Anshin < >>> andrey.ans...@taragol.is> >>> > > wrote: >>> > > >>> > > > I think the current solution with run tests against installed >>> packages >>> > > > might help with future modifications and develop new dev >>> experience. >>> > And >>> > > > what is more important is help to find problems and >>> incompatibilities >>> > of >>> > > > providers with the previous version of Airflow "here and now". >>> > > > >>> > > > Regarding Airflow 2.7 and Airflow 2.8 in the time we are ready to >>> move >>> > > > forward to the initial version of Airflow 3 providers might already >>> > drop >>> > > > support of these versions in providers. >>> > > > Airflow 2.7 in the mid of August 2024 >>> > > > Airflow 2.8 in the mid of December 2024 >>> > > > >>> > > > >>> > > > >>> > > > On Fri, 10 May 2024 at 12:32, Jarek Potiuk <ja...@potiuk.com> >>> wrote: >>> > > > >>> > > >> And yes - as we get down to 2.8 and 2.7 it might be possible that >>> we >>> > > will >>> > > >> already implement some of the simplifications you mentioned as it >>> > might >>> > > be >>> > > >> easier than adding back-compatiblity to the current ways. I >>> assume it >>> > > will >>> > > >> be `quite` a bit harder to make our test suite work with Airflow >>> 2.8 >>> > and >>> > > >> then 2.7 - so it might be that some of the refactors and changes >>> will >>> > > need >>> > > >> to be applied to make it easier to maintain. >>> > > >> >>> > > >> On Fri, May 10, 2024 at 10:27 AM Jarek Potiuk <ja...@potiuk.com> >>> > wrote: >>> > > >> >>> > > >> > Yep. I think these are all good ideas, and I think this should >>> be >>> > part >>> > > >> of >>> > > >> > our big Airflow 2 vs. Airflow 3 discussion. Almost as important >>> as >>> > > what >>> > > >> is >>> > > >> > in and what is out is where and how development of different >>> > > components >>> > > >> > happen. Same repo? Different repos? Different branches? Single >>> > > monorepo >>> > > >> for >>> > > >> > Airflow2 + Providers, and separate repo for Airflow 3 only? >>> Keeping >>> > > >> > monorepo for Airflow 3 ? How do we cherry-pick? >>> > > >> > I think we need to "design" the developer experience as part >>> of our >>> > > >> > discussion - and it should be a serious discussion considering >>> all >>> > the >>> > > >> > consequences. How do we test things together? How do we test >>> > > >> > back-compatibility? How do we prevent Airflow 3 PRs breaking >>> > > providers? >>> > > >> > Should we separate-out Helm chart as well? There are many many >>> > > questions >>> > > >> > and multiple possible answers. >>> > > >> > >>> > > >> > But let's not derail this discussion - my proposal is to use >>> what we >>> > > >> have >>> > > >> > now and simply get back-compatibility working without changing >>> the >>> > > >> > structure (yet), but as part of Airflow 2 vs. Airflow 3 we >>> should >>> > make >>> > > >> sure >>> > > >> > this topic is fully covered and we get to consensus on the >>> answers. >>> > > >> > >>> > > >> > J >>> > > >> > >>> > > >> > >>> > > >> > On Fri, May 10, 2024 at 9:17 AM Andrey Anshin < >>> > > andrey.ans...@taragol.is >>> > > >> > >>> > > >> > wrote: >>> > > >> > >>> > > >> >> Great job, Jarek! >>> > > >> >> >>> > > >> >> I would have some proposals, which should be considered as a >>> long >>> > > term >>> > > >> >> >>> > > >> >> >>> > > >> >> We should rework our test structure to fully run provider tests >>> > > without >>> > > >> >> touching the Core tests. >>> > > >> >> The main problem here is that we configure a lot of things >>> into the >>> > > >> root >>> > > >> >> conftest.py which might be a problem in case of running tests >>> on a >>> > > >> >> provider >>> > > >> >> under a different version of the airflow. Core itself might use >>> > > >> something >>> > > >> >> which was only added in a recent version of Airflow, but this >>> > should >>> > > >> not >>> > > >> >> be >>> > > >> >> a case in case of providers. So we should slightly change the >>> test >>> > > >> >> structure, unless we could decouple providers for the mono repo >>> > (i'm >>> > > >> not >>> > > >> >> sure it is even a case in the future). E.g. move >>> tests/providers to >>> > > >> >> tests/providers/unit and after so w would have >>> > > >> >> tests/system/{unit|system|integration|conftest.py) maybe also >>> some >>> > > >> helpers >>> > > >> >> for providers should be moved into the tests/providers/helpers >>> (I >>> > > don't >>> > > >> >> like name helpers but this only for the reference). In the same >>> > > momemen >>> > > >> >> move core related tests to the tests/core (name could be >>> different) >>> > > and >>> > > >> >> create structure like >>> > > >> >> tests/core/{unit|system|integration|helpers|conftest.py}. And >>> move >>> > as >>> > > >> much >>> > > >> >> as possible from tests/conftest.py to appropriate in >>> > > >> >> tests/{core|providers}/conftest.py >>> > > >> >> >>> > > >> >> >>> > > >> >> Providers tests should not be relied on DB backend, and could >>> be >>> > > easily >>> > > >> >> run >>> > > >> >> on any of the supported, because providers not extend DB >>> backend >>> > > >> support >>> > > >> >> DB, and if tests pass in core we take an assumption that >>> providers >>> > > >> could >>> > > >> >> use any of them e.g. SQlite (preferable for setup in xdists) or >>> > > >> Postgres. >>> > > >> >> >>> > > >> >> If we go even further we might want to move specific helpers >>> in the >>> > > >> >> separate test package, e.g `pytest-apache-airflow`, and move >>> all >>> > > common >>> > > >> >> helpers and simple setup/configuration tests airflow >>> environment >>> > > >> (really >>> > > >> >> simple one as first steps) and compatibility level, same as >>> > provider >>> > > I >>> > > >> >> year >>> > > >> >> after feature version released. We could test this package >>> against >>> > > >> >> different versions of airflow to make sure that within >>> combination >>> > > >> Airflow >>> > > >> >> (2.7-2.9 + main) + `pytest-apache-airflow` we could run tests >>> > against >>> > > >> each >>> > > >> >> provider. >>> > > >> >> This pytest package also would be released, uploaded into the >>> PyPI >>> > > and >>> > > >> >> could be installed via pip/uv however at least for the initial >>> > stage >>> > > it >>> > > >> >> shouldn't be considered to use outside of Airflow and Airflow >>> > > Providers >>> > > >> >> CI, >>> > > >> >> in another word it is no GA for the end users. This might be >>> > changed >>> > > in >>> > > >> >> the >>> > > >> >> future but let's focus that this package only for Airflow >>> > development >>> > > >> >> internals >>> > > >> >> >>> > > >> >> On Fri, 10 May 2024 at 01:08, Jarek Potiuk <ja...@potiuk.com> >>> > wrote: >>> > > >> >> >>> > > >> >> > Hello everyone, >>> > > >> >> > >>> > > >> >> > As part of preparation for the Airflow 3 move and (possible) >>> > > provider >>> > > >> >> > separation (I have some ideas how to do it but that should >>> be a >>> > > >> separate >>> > > >> >> > discussion) I took on the task of improving our compatibility >>> > tests >>> > > >> for >>> > > >> >> > Providers. I discussed it briefly with Kaxil and Ash and >>> decided >>> > to >>> > > >> >> give it >>> > > >> >> > a go and see what it takes. >>> > > >> >> > >>> > > >> >> > The PR here: https://github.com/apache/airflow/pull/39513 >>> > > >> >> > >>> > > >> >> > I extended our "import" checks with checks that also run all >>> > > provider >>> > > >> >> unit >>> > > >> >> > tests for specified airflow versions (for now 2.9.1 - but >>> once we >>> > > >> get it >>> > > >> >> > merged/approved we can make sure the tests are working for >>> 2.7 >>> > and >>> > > >> >> 2.8). We >>> > > >> >> > will also be able to run "future" compatibility tests in >>> case we >>> > > >> decide >>> > > >> >> to >>> > > >> >> > leave providers aside from Airflow 3 and will be able to run >>> the >>> > > >> tests >>> > > >> >> for >>> > > >> >> > both`main` and `pypi`-released versions of airflow. >>> > > >> >> > >>> > > >> >> > A number of our tests rely on some internals of Airflow and >>> they >>> > > >> >> > implicitly rely on the fact that they are run directly in >>> airflow >>> > > >> source >>> > > >> >> > tree - but there are not many of those - after some initial >>> > > >> >> compatibility >>> > > >> >> > fixes I got 50 or so tests failing for 2.9.1 (probably there >>> will >>> > > be >>> > > >> >> more >>> > > >> >> > for 2.8.0 and 2.7.0, but I want to make 2.9.1 works first). >>> > > >> >> > >>> > > >> >> > I almost got it working (few tests are still failing) with >>> > > >> compatibility >>> > > >> >> > for 2.9.1 but I will need some help from a few people - >>> around >>> > > >> >> > openlineage and serialization but also around recently >>> improved >>> > > >> >> try_number >>> > > >> >> > :). I will reach out to the relevant people individually if >>> we >>> > see >>> > > >> that >>> > > >> >> as >>> > > >> >> > a good idea. >>> > > >> >> > >>> > > >> >> > It requires some care when writing tests to make sure the >>> tests >>> > can >>> > > >> be >>> > > >> >> run >>> > > >> >> > against installed airflow and not from sources. So in the >>> future >>> > > >> anyone >>> > > >> >> > contributing provider changes will have to make sure the >>> tests >>> > pass >>> > > >> also >>> > > >> >> > for past airflow versions (there are simple instructions >>> > explaining >>> > > >> how >>> > > >> >> to >>> > > >> >> > do it with breeze). But once we merge it, this will be >>> caught on >>> > PR >>> > > >> >> level >>> > > >> >> > and should be easy to fix any of those problems. >>> > > >> >> > >>> > > >> >> > The benefit of having the tests is that we not only do simple >>> > > import >>> > > >> >> tests >>> > > >> >> > but actually run provider tests, the drawback is that >>> sometimes >>> > > tests >>> > > >> >> will >>> > > >> >> > have to be adapted to make sure they work also for installed >>> > older >>> > > >> >> airflow >>> > > >> >> > versions (which is not always straightforward or easy and >>> will >>> > need >>> > > >> some >>> > > >> >> > compatibility code in tests - for example after recent >>> rename of >>> > > >> >> > airflow.models.ImportError to ParsingImportError we had to >>> add >>> > > >> >> compat.py to >>> > > >> >> > test_utils and import ParsingImportError from there rather >>> than >>> > > from >>> > > >> >> > Airflow directly in tests. >>> > > >> >> > >>> > > >> >> > I don't think it's too controversial - being able to run unit >>> > tests >>> > > >> for >>> > > >> >> > providers for old (and future) versions of Airflow is >>> generally >>> > > >> quite an >>> > > >> >> > improvement in stability, but this adds a bit overhead on >>> > > >> >> contributions, so >>> > > >> >> > I am letting everyone here know it's coming, so that it's >>> not a >>> > > >> >> surprise to >>> > > >> >> > contributors. >>> > > >> >> > >>> > > >> >> > J. >>> > > >> >> > >>> > > >> >> >>> > > >> > >>> > > >> >>> > > > >>> > > >>> > >>> >>