FYI. Not forgotten - I have a branch in progress on that - and hopefully will soon solve all issues in it, but it looks promising to use "importlib" and rather than having deliberate test package hierarchy, simply make it impossible to share code between different test packages. The idea to replace ALL code reuse by fixtures (as all extremes) is likely not best - because in a number of cases it would be cumbersome or straight impossible, but when we combine it with sharing code in "tests_common" that can be imported as "from tests_common.*.*" -> it seems like we can have the best of both worlds - easiness of code reuse between tests and "isolation" of tests.
Also - just to add - that one will allow us to solve another problem for us. So far our ".sdist' packages for providers have been treated as "ASF Source packages" - fulfilling the criteria of ASF "source release". They have however one problem (I captured it in this issue - https://github.com/apache/airflow/issues/47343 ) "true" source packages released by us should allow the users to run "tests" and build "docs" for the package - and this change and using importlib for pytest and moving everything "common" to devel-common - will finally make it possible easily - we will be able to prepare source packages that will contain only "provider/<PROVIDER_ID>" and "devel-common" and they should allow to make a complete tests and docs building - completing the "provider isolation" work that we started at the beginning of year - making providers "truly" isolated and standalone. There is one small thing in docs building that also needs to be done (example_include handling better cross-provider includes). But that one is rather small. J. On Tue, Jul 29, 2025 at 10:09 AM Jarek Potiuk <ja...@potiuk.com> wrote: > I will try it today on the train while getting back from my hometown ;) > > On Tue, Jul 29, 2025 at 7:39 AM Amogh Desai <amoghde...@apache.org> wrote: > >> Nice to hear that opinion Ash, it's an interesting angle. >> >> If someone works on a POC for that, would be interested to know how it >> spans out. >> >> Thanks & Regards, >> Amogh Desai >> >> >> On Mon, Jul 28, 2025 at 7:23 PM Ash Berlin-Taylor <a...@apache.org> wrote: >> >> > Yeah, I wasn’t sure if it would work for us. Let us know how it goes. >> > >> > > On 28 Jul 2025, at 14:12, Jarek Potiuk <ja...@potiuk.com> wrote: >> > > >> > > That is an interesting alternative and I quite like it. >> > > >> > > We tried to use importlib before and we failed (but we did not try >> > "hard") >> > > - Andrey tried it one day and it generated a lot of errors - but we've >> > > changed a LOT since the (including separating of common test classes >> to >> > > "devel_common". >> > > >> > > It might be worth doing a POC for that - this adds very nice isolation >> > > where all tests are generally fully "standalone". I might attempt to >> > make a >> > > POC on that. >> > > >> > > I would not asy "only fixtures" is the right approach - but I think it >> > > should generally apply to trying to import common code between >> "tests". >> > > There is a bit of imports that are pretty "valid" in some way and >> easier >> > to >> > > use than fixtures I think - i.e. "devel-common" classes, but they are >> > not >> > > "tests" classes - they are regular "src" classes in "tests_common" >> > package >> > > in "apache-airflow-devel-common" distribution, so this is pretty good >> > idea >> > > to import them. >> > > >> > > But I do agree that importing one test from other classes is "fishy" >> and >> > if >> > > importlib might help us with this, then it is an easy way to enforce >> it. >> > I >> > > will try it first. That's a very good idea :) >> > > >> > > J. >> > > >> > > On Mon, Jul 28, 2025 at 10:24 AM Ash Berlin-Taylor <a...@apache.org> >> > wrote: >> > > >> > >> Did you look at changing the pytest import option instead? >> > >> https://docs.pytest.org/en/stable/explanation/pythonpath.html - and >> > >> specifically the `importlib` option >> > >> >> > >> To quote their documentation: >> > >> >> > >>> importlib: this mode uses more fine control mechanisms provided by >> > >> importlib < >> > >> https://docs.python.org/3/library/importlib.html#module-importlib> >> to >> > >> import test modules, without changing sys.path < >> > >> https://docs.python.org/3/library/sys.html#sys.path>. >> > >>> >> > >>> Advantages of this mode: >> > >>> >> > >>> pytest will not change sys.path < >> > >> https://docs.python.org/3/library/sys.html#sys.path> at all. >> > >>> >> > >>> Test module names do not need to be unique – pytest will generate a >> > >> unique name automatically based on the rootdir. >> > >>> >> > >>> Disadvantages: >> > >>> >> > >>> Test modules can’t import each other. >> > >>> >> > >>> Testing utility modules in the tests directories (for example a >> > >> tests.helpers module containing test-related functions/classes) are >> not >> > >> importable. The recommendation in this case it to place testing >> utility >> > >> modules together with the application/library code, for example >> > >> app.testing.helpers. >> > >>> >> > >>> Important: by “test utility modules”, we mean functions/classes >> which >> > >> are imported by other tests directly; this does not include fixtures, >> > which >> > >> should be placed in conftest.py files, along with the test modules, >> and >> > are >> > >> discovered automatically by pytest. >> > >>> >> > >> >> > >> I think with pytest fixtures we almost shouldn’t need to import tests >> > >> directly at all, so if they are a package or not should not be an >> issue. >> > >> Doing this (or working towards) it means we don’t need to worry about >> > the >> > >> name at all, and we already have the `tests_common` etc as a place to >> > put >> > >> helpers. >> > >> >> > >> That way we don’t need any special rules or naming convention or >> > otherwise >> > >> “duplicate” folder names. >> > >> >> > >> I.e. instead of adding a name to make things unique, we forbid >> imports >> > of >> > >> helper functions and make them use `pytest` fixtures instead. >> > >> >> > >> Thoughts? >> > >> >> > >>> On 27 Jul 2025, at 17:55, Jens Scheffler <j_scheff...@gmx.de.INVALID >> > >> > >> wrote: >> > >>> >> > >>> Hi Jarek, >> > >>> >> > >>> I like the general idea of the structure as you propose. >> > >>> >> > >>> I would not "swap" as you described afterwards but propose to use >> the >> > >> first proposal you made. >> > >>> >> > >>> For docker and k8s tests I think it would be better to keep them >> > outside >> > >> of "airflow-core" as they have a rather integrative/system scope and >> are >> > >> only indirectly related to core. For the moment I'd keep then and >> once >> > we >> > >> re-structure the packages still we can align later. >> > >>> >> > >>> Jens >> > >>> >> > >>> On 25.07.25 11:57, Amogh Desai wrote: >> > >>>> Hi Jarek, >> > >>>> >> > >>>>> So maybe it's a good time to improve it and come up with a >> convention >> > >> that >> > >>>> we can apply also to shared folders? Such change will be very >> little >> > >>>> disruptive and can be largely automated and applied in a single PR >> > >>>> >> > >>>> Yeah, I think it's a good time and idea to get such a thing done. >> > >>>> >> > >>>> *> from unit.amazon import s3_operator_test* >> > >>>> I never paid close attention to such import patterns and I do not >> like >> > >>>> them. Its very >> > >>>> confusing and ambiguous, and it's very hard to tell "what package >> is >> > >> this" >> > >>>> to any of those >> > >>>> imports. >> > >>>> >> > >>>> While writing the Task SDK integration tests I did keep a lot of >> those >> > >>>> things in mind and if you >> > >>>> see the internal imports I have, they are of the pattern: >> > >>>> *from task_sdk_tests.constants import AIRFLOW_ROOT_PATH *which is >> far >> > >> less >> > >>>> ambiguous and >> > >>>> confusing. >> > >>>> >> > >>>>> Note that this does not yet tackle the other "special" test types >> we >> > >> have >> > >>>> -> kubernetes-tests, docker-tests, task-sdk-tests (right now being >> > >> merged >> > >>>> by Amogh), especially the last one seem to be clashing with >> > >>>> `task_sdk_tests`. But I think (and this would be separate >> discussion) >> > - >> > >>>> this also adds an opportunity (next step) to move those "special" >> test >> > >>>> types we have. Eventually we could move those tests to be "yet >> another >> > >> test >> > >>>> type" under "tests": >> > >>>> >> > >>>> Agreed, we can consider moving those to the right places where >> > >> applicable, >> > >>>> like: >> > >>>> *task_sdk_tests/* >> > >>>> >> > >>>> *├── unit/└── integration/ # Moved from task-sdk-tests/* >> > >>>> >> > >>>> We can start small here and build up to fix the issues one package >> at >> > a >> > >>>> time, and if >> > >>>> we can come up with a good AI prompt, we can get some work >> multiplexed >> > >> by >> > >>>> the community >> > >>>> too. >> > >>>> >> > >>>> >> > >>>> Thanks & Regards, >> > >>>> Amogh Desai >> > >>>> >> > >>>> >> > >>>> On Thu, Jul 24, 2025 at 8:38 PM Jarek Potiuk <ja...@potiuk.com> >> > wrote: >> > >>>> >> > >>>>> Hello here, >> > >>>>> >> > >>>>> Since we have now started even more isolation and separation of >> the >> > >> code >> > >>>>> and libraries, I think it might be a good time to maybe improve a >> bit >> > >> the >> > >>>>> way our tests "package" hierarchy is done - in providers and >> > elsewhere. >> > >>>>> >> > >>>>> This question came up when i started to add pre-commit checks >> after >> > our >> > >>>>> "shared code" PR have been merged - in >> > >>>>> https://github.com/apache/airflow/pull/53697 >> > >>>>> >> > >>>>> One problem with having tests in monorepo, is that if all "tests" >> > >> folders >> > >>>>> are added to PYTHONPATH, we cannot really put tests in the "top" >> > level >> > >> of >> > >>>>> the tests folder (that is pretty natural whe you do not have >> multiple >> > >> of >> > >>>>> "tests" folders. There are two reasons for that: >> > >>>>> >> > >>>>> * it's likely module names or package names will have the same >> names >> > in >> > >>>>> separate distributions >> > >>>>> * quite often - if we add tests in a sub-package of tests directly >> > they >> > >>>>> might clash with the same names - for example from stdlib or >> another >> > >>>>> package we installed. Good examples of it (and something that >> > happened >> > >> in >> > >>>>> the past were `tests/smtp/test_*` and `tests/kubernetes/test_*`. >> > >> Depending >> > >>>>> on how your IDE and env was configured you could have ended up in >> > >> "import >> > >>>>> kubernetes" not behaving as you expected. >> > >>>>> >> > >>>>> So far in providers we have the >> > "{unit/system/integration}/<PROVIDER>/" >> > >>>>> convention that solved the issue. But even me as author of it, I >> must >> > >> admit >> > >>>>> `from unit.amazon` when you want to cross-import between different >> > >> amazon >> > >>>>> or different providers looks rather stupid and baffling. Even >> > recently >> > >> Ash >> > >>>>> commented in one PR "that must be wrong" - and yeah, it does look >> > >> wrong.... >> > >>>>> >> > >>>>> So maybe it's a good time to improve it and come up with a >> convention >> > >> that >> > >>>>> we can apply also to shared folders? Such change will be very >> little >> > >>>>> disruptive and can be largely automated and applied in a single >> PR - >> > >> and >> > >>>>> modern git and IDE integration will follow such changes nicely - >> so >> > >> that >> > >>>>> you can see history of changes and it resolves most of conflicts >> > >>>>> automatically - so it could be done with very little disruption >> (only >> > >>>>> currently opened PRs will have to be rebased). >> > >>>>> >> > >>>>> My proposal would be to add a "parent" folder in the "tests" >> > directory >> > >> to >> > >>>>> indicate where the test is coming from. That might sound like a >> > >>>>> duplication, but I think it's a natural consequence of having >> code in >> > >>>>> monorepo, and a nice way to improve the organisation of tests. >> Also >> > it >> > >> has >> > >>>>> an immediate notion or where the imports come from. >> > >>>>> >> > >>>>> airflow-core\ >> > >>>>> src\ >> > >>>>> airflow\ >> > >>>>> tests\ >> > >>>>> airflow_core_tests\ >> > >>>>> unit\ >> > >>>>> system\ >> > >>>>> integration\ >> > >>>>> airflow-task-sdk\ >> > >>>>> src\ >> > >>>>> airflow\ >> > >>>>> sdk >> > >>>>> tests\ >> > >>>>> task_sdk_tests\ >> > >>>>> # here all tests are unit so >> no >> > >> need >> > >>>>> for sub-packages >> > >>>>> >> > >>>>> providers/amazon/tests\ >> > >>>>> provider_tests\ >> > >>>>> unit\ >> > >>>>> amazon >> > >>>>> integration\ >> > >>>>> amazon >> > >>>>> >> > >>>>> >> > >>>>> providers/cncf/kubernetes/tests\ >> > >>>>> provider_tests\ >> > >>>>> unit\ >> > >>>>> cncf\kubernetes\ >> > >>>>> integration\ >> > >>>>> >> > >> cncf\kubernetes\ >> > >>>>> >> > >>>>> >> > >>>>> I also considered swapping "unit" / "provider" >> > >> (provider_tests\amazon\unit) >> > >>>>> - but that would make it a bit more complex when we choose type of >> > >> tests to >> > >>>>> run or examples to have and I think it's somewhat better to have >> the >> > >>>>> unit/system/integration distinction right after provider_tests >> > because >> > >>>>> essentially those test types are VERY different. >> > >>>>> >> > >>>>> Again - that sounds like a lot of duplication in the path, but I >> > think >> > >> it's >> > >>>>> worth it (and our pre-commits already make sure that there are no >> > >> typos and >> > >>>>> keep it in order and they can be updated to keep this new >> structure). >> > >>>>> >> > >>>>> The main benefit of it hat when you import (wherever) between >> > different >> > >>>>> distributions - the imports will **look** better and be more >> obvious: >> > >>>>> >> > >>>>> from provider_tests.unit.amazon import some_amazon_test_code >> > >>>>> from task_sdk_tests import apis >> > >>>>> >> > >>>>> Note that this does not yet tackle the other "special" test types >> we >> > >> have >> > >>>>> -> kubernetes-tests, docker-tests, task-sdk-tests (right now being >> > >> merged >> > >>>>> by Amogh), especially the last one seem to be clashing with >> > >>>>> `task_sdk_tests`. But I think (and this would be separate >> > discussion) - >> > >>>>> this also adds an opportunity (next step) to move those "special" >> > test >> > >>>>> types we have. Eventually we could move those tests to be "yet >> > another >> > >> test >> > >>>>> type" under "tests": >> > >>>>> >> > >>>>> airflow_core_tests\ >> > >>>>> unit >> > >>>>> system >> > >>>>> integration >> > >>>>> docker >> > >>>>> kubernetes >> > >>>>> >> > >>>>> task_sdk_tests\ >> > >>>>> unit >> > >>>>> integration >> > >>>>> >> > >>>>> But I would approach it as the next step after we reorganize >> imports >> > >> for >> > >>>>> the "regular" tests we have. >> > >>>>> >> > >>>>> WDYT? Good idea? Other proposals? >> > >>>>> >> > >>>>> J. >> > >>>>> >> > >>> >> > >>> >> --------------------------------------------------------------------- >> > >>> To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org >> > >>> For additional commands, e-mail: dev-h...@airflow.apache.org >> > >>> >> > >> >> > >> >> > >> > >> > --------------------------------------------------------------------- >> > To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org >> > For additional commands, e-mail: dev-h...@airflow.apache.org >> > >> > >> >