Hello everyone,

TL;DR; I wanted to discuss and agree on the final setup for provider's
tests package internal structure

Now that first stage of cleanup after providers move is merged (
https://github.com/apache/airflow/pull/46608 - thanks Jens for a thorough
review of the 608 file+ beast of a change) I would like to explain current
approach for tests setup in providers and some of the consequences, and ask
others what they think.

Currently each of the provider's sub-projects look like that more or less
(this is all inside "providers" folder in airflow):

PROVIDER
       |- pyproject.toml
       |- provider.yaml
       |- src
       |.   | airflow.providers.PROVIDER
       |                               |- PROVIDER CODE HERE
       |- docs
       |- tests
       |      | provider_tests
       |      |              | PROVIDER
       |      |                        |- UNIT TEST CODE HERE
       |      | integration (optional)
       |      |           | PROVIDER
       |      |                    |- INTEGRATION TEST CODE HERE
       |      | system  (optional)
       |      |      | PROVIDER
       |      |               |- SYSTEM TEST CODE HERE

This structure is pretty "standalone" and allows the code in tests of
providers to import other test provider code without going "out" to
airflow. I implemented code in "tests_common" that automatically adds
"tests"  for all providers to PYTHONPATH so what you can do now much more
reasonable imports inside the provider test code (i.e. reusing some code
inside tests for each provider):

This is for example what is now
in providers/standard/tests/provider_tests/standard/decorators/test_python.py:

*from provider_tests.standard.operators.test_python import BasePythonTest*

This is pretty cool, because before the cleanup the imports looked like
this (they were importing starting from the root of the airflow project):

*from
providers.standard.tests.provider_tests.standard.operators.test_python
import BasePythonTest*

This was bad - because none of the "providers.standard.tests" were real
"python packages" - they were just folders that we used to place the
"standard provider" and its tests in - we artificially made them "python
packages" by placing `__init__.py` files in them. This is no longer there.

This has a certain consequence: if you want to make your IDE to understand
the imports, you have to mark the "tests" folder as "root of test sources"
(at least in IntelliJ).

This is easy to justify and understand (and a very, very reasonable
assumption) and in a number of cases (for example when you open the
provider code as a separate project, this will even be auto-detected.

I will add soon (after we finalize this discussion) a bit more instructions
and explanation for various scenarios of opening projects and using various
IDEs in order to develop provider's code to make it clearer - and maybe
later on we could even commit some of the intellij/.vscode project
configuration (just PYTHONPATH) so that people will not have to do it
manually.

Eventually (we are not yet fully there but this is coming) - contributors
in the future should be able to just point to the provider's folder as a
project, run `uv sync` there and develop provider without loading and
opening "airflow" project - run all the tests, build documentation etc.
without even touching the parent airflow folders.

Now.... My questions are:

*1)* do you like this approach? Any comments?

*2)* should we add some pre-commits to verify and check that we only use
the right "from" imports in our tests (there is a possibility that someone
will still use longer names, from content root "from providers.standard" ?
I think we should as it is likely to happen.

*3)* are the top-level package names I chose good ?

I've chosen "provider_tests", "system", "integration" - but maybe those are
not the best names and we should change them ? They are a bit inconsistent,
and also "system" and "integration" , while common, are unlikely to be used
by other packages installed. Previously we could not do such import
shortening because for example "smtp" provider clashed with "smtp" built-in
package - adding "provider_tests", "System

I had other ideas such as "af_unit", "af_integration", "af_system" (they
should not be very long). But those did not feel right. Somewhat I like the
inconsistency - as integration and system tests are very much different
from the unit tests, even if we still use pytest as runners.

Also - those changes are likely impacting those who run system dashboards,
so you will have to adapt slightly the paths you use in your system tests
:)

Let me know what you think of all that :)

J.

Reply via email to