>
>
> I have two immediate thoughts on it.   First, can this be used as an
> opportunity to reorganize the system tests tree into the provider's test
> tree?  Instead of having `tests/system/providers/{foo}`, maybe we can move
> those system tests alongside the other provider tests like this:
>

Somehow what you wanted to add here was lost so I am not sure what the
proposal is :).


> My other thought is that I think I am missing something here as far as the
> new organization goes.  I haven't spent the time that you have spent
> looking at how to make this work, and I also realize this is early PoC
> discussion so maybe this isn't the end target, but the new paths are very
> long and redundant.  Is that for automation reasons?  For example in the
> image you shared:
>


> `airflow/providers/airbyte/hooks/airbyte.py` becomes
> `airflow/providers/airbyte/src/airflow/providers/airbyte/hooks/airbyte.py`
> `tests/providers/airbyte/hooks/test_airbyte.py` becomes
> `airflow/providers/airbyte/
> tests/airflow/providers/airbyte/hooks/test_airbyte.py.
>

For sources, I am quite sure we cannot do it - as this would end up with a
non-standard package and we would have to make some non-standard
manipulations with import paths and it would confuse multiple IDEs like
crazy.
Our providers are in "airflow.providers.<provider>" package. The "src" is
the root of sources and we need to have "airflow"/"providers" as the
package in.

Also this choice was largely based on the official Python packaging
tutorial:
https://packaging.python.org/en/latest/tutorials/packaging-projects/ (this
is where I took the "src" from - it's not been very common in Python
Projects - including Airflow - but it is now recommended in multiple places
including official Pypa tutorial:

packaging_tutorial/
└── src/
    └── example_package_YOUR_USERNAME_HERE/
        ├── __init__.py
        └── example.py

Regarding tests - yes, that could be removed. I even had it like that
initially. And I am on the fence on this one.

Putting tests at the top level has some drawbacks. One drawback is that if
we put tests at the top, we miss "namespace". Having a "namespace" makes
tests unique even if they are defined in different providers and are both
put on PYTHONPATH

Imagine provider_a defining "tests/hooks/test_utils.py" and provider_b
defining "tests/utils/test_utils.py". When you import
"tests.utils.test_utils" - which one do you import if they are all in the
PYTHONPATH? This will become a problem when we will make all of them added
to PYTHONPATH in your IDE - I think. I believe many IDEs (including PyCharm
and VSCode) will not work with multiple, separate python modules and they
will get confused when seeing multiple repeated packages with the same
modules on PYTHONPATH.
<package_name> gives us "namespace" and guarantees that each module will be
different.

Also keeping them in packages is pretty consistent with some guidelines I
saw that "tests/<package>" is recommended. I would love however to hear
more from others experiences (especially TP) as maybe there are other ways
I have not thought about. I would also prefer "tests/test_mmm.py" rather
than "tests/airflow/providers/providers/test_mmm.py" - but I am afraid it's
going to be difficult due to the "repeated modules" problem.



>

Reply via email to