And just to clarify - there is a little too much redundancy in your example:

`airflow/providers/airbyte/hooks/airbyte.py` becomes `airflow
/providers/airbyte/src/airflow/providers/airbyte/hooks/airbyte.py`

The leading "airflow" is not there in the target path:

`airflow/providers/airbyte/hooks/airbyte.py` becomes
`providers/airbyte/src/airflow/providers/airbyte/hooks/airbyte.py`

And the redundancy comes from the fact that we have to somehow name the
directory where the entire provider is moved. Let me rewrite it differently:

providers/airbyte <- This is where the airbyte provider is moved
                |
                src/airflow/providers/airbyte/hooks/airbyte.py <- this is
actual path "inside" the provider


So there is far less redundancy than you think.


On Tue, Dec 13, 2022 at 2:40 AM Jarek Potiuk <ja...@potiuk.com> wrote:

>
>> I have two immediate thoughts on it.   First, can this be used as an
>> opportunity to reorganize the system tests tree into the provider's test
>> tree?  Instead of having `tests/system/providers/{foo}`, maybe we can move
>> those system tests alongside the other provider tests like this:
>>
>
> Somehow what you wanted to add here was lost so I am not sure what the
> proposal is :).
>
>
>> My other thought is that I think I am missing something here as far as
>> the new organization goes.  I haven't spent the time that you have spent
>> looking at how to make this work, and I also realize this is early PoC
>> discussion so maybe this isn't the end target, but the new paths are very
>> long and redundant.  Is that for automation reasons?  For example in the
>> image you shared:
>>
>
>
>> `airflow/providers/airbyte/hooks/airbyte.py` becomes
>> `airflow/providers/airbyte/src/airflow/providers/airbyte/hooks/airbyte.py`
>> `tests/providers/airbyte/hooks/test_airbyte.py` becomes
>> `airflow/providers/airbyte/
>> tests/airflow/providers/airbyte/hooks/test_airbyte.py.
>>
>
> For sources, I am quite sure we cannot do it - as this would end up with a
> non-standard package and we would have to make some non-standard
> manipulations with import paths and it would confuse multiple IDEs like
> crazy.
> Our providers are in "airflow.providers.<provider>" package. The "src" is
> the root of sources and we need to have "airflow"/"providers" as the
> package in.
>
> Also this choice was largely based on the official Python packaging
> tutorial:
> https://packaging.python.org/en/latest/tutorials/packaging-projects/
> (this is where I took the "src" from - it's not been very common in Python
> Projects - including Airflow - but it is now recommended in multiple places
> including official Pypa tutorial:
>
> packaging_tutorial/
> └── src/
>     └── example_package_YOUR_USERNAME_HERE/
>         ├── __init__.py
>         └── example.py
>
> Regarding tests - yes, that could be removed. I even had it like that
> initially. And I am on the fence on this one.
>
> Putting tests at the top level has some drawbacks. One drawback is that if
> we put tests at the top, we miss "namespace". Having a "namespace" makes
> tests unique even if they are defined in different providers and are both
> put on PYTHONPATH
>
> Imagine provider_a defining "tests/hooks/test_utils.py" and provider_b
> defining "tests/utils/test_utils.py". When you import
> "tests.utils.test_utils" - which one do you import if they are all in the
> PYTHONPATH? This will become a problem when we will make all of them added
> to PYTHONPATH in your IDE - I think. I believe many IDEs (including PyCharm
> and VSCode) will not work with multiple, separate python modules and they
> will get confused when seeing multiple repeated packages with the same
> modules on PYTHONPATH.
> <package_name> gives us "namespace" and guarantees that each module will
> be different.
>
> Also keeping them in packages is pretty consistent with some guidelines I
> saw that "tests/<package>" is recommended. I would love however to hear
> more from others experiences (especially TP) as maybe there are other ways
> I have not thought about. I would also prefer "tests/test_mmm.py" rather
> than "tests/airflow/providers/providers/test_mmm.py" - but I am afraid it's
> going to be difficult due to the "repeated modules" problem.
>
>
>
>>

Reply via email to