Those are very good questions :) On Mon, Jan 6, 2025 at 10:54 PM Ferruzzi, Dennis <ferru...@amazon.com.invalid> wrote:
> To clarify that I understand your diagram correctly, let's say you clone > the Airflow repo to ~/workspace/airflow/. Does this mean that the AWS Glue > Hook which used to live at > ~/workspace/airflow/providers/amazon/aws/hooks/glue.py (as a random > example) will be located at > ~/workspace/airflow/providers/amazon/aws/src/airflow/providers/amazon/aws/hooks/glue.py? > That feels unnecessarily repetitive to me, maybe it makes sense but I'm > missing the context? > Yes - it means that there is this repetitiveness but for a good reason - those two "amazon/aws" serve different purpose: * The first "providers/amazon/aws" is just where the whole provider "complete project" is stored - it's basically a directory where "aws provider" is stored, a convenient folder to locate it in, that makes it separate from other providers * The second "src/airflow/providers/amazon/aws" - is the python package where the source files is stored - this is how (inside the sub-folder) you tell the actual python "import" to look for the sources. .What really matters is that (eventually) `~/workspace/airflow/providers/amazon/aws/` can be treated as a completely separate python project - a source of a "standalone" provider python project. There is a "pyproject.toml" file at the root of it and if you do this (for example): cd providers/amazon/aws/ uv sync And with it you will be able to work on AWS provider exclusively as a separate project (this is not yet complete with the move - tests are not entirely possible to run today - but it will be possible as next step - I explained it in https://github.com/apache/airflow/pull/45259#issuecomment-2572427916 This has a number of benefits, but one of them is that you will be able to build provider by just running `build` command of your favourite PEP-standard compliant frontend: cd providers/amazon/aws/ `uv build` (or `hatch build` or `poetry build` or `flit build` ).... This will create the provider package inside the `dist" folder. I just did it in my PR with `uv` in the first "airbyte` project: root@d74b3136d62f:/opt/airflow/providers/airbyte# uv build Building source distribution... Building wheel from source distribution... Successfully built dist/apache_airflow_providers_airbyte-5.0.0.tar.gz Successfully built dist/apache_airflow_providers_airbyte-5.0.0-py3-none-any.whl That's it. That also allows cases like installing provider packages using git URLs - which I used earlier today to test if the incoming PR of pygments is actually solving the problem we had yesteday https://github.com/apache/airflow/pull/45416 (basically we just make our provider packages "standard" python packages that all the tools support. Anyone who would like to install a commit, hash or branch version of the "airbyte" package from main version of Airflow repo will be able to do: pip install "apache-airflow-providers-airbyte @ git+ https://github.com/apache/airflow.git/providers/airbyte@COMMIT_ID" Currently in order to create the package we need to manually extract the "amazon" subtree, copy it elsewhere, prepare dynamically some files (pyproject.toml, README.rst and few others) and only then we build the package. All this - copying file structure, creating new files, running the build command after and finally deleting the copied files is now - dynamically and under-the-hood created by "breeze release-management prepare-provider-packages" command. With this change, the directory structure in `git` repo of ours is totally standard and allows us (and anyone else) to build the package directly from it. And what is the plan for system tests? As part of this reorganization, > could they be moved into providers/{PROVIDER_ID}/tests/system? That seems > more intuitive to me than their current location in > providers/tests/system/{PROVIDER_ID}/example_foo.py. > > Oh yeah - I missed that in the original structure as the "airbyte" provider (that I chose as first one) did not contain the "system" tests - but one of the two providers after that i was planning to make sure system tests are covered. They are supposed to be moved to "tests/system" of course - Elad had similar question and I also explained it in detail in https://github.com/apache/airflow/pull/45259#issuecomment-2572427916 I hope it answers the questions. If not - I am happy to add more clarifications :) > J. >