Those are very good questions :)

On Mon, Jan 6, 2025 at 10:54 PM Ferruzzi, Dennis
<ferru...@amazon.com.invalid> wrote:

> To clarify that I understand your diagram correctly, let's say you clone
> the Airflow repo to ~/workspace/airflow/.  Does this mean that the AWS Glue
> Hook which used to live at
> ~/workspace/airflow/providers/amazon/aws/hooks/glue.py (as a random
> example) will be located at
> ~/workspace/airflow/providers/amazon/aws/src/airflow/providers/amazon/aws/hooks/glue.py?
> That feels unnecessarily repetitive to me, maybe it makes sense but I'm
> missing the context?
>

Yes - there is some repetition, but for a good reason: the two
"amazon/aws" parts serve different purposes:

* The first "providers/amazon/aws" is just where the complete provider
project is stored - it's basically a convenient folder for the "aws
provider" that keeps it separate from other providers.
* The second "src/airflow/providers/amazon/aws" is the Python package
where the source files are stored - this is how (inside the sub-folder)
you tell the actual Python "import" where to look for the sources.
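To illustrate how the two paths relate, here is a minimal Python sketch. The
`module_name` helper is hypothetical (it is not part of Airflow); it only
assumes the "src" layout described above, where everything under
`<project>/src` maps directly onto the import path:

```python
from pathlib import PurePosixPath


def module_name(file_path: str, project_root: str) -> str:
    """Return the dotted import name of a file stored under <project_root>/src.

    Hypothetical helper for illustration only.
    """
    src_root = PurePosixPath(project_root) / "src"
    # Strip the project folder and "src", leaving only the package path.
    relative = PurePosixPath(file_path).relative_to(src_root)
    # Drop the ".py" suffix and join the remaining parts with dots.
    return ".".join(relative.with_suffix("").parts)


# First "amazon/aws": the folder that holds the provider project.
project_root = "providers/amazon/aws"
# Second "amazon/aws": part of the Python package inside "src".
hook_file = "providers/amazon/aws/src/airflow/providers/amazon/aws/hooks/glue.py"

print(module_name(hook_file, project_root))
```

The project folder could be named anything without changing the import name;
only the part below "src" matters to Python.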

What really matters is that (eventually)
`~/workspace/airflow/providers/amazon/aws/` can be treated as a completely
separate Python project - the source of a "standalone" provider.
There is a "pyproject.toml" file at the root of it, so you can run (for
example):

cd providers/amazon/aws/
uv sync

With that you will be able to work on the AWS provider exclusively, as a
separate project. (This is not yet complete with the move - it is not yet
possible to run all the tests today - but it will be possible as a next
step. I explained it in
https://github.com/apache/airflow/pull/45259#issuecomment-2572427916 )

This has a number of benefits, but one of them is that you will be able to
build the provider by just running the `build` command of your favourite
PEP-standard compliant frontend:

cd providers/amazon/aws/
uv build   (or `hatch build`, `poetry build` or `flit build`)

This will create the provider package inside the `dist` folder. I just did
it in my PR with `uv` in the first "airbyte" project:

root@d74b3136d62f:/opt/airflow/providers/airbyte# uv build
Building source distribution...
Building wheel from source distribution...
Successfully built dist/apache_airflow_providers_airbyte-5.0.0.tar.gz
Successfully built dist/apache_airflow_providers_airbyte-5.0.0-py3-none-any.whl

That's it. That also allows cases like installing provider packages using
git URLs - which I used earlier today to test whether the incoming PR of
pygments actually solves the problem we had yesterday:
https://github.com/apache/airflow/pull/45416 (basically we just make our
provider packages "standard" Python packages that all the tools support).
Anyone who would like to install the "airbyte" package at a given commit
hash or branch from the Airflow repo will be able to do:

pip install "apache-airflow-providers-airbyte @ git+https://github.com/apache/airflow.git@COMMIT_ID#subdirectory=providers/airbyte"

Currently, in order to create the package we need to manually extract the
"amazon" subtree, copy it elsewhere, dynamically prepare some files
(pyproject.toml, README.rst and a few others) and only then build the
package. All of this - copying the file structure, creating new files,
running the build command and finally deleting the copied files - is now
done dynamically and under the hood by the "breeze release-management
prepare-provider-packages" command. With this change, the directory
structure in our `git` repo is totally standard and allows us (and
anyone else) to build the package directly from it.
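As a rough sketch of what "totally standard" buys us: once each provider is
a regular project, finding the buildable ones is just a matter of looking
for "pyproject.toml" files. The `find_provider_projects` helper below is
hypothetical (it is not what breeze does), and it assumes that every
"pyproject.toml" under "providers/" marks one provider project:

```python
from pathlib import Path


def find_provider_projects(providers_root: Path) -> list[Path]:
    """Return the directories under providers_root that contain a pyproject.toml.

    Hypothetical helper for illustration only.
    """
    if not providers_root.is_dir():
        return []
    # Each pyproject.toml marks the root of one standalone provider project.
    return sorted(path.parent for path in providers_root.rglob("pyproject.toml"))


if __name__ == "__main__":
    # Print the command one could run to build each provider in place.
    for project in find_provider_projects(Path("providers")):
        print(f"cd {project} && uv build")
```

No copying of subtrees or generated files is involved; the folder in the
repo is the thing that gets built.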


> And what is the plan for system tests?   As part of this reorganization,
> could they be moved into providers/{PROVIDER_ID}/tests/system?  That seems
> more intuitive to me than their current location in
> providers/tests/system/{PROVIDER_ID}/example_foo.py.
>
>
Oh yeah - I missed that in the original structure, as the "airbyte" provider
(which I chose as the first one) does not contain "system" tests - but for
one of the next two providers I was planning to make sure system tests are
covered. They are supposed to be moved to "tests/system" of course - Elad
had a similar question and I also explained it in detail in
https://github.com/apache/airflow/pull/45259#issuecomment-2572427916


I hope it answers the questions. If not - I am happy to add more
clarifications :)


> J.
>
