1) I love the separation and the subproject structure overall. Especially with 
docs in there it should help speed up our docs build on PRs  lot by not needing 
to build all docs

2) no real opinion, seems useful. Let’s see if we can do it with a ruff rule 
first? It _might_ be possible. (It is possible to extend the ruff rules on a 
per subproject basis, check out task_sdk/pyproject.toml) 

 3) I don’t love the `tests/provider_tests` (I know why we did it) so how about 
`tests/unit` instead? It makes it clearer what it is (cos everything under 
`tests/` are provider tests so it’s a bit confusing at first to see provider 
tests under a provider folder.

So this would be

PROVIDER
      |- pyproject.toml
      |- provider.yaml
      |- src/
      |.   | airflow/providers/PROVIDER/
      |                               |- PROVIDER CODE HERE
      |- docs/
      |- tests/
      |      | unit/
      |      |              | PROVIDER
      |      |                        |- UNIT TEST CODE HERE
      |      | integration/ (optional)
      |      |           | PROVIDER
      |      |                    |- INTEGRATION TEST CODE HERE
      |      | system/  (optional)
      |      |      | PROVIDER
      |      |               |- SYSTEM TEST CODE HERE


One question that wasn’t clear to me (maybe it’s not 100% relevant) is which of 
these folders will and won’t have init.py’s in them?

-ash

> On 15 Feb 2025, at 19:34, Jens Scheffler <j_scheff...@gmx.de.INVALID> wrote:
> 
> Not more to say. Positive to all three points. No better ideas. But time
> will tell and we might need to revise in future if tooling changes. But
> this is anyway the normal case.
> 
> On 15.02.25 20:24, Aritra Basu wrote:
>> Hi,
>> Thanks for starting the conversation Jarek,
>> 1. Given I can't think of a cleaner/better approach I'm for this. I do
>> think the end goal of adding intellij/vscode config files should definitely
>> be added because many might miss the instructions when they expect the ide
>> to just work.
>> 2. Very much yes, we definitely should have pre-commits to catch them.
>> 3. I do prefer these names over the af prefix, though I would love to hear
>> some alternatives
>> --
>> Regards,
>> Aritra Basu
>> 
>> On Sat, 15 Feb 2025, 11:45 pm Jarek Potiuk, <ja...@potiuk.com> wrote:
>> 
>>> Hello everyone,
>>> 
>>> TL;DR; I wanted to discuss and agree on the final setup for provider's
>>> tests package internal structure
>>> 
>>> Now that first stage of cleanup after providers move is merged (
>>> https://github.com/apache/airflow/pull/46608 - thanks Jens for a thorough
>>> review of the 608 file+ beast of a change) I would like to explain current
>>> approach for tests setup in providers and some of the consequences, and ask
>>> others what they think.
>>> 
>>> Currently each of the provider's sub-projects look like that more or less
>>> (this is all inside "providers" folder in airflow):
>>> 
>>> PROVIDER
>>>        |- pyproject.toml
>>>        |- provider.yaml
>>>        |- src
>>>        |.   | airflow.providers.PROVIDER
>>>        |                               |- PROVIDER CODE HERE
>>>        |- docs
>>>        |- tests
>>>        |      | provider_tests
>>>        |      |              | PROVIDER
>>>        |      |                        |- UNIT TEST CODE HERE
>>>        |      | integration (optional)
>>>        |      |           | PROVIDER
>>>        |      |                    |- INTEGRATION TEST CODE HERE
>>>        |      | system  (optional)
>>>        |      |      | PROVIDER
>>>        |      |               |- SYSTEM TEST CODE HERE
>>> 
>>> This structure is pretty "standalone" and allows the code in tests of
>>> providers to import other test provider code without going "out" to
>>> airflow. I implemented code in "tests_common" that automatically adds
>>> "tests"  for all providers to PYTHONPATH so what you can do now much more
>>> reasonable imports inside the provider test code (i.e. reusing some code
>>> inside tests for each provider):
>>> 
>>> This is for example what is now
>>> in
>>> providers/standard/tests/provider_tests/standard/decorators/test_python.py:
>>> 
>>> *from provider_tests.standard.operators.test_python import BasePythonTest*
>>> 
>>> This is pretty cool, because before the cleanup the imports looked like
>>> this (they were importing starting from the root of the airflow project):
>>> 
>>> *from
>>> providers.standard.tests.provider_tests.standard.operators.test_python
>>> import BasePythonTest*
>>> 
>>> This was bad - because none of the "providers.standard.tests" were real
>>> "python packages" - they were just folders that we used to place the
>>> "standard provider" and its tests in - we artificially made them "python
>>> packages" by placing `__init__.py` files in them. This is no longer there.
>>> 
>>> This has a certain consequence: if you want to make your IDE to understand
>>> the imports, you have to mark the "tests" folder as "root of test sources"
>>> (at least in IntelliJ).
>>> 
>>> This is easy to justify and understand (and a very, very reasonable
>>> assumption) and in a number of cases (for example when you open the
>>> provider code as a separate project, this will even be auto-detected.
>>> 
>>> I will add soon (after we finalize this discussion) a bit more instructions
>>> and explanation for various scenarios of opening projects and using various
>>> IDEs in order to develop provider's code to make it clearer - and maybe
>>> later on we could even commit some of the intellij/.vscode project
>>> configuration (just PYTHONPATH) so that people will not have to do it
>>> manually.
>>> 
>>> Eventually (we are not yet fully there but this is coming) - contributors
>>> in the future should be able to just point to the provider's folder as a
>>> project, run `uv sync` there and develop provider without loading and
>>> opening "airflow" project - run all the tests, build documentation etc.
>>> without even touching the parent airflow folders.
>>> 
>>> Now.... My questions are:
>>> 
>>> *1)* do you like this approach? Any comments?
>>> 
>>> *2)* should we add some pre-commits to verify and check that we only use
>>> the right "from" imports in our tests (there is a possibility that someone
>>> will still use longer names, from content root "from providers.standard" ?
>>> I think we should as it is likely to happen.
>>> 
>>> *3)* are the top-level package names I chose good ?
>>> 
>>> I've chosen "provider_tests", "system", "integration" - but maybe those are
>>> not the best names and we should change them ? They are a bit inconsistent,
>>> and also "system" and "integration" , while common, are unlikely to be used
>>> by other packages installed. Previously we could not do such import
>>> shortening because for example "smtp" provider clashed with "smtp" built-in
>>> package - adding "provider_tests", "System
>>> 
>>> I had other ideas such as "af_unit", "af_integration", "af_system" (they
>>> should not be very long). But those did not feel right. Somewhat I like the
>>> inconsistency - as integration and system tests are very much different
>>> from the unit tests, even if we still use pytest as runners.
>>> 
>>> Also - those changes are likely impacting those who run system dashboards,
>>> so you will have to adapt slightly the paths you use in your system tests
>>> :)
>>> 
>>> Let me know what you think of all that :)
>>> 
>>> J.
>>> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org
> For additional commands, e-mail: dev-h...@airflow.apache.org
> 


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org
For additional commands, e-mail: dev-h...@airflow.apache.org

Reply via email to