Hello here, *TL;DR; Following recent discussions I am seeking an opinion and consensus on the way we implement Airflow version checks in providers - ideally in a consistent way, because currently we have a very messy approach where 5 different ways are used in a way that makes it prone to errors.*
Yesterday we had - I think a bit unnecessarily - heated discussion on a change I proposed to apply consistency in the Airflow version compatibility check in providers. I proposed the change after some concerns were raised about using "common.compat" for this by Daniel (rightfully so IMHO) - and I proposed a solution that was supposed to address all problems around the version check I was aware about in the PR [1] (Generate constants with airflow versions in providers). Somewhat surprisingly it raised quite strong objections (borderline rejection) and somewhat heated discussions - from both Ash and Daniel - while met somewhat enthusiastic approval from Kacper and Jens, and since I respect and value everyone's opinion, I decided (following Daniel's suggestion) to separate the non-controversial part of my PR [2] (fixing __init__.py override problem that already happened for the second time and caused problems when Elad attempted to release providers). While I hope [2] will be quickly approved as non-controversial, I am also seeking community wisdom and decision (consensus hopefully, vote if needed) how we approach version checks because we seem to have different opinions there. Here is a bit more context: *Context*: In our providers that currently support Airflow 2.8.0+, we often have to have conditional code to support all the (4 currently) supported versions of Airflow - 2.8, 2.9, 2.10, 3.0 (in the near future also 2.11). This happens in main providers but also in the "test" code - because some of our tests must also be compatible. Airflow version check for the test code is pretty consistent and well documented (with examples) [3] and generally well followed "by example" and "docs". We use AIRFLOW_V_2_10_PLUS from "tests_common.test_utils.compat" module. *Problem:* However the version ched is currently done pretty inconsistently in various providers. Generally version check is done by this snippet: from airflow import __version__ as AIRFLOW_VERSION AIRFLOW_V_2_10_PLUS = Version(Version(AIRFLOW_VERSION).base_version) >= Version("2.10.0") But it is used pretty inconsistently and with side-effects a) In some cases the code is imported from `common.compat` provider - which means that there should be a dependency on common.compat from the provider and in the future when we add new versions (2.11.0) we have to bump version of the compat.provider as dependency. b) in other cases the above snippet is stored in some module (for example version_references.py in the providers and used from there in all provider modules as c) in other cases the snippet is inlined several times in different modules of the providers - usually at top level of the module d) in other cases the snipped produces differently named constant - for example _IS_AIRFLOW_2_10_OR_HIGHER e) in other cases the check is directly inlined: if Version(Version(version).base_version) >= Version("3.0.0"): f) also in some cases the imports are done from wrong places. Many of those top-level constants are named the same AIRFLOW_V_2_10_PLUS. Sometimes we have the same constant defined multiple times even in the same provider (and same name as the tests_compat one). g) in some cases (and that was the cause for release manager issues we had and is going to be solved by [1]) some contributors did not realise __init__.py is generated (despite comment in the file) - and added the constants to the __init__.py file (this is very natural place for those constants, so I am not surprised) Even currently (but also in the past) it happened that the variables were imported from a wrong package - likely - I guess - because when you press "Alt-Enter" in IDE, it will import it from *somewhere* by default - and since we have about 12 places where the constant is defined, it might accidentally introduce a dependency of providers on test code or another providers *Proposed solution* So - we have quite a bit of a mess here, and my PR [1] aimed to solve it and introduce consistency. There is no DRY (but IMHO that's ok - those snippets are small-enough to not have to be imported from common place) - but the consistency and wrong imports are a problem even now in the main branch and it might get worse. I believe that we all agree that consistency is generally good. I think (but I am interpreting the complaints of Ash and Daniel) that "enforcing" the consistency with pre-commit, is wrong and that it is enough to rely on reviews (which already failed us as you can see). My idea with [2] (following our usual "automate consistency when you see a problem) was that we could use what we already - have - generation of the __init__.py (which we used during release process so far and [2] adds it also to pre-commit) - and generate the constants in the way that will make it difficult to import the wrong now. I proposed that for providers that need version variants we could generate those constants in provider's `__init__.py` : PROVIDERS_AMAZON_AIRFLOW_IS_2_10_PLUS = Version("2.10.0") <= _AIRFLOW_VERSION There are few more things generated - see [4] for example in my PR including explanation how it works in the generated code and [5] instruction on provider's lifecycle documented why and how we are doing it [5] so that the contributors can understand and learn how to do it. This way it will be difficult to make the cross-import from different providers or test code, we can have a consistent approach, and it is easy to review and see mistakes. I am totally open to different naming conventions, and even unconditional generation of the version constants - currently it is controlled by a "need_version_variants" flag in provider.yaml - but we could also generate it for all providers - always. I am looking for feedback on it - I hoped it could be done - as usual - in the PR, but apparently Ash and Daniel feel very strongly about it and it seems that we need to agree in the devlist. *What do you think? Few questions:* 1) Is this a good idea to not only propose and describe a consistent approach but also "enforce" consistency using our existing pre-commit framework ? 2) Is generation in the __init__.py a good idea - if not what alternative people would see? 3) Do others share my concern as well that imports of constants named the same way (which we already see slipped through reviews a few times) is a problem worth addressing by having differently named constants? Looking forward to feedback (and sorry for the long mail, but I tried to add TL;DR and structure it in a way that is easy to digest and understand problems and solutions proposed. J. [1] Generate constants with airflow versions in providers https://github.com/apache/airflow/pull/44686 [2] Prevent __init__.py in providers from being modified https://github.com/apache/airflow/pull/44713 [3] https://github.com/apache/airflow/blob/main/contributing-docs/testing/unit_tests.rst#implementing-compatibility-for-provider-tests-for-older-airflow-versions [4] Example generated proposal: https://github.com/apache/airflow/pull/44686/files#diff-bb74b7a9bbd568ea5d575410753d4e814cbaece5f42d84d96e8f7991a6c4ab95R41 [5] Documentation in Provider's lifecycle docs https://github.com/apache/airflow/pull/44686/files#diff-1ed6f13b432ea574158b9577dfb2a7ad8b88b0a09c45021e94b3c069b36a4408R436