jedcunningham commented on a change in pull request #22573:
URL: https://github.com/apache/airflow/pull/22573#discussion_r836770537



##########
File path: README.md
##########
@@ -379,6 +379,14 @@ The important dependencies are:
    are very likely to introduce breaking changes across those so limiting it to MAJOR version makes sense
 * `werkzeug`: the library is known to cause problems in new versions. It is tightly coupled with Flask
    libraries, and we should update them together
+* `celery`: Celery is crucial component of Airflow as it used for Celery Executor. Celery

Review comment:
       ```suggestion
   * `celery`: Celery is crucial component of Airflow as it used for CeleryExecutor. Celery
   ```
   nit

##########
File path: setup.py
##########
@@ -235,7 +235,16 @@ def write_version(filename: str = os.path.join(*[my_dir, "airflow", "git_version
     'cassandra-driver>=3.13.0',
 ]
 celery = [
-    'celery>=5.2.3',
+    # The Celery is known to introduce problems when upgraded to a MAJOR version. Airflow Core
+    # Uses Celery for Celery executor, and we also know that Kubernetes Python client follows SemVer

Review comment:
       ```suggestion
       # Uses Celery for CeleryExecutor, and we also know that Kubernetes Python client follows SemVer
   ```

##########
File path: setup.py
##########
@@ -419,7 +428,15 @@ def write_version(filename: str = os.path.join(*[my_dir, "airflow", "git_version
 ]
 kubernetes = [
     'cryptography>=2.0.0',
-    'kubernetes>=21.7.0',
+    # The Kubernetes API is known to introduce problems when upgraded to a MAJOR version. Airflow Core
+    # Uses Kubernetes for K8S executor, and we also know that Kubernetes Python client follows SemVer

Review comment:
       ```suggestion
       # Uses Kubernetes for KubernetesExecutor, and we also know that Kubernetes Python client follows SemVer
   ```

##########
File path: setup.py
##########
@@ -235,7 +235,16 @@ def write_version(filename: str = os.path.join(*[my_dir, "airflow", "git_version
     'cassandra-driver>=3.13.0',
 ]
 celery = [
-    'celery>=5.2.3',
+    # The Celery is known to introduce problems when upgraded to a MAJOR version. Airflow Core
+    # Uses Celery for Celery executor, and we also know that Kubernetes Python client follows SemVer
+    # (https://docs.celeryq.dev/en/stable/contributing.html?highlight=semver#versions).
+    # This is a crucial component of Airflow, so we should limit it to the next MAJOR version and only
+    # deliberately bump the version when we tested it, and we know it can be bumped.
+    # Bumping this version should also be connected with
+    # limiting minimum airflow version supported in cncf.kubernetes provider, due to the
+    # potential breaking changes in Airflow Core as well (celery is added as extra, so Airflow
+    # core is not hard-limited via install-requirements, only by extra.

Review comment:
       ```suggestion
       # core is not hard-limited via install-requirements, only by extra).
   ```
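
   As an aside for readers of this thread: the hunk above is cut off before the actual requirement string, so the exact version specifier is not visible here. Below is a minimal sketch of the upper-bounding pattern the comment describes, with an assumed `<6` bound used purely for illustration:

   ```python
   # setup.py (sketch) -- the '<6' bound is an assumption for illustration; the PR may
   # use a different specifier. The intent described in the comment is to stay below the
   # next MAJOR Celery release and only raise the bound deliberately, after testing.
   celery = [
       'celery>=5.2.3,<6',
   ]
   ```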

##########
File path: README.md
##########
@@ -379,6 +379,14 @@ The important dependencies are:
    are very likely to introduce breaking changes across those so limiting it to MAJOR version makes sense
 * `werkzeug`: the library is known to cause problems in new versions. It is tightly coupled with Flask
    libraries, and we should update them together
+* `celery`: Celery is crucial component of Airflow as it used for Celery Executor. Celery
+   [follows SemVer](https://docs.celeryq.dev/en/stable/contributing.html?highlight=semver#versions), so
+   we should upper-bound it to the next MAJOR version. Also when we bump the upper version of the library,
+   we should make sure Celery Provider minimum Airflow version is updated).
+* `kubernetes`: Kubernetes Executor is a crucial component of Airflow as it is used for the K8SExecutor.

Review comment:
       ```suggestion
   * `kubernetes`: Kubernetes is a crucial component of Airflow as it is used for the KubernetesExecutor.
   ```
   nit

##########
File path: setup.py
##########
@@ -1033,17 +1050,29 @@ def replace_extra_requirement_with_provider_packages(extra: str, providers: List
             ['simple-salesforce>=1.0.0', 'tableauserverclient']

     So transitively 'salesforce' extra has all the requirements it needs and in case the provider
-    changes it's dependencies, they will transitively change as well.
+    changes its dependencies, they will transitively change as well.

     In the constraint mechanism we save both - provider versions and it's dependencies
     version, which means that installation using constraints is repeatable.

+    For K8s, Celery and Dask which are both "Core executors" and "Providers" we have to
+    add the base dependencies to the core as well - in order to mitigate problems where
+    newer version of provider will have less strict limits. This should be done for both:

Review comment:
       ```suggestion
       newer version of provider will have less strict limits. This should be done for both
   ```

##########
File path: setup.py
##########
@@ -419,7 +428,15 @@ def write_version(filename: str = os.path.join(*[my_dir, "airflow", "git_version
 ]
 kubernetes = [
     'cryptography>=2.0.0',
-    'kubernetes>=21.7.0',
+    # The Kubernetes API is known to introduce problems when upgraded to a MAJOR version. Airflow Core
+    # Uses Kubernetes for K8S executor, and we also know that Kubernetes Python client follows SemVer
+    # (https://github.com/kubernetes-client/python#compatibility). This is a crucial component of Airflow
+    # So we should limit it to the next MAJOR version and only deliberately bump the version when we
+    # tested it, and we know it can be bumped. Bumping this version should also be connected with
+    # limiting minimum airflow version supported in cncf.kubernetes provider, due to the
+    # potential breaking changes in Airflow Core as well (kubernetes is added as extra, so Airflow
+    # core is not hard-limited via install-requirements, only by extra.

Review comment:
       ```suggestion
       # core is not hard-limited via install-requirements, only by extra).
   ```

##########
File path: airflow/providers/celery/provider.yaml
##########
@@ -32,7 +32,6 @@ versions:
 
 additional-dependencies:
   - apache-airflow>=2.1.0

Review comment:
       ```suggestion
     - apache-airflow>=2.2.0
   ```
   
   This is probably what we want, as 2.2.0 was the first version with celery 5?

##########
File path: setup.py
##########
@@ -1033,17 +1050,29 @@ def replace_extra_requirement_with_provider_packages(extra: str, providers: List
             ['simple-salesforce>=1.0.0', 'tableauserverclient']

     So transitively 'salesforce' extra has all the requirements it needs and in case the provider
-    changes it's dependencies, they will transitively change as well.
+    changes its dependencies, they will transitively change as well.

     In the constraint mechanism we save both - provider versions and it's dependencies
     version, which means that installation using constraints is repeatable.

+    For K8s, Celery and Dask which are both "Core executors" and "Providers" we have to
+    add the base dependencies to the core as well - in order to mitigate problems where
+    newer version of provider will have less strict limits. This should be done for both:
+    extras and their deprecated aliases. This is not a full protection however, the way
+    extras work, this will not add "hard" limits for airflow and the user who does not use
+    constraints
+
     :param extra: Name of the extra to add providers to
     :param providers: list of provider ids
     """
-    EXTRAS_REQUIREMENTS[extra] = [
-        get_provider_package_from_package_id(package_name) for package_name in providers
-    ]
+    if extra in ['cncf.kubernetes', 'kubernetes', 'celery']:

Review comment:
       We mention dask in the docstring, but don't actually do anything for it here? Which is right?

##########
File path: setup.py
##########
@@ -1033,17 +1050,29 @@ def replace_extra_requirement_with_provider_packages(extra: str, providers: List
             ['simple-salesforce>=1.0.0', 'tableauserverclient']

     So transitively 'salesforce' extra has all the requirements it needs and in case the provider
-    changes it's dependencies, they will transitively change as well.
+    changes its dependencies, they will transitively change as well.

     In the constraint mechanism we save both - provider versions and it's dependencies
     version, which means that installation using constraints is repeatable.

+    For K8s, Celery and Dask which are both "Core executors" and "Providers" we have to
+    add the base dependencies to the core as well - in order to mitigate problems where
+    newer version of provider will have less strict limits. This should be done for both:
+    extras and their deprecated aliases. This is not a full protection however, the way
+    extras work, this will not add "hard" limits for airflow and the user who does not use
+    constraints

Review comment:
       ```suggestion
       extras work, this will not add "hard" limits for Airflow and the user who does not use
       constraints.
   ```
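
   For context on the mechanism the docstring describes: the executor branch is only partially visible in the hunks above, so the following is a minimal, self-contained sketch of the intended behaviour, not the PR's actual code. `CORE_EXECUTOR_DEPENDENCIES`, the version bounds, and the helper body are illustrative stand-ins.

   ```python
   from typing import Dict, List

   EXTRAS_REQUIREMENTS: Dict[str, List[str]] = {}

   # Base dependency lists that also back the core executors (assumed pins for illustration).
   CORE_EXECUTOR_DEPENDENCIES: Dict[str, List[str]] = {
       "celery": ["celery>=5.2.3,<6"],
       "kubernetes": ["cryptography>=2.0.0", "kubernetes>=21.7.0,<24"],
       "cncf.kubernetes": ["cryptography>=2.0.0", "kubernetes>=21.7.0,<24"],
   }


   def get_provider_package_from_package_id(package_id: str) -> str:
       """Map a provider id such as 'celery' to its PyPI package name."""
       return f"apache-airflow-providers-{package_id.replace('.', '-')}"


   def replace_extra_requirement_with_provider_packages(extra: str, providers: List[str]) -> None:
       provider_packages = [get_provider_package_from_package_id(p) for p in providers]
       if extra in ("cncf.kubernetes", "kubernetes", "celery"):
           # Keep the (upper-bounded) base dependencies next to the provider package, so a
           # provider release with looser pins cannot silently relax what core was tested with.
           EXTRAS_REQUIREMENTS[extra] = CORE_EXECUTOR_DEPENDENCIES[extra] + provider_packages
       else:
           # Default behaviour: the extra becomes just the provider package(s).
           EXTRAS_REQUIREMENTS[extra] = provider_packages


   replace_extra_requirement_with_provider_packages("celery", ["celery"])
   # EXTRAS_REQUIREMENTS["celery"] == ['celery>=5.2.3,<6', 'apache-airflow-providers-celery']
   ```

   Note the sketch mirrors the `if extra in [...]` condition visible in the hunk; whether dask should also be listed there is exactly the open question raised above.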



