potiuk commented on a change in pull request #15513:
URL: https://github.com/apache/airflow/pull/15513#discussion_r619753452
##########
File path: setup.py
##########
@@ -198,7 +198,7 @@ def get_sphinx_theme_version() -> str:
'watchtower~=0.7.3',
]
apache_beam = [
- 'apache-beam[gcp]',
+ 'apache-beam>=2.20.0',
Review comment:
Good point. I added it to the additional extras for "apache.beam" in the
google provider. As a result, this is what the google provider extras will
look like:
```
extras_require={
    'amazon': ['apache-airflow-providers-amazon'],
    'apache.beam': ['apache-airflow-providers-apache-beam', 'apache-beam[gcp]'],
    'apache.cassandra': ['apache-airflow-providers-apache-cassandra'],
    'cncf.kubernetes': ['apache-airflow-providers-cncf-kubernetes'],
    'facebook': ['apache-airflow-providers-facebook'],
    'microsoft.azure': ['apache-airflow-providers-microsoft-azure'],
    'microsoft.mssql': ['apache-airflow-providers-microsoft-mssql'],
    'mysql': ['apache-airflow-providers-mysql'],
    'oracle': ['apache-airflow-providers-oracle'],
    'postgres': ['apache-airflow-providers-postgres'],
    'presto': ['apache-airflow-providers-presto'],
    'salesforce': ['apache-airflow-providers-salesforce'],
    'sftp': ['apache-airflow-providers-sftp'],
    'ssh': ['apache-airflow-providers-ssh'],
    'trino': ['apache-airflow-providers-trino'],
},
```
Anyone who wants to use the Dataflow operators will have to install
`apache-airflow-providers-google[apache.beam]`. Those extras are nicely
documented in the README.
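To make the effect of that extra concrete, here is a minimal sketch of how a
requirement such as `apache-airflow-providers-google[apache.beam]` selects
entries from the `extras_require` mapping. The helper `requirements_for` and
the trimmed `EXTRAS_REQUIRE` dict are illustrative assumptions, not code from
this PR; the real resolution is done by pip.

```python
import re

# Trimmed copy of the google provider extras quoted in the comment (illustrative).
EXTRAS_REQUIRE = {
    'amazon': ['apache-airflow-providers-amazon'],
    'apache.beam': ['apache-airflow-providers-apache-beam', 'apache-beam[gcp]'],
}


def requirements_for(requirement: str, extras_require: dict) -> list:
    """Return the additional requirements selected by the extras in `requirement`.

    Hypothetical helper: handles only the simple `name[extra1,extra2]` form.
    """
    match = re.fullmatch(r'\s*([A-Za-z0-9._-]+)\s*(?:\[([^\]]*)\])?\s*', requirement)
    if match is None:
        raise ValueError(f'Unsupported requirement: {requirement!r}')
    extras = [e.strip() for e in (match.group(2) or '').split(',') if e.strip()]
    selected = []
    for extra in extras:
        # An unknown extra raises KeyError, which is good enough for a sketch.
        selected.extend(extras_require[extra])
    return selected


print(requirements_for('apache-airflow-providers-google[apache.beam]', EXTRAS_REQUIRE))
```

Without the `[apache.beam]` suffix the helper returns an empty list, which
mirrors the point that `apache-beam[gcp]` is not pulled in by default.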
##########
File path: setup.py
##########
@@ -502,7 +502,7 @@ def get_sphinx_theme_version() -> str:
'paramiko',
'pipdeptree',
'pre-commit',
- 'pylint>=2.7.0',
+ 'pylint~=2.7.4',
Review comment:
If we don't do it, we will need to fix the new issues reported by Pylint
2.8.0, which was released yesterday.
General context:
All the regular PRs use the current 'constraints'. But whenever we modify
setup.py, such a PR automatically attempts an "eager upgrade" to get the
latest versions of everything (including the new pylint version in this case).
This is our way to upgrade all such constraints pretty much automatically and
to detect any future incompatibilities early (without affecting the regular
PRs).
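The two install modes described above can be sketched roughly as follows. This
is a hypothetical illustration only: the function name, the `.[devel]` install
target, and the constraints file name are assumptions, not Airflow's actual CI
scripts. The pip options used (`--constraint`, `--upgrade`,
`--upgrade-strategy eager`) are standard pip flags.

```python
def build_pip_command(changed_files, constraints_file='constraints.txt'):
    """Pick the install mode the way the comment describes it.

    Hypothetical sketch: the file name, the '.[devel]' target, and this
    function itself are assumptions made for illustration.
    """
    base = ['pip', 'install', '-e', '.[devel]']
    if 'setup.py' in changed_files:
        # Dependencies changed: ignore constraints and pull the newest versions.
        return base + ['--upgrade', '--upgrade-strategy', 'eager']
    # Regular PR: stay on the pinned, known-good versions.
    return base + ['--constraint', constraints_file]


print(build_pip_command(['setup.py']))
print(build_pip_command(['airflow/models/dag.py']))
```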
Context for pylint:
We've already had a similar problem caused by new pylint versions (new errors
reported on already 'good' code whenever setup.py changed), so I'd rather pin
it to the minor version now and upgrade it in a separate PR (especially since
they are preparing to release a major 3.0 upgrade, which might require a bit
more of an 'overhaul').
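For reference, the `~=` operator is the PEP 440 "compatible release" clause:
`~=2.7.4` means `>=2.7.4, ==2.7.*`. A simplified sketch of that rule, which
supports only plain numeric versions and uses illustrative helper names:

```python
def parse_version(text):
    """Parse a plain numeric version such as '2.7.4' into a tuple of ints."""
    return tuple(int(part) for part in text.split('.'))


def compatible_release(version, pinned):
    """Return True if `version` satisfies `~=pinned` (PEP 440 compatible release).

    Simplified: supports only plain numeric versions, no pre/post releases.
    """
    candidate, base = parse_version(version), parse_version(pinned)
    if len(base) < 2:
        raise ValueError('~= needs at least two version segments')
    prefix = base[:-1]
    # Must be >= the pinned version and share every segment but the last.
    return candidate >= base and candidate[:len(prefix)] == prefix


for candidate in ('2.7.3', '2.7.4', '2.7.5', '2.8.0'):
    print(candidate, compatible_release(candidate, '2.7.4'))
```

Under this rule 2.7.5 would still be picked up by an eager upgrade, while
2.8.0 would not, unlike with the previous `>=2.7.0` specifier.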
##########
File path: airflow/providers/apache/beam/CHANGELOG.rst
##########
@@ -19,6 +19,64 @@
Changelog
---------
+2.0.0
+.....
+
+Breaking changes
+~~~~~~~~~~~~~~~~
+
+Integration with the ``google`` provider
+````````````````````````````````````````
+
+In 2.0.0 version of the provider we've changed the way of integrating with the ``google`` provider.
+The previous versions of both providers caused conflicts when trying to install them together
+using PIP > 20.2.4. The conflict is not detected by PIP 20.2.4 and below but it was there and
+the version of ``Google BigQuery`` python client was not matching on both sides. As the result, when
+both ``apache.beam`` and ``google`` provider were installed, some features of the ``BigQuery`` operators
+might not work properly. This was cause by ``apache-beam`` client not yet supporting the new google
+python clients when ``apache-beam[gcp]`` extra was used. The ``apache-beam[gcp]`` extra is used
+by ``Dataflow`` operators and while they might work with the newer version of the ``Google BigQuery``
+python client, it is not guaranteed.
+
+This version introduces additional extra requirement for the ``apache.beam`` extra of the ``google`` provider
+and symmetrically the additional requirement for the ``google`` extra of the ``apache.beam`` provider.
+Both ``google`` and ``apache.beam`` provider do not use those extras by default, but you can specify
+them when installing the providers. The consequence of that is that some functionality of the ``Dataflow``
+operators might not be available.
+
+Unfortunately the only ``complete`` solution to the problem is for the ``apache.beam`` to migrate to the
+new (>=2.0.0) Google Python clients.
+
+This is the extra for the ``google`` provider:
+
+.. code-block:: python
+
+ extras_require={
+ ...
+ 'apache.beam': ['apache-airflow-providers-apache-beam', 'apache-beam[gcp]'],
+ ....
+ },
+
+And likewise this is the extra for the ``apache.beam`` provider:
+
+.. code-block:: python
+
+ extras_require={'google': ['apache-airflow-providers-google', 'apache-beam[gcp]']},
+
+You can still run this with PIP version <= 20.2.4 and go back to the previous behaviour:
+
+.. code-block:: shell
+
+ pip install apache-airflow-providers-google['apache.beam']
Review comment:
Right :)