mik-laj commented on a change in pull request #12548:
URL: https://github.com/apache/airflow/pull/12548#discussion_r529497102
##########
File path: setup.py
##########
@@ -658,9 +655,10 @@ def write_version(filename: str = os.path.join(*[my_dir, "airflow", "git_version
 EXTRAS_PROVIDERS_PACKAGES: Dict[str, Iterable[str]] = {
     'all': list(PROVIDERS_REQUIREMENTS.keys()),
-    # this is not 100% accurate with devel_ci definition, but we really want to have all providers
-    # when devel_ci extra is installed!
+    # this is not 100% accurate with devel_ci and devel_all definition, but we really want
+    # to have all providers when devel_ci extra is installed!
     'devel_ci': list(PROVIDERS_REQUIREMENTS.keys()),
+    'devel_all': list(PROVIDERS_REQUIREMENTS.keys()),
Review comment:
Apache Beam is a big project and has had a lot of dependency conflicts, especially in older versions. For example, Apache Beam v2.21 required a `mock` version older than 3, which conflicted with our tests.
https://github.com/apache/beam/blob/v2.21.0/sdks/python/setup.py#L157
It is also worth adding that there are several other ways to run Dataflow jobs. Instead of `DataflowCreatePythonJobOperator`, which runs the Beam process locally, we can use `DataflowTemplatedJobStartOperator`, `DataflowStartFlexTemplateOperator`, or `KubernetesPodOperator` (see the sketch below).
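A minimal sketch (not part of this PR) of what the flex-template path could look like, assuming the Google provider is installed; the project ID, region, bucket, template path, and pipeline parameters are placeholders:

```python
from airflow import DAG
from airflow.providers.google.cloud.operators.dataflow import DataflowStartFlexTemplateOperator
from airflow.utils.dates import days_ago

with DAG(
    dag_id="example_dataflow_flex_template",
    start_date=days_ago(1),
    schedule_interval=None,
) as dag:
    # Launches a pre-built Flex Template instead of running the Beam SDK
    # locally on the Airflow worker, so Beam's dependencies never have to
    # be installed alongside Airflow's.
    start_flex_template = DataflowStartFlexTemplateOperator(
        task_id="start_flex_template",
        project_id="my-gcp-project",   # placeholder project
        location="europe-west1",       # placeholder region
        body={
            "launchParameter": {
                "jobName": "my-beam-job",
                # placeholder GCS path to the Flex Template spec
                "containerSpecGcsPath": "gs://my-bucket/templates/my-template.json",
                # placeholder pipeline parameters
                "parameters": {"input": "gs://my-bucket/input.csv"},
            }
        },
    )
```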
I would advise against running Apache Beam on the same cluster as Airflow, because it is often more problematic and less optimal (financially as well). I have high hopes for `DataflowStartFlexTemplateOperator`, which was implemented by @TobKed and addresses most of the use cases that users have had.
CC: @aaltay