potiuk commented on issue #18932:
URL: https://github.com/apache/airflow/issues/18932#issuecomment-966687790


   Hmm. I thought about it and I think it's pretty much expected behaviour 
(though we might simply want to remove the bundle extras from released airflow, 
as they make very little sense there).
   
   You are not supposed to use the "all" and "all_dbs" extras when you are 
installing airflow from PyPI. Those are development-only extras which work a bit 
differently than the "provider" extras.
   
   I think the problem is different - we should simply remove them from the 
"installable" version of airflow in PyPI because having them there is simply 
misleading. I will do it as a follow up of this, when I am back at home 
(travelling now).
   
   Airflow has several different types of extras 
(https://airflow.apache.org/docs/apache-airflow/stable/extra-packages-ref.html)
   
   * Core extras - those install deps needed by some "core features" of Airflow 
that are not enabled by default
   
   * Provider extras - those extras can be used to install "provider" 
packages (you can also install those provider packages manually as packages). 
In the "PyPI/packaged" version of Airflow those extras do not carry the 
"specific" requirements. For example, "airflow[microsoft.azure]" introduces a 
dependency on "apache-airflow-providers-microsoft-azure" but it does not have 
an "azure-cosmos" dependency itself (that one comes transitively from the 
provider package).
   
   * Bundle extras - "all_dbs" and "all" belong to this group.
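
   To see which requirements a given extra adds, one option is to group the 
`Requires-Dist` entries of a distribution by their `extra == "..."` marker. 
The sketch below is only an illustration: the helper names are hypothetical, 
and the exact spelling of extra names in the metadata (for example 
`microsoft.azure` versus a normalized form) is an assumption that may differ 
between Airflow versions.

```python
import re
from collections import defaultdict
from importlib import metadata

# Matches the `extra == "name"` part of an environment marker (either quote style).
_EXTRA_MARKER = re.compile(r"""extra\s*==\s*['"]([^'"]+)['"]""")


def requirements_by_extra(requirement_lines):
    """Group requirement specs by the extra that pulls them in.

    Lines without an `extra == ...` marker are unconditional and are skipped.
    """
    grouped = defaultdict(list)
    for line in requirement_lines:
        spec, _, marker = line.partition(";")
        match = _EXTRA_MARKER.search(marker)
        if match:
            grouped[match.group(1)].append(spec.strip())
    return dict(grouped)


def installed_requirements_by_extra(distribution="apache-airflow"):
    """Hypothetical convenience wrapper: read the metadata of an installed package."""
    try:
        lines = metadata.requires(distribution) or []
    except metadata.PackageNotFoundError:
        lines = []  # the distribution is not installed in this environment
    return requirements_by_extra(lines)
```

   With a packaged Airflow installed, a provider extra would be expected to 
map to its provider package only, while a bundle extra such as `all_dbs` would 
list the specific libraries directly.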
   
   The problem is that, unlike "provider" extras, the bundle extras directly 
contain the "transitive" dependencies as they were at the time of the package 
release. With "provider" extras the dependencies are transitive: they come from 
the providers that are actually installed. But the bundle dependencies are 
often different from the ones in constraints. The constraints we generate 
include the dependencies of the providers that were RELEASED at the time of 
preparing a given version. In the meantime the dependencies could have changed 
in "main", and those are the ones that then go into "all" and "all_dbs". So in 
fact "all" and "all_dbs" are really only useful when you are installing airflow 
in "development" mode from sources, not when you are installing airflow from PyPI.
   
   You could check it yourself - if instead of `[all_dbs]` you specify 
`[apache.cassandra,apache.drill,apache.druid,apache.hdfs,apache.hive,apache.pinot,cloudant,exasol,influxdb,microsoft.mssql,mongo,mysql,neo4j,postgres,presto,trino,vertica]`
 - the installation should work just fine.
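
   As a rough sketch of that check, the snippet below assembles a `pip install` 
command that uses the explicit provider extras together with a constraints 
file. The function name and the versions (`2.2.1`, Python `3.8`) are 
placeholders chosen for illustration, not values taken from the issue; the 
constraints URL follows the pattern described in the Airflow installation docs.

```python
import shlex

# Provider extras listed in the comment as a replacement for the `all_dbs` bundle.
DB_PROVIDER_EXTRAS = [
    "apache.cassandra", "apache.drill", "apache.druid", "apache.hdfs",
    "apache.hive", "apache.pinot", "cloudant", "exasol", "influxdb",
    "microsoft.mssql", "mongo", "mysql", "neo4j", "postgres", "presto",
    "trino", "vertica",
]


def pip_install_command(airflow_version, python_version, extras):
    """Build the argv for installing Airflow with explicit extras and constraints."""
    requirement = f"apache-airflow[{','.join(extras)}]=={airflow_version}"
    constraints_url = (
        "https://raw.githubusercontent.com/apache/airflow/"
        f"constraints-{airflow_version}/constraints-{python_version}.txt"
    )
    return ["pip", "install", requirement, "--constraint", constraints_url]


if __name__ == "__main__":
    # Placeholder versions: substitute the Airflow and Python versions you use.
    command = pip_install_command("2.2.1", "3.8", DB_PROVIDER_EXTRAS)
    # shlex.join quotes the bracketed requirement so it is safe to paste into a shell.
    print(shlex.join(command))
```

   Building the command as a list keeps the bracketed requirement as a single 
argument, which avoids shell globbing problems with `[` and `]`.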
   
   I think I will simply make sure to document it and clarify the behaviour of 
bundle extras, and I will remove the bundle extras from the next release of 
Airflow. The bundle extras make very little sense for PyPI installation.
   
   WDYT? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
