Merry Xmas :) I have a warm feeling about keeping the providers as extras (but also I'd love to hear what others think).

I think we should only change the extras to "[providers.google], [providers.amazon], [providers.cncf.kubernetes] ..." to keep them easily separated from non-provider extras and to make it more explicit which extras bring a provider with them.

Extras - in general - are pretty evil (especially if they are used in transitive dependencies). I am sure there are a few people around who will agree with me. However, they are really, really convenient for installing optional stuff. And this is the main reason why I think we should keep them.

I think the example you gave, Kaxil:

    pip install -U "apache-airflow[google]==2.2.3" -c $CONSTRAINTS_URL

is a very good one, and I think it is expected behavior (and one that our users should learn to expect). This is the exact same behavior you get if, for example, you run "pip install -U apache-airflow[virtualenv]". If we used a different "golden" virtualenv version, it would also upgrade virtualenv (and all its deps). This is also very clearly stated in the docs, where we explain the different upgrade scenarios:
https://airflow.apache.org/docs/apache-airflow/stable/installation/installing-from-pypi.html#upgrading-airflow-with-providers

BTW, the `-U/--upgrade` flag is not effective in this case. The `--upgrade` flag only affects the "direct" installation target and not its dependencies (https://pip.pypa.io/en/stable/user_guide/#only-if-needed-recursive-upgrade) unless --upgrade-strategy eager is specified. In this case it is the constraints that drive the upgrade of dependencies, not the --upgrade flag. So:

    pip install "apache-airflow[google]==2.2.3" -c $CONSTRAINTS_URL

will also upgrade all the deps of core airflow and the google provider to the versions specified in the constraints. I just learned about that recently too (from TP) and just opened a PR to correct it in our examples:
https://github.com/apache/airflow/pull/20537/files

Why do I think this is good behavior?

1) Because this is the strategy we took for providers.
We only introduce breaking changes when we must, but we always give users the opportunity to downgrade selected providers if they see a problem. It's an "optimistic" strategy, sure, but one that is very close to reality. Even when we had breaking changes in Google or some other bigger providers, it was very likely that things would continue working for most users (breaking changes were usually very localized). And we managed to keep our providers mostly backwards compatible during the last year without a huge maintenance burden (a heavily increased maintenance burden is the only reason why backwards compatibility should be broken, IMHO). And most of the time you get the benefit of using the latest and greatest dependencies (which is great for security, and should be even more important after the recent log4j drama). And the whole point of providers is that you can still selectively downgrade. This is really a huge thing, praised in a number of conversations I had with our users.

2) Because providers are "the same" kind of dependencies as 3rd-party dependencies. If we upgrade virtualenv to the latest version (no matter if it's breaking or not - as long as it passes the tests), why should we not upgrade providers in the same way?

3) Most important - because this is what we do anyway in our reference image, and I cannot imagine us changing it. Users of our image will get precisely this default behavior anyhow. If they use 2.1 and upgrade to 2.3, all the embedded providers (and even all those they install using constraints) will be upgraded by default. And there is not much we can do about it (unless we completely strip the providers from the image - which is not a good idea, I think).

What would change if we remove extras? Not much. I think we should advise our users on what to do if they want to follow the same strategy as our "images".
The advice would change to:

`pip install apache-airflow==2.2.3 apache-airflow-providers-google -c $CONSTRAINTS_URL`

which would have the same effect, only much longer to write. And they can do it today.

If they want to stay with the exact version of the provider they have, they just have to run:

`pip install -U apache-airflow==2.2.3 -c $CONSTRAINTS_URL`

That would not change. This is even one of the options we list as "installation and upgrade scenarios":
https://airflow.apache.org/docs/apache-airflow/stable/installation/installing-from-pypi.html#installation-and-upgrade-of-airflow-core

But I think if we change the extra from "[google]" to "[providers.google]", it would make it much more obvious to users that it is about the provider's upgrade as well.

J.

On Fri, Dec 24, 2021 at 5:18 PM Kaxil Naik <[email protected]> wrote:
>
> Hi folks,
>
> Merry Christmas 🎄🎅.
>
> Currently, Airflow allows installing providers via "extras" -
> https://airflow.apache.org/docs/apache-airflow/stable/extra-packages-ref.html#providers-extras
> for convenience.
>
> We initially did this to ease migration for users from 1.10.x to 2.x. But
> I want to discuss this again as I feel users can unintentionally upgrade the
> provider versions.
>
> Example:
>
> pip install -U "apache-airflow[google]==2.2.3" -c $CONSTRAINTS_URL
>
> Now, if a new major version of the provider is released between Airflow 2.2.2
> and 2.2.3 and the user uses that command, it will bump the
> "apache-airflow-providers-google" provider from, as an example, 4.0.0 to 5.0.0,
> along with all of its dependencies.
>
> This might have unintended consequences. Now an easy solution, if the user
> notices this, is to downgrade the provider to the previously installed
> version by running:
>
> pip install -U "apache-airflow-providers-google==4.0.0"
>
> However, I feel we can stop this unintended upgrade by not allowing the
> installation of providers via extras.
> This would also clear up any confusion users might have about installing
> providers, as we will only have a single way to install them and will truly
> separate providers from the core. And users can upgrade each provider only
> when they need to, and assess the impact when upgrading to major versions of
> the provider.
>
> On the flip side, installing providers via extras is actually really
> convenient 😁 and I use them all the time for testing.
>
> Thoughts?
>
> Regards,
> Kaxil
