potiuk commented on pull request #19189: URL: https://github.com/apache/airflow/pull/19189#issuecomment-950887593
> I've been hesitant to propose this since this is technically a backward-incompatible change for those using PROD as a base image for their Dockerfile, the most significant part being (obviously) the location of the interpreter. So while I think this is a good thing to do in a vacuum, this should probably be done by introducing a new image tag series and deprecating the venv-less one until 3.0.

First of all, I do not think this is backwards-incompatible. Secondly, I do not really think it would be a problem even if it were, because Airflow compatibility has nothing to do with image compatibility (especially since our image is not yet an "official stable" image - it's a "reference" image).

Why do I think it is not incompatible? Because all the examples and recommendations we had about extending and customising the image remain unchanged. The image, Airflow, providers and all the tools inside will continue to work if people were using our examples and following them (and we have PLENTY of them). Even more - those examples are automatically validated during the CI build (except image customisation, which I run separately every time I make a significant change like this one), so I am pretty sure they are working fine. All our prod "image tests" are also working fine with it (we test if all the imports work, if all providers are installed and importable, etc.).

From the user's point of view - whoever customizes or extends the image - nothing changes. The only change is where the packages are installed. But if they use (as they should) `pip` to manipulate their packages, nothing changes. Even if they manually added the `--user` flag to their `pip` commands, this will continue to work (except in some really obscure cases) - although they were not even encouraged to do that: we had the PIP_USER variable set in the image, which made this behaviour automatic (and this variable is gone with this change). This is really equivalent to refactoring code which is not "public" API in Python.
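To illustrate the point about install locations, here is a minimal sketch (not part of this PR; the helper names `in_virtualenv` and `package_locations` are hypothetical) of asking the running interpreter where packages live instead of hardcoding a path. It uses only the standard library, so it should report sensible values both for a `--user` style layout and for a venv-based one:

```python
import site
import sys
import sysconfig


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv."""
    # Inside a venv, sys.prefix points at the venv directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


def package_locations() -> dict:
    """Collect the install locations reported by the running interpreter."""
    return {
        "in_virtualenv": in_virtualenv(),
        # Directory where pure-Python packages are installed by default.
        "purelib": sysconfig.get_paths()["purelib"],
        # Per-user site-packages directory (the target of `pip install --user`).
        "user_site": site.getusersitepackages(),
        # ENABLE_USER_SITE may be None, so normalise it to a bool.
        "user_site_enabled": bool(site.ENABLE_USER_SITE),
    }


if __name__ == "__main__":
    for key, value in package_locations().items():
        print(f"{key}: {value}")
```

A script that discovers paths this way does not depend on whether the image installs packages into the user site or into a venv, which is the sense in which the location is an implementation detail.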
The "location" of the installed packages is not "public API". The `pip` commands to manipulate those are the API (and those have not changed).

Now, why would this not be a big problem even if it were more "backwards-incompatible"? The `Airflow X.Y` compatibility is all about "Airflow", not about the image. There is no "guarantee" that the image will remain unchanged - in fact we have made quite a number of incompatible changes to the variable names used when customizing the image in the past without any major disruptions to our users. The image we publish is not an "official" release - it is a "convenience binary" and I often even call it a "reference image". It does not bring the same "guarantees" as an official release; details of it can change without breaking Airflow MAJOR version compatibility. I try - of course - not to do it, and I think we had far more of those changes between 2.0.0 and 2.1.0 - where we got a lot of feedback from the users (for example OpenShift compatibility came from that) and were able to incorporate a lot of that without waiting for Airflow 3. That's a major win for the quality of the image, I think.

Even Python base images made some backwards-incompatible changes in the past. For example, by suddenly replacing the 3.* images with ones that had Python 2.7 removed (!) without even bumping the patchlevel (!). That's not a "nice" approach of course - but technically speaking it did not break Python 3.* compatibility (otherwise they would have had to wait with releasing the images without Python 2.7 until version 4).

This situation will change however (from my point of view at least - apparently Python maintainers have a different view on that) when we apply for the "official docker image status" - https://docs.docker.com/docker-hub/official_images/. Then I would be far more careful about similar changes. This is about the last two changes I am still hesitant about completing because there were a few open things (like the .venv).
When I look at the rate of changes of the image, it has stabilized significantly. We handle all the cases we want to handle, the API to build those images has been significantly simplified and is more intuitive, and we had far more issues raised by users where my answer is "Yes - this is supported already by the image, see the doc here" (for example when people want to build the image in an air-gapped environment, or when they want to verify the provenance of all the Python packages, or when they want to add a custom entrypoint, etc.). I was building up the knowledge and documentation and I think I am rather close to saying "yeah, we are ready to get the official image status".

By then I also plan to extract a separate "read-only" repo where only relevant files will be present (I plan to use `copybara` to copy only relevant commits/code from the Airflow repo). Then it will be much easier for people to "officially" build their custom images, and the image will actually be built automatically by Docker's official team. Plus we will get extra security checks and notifications, as the "official" images by Docker get special treatment, including some automated scanning and notifications - and then we will likely also have to build a somewhat faster loop for rebuilding the images when security issues are discovered in the base image (but that's another topic to be discussed when we apply for the "official" status). Then such images will be available to pull as `docker pull apache-airflow`, and then yeah - I agree such a change could be seen as backwards-incompatible.

See the issues there: https://github.com/apache/airflow/projects/3 - not having "official" status is the only reason why AIP-26 is still "in-progress".

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
