Hello everyone,

TL;DR: I am looking for consensus on releasing "slim" versions of PROD
images - ones that will be much smaller, contain no providers or
other extras, and be database-specific.

Context:

Now that we are done with some infra changes that were also released
in 2.3.0, I came back to the issue raised in
https://github.com/apache/airflow/issues/20849, which was originally
about a "vanilla" image for Airflow, but I renamed the idea to "slim"
image (following a similar convention used by various distro and Python
providers). The issue itself explains why there is a need for such
images.

The idea is to have a very small "base" ("slim") image that users will
be able to extend, in addition to the "regular" (see the relation with
"slim" :D ?) image where we pre-install a set of providers and
support multiple database backends.

The "slim" images also have the advantage that we can use
"no-constraints" dependencies with them - which means that in those
images, the dependencies are the latest versions that Airflow supports,
even if some providers would otherwise limit them.

I looked at what it would mean and really what it translates to is
that we would have to push many more images.

The bad news:

We need to push a matrix of 4 * 3 = 12 new "slim" images (plus some
aliases for "latest"):
*  Python versions: 3.7, 3.8, 3.9, 3.10
*  Database: postgres, mysql, mssql
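
To make the size of that matrix concrete, here is a minimal Python
sketch that enumerates it. This is only an illustration, not the actual
release script: the tag format and the function name are hypothetical,
and the docker-style platform strings simply mirror the
AMD64/ARM64 split noted just below.

```python
from itertools import product

PYTHON_VERSIONS = ["3.7", "3.8", "3.9", "3.10"]
DATABASES = ["postgres", "mysql", "mssql"]

# Assumption: docker-style platform names; only postgres is multiplatform for now.
PLATFORMS = {
    "postgres": ["linux/amd64", "linux/arm64"],
    "mysql": ["linux/amd64"],
    "mssql": ["linux/amd64"],
}


def slim_image_matrix(airflow_version):
    """Return one entry per (Python version, database) combination."""
    return [
        {
            # Hypothetical tag format, for illustration only.
            "tag": f"slim-{airflow_version}-python{py}-{db}",
            "platforms": PLATFORMS[db],
        }
        for py, db in product(PYTHON_VERSIONS, DATABASES)
    ]


if __name__ == "__main__":
    images = slim_image_matrix("2.3.0")
    for image in images:
        print(image["tag"], ",".join(image["platforms"]))
    print(len(images), "images")  # 4 Python versions * 3 databases = 12
```

Running it lists the 12 combinations, with the 4 postgres entries
carrying both platforms.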

Postgres images would additionally be multiplatform (AMD64/ARM64) and
for now MySQL and MsSQL would be just AMD64 (until we add support for
ARM for those).
That is a lot of images, but this is a rather normal approach if
you look at the number of images published by other
"platform-like" products.

The good news:

We only need to do it at release time and we already have the right
set of scripts and parameters to enable that. It will take a bit
longer, but those images are much smaller, and building and pushing
them is WAY faster than for the regular image.

Some comparison:

Size (uncompressed): Regular (1.1G), Slim (500MB)
Time to build a single image: Regular (6m), Slim (up to 3m)

Overall the release process would take some 20 mins longer if we
release the slim images (and I already made it a separate step so it
should not block "regular" release).

The very good news:

I've actually prepared a PR:
https://github.com/apache/airflow/pull/23391 to add this feature
(including the docs), and it's a very small change. It does not change
any of the source code of Airflow or the Dockerfile; we basically need to
extend our "dev" script to build and push images to ... build and push
more images. I actually even prepared and pushed 2.3.0 images of
Airflow to my private dockerhub account so that everyone can see what
it will look like.

You can see it here:
https://hub.docker.com/repository/docker/potiuk/airflow/tags?page=1&ordering=last_updated&name=2.3.0

I **believe** those changes don't even need PMC votes for release, as
this is more a procedural change than a software release, so we
**could** release the "slim" 2.3.0 images even now - so that they are
available as of 2.3.0. I think that if we see that this is a welcome
change (despite the added complexity of the images available on our
dockerhub), it could even be agreed to via lazy-consensus if we see
consensus forming.

J.
