This is an automated email from the ASF dual-hosted git repository.

potiuk pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/main by this push:
     new 25a905ef31b Decrease size of docker context by two orders of magnitude (#47342)
25a905ef31b is described below

commit 25a905ef31b943b5e625f87114739bbbd3ba88f9
Author: Jarek Potiuk <[email protected]>
AuthorDate: Tue Mar 4 11:34:12 2025 +0100

    Decrease size of docker context by two orders of magnitude (#47342)
    
    When building a docker image, local source files are sent as context.
    Unfortunately node and pnpm add a lot of cache inside the
    source tree, and if we are not careful about excluding it, we
    end up with GBs of context being sent to docker before the build
    even starts (which takes minutes).
    
    This PR removes the .pnpm-store folders from the context; they were the
    root cause of sending 1.5 GB of context. It also adds simple instructions
    on how to check which files are in the context and how to see the
    size of the context. With this change, the context is down
    from 1.5 GB to 90 MB, cutting the docker build context sending time
    from about a minute to under a second.
---
 .dockerignore                   | 15 ++++++++++++---
 dev/MANUALLY_BUILDING_IMAGES.md | 26 ++++++++++++++++++++++++++
 2 files changed, 38 insertions(+), 3 deletions(-)

diff --git a/.dockerignore b/.dockerignore
index dd90a925818..f561bc836f8 100644
--- a/.dockerignore
+++ b/.dockerignore
@@ -80,9 +80,10 @@
 
 # Git version is dynamically generated
 airflow/git_version
-airflow/ui/node_modules
-airflow/auth/managers/simple/ui/node_modules
 
+# Exclude node/pnpm caches
+**/.pnpm-store
+**/node_modules
 # Exclude link to docs
 airflow/ui/static/docs
 
@@ -91,6 +92,14 @@ airflow/www/static/docs
 airflow/www/static/dist
 airflow/www/node_modules
 
+# Exclude any .venv and .ruff_cache
+**/.venv
+**/.ruff_cache/
+
+# Exclude docs artifacts
+**/_inventory_cache/
+docs/**/_api/**
+
 # Exclude python generated files
 **/__pycache__/
 **/*.py[cod]
@@ -99,7 +108,7 @@ airflow/www/node_modules
 **/env/
 **/build/
 **/develop-eggs/
-/dist/
+**/dist/
 **/downloads/
 **/eggs/
 **/.eggs/
diff --git a/dev/MANUALLY_BUILDING_IMAGES.md b/dev/MANUALLY_BUILDING_IMAGES.md
index 2a07f8b0c08..b0537fe6c5b 100644
--- a/dev/MANUALLY_BUILDING_IMAGES.md
+++ b/dev/MANUALLY_BUILDING_IMAGES.md
@@ -22,6 +22,7 @@
 **Table of Contents**  *generated with [DocToc](https://github.com/thlorenz/doctoc)*
 
 - [Building docker images](#building-docker-images)
+- [Keeping your docker context small](#keeping-your-docker-context-small)
 - [Setting environment with emulation](#setting-environment-with-emulation)
 - [Setting up cache refreshing with hardware ARM/AMD support](#setting-up-cache-refreshing-with-hardware-armamd-support)
 
@@ -37,6 +38,31 @@ you do not have those two installed.
 You also need to have the right permissions to push the images, so you should run
 `docker login` before and authenticate with your DockerHub token.
 
+## Keeping your docker context small
+
+Sometimes, especially when you generate node assets, some of the generated files are kept in the source
+directory. This can make the docker context very large when building images, because the whole context
+is transferred to the docker daemon. To avoid this, we have `.dockerignore`, where we exclude certain
+paths from being treated as part of the context, similar to how `.gitignore` keeps them out of git.
+
+If your context gets large, you will see a long (minutes) preliminary step before the docker build runs,
+while the context is being transmitted.
+
+You can see all the context files by running:
+
+```shell script
+printf 'FROM scratch\nCOPY . /' | DOCKER_BUILDKIT=1 docker build -q -f- -o- . | tar t
+```
+
+Once you see something that should be excluded from the context, you should add it to the `.dockerignore` file.
+
+You can also check the size of the context by running:
+
+```shell script
+printf 'FROM scratch\nCOPY . /' | DOCKER_BUILDKIT=1 docker build -q -f- -o- . | wc -c | numfmt --to=iec --suffix=B
+```
+
+
 ## Setting environment with emulation
 
 According to the [official installation instructions](https://docs.docker.com/buildx/working-with-buildx/#build-multi-platform-images)
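
The file listing produced by the `tar t` command in the "Keeping your docker context small" section can be long, so it may help to group it by top-level path to spot what is bloating the context. The sketch below is not part of the commit: `top_context_dirs` is a hypothetical helper name, and it assumes the listing has one path per line, possibly prefixed with `./`. It counts entries, not bytes, so treat it only as a rough indicator.

```shell
# Hypothetical helper (not part of the commit): count context entries per
# top-level path. Reads one path per line on stdin, as printed by `tar t`.
# Optional first argument: how many rows to show (defaults to 10).
top_context_dirs() {
    awk '{
        sub(/^\.\//, "")            # drop a leading "./" if present
        if ($0 == "") next          # skip empty entries
        split($0, parts, "/")       # parts[1] is the top-level path
        count[parts[1]]++
    }
    END {
        for (dir in count) print count[dir], dir
    }' | sort -rn | head -n "${1:-10}"
}
```

To use it, append `| top_context_dirs 5` to the `tar t` pipeline from the section above. A top-level path with an unexpectedly high count is a candidate for a new `.dockerignore` entry.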
