hterik commented on PR #35026:
URL: https://github.com/apache/airflow/pull/35026#issuecomment-1772102486

   > The second case might be a bit more interesting - but there also you will 
see reinstallation only happening if you change requirements.txt.
   
   Or changes to content used in any of the layers **above** pip install. That
could be quite a few, e.g. `COPY scripts` and `COPY ${AIRFLOW_SOURCES_FROM}`
both happen above the pip install.
   Refactoring the Dockerfile to COPY only the requirements first and the
frequently changed sources later is another good optimization technique, but I
haven't looked at the Dockerfile in detail to see whether that is already done.
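   
   To illustrate the ordering I mean, here is a minimal sketch. It is not taken from the Airflow Dockerfile: the base image, the `/app` directory and the `requirements.txt` / `scripts/` paths are assumptions for illustration only.
   
   ```dockerfile
   # Hypothetical layout, not the actual Airflow Dockerfile.
   FROM python:3.11-slim-bullseye

   WORKDIR /app

   # Copy only the dependency list first: this layer and the pip install
   # below are rebuilt only when requirements.txt itself changes.
   COPY requirements.txt .
   RUN pip install -r requirements.txt

   # Frequently changed content comes last, so editing it does not
   # invalidate the pip install layer above.
   COPY scripts/ ./scripts/
   COPY . .
   ```
   
   With this order, a change under `scripts/` or in the sources only reruns the last two `COPY` steps.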
   
   
   
   > I have one worry though. Knowing that "cache invalidation is the most 
difficult thing". I am not 100% sure how it will work - how does the local 
cache get allocated when you have different arguments passed? Do you know how 
the local cache is determined in case we change parameters/arguments and use 
the same Dockerfile?
   > 
   > To be honest I am quite worried about the case where base python changes 
(see my previous comment), about this case:
   > 
   > ```
   > docker build --build-arg PYTHON_BASE_IMAGE=python:3.11-slim-bullseye ... Dockerfile
   > ```
   > 
   > Then (note Python version change):
   > 
   > ```
   > docker build --build-arg PYTHON_BASE_IMAGE=python:3.10-slim-bullseye ... Dockerfile
   > ```
   > 
   YES! Good observation, this is a very real risk. When doing similar 
caching with `apt install`, it's very easy to mix caches from different Ubuntu 
base image versions and end up in a bad state.
   To solve this, we usually follow the convention of suffixing the version to 
the cache id, e.g. for Ubuntu 22.04:
   `RUN --mount=type=cache,sharing=locked,target=/var/cache/apt,id=aptcache2204 --mount=type=cache,sharing=locked,target=/var/lib/apt,id=aptlib2204 \`
   You can probably build the id from a variable so it isn't forgotten anywhere.
   For pip, I don't know whether the cache is reusable across Python versions.
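   
   A rough sketch of the variable-based id for the pip case, assuming a BuildKit version that expands build args inside `--mount` options. The `pip-` prefix, the `/root/.cache/pip` target and the `requirements.txt` path are my assumptions, not what the PR does:
   
   ```dockerfile
   # syntax=docker/dockerfile:1
   ARG PYTHON_BASE_IMAGE=python:3.11-slim-bullseye
   FROM ${PYTHON_BASE_IMAGE}

   # An ARG declared before FROM must be re-declared to be visible in the stage.
   ARG PYTHON_BASE_IMAGE

   COPY requirements.txt .

   # The cache id embeds the base image name, so builds on python:3.10 and
   # python:3.11 each get their own pip cache instead of sharing one.
   RUN --mount=type=cache,id=pip-${PYTHON_BASE_IMAGE},target=/root/.cache/pip \
       pip install -r requirements.txt
   ```
   
   Keeping the caches separate sidesteps the question of whether pip's cache is safe to share across Python versions, at the cost of one cache per base image.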
   
   
   

