potiuk commented on pull request #21145:
URL: https://github.com/apache/airflow/pull/21145#issuecomment-1027858573


   > @potiuk could you explain to me this part? I can understand that we check 
if the files in the list of `FILES_FOR_REBUILD_CHECK` are modified. we use 
md5sum to determine that. we persist the old md5sum calculated in build cache 
and if it varies from the recent file, we assume that file is modified and do 
build the local image. But why do we have to calculate them for only specific 
files in that list?
   
   Yeah. The thing is that we have a lot of files "mounted" inside the 
Docker container. All the Airflow sources are mounted in the place where they 
would normally be baked in when the Docker image is built. Mounting files when 
you run breeze is super quick (milliseconds). Rebuilding the image, on the 
other hand, takes 20-30 seconds even if you have almost nothing to build (just 
modified sources). That's why, when you modify a file locally, you can enter 
breeze and run tests in a matter of seconds. This allows for very quick 
iterations when you modify the code and want to test it. Imagine if, after 
every single change to a file, you had to wait 20-30 seconds just to be able 
to test it. That's not going to fly. And rebuilding the image is not really 
needed in this case, because mounting the files effectively does what 
rebuilding the image does.
   
   So instead, we only suggest rebuilding the image when we know the 
"important" files changed and mounting is not "enough":
   
   * When Dockerfile.ci changes - you likely need to rebuild the image, because 
maybe someone added new tools/dependencies via apt
   * When setup.py/setup.cfg changes - you need to rebuild the image to 
reinstall `pip` dependencies, because someone likely added a new dependency (or 
maybe you added a new dependency yourself and want it installed inside the 
image persistently)
   * The ./docker scripts - similarly to Dockerfile.ci, they might perform some 
new installation; if one of them changes, it influences the docker image 
internals
   * www/ui package/yarn/webpack files - they likely require new "node_modules" 
to be installed, and those are also installed as a separate step in the 
Dockerfile
   * .dockerignore - a change might mean that some files have been excluded 
from or added to the "Docker context" used when the image is built, so a docker 
build is likely needed to reflect those changes.
   
   Unlike Python source code, it's not enough to "mount" those files inside 
the container to replace the files in the image. You need to run an extra 
"action" afterwards if you want changes in those files to be reflected in the 
image: "pip install", "yarn install", or just running the scripts that install 
mysql or others.
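
   As a rough illustration of the md5sum check described in the question (not 
breeze's actual implementation), a minimal Python sketch might look like the 
one below. The shortened file list, the `.md5sum` cache file naming, and the 
function names `md5_of`, `cache_file_for`, `modified_files` and 
`store_checksums` are all assumptions made for this example.

```python
import hashlib
from pathlib import Path
from typing import List, Sequence

# Hypothetical subset for illustration; the real list is defined in breeze.
FILES_FOR_REBUILD_CHECK = ("Dockerfile.ci", "setup.py", "setup.cfg", ".dockerignore")


def md5_of(path: Path) -> str:
    """Return the hex md5 digest of the file's contents."""
    return hashlib.md5(path.read_bytes()).hexdigest()


def cache_file_for(cache_dir: Path, name: str) -> Path:
    """Map a watched file name to its (assumed) checksum file in the cache."""
    return cache_dir / (name.replace("/", "-") + ".md5sum")


def modified_files(
    source_dir: Path,
    cache_dir: Path,
    names: Sequence[str] = FILES_FOR_REBUILD_CHECK,
) -> List[str]:
    """List the watched files whose md5 differs from the cached value."""
    changed = []
    for name in names:
        source = source_dir / name
        if not source.is_file():
            continue  # file is absent, nothing to compare
        cached = cache_file_for(cache_dir, name)
        # A missing cache entry counts as "modified": we have never built with it.
        if not cached.is_file() or cached.read_text().strip() != md5_of(source):
            changed.append(name)
    return changed


def store_checksums(
    source_dir: Path,
    cache_dir: Path,
    names: Sequence[str] = FILES_FOR_REBUILD_CHECK,
) -> None:
    """Persist the current md5 of every watched file into the build cache."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    for name in names:
        source = source_dir / name
        if source.is_file():
            cache_file_for(cache_dir, name).write_text(md5_of(source))
```

   In this sketch, a caller would suggest an image rebuild only when 
`modified_files` returns a non-empty list, and would call `store_checksums` 
after a successful build. Changes to any file outside the list never trigger 
a rebuild, because the mount already makes them visible in the container.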
   
   So this is purely an optimization of iteration speed.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
