potiuk commented on issue #6266: [AIRFLOW-2439] Production Docker image support 
including refactoring of build scripts
URL: https://github.com/apache/airflow/pull/6266#issuecomment-539906026
 
 
   @dimberman @ashb : In the latest commit I have a really nice set of 
optimisations and simplifications for the docker image. 
   
   Now we have simpler build scripts, fewer images, and better caching. The CI 
image is optimised for build time and the PROD image is optimised for size. We 
also have more flexibility to choose what goes only into the PROD image and 
what goes into the CI image. 
   
   There are quite a few changes (the basic framework remains the same and it 
has the same properties as before - including automated maintenance via 
github/dockerhub builds):
   
   1) I completely got rid of the SLIM_CI image. The way it was built, it only 
added complexity and in fact ended up increasing the build/download size. All 
static checks will now be done using the CI image. 
   
   2) I got rid of the "if" conditionals in the Dockerfile at the expense of 
some duplication (the PROD and CI images each have their own stages). The 
duplicated code is unlikely to change much, and we now have greater 
flexibility in deciding what goes into the PROD image and what goes into the 
CI image. We do not need the CI_OPTIMISED flag any more.
   
   3) We have much simpler build scripts now. There is no need to pass base 
images as build args, so it is rather straightforward. We still have the 
build scripts, but the build command now basically looks as follows:
   `docker build --build-arg PYTHON_BASE_IMAGE=python:3.6-slim-buster --build-arg AIRFLOW_VERSION=2.0.0.dev0 --build-arg AIRFLOW_BRANCH=master -t potiuk/airflow:master-www --target airflow-prod`
   or
   `docker build --build-arg PYTHON_BASE_IMAGE=python:3.6-slim-buster --build-arg AIRFLOW_VERSION=2.0.0.dev0 --build-arg AIRFLOW_BRANCH=master -t potiuk/airflow:master-www --target airflow-ci`.
   Of course, once we add efficient caching from dockerhub the command becomes 
more complex (--cache-from), but the build scripts handle all of that: they 
add the proper --cache-from directives and pull images as needed.
   
   4) I have a separate `airflow-www` stage where I only run `npm ci` and 
`npm run prod` on airflow/www, and then I use `COPY --from=airflow-www` to copy 
the appropriate parts to the right places: sources + node_modules + dist in the 
CI image, sources + dist in the PROD image. Neither npm nor node_modules ends 
up in the PROD image :).
   
   5) I use `pip install --user`. This way all the dependencies are installed 
to ${HOME}/.local and it's easier to determine the folder to copy the npm 
build output to (it's python-version independent).
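
   For readers who want to picture points 2 to 5 together, here is a rough 
sketch. It is not the Dockerfile or the build scripts from this PR. The stage 
names, the PYTHON_BASE_IMAGE build arg and the potiuk/airflow repository come 
from the points above. Everything else is an assumption made for illustration: 
the node:12 image, all paths, the `npm run prod` script name, the tag suffixes, 
and the helper names `write_dockerfile_sketch` and `print_build_cmd`. The 
script only writes a file and prints commands, it does not call docker.

```shell
#!/bin/sh
# Illustrative sketch only; see the assumptions listed in the text above.
set -eu

PYTHON_BASE_IMAGE="python:3.6-slim-buster"

# Write a cut-down multi-stage Dockerfile to the path given as $1.
write_dockerfile_sketch() {
  # Quoted heredoc: the shell expands nothing inside it.
  cat > "$1" <<'EOF'
ARG PYTHON_BASE_IMAGE=python:3.6-slim-buster

# Point 4: a stage that only builds the web assets (image is assumed).
FROM node:12 AS airflow-www
WORKDIR /www
COPY airflow/www/ ./
RUN npm ci && npm run prod

# Point 2: PROD and CI are separate stages instead of "if" branches.
FROM ${PYTHON_BASE_IMAGE} AS airflow-prod
WORKDIR /opt/airflow
COPY . .
# PROD takes only the compiled assets from the www stage.
COPY --from=airflow-www /www/static/dist airflow/www/static/dist
# Point 5: a user install puts packages under ${HOME}/.local.
RUN pip install --user .

FROM ${PYTHON_BASE_IMAGE} AS airflow-ci
WORKDIR /opt/airflow
COPY . .
# CI additionally keeps node_modules next to the compiled assets.
COPY --from=airflow-www /www/node_modules airflow/www/node_modules
COPY --from=airflow-www /www/static/dist airflow/www/static/dist
RUN pip install --user .
EOF
}

# Print (do not run) the build command for one target stage.
# $1 is the target stage, $2 is the Dockerfile path.
print_build_cmd() {
  target="$1"
  dockerfile="$2"
  # Hypothetical tag scheme: airflow-prod -> master-prod, airflow-ci -> master-ci.
  image="potiuk/airflow:master-${target#airflow-}"
  printf 'docker build -f %s --build-arg PYTHON_BASE_IMAGE=%s --cache-from %s --target %s -t %s .\n' \
    "$dockerfile" "$PYTHON_BASE_IMAGE" "$image" "$target" "$image"
}

sketch="$(mktemp)"
write_dockerfile_sketch "$sketch"
print_build_cmd airflow-prod "$sketch"
print_build_cmd airflow-ci "$sketch"
```

   The printed commands differ only in `--target` and the tag, which is the 
simplification point 3 describes. For `--cache-from` to help, the named image 
generally has to be pulled first, which is the part the real build scripts are 
said to take care of.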
   
   
   
   
