potiuk commented on issue #4543: [AIRFLOW-3718] [WIP] Multi-layered version of the docker image
URL: https://github.com/apache/airflow/pull/4543#issuecomment-469703048

Some comments on the progress and current state @Fokko @ashb. You might take a look at the current implementation if you are curious, but it's still WIP.

What I have now is a pretty complex set of Dockerfiles, some using multi-stage features (COPY --from). But I think the basic idea already works: caching the binaries via Docker images, and building local CI images for testing on Travis starting from the official images shared on DockerHub. I know I can have a build that prepares a Docker image and rebuilds only the last layers for usual/regular builds. Those local builds can run very quickly even if new dependencies have been added compared to the officially released images.

The time to build the CI image on Travis (to run tests) is about 30 seconds to check that nothing changed, about 2 minutes to build if sources changed, and possibly 4 minutes to build if a new dependency was added compared to the image released on DockerHub. I am now using pre-cached wheels images built separately (using multi-stage build features) to achieve that, and it works pretty well. Timings on DockerHub are much longer (15-30 minutes per Python version), but this is a secondary issue: those images will be built from master only, to "catch up" with the latest merged changes, so this is not a problem. On Travis the timing already looks comparable to that of the current scripts inside docker, the ones that install dependencies from mounted wheels using the airflow-incubator-ci image.

So far I did it in a way that minimises Dockerfile copy&pasting, so I ended up building 6(!) different images per version of Python and 7(!) different Dockerfiles. This is all to overcome some limitations of Dockerfiles (for example, you cannot specify ARGs in the --from directive of a multi-stage build).
Those images have pretty complex build/dependency logic between them (all automated in the hook/build script). You can run ./docker_build.sh locally and it will also build fine on your local machine.

However, I am quite sure it is far too complex now for the job it tries to do :). But the good news is that I am going to simplify it a lot now that I have made it work (I already have an idea how), once I have the tests running and succeeding on Travis. How I see it: we will have just 2 images, one for CI and one for official Airflow. Unfortunately, due to current limitations of docker build, we can't avoid some copy&pasting between the CI and Airflow Dockerfiles in this solution. But we can have common args (mainly the list of dependencies) that we can still share between the two images, and I am quite sure it will be far simpler to understand and manage than the current 6 images and 7 Dockerfiles I have.

So it looks like once I get from the "Red" phase to "Green" (all tests pass), I will move into the "Refactor" phase and make it much simpler.
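
For readers unfamiliar with why only the last layers get rebuilt, here is a rough Python sketch of the layer-caching rule the timings above rely on. It is a simplified model, not the build scripts from this PR: the layer names, the hashes, and the `layers_to_rebuild` helper are all hypothetical, and real Docker cache matching involves more than one hash per layer.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Layer:
    name: str        # hypothetical layer name, for illustration only
    input_hash: str  # stand-in digest of everything the layer depends on


def layers_to_rebuild(cached: List[Layer], current: List[Layer]) -> List[str]:
    """Return the names of layers that cannot be reused from the cached image.

    Simplified model of a layered build cache: layers are reused in order
    until the first one whose inputs differ; that layer and every layer
    after it must be rebuilt.
    """
    for index, layer in enumerate(current):
        if index >= len(cached) or cached[index] != layer:
            return [later.name for later in current[index:]]
    return []


# Hypothetical layers of an image already released on DockerHub.
released = [
    Layer("base-os", "a1"),
    Layer("apt-packages", "b2"),
    Layer("pip-dependencies", "c3"),
    Layer("airflow-sources", "d4"),
]

# Case 1: only the sources changed, so only the last layer is invalidated.
sources_changed = released[:3] + [Layer("airflow-sources", "d5")]

# Case 2: a new dependency was added, which also invalidates the sources layer.
dependency_added = released[:2] + [
    Layer("pip-dependencies", "c4"),
    Layer("airflow-sources", "d4"),
]

print(layers_to_rebuild(released, released))          # []
print(layers_to_rebuild(released, sources_changed))   # ['airflow-sources']
print(layers_to_rebuild(released, dependency_added))  # ['pip-dependencies', 'airflow-sources']
```

The three cases correspond loosely to the three timings mentioned in the comment: nothing changed, sources changed, and a new dependency added relative to the released image.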
