potiuk commented on issue #4543: [AIRFLOW-3718] [WIP] Multi-layered version of 
the docker image
URL: https://github.com/apache/airflow/pull/4543#issuecomment-469703048
 
 
   Some comments on the progress and current state @Fokko @ashb. You might take 
a look at the current implementation if you are curious, but it's still WIP.
   
   What I have now is a pretty complex set of Dockerfiles, some using 
multi-stage features (COPY --from). But I think the basic idea of caching the 
binaries via Docker images and building local CI images for testing on Travis, 
starting from official images shared on DockerHub, already works.
   
   I know I can have a build that prepares a Docker image and rebuilds only 
the last layers for regular builds. Those local builds can run very quickly 
even if new dependencies have been added compared to the officially released 
images. Building the CI image on Travis (to run tests) takes about 30 seconds 
to check that nothing changed, about 2 minutes if sources changed, and 
possibly 4 minutes if a new dependency was added compared to the image 
released on DockerHub. To achieve that I am now using pre-cached wheel images 
built separately (using multi-stage build features), and it works pretty well.
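   
   A minimal sketch of the wheel-caching idea, not the actual Dockerfiles from 
this PR. It assumes a hypothetical requirements.txt and the public python:3.6 
image, and builds the wheels in an earlier stage of the same file rather than 
in a separately published image:

```dockerfile
# Stage 1: build wheels once; this layer is reused while requirements.txt is unchanged.
# requirements.txt is a hypothetical file used only for illustration.
FROM python:3.6 AS wheels
COPY requirements.txt /tmp/requirements.txt
RUN pip wheel --wheel-dir=/wheels -r /tmp/requirements.txt

# Stage 2: install from the pre-built wheels without going to the network.
FROM python:3.6
COPY --from=wheels /wheels /wheels
COPY requirements.txt /tmp/requirements.txt
RUN pip install --no-index --find-links=/wheels -r /tmp/requirements.txt
```

   Because the wheel layer only depends on the dependency list, a change in 
sources alone should leave it cached and rebuild only the later layers.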
   
   Timings on DockerHub are much longer (15-30 mins per Python version), but 
this is a secondary issue: those images will be built from master only, to 
"catch up" with the latest merged changes, so this is not a problem.
   
   On Travis it already looks comparable to the timing of the current scripts 
inside docker, those installing dependencies from mounted wheels using the 
airflow-incubator-ci image.
   
   So far I did it in a way that minimises Dockerfile copy&pasting, so I ended 
up building 6(!) different images per version of Python and 7(!) different 
Dockerfiles. This is all to overcome some limitations of Dockerfiles (for 
example, you cannot specify ARGs in the --from directive of a multi-stage 
build). Those images have pretty complex build/dependency logic between them 
(all automated in the hook/build script). You can run ./docker_build.sh 
locally and it will also build fine on your local machine.
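   
   To illustrate the --from limitation, here is a sketch of one commonly used 
workaround, not what this PR currently does. The image name and build args are 
hypothetical. It relies on ARGs declared before the first FROM being usable in 
FROM lines, so the parameterised image gets a fixed stage name:

```dockerfile
# Hypothetical names throughout; example/airflow-wheels is not a real image.
ARG PYTHON_VERSION=3.6
ARG WHEELS_IMAGE=example/airflow-wheels:python3.6

# The variable is expanded in FROM and the image is given a fixed stage name...
FROM ${WHEELS_IMAGE} AS wheels

FROM python:${PYTHON_VERSION}
# ...so COPY --from can refer to the stage name instead of a variable.
COPY --from=wheels /wheels /wheels
RUN pip install --no-index --find-links=/wheels /wheels/*.whl
```

   The values could then be overridden at build time with 
`docker build --build-arg WHEELS_IMAGE=... .` for each Python version.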
   
   However, I am quite sure it is far too complex now for the job it tries to 
do :)
   
   But the good news is that, now that I have made it work, I am going to 
simplify it a lot (I already have an idea how) once I have the tests running 
and succeeding on Travis. How I see it: we will have just 2 images.
   
   One for CI and one for official Airflow. Unfortunately, due to current 
limitations of docker build, we can't avoid some copy&pasting between the CI 
and Airflow Dockerfiles in this solution. But we can have common args (mainly 
the list of dependencies) that we can still share between the two images, and 
I am quite sure it will be far simpler to understand and manage than the 
current 6 images and 7 Dockerfiles I have.
   
   So it looks like once I get from the "Red" phase to "Green" (all tests 
pass) I will move into the "Refactor" phase and make it much simpler.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
