potiuk commented on a change in pull request #4938: [AIRFLOW-4117] Multi-staging Image - Travis CI tests [Step 3/3] URL: https://github.com/apache/airflow/pull/4938#discussion_r303229786
########## File path: scripts/ci/_cache_utils.sh ########## @@ -0,0 +1,99 @@ +#!/usr/bin/env bash +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +AIRFLOW_SOURCES=$(pwd) +export AIRFLOW_SOURCES + +BUILD_CACHE_DIR="${AIRFLOW_SOURCES}/.build" +mkdir -p "${BUILD_CACHE_DIR}/cache/" + +USE_LOCALLY_BUILD_IMAGES_AS_CACHE="${BUILD_CACHE_DIR}/.use_locally_built_images_as_cache_${PYTHON_VERSION}" +CACHE_TMP_FILE_DIR=$(mktemp -d "${BUILD_CACHE_DIR}/cache/XXXXXXXXXX") + +if [[ ${SKIP_CACHE_DELETION:=} != "true" ]]; then + trap 'rm -rf -- "${CACHE_TMP_FILE_DIR}"' INT TERM HUP EXIT +fi + +function check_file_md5sum { + local FILE="${1}" + local MD5SUM + MD5SUM=$(md5sum "${FILE}") + local MD5SUM_FILE + MD5SUM_FILE=${BUILD_CACHE_DIR}/$(basename "${FILE}").md5sum + local MD5SUM_FILE_NEW + MD5SUM_FILE_NEW=${CACHE_TMP_FILE_DIR}/$(basename "${FILE}").md5sum.new + echo "${MD5SUM}" > "${MD5SUM_FILE_NEW}" + local RET_CODE=0 + if [[ ! -f "${MD5SUM_FILE}" ]]; then + echo "Missing md5sum for ${FILE}" + RET_CODE=1 + else + diff "${MD5SUM_FILE_NEW}" "${MD5SUM_FILE}" >/dev/null + RES=$? + if [[ "${RES}" != "0" ]]; then + echo "The md5sum changed for ${FILE}" + RET_CODE=1 + fi + fi + return ${RET_CODE} +} + +function move_file_md5sum { + local FILE="${1}" + local MD5SUM_FILE + MD5SUM_FILE=${BUILD_CACHE_DIR}/$(basename "${FILE}").md5sum + local MD5SUM_FILE_NEW + MD5SUM_FILE_NEW=${CACHE_TMP_FILE_DIR}/$(basename "${FILE}").md5sum.new + if [[ -f "${MD5SUM_FILE_NEW}" ]]; then + mv "${MD5SUM_FILE_NEW}" "${MD5SUM_FILE}" + echo "Updated md5sum file ${MD5SUM_FILE} for ${FILE}." + fi +} + +FILES_FOR_REBUILD_CHECK="\ +setup.py \ +setup.cfg \ +Dockerfile \ +airflow/version.py \ +airflow/www/package.json \ +airflow/www/package-lock.json +" + +function update_all_md5_files() { + # Record that we built the images locally so that next time we use "standard" cache + touch "${USE_LOCALLY_BUILD_IMAGES_AS_CACHE}" + echo + echo "Updating md5sum files" + echo + for FILE in ${FILES_FOR_REBUILD_CHECK} + do + move_file_md5sum "${AIRFLOW_SOURCES}/${FILE}" + done +} + +function check_if_docker_build_is_needed() { + set +e + + for FILE in ${FILES_FOR_REBUILD_CHECK} + do + if ! check_file_md5sum "${AIRFLOW_SOURCES}/${FILE}"; then Review comment: No, the reason is different. Actually the way I did .dockerignore makes it rather fast to copy the whole context because we avoid copying all locally generated files: build artifacts, generated doc/_build files, node_modules installed by npm etc. It takes less than a second I think now on My Mac. There are few other problems that current docker does not work well with so I had to make this md5sum check. I will make a - shorter- comment in the code and describe it shortly in the code. But here is the full context: 1) before we start `docker build' we need to fix the group permissions for files just in case. Docker's caching mechanism uses file permissions to check cache validation (see the comments in hooks/build about fixing permissions and differences on different systems - Dockerhub and Travis have different umask). We technically do not know if permissions have been fixed for all files so we run this permission fix/check before every build so that Docker can properly use caching mechanisms when checking the context. This takes precious 10-15 seconds extra for spinning up docker-context image and running "chmod" on all relevant files just in case. 2) We have single Dockerfile with multi-stage build for both CI/SLIM images. So they contain stages needed for SLIM and for CI image. I've implemented a workaround for missing (upcoming in buildkit: https://github.com/docker/cli/issues/1134) feature to skip building stages that are not needed when we only build the "slim" image. The airflow-apt-deps-ci stage is not needed at all for slim image - unfortunately it has to be there before the "main" stage (which has to be last) and it gets built by Docker. Current Docker will still build the not-used-stage even if not needed. So in order to optimise the build time for slim image (5-6 minutes I think saved) I've added those 'ifs' in the airflow-apt-deps-ci stage: ``` RUN if [[ "${APT_DEPS_IMAGE}" == "airflow-apt-deps-ci" ]]; then \ ``` In this case `docker build` will still execute those series of RUN commands locally when buildig slim image and will build pretty much empty stage (which is then discarded) - fairly fast (maybe 20 seconds). But it is still 20 seconds every time you run docker build. Once Buildkit goes out of beta and we switch to it - we might be able to get rid of this workaround (There is a comment about it in the Dockerfile) 3) Last but most important. We do not need to rebuild the image at all after only airflow sources change and there is no way to do that with docker build (not easily at least). We run those pylint/flake8/mypy tests now with sources mounted from the host in the place of sources baked in the image. This means that we start static checks instantaneously - because we have the latest sources in container mounted. If we try to rebuild the Docker using `docker build` every time we change sources, this would mean around a minute waiting before every pylint etc. operation. Right now if you don't modify Dockerfile, setup.py etc. those static check operations are pretty much as fast as running them locally (if you already run it once before and have the docker image built). That makes a really fast fix cycle when you have a number of problems to fix only in sources but you do not touch setup.py. You can even limit it to particular folders/files if you specify them as an argument (see the examples in CONTRIBUTING.md). Or you can use the full "CI test" image, enter the environment and run your static checks from inside the container to make it even faster (also described in the CONTRIBUTING.md). It's not an exact equivalent of running it in Travis (because we are using "slim" image for static checks) but it's close enough. Then having md5sum check - if you add a dependency in setup.py - then normal Docker build will happen and once it finishes (after a minute or so), you have the new dependency already installed in the Docker image. It's joining the best of both worlds - running consistent environment for everyone (Docker) for static checks but it's also super fast for development cycle. And as the next step to follow - having it that fast and consistent across all developers - makes it super-useful for pre-commit hooks which are following soon :). ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected] With regards, Apache Git Services
