potiuk commented on a change in pull request #4938: [AIRFLOW-4117] Multi-staging Image - Travis CI tests [Step 3/3]
URL: https://github.com/apache/airflow/pull/4938#discussion_r303210482
##########
File path: Dockerfile
##########
@@ -278,42 +273,81 @@ RUN echo "Pip version: ${PIP_VERSION}"
RUN pip install --upgrade pip==${PIP_VERSION}
-# We are copying everything with airflow:airflow user:group even if we use root to run the scripts
+ARG AIRFLOW_REPO=apache/airflow
+ENV AIRFLOW_REPO=${AIRFLOW_REPO}
+
+ARG AIRFLOW_BRANCH=master
+ENV AIRFLOW_BRANCH=${AIRFLOW_BRANCH}
+
+ENV AIRFLOW_GITHUB_DOWNLOAD=https://raw.githubusercontent.com/${AIRFLOW_REPO}/${AIRFLOW_BRANCH}
+
+# Airflow Extras installed
+ARG AIRFLOW_EXTRAS="all"
+ENV AIRFLOW_EXTRAS=${AIRFLOW_EXTRAS}
+
+RUN echo "Installing with extras: ${AIRFLOW_EXTRAS}."
+
+ARG AIRFLOW_CONTAINER_CI_OPTIMISED_BUILD="false"
+ENV AIRFLOW_CONTAINER_CI_OPTIMISED_BUILD=${AIRFLOW_CONTAINER_CI_OPTIMISED_BUILD}
+
+# By changing the CI build epoch we can force reinstalling Airflow from the current master -
+# in case of CI optimized builds (next step). Our build scripts will change the EPOCH every month normally
+# But it can also be overwritten manually by setting the AIRFLOW_CI_BUILD_EPOCH environment variable.
+ARG AIRFLOW_CI_BUILD_EPOCH=""
+ENV AIRFLOW_CI_BUILD_EPOCH=${AIRFLOW_CI_BUILD_EPOCH}
+
+# In case of CI-optimised builds we want to pre-install master version of airflow dependencies so that
+# We do not have to always reinstall it from the scratch.
+# This can be reinstalled from latest master by increasing PIP_DEPENDENCIES_EPOCH_NUMBER.
+# And is automatically reinstalled from the scratch every month
+RUN \
+ if [[ "${AIRFLOW_CONTAINER_CI_OPTIMISED_BUILD}" == "true" ]]; then \
+ pip install --no-use-pep517 \
+ "https://github.com/apache/airflow/archive/master.tar.gz#egg=apache-airflow[${AIRFLOW_EXTRAS}]" \
+ && pip uninstall --yes apache-airflow; \
+ fi
+
+# Note! We are copying everything with airflow:airflow user:group even if we use root to run the scripts
# This is fine as root user will be able to use those dirs anyway.
# Airflow sources change frequently but dependency configuration won't change that often
# We copy setup.py and other files needed to perform setup of dependencies
-# This way cache here will only be invalidated if any of the
-# version/setup configuration change but not when airflow sources change
+# So in case setup.py changes we can install latest dependencies required.
COPY --chown=airflow:airflow setup.py ${AIRFLOW_SOURCES}/setup.py
COPY --chown=airflow:airflow setup.cfg ${AIRFLOW_SOURCES}/setup.cfg
COPY --chown=airflow:airflow airflow/version.py ${AIRFLOW_SOURCES}/airflow/version.py
COPY --chown=airflow:airflow airflow/__init__.py ${AIRFLOW_SOURCES}/airflow/__init__.py
COPY --chown=airflow:airflow airflow/bin/airflow ${AIRFLOW_SOURCES}/airflow/bin/airflow
-# Airflow Extras installed
-ARG AIRFLOW_EXTRAS="all"
-ENV AIRFLOW_EXTRAS=${AIRFLOW_EXTRAS}
-RUN echo "Installing with extras: ${AIRFLOW_EXTRAS}."
-
-# First install only dependencies but no Apache Airflow itself
-# This way regular changes in sources of Airflow will not trigger reinstallation of all dependencies
-# And this Docker layer will be reused between builds.
-RUN pip install --no-use-pep517 -e ".[${AIRFLOW_EXTRAS}]"
+# The goal of this line is to install the dependencies from the most current setup.py from sources
+# This will be usually incremental small set of packages so it will be very fast
+RUN \
+ if [[ "${AIRFLOW_CONTAINER_CI_OPTIMISED_BUILD}" == "true" ]]; then \
+ pip install --no-use-pep517 -e ".[${AIRFLOW_EXTRAS}]"; \
Review comment:
Ah. You are right of course. An optimisation on its own but worth it.