potiuk commented on a change in pull request #13654:
URL: https://github.com/apache/airflow/pull/13654#discussion_r557400091



##########
File path: Dockerfile.ci
##########
@@ -286,22 +286,23 @@ RUN echo "Pip no cache dir: ${PIP_NO_CACHE_DIR}"
 
 RUN pip install --upgrade "pip==${AIRFLOW_PIP_VERSION}"
 
-# Increase the value here to force reinstalling Apache Airflow pip dependencies
-ARG PIP_DEPENDENCIES_EPOCH_NUMBER="5"
-ENV PIP_DEPENDENCIES_EPOCH_NUMBER=${PIP_DEPENDENCIES_EPOCH_NUMBER}
-
 # Only copy install_airflow_from_latest_master.sh to not invalidate cache on other script changes
 COPY scripts/docker/install_airflow_from_latest_master.sh /scripts/docker/install_airflow_from_latest_master.sh
 # fix permission issue in Azure DevOps when running the script
 RUN chmod a+x /scripts/docker/install_airflow_from_latest_master.sh
 
+ARG UPGRADE_TO_NEWER_DEPENDENCIES="false"
+ENV UPGRADE_TO_NEWER_DEPENDENCIES=${UPGRADE_TO_NEWER_DEPENDENCIES}
+
 # In case of CI builds we want to pre-install master version of airflow dependencies so that
 # We do not have to always reinstall it from the scratch.
-# This can be reinstalled from latest master by increasing PIP_DEPENDENCIES_EPOCH_NUMBER.
 # And is automatically reinstalled from the scratch every time patch release of python gets released
 # The Airflow (and providers in case INSTALL_PROVIDERS_FROM_SOURCES is "false")
-# are uninstalled, only dependencies remain
-RUN if [[ ${AIRFLOW_PRE_CACHED_PIP_PACKAGES} == "true" ]]; then \
+# are uninstalled, only dependencies remain.
+# the cache is only used when "upgrade to newer dependencies" is not set to automatically
+# account for removed dependencies (we do not install them in the first place)
+RUN if [[ ${AIRFLOW_PRE_CACHED_PIP_PACKAGES} == "true" && \
+          ${UPGRADE_TO_NEWER_DEPENDENCIES} != "true" ]]; then \

Review comment:
       UPGRADE_TO_NEWER_DEPENDENCIES is "false" by default. 
   
   In CI, the selective checks set it to "true" (as of a few days ago) whenever setup.py or setup.cfg changes (https://github.com/apache/airflow/blob/e4b8ee63b04a25feb21a5766b1cc997aca9951a9/scripts/ci/selective_ci_checks.sh#L325).
   
   This means that in CI, "install_airflow_from_latest_master.sh" is used by default (so airflow is installed from the cache first).
   
   When either of the two setup files changes, UPGRADE_TO_NEWER_DEPENDENCIES is set and this line is skipped, so there are no preinstalled packages: they are all installed from scratch. This takes a bit longer, but it gives a 'clean' state, so anything that has disappeared (compared to master) is not installed.
   
   
   The situation is different when you build the image locally: when you change setup.py or setup.cfg and build the image, the cache is still used (UPGRADE_TO_NEWER_DEPENDENCIES is "false").
   
   This way you avoid rebuilding all of the dependencies when you add a new dependency (it takes ~10 minutes to install all deps from scratch).
   
   You can still trigger the same behavior as in CI by adding the `--upgrade-to-newer-dependencies` flag in breeze when building the image (it simply sets UPGRADE_TO_NEWER_DEPENDENCIES).
   
   However, it is good that I explained it: I just realised that the comparison should be == "false", because we are using commit_hash as the "truthy" value in selective checks! Fixing it.
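
To illustrate why the comparison matters, here is a minimal sketch of the two conditions side by side. The function names are hypothetical and are not part of the PR, and the commit hash is only an example value standing in for whatever the selective checks pass. The Dockerfile uses bash `[[ ... ]]`; the sketch uses portable `[ ... ]`, which behaves the same for these literal string comparisons.

```shell
#!/usr/bin/env sh
# Hypothetical helpers (not in the PR). Each prints "cache" when the
# pre-cached pip packages step would run, and "scratch" otherwise.

# Condition as written in the diff: the flag must not be "true".
use_cache_not_true() {
    if [ "$1" = "true" ] && [ "$2" != "true" ]; then
        echo "cache"
    else
        echo "scratch"
    fi
}

# Condition proposed in the comment: the flag must be exactly "false".
use_cache_eq_false() {
    if [ "$1" = "true" ] && [ "$2" = "false" ]; then
        echo "cache"
    else
        echo "scratch"
    fi
}

# Example commit hash used as the "truthy" value (assumption for illustration).
commit_hash="e4b8ee63b04a25feb21a5766b1cc997aca9951a9"

# A commit hash is not the string "true", so the first form still uses the cache.
use_cache_not_true "true" "${commit_hash}"   # prints "cache"
# The second form treats any value other than "false" as a request to upgrade.
use_cache_eq_false "true" "${commit_hash}"   # prints "scratch"
```

With the default value "false", both forms print "cache", so the change only affects the case where a non-"false" value is passed.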





