potiuk commented on a change in pull request #10784:
URL: https://github.com/apache/airflow/pull/10784#discussion_r490609617



##########
File path: Dockerfile.ci
##########
@@ -256,7 +256,7 @@ ENV 
UPGRADE_TO_LATEST_CONSTRAINTS=${UPGRADE_TO_LATEST_CONSTRAINTS}
 # But in cron job we will install latest versions matching setup.py to see if 
there is no breaking change
 # and push the constraints if everything is successful
 RUN \
-    if [[ "${UPGRADE_TO_LATEST_CONSTRAINTS}" == "true" ]]; then \
+    if [[ "${UPGRADE_TO_LATEST_CONSTRAINTS}" != "false" ]]; then \

Review comment:
       Everything will work, I believe (but it would be great to check if there 
are no loopholes).
   
   This part has not changed and we run it for > year now. Only the location 
changed to run the scripts  -  the scripts are run in  the "workflow_run". Here 
we build the image using `scripts/ci` taken from master, but the `Dockerfile`, 
`setup.py`, `scripts/in_container` and everything else is coming from the PR.
   
   In Dockerfile.ci there are two steps of pip  install:
   
   1)  The first one: 
https://github.com/apache/airflow/blob/5b3fb53d9c5c943abb8ddc90214c320b7a11c8c2/Dockerfile.ci#L223
 installs deps from master. This one only invalidates when we rebuild the image 
from scratch (i.e. when we change Dockerfile or when we get a new base python 
image. In all other cases, this one is taken from cache - even if setup.py 
changes. This is because we COPY setup.py in one of the following lines (not 
before). It's really an optimization - thanks to that, even if setup.py 
changes, it does not take 20 minutes to build the image but maybe 6-7 minutes 
(because all dependencies from the previous master run are already installed in 
the layer taken from the cache. 
   
   2) then we copy setup.py: 
https://github.com/apache/airflow/blob/master/Dockerfile.ci#L243 - if setup.py 
is not changed vs. latest master build - this layer is also taken from cache as 
well as the subsequent step of pip install (this way if no setup.py changes the 
image build is ~ 3/4 minutes - we do not lose time on even checking for new 
requirements. But if the setup.py changes (setup.py comes from the PR - just to 
remind) then Docker cache gets invalidated here and all later layers of the 
image are built from here.
   
   3) Then if setup.py changed in the previous step we reach that line during 
rebuilding the layers: 
https://github.com/apache/airflow/blob/master/Dockerfile.ci#L258. 
   In "normal" PR's (UPGRADE_TO_LATEST_CONSTRAINT = "false") we run this 
command:
    `pip install -e ".[${AIRFLOW_EXTRAS}]"   --constraint 
"${AIRFLOW_CONSTRAINTS_URL}" `
   
   This will:
   a) install all the new stuff that appeared in setup.py (so here where we 
install new packages)
   b) upgrade all the stuff for which limits were changed in setup.py and the 
limits do not fit either constraints or already installed versions
   c) but all the rest remains untouched.
   
   This way PR gets only the previously "good" constrains + anything that 
changed in the setuup.py for this PR (this is where we avoid the 'package not 
found`).
   
   Basically it means that it will only try to upgrade or install new deps that 
freshly appeared in setup.py. But if the previous requirements are fitting the 
setup.py limits, no "eager" upgrade of those requirements is performed.
   
   In the other case (UPGRADE_TO_LATEST_CONSTRAINT = "true") the same pip 
install is run with `--upgrade --upgrade-strategy eager and NO constraints'. 
This is only run in case we run a "master merge" push or scheduled build. This  
means that pip will try to upgrade all the dependencies to their most recent 
versions that fulfill the setup.py limitations  - without looking at the last 
constraints. If such a build succeeds (all tests pass), then we upgrade 
constraints to match those newly upgraded versions (because we know that they 
passed the tests). This is recorded as the commit message in teh orphaned 
"constraints-master" or constraints-1-10 branches (depending if it was a master 
or v1-10-test push):
   
   Example here: 
https://github.com/apache/airflow/commit/625194596587bd8b3ef54f1902b6060e4a0c187b
 - you see which commit originated it and what exactly changes so we can easily 
track the history of upgrades of those.
   
   Once the image is build (with potentially new requirements), it is pushed to 
registry and tests for the run that triggered the build are run using this 
image.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to