potiuk commented on a change in pull request #10784:
URL: https://github.com/apache/airflow/pull/10784#discussion_r490609617
##########
File path: Dockerfile.ci
##########
@@ -256,7 +256,7 @@ ENV
UPGRADE_TO_LATEST_CONSTRAINTS=${UPGRADE_TO_LATEST_CONSTRAINTS}
# But in cron job we will install latest versions matching setup.py to see if
there is no breaking change
# and push the constraints if everything is successful
RUN \
- if [[ "${UPGRADE_TO_LATEST_CONSTRAINTS}" == "true" ]]; then \
+ if [[ "${UPGRADE_TO_LATEST_CONSTRAINTS}" != "false" ]]; then \
Review comment:
Everything will work, I believe (but it would be great to check if there
are no loopholes).
This part has not changed and we run it for > year now. Only the location
changed to run the scripts - the scripts are run in the "workflow_run". Here
we build the image using `scripts/ci` taken from master, but the `Dockerfile`,
`setup.py`, `scripts/in_container` and everything else is coming from the PR.
In Dockerfile.ci there are two steps of pip install:
1) The first one:
https://github.com/apache/airflow/blob/5b3fb53d9c5c943abb8ddc90214c320b7a11c8c2/Dockerfile.ci#L223
installs deps from master. This one only invalidates when we rebuild the image
from scratch (i.e. when we change Dockerfile or when we get a new base python
image. In all other cases, this one is taken from cache - even if setup.py
changes. This is because we COPY setup.py in one of the following lines (not
before). It's really an optimization - thanks to that, even if setup.py
changes, it does not take 20 minutes to build the image but maybe 6-7 minutes
(because all dependencies from the previous master run are already installed in
the layer taken from the cache.
2) then we copy setup.py:
https://github.com/apache/airflow/blob/master/Dockerfile.ci#L243 - if setup.py
is not changed vs. latest master build - this layer is also taken from cache as
well as the subsequent step of pip install (this way if no setup.py changes the
image build is ~ 3/4 minutes - we do not lose time on even checking for new
requirements. But if the setup.py changes (setup.py comes from the PR - just to
remind) then Docker cache gets invalidated here and all later layers of the
image are built from here.
3) Then if setup.py changed in the previous step we reach that line during
rebuilding the layers:
https://github.com/apache/airflow/blob/master/Dockerfile.ci#L258.
In "normal" PR's (UPGRADE_TO_LATEST_CONSTRAINT = "false") we run this
command:
`pip install -e ".[${AIRFLOW_EXTRAS}]" --constraint
"${AIRFLOW_CONSTRAINTS_URL}" `
This will:
a) install all the new stuff that appeared in setup.py (so here where we
install new packages)
b) upgrade all the stuff for which limits were changed in setup.py and the
limits do not fit either constraints or already installed versions
c) but all the rest remains untouched.
This way PR gets only the previously "good" constrains + anything that
changed in the setuup.py for this PR (this is where we avoid the 'package not
found`).
Basically it means that it will only try to upgrade or install new deps that
freshly appeared in setup.py. But if the previous requirements are fitting the
setup.py limits, no "eager" upgrade of those requirements is performed.
In the other case (UPGRADE_TO_LATEST_CONSTRAINT = "true") the same pip
install is run with `--upgrade --upgrade-strategy eager and NO constraints'.
This is only run in case we run a "master merge" push or scheduled build. This
means that pip will try to upgrade all the dependencies to their most recent
versions that fulfill the setup.py limitations - without looking at the last
constraints. If such a build succeeds (all tests pass), then we upgrade
constraints to match those newly upgraded versions (because we know that they
passed the tests). This is recorded as the commit message in teh orphaned
"constraints-master" or constraints-1-10 branches (depending if it was a master
or v1-10-test push):
Example here:
https://github.com/apache/airflow/commit/625194596587bd8b3ef54f1902b6060e4a0c187b
- you see which commit originated it and what exactly changes so we can easily
track the history of upgrades of those.
Once the image is build (with potentially new requirements), it is pushed to
registry and tests for the run that triggered the build are run using this
image.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]