potiuk edited a comment on pull request #12511: URL: https://github.com/apache/airflow/pull/12511#issuecomment-732753093
> Digging deeper the problem seems to be in the fact that we are using master constraints to build the image (--build-arg AIRFLOW_CONSTRAINTS_REFERENCE=constraints-master)? And that's something I don't get.

> I changed setup.py but the image is still building using old dependencies. Can you explain (or point to docs) how changes to setup.py|cfg gets propagated to prod images?

Sure. This is something new we want to do, so it will likely need to be run locally first to see all the details, and then transferred into the CI setup with all the right environment variables etc. There are a lot of moving parts, because we have complex constraint requirements and local optimizations that make the image builds faster in CI. This is why it needs some special treatment.

The technical details and architecture of how images are built are here: https://github.com/apache/airflow/blob/master/IMAGES.rst#technical-details-of-airflow-images.

Building the production image from the user perspective is described here: https://github.com/apache/airflow/blob/master/docs/production-deployment.rst#production-container-images because this is not a developer-facing but a user-facing feature.

But I perfectly understand it can be overwhelming initially. It's also good to take a look at the Dockerfile: https://github.com/apache/airflow/blob/master/Dockerfile

Basically, in the CI environment you will see the `docker build` command that is used to build the image. You can just run it locally and modify it to get what you want (with a new thing like this you might also end up modifying the Dockerfile itself). Eventually, it comes down to the right combination of build args, and sometimes new features to add.

The installation you mentioned is controlled by `AIRFLOW_PRE_CACHED_PIP_PACKAGES`. When it is set to "true", we first install airflow from the current "master". This is an optimization step for rebuilding the image.
It installs the "latest master" version of the dependencies, so that subsequent steps only incrementally add new dependencies or upgrade existing ones.

```
# In case of Production build image segment we want to pre-install master version of airflow
# dependencies from GitHub so that we do not have to always reinstall it from the scratch.
RUN if [[ ${AIRFLOW_PRE_CACHED_PIP_PACKAGES} == "true" ]]; then \
       if [[ ${INSTALL_MYSQL_CLIENT} != "true" ]]; then \
          AIRFLOW_EXTRAS=${AIRFLOW_EXTRAS/mysql,}; \
       fi; \
       pip install --user \
          "https://github.com/${AIRFLOW_REPO}/archive/${AIRFLOW_BRANCH}.tar.gz#egg=apache-airflow[${AIRFLOW_EXTRAS}]" \
          --constraint "https://raw.githubusercontent.com/apache/airflow/${AIRFLOW_CONSTRAINTS_REFERENCE}/constraints-${PYTHON_MAJOR_MINOR_VERSION}.txt" \
          && pip uninstall --yes apache-airflow; \
    fi
```

This step can be skipped entirely by setting `AIRFLOW_PRE_CACHED_PIP_PACKAGES` to "false". The default value of `INSTALL_AIRFLOW_VIA_PIP` is "true", so the following step is then the only installation step that will run, and it should install Airflow using the specified constraints:
```
# remove mysql from extras if client is not installed
RUN if [[ ${INSTALL_MYSQL_CLIENT} != "true" ]]; then \
        AIRFLOW_EXTRAS=${AIRFLOW_EXTRAS/mysql,}; \
    fi; \
    if [[ ${INSTALL_AIRFLOW_VIA_PIP} == "true" ]]; then \
        pip install --user "${AIRFLOW_INSTALL_SOURCES}[${AIRFLOW_EXTRAS}]${AIRFLOW_INSTALL_VERSION}" \
            --constraint "${AIRFLOW_CONSTRAINTS_LOCATION}"; \
    fi; \
    if [[ -n "${ADDITIONAL_PYTHON_DEPS}" ]]; then \
        pip install --user ${ADDITIONAL_PYTHON_DEPS} --constraint "${AIRFLOW_CONSTRAINTS_LOCATION}"; \
    fi; \
    if [[ ${AIRFLOW_LOCAL_PIP_WHEELS} == "true" ]]; then \
        if ls /docker-context-files/*.whl 1> /dev/null 2>&1; then \
            pip install --user --no-deps /docker-context-files/*.whl; \
        fi ; \
    fi; \
    find /root/.local/ -name '*.pyc' -print0 | xargs -0 rm -r || true ; \
    find /root/.local/ -type d -name '__pycache__' -print0 | xargs -0 rm -r || true
```

So, to cut a long story short: you need to set `AIRFLOW_PRE_CACHED_PIP_PACKAGES` to "false" and point `AIRFLOW_CONSTRAINTS_LOCATION` at your constraints file located in "./docker-context-files".
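
As a rough illustration of that last point, a local build command might be assembled as in the sketch below. The constraints file name (`constraints-3.6.txt`) and the image tag are hypothetical. The in-image path `/docker-context-files/...` is an assumption, based only on the `/docker-context-files/*.whl` path used in the Dockerfile snippet above. The command is printed rather than executed, so it can be reviewed and adjusted against the `docker build` command from the CI logs before running it.

```shell
# Sketch only: prints the build command instead of running it.
# CONSTRAINTS_FILE and IMAGE_TAG are hypothetical names chosen for illustration.
CONSTRAINTS_FILE="docker-context-files/constraints-3.6.txt"
IMAGE_TAG="airflow-prod-local-test"

# Assumption: files placed in ./docker-context-files are visible inside the
# build at /docker-context-files, as the wheel-installation step suggests.
BUILD_CMD="docker build . \
--build-arg AIRFLOW_PRE_CACHED_PIP_PACKAGES=false \
--build-arg AIRFLOW_CONSTRAINTS_LOCATION=/${CONSTRAINTS_FILE} \
--tag ${IMAGE_TAG}"

# Print the command for review; run it by hand once it looks right.
echo "${BUILD_CMD}"
```

Any other build args from the CI command (extras, Python version and so on) would still need to be carried over; only the two settings discussed above are shown here.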
