potiuk commented on code in PR #29469:
URL: https://github.com/apache/airflow/pull/29469#discussion_r1106545002
##########
dev/README_RELEASE_PROVIDER_PACKAGES.md:
##########
@@ -561,14 +561,41 @@ Or update it if you already checked it out:
 svn update .
 ```

-Optionally you can use `check_files.py` script to verify that all expected files are
-present in SVN. This script may help also with verifying installation of the packages.
+Optionally you can use the [`check_files.py`](https://github.com/apache/airflow/blob/main/dev/check_files.py)
+script to verify that all expected files are present in SVN. This script will produce a `Dockerfile.pmc` which
+may help with verifying installation of the packages.

 ```shell script
 # Copy the list of packages (pypi urls) into `packages.txt` then run:
 python check_files.py providers -p {PATH_TO_SVN}
 ```

+After the above script completes you can build `Dockerfile.pmc` to trigger an installation of each provider
+package:
+
+```shell script
+docker build - < Dockerfile.pmc
+```
+
+**Note**: This may fail to install some providers. For example, if they require some system level dependencies
+that aren't present in the image. If you wish to investigate you may update the Dockerfile to install any
+missing dependencies and then try to build again.
+
+For example, currently `apache-airflow-providers-apache-hive` requires the `libsasl2` system dependency. To

Review Comment:
Yes. Good point (sorry I missed it). This is because you also get the providers in the "airflow/providers" source code. This is how airflow is installed in the CI image: not from a package, but from sources copied into the image, so that the source code can be replaced by just mounting the "airflow" volume in the place where airflow is installed.

The solution is simple. Remove `/opt/airflow/airflow/providers` right after `FROM`. We use a similar approach in our CI when we test installation of providers, but in that case we remove the whole "airflow" folder by ... (wait for it) ... mounting an empty directory in its place :) (see below).
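
As a minimal sketch of the Dockerfile change described above (not the change actually made in the PR), the helper below prints a copy of a Dockerfile with a `RUN rm -rf /opt/airflow/airflow/providers` step added right after the first `FROM` line. The function name `add_provider_cleanup` is hypothetical, and the sketch assumes the generated `Dockerfile.pmc` has an upper-case `FROM` instruction at the start of a line.

```shell
# Hypothetical helper: print the given Dockerfile with a cleanup step
# inserted directly after its first FROM instruction.
# Assumption: FROM is written in upper case at the start of a line.
add_provider_cleanup() {
    awk '
        { print }                      # copy every original line unchanged
        /^FROM / && !done {            # only after the first FROM line
            print "RUN rm -rf /opt/airflow/airflow/providers"
            done = 1
        }
    ' "$1"
}
```

It could then be used as `add_provider_cleanup Dockerfile.pmc > Dockerfile.pmc.patched` followed by `docker build - < Dockerfile.pmc.patched`, where `Dockerfile.pmc.patched` is just an illustrative file name.
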
In your case it's easier to remove it in the Dockerfile, as we do not care about the size of the image (the layer with all airflow sources will stay in the image) and you build the image anyway. In the case of CI, we do not want to build an extra image, so we can simply set an env var that turns into an option and mounts the empty dir (which we do automatically when ``--use-airflow-version`` is used): https://github.com/apache/airflow/blob/main/dev/breeze/src/airflow_breeze/params/shell_params.py#L220

Try it yourself with breeze (you can use `--dry-run` to see the actual command, or look at the breeze source code):

```
> breeze shell --mount-sources skip
...
root@a51c9ce0ae89:/opt/airflow# ls -la airflow/
total 276
drwxr-xr-x 60 root root 1920 Feb 13 22:40 .
drwxr-xr-x 1 root root 4096 Feb 15 00:50 ..
-rw-r--r-- 1 root root 4670 Feb 12 18:21 __init__.py
-rw-r--r-- 1 root root 1403 Jan 13 11:23 __main__.py
drwxr-xr-x 5 root root 160 Dec 18 14:57 _vendor
-rw-r--r-- 1 root root 2320 Sep 10 14:26 alembic.ini
drwxr-xr-x 6 root root 192 Feb 12 22:21 api
drwxr-xr-x 10 root root 320 Feb 12 22:21 api_connexion
drwxr-xr-x 6 root root 192 Feb 13 22:40 api_internal
drwxr-xr-x 7 root root 224 Feb 13 22:40 callbacks
....
drwxr-xr-x 65 root root 2080 Dec 29 20:50 providers
-rw-r--r-- 1 root root 44445 Jan 13 11:23 providers_manager.py
-rw-r--r-- 1 root root 846 Sep 10 14:26 py.typed
drwxr-xr-x 7 root root 224 Feb 13 22:40 secrets
```

vs.

```
> breeze shell --mount-sources remove
...
root@7ff23b566750:/opt/airflow# ls -la airflow/
total 4
drwxr-xr-x 3 root root 96 May 16 2022 .
drwxr-xr-x 1 root root 4096 Feb 15 00:51 ..
-rw-r--r-- 1 root root 0 May 16 2022 .gitignore
root@7ff23b566750:/opt/airflow#
```

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
