potiuk commented on code in PR #29469:
URL: https://github.com/apache/airflow/pull/29469#discussion_r1106545002


##########
dev/README_RELEASE_PROVIDER_PACKAGES.md:
##########
@@ -561,14 +561,41 @@ Or update it if you already checked it out:
 svn update .
 ```
 
-Optionally you can use `check_files.py` script to verify that all expected files are
-present in SVN. This script may help also with verifying installation of the packages.
+Optionally you can use the [`check_files.py`](https://github.com/apache/airflow/blob/main/dev/check_files.py)
+script to verify that all expected files are present in SVN. This script will produce a `Dockerfile.pmc` which
+may help with verifying installation of the packages.
 
 ```shell script
 # Copy the list of packages (pypi urls) into `packages.txt` then run:
 python check_files.py providers -p {PATH_TO_SVN}
 ```
 
+After the above script completes you can build `Dockerfile.pmc` to trigger an installation of each provider
+package:
+
+```shell script
+docker build - < Dockerfile.pmc
+```
+
+**Note**: This may fail to install some providers. For example, if they require some system level dependencies
+that aren't present in the image. If you wish to investigate you may update the Dockerfile to install any
+missing dependencies and then try to build again.
+
+For example, currently `apache-airflow-providers-apache-hive` requires the `libsasl2` system dependency. To

Review Comment:
   Yes. Good point (sorry I missed it). 
   
   This is because the providers are also present in the "airflow/providers" source
   code. Airflow is installed in the CI image not from a package but from sources
   copied into the image, so that the source code can be replaced by just mounting
   the "airflow" volume in the place where airflow is installed.
   
   The solution is simple. Remove `/opt/airflow/airflow/providers` right after
   `FROM`.
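
   A minimal sketch of how that removal could be added to the generated file, assuming
   `Dockerfile.pmc` contains a line starting with `FROM`. The helper name
   `add_providers_cleanup` and the output file name are hypothetical and only used for
   illustration; they are not part of `check_files.py`.

   ```shell
   # Hypothetical helper: print the given Dockerfile with a cleanup step
   # inserted right after its first FROM line.
   add_providers_cleanup() {
       awk '
           { print }
           /^FROM / && !patched {
               # Remove the providers that ship with the airflow sources in the image
               print "RUN rm -rf /opt/airflow/airflow/providers"
               patched = 1
           }
       ' "$1"
   }

   # Possible usage (file names are assumptions):
   #   add_providers_cleanup Dockerfile.pmc > Dockerfile.pmc.patched
   #   docker build - < Dockerfile.pmc.patched
   ```

   With this in place, only the provider packages installed by the later steps of the
   Dockerfile should remain importable from the image.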
   
   We use a similar approach in our CI when we test installation of providers,
   but in that case we remove the whole "airflow" folder by ... (wait for
   it) ... mounting an empty directory in its place :) (see below).
   
   In your case it's easier to remove it in the Dockerfile, as we do not care about
   the size of the image (the layer with all airflow sources will stay in the
   image) and you build the image anyway. In the case of CI, we do not want to build
   an extra image, so we can simply set an env var that turns into an option and
   mounts the empty dir (which we do automatically when ``--use-airflow-version`` is used:
   https://github.com/apache/airflow/blob/main/dev/breeze/src/airflow_breeze/params/shell_params.py#L220).
   
   Try it yourself with breeze (and you can use `--dry-run` to see the actual 
command or look at the breeze source code):
   
   ```
   >  breeze shell --mount-sources skip
   ...
   root@a51c9ce0ae89:/opt/airflow# ls -la airflow/
   total 276
   drwxr-xr-x 60 root root  1920 Feb 13 22:40 .
   drwxr-xr-x  1 root root  4096 Feb 15 00:50 ..
   -rw-r--r--  1 root root  4670 Feb 12 18:21 __init__.py
   -rw-r--r--  1 root root  1403 Jan 13 11:23 __main__.py
   drwxr-xr-x  5 root root   160 Dec 18 14:57 _vendor
   -rw-r--r--  1 root root  2320 Sep 10 14:26 alembic.ini
   drwxr-xr-x  6 root root   192 Feb 12 22:21 api
   drwxr-xr-x 10 root root   320 Feb 12 22:21 api_connexion
   drwxr-xr-x  6 root root   192 Feb 13 22:40 api_internal
   drwxr-xr-x  7 root root   224 Feb 13 22:40 callbacks
   
   ....
   
   drwxr-xr-x 65 root root  2080 Dec 29 20:50 providers
   -rw-r--r--  1 root root 44445 Jan 13 11:23 providers_manager.py
   -rw-r--r--  1 root root   846 Sep 10 14:26 py.typed
   drwxr-xr-x  7 root root   224 Feb 13 22:40 secrets
   ```
   
   vs. 
   
   ```
   > breeze shell --mount-sources remove
   ...
   
   root@7ff23b566750:/opt/airflow# ls -la airflow/
   total 4
   drwxr-xr-x 3 root root   96 May 16  2022 .
   drwxr-xr-x 1 root root 4096 Feb 15 00:51 ..
   -rw-r--r-- 1 root root    0 May 16  2022 .gitignore
   root@7ff23b566750:/opt/airflow#
   ```
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
