potiuk commented on PR #35617:
URL: https://github.com/apache/airflow/pull/35617#issuecomment-1809884844

   Right. I am pretty happy with what I came up with :). I hope we will get it 
reviewed and merged soon (after #35586), as it brings quite a few improvements 
in speed and on the "supply chain" side of release management 
(more about it after I complete this `quest`).
   
   Seems I got it working nicely. To make review and verification easier, 
you can take a look at the PROD images in this PR, which are built 
using provider packages prepared in this PR. They pass all the tests in CI 
(including discovery of providers from installed packages, discovery of 
plugins etc.). We will need to run more tests when the providers are first 
released using this mechanism, but it generally looks good. You can pull the 
images and run a bash shell inside them easily with this (these are AMD 
images, so they will run slowly on ARM/macOS):
   
   ```
   docker run -it ghcr.io/apache/airflow/main/prod/python3.8:1219f8c13febe13271a5e109ad649c76c6c5a3fc bash
   ```
   
   You will see that provider packages are installed from locally built 
packages and they are all nicely discoverable and working.
   
   
![image](https://github.com/apache/airflow/assets/595491/90bffe83-85ee-42bb-885e-be6e8cdd4526)
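
For anyone who wants to double-check discovery from inside the container without going through Airflow itself, here is a minimal sketch using only the standard library. It assumes the providers register themselves under the `apache_airflow_provider` entry point group; the helper name `provider_entry_points` is made up for this example, and Airflow's own discovery logic does more than this.

```python
from importlib import metadata


def provider_entry_points(group="apache_airflow_provider"):
    """List entry points registered under ``group`` by installed packages.

    The default group name is an assumption about how providers are
    registered; pass a different group to look at other entry points.
    """
    eps = metadata.entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+ returns an object with a ``select`` method.
        return list(eps.select(group=group))
    # Python 3.8/3.9 (as in the image above) returns a plain dict.
    return list(eps.get(group, []))


if __name__ == "__main__":
    for ep in provider_entry_points():
        print(ep.name, "->", ep.value)
```

An empty list here would suggest that the installed packages did not ship their entry points.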
   
   
   @uranusjr - I'd appreciate it if you could take a look - maybe you will 
have some advice and a keen eye for reviewing the `pyproject.toml` template I 
came up with. You can also easily see the generated `pyproject.toml` files and 
the sources the packages are generated from with this PR. Just run this:
   
   ```
   breeze release-management --skip-tag-check --skip-deleting-generated-files --clean-dist --package-format both
   ```
   
   The command above will create all provider packages in `dist`, but it will 
also leave all the generated and copied code in `dist/provider_packages` 
(in a structure reflecting our provider structure) so that you can see where 
the packages were generated from. You will find a generated `pyproject.toml` 
for every provider there (and no `setup.py`/`setup.cfg`).
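
A quick way to confirm that claim on your own checkout is sketched below. It only assumes that the generated tree lives in `dist/provider_packages`, as described above; the function name `summarize_generated_tree` is made up for this example and nothing is assumed about the layout below that directory.

```python
from pathlib import Path


def summarize_generated_tree(root):
    """Return (pyproject_paths, legacy_paths) found anywhere under ``root``."""
    root = Path(root)
    pyprojects = sorted(root.rglob("pyproject.toml"))
    # setup.py / setup.cfg should no longer be generated, so collect any leftovers.
    legacy = sorted(
        path for name in ("setup.py", "setup.cfg") for path in root.rglob(name)
    )
    return pyprojects, legacy


if __name__ == "__main__":
    pyprojects, legacy = summarize_generated_tree("dist/provider_packages")
    print(f"{len(pyprojects)} generated pyproject.toml files")
    for path in legacy:
        print(f"unexpected legacy build file: {path}")
```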
   
   I compared the provider packages before (built with setuptools) and after 
(built with flit) and I think I got them all right (I will do a more complete 
comparison, on the complete set of providers, before merging this one).
   
   They are not identical but I believe those packages have everything needed:
   
   * entrypoints (plugins/provider_info) are good
   * non-python files are added automatically to both wheel and sdist (which is 
nice - no more need for the different MANIFEST.in and setup.cfg ways of 
specifying them)
   * One change that I made - I added the LICENCE file to the package instead 
of adding it to the metadata. I could not find a way to add an extra file to 
the metadata of a wheel package with flit, but I think adding it as a file in 
the `airflow/providers/<PROVIDER_ID>` package is a much more appropriate way of 
adding LICENCE, as it remains in the package after being installed and you can 
always see it there. @uranusjr - any comments/insights are appreciated
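
If you want to check the points above on a built wheel yourself, a rough sketch with the standard library could look like this. The function name `inspect_wheel` is made up for this example; it relies only on a wheel being a zip archive with an optional `entry_points.txt` in its `.dist-info` directory, and it is not the comparison script I used.

```python
import configparser
import zipfile


def inspect_wheel(wheel_path):
    """Return (entry_points, non_python_files) for the wheel at ``wheel_path``."""
    with zipfile.ZipFile(wheel_path) as wheel:
        names = wheel.namelist()
        entry_points = {}
        for name in names:
            if name.endswith(".dist-info/entry_points.txt"):
                parser = configparser.ConfigParser(
                    delimiters=("=",), interpolation=None
                )
                parser.optionxform = str  # keep entry point names case-sensitive
                parser.read_string(wheel.read(name).decode("utf-8"))
                entry_points = {
                    section: dict(parser.items(section))
                    for section in parser.sections()
                }
        # Payload files that are not Python sources (e.g. LICENCE, YAML, templates).
        non_python = [
            name
            for name in names
            if not name.endswith(".py")
            and ".dist-info/" not in name
            and not name.endswith("/")
        ]
    return entry_points, non_python
```

Running it on a wheel from `dist` before and after the change gives two small structures that are easy to diff by eye.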
   
   Also, the way packages are generated, both in CI and locally when the 
release manager generates them, is a little nicer; it's easier to see what's 
going on:
   
   Generating "PROD" packages:
   
   
![image](https://github.com/apache/airflow/assets/595491/4994a06c-6fc4-4b54-970b-55c1e3268274)
   
   Summary:
   
   
![image](https://github.com/apache/airflow/assets/595491/786a23f7-0dbc-4ec4-935d-7a0297b431a6)
   
   
   As mentioned elsewhere, with the change to flit/pyproject.toml (which 
allowed us to move building the packages out of the docker environment), the 
generation of all providers is WAAAAAAAY faster now:
   
   Before: **13m58s**
   
   
![image](https://github.com/apache/airflow/assets/595491/a344f2e1-77ad-4e13-b262-a3687631d1ec)
   
   After: **21s (!!!!!):**
   
   
![image](https://github.com/apache/airflow/assets/595491/87a2f723-c004-4ea8-b64f-3c2eea22efb3)
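
To put a number on it, here is the arithmetic for the two timings quoted above (nothing else is assumed):

```python
def to_seconds(minutes, seconds):
    """Convert a minutes/seconds pair into seconds."""
    return minutes * 60 + seconds


before = to_seconds(13, 58)  # setuptools-based build
after = to_seconds(0, 21)    # flit-based build
speedup = before / after

print(f"{before}s -> {after}s: about {speedup:.1f}x faster")
# prints "838s -> 21s: about 39.9x faster"
```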
   
   
   
   

