Hello Everyone,

Following the discussion we had on mono-layered vs. multi-layered official image for Airflow here https://github.com/apache/airflow/pull/4483, I prepared a proof-of-concept PR of a multi-layered image (based on the mono-layered one). I also ran some calculations and wrote up the conclusions in this proposal (I wanted to have some hard numbers to back the statement that a multi-layered Dockerfile is better):

https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-10+Multi-layered+official+Airflow+image

The conclusions I reached:

- The multi-layered image is slightly smaller than the mono-layered one, so it is better even when you download it only once.
- When users download the image regularly, the multi-layered image is far better: for a simulated user downloading the Airflow image twice a week, the total over the course of 8 weeks is 5.7 GB (multi-layered) vs. 16.15 GB (mono-layered).
- The multi-layered image is the better choice.

I based those calculations on the PR I prepared: https://github.com/apache/airflow/pull/4543, where I implemented a rather nice multi-layered Dockerfile that can be easily maintained. It's based on my experience with Airflow Breeze <https://github.com/PolideaInternal/airflow-breeze>, the GCP development environment we used recently to develop 30+ GCP-based operators.

I hope we can reach the conclusion as a community that multi-layered is better and that we can go in this direction :). I am happy to iterate on my PR to make it even better.

J.

--
Jarek Potiuk
Polidea <https://www.polidea.com/> | Principal Software Engineer
M: +48 660 796 129 <+48660796129>
E: [email protected]
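
P.S. For anyone who wants a feel for why repeated downloads differ so much between the two layouts, here is a minimal sketch of the arithmetic. The layer sizes and the change pattern below are made up for illustration; they are not the measurements from the AIP, and the helper name total_download_mb is hypothetical. The sketch assumes that a pull fetches the lowest changed layer plus every layer above it, and that both images have the same total size (the comparison above found the multi-layered one slightly smaller).

```python
def total_download_mb(layer_sizes_mb, changed_from_per_pull):
    """Sum the megabytes fetched over a series of pulls.

    layer_sizes_mb: layer sizes ordered bottom to top.
    changed_from_per_pull: for each pull, the index of the lowest layer
        that changed since the previous pull (0 means everything, as on
        the first pull); None means nothing changed.
    """
    total = 0
    for changed_from in changed_from_per_pull:
        if changed_from is None:
            continue
        # A changed layer invalidates every layer built on top of it.
        total += sum(layer_sizes_mb[changed_from:])
    return total


# Hypothetical sizes in MB: OS packages, Python dependencies, Airflow sources.
multi_layers = [400, 500, 100]
# The mono-layered image is modelled as one layer of the same total size.
mono_layers = [sum(multi_layers)]

# 16 pulls: twice a week for 8 weeks (assumed change pattern).
# Multi-layered: full first pull, sources change 13 times, dependencies twice.
multi_pulls = [0] + [2] * 13 + [1] * 2
# Mono-layered: any change means fetching the whole image again.
mono_pulls = [0] * 16

multi_total = total_download_mb(multi_layers, multi_pulls)
mono_total = total_download_mb(mono_layers, mono_pulls)
print(f"multi-layered: {multi_total} MB, mono-layered: {mono_total} MB")
```

With these made-up inputs the multi-layered total comes to 3500 MB against 16000 MB for the mono-layered image. The real figures depend on how the layers are split and how often each one changes.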
