potiuk commented on pull request #20238: URL: https://github.com/apache/airflow/pull/20238#issuecomment-1009731442
I am looking for some reviews. As part of the optimization I also reviewed the image with Dive (cc: @malthe @mik-laj) and made sure that some of the remaining remnants that were "bloating" the image were removed:

* we had an (unnecessary) pip install in the final image - this caused a (small) number of .pyc files to be embedded in the image
* we also had a 15MB lastlog file produced during apt install - I made sure it is removed as the last step of the RUN instruction that created it (thanks @malthe for pointing that out!)
* I also reviewed and improved the instructions which copied the .local folder and set permissions - one of the problems noted in #20776 was that there was no "group write" permission for the home directory of Airflow (which could be problematic in some OpenShift cases). It had to be done carefully - the permissions have to be changed in the right place, because changing permissions after the files are stored as a layer effectively duplicates the layer (the new layer with the changed permissions effectively contains a copy of all the files) 😱

As a result, the efficiency score of our image jumped from 97% to 99%.

I am thinking about adding some more automated tests for the presence of unwanted files and automating the tests for the image "efficiency" in our CI, but I would like to do it after this one and #20258, as switching to buildx significantly improves the experience of iterating over the images and building them in small increments.

Looking forward to reviews!
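For illustration only, here is a minimal sketch of what an automated check for unwanted files might look like. It is not part of this PR. It assumes the image filesystem has been exported to a tarball (for example with `docker export`), and the function names and the pattern list are hypothetical.

```python
import fnmatch
import posixpath
import tarfile

# Hypothetical patterns for files that should not ship in the final image.
# These are assumptions based on the remnants mentioned in the comment above.
UNWANTED_PATTERNS = [
    "*.pyc",
    "var/log/lastlog",
]


def find_unwanted(paths, patterns=UNWANTED_PATTERNS):
    """Return the paths that match any of the unwanted patterns."""
    unwanted = []
    for path in paths:
        # Normalize "./var/log/x" and "/var/log/x" to "var/log/x".
        normalized = posixpath.normpath(path).lstrip("/")
        if any(fnmatch.fnmatchcase(normalized, p) for p in patterns):
            unwanted.append(path)
    return unwanted


def find_unwanted_in_tarball(tarball_path, patterns=UNWANTED_PATTERNS):
    """Scan a filesystem tarball (e.g. from `docker export`) for unwanted files."""
    with tarfile.open(tarball_path) as archive:
        return find_unwanted(archive.getnames(), patterns)
```

A CI job could call `find_unwanted_in_tarball` on the exported image and fail when the returned list is not empty. Whether every `.pyc` file should count as unwanted is a policy choice that the pattern list would need to reflect.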
