potiuk opened a new pull request #22492:
URL: https://github.com/apache/airflow/pull/22492


   This change is one of the biggest optimizations to the Dockerfiles.
   It was a goal from the very beginning, but it has only now been enabled
   by switching to BuildKit and the recent release of support for
   the 1.4 Dockerfile syntax. This syntax introduced two features:
   
   * heredocs
   * links for COPY commands
   
   Both features allow us to solve multiple problems:
   
   * COPY for build scripts suffers from permission problems. Depending
     on the umask setting of the host, the scripts could have different
     group permissions and invalidate the Docker cache. Inlining the
     scripts (done automatically by pre-commit) gets rid of the problem
     completely.
   
   * COPY --link allows us to optimize and parallelize builds for
     the source code embedded in Dockerfile.ci. This should not only
     speed up building the images locally, but also allow the CI
     builds to use the cache more efficiently (when the source code
     has not changed, the builds will reuse pre-cached layers from
     the cache, in parallel).
   
   * The PROD Dockerfile is now completely standalone. You do not
     need any folders or files to build the Airflow image. At
     the same time, the versatility and support for multiple ways
     of building the image (as described in
     https://airflow.apache.org/docs/docker-stack/build.html) is
     maintained. This was a goal from the very beginning of the
     PROD Dockerfile, but it was not easily achievable. Heredocs
     allow us to inline the scripts that are used for the build, and the
     pre-commits make sure that there is one source of truth
     and nicely editable scripts for both the PROD and CI Dockerfiles.
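   
   A minimal sketch of how the two features can fit together, assuming
   BuildKit is enabled. The base image tag, the stage name `scripts`, the
   script `install_deps.sh` and all paths are hypothetical and chosen only
   for illustration. They are not taken from the actual Airflow Dockerfiles.
   
   ```dockerfile
   # syntax=docker/dockerfile:1.4
   
   # Heredoc: the script body lives in the Dockerfile itself, so no file
   # from the host (and no host umask) influences the resulting layer.
   FROM scratch AS scripts
   COPY <<"EOF" /install_deps.sh
   #!/usr/bin/env bash
   set -euo pipefail
   echo "Installing dependencies (placeholder step)"
   EOF
   
   # Hypothetical base image, used here only as an example.
   FROM python:3.9-slim-bullseye AS main
   
   # --link creates a layer that does not depend on the layers before it,
   # so it can be built in parallel and reused from cache on its own.
   COPY --link --from=scripts /install_deps.sh /scripts/install_deps.sh
   RUN bash /scripts/install_deps.sh
   
   # Assumes the build context contains airflow/ and tests/ directories.
   COPY --link airflow/ /opt/airflow/airflow/
   COPY --link tests/ /opt/airflow/tests/
   ```
   
   If the last two COPY lines are dropped, the sketch needs no files from
   the build context at all, which is the property the standalone PROD
   Dockerfile relies on.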
   
   The last point is really cool, because it allows our users to
   build custom images without checking out the Airflow source
   code. It is enough to download the latest released
   Dockerfile, and they can easily build the image.
   
   Overall, this change will vastly improve build speed for
   both PROD and CI images in multiple scenarios.
   

