MarkusTeufelberger commented on a change in pull request #16170:
URL: https://github.com/apache/airflow/pull/16170#discussion_r642846193
##########
File path: docs/docker-stack/build.rst
##########
@@ -15,16 +15,126 @@
specific language governing permissions and limitations
under the License.
+.. _build:build_image:
+
Building the image
==================
-Before you dive-deeply in the way how the Airflow Image is build, named and why we are doing it the
-way we do, you might want to know very quickly how you can extend or customize the existing image
-for Apache Airflow. This chapter gives you a short answer to those questions.
+Before you dive deeply into how the Airflow image is built, let us first explain why you might need
+to build a custom container image, and show a few typical ways you can do it.
+
+Why a custom image?
+-------------------
+
+The Apache Airflow community releases Docker images which are ``reference images`` for Apache Airflow.
+However, Airflow has more than 60 community-managed providers (installable via extras), and some of the
+extras/providers installed by default are not used by everyone; sometimes other extras/providers
+are needed, and sometimes (very often, actually) you need to add your own custom dependencies,
+packages or even custom providers.
+
+In Kubernetes and Docker terms this means that you need another image with your specific requirements.
+This is why you should learn how to build your own Docker (or more properly, container) image.
+You might be tempted to use the ``reference image`` and dynamically install the new packages while
+starting your containers, but this is a bad idea for multiple reasons, starting from the fragility of
+the build and ending with the extra time needed to install those packages, which has to happen every
+time every container starts. The only viable way to deal with new dependencies and requirements in
+production is to build and use your own image. You should only install dependencies dynamically in
+"hobbyist" and "quick start" scenarios, when you want to iterate quickly to try things out and later
+replace it with your own images.
+
+How to build your own image
+---------------------------
+
+There are a few typical scenarios that you will encounter, and here is a quick recipe on how to achieve
+your goal in each of them. You can read further to understand the details, but for the simple cases
+using typical tools, the examples below should be enough.
+
+In the simplest case, building your image consists of these steps:
+
+1) Create your own ``Dockerfile`` (name it ``Dockerfile``) where you add:
+
+* information on what your image should be based on (for example ``FROM apache/airflow:latest-python3.8``)
+
+* additional steps that should be executed in your image (typically in the form of ``RUN <command>``)
+
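+The two items above can be sketched as a minimal ``Dockerfile``. Note that the base tag and the
+packages below (``vim``, ``lxml``) are just illustrative examples, not requirements:
+
+.. code-block:: docker
+
+    FROM apache/airflow:latest-python3.8
+
+    # Switch to root only for system-level package installation, then back to the airflow user
+    USER root
+    RUN apt-get update \
+        && apt-get install -y --no-install-recommends vim \
+        && apt-get clean && rm -rf /var/lib/apt/lists/*
+    USER airflow
+
+    # Python dependencies are installed as the regular airflow user
+    RUN pip install --no-cache-dir lxml
+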
+2) Build your image. This can be done with the ``docker`` CLI tools, and the examples below assume
+   ``docker`` is used. There are other tools like ``kaniko`` or ``podman`` that allow you to build the
+   image, but ``docker`` is so far the most popular and developer-friendly tool out there. A typical way
+   of building the image looks as follows (``my-custom-airflow-image-name`` is the custom name of your
+   image). In case you use some kind of registry from which the image will be used, it is usually
+   named in the form of ``registry/image-name``. The name of the image has to be configured for the
+   method with which your image will be deployed. It can be set, for example, as the image name in the
+   `docker-compose file <running-airflow-in-docker>`_ or in the `Helm chart <helm-chart>`_.
+
+.. code-block:: shell
+
+    docker build . -f Dockerfile -t my-custom-airflow-image-name
+
+
+3) Once you have built the image locally, you usually have several options to make it available for
+   your deployment:
+
+* For a ``docker-compose`` deployment, that's all you need. The image is stored in the Docker engine
+  cache, and Docker Compose will use it from there.
+
+* For some development-targeted Kubernetes deployments you can load the images directly into the
+  Kubernetes cluster. Clusters such as ``kind`` or ``minikube`` have a dedicated ``load`` command to
+  load images into the cluster.
+
+* Last but not least, you can push your image to a remote registry, which is the most common way of
+  storing and exposing images, and the most portable way of publishing them. Both
+  Docker Compose and Kubernetes can make use of images exposed via registries.
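+
+As a quick sketch of the last two options (the cluster and registry names below are just examples,
+assuming the respective tools are installed):
+
+.. code-block:: shell
+
+    # load the image straight into a local development cluster
+    kind load docker-image my-custom-airflow-image-name --name my-kind-cluster
+    # or, for minikube:
+    minikube image load my-custom-airflow-image-name
+
+    # or tag and push the image to a remote registry
+    docker tag my-custom-airflow-image-name registry.example.com/my-custom-airflow-image-name
+    docker push registry.example.com/my-custom-airflow-image-name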
+
+The most common scenarios where you want to build your own image are adding a new ``apt`` package,
Review comment:
deb, not apt. Apt is the package manager, deb is the format.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]