MarkusTeufelberger commented on a change in pull request #16170:
URL: https://github.com/apache/airflow/pull/16170#discussion_r642846193
##########
File path: docs/docker-stack/build.rst
##########
@@ -15,16 +15,126 @@
specific language governing permissions and limitations
under the License.
+.. _build:build_image:
+
Building the image
==================
-Before you dive-deeply in the way how the Airflow Image is build, named and why we are doing it the
-way we do, you might want to know very quickly how you can extend or customize the existing image
-for Apache Airflow. This chapter gives you a short answer to those questions.
+Before you dive deeply into how the Airflow image is built, let us first explain why you might need
+to build a custom container image, and show a few typical ways you can do it.
+
+Why a custom image?
+-------------------
+
+The Apache Airflow community releases Docker images which are ``reference images`` for Apache Airflow.
+However, Airflow has more than 60 community-managed providers (installable via extras), and some of the
+extras/providers installed by default are not used by everyone; sometimes other extras/providers
+are needed, and sometimes (very often, actually) you need to add your own custom dependencies,
+packages or even custom providers.
+
+In Kubernetes and Docker terms this means that you need another image with your specific requirements.
+This is why you should learn how to build your own Docker (or more properly, container) image.
+You might be tempted to use the ``reference image`` and dynamically install the new packages while
+starting your containers, but this is a bad idea for multiple reasons, starting from the fragility of
+the build and ending with the extra time needed to install those packages, which has to happen every
+time every container starts. The only viable way to deal with new dependencies and requirements in
+production is to build and use your own image. You should only install dependencies dynamically in
+"hobbyist" and "quick start" scenarios, when you want to iterate quickly to try things out and later
+replace it with your own images.
+
+How to build your own image
+---------------------------
+
+There are a few typical scenarios that you will encounter, and here is a quick recipe on how to achieve
+your goal in each of them. You can read further to understand the details, but for the simple cases
+using typical tools, the examples below should be enough.
+
+In the simplest case, building your image consists of these steps:
+
+1) Create your own ``Dockerfile`` (name it ``Dockerfile``) where you add:
+
+* information on what your image should be based on (for example ``FROM apache/airflow:latest-python3.8``)
+
+* additional steps that should be executed in your image (typically in the form of ``RUN <command>``)
+
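+The two items above can be sketched as a minimal ``Dockerfile``. Note that the base tag and the
+packages below (``vim``, ``lxml``) are just illustrative examples, not requirements:
+
+.. code-block:: docker
+
+    FROM apache/airflow:latest-python3.8
+
+    # Switch to root only for system-level package installation, then back to the airflow user
+    USER root
+    RUN apt-get update \
+        && apt-get install -y --no-install-recommends vim \
+        && apt-get clean && rm -rf /var/lib/apt/lists/*
+    USER airflow
+
+    # Python dependencies are installed as the regular airflow user
+    RUN pip install --no-cache-dir lxml
+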
+2) Build your image. This can be done with the ``docker`` CLI tools, and the examples below assume
+   ``docker`` is used. There are other tools like ``kaniko`` or ``podman`` that allow you to build the
+   image, but ``docker`` is so far the most popular and developer-friendly tool out there. A typical way
+   of building the image looks as follows (``my-custom-airflow-image-name`` is the custom name of your
+   image). In case you use some kind of registry from which the image will be used, it is usually
+   named in the form of ``registry/image-name``. The name of the image has to be configured for the
+   method with which your image will be deployed. It can be set, for example, as the image name in the
+   `docker-compose file <running-airflow-in-docker>`_ or in the `Helm chart <helm-chart>`_.
+
+.. code-block:: shell
+
+    docker build . -f Dockerfile -t my-custom-airflow-image-name
+
+
+3) Once you have built the image locally, you usually have several options to make it available for
+   your deployment:
+
+* For a ``docker-compose`` deployment, that's all you need. The image is stored in the Docker engine
+  cache, and Docker Compose will use it from there.
+
+* For some development-targeted Kubernetes deployments you can load the images directly into the
+  Kubernetes cluster. Clusters such as ``kind`` or ``minikube`` have a dedicated ``load`` command to
+  load images into the cluster.
+
+* Last but not least, you can push your image to a remote registry, which is the most common way of
+  storing and exposing images, and the most portable way of publishing them. Both
+  Docker Compose and Kubernetes can make use of images exposed via registries.
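+
+As a quick sketch of the last two options (the cluster and registry names below are just examples,
+assuming the respective tools are installed):
+
+.. code-block:: shell
+
+    # load the image straight into a local development cluster
+    kind load docker-image my-custom-airflow-image-name --name my-kind-cluster
+    # or, for minikube:
+    minikube image load my-custom-airflow-image-name
+
+    # or tag and push the image to a remote registry
+    docker tag my-custom-airflow-image-name registry.example.com/my-custom-airflow-image-name
+    docker push registry.example.com/my-custom-airflow-image-name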
+
+The most common scenarios where you want to build your own image are adding a new ``apt`` package,
Review comment:
deb, not apt. Apt is the package manager, deb is the format.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]