mik-laj commented on a change in pull request #14911:
URL: https://github.com/apache/airflow/pull/14911#discussion_r599151524



##########
File path: docs/docker-stack/build.rst
##########
@@ -262,119 +413,99 @@ The ``pip download`` might happen in a separate 
environment. The files can be co
 binary repository and vetted/verified by the security team and used 
subsequently to build images
 of Airflow when needed on an air-gaped system.
 
-Preparing the constraint files and wheel files:
+Example of preparing the constraint files and wheel files (note that the ``mysql`` 
dependency is removed
+because ``mysqlclient`` is installed from Oracle's ``apt`` repository; if you 
want to add it, you need
+to provide this library from your own repository when building the Airflow image 
in an "air-gapped" system):
 
-.. code-block:: bash
+.. exampleinclude:: docker-examples/restricted/restricted_environments.sh
+    :language: bash
+    :start-after: [START download]
+    :end-before: [END download]
 
-  rm docker-context-files/*.whl docker-context-files/*.txt
+After this step is finished, your ``docker-context-files`` folder will contain 
all the packages
+needed to install Airflow.
 
-  curl -Lo "docker-context-files/constraints-2-0.txt" \
-    
https://raw.githubusercontent.com/apache/airflow/constraints-2-0/constraints-3.7.txt
+The downloaded packages and constraint file can be pre-vetted by your 
security team before you attempt
+to install the image. You can also store the downloaded binary packages in 
your private artifact registry,
+which enables a flow where you download the packages on one machine, 
submit only new packages for
+security vetting, and use the new packages only after they have been vetted.
 
-  pip download --dest docker-context-files \
-    --constraint docker-context-files/constraints-2-0.txt  \
-    
apache-airflow[async,aws,azure,celery,dask,elasticsearch,gcp,kubernetes,mysql,postgres,redis,slack,ssh,statsd,virtualenv]==2.0.1
+On a separate (air-gapped) system, all the PyPI packages can be copied to 
``docker-context-files``,
+where you can build the image from the downloaded packages by passing these 
build args:
 
-Since apache-airflow .whl packages are treated differently by the docker 
image, you need to rename the
-downloaded apache-airflow* files, for example:
+  * ``INSTALL_FROM_DOCKER_CONTEXT_FILES="true"``  - to use packages present in 
``docker-context-files``
+  * ``AIRFLOW_PRE_CACHED_PIP_PACKAGES="false"``  - to not pre-cache packages 
from PyPI when building image
+  * 
``AIRFLOW_CONSTRAINTS_LOCATION=/docker-context-files/YOUR_CONSTRAINT_FILE.txt`` 
- to point to the downloaded constraint file
+  * (Optional) ``INSTALL_MYSQL_CLIENT="false"`` if you do not want to install 
``MySQL``
+    client from the Oracle repositories. In this case also make sure that your
 
-.. code-block:: bash
+Note that the solution we have for installing Python packages from local 
packages only solves the problem
+of "air-gapped" Python installation. The Docker image also downloads ``apt`` 
dependencies and ``node-modules``.
+Those types of dependencies are, however, more likely to be available in your 
"air-gapped" system via transparent
+proxies, and the build should automatically reach out to your private registries. 
In the future, the
+solution might be applied to both of those installation steps.
 
-   pushd docker-context-files
-   for file in apache?airflow*
-   do
-     mv ${file} _${file}
-   done
-   popd
+You can also use the techniques described in the previous chapter to make ``docker 
build`` use your private
+apt sources or private PyPI repositories (via ``.pypirc``), which can 
be security-vetted.
 
-Building the image:
+If you fulfill all the criteria, you can build the image on an air-gapped 
system by running a command similar
+to the one below:
 
-.. code-block:: bash
+.. exampleinclude:: docker-examples/restricted/restricted_environments.sh
+    :language: bash
+    :start-after: [START build]
+    :end-before: [END build]
 
-  ./breeze build-image \
-      --production-image --python 3.7 --install-airflow-version=2.0.1 \
-      --disable-mysql-client-installation --disable-pip-cache 
--install-from-local-files-when-building \
-      --constraints-location="/docker-context-files/constraints-2-0.txt"
+Modifying the Dockerfile
+........................
 
-or
+The build arg approach is a convenience method if you do not want to manually 
modify the ``Dockerfile``.
+Our approach is flexible enough to accommodate most requirements 
and
+customizations out-of-the-box. When you use it, you do not need to worry about 
adapting the image every
+time a new version of Airflow is released. However, sometimes it is not enough if 
you have very
+specific needs and want to build a very custom image. In such a case you can 
simply modify the
+``Dockerfile`` manually as you see fit and store it in your forked repository. 
However, you will have to
+make sure to rebase your changes whenever a new version of Airflow is released, 
because we might modify
+the approach of our Dockerfile builds in the future and you might need to 
resolve conflicts.
 
-.. code-block:: bash
+There are a few things to remember when you modify the ``Dockerfile``:
 
-  docker build . \
-    --build-arg PYTHON_BASE_IMAGE="python:3.7-slim-buster" \
-    --build-arg PYTHON_MAJOR_MINOR_VERSION=3.7 \
-    --build-arg AIRFLOW_INSTALLATION_METHOD="apache-airflow" \
-    --build-arg AIRFLOW_VERSION="2.0.1" \
-    --build-arg AIRFLOW_VERSION_SPECIFICATION="==2.0.1" \
-    --build-arg AIRFLOW_CONSTRAINTS_REFERENCE="constraints-2-0" \
-    --build-arg AIRFLOW_SOURCES_FROM="empty" \
-    --build-arg AIRFLOW_SOURCES_TO="/empty" \
-    --build-arg INSTALL_MYSQL_CLIENT="false" \
-    --build-arg AIRFLOW_PRE_CACHED_PIP_PACKAGES="false" \
-    --build-arg INSTALL_FROM_DOCKER_CONTEXT_FILES="true" \
-    --build-arg 
AIRFLOW_CONSTRAINTS_LOCATION="/docker-context-files/constraints-2-0.txt"
+* We are using the widely recommended pattern of ``.dockerignore`` where 
everything is ignored by default
+  and only the required folders are added through exceptions (``!``). This keeps 
the docker context small,
+  because there are many binary artifacts generated in the sources of Airflow 
and if they were added to
+  the context, the time of building the image would increase significantly. If 
you want to make any new
+  folders available in the image, you must add them here with a leading ``!``.
 
+.. code-block:: text
 
-Customizing & extending the image together
-..........................................
+    # Ignore everything
+    **
 
-You can combine both - customizing & extending the image. You can build the 
image first using
-``customize`` method (either with docker command or with ``breeze`` and then 
you can ``extend``
-the resulting image using ``FROM`` any dependencies you want.
+    # Allow only these directories
+    !airflow
+    ...
 
-Customizing PYPI installation
-.............................
 
-You can customize PYPI sources used during image build by adding a 
``docker-context-files``/``.pypirc`` file
-This ``.pypirc`` will never be committed to the repository and will not be 
present in the final production image.
-It is added and used only in the build segment of the image so it is never 
copied to the final image.
+* The ``docker-context-files`` folder is automatically added to the context of 
the image, so if you want
+  to add individual files, binaries, requirement files, etc., you can add them 
there. The
+  ``docker-context-files`` folder is copied to the ``/docker-context-files`` folder 
of the build segment of the
+  image, so it is not present in the final image, which makes the final image 
smaller in case you want
+  to use those files only in the ``build`` segment. You must copy any files 
from the directory manually,
+  using the ``COPY`` command, if you want to get the files in your final image (in the 
main image segment).
 
-External sources for dependencies
-.................................
 
-In corporate environments, there is often the need to build your Container 
images using
-other than default sources of dependencies. The docker file uses standard 
sources (such as
-Debian apt repositories or PyPI repository. However, in corporate 
environments, the dependencies
-are often only possible to be installed from internal, vetted repositories 
that are reviewed and
-approved by the internal security teams. In those cases, you might need to use 
those different
-sources.
-
-This is rather easy if you extend the image - you simply write your extension 
commands
-using the right sources - either by adding/replacing the sources in apt 
configuration or
-specifying the source repository in pip install command.
-
-It's a bit more involved in the case of customizing the image. We do not have 
yet (but we are working
-on it) a capability of changing the sources via build args. However, since the 
builds use
-Dockerfile that is a source file, you can rather easily simply modify the file 
manually and
-specify different sources to be used by either of the commands.
-
-
-Comparing extending and customizing the image
----------------------------------------------
-
-Here is the comparison of the two types of building images.
-
-+----------------------------------------------------+---------------------+-----------------------+
-|                                                    | Extending the image | 
Customizing the image |
-+====================================================+=====================+=======================+
-| Produces optimized image                           | No                  | 
Yes                   |
-+----------------------------------------------------+---------------------+-----------------------+
-| Use Airflow Dockerfile sources to build the image  | No                  | 
Yes                   |
-+----------------------------------------------------+---------------------+-----------------------+
-| Requires Airflow sources                           | No                  | 
Yes                   |
-+----------------------------------------------------+---------------------+-----------------------+
-| You can build it with Breeze                       | No                  | 
Yes                   |
-+----------------------------------------------------+---------------------+-----------------------+
-| Allows to use non-default sources for dependencies | Yes                 | 
No [1]                |
-+----------------------------------------------------+---------------------+-----------------------+
-
-[1] When you combine customizing and extending the image, you can use external 
sources
-in the "extend" part. There are plans to add functionality to add external 
sources
-option to image customization. You can also modify Dockerfile manually if you 
want to
-use non-default sources for dependencies.
-
-More details about the images
------------------------------
+More details

Review comment:
       Incorrect header level. This creates a new page title.
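
       For illustration only: reStructuredText assigns heading levels by the
       order in which underline styles first appear in a file, so an underline
       style that ranks as the top level turns a heading into a second
       page-level title. The sketch below assumes the file uses ``=`` for the
       page title (the title line is an assumption, it is not part of the
       quoted hunk), ``-`` for sections and ``.`` for subsections, as the
       removed lines suggest. Which underline the PR actually placed under
       ``More details`` is not visible in the quoted diff, so the last heading
       shows one possible section-level form, not the PR's final text.

```rst
.. Assumed page title (not shown in the quoted hunk)

Building the image
==================

.. Section level, as used by the removed lines

Comparing extending and customizing the image
---------------------------------------------

.. Subsection level

Customizing PYPI installation
.............................

.. Section level again, so it stays inside the same page

More details
------------
```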
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.


