rosetn commented on a change in pull request #13420:
URL: https://github.com/apache/beam/pull/13420#discussion_r538031144
########## File path: website/www/site/content/en/documentation/runtime/environments.md ########## @@ -17,147 +17,232 @@ limitations under the License. # Container environments -The Beam SDK runtime environment is isolated from other runtime systems because the SDK runtime environment is [containerized](https://s.apache.org/beam-fn-api-container-contract) with [Docker](https://www.docker.com/). This means that any execution engine can run the Beam SDK. +The Beam SDK runtime environment can be [containerized](https://www.docker.com/resources/what-container) with [Docker](https://www.docker.com/) to isolate it from other runtime systems. To learn more about the container environment, read the Beam [SDK Harness container contract](https://s.apache.org/beam-fn-api-container-contract). -This page describes how to customize, build, and push Beam SDK container images. +Prebuilt SDK container images are released per supported language during Beam releases and pushed to [Docker Hub](https://hub.docker.com/search?q=apache%2Fbeam&type=image). -Before you begin, install [Docker](https://www.docker.com/) on your workstation. +## Custom containers -## Customizing container images +You may want to customize container images for many reasons, including: -You can add extra dependencies to container images so that you don't have to supply the dependencies to execution engines. +* Pre-installing additional dependencies +* Launching third-party software in the worker environment +* Further customizing the execution environment -To customize a container image, either: -* [Write a new](#writing-new-dockerfiles) [Dockerfile](https://docs.docker.com/engine/reference/builder/) on top of the original. -* [Modify](#modifying-dockerfiles) the [original Dockerfile](https://github.com/apache/beam/blob/master/sdks/python/container/Dockerfile) and reimage the container. + This guide describes how to create and use customized containers for the Beam SDK. 
-It's often easier to write a new Dockerfile. However, by modifying the original Dockerfile, you can customize anything (including the base OS). +### Prerequisites -### Writing new Dockerfiles on top of the original {#writing-new-dockerfiles} +* You will need to use Docker, either by [installing Docker tools locally](https://docs.docker.com/get-docker/) or using build services that can run Docker, such as [Google Cloud Build](https://cloud.google.com/cloud-build/docs/building/build-containers). +* You will need to have a container registry accessible by your execution engine or runner to host a custom container image. Options include [Docker Hub](https://hub.docker.com/) or a "self-hosted" repository, including cloud-specific container registries like [Google Container Registry](https://cloud.google.com/container-registry) (GCR) or [Amazon Elastic Container Registry](https://aws.amazon.com/ecr/) (ECR). + +> **NOTE**: On Nov 20, 2020, Docker Hub put [rate limits](https://www.docker.com/increase-rate-limits) into effect for anonymous and free authenticated use, which may impact larger pipelines that pull containers several times. + +For optimal user experience, we also recommend you use the latest released version of Beam. + +### Building and pushing custom containers + +Beam [SDK container images](https://hub.docker.com/search?q=apache%2Fbeam&type=image) are built from Dockerfiles checked into the [Github](https://github.com/apache/beam) repository and published to Docker Hub for every release. You can build customized containers in one of two ways: + +1. **[Writing a new](#writing-new-dockerfiles) Dockerfile based on a released container image**. This is sufficient for simple additions to the image, such as adding artifacts or environment variables. +2. **[Modifying](#modifying-dockerfiles) a source Dockerfile in [Beam](https://github.com/apache/beam)**. 
This method requires building from Beam source but allows for greater customization of the container (including replacement of artifacts or base OS/language versions). + +#### Writing a new Dockerfile based on an existing published container image {#writing-new-dockerfiles} + +Steps: + +1. Create a new Dockerfile that designates a base image using the [FROM instruction](https://docs.docker.com/engine/reference/builder/#from). As an example, this `Dockerfile`: -1. Pull a [prebuilt SDK container image](https://hub.docker.com/search?q=apache%2Fbeam&type=image) for your [target](https://docs.docker.com/docker-hub/repos/#searching-for-repositories) language and version. The following example pulls the latest Python SDK: ``` -docker pull apache/beam_python3.7_sdk +FROM apache/beam_python3.7_sdk:2.25.0 + +ENV FOO=bar +COPY /src/path/to/file /dest/path/to/file/ ``` -2. [Write a new Dockerfile](https://docs.docker.com/develop/develop-images/dockerfile_best-practices/) that [designates](https://docs.docker.com/engine/reference/builder/#from) the original as its [parent](https://docs.docker.com/glossary/?term=parent%20image). -3. [Build](#building-container-images) a child image. -### Modifying the original Dockerfile {#modifying-dockerfiles} +uses the prebuilt Python 3.7 SDK container image [`beam_python3.7_sdk`](https://hub.docker.com/r/apache/beam_python3.7_sdk) tagged at (SDK version) `2.25.0`, and adds an additional environment variable and file to the image. + + +2. [Build](https://docs.docker.com/engine/reference/commandline/build/) and [push](https://docs.docker.com/engine/reference/commandline/push/) the image using Docker. + -1. Clone the `beam` repository: ``` -git clone https://github.com/apache/beam.git +export BASE_IMAGE="apache/beam_python3.7_sdk:2.25.0" +export IMAGE_NAME="myremoterepo/mybeamsdk" +export TAG="latest" + +# Optional - pull the base image into your local Docker daemon to ensure +# you have the most up-to-date version of the base image locally. 
+docker pull "${BASE_IMAGE}" + +docker build -f Dockerfile -t "${IMAGE_NAME}:${TAG}" . +docker push "${IMAGE_NAME}:${TAG}" ``` -2. Customize the [Dockerfile](https://github.com/apache/beam/blob/master/sdks/python/container/Dockerfile). If you're adding dependencies from [PyPI](https://pypi.org/), use [`base_image_requirements.txt`](https://github.com/apache/beam/blob/master/sdks/python/container/base_image_requirements.txt) instead. -3. [Reimage](#building-container-images) the container. -### Testing customized images +**NOTE**: After pushing a container image, you should verify the remote image ID and digest should match the local image ID and digest, output from `docker build` or `docker images`. -To test a customized image locally, run a pipeline with PortableRunner and set the `--environment_config` flag to the image path: +#### Modifying a source Dockerfile in Beam {#modifying-dockerfiles} -{{< highlight class="runner-direct" >}} -python -m apache_beam.examples.wordcount \ ---input=/path/to/inputfile \ ---output /path/to/write/counts \ ---runner=PortableRunner \ ---job_endpoint=embed \ ---environment_config=path/to/container/image -{{< /highlight >}} +This method will require building image artifacts from Beam source. For additional instructions on setting up your development environment, see the [Contribution guide](contribute/#development-setup).
Review comment: Broken link
########## File path: website/www/site/content/en/documentation/runtime/environments.md ########## @@ -17,147 +17,232 @@ limitations under the License. # Container environments -The Beam SDK runtime environment is isolated from other runtime systems because the SDK runtime environment is [containerized](https://s.apache.org/beam-fn-api-container-contract) with [Docker](https://www.docker.com/). This means that any execution engine can run the Beam SDK.
+The Beam SDK runtime environment can be [containerized](https://www.docker.com/resources/what-container) with [Docker](https://www.docker.com/) to isolate it from other runtime systems. To learn more about the container environment, read the Beam [SDK Harness container contract](https://s.apache.org/beam-fn-api-container-contract). -This page describes how to customize, build, and push Beam SDK container images. +Prebuilt SDK container images are released per supported language during Beam releases and pushed to [Docker Hub](https://hub.docker.com/search?q=apache%2Fbeam&type=image). -Before you begin, install [Docker](https://www.docker.com/) on your workstation. +## Custom containers -## Customizing container images +You may want to customize container images for many reasons, including: -You can add extra dependencies to container images so that you don't have to supply the dependencies to execution engines. +* Pre-installing additional dependencies +* Launching third-party software in the worker environment +* Further customizing the execution environment -To customize a container image, either: -* [Write a new](#writing-new-dockerfiles) [Dockerfile](https://docs.docker.com/engine/reference/builder/) on top of the original. -* [Modify](#modifying-dockerfiles) the [original Dockerfile](https://github.com/apache/beam/blob/master/sdks/python/container/Dockerfile) and reimage the container. + This guide describes how to create and use customized containers for the Beam SDK. -It's often easier to write a new Dockerfile. However, by modifying the original Dockerfile, you can customize anything (including the base OS). +### Prerequisites -### Writing new Dockerfiles on top of the original {#writing-new-dockerfiles} +* You will need to use Docker, either by [installing Docker tools locally](https://docs.docker.com/get-docker/) or using build services that can run Docker, such as [Google Cloud Build](https://cloud.google.com/cloud-build/docs/building/build-containers). 
+* You will need to have a container registry accessible by your execution engine or runner to host a custom container image. Options include [Docker Hub](https://hub.docker.com/) or a "self-hosted" repository, including cloud-specific container registries like [Google Container Registry](https://cloud.google.com/container-registry) (GCR) or [Amazon Elastic Container Registry](https://aws.amazon.com/ecr/) (ECR). + +> **NOTE**: On Nov 20, 2020, Docker Hub put [rate limits](https://www.docker.com/increase-rate-limits) into effect for anonymous and free authenticated use, which may impact larger pipelines that pull containers several times. + +For optimal user experience, we also recommend you use the latest released version of Beam. + +### Building and pushing custom containers + +Beam [SDK container images](https://hub.docker.com/search?q=apache%2Fbeam&type=image) are built from Dockerfiles checked into the [Github](https://github.com/apache/beam) repository and published to Docker Hub for every release. You can build customized containers in one of two ways: + +1. **[Writing a new](#writing-new-dockerfiles) Dockerfile based on a released container image**. This is sufficient for simple additions to the image, such as adding artifacts or environment variables. +2. **[Modifying](#modifying-dockerfiles) a source Dockerfile in [Beam](https://github.com/apache/beam)**. This method requires building from Beam source but allows for greater customization of the container (including replacement of artifacts or base OS/language versions). + +#### Writing a new Dockerfile based on an existing published container image {#writing-new-dockerfiles} + +Steps:
Review comment: Delete this line "Steps:"
########## File path: website/www/site/content/en/documentation/runtime/environments.md ########## @@ -17,147 +17,232 @@ limitations under the License.
# Container environments -The Beam SDK runtime environment is isolated from other runtime systems because the SDK runtime environment is [containerized](https://s.apache.org/beam-fn-api-container-contract) with [Docker](https://www.docker.com/). This means that any execution engine can run the Beam SDK. +The Beam SDK runtime environment can be [containerized](https://www.docker.com/resources/what-container) with [Docker](https://www.docker.com/) to isolate it from other runtime systems. To learn more about the container environment, read the Beam [SDK Harness container contract](https://s.apache.org/beam-fn-api-container-contract). -This page describes how to customize, build, and push Beam SDK container images. +Prebuilt SDK container images are released per supported language during Beam releases and pushed to [Docker Hub](https://hub.docker.com/search?q=apache%2Fbeam&type=image). -Before you begin, install [Docker](https://www.docker.com/) on your workstation. +## Custom containers -## Customizing container images +You may want to customize container images for many reasons, including: -You can add extra dependencies to container images so that you don't have to supply the dependencies to execution engines. +* Pre-installing additional dependencies +* Launching third-party software in the worker environment +* Further customizing the execution environment -To customize a container image, either: -* [Write a new](#writing-new-dockerfiles) [Dockerfile](https://docs.docker.com/engine/reference/builder/) on top of the original. -* [Modify](#modifying-dockerfiles) the [original Dockerfile](https://github.com/apache/beam/blob/master/sdks/python/container/Dockerfile) and reimage the container. + This guide describes how to create and use customized containers for the Beam SDK. -It's often easier to write a new Dockerfile. However, by modifying the original Dockerfile, you can customize anything (including the base OS). 
+### Prerequisites -### Writing new Dockerfiles on top of the original {#writing-new-dockerfiles} +* You will need to use Docker, either by [installing Docker tools locally](https://docs.docker.com/get-docker/) or using build services that can run Docker, such as [Google Cloud Build](https://cloud.google.com/cloud-build/docs/building/build-containers). +* You will need to have a container registry accessible by your execution engine or runner to host a custom container image. Options include [Docker Hub](https://hub.docker.com/) or a "self-hosted" repository, including cloud-specific container registries like [Google Container Registry](https://cloud.google.com/container-registry) (GCR) or [Amazon Elastic Container Registry](https://aws.amazon.com/ecr/) (ECR). + +> **NOTE**: On Nov 20, 2020, Docker Hub put [rate limits](https://www.docker.com/increase-rate-limits) into effect for anonymous and free authenticated use, which may impact larger pipelines that pull containers several times. + +For optimal user experience, we also recommend you use the latest released version of Beam. + +### Building and pushing custom containers + +Beam [SDK container images](https://hub.docker.com/search?q=apache%2Fbeam&type=image) are built from Dockerfiles checked into the [Github](https://github.com/apache/beam) repository and published to Docker Hub for every release. You can build customized containers in one of two ways: + +1. **[Writing a new](#writing-new-dockerfiles) Dockerfile based on a released container image**. This is sufficient for simple additions to the image, such as adding artifacts or environment variables. +2. **[Modifying](#modifying-dockerfiles) a source Dockerfile in [Beam](https://github.com/apache/beam)**. This method requires building from Beam source but allows for greater customization of the container (including replacement of artifacts or base OS/language versions). 
+ +#### Writing a new Dockerfile based on an existing published container image {#writing-new-dockerfiles} + +Steps: + +1. Create a new Dockerfile that designates a base image using the [FROM instruction](https://docs.docker.com/engine/reference/builder/#from). As an example, this `Dockerfile`:
Review comment: Can you combine "As an example, this `Dockerfile`:" with the sentence after the code sample? The interrupted sentence is a little confusing. There are a few more sentences interrupted by code snippets in the "Modifying a source Dockerfile in Beam" section; can you combine those too? The typical pattern is instruction > code snippet > explanation.
########## File path: website/www/site/content/en/documentation/runtime/environments.md ########## @@ -17,147 +17,232 @@ limitations under the License. # Container environments -The Beam SDK runtime environment is isolated from other runtime systems because the SDK runtime environment is [containerized](https://s.apache.org/beam-fn-api-container-contract) with [Docker](https://www.docker.com/). This means that any execution engine can run the Beam SDK. +The Beam SDK runtime environment can be [containerized](https://www.docker.com/resources/what-container) with [Docker](https://www.docker.com/) to isolate it from other runtime systems. To learn more about the container environment, read the Beam [SDK Harness container contract](https://s.apache.org/beam-fn-api-container-contract). -This page describes how to customize, build, and push Beam SDK container images. +Prebuilt SDK container images are released per supported language during Beam releases and pushed to [Docker Hub](https://hub.docker.com/search?q=apache%2Fbeam&type=image). -Before you begin, install [Docker](https://www.docker.com/) on your workstation.
+## Custom containers -## Customizing container images +You may want to customize container images for many reasons, including: -You can add extra dependencies to container images so that you don't have to supply the dependencies to execution engines. +* Pre-installing additional dependencies +* Launching third-party software in the worker environment +* Further customizing the execution environment -To customize a container image, either: -* [Write a new](#writing-new-dockerfiles) [Dockerfile](https://docs.docker.com/engine/reference/builder/) on top of the original. -* [Modify](#modifying-dockerfiles) the [original Dockerfile](https://github.com/apache/beam/blob/master/sdks/python/container/Dockerfile) and reimage the container. + This guide describes how to create and use customized containers for the Beam SDK. -It's often easier to write a new Dockerfile. However, by modifying the original Dockerfile, you can customize anything (including the base OS). +### Prerequisites -### Writing new Dockerfiles on top of the original {#writing-new-dockerfiles} +* You will need to use Docker, either by [installing Docker tools locally](https://docs.docker.com/get-docker/) or using build services that can run Docker, such as [Google Cloud Build](https://cloud.google.com/cloud-build/docs/building/build-containers). +* You will need to have a container registry accessible by your execution engine or runner to host a custom container image. Options include [Docker Hub](https://hub.docker.com/) or a "self-hosted" repository, including cloud-specific container registries like [Google Container Registry](https://cloud.google.com/container-registry) (GCR) or [Amazon Elastic Container Registry](https://aws.amazon.com/ecr/) (ECR). + +> **NOTE**: On Nov 20, 2020, Docker Hub put [rate limits](https://www.docker.com/increase-rate-limits) into effect for anonymous and free authenticated use, which may impact larger pipelines that pull containers several times. 
+ +For optimal user experience, we also recommend you use the latest released version of Beam. + +### Building and pushing custom containers + +Beam [SDK container images](https://hub.docker.com/search?q=apache%2Fbeam&type=image) are built from Dockerfiles checked into the [Github](https://github.com/apache/beam) repository and published to Docker Hub for every release. You can build customized containers in one of two ways: + +1. **[Writing a new](#writing-new-dockerfiles) Dockerfile based on a released container image**. This is sufficient for simple additions to the image, such as adding artifacts or environment variables. +2. **[Modifying](#modifying-dockerfiles) a source Dockerfile in [Beam](https://github.com/apache/beam)**. This method requires building from Beam source but allows for greater customization of the container (including replacement of artifacts or base OS/language versions). + +#### Writing a new Dockerfile based on an existing published container image {#writing-new-dockerfiles} + +Steps: + +1. Create a new Dockerfile that designates a base image using the [FROM instruction](https://docs.docker.com/engine/reference/builder/#from). As an example, this `Dockerfile`: -1. Pull a [prebuilt SDK container image](https://hub.docker.com/search?q=apache%2Fbeam&type=image) for your [target](https://docs.docker.com/docker-hub/repos/#searching-for-repositories) language and version. The following example pulls the latest Python SDK: ``` -docker pull apache/beam_python3.7_sdk +FROM apache/beam_python3.7_sdk:2.25.0 + +ENV FOO=bar +COPY /src/path/to/file /dest/path/to/file/ ``` -2. [Write a new Dockerfile](https://docs.docker.com/develop/develop-images/dockerfile_best-practices/) that [designates](https://docs.docker.com/engine/reference/builder/#from) the original as its [parent](https://docs.docker.com/glossary/?term=parent%20image). -3. [Build](#building-container-images) a child image. 
-### Modifying the original Dockerfile {#modifying-dockerfiles} +uses the prebuilt Python 3.7 SDK container image [`beam_python3.7_sdk`](https://hub.docker.com/r/apache/beam_python3.7_sdk) tagged at (SDK version) `2.25.0`, and adds an additional environment variable and file to the image. + + +2. [Build](https://docs.docker.com/engine/reference/commandline/build/) and [push](https://docs.docker.com/engine/reference/commandline/push/) the image using Docker. + -1. Clone the `beam` repository: ``` -git clone https://github.com/apache/beam.git +export BASE_IMAGE="apache/beam_python3.7_sdk:2.25.0" +export IMAGE_NAME="myremoterepo/mybeamsdk" +export TAG="latest" + +# Optional - pull the base image into your local Docker daemon to ensure +# you have the most up-to-date version of the base image locally. +docker pull "${BASE_IMAGE}" + +docker build -f Dockerfile -t "${IMAGE_NAME}:${TAG}" . +docker push "${IMAGE_NAME}:${TAG}" ``` -2. Customize the [Dockerfile](https://github.com/apache/beam/blob/master/sdks/python/container/Dockerfile). If you're adding dependencies from [PyPI](https://pypi.org/), use [`base_image_requirements.txt`](https://github.com/apache/beam/blob/master/sdks/python/container/base_image_requirements.txt) instead. -3. [Reimage](#building-container-images) the container. -### Testing customized images +**NOTE**: After pushing a container image, you should verify the remote image ID and digest should match the local image ID and digest, output from `docker build` or `docker images`.
Review comment: If it's important to check the image IDs and digests, let's make this note its own step, and do the same for the other instances of this note. Readers sometimes skip notes. For example: "3. After pushing a container image, verify that the remote image ID and digest match the local image ID and digest shown in the output of `docker build` or `docker images`."
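If it helps, here is a rough sketch of what such a verification step might run. It is only an illustration under a few assumptions: the helper names `digest_of`, `digests_match`, and `verify_pushed_image` are hypothetical, the image reference reuses the `IMAGE_NAME` and `TAG` variables from the build step in the diff, and the expected digest is whatever `docker push` printed or the registry reports. The only Docker call used is `docker inspect --format` on the `RepoDigests` field.

```shell
# Hypothetical helper: keep only the "sha256:..." part of a repo digest
# such as "myremoterepo/mybeamsdk@sha256:abc123".
digest_of() {
  printf '%s\n' "${1##*@}"
}

# Hypothetical helper: succeed only when both digests are non-empty and equal.
digests_match() {
  [ -n "$1" ] && [ "$1" = "$2" ]
}

# Hypothetical helper: compare the digest recorded by the local Docker daemon
# for an image with the digest reported for the pushed image.
# Runs in a subshell so its variables do not leak into the caller.
verify_pushed_image() (
  image="$1"
  expected="$2"
  recorded="$(docker inspect --format='{{index .RepoDigests 0}}' "${image}")" || exit 1
  digests_match "$(digest_of "${recorded}")" "${expected}"
)

# Example call (requires Docker and a pushed image, so it is left commented out):
# verify_pushed_image "${IMAGE_NAME}:${TAG}" "sha256:<digest printed by docker push>"
```

Keeping the string comparison separate from the Docker call means the check can be exercised without a daemon; the docs would probably only need the `docker inspect` line itself.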
########## File path: website/www/site/content/en/documentation/runtime/environments.md ########## @@ -17,147 +17,232 @@ limitations under the License. # Container environments -The Beam SDK runtime environment is isolated from other runtime systems because the SDK runtime environment is [containerized](https://s.apache.org/beam-fn-api-container-contract) with [Docker](https://www.docker.com/). This means that any execution engine can run the Beam SDK. +The Beam SDK runtime environment can be [containerized](https://www.docker.com/resources/what-container) with [Docker](https://www.docker.com/) to isolate it from other runtime systems. To learn more about the container environment, read the Beam [SDK Harness container contract](https://s.apache.org/beam-fn-api-container-contract). -This page describes how to customize, build, and push Beam SDK container images. +Prebuilt SDK container images are released per supported language during Beam releases and pushed to [Docker Hub](https://hub.docker.com/search?q=apache%2Fbeam&type=image). -Before you begin, install [Docker](https://www.docker.com/) on your workstation. +## Custom containers -## Customizing container images +You may want to customize container images for many reasons, including: -You can add extra dependencies to container images so that you don't have to supply the dependencies to execution engines. +* Pre-installing additional dependencies +* Launching third-party software in the worker environment +* Further customizing the execution environment -To customize a container image, either: -* [Write a new](#writing-new-dockerfiles) [Dockerfile](https://docs.docker.com/engine/reference/builder/) on top of the original. -* [Modify](#modifying-dockerfiles) the [original Dockerfile](https://github.com/apache/beam/blob/master/sdks/python/container/Dockerfile) and reimage the container. + This guide describes how to create and use customized containers for the Beam SDK. 
-It's often easier to write a new Dockerfile. However, by modifying the original Dockerfile, you can customize anything (including the base OS). +### Prerequisites -### Writing new Dockerfiles on top of the original {#writing-new-dockerfiles} +* You will need to use Docker, either by [installing Docker tools locally](https://docs.docker.com/get-docker/) or using build services that can run Docker, such as [Google Cloud Build](https://cloud.google.com/cloud-build/docs/building/build-containers). +* You will need to have a container registry accessible by your execution engine or runner to host a custom container image. Options include [Docker Hub](https://hub.docker.com/) or a "self-hosted" repository, including cloud-specific container registries like [Google Container Registry](https://cloud.google.com/container-registry) (GCR) or [Amazon Elastic Container Registry](https://aws.amazon.com/ecr/) (ECR). + +> **NOTE**: On Nov 20, 2020, Docker Hub put [rate limits](https://www.docker.com/increase-rate-limits) into effect for anonymous and free authenticated use, which may impact larger pipelines that pull containers several times. + +For optimal user experience, we also recommend you use the latest released version of Beam. + +### Building and pushing custom containers + +Beam [SDK container images](https://hub.docker.com/search?q=apache%2Fbeam&type=image) are built from Dockerfiles checked into the [Github](https://github.com/apache/beam) repository and published to Docker Hub for every release. You can build customized containers in one of two ways: + +1. **[Writing a new](#writing-new-dockerfiles) Dockerfile based on a released container image**. This is sufficient for simple additions to the image, such as adding artifacts or environment variables. +2. **[Modifying](#modifying-dockerfiles) a source Dockerfile in [Beam](https://github.com/apache/beam)**. 
This method requires building from Beam source but allows for greater customization of the container (including replacement of artifacts or base OS/language versions). + +#### Writing a new Dockerfile based on an existing published container image {#writing-new-dockerfiles} + +Steps: + +1. Create a new Dockerfile that designates a base image using the [FROM instruction](https://docs.docker.com/engine/reference/builder/#from). As an example, this `Dockerfile`: -1. Pull a [prebuilt SDK container image](https://hub.docker.com/search?q=apache%2Fbeam&type=image) for your [target](https://docs.docker.com/docker-hub/repos/#searching-for-repositories) language and version. The following example pulls the latest Python SDK: ``` -docker pull apache/beam_python3.7_sdk +FROM apache/beam_python3.7_sdk:2.25.0 + +ENV FOO=bar +COPY /src/path/to/file /dest/path/to/file/ ``` -2. [Write a new Dockerfile](https://docs.docker.com/develop/develop-images/dockerfile_best-practices/) that [designates](https://docs.docker.com/engine/reference/builder/#from) the original as its [parent](https://docs.docker.com/glossary/?term=parent%20image). -3. [Build](#building-container-images) a child image. -### Modifying the original Dockerfile {#modifying-dockerfiles} +uses the prebuilt Python 3.7 SDK container image [`beam_python3.7_sdk`](https://hub.docker.com/r/apache/beam_python3.7_sdk) tagged at (SDK version) `2.25.0`, and adds an additional environment variable and file to the image. + + +2. [Build](https://docs.docker.com/engine/reference/commandline/build/) and [push](https://docs.docker.com/engine/reference/commandline/push/) the image using Docker. + -1. Clone the `beam` repository: ``` -git clone https://github.com/apache/beam.git +export BASE_IMAGE="apache/beam_python3.7_sdk:2.25.0" +export IMAGE_NAME="myremoterepo/mybeamsdk" +export TAG="latest" + +# Optional - pull the base image into your local Docker daemon to ensure +# you have the most up-to-date version of the base image locally. 
+docker pull "${BASE_IMAGE}" + +docker build -f Dockerfile -t "${IMAGE_NAME}:${TAG}" . +docker push "${IMAGE_NAME}:${TAG}" ``` -2. Customize the [Dockerfile](https://github.com/apache/beam/blob/master/sdks/python/container/Dockerfile). If you're adding dependencies from [PyPI](https://pypi.org/), use [`base_image_requirements.txt`](https://github.com/apache/beam/blob/master/sdks/python/container/base_image_requirements.txt) instead. -3. [Reimage](#building-container-images) the container. -### Testing customized images +**NOTE**: After pushing a container image, you should verify the remote image ID and digest should match the local image ID and digest, output from `docker build` or `docker images`. -To test a customized image locally, run a pipeline with PortableRunner and set the `--environment_config` flag to the image path: +#### Modifying a source Dockerfile in Beam {#modifying-dockerfiles} -{{< highlight class="runner-direct" >}} -python -m apache_beam.examples.wordcount \ ---input=/path/to/inputfile \ ---output /path/to/write/counts \ ---runner=PortableRunner \ ---job_endpoint=embed \ ---environment_config=path/to/container/image -{{< /highlight >}} +This method will require building image artifacts from Beam source. For additional instructions on setting up your development environment, see the [Contribution guide](contribute/#development-setup). -{{< highlight class="runner-flink-local" >}} -# Start a Flink job server on localhost:8099 -./gradlew :runners:flink:1.8:job-server:runShadow +1. Clone the `beam` repository. It is recommended that you start from a stable + release branch rather than from master for both customizing the Dockerfile + and building image artifacts, and that you use the same version of the SDK + to run your pipeline with a custom container. 
-# Run a pipeline on the Flink job server -python -m apache_beam.examples.wordcount \ ---input=/path/to/inputfile \ ---output=/path/to/write/counts \ ---runner=PortableRunner \ ---job_endpoint=localhost:8099 \ ---environment_config=path/to/container/image -{{< /highlight >}} +``` +export BEAM_SDK_VERSION="2.26.0" -{{< highlight class="runner-spark-local" >}} -# Start a Spark job server on localhost:8099 -./gradlew :runners:spark:job-server:runShadow +git clone https://github.com/apache/beam.git +git checkout origin/release-$BEAM_SDK_VERSION +``` -# Run a pipeline on the Spark job server -python -m apache_beam.examples.wordcount \ ---input=/path/to/inputfile \ ---output=path/to/write/counts \ ---runner=PortableRunner \ ---job_endpoint=localhost:8099 \ ---environment_config=path/to/container/image -{{< /highlight >}} +3. Customize the `Dockerfile` for a given language. This file is typically in the `sdks/<language>/container` directory (e.g. the [Dockerfile for Python](https://github.com/apache/beam/blob/master/sdks/python/container/Dockerfile). If you're adding dependencies from [PyPI](https://pypi.org/), use [`base_image_requirements.txt`](https://github.com/apache/beam/blob/master/sdks/python/container/base_image_requirements.txt) instead. -## Building container images +3. Navigate to the root directory of the local copy of your Apache Beam. -To build Beam SDK container images: +4. Run Gradle with the `docker` target. -1. Navigate to the root directory of the local copy of your Apache Beam. -2. Run Gradle with the `docker` target. If you're [building a child image](#writing-new-dockerfiles), set the optional `--file` flag to the new Dockerfile. 
If you're [building an image from an original Dockerfile](#modifying-dockerfiles), ignore the `--file` flag: ``` # The default repository of each SDK -./gradlew [--file=path/to/new/Dockerfile] :sdks:java:container:java8:docker -./gradlew [--file=path/to/new/Dockerfile] :sdks:java:container:java11:docker -./gradlew [--file=path/to/new/Dockerfile] :sdks:go:container:docker -./gradlew [--file=path/to/new/Dockerfile] :sdks:python:container:py2:docker -./gradlew [--file=path/to/new/Dockerfile] :sdks:python:container:py35:docker -./gradlew [--file=path/to/new/Dockerfile] :sdks:python:container:py36:docker -./gradlew [--file=path/to/new/Dockerfile] :sdks:python:container:py37:docker +./gradlew :sdks:java:container:java8:docker +./gradlew :sdks:java:container:java11:docker +./gradlew :sdks:go:container:docker +./gradlew :sdks:python:container:py36:docker +./gradlew :sdks:python:container:py37:docker +./gradlew :sdks:python:container:py38:docker -# Shortcut for building all four Python SDKs -./gradlew [--file=path/to/new/Dockerfile] :sdks:python:container buildAll +# Shortcut for building all Python SDKs +./gradlew :sdks:python:container buildAll ``` -From 2.21.0, `docker-pull-licenses` tag was introduced. Licenses/notices of third party dependencies will be added to the docker images when `docker-pull-licenses` was set. -For example, `./gradlew :sdks:java:container:java8:docker -Pdocker-pull-licenses`. The files are added to `/opt/apache/beam/third_party_licenses/`. -By default, no licenses/notices are added to the docker images. +To examine the containers that you built, run `docker images`: -To examine the containers that you built, run `docker images` from anywhere in the command line. If you successfully built all of the container images, the command prints a table like the following: ``` +$> docker images REPOSITORY TAG IMAGE ID CREATED SIZE -apache/beam_java8_sdk latest ... 2 weeks ago ... -apache/beam_java11_sdk latest ... 2 weeks ago ... 
-apache/beam_python2.7_sdk latest ... 2 weeks ago ... -apache/beam_python3.5_sdk latest ... 2 weeks ago ... -apache/beam_python3.6_sdk latest ... 2 weeks ago ... -apache/beam_python3.7_sdk latest ... 2 weeks ago ... -apache/beam_go_sdk latest ... 2 weeks ago ... +apache/beam_java8_sdk latest ... 1 min ago ... +apache/beam_java11_sdk latest ... 1 min ago ... +apache/beam_python3.6_sdk latest ... 1 min ago ... +apache/beam_python3.7_sdk latest ... 1 min ago ... +apache/beam_python3.8_sdk latest ... 1 min ago ... +apache/beam_go_sdk latest ... 1 min ago ... ``` -### Overriding default Docker targets - -The default [tag](https://docs.docker.com/engine/reference/commandline/tag/) is sdk_version defined at [gradle.properties](https://github.com/apache/beam/blob/master/gradle.properties) and the default repositories are in the Docker Hub `apache` namespace. -The `docker` command-line tool implicitly [pushes container images](#pushing-container-images) to this location. +If you did not provide a custom repo/tag as additional parameters (see below), you can retag the image and [push](https://docs.docker.com/engine/reference/commandline/push/) the image using Docker to a remote repository. Review comment: I'd remove "(see below)" and either explicitly link to that heading or use "see the following section." ########## File path: website/www/site/content/en/documentation/runtime/environments.md ########## @@ -17,147 +17,232 @@ limitations under the License. # Container environments -The Beam SDK runtime environment is isolated from other runtime systems because the SDK runtime environment is [containerized](https://s.apache.org/beam-fn-api-container-contract) with [Docker](https://www.docker.com/). This means that any execution engine can run the Beam SDK. +The Beam SDK runtime environment can be [containerized](https://www.docker.com/resources/what-container) with [Docker](https://www.docker.com/) to isolate it from other runtime systems. 
To learn more about the container environment, read the Beam [SDK Harness container contract](https://s.apache.org/beam-fn-api-container-contract). -This page describes how to customize, build, and push Beam SDK container images. +Prebuilt SDK container images are released per supported language during Beam releases and pushed to [Docker Hub](https://hub.docker.com/search?q=apache%2Fbeam&type=image). -Before you begin, install [Docker](https://www.docker.com/) on your workstation. +## Custom containers -## Customizing container images +You may want to customize container images for many reasons, including: -You can add extra dependencies to container images so that you don't have to supply the dependencies to execution engines. +* Pre-installing additional dependencies +* Launching third-party software in the worker environment +* Further customizing the execution environment -To customize a container image, either: -* [Write a new](#writing-new-dockerfiles) [Dockerfile](https://docs.docker.com/engine/reference/builder/) on top of the original. -* [Modify](#modifying-dockerfiles) the [original Dockerfile](https://github.com/apache/beam/blob/master/sdks/python/container/Dockerfile) and reimage the container. + This guide describes how to create and use customized containers for the Beam SDK. -It's often easier to write a new Dockerfile. However, by modifying the original Dockerfile, you can customize anything (including the base OS). +### Prerequisites -### Writing new Dockerfiles on top of the original {#writing-new-dockerfiles} +* You will need to use Docker, either by [installing Docker tools locally](https://docs.docker.com/get-docker/) or using build services that can run Docker, such as [Google Cloud Build](https://cloud.google.com/cloud-build/docs/building/build-containers). +* You will need to have a container registry accessible by your execution engine or runner to host a custom container image. 
Options include [Docker Hub](https://hub.docker.com/) or a "self-hosted" repository, including cloud-specific container registries like [Google Container Registry](https://cloud.google.com/container-registry) (GCR) or [Amazon Elastic Container Registry](https://aws.amazon.com/ecr/) (ECR). + +> **NOTE**: On Nov 20, 2020, Docker Hub put [rate limits](https://www.docker.com/increase-rate-limits) into effect for anonymous and free authenticated use, which may impact larger pipelines that pull containers several times. + +For optimal user experience, we also recommend you use the latest released version of Beam. + +### Building and pushing custom containers + +Beam [SDK container images](https://hub.docker.com/search?q=apache%2Fbeam&type=image) are built from Dockerfiles checked into the [Github](https://github.com/apache/beam) repository and published to Docker Hub for every release. You can build customized containers in one of two ways: + +1. **[Writing a new](#writing-new-dockerfiles) Dockerfile based on a released container image**. This is sufficient for simple additions to the image, such as adding artifacts or environment variables. +2. **[Modifying](#modifying-dockerfiles) a source Dockerfile in [Beam](https://github.com/apache/beam)**. This method requires building from Beam source but allows for greater customization of the container (including replacement of artifacts or base OS/language versions). + +#### Writing a new Dockerfile based on an existing published container image {#writing-new-dockerfiles} + +Steps: + +1. Create a new Dockerfile that designates a base image using the [FROM instruction](https://docs.docker.com/engine/reference/builder/#from). As an example, this `Dockerfile`: -1. Pull a [prebuilt SDK container image](https://hub.docker.com/search?q=apache%2Fbeam&type=image) for your [target](https://docs.docker.com/docker-hub/repos/#searching-for-repositories) language and version. 
The following example pulls the latest Python SDK: ``` -docker pull apache/beam_python3.7_sdk +FROM apache/beam_python3.7_sdk:2.25.0 + +ENV FOO=bar +COPY /src/path/to/file /dest/path/to/file/ ``` -2. [Write a new Dockerfile](https://docs.docker.com/develop/develop-images/dockerfile_best-practices/) that [designates](https://docs.docker.com/engine/reference/builder/#from) the original as its [parent](https://docs.docker.com/glossary/?term=parent%20image). -3. [Build](#building-container-images) a child image. -### Modifying the original Dockerfile {#modifying-dockerfiles} +uses the prebuilt Python 3.7 SDK container image [`beam_python3.7_sdk`](https://hub.docker.com/r/apache/beam_python3.7_sdk) tagged at (SDK version) `2.25.0`, and adds an additional environment variable and file to the image. + + +2. [Build](https://docs.docker.com/engine/reference/commandline/build/) and [push](https://docs.docker.com/engine/reference/commandline/push/) the image using Docker. + -1. Clone the `beam` repository: ``` -git clone https://github.com/apache/beam.git +export BASE_IMAGE="apache/beam_python3.7_sdk:2.25.0" +export IMAGE_NAME="myremoterepo/mybeamsdk" +export TAG="latest" + +# Optional - pull the base image into your local Docker daemon to ensure +# you have the most up-to-date version of the base image locally. +docker pull "${BASE_IMAGE}" + +docker build -f Dockerfile -t "${IMAGE_NAME}:${TAG}" . +docker push "${IMAGE_NAME}:${TAG}" ``` -2. Customize the [Dockerfile](https://github.com/apache/beam/blob/master/sdks/python/container/Dockerfile). If you're adding dependencies from [PyPI](https://pypi.org/), use [`base_image_requirements.txt`](https://github.com/apache/beam/blob/master/sdks/python/container/base_image_requirements.txt) instead. -3. [Reimage](#building-container-images) the container. 
-### Testing customized images +**NOTE**: After pushing a container image, you should verify the remote image ID and digest should match the local image ID and digest, output from `docker build` or `docker images`. -To test a customized image locally, run a pipeline with PortableRunner and set the `--environment_config` flag to the image path: +#### Modifying a source Dockerfile in Beam {#modifying-dockerfiles} -{{< highlight class="runner-direct" >}} -python -m apache_beam.examples.wordcount \ ---input=/path/to/inputfile \ ---output /path/to/write/counts \ ---runner=PortableRunner \ ---job_endpoint=embed \ ---environment_config=path/to/container/image -{{< /highlight >}} +This method will require building image artifacts from Beam source. For additional instructions on setting up your development environment, see the [Contribution guide](contribute/#development-setup). -{{< highlight class="runner-flink-local" >}} -# Start a Flink job server on localhost:8099 -./gradlew :runners:flink:1.8:job-server:runShadow +1. Clone the `beam` repository. It is recommended that you start from a stable + release branch rather than from master for both customizing the Dockerfile + and building image artifacts, and that you use the same version of the SDK + to run your pipeline with a custom container. 
-# Run a pipeline on the Flink job server -python -m apache_beam.examples.wordcount \ ---input=/path/to/inputfile \ ---output=/path/to/write/counts \ ---runner=PortableRunner \ ---job_endpoint=localhost:8099 \ ---environment_config=path/to/container/image -{{< /highlight >}} +``` +export BEAM_SDK_VERSION="2.26.0" -{{< highlight class="runner-spark-local" >}} -# Start a Spark job server on localhost:8099 -./gradlew :runners:spark:job-server:runShadow +git clone https://github.com/apache/beam.git +git checkout origin/release-$BEAM_SDK_VERSION +``` -# Run a pipeline on the Spark job server -python -m apache_beam.examples.wordcount \ ---input=/path/to/inputfile \ ---output=path/to/write/counts \ ---runner=PortableRunner \ ---job_endpoint=localhost:8099 \ ---environment_config=path/to/container/image -{{< /highlight >}} +3. Customize the `Dockerfile` for a given language. This file is typically in the `sdks/<language>/container` directory (e.g. the [Dockerfile for Python](https://github.com/apache/beam/blob/master/sdks/python/container/Dockerfile). If you're adding dependencies from [PyPI](https://pypi.org/), use [`base_image_requirements.txt`](https://github.com/apache/beam/blob/master/sdks/python/container/base_image_requirements.txt) instead. -## Building container images +3. Navigate to the root directory of the local copy of your Apache Beam. Review comment: Does "Navigate to the root directory where you've installed your local copy of the Beam SDK." work? ########## File path: website/www/site/content/en/documentation/runtime/environments.md ########## @@ -17,147 +17,232 @@ limitations under the License. # Container environments -The Beam SDK runtime environment is isolated from other runtime systems because the SDK runtime environment is [containerized](https://s.apache.org/beam-fn-api-container-contract) with [Docker](https://www.docker.com/). This means that any execution engine can run the Beam SDK. 
+The Beam SDK runtime environment can be [containerized](https://www.docker.com/resources/what-container) with [Docker](https://www.docker.com/) to isolate it from other runtime systems. To learn more about the container environment, read the Beam [SDK Harness container contract](https://s.apache.org/beam-fn-api-container-contract). -This page describes how to customize, build, and push Beam SDK container images. +Prebuilt SDK container images are released per supported language during Beam releases and pushed to [Docker Hub](https://hub.docker.com/search?q=apache%2Fbeam&type=image). -Before you begin, install [Docker](https://www.docker.com/) on your workstation. +## Custom containers -## Customizing container images +You may want to customize container images for many reasons, including: -You can add extra dependencies to container images so that you don't have to supply the dependencies to execution engines. +* Pre-installing additional dependencies +* Launching third-party software in the worker environment +* Further customizing the execution environment -To customize a container image, either: -* [Write a new](#writing-new-dockerfiles) [Dockerfile](https://docs.docker.com/engine/reference/builder/) on top of the original. -* [Modify](#modifying-dockerfiles) the [original Dockerfile](https://github.com/apache/beam/blob/master/sdks/python/container/Dockerfile) and reimage the container. + This guide describes how to create and use customized containers for the Beam SDK. -It's often easier to write a new Dockerfile. However, by modifying the original Dockerfile, you can customize anything (including the base OS). +### Prerequisites -### Writing new Dockerfiles on top of the original {#writing-new-dockerfiles} +* You will need to use Docker, either by [installing Docker tools locally](https://docs.docker.com/get-docker/) or using build services that can run Docker, such as [Google Cloud Build](https://cloud.google.com/cloud-build/docs/building/build-containers). 
+* You will need to have a container registry accessible by your execution engine or runner to host a custom container image. Options include [Docker Hub](https://hub.docker.com/) or a "self-hosted" repository, including cloud-specific container registries like [Google Container Registry](https://cloud.google.com/container-registry) (GCR) or [Amazon Elastic Container Registry](https://aws.amazon.com/ecr/) (ECR). + +> **NOTE**: On Nov 20, 2020, Docker Hub put [rate limits](https://www.docker.com/increase-rate-limits) into effect for anonymous and free authenticated use, which may impact larger pipelines that pull containers several times. + +For optimal user experience, we also recommend you use the latest released version of Beam. + +### Building and pushing custom containers + +Beam [SDK container images](https://hub.docker.com/search?q=apache%2Fbeam&type=image) are built from Dockerfiles checked into the [Github](https://github.com/apache/beam) repository and published to Docker Hub for every release. You can build customized containers in one of two ways: + +1. **[Writing a new](#writing-new-dockerfiles) Dockerfile based on a released container image**. This is sufficient for simple additions to the image, such as adding artifacts or environment variables. +2. **[Modifying](#modifying-dockerfiles) a source Dockerfile in [Beam](https://github.com/apache/beam)**. This method requires building from Beam source but allows for greater customization of the container (including replacement of artifacts or base OS/language versions). + +#### Writing a new Dockerfile based on an existing published container image {#writing-new-dockerfiles} + +Steps: + +1. Create a new Dockerfile that designates a base image using the [FROM instruction](https://docs.docker.com/engine/reference/builder/#from). As an example, this `Dockerfile`: -1. 
Pull a [prebuilt SDK container image](https://hub.docker.com/search?q=apache%2Fbeam&type=image) for your [target](https://docs.docker.com/docker-hub/repos/#searching-for-repositories) language and version. The following example pulls the latest Python SDK: ``` -docker pull apache/beam_python3.7_sdk +FROM apache/beam_python3.7_sdk:2.25.0 + +ENV FOO=bar +COPY /src/path/to/file /dest/path/to/file/ ``` -2. [Write a new Dockerfile](https://docs.docker.com/develop/develop-images/dockerfile_best-practices/) that [designates](https://docs.docker.com/engine/reference/builder/#from) the original as its [parent](https://docs.docker.com/glossary/?term=parent%20image). -3. [Build](#building-container-images) a child image. -### Modifying the original Dockerfile {#modifying-dockerfiles} +uses the prebuilt Python 3.7 SDK container image [`beam_python3.7_sdk`](https://hub.docker.com/r/apache/beam_python3.7_sdk) tagged at (SDK version) `2.25.0`, and adds an additional environment variable and file to the image. + + +2. [Build](https://docs.docker.com/engine/reference/commandline/build/) and [push](https://docs.docker.com/engine/reference/commandline/push/) the image using Docker. + -1. Clone the `beam` repository: ``` -git clone https://github.com/apache/beam.git +export BASE_IMAGE="apache/beam_python3.7_sdk:2.25.0" +export IMAGE_NAME="myremoterepo/mybeamsdk" +export TAG="latest" + +# Optional - pull the base image into your local Docker daemon to ensure +# you have the most up-to-date version of the base image locally. +docker pull "${BASE_IMAGE}" + +docker build -f Dockerfile -t "${IMAGE_NAME}:${TAG}" . +docker push "${IMAGE_NAME}:${TAG}" ``` -2. Customize the [Dockerfile](https://github.com/apache/beam/blob/master/sdks/python/container/Dockerfile). If you're adding dependencies from [PyPI](https://pypi.org/), use [`base_image_requirements.txt`](https://github.com/apache/beam/blob/master/sdks/python/container/base_image_requirements.txt) instead. -3. 
[Reimage](#building-container-images) the container. -### Testing customized images +**NOTE**: After pushing a container image, you should verify the remote image ID and digest should match the local image ID and digest, output from `docker build` or `docker images`. -To test a customized image locally, run a pipeline with PortableRunner and set the `--environment_config` flag to the image path: +#### Modifying a source Dockerfile in Beam {#modifying-dockerfiles} -{{< highlight class="runner-direct" >}} -python -m apache_beam.examples.wordcount \ ---input=/path/to/inputfile \ ---output /path/to/write/counts \ ---runner=PortableRunner \ ---job_endpoint=embed \ ---environment_config=path/to/container/image -{{< /highlight >}} +This method will require building image artifacts from Beam source. For additional instructions on setting up your development environment, see the [Contribution guide](contribute/#development-setup). -{{< highlight class="runner-flink-local" >}} -# Start a Flink job server on localhost:8099 -./gradlew :runners:flink:1.8:job-server:runShadow +1. Clone the `beam` repository. It is recommended that you start from a stable + release branch rather than from master for both customizing the Dockerfile + and building image artifacts, and that you use the same version of the SDK + to run your pipeline with a custom container. 
-# Run a pipeline on the Flink job server -python -m apache_beam.examples.wordcount \ ---input=/path/to/inputfile \ ---output=/path/to/write/counts \ ---runner=PortableRunner \ ---job_endpoint=localhost:8099 \ ---environment_config=path/to/container/image -{{< /highlight >}} +``` +export BEAM_SDK_VERSION="2.26.0" -{{< highlight class="runner-spark-local" >}} -# Start a Spark job server on localhost:8099 -./gradlew :runners:spark:job-server:runShadow +git clone https://github.com/apache/beam.git +git checkout origin/release-$BEAM_SDK_VERSION +``` -# Run a pipeline on the Spark job server -python -m apache_beam.examples.wordcount \ ---input=/path/to/inputfile \ ---output=path/to/write/counts \ ---runner=PortableRunner \ ---job_endpoint=localhost:8099 \ ---environment_config=path/to/container/image -{{< /highlight >}} +3. Customize the `Dockerfile` for a given language. This file is typically in the `sdks/<language>/container` directory (e.g. the [Dockerfile for Python](https://github.com/apache/beam/blob/master/sdks/python/container/Dockerfile). If you're adding dependencies from [PyPI](https://pypi.org/), use [`base_image_requirements.txt`](https://github.com/apache/beam/blob/master/sdks/python/container/base_image_requirements.txt) instead. -## Building container images +3. Navigate to the root directory of the local copy of your Apache Beam. -To build Beam SDK container images: +4. Run Gradle with the `docker` target. -1. Navigate to the root directory of the local copy of your Apache Beam. -2. Run Gradle with the `docker` target. If you're [building a child image](#writing-new-dockerfiles), set the optional `--file` flag to the new Dockerfile. 
If you're [building an image from an original Dockerfile](#modifying-dockerfiles), ignore the `--file` flag: ``` # The default repository of each SDK -./gradlew [--file=path/to/new/Dockerfile] :sdks:java:container:java8:docker -./gradlew [--file=path/to/new/Dockerfile] :sdks:java:container:java11:docker -./gradlew [--file=path/to/new/Dockerfile] :sdks:go:container:docker -./gradlew [--file=path/to/new/Dockerfile] :sdks:python:container:py2:docker -./gradlew [--file=path/to/new/Dockerfile] :sdks:python:container:py35:docker -./gradlew [--file=path/to/new/Dockerfile] :sdks:python:container:py36:docker -./gradlew [--file=path/to/new/Dockerfile] :sdks:python:container:py37:docker +./gradlew :sdks:java:container:java8:docker +./gradlew :sdks:java:container:java11:docker +./gradlew :sdks:go:container:docker +./gradlew :sdks:python:container:py36:docker +./gradlew :sdks:python:container:py37:docker +./gradlew :sdks:python:container:py38:docker -# Shortcut for building all four Python SDKs -./gradlew [--file=path/to/new/Dockerfile] :sdks:python:container buildAll +# Shortcut for building all Python SDKs +./gradlew :sdks:python:container buildAll ``` -From 2.21.0, `docker-pull-licenses` tag was introduced. Licenses/notices of third party dependencies will be added to the docker images when `docker-pull-licenses` was set. -For example, `./gradlew :sdks:java:container:java8:docker -Pdocker-pull-licenses`. The files are added to `/opt/apache/beam/third_party_licenses/`. -By default, no licenses/notices are added to the docker images. +To examine the containers that you built, run `docker images`: -To examine the containers that you built, run `docker images` from anywhere in the command line. If you successfully built all of the container images, the command prints a table like the following: ``` +$> docker images REPOSITORY TAG IMAGE ID CREATED SIZE -apache/beam_java8_sdk latest ... 2 weeks ago ... -apache/beam_java11_sdk latest ... 2 weeks ago ... 
-apache/beam_python2.7_sdk latest ... 2 weeks ago ... -apache/beam_python3.5_sdk latest ... 2 weeks ago ... -apache/beam_python3.6_sdk latest ... 2 weeks ago ... -apache/beam_python3.7_sdk latest ... 2 weeks ago ... -apache/beam_go_sdk latest ... 2 weeks ago ... +apache/beam_java8_sdk latest ... 1 min ago ... +apache/beam_java11_sdk latest ... 1 min ago ... +apache/beam_python3.6_sdk latest ... 1 min ago ... +apache/beam_python3.7_sdk latest ... 1 min ago ... +apache/beam_python3.8_sdk latest ... 1 min ago ... +apache/beam_go_sdk latest ... 1 min ago ... ``` -### Overriding default Docker targets - -The default [tag](https://docs.docker.com/engine/reference/commandline/tag/) is sdk_version defined at [gradle.properties](https://github.com/apache/beam/blob/master/gradle.properties) and the default repositories are in the Docker Hub `apache` namespace. -The `docker` command-line tool implicitly [pushes container images](#pushing-container-images) to this location. +If you did not provide a custom repo/tag as additional parameters (see below), you can retag the image and [push](https://docs.docker.com/engine/reference/commandline/push/) the image using Docker to a remote repository. -To tag a local image, set the `docker-tag` option when building the container. The following command tags a Python SDK image with a date. -``` -./gradlew :sdks:python:container:py36:docker -Pdocker-tag=2019-10-04 ``` +export BEAM_SDK_VERSION="2.26.0" +export IMAGE_NAME="myrepo/mybeamsdk" +export TAG="${BEAM_SDK_VERSION}-custom" -To change the repository, set the `docker-repository-root` option to a new location. The following command sets the `docker-repository-root` -to a repository named `example-repo` on Docker Hub. 
-``` -./gradlew :sdks:python:container:py36:docker -Pdocker-repository-root=example-repo +docker tag apache/beam_python3.6_sdk "${IMAGE_NAME}:${TAG}" +docker push "${IMAGE_NAME}:${TAG}" ``` -## Pushing container images +**NOTE**: After pushing a container image, verify the remote image ID and digest matches the local image ID and digest output from `docker_images` -After [building a container image](#building-container-images), you can store it in a remote Docker repository. +##### Additional build parameters Review comment: I think it's OK to keep this at H3 instead of H4 ### Additional build parameters
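On the NOTE in the hunk above about checking that the pushed image matches the local one: the comparison could be sketched roughly as follows. This is only an illustration, not something from the PR. The helper name `digests_match` is hypothetical, and the commented-out Docker wiring assumes a running Docker daemon and an image that has already been pushed, so that `RepoDigests` is populated.

```shell
# Hypothetical helper: succeeds when two image references carry the same
# sha256 digest. Each argument may be "repo@sha256:<hex>" or "sha256:<hex>".
digests_match() {
  _a="${1##*@}"   # drop everything up to the last '@', if there is one
  _b="${2##*@}"
  case "$_a" in
    sha256:*) [ "$_a" = "$_b" ] ;;   # compare only well-formed digests
    *) return 1 ;;                   # empty or malformed input never matches
  esac
}

# Example wiring (assumption: IMAGE_NAME and TAG are set as in the diff above):
#   local_ref="$(docker inspect --format='{{index .RepoDigests 0}}' "${IMAGE_NAME}:${TAG}")"
#   digests_match "${local_ref}" "sha256:<digest printed by docker push>" && echo "digests match"
```

The helper only compares strings, so it can be exercised without Docker; the digest values themselves still have to come from the `docker` output for the image in question.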
