emilymye commented on a change in pull request #13420:
URL: https://github.com/apache/beam/pull/13420#discussion_r532961476



##########
File path: website/www/site/content/en/documentation/runtime/environments.md
##########
@@ -87,77 +202,28 @@ python -m apache_beam.examples.wordcount \
 --output=path/to/write/counts \
 --runner=PortableRunner \
 --job_endpoint=localhost:8099 \
---environment_config=path/to/container/image
+--environment_config="${IMAGE}:${TAG}"
 {{< /highlight >}}
 
-## Building container images
-
-To build Beam SDK container images:
-
-1. Navigate to the root directory of the local copy of your Apache Beam.
-2. Run Gradle with the `docker` target. If you're [building a child 
image](#writing-new-dockerfiles), set the optional `--file` flag to the new 
Dockerfile. If you're [building an image from an original 
Dockerfile](#modifying-dockerfiles), ignore the `--file` flag:
-
-```
-# The default repository of each SDK
-./gradlew [--file=path/to/new/Dockerfile] :sdks:java:container:java8:docker
-./gradlew [--file=path/to/new/Dockerfile] :sdks:java:container:java11:docker
-./gradlew [--file=path/to/new/Dockerfile] :sdks:go:container:docker
-./gradlew [--file=path/to/new/Dockerfile] :sdks:python:container:py2:docker
-./gradlew [--file=path/to/new/Dockerfile] :sdks:python:container:py35:docker
-./gradlew [--file=path/to/new/Dockerfile] :sdks:python:container:py36:docker
-./gradlew [--file=path/to/new/Dockerfile] :sdks:python:container:py37:docker
-
-# Shortcut for building all four Python SDKs
-./gradlew [--file=path/to/new/Dockerfile] :sdks:python:container buildAll
-```
-
-From 2.21.0, `docker-pull-licenses` tag was introduced. Licenses/notices of 
third party dependencies will be added to the docker images when 
`docker-pull-licenses` was set.
-For example, `./gradlew :sdks:java:container:java8:docker 
-Pdocker-pull-licenses`. The files are added to 
`/opt/apache/beam/third_party_licenses/`.
-By default, no licenses/notices are added to the docker images.
-
-To examine the containers that you built, run `docker images` from anywhere in 
the command line. If you successfully built all of the container images, the 
command prints a table like the following:
-```
-REPOSITORY                         TAG                 IMAGE ID            
CREATED           SIZE
-apache/beam_java8_sdk              latest              ...                 2 
weeks ago       ...
-apache/beam_java11_sdk             latest              ...                 2 
weeks ago       ...
-apache/beam_python2.7_sdk          latest              ...                 2 
weeks ago       ...
-apache/beam_python3.5_sdk          latest              ...                 2 
weeks ago       ...
-apache/beam_python3.6_sdk          latest              ...                 2 
weeks ago       ...
-apache/beam_python3.7_sdk          latest              ...                 2 
weeks ago       ...
-apache/beam_go_sdk                 latest              ...                 2 
weeks ago       ...
-```
-
-### Overriding default Docker targets
-
-The default [tag](https://docs.docker.com/engine/reference/commandline/tag/) 
is sdk_version defined at 
[gradle.properties](https://github.com/apache/beam/blob/master/gradle.properties)
 and the default repositories are in the Docker Hub `apache` namespace.
-The `docker` command-line tool implicitly [pushes container 
images](#pushing-container-images) to this location.
+{{< highlight class="runner-dataflow" >}}
+export IMAGE="my-repo/beam_python_sdk_custom"
+export TAG="X.Y.Z"
 
-To tag a local image, set the `docker-tag` option when building the container. 
The following command tags a Python SDK image with a date.
-```
-./gradlew :sdks:python:container:py36:docker -Pdocker-tag=2019-10-04
-```
-
-To change the repository, set the `docker-repository-root` option to a new 
location. The following command sets the `docker-repository-root`
-to a repository named `example-repo` on Docker Hub.
-```
-./gradlew :sdks:python:container:py36:docker 
-Pdocker-repository-root=example-repo
-```
+export GCS_PATH="gs://my-gcs-bucket"
+export GCP_PROJECT="my-gcp-project"
+export REGION="us-central1"
 
-## Pushing container images
-
-After [building a container image](#building-container-images), you can store 
it in a remote Docker repository.
-
-The following steps push a Python3.6 SDK image to the 
[`docker-root-repository` value](#overriding-default-docker-targets).
-Please log in to the destination repository as needed.
-
-Upload it to the remote repository:
-```
-docker push example-repo/beam_python3.6_sdk
-```
-
-To download the image again, run `docker pull`:
-```
-docker pull example-repo/beam_python3.6_sdk
-```
+# Run a pipeline on Dataflow.
+# This is a Python batch pipeline, so to run on Dataflow Runner V2
+# you must specify the experiment "use_runner_v2"

Review comment:
       The `{{< highlight class="runner-X" >}}` shortcode formats it into code 
blocks, tabbed by runner type. See the existing page: 
https://beam.apache.org/documentation/runtime/environments/#testing-customized-images

##########
File path: website/www/site/content/en/documentation/runtime/environments.md
##########
@@ -15,56 +15,168 @@ See the License for the specific language governing 
permissions and
 limitations under the License.
 -->
 
-# Container environments
+# Container Environments
 
-The Beam SDK runtime environment is isolated from other runtime systems 
because the SDK runtime environment is 
[containerized](https://s.apache.org/beam-fn-api-container-contract) with 
[Docker](https://www.docker.com/). This means that any execution engine can run 
the Beam SDK.
+The Beam SDK runtime environment is 
[containerized](https://www.docker.com/resources/what-container) with 
[Docker](https://www.docker.com/) to isolate it from other runtime systems. 
This means any execution engine can run the Beam SDK. To learn more about the 
container environment, read the Beam [SDK Harness container 
contract](https://s.apache.org/beam-fn-api-container-contract).
 
-This page describes how to customize, build, and push Beam SDK container 
images.
+Prebuilt SDK container images are released per supported language version 
during Beam releases and pushed to [Docker 
Hub](https://hub.docker.com/search?q=apache%2Fbeam&type=image).
 
-Before you begin, install [Docker](https://www.docker.com/) on your 
workstation.
+## Custom Containers
 
-## Customizing container images
+Users may want to customize container images for many reasons, including:
 
-You can add extra dependencies to container images so that you don't have to 
supply the dependencies to execution engines.
+* pre-installing additional dependencies,
+* launching third-party software
+* further customizing the execution environment
 
-To customize a container image, either:
-* [Write a new](#writing-new-dockerfiles) 
[Dockerfile](https://docs.docker.com/engine/reference/builder/) on top of the 
original.
-* [Modify](#modifying-dockerfiles) the [original 
Dockerfile](https://github.com/apache/beam/blob/master/sdks/python/container/Dockerfile)
 and reimage the container.
+ This guide describes how to create and use customized containers for the Beam 
SDK.
 
-It's often easier to write a new Dockerfile. However, by modifying the 
original Dockerfile, you can customize anything (including the base OS).
+### Prerequisites
 
-### Writing new Dockerfiles on top of the original {#writing-new-dockerfiles}
+* You will need to have a version of the Beam SDK >= 2.21.0.
+* You will need to have [Docker 
installed](https://docs.docker.com/get-docker/).
+* You will need to have a container registry accessible by your execution 
engine or runner to host a custom container image. Options include [Docker 
Hub](https://hub.docker.com/) or a "self-hosted" repository, including 
cloud-specific container registries.
 
-1. Pull a [prebuilt SDK container 
image](https://hub.docker.com/search?q=apache%2Fbeam&type=image) for your 
[target](https://docs.docker.com/docker-hub/repos/#searching-for-repositories) 
language and version. The following example pulls the latest Python SDK:
+>  **NOTE**: On Nov 20, 2020, Docker Hub put [rate 
limits](https://www.docker.com/increase-rate-limits) into effect for anonymous 
and free authenticated use, which may impact larger pipelines that pull 
containers several times.
+
+### Building and pushing custom containers
+
+Beam builds prebuilt images from 
[Dockerfiles](https://docs.docker.com/engine/reference/builder/). Users can 
build customized containers in one of two ways:
+
+1. **[Writing a new](#writing-new-dockerfiles) Dockerfile based on an existing 
prebuilt container**. This is sufficient for simple additions to the image, 
such as adding artifacts or environment variables.
+2. **[Modifying](#modifying-dockerfiles) an existing Dockerfile in [Beam 
source](https://github.com/apache/beam)**. This method requires building from 
Beam source but allows for greater customization of the container (including 
replacement of artifacts or base OS/language versions).
+
+#### Writing new Dockerfiles on top of the original {#writing-new-dockerfiles}
+
+Steps:
+
+1. Create a new Dockerfile that designates a base image using the [FROM 
instruction](https://docs.docker.com/engine/reference/builder/#from)
+
+2. Once you have created a custom Dockerfile, 
[build](https://docs.docker.com/engine/reference/commandline/build/) and 
[push](https://docs.docker.com/engine/reference/commandline/push/) the image 
using Docker:
+
+As an example, this `Dockerfile`:
+
+```
+FROM apache/beam_python3.7_sdk:2.25.0
+
+ENV FOO=bar
+COPY /src/path/to/file /dest/path/to/file/
 ```
-docker pull apache/beam_python3.7_sdk
+
+uses the prebuilt Python 3.7 SDK container image 
[`beam_python3.7_sdk`](https://hub.docker.com/r/apache/beam_python3.7_sdk) 
tagged at (SDK version) `2.25.0`, and adds an additional environment variable 
and file to the image.
+
+```
+export BASE_IMAGE="apache/beam_python3.7_sdk:2.25.0"
+export IMAGE_NAME="myremoterepo/mybeamsdk"
+export TAG="latest"
+
+# Optional but recommended pull step to pull the base image into your local 
Docker daemon.

Review comment:
       Added an explanation. I think it's fine to leave it as part of the shell 
instructions. The reason you want to pull is to get a newer version of the 
base image (though there are a couple of ways to force that with Docker).
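
For illustration only, here is a minimal sketch of two ways to make sure the build uses a fresh base image. It reuses the variable values from the snippet under review; the function names and the `Dockerfile` path in the current directory are assumptions, not part of the PR. Option 1 is the explicit pull discussed above, and option 2 relies on the `--pull` flag of `docker build`, which asks Docker to check for a newer version of the base image during the build.

```shell
# Values copied from the snippet under review.
BASE_IMAGE="apache/beam_python3.7_sdk:2.25.0"
IMAGE_NAME="myremoterepo/mybeamsdk"
TAG="latest"

# Option 1 (hypothetical helper): pull the base image explicitly, then build.
build_after_pull() {
  docker pull "${BASE_IMAGE}" &&
  docker build -f Dockerfile -t "${IMAGE_NAME}:${TAG}" .
}

# Option 2 (hypothetical helper): let the build itself look for a newer base image.
build_with_pull_flag() {
  docker build --pull -f Dockerfile -t "${IMAGE_NAME}:${TAG}" .
}
```

Either function can be called from the directory that contains the Dockerfile. The explicit pull keeps the step visible in the docs, which may be why it reads well as part of the shell instructions.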





