This is an automated email from the ASF dual-hosted git repository.
rmetzger pushed a commit to branch release-1.11
in repository https://gitbox.apache.org/repos/asf/flink.git
The following commit(s) were added to refs/heads/release-1.11 by this push:
new 030df18 [FLINK-17976][docs][k8s/docker] Improvements about custom docker images
030df18 is described below
commit 030df18deaa7a21382eb5254a037c0c1b3438ad8
Author: Robert Metzger <[email protected]>
AuthorDate: Tue Jun 9 16:12:03 2020 +0200
[FLINK-17976][docs][k8s/docker] Improvements about custom docker images
This closes #12558
---
docs/ops/deployment/docker.md | 16 +++++++-
docs/ops/deployment/docker.zh.md | 16 +++++++-
docs/ops/deployment/index.md | 67 ++++++++++++++++++++++++++++---
docs/ops/deployment/index.zh.md | 69 ++++++++++++++++++++++++++++----
docs/ops/deployment/native_kubernetes.md | 6 ++-
5 files changed, 155 insertions(+), 19 deletions(-)
diff --git a/docs/ops/deployment/docker.md b/docs/ops/deployment/docker.md
index 2525185..87564b5 100644
--- a/docs/ops/deployment/docker.md
+++ b/docs/ops/deployment/docker.md
@@ -152,6 +152,8 @@ the *Flink Master* and *TaskManagers*:
* **or extend the Flink image** by writing a custom `Dockerfile`, build it and
use it for starting the *Flink Master* and *TaskManagers*:
+ *Dockerfile*:
+
```dockerfile
FROM flink
ADD /host/path/to/job/artifacts/1 /opt/flink/usrlib/artifacts/1
@@ -233,6 +235,8 @@ To provide a custom location for the Flink configuration files, you can
* or add them to your **custom Flink image**, build and run it:
+ *Dockerfile*:
+
```dockerfile
FROM flink
ADD /host/path/to/flink-conf.yaml /opt/flink/conf/flink-conf.yaml
@@ -264,10 +268,12 @@ There are several ways in which you can further customize the Flink image:
* install custom software (e.g. python)
* enable (symlink) optional libraries or plugins from `/opt/flink/opt` into
`/opt/flink/lib` or `/opt/flink/plugins`
-* add other libraries to `/opt/flink/lib` (e.g. [hadoop](hadoop.html#adding-hadoop-to-lib))
+* add other libraries to `/opt/flink/lib` (e.g. Hadoop)
* add other plugins to `/opt/flink/plugins`
-you can achieve this in several ways:
+See also: [How to provide dependencies in the classpath]({% link index.md %}#how-to-provide-dependencies-in-the-classpath).
+
+You can customize the Flink image in several ways:
* **override the container entry point** with a custom script where you can
run any bootstrap actions.
At the end you can call the standard `/docker-entrypoint.sh` script of the
Flink image with the same arguments
@@ -303,6 +309,8 @@ as described in [how to run the Flink image](#how-to-run-flink-image).
* **extend the Flink image** by writing a custom `Dockerfile` and build a
custom image:
+ *Dockerfile*:
+
```dockerfile
FROM flink
@@ -319,6 +327,8 @@ as described in [how to run the Flink image](#how-to-run-flink-image).
ENV VAR_NAME value
```
+ **Commands for building**:
+
```sh
docker build -t custom_flink_image .
# optional push to your docker image registry if you have it,
@@ -397,6 +407,7 @@ The next chapters show examples of configuration files to run Flink.
### Session Cluster with Docker Compose
**docker-compose.yml:**
+
```yaml
version: "2.2"
services:
@@ -430,6 +441,7 @@ See also [how to specify the Flink Master arguments](#flink-master-additional-co
in the `command` for the `jobmanager` service.
**docker-compose.yml:**
+
```yaml
version: "2.2"
services:
diff --git a/docs/ops/deployment/docker.zh.md b/docs/ops/deployment/docker.zh.md
index 6a4aea7..94df910 100644
--- a/docs/ops/deployment/docker.zh.md
+++ b/docs/ops/deployment/docker.zh.md
@@ -152,6 +152,8 @@ the *Flink Master* and *TaskManagers*:
* **or extend the Flink image** by writing a custom `Dockerfile`, build it and
use it for starting the *Flink Master* and *TaskManagers*:
+ *Dockerfile*:
+
```dockerfile
FROM flink
ADD /host/path/to/job/artifacts/1 /opt/flink/usrlib/artifacts/1
@@ -233,6 +235,8 @@ To provide a custom location for the Flink configuration files, you can
* or add them to your **custom Flink image**, build and run it:
+ *Dockerfile*:
+
```dockerfile
FROM flink
ADD /host/path/to/flink-conf.yaml /opt/flink/conf/flink-conf.yaml
@@ -264,10 +268,12 @@ There are several ways in which you can further customize the Flink image:
* install custom software (e.g. python)
* enable (symlink) optional libraries or plugins from `/opt/flink/opt` into
`/opt/flink/lib` or `/opt/flink/plugins`
-* add other libraries to `/opt/flink/lib` (e.g. [hadoop](hadoop.html#adding-hadoop-to-lib))
+* add other libraries to `/opt/flink/lib` (e.g. Hadoop)
* add other plugins to `/opt/flink/plugins`
-you can achieve this in several ways:
+See also: [How to provide dependencies in the classpath]({% link index.md %}#how-to-provide-dependencies-in-the-classpath).
+
+You can customize the Flink image in several ways:
* **override the container entry point** with a custom script where you can
run any bootstrap actions.
At the end you can call the standard `/docker-entrypoint.sh` script of the
Flink image with the same arguments
@@ -303,6 +309,8 @@ as described in [how to run the Flink image](#how-to-run-flink-image).
* **extend the Flink image** by writing a custom `Dockerfile` and build a
custom image:
+ *Dockerfile*:
+
```dockerfile
FROM flink
@@ -319,6 +327,8 @@ as described in [how to run the Flink image](#how-to-run-flink-image).
ENV VAR_NAME value
```
+ **Commands for building**:
+
```sh
docker build -t custom_flink_image .
# optional push to your docker image registry if you have it,
@@ -397,6 +407,7 @@ The next chapters show examples of configuration files to run Flink.
### Session Cluster with Docker Compose
**docker-compose.yml:**
+
```yaml
version: "2.2"
services:
@@ -430,6 +441,7 @@ See also [how to specify the Flink Master arguments](#flink-master-additional-co
in the `command` for the `jobmanager` service.
**docker-compose.yml:**
+
```yaml
version: "2.2"
services:
diff --git a/docs/ops/deployment/index.md b/docs/ops/deployment/index.md
index 9824541..6a856b2 100644
--- a/docs/ops/deployment/index.md
+++ b/docs/ops/deployment/index.md
@@ -119,7 +119,7 @@ Apache Flink ships with first class support for a number of common deployment ta
</div>
<div class="panel-body">
Run Flink locally for basic testing and experimentation
- <br><a href="{{ site.baseurl }}/ops/deployment/local.html">Learn more</a>
+ <br><a href="{% link ops/deployment/local.md %}">Learn more</a>
</div>
</div>
</div>
@@ -130,7 +130,7 @@ Apache Flink ships with first class support for a number of common deployment ta
</div>
<div class="panel-body">
A simple solution for running Flink on bare metal or VM's
- <br><a href="{{ site.baseurl }}/ops/deployment/cluster_setup.html">Learn more</a>
+ <br><a href="{% link ops/deployment/cluster_setup.md %}">Learn more</a>
</div>
</div>
</div>
@@ -141,7 +141,7 @@ Apache Flink ships with first class support for a number of common deployment ta
</div>
<div class="panel-body">
Deploy Flink on-top of Apache Hadoop's resource manager
- <br><a href="{{ site.baseurl }}/ops/deployment/yarn_setup.html">Learn more</a>
+ <br><a href="{% link ops/deployment/yarn_setup.md %}">Learn more</a>
</div>
</div>
</div>
@@ -154,7 +154,7 @@ Apache Flink ships with first class support for a number of common deployment ta
</div>
<div class="panel-body">
A generic resource manager for running distributed systems
- <br><a href="{{ site.baseurl }}/ops/deployment/mesos.html">Learn more</a>
+ <br><a href="{% link ops/deployment/mesos.md %}">Learn more</a>
</div>
</div>
</div>
@@ -165,7 +165,7 @@ Apache Flink ships with first class support for a number of common deployment ta
</div>
<div class="panel-body">
A popular solution for running Flink within a containerized environment
- <br><a href="{{ site.baseurl }}/ops/deployment/docker.html">Learn more</a>
+ <br><a href="{% link ops/deployment/docker.md %}">Learn more</a>
</div>
</div>
</div>
@@ -176,7 +176,7 @@ Apache Flink ships with first class support for a number of common deployment ta
</div>
<div class="panel-body">
An automated system for deploying containerized applications
- <br><a href="{{ site.baseurl }}/ops/deployment/kubernetes.html">Learn more</a>
+ <br><a href="{% link ops/deployment/kubernetes.md %}">Learn more</a>
</div>
</div>
</div>
@@ -247,3 +247,58 @@ Supported Environments:
<span class="label label-primary">Azure</span>
<span class="label label-primary">Google Cloud</span>
<span class="label label-primary">On-Premise</span>
+
+## Deployment Best Practices
+
+### How to provide dependencies in the classpath
+
+Flink provides several approaches for providing dependencies (such as `*.jar` files or static data) to Flink or user-provided
+applications. These approaches differ based on the deployment mode and target, but also have commonalities, which are described here.
+
+To provide a dependency, there are the following options:
+- files in the **`lib/` folder** are added to the classpath used to start Flink. It is suitable for libraries such as Hadoop or file systems not available as plugins. Beware that classes added here can potentially interfere with Flink, for example if you are adding a different version of a library already provided by Flink.
+
+- **`plugins/<name>/`** are loaded at runtime by Flink through separate classloaders to avoid conflicts with classes loaded and used by Flink. Only jar files which are prepared as [plugins]({% link ops/plugins.md %}) can be added here.
+
+### Download Maven dependencies locally
+
+If you need to extend Flink with a Maven dependency (and its transitive dependencies),
+you can use an [Apache Maven](https://maven.apache.org) *pom.xml* file to download all required files into a local folder:
+
+*pom.xml*:
+
+```xml
+<?xml version="1.0" encoding="UTF-8"?>
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+ xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+ <modelVersion>4.0.0</modelVersion>
+ <groupId>org.apache.flink</groupId>
+ <artifactId>docker-dependencies</artifactId>
+ <version>1.0-SNAPSHOT</version>
+
+ <dependencies>
+ <!-- Put your dependency here, for example a Hadoop GCS connector -->
+ </dependencies>
+
+ <build>
+ <plugins>
+ <plugin>
+ <groupId>org.apache.maven.plugins</groupId>
+ <artifactId>maven-dependency-plugin</artifactId>
+ <version>3.1.2</version>
+ <executions>
+ <execution>
+ <id>copy-dependencies</id>
+ <phase>package</phase>
+ <goals><goal>copy-dependencies</goal></goals>
+ <configuration><outputDirectory>jars</outputDirectory></configuration>
+ </execution>
+ </executions>
+ </plugin>
+ </plugins>
+ </build>
+</project>
+```
+
+Running `mvn package` in the same directory will create a `jars/` folder containing all the jar files,
+which you can add to the desired folder, Docker image etc.
diff --git a/docs/ops/deployment/index.zh.md b/docs/ops/deployment/index.zh.md
index 53b4558..60dca63 100644
--- a/docs/ops/deployment/index.zh.md
+++ b/docs/ops/deployment/index.zh.md
@@ -119,7 +119,7 @@ Apache Flink ships with first class support for a number of common deployment ta
</div>
<div class="panel-body">
Run Flink locally for basic testing and experimentation
- <br><a href="{{ site.baseurl }}/ops/deployment/local.html">Learn more</a>
+ <br><a href="{% link ops/deployment/local.md %}">Learn more</a>
</div>
</div>
</div>
@@ -130,7 +130,7 @@ Apache Flink ships with first class support for a number of common deployment ta
</div>
<div class="panel-body">
A simple solution for running Flink on bare metal or VM's
- <br><a href="{{ site.baseurl }}/ops/deployment/cluster_setup.html">Learn more</a>
+ <br><a href="{% link ops/deployment/cluster_setup.md %}">Learn more</a>
</div>
</div>
</div>
@@ -140,8 +140,8 @@ Apache Flink ships with first class support for a number of common deployment ta
<b>Yarn</b>
</div>
<div class="panel-body">
- Deploy Flink on-top Apache Hadoop's resource manager
- <br><a href="{{ site.baseurl }}/ops/deployment/yarn_setup.html">Learn more</a>
+ Deploy Flink on-top of Apache Hadoop's resource manager
+ <br><a href="{% link ops/deployment/yarn_setup.md %}">Learn more</a>
</div>
</div>
</div>
@@ -154,7 +154,7 @@ Apache Flink ships with first class support for a number of common deployment ta
</div>
<div class="panel-body">
A generic resource manager for running distributed systems
- <br><a href="{{ site.baseurl }}/ops/deployment/mesos.html">Learn more</a>
+ <br><a href="{% link ops/deployment/mesos.md %}">Learn more</a>
</div>
</div>
</div>
@@ -165,7 +165,7 @@ Apache Flink ships with first class support for a number of common deployment ta
</div>
<div class="panel-body">
A popular solution for running Flink within a containerized environment
- <br><a href="{{ site.baseurl }}/ops/deployment/docker.html">Learn more</a>
+ <br><a href="{% link ops/deployment/docker.md %}">Learn more</a>
</div>
</div>
</div>
@@ -176,7 +176,7 @@ Apache Flink ships with first class support for a number of common deployment ta
</div>
<div class="panel-body">
An automated system for deploying containerized applications
- <br><a href="{{ site.baseurl }}/ops/deployment/kubernetes.html">Learn more</a>
+ <br><a href="{% link ops/deployment/kubernetes.md %}">Learn more</a>
</div>
</div>
</div>
@@ -247,3 +247,58 @@ Supported Environments:
<span class="label label-primary">Azure</span>
<span class="label label-primary">Google Cloud</span>
<span class="label label-primary">On-Premise</span>
+
+## Deployment Best Practices
+
+### How to provide dependencies in the classpath
+
+Flink provides several approaches for providing dependencies (such as `*.jar` files or static data) to Flink or user-provided
+applications. These approaches differ based on the deployment mode and target, but also have commonalities, which are described here.
+
+To provide a dependency, there are the following options:
+- files in the **`lib/` folder** are added to the classpath used to start Flink. It is suitable for libraries such as Hadoop or file systems not available as plugins. Beware that classes added here can potentially interfere with Flink, for example if you are adding a different version of a library already provided by Flink.
+
+- **`plugins/<name>/`** are loaded at runtime by Flink through separate classloaders to avoid conflicts with classes loaded and used by Flink. Only jar files which are prepared as [plugins]({% link ops/plugins.md %}) can be added here.
+
+### Download Maven dependencies locally
+
+If you need to extend Flink with a Maven dependency (and its transitive dependencies),
+you can use an [Apache Maven](https://maven.apache.org) *pom.xml* file to download all required files into a local folder:
+
+*pom.xml*:
+
+```xml
+<?xml version="1.0" encoding="UTF-8"?>
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+ xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+ <modelVersion>4.0.0</modelVersion>
+ <groupId>org.apache.flink</groupId>
+ <artifactId>docker-dependencies</artifactId>
+ <version>1.0-SNAPSHOT</version>
+
+ <dependencies>
+ <!-- Put your dependency here, for example a Hadoop GCS connector -->
+ </dependencies>
+
+ <build>
+ <plugins>
+ <plugin>
+ <groupId>org.apache.maven.plugins</groupId>
+ <artifactId>maven-dependency-plugin</artifactId>
+ <version>3.1.2</version>
+ <executions>
+ <execution>
+ <id>copy-dependencies</id>
+ <phase>package</phase>
+ <goals><goal>copy-dependencies</goal></goals>
+ <configuration><outputDirectory>jars</outputDirectory></configuration>
+ </execution>
+ </executions>
+ </plugin>
+ </plugins>
+ </build>
+</project>
+```
+
+Running `mvn package` in the same directory will create a `jars/` folder containing all the jar files,
+which you can add to the desired folder, Docker image etc.
diff --git a/docs/ops/deployment/native_kubernetes.md b/docs/ops/deployment/native_kubernetes.md
index 31a12f4..930dcdc 100644
--- a/docs/ops/deployment/native_kubernetes.md
+++ b/docs/ops/deployment/native_kubernetes.md
@@ -103,7 +103,7 @@ $ ./bin/flink run -d -t kubernetes-session -Dkubernetes.cluster-id=<ClusterId> e
### Accessing Job Manager UI
There are several ways to expose a Service onto an external (outside of your cluster) IP address.
-This can be configured using `kubernetes.service.exposed.type`.
+This can be configured using [`kubernetes.rest-service.exposed.type`]({% link ops/config.md %}#kubernetes-rest-service-exposed-type).
- `ClusterIP`: Exposes the service on a cluster-internal IP.
The Service is only reachable within the cluster. If you want to access the Job Manager ui or submit job to the existing session, you need to start a local proxy.
@@ -116,10 +116,12 @@ $ kubectl port-forward service/<ServiceName> 8081
- `NodePort`: Exposes the service on each Node’s IP at a static port (the `NodePort`). `<NodeIP>:<NodePort>` could be used to contact the Job Manager Service. `NodeIP` could be easily replaced with Kubernetes ApiServer address.
You could find it in your kube config file.
-- `LoadBalancer`: Default value, exposes the service externally using a cloud provider’s load balancer.
+- `LoadBalancer`: Exposes the service externally using a cloud provider’s load balancer.
Since the cloud provider and Kubernetes needs some time to prepare the load balancer, you may get a `NodePort` JobManager Web Interface in the client log.
You can use `kubectl get services/<ClusterId>` to get EXTERNAL-IP and then construct the load balancer JobManager Web Interface manually `http://<EXTERNAL-IP>:8081`.
+ <span class="label label-warning">Warning!</span> Your JobManager (which can run arbitrary jar files) might be exposed to the public internet, without authentication.
+
- `ExternalName`: Map a service to a DNS name, not supported in current version.
Please reference the official documentation on [publishing services in Kubernetes](https://kubernetes.io/docs/concepts/services-networking/service/#publishing-services-service-types) for more information.