XComp commented on a change in pull request #14346:
URL: https://github.com/apache/flink/pull/14346#discussion_r542326276
##########
File path: docs/deployment/resource-providers/standalone/docker.md
##########
@@ -149,16 +143,16 @@ the *JobManager* and *TaskManagers*:
    [--fromSavepoint /path/to/savepoint [--allowNonRestoredState]] \
    [job arguments]
- docker run \
+ $ docker run \
    --mount type=bind,src=/host/path/to/job/artifacts1,target=/opt/flink/usrlib/artifacts1 \
    --mount type=bind,src=/host/path/to/job/artifacts2,target=/opt/flink/usrlib/artifacts2 \
    --env FLINK_PROPERTIES="${FLINK_PROPERTIES}" \
    flink:{% if site.is_stable %}{{site.version}}-scala{{site.scala_version_suffix}}{% else %}latest{% endif %} taskmanager
```
-* **or extend the Flink image** by writing a custom `Dockerfile`, build it and use it for starting the *JobManager* and *TaskManagers*:
+* **or extend the Flink image** by writing a custom `Dockerfile`, build it and use it for starting the JobManager and TaskManagers:
-  *Dockerfile*:
+  *Dockerfile*:
Review comment:
```suggestion
```
It was already mentioned above that this is a Dockerfile.
##########
File path: docs/deployment/resource-providers/standalone/docker.md
##########
@@ -590,11 +608,11 @@ docker service create \
    taskmanager
```
-The *job artifacts* must be available in the *JobManager* container, as outlined [here](#start-a-job-cluster).
+The *job artifacts* must be available in the JobManager container, as outlined [here](#application-mode-on-docker).
See also [how to specify the JobManager arguments](#jobmanager-additional-command-line-arguments) to pass them to the `flink-jobmanager` container.
The example assumes that you run the swarm locally and expects the *job artifacts* to be in `/host/path/to/job/artifacts`.
-It also mounts the host path with the artifacts as a volume to the container's path `/opt/flink/usrlib`.
+It also mounts the host path with the artifacts as a volume to the container's path `/opt/flink/usrlib`.
Review comment:
```suggestion
It also mounts the host path with the artifacts as a volume to the
container's path `/opt/flink/usrlib`.
```
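For orientation, the JobManager counterpart of the `docker service create ... taskmanager` call shown in this hunk's context presumably has roughly the following shape. This is a sketch only: the service name, image tag, and job class are illustrative assumptions; the artifact host path and the `/opt/flink/usrlib` mount target are taken from the quoted text.

```sh
# Sketch of the swarm JobManager service behind this hunk (illustrative values).
$ docker service create \
    --name flink-jobmanager \
    --env FLINK_PROPERTIES="${FLINK_PROPERTIES}" \
    --mount type=bind,source=/host/path/to/job/artifacts,target=/opt/flink/usrlib \
    flink:latest standalone-job --job-classname com.example.MyJob
```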
##########
File path: docs/deployment/resource-providers/standalone/index.md
##########
@@ -24,153 +24,206 @@ specific language governing permissions and limitations
under the License.
-->
-This page provides instructions on how to run Flink in a *fully distributed fashion* on a *static* (but possibly heterogeneous) cluster.
-
* This will be replaced by the TOC
{:toc}
-## Requirements
-### Software Requirements
+## Getting Started
-Flink runs on all *UNIX-like environments*, e.g. **Linux**, **Mac OS X**, and **Cygwin** (for Windows) and expects the cluster to consist of **one master node** and **one or more worker nodes**. Before you start to setup the system, make sure you have the following software installed **on each node**:
+This *Getting Started* section guides you through the local setup (on one machine, but in separate processes) of a Flink cluster. This can easily be expanded to set up a distributed standalone cluster, which we describe in the [reference section](#the-start-and-stop-scripts).
-- **Java 1.8.x** or higher,
-- **ssh** (sshd must be running to use the Flink scripts that manage
-  remote components)
+### Introduction
-If your cluster does not fulfill these software requirements you will need to install/upgrade it.
+The standalone mode is the most barebone way of deploying Flink: The Flink services described in the [deployment overview]({% link deployment/index.md %}) are just launched as processes on the operating system. Unlike deploying Flink with a resource provider such as [Kubernetes]({% link deployment/resource-providers/native_kubernetes.md %}) or [YARN]({% link deployment/resource-providers/yarn.md %}), you have to take care of restarting failed processes, or allocation and de-allocation of resources during operation.
-Having __passwordless SSH__ and
-__the same directory structure__ on all your cluster nodes will allow you to use our scripts to control
-everything.
+In the additional subpages of the standalone mode resource provider, we describe additional deployment methods which are based on the standalone mode: [Deployment in Docker containers]({% link deployment/resource-providers/standalone/docker.md %}), and on [Kubernetes]({% link deployment/resource-providers/standalone/kubernetes.md %}).
-{% top %}
+### Preparation
-### `JAVA_HOME` Configuration
+Flink runs on all *UNIX-like environments*, e.g. **Linux**, **Mac OS X**, and **Cygwin** (for Windows). Before you start to setup the system, make sure your system fulfils the following requirements.
-Flink requires the `JAVA_HOME` environment variable to be set on the master and all worker nodes and point to the directory of your Java installation.
+- **Java 1.8.x** or higher installed,
+- Downloaded a recent Flink distribution from the [download page]({{ site.download_url }}) and unpacked it.
-You can set this variable in `conf/flink-conf.yaml` via the `env.java.home` key.
+### Starting a Standalone Cluster (Session Mode)
-{% top %}
+These steps show how to launch a Flink standalone cluster, and submit an example job:
-## Flink Setup
+{% highlight bash %}
+# we assume to be in the root directory of the unzipped Flink distribution
-Go to the [downloads page]({{ site.download_url }}) and get the ready-to-run package.
+# (1) Start Cluster
+$ ./bin/start-cluster.sh
-After downloading the latest release, copy the archive to your master node and extract it:
+# (2) You can now access the Flink Web Interface on http://localhost:8081
+
+# (3) Submit example job
+$ ./bin/flink run ./examples/streaming/TopSpeedWindowing.jar
+
+# (4) Stop the cluster again
+$ ./bin/stop-cluster.sh
+{% endhighlight %}
+
+In step `(1)`, we've started 2 processes: A JVM for the JobManager, and a JVM for the TaskManager. The JobManager is serving the web interface accessible at [localhost:8081](http://localhost:8081).
+In step `(3)`, we are starting a Flink Client (a short-lived JVM process) that submits an application to the JobManager.
+
+## Deployment Modes Supported by the Standalone Cluster
+
+### Application Mode
+
+To start a Flink JobManager with an embedded application, we use the `bin/standalone-job.sh` script.
+We demonstrate this mode by locally starting the `TopSpeedWindowing.jar` example, running on a single TaskManager.
+
+The application jar file needs to be available in the classpath. The easiest approach to achieve that is putting the jar into the `lib/` folder:
{% highlight bash %}
-tar xzf flink-*.tgz
-cd flink-*
+$ cp ./examples/streaming/TopSpeedWindowing.jar lib/
{% endhighlight %}
-### Configuring Flink
+Then, we can launch the JobManager:
-After having extracted the system files, you need to configure Flink for the cluster by editing *conf/flink-conf.yaml*.
+{% highlight bash %}
+$ ./bin/standalone-job.sh start --job-classname org.apache.flink.streaming.examples.windowing.TopSpeedWindowing
+{% endhighlight %}
-Set the `jobmanager.rpc.address` key to point to your master node. You should also define the maximum amount of main memory Flink is allowed to allocate on each node by setting the `jobmanager.memory.process.size` and `taskmanager.memory.process.size` keys.
+The web interface is now available at [localhost:8081](http://localhost:8081). However, the application won't be able to start, because there are no TaskManagers running yet:
-These values are given in MB. If some worker nodes have more main memory which you want to allocate to the Flink system you can overwrite the default value by setting `taskmanager.memory.process.size` or `taskmanager.memory.flink.size` in *conf/flink-conf.yaml* on those specific nodes.
+{% highlight bash %}
+$ ./bin/taskmanager.sh start
+{% endhighlight %}
-Finally, you must provide a list of all nodes in your cluster that shall be used as worker nodes, i.e., nodes running a TaskManager. Edit the file *conf/workers* and enter the IP/host name of each worker node.
+Note: You can start multiple TaskManagers, if your application needs more resources.
-The following example illustrates the setup with three nodes (with IP addresses from _10.0.0.1_
-to _10.0.0.3_ and hostnames _master_, _worker1_, _worker2_) and shows the contents of the
-configuration files (which need to be accessible at the same path on all machines):
+Stopping the services is also supported via the scripts. Call them multiple times if you want to stop multiple instances, or use `stop-all`:
-<div class="row">
-  <div class="col-md-6 text-center">
-    <img src="{% link /page/img/quickstart_cluster.png %}" style="width: 60%">
-  </div>
-<div class="col-md-6">
-  <div class="row">
-    <p class="lead text-center">
-      /path/to/<strong>flink/conf/<br>flink-conf.yaml</strong>
-    <pre>jobmanager.rpc.address: 10.0.0.1</pre>
-    </p>
-  </div>
-<div class="row" style="margin-top: 1em;">
-  <p class="lead text-center">
-    /path/to/<strong>flink/<br>conf/workers</strong>
-  <pre>
-10.0.0.2
-10.0.0.3</pre>
-  </p>
-</div>
-</div>
-</div>
+{% highlight bash %}
+$ ./bin/taskmanager.sh stop
+$ ./bin/standalone-job.sh stop
+{% endhighlight %}
-The Flink directory must be available on every worker under the same path. You can use a shared NFS directory, or copy the entire Flink directory to every worker node.
-Please see the [configuration page]({% link deployment/config.md %}) for details and additional configuration options.
+### Per-Job Mode
-In particular,
+Per-Job Mode is not supported by the the Standalone Cluster.
Review comment:
```suggestion
Per-Job Mode is not supported by the Standalone Cluster.
```
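Condensed from the lines added in this hunk, the local Application Mode walkthrough is the following shell session (all commands appear verbatim in the diff above):

```sh
# Put the example jar on the classpath, then start the embedded-application JobManager.
$ cp ./examples/streaming/TopSpeedWindowing.jar lib/
$ ./bin/standalone-job.sh start --job-classname org.apache.flink.streaming.examples.windowing.TopSpeedWindowing

# The application stays pending until at least one TaskManager joins.
$ ./bin/taskmanager.sh start

# Tear the processes down again.
$ ./bin/taskmanager.sh stop
$ ./bin/standalone-job.sh stop
```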
##########
File path: docs/deployment/resource-providers/standalone/docker.md
##########
@@ -204,21 +198,53 @@ You can provide the following additional command line arguments to the cluster e
If the main function of the user job main class accepts arguments, you can also pass them at the end of the `docker run` command.
-## Customize Flink image
+### Per-Job Cluster Mode
+
+Per-Job Cluster Mode is not supported by Docker.
Review comment:
```suggestion
Per-Job Cluster Mode is not supported by Flink on Docker.
```
##########
File path: docs/deployment/resource-providers/standalone/index.md
##########
@@ -24,153 +24,206 @@ specific language governing permissions and limitations
under the License.
-->
- * the amount of available memory per JobManager (`jobmanager.memory.process.size`),
- * the amount of available memory per TaskManager (`taskmanager.memory.process.size` and check [memory setup guide]({% link deployment/memory/mem_tuning.md %}#configure-memory-for-standalone-deployment)),
- * the number of available CPUs per machine (`taskmanager.numberOfTaskSlots`),
- * the total number of CPUs in the cluster (`parallelism.default`) and
- * the temporary directories (`io.tmp.dirs`)
+### Session Mode
-are very important configuration values.
+Local deployment in Session Mode has already been described in the [introduction](#starting-a-standalone-cluster-session-mode) above.
-{% top %}
+## Standalone Cluster Reference
+
+### Configuration
+
+All available configuration options are listed on the [configuration page]({% link deployment/config.md %}), in particular the [Basic Setup]({% link deployment/config.md %}#basic-setup) section contains good advise on configuring the ports, memory, parallelism etc.
+
+### Debugging
+
+If Flink is behaving unexpectedly, we recommend looking at Flink's log files as a starting point for further investigations.
+
+The log files are located in the `logs/` directory. There's a `.log` file for each Flink service running on this machine. In the default configuration, log files are rotated on each start of a Flink service -- older runs of a service will have a number suffixed to the log file.
-### Starting Flink
+Alternatively, logs are available from the Flink web frontend (both for the JobManager and each TaskManager).
-The following script starts a JobManager on the local node and connects via SSH to all worker nodes listed in the *workers* file to start the TaskManager on each node. Now your Flink system is up and running. The JobManager running on the local node will now accept jobs at the configured RPC port.
+By default, Flink is logging on the "INFO" log level, which provides basic information for all obvious issues. For cases where Flink seems to behave wrongly, reducing the log level to "DEBUG" is advised. The logging level is controlled via the `conf/log4.properties` file.
+Setting `rootLogger.level = DEBUG` will boostrap Flink on the DEBUG log level.
-Assuming that you are on the master node and inside the Flink directory:
+There's a dedicated page on the [logging]({%link deployment/advanced/logging.md %}) in Flink.
+### Component Management Scripts
+
+#### Starting and Stopping a cluster
+
+`bin/start-cluster.sh` and `bin/stop-cluster.sh` rely on `conf/masters` and `conf/workers` to determine the number of cluster component instances.
+
+If password-less SSH access to the listed machines is configured, and they share the same directory structure, the scripts also support starting and stopping instances remotely.
+
+##### Example 1: Start a cluster with 2 TaskManagers locally
+
+`conf/masters` contents:
{% highlight bash %}
-bin/start-cluster.sh
+localhost
{% endhighlight %}
-To stop Flink, there is also a `stop-cluster.sh` script.
-
-{% top %}
+`conf/workers` contents:
+{% highlight bash %}
+localhost
+localhost
+{% endhighlight %}
-### Adding JobManager/TaskManager Instances to a Cluster
+##### Example 2: Start a distributed cluster JobMangers
-You can add both JobManager and TaskManager instances to your running cluster with the `bin/jobmanager.sh` and `bin/taskmanager.sh` scripts.
+This assumes a cluster with 4 machines (`master1, worker1, worker2, worker3`), which all can reach each other over the network.
-#### Adding a JobManager
+`conf/masters` contents:
+{% highlight bash %}
+master1
+{% endhighlight %}
+`conf/workers` contents:
{% highlight bash %}
-bin/jobmanager.sh ((start|start-foreground) [host] [webui-port])|stop|stop-all
+worker1
+worker2
+worker3
{% endhighlight %}
-#### Adding a TaskManager
+Note that the configuration key [jobmanager.rpc.address]({% link deployment/config.md %}#jobmanager-rpc-address) needs to be set to `master1` for this to work.
+
+We show a third example with a standby JobManager in the [high-availability section](#setting-up-high-availability).
+
+#### Starting and Stopping Flink Components
+
+The `bin/jobmanager.sh` and `bin/taskmanager.sh` scripts support starting the respective daemon in the background (using the `start` argument), or in the foreground (using `start-foreground`). In the foreground mode, the logs are printed to standard out. This mode is useful for deployment scenarios where another process is controlling the Flink daemon (e.g. Docker).
+
+The scripts can be called multiple times, for example if multiple TaskManagers are needed. The instances are tracked by the scripts, and can be stopped one-by-one (using `stop`) or all together (using `stop-all`).
+
+#### Windows Cygwin Users
+
+If you are installing Flink from the git repository and you are using the Windows git shell, Cygwin can produce a failure similar to this one:
{% highlight bash %}
-bin/taskmanager.sh start|start-foreground|stop|stop-all
+c:/flink/bin/start-cluster.sh: line 30: $'\r': command not found
{% endhighlight %}
-Make sure to call these scripts on the hosts on which you want to start/stop the respective instance.
+This error occurs because git is automatically transforming UNIX line endings to Windows style line endings when running on Windows. The problem is that Cygwin can only deal with UNIX style line endings. The solution is to adjust the Cygwin settings to deal with the correct line endings by following these three steps:
+
+1. Start a Cygwin shell.
+
+2. Determine your home directory by entering
-## High-Availability with Standalone
+    ```bash
+    cd; pwd
+    ```
+
+    This will return a path under the Cygwin root path.
+
+3. Using NotePad, WordPad or a different text editor open the file `.bash_profile` in the home directory and append the following (if the file does not exist you will have to create it):
+
+    ```bash
+    $ export SHELLOPTS
+    $ set -o igncr
+    ```
+
+4. Save the file and open a new bash shell.
+
+### Setting up High-Availability
In order to enable HA for a standalone cluster, you have to use the [ZooKeeper HA services]({% link deployment/ha/zookeeper_ha.md %}).
Additionally, you have to configure your cluster to start multiple JobManagers.
-### Masters File (masters)
-
In order to start an HA-cluster configure the *masters* file in `conf/masters`:
- **masters file**: The *masters file* contains all hosts, on which JobManagers are started, and the ports to which the web user interface binds.
<pre>
-jobManagerAddress1:webUIPort1
+master1:webUIPort1
[...]
-jobManagerAddressX:webUIPortX
+masterX:webUIPortX
</pre>
By default, the job manager will pick a *random port* for inter process communication. You can change this via the [high-availability.jobmanager.port]({% link deployment/config.md %}#high-availability-jobmanager-port) key. This key accepts single ports (e.g. `50010`), ranges (`50000-50025`), or a combination of both (`50010,50011,50020-50025,50050-50075`).
-### Example: Standalone Cluster with 2 JobManagers
+#### Example: Standalone HA Cluster with 2 JobManagers
1. **Configure high availability mode and ZooKeeper quorum** in `conf/flink-conf.yaml`:
Review comment:
I think we could remove the formatting for the items.
##########
File path: docs/deployment/resource-providers/standalone/docker.md
##########
@@ -204,21 +198,53 @@ You can provide the following additional command line arguments to the cluster e
If the main function of the user job main class accepts arguments, you can also pass them at the end of the `docker run` command.
-## Customize Flink image
+### Per-Job Cluster Mode
+
+Per-Job Cluster Mode is not supported by Docker.
Review comment:
```suggestion
[Per-Job Mode]({% link deployment/index.md %}#per-job-mode) is not supported
by Docker.
```
##########
File path: docs/deployment/resource-providers/standalone/docker.md
##########
@@ -204,21 +198,53 @@ You can provide the following additional command line arguments to the cluster e
If the main function of the user job main class accepts arguments, you can also pass them at the end of the `docker run` command.
-## Customize Flink image
+### Per-Job Cluster Mode
Review comment:
```suggestion
### Per-Job Mode on Docker
```
##########
File path: docs/deployment/resource-providers/standalone/index.md
##########
@@ -24,153 +24,206 @@ specific language governing permissions and limitations
under the License.
-->
-This page provides instructions on how to run Flink in a *fully distributed fashion* on a *static* (but possibly heterogeneous) cluster.
-
* This will be replaced by the TOC
{:toc}
-## Requirements
-### Software Requirements
+## Getting Started
-Flink runs on all *UNIX-like environments*, e.g. **Linux**, **Mac OS X**, and **Cygwin** (for Windows) and expects the cluster to consist of **one master node** and **one or more worker nodes**. Before you start to setup the system, make sure you have the following software installed **on each node**:
+This *Getting Started* section guides you through the local setup (on one machine, but in separate processes) of a Flink cluster. This can easily be expanded to set up a distributed standalone cluster, which we describe in the [reference section](#the-start-and-stop-scripts).
Review comment:
```suggestion
This *Getting Started* section guides you through the local setup (on one
machine, but in separate processes) of a Flink cluster. This can easily be
expanded to set up a distributed standalone cluster, which we describe in the
[reference section](#example-2-start-a-distributed-cluster-jobmangers).
```
##########
File path: docs/deployment/resource-providers/standalone/index.md
##########
@@ -24,153 +24,206 @@ specific language governing permissions and limitations
under the License.
-->
+### Setting up High-Availability
In order to enable HA for a standalone cluster, you have to use the [ZooKeeper HA services]({% link deployment/ha/zookeeper_ha.md %}).
Additionally, you have to configure your cluster to start multiple JobManagers.
-### Masters File (masters)
-
In order to start an HA-cluster configure the *masters* file in `conf/masters`:
- **masters file**: The *masters file* contains all hosts, on which JobManagers are started, and the ports to which the web user interface binds.
<pre>
-jobManagerAddress1:webUIPort1
+master1:webUIPort1
[...]
-jobManagerAddressX:webUIPortX
+masterX:webUIPortX
</pre>
By default, the job manager will pick a *random port* for inter process communication. You can change this via the [high-availability.jobmanager.port]({% link deployment/config.md %}#high-availability-jobmanager-port) key. This key accepts single ports (e.g. `50010`), ranges (`50000-50025`), or a combination of both (`50010,50011,50020-50025,50050-50075`).
-### Example: Standalone Cluster with 2 JobManagers
+#### Example: Standalone HA Cluster with 2 JobManagers
Review comment:
Can we switch from `<pre/>` tags to "\`\`\`sh \`\`\`"? This way, we would enable syntax highlighting for these code blocks.
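Converted that way, the `masters` snippet quoted above would render as:

```sh
master1:webUIPort1
[...]
masterX:webUIPortX
```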
##########
File path: docs/deployment/resource-providers/standalone/docker.md
##########
@@ -23,120 +23,114 @@ specific language governing permissions and limitations
under the License.
-->
-[Docker](https://www.docker.com) is a popular container runtime.
-There are Docker images for Apache Flink available [on Docker Hub](https://hub.docker.com/_/flink).
-You can use the docker images to deploy a *Session* or *Job cluster* in a containerized environment, e.g.,
-[standalone Kubernetes]({% link deployment/resource-providers/standalone/kubernetes.md %}) or [native Kubernetes]({% link deployment/resource-providers/native_kubernetes.md %}).
-
* This will be replaced by the TOC
{:toc}
-## Docker Hub Flink images
-
-The [Flink Docker repository](https://hub.docker.com/_/flink/) is hosted on
-Docker Hub and serves images of Flink version 1.2.1 and later.
-The source for these images can be found in the [Apache flink-docker](https://github.com/apache/flink-docker) repository.
-
-### Image tags
-
-Images for each supported combination of Flink and Scala versions are available, and
-[tag aliases](https://hub.docker.com/_/flink?tab=tags) are provided for convenience.
-For example, you can use the following aliases:
+## Getting Started
-* `flink:latest` → `flink:<latest-flink>-scala_<latest-scala>`
-* `flink:1.11` → `flink:1.11.<latest-flink-1.11>-scala_2.11`
+This *Getting Started* section guides you through the local setup (on one machine, but in separate containers) of a Flink cluster using Docker containers.
-<span class="label label-info">Note</span> It is recommended to always use an explicit version tag of the docker image that specifies both the needed Flink and Scala
-versions (for example `flink:1.11-scala_2.12`).
-This will avoid some class conflicts that can occur if the Flink and/or Scala versions used in the application are different
-from the versions provided by the docker image.
+### Introduction
-<span class="label label-info">Note</span> Prior to Flink 1.5 version, Hadoop dependencies were always bundled with Flink.
-You can see that certain tags include the version of Hadoop, e.g. (e.g. `-hadoop28`).
-Beginning with Flink 1.5, image tags that omit the Hadoop version correspond to Hadoop-free releases of Flink
-that do not include a bundled Hadoop distribution.
-
-## How to run a Flink image
-
-The Flink image contains a regular Flink distribution with its default configuration and a standard entry point script.
-You can run its entry point in the following modes:
-* [JobManager]({% link concepts/glossary.md %}#flink-jobmanager) for [a Session cluster](#start-a-session-cluster)
-* [JobManager]({% link concepts/glossary.md %}#flink-jobmanager) for [a Job cluster](#start-a-job-cluster)
-* [TaskManager]({% link concepts/glossary.md %}#flink-taskmanager) for any cluster
-
-This allows you to deploy a standalone cluster (Session or Job) in any containerised environment, for example:
-* manually in a local Docker setup,
-* [in a Kubernetes cluster]({% link deployment/resource-providers/standalone/kubernetes.md %}),
-* [with Docker Compose](#flink-with-docker-compose),
-* [with Docker swarm](#flink-with-docker-swarm).
-
-<span class="label label-info">Note</span> [The native Kubernetes]({% link deployment/resource-providers/native_kubernetes.md %}) also runs the same image by default
-and deploys *TaskManagers* on demand so that you do not have to do it manually.
-
-The next chapters describe how to start a single Flink Docker container for various purposes.
+[Docker](https://www.docker.com) is a popular container runtime.
+There are Docker images for Apache Flink available [on Docker Hub](https://hub.docker.com/_/flink).
+You can use the Docker images to deploy a *Session* or *Application cluster* on Docker. This page focuses on the setup of Flink on Docker, Docker Swarm and Docker compose.
Review comment:
```suggestion
You can use the Docker images to deploy a *Session* or *Application cluster*
on Docker. This page focuses on the setup of Flink on Docker, Docker Swarm and
Docker Compose.
```
##########
File path: docs/deployment/resource-providers/standalone/docker.md
##########
@@ -23,120 +23,114 @@ specific language governing permissions and limitations
under the License.
-->
-Once you've started Flink on Docker, you can access the Flink Webfrontend on [localhost:8081](http://localhost:8081/#/overview) or submit jobs like this `./bin/flink run ./examples/streaming/TopSpeedWindowing.jar`.
+Deployment into managed containerized environments, such as [standalone Kubernetes]({% link deployment/resource-providers/standalone/kubernetes.md %}) or [native Kubernetes]({% link deployment/resource-providers/native_kubernetes.md %}), are described on separate pages.
-We recommend using [Docker Compose]({% link deployment/resource-providers/standalone/docker.md %}#session-cluster-with-docker-compose) or [Docker Swarm]({% link deployment/resource-providers/standalone/docker.md %}#session-cluster-with-docker-swarm) for deploying Flink as a Session Cluster to ease system configuration.
-### Start a Session Cluster
+### Starting a Session Cluster on Docker
-A *Flink Session cluster* can be used to run multiple jobs. Each job needs to be submitted to the cluster after it has been deployed.
-To deploy a *Flink Session cluster* with Docker, you need to start a *JobManager* container. To enable communication between the containers, we first set a required Flink configuration property and create a network:
+A *Flink Session cluster* can be used to run multiple jobs. Each job needs to be submitted to the cluster after the cluster has been deployed.
+To deploy a *Flink Session cluster* with Docker, you need to start a JobManager container. To enable communication between the containers, we first set a required Flink configuration property and create a network:
```sh
-FLINK_PROPERTIES="jobmanager.rpc.address: jobmanager"
-docker network create flink-network
+$ FLINK_PROPERTIES="jobmanager.rpc.address: jobmanager"
+$ docker network create flink-network
```
Then we launch the JobManager:
```sh
-docker run \
+$ docker run \
    --rm \
    --name=jobmanager \
    --network flink-network \
-   -p 8081:8081 \
+   --publish 8081:8081 \
    --env FLINK_PROPERTIES="${FLINK_PROPERTIES}" \
    flink:{% if site.is_stable %}{{site.version}}-scala{{site.scala_version_suffix}}{% else %}latest{% endif %} jobmanager
```
-and one or more *TaskManager* containers:
+and one or more TaskManager containers:
```sh
-docker run \
+$ docker run \
    --rm \
    --name=taskmanager \
    --network flink-network \
    --env FLINK_PROPERTIES="${FLINK_PROPERTIES}" \
    flink:{% if site.is_stable %}{{site.version}}-scala{{site.scala_version_suffix}}{% else %}latest{% endif %} taskmanager
```
+The web interface is now available at [localhost:8081](http://localhost:8081).
-### Start a Job Cluster
-A *Flink Job cluster* is a dedicated cluster which runs a single job.
+Submission of a job is now possible like this (assuming you have a local distribution of Flink available):
+
+```sh
+$ ./bin/flink run ./examples/streaming/TopSpeedWindowing.jar
+```
+
+To shut down the cluster, either terminate (e.g. with `CTRL-C`) the JobManager and TaskManager processes, or use `docker ps` to identify and `docker stop` to terminate the containers.
+
+## Deployment Modes
+
+The Flink image contains a regular Flink distribution with its default configuration and a standard entry point script.
+You can run its entry point in the following modes:
+* [JobManager]({% link concepts/glossary.md %}#flink-jobmanager) for [a Session cluster](#starting-a-session-cluster-on-docker)
+* [JobManager]({% link concepts/glossary.md %}#flink-jobmanager) for [a Application cluster](#application-mode-on-docker)
+* [TaskManager]({% link concepts/glossary.md %}#flink-taskmanager) for any cluster
+
+This allows you to deploy a standalone cluster (Session or Per-Job Mode) in any containerised environment, for example:
Review comment:
```suggestion
This allows you to deploy a standalone cluster (Session or Application Mode)
in any containerised environment, for example:
```
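For comparison with the Session Mode commands in this hunk, launching the image's entry point in Application Mode (the `standalone-job` argument) follows roughly this shape; the artifact path and job class are illustrative assumptions, while the `usrlib` mount target matches the other hunks in this review:

```sh
# Sketch: JobManager with an embedded application on Docker (illustrative values).
$ docker run \
    --name=jobmanager \
    --network flink-network \
    --env FLINK_PROPERTIES="${FLINK_PROPERTIES}" \
    --mount type=bind,src=/host/path/to/job/artifacts,target=/opt/flink/usrlib \
    flink:latest standalone-job --job-classname com.example.MyJob
```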
##########
File path: docs/deployment/resource-providers/standalone/docker.md
##########
@@ -304,29 +349,33 @@ as described in [how to run the Flink image](#how-to-run-flink-image).
    # create custom_lib.jar
    # create custom_plugin.jar
-   echo "
-   ln -fs /opt/flink/opt/flink-queryable-state-runtime-*.jar /opt/flink/lib/. # enable an optional library
-   ln -fs /mnt/custom_lib.jar /opt/flink/lib/. # enable a custom library
+   $ echo "
+   # enable an optional library
+   ln -fs /opt/flink/opt/flink-queryable-state-runtime-*.jar /opt/flink/lib/
+   # enable a custom library
+   ln -fs /mnt/custom_lib.jar /opt/flink/lib/
    mkdir -p /opt/flink/plugins/flink-s3-fs-hadoop
-   ln -fs /opt/flink/opt/flink-s3-fs-hadoop-*.jar /opt/flink/plugins/flink-s3-fs-hadoop/. # enable an optional plugin
+   # enable an optional plugin
+   ln -fs /opt/flink/opt/flink-s3-fs-hadoop-*.jar /opt/flink/plugins/flink-s3-fs-hadoop/
    mkdir -p /opt/flink/plugins/custom_plugin
-   ln -fs /mnt/custom_plugin.jar /opt/flink/plugins/custom_plugin/. # enable a custom plugin
+   # enable a custom plugin
+   ln -fs /mnt/custom_plugin.jar /opt/flink/plugins/custom_plugin/
    /docker-entrypoint.sh <jobmanager|standalone-job|taskmanager>
    " > custom_entry_point_script.sh
-   chmod 755 custom_entry_point_script.sh
+   $ chmod 755 custom_entry_point_script.sh
-   docker run \
+   $ docker run \
    --mount type=bind,src=$(pwd),target=/mnt
    flink:{% if site.is_stable %}{{site.version}}-scala{{site.scala_version_suffix}}{% else %}latest{% endif %} /mnt/custom_entry_point_script.sh
    ```
* **extend the Flink image** by writing a custom `Dockerfile` and build a custom image:
-  *Dockerfile*:
+  *Dockerfile*:
Review comment:
```suggestion
```
I think that's obsolete again.
##########
File path: docs/deployment/resource-providers/standalone/docker.md
##########
@@ -344,209 +393,178 @@ as described in [how to run the Flink
image](#how-to-run-flink-image).
ENV VAR_NAME value
```
- **Commands for building**:
+ **Commands for building**:
```sh
- docker build -t custom_flink_image .
+ $ docker build --tag custom_flink_image .
# optional push to your docker image registry if you have it,
# e.g. to distribute the custom image to your cluster
- docker push custom_flink_image
+ $ docker push custom_flink_image
```
-
-### Enabling Python
-
-To build a custom image which has Python and Pyflink prepared, you can refer
to the following Dockerfile:
-{% highlight Dockerfile %}
-FROM flink
-
-# install python3 and pip3
-RUN apt-get update -y && \
-apt-get install -y python3.7 python3-pip python3.7-dev && rm -rf
/var/lib/apt/lists/*
-RUN ln -s /usr/bin/python3 /usr/bin/python
-# install Python Flink
-RUN pip3 install apache-flink
-{% endhighlight %}
-
-Build the image named as **pyflink:latest**:
-
-{% highlight bash %}
-sudo docker build -t pyflink:latest .
-{% endhighlight %}
-{% top %}
-
-## Flink with Docker Compose
+### Flink with Docker Compose
[Docker Compose](https://docs.docker.com/compose/) is a way to run a group of
Docker containers locally.
-The next chapters show examples of configuration files to run Flink.
+The next sections show examples of configuration files to run Flink.
-### Usage
+#### Usage
* Create the `yaml` files with the container configuration, check examples for:
- * [Session cluster](#session-cluster-with-docker-compose)
- * [Job cluster](#job-cluster-with-docker-compose)
+ * [Application cluster](#app-cluster-yml)
+ * [Session cluster](#session-cluster-yml)
- See also [the Flink Docker image tags](#image-tags) and [how to customize
the Flink Docker image](#advanced-customization)
- for usage in the configuration files.
+ See also [the Flink Docker image tags](#image-tags) and [how to customize
the Flink Docker image](#advanced-customization)
+ for usage in the configuration files.
-* Launch a cluster in the foreground
+* Launch a cluster in the foreground (use `-d` for background)
```sh
- docker-compose up
+ $ docker-compose up
```
-* Launch a cluster in the background
+* Scale the cluster up or down to `N` TaskManagers
```sh
- docker-compose up -d
+ $ docker-compose scale taskmanager=<N>
```
-* Scale the cluster up or down to *N TaskManagers*
+* Access the JobManager container
```sh
- docker-compose scale taskmanager=<N>
- ```
-
-* Access the *JobManager* container
-
- ```sh
- docker exec -it $(docker ps --filter name=jobmanager --format={% raw
%}{{.ID}}{% endraw %}) /bin/sh
+ $ docker exec -it $(docker ps --filter name=jobmanager --format={% raw
%}{{.ID}}{% endraw %}) /bin/sh
```
* Kill the cluster
```sh
- docker-compose kill
+ $ docker-compose kill
```
* Access Web UI
- When the cluster is running, you can visit the web UI at
[http://localhost:8081](http://localhost:8081).
- You can also use the web UI to submit a job to a *Session cluster*.
+ When the cluster is running, you can visit the web UI at
[http://localhost:8081](http://localhost:8081).
+ You can also use the web UI to submit a job to a *Session cluster*.
* To submit a job to a *Session cluster* via the command line, you can either
* use [Flink CLI]({% link deployment/cli.md %}) on the host if it is
installed:
```sh
- flink run -d -c ${JOB_CLASS_NAME} /job.jar
+ $ ./bin/flink run --detached --class ${JOB_CLASS_NAME} /job.jar
```
- * or copy the JAR to the *JobManager* container and submit the job using the
[CLI]({% link deployment/cli.md %}) from there, for example:
+ * or copy the JAR to the JobManager container and submit the job using the
[CLI]({% link deployment/cli.md %}) from there, for example:
```sh
- JOB_CLASS_NAME="com.job.ClassName"
- JM_CONTAINER=$(docker ps --filter name=jobmanager --format={% raw
%}{{.ID}}{% endraw %}))
- docker cp path/to/jar "${JM_CONTAINER}":/job.jar
- docker exec -t -i "${JM_CONTAINER}" flink run -d -c ${JOB_CLASS_NAME}
/job.jar
+ $ JOB_CLASS_NAME="com.job.ClassName"
+ $ JM_CONTAINER=$(docker ps --filter name=jobmanager --format={% raw
%}{{.ID}}{% endraw %})
+ $ docker cp path/to/jar "${JM_CONTAINER}":/job.jar
+ $ docker exec -t -i "${JM_CONTAINER}" flink run -d -c ${JOB_CLASS_NAME}
/job.jar
```
-### Session Cluster with Docker Compose
+Here, we provide the **docker-compose.yml:** <a id="app-cluster-yml">file</a>
for *Application Cluster*.
-**docker-compose.yml:**
+Note: For the application cluster, the artifacts must be available in the
Flink containers, check details [here](#application-mode-on-docker).
Review comment:
```suggestion
Note: For the Application Mode cluster, the artifacts must be available in
the Flink containers, check details [here](#application-mode-on-docker).
```
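For readers who want to try this out, one possible shape of that `docker-compose.yml` for an Application Mode cluster is sketched below; the image tag, job class name, artifact path, and slot count are illustrative assumptions:
```sh
# write a minimal docker-compose.yml for an Application Mode cluster (sketch)
$ cat > docker-compose.yml <<'EOF'
version: "2.2"
services:
  jobmanager:
    image: flink:latest
    ports:
      - "8081:8081"
    command: standalone-job --job-classname com.job.ClassName
    volumes:
      - /host/path/to/job/artifacts:/opt/flink/usrlib
    environment:
      - |
        FLINK_PROPERTIES=
        jobmanager.rpc.address: jobmanager
  taskmanager:
    image: flink:latest
    depends_on:
      - jobmanager
    command: taskmanager
    volumes:
      - /host/path/to/job/artifacts:/opt/flink/usrlib
    environment:
      - |
        FLINK_PROPERTIES=
        jobmanager.rpc.address: jobmanager
        taskmanager.numberOfTaskSlots: 2
EOF
```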
##########
File path: docs/deployment/resource-providers/standalone/docker.md
##########
@@ -23,120 +23,114 @@ specific language governing permissions and limitations
under the License.
-->
-[Docker](https://www.docker.com) is a popular container runtime.
-There are Docker images for Apache Flink available [on Docker
Hub](https://hub.docker.com/_/flink).
-You can use the docker images to deploy a *Session* or *Job cluster* in a
containerized environment, e.g.,
-[standalone Kubernetes]({% link
deployment/resource-providers/standalone/kubernetes.md %}) or [native
Kubernetes]({% link deployment/resource-providers/native_kubernetes.md %}).
-
* This will be replaced by the TOC
{:toc}
-## Docker Hub Flink images
-
-The [Flink Docker repository](https://hub.docker.com/_/flink/) is hosted on
-Docker Hub and serves images of Flink version 1.2.1 and later.
-The source for these images can be found in the [Apache
flink-docker](https://github.com/apache/flink-docker) repository.
-
-### Image tags
-
-Images for each supported combination of Flink and Scala versions are
available, and
-[tag aliases](https://hub.docker.com/_/flink?tab=tags) are provided for
convenience.
-For example, you can use the following aliases:
+## Getting Started
-* `flink:latest` → `flink:<latest-flink>-scala_<latest-scala>`
-* `flink:1.11` → `flink:1.11.<latest-flink-1.11>-scala_2.11`
+This *Getting Started* section guides you through the local setup (on one
machine, but in separate containers) of a Flink cluster using Docker.
-<span class="label label-info">Note</span> It is recommended to always use an
explicit version tag of the docker image that specifies both the needed Flink
and Scala
-versions (for example `flink:1.11-scala_2.12`).
-This will avoid some class conflicts that can occur if the Flink and/or Scala
versions used in the application are different
-from the versions provided by the docker image.
+### Introduction
-<span class="label label-info">Note</span> Prior to Flink 1.5 version, Hadoop
dependencies were always bundled with Flink.
-You can see that certain tags include the version of Hadoop, e.g. (e.g.
`-hadoop28`).
-Beginning with Flink 1.5, image tags that omit the Hadoop version correspond
to Hadoop-free releases of Flink
-that do not include a bundled Hadoop distribution.
-
-## How to run a Flink image
-
-The Flink image contains a regular Flink distribution with its default
configuration and a standard entry point script.
-You can run its entry point in the following modes:
-* [JobManager]({% link concepts/glossary.md %}#flink-jobmanager) for [a
Session cluster](#start-a-session-cluster)
-* [JobManager]({% link concepts/glossary.md %}#flink-jobmanager) for [a Job
cluster](#start-a-job-cluster)
-* [TaskManager]({% link concepts/glossary.md %}#flink-taskmanager) for any
cluster
-
-This allows you to deploy a standalone cluster (Session or Job) in any
containerised environment, for example:
-* manually in a local Docker setup,
-* [in a Kubernetes cluster]({% link
deployment/resource-providers/standalone/kubernetes.md %}),
-* [with Docker Compose](#flink-with-docker-compose),
-* [with Docker swarm](#flink-with-docker-swarm).
-
-<span class="label label-info">Note</span> [The native Kubernetes]({% link
deployment/resource-providers/native_kubernetes.md %}) also runs the same image
by default
-and deploys *TaskManagers* on demand so that you do not have to do it manually.
-
-The next chapters describe how to start a single Flink Docker container for
various purposes.
+[Docker](https://www.docker.com) is a popular container runtime.
+There are Docker images for Apache Flink available [on Docker
Hub](https://hub.docker.com/_/flink).
+You can use the Docker images to deploy a *Session* or *Application cluster*
on Docker. This page focuses on the setup of Flink on Docker, Docker Swarm and
Docker Compose.
-Once you've started Flink on Docker, you can access the Flink Webfrontend on
[localhost:8081](http://localhost:8081/#/overview) or submit jobs like this
`./bin/flink run ./examples/streaming/TopSpeedWindowing.jar`.
+Deployment into managed containerized environments, such as [standalone
Kubernetes]({% link deployment/resource-providers/standalone/kubernetes.md %})
or [native Kubernetes]({% link
deployment/resource-providers/native_kubernetes.md %}), is described on
separate pages.
-We recommend using [Docker Compose]({% link
deployment/resource-providers/standalone/docker.md
%}#session-cluster-with-docker-compose) or [Docker Swarm]({% link
deployment/resource-providers/standalone/docker.md
%}#session-cluster-with-docker-swarm) for deploying Flink as a Session Cluster
to ease system configuration.
-### Start a Session Cluster
+### Starting a Session Cluster on Docker
-A *Flink Session cluster* can be used to run multiple jobs. Each job needs to
be submitted to the cluster after it has been deployed.
-To deploy a *Flink Session cluster* with Docker, you need to start a
*JobManager* container. To enable communication between the containers, we
first set a required Flink configuration property and create a network:
+A *Flink Session cluster* can be used to run multiple jobs. Each job needs to
be submitted to the cluster after the cluster has been deployed.
+To deploy a *Flink Session cluster* with Docker, you need to start a
JobManager container. To enable communication between the containers, we first
set a required Flink configuration property and create a network:
```sh
-FLINK_PROPERTIES="jobmanager.rpc.address: jobmanager"
-docker network create flink-network
+$ FLINK_PROPERTIES="jobmanager.rpc.address: jobmanager"
+$ docker network create flink-network
```
Then we launch the JobManager:
```sh
-docker run \
+$ docker run \
--rm \
--name=jobmanager \
--network flink-network \
- -p 8081:8081 \
+ --publish 8081:8081 \
--env FLINK_PROPERTIES="${FLINK_PROPERTIES}" \
flink:{% if site.is_stable
%}{{site.version}}-scala{{site.scala_version_suffix}}{% else %}latest{% endif
%} jobmanager
```
-and one or more *TaskManager* containers:
+and one or more TaskManager containers:
```sh
-docker run \
+$ docker run \
--rm \
--name=taskmanager \
--network flink-network \
--env FLINK_PROPERTIES="${FLINK_PROPERTIES}" \
flink:{% if site.is_stable
%}{{site.version}}-scala{{site.scala_version_suffix}}{% else %}latest{% endif
%} taskmanager
```
+The web interface is now available at [localhost:8081](http://localhost:8081).
-### Start a Job Cluster
-A *Flink Job cluster* is a dedicated cluster which runs a single job.
+You can now submit a job like this (assuming you have a local
distribution of Flink available):
+
+```sh
+$ ./bin/flink run ./examples/streaming/TopSpeedWindowing.jar
+```
+
+To shut down the cluster, either terminate (e.g. with `CTRL-C`) the JobManager
and TaskManager processes, or use `docker ps` to identify and `docker stop` to
terminate the containers.
+
+## Deployment Modes
+
+The Flink image contains a regular Flink distribution with its default
configuration and a standard entry point script.
+You can run its entry point in the following modes:
+* [JobManager]({% link concepts/glossary.md %}#flink-jobmanager) for [a
Session cluster](#starting-a-session-cluster-on-docker)
+* [JobManager]({% link concepts/glossary.md %}#flink-jobmanager) for [an
Application cluster](#application-mode-on-docker)
+* [TaskManager]({% link concepts/glossary.md %}#flink-taskmanager) for any
cluster
+
+This allows you to deploy a standalone cluster (Session or Per-Job Mode) in
any containerised environment, for example:
+* manually in a local Docker setup,
+* [in a (native) Kubernetes cluster]({% link
deployment/resource-providers/standalone/kubernetes.md %}),
Review comment:
```suggestion
* [in a Kubernetes cluster]({% link
deployment/resource-providers/standalone/kubernetes.md %}),
```
I'd suggest removing `(native)` here as we're referring to the Standalone
Kubernetes deployment. ...just to avoid confusion
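As a follow-up to the Session cluster commands in this hunk, tearing the cluster down again is a two-liner; the container and network names below match the `--name` and `docker network create` arguments used above:
```sh
# stop both containers; they remove themselves on stop because of --rm
$ docker stop taskmanager jobmanager
# remove the dedicated Docker network
$ docker network rm flink-network
```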
##########
File path: docs/deployment/resource-providers/standalone/docker.md
##########
@@ -234,14 +260,14 @@ To provide a custom location for the Flink configuration
files, you can
* **either mount a volume** with the custom configuration files to this path
`/opt/flink/conf` when you run the Flink image:
```sh
- docker run \
+ $ docker run \
--mount type=bind,src=/host/path/to/custom/conf,target=/opt/flink/conf
\
flink:{% if site.is_stable
%}{{site.version}}-scala{{site.scala_version_suffix}}{% else %}latest{% endif
%} <jobmanager|standalone-job|taskmanager>
```
* or add them to your **custom Flink image**, build and run it:
- *Dockerfile*:
+ *Dockerfile*:
Review comment:
```suggestion
```
I think we can remove `Dockerfile:` here as it's mentioned in the item above
that it's a "custom Flink image"
##########
File path: docs/deployment/resource-providers/standalone/index.md
##########
@@ -24,153 +24,206 @@ specific language governing permissions and limitations
under the License.
-->
-This page provides instructions on how to run Flink in a *fully distributed
fashion* on a *static* (but possibly heterogeneous) cluster.
-
* This will be replaced by the TOC
{:toc}
-## Requirements
-### Software Requirements
+## Getting Started
-Flink runs on all *UNIX-like environments*, e.g. **Linux**, **Mac OS X**, and
**Cygwin** (for Windows) and expects the cluster to consist of **one master
node** and **one or more worker nodes**. Before you start to setup the system,
make sure you have the following software installed **on each node**:
+This *Getting Started* section guides you through the local setup (on one
machine, but in separate processes) of a Flink cluster. This can easily be
expanded to set up a distributed standalone cluster, which we describe in the
[reference section](#the-start-and-stop-scripts).
-- **Java 1.8.x** or higher,
-- **ssh** (sshd must be running to use the Flink scripts that manage
- remote components)
+### Introduction
-If your cluster does not fulfill these software requirements you will need to
install/upgrade it.
+The standalone mode is the most barebone way of deploying Flink: The Flink
services described in the [deployment overview]({% link deployment/index.md %})
are just launched as processes on the operating system. Unlike deploying Flink
with a resource provider such as [Kubernetes]({% link
deployment/resource-providers/native_kubernetes.md %}) or [YARN]({% link
deployment/resource-providers/yarn.md %}), you have to take care of restarting
failed processes, or allocation and de-allocation of resources during operation.
-Having __passwordless SSH__ and
-__the same directory structure__ on all your cluster nodes will allow you to
use our scripts to control
-everything.
+In the additional subpages of the standalone mode resource provider, we
describe additional deployment methods which are based on the standalone mode:
[Deployment in Docker containers]({% link
deployment/resource-providers/standalone/docker.md %}), and on [Kubernetes]({%
link deployment/resource-providers/standalone/kubernetes.md %}).
-{% top %}
+### Preparation
-### `JAVA_HOME` Configuration
+Flink runs on all *UNIX-like environments*, e.g. **Linux**, **Mac OS X**, and
**Cygwin** (for Windows). Before you start to setup the system, make sure your
system fulfils the following requirements.
-Flink requires the `JAVA_HOME` environment variable to be set on the master
and all worker nodes and point to the directory of your Java installation.
+- **Java 1.8.x** or higher installed,
+- Downloaded a recent Flink distribution from the [download page]({{
site.download_url }}) and unpacked it.
-You can set this variable in `conf/flink-conf.yaml` via the `env.java.home`
key.
+### Starting a Standalone Cluster (Session Mode)
-{% top %}
+These steps show how to launch a Flink standalone cluster, and submit an
example job:
-## Flink Setup
+{% highlight bash %}
+# we assume to be in the root directory of the unzipped Flink distribution
-Go to the [downloads page]({{ site.download_url }}) and get the ready-to-run
package.
+# (1) Start Cluster
+$ ./bin/start-cluster.sh
-After downloading the latest release, copy the archive to your master node and
extract it:
+# (2) You can now access the Flink Web Interface on http://localhost:8081
+
+# (3) Submit example job
+$ ./bin/flink run ./examples/streaming/TopSpeedWindowing.jar
+
+# (4) Stop the cluster again
+$ ./bin/stop-cluster.sh
+{% endhighlight %}
+
+In step `(1)`, we've started two processes: a JVM for the JobManager, and a JVM
for the TaskManager. The JobManager is serving the web interface accessible at
[localhost:8081](http://localhost:8081).
+In step `(3)`, we are starting a Flink Client (a short-lived JVM process) that
submits an application to the JobManager.
+
+## Deployment Modes Supported by the Standalone Cluster
+
+### Application Mode
+
+To start a Flink JobManager with an embedded application, we use the
`bin/standalone-job.sh` script.
+We demonstrate this mode by locally starting the `TopSpeedWindowing.jar`
example, running on a single TaskManager.
+
+The application jar file needs to be available in the classpath. The easiest
approach to achieve that is putting the jar into the `lib/` folder:
{% highlight bash %}
-tar xzf flink-*.tgz
-cd flink-*
+$ cp ./examples/streaming/TopSpeedWindowing.jar lib/
{% endhighlight %}
-### Configuring Flink
+Then, we can launch the JobManager:
-After having extracted the system files, you need to configure Flink for the
cluster by editing *conf/flink-conf.yaml*.
+{% highlight bash %}
+$ ./bin/standalone-job.sh start --job-classname
org.apache.flink.streaming.examples.windowing.TopSpeedWindowing
+{% endhighlight %}
-Set the `jobmanager.rpc.address` key to point to your master node. You should
also define the maximum amount of main memory Flink is allowed to allocate on
each node by setting the `jobmanager.memory.process.size` and
`taskmanager.memory.process.size` keys.
+The web interface is now available at [localhost:8081](http://localhost:8081).
However, the application won't be able to start, because there are no
TaskManagers running yet:
-These values are given in MB. If some worker nodes have more main memory which
you want to allocate to the Flink system you can overwrite the default value by
setting `taskmanager.memory.process.size` or `taskmanager.memory.flink.size` in
*conf/flink-conf.yaml* on those specific nodes.
+{% highlight bash %}
+$ ./bin/taskmanager.sh start
+{% endhighlight %}
-Finally, you must provide a list of all nodes in your cluster that shall be
used as worker nodes, i.e., nodes running a TaskManager. Edit the file
*conf/workers* and enter the IP/host name of each worker node.
+Note: You can start multiple TaskManagers if your application needs more
resources.
-The following example illustrates the setup with three nodes (with IP
addresses from _10.0.0.1_
-to _10.0.0.3_ and hostnames _master_, _worker1_, _worker2_) and shows the
contents of the
-configuration files (which need to be accessible at the same path on all
machines):
+Stopping the services is also supported via the scripts. Call them multiple
times if you want to stop multiple instances, or use `stop-all`:
-<div class="row">
- <div class="col-md-6 text-center">
- <img src="{% link /page/img/quickstart_cluster.png %}" style="width: 60%">
- </div>
-<div class="col-md-6">
- <div class="row">
- <p class="lead text-center">
- /path/to/<strong>flink/conf/<br>flink-conf.yaml</strong>
- <pre>jobmanager.rpc.address: 10.0.0.1</pre>
- </p>
- </div>
-<div class="row" style="margin-top: 1em;">
- <p class="lead text-center">
- /path/to/<strong>flink/<br>conf/workers</strong>
- <pre>
-10.0.0.2
-10.0.0.3</pre>
- </p>
-</div>
-</div>
-</div>
+{% highlight bash %}
+$ ./bin/taskmanager.sh stop
+$ ./bin/standalone-job.sh stop
+{% endhighlight %}
-The Flink directory must be available on every worker under the same path. You
can use a shared NFS directory, or copy the entire Flink directory to every
worker node.
-Please see the [configuration page]({% link deployment/config.md %}) for
details and additional configuration options.
+### Per-Job Mode
-In particular,
+Per-Job Mode is not supported by the Standalone Cluster.
- * the amount of available memory per JobManager
(`jobmanager.memory.process.size`),
- * the amount of available memory per TaskManager
(`taskmanager.memory.process.size` and check [memory setup guide]({% link
deployment/memory/mem_tuning.md %}#configure-memory-for-standalone-deployment)),
- * the number of available CPUs per machine (`taskmanager.numberOfTaskSlots`),
- * the total number of CPUs in the cluster (`parallelism.default`) and
- * the temporary directories (`io.tmp.dirs`)
+### Session Mode
-are very important configuration values.
+Local deployment in Session Mode has already been described in the
[introduction](#starting-a-standalone-cluster-session-mode) above.
-{% top %}
+## Standalone Cluster Reference
+
+### Configuration
+
+All available configuration options are listed on the [configuration page]({%
link deployment/config.md %}); in particular, the [Basic Setup]({% link
deployment/config.md %}#basic-setup) section contains good advice on
configuring ports, memory, parallelism, etc.
+
+### Debugging
+
+If Flink is behaving unexpectedly, we recommend looking at Flink's log files
as a starting point for further investigations.
+
+The log files are located in the `logs/` directory. There's a `.log` file for
each Flink service running on this machine. In the default configuration, log
files are rotated on each start of a Flink service -- older runs of a service
will have a number suffixed to the log file.
-### Starting Flink
+Alternatively, logs are available from the Flink web frontend (both for the
JobManager and each TaskManager).
-The following script starts a JobManager on the local node and connects via
SSH to all worker nodes listed in the *workers* file to start the TaskManager
on each node. Now your Flink system is up and running. The JobManager running
on the local node will now accept jobs at the configured RPC port.
+By default, Flink logs at the "INFO" log level, which provides basic
information for all obvious issues. For cases where Flink seems to behave
unexpectedly, lowering the log level to "DEBUG" is advised. The logging level is
controlled via the `conf/log4j.properties` file.
+Setting `rootLogger.level = DEBUG` will bootstrap Flink at the DEBUG log level.
-Assuming that you are on the master node and inside the Flink directory:
+There's a dedicated page on [logging]({% link
deployment/advanced/logging.md %}) in Flink.
+### Component Management Scripts
+
+#### Starting and Stopping a Cluster
+
+`bin/start-cluster.sh` and `bin/stop-cluster.sh` rely on `conf/masters` and
`conf/workers` to determine the number of cluster component instances.
+
+If password-less SSH access to the listed machines is configured, and they
share the same directory structure, the scripts also support starting and
stopping instances remotely.
+
+##### Example 1: Start a cluster with 2 TaskManagers locally
+
+`conf/masters` contents:
{% highlight bash %}
-bin/start-cluster.sh
+localhost
{% endhighlight %}
-To stop Flink, there is also a `stop-cluster.sh` script.
-
-{% top %}
+`conf/workers` contents:
+{% highlight bash %}
+localhost
+localhost
+{% endhighlight %}
-### Adding JobManager/TaskManager Instances to a Cluster
+##### Example 2: Start a distributed cluster
-You can add both JobManager and TaskManager instances to your running cluster
with the `bin/jobmanager.sh` and `bin/taskmanager.sh` scripts.
+This assumes a cluster with 4 machines (`master1, worker1, worker2, worker3`),
which can all reach each other over the network.
-#### Adding a JobManager
+`conf/masters` contents:
+{% highlight bash %}
+master1
+{% endhighlight %}
+`conf/workers` contents:
{% highlight bash %}
-bin/jobmanager.sh ((start|start-foreground) [host] [webui-port])|stop|stop-all
+worker1
+worker2
+worker3
{% endhighlight %}
-#### Adding a TaskManager
+Note that the configuration key [jobmanager.rpc.address]({% link
deployment/config.md %}#jobmanager-rpc-address) needs to be set to `master1`
for this to work.
+
+We show a third example with a standby JobManager in the [high-availability
section](#setting-up-high-availability).
+
+#### Starting and Stopping Flink Components
+
+The `bin/jobmanager.sh` and `bin/taskmanager.sh` scripts support starting the
respective daemon in the background (using the `start` argument), or in the
foreground (using `start-foreground`). In the foreground mode, the logs are
printed to standard out. This mode is useful for deployment scenarios where
another process is controlling the Flink daemon (e.g. Docker).
+
+The scripts can be called multiple times, for example if multiple TaskManagers
are needed. The instances are tracked by the scripts, and can be stopped
one-by-one (using `stop`) or all together (using `stop-all`).
+
+#### Windows Cygwin Users
+
+If you are installing Flink from the git repository and you are using the
Windows git shell, Cygwin can produce a failure similar to this one:
{% highlight bash %}
-bin/taskmanager.sh start|start-foreground|stop|stop-all
+c:/flink/bin/start-cluster.sh: line 30: $'\r': command not found
{% endhighlight %}
-Make sure to call these scripts on the hosts on which you want to start/stop
the respective instance.
+This error occurs because git is automatically transforming UNIX line endings
to Windows style line endings when running on Windows. The problem is that
Cygwin can only deal with UNIX style line endings. The solution is to adjust
the Cygwin settings to deal with the correct line endings by following these
four steps:
+
+1. Start a Cygwin shell.
+
+2. Determine your home directory by entering
-## High-Availability with Standalone
+ ```bash
+ cd; pwd
+ ```
+
+ This will return a path under the Cygwin root path.
+
+3. Using NotePad, WordPad or a different text editor open the file
`.bash_profile` in the home directory and append the following (if the file
does not exist you will have to create it):
+
+ ```bash
+    export SHELLOPTS
+    set -o igncr
+ ```
+
+4. Save the file and open a new bash shell.
+
+### Setting up High-Availability
In order to enable HA for a standalone cluster, you have to use the [ZooKeeper
HA services]({% link deployment/ha/zookeeper_ha.md %}).
Additionally, you have to configure your cluster to start multiple JobManagers.
-### Masters File (masters)
-
In order to start an HA-cluster configure the *masters* file in `conf/masters`:
- **masters file**: The *masters file* contains all hosts, on which
JobManagers are started, and the ports to which the web user interface binds.
<pre>
-jobManagerAddress1:webUIPort1
+master1:webUIPort1
[...]
-jobManagerAddressX:webUIPortX
+masterX:webUIPortX
</pre>
By default, the job manager will pick a *random port* for inter process
communication. You can change this via the
[high-availability.jobmanager.port]({% link deployment/config.md
%}#high-availability-jobmanager-port) key. This key accepts single ports (e.g.
`50010`), ranges (`50000-50025`), or a combination of both
(`50010,50011,50020-50025,50050-50075`).
Review comment:
```suggestion
By default, the JobManager will pick a *random port* for inter process
communication. You can change this via the
[high-availability.jobmanager.port]({% link deployment/config.md
%}#high-availability-jobmanager-port) key. This key accepts single ports (e.g.
`50010`), ranges (`50000-50025`), or a combination of both
(`50010,50011,50020-50025,50050-50075`).
```
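To make the port discussion concrete, pinning the inter-process communication port could look like this sketch (the range is one of the examples named in the paragraph above):
```sh
# restrict the HA JobManager port to a fixed range in conf/flink-conf.yaml
$ echo "high-availability.jobmanager.port: 50000-50025" >> conf/flink-conf.yaml
```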
##########
File path: docs/deployment/resource-providers/standalone/docker.md
##########
@@ -344,209 +393,178 @@ as described in [how to run the Flink
image](#how-to-run-flink-image).
ENV VAR_NAME value
```
- **Commands for building**:
+ **Commands for building**:
```sh
- docker build -t custom_flink_image .
+ $ docker build --tag custom_flink_image .
# optional push to your docker image registry if you have it,
# e.g. to distribute the custom image to your cluster
- docker push custom_flink_image
+ $ docker push custom_flink_image
```
-
-### Enabling Python
-
-To build a custom image which has Python and Pyflink prepared, you can refer
to the following Dockerfile:
-{% highlight Dockerfile %}
-FROM flink
-
-# install python3 and pip3
-RUN apt-get update -y && \
-apt-get install -y python3.7 python3-pip python3.7-dev && rm -rf
/var/lib/apt/lists/*
-RUN ln -s /usr/bin/python3 /usr/bin/python
-# install Python Flink
-RUN pip3 install apache-flink
-{% endhighlight %}
-
-Build the image named as **pyflink:latest**:
-
-{% highlight bash %}
-sudo docker build -t pyflink:latest .
-{% endhighlight %}
-{% top %}
-
-## Flink with Docker Compose
+### Flink with Docker Compose
[Docker Compose](https://docs.docker.com/compose/) is a way to run a group of
Docker containers locally.
-The next chapters show examples of configuration files to run Flink.
+The next sections show examples of configuration files to run Flink.
-### Usage
+#### Usage
* Create the `yaml` files with the container configuration, check examples for:
- * [Session cluster](#session-cluster-with-docker-compose)
- * [Job cluster](#job-cluster-with-docker-compose)
+ * [Application cluster](#app-cluster-yml)
+ * [Session cluster](#session-cluster-yml)
- See also [the Flink Docker image tags](#image-tags) and [how to customize
the Flink Docker image](#advanced-customization)
- for usage in the configuration files.
+ See also [the Flink Docker image tags](#image-tags) and [how to customize
the Flink Docker image](#advanced-customization)
+ for usage in the configuration files.
-* Launch a cluster in the foreground
+* Launch a cluster in the foreground (use `-d` for background)
```sh
- docker-compose up
+ $ docker-compose up
```
-* Launch a cluster in the background
+* Scale the cluster up or down to `N` TaskManagers
```sh
- docker-compose up -d
+ $ docker-compose scale taskmanager=<N>
```
-* Scale the cluster up or down to *N TaskManagers*
+* Access the JobManager container
```sh
- docker-compose scale taskmanager=<N>
- ```
-
-* Access the *JobManager* container
-
- ```sh
- docker exec -it $(docker ps --filter name=jobmanager --format={% raw
%}{{.ID}}{% endraw %}) /bin/sh
+ $ docker exec -it $(docker ps --filter name=jobmanager --format={% raw
%}{{.ID}}{% endraw %}) /bin/sh
```
* Kill the cluster
```sh
- docker-compose kill
+ $ docker-compose kill
```
* Access Web UI
- When the cluster is running, you can visit the web UI at
[http://localhost:8081](http://localhost:8081).
- You can also use the web UI to submit a job to a *Session cluster*.
+ When the cluster is running, you can visit the web UI at
[http://localhost:8081](http://localhost:8081).
+ You can also use the web UI to submit a job to a *Session cluster*.
* To submit a job to a *Session cluster* via the command line, you can either
* use [Flink CLI]({% link deployment/cli.md %}) on the host if it is
installed:
```sh
- flink run -d -c ${JOB_CLASS_NAME} /job.jar
+ $ ./bin/flink run --detached --class ${JOB_CLASS_NAME} /job.jar
```
- * or copy the JAR to the *JobManager* container and submit the job using the
[CLI]({% link deployment/cli.md %}) from there, for example:
+ * or copy the JAR to the JobManager container and submit the job using the
[CLI]({% link deployment/cli.md %}) from there, for example:
```sh
- JOB_CLASS_NAME="com.job.ClassName"
- JM_CONTAINER=$(docker ps --filter name=jobmanager --format={% raw
%}{{.ID}}{% endraw %}))
- docker cp path/to/jar "${JM_CONTAINER}":/job.jar
- docker exec -t -i "${JM_CONTAINER}" flink run -d -c ${JOB_CLASS_NAME}
/job.jar
+ $ JOB_CLASS_NAME="com.job.ClassName"
+ $ JM_CONTAINER=$(docker ps --filter name=jobmanager --format={% raw
%}{{.ID}}{% endraw %})
+ $ docker cp path/to/jar "${JM_CONTAINER}":/job.jar
+ $ docker exec -t -i "${JM_CONTAINER}" flink run -d -c ${JOB_CLASS_NAME}
/job.jar
```
-### Session Cluster with Docker Compose
+Here, we provide the **docker-compose.yml:** <a id="app-cluster-yml">file</a>
for *Application Cluster*.
Review comment:
```suggestion
Here, we provide the <a id="app-cluster-yml">docker-compose.yml</a> for
*Application Cluster*.
```
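With the file in place, the cluster is driven with the same commands listed under Usage above, for example:
```sh
# launch the Application cluster in the background, then scale to two TaskManagers
$ docker-compose up -d
$ docker-compose scale taskmanager=2
```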
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]