tillrohrmann commented on a change in pull request #14258:
URL: https://github.com/apache/flink/pull/14258#discussion_r537496569



##########
File path: docs/deployment/resource-providers/mesos.md
##########
@@ -26,226 +26,217 @@ under the License.
 * This will be replaced by the TOC
 {:toc}
 
-## Background
+## Getting Started
 
-The Mesos implementation consists of two components: The Application Master and
-the Worker. The workers are simple TaskManagers which are parameterized by the 
environment
-set up by the application master. The most sophisticated component of the Mesos
-implementation is the application master. The application master currently 
hosts
-the following components:
+This *Getting Started* section guides you through setting up a fully 
functional Flink Cluster on Mesos.
 
-### Mesos Scheduler
+### Introduction
 
-The scheduler is responsible for registering the framework with Mesos,
-requesting resources, and launching worker nodes. The scheduler continuously
-needs to report back to Mesos to ensure the framework is in a healthy state. To
-verify the health of the cluster, the scheduler monitors the spawned workers 
and
-marks them as failed and restarts them if necessary.
+[Apache Mesos](http://mesos.apache.org/) is another resource provider 
supported by 
+Apache Flink. Flink utilizes the worker's provided by Mesos to run its 
TaskManagers.
+Apache Flink provides the script `bin/mesos-appmaster.sh` to create the Flink 
+on Mesos cluster.
 
-Flink's Mesos scheduler itself is currently not highly available. However, it
-persists all necessary information about its state (e.g. configuration, list of
-workers) in Zookeeper. In the presence of a failure, it relies on an external
-system to bring up a new scheduler. The scheduler will then register with Mesos
-again and go through the reconciliation phase. In the reconciliation phase, the
-scheduler receives a list of running workers nodes. It matches these against 
the
-recovered information from Zookeeper and makes sure to bring back the cluster 
in
-the state before the failure.
 
-### Artifact Server
+### Preparation
 
-The artifact server is responsible for providing resources to the worker
-nodes. The resources can be anything from the Flink binaries to shared secrets
-or configuration files. For instance, in non-containerized environments, the
-artifact server will provide the Flink binaries. What files will be served
-depends on the configuration overlay used.
+Flink on Mesos expects a Mesos cluster to be around. It also requires the 
Flink binaries being deployed
+ontothe the Mesos master. Additionally, Hadoop needs to be installed on the 
very same machine.

Review comment:
       ```suggestion
   onto the Mesos master. Additionally, Hadoop needs to be installed on the very same machine.
   ```

##########
File path: docs/deployment/resource-providers/mesos.md
##########
@@ -26,226 +26,217 @@ under the License.
 * This will be replaced by the TOC
 {:toc}
 
-## Background
+## Getting Started
 
-The Mesos implementation consists of two components: The Application Master and
-the Worker. The workers are simple TaskManagers which are parameterized by the 
environment
-set up by the application master. The most sophisticated component of the Mesos
-implementation is the application master. The application master currently 
hosts
-the following components:
+This *Getting Started* section guides you through setting up a fully 
functional Flink Cluster on Mesos.
 
-### Mesos Scheduler
+### Introduction
 
-The scheduler is responsible for registering the framework with Mesos,
-requesting resources, and launching worker nodes. The scheduler continuously
-needs to report back to Mesos to ensure the framework is in a healthy state. To
-verify the health of the cluster, the scheduler monitors the spawned workers 
and
-marks them as failed and restarts them if necessary.
+[Apache Mesos](http://mesos.apache.org/) is another resource provider 
supported by 
+Apache Flink. Flink utilizes the worker's provided by Mesos to run its 
TaskManagers.
+Apache Flink provides the script `bin/mesos-appmaster.sh` to create the Flink 
+on Mesos cluster.
 
-Flink's Mesos scheduler itself is currently not highly available. However, it
-persists all necessary information about its state (e.g. configuration, list of
-workers) in Zookeeper. In the presence of a failure, it relies on an external
-system to bring up a new scheduler. The scheduler will then register with Mesos
-again and go through the reconciliation phase. In the reconciliation phase, the
-scheduler receives a list of running workers nodes. It matches these against 
the
-recovered information from Zookeeper and makes sure to bring back the cluster 
in
-the state before the failure.
 
-### Artifact Server
+### Preparation
 
-The artifact server is responsible for providing resources to the worker
-nodes. The resources can be anything from the Flink binaries to shared secrets
-or configuration files. For instance, in non-containerized environments, the
-artifact server will provide the Flink binaries. What files will be served
-depends on the configuration overlay used.
+Flink on Mesos expects a Mesos cluster to be around. It also requires the 
Flink binaries being deployed
+ontothe the Mesos master. Additionally, Hadoop needs to be installed on the 
very same machine.
 
-### Flink's JobManager and Web Interface
+Flink provides `bin/mesos-appmaster.sh` to create a Flink on Mesos cluster. It 
will instantiate a 
+JobManager process on the Mesos master. The Mesos workers will be utilized to 
run the TaskManager 

Review comment:
       I think `bin/mesos-appmaster.sh` won't start a process on the Mesos master. Instead, it starts a process using the `MesosSessionClusterEntrypoint` on whichever machine the script is called from.

##########
File path: docs/deployment/resource-providers/mesos.md
##########
@@ -26,226 +26,217 @@ under the License.
 * This will be replaced by the TOC
 {:toc}
 
-## Background
+## Getting Started
 
-The Mesos implementation consists of two components: The Application Master and
-the Worker. The workers are simple TaskManagers which are parameterized by the 
environment
-set up by the application master. The most sophisticated component of the Mesos
-implementation is the application master. The application master currently 
hosts
-the following components:
+This *Getting Started* section guides you through setting up a fully 
functional Flink Cluster on Mesos.
 
-### Mesos Scheduler
+### Introduction
 
-The scheduler is responsible for registering the framework with Mesos,
-requesting resources, and launching worker nodes. The scheduler continuously
-needs to report back to Mesos to ensure the framework is in a healthy state. To
-verify the health of the cluster, the scheduler monitors the spawned workers 
and
-marks them as failed and restarts them if necessary.
+[Apache Mesos](http://mesos.apache.org/) is another resource provider 
supported by 
+Apache Flink. Flink utilizes the worker's provided by Mesos to run its 
TaskManagers.
+Apache Flink provides the script `bin/mesos-appmaster.sh` to create the Flink 

Review comment:
       Technically, `mesos-appmaster.sh` starts the Mesos application master (a JobManager with Mesos integration). The process runs wherever you call the script, which may also be outside of the Mesos cluster.

##########
File path: docs/deployment/resource-providers/mesos.md
##########
@@ -26,226 +26,217 @@ under the License.
 * This will be replaced by the TOC
 {:toc}
 
-## Background
+## Getting Started
 
-The Mesos implementation consists of two components: The Application Master and
-the Worker. The workers are simple TaskManagers which are parameterized by the 
environment
-set up by the application master. The most sophisticated component of the Mesos
-implementation is the application master. The application master currently 
hosts
-the following components:
+This *Getting Started* section guides you through setting up a fully 
functional Flink Cluster on Mesos.
 
-### Mesos Scheduler
+### Introduction
 
-The scheduler is responsible for registering the framework with Mesos,
-requesting resources, and launching worker nodes. The scheduler continuously
-needs to report back to Mesos to ensure the framework is in a healthy state. To
-verify the health of the cluster, the scheduler monitors the spawned workers 
and
-marks them as failed and restarts them if necessary.
+[Apache Mesos](http://mesos.apache.org/) is another resource provider 
supported by 
+Apache Flink. Flink utilizes the worker's provided by Mesos to run its 
TaskManagers.
+Apache Flink provides the script `bin/mesos-appmaster.sh` to create the Flink 
+on Mesos cluster.
 
-Flink's Mesos scheduler itself is currently not highly available. However, it
-persists all necessary information about its state (e.g. configuration, list of
-workers) in Zookeeper. In the presence of a failure, it relies on an external
-system to bring up a new scheduler. The scheduler will then register with Mesos
-again and go through the reconciliation phase. In the reconciliation phase, the
-scheduler receives a list of running workers nodes. It matches these against 
the
-recovered information from Zookeeper and makes sure to bring back the cluster 
in
-the state before the failure.
 
-### Artifact Server
+### Preparation
 
-The artifact server is responsible for providing resources to the worker
-nodes. The resources can be anything from the Flink binaries to shared secrets
-or configuration files. For instance, in non-containerized environments, the
-artifact server will provide the Flink binaries. What files will be served
-depends on the configuration overlay used.
+Flink on Mesos expects a Mesos cluster to be around. It also requires the 
Flink binaries being deployed
+ontothe the Mesos master. Additionally, Hadoop needs to be installed on the 
very same machine.
 
-### Flink's JobManager and Web Interface
+Flink provides `bin/mesos-appmaster.sh` to create a Flink on Mesos cluster. It 
will instantiate a 
+JobManager process on the Mesos master. The Mesos workers will be utilized to 
run the TaskManager 
+processes.
 
-The JobManager and the web interface provide a central point for monitoring,
-job submission, and other client interaction with the cluster
-(see 
[FLIP-6](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65147077)).
-
-### Startup script and configuration overlays
-
-The startup script provide a way to configure and start the application
-master. All further configuration is then inherited by the workers nodes. This
-is achieved using configuration overlays. Configuration overlays provide a way
-to infer configuration from environment variables and config files which are
-shipped to the worker nodes.
-
-
-## DC/OS
-
-This section refers to [DC/OS](https://dcos.io) which is a Mesos distribution
-with a sophisticated application management layer. It comes pre-installed with
-Marathon, a service to supervise applications and maintain their state in case
-of failures.
-
-If you don't have a running DC/OS cluster, please follow the
-[instructions on how to install DC/OS on the official 
website](https://dcos.io/install/).
-
-Once you have a DC/OS cluster, you may install Flink through the DC/OS
-Universe. In the search prompt, just search for Flink. Alternatively, you can 
use the DC/OS CLI:
-
-    dcos package install flink
-
-Further information can be found in the
-[DC/OS examples 
documentation](https://github.com/dcos/examples/tree/master/1.8/flink).
-
-
-## Mesos without DC/OS
-
-You can also run Mesos without DC/OS.
-
-### Installing Mesos
-
-Please follow the [instructions on how to setup Mesos on the official 
website](http://mesos.apache.org/getting-started/).
-
-After installation you have to configure the set of master and agent nodes by 
creating the files `MESOS_HOME/etc/mesos/masters` and 
`MESOS_HOME/etc/mesos/slaves`.
-These files contain in each row a single hostname on which the respective 
component will be started (assuming SSH access to these nodes).
-
-Next you have to create `MESOS_HOME/etc/mesos/mesos-master-env.sh` or use the 
template found in the same directory.
-In this file, you have to define
-
-    export MESOS_work_dir=WORK_DIRECTORY
-
-and it is recommended to uncommment
-
-    export MESOS_log_dir=LOGGING_DIRECTORY
-
-
-In order to configure the Mesos agents, you have to create 
`MESOS_HOME/etc/mesos/mesos-agent-env.sh` or use the template found in the same 
directory.
-You have to configure
-
-    export MESOS_master=MASTER_HOSTNAME:MASTER_PORT
-
-and uncomment
-
-    export MESOS_log_dir=LOGGING_DIRECTORY
-    export MESOS_work_dir=WORK_DIRECTORY
-
-#### Mesos Library
-
-In order to run Java applications with Mesos you have to export 
`MESOS_NATIVE_JAVA_LIBRARY=MESOS_HOME/lib/libmesos.so` on Linux.
-Under Mac OS X you have to export 
`MESOS_NATIVE_JAVA_LIBRARY=MESOS_HOME/lib/libmesos.dylib`.
-
-#### Deploying Mesos
-
-In order to start your mesos cluster, use the deployment script 
`MESOS_HOME/sbin/mesos-start-cluster.sh`.
-In order to stop your mesos cluster, use the deployment script 
`MESOS_HOME/sbin/mesos-stop-cluster.sh`.
-More information about the deployment scripts can be found 
[here](http://mesos.apache.org/documentation/latest/deploy-scripts/).
-
-### Installing Marathon
-
-Optionally, you may also [install 
Marathon](https://mesosphere.github.io/marathon/docs/) which enables you to run 
Flink in [high availability (HA) mode](#high-availability).
-
-### Pre-installing Flink vs Docker/Mesos containers
-
-You may install Flink on all of your Mesos Master and Agent nodes.
-You can also pull the binaries from the Flink web site during deployment and 
apply your custom configuration before launching the application master.
-A more convenient and easier to maintain approach is to use Docker containers 
to manage the Flink binaries and configuration.
-
-This is controlled via the following configuration entries:
-
-    mesos.resourcemanager.tasks.container.type: mesos _or_ docker
-
-If set to 'docker', specify the image name:
-
-    mesos.resourcemanager.tasks.container.image.name: image_name
+For `bin/mesos-appmaster.sh` to work, you have to set the two variables 
`HADOOP_CLASSPATH` and 
+`MESOS_NATIVE_JAVA_LIBRARY`:
 
+{% highlight bash %}
+export HADOOP_CLASSPATH=$(hadoop classpath)
+export MESOS_NATIVE_JAVA_LIBRARY=/path/to/lib/libmesos.so
+{% endhighlight %}
 
-### Flink session cluster on Mesos
+`MESOS_NATIVE_JAVA_LIBRARY` needs to point to Mesos' native Java library. The 
library name `libmesos.so` 
+used above refers to Mesos' Linux library. Running Mesos on MacOS would 
require you to use 
+`libmesos.dylib` instead.
 
-A Flink session cluster is executed as a long-running Mesos Deployment. Note 
that you can run multiple Flink jobs on a session cluster. Each job needs to be 
submitted to the cluster after the cluster has been deployed.
+### Starting a Flink Session on Mesos
 
-In the `/bin` directory of the Flink distribution, you find two startup scripts
-which manage the Flink processes in a Mesos cluster:
+Connect to the Mesos workers, change into Flink's home directory and call 
`bin/mesos-appmaster.sh`:
 
-1. `mesos-appmaster.sh`
-   This starts the Mesos application master which will register the Mesos 
scheduler.
-   It is also responsible for starting up the worker nodes.
+{% highlight bash %}
+# (0) set required environment variables
+export HADOOP_CLASSPATH=$(hadoop classpath)
+export MESOS_NATIVE_JAVA_LIBRARY=/path/to/lib/libmesos.so
 
-2. `mesos-taskmanager.sh`
-   The entry point for the Mesos worker processes.
-   You don't need to explicitly execute this script.
-   It is automatically launched by the Mesos worker node to bring up a new 
TaskManager.
+# (1) create Flink on Mesos cluster
+./bin/mesos-appmaster.sh \
+    -Dmesos.master=$MESOS_MASTER:5050 \
+    -Djobmanager.rpc.address=$MESOS_MASTER \
+    -Dmesos.resourcemanager.framework.user=$FLINK_USER \
+    -Dmesos.resourcemanager.tasks.cpus=6
+{% endhighlight %}
 
-In order to run the `mesos-appmaster.sh` script you have to define 
`mesos.master` in the `flink-conf.yaml` or pass it via `-Dmesos.master=...` to 
the Java process.
+The call above uses two variables not introduced, yet, as they depend on the 
cluster:
+* `MESOS_MASTER` refers to the Mesos master's IP address or hostname. It's 
important to not use `localhost` 
+  or `127.0.0.1` as the corresponding parameters are being shared with the 
Mesos cluster and the TaskManagers.
+* `FLINK_USER` refers to the user that owns the Mesos master's Flink 
installation directory (see Mesos' 
+documentation on [specifying a 
user](http://mesos.apache.org/documentation/latest/fetcher/#specifying-a-user-name)
+for further details).
 
-When executing `mesos-appmaster.sh`, it will create a job manager on the 
machine where you executed the script.
-In contrast to that, the task managers will be run as Mesos tasks in the Mesos 
cluster.
+The Flink on Mesos cluster is now deployed in [Session Mode]({% link 
deployment/index.md %}#session-mode).
+Note that you can run multiple Flink jobs on a Session cluster. Each job needs 
to be submitted to the 
+cluster. TaskManagers are deployed on the Mesos workers as needed. Keep in 
mind that you can only run as 
+many jobs as the Mesos cluster allows in terms of resources provided by the 
Mesos workers. Play around 
+with Flink's parameters to find the right resource utilization for your needs.
 
-### Flink job cluster on Mesos
+Check out [Flink's Mesos configuration]({% link deployment/config.md %}#mesos) 
to further influence 
+the resources Flink on Mesos is going to allocate.
 
-A Flink job cluster is a dedicated cluster which runs a single job.
-There is no extra job submission needed.
+## Deployment Modes Supported by Flink on Mesos
 
-In the `/bin` directory of the Flink distribution, you find one startup script
-which manage the Flink processes in a Mesos cluster:
+For production use, we recommend deploying Flink Applications in the 
+[Per-Job Mode]({% link deployment/index.md %}#per-job-mode), as it provides a 
better isolation 
+for each job.
 
-1. `mesos-appmaster-job.sh`
-   This starts the Mesos application master which will register the Mesos 
scheduler, retrieve the job graph and then launch the task managers accordingly.
+### Application Mode
 
-In order to run the `mesos-appmaster-job.sh` script you have to define 
`mesos.master` and `internal.jobgraph-path` in the `flink-conf.yaml`
-or pass it via `-Dmesos.master=... -Dinterval.jobgraph-path=...` to the Java 
process.
+Flink on Mesos does not support Application Mode.
 
-The job graph file may be generated like this way:
+### Per-Job Cluster Mode
 
+A job which is executed in Per-Job Cluster Mode spins up a dedicated Flink 
cluster that is only 
+used for that specific job. No extra job submission is needed. 
`bin/mesos-appmaster-job.sh` is used 
+as the startup script. It will start a Flink cluster for a dedicated job which 
is passed as a 
+JobGraph file. This file can be created by applying the following code to your 
Job source code:
 {% highlight java %}
 final JobGraph jobGraph = env.getStreamGraph().getJobGraph();
 final String jobGraphFilename = "job.graph";
 File jobGraphFile = new File(jobGraphFilename);
 try (FileOutputStream output = new FileOutputStream(jobGraphFile);
-       ObjectOutputStream obOutput = new ObjectOutputStream(output)){
-       obOutput.writeObject(jobGraph);
+    ObjectOutputStream obOutput = new ObjectOutputStream(output)){
+    obOutput.writeObject(jobGraph);
 }
 {% endhighlight %}
 
-<span class="label label-info">Note</span> Make sure that all Mesos processes 
have the user code jar on the classpath. There are two ways:
-
-1. One way is putting them in the `lib/` directory, which will result in the 
user code jar being loaded by the system classloader.
-1. The other way is creating a `usrlib/` directory in the parent directory of 
`lib/` and putting the user code jar in the `usrlib/` directory.
-After launching a job cluster via `bin/mesos-appmaster-job.sh ...`, the user 
code jar will be loaded by the user code classloader.
-
-#### General configuration
-
-It is possible to completely parameterize a Mesos application through Java 
properties passed to the Mesos application master.
-This also allows to specify general Flink configuration parameters.
-For example:
-
-    bin/mesos-appmaster.sh \
-        -Dmesos.master=master.foobar.org:5050 \
-        -Djobmanager.memory.process.size=1472m \
-        -Djobmanager.rpc.port=6123 \
-        -Drest.port=8081 \
-        -Dtaskmanager.memory.process.size=3500m \
-        -Dtaskmanager.numberOfTaskSlots=2 \
-        -Dparallelism.default=10
-
-### High Availability
-
-You will need to run a service like Marathon or Apache Aurora which takes care 
of restarting the JobManager process in case of node or process failures.
-In addition, Zookeeper needs to be configured like described in the [High 
Availability section of the Flink docs]({% link deployment/ha/index.md %}).
+Flink on Mesos Per-Job cluster can be started in the following way:
+{% highlight bash %}
+# (0) set required environment variables
+export HADOOP_CLASSPATH=$(hadoop classpath)
+export MESOS_NATIVE_JAVA_LIBRARY=/path/to/lib/libmesos.so
+
+# (1) create Per-Job Flink on Mesos cluster
+./bin/mesos-appmaster-job.sh \
+    -Dmesos.master=$MESOS_MASTER:5050 \
+    -Djobmanager.rpc.address=$MESOS_MASTER \
+    -Dmesos.resourcemanager.framework.user=$FLINK_USER \
+    -Dinternal.jobgraph-path=$JOB_GRAPH_FILE
+{% endhighlight %} 
+
+`JOB_GRAPH_FILE` in the command above refers to the path of the uploaded 
JobGraph file defining the 
+job that shall be executed on the Per-Job Flink cluster. The meaning of 
`MESOS_MASTER` and `FLINK_USER` 
+are described in the [Getting Started](#starting-a-flink-session-on-mesos) 
guide of this page.
+
+### Session Mode
+
+The [Getting Started](#starting-a-flink-session-on-mesos) guide at the top of 
this page describes 
+deploying Flink in Session Mode.
+
+## Flink on Mesos Reference
+
+### Flink on Mesos Architecture
+
+The Flink on Mesos implementation consists of two components: The application 
master and the workers. 
+The workers are simple TaskManagers parameterized by the environment which is 
set up through the 
+application master. The most sophisticated component of the Flink on Mesos 
implementation is the 
+application master. The application master currently hosts the following 
components:
+- **Mesos Scheduler**: The Scheduler is responsible for registering a 
framework with Mesos, requesting 
+  resources, and launching worker nodes. The Scheduler continuously needs to 
report back to Mesos to 
+  ensure the framework is in a healthy state. To verify the health of the 
cluster, the Scheduler 
+  monitors the spawned workers, marks them as failed and restarts them if 
necessary.
+
+  Flink's Mesos Scheduler itself is currently not highly available. However, 
it persists all necessary 
+  information about its state (e.g. configuration, list of workers) in 
[ZooKeeper](#high-availability-on-mesos). 
+  In the presence of a failure, it relies on an external system to bring up a 
new Scheduler (see the 
+  [Marathon subsection](#marathon) for further details). The Scheduler will 
then register with Mesos 
+  again and go through the reconciliation phase. In the reconciliation phase, 
the Scheduler receives 
+  a list of running workers nodes. It matches these against the recovered 
information from ZooKeeper 
+  and makes sure to bring back the cluster in the state before the failure.
+- **Artifact Server**: The Artifact Server is responsible for providing 
resources to the worker nodes. 
+  The resources can be anything from the Flink binaries to shared secrets or 
configuration files. 
+  For instance, in non-containerized environments, the Artifact Server will 
provide the Flink binaries. 
+  What files will be served depends on the configuration overlay used.
+
+Flink's Mesos startup scripts `bin/mesos-appmaster.sh` and 
`bin/mesos-appmaster-job.sh` provide a way 
+to configure and start the application master. The worker nodes inherit all 
further configuration. 
+They are deployed through `bin/mesos-taskmanager.sh`. The configuration 
inheritance is achieved using 
+configuration overlays. Configuration overlays provide a way to infer a 
configuration from environment 
+variables and config files which are shipped to the worker nodes.
+
+See [Mesos 
Architecture](http://mesos.apache.org/documentation/latest/architecture/) for a 
more details 
+on how frameworks are handled by Mesos.
+
+### Deploying User Libraries
+
+User libraries can be passed to the Mesos workers by placing them in Flink's 
`lib/` folder. This way, 
+they will be picked by Mesos' Fetcher and copied over into the worker's 
sandbox folders. Alternatively, 
+Docker containerization can be used as described in [Installing Flink on the 
Workers](#installing-flink-on-the-workers).
+
+### Installing Flink on the Workers
+
+Flink on Mesos offers two ways to distribute the Flink and user binaries 
within the Mesos cluster:
+1. **Using Mesos' Artifact Server**: The Artifact Server provides the 
resources which are moved by 
+   [Mesos' Fetcher](http://mesos.apache.org/documentation/latest/fetcher/) 
into the Mesos worker's 
+   [sandbox folders](http://mesos.apache.org/documentation/latest/sandbox/). 
It can be explicitly 
+   specified by setting [mesos.resourcemanager.tasks.container.type]({% link 
deployment/config.md %}#mesos-resourcemanager-tasks-container-type) 
+   to `mesos`. This is the default option and is used in the example commands 
of this page.
+2. **Using Docker containerization**: This enables the user to provide user 
libraries and other 
+   customizations as part of a Docker image. Docker utilization can be enabled 
by setting 
+   [mesos.resourcemanager.tasks.container.type]({% link deployment/config.md 
%}#mesos-resourcemanager-tasks-container-type) 
+   to `docker` and by providing the image name through 
[mesos.resourcemanager.tasks.container.image.name]({% link deployment/config.md 
%}#mesos-resourcemanager-tasks-container-image-name).
+
+### High Availability on Mesos
+
+You will need to run a service like Marathon or Apache Aurora which takes care 
of restarting the 
+JobManager process in case of node or process failures. In addition, Zookeeper 
needs to be configured 
+as described in the [High Availability section of the Flink docs]({% link 
deployment/ha/index.md %}).
 
 #### Marathon
 
-Marathon needs to be set up to launch the `bin/mesos-appmaster.sh` script.
-In particular, it should also adjust any configuration parameters for the 
Flink cluster.
+Marathon needs to be set up to launch the `bin/mesos-appmaster.sh` script. In 
particular, it should 
+also adjust any configuration parameters for the Flink cluster.
 
 Here is an example configuration for Marathon:
+{% highlight javascript %}
+{
+    "id": "flink",
+    "cmd": "/opt/flink-1.11.2/bin/mesos-appmaster.sh 
-Dmesos.resourcemanager.framework.user=root 
-Dmesos.resourcemanager.tasks.taskmanager-cmd=/opt/flink-1.11.2/bin/mesos-taskmanager.sh
 -Dmesos.master=master:5050 -Djobmanager.memory.process.size=1472m 
-Dtaskmanager.memory.process.size=3500m -Dtaskmanager.numberOfTaskSlots=2 
-Dparallelism.default=2",
+    "cpus": 2,
+    "mem": 1024,
+    "disk": 0,
+    "instances": 1,
+    "env": {
+        "MESOS_NATIVE_JAVA_LIBRARY": "/usr/lib/libmesos.so",
+        "HADOOP_CLASSPATH": 
"/opt/hadoop-2.10.1/etc/hadoop:/opt/hadoop-2.10.1/share/hadoop/common/lib/*:/opt/hadoop-2.10.1/share/hadoop/common/*:/opt/hadoop-2.10.1/share/hadoop/hdfs:/opt/hadoop-2.10.1/share/hadoop/hdfs/lib/*:/opt/hadoop-2.10.1/share/hadoop/hdfs/*:/opt/hadoop-2.10.1/share/hadoop/yarn:/opt/hadoop-2.10.1/share/hadoop/yarn/lib/*:/opt/hadoop-2.10.1/share/hadoop/yarn/*:/opt/hadoop-2.10.1/share/hadoop/mapreduce/lib/*:/opt/hadoop-2.10.1/share/hadoop/mapreduce/*:/contrib/capacity-scheduler/*.jar"
+    },
+    "healthChecks": [
+        {
+            "protocol": "HTTP",
+            "path": "/",
+            "port": 8081,
+            "gracePeriodSeconds": 300,
+            "intervalSeconds": 60,
+            "timeoutSeconds": 20,
+            "maxConsecutiveFailures": 3
+        }
+    ],
+    "user": "root"
+}
+Flink is installed into `/opt/flink-1.11.2` for this example having `root` as 
the owner of the Flink 

Review comment:
       I think we can refer to the current version via
   ```suggestion
   Flink is installed into `/opt/flink-{{site.version}}` for this example having `root` as the owner of the Flink 
   ```
   
   The same applies to the Marathon configuration provided above.
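
   As a rough sketch of what that could look like for the Marathon configuration, here is the `cmd` entry from the diff with only the hard-coded `1.11.2` replaced. This assumes Jekyll resolves `{{site.version}}` inside the `{% highlight %}` block, which should be checked against the rendered page:

   ```
   "cmd": "/opt/flink-{{site.version}}/bin/mesos-appmaster.sh -Dmesos.resourcemanager.framework.user=root -Dmesos.resourcemanager.tasks.taskmanager-cmd=/opt/flink-{{site.version}}/bin/mesos-taskmanager.sh -Dmesos.master=master:5050 -Djobmanager.memory.process.size=1472m -Dtaskmanager.memory.process.size=3500m -Dtaskmanager.numberOfTaskSlots=2 -Dparallelism.default=2",
   ```

   All other options are unchanged from the example in the diff.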

##########
File path: docs/deployment/resource-providers/mesos.md
##########
@@ -26,226 +26,217 @@ under the License.
 * This will be replaced by the TOC
 {:toc}
 
-## Background
+## Getting Started
 
-The Mesos implementation consists of two components: The Application Master and
-the Worker. The workers are simple TaskManagers which are parameterized by the 
environment
-set up by the application master. The most sophisticated component of the Mesos
-implementation is the application master. The application master currently 
hosts
-the following components:
+This *Getting Started* section guides you through setting up a fully 
functional Flink Cluster on Mesos.
 
-### Mesos Scheduler
+### Introduction
 
-The scheduler is responsible for registering the framework with Mesos,
-requesting resources, and launching worker nodes. The scheduler continuously
-needs to report back to Mesos to ensure the framework is in a healthy state. To
-verify the health of the cluster, the scheduler monitors the spawned workers 
and
-marks them as failed and restarts them if necessary.
+[Apache Mesos](http://mesos.apache.org/) is another resource provider 
supported by 
+Apache Flink. Flink utilizes the worker's provided by Mesos to run its 
TaskManagers.
+Apache Flink provides the script `bin/mesos-appmaster.sh` to create the Flink 
+on Mesos cluster.
 
-Flink's Mesos scheduler itself is currently not highly available. However, it
-persists all necessary information about its state (e.g. configuration, list of
-workers) in Zookeeper. In the presence of a failure, it relies on an external
-system to bring up a new scheduler. The scheduler will then register with Mesos
-again and go through the reconciliation phase. In the reconciliation phase, the
-scheduler receives a list of running workers nodes. It matches these against 
the
-recovered information from Zookeeper and makes sure to bring back the cluster 
in
-the state before the failure.
 
-### Artifact Server
+### Preparation
 
-The artifact server is responsible for providing resources to the worker
-nodes. The resources can be anything from the Flink binaries to shared secrets
-or configuration files. For instance, in non-containerized environments, the
-artifact server will provide the Flink binaries. What files will be served
-depends on the configuration overlay used.
+Flink on Mesos expects a Mesos cluster to be around. It also requires the 
Flink binaries being deployed
+ontothe the Mesos master. Additionally, Hadoop needs to be installed on the 
very same machine.
 
-### Flink's JobManager and Web Interface
+Flink provides `bin/mesos-appmaster.sh` to create a Flink on Mesos cluster. It 
will instantiate a 
+JobManager process on the Mesos master. The Mesos workers will be utilized to 
run the TaskManager 
+processes.
 
-The JobManager and the web interface provide a central point for monitoring,
-job submission, and other client interaction with the cluster
-(see 
[FLIP-6](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65147077)).
-
-### Startup script and configuration overlays
-
-The startup script provide a way to configure and start the application
-master. All further configuration is then inherited by the workers nodes. This
-is achieved using configuration overlays. Configuration overlays provide a way
-to infer configuration from environment variables and config files which are
-shipped to the worker nodes.
-
-
-## DC/OS
-
-This section refers to [DC/OS](https://dcos.io) which is a Mesos distribution
-with a sophisticated application management layer. It comes pre-installed with
-Marathon, a service to supervise applications and maintain their state in case
-of failures.
-
-If you don't have a running DC/OS cluster, please follow the
-[instructions on how to install DC/OS on the official website](https://dcos.io/install/).
-
-Once you have a DC/OS cluster, you may install Flink through the DC/OS
-Universe. In the search prompt, just search for Flink. Alternatively, you can use the DC/OS CLI:
-
-    dcos package install flink
-
-Further information can be found in the
-[DC/OS examples documentation](https://github.com/dcos/examples/tree/master/1.8/flink).
-
-
-## Mesos without DC/OS
-
-You can also run Mesos without DC/OS.
-
-### Installing Mesos
-
-Please follow the [instructions on how to setup Mesos on the official website](http://mesos.apache.org/getting-started/).
-
-After installation you have to configure the set of master and agent nodes by creating the files `MESOS_HOME/etc/mesos/masters` and `MESOS_HOME/etc/mesos/slaves`.
-These files contain in each row a single hostname on which the respective component will be started (assuming SSH access to these nodes).
-
-Next you have to create `MESOS_HOME/etc/mesos/mesos-master-env.sh` or use the template found in the same directory.
-In this file, you have to define
-
-    export MESOS_work_dir=WORK_DIRECTORY
-
-and it is recommended to uncommment
-
-    export MESOS_log_dir=LOGGING_DIRECTORY
-
-
-In order to configure the Mesos agents, you have to create `MESOS_HOME/etc/mesos/mesos-agent-env.sh` or use the template found in the same directory.
-You have to configure
-
-    export MESOS_master=MASTER_HOSTNAME:MASTER_PORT
-
-and uncomment
-
-    export MESOS_log_dir=LOGGING_DIRECTORY
-    export MESOS_work_dir=WORK_DIRECTORY
-
-#### Mesos Library
-
-In order to run Java applications with Mesos you have to export `MESOS_NATIVE_JAVA_LIBRARY=MESOS_HOME/lib/libmesos.so` on Linux.
-Under Mac OS X you have to export `MESOS_NATIVE_JAVA_LIBRARY=MESOS_HOME/lib/libmesos.dylib`.
-
-#### Deploying Mesos
-
-In order to start your mesos cluster, use the deployment script `MESOS_HOME/sbin/mesos-start-cluster.sh`.
-In order to stop your mesos cluster, use the deployment script `MESOS_HOME/sbin/mesos-stop-cluster.sh`.
-More information about the deployment scripts can be found [here](http://mesos.apache.org/documentation/latest/deploy-scripts/).
-
-### Installing Marathon
-
-Optionally, you may also [install Marathon](https://mesosphere.github.io/marathon/docs/) which enables you to run Flink in [high availability (HA) mode](#high-availability).
-
-### Pre-installing Flink vs Docker/Mesos containers
-
-You may install Flink on all of your Mesos Master and Agent nodes.
-You can also pull the binaries from the Flink web site during deployment and apply your custom configuration before launching the application master.
-A more convenient and easier to maintain approach is to use Docker containers to manage the Flink binaries and configuration.
-
-This is controlled via the following configuration entries:
-
-    mesos.resourcemanager.tasks.container.type: mesos _or_ docker
-
-If set to 'docker', specify the image name:
-
-    mesos.resourcemanager.tasks.container.image.name: image_name
+For `bin/mesos-appmaster.sh` to work, you have to set the two variables `HADOOP_CLASSPATH` and
+`MESOS_NATIVE_JAVA_LIBRARY`:
 
+{% highlight bash %}
+export HADOOP_CLASSPATH=$(hadoop classpath)
+export MESOS_NATIVE_JAVA_LIBRARY=/path/to/lib/libmesos.so
+{% endhighlight %}
 
-### Flink session cluster on Mesos
+`MESOS_NATIVE_JAVA_LIBRARY` needs to point to Mesos' native Java library. The library name `libmesos.so`
+used above refers to Mesos' Linux library. Running Mesos on MacOS would require you to use
+`libmesos.dylib` instead.
 
-A Flink session cluster is executed as a long-running Mesos Deployment. Note that you can run multiple Flink jobs on a session cluster. Each job needs to be submitted to the cluster after the cluster has been deployed.
+### Starting a Flink Session on Mesos
 
-In the `/bin` directory of the Flink distribution, you find two startup scripts
-which manage the Flink processes in a Mesos cluster:
+Connect to the Mesos workers, change into Flink's home directory and call `bin/mesos-appmaster.sh`:
 
-1. `mesos-appmaster.sh`
-   This starts the Mesos application master which will register the Mesos scheduler.
-   It is also responsible for starting up the worker nodes.
+{% highlight bash %}
+# (0) set required environment variables
+export HADOOP_CLASSPATH=$(hadoop classpath)
+export MESOS_NATIVE_JAVA_LIBRARY=/path/to/lib/libmesos.so
 
-2. `mesos-taskmanager.sh`
-   The entry point for the Mesos worker processes.
-   You don't need to explicitly execute this script.
-   It is automatically launched by the Mesos worker node to bring up a new TaskManager.
+# (1) create Flink on Mesos cluster
+./bin/mesos-appmaster.sh \
+    -Dmesos.master=$MESOS_MASTER:5050 \
+    -Djobmanager.rpc.address=$MESOS_MASTER \
+    -Dmesos.resourcemanager.framework.user=$FLINK_USER \
+    -Dmesos.resourcemanager.tasks.cpus=6
+{% endhighlight %}
 
-In order to run the `mesos-appmaster.sh` script you have to define `mesos.master` in the `flink-conf.yaml` or pass it via `-Dmesos.master=...` to the Java process.
+The call above uses two variables not introduced, yet, as they depend on the cluster:
+* `MESOS_MASTER` refers to the Mesos master's IP address or hostname. It's important to not use `localhost`
+  or `127.0.0.1` as the corresponding parameters are being shared with the Mesos cluster and the TaskManagers.
+* `FLINK_USER` refers to the user that owns the Mesos master's Flink installation directory (see Mesos'
+documentation on [specifying a user](http://mesos.apache.org/documentation/latest/fetcher/#specifying-a-user-name)
+for further details).
 
-When executing `mesos-appmaster.sh`, it will create a job manager on the machine where you executed the script.
-In contrast to that, the task managers will be run as Mesos tasks in the Mesos cluster.
+The Flink on Mesos cluster is now deployed in [Session Mode]({% link deployment/index.md %}#session-mode).
+Note that you can run multiple Flink jobs on a Session cluster. Each job needs to be submitted to the
+cluster. TaskManagers are deployed on the Mesos workers as needed. Keep in mind that you can only run as
+many jobs as the Mesos cluster allows in terms of resources provided by the Mesos workers. Play around
+with Flink's parameters to find the right resource utilization for your needs.
 
-### Flink job cluster on Mesos
+Check out [Flink's Mesos configuration]({% link deployment/config.md %}#mesos) to further influence
+the resources Flink on Mesos is going to allocate.
 
-A Flink job cluster is a dedicated cluster which runs a single job.
-There is no extra job submission needed.
+## Deployment Modes Supported by Flink on Mesos
 
-In the `/bin` directory of the Flink distribution, you find one startup script
-which manage the Flink processes in a Mesos cluster:
+For production use, we recommend deploying Flink Applications in the 
+[Per-Job Mode]({% link deployment/index.md %}#per-job-mode), as it provides a better isolation
+for each job.
 
-1. `mesos-appmaster-job.sh`
-   This starts the Mesos application master which will register the Mesos scheduler, retrieve the job graph and then launch the task managers accordingly.
+### Application Mode
 
-In order to run the `mesos-appmaster-job.sh` script you have to define `mesos.master` and `internal.jobgraph-path` in the `flink-conf.yaml`
-or pass it via `-Dmesos.master=... -Dinterval.jobgraph-path=...` to the Java process.
+Flink on Mesos does not support Application Mode.
 
-The job graph file may be generated like this way:
+### Per-Job Cluster Mode
 
+A job which is executed in Per-Job Cluster Mode spins up a dedicated Flink cluster that is only
+used for that specific job. No extra job submission is needed. `bin/mesos-appmaster-job.sh` is used
+as the startup script. It will start a Flink cluster for a dedicated job which is passed as a
+JobGraph file. This file can be created by applying the following code to your Job source code:
 {% highlight java %}
 final JobGraph jobGraph = env.getStreamGraph().getJobGraph();
 final String jobGraphFilename = "job.graph";
 File jobGraphFile = new File(jobGraphFilename);
 try (FileOutputStream output = new FileOutputStream(jobGraphFile);
-       ObjectOutputStream obOutput = new ObjectOutputStream(output)){
-       obOutput.writeObject(jobGraph);
+    ObjectOutputStream obOutput = new ObjectOutputStream(output)){
+    obOutput.writeObject(jobGraph);
 }
 {% endhighlight %}
 
-<span class="label label-info">Note</span> Make sure that all Mesos processes have the user code jar on the classpath. There are two ways:
-
-1. One way is putting them in the `lib/` directory, which will result in the user code jar being loaded by the system classloader.
-1. The other way is creating a `usrlib/` directory in the parent directory of `lib/` and putting the user code jar in the `usrlib/` directory.
-After launching a job cluster via `bin/mesos-appmaster-job.sh ...`, the user code jar will be loaded by the user code classloader.
-
-#### General configuration
-
-It is possible to completely parameterize a Mesos application through Java properties passed to the Mesos application master.
-This also allows to specify general Flink configuration parameters.
-For example:
-
-    bin/mesos-appmaster.sh \
-        -Dmesos.master=master.foobar.org:5050 \
-        -Djobmanager.memory.process.size=1472m \
-        -Djobmanager.rpc.port=6123 \
-        -Drest.port=8081 \
-        -Dtaskmanager.memory.process.size=3500m \
-        -Dtaskmanager.numberOfTaskSlots=2 \
-        -Dparallelism.default=10
-
-### High Availability
-
-You will need to run a service like Marathon or Apache Aurora which takes care of restarting the JobManager process in case of node or process failures.
-In addition, Zookeeper needs to be configured like described in the [High Availability section of the Flink docs]({% link deployment/ha/index.md %}).
+Flink on Mesos Per-Job cluster can be started in the following way:
+{% highlight bash %}
+# (0) set required environment variables
+export HADOOP_CLASSPATH=$(hadoop classpath)
+export MESOS_NATIVE_JAVA_LIBRARY=/path/to/lib/libmesos.so
+
+# (1) create Per-Job Flink on Mesos cluster
+./bin/mesos-appmaster-job.sh \
+    -Dmesos.master=$MESOS_MASTER:5050 \
+    -Djobmanager.rpc.address=$MESOS_MASTER \
+    -Dmesos.resourcemanager.framework.user=$FLINK_USER \
+    -Dinternal.jobgraph-path=$JOB_GRAPH_FILE
+{% endhighlight %} 
+
+`JOB_GRAPH_FILE` in the command above refers to the path of the uploaded JobGraph file defining the
+job that shall be executed on the Per-Job Flink cluster. The meaning of `MESOS_MASTER` and `FLINK_USER`
+are described in the [Getting Started](#starting-a-flink-session-on-mesos) guide of this page.
+
+### Session Mode
+
+The [Getting Started](#starting-a-flink-session-on-mesos) guide at the top of this page describes
+deploying Flink in Session Mode.
+
+## Flink on Mesos Reference
+
+### Flink on Mesos Architecture
+
+The Flink on Mesos implementation consists of two components: The application master and the workers.
+The workers are simple TaskManagers parameterized by the environment which is set up through the
+application master. The most sophisticated component of the Flink on Mesos implementation is the
+application master. The application master currently hosts the following components:
+- **Mesos Scheduler**: The Scheduler is responsible for registering a framework with Mesos, requesting
+  resources, and launching worker nodes. The Scheduler continuously needs to report back to Mesos to
+  ensure the framework is in a healthy state. To verify the health of the cluster, the Scheduler
+  monitors the spawned workers, marks them as failed and restarts them if necessary.
+
+  Flink's Mesos Scheduler itself is currently not highly available. However, it persists all necessary
+  information about its state (e.g. configuration, list of workers) in [ZooKeeper](#high-availability-on-mesos).
+  In the presence of a failure, it relies on an external system to bring up a new Scheduler (see the
+  [Marathon subsection](#marathon) for further details). The Scheduler will then register with Mesos
+  again and go through the reconciliation phase. In the reconciliation phase, the Scheduler receives
+  a list of running workers nodes. It matches these against the recovered information from ZooKeeper
+  and makes sure to bring back the cluster in the state before the failure.
+- **Artifact Server**: The Artifact Server is responsible for providing resources to the worker nodes.
+  The resources can be anything from the Flink binaries to shared secrets or configuration files.
+  For instance, in non-containerized environments, the Artifact Server will provide the Flink binaries.
+  What files will be served depends on the configuration overlay used.
+
+Flink's Mesos startup scripts `bin/mesos-appmaster.sh` and `bin/mesos-appmaster-job.sh` provide a way
+to configure and start the application master. The worker nodes inherit all further configuration.
+They are deployed through `bin/mesos-taskmanager.sh`. The configuration inheritance is achieved using
+configuration overlays. Configuration overlays provide a way to infer a configuration from environment
+variables and config files which are shipped to the worker nodes.
+
+See [Mesos Architecture](http://mesos.apache.org/documentation/latest/architecture/) for a more details
+on how frameworks are handled by Mesos.
+
+### Deploying User Libraries
+
+User libraries can be passed to the Mesos workers by placing them in Flink's `lib/` folder. This way,
+they will be picked by Mesos' Fetcher and copied over into the worker's sandbox folders. Alternatively,
+Docker containerization can be used as described in [Installing Flink on the Workers](#installing-flink-on-the-workers).
+
+### Installing Flink on the Workers
+
+Flink on Mesos offers two ways to distribute the Flink and user binaries within the Mesos cluster:
+1. **Using Mesos' Artifact Server**: The Artifact Server provides the resources which are moved by
+   [Mesos' Fetcher](http://mesos.apache.org/documentation/latest/fetcher/) into the Mesos worker's
+   [sandbox folders](http://mesos.apache.org/documentation/latest/sandbox/). It can be explicitly
+   specified by setting [mesos.resourcemanager.tasks.container.type]({% link deployment/config.md %}#mesos-resourcemanager-tasks-container-type)
+   to `mesos`. This is the default option and is used in the example commands of this page.
+2. **Using Docker containerization**: This enables the user to provide user libraries and other
+   customizations as part of a Docker image. Docker utilization can be enabled by setting
+   [mesos.resourcemanager.tasks.container.type]({% link deployment/config.md %}#mesos-resourcemanager-tasks-container-type)
+   to `docker` and by providing the image name through [mesos.resourcemanager.tasks.container.image.name]({% link deployment/config.md %}#mesos-resourcemanager-tasks-container-image-name).
+
+### High Availability on Mesos
+
+You will need to run a service like Marathon or Apache Aurora which takes care of restarting the
+JobManager process in case of node or process failures. In addition, Zookeeper needs to be configured
+as described in the [High Availability section of the Flink docs]({% link deployment/ha/index.md %}).
 
 #### Marathon
 
-Marathon needs to be set up to launch the `bin/mesos-appmaster.sh` script.
-In particular, it should also adjust any configuration parameters for the Flink cluster.
+Marathon needs to be set up to launch the `bin/mesos-appmaster.sh` script. In particular, it should
+also adjust any configuration parameters for the Flink cluster.
 
 Here is an example configuration for Marathon:
+{% highlight javascript %}
+{
+    "id": "flink",
+    "cmd": "/opt/flink-1.11.2/bin/mesos-appmaster.sh -Dmesos.resourcemanager.framework.user=root -Dmesos.resourcemanager.tasks.taskmanager-cmd=/opt/flink-1.11.2/bin/mesos-taskmanager.sh -Dmesos.master=master:5050 -Djobmanager.memory.process.size=1472m -Dtaskmanager.memory.process.size=3500m -Dtaskmanager.numberOfTaskSlots=2 -Dparallelism.default=2",

Review comment:
       Do we have to specify all the memory configurations or can we use the default values?
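
       For illustration only, a rough sketch of what the shortened entry could look like if the memory flags were dropped. It assumes the `flink-conf.yaml` of the installation already defines `jobmanager.memory.process.size` and `taskmanager.memory.process.size` with acceptable values, which has not been checked against this PR. The remaining flags are copied from the diff above, and all other fields of the Marathon app definition are left out:

```json
{
    "id": "flink",
    "cmd": "/opt/flink-1.11.2/bin/mesos-appmaster.sh -Dmesos.resourcemanager.framework.user=root -Dmesos.resourcemanager.tasks.taskmanager-cmd=/opt/flink-1.11.2/bin/mesos-taskmanager.sh -Dmesos.master=master:5050 -Dtaskmanager.numberOfTaskSlots=2 -Dparallelism.default=2"
}
```

       Any option not passed via `-D` would then be read from the configuration file, so the example stays valid only as long as that file carries the memory settings.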

##########
File path: docs/deployment/resource-providers/mesos.md
##########
@@ -26,226 +26,217 @@ under the License.
 * This will be replaced by the TOC
 {:toc}
 
-## Background
+## Getting Started
 
-The Mesos implementation consists of two components: The Application Master and
-the Worker. The workers are simple TaskManagers which are parameterized by the environment
-set up by the application master. The most sophisticated component of the Mesos
-implementation is the application master. The application master currently hosts
-the following components:
+This *Getting Started* section guides you through setting up a fully functional Flink Cluster on Mesos.
 
-### Mesos Scheduler
+### Introduction
 
-The scheduler is responsible for registering the framework with Mesos,
-requesting resources, and launching worker nodes. The scheduler continuously
-needs to report back to Mesos to ensure the framework is in a healthy state. To
-verify the health of the cluster, the scheduler monitors the spawned workers and
-marks them as failed and restarts them if necessary.
+[Apache Mesos](http://mesos.apache.org/) is another resource provider supported by
+Apache Flink. Flink utilizes the worker's provided by Mesos to run its TaskManagers.
+Apache Flink provides the script `bin/mesos-appmaster.sh` to create the Flink 
+on Mesos cluster.
 
-Flink's Mesos scheduler itself is currently not highly available. However, it
-persists all necessary information about its state (e.g. configuration, list of
-workers) in Zookeeper. In the presence of a failure, it relies on an external
-system to bring up a new scheduler. The scheduler will then register with Mesos
-again and go through the reconciliation phase. In the reconciliation phase, the
-scheduler receives a list of running workers nodes. It matches these against the
-recovered information from Zookeeper and makes sure to bring back the cluster in
-the state before the failure.
 
-### Artifact Server
+### Preparation
 
-The artifact server is responsible for providing resources to the worker
-nodes. The resources can be anything from the Flink binaries to shared secrets
-or configuration files. For instance, in non-containerized environments, the
-artifact server will provide the Flink binaries. What files will be served
-depends on the configuration overlay used.
+Flink on Mesos expects a Mesos cluster to be around. It also requires the Flink binaries being deployed
+ontothe the Mesos master. Additionally, Hadoop needs to be installed on the very same machine.
 
-### Flink's JobManager and Web Interface
+Flink provides `bin/mesos-appmaster.sh` to create a Flink on Mesos cluster. It will instantiate a
+JobManager process on the Mesos master. The Mesos workers will be utilized to run the TaskManager
+processes.
 
-The JobManager and the web interface provide a central point for monitoring,
-job submission, and other client interaction with the cluster
-(see [FLIP-6](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65147077)).
-
-### Startup script and configuration overlays
-
-The startup script provide a way to configure and start the application
-master. All further configuration is then inherited by the workers nodes. This
-is achieved using configuration overlays. Configuration overlays provide a way
-to infer configuration from environment variables and config files which are
-shipped to the worker nodes.
-
-
-## DC/OS
-
-This section refers to [DC/OS](https://dcos.io) which is a Mesos distribution
-with a sophisticated application management layer. It comes pre-installed with
-Marathon, a service to supervise applications and maintain their state in case
-of failures.
-
-If you don't have a running DC/OS cluster, please follow the
-[instructions on how to install DC/OS on the official website](https://dcos.io/install/).
-
-Once you have a DC/OS cluster, you may install Flink through the DC/OS
-Universe. In the search prompt, just search for Flink. Alternatively, you can use the DC/OS CLI:
-
-    dcos package install flink
-
-Further information can be found in the
-[DC/OS examples documentation](https://github.com/dcos/examples/tree/master/1.8/flink).
-
-
-## Mesos without DC/OS
-
-You can also run Mesos without DC/OS.
-
-### Installing Mesos
-
-Please follow the [instructions on how to setup Mesos on the official website](http://mesos.apache.org/getting-started/).
-
-After installation you have to configure the set of master and agent nodes by creating the files `MESOS_HOME/etc/mesos/masters` and `MESOS_HOME/etc/mesos/slaves`.
-These files contain in each row a single hostname on which the respective component will be started (assuming SSH access to these nodes).
-
-Next you have to create `MESOS_HOME/etc/mesos/mesos-master-env.sh` or use the template found in the same directory.
-In this file, you have to define
-
-    export MESOS_work_dir=WORK_DIRECTORY
-
-and it is recommended to uncommment
-
-    export MESOS_log_dir=LOGGING_DIRECTORY
-
-
-In order to configure the Mesos agents, you have to create `MESOS_HOME/etc/mesos/mesos-agent-env.sh` or use the template found in the same directory.
-You have to configure
-
-    export MESOS_master=MASTER_HOSTNAME:MASTER_PORT
-
-and uncomment
-
-    export MESOS_log_dir=LOGGING_DIRECTORY
-    export MESOS_work_dir=WORK_DIRECTORY
-
-#### Mesos Library
-
-In order to run Java applications with Mesos you have to export `MESOS_NATIVE_JAVA_LIBRARY=MESOS_HOME/lib/libmesos.so` on Linux.
-Under Mac OS X you have to export `MESOS_NATIVE_JAVA_LIBRARY=MESOS_HOME/lib/libmesos.dylib`.
-
-#### Deploying Mesos
-
-In order to start your mesos cluster, use the deployment script `MESOS_HOME/sbin/mesos-start-cluster.sh`.
-In order to stop your mesos cluster, use the deployment script `MESOS_HOME/sbin/mesos-stop-cluster.sh`.
-More information about the deployment scripts can be found [here](http://mesos.apache.org/documentation/latest/deploy-scripts/).
-
-### Installing Marathon
-
-Optionally, you may also [install Marathon](https://mesosphere.github.io/marathon/docs/) which enables you to run Flink in [high availability (HA) mode](#high-availability).
-
-### Pre-installing Flink vs Docker/Mesos containers
-
-You may install Flink on all of your Mesos Master and Agent nodes.
-You can also pull the binaries from the Flink web site during deployment and apply your custom configuration before launching the application master.
-A more convenient and easier to maintain approach is to use Docker containers to manage the Flink binaries and configuration.
-
-This is controlled via the following configuration entries:
-
-    mesos.resourcemanager.tasks.container.type: mesos _or_ docker
-
-If set to 'docker', specify the image name:
-
-    mesos.resourcemanager.tasks.container.image.name: image_name
+For `bin/mesos-appmaster.sh` to work, you have to set the two variables `HADOOP_CLASSPATH` and
+`MESOS_NATIVE_JAVA_LIBRARY`:
 
+{% highlight bash %}
+export HADOOP_CLASSPATH=$(hadoop classpath)
+export MESOS_NATIVE_JAVA_LIBRARY=/path/to/lib/libmesos.so
+{% endhighlight %}
 
-### Flink session cluster on Mesos
+`MESOS_NATIVE_JAVA_LIBRARY` needs to point to Mesos' native Java library. The library name `libmesos.so`
+used above refers to Mesos' Linux library. Running Mesos on MacOS would require you to use
+`libmesos.dylib` instead.
 
-A Flink session cluster is executed as a long-running Mesos Deployment. Note that you can run multiple Flink jobs on a session cluster. Each job needs to be submitted to the cluster after the cluster has been deployed.
+### Starting a Flink Session on Mesos
 
-In the `/bin` directory of the Flink distribution, you find two startup scripts
-which manage the Flink processes in a Mesos cluster:
+Connect to the Mesos workers, change into Flink's home directory and call `bin/mesos-appmaster.sh`:
 
-1. `mesos-appmaster.sh`
-   This starts the Mesos application master which will register the Mesos scheduler.
-   It is also responsible for starting up the worker nodes.
+{% highlight bash %}
+# (0) set required environment variables
+export HADOOP_CLASSPATH=$(hadoop classpath)
+export MESOS_NATIVE_JAVA_LIBRARY=/path/to/lib/libmesos.so
 
-2. `mesos-taskmanager.sh`
-   The entry point for the Mesos worker processes.
-   You don't need to explicitly execute this script.
-   It is automatically launched by the Mesos worker node to bring up a new TaskManager.
+# (1) create Flink on Mesos cluster
+./bin/mesos-appmaster.sh \
+    -Dmesos.master=$MESOS_MASTER:5050 \
+    -Djobmanager.rpc.address=$MESOS_MASTER \
+    -Dmesos.resourcemanager.framework.user=$FLINK_USER \
+    -Dmesos.resourcemanager.tasks.cpus=6
+{% endhighlight %}
 
-In order to run the `mesos-appmaster.sh` script you have to define `mesos.master` in the `flink-conf.yaml` or pass it via `-Dmesos.master=...` to the Java process.
+The call above uses two variables not introduced, yet, as they depend on the cluster:
+* `MESOS_MASTER` refers to the Mesos master's IP address or hostname. It's important to not use `localhost`
+  or `127.0.0.1` as the corresponding parameters are being shared with the Mesos cluster and the TaskManagers.
+* `FLINK_USER` refers to the user that owns the Mesos master's Flink installation directory (see Mesos'
+documentation on [specifying a user](http://mesos.apache.org/documentation/latest/fetcher/#specifying-a-user-name)
+for further details).
 
-When executing `mesos-appmaster.sh`, it will create a job manager on the machine where you executed the script.
-In contrast to that, the task managers will be run as Mesos tasks in the Mesos cluster.
+The Flink on Mesos cluster is now deployed in [Session Mode]({% link deployment/index.md %}#session-mode).
+Note that you can run multiple Flink jobs on a Session cluster. Each job needs to be submitted to the
+cluster. TaskManagers are deployed on the Mesos workers as needed. Keep in mind that you can only run as
+many jobs as the Mesos cluster allows in terms of resources provided by the Mesos workers. Play around
+with Flink's parameters to find the right resource utilization for your needs.
 
-### Flink job cluster on Mesos
+Check out [Flink's Mesos configuration]({% link deployment/config.md %}#mesos) to further influence
+the resources Flink on Mesos is going to allocate.
 
-A Flink job cluster is a dedicated cluster which runs a single job.
-There is no extra job submission needed.
+## Deployment Modes Supported by Flink on Mesos
 
-In the `/bin` directory of the Flink distribution, you find one startup script
-which manage the Flink processes in a Mesos cluster:
+For production use, we recommend deploying Flink Applications in the 
+[Per-Job Mode]({% link deployment/index.md %}#per-job-mode), as it provides a better isolation
+for each job.
 
-1. `mesos-appmaster-job.sh`
-   This starts the Mesos application master which will register the Mesos scheduler, retrieve the job graph and then launch the task managers accordingly.
+### Application Mode
 
-In order to run the `mesos-appmaster-job.sh` script you have to define `mesos.master` and `internal.jobgraph-path` in the `flink-conf.yaml`
-or pass it via `-Dmesos.master=... -Dinterval.jobgraph-path=...` to the Java process.
+Flink on Mesos does not support Application Mode.
 
-The job graph file may be generated like this way:
+### Per-Job Cluster Mode
 
+A job which is executed in Per-Job Cluster Mode spins up a dedicated Flink cluster that is only
+used for that specific job. No extra job submission is needed. `bin/mesos-appmaster-job.sh` is used
+as the startup script. It will start a Flink cluster for a dedicated job which is passed as a
+JobGraph file. This file can be created by applying the following code to your Job source code:
 {% highlight java %}
 final JobGraph jobGraph = env.getStreamGraph().getJobGraph();
 final String jobGraphFilename = "job.graph";
 File jobGraphFile = new File(jobGraphFilename);
 try (FileOutputStream output = new FileOutputStream(jobGraphFile);
-       ObjectOutputStream obOutput = new ObjectOutputStream(output)){
-       obOutput.writeObject(jobGraph);
+    ObjectOutputStream obOutput = new ObjectOutputStream(output)){
+    obOutput.writeObject(jobGraph);
 }
 {% endhighlight %}
 
-<span class="label label-info">Note</span> Make sure that all Mesos processes have the user code jar on the classpath. There are two ways:
-
-1. One way is putting them in the `lib/` directory, which will result in the user code jar being loaded by the system classloader.
-1. The other way is creating a `usrlib/` directory in the parent directory of `lib/` and putting the user code jar in the `usrlib/` directory.
-After launching a job cluster via `bin/mesos-appmaster-job.sh ...`, the user code jar will be loaded by the user code classloader.
-
-#### General configuration
-
-It is possible to completely parameterize a Mesos application through Java properties passed to the Mesos application master.
-This also allows to specify general Flink configuration parameters.
-For example:
-
-    bin/mesos-appmaster.sh \
-        -Dmesos.master=master.foobar.org:5050 \
-        -Djobmanager.memory.process.size=1472m \
-        -Djobmanager.rpc.port=6123 \
-        -Drest.port=8081 \
-        -Dtaskmanager.memory.process.size=3500m \
-        -Dtaskmanager.numberOfTaskSlots=2 \
-        -Dparallelism.default=10
-
-### High Availability
-
-You will need to run a service like Marathon or Apache Aurora which takes care of restarting the JobManager process in case of node or process failures.
-In addition, Zookeeper needs to be configured like described in the [High Availability section of the Flink docs]({% link deployment/ha/index.md %}).
+Flink on Mesos Per-Job cluster can be started in the following way:
+{% highlight bash %}
+# (0) set required environment variables
+export HADOOP_CLASSPATH=$(hadoop classpath)
+export MESOS_NATIVE_JAVA_LIBRARY=/path/to/lib/libmesos.so
+
+# (1) create Per-Job Flink on Mesos cluster
+./bin/mesos-appmaster-job.sh \
+    -Dmesos.master=$MESOS_MASTER:5050 \
+    -Djobmanager.rpc.address=$MESOS_MASTER \
+    -Dmesos.resourcemanager.framework.user=$FLINK_USER \
+    -Dinternal.jobgraph-path=$JOB_GRAPH_FILE
+{% endhighlight %} 
+
+`JOB_GRAPH_FILE` in the command above refers to the path of the uploaded JobGraph file defining the
+job that shall be executed on the Per-Job Flink cluster. The meaning of `MESOS_MASTER` and `FLINK_USER`
+are described in the [Getting Started](#starting-a-flink-session-on-mesos) guide of this page.
+
+### Session Mode
+
+The [Getting Started](#starting-a-flink-session-on-mesos) guide at the top of this page describes
+deploying Flink in Session Mode.
+
+## Flink on Mesos Reference
+
+### Flink on Mesos Architecture
+
+The Flink on Mesos implementation consists of two components: The application master and the workers.
+The workers are simple TaskManagers parameterized by the environment which is set up through the
+application master. The most sophisticated component of the Flink on Mesos implementation is the
+application master. The application master currently hosts the following components:
+- **Mesos Scheduler**: The Scheduler is responsible for registering a framework with Mesos, requesting
+  resources, and launching worker nodes. The Scheduler continuously needs to report back to Mesos to
+  ensure the framework is in a healthy state. To verify the health of the cluster, the Scheduler
+  monitors the spawned workers and, if necessary, marks them as failed and restarts them.
+
+  Flink's Mesos Scheduler itself is currently not highly available. However, it persists all necessary
+  information about its state (e.g. configuration, list of workers) in [ZooKeeper](#high-availability-on-mesos).
+  In the presence of a failure, it relies on an external system to bring up a new Scheduler (see the
+  [Marathon subsection](#marathon) for further details). The Scheduler will then register with Mesos
+  again and go through the reconciliation phase. In the reconciliation phase, the Scheduler receives
+  a list of running worker nodes. It matches these against the recovered information from ZooKeeper
+  and makes sure to bring the cluster back to the state before the failure.
+- **Artifact Server**: The Artifact Server is responsible for providing resources to the worker nodes.
+  The resources can be anything from the Flink binaries to shared secrets or configuration files.
+  For instance, in non-containerized environments, the Artifact Server will provide the Flink binaries.
+  Which files are served depends on the configuration overlay used.
+
+Flink's Mesos startup scripts `bin/mesos-appmaster.sh` and `bin/mesos-appmaster-job.sh` provide a way
+to configure and start the application master. The worker nodes inherit all further configuration.
+They are deployed through `bin/mesos-taskmanager.sh`. The configuration inheritance is achieved using
+configuration overlays. Configuration overlays provide a way to infer a configuration from environment
+variables and config files which are shipped to the worker nodes.
+
+See [Mesos Architecture](http://mesos.apache.org/documentation/latest/architecture/) for more details
+on how frameworks are handled by Mesos.
+
+### Deploying User Libraries
+
+User libraries can be passed to the Mesos workers by placing them in Flink's `lib/` folder. This way,
+they will be picked up by Mesos' Fetcher and copied over into the workers' sandbox folders. Alternatively,
+Docker containerization can be used as described in [Installing Flink on the Workers](#installing-flink-on-the-workers).
+
+### Installing Flink on the Workers
+
+Flink on Mesos offers two ways to distribute the Flink and user binaries within the Mesos cluster:
+1. **Using Mesos' Artifact Server**: The Artifact Server provides the resources which are moved by
+   [Mesos' Fetcher](http://mesos.apache.org/documentation/latest/fetcher/) into the Mesos workers'
+   [sandbox folders](http://mesos.apache.org/documentation/latest/sandbox/). It can be explicitly
+   specified by setting [mesos.resourcemanager.tasks.container.type]({% link deployment/config.md %}#mesos-resourcemanager-tasks-container-type)
+   to `mesos`. This is the default option and is used in the example commands of this page.
+2. **Using Docker containerization**: This enables the user to provide user libraries and other
+   customizations as part of a Docker image. Docker utilization can be enabled by setting
+   [mesos.resourcemanager.tasks.container.type]({% link deployment/config.md %}#mesos-resourcemanager-tasks-container-type)
+   to `docker` and by providing the image name through [mesos.resourcemanager.tasks.container.image.name]({% link deployment/config.md %}#mesos-resourcemanager-tasks-container-image-name).
+
+### High Availability on Mesos
+
+You will need to run a service like Marathon or Apache Aurora which takes care of restarting the
+JobManager process in case of node or process failures. In addition, ZooKeeper needs to be configured
+as described in the [High Availability section of the Flink docs]({% link deployment/ha/index.md %}).
 
 #### Marathon
 
-Marathon needs to be set up to launch the `bin/mesos-appmaster.sh` script.
-In particular, it should also adjust any configuration parameters for the Flink cluster.
+Marathon needs to be set up to launch the `bin/mesos-appmaster.sh` script. In particular, it should
+also adjust any configuration parameters for the Flink cluster.
 
 Here is an example configuration for Marathon:
+{% highlight javascript %}
+{
+    "id": "flink",
+    "cmd": "/opt/flink-1.11.2/bin/mesos-appmaster.sh -Dmesos.resourcemanager.framework.user=root -Dmesos.resourcemanager.tasks.taskmanager-cmd=/opt/flink-1.11.2/bin/mesos-taskmanager.sh -Dmesos.master=master:5050 -Djobmanager.memory.process.size=1472m -Dtaskmanager.memory.process.size=3500m -Dtaskmanager.numberOfTaskSlots=2 -Dparallelism.default=2",

Review comment:
       Where does `master:5050` come from?
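
       For comparison, the Getting Started commands in this PR use a `$MESOS_MASTER` placeholder instead of a literal host. A minimal sketch of how the Marathon `cmd` string could be assembled from the same placeholder follows. The hostname and the variable names `FLINK_HOME` and `MARATHON_CMD` are assumptions for illustration only and are not part of the PR:

```shell
#!/usr/bin/env bash
# Hypothetical values for illustration; replace them with your cluster's settings.
MESOS_MASTER="mesos-master.example.org"   # assumed hostname, should not be localhost
FLINK_HOME="/opt/flink-1.11.2"            # install path taken from the Marathon example

# Assemble the command that Marathon's "cmd" field would carry.
# The backslash-newline pairs inside the quotes are line continuations,
# so the result is a single line.
MARATHON_CMD="${FLINK_HOME}/bin/mesos-appmaster.sh \
-Dmesos.master=${MESOS_MASTER}:5050 \
-Dmesos.resourcemanager.framework.user=root"

printf '%s\n' "${MARATHON_CMD}"
```

       Deriving the value from one variable would keep the Marathon example consistent with the `-Dmesos.master=$MESOS_MASTER:5050` form used earlier on the page.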

##########
File path: docs/deployment/resource-providers/mesos.md
##########
@@ -26,226 +26,217 @@ under the License.
 * This will be replaced by the TOC
 {:toc}
 
-## Background
+## Getting Started
 
-The Mesos implementation consists of two components: The Application Master and
-the Worker. The workers are simple TaskManagers which are parameterized by the 
environment
-set up by the application master. The most sophisticated component of the Mesos
-implementation is the application master. The application master currently 
hosts
-the following components:
+This *Getting Started* section guides you through setting up a fully 
functional Flink Cluster on Mesos.
 
-### Mesos Scheduler
+### Introduction
 
-The scheduler is responsible for registering the framework with Mesos,
-requesting resources, and launching worker nodes. The scheduler continuously
-needs to report back to Mesos to ensure the framework is in a healthy state. To
-verify the health of the cluster, the scheduler monitors the spawned workers 
and
-marks them as failed and restarts them if necessary.
+[Apache Mesos](http://mesos.apache.org/) is another resource provider supported by
+Apache Flink. Flink utilizes the workers provided by Mesos to run its TaskManagers.
+Apache Flink provides the script `bin/mesos-appmaster.sh` to create the Flink
+on Mesos cluster.
 
-Flink's Mesos scheduler itself is currently not highly available. However, it
-persists all necessary information about its state (e.g. configuration, list of
-workers) in Zookeeper. In the presence of a failure, it relies on an external
-system to bring up a new scheduler. The scheduler will then register with Mesos
-again and go through the reconciliation phase. In the reconciliation phase, the
-scheduler receives a list of running workers nodes. It matches these against 
the
-recovered information from Zookeeper and makes sure to bring back the cluster 
in
-the state before the failure.
 
-### Artifact Server
+### Preparation
 
-The artifact server is responsible for providing resources to the worker
-nodes. The resources can be anything from the Flink binaries to shared secrets
-or configuration files. For instance, in non-containerized environments, the
-artifact server will provide the Flink binaries. What files will be served
-depends on the configuration overlay used.
+Flink on Mesos expects a Mesos cluster to be available. It also requires the Flink binaries to be deployed
+onto the Mesos master. Additionally, Hadoop needs to be installed on the very same machine.

Review comment:
       Do I need to install Hadoop, or is it enough to provide the Hadoop dependencies (e.g. via Flink's pre-bundled Hadoop dependency from https://flink.apache.org/downloads.html)?
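
       To make the two options in this question concrete, here is a small sketch that prefers a local Hadoop installation and otherwise falls back to a directory of Hadoop jars. The function name `resolve_hadoop_classpath` and the fallback directory are hypothetical, and whether the fallback is sufficient for Flink on Mesos is exactly what the question asks, so this is not a confirmed setup:

```shell
# Print a value for HADOOP_CLASSPATH.
# $1: name of the hadoop executable to look for (normally "hadoop")
# $2: directory with Hadoop dependency jars, used only as a fallback (assumption)
resolve_hadoop_classpath() {
    if command -v "$1" >/dev/null 2>&1; then
        # A Hadoop installation is available: ask it for its classpath.
        "$1" classpath
    else
        # No installation found: use a JVM classpath wildcard over the jar directory.
        printf '%s\n' "$2/*"
    fi
}

# Usage: export the result before calling bin/mesos-appmaster.sh.
HADOOP_CLASSPATH="$(resolve_hadoop_classpath hadoop /path/to/hadoop-jars)"
export HADOOP_CLASSPATH
```

       If only the dependencies are needed, the Preparation paragraph could say "Hadoop dependencies need to be available" rather than "Hadoop needs to be installed".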

##########
File path: docs/deployment/resource-providers/mesos.md
##########
@@ -26,226 +26,217 @@ under the License.
 * This will be replaced by the TOC
 {:toc}
 
-## Background
+## Getting Started
 
-The Mesos implementation consists of two components: The Application Master and
-the Worker. The workers are simple TaskManagers which are parameterized by the 
environment
-set up by the application master. The most sophisticated component of the Mesos
-implementation is the application master. The application master currently 
hosts
-the following components:
+This *Getting Started* section guides you through setting up a fully 
functional Flink Cluster on Mesos.
 
-### Mesos Scheduler
+### Introduction
 
-The scheduler is responsible for registering the framework with Mesos,
-requesting resources, and launching worker nodes. The scheduler continuously
-needs to report back to Mesos to ensure the framework is in a healthy state. To
-verify the health of the cluster, the scheduler monitors the spawned workers 
and
-marks them as failed and restarts them if necessary.
+[Apache Mesos](http://mesos.apache.org/) is another resource provider supported by
+Apache Flink. Flink utilizes the workers provided by Mesos to run its TaskManagers.
+Apache Flink provides the script `bin/mesos-appmaster.sh` to create the Flink
+on Mesos cluster.
 
-Flink's Mesos scheduler itself is currently not highly available. However, it
-persists all necessary information about its state (e.g. configuration, list of
-workers) in Zookeeper. In the presence of a failure, it relies on an external
-system to bring up a new scheduler. The scheduler will then register with Mesos
-again and go through the reconciliation phase. In the reconciliation phase, the
-scheduler receives a list of running workers nodes. It matches these against 
the
-recovered information from Zookeeper and makes sure to bring back the cluster 
in
-the state before the failure.
 
-### Artifact Server
+### Preparation
 
-The artifact server is responsible for providing resources to the worker
-nodes. The resources can be anything from the Flink binaries to shared secrets
-or configuration files. For instance, in non-containerized environments, the
-artifact server will provide the Flink binaries. What files will be served
-depends on the configuration overlay used.
+Flink on Mesos expects a Mesos cluster to be available. It also requires the Flink binaries to be deployed
+onto the Mesos master. Additionally, Hadoop needs to be installed on the very same machine.
 
-### Flink's JobManager and Web Interface
+Flink provides `bin/mesos-appmaster.sh` to create a Flink on Mesos cluster. It will instantiate a
+JobManager process on the Mesos master. The Mesos workers will be utilized to run the TaskManager
+processes.
 
-The JobManager and the web interface provide a central point for monitoring,
-job submission, and other client interaction with the cluster
-(see 
[FLIP-6](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65147077)).
-
-### Startup script and configuration overlays
-
-The startup script provide a way to configure and start the application
-master. All further configuration is then inherited by the workers nodes. This
-is achieved using configuration overlays. Configuration overlays provide a way
-to infer configuration from environment variables and config files which are
-shipped to the worker nodes.
-
-
-## DC/OS
-
-This section refers to [DC/OS](https://dcos.io) which is a Mesos distribution
-with a sophisticated application management layer. It comes pre-installed with
-Marathon, a service to supervise applications and maintain their state in case
-of failures.
-
-If you don't have a running DC/OS cluster, please follow the
-[instructions on how to install DC/OS on the official 
website](https://dcos.io/install/).
-
-Once you have a DC/OS cluster, you may install Flink through the DC/OS
-Universe. In the search prompt, just search for Flink. Alternatively, you can 
use the DC/OS CLI:
-
-    dcos package install flink
-
-Further information can be found in the
-[DC/OS examples 
documentation](https://github.com/dcos/examples/tree/master/1.8/flink).
-
-
-## Mesos without DC/OS
-
-You can also run Mesos without DC/OS.
-
-### Installing Mesos
-
-Please follow the [instructions on how to setup Mesos on the official 
website](http://mesos.apache.org/getting-started/).
-
-After installation you have to configure the set of master and agent nodes by 
creating the files `MESOS_HOME/etc/mesos/masters` and 
`MESOS_HOME/etc/mesos/slaves`.
-These files contain in each row a single hostname on which the respective 
component will be started (assuming SSH access to these nodes).
-
-Next you have to create `MESOS_HOME/etc/mesos/mesos-master-env.sh` or use the 
template found in the same directory.
-In this file, you have to define
-
-    export MESOS_work_dir=WORK_DIRECTORY
-
-and it is recommended to uncommment
-
-    export MESOS_log_dir=LOGGING_DIRECTORY
-
-
-In order to configure the Mesos agents, you have to create 
`MESOS_HOME/etc/mesos/mesos-agent-env.sh` or use the template found in the same 
directory.
-You have to configure
-
-    export MESOS_master=MASTER_HOSTNAME:MASTER_PORT
-
-and uncomment
-
-    export MESOS_log_dir=LOGGING_DIRECTORY
-    export MESOS_work_dir=WORK_DIRECTORY
-
-#### Mesos Library
-
-In order to run Java applications with Mesos you have to export 
`MESOS_NATIVE_JAVA_LIBRARY=MESOS_HOME/lib/libmesos.so` on Linux.
-Under Mac OS X you have to export 
`MESOS_NATIVE_JAVA_LIBRARY=MESOS_HOME/lib/libmesos.dylib`.
-
-#### Deploying Mesos
-
-In order to start your mesos cluster, use the deployment script 
`MESOS_HOME/sbin/mesos-start-cluster.sh`.
-In order to stop your mesos cluster, use the deployment script 
`MESOS_HOME/sbin/mesos-stop-cluster.sh`.
-More information about the deployment scripts can be found 
[here](http://mesos.apache.org/documentation/latest/deploy-scripts/).
-
-### Installing Marathon
-
-Optionally, you may also [install 
Marathon](https://mesosphere.github.io/marathon/docs/) which enables you to run 
Flink in [high availability (HA) mode](#high-availability).
-
-### Pre-installing Flink vs Docker/Mesos containers
-
-You may install Flink on all of your Mesos Master and Agent nodes.
-You can also pull the binaries from the Flink web site during deployment and 
apply your custom configuration before launching the application master.
-A more convenient and easier to maintain approach is to use Docker containers 
to manage the Flink binaries and configuration.
-
-This is controlled via the following configuration entries:
-
-    mesos.resourcemanager.tasks.container.type: mesos _or_ docker
-
-If set to 'docker', specify the image name:
-
-    mesos.resourcemanager.tasks.container.image.name: image_name
+For `bin/mesos-appmaster.sh` to work, you have to set the two variables `HADOOP_CLASSPATH` and
+`MESOS_NATIVE_JAVA_LIBRARY`:
 
+{% highlight bash %}
+export HADOOP_CLASSPATH=$(hadoop classpath)
+export MESOS_NATIVE_JAVA_LIBRARY=/path/to/lib/libmesos.so
+{% endhighlight %}
 
-### Flink session cluster on Mesos
+`MESOS_NATIVE_JAVA_LIBRARY` needs to point to Mesos' native Java library. The library name `libmesos.so`
+used above refers to Mesos' Linux library. Running Mesos on macOS would require you to use
+`libmesos.dylib` instead.
 
-A Flink session cluster is executed as a long-running Mesos Deployment. Note 
that you can run multiple Flink jobs on a session cluster. Each job needs to be 
submitted to the cluster after the cluster has been deployed.
+### Starting a Flink Session on Mesos
 
-In the `/bin` directory of the Flink distribution, you find two startup scripts
-which manage the Flink processes in a Mesos cluster:
+Connect to the Mesos workers, change into Flink's home directory and call 
`bin/mesos-appmaster.sh`:
 
-1. `mesos-appmaster.sh`
-   This starts the Mesos application master which will register the Mesos 
scheduler.
-   It is also responsible for starting up the worker nodes.
+{% highlight bash %}
+# (0) set required environment variables
+export HADOOP_CLASSPATH=$(hadoop classpath)
+export MESOS_NATIVE_JAVA_LIBRARY=/path/to/lib/libmesos.so
 
-2. `mesos-taskmanager.sh`
-   The entry point for the Mesos worker processes.
-   You don't need to explicitly execute this script.
-   It is automatically launched by the Mesos worker node to bring up a new 
TaskManager.
+# (1) create Flink on Mesos cluster
+./bin/mesos-appmaster.sh \

Review comment:
       Maybe state that this will start `MesosSessionClusterEntrypoint` as a 
local process.

##########
File path: docs/deployment/resource-providers/mesos.md
##########
@@ -26,226 +26,217 @@ under the License.
 * This will be replaced by the TOC
 {:toc}
 
-## Background
+## Getting Started
 
-The Mesos implementation consists of two components: The Application Master and
-the Worker. The workers are simple TaskManagers which are parameterized by the 
environment
-set up by the application master. The most sophisticated component of the Mesos
-implementation is the application master. The application master currently 
hosts
-the following components:
+This *Getting Started* section guides you through setting up a fully 
functional Flink Cluster on Mesos.
 
-### Mesos Scheduler
+### Introduction
 
-The scheduler is responsible for registering the framework with Mesos,
-requesting resources, and launching worker nodes. The scheduler continuously
-needs to report back to Mesos to ensure the framework is in a healthy state. To
-verify the health of the cluster, the scheduler monitors the spawned workers 
and
-marks them as failed and restarts them if necessary.
+[Apache Mesos](http://mesos.apache.org/) is another resource provider supported by
+Apache Flink. Flink utilizes the workers provided by Mesos to run its TaskManagers.
+Apache Flink provides the script `bin/mesos-appmaster.sh` to create the Flink
+on Mesos cluster.
 
-Flink's Mesos scheduler itself is currently not highly available. However, it
-persists all necessary information about its state (e.g. configuration, list of
-workers) in Zookeeper. In the presence of a failure, it relies on an external
-system to bring up a new scheduler. The scheduler will then register with Mesos
-again and go through the reconciliation phase. In the reconciliation phase, the
-scheduler receives a list of running workers nodes. It matches these against 
the
-recovered information from Zookeeper and makes sure to bring back the cluster 
in
-the state before the failure.
 
-### Artifact Server
+### Preparation
 
-The artifact server is responsible for providing resources to the worker
-nodes. The resources can be anything from the Flink binaries to shared secrets
-or configuration files. For instance, in non-containerized environments, the
-artifact server will provide the Flink binaries. What files will be served
-depends on the configuration overlay used.
+Flink on Mesos expects a Mesos cluster to be available. It also requires the Flink binaries to be deployed
+onto the Mesos master. Additionally, Hadoop needs to be installed on the very same machine.
 
-### Flink's JobManager and Web Interface
+Flink provides `bin/mesos-appmaster.sh` to create a Flink on Mesos cluster. It 
will instantiate a 
+JobManager process on the Mesos master. The Mesos workers will be utilized to 
run the TaskManager 
+processes.
 
-The JobManager and the web interface provide a central point for monitoring,
-job submission, and other client interaction with the cluster
-(see 
[FLIP-6](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65147077)).
-
-### Startup script and configuration overlays
-
-The startup script provide a way to configure and start the application
-master. All further configuration is then inherited by the workers nodes. This
-is achieved using configuration overlays. Configuration overlays provide a way
-to infer configuration from environment variables and config files which are
-shipped to the worker nodes.
-
-
-## DC/OS
-
-This section refers to [DC/OS](https://dcos.io) which is a Mesos distribution
-with a sophisticated application management layer. It comes pre-installed with
-Marathon, a service to supervise applications and maintain their state in case
-of failures.
-
-If you don't have a running DC/OS cluster, please follow the
-[instructions on how to install DC/OS on the official 
website](https://dcos.io/install/).
-
-Once you have a DC/OS cluster, you may install Flink through the DC/OS
-Universe. In the search prompt, just search for Flink. Alternatively, you can 
use the DC/OS CLI:
-
-    dcos package install flink
-
-Further information can be found in the
-[DC/OS examples 
documentation](https://github.com/dcos/examples/tree/master/1.8/flink).
-
-
-## Mesos without DC/OS
-
-You can also run Mesos without DC/OS.
-
-### Installing Mesos
-
-Please follow the [instructions on how to setup Mesos on the official 
website](http://mesos.apache.org/getting-started/).
-
-After installation you have to configure the set of master and agent nodes by 
creating the files `MESOS_HOME/etc/mesos/masters` and 
`MESOS_HOME/etc/mesos/slaves`.
-These files contain in each row a single hostname on which the respective 
component will be started (assuming SSH access to these nodes).
-
-Next you have to create `MESOS_HOME/etc/mesos/mesos-master-env.sh` or use the 
template found in the same directory.
-In this file, you have to define
-
-    export MESOS_work_dir=WORK_DIRECTORY
-
-and it is recommended to uncommment
-
-    export MESOS_log_dir=LOGGING_DIRECTORY
-
-
-In order to configure the Mesos agents, you have to create 
`MESOS_HOME/etc/mesos/mesos-agent-env.sh` or use the template found in the same 
directory.
-You have to configure
-
-    export MESOS_master=MASTER_HOSTNAME:MASTER_PORT
-
-and uncomment
-
-    export MESOS_log_dir=LOGGING_DIRECTORY
-    export MESOS_work_dir=WORK_DIRECTORY
-
-#### Mesos Library
-
-In order to run Java applications with Mesos you have to export 
`MESOS_NATIVE_JAVA_LIBRARY=MESOS_HOME/lib/libmesos.so` on Linux.
-Under Mac OS X you have to export 
`MESOS_NATIVE_JAVA_LIBRARY=MESOS_HOME/lib/libmesos.dylib`.
-
-#### Deploying Mesos
-
-In order to start your mesos cluster, use the deployment script 
`MESOS_HOME/sbin/mesos-start-cluster.sh`.
-In order to stop your mesos cluster, use the deployment script 
`MESOS_HOME/sbin/mesos-stop-cluster.sh`.
-More information about the deployment scripts can be found 
[here](http://mesos.apache.org/documentation/latest/deploy-scripts/).
-
-### Installing Marathon
-
-Optionally, you may also [install 
Marathon](https://mesosphere.github.io/marathon/docs/) which enables you to run 
Flink in [high availability (HA) mode](#high-availability).
-
-### Pre-installing Flink vs Docker/Mesos containers
-
-You may install Flink on all of your Mesos Master and Agent nodes.
-You can also pull the binaries from the Flink web site during deployment and 
apply your custom configuration before launching the application master.
-A more convenient and easier to maintain approach is to use Docker containers 
to manage the Flink binaries and configuration.
-
-This is controlled via the following configuration entries:
-
-    mesos.resourcemanager.tasks.container.type: mesos _or_ docker
-
-If set to 'docker', specify the image name:
-
-    mesos.resourcemanager.tasks.container.image.name: image_name
+For `bin/mesos-appmaster.sh` to work, you have to set the two variables 
`HADOOP_CLASSPATH` and 
+`MESOS_NATIVE_JAVA_LIBRARY`:
 
+{% highlight bash %}
+export HADOOP_CLASSPATH=$(hadoop classpath)
+export MESOS_NATIVE_JAVA_LIBRARY=/path/to/lib/libmesos.so
+{% endhighlight %}
 
-### Flink session cluster on Mesos
+`MESOS_NATIVE_JAVA_LIBRARY` needs to point to Mesos' native Java library. The 
library name `libmesos.so` 
+used above refers to Mesos' Linux library. Running Mesos on MacOS would 
require you to use 
+`libmesos.dylib` instead.
 
-A Flink session cluster is executed as a long-running Mesos Deployment. Note 
that you can run multiple Flink jobs on a session cluster. Each job needs to be 
submitted to the cluster after the cluster has been deployed.
+### Starting a Flink Session on Mesos
 
-In the `/bin` directory of the Flink distribution, you find two startup scripts
-which manage the Flink processes in a Mesos cluster:
+Connect to the Mesos workers, change into Flink's home directory and call 
`bin/mesos-appmaster.sh`:
 
-1. `mesos-appmaster.sh`
-   This starts the Mesos application master which will register the Mesos 
scheduler.
-   It is also responsible for starting up the worker nodes.
+{% highlight bash %}
+# (0) set required environment variables
+export HADOOP_CLASSPATH=$(hadoop classpath)
+export MESOS_NATIVE_JAVA_LIBRARY=/path/to/lib/libmesos.so
 
-2. `mesos-taskmanager.sh`
-   The entry point for the Mesos worker processes.
-   You don't need to explicitly execute this script.
-   It is automatically launched by the Mesos worker node to bring up a new 
TaskManager.
+# (1) create Flink on Mesos cluster
+./bin/mesos-appmaster.sh \
+    -Dmesos.master=$MESOS_MASTER:5050 \
+    -Djobmanager.rpc.address=$MESOS_MASTER \
+    -Dmesos.resourcemanager.framework.user=$FLINK_USER \
+    -Dmesos.resourcemanager.tasks.cpus=6
+{% endhighlight %}
 
-In order to run the `mesos-appmaster.sh` script you have to define 
`mesos.master` in the `flink-conf.yaml` or pass it via `-Dmesos.master=...` to 
the Java process.
+The call above uses two variables not yet introduced, as they depend on the cluster:
+* `MESOS_MASTER` refers to the Mesos master's IP address or hostname. It's important to not use `localhost`
+  or `127.0.0.1` as the corresponding parameters are being shared with the Mesos cluster and the TaskManagers.
+* `FLINK_USER` refers to the user that owns the Mesos master's Flink installation directory (see Mesos'
+  documentation on [specifying a user](http://mesos.apache.org/documentation/latest/fetcher/#specifying-a-user-name)
+  for further details).
 
-When executing `mesos-appmaster.sh`, it will create a job manager on the 
machine where you executed the script.
-In contrast to that, the task managers will be run as Mesos tasks in the Mesos 
cluster.
+The Flink on Mesos cluster is now deployed in [Session Mode]({% link deployment/index.md %}#session-mode).
+Note that you can run multiple Flink jobs on a Session cluster. Each job needs to be submitted to the
+cluster. TaskManagers are deployed on the Mesos workers as needed. Keep in mind that you can only run as
+many jobs as the Mesos cluster allows in terms of resources provided by the Mesos workers. Play around
+with Flink's parameters to find the right resource utilization for your needs.
 
-### Flink job cluster on Mesos
+Check out [Flink's Mesos configuration]({% link deployment/config.md %}#mesos) to further influence
+the resources Flink on Mesos is going to allocate.
 
-A Flink job cluster is a dedicated cluster which runs a single job.
-There is no extra job submission needed.
+## Deployment Modes Supported by Flink on Mesos
 
-In the `/bin` directory of the Flink distribution, you find one startup script
-which manage the Flink processes in a Mesos cluster:
+For production use, we recommend deploying Flink Applications in the
+[Per-Job Mode]({% link deployment/index.md %}#per-job-mode), as it provides better isolation
+for each job.
 
-1. `mesos-appmaster-job.sh`
-   This starts the Mesos application master which will register the Mesos 
scheduler, retrieve the job graph and then launch the task managers accordingly.
+### Application Mode
 
-In order to run the `mesos-appmaster-job.sh` script you have to define 
`mesos.master` and `internal.jobgraph-path` in the `flink-conf.yaml`
-or pass it via `-Dmesos.master=... -Dinterval.jobgraph-path=...` to the Java 
process.
+Flink on Mesos does not support Application Mode.
 
-The job graph file may be generated like this way:
+### Per-Job Cluster Mode
 
+A job which is executed in Per-Job Cluster Mode spins up a dedicated Flink cluster that is only
+used for that specific job. No extra job submission is needed. `bin/mesos-appmaster-job.sh` is used
+as the startup script. It will start a Flink cluster for a dedicated job which is passed as a
+JobGraph file. This file can be created by applying the following code to your Job source code:
 {% highlight java %}
 final JobGraph jobGraph = env.getStreamGraph().getJobGraph();
 final String jobGraphFilename = "job.graph";
 File jobGraphFile = new File(jobGraphFilename);
 try (FileOutputStream output = new FileOutputStream(jobGraphFile);
-       ObjectOutputStream obOutput = new ObjectOutputStream(output)){
-       obOutput.writeObject(jobGraph);
+    ObjectOutputStream obOutput = new ObjectOutputStream(output)){
+    obOutput.writeObject(jobGraph);
 }
 {% endhighlight %}
 
-<span class="label label-info">Note</span> Make sure that all Mesos processes 
have the user code jar on the classpath. There are two ways:
-
-1. One way is putting them in the `lib/` directory, which will result in the 
user code jar being loaded by the system classloader.
-1. The other way is creating a `usrlib/` directory in the parent directory of 
`lib/` and putting the user code jar in the `usrlib/` directory.
-After launching a job cluster via `bin/mesos-appmaster-job.sh ...`, the user 
code jar will be loaded by the user code classloader.
-
-#### General configuration
-
-It is possible to completely parameterize a Mesos application through Java 
properties passed to the Mesos application master.
-This also allows to specify general Flink configuration parameters.
-For example:
-
-    bin/mesos-appmaster.sh \
-        -Dmesos.master=master.foobar.org:5050 \
-        -Djobmanager.memory.process.size=1472m \
-        -Djobmanager.rpc.port=6123 \
-        -Drest.port=8081 \
-        -Dtaskmanager.memory.process.size=3500m \
-        -Dtaskmanager.numberOfTaskSlots=2 \
-        -Dparallelism.default=10
-
-### High Availability
-
-You will need to run a service like Marathon or Apache Aurora which takes care 
of restarting the JobManager process in case of node or process failures.
-In addition, Zookeeper needs to be configured like described in the [High 
Availability section of the Flink docs]({% link deployment/ha/index.md %}).
+A Flink on Mesos Per-Job cluster can be started in the following way:
+{% highlight bash %}
+# (0) set required environment variables
+export HADOOP_CLASSPATH=$(hadoop classpath)
+export MESOS_NATIVE_JAVA_LIBRARY=/path/to/lib/libmesos.so
+
+# (1) create Per-Job Flink on Mesos cluster
+./bin/mesos-appmaster-job.sh \
+    -Dmesos.master=$MESOS_MASTER:5050 \
+    -Djobmanager.rpc.address=$MESOS_MASTER \
+    -Dmesos.resourcemanager.framework.user=$FLINK_USER \
+    -Dinternal.jobgraph-path=$JOB_GRAPH_FILE
+{% endhighlight %} 
+
+`JOB_GRAPH_FILE` in the command above refers to the path of the uploaded JobGraph file defining the
+job that shall be executed on the Per-Job Flink cluster. The meanings of `MESOS_MASTER` and `FLINK_USER`
+are described in the [Getting Started](#starting-a-flink-session-on-mesos) guide of this page.
+
+### Session Mode
+
+The [Getting Started](#starting-a-flink-session-on-mesos) guide at the top of this page describes
+deploying Flink in Session Mode.
+
+## Flink on Mesos Reference
+
+### Flink on Mesos Architecture

Review comment:
       I would move this to the back of the "Flink on Mesos Reference" section.

##########
File path: docs/deployment/resource-providers/mesos.md
##########
@@ -26,226 +26,217 @@ under the License.
 * This will be replaced by the TOC
 {:toc}
 
-## Background
+## Getting Started
 
-The Mesos implementation consists of two components: The Application Master and
-the Worker. The workers are simple TaskManagers which are parameterized by the 
environment
-set up by the application master. The most sophisticated component of the Mesos
-implementation is the application master. The application master currently 
hosts
-the following components:
+This *Getting Started* section guides you through setting up a fully 
functional Flink Cluster on Mesos.
 
-### Mesos Scheduler
+### Introduction
 
-The scheduler is responsible for registering the framework with Mesos,
-requesting resources, and launching worker nodes. The scheduler continuously
-needs to report back to Mesos to ensure the framework is in a healthy state. To
-verify the health of the cluster, the scheduler monitors the spawned workers 
and
-marks them as failed and restarts them if necessary.
+[Apache Mesos](http://mesos.apache.org/) is another resource provider 
supported by 
+Apache Flink. Flink utilizes the worker's provided by Mesos to run its 
TaskManagers.
+Apache Flink provides the script `bin/mesos-appmaster.sh` to create the Flink 
+on Mesos cluster.
 
-Flink's Mesos scheduler itself is currently not highly available. However, it
-persists all necessary information about its state (e.g. configuration, list of
-workers) in Zookeeper. In the presence of a failure, it relies on an external
-system to bring up a new scheduler. The scheduler will then register with Mesos
-again and go through the reconciliation phase. In the reconciliation phase, the
-scheduler receives a list of running workers nodes. It matches these against 
the
-recovered information from Zookeeper and makes sure to bring back the cluster 
in
-the state before the failure.
 
-### Artifact Server
+### Preparation
 
-The artifact server is responsible for providing resources to the worker
-nodes. The resources can be anything from the Flink binaries to shared secrets
-or configuration files. For instance, in non-containerized environments, the
-artifact server will provide the Flink binaries. What files will be served
-depends on the configuration overlay used.
+Flink on Mesos expects a Mesos cluster to be around. It also requires the 
Flink binaries being deployed
+ontothe the Mesos master. Additionally, Hadoop needs to be installed on the 
very same machine.
 
-### Flink's JobManager and Web Interface
+Flink provides `bin/mesos-appmaster.sh` to create a Flink on Mesos cluster. It 
will instantiate a 
+JobManager process on the Mesos master. The Mesos workers will be utilized to 
run the TaskManager 
+processes.
 
-The JobManager and the web interface provide a central point for monitoring,
-job submission, and other client interaction with the cluster
-(see 
[FLIP-6](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65147077)).
-
-### Startup script and configuration overlays
-
-The startup script provide a way to configure and start the application
-master. All further configuration is then inherited by the workers nodes. This
-is achieved using configuration overlays. Configuration overlays provide a way
-to infer configuration from environment variables and config files which are
-shipped to the worker nodes.
-
-
-## DC/OS
-
-This section refers to [DC/OS](https://dcos.io) which is a Mesos distribution
-with a sophisticated application management layer. It comes pre-installed with
-Marathon, a service to supervise applications and maintain their state in case
-of failures.
-
-If you don't have a running DC/OS cluster, please follow the
-[instructions on how to install DC/OS on the official 
website](https://dcos.io/install/).
-
-Once you have a DC/OS cluster, you may install Flink through the DC/OS
-Universe. In the search prompt, just search for Flink. Alternatively, you can 
use the DC/OS CLI:
-
-    dcos package install flink
-
-Further information can be found in the
-[DC/OS examples 
documentation](https://github.com/dcos/examples/tree/master/1.8/flink).
-
-
-## Mesos without DC/OS
-
-You can also run Mesos without DC/OS.
-
-### Installing Mesos
-
-Please follow the [instructions on how to setup Mesos on the official 
website](http://mesos.apache.org/getting-started/).
-
-After installation you have to configure the set of master and agent nodes by 
creating the files `MESOS_HOME/etc/mesos/masters` and 
`MESOS_HOME/etc/mesos/slaves`.
-These files contain in each row a single hostname on which the respective 
component will be started (assuming SSH access to these nodes).
-
-Next you have to create `MESOS_HOME/etc/mesos/mesos-master-env.sh` or use the 
template found in the same directory.
-In this file, you have to define
-
-    export MESOS_work_dir=WORK_DIRECTORY
-
-and it is recommended to uncommment
-
-    export MESOS_log_dir=LOGGING_DIRECTORY
-
-
-In order to configure the Mesos agents, you have to create 
`MESOS_HOME/etc/mesos/mesos-agent-env.sh` or use the template found in the same 
directory.
-You have to configure
-
-    export MESOS_master=MASTER_HOSTNAME:MASTER_PORT
-
-and uncomment
-
-    export MESOS_log_dir=LOGGING_DIRECTORY
-    export MESOS_work_dir=WORK_DIRECTORY
-
-#### Mesos Library
-
-In order to run Java applications with Mesos you have to export 
`MESOS_NATIVE_JAVA_LIBRARY=MESOS_HOME/lib/libmesos.so` on Linux.
-Under Mac OS X you have to export 
`MESOS_NATIVE_JAVA_LIBRARY=MESOS_HOME/lib/libmesos.dylib`.
-
-#### Deploying Mesos
-
-In order to start your mesos cluster, use the deployment script 
`MESOS_HOME/sbin/mesos-start-cluster.sh`.
-In order to stop your mesos cluster, use the deployment script 
`MESOS_HOME/sbin/mesos-stop-cluster.sh`.
-More information about the deployment scripts can be found 
[here](http://mesos.apache.org/documentation/latest/deploy-scripts/).
-
-### Installing Marathon
-
-Optionally, you may also [install 
Marathon](https://mesosphere.github.io/marathon/docs/) which enables you to run 
Flink in [high availability (HA) mode](#high-availability).
-
-### Pre-installing Flink vs Docker/Mesos containers
-
-You may install Flink on all of your Mesos Master and Agent nodes.
-You can also pull the binaries from the Flink web site during deployment and 
apply your custom configuration before launching the application master.
-A more convenient and easier to maintain approach is to use Docker containers 
to manage the Flink binaries and configuration.
-
-This is controlled via the following configuration entries:
-
-    mesos.resourcemanager.tasks.container.type: mesos _or_ docker
-
-If set to 'docker', specify the image name:
-
-    mesos.resourcemanager.tasks.container.image.name: image_name
+For `bin/mesos-appmaster.sh` to work, you have to set the two variables 
`HADOOP_CLASSPATH` and 
+`MESOS_NATIVE_JAVA_LIBRARY`:
 
+{% highlight bash %}
+export HADOOP_CLASSPATH=$(hadoop classpath)
+export MESOS_NATIVE_JAVA_LIBRARY=/path/to/lib/libmesos.so
+{% endhighlight %}
 
-### Flink session cluster on Mesos
+`MESOS_NATIVE_JAVA_LIBRARY` needs to point to Mesos' native Java library. The 
library name `libmesos.so` 
+used above refers to Mesos' Linux library. Running Mesos on MacOS would 
require you to use 
+`libmesos.dylib` instead.
 
-A Flink session cluster is executed as a long-running Mesos Deployment. Note 
that you can run multiple Flink jobs on a session cluster. Each job needs to be 
submitted to the cluster after the cluster has been deployed.
+### Starting a Flink Session on Mesos
 
-In the `/bin` directory of the Flink distribution, you find two startup scripts
-which manage the Flink processes in a Mesos cluster:
+Connect to the Mesos workers, change into Flink's home directory and call 
`bin/mesos-appmaster.sh`:
 
-1. `mesos-appmaster.sh`
-   This starts the Mesos application master which will register the Mesos 
scheduler.
-   It is also responsible for starting up the worker nodes.
+{% highlight bash %}
+# (0) set required environment variables
+export HADOOP_CLASSPATH=$(hadoop classpath)
+export MESOS_NATIVE_JAVA_LIBRARY=/path/to/lib/libmesos.so
 
-2. `mesos-taskmanager.sh`
-   The entry point for the Mesos worker processes.
-   You don't need to explicitly execute this script.
-   It is automatically launched by the Mesos worker node to bring up a new 
TaskManager.
+# (1) create Flink on Mesos cluster
+./bin/mesos-appmaster.sh \
+    -Dmesos.master=$MESOS_MASTER:5050 \
+    -Djobmanager.rpc.address=$MESOS_MASTER \
+    -Dmesos.resourcemanager.framework.user=$FLINK_USER \
+    -Dmesos.resourcemanager.tasks.cpus=6
+{% endhighlight %}
 
-In order to run the `mesos-appmaster.sh` script you have to define 
`mesos.master` in the `flink-conf.yaml` or pass it via `-Dmesos.master=...` to 
the Java process.
+The call above uses two variables not introduced, yet, as they depend on the cluster:
+* `MESOS_MASTER` refers to the Mesos master's IP address or hostname. It's important to not use `localhost`
+  or `127.0.0.1` as the corresponding parameters are being shared with the Mesos cluster and the TaskManagers.
+* `FLINK_USER` refers to the user that owns the Mesos master's Flink installation directory (see Mesos'
+documentation on [specifying a user](http://mesos.apache.org/documentation/latest/fetcher/#specifying-a-user-name)
+for further details).
 
-When executing `mesos-appmaster.sh`, it will create a job manager on the 
machine where you executed the script.
-In contrast to that, the task managers will be run as Mesos tasks in the Mesos 
cluster.
+The Flink on Mesos cluster is now deployed in [Session Mode]({% link deployment/index.md %}#session-mode).
+Note that you can run multiple Flink jobs on a Session cluster. Each job needs 
to be submitted to the 
+cluster. TaskManagers are deployed on the Mesos workers as needed. Keep in 
mind that you can only run as 
+many jobs as the Mesos cluster allows in terms of resources provided by the 
Mesos workers. Play around 
+with Flink's parameters to find the right resource utilization for your needs.
 
-### Flink job cluster on Mesos
+Check out [Flink's Mesos configuration]({% link deployment/config.md %}#mesos) to further influence
+the resources Flink on Mesos is going to allocate.
 
-A Flink job cluster is a dedicated cluster which runs a single job.
-There is no extra job submission needed.
+## Deployment Modes Supported by Flink on Mesos
 
-In the `/bin` directory of the Flink distribution, you find one startup script
-which manage the Flink processes in a Mesos cluster:
+For production use, we recommend deploying Flink Applications in the
+[Per-Job Mode]({% link deployment/index.md %}#per-job-mode), as it provides a better isolation
+for each job.
 
-1. `mesos-appmaster-job.sh`
-   This starts the Mesos application master which will register the Mesos 
scheduler, retrieve the job graph and then launch the task managers accordingly.
+### Application Mode
 
-In order to run the `mesos-appmaster-job.sh` script you have to define 
`mesos.master` and `internal.jobgraph-path` in the `flink-conf.yaml`
-or pass it via `-Dmesos.master=... -Dinterval.jobgraph-path=...` to the Java 
process.
+Flink on Mesos does not support Application Mode.
 
-The job graph file may be generated like this way:
+### Per-Job Cluster Mode
 
+A job which is executed in Per-Job Cluster Mode spins up a dedicated Flink 
cluster that is only 
+used for that specific job. No extra job submission is needed. 
`bin/mesos-appmaster-job.sh` is used 
+as the startup script. It will start a Flink cluster for a dedicated job which 
is passed as a 
+JobGraph file. This file can be created by applying the following code to your 
Job source code:
 {% highlight java %}
 final JobGraph jobGraph = env.getStreamGraph().getJobGraph();
 final String jobGraphFilename = "job.graph";
 File jobGraphFile = new File(jobGraphFilename);
 try (FileOutputStream output = new FileOutputStream(jobGraphFile);
-       ObjectOutputStream obOutput = new ObjectOutputStream(output)){
-       obOutput.writeObject(jobGraph);
+    ObjectOutputStream obOutput = new ObjectOutputStream(output)){
+    obOutput.writeObject(jobGraph);
 }
 {% endhighlight %}
 
-<span class="label label-info">Note</span> Make sure that all Mesos processes 
have the user code jar on the classpath. There are two ways:
-
-1. One way is putting them in the `lib/` directory, which will result in the 
user code jar being loaded by the system classloader.
-1. The other way is creating a `usrlib/` directory in the parent directory of 
`lib/` and putting the user code jar in the `usrlib/` directory.
-After launching a job cluster via `bin/mesos-appmaster-job.sh ...`, the user 
code jar will be loaded by the user code classloader.
-
-#### General configuration
-
-It is possible to completely parameterize a Mesos application through Java 
properties passed to the Mesos application master.
-This also allows to specify general Flink configuration parameters.
-For example:
-
-    bin/mesos-appmaster.sh \
-        -Dmesos.master=master.foobar.org:5050 \
-        -Djobmanager.memory.process.size=1472m \
-        -Djobmanager.rpc.port=6123 \
-        -Drest.port=8081 \
-        -Dtaskmanager.memory.process.size=3500m \
-        -Dtaskmanager.numberOfTaskSlots=2 \
-        -Dparallelism.default=10
-
-### High Availability
-
-You will need to run a service like Marathon or Apache Aurora which takes care 
of restarting the JobManager process in case of node or process failures.
-In addition, Zookeeper needs to be configured like described in the [High Availability section of the Flink docs]({% link deployment/ha/index.md %}).
+Flink on Mesos Per-Job cluster can be started in the following way:
+{% highlight bash %}
+# (0) set required environment variables
+export HADOOP_CLASSPATH=$(hadoop classpath)
+export MESOS_NATIVE_JAVA_LIBRARY=/path/to/lib/libmesos.so
+
+# (1) create Per-Job Flink on Mesos cluster
+./bin/mesos-appmaster-job.sh \
+    -Dmesos.master=$MESOS_MASTER:5050 \
+    -Djobmanager.rpc.address=$MESOS_MASTER \
+    -Dmesos.resourcemanager.framework.user=$FLINK_USER \
+    -Dinternal.jobgraph-path=$JOB_GRAPH_FILE
+{% endhighlight %} 
+
+`JOB_GRAPH_FILE` in the command above refers to the path of the uploaded JobGraph file defining the
+job that shall be executed on the Per-Job Flink cluster. The meaning of `MESOS_MASTER` and `FLINK_USER`
+are described in the [Getting Started](#starting-a-flink-session-on-mesos) guide of this page.
+
+### Session Mode
+
+The [Getting Started](#starting-a-flink-session-on-mesos) guide at the top of this page describes
+deploying Flink in Session Mode.
+
+## Flink on Mesos Reference
+
+### Flink on Mesos Architecture
+
+The Flink on Mesos implementation consists of two components: The application 
master and the workers. 
+The workers are simple TaskManagers parameterized by the environment which is 
set up through the 
+application master. The most sophisticated component of the Flink on Mesos 
implementation is the 
+application master. The application master currently hosts the following 
components:
+- **Mesos Scheduler**: The Scheduler is responsible for registering a 
framework with Mesos, requesting 
+  resources, and launching worker nodes. The Scheduler continuously needs to 
report back to Mesos to 
+  ensure the framework is in a healthy state. To verify the health of the 
cluster, the Scheduler 
+  monitors the spawned workers, marks them as failed and restarts them if 
necessary.
+
+  Flink's Mesos Scheduler itself is currently not highly available. However, 
it persists all necessary 
+  information about its state (e.g. configuration, list of workers) in 
[ZooKeeper](#high-availability-on-mesos). 
+  In the presence of a failure, it relies on an external system to bring up a 
new Scheduler (see the 
+  [Marathon subsection](#marathon) for further details). The Scheduler will 
then register with Mesos 
+  again and go through the reconciliation phase. In the reconciliation phase, 
the Scheduler receives 
+  a list of running workers nodes. It matches these against the recovered 
information from ZooKeeper 
+  and makes sure to bring back the cluster in the state before the failure.
+- **Artifact Server**: The Artifact Server is responsible for providing 
resources to the worker nodes. 
+  The resources can be anything from the Flink binaries to shared secrets or 
configuration files. 
+  For instance, in non-containerized environments, the Artifact Server will 
provide the Flink binaries. 
+  What files will be served depends on the configuration overlay used.
+
+Flink's Mesos startup scripts `bin/mesos-appmaster.sh` and 
`bin/mesos-appmaster-job.sh` provide a way 
+to configure and start the application master. The worker nodes inherit all 
further configuration. 
+They are deployed through `bin/mesos-taskmanager.sh`. The configuration 
inheritance is achieved using 
+configuration overlays. Configuration overlays provide a way to infer a 
configuration from environment 
+variables and config files which are shipped to the worker nodes.
+
+See [Mesos Architecture](http://mesos.apache.org/documentation/latest/architecture/) for a more details
+on how frameworks are handled by Mesos.
+
+### Deploying User Libraries
+
+User libraries can be passed to the Mesos workers by placing them in Flink's `lib/` folder. This way,
+they will be picked by Mesos' Fetcher and copied over into the worker's sandbox folders. Alternatively,
+Docker containerization can be used as described in [Installing Flink on the Workers](#installing-flink-on-the-workers).
+
+### Installing Flink on the Workers
+
+Flink on Mesos offers two ways to distribute the Flink and user binaries within the Mesos cluster:
+1. **Using Mesos' Artifact Server**: The Artifact Server provides the resources which are moved by
+   [Mesos' Fetcher](http://mesos.apache.org/documentation/latest/fetcher/) into the Mesos worker's
+   [sandbox folders](http://mesos.apache.org/documentation/latest/sandbox/). It can be explicitly
+   specified by setting [mesos.resourcemanager.tasks.container.type]({% link deployment/config.md %}#mesos-resourcemanager-tasks-container-type)
+   to `mesos`. This is the default option and is used in the example commands of this page.
+2. **Using Docker containerization**: This enables the user to provide user libraries and other
+   customizations as part of a Docker image. Docker utilization can be enabled by setting
+   [mesos.resourcemanager.tasks.container.type]({% link deployment/config.md %}#mesos-resourcemanager-tasks-container-type)
+   to `docker` and by providing the image name through [mesos.resourcemanager.tasks.container.image.name]({% link deployment/config.md %}#mesos-resourcemanager-tasks-container-image-name).
+
+### High Availability on Mesos
+
+You will need to run a service like Marathon or Apache Aurora which takes care of restarting the
+JobManager process in case of node or process failures. In addition, Zookeeper needs to be configured
+as described in the [High Availability section of the Flink docs]({% link deployment/ha/index.md %}).
 
 #### Marathon
 
-Marathon needs to be set up to launch the `bin/mesos-appmaster.sh` script.
-In particular, it should also adjust any configuration parameters for the 
Flink cluster.
+Marathon needs to be set up to launch the `bin/mesos-appmaster.sh` script. In 
particular, it should 
+also adjust any configuration parameters for the Flink cluster.
 
 Here is an example configuration for Marathon:
+{% highlight javascript %}
+{
+    "id": "flink",
+    "cmd": "/opt/flink-1.11.2/bin/mesos-appmaster.sh -Dmesos.resourcemanager.framework.user=root -Dmesos.resourcemanager.tasks.taskmanager-cmd=/opt/flink-1.11.2/bin/mesos-taskmanager.sh -Dmesos.master=master:5050 -Djobmanager.memory.process.size=1472m -Dtaskmanager.memory.process.size=3500m -Dtaskmanager.numberOfTaskSlots=2 -Dparallelism.default=2",
+    "cpus": 2,
+    "mem": 1024,
+    "disk": 0,
+    "instances": 1,
+    "env": {
+        "MESOS_NATIVE_JAVA_LIBRARY": "/usr/lib/libmesos.so",
+        "HADOOP_CLASSPATH": "/opt/hadoop-2.10.1/etc/hadoop:/opt/hadoop-2.10.1/share/hadoop/common/lib/*:/opt/hadoop-2.10.1/share/hadoop/common/*:/opt/hadoop-2.10.1/share/hadoop/hdfs:/opt/hadoop-2.10.1/share/hadoop/hdfs/lib/*:/opt/hadoop-2.10.1/share/hadoop/hdfs/*:/opt/hadoop-2.10.1/share/hadoop/yarn:/opt/hadoop-2.10.1/share/hadoop/yarn/lib/*:/opt/hadoop-2.10.1/share/hadoop/yarn/*:/opt/hadoop-2.10.1/share/hadoop/mapreduce/lib/*:/opt/hadoop-2.10.1/share/hadoop/mapreduce/*:/contrib/capacity-scheduler/*.jar"

Review comment:
       Maybe it is simpler (for the example) to use Flink's bundled Hadoop 
dependency. That way we might work around the problem that Mesos relies on 
Hadoop even if you don't use it.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

