sjwiesman commented on a change in pull request #9192: [FLINK-12749] [docs]
[examples] Initial Version of Flink Cluster Playground
URL: https://github.com/apache/flink/pull/9192#discussion_r305942252
##########
File path: docs/getting-started/docker-playgrounds/flink_cluster_playground.md
##########
@@ -23,5 +23,658 @@ specific language governing permissions and limitations
under the License.
-->
+There are many ways to deploy and operate Apache Flink in various
environments. Regardless of this
+variety, the fundamental building blocks of a Flink Cluster remain the same
and similar
+operational principles apply.
+
+This docker-compose-based playground will get you started with Apache Flink operations quickly
+and will briefly introduce you to the main components that make up a Flink Cluster.
+
* This will be replaced by the TOC
{:toc}
+
+## Anatomy of this Playground
+
+This playground consists of a long-running
+[Flink Session Cluster]({{ site.baseurl }}/concepts/glossary.html#flink-session-cluster) and a
+Kafka Cluster.
+
+A Flink Cluster always consists of a
+[Flink Master]({{ site.baseurl }}/concepts/glossary.html#flink-master) and one
or more
+[Flink TaskManagers]({{ site.baseurl
}}/concepts/glossary.html#flink-taskmanager). The Flink Master
+is responsible for handling Job submissions, supervising Jobs, and managing resources.
+The Flink TaskManagers are the worker processes responsible for executing the actual
+[Tasks]({{ site.baseurl }}/concepts/glossary.html#task) that make up a Flink Job. In
+this playground you will start with a single TaskManager, but scale out to
more TaskManagers later.
+Additionally, this playground comes with a dedicated *client* container, which
we use to submit the
+Flink Job initially and to perform various operational tasks later on.
+
+The Kafka Cluster consists of a Zookeeper server and a Kafka Broker.
+
+<img src="{{ site.baseurl }}/fig/flink-docker-playground.svg" alt="Flink
Docker Playground"
+class="offset" width="80%" />
+
+When the playground is started, a Flink Job called *Click Event Count* is submitted to the
+Flink Master. Additionally, two Kafka Topics, *input* and *output*, are created.
+
+<img src="{{ site.baseurl }}/fig/click-event-count-example.svg" alt="Click
Event Count Example"
+class="offset" width="80%" />
+
+The Job consumes `ClickEvent`s from the *input* topic, each with a `timestamp`
and a `page`. The
+events are then keyed by `page` and counted in one minute
+[windows]({{ site.baseurl }}/dev/stream/operators/windows.html). The results
are written to the
+*output* topic.
+
+There are six different `page`s and the **events are generated so that each
window contains exactly
+one thousand records**.
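The keyed, windowed counting logic can be sketched outside of Flink in a few lines of Python. This is a simplified illustration of one-minute tumbling event-time windows keyed by `page`, not the actual Job implementation (the real Job uses Flink's windowing operators); the sample events and page names are made up:

```python
from collections import defaultdict

WINDOW_SIZE_MS = 60 * 1000  # one-minute tumbling windows


def window_start(ts_ms):
    # align an event timestamp to the start of its one-minute window
    return ts_ms - ts_ms % WINDOW_SIZE_MS


def count_clicks(events):
    # events: iterable of (timestamp_ms, page);
    # key by page and count per (page, window), mimicking keyBy + window + count
    counts = defaultdict(int)
    for ts_ms, page in events:
        counts[(page, window_start(ts_ms))] += 1
    return dict(counts)


# three events for one page in the first window, one for another page in the second
events = [(5_000, "/page-a"), (30_000, "/page-a"), (59_999, "/page-a"), (61_000, "/page-b")]
print(count_clicks(events))
# {('/page-a', 0): 3, ('/page-b', 60000): 1}
```

With the playground's generator emitting exactly one thousand records per page and window, every value in the resulting dictionary would be 1000.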
+
+{% top %}
+
+## Setup
+
+{% if site.version contains "SNAPSHOT" %}
+<p style="border-radius: 5px; padding: 5px" class="bg-danger">
+ <b>Note</b>: The Apache Flink Docker images used for this playground are
only available for
+  released versions of Apache Flink. Since you are currently looking at the latest SNAPSHOT
+  version of the documentation, the branch referenced below will not exist. You can either change
+  it manually or switch to the released version of the documentation via the release picker.
+</p>
+{% endif %}
+
+In this section you will set up the playground locally on your machine and verify that the Job
+is running successfully.
+
+This guide assumes that you have [docker](https://docs.docker.com/) (1.12+) and
+[docker-compose](https://docs.docker.com/compose/) (2.1+) installed on your
machine.
+
+The required configuration files are available in the
+[flink-playgrounds](https://github.com/apache/flink-playgrounds) repository.
Check it out and spin
+up the environment:
+
+{% highlight bash %}
+git clone --branch release-{{ site.version }} [email protected]:apache/flink-playgrounds.git
+cd flink-cluster-playground
+docker-compose up -d
+{% endhighlight %}
+
+Afterwards, `docker-compose ps` should give you the following output:
+
+{% highlight bash %}
+                     Name                                  Command               State                   Ports
+--------------------------------------------------------------------------------------------------------------------------------
+flink-cluster-playground_clickevent-generator_1   /docker-entrypoint.sh java ...   Up       6123/tcp, 8081/tcp
+flink-cluster-playground_client_1                 /docker-entrypoint.sh flin ...   Exit 0
+flink-cluster-playground_jobmanager_1             /docker-entrypoint.sh jobm ...   Up       6123/tcp, 0.0.0.0:8081->8081/tcp
+flink-cluster-playground_kafka_1                  start-kafka.sh                   Up       0.0.0.0:9094->9094/tcp
+flink-cluster-playground_taskmanager_1            /docker-entrypoint.sh task ...   Up       6123/tcp, 8081/tcp
+flink-cluster-playground_zookeeper_1              /bin/sh -c /usr/sbin/sshd  ...   Up       2181/tcp, 22/tcp, 2888/tcp, 3888/tcp
+{% endhighlight %}
+
+This indicates that the client container has successfully submitted the Flink Job ("Exit 0") and
+that all cluster components as well as the data generator are running ("Up").
+
+## Interaction with the Playground
+
+There are many ways to interact with this playground and its components.
+
+### Flink WebUI
+
+The Flink WebUI is probably the most natural starting point to observe your
Flink Cluster. It is
+exposed at `http://localhost:8081`. If everything went well, you'll see that the cluster
+initially consists of one TaskManager and that one Job called *Click Event Count* is in the
+"RUNNING" state.
+
+<img src="{{ site.baseurl }}/fig/playground-webui.png" alt="Playground Flink
WebUI"
+class="offset" width="100%" />
+
+### Logs
+
+**JobManager**
+
+The JobManager logs can be tailed via `docker-compose`.
+
+{% highlight bash %}
+docker-compose logs -f jobmanager
+{% endhighlight %}
+
+After the initial startup you should mainly see log messages for every
checkpoint completion.
+
+**TaskManager**
+
+The TaskManager logs can be tailed in the same way.
+
+{% highlight bash %}
+docker-compose logs -f taskmanager
+{% endhighlight %}
+
+After the initial startup you should mainly see log messages for every
checkpoint completion.
+
+### Flink CLI
+
+The [Flink CLI]({{ site.baseurl }}/ops/cli.html) can be used from within the
client container. For
+example, to print the `help` message of the Flink CLI you can run
+{% highlight bash%}
+docker-compose run --no-deps client flink --help
+{% endhighlight %}
+
+### Flink REST API
+
+The [Flink REST API]({{ site.baseurl }}/monitoring/rest_api.html#api) is exposed via
+`localhost:8081` on the host or via `jobmanager:8081` from within the client container. For
+example, to list all currently running jobs, you can run:
+{% highlight bash%}
+docker-compose run --no-deps client curl jobmanager:8081/jobs
+{% endhighlight %}
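If you want to process such a response programmatically, the JSON returned by the `/jobs` endpoint can be parsed with a few lines of Python. This is a sketch; the sample response body and job id below are hand-written in the shape shown above, not captured from a live cluster:

```python
import json

# sample response in the shape returned by the /jobs endpoint (id is made up)
response = '{"jobs": [{"id": "d8652cb1", "status": "RUNNING"}]}'


def running_job_ids(body):
    # return the ids of all jobs currently in RUNNING state
    data = json.loads(body)
    return [job["id"] for job in data["jobs"] if job["status"] == "RUNNING"]


print(running_job_ids(response))  # ['d8652cb1']
```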
+
+### Kafka Topics
+
+To manually inspect the records in the Kafka Topics, you can run
+{% highlight bash %}
+# input topic (1000 records/s)
+docker-compose exec kafka kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic input
+# output topic (6 records/min)
+docker-compose exec kafka kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic output
+{% endhighlight %}
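Each record on the *input* topic is a `ClickEvent` with a `timestamp` and a `page`. As a minimal sketch, assuming the events are JSON-serialized with those two fields and a timestamp format like the one below (both assumptions; the exact wire format is defined by the playground's generator), a consumed record could be decoded like this:

```python
import json
from datetime import datetime

# hypothetical record, assuming JSON serialization of ClickEvent
record = '{"timestamp": "2019-07-16 16:37:00.000", "page": "/page-a"}'


def parse_click_event(raw):
    # decode one record into (timestamp, page)
    event = json.loads(raw)
    ts = datetime.strptime(event["timestamp"], "%Y-%m-%d %H:%M:%S.%f")
    return ts, event["page"]


ts, page = parse_click_event(record)
print(page)  # /page-a
```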
+
+{% top %}
+
+## Tear Down
+
+To tear down the environment (including the removal of all volumes) run:
+{% highlight bash%}
+docker-compose down -v
+{% endhighlight %}
+
+{% top %}
+
+## Operational Plays
+
+This section describes some prototypical operational activities in the context
of this playground.
+They do not need to be executed in any particular order. Most of these tasks
can be performed either
+via the [CLI](#flink-cli) or the [REST API](#flink-rest-api).
+
+### Listing Running Jobs
+
+<div class="codetabs" markdown="1">
+<div data-lang="CLI" markdown="1">
+**Command**
+{% highlight bash %}
+docker-compose run --no-deps client flink list
+{% endhighlight %}
+**Expected Output**
+{% highlight plain %}
+Waiting for response...
+------------------ Running/Restarting Jobs -------------------
+16.07.2019 16:37:55 : <job-id> : Click Event Count (RUNNING)
+--------------------------------------------------------------
+No scheduled jobs.
+{% endhighlight %}
+</div>
+<div data-lang="REST API" markdown="1">
+**Request**
+{% highlight bash %}
+docker-compose run --no-deps client curl jobmanager:8081/jobs
+{% endhighlight %}
+**Expected Response (pretty-printed)**
+{% highlight bash %}
+{
+ "jobs": [
+ {
+ "id": "<job-id>",
+ "status": "RUNNING"
+ }
+ ]
+}
+{% endhighlight %}
+</div>
+</div>
+
+### Observing Failure & Recovery
+
+Flink provides exactly-once processing guarantees under (partial) failure. In this playground
+you can observe and, to some extent, verify this behavior.
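Since the generator emits exactly one thousand records per page and window, a deviation in any output count would indicate lost or duplicated events. A verification of consumed output records could be sketched like this (a simplified illustration; the record tuples are made up, not read from the actual *output* topic):

```python
def verify_counts(output_records, expected=1000):
    # output_records: iterable of (page, window_start, count) tuples;
    # exactly-once holds only if every per-window count matches the expected value
    return all(count == expected for _page, _window, count in output_records)


# hypothetical output records for two pages in the same window
sample = [("/page-a", 0, 1000), ("/page-b", 0, 1000)]
print(verify_counts(sample))  # True
```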
+
+#### Step 1: Observing the Output
+
+As described [above](#anatomy-of-this-playground) the events in this
playground are generate such
Review comment:
```suggestion
As described [above](#anatomy-of-this-playground), the events in this
playground are generated such
```
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
With regards,
Apache Git Services