fhueske commented on a change in pull request #9491: [FLINK-12749] Add Flink Operations Playground
URL: https://github.com/apache/flink/pull/9491#discussion_r317631609
##########
File path: docs/tutorials/docker-playgrounds/flink-operations-playground.md
##########
@@ -0,0 +1,809 @@
+---
+title: "Flink Operations Playground"
+nav-title: 'Flink Operations Playground'
+nav-parent_id: docker-playgrounds
+nav-pos: 1
+---
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+There are many ways to deploy and operate Apache Flink in various environments. Regardless of this
+variety, the fundamental building blocks of a Flink Cluster remain the same, and similar
+operational principles apply.
+
+In this playground, you will learn how to manage and run Flink Jobs. You will see how to deploy and
+monitor an application, experience how Flink recovers from Job failure, and perform everyday
+operational tasks like upgrades and rescaling.
+
+* This will be replaced by the TOC
+{:toc}
+
+## Anatomy of this Playground
+
+This playground consists of a long living Flink Session Cluster and a Kafka Cluster.
+
+A Flink Cluster always consists of a Flink Master and one or more Flink TaskManagers. The Flink Master
+is responsible for handling Job submissions, the supervision of Jobs as well as resource management.
+The Flink TaskManagers are the worker processes and are responsible for the execution of the actual
+Tasks which make up a Flink Job. In this playground you will start with a single TaskManager, but
+scale out to more TaskManagers later.
+Additionally, this playground comes with a dedicated *client* container, which we use to submit the
+Flink Job initially and to perform various operational tasks later on. The *client* container is not
+needed by the Flink Cluster itself but only included for ease of use.
+
+The Kafka Cluster consists of a Zookeeper server and a Kafka Broker.
+
+<img src="{{ site.baseurl }}/fig/flink-docker-playground.svg" alt="Flink Docker Playground"
+class="offset" width="80%" />
+
+When the playground is started a Flink Job called *Flink Event Count* will be submitted to the
+Flink Master. Additionally, two Kafka Topics *input* and *output* are created.
+
+<img src="{{ site.baseurl }}/fig/click-event-count-example.svg" alt="Click Event Count Example"
+class="offset" width="80%" />
+
+The Job consumes `ClickEvent`s from the *input* topic, each with a `timestamp` and a `page`. The
+events are then keyed by `page` and counted in 15 second
+[windows]({{ site.baseurl }}/dev/stream/operators/windows.html). The results are written to the
+*output* topic.
+
+There are six different pages and we generate 1000 click events per page and 15 seconds. Hence, the
+output of the Flink job should show 1000 views per page and window.
+
+{% top %}
+
+## Starting the Playground
+
+{% if site.version contains "SNAPSHOT" %}
+<p style="border-radius: 5px; padding: 5px" class="bg-danger">
+ <b>Note</b>: The Apache Flink Docker images used for this playground are only available for
+ released versions of Apache Flink. Since you are currently looking at the latest SNAPSHOT
+ version of the documentation the branch referenced below will not exist. You can either change it
+ manually or switch to the released version of the documentation via the release picker.
+</p>
+{% endif %}
+
+The playground environment is set up in just a few steps. We will walk you through the necessary
+commands and show how to validate that everything is running correctly.
+
+We assume that you have [docker](https://docs.docker.com/) (1.12+) and
+[docker-compose](https://docs.docker.com/compose/) (2.1+) installed on your machine.
+
+The required configuration files are available in the
+[flink-playgrounds](https://github.com/apache/flink-playgrounds) repository. Check it out and spin
+up the environment:
+
+{% highlight bash %}
+git clone --branch release-{{ site.version_title }} https://github.com/apache/flink-playgrounds.git
+cd flink-playgrounds/operations-playground
+docker-compose build
+docker-compose up -d
+{% endhighlight %}
+
+Afterwards, you can inspect the running Docker containers with the following command:
+
+{% highlight bash %}
+docker-compose ps
+
+ Name Command State Ports
+--------------------------------------------------------------------------------------------------------------------------------
+flink-cluster-playground_clickevent-generator_1 /docker-entrypoint.sh java ... Up 6123/tcp, 8081/tcp

Review comment:
Update output
