shuiqiangchen commented on a change in pull request #16:
URL: https://github.com/apache/flink-playgrounds/pull/16#discussion_r493545356
##########
File path: pyflink-walkthrough/README.md
##########
@@ -0,0 +1,134 @@
+# pyflink-walkthrough
+
+## Background
+
+In this playground, you will learn how to build and run an end-to-end PyFlink pipeline for data analytics, covering the following steps:
+
+* Reading data from a Kafka source;
+* Creating data using a [UDF](https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/python/table-api-users-guide/udfs/python_udfs.html);
+* Performing a simple aggregation over the source data;
+* Writing the results to Elasticsearch and visualizing them in Kibana.
+
+The environment is based on Docker Compose, so the only requirement is that you have [Docker](https://docs.docker.com/get-docker/)
+installed in your machine.
+
+### Kafka
+You will be using Kafka to store sample input data about payment transactions. A simple data generator [generate_source_data.py](generator/generate_source_data.py) is provided to

Review comment:
Here I would like to use an `H3` header for the introduction of each component (Kafka, PyFlink, Elasticsearch, Kibana) to make them more recognizable.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
