[GitHub] [flink] infoverload commented on a change in pull request #18055: [docs] Tutorial: Write Your First Flink SQL program

GitBox Thu, 09 Dec 2021 05:56:43 -0800


infoverload commented on a change in pull request #18055:
URL: https://github.com/apache/flink/pull/18055#discussion_r765810772




##########
File path: docs/content/docs/try-flink/write_flink_program_with_sql.md
##########
@@ -0,0 +1,266 @@
+---
+title: 'Write your first Flink program with SQL'
+weight: 2 
+type: docs
+aliases:
+  - /try-flink/write_flink_program_with_sql.html
+---
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Write your first Flink program with SQL
+
+## Introduction
+
+Flink features [multiple 
APIs](https://nightlies.apache.org/flink/flink-docs-master/docs/concepts/overview/)
 with different levels of abstraction that can be used to develop your 
streaming application. SQL is the highest level of abstraction, and Flink 
supports it as a unified relational API for batch and stream processing. This 
means that you can run the same queries on both unbounded real-time streams 
and bounded recorded streams and get the same results. 
+
+SQL on Flink is based on [Apache Calcite](https://calcite.apache.org/) (which 
is based on standard SQL) and is commonly used to simplify data analytics, 
data pipelining, and ETL applications. It is a great entry point for writing 
your first Flink application and requires no Java or Python. 
+
+This tutorial will guide you through writing your first Flink program using 
SQL alone. Through this exercise you will see how easily and quickly you can 
analyze streaming data in Flink. 
+
+
+## Goals
+
+This tutorial will teach you how to:
+
+- use the Flink SQL client to submit queries 
+- consume a data source with Flink SQL
+- run a continuous query on a stream of data
+- use Flink SQL to write out results to persistent storage 
+
+
+## Prerequisites 
+
+You only need to have basic knowledge of SQL to follow along.
+
+
+## Step 1: Start the Flink SQL client 
+
+The [SQL 
Client](https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sqlclient/)
 is bundled in the regular Flink distribution and runs out of the box. It 
requires only a running Flink cluster where table programs can be executed 
(since Flink SQL is a thin abstraction over the Table API). 
+
+There are many ways to set up Flink, but for this tutorial you will run it 
locally. [Download 
Flink](https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/try-flink/local_installation/#downloading-flink)
 and [start a local 
cluster](https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/try-flink/local_installation/#starting-and-stopping-a-local-cluster)
 with one worker (or TaskManager).  
+
+The scripts for the SQL client are located in the `bin/` directory of the 
Flink distribution. You can start the client by executing:
+
+```sh
+./bin/sql-client.sh
+```
+
+You should see something like this:
+
+{{< img src="/fig/try-flink/flink-sql.png" alt="Flink SQL client" >}}
+
+
+## Step 2: Set up a data source with flink-faker
+
+As with any Flink program, you need a data source for Flink to read from. 
There are many popular data sources, but for this tutorial you will use 
[flink-faker](https://github.com/knaufk/flink-faker). This custom [table 
source](https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/connectors/table/overview/)
 is based on [Java Faker](https://github.com/DiUS/java-faker) and can generate 
fake data continuously in memory and in a realistic format. 
+
+Java Faker is a tool for generating this data, and flink-faker exposes it as 
a source in Flink by implementing the [DynamicTableSource 
interface](https://nightlies.apache.org/flink/flink-docs-release-1.11/api/java/org/apache/flink/table/connector/source/DynamicTableSource.html).
 The dynamic table source contains the logic for creating a table source (in 
this case, from flink-faker). Adding a 
[factory](https://nightlies.apache.org/flink/flink-docs-master/api/java/org/apache/flink/table/factories/DynamicTableSourceFactory.html)
 for it exposes the source in the SQL API, where you reference it with 
`'connector' = 'faker'`.
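+
+As a rough sketch of how the connector is referenced, assuming the flink-faker JAR is available on the SQL client's classpath, a table backed by it could be declared as follows. The table name `heroes`, its columns, and the Java Faker expressions are illustrative only; the `fields.<column>.expression` option follows the flink-faker README and may differ between versions.
+
+```sql
+-- Hypothetical table: the name and columns are chosen for illustration only.
+CREATE TEMPORARY TABLE heroes (
+  `name`  STRING,
+  `power` STRING
+) WITH (
+  -- Selects the flink-faker table source through its factory identifier.
+  'connector' = 'faker',
+  -- Java Faker expressions that generate a value for each column.
+  'fields.name.expression'  = '#{superhero.name}',
+  'fields.power.expression' = '#{superhero.power}'
+);
+```
+
+Once the table is declared, a query such as `SELECT * FROM heroes;` in the SQL client should return a continuously growing stream of generated rows.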

Review comment:
       ooops mistake




