[GitHub] [flink-table-store] LadyForest commented on a diff in pull request #207: [FLINK-28456] Document streaming query page

GitBox Fri, 08 Jul 2022 04:44:04 -0700


LadyForest commented on code in PR #207:
URL: https://github.com/apache/flink-table-store/pull/207#discussion_r916737039



##########
docs/content/docs/development/streaming-query.md:
##########
@@ -0,0 +1,115 @@
+---
+title: "Streaming Query"
+weight: 5
+type: docs
+aliases:
+- /development/streaming-query.html
+---
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Streaming Query
+
+Currently, only Flink supports streaming query.
+
+The Table Store is streaming batch unified, you can read full
+and incremental data depending on the runtime execution mode:
+
+```sql
+-- Batch mode, read latest snapshot
+SET 'execution.runtime-mode' = 'batch';
+SELECT * FROM MyTable;
+
+-- Streaming mode, streaming reading, read incremental snapshot, read the 
snapshot first, then read the incremental
+SET 'execution.runtime-mode' = 'streaming';
+SELECT * FROM MyTable;
+
+-- Streaming mode, streaming reading, read latest incremental
+SET 'execution.runtime-mode' = 'streaming';
+SELECT * FROM MyTable /*+ OPTIONS ('log.scan'='latest') */;
+```
+
+Different `log.scan` mode will result in different consuming behavior under 
streaming mode.
+<table class="table table-bordered">
+    <thead>
+    <tr>
+      <th class="text-left" style="width: 20%">Scan Mode</th>
+      <th class="text-center" style="width: 5%">Default</th>
+      <th class="text-center" style="width: 60%">Description</th>
+    </tr>
+    </thead>
+    <tbody>
+    <tr>
+      <td><h5>FULL</h5></td>
+      <td>Yes</td>
+      <td>FULL scan mode performs a hybrid reading with a snapshot scan and 
the streaming incremental scan.</td>
+    </tr>
+    <tr>
+      <td><h5>LATEST</h5></td>
+      <td>No</td>
+      <td>LATEST scan mode only reads incremental data from the latest 
offset.</td>
+    </tr>
+    </tbody>
+</table>
+
+## Streaming Query on Files
+
+You can incrementally consume tables directly on the lake store files. This 
mode has
+a lower cost compared to Kafka, but the latency will be bigger, depending on 
the
+checkpoint interval of the writing stream job.
+
+By default, the downstream streaming consumption is a disordered (ordered 
within the key)
+stream of upsert data. If you expect an ordered CDC data stream, you can 
configure it
+as follows (recommended):
+
+```sql
+CREATE TABLE T (...)
+WITH (
+    'changelog-file' = 'true',
+    'log.changelog-mode' = 'all'
+)
+```
+
+## Streaming Query on Kafka
+
+You can configure the Kafka topic for the table, and the data written will be

Review Comment:
   ```suggestion
   For a table configuring a log system like Kafka, data will be double written 
to the file storage and the topic under streaming mode. For queries, there will 
be hybrid reads will from incremental snapshots.
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

[GitHub] [flink-table-store] LadyForest commented on a diff in pull request #207: [FLINK-28456] Document streaming query page

Reply via email to