LadyForest commented on code in PR #159:
URL: https://github.com/apache/flink-table-store/pull/159#discussion_r898836246


##########
docs/content/docs/development/query-table.md:
##########
@@ -58,51 +58,14 @@ Supported filter functions are:
 - `in`
 - starts with `like`
 
-## Streaming Real-time
+## Real-time Streaming Consumption
 
 By default, data is only visible after the checkpoint, which means
 that the streaming reading has transactional consistency.
 
-If you want the data to be immediately visible, you need to set the following options:
+If you want the data to be immediately visible, you need to set table option:
+`log.consistency` = `eventual`.
 
-<table class="table table-bordered">
-    <thead>
-    <tr>
-      <th class="text-left" style="width: 20%">Table Option</th>
-      <th class="text-center" style="width: 5%">Default</th>
-      <th class="text-center" style="width: 60%">Description</th>
-    </tr>
-    </thead>
-    <tbody>
-    <tr>
-      <td><h5>`log.system` = `kafka`</h5></td>
-      <td>No</td>
-      <td>You need to enable log system because the FileStore's continuous consumption only provides checkpoint-based visibility.</td>
-    </tr>
-    <tr>
-      <td><h5>`log.consistency` = `eventual`</h5></td>
-      <td>No</td>
-      <td>This means that writes are visible without using LogSystem's transaction mechanism.</td>
-    </tr>
-    </tbody>
-</table>
-
-Note: All tables need to have the primary key defined because only then can the
-data be de-duplicated by the normalizing node of the downstream job.
-
-## Streaming Low Cost
-
-By default, for the table with the primary key, the records in the table store only
-contain `INSERT`, `UPDATE_AFTER`, and `DELETE`. The downstream consuming job will
-generate a normalized node, and it stores all processed key-value to produce the
-`UPDATE_BEFORE` message, which will bring extra overhead.
-
-If you want to remove downstream normalized node (It's costly) or see the all
-changes of this table, you can configure:
-- 'log.changelog-mode' = 'all'
-- 'log.consistency' = 'transactional' (default)
-
-The inserted query written to the table store must contain all message types with
-`UPDATE_BEFORE`, otherwise the planner will throw an exception. It means that Planner
-expects the inserted query to produce a real changelog, otherwise the data would
-be wrong.
+In this at-least-once mode, records are sent to downstream jobs ahead of time,
+which means that duplicate data may be sent at job failover, and you may need to
+manually de-duplicate data to achieve final consistency.

Review Comment:
   ```suggestion
   Due to the tradeoff between data freshness and completeness, immediate data visibility is hard to achieve under exactly-once semantics. However, users can relax this constraint and use at-least-once mode instead. Note that in this mode records may be sent to downstream jobs ahead of the watermark (since no barrier alignment is required), which may lead to duplicate data during job failover. As a result, users may need to manually de-duplicate data to achieve final consistency.
   ```
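
   It might also help readers to see the option in context. Below is a minimal sketch, not a definitive example: the table name `word_count` and its columns are hypothetical, and pairing the option with `log.system` = `kafka` is an assumption taken from the table removed in this hunk. Any Kafka connection settings the log system needs are left out.

   ```sql
   -- Hypothetical table, for illustration only.
   -- 'log.system' = 'kafka' is assumed from the removed option table;
   -- the connection settings it requires are omitted here.
   CREATE TABLE word_count (
     word STRING PRIMARY KEY NOT ENFORCED,
     cnt BIGINT
   ) WITH (
     'log.system' = 'kafka',
     -- Makes writes visible before the checkpoint (at-least-once).
     'log.consistency' = 'eventual'
   );
   ```

   The primary key is declared because the removed note says downstream de-duplication relies on it.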



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.