This is an automated email from the ASF dual-hosted git repository.

lzljs3620320 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/flink-table-store.git


The following commit(s) were added to refs/heads/master by this push:
     new c2df4674 [FLINK-30043] Some example SQLs in flink table store rescale-bucket document are incorrect
c2df4674 is described below

commit c2df467406e7817b1f308cd4a8eb1c5baaa5f294
Author: Stan <[email protected]>
AuthorDate: Thu Nov 17 11:43:28 2022 +0800

    [FLINK-30043] Some example SQLs in flink table store rescale-bucket document are incorrect
    
    This closes #385
---
 docs/content/docs/development/rescale-bucket.md | 25 ++++++++++++++++++++-----
 1 file changed, 20 insertions(+), 5 deletions(-)

diff --git a/docs/content/docs/development/rescale-bucket.md b/docs/content/docs/development/rescale-bucket.md
index 337b1063..ba2e59e9 100644
--- a/docs/content/docs/development/rescale-bucket.md
+++ b/docs/content/docs/development/rescale-bucket.md
@@ -88,6 +88,21 @@ WITH (
     'bucket' = '16'
 );
 
+-- e.g. from a Kafka source table
+CREATE TEMPORARY TABLE raw_orders(
+    trade_order_id BIGINT,
+    item_id BIGINT,
+    item_price BIGINT,
+    gmt_create STRING,
+    order_status STRING
+) WITH (
+    'connector' = 'kafka',
+    'topic' = '...',
+    'properties.bootstrap.servers' = '...',
+    'format' = 'csv'
+    ...
+);
+
 -- streaming insert as bucket num = 16
 INSERT INTO verified_orders
 SELECT trade_order_id,
@@ -95,7 +110,7 @@ SELECT trade_order_id,
        item_price,
        DATE_FORMAT(gmt_create, 'yyyy-MM-dd') AS dt
 FROM raw_orders
-WHERE order_status = 'verified'
+WHERE order_status = 'verified';
 ```
 The pipeline has been running well for the past few weeks. However, the data volume has grown fast recently,
 and the job's latency keeps increasing. To improve the data freshness, users can
@@ -110,7 +125,7 @@ and the job's latency keeps increasing. To improve the data freshness, users can
 - Increase the bucket number
   ```sql
   -- scaling out
-  ALTER TABLE verified_orders SET ('bucket' = '32')
+  ALTER TABLE verified_orders SET ('bucket' = '32');
   ```
 - Switch to batch mode and overwrite the current partition(s) to which the streaming job is writing
   ```sql
@@ -122,7 +137,7 @@ and the job's latency keeps increasing. To improve the data freshness, users can
          item_id,
          item_price
   FROM verified_orders
-  WHERE dt = '2022-06-22' AND order_status = 'verified'
+  WHERE dt = '2022-06-22';
   
   -- case 2: there are late events updating the historical partitions, but the range does not exceed 3 days
   INSERT OVERWRITE verified_orders
@@ -131,7 +146,7 @@ and the job's latency keeps increasing. To improve the data freshness, users can
          item_price,
          dt
   FROM verified_orders
-  WHERE dt IN ('2022-06-20', '2022-06-21', '2022-06-22') AND order_status = 'verified'
+  WHERE dt IN ('2022-06-20', '2022-06-21', '2022-06-22');
   ```
 - After the overwrite job has finished, switch back to streaming mode. Now the parallelism can be increased alongside the bucket number to restore the streaming job from the savepoint (see [Start a SQL Job from a savepoint](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/table/sqlclient/#start-a-sql-job-from-a-savepoint))
@@ -145,5 +160,5 @@ and the job's latency keeps increasing. To improve the data freshness, users can
        item_price,
        DATE_FORMAT(gmt_create, 'yyyy-MM-dd') AS dt
   FROM raw_orders
-  WHERE order_status = 'verified'
+  WHERE order_status = 'verified';
   ```
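For readers retracing the rescale workflow above: the patch shows the ALTER TABLE and INSERT OVERWRITE statements, but the switch to batch mode and the later restore from the savepoint are only described in prose. Below is a minimal SQL-client sketch of those two steps, based on the standard Flink 1.16 settings the linked savepoint page documents; the savepoint path and parallelism value are hypothetical placeholders.

```sql
-- switch the SQL client to batch mode before running the INSERT OVERWRITE backfill
SET 'execution.runtime-mode' = 'batch';

-- once the overwrite has finished, switch back to streaming mode and resume
-- from the savepoint taken when the streaming job was suspended
SET 'execution.runtime-mode' = 'streaming';
SET 'execution.savepoint.path' = '/tmp/flink-savepoints/savepoint-1af19c';  -- hypothetical path
-- raise the default parallelism alongside the new bucket number (example value)
SET 'parallelism.default' = '32';
```

With these settings in place, re-submitting the streaming INSERT INTO shown in the last hunk resumes the pipeline with the new bucket layout.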
