openinx commented on a change in pull request #2147:
URL: https://github.com/apache/iceberg/pull/2147#discussion_r563727469



##########
File path: site/docs/flink.md
##########
@@ -224,24 +224,49 @@ DROP TABLE hive_catalog.default.sample;
 
 ## Querying with SQL
 
-Iceberg does not support streaming read in flink now, it's still working 
in-progress. But it support batch read to scan the existing records in iceberg 
table.
+Iceberg now supports both streaming and batch reads in Flink. We can execute the following SQL commands to switch the execution type between 'streaming' mode and 'batch' mode, and vice versa:
+
+```sql
+-- Execute the flink job in streaming mode for current session context
+SET execution.type = streaming
+
+-- Execute the flink job in batch mode for current session context
+SET execution.type = batch
+```
+
+### Flink batch read
+
+To scan all the rows in an Iceberg table by submitting a Flink __batch__ job, execute the following statements:
 
 ```sql
 -- Execute the flink job in batch mode for current session context
 SET execution.type = batch ;
 SELECT * FROM sample       ;
 ```
 
-Notice: we could execute the following sql command to switch the execute type 
from 'streaming' mode to 'batch' mode, and vice versa:
+### Flink streaming read
+
+Iceberg supports processing incremental data in Flink streaming jobs, starting from a historical snapshot-id:
 
 ```sql
--- Execute the flink job in streaming mode for current session context
-SET execution.type = streaming
+-- Submit the Flink job in streaming mode for the current session.
+SET execution.type = streaming ;
 
--- Execute the flink job in batch mode for current session context
-SET execution.type = batch
+-- Enable this option because the streaming read SQL will pass job options via Flink SQL hint options.
+SET table.dynamic-table-options.enabled=true;
+
+-- Read all the records from the current Iceberg snapshot, and then read incremental data starting from that snapshot.
+SELECT * FROM sample /*+ OPTIONS('streaming'='true', 'monitor-interval'='1s')*/ ;

Review comment:
       Though we've switched `execution.type=streaming`, we still need the hint option `'streaming'='true'` here, because `execution.type=streaming` is not passed down to the `StreamTableSource`, so the table SQL job cannot tell whether it is a `streaming` or a `batch` job (in Flink 1.11 and Flink 1.12). That's an imperfection in Flink SQL's implementation; we will try to improve it in a future Flink release.
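
       As a side note for readers of this thread, the streaming read can also begin from a specific historical snapshot by adding the `start-snapshot-id` option to the same hint. A sketch, assuming the session settings from the patch above are already in effect; the snapshot id below is a placeholder:

       ```sql
       -- Read incremental data starting from the given snapshot id
       -- (placeholder value; substitute a real snapshot id from the table's history).
       SELECT * FROM sample /*+ OPTIONS('streaming'='true', 'monitor-interval'='1s', 'start-snapshot-id'='3821550127947089987')*/ ;
       ```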




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


