openinx commented on a change in pull request #2147:
URL: https://github.com/apache/iceberg/pull/2147#discussion_r563727469
##########
File path: site/docs/flink.md
##########
@@ -224,24 +224,49 @@ DROP TABLE hive_catalog.default.sample;
## Querying with SQL
-Iceberg does not support streaming read in flink now, it's still working in-progress. But it support batch read to scan the existing records in iceberg table.
+Iceberg now supports both streaming and batch read in Flink. We can execute the following SQL command to switch the execution type from 'streaming' mode to 'batch' mode, and vice versa:
+
+```sql
+-- Execute the flink job in streaming mode for current session context
+SET execution.type = streaming
+
+-- Execute the flink job in batch mode for current session context
+SET execution.type = batch
+```
+
+### Flink batch read
+
+To check all the rows in an Iceberg table by submitting a Flink __batch__ job, you can execute the following statements:
```sql
-- Execute the flink job in batch mode for current session context
SET execution.type = batch ;
SELECT * FROM sample ;
```
-Notice: we could execute the following sql command to switch the execute type from 'streaming' mode to 'batch' mode, and vice versa:
+### Flink streaming read
+
+Iceberg supports processing incremental data in Flink streaming jobs, starting from a historical snapshot-id:
```sql
--- Execute the flink job in streaming mode for current session context
-SET execution.type = streaming
+-- Submit the flink job in streaming mode for current session.
+SET execution.type = streaming ;
--- Execute the flink job in batch mode for current session context
-SET execution.type = batch
+-- Enable this switch because streaming read SQL will pass a few job options via Flink SQL hint options.
+SET table.dynamic-table-options.enabled=true;
+
+-- Read all the records from the current Iceberg snapshot, then read incremental data starting from that snapshot.
+SELECT * FROM sample /*+ OPTIONS('streaming'='true', 'monitor-interval'='1s')*/ ;
Review comment:
    Though we've switched `execution.type=streaming`, we still need the hint option `'streaming'='true'` here, because `execution.type=streaming` is not passed down to the `StreamTableSource`, so the table SQL job cannot tell whether it is a `streaming` job or a `batch` job (in Flink 1.11 and Flink 1.12). That's an imperfect part of the Flink SQL implementation; we will try to improve it in a future Flink release.
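
    For reference, a minimal sketch of the full streaming-read session described above, including starting from a historical snapshot as the added doc text mentions. This assumes the `start-snapshot-id` read option; the snapshot-id value is a placeholder, not from this PR:

    ```sql
    -- Submit the Flink job in streaming mode for the current session.
    SET execution.type = streaming ;

    -- Allow SQL hints to pass job options such as 'streaming'='true'.
    SET table.dynamic-table-options.enabled=true;

    -- Read incremental data beginning from a specific historical snapshot
    -- (placeholder id), instead of the table's current snapshot.
    SELECT * FROM sample /*+ OPTIONS('streaming'='true', 'monitor-interval'='1s', 'start-snapshot-id'='1234567890123456789')*/ ;
    ```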
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]