bhasudha commented on code in PR #10294:
URL: https://github.com/apache/hudi/pull/10294#discussion_r1423987370
##########
website/docs/sql_queries.md:
##########
@@ -98,44 +98,40 @@ Once the Flink Hudi tables have been registered to the Flink catalog, they can b
relying on the custom Hudi input formats like Hive. Typically, notebook users
and Flink SQL CLI users leverage flink sql for querying Hudi tables. Please add
hudi-flink-bundle as described in the [Flink Quickstart](/docs/flink-quick-start-guide).
-### Snapshot Query
+### Snapshot Query
By default, Flink SQL will try to use its optimized native readers (e.g. for
reading parquet files) instead of Hive SerDes.
Additionally, partition pruning is applied by Flink if a partition predicate
is specified in the filter. Filter push down may not be supported yet (please
check the Flink roadmap).
-```sql
-select * from hudi_table/*+ OPTIONS('metadata.enabled'='true', 'read.data.skipping.enabled'='false','hoodie.metadata.index.column.stats.enable'='true')*/;
Review Comment:
Please retain the code samples.
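For reference, the snapshot query sample that this comment asks to retain (taken from the removed lines in the hunk above) reads as follows; the multi-line layout of the `OPTIONS` hint is editorial, the option names and values are exactly those in the diff:

```sql
-- Snapshot query with metadata-table and column-stats options
-- supplied through a Flink SQL dynamic table options hint
select * from hudi_table
/*+ OPTIONS(
  'metadata.enabled' = 'true',
  'read.data.skipping.enabled' = 'false',
  'hoodie.metadata.index.column.stats.enable' = 'true'
)*/;
```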
##########
website/docs/sql_queries.md:
##########
@@ -98,44 +98,40 @@ Once the Flink Hudi tables have been registered to the Flink catalog, they can b
relying on the custom Hudi input formats like Hive. Typically, notebook users
and Flink SQL CLI users leverage flink sql for querying Hudi tables. Please add
hudi-flink-bundle as described in the [Flink Quickstart](/docs/flink-quick-start-guide).
-### Snapshot Query
+### Snapshot Query
By default, Flink SQL will try to use its optimized native readers (e.g. for
reading parquet files) instead of Hive SerDes.
Additionally, partition pruning is applied by Flink if a partition predicate
is specified in the filter. Filter push down may not be supported yet (please
check the Flink roadmap).
-```sql
-select * from hudi_table/*+ OPTIONS('metadata.enabled'='true', 'read.data.skipping.enabled'='false','hoodie.metadata.index.column.stats.enable'='true')*/;
-```
-
#### Options
-| Option Name | Required | Default | Remarks |
-| ----------- | ------- | ------- | ------- |
-| `metadata.enabled` | `false` | false | Set to `true` to enable |
-| `read.data.skipping.enabled` | `false` | false | Whether to enable data skipping for batch snapshot read, by default disabled |
-| `hoodie.metadata.index.column.stats.enable` | `false` | false | Whether to enable column statistics (max/min) |
-| `hoodie.metadata.index.column.stats.column.list` | `false` | N/A | Columns(separated by comma) to collect the column statistics |
+| Option Name                                      | Required | Default | Remarks                                                                      |
+|--------------------------------------------------|----------|---------|------------------------------------------------------------------------------|
+| `metadata.enabled`                               | `false`  | false   | Set to `true` to enable                                                      |
+| `read.data.skipping.enabled`                     | `false`  | false   | Whether to enable data skipping for batch snapshot read, by default disabled |
+| `hoodie.metadata.index.column.stats.enable`      | `false`  | false   | Whether to enable column statistics (max/min)                                |
+| `hoodie.metadata.index.column.stats.column.list` | `false`  | N/A     | Columns(separated by comma) to collect the column statistics                 |
### Streaming Query
By default, the hoodie table is read as a batch, that is, the latest snapshot
data set is read and returned. Turn on the streaming read mode by setting
option `read.streaming.enabled` to `true`. Set option `read.start-commit` to
specify the read start offset; specify the value as `earliest` if you want to
consume all the history data set.
```sql
-select * from hudi_table/*+ OPTIONS('read.streaming.enabled'='true', 'read.start-commit'='earliest')*/;
Review Comment:
Please retain the code samples.
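For reference, the streaming query sample that this comment asks to retain (from the removed line in the hunk above) reads as follows; the multi-line layout of the `OPTIONS` hint is editorial, the options themselves are exactly those in the diff:

```sql
-- Streaming read starting from the earliest commit,
-- enabled through a Flink SQL dynamic table options hint
select * from hudi_table
/*+ OPTIONS(
  'read.streaming.enabled' = 'true',
  'read.start-commit' = 'earliest'
)*/;
```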
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go
to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]