This is an automated email from the ASF dual-hosted git repository.
danny0405 pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/hudi.git
The following commit(s) were added to refs/heads/asf-site by this push:
new cf269ca6805 [HUDI-6225][DOCS] hudi_table_changes docs in spark quickstart guide (#8880)
cf269ca6805 is described below
commit cf269ca68054bf0b65924c49c12c02d5c6415df4
Author: kazdy <[email protected]>
AuthorDate: Wed Jun 21 02:51:50 2023 +0200
[HUDI-6225][DOCS] hudi_table_changes docs in spark quickstart guide (#8880)
* add hudi_table_changes(by_path) docs to spark quickstart guide
* polymorphic hudi_table_changes
---
website/docs/quick-start-guide.md | 41 +++++++++++++++++++++++++++++++++++++--
1 file changed, 39 insertions(+), 2 deletions(-)
diff --git a/website/docs/quick-start-guide.md b/website/docs/quick-start-guide.md
index 2d14f973a7f..8de81f2856f 100644
--- a/website/docs/quick-start-guide.md
+++ b/website/docs/quick-start-guide.md
@@ -839,6 +839,7 @@ defaultValue="python"
values={[
{ label: 'Scala', value: 'scala', },
{ label: 'Python', value: 'python', },
+{ label: 'Spark SQL', value: 'sparksql', }
]}
>
@@ -897,8 +898,44 @@ spark.sql("select `_hoodie_commit_time`, fare, begin_lon, begin_lat, ts from hu
</TabItem>
-</Tabs
->
+<TabItem value="sparksql">
+
+```sql
+-- syntax
+hudi_table_changes(table or path, queryType, beginTime [, endTime]);
+-- table or path: table identifier, example: db.tableName, tableName,
+-- or path of your table, example: path/to/hudiTable
+-- in this case table does not need to exist in the metastore,
+-- queryType: incremental query mode, example: latest_state, cdc
+-- (for cdc query, first enable cdc for your table by setting cdc.enabled=true),
+-- beginTime: instantTime to begin query from, example: earliest, 202305150000,
+-- endTime: optional instantTime to end query at, example: 202305160000,
+
+-- incrementally query data by table name
+-- start from earliest available commit, end at latest available commit.
+select * from hudi_table_changes('db.table', 'latest_state', 'earliest');
+
+-- start from earliest, end at 202305160000.
+select * from hudi_table_changes('table', 'latest_state', 'earliest', '202305160000');
+
+-- start from 202305150000, end at 202305160000.
+select * from hudi_table_changes('table', 'latest_state', '202305150000', '202305160000');
+
+-- incrementally query data by path
+-- start from earliest available commit, end at latest available commit.
+select * from hudi_table_changes('path/to/table', 'cdc', 'earliest');
+
+-- start from earliest, end at 202305160000.
+select * from hudi_table_changes('path/to/table', 'cdc', 'earliest', '202305160000');
+
+-- start from 202305150000, end at 202305160000.
+select * from hudi_table_changes('path/to/table', 'cdc', '202305150000', '202305160000');
+
+```
+
+</TabItem>
+
+</Tabs>
:::info
This will give all changes that happened after the beginTime commit with the
filter of fare > 20.0. The unique thing about this