This is an automated email from the ASF dual-hosted git repository.
yihua pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/hudi.git
The following commit(s) were added to refs/heads/asf-site by this push:
new 28d00eed87c [HUDI-9077][DOCS] Update MERGE INTO docs (#12883)
28d00eed87c is described below
commit 28d00eed87cf03f2f8c5e95f2ab33ee741b79ba2
Author: Davis-Zhang-Onehouse <[email protected]>
AuthorDate: Tue Feb 25 12:55:48 2025 -0800
[HUDI-9077][DOCS] Update MERGE INTO docs (#12883)
Co-authored-by: Y Ethan Guo <[email protected]>
---
website/docs/sql_dml.md | 10 ++++++++++
website/versioned_docs/version-1.0.1/sql_dml.md | 9 +++++++++
2 files changed, 19 insertions(+)
diff --git a/website/docs/sql_dml.md b/website/docs/sql_dml.md
index fd89ca8dfd4..9828e1b3597 100644
--- a/website/docs/sql_dml.md
+++ b/website/docs/sql_dml.md
@@ -133,6 +133,14 @@ ON <merge_condition>
There are two kinds of `INSERT` clauses:
1. `INSERT *` clauses require that the source table has the same columns as
those in the target table.
2. `INSERT (column1 [, column2 ...]) VALUES (value1 [, value2 ...])` clauses
do not require to specify all the columns of the target table. For unspecified
target columns, insert the `NULL` value.
+
+For a Hudi table with user-configured primary keys, the join condition and the `UPDATE`/`INSERT INTO` clause in `MERGE INTO` are expected to contain the primary keys of the table.
+
+For a table where Hudi auto-generates primary keys, the join condition in `MERGE INTO` can be on arbitrary data columns.
+
+If `hoodie.record.merge.mode` is set to `EVENT_TIME_ORDERING`, the `preCombineField` must be assigned a value in the `UPDATE`/`INSERT` clause.
+
+If the target table has primary key and partition key columns, the corresponding source table columns must have the same data types. In addition, if the target table is configured with `hoodie.record.merge.mode` = `EVENT_TIME_ORDERING`, in which case it is expected to have a valid precombine field configuration, the corresponding source table column must also have the same data type as the precombine field.
:::
Examples below
@@ -224,6 +232,8 @@ Partial update is not yet supported in the following cases:
2. When virtual keys is enabled.
3. When schema on read is enabled.
4. When there is an enum field in the source data.
+
+For a Hudi table with user-configured primary keys, the join condition and the `UPDATE`/`INSERT INTO` clause in `MERGE INTO` are expected to contain the primary keys of the table.
:::
### Delete From
diff --git a/website/versioned_docs/version-1.0.1/sql_dml.md b/website/versioned_docs/version-1.0.1/sql_dml.md
index fd89ca8dfd4..97a1bac8a0f 100644
--- a/website/versioned_docs/version-1.0.1/sql_dml.md
+++ b/website/versioned_docs/version-1.0.1/sql_dml.md
@@ -133,6 +133,13 @@ ON <merge_condition>
There are two kinds of `INSERT` clauses:
1. `INSERT *` clauses require that the source table has the same columns as
those in the target table.
2. `INSERT (column1 [, column2 ...]) VALUES (value1 [, value2 ...])` clauses
do not require to specify all the columns of the target table. For unspecified
target columns, insert the `NULL` value.
+
+For a Hudi table with user-configured primary keys, the join condition and the `UPDATE`/`INSERT INTO` clause in `MERGE INTO` are expected to contain the primary keys of the table.
+
+For a table where Hudi auto-generates primary keys, the join condition in `MERGE INTO` can be on arbitrary data columns.
+
+If `hoodie.record.merge.mode` is set to `EVENT_TIME_ORDERING`, the `preCombineField` must be assigned a value in the `UPDATE`/`INSERT` clause.
+
:::
Examples below
@@ -224,6 +231,8 @@ Partial update is not yet supported in the following cases:
2. When virtual keys is enabled.
3. When schema on read is enabled.
4. When there is an enum field in the source data.
+
+For a Hudi table with user-configured primary keys, the join condition and the `UPDATE`/`INSERT INTO` clause in `MERGE INTO` are expected to contain the primary keys of the table.
:::
### Delete From