yihua commented on code in PR #12883:
URL: https://github.com/apache/hudi/pull/12883#discussion_r1970501258

##########
website/docs/sql_dml.md:
##########
@@ -133,6 +133,14 @@ ON <merge_condition>
 There are two kinds of `INSERT` clauses:
 1. `INSERT *` clauses require that the source table has the same columns as those in the target table.
 2. `INSERT (column1 [, column2 ...]) VALUES (value1 [, value2 ...])` clauses do not require to specify all the columns of the target table. For unspecified target columns, insert the `NULL` value.
+
+For a Hudi table with user configured primary keys, the join condition and the `UPDATE`/`INSERT INTO` clause in `MERGE INTO` is expected to contain the primary keys of the table.
+
+For a Table where Hudi auto generates primary keys, the join condition in `MERGE INTO` can be on any arbitrary data columns.

Review Comment:
   ```suggestion
   For a table where Hudi auto generates primary keys, the join condition in `MERGE INTO` can be on any arbitrary data columns.
   ```

##########
website/docs/sql_dml.md:
##########
@@ -133,6 +133,14 @@ ON <merge_condition>
 There are two kinds of `INSERT` clauses:
 1. `INSERT *` clauses require that the source table has the same columns as those in the target table.
 2. `INSERT (column1 [, column2 ...]) VALUES (value1 [, value2 ...])` clauses do not require to specify all the columns of the target table. For unspecified target columns, insert the `NULL` value.
+
+For a Hudi table with user configured primary keys, the join condition and the `UPDATE`/`INSERT INTO` clause in `MERGE INTO` is expected to contain the primary keys of the table.
+
+For a Table where Hudi auto generates primary keys, the join condition in `MERGE INTO` can be on any arbitrary data columns.
+
+if the `hoodie.record.merge.mode` is set to `EVENT_TIME_ORDERING`, the `preCombineField` is required to be set with value in the `UPDATE`/`INSERT` clause.
+
+It is enforced that if the target table has primary key and partition key column, the source table counterparts must enforce the same data type accordingly.
Plus, if the target table is configured with `hoodie.record.merge.mode` = `EVENT_TIME_ORDERING` where target table is expected to have a valid precombine field configurations, the source table counterpart must also have the same data type.

Review Comment:
   ```suggestion
   It is enforced that if the target table has primary key and partition key column, the source table counterparts must enforce the same data type accordingly. Plus, if the target table is configured with `hoodie.record.merge.mode` = `EVENT_TIME_ORDERING` where target table is expected to have a valid precombine field configuration, the source table counterpart must also have the same data type.
   ```

##########
website/versioned_docs/version-1.0.1/sql_dml.md:
##########
@@ -133,6 +133,13 @@ ON <merge_condition>
 There are two kinds of `INSERT` clauses:
 1. `INSERT *` clauses require that the source table has the same columns as those in the target table.
 2. `INSERT (column1 [, column2 ...]) VALUES (value1 [, value2 ...])` clauses do not require to specify all the columns of the target table. For unspecified target columns, insert the `NULL` value.
+
+For a Hudi table with user configured primary keys, the join condition and the `UPDATE`/`INSERT INTO` clause in `MERGE INTO` is expected to contain the primary keys of the table.
+
+For a Table where Hudi auto generates primary keys, the join condition in `MERGE INTO` can be on any arbitrary data columns.

Review Comment:
   ```suggestion
   For a table where Hudi auto generates primary keys, the join condition in `MERGE INTO` can be on any arbitrary data columns.
   ```

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
