[
https://issues.apache.org/jira/browse/HUDI-8824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17910413#comment-17910413
]
Davis Zhang commented on HUDI-8824:
-----------------------------------
[~yihua] should we enforce the same thing for the primary key? Back in 0.14 we had
to set the primary key in the update/insert clause, otherwise MERGE INTO (MIT) did
not work.
What about the partition key? As of today, if the partition key is not set, the rows
from the source are partitioned according to the target table's partition key. Then,
for records that fall into the same partition between source and target, MERGE INTO
is executed.
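To make the kind of enforcement discussed here concrete, below is a minimal sketch (hypothetical helper, not Hudi's actual implementation) of checking that a key column is carried through unchanged from the source in an update/insert clause, which is the condition the validation would look for:

```python
# Hypothetical sketch (not Hudi code): given the assignments of a MERGE INTO
# update/insert clause as a mapping of target column -> source expression,
# check that the key column is assigned directly from the source, i.e. the
# clause contains the pattern target.key = source.key with no expression
# wrapped around either side.
def has_direct_key_assignment(assignments, key_column):
    """Return True only if key_column is set verbatim from source.key_column."""
    expected_expr = f"source.{key_column}"
    return assignments.get(key_column) == expected_expr

# An update clause that passes the key through unchanged is accepted;
# one that rewrites it with an expression (or omits it) would error out.
ok = has_direct_key_assignment({"ts": "source.ts", "val": "source.val"}, "ts")
rewritten = has_direct_key_assignment({"ts": "source.ts + 1"}, "ts")
missing = has_direct_key_assignment({"val": "source.val"}, "ts")
```

In a real implementation this comparison would operate on resolved Spark expression trees rather than strings, but the accept/reject logic is the same.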
> Merge into if no precombine key involved in update/insert should error out
> --------------------------------------------------------------------------
>
> Key: HUDI-8824
> URL: https://issues.apache.org/jira/browse/HUDI-8824
> Project: Apache Hudi
> Issue Type: Sub-task
> Components: spark-sql
> Reporter: Davis Zhang
> Assignee: Davis Zhang
> Priority: Blocker
> Fix For: 1.0.1
>
> Original Estimate: 10h
> Remaining Estimate: 10h
>
> If the precombine key is required, i.e., for EVENT_TIME_ORDERING, we should throw
> an error to the user in the MERGE INTO statement in Spark SQL, especially for
> partial updates.
>
> ```
> merge into a using b
> on a.c1 = b.c1
> when matched then update ... / delete
> when not matched then insert ...
> ```
>
>
> ```
> a.precombineKey + 1 = b.precombineKey
> a.precombineKey = b.precombineKey + 1
> a.precombineKey = b.precombineKey  <=== if this exact pattern is not found in the
> update/insert clause, then error out.
> ```
>
> 1 day to code complete, 0.2 day for review.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)