[
https://issues.apache.org/jira/browse/HUDI-5324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alexey Kudinkin updated HUDI-5324:
----------------------------------
Description:
h4. *UPDATED*
The aforementioned issue was actually the result of a misconfigured MERGE INTO
(MIT) statement: MIT was using the "insert" operation instead of "upsert".
The real issue, though, is that MIT implicitly makes use of the "upsert"
operation conditional on whether the "preCombine" config is set. Instead, it
should always specify the operation as "upsert", since MIT allows update
semantics to be specified without requiring the "preCombine" field to be
present.
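
To make the difference concrete, here is a minimal sketch of the two
behaviors. It is not Hudi's actual code: the function names are hypothetical,
and modeling the table config as a plain dict is an assumption made purely for
illustration. The only identifier taken from Hudi is the config key
hoodie.datasource.write.precombine.field.

```python
from typing import Mapping

# Hudi write config key for the preCombine field.
PRECOMBINE_KEY = "hoodie.datasource.write.precombine.field"


def operation_before_fix(config: Mapping[str, str]) -> str:
    """Hypothetical model of the reported behavior.

    MERGE INTO picks "upsert" only when a preCombine field is configured,
    and silently falls back to "insert" otherwise.
    """
    return "upsert" if config.get(PRECOMBINE_KEY) else "insert"


def operation_after_fix(config: Mapping[str, str]) -> str:
    """Hypothetical model of the proposed behavior.

    MERGE INTO always uses "upsert", regardless of the preCombine config.
    """
    return "upsert"


# A table with a preCombine field ("ts" is an illustrative column name)
# and a table without one.
with_precombine = {PRECOMBINE_KEY: "ts"}
without_precombine = {}

for cfg in (with_precombine, without_precombine):
    print(operation_before_fix(cfg), "->", operation_after_fix(cfg))
```

In this sketch the two functions differ only for the table without a
preCombine field, which is the case the ticket is about.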
was:
~When setting hoodie.index.type=BLOOM in the hudi-defaults.conf, while the
Spark SQL DELETE statement uses Bloom Index, the MERGE INTO statement does not
seem to use Bloom Index and instead uses Simple Index.~
h4. *UPDATE*
The aforementioned issue was actually the result of a misconfigured MERGE INTO
(MIT) statement: MIT was using the "insert" operation instead of "upsert".
The real issue, though, is that MIT implicitly makes use of the "upsert"
operation conditional on whether the "preCombine" config is set. Instead, it
should always specify the operation as "upsert", since MIT allows update
semantics to be specified without requiring the "preCombine" field to be
present.
> Spark SQL MERGE INTO statement should always do upsert if there's matching
> update clause
> ----------------------------------------------------------------------------------------
>
> Key: HUDI-5324
> URL: https://issues.apache.org/jira/browse/HUDI-5324
> Project: Apache Hudi
> Issue Type: Improvement
> Components: index, spark-sql
> Reporter: Ethan Guo
> Assignee: Alexey Kudinkin
> Priority: Critical
> Fix For: 0.13.0
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)