kazdy commented on PR #7514: URL: https://github.com/apache/hudi/pull/7514#issuecomment-1483935259
> for mOR its mandatory, and for COW, its mandatory if users are using "upserts". for immutable workloads in COW, it may not be required.

@nsivabalan @jonvex That's not exactly true. CoW allows mutable workloads with a MERGE INTO statement even when no precombine field is set:

https://github.com/apache/hudi/blob/1a526eea748d93f28f8cd4a786d5357d218c392c/hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/spark/sql/hudi/command/MergeIntoHoodieTableCommand.scala#L349-L354

Here the INSERT operation is used to support WHEN MATCHED UPDATE ... records. This is inconsistent with how SQL UPDATE works, but for some reason this is how it is.

Another example: we can also update records when no precombine field is specified if we use the Spark datasource insert in upsert mode. It will update existing records. So one can argue that it is no longer an immutable workload.
