kazdy commented on code in PR #7998:
URL: https://github.com/apache/hudi/pull/7998#discussion_r1113530490
##########
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala:
##########
@@ -1063,7 +1063,9 @@ object HoodieSparkSqlWriter {
val recordType = config.getRecordMerger.getRecordType
val shouldCombine = parameters(INSERT_DROP_DUPS.key()).toBoolean ||
- operation.equals(WriteOperationType.UPSERT) ||
+ (operation.equals(WriteOperationType.UPSERT) &&
+ parameters.getOrElse(HoodieWriteConfig.COMBINE_BEFORE_UPSERT.key(),
+ HoodieWriteConfig.COMBINE_BEFORE_UPSERT.defaultValue()).toBoolean) ||
Review Comment:
This also makes Spark Hudi compatible with Flink Hudi: when
COMBINE_BEFORE_UPSERT=false, users can upsert even when no preCombine field is
defined. Flink has COMBINE_BEFORE_UPSERT set to false by default, which is why
this worked in Flink but not in Spark.
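The effect of the patched condition can be sketched in isolation. The snippet below is an illustrative stand-alone model of the `shouldCombine` expression, not Hudi's actual code: the config keys, default value, and the `shouldCombine` helper are hypothetical stand-ins for `INSERT_DROP_DUPS` and `HoodieWriteConfig.COMBINE_BEFORE_UPSERT`.

```scala
object CombineBeforeUpsertSketch {
  // Hypothetical stand-ins for Hudi's config keys and defaults (illustration only).
  val InsertDropDupsKey = "hoodie.datasource.write.insert.drop.duplicates"
  val CombineBeforeUpsertKey = "hoodie.combine.before.upsert"
  val CombineBeforeUpsertDefault = "true"

  // Mirrors the patched logic: UPSERT only forces combining when
  // COMBINE_BEFORE_UPSERT resolves to true (defaulting to true in Spark).
  def shouldCombine(parameters: Map[String, String], operation: String): Boolean =
    parameters.getOrElse(InsertDropDupsKey, "false").toBoolean ||
      (operation == "UPSERT" &&
        parameters.getOrElse(CombineBeforeUpsertKey, CombineBeforeUpsertDefault).toBoolean)

  def main(args: Array[String]): Unit = {
    // User explicitly disables combining: no preCombine field is needed.
    println(shouldCombine(Map(CombineBeforeUpsertKey -> "false"), "UPSERT")) // false
    // Spark's default (true) keeps the pre-patch behavior for UPSERT.
    println(shouldCombine(Map.empty, "UPSERT")) // true
  }
}
```

Under this sketch, setting the combine option to false reproduces Flink's default behavior, while leaving it unset preserves Spark's existing requirement to combine on upsert.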
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]