vinothchandar commented on a change in pull request #2311:
URL: https://github.com/apache/hudi/pull/2311#discussion_r540771309
##########
File path: hudi-spark/src/main/java/org/apache/hudi/DataSourceUtils.java
##########
@@ -142,6 +142,20 @@ public static HoodieRecordPayload createPayload(String
payloadClass, GenericReco
}
}
+ /**
+ * Create a payload class via reflection, passing in an ordering/precombine
value.
+ */
+ public static HoodieRecordPayload createPayload(String payloadClass,
GenericRecord record, Comparable orderingVal,
Review comment:
@nsivabalan won't this break for an existing payloadClass (user
defined), that does have this three member constructor?
##########
File path: hudi-spark/src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala
##########
@@ -154,9 +154,16 @@ private[hudi] object HoodieSparkSqlWriter {
val hoodieRecord = if (shouldCombine) {
val orderingVal = HoodieAvroUtils.getNestedFieldVal(gr,
parameters(PRECOMBINE_FIELD_OPT_KEY), false)
.asInstanceOf[Comparable[_]]
- DataSourceUtils.createHoodieRecord(gr,
- orderingVal, keyGenerator.getKey(gr),
- parameters(PAYLOAD_CLASS_OPT_KEY))
+ val honorOrderingWhileMerging =
parameters(HONOR_ORDERING_WHILE_MERGING_OPT_KEY).toBoolean
Review comment:
this is more and more overhead for the user to learn one more parameter
when using one specific payload
##########
File path: hudi-spark/src/main/scala/org/apache/hudi/DataSourceOptions.scala
##########
@@ -205,6 +205,12 @@ object DataSourceWriteOptions {
val PRECOMBINE_FIELD_OPT_KEY = "hoodie.datasource.write.precombine.field"
val DEFAULT_PRECOMBINE_FIELD_OPT_VAL = "ts"
+ /**
+ * Field when set to true, will honor ordering while merging two records. If
not, incoming record will overwrite
+ * record in storage.
+ */
+ val HONOR_ORDERING_WHILE_MERGING_OPT_KEY =
"hoodie.datasource.honor.ordering.while.merging"
+ val DEFAULT_HONOR_ORDERING_WHILE_MERGING_OPT_VAL = "false"
Review comment:
I really don't like having these payload specific configs, top level in
datasource options.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]