jonvex commented on code in PR #11943:
URL: https://github.com/apache/hudi/pull/11943#discussion_r1811172853


##########
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieWriterUtils.scala:
##########
@@ -167,9 +166,21 @@ object HoodieWriterUtils {
     if (!isOverWriteMode) {
       val resolver = spark.sessionState.conf.resolver
       val diffConfigs = StringBuilder.newBuilder
+      val payloadIsExpressionPayload = params.getOrElse(PAYLOAD_CLASS_NAME.key(), "").equals("org.apache.spark.sql.hudi.command.payload.ExpressionPayload")
       params.foreach { case (key, value) =>
+        var ignoreConfig = false
         // Base file format can change between writes, so ignore it.
-        if (!HoodieTableConfig.BASE_FILE_FORMAT.key.equals(key)) {
+        ignoreConfig = ignoreConfig || HoodieTableConfig.BASE_FILE_FORMAT.key.equals(key)
+
+        //expression payload will never be the table config so skip validation of merge configs
+        ignoreConfig = ignoreConfig || (payloadIsExpressionPayload && (key.equals(PAYLOAD_CLASS_NAME.key())
+          || key.equals(HoodieTableConfig.PAYLOAD_CLASS_NAME.key()) || key.equals(RECORD_MERGE_MODE.key())
+          || key.equals(RECORD_MERGER_STRATEGY_ID.key())))
+
+        //don't validate the payload only in the case that insert into is using fallback to some legacy configs
+        ignoreConfig = ignoreConfig || (key.equals(PAYLOAD_CLASS_NAME.key()) && value.equals("org.apache.spark.sql.hudi.command.ValidateDuplicateKeyPayload"))

Review Comment:
   We need to skip validation for all of those keys. Let's say you are just
   using the defaults. The table will have:
   
   payload = default
   merger strategy = default id
   merge mode = event time
   
   Then, when we do MIT (MERGE INTO), the input configs will be:
   
   payload = expression payload
   merger strategy = payload based strategy
   merge mode = custom
   
   All three differ from what the table has, so we don't want to validate any
   of them.
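
To make the scenario above concrete, here is a minimal, self-contained sketch of the skip logic. It is not the actual `HoodieWriterUtils` code: the key strings, the `conflictingKeys` helper, and the placeholder values (`default`, `default id`, `event time`, and so on) are hypothetical stand-ins chosen for illustration. Only the `ExpressionPayload` class name is taken from the diff.

```scala
// Simplified model of the validation skip. All key strings below are
// hypothetical placeholders, not the real Hudi config keys.
object IgnoreMergeConfigSketch {
  val PayloadClassKey = "payload.class"      // assumed stand-in for PAYLOAD_CLASS_NAME.key()
  val MergeModeKey = "merge.mode"            // assumed stand-in for RECORD_MERGE_MODE.key()
  val MergerStrategyKey = "merger.strategy"  // assumed stand-in for RECORD_MERGER_STRATEGY_ID.key()
  val ExpressionPayload = "org.apache.spark.sql.hudi.command.payload.ExpressionPayload"

  private val mergeKeys = Set(PayloadClassKey, MergeModeKey, MergerStrategyKey)

  /** Returns the keys whose incoming value differs from the table's value,
    * skipping all merge-related keys when the write uses ExpressionPayload. */
  def conflictingKeys(tableConfig: Map[String, String],
                      params: Map[String, String]): Set[String] = {
    val payloadIsExpressionPayload =
      params.getOrElse(PayloadClassKey, "") == ExpressionPayload
    params.filter { case (key, value) =>
      val ignoreConfig = payloadIsExpressionPayload && mergeKeys.contains(key)
      // A key conflicts only if it is not ignored and the table holds a different value.
      !ignoreConfig && tableConfig.get(key).exists(_ != value)
    }.keySet
  }

  def main(args: Array[String]): Unit = {
    // Table created with defaults (placeholder values from the comment above).
    val table = Map(
      PayloadClassKey -> "default",
      MergerStrategyKey -> "default id",
      MergeModeKey -> "event time")
    // Configs supplied by a MERGE INTO write.
    val mit = Map(
      PayloadClassKey -> ExpressionPayload,
      MergerStrategyKey -> "payload based strategy",
      MergeModeKey -> "custom")

    // All three keys differ, but none is reported because they are all skipped.
    assert(conflictingKeys(table, mit).isEmpty)
    // A non-expression payload that differs from the table is still reported.
    assert(conflictingKeys(table, Map(PayloadClassKey -> "other")) == Set(PayloadClassKey))
  }
}
```

If only the payload key were skipped, the merger strategy and merge mode would still be flagged as conflicts in the MERGE INTO case, which is why the sketch treats the three keys as one group.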


