[GitHub] [hudi] codope commented on a diff in pull request #9261: [HUDI-6579] Adding support for upsert and deletes with spark datasource for pk less table

via GitHub Tue, 01 Aug 2023 00:42:27 -0700


codope commented on code in PR #9261:
URL: https://github.com/apache/hudi/pull/9261#discussion_r1280219845



##########
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/DefaultSource.scala:
##########
@@ -164,6 +164,18 @@ class DefaultSource extends RelationProvider
     new HoodieEmptyRelation(sqlContext, df.schema)
   }
 
+  /**
+   * For primary key less tables, incoming df could contain meta fields and we can't remove them. for other
+   * flows we can remove the meta fields.
+   * @param optParams
+   * @return
+   */
+  private def canRemoveMetaFields(optParams: Map[String, String]) : Boolean = {
+    !(optParams.getOrDefault(SPARK_SQL_WRITES_PREPPED_KEY, "false").toBoolean
+    || optParams.getOrDefault(SPARK_SQL_MERGE_INTO_PREPPED_KEY, "false").toBoolean
+    || !optParams.containsKey(RECORDKEY_FIELD.key()))

Review Comment:
   My point was that for pk-less tables this condition would be true, so
   `canRemoveMetaFields` can return true, can't it? However, we need the meta
   fields for pk-less tables.
   ```
   !optParams.containsKey(RECORDKEY_FIELD.key())
   ```
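
   One way to check how the negated condition evaluates is a small standalone
   sketch of the predicate. This is only an approximation of the method in the
   diff: the three key strings are placeholders (not Hudi's actual option
   names), and it uses Scala's `getOrElse`/`contains` on an immutable `Map` in
   place of the `getOrDefault`/`containsKey` calls shown above.

   ```scala
   // Standalone sketch; MetaFieldsSketch and the key strings below are
   // hypothetical stand-ins, not Hudi's real constants or values.
   object MetaFieldsSketch {
     val WritesPreppedKey = "sketch.writes.prepped"
     val MergeIntoPreppedKey = "sketch.merge.into.prepped"
     val RecordKeyField = "sketch.recordkey.field"

     // Same boolean shape as the method in the diff:
     // !(writesPrepped || mergeIntoPrepped || recordKeyMissing)
     def canRemoveMetaFields(optParams: Map[String, String]): Boolean =
       !(optParams.getOrElse(WritesPreppedKey, "false").toBoolean
         || optParams.getOrElse(MergeIntoPreppedKey, "false").toBoolean
         || !optParams.contains(RecordKeyField))

     def main(args: Array[String]): Unit = {
       // pk-less: no record key field configured
       println(canRemoveMetaFields(Map.empty))
       // record key configured, no prepped flags
       println(canRemoveMetaFields(Map(RecordKeyField -> "id")))
       // record key configured, but writes are prepped
       println(canRemoveMetaFields(Map(RecordKeyField -> "id", WritesPreppedKey -> "true")))
     }
   }
   ```

   Under these assumptions, a map without the record key field makes the last
   disjunct true, so the outer negation yields `false` for the pk-less case.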



