nsivabalan commented on code in PR #9261:
URL: https://github.com/apache/hudi/pull/9261#discussion_r1280014102
##########
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/DefaultSource.scala:
##########
@@ -164,6 +164,18 @@ class DefaultSource extends RelationProvider
new HoodieEmptyRelation(sqlContext, df.schema)
}
+ /**
+  * For primary-key-less tables, the incoming df could contain meta fields and we
+  * can't remove them; for other flows we can remove the meta fields.
+  * @param optParams the write options for this operation
+  * @return true if the meta fields can be removed
+  */
+ private def canRemoveMetaFields(optParams: Map[String, String]) : Boolean = {
+   !(optParams.getOrDefault(SPARK_SQL_WRITES_PREPPED_KEY, "false").toBoolean
+     || optParams.getOrDefault(SPARK_SQL_MERGE_INTO_PREPPED_KEY, "false").toBoolean
+     || !optParams.containsKey(RECORDKEY_FIELD.key()))
Review Comment:
Not sure I get it. We will drop the meta fields down the line while creating the HoodieRecordPayload.
Here we are just trying to gauge whether we can go with prepped writes or regular non-prepped writes: if the incoming df has meta fields and it's a pk-less table, we go with the prepped flow.
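
To make the gating logic above concrete, here is a minimal, self-contained sketch of the decision. The config key strings, the object name `MetaFieldGate`, and the use of a plain Scala `Map` are illustrative assumptions for this sketch, not the actual Hudi constants or API:

```scala
// Hedged sketch of the "can we drop meta fields?" gate described above.
// Key strings below are placeholders standing in for Hudi's
// SPARK_SQL_WRITES_PREPPED_KEY, SPARK_SQL_MERGE_INTO_PREPPED_KEY and
// RECORDKEY_FIELD constants.
object MetaFieldGate {
  val SparkSqlWritesPreppedKey = "hoodie.spark.sql.writes.prepped"       // assumed name
  val SparkSqlMergeIntoPreppedKey = "hoodie.spark.sql.merge.into.prepped" // assumed name
  val RecordKeyField = "hoodie.datasource.write.recordkey.field"          // assumed name

  // Meta fields may be removed only for regular (non-prepped) writes on a
  // table that declares a record key; prepped flows and pk-less tables must
  // keep them so the incoming df's meta fields can drive the write.
  def canRemoveMetaFields(optParams: Map[String, String]): Boolean = {
    val preppedWrite =
      optParams.getOrElse(SparkSqlWritesPreppedKey, "false").toBoolean ||
      optParams.getOrElse(SparkSqlMergeIntoPreppedKey, "false").toBoolean
    val pkLessTable = !optParams.contains(RecordKeyField)
    !(preppedWrite || pkLessTable)
  }
}
```

For example, a write with a record key configured and no prepped flags returns `true` (meta fields can be dropped), while an empty option map (pk-less) returns `false`.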
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]