codope commented on code in PR #9261:
URL: https://github.com/apache/hudi/pull/9261#discussion_r1271396748


##########
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieWriterUtils.scala:
##########
@@ -85,6 +89,24 @@ object HoodieWriterUtils {
     Map() ++ hoodieConfig.getProps.asScala ++ globalProps ++ 
DataSourceOptionsHelper.translateConfigurations(parameters)
   }
 
+  def canDoPrepped(hoodieConfig: HoodieConfig, parameters: Map[String, 
String], operation : WriteOperationType, df: Dataset[Row]): Boolean = {

Review Comment:
   let's add some scaladoc and rename method to `canDoPreppedWrites`?



##########
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala:
##########
@@ -172,6 +172,9 @@ object HoodieSparkSqlWriter {
       operation = WriteOperationType.INSERT
     }
 
+    val sqlMergeIntoPrepped = 
parameters.getOrDefault(SPARK_SQL_MERGE_INTO_PREPPED_KEY, "false").toBoolean
+    val isPrepped = canDoPrepped(hoodieConfig, parameters, operation, df)

Review Comment:
   let's just keep one boolean `isPrepped`? The boolean `sqlMergeIntoPrepped` 
can also be handled in `canDoPrepped`.



##########
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/DefaultSource.scala:
##########
@@ -164,6 +164,18 @@ class DefaultSource extends RelationProvider
     new HoodieEmptyRelation(sqlContext, df.schema)
   }
 
+  /**
+   * For primary key less tables, incoming df could contain meta fields and we 
can't remove them. for other
+   * flows we can remove the meta fields.
+   * @param optParams
+   * @return
+   */
+  private def canRemoveMetaFields(optParams: Map[String, String]) : Boolean = {
+    !(optParams.getOrDefault(SPARK_SQL_WRITES_PREPPED_KEY, "false").toBoolean
+    || optParams.getOrDefault(SPARK_SQL_MERGE_INTO_PREPPED_KEY, 
"false").toBoolean
+    || !optParams.containsKey(RECORDKEY_FIELD.key()))

Review Comment:
   For pk-less tables, RECORDKEY_FIELD may not be present and this condition 
can be true. Do we want to drop the meta fields in such cases? From the comment 
it seems like for pk-less we don;t want to drop the meta fields.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to