yihua commented on code in PR #6028:
URL: https://github.com/apache/hudi/pull/6028#discussion_r915125780


##########
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala:
##########
@@ -523,17 +523,14 @@ object HoodieSparkSqlWriter {
     val params: mutable.Map[String, String] = collection.mutable.Map(parameters.toSeq: _*)
     params(HoodieWriteConfig.AVRO_SCHEMA_STRING.key) = schema.toString
     val writeConfig = DataSourceUtils.createHoodieConfig(schema.toString, path, tblName, mapAsJavaMap(params))
-    val bulkInsertPartitionerRows: BulkInsertPartitioner[Dataset[Row]] = if (populateMetaFields) {
+    val bulkInsertPartitionerRows: BulkInsertPartitioner[Dataset[Row]] = {

Review Comment:
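   We have to keep the if-else branch here, for the reason given in the comment on HoodieDatasetBulkInsertHelper.java: when the meta fields are not populated, a partitioner that sorts on them fails.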



##########
hudi-spark-datasource/hudi-spark-common/src/main/java/org/apache/hudi/HoodieDatasetBulkInsertHelper.java:
##########
@@ -182,8 +184,8 @@ public static Dataset<Row> prepareHoodieDatasetForBulkInsertWithoutMetaFields(Da
     allCols.addAll(metaFields);
     allCols.addAll(originalFields);
 
-    return rowsWithMetaCols.select(
-        JavaConverters.collectionAsScalaIterableConverter(allCols).asScala().toSeq());
+    return bulkInsertPartitionerRows.repartitionRecords(rowsWithMetaCols.select(

Review Comment:
   Some of the partitioners rely on the meta fields for sorting, so if the meta fields are not present, the repartitioning or sorting fails.
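
   To illustrate the concern, here is a minimal sketch in plain Java with no Spark or Hudi dependency. It assumes rows are modeled as `Map<String, String>`, and `RowPartitioner`, `SORT_BY_META_FIELDS`, `NON_SORT`, and `choosePartitioner` are hypothetical names, not the actual Hudi classes. Only the two meta column names are taken from Hudi. It shows a sorting partitioner failing on rows without meta fields, and an if-else guard avoiding that.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

public class MetaFieldPartitionerSketch {

  // Hudi meta column names; everything else in this sketch is hypothetical.
  static final String PARTITION_PATH_META_FIELD = "_hoodie_partition_path";
  static final String RECORD_KEY_META_FIELD = "_hoodie_record_key";

  /** Hypothetical stand-in for BulkInsertPartitioner<Dataset<Row>>. */
  interface RowPartitioner {
    List<Map<String, String>> repartitionRecords(List<Map<String, String>> rows);
  }

  /** Sorts by the meta fields, so it throws if any row lacks them. */
  static final RowPartitioner SORT_BY_META_FIELDS = rows -> {
    for (Map<String, String> row : rows) {
      if (!row.containsKey(PARTITION_PATH_META_FIELD)
          || !row.containsKey(RECORD_KEY_META_FIELD)) {
        throw new IllegalStateException("cannot sort: meta fields are missing from " + row);
      }
    }
    List<Map<String, String>> sorted = new ArrayList<>(rows);
    sorted.sort(Comparator
        .comparing((Map<String, String> r) -> r.get(PARTITION_PATH_META_FIELD))
        .thenComparing(r -> r.get(RECORD_KEY_META_FIELD)));
    return sorted;
  };

  /** Returns the rows unchanged, so it is safe when meta fields are absent. */
  static final RowPartitioner NON_SORT = rows -> rows;

  /** The if-else guard: pick the sorting partitioner only when meta fields are populated. */
  static RowPartitioner choosePartitioner(boolean populateMetaFields) {
    if (populateMetaFields) {
      return SORT_BY_META_FIELDS;
    } else {
      return NON_SORT;
    }
  }
}
```

   With the guard, `choosePartitioner(false)` never hands rows without meta columns to a partitioner that sorts on them. Dropping the guard would send every write through the sorting path, including the case this sketch shows throwing.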


