[GitHub] [incubator-iceberg] chenjunjiedada commented on a change in pull request #786: replace SparkDataFile with DataFile

GitBox Sun, 16 Feb 2020 18:22:08 -0800

chenjunjiedada commented on a change in pull request #786: replace SparkDataFile with DataFile
URL: https://github.com/apache/incubator-iceberg/pull/786#discussion_r379961821


 ##########
 File path: spark/src/main/scala/org/apache/iceberg/spark/SparkTableUtil.scala
 ##########
 @@ -527,9 +425,8 @@ object SparkTableUtil {
     val metricsConfig = MetricsConfig.fromProperties(targetTable.properties)
 
     val manifests = partitionDS
-      .flatMap(partition => listPartition(partition, serializableConf, metricsConfig))
+      .flatMap(partition => listPartition(partition, spec, serializableConf, metricsConfig))
       .repartition(numShufflePartitions)
-      .orderBy($"path")
 
 Review comment:
   BTW, even if we coalesce the files of the same partition into one manifest,
I think we still have to iterate through all of the manifest entries in that
manifest for partition skipping. Please correct me if I am wrong.
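
To make the point above concrete, here is a minimal sketch of two-level partition skipping. It is not Iceberg's API: `Entry`, `Manifest`, `PartitionSkipping.plan`, and the single integer partition value are all hypothetical simplifications, and the sketch assumes each manifest carries only a lower and upper bound of its partition values as a summary.

```scala
// Hypothetical, simplified model; these are NOT Iceberg's real classes.
final case class Entry(path: String, partition: Int)

final case class Manifest(entries: Seq[Entry]) {
  require(entries.nonEmpty, "this sketch assumes non-empty manifests")
  // Manifest-level summary: smallest and largest partition value in the manifest.
  val lower: Int = entries.map(_.partition).min
  val upper: Int = entries.map(_.partition).max
}

object PartitionSkipping {
  // Returns the matching file paths and the number of entries that were inspected.
  def plan(manifests: Seq[Manifest], wanted: Int): (Seq[String], Int) = {
    // Step 1: skip whole manifests whose summary range cannot contain the value.
    val candidates = manifests.filter(m => m.lower <= wanted && wanted <= m.upper)
    // Step 2: every entry of each surviving manifest is still checked one by one.
    val inspected = candidates.map(_.entries.size).sum
    val paths = candidates.flatMap(_.entries).filter(_.partition == wanted).map(_.path)
    (paths, inspected)
  }
}
```

Under these assumptions, with files `a` and `b` in partition 1 and `c` and `d` in partition 2, a clustered layout (one manifest per partition) lets `plan(..., 1)` skip the second manifest and inspect 2 entries, while a mixed layout (each manifest holding one file from each partition) forces it to inspect all 4. In both layouts, every entry of a manifest that cannot be skipped is still visited.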

