aokolnychyi commented on a change in pull request #786: replace SparkDataFile with DataFile
URL: https://github.com/apache/incubator-iceberg/pull/786#discussion_r379614197
 
 

 ##########
 File path: spark/src/main/scala/org/apache/iceberg/spark/SparkTableUtil.scala
 ##########
 @@ -527,9 +425,8 @@ object SparkTableUtil {
     val metricsConfig = MetricsConfig.fromProperties(targetTable.properties)
 
     val manifests = partitionDS
-      .flatMap(partition => listPartition(partition, serializableConf, metricsConfig))
+      .flatMap(partition => listPartition(partition, spec, serializableConf, metricsConfig))
       .repartition(numShufflePartitions)
-      .orderBy($"path")
 
 Review comment:
   Why do we want to remove the sort? I think we need it to keep files for the
   same partition next to each other in the manifests, so that partition
   skipping is quick.
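
   To illustrate the concern, here is a minimal sketch in plain Scala (no Spark or Iceberg dependencies). It assumes each manifest is summarized by the range of partition values it covers, and that a scan can skip any manifest whose range excludes the requested partition. `FileEntry`, `ManifestSummary`, `writeManifests`, and `manifestsToScan` are hypothetical names for this example only and are not part of `SparkTableUtil` or the Iceberg API. The fixed-size chunking stands in for however the real job splits entries into manifests.

```scala
// Hypothetical model, not Iceberg's API: a data file tagged with its partition value.
final case class FileEntry(path: String, partition: Int)

// Hypothetical manifest summary: the range of partition values the manifest covers.
final case class ManifestSummary(minPartition: Int, maxPartition: Int) {
  // A manifest can be skipped only if the partition falls outside its range.
  def mightContain(p: Int): Boolean = minPartition <= p && p <= maxPartition
}

object SortSketch {
  // Pack entries into manifests of a fixed size, in the order they arrive.
  def writeManifests(entries: Seq[FileEntry], filesPerManifest: Int): Seq[ManifestSummary] =
    entries.grouped(filesPerManifest).map { chunk =>
      val parts = chunk.map(_.partition)
      ManifestSummary(parts.min, parts.max)
    }.toSeq

  // Count the manifests that must be opened to plan a scan of one partition.
  def manifestsToScan(manifests: Seq[ManifestSummary], partition: Int): Int =
    manifests.count(_.mightContain(partition))

  // 4 partitions x 4 files, interleaved the way rows may arrive after a shuffle.
  val interleaved: Seq[FileEntry] =
    for (i <- 0 until 4; p <- 0 until 4) yield FileEntry(s"/data/p=$p/file-$i.parquet", p)

  // Sorting by path groups files of the same partition together.
  val sorted: Seq[FileEntry] = interleaved.sortBy(_.path)

  def main(args: Array[String]): Unit = {
    val unsortedCount = manifestsToScan(writeManifests(interleaved, 4), partition = 2)
    val sortedCount = manifestsToScan(writeManifests(sorted, 4), partition = 2)
    println(s"manifests to scan without sort: $unsortedCount") // 4
    println(s"manifests to scan with sort:    $sortedCount")   // 1
  }
}
```

   In this toy setup every unsorted manifest spans all four partitions, so none can be skipped, while sorting by path leaves one partition per manifest. Whether the real import path behaves the same way depends on how `repartition` distributes the entries and how the manifests are written afterwards.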
