VindhyaG commented on code in PR #53040:
URL: https://github.com/apache/spark/pull/53040#discussion_r2554564269
##########
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FilePartition.scala:
##########
@@ -75,7 +75,7 @@ object FilePartition extends SessionStateHelper with Logging {
}
// Assign files to partitions using "Next Fit Decreasing"
- partitionedFiles.foreach { file =>
+ partitionedFiles.sortBy(_.length)(implicitly[Ordering[Long]].reverse).foreach { file =>
Review Comment:
The PR description says the partitioned files are already sorted, so why
do we need to sort the Seq[PartitionedFile] again here? As far as I
understand,
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala#L888
already does that, right?
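
To make the question concrete, here is a minimal standalone sketch of a "Next Fit" packing loop. It is not the actual FilePartition code: the `NextFitSketch` object, the `pack` helper, and its `maxSplitBytes` / `openCostInBytes` parameters are simplified assumptions for illustration, and files are represented only by their lengths. It shows that the result depends on input order, and that sorting an input that is already in descending order changes nothing.

```scala
// Standalone sketch of a "Next Fit" packing loop; NOT Spark's actual code.
// `NextFitSketch`, `pack`, `maxSplitBytes`, and `openCostInBytes` are
// illustrative assumptions; files are modeled by their lengths only.
import scala.collection.mutable.ArrayBuffer

object NextFitSketch {
  def pack(
      lengths: Seq[Long],
      maxSplitBytes: Long,
      openCostInBytes: Long): Seq[Seq[Long]] = {
    val partitions = ArrayBuffer.empty[Seq[Long]]
    val current = ArrayBuffer.empty[Long]
    var currentSize = 0L

    // Emit the current bin (if non-empty) and start a fresh one.
    def closePartition(): Unit = {
      if (current.nonEmpty) partitions += current.toList
      current.clear()
      currentSize = 0L
    }

    lengths.foreach { len =>
      // Next Fit: only the current bin is considered. If the file does not
      // fit, the bin is closed and never revisited.
      if (currentSize + len > maxSplitBytes) closePartition()
      currentSize += len + openCostInBytes
      current += len
    }
    closePartition()
    partitions.toList
  }
}
```

For example, with `maxSplitBytes = 100` and `openCostInBytes = 0`, `NextFitSketch.pack(Seq(10L, 60L, 30L, 50L, 40L), 100L, 0L)` yields `List(List(10, 60, 30), List(50, 40))`, while the same lengths in descending order yield `List(List(60), List(50, 40), List(30, 10))`. Sorting the descending input again with `Ordering[Long].reverse` gives the same result, so the extra `sortBy` only matters if the caller does not already guarantee that order.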
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]