NGA-TRAN commented on code in PR #9593:
URL: https://github.com/apache/arrow-datafusion/pull/9593#discussion_r1534607051


##########
datafusion/core/src/datasource/physical_plan/file_scan_config.rs:
##########
@@ -194,6 +203,71 @@ impl FileScanConfig {
             .with_repartition_file_min_size(repartition_file_min_size)
             .repartition_file_groups(&file_groups)
     }
+
+    /// Attempts to do a bin-packing on files into file groups, such that any two files
+    /// in a file group are ordered and non-overlapping with respect to their statistics.
+    /// It will produce the smallest number of file groups possible.

Review Comment:
   Thanks. I think another possible next step is to move the implementation of 
`ProgressiveEvalExec` from InfluxDB IOx here. Someone could then use it, together 
with the results of this PR, to implement 
https://github.com/apache/arrow-datafusion/issues/6672. 
   What do you think, @alamb?
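
   For illustration, the bin-packing described in the doc comment above could be sketched roughly as below. This is not the PR's implementation: `FileRange` and `pack_non_overlapping` are hypothetical names, and each file's statistics are assumed to reduce to a single integer min/max on the sort column. In this simplified model, sorting by minimum and placing each file in the first group it does not overlap gives the smallest possible number of groups.

```rust
/// Hypothetical stand-in for one file's min/max statistics on the sort column.
#[derive(Debug, Clone, PartialEq)]
struct FileRange {
    name: &'static str,
    min: i64,
    max: i64,
}

/// Greedy first-fit packing (an assumption for illustration, not DataFusion's API).
/// Files are visited in order of their minimum value; each file joins the first
/// group whose last file ends strictly before this file begins, otherwise it
/// starts a new group.
fn pack_non_overlapping(mut files: Vec<FileRange>) -> Vec<Vec<FileRange>> {
    files.sort_by_key(|f| f.min);

    let mut groups: Vec<Vec<FileRange>> = Vec::new();
    for file in files {
        // Index of the first group this file can be appended to without overlap.
        let slot = groups
            .iter()
            .position(|g| g.last().map_or(true, |last| last.max < file.min));

        match slot {
            Some(i) => groups[i].push(file),
            None => groups.push(vec![file]),
        }
    }
    groups
}
```

   Within each returned group the files are ordered and pairwise non-overlapping, which is the property an operator such as `ProgressiveEvalExec` would presumably rely on.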


