kumarUjjawal commented on code in PR #20043:
URL: https://github.com/apache/datafusion/pull/20043#discussion_r2742478579
##########
datafusion/datasource/src/file_scan_config.rs:
##########
@@ -1157,19 +1157,30 @@ impl FileScanConfig {
&self,
new_file_source: Arc<dyn FileSource>,
is_exact: bool,
+ order: &[PhysicalSortExpr],
) -> Result<Arc<dyn DataSource>> {
let mut new_config = self.clone();
- // Reverse file groups (FileScanConfig's responsibility)
- new_config.file_groups = new_config
- .file_groups
- .into_iter()
- .map(|group| {
- let mut files = group.into_inner();
- files.reverse();
- files.into()
- })
- .collect();
+ // Reverse file groups (FileScanConfig's responsibility) if doing so helps satisfy the
+ // requested ordering.
+ let reverse_file_groups =
+ LexOrdering::new(order.iter().cloned()).is_some_and(|requested| {
+ self.output_ordering
+ .iter()
+ .any(|ordering| ordering.is_reverse(&requested))
+ });
Review Comment:
I expanded this check because order is expressed in the scan's output
(post-projection) schema, while self.output_ordering is stored pre-projection.
LexOrdering::is_reverse compares PhysicalSortExprs, including their Column
indices, so the old code could miss valid reversals when a projection reorders
or drops columns. Projecting output_ordering into the scan output schema first
(project_orderings) lets us compare like-for-like and reverse file groups only
when the stored ordering is truly the reverse of the requested one.
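
To make the index mismatch concrete, here is a minimal, self-contained sketch. It does not use DataFusion types: `SortExpr`, `is_reverse`, and `project_ordering` are simplified, hypothetical stand-ins (null ordering is ignored, and columns are plain indices), so treat it as an illustration of the idea rather than the actual implementation.

```rust
/// Hypothetical stand-in for a physical sort expression:
/// a column index into some schema plus a sort direction.
#[derive(Debug, Clone, PartialEq, Eq)]
struct SortExpr {
    column: usize,
    descending: bool,
}

/// Simplified reverse check: both orderings are non-empty, have the same
/// length, and each pair refers to the same column index with opposite
/// direction. Because it compares indices, both sides must use the same schema.
fn is_reverse(stored: &[SortExpr], requested: &[SortExpr]) -> bool {
    !stored.is_empty()
        && stored.len() == requested.len()
        && stored
            .iter()
            .zip(requested)
            .all(|(s, r)| s.column == r.column && s.descending != r.descending)
}

/// Rewrites a pre-projection ordering into the output schema.
/// `projection[i]` is the source column index that output column `i` reads.
/// If a sort column is dropped by the projection, the ordering is truncated
/// there, since only the leading prefix is still meaningful.
fn project_ordering(ordering: &[SortExpr], projection: &[usize]) -> Vec<SortExpr> {
    let mut projected = Vec::new();
    for expr in ordering {
        match projection.iter().position(|&src| src == expr.column) {
            Some(out_idx) => projected.push(SortExpr {
                column: out_idx,
                descending: expr.descending,
            }),
            None => break,
        }
    }
    projected
}
```

For example, assume a file schema `[a, b, c]` sorted by `c ASC` (index 2) and a projection `[2, 0]`, so the output schema is `[c, a]`. A request for `c DESC` arrives as index 0. Comparing the stored ordering directly sees index 2 versus index 0 and reports no reversal, while comparing after `project_ordering` sees index 0 on both sides and detects it.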
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]