gene-bordegaray commented on code in PR #22657:
URL: https://github.com/apache/datafusion/pull/22657#discussion_r3401721774
##########
datafusion/catalog-listing/src/table.rs:
##########
@@ -690,12 +715,45 @@ impl ListingTable {
/// Get the list of files for a scan as well as the file level statistics.
/// The list is grouped to let the execution plan know how the files should
/// be distributed to different threads / executors.
+ ///
+ /// If [`ListingOptions::output_partitioning`] is set, the returned file
+ /// groups preserve that declared partition count, including empty trailing
+ /// groups when needed, rather than using
+ /// [`ListingOptions::target_partitions`].
pub async fn list_files_for_scan<'a>(
&'a self,
ctx: &'a dyn Session,
filters: &'a [Expr],
limit: Option<usize>,
) -> datafusion_common::Result<ListFilesResult> {
+ let declared_output_partitioning =
+ self.options.output_partitioning.as_ref();
+ let target_partitions = declared_output_partitioning
+ .map(Partitioning::partition_count)
+ .unwrap_or(self.options.target_partitions);
+ self.list_files_for_scan_with_target(
+ ctx,
+ filters,
+ limit,
+ target_partitions,
+ declared_output_partitioning.is_some(),
+ )
+ .await
+ }
+
+ async fn list_files_for_scan_with_target<'a>(
Review Comment:
Addressed. I removed the extra `list_files_for_scan_with_target` variant.
The behavior is now driven entirely by `ListingOptions`, so no additional
method is needed.
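
For reference, here is a minimal sketch of the intent described in the doc comment above: a declared output partitioning takes precedence over `target_partitions`, and the declared count is preserved by padding with empty trailing groups. The `Partitioning` and `ListingOptions` types below are simplified stand-ins, and `effective_partitions` and `pad_groups` are hypothetical helper names. None of this is the actual DataFusion code, and leaving the groups untouched when nothing is declared is an assumption of the sketch.

```rust
// Simplified stand-ins for illustration only; NOT DataFusion's real types.
#[derive(Debug, Clone)]
enum Partitioning {
    RoundRobinBatch(usize),
    UnknownPartitioning(usize),
}

impl Partitioning {
    /// Number of partitions this scheme declares.
    fn partition_count(&self) -> usize {
        match self {
            Partitioning::RoundRobinBatch(n) | Partitioning::UnknownPartitioning(n) => *n,
        }
    }
}

/// Stand-in for the two option fields this discussion is about.
struct ListingOptions {
    target_partitions: usize,
    output_partitioning: Option<Partitioning>,
}

/// Hypothetical helper: returns the partition count to use and whether it
/// came from a declared output partitioning.
fn effective_partitions(options: &ListingOptions) -> (usize, bool) {
    let declared = options.output_partitioning.as_ref();
    let count = declared
        .map(Partitioning::partition_count)
        .unwrap_or(options.target_partitions);
    (count, declared.is_some())
}

/// Hypothetical helper: when the count was declared, append empty trailing
/// groups until it is reached. Otherwise the groups are returned unchanged
/// (an assumption of this sketch).
fn pad_groups(mut groups: Vec<Vec<String>>, count: usize, declared: bool) -> Vec<Vec<String>> {
    if declared && groups.len() < count {
        groups.resize_with(count, Vec::new);
    }
    groups
}

fn main() {
    let options = ListingOptions {
        target_partitions: 8,
        output_partitioning: Some(Partitioning::RoundRobinBatch(4)),
    };
    let (count, declared) = effective_partitions(&options);
    let groups = pad_groups(vec![vec!["a.parquet".to_string()]], count, declared);
    println!("count={count} declared={declared} groups={}", groups.len());

    // A different declared scheme is handled the same way.
    let other = Partitioning::UnknownPartitioning(2);
    println!("{:?} declares {} partitions", other, other.partition_count());
}
```

Keeping the decision in one place that reads only the options means callers do not need a second entry point to pass the count explicitly.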
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]