berkaysynnada commented on code in PR #15852:
URL: https://github.com/apache/datafusion/pull/15852#discussion_r2062662498
##########
datafusion/datasource/src/file_groups.rs:
##########
@@ -421,7 +421,7 @@ impl FileGroup {
     }
 
     /// Get the statistics for this group
-    pub fn statistics(&self) -> Option<&Statistics> {
+    pub fn statistics_ref(&self) -> Option<&Statistics> {

Review Comment:
   Should we apply a similar pattern to the FileGroup statistics API as well, with a given index for the nth partitioned file?

##########
datafusion/physical-plan/src/execution_plan.rs:
##########
@@ -430,6 +430,32 @@ pub trait ExecutionPlan: Debug + DisplayAs + Send + Sync {
         Ok(Statistics::new_unknown(&self.schema()))
     }
 
+    /// Returns statistics for a specific partition of this `ExecutionPlan` node.
+    /// If statistics are not available, should return [`Statistics::new_unknown`]
+    /// (the default), not an error.
+    /// If `partition` is `None`, it returns statistics for the entire plan.
+    fn partition_statistics(&self, partition: Option<usize>) -> Result<Statistics> {
+        match partition {
+            Some(idx) => {
+                // Validate partition index
+                let partition_count = self.properties().partitioning.partition_count();
+                if idx >= partition_count {
+                    return internal_err!(
+                        "Invalid partition index: {}, the partition count is {}",
+                        idx,
+                        partition_count
+                    );
+                }
+                // Default implementation: return unknown statistics for the specific partition
+                Ok(Statistics::new_unknown(&self.schema()))
+            }
+            None => {
+                // Return unknown statistics for the entire plan
+                Ok(Statistics::new_unknown(&self.schema()))

Review Comment:
   ```suggestion
           if let Some(idx) = partition {
               // Validate partition index
               let partition_count = self.properties().partitioning.partition_count();
               if idx >= partition_count {
                   return internal_err!(
                       "Invalid partition index: {}, the partition count is {}",
                       idx,
                       partition_count
                   );
               }
           }
           Ok(Statistics::new_unknown(&self.schema()))
   ```

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
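
For readers following the second review comment: the suggestion relies on both `match` arms in the diff ending in the same `Ok(Statistics::new_unknown(...))`, so only the index check needs to be conditional. The sketch below is a standalone illustration of that equivalence, not DataFusion code. `Stats`, `partition_statistics_match`, and `partition_statistics_if_let` are hypothetical stand-ins, the partition count is passed in as a plain argument instead of being read from plan properties, and a `String` error replaces `internal_err!`.

```rust
// Hypothetical stand-in for a statistics type; NOT the DataFusion `Statistics`.
#[derive(Debug, PartialEq)]
struct Stats {
    // `None` models "unknown".
    num_rows: Option<usize>,
}

impl Stats {
    fn new_unknown() -> Self {
        Stats { num_rows: None }
    }
}

// Builds the error message used by both variants below.
fn invalid_index(idx: usize, partition_count: usize) -> String {
    format!(
        "Invalid partition index: {}, the partition count is {}",
        idx, partition_count
    )
}

// Shape from the diff: a `match` whose two arms return the same value.
fn partition_statistics_match(
    partition: Option<usize>,
    partition_count: usize,
) -> Result<Stats, String> {
    match partition {
        Some(idx) => {
            if idx >= partition_count {
                return Err(invalid_index(idx, partition_count));
            }
            Ok(Stats::new_unknown())
        }
        None => Ok(Stats::new_unknown()),
    }
}

// Shape from the suggestion: validate only when an index is given,
// then fall through to the single shared return.
fn partition_statistics_if_let(
    partition: Option<usize>,
    partition_count: usize,
) -> Result<Stats, String> {
    if let Some(idx) = partition {
        if idx >= partition_count {
            return Err(invalid_index(idx, partition_count));
        }
    }
    Ok(Stats::new_unknown())
}

fn main() {
    // Compare both shapes on whole-plan, in-range, and out-of-range inputs.
    for partition in [None, Some(0), Some(3), Some(4), Some(10)] {
        assert_eq!(
            partition_statistics_match(partition, 4),
            partition_statistics_if_let(partition, 4)
        );
    }
    println!("both shapes agree");
}
```

Under these assumptions the two functions return identical results for every input, which is why the shorter `if let` form can replace the `match` without changing behavior.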
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org