berkaysynnada commented on code in PR #15852:
URL: https://github.com/apache/datafusion/pull/15852#discussion_r2062662498


##########
datafusion/datasource/src/file_groups.rs:
##########
@@ -421,7 +421,7 @@ impl FileGroup {
     }
 
     /// Get the statistics for this group
-    pub fn statistics(&self) -> Option<&Statistics> {
+    pub fn statistics_ref(&self) -> Option<&Statistics> {

Review Comment:
   Should we apply a similar pattern to the `FileGroup` statistics API as well,
   taking an index that selects the nth partitioned file?
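
To make the question concrete, here is a minimal sketch of what an index-taking accessor could look like. Everything in it is an assumption for illustration: `Stats`, `PartitionedFile`, and `FileGroup` are simplified stand-ins rather than the real DataFusion types, and the method name `file_statistics` is hypothetical.

```rust
/// Simplified stand-in for DataFusion's `Statistics` (assumption: only a row count).
#[derive(Debug, Clone, PartialEq)]
pub struct Stats {
    pub num_rows: Option<usize>,
}

/// Simplified stand-in for a partitioned file that may carry its own statistics.
pub struct PartitionedFile {
    pub statistics: Option<Stats>,
}

/// Simplified stand-in for a file group: a list of files plus group-level statistics.
pub struct FileGroup {
    pub files: Vec<PartitionedFile>,
    pub statistics: Option<Stats>,
}

impl FileGroup {
    /// Hypothetical accessor mirroring the `Option<usize>` pattern:
    /// `Some(i)` selects the statistics of the i-th file,
    /// `None` selects the statistics of the whole group.
    /// An out-of-range index yields `None` instead of panicking.
    pub fn file_statistics(&self, index: Option<usize>) -> Option<&Stats> {
        match index {
            Some(i) => self.files.get(i).and_then(|f| f.statistics.as_ref()),
            None => self.statistics.as_ref(),
        }
    }
}
```

With this shape, a caller would pass `Some(n)` for the nth partitioned file and `None` for the group as a whole, which matches how `partition_statistics` treats its `partition` argument.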



##########
datafusion/physical-plan/src/execution_plan.rs:
##########
@@ -430,6 +430,32 @@ pub trait ExecutionPlan: Debug + DisplayAs + Send + Sync {
         Ok(Statistics::new_unknown(&self.schema()))
     }
 
+    /// Returns statistics for a specific partition of this `ExecutionPlan` node.
+    /// If statistics are not available, should return [`Statistics::new_unknown`]
+    /// (the default), not an error.
+    /// If `partition` is `None`, it returns statistics for the entire plan.
+    fn partition_statistics(&self, partition: Option<usize>) -> Result<Statistics> {
+        match partition {
+            Some(idx) => {
+                // Validate partition index
+                let partition_count = self.properties().partitioning.partition_count();
+                if idx >= partition_count {
+                    return internal_err!(
+                        "Invalid partition index: {}, the partition count is {}",
+                        idx,
+                        partition_count
+                    );
+                }
+                // Default implementation: return unknown statistics for the specific partition
+                Ok(Statistics::new_unknown(&self.schema()))
+            }
+            None => {
+                // Return unknown statistics for the entire plan
+                Ok(Statistics::new_unknown(&self.schema()))

Review Comment:
   ```suggestion
           if let Some(idx) = partition {
               // Validate partition index
               let partition_count = self.properties().partitioning.partition_count();
               if idx >= partition_count {
                   return internal_err!(
                       "Invalid partition index: {}, the partition count is {}",
                       idx,
                       partition_count
                   );
               }
           }
           Ok(Statistics::new_unknown(&self.schema()))
   ```
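
The suggestion works because both arms of the original `match` return the same value, so only the index check needs to be conditional. The standalone sketch below illustrates that control flow outside the trait. It rests on assumptions: `Stats` is a simplified stand-in for `Statistics`, the free function `default_partition_statistics` is hypothetical, and a plain `String` error replaces `internal_err!`.

```rust
/// Simplified stand-in for DataFusion's `Statistics` (assumption: only a row count).
#[derive(Debug, Clone, PartialEq)]
pub struct Stats {
    pub num_rows: Option<usize>,
}

impl Stats {
    /// Stand-in for `Statistics::new_unknown`: nothing is known.
    pub fn new_unknown() -> Self {
        Stats { num_rows: None }
    }
}

/// Hypothetical free-function version of the default implementation.
/// `partition_count` is passed in directly instead of being read from plan properties.
pub fn default_partition_statistics(
    partition: Option<usize>,
    partition_count: usize,
) -> Result<Stats, String> {
    // Validate the index only when a specific partition was requested.
    if let Some(idx) = partition {
        if idx >= partition_count {
            return Err(format!(
                "Invalid partition index: {}, the partition count is {}",
                idx, partition_count
            ));
        }
    }
    // Both the per-partition and whole-plan cases fall through to the same result.
    Ok(Stats::new_unknown())
}
```

A valid index and `None` both produce unknown statistics, while an out-of-range index is reported as an error, which is the same observable behavior as the longer `match` version.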



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

