berkaysynnada commented on code in PR #16014:
URL: https://github.com/apache/datafusion/pull/16014#discussion_r2083998118


##########
datafusion/datasource/src/file_stream.rs:
##########
@@ -367,7 +368,7 @@ impl Default for OnError {
 pub trait FileOpener: Unpin + Send + Sync {
     /// Asynchronously open the specified file and return a stream
     /// of [`RecordBatch`]
-    fn open(&self, file_meta: FileMeta) -> Result<FileOpenFuture>;
+    fn open(&self, file_meta: FileMeta, file: PartitionedFile) -> Result<FileOpenFuture>;

Review Comment:
   > Maybe? But I feel like we have the partitioned file we might as well pass
   > it in. Maybe we use it in the future to enable optimizations that use the
   > partition values (e.g. late pruning based on partition values, including
   > partition values in the scan so that
   > [more filters can be evaluated](https://github.com/apache/datafusion/pull/15935), etc.)
   
   I believe these can also be inferred from statistics in a more generalized
fashion (I don't know whether partition columns exist in `column_statistics`
now), but it's not a big deal; we can keep this 👍🏻
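
   To make the "late pruning based on partition values" idea concrete, here is a minimal sketch of what an opener could do once it receives the partitioned file. Everything in it is hypothetical: `SketchFileMeta`, `SketchPartitionedFile`, `OpenOutcome`, `SketchFileOpener`, and `EqualityPruningOpener` are simplified stand-ins, not DataFusion's real `FileMeta`, `PartitionedFile`, or `FileOpener`, and the synchronous return type replaces `FileOpenFuture` only to keep the example self-contained.

```rust
// Hypothetical, simplified stand-ins. These are NOT DataFusion's real types.
#[derive(Debug, Clone, PartialEq)]
pub struct SketchFileMeta {
    pub path: String,
}

#[derive(Debug, Clone, PartialEq)]
pub struct SketchPartitionedFile {
    /// Partition values as (column, value) pairs, e.g. ("year", "2024").
    pub partition_values: Vec<(String, String)>,
}

/// Outcome of opening a file in this sketch (stands in for a batch stream).
#[derive(Debug, PartialEq)]
pub enum OpenOutcome {
    /// The file was skipped because its partition values cannot match.
    Pruned,
    /// The file would be scanned; carries the path that would be read.
    Scan(String),
}

/// Mirrors the two-argument shape of `open` from the diff, synchronously.
pub trait SketchFileOpener {
    fn open(
        &self,
        file_meta: SketchFileMeta,
        file: SketchPartitionedFile,
    ) -> Result<OpenOutcome, String>;
}

/// An opener that knows one equality filter on a partition column.
pub struct EqualityPruningOpener {
    pub column: String,
    pub value: String,
}

impl SketchFileOpener for EqualityPruningOpener {
    fn open(
        &self,
        file_meta: SketchFileMeta,
        file: SketchPartitionedFile,
    ) -> Result<OpenOutcome, String> {
        // Prune only when the file carries the filtered column with a
        // different value; a file without that column is still scanned.
        let mismatch = file
            .partition_values
            .iter()
            .any(|(col, val)| col == &self.column && val != &self.value);
        if mismatch {
            Ok(OpenOutcome::Pruned)
        } else {
            Ok(OpenOutcome::Scan(file_meta.path))
        }
    }
}
```

   The point of the sketch is only that the decision can be made inside `open` without reading the file, because the partition values arrive with the call. Whether the same information could come from statistics instead is the open question raised above.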



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

