friendlymatthew opened a new pull request, #17242:
URL: https://github.com/apache/datafusion/pull/17242

   ## Which issue does this PR close?
   
   - Closes https://github.com/apache/datafusion/issues/17095
   - Closes https://github.com/apache/datafusion/issues/15952
   
   ## Rationale for this change
   
   This PR simplifies the relationship between `FileSource`s <--> 
`FileScanConfig` <--> `DataSource`.  
   
   Currently, `FileScanConfig` is a struct that groups the common parameters 
shared across different file sources. However, the existing design also makes 
`FileScanConfig` implement `DataSource`. This means that to construct a data 
source execution plan, you must derive it from a configuration struct. 
   
   This PR removes that indirection. Instead, each `FileSource` struct holds a 
`FileScanConfig` field, and every type that implements `FileSource` also 
implements `DataSource`. 
   
   This redesign removes a lot of redundant code. For instance, 
`AvroSource` previously duplicated fields from `FileScanConfig`, which required 
additional boilerplate to manually get/set values:
   
   <img width="1397" height="224" alt="Screenshot 2025-08-19 at 3 00 26 PM" 
src="https://github.com/user-attachments/assets/c3bfd060-3b00-4f24-b425-0de3a3c2b53e"
 />
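
To make the duplication concrete, here is a minimal sketch of the two shapes. It uses simplified stand-in types, not DataFusion's real definitions: the field names (`batch_size`, `projection`) and the `AvroSourceBefore` / `AvroSourceAfter` names are assumptions chosen for illustration only.

```rust
// Hypothetical, simplified stand-ins; not DataFusion's real definitions.
#[derive(Debug, Clone, PartialEq, Default)]
pub struct FileScanConfig {
    pub batch_size: Option<usize>,
    pub projection: Option<Vec<usize>>,
}

// "Before" shape: the source copies fields that already live in the config,
// so it needs its own setters and getters to keep the copies usable.
#[derive(Debug, Default)]
pub struct AvroSourceBefore {
    batch_size: Option<usize>,
    projection: Option<Vec<usize>>,
}

impl AvroSourceBefore {
    pub fn with_batch_size(mut self, batch_size: usize) -> Self {
        self.batch_size = Some(batch_size);
        self
    }

    pub fn with_projection(mut self, projection: Vec<usize>) -> Self {
        self.projection = Some(projection);
        self
    }

    pub fn batch_size(&self) -> Option<usize> {
        self.batch_size
    }

    pub fn projection(&self) -> Option<&[usize]> {
        self.projection.as_deref()
    }
}

// "After" shape: the source owns a single config value and reads from it.
#[derive(Debug)]
pub struct AvroSourceAfter {
    config: FileScanConfig,
}

impl AvroSourceAfter {
    pub fn new(config: FileScanConfig) -> Self {
        Self { config }
    }

    pub fn config(&self) -> &FileScanConfig {
        &self.config
    }
}
```

In the second shape there is one source of truth for the shared parameters, so the per-field setters and getters on the source are no longer needed.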
   
   --
   
   We still maintain an abstraction boundary between `FileSource` and 
`DataSource`s. The `DataSource` impl remains generic over any `T: FileSource`.
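
One way to read "generic over any `T: FileSource`" is a blanket implementation. The sketch below shows that pattern with hypothetical, cut-down traits; the method names (`config`, `file_type`, `fetch`, `describe`) and the `CsvSource` stand-in are assumptions for illustration and may not match the signatures in this PR.

```rust
// Hypothetical, simplified stand-ins; not DataFusion's real definitions.
#[derive(Debug, Clone, Default)]
pub struct FileScanConfig {
    pub limit: Option<usize>,
}

// File-format-specific side of the boundary.
pub trait FileSource {
    fn config(&self) -> &FileScanConfig;
    fn file_type(&self) -> &'static str;
}

// Format-agnostic side of the boundary.
pub trait DataSource {
    fn fetch(&self) -> Option<usize>;
    fn describe(&self) -> String;
}

// A single blanket impl: every `FileSource` is also a `DataSource`,
// and the shared logic only goes through the `FileSource` trait methods.
impl<T: FileSource> DataSource for T {
    fn fetch(&self) -> Option<usize> {
        self.config().limit
    }

    fn describe(&self) -> String {
        format!("{} scan, limit={:?}", self.file_type(), self.fetch())
    }
}

pub struct CsvSource {
    config: FileScanConfig,
}

impl FileSource for CsvSource {
    fn config(&self) -> &FileScanConfig {
        &self.config
    }

    fn file_type(&self) -> &'static str {
        "csv"
    }
}
```

Under this reading, a new file format only implements `FileSource` and picks up the `DataSource` behavior without further code.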
   
   ## Are there any user-facing changes?
   
   Yes -- and they are substantial.
   
   Note: to keep the review process clearer, the current diff does not yet 
include deprecation strategies for existing methods.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

