alamb opened a new issue, #14939:
URL: https://github.com/apache/datafusion/issues/14939

   ### Is your feature request related to a problem or challenge?
   
   While working on various upgrade PRs and preparing for the DataFusion 46 
release, I noticed something I would like to change before we release.
   
   The `FileSource` and `DataSource` traits were introduced in the datasource 
refactor:
   - https://github.com/apache/datafusion/pull/14224
   
   They have APIs to update the underlying source in a few ways, but the APIs 
require cloning. For example, `FileSource` looks like this:
   
   ```rust
   /// Common behaviors that every file format needs to implement.
   ///
   /// See initialization examples on `ParquetSource`, `CsvSource`
   pub trait FileSource: Send + Sync {
   ...
       /// Initialize new type with batch size configuration
       fn with_batch_size(&self, batch_size: usize) -> Arc<dyn FileSource>;
   ...
   }
   ```
   
   Because the method takes `&self` and returns a new `Arc<dyn FileSource>`, the only way to implement `with_batch_size` is to (deep) clone the object:
   
   ```rust
       fn with_batch_size(&self, batch_size: usize) -> Arc<dyn FileSource> {
           let mut conf = self.clone();
           conf.batch_size = Some(batch_size);
           Arc::new(conf)
       }
   ```
   
   
https://github.com/apache/datafusion/blob/1ae06a497e7c6b117c211c52b33445c2063b9921/datafusion/core/src/datasource/physical_plan/csv.rs#L584-L588
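
To make the cost concrete, here is a minimal, self-contained sketch of the same pattern. `ToySource`, `ToyFileSource`, and the `column_names` field are hypothetical stand-ins invented for illustration; they are not DataFusion types. The sketch only assumes that a source owns some heap-allocated state, which `clone()` then has to copy:

```rust
use std::sync::Arc;

/// Hypothetical stand-in for a file source (NOT a DataFusion type).
#[derive(Clone, Debug)]
pub struct ToySource {
    pub batch_size: Option<usize>,
    /// Stands in for any owned, potentially large state.
    pub column_names: Vec<String>,
}

/// Hypothetical trait mirroring the `&self -> Arc<dyn ...>` shape above.
pub trait ToyFileSource: Send + Sync {
    fn with_batch_size(&self, batch_size: usize) -> Arc<dyn ToyFileSource>;
    fn batch_size(&self) -> Option<usize>;
    fn column_names(&self) -> &[String];
}

impl ToyFileSource for ToySource {
    fn with_batch_size(&self, batch_size: usize) -> Arc<dyn ToyFileSource> {
        // `&self` is a shared borrow, so fields can be neither mutated nor
        // moved out. The only option is to copy every owned field.
        let mut conf = self.clone();
        conf.batch_size = Some(batch_size);
        Arc::new(conf)
    }

    fn batch_size(&self) -> Option<usize> {
        self.batch_size
    }

    fn column_names(&self) -> &[String] {
        &self.column_names
    }
}
```

Calling `with_batch_size` on a `ToySource` leaves the original untouched and returns a copy whose `column_names` lives in a freshly allocated buffer, even though only `batch_size` changed.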
   
   ### Describe the solution you'd like
   
   I would like to be able to update these settings without having to deep clone the object.
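
The issue does not propose a specific API, so the following is only a sketch of one possible direction, under my own assumptions: configure the source in place (or by value) before it is shared behind an `Arc`. `InPlaceSource`, `InPlaceFileSource`, `set_batch_size`, and `configure` are all hypothetical names, not DataFusion APIs.

```rust
use std::sync::Arc;

/// Hypothetical source type; deliberately does NOT implement `Clone`.
#[derive(Debug)]
pub struct InPlaceSource {
    batch_size: Option<usize>,
    column_names: Vec<String>,
}

/// Hypothetical trait with an in-place setter instead of `&self -> Arc<..>`.
pub trait InPlaceFileSource: Send + Sync {
    /// Assumed setter: mutates the existing object, no copy needed.
    fn set_batch_size(&mut self, batch_size: usize);
    fn batch_size(&self) -> Option<usize>;
    fn column_names(&self) -> &[String];
}

impl InPlaceSource {
    pub fn new(column_names: Vec<String>) -> Self {
        Self { batch_size: None, column_names }
    }

    /// Consuming builder on the concrete type: moves `self`, copies nothing.
    pub fn with_batch_size(mut self, batch_size: usize) -> Self {
        self.batch_size = Some(batch_size);
        self
    }
}

impl InPlaceFileSource for InPlaceSource {
    fn set_batch_size(&mut self, batch_size: usize) {
        self.batch_size = Some(batch_size);
    }

    fn batch_size(&self) -> Option<usize> {
        self.batch_size
    }

    fn column_names(&self) -> &[String] {
        &self.column_names
    }
}

/// Update a uniquely owned trait object, then share it behind an `Arc`.
pub fn configure(
    mut source: Box<dyn InPlaceFileSource>,
    batch_size: usize,
) -> Arc<dyn InPlaceFileSource> {
    source.set_batch_size(batch_size);
    // Moves the value into the `Arc`; heap data owned by its fields is reused.
    Arc::from(source)
}
```

In this sketch the `Vec` backing `column_names` is never reallocated: the same buffer is visible before and after `configure`. The trade-off is that updates require unique ownership, so they have to happen before the source is shared.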
   
   ### Describe alternatives you've considered
   
   _No response_
   
   ### Additional context
   
   _No response_


