alamb opened a new issue, #6339: URL: https://github.com/apache/arrow-datafusion/issues/6339
### Is your feature request related to a problem or challenge?

The recent INSERT work (https://github.com/apache/arrow-datafusion/pull/6049) is a good example of a useful DataFusion feature that has an extensibility story (a new function on a trait). However, it takes non-trivial effort to add such support, because it requires a new physical operator.

### Describe the solution you'd like

Thus I would like to propose the following API to support writing to sources.

# DataSink trait

A new trait that exposes just the information needed for writing. Something like:

```rust
/// The DataSink implements writing streams of [`RecordBatch`]es to
/// partitioned destinations
pub trait DataSink: std::fmt::Debug + std::fmt::Display + Send + Sync {
    /// How does this sink want its input distributed?
    fn required_input_distribution(&self) -> Distribution;

    /// Returns a future which writes a RecordBatchStream to a particular
    /// partition and returns the number of rows written
    fn write_stream(
        &self,
        partition: usize,
        input: SendableRecordBatchStream,
    ) -> BoxFuture<'_, Result<u64>>;
}
```

# Change signature of `TableProvider`

Then if we change the signature of `TableProvider` from

```rust
/// Insert into this table
async fn insert_into(
    &self,
    _state: &SessionState,
    _input: Arc<dyn ExecutionPlan>,
) -> Result<Arc<dyn ExecutionPlan>> {
    let msg = "Insertion not implemented for this table".to_owned();
    Err(DataFusionError::NotImplemented(msg))
}
```

to something like

```rust
/// Get a sink to use to write to this table, if supported
async fn sink(&self) -> Result<Arc<dyn DataSink>> {
    let msg = "Insertion not implemented for this table".to_owned();
    Err(DataFusionError::NotImplemented(msg))
}
```

I think almost all of the insert plans can share a common `ExecutionPlan`.

### Describe alternatives you've considered

Do nothing.

### Additional context

_No response_
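To illustrate how a single shared operator could serve many sinks, as suggested in the proposal above, here is a minimal, standard-library-only sketch. It is not the DataFusion API: the trait is a synchronous simplification of the proposed `DataSink`, and `Batch`, `BatchStream`, `MemorySink`, `SinkExec`, `stream_of`, and `demo` are hypothetical names introduced only for this example.

```rust
use std::fmt::Debug;
use std::sync::{Arc, Mutex};

/// Simplified error type (assumption; DataFusion uses `DataFusionError`).
type Result<T> = std::result::Result<T, String>;

/// Hypothetical stand-in for an Arrow `RecordBatch`: only the row count is kept.
#[derive(Debug, Clone, PartialEq)]
pub struct Batch {
    pub num_rows: u64,
}

/// Hypothetical stand-in for `SendableRecordBatchStream` (synchronous here).
pub type BatchStream = Box<dyn Iterator<Item = Result<Batch>> + Send>;

/// Synchronous simplification of the proposed `DataSink` trait.
pub trait DataSink: Debug + Send + Sync {
    /// Write all batches of one input partition and return the rows written.
    fn write_stream(&self, partition: usize, input: BatchStream) -> Result<u64>;
}

/// Hypothetical sink that keeps batches in memory, tagged by partition.
#[derive(Debug, Default)]
pub struct MemorySink {
    written: Mutex<Vec<(usize, Batch)>>,
}

impl DataSink for MemorySink {
    fn write_stream(&self, partition: usize, input: BatchStream) -> Result<u64> {
        let mut rows = 0;
        for batch in input {
            let batch = batch?; // propagate upstream errors
            rows += batch.num_rows;
            self.written
                .lock()
                .map_err(|e| e.to_string())?
                .push((partition, batch));
        }
        Ok(rows)
    }
}

/// Hypothetical shared operator: the only sink-specific part is the `DataSink`.
pub struct SinkExec {
    sink: Arc<dyn DataSink>,
}

impl SinkExec {
    pub fn new(sink: Arc<dyn DataSink>) -> Self {
        Self { sink }
    }

    /// Drive every input partition into the sink and sum the row counts.
    pub fn execute(&self, partitions: Vec<BatchStream>) -> Result<u64> {
        let mut total = 0;
        for (partition, stream) in partitions.into_iter().enumerate() {
            total += self.sink.write_stream(partition, stream)?;
        }
        Ok(total)
    }
}

/// Build a stream of batches with the given row counts.
fn stream_of(rows: &[u64]) -> BatchStream {
    let batches: Vec<Result<Batch>> =
        rows.iter().map(|&n| Ok(Batch { num_rows: n })).collect();
    Box::new(batches.into_iter())
}

/// Usage: two input partitions holding 2 + 3 rows and 4 rows.
pub fn demo() -> Result<u64> {
    let exec = SinkExec::new(Arc::new(MemorySink::default()));
    exec.execute(vec![stream_of(&[2, 3]), stream_of(&[4])])
}
```

In this sketch a new destination only has to implement `write_stream`; the loop over partitions and the row-count aggregation live once in `SinkExec`. A real implementation would be asynchronous and would also consult `required_input_distribution` when planning.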
