alamb opened a new issue, #6339:
URL: https://github.com/apache/arrow-datafusion/issues/6339

   ### Is your feature request related to a problem or challenge?
   
   The recent INSERT work (https://github.com/apache/arrow-datafusion/pull/6049)
   is a good example of a useful DataFusion feature that has an extensibility
   story (a new function on a trait).
   
   However, it takes non-trivial effort to add such support (it requires a new
   physical operator).
   
   
   ### Describe the solution you'd like
   
   Thus I would like to propose the following API to support writing to sources
   
   # DataSink trait
   A new trait that exposes just the information needed for writing. Something like:
   
   ```rust
   /// The DataSink implements writing streams of [`RecordBatch`]es to
   /// partitioned destinations
   pub trait DataSink: std::fmt::Debug + std::fmt::Display + Send + Sync {
   
       /// How does this sink want its input distributed?
       fn required_input_distribution(&self) -> Distribution;
   
       /// Return a future which writes a RecordBatchStream to a particular
       /// partition and returns the number of rows written
       fn write_stream(
           &self,
           partition: usize,
           input: SendableRecordBatchStream,
       ) -> BoxFuture<Result<u64>>;
   }
   ```
   
   # Change signature of `TableProvider`
   
   Then if we change the signature of `TableProvider` from
   ```rust
       /// Insert into this table
       async fn insert_into(
           &self,
           _state: &SessionState,
           _input: Arc<dyn ExecutionPlan>,
       ) -> Result<Arc<dyn ExecutionPlan>> {
           let msg = "Insertion not implemented for this table".to_owned();
           Err(DataFusionError::NotImplemented(msg))
       }
   ```
   
   to something like
   ```rust
       /// Get a sink to use to write to this table, if supported
       async fn sink(
           &self,
       ) -> Result<Arc<dyn DataSink>> {
           let msg = "Insertion not implemented for this table".to_owned();
           Err(DataFusionError::NotImplemented(msg))
       }
   ```
   
   I think almost all of the insert plans can share a common `ExecutionPlan`.
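
To illustrate how a shared insert operator could drive any sink, here is a minimal, self-contained sketch. It is not DataFusion code and makes several simplifying assumptions: `Batch` and `BatchStream` are hypothetical stand-ins for `RecordBatch` and `SendableRecordBatchStream`, `write_stream` is synchronous rather than returning a `BoxFuture`, errors are plain `String`s, and `required_input_distribution` and the `Display` bound are omitted. `MemorySink`, `run_insert`, and `demo` are illustrative names only.

```rust
use std::fmt;
use std::sync::{Arc, Mutex};

// Hypothetical stand-in for arrow's RecordBatch: one column of i64 values.
type Batch = Vec<i64>;
// Hypothetical stand-in for SendableRecordBatchStream (the real one is async).
type BatchStream = Box<dyn Iterator<Item = Batch> + Send>;

/// Simplified, synchronous variant of the proposed trait (an assumption made
/// for this sketch; the proposal returns a BoxFuture).
trait DataSink: fmt::Debug + Send + Sync {
    /// Write all batches for one partition and return the number of rows written.
    fn write_stream(&self, partition: usize, input: BatchStream) -> Result<u64, String>;
}

/// Example sink that keeps the written batches in memory, grouped by partition.
#[derive(Debug, Default)]
struct MemorySink {
    partitions: Mutex<Vec<Vec<Batch>>>,
}

impl DataSink for MemorySink {
    fn write_stream(&self, partition: usize, input: BatchStream) -> Result<u64, String> {
        let mut parts = self.partitions.lock().map_err(|e| e.to_string())?;
        // Grow the partition list on demand so any partition index is valid.
        if parts.len() <= partition {
            parts.resize_with(partition + 1, Vec::new);
        }
        let mut rows = 0u64;
        for batch in input {
            rows += batch.len() as u64;
            parts[partition].push(batch);
        }
        Ok(rows)
    }
}

/// The part that could be shared by all tables: feed each input partition to
/// the sink and total the row counts. It knows nothing about the destination.
fn run_insert(sink: Arc<dyn DataSink>, inputs: Vec<BatchStream>) -> Result<u64, String> {
    let mut total = 0;
    for (partition, stream) in inputs.into_iter().enumerate() {
        total += sink.write_stream(partition, stream)?;
    }
    Ok(total)
}

/// Small usage example: two input partitions holding 4 and 2 rows.
fn demo() -> Result<u64, String> {
    let sink: Arc<dyn DataSink> = Arc::new(MemorySink::default());
    let inputs: Vec<BatchStream> = vec![
        Box::new(vec![vec![1_i64, 2, 3], vec![4]].into_iter()),
        Box::new(vec![vec![5_i64, 6]].into_iter()),
    ];
    run_insert(sink, inputs)
}
```

In this sketch a table only has to supply a `DataSink`; the loop in `run_insert` plays the role the common `ExecutionPlan` would play, so no per-table physical operator is needed.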
   
   ### Describe alternatives you've considered
   
   Do nothing.
   
   ### Additional context
   
   _No response_

