[GitHub] [arrow-datafusion] alamb commented on issue #6339: Simplified TableProvider::Insert API

via GitHub Fri, 12 May 2023 10:01:20 -0700


alamb commented on issue #6339:
URL: 
https://github.com/apache/arrow-datafusion/issues/6339#issuecomment-1546033615


   > Yes, that is the construction you would definitely want. I don't believe a 
corresponding notion exists in the physical layer, which has pretty hard-coded 
assumptions about partition enumerability
   
   I think @JanKaul 's idea in 
https://github.com/apache/arrow-datafusion/issues/6339#issuecomment-1545964425 
would work well
   
   > Unless I'm missing something, its the difference between calling 
[ExecutionPlan::execute](https://docs.rs/datafusion/latest/datafusion/physical_plan/trait.ExecutionPlan.html#tymethod.execute)
 and being given the result? Is that really a meaningful complexity?
   
   I am trying to avoid having to write a new `ExecutionPlan` for each table 
provider (that is, avoid all the boilerplate here): 
https://github.com/apache/arrow-datafusion/blob/main/datafusion/core/src/physical_plan/memory.rs
   
   
   If the API is like this:
   
   ```rust
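   // Sketch of the proposed method; a body-less signature belongs in the
   // trait definition, and `async fn` in a trait needs `#[async_trait]`.
   #[async_trait]
   trait TableProvider {
       async fn insert(&self, ctx: Arc<TaskContext>, plan: Arc<dyn ExecutionPlan>) -> Result<()>;
   }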
   ```
   
   I think the DataFusion implementation would be simpler, but now each sink 
would be potentially more complicated as it would have to deal with running 
multiple streams concurrently. But maybe that is ok 🤔  
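
To make the extra work on the sink side concrete, here is a minimal sketch of a sink draining several partition streams concurrently. It uses only the standard library: `Batch`, `PartitionStream`, and `insert_all` are hypothetical stand-ins invented for illustration, not DataFusion types, and plain threads stand in for the async streams a real `ExecutionPlan` would return.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical stand-in for a record batch: just a vector of row values.
type Batch = Vec<i64>;

/// Hypothetical stand-in for the output stream of one partition.
type PartitionStream = Box<dyn Iterator<Item = Batch> + Send>;

/// Drain every partition on its own thread, funnel the batches into a single
/// writer loop, and return the total number of rows "written".
fn insert_all(partitions: Vec<PartitionStream>) -> usize {
    let (tx, rx) = mpsc::channel::<Batch>();

    // One worker per partition; each forwards its batches to the writer.
    let workers: Vec<_> = partitions
        .into_iter()
        .map(|stream| {
            let tx = tx.clone();
            thread::spawn(move || {
                for batch in stream {
                    tx.send(batch).expect("writer hung up");
                }
            })
        })
        .collect();

    // Drop the original sender so the receive loop ends once all workers finish.
    drop(tx);

    // The single writer: a real sink would append each batch to storage here.
    let mut rows_written = 0;
    for batch in rx {
        rows_written += batch.len();
    }

    for worker in workers {
        worker.join().expect("partition worker panicked");
    }
    rows_written
}
```

Every sink would need some version of this fan-in logic (plus error handling and backpressure, omitted here), which is the added complexity mentioned above.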
   
   I am also trying to keep Execution and datasource separate (so I can 
eventually break them into different crates) -- maybe I can just make another 
trait to abstract away the execution plan detail
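
One possible shape for such a trait, purely as a sketch: the datasource side sees only "a number of partitions, each yielding batches", and an execution-side adapter would implement it on top of an `ExecutionPlan`. All names here (`InsertInput`, `InsertSink`, `InMemoryInput`, `VecSink`) are hypothetical and synchronous iterators stand in for async streams; none of this is an existing DataFusion API.

```rust
/// Hypothetical stand-in for a record batch.
type Batch = Vec<i64>;

/// Hypothetical abstraction over the input of an insert. The datasource
/// crate would depend only on this trait, not on the execution plan.
trait InsertInput {
    fn partition_count(&self) -> usize;
    fn open(&self, partition: usize) -> Box<dyn Iterator<Item = Batch> + '_>;
}

/// Hypothetical datasource-side trait: consumes an input, returns rows written.
trait InsertSink {
    fn insert(&mut self, input: &dyn InsertInput) -> Result<usize, String>;
}

/// Test double playing the role of the execution-side adapter.
struct InMemoryInput {
    partitions: Vec<Vec<Batch>>,
}

impl InsertInput for InMemoryInput {
    fn partition_count(&self) -> usize {
        self.partitions.len()
    }

    fn open(&self, partition: usize) -> Box<dyn Iterator<Item = Batch> + '_> {
        Box::new(self.partitions[partition].iter().cloned())
    }
}

/// Trivial sink that appends all rows to a vector, one partition at a time.
struct VecSink {
    rows: Vec<i64>,
}

impl InsertSink for VecSink {
    fn insert(&mut self, input: &dyn InsertInput) -> Result<usize, String> {
        let before = self.rows.len();
        for partition in 0..input.partition_count() {
            for batch in input.open(partition) {
                self.rows.extend(batch);
            }
        }
        Ok(self.rows.len() - before)
    }
}
```

With this split, a sink can be written and tested against `InMemoryInput` without pulling in any execution code.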
   
   I am about to get on a plane -- I'll play around with it
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

