alamb commented on issue #6339: URL: https://github.com/apache/arrow-datafusion/issues/6339#issuecomment-1546033615
> Yes, that is the construction you would definitely want. I don't believe a corresponding notion exists in the physical layer, which has pretty hard-coded assumptions about partition enumerability

I think @JanKaul's idea in https://github.com/apache/arrow-datafusion/issues/6339#issuecomment-1545964425 would work well.

> Unless I'm missing something, it's the difference between calling [ExecutionPlan::execute](https://docs.rs/datafusion/latest/datafusion/physical_plan/trait.ExecutionPlan.html#tymethod.execute) and being given the result? Is that really a meaningful complexity?

I am trying to avoid having to write a new `ExecutionPlan` for each table provider (that is, avoid all the boilerplate here): https://github.com/apache/arrow-datafusion/blob/main/datafusion/core/src/physical_plan/memory.rs

If the API is like this:

```rust
impl TableProvider {
    async fn insert(&self, ctx: Arc<TaskContext>, plan: Arc<dyn ExecutionPlan>) -> Result<()>;
}
```

I think the DataFusion implementation would be simpler, but each sink would potentially be more complicated, as it would have to deal with running multiple streams concurrently. But maybe that is ok 🤔

I am also trying to keep execution and datasource separate (so I can eventually break them into different crates) -- maybe I can just make another trait to abstract away the execution plan detail.

I am about to get on a plane -- I'll play around with it.
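
To make the trade-off concrete, here is a rough sketch of what "the sink has to run multiple streams concurrently" could look like when `insert` is handed a plan rather than a single stream. It uses only the standard library: `PartitionedPlan`, `VecPlan`, and `CountingSink` are hypothetical stand-ins (not DataFusion types), and scoped threads stand in for async streams, so treat it as an illustration of the shape of the problem and not as the actual API.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical stand-in for `ExecutionPlan`: reports how many partitions
/// it has and yields the rows of one partition on request.
trait PartitionedPlan: Send + Sync {
    fn partition_count(&self) -> usize;
    fn execute(&self, partition: usize) -> Box<dyn Iterator<Item = i64> + Send>;
}

/// Toy plan backed by in-memory vectors, one vector per partition.
struct VecPlan {
    partitions: Vec<Vec<i64>>,
}

impl PartitionedPlan for VecPlan {
    fn partition_count(&self) -> usize {
        self.partitions.len()
    }

    fn execute(&self, partition: usize) -> Box<dyn Iterator<Item = i64> + Send> {
        Box::new(self.partitions[partition].clone().into_iter())
    }
}

/// Hypothetical sink. Because it receives the whole plan, it is the sink's
/// job to drive every partition, here with one scoped thread per partition.
struct CountingSink {
    rows_written: AtomicUsize,
}

impl CountingSink {
    fn insert(&self, plan: Arc<dyn PartitionedPlan>) -> Result<(), String> {
        let results: Vec<thread::Result<()>> = thread::scope(|scope| {
            // Start one worker per partition.
            let handles: Vec<_> = (0..plan.partition_count())
                .map(|partition| {
                    let plan = Arc::clone(&plan);
                    scope.spawn(move || {
                        // "Writing" is simulated by counting the rows.
                        let rows = plan.execute(partition).count();
                        self.rows_written.fetch_add(rows, Ordering::SeqCst);
                    })
                })
                .collect();
            // Wait for every worker and keep its outcome.
            handles.into_iter().map(|handle| handle.join()).collect()
        });

        if results.iter().any(|result| result.is_err()) {
            return Err("a partition worker panicked".to_string());
        }
        Ok(())
    }
}
```

Every sink written against this shape would repeat the fan-out and join logic, which is the extra per-sink complexity mentioned above; a sink that is simply given one stream would not need any of it.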
