rdettai commented on pull request #8513: URL: https://github.com/apache/arrow/pull/8513#issuecomment-715378341
I just found out something! It seems that datafusion is full of interesting mysteries 😄! You actually already have the abstractions required to do what I want, but they are a little hidden.

With `ExecutionContext::register_table(&mut self, name: &str, provider: Box<dyn TableProvider>)` you can already register a custom source exec implementation directly. You have to implement the `TableProvider` trait, which gives you projection pushdown, and then implement the `ExecutionPlan` trait directly. You can then run your queries on it, and you can even use the new source from the dataframe API with `ExecutionContext::table(&mut self, table_name: &str) -> Result<Arc<dyn DataFrame>>`. Isn't that wonderful?

Now the only things that remain to be done, I guess, are:

- add a convenience function like `ExecutionContext::read_provider(&mut self, provider: Box<dyn TableProvider>) -> Result<Arc<dyn DataFrame>>` that shortcuts the two calls mentioned above. This is mainly meant to make this feature more explicit.
- add a new example?
- enjoy... ❤️
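To make the proposed shortcut concrete, here is a minimal, self-contained sketch of how `read_provider` could combine `register_table` and `table`. Everything in it is a simplified stand-in written for illustration: the `TableProvider`, `ExecutionPlan`, `DataFrame` and `ExecutionContext` types below are toy versions and do not use DataFusion's real signatures, and `MemorySource`, `MemoryExec` and the generated `provider_N` table names are hypothetical.

```rust
use std::collections::HashMap;
use std::sync::Arc;

// Stand-in for a physical plan: executing it yields rows of integers.
// (Assumption: the real trait works on Arrow record batches, not Vec<Vec<i64>>.)
trait ExecutionPlan {
    fn execute(&self) -> Vec<Vec<i64>>;
}

// Stand-in for a table provider; `projection` lists the column indices to keep.
trait TableProvider {
    fn scan(&self, projection: Option<&[usize]>) -> Arc<dyn ExecutionPlan>;
}

// Hypothetical custom source backed by rows held in memory.
struct MemorySource {
    rows: Vec<Vec<i64>>,
}

// Hypothetical plan node that returns the rows it was built with.
struct MemoryExec {
    rows: Vec<Vec<i64>>,
}

impl ExecutionPlan for MemoryExec {
    fn execute(&self) -> Vec<Vec<i64>> {
        self.rows.clone()
    }
}

impl TableProvider for MemorySource {
    fn scan(&self, projection: Option<&[usize]>) -> Arc<dyn ExecutionPlan> {
        // Projection pushdown: only the requested columns leave the source.
        let rows: Vec<Vec<i64>> = match projection {
            Some(indices) => self
                .rows
                .iter()
                .map(|row| indices.iter().map(|&i| row[i]).collect())
                .collect(),
            None => self.rows.clone(),
        };
        Arc::new(MemoryExec { rows })
    }
}

// Toy dataframe: a provider plus an optional column selection.
struct DataFrame {
    provider: Arc<dyn TableProvider>,
    projection: Option<Vec<usize>>,
}

impl DataFrame {
    // Keep only the given columns (indices refer to the provider's columns).
    fn select(&self, indices: &[usize]) -> DataFrame {
        DataFrame {
            provider: Arc::clone(&self.provider),
            projection: Some(indices.to_vec()),
        }
    }

    // Run the scan with the current projection and return the rows.
    fn collect(&self) -> Vec<Vec<i64>> {
        self.provider.scan(self.projection.as_deref()).execute()
    }
}

// Toy context holding the registered providers by name.
struct ExecutionContext {
    tables: HashMap<String, Arc<dyn TableProvider>>,
}

impl ExecutionContext {
    fn new() -> Self {
        ExecutionContext { tables: HashMap::new() }
    }

    fn register_table(&mut self, name: &str, provider: Box<dyn TableProvider>) {
        self.tables.insert(name.to_string(), Arc::from(provider));
    }

    fn table(&self, name: &str) -> Result<DataFrame, String> {
        let provider = self
            .tables
            .get(name)
            .ok_or_else(|| format!("no table named {}", name))?;
        Ok(DataFrame { provider: Arc::clone(provider), projection: None })
    }

    // The proposed shortcut: register under a generated name, then look it up.
    fn read_provider(&mut self, provider: Box<dyn TableProvider>) -> Result<DataFrame, String> {
        let name = format!("provider_{}", self.tables.len());
        self.register_table(&name, provider);
        self.table(&name)
    }
}
```

With these stand-ins, `ctx.read_provider(Box::new(MemorySource { rows }))` hands back a dataframe directly, and `df.select(&[1]).collect()` pushes the column selection down into `scan`. Whether a real implementation should generate a table name or skip registration entirely is a design choice left open here.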
