rdettai commented on pull request #1010: URL: https://github.com/apache/arrow-datafusion/pull/1010#issuecomment-921034340
> I thought, however, we were headed towards a slightly different abstraction where we would still have a ParquetReader that didn't use Path / File directly, but instead would use the ObjectStore abstraction recently added by @yjshen.

Correct, that will be the next step.

> TLDR: I wonder "if DataFusion planning was async would you be able to implement the table format as you would like"?

Yes, that would really bring a huge amount of flexibility. A funny example: I have just added a sketch implementation of the partition pruning algorithm. One interesting approach is to load the partitions into a `RecordBatch` so that the pushed-down filter can be run on it. DataFusion inside DataFusion! But we are stuck because that requires `async`. So many APIs in the Rust ecosystem are async, and we want to be able to use them during planning 😄

I am going to try to make the `TableProvider.scan()` method async, and if it works I'll submit that in a separate PR.
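To make the idea concrete, here is a rough std-only sketch of what an async `scan()` with partition pruning could look like. It is not DataFusion's actual API: `AsyncTableProvider`, `PartitionedTable`, `Partition`, `list_partitions` and `block_on` are all hypothetical names made up for illustration, a plain closure stands in for the pushed-down filter expression, and a `Vec` stands in for the `RecordBatch` of partitions.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical stand-in for one partition of a table.
#[derive(Debug, Clone, PartialEq)]
pub struct Partition {
    pub year: i32,
    pub path: String,
}

/// Boxed future alias, so the trait needs no external crate.
pub type BoxFuture<'a, T> = Pin<Box<dyn Future<Output = T> + 'a>>;

/// Hypothetical async flavour of a table provider: planning may await I/O.
pub trait AsyncTableProvider {
    /// Returns the partitions that survive the pushed-down filter.
    fn scan<'a>(
        &'a self,
        filter: &'a dyn Fn(&Partition) -> bool,
    ) -> BoxFuture<'a, Vec<Partition>>;
}

pub struct PartitionedTable {
    partitions: Vec<Partition>,
}

impl PartitionedTable {
    pub fn new(partitions: Vec<Partition>) -> Self {
        Self { partitions }
    }

    /// Pretends to list partitions asynchronously (think: a remote catalog).
    async fn list_partitions(&self) -> Vec<Partition> {
        self.partitions.clone()
    }
}

impl AsyncTableProvider for PartitionedTable {
    fn scan<'a>(
        &'a self,
        filter: &'a dyn Fn(&Partition) -> bool,
    ) -> BoxFuture<'a, Vec<Partition>> {
        Box::pin(async move {
            // The await is the part a synchronous scan() cannot express.
            let all = self.list_partitions().await;
            // Prune: keep only the partitions matching the filter.
            all.into_iter().filter(|p| filter(p)).collect()
        })
    }
}

/// Waker that does nothing; good enough for a busy-polling demo executor.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Minimal busy-polling executor, for demonstration only.
pub fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}
```

Calling `block_on(table.scan(&|p: &Partition| p.year >= 2021))` would then return only the partitions from 2021 onwards. In real code a proper runtime such as Tokio would drive the future instead of the toy `block_on`.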
