houqp commented on issue #616: URL: https://github.com/apache/arrow-datafusion/issues/616#issuecomment-868253169
Yeah, this will be very useful for datafusion integration with delta-rs. As @Dandandan mentioned earlier on Slack, we need to update ParquetExec to take `datasource::Source` as input instead of path strings. We also need to update `datasource::Source` to make it async-compatible. To make the existing CSV and Parquet table provider implementations more reusable, we should probably extend `datasource::Source` to also handle directory listing, so that we won't need to re-implement the table providers for CSV/JSON/Parquet in each IO extension. `datafusion-s3` can then just provide an S3 `Source` that does object listing, get, and put.
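
To make the proposal concrete, here is a rough sketch of what an async-compatible `Source` with listing, get, and put could look like. Everything in it is an assumption for illustration and not DataFusion's actual API: the shape of the `Source` trait, the `InMemorySource` stand-in for an S3-backed implementation, the `files_with_extension` helper (standing in for format-specific table provider logic), and the tiny `block_on` executor that keeps the example std-only.

```rust
use std::collections::BTreeMap;
use std::future::Future;
use std::pin::Pin;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};

// Boxed future alias so the trait stays object-safe without extra crates.
type BoxFuture<'a, T> = Pin<Box<dyn Future<Output = T> + Send + 'a>>;

/// Hypothetical storage abstraction: listing, get, and put, all async.
trait Source: Send + Sync {
    fn list<'a>(&'a self, prefix: &'a str) -> BoxFuture<'a, Vec<String>>;
    fn get<'a>(&'a self, key: &'a str) -> BoxFuture<'a, Option<Vec<u8>>>;
    fn put<'a>(&'a self, key: &'a str, data: Vec<u8>) -> BoxFuture<'a, ()>;
}

/// In-memory stand-in for what an S3-backed `Source` might provide.
#[derive(Default)]
struct InMemorySource {
    objects: Mutex<BTreeMap<String, Vec<u8>>>,
}

impl Source for InMemorySource {
    fn list<'a>(&'a self, prefix: &'a str) -> BoxFuture<'a, Vec<String>> {
        Box::pin(async move {
            let objects = self.objects.lock().unwrap();
            let keys: Vec<String> = objects
                .keys()
                .filter(|k| k.starts_with(prefix))
                .cloned()
                .collect();
            keys
        })
    }

    fn get<'a>(&'a self, key: &'a str) -> BoxFuture<'a, Option<Vec<u8>>> {
        Box::pin(async move {
            let data = self.objects.lock().unwrap().get(key).cloned();
            data
        })
    }

    fn put<'a>(&'a self, key: &'a str, data: Vec<u8>) -> BoxFuture<'a, ()> {
        Box::pin(async move {
            self.objects.lock().unwrap().insert(key.to_string(), data);
        })
    }
}

/// Format-specific code only talks to `Source`, so the same logic can be
/// reused for local files, S3, or any other backend.
async fn files_with_extension(source: &dyn Source, prefix: &str, ext: &str) -> Vec<String> {
    let mut files = Vec::new();
    for key in source.list(prefix).await {
        if key.ends_with(ext) {
            files.push(key);
        }
    }
    files
}

// Minimal busy-polling executor, only suitable for this demo.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

fn main() {
    let source = InMemorySource::default();
    block_on(source.put("table/part-0.parquet", vec![0u8; 4]));
    block_on(source.put("table/part-1.parquet", vec![0u8; 6]));
    block_on(source.put("table/_SUCCESS", Vec::new()));

    let files = block_on(files_with_extension(&source, "table/", ".parquet"));
    // prints ["table/part-0.parquet", "table/part-1.parquet"]
    println!("{:?}", files);
}
```

With this split, an IO extension would only need to supply its own `Source` implementation, while the CSV/JSON/Parquet logic stays in one place. A real implementation would likely stream listings and object bytes rather than return whole `Vec`s.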