avantgardnerio commented on PR #3955: URL: https://github.com/apache/arrow-datafusion/pull/3955#issuecomment-1290713704
> there's some discussion before

@yahoNanJing I am not familiar with the discussion (or don't recall)... do you have a link to the PR or GitHub issue?

> wondering why it's changed back to async

Because [creating TableProviders](https://github.com/apache/arrow-datafusion/blob/7559c4425e6f32655c6d09e8ed17c9c51896472b/datafusion/core/src/execution/context.rs#L432) may have to be an async operation for providers like Deltalake that need to load their schema from the network.

I looked into the alternative of having two methods on `TableProviderFactory`:

1. `async fn create(url: String)` - what we have now
2. `fn with_schema(url: String, schema: SchemaRef)`

so that in theory, when deserializing `TableProviders`, we could skip the async operation by using the schema, which should have been serialized in the scan.

Unfortunately, it did not look trivial to serialize all the state that a `DeltaTable` sets up during planning, so based on @andygrove's suggestion I switched it to async.
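
For reference, the two-method alternative could look roughly like the sketch below. It is not the DataFusion API: `SchemaRef` is a stand-in alias (a list of column names instead of an Arrow schema), `TableProvider` is a simplified hypothetical struct, and `DemoFactory` and `block_on` are illustrative helpers so the example runs with only the standard library.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Stand-in for Arrow's `SchemaRef` (assumption: just column names here).
type SchemaRef = Arc<Vec<String>>;

/// Hypothetical, simplified table provider: a URL plus its schema.
#[derive(Debug)]
struct TableProvider {
    url: String,
    schema: SchemaRef,
}

/// Sketch of the two-method factory discussed in the comment.
trait TableProviderFactory {
    /// Async path: may need to load the schema from the network.
    fn create(&self, url: String) -> Pin<Box<dyn Future<Output = TableProvider>>>;

    /// Sync path: reuse a schema that was already serialized in the scan.
    fn with_schema(&self, url: String, schema: SchemaRef) -> TableProvider;
}

/// Illustrative factory; a real one would do network I/O in `create`.
struct DemoFactory;

impl TableProviderFactory for DemoFactory {
    fn create(&self, url: String) -> Pin<Box<dyn Future<Output = TableProvider>>> {
        Box::pin(async move {
            // Placeholder for fetching the schema remotely.
            let schema = Arc::new(vec!["id".to_string(), "value".to_string()]);
            TableProvider { url, schema }
        })
    }

    fn with_schema(&self, url: String, schema: SchemaRef) -> TableProvider {
        TableProvider { url, schema }
    }
}

/// Waker that does nothing; enough for futures that never actually suspend.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Minimal busy-polling executor, for demonstration only.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}
```

In this sketch, planning would call `block_on(factory.create(url))` once, and deserialization would call `factory.with_schema(url, schema)` without awaiting anything. The catch noted above still applies: a provider such as `DeltaTable` carries more planning state than just the schema, which this simplified struct does not model.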
