tustvold commented on issue #3777: URL: https://github.com/apache/arrow-datafusion/issues/3777#issuecomment-1329512505
> where the information cannot simply be stored in memory.

Looking at the interface of `SchemaProvider`, the only thing it needs to do is provide access to a `TableProvider` by name; it doesn't need any more information than that. The constraint then becomes what information is needed to construct a `TableProvider`, which boils down to what information a `TableProvider` needs to be able to provide. Currently this is just the schema; there is support for statistics, but I'm not sure it is exploited anywhere.

My question is therefore: **are there use-cases where the number of tables exceeds what can be stored in memory**? If not, I don't see a compelling reason to make `SchemaProvider` async. We could potentially make `TableProvider` methods async to allow deferred loading of metadata, but I think `SchemaProvider` itself can remain sync.
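To make the shape of that split concrete, here is a rough sketch using simplified stand-in traits, not the actual DataFusion API. Every name in it (`LazyTableProvider`, `SyncSchemaProvider`, `RemoteTable`, `MemoryCatalog`, `load_schema`, and the toy `block_on`) is hypothetical and exists only for illustration. It assumes the name-to-table map fits in memory, so lookup is synchronous, while the schema is resolved asynchronously on first use.

```rust
use std::collections::HashMap;
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Local alias for an object-safe boxed future (hypothetical helper).
type BoxFuture<'a, T> = Pin<Box<dyn Future<Output = T> + Send + 'a>>;

// Hypothetical, heavily simplified stand-in for a table schema.
#[derive(Debug, Clone, PartialEq)]
struct Schema {
    columns: Vec<String>,
}

// Stand-in for `TableProvider`: metadata access is async, so it can be deferred.
trait LazyTableProvider: Send + Sync {
    fn schema(&self) -> BoxFuture<'_, Schema>;
}

// Stand-in for `SchemaProvider`: lookup by name stays synchronous.
trait SyncSchemaProvider {
    fn table(&self, name: &str) -> Option<Arc<dyn LazyTableProvider>>;
}

// A table whose schema would live in some external system.
struct RemoteTable {
    location: String,
}

impl LazyTableProvider for RemoteTable {
    fn schema(&self) -> BoxFuture<'_, Schema> {
        Box::pin(async move {
            // Assumption: a real implementation would await a metadata
            // request here; this sketch fabricates a single column name.
            Schema {
                columns: vec![format!("{}_id", self.location)],
            }
        })
    }
}

// The catalog holds only names and cheap handles, all in memory.
struct MemoryCatalog {
    tables: HashMap<String, Arc<dyn LazyTableProvider>>,
}

impl SyncSchemaProvider for MemoryCatalog {
    fn table(&self, name: &str) -> Option<Arc<dyn LazyTableProvider>> {
        self.tables.get(name).cloned()
    }
}

// Toy executor so the sketch needs no external crate. It polls in a loop
// with a no-op waker, which is only adequate for futures that are ready
// immediately, as the one above is.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut fut = std::pin::pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

// Synchronous lookup first, then deferred metadata load.
fn load_schema(catalog: &dyn SyncSchemaProvider, name: &str) -> Option<Schema> {
    catalog.table(name).map(|table| block_on(table.schema()))
}
```

With this shape, registering a table only requires its name and a handle. The cost of fetching metadata is paid when `schema()` is awaited, so the catalog itself never needs to be async unless the set of table names cannot be held in memory.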
