kosiew commented on PR #1247: URL: https://github.com/apache/datafusion-python/pull/1247#issuecomment-3356748530
@timsaucer

> simpler to provide an alternate SchemaProvider which would perform like the default in memory schema provider

Thanks for the suggestion. I prototyped a SchemaProvider hook, but it ran into a hard limitation: the auto-registration logic needs to inspect the caller’s Python frames to discover in-scope variables, which we do today in _lookup_python_object by walking the active stack. During planning we hand control to wait_for_future, which releases the GIL and executes the resolver on Tokio worker threads; by the time DataFusion asks a SchemaProvider for a table, we’re no longer running on the original Python stack, so there’s nothing to inspect from inside the provider. Catching the missing-table error inside SessionContext.sql keeps us on the initiating thread, which is the only place we can reliably enumerate the user’s variables.
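
To make the threading point concrete, here is a minimal pure-Python sketch of why a frame walk only works on the initiating thread. It is an illustration under assumptions, not the PR's code: `lookup_in_caller_frames` and `user_code` are hypothetical names, and a `ThreadPoolExecutor` worker stands in for the Tokio worker threads mentioned above.

```python
import inspect
from concurrent.futures import ThreadPoolExecutor


def lookup_in_caller_frames(name):
    """Walk the current thread's stack outward and return the first binding of `name`.

    Hypothetical stand-in for the frame walk described above; returns None
    when no frame on this thread's stack binds the name.
    """
    frame = inspect.currentframe().f_back  # start at whoever called us
    while frame is not None:
        if name in frame.f_locals:
            return frame.f_locals[name]
        if name in frame.f_globals:
            return frame.f_globals[name]
        frame = frame.f_back
    return None


def user_code():
    my_table = {"rows": 3}  # a local the user expects a query to resolve

    # Same thread: the walk reaches user_code's frame and finds the local.
    found_here = lookup_in_caller_frames("my_table")

    # Worker thread (assumed analogue of a Tokio worker): its stack begins at
    # the thread's entry point, so user_code's frame is not on it.
    with ThreadPoolExecutor(max_workers=1) as pool:
        found_on_worker = pool.submit(lookup_in_caller_frames, "my_table").result()

    return found_here, found_on_worker
```

Calling `user_code()` should find the variable on the calling thread and come back empty from the worker, which mirrors the reason for resolving the missing table inside SessionContext.sql instead of inside a provider.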
