westonpace commented on PR #13800: URL: https://github.com/apache/datafusion/pull/13800#issuecomment-2568592701
@alamb I've rebased and updated the example. I think the only remaining issue is your comment here:

> When trying this API out I didn't fully understand this API (or what i should return) -- maybe if it is optional / has an advanced usecase we could provide a default implementation

The basic problem, I think, is that since we are evaluating the table references directly ourselves, we have to figure out which table references make sense for the schema provider. In the sync case this is not necessary because you do this:

```
ctx.catalog("my_catalog").unwrap().register_schema("my_schema", Arc::clone(&remote_schema))?;
```

As a result, by the time the query planner is even using your table provider, it has already done the lookup into `my_catalog` and `my_schema`. We, however, can't rely on that, because we're working with the table providers themselves. Or, to put it another way, I am replacing this part of your example:

```
let table_names = references.iter().filter_map(|r| {
    if refers_to_schema("datafusion", "remote_schema", r) {
        Some(r.table())
    } else {
        None
    }
});
```

I need to filter down the references to figure out which ones apply to the schema provider before I send the request out to the remote catalog (to allow for the possibility that other requests are handled elsewhere).

The reason I don't exactly love my current solution is that I think these methods can eventually go away. If we add `register_async_catalog` / `register_async_schema` methods to the `SessionContext` and move the caching inside there, then we can rely on the same lookup mechanism that exists there. Still, I don't think these methods are onerous for the implementer, and I would rather just make progress.
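To make the filtering step concrete, here is a rough, self-contained sketch. It does not use DataFusion's real types: `TableRef`, `DEFAULT_CATALOG`, `DEFAULT_SCHEMA`, and `tables_for_schema` are hypothetical stand-ins, and falling back to a default catalog and schema for unqualified references is an assumption about how a helper like `refers_to_schema` might behave.

```rust
/// Simplified, hypothetical stand-in for a table reference
/// (bare `t`, partial `s.t`, or full `c.s.t`). Not DataFusion's type.
#[derive(Debug, Clone, PartialEq)]
enum TableRef {
    Bare { table: String },
    Partial { schema: String, table: String },
    Full { catalog: String, schema: String, table: String },
}

impl TableRef {
    /// The unqualified table name, whatever the qualification level.
    fn table(&self) -> &str {
        match self {
            TableRef::Bare { table }
            | TableRef::Partial { table, .. }
            | TableRef::Full { table, .. } => table.as_str(),
        }
    }
}

// Assumed session defaults used to resolve unqualified references.
const DEFAULT_CATALOG: &str = "datafusion";
const DEFAULT_SCHEMA: &str = "public";

/// Returns true if `r` resolves to `catalog.schema` once the assumed
/// defaults are filled in for any missing qualifiers.
fn refers_to_schema(catalog: &str, schema: &str, r: &TableRef) -> bool {
    let (ref_catalog, ref_schema) = match r {
        TableRef::Bare { .. } => (DEFAULT_CATALOG, DEFAULT_SCHEMA),
        TableRef::Partial { schema, .. } => (DEFAULT_CATALOG, schema.as_str()),
        TableRef::Full { catalog, schema, .. } => (catalog.as_str(), schema.as_str()),
    };
    ref_catalog == catalog && ref_schema == schema
}

/// Keep only the table names that the given schema provider should resolve,
/// so that only those are sent to the remote catalog.
fn tables_for_schema<'a>(
    catalog: &str,
    schema: &str,
    references: &'a [TableRef],
) -> Vec<&'a str> {
    references
        .iter()
        .filter_map(|r| {
            if refers_to_schema(catalog, schema, r) {
                Some(r.table())
            } else {
                None
            }
        })
        .collect()
}
```

With these assumptions, a reference such as `remote_schema.b` or `datafusion.remote_schema.c` would be kept for the `datafusion` / `remote_schema` provider, while a bare `a` or `other.remote_schema.d` would be left for something else to handle.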