westonpace commented on PR #13800:
URL: https://github.com/apache/datafusion/pull/13800#issuecomment-2568592701
@alamb I've rebased and updated the example. I think the only remaining
issue is your comment here:
> When trying this API out I didn't fully understand this API (or what i
> should return) -- maybe if it is optional / has an advanced usecase we could
> provide a default implementation
The basic problem, I think, is that since we are evaluating the table
references directly ourselves, we have to figure out which table references make
sense for the schema provider. In the sync case this is not necessary because
you do this:
```
ctx.catalog("my_catalog")
    .unwrap()
    .register_schema("my_schema", Arc::clone(&remote_schema))?;
```
As a result, by the time the query planner even uses your table
provider, it has already done the lookup into `my_catalog` and `my_schema`.
We, however, can't rely on that, because we're working with the table providers
themselves. Or, to put it another way, I am replacing this part of your
example:
```
let table_names = references.iter().filter_map(|r| {
    if refers_to_schema("datafusion", "remote_schema", r) {
        Some(r.table())
    } else {
        None
    }
});
```
I need to filter down the references to figure out which ones apply to the
schema provider before I send the request out to the remote catalog (to allow
the possibility that other requests are handled elsewhere).
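To make the filtering step concrete, here is a rough, self-contained sketch of what such a check might look like. Everything in it is an assumption for illustration: `TableRef` is a hypothetical stand-in for DataFusion's `TableReference` (so the snippet compiles without the crate), this `refers_to_schema` takes two extra default-name parameters that the helper in the example above does not, and I'm assuming bare and partial references fall back to a default catalog / schema of `datafusion` / `public`.

```rust
/// Hypothetical stand-in for DataFusion's `TableReference`.
#[derive(Debug, Clone, PartialEq)]
pub enum TableRef {
    Bare { table: String },
    Partial { schema: String, table: String },
    Full { catalog: String, schema: String, table: String },
}

impl TableRef {
    /// The table-name part of the reference.
    pub fn table(&self) -> &str {
        match self {
            TableRef::Bare { table }
            | TableRef::Partial { table, .. }
            | TableRef::Full { table, .. } => table.as_str(),
        }
    }
}

/// Returns true if `r` points at `catalog`.`schema`.
/// Assumption: missing parts of the reference fall back to the defaults.
pub fn refers_to_schema(
    catalog: &str,
    schema: &str,
    default_catalog: &str,
    default_schema: &str,
    r: &TableRef,
) -> bool {
    let (ref_catalog, ref_schema) = match r {
        TableRef::Bare { .. } => (default_catalog, default_schema),
        TableRef::Partial { schema: s, .. } => (default_catalog, s.as_str()),
        TableRef::Full { catalog: c, schema: s, .. } => (c.as_str(), s.as_str()),
    };
    ref_catalog == catalog && ref_schema == schema
}

/// Keeps only the table names that the remote schema provider should handle.
pub fn remote_table_names<'a>(references: &'a [TableRef]) -> Vec<&'a str> {
    references
        .iter()
        .filter_map(|r| {
            // Assumed defaults: catalog "datafusion", schema "public".
            if refers_to_schema("datafusion", "remote_schema", "datafusion", "public", r) {
                Some(r.table())
            } else {
                None
            }
        })
        .collect()
}
```

With this sketch, a reference like `datafusion.remote_schema.a` or `remote_schema.b` would be sent to the remote catalog, while a bare `c` (which resolves to the default schema) or `other.remote_schema.d` would be left for something else to handle.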
The reason I don't exactly love my current solution is that I think these
methods can eventually go away. If we add `register_async_catalog` /
`register_async_schema` methods to the `SessionContext` and move the caching
inside there, then we can rely on the same lookup mechanism that exists there.
Still, I don't think these methods are onerous for the implementer and would
rather just make progress.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]