roeap opened a new issue, #6350:
URL: https://github.com/apache/arrow-datafusion/issues/6350

   ### Is your feature request related to a problem or challenge?
   
   When trying to integrate Unity catalog with delta-rs, which in turn means 
integrating it with datafusion, we are facing some challenges making API calls 
to the catalog in some of the trait functions.
   
   Essentially, only the `table` function on `SchemaProvider` is async whereas 
all other functions on the various providers are synchronous.
   
   Looking at other implementations for remote catalogs out there, it seems most 
are resorting to downloading the data beforehand and then essentially using the 
built-in `Memory*` implementations. While technically feasible, this poses some 
other challenges, as the number of registered tables in a catalog may get quite 
large and also change relatively frequently.
   
   Thus I was wondering if there are alternative best practices out there or if 
it would be feasible to make some more of the trait functions async.
   
   ### Describe the solution you'd like
   
   Having more functions on the `*Provider` traits async, with priority 
increasing as you move down the tree - i.e. pre-loading all catalogs hurts less 
than pre-loading all schemas, which hurts less than loading all tables ...
   
   ### Describe alternatives you've considered
   
   * Preloading all data but the table details.
   * Working around the sync signatures by calling async code from sync code 
while a runtime is already running.
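
   To make the first alternative concrete, here is a minimal std-only sketch of 
"preload the names, load table details on demand". Everything in it is 
hypothetical and for illustration only: `RemoteCatalog`, `InMemoryCatalog` and 
`LazySchema` are not DataFusion or delta-rs types, table details are modelled 
as plain strings, and the `table` lookup is kept synchronous here so the 
example needs no async runtime.

```rust
use std::collections::{BTreeSet, HashMap};
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;

/// Hypothetical stand-in for a remote catalog client (not a real API).
pub trait RemoteCatalog {
    /// One listing call that returns every table name in a schema.
    fn list_table_names(&self, schema: &str) -> Vec<String>;
    /// One call per table; assumed to be the expensive part.
    fn fetch_table_details(&self, schema: &str, table: &str) -> Option<String>;
}

/// In-memory fake that counts detail fetches, for demonstration only.
pub struct InMemoryCatalog {
    tables: Vec<String>,
    fetches: AtomicUsize,
}

impl InMemoryCatalog {
    pub fn new(tables: &[&str]) -> Self {
        Self {
            tables: tables.iter().map(|t| t.to_string()).collect(),
            fetches: AtomicUsize::new(0),
        }
    }

    pub fn fetch_count(&self) -> usize {
        self.fetches.load(Ordering::SeqCst)
    }
}

impl RemoteCatalog for InMemoryCatalog {
    fn list_table_names(&self, _schema: &str) -> Vec<String> {
        self.tables.clone()
    }

    fn fetch_table_details(&self, schema: &str, table: &str) -> Option<String> {
        self.fetches.fetch_add(1, Ordering::SeqCst);
        self.tables
            .iter()
            .any(|t| t.as_str() == table)
            .then(|| format!("details for {schema}.{table}"))
    }
}

/// Table names are preloaded once; details are fetched on first use and cached.
pub struct LazySchema<C: RemoteCatalog> {
    client: C,
    schema: String,
    names: BTreeSet<String>,
    details: Mutex<HashMap<String, String>>,
}

impl<C: RemoteCatalog> LazySchema<C> {
    /// Performs the single up-front listing call.
    pub fn preload(client: C, schema: &str) -> Self {
        let names = client.list_table_names(schema).into_iter().collect();
        Self {
            client,
            schema: schema.to_string(),
            names,
            details: Mutex::new(HashMap::new()),
        }
    }

    pub fn client(&self) -> &C {
        &self.client
    }

    /// Answered from the preloaded snapshot, so it can stay synchronous.
    pub fn table_names(&self) -> Vec<String> {
        self.names.iter().cloned().collect()
    }

    /// Also answered from the snapshot, without a remote call.
    pub fn table_exist(&self, name: &str) -> bool {
        self.names.contains(name)
    }

    /// Fetches details lazily. This is the lookup that would be async in practice.
    pub fn table(&self, name: &str) -> Option<String> {
        if !self.names.contains(name) {
            return None;
        }
        let mut cache = self.details.lock().unwrap();
        if let Some(found) = cache.get(name) {
            return Some(found.clone());
        }
        let fetched = self.client.fetch_table_details(&self.schema, name)?;
        cache.insert(name.to_string(), fetched.clone());
        Some(fetched)
    }
}
```

   The sketch also shows the drawback described above: the name snapshot is 
taken once in `preload`, so it goes stale when tables are added or dropped 
remotely, and the listing call itself still grows with the size of the catalog.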
   
   ### Additional context
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
