matthewgapp opened a new issue, #10889:
URL: https://github.com/apache/datafusion/issues/10889

   ### Is your feature request related to a problem or challenge?
   
   Accessing a `TableProvider`'s schema is a sync function call. This means that 
the `TableProvider` must already know its schema by the time it is constructed. 
   
   DataFusion recently introduced `TableFunctionImpl`, which allows users to 
define a function that creates a `TableProvider`. Unfortunately, its `call` 
method is sync, meaning that the user-defined table function must obtain its 
schema synchronously, without awaiting any I/O. This isn't possible for 
`TableProvider`s that infer their schema asynchronously, such as an HTTP connector 
that can connect to arbitrary sources whose payloads are only known once the 
response is streaming in. 
   
   ### Describe the solution you'd like
   
   I propose we make the `call` method async to allow for async schemas and 
thus async table provider construction. 
   
   Current code:
   ```rust
   use super::TableProvider;
   
   use datafusion_common::Result;
   use datafusion_expr::Expr;
   
   use std::sync::Arc;
   
   /// A trait for table function implementations
   pub trait TableFunctionImpl: Sync + Send {
       /// Create a table provider
       fn call(&self, args: &[Expr]) -> Result<Arc<dyn TableProvider>>;
   }
   
   /// A table that uses a function to generate data
   pub struct TableFunction {
       /// Name of the table function
       name: String,
       /// Function implementation
       fun: Arc<dyn TableFunctionImpl>,
   }
   
   impl TableFunction {
       /// Create a new table function
       pub fn new(name: String, fun: Arc<dyn TableFunctionImpl>) -> Self {
           Self { name, fun }
       }
   
       /// Get the name of the table function
       pub fn name(&self) -> &str {
           &self.name
       }
   
       /// Get the function implementation and generate a table
       pub fn create_table_provider(&self, args: &[Expr]) -> Result<Arc<dyn TableProvider>> {
           self.fun.call(args)
       }
   }
   ```
   
   where the new trait would look something like:
   
   ```rust
   /// A trait for table function implementations
   pub trait TableFunctionImpl: Sync + Send {
       /// Create a table provider
       async fn call(&self, args: &[Expr]) -> Result<Arc<dyn TableProvider>>;
   }
   ```
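
   One caveat with the signature above: a trait containing a plain `async fn` 
cannot be used as a trait object, and `TableFunction` stores its implementation 
as `Arc<dyn TableFunctionImpl>`. A common way around this is the `async_trait` 
crate, or returning a boxed future by hand. The following is a minimal, 
std-only sketch of the boxed-future variant. `Expr`, `TableProvider`, and 
`Result` here are simplified stand-ins for the DataFusion types, and 
`BoxFuture`, `HttpTable`, `HttpTableFunction`, `block_on`, and `infer_columns` 
are hypothetical names used only for illustration.

   ```rust
   use std::future::Future;
   use std::pin::Pin;
   use std::sync::Arc;
   use std::task::{Context, Poll, Wake, Waker};

   // Simplified stand-ins for the DataFusion types (assumptions, not the real API).
   pub struct Expr;
   pub type Result<T> = std::result::Result<T, String>;
   pub trait TableProvider: Send + Sync {
       /// Column names stand in for a real schema here.
       fn schema(&self) -> Vec<String>;
   }

   /// A boxed, sendable future; this keeps the trait usable as `dyn TableFunctionImpl`.
   pub type BoxFuture<'a, T> = Pin<Box<dyn Future<Output = T> + Send + 'a>>;

   /// A trait for table function implementations (async variant, sketch only)
   pub trait TableFunctionImpl: Sync + Send {
       /// Create a table provider, possibly after awaiting I/O
       fn call<'a>(&'a self, args: &'a [Expr]) -> BoxFuture<'a, Result<Arc<dyn TableProvider>>>;
   }

   /// Hypothetical provider whose columns are only known after a response arrives.
   struct HttpTable {
       columns: Vec<String>,
   }

   impl TableProvider for HttpTable {
       fn schema(&self) -> Vec<String> {
           self.columns.clone()
       }
   }

   /// Hypothetical table function that infers its schema asynchronously.
   pub struct HttpTableFunction;

   impl TableFunctionImpl for HttpTableFunction {
       fn call<'a>(&'a self, _args: &'a [Expr]) -> BoxFuture<'a, Result<Arc<dyn TableProvider>>> {
           Box::pin(async move {
               // A real connector would await the HTTP response here; the
               // columns are hard-coded to keep the sketch self-contained.
               let columns = vec!["id".to_string(), "name".to_string()];
               let provider: Arc<dyn TableProvider> = Arc::new(HttpTable { columns });
               let out: Result<Arc<dyn TableProvider>> = Ok(provider);
               out
           })
       }
   }

   /// Waker that does nothing; sufficient for the busy-polling executor below.
   struct NoopWaker;

   impl Wake for NoopWaker {
       fn wake(self: Arc<Self>) {}
   }

   /// Tiny busy-polling executor for this example only; real code would use
   /// an async runtime instead.
   pub fn block_on<F: Future>(fut: F) -> F::Output {
       let mut fut = Box::pin(fut);
       let waker = Waker::from(Arc::new(NoopWaker));
       let mut cx = Context::from_waker(&waker);
       loop {
           if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
               return out;
           }
           std::thread::yield_now();
       }
   }

   /// Drive the async `call` through a trait object and read back the columns.
   pub fn infer_columns() -> Result<Vec<String>> {
       let fun: Arc<dyn TableFunctionImpl> = Arc::new(HttpTableFunction);
       let provider = block_on(fun.call(&[]))?;
       Ok(provider.schema())
   }
   ```

   With this shape, `create_table_provider` would presumably also need to become 
async (or return the same boxed future), so its callers would have to be able to 
await it.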
   
   
   ### Describe alternatives you've considered
   
   I've worked around this by creating the table outside of DataFusion, but I 
would prefer to use table functions to achieve the same thing. 
   
   ### Additional context
   
   _No response_

