a-agmon commented on issue #16303:
URL: https://github.com/apache/datafusion/issues/16303#issuecomment-2953899593

   I gave it a shot, but it ended up being somewhat messy. That's mostly 
because `TableFunctionImpl::call()` is synchronous, yet it also has to get 
hold of the schema of the data, which in the case of remote blobs (like S3) 
requires IO, and doing that right requires async. 
   I tried to work around this by having `call()` create a `TableProvider` 
that initially reports an empty schema. This satisfies the planner's 
synchronous API. The actual schema discovery is deferred until the `scan()` 
method is called during the asynchronous execution phase. But this creates an 
issue with projections that need to be validated against the schema, e.g. 
`select X from read_csv(some-glob-pattern)` does not work, though `select * 
from read_csv(some-glob-pattern)` will.
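
   To make the problem concrete, here is a minimal std-only sketch of the 
workaround. It is a simplified model, not DataFusion's real API: `Schema`, 
`LazyCsvTable`, `plan_projection` and the hard-coded columns are all 
hypothetical stand-ins, and the planner check is only an assumption about how 
a named column gets validated against the schema reported at planning time.

```rust
/// Hypothetical stand-in for a table schema: just a list of column names.
#[derive(Debug, Clone, PartialEq)]
struct Schema {
    columns: Vec<String>,
}

/// Hypothetical stand-in for a provider whose schema is unknown while planning.
struct LazyCsvTable {
    glob: String,
    /// Schema reported to the planner; empty because nothing has been read yet.
    planning_schema: Schema,
}

/// Models the synchronous `call()`: no IO is possible here, so the provider
/// is created with an empty schema.
fn call(glob: &str) -> LazyCsvTable {
    LazyCsvTable {
        glob: glob.to_string(),
        planning_schema: Schema { columns: Vec::new() },
    }
}

/// Models planner-side validation. `None` means `select *`, which names no
/// columns and therefore has nothing to check against the empty schema.
fn plan_projection(table: &LazyCsvTable, projection: Option<&[&str]>) -> Result<(), String> {
    if let Some(cols) = projection {
        for col in cols {
            if !table.planning_schema.columns.iter().any(|known| known == col) {
                return Err(format!("column '{}' not found in schema", col));
            }
        }
    }
    Ok(())
}

/// Models `scan()`: only here would the real (async) schema discovery happen.
/// The columns are hard-coded purely for illustration.
fn scan(table: &LazyCsvTable) -> Schema {
    let _files_matching = &table.glob; // real code would list and read these
    Schema {
        columns: vec!["X".to_string(), "Y".to_string()],
    }
}
```

   In this model, `plan_projection(&call("some-glob-pattern"), None)` passes, 
while asking for column `X` is rejected before `scan()` ever gets a chance to 
discover that `X` exists.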
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

