rdettai edited a comment on issue #349:
URL: https://github.com/apache/arrow-datafusion/issues/349#issuecomment-912419254


   @andygrove since the client only handles the logical plan, I don't think it
needs to know the list of files or the statistics; it only needs the schema:
   - with the current DataFusion implementation, we could build a table
provider without any statistics on the client, and then load the statistics
once the logical plan is deserialized on the scheduler (cost-based
optimizations would be ineffective on the client, but that is not a big issue
because we could run them on the scheduler instead)
   - in #962 I am proposing a change that would move the statistics entirely
from the logical plan to the physical plan
   
   As Flight already has an endpoint for querying the schema, this approach
would avoid creating a new one 😃
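
To make the first option concrete, here is a minimal sketch of the split between client and scheduler. Every name in it (`Schema`, `Statistics`, `TableScan`, `client_build_scan`, `scheduler_load_statistics`) is a hypothetical stand-in and not a DataFusion or Ballista type, and the serialization step between the two sides is left out. It only illustrates the idea that the client builds the scan from the schema alone and the scheduler fills in the statistics afterwards.

```rust
// Hypothetical stand-ins for illustration only; NOT DataFusion or Ballista types.
#[derive(Debug, Clone, PartialEq)]
struct Schema {
    // (column name, type name)
    fields: Vec<(String, String)>,
}

#[derive(Debug, Clone, PartialEq)]
struct Statistics {
    num_rows: usize,
    total_byte_size: usize,
}

// Scan node as it would travel inside the logical plan.
#[derive(Debug, Clone, PartialEq)]
struct TableScan {
    table_name: String,
    schema: Schema,
    // None on the client; filled in on the scheduler.
    statistics: Option<Statistics>,
}

// Client side: only the schema is needed to build the scan node.
fn client_build_scan(table_name: &str, schema: Schema) -> TableScan {
    TableScan {
        table_name: table_name.to_string(),
        schema,
        statistics: None,
    }
}

// Scheduler side: once the plan is deserialized, look up the statistics
// and attach them, so cost-based optimizations can run here.
fn scheduler_load_statistics(
    scan: TableScan,
    lookup: impl Fn(&str) -> Option<Statistics>,
) -> TableScan {
    let statistics = lookup(&scan.table_name);
    TableScan { statistics, ..scan }
}
```

With this shape, the client never touches file listings or statistics, and the schema it needs could come from whatever endpoint already serves it.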



