rdettai edited a comment on issue #349: URL: https://github.com/apache/arrow-datafusion/issues/349#issuecomment-912419254
@andygrove since the client only handles the logical plan, I don't think it needs to know the list of files or the statistics; it only needs the schema:
- with the current DataFusion implementation, we could build a table provider without any statistics on the client, then load the statistics once the logical plan is deserialized on the scheduler (cost-based optimizations would be ineffective on the client, but that is not a big issue as we could run them on the scheduler instead)
- in #962 I am proposing a change that would move the statistics completely from the logical plan to the physical plan

As Flight already has an endpoint to query the schema, this would avoid creating a new one 😃
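
To make the first option concrete, here is a minimal sketch of the idea, not DataFusion's actual API. `Field`, `Statistics`, `TableDescriptor`, `schema_only` and `with_statistics` are hypothetical names invented for illustration, and the `HashMap` stands in for whatever source of statistics the scheduler would really use. The client builds a table description from the schema alone, and the scheduler attaches statistics after the plan is deserialized:

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for an Arrow schema field.
#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: String,
}

/// Hypothetical table statistics; `None` means "unknown".
#[derive(Debug, Clone, PartialEq)]
struct Statistics {
    num_rows: Option<usize>,
    total_byte_size: Option<usize>,
}

/// Hypothetical table description carried inside the logical plan.
#[derive(Debug, Clone, PartialEq)]
struct TableDescriptor {
    name: String,
    schema: Vec<Field>,
    /// Empty on the client, filled in on the scheduler.
    statistics: Option<Statistics>,
}

impl TableDescriptor {
    /// Client side: only the schema is known (e.g. fetched from a schema endpoint).
    fn schema_only(name: &str, schema: Vec<Field>) -> Self {
        TableDescriptor {
            name: name.to_string(),
            schema,
            statistics: None,
        }
    }

    /// Scheduler side: look up statistics by table name after deserialization.
    /// If the table is missing from `catalog`, statistics stay unknown.
    fn with_statistics(mut self, catalog: &HashMap<String, Statistics>) -> Self {
        self.statistics = catalog.get(&self.name).cloned();
        self
    }
}

fn main() {
    // Client: build the descriptor without listing files or reading statistics.
    let schema = vec![Field {
        name: "id".to_string(),
        data_type: "Int64".to_string(),
    }];
    let client_side = TableDescriptor::schema_only("events", schema);
    assert!(client_side.statistics.is_none());

    // Scheduler: statistics become available only here.
    let mut catalog = HashMap::new();
    catalog.insert(
        "events".to_string(),
        Statistics {
            num_rows: Some(1_000),
            total_byte_size: None,
        },
    );
    let scheduler_side = client_side.with_statistics(&catalog);
    println!("{:?}", scheduler_side);
}
```

In this sketch the schema travels with the plan unchanged, so any optimization that needs statistics would have to run after `with_statistics`, which matches running cost-based optimizations on the scheduler.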
