alamb opened a new issue, #8698: URL: https://github.com/apache/arrow-datafusion/issues/8698
### Is your feature request related to a problem or challenge?

A report from Twitter https://twitter.com/mim_djo/status/1740542585410814393 says:

> a new release of #datafusion 34, still reading #Deltatable via arrow is suboptimal compared to reading Parquet Directly :( something to do with passing stats to get correct join orders.

I think the issue is that https://github.com/apache/arrow-datafusion/issues/7949 and https://github.com/apache/arrow-datafusion/issues/7950 rely on statistics to avoid bad join orders for TPCH queries. These statistics do not seem to be available from the delta provider.

@andygrove says

> RelCommon (common to all operators in Substrait) can contain a hint that has stats

```
message Stats {
  double row_count = 1;
  double record_size = 2;
  substrait.extensions.AdvancedExtension advanced_extension = 10;
}
```

### Describe the solution you'd like

I would like the DataFusion Substrait consumer/producer to handle translating

### Describe alternatives you've considered

_No response_

### Additional context

This was brought up by @Dandandan on the ASF slack: https://the-asf.slack.com/archives/C04RJ0C85UZ/p1703885214702039
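To make the request concrete, here is a minimal sketch of how the `Stats` hint could map to planner statistics and back. It is an illustration only: `SubstraitStats` and `PlanStatistics` are hypothetical local types, not the generated Substrait bindings or DataFusion's own `Statistics` type, and the helper names are made up. It also assumes `record_size` means the average number of bytes per record, which the quoted message does not state.

```rust
/// Hypothetical local mirror of the Substrait `Stats` hint quoted above
/// (the `advanced_extension` field is omitted for brevity).
#[derive(Debug, Clone, PartialEq)]
struct SubstraitStats {
    row_count: f64,
    record_size: f64,
}

/// Simplified stand-in for a planner's table statistics.
/// This is NOT DataFusion's actual `Statistics` type.
#[derive(Debug, Clone, PartialEq, Default)]
struct PlanStatistics {
    num_rows: Option<usize>,
    total_byte_size: Option<usize>,
}

/// Convert a floating-point estimate to a count, rejecting NaN,
/// infinities, and negative values.
fn to_count(value: f64) -> Option<usize> {
    if value.is_finite() && value >= 0.0 {
        Some(value.round() as usize)
    } else {
        None
    }
}

/// Consumer direction: turn an optional hint into planner statistics.
/// A missing hint yields "unknown" statistics rather than zeros.
fn stats_from_hint(hint: Option<&SubstraitStats>) -> PlanStatistics {
    let Some(stats) = hint else {
        return PlanStatistics::default();
    };
    PlanStatistics {
        num_rows: to_count(stats.row_count),
        // Assumption: record_size is the average bytes per record.
        total_byte_size: to_count(stats.row_count * stats.record_size),
    }
}

/// Producer direction: emit a hint only when the row count is known.
fn hint_from_stats(stats: &PlanStatistics) -> Option<SubstraitStats> {
    let rows = stats.num_rows?;
    let record_size = match stats.total_byte_size {
        Some(bytes) if rows > 0 => bytes as f64 / rows as f64,
        _ => 0.0, // size unknown, or no rows to average over
    };
    Some(SubstraitStats {
        row_count: rows as f64,
        record_size,
    })
}
```

Keeping unknown values as `None` instead of zero matters here, since a join-order heuristic would otherwise treat a table without a hint as empty.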
