alamb opened a new issue, #14936:
URL: https://github.com/apache/datafusion/issues/14936

   ### Describe the bug
   
   As @blaginin found in https://github.com/apache/datafusion/pull/14685, the 
statistics reported when a file is projected (that is, when only a subset of 
the columns is read) are incorrect.
   
   Specifically, the projected statistics have the same `total_byte_size` as 
the input. However, because only a subset of the columns is selected, the 
projected `total_byte_size` should be lower than the input's.
   
   ### To Reproduce
   
   See the tests referenced in https://github.com/apache/datafusion/pull/14685
   
   ### Expected behavior
   
   `total_byte_size` should take into account the subset of columns
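
To illustrate the expected behavior, here is a minimal, self-contained sketch. It does not use DataFusion's actual `Statistics` type or API: `TableStats`, `column_byte_sizes`, and `project_stats` are hypothetical names, and the sketch assumes a per-column byte estimate is available, which may not hold in practice.

```rust
/// Hypothetical stand-in for table-level statistics.
/// This is NOT DataFusion's `Statistics` type.
#[derive(Debug, Clone, PartialEq)]
struct TableStats {
    num_rows: Option<usize>,
    total_byte_size: Option<usize>,
    /// Assumed per-column byte estimates; `None` means unknown.
    column_byte_sizes: Vec<Option<usize>>,
}

/// Project `stats` onto the columns at the indices in `projection`.
/// Panics if an index is out of bounds (acceptable for a sketch).
fn project_stats(stats: &TableStats, projection: &[usize]) -> TableStats {
    let column_byte_sizes: Vec<Option<usize>> = projection
        .iter()
        .map(|&i| stats.column_byte_sizes[i])
        .collect();

    // The projected size is the sum over the selected columns only.
    // Summing `Option`s yields `None` if any selected column is unknown.
    let total_byte_size: Option<usize> = column_byte_sizes.iter().copied().sum();

    TableStats {
        // A projection drops columns, not rows.
        num_rows: stats.num_rows,
        total_byte_size,
        column_byte_sizes,
    }
}
```

With an input of two columns estimated at 400 and 800 bytes, projecting only the first column would report a `total_byte_size` of 400 rather than the input's 1200. If any selected column's size is unknown, the sketch reports the total as unknown instead of reusing the input value.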
   
   ### Additional context
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

