alamb opened a new issue, #8335:
URL: https://github.com/apache/arrow-datafusion/issues/8335

   ### Describe the bug
   
   If a parquet file has a nested field with the same name as a top-level 
field, the DataFusion parquet reader reads statistics for the nested field 
instead of the top-level field.
   
   
   
   ### To Reproduce
   
   I added a test, `struct_and_non_struct`, in 
https://github.com/apache/arrow-datafusion/pull/8294
   
   Basically it shows:
   
   ```
   struct_field: struct { 
     int_field: int32,     <-- 'struct_field.int_field'
   }
   int_field: int32        <-- just 'int_field'
   ```
   
   If you then ask for statistics on `int_field` (e.g. by filtering with a 
predicate on it), the DataFusion reader produces the statistics for 
`struct_field.int_field` rather than for the top-level `int_field`.
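
A minimal sketch of how this kind of mix-up can arise, assuming the lookup matches on the leaf name alone. This is an illustration only, not the actual DataFusion code path: `LeafColumn`, `example_leaves`, `find_by_leaf_name`, and `find_top_level` are hypothetical names, and the model uses plain vectors rather than the `parquet` crate's schema types.

```rust
/// A leaf column, identified by its full path from the schema root.
/// Hypothetical stand-in for a real Parquet schema descriptor.
struct LeafColumn {
    path: Vec<&'static str>,
}

/// Leaf columns of the schema above, in schema order.
fn example_leaves() -> Vec<LeafColumn> {
    vec![
        LeafColumn { path: vec!["struct_field", "int_field"] },
        LeafColumn { path: vec!["int_field"] },
    ]
}

/// Ambiguous lookup: compares only the last path component, so a
/// nested leaf that appears earlier in the schema shadows the
/// top-level column of the same name.
fn find_by_leaf_name(leaves: &[LeafColumn], name: &str) -> Option<usize> {
    leaves
        .iter()
        .position(|c| c.path.last().map_or(false, |last| *last == name))
}

/// Unambiguous lookup for a top-level column: the whole path must be
/// exactly the requested name.
fn find_top_level(leaves: &[LeafColumn], name: &str) -> Option<usize> {
    leaves
        .iter()
        .position(|c| c.path.len() == 1 && c.path[0] == name)
}
```

With this model, `find_by_leaf_name` resolves `int_field` to leaf index 0 (`struct_field.int_field`), while `find_top_level` resolves it to index 1, the column whose statistics were actually requested.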
   
   ### Expected behavior
   
   _No response_
   
   ### Additional context
   
   _No response_

