nantunes opened a new pull request, #15840:
URL: https://github.com/apache/datafusion/pull/15840

   ## Which issue does this PR close?
   
   - Fixes #15839
   
   ## Rationale for this change
   
   When an Avro file is queried with a projection that lists columns in a 
different order than the file's schema, the reader fails with a type mismatch 
error. The Avro reader did not account for column order in projections, so the 
decoded data types did not line up with the expected schema order.
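
As a rough illustration of how such a mismatch can arise (not the actual DataFusion code), the sketch below pairs columns decoded in file order with a schema written in projection order. `DataType` and `first_mismatch` are hypothetical names introduced only for this example.

```rust
/// Hypothetical stand-in for a column's data type (not the Arrow type).
#[derive(Debug, Clone, Copy, PartialEq)]
enum DataType {
    Int64,
    Utf8,
}

/// Returns the first position at which a decoded column's type disagrees
/// with the type the schema expects at that position, if any.
fn first_mismatch(schema: &[(&str, DataType)], decoded: &[DataType]) -> Option<usize> {
    schema
        .iter()
        .zip(decoded)
        .position(|((_, expected), actual)| expected != actual)
}
```

With a file schema of `(id: Int64, name: Utf8)` and a projection of `[name, id]`, columns decoded in file order disagree with the projected schema at position 0, which is the kind of type mismatch described above.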
   
   ## What changes are included in this PR?
   
   1. Creating a projected schema upfront in `Reader::try_new`
      - The schema now includes only the projected fields in the specified order
      - This projected schema is passed to the array reader
   
   2. Moving projection handling from `AvroArrowArrayReader` to the `Reader` struct
      - Removed the `projection` field from `AvroArrowArrayReader`
      - Removed projection filtering logic from `build_struct_array`
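
The idea of building the projected schema upfront can be sketched as follows. This is a minimal, standard-library-only illustration, not the code in this PR: `Field` and `project_schema` are hypothetical names, and it assumes the projection is given as a list of column names.

```rust
/// Hypothetical, simplified field: a column name plus a type label.
#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: String,
}

impl Field {
    fn new(name: &str, data_type: &str) -> Self {
        Field { name: name.to_string(), data_type: data_type.to_string() }
    }
}

/// Builds the projected schema: only the requested fields, in the order the
/// projection lists them. With no projection, the full schema is returned.
fn project_schema(fields: &[Field], projection: Option<&[&str]>) -> Result<Vec<Field>, String> {
    match projection {
        None => Ok(fields.to_vec()),
        Some(names) => names
            .iter()
            .map(|name| {
                fields
                    .iter()
                    .find(|f| f.name == *name)
                    .cloned()
                    .ok_or_else(|| format!("column '{}' not found in schema", name))
            })
            .collect(),
    }
}
```

Because the result already has the projected fields in projected order, a downstream array reader can build columns straight from it without any projection logic of its own, which is the division of responsibility the two changes above describe.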
   
   ## Are these changes tested?
   
   Yes. A new test case `test_avro_with_projection` was added. This test 
confirms that:
   
   1. Only the specified columns are included in the result
   2. Columns appear in the order specified in the projection
   3. The correct data values are returned for each column
   
   ## Are there any user-facing changes?
   
   None beyond the fix itself: queries that project Avro columns in a different order than the file schema now work as intended.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

