nantunes opened a new pull request, #15840:
URL: https://github.com/apache/datafusion/pull/15840
## Which issue does this PR close?
- Fixes #15839
## Rationale for this change
When an Avro file is queried with columns in a different order than the
file's schema, the reader fails with a type mismatch error. The Avro reader
implementation did not handle column ordering in projections correctly, so
the data types it produced did not line up with the expected schema order.
## What changes are included in this PR?
1. Creating a projected schema upfront in `Reader::try_new`
   - The schema now includes only the projected fields, in the specified order
   - This projected schema is passed to the array reader
2. Moving projection handling from `AvroArrowArrayReader` to the `Reader` struct
   - Removed the `projection` field from `AvroArrowArrayReader`
   - Removed projection filtering logic from `build_struct_array`
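
The idea behind the projected schema can be sketched as follows. This is a minimal illustration, not the code in this PR: `DataType`, `Field`, `field`, and `project_schema` are hypothetical, simplified stand-ins for the Arrow schema types, and the sketch assumes the projection is given as a list of column names.

```rust
// Hypothetical, simplified stand-ins for Arrow's schema types.
// These are NOT the types or functions used in DataFusion.
#[derive(Debug, Clone, PartialEq)]
pub enum DataType {
    Int64,
    Utf8,
    Boolean,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Field {
    pub name: String,
    pub data_type: DataType,
}

/// Convenience constructor for the illustration.
pub fn field(name: &str, data_type: DataType) -> Field {
    Field {
        name: name.to_string(),
        data_type,
    }
}

/// Builds a schema that contains only the projected fields, in the order
/// given by `projection` rather than in file-schema order.
/// Returns an error if a requested column is not in the file schema.
pub fn project_schema(
    file_schema: &[Field],
    projection: &[&str],
) -> Result<Vec<Field>, String> {
    projection
        .iter()
        .map(|name| {
            file_schema
                .iter()
                .find(|f| f.name == *name)
                .cloned()
                .ok_or_else(|| format!("column '{}' not found in file schema", name))
        })
        .collect()
}
```

Because the result is driven by iterating over the projection rather than filtering the file schema, a projection of `["active", "id"]` yields `active` first even when `id` comes first in the file. A downstream reader that receives only this schema does not need its own projection logic.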
## Are these changes tested?
Yes. A new test case `test_avro_with_projection` was added. This test
confirms that:
1. Only the specified columns are included in the result
2. Columns appear in the order specified in the projection
3. The correct data values are returned for each column
## Are there any user-facing changes?
No API changes. Queries that project Avro columns in a different order than
the file schema now work instead of failing with a type mismatch error.