nevi-me commented on pull request #8402: URL: https://github.com/apache/arrow/pull/8402#issuecomment-716110760
@carols10cents @alamb I think the whole reader logic needs replumbing ...

There's at least a 1:1 mapping between Parquet types and Arrow types, and we can cast from Arrow types to other Arrow types based on the Arrow metadata. This is a less complex path, which matters because I'm concerned that I/we are going to struggle a lot when we get to deeply-nested reads.

I previously didn't understand your needs re. dictionary support between Parquet > Arrow > DataFusion. I now have context, so I can make better decisions.

My plan was to remove `trait CastRecordReader` altogether, and instead use Arrow casts. I prefer Arrow casts because they handle transparent casts of `dyn Array & DataType::ANY` instead of the combinatorial `CastRecordReader`.

I've now done this in https://github.com/integer32llc/arrow/pull/3, but I left a lot of `TODO`s which I'd love for us to address so we don't carry the tech debt of cast converters.

The tests all pass now 🎊
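
To make the "read with the 1:1 type, then cast" idea concrete, here is a minimal sketch in plain Rust. Everything in it (`DataType`, `Array`, `CastError`, `cast`, `read_column`) is a hypothetical stand-in written for illustration; it is not the `arrow` or `parquet` crate API and not the code in the linked PR. The point is only that a single type-erased cast function can replace one converter per (source, target) pair.

```rust
/// Hypothetical stand-in for Arrow's `DataType`; only three variants are modeled.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int32,
    Int64,
    Utf8,
}

/// Hypothetical stand-in for a type-erased `dyn Array` with nullable values.
#[derive(Debug, Clone, PartialEq)]
enum Array {
    Int32(Vec<Option<i32>>),
    Int64(Vec<Option<i64>>),
    Utf8(Vec<Option<String>>),
}

impl Array {
    /// Report the logical type of this array.
    fn data_type(&self) -> DataType {
        match self {
            Array::Int32(_) => DataType::Int32,
            Array::Int64(_) => DataType::Int64,
            Array::Utf8(_) => DataType::Utf8,
        }
    }
}

/// Error returned when no cast is defined for a (from, to) pair.
#[derive(Debug, PartialEq)]
struct CastError {
    from: DataType,
    to: DataType,
}

/// One entry point for every supported conversion, dispatched at runtime
/// on the (source type, target type) pair. Nulls are preserved.
fn cast(array: &Array, to: &DataType) -> Result<Array, CastError> {
    match (array, to) {
        // Same type: nothing to do.
        (a, t) if &a.data_type() == t => Ok(a.clone()),
        // Lossless widening.
        (Array::Int32(v), DataType::Int64) => {
            Ok(Array::Int64(v.iter().map(|x| x.map(i64::from)).collect()))
        }
        // Integer to string.
        (Array::Int32(v), DataType::Utf8) => {
            Ok(Array::Utf8(v.iter().map(|x| x.map(|n| n.to_string())).collect()))
        }
        (Array::Int64(v), DataType::Utf8) => {
            Ok(Array::Utf8(v.iter().map(|x| x.map(|n| n.to_string())).collect()))
        }
        // Anything else is unsupported in this sketch.
        (a, t) => Err(CastError { from: a.data_type(), to: t.clone() }),
    }
}

/// Hypothetical reader step: `physical` is the array produced by the 1:1
/// Parquet-to-Arrow mapping, and `requested` is the type recorded in the
/// Arrow metadata. The reader itself needs no per-type converter.
fn read_column(physical: Array, requested: &DataType) -> Result<Array, CastError> {
    cast(&physical, requested)
}
```

With this shape, supporting a new target type means adding an arm to `cast` rather than another reader-side converter, which is the tech-debt argument above in miniature.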