tustvold opened a new issue #1040: URL: https://github.com/apache/arrow-rs/issues/1040
**Is your feature request related to a problem or challenge? Please describe what you are trying to do.**

Currently `RecordReader` and `ColumnReaderImpl` hard-code the assumption that they decode to contiguous arrays of values and i16 levels. This complicates implementing #1037, #171 and potential future decode-related optimisations, e.g. decoding directly to `StringArray` or evaluating predicates directly.

**Describe the solution you'd like**

Create new `GenericColumnReader` and `GenericRecordReader` types, and make `RecordReader` and `ColumnReaderImpl` type aliases of them. This preserves API compatibility whilst allowing the introduction of new type parameters. Because these types need to be able to influence the buffer types, they aren't object-safe, and therefore need to be generics rather than simply trait objects.

All decoding and buffering would be provided by these generic type parameters, allowing them to be swapped out. This would leave `ColumnReaderImpl` responsible for muxing the parquet file, i.e. extracting pages from the `PageReader` and feeding them to the decoders. `RecordReader` would be responsible for delimiting semantic records, as it is today.

**Describe alternatives you've considered**

We could duplicate the logic in `ColumnReaderImpl` and `RecordReader` into different reader implementations, but this seems unfortunate.
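To illustrate the proposed solution, here is a minimal sketch of a reader that is generic over its decoder, with a type alias preserving the old name. It is not the actual parquet crate API: the `ColumnValueDecoder` trait, its `decode` signature, the `Vec<Vec<u8>>` stand-in for `PageReader`, and both decoders are hypothetical, and the alias is simplified (it ignores any type parameters the real `ColumnReaderImpl` carries).

```rust
/// Hypothetical trait: a decoder chooses its own output buffer type.
/// The associated type is what lets the decoder influence the buffer,
/// and is why the reader is generic instead of holding a trait object.
pub trait ColumnValueDecoder {
    type Buffer: Default;

    /// Decode one page into `out`, returning the number of values decoded.
    fn decode(&mut self, page: &[u8], out: &mut Self::Buffer) -> usize;
}

/// Generic reader: it only feeds pages to the decoder (the "muxing" role).
/// `pages` is a stand-in for a real `PageReader`.
pub struct GenericColumnReader<D: ColumnValueDecoder> {
    pages: std::vec::IntoIter<Vec<u8>>,
    decoder: D,
}

impl<D: ColumnValueDecoder> GenericColumnReader<D> {
    pub fn new(pages: Vec<Vec<u8>>, decoder: D) -> Self {
        Self { pages: pages.into_iter(), decoder }
    }

    /// Drain all remaining pages, returning the filled buffer and value count.
    pub fn read_all(&mut self) -> (D::Buffer, usize) {
        let mut buffer = D::Buffer::default();
        let mut total = 0;
        for page in self.pages.by_ref() {
            total += self.decoder.decode(&page, &mut buffer);
        }
        (buffer, total)
    }
}

/// Decoder matching today's behaviour: a contiguous array of values.
/// Assumes pages are plain little-endian i32 values (an illustration only).
pub struct PlainI32Decoder;

impl ColumnValueDecoder for PlainI32Decoder {
    type Buffer = Vec<i32>;

    fn decode(&mut self, page: &[u8], out: &mut Vec<i32>) -> usize {
        let mut n = 0;
        for c in page.chunks_exact(4) {
            out.push(i32::from_le_bytes([c[0], c[1], c[2], c[3]]));
            n += 1;
        }
        n
    }
}

/// Alternative decoder: evaluates a predicate (value > 0) while decoding,
/// so the "buffer" is just a running count of matches.
pub struct CountPositiveDecoder;

impl ColumnValueDecoder for CountPositiveDecoder {
    type Buffer = usize;

    fn decode(&mut self, page: &[u8], out: &mut usize) -> usize {
        let mut n = 0;
        for c in page.chunks_exact(4) {
            if i32::from_le_bytes([c[0], c[1], c[2], c[3]]) > 0 {
                *out += 1;
            }
            n += 1;
        }
        n
    }
}

/// The old name survives as an alias (simplified), so existing callers compile.
pub type ColumnReaderImpl = GenericColumnReader<PlainI32Decoder>;
```

With this shape, `GenericColumnReader::new(pages, CountPositiveDecoder)` reuses the same page-feeding loop but never materialises the values, which is the kind of swap the issue is asking for.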
