tustvold opened a new issue #1040:
URL: https://github.com/apache/arrow-rs/issues/1040


   **Is your feature request related to a problem or challenge? Please describe 
what you are trying to do.**
   
   Currently `RecordReader` and `ColumnReaderImpl` have a hard-coded assumption 
that they are decoding to a contiguous array of values, or to i16 levels. This 
complicates implementing #1037, #171 and potential future decode-related 
optimisations, e.g. decoding directly to StringArray or evaluating predicates 
directly.
   
   **Describe the solution you'd like**
   
   Create new `GenericColumnReader` and `GenericRecordReader` types, of which 
`ColumnReaderImpl` and `RecordReader` are type aliases. This preserves API 
compatibility whilst allowing the introduction of new type parameters. As these 
type parameters need to be able to influence the buffer types, they aren't 
object-safe and therefore need to be generics rather than trait objects.
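   
   A minimal sketch of the shape this could take, not the actual parquet 
crate API. The traits `ValuesBuffer` and `ValuesDecoder`, the 
`PlainI32Decoder` type and all method names are hypothetical, and the real 
`RecordReader` is itself generic over a data type, which is omitted here for 
brevity. The associated `Buffer` type is what lets the decoder choose its own 
output representation:

```rust
/// Hypothetical: a buffer that decoded values are accumulated into.
pub trait ValuesBuffer: Default {
    fn len(&self) -> usize;
}

impl<T> ValuesBuffer for Vec<T> {
    fn len(&self) -> usize {
        Vec::len(self)
    }
}

/// Hypothetical: a decoder that picks its own output buffer type.
pub trait ValuesDecoder {
    type Buffer: ValuesBuffer;

    /// Decodes up to `max` values into `out`, returning how many were written.
    fn decode(&mut self, out: &mut Self::Buffer, max: usize) -> usize;
}

/// Generic reader: the buffer type follows from the decoder type parameter.
pub struct GenericRecordReader<D: ValuesDecoder> {
    decoder: D,
    values: D::Buffer,
}

impl<D: ValuesDecoder> GenericRecordReader<D> {
    pub fn new(decoder: D) -> Self {
        Self { decoder, values: Default::default() }
    }

    pub fn read(&mut self, max: usize) -> usize {
        self.decoder.decode(&mut self.values, max)
    }

    pub fn num_values(&self) -> usize {
        self.values.len()
    }

    pub fn values(&self) -> &D::Buffer {
        &self.values
    }
}

/// Stand-in for today's behaviour: decode into a contiguous `Vec`.
/// It just emits consecutive integers instead of reading real pages.
pub struct PlainI32Decoder {
    pub next: i32,
}

impl ValuesDecoder for PlainI32Decoder {
    type Buffer = Vec<i32>;

    fn decode(&mut self, out: &mut Vec<i32>, max: usize) -> usize {
        for _ in 0..max {
            out.push(self.next);
            self.next += 1;
        }
        max
    }
}

/// Existing callers keep using the old name through a type alias.
pub type RecordReader = GenericRecordReader<PlainI32Decoder>;
```

   With this arrangement, code written against `RecordReader` keeps 
compiling, while a different decoder (say, one whose `Buffer` is a string 
offset buffer) can be plugged into `GenericRecordReader` without touching it.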
   
   All decoding and buffering would be provided by the new type parameters, 
allowing them to be swapped out. This would leave `ColumnReaderImpl` 
responsible for muxing the parquet file, i.e. extracting pages from the 
`PageReader` and feeding them to the decoders. `RecordReader` would be 
responsible for delimiting semantic records, as it is today.
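   
   The split of responsibilities for the column reader might look roughly 
like the following sketch. `Page`, `PageSource`, `PageDecoder`, `VecPages` and 
`Int32Decoder` are simplified, hypothetical stand-ins and not the parquet 
crate's own `Page` or `PageReader`; levels and record delimiting are left out 
entirely:

```rust
/// Hypothetical encoded page: here just little-endian i32 values.
pub struct Page {
    pub bytes: Vec<u8>,
}

/// Hypothetical stand-in for a page reader: yields pages in file order.
pub trait PageSource {
    fn next_page(&mut self) -> Option<Page>;
}

/// Hypothetical decoder: owns all decoding and chooses the buffer type.
pub trait PageDecoder {
    type Buffer: Default;

    fn decode_page(&mut self, page: &Page, out: &mut Self::Buffer);
}

/// The column reader only does plumbing: pull pages, feed the decoder.
pub struct GenericColumnReader<P, D> {
    pages: P,
    decoder: D,
}

impl<P: PageSource, D: PageDecoder> GenericColumnReader<P, D> {
    pub fn new(pages: P, decoder: D) -> Self {
        Self { pages, decoder }
    }

    /// Drains every remaining page into a freshly created buffer.
    pub fn read_all(&mut self) -> D::Buffer {
        let mut out: D::Buffer = Default::default();
        while let Some(page) = self.pages.next_page() {
            self.decoder.decode_page(&page, &mut out);
        }
        out
    }
}

/// In-memory page source used for illustration.
pub struct VecPages(pub std::vec::IntoIter<Page>);

impl PageSource for VecPages {
    fn next_page(&mut self) -> Option<Page> {
        self.0.next()
    }
}

/// Decodes pages of little-endian i32 values into a contiguous `Vec`.
pub struct Int32Decoder;

impl PageDecoder for Int32Decoder {
    type Buffer = Vec<i32>;

    fn decode_page(&mut self, page: &Page, out: &mut Vec<i32>) {
        for chunk in page.bytes.chunks_exact(4) {
            out.push(i32::from_le_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]));
        }
    }
}
```

   Swapping `Int32Decoder` for another `PageDecoder` changes what 
`read_all` returns without any change to the page-handling loop.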
   
   **Describe alternatives you've considered**
   
   We could duplicate the logic in `ColumnReaderImpl` and `RecordReader` into 
different reader implementations, but this seems unfortunate. 
   

