pvary commented on PR #15410:
URL: https://github.com/apache/iceberg/pull/15410#issuecomment-4067463253

   I did a first review here:
   
   The functional change is correct, but the implementation approach of 
extending `BaseVectorizedParquetValuesReader` is the wrong abstraction. A 
standalone reader that follows the established extends `ValuesReader` 
implements `VectorizedValuesReader` pattern would be cleaner, more 
maintainable, and consistent with how all the other v2 encoding readers were 
implemented.
   
   **Recommendation**
   Create a dedicated `VectorizedRleBooleanValuesReader` extends `ValuesReader` 
implements `VectorizedValuesReader` that:
   - Only handles boolean `RLE` decoding (bit width 1)
   - Has its own `initFromPage` that reads the 4-byte length + RLE-decodes 
booleans
   - Implements readBoolean() and a batch readBooleans(int total, FieldVector 
vec, int rowId) if possible
   - Follows the same extends `ValuesReader` implements 
`VectorizedValuesReader` pattern as all the other v2 encoding readers
   
   This can reuse the RLE decoding logic from 
`BaseVectorizedParquetValuesReader` by composition (or just inline the small 
amount of RLE logic needed for bit-width-1 booleans), rather than by 
inheritance.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to