lesterfan commented on issue #46094:
URL: https://github.com/apache/arrow/issues/46094#issuecomment-2992137350
This makes sense; thank you! So in summary, `RleEncoder`/`RleDecoder` are
implemented so that literal (bit-packed) runs are always padded up to the next
multiple of 8 values to ensure byte alignment. From the decoder's perspective,
there's no way to distinguish between a "real" value and one of these padding
values. Therefore, it's up to the caller of `RleEncoder`/`RleDecoder` to keep
track of how many "real" values were written and to read only that many. In
practice, this is done by the Parquet page `RecordReader`, as mentioned by
@mapleFU, but my test was bypassing it.
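To make the behavior above concrete, here is a rough, self-contained toy model of a single literal run. It is not Arrow's implementation: `ToyEncodeLiteralRun`, `ToyLiteralDecoder`, and `ToyDecode` are hypothetical names, and the sketch assumes one bit-packed run with a one-byte header of the form `(num_groups << 1) | 1` and values packed LSB-first.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Toy model only: these names are hypothetical and are not Arrow's API.

// Encodes `values` as one bit-packed (literal) run. The run is padded with
// zeros up to a multiple of 8 values so that it ends on a byte boundary.
std::vector<uint8_t> ToyEncodeLiteralRun(const std::vector<uint32_t>& values,
                                         int bit_width) {
  assert(bit_width >= 1 && bit_width <= 32);
  const std::size_t num_groups = (values.size() + 7) / 8;
  assert(num_groups >= 1 && num_groups < 64);  // keep the header to one byte
  const uint64_t mask = (uint64_t{1} << bit_width) - 1;

  std::vector<uint8_t> out;
  out.push_back(static_cast<uint8_t>((num_groups << 1) | 1));  // run header

  uint64_t buffer = 0;
  int bits = 0;
  for (std::size_t i = 0; i < num_groups * 8; ++i) {
    const uint64_t v = i < values.size() ? (values[i] & mask) : 0;  // 0 = padding
    buffer |= v << bits;
    bits += bit_width;
    while (bits >= 8) {  // flush whole bytes
      out.push_back(static_cast<uint8_t>(buffer & 0xFF));
      buffer >>= 8;
      bits -= 8;
    }
  }
  return out;
}

// Decodes one literal run. The header only records the number of 8-value
// groups, so the decoder cannot tell padding apart from real values.
class ToyLiteralDecoder {
 public:
  ToyLiteralDecoder(std::vector<uint8_t> data, int bit_width)
      : data_(std::move(data)), bit_width_(bit_width) {
    assert(!data_.empty());
    remaining_ = static_cast<std::size_t>(data_[0] >> 1) * 8;
  }

  // Gets the next value, padding included. Returns false once the run
  // (not the caller's logical value count) is exhausted.
  bool Get(uint32_t* val) {
    if (remaining_ == 0) return false;
    while (bits_ < bit_width_) {  // refill from the byte stream
      buffer_ |= static_cast<uint64_t>(data_[pos_++]) << bits_;
      bits_ += 8;
    }
    *val = static_cast<uint32_t>(buffer_ & ((uint64_t{1} << bit_width_) - 1));
    buffer_ >>= bit_width_;
    bits_ -= bit_width_;
    --remaining_;
    return true;
  }

 private:
  std::vector<uint8_t> data_;
  int bit_width_;
  std::size_t remaining_ = 0;
  std::size_t pos_ = 1;  // skip the header byte
  uint64_t buffer_ = 0;
  int bits_ = 0;
};

// Caller-side loop: stops after `max_values`, which is how the caller
// (not the decoder) enforces the number of "real" values.
std::vector<uint32_t> ToyDecode(const std::vector<uint8_t>& data, int bit_width,
                                std::size_t max_values) {
  ToyLiteralDecoder decoder(data, bit_width);
  std::vector<uint32_t> out;
  uint32_t v = 0;
  while (out.size() < max_values && decoder.Get(&v)) out.push_back(v);
  return out;
}
```

In this toy, encoding the three 2-bit values `{1, 2, 3}` and draining the decoder yields eight values (the three originals followed by five zeros), whereas `ToyDecode(encoded, 2, 3)` returns just the original three. That mirrors what my test ran into when it called `Get` until it returned false.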
Given this, I'm going to close the issue. I think the following docstring
could be clearer about this, but it sounds largely like a user error on my end:
```
/// Gets the next value. Returns false if there are no more.
template <typename T>
bool Get(T* val);
```