jorgecarleitao commented on a change in pull request #416:
URL: https://github.com/apache/arrow-rs/pull/416#discussion_r646800597
##########
File path: arrow/src/util/bit_chunk_iterator.rs
##########
@@ -137,14 +137,16 @@ impl Iterator for BitChunkIterator<'_> {
// so when reading as u64 on a big-endian machine, the bytes need to be swapped
let current = unsafe { std::ptr::read_unaligned(raw_data.add(index)).to_le() };
- let combined = if self.bit_offset == 0 {
+ let bit_offset = self.bit_offset;
+
+ let combined = if bit_offset == 0 {
current
} else {
- let next =
- unsafe { std::ptr::read_unaligned(raw_data.add(index + 1)).to_le() };
+ let next = unsafe {
+ std::ptr::read_unaligned(raw_data.add(index + 1) as *const u8) as u64
Review comment:
Since this is not the remainder, don't we potentially need to read more
than 8 bits? I.e., doesn't the word at `index + 1` contain between 1 and 63
bits that need to be "merged" into `current`?
I have a feeling that this will ignore all bits after the 8th, i.e. bits 8
through 63 of that word. At least that is what I remember from fixing it in arrow2
[here](https://github.com/jorgecarleitao/arrow2/blob/main/src/bitmap/utils/chunk_iterator/mod.rs#L149).
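To make the question concrete, here is a minimal sketch of the concern. It assumes the two words are combined as `(current >> bit_offset) | (next << (64 - bit_offset))`, which the hunk does not show, and the helper names `merge` and `merge_one_byte` are hypothetical. Whether a one-byte read is enough depends on the range `bit_offset` can take, which the hunk does not show either.

```rust
/// Merge two little-endian 64-bit words into one chunk that starts
/// `bit_offset` bits into `current`.
/// Assumption: the merge expression below is not shown in the hunk.
fn merge(current: u64, next: u64, bit_offset: u32) -> u64 {
    // A shift by 64 would overflow, so the offset must be in 1..=63.
    debug_assert!((1..64).contains(&bit_offset));
    (current >> bit_offset) | (next << (64 - bit_offset))
}

/// Same merge, but only the first byte after `current` is read,
/// mirroring the `as *const u8 ... as u64` read in the `+` lines.
fn merge_one_byte(current: u64, next: u64, bit_offset: u32) -> u64 {
    // Truncating to u8 keeps bits 0..=7 of `next` and drops bits 8..=63.
    let next_byte = (next as u8) as u64;
    merge(current, next_byte, bit_offset)
}
```

Under that assumption, `merge` uses only the low `bit_offset` bits of `next`. With an offset of 3 the two functions agree for any `next`, but with an offset of 12 and `next = 0x0F00` the full read gives `0xF000_0000_0000_0000` while the one-byte read gives `0`.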