jhorstmann commented on a change in pull request #1228:
URL: https://github.com/apache/arrow-rs/pull/1228#discussion_r795038980
##########
File path: arrow/src/util/bit_chunk_iterator.rs
##########
@@ -272,4 +462,149 @@ mod tests {
assert_eq!(u64::MAX, bitchunks.iter().last().unwrap());
assert_eq!(0x7F, bitchunks.remainder_bits());
}
+
+ #[test]
+ #[allow(clippy::assertions_on_constants)]
+ fn test_unaligned_bit_chunk_iterator() {
+ // This test exploits the fact Buffer is at least 64-byte aligned
+ assert!(ALIGNMENT > 64);
+
+ let buffer = Buffer::from(&[0xFF; 5]);
+ let unaligned = UnalignedBitChunk::new(buffer.as_slice(), 0, 40);
+
+ assert_eq!(unaligned.prefix(), Some((1 << 40) - 1));
+ assert_eq!(unaligned.suffix(), None);
+ assert!(unaligned.chunks().is_empty());
+ assert_eq!(unaligned.lead_padding(), 0);
+ assert_eq!(unaligned.trailing_padding(), 24);
+
+ let buffer = buffer.slice(1);
+ let unaligned = UnalignedBitChunk::new(buffer.as_slice(), 0, 32);
+
+ assert_eq!(unaligned.prefix(), Some((1 << 32) - 1));
+ assert_eq!(unaligned.suffix(), None);
+ assert!(unaligned.chunks().is_empty());
+ assert_eq!(unaligned.lead_padding(), 0);
+ assert_eq!(unaligned.trailing_padding(), 32);
+
+ let unaligned = UnalignedBitChunk::new(buffer.as_slice(), 5, 27);
+
+ assert_eq!(unaligned.prefix(), Some(((1 << 32) - 1) - ((1 << 5) - 1)));
Review comment:
These tests might be easier to understand if they asserted against a binary
literal.
I added some debug output locally, and I'm not sure whether this should
be the expected output:
```
eprintln!("{:064b}", unaligned.prefix().unwrap());
// output: 0000000000000000000000000000000011111111111111111111111111100000
```
I would have expected the prefix chunk to have no trailing zeroes and
instead to have a length of less than 64 bits. That makes no difference when
counting bits, but I find it a bit confusing.
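To illustrate the binary-literal suggestion, here is a minimal standalone sketch that does not touch the arrow types. It checks that the arithmetic expression from the test equals a literal with 27 ones followed by 5 zeroes. `bit_range_mask` is a hypothetical helper introduced only for this example; it is not part of the PR.

```rust
/// Hypothetical helper (not from the PR): a mask with bits `lo..hi` set,
/// where bit 0 is the least significant bit. Requires `lo <= hi < 64`.
fn bit_range_mask(lo: u32, hi: u32) -> u64 {
    ((1u64 << hi) - 1) & !((1u64 << lo) - 1)
}

#[test]
fn prefix_mask_as_binary_literal() {
    // Arithmetic form, as written in the test under review.
    let arithmetic: u64 = ((1 << 32) - 1) - ((1 << 5) - 1);
    // The same value as a binary literal: 27 ones followed by 5 zeroes.
    let literal: u64 = 0b1111_1111_1111_1111_1111_1111_1110_0000;

    assert_eq!(arithmetic, literal);
    assert_eq!(bit_range_mask(5, 32), literal);
    // The 5 low zero bits are what `trailing_zeros` reports.
    assert_eq!(literal.trailing_zeros(), 5);
    assert_eq!(literal.count_ones(), 27);
}
```

With the literal, the 5 low zero bits are visible at a glance, which is the part I found surprising in the debug output.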
I don't yet understand how the current behavior interacts with the
`advance_to_set_bit` function. As I understand it, `offset` would be 5 for
this example, `trailing_zeros` would also be 5, and then in
`SlicesIterator` `start_chunk + start_bit` would be 10. But I'm probably
missing something there.
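The arithmetic behind this concern can be sketched in isolation. This is not the PR's code: `first_set_bit` is a hypothetical helper, and the two call sites only model the two ways the chunk's starting position could be interpreted. Which one matches `SlicesIterator` is exactly what I have not verified.

```rust
/// Hypothetical helper (not from the PR): position of the first set bit,
/// given that bit 0 of `chunk` sits at position `chunk_start`.
fn first_set_bit(chunk: u64, chunk_start: usize) -> Option<usize> {
    if chunk == 0 {
        None
    } else {
        Some(chunk_start + chunk.trailing_zeros() as usize)
    }
}

#[test]
fn offset_counted_once_or_twice() {
    // Prefix value from the debug output: 5 zero bits, then 27 set bits.
    let prefix: u64 = 0b1111_1111_1111_1111_1111_1111_1110_0000;

    // If the chunk is treated as starting at position 0, the zero bits
    // already encode the offset and the first set bit is at 5.
    assert_eq!(first_set_bit(prefix, 0), Some(5));

    // If the chunk is treated as starting at the offset (5), the same
    // zero bits are added again and the result is 10.
    assert_eq!(first_set_bit(prefix, 5), Some(10));
}
```

If the iterator accounts for the lead padding before adding `trailing_zeros`, the first case applies and there is no problem.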
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.