jhorstmann commented on a change in pull request #1228:
URL: https://github.com/apache/arrow-rs/pull/1228#discussion_r796614899
##########
File path: arrow/src/util/bit_chunk_iterator.rs
##########
@@ -272,4 +462,149 @@ mod tests {
assert_eq!(u64::MAX, bitchunks.iter().last().unwrap());
assert_eq!(0x7F, bitchunks.remainder_bits());
}
+
+ #[test]
+ #[allow(clippy::assertions_on_constants)]
+ fn test_unaligned_bit_chunk_iterator() {
+ // This test exploits the fact Buffer is at least 64-byte aligned
+ assert!(ALIGNMENT > 64);
+
+ let buffer = Buffer::from(&[0xFF; 5]);
+ let unaligned = UnalignedBitChunk::new(buffer.as_slice(), 0, 40);
+
+ assert_eq!(unaligned.prefix(), Some((1 << 40) - 1));
+ assert_eq!(unaligned.suffix(), None);
+ assert!(unaligned.chunks().is_empty());
+ assert_eq!(unaligned.lead_padding(), 0);
+ assert_eq!(unaligned.trailing_padding(), 24);
+
+ let buffer = buffer.slice(1);
+ let unaligned = UnalignedBitChunk::new(buffer.as_slice(), 0, 32);
+
+ assert_eq!(unaligned.prefix(), Some((1 << 32) - 1));
+ assert_eq!(unaligned.suffix(), None);
+ assert!(unaligned.chunks().is_empty());
+ assert_eq!(unaligned.lead_padding(), 0);
+ assert_eq!(unaligned.trailing_padding(), 32);
+
+ let unaligned = UnalignedBitChunk::new(buffer.as_slice(), 5, 27);
+
+ assert_eq!(unaligned.prefix(), Some(((1 << 32) - 1) - ((1 << 5) - 1)));
Review comment:
Tests are much nicer to read now, thanks. My one concern is that the
actual `UnalignedBitChunkIterator` iterator is a bit difficult to use, because
it requires a bit of special handling for the first element, which has to be
shifted by `lead_padding`. I played around with a similar design, and one idea
is to return an iterator of `(start_offset, mask, len)` tuples.
Example: Bitmap consisting of 130 one bits, starting at offset 60, the
iterator would return
```
(0, 0b1111, 4)
(4, u64::MAX, 64)
(68, 0b11, 2)
```
Users would then need no special logic for prefix/suffix. When iterating
over all set bits you could use `trailing_zeros` on each chunk and add the
`start_offset`.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]