jhorstmann commented on a change in pull request #1228:
URL: https://github.com/apache/arrow-rs/pull/1228#discussion_r795038980



##########
File path: arrow/src/util/bit_chunk_iterator.rs
##########
@@ -272,4 +462,149 @@ mod tests {
         assert_eq!(u64::MAX, bitchunks.iter().last().unwrap());
         assert_eq!(0x7F, bitchunks.remainder_bits());
     }
+
+    #[test]
+    #[allow(clippy::assertions_on_constants)]
+    fn test_unaligned_bit_chunk_iterator() {
+        // This test exploits the fact Buffer is at least 64-byte aligned
+        assert!(ALIGNMENT > 64);
+
+        let buffer = Buffer::from(&[0xFF; 5]);
+        let unaligned = UnalignedBitChunk::new(buffer.as_slice(), 0, 40);
+
+        assert_eq!(unaligned.prefix(), Some((1 << 40) - 1));
+        assert_eq!(unaligned.suffix(), None);
+        assert!(unaligned.chunks().is_empty());
+        assert_eq!(unaligned.lead_padding(), 0);
+        assert_eq!(unaligned.trailing_padding(), 24);
+
+        let buffer = buffer.slice(1);
+        let unaligned = UnalignedBitChunk::new(buffer.as_slice(), 0, 32);
+
+        assert_eq!(unaligned.prefix(), Some((1 << 32) - 1));
+        assert_eq!(unaligned.suffix(), None);
+        assert!(unaligned.chunks().is_empty());
+        assert_eq!(unaligned.lead_padding(), 0);
+        assert_eq!(unaligned.trailing_padding(), 32);
+
+        let unaligned = UnalignedBitChunk::new(buffer.as_slice(), 5, 27);
+
+        assert_eq!(unaligned.prefix(), Some(((1 << 32) - 1) - ((1 << 5) - 1)));

Review comment:
   These tests might be easier to understand if they asserted against a binary 
literal.
   
   I added a bit of debug output locally and I'm not sure whether this should 
be the expected output:
   
   ```
   eprintln!("{:064b}", unaligned.prefix().unwrap());
   
   // output: 0000000000000000000000000000000011111111111111111111111111100000 
   ```
   
   I would have expected the prefix chunk to have no trailing zeroes and 
instead to have a length of less than 64 bits. That makes no difference when 
counting bits, but I find it a bit confusing.
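   
   As a sketch of the binary-literal suggestion: the standalone snippet below 
uses plain `u64` arithmetic and no arrow types, and `expected_prefix` is a 
hypothetical helper that is not part of the PR. It checks that the expression 
asserted in the test equals a binary literal with 27 set bits above 5 low zero 
bits. It only pins down the current value; it does not settle whether that 
value is the intended one.

```rust
/// Hypothetical helper (not part of the PR): the expression asserted in the
/// test for an offset of 5 bits and a length of 27 bits.
fn expected_prefix() -> u64 {
    ((1u64 << 32) - 1) - ((1u64 << 5) - 1)
}

fn main() {
    let prefix = expected_prefix();

    // The same value written as a binary literal: 27 ones, then 5 zeros.
    assert_eq!(prefix, 0b1111_1111_1111_1111_1111_1111_1110_0000);

    // The 5 low zero bits correspond to the offset, the 27 ones to the length.
    assert_eq!(prefix.trailing_zeros(), 5);
    assert_eq!(prefix.count_ones(), 27);

    // Same formatting as the debug output in the comment.
    println!("{:064b}", prefix);
}
```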
   
   I don't yet understand how the current behavior interacts with the 
`advance_to_set_bit` function. As I understand it, `offset` would be 5 for 
this example, `trailing_zeros` would also be 5, and then in 
`SlicesIterator`, `start_chunk + start_bit` would be 10. But I'm probably 
missing something there.
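   
   To make the arithmetic behind this concern concrete, here is a standalone 
sketch. It does not call the real `advance_to_set_bit` or `SlicesIterator` 
code; `prefix_mask` is a hypothetical helper that only models a chunk which 
keeps its offset as low zero bits. Under that assumption, `trailing_zeros` 
already includes the offset, so adding the offset a second time would give 10 
rather than 5. Whether the actual code adds it twice is exactly the open 
question.

```rust
/// Hypothetical helper (not part of the PR): a 64-bit chunk with `len` set
/// bits starting at bit position `offset`, with every other bit zero.
fn prefix_mask(offset: u32, len: u32) -> u64 {
    assert!(len > 0 && offset + len <= 64);
    let ones = if len == 64 { u64::MAX } else { (1u64 << len) - 1 };
    ones << offset
}

fn main() {
    let offset: u32 = 5;
    let prefix = prefix_mask(offset, 27);

    // Position of the first set bit within the chunk. Because the chunk keeps
    // the offset as low zero bits, this already accounts for the offset.
    let start_bit = prefix.trailing_zeros();
    assert_eq!(start_bit, 5);

    // Adding the offset again on top of trailing_zeros would double count it.
    assert_eq!(offset + start_bit, 10);
}
```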



