alamb commented on code in PR #9020:
URL: https://github.com/apache/arrow-rs/pull/9020#discussion_r2637157194
##########
arrow-buffer/src/buffer/boolean.rs:
##########
@@ -26,17 +26,56 @@ use std::ops::{BitAnd, BitOr, BitXor, Not};
/// A slice-able [`Buffer`] containing bit-packed booleans
///
-/// `BooleanBuffer`s can be modified using [`BooleanBufferBuilder`]
+/// This structure represents a sequence of boolean values packed into a
+/// byte-aligned [`Buffer`]. Both the offset and length are represented in bits.
+///
+/// # Layout
+///
+/// The values are represented as little endian bit-packed values, where the
+/// least significant bit of each byte represents the first boolean value and
+/// then proceeding to the most significant bit.
+///
+/// For example, the 10 bit bitmask `0b0111001101` has length 10, and is
+/// represented using 2 bytes with offset 0 like this:
+///
+/// ```text
+/// ┌─────────────────────────────────┐ ┌─────────────────────────────────┐
+/// │┌───┬───┬───┬───┬───┬───┬───┬───┐│ │┌───┬───┬───┬───┬───┬───┬───┬───┐│
+/// ││ 1 │ 1 │ 0 │ 0 │ 1 │ 1 │ 0 │ 1 ││ ││ ? │ ? │ ? │ ? │ ? │ ? │ 0 │ 1 ││
+/// │└───┴───┴───┴───┴───┴───┴───┴───┘│ │└───┴───┴───┴───┴───┴───┴───┴───┘│
+/// └─────────────────────────────────┘ └─────────────────────────────────┘
+///    7          Byte 0           0       7          Byte 1           0   bit
+///                                                                        offset
+/// length = 10 bits, offset = 0
+/// ```
+///
+/// The same bitmask with length 10 and offset 3 would be represented like this:
+/// ```
+/// ┌─────────────────────────────────┐ ┌─────────────────────────────────┐
+/// │┌───┬───┬───┬───┬───┬───┬───┬───┐│ │┌───┬───┬───┬───┬───┬───┬───┬───┐│
+/// ││ 0 │ 1 │ 1 │ 0 │ 1 │ ? │ ? │ ? ││ ││ ? │ ? │ ? │ 0 │ 1 │ 1 │ 1 │ 0 ││
+/// │└───┴───┴───┴───┴───┴───┴───┴───┘│ │└───┴───┴───┴───┴───┴───┴───┴───┘│
+/// └─────────────────────────────────┘ └─────────────────────────────────┘
+///    7          Byte 0           0       7          Byte 1           0   bit
+///                                                                        offset
Review Comment:
Yeah, this is tricky -- I went with this layout because it matches how the
bits read when written as a binary literal such as `0b1011010`.
However, having the bytes run left to right while the bits within each byte
run right to left is confusing. I'll update the diagrams so the bits go in the
same direction as the bytes (low to high, left to right).
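
For reference, here is a rough sketch of the bit order the doc comment describes, checked against the two diagrams in the hunk. It is plain Rust and not the arrow-buffer implementation: `get_bit` and `unpack` are hypothetical helpers made up for illustration, and the `?` bits from the diagrams are arbitrarily set to 0.

```rust
/// Hypothetical helper (not part of arrow-buffer): reads the boolean at
/// logical index `i` from LSB-first bit-packed `bytes`, skipping `offset` bits.
fn get_bit(bytes: &[u8], offset: usize, i: usize) -> bool {
    let pos = offset + i;
    // Byte index is pos / 8; within the byte, bit 0 is the least significant.
    (bytes[pos / 8] >> (pos % 8)) & 1 == 1
}

/// Hypothetical helper: collects `len` booleans starting at bit `offset`,
/// lowest logical index first.
fn unpack(bytes: &[u8], offset: usize, len: usize) -> Vec<bool> {
    (0..len).map(|i| get_bit(bytes, offset, i)).collect()
}

fn main() {
    // Offset 0 diagram: byte 0 = 0b11001101, byte 1 = 0b??????01.
    let at_offset_0 = [0b1100_1101u8, 0b0000_0001];
    // Offset 3 diagram: byte 0 = 0b01101???, byte 1 = 0b???01110.
    let at_offset_3 = [0b0110_1000u8, 0b0000_1110];

    let a = unpack(&at_offset_0, 0, 10);
    let b = unpack(&at_offset_3, 3, 10);

    // Both buffers should decode to the same 10 logical values.
    assert_eq!(a, b);
    println!("{:?}", a);
}
```

Read this way, the logical order is low bit to high bit within each byte, which is the left-to-right order the updated diagrams are meant to show.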