alamb commented on code in PR #4818:
URL: https://github.com/apache/arrow-rs/pull/4818#discussion_r1326442756


##########
arrow-row/src/lib.rs:
##########
@@ -232,13 +232,13 @@ mod variable;
 /// A non-null, non-empty byte array is encoded as `2_u8` followed by the byte array
 /// encoded using a block based scheme described below.
 ///
-/// The byte array is broken up into 32-byte blocks, each block is written in turn
+/// The byte array is broken up into fixed-width blocks, each block is written in turn
 /// to the output, followed by `0xFF_u8`. The final block is padded to 32-bytes
 /// with `0_u8` and written to the output, followed by the un-padded length in bytes
-/// of this final block as a `u8`.
+/// of this final block as a `u8`. The first 4 blocks have a length of 8, with subsequent
+/// blocks using a length of 32.

Review Comment:
   I think it would help to explain the rationale for using smaller blocks up 
front (to avoid wasting space on short strings)
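
   For illustration, here is a rough sketch of the space argument, derived only 
from the doc comment above. The function names are hypothetical and are not 
the crate's API, and the layout is an assumption: one leading marker byte, 
one trailing byte per block, and a padded final block.

```rust
/// Ceiling division for positive integers.
fn ceil_div(a: usize, b: usize) -> usize {
    (a + b - 1) / b
}

/// Hypothetical: encoded size of a non-empty byte array when every block is 32 bytes.
/// 1 marker byte + (32 data bytes + 1 trailing byte) per block.
fn encoded_len_fixed32(len: usize) -> usize {
    assert!(len > 0, "the empty case is encoded separately");
    1 + ceil_div(len, 32) * 33
}

/// Hypothetical: encoded size when the first 4 blocks are 8 bytes wide
/// and all subsequent blocks are 32 bytes wide.
fn encoded_len_mini_blocks(len: usize) -> usize {
    assert!(len > 0, "the empty case is encoded separately");
    const MINI_BLOCK: usize = 8;
    const MINI_COUNT: usize = 4;
    const BLOCK: usize = 32;
    let mini_capacity = MINI_BLOCK * MINI_COUNT; // 32 bytes fit in the small blocks

    if len <= mini_capacity {
        // Only small blocks are needed: 9 bytes each.
        1 + ceil_div(len, MINI_BLOCK) * (MINI_BLOCK + 1)
    } else {
        // All 4 small blocks are full, the remainder goes into 32-byte blocks.
        1 + MINI_COUNT * (MINI_BLOCK + 1)
            + ceil_div(len - mini_capacity, BLOCK) * (BLOCK + 1)
    }
}
```

   Under these assumptions a 5-byte string needs 34 bytes with fixed 32-byte 
blocks but only 10 bytes with the smaller leading blocks, while strings of 25 
bytes or more pay 3 extra bytes for the additional block terminators.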



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
