friendlymatthew commented on code in PR #7888:
URL: https://github.com/apache/arrow-rs/pull/7888#discussion_r2198302382
##########
parquet-variant/src/variant/metadata.rs:
##########
@@ -37,16 +37,16 @@ pub(crate) struct VariantMetadataHeader {
const CORRECT_VERSION_VALUE: u8 = 1;
// The metadata header occupies one byte; use a named constant for readability
-const NUM_HEADER_BYTES: usize = 1;
+const NUM_HEADER_BYTES: u32 = 1;
impl VariantMetadataHeader {
// Hide the cast
- const fn offset_size(&self) -> usize {
- self.offset_size as usize
+ const fn offset_size(&self) -> u32 {
+ self.offset_size as u32
}
// Avoid materializing this offset, since it's cheaply and safely computable
- const fn first_offset_byte(&self) -> usize {
+ const fn first_offset_byte(&self) -> u32 {
NUM_HEADER_BYTES + self.offset_size()
}
Review Comment:
Hi, `VariantMetadataHeader` is decoded from a single u8 that encodes 3 pieces
of information. I'm wondering if, instead of storing each piece as a separate
field, we could store just the u8 itself and extract the individual components
with bitmasking when needed.
If we are aiming to minimize the byte footprint, it's a bit unfortunate that
we're storing 3 times more bytes than necessary for this data. Plus, deriving
the values from the byte is not computationally expensive.
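To make the suggestion concrete, here is a rough sketch of what I have in mind. The type name `PackedMetadataHeader` and the mask constants are hypothetical, and the bit layout (version in bits 0-3, sorted flag in bit 4, offset size minus one in bits 6-7) is my reading of the Variant metadata header, so please double-check it against the spec before relying on it:

```rust
/// Hypothetical compact header: keeps the raw byte and decodes fields on demand.
/// Assumed layout: bits 0-3 = version, bit 4 = sorted_strings,
/// bits 6-7 = offset_size_minus_one (bit 5 unused).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct PackedMetadataHeader(u8);

impl PackedMetadataHeader {
    const VERSION_MASK: u8 = 0b0000_1111;
    const SORTED_MASK: u8 = 0b0001_0000;
    const OFFSET_SIZE_SHIFT: u8 = 6;
    // The header itself occupies one byte.
    const NUM_HEADER_BYTES: u32 = 1;

    pub const fn new(byte: u8) -> Self {
        Self(byte)
    }

    /// Low four bits hold the version.
    pub const fn version(self) -> u8 {
        self.0 & Self::VERSION_MASK
    }

    /// Bit 4 signals that the dictionary strings are sorted.
    pub const fn is_sorted(self) -> bool {
        (self.0 & Self::SORTED_MASK) != 0
    }

    /// Top two bits store (offset size - 1), so the result is in 1..=4.
    pub const fn offset_size(self) -> u32 {
        ((self.0 >> Self::OFFSET_SIZE_SHIFT) as u32) + 1
    }

    /// Same computation as in the diff: header byte plus one offset-sized field.
    pub const fn first_offset_byte(self) -> u32 {
        Self::NUM_HEADER_BYTES + self.offset_size()
    }
}
```

With this shape the struct is a single byte, and validation (e.g. rejecting an unexpected version) could still happen once at construction time if we want to keep that behavior.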
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]