scovich opened a new issue, #7831:
URL: https://github.com/apache/arrow-rs/issues/7831

   **Is your feature request related to a problem or challenge? Please describe 
what you are trying to do.**
   
   `VariantMetadata` is currently 40 bytes. It's questionable whether 
that's _really_ an appropriate size for a `Copy` type.
   
   Worse, `VariantList` and `VariantObject` (also `Copy`) both take 
`VariantMetadata` by value. This makes lifetimes easier to manage, but bloats 
them to 88 and 96 bytes, respectively.
   
   **Describe the solution you'd like**
   
   We should consider storing fewer fields (`first_offset_byte` is very cheap 
and safe to compute from other header information), and also storing `u32` 
instead of `usize` -- variant only uses 32-bit indexing, so we shouldn't double 
the physical size just for convenience of using `usize` on 64-bit architectures.
   
   Doing that would give the following size reductions:
   * `VariantMetadata`: 40 => 32 bytes
   * `VariantList`: 88 => 64 bytes
   * `VariantObject`: 96 => 64 bytes
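
As a rough sketch of where the savings come from, the two layouts below compare `usize` fields plus a cached offset against a `u32` count with the offset computed on demand. The struct and field names are hypothetical and are not the actual arrow-rs definitions, so the printed sizes will not necessarily match the numbers above. The `first_offset_byte` formula assumes the metadata starts with one header byte followed by a dictionary size stored in `offset_size` bytes.

```rust
use std::mem::size_of;

/// Hypothetical "wide" layout: `usize` fields plus a cached offset.
#[allow(dead_code)]
#[derive(Clone, Copy)]
struct WideMetadata<'m> {
    bytes: &'m [u8],
    dictionary_size: usize,
    first_offset_byte: usize, // cached, but derivable from `offset_size`
    offset_size: u8,
    validated: bool,
}

/// Hypothetical "compact" layout: `u32` count and no cached offset.
#[allow(dead_code)]
#[derive(Clone, Copy)]
struct CompactMetadata<'m> {
    bytes: &'m [u8],
    dictionary_size: u32,
    offset_size: u8,
    validated: bool,
}

impl CompactMetadata<'_> {
    /// Assumed layout: 1 header byte, then the dictionary size stored in
    /// `offset_size` bytes, then the offset array.
    fn first_offset_byte(&self) -> u32 {
        1 + u32::from(self.offset_size)
    }
}

fn main() {
    println!("wide:    {} bytes", size_of::<WideMetadata<'static>>());
    println!("compact: {} bytes", size_of::<CompactMetadata<'static>>());

    let m = CompactMetadata {
        bytes: &[],
        dictionary_size: 0,
        offset_size: 4,
        validated: false,
    };
    println!("first offset byte: {}", m.first_offset_byte());
}
```

Because the slice reference forces pointer alignment, any saved bytes only show up once they add up to a whole alignment unit.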
   
   NOTE: Since we're storing byte slices, we're stuck with 8-byte alignment and 
it would be ~impossible to go smaller. It's still debatable whether these are 
really `Copy` types, but at least they would fit in a typical CPU cache line 
now?
   
   **Describe alternatives you've considered**
   
   We could reduce the footprint of object and list by storing 
`VariantMetadata` by reference instead of by value. But if we shrink 
`VariantMetadata` that would only save 16 bytes while complicating the user 
surface with additional lifetime management worries.
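
To make the trade-off concrete, here is a minimal sketch of the by-value and by-reference variants. All type and field names are hypothetical stand-ins rather than the real `VariantList` definition. The by-reference version is smaller, but it introduces an extra lifetime (`'a` here) that every caller has to thread through.

```rust
use std::mem::size_of;

/// Hypothetical stand-in for a shrunken metadata type.
#[allow(dead_code)]
#[derive(Clone, Copy)]
struct Metadata<'m> {
    bytes: &'m [u8],
    dictionary_size: u32,
}

/// Holds the metadata by value: larger, but only the data lifetimes appear.
#[allow(dead_code)]
#[derive(Clone, Copy)]
struct ListByValue<'m, 'v> {
    metadata: Metadata<'m>,
    value: &'v [u8],
    num_elements: u32,
}

/// Holds the metadata by reference: smaller, but adds the borrow lifetime `'a`.
#[allow(dead_code)]
#[derive(Clone, Copy)]
struct ListByRef<'a, 'm, 'v> {
    metadata: &'a Metadata<'m>,
    value: &'v [u8],
    num_elements: u32,
}

fn main() {
    println!(
        "by value: {} bytes",
        size_of::<ListByValue<'static, 'static>>()
    );
    println!(
        "by ref:   {} bytes",
        size_of::<ListByRef<'static, 'static, 'static>>()
    );
}
```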
   
   We may even want to consider using `u32` offsets everywhere, since that's 
actually what the variant spec mandates. That would also make the arithmetic 
overflow issue easier to reason about and test, because 32- and 64-bit 
architectures would both behave the same way. But Rust really likes `usize` for 
indexing operations, so this may also complicate the user surface.
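
One possible shape for this, sketched with a hypothetical helper (`slice_at` is not an existing arrow-rs function): keep offsets as `u32`, do the arithmetic with checked operations so overflow behaves the same on every architecture, and convert to `usize` only at the point of slicing.

```rust
/// Hypothetical helper: returns `bytes[start..start + len]`, or `None` if the
/// range overflows `u32` or falls outside the buffer.
fn slice_at(bytes: &[u8], start: u32, len: u32) -> Option<&[u8]> {
    // Overflow is detected in u32 space, independent of the pointer width.
    let end = start.checked_add(len)?;
    // Convert to usize only at the indexing boundary.
    let start = usize::try_from(start).ok()?;
    let end = usize::try_from(end).ok()?;
    bytes.get(start..end)
}

fn main() {
    let data = [1u8, 2, 3, 4];
    println!("{:?}", slice_at(&data, 1, 2));
    println!("{:?}", slice_at(&data, u32::MAX, 1));
    println!("{:?}", slice_at(&data, 1, 5));
}
```

The conversion noise stays inside one helper, so callers would not need to sprinkle `as usize` casts across indexing sites.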
   
   **Additional context**
   
   N/A


