Re: [I] Row format better data packing [fory]

via GitHub Tue, 17 Jun 2025 07:34:48 -0700


stevenschlansker commented on issue #2337:
URL: https://github.com/apache/fory/issues/2337#issuecomment-2980633243


   Yes, I had similar thoughts.
   For us, data density is probably more important than speed of access - we 
would happily pay 2x memory accesses for e.g. 30% data size reduction.
   
   I had the same thinking.
   First, sort fields by size, largest to smallest, to preserve alignment.
   
   Then, the easiest next step:
   prepare an array mapping ordinal -> offset. Pro: easy to understand. Con: 
double memory access.
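
   A rough sketch of the offset-array idea, to make the trade-off concrete. 
Everything here is hypothetical: `PackedLayout` and its `getOffset` are 
illustrative names, not Fory's actual row-format API, and it assumes 
fixed-width fields whose sizes are powers of two (1, 2, 4, 8 bytes).

```java
import java.util.Arrays;
import java.util.Comparator;

/** Hypothetical sketch only; not part of Fory. */
final class PackedLayout {
    /** Lookup table: schema ordinal -> byte offset within the packed row. */
    private final int[] offsets;
    /** Total bytes used by the fixed-width region. */
    final int totalSize;

    PackedLayout(int[] fieldSizes) {
        Integer[] order = new Integer[fieldSizes.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        // Stable sort, largest size first. With power-of-two sizes, every
        // field then starts at a multiple of its own size (assumption).
        Arrays.sort(order, Comparator.comparingInt((Integer i) -> -fieldSizes[i]));

        offsets = new int[fieldSizes.length];
        int cursor = 0;
        for (int ordinal : order) {
            offsets[ordinal] = cursor;
            cursor += fieldSizes[ordinal];
        }
        totalSize = cursor;
    }

    /** First memory access: the table. The second is the field read itself. */
    int getOffset(int ordinal) {
        return offsets[ordinal];
    }
}
```

   For example, six fields of sizes 1, 8, 2, 4, 1, 8 pack into 24 bytes this 
way, with the two 8-byte fields first and the 1-byte fields last.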
   
   Alternatively, we can compute an offset for each size class. Replace internal 
calls to `getOffset` with `getOffset1` (2, 4, 8), each with a pre-computed 
base offset for its size class, such that `base + ord * size = 
start-of-field`.
   Pro: faster (?) than fetching the offset from an array. Con: more code to 
write / maintain.
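
   One way the per-size-class variant could look, again only as a sketch. It 
assumes ordinals are renumbered to follow the sorted order (all 8-byte fields 
first, then 4, 2, 1), which is my assumption and not something the existing 
format does. `SizeClassLayout` is a hypothetical name; only the 
`getOffset1/2/4/8` naming comes from the idea above.

```java
/** Hypothetical sketch only; not part of Fory. */
final class SizeClassLayout {
    // Each base is chosen so that base + ord * size lands on the field start,
    // where ord is the field's position in the size-sorted order.
    private final int base8;
    private final int base4;
    private final int base2;
    private final int base1;

    SizeClassLayout(int count8, int count4, int count2) {
        // Byte offset where each size class begins.
        int start4 = 8 * count8;
        int start2 = start4 + 4 * count4;
        int start1 = start2 + 2 * count2;
        // Sorted ordinal of the first field in each size class.
        int first4 = count8;
        int first2 = first4 + count4;
        int first1 = first2 + count2;

        base8 = 0;
        base4 = start4 - first4 * 4;
        base2 = start2 - first2 * 2;
        base1 = start1 - first1;
    }

    // No table lookup: one multiply (or shift) and one add per access.
    int getOffset8(int ord) { return base8 + ord * 8; }
    int getOffset4(int ord) { return base4 + ord * 4; }
    int getOffset2(int ord) { return base2 + ord * 2; }
    int getOffset1(int ord) { return base1 + ord; }
}
```

   The caller has to know the field's size class at the call site, which is 
where the extra code to write and maintain comes from.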
   
   The downside is that this will not be compatible with the existing row 
format, nor with existing xlang impls. But it could be an option for the user 
to choose, and if it turns out to be popular, we can implement it in other 
languages too.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

