mzabaluev commented on PR #9700:
URL: https://github.com/apache/arrow-rs/pull/9700#issuecomment-4263701899

   > If V2 page headers are enabled, I believe we fall back to one of the delta 
encodings (at least for ints and byte arrays). Estimating those sizes might be 
a good deal harder.
   
   Since this is only a heuristic, and a wrong decision is not fatal, I 
figured the estimate does not have to be perfect. The plain-encoded size is 
easy and quick to compute (for fixed-length types there is no need to even read 
the values), and it gives a good approximation of the worst case: all the other 
encodings were invented to improve on plain, after all. I'll consider 
developing this further by providing a cheaply computed upper bound on the size 
for the fallback encoding actually used, but I don't want to make it too 
precise at the cost of extra computation and memory reads.
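
   To make the "easy and quick to compute" point concrete, here is a minimal 
sketch of a plain-encoded size estimate. It is not the code in this PR: the 
`PhysicalType` enum and the `plain_encoded_size` function are hypothetical 
names made up for illustration. The byte widths follow the Parquet PLAIN 
encoding (bit-packed booleans, a 4-byte length prefix per `BYTE_ARRAY` value).

```rust
/// Hypothetical stand-in for a Parquet physical type (illustration only,
/// not the enum used by the parquet crate).
#[derive(Clone, Copy, Debug)]
pub enum PhysicalType {
    Boolean,
    Int32,
    Int64,
    Int96,
    Float,
    Double,
    /// Variable-length values; the caller supplies the total payload length.
    ByteArray,
    /// Fixed-length values of the given width in bytes.
    FixedLenByteArray(usize),
}

/// Estimate the PLAIN-encoded size in bytes of `num_values` values.
///
/// `total_byte_array_len` is the summed length of all values and is only
/// used for `ByteArray`; every other type is sized from the count alone,
/// so the values themselves are never read.
pub fn plain_encoded_size(
    ty: PhysicalType,
    num_values: usize,
    total_byte_array_len: usize,
) -> usize {
    match ty {
        // Booleans are bit-packed: one bit per value, rounded up to bytes.
        PhysicalType::Boolean => (num_values + 7) / 8,
        PhysicalType::Int32 | PhysicalType::Float => num_values * 4,
        PhysicalType::Int64 | PhysicalType::Double => num_values * 8,
        PhysicalType::Int96 => num_values * 12,
        // Each value is prefixed with its length as a 4-byte integer.
        PhysicalType::ByteArray => num_values * 4 + total_byte_array_len,
        PhysicalType::FixedLenByteArray(width) => num_values * width,
    }
}
```

   For variable-length data, the total payload length could presumably be 
taken from the array's offsets rather than by visiting each value, which would 
keep the estimate cheap for byte arrays as well.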

