mzabaluev commented on PR #9700: URL: https://github.com/apache/arrow-rs/pull/9700#issuecomment-4263701899
> If V2 page headers are enabled, I believe we fall back to one of the delta encodings (at least for ints and byte arrays). Estimating those sizes might be a good deal harder.

Since this is only a heuristic, and the wrong decision is not fatal, I thought that the estimation does not have to be perfect. The plain-encoded size is easy and quick to compute (there is no need to even read the values for fixed-length types), and it gives a good approximation of the worst case: all the other encodings were invented to improve on the plain one, after all.

I'll think about developing this further by giving a cheaply computed upper bound on the size for the fallback encoding that is actually used, but I don't want to make it too precise at the cost of extra computation and memory reads.
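To make the cost argument concrete, here is a minimal sketch of a plain-encoded size estimate. It is not the code from this PR: `PhysicalKind`, `plain_size_fixed`, and `plain_size_byte_arrays` are hypothetical names introduced only for illustration. The sizes follow the Parquet format's PLAIN encoding (bit-packed booleans, fixed-width numeric values, and a 4-byte length prefix per variable-length byte array).

```rust
/// Physical column kinds with a fixed PLAIN-encoded width.
/// Hypothetical enum for illustration; not the `parquet` crate's own type.
#[derive(Debug, Clone, Copy)]
pub enum PhysicalKind {
    Boolean,
    Int32,
    Int64,
    Int96,
    Float,
    Double,
    /// Fixed-length byte array with the given width in bytes.
    FixedLenByteArray(usize),
}

/// PLAIN-encoded size in bytes of `num_values` fixed-width values.
/// Only the value count is needed; the values themselves are never read.
pub fn plain_size_fixed(kind: PhysicalKind, num_values: usize) -> usize {
    match kind {
        // PLAIN booleans are bit-packed, one bit per value, rounded up to bytes.
        PhysicalKind::Boolean => (num_values + 7) / 8,
        PhysicalKind::Int32 | PhysicalKind::Float => num_values * 4,
        PhysicalKind::Int64 | PhysicalKind::Double => num_values * 8,
        PhysicalKind::Int96 => num_values * 12,
        PhysicalKind::FixedLenByteArray(width) => num_values * width,
    }
}

/// PLAIN-encoded size in bytes of variable-length byte arrays.
/// Each value is stored as a 4-byte length prefix followed by its bytes,
/// so this needs the lengths but not the contents.
pub fn plain_size_byte_arrays<'a, I>(values: I) -> usize
where
    I: IntoIterator<Item = &'a [u8]>,
{
    values.into_iter().map(|v| 4 + v.len()).sum()
}
```

For fixed-width kinds the estimate is a single multiplication, which is why it is cheap enough for a heuristic. Byte arrays need one pass over the lengths, and a writer that already tracks the total byte length could skip even that.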
