jhorstmann commented on issue #5855:
URL: https://github.com/apache/arrow-rs/issues/5855#issuecomment-2155076755

   I did not see varint decoding show up as a bottleneck in my benchmarks. I 
experimented with [using BMI2 
instructions](https://github.com/jhorstmann/compact-thrift/issues/4), but that 
still requires at least one branch to check whether 8 bytes can be read at 
once, falling back to sequential code if not, plus one or two more branches 
for numbers larger than 8*7 bits. The code therefore does not end up much 
smaller, and assuming mostly small integers and good branch prediction there 
seems to be no big improvement.
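
   To illustrate the branch structure described above, here is a rough, 
portable sketch. It is not the code from the linked issue: the function names 
are made up for this example, and the BMI2 bit extraction is replaced by three 
plain shift-and-mask steps so that it runs on any target. On x86_64 with BMI2, 
those three steps are what a single `_pext_u64` with the mask 
`0x7f7f_7f7f_7f7f_7f7f` would do.

```rust
/// Sequential ULEB128 decoding: one byte and one branch per iteration.
/// Returns the decoded value and the number of bytes consumed.
/// (Hypothetical helper, named for this sketch only.)
fn decode_varint_sequential(buf: &[u8]) -> Option<(u64, usize)> {
    let mut value = 0u64;
    for (i, &byte) in buf.iter().enumerate().take(10) {
        value |= u64::from(byte & 0x7f) << (7 * i);
        if byte & 0x80 == 0 {
            return Some((value, i + 1));
        }
    }
    None // truncated input, or no terminating byte within 10 bytes
}

/// Word-at-a-time decoding with a sequential fallback.
/// (Hypothetical helper, named for this sketch only.)
fn decode_varint(buf: &[u8]) -> Option<(u64, usize)> {
    // Branch 1: can we load 8 bytes at once?
    if buf.len() < 8 {
        return decode_varint_sequential(buf);
    }
    let mut bytes = [0u8; 8];
    bytes.copy_from_slice(&buf[..8]);
    let word = u64::from_le_bytes(bytes);

    // A set bit marks a byte whose continuation bit is clear.
    let stops = !word & 0x8080_8080_8080_8080;
    // Branch 2: no terminator in 8 bytes, so the value exceeds 8 * 7 bits.
    if stops == 0 {
        return decode_varint_sequential(buf);
    }
    let len = (stops.trailing_zeros() as usize + 1) / 8; // 1..=8

    // Keep the first `len` bytes and drop their continuation bits.
    let mut x = word & (u64::MAX >> (64 - 8 * len)) & 0x7f7f_7f7f_7f7f_7f7f;
    // Software stand-in for a BMI2 bit extraction: pack the 7-bit groups.
    x = ((x & 0x7f00_7f00_7f00_7f00) >> 1) | (x & 0x007f_007f_007f_007f);
    x = ((x & 0x3fff_0000_3fff_0000) >> 2) | (x & 0x0000_3fff_0000_3fff);
    x = ((x & 0x0fff_ffff_0000_0000) >> 4) | (x & 0x0000_0000_0fff_ffff);
    Some((x, len))
}
```

   Even in this form there are two branches before any bits are extracted, 
and values longer than 8 bytes still go through the sequential loop.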
   
   I think most of the unaccounted time in the flamegraph is related to moving 
data on the stack, which Rust still does not optimize very well.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

