jhorstmann commented on issue #5855: URL: https://github.com/apache/arrow-rs/issues/5855#issuecomment-2155076755
I did not see varint decoding as a bottleneck in my benchmarks. I experimented with [using BMI2 instructions](https://github.com/jhorstmann/compact-thrift/issues/4), but that still requires at least one branch to check whether we can read 8 bytes at a time, falling back to sequential code if not, and one or two more branches for numbers larger than 56 bits (8 × 7). With those branches the code is no longer much smaller, and assuming mostly small integers and good branch prediction there seems to be no big improvement. I think most of the unaccounted time in the flamegraph is related to moving data on the stack, which Rust still does not optimize that well.
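
To make the branch structure described above concrete, here is a minimal sketch of a varint (ULEB128) decoder with an 8-byte fast path. It is not the code from compact-thrift or arrow-rs: the function names are hypothetical, and the 7-bit groups are compacted with portable shifts and masks standing in for the BMI2 `pext` instruction, so it only illustrates where the extra branches come from.

```rust
/// Sequential decoding: one data-dependent branch per input byte.
/// Returns the value and the number of bytes consumed, or `None` if the
/// input ends before a terminating byte (or exceeds 10 bytes).
fn decode_varint_sequential(buf: &[u8]) -> Option<(u64, usize)> {
    let mut value = 0u64;
    for (i, &byte) in buf.iter().enumerate().take(10) {
        value |= u64::from(byte & 0x7f) << (7 * i);
        if byte & 0x80 == 0 {
            return Some((value, i + 1));
        }
    }
    None
}

/// Fast path sketch (hypothetical name): load 8 bytes at once when possible.
fn decode_varint(buf: &[u8]) -> Option<(u64, usize)> {
    // Branch 1: are there 8 readable bytes? If not, fall back.
    if buf.len() < 8 {
        return decode_varint_sequential(buf);
    }
    let mut bytes = [0u8; 8];
    bytes.copy_from_slice(&buf[..8]);
    let word = u64::from_le_bytes(bytes);

    // A set bit marks a byte whose continuation bit is clear.
    let stops = !word & 0x8080_8080_8080_8080;
    // Branch 2: no terminator within 8 bytes, so the value needs more
    // than 8 * 7 = 56 bits; fall back to the sequential loop.
    if stops == 0 {
        return decode_varint_sequential(buf);
    }

    // Length in bytes (1..=8) of this varint.
    let len = (stops.trailing_zeros() / 8 + 1) as usize;
    // Drop bytes that belong to whatever follows the varint.
    let masked = word & (u64::MAX >> (64 - 8 * len));

    // Squeeze out the continuation bits: eight 7-bit groups become one
    // 56-bit value. With BMI2 this would be a single `pext`.
    let mut x = masked & 0x7f7f_7f7f_7f7f_7f7f;
    x = ((x & 0x7f00_7f00_7f00_7f00) >> 1) | (x & 0x007f_007f_007f_007f);
    x = ((x & 0x3fff_0000_3fff_0000) >> 2) | (x & 0x0000_3fff_0000_3fff);
    x = ((x & 0x0fff_ffff_0000_0000) >> 4) | (x & 0x0000_0000_0fff_ffff);
    Some((x, len))
}
```

For example, `decode_varint(&[0xAC, 0x02, 0, 0, 0, 0, 0, 0])` takes the fast path and yields `Some((300, 2))`, while the two-byte slice `&[0xAC, 0x02]` yields the same result through the sequential fallback. Even in this form the fast path keeps two branches ahead of the bit manipulation, which is why it gains little when most values are one or two bytes and the sequential loop predicts well.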
