hzuo opened a new pull request, #12793:
URL: https://github.com/apache/arrow/pull/12793
While profiling in-browser decoding of the TPC-H Customer and Part datasets,
which contain many UTF-8 strings, it turned out that much of the time was being
spent in `getVariableWidthBytes`: about 10% on Chrome and close to 40% on
Safari (Safari's TextDecoder is much faster than Chrome's, so
`getVariableWidthBytes` took up relatively more of the total time).
This is likely because the code in this PR is more amenable to the V8 and JSC
JITs: `x` and `y` are guaranteed to be Smis ("small integers") rather than
arbitrary objects, allowing the JIT to emit efficient machine instructions that
deal only in 32-bit integers. Once V8 discovers that `x` and `y` can
potentially be missing (a read past the bounds returns `undefined`), it
"poisons" the code path forever, since it then has to handle the non-integer
case.
See this V8 post for a more in-depth explanation (in particular see the
examples underneath "Performance tips"):
https://v8.dev/blog/elements-kinds
Doing the bounds check explicitly instead of implicitly all but removes this
function from the profile. Empirically, on my machine, decoding TPC-H Part
dropped from 1.9s to 1.7s on Chrome, and Customer dropped from 1.4s to 1.2s.
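
To illustrate the difference between the implicit and explicit checks, here is
a minimal sketch. It is not the code from this PR: the function names
`sliceImplicit` and `sliceExplicit`, the parameter names, and the sample data
are all hypothetical, and it assumes the offsets are stored in an `Int32Array`.

```typescript
// Hypothetical stand-ins for illustration; not the Arrow source.

// Implicit check: read both offsets first, then test whether either is
// missing. A read past the end of the array yields `undefined`, so the
// engine has to treat `x` and `y` as "integer or undefined".
function sliceImplicit(values: Uint8Array, valueOffsets: Int32Array, index: number): Uint8Array | null {
    const x: number | undefined = valueOffsets[index];
    const y: number | undefined = valueOffsets[index + 1];
    return x != null && y != null ? values.subarray(x, y) : null;
}

// Explicit check: compare the index against the length before reading, so
// every read is in bounds and `x` and `y` are always integers.
function sliceExplicit(values: Uint8Array, valueOffsets: Int32Array, index: number): Uint8Array | null {
    if (index < 0 || index + 1 >= valueOffsets.length) {
        return null;
    }
    const x = valueOffsets[index];
    const y = valueOffsets[index + 1];
    return values.subarray(x, y);
}

// Three byte strings packed back to back: "ab", "", "cde" (ASCII codes).
const values = Uint8Array.from([97, 98, 99, 100, 101]);
const valueOffsets = Int32Array.from([0, 2, 2, 5]);

// Both variants return the same slices; only the way they detect an
// out-of-range index differs.
for (let i = 0; i < valueOffsets.length; i++) {
    const bytes = sliceExplicit(values, valueOffsets, i);
    console.log(i, bytes === null ? null : Array.from(bytes));
}
```

Both functions return the same result for every index, so the change is purely
about what the JIT sees: in the explicit variant the out-of-bounds read never
happens.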