hzuo opened a new pull request, #12793:
URL: https://github.com/apache/arrow/pull/12793

   While profiling in-browser decoding of the TPC-H Customer and Part 
datasets, which contain many UTF-8 strings, it turned out that much of the 
time was being spent in `getVariableWidthBytes`: about 10% on Chrome and 
close to 40% on Safari (Safari's TextDecoder is much faster than Chrome's, 
so `getVariableWidthBytes` took up a relatively larger share of the time).
   
   The code in this PR is likely more amenable to the V8 and JSC JITs, since 
`x` and `y` are guaranteed to be Smis ("small integers") rather than arbitrary 
objects, which allows the JIT to emit efficient machine instructions that deal 
only in 32-bit integers. Once V8 discovers that `x` and `y` can potentially be 
`undefined` (upon reading past the bounds), it "poisons" the code path 
forever, since from then on it has to handle that case.
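
   To make the out-of-bounds case concrete, here is a minimal sketch of what 
an unchecked read on a typed array produces. `readPair` is a hypothetical 
helper written for illustration, not code from Arrow. It only shows the 
language-level behavior (a read past the end yields `undefined`, not a 
number); it does not measure how a JIT reacts to it.

```typescript
// Hypothetical helper, not taken from the Arrow codebase: reads two adjacent
// offsets without checking bounds first.
function readPair(
    offsets: Int32Array,
    index: number
): [number | undefined, number | undefined] {
    // Indexing a typed array past its length does not throw; it evaluates
    // to undefined, so the result is no longer guaranteed to be an integer.
    return [offsets[index], offsets[index + 1]];
}

const pairOffsets = new Int32Array([0, 3, 6]);

// Both reads fall inside the array: the values are 3 and 6.
const inBounds = readPair(pairOffsets, 1);

// The second read is one past the end: the values are 6 and undefined.
const pastEnd = readPair(pairOffsets, 2);

console.log(inBounds, pastEnd);
```

A caller of such a function has to handle both `number` and `undefined`, 
which is the mixed case the paragraph above describes.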
   
   See this V8 post for a more in-depth explanation (in particular see the 
examples underneath "Performance tips"):
   https://v8.dev/blog/elements-kinds
   
   Doing the bounds check explicitly rather than implicitly all but 
eliminates this function from the profile. Empirically, on my machine, 
decoding TPC-H Part dropped from 1.9s to 1.7s on Chrome, and Customer dropped 
from 1.4s to 1.2s.
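
   The difference between an implicit and an explicit bounds check can be 
sketched as follows. This is an illustration under assumptions, not the 
actual diff: `sliceImplicit` and `sliceExplicit` are hypothetical names, and 
the parameter names `values` and `valueOffsets` are assumed for the example. 
Both functions return the same results; the explicit version just never 
reads outside the offsets array.

```typescript
// Implicit check: read first, then test whether either read fell outside
// the array (an out-of-bounds typed-array read evaluates to undefined).
function sliceImplicit(
    values: Uint8Array,
    valueOffsets: Int32Array,
    index: number
): Uint8Array | null {
    const x: number | undefined = valueOffsets[index];
    const y: number | undefined = valueOffsets[index + 1];
    return x !== undefined && y !== undefined ? values.subarray(x, y) : null;
}

// Explicit check: compare the index against the length up front, so every
// read that follows is in bounds and always yields an integer.
function sliceExplicit(
    values: Uint8Array,
    valueOffsets: Int32Array,
    index: number
): Uint8Array | null {
    if (index < 0 || index + 1 >= valueOffsets.length) {
        return null;
    }
    const x = valueOffsets[index];
    const y = valueOffsets[index + 1];
    return values.subarray(x, y);
}

// Usage: the bytes of "foobar" split into two values, "foo" and "bar".
const sampleValues = Uint8Array.from([102, 111, 111, 98, 97, 114]);
const sampleOffsets = new Int32Array([0, 3, 6]);

const second = sliceExplicit(sampleValues, sampleOffsets, 1); // bytes of "bar"
const missing = sliceExplicit(sampleValues, sampleOffsets, 2); // null
console.log(second, missing);
```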
   

