[GitHub] [arrow] josiahyan opened a new pull request #8214: ARROW-9965: [Java] Improve performance of BaseFixedWidthVector.setSafe by optimizing capacity calculations

GitBox Thu, 17 Sep 2020 17:41:53 -0700


josiahyan opened a new pull request #8214:
URL: https://github.com/apache/arrow/pull/8214



   It turns out that setSafe performs a very expensive integer division when 
computing buffer capacity: it divides by the field size, which is not a 
compile-time constant. Although the field size is typically a power of 2, the 
division therefore does not compile down to a bitshift.
   
   Special-casing the power-of-2 field sizes and forcing a bitshift operation 
gives a roughly 3x speedup (about 2.4x to 3.4x across the cases below) in 
benchmarks that use a hot loop to set Arrow vectors. We have a similar use 
case in an internal data-intensive service.
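   
   To illustrate the idea, here is a minimal sketch and not the code in this 
PR: the class and method names (CapacitySketch, capacityByDivision, 
capacityByShift) are hypothetical, and it assumes a non-negative buffer 
capacity in bytes and a positive field width.

```java
public final class CapacitySketch {
    private CapacitySketch() {}

    /** General path: integer division by a width only known at run time. */
    static long capacityByDivision(long capacityBytes, int typeWidth) {
        return capacityBytes / typeWidth;
    }

    /**
     * Fast path: when the width is a power of two, shift right by log2(width).
     * For non-negative capacities this gives the same result as the division.
     * Other widths fall back to the division.
     */
    static long capacityByShift(long capacityBytes, int typeWidth) {
        boolean powerOfTwo = typeWidth > 0 && (typeWidth & (typeWidth - 1)) == 0;
        if (powerOfTwo) {
            return capacityBytes >> Integer.numberOfTrailingZeros(typeWidth);
        }
        return capacityBytes / typeWidth;
    }

    public static void main(String[] args) {
        long capacityBytes = 4096;
        // Typical fixed widths plus one width that is not a power of two.
        for (int width : new int[] {1, 2, 4, 8, 16, 12}) {
            System.out.println(width + " -> " + capacityByShift(capacityBytes, width));
        }
    }
}
```

   Both methods return the same value for every positive width; only the 
instructions on the power-of-2 path differ.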
   
   Benchmark results with arrow.enable_unsafe_memory_access=true
   
   Before:
   ```
   Benchmark                         Mode  Cnt   Score   Error  Units
   IntBenchmarks.setIntDirectly      avgt   15   9.563 ± 0.335  us/op
   IntBenchmarks.setWithValueHolder  avgt   15   9.266 ± 0.064  us/op
   IntBenchmarks.setWithWriter       avgt   15  18.806 ± 0.154  us/op
   ```
   
   After:
   ```
   Benchmark                         Mode  Cnt   Score   Error  Units
   IntBenchmarks.setIntDirectly      avgt   15   3.490 ± 0.175  us/op
   IntBenchmarks.setWithValueHolder  avgt   15   3.806 ± 0.015  us/op
   IntBenchmarks.setWithWriter       avgt   15   5.490 ± 0.304  us/op
   ```
   
   See https://issues.apache.org/jira/browse/ARROW-9965 for further benchmarks, 
and an analysis of the root cause of the slowdown.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
