js8544 commented on issue #13981:
URL: https://github.com/apache/arrow/issues/13981#issuecomment-1233837375

   @drin Thanks for your reply! I checked my build setup and found that I had been compiling Arrow with no optimization (`-O0`). I made the following changes to my build process and reran the benchmarks.
   
   1. Added the `-O2` optimization flag.
   2. Set `ARROW_SIMD_LEVEL=AVX2` and `ARROW_RUNTIME_SIMD_LEVEL=MAX` (AVX2 is the best our machines support).
   3. According to @save-buffer, memory allocation cost is high for `arrow::compute`, so I switched to `jemalloc` for memory allocation.
   
   The new results are satisfactory. `compute` is roughly on par with `raw` at small batch sizes (somewhat slower at 25, about even at 50) and performs much better as the batch size increases (about 2.6x faster at 1000).
   
   Batch size 25:
   |               ns/op |                op/s |    err% |     total | Benchmarking simple feature 25
   |--------------------:|--------------------:|--------:|----------:|:-------------------------------
   |           11,712.29 |           85,380.39 |    0.4% |      0.02 | `arrow_compute`
   |            9,616.53 |          103,987.66 |    0.9% |      0.01 | `arrow_raw`
   
   Batch size 50:
   |               ns/op |                op/s |    err% |     total | Benchmarking simple feature 50
   |--------------------:|--------------------:|--------:|----------:|:-------------------------------
   |           11,755.04 |           85,069.87 |    0.5% |      0.02 | `arrow_compute`
   |           11,417.35 |           87,586.02 |    0.3% |      0.01 | `arrow_raw`
   
   Batch size 100:
   |               ns/op |                op/s |    err% |     total | Benchmarking simple feature 100
   |--------------------:|--------------------:|--------:|----------:|:--------------------------------
   |           11,930.63 |           83,817.85 |    1.1% |      0.02 | `arrow_compute`
   |           15,189.21 |           65,836.21 |    0.7% |      0.02 | `arrow_raw`
   
   Batch size 1000:
   |               ns/op |                op/s |    err% |     total | Benchmarking simple feature 1000
   |--------------------:|--------------------:|--------:|----------:|:---------------------------------
   |           27,932.44 |           35,800.67 |    0.5% |      0.04 | `arrow_compute`
   |           73,840.01 |           13,542.79 |    0.2% |      0.09 | `arrow_raw`
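
To make the trend across batch sizes easier to read, here is a small sketch that turns the `ns/op` figures from the four tables into a `raw / compute` ratio. The numbers are copied from the tables above. The names `results` and `speedup` are made up for this illustration and are not part of Arrow or the benchmark harness.

```python
# ns/op values copied from the benchmark tables, keyed by batch size:
# (arrow_compute, arrow_raw)
results = {
    25: (11712.29, 9616.53),
    50: (11755.04, 11417.35),
    100: (11930.63, 15189.21),
    1000: (27932.44, 73840.01),
}


def speedup(compute_ns, raw_ns):
    """Return raw time divided by compute time.

    A value above 1 means arrow_compute is faster than arrow_raw;
    a value below 1 means it is slower.
    """
    return raw_ns / compute_ns


# Hypothetical helper output: one ratio per batch size.
ratios = {batch: speedup(c, r) for batch, (c, r) in results.items()}

for batch, ratio in ratios.items():
    print(f"batch {batch:>4}: raw/compute = {ratio:.2f}x")
```

A ratio below 1 at batch size 25 means `arrow_compute` is still slightly slower there, while the ratio at batch size 1000 shows it taking well under half the time of `arrow_raw`.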
   
   Since the results are now as expected, I will close this issue. Thanks @drin and @save-buffer for your help!


