felipecrv commented on PR #39817:
URL: https://github.com/apache/arrow/pull/39817#issuecomment-1912925883

   Benchmarks [1] of `sort` and `rank` on chunked arrays, both heavy users of 
`ChunkResolver`. I took 3 measurements after roughly every change to give an 
idea of the level of noise. The purple group (`bounds-check-fix`) is where I 
fixed the out-of-bounds access bug that exists on `main` (it was not introduced 
by my optimizations). The two groups after it contain improvements that bring 
the throughput back to what was achieved before the bounds check was added.
   
   
![resolve](https://github.com/apache/arrow/assets/207795/dca080d7-3c20-450b-ac4d-8ce9fede6c74)
   
   Ideas that were tried and didn't make a difference or made throughput worse:
    - Removing the use of `std::atomic` completely. Relaxed atomic operations 
are enough (which is good, because removing the atomics could introduce bugs)
    - Starting the `Bisect` on different ranges depending on the results of the 
branches
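
   For readers unfamiliar with the technique, here is a minimal sketch of a 
chunk resolver that combines a bounds check, a cached chunk hint accessed with 
relaxed atomics, and a bisection fallback. It is not the code in this PR: 
`SketchChunkResolver`, `ChunkLocation`, and all member names are hypothetical, 
and the real `ChunkResolver` may differ in layout and behavior. The cached 
value is only a hint that is re-validated on every lookup, which is why relaxed 
loads and stores are assumed to be sufficient here.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical types for illustration; not Arrow's actual API.
struct ChunkLocation {
  int64_t chunk_index;     // equals the number of chunks when out of bounds
  int64_t index_in_chunk;  // position inside the resolved chunk
};

class SketchChunkResolver {
 public:
  // `lengths` holds the length of each chunk, in order.
  explicit SketchChunkResolver(const std::vector<int64_t>& lengths) {
    // offsets_[i] is the logical index of the first element of chunk i;
    // offsets_.back() is the total logical length.
    offsets_.push_back(0);
    for (int64_t len : lengths) {
      assert(len >= 0);
      offsets_.push_back(offsets_.back() + len);
    }
  }

  ChunkLocation Resolve(int64_t index) const {
    const int64_t num_chunks = static_cast<int64_t>(offsets_.size()) - 1;
    // Bounds check: without it, an out-of-range index would make the code
    // below read past the end of offsets_.
    if (index < 0 || index >= offsets_.back()) {
      return {num_chunks, 0};
    }
    // Fast path: consecutive lookups often land in the same chunk. The hint
    // is always a valid chunk index, and it is verified before use, so a
    // stale value only costs a bisection.
    const int64_t hint = cached_chunk_.load(std::memory_order_relaxed);
    if (index >= offsets_[hint] && index < offsets_[hint + 1]) {
      return {hint, index - offsets_[hint]};
    }
    const int64_t chunk = Bisect(index);
    cached_chunk_.store(chunk, std::memory_order_relaxed);
    return {chunk, index - offsets_[chunk]};
  }

 private:
  // Returns the last chunk whose start offset is <= index.
  // Precondition: 0 <= index < offsets_.back().
  int64_t Bisect(int64_t index) const {
    int64_t lo = 0;
    int64_t n = static_cast<int64_t>(offsets_.size());
    while (n > 1) {
      const int64_t half = n / 2;
      const int64_t mid = lo + half;
      if (index >= offsets_[mid]) {
        lo = mid;
        n -= half;
      } else {
        n = half;
      }
    }
    return lo;
  }

  std::vector<int64_t> offsets_;
  mutable std::atomic<int64_t> cached_chunk_{0};
};
```

   In this sketch, a resolver built from chunk lengths `{3, 0, 2}` maps logical 
index 3 to chunk 2 (skipping the empty chunk), and any index outside `[0, 5)` 
to a `chunk_index` equal to the number of chunks.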
    
   
   [1] `ninja arrow-compute-vector-sort-benchmark && 
./**/arrow-compute-vector-sort-benchmark 
--benchmark_filter="ChunkedArray(Sort|Rank).*Int64.*65536/100(/tiebreaker:2|$)" 
--benchmark_out_format=csv`

