Re: [PR] fix: move from atomic.(Add|Load|Store) to atomic.Int64{} [arrow-go]

via GitHub Fri, 04 Apr 2025 05:11:02 -0700


sahib commented on PR #326:
URL: https://github.com/apache/arrow-go/pull/326#issuecomment-2778508876


   @zeroshade  A minor thing I also noticed that might be worth sharing:
   
   Most of the asm functions are selected via function variables at 
runtime (i.e. during `init()` different implementations are assigned based on 
CPU features). Overall a nice approach, but it comes with a small performance 
penalty, as calls through a function variable cannot be inlined (and maybe 
some other optimizations are blocked as well?).
   
   Consider this mini benchmark:
   
   ```go
   func BenchmarkTestGreaterThanBitmap(b *testing.B) {
        const N = 10
        levels := make([]int16, N)
        for idx := range levels {
                levels[idx] = int16(idx)
        }
   
        b.Run("func", func(b *testing.B) {
                for b.Loop() {
                        GreaterThanBitmap(levels, int16(N/2))
                }
        })
   
        b.Run("no-func-go", func(b *testing.B) {
                for b.Loop() {
                        greaterThanBitmapGo(levels, int16(N/2))
                }
        })
   }
   ```
   
   ```sh
   # noasm to make sure that we do not compare against an arch-specific function
   $ go test -tags noasm -bench=. -run=xxx
   BenchmarkTestGreaterThanBitmap/func-16               165481227               7.289 ns/op
   BenchmarkTestGreaterThanBitmap/no-func-go-16         243919600               4.611 ns/op
   ```
   
   That difference of course becomes negligible as the function's runtime 
increases. But overall it would probably be possible to squeeze out a few 
percent of benchmark speed by changing those function variables to something 
along these lines:
   
   ```go
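   func ExtractBits(...) {
       // Use the BMI2-accelerated version when the CPU supports it.
       if cpu.X86.HasBMI2 {
           return extractBitsBMI(...)
       }
   
       // Otherwise fall back to the pure Go implementation.
       return extractBitsGo(...)
   }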
   ```
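
To make the two dispatch styles concrete, here is a minimal, self-contained sketch that puts them side by side. It is not the actual arrow-go code: `hasFastPath`, `countGreaterGo`, `countGreaterFast`, `countGreaterVar` and `CountGreater` are all hypothetical names, and the "fast" variant simply reuses the Go fallback so the example runs anywhere.

```go
package main

import "fmt"

// hasFastPath stands in for a CPU feature flag such as cpu.X86.HasBMI2.
// Hypothetical: hard-coded here so the sketch has no external dependencies.
var hasFastPath = false

// countGreaterGo is a hypothetical pure-Go fallback: it counts the
// entries of levels that are strictly greater than rhs.
func countGreaterGo(levels []int16, rhs int16) int {
	n := 0
	for _, l := range levels {
		if l > rhs {
			n++
		}
	}
	return n
}

// countGreaterFast stands in for an assembly-backed implementation.
// Assumption: in this sketch it just delegates to the Go version.
func countGreaterFast(levels []int16, rhs int16) int {
	return countGreaterGo(levels, rhs)
}

// Style 1: a function variable assigned once in init().
// Every call is an indirect call through the variable.
var countGreaterVar func(levels []int16, rhs int16) int

func init() {
	if hasFastPath {
		countGreaterVar = countGreaterFast
	} else {
		countGreaterVar = countGreaterGo
	}
}

// Style 2: a plain wrapper that branches on the feature flag.
// The callees are known statically, so the compiler is free to
// consider them for inlining.
func CountGreater(levels []int16, rhs int16) int {
	if hasFastPath {
		return countGreaterFast(levels, rhs)
	}
	return countGreaterGo(levels, rhs)
}

func main() {
	levels := []int16{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
	fmt.Println(countGreaterVar(levels, 5), CountGreater(levels, 5))
}
```

Both styles return the same result; the difference is only in how the call is dispatched. Whether the wrapper actually gets inlined depends on the compiler's inlining budget, so the gain should be confirmed with a benchmark like the one above.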
   
   I pushed a dummy branch here: 
https://github.com/sahib/arrow-go/tree/bench/build-tag - it seems like the whole 
platform selection also gets less convoluted with this approach. All in all 
this is minor stuff, but I wanted your opinion on it first.


