abreis opened a new pull request #9454: URL: https://github.com/apache/arrow/pull/9454
This PR proposes a `divide_scalar` kernel that divides numeric arrays by a single scalar. In the benchmarks below it takes roughly 36-54% less time than `divide`:

```
# features = []
divide 512               time: [2.3210 us 2.3345 us 2.3490 us]
divide_scalar 512        time: [1.4374 us 1.4425 us 1.4485 us] (-38%)
divide_nulls 512         time: [2.1718 us 2.1799 us 2.1894 us]
divide_scalar_nulls 512  time: [1.3888 us 1.3959 us 1.4036 us] (-36%)

# features = ["simd"]
divide 512               time: [1.0221 us 1.0348 us 1.0481 us]
divide_scalar 512        time: [468.04 ns 471.36 ns 475.19 ns] (-54%)
divide_nulls 512         time: [960.20 ns 964.30 ns 969.15 ns]
divide_scalar_nulls 512  time: [471.33 ns 476.41 ns 482.09 ns] (-51%)
```

The speedups are due to:
- only checking for `DivideByZero` once;
- not having to combine two null bitmaps;
- using `Simd::splat()` to fill the divisor lane.

Tests are pretty bare right now; if you think this is worth merging, I'll write a few more.
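To make the first two points concrete, here is a minimal std-only sketch of the non-SIMD path. It is not the code in this PR: `Int32Column`, `DivideError`, and this `divide_scalar` signature are hypothetical stand-ins for Arrow's array, error, and kernel types, and the validity mask is modeled as a plain `Vec<bool>` rather than a packed bitmap.

```rust
/// Hypothetical error type standing in for Arrow's `DivideByZero` error.
#[derive(Debug, PartialEq)]
pub enum DivideError {
    DivideByZero,
}

/// Hypothetical stand-in for a primitive array: values plus an optional validity mask.
#[derive(Debug, Clone, PartialEq)]
pub struct Int32Column {
    pub values: Vec<i32>,
    /// `None` means every slot is valid; `Some(mask)` marks valid slots with `true`.
    pub validity: Option<Vec<bool>>,
}

/// Divide every slot of `array` by a single scalar `divisor`.
pub fn divide_scalar(array: &Int32Column, divisor: i32) -> Result<Int32Column, DivideError> {
    // The divisor is one value, so the zero check runs once instead of per element.
    if divisor == 0 {
        return Err(DivideError::DivideByZero);
    }

    // Null slots are divided too; their values are never read, and the divisor
    // is known to be non-zero. `wrapping_div` avoids a panic on `i32::MIN / -1`.
    let values: Vec<i32> = array
        .values
        .iter()
        .map(|v| v.wrapping_div(divisor))
        .collect();

    // Only one input can carry nulls, so its validity mask is reused as-is
    // rather than being combined with a second mask.
    Ok(Int32Column {
        values,
        validity: array.validity.clone(),
    })
}
```

In this sketch the divisor is loop-invariant, which is the scalar analogue of broadcasting it into every SIMD lane with `Simd::splat()`. Calling `divide_scalar(&col, 0)` returns `Err(DivideError::DivideByZero)` before any element is touched.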
