nevi-me opened a new pull request #9313:
URL: https://github.com/apache/arrow/pull/9313
This is on top of #9297
I was curious if (ab)using the `compute::unary` kernel would perform better
on slightly complex functions.
I implemented the Haversine function, which calculates the distance between
two geographic coordinates.
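For reference, a scalar version of the Haversine formula might look like the sketch below. It is not the kernel from this PR: the function name `haversine_km`, the constant `EARTH_RADIUS_KM`, and the choice of a 6371 km mean Earth radius are assumptions made for illustration, and inputs are taken to be in degrees.

```rust
/// Mean Earth radius in kilometres (assumed value; the PR may use another).
const EARTH_RADIUS_KM: f64 = 6371.0;

/// Great-circle distance in kilometres between two points given in degrees.
/// Hypothetical helper, shown only to illustrate the formula.
fn haversine_km(lat1: f64, lon1: f64, lat2: f64, lon2: f64) -> f64 {
    let phi1 = lat1.to_radians();
    let phi2 = lat2.to_radians();
    let d_phi = (lat2 - lat1).to_radians();
    let d_lambda = (lon2 - lon1).to_radians();

    // a = sin^2(d_phi / 2) + cos(phi1) * cos(phi2) * sin^2(d_lambda / 2)
    let a = (d_phi / 2.0).sin().powi(2)
        + phi1.cos() * phi2.cos() * (d_lambda / 2.0).sin().powi(2);

    // d = 2 * R * asin(sqrt(a))
    2.0 * EARTH_RADIUS_KM * a.sqrt().asin()
}
```

Each output value needs several trigonometric calls plus arithmetic with scalars, which is why the function makes a reasonable stress test for composing kernels.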
I then benchmarked an implementation that I tried to simplify and optimise
with unary kernels, vs one that I'd have to write if I couldn't use the unary
kernels for things like:
- arithmetic with scalars
- functions that would otherwise require generating intermediate arrays
(e.g. `sin(x) * cos(x)` would be `multiply(sin(x), cos(x))`)
The function that uses unary kernels for the above is slightly faster.
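The difference between the two approaches can be sketched with plain slices. This is not the Arrow API: `unary` and `multiply` below are hypothetical stand-ins for the corresponding kernels, they work on `&[f64]` rather than Arrow arrays, and null handling is left out entirely.

```rust
/// Applies `op` to every element in one pass (stand-in for a unary kernel).
fn unary<F: Fn(f64) -> f64>(input: &[f64], op: F) -> Vec<f64> {
    input.iter().map(|&x| op(x)).collect()
}

/// Element-wise product of two slices (stand-in for a binary multiply kernel).
fn multiply(a: &[f64], b: &[f64]) -> Vec<f64> {
    a.iter().zip(b).map(|(x, y)| x * y).collect()
}

/// `multiply(sin(x), cos(x))`: allocates two intermediate buffers
/// and walks the data three times.
fn sin_cos_intermediate(x: &[f64]) -> Vec<f64> {
    let s = unary(x, f64::sin);
    let c = unary(x, f64::cos);
    multiply(&s, &c)
}

/// The same expression fused into one closure: a single pass over the
/// data and a single output allocation.
fn sin_cos_fused(x: &[f64]) -> Vec<f64> {
    unary(x, |v| v.sin() * v.cos())
}
```

Both functions return the same values. The fused version simply avoids materialising `sin(x)` and `cos(x)` as separate buffers, which is the effect the unary-kernel variant in the benchmark relies on.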
I ran this on an M1 CPU, with the options below:
```sh
cargo bench --bench trigonometry_kernels
cargo bench --bench trigonometry_kernels --features simd
RUSTFLAGS="-C target-cpu=native" cargo bench --bench trigonometry_kernels
RUSTFLAGS="-C target-cpu=native" cargo bench --bench trigonometry_kernels --features simd
```
```text
haversine_no_unary 512 time: [14.074 us 14.140 us 14.216 us]
haversine_unary 512 time: [11.191 us 11.308 us 11.436 us]
haversine_no_unary_nulls 512
time: [15.902 us 15.985 us 16.083 us]
haversine_unary_nulls 512
time: [12.486 us 12.552 us 12.625 us]
```
The biggest benefit comes from setting `RUSTFLAGS`: the non-null benches run
3-10% faster.