zeroshade opened a new pull request, #731:
URL: https://github.com/apache/arrow-go/pull/731
### Rationale for this change
Improve the `bytes_to_bools` implementation on ARM64 by using actual NEON SIMD
instructions. The result is a ~4x throughput improvement on ARM64.
### What changes are included in this PR?
Rewrote the assembly using the NEON `DUP` + `CMTST` pattern. For each input byte:
1. `ld1r {v2.8b}, [ptr]`: broadcast one input byte across all 8 SIMD lanes
2. `cmtst v2.8b, v2.8b, v0.8b`: parallel bit-test against the mask `[1,2,4,8,16,32,64,128]`
3. `and v2.8b, v2.8b, v1.8b`: normalize 0xFF → 0x01 for valid Go bool values
4. `st1 {v2.8b}, [ptr], #8`: store 8 output bools at once with post-increment
A scalar tail handles the last few bits when fewer than 8 output slots
remain.
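For readers who do not read NEON assembly, the loop above can be modeled in plain Go. This is only an illustrative sketch, not the code in this PR: the name `bytesToBoolsSketch` and its signature are assumptions, and the inner loop stands in for what the four instructions do to all 8 lanes at once.

```go
package main

import "fmt"

// bitMask mirrors the mask register described above: lane i tests bit i.
var bitMask = [8]byte{1, 2, 4, 8, 16, 32, 64, 128}

// bytesToBoolsSketch is a hypothetical pure-Go model of the NEON loop.
// It expands each input byte into up to 8 bools, least significant bit
// first, and stops once out is full.
func bytesToBoolsSketch(in []byte, out []bool) {
	o := 0
	for _, b := range in {
		if len(out)-o < 8 {
			// Scalar tail: fewer than 8 output slots remain.
			for i := 0; o < len(out); i, o = i+1, o+1 {
				out[o] = b&bitMask[i] != 0
			}
			return
		}
		// Full group of 8: conceptually one ld1r + cmtst + and + st1.
		for i, m := range bitMask {
			// cmtst yields 0xFF where (b & m) != 0 and the AND with 0x01
			// reduces that to 1; in Go this is simply the boolean test.
			out[o+i] = b&m != 0
		}
		o += 8
	}
}

func main() {
	// 11 output slots: one full group of 8 plus a 3-slot scalar tail.
	out := make([]bool, 11)
	bytesToBoolsSketch([]byte{0x05, 0xFF}, out)
	fmt.Println(out)
}
```

The model makes the lane order explicit: output slot `8*k + i` holds bit `i` of input byte `k`, which is the order that the mask `[1,2,4,...,128]` combined with a contiguous 8-byte store produces.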
### Are these changes tested?
All existing tests continue to pass, and new tests were added for further validation:
- Added TestBytesToBoolsCorrectness — validates every bit position against
the reference Go implementation for sizes 1–1024 bytes
- Added TestBytesToBoolsOutlenSmaller — edge case where output is smaller
than 8× input
- Added BenchmarkBytesToBools — parametric benchmark at 64B, 256B, 1KB, 4KB,
16KB
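The shape of the correctness checks listed above can be sketched as follows. This is a hedged illustration rather than the tests added in this PR: `checkBytesToBools`, `referenceBit`, and `naive` are hypothetical names, the implementation under test is passed in as a function value with an assumed signature, and the seed and the "3 slots short" output length are arbitrary choices.

```go
package main

import (
	"fmt"
	"math/rand"
)

// referenceBit reports bit i of in, least significant bit first.
func referenceBit(in []byte, i int) bool {
	return in[i/8]&(1<<(i%8)) != 0
}

// checkBytesToBools is a hypothetical helper that compares impl against
// referenceBit at every bit position, for every input size 1..maxBytes.
func checkBytesToBools(impl func(in []byte, out []bool), maxBytes int) error {
	rng := rand.New(rand.NewSource(1)) // fixed seed keeps failures reproducible
	for n := 1; n <= maxBytes; n++ {
		in := make([]byte, n)
		for i := range in {
			in[i] = byte(rng.Intn(256))
		}
		// Exercise the exact 8x output size and a shorter, unaligned one.
		for _, outLen := range []int{n * 8, n*8 - 3} {
			out := make([]bool, outLen)
			impl(in, out)
			for i, got := range out {
				if want := referenceBit(in, i); got != want {
					return fmt.Errorf("bytes=%d outlen=%d bit=%d: got %v, want %v",
						n, outLen, i, got, want)
				}
			}
		}
	}
	return nil
}

// naive is a stand-in implementation so the sketch runs on its own.
func naive(in []byte, out []bool) {
	for i := range out {
		out[i] = in[i/8]&(1<<(i%8)) != 0
	}
}

func main() {
	if err := checkBytesToBools(naive, 1024); err != nil {
		fmt.Println("FAIL:", err)
		return
	}
	fmt.Println("ok")
}
```

Checking an output length that is not a multiple of 8 matters because it is the only case that reaches the scalar tail.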
### Are there any user-facing changes?
No, this is purely a performance optimization:
*Benchmark Results (Apple M4, darwin/arm64)*
```
                            baseline (scalar)   optimized (NEON)
                            sec/op              sec/op      vs base
BytesToBools/bytes=64-10      82.69n             21.57n     -73.91% (p=0.008)
BytesToBools/bytes=256-10    333.60n             86.43n     -74.09% (p=0.008)
BytesToBools/bytes=1K-10       1.322µ           327.4n      -75.23% (p=0.008)
BytesToBools/bytes=4K-10       5.293µ             1.297µ    -75.50% (p=0.008)
BytesToBools/bytes=16K-10     21.343µ             5.184µ    -75.71% (p=0.008)
geomean                        1.327µ           333.1n      -74.90%
```
Throughput: 735 MiB/s → 2,863 MiB/s (+298%)
Zero allocations in both versions. All results statistically significant.
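As a quick sanity check on how the figures relate, a reduction of r in sec/op corresponds to a throughput ratio of 1/(1-r). The small sketch below (the helper name `speedup` is made up for illustration) applies that to the geomean row and reproduces both the "~4x" and the "+298%" figures.

```go
package main

import "fmt"

// speedup converts a fractional reduction in sec/op into a throughput ratio.
func speedup(reduction float64) float64 {
	return 1 / (1 - reduction)
}

func main() {
	s := speedup(0.7490) // geomean reduction from the table above
	fmt.Printf("%.2fx faster, %+.0f%% throughput\n", s, (s-1)*100)
	// prints "3.98x faster, +298% throughput"
}
```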