Hi,
It is fairly common to see binaries in the wild making use of the Rust
arrow libraries compiled with extremely limited SIMD support enabled. As
I imagine others in the community have run into this before, I thought
I'd send an email to solicit thoughts.
There are a couple of things that make the Rust implementation
particularly susceptible to this problem:
- Rust lacks a stable ABI, and so all builds are from source
- The default x86 release target lacks even SSE3 support (released 2004)
let alone anything more modern
- The Rust implementation relies on LLVM to generate vectorised code,
as Rust has no stable portable SIMD API, and may never have one
My suggestion in [1] is to generate a compilation error if building a
release binary without SSE3 enabled. This provides a very low barrier to
entry, and guides users towards the "right thing". In practice I suspect
most users will be able to add `target-cpu=haswell` and benefit from
everything up to and including AVX2.
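
As a rough illustration of the kind of check this implies (a sketch only, not the actual arrow-rs code): the names `rejects_build` and `REJECTED` are hypothetical, and using `debug_assertions` as a stand-in for "not a release build" is an assumption. The line that would make the build fail is left commented out so the snippet compiles on any machine.

```rust
/// Returns true when a build would be rejected under the proposal:
/// an x86 release build compiled without SSE3.
/// (Hypothetical helper, named for illustration only.)
pub const fn rejects_build(is_x86: bool, is_release: bool, has_sse3: bool) -> bool {
    is_x86 && is_release && !has_sse3
}

/// The same predicate evaluated for the current compilation.
/// Assumption: `debug_assertions` being off is treated as "release build".
pub const REJECTED: bool = rejects_build(
    cfg!(any(target_arch = "x86", target_arch = "x86_64")),
    !cfg!(debug_assertions),
    cfg!(target_feature = "sse3"),
);

// Uncommenting the next line turns the check into a hard compile error:
// const _: () = assert!(!REJECTED, "release build without SSE3; consider -C target-cpu=haswell");
```

A user hitting such an error could opt in with, for example, `RUSTFLAGS="-C target-cpu=haswell" cargo build --release`.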
An alternative proposal would be to auto-select from multiple
implementations at runtime. However, this would effectively multiply
executable size and compile times, which are already problematic, by
the number of feature combinations supported. It is tractable, but I
feel it would be optimising for a very rare breed of user: one running
high-performance CPU workloads on a CPU from more than a decade ago.
I'm not sure what other people think?
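
For reference, runtime selection in Rust usually looks something like the sketch below. It is a minimal example under stated assumptions, not how arrow-rs is structured: `sum`, `sum_scalar` and `sum_avx2` are hypothetical names, and the kernel is deliberately trivial. Every kernel dispatched this way is compiled once per feature set, which is where the size and compile-time cost comes from.

```rust
/// Baseline version, compiled for whatever the default target allows.
fn sum_scalar(values: &[i32]) -> i64 {
    let mut total = 0i64;
    for &v in values {
        total += v as i64;
    }
    total
}

/// Same loop, compiled a second time with AVX2 enabled so that LLVM is
/// free to use wider vector instructions when it vectorises the loop.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "avx2")]
unsafe fn sum_avx2(values: &[i32]) -> i64 {
    let mut total = 0i64;
    for &v in values {
        total += v as i64;
    }
    total
}

/// Picks an implementation based on what the running CPU supports.
pub fn sum(values: &[i32]) -> i64 {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if std::arch::is_x86_feature_detected!("avx2") {
            // SAFETY: only reached when the CPU reports AVX2 support.
            return unsafe { sum_avx2(values) };
        }
    }
    sum_scalar(values)
}
```

Both paths return the same result; only the generated instructions differ.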
Any and all feedback welcome, preferably on the linked issue [1] to keep
things in one place.
Kind Regards,
Raphael Taylor-Davies
[1]: https://github.com/apache/arrow-rs/issues/3485