On 28 Mar 2015, at 13:54, Julian Elischer <jul...@freebsd.org> wrote:
> the point is that clang will do this anywhere it can, because it isn't taking
> into account the side effects, just the speed of the commands themselves.

This is also not going to decrease. Clang now enables the SLP vectoriser by
default, and the vectoriser is constantly being improved. Current
generation vector units are explicitly designed as targets for compiler
autovectorisation, not for hand-tuned DSP code (which, increasingly, runs on
the GPU anyway). This means that we're increasingly going to see SSE/AVX/NEON
usage in CPU-bound code, even without an explicit programmer decision to do so.
Optimising for the case where the vector unit is not used is about as sensible
as optimising for the single-core case: it will affect some people, but
generally not those who care about performance, and fewer of them over time.
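
To make the point concrete, here is a minimal sketch of ordinary scalar C that
contains no intrinsics and no explicit request for vector code. The names
(vec4, vec4_add, scale) are hypothetical, made up for this illustration, and
what the compiler actually emits depends on the clang version, target and
flags. At -O2 on amd64 I would expect the first function to be a candidate for
the SLP vectoriser and the second for the loop vectoriser, so both typically
end up touching SSE registers.

```c
#include <stddef.h>

/* Hypothetical example type: four adjacent floats. */
struct vec4 { float x, y, z, w; };

/*
 * Four independent scalar adds on adjacent fields. This is the kind of
 * straight-line code the SLP vectoriser may merge into one vector add.
 */
struct vec4
vec4_add(struct vec4 a, struct vec4 b)
{
	struct vec4 r;

	r.x = a.x + b.x;
	r.y = a.y + b.y;
	r.z = a.z + b.z;
	r.w = a.w + b.w;
	return (r);
}

/*
 * A plain element-wise loop. The restrict qualifiers tell the compiler the
 * buffers do not overlap, which makes it a candidate for the loop vectoriser.
 */
void
scale(float *restrict dst, const float *restrict src, float k, size_t n)
{
	for (size_t i = 0; i < n; i++)
		dst[i] = src[i] * k;
}
```

To see what your compiler does with it, build with "clang -O2 -S" and look for
packed instructions such as addps or mulps in the output. Adding
-Rpass=slp-vectorizer or -Rpass=loop-vectorize asks clang to report the
transformations it applied, and -fno-slp-vectorize and -fno-vectorize turn the
two passes off for comparison.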