https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118480
--- Comment #16 from Steven Munroe <munroesj at gcc dot gnu.org> ---
(In reply to Segher Boessenkool from comment #14)
>
> He/she should just write C code to do this (not even use a builtin
> function), and trust the compiler will do the right thing (and the best
> possible for the selected architecture version, etc.)
>

Unfortunately this does not work as you suggest. I'll attach a test case specific to GCC vector extensions for vector shift by const scalar.

This should generate 2 instructions over most of the range for int, long long, and __int128 for P10, but it does not.

You can do a lot with the 5-bit SIM and splat immediate. This works for all char/short/int shift values for VMX targets, and for the ranges 0-15 and 48-63 for long long (112-127 for __int128).

This should work for P9 with the addition of xxspltib, including long long, but the compiler insists on including redundant vextsb2w/vextsb2d instructions.

I have working code for this in PVECLIB.
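
For readers without the attachment, here is a minimal sketch of the kind of source under discussion: a vector shift by a constant scalar written with GCC vector extensions and no builtins. It is not the attached test case. The type and function names are hypothetical, and the shift counts (24 for int, 52 for long long, the latter inside the 48-63 range mentioned above) are chosen only for illustration. Building it for a PowerPC target, for example with -O3 -mcpu=power9, and reading the assembly is one way to see what the compiler generates for the splat of the shift count.

```c
#include <assert.h>

/* Hypothetical type names for illustration; not from the attached test case. */
typedef unsigned int       vui32_t __attribute__ ((vector_size (16)));
typedef unsigned long long vui64_t __attribute__ ((vector_size (16)));

/* Shift each 32-bit element left by a constant scalar.  */
vui32_t
shift_left_word_24 (vui32_t a)
{
  return a << 24;
}

/* Shift each 64-bit element left by a constant in the 48-63 range.  */
vui64_t
shift_left_dword_52 (vui64_t a)
{
  return a << 52;
}

/* Target-independent sanity check of the element-wise semantics.  */
void
check_shifts (void)
{
  vui32_t w = { 1, 2, 3, 4 };
  vui64_t d = { 1, 3 };
  vui32_t rw = shift_left_word_24 (w);
  vui64_t rd = shift_left_dword_52 (d);

  assert (rw[0] == 1u << 24 && rw[3] == 4u << 24);
  assert (rd[0] == 1ULL << 52 && rd[1] == 3ULL << 52);
}
```

The check only confirms that the shifts compute the expected values on any target; the code-generation question raised here can only be answered by inspecting the generated instructions.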