https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96918

--- Comment #17 from Cory Fields <lists at coryfields dot com> ---
(In reply to Jakub Jelinek from comment #16)
> Optimal shuffle is next to impossible on architectures like x86 where you
> have dozens of different permutation instructions and often you need not
> just one, but 2, 3, 4 or 5 of them depending on exact ISA and permutation.
> GCC has over 20k lines of source for choosing reasonable constant
> permutations just on this architecture.
> This PR is not about __builtin_shuffle emitting bad code, but about the
> vector lshift + rshift ored not even trying to emit it as permutation and
> comparing that to what one gets from those 3 operations if there is no
> native rotate.
> Though, sure, one could also derive from it that perhaps some constant
> permutations would be in some cases best emitted as 2 shifts + or, guess we
> don't try that among 3 insn cases yet.

Thanks for the help. I certainly didn't mean to trivialize the work involved,
and I have a much better understanding of it now.

I'll have a look at the existing permutations and see if there's room for
improvement on avx/avx2 in these specific cases.
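
For reference, here is a minimal sketch of the pattern discussed in the quoted
comment: a per-lane rotate written as lshift + rshift + or, next to the same
rotate written as a constant byte permutation. The type and function names
(v4su, v16qu, rotl8_shift, rotl8_shuffle) are made up for illustration and do
not come from the PR. The rotate count of 8 is also just an example, and the
shuffle mask assumes a little-endian target such as x86. It makes no claim
about which form GCC turns into better code on a given ISA.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical names, for illustration only. */
typedef uint32_t v4su  __attribute__((vector_size(16)));
typedef uint8_t  v16qu __attribute__((vector_size(16)));

/* Rotate each 32-bit lane left by 8 bits: vector lshift + rshift, ored. */
static v4su rotl8_shift(v4su x)
{
    return (x << 8) | (x >> 24);
}

/* The same rotate as a constant byte permutation.
   Assumes little-endian byte order within each 32-bit lane:
   result byte 0 comes from source byte 3, byte 1 from byte 0, and so on. */
static v4su rotl8_shuffle(v4su x)
{
    const v16qu mask = { 3, 0, 1, 2,  7, 4, 5, 6,
                         11, 8, 9, 10,  15, 12, 13, 14 };
    v16qu bytes;

    memcpy(&bytes, &x, sizeof bytes);       /* reinterpret lanes as bytes */
    bytes = __builtin_shuffle(bytes, mask); /* GCC's generic permutation builtin */
    memcpy(&x, &bytes, sizeof x);
    return x;
}
```

Building both functions with -O2 and -mavx or -mavx2 and comparing the
generated assembly is one way to look at these specific cases.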
