On Tue, 19 May 2026, Jakub Jelinek wrote:
> Hi!
>
> We don't use vpermilps insn for V4S[IF]mode variable permutations on
> TARGET_AVX without TARGET_AVX512*.  For TARGET_AVX512* there are plenty
> of permutation instructions already.  For TARGET_AVX2, the function has
> special cases for one_operand_shuffle for V8SImode/V8SFmode and emits
> reasonable code, but for V4SImode/V4SFmode with TARGET_AVX2 it handles
> those using V8SImode/V8SFmode as two operand shuffle, which requires
> 2 preparation instructions, vpermd and one finalization instruction.
> And for !TARGET_AVX2 && TARGET_AVX we just emit terrible code for these.
>
> So, the following patch uses vpermilps for V4S[IF]mode one_operand_shuffle.

Thanks for looking at the issue; I really appreciate it.

The same problem exists with 64-bit lanes (V2DF/V2DI modes): we fail to
utilize vpermilpd.  Do you want a separate bug report for that?

Alexander
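
For reference, a minimal sketch of the kind of one-operand variable
permutation discussed above, written with GCC's generic vector extensions.
The type and function names are made up for illustration and are not taken
from the patch or the testsuite; which instructions GCC actually emits
depends on the compiler version and flags (e.g. -O2 -mavx).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical typedefs for 128-bit vectors (GCC vector extensions). */
typedef float   v4sf __attribute__ ((vector_size (16)));
typedef int32_t v4si __attribute__ ((vector_size (16)));
typedef double  v2df __attribute__ ((vector_size (16)));
typedef int64_t v2di __attribute__ ((vector_size (16)));

/* One-operand variable permutation with 32-bit lanes: result[i] = x[idx[i] % 4].
   This is the V4SF case the quoted patch addresses.  */
v4sf
permute_v4sf (v4sf x, v4si idx)
{
  return __builtin_shuffle (x, idx);
}

/* The analogous case with 64-bit lanes: result[i] = x[idx[i] % 2].  */
v2df
permute_v2df (v2df x, v2di idx)
{
  return __builtin_shuffle (x, idx);
}

/* Small self-check of the semantics; returns 1 on success.  */
int
self_check (void)
{
  v4sf x = { 10.0f, 20.0f, 30.0f, 40.0f };
  v4si rev4 = { 3, 2, 1, 0 };
  v4sf r = permute_v4sf (x, rev4);
  assert (r[0] == 40.0f && r[1] == 30.0f && r[2] == 20.0f && r[3] == 10.0f);

  v2df y = { 1.0, 2.0 };
  v2di swap2 = { 1, 0 };
  v2df s = permute_v2df (y, swap2);
  assert (s[0] == 2.0 && s[1] == 1.0);
  return 1;
}
```

Comparing the assembly for permute_v4sf and permute_v2df under -mavx
(without -mavx2 or -mavx512*) should show whether the 64-bit case gets the
same treatment as the 32-bit one.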
