On Tue, 19 May 2026, Jakub Jelinek wrote:

> Hi!
> 
> We don't use vpermilps insn for V4S[IF]mode variable permutations on
> TARGET_AVX without TARGET_AVX512*.  For TARGET_AVX512* there are plenty
> of permutation instructions already.  For TARGET_AVX2, the function has
> special cases for one_operand_shuffle for V8SImode/V8SFmode and emits
> reasonable code, but for V4SImode/V4SFmode with TARGET_AVX2 it handles
> those using V8SImode/V8SFmode as two operand shuffle, which requires
> 2 preparation instructions, vpermd and one finalization instruction.
> And for !TARGET_AVX2 && TARGET_AVX we just emit terrible code for these.
> 
> So, the following patch uses vpermilps for V4S[IF]mode one_operand_shuffle.

Thanks for looking at the issue, I really appreciate it. The same problem
exists with 64-bit lanes (V2DF/V2DI modes, where we fail to utilize vpermilpd).

Do you want a separate bug report for that?
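
For illustration, here is a minimal sketch of the kind of testcase I have in
mind for the 64-bit case. It assumes GCC's generic vector extensions and
__builtin_shuffle; the typedef and function names are hypothetical and are
not taken from the patch or from an existing PR. The code the compiler
actually generates for it depends on the version and flags (e.g. -O2 -mavx).

```c
/* Hypothetical testcase sketch: one-operand variable permutations of
   two 64-bit lanes, written with GCC generic vector extensions.  */
typedef double v2df __attribute__ ((vector_size (16)));
typedef long long v2di __attribute__ ((vector_size (16)));

/* Result lane i is x[sel[i] % 2]; __builtin_shuffle reduces each
   selector element modulo the number of lanes.  */
v2df
permute_v2df (v2df x, v2di sel)
{
  return __builtin_shuffle (x, sel);
}

/* The same permutation on integer lanes.  */
v2di
permute_v2di (v2di x, v2di sel)
{
  return __builtin_shuffle (x, sel);
}
```

With x = { 1.0, 2.0 } and sel = { 1, 0 }, permute_v2df returns { 2.0, 1.0 }.
Note that the variable form of vpermilpd takes its selector from bit 1 of
each 64-bit control element rather than bit 0, so the index vector would
need adjusting (e.g. adding it to itself) before it can be used as the
control operand.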

Alexander
