Issue | 159410
Summary | [X86] X86FixupInstTunings - attempt to convert VPERMQri to VINSERTI128rri
Labels | backend:X86, missed-optimization
Assignees |
Reporter | RKSimon

AVX2 target shuffle combining/lowering often emits a 128-bit subvector splat as VPERMQ, since this allows memory folding if the subvector has been spilled.
But on most targets, if we don't end up using the memory-folded VPERMQmi form and instead lower to `VPERMQri %dst, %src, $0x44`, we would have been much better off using `VINSERTI128rri %dst, %src, xmm_sub(%src), $0x1`.
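
To illustrate why the two forms are interchangeable, here is a minimal standalone C++ model of the lane semantics. It is a sketch, not LLVM code: `Vec256`, `permq`, `inserti128`, `lowHalf` and `splatFormsAgree` are hypothetical names made up for this example, and the model only covers the 64-bit lane movement of the two instructions.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical model type: four 64-bit lanes of a YMM register, lane 0 lowest.
using Vec256 = std::array<uint64_t, 4>;
using Vec128 = std::array<uint64_t, 2>;

// Model of VPERMQ ymm, ymm, imm8: destination lane I takes the source lane
// selected by immediate bits [2*I+1 : 2*I].
inline Vec256 permq(const Vec256 &Src, uint8_t Imm) {
  Vec256 Dst{};
  for (unsigned I = 0; I != 4; ++I)
    Dst[I] = Src[(Imm >> (2 * I)) & 0x3];
  return Dst;
}

// Model of VINSERTI128 ymm, ymm, xmm, imm8: copy Src, then overwrite the
// 128-bit half selected by bit 0 of the immediate with Sub.
inline Vec256 inserti128(const Vec256 &Src, const Vec128 &Sub, uint8_t Imm) {
  Vec256 Dst = Src;
  unsigned Base = (Imm & 0x1) * 2;
  Dst[Base] = Sub[0];
  Dst[Base + 1] = Sub[1];
  return Dst;
}

// Stand-in for xmm_sub(%src): the low 128 bits of the 256-bit value.
inline Vec128 lowHalf(const Vec256 &V) { return {V[0], V[1]}; }

// True if the 0x44 permute and the insert of the low half into the high
// half produce the same result for this input.
inline bool splatFormsAgree(const Vec256 &Src) {
  return permq(Src, 0x44) == inserti128(Src, lowHalf(Src), 0x1);
}

// Small usage example for the helpers above.
inline void splatExample() {
  Vec256 V{10, 20, 30, 40};
  assert(splatFormsAgree(V));
}
```

The immediate 0x44 is `0b01'00'01'00`, so the permute selects lanes 0, 1, 0, 1, which is the same low-half splat that the insert produces when it writes the low 128 bits into the upper half.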
X86FixupInstTunings should check that the scheduler model agrees VINSERTI128rri is faster and, if so, adjust the instruction encoding (not just the opcode: the registers and immediate need checking/adjusting as well).
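
The shape of that rewrite could look roughly like the standalone sketch below. This is not the X86FixupInstTunings implementation and uses no LLVM APIs: `Inst`, `Cost` and `tryConvertSplat` are hypothetical, registers are modelled as plain indices (assuming the xmm subregister shares its index with the ymm register), and a two-field struct stands in for whatever the real scheduler model query would return.

```cpp
#include <cassert>
#include <cstdint>

enum class Opcode { VPERMQri, VINSERTI128rri };

// Hypothetical, simplified instruction record.
struct Inst {
  Opcode Op;
  unsigned DstYmm;
  unsigned SrcYmm;
  unsigned SubXmm; // only meaningful for VINSERTI128rri
  uint8_t Imm;
};

// Hypothetical stand-in for a scheduler model comparison (lower is better).
struct Cost {
  unsigned Permq;
  unsigned Insert;
};

// Rewrites MI in place and returns true if it is a low-128-bit splat
// (VPERMQri with immediate 0x44) and the insert form is strictly cheaper.
inline bool tryConvertSplat(Inst &MI, const Cost &C) {
  if (MI.Op != Opcode::VPERMQri || MI.Imm != 0x44)
    return false; // Not the splat pattern; leave the instruction alone.
  if (C.Insert >= C.Permq)
    return false; // The model does not say the insert is faster.
  MI.Op = Opcode::VINSERTI128rri;
  MI.SubXmm = MI.SrcYmm; // xmm_sub(%src): same index in this toy model.
  MI.Imm = 0x1;          // Insert into the upper 128-bit half.
  return true;
}

// Small usage example: a splat permute becomes an insert when it is cheaper.
inline void convertExample() {
  Inst MI{Opcode::VPERMQri, 1, 2, 0, 0x44};
  bool Changed = tryConvertSplat(MI, Cost{3, 1});
  assert(Changed && MI.Op == Opcode::VINSERTI128rri);
}
```

Note that all three parts of the instruction change together: the opcode, the added subregister source operand, and the immediate (0x44 becomes 0x1), which is the checking/adjusting the request describes.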