On Sat, Jun 4, 2022 at 1:03 PM Roger Sayle <ro...@nextmovesoftware.com> wrote:
>
>
> By way of an apology for causing PR target/105791, where I'd overlooked
> the need to support V1TImode in TARGET_XOP's vpcmov instruction, this
> patch further improves support for TARGET_XOP's vpcmov instruction, by
> recognizing it in combine.
>
> Currently, the test case:
>
> typedef int v4si __attribute__ ((vector_size (16)));
> v4si foo(v4si c, v4si t, v4si f)
> {
>     return (c&t)|(~c&f);
> }
>
> on x86_64 with -O2 -mxop generates:
>         vpxor   %xmm2, %xmm1, %xmm1
>         vpand   %xmm0, %xmm1, %xmm1
>         vpxor   %xmm2, %xmm1, %xmm0
>         ret
>
> but with this patch now generates:
>         vpcmov  %xmm0, %xmm2, %xmm1, %xmm0
>         ret
>
> On its own, the new combine splitter works fine on TARGET_64BIT, but
> alas with -m32 combine incorrectly thinks the replacement instruction
> is more expensive, as IF_THEN_ELSE isn't currently/correctly handled
> in ix86_rtx_costs.  So to avoid the need for a target selector in the
> new testcase, I've updated ix86_rtx_costs to report that AMD's vpcmov
> has a latency of two cycles [it's now an obsolete instruction set
> extension and there's unlikely to ever be a processor where this
> instruction has a different timing], and while there I also added
> rtx_costs for x86_64's integer conditional move instructions (which
> have single cycle latency).
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check, both with and without --target_board=unix{-m32},
> with no new failures.  Ok for mainline?
>
>
> 2022-06-04  Roger Sayle  <ro...@nextmovesoftware.com>
>
> gcc/ChangeLog
>         * config/i386/i386.cc (ix86_rtx_costs): Add a new case for
>         IF_THEN_ELSE, and provide costs for TARGET_XOP's vpcmov and
>         TARGET_CMOVE's (scalar integer) conditional moves.
>         * config/i386/sse.md (define_split): Recognize XOP's vpcmov
>         from its equivalent (canonical) pxor;pand;pxor sequence.
>
> gcc/testsuite/ChangeLog
>         * gcc.target/i386/xop-pcmov3.c: New test case.

OK with a nit below.

Thanks,
Uros.

+{
+  operands[5] = REGNO (operands[4]) == REGNO (operands[1]) ? operands[2]
+   : operands[1];
+})
+

Please expand this to enhance readability, it is a bit too cryptic for me ...

Reply via email to