On Thu, Aug 13, 2020 at 10:28 AM Roger Sayle <ro...@nextmovesoftware.com> wrote:
>
>
> This patch is an alternate/supplementary fix to the recent regression
> PR target/96558.  Currently ix86_expand_clear may/should only be called
> with a general register DEST after reload_completed.  With the simple
> change below, this function now checks these conditions itself, and
> does the right thing (or at least something reasonable) rather than ICE.
>
> This change alone is sufficient to fix the recent regression, and allow
> the recently added testcase to pass, but following the "why fix something
> just once" maxim, I propose adding both solutions (to reduce the risk
> of surprises in the future).  Leaving the peephole2 fix in place is
> reasonable, as scheduling or a later pass eventually move the condition
> code setter/use next to each other, so moving the vpxor with the
> peephole2 provides no (additional) benefit.  i.e. gcc.dg/pr96558.c contains:
>
>         vpxor   %xmm0, %xmm0, %xmm0
>         testl   %eax, %eax
>         jne     .L91
>
> This patch has been tested on x86_64-pc-linux-gnu (without the peephole2
> solution) with a make bootstrap and make -k check with no new failures,
> but allows the recently added testcase to pass.


But we gain nothing for non-GENERAL_REG_P registers. movdi/movsi
patterns can load 0 to XMM registers with pxor by itself, also with
mask registers. The reason that this peephole exists is due to the
fact that XOR with GENERAL_REGs clobbers flags, so we can't just stick
 "xor reg, reg" to movdi/movsi pattern when zero is loaded. We would
like to do so, but we can't - so we have to complicate our lives with
a peephole pattern that checks for live FLAGS_REG before
transformation.

I think that we should assert that only GENERAL_REGS should be
processed by ix86_expand_clear.

Uros.

> Ok for mainline?
>
>
> 2020-08-19  Roger Sayle  <ro...@nextmovesoftware.com>
>
> gcc/ChangeLog
>         PR target/96558
>         * config/i386/i386-expand.c (ix86_expand_clear): Explicitly
>         test for reload_completed and GENERAL_REG_P, and emit a simple
>         set of const0_rtx otherwise.
>
> Thanks in advance.
> Sorry for the inconvenience.
> Roger
> --
> Roger Sayle
> NextMove Software
> Cambridge, UK
>
>

Reply via email to