https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82580

Uroš Bizjak <ubizjak at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|FIXED                       |---

--- Comment #17 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Roger Sayle from comment #16)
> Advance warning that the testcase pr82580.c will start FAILing due to
> differences in register allocation following improvements to __int128
> parameter passing as explained in
> https://gcc.gnu.org/pipermail/gcc-patches/2023-July/623756.html.
> We might need additional reload alternatives/preferences to ensure that we
> don't generate a movzbl.  Hopefully, Jakub and/or Uros have some suggestions
> for how best this can be fixed.
> 
> Previously, the SUBREGs and CLOBBERs generated by middle-end RTL expansion
> (unintentionally) ensured that rdx and rax would never be used for __int128
> arguments, which conveniently allowed the use of xor eax,eax; setc al in
> peephole2 as AX_REG wasn't live.  Now reload has more freedom, it elects to
> use rax as at this point the backend hasn't expressed any preference that it
> would like eax reserved for producing the result.

A different regression happens with pr82580.c, f0 function. Without the patch,
the compiler generates:

f0:
        xorq    %rdi, %rdx
        xorq    %rcx, %rsi
        xorl    %eax, %eax
        orq     %rsi, %rdx
        sete    %al
        ret

But with the patch:

f0:
        xchgq   %rdi, %rsi
        movq    %rdx, %r8
        movq    %rcx, %rax
        movq    %rsi, %rdx
        movq    %rdi, %rcx
        xorq    %rax, %rcx
        xorq    %r8, %rdx
        xorl    %eax, %eax
        orq     %rcx, %rdx
        sete    %al
        ret

It looks to me that *concatditi3_3 ties two registers together so RA now tries
to satisfy *concatditi3_3 constraints *and* *cmpti_doubleword constraints.

The gcc.target/i386/pr43644-2.c mitigates this issue with
*addti3_doubleword_concat pattern that combines *addti3_doubleword with concat
insn, but doubleword compares (and other doubleword insn besides addti3) do not
provide these compound instructions.

So, without a common strategy to use doubleword_concat patterns for all double
word instructions, it is questionable if the complications with concat insn are
worth the pain of providing (many?) doubleword_concat patterns.

The real issue is with x86_64 doubleword arguments. Unfortunately, the ABI
specifies RDI/RSI to pass the double word argument, while the compiler regalloc
order sequence is RSI/RDI. IMO, we can try to swap RDI and RSI in the order and
RA would be able to allocate registers in the same optimal way as for x86_32
with -mregparm=3, even without synthetic concat patterns.

Reply via email to