https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100152

Iain Sandoe <iains at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|x86_64-apple-darwin19.6.0   |x86_64-apple-darwin*
            Summary|Possible 10.3 bad code      |[10.3, 11, 12 Regression]
                   |generation regression from  |[Darwin, X86] used
                   |10.2/9.3 on Mac OS 10.15.7  |caller-saved register not
                   |(Catalina)                  |preserved across a call.
      Known to fail|                            |10.3.0
   Target Milestone|---                         |10.4

--- Comment #24 from Iain Sandoe <iains at gcc dot gnu.org> ---
O1:
        movl    $0, %ebx
L756:
        movl    0(%rbp,%rbx,4), %esi
        movq    %r14, %rdi
        call    ____UTF_8_put
        movq    %rbx, %rax
        addq    $1, %rbx
        cmpq    %rax, %r13
        jne     L756

works OK because %rbx is callee saved.
----

O2:
        xorl    %r10d, %r10d
        .p2align 4,,10
        .p2align 3
L938:
        movl    0(%rbp,%r10,4), %esi
        call    ____UTF_8_put
        movq    %r10, %rax
        addq    $1, %r10
        cmpq    %rax, %r12
        jne     L938

fails because %r10 is not callee saved and is clobbered by the lazy symbol
resolver.
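
For reference, a minimal C sketch of the loop shape that produces code like the
above. This is an assumption reconstructed from the assembly, not the project's
actual source: the names port_sketch, put_sketch and put_all_sketch are
hypothetical. The point is only that the index is live across the call, so it
has to sit in a callee-saved register (or be spilled) to survive it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical output port; not the project's real type. */
struct port_sketch {
    uint32_t seen[8];
    size_t count;
};

/* Hypothetical stand-in for the callee. In the reported case the callee is
   reached through a lazily bound stub, so the first call runs the lazy
   symbol resolver before the function itself. */
void put_sketch(struct port_sketch *port, uint32_t ch)
{
    assert(port->count < 8);
    port->seen[port->count++] = ch;
}

/* The index i is live across the call: it is read before the call (to load
   chars[i]) and again after it (compare and increment). The compiler must
   therefore keep it in a callee-saved register such as %rbx, or spill it. */
void put_all_sketch(struct port_sketch *port, const uint32_t *chars, size_t last)
{
    for (size_t i = 0;; i++) {
        put_sketch(port, chars[i]);
        if (i == last)
            break;
    }
}
```

With the index in %r10, the value read back after the call is whatever the
resolver left there, so the loop bound comparison is made on garbage.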

10.2 uses %rbx at O2, and so does Linux (it is of course hard to be 100% sure
that the same problem "could not occur" on other platforms; there is relatively
little Darwin-specific code in the x86 backend, especially for x86_64).

I did see a failure [wrong code] with 11.1 (and would expect that to be present
in master too). Whether the code crashes will depend on which register happens
to be used: e.g. %r8 could survive the call (even though it is not saved) but
%r10 will always be clobbered by the lazy symbol resolver.

A workaround is to build c_intf.o with -O1.  Unfortunately, the configuration
for the project does not allow selection of the RTS optimisation level; it is
fixed at the highest level found.  Adding or modifying a rule for that object
will work in the short term.  Locally, I added a --enable-c-opt-rts option to
allow testing; you're welcome to that patch if it's helpful.

The next step is to bisect to find the change that caused this, but obviously
that is not going to be done before 10.4 / 11.2, so the workaround is probably
needed.
