https://bugs.llvm.org/show_bug.cgi?id=48742

            Bug ID: 48742
           Summary: X86AsmBackend::finishLayout causes different assembler
                    output with and w/o -g (2-byte jmp/jcc vs 5-byte)
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: DebugInfo
          Assignee: [email protected]
          Reporter: [email protected]
                CC: [email protected], [email protected],
                    [email protected], [email protected],
                    [email protected], [email protected],
                    [email protected]

Bug 42138 was reopened (#c13) because the assembler output differs between -O1
and -O1 -g. Because the assembler issue is so different from the original
BranchFolding bug, I am opening a new bug.

> <   40: eb 0e                   jmp    50 <_ZN1k1lEv+0x50>
> <   42: 66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
> <   49: 00 00 00
> <   4c: 0f 1f 40 00             nopl   0x0(%rax)
> ---
> >   40: e9 0b 00 00 00          jmpq   50 <_ZN1k1lEv+0x50>
> >   45: 66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
> >   4c: 00 00 00
> >   4f: 90                      nop

The D75203 assembler optimization locates MCRelaxableFragments between two
MCSymbols and relaxes some of them (jmp/jcc) to their longer encoding, which
reduces the size of the following MCAlignFragment.
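
The byte counts in the listing above can be checked with a little arithmetic.
The sketch below is only an illustration, not LLVM code: the helper name
padding_to_align is made up here, and the offsets and sizes are read off the
disassembly (jmp at 0x40, target aligned to 16 bytes at 0x50).

```python
def padding_to_align(offset: int, alignment: int) -> int:
    """Bytes of NOP padding needed to advance `offset` to a multiple of `alignment`."""
    return -offset % alignment


JMP_START = 0x40  # address of the jmp in both listings
ALIGN = 16        # .p2align 4

SHORT_JMP = 2     # eb rel8
NEAR_JMP = 5      # e9 rel32

# Padding that remains after the jmp, for each encoding.
short_pad = padding_to_align(JMP_START + SHORT_JMP, ALIGN)
near_pad = padding_to_align(JMP_START + NEAR_JMP, ALIGN)

print(short_pad, near_pad)  # 14 and 11 bytes of NOPs
```

Both layouts end at 0x50. The 3 bytes that the longer jmp gains are taken out
of the NOP padding (14 bytes of nopw + nopl versus 11 bytes of nopw + nop), and
the jump displacements in the listing (0x0e and 0x0b) equal the remaining
padding because the target is the aligned address itself.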

Its behavior depends on the MCSymbols in the text section.
A -g compile may have more labels, because .debug_* sections reference ranges
and locations. Currently it seems that some .Ltmp* labels may be redundant (I am
going to investigate further), but **many cannot be removed**.

.p2align 4, 0x90 is common because of loops. For a larger program with a lot of
temporary labels, a difference in assembly output is more or less inevitable.
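
To make the label dependence concrete, here is a toy model. It is a sketch
under stated assumptions, not the X86AsmBackend::finishLayout implementation:
the Frag class, the layout function and the rule "only jumps seen since the
last label may be widened" are simplifications invented for illustration, and
only the 2-byte and 5-byte jmp encodings are modelled.

```python
from dataclasses import dataclass


@dataclass
class Frag:
    kind: str      # "label", "jmp", "insn", or "align"
    size: int = 0  # byte size for "insn", alignment for "align"


def layout(frags, pad_for_align=True):
    """Toy padding pass: return (jmp sizes in order, total NOP bytes)."""
    offset = 0
    candidates = []  # indices of jmps seen since the last label (assumed rule)
    jmp_sizes = {}
    nops = 0
    for i, f in enumerate(frags):
        if f.kind == "label":
            candidates.clear()          # a label resets the candidate list
        elif f.kind == "jmp":
            jmp_sizes[i] = 2            # start with the short encoding
            candidates.append(i)
            offset += 2
        elif f.kind == "insn":
            offset += f.size
        elif f.kind == "align":
            pad = -offset % f.size
            if pad_for_align:
                for j in candidates:
                    if jmp_sizes[j] == 2 and pad >= 3:
                        jmp_sizes[j] = 5  # widen the jmp, shrink the padding
                        offset += 3
                        pad -= 3
            nops += pad
            offset += pad
            candidates.clear()
    return list(jmp_sizes.values()), nops


base = [Frag("insn", 0x40), Frag("label"), Frag("jmp"), Frag("align", 16)]
extra = [Frag("insn", 0x40), Frag("label"), Frag("jmp"), Frag("label"),
         Frag("align", 16)]

print(layout(base))                # jmp widened to 5 bytes, 11 NOP bytes
print(layout(extra))               # jmp stays 2 bytes, 14 NOP bytes
print(layout(base, False) == layout(extra, False))  # True
```

The two inputs contain the same instructions and differ only in one label, yet
the toy pass produces different encodings. With the pass disabled, both inputs
produce the same bytes. In which direction a real build changes depends on
where the labels actually fall, which this model does not try to reproduce.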

I think the cost of D75203 outweighs its benefits, so I think we should
default to -x86-pad-for-align=false for now (https://reviews.llvm.org/D94542).



When -mbranches-within-32B-boundaries (to mitigate the performance impact of
the microcode update for the Intel JCC Erratum) is used, there are many
alignment fragments. I think D75203 is useful in that case. In the absence of
-mbranches-within-32B-boundaries, the advantage of D75203 is questionable.

Other opinions: https://reviews.llvm.org/D75203#2496082 (jyknight) and the
comment before it (skan).
I agree that to make the behavior of D75203 deterministic with and without -g,
we would need to "find all sections referenced by a relaxable fixup in the text
section", and do so recursively. This would be very complex and would dilute
the gain of D75203.

-- 
You are receiving this mail because:
You are on the CC list for the bug.
_______________________________________________
llvm-bugs mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs
