https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95816
Bug ID: 95816
Summary: Aarch64 jumps between Hot/Cold sections use possibly
clobbered registers x16/x17
Product: gcc
Version: 10.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: matmal01 at gcc dot gnu.org
Target Milestone: ---
Target: AArch64
When a function is split into two different sections (hot/cold), the
assembler produces a relocation on jumps between the two sections.
The linker is permitted to use a veneer to implement such a relocated jump.
The registers x16 and x17 are reserved for use in those veneers.
Hence the registers x16 and x17 should be treated as clobbered when jumping
between the hot/cold sections in a function.
GCC does not currently do this.
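As a rough illustration of when a veneer becomes necessary: the B and BL
instructions carry a signed 26-bit word offset, so a direct branch reaches
roughly +/-128 MiB. The sketch below checks that range. The helper name
branch_needs_veneer is hypothetical and not part of any toolchain, and it
models only the out-of-range case. Since the compiler cannot know the final
section placement, it has to assume a veneer may be inserted on any such
relocated jump.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* B and BL encode a signed 26-bit word offset, so they reach
   [-2^27, 2^27 - 4] bytes (about +/-128 MiB) from the branch itself.  */
#define BRANCH26_MIN (-(INT64_C (1) << 27))
#define BRANCH26_MAX ((INT64_C (1) << 27) - 4)

/* Hypothetical helper (an assumption for illustration): returns true if a
   direct branch placed at FROM cannot reach TO, in which case the linker
   would have to route the jump through a veneer, which may use x16/x17.  */
static bool
branch_needs_veneer (uint64_t from, uint64_t to)
{
  /* AArch64 instructions are 4-byte aligned.  */
  assert ((from & 3) == 0 && (to & 3) == 0);
  int64_t offset = (int64_t) (to - from);
  return offset < BRANCH26_MIN || offset > BRANCH26_MAX;
}
```

For example, branch_needs_veneer (0x1000, 0x1000 + 0x8000000) is true, since
the target lies one instruction past the forward limit.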
We can use the testcase below to demonstrate this (modified from predict-22.c
in the testsuite).
---------
$ aarch64-none-linux-gnu-gcc \
> predict-22.c \
> -O2 -w -fPIC -freorder-blocks-and-partition \
> -c -o predict-22.o
---------
volatile int v;
void bar (void) __attribute__((leaf, cold));
void baz (int *);
void alt (long long);
void
foo (int x, int y, int z)
{
  static int f __attribute__((section ("mysection")));
  register long long k asm ("x16");
  __asm__ ("mov\t%0, 10" : "=r" (k) : "0" (k));
  f = 1;
  if (__builtin_expect (x, 0))
    if (__builtin_expect (y, 0))
      if (__builtin_expect (z, 0))
        {
          f = 2;
          k *= 13;
          bar ();
          v += 1;
          v *= 2;
          v /= 2;
          v -= 1;
          v += 1;
          v *= 2;
          v /= 2;
          v -= 1;
          v += 1;
          v *= 2;
          v /= 2;
          v -= 1;
          v += k;
          v *= 2;
          v /= 2;
          v -= 1;
          v += 1;
          v *= 2;
          v /= 2;
          v -= 1;
          v += 1;
          v *= 2;
          v /= 2;
          v -= 1;
          v += 1;
          v *= 2;
          v /= 2;
          v -= 1;
          v += 1;
          v *= 2;
          v /= 2;
          v -= 1;
          f = 3;
          __builtin_abort ();
        }
  f = k;
  baz (&f);
}
--------
This produces an object file, which is dumped below.
The dump shows that there is an R_AARCH64_JUMP26 relocation on the
jump between the hot/cold sections, and that the value stored in x16 is used
after that jump.
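As a hand check when reading such a dump: an unrelocated branch appears as
14000000 (B) or 94000000 (BL), i.e. the opcode bits with a zero imm26 field
that the linker fills in later. A minimal sketch of that decoding follows.
The helper names are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* B is encoded as 000101:imm26 and BL as 100101:imm26.  */
static bool
is_b (uint32_t insn)
{
  return (insn >> 26) == 0x05;
}

static bool
is_bl (uint32_t insn)
{
  return (insn >> 26) == 0x25;
}

/* Sign-extend the imm26 field and scale it by 4 to get the byte offset
   relative to the branch instruction.  */
static int64_t
branch26_offset (uint32_t insn)
{
  int64_t imm26 = insn & 0x03ffffff;
  if (imm26 & 0x02000000)
    imm26 -= 0x04000000;
  return imm26 * 4;
}
```

With these, is_b (0x14000000) holds and branch26_offset (0x14000000) is 0,
which is why objdump prints the placeholder target "0 <foo>" until the
R_AARCH64_JUMP26 relocation is resolved.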
$ aarch64-none-linux-gnu-objdump -dr predict-22.o
predict-22.o: file format elf64-littleaarch64
Disassembly of section .text:
0000000000000000 <foo>:
0: 7100003f cmp w1, #0x0
4: 7a401844 ccmp w2, #0x0, #0x4, ne // ne = any
8: 7a401804 ccmp w0, #0x0, #0x4, ne // ne = any
c: d2800150 mov x16, #0xa // #10
10: 540000a1 b.ne 24 <foo+0x24> // b.any
14: 90000001 adrp x1, 0 <foo>
14: R_AARCH64_ADR_PREL_PG_HI21 .bss
18: 91000020 add x0, x1, #0x0
18: R_AARCH64_ADD_ABS_LO12_NC .bss
1c: b9000030 str w16, [x1]
1c: R_AARCH64_LDST32_ABS_LO12_NC .bss
20: 14000000 b 0 <baz>
20: R_AARCH64_JUMP26 baz
24: a9bd7bfd stp x29, x30, [sp, #-48]!
28: 910003fd mov x29, sp
2c: a90153f3 stp x19, x20, [sp, #16]
30: f90013f5 str x21, [sp, #32]
34: 14000000 b 0 <foo> # Here is the relocation.
34: R_AARCH64_JUMP26 .text.unlikely
Disassembly of section .text.unlikely:
0000000000000000 <foo.cold>:
0: 90000015 adrp x21, 0 <foo.cold>
0: R_AARCH64_ADR_PREL_PG_HI21 .bss
4: 52800053 mov w19, #0x2 // #2
8: aa1003f4 mov x20, x16 # Here we try and use the clobbered x16 register.
c: b90002b3 str w19, [x21]
c: R_AARCH64_LDST32_ABS_LO12_NC .bss
10: 94000000 bl 0 <bar>
10: R_AARCH64_CALL26 bar
14: 90000000 adrp x0, 4 <foo.cold+0x4>
14: R_AARCH64_ADR_GOT_PAGE v
18: d28001a3 mov x3, #0xd // #13
1c: 52800062 mov w2, #0x3 // #3
20: f9400000 ldr x0, [x0]
20: R_AARCH64_LD64_GOT_LO12_NC v