https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95816
Bug ID: 95816
Summary: Aarch64 jumps between Hot/Cold sections use possibly clobbered registers x16/x17
Product: gcc
Version: 10.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: matmal01 at gcc dot gnu.org
Target Milestone: ---
Target: AArch64

When a function is split into two different sections (hot/cold), the
assembler produces a relocation on jumps between the two sections.

The linker is permitted to use a veneer to implement such a relocated jump.
The registers x16 and x17 are reserved for use in those veneers.
Hence the registers x16 and x17 should be treated as clobbered when jumping
between the hot/cold sections in a function.

This is not done.

The testcase below (modified from predict-22.c in the testsuite)
demonstrates the problem.

---------
$ aarch64-none-linux-gnu-gcc \
>   predict-22.c \
>   -O2 -w -fPIC -freorder-blocks-and-partition \
>   -c -o predict-22.o
---------
volatile int v;
void bar (void) __attribute__((leaf, cold));
void baz (int *);
void alt (long long);

void
foo (int x, int y, int z)
{
  static int f __attribute__((section ("mysection")));
  register long long k asm ("x16");
  __asm__ ("mov\t%0, 10" : "=r" (k) : "0" (k));
  f = 1;
  if (__builtin_expect (x, 0))
    if (__builtin_expect (y, 0))
      if (__builtin_expect (z, 0))
        {
          f = 2;
          k *= 13;
          bar ();
          v += 1; v *= 2; v /= 2; v -= 1;
          v += 1; v *= 2; v /= 2; v -= 1;
          v += 1; v *= 2; v /= 2; v -= 1;
          v += k; v *= 2; v /= 2; v -= 1;
          v += 1; v *= 2; v /= 2; v -= 1;
          v += 1; v *= 2; v /= 2; v -= 1;
          v += 1; v *= 2; v /= 2; v -= 1;
          v += 1; v *= 2; v /= 2; v -= 1;
          f = 3;
          __builtin_abort ();
        }
  f = k;
  baz (&f);
}
--------

Compiling this produces the object file dumped below.
The dump shows that there is an R_AARCH64_JUMP26 relocation on the jump
between the hot/cold sections, and that the value stored in x16 is used
after that jump.

$ aarch64-none-linux-gnu-objdump -dr predict-22.o

predict-22.o:     file format elf64-littleaarch64


Disassembly of section .text:

0000000000000000 <foo>:
   0:   7100003f        cmp     w1, #0x0
   4:   7a401844        ccmp    w2, #0x0, #0x4, ne  // ne = any
   8:   7a401804        ccmp    w0, #0x0, #0x4, ne  // ne = any
   c:   d2800150        mov     x16, #0xa                       // #10
  10:   540000a1        b.ne    24 <foo+0x24>  // b.any
  14:   90000001        adrp    x1, 0 <foo>
                        14: R_AARCH64_ADR_PREL_PG_HI21  .bss
  18:   91000020        add     x0, x1, #0x0
                        18: R_AARCH64_ADD_ABS_LO12_NC   .bss
  1c:   b9000030        str     w16, [x1]
                        1c: R_AARCH64_LDST32_ABS_LO12_NC        .bss
  20:   14000000        b       0 <baz>
                        20: R_AARCH64_JUMP26    baz
  24:   a9bd7bfd        stp     x29, x30, [sp, #-48]!
  28:   910003fd        mov     x29, sp
  2c:   a90153f3        stp     x19, x20, [sp, #16]
  30:   f90013f5        str     x21, [sp, #32]
  34:   14000000        b       0 <foo>
      # Here is the relocation.
                        34: R_AARCH64_JUMP26    .text.unlikely

Disassembly of section .text.unlikely:

0000000000000000 <foo.cold>:
   0:   90000015        adrp    x21, 0 <foo.cold>
                        0: R_AARCH64_ADR_PREL_PG_HI21   .bss
   4:   52800053        mov     w19, #0x2                       // #2
   8:   aa1003f4        mov     x20, x16
      # Here we try to use the clobbered x16 register.
   c:   b90002b3        str     w19, [x21]
                        c: R_AARCH64_LDST32_ABS_LO12_NC .bss
  10:   94000000        bl      0 <bar>
                        10: R_AARCH64_CALL26    bar
  14:   90000000        adrp    x0, 4 <foo.cold+0x4>
                        14: R_AARCH64_ADR_GOT_PAGE      v
  18:   d28001a3        mov     x3, #0xd                        // #13
  1c:   52800062        mov     w2, #0x3                        // #3
  20:   f9400000        ldr     x0, [x0]
                        20: R_AARCH64_LD64_GOT_LO12_NC  v
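
For anyone who wants to look for the same pattern in other object files, the
two things to spot in the dump above are a branch relocation against a
section symbol and a later mention of x16/x17. The helpers below are a rough
sketch of that check on single lines of `objdump -dr` text. They are not part
of GCC or binutils. The names is_section_branch_reloc and mentions_x16_x17 are
made up for illustration, and treating a target that starts with '.' as a
section symbol is an assumption of this sketch. A mention of x16 is only a
hint; deciding whether the register is really live across the jump still
needs a human reading the code.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if LINE (one line of `objdump -dr` output)
   holds an R_AARCH64_JUMP26 or R_AARCH64_CALL26 relocation whose target
   starts with '.', which this sketch assumes to be a section symbol such as
   ".text.unlikely".  Returns 0 otherwise.  */
int
is_section_branch_reloc (const char *line)
{
  static const char *const kinds[] = { "R_AARCH64_JUMP26", "R_AARCH64_CALL26" };

  for (size_t i = 0; i < sizeof kinds / sizeof kinds[0]; i++)
    {
      const char *p = strstr (line, kinds[i]);
      if (p == NULL)
        continue;
      /* Skip the relocation name and the whitespace before the target.  */
      p += strlen (kinds[i]);
      while (*p == ' ' || *p == '\t')
        p++;
      return *p == '.';
    }
  return 0;
}

/* Hypothetical helper: returns 1 if LINE mentions x16, x17, w16 or w17 as a
   whole token (so "0x16" does not count), 0 otherwise.  */
int
mentions_x16_x17 (const char *line)
{
  for (const char *p = line; *p != '\0'; p++)
    {
      if ((*p == 'x' || *p == 'w') && p[1] == '1'
          && (p[2] == '6' || p[2] == '7'))
        {
          int start_ok = (p == line) || !isalnum ((unsigned char) p[-1]);
          int end_ok = !isalnum ((unsigned char) p[3]);
          if (start_ok && end_ok)
            return 1;
        }
    }
  return 0;
}
```

Feeding the lines of the dump above through these helpers would flag the
relocation at offset 34 in .text (target .text.unlikely) and the
"mov x20, x16" at offset 8 in foo.cold, while the JUMP26 against baz and the
CALL26 against bar are ignored because their targets are ordinary symbols.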