Issue 64633
Summary [BOLT][AArch64] Implementation of option `--plt` on AArch64
Labels new issue
Assignees
Reporter Kepontry
I'd like to discuss a question that came up while implementing BOLT's `--plt` option on AArch64. Here is the test program, written in C.

```C
#include <stdio.h>
int main(){
    printf("Hello World\n");
    return 0;
}
```



On x86-64, the `printf` call is compiled into a call to the `puts` entry in the `.plt` section.

```asm
  40058f:       e8 fc fe ff ff          callq  400490 <puts@plt>
```

The first instruction of the `puts` entry (at 0x400490) is an indirect jump through the GOT entry of `puts`, which holds the address of the implementation.

```asm
Disassembly of section .plt:

0000000000400480 <.plt>:
  400480:       ff 35 42 0b 20 00       pushq  0x200b42(%rip)        # 600fc8 <_GLOBAL_OFFSET_TABLE_+0x8>
  400486:       ff 25 44 0b 20 00       jmpq   *0x200b44(%rip)        # 600fd0 <_GLOBAL_OFFSET_TABLE_+0x10>
  40048c:       0f 1f 40 00             nopl   0x0(%rax)

0000000000400490 <puts@plt>:
  400490:       ff 25 42 0b 20 00       jmpq   *0x200b42(%rip)        # 600fd8 <puts@GLIBC_2.2.5>
  400496:       68 00 00 00 00          pushq  $0x0
  40049b:       e9 e0 ff ff ff          jmpq   400480 <.plt>

```

The `--plt` option uses the function `convertCallToIndirectCall` to fold the `callq` (0x40058f) and the `jmpq` (0x400490) into a single indirect `callq` (0xa000ef in the rewritten code), which replaces the original `callq` (0x40058f) and reduces the number of instructions executed.

```asm
  a000ef:       ff 15 e3 0e c0 ff       callq *-0x3ff11d(%rip)        # 600fd8 <puts@GLIBC_2.2.5>
```
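
As a sanity check on the addresses above, here is a minimal C sketch that recomputes the targets from the raw bytes. It assumes only that each of the three encodings ends in a little-endian 32-bit displacement relative to the address of the next instruction. The helper names `sext32` and `rel32_target` are hypothetical and are not part of BOLT.

```c
#include <stdint.h>

/* Hypothetical helper (not a BOLT API): sign-extend a raw 32-bit field. */
static int64_t sext32(uint32_t v)
{
    return (int64_t)(v ^ 0x80000000u) - (int64_t)0x80000000;
}

/*
 * Hypothetical helper: target of an x86-64 instruction whose last four bytes
 * are a little-endian disp32/rel32. The displacement is added to the address
 * of the NEXT instruction, i.e. insn_addr + len.
 */
static uint64_t rel32_target(uint64_t insn_addr, const uint8_t *bytes, unsigned len)
{
    const uint8_t *d = bytes + len - 4;
    uint32_t raw = (uint32_t)d[0]
                 | ((uint32_t)d[1] << 8)
                 | ((uint32_t)d[2] << 16)
                 | ((uint32_t)d[3] << 24);
    return insn_addr + len + (uint64_t)sext32(raw);
}
```

Feeding in the bytes from the listings, the original `callq` at 0x40058f resolves to the PLT entry 0x400490, while both the `jmpq` at 0x400490 and the rewritten `callq` at 0xa000ef resolve to the same GOT slot, 0x600fd8.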

However, AArch64 has no instruction that calls an address stored in memory. Instead, the PLT entry uses four instructions, from 0x400540 to 0x40054c, to do the equivalent work.

```asm
  400694: 97ffffab     	bl	0x400540 <puts@plt>
```



```asm
0000000000400540 <puts@plt>:
  400540: 90000110     	adrp	x16, 0x420000 <puts@GLIBC_2.17+0x420000>
  400544: f9400e11     	ldr	x17, [x16, #0x18]
  400548: 91006210     	add	x16, x16, #0x18
  40054c: d61f0220     	br	x17
```
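
To make explicit which GOT slot this stub reads, the following C sketch decodes the `bl`, `adrp`, and `ldr` words from the listing. It is an illustration only: the helper names are hypothetical (not BOLT APIs), and the decoders handle just the three instruction forms shown here, using the standard A64 field layout.

```c
#include <stdint.h>

/* Hypothetical helper: sign-extend the low `bits` bits of v. */
static int64_t sext(uint64_t v, unsigned bits)
{
    uint64_t m = 1ull << (bits - 1);
    v &= (m << 1) - 1;
    return (int64_t)(v ^ m) - (int64_t)m;
}

/* BL: imm26 in bits [25:0], scaled by 4, relative to the BL itself. */
static uint64_t bl_target(uint64_t pc, uint32_t insn)
{
    return pc + (uint64_t)(sext(insn & 0x03ffffffu, 26) * 4);
}

/* ADRP: immhi in bits [23:5], immlo in bits [30:29]; yields a 4 KiB page. */
static uint64_t adrp_page(uint64_t pc, uint32_t insn)
{
    uint64_t imm = (((insn >> 5) & 0x7ffffu) << 2) | ((insn >> 29) & 0x3u);
    return (pc & ~0xfffull) + (uint64_t)(sext(imm, 21) * 4096);
}

/* LDR Xt, [Xn, #imm] (64-bit, unsigned offset): imm12 in [21:10], times 8. */
static uint64_t ldr64_offset(uint32_t insn)
{
    return (uint64_t)((insn >> 10) & 0xfffu) * 8;
}

/* Address of the GOT slot loaded by an adrp+ldr pair at the start of a stub. */
static uint64_t plt_got_slot(uint64_t stub_addr, uint32_t adrp, uint32_t ldr)
{
    return adrp_page(stub_addr, adrp) + ldr64_offset(ldr);
}
```

With the words from the listing, `bl_target(0x400694, 0x97ffffab)` gives 0x400540 and `plt_got_slot(0x400540, 0x90000110, 0xf9400e11)` gives 0x420018, so the stub loads the address of `puts` from the GOT slot at 0x420018 before branching to it.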

So my question is: should we replace the original `bl` with these four instructions (an optimization similar to the one on x86), or simply not support the `--plt` option on AArch64?