[Bug rtl-optimization/58066] __tls_get_addr is called with misaligned stack on x86-64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58066

--- Comment #16 from uros at gcc dot gnu.org ---
Author: uros
Date: Wed Jul 15 07:39:30 2015
New Revision: 225807

URL: https://gcc.gnu.org/viewcvs?rev=225807&root=gcc&view=rev
Log:
        PR rtl-optimization/58066
        * calls.c (expand_call): Precompute register parameters before stack
        alignment is performed.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/calls.c
[Bug rtl-optimization/58066] __tls_get_addr is called with misaligned stack on x86-64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58066

--- Comment #14 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Uroš Bizjak from comment #13)
> Patch in testing.

This patch fixes the testcase; now we get:

inet_ntoa:
   0:   41 56                   push   %r14
   2:   41 55                   push   %r13
   4:   44 0f b6 ef             movzbl %dil,%r13d
   8:   41 54                   push   %r12
   a:   55                      push   %rbp
   b:   41 89 fc                mov    %edi,%r12d
   e:   53                      push   %rbx
   f:   89 fb                   mov    %edi,%ebx
  11:   41 c1 ec 10             shr    $0x10,%r12d
  15:   0f b6 c7                movzbl %bh,%eax
  18:   c1 eb 18                shr    $0x18,%ebx
  1b:   45 0f b6 e4             movzbl %r12b,%r12d
  1f:   41 89 c6                mov    %eax,%r14d
  22:   48 8d 3d 00 00 00 00    lea    0(%rip),%rdi        # 29 <inet_ntoa+0x29>
                        25: R_X86_64_TLSLD      buffer+0xfffc
  29:   e8 00 00 00 00          callq  2e <inet_ntoa+0x2e>
                        2a: R_X86_64_PLT32      __tls_get_addr+0xfffc
  2e:   48 83 ec 08             sub    $0x8,%rsp
  32:   48 8d 15 00 00 00 00    lea    0(%rip),%rdx        # 39 <inet_ntoa+0x39>
                        35: R_X86_64_PC32       .LC0+0xfffc
  39:   45 89 e1                mov    %r12d,%r9d
  3c:   48 8d a8 00 00 00 00    lea    0x0(%rax),%rbp
                        3f: R_X86_64_DTPOFF32   buffer
  43:   53                      push   %rbx
  44:   45 89 f0                mov    %r14d,%r8d
  47:   44 89 e9                mov    %r13d,%ecx
  4a:   31 c0                   xor    %eax,%eax
  4c:   be 12 00 00 00          mov    $0x12,%esi
  51:   48 89 ef                mov    %rbp,%rdi
  54:   e8 00 00 00 00          callq  59 <inet_ntoa+0x59>
                        55: R_X86_64_PLT32      __snprintf+0xfffc
  59:   58                      pop    %rax
  5a:   48 89 e8                mov    %rbp,%rax
  5d:   5a                      pop    %rdx
  5e:   5b                      pop    %rbx
  5f:   5d                      pop    %rbp
  60:   41 5c                   pop    %r12
  62:   41 5d                   pop    %r13
  64:   41 5e                   pop    %r14
  66:   c3                      retq

The difference between patched (+++) and unpatched (---) code is:

--- pr58066_.s  2015-07-13 11:58:23.0 +0200
+++ pr58066.s   2015-07-13 11:58:26.0 +0200
@@ -28,16 +28,16 @@
        movzbl  %bh, %eax
        shrl    $24, %ebx
        movzbl  %r12b, %r12d
-       subq    $8, %rsp
-.LCFI5:
        movl    %eax, %r14d
        leaq    buffer@tlsld(%rip), %rdi
        call    __tls_get_addr@PLT
-       pushq   %rbx
-.LCFI6:
+       subq    $8, %rsp
+.LCFI5:
        leaq    .LC0(%rip), %rdx
        movl    %r12d, %r9d
        leaq    buffer@dtpoff(%rax), %rbp
+       pushq   %rbx
+.LCFI6:
        movl    %r14d, %r8d
        movl    %r13d, %ecx
        xorl    %eax, %eax

HJ, can you please test whether the patch fixes your problem?
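
For readers following the listing and the diff above, the alignment arithmetic can be checked with a small model. This is only a sketch, not GCC code: the function and constant names are made up for illustration, and it assumes the unpatched prologue performs the same five callee-saved pushes as the patched listing (the diff context does not show the prologue).

```c
#include <assert.h>
#include <stdbool.h>

/* The x86-64 psABI expects %rsp to be 16-byte aligned at each call
   instruction.  On entry the return address has just been pushed, so
   the callee starts 8 bytes below an aligned boundary.  All names here
   are hypothetical; depths are bytes below the caller's aligned %rsp.  */
enum { RETURN_ADDRESS = 8, PUSH = 8, CALLEE_SAVED_PUSHES = 5 };

/* True if the given depth leaves %rsp on a 16-byte boundary.  */
static bool is_aligned(int depth)
{
    return depth % 16 == 0;
}

/* Depth at "call __tls_get_addr@PLT".  The unpatched code has already
   executed "subq $8, %rsp" at that point; the patched code has not.  */
int depth_at_tls_call(bool patched)
{
    int depth = RETURN_ADDRESS + CALLEE_SAVED_PUSHES * PUSH;  /* 48 */
    if (!patched)
        depth += 8;                                           /* 56 */
    return depth;
}

/* Depth at the __snprintf call: in both versions the subq and the
   "pushq %rbx" of the stack argument have been executed by then.  */
int depth_at_snprintf_call(void)
{
    return RETURN_ADDRESS + CALLEE_SAVED_PUSHES * PUSH + 8 + PUSH;  /* 64 */
}

bool tls_call_is_aligned(bool patched)
{
    return is_aligned(depth_at_tls_call(patched));
}

bool snprintf_call_is_aligned(void)
{
    return is_aligned(depth_at_snprintf_call());
}

/* Self-check of the model against the listing.  */
void check_alignment_model(void)
{
    assert(tls_call_is_aligned(true));    /* patched: 48 bytes   */
    assert(!tls_call_is_aligned(false));  /* unpatched: 56 bytes */
    assert(snprintf_call_is_aligned());   /* both: 64 bytes      */
}
```

In this model the __snprintf call is aligned either way; only the position of the 8-byte adjustment relative to the __tls_get_addr call differs, which is what the diff moves.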
[Bug rtl-optimization/58066] __tls_get_addr is called with misaligned stack on x86-64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58066

--- Comment #13 from Uroš Bizjak <ubizjak at gmail dot com> ---
Created attachment 35964
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=35964&action=edit
Combined middle/end/target patch

Patch in testing.
[Bug rtl-optimization/58066] __tls_get_addr is called with misaligned stack on x86-64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58066

Uroš Bizjak <ubizjak at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|target                      |rtl-optimization

--- Comment #12 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Uroš Bizjak from comment #11)
> Please make 64bit TLS patterns dependent on SP_REG, in the same way as 32bit
> are.

This won't fix this particular case, but this dependency would be nice to
have.

The problem with the testcase from Comment #10 is caused by the stack
anti-adjustment emitted from calls.c:

    1: NOTE_INSN_DELETED
    4: NOTE_INSN_BASIC_BLOCK 2
    2: r96:SI=di:SI
    3: NOTE_INSN_FUNCTION_BEG
    6: {sp:DI=sp:DI-0x8;clobber flags:CC;}     <--- *** here ***
      REG_ARGS_SIZE 0x8
    7: {r98:SI=r96:SI 0>>0x10;clobber flags:CC;}
    8: {r99:QI=r98:SI#00x;clobber flags:CC;}
    9: r100:SI=zero_extend(r99:QI)
   10: r101:QI#0=zero_extract(r96:SI,0x8,0x8)
   11: r102:SI=zero_extend(r101:QI)
   12: r103:SI=zero_extend(r96:SI#0)
   13: ax:DI=call [`__tls_get_addr'] argc:0
      REG_EH_REGION 0x8000
   14: r105:DI=ax:DI
      REG_EQUAL unspec[0] 21
   15: {r106:DI=r105:DI+const(unspec[`buffer'] 6);clobber flags:CC;}
   16: r104:DI=r106:DI
      REG_EQUAL `buffer'
   17: {r108:SI=r96:SI 0>>0x18;clobber flags:CC;}
   18: r109:SI=zero_extend(r108:SI#0)
   19: [pre sp:DI+=0xfff8]=r109:SI
      REG_ARGS_SIZE 0x10
   20: r9:SI=r100:SI
   21: r8:SI=r102:SI
   22: cx:SI=r103:SI
   23: dx:DI=`*.LC0'
   24: si:DI=0x12
   25: di:DI=r104:DI
   26: ax:QI=0
   27: call [`__snprintf'] argc:0x10
      REG_CALL_DECL `__snprintf'
   28: ax:DI=call [`__tls_get_addr'] argc:0
      REG_EH_REGION 0x8000
   29: r111:DI=ax:DI
      REG_EQUAL unspec[0] 21
   30: {r112:DI=r111:DI+const(unspec[`buffer'] 6);clobber flags:CC;}
   31: r95:DI=r112:DI
      REG_EQUAL `buffer'
   32: {sp:DI=sp:DI+0x10;clobber flags:CC;}
      REG_ARGS_SIZE 0
   36: ax:DI=r95:DI
   37: use ax:DI

Putting a breakpoint on anti_adjust_stack shows where it happens:

Breakpoint 1, anti_adjust_stack (adjust=0x2e7b0500)
    at /home/uros/gcc-svn/trunk/gcc/explow.c:902
902       if (adjust == const0_rtx)
(gdb) bt
#0  anti_adjust_stack (adjust=0x2e7b0500)
    at /home/uros/gcc-svn/trunk/gcc/explow.c:902
#1  0x0080f24c in expand_call (exp=0x2e7b3680, target=0x0, ignore=1)
    at /home/uros/gcc-svn/trunk/gcc/calls.c:3165
#2  0x00966084 in expand_expr_real_1 (exp=0x2e7b3680, target=0x0,
    tmode=VOIDmode, modifier=EXPAND_NORMAL, alt_rtl=0x0,
    inner_reference_p=false)
    at /home/uros/gcc-svn/trunk/gcc/expr.c:10362

There is already a precompute_register_parameters function, which contains:

        /* If the value is a non-legitimate constant, force it into a
           pseudo now.  TLS symbols sometimes need a call to resolve.  */
        if (CONSTANT_P (args[i].value)
            && !targetm.legitimate_constant_p (args[i].mode, args[i].value))
          args[i].value = force_reg (args[i].mode, args[i].value);

So the core of the problem is in the call infrastructure, which should emit
precomputed register parameters before anti_adjust_stack is emitted.

Once this infrastructure problem is fixed, the proposed SP_REG dependency will
prevent the stack adjustment from being scheduled above TLS patterns.

Re-confirmed as an RTL-optimization problem.
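
For context, here is a rough C sketch of the kind of source that produces this pattern. It is not the testcase from Comment #10, which is not reproduced in this thread. It is reconstructed from the dumps as an assumption: a thread-local 18-byte buffer (the 0x12 size argument), an snprintf call with four byte-sized values of which the last one goes on the stack, and a guessed format string. The names format_ipv4 and formats_as are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Thread-local result buffer.  When built as position-independent
   code, taking its address may require a call to __tls_get_addr.  */
static __thread char buffer[18];

/* Format the four bytes of ADDR, lowest byte first, into the
   thread-local buffer.  With six register arguments already used by
   snprintf, the last value is passed on the stack, so the compiler has
   to adjust the stack around the same place where it resolves the TLS
   address of the buffer.  The format string is an assumption.  */
char *format_ipv4(uint32_t addr)
{
    int n = snprintf(buffer, sizeof buffer, "%u.%u.%u.%u",
                     (unsigned) (addr & 0xff),
                     (unsigned) ((addr >> 8) & 0xff),
                     (unsigned) ((addr >> 16) & 0xff),
                     (unsigned) (addr >> 24));
    /* "255.255.255.255" is 15 characters, so the result always fits.  */
    assert(n > 0 && (size_t) n < sizeof buffer);
    return buffer;
}

/* Convenience helper: nonzero if ADDR formats as EXPECTED.  */
int formats_as(uint32_t addr, const char *expected)
{
    return strcmp(format_ipv4(addr), expected) == 0;
}
```

Compiling something like this with optimization and -fPIC and inspecting the assembly is one way to see whether the stack adjustment lands before or after the __tls_get_addr call; the exact output depends on the compiler version and options.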
[Bug rtl-optimization/58066] __tls_get_addr is called with misaligned stack on x86-64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58066

--- Comment #15 from H.J. Lu <hjl.tools at gmail dot com> ---
(In reply to Uroš Bizjak from comment #13)
> Created attachment 35964 [details]
> Combined middle/end/target patch
>
> Patch in testing.

I tried it on GCC 5 and it works on glibc.  Thanks.