[Bug rtl-optimization/58066] __tls_get_addr is called with misaligned stack on x86-64

2015-07-15 Thread uros at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58066

--- Comment #16 from uros at gcc dot gnu.org ---
Author: uros
Date: Wed Jul 15 07:39:30 2015
New Revision: 225807

URL: https://gcc.gnu.org/viewcvs?rev=225807&root=gcc&view=rev
Log:
PR rtl-optimization/58066
* calls.c (expand_call): Precompute register parameters before stack
alignment is performed.


Modified:
trunk/gcc/ChangeLog
trunk/gcc/calls.c


[Bug rtl-optimization/58066] __tls_get_addr is called with misaligned stack on x86-64

2015-07-13 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58066

--- Comment #14 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Uroš Bizjak from comment #13)

> Patch in testing.

This patch fixes the testcase; now we get:

 inet_ntoa:
   0:   41 56                   push   %r14
   2:   41 55                   push   %r13
   4:   44 0f b6 ef             movzbl %dil,%r13d
   8:   41 54                   push   %r12
   a:   55                      push   %rbp
   b:   41 89 fc                mov    %edi,%r12d
   e:   53                      push   %rbx
   f:   89 fb                   mov    %edi,%ebx
  11:   41 c1 ec 10             shr    $0x10,%r12d
  15:   0f b6 c7                movzbl %bh,%eax
  18:   c1 eb 18                shr    $0x18,%ebx
  1b:   45 0f b6 e4             movzbl %r12b,%r12d
  1f:   41 89 c6                mov    %eax,%r14d
  22:   48 8d 3d 00 00 00 00    lea    0x0(%rip),%rdi        # 29 <inet_ntoa+0x29>
                        25: R_X86_64_TLSLD      buffer+0xfffffffffffffffc
  29:   e8 00 00 00 00          callq  2e <inet_ntoa+0x2e>
                        2a: R_X86_64_PLT32      __tls_get_addr+0xfffffffffffffffc
  2e:   48 83 ec 08             sub    $0x8,%rsp
  32:   48 8d 15 00 00 00 00    lea    0x0(%rip),%rdx        # 39 <inet_ntoa+0x39>
                        35: R_X86_64_PC32       .LC0+0xfffffffffffffffc
  39:   45 89 e1                mov    %r12d,%r9d
  3c:   48 8d a8 00 00 00 00    lea    0x0(%rax),%rbp
                        3f: R_X86_64_DTPOFF32   buffer
  43:   53                      push   %rbx
  44:   45 89 f0                mov    %r14d,%r8d
  47:   44 89 e9                mov    %r13d,%ecx
  4a:   31 c0                   xor    %eax,%eax
  4c:   be 12 00 00 00          mov    $0x12,%esi
  51:   48 89 ef                mov    %rbp,%rdi
  54:   e8 00 00 00 00          callq  59 <inet_ntoa+0x59>
                        55: R_X86_64_PLT32      __snprintf+0xfffffffffffffffc
  59:   58                      pop    %rax
  5a:   48 89 e8                mov    %rbp,%rax
  5d:   5a                      pop    %rdx
  5e:   5b                      pop    %rbx
  5f:   5d                      pop    %rbp
  60:   41 5c                   pop    %r12
  62:   41 5d                   pop    %r13
  64:   41 5e                   pop    %r14
  66:   c3                      retq

The difference between patched (+++) and unpatched (---) code is:

--- pr58066_.s  2015-07-13 11:58:23.0 +0200
+++ pr58066.s   2015-07-13 11:58:26.0 +0200
@@ -28,16 +28,16 @@
        movzbl  %bh, %eax
        shrl    $24, %ebx
        movzbl  %r12b, %r12d
-       subq    $8, %rsp
-.LCFI5:
        movl    %eax, %r14d
        leaq    buffer@tlsld(%rip), %rdi
        call    __tls_get_addr@PLT
-       pushq   %rbx
-.LCFI6:
+       subq    $8, %rsp
+.LCFI5:
        leaq    .LC0(%rip), %rdx
        movl    %r12d, %r9d
        leaq    buffer@dtpoff(%rax), %rbp
+       pushq   %rbx
+.LCFI6:
        movl    %r14d, %r8d
        movl    %r13d, %ecx
        xorl    %eax, %eax

HJ, can you please test whether the patch fixes your problem?
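
A minimal sketch of the alignment arithmetic behind the diff above. It assumes the SysV x86-64 ABI rule that %rsp is 16-byte aligned immediately before a call instruction, so it sits at 8 mod 16 on function entry. The helper misalignment_at_call and the two arrays are hypothetical names used only for illustration; they are not part of GCC or of the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Return %rsp modulo 16 at the point of a call instruction, given the
   stack growth (in bytes) of each push/adjustment since function entry.
   Assumption: SysV x86-64 ABI, i.e. %rsp was 16-byte aligned just before
   the caller's call, so the return address leaves it at 8 mod 16.  */
static unsigned misalignment_at_call(const unsigned *grow, size_t n)
{
    unsigned long used = 8;            /* return address pushed by the caller */
    for (size_t i = 0; i < n; i++) {
        assert(grow[i] % 8 == 0);      /* pushes/adjustments here are 8-byte multiples */
        used += grow[i];
    }
    return (unsigned)((16 - used % 16) % 16);   /* 0 means aligned */
}

/* Five register pushes precede the __tls_get_addr call in both versions.  */
static const unsigned patched_seq[]   = { 8, 8, 8, 8, 8 };
/* The unpatched code also executed "subq $8, %rsp" before that call.  */
static const unsigned unpatched_seq[] = { 8, 8, 8, 8, 8, 8 };
```

With these inputs, misalignment_at_call(patched_seq, 5) gives 0 and misalignment_at_call(unpatched_seq, 6) gives 8: the early subq is what leaves the stack off by 8 bytes when __tls_get_addr is called.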

[Bug rtl-optimization/58066] __tls_get_addr is called with misaligned stack on x86-64

2015-07-13 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58066

--- Comment #13 from Uroš Bizjak <ubizjak at gmail dot com> ---
Created attachment 35964
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=35964&action=edit
Combined middle/end/target patch

Patch in testing.

[Bug rtl-optimization/58066] __tls_get_addr is called with misaligned stack on x86-64

2015-07-13 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58066

Uroš Bizjak <ubizjak at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|target                      |rtl-optimization

--- Comment #12 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Uroš Bizjak from comment #11)
> Please make 64bit TLS patterns dependent on SP_REG, in the same way as 32bit
> are.

This won't fix this particular case, but this dependency would be nice to
have.

The problem with the testcase from Comment #10 is caused by stack
anti-adjustment, emitted from calls.c:

    1: NOTE_INSN_DELETED
    4: NOTE_INSN_BASIC_BLOCK 2
    2: r96:SI=di:SI
    3: NOTE_INSN_FUNCTION_BEG
    6: {sp:DI=sp:DI-0x8;clobber flags:CC;}   <--- *** here ***
      REG_ARGS_SIZE 0x8
    7: {r98:SI=r96:SI 0>>0x10;clobber flags:CC;}
    8: {r99:QI=r98:SI#00x;clobber flags:CC;}
    9: r100:SI=zero_extend(r99:QI)
   10: r101:QI#0=zero_extract(r96:SI,0x8,0x8)
   11: r102:SI=zero_extend(r101:QI)
   12: r103:SI=zero_extend(r96:SI#0)
   13: ax:DI=call [`__tls_get_addr'] argc:0
      REG_EH_REGION 0x8000
   14: r105:DI=ax:DI
      REG_EQUAL unspec[0] 21
   15: {r106:DI=r105:DI+const(unspec[`buffer'] 6);clobber flags:CC;}
   16: r104:DI=r106:DI
      REG_EQUAL `buffer'
   17: {r108:SI=r96:SI 0>>0x18;clobber flags:CC;}
   18: r109:SI=zero_extend(r108:SI#0)
   19: [pre sp:DI+=0xfffffffffffffff8]=r109:SI
      REG_ARGS_SIZE 0x10
   20: r9:SI=r100:SI
   21: r8:SI=r102:SI
   22: cx:SI=r103:SI
   23: dx:DI=`*.LC0'
   24: si:DI=0x12
   25: di:DI=r104:DI
   26: ax:QI=0
   27: call [`__snprintf'] argc:0x10
      REG_CALL_DECL `__snprintf'
   28: ax:DI=call [`__tls_get_addr'] argc:0
      REG_EH_REGION 0x8000
   29: r111:DI=ax:DI
      REG_EQUAL unspec[0] 21
   30: {r112:DI=r111:DI+const(unspec[`buffer'] 6);clobber flags:CC;}
   31: r95:DI=r112:DI
      REG_EQUAL `buffer'
   32: {sp:DI=sp:DI+0x10;clobber flags:CC;}
      REG_ARGS_SIZE 0
   36: ax:DI=r95:DI
   37: use ax:DI
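
For orientation, the RTL above corresponds to a function of roughly the following shape. This is a sketch reconstructed from the dumps in this thread, not the actual testcase from Comment #10: the name sketch_inet_ntoa, the parameter type and the format string are assumptions, while the thread-local `buffer` and the size 0x12 (18) are taken from the dump. The stack adjustment (insn 6) and the __tls_get_addr call (insn 13) only appear when the buffer is accessed through a dynamic TLS model, e.g. when compiling with -fPIC.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Thread-local result buffer; 18 bytes matches the 0x12 size argument in
   the dump.  __thread is a GNU extension (C11 spells it _Thread_local).  */
static __thread char buffer[18];

/* Hypothetical stand-in for the function in the dump.  snprintf receives
   seven arguments, so on x86-64 the last one is passed on the stack; that
   stack argument is why expand_call emits a stack adjustment at all.  */
char *sketch_inet_ntoa(uint32_t in)
{
    int n = snprintf(buffer, sizeof buffer, "%u.%u.%u.%u",
                     (unsigned)(in & 0xff), (unsigned)((in >> 8) & 0xff),
                     (unsigned)((in >> 16) & 0xff), (unsigned)(in >> 24));
    assert(n > 0 && (size_t)n < sizeof buffer);   /* longest output is 15 chars */
    return buffer;
}
```

Because the address of `buffer` is itself a register parameter of the snprintf call, computing it requires the __tls_get_addr call to be emitted while the arguments are being set up, which is where the ordering relative to the stack adjustment matters.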

Putting a breakpoint on anti_adjust_stack will show where it happens:

Breakpoint 1, anti_adjust_stack (adjust=0x2e7b0500)
    at /home/uros/gcc-svn/trunk/gcc/explow.c:902
902   if (adjust == const0_rtx)
(gdb) bt
#0  anti_adjust_stack (adjust=0x2e7b0500)
    at /home/uros/gcc-svn/trunk/gcc/explow.c:902
#1  0x0080f24c in expand_call (exp=0x2e7b3680, target=0x0, ignore=1)
    at /home/uros/gcc-svn/trunk/gcc/calls.c:3165
#2  0x00966084 in expand_expr_real_1 (exp=0x2e7b3680, target=0x0,
    tmode=VOIDmode, modifier=EXPAND_NORMAL, alt_rtl=0x0, inner_reference_p=false)
    at /home/uros/gcc-svn/trunk/gcc/expr.c:10362

There is already a precompute_register_parameters function, which contains:

/* If the value is a non-legitimate constant, force it into a
   pseudo now.  TLS symbols sometimes need a call to resolve.  */
if (CONSTANT_P (args[i].value)
    && !targetm.legitimate_constant_p (args[i].mode, args[i].value))
  args[i].value = force_reg (args[i].mode, args[i].value);

So, the core of the problem is in the call infrastructure, which should emit
precomputed register parameters before anti_adjust_stack is emitted.

After this infrastructure problem is fixed, the proposed SP_REG dependency will
prevent the stack adjustment from being scheduled above TLS patterns.

Re-confirmed as RTL-optimization problem.

[Bug rtl-optimization/58066] __tls_get_addr is called with misaligned stack on x86-64

2015-07-13 Thread hjl.tools at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58066

--- Comment #15 from H.J. Lu <hjl.tools at gmail dot com> ---
(In reply to Uroš Bizjak from comment #13)
> Created attachment 35964 [details]
> Combined middle/end/target patch
> 
> Patch in testing.

I tried it on GCC 5 and it works on glibc.  Thanks.