[Bug sanitizer/77538] segmentation fault: thread sanitizer shadow stack overflow

2016-12-04 Thread coollpe at hotmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77538

--- Comment #24 from peien luo  ---
(In reply to Dmitry Vyukov from comment #23)
> Please provide disassembly of the function that contains the PC
> (__gnu_cxx::__normal_iterator...).
> Did we fix any bugs that led to missed __tsan_func_exit callbacks?
> 
> Before we go any deeper, I suggest retesting with the latest gcc.
> There might have been bugs, and they may be fixed now. Even if a fix is
> backported to the 4.8 branch, you will still need to update the compiler.

I tried 4.9.4 today, and there seems to be a different error in gdb.

   0x7fe763ba774e <+0>:  push   %rbp
   0x7fe763ba774f <+1>:  mov    %rsp,%rbp
   0x7fe763ba7752 <+4>:  push   %r14
   0x7fe763ba7754 <+6>:  push   %r13
   0x7fe763ba7756 <+8>:  push   %r12
   0x7fe763ba7758 <+10>: push   %rbx
   0x7fe763ba7759 <+11>: sub    $0x1000f0,%rsp
=> 0x7fe763ba7760 <+18>: mov    %rdi,-0x1000e8(%rbp)
   0x7fe763ba7767 <+25>: mov    %rsi,-0x1000f0(%rbp)
   0x7fe763ba776e <+32>: mov    %rdx,-0x1000f8(%rbp)
   0x7fe763ba7775 <+39>: mov    %rcx,-0x100100(%rbp)
   0x7fe763ba777c <+46>: mov    %r8,-0x100108(%rbp)
   0x7fe763ba7783 <+53>: mov    %r9d,-0x10010c(%rbp)
   0x7fe763ba778a <+60>: mov    0x8(%rbp),%rax
   0x7fe763ba778e <+64>: mov    %rax,%rdi
   0x7fe763ba7791 <+67>: callq  0x7fe763871660 <__tsan_func_entry(void*)>

[Bug sanitizer/77538] segmentation fault: thread sanitizer shadow stack overflow

2016-12-03 Thread dvyukov at google dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77538

--- Comment #23 from Dmitry Vyukov  ---
Please provide disassembly of the function that contains the PC
(__gnu_cxx::__normal_iterator...).
Did we fix any bugs that led to missed __tsan_func_exit callbacks?

Before we go any deeper, I suggest retesting with the latest gcc. There
might have been bugs, and they may be fixed now. Even if a fix is
backported to the 4.8 branch, you will still need to update the compiler.

[Bug sanitizer/77538] segmentation fault: thread sanitizer shadow stack overflow

2016-12-03 Thread coollpe at hotmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77538

--- Comment #22 from peien luo  ---
The bt shows a stack depth of only 27, with no recursion. I modified the tsan
code to print what is in the shadow stack when it is about to overflow. Most of
the addresses appear to be:

__gnu_cxx::__normal_iterator > >
std::__unguarded_partition_pivot<__gnu_cxx::__normal_iterator > >,
__gnu_cxx::__ops::_Iter_comp_iter
>(__gnu_cxx::__normal_iterator
> >, __gnu_cxx::__normal_iterator > >, __gnu_cxx::__ops::_Iter_comp_iter)

[Bug sanitizer/77538] segmentation fault: thread sanitizer shadow stack overflow

2016-12-01 Thread dvyukov at google dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77538

--- Comment #21 from Dmitry Vyukov  ---
> is that huge number abnormal?

Let's say it is atypical for C/C++ programs because of fixed-size stacks. But
tsan has a limit of 64K frames in the latest version (maybe 4.8.2 had a limit of
32K frames).
But do you actually have that many frames in the thread stack? If you do bt in
gdb, does it actually show tens of thousands of frames? We have had bugs where
the shadow stack is maintained incorrectly and frames "leak".
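
The kind of "leak" described above can be illustrated with a toy model. This is only a sketch and not tsan's actual code: the class name, the method names, and the capacity value are invented for illustration. It pushes one entry per function entry and pops one per function exit, so a missed exit callback makes the shadow stack grow even though the real call stack stays shallow.

```python
class ToyShadowStack:
    """Toy model of a per-thread shadow stack; NOT tsan's implementation."""

    def __init__(self, capacity):
        self.capacity = capacity  # hypothetical limit, counted in frames
        self.frames = []

    def func_entry(self, pc):
        # Stands in for the callback made on every instrumented function entry.
        if len(self.frames) >= self.capacity:
            raise OverflowError("shadow stack overflow")
        self.frames.append(pc)

    def func_exit(self):
        # Stands in for the callback made on every instrumented function exit.
        self.frames.pop()


def run_calls(stack, n_calls, exit_is_missed):
    """Simulate n_calls sequential (non-nested) calls and return the final depth."""
    for _ in range(n_calls):
        stack.func_entry(0x1234)  # arbitrary placeholder PC
        if not exit_is_missed:
            stack.func_exit()
    return len(stack.frames)
```

With balanced callbacks the depth returns to zero after any number of calls. With the exit missed, the depth equals the number of calls, which would match a short gdb backtrace alongside a nearly full shadow stack.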

[Bug sanitizer/77538] segmentation fault: thread sanitizer shadow stack overflow

2016-12-01 Thread coollpe at hotmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77538

--- Comment #20 from peien luo  ---
(In reply to Dmitry Vyukov from comment #18)
> Looks like shadow stack overflow.
> Do you use fibers, ucontext, longjmp, exceptions or any other non-obvious
> control flow constructs?
> Fibers and exceptions are not supported. Longjmp should work.

(gdb) p &(thr->shadow_stack[0])
$9 = (unsigned long *) 0x7f9842712080
(gdb) p thr->shadow_stack_pos 
$10 = (__sanitizer::uptr *) 0x7f9842762b68

So the shadow stack had grown to 330472 bytes (the difference between the two
addresses above) when it crashed.
Is that huge number abnormal?
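
For reference, the figure can be reproduced from the two gdb values. The sketch below assumes each shadow stack entry is 8 bytes, which fits the `unsigned long *` type gdb prints on x86-64; the function name is made up for this example.

```python
def shadow_stack_usage(base_addr, pos_addr, entry_size=8):
    """Return (bytes_used, entries_used) between the stack base and its current position.

    entry_size=8 is an assumption: one uptr per frame on x86-64.
    """
    bytes_used = pos_addr - base_addr
    return bytes_used, bytes_used // entry_size


# Values taken from the gdb session above.
used_bytes, used_entries = shadow_stack_usage(0x7F9842712080, 0x7F9842762B68)
print(used_bytes, used_entries)
```

Under that assumption, 330472 bytes correspond to 41309 entries, which is above 32K (32768) frames and below 64K (65536) frames.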

[Bug sanitizer/77538] segmentation fault: thread sanitizer shadow stack overflow

2016-11-30 Thread coollpe at hotmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77538

--- Comment #19 from peien luo  ---
(In reply to Dmitry Vyukov from comment #18)
> Looks like shadow stack overflow.
> Do you use fibers, ucontext, longjmp, exceptions or any other non-obvious
> control flow constructs?
> Fibers and exceptions are not supported. Longjmp should work.

No, there is no use of ucontext or longjmp. Exceptions are not used either.

[Bug sanitizer/77538] segmentation fault: thread sanitizer shadow stack overflow

2016-11-07 Thread dvyukov at google dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77538

--- Comment #18 from Dmitry Vyukov  ---
Looks like shadow stack overflow.
Do you use fibers, ucontext, longjmp, exceptions or any other non-obvious
control flow constructs?
Fibers and exceptions are not supported. Longjmp should work.

[Bug sanitizer/77538] segmentation fault: thread sanitizer shadow stack overflow

2016-11-07 Thread coollpe at hotmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77538

--- Comment #17 from peien luo  ---
(In reply to Dmitry Vyukov from comment #16)
> > The stack size limit in my box is 8M. I have also checked /proc/limits.
> 
> So, does increasing the stack size help?
> Tsan increases stack consumption. 8MB is not that much given that you
> have 1MB frames.
> 
> > By enabling -fstack-protector-all, in gdb it still may get segfault here 
> > (at function entry).
> 
> Stack protector will not help to detect/prevent stack overflow. It only
> prevents PC overwrite on buffer overflows.

I increased it to 80MB; that did not help.

In a run built with 4.9.0 tsan, I came across a segfault in gdb. The source and
backtrace are:
23  void MutexSet::Add(u64 id, bool write, u64 epoch) {
24    // Look up existing mutex with the same id.
25    for (uptr i = 0; i < size_; i++) {
26      if (descs_[i].id == id) {
27        descs_[i].count++;
28        descs_[i].epoch = epoch;
29        return;
30      }

(gdb) p size_
$2 = 139907922088632

(gdb) bt
#0  0x7f3ed9c5f37c in __tsan::MutexSet::Add (this=this@entry=0x7f3ec6f63080, id=55166398195828824, write=write@entry=true, epoch=1680548)
    at ../../../../libsanitizer/tsan/tsan_mutexset.cc:26
#1  0x7f3ed9c555e5 in __tsan::MutexLock (thr=thr@entry=0x7f3ec6e78840, pc=pc@entry=139907918263873, addr=addr@entry=138040248895576)
    at ../../../../libsanitizer/tsan/tsan_rtl_mutex.cc:109
#2  0x7f3ed9c4f6ae in __interceptor_pthread_mutex_lock (m=0x7d8c0858)
    at ../../../../libsanitizer/tsan/tsan_interceptors.cc:811
...

Did something overflow?
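
One way to look at the printed size_ value is to view it in hexadecimal and compare it with the code address of frame #0. This is only a sketch of a diagnostic, and the interpretation is a guess rather than a finding: if the value shares its upper bits with nearby addresses, it may be an address that ended up where a count was expected. The helper name is invented for this example.

```python
SIZE_FIELD = 139907922088632   # value gdb printed for size_
CRASH_PC = 0x7F3ED9C5F37C      # address of frame #0 in the backtrace


def same_upper_bits(a, b, shift=32):
    """Return True if a and b agree in every bit above `shift`."""
    return (a >> shift) == (b >> shift)


print(hex(SIZE_FIELD))
print(same_upper_bits(SIZE_FIELD, CRASH_PC))
```

Here size_ is 0x7f3ed9ff52b8, which sits in the same 0x7f3e... range as the addresses in the backtrace. That pattern would be more consistent with the MutexSet memory being overwritten, or with `this` pointing at the wrong object, than with a counter that overflowed, but the data in this thread does not settle it.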

[Bug sanitizer/77538] segmentation fault: thread sanitizer shadow stack overflow

2016-11-06 Thread dvyukov at google dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77538

--- Comment #16 from Dmitry Vyukov  ---
> The stack size limit in my box is 8M. I have also checked /proc/limits.

So, does increasing the stack size help?
Tsan increases stack consumption. 8MB is not that much given that you have
1MB frames.

> By enabling -fstack-protector-all, in gdb it still may get segfault here (at 
> function entry).

Stack protector will not help to detect/prevent stack overflow. It only
prevents PC overwrite on buffer overflows.
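
To put the "1MB frames" remark into numbers, here is a rough estimate of how many such frames fit under a given stack limit. It is a sketch under simplifying assumptions: the frame size is taken from the `sub $0x1000e8,%rsp` instruction in the 4.9.0 disassembly posted in this thread, and everything else on the stack (other frames, tsan's own overhead) is ignored. The names are made up for this example.

```python
MIB = 1024 * 1024
FRAME_BYTES = 0x1000E8  # stack adjustment seen in the 4.9.0 function prologue


def frames_that_fit(stack_limit_bytes, frame_bytes):
    """Upper bound on how many frames of frame_bytes fit in stack_limit_bytes."""
    return stack_limit_bytes // frame_bytes


print(frames_that_fit(8 * MIB, FRAME_BYTES))
print(frames_that_fit(80 * MIB, FRAME_BYTES))
```

By this estimate an 8MB limit leaves room for at most 7 nested calls of that function and an 80MB limit for at most 79, so a shallow stack would not be expected to exhaust either limit through frame size alone.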

[Bug sanitizer/77538] segmentation fault: thread sanitizer shadow stack overflow

2016-11-06 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77538

--- Comment #15 from Andrew Pinski  ---
4.8.5:
   0x7f224dab0637 <+23>: sub    $0x1000f8,%rsp

4.9.0:
   0x7fc63563a72d <+29>: sub    $0x1000e8,%rsp

We actually use less stack memory with 4.9, so it looks like it was accidentally
working on 4.8.

[Bug sanitizer/77538] segmentation fault: thread sanitizer shadow stack overflow

2016-11-06 Thread coollpe at hotmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77538

--- Comment #14 from peien luo  ---
(In reply to Dmitry Vyukov from comment #12)
> The crash in gdb looks like stack overflow (unsurprising if there are 1MB
> frames). Does increasing thread stack size or reducing frame size (there
> must be something very big on the stack) help?

I tried gcc 4.9.4, 4.9.3, 4.9.2, 4.9.1, and 4.9.0 today and found that, in this
case, the problem first appears when compiling with 4.9.0.

I replaced libsanitizer in 4.9.0 with the one from 4.8.5 and found no issue.

The disassembly at that function's entry differs as follows:

4.8.5:
   0x7f224dab0620 <+0>:  push   %r15
   0x7f224dab0622 <+2>:  mov    %r9d,%r15d
   0x7f224dab0625 <+5>:  push   %r14
   0x7f224dab0627 <+7>:  push   %r13
   0x7f224dab0629 <+9>:  mov    %rsi,%r13
   0x7f224dab062c <+12>: push   %r12
   0x7f224dab062e <+14>: push   %rbp
   0x7f224dab062f <+15>: mov    %rdi,%rbp
   0x7f224dab0632 <+18>: lea    0x30(%rbp),%r14
   0x7f224dab0636 <+22>: push   %rbx
   0x7f224dab0637 <+23>: sub    $0x1000f8,%rsp
   0x7f224dab063e <+30>: mov    0x100128(%rsp),%rdi
   0x7f224dab0646 <+38>: lea    0x50(%rsp),%rbx
   0x7f224dab064b <+43>: mov    %rdx,0x28(%rsp)
   0x7f224dab0650 <+48>: mov    %rcx,0x38(%rsp)
   0x7f224dab0655 <+53>: mov    %r8,0x30(%rsp)
   0x7f224dab065a <+58>: mov    %fs:0x28,%rax
   0x7f224dab0663 <+67>: mov    %rax,0x1000e8(%rsp)
   0x7f224dab066b <+75>: xor    %eax,%eax
   0x7f224dab066d <+77>: callq  0x7f224d69ae50 <__tsan_func_entry(void*)>


4.9.0:
   0x7fc63563a710 <+0>:  push   %rbp
   0x7fc63563a711 <+1>:  mov    %rsp,%rbp
   0x7fc63563a714 <+4>:  push   %r15
   0x7fc63563a716 <+6>:  push   %r14
   0x7fc63563a718 <+8>:  push   %r13
   0x7fc63563a71a <+10>: push   %r12
   0x7fc63563a71c <+12>: mov    %rdi,%r15
   0x7fc63563a71f <+15>: push   %rbx
   0x7fc63563a720 <+16>: mov    %rsi,%r13
   0x7fc63563a723 <+19>: mov    %r9d,%r14d
   0x7fc63563a726 <+22>: lea    -0x1000d0(%rbp),%rbx
   0x7fc63563a72d <+29>: sub    $0x1000e8,%rsp
=> 0x7fc63563a734 <+36>: mov    %rdi,-0x1000e8(%rbp)
   0x7fc63563a73b <+43>: mov    0x8(%rbp),%rdi
   0x7fc63563a73f <+47>: mov    %rdx,-0x1000f0(%rbp)
   0x7fc63563a746 <+54>: mov    %rcx,-0x100100(%rbp)
   0x7fc63563a74d <+61>: mov    %r8,-0x1000f8(%rbp)
   0x7fc63563a754 <+68>: mov    %fs:0x28,%rax
   0x7fc63563a75d <+77>: mov    %rax,-0x38(%rbp)
   0x7fc63563a761 <+81>: xor    %eax,%eax
   0x7fc63563a763 <+83>: callq  0x7fc63527d1e0 <__tsan_func_entry(void*)>

[Bug sanitizer/77538] segmentation fault: thread sanitizer shadow stack overflow

2016-11-05 Thread coollpe at hotmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77538

--- Comment #13 from peien luo  ---
(In reply to Dmitry Vyukov from comment #12)
> The crash in gdb looks like stack overflow (unsurprising if there are 1MB
> frames). Does increasing thread stack size or reducing frame size (there
> must be something very big on the stack) help?

The stack size limit on my box is 8M. I have also checked /proc/limits.

With -fstack-protector-all enabled, it may still segfault here (at function
entry) in gdb.

[Bug sanitizer/77538] segmentation fault: thread sanitizer shadow stack overflow

2016-10-30 Thread dvyukov at google dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77538

--- Comment #12 from Dmitry Vyukov  ---
The crash in gdb looks like stack overflow (unsurprising if there are 1MB
frames). Does increasing thread stack size or reducing frame size (there must
be something very big on the stack) help?

[Bug sanitizer/77538] segmentation fault: thread sanitizer shadow stack overflow

2016-10-29 Thread coollpe at hotmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77538

--- Comment #11 from peien luo  ---
Sorry for the previous comment about running in gdb. The result seems to be
random:

Sometimes it runs fine.
Sometimes it gets a SEGFAULT when calling a function; gdb says:
   0x7ff0fa19b466 <+22>: lea    -0x100060(%rbp),%rbx
   0x7ff0fa19b46d <+29>: sub    $0x1000d8,%rsp
=> 0x7ff0fa19b474 <+36>: mov    %rdi,-0x1000d8(%rbp)
   0x7ff0fa19b47b <+43>: mov    0x8(%rbp),%rdi
   0x7ff0fa19b47f <+47>: mov    %rdx,-0x1000e0(%rbp)
   0x7ff0fa19b486 <+54>: mov    %rcx,-0x1000f0(%rbp)
   0x7ff0fa19b48d <+61>: mov    %r8,-0x1000e8(%rbp)
   0x7ff0fa19b494 <+68>: callq  0x7ff0f9e12e00 <__tsan_func_entry(void*)>
   0x7ff0fa19b499 <+73>: mov    %r15,%rax
   0x7ff0fa19b49c <+76>: add    $0x30,%rax
   0x7ff0fa19b4a0 <+80>: mov    %rax,%rdi
   0x7ff0fa19b4a3 <+83>: mov    %rax,-0x1000c8(%rbp)
   0x7ff0fa19b4aa <+90>: callq  0x7ff0f9e1c7e0 <__interceptor_pthread_mutex_lock(void*)>
   0x7ff0fa19b4af <+95>: mov    %rbx,%rdi
   0x7ff0fa19b4b2 <+98>: callq  0x7ff0f9e12020 <__tsan_vptr_update(void**, void*)>

[Bug sanitizer/77538] segmentation fault: thread sanitizer shadow stack overflow

2016-10-29 Thread coollpe at hotmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77538

--- Comment #10 from peien luo  ---
It's CentOS 7. The kernel has been updated to 3.10.0-327.36.3.el7.x86_64, and
the problem still occurs. Some new findings:

1. With gcc 4.8.5, it runs fine for this specific case.
2. With gcc 4.9.4, it gets stuck at some point; ps says:
$ ps -flp 13600
F S UID        PID  PPID  C PRI  NI ADDR          SZ WCHAN  STIME TTY          TIME CMD
0 D god      13600 13597 89  80   0 -    26038299885 exit   23:15 pts/9    00:15:28 ./test_metaserver
3. Under gdb it runs OK; with 'set disable-randomization off' it runs OK as well.
(I need to check it again)

[Bug sanitizer/77538] segmentation fault: thread sanitizer shadow stack overflow

2016-10-11 Thread dvyukov at google dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77538

--- Comment #9 from Dmitry Vyukov  ---
Hmm... what are they waiting for? Is it also a core dump? The stack for the
sleeping task is missing for some reason.
What kernel version do you use? Maybe the problem is with the kernel? Isn't it
too old?

[Bug sanitizer/77538] segmentation fault: thread sanitizer shadow stack overflow

2016-10-11 Thread coollpe at hotmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77538

--- Comment #8 from peien luo  ---
In another case, the process, compiled with gcc 4.9.4, got stuck. I will try a
different version of gcc. The proc stack info is:

[god@localhost 5019]$ cat task/*/status | grep State
State:  D (disk sleep)
State:  D (disk sleep)
State:  D (disk sleep)
State:  D (disk sleep)
State:  D (disk sleep)
State:  D (disk sleep)
State:  D (disk sleep)
State:  S (sleeping)
[god@localhost 5019]$ cat task/*/stack
[] do_exit+0x1e4/0xa60
[] do_group_exit+0x3f/0xa0
[] get_signal_to_deliver+0x1d0/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] int_signal+0x12/0x17
[] 0x
[] do_exit+0x1e4/0xa60
[] do_group_exit+0x3f/0xa0
[] get_signal_to_deliver+0x1d0/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] int_signal+0x12/0x17
[] 0x
[] do_exit+0x1e4/0xa60
[] do_group_exit+0x3f/0xa0
[] get_signal_to_deliver+0x1d0/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] int_signal+0x12/0x17
[] 0x
[] do_exit+0x1e4/0xa60
[] do_group_exit+0x3f/0xa0
[] get_signal_to_deliver+0x1d0/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] int_signal+0x12/0x17
[] 0x
[] do_exit+0x1e4/0xa60
[] do_group_exit+0x3f/0xa0
[] get_signal_to_deliver+0x1d0/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] int_signal+0x12/0x17
[] 0x
[] do_exit+0x1e4/0xa60
[] do_group_exit+0x3f/0xa0
[] get_signal_to_deliver+0x1d0/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] int_signal+0x12/0x17
[] 0x
[] do_exit+0x1e4/0xa60
[] do_group_exit+0x3f/0xa0
[] get_signal_to_deliver+0x1d0/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] int_signal+0x12/0x17
[] 0x
[] 0x
[god@localhost 5019]$ cat stack
[] do_exit+0x1e4/0xa60
[] do_group_exit+0x3f/0xa0
[] get_signal_to_deliver+0x1d0/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] int_signal+0x12/0x17
[] 0x

[Bug sanitizer/77538] segmentation fault: thread sanitizer shadow stack overflow

2016-09-29 Thread coollpe at hotmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77538

--- Comment #7 from peien luo  ---
I tried that and still got D state. This was built with gcc 4.9.4.

[god@localhost 21586]$ cat stack
[] do_exit+0x1e4/0xa60
[] do_group_exit+0x3f/0xa0
[] get_signal_to_deliver+0x1d0/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] retint_signal+0x48/0x8c
[] 0x
[god@localhost 21586]$ cat status
Name:   test_metaserver
State:  D (disk sleep)
Tgid:   21586
Ngid:   0
Pid:    21586
PPid:   12499
TracerPid:      0
Uid:    1000    1000    1000    1000
Gid:    1000    1000    1000    1000
FDSize: 256
Groups: 1000
VmPeak: 104153806860 kB
VmSize: 104153793252 kB
VmLck:         0 kB
VmPin:         0 kB
VmHWM:    342544 kB
VmRSS:    342544 kB
VmData: 104153254936 kB
VmStk:      1048 kB
VmExe:     18392 kB
VmLib:      5992 kB
VmPTE:      1904 kB
VmSwap:        0 kB
Threads:        8
SigQ:   0/63365
SigPnd: 
ShdPnd: 
SigBlk: 
SigIgn: 1000
SigCgt: 00018000
CapInh: 
CapPrm: 
CapEff: 
CapBnd: 001f
Seccomp:        0
Cpus_allowed:   ,,,
Cpus_allowed_list:      0-127
Mems_allowed:   ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0001
Mems_allowed_list:      0
voluntary_ctxt_switches:        442
nonvoluntary_ctxt_switches:     9
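
As a side calculation, the VmPeak figure in the status output above can be converted to terabytes. The sketch below relies on the "kB" in /proc status files meaning units of 1024 bytes; the function names are made up for this example.

```python
def status_kb_to_tib(kb):
    """Convert a /proc status 'kB' value (units of 1024 bytes) to TiB (2**40 bytes)."""
    return kb / 1024**3


def status_kb_to_tb(kb):
    """Convert a /proc status 'kB' value (units of 1024 bytes) to TB (10**12 bytes)."""
    return kb * 1024 / 1e12


VM_PEAK_KB = 104153806860  # VmPeak from the status output above
print(round(status_kb_to_tib(VM_PEAK_KB), 2), round(status_kb_to_tb(VM_PEAK_KB), 2))
```

That is roughly 97 TiB, or a little under 107 TB, of reserved virtual address space, against a resident set (VmRSS) of only about 335 MiB.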

[god@localhost 21586]$ cat task/*/stack
[] do_exit+0x1e4/0xa60
[] do_group_exit+0x3f/0xa0
[] get_signal_to_deliver+0x1d0/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] retint_signal+0x48/0x8c
[] 0x
[] do_exit+0x1e4/0xa60
[] do_group_exit+0x3f/0xa0
[] get_signal_to_deliver+0x1d0/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] int_signal+0x12/0x17
[] 0x
[] do_exit+0x1e4/0xa60
[] do_group_exit+0x3f/0xa0
[] get_signal_to_deliver+0x1d0/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] int_signal+0x12/0x17
[] 0x
[] do_exit+0x1e4/0xa60
[] do_group_exit+0x3f/0xa0
[] get_signal_to_deliver+0x1d0/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] int_signal+0x12/0x17
[] 0x
[] do_exit+0x1e4/0xa60
[] do_group_exit+0x3f/0xa0
[] get_signal_to_deliver+0x1d0/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] int_signal+0x12/0x17
[] 0x
[] do_exit+0x1e4/0xa60
[] do_group_exit+0x3f/0xa0
[] get_signal_to_deliver+0x1d0/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] int_signal+0x12/0x17
[] 0x
[] do_exit+0x1e4/0xa60
[] do_group_exit+0x3f/0xa0
[] get_signal_to_deliver+0x1d0/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] int_signal+0x12/0x17
[] 0x
[] pipe_wait+0x70/0xc0
[] pipe_write+0x236/0x5b0
[] do_sync_write+0x8d/0xd0
[] dump_write+0x52/0x70
[] dump_seek+0xa4/0xe0
[] elf_core_dump+0x896/0x950
[] do_coredump+0x882/0xb10
[] get_signal_to_deliver+0x1c7/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] int_signal+0x12/0x17
[] 0x

[god@localhost 21586]$ cat task/*/stack
[] do_exit+0x1e4/0xa60
[] do_group_exit+0x3f/0xa0
[] get_signal_to_deliver+0x1d0/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] retint_signal+0x48/0x8c
[] 0x
[] do_exit+0x1e4/0xa60
[] do_group_exit+0x3f/0xa0
[] get_signal_to_deliver+0x1d0/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] int_signal+0x12/0x17
[] 0x
[] do_exit+0x1e4/0xa60
[] do_group_exit+0x3f/0xa0
[] get_signal_to_deliver+0x1d0/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] int_signal+0x12/0x17
[] 0x
[] do_exit+0x1e4/0xa60
[] do_group_exit+0x3f/0xa0
[] get_signal_to_deliver+0x1d0/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] int_signal+0x12/0x17
[] 0x
[] do_exit+0x1e4/0xa60
[] do_group_exit+0x3f/0xa0
[] get_signal_to_deliver+0x1d0/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] int_signal+0x12/0x17
[] 0x
[] do_exit+0x1e4/0xa60
[] do_group_exit+0x3f/0xa0
[] get_signal_to_deliver+0x1d0/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] int_signal+0x12/0x17
[] 0x
[] do_exit+0x1e4/0xa60
[] do_group_exit+0x3f/0xa0
[] get_signal_to_deliver+0x1d0/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] int_signal+0x12/0x17
[] 0x
[] pipe_wait+0x70/0xc0
[] pipe_write+0x236/0x5b0
[] do_sync_write+0x8d/0xd0
[] dump_write+0x52/0x70
[] dump_seek+0xa4/0xe0
[] elf_core_dump+0x896/0x950
[] do_coredump+0x882/0xb10
[] get_signal_to_deliver+0x1c7/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] int_signal+0x12/0x17
[] 0x

[god@localhost ~]$ g++ -v
Using 

[Bug sanitizer/77538] segmentation fault: thread sanitizer shadow stack overflow

2016-09-19 Thread dvyukov at google dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77538

--- Comment #6 from Dmitry Vyukov  ---
It hangs trying to dump core file to some pipe:

[] pipe_wait+0x70/0xc0
[] pipe_write+0x236/0x5b0
[] do_sync_write+0x8d/0xd0
[] dump_write+0x52/0x70
[] dump_seek+0xa4/0xe0
[] elf_core_dump+0x896/0x950
[] do_coredump+0x882/0xb10
[] get_signal_to_deliver+0x1c7/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] retint_signal+0x48/0x8c

Tsan allocates 100+TB of virtual memory, so it is unsurprising that it hangs.
Tsan should disable core dumps by default (done by the
DisableCoreDumperIfNecessary function). I am not sure why core dumps are not
disabled in your case. Try running with the TSAN_OPTIONS=disable_coredump=1
environment variable.
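
For comparison, a process can also lower its own core-file size limit before starting the instrumented binary. The following is a minimal sketch using the standard `resource` module on a Unix system. It is an assumption, not something confirmed in this thread, that tsan's own mechanism amounts to a similar RLIMIT_CORE change; the function name is made up for this example.

```python
import resource


def disable_core_dumps():
    """Set the soft RLIMIT_CORE to 0 for this process (inherited by children).

    The hard limit is left unchanged, so the call needs no extra privileges.
    Returns the (soft, hard) pair now in effect.
    """
    _soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    resource.setrlimit(resource.RLIMIT_CORE, (0, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)
```

A wrapper could call this and then exec the test binary, which is similar in effect to running `ulimit -c 0` in the shell first. Whether the kernel honours this limit when core dumps are piped to a helper program, as the pipe_write frames in the stack above suggest is the case here, is worth checking in core(5).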

[Bug sanitizer/77538] segmentation fault: thread sanitizer shadow stack overflow

2016-09-16 Thread coollpe at hotmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77538

--- Comment #5 from peien luo  ---
(In reply to Dmitry Vyukov from comment #4)
> Unkillable processes in D state usually mean kernel bugs (and there are lots
> of them: https://github.com/google/syzkaller/wiki/Found-Bugs).
> 
> Please post the results of `cat /proc/PID/task/*/stack` and `cat
> /proc/PID/task/*/status`. Sometimes hangs happen due to secondary threads in
> the process. Maybe we will be able to figure out something from that info.
> However, I am not sure what we can do about a process hung in D state;
> generally, users must not be able to create them.

It can be killed by kill -9.

I have updated my CentOS 7 to the latest. The problem still occurs. The kernel
stack output is:

$ cat task/*/stack
[] do_exit+0x1e4/0xa60
[] do_group_exit+0x3f/0xa0
[] get_signal_to_deliver+0x1d0/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] int_signal+0x12/0x17
[] 0x
[] do_exit+0x1e4/0xa60
[] do_group_exit+0x3f/0xa0
[] get_signal_to_deliver+0x1d0/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] int_signal+0x12/0x17
[] 0x
[] do_exit+0x1e4/0xa60
[] do_group_exit+0x3f/0xa0
[] get_signal_to_deliver+0x1d0/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] int_signal+0x12/0x17
[] 0x
[] do_exit+0x1e4/0xa60
[] do_group_exit+0x3f/0xa0
[] get_signal_to_deliver+0x1d0/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] int_signal+0x12/0x17
[] 0x
[] do_exit+0x1e4/0xa60
[] do_group_exit+0x3f/0xa0
[] get_signal_to_deliver+0x1d0/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] int_signal+0x12/0x17
[] 0x
[] do_exit+0x1e4/0xa60
[] do_group_exit+0x3f/0xa0
[] get_signal_to_deliver+0x1d0/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] int_signal+0x12/0x17
[] 0x
[] do_exit+0x1e4/0xa60
[] do_group_exit+0x3f/0xa0
[] get_signal_to_deliver+0x1d0/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] int_signal+0x12/0x17
[] 0x
[] pipe_wait+0x70/0xc0
[] pipe_write+0x236/0x5b0
[] do_sync_write+0x8d/0xd0
[] dump_write+0x52/0x70
[] dump_seek+0xa4/0xe0
[] elf_core_dump+0x896/0x950
[] do_coredump+0x882/0xb10
[] get_signal_to_deliver+0x1c7/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] retint_signal+0x48/0x8c
[] 0x
[god@localhost 12987]$ ls task
12987  12988  12989  12990  12991  12992  12993  12994


and status:
$ cat task/*/status
Name:   test_metaserver
State:  D (disk sleep)
Tgid:   12987
Ngid:   0
Pid:    12987
PPid:   11646
TracerPid:      0
Uid:    1000    1000    1000    1000
Gid:    1000    1000    1000    1000
FDSize: 256
Groups: 1000
VmPeak: 104153197488 kB
VmSize: 104153197488 kB
VmLck:         0 kB
VmPin:         0 kB
VmHWM:    265724 kB
VmRSS:    265724 kB
VmData: 104153051792 kB
VmStk:       136 kB
VmExe:     18492 kB
VmLib:      5992 kB
VmPTE:      1288 kB
VmSwap:        0 kB
Threads:        8
SigQ:   0/63365
SigPnd: 
ShdPnd: 
SigBlk: 
SigIgn: 1000
SigCgt: 00018000
CapInh: 
CapPrm: 
CapEff: 
CapBnd: 001f
Seccomp:        0
Cpus_allowed:   ,,,
Cpus_allowed_list:      0-127
Mems_allowed:   ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0001
Mems_allowed_list:      0
voluntary_ctxt_switches:        96
nonvoluntary_ctxt_switches:     8
Name:   test_metaserver
State:  D (disk sleep)
Tgid:   12987
Ngid:   0
Pid:    12988
PPid:   11646
TracerPid:      0
Uid:    1000    1000    1000    1000
Gid:    1000    1000    1000    1000
FDSize: 256
Groups: 1000
VmPeak: 104153197488 kB
VmSize: 104153197488 kB
VmLck:         0 kB
VmPin:         0 kB
VmHWM:    265724 kB
VmRSS:    265724 kB
VmData: 104153051792 kB
VmStk:       136 kB
VmExe:     18492 kB
VmLib:      5992 kB
VmPTE:      1288 kB
VmSwap:        0 kB
Threads:        8
SigQ:   0/63365
SigPnd: 
ShdPnd: 
SigBlk: fffbfeff
SigIgn: 1000
SigCgt: 00018000
CapInh: 
CapPrm: 
CapEff: 
CapBnd: 001f
Seccomp:        0
Cpus_allowed:   ,,,
Cpus_allowed_list:      0-127
Mems_allowed:   ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0001
Mems_allowed_list:      0
voluntary_ctxt_switches:        6
nonvoluntary_ctxt_switches:     0
Name:   test_metaserver
State:  

[Bug sanitizer/77538] segmentation fault: thread sanitizer shadow stack overflow

2016-09-16 Thread dvyukov at google dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77538

--- Comment #4 from Dmitry Vyukov  ---
Unkillable processes in D state usually mean kernel bugs (and there are lots of
them: https://github.com/google/syzkaller/wiki/Found-Bugs).

Please post the results of `cat /proc/PID/task/*/stack` and `cat
/proc/PID/task/*/status`. Sometimes hangs happen due to secondary threads in
the process. Maybe we will be able to figure out something from that info.
However, I am not sure what we can do about a process hung in D state;
generally, users must not be able to create them.
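
The per-thread state can also be collected with a short script instead of reading the raw `cat` output. This is a sketch for Linux only; the function names are made up for this example, and it reads the same /proc/PID/task/*/status files mentioned above.

```python
import glob


def parse_state(status_text):
    """Return the value of the 'State:' line from a /proc status file, or None."""
    for line in status_text.splitlines():
        if line.startswith("State:"):
            return line.split(":", 1)[1].strip()
    return None


def thread_states(pid):
    """Map each /proc/<pid>/task/<tid>/status path to that thread's state string."""
    states = {}
    for path in sorted(glob.glob(f"/proc/{pid}/task/*/status")):
        with open(path) as fh:
            states[path] = parse_state(fh.read())
    return states
```

Calling `thread_states(12987)`, for instance, would list one entry per thread, which makes it easy to spot a thread whose state differs from the rest.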

[Bug sanitizer/77538] segmentation fault: thread sanitizer shadow stack overflow

2016-09-16 Thread coollpe at hotmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77538

--- Comment #3 from peien luo  ---
The hang can be reproduced. The kernel call trace looks like this:

Sep 16 09:38:37 localhost kernel: test_metaserver D 8803f9307300 0 4250   3896 0x0080
Sep 16 09:38:37 localhost kernel: 880424b4bcd0 0082 8803f9307300 880424b4bfd8
Sep 16 09:38:37 localhost kernel: 880424b4bfd8 880424b4bfd8 8803f9307300 8803f9307300
Sep 16 09:38:37 localhost kernel: 880421c29f40 880421c29fb8 8803faf06c40 8803f9307300
Sep 16 09:38:37 localhost kernel: Call Trace:
Sep 16 09:38:37 localhost kernel: [] schedule+0x29/0x70
Sep 16 09:38:37 localhost kernel: [] do_exit+0x1e4/0xa60
Sep 16 09:38:37 localhost kernel: [] ? update_curr+0xcc/0x150
Sep 16 09:38:37 localhost kernel: [] ? account_entity_dequeue+0xae/0xd0
Sep 16 09:38:37 localhost kernel: [] do_group_exit+0x3f/0xa0
Sep 16 09:38:37 localhost kernel: [] get_signal_to_deliver+0x1d0/0x6d0
Sep 16 09:38:37 localhost kernel: [] do_signal+0x57/0x6c0
Sep 16 09:38:37 localhost kernel: [] ? ktime_get+0x4c/0xd0
Sep 16 09:38:37 localhost kernel: [] ? hrtimer_nanosleep+0xd3/0x170
Sep 16 09:38:37 localhost kernel: [] ? hrtimer_get_res+0x50/0x50
Sep 16 09:38:37 localhost kernel: [] do_notify_resume+0x5f/0xb0
Sep 16 09:38:37 localhost kernel: [] int_signal+0x12/0x17

[Bug sanitizer/77538] segmentation fault: thread sanitizer shadow stack overflow

2016-09-16 Thread coollpe at hotmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77538

--- Comment #2 from peien luo  ---
(In reply to Dmitry Vyukov from comment #1)
> Hello,
> 
> Shadow stack size was increased several times, and as far as I remember we
> now have a guard page at the end. Please retest with latest gcc/clang, or
> provide a reproducer.

I moved to another box (a virtual machine) to test the new gcc 4.9.4, because
the other environment is a shared server on which I can't make many changes.

What I observed is: without tsan, the process runs fine. With tsan turned on,
it gets fully stuck at some point (D state, cannot attach or trace). I
haven't yet figured out what causes that. Here is the /proc stack from when it
got stuck:

$ cat syscall 
35 0x7ffca05f77e0 0x7ffca05f77e0 0x0 0x8 0x7ffca05f78f0 0x7ffca05f7730 0x7ffca05f77d0 0x7f24ff6f349d

$ cat stack 
[] do_exit+0x1e4/0xa60
[] do_group_exit+0x3f/0xa0
[] get_signal_to_deliver+0x1d0/0x6d0
[] do_signal+0x57/0x6c0
[] do_notify_resume+0x5f/0xb0
[] int_signal+0x12/0x17
[] 0x

[Bug sanitizer/77538] segmentation fault: thread sanitizer shadow stack overflow

2016-09-13 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77538

Richard Biener  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |WAITING
   Last reconfirmed|                            |2016-09-13
     Ever confirmed|0                           |1

[Bug sanitizer/77538] segmentation fault: thread sanitizer shadow stack overflow

2016-09-09 Thread dvyukov at google dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77538

Dmitry Vyukov  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dvyukov at google dot com

--- Comment #1 from Dmitry Vyukov  ---
Hello,

Shadow stack size was increased several times, and as far as I remember we now
have a guard page at the end. Please retest with latest gcc/clang, or provide a
reproducer.