https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71388
Bug ID: 71388
Summary: [6/7 regression] wrong code, DSE removes memset in TBB allocate_scheduler (causes run-time crashes)
Product: gcc
Version: 6.1.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: david.abdurachmanov at gmail dot com
Target Milestone: ---

Created attachment 38626
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38626&action=edit
pre-processed file

We found this because it crashes one of our test cases in the GCC 6.1.1 port of our software. It looks like wrong code generated for Intel TBB.

TBB version: tbb44_20151115oss (also affects newer versions)

First bad commit: 8a36d0ec201ef1511b372523f72a763b836107b0 or r222135
Last good commit: 8b2942f7c961ee83bb0ff605129165ecdf6ac8f6 or r222134

The potentially miscompiled code is from src/tbb/custom_scheduler.h:

public:
    static generic_scheduler* allocate_scheduler( market& m ) {
        void* p = NFS_Allocate(1, sizeof(scheduler_type), NULL);
        std::memset(p, 0, sizeof(scheduler_type));
        scheduler_type* s = new( p ) scheduler_type( m );
        s->assert_task_pool_valid();
        ITT_SYNC_CREATE(s, SyncType_Scheduler, SyncObj_TaskPoolSpinning);
        return s;
    }

From our developer:

What is happening is that a class instance is being created with placement new at an address reused from a thread which has gone away. However, at least one data member of the newly created object is non-zero, and it is that non-zero value which ultimately leads to the crash.

The object size here is 408 bytes (0x198). No call to memset is generated, and the memset does not appear to be inlined either.

The problem goes away if TBB is compiled with CXXFLAGS='-fno-builtin-memset', which forces GCC to generate a call to memset.

Below is the disassembly of the tbb::internal::custom_scheduler<tbb::internal::IntelSchedulerTraits>::allocate_scheduler(tbb::internal::market&) symbol as built by the good and the bad GCC revision.
Note that in the bad revision we lost:

    rep stos %rax,%es:(%rdi)

Attaching pre-processed scheduler.ii (not a minimal test case), compiled as:

    g++ -o scheduler.o -c -g -O2 -m64 -mrtm -fPIC scheduler.ii

### GOOD ###

0000000000023df0 <tbb::internal::custom_scheduler<tbb::internal::IntelSchedulerTraits>::allocate_scheduler(tbb::internal::market&)>:
   23df0:  55                     push   %rbp
   23df1:  53                     push   %rbx
   23df2:  31 d2                  xor    %edx,%edx
   23df4:  48 89 fd               mov    %rdi,%rbp
   23df7:  be 98 01 00 00         mov    $0x198,%esi
   23dfc:  bf 01 00 00 00         mov    $0x1,%edi
   23e01:  48 83 ec 08            sub    $0x8,%rsp
   23e05:  e8 96 9a fe ff         callq  d8a0 <tbb::internal::NFS_Allocate(unsigned long, unsigned long, void*)@plt>
   23e0a:  48 8d 78 08            lea    0x8(%rax),%rdi
   23e0e:  48 89 c1               mov    %rax,%rcx
   23e11:  48 89 c3               mov    %rax,%rbx
   23e14:  48 c7 00 00 00 00 00   movq   $0x0,(%rax)
   23e1b:  48 c7 80 90 01 00 00   movq   $0x0,0x190(%rax)
   23e22:  00 00 00 00
   23e26:  31 c0                  xor    %eax,%eax
   23e28:  48 83 e7 f8            and    $0xfffffffffffffff8,%rdi
   23e2c:  48 89 ee               mov    %rbp,%rsi
   23e2f:  48 29 f9               sub    %rdi,%rcx
   23e32:  81 c1 98 01 00 00      add    $0x198,%ecx
   23e38:  c1 e9 03               shr    $0x3,%ecx
   23e3b:  f3 48 ab               rep stos %rax,%es:(%rdi)
   23e3e:  48 89 df               mov    %rbx,%rdi
   23e41:  e8 5a e4 ff ff         callq  222a0 <tbb::internal::generic_scheduler::generic_scheduler(tbb::internal::market&)>
   23e46:  48 8d 05 13 1c 21 00   lea    0x211c13(%rip),%rax        # 235a60 <vtable for tbb::internal::custom_scheduler<tbb::internal::IntelSchedulerTraits>>
   23e4d:  48 83 c0 10            add    $0x10,%rax
   23e51:  48 89 03               mov    %rax,(%rbx)
   23e54:  48 8d 05 35 2b 21 00   lea    0x212b35(%rip),%rax        # 236990 <__itt_sync_create_ptr__3_0>
   23e5b:  48 8b 00               mov    (%rax),%rax
   23e5e:  48 85 c0               test   %rax,%rax
   23e61:  74 1e                  je     23e81 <tbb::internal::custom_scheduler<tbb::internal::IntelSchedulerTraits>::allocate_scheduler(tbb::internal::market&)+0x91>
   23e63:  48 8d 15 fe 25 21 00   lea    0x2125fe(%rip),%rdx        # 236468 <tbb::SyncObj_TaskPoolSpinning>
   23e6a:  48 8d 35 2f 26 21 00   lea    0x21262f(%rip),%rsi        # 2364a0 <tbb::SyncType_Scheduler>
   23e71:  b9 02 00 00 00         mov    $0x2,%ecx
   23e76:  48 89 df               mov    %rbx,%rdi
   23e79:  48 8b 12               mov    (%rdx),%rdx
   23e7c:  48 8b 36               mov    (%rsi),%rsi
   23e7f:  ff d0                  callq  *%rax
   23e81:  48 83 c4 08            add    $0x8,%rsp
   23e85:  48 89 d8               mov    %rbx,%rax
   23e88:  5b                     pop    %rbx
   23e89:  5d                     pop    %rbp
   23e8a:  c3                     retq
   23e8b:  0f 1f 44 00 00         nopl   0x0(%rax,%rax,1)

### BAD ###

0000000000023dd0 <tbb::internal::custom_scheduler<tbb::internal::IntelSchedulerTraits>::allocate_scheduler(tbb::internal::market&)>:
   23dd0:  55                     push   %rbp
   23dd1:  53                     push   %rbx
   23dd2:  31 d2                  xor    %edx,%edx
   23dd4:  48 89 fd               mov    %rdi,%rbp
   23dd7:  be 98 01 00 00         mov    $0x198,%esi
   23ddc:  bf 01 00 00 00         mov    $0x1,%edi
   23de1:  48 83 ec 08            sub    $0x8,%rsp
   23de5:  e8 b6 9a fe ff         callq  d8a0 <tbb::internal::NFS_Allocate(unsigned long, unsigned long, void*)@plt>
   23dea:  48 89 ee               mov    %rbp,%rsi
   23ded:  48 89 c7               mov    %rax,%rdi
   23df0:  48 89 c3               mov    %rax,%rbx
   23df3:  e8 a8 e4 ff ff         callq  222a0 <tbb::internal::generic_scheduler::generic_scheduler(tbb::internal::market&)>
   23df8:  48 8d 05 61 1c 21 00   lea    0x211c61(%rip),%rax        # 235a60 <vtable for tbb::internal::custom_scheduler<tbb::internal::IntelSchedulerTraits>>
   23dff:  48 83 c0 10            add    $0x10,%rax
   23e03:  48 89 03               mov    %rax,(%rbx)
   23e06:  48 8d 05 83 2b 21 00   lea    0x212b83(%rip),%rax        # 236990 <__itt_sync_create_ptr__3_0>
   23e0d:  48 8b 00               mov    (%rax),%rax
   23e10:  48 85 c0               test   %rax,%rax
   23e13:  74 1e                  je     23e33 <tbb::internal::custom_scheduler<tbb::internal::IntelSchedulerTraits>::allocate_scheduler(tbb::internal::market&)+0x63>
   23e15:  48 8d 15 4c 26 21 00   lea    0x21264c(%rip),%rdx        # 236468 <tbb::SyncObj_TaskPoolSpinning>
   23e1c:  48 8d 35 7d 26 21 00   lea    0x21267d(%rip),%rsi        # 2364a0 <tbb::SyncType_Scheduler>
   23e23:  b9 02 00 00 00         mov    $0x2,%ecx
   23e28:  48 89 df               mov    %rbx,%rdi
   23e2b:  48 8b 12               mov    (%rdx),%rdx
   23e2e:  48 8b 36               mov    (%rsi),%rsi
   23e31:  ff d0                  callq  *%rax
   23e33:  48 83 c4 08            add    $0x8,%rsp
   23e37:  48 89 d8               mov    %rbx,%rax
   23e3a:  5b                     pop    %rbx
   23e3b:  5d                     pop    %rbp
   23e3c:  c3                     retq
   23e3d:  0f 1f 00               nopl   (%rax)

### DIFF ###

--- 20150415.s 2016-06-02 16:01:06.000000000 +0200
+++ 20150415.bad.s 2016-06-02 16:01:54.000000000 +0200
@@ -6,31 +6,20 @@
  be 98 01 00 00         mov    $0x198,%esi
  bf 01 00 00 00         mov    $0x1,%edi
  48 83 ec 08            sub    $0x8,%rsp
- e8 96 9a fe ff         callq  d8a0 <tbb::internal::NFS_Allocate(unsigned long, unsigned long, void*)@plt>
- 48 8d 78 08            lea    0x8(%rax),%rdi
- 48 89 c1               mov    %rax,%rcx
- 48 89 c3               mov    %rax,%rbx
- 48 c7 00 00 00 00 00   movq   $0x0,(%rax)
- 48 c7 80 90 01 00 00   movq   $0x0,0x190(%rax)
- 00 00 00 00
- 31 c0                  xor    %eax,%eax
- 48 83 e7 f8            and    $0xfffffffffffffff8,%rdi
+ e8 b6 9a fe ff         callq  d8a0 <tbb::internal::NFS_Allocate(unsigned long, unsigned long, void*)@plt>
  48 89 ee               mov    %rbp,%rsi
- 48 29 f9               sub    %rdi,%rcx
- 81 c1 98 01 00 00      add    $0x198,%ecx
- c1 e9 03               shr    $0x3,%ecx
- f3 48 ab               rep stos %rax,%es:(%rdi)
- 48 89 df               mov    %rbx,%rdi
- e8 5a e4 ff ff         callq  222a0 <tbb::internal::generic_scheduler::generic_scheduler(tbb::internal::market&)>
- 48 8d 05 13 1c 21 00   lea    0x211c13(%rip),%rax        # 235a60 <vtable for tbb::internal::custom_scheduler<tbb::internal::IntelSchedulerTraits>>
+ 48 89 c7               mov    %rax,%rdi
+ 48 89 c3               mov    %rax,%rbx
+ e8 a8 e4 ff ff         callq  222a0 <tbb::internal::generic_scheduler::generic_scheduler(tbb::internal::market&)>
+ 48 8d 05 61 1c 21 00   lea    0x211c61(%rip),%rax        # 235a60 <vtable for tbb::internal::custom_scheduler<tbb::internal::IntelSchedulerTraits>>
  48 83 c0 10            add    $0x10,%rax
  48 89 03               mov    %rax,(%rbx)
- 48 8d 05 35 2b 21 00   lea    0x212b35(%rip),%rax        # 236990 <__itt_sync_create_ptr__3_0>
+ 48 8d 05 83 2b 21 00   lea    0x212b83(%rip),%rax        # 236990 <__itt_sync_create_ptr__3_0>
  48 8b 00               mov    (%rax),%rax
  48 85 c0               test   %rax,%rax
- 74 1e                  je     23e81 <tbb::internal::custom_scheduler<tbb::internal::IntelSchedulerTraits>::allocate_scheduler(tbb::internal::market&)+0x91>
- 48 8d 15 fe 25 21 00   lea    0x2125fe(%rip),%rdx        # 236468 <tbb::SyncObj_TaskPoolSpinning>
- 48 8d 35 2f 26 21 00   lea    0x21262f(%rip),%rsi        # 2364a0 <tbb::SyncType_Scheduler>
+ 74 1e                  je     23e33 <tbb::internal::custom_scheduler<tbb::internal::IntelSchedulerTraits>::allocate_scheduler(tbb::internal::market&)+0x63>
+ 48 8d 15 4c 26 21 00   lea    0x21264c(%rip),%rdx        # 236468 <tbb::SyncObj_TaskPoolSpinning>
+ 48 8d 35 7d 26 21 00   lea    0x21267d(%rip),%rsi        # 2364a0 <tbb::SyncType_Scheduler>
  b9 02 00 00 00         mov    $0x2,%ecx
  48 89 df               mov    %rbx,%rdi
  48 8b 12               mov    (%rdx),%rdx
@@ -41,4 +30,4 @@
  5b                     pop    %rbx
  5d                     pop    %rbp
  c3                     retq
- 0f 1f 44 00 00         nopl   0x0(%rax,%rax,1)
+ 0f 1f 00               nopl   (%rax)
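
Since the attached scheduler.ii is not a minimal test case, here is a reduced sketch of the same allocate / memset / placement-new pattern. It is not code from TBB: the names sched_like, sched_zeroed, make_like_tbb, make_zeroed, destroy and self_check are hypothetical, and std::malloc is only an assumed stand-in for NFS_Allocate. The first type leaves a member untouched in its constructor, so that member is zero only if the earlier memset survives. The second zeroes the member in the constructor and so does not depend on that store being kept. This sketch has not been checked against r222135, so it may or may not reproduce the lost memset.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <new>

// Hypothetical stand-in for scheduler_type: the constructor sets `id`
// but leaves `pool` untouched, so `pool` is zero only if the memset
// issued before placement new is still present in the generated code.
struct sched_like {
    int   id;
    void* pool;
    explicit sched_like(int i) : id(i) {}
};

// Same layout, but the constructor zeroes `pool` itself.
struct sched_zeroed {
    int   id;
    void* pool;
    explicit sched_zeroed(int i) : id(i), pool(nullptr) {}
};

// Mirrors the shape of allocate_scheduler: allocate, memset, placement new.
// std::malloc is an assumption standing in for NFS_Allocate.
sched_like* make_like_tbb(int i) {
    void* p = std::malloc(sizeof(sched_like));
    if (!p) return nullptr;
    std::memset(p, 0, sizeof(sched_like));
    return new (p) sched_like(i);  // reading ->pool afterwards relies on the memset
}

// Variant that does not rely on any store made before construction.
sched_zeroed* make_zeroed(int i) {
    void* p = std::malloc(sizeof(sched_zeroed));
    if (!p) return nullptr;
    return new (p) sched_zeroed(i);
}

// Destroys an object created by either factory and releases its storage.
template <class T>
void destroy(T* s) {
    if (s) {
        s->~T();
        std::free(s);
    }
}

// Checks only members that the constructors explicitly initialize.
void self_check() {
    sched_zeroed* z = make_zeroed(7);
    assert(z && z->id == 7 && z->pool == nullptr);
    destroy(z);
}
```

Comparing the -O2 output for make_like_tbb between the good and bad revisions should show whether the zeroing stores are dropped in this reduced form as well.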