https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89538
Bug ID: 89538 Summary: [7.3.0] GCC miscompiling LLVM because of wrong vectorization Product: gcc Version: 7.3.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: twoh at fb dot com Target Milestone: --- I compiled pretty recent version of LLVM (https://reviews.llvm.org/rL354365) with GCC 7.3.0, and experiencing segfault from LLVM when I compile. I did some investigation and seems that GCC generates wrong binary because of the issue in vectorization. This (https://github.com/llvm-mirror/llvm/blob/master/lib/CodeGen/AsmPrinter/DwarfDebug.cpp#L2650) is the line causes the problem. The data structure (SmallVector) holds a pair of a unique_ptr and a raw pointer, and when emplace_back attempt to push a 9th element to the vector, it releases all the unique_ptrs that previously in the vector. I added an instrumentation around emplace_back, and below is the output TypeUnitsUnderConstruction size before: 8 check1 0x67ecea0 0x6b09908 check1 0x67eccb0 0x5eefa48 check1 0x67ecac0 0x61a3b58 check1 0x67ec8d0 0x6b1ab28 check1 0x67ec6e0 0x61a3ca8 check1 0x67ec110 0x6b0c0b8 check1 0x65fd1a0 0x6b0bfb8 check1 0x65fcfb0 0x66cc0d8 emplace_back 0x65fcdc0 0x7a7a958 TypeUnitsUnderConstruction size after: 9 check2 0x0 0x6b09908 check2 0x0 0x5eefa48 check2 0x0 0x61a3b58 check2 0x0 0x6b1ab28 check2 0x0 0x61a3ca8 check2 0x0 0x6b0c0b8 check2 0x0 0x6b0bfb8 check2 0x0 0x66cc0d8 check2 0x65fcdc0 0x7a7a958 You can easily guess that something might go wrong when emplace_back triggers a logic that increases the capacity of the vector and moves the elements from the old location to the new location. That is actually right, and I confirmed that here in the function that increases the capacity (https://github.com/llvm-mirror/llvm/blob/master/include/llvm/ADT/SmallVector.h#L243), "unitilized_move" only resets the the old object but not moves the elements to the new object. So I investigated further to figure out which instruction actually sets "0x0" to the new location, and found that instruction 202aef4 below is the one 202aed0: 48 c7 00 00 00 00 00 movq $0x0,(%rax) 202aed7: f3 0f 6f 08 movdqu (%rax),%xmm1 202aedb: 48 83 c0 20 add $0x20,%rax 202aedf: 48 83 c1 20 add $0x20,%rcx 202aee3: 48 c7 40 f0 00 00 00 movq $0x0,-0x10(%rax) 202aeea: 00 202aeeb: f3 0f 6f 40 f0 movdqu -0x10(%rax),%xmm0 202aef0: 0f 11 49 e0 movups %xmm1,-0x20(%rcx) 202aef4: 0f 11 41 f0 movups %xmm0,-0x10(%rcx) 202aef8: 48 39 f8 cmp %rdi,%rax 202aefb: 75 d3 jne 202aed0 However, here, %xmm0 is always expected to be 0 because of instruction 202aee3 and 202aeeb (there's no branch targeting instructions after 202aeeb). I found this weird, and recompiled the problematic LLVM source (DwarfDebug.cpp) again with all IR dumps (https://gist.github.com/taewookoh/45e710594497b887e2ac54168c86c17f is my compilation command). Then I bisected IRs to find out where these suspicious logic introduced. And it seems to me that vectorizer is the issue. https://gist.github.com/taewookoh/3bbe848ee487473f8012028bc78da64a is the code snippet from DwarfDebug.cpp.160t.ifcvt while https://gist.github.com/taewookoh/766d1bf3198ac33465fad6d875110f3e is from DwarfDebug.cpp.161t.vect. From 160t, we can see the logic of reading a value from the old location, reset the old location, then write the value back to the new location (https://gist.github.com/taewookoh/3bbe848ee487473f8012028bc78da64a#file-gistfile1-txt-L225-L284). However, in one specialization in 161t, the logic seems twisted (https://gist.github.com/taewookoh/3bbe848ee487473f8012028bc78da64a#file-gistfile1-txt-L225-L284). Here, both __first$_M_current_84 and vectp.5429_127 are the index for the original object, while vectp_Result.5433_132 is an index for the new object. line 378-384 performs the operation of reading the value from the old object and writing them back to the new object, but for some reason, the old location is already reset to 0 from line 320-324. And this aligns with what I observe from the asm instructions above: reset the memory to zero first, read from the memory, then write to another memory. As I'm not a gcc expert I still not 100% confident to claim this as a gcc bug (and hard to believe that this bug hasn't been exposed), but what's happening here is suspicious enough to me. It would be nice if someone familiar with the code base can take a look. Thanks!