https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93919
Bug ID: 93919
Summary: [10 Regression] vectorization of 18 char to char16_t conversion is miscompiled
Product: gcc
Version: 10.0
Status: UNCONFIRMED
Keywords: wrong-code
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: kretz at kde dot org
Target Milestone: ---
Target: x86_64-*-*, i?86-*-*

Test case (https://godbolt.org/z/8QYarZ):

    char in[18] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17};

    int main() {
      using V [[gnu::vector_size(32)]] = char;
      const V seq = {in[0], in[1], in[2], in[3], in[4], in[5], in[6], in[7], in[8],
                     in[9], in[10], in[11], in[12], in[13], in[14], in[15], in[16], in[17]};
      char16_t out[18];
      for (int i = 0; i < 18; ++i)
        out[i] = static_cast<char16_t>(seq[i]);
      for (int i = 0; i < 18; ++i) {
        const volatile char16_t reference(seq[i]);
        if (out[i] != reference)
          __builtin_abort();
      }
    }

Compile with `-O1 -ftree-vectorize`. Result:

    pxor xmm1, xmm1
    pcmpgtb xmm1, xmm0                 # xmm0 is seq[0:15]
    movdqa xmm2, xmm0
    punpcklbw xmm2, xmm1
    movaps XMMWORD PTR [rsp+16], xmm2
    punpckhbw xmm0, xmm1
    movaps XMMWORD PTR [rsp+32], xmm0
    movsx eax, WORD PTR [rsp+80]       # WORD PTR [rsp+80] is seq[16:17]
    mov DWORD PTR [rsp+48], eax        # out-of-bounds write by 2 Bytes

Conversion of the first 16 char to char16_t is correct. The latter 2 char are
loaded as a word and sign extended to a doubleword. The doubleword is stored to
the stack, exceeding the size of the destination array by 2 Bytes. Besides the
out-of-bounds access, the result in out[16] and out[17] is wrong.
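
To show why the values in out[16] and out[17] come out wrong, the following is a minimal sketch of what the final movsx/mov pair appears to compute, next to the expected per-element conversion. The helper names emulate_tail and convert_tail are hypothetical and are not part of GCC or the test case. The sketch assumes a little-endian target such as x86, and it models only the resulting values, not the stack layout.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: models the scalar tail of the generated code for the
// last two elements. Assumes a little-endian target such as x86.
inline void emulate_tail(const signed char* src, char16_t* dst) {
    std::int16_t word;
    std::memcpy(&word, src, sizeof word);    // two chars loaded as one 16-bit word
    std::int32_t dword = word;               // sign extension to 32 bits (movsx)
    std::memcpy(dst, &dword, sizeof dword);  // one 4-byte store (mov DWORD PTR)
}

// Hypothetical helper: the expected behaviour, converting each char on its own.
inline void convert_tail(const signed char* src, char16_t* dst) {
    for (int i = 0; i < 2; ++i)
        dst[i] = static_cast<char16_t>(src[i]);
}
```

For the inputs 16 and 17 used in the test case, convert_tail yields 16 and 17, while emulate_tail packs both input bytes into the first element (0x1110) and leaves the second element holding only the sign-extension bits (0x0000), so the comparison against the reference fails.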