[Bug target/84524] -O3 causes behavior change
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84524

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #7 from Jakub Jelinek ---
Fixed for 7.4+.
[Bug target/84524] -O3 causes behavior change
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84524

--- Comment #6 from Jakub Jelinek ---
Author: jakub
Date: Mon Mar 5 16:09:49 2018
New Revision: 258253

URL: https://gcc.gnu.org/viewcvs?rev=258253&root=gcc&view=rev
Log:
        PR target/84524
        * config/i386/sse.md (*3): Replace with orig,vex.
        (*3): Likewise.  Remove uses.

        * gcc.c-torture/execute/pr84524.c: New test.
        * gcc.target/i386/avx512bw-pr84524.c: New test.

Added:
    branches/gcc-7-branch/gcc/testsuite/gcc.c-torture/execute/pr84524.c
    branches/gcc-7-branch/gcc/testsuite/gcc.target/i386/avx512bw-pr84524.c
Modified:
    branches/gcc-7-branch/gcc/ChangeLog
    branches/gcc-7-branch/gcc/config/i386/sse.md
    branches/gcc-7-branch/gcc/testsuite/ChangeLog
[Bug target/84524] -O3 causes behavior change
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84524

--- Comment #5 from Jakub Jelinek ---
Author: jakub
Date: Mon Mar 5 16:01:03 2018
New Revision: 258252

URL: https://gcc.gnu.org/viewcvs?rev=258252&root=gcc&view=rev
Log:
        PR target/84524
        * config/i386/sse.md (*3): Replace with orig,vex.
        (*3): Likewise.  Remove uses.

        * gcc.c-torture/execute/pr84524.c: New test.
        * gcc.target/i386/avx512bw-pr84524.c: New test.

Added:
    trunk/gcc/testsuite/gcc.c-torture/execute/pr84524.c
    trunk/gcc/testsuite/gcc.target/i386/avx512bw-pr84524.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/i386/sse.md
    trunk/gcc/testsuite/ChangeLog
[Bug target/84524] -O3 causes behavior change
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84524

--- Comment #4 from Jakub Jelinek ---
Created attachment 43493
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43493&action=edit
gcc8-pr84524.patch

Untested fix.
[Bug target/84524] -O3 causes behavior change
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84524

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org|jakub at gcc dot gnu.org
[Bug target/84524] -O3 causes behavior change
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84524

--- Comment #3 from Jakub Jelinek ---
It reproduces even with __attribute__((noipa)) on foo, so the problem is just
in that function.
In assembly we can see:
        vpsllw  $8, %zmm6, %zmm5
        addq    $64, %rdi
        vpmovzxwd       %ymm5, %zmm0
        vpcmpgtw        %zmm5, %zmm2, %k1
        vpslld  $1, %zmm0, %zmm1
        vextracti64x4   $0x1, %zmm5, %ymm0
        vpmovzxwd       %ymm0, %zmm0
        vpslld  $1, %zmm0, %zmm0
        vpermt2w        %zmm0, %zmm4, %zmm1
        vpsllw  $9, %zmm6, %zmm0
        vpaddw  %zmm7, %zmm6, %zmm6
        vpxorq  %zmm3, %zmm1, %zmm0
        vpmovzxwd       %ymm0, %zmm1
        vpcmpgtw        %zmm0, %zmm2, %k1
        vpslld  $1, %zmm1, %zmm5
        vextracti64x4   $0x1, %zmm0, %ymm1
the compares into %k1 are useless, nothing really uses the %k1 afterwards, but
it should be used for conditional moves (the cond ? (v << 1) ^ 0x1021 : (v <<
9) conditional moves).  Looking into this.
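For illustration only (not part of the bug report), here is a scalar C sketch of what one loop step should compute per 16-bit lane when the mask is actually consumed. It assumes that %zmm2 holds zeros, so that the vpcmpgtw compare amounts to a signed "0 > v" test; the comment above does not say so. The names step_reference, step_masked and count_step_mismatches are hypothetical.

```c
#include <stdint.h>

/* One step of the inner loop, written exactly as in the testcase.  */
static uint16_t step_reference(uint16_t v)
{
    if (v & 0x8000)
        return (uint16_t) ((v << 1) ^ 0x1021);
    return (uint16_t) (v << 1);
}

/* The same step in "compare into a mask, then select" form.
   ASSUMPTION: the compared-against register is all zeros, so the mask
   bit is a signed 0 > v test on the 16-bit lane.  The uint16_t to
   int16_t conversion is modulo 2^16 with GCC.  */
static uint16_t step_masked(uint16_t v)
{
    int mask = 0 > (int16_t) v;                  /* models the %k1 bit */
    uint16_t shifted = (uint16_t) (v << 1);      /* "else" value */
    uint16_t xored = (uint16_t) (shifted ^ 0x1021); /* "then" value */
    return mask ? xored : shifted;               /* the conditional move */
}

/* Number of 16-bit inputs on which the two forms disagree.  */
static unsigned count_step_mismatches(void)
{
    unsigned bad = 0;
    for (uint32_t v = 0; v <= 0xffff; v++)
        if (step_reference((uint16_t) v) != step_masked((uint16_t) v))
            bad++;
    return bad;
}
```

Calling count_step_mismatches () from a small main checks all 65536 inputs. The point of the sketch is that the select must depend on the mask: if the mask is computed but never used, as in the assembly above, the result cannot match the reference for every lane.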
[Bug target/84524] -O3 causes behavior change
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84524

--- Comment #2 from Jakub Jelinek ---
A side note, this shows how badly we need a type demotion pass, perhaps just on
the LOOP_VECTORIZED copy of the loop before vectorization:
  vect__27.7_175 = [vec_unpack_lo_expr] vect_v_16.5_173;
  vect__27.7_176 = [vec_unpack_hi_expr] vect_v_16.5_173;
  vect__28.8_177 = vect__27.7_175 << 1;
  vect__28.8_178 = vect__27.7_176 << 1;
  vect__29.9_179 = VEC_PACK_TRUNC_EXPR ;
or some match.pd patterns that will fix it after the vectorization.
The above is a completely useless variant of a plain vector unsigned short << 1.
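For illustration only (not part of the bug report), the redundancy described above can be checked in scalar C: widening a 16-bit value, shifting it left by one and truncating back gives the same result as a shift that never looks at anything above bit 15. The function names are hypothetical, and the scalar code is only a model of the vector statements, not what GCC generates.

```c
#include <stdint.h>

/* Shift the way the vectorized GIMPLE does it: widen, shift, truncate.  */
static uint16_t shl1_widened(uint16_t v)
{
    uint32_t wide = v;         /* models vec_unpack_{lo,hi}_expr */
    wide <<= 1;                /* shift in the wider type */
    return (uint16_t) wide;    /* models VEC_PACK_TRUNC_EXPR */
}

/* Shift that provably stays within 16 bits: the top bit is discarded
   before shifting, so no wider intermediate is ever needed.  */
static uint16_t shl1_narrow(uint16_t v)
{
    return (uint16_t) ((v & 0x7fffu) << 1);
}

/* Number of 16-bit inputs on which the two forms disagree.  */
static unsigned count_shift_mismatches(void)
{
    unsigned bad = 0;
    for (uint32_t v = 0; v <= 0xffff; v++)
        if (shl1_widened((uint16_t) v) != shl1_narrow((uint16_t) v))
            bad++;
    return bad;
}
```

Calling count_shift_mismatches () from a small main covers every possible input, which is the property a type demotion pass or a match.pd pattern would rely on to replace the unpack/shift/pack sequence with a single 16-bit vector shift.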
[Bug target/84524] -O3 causes behavior change
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84524

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |wrong-code
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2018-02-23
                 CC|                            |jakub at gcc dot gnu.org
     Ever confirmed|0                           |1

--- Comment #1 from Jakub Jelinek ---
Adjusted testcase for the testsuite:

void
foo (unsigned short *x)
{
  unsigned short i, v;
  unsigned char j;
  for (i = 0; i < 256; i++)
    {
      v = i << 8;
      for (j = 0; j < 8; j++)
        if (v & 0x8000)
          v = (v << 1) ^ 0x1021;
        else
          v = v << 1;
      x[i] = v;
    }
}

int
main (void)
{
  unsigned short a[256];
  foo (a);
  for (int i = 0; i < 256; i++)
    {
      unsigned short v = i << 8;
      for (int j = 0; j < 8; j++)
        {
          asm volatile ("" : "+r" (v));
          if (v & 0x8000)
            v = (v << 1) ^ 0x1021;
          else
            v = v << 1;
        }
      if (a[i] != v)
        __builtin_abort ();
    }
  return 0;
}

I can confirm this is miscompiled even with current trunk at -O3 -mavx512bw.