[Bug target/88547] missed optimization for vector comparisons
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88547

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #8 from Jakub Jelinek ---
Fixed.
[Bug target/88547] missed optimization for vector comparisons
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88547

--- Comment #7 from Jakub Jelinek ---
Author: jakub
Date: Fri Dec 21 10:37:11 2018
New Revision: 267322

URL: https://gcc.gnu.org/viewcvs?rev=267322&root=gcc&view=rev
Log:
	PR target/88547
	* config/i386/i386.c (ix86_expand_int_sse_cmp): Optimize
	x > y ? 0 : -1 into min (x, y) == x ? -1 : 0.

	* gcc.target/i386/pr88547-1.c: Expect only 2 knotb and 2 knotw
	insns instead of 4, check for vpminud, vpminuq and no vpsubd or
	vpsubq.
	* gcc.target/i386/sse2-pr88547-1.c: New test.
	* gcc.target/i386/sse2-pr88547-2.c: New test.
	* gcc.target/i386/sse4_1-pr88547-1.c: New test.
	* gcc.target/i386/sse4_1-pr88547-2.c: New test.
	* gcc.target/i386/avx2-pr88547-1.c: New test.
	* gcc.target/i386/avx2-pr88547-2.c: New test.
	* gcc.target/i386/avx512f-pr88547-2.c: New test.
	* gcc.target/i386/avx512vl-pr88547-1.c: New test.
	* gcc.target/i386/avx512vl-pr88547-2.c: New test.
	* gcc.target/i386/avx512vl-pr88547-3.c: New test.
	* gcc.target/i386/avx512f_cond_move.c (y): Change from unsigned int
	array to int array.

Added:
    trunk/gcc/testsuite/gcc.target/i386/avx2-pr88547-1.c
    trunk/gcc/testsuite/gcc.target/i386/avx2-pr88547-2.c
    trunk/gcc/testsuite/gcc.target/i386/avx512f-pr88547-2.c
    trunk/gcc/testsuite/gcc.target/i386/avx512vl-pr88547-1.c
    trunk/gcc/testsuite/gcc.target/i386/avx512vl-pr88547-2.c
    trunk/gcc/testsuite/gcc.target/i386/avx512vl-pr88547-3.c
    trunk/gcc/testsuite/gcc.target/i386/sse2-pr88547-1.c
    trunk/gcc/testsuite/gcc.target/i386/sse2-pr88547-2.c
    trunk/gcc/testsuite/gcc.target/i386/sse4_1-pr88547-1.c
    trunk/gcc/testsuite/gcc.target/i386/sse4_1-pr88547-2.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/i386/i386.c
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/testsuite/gcc.target/i386/avx512f_cond_move.c
    trunk/gcc/testsuite/gcc.target/i386/pr88547-1.c
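
The rewrite named in the log entry can be sanity-checked one lane at a time. The following is only a sketch, assuming unsigned 8-bit lanes; the function names are hypothetical and nothing here is taken from i386.c. It compares "x > y ? 0 : -1" against "min (x, y) == x ? -1 : 0" for every pair of byte values.

```c
#include <stdint.h>

/* Original form of one lane: x > y ? 0 : -1 (that is, x <= y as a mask). */
static int8_t lane_ref(uint8_t x, uint8_t y)
{
    return x > y ? 0 : -1;
}

/* Rewritten form of one lane: min (x, y) == x ? -1 : 0. */
static int8_t lane_min(uint8_t x, uint8_t y)
{
    uint8_t m = x < y ? x : y;   /* what an unsigned byte minimum computes */
    return m == x ? -1 : 0;
}

/* Returns 1 when both forms agree on all 256 * 256 input pairs, else 0. */
static int forms_agree(void)
{
    for (unsigned x = 0; x < 256; x++)
        for (unsigned y = 0; y < 256; y++)
            if (lane_ref((uint8_t)x, (uint8_t)y)
                != lane_min((uint8_t)x, (uint8_t)y))
                return 0;
    return 1;
}
```

The same argument holds for signed lanes as long as the minimum and the comparison use the same signedness, which is presumably why the updated test looks for the unsigned vpminud and vpminuq forms.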
[Bug target/88547] missed optimization for vector comparisons
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88547

--- Comment #6 from Jakub Jelinek ---
Created attachment 45274
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45274&action=edit
gcc9-pr88547.patch

Untested patch for the rest.  Richard, is that what you had in mind?
[Bug target/88547] missed optimization for vector comparisons
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88547

--- Comment #5 from Jakub Jelinek ---
Author: jakub
Date: Thu Dec 20 07:58:02 2018
New Revision: 267293

URL: https://gcc.gnu.org/viewcvs?rev=267293&root=gcc&view=rev
Log:
	PR target/88547
	* config/i386/i386.c (ix86_expand_sse_movcc): For maskcmp, try to
	emit vpmovm2? instruction perhaps after knot?.  Reorganize code
	so that it doesn't have to test !maskcmp in almost every
	conditional.

	* gcc.target/i386/pr88547-1.c: New test.

Added:
    trunk/gcc/testsuite/gcc.target/i386/pr88547-1.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/i386/i386.c
    trunk/gcc/testsuite/ChangeLog
[Bug target/88547] missed optimization for vector comparisons
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88547

--- Comment #4 from Jakub Jelinek ---
Created attachment 45264
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45264&action=edit
gcc9-pr88547-1.patch

Untested patch to improve the avx512* sse_movcc.
[Bug target/88547] missed optimization for vector comparisons
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88547

--- Comment #3 from Jakub Jelinek ---
For 64-byte vectors, we emit
        vpcmpgtb        %zmm1, %zmm0, %k1
        vpxor           %xmm1, %xmm1, %xmm1
        vpternlogd      $0xFF, %zmm0, %zmm0, %zmm0
        vmovdqu8        %zmm1, %zmm0{%k1}
for f1, perhaps it would be better to emit:
        vpcmpgtb        %zmm1, %zmm0, %k1
        knotq           %k1, %k1
        vpmovm2b        %k1, %zmm0
?
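
To see why the two sequences should give the same result for f1 (x <= y), here is a scalar model in plain C. It is a sketch only: it assumes signed byte lanes and a 64-bit mask register, all function names are hypothetical, and it models the instruction semantics rather than using any intrinsics.

```c
#include <stdint.h>

#define LANES 64   /* byte lanes in one 64-byte vector */

/* Model of vpcmpgtb %zmm1, %zmm0, %k1: bit i is set when x[i] > y[i]. */
static uint64_t cmpgt_mask(const int8_t *x, const int8_t *y)
{
    uint64_t k = 0;
    for (int i = 0; i < LANES; i++)
        if (x[i] > y[i])
            k |= (uint64_t)1 << i;
    return k;
}

/* Model of the sequence emitted now: all-ones vector, then zero merged in
   under the mask. */
static void le_current(const int8_t *x, const int8_t *y, int8_t *out)
{
    uint64_t k = cmpgt_mask(x, y);
    for (int i = 0; i < LANES; i++)
        out[i] = -1;                 /* vpternlogd $0xFF: every bit set */
    for (int i = 0; i < LANES; i++)
        if ((k >> i) & 1)
            out[i] = 0;              /* vmovdqu8 %zmm1, %zmm0{%k1} */
}

/* Model of the suggested sequence: invert the mask, then expand it. */
static void le_proposed(const int8_t *x, const int8_t *y, int8_t *out)
{
    uint64_t k = ~cmpgt_mask(x, y);              /* knotq %k1, %k1 */
    for (int i = 0; i < LANES; i++)
        out[i] = ((k >> i) & 1) ? -1 : 0;        /* vpmovm2b %k1, %zmm0 */
}
```

Both models leave -1 in the lanes where x[i] <= y[i] and 0 elsewhere; the suggested form gets there without materializing the zero and all-ones vectors.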
[Bug target/88547] missed optimization for vector comparisons
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88547

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #2 from Jakub Jelinek ---
More complete testcase:
typedef signed char v16qi __attribute__((vector_size(16)));
typedef unsigned char v16uqi __attribute__((vector_size(16)));
typedef short v8hi __attribute__((vector_size(16)));
typedef unsigned short v8uhi __attribute__((vector_size(16)));
typedef int v4si __attribute__((vector_size(16)));
typedef unsigned v4usi __attribute__((vector_size(16)));
typedef long long v2di __attribute__((vector_size(16)));
typedef unsigned long long v2udi __attribute__((vector_size(16)));

v16qi f1 (v16qi x, v16qi y) { return x <= y; }
v16qi f1a (v16qi x, v16qi y) { return x < y; }
v16uqi f2 (v16uqi x, v16uqi y) { return x <= y; }
v16qi f3 (v16qi x, v16qi y) { return x >= y; }
v16uqi f4 (v16uqi x, v16uqi y) { return x >= y; }
v8hi f5 (v8hi x, v8hi y) { return x <= y; }
v8uhi f6 (v8uhi x, v8uhi y) { return x <= y; }
v8hi f7 (v8hi x, v8hi y) { return x >= y; }
v8uhi f8 (v8uhi x, v8uhi y) { return x >= y; }
v4si f9 (v4si x, v4si y) { return x <= y; }
v4usi f10 (v4usi x, v4usi y) { return x <= y; }
v4si f11 (v4si x, v4si y) { return x >= y; }
v4usi f12 (v4usi x, v4usi y) { return x >= y; }
v2di f13 (v2di x, v2di y) { return x <= y; }
v2udi f14 (v2udi x, v2udi y) { return x <= y; }
v2di f15 (v2di x, v2di y) { return x >= y; }
v2udi f16 (v2udi x, v2udi y) { return x >= y; }

plus of course we need a 32-byte and 64-byte vector variant, and test with
-msse4.1 (the first one to have pmin{s,u}b), -mavx, -mavx2, -mavx512*.

I think it could be done in ix86_expand_int_sse_cmp or in
ix86_expand_int_vcond - perhaps only for the cases where one of the vcond
operands is all ones and the other one is zero, notice that depending on which
one is which the negation is 2 instructions (though, only if we don't hoist the
constant load e.g. before a loop) and that for TARGET_SSE4_1 we can use the
minimum or maximum.
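
The trade-off between the negated compare and the maximum can be written out at source level with the GCC vector extension (which Clang also accepts). This is an illustrative sketch for the x >= y case of f3 only: the function names are hypothetical, the maximum is spelled with mask arithmetic rather than an intrinsic, and it says nothing about which instructions the compiler actually selects.

```c
typedef signed char v16qi __attribute__((vector_size(16)));

/* f3 from the testcase: each lane is -1 where x >= y, else 0. */
static v16qi ge_direct(v16qi x, v16qi y)
{
    return x >= y;
}

/* Negated form: compare for x < y, then invert every lane. */
static v16qi ge_negated(v16qi x, v16qi y)
{
    return ~(x < y);
}

/* Maximum form: max (x, y) == x holds exactly when x >= y. */
static v16qi ge_max(v16qi x, v16qi y)
{
    v16qi gt = x > y;                  /* -1 where x is the larger lane */
    v16qi m = (x & gt) | (y & ~gt);    /* lane-wise signed maximum */
    return m == x;
}

/* Returns 1 when all 16 lanes of a and b are equal, else 0. */
static int same_lanes(v16qi a, v16qi b)
{
    for (int i = 0; i < 16; i++)
        if (a[i] != b[i])
            return 0;
    return 1;
}
```

All three functions should return the same vector for any inputs; the maximum form avoids the separate inversion step that the negated form needs.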
[Bug target/88547] missed optimization for vector comparisons
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88547

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
             Target|                            |x86_64-*-*, i?86-*-*
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2018-12-19
            Version|unknown                     |9.0
     Ever confirmed|0                           |1
           Severity|normal                      |enhancement

--- Comment #1 from Richard Biener ---
Nice.  Patch?