[Bug target/88798] AVX512BW code does not use bit-operations that work on mask registers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88798

--- Comment #9 from Hongtao.liu ---
(In reply to Wojciech Mula from comment #8)
> Thank you for the answer. Thus my question is: is it possible to delay
> conversion from kmasks into ints? I'm not a language lawyer, but I guess a
> `x binop y` has to be treated as `(int)x binop (int)y`. If it's true, we
> will have to prove that `(int)(x avx512-binop y)` is equivalent to the
> latter expr.

It's quite tricky to teach the RA to choose the best alternative (which is
discussed in PR101185). Alternatively, you can use the intrinsic _kor_mask64
directly to avoid the extra movement from GPR to MASK.

uint64_t any_whitespace(__m512i string) {
    return _kor_mask64 (_kor_mask64 (_mm512_cmpeq_epu8_mask(string, _mm512_set1_epi8(' ')),
                                     _mm512_cmpeq_epu8_mask(string, _mm512_set1_epi8('\n'))),
                        _mm512_cmpeq_epu8_mask(string, _mm512_set1_epi8('\r')));
}
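As a rough illustration of the value any_whitespace is expected to return, here is a portable scalar sketch that needs no AVX-512 hardware. It assumes the usual AVX-512 convention that bit i of a compare mask corresponds to byte i of the vector. The names cmpeq_mask_scalar and any_whitespace_scalar are hypothetical and appear neither in GCC nor in this bug report.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of _mm512_cmpeq_epu8_mask(v, _mm512_set1_epi8(c)):
   bit i of the result is set when byte i of the 64-byte block equals c. */
static uint64_t cmpeq_mask_scalar(const unsigned char block[64], unsigned char c)
{
    uint64_t mask = 0;
    for (size_t i = 0; i < 64; i++) {
        if (block[i] == c)
            mask |= (uint64_t)1 << i;
    }
    return mask;
}

/* Hypothetical scalar model of any_whitespace: OR the three compare masks.
   On real hardware the two ORs are what _kor_mask64 performs on k-registers. */
static uint64_t any_whitespace_scalar(const unsigned char block[64])
{
    return cmpeq_mask_scalar(block, ' ')
         | cmpeq_mask_scalar(block, '\n')
         | cmpeq_mask_scalar(block, '\r');
}
```

A model like this can serve as a reference when checking that the intrinsic version and the plain `|` version produce the same mask, independent of which registers the compiler picks.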
[Bug target/88798] AVX512BW code does not use bit-operations that work on mask registers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88798

--- Comment #8 from Wojciech Mula ---
Thank you for the answer. Thus my question is: is it possible to delay
conversion from kmasks into ints? I'm not a language lawyer, but I guess a
`x binop y` has to be treated as `(int)x binop (int)y`. If it's true, we
will have to prove that `(int)(x avx512-binop y)` is equivalent to the
latter expr.
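The equivalence asked about here can be explored with a small portable model. This is only a sketch: mask16 is a hypothetical stand-in for a 16-bit mask type (modelled with uint16_t, and a 32-bit int is assumed), and the "mask domain" functions imitate an operation carried out at mask width and widened afterwards, while the "int domain" functions follow C's integer promotion of both operands first.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 16-bit mask type. */
typedef uint16_t mask16;

/* OR performed at mask width, then widened: models (int)(x avx512-binop y). */
static int or_in_mask_domain(mask16 x, mask16 y)
{
    mask16 r = (mask16)(x | y);
    return (int)r;
}

/* OR performed after promotion: models (int)x binop (int)y. */
static int or_in_int_domain(mask16 x, mask16 y)
{
    return (int)x | (int)y;
}

/* NOT performed at mask width, then widened: upper bits stay zero. */
static int not_in_mask_domain(mask16 x)
{
    mask16 r = (mask16)~x;
    return (int)r;
}

/* NOT performed after promotion: upper bits of the int become ones. */
static int not_in_int_domain(mask16 x)
{
    return ~(int)x;
}
```

In this model OR gives the same result either way, because neither order can set bits above the mask width (the same holds for AND and XOR). NOT does not: for x = 0x00FF the mask-width result is 0xFF00, while the promoted result is -256, so the two only agree if the value is truncated back to mask width afterwards.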
[Bug target/88798] AVX512BW code does not use bit-operations that work on mask registers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88798

--- Comment #7 from Hongtao.liu ---
(In reply to Wojciech Mula from comment #6)
> Hongtao, thank you for your patch and for pinging back! I checked the code
> from this issue against version 11.2.0 (Debian 11.2.0-14), but still, there
> are KMOVQs before performing any bit ops. Here is the output from `gcc -O3
> -march=icelake-server -S`
>
>         vpcmpub $0, .LC0(%rip), %zmm0, %k0
>         vpcmpub $0, .LC1(%rip), %zmm0, %k1
>         vpcmpub $0, .LC2(%rip), %zmm0, %k2
>         kmovq   %k0, %rcx
>         kmovq   %k1, %rax
>         orq     %rcx, %rax
>         kmovq   %k2, %rdx
>         orq     %rdx, %rax
>         ret

Oh, yes. Because of PR101185, mask registers are slightly disliked. Mask
bitwise instructions are generated only if the source and destination are
both mask registers, i.e.

#include <immintrin.h>
__m512i
foo_orq (__m512i a, __m512i b, __m512i c, __m512i d)
{
  __mmask64 m1 = _mm512_cmpeq_epi8_mask (a, b);
  __mmask64 m2 = _mm512_cmpeq_epi8_mask (c, d);
  return _mm512_mask_add_epi8 (c, m1 | m2, a, d);
}
[Bug target/88798] AVX512BW code does not use bit-operations that work on mask registers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88798

--- Comment #6 from Wojciech Mula ---
Hongtao, thank you for your patch and for pinging back! I checked the code
from this issue against version 11.2.0 (Debian 11.2.0-14), but still, there
are KMOVQs before performing any bit ops. Here is the output from `gcc -O3
-march=icelake-server -S`

        vpcmpub $0, .LC0(%rip), %zmm0, %k0
        vpcmpub $0, .LC1(%rip), %zmm0, %k1
        vpcmpub $0, .LC2(%rip), %zmm0, %k2
        kmovq   %k0, %rcx
        kmovq   %k1, %rax
        orq     %rcx, %rax
        kmovq   %k2, %rdx
        orq     %rdx, %rax
        ret
[Bug target/88798] AVX512BW code does not use bit-operations that work on mask registers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88798

Hongtao.liu changed:

           What    |Removed                     |Added
                 CC|                            |crazylht at gmail dot com

--- Comment #5 from Hongtao.liu ---
Fixed in GCC11 by
https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;h=388cb292a94f98a276548cd6ce01285cf36d17df
[Bug target/88798] AVX512BW code does not use bit-operations that work on mask registers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88798

Richard Biener changed:

           What    |Removed                     |Added
           Keywords|                            |missed-optimization
             Target|                            |x86_64-*-*, i?86-*-*
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2019-01-14
     Ever confirmed|0                           |1

--- Comment #4 from Richard Biener ---
The testcase still behaves the same on trunk, not sure if exactly a dup or not.
[Bug target/88798] AVX512BW code does not use bit-operations that work on mask registers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88798

--- Comment #3 from Wojciech Mula ---
Sorry, I didn't find that bug; I think you may close this one.

BTW, I had checked the code on godbolt.org before submitting. I also tested
with their "GCC (trunk)", but the generated code is the same as for 8.2. The
trunk's version is "g++ (GCC-Explorer-Build) 9.0.0 20190109 (experimental)".
It seems to be a fresh version and should already include the fixes Andrew
mentioned.
[Bug target/88798] AVX512BW code does not use bit-operations that work on mask registers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88798

Jakub Jelinek changed:

           What    |Removed                     |Added
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #2 from Jakub Jelinek ---
See PR88473 for more details. You can use _kor_mask64 if you want to
explicitly use the mask operations instead of GPRs.
[Bug target/88798] AVX512BW code does not use bit-operations that work on mask registers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88798

--- Comment #1 from Andrew Pinski ---
Some, if not all, of this has been fixed on trunk. Just a few weeks ago
there was a bug that asked for a similar thing.