[Bug target/93069] Assembler messages: Error: unsupported masking for `vextracti32x8'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93069

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED

--- Comment #10 from Jakub Jelinek ---
Fixed now.
[Bug target/93069] Assembler messages: Error: unsupported masking for `vextracti32x8'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93069

--- Comment #9 from CVS Commits ---
The master branch has been updated by Jakub Jelinek:

https://gcc.gnu.org/g:319eafce3e54c8cb10e3fddce6823a6a558fca8b

commit r11-147-g319eafce3e54c8cb10e3fddce6823a6a558fca8b
Author: Jakub Jelinek
Date:   Wed May 6 20:05:02 2020 +0200

    x86: Fix vextract* masked patterns [PR93069]

    The AVX512F documentation clearly states that in instructions where the
    destination is a memory only merging-masking is possible, not
    zero-masking, and the assembler enforces that.

    The testcase in this patch fails to assemble because of
    Error: unsupported masking for `vextracti32x8'
    on
    vextracti32x8 $0x0, %zmm1, -64(%rsp){%k1}{z}

    For the vector extraction patterns, we apparently have 7 *_maskm
    patterns that only accept memory destinations and rtx_equal_p
    merge-masking source for it, 7 * corresponding patterns that allow
    memory destination only for the non-masked cases (through ), then
    2 * patterns (lo ssehalf V16FI and lo ssehalf VI8F_256 ones) which do
    allow memory destination even for masked cases and are the cause of the
    testsuite failure, because we must not allow C constraint if the
    destination is m, and finally one pair of patterns (separate * and
    *_mask, hi ssehalf VI4F_256), which has another issue (for which I don't
    have a testcase though), where if it would match zero-masking with
    register destination, it wouldn't emit the needed {z} into assembly.
    The attached patch fixes those 3 issues only, perhaps more suitable for
    backporting.

    But, even with that fixed, we are missing 3 further *_maskm patterns and
    more importantly, I find the split into 3 separate patterns after subst,
    *_maskm for masking with memory destination, *_mask for masking with
    register destination and * for non-masking unnecessarily complex and
    harder for reload, so the included patch below (non-attached) instead
    kills all *_maskm patterns and splits the * patterns into * and *_mask
    by hand instead of subst, where the *_mask ones make sure that with v
    destination they use 0C, while with m destination they use 0 and as
    condition enforce that either destination is not MEM, or rtx_equal_p
    between the destination and corresponding merging-masking operand
    source.

    If we had those 3 missing *_maskm patterns, this patch would actually
    result in both shorter sse.md and shorter machine description after
    subst (e.g. length of tmp-mddump.md), as we don't have them, the patch
    is actually 16 lines longer sse.md, but still shorter tmp-mddump.md.

    2020-05-06  Jakub Jelinek

        PR target/93069
        * config/i386/subst.md (store_mask_constraint,
        store_mask_predicate): Remove.
        (avx512dq_vextract64x2_1_maskm, avx512f_vextract32x4_1_maskm,
        vec_extract_lo__maskm, vec_extract_hi__maskm): Remove.
        (avx512dq_vextract64x2_1): Split into ...
        (*avx512dq_vextract64x2_1, avx512dq_vextract64x2_1_mask): ... these
        new define_insns.  Even in the masked variant allow memory output
        but in that case use 0 rather than 0C constraint on the source of
        masked-out elts.
        (avx512f_vextract32x4_1): Split into ...
        (*avx512f_vextract32x4_1, avx512f_vextract32x4_1_mask): ... these
        new define_insns.  Even in the masked variant allow memory output
        but in that case use 0 rather than 0C constraint on the source of
        masked-out elts.
        (vec_extract_lo_): Split into ...
        (vec_extract_lo_, vec_extract_lo__mask): ... these new
        define_insns.  Even in the masked variant allow memory output but
        in that case use 0 rather than 0C constraint on the source of
        masked-out elts.
        (vec_extract_hi_): Split into ...
        (vec_extract_hi_, vec_extract_hi__mask): ... these new
        define_insns.  Even in the masked variant allow memory output but
        in that case use 0 rather than 0C constraint on the source of
        masked-out elts.
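The rule the commit message gives for the hand-written *_mask patterns (0C for a register destination, 0 for a memory destination, plus an rtx_equal_p check when the destination is memory) can be modelled in a few lines of C. This is only an illustrative sketch: `struct opnd`, `opnd_equal` and `mask_pattern_ok` are hypothetical names invented here, not GCC's rtx types or functions, and the register case is assumed to be handled by the constraints rather than by the insn condition.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for insn operands; these are NOT GCC's rtx. */
enum opnd_kind { OPND_REG, OPND_MEM, OPND_ZERO };

struct opnd {
    enum opnd_kind kind;
    int id; /* register number or an opaque address id; 0 for OPND_ZERO */
};

/* Simplified analogue of rtx_equal_p for this toy model. */
bool opnd_equal(struct opnd a, struct opnd b)
{
    return a.kind == b.kind && a.id == b.id;
}

/* Models the insn condition described for the *_mask patterns:
   either the destination is not memory, or the merge-masking source
   is the destination itself.  For a register destination the "0C"
   constraint (same operand or constant zero) does the rest. */
bool mask_pattern_ok(struct opnd dest, struct opnd merge_src)
{
    assert(dest.kind != OPND_ZERO); /* a constant cannot be a destination */
    if (dest.kind != OPND_MEM)
        return true;                     /* register: merge or zero masking */
    return opnd_equal(dest, merge_src);  /* memory: merge masking only */
}
```

In this model a memory destination paired with a constant-zero merge source is rejected, which corresponds to the `{%k1}{z}` store the assembler refused.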
[Bug target/93069] Assembler messages: Error: unsupported masking for `vextracti32x8'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93069

--- Comment #8 from Jakub Jelinek ---
Yes, there is a larger patch approved for GCC11, but not for GCC10.
[Bug target/93069] Assembler messages: Error: unsupported masking for `vextracti32x8'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93069

--- Comment #7 from Arseny Solokha ---
Is there some further work pending, or can this PR be closed?
[Bug target/93069] Assembler messages: Error: unsupported masking for `vextracti32x8'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93069

--- Comment #6 from CVS Commits ---
The releases/gcc-9 branch has been updated by Jakub Jelinek:

https://gcc.gnu.org/g:57e276f3e304ef92483763ee1028e5b3e1345e0f

commit r9-8473-g57e276f3e304ef92483763ee1028e5b3e1345e0f
Author: Jakub Jelinek
Date:   Tue Apr 7 21:00:28 2020 +0200

    Fix vextract* masked patterns [PR93069]

    The AVX512F documentation clearly states that in instructions where the
    destination is a memory only merging-masking is possible, not
    zero-masking, and the assembler enforces that.

    The testcase in this patch fails to assemble because of
    Error: unsupported masking for `vextracti32x8'
    on
    vextracti32x8 $0x0, %zmm1, -64(%rsp){%k1}{z}

    For the vector extraction patterns, we apparently have 7 *_maskm
    patterns that only accept memory destinations and rtx_equal_p
    merge-masking source for it, 7 * corresponding patterns that allow
    memory destination only for the non-masked cases (through ), then
    2 * patterns (lo ssehalf V16FI and lo ssehalf VI8F_256 ones) which do
    allow memory destination even for masked cases and are the cause of the
    testsuite failure, because we must not allow C constraint if the
    destination is m, and finally one pair of patterns (separate * and
    *_mask, hi ssehalf VI4F_256), which has another issue (for which I don't
    have a testcase though), where if it would match zero-masking with
    register destination, it wouldn't emit the needed {z} into assembly.
    The attached patch fixes those 3 issues only, perhaps more suitable for
    backporting.

    2020-03-30  Jakub Jelinek

        PR target/93069
        * config/i386/sse.md (vec_extract_lo_): Use instead of m in output
        operand constraint.
        (vec_extract_hi_): Use instead of %{%3%}.

        * gcc.target/i386/avx512vl-pr93069.c: New test.
        * gcc.dg/vect/pr93069.c: New test.
[Bug target/93069] Assembler messages: Error: unsupported masking for `vextracti32x8'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93069

--- Comment #5 from Jakub Jelinek ---
Smaller fix applied to GCC 10, larger one queued for GCC 11.
[Bug target/93069] Assembler messages: Error: unsupported masking for `vextracti32x8'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93069

--- Comment #4 from CVS Commits ---
The master branch has been updated by Jakub Jelinek:

https://gcc.gnu.org/g:ec919cfcef8d7fcbaab24d0e0d472c65e5329ca6

commit r10-7457-gec919cfcef8d7fcbaab24d0e0d472c65e5329ca6
Author: Jakub Jelinek
Date:   Mon Mar 30 17:38:21 2020 +0200

    Fix vextract* masked patterns [PR93069]

    The AVX512F documentation clearly states that in instructions where the
    destination is a memory only merging-masking is possible, not
    zero-masking, and the assembler enforces that.

    The testcase in this patch fails to assemble because of
    Error: unsupported masking for `vextracti32x8'
    on
    vextracti32x8 $0x0, %zmm1, -64(%rsp){%k1}{z}

    For the vector extraction patterns, we apparently have 7 *_maskm
    patterns that only accept memory destinations and rtx_equal_p
    merge-masking source for it, 7 * corresponding patterns that allow
    memory destination only for the non-masked cases (through ), then
    2 * patterns (lo ssehalf V16FI and lo ssehalf VI8F_256 ones) which do
    allow memory destination even for masked cases and are the cause of the
    testsuite failure, because we must not allow C constraint if the
    destination is m, and finally one pair of patterns (separate * and
    *_mask, hi ssehalf VI4F_256), which has another issue (for which I don't
    have a testcase though), where if it would match zero-masking with
    register destination, it wouldn't emit the needed {z} into assembly.
    The attached patch fixes those 3 issues only, perhaps more suitable for
    backporting.

    2020-03-30  Jakub Jelinek

        PR target/93069
        * config/i386/sse.md (vec_extract_lo_): Use instead of m in output
        operand constraint.
        (vec_extract_hi_): Use instead of %{%3%}.

        * gcc.target/i386/avx512vl-pr93069.c: New test.
        * gcc.dg/vect/pr93069.c: New test.
[Bug target/93069] Assembler messages: Error: unsupported masking for `vextracti32x8'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93069

--- Comment #3 from Jakub Jelinek ---
Note, the above isn't the smallest possible fix, so perhaps for backporting
instead of the gcc/config/i386/ changes

--- gcc/config/i386/sse.md.jj 2019-12-27 18:16:48.146431083 +0100
+++ gcc/config/i386/sse.md 2019-12-28 16:54:09.536217497 +0100
@@ -8782,7 +8782,8 @@ (define_expand "avx_vextractf128"
 })

 (define_insn "vec_extract_lo_"
- [(set (match_operand: 0 "nonimmediate_operand" "=v,v,m")
+ [(set (match_operand: 0 ""
+ "=v,v,")
 (vec_select:
 (match_operand:V16FI 1 "" "v,,v")
@@ -8834,7 +8835,8 @@ (define_split
 })

 (define_insn "vec_extract_lo_"
- [(set (match_operand: 0 "" "=v,v,m")
+ [(set (match_operand: 0 ""
+ "=v,v,")
 (vec_select:
 (match_operand:VI8F_256 1 "" "v,,v")
@@ -8844,7 +8846,7 @@ (define_insn "vec_extract_lo_
 || !(MEM_P (operands[0]) && MEM_P (operands[1])))"
 {
 if ()
-return "vextract64x2\t{$0x0, %1, %0%{%3%}|%0%{%3%}, %1, 0x0}";
+return "vextract64x2\t{$0x0, %1, %0|%0, %1, 0x0}";
 else
 return "#";
 }

might be better, i.e. insist on _maskm patterns for masked operations into
memory and force non-memory destination for the rest of vector extractions,
plus the %{%3%} instead of I think can't work properly if zero masking is
needed into register destination.
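The last point of this comment, that a hard-coded `%{%3%}` in the output template can never produce the `{z}` marker, can be illustrated with a small formatter. This is a hedged sketch in plain C, not GCC's output-template machinery: `format_mask_suffix` is a hypothetical helper, and the AT&T-style suffix it builds is modelled on the `{%k1}{z}` operand quoted in the error message.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper, not part of GCC: writes the AT&T-syntax masking
   suffix for a destination operand into buf and returns its length.
   A template that always prints only "{%kN}" corresponds to calling
   this with zero_masking == 0, so "{z}" is silently lost. */
int format_mask_suffix(char *buf, size_t size, int kreg, int zero_masking)
{
    assert(kreg >= 1 && kreg <= 7); /* %k0 cannot be used as a write mask */
    return snprintf(buf, size, "{%%k%d}%s", kreg,
                    zero_masking ? "{z}" : "");
}
```

With `kreg == 1`, the merge-masking form yields `{%k1}` and the zero-masking form yields `{%k1}{z}`; only the second requests that masked-out elements of a register destination be cleared.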
[Bug target/93069] Assembler messages: Error: unsupported masking for `vextracti32x8'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93069

--- Comment #2 from Jakub Jelinek ---
Created attachment 47556
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=47556&action=edit
gcc10-pr93069.patch

Untested fix.
[Bug target/93069] Assembler messages: Error: unsupported masking for `vextracti32x8'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93069

Jakub Jelinek changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                   |ASSIGNED
   Last reconfirmed|                              |2019-12-28
                 CC|                              |jakub at gcc dot gnu.org
           Assignee|unassigned at gcc dot gnu.org |jakub at gcc dot gnu.org
     Ever confirmed|0                             |1

--- Comment #1 from Jakub Jelinek ---
Indeed, if the destination is memory, {z} is not allowed.
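For readers unfamiliar with the two masking modes, the difference can be shown with a scalar model. The code below is an illustrative sketch only: `masked_write` and `masking_allowed` are hypothetical names, the loop imitates element-wise write-masking rather than executing any AVX-512 instruction, and the second function simply restates the rule from this comment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of AVX-512 write-masking; not compiler or
   assembler code. */
enum mask_mode { MERGE_MASKING, ZERO_MASKING };

/* Write up to 16 32-bit elements from src to dst under mask k.
   Elements whose mask bit is clear keep their old value (merging)
   or are set to zero (zeroing). */
void masked_write(uint32_t *dst, const uint32_t *src, size_t n,
                  uint16_t k, enum mask_mode mode)
{
    assert(n <= 16);
    for (size_t i = 0; i < n; i++) {
        if (k & (1u << i))
            dst[i] = src[i];
        else if (mode == ZERO_MASKING)
            dst[i] = 0;
        /* MERGE_MASKING: dst[i] is left untouched */
    }
}

/* Rule stated in the thread: a memory destination permits only
   merge-masking, so {z} is rejected there. */
int masking_allowed(int dest_is_memory, enum mask_mode mode)
{
    return !(dest_is_memory && mode == ZERO_MASKING);
}
```

Merge-masking a store leaves the unselected memory elements untouched, whereas zero-masking would have to write zeros to them, which is the form that is not allowed for a memory destination.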