[Bug tree-optimization/112661] [14] RISC-V ICE: in duplicate_and_interleave, at tree-vect-slp.cc:8025 with maxval_char_3.f90 vlen256b since r14-5101-g60034ecf25597b
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112661

Richard Sandiford changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #17 from Richard Sandiford ---
Assuming fixed. Please reopen if there are lingering issues.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112661

--- Comment #16 from GCC Commits ---
The trunk branch has been updated by Richard Sandiford:

https://gcc.gnu.org/g:061a82fa2b751b42d0d8ddfcd45367c848d3ee64

commit r14-5878-g061a82fa2b751b42d0d8ddfcd45367c848d3ee64
Author: Richard Sandiford
Date:   Mon Nov 27 13:38:16 2023 +

    vect: Avoid duplicate_and_interleave for uniform vectors [PR112661]

    can_duplicate_and_interleave_p checks whether we know a way of building
    a particular VLA SLP invariant. g:60034ecf25597bd515f skipped that test
    for booleans, to support MASK_LEN_GATHER_LOAD calls with a dummy
    all-ones mask. But there's nothing fundamentally different about VLA
    masks vs VLA data vectors. If we have a VLA mask that isn't all-ones,
    we need some way of loading it. This ultimately led to the ICE in the PR.

    This patch fixes it by applying can_duplicate_and_interleave_p to masks,
    while also adding a special path for uniform vectors (of all kinds)
    to support the MASK_LEN_GATHER_LOAD usage. This also fixes an XFAIL in
    pr36648.cc for SVE.

    The patch is mostly Richard's. My only changes were to skip redundant
    conversions and to use gimple_build_vector_from_val for all eligible
    vectors.

    2023-11-27  Richard Biener
                Richard Sandiford

    gcc/
            PR tree-optimization/112661
            * tree-vect-slp.cc (vect_get_and_check_slp_defs): Defer
            duplicate-and-interleave test to...
            (vect_build_slp_tree_2): ...here, once we have all the operands.
            Skip the test for uniform vectors.
            (vect_create_constant_vectors): Detect uniform vectors.
            Avoid redundant conversions in that case. Use
            gimple_build_vector_from_val to build the vector.

    gcc/testsuite/
            * g++.dg/vect/pr36648.cc: Remove XFAIL for VLA load-lanes.
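The rule the commit message describes can be pictured with a small standalone model. This is only a sketch under stated assumptions, not the code in tree-vect-slp.cc: `build_kind`, `classify_invariant` and the `target_can_interleave` flag are hypothetical names, and the flag merely stands in for whatever can_duplicate_and_interleave_p would answer for the group. Lanes are plain ints, so a mask and a data vector are treated alike.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// How a group of invariant lanes would be materialised in this toy model.
enum class build_kind { splat, interleave, reject };

// Hypothetical model, not GCC code.  `lanes` holds the scalar values of one
// SLP invariant; `target_can_interleave` stands in for the answer that
// can_duplicate_and_interleave_p would give for this group.
build_kind classify_invariant (const std::vector<int> &lanes,
                               bool target_can_interleave)
{
  assert (!lanes.empty ());

  // Uniform groups take the special path: one value broadcast to every lane.
  bool uniform = std::all_of (lanes.begin (), lanes.end (),
                              [&] (int v) { return v == lanes.front (); });
  if (uniform)
    return build_kind::splat;

  // Everything else, masks included, needs the general mechanism.
  return target_can_interleave ? build_kind::interleave : build_kind::reject;
}
```

In this model a dummy all-ones mask is accepted whatever the target supports, while a non-uniform group such as { 1, 1, 1, 0, 1 } is accepted only when the general mechanism is available, regardless of its element type.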
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112661

Richard Sandiford changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|rguenth at gcc dot gnu.org  |rsandifo at gcc dot gnu.org

--- Comment #15 from Richard Sandiford ---
Sure. SVE doesn't exhibit the ICE as far as I can tell, but a modified
version passes testing on SVE, and seems to fix an XFAIL. I'll test more
widely overnight and post tomorrow.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112661

--- Comment #14 from Richard Biener ---
(In reply to Richard Sandiford from comment #13)
> In vect_create_constant_vectors, I think the uniform_elt needs
> to come first, and needs to be used irrespective of whether the
> number of elements is constant. The general tree_vector_builder
> has a more general pattern than 1 duplicated element.

Can you take it from here, since I have limited means to test?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112661

--- Comment #13 from Richard Sandiford ---
In vect_create_constant_vectors, I think the uniform_elt needs
to come first, and needs to be used irrespective of whether the
number of elements is constant. The general tree_vector_builder
has a more general pattern than 1 duplicated element.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112661

--- Comment #12 from Richard Biener ---
Created attachment 56668
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56668&action=edit
patch (not working)

So this tries that, moving the duplicate-and-interleave check and changing
code generation. It seems though that gimple_build_vector_from_val only uses
VEC_DUPLICATE_EXPR for non-constants, but tree-vector-builder doesn't like to
build the uniform constant and we ICE:

internal compiler error: in finalize, at vector-builder.h:513
0x1e36958 vector_builder::finalize()
        /space/rguenther/src/gcc/gcc/vector-builder.h:513
0x1e36598 tree_vector_builder::build()
        /space/rguenther/src/gcc/gcc/tree-vector-builder.cc:42
0x15dc80a gimple_build_vector(gimple_stmt_iterator*, bool, gsi_iterator_update, unsigned int, tree_vector_builder*)
        /space/rguenther/src/gcc/gcc/gimple-fold.cc:9256
0x1ddb2e7 gimple_build_vector(gimple**, tree_vector_builder*)
        /space/rguenther/src/gcc/gcc/gimple-fold.h:241
0x1e0d6f5 vect_create_constant_vectors
        /space/rguenther/src/gcc/gcc/tree-vect-slp.cc:8261

That's the assert

508   void
509   vector_builder::finalize ()
510   {
511     /* The encoding requires the same number of elements to come from each
512        pattern.  */
513     gcc_assert (multiple_p (m_full_nelts, m_npatterns));

I can of course try to manually build a VEC_DUPLICATE here, but I wonder if
we're on the right track.
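As a rough illustration of the condition that assert enforces, the sketch below models a variable-length element count as c0 + c1 * x, where x is a non-negative value unknown at compile time. `toy_poly` and `known_multiple` are hypothetical names invented for this example. This is not GCC's poly_int or multiple_p implementation, and it makes no claim about which values the failing patch actually passed in.

```cpp
#include <cassert>

// Toy stand-in for a variable-length element count c0 + c1 * x, where x is
// a non-negative runtime value that is unknown at compile time.
struct toy_poly
{
  unsigned c0;
  unsigned c1;
};

// True only if c0 + c1 * x is a multiple of npatterns for every x >= 0.
// Taking x = 0 forces c0 to be divisible, and x = 1 then forces c1 to be
// divisible too, so checking both coefficients is sufficient.
bool known_multiple (toy_poly nelts, unsigned npatterns)
{
  assert (npatterns != 0);
  return nelts.c0 % npatterns == 0 && nelts.c1 % npatterns == 0;
}
```

Under this model a single repeated pattern (npatterns of 1) always satisfies the condition, whereas a count of 4 + 4x split across 3 patterns does not, because no choice of x makes the requirement hold for all lengths at once.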
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112661

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org
             Status|NEW                         |ASSIGNED

--- Comment #11 from Richard Biener ---
OK, I'll give that a try then.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112661

--- Comment #10 from Richard Sandiford ---
(In reply to Richard Biener from comment #9)
> So do we expect - independent of whether a constant/external is used as mask
> - that uniform constants/externals are generatable and thus we can elide the
> check for those? Possibly also go a different path during code-generation
> then? (because that will otherwise assert)

Yeah, I think so. At the time, I don't think there were any cases where
treating uniform values differently would have helped, and it wasn't a
trivial thing to test on the fly. But now we have a reason to try :)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112661

--- Comment #9 from Richard Biener ---
(In reply to Richard Sandiford from comment #8)
> I think we're going down the wrong path here. If I've understood
> the original change correctly, dummy masks aren't special because
> they're masks. They're special because all elements are equal to
> the same value. A mask such as:
>
>   { 1, 1, 1, 0, 1 }
>
> would not be OK, just like an integer vector with those values would
> not be OK.
>
> So IMO we should check whether all elements are equal, rather than
> whether the type is one thing or another.

So do we expect - independent of whether a constant/external is used as mask
- that uniform constants/externals are generatable and thus we can elide the
check for those? Possibly also go a different path during code-generation
then? (because that will otherwise assert)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112661

--- Comment #8 from Richard Sandiford ---
I think we're going down the wrong path here. If I've understood
the original change correctly, dummy masks aren't special because
they're masks. They're special because all elements are equal to
the same value. A mask such as:

  { 1, 1, 1, 0, 1 }

would not be OK, just like an integer vector with those values would
not be OK.

So IMO we should check whether all elements are equal, rather than
whether the type is one thing or another.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112661

--- Comment #7 from Richard Biener ---
diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index 4a09b3c2aca..d0967240ae3 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -766,7 +766,10 @@ vect_get_and_check_slp_defs (vec_info *vinfo, unsigned char swap,
       if ((dt == vect_constant_def || dt == vect_external_def)
          && !GET_MODE_SIZE (vinfo->vector_mode).is_constant ()
-         && TREE_CODE (type) != BOOLEAN_TYPE
+         && (!is_gimple_call (stmt_info->stmt)
+             || !gimple_call_internal_p (stmt_info->stmt)
+             || internal_fn_mask_index
+                  (gimple_call_internal_fn (stmt_info->stmt)) != opno)
          && !can_duplicate_and_interleave_p (vinfo, stmts.length (), type))
        {
          if (dump_enabled_p ())

fixes the testcase, not sure if it still resolves the issue fixed with
the original change.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112661

--- Comment #6 from Richard Biener ---
As suggested in the review at the time, the change would ideally be
restricted to actual mask operands, not random BOOLEAN_TYPE ones.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112661

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[14] RISC-V ICE: in         |[14] RISC-V ICE: in
                   |duplicate_and_interleave,   |duplicate_and_interleave,
                   |at tree-vect-slp.cc:8025    |at tree-vect-slp.cc:8025
                   |with maxval_char_3.f90      |with maxval_char_3.f90
                   |vlen256b                    |vlen256b since
                   |                            |r14-5101-g60034ecf25597b

--- Comment #5 from Richard Biener ---
Btw, a fallout of r14-5101-g60034ecf25597b.