[Bug tree-optimization/112661] [14] RISC-V ICE: in duplicate_and_interleave, at tree-vect-slp.cc:8025 with maxval_char_3.f90 vlen256b since r14-5101-g60034ecf25597b

2023-11-27 Thread rsandifo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112661

Richard Sandiford  changed:

           What|Removed                     |Added

         Status|ASSIGNED                    |RESOLVED
     Resolution|---                         |FIXED

--- Comment #17 from Richard Sandiford  ---
Assuming fixed.  Please reopen if there are lingering issues.

[Bug tree-optimization/112661] [14] RISC-V ICE: in duplicate_and_interleave, at tree-vect-slp.cc:8025 with maxval_char_3.f90 vlen256b since r14-5101-g60034ecf25597b

2023-11-27 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112661

--- Comment #16 from GCC Commits  ---
The trunk branch has been updated by Richard Sandiford :

https://gcc.gnu.org/g:061a82fa2b751b42d0d8ddfcd45367c848d3ee64

commit r14-5878-g061a82fa2b751b42d0d8ddfcd45367c848d3ee64
Author: Richard Sandiford 
Date:   Mon Nov 27 13:38:16 2023 +

vect: Avoid duplicate_and_interleave for uniform vectors [PR112661]

can_duplicate_and_interleave_p checks whether we know a way of
building a particular VLA SLP invariant.  g:60034ecf25597bd515f
skipped that test for booleans, to support MASK_LEN_GATHER_LOAD
calls with a dummy all-ones mask.  But there's nothing fundamentally
different about VLA masks vs VLA data vectors.  If we have a VLA mask
that isn't all-ones, we need some way of loading it.  This ultimately
led to the ICE in the PR.

This patch fixes it by applying can_duplicate_and_interleave_p
to masks, while also adding a special path for uniform vectors
(of all kinds) to support the MASK_LEN_GATHER_LOAD usage.  This
also fixes an XFAIL in pr36648.cc for SVE.

The patch is mostly Richard's.  My only changes were to skip
redundant conversions and to use gimple_build_vector_from_val
for all eligible vectors.

2023-11-27  Richard Biener  
Richard Sandiford  

gcc/
PR tree-optimization/112661
* tree-vect-slp.cc (vect_get_and_check_slp_defs): Defer
duplicate-and-interleave test to...
(vect_build_slp_tree_2): ...here, once we have all the operands.
Skip the test for uniform vectors.
(vect_create_constant_vectors): Detect uniform vectors.  Avoid
redundant conversions in that case.  Use gimple_build_vector_from_val
to build the vector.

gcc/testsuite/
* g++.dg/vect/pr36648.cc: Remove XFAIL for VLA load-lanes.

[Bug tree-optimization/112661] [14] RISC-V ICE: in duplicate_and_interleave, at tree-vect-slp.cc:8025 with maxval_char_3.f90 vlen256b since r14-5101-g60034ecf25597b

2023-11-26 Thread rsandifo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112661

Richard Sandiford  changed:

           What|Removed                     |Added

       Assignee|rguenth at gcc dot gnu.org  |rsandifo at gcc dot gnu.org

--- Comment #15 from Richard Sandiford  ---
Sure.  SVE doesn't exhibit the ICE as far as I can tell, but a modified
version passes testing on SVE, and seems to fix an XFAIL.  I'll test
more widely overnight and post tomorrow.

[Bug tree-optimization/112661] [14] RISC-V ICE: in duplicate_and_interleave, at tree-vect-slp.cc:8025 with maxval_char_3.f90 vlen256b since r14-5101-g60034ecf25597b

2023-11-24 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112661

--- Comment #14 from Richard Biener  ---
(In reply to Richard Sandiford from comment #13)
> In vect_create_constant_vectors, I think the uniform_elt needs
> to come first, and needs to be used irrespective of whether the
> number of elements is constant.  The general tree_vector_builder
> has a more general pattern than 1 duplicated element.

can you take it from here since I have limited means to test?

[Bug tree-optimization/112661] [14] RISC-V ICE: in duplicate_and_interleave, at tree-vect-slp.cc:8025 with maxval_char_3.f90 vlen256b since r14-5101-g60034ecf25597b

2023-11-24 Thread rsandifo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112661

--- Comment #13 from Richard Sandiford  ---
In vect_create_constant_vectors, I think the uniform_elt needs
to come first, and needs to be used irrespective of whether the
number of elements is constant.  The general tree_vector_builder
has a more general pattern than 1 duplicated element.
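
A minimal sketch of the ordering suggested here, assuming a toy model rather
than the real vect_create_constant_vectors: the uniform case is tested first
and wins whether or not the element count is constant.  The names
build_strategy and choose_strategy are hypothetical, and the three boolean
inputs stand in for queries the vectorizer would make itself.

```cpp
// Toy model only; none of these names exist in GCC.
enum class build_strategy
{
  splat,         // one value duplicated into every lane
  element_list,  // fixed-length vector built element by element
  interleave,    // variable-length vector built by duplicate-and-interleave
  reject         // no known way to build the vector
};

constexpr build_strategy
choose_strategy (bool is_uniform, bool constant_nelts, bool can_interleave)
{
  // The uniform test comes first and does not depend on constant_nelts.
  if (is_uniform)
    return build_strategy::splat;
  // Only non-uniform operands reach the general paths below.
  if (constant_nelts)
    return build_strategy::element_list;
  return can_interleave ? build_strategy::interleave
                        : build_strategy::reject;
}
```

In this model a uniform operand never reaches the interleave path, so it
cannot be rejected just because no interleaving scheme is available.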

[Bug tree-optimization/112661] [14] RISC-V ICE: in duplicate_and_interleave, at tree-vect-slp.cc:8025 with maxval_char_3.f90 vlen256b since r14-5101-g60034ecf25597b

2023-11-22 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112661

--- Comment #12 from Richard Biener  ---
Created attachment 56668
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56668&action=edit
patch  (not working)

So this patch attempts that, moving the duplicate-and-interleave check and
changing code generation.  It seems, though, that gimple_build_vector_from_val
only uses VEC_DUPLICATE_EXPR for non-constants, while tree-vector-builder
doesn't like building the uniform constant, so we ICE:

internal compiler error: in finalize, at vector-builder.h:513
0x1e36958 vector_builder::finalize()
  /space/rguenther/src/gcc/gcc/vector-builder.h:513
0x1e36598 tree_vector_builder::build()
  /space/rguenther/src/gcc/gcc/tree-vector-builder.cc:42
0x15dc80a gimple_build_vector(gimple_stmt_iterator*, bool, gsi_iterator_update, unsigned int, tree_vector_builder*)
  /space/rguenther/src/gcc/gcc/gimple-fold.cc:9256
0x1ddb2e7 gimple_build_vector(gimple**, tree_vector_builder*)
  /space/rguenther/src/gcc/gcc/gimple-fold.h:241
0x1e0d6f5 vect_create_constant_vectors
  /space/rguenther/src/gcc/gcc/tree-vect-slp.cc:8261

that's the assert

508  void
509  vector_builder::finalize ()
510  {
511    /* The encoding requires the same number of elements to come from each
512       pattern.  */
513    gcc_assert (multiple_p (m_full_nelts, m_npatterns));

I can of course try to manually build a VEC_DUPLICATE here but I wonder
if we're on the right track here.
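
To illustrate the kind of condition that assert expresses, here is a toy
model, assuming a variable-length element count of the form base + step * x
for an unknown non-negative runtime x.  The names poly_count,
known_multiple_p and encoding_ok_p are hypothetical, and the numbers in the
usage note are illustrative, not the values from this PR.

```cpp
#include <cassert>

// Toy model of a variable-length element count: base + step * x, where x
// is a non-negative value that is only known at run time.
struct poly_count
{
  unsigned base;
  unsigned step;
};

// True only if the count is a multiple of N for every possible x, which
// holds exactly when both coefficients are multiples of N.
static bool
known_multiple_p (poly_count nelts, unsigned n)
{
  assert (n != 0);
  return nelts.base % n == 0 && nelts.step % n == 0;
}

// Same shape as the invariant quoted above: each pattern must supply the
// same number of elements, so npatterns must divide the full count.
static bool
encoding_ok_p (poly_count full_nelts, unsigned npatterns)
{
  return known_multiple_p (full_nelts, npatterns);
}
```

In this model, a count of 4 + 4x passes with 1 or 2 patterns but fails with
3, which is the kind of mismatch that would trip the assert.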

[Bug tree-optimization/112661] [14] RISC-V ICE: in duplicate_and_interleave, at tree-vect-slp.cc:8025 with maxval_char_3.f90 vlen256b since r14-5101-g60034ecf25597b

2023-11-22 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112661

Richard Biener  changed:

           What|Removed                       |Added

       Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org
         Status|NEW                           |ASSIGNED

--- Comment #11 from Richard Biener  ---
OK, I'll give that a try then.

[Bug tree-optimization/112661] [14] RISC-V ICE: in duplicate_and_interleave, at tree-vect-slp.cc:8025 with maxval_char_3.f90 vlen256b since r14-5101-g60034ecf25597b

2023-11-22 Thread rsandifo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112661

--- Comment #10 from Richard Sandiford  ---
(In reply to Richard Biener from comment #9)
> So do we expect - independent of whether a constant/external is used as mask
> - that uniform constants/externals are generatable and thus we can elide the
> check for those?  Possibly also go a different path during code-generation
> then?  (because that will otherwise assert)
Yeah, I think so.  At the time, I don't think there were any cases where
treating uniform values differently would have helped, and it wasn't a
trivial thing to test on the fly.  But now we have a reason to try :)

[Bug tree-optimization/112661] [14] RISC-V ICE: in duplicate_and_interleave, at tree-vect-slp.cc:8025 with maxval_char_3.f90 vlen256b since r14-5101-g60034ecf25597b

2023-11-22 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112661

--- Comment #9 from Richard Biener  ---
(In reply to Richard Sandiford from comment #8)
> I think we're going down the wrong path here.  If I've understood
> the original change correctly, dummy masks aren't special because
> they're masks.  They're special because all elements are equal to
> the same value.  A mask such as:
> 
>   { 1, 1, 1, 0, 1 }
> 
> would not be OK, just like an integer vector with those values would
> not be OK.
> 
> So IMO we should check whether all elements are equal, rather than
> whether the type is one thing or another.

So do we expect - independent of whether a constant/external is used as mask -
that uniform constants/externals are generatable and thus we can elide the
check for those?  Possibly also go a different path during code-generation
then?  (because that will otherwise assert)

[Bug tree-optimization/112661] [14] RISC-V ICE: in duplicate_and_interleave, at tree-vect-slp.cc:8025 with maxval_char_3.f90 vlen256b since r14-5101-g60034ecf25597b

2023-11-22 Thread rsandifo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112661

--- Comment #8 from Richard Sandiford  ---
I think we're going down the wrong path here.  If I've understood
the original change correctly, dummy masks aren't special because
they're masks.  They're special because all elements are equal to
the same value.  A mask such as:

  { 1, 1, 1, 0, 1 }

would not be OK, just like an integer vector with those values would
not be OK.

So IMO we should check whether all elements are equal, rather than
whether the type is one thing or another.
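
As a rough sketch of the proposed test, assuming the operands are modelled as
plain integers (all_elements_equal is a hypothetical helper, not a GCC
function):

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: true when every element of a non-empty operand
// list equals the first one.  The element type plays no part in the test.
static bool
all_elements_equal (const std::vector<int> &elts)
{
  assert (!elts.empty ());
  return std::all_of (elts.begin (), elts.end (),
                      [&elts] (int e) { return e == elts.front (); });
}
```

With this helper, the dummy all-ones mask { 1, 1, 1, 1, 1 } counts as
uniform, while { 1, 1, 1, 0, 1 } does not, whether it is read as a mask or
as an integer vector.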

[Bug tree-optimization/112661] [14] RISC-V ICE: in duplicate_and_interleave, at tree-vect-slp.cc:8025 with maxval_char_3.f90 vlen256b since r14-5101-g60034ecf25597b

2023-11-22 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112661

--- Comment #7 from Richard Biener  ---
diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index 4a09b3c2aca..d0967240ae3 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -766,7 +766,10 @@ vect_get_and_check_slp_defs (vec_info *vinfo, unsigned char swap,
          if ((dt == vect_constant_def
               || dt == vect_external_def)
              && !GET_MODE_SIZE (vinfo->vector_mode).is_constant ()
-             && TREE_CODE (type) != BOOLEAN_TYPE
+             && (!is_gimple_call (stmt_info->stmt)
+                 || !gimple_call_internal_p (stmt_info->stmt)
+                 || internal_fn_mask_index
+                      (gimple_call_internal_fn (stmt_info->stmt)) != opno)
              && !can_duplicate_and_interleave_p (vinfo, stmts.length (), type))
            {
              if (dump_enabled_p ())

This fixes the testcase; I'm not sure whether it still resolves the issue
fixed by the original change.

[Bug tree-optimization/112661] [14] RISC-V ICE: in duplicate_and_interleave, at tree-vect-slp.cc:8025 with maxval_char_3.f90 vlen256b since r14-5101-g60034ecf25597b

2023-11-22 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112661

--- Comment #6 from Richard Biener  ---
As suggested in the review at the time, the change would ideally be restricted
to actual mask operands, not random BOOLEAN_TYPE ones.

[Bug tree-optimization/112661] [14] RISC-V ICE: in duplicate_and_interleave, at tree-vect-slp.cc:8025 with maxval_char_3.f90 vlen256b since r14-5101-g60034ecf25597b

2023-11-22 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112661

Richard Biener  changed:

           What|Removed                     |Added

        Summary|[14] RISC-V ICE: in         |[14] RISC-V ICE: in
               |duplicate_and_interleave,   |duplicate_and_interleave,
               |at tree-vect-slp.cc:8025    |at tree-vect-slp.cc:8025
               |with maxval_char_3.f90      |with maxval_char_3.f90
               |vlen256b                    |vlen256b since
               |                            |r14-5101-g60034ecf25597b

--- Comment #5 from Richard Biener  ---
Btw, a fallout of r14-5101-g60034ecf25597b.