[Bug target/98550] [11 Regression] ICE in exact_div, at poly-int.h:2219 on s390x-linux-gnu

2021-01-12 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98550

Richard Biener  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #11 from Richard Biener  ---
Fixed.

[Bug target/98550] [11 Regression] ICE in exact_div, at poly-int.h:2219 on s390x-linux-gnu

2021-01-12 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98550

--- Comment #10 from CVS Commits  ---
The master branch has been updated by Richard Biener :

https://gcc.gnu.org/g:52a170b1a1818b7521c25e76271638a448b3f630

commit r11-6613-g52a170b1a1818b7521c25e76271638a448b3f630
Author: Richard Biener 
Date:   Tue Jan 12 11:17:33 2021 +0100

tree-optimization/98550 - fix BB vect unrolling check

This fixes the check that disqualifies BB vectorization because of
required unrolling to match up with the later exact_div we do.  To
not disable the ability to split groups that do not match up
exactly with a chosen vector type this also introduces a soft-fail
mechanism to vect_build_slp_tree_1 which delays failing to after
the matches[] array is populated from other checks and only then
determines the split point according to the vector type.

2021-01-12  Richard Biener  

PR tree-optimization/98550
* tree-vect-slp.c (vect_record_max_nunits): Check whether
the group size is a multiple of the vector element count.
(vect_build_slp_tree_1): When we need to fail because
the vector type chosen causes unrolling do so lazily
without affecting matches only at the end to guide group splitting.

* g++.dg/opt/pr98550.C: New testcase.
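
The "determine the split point according to the vector type" step from the commit message can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not the code in tree-vect-slp.c: `split_point` is a hypothetical helper, plain integers stand in for poly_int values, and the lane count is assumed to be a nonzero constant. It returns how many leading statements of a group fill whole vectors; the remainder is the tail a later group split would have to handle.

```cpp
#include <cstdint>

// Hypothetical helper, not a GCC function: the number of leading group
// members that fill whole vectors of NUNITS lanes.  A result of 0 means
// even one vector cannot be filled (the group is smaller than the vector).
constexpr std::uint64_t split_point(std::uint64_t group_size,
                                    std::uint64_t nunits) {
    // Assumes nunits != 0; drop the tail that does not fill a vector.
    return group_size - group_size % nunits;
}
```

With a 4-lane vector type, a group of 9 would split after 8 statements and a group of 6 after 4, while a group of 3 yields 0 and cannot be vectorized without unrolling.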

[Bug target/98550] [11 Regression] ICE in exact_div, at poly-int.h:2219 on s390x-linux-gnu

2021-01-12 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98550

Richard Biener  changed:

   What|Removed |Added

  Attachment #49948 is obsolete|0   |1

--- Comment #9 from Richard Biener  ---
Created attachment 49951
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49951&action=edit
patch

So while technically correct this interferes with group splitting:

FAIL: gcc.dg/vect/bb-slp-19.c -flto -ffat-lto-objects  scan-tree-dump-times slp2 "optimized: basic block" 1
FAIL: gcc.dg/vect/bb-slp-19.c scan-tree-dump-times slp2 "optimized: basic block" 1
FAIL: gcc.dg/vect/bb-slp-pr58135.c -flto -ffat-lto-objects  scan-tree-dump-times slp2 "optimized: basic block" 1
FAIL: gcc.dg/vect/bb-slp-pr58135.c scan-tree-dump-times slp2 "optimized: basic block" 1

bb-slp-19.c has a grouped store of size 9 where we immediately fail due to
the check which is "fatal" as in

  tree nunits_vectype;
  if (!vect_get_vector_types_for_stmt (vinfo, stmt_info, &vectype,
                                       &nunits_vectype, group_size)
      || (nunits_vectype
          && !vect_record_max_nunits (vinfo, stmt_info, group_size,
                                      nunits_vectype, max_nunits)))
    {
      if (is_a <bb_vec_info> (vinfo) && i != 0)
        continue;
      /* Fatal mismatch.  */
      matches[0] = false;
      return false;

which means we do not re-try with splitting the store group up.

Now, starting analysis with a group size of 9 is never going to
succeed.  Also note that get_vectype_for_scalar_type mimics the old
test:

  /* If the natural choice of vector type doesn't satisfy GROUP_SIZE,
     try again with an explicit number of elements.  */
  if (vectype
      && group_size
      && maybe_ge (TYPE_VECTOR_SUBPARTS (vectype), group_size))
    {

but the fix must be in how we discover splitting opportunities, I guess.
One option is to only "soft-fail" for max_nunits issues, but then analysis
would still stop at the store (maybe that's even good - who knows, but
how far the match is good is only going to be determined later).

[Bug target/98550] [11 Regression] ICE in exact_div, at poly-int.h:2219 on s390x-linux-gnu

2021-01-12 Thread rsandifo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98550

--- Comment #8 from rsandifo at gcc dot gnu.org ---
(In reply to Richard Biener from comment #6)
> I guess it should be a !multiple_p (group_size, nunits) check instead?
Sounds plausible :-)

[Bug target/98550] [11 Regression] ICE in exact_div, at poly-int.h:2219 on s390x-linux-gnu

2021-01-12 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98550

--- Comment #7 from Richard Biener  ---
Created attachment 49948
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49948&action=edit
patch

untested patch

[Bug target/98550] [11 Regression] ICE in exact_div, at poly-int.h:2219 on s390x-linux-gnu

2021-01-12 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98550

Richard Biener  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |rguenth at gcc dot gnu.org
 CC||rsandifo at gcc dot gnu.org

--- Comment #6 from Richard Biener  ---
So it changes the type from NULL to v4si and the conversion node is

t.ii:87:6: note: node 0x30b4080 (max_nunits=4, refcnt=1)
t.ii:87:6: note: op template: patt_41 = (long int) _10;
t.ii:87:6: note:   stmt 0 patt_41 = (long int) _10;
t.ii:87:6: note:   stmt 1 patt_47 = (long int) _10;
t.ii:87:6: note:   stmt 2 patt_53 = (long int) _10;
t.ii:87:6: note:   stmt 3 patt_18 = (long int) _10;
t.ii:87:6: note:   children 0x30b4100
$15 = void

already anticipating max_nunits == 4.  The problematic node is

t.ii:87:6: note: node 0x30b3d80 (max_nunits=4, refcnt=1)
t.ii:87:6: note: op template: patt_41 = (long int) _10;
t.ii:87:6: note:   stmt 0 patt_41 = (long int) _10;
t.ii:87:6: note:   stmt 1 patt_47 = (long int) _10;
t.ii:87:6: note:   stmt 2 patt_53 = (long int) _10;
t.ii:87:6: note:   stmt 3 patt_18 = (long int) _10;
t.ii:87:6: note:   stmt 4 patt_23 = (long int) _10;
t.ii:87:6: note:   stmt 5 patt_27 = (long int) _10;
t.ii:87:6: note:   children 0x30b3e00

where the issue is that we fail to reject this for BB vectorization because
it would need unrolling.  Looks like the test in vect_record_max_nunits
isn't catching this case:

  /* If populating the vector type requires unrolling then fail
     before adjusting *max_nunits for basic-block vectorization.  */
  poly_uint64 nunits = TYPE_VECTOR_SUBPARTS (vectype);
  unsigned HOST_WIDE_INT const_nunits;
  if (is_a <bb_vec_info> (vinfo)
      && (!nunits.is_constant (&const_nunits)
          || const_nunits > group_size))
    {
      if (dump_enabled_p ())
        dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
                         "Build SLP failed: unrolling required "
                         "in basic block SLP\n");
      /* Fatal mismatch.  */
      return false;
    }

I guess it should be a !multiple_p (group_size, nunits) check instead?
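
To see why the quoted test lets the six-statement node through, the two conditions can be compared on plain integers. This is a minimal sketch, assuming constant lane counts only: `old_check_rejects` and `new_check_rejects` are hypothetical names, and the modulo test is merely a stand-in for `!multiple_p (group_size, nunits)`, not GCC's poly_int implementation.

```cpp
#include <cstdint>

// Model of the quoted condition: reject only when the vector has more
// lanes than the group has statements.
constexpr bool old_check_rejects(std::uint64_t group_size,
                                 std::uint64_t nunits) {
    return nunits > group_size;
}

// Model of the suggested condition: reject whenever the group size is not
// a whole multiple of the lane count (constant-only stand-in for
// !multiple_p (group_size, nunits)).
constexpr bool new_check_rejects(std::uint64_t group_size,
                                 std::uint64_t nunits) {
    return nunits == 0 || group_size % nunits != 0;
}
```

For group_size 6 and 4 lanes the old condition accepts the node (4 > 6 is false) while the multiple-of condition rejects it; for group sizes smaller than the lane count both reject.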

[Bug target/98550] [11 Regression] ICE in exact_div, at poly-int.h:2219 on s390x-linux-gnu

2021-01-12 Thread krebbel at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98550

Andreas Krebbel  changed:

   What|Removed |Added

 Status|WAITING |NEW

--- Comment #5 from Andreas Krebbel  ---
With the patch, the sign extension of the 6 shift count operands from int to
long int is now marked as vect_external_def. This causes the vectype field in
the SLP node to be bumped from "vector 2 int" to "vector 4 int" in:
vectorizable_conversion->vect_maybe_update_slp_op_vectype

This then triggers the ICE when trying to divide vf*group_size (which is 1*6
here) by the number of elements in the vector type (now 4) in
vect_slp_analyze_node_operations.

Is changing the vectype field of an slp node to a type with a different number
of elements actually valid?


slp1:


  bb$dh_5 = D.4123.dh;
  _10 = MEM[(int *)bb$dh_5];
  pretmp_62 = a.cp[1];
  pretmp_79 = a.cp[2];
  pretmp_31 = a.cp[3];
  pretmp_39 = a.cp[4];
  pretmp_16 = a.cp[5];
  pretmp_19 = a.cp[6];
  goto ; [100.00%]

   [local count: 1014686041]:
  _20 = prephitmp_78 >> _10;
  a.cp[1] = _20;
  _22 = prephitmp_80 >> _10;
  a.cp[2] = _22;
  _24 = prephitmp_32 >> _10;
  a.cp[3] = _24;
  _26 = prephitmp_40 >> _10;
  a.cp[4] = _26;
  _28 = prephitmp_17 >> _10;
  a.cp[5] = _28;
  _30 = prephitmp_11 >> _10;
  a.cp[6] = _30;
  cn ={v} {CLOBBER};
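
The arithmetic behind the ICE described in this comment (vf * group_size = 1 * 6 divided by the element count of the vector type) can be modelled in a few lines. This is a sketch, not GCC code: `divides_exactly` and `vectors_needed` are hypothetical names, plain integers replace poly_int, and a thrown exception stands in for the internal assertion that exact_div trips.

```cpp
#include <cstdint>
#include <stdexcept>

// Precondition of an exact division: no remainder is left over.
constexpr bool divides_exactly(std::uint64_t a, std::uint64_t b) {
    return b != 0 && a % b == 0;
}

// Hypothetical model of computing the vector count for an SLP node as
// vf * group_size / nunits.  A remainder is treated as a hard error here,
// mirroring how a checked exact division refuses inexact inputs.
constexpr std::uint64_t vectors_needed(std::uint64_t vf,
                                       std::uint64_t group_size,
                                       std::uint64_t nunits) {
    return divides_exactly(vf * group_size, nunits)
               ? vf * group_size / nunits
               : throw std::logic_error("inexact division");
}
```

With 2 elements per vector the node needs 1 * 6 / 2 = 3 vectors; once the type is bumped to 4 elements, 6 is no longer divisible and the division has no valid result.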

[Bug target/98550] [11 Regression] ICE in exact_div, at poly-int.h:2219 on s390x-linux-gnu

2021-01-12 Thread krebbel at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98550

--- Comment #4 from Andreas Krebbel  ---
The problem occurs starting with:

commit 1e1e1edf88a7c40ae4ae0de9e6077179e13ccf6d
Author: Richard Biener 
Date:   Thu Oct 29 08:48:15 2020 +0100

More BB vectorization tweaks

This tweaks the op build from splats to allow loads marked as not
vectorizable.  It also amends some dump prints with the address of
the SLP node or the instance to better be able to debug things.

2020-10-29  Richard Biener  

* tree-vect-slp.c (vect_build_slp_tree_2): Allow splatting
not vectorizable loads.
(vect_build_slp_instance): Amend dumping with address.
(vect_slp_convert_to_external): Likewise.

* gcc.dg/vect/bb-slp-pr65935.c: Adjust.

[Bug target/98550] [11 Regression] ICE in exact_div, at poly-int.h:2219 on s390x-linux-gnu

2021-01-12 Thread krebbel at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98550

--- Comment #3 from Andreas Krebbel  ---
Created attachment 49944
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49944&action=edit
Reduced testcase

This testcase fails on bcb3065b2ba with
cc1plus t.cpp -march=z13 -O3

[Bug target/98550] [11 Regression] ICE in exact_div, at poly-int.h:2219 on s390x-linux-gnu

2021-01-06 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98550

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|--- |11.0

[Bug target/98550] [11 Regression] ICE in exact_div, at poly-int.h:2219 on s390x-linux-gnu

2021-01-05 Thread marxin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98550

Martin Liška  changed:

   What|Removed |Added

 Status|ASSIGNED|WAITING
   Assignee|marxin at gcc dot gnu.org  |unassigned at gcc dot gnu.org

--- Comment #2 from Martin Liška  ---
But I can't reproduce that with the current master with a cross compiler.
Maybe it's fixed?

[Bug target/98550] [11 Regression] ICE in exact_div, at poly-int.h:2219 on s390x-linux-gnu

2021-01-05 Thread marxin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98550

Martin Liška  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
 Ever confirmed|0   |1
   Assignee|unassigned at gcc dot gnu.org  |marxin at gcc dot gnu.org
   Last reconfirmed||2021-01-06
 CC||marxin at gcc dot gnu.org

--- Comment #1 from Martin Liška  ---
Reducing that right now..