[Bug target/98550] [11 Regression] ICE in exact_div, at poly-int.h:2219 on s390x-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98550

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED

--- Comment #11 from Richard Biener ---
Fixed.
[Bug target/98550] [11 Regression] ICE in exact_div, at poly-int.h:2219 on s390x-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98550

--- Comment #10 from CVS Commits ---
The master branch has been updated by Richard Biener:

https://gcc.gnu.org/g:52a170b1a1818b7521c25e76271638a448b3f630

commit r11-6613-g52a170b1a1818b7521c25e76271638a448b3f630
Author: Richard Biener
Date:   Tue Jan 12 11:17:33 2021 +0100

    tree-optimization/98550 - fix BB vect unrolling check

    This fixes the check that disqualifies BB vectorization because of
    required unrolling to match up with the later exact_div we do.  To
    not disable the ability to split groups that do not match up
    exactly with a chosen vector type this also introduces a soft-fail
    mechanism to vect_build_slp_tree_1 which delays failing to after
    the matches[] array is populated from other checks and only then
    determines the split point according to the vector type.

    2021-01-12  Richard Biener

            PR tree-optimization/98550
            * tree-vect-slp.c (vect_record_max_nunits): Check whether
            the group size is a multiple of the vector element count.
            (vect_build_slp_tree_1): When we need to fail because
            the vector type chosen causes unrolling do so lazily
            without affecting matches only at the end to guide group
            splitting.

            * g++.dg/opt/pr98550.C: New testcase.
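
The soft-fail idea described in the commit message can be illustrated on a plain array of flags. This is only a sketch under stated assumptions, not the committed code: the function name mark_split_for_vector_type is hypothetical, the lane count is assumed to be a compile-time constant, and the matches vector is assumed to have been filled in by the other checks already.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch of a lazy ("soft") failure.  MATCHES holds one flag
// per statement of the group and is assumed to be populated already.
// NUNITS is the (assumed constant) lane count of the chosen vector type.
// Returns true when the group size already is a multiple of NUNITS;
// otherwise marks where the group would have to be split and returns false.
bool mark_split_for_vector_type (std::vector<bool> &matches,
                                 std::size_t nunits)
{
  assert (!matches.empty () && nunits != 0);
  const std::size_t group_size = matches.size ();

  // Not even one full vector: nothing to split off, treat as fatal.
  if (nunits > group_size)
    {
      matches[0] = false;
      return false;
    }

  // Statements past the last full vector are marked as mismatching so
  // that a caller can split the group in front of them.
  const std::size_t tail = group_size % nunits;
  for (std::size_t i = group_size - tail; i < group_size; ++i)
    matches[i] = false;
  return tail == 0;
}
```

For a group of 9 statements and an assumed 4-lane vector type, only the last flag is cleared, so a caller could retry with the first 8 statements. A group of 2 statements gets the fatal marking in matches[0].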
[Bug target/98550] [11 Regression] ICE in exact_div, at poly-int.h:2219 on s390x-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98550

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #49948|0                           |1
        is obsolete|                            |

--- Comment #9 from Richard Biener ---
Created attachment 49951
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49951&action=edit
patch

So while technically correct this interferes with group splitting:

FAIL: gcc.dg/vect/bb-slp-19.c -flto -ffat-lto-objects scan-tree-dump-times slp2 "optimized: basic block" 1
FAIL: gcc.dg/vect/bb-slp-19.c scan-tree-dump-times slp2 "optimized: basic block" 1
FAIL: gcc.dg/vect/bb-slp-pr58135.c -flto -ffat-lto-objects scan-tree-dump-times slp2 "optimized: basic block" 1
FAIL: gcc.dg/vect/bb-slp-pr58135.c scan-tree-dump-times slp2 "optimized: basic block" 1

bb-slp-19.c has a grouped store of size 9 where we immediately fail
due to the check which is "fatal" as in

      tree nunits_vectype;
      if (!vect_get_vector_types_for_stmt (vinfo, stmt_info, &vectype,
                                           &nunits_vectype, group_size)
          || (nunits_vectype
              && !vect_record_max_nunits (vinfo, stmt_info, group_size,
                                          nunits_vectype, max_nunits)))
        {
          if (is_a <bb_vec_info> (vinfo) && i != 0)
            continue;
          /* Fatal mismatch.  */
          matches[0] = false;
          return false;

which means we do not re-try with splitting the store group up.  Now,
starting analysis with a group size of 9 is never going to succeed.

Also note that get_vectype_for_scalar_type mimics the old test:

  /* If the natural choice of vector type doesn't satisfy GROUP_SIZE,
     try again with an explicit number of elements.  */
  if (vectype
      && group_size
      && maybe_ge (TYPE_VECTOR_SUBPARTS (vectype), group_size))
    {

but the fix must be to how we go discover splitting opportunities I guess.

One option is to only "soft-fail" for max_nunits issues, but then analysis
would still stop at the store (maybe that's even good - who knows, but
how far the match is good is only going to be determined later).
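
Why a fatal mismatch blocks the retry can be seen from how a caller might read the flags. The helper below is a hypothetical sketch, not GCC code: split_candidate and its -1 sentinel are invented for illustration, and it assumes that a cleared matches[0] means "give up" while the first cleared flag at a later index marks a possible split point.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: return the index at which the group could be
// split, or -1 when no split point is available.  A cleared first flag
// stands for the "fatal" case, in which no retry is attempted.
int split_candidate (const std::vector<bool> &matches)
{
  assert (!matches.empty ());
  if (!matches[0])
    return -1;                      // fatal mismatch, no retry
  for (std::size_t i = 1; i < matches.size (); ++i)
    if (!matches[i])
      return static_cast<int> (i);  // first mismatch after a good prefix
  return -1;                        // everything matched, nothing to split
}
```

With nine flags and only the first one cleared, the helper reports no split point, which corresponds to the store group of size 9 never being retried. With only the last flag cleared it reports index 8.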
[Bug target/98550] [11 Regression] ICE in exact_div, at poly-int.h:2219 on s390x-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98550

--- Comment #8 from rsandifo at gcc dot gnu.org ---
(In reply to Richard Biener from comment #6)
> I guess it should be a !multiple_p (group_size, nunits) check instead?

Sounds plausible :-)
[Bug target/98550] [11 Regression] ICE in exact_div, at poly-int.h:2219 on s390x-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98550

--- Comment #7 from Richard Biener ---
Created attachment 49948
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49948&action=edit
patch

untested patch
[Bug target/98550] [11 Regression] ICE in exact_div, at poly-int.h:2219 on s390x-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98550

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org
                 CC|                            |rsandifo at gcc dot gnu.org

--- Comment #6 from Richard Biener ---
So it changes the type from NULL to v4si and the conversion node is

t.ii:87:6: note: node 0x30b4080 (max_nunits=4, refcnt=1)
t.ii:87:6: note: op template: patt_41 = (long int) _10;
t.ii:87:6: note:     stmt 0 patt_41 = (long int) _10;
t.ii:87:6: note:     stmt 1 patt_47 = (long int) _10;
t.ii:87:6: note:     stmt 2 patt_53 = (long int) _10;
t.ii:87:6: note:     stmt 3 patt_18 = (long int) _10;
t.ii:87:6: note:     children 0x30b4100
$15 = void

already anticipating max_nunits == 4.  The problematic node is

t.ii:87:6: note: node 0x30b3d80 (max_nunits=4, refcnt=1)
t.ii:87:6: note: op template: patt_41 = (long int) _10;
t.ii:87:6: note:     stmt 0 patt_41 = (long int) _10;
t.ii:87:6: note:     stmt 1 patt_47 = (long int) _10;
t.ii:87:6: note:     stmt 2 patt_53 = (long int) _10;
t.ii:87:6: note:     stmt 3 patt_18 = (long int) _10;
t.ii:87:6: note:     stmt 4 patt_23 = (long int) _10;
t.ii:87:6: note:     stmt 5 patt_27 = (long int) _10;
t.ii:87:6: note:     children 0x30b3e00

where the issue is that we fail to reject this for BB vectorization because
it would need unrolling.  Looks like the test in vect_record_max_nunits
isn't catching this case:

  /* If populating the vector type requires unrolling then fail
     before adjusting *max_nunits for basic-block vectorization.  */
  poly_uint64 nunits = TYPE_VECTOR_SUBPARTS (vectype);
  unsigned HOST_WIDE_INT const_nunits;
  if (is_a <bb_vec_info> (vinfo)
      && (!nunits.is_constant (&const_nunits)
          || const_nunits > group_size))
    {
      if (dump_enabled_p ())
        dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
                         "Build SLP failed: unrolling required "
                         "in basic block SLP\n");
      /* Fatal mismatch.  */
      return false;
    }

I guess it should be a !multiple_p (group_size, nunits) check instead?
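
The difference between the existing test and the proposed one shows up for exactly the node above (6 statements, 4 lanes). The following sketch uses plain integers instead of poly_uint64 and assumes a constant lane count. Both function names are hypothetical and only mirror the two conditions discussed in this comment.

```cpp
#include <cassert>
#include <cstdint>

// Existing test, constant-nunits case only: reject when one vector has
// more lanes than the group has statements.
bool old_check_rejects (std::uint64_t group_size, std::uint64_t nunits)
{
  return nunits > group_size;
}

// Proposed test, modelled on !multiple_p (group_size, nunits): reject
// unless the group size is a multiple of the lane count.
bool proposed_check_rejects (std::uint64_t group_size, std::uint64_t nunits)
{
  assert (nunits != 0);
  return group_size % nunits != 0;
}
```

For group_size 6 and nunits 4 the old test lets the node through (4 is not greater than 6), while the proposed test rejects it because 6 is not a multiple of 4. For group_size 2 and nunits 4 both tests reject.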
[Bug target/98550] [11 Regression] ICE in exact_div, at poly-int.h:2219 on s390x-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98550

Andreas Krebbel changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |NEW

--- Comment #5 from Andreas Krebbel ---
With the patch the sign extension of 6 shift count operands from int to long
int is now marked as vect_external_def.  This causes the vectype field in
the SLP node to be bumped from "vector 2 int" to "vector 4 int" in:

vectorizable_conversion->vect_maybe_update_slp_op_vectype

This then triggers the ICE when trying to divide vf*group_size (which is 1*6
here) by the number of elements in the vector type (now 4) in
vect_slp_analyze_node_operations.

Is changing the vectype field of an SLP node to a type with a different
number of elements actually valid?

slp1:

  bb$dh_5 = D.4123.dh;
  _10 = MEM[(int *)bb$dh_5];
  pretmp_62 = a.cp[1];
  pretmp_79 = a.cp[2];
  pretmp_31 = a.cp[3];
  pretmp_39 = a.cp[4];
  pretmp_16 = a.cp[5];
  pretmp_19 = a.cp[6];
  goto ; [100.00%]

  [local count: 1014686041]:
  _20 = prephitmp_78 >> _10;
  a.cp[1] = _20;
  _22 = prephitmp_80 >> _10;
  a.cp[2] = _22;
  _24 = prephitmp_32 >> _10;
  a.cp[3] = _24;
  _26 = prephitmp_40 >> _10;
  a.cp[4] = _26;
  _28 = prephitmp_17 >> _10;
  a.cp[5] = _28;
  _30 = prephitmp_11 >> _10;
  a.cp[6] = _30;
  cn ={v} {CLOBBER};
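
The arithmetic behind the ICE can be reproduced with plain integers. This is a sketch, not GCC's poly-int implementation: divides_exactly and exact_div_sketch are hypothetical stand-ins that assume constant values and only model the precondition that the divisor must divide the dividend.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the precondition of an exact division:
// B must be non-zero and divide A without remainder.
bool divides_exactly (std::uint64_t a, std::uint64_t b)
{
  return b != 0 && a % b == 0;
}

// Hypothetical stand-in for an exact division on constants.  The caller
// promises that B divides A; the assertion fires if that promise is broken.
std::uint64_t exact_div_sketch (std::uint64_t a, std::uint64_t b)
{
  assert (divides_exactly (a, b));
  return a / b;
}
```

With vf = 1 and group_size = 6, dividing by 2 lanes ("vector 2 int") is fine and yields 3. After the bump to 4 lanes, divides_exactly (6, 4) is false, which is the situation in which the real exact_div asserts.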
[Bug target/98550] [11 Regression] ICE in exact_div, at poly-int.h:2219 on s390x-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98550

--- Comment #4 from Andreas Krebbel ---
The problem occurs starting with:

commit 1e1e1edf88a7c40ae4ae0de9e6077179e13ccf6d
Author: Richard Biener
Date:   Thu Oct 29 08:48:15 2020 +0100

    More BB vectorization tweaks

    This tweaks the op build from splats to allow loads marked as not
    vectorizable.  It also amends some dump prints with the address of
    the SLP node or the instance to better be able to debug things.

    2020-10-29  Richard Biener

            * tree-vect-slp.c (vect_build_slp_tree_2): Allow splatting
            not vectorizable loads.
            (vect_build_slp_instance): Amend dumping with address.
            (vect_slp_convert_to_external): Likewise.

            * gcc.dg/vect/bb-slp-pr65935.c: Adjust.
[Bug target/98550] [11 Regression] ICE in exact_div, at poly-int.h:2219 on s390x-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98550

--- Comment #3 from Andreas Krebbel ---
Created attachment 49944
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49944&action=edit
Reduced testcase

This testcase fails on bcb3065b2ba with

  cc1plus t.cpp -march=z13 -O3
[Bug target/98550] [11 Regression] ICE in exact_div, at poly-int.h:2219 on s390x-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98550

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |11.0
[Bug target/98550] [11 Regression] ICE in exact_div, at poly-int.h:2219 on s390x-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98550

Martin Liška changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |WAITING
           Assignee|marxin at gcc dot gnu.org   |unassigned at gcc dot gnu.org

--- Comment #2 from Martin Liška ---
But I can't reproduce that with the current master with a cross compiler.
Maybe it's fixed?
[Bug target/98550] [11 Regression] ICE in exact_div, at poly-int.h:2219 on s390x-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98550

Martin Liška changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |ASSIGNED
     Ever confirmed|0                           |1
           Assignee|unassigned at gcc dot gnu.org |marxin at gcc dot gnu.org
   Last reconfirmed|                            |2021-01-06
                 CC|                            |marxin at gcc dot gnu.org

--- Comment #1 from Martin Liška ---
Reducing that right now..