https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95199
--- Comment #6 from rguenther at suse dot de <rguenther at suse dot de> ---
On Thu, 21 May 2020, zhoukaipeng3 at huawei dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95199
>
> --- Comment #4 from Kaipeng Zhou <zhoukaipeng3 at huawei dot com> ---
> Sorry for not expressing clearly.
>
> I have debugged the testcase you provided. Not eliminating them is not caused
> by IFN. The relevant code is in the "get_computation_aff_1" function.
>
> In IVOPTs the IV_STEPs must be checked by function "constant_multiple_of"
> before using an IV variable to eliminate the other. But if the tree_code of
> input IV_STEP is SSA_NAME, the function will return false. In your testcase,
> the tree_code of IV_STEP is MULT_EXPR, so it return true.
>
> Gimple for my testcase:
>   <bb 12> [local count: 8589933]:
>   _83 = (sizetype) inc_y_22(D);
>   _84 = _83 * POLY_INT_CST [16, 16];
>   _85 = (long unsigned int) inc_y_22(D);
>   _86 = _85 * 8;
>   _87 = (ssizetype) _86;
>   _88 = _87 /[ex] 8;
>   _89 = (long unsigned int) _88;
>   _90 = VEC_SERIES_EXPR <0, _89>;
>   vect_cst__95 = [vec_duplicate_expr] m_17(D);
>   _97 = (sizetype) inc_x_20(D);
>   _98 = _97 * POLY_INT_CST [16, 16];
>   _99 = (long unsigned int) inc_x_20(D);
>   _100 = _99 * 8;
>   _101 = (ssizetype) _100;
>   _102 = _101 /[ex] 8;
>   _103 = (long unsigned int) _102;
>   _104 = VEC_SERIES_EXPR <0, _103>;
>   _109 = (sizetype) inc_x_20(D);
>   _110 = _109 * POLY_INT_CST [16, 16];
>   _111 = (long unsigned int) inc_x_20(D);

The issue is you have two copies of

  (sizetype) inc_x_20(D) * POLY_INT_CST [16, 16];

and IVOPTs does not perform CSE. vinfo->ivexpr_map is supposed to
catch those "IV base and/or step expressions". So look where they
are inserted and check the CSE map is used. Alternatively fixup
hashing/comparing to handle POLY_INT_CST [16, 16] if that is the
reason for the missed CSE.

>   _112 = _111 * 8;
>   _113 = (ssizetype) _112;
>   _114 = _113 /[ex] 8;
>   _115 = (long unsigned int) _114;
>   _116 = VEC_SERIES_EXPR <0, _115>;
>   max_mask_123 = .WHILE_ULT (0, 1000, { 0, ... });
>
>   <bb 3> [local count: 429496649]:
>   # vectp_b.3_91 = PHI <vectp_b.3_92(5), b_16(D)(12)>
>   # vectp_a.7_105 = PHI <vectp_a.7_106(5), a_18(D)(12)>
>   # vectp_a.11_117 = PHI <vectp_a.11_118(5), a_18(D)(12)>
>   # ivtmp_120 = PHI <ivtmp_121(5), 0(12)>
>   # loop_mask_93 = PHI <next_mask_124(5), max_mask_123(12)>
>   vect__4.5_94 = .MASK_GATHER_LOAD (vectp_b.3_91, _90, 8, { 0.0, ... }, loop_mask_93);
>   vect__5.6_96 = vect__4.5_94 * vect_cst__95;
>   vect__9.9_107 = .MASK_GATHER_LOAD (vectp_a.7_105, _104, 8, { 0.0, ... }, loop_mask_93);
>   vect__10.10_108 = vect__5.6_96 + vect__9.9_107;
>   .MASK_SCATTER_STORE (vectp_a.11_117, _116, 8, vect__10.10_108, loop_mask_93);
>   vectp_b.3_92 = vectp_b.3_91 + _84;
>   vectp_a.7_106 = vectp_a.7_105 + _98;
>   vectp_a.11_118 = vectp_a.11_117 + _110;
>   ivtmp_121 = ivtmp_120 + POLY_INT_CST [2, 2];
>   _122 = (unsigned int) ivtmp_121;
>   next_mask_124 = .WHILE_ULT (_122, 1000, { 0, ... });
>   if (next_mask_124 != { 0, ... })
>     goto <bb 5>; [98.00%]
>   else
>     goto <bb 4>; [2.00%]
>
> _98 and _110 are IV_STEPs. They are both SSA_NAME, so they cannot currently
> be eliminated in IVOPTs.
>
> I am not sure about my opinion. If wrong, please correct me. And can you
> provide some suggestions on how to solve this problem? Should I try to
> enhance the "constant_multiple_of" function?
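
To illustrate the CSE suggestion for the duplicated step expression, here is a
minimal standalone sketch. It is not GCC code: StepExpr, StepCache and
get_or_create are hypothetical names invented for this example, and the real
vinfo->ivexpr_map operates on GCC trees with its own hashing. The only point
is that when the lookup key compares the operand and both POLY_INT_CST
coefficients structurally, the second request for
(sizetype) inc_x_20(D) * POLY_INT_CST [16, 16] reuses the first name instead of
creating a second step like _110.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <tuple>

// Hypothetical stand-in for a step expression of the form
//   (sizetype) operand * POLY_INT_CST [c0, c1]
struct StepExpr {
    std::string operand;  // e.g. "inc_x_20(D)"
    long c0;              // first POLY_INT_CST coefficient
    long c1;              // second POLY_INT_CST coefficient

    // Structural ordering: two expressions with the same operand and the
    // same coefficients compare equivalent and therefore share one key.
    bool operator<(const StepExpr &other) const {
        return std::tie(operand, c0, c1)
               < std::tie(other.operand, other.c0, other.c1);
    }
};

// Hypothetical cache in the spirit of a CSE map: one name per distinct
// step expression.
class StepCache {
public:
    // Return the name that holds EXPR, creating a new one only on a miss.
    std::string get_or_create(const StepExpr &expr) {
        assert(!expr.operand.empty());
        auto it = names_.find(expr);
        if (it != names_.end())
            return it->second;  // hit: reuse the existing name
        std::string name = "_step" + std::to_string(next_id_++);
        names_.emplace(expr, name);
        return name;
    }

    // Number of distinct step expressions seen so far.
    std::size_t size() const { return names_.size(); }

private:
    std::map<StepExpr, std::string> names_;
    int next_id_ = 1;
};
```

If the key comparison ignored or mishandled the POLY_INT_CST part, the two
inc_x_20(D) requests would miss each other in the map, which is the second
possibility mentioned above.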