Re: [PATCH GCC][1/2]Feed bound computation to folder in loop split
On Wed, Jul 26, 2017 at 9:48 AM, Richard Bienerwrote: > On Tue, Jul 25, 2017 at 7:45 PM, Marc Glisse wrote: >> On Tue, 25 Jul 2017, Richard Biener wrote: >> I think we need Richard to say what the intent is for the valueization function. It is used both to stop looking at defining stmt if the return is NULL, and to replace/optimize one SSA_NAME with another, but currently it seems hard to prevent looking at the defining statement without preventing from looking at the SSA_NAME at all. >>> >>> >>> Yeah, this semantic overloading is an issue. For gimple_build we have >>> nothing >>> to "valueize" but we only use it to tell genmatch that it may not look at >>> the >>> SSA_NAME_DEF_STMT. >>> I guess we'll need a fix in genmatch... >>> >>> >>> I'll have a look tomorrow. >> >> >> My impression yesterday was that we could replace the current do_valueize >> wrapper by 2 wrappers (without touching the valueize callbacks): >> - may_check_def_stmt, which returns a bool corresponding to the current >> do_valueize != NULL_TREE >> - maybe_valueize, which tries to valueize, but if it gets a NULL_TREE, it >> returns its argument unchanged. >> >> Not very confident about it though. > > Note I've been there in the past (twice I think) but always ran into existing > latent issues. So hopefully we've resolved those, I'm testing the following > simplified variant of what I had back in time. > > It'll produce > > switch (TREE_CODE (op0)) > { > case SSA_NAME: > if (gimple *def_stmt = get_def (valueize, op0)) > { > if (gassign *def = dyn_cast (def_stmt)) > switch (gimple_assign_rhs_code (def)) > { > case MINUS_EXPR: > { > tree o20 = gimple_assign_rhs1 (def); > o20 = do_valueize (valueize, o20); > tree o21 = gimple_assign_rhs2 (def); > o21 = do_valueize (valueize, o21); > if (op1 == o21 || (operand_equal_p (op1, o21, 0) && > types_match (op1, o21))) > { > > which also indents less which is nice. > > Bootstrap/regtest running on x86_64-unknown-linux-gnu. Now applied with * gcc/testsuite/gcc.dg/pr70920-2.c: Adjust for transform already happening in ccp1. * gcc/testsuite/gcc.dg/pr70920-4.c: Likewise. that shows it's working (CCP is one of the passes eventually running into this issue). Richard. > Richard. > > 2017-07-26 Richard Biener > > * gimple-match-head.c (do_valueize): Return OP if valueize > returns NULL_TREE. > (get_def): New helper to get at the def stmt of a SSA name > if valueize allows. > * genmatch.c (dt_node::gen_kids_1): Use get_def instead of > do_valueize to get at the def stmt. > (dt_operand::gen_gimple_expr): Simplify do_valueize calls. > > >> -- >> Marc Glisse
Re: [PATCH GCC][1/2]Feed bound computation to folder in loop split
On Wed, Jul 26, 2017 at 11:57 AM, Marc Glissewrote: > On Wed, 26 Jul 2017, Richard Sandiford wrote: > >> Marc Glisse writes: >>> >>> On Wed, 26 Jul 2017, Richard Sandiford wrote: Richard Biener writes: > > On Tue, Jul 25, 2017 at 7:45 PM, Marc Glisse > wrote: >> >> On Tue, 25 Jul 2017, Richard Biener wrote: >> I think we need Richard to say what the intent is for the valueization function. It is used both to stop looking at defining stmt if the return is NULL, and to replace/optimize one SSA_NAME with another, but currently it seems hard to prevent looking at the defining statement without preventing from looking at the SSA_NAME at all. >>> >>> >>> >>> Yeah, this semantic overloading is an issue. For gimple_build we >>> have >>> nothing >>> to "valueize" but we only use it to tell genmatch that it may not >>> look at >>> the >>> SSA_NAME_DEF_STMT. >>> I guess we'll need a fix in genmatch... >>> >>> >>> >>> I'll have a look tomorrow. >> >> >> >> My impression yesterday was that we could replace the current >> do_valueize >> wrapper by 2 wrappers (without touching the valueize callbacks): >> - may_check_def_stmt, which returns a bool corresponding to the >> current >> do_valueize != NULL_TREE >> - maybe_valueize, which tries to valueize, but if it gets a NULL_TREE, >> it >> returns its argument unchanged. >> >> Not very confident about it though. > > > Note I've been there in the past (twice I think) but always ran into > existing > latent issues. So hopefully we've resolved those, I'm testing the > following > simplified variant of what I had back in time. > > It'll produce > > switch (TREE_CODE (op0)) > { > case SSA_NAME: > if (gimple *def_stmt = get_def (valueize, op0)) > { > if (gassign *def = dyn_cast (def_stmt)) > switch (gimple_assign_rhs_code (def)) > { > case MINUS_EXPR: > { > tree o20 = gimple_assign_rhs1 (def); > o20 = do_valueize (valueize, o20); > tree o21 = gimple_assign_rhs2 (def); > o21 = do_valueize (valueize, o21); > if (op1 == o21 || (operand_equal_p (op1, o21, 0) && > types_match (op1, o21))) > { > > which also indents less which is nice. > > Bootstrap/regtest running on x86_64-unknown-linux-gnu. > > Richard. > > 2017-07-26 Richard Biener > > * gimple-match-head.c (do_valueize): Return OP if valueize > returns NULL_TREE. > (get_def): New helper to get at the def stmt of a SSA name > if valueize allows. > * genmatch.c (dt_node::gen_kids_1): Use get_def instead of > do_valueize to get at the def stmt. > (dt_operand::gen_gimple_expr): Simplify do_valueize calls. > > >> -- >> Marc Glisse > > > 2017-07-26 Richard Biener > > * gimple-match-head.c (do_valueize): Return OP if valueize > returns NULL_TREE. > (get_def): New helper to get at the def stmt of a SSA name > if valueize allows. > * genmatch.c (dt_node::gen_kids_1): Use get_def instead of > do_valueize to get at the def stmt. > (dt_operand::gen_gimple_expr): Simplify do_valueize calls. > > Index: gcc/gimple-match-head.c > === > --- gcc/gimple-match-head.c (revision 250518) > +++ gcc/gimple-match-head.c (working copy) > @@ -779,10 +779,25 @@ inline tree > do_valueize (tree (*valueize)(tree), tree op) > { >if (valueize && TREE_CODE (op) == SSA_NAME) > -return valueize (op); > +{ > + tree tem = valueize (op); > + if (tem) > + return tem; > +} >return op; > } > > +/* Helper for the autogenerated code, get at the definition of NAME > when > + VALUEIZE allows that. */ > + > +inline gimple * > +get_def (tree (*valueize)(tree), tree name) > +{ > + if (valueize && ! valueize (name)) > +return NULL; > + return SSA_NAME_DEF_STMT (name); > +} I realise this is preexisting, but why do we ignore the value returned by valueize, even if it's different from NAME? >>> >>> >>> My impression is that we expect it has already been valueized. >> >> >> But in that case, why do we try to
Re: [PATCH GCC][1/2]Feed bound computation to folder in loop split
On Wed, 26 Jul 2017, Richard Sandiford wrote: Marc Glissewrites: On Wed, 26 Jul 2017, Richard Sandiford wrote: Richard Biener writes: On Tue, Jul 25, 2017 at 7:45 PM, Marc Glisse wrote: On Tue, 25 Jul 2017, Richard Biener wrote: I think we need Richard to say what the intent is for the valueization function. It is used both to stop looking at defining stmt if the return is NULL, and to replace/optimize one SSA_NAME with another, but currently it seems hard to prevent looking at the defining statement without preventing from looking at the SSA_NAME at all. Yeah, this semantic overloading is an issue. For gimple_build we have nothing to "valueize" but we only use it to tell genmatch that it may not look at the SSA_NAME_DEF_STMT. I guess we'll need a fix in genmatch... I'll have a look tomorrow. My impression yesterday was that we could replace the current do_valueize wrapper by 2 wrappers (without touching the valueize callbacks): - may_check_def_stmt, which returns a bool corresponding to the current do_valueize != NULL_TREE - maybe_valueize, which tries to valueize, but if it gets a NULL_TREE, it returns its argument unchanged. Not very confident about it though. Note I've been there in the past (twice I think) but always ran into existing latent issues. So hopefully we've resolved those, I'm testing the following simplified variant of what I had back in time. It'll produce switch (TREE_CODE (op0)) { case SSA_NAME: if (gimple *def_stmt = get_def (valueize, op0)) { if (gassign *def = dyn_cast (def_stmt)) switch (gimple_assign_rhs_code (def)) { case MINUS_EXPR: { tree o20 = gimple_assign_rhs1 (def); o20 = do_valueize (valueize, o20); tree o21 = gimple_assign_rhs2 (def); o21 = do_valueize (valueize, o21); if (op1 == o21 || (operand_equal_p (op1, o21, 0) && types_match (op1, o21))) { which also indents less which is nice. Bootstrap/regtest running on x86_64-unknown-linux-gnu. Richard. 2017-07-26 Richard Biener * gimple-match-head.c (do_valueize): Return OP if valueize returns NULL_TREE. (get_def): New helper to get at the def stmt of a SSA name if valueize allows. * genmatch.c (dt_node::gen_kids_1): Use get_def instead of do_valueize to get at the def stmt. (dt_operand::gen_gimple_expr): Simplify do_valueize calls. -- Marc Glisse 2017-07-26 Richard Biener * gimple-match-head.c (do_valueize): Return OP if valueize returns NULL_TREE. (get_def): New helper to get at the def stmt of a SSA name if valueize allows. * genmatch.c (dt_node::gen_kids_1): Use get_def instead of do_valueize to get at the def stmt. (dt_operand::gen_gimple_expr): Simplify do_valueize calls. Index: gcc/gimple-match-head.c === --- gcc/gimple-match-head.c (revision 250518) +++ gcc/gimple-match-head.c (working copy) @@ -779,10 +779,25 @@ inline tree do_valueize (tree (*valueize)(tree), tree op) { if (valueize && TREE_CODE (op) == SSA_NAME) -return valueize (op); +{ + tree tem = valueize (op); + if (tem) + return tem; +} return op; } +/* Helper for the autogenerated code, get at the definition of NAME when + VALUEIZE allows that. */ + +inline gimple * +get_def (tree (*valueize)(tree), tree name) +{ + if (valueize && ! valueize (name)) +return NULL; + return SSA_NAME_DEF_STMT (name); +} I realise this is preexisting, but why do we ignore the value returned by valueize, even if it's different from NAME? My impression is that we expect it has already been valueized. But in that case, why do we try to valueize it again? We don't know if valueize returned NULL and we kept the argument as is (should not look at def stmt) or if valueize actually returned something. This way we can also have valueize replace a value with another, and still forbid looking at the def stmt of the replacement value. It feels like we're conflating two things. We are. The question is how much trouble that causes, compared to the complication of for instance having 2 callbacks. -- Marc Glisse
Re: [PATCH GCC][1/2]Feed bound computation to folder in loop split
Marc Glissewrites: > On Wed, 26 Jul 2017, Richard Sandiford wrote: >> Richard Biener writes: >>> On Tue, Jul 25, 2017 at 7:45 PM, Marc Glisse wrote: On Tue, 25 Jul 2017, Richard Biener wrote: >> I think we need Richard to say what the intent is for the valueization >> function. It is used both to stop looking at defining stmt if the return >> is >> NULL, and to replace/optimize one SSA_NAME with another, but currently it >> seems hard to prevent looking at the defining statement without >> preventing >> from looking at the SSA_NAME at all. > > > Yeah, this semantic overloading is an issue. For gimple_build we have > nothing > to "valueize" but we only use it to tell genmatch that it may not look at > the > SSA_NAME_DEF_STMT. > >> I guess we'll need a fix in genmatch... > > > I'll have a look tomorrow. My impression yesterday was that we could replace the current do_valueize wrapper by 2 wrappers (without touching the valueize callbacks): - may_check_def_stmt, which returns a bool corresponding to the current do_valueize != NULL_TREE - maybe_valueize, which tries to valueize, but if it gets a NULL_TREE, it returns its argument unchanged. Not very confident about it though. >>> >>> Note I've been there in the past (twice I think) but always ran into >>> existing >>> latent issues. So hopefully we've resolved those, I'm testing the following >>> simplified variant of what I had back in time. >>> >>> It'll produce >>> >>> switch (TREE_CODE (op0)) >>> { >>> case SSA_NAME: >>> if (gimple *def_stmt = get_def (valueize, op0)) >>> { >>> if (gassign *def = dyn_cast (def_stmt)) >>> switch (gimple_assign_rhs_code (def)) >>> { >>> case MINUS_EXPR: >>> { >>> tree o20 = gimple_assign_rhs1 (def); >>> o20 = do_valueize (valueize, o20); >>> tree o21 = gimple_assign_rhs2 (def); >>> o21 = do_valueize (valueize, o21); >>> if (op1 == o21 || (operand_equal_p (op1, o21, 0) && >>> types_match (op1, o21))) >>> { >>> >>> which also indents less which is nice. >>> >>> Bootstrap/regtest running on x86_64-unknown-linux-gnu. >>> >>> Richard. >>> >>> 2017-07-26 Richard Biener >>> >>> * gimple-match-head.c (do_valueize): Return OP if valueize >>> returns NULL_TREE. >>> (get_def): New helper to get at the def stmt of a SSA name >>> if valueize allows. >>> * genmatch.c (dt_node::gen_kids_1): Use get_def instead of >>> do_valueize to get at the def stmt. >>> (dt_operand::gen_gimple_expr): Simplify do_valueize calls. >>> >>> -- Marc Glisse >>> >>> 2017-07-26 Richard Biener >>> >>> * gimple-match-head.c (do_valueize): Return OP if valueize >>> returns NULL_TREE. >>> (get_def): New helper to get at the def stmt of a SSA name >>> if valueize allows. >>> * genmatch.c (dt_node::gen_kids_1): Use get_def instead of >>> do_valueize to get at the def stmt. >>> (dt_operand::gen_gimple_expr): Simplify do_valueize calls. >>> >>> Index: gcc/gimple-match-head.c >>> === >>> --- gcc/gimple-match-head.c (revision 250518) >>> +++ gcc/gimple-match-head.c (working copy) >>> @@ -779,10 +779,25 @@ inline tree >>> do_valueize (tree (*valueize)(tree), tree op) >>> { >>>if (valueize && TREE_CODE (op) == SSA_NAME) >>> -return valueize (op); >>> +{ >>> + tree tem = valueize (op); >>> + if (tem) >>> + return tem; >>> +} >>>return op; >>> } >>> >>> +/* Helper for the autogenerated code, get at the definition of NAME when >>> + VALUEIZE allows that. */ >>> + >>> +inline gimple * >>> +get_def (tree (*valueize)(tree), tree name) >>> +{ >>> + if (valueize && ! valueize (name)) >>> +return NULL; >>> + return SSA_NAME_DEF_STMT (name); >>> +} >> >> I realise this is preexisting, but why do we ignore the value returned >> by valueize, even if it's different from NAME? > > My impression is that we expect it has already been valueized. But in that case, why do we try to valueize it again? It feels like we're conflating two things. Richard
Re: [PATCH GCC][1/2]Feed bound computation to folder in loop split
On Wed, 26 Jul 2017, Richard Sandiford wrote: Richard Bienerwrites: On Tue, Jul 25, 2017 at 7:45 PM, Marc Glisse wrote: On Tue, 25 Jul 2017, Richard Biener wrote: I think we need Richard to say what the intent is for the valueization function. It is used both to stop looking at defining stmt if the return is NULL, and to replace/optimize one SSA_NAME with another, but currently it seems hard to prevent looking at the defining statement without preventing from looking at the SSA_NAME at all. Yeah, this semantic overloading is an issue. For gimple_build we have nothing to "valueize" but we only use it to tell genmatch that it may not look at the SSA_NAME_DEF_STMT. I guess we'll need a fix in genmatch... I'll have a look tomorrow. My impression yesterday was that we could replace the current do_valueize wrapper by 2 wrappers (without touching the valueize callbacks): - may_check_def_stmt, which returns a bool corresponding to the current do_valueize != NULL_TREE - maybe_valueize, which tries to valueize, but if it gets a NULL_TREE, it returns its argument unchanged. Not very confident about it though. Note I've been there in the past (twice I think) but always ran into existing latent issues. So hopefully we've resolved those, I'm testing the following simplified variant of what I had back in time. It'll produce switch (TREE_CODE (op0)) { case SSA_NAME: if (gimple *def_stmt = get_def (valueize, op0)) { if (gassign *def = dyn_cast (def_stmt)) switch (gimple_assign_rhs_code (def)) { case MINUS_EXPR: { tree o20 = gimple_assign_rhs1 (def); o20 = do_valueize (valueize, o20); tree o21 = gimple_assign_rhs2 (def); o21 = do_valueize (valueize, o21); if (op1 == o21 || (operand_equal_p (op1, o21, 0) && types_match (op1, o21))) { which also indents less which is nice. Bootstrap/regtest running on x86_64-unknown-linux-gnu. Richard. 2017-07-26 Richard Biener * gimple-match-head.c (do_valueize): Return OP if valueize returns NULL_TREE. (get_def): New helper to get at the def stmt of a SSA name if valueize allows. * genmatch.c (dt_node::gen_kids_1): Use get_def instead of do_valueize to get at the def stmt. (dt_operand::gen_gimple_expr): Simplify do_valueize calls. -- Marc Glisse 2017-07-26 Richard Biener * gimple-match-head.c (do_valueize): Return OP if valueize returns NULL_TREE. (get_def): New helper to get at the def stmt of a SSA name if valueize allows. * genmatch.c (dt_node::gen_kids_1): Use get_def instead of do_valueize to get at the def stmt. (dt_operand::gen_gimple_expr): Simplify do_valueize calls. Index: gcc/gimple-match-head.c === --- gcc/gimple-match-head.c (revision 250518) +++ gcc/gimple-match-head.c (working copy) @@ -779,10 +779,25 @@ inline tree do_valueize (tree (*valueize)(tree), tree op) { if (valueize && TREE_CODE (op) == SSA_NAME) -return valueize (op); +{ + tree tem = valueize (op); + if (tem) + return tem; +} return op; } +/* Helper for the autogenerated code, get at the definition of NAME when + VALUEIZE allows that. */ + +inline gimple * +get_def (tree (*valueize)(tree), tree name) +{ + if (valueize && ! valueize (name)) +return NULL; + return SSA_NAME_DEF_STMT (name); +} I realise this is preexisting, but why do we ignore the value returned by valueize, even if it's different from NAME? My impression is that we expect it has already been valueized. -- Marc Glisse
Re: [PATCH GCC][1/2]Feed bound computation to folder in loop split
Richard Bienerwrites: > On Tue, Jul 25, 2017 at 7:45 PM, Marc Glisse wrote: >> On Tue, 25 Jul 2017, Richard Biener wrote: >> I think we need Richard to say what the intent is for the valueization function. It is used both to stop looking at defining stmt if the return is NULL, and to replace/optimize one SSA_NAME with another, but currently it seems hard to prevent looking at the defining statement without preventing from looking at the SSA_NAME at all. >>> >>> >>> Yeah, this semantic overloading is an issue. For gimple_build we have >>> nothing >>> to "valueize" but we only use it to tell genmatch that it may not look at >>> the >>> SSA_NAME_DEF_STMT. >>> I guess we'll need a fix in genmatch... >>> >>> >>> I'll have a look tomorrow. >> >> >> My impression yesterday was that we could replace the current do_valueize >> wrapper by 2 wrappers (without touching the valueize callbacks): >> - may_check_def_stmt, which returns a bool corresponding to the current >> do_valueize != NULL_TREE >> - maybe_valueize, which tries to valueize, but if it gets a NULL_TREE, it >> returns its argument unchanged. >> >> Not very confident about it though. > > Note I've been there in the past (twice I think) but always ran into existing > latent issues. So hopefully we've resolved those, I'm testing the following > simplified variant of what I had back in time. > > It'll produce > > switch (TREE_CODE (op0)) > { > case SSA_NAME: > if (gimple *def_stmt = get_def (valueize, op0)) > { > if (gassign *def = dyn_cast (def_stmt)) > switch (gimple_assign_rhs_code (def)) > { > case MINUS_EXPR: > { > tree o20 = gimple_assign_rhs1 (def); > o20 = do_valueize (valueize, o20); > tree o21 = gimple_assign_rhs2 (def); > o21 = do_valueize (valueize, o21); > if (op1 == o21 || (operand_equal_p (op1, o21, 0) && > types_match (op1, o21))) > { > > which also indents less which is nice. > > Bootstrap/regtest running on x86_64-unknown-linux-gnu. > > Richard. > > 2017-07-26 Richard Biener > > * gimple-match-head.c (do_valueize): Return OP if valueize > returns NULL_TREE. > (get_def): New helper to get at the def stmt of a SSA name > if valueize allows. > * genmatch.c (dt_node::gen_kids_1): Use get_def instead of > do_valueize to get at the def stmt. > (dt_operand::gen_gimple_expr): Simplify do_valueize calls. > > >> -- >> Marc Glisse > > 2017-07-26 Richard Biener > > * gimple-match-head.c (do_valueize): Return OP if valueize > returns NULL_TREE. > (get_def): New helper to get at the def stmt of a SSA name > if valueize allows. > * genmatch.c (dt_node::gen_kids_1): Use get_def instead of > do_valueize to get at the def stmt. > (dt_operand::gen_gimple_expr): Simplify do_valueize calls. > > Index: gcc/gimple-match-head.c > === > --- gcc/gimple-match-head.c (revision 250518) > +++ gcc/gimple-match-head.c (working copy) > @@ -779,10 +779,25 @@ inline tree > do_valueize (tree (*valueize)(tree), tree op) > { >if (valueize && TREE_CODE (op) == SSA_NAME) > -return valueize (op); > +{ > + tree tem = valueize (op); > + if (tem) > + return tem; > +} >return op; > } > > +/* Helper for the autogenerated code, get at the definition of NAME when > + VALUEIZE allows that. */ > + > +inline gimple * > +get_def (tree (*valueize)(tree), tree name) > +{ > + if (valueize && ! valueize (name)) > +return NULL; > + return SSA_NAME_DEF_STMT (name); > +} I realise this is preexisting, but why do we ignore the value returned by valueize, even if it's different from NAME? I struggled over that with the old code when I hit the same problem with some SVE patches (for which, thanks for the fix). A comment might be good. Thanks, Richard
Re: [PATCH GCC][1/2]Feed bound computation to folder in loop split
On Tue, Jul 25, 2017 at 7:45 PM, Marc Glissewrote: > On Tue, 25 Jul 2017, Richard Biener wrote: > >>> I think we need Richard to say what the intent is for the valueization >>> function. It is used both to stop looking at defining stmt if the return >>> is >>> NULL, and to replace/optimize one SSA_NAME with another, but currently it >>> seems hard to prevent looking at the defining statement without >>> preventing >>> from looking at the SSA_NAME at all. >> >> >> Yeah, this semantic overloading is an issue. For gimple_build we have >> nothing >> to "valueize" but we only use it to tell genmatch that it may not look at >> the >> SSA_NAME_DEF_STMT. >> >>> I guess we'll need a fix in genmatch... >> >> >> I'll have a look tomorrow. > > > My impression yesterday was that we could replace the current do_valueize > wrapper by 2 wrappers (without touching the valueize callbacks): > - may_check_def_stmt, which returns a bool corresponding to the current > do_valueize != NULL_TREE > - maybe_valueize, which tries to valueize, but if it gets a NULL_TREE, it > returns its argument unchanged. > > Not very confident about it though. Note I've been there in the past (twice I think) but always ran into existing latent issues. So hopefully we've resolved those, I'm testing the following simplified variant of what I had back in time. It'll produce switch (TREE_CODE (op0)) { case SSA_NAME: if (gimple *def_stmt = get_def (valueize, op0)) { if (gassign *def = dyn_cast (def_stmt)) switch (gimple_assign_rhs_code (def)) { case MINUS_EXPR: { tree o20 = gimple_assign_rhs1 (def); o20 = do_valueize (valueize, o20); tree o21 = gimple_assign_rhs2 (def); o21 = do_valueize (valueize, o21); if (op1 == o21 || (operand_equal_p (op1, o21, 0) && types_match (op1, o21))) { which also indents less which is nice. Bootstrap/regtest running on x86_64-unknown-linux-gnu. Richard. 2017-07-26 Richard Biener * gimple-match-head.c (do_valueize): Return OP if valueize returns NULL_TREE. (get_def): New helper to get at the def stmt of a SSA name if valueize allows. * genmatch.c (dt_node::gen_kids_1): Use get_def instead of do_valueize to get at the def stmt. (dt_operand::gen_gimple_expr): Simplify do_valueize calls. > -- > Marc Glisse p3 Description: Binary data
Re: [PATCH GCC][1/2]Feed bound computation to folder in loop split
On Tue, 25 Jul 2017, Richard Biener wrote: I think we need Richard to say what the intent is for the valueization function. It is used both to stop looking at defining stmt if the return is NULL, and to replace/optimize one SSA_NAME with another, but currently it seems hard to prevent looking at the defining statement without preventing from looking at the SSA_NAME at all. Yeah, this semantic overloading is an issue. For gimple_build we have nothing to "valueize" but we only use it to tell genmatch that it may not look at the SSA_NAME_DEF_STMT. I guess we'll need a fix in genmatch... I'll have a look tomorrow. My impression yesterday was that we could replace the current do_valueize wrapper by 2 wrappers (without touching the valueize callbacks): - may_check_def_stmt, which returns a bool corresponding to the current do_valueize != NULL_TREE - maybe_valueize, which tries to valueize, but if it gets a NULL_TREE, it returns its argument unchanged. Not very confident about it though. -- Marc Glisse
Re: [PATCH GCC][1/2]Feed bound computation to folder in loop split
On Mon, Jul 24, 2017 at 4:31 PM, Marc Glissewrote: > On Mon, 24 Jul 2017, Bin.Cheng wrote: > >> On Mon, Jul 24, 2017 at 2:59 PM, Marc Glisse wrote: >>> >>> On Mon, 24 Jul 2017, Bin.Cheng wrote: >>> But since definition of _197 isn't in current stmt sequence, call "o31 = do_valueize (valueize, o31)" will return NULL. As a result, it's not matched. >>> >>> >>> >>> Wait, actually, how was your fold_build* version working? Why was the >>> first >>> addition "in the current generic tree" and why isn't it "in current stmt >>> sequence"? >> >> Maybe I didn't express it clearly. In compute_new_first_bound, we >> have stmt sequence "_124 = _197 + 1", and we try to simplify "_124 - >> 1" by calling gimple_build. The definition of _197 is a PHI and isn't >> in current stmt sequence. For fold_build* version, it builds >> expression "_197 + 1 - 1" and simplifies it. > > > It seems like it shouldn't be relevant whether the definition of _197 is in > the stmt sequence or not, but indeed we seem to generate a lot of calls to > do_valueize... I had misunderstood the issue, sorry. > > Strangely, for a pattern like > (simplify (plus @0 @1) @0) > we generate no call to valueize, Expected. We expect the "toplevel" trees to be already valueized. > while for the following > (simplify (plus @0 (minus @1 @2)) @0) > we generate 3 calls to do_valueize. > > I think we need Richard to say what the intent is for the valueization > function. It is used both to stop looking at defining stmt if the return is > NULL, and to replace/optimize one SSA_NAME with another, but currently it > seems hard to prevent looking at the defining statement without preventing > from looking at the SSA_NAME at all. Yeah, this semantic overloading is an issue. For gimple_build we have nothing to "valueize" but we only use it to tell genmatch that it may not look at the SSA_NAME_DEF_STMT. > I guess we'll need a fix in genmatch... I'll have a look tomorrow. RIchard. > -- > Marc Glisse
Re: [PATCH GCC][1/2]Feed bound computation to folder in loop split
On Mon, 24 Jul 2017, Bin.Cheng wrote: On Mon, Jul 24, 2017 at 3:31 PM, Marc Glissewrote: On Mon, 24 Jul 2017, Bin.Cheng wrote: On Mon, Jul 24, 2017 at 2:59 PM, Marc Glisse wrote: On Mon, 24 Jul 2017, Bin.Cheng wrote: But since definition of _197 isn't in current stmt sequence, call "o31 = do_valueize (valueize, o31)" will return NULL. As a result, it's not matched. Wait, actually, how was your fold_build* version working? Why was the first addition "in the current generic tree" and why isn't it "in current stmt sequence"? Maybe I didn't express it clearly. In compute_new_first_bound, we have stmt sequence "_124 = _197 + 1", and we try to simplify "_124 - 1" by calling gimple_build. The definition of _197 is a PHI and isn't in current stmt sequence. For fold_build* version, it builds expression "_197 + 1 - 1" and simplifies it. It seems like it shouldn't be relevant whether the definition of _197 is in the stmt sequence or not, but indeed we seem to generate a lot of calls to do_valueize... I had misunderstood the issue, sorry. Oh, no need at all, and thanks very much for all the explanation. Strangely, for a pattern like (simplify (plus @0 @1) @0) we generate no call to valueize, while for the following (simplify (plus @0 (minus @1 @2)) @0) we generate 3 calls to do_valueize. I think we need Richard to say what the intent is for the valueization function. It is used both to stop looking at defining stmt if the return is NULL, and to replace/optimize one SSA_NAME with another, but currently it seems hard to prevent looking at the defining statement without preventing from looking at the SSA_NAME at all. Looks we don't really expand into def_stmt on leaf nodes, maybe valueization can be saved in the case? Passes with a lattice (say PRE) use valueization to replace an SSA_NAME with an equivalent one, not just to let it look through the defining statement, so I am not sure if we can get rid of those completely, or if we need two different do_valueize wrappers for the 2 types of uses. -- Marc Glisse
Re: [PATCH GCC][1/2]Feed bound computation to folder in loop split
On Mon, Jul 24, 2017 at 3:31 PM, Marc Glissewrote: > On Mon, 24 Jul 2017, Bin.Cheng wrote: > >> On Mon, Jul 24, 2017 at 2:59 PM, Marc Glisse wrote: >>> >>> On Mon, 24 Jul 2017, Bin.Cheng wrote: >>> But since definition of _197 isn't in current stmt sequence, call "o31 = do_valueize (valueize, o31)" will return NULL. As a result, it's not matched. >>> >>> >>> >>> Wait, actually, how was your fold_build* version working? Why was the >>> first >>> addition "in the current generic tree" and why isn't it "in current stmt >>> sequence"? >> >> Maybe I didn't express it clearly. In compute_new_first_bound, we >> have stmt sequence "_124 = _197 + 1", and we try to simplify "_124 - >> 1" by calling gimple_build. The definition of _197 is a PHI and isn't >> in current stmt sequence. For fold_build* version, it builds >> expression "_197 + 1 - 1" and simplifies it. > > > It seems like it shouldn't be relevant whether the definition of _197 is in > the stmt sequence or not, but indeed we seem to generate a lot of calls to > do_valueize... I had misunderstood the issue, sorry. Oh, no need at all, and thanks very much for all the explanation. > > Strangely, for a pattern like > (simplify (plus @0 @1) @0) > we generate no call to valueize, while for the following > (simplify (plus @0 (minus @1 @2)) @0) > we generate 3 calls to do_valueize. > > I think we need Richard to say what the intent is for the valueization > function. It is used both to stop looking at defining stmt if the return is > NULL, and to replace/optimize one SSA_NAME with another, but currently it > seems hard to prevent looking at the defining statement without preventing > from looking at the SSA_NAME at all. Looks we don't really expand into def_stmt on leaf nodes, maybe valueization can be saved in the case? > > I guess we'll need a fix in genmatch... Yeah, then the original patch becomes unnecessary. Thanks again! Thanks, bin > > -- > Marc Glisse
Re: [PATCH GCC][1/2]Feed bound computation to folder in loop split
On Mon, 24 Jul 2017, Bin.Cheng wrote: On Mon, Jul 24, 2017 at 2:59 PM, Marc Glissewrote: On Mon, 24 Jul 2017, Bin.Cheng wrote: But since definition of _197 isn't in current stmt sequence, call "o31 = do_valueize (valueize, o31)" will return NULL. As a result, it's not matched. Wait, actually, how was your fold_build* version working? Why was the first addition "in the current generic tree" and why isn't it "in current stmt sequence"? Maybe I didn't express it clearly. In compute_new_first_bound, we have stmt sequence "_124 = _197 + 1", and we try to simplify "_124 - 1" by calling gimple_build. The definition of _197 is a PHI and isn't in current stmt sequence. For fold_build* version, it builds expression "_197 + 1 - 1" and simplifies it. It seems like it shouldn't be relevant whether the definition of _197 is in the stmt sequence or not, but indeed we seem to generate a lot of calls to do_valueize... I had misunderstood the issue, sorry. Strangely, for a pattern like (simplify (plus @0 @1) @0) we generate no call to valueize, while for the following (simplify (plus @0 (minus @1 @2)) @0) we generate 3 calls to do_valueize. I think we need Richard to say what the intent is for the valueization function. It is used both to stop looking at defining stmt if the return is NULL, and to replace/optimize one SSA_NAME with another, but currently it seems hard to prevent looking at the defining statement without preventing from looking at the SSA_NAME at all. I guess we'll need a fix in genmatch... -- Marc Glisse
Re: [PATCH GCC][1/2]Feed bound computation to folder in loop split
On Mon, Jul 24, 2017 at 2:59 PM, Marc Glissewrote: > On Mon, 24 Jul 2017, Bin.Cheng wrote: > >> But since definition of _197 isn't in current stmt sequence, call "o31 >> = do_valueize (valueize, o31)" will return NULL. As a result, it's >> not matched. > > > Wait, actually, how was your fold_build* version working? Why was the first > addition "in the current generic tree" and why isn't it "in current stmt > sequence"? Maybe I didn't express it clearly. In compute_new_first_bound, we have stmt sequence "_124 = _197 + 1", and we try to simplify "_124 - 1" by calling gimple_build. The definition of _197 is a PHI and isn't in current stmt sequence. For fold_build* version, it builds expression "_197 + 1 - 1" and simplifies it. Thanks, bin > > -- > Marc Glisse
Re: [PATCH GCC][1/2]Feed bound computation to folder in loop split
On Mon, 24 Jul 2017, Bin.Cheng wrote: But since definition of _197 isn't in current stmt sequence, call "o31 = do_valueize (valueize, o31)" will return NULL. As a result, it's not matched. Wait, actually, how was your fold_build* version working? Why was the first addition "in the current generic tree" and why isn't it "in current stmt sequence"? -- Marc Glisse
Re: [PATCH GCC][1/2]Feed bound computation to folder in loop split
On Mon, Jul 24, 2017 at 1:16 PM, Marc Glissewrote: > On Mon, 24 Jul 2017, Bin.Cheng wrote: > >>> For _123, we have >>> >>> /* (A +- CST1) +- CST2 -> A + CST3 >>> or >>> /* Associate (p +p off1) +p off2 as (p +p (off1 + off2)). */ >>> >>> >>> For _115, we have >>> >>> /* min (a, a + CST) -> a where CST is positive. */ >>> /* min (a, a + CST) -> a + CST where CST is negative. */ >>> (simplify >>> (min:c @0 (plus@2 @0 INTEGER_CST@1)) >>> (if (TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0))) >>>(if (tree_int_cst_sgn (@1) > 0) >>> @0 >>> @2))) >>> >>> What is the type of all those SSA_NAMEs? >> >> Hi, >> From the debugging process, there are two issues preventing "(A +- >> CST1) +- CST2 -> A + CST3" from being applied: >> A) before we reach this pattern, there is pattern: >> >> /* A - B -> A + (-B) if B is easily negatable. */ >> (simplify >> (minus @0 negate_expr_p@1) >> (if (!FIXED_POINT_TYPE_P (type)) >> (plus @0 (negate @1 >> >> which is matched and returned in gimple_simplify_MINUS_EXPR. So does >> pattern order matter here? > > > That shouldn't be a problem, normally we always try to resimplify the result > of the simplification, and the transformation should handle x+1+-1 just as > well as x+1-1. Is that not happening? Yes, it's doesn't matter. > >> B) When folding "_124 - 1" on the basis of existing stmts sequence >> like "_124 = _197 + 1;". The corresponding gimple-match.c code is >> like: > > [...] >> >> But since definition of _197 isn't in current stmt sequence, call "o31 >> = do_valueize (valueize, o31)" will return NULL. As a result, it's >> not matched. > > > Ah, yes, that problem... Jakub was having a very similar issue a few > weeks ago, don't know if he found a solution. You could call > gimple_simplify directly with a different valueization function if > that's safe. Normally the simplification would wait until the next > forwprop pass. Thanks for elaboration. It's too late for next forwprop pass since we are in between loop optimizations and need the simplified code for niter analysis. Function compute_new_first_bound calls gimple_build several times, it's not likely to replace all gimple_build with gimple_simplify? Also gimple_simplify could return NULL_TREE, in which case I need to call gimple_build_assign again. At last, we don't have interface fold_seq similar to fold_stmt either. CCing Jakub if he found a solution. Thanks. Thanks, bin > > -- > Marc Glisse
Re: [PATCH GCC][1/2]Feed bound computation to folder in loop split
On Mon, 24 Jul 2017, Bin.Cheng wrote: For _123, we have /* (A +- CST1) +- CST2 -> A + CST3 or /* Associate (p +p off1) +p off2 as (p +p (off1 + off2)). */ For _115, we have /* min (a, a + CST) -> a where CST is positive. */ /* min (a, a + CST) -> a + CST where CST is negative. */ (simplify (min:c @0 (plus@2 @0 INTEGER_CST@1)) (if (TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0))) (if (tree_int_cst_sgn (@1) > 0) @0 @2))) What is the type of all those SSA_NAMEs? Hi, From the debugging process, there are two issues preventing "(A +- CST1) +- CST2 -> A + CST3" from being applied: A) before we reach this pattern, there is pattern: /* A - B -> A + (-B) if B is easily negatable. */ (simplify (minus @0 negate_expr_p@1) (if (!FIXED_POINT_TYPE_P (type)) (plus @0 (negate @1 which is matched and returned in gimple_simplify_MINUS_EXPR. So does pattern order matter here? That shouldn't be a problem, normally we always try to resimplify the result of the simplification, and the transformation should handle x+1+-1 just as well as x+1-1. Is that not happening? B) When folding "_124 - 1" on the basis of existing stmts sequence like "_124 = _197 + 1;". The corresponding gimple-match.c code is like: [...] But since definition of _197 isn't in current stmt sequence, call "o31 = do_valueize (valueize, o31)" will return NULL. As a result, it's not matched. Ah, yes, that problem... Jakub was having a very similar issue a few weeks ago, don't know if he found a solution. You could call gimple_simplify directly with a different valueization function if that's safe. Normally the simplification would wait until the next forwprop pass. -- Marc Glisse
Re: [PATCH GCC][1/2]Feed bound computation to folder in loop split
On Fri, Jun 16, 2017 at 5:48 PM, Marc Glissewrote: > On Fri, 16 Jun 2017, Bin.Cheng wrote: > >> On Fri, Jun 16, 2017 at 5:16 PM, Richard Biener >> wrote: >>> >>> >>> That means we miss a pattern in match.PD to handle this case. >> >> I see. I will withdraw this patch and look in that direction. > > > For _123, we have > > /* (A +- CST1) +- CST2 -> A + CST3 > or > /* Associate (p +p off1) +p off2 as (p +p (off1 + off2)). */ > > > For _115, we have > > /* min (a, a + CST) -> a where CST is positive. */ > /* min (a, a + CST) -> a + CST where CST is negative. */ > (simplify > (min:c @0 (plus@2 @0 INTEGER_CST@1)) > (if (TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0))) >(if (tree_int_cst_sgn (@1) > 0) > @0 > @2))) > > What is the type of all those SSA_NAMEs? Hi, >From the debugging process, there are two issues preventing "(A +- CST1) +- CST2 -> A + CST3" from being applied: A) before we reach this pattern, there is pattern: /* A - B -> A + (-B) if B is easily negatable. */ (simplify (minus @0 negate_expr_p@1) (if (!FIXED_POINT_TYPE_P (type)) (plus @0 (negate @1 which is matched and returned in gimple_simplify_MINUS_EXPR. So does pattern order matter here? B) When folding "_124 - 1" on the basis of existing stmts sequence like "_124 = _197 + 1;". The corresponding gimple-match.c code is like: if (gimple_nop_convert (op0, op0_pops, valueize)) { tree o20 = op0_pops[0]; switch (TREE_CODE (o20)) { case SSA_NAME: if (do_valueize (valueize, o20) != NULL_TREE) { gimple *def_stmt = SSA_NAME_DEF_STMT (o20); if (gassign *def = dyn_cast (def_stmt)) switch (gimple_assign_rhs_code (def)) { case PLUS_EXPR: { tree o30 = gimple_assign_rhs1 (def); if ((o30 = do_valueize (valueize, o30))) { tree o31 = gimple_assign_rhs2 (def); if ((o31 = do_valueize (valueize, o31))) { if (tree_swap_operands_p (o30, o31)) std::swap (o30, o31); if (CONSTANT_CLASS_P (o31)) { if (CONSTANT_CLASS_P (op1)) { { /* #line 1392 "../../gcc/gcc/match.pd" */ tree captures[3] ATTRIBUTE_UNUSED = { o30, o31, op1 }; if (gimple_simplify_194 (res_code, res_ops, seq, valueize, type, captures, PLUS_EXPR, MINUS_EXPR, MINUS_EXPR)) return true; } } } } } break; } Note we have: (gdb) call debug_generic_expr(op0) _124 (gdb) call debug_generic_expr(op1) 1 (gdb) call debug_gimple_stmt(def_stmt) _124 = _197 + 1; (gdb) call debug_generic_expr(o30) _197 But since definition of _197 isn't in current stmt sequence, call "o31 = do_valueize (valueize, o31)" will return NULL. As a result, it's not matched. Any ideas? Thanks. Thanks, bin
Re: [PATCH GCC][1/2]Feed bound computation to folder in loop split
On Fri, Jun 16, 2017 at 9:48 AM, Marc Glissewrote: > On Fri, 16 Jun 2017, Bin.Cheng wrote: > >> On Fri, Jun 16, 2017 at 5:16 PM, Richard Biener >> wrote: >>> >>> On June 16, 2017 3:31:32 PM GMT+02:00, "Bin.Cheng" >>> wrote: On Fri, Jun 16, 2017 at 2:10 PM, Richard Biener wrote: > > On Fri, Jun 16, 2017 at 3:06 PM, Bin.Cheng wrote: >> >> On Fri, Jun 16, 2017 at 11:49 AM, Richard Biener >> wrote: >>> >>> On Wed, Jun 14, 2017 at 3:07 PM, Bin Cheng wrote: Hi, Loop split forces intermediate computation to gimple operands all the time when computing bound information. This is not good since folding opportunities are missed. This patch fixes the issue by feeding all computation to folder and only forcing to gimple operand at last. Bootstrap and test on x86_64 and AArch64. Is it OK? >>> >>> >>> Hm? It uses gimple_build () which should do the same as fold_buildN in terms >>> >>> of simplification. >>> >>> So where does that not work? It is supposed to be the prefered way and no >>> >>> new code should use force_gimple_operand (unless dealing with generic >>> >>> coming from other middle-end infrastructure like SCEV or niter analysis) >> >> Hmm, current code calls force_gimpele operand several times which >> causes the inefficiency. The patch avoids that and does one call at >> the end. > > > But it forces to the same sequence that is used for extending the expression > > so folding should work. Where do you see that it does not? Note the > code uses gimple_build (), not gimple_build_assign (). In spec2k6/hmmer, when building fast_algorithms.c with below command line: ./gcc -Ofast -S fast_algorithms.c -o fast_algorithms.S -fdump-tree-all -fdump-tree-lsplit The lsplit dump contains: [12.75%]: _124 = _197 + 1; _123 = _124 + -1; _115 = MIN_EXPR <_197, _124>; Which is generated here. >>> >>> >>> That means we miss a pattern in match.PD to handle this case. >> >> I see. I will withdraw this patch and look in that direction. > > > For _123, we have > > /* (A +- CST1) +- CST2 -> A + CST3 > or > /* Associate (p +p off1) +p off2 as (p +p (off1 + off2)). */ > > > For _115, we have > > /* min (a, a + CST) -> a where CST is positive. */ > /* min (a, a + CST) -> a + CST where CST is negative. */ > (simplify > (min:c @0 (plus@2 @0 INTEGER_CST@1)) > (if (TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0))) >(if (tree_int_cst_sgn (@1) > 0) > @0 > @2))) > > What is the type of all those SSA_NAMEs? https://gcc.gnu.org/ml/gcc-patches/2016-12/msg01352.html which added the min/max patterns. I forgot to get Naveen to mention I saw this while looking into loop splitting and why I was adding them. Thanks, Andrew Pinski > > -- > Marc Glisse
Re: [PATCH GCC][1/2]Feed bound computation to folder in loop split
On Fri, Jun 16, 2017 at 5:48 PM, Marc Glissewrote: > On Fri, 16 Jun 2017, Bin.Cheng wrote: > >> On Fri, Jun 16, 2017 at 5:16 PM, Richard Biener >> wrote: >>> >>> On June 16, 2017 3:31:32 PM GMT+02:00, "Bin.Cheng" >>> wrote: On Fri, Jun 16, 2017 at 2:10 PM, Richard Biener wrote: > > On Fri, Jun 16, 2017 at 3:06 PM, Bin.Cheng wrote: >> >> On Fri, Jun 16, 2017 at 11:49 AM, Richard Biener >> wrote: >>> >>> On Wed, Jun 14, 2017 at 3:07 PM, Bin Cheng wrote: Hi, Loop split forces intermediate computation to gimple operands all the time when computing bound information. This is not good since folding opportunities are missed. This patch fixes the issue by feeding all computation to folder and only forcing to gimple operand at last. Bootstrap and test on x86_64 and AArch64. Is it OK? >>> >>> >>> Hm? It uses gimple_build () which should do the same as fold_buildN in terms >>> >>> of simplification. >>> >>> So where does that not work? It is supposed to be the prefered way and no >>> >>> new code should use force_gimple_operand (unless dealing with generic >>> >>> coming from other middle-end infrastructure like SCEV or niter analysis) >> >> Hmm, current code calls force_gimpele operand several times which >> causes the inefficiency. The patch avoids that and does one call at >> the end. > > > But it forces to the same sequence that is used for extending the expression > > so folding should work. Where do you see that it does not? Note the > code uses gimple_build (), not gimple_build_assign (). In spec2k6/hmmer, when building fast_algorithms.c with below command line: ./gcc -Ofast -S fast_algorithms.c -o fast_algorithms.S -fdump-tree-all -fdump-tree-lsplit The lsplit dump contains: [12.75%]: _124 = _197 + 1; _123 = _124 + -1; _115 = MIN_EXPR <_197, _124>; Which is generated here. >>> >>> >>> That means we miss a pattern in match.PD to handle this case. >> >> I see. I will withdraw this patch and look in that direction. > > > For _123, we have > > /* (A +- CST1) +- CST2 -> A + CST3 > or > /* Associate (p +p off1) +p off2 as (p +p (off1 + off2)). */ > > > For _115, we have > > /* min (a, a + CST) -> a where CST is positive. */ > /* min (a, a + CST) -> a + CST where CST is negative. */ > (simplify > (min:c @0 (plus@2 @0 INTEGER_CST@1)) > (if (TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0))) >(if (tree_int_cst_sgn (@1) > 0) > @0 > @2))) > > What is the type of all those SSA_NAMEs? Hi Marc, Thanks for pointing out the exact patterns. The variables are of int type. The redundant operation disappears in reduced test case though. Thanks, bin > > -- > Marc Glisse
Re: [PATCH GCC][1/2]Feed bound computation to folder in loop split
On Fri, 16 Jun 2017, Bin.Cheng wrote: On Fri, Jun 16, 2017 at 5:16 PM, Richard Bienerwrote: On June 16, 2017 3:31:32 PM GMT+02:00, "Bin.Cheng" wrote: On Fri, Jun 16, 2017 at 2:10 PM, Richard Biener wrote: On Fri, Jun 16, 2017 at 3:06 PM, Bin.Cheng wrote: On Fri, Jun 16, 2017 at 11:49 AM, Richard Biener wrote: On Wed, Jun 14, 2017 at 3:07 PM, Bin Cheng wrote: Hi, Loop split forces intermediate computation to gimple operands all the time when computing bound information. This is not good since folding opportunities are missed. This patch fixes the issue by feeding all computation to folder and only forcing to gimple operand at last. Bootstrap and test on x86_64 and AArch64. Is it OK? Hm? It uses gimple_build () which should do the same as fold_buildN in terms of simplification. So where does that not work? It is supposed to be the prefered way and no new code should use force_gimple_operand (unless dealing with generic coming from other middle-end infrastructure like SCEV or niter analysis) Hmm, current code calls force_gimpele operand several times which causes the inefficiency. The patch avoids that and does one call at the end. But it forces to the same sequence that is used for extending the expression so folding should work. Where do you see that it does not? Note the code uses gimple_build (), not gimple_build_assign (). In spec2k6/hmmer, when building fast_algorithms.c with below command line: ./gcc -Ofast -S fast_algorithms.c -o fast_algorithms.S -fdump-tree-all -fdump-tree-lsplit The lsplit dump contains: [12.75%]: _124 = _197 + 1; _123 = _124 + -1; _115 = MIN_EXPR <_197, _124>; Which is generated here. That means we miss a pattern in match.PD to handle this case. I see. I will withdraw this patch and look in that direction. For _123, we have /* (A +- CST1) +- CST2 -> A + CST3 or /* Associate (p +p off1) +p off2 as (p +p (off1 + off2)). */ For _115, we have /* min (a, a + CST) -> a where CST is positive. */ /* min (a, a + CST) -> a + CST where CST is negative. */ (simplify (min:c @0 (plus@2 @0 INTEGER_CST@1)) (if (TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0))) (if (tree_int_cst_sgn (@1) > 0) @0 @2))) What is the type of all those SSA_NAMEs? -- Marc Glisse
Re: [PATCH GCC][1/2]Feed bound computation to folder in loop split
On Fri, Jun 16, 2017 at 5:16 PM, Richard Bienerwrote: > On June 16, 2017 3:31:32 PM GMT+02:00, "Bin.Cheng" > wrote: >>On Fri, Jun 16, 2017 at 2:10 PM, Richard Biener >> wrote: >>> On Fri, Jun 16, 2017 at 3:06 PM, Bin.Cheng >>wrote: On Fri, Jun 16, 2017 at 11:49 AM, Richard Biener wrote: > On Wed, Jun 14, 2017 at 3:07 PM, Bin Cheng >>wrote: >> Hi, >> Loop split forces intermediate computation to gimple operands all >>the time when >> computing bound information. This is not good since folding >>opportunities are >> missed. This patch fixes the issue by feeding all computation to >>folder and only >> forcing to gimple operand at last. >> >> Bootstrap and test on x86_64 and AArch64. Is it OK? > > Hm? It uses gimple_build () which should do the same as >>fold_buildN in terms > of simplification. > > So where does that not work? It is supposed to be the prefered way >>and no > new code should use force_gimple_operand (unless dealing with >>generic > coming from other middle-end infrastructure like SCEV or niter >>analysis) Hmm, current code calls force_gimpele operand several times which causes the inefficiency. The patch avoids that and does one call at the end. >>> >>> But it forces to the same sequence that is used for extending the >>expression >>> so folding should work. Where do you see that it does not? Note the >>> code uses gimple_build (), not gimple_build_assign (). >>In spec2k6/hmmer, when building fast_algorithms.c with below command >>line: >>./gcc -Ofast -S fast_algorithms.c -o fast_algorithms.S -fdump-tree-all >>-fdump-tree-lsplit >>The lsplit dump contains: >> [12.75%]: >> _124 = _197 + 1; >> _123 = _124 + -1; >> _115 = MIN_EXPR <_197, _124>; >>Which is generated here. > > That means we miss a pattern in match.PD to handle this case. I see. I will withdraw this patch and look in that direction. Thanks, bin > > Richard. > >>Thanks, >>bin >>> >>> Richard. >>> Thanks, bin > > Richard. > >> >> Thanks, >> bin >> 2017-06-12 Bin Cheng >> >> * tree-ssa-loop-split.c (compute_new_first_bound): Feed >>bound >> computation to folder, rather than force to gimple >>operands too >> early. >
Re: [PATCH GCC][1/2]Feed bound computation to folder in loop split
On June 16, 2017 3:31:32 PM GMT+02:00, "Bin.Cheng"wrote: >On Fri, Jun 16, 2017 at 2:10 PM, Richard Biener > wrote: >> On Fri, Jun 16, 2017 at 3:06 PM, Bin.Cheng >wrote: >>> On Fri, Jun 16, 2017 at 11:49 AM, Richard Biener >>> wrote: On Wed, Jun 14, 2017 at 3:07 PM, Bin Cheng >wrote: > Hi, > Loop split forces intermediate computation to gimple operands all >the time when > computing bound information. This is not good since folding >opportunities are > missed. This patch fixes the issue by feeding all computation to >folder and only > forcing to gimple operand at last. > > Bootstrap and test on x86_64 and AArch64. Is it OK? Hm? It uses gimple_build () which should do the same as >fold_buildN in terms of simplification. So where does that not work? It is supposed to be the prefered way >and no new code should use force_gimple_operand (unless dealing with >generic coming from other middle-end infrastructure like SCEV or niter >analysis) >>> Hmm, current code calls force_gimpele operand several times which >>> causes the inefficiency. The patch avoids that and does one call at >>> the end. >> >> But it forces to the same sequence that is used for extending the >expression >> so folding should work. Where do you see that it does not? Note the >> code uses gimple_build (), not gimple_build_assign (). >In spec2k6/hmmer, when building fast_algorithms.c with below command >line: >./gcc -Ofast -S fast_algorithms.c -o fast_algorithms.S -fdump-tree-all >-fdump-tree-lsplit >The lsplit dump contains: > [12.75%]: > _124 = _197 + 1; > _123 = _124 + -1; > _115 = MIN_EXPR <_197, _124>; >Which is generated here. That means we miss a pattern in match.PD to handle this case. Richard. >Thanks, >bin >> >> Richard. >> >>> Thanks, >>> bin Richard. > > Thanks, > bin > 2017-06-12 Bin Cheng > > * tree-ssa-loop-split.c (compute_new_first_bound): Feed >bound > computation to folder, rather than force to gimple >operands too > early.
Re: [PATCH GCC][1/2]Feed bound computation to folder in loop split
On Fri, Jun 16, 2017 at 2:10 PM, Richard Bienerwrote: > On Fri, Jun 16, 2017 at 3:06 PM, Bin.Cheng wrote: >> On Fri, Jun 16, 2017 at 11:49 AM, Richard Biener >> wrote: >>> On Wed, Jun 14, 2017 at 3:07 PM, Bin Cheng wrote: Hi, Loop split forces intermediate computation to gimple operands all the time when computing bound information. This is not good since folding opportunities are missed. This patch fixes the issue by feeding all computation to folder and only forcing to gimple operand at last. Bootstrap and test on x86_64 and AArch64. Is it OK? >>> >>> Hm? It uses gimple_build () which should do the same as fold_buildN in >>> terms >>> of simplification. >>> >>> So where does that not work? It is supposed to be the prefered way and no >>> new code should use force_gimple_operand (unless dealing with generic >>> coming from other middle-end infrastructure like SCEV or niter analysis) >> Hmm, current code calls force_gimpele operand several times which >> causes the inefficiency. The patch avoids that and does one call at >> the end. > > But it forces to the same sequence that is used for extending the expression > so folding should work. Where do you see that it does not? Note the > code uses gimple_build (), not gimple_build_assign (). In spec2k6/hmmer, when building fast_algorithms.c with below command line: ./gcc -Ofast -S fast_algorithms.c -o fast_algorithms.S -fdump-tree-all -fdump-tree-lsplit The lsplit dump contains: [12.75%]: _124 = _197 + 1; _123 = _124 + -1; _115 = MIN_EXPR <_197, _124>; Which is generated here. Thanks, bin > > Richard. > >> Thanks, >> bin >>> >>> Richard. >>> Thanks, bin 2017-06-12 Bin Cheng * tree-ssa-loop-split.c (compute_new_first_bound): Feed bound computation to folder, rather than force to gimple operands too early.
Re: [PATCH GCC][1/2]Feed bound computation to folder in loop split
On Fri, Jun 16, 2017 at 3:06 PM, Bin.Chengwrote: > On Fri, Jun 16, 2017 at 11:49 AM, Richard Biener > wrote: >> On Wed, Jun 14, 2017 at 3:07 PM, Bin Cheng wrote: >>> Hi, >>> Loop split forces intermediate computation to gimple operands all the time >>> when >>> computing bound information. This is not good since folding opportunities >>> are >>> missed. This patch fixes the issue by feeding all computation to folder >>> and only >>> forcing to gimple operand at last. >>> >>> Bootstrap and test on x86_64 and AArch64. Is it OK? >> >> Hm? It uses gimple_build () which should do the same as fold_buildN in terms >> of simplification. >> >> So where does that not work? It is supposed to be the prefered way and no >> new code should use force_gimple_operand (unless dealing with generic >> coming from other middle-end infrastructure like SCEV or niter analysis) > Hmm, current code calls force_gimpele operand several times which > causes the inefficiency. The patch avoids that and does one call at > the end. But it forces to the same sequence that is used for extending the expression so folding should work. Where do you see that it does not? Note the code uses gimple_build (), not gimple_build_assign (). Richard. > Thanks, > bin >> >> Richard. >> >>> >>> Thanks, >>> bin >>> 2017-06-12 Bin Cheng >>> >>> * tree-ssa-loop-split.c (compute_new_first_bound): Feed bound >>> computation to folder, rather than force to gimple operands too >>> early.
Re: [PATCH GCC][1/2]Feed bound computation to folder in loop split
On Fri, Jun 16, 2017 at 11:49 AM, Richard Bienerwrote: > On Wed, Jun 14, 2017 at 3:07 PM, Bin Cheng wrote: >> Hi, >> Loop split forces intermediate computation to gimple operands all the time >> when >> computing bound information. This is not good since folding opportunities >> are >> missed. This patch fixes the issue by feeding all computation to folder and >> only >> forcing to gimple operand at last. >> >> Bootstrap and test on x86_64 and AArch64. Is it OK? > > Hm? It uses gimple_build () which should do the same as fold_buildN in terms > of simplification. > > So where does that not work? It is supposed to be the prefered way and no > new code should use force_gimple_operand (unless dealing with generic > coming from other middle-end infrastructure like SCEV or niter analysis) Hmm, current code calls force_gimpele operand several times which causes the inefficiency. The patch avoids that and does one call at the end. Thanks, bin > > Richard. > >> >> Thanks, >> bin >> 2017-06-12 Bin Cheng >> >> * tree-ssa-loop-split.c (compute_new_first_bound): Feed bound >> computation to folder, rather than force to gimple operands too >> early.
Re: [PATCH GCC][1/2]Feed bound computation to folder in loop split
On Wed, Jun 14, 2017 at 3:07 PM, Bin Chengwrote: > Hi, > Loop split forces intermediate computation to gimple operands all the time > when > computing bound information. This is not good since folding opportunities are > missed. This patch fixes the issue by feeding all computation to folder and > only > forcing to gimple operand at last. > > Bootstrap and test on x86_64 and AArch64. Is it OK? Hm? It uses gimple_build () which should do the same as fold_buildN in terms of simplification. So where does that not work? It is supposed to be the prefered way and no new code should use force_gimple_operand (unless dealing with generic coming from other middle-end infrastructure like SCEV or niter analysis) Richard. > > Thanks, > bin > 2017-06-12 Bin Cheng > > * tree-ssa-loop-split.c (compute_new_first_bound): Feed bound > computation to folder, rather than force to gimple operands too > early.
[PATCH GCC][1/2]Feed bound computation to folder in loop split
Hi, Loop split forces intermediate computation to gimple operands all the time when computing bound information. This is not good since folding opportunities are missed. This patch fixes the issue by feeding all computation to folder and only forcing to gimple operand at last. Bootstrap and test on x86_64 and AArch64. Is it OK? Thanks, bin 2017-06-12 Bin Cheng* tree-ssa-loop-split.c (compute_new_first_bound): Feed bound computation to folder, rather than force to gimple operands too early.From 372dc98aa91fd495c98c2326f854eb5f2c76500b Mon Sep 17 00:00:00 2001 From: Bin Cheng Date: Fri, 2 Jun 2017 18:05:03 +0100 Subject: [PATCH 1/2] feed-bound-computation-to-folder-20170601.txt --- gcc/tree-ssa-loop-split.c | 77 ++- 1 file changed, 30 insertions(+), 47 deletions(-) diff --git a/gcc/tree-ssa-loop-split.c b/gcc/tree-ssa-loop-split.c index e77f2bf..f8fe8e6 100644 --- a/gcc/tree-ssa-loop-split.c +++ b/gcc/tree-ssa-loop-split.c @@ -396,53 +396,38 @@ compute_new_first_bound (gimple_seq *stmts, struct tree_niter_desc *niter, { /* The niter structure contains the after-increment IV, we need the loop-enter base, so subtract STEP once. */ - tree controlbase = force_gimple_operand (niter->control.base, - stmts, true, NULL_TREE); + tree controlbase = niter->control.base; tree controlstep = niter->control.step; - tree enddiff; + tree enddiff, end = niter->bound; + tree type; + + /* Compute end-beg. */ if (POINTER_TYPE_P (TREE_TYPE (controlbase))) { - controlstep = gimple_build (stmts, NEGATE_EXPR, - TREE_TYPE (controlstep), controlstep); - enddiff = gimple_build (stmts, POINTER_PLUS_EXPR, - TREE_TYPE (controlbase), - controlbase, controlstep); + controlstep = fold_build1 (NEGATE_EXPR, +TREE_TYPE (controlstep), controlstep); + enddiff = fold_build_pointer_plus (controlbase, controlstep); + + type = unsigned_type_for (enddiff); + enddiff = fold_build1 (NEGATE_EXPR, type, fold_convert (type, enddiff)); + end = fold_convert (type, end); + enddiff = fold_build2 (PLUS_EXPR, type, end, enddiff); + enddiff = fold_convert (sizetype, enddiff); } else -enddiff = gimple_build (stmts, MINUS_EXPR, - TREE_TYPE (controlbase), - controlbase, controlstep); - - /* Compute end-beg. */ - gimple_seq stmts2; - tree end = force_gimple_operand (niter->bound, , - true, NULL_TREE); - gimple_seq_add_seq_without_update (stmts, stmts2); - if (POINTER_TYPE_P (TREE_TYPE (enddiff))) { - tree tem = gimple_convert (stmts, sizetype, enddiff); - tem = gimple_build (stmts, NEGATE_EXPR, sizetype, tem); - enddiff = gimple_build (stmts, POINTER_PLUS_EXPR, - TREE_TYPE (enddiff), - end, tem); + enddiff = fold_build2 (MINUS_EXPR, TREE_TYPE (controlbase), +controlbase, controlstep); + enddiff = fold_build2 (MINUS_EXPR, TREE_TYPE (enddiff), end, enddiff); } - else -enddiff = gimple_build (stmts, MINUS_EXPR, TREE_TYPE (enddiff), - end, enddiff); /* Compute guard_init + (end-beg). */ tree newbound; - enddiff = gimple_convert (stmts, TREE_TYPE (guard_init), enddiff); if (POINTER_TYPE_P (TREE_TYPE (guard_init))) -{ - enddiff = gimple_convert (stmts, sizetype, enddiff); - newbound = gimple_build (stmts, POINTER_PLUS_EXPR, - TREE_TYPE (guard_init), - guard_init, enddiff); -} +newbound = fold_build_pointer_plus (guard_init, enddiff); else -newbound = gimple_build (stmts, PLUS_EXPR, TREE_TYPE (guard_init), -guard_init, enddiff); +newbound = fold_build2 (PLUS_EXPR, TREE_TYPE (guard_init), guard_init, + fold_convert (TREE_TYPE (guard_init), enddiff)); /* Depending on the direction of the IVs the new bound for the first loop is the minimum or maximum of old bound and border. @@ -467,20 +452,18 @@ compute_new_first_bound (gimple_seq *stmts, struct tree_niter_desc *niter, if (addbound) { - tree type2 = TREE_TYPE (newbound); - if (POINTER_TYPE_P (type2)) - type2 = sizetype; - newbound = gimple_build (stmts, - POINTER_TYPE_P (TREE_TYPE (newbound)) - ? POINTER_PLUS_EXPR : PLUS_EXPR, - TREE_TYPE (newbound), - newbound, - build_int_cst (type2, addbound)); + type = TREE_TYPE (newbound); + if (POINTER_TYPE_P