Re: [PATCH GCC][1/2]Feed bound computation to folder in loop split

2017-07-26 Thread Richard Biener
On Wed, Jul 26, 2017 at 9:48 AM, Richard Biener
 wrote:
> On Tue, Jul 25, 2017 at 7:45 PM, Marc Glisse  wrote:
>> On Tue, 25 Jul 2017, Richard Biener wrote:
>>
 I think we need Richard to say what the intent is for the valueization
 function. It is used both to stop looking at defining stmt if the return
 is
 NULL, and to replace/optimize one SSA_NAME with another, but currently it
 seems hard to prevent looking at the defining statement without
 preventing
 from looking at the SSA_NAME at all.
>>>
>>>
>>> Yeah, this semantic overloading is an issue.  For gimple_build we have
>>> nothing
>>> to "valueize" but we only use it to tell genmatch that it may not look at
>>> the
>>> SSA_NAME_DEF_STMT.
>>>
 I guess we'll need a fix in genmatch...
>>>
>>>
>>> I'll have a look tomorrow.
>>
>>
>> My impression yesterday was that we could replace the current do_valueize
>> wrapper by 2 wrappers (without touching the valueize callbacks):
>> - may_check_def_stmt, which returns a bool corresponding to the current
>> do_valueize != NULL_TREE
>> - maybe_valueize, which tries to valueize, but if it gets a NULL_TREE, it
>> returns its argument unchanged.
>>
>> Not very confident about it though.
>
> Note I've been there in the past (twice I think) but always ran into existing
> latent issues.  So hopefully we've resolved those, I'm testing the following
> simplified variant of what I had back in time.
>
> It'll produce
>
>   switch (TREE_CODE (op0))
> {
> case SSA_NAME:
>   if (gimple *def_stmt = get_def (valueize, op0))
> {
>   if (gassign *def = dyn_cast  (def_stmt))
> switch (gimple_assign_rhs_code (def))
>   {
>   case MINUS_EXPR:
> {
>   tree o20 = gimple_assign_rhs1 (def);
>   o20 = do_valueize (valueize, o20);
>   tree o21 = gimple_assign_rhs2 (def);
>   o21 = do_valueize (valueize, o21);
>   if (op1 == o21 || (operand_equal_p (op1, o21, 0) &&
> types_match (op1, o21)))
> {
>
> which also indents less which is nice.
>
> Bootstrap/regtest running on x86_64-unknown-linux-gnu.

Now applied with

* gcc/testsuite/gcc.dg/pr70920-2.c: Adjust for transform already
happening in ccp1.
* gcc/testsuite/gcc.dg/pr70920-4.c: Likewise.

that shows it's working (CCP is one of the passes eventually running
into this issue).

Richard.

> Richard.
>
> 2017-07-26  Richard Biener  
>
> * gimple-match-head.c (do_valueize): Return OP if valueize
> returns NULL_TREE.
> (get_def): New helper to get at the def stmt of a SSA name
> if valueize allows.
> * genmatch.c (dt_node::gen_kids_1): Use get_def instead of
> do_valueize to get at the def stmt.
> (dt_operand::gen_gimple_expr): Simplify do_valueize calls.
>
>
>> --
>> Marc Glisse


Re: [PATCH GCC][1/2]Feed bound computation to folder in loop split

2017-07-26 Thread Richard Biener
On Wed, Jul 26, 2017 at 11:57 AM, Marc Glisse  wrote:
> On Wed, 26 Jul 2017, Richard Sandiford wrote:
>
>> Marc Glisse  writes:
>>>
>>> On Wed, 26 Jul 2017, Richard Sandiford wrote:

 Richard Biener  writes:
>
> On Tue, Jul 25, 2017 at 7:45 PM, Marc Glisse 
> wrote:
>>
>> On Tue, 25 Jul 2017, Richard Biener wrote:
>>
 I think we need Richard to say what the intent is for the
 valueization
 function. It is used both to stop looking at defining stmt if the
 return
 is
 NULL, and to replace/optimize one SSA_NAME with another, but
 currently it
 seems hard to prevent looking at the defining statement without
 preventing
 from looking at the SSA_NAME at all.
>>>
>>>
>>>
>>> Yeah, this semantic overloading is an issue.  For gimple_build we
>>> have
>>> nothing
>>> to "valueize" but we only use it to tell genmatch that it may not
>>> look at
>>> the
>>> SSA_NAME_DEF_STMT.
>>>
 I guess we'll need a fix in genmatch...
>>>
>>>
>>>
>>> I'll have a look tomorrow.
>>
>>
>>
>> My impression yesterday was that we could replace the current
>> do_valueize
>> wrapper by 2 wrappers (without touching the valueize callbacks):
>> - may_check_def_stmt, which returns a bool corresponding to the
>> current
>> do_valueize != NULL_TREE
>> - maybe_valueize, which tries to valueize, but if it gets a NULL_TREE,
>> it
>> returns its argument unchanged.
>>
>> Not very confident about it though.
>
>
> Note I've been there in the past (twice I think) but always ran into
> existing
> latent issues.  So hopefully we've resolved those, I'm testing the
> following
> simplified variant of what I had back in time.
>
> It'll produce
>
>   switch (TREE_CODE (op0))
> {
> case SSA_NAME:
>   if (gimple *def_stmt = get_def (valueize, op0))
> {
>   if (gassign *def = dyn_cast  (def_stmt))
> switch (gimple_assign_rhs_code (def))
>   {
>   case MINUS_EXPR:
> {
>   tree o20 = gimple_assign_rhs1 (def);
>   o20 = do_valueize (valueize, o20);
>   tree o21 = gimple_assign_rhs2 (def);
>   o21 = do_valueize (valueize, o21);
>   if (op1 == o21 || (operand_equal_p (op1, o21, 0) &&
> types_match (op1, o21)))
> {
>
> which also indents less which is nice.
>
> Bootstrap/regtest running on x86_64-unknown-linux-gnu.
>
> Richard.
>
> 2017-07-26  Richard Biener  
>
> * gimple-match-head.c (do_valueize): Return OP if valueize
> returns NULL_TREE.
> (get_def): New helper to get at the def stmt of a SSA name
> if valueize allows.
> * genmatch.c (dt_node::gen_kids_1): Use get_def instead of
> do_valueize to get at the def stmt.
> (dt_operand::gen_gimple_expr): Simplify do_valueize calls.
>
>
>> --
>> Marc Glisse
>
>
> 2017-07-26  Richard Biener  
>
> * gimple-match-head.c (do_valueize): Return OP if valueize
> returns NULL_TREE.
> (get_def): New helper to get at the def stmt of a SSA name
> if valueize allows.
> * genmatch.c (dt_node::gen_kids_1): Use get_def instead of
> do_valueize to get at the def stmt.
> (dt_operand::gen_gimple_expr): Simplify do_valueize calls.
>
> Index: gcc/gimple-match-head.c
> ===
> --- gcc/gimple-match-head.c (revision 250518)
> +++ gcc/gimple-match-head.c (working copy)
> @@ -779,10 +779,25 @@ inline tree
>  do_valueize (tree (*valueize)(tree), tree op)
>  {
>if (valueize && TREE_CODE (op) == SSA_NAME)
> -return valueize (op);
> +{
> +  tree tem = valueize (op);
> +  if (tem)
> +   return tem;
> +}
>return op;
>  }
>
> +/* Helper for the autogenerated code, get at the definition of NAME
> when
> +   VALUEIZE allows that.  */
> +
> +inline gimple *
> +get_def (tree (*valueize)(tree), tree name)
> +{
> +  if (valueize && ! valueize (name))
> +return NULL;
> +  return SSA_NAME_DEF_STMT (name);
> +}


 I realise this is preexisting, but why do we ignore the value returned
 by valueize, even if it's different from NAME?
>>>
>>>
>>> My impression is that we expect it has already been valueized.
>>
>>
>> But in that case, why do we try to 

Re: [PATCH GCC][1/2]Feed bound computation to folder in loop split

2017-07-26 Thread Marc Glisse

On Wed, 26 Jul 2017, Richard Sandiford wrote:


Marc Glisse  writes:

On Wed, 26 Jul 2017, Richard Sandiford wrote:

Richard Biener  writes:

On Tue, Jul 25, 2017 at 7:45 PM, Marc Glisse  wrote:

On Tue, 25 Jul 2017, Richard Biener wrote:


I think we need Richard to say what the intent is for the valueization
function. It is used both to stop looking at defining stmt if the return
is
NULL, and to replace/optimize one SSA_NAME with another, but currently it
seems hard to prevent looking at the defining statement without
preventing
from looking at the SSA_NAME at all.



Yeah, this semantic overloading is an issue.  For gimple_build we have
nothing
to "valueize" but we only use it to tell genmatch that it may not look at
the
SSA_NAME_DEF_STMT.


I guess we'll need a fix in genmatch...



I'll have a look tomorrow.



My impression yesterday was that we could replace the current do_valueize
wrapper by 2 wrappers (without touching the valueize callbacks):
- may_check_def_stmt, which returns a bool corresponding to the current
do_valueize != NULL_TREE
- maybe_valueize, which tries to valueize, but if it gets a NULL_TREE, it
returns its argument unchanged.

Not very confident about it though.


Note I've been there in the past (twice I think) but always ran into existing
latent issues.  So hopefully we've resolved those, I'm testing the following
simplified variant of what I had back in time.

It'll produce

  switch (TREE_CODE (op0))
{
case SSA_NAME:
  if (gimple *def_stmt = get_def (valueize, op0))
{
  if (gassign *def = dyn_cast  (def_stmt))
switch (gimple_assign_rhs_code (def))
  {
  case MINUS_EXPR:
{
  tree o20 = gimple_assign_rhs1 (def);
  o20 = do_valueize (valueize, o20);
  tree o21 = gimple_assign_rhs2 (def);
  o21 = do_valueize (valueize, o21);
  if (op1 == o21 || (operand_equal_p (op1, o21, 0) &&
types_match (op1, o21)))
{

which also indents less which is nice.

Bootstrap/regtest running on x86_64-unknown-linux-gnu.

Richard.

2017-07-26  Richard Biener  

* gimple-match-head.c (do_valueize): Return OP if valueize
returns NULL_TREE.
(get_def): New helper to get at the def stmt of a SSA name
if valueize allows.
* genmatch.c (dt_node::gen_kids_1): Use get_def instead of
do_valueize to get at the def stmt.
(dt_operand::gen_gimple_expr): Simplify do_valueize calls.



--
Marc Glisse


2017-07-26  Richard Biener  

* gimple-match-head.c (do_valueize): Return OP if valueize
returns NULL_TREE.
(get_def): New helper to get at the def stmt of a SSA name
if valueize allows.
* genmatch.c (dt_node::gen_kids_1): Use get_def instead of
do_valueize to get at the def stmt.
(dt_operand::gen_gimple_expr): Simplify do_valueize calls.

Index: gcc/gimple-match-head.c
===
--- gcc/gimple-match-head.c (revision 250518)
+++ gcc/gimple-match-head.c (working copy)
@@ -779,10 +779,25 @@ inline tree
 do_valueize (tree (*valueize)(tree), tree op)
 {
   if (valueize && TREE_CODE (op) == SSA_NAME)
-return valueize (op);
+{
+  tree tem = valueize (op);
+  if (tem)
+   return tem;
+}
   return op;
 }

+/* Helper for the autogenerated code, get at the definition of NAME when
+   VALUEIZE allows that.  */
+
+inline gimple *
+get_def (tree (*valueize)(tree), tree name)
+{
+  if (valueize && ! valueize (name))
+return NULL;
+  return SSA_NAME_DEF_STMT (name);
+}


I realise this is preexisting, but why do we ignore the value returned
by valueize, even if it's different from NAME?


My impression is that we expect it has already been valueized.


But in that case, why do we try to valueize it again?


We don't know if valueize returned NULL and we kept the argument as is 
(should not look at def stmt) or if valueize actually returned something. 
This way we can also have valueize replace a value with another, and still 
forbid looking at the def stmt of the replacement value.



It feels like we're conflating two things.


We are. The question is how much trouble that causes, compared to the 
complication of for instance having 2 callbacks.


--
Marc Glisse


Re: [PATCH GCC][1/2]Feed bound computation to folder in loop split

2017-07-26 Thread Richard Sandiford
Marc Glisse  writes:
> On Wed, 26 Jul 2017, Richard Sandiford wrote:
>> Richard Biener  writes:
>>> On Tue, Jul 25, 2017 at 7:45 PM, Marc Glisse  wrote:
 On Tue, 25 Jul 2017, Richard Biener wrote:

>> I think we need Richard to say what the intent is for the valueization
>> function. It is used both to stop looking at defining stmt if the return
>> is
>> NULL, and to replace/optimize one SSA_NAME with another, but currently it
>> seems hard to prevent looking at the defining statement without
>> preventing
>> from looking at the SSA_NAME at all.
>
>
> Yeah, this semantic overloading is an issue.  For gimple_build we have
> nothing
> to "valueize" but we only use it to tell genmatch that it may not look at
> the
> SSA_NAME_DEF_STMT.
>
>> I guess we'll need a fix in genmatch...
>
>
> I'll have a look tomorrow.


 My impression yesterday was that we could replace the current do_valueize
 wrapper by 2 wrappers (without touching the valueize callbacks):
 - may_check_def_stmt, which returns a bool corresponding to the current
 do_valueize != NULL_TREE
 - maybe_valueize, which tries to valueize, but if it gets a NULL_TREE, it
 returns its argument unchanged.

 Not very confident about it though.
>>>
>>> Note I've been there in the past (twice I think) but always ran into 
>>> existing
>>> latent issues.  So hopefully we've resolved those, I'm testing the following
>>> simplified variant of what I had back in time.
>>>
>>> It'll produce
>>>
>>>   switch (TREE_CODE (op0))
>>> {
>>> case SSA_NAME:
>>>   if (gimple *def_stmt = get_def (valueize, op0))
>>> {
>>>   if (gassign *def = dyn_cast  (def_stmt))
>>> switch (gimple_assign_rhs_code (def))
>>>   {
>>>   case MINUS_EXPR:
>>> {
>>>   tree o20 = gimple_assign_rhs1 (def);
>>>   o20 = do_valueize (valueize, o20);
>>>   tree o21 = gimple_assign_rhs2 (def);
>>>   o21 = do_valueize (valueize, o21);
>>>   if (op1 == o21 || (operand_equal_p (op1, o21, 0) &&
>>> types_match (op1, o21)))
>>> {
>>>
>>> which also indents less which is nice.
>>>
>>> Bootstrap/regtest running on x86_64-unknown-linux-gnu.
>>>
>>> Richard.
>>>
>>> 2017-07-26  Richard Biener  
>>>
>>> * gimple-match-head.c (do_valueize): Return OP if valueize
>>> returns NULL_TREE.
>>> (get_def): New helper to get at the def stmt of a SSA name
>>> if valueize allows.
>>> * genmatch.c (dt_node::gen_kids_1): Use get_def instead of
>>> do_valueize to get at the def stmt.
>>> (dt_operand::gen_gimple_expr): Simplify do_valueize calls.
>>>
>>>
 --
 Marc Glisse
>>>
>>> 2017-07-26  Richard Biener  
>>>
>>> * gimple-match-head.c (do_valueize): Return OP if valueize
>>> returns NULL_TREE.
>>> (get_def): New helper to get at the def stmt of a SSA name
>>> if valueize allows.
>>> * genmatch.c (dt_node::gen_kids_1): Use get_def instead of
>>> do_valueize to get at the def stmt.
>>> (dt_operand::gen_gimple_expr): Simplify do_valueize calls.
>>>
>>> Index: gcc/gimple-match-head.c
>>> ===
>>> --- gcc/gimple-match-head.c (revision 250518)
>>> +++ gcc/gimple-match-head.c (working copy)
>>> @@ -779,10 +779,25 @@ inline tree
>>>  do_valueize (tree (*valueize)(tree), tree op)
>>>  {
>>>if (valueize && TREE_CODE (op) == SSA_NAME)
>>> -return valueize (op);
>>> +{
>>> +  tree tem = valueize (op);
>>> +  if (tem)
>>> +   return tem;
>>> +}
>>>return op;
>>>  }
>>>
>>> +/* Helper for the autogenerated code, get at the definition of NAME when
>>> +   VALUEIZE allows that.  */
>>> +
>>> +inline gimple *
>>> +get_def (tree (*valueize)(tree), tree name)
>>> +{
>>> +  if (valueize && ! valueize (name))
>>> +return NULL;
>>> +  return SSA_NAME_DEF_STMT (name);
>>> +}
>>
>> I realise this is preexisting, but why do we ignore the value returned
>> by valueize, even if it's different from NAME?
>
> My impression is that we expect it has already been valueized.

But in that case, why do we try to valueize it again?  It feels like
we're conflating two things.

Richard


Re: [PATCH GCC][1/2]Feed bound computation to folder in loop split

2017-07-26 Thread Marc Glisse

On Wed, 26 Jul 2017, Richard Sandiford wrote:


Richard Biener  writes:

On Tue, Jul 25, 2017 at 7:45 PM, Marc Glisse  wrote:

On Tue, 25 Jul 2017, Richard Biener wrote:


I think we need Richard to say what the intent is for the valueization
function. It is used both to stop looking at defining stmt if the return
is
NULL, and to replace/optimize one SSA_NAME with another, but currently it
seems hard to prevent looking at the defining statement without
preventing
from looking at the SSA_NAME at all.



Yeah, this semantic overloading is an issue.  For gimple_build we have
nothing
to "valueize" but we only use it to tell genmatch that it may not look at
the
SSA_NAME_DEF_STMT.


I guess we'll need a fix in genmatch...



I'll have a look tomorrow.



My impression yesterday was that we could replace the current do_valueize
wrapper by 2 wrappers (without touching the valueize callbacks):
- may_check_def_stmt, which returns a bool corresponding to the current
do_valueize != NULL_TREE
- maybe_valueize, which tries to valueize, but if it gets a NULL_TREE, it
returns its argument unchanged.

Not very confident about it though.


Note I've been there in the past (twice I think) but always ran into existing
latent issues.  So hopefully we've resolved those, I'm testing the following
simplified variant of what I had back in time.

It'll produce

  switch (TREE_CODE (op0))
{
case SSA_NAME:
  if (gimple *def_stmt = get_def (valueize, op0))
{
  if (gassign *def = dyn_cast  (def_stmt))
switch (gimple_assign_rhs_code (def))
  {
  case MINUS_EXPR:
{
  tree o20 = gimple_assign_rhs1 (def);
  o20 = do_valueize (valueize, o20);
  tree o21 = gimple_assign_rhs2 (def);
  o21 = do_valueize (valueize, o21);
  if (op1 == o21 || (operand_equal_p (op1, o21, 0) &&
types_match (op1, o21)))
{

which also indents less which is nice.

Bootstrap/regtest running on x86_64-unknown-linux-gnu.

Richard.

2017-07-26  Richard Biener  

* gimple-match-head.c (do_valueize): Return OP if valueize
returns NULL_TREE.
(get_def): New helper to get at the def stmt of a SSA name
if valueize allows.
* genmatch.c (dt_node::gen_kids_1): Use get_def instead of
do_valueize to get at the def stmt.
(dt_operand::gen_gimple_expr): Simplify do_valueize calls.



--
Marc Glisse


2017-07-26  Richard Biener  

* gimple-match-head.c (do_valueize): Return OP if valueize
returns NULL_TREE.
(get_def): New helper to get at the def stmt of a SSA name
if valueize allows.
* genmatch.c (dt_node::gen_kids_1): Use get_def instead of
do_valueize to get at the def stmt.
(dt_operand::gen_gimple_expr): Simplify do_valueize calls.

Index: gcc/gimple-match-head.c
===
--- gcc/gimple-match-head.c (revision 250518)
+++ gcc/gimple-match-head.c (working copy)
@@ -779,10 +779,25 @@ inline tree
 do_valueize (tree (*valueize)(tree), tree op)
 {
   if (valueize && TREE_CODE (op) == SSA_NAME)
-return valueize (op);
+{
+  tree tem = valueize (op);
+  if (tem)
+   return tem;
+}
   return op;
 }

+/* Helper for the autogenerated code, get at the definition of NAME when
+   VALUEIZE allows that.  */
+
+inline gimple *
+get_def (tree (*valueize)(tree), tree name)
+{
+  if (valueize && ! valueize (name))
+return NULL;
+  return SSA_NAME_DEF_STMT (name);
+}


I realise this is preexisting, but why do we ignore the value returned
by valueize, even if it's different from NAME?


My impression is that we expect it has already been valueized.

--
Marc Glisse


Re: [PATCH GCC][1/2]Feed bound computation to folder in loop split

2017-07-26 Thread Richard Sandiford
Richard Biener  writes:
> On Tue, Jul 25, 2017 at 7:45 PM, Marc Glisse  wrote:
>> On Tue, 25 Jul 2017, Richard Biener wrote:
>>
 I think we need Richard to say what the intent is for the valueization
 function. It is used both to stop looking at defining stmt if the return
 is
 NULL, and to replace/optimize one SSA_NAME with another, but currently it
 seems hard to prevent looking at the defining statement without
 preventing
 from looking at the SSA_NAME at all.
>>>
>>>
>>> Yeah, this semantic overloading is an issue.  For gimple_build we have
>>> nothing
>>> to "valueize" but we only use it to tell genmatch that it may not look at
>>> the
>>> SSA_NAME_DEF_STMT.
>>>
 I guess we'll need a fix in genmatch...
>>>
>>>
>>> I'll have a look tomorrow.
>>
>>
>> My impression yesterday was that we could replace the current do_valueize
>> wrapper by 2 wrappers (without touching the valueize callbacks):
>> - may_check_def_stmt, which returns a bool corresponding to the current
>> do_valueize != NULL_TREE
>> - maybe_valueize, which tries to valueize, but if it gets a NULL_TREE, it
>> returns its argument unchanged.
>>
>> Not very confident about it though.
>
> Note I've been there in the past (twice I think) but always ran into existing
> latent issues.  So hopefully we've resolved those, I'm testing the following
> simplified variant of what I had back in time.
>
> It'll produce
>
>   switch (TREE_CODE (op0))
> {
> case SSA_NAME:
>   if (gimple *def_stmt = get_def (valueize, op0))
> {
>   if (gassign *def = dyn_cast  (def_stmt))
> switch (gimple_assign_rhs_code (def))
>   {
>   case MINUS_EXPR:
> {
>   tree o20 = gimple_assign_rhs1 (def);
>   o20 = do_valueize (valueize, o20);
>   tree o21 = gimple_assign_rhs2 (def);
>   o21 = do_valueize (valueize, o21);
>   if (op1 == o21 || (operand_equal_p (op1, o21, 0) &&
> types_match (op1, o21)))
> {
>
> which also indents less which is nice.
>
> Bootstrap/regtest running on x86_64-unknown-linux-gnu.
>
> Richard.
>
> 2017-07-26  Richard Biener  
>
> * gimple-match-head.c (do_valueize): Return OP if valueize
> returns NULL_TREE.
> (get_def): New helper to get at the def stmt of a SSA name
> if valueize allows.
> * genmatch.c (dt_node::gen_kids_1): Use get_def instead of
> do_valueize to get at the def stmt.
> (dt_operand::gen_gimple_expr): Simplify do_valueize calls.
>
>
>> --
>> Marc Glisse
>
> 2017-07-26  Richard Biener  
>
>   * gimple-match-head.c (do_valueize): Return OP if valueize
>   returns NULL_TREE.
>   (get_def): New helper to get at the def stmt of a SSA name
>   if valueize allows.
>   * genmatch.c (dt_node::gen_kids_1): Use get_def instead of
>   do_valueize to get at the def stmt.
>   (dt_operand::gen_gimple_expr): Simplify do_valueize calls.
>
> Index: gcc/gimple-match-head.c
> ===
> --- gcc/gimple-match-head.c   (revision 250518)
> +++ gcc/gimple-match-head.c   (working copy)
> @@ -779,10 +779,25 @@ inline tree
>  do_valueize (tree (*valueize)(tree), tree op)
>  {
>if (valueize && TREE_CODE (op) == SSA_NAME)
> -return valueize (op);
> +{
> +  tree tem = valueize (op);
> +  if (tem)
> + return tem;
> +}
>return op;
>  }
>  
> +/* Helper for the autogenerated code, get at the definition of NAME when
> +   VALUEIZE allows that.  */
> +
> +inline gimple *
> +get_def (tree (*valueize)(tree), tree name)
> +{
> +  if (valueize && ! valueize (name))
> +return NULL;
> +  return SSA_NAME_DEF_STMT (name);
> +}

I realise this is preexisting, but why do we ignore the value returned
by valueize, even if it's different from NAME?  I struggled over that
with the old code when I hit the same problem with some SVE patches
(for which, thanks for the fix).

A comment might be good.

Thanks,
Richard


Re: [PATCH GCC][1/2]Feed bound computation to folder in loop split

2017-07-26 Thread Richard Biener
On Tue, Jul 25, 2017 at 7:45 PM, Marc Glisse  wrote:
> On Tue, 25 Jul 2017, Richard Biener wrote:
>
>>> I think we need Richard to say what the intent is for the valueization
>>> function. It is used both to stop looking at defining stmt if the return
>>> is
>>> NULL, and to replace/optimize one SSA_NAME with another, but currently it
>>> seems hard to prevent looking at the defining statement without
>>> preventing
>>> from looking at the SSA_NAME at all.
>>
>>
>> Yeah, this semantic overloading is an issue.  For gimple_build we have
>> nothing
>> to "valueize" but we only use it to tell genmatch that it may not look at
>> the
>> SSA_NAME_DEF_STMT.
>>
>>> I guess we'll need a fix in genmatch...
>>
>>
>> I'll have a look tomorrow.
>
>
> My impression yesterday was that we could replace the current do_valueize
> wrapper by 2 wrappers (without touching the valueize callbacks):
> - may_check_def_stmt, which returns a bool corresponding to the current
> do_valueize != NULL_TREE
> - maybe_valueize, which tries to valueize, but if it gets a NULL_TREE, it
> returns its argument unchanged.
>
> Not very confident about it though.

Note I've been there in the past (twice I think) but always ran into existing
latent issues.  So hopefully we've resolved those, I'm testing the following
simplified variant of what I had back in time.

It'll produce

  switch (TREE_CODE (op0))
{
case SSA_NAME:
  if (gimple *def_stmt = get_def (valueize, op0))
{
  if (gassign *def = dyn_cast  (def_stmt))
switch (gimple_assign_rhs_code (def))
  {
  case MINUS_EXPR:
{
  tree o20 = gimple_assign_rhs1 (def);
  o20 = do_valueize (valueize, o20);
  tree o21 = gimple_assign_rhs2 (def);
  o21 = do_valueize (valueize, o21);
  if (op1 == o21 || (operand_equal_p (op1, o21, 0) &&
types_match (op1, o21)))
{

which also indents less which is nice.

Bootstrap/regtest running on x86_64-unknown-linux-gnu.

Richard.

2017-07-26  Richard Biener  

* gimple-match-head.c (do_valueize): Return OP if valueize
returns NULL_TREE.
(get_def): New helper to get at the def stmt of a SSA name
if valueize allows.
* genmatch.c (dt_node::gen_kids_1): Use get_def instead of
do_valueize to get at the def stmt.
(dt_operand::gen_gimple_expr): Simplify do_valueize calls.


> --
> Marc Glisse


p3
Description: Binary data


Re: [PATCH GCC][1/2]Feed bound computation to folder in loop split

2017-07-25 Thread Marc Glisse

On Tue, 25 Jul 2017, Richard Biener wrote:


I think we need Richard to say what the intent is for the valueization
function. It is used both to stop looking at defining stmt if the return is
NULL, and to replace/optimize one SSA_NAME with another, but currently it
seems hard to prevent looking at the defining statement without preventing
from looking at the SSA_NAME at all.


Yeah, this semantic overloading is an issue.  For gimple_build we have nothing
to "valueize" but we only use it to tell genmatch that it may not look at the
SSA_NAME_DEF_STMT.


I guess we'll need a fix in genmatch...


I'll have a look tomorrow.


My impression yesterday was that we could replace the current do_valueize 
wrapper by 2 wrappers (without touching the valueize callbacks):
- may_check_def_stmt, which returns a bool corresponding to the current 
do_valueize != NULL_TREE
- maybe_valueize, which tries to valueize, but if it gets a NULL_TREE, 
it returns its argument unchanged.


Not very confident about it though.

--
Marc Glisse


Re: [PATCH GCC][1/2]Feed bound computation to folder in loop split

2017-07-25 Thread Richard Biener
On Mon, Jul 24, 2017 at 4:31 PM, Marc Glisse  wrote:
> On Mon, 24 Jul 2017, Bin.Cheng wrote:
>
>> On Mon, Jul 24, 2017 at 2:59 PM, Marc Glisse  wrote:
>>>
>>> On Mon, 24 Jul 2017, Bin.Cheng wrote:
>>>
 But since definition of _197 isn't in current stmt sequence, call "o31
 = do_valueize (valueize, o31)" will return NULL.  As a result, it's
 not matched.
>>>
>>>
>>>
>>> Wait, actually, how was your fold_build* version working? Why was the
>>> first
>>> addition "in the current generic tree" and why isn't it "in current stmt
>>> sequence"?
>>
>> Maybe I didn't express it clearly.  In compute_new_first_bound, we
>> have stmt sequence "_124 = _197 + 1", and we try to simplify "_124 -
>> 1" by calling gimple_build.  The definition of _197 is a PHI and isn't
>> in current stmt sequence.  For fold_build* version, it builds
>> expression "_197 + 1 - 1" and simplifies it.
>
>
> It seems like it shouldn't be relevant whether the definition of _197 is in
> the stmt sequence or not, but indeed we seem to generate a lot of calls to
> do_valueize... I had misunderstood the issue, sorry.
>
> Strangely, for a pattern like
> (simplify (plus @0 @1) @0)
> we generate no call to valueize,

Expected.  We expect the "toplevel" trees to be already valueized.

> while for the following
> (simplify (plus @0 (minus @1 @2)) @0)
> we generate 3 calls to do_valueize.
>
> I think we need Richard to say what the intent is for the valueization
> function. It is used both to stop looking at defining stmt if the return is
> NULL, and to replace/optimize one SSA_NAME with another, but currently it
> seems hard to prevent looking at the defining statement without preventing
> from looking at the SSA_NAME at all.

Yeah, this semantic overloading is an issue.  For gimple_build we have nothing
to "valueize" but we only use it to tell genmatch that it may not look at the
SSA_NAME_DEF_STMT.

> I guess we'll need a fix in genmatch...

I'll have a look tomorrow.

RIchard.

> --
> Marc Glisse


Re: [PATCH GCC][1/2]Feed bound computation to folder in loop split

2017-07-24 Thread Marc Glisse

On Mon, 24 Jul 2017, Bin.Cheng wrote:


On Mon, Jul 24, 2017 at 3:31 PM, Marc Glisse  wrote:

On Mon, 24 Jul 2017, Bin.Cheng wrote:


On Mon, Jul 24, 2017 at 2:59 PM, Marc Glisse  wrote:


On Mon, 24 Jul 2017, Bin.Cheng wrote:


But since definition of _197 isn't in current stmt sequence, call "o31
= do_valueize (valueize, o31)" will return NULL.  As a result, it's
not matched.




Wait, actually, how was your fold_build* version working? Why was the
first
addition "in the current generic tree" and why isn't it "in current stmt
sequence"?


Maybe I didn't express it clearly.  In compute_new_first_bound, we
have stmt sequence "_124 = _197 + 1", and we try to simplify "_124 -
1" by calling gimple_build.  The definition of _197 is a PHI and isn't
in current stmt sequence.  For fold_build* version, it builds
expression "_197 + 1 - 1" and simplifies it.



It seems like it shouldn't be relevant whether the definition of _197 is in
the stmt sequence or not, but indeed we seem to generate a lot of calls to
do_valueize... I had misunderstood the issue, sorry.

Oh, no need at all, and thanks very much for all the explanation.


Strangely, for a pattern like
(simplify (plus @0 @1) @0)
we generate no call to valueize, while for the following
(simplify (plus @0 (minus @1 @2)) @0)
we generate 3 calls to do_valueize.

I think we need Richard to say what the intent is for the valueization
function. It is used both to stop looking at defining stmt if the return is
NULL, and to replace/optimize one SSA_NAME with another, but currently it
seems hard to prevent looking at the defining statement without preventing
from looking at the SSA_NAME at all.

Looks we don't really expand into def_stmt on leaf nodes, maybe
valueization can be saved in the case?


Passes with a lattice (say PRE) use valueization to replace an SSA_NAME 
with an equivalent one, not just to let it look through the defining 
statement, so I am not sure if we can get rid of those completely, or if 
we need two different do_valueize wrappers for the 2 types of uses.


--
Marc Glisse


Re: [PATCH GCC][1/2]Feed bound computation to folder in loop split

2017-07-24 Thread Bin.Cheng
On Mon, Jul 24, 2017 at 3:31 PM, Marc Glisse  wrote:
> On Mon, 24 Jul 2017, Bin.Cheng wrote:
>
>> On Mon, Jul 24, 2017 at 2:59 PM, Marc Glisse  wrote:
>>>
>>> On Mon, 24 Jul 2017, Bin.Cheng wrote:
>>>
 But since definition of _197 isn't in current stmt sequence, call "o31
 = do_valueize (valueize, o31)" will return NULL.  As a result, it's
 not matched.
>>>
>>>
>>>
>>> Wait, actually, how was your fold_build* version working? Why was the
>>> first
>>> addition "in the current generic tree" and why isn't it "in current stmt
>>> sequence"?
>>
>> Maybe I didn't express it clearly.  In compute_new_first_bound, we
>> have stmt sequence "_124 = _197 + 1", and we try to simplify "_124 -
>> 1" by calling gimple_build.  The definition of _197 is a PHI and isn't
>> in current stmt sequence.  For fold_build* version, it builds
>> expression "_197 + 1 - 1" and simplifies it.
>
>
> It seems like it shouldn't be relevant whether the definition of _197 is in
> the stmt sequence or not, but indeed we seem to generate a lot of calls to
> do_valueize... I had misunderstood the issue, sorry.
Oh, no need at all, and thanks very much for all the explanation.
>
> Strangely, for a pattern like
> (simplify (plus @0 @1) @0)
> we generate no call to valueize, while for the following
> (simplify (plus @0 (minus @1 @2)) @0)
> we generate 3 calls to do_valueize.
>
> I think we need Richard to say what the intent is for the valueization
> function. It is used both to stop looking at defining stmt if the return is
> NULL, and to replace/optimize one SSA_NAME with another, but currently it
> seems hard to prevent looking at the defining statement without preventing
> from looking at the SSA_NAME at all.
Looks we don't really expand into def_stmt on leaf nodes, maybe
valueization can be saved in the case?
>
> I guess we'll need a fix in genmatch...
Yeah, then the original patch becomes unnecessary.  Thanks again!

Thanks,
bin
>
> --
> Marc Glisse


Re: [PATCH GCC][1/2]Feed bound computation to folder in loop split

2017-07-24 Thread Marc Glisse

On Mon, 24 Jul 2017, Bin.Cheng wrote:


On Mon, Jul 24, 2017 at 2:59 PM, Marc Glisse  wrote:

On Mon, 24 Jul 2017, Bin.Cheng wrote:


But since definition of _197 isn't in current stmt sequence, call "o31
= do_valueize (valueize, o31)" will return NULL.  As a result, it's
not matched.



Wait, actually, how was your fold_build* version working? Why was the first
addition "in the current generic tree" and why isn't it "in current stmt
sequence"?

Maybe I didn't express it clearly.  In compute_new_first_bound, we
have stmt sequence "_124 = _197 + 1", and we try to simplify "_124 -
1" by calling gimple_build.  The definition of _197 is a PHI and isn't
in current stmt sequence.  For fold_build* version, it builds
expression "_197 + 1 - 1" and simplifies it.


It seems like it shouldn't be relevant whether the definition of _197 is 
in the stmt sequence or not, but indeed we seem to generate a lot of calls 
to do_valueize... I had misunderstood the issue, sorry.


Strangely, for a pattern like
(simplify (plus @0 @1) @0)
we generate no call to valueize, while for the following
(simplify (plus @0 (minus @1 @2)) @0)
we generate 3 calls to do_valueize.

I think we need Richard to say what the intent is for the valueization 
function. It is used both to stop looking at defining stmt if the return 
is NULL, and to replace/optimize one SSA_NAME with another, but currently 
it seems hard to prevent looking at the defining statement without 
preventing from looking at the SSA_NAME at all.


I guess we'll need a fix in genmatch...

--
Marc Glisse


Re: [PATCH GCC][1/2]Feed bound computation to folder in loop split

2017-07-24 Thread Bin.Cheng
On Mon, Jul 24, 2017 at 2:59 PM, Marc Glisse  wrote:
> On Mon, 24 Jul 2017, Bin.Cheng wrote:
>
>> But since definition of _197 isn't in current stmt sequence, call "o31
>> = do_valueize (valueize, o31)" will return NULL.  As a result, it's
>> not matched.
>
>
> Wait, actually, how was your fold_build* version working? Why was the first
> addition "in the current generic tree" and why isn't it "in current stmt
> sequence"?
Maybe I didn't express it clearly.  In compute_new_first_bound, we
have stmt sequence "_124 = _197 + 1", and we try to simplify "_124 -
1" by calling gimple_build.  The definition of _197 is a PHI and isn't
in current stmt sequence.  For fold_build* version, it builds
expression "_197 + 1 - 1" and simplifies it.

Thanks,
bin
>
> --
> Marc Glisse


Re: [PATCH GCC][1/2]Feed bound computation to folder in loop split

2017-07-24 Thread Marc Glisse

On Mon, 24 Jul 2017, Bin.Cheng wrote:


But since definition of _197 isn't in current stmt sequence, call "o31
= do_valueize (valueize, o31)" will return NULL.  As a result, it's
not matched.


Wait, actually, how was your fold_build* version working? Why was the 
first addition "in the current generic tree" and why isn't it "in current 
stmt sequence"?


--
Marc Glisse


Re: [PATCH GCC][1/2]Feed bound computation to folder in loop split

2017-07-24 Thread Bin.Cheng
On Mon, Jul 24, 2017 at 1:16 PM, Marc Glisse  wrote:
> On Mon, 24 Jul 2017, Bin.Cheng wrote:
>
>>> For _123, we have
>>>
>>>   /* (A +- CST1) +- CST2 -> A + CST3
>>> or
>>> /* Associate (p +p off1) +p off2 as (p +p (off1 + off2)).  */
>>>
>>>
>>> For _115, we have
>>>
>>> /* min (a, a + CST) -> a where CST is positive.  */
>>> /* min (a, a + CST) -> a + CST where CST is negative. */
>>> (simplify
>>>  (min:c @0 (plus@2 @0 INTEGER_CST@1))
>>>   (if (TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0)))
>>>(if (tree_int_cst_sgn (@1) > 0)
>>> @0
>>> @2)))
>>>
>>> What is the type of all those SSA_NAMEs?
>>
>> Hi,
>> From the debugging process, there are two issues preventing "(A +-
>> CST1) +- CST2 -> A + CST3" from being applied:
>> A) before we reach this pattern, there is pattern:
>>
>> /* A - B -> A + (-B) if B is easily negatable.  */
>> (simplify
>> (minus @0 negate_expr_p@1)
>> (if (!FIXED_POINT_TYPE_P (type))
>> (plus @0 (negate @1
>>
>> which is matched and returned in gimple_simplify_MINUS_EXPR.  So does
>> pattern order matter here?
>
>
> That shouldn't be a problem, normally we always try to resimplify the result
> of the simplification, and the transformation should handle x+1+-1 just as
> well as x+1-1. Is that not happening?
Yes, it's doesn't matter.

>
>> B) When folding "_124 - 1" on the basis of existing stmts sequence
>> like "_124 = _197 + 1;".  The corresponding gimple-match.c code is
>> like:
>
> [...]
>>
>> But since definition of _197 isn't in current stmt sequence, call "o31
>> = do_valueize (valueize, o31)" will return NULL.  As a result, it's
>> not matched.
>
>
> Ah, yes, that problem... Jakub was having a very similar issue a few
> weeks ago, don't know if he found a solution. You could call
> gimple_simplify directly with a different valueization function if
> that's safe. Normally the simplification would wait until the next
> forwprop pass.
Thanks for elaboration.
It's too late for next forwprop pass since we are in between loop
optimizations and need the simplified code for niter analysis.
Function compute_new_first_bound calls gimple_build several times,
it's not likely to replace all gimple_build with gimple_simplify?
Also gimple_simplify could return NULL_TREE, in which case I need to
call gimple_build_assign again.  At last, we don't have interface
fold_seq similar to fold_stmt either.
CCing Jakub if he found a solution.  Thanks.

Thanks,
bin
>
> --
> Marc Glisse


Re: [PATCH GCC][1/2]Feed bound computation to folder in loop split

2017-07-24 Thread Marc Glisse

On Mon, 24 Jul 2017, Bin.Cheng wrote:


For _123, we have

  /* (A +- CST1) +- CST2 -> A + CST3
or
/* Associate (p +p off1) +p off2 as (p +p (off1 + off2)).  */


For _115, we have

/* min (a, a + CST) -> a where CST is positive.  */
/* min (a, a + CST) -> a + CST where CST is negative. */
(simplify
 (min:c @0 (plus@2 @0 INTEGER_CST@1))
  (if (TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0)))
   (if (tree_int_cst_sgn (@1) > 0)
@0
@2)))

What is the type of all those SSA_NAMEs?

Hi,
From the debugging process, there are two issues preventing "(A +-
CST1) +- CST2 -> A + CST3" from being applied:
A) before we reach this pattern, there is pattern:

/* A - B -> A + (-B) if B is easily negatable.  */
(simplify
(minus @0 negate_expr_p@1)
(if (!FIXED_POINT_TYPE_P (type))
(plus @0 (negate @1

which is matched and returned in gimple_simplify_MINUS_EXPR.  So does
pattern order matter here?


That shouldn't be a problem, normally we always try to resimplify the 
result of the simplification, and the transformation should handle x+1+-1 
just as well as x+1-1. Is that not happening?



B) When folding "_124 - 1" on the basis of existing stmts sequence
like "_124 = _197 + 1;".  The corresponding gimple-match.c code is
like:

[...]

But since definition of _197 isn't in current stmt sequence, call "o31
= do_valueize (valueize, o31)" will return NULL.  As a result, it's
not matched.


Ah, yes, that problem... Jakub was having a very similar issue a few
weeks ago, don't know if he found a solution. You could call
gimple_simplify directly with a different valueization function if
that's safe. Normally the simplification would wait until the next
forwprop pass.

--
Marc Glisse


Re: [PATCH GCC][1/2]Feed bound computation to folder in loop split

2017-07-24 Thread Bin.Cheng
On Fri, Jun 16, 2017 at 5:48 PM, Marc Glisse  wrote:
> On Fri, 16 Jun 2017, Bin.Cheng wrote:
>
>> On Fri, Jun 16, 2017 at 5:16 PM, Richard Biener
>>  wrote:
>>>
>>>
>>> That means we miss a pattern in match.PD to handle this case.
>>
>> I see.  I will withdraw this patch and look in that direction.
>
>
> For _123, we have
>
>   /* (A +- CST1) +- CST2 -> A + CST3
> or
> /* Associate (p +p off1) +p off2 as (p +p (off1 + off2)).  */
>
>
> For _115, we have
>
> /* min (a, a + CST) -> a where CST is positive.  */
> /* min (a, a + CST) -> a + CST where CST is negative. */
> (simplify
>  (min:c @0 (plus@2 @0 INTEGER_CST@1))
>   (if (TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0)))
>(if (tree_int_cst_sgn (@1) > 0)
> @0
> @2)))
>
> What is the type of all those SSA_NAMEs?
Hi,
>From the debugging process, there are two issues preventing "(A +-
CST1) +- CST2 -> A + CST3" from being applied:
A) before we reach this pattern, there is pattern:

/* A - B -> A + (-B) if B is easily negatable.  */
(simplify
 (minus @0 negate_expr_p@1)
 (if (!FIXED_POINT_TYPE_P (type))
 (plus @0 (negate @1

which is matched and returned in gimple_simplify_MINUS_EXPR.  So does
pattern order matter here?

B) When folding "_124 - 1" on the basis of existing stmts sequence
like "_124 = _197 + 1;".  The corresponding gimple-match.c code is
like:
  if (gimple_nop_convert (op0, op0_pops, valueize))
{
  tree o20 = op0_pops[0];
  switch (TREE_CODE (o20))
{
case SSA_NAME:
  if (do_valueize (valueize, o20) != NULL_TREE)
{
  gimple *def_stmt = SSA_NAME_DEF_STMT (o20);
  if (gassign *def = dyn_cast  (def_stmt))
switch (gimple_assign_rhs_code (def))
  {
  case PLUS_EXPR:
{
  tree o30 = gimple_assign_rhs1 (def);
  if ((o30 = do_valueize (valueize, o30)))
{
  tree o31 = gimple_assign_rhs2 (def);
  if ((o31 = do_valueize (valueize, o31)))
{
  if (tree_swap_operands_p (o30, o31))
std::swap (o30, o31);
  if (CONSTANT_CLASS_P (o31))
{
  if (CONSTANT_CLASS_P (op1))
{
  {
/* #line 1392 "../../gcc/gcc/match.pd" */
tree captures[3] ATTRIBUTE_UNUSED = { o30, o31, op1 };
if (gimple_simplify_194 (res_code, res_ops, seq,
valueize, type, captures, PLUS_EXPR, MINUS_EXPR, MINUS_EXPR))
  return true;
  }
}
}
}
}
  break;
}

Note we have:
(gdb) call debug_generic_expr(op0)
_124
(gdb) call debug_generic_expr(op1)
1
(gdb) call debug_gimple_stmt(def_stmt)
_124 = _197 + 1;
(gdb) call debug_generic_expr(o30)
_197

But since definition of _197 isn't in current stmt sequence, call "o31
= do_valueize (valueize, o31)" will return NULL.  As a result, it's
not matched.
Any ideas?  Thanks.

Thanks,
bin


Re: [PATCH GCC][1/2]Feed bound computation to folder in loop split

2017-06-16 Thread Andrew Pinski
On Fri, Jun 16, 2017 at 9:48 AM, Marc Glisse  wrote:
> On Fri, 16 Jun 2017, Bin.Cheng wrote:
>
>> On Fri, Jun 16, 2017 at 5:16 PM, Richard Biener
>>  wrote:
>>>
>>> On June 16, 2017 3:31:32 PM GMT+02:00, "Bin.Cheng"
>>>  wrote:

 On Fri, Jun 16, 2017 at 2:10 PM, Richard Biener
  wrote:
>
> On Fri, Jun 16, 2017 at 3:06 PM, Bin.Cheng 

 wrote:
>>
>> On Fri, Jun 16, 2017 at 11:49 AM, Richard Biener
>>  wrote:
>>>
>>> On Wed, Jun 14, 2017 at 3:07 PM, Bin Cheng 

 wrote:

 Hi,
 Loop split forces intermediate computation to gimple operands all

 the time when

 computing bound information.  This is not good since folding

 opportunities are

 missed.  This patch fixes the issue by feeding all computation to

 folder and only

 forcing to gimple operand at last.

 Bootstrap and test on x86_64 and AArch64.  Is it OK?
>>>
>>>
>>> Hm?  It uses gimple_build () which should do the same as

 fold_buildN in terms
>>>
>>> of simplification.
>>>
>>> So where does that not work?  It is supposed to be the prefered way

 and no
>>>
>>> new code should use force_gimple_operand (unless dealing with

 generic
>>>
>>> coming from other middle-end infrastructure like SCEV or niter

 analysis)
>>
>> Hmm, current code calls force_gimpele operand several times which
>> causes the inefficiency.  The patch avoids that and does one call at
>> the end.
>
>
> But it forces to the same sequence that is used for extending the

 expression
>
> so folding should work.  Where do you see that it does not?  Note the
> code uses gimple_build (), not gimple_build_assign ().

 In spec2k6/hmmer, when building fast_algorithms.c with below command
 line:
 ./gcc -Ofast -S fast_algorithms.c -o fast_algorithms.S -fdump-tree-all
 -fdump-tree-lsplit
 The lsplit dump contains:
   [12.75%]:
  _124 = _197 + 1;
  _123 = _124 + -1;
  _115 = MIN_EXPR <_197, _124>;
 Which is generated here.
>>>
>>>
>>> That means we miss a pattern in match.PD to handle this case.
>>
>> I see.  I will withdraw this patch and look in that direction.
>
>
> For _123, we have
>
>   /* (A +- CST1) +- CST2 -> A + CST3
> or
> /* Associate (p +p off1) +p off2 as (p +p (off1 + off2)).  */
>
>
> For _115, we have
>
> /* min (a, a + CST) -> a where CST is positive.  */
> /* min (a, a + CST) -> a + CST where CST is negative. */
> (simplify
>  (min:c @0 (plus@2 @0 INTEGER_CST@1))
>   (if (TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0)))
>(if (tree_int_cst_sgn (@1) > 0)
> @0
> @2)))
>
> What is the type of all those SSA_NAMEs?

https://gcc.gnu.org/ml/gcc-patches/2016-12/msg01352.html
which added the min/max patterns.  I forgot to get Naveen to mention I
saw this while looking into loop splitting and why I was adding them.

Thanks,
Andrew Pinski

>
> --
> Marc Glisse


Re: [PATCH GCC][1/2]Feed bound computation to folder in loop split

2017-06-16 Thread Bin.Cheng
On Fri, Jun 16, 2017 at 5:48 PM, Marc Glisse  wrote:
> On Fri, 16 Jun 2017, Bin.Cheng wrote:
>
>> On Fri, Jun 16, 2017 at 5:16 PM, Richard Biener
>>  wrote:
>>>
>>> On June 16, 2017 3:31:32 PM GMT+02:00, "Bin.Cheng"
>>>  wrote:

 On Fri, Jun 16, 2017 at 2:10 PM, Richard Biener
  wrote:
>
> On Fri, Jun 16, 2017 at 3:06 PM, Bin.Cheng 

 wrote:
>>
>> On Fri, Jun 16, 2017 at 11:49 AM, Richard Biener
>>  wrote:
>>>
>>> On Wed, Jun 14, 2017 at 3:07 PM, Bin Cheng 

 wrote:

 Hi,
 Loop split forces intermediate computation to gimple operands all

 the time when

 computing bound information.  This is not good since folding

 opportunities are

 missed.  This patch fixes the issue by feeding all computation to

 folder and only

 forcing to gimple operand at last.

 Bootstrap and test on x86_64 and AArch64.  Is it OK?
>>>
>>>
>>> Hm?  It uses gimple_build () which should do the same as

 fold_buildN in terms
>>>
>>> of simplification.
>>>
>>> So where does that not work?  It is supposed to be the prefered way

 and no
>>>
>>> new code should use force_gimple_operand (unless dealing with

 generic
>>>
>>> coming from other middle-end infrastructure like SCEV or niter

 analysis)
>>
>> Hmm, current code calls force_gimpele operand several times which
>> causes the inefficiency.  The patch avoids that and does one call at
>> the end.
>
>
> But it forces to the same sequence that is used for extending the

 expression
>
> so folding should work.  Where do you see that it does not?  Note the
> code uses gimple_build (), not gimple_build_assign ().

 In spec2k6/hmmer, when building fast_algorithms.c with below command
 line:
 ./gcc -Ofast -S fast_algorithms.c -o fast_algorithms.S -fdump-tree-all
 -fdump-tree-lsplit
 The lsplit dump contains:
   [12.75%]:
  _124 = _197 + 1;
  _123 = _124 + -1;
  _115 = MIN_EXPR <_197, _124>;
 Which is generated here.
>>>
>>>
>>> That means we miss a pattern in match.PD to handle this case.
>>
>> I see.  I will withdraw this patch and look in that direction.
>
>
> For _123, we have
>
>   /* (A +- CST1) +- CST2 -> A + CST3
> or
> /* Associate (p +p off1) +p off2 as (p +p (off1 + off2)).  */
>
>
> For _115, we have
>
> /* min (a, a + CST) -> a where CST is positive.  */
> /* min (a, a + CST) -> a + CST where CST is negative. */
> (simplify
>  (min:c @0 (plus@2 @0 INTEGER_CST@1))
>   (if (TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0)))
>(if (tree_int_cst_sgn (@1) > 0)
> @0
> @2)))
>
> What is the type of all those SSA_NAMEs?
Hi Marc,
Thanks for pointing out the exact patterns.  The variables are of int
type.  The redundant operation disappears in reduced test case though.

Thanks,
bin
>
> --
> Marc Glisse


Re: [PATCH GCC][1/2]Feed bound computation to folder in loop split

2017-06-16 Thread Marc Glisse

On Fri, 16 Jun 2017, Bin.Cheng wrote:


On Fri, Jun 16, 2017 at 5:16 PM, Richard Biener
 wrote:

On June 16, 2017 3:31:32 PM GMT+02:00, "Bin.Cheng"  
wrote:

On Fri, Jun 16, 2017 at 2:10 PM, Richard Biener
 wrote:

On Fri, Jun 16, 2017 at 3:06 PM, Bin.Cheng 

wrote:

On Fri, Jun 16, 2017 at 11:49 AM, Richard Biener
 wrote:

On Wed, Jun 14, 2017 at 3:07 PM, Bin Cheng 

wrote:

Hi,
Loop split forces intermediate computation to gimple operands all

the time when

computing bound information.  This is not good since folding

opportunities are

missed.  This patch fixes the issue by feeding all computation to

folder and only

forcing to gimple operand at last.

Bootstrap and test on x86_64 and AArch64.  Is it OK?


Hm?  It uses gimple_build () which should do the same as

fold_buildN in terms

of simplification.

So where does that not work?  It is supposed to be the prefered way

and no

new code should use force_gimple_operand (unless dealing with

generic

coming from other middle-end infrastructure like SCEV or niter

analysis)

Hmm, current code calls force_gimpele operand several times which
causes the inefficiency.  The patch avoids that and does one call at
the end.


But it forces to the same sequence that is used for extending the

expression

so folding should work.  Where do you see that it does not?  Note the
code uses gimple_build (), not gimple_build_assign ().

In spec2k6/hmmer, when building fast_algorithms.c with below command
line:
./gcc -Ofast -S fast_algorithms.c -o fast_algorithms.S -fdump-tree-all
-fdump-tree-lsplit
The lsplit dump contains:
  [12.75%]:
 _124 = _197 + 1;
 _123 = _124 + -1;
 _115 = MIN_EXPR <_197, _124>;
Which is generated here.


That means we miss a pattern in match.PD to handle this case.

I see.  I will withdraw this patch and look in that direction.


For _123, we have

  /* (A +- CST1) +- CST2 -> A + CST3
or
/* Associate (p +p off1) +p off2 as (p +p (off1 + off2)).  */


For _115, we have

/* min (a, a + CST) -> a where CST is positive.  */
/* min (a, a + CST) -> a + CST where CST is negative. */
(simplify
 (min:c @0 (plus@2 @0 INTEGER_CST@1))
  (if (TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0)))
   (if (tree_int_cst_sgn (@1) > 0)
@0
@2)))

What is the type of all those SSA_NAMEs?

--
Marc Glisse


Re: [PATCH GCC][1/2]Feed bound computation to folder in loop split

2017-06-16 Thread Bin.Cheng
On Fri, Jun 16, 2017 at 5:16 PM, Richard Biener
 wrote:
> On June 16, 2017 3:31:32 PM GMT+02:00, "Bin.Cheng"  
> wrote:
>>On Fri, Jun 16, 2017 at 2:10 PM, Richard Biener
>> wrote:
>>> On Fri, Jun 16, 2017 at 3:06 PM, Bin.Cheng 
>>wrote:
 On Fri, Jun 16, 2017 at 11:49 AM, Richard Biener
  wrote:
> On Wed, Jun 14, 2017 at 3:07 PM, Bin Cheng 
>>wrote:
>> Hi,
>> Loop split forces intermediate computation to gimple operands all
>>the time when
>> computing bound information.  This is not good since folding
>>opportunities are
>> missed.  This patch fixes the issue by feeding all computation to
>>folder and only
>> forcing to gimple operand at last.
>>
>> Bootstrap and test on x86_64 and AArch64.  Is it OK?
>
> Hm?  It uses gimple_build () which should do the same as
>>fold_buildN in terms
> of simplification.
>
> So where does that not work?  It is supposed to be the prefered way
>>and no
> new code should use force_gimple_operand (unless dealing with
>>generic
> coming from other middle-end infrastructure like SCEV or niter
>>analysis)
 Hmm, current code calls force_gimpele operand several times which
 causes the inefficiency.  The patch avoids that and does one call at
 the end.
>>>
>>> But it forces to the same sequence that is used for extending the
>>expression
>>> so folding should work.  Where do you see that it does not?  Note the
>>> code uses gimple_build (), not gimple_build_assign ().
>>In spec2k6/hmmer, when building fast_algorithms.c with below command
>>line:
>>./gcc -Ofast -S fast_algorithms.c -o fast_algorithms.S -fdump-tree-all
>>-fdump-tree-lsplit
>>The lsplit dump contains:
>>   [12.75%]:
>>  _124 = _197 + 1;
>>  _123 = _124 + -1;
>>  _115 = MIN_EXPR <_197, _124>;
>>Which is generated here.
>
> That means we miss a pattern in match.PD to handle this case.
I see.  I will withdraw this patch and look in that direction.

Thanks,
bin
>
> Richard.
>
>>Thanks,
>>bin
>>>
>>> Richard.
>>>
 Thanks,
 bin
>
> Richard.
>
>>
>> Thanks,
>> bin
>> 2017-06-12  Bin Cheng  
>>
>> * tree-ssa-loop-split.c (compute_new_first_bound): Feed
>>bound
>> computation to folder, rather than force to gimple
>>operands too
>> early.
>


Re: [PATCH GCC][1/2]Feed bound computation to folder in loop split

2017-06-16 Thread Richard Biener
On June 16, 2017 3:31:32 PM GMT+02:00, "Bin.Cheng"  
wrote:
>On Fri, Jun 16, 2017 at 2:10 PM, Richard Biener
> wrote:
>> On Fri, Jun 16, 2017 at 3:06 PM, Bin.Cheng 
>wrote:
>>> On Fri, Jun 16, 2017 at 11:49 AM, Richard Biener
>>>  wrote:
 On Wed, Jun 14, 2017 at 3:07 PM, Bin Cheng 
>wrote:
> Hi,
> Loop split forces intermediate computation to gimple operands all
>the time when
> computing bound information.  This is not good since folding
>opportunities are
> missed.  This patch fixes the issue by feeding all computation to
>folder and only
> forcing to gimple operand at last.
>
> Bootstrap and test on x86_64 and AArch64.  Is it OK?

 Hm?  It uses gimple_build () which should do the same as
>fold_buildN in terms
 of simplification.

 So where does that not work?  It is supposed to be the prefered way
>and no
 new code should use force_gimple_operand (unless dealing with
>generic
 coming from other middle-end infrastructure like SCEV or niter
>analysis)
>>> Hmm, current code calls force_gimpele operand several times which
>>> causes the inefficiency.  The patch avoids that and does one call at
>>> the end.
>>
>> But it forces to the same sequence that is used for extending the
>expression
>> so folding should work.  Where do you see that it does not?  Note the
>> code uses gimple_build (), not gimple_build_assign ().
>In spec2k6/hmmer, when building fast_algorithms.c with below command
>line:
>./gcc -Ofast -S fast_algorithms.c -o fast_algorithms.S -fdump-tree-all
>-fdump-tree-lsplit
>The lsplit dump contains:
>   [12.75%]:
>  _124 = _197 + 1;
>  _123 = _124 + -1;
>  _115 = MIN_EXPR <_197, _124>;
>Which is generated here.

That means we miss a pattern in match.PD to handle this case.

Richard.

>Thanks,
>bin
>>
>> Richard.
>>
>>> Thanks,
>>> bin

 Richard.

>
> Thanks,
> bin
> 2017-06-12  Bin Cheng  
>
> * tree-ssa-loop-split.c (compute_new_first_bound): Feed
>bound
> computation to folder, rather than force to gimple
>operands too
> early.



Re: [PATCH GCC][1/2]Feed bound computation to folder in loop split

2017-06-16 Thread Bin.Cheng
On Fri, Jun 16, 2017 at 2:10 PM, Richard Biener
 wrote:
> On Fri, Jun 16, 2017 at 3:06 PM, Bin.Cheng  wrote:
>> On Fri, Jun 16, 2017 at 11:49 AM, Richard Biener
>>  wrote:
>>> On Wed, Jun 14, 2017 at 3:07 PM, Bin Cheng  wrote:
 Hi,
 Loop split forces intermediate computation to gimple operands all the time 
 when
 computing bound information.  This is not good since folding opportunities 
 are
 missed.  This patch fixes the issue by feeding all computation to folder 
 and only
 forcing to gimple operand at last.

 Bootstrap and test on x86_64 and AArch64.  Is it OK?
>>>
>>> Hm?  It uses gimple_build () which should do the same as fold_buildN in 
>>> terms
>>> of simplification.
>>>
>>> So where does that not work?  It is supposed to be the prefered way and no
>>> new code should use force_gimple_operand (unless dealing with generic
>>> coming from other middle-end infrastructure like SCEV or niter analysis)
>> Hmm, current code calls force_gimpele operand several times which
>> causes the inefficiency.  The patch avoids that and does one call at
>> the end.
>
> But it forces to the same sequence that is used for extending the expression
> so folding should work.  Where do you see that it does not?  Note the
> code uses gimple_build (), not gimple_build_assign ().
In spec2k6/hmmer, when building fast_algorithms.c with below command line:
./gcc -Ofast -S fast_algorithms.c -o fast_algorithms.S -fdump-tree-all
-fdump-tree-lsplit
The lsplit dump contains:
   [12.75%]:
  _124 = _197 + 1;
  _123 = _124 + -1;
  _115 = MIN_EXPR <_197, _124>;
Which is generated here.

Thanks,
bin
>
> Richard.
>
>> Thanks,
>> bin
>>>
>>> Richard.
>>>

 Thanks,
 bin
 2017-06-12  Bin Cheng  

 * tree-ssa-loop-split.c (compute_new_first_bound): Feed bound
 computation to folder, rather than force to gimple operands too
 early.


Re: [PATCH GCC][1/2]Feed bound computation to folder in loop split

2017-06-16 Thread Richard Biener
On Fri, Jun 16, 2017 at 3:06 PM, Bin.Cheng  wrote:
> On Fri, Jun 16, 2017 at 11:49 AM, Richard Biener
>  wrote:
>> On Wed, Jun 14, 2017 at 3:07 PM, Bin Cheng  wrote:
>>> Hi,
>>> Loop split forces intermediate computation to gimple operands all the time 
>>> when
>>> computing bound information.  This is not good since folding opportunities 
>>> are
>>> missed.  This patch fixes the issue by feeding all computation to folder 
>>> and only
>>> forcing to gimple operand at last.
>>>
>>> Bootstrap and test on x86_64 and AArch64.  Is it OK?
>>
>> Hm?  It uses gimple_build () which should do the same as fold_buildN in terms
>> of simplification.
>>
>> So where does that not work?  It is supposed to be the prefered way and no
>> new code should use force_gimple_operand (unless dealing with generic
>> coming from other middle-end infrastructure like SCEV or niter analysis)
> Hmm, current code calls force_gimpele operand several times which
> causes the inefficiency.  The patch avoids that and does one call at
> the end.

But it forces to the same sequence that is used for extending the expression
so folding should work.  Where do you see that it does not?  Note the
code uses gimple_build (), not gimple_build_assign ().

Richard.

> Thanks,
> bin
>>
>> Richard.
>>
>>>
>>> Thanks,
>>> bin
>>> 2017-06-12  Bin Cheng  
>>>
>>> * tree-ssa-loop-split.c (compute_new_first_bound): Feed bound
>>> computation to folder, rather than force to gimple operands too
>>> early.


Re: [PATCH GCC][1/2]Feed bound computation to folder in loop split

2017-06-16 Thread Bin.Cheng
On Fri, Jun 16, 2017 at 11:49 AM, Richard Biener
 wrote:
> On Wed, Jun 14, 2017 at 3:07 PM, Bin Cheng  wrote:
>> Hi,
>> Loop split forces intermediate computation to gimple operands all the time 
>> when
>> computing bound information.  This is not good since folding opportunities 
>> are
>> missed.  This patch fixes the issue by feeding all computation to folder and 
>> only
>> forcing to gimple operand at last.
>>
>> Bootstrap and test on x86_64 and AArch64.  Is it OK?
>
> Hm?  It uses gimple_build () which should do the same as fold_buildN in terms
> of simplification.
>
> So where does that not work?  It is supposed to be the prefered way and no
> new code should use force_gimple_operand (unless dealing with generic
> coming from other middle-end infrastructure like SCEV or niter analysis)
Hmm, current code calls force_gimpele operand several times which
causes the inefficiency.  The patch avoids that and does one call at
the end.

Thanks,
bin
>
> Richard.
>
>>
>> Thanks,
>> bin
>> 2017-06-12  Bin Cheng  
>>
>> * tree-ssa-loop-split.c (compute_new_first_bound): Feed bound
>> computation to folder, rather than force to gimple operands too
>> early.


Re: [PATCH GCC][1/2]Feed bound computation to folder in loop split

2017-06-16 Thread Richard Biener
On Wed, Jun 14, 2017 at 3:07 PM, Bin Cheng  wrote:
> Hi,
> Loop split forces intermediate computation to gimple operands all the time 
> when
> computing bound information.  This is not good since folding opportunities are
> missed.  This patch fixes the issue by feeding all computation to folder and 
> only
> forcing to gimple operand at last.
>
> Bootstrap and test on x86_64 and AArch64.  Is it OK?

Hm?  It uses gimple_build () which should do the same as fold_buildN in terms
of simplification.

So where does that not work?  It is supposed to be the prefered way and no
new code should use force_gimple_operand (unless dealing with generic
coming from other middle-end infrastructure like SCEV or niter analysis)

Richard.

>
> Thanks,
> bin
> 2017-06-12  Bin Cheng  
>
> * tree-ssa-loop-split.c (compute_new_first_bound): Feed bound
> computation to folder, rather than force to gimple operands too
> early.


[PATCH GCC][1/2]Feed bound computation to folder in loop split

2017-06-14 Thread Bin Cheng
Hi,
Loop split forces intermediate computation to gimple operands all the time when
computing bound information.  This is not good since folding opportunities are
missed.  This patch fixes the issue by feeding all computation to folder and 
only
forcing to gimple operand at last.

Bootstrap and test on x86_64 and AArch64.  Is it OK?

Thanks,
bin
2017-06-12  Bin Cheng  

* tree-ssa-loop-split.c (compute_new_first_bound): Feed bound
computation to folder, rather than force to gimple operands too
early.From 372dc98aa91fd495c98c2326f854eb5f2c76500b Mon Sep 17 00:00:00 2001
From: Bin Cheng 
Date: Fri, 2 Jun 2017 18:05:03 +0100
Subject: [PATCH 1/2] feed-bound-computation-to-folder-20170601.txt

---
 gcc/tree-ssa-loop-split.c | 77 ++-
 1 file changed, 30 insertions(+), 47 deletions(-)

diff --git a/gcc/tree-ssa-loop-split.c b/gcc/tree-ssa-loop-split.c
index e77f2bf..f8fe8e6 100644
--- a/gcc/tree-ssa-loop-split.c
+++ b/gcc/tree-ssa-loop-split.c
@@ -396,53 +396,38 @@ compute_new_first_bound (gimple_seq *stmts, struct 
tree_niter_desc *niter,
 {
   /* The niter structure contains the after-increment IV, we need
  the loop-enter base, so subtract STEP once.  */
-  tree controlbase = force_gimple_operand (niter->control.base,
-  stmts, true, NULL_TREE);
+  tree controlbase = niter->control.base;
   tree controlstep = niter->control.step;
-  tree enddiff;
+  tree enddiff, end = niter->bound;
+  tree type;
+
+  /* Compute end-beg.  */
   if (POINTER_TYPE_P (TREE_TYPE (controlbase)))
 {
-  controlstep = gimple_build (stmts, NEGATE_EXPR,
- TREE_TYPE (controlstep), controlstep);
-  enddiff = gimple_build (stmts, POINTER_PLUS_EXPR,
- TREE_TYPE (controlbase),
- controlbase, controlstep);
+  controlstep = fold_build1 (NEGATE_EXPR,
+TREE_TYPE (controlstep), controlstep);
+  enddiff = fold_build_pointer_plus (controlbase, controlstep);
+
+  type = unsigned_type_for (enddiff);
+  enddiff = fold_build1 (NEGATE_EXPR, type, fold_convert (type, enddiff));
+  end = fold_convert (type, end);
+  enddiff = fold_build2 (PLUS_EXPR, type, end, enddiff);
+  enddiff = fold_convert (sizetype, enddiff);
 }
   else
-enddiff = gimple_build (stmts, MINUS_EXPR,
-   TREE_TYPE (controlbase),
-   controlbase, controlstep);
-
-  /* Compute end-beg.  */
-  gimple_seq stmts2;
-  tree end = force_gimple_operand (niter->bound, ,
-   true, NULL_TREE);
-  gimple_seq_add_seq_without_update (stmts, stmts2);
-  if (POINTER_TYPE_P (TREE_TYPE (enddiff)))
 {
-  tree tem = gimple_convert (stmts, sizetype, enddiff);
-  tem = gimple_build (stmts, NEGATE_EXPR, sizetype, tem);
-  enddiff = gimple_build (stmts, POINTER_PLUS_EXPR,
- TREE_TYPE (enddiff),
- end, tem);
+  enddiff = fold_build2 (MINUS_EXPR, TREE_TYPE (controlbase),
+controlbase, controlstep);
+  enddiff = fold_build2 (MINUS_EXPR, TREE_TYPE (enddiff), end, enddiff);
 }
-  else
-enddiff = gimple_build (stmts, MINUS_EXPR, TREE_TYPE (enddiff),
-   end, enddiff);
 
   /* Compute guard_init + (end-beg).  */
   tree newbound;
-  enddiff = gimple_convert (stmts, TREE_TYPE (guard_init), enddiff);
   if (POINTER_TYPE_P (TREE_TYPE (guard_init)))
-{
-  enddiff = gimple_convert (stmts, sizetype, enddiff);
-  newbound = gimple_build (stmts, POINTER_PLUS_EXPR,
-  TREE_TYPE (guard_init),
-  guard_init, enddiff);
-}
+newbound = fold_build_pointer_plus (guard_init, enddiff);
   else
-newbound = gimple_build (stmts, PLUS_EXPR, TREE_TYPE (guard_init),
-guard_init, enddiff);
+newbound = fold_build2 (PLUS_EXPR, TREE_TYPE (guard_init), guard_init,
+   fold_convert (TREE_TYPE (guard_init), enddiff));
 
   /* Depending on the direction of the IVs the new bound for the first
  loop is the minimum or maximum of old bound and border.
@@ -467,20 +452,18 @@ compute_new_first_bound (gimple_seq *stmts, struct 
tree_niter_desc *niter,
 
   if (addbound)
 {
-  tree type2 = TREE_TYPE (newbound);
-  if (POINTER_TYPE_P (type2))
-   type2 = sizetype;
-  newbound = gimple_build (stmts,
-  POINTER_TYPE_P (TREE_TYPE (newbound))
-  ? POINTER_PLUS_EXPR : PLUS_EXPR,
-  TREE_TYPE (newbound),
-  newbound,
-  build_int_cst (type2, addbound));
+  type = TREE_TYPE (newbound);
+  if (POINTER_TYPE_P