Re: [0/7] Type promotion pass and elimination of zext/sext

2015-12-16 Thread Richard Biener
On Thu, Dec 10, 2015 at 1:27 AM, Kugan
 wrote:
> Hi Riachard,
>
> Thanks for the reviews.
>
> I think since we have some unresolved issues here, it is best to aim for
> the next stage1. I however would like any feedback so that I can
> continue to improve this.

Yeah, sorry I've been distracted lately and am not sure I'll get to
the patch before
christmas break.

> https://gcc.gnu.org/ml/gcc-patches/2015-09/msg01063.html is also related
> to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67714. I don't think
> there is any agreement on this. Or is there any better place to fix this?

I don't know enough in this area to suggest anything.

Richard.

> Thanks,
> Kugan


Re: [0/7] Type promotion pass and elimination of zext/sext

2015-12-09 Thread Kugan
Hi Riachard,

Thanks for the reviews.

I think since we have some unresolved issues here, it is best to aim for
the next stage1. I however would like any feedback so that I can
continue to improve this.

https://gcc.gnu.org/ml/gcc-patches/2015-09/msg01063.html is also related
to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67714. I don't think
there is any agreement on this. Or is there any better place to fix this?

Thanks,
Kugan


Re: [0/7] Type promotion pass and elimination of zext/sext

2015-11-23 Thread Kugan
Hi Richard,

Thanks for you comments. I am attaching  an updated patch with details
below.

On 19/11/15 02:06, Richard Biener wrote:
> On Wed, Nov 18, 2015 at 3:04 PM, Richard Biener
>  wrote:
>> On Sat, Nov 14, 2015 at 2:15 AM, Kugan
>>  wrote:
>>>
>>> Attached is the latest version of the patch. With the patches
>>> 0001-Add-new-SEXT_EXPR-tree-code.patch,
>>> 0002-Add-type-promotion-pass.patch and
>>> 0003-Optimize-ZEXT_EXPR-with-tree-vrp.patch.
>>>
>>> I did bootstrap on ppc64-linux-gnu, aarch64-linux-gnu and
>>> x64-64-linux-gnu and regression testing on ppc64-linux-gnu,
>>> aarch64-linux-gnu arm64-linux-gnu and x64-64-linux-gnu. I ran into three
>>> issues in ppc64-linux-gnu regression testing. There are some other test
>>> cases which needs adjustment for scanning for some patterns that are not
>>> valid now.
>>>
>>> 1. rtl fwprop was going into infinite loop. Works with the following patch:
>>> diff --git a/gcc/fwprop.c b/gcc/fwprop.c
>>> index 16c7981..9cf4f43 100644
>>> --- a/gcc/fwprop.c
>>> +++ b/gcc/fwprop.c
>>> @@ -948,6 +948,10 @@ try_fwprop_subst (df_ref use, rtx *loc, rtx
>>> new_rtx, rtx_insn *def_insn,
>>>int old_cost = 0;
>>>bool ok;
>>>
>>> +  /* Value to be substituted is the same, nothing to do.  */
>>> +  if (rtx_equal_p (*loc, new_rtx))
>>> +return false;
>>> +
>>>update_df_init (def_insn, insn);
>>>
>>>/* forward_propagate_subreg may be operating on an instruction with
>>
>> Which testcase was this on?

After re-basing the trunk, I cannot reproduce it anymore.

>>
>>> 2. gcc.dg/torture/ftrapv-1.c fails
>>> This is because we are checking for the  SImode trapping. With the
>>> promotion of the operation to wider mode, this is i think expected. I
>>> think the testcase needs updating.
>>
>> No, it is not expected.  As said earlier you need to refrain from promoting
>> integer operations that trap.  You can use ! operation_no_trapping_overflow
>> for this.
>>

I have changed this.

>>> 3. gcc.dg/sms-3.c fails
>>> It fails with  -fmodulo-sched-allow-regmoves  and OK when I remove it. I
>>> am looking into it.
>>>
>>>
>>> I also have the following issues based on the previous review (as posted
>>> in the previous patch). Copying again for the review purpose.
>>>
>>> 1.
 you still call promote_ssa on both DEFs and USEs and promote_ssa looks
 at SSA_NAME_DEF_STMT of the passed arg.  Please call promote_ssa just
 on DEFs and fixup_uses on USEs.
>>>
>>> I am doing this to promote SSA that are defined with GIMPLE_NOP. Is
>>> there anyway to iterate over this. I have added gcc_assert to make sure
>>> that promote_ssa is called only once.
>>
>>   gcc_assert (!ssa_name_info_map->get_or_insert (def));
>>
>> with --disable-checking this will be compiled away so you need to do
>> the assert in a separate statement.
>>
>>> 2.
 Instead of this you should, in promote_all_stmts, walk over all uses
>>> doing what
 fixup_uses does and then walk over all defs, doing what promote_ssa does.

 +case GIMPLE_NOP:
 +   {
 + if (SSA_NAME_VAR (def) == NULL)
 +   {
 + /* Promote def by fixing its type for anonymous def.  */
 + TREE_TYPE (def) = promoted_type;
 +   }
 + else
 +   {
 + /* Create a promoted copy of parameters.  */
 + bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));

 I think the uninitialized vars are somewhat tricky and it would be best
 to create a new uninit anonymous SSA name for them.  You can
 have SSA_NAME_VAR != NULL and def _not_ being a parameter
 btw.
>>>
>>> I experimented with get_or_create_default_def. Here  we have to have a
>>> SSA_NAME_VAR (def) of promoted type.
>>>
>>> In the attached patch I am doing the following and seems to work. Does
>>> this looks OK?
>>>
>>> + }
>>> +   else if (TREE_CODE (SSA_NAME_VAR (def)) != PARM_DECL)
>>> + {
>>> +   tree var = copy_node (SSA_NAME_VAR (def));
>>> +   TREE_TYPE (var) = promoted_type;
>>> +   TREE_TYPE (def) = promoted_type;
>>> +   SET_SSA_NAME_VAR_OR_IDENTIFIER (def, var);
>>> + }
>>
>> I believe this will wreck the SSA default-def map so you should do
>>
>>   set_ssa_default_def (cfun, SSA_NAME_VAR (def), NULL_TREE);
>>   tree var = create_tmp_reg (promoted_type);
>>   TREE_TYPE (def) = promoted_type;
>>   SET_SSA_NAME_VAR_OR_IDENTIFIER (def, var);
>>   set_ssa_default_def (cfun, var, def);
>>
>> instead.
I have changed this.

>>
>>> I prefer to promote def as otherwise iterating over the uses and
>>> promoting can look complicated (have to look at all the different types
>>> of stmts again and do the right thing as It was in the earlier version
>>> of this before we move to this approach)
>>>
>>> 3)
 you can also transparently handle constants for the cases where promoting
 is required.  At the moment their handling is interwinded with the def
>>> promotion
 cod

Re: [0/7] Type promotion pass and elimination of zext/sext

2015-11-18 Thread Richard Biener
On Wed, Nov 18, 2015 at 3:04 PM, Richard Biener
 wrote:
> On Sat, Nov 14, 2015 at 2:15 AM, Kugan
>  wrote:
>>
>> Attached is the latest version of the patch. With the patches
>> 0001-Add-new-SEXT_EXPR-tree-code.patch,
>> 0002-Add-type-promotion-pass.patch and
>> 0003-Optimize-ZEXT_EXPR-with-tree-vrp.patch.
>>
>> I did bootstrap on ppc64-linux-gnu, aarch64-linux-gnu and
>> x64-64-linux-gnu and regression testing on ppc64-linux-gnu,
>> aarch64-linux-gnu arm64-linux-gnu and x64-64-linux-gnu. I ran into three
>> issues in ppc64-linux-gnu regression testing. There are some other test
>> cases which needs adjustment for scanning for some patterns that are not
>> valid now.
>>
>> 1. rtl fwprop was going into infinite loop. Works with the following patch:
>> diff --git a/gcc/fwprop.c b/gcc/fwprop.c
>> index 16c7981..9cf4f43 100644
>> --- a/gcc/fwprop.c
>> +++ b/gcc/fwprop.c
>> @@ -948,6 +948,10 @@ try_fwprop_subst (df_ref use, rtx *loc, rtx
>> new_rtx, rtx_insn *def_insn,
>>int old_cost = 0;
>>bool ok;
>>
>> +  /* Value to be substituted is the same, nothing to do.  */
>> +  if (rtx_equal_p (*loc, new_rtx))
>> +return false;
>> +
>>update_df_init (def_insn, insn);
>>
>>/* forward_propagate_subreg may be operating on an instruction with
>
> Which testcase was this on?
>
>> 2. gcc.dg/torture/ftrapv-1.c fails
>> This is because we are checking for the  SImode trapping. With the
>> promotion of the operation to wider mode, this is i think expected. I
>> think the testcase needs updating.
>
> No, it is not expected.  As said earlier you need to refrain from promoting
> integer operations that trap.  You can use ! operation_no_trapping_overflow
> for this.
>
>> 3. gcc.dg/sms-3.c fails
>> It fails with  -fmodulo-sched-allow-regmoves  and OK when I remove it. I
>> am looking into it.
>>
>>
>> I also have the following issues based on the previous review (as posted
>> in the previous patch). Copying again for the review purpose.
>>
>> 1.
>>> you still call promote_ssa on both DEFs and USEs and promote_ssa looks
>>> at SSA_NAME_DEF_STMT of the passed arg.  Please call promote_ssa just
>>> on DEFs and fixup_uses on USEs.
>>
>> I am doing this to promote SSA that are defined with GIMPLE_NOP. Is
>> there anyway to iterate over this. I have added gcc_assert to make sure
>> that promote_ssa is called only once.
>
>   gcc_assert (!ssa_name_info_map->get_or_insert (def));
>
> with --disable-checking this will be compiled away so you need to do
> the assert in a separate statement.
>
>> 2.
>>> Instead of this you should, in promote_all_stmts, walk over all uses
>> doing what
>>> fixup_uses does and then walk over all defs, doing what promote_ssa does.
>>>
>>> +case GIMPLE_NOP:
>>> +   {
>>> + if (SSA_NAME_VAR (def) == NULL)
>>> +   {
>>> + /* Promote def by fixing its type for anonymous def.  */
>>> + TREE_TYPE (def) = promoted_type;
>>> +   }
>>> + else
>>> +   {
>>> + /* Create a promoted copy of parameters.  */
>>> + bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
>>>
>>> I think the uninitialized vars are somewhat tricky and it would be best
>>> to create a new uninit anonymous SSA name for them.  You can
>>> have SSA_NAME_VAR != NULL and def _not_ being a parameter
>>> btw.
>>
>> I experimented with get_or_create_default_def. Here  we have to have a
>> SSA_NAME_VAR (def) of promoted type.
>>
>> In the attached patch I am doing the following and seems to work. Does
>> this looks OK?
>>
>> + }
>> +   else if (TREE_CODE (SSA_NAME_VAR (def)) != PARM_DECL)
>> + {
>> +   tree var = copy_node (SSA_NAME_VAR (def));
>> +   TREE_TYPE (var) = promoted_type;
>> +   TREE_TYPE (def) = promoted_type;
>> +   SET_SSA_NAME_VAR_OR_IDENTIFIER (def, var);
>> + }
>
> I believe this will wreck the SSA default-def map so you should do
>
>   set_ssa_default_def (cfun, SSA_NAME_VAR (def), NULL_TREE);
>   tree var = create_tmp_reg (promoted_type);
>   TREE_TYPE (def) = promoted_type;
>   SET_SSA_NAME_VAR_OR_IDENTIFIER (def, var);
>   set_ssa_default_def (cfun, var, def);
>
> instead.
>
>> I prefer to promote def as otherwise iterating over the uses and
>> promoting can look complicated (have to look at all the different types
>> of stmts again and do the right thing as It was in the earlier version
>> of this before we move to this approach)
>>
>> 3)
>>> you can also transparently handle constants for the cases where promoting
>>> is required.  At the moment their handling is interwinded with the def
>> promotion
>>> code.  That makes the whole thing hard to follow.
>>
>>
>> I have updated the comments with:
>>
>> +/* Promote constants in STMT to TYPE.  If PROMOTE_COND_EXPR is true,
>> +   promote only the constants in conditions part of the COND_EXPR.
>> +
>> +   We promote the constants when the associated operands are promoted.
>> +   This usually means that we promote the co

Re: [0/7] Type promotion pass and elimination of zext/sext

2015-11-18 Thread Richard Biener
On Sat, Nov 14, 2015 at 2:15 AM, Kugan
 wrote:
>
> Attached is the latest version of the patch. With the patches
> 0001-Add-new-SEXT_EXPR-tree-code.patch,
> 0002-Add-type-promotion-pass.patch and
> 0003-Optimize-ZEXT_EXPR-with-tree-vrp.patch.
>
> I did bootstrap on ppc64-linux-gnu, aarch64-linux-gnu and
> x64-64-linux-gnu and regression testing on ppc64-linux-gnu,
> aarch64-linux-gnu arm64-linux-gnu and x64-64-linux-gnu. I ran into three
> issues in ppc64-linux-gnu regression testing. There are some other test
> cases which needs adjustment for scanning for some patterns that are not
> valid now.
>
> 1. rtl fwprop was going into infinite loop. Works with the following patch:
> diff --git a/gcc/fwprop.c b/gcc/fwprop.c
> index 16c7981..9cf4f43 100644
> --- a/gcc/fwprop.c
> +++ b/gcc/fwprop.c
> @@ -948,6 +948,10 @@ try_fwprop_subst (df_ref use, rtx *loc, rtx
> new_rtx, rtx_insn *def_insn,
>int old_cost = 0;
>bool ok;
>
> +  /* Value to be substituted is the same, nothing to do.  */
> +  if (rtx_equal_p (*loc, new_rtx))
> +return false;
> +
>update_df_init (def_insn, insn);
>
>/* forward_propagate_subreg may be operating on an instruction with

Which testcase was this on?

> 2. gcc.dg/torture/ftrapv-1.c fails
> This is because we are checking for the  SImode trapping. With the
> promotion of the operation to wider mode, this is i think expected. I
> think the testcase needs updating.

No, it is not expected.  As said earlier you need to refrain from promoting
integer operations that trap.  You can use ! operation_no_trapping_overflow
for this.

> 3. gcc.dg/sms-3.c fails
> It fails with  -fmodulo-sched-allow-regmoves  and OK when I remove it. I
> am looking into it.
>
>
> I also have the following issues based on the previous review (as posted
> in the previous patch). Copying again for the review purpose.
>
> 1.
>> you still call promote_ssa on both DEFs and USEs and promote_ssa looks
>> at SSA_NAME_DEF_STMT of the passed arg.  Please call promote_ssa just
>> on DEFs and fixup_uses on USEs.
>
> I am doing this to promote SSA that are defined with GIMPLE_NOP. Is
> there anyway to iterate over this. I have added gcc_assert to make sure
> that promote_ssa is called only once.

  gcc_assert (!ssa_name_info_map->get_or_insert (def));

with --disable-checking this will be compiled away so you need to do
the assert in a separate statement.

> 2.
>> Instead of this you should, in promote_all_stmts, walk over all uses
> doing what
>> fixup_uses does and then walk over all defs, doing what promote_ssa does.
>>
>> +case GIMPLE_NOP:
>> +   {
>> + if (SSA_NAME_VAR (def) == NULL)
>> +   {
>> + /* Promote def by fixing its type for anonymous def.  */
>> + TREE_TYPE (def) = promoted_type;
>> +   }
>> + else
>> +   {
>> + /* Create a promoted copy of parameters.  */
>> + bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
>>
>> I think the uninitialized vars are somewhat tricky and it would be best
>> to create a new uninit anonymous SSA name for them.  You can
>> have SSA_NAME_VAR != NULL and def _not_ being a parameter
>> btw.
>
> I experimented with get_or_create_default_def. Here  we have to have a
> SSA_NAME_VAR (def) of promoted type.
>
> In the attached patch I am doing the following and seems to work. Does
> this looks OK?
>
> + }
> +   else if (TREE_CODE (SSA_NAME_VAR (def)) != PARM_DECL)
> + {
> +   tree var = copy_node (SSA_NAME_VAR (def));
> +   TREE_TYPE (var) = promoted_type;
> +   TREE_TYPE (def) = promoted_type;
> +   SET_SSA_NAME_VAR_OR_IDENTIFIER (def, var);
> + }

I believe this will wreck the SSA default-def map so you should do

  set_ssa_default_def (cfun, SSA_NAME_VAR (def), NULL_TREE);
  tree var = create_tmp_reg (promoted_type);
  TREE_TYPE (def) = promoted_type;
  SET_SSA_NAME_VAR_OR_IDENTIFIER (def, var);
  set_ssa_default_def (cfun, var, def);

instead.

> I prefer to promote def as otherwise iterating over the uses and
> promoting can look complicated (have to look at all the different types
> of stmts again and do the right thing as It was in the earlier version
> of this before we move to this approach)
>
> 3)
>> you can also transparently handle constants for the cases where promoting
>> is required.  At the moment their handling is interwinded with the def
> promotion
>> code.  That makes the whole thing hard to follow.
>
>
> I have updated the comments with:
>
> +/* Promote constants in STMT to TYPE.  If PROMOTE_COND_EXPR is true,
> +   promote only the constants in conditions part of the COND_EXPR.
> +
> +   We promote the constants when the associated operands are promoted.
> +   This usually means that we promote the constants when we promote the
> +   defining stmnts (as part of promote_ssa). However for COND_EXPR, we
> +   can promote only when we promote the other operand. Therefore, this
> +   is done during fixup_u

Re: [0/7] Type promotion pass and elimination of zext/sext

2015-11-13 Thread Kugan

Attached is the latest version of the patch. With the patches
0001-Add-new-SEXT_EXPR-tree-code.patch,
0002-Add-type-promotion-pass.patch and
0003-Optimize-ZEXT_EXPR-with-tree-vrp.patch.

I did bootstrap on ppc64-linux-gnu, aarch64-linux-gnu and
x64-64-linux-gnu and regression testing on ppc64-linux-gnu,
aarch64-linux-gnu arm64-linux-gnu and x64-64-linux-gnu. I ran into three
issues in ppc64-linux-gnu regression testing. There are some other test
cases which needs adjustment for scanning for some patterns that are not
valid now.

1. rtl fwprop was going into infinite loop. Works with the following patch:
diff --git a/gcc/fwprop.c b/gcc/fwprop.c
index 16c7981..9cf4f43 100644
--- a/gcc/fwprop.c
+++ b/gcc/fwprop.c
@@ -948,6 +948,10 @@ try_fwprop_subst (df_ref use, rtx *loc, rtx
new_rtx, rtx_insn *def_insn,
   int old_cost = 0;
   bool ok;

+  /* Value to be substituted is the same, nothing to do.  */
+  if (rtx_equal_p (*loc, new_rtx))
+return false;
+
   update_df_init (def_insn, insn);

   /* forward_propagate_subreg may be operating on an instruction with

2. gcc.dg/torture/ftrapv-1.c fails
This is because we are checking for the  SImode trapping. With the
promotion of the operation to wider mode, this is i think expected. I
think the testcase needs updating.
3. gcc.dg/sms-3.c fails
It fails with  -fmodulo-sched-allow-regmoves  and OK when I remove it. I
am looking into it.


I also have the following issues based on the previous review (as posted
in the previous patch). Copying again for the review purpose.

1.
> you still call promote_ssa on both DEFs and USEs and promote_ssa looks
> at SSA_NAME_DEF_STMT of the passed arg.  Please call promote_ssa just
> on DEFs and fixup_uses on USEs.

I am doing this to promote SSA that are defined with GIMPLE_NOP. Is
there anyway to iterate over this. I have added gcc_assert to make sure
that promote_ssa is called only once.

2.
> Instead of this you should, in promote_all_stmts, walk over all uses
doing what
> fixup_uses does and then walk over all defs, doing what promote_ssa does.
>
> +case GIMPLE_NOP:
> +   {
> + if (SSA_NAME_VAR (def) == NULL)
> +   {
> + /* Promote def by fixing its type for anonymous def.  */
> + TREE_TYPE (def) = promoted_type;
> +   }
> + else
> +   {
> + /* Create a promoted copy of parameters.  */
> + bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
>
> I think the uninitialized vars are somewhat tricky and it would be best
> to create a new uninit anonymous SSA name for them.  You can
> have SSA_NAME_VAR != NULL and def _not_ being a parameter
> btw.

I experimented with get_or_create_default_def. Here  we have to have a
SSA_NAME_VAR (def) of promoted type.

In the attached patch I am doing the following and seems to work. Does
this looks OK?

+ }
+   else if (TREE_CODE (SSA_NAME_VAR (def)) != PARM_DECL)
+ {
+   tree var = copy_node (SSA_NAME_VAR (def));
+   TREE_TYPE (var) = promoted_type;
+   TREE_TYPE (def) = promoted_type;
+   SET_SSA_NAME_VAR_OR_IDENTIFIER (def, var);
+ }

I prefer to promote def as otherwise iterating over the uses and
promoting can look complicated (have to look at all the different types
of stmts again and do the right thing as It was in the earlier version
of this before we move to this approach)

3)
> you can also transparently handle constants for the cases where promoting
> is required.  At the moment their handling is interwinded with the def
promotion
> code.  That makes the whole thing hard to follow.


I have updated the comments with:

+/* Promote constants in STMT to TYPE.  If PROMOTE_COND_EXPR is true,
+   promote only the constants in conditions part of the COND_EXPR.
+
+   We promote the constants when the associated operands are promoted.
+   This usually means that we promote the constants when we promote the
+   defining stmnts (as part of promote_ssa). However for COND_EXPR, we
+   can promote only when we promote the other operand. Therefore, this
+   is done during fixup_use.  */


4)
I am handling gimple_debug separately to avoid any code difference with
and without -g option. I have updated the comments for this.

5)
I also noticed that tree-ssa-uninit sometimes gives false positives due
to the assumptions
it makes. Is it OK to move this pass before type promotion? I can do the
testings and post a separate patch with this if this OK.

6)
I also removed the optimization that prevents some of the redundant
truncation/extensions from type promotion pass, as it dosent do much as
of now. I can send a proper follow up patch. Is that OK?

I also did a simple test with coremark for the latest patch. I compared
the code size for coremark for linux-gcc with -Os. Results are as
reported by the "size" utility. I know this doesn't mean much but can
give some indication.
Basewith pass   Percentage improvement
=

Re: [0/7] Type promotion pass and elimination of zext/sext

2015-11-11 Thread Kugan
Hi Richard,

Thanks for the review.

>>>
>>> The basic "structure" thing still remains.  You walk over all uses and
>>> defs in all stmts
>>> in promote_all_stmts which ends up calling promote_ssa_if_not_promoted on 
>>> all
>>> uses and defs which in turn promotes (the "def") and then fixes up all
>>> uses in all stmts.
>>
>> Done.
> 
> Not exactly.  I still see
> 
> /* Promote all the stmts in the basic block.  */
> static void
> promote_all_stmts (basic_block bb)
> {
>   gimple_stmt_iterator gsi;
>   ssa_op_iter iter;
>   tree def, use;
>   use_operand_p op;
> 
>   for (gphi_iterator gpi = gsi_start_phis (bb);
>!gsi_end_p (gpi); gsi_next (&gpi))
> {
>   gphi *phi = gpi.phi ();
>   def = PHI_RESULT (phi);
>   promote_ssa (def, &gsi);
> 
>   FOR_EACH_PHI_ARG (op, phi, iter, SSA_OP_USE)
> {
>   use = USE_FROM_PTR (op);
>   if (TREE_CODE (use) == SSA_NAME
>   && gimple_code (SSA_NAME_DEF_STMT (use)) == GIMPLE_NOP)
> promote_ssa (use, &gsi);
>   fixup_uses (phi, &gsi, op, use);
> }
> 
> you still call promote_ssa on both DEFs and USEs and promote_ssa looks
> at SSA_NAME_DEF_STMT of the passed arg.  Please call promote_ssa just
> on DEFs and fixup_uses on USEs.

I am doing this to promote SSA that are defined with GIMPLE_NOP. Is
there anyway to iterate over this. I have added gcc_assert to make sure
that promote_ssa is called only once.

> 
> Any reason you do not promote debug stmts during the DOM walk?
> 
> So for each DEF you record in ssa_name_info
> 
> struct ssa_name_info
> {
>   tree ssa;
>   tree type;
>   tree promoted_type;
> };
> 
> (the fields need documenting).  Add a tree promoted_def to it which you
> can replace any use of the DEF with.

In this version of the patch, I am promoting the def in place. If we
decide to change, I will add it. If I understand you correctly, this is
to be used in iterating over uses and fixing.

> 
> Currently as you call promote_ssa for DEFs and USEs you repeatedly
> overwrite the entry in ssa_name_info_map with a new copy.  So you
> should assert it wasn't already there.
> 
>   switch (gimple_code (def_stmt))
> {
> case GIMPLE_PHI:
> {
> 
> the last { is indented too much it should be indented 2 spaces
> relative to the 'case'

Done.

> 
> 
>   SSA_NAME_RANGE_INFO (def) = NULL;
> 
> only needed in the case 'def' was promoted itself.  Please use
> reset_flow_sensitive_info (def).

We are promoting all the defs. In some-cases we can however use the
value ranges in SSA just by promoting to new type (as the values will be
the same). Shall I do it as a follow up.
> 
>>>
>>> Instead of this you should, in promote_all_stmts, walk over all uses doing 
>>> what
>>> fixup_uses does and then walk over all defs, doing what promote_ssa does.
>>>
>>> +case GIMPLE_NOP:
>>> +   {
>>> + if (SSA_NAME_VAR (def) == NULL)
>>> +   {
>>> + /* Promote def by fixing its type for anonymous def.  */
>>> + TREE_TYPE (def) = promoted_type;
>>> +   }
>>> + else
>>> +   {
>>> + /* Create a promoted copy of parameters.  */
>>> + bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
>>>
>>> I think the uninitialized vars are somewhat tricky and it would be best
>>> to create a new uninit anonymous SSA name for them.  You can
>>> have SSA_NAME_VAR != NULL and def _not_ being a parameter
>>> btw.
>>
>> Done. I also had to do some changes to in couple of other places to
>> reflect this.
>> They are:
>> --- a/gcc/tree-ssa-reassoc.c
>> +++ b/gcc/tree-ssa-reassoc.c
>> @@ -302,6 +302,7 @@ phi_rank (gimple *stmt)
>>  {
>>tree arg = gimple_phi_arg_def (stmt, i);
>>if (TREE_CODE (arg) == SSA_NAME
>> + && SSA_NAME_VAR (arg)
>>   && !SSA_NAME_IS_DEFAULT_DEF (arg))
>> {
>>   gimple *def_stmt = SSA_NAME_DEF_STMT (arg);
>> @@ -434,7 +435,8 @@ get_rank (tree e)
>>if (gimple_code (stmt) == GIMPLE_PHI)
>> return phi_rank (stmt);
>>
>> -  if (!is_gimple_assign (stmt))
>> +  if (!is_gimple_assign (stmt)
>> + && !gimple_nop_p (stmt))
>> return bb_rank[gimple_bb (stmt)->index];
>>
>> and
>>
>> --- a/gcc/tree-ssa.c
>> +++ b/gcc/tree-ssa.c
>> @@ -752,7 +752,8 @@ verify_use (basic_block bb, basic_block def_bb,
>> use_operand_p use_p,
>>TREE_VISITED (ssa_name) = 1;
>>
>>if (gimple_nop_p (SSA_NAME_DEF_STMT (ssa_name))
>> -  && SSA_NAME_IS_DEFAULT_DEF (ssa_name))
>> +  && (SSA_NAME_IS_DEFAULT_DEF (ssa_name)
>> + || SSA_NAME_VAR (ssa_name) == NULL))
>>  ; /* Default definitions have empty statements.  Nothing to do.  */
>>else if (!def_bb)
>>  {
>>
>> Does this look OK?
> 
> Hmm, no, this looks bogus.

I have removed all the above.

> 
> I think the best thing to do is not promoting default defs at all and instead
> promote at the uses.
> 
>   /* Create a promoted copy of parameters.  */
>   

Re: [0/7] Type promotion pass and elimination of zext/sext

2015-11-10 Thread Richard Biener
On Sun, Nov 8, 2015 at 10:43 AM, Kugan
 wrote:
>
> Thanks Richard for the comments.  Please find the attached patches which
> now passes bootstrap with x86_64-none-linux-gnu, aarch64-linux-gnu  and
> ppc64-linux-gnu. Regression testing is ongoing. Please find the comments
> for your questions/suggestions below.
>
>>
>> I notice
>>
>> diff --git a/gcc/tree-ssanames.c b/gcc/tree-ssanames.c
>> index 82fd4a1..80fcf70 100644
>> --- a/gcc/tree-ssanames.c
>> +++ b/gcc/tree-ssanames.c
>> @@ -207,7 +207,8 @@ set_range_info (tree name, enum value_range_type 
>> range_type,
>>unsigned int precision = TYPE_PRECISION (TREE_TYPE (name));
>>
>>/* Allocate if not available.  */
>> -  if (ri == NULL)
>> +  if (ri == NULL
>> +  || (precision != ri->get_min ().get_precision ()))
>>
>> and I think you need to clear range info on promoted SSA vars in the
>> promotion pass.
>
> Done.
>
>>
>> The basic "structure" thing still remains.  You walk over all uses and
>> defs in all stmts
>> in promote_all_stmts which ends up calling promote_ssa_if_not_promoted on all
>> uses and defs which in turn promotes (the "def") and then fixes up all
>> uses in all stmts.
>
> Done.

Not exactly.  I still see

/* Promote all the stmts in the basic block.  */
static void
promote_all_stmts (basic_block bb)
{
  gimple_stmt_iterator gsi;
  ssa_op_iter iter;
  tree def, use;
  use_operand_p op;

  for (gphi_iterator gpi = gsi_start_phis (bb);
   !gsi_end_p (gpi); gsi_next (&gpi))
{
  gphi *phi = gpi.phi ();
  def = PHI_RESULT (phi);
  promote_ssa (def, &gsi);

  FOR_EACH_PHI_ARG (op, phi, iter, SSA_OP_USE)
{
  use = USE_FROM_PTR (op);
  if (TREE_CODE (use) == SSA_NAME
  && gimple_code (SSA_NAME_DEF_STMT (use)) == GIMPLE_NOP)
promote_ssa (use, &gsi);
  fixup_uses (phi, &gsi, op, use);
}

you still call promote_ssa on both DEFs and USEs and promote_ssa looks
at SSA_NAME_DEF_STMT of the passed arg.  Please call promote_ssa just
on DEFs and fixup_uses on USEs.

Any reason you do not promote debug stmts during the DOM walk?

So for each DEF you record in ssa_name_info

struct ssa_name_info
{
  tree ssa;
  tree type;
  tree promoted_type;
};

(the fields need documenting).  Add a tree promoted_def to it which you
can replace any use of the DEF with.

Currently as you call promote_ssa for DEFs and USEs you repeatedly
overwrite the entry in ssa_name_info_map with a new copy.  So you
should assert it wasn't already there.

  switch (gimple_code (def_stmt))
{
case GIMPLE_PHI:
{

the last { is indented too much it should be indented 2 spaces
relative to the 'case'


  SSA_NAME_RANGE_INFO (def) = NULL;

only needed in the case 'def' was promoted itself.  Please use
reset_flow_sensitive_info (def).

>>
>> Instead of this you should, in promote_all_stmts, walk over all uses doing 
>> what
>> fixup_uses does and then walk over all defs, doing what promote_ssa does.
>>
>> +case GIMPLE_NOP:
>> +   {
>> + if (SSA_NAME_VAR (def) == NULL)
>> +   {
>> + /* Promote def by fixing its type for anonymous def.  */
>> + TREE_TYPE (def) = promoted_type;
>> +   }
>> + else
>> +   {
>> + /* Create a promoted copy of parameters.  */
>> + bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
>>
>> I think the uninitialized vars are somewhat tricky and it would be best
>> to create a new uninit anonymous SSA name for them.  You can
>> have SSA_NAME_VAR != NULL and def _not_ being a parameter
>> btw.
>
> Done. I also had to do some changes to in couple of other places to
> reflect this.
> They are:
> --- a/gcc/tree-ssa-reassoc.c
> +++ b/gcc/tree-ssa-reassoc.c
> @@ -302,6 +302,7 @@ phi_rank (gimple *stmt)
>  {
>tree arg = gimple_phi_arg_def (stmt, i);
>if (TREE_CODE (arg) == SSA_NAME
> + && SSA_NAME_VAR (arg)
>   && !SSA_NAME_IS_DEFAULT_DEF (arg))
> {
>   gimple *def_stmt = SSA_NAME_DEF_STMT (arg);
> @@ -434,7 +435,8 @@ get_rank (tree e)
>if (gimple_code (stmt) == GIMPLE_PHI)
> return phi_rank (stmt);
>
> -  if (!is_gimple_assign (stmt))
> +  if (!is_gimple_assign (stmt)
> + && !gimple_nop_p (stmt))
> return bb_rank[gimple_bb (stmt)->index];
>
> and
>
> --- a/gcc/tree-ssa.c
> +++ b/gcc/tree-ssa.c
> @@ -752,7 +752,8 @@ verify_use (basic_block bb, basic_block def_bb,
> use_operand_p use_p,
>TREE_VISITED (ssa_name) = 1;
>
>if (gimple_nop_p (SSA_NAME_DEF_STMT (ssa_name))
> -  && SSA_NAME_IS_DEFAULT_DEF (ssa_name))
> +  && (SSA_NAME_IS_DEFAULT_DEF (ssa_name)
> + || SSA_NAME_VAR (ssa_name) == NULL))
>  ; /* Default definitions have empty statements.  Nothing to do.  */
>else if (!def_bb)
>  {
>
> Does this look OK?

Hmm, no, this looks bogus.

I think the best thing to do is not promoting default defs at all and instead
promote at the uses.

  /* Cr

Re: [0/7] Type promotion pass and elimination of zext/sext

2015-11-08 Thread Kugan

Thanks Richard for the comments.  Please find the attached patches which
now passes bootstrap with x86_64-none-linux-gnu, aarch64-linux-gnu  and
ppc64-linux-gnu. Regression testing is ongoing. Please find the comments
for your questions/suggestions below.

> 
> I notice
> 
> diff --git a/gcc/tree-ssanames.c b/gcc/tree-ssanames.c
> index 82fd4a1..80fcf70 100644
> --- a/gcc/tree-ssanames.c
> +++ b/gcc/tree-ssanames.c
> @@ -207,7 +207,8 @@ set_range_info (tree name, enum value_range_type 
> range_type,
>unsigned int precision = TYPE_PRECISION (TREE_TYPE (name));
> 
>/* Allocate if not available.  */
> -  if (ri == NULL)
> +  if (ri == NULL
> +  || (precision != ri->get_min ().get_precision ()))
> 
> and I think you need to clear range info on promoted SSA vars in the
> promotion pass.

Done.

> 
> The basic "structure" thing still remains.  You walk over all uses and
> defs in all stmts
> in promote_all_stmts which ends up calling promote_ssa_if_not_promoted on all
> uses and defs which in turn promotes (the "def") and then fixes up all
> uses in all stmts.

Done.

> 
> Instead of this you should, in promote_all_stmts, walk over all uses doing 
> what
> fixup_uses does and then walk over all defs, doing what promote_ssa does.
> 
> +case GIMPLE_NOP:
> +   {
> + if (SSA_NAME_VAR (def) == NULL)
> +   {
> + /* Promote def by fixing its type for anonymous def.  */
> + TREE_TYPE (def) = promoted_type;
> +   }
> + else
> +   {
> + /* Create a promoted copy of parameters.  */
> + bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
> 
> I think the uninitialized vars are somewhat tricky and it would be best
> to create a new uninit anonymous SSA name for them.  You can
> have SSA_NAME_VAR != NULL and def _not_ being a parameter
> btw.

Done. I also had to do some changes to in couple of other places to
reflect this.
They are:
--- a/gcc/tree-ssa-reassoc.c
+++ b/gcc/tree-ssa-reassoc.c
@@ -302,6 +302,7 @@ phi_rank (gimple *stmt)
 {
   tree arg = gimple_phi_arg_def (stmt, i);
   if (TREE_CODE (arg) == SSA_NAME
+ && SSA_NAME_VAR (arg)
  && !SSA_NAME_IS_DEFAULT_DEF (arg))
{
  gimple *def_stmt = SSA_NAME_DEF_STMT (arg);
@@ -434,7 +435,8 @@ get_rank (tree e)
   if (gimple_code (stmt) == GIMPLE_PHI)
return phi_rank (stmt);

-  if (!is_gimple_assign (stmt))
+  if (!is_gimple_assign (stmt)
+ && !gimple_nop_p (stmt))
return bb_rank[gimple_bb (stmt)->index];

and

--- a/gcc/tree-ssa.c
+++ b/gcc/tree-ssa.c
@@ -752,7 +752,8 @@ verify_use (basic_block bb, basic_block def_bb,
use_operand_p use_p,
   TREE_VISITED (ssa_name) = 1;

   if (gimple_nop_p (SSA_NAME_DEF_STMT (ssa_name))
-  && SSA_NAME_IS_DEFAULT_DEF (ssa_name))
+  && (SSA_NAME_IS_DEFAULT_DEF (ssa_name)
+ || SSA_NAME_VAR (ssa_name) == NULL))
 ; /* Default definitions have empty statements.  Nothing to do.  */
   else if (!def_bb)
 {

Does this look OK?

> 
> +/* Return true if it is safe to promote the defined SSA_NAME in the STMT
> +   itself.  */
> +static bool
> +safe_to_promote_def_p (gimple *stmt)
> +{
> +  enum tree_code code = gimple_assign_rhs_code (stmt);
> +  if (gimple_vuse (stmt) != NULL_TREE
> +  || gimple_vdef (stmt) != NULL_TREE
> +  || code == ARRAY_REF
> +  || code == LROTATE_EXPR
> +  || code == RROTATE_EXPR
> +  || code == VIEW_CONVERT_EXPR
> +  || code == BIT_FIELD_REF
> +  || code == REALPART_EXPR
> +  || code == IMAGPART_EXPR
> +  || code == REDUC_MAX_EXPR
> +  || code == REDUC_PLUS_EXPR
> +  || code == REDUC_MIN_EXPR)
> +return false;
> +  return true;
> 
> huh, I think this function has an odd name, maybe
> can_promote_operation ()?  Please
> use TREE_CODE_CLASS (code) == tcc_reference for all _REF trees.

Done.

> 
> Note that as followup things like the rotates should be "expanded" like
> we'd do on RTL (open-coding the thing).  And we'd need a way to
> specify zero-/sign-extended loads.
> 
> +/* Return true if it is safe to promote the use in the STMT.  */
> +static bool
> +safe_to_promote_use_p (gimple *stmt)
> +{
> +  enum tree_code code = gimple_assign_rhs_code (stmt);
> +  tree lhs = gimple_assign_lhs (stmt);
> +
> +  if (gimple_vuse (stmt) != NULL_TREE
> +  || gimple_vdef (stmt) != NULL_TREE
> 
> I think the vuse/vdef check is bogus, you can have a use of 'i_3' in say
> _2 = a[i_3];
> 
When I remove this, I see errors in stmts like:

unsigned char
unsigned int
# .MEM_197 = VDEF <.MEM_187>
fs_9(D)->fde_encoding = _154;


> +  || code == VIEW_CONVERT_EXPR
> +  || code == LROTATE_EXPR
> +  || code == RROTATE_EXPR
> +  || code == CONSTRUCTOR
> +  || code == BIT_FIELD_REF
> +  || code == COMPLEX_EXPR
> +  || code == ASM_EXPR
> +  || VECTOR_TYPE_P (TREE_TYPE (lhs)))
> +return false;
> +  return true;
> 
> ASM_EXPR can never appear here.  I think PROMOTE_MODE

Re: [0/7] Type promotion pass and elimination of zext/sext

2015-11-03 Thread Richard Biener
On Mon, Nov 2, 2015 at 10:17 AM, Kugan
 wrote:
>
>
> On 29/10/15 02:45, Richard Biener wrote:
>> On Tue, Oct 27, 2015 at 1:50 AM, kugan
>>  wrote:
>>>
>>>
>>> On 23/10/15 01:23, Richard Biener wrote:

 On Thu, Oct 22, 2015 at 12:50 PM, Kugan
  wrote:
>
>
>
> On 21/10/15 23:45, Richard Biener wrote:
>>
>> On Tue, Oct 20, 2015 at 10:03 PM, Kugan
>>  wrote:
>>>
>>>
>>>
>>> On 07/09/15 12:53, Kugan wrote:


 This a new version of the patch posted in
 https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00226.html. I have done
 more testing and spitted the patch to make it more easier to review.
 There are still couple of issues to be addressed and I am working on
 them.

 1. AARCH64 bootstrap now fails with the commit
 94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c is
 mis-compiled
 in stage2 and fwprop.c is failing. It looks to me that there is a
 latent
 issue which gets exposed my patch. I can also reproduce this in x86_64
 if I use the same PROMOTE_MODE which is used in aarch64 port. For the
 time being, I am using  patch
 0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a
 workaround. This meeds to be fixed before the patches are ready to be
 committed.

 2. vector-compare-1.c from c-c++-common/torture fails to assemble with
 -O3 -g Error: unaligned opcodes detected in executable segment. It
 works
 fine if I remove the -g. I am looking into it and needs to be fixed as
 well.
>>>
>>>
>>> Hi Richard,
>>>
>>> Now that stage 1 is going to close, I would like to get these patches
>>> accepted for stage1. I will try my best to address your review comments
>>> ASAP.
>>
>>
>> Ok, can you make the whole patch series available so I can poke at the
>> implementation a bit?  Please state the revision it was rebased on
>> (or point me to a git/svn branch the work resides on).
>>
>
> Thanks. Please find the patched rebated against trunk@229156. I have
> skipped the test-case readjustment patches.


 Some quick observations.  On x86_64 when building
>>>
>>>
>>> Hi Richard,
>>>
>>> Thanks for the review.
>>>

 short bar (short y);
 int foo (short x)
 {
short y = bar (x) + 15;
return y;
 }

 with -m32 -O2 -mtune=pentiumpro (which ends up promoting HImode regs)
 I get

:
_1 = (int) x_10(D);
_2 = (_1) sext (16);
_11 = bar (_2);
_5 = (int) _11;
_12 = (unsigned int) _5;
_6 = _12 & 65535;
_7 = _6 + 15;
_13 = (int) _7;
_8 = (_13) sext (16);
_9 = (_8) sext (16);
return _9;

 which looks fine but the VRP optimization doesn't trigger for the
 redundant sext
 (ranges are computed correctly but the 2nd extension is not removed).
>
> Thanks for the comments. Please fond the attached patches with which I
> am now getting
> cat .192t.optimized
>
> ;; Function foo (foo, funcdef_no=0, decl_uid=1406, cgraph_uid=0,
> symbol_order=0)
>
> foo (short int x)
> {
>   signed int _1;
>   int _2;
>   signed int _5;
>   unsigned int _6;
>   unsigned int _7;
>   signed int _8;
>   int _9;
>   short int _11;
>   unsigned int _12;
>   signed int _13;
>
>   :
>   _1 = (signed int) x_10(D);
>   _2 = _1;
>   _11 = bar (_2);
>   _5 = (signed int) _11;
>   _12 = (unsigned int) _11;
>   _6 = _12 & 65535;
>   _7 = _6 + 15;
>   _13 = (signed int) _7;
>   _8 = (_13) sext (16);
>   _9 = _8;
>   return _9;
>
> }
>
>
> There are still some redundancies. The asm difference after RTL
> optimizations is
>
> -   addl$15, %eax
> +   addw$15, %ax
>
>

 This also makes me notice trivial match.pd patterns are missing, like
 for example

 (simplify
   (sext (sext@2 @0 @1) @3)
   (if (tree_int_cst_compare (@1, @3) <= 0)
@2
(sext @0 @3)))

 as VRP doesn't run at -O1 we must rely on those to remove rendudant
 extensions,
 otherwise generated code might get worse compared to without the pass(?)
>>>
>>>
>>> Do you think that we should enable this pass only when vrp is enabled.
>>> Otherwise, even when we do the simple optimizations you mentioned below, we
>>> might not be able to remove all the redundancies.
>>>

 I also notice that the 'short' argument does not get it's sign-extension
 removed
 as redundand either even though we have

 _1 = (int) x_8(D);
 Found new range for _1: [-32768, 32767]

>>>
>>> I am looking into it.
>>>
 In the end I suspect that keeping track of the "simple" cases in the
 promotion
 pass itself (by keeping a lattice) might be a good idea (after we fix VRP
 to do
 its work).  In some way whether the ABI guarantees pr

Re: [0/7] Type promotion pass and elimination of zext/sext

2015-11-02 Thread Kugan


On 29/10/15 02:45, Richard Biener wrote:
> On Tue, Oct 27, 2015 at 1:50 AM, kugan
>  wrote:
>>
>>
>> On 23/10/15 01:23, Richard Biener wrote:
>>>
>>> On Thu, Oct 22, 2015 at 12:50 PM, Kugan
>>>  wrote:



 On 21/10/15 23:45, Richard Biener wrote:
>
> On Tue, Oct 20, 2015 at 10:03 PM, Kugan
>  wrote:
>>
>>
>>
>> On 07/09/15 12:53, Kugan wrote:
>>>
>>>
>>> This a new version of the patch posted in
>>> https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00226.html. I have done
>>> more testing and spitted the patch to make it more easier to review.
>>> There are still couple of issues to be addressed and I am working on
>>> them.
>>>
>>> 1. AARCH64 bootstrap now fails with the commit
>>> 94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c is
>>> mis-compiled
>>> in stage2 and fwprop.c is failing. It looks to me that there is a
>>> latent
>>> issue which gets exposed my patch. I can also reproduce this in x86_64
>>> if I use the same PROMOTE_MODE which is used in aarch64 port. For the
>>> time being, I am using  patch
>>> 0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a
>>> workaround. This meeds to be fixed before the patches are ready to be
>>> committed.
>>>
>>> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with
>>> -O3 -g Error: unaligned opcodes detected in executable segment. It
>>> works
>>> fine if I remove the -g. I am looking into it and needs to be fixed as
>>> well.
>>
>>
>> Hi Richard,
>>
>> Now that stage 1 is going to close, I would like to get these patches
>> accepted for stage1. I will try my best to address your review comments
>> ASAP.
>
>
> Ok, can you make the whole patch series available so I can poke at the
> implementation a bit?  Please state the revision it was rebased on
> (or point me to a git/svn branch the work resides on).
>

 Thanks. Please find the patched rebated against trunk@229156. I have
 skipped the test-case readjustment patches.
>>>
>>>
>>> Some quick observations.  On x86_64 when building
>>
>>
>> Hi Richard,
>>
>> Thanks for the review.
>>
>>>
>>> short bar (short y);
>>> int foo (short x)
>>> {
>>>short y = bar (x) + 15;
>>>return y;
>>> }
>>>
>>> with -m32 -O2 -mtune=pentiumpro (which ends up promoting HImode regs)
>>> I get
>>>
>>>:
>>>_1 = (int) x_10(D);
>>>_2 = (_1) sext (16);
>>>_11 = bar (_2);
>>>_5 = (int) _11;
>>>_12 = (unsigned int) _5;
>>>_6 = _12 & 65535;
>>>_7 = _6 + 15;
>>>_13 = (int) _7;
>>>_8 = (_13) sext (16);
>>>_9 = (_8) sext (16);
>>>return _9;
>>>
>>> which looks fine but the VRP optimization doesn't trigger for the
>>> redundant sext
>>> (ranges are computed correctly but the 2nd extension is not removed).

Thanks for the comments. Please fond the attached patches with which I
am now getting
cat .192t.optimized

;; Function foo (foo, funcdef_no=0, decl_uid=1406, cgraph_uid=0,
symbol_order=0)

foo (short int x)
{
  signed int _1;
  int _2;
  signed int _5;
  unsigned int _6;
  unsigned int _7;
  signed int _8;
  int _9;
  short int _11;
  unsigned int _12;
  signed int _13;

  :
  _1 = (signed int) x_10(D);
  _2 = _1;
  _11 = bar (_2);
  _5 = (signed int) _11;
  _12 = (unsigned int) _11;
  _6 = _12 & 65535;
  _7 = _6 + 15;
  _13 = (signed int) _7;
  _8 = (_13) sext (16);
  _9 = _8;
  return _9;

}


There are still some redundancies. The asm difference after RTL
optimizations is

-   addl$15, %eax
+   addw$15, %ax


>>>
>>> This also makes me notice trivial match.pd patterns are missing, like
>>> for example
>>>
>>> (simplify
>>>   (sext (sext@2 @0 @1) @3)
>>>   (if (tree_int_cst_compare (@1, @3) <= 0)
>>>@2
>>>(sext @0 @3)))
>>>
>>> as VRP doesn't run at -O1 we must rely on those to remove rendudant
>>> extensions,
>>> otherwise generated code might get worse compared to without the pass(?)
>>
>>
>> Do you think that we should enable this pass only when vrp is enabled.
>> Otherwise, even when we do the simple optimizations you mentioned below, we
>> might not be able to remove all the redundancies.
>>
>>>
>>> I also notice that the 'short' argument does not get it's sign-extension
>>> removed
>>> as redundand either even though we have
>>>
>>> _1 = (int) x_8(D);
>>> Found new range for _1: [-32768, 32767]
>>>
>>
>> I am looking into it.
>>
>>> In the end I suspect that keeping track of the "simple" cases in the
>>> promotion
>>> pass itself (by keeping a lattice) might be a good idea (after we fix VRP
>>> to do
>>> its work).  In some way whether the ABI guarantees promoted argument
>>> registers might need some other target hook queries.

I tried adding it in the attached patch with record_visit_stmt to track
whether an ssa would have value overflow or properly zero/sign extended
in promoted mode. We can use this to elimi

Re: [0/7] Type promotion pass and elimination of zext/sext

2015-10-28 Thread Richard Biener
On Tue, Oct 27, 2015 at 1:50 AM, kugan
 wrote:
>
>
> On 23/10/15 01:23, Richard Biener wrote:
>>
>> On Thu, Oct 22, 2015 at 12:50 PM, Kugan
>>  wrote:
>>>
>>>
>>>
>>> On 21/10/15 23:45, Richard Biener wrote:

 On Tue, Oct 20, 2015 at 10:03 PM, Kugan
  wrote:
>
>
>
> On 07/09/15 12:53, Kugan wrote:
>>
>>
>> This a new version of the patch posted in
>> https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00226.html. I have done
>> more testing and spitted the patch to make it more easier to review.
>> There are still couple of issues to be addressed and I am working on
>> them.
>>
>> 1. AARCH64 bootstrap now fails with the commit
>> 94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c is
>> mis-compiled
>> in stage2 and fwprop.c is failing. It looks to me that there is a
>> latent
>> issue which gets exposed my patch. I can also reproduce this in x86_64
>> if I use the same PROMOTE_MODE which is used in aarch64 port. For the
>> time being, I am using  patch
>> 0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a
>> workaround. This meeds to be fixed before the patches are ready to be
>> committed.
>>
>> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with
>> -O3 -g Error: unaligned opcodes detected in executable segment. It
>> works
>> fine if I remove the -g. I am looking into it and needs to be fixed as
>> well.
>
>
> Hi Richard,
>
> Now that stage 1 is going to close, I would like to get these patches
> accepted for stage1. I will try my best to address your review comments
> ASAP.


 Ok, can you make the whole patch series available so I can poke at the
 implementation a bit?  Please state the revision it was rebased on
 (or point me to a git/svn branch the work resides on).

>>>
>>> Thanks. Please find the patched rebated against trunk@229156. I have
>>> skipped the test-case readjustment patches.
>>
>>
>> Some quick observations.  On x86_64 when building
>
>
> Hi Richard,
>
> Thanks for the review.
>
>>
>> short bar (short y);
>> int foo (short x)
>> {
>>short y = bar (x) + 15;
>>return y;
>> }
>>
>> with -m32 -O2 -mtune=pentiumpro (which ends up promoting HImode regs)
>> I get
>>
>>:
>>_1 = (int) x_10(D);
>>_2 = (_1) sext (16);
>>_11 = bar (_2);
>>_5 = (int) _11;
>>_12 = (unsigned int) _5;
>>_6 = _12 & 65535;
>>_7 = _6 + 15;
>>_13 = (int) _7;
>>_8 = (_13) sext (16);
>>_9 = (_8) sext (16);
>>return _9;
>>
>> which looks fine but the VRP optimization doesn't trigger for the
>> redundant sext
>> (ranges are computed correctly but the 2nd extension is not removed).
>>
>> This also makes me notice trivial match.pd patterns are missing, like
>> for example
>>
>> (simplify
>>   (sext (sext@2 @0 @1) @3)
>>   (if (tree_int_cst_compare (@1, @3) <= 0)
>>@2
>>(sext @0 @3)))
>>
>> as VRP doesn't run at -O1 we must rely on those to remove rendudant
>> extensions,
>> otherwise generated code might get worse compared to without the pass(?)
>
>
> Do you think that we should enable this pass only when vrp is enabled.
> Otherwise, even when we do the simple optimizations you mentioned below, we
> might not be able to remove all the redundancies.
>
>>
>> I also notice that the 'short' argument does not get it's sign-extension
>> removed
>> as redundand either even though we have
>>
>> _1 = (int) x_8(D);
>> Found new range for _1: [-32768, 32767]
>>
>
> I am looking into it.
>
>> In the end I suspect that keeping track of the "simple" cases in the
>> promotion
>> pass itself (by keeping a lattice) might be a good idea (after we fix VRP
>> to do
>> its work).  In some way whether the ABI guarantees promoted argument
>> registers might need some other target hook queries.
>>
>> Now onto the 0002 patch.
>>
>> +static bool
>> +type_precision_ok (tree type)
>> +{
>> +  return (TYPE_PRECISION (type)  == 8
>> + || TYPE_PRECISION (type) == 16
>> + || TYPE_PRECISION (type) == 32);
>> +}
>>
>> that's a weird function to me.  You probably want
>> TYPE_PRECISION (type) == GET_MODE_PRECISION (TYPE_MODE (type))
>> here?  And guard that thing with POINTER_TYPE_P || INTEGRAL_TYPE_P?
>>
>
> I will change this. (I have a patch which I am testing with other changes
> you have asked for)
>
>
>> +/* Return the promoted type for TYPE.  */
>> +static tree
>> +get_promoted_type (tree type)
>> +{
>> +  tree promoted_type;
>> +  enum machine_mode mode;
>> +  int uns;
>> +  if (POINTER_TYPE_P (type)
>> +  || !INTEGRAL_TYPE_P (type)
>> +  || !type_precision_ok (type))
>> +return type;
>> +
>> +  mode = TYPE_MODE (type);
>> +#ifdef PROMOTE_MODE
>> +  uns = TYPE_SIGN (type);
>> +  PROMOTE_MODE (mode, uns, type);
>> +#endif
>> +  uns = TYPE_SIGN (type);
>> +  promoted_type = lang_hooks.types.type_for_mode (mode, uns);
>> +  if (promoted_type
>> +  && 

Re: [0/7] Type promotion pass and elimination of zext/sext

2015-10-26 Thread kugan



On 23/10/15 01:23, Richard Biener wrote:

On Thu, Oct 22, 2015 at 12:50 PM, Kugan
 wrote:



On 21/10/15 23:45, Richard Biener wrote:

On Tue, Oct 20, 2015 at 10:03 PM, Kugan
 wrote:



On 07/09/15 12:53, Kugan wrote:


This a new version of the patch posted in
https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00226.html. I have done
more testing and spitted the patch to make it more easier to review.
There are still couple of issues to be addressed and I am working on them.

1. AARCH64 bootstrap now fails with the commit
94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c is mis-compiled
in stage2 and fwprop.c is failing. It looks to me that there is a latent
issue which gets exposed my patch. I can also reproduce this in x86_64
if I use the same PROMOTE_MODE which is used in aarch64 port. For the
time being, I am using  patch
0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a
workaround. This meeds to be fixed before the patches are ready to be
committed.

2. vector-compare-1.c from c-c++-common/torture fails to assemble with
-O3 -g Error: unaligned opcodes detected in executable segment. It works
fine if I remove the -g. I am looking into it and needs to be fixed as well.


Hi Richard,

Now that stage 1 is going to close, I would like to get these patches
accepted for stage1. I will try my best to address your review comments
ASAP.


Ok, can you make the whole patch series available so I can poke at the
implementation a bit?  Please state the revision it was rebased on
(or point me to a git/svn branch the work resides on).



Thanks. Please find the patched rebated against trunk@229156. I have
skipped the test-case readjustment patches.


Some quick observations.  On x86_64 when building


Hi Richard,

Thanks for the review.


short bar (short y);
int foo (short x)
{
   short y = bar (x) + 15;
   return y;
}

with -m32 -O2 -mtune=pentiumpro (which ends up promoting HImode regs)
I get

   :
   _1 = (int) x_10(D);
   _2 = (_1) sext (16);
   _11 = bar (_2);
   _5 = (int) _11;
   _12 = (unsigned int) _5;
   _6 = _12 & 65535;
   _7 = _6 + 15;
   _13 = (int) _7;
   _8 = (_13) sext (16);
   _9 = (_8) sext (16);
   return _9;

which looks fine but the VRP optimization doesn't trigger for the redundant sext
(ranges are computed correctly but the 2nd extension is not removed).

This also makes me notice trivial match.pd patterns are missing, like
for example

(simplify
  (sext (sext@2 @0 @1) @3)
  (if (tree_int_cst_compare (@1, @3) <= 0)
   @2
   (sext @0 @3)))

as VRP doesn't run at -O1 we must rely on those to remove rendudant extensions,
otherwise generated code might get worse compared to without the pass(?)


Do you think that we should enable this pass only when vrp is enabled. 
Otherwise, even when we do the simple optimizations you mentioned below, 
we might not be able to remove all the redundancies.




I also notice that the 'short' argument does not get it's sign-extension removed
as redundand either even though we have

_1 = (int) x_8(D);
Found new range for _1: [-32768, 32767]



I am looking into it.


In the end I suspect that keeping track of the "simple" cases in the promotion
pass itself (by keeping a lattice) might be a good idea (after we fix VRP to do
its work).  In some way whether the ABI guarantees promoted argument
registers might need some other target hook queries.

Now onto the 0002 patch.

+static bool
+type_precision_ok (tree type)
+{
+  return (TYPE_PRECISION (type)  == 8
+ || TYPE_PRECISION (type) == 16
+ || TYPE_PRECISION (type) == 32);
+}

that's a weird function to me.  You probably want
TYPE_PRECISION (type) == GET_MODE_PRECISION (TYPE_MODE (type))
here?  And guard that thing with POINTER_TYPE_P || INTEGRAL_TYPE_P?



I will change this. (I have a patch which I am testing with other 
changes you have asked for)



+/* Return the promoted type for TYPE.  */
+static tree
+get_promoted_type (tree type)
+{
+  tree promoted_type;
+  enum machine_mode mode;
+  int uns;
+  if (POINTER_TYPE_P (type)
+  || !INTEGRAL_TYPE_P (type)
+  || !type_precision_ok (type))
+return type;
+
+  mode = TYPE_MODE (type);
+#ifdef PROMOTE_MODE
+  uns = TYPE_SIGN (type);
+  PROMOTE_MODE (mode, uns, type);
+#endif
+  uns = TYPE_SIGN (type);
+  promoted_type = lang_hooks.types.type_for_mode (mode, uns);
+  if (promoted_type
+  && (TYPE_PRECISION (promoted_type) > TYPE_PRECISION (type)))
+type = promoted_type;

I think what you want to verify is that TYPE_PRECISION (promoted_type)
== GET_MODE_PRECISION (mode).
And to not even bother with this simply use

promoted_type = build_nonstandard_integer_type (GET_MODE_PRECISION (mode), uns);



I am changing this too.


You use a domwalk but also might create new basic-blocks during it
(insert_on_edge_immediate), that's a
no-no, commit edge inserts after the domwalk.


I am sorry, I dont understand "commit edge inserts after the domwalk" Is 
there a way to do this in the current implementation?



ssa_sets_h

Re: [0/7] Type promotion pass and elimination of zext/sext

2015-10-22 Thread Richard Biener
On Thu, Oct 22, 2015 at 12:50 PM, Kugan
 wrote:
>
>
> On 21/10/15 23:45, Richard Biener wrote:
>> On Tue, Oct 20, 2015 at 10:03 PM, Kugan
>>  wrote:
>>>
>>>
>>> On 07/09/15 12:53, Kugan wrote:

 This a new version of the patch posted in
 https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00226.html. I have done
 more testing and spitted the patch to make it more easier to review.
 There are still couple of issues to be addressed and I am working on them.

 1. AARCH64 bootstrap now fails with the commit
 94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c is mis-compiled
 in stage2 and fwprop.c is failing. It looks to me that there is a latent
 issue which gets exposed my patch. I can also reproduce this in x86_64
 if I use the same PROMOTE_MODE which is used in aarch64 port. For the
 time being, I am using  patch
 0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a
 workaround. This meeds to be fixed before the patches are ready to be
 committed.

 2. vector-compare-1.c from c-c++-common/torture fails to assemble with
 -O3 -g Error: unaligned opcodes detected in executable segment. It works
 fine if I remove the -g. I am looking into it and needs to be fixed as 
 well.
>>>
>>> Hi Richard,
>>>
>>> Now that stage 1 is going to close, I would like to get these patches
>>> accepted for stage1. I will try my best to address your review comments
>>> ASAP.
>>
>> Ok, can you make the whole patch series available so I can poke at the
>> implementation a bit?  Please state the revision it was rebased on
>> (or point me to a git/svn branch the work resides on).
>>
>
> Thanks. Please find the patched rebated against trunk@229156. I have
> skipped the test-case readjustment patches.

Some quick observations.  On x86_64 when building

short bar (short y);
int foo (short x)
{
  short y = bar (x) + 15;
  return y;
}

with -m32 -O2 -mtune=pentiumpro (which ends up promoting HImode regs)
I get

  :
  _1 = (int) x_10(D);
  _2 = (_1) sext (16);
  _11 = bar (_2);
  _5 = (int) _11;
  _12 = (unsigned int) _5;
  _6 = _12 & 65535;
  _7 = _6 + 15;
  _13 = (int) _7;
  _8 = (_13) sext (16);
  _9 = (_8) sext (16);
  return _9;

which looks fine but the VRP optimization doesn't trigger for the redundant sext
(ranges are computed correctly but the 2nd extension is not removed).

This also makes me notice trivial match.pd patterns are missing, like
for example

(simplify
 (sext (sext@2 @0 @1) @3)
 (if (tree_int_cst_compare (@1, @3) <= 0)
  @2
  (sext @0 @3)))

as VRP doesn't run at -O1 we must rely on those to remove rendudant extensions,
otherwise generated code might get worse compared to without the pass(?)

I also notice that the 'short' argument does not get it's sign-extension removed
as redundand either even though we have

_1 = (int) x_8(D);
Found new range for _1: [-32768, 32767]

In the end I suspect that keeping track of the "simple" cases in the promotion
pass itself (by keeping a lattice) might be a good idea (after we fix VRP to do
its work).  In some way whether the ABI guarantees promoted argument
registers might need some other target hook queries.

Now onto the 0002 patch.

+static bool
+type_precision_ok (tree type)
+{
+  return (TYPE_PRECISION (type)  == 8
+ || TYPE_PRECISION (type) == 16
+ || TYPE_PRECISION (type) == 32);
+}

that's a weird function to me.  You probably want
TYPE_PRECISION (type) == GET_MODE_PRECISION (TYPE_MODE (type))
here?  And guard that thing with POINTER_TYPE_P || INTEGRAL_TYPE_P?

+/* Return the promoted type for TYPE.  */
+static tree
+get_promoted_type (tree type)
+{
+  tree promoted_type;
+  enum machine_mode mode;
+  int uns;
+  if (POINTER_TYPE_P (type)
+  || !INTEGRAL_TYPE_P (type)
+  || !type_precision_ok (type))
+return type;
+
+  mode = TYPE_MODE (type);
+#ifdef PROMOTE_MODE
+  uns = TYPE_SIGN (type);
+  PROMOTE_MODE (mode, uns, type);
+#endif
+  uns = TYPE_SIGN (type);
+  promoted_type = lang_hooks.types.type_for_mode (mode, uns);
+  if (promoted_type
+  && (TYPE_PRECISION (promoted_type) > TYPE_PRECISION (type)))
+type = promoted_type;

I think what you want to verify is that TYPE_PRECISION (promoted_type)
== GET_MODE_PRECISION (mode).
And to not even bother with this simply use

promoted_type = build_nonstandard_integer_type (GET_MODE_PRECISION (mode), uns);

You use a domwalk but also might create new basic-blocks during it
(insert_on_edge_immediate), that's a
no-no, commit edge inserts after the domwalk.
ssa_sets_higher_bits_bitmap looks unused and
we generally don't free dominance info, so please don't do that.

I fired off a bootstrap on ppc64-linux which fails building stage1 libgcc with

/abuild/rguenther/obj/./gcc/xgcc -B/abuild/rguenther/obj/./gcc/
-B/usr/local/powerpc64-unknown-linux-gnu/bin/
-B/usr/local/powerpc64-unknown-linux-gnu/lib/ -isystem
/usr/local/powerpc64-unknown-linux-gnu/include -isystem
/usr/local/powerpc64-unknown-linux-gnu/sy

Re: [0/7] Type promotion pass and elimination of zext/sext

2015-10-22 Thread Richard Biener
On Wed, Oct 21, 2015 at 7:55 PM, Richard Henderson  wrote:
> On 10/21/2015 03:56 AM, Richard Biener wrote:
>>
>> On Wed, Oct 21, 2015 at 2:45 PM, Richard Biener
>>  wrote:
>>>
>>> On Tue, Oct 20, 2015 at 10:03 PM, Kugan
>>>  wrote:



 On 07/09/15 12:53, Kugan wrote:
>
>
> This a new version of the patch posted in
> https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00226.html. I have done
> more testing and spitted the patch to make it more easier to review.
> There are still couple of issues to be addressed and I am working on
> them.
>
> 1. AARCH64 bootstrap now fails with the commit
> 94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c is
> mis-compiled
> in stage2 and fwprop.c is failing. It looks to me that there is a
> latent
> issue which gets exposed my patch. I can also reproduce this in x86_64
> if I use the same PROMOTE_MODE which is used in aarch64 port. For the
> time being, I am using  patch
> 0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a
> workaround. This meeds to be fixed before the patches are ready to be
> committed.
>
> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with
> -O3 -g Error: unaligned opcodes detected in executable segment. It
> works
> fine if I remove the -g. I am looking into it and needs to be fixed as
> well.


 Hi Richard,

 Now that stage 1 is going to close, I would like to get these patches
 accepted for stage1. I will try my best to address your review comments
 ASAP.
>>>
>>>
>>> Ok, can you make the whole patch series available so I can poke at the
>>> implementation a bit?  Please state the revision it was rebased on
>>> (or point me to a git/svn branch the work resides on).
>>>
 * Issue 1 above (AARCH64 bootstrap now fails with the commit) is no
 longer present as it is fixed in trunk. Patch-6 is no longer needed.

 * Issue 2 is also reported as known issue

 *  Promotion of PARM_DECLs and RESULT_DECLs in IPA pass and patterns in
 match.pd for SEXT_EXPR, I would like to propose them as a follow up
 patch once this is accepted.
>>>
>>>
>>> I thought more about this and don't think it can be made work without a
>>> lot of
>>> hassle.  Instead to get rid of the remaining "badly" typed registers in
>>> the
>>> function we can key different type requirements on a pass property
>>> (PROP_promoted_regs), thus simply change the expectation of the
>>> types of function parameters / results according to their promotion.
>>
>>
>> Or maybe we should simply make GIMPLE _always_ adhere to the ABI
>> details from the start (gimplification).  Note that this does not only
>> involve
>> PROMOTE_MODE.  Note that for what GIMPLE is concerned I'd only
>> "lower" passing / returning in registers (whee, and then we have
>> things like targetm.calls.split_complex_arg ... not to mention passing
>> GIMPLE memory in registers).
>>
>> Maybe I'm shooting too far here in the attempt to make GIMPLE closer
>> to the target (to expose those redundant extensions on GIMPLE) and
>> we'll end up with a bigger mess than with not doing this?
>
>
> I'm leary of building this in as early as gimplification, lest we get into
> trouble with splitting out bits of the current function for off-loading.
> What happens when the cpu and gpu have different promotion rules?

Ah, of course.  I tend to forget these issues.

Richard.

>
> r~


Re: [0/7] Type promotion pass and elimination of zext/sext

2015-10-22 Thread Kugan


On 21/10/15 23:45, Richard Biener wrote:
> On Tue, Oct 20, 2015 at 10:03 PM, Kugan
>  wrote:
>>
>>
>> On 07/09/15 12:53, Kugan wrote:
>>>
>>> This a new version of the patch posted in
>>> https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00226.html. I have done
>>> more testing and spitted the patch to make it more easier to review.
>>> There are still couple of issues to be addressed and I am working on them.
>>>
>>> 1. AARCH64 bootstrap now fails with the commit
>>> 94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c is mis-compiled
>>> in stage2 and fwprop.c is failing. It looks to me that there is a latent
>>> issue which gets exposed my patch. I can also reproduce this in x86_64
>>> if I use the same PROMOTE_MODE which is used in aarch64 port. For the
>>> time being, I am using  patch
>>> 0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a
>>> workaround. This meeds to be fixed before the patches are ready to be
>>> committed.
>>>
>>> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with
>>> -O3 -g Error: unaligned opcodes detected in executable segment. It works
>>> fine if I remove the -g. I am looking into it and needs to be fixed as well.
>>
>> Hi Richard,
>>
>> Now that stage 1 is going to close, I would like to get these patches
>> accepted for stage1. I will try my best to address your review comments
>> ASAP.
> 
> Ok, can you make the whole patch series available so I can poke at the
> implementation a bit?  Please state the revision it was rebased on
> (or point me to a git/svn branch the work resides on).
> 

Thanks. Please find the patched rebated against trunk@229156. I have
skipped the test-case readjustment patches.


Thanks,
Kugan
>From 2dc1cccfc59ae6967928b52396227b52a50803d9 Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah 
Date: Thu, 22 Oct 2015 10:54:31 +1100
Subject: [PATCH 4/4] debug stmt in widen mode

---
 gcc/gimple-ssa-type-promote.c | 82 +--
 1 file changed, 79 insertions(+), 3 deletions(-)

diff --git a/gcc/gimple-ssa-type-promote.c b/gcc/gimple-ssa-type-promote.c
index e62a7c6..c0b6aa1 100644
--- a/gcc/gimple-ssa-type-promote.c
+++ b/gcc/gimple-ssa-type-promote.c
@@ -589,10 +589,86 @@ fixup_uses (tree use, tree promoted_type, tree old_type)
 	{
 	case GIMPLE_DEBUG:
 	{
-	  gsi = gsi_for_stmt (stmt);
-	  gsi_remove (&gsi, true);
-	  break;
+	  /* Change the GIMPLE_DEBUG stmt such that the value bound is
+		 computed in promoted_type and then converted to required
+		 type.  */
+	  tree op, new_op = NULL_TREE;
+	  gdebug *copy = NULL, *gs = as_a  (stmt);
+	  enum tree_code code;
+
+	  /* Get the value that is bound in debug stmt.  */
+	  switch (gs->subcode)
+		{
+		case GIMPLE_DEBUG_BIND:
+		  op = gimple_debug_bind_get_value (gs);
+		  break;
+		case GIMPLE_DEBUG_SOURCE_BIND:
+		  op = gimple_debug_source_bind_get_value (gs);
+		  break;
+		default:
+		  gcc_unreachable ();
+		}
+
+	  code = TREE_CODE (op);
+	  /* Convert the value computed in promoted_type to
+		 old_type.  */
+	  if (code == SSA_NAME && use == op)
+		new_op = build1 (NOP_EXPR, old_type, use);
+	  else if (TREE_CODE_CLASS (TREE_CODE (op)) == tcc_unary
+		   && code != NOP_EXPR)
+		{
+		  tree op0 = TREE_OPERAND (op, 0);
+		  if (op0 == use)
+		{
+		  tree temp = build1 (code, promoted_type, op0);
+		  new_op = build1 (NOP_EXPR, old_type, temp);
+		}
+		}
+	  else if (TREE_CODE_CLASS (TREE_CODE (op)) == tcc_binary
+		   /* Skip codes that are rejected in safe_to_promote_use_p.  */
+		   && code != LROTATE_EXPR
+		   && code != RROTATE_EXPR
+		   && code != COMPLEX_EXPR)
+		{
+		  tree op0 = TREE_OPERAND (op, 0);
+		  tree op1 = TREE_OPERAND (op, 1);
+		  if (op0 == use || op1 == use)
+		{
+		  if (TREE_CODE (op0) == INTEGER_CST)
+			op0 = convert_int_cst (promoted_type, op0, SIGNED);
+		  if (TREE_CODE (op1) == INTEGER_CST)
+			op1 = convert_int_cst (promoted_type, op1, SIGNED);
+		  tree temp = build2 (code, promoted_type, op0, op1);
+		  new_op = build1 (NOP_EXPR, old_type, temp);
+		}
+		}
+
+	  /* Create new GIMPLE_DEBUG stmt with the new value (new_op) to
+		 be bound, if new value has been calculated */
+	  if (new_op)
+		{
+		  if (gimple_debug_bind_p (stmt))
+		{
+		  copy = gimple_build_debug_bind
+			(gimple_debug_bind_get_var (stmt),
+			 new_op,
+			 stmt);
+		}
+		  if (gimple_debug_source_bind_p (stmt))
+		{
+		  copy = gimple_build_debug_source_bind
+			(gimple_debug_source_bind_get_var (stmt), new_op,
+			 stmt);
+		}
+
+		  if (copy)
+		{
+		  gsi = gsi_for_stmt (stmt);
+		  gsi_replace (&gsi, copy, false);
+		}
+		}
 	}
+	  break;
 
 	case GIMPLE_ASM:
 	case GIMPLE_CALL:
-- 
1.9.1

>From 1044b1b5ebf8ad696a942207b031e3668ab2a0de Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah 
Date: Thu, 22 Oct 2015 10:53:56 +1100
Subject: [PATCH 3/4] 

Re: [0/7] Type promotion pass and elimination of zext/sext

2015-10-21 Thread Richard Henderson

On 10/21/2015 03:56 AM, Richard Biener wrote:

On Wed, Oct 21, 2015 at 2:45 PM, Richard Biener
 wrote:

On Tue, Oct 20, 2015 at 10:03 PM, Kugan
 wrote:



On 07/09/15 12:53, Kugan wrote:


This a new version of the patch posted in
https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00226.html. I have done
more testing and spitted the patch to make it more easier to review.
There are still couple of issues to be addressed and I am working on them.

1. AARCH64 bootstrap now fails with the commit
94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c is mis-compiled
in stage2 and fwprop.c is failing. It looks to me that there is a latent
issue which gets exposed my patch. I can also reproduce this in x86_64
if I use the same PROMOTE_MODE which is used in aarch64 port. For the
time being, I am using  patch
0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a
workaround. This meeds to be fixed before the patches are ready to be
committed.

2. vector-compare-1.c from c-c++-common/torture fails to assemble with
-O3 -g Error: unaligned opcodes detected in executable segment. It works
fine if I remove the -g. I am looking into it and needs to be fixed as well.


Hi Richard,

Now that stage 1 is going to close, I would like to get these patches
accepted for stage1. I will try my best to address your review comments
ASAP.


Ok, can you make the whole patch series available so I can poke at the
implementation a bit?  Please state the revision it was rebased on
(or point me to a git/svn branch the work resides on).


* Issue 1 above (AARCH64 bootstrap now fails with the commit) is no
longer present as it is fixed in trunk. Patch-6 is no longer needed.

* Issue 2 is also reported as known issue

*  Promotion of PARM_DECLs and RESULT_DECLs in IPA pass and patterns in
match.pd for SEXT_EXPR, I would like to propose them as a follow up
patch once this is accepted.


I thought more about this and don't think it can be made work without a lot of
hassle.  Instead to get rid of the remaining "badly" typed registers in the
function we can key different type requirements on a pass property
(PROP_promoted_regs), thus simply change the expectation of the
types of function parameters / results according to their promotion.


Or maybe we should simply make GIMPLE _always_ adhere to the ABI
details from the start (gimplification).  Note that this does not only involve
PROMOTE_MODE.  Note that for what GIMPLE is concerned I'd only
"lower" passing / returning in registers (whee, and then we have
things like targetm.calls.split_complex_arg ... not to mention passing
GIMPLE memory in registers).

Maybe I'm shooting too far here in the attempt to make GIMPLE closer
to the target (to expose those redundant extensions on GIMPLE) and
we'll end up with a bigger mess than with not doing this?


I'm leary of building this in as early as gimplification, lest we get into 
trouble with splitting out bits of the current function for off-loading.  What 
happens when the cpu and gpu have different promotion rules?



r~


Re: [0/7] Type promotion pass and elimination of zext/sext

2015-10-21 Thread Joseph Myers
On Wed, 21 Oct 2015, Richard Biener wrote:

> Or maybe we should simply make GIMPLE _always_ adhere to the ABI
> details from the start (gimplification).  Note that this does not only involve
> PROMOTE_MODE.  Note that for what GIMPLE is concerned I'd only
> "lower" passing / returning in registers (whee, and then we have
> things like targetm.calls.split_complex_arg ... not to mention passing
> GIMPLE memory in registers).
> 
> Maybe I'm shooting too far here in the attempt to make GIMPLE closer
> to the target (to expose those redundant extensions on GIMPLE) and
> we'll end up with a bigger mess than with not doing this?

I don't know at what point target-specific promotion should appear, but 
right now it's visible before then (front ends use 
targetm.calls.promote_prototypes), which is definitely too early.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [0/7] Type promotion pass and elimination of zext/sext

2015-10-21 Thread Richard Biener
On Wed, Oct 21, 2015 at 2:45 PM, Richard Biener
 wrote:
> On Tue, Oct 20, 2015 at 10:03 PM, Kugan
>  wrote:
>>
>>
>> On 07/09/15 12:53, Kugan wrote:
>>>
>>> This a new version of the patch posted in
>>> https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00226.html. I have done
>>> more testing and spitted the patch to make it more easier to review.
>>> There are still couple of issues to be addressed and I am working on them.
>>>
>>> 1. AARCH64 bootstrap now fails with the commit
>>> 94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c is mis-compiled
>>> in stage2 and fwprop.c is failing. It looks to me that there is a latent
>>> issue which gets exposed my patch. I can also reproduce this in x86_64
>>> if I use the same PROMOTE_MODE which is used in aarch64 port. For the
>>> time being, I am using  patch
>>> 0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a
>>> workaround. This meeds to be fixed before the patches are ready to be
>>> committed.
>>>
>>> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with
>>> -O3 -g Error: unaligned opcodes detected in executable segment. It works
>>> fine if I remove the -g. I am looking into it and needs to be fixed as well.
>>
>> Hi Richard,
>>
>> Now that stage 1 is going to close, I would like to get these patches
>> accepted for stage1. I will try my best to address your review comments
>> ASAP.
>
> Ok, can you make the whole patch series available so I can poke at the
> implementation a bit?  Please state the revision it was rebased on
> (or point me to a git/svn branch the work resides on).
>
>> * Issue 1 above (AARCH64 bootstrap now fails with the commit) is no
>> longer present as it is fixed in trunk. Patch-6 is no longer needed.
>>
>> * Issue 2 is also reported as known issue
>>
>> *  Promotion of PARM_DECLs and RESULT_DECLs in IPA pass and patterns in
>> match.pd for SEXT_EXPR, I would like to propose them as a follow up
>> patch once this is accepted.
>
> I thought more about this and don't think it can be made work without a lot of
> hassle.  Instead to get rid of the remaining "badly" typed registers in the
> function we can key different type requirements on a pass property
> (PROP_promoted_regs), thus simply change the expectation of the
> types of function parameters / results according to their promotion.

Or maybe we should simply make GIMPLE _always_ adhere to the ABI
details from the start (gimplification).  Note that this does not only involve
PROMOTE_MODE.  Note that for what GIMPLE is concerned I'd only
"lower" passing / returning in registers (whee, and then we have
things like targetm.calls.split_complex_arg ... not to mention passing
GIMPLE memory in registers).

Maybe I'm shooting too far here in the attempt to make GIMPLE closer
to the target (to expose those redundant extensions on GIMPLE) and
we'll end up with a bigger mess than with not doing this?

Richard.

> The promotion pass would set PROP_promoted_regs then.
>
> I will look over the patch(es) this week but as said I'd like to play with
> some code examples myself and thus like to have the current patchset
> in a more easily accessible form (and sure to apply to some rev.).
>
> Thanks,
> Richard.
>
>> * I am happy to turn this pass off by default till IPA and match.pd
>> changes are accepted. I can do regular testing to make sure that this
>> pass works properly till we enable it by default.
>>
>>
>> Please let me know what you think,
>>
>> Thanks,
>> Kugan


Re: [0/7] Type promotion pass and elimination of zext/sext

2015-10-21 Thread Richard Biener
On Tue, Oct 20, 2015 at 10:03 PM, Kugan
 wrote:
>
>
> On 07/09/15 12:53, Kugan wrote:
>>
>> This a new version of the patch posted in
>> https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00226.html. I have done
>> more testing and spitted the patch to make it more easier to review.
>> There are still couple of issues to be addressed and I am working on them.
>>
>> 1. AARCH64 bootstrap now fails with the commit
>> 94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c is mis-compiled
>> in stage2 and fwprop.c is failing. It looks to me that there is a latent
>> issue which gets exposed my patch. I can also reproduce this in x86_64
>> if I use the same PROMOTE_MODE which is used in aarch64 port. For the
>> time being, I am using  patch
>> 0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a
>> workaround. This meeds to be fixed before the patches are ready to be
>> committed.
>>
>> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with
>> -O3 -g Error: unaligned opcodes detected in executable segment. It works
>> fine if I remove the -g. I am looking into it and needs to be fixed as well.
>
> Hi Richard,
>
> Now that stage 1 is going to close, I would like to get these patches
> accepted for stage1. I will try my best to address your review comments
> ASAP.

Ok, can you make the whole patch series available so I can poke at the
implementation a bit?  Please state the revision it was rebased on
(or point me to a git/svn branch the work resides on).

> * Issue 1 above (AARCH64 bootstrap now fails with the commit) is no
> longer present as it is fixed in trunk. Patch-6 is no longer needed.
>
> * Issue 2 is also reported as known issue
>
> *  Promotion of PARM_DECLs and RESULT_DECLs in IPA pass and patterns in
> match.pd for SEXT_EXPR, I would like to propose them as a follow up
> patch once this is accepted.

I thought more about this and don't think it can be made work without a lot of
hassle.  Instead to get rid of the remaining "badly" typed registers in the
function we can key different type requirements on a pass property
(PROP_promoted_regs), thus simply change the expectation of the
types of function parameters / results according to their promotion.

The promotion pass would set PROP_promoted_regs then.

I will look over the patch(es) this week but as said I'd like to play with
some code examples myself and thus like to have the current patchset
in a more easily accessible form (and sure to apply to some rev.).

Thanks,
Richard.

> * I am happy to turn this pass off by default till IPA and match.pd
> changes are accepted. I can do regular testing to make sure that this
> pass works properly till we enable it by default.
>
>
> Please let me know what you think,
>
> Thanks,
> Kugan


Re: [0/7] Type promotion pass and elimination of zext/sext

2015-10-20 Thread Kugan


On 07/09/15 12:53, Kugan wrote:
> 
> This a new version of the patch posted in
> https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00226.html. I have done
> more testing and spitted the patch to make it more easier to review.
> There are still couple of issues to be addressed and I am working on them.
> 
> 1. AARCH64 bootstrap now fails with the commit
> 94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c is mis-compiled
> in stage2 and fwprop.c is failing. It looks to me that there is a latent
> issue which gets exposed my patch. I can also reproduce this in x86_64
> if I use the same PROMOTE_MODE which is used in aarch64 port. For the
> time being, I am using  patch
> 0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a
> workaround. This meeds to be fixed before the patches are ready to be
> committed.
> 
> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with
> -O3 -g Error: unaligned opcodes detected in executable segment. It works
> fine if I remove the -g. I am looking into it and needs to be fixed as well.

Hi Richard,

Now that stage 1 is going to close, I would like to get these patches
accepted for stage1. I will try my best to address your review comments
ASAP.

* Issue 1 above (AARCH64 bootstrap now fails with the commit) is no
longer present as it is fixed in trunk. Patch-6 is no longer needed.

* Issue 2 is also reported as known issue

*  Promotion of PARM_DECLs and RESULT_DECLs in IPA pass and patterns in
match.pd for SEXT_EXPR, I would like to propose them as a follow up
patch once this is accepted.

* I am happy to turn this pass off by default till IPA and match.pd
changes are accepted. I can do regular testing to make sure that this
pass works properly till we enable it by default.


Please let me know what you think,

Thanks,
Kugan


RE: [0/7] Type promotion pass and elimination of zext/sext

2015-09-08 Thread Wilco Dijkstra
> Renlin Li wrote:
> Hi Andrew,
> 
> Previously, there is a discussion thread in binutils mailing list:
> 
> https://sourceware.org/ml/binutils/2015-04/msg00032.html
> 
> Nick proposed a way to fix, Richard Henderson hold similar opinion as you.

Both Nick and Richard H seem to think it is an issue with unaligned 
instructions 
rather than an alignment bug in the debug code in the assembler (probably due to
the misleading error message). Although it would work, since we don't have/need
unaligned instructions that proposed patch is not the right fix for this issue.

Anyway aligning the debug tables correctly should be a safe and trivial fix.

Wilco





Re: [0/7] Type promotion pass and elimination of zext/sext

2015-09-08 Thread Renlin Li

Hi Andrew,

Previously, there is a discussion thread in binutils mailing list:

https://sourceware.org/ml/binutils/2015-04/msg00032.html

Nick proposed a way to fix, Richard Henderson hold similar opinion as you.

Regards,
Renlin

On 07/09/15 12:45, pins...@gmail.com wrote:





On Sep 7, 2015, at 7:22 PM, Kugan  wrote:



On 07/09/15 20:46, Wilco Dijkstra wrote:

Kugan wrote:
2. vector-compare-1.c from c-c++-common/torture fails to assemble with
-O3 -g Error: unaligned opcodes detected in executable segment. It works
fine if I remove the -g. I am looking into it and needs to be fixed as well.

This is a known assembler bug I found a while back, Renlin is looking into it.
Basically when debug tables are inserted at the end of a code section the
assembler doesn't align to the alignment required by the debug tables.

This is precisely what seems to be happening. Renlin, could you please
let me know when you have a patch (even if it is a prototype or a hack).


I had noticed that but I read through the assembler code and it sounded very 
much like it was a designed this way and that the compiler was not supposed to 
emit assembly like this and fix up the alignment.

Thanks,
Andrew


Thanks,
Kugan




RE: [0/7] Type promotion pass and elimination of zext/sext

2015-09-07 Thread Wilco Dijkstra
> pins...@gmail.com wrote:
> > On Sep 7, 2015, at 7:22 PM, Kugan  wrote:
> >
> >
> >
> > On 07/09/15 20:46, Wilco Dijkstra wrote:
> >>> Kugan wrote:
> >>> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with
> >>> -O3 -g Error: unaligned opcodes detected in executable segment. It works
> >>> fine if I remove the -g. I am looking into it and needs to be fixed as 
> >>> well.
> >>
> >> This is a known assembler bug I found a while back, Renlin is looking into 
> >> it.
> >> Basically when debug tables are inserted at the end of a code section the
> >> assembler doesn't align to the alignment required by the debug tables.
> >
> > This is precisely what seems to be happening. Renlin, could you please
> > let me know when you have a patch (even if it is a prototype or a hack).
> 
> 
> I had noticed that but I read through the assembler code and it sounded very 
> much like it was
> a designed this way and that the compiler was not supposed to emit assembly 
> like this and fix
> up the alignment.

No, the bug is introduced solely by the assembler - there is no way to avoid it 
as you can't expect
users to align the end of the code section to an unspecified debug alignment 
(which could
potentially vary depending on the generated debug info). The assembler aligns 
unaligned instructions
without a warning, and doesn't require the section size to be a multiple of the 
section alignment,
ie. the design is that the assembler can deal with any alignment.

Wilco




Re: [0/7] Type promotion pass and elimination of zext/sext

2015-09-07 Thread pinskia




> On Sep 7, 2015, at 7:22 PM, Kugan  wrote:
> 
> 
> 
> On 07/09/15 20:46, Wilco Dijkstra wrote:
>>> Kugan wrote:
>>> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with
>>> -O3 -g Error: unaligned opcodes detected in executable segment. It works
>>> fine if I remove the -g. I am looking into it and needs to be fixed as well.
>> 
>> This is a known assembler bug I found a while back, Renlin is looking into 
>> it.
>> Basically when debug tables are inserted at the end of a code section the 
>> assembler doesn't align to the alignment required by the debug tables.
> 
> This is precisely what seems to be happening. Renlin, could you please
> let me know when you have a patch (even if it is a prototype or a hack).


I had noticed that but I read through the assembler code and it sounded very 
much like it was a designed this way and that the compiler was not supposed to 
emit assembly like this and fix up the alignment. 

Thanks,
Andrew

> 
> Thanks,
> Kugan


Re: [0/7] Type promotion pass and elimination of zext/sext

2015-09-07 Thread Kugan


On 07/09/15 20:46, Wilco Dijkstra wrote:
>> Kugan wrote:
>> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with
>> -O3 -g Error: unaligned opcodes detected in executable segment. It works
>> fine if I remove the -g. I am looking into it and needs to be fixed as well.
> 
> This is a known assembler bug I found a while back, Renlin is looking into it.
> Basically when debug tables are inserted at the end of a code section the 
> assembler doesn't align to the alignment required by the debug tables.

This is precisely what seems to be happening. Renlin, could you please
let me know when you have a patch (even if it is a prototype or a hack).

Thanks,
Kugan


RE: [0/7] Type promotion pass and elimination of zext/sext

2015-09-07 Thread Wilco Dijkstra
> Kugan wrote:
> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with
> -O3 -g Error: unaligned opcodes detected in executable segment. It works
> fine if I remove the -g. I am looking into it and needs to be fixed as well.

This is a known assembler bug I found a while back, Renlin is looking into it.
Basically when debug tables are inserted at the end of a code section the 
assembler doesn't align to the alignment required by the debug tables.

Wilco




[0/7] Type promotion pass and elimination of zext/sext

2015-09-06 Thread Kugan

This a new version of the patch posted in
https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00226.html. I have done
more testing and spitted the patch to make it more easier to review.
There are still couple of issues to be addressed and I am working on them.

1. AARCH64 bootstrap now fails with the commit
94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c is mis-compiled
in stage2 and fwprop.c is failing. It looks to me that there is a latent
issue which gets exposed my patch. I can also reproduce this in x86_64
if I use the same PROMOTE_MODE which is used in aarch64 port. For the
time being, I am using  patch
0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a
workaround. This meeds to be fixed before the patches are ready to be
committed.

2. vector-compare-1.c from c-c++-common/torture fails to assemble with
-O3 -g Error: unaligned opcodes detected in executable segment. It works
fine if I remove the -g. I am looking into it and needs to be fixed as well.

In the meantime, I would appreciate if you take some time to review this.

I have bootstrapped on x86_64-linux-gnu, arm-linux-gnu and
aarch-64-linux-gnu (with the workaround) and regression tested.

Thanks,
Kugan