Re: [0/7] Type promotion pass and elimination of zext/sext
On Thu, Dec 10, 2015 at 1:27 AM, Kugan wrote: > Hi Riachard, > > Thanks for the reviews. > > I think since we have some unresolved issues here, it is best to aim for > the next stage1. I however would like any feedback so that I can > continue to improve this. Yeah, sorry I've been distracted lately and am not sure I'll get to the patch before christmas break. > https://gcc.gnu.org/ml/gcc-patches/2015-09/msg01063.html is also related > to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67714. I don't think > there is any agreement on this. Or is there any better place to fix this? I don't know enough in this area to suggest anything. Richard. > Thanks, > Kugan
Re: [0/7] Type promotion pass and elimination of zext/sext
Hi Riachard, Thanks for the reviews. I think since we have some unresolved issues here, it is best to aim for the next stage1. I however would like any feedback so that I can continue to improve this. https://gcc.gnu.org/ml/gcc-patches/2015-09/msg01063.html is also related to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67714. I don't think there is any agreement on this. Or is there any better place to fix this? Thanks, Kugan
Re: [0/7] Type promotion pass and elimination of zext/sext
Hi Richard, Thanks for you comments. I am attaching an updated patch with details below. On 19/11/15 02:06, Richard Biener wrote: > On Wed, Nov 18, 2015 at 3:04 PM, Richard Biener > wrote: >> On Sat, Nov 14, 2015 at 2:15 AM, Kugan >> wrote: >>> >>> Attached is the latest version of the patch. With the patches >>> 0001-Add-new-SEXT_EXPR-tree-code.patch, >>> 0002-Add-type-promotion-pass.patch and >>> 0003-Optimize-ZEXT_EXPR-with-tree-vrp.patch. >>> >>> I did bootstrap on ppc64-linux-gnu, aarch64-linux-gnu and >>> x64-64-linux-gnu and regression testing on ppc64-linux-gnu, >>> aarch64-linux-gnu arm64-linux-gnu and x64-64-linux-gnu. I ran into three >>> issues in ppc64-linux-gnu regression testing. There are some other test >>> cases which needs adjustment for scanning for some patterns that are not >>> valid now. >>> >>> 1. rtl fwprop was going into infinite loop. Works with the following patch: >>> diff --git a/gcc/fwprop.c b/gcc/fwprop.c >>> index 16c7981..9cf4f43 100644 >>> --- a/gcc/fwprop.c >>> +++ b/gcc/fwprop.c >>> @@ -948,6 +948,10 @@ try_fwprop_subst (df_ref use, rtx *loc, rtx >>> new_rtx, rtx_insn *def_insn, >>>int old_cost = 0; >>>bool ok; >>> >>> + /* Value to be substituted is the same, nothing to do. */ >>> + if (rtx_equal_p (*loc, new_rtx)) >>> +return false; >>> + >>>update_df_init (def_insn, insn); >>> >>>/* forward_propagate_subreg may be operating on an instruction with >> >> Which testcase was this on? After re-basing the trunk, I cannot reproduce it anymore. >> >>> 2. gcc.dg/torture/ftrapv-1.c fails >>> This is because we are checking for the SImode trapping. With the >>> promotion of the operation to wider mode, this is i think expected. I >>> think the testcase needs updating. >> >> No, it is not expected. As said earlier you need to refrain from promoting >> integer operations that trap. You can use ! operation_no_trapping_overflow >> for this. >> I have changed this. >>> 3. gcc.dg/sms-3.c fails >>> It fails with -fmodulo-sched-allow-regmoves and OK when I remove it. I >>> am looking into it. >>> >>> >>> I also have the following issues based on the previous review (as posted >>> in the previous patch). Copying again for the review purpose. >>> >>> 1. you still call promote_ssa on both DEFs and USEs and promote_ssa looks at SSA_NAME_DEF_STMT of the passed arg. Please call promote_ssa just on DEFs and fixup_uses on USEs. >>> >>> I am doing this to promote SSA that are defined with GIMPLE_NOP. Is >>> there anyway to iterate over this. I have added gcc_assert to make sure >>> that promote_ssa is called only once. >> >> gcc_assert (!ssa_name_info_map->get_or_insert (def)); >> >> with --disable-checking this will be compiled away so you need to do >> the assert in a separate statement. >> >>> 2. Instead of this you should, in promote_all_stmts, walk over all uses >>> doing what fixup_uses does and then walk over all defs, doing what promote_ssa does. +case GIMPLE_NOP: + { + if (SSA_NAME_VAR (def) == NULL) + { + /* Promote def by fixing its type for anonymous def. */ + TREE_TYPE (def) = promoted_type; + } + else + { + /* Create a promoted copy of parameters. */ + bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun)); I think the uninitialized vars are somewhat tricky and it would be best to create a new uninit anonymous SSA name for them. You can have SSA_NAME_VAR != NULL and def _not_ being a parameter btw. >>> >>> I experimented with get_or_create_default_def. Here we have to have a >>> SSA_NAME_VAR (def) of promoted type. >>> >>> In the attached patch I am doing the following and seems to work. Does >>> this looks OK? >>> >>> + } >>> + else if (TREE_CODE (SSA_NAME_VAR (def)) != PARM_DECL) >>> + { >>> + tree var = copy_node (SSA_NAME_VAR (def)); >>> + TREE_TYPE (var) = promoted_type; >>> + TREE_TYPE (def) = promoted_type; >>> + SET_SSA_NAME_VAR_OR_IDENTIFIER (def, var); >>> + } >> >> I believe this will wreck the SSA default-def map so you should do >> >> set_ssa_default_def (cfun, SSA_NAME_VAR (def), NULL_TREE); >> tree var = create_tmp_reg (promoted_type); >> TREE_TYPE (def) = promoted_type; >> SET_SSA_NAME_VAR_OR_IDENTIFIER (def, var); >> set_ssa_default_def (cfun, var, def); >> >> instead. I have changed this. >> >>> I prefer to promote def as otherwise iterating over the uses and >>> promoting can look complicated (have to look at all the different types >>> of stmts again and do the right thing as It was in the earlier version >>> of this before we move to this approach) >>> >>> 3) you can also transparently handle constants for the cases where promoting is required. At the moment their handling is interwinded with the def >>> promotion cod
Re: [0/7] Type promotion pass and elimination of zext/sext
On Wed, Nov 18, 2015 at 3:04 PM, Richard Biener wrote: > On Sat, Nov 14, 2015 at 2:15 AM, Kugan > wrote: >> >> Attached is the latest version of the patch. With the patches >> 0001-Add-new-SEXT_EXPR-tree-code.patch, >> 0002-Add-type-promotion-pass.patch and >> 0003-Optimize-ZEXT_EXPR-with-tree-vrp.patch. >> >> I did bootstrap on ppc64-linux-gnu, aarch64-linux-gnu and >> x64-64-linux-gnu and regression testing on ppc64-linux-gnu, >> aarch64-linux-gnu arm64-linux-gnu and x64-64-linux-gnu. I ran into three >> issues in ppc64-linux-gnu regression testing. There are some other test >> cases which needs adjustment for scanning for some patterns that are not >> valid now. >> >> 1. rtl fwprop was going into infinite loop. Works with the following patch: >> diff --git a/gcc/fwprop.c b/gcc/fwprop.c >> index 16c7981..9cf4f43 100644 >> --- a/gcc/fwprop.c >> +++ b/gcc/fwprop.c >> @@ -948,6 +948,10 @@ try_fwprop_subst (df_ref use, rtx *loc, rtx >> new_rtx, rtx_insn *def_insn, >>int old_cost = 0; >>bool ok; >> >> + /* Value to be substituted is the same, nothing to do. */ >> + if (rtx_equal_p (*loc, new_rtx)) >> +return false; >> + >>update_df_init (def_insn, insn); >> >>/* forward_propagate_subreg may be operating on an instruction with > > Which testcase was this on? > >> 2. gcc.dg/torture/ftrapv-1.c fails >> This is because we are checking for the SImode trapping. With the >> promotion of the operation to wider mode, this is i think expected. I >> think the testcase needs updating. > > No, it is not expected. As said earlier you need to refrain from promoting > integer operations that trap. You can use ! operation_no_trapping_overflow > for this. > >> 3. gcc.dg/sms-3.c fails >> It fails with -fmodulo-sched-allow-regmoves and OK when I remove it. I >> am looking into it. >> >> >> I also have the following issues based on the previous review (as posted >> in the previous patch). Copying again for the review purpose. >> >> 1. >>> you still call promote_ssa on both DEFs and USEs and promote_ssa looks >>> at SSA_NAME_DEF_STMT of the passed arg. Please call promote_ssa just >>> on DEFs and fixup_uses on USEs. >> >> I am doing this to promote SSA that are defined with GIMPLE_NOP. Is >> there anyway to iterate over this. I have added gcc_assert to make sure >> that promote_ssa is called only once. > > gcc_assert (!ssa_name_info_map->get_or_insert (def)); > > with --disable-checking this will be compiled away so you need to do > the assert in a separate statement. > >> 2. >>> Instead of this you should, in promote_all_stmts, walk over all uses >> doing what >>> fixup_uses does and then walk over all defs, doing what promote_ssa does. >>> >>> +case GIMPLE_NOP: >>> + { >>> + if (SSA_NAME_VAR (def) == NULL) >>> + { >>> + /* Promote def by fixing its type for anonymous def. */ >>> + TREE_TYPE (def) = promoted_type; >>> + } >>> + else >>> + { >>> + /* Create a promoted copy of parameters. */ >>> + bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun)); >>> >>> I think the uninitialized vars are somewhat tricky and it would be best >>> to create a new uninit anonymous SSA name for them. You can >>> have SSA_NAME_VAR != NULL and def _not_ being a parameter >>> btw. >> >> I experimented with get_or_create_default_def. Here we have to have a >> SSA_NAME_VAR (def) of promoted type. >> >> In the attached patch I am doing the following and seems to work. Does >> this looks OK? >> >> + } >> + else if (TREE_CODE (SSA_NAME_VAR (def)) != PARM_DECL) >> + { >> + tree var = copy_node (SSA_NAME_VAR (def)); >> + TREE_TYPE (var) = promoted_type; >> + TREE_TYPE (def) = promoted_type; >> + SET_SSA_NAME_VAR_OR_IDENTIFIER (def, var); >> + } > > I believe this will wreck the SSA default-def map so you should do > > set_ssa_default_def (cfun, SSA_NAME_VAR (def), NULL_TREE); > tree var = create_tmp_reg (promoted_type); > TREE_TYPE (def) = promoted_type; > SET_SSA_NAME_VAR_OR_IDENTIFIER (def, var); > set_ssa_default_def (cfun, var, def); > > instead. > >> I prefer to promote def as otherwise iterating over the uses and >> promoting can look complicated (have to look at all the different types >> of stmts again and do the right thing as It was in the earlier version >> of this before we move to this approach) >> >> 3) >>> you can also transparently handle constants for the cases where promoting >>> is required. At the moment their handling is interwinded with the def >> promotion >>> code. That makes the whole thing hard to follow. >> >> >> I have updated the comments with: >> >> +/* Promote constants in STMT to TYPE. If PROMOTE_COND_EXPR is true, >> + promote only the constants in conditions part of the COND_EXPR. >> + >> + We promote the constants when the associated operands are promoted. >> + This usually means that we promote the co
Re: [0/7] Type promotion pass and elimination of zext/sext
On Sat, Nov 14, 2015 at 2:15 AM, Kugan wrote: > > Attached is the latest version of the patch. With the patches > 0001-Add-new-SEXT_EXPR-tree-code.patch, > 0002-Add-type-promotion-pass.patch and > 0003-Optimize-ZEXT_EXPR-with-tree-vrp.patch. > > I did bootstrap on ppc64-linux-gnu, aarch64-linux-gnu and > x64-64-linux-gnu and regression testing on ppc64-linux-gnu, > aarch64-linux-gnu arm64-linux-gnu and x64-64-linux-gnu. I ran into three > issues in ppc64-linux-gnu regression testing. There are some other test > cases which needs adjustment for scanning for some patterns that are not > valid now. > > 1. rtl fwprop was going into infinite loop. Works with the following patch: > diff --git a/gcc/fwprop.c b/gcc/fwprop.c > index 16c7981..9cf4f43 100644 > --- a/gcc/fwprop.c > +++ b/gcc/fwprop.c > @@ -948,6 +948,10 @@ try_fwprop_subst (df_ref use, rtx *loc, rtx > new_rtx, rtx_insn *def_insn, >int old_cost = 0; >bool ok; > > + /* Value to be substituted is the same, nothing to do. */ > + if (rtx_equal_p (*loc, new_rtx)) > +return false; > + >update_df_init (def_insn, insn); > >/* forward_propagate_subreg may be operating on an instruction with Which testcase was this on? > 2. gcc.dg/torture/ftrapv-1.c fails > This is because we are checking for the SImode trapping. With the > promotion of the operation to wider mode, this is i think expected. I > think the testcase needs updating. No, it is not expected. As said earlier you need to refrain from promoting integer operations that trap. You can use ! operation_no_trapping_overflow for this. > 3. gcc.dg/sms-3.c fails > It fails with -fmodulo-sched-allow-regmoves and OK when I remove it. I > am looking into it. > > > I also have the following issues based on the previous review (as posted > in the previous patch). Copying again for the review purpose. > > 1. >> you still call promote_ssa on both DEFs and USEs and promote_ssa looks >> at SSA_NAME_DEF_STMT of the passed arg. Please call promote_ssa just >> on DEFs and fixup_uses on USEs. > > I am doing this to promote SSA that are defined with GIMPLE_NOP. Is > there anyway to iterate over this. I have added gcc_assert to make sure > that promote_ssa is called only once. gcc_assert (!ssa_name_info_map->get_or_insert (def)); with --disable-checking this will be compiled away so you need to do the assert in a separate statement. > 2. >> Instead of this you should, in promote_all_stmts, walk over all uses > doing what >> fixup_uses does and then walk over all defs, doing what promote_ssa does. >> >> +case GIMPLE_NOP: >> + { >> + if (SSA_NAME_VAR (def) == NULL) >> + { >> + /* Promote def by fixing its type for anonymous def. */ >> + TREE_TYPE (def) = promoted_type; >> + } >> + else >> + { >> + /* Create a promoted copy of parameters. */ >> + bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun)); >> >> I think the uninitialized vars are somewhat tricky and it would be best >> to create a new uninit anonymous SSA name for them. You can >> have SSA_NAME_VAR != NULL and def _not_ being a parameter >> btw. > > I experimented with get_or_create_default_def. Here we have to have a > SSA_NAME_VAR (def) of promoted type. > > In the attached patch I am doing the following and seems to work. Does > this looks OK? > > + } > + else if (TREE_CODE (SSA_NAME_VAR (def)) != PARM_DECL) > + { > + tree var = copy_node (SSA_NAME_VAR (def)); > + TREE_TYPE (var) = promoted_type; > + TREE_TYPE (def) = promoted_type; > + SET_SSA_NAME_VAR_OR_IDENTIFIER (def, var); > + } I believe this will wreck the SSA default-def map so you should do set_ssa_default_def (cfun, SSA_NAME_VAR (def), NULL_TREE); tree var = create_tmp_reg (promoted_type); TREE_TYPE (def) = promoted_type; SET_SSA_NAME_VAR_OR_IDENTIFIER (def, var); set_ssa_default_def (cfun, var, def); instead. > I prefer to promote def as otherwise iterating over the uses and > promoting can look complicated (have to look at all the different types > of stmts again and do the right thing as It was in the earlier version > of this before we move to this approach) > > 3) >> you can also transparently handle constants for the cases where promoting >> is required. At the moment their handling is interwinded with the def > promotion >> code. That makes the whole thing hard to follow. > > > I have updated the comments with: > > +/* Promote constants in STMT to TYPE. If PROMOTE_COND_EXPR is true, > + promote only the constants in conditions part of the COND_EXPR. > + > + We promote the constants when the associated operands are promoted. > + This usually means that we promote the constants when we promote the > + defining stmnts (as part of promote_ssa). However for COND_EXPR, we > + can promote only when we promote the other operand. Therefore, this > + is done during fixup_u
Re: [0/7] Type promotion pass and elimination of zext/sext
Attached is the latest version of the patch. With the patches 0001-Add-new-SEXT_EXPR-tree-code.patch, 0002-Add-type-promotion-pass.patch and 0003-Optimize-ZEXT_EXPR-with-tree-vrp.patch. I did bootstrap on ppc64-linux-gnu, aarch64-linux-gnu and x64-64-linux-gnu and regression testing on ppc64-linux-gnu, aarch64-linux-gnu arm64-linux-gnu and x64-64-linux-gnu. I ran into three issues in ppc64-linux-gnu regression testing. There are some other test cases which needs adjustment for scanning for some patterns that are not valid now. 1. rtl fwprop was going into infinite loop. Works with the following patch: diff --git a/gcc/fwprop.c b/gcc/fwprop.c index 16c7981..9cf4f43 100644 --- a/gcc/fwprop.c +++ b/gcc/fwprop.c @@ -948,6 +948,10 @@ try_fwprop_subst (df_ref use, rtx *loc, rtx new_rtx, rtx_insn *def_insn, int old_cost = 0; bool ok; + /* Value to be substituted is the same, nothing to do. */ + if (rtx_equal_p (*loc, new_rtx)) +return false; + update_df_init (def_insn, insn); /* forward_propagate_subreg may be operating on an instruction with 2. gcc.dg/torture/ftrapv-1.c fails This is because we are checking for the SImode trapping. With the promotion of the operation to wider mode, this is i think expected. I think the testcase needs updating. 3. gcc.dg/sms-3.c fails It fails with -fmodulo-sched-allow-regmoves and OK when I remove it. I am looking into it. I also have the following issues based on the previous review (as posted in the previous patch). Copying again for the review purpose. 1. > you still call promote_ssa on both DEFs and USEs and promote_ssa looks > at SSA_NAME_DEF_STMT of the passed arg. Please call promote_ssa just > on DEFs and fixup_uses on USEs. I am doing this to promote SSA that are defined with GIMPLE_NOP. Is there anyway to iterate over this. I have added gcc_assert to make sure that promote_ssa is called only once. 2. > Instead of this you should, in promote_all_stmts, walk over all uses doing what > fixup_uses does and then walk over all defs, doing what promote_ssa does. > > +case GIMPLE_NOP: > + { > + if (SSA_NAME_VAR (def) == NULL) > + { > + /* Promote def by fixing its type for anonymous def. */ > + TREE_TYPE (def) = promoted_type; > + } > + else > + { > + /* Create a promoted copy of parameters. */ > + bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun)); > > I think the uninitialized vars are somewhat tricky and it would be best > to create a new uninit anonymous SSA name for them. You can > have SSA_NAME_VAR != NULL and def _not_ being a parameter > btw. I experimented with get_or_create_default_def. Here we have to have a SSA_NAME_VAR (def) of promoted type. In the attached patch I am doing the following and seems to work. Does this looks OK? + } + else if (TREE_CODE (SSA_NAME_VAR (def)) != PARM_DECL) + { + tree var = copy_node (SSA_NAME_VAR (def)); + TREE_TYPE (var) = promoted_type; + TREE_TYPE (def) = promoted_type; + SET_SSA_NAME_VAR_OR_IDENTIFIER (def, var); + } I prefer to promote def as otherwise iterating over the uses and promoting can look complicated (have to look at all the different types of stmts again and do the right thing as It was in the earlier version of this before we move to this approach) 3) > you can also transparently handle constants for the cases where promoting > is required. At the moment their handling is interwinded with the def promotion > code. That makes the whole thing hard to follow. I have updated the comments with: +/* Promote constants in STMT to TYPE. If PROMOTE_COND_EXPR is true, + promote only the constants in conditions part of the COND_EXPR. + + We promote the constants when the associated operands are promoted. + This usually means that we promote the constants when we promote the + defining stmnts (as part of promote_ssa). However for COND_EXPR, we + can promote only when we promote the other operand. Therefore, this + is done during fixup_use. */ 4) I am handling gimple_debug separately to avoid any code difference with and without -g option. I have updated the comments for this. 5) I also noticed that tree-ssa-uninit sometimes gives false positives due to the assumptions it makes. Is it OK to move this pass before type promotion? I can do the testings and post a separate patch with this if this OK. 6) I also removed the optimization that prevents some of the redundant truncation/extensions from type promotion pass, as it dosent do much as of now. I can send a proper follow up patch. Is that OK? I also did a simple test with coremark for the latest patch. I compared the code size for coremark for linux-gcc with -Os. Results are as reported by the "size" utility. I know this doesn't mean much but can give some indication. Basewith pass Percentage improvement =
Re: [0/7] Type promotion pass and elimination of zext/sext
Hi Richard, Thanks for the review. >>> >>> The basic "structure" thing still remains. You walk over all uses and >>> defs in all stmts >>> in promote_all_stmts which ends up calling promote_ssa_if_not_promoted on >>> all >>> uses and defs which in turn promotes (the "def") and then fixes up all >>> uses in all stmts. >> >> Done. > > Not exactly. I still see > > /* Promote all the stmts in the basic block. */ > static void > promote_all_stmts (basic_block bb) > { > gimple_stmt_iterator gsi; > ssa_op_iter iter; > tree def, use; > use_operand_p op; > > for (gphi_iterator gpi = gsi_start_phis (bb); >!gsi_end_p (gpi); gsi_next (&gpi)) > { > gphi *phi = gpi.phi (); > def = PHI_RESULT (phi); > promote_ssa (def, &gsi); > > FOR_EACH_PHI_ARG (op, phi, iter, SSA_OP_USE) > { > use = USE_FROM_PTR (op); > if (TREE_CODE (use) == SSA_NAME > && gimple_code (SSA_NAME_DEF_STMT (use)) == GIMPLE_NOP) > promote_ssa (use, &gsi); > fixup_uses (phi, &gsi, op, use); > } > > you still call promote_ssa on both DEFs and USEs and promote_ssa looks > at SSA_NAME_DEF_STMT of the passed arg. Please call promote_ssa just > on DEFs and fixup_uses on USEs. I am doing this to promote SSA that are defined with GIMPLE_NOP. Is there anyway to iterate over this. I have added gcc_assert to make sure that promote_ssa is called only once. > > Any reason you do not promote debug stmts during the DOM walk? > > So for each DEF you record in ssa_name_info > > struct ssa_name_info > { > tree ssa; > tree type; > tree promoted_type; > }; > > (the fields need documenting). Add a tree promoted_def to it which you > can replace any use of the DEF with. In this version of the patch, I am promoting the def in place. If we decide to change, I will add it. If I understand you correctly, this is to be used in iterating over uses and fixing. > > Currently as you call promote_ssa for DEFs and USEs you repeatedly > overwrite the entry in ssa_name_info_map with a new copy. So you > should assert it wasn't already there. > > switch (gimple_code (def_stmt)) > { > case GIMPLE_PHI: > { > > the last { is indented too much it should be indented 2 spaces > relative to the 'case' Done. > > > SSA_NAME_RANGE_INFO (def) = NULL; > > only needed in the case 'def' was promoted itself. Please use > reset_flow_sensitive_info (def). We are promoting all the defs. In some-cases we can however use the value ranges in SSA just by promoting to new type (as the values will be the same). Shall I do it as a follow up. > >>> >>> Instead of this you should, in promote_all_stmts, walk over all uses doing >>> what >>> fixup_uses does and then walk over all defs, doing what promote_ssa does. >>> >>> +case GIMPLE_NOP: >>> + { >>> + if (SSA_NAME_VAR (def) == NULL) >>> + { >>> + /* Promote def by fixing its type for anonymous def. */ >>> + TREE_TYPE (def) = promoted_type; >>> + } >>> + else >>> + { >>> + /* Create a promoted copy of parameters. */ >>> + bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun)); >>> >>> I think the uninitialized vars are somewhat tricky and it would be best >>> to create a new uninit anonymous SSA name for them. You can >>> have SSA_NAME_VAR != NULL and def _not_ being a parameter >>> btw. >> >> Done. I also had to do some changes to in couple of other places to >> reflect this. >> They are: >> --- a/gcc/tree-ssa-reassoc.c >> +++ b/gcc/tree-ssa-reassoc.c >> @@ -302,6 +302,7 @@ phi_rank (gimple *stmt) >> { >>tree arg = gimple_phi_arg_def (stmt, i); >>if (TREE_CODE (arg) == SSA_NAME >> + && SSA_NAME_VAR (arg) >> && !SSA_NAME_IS_DEFAULT_DEF (arg)) >> { >> gimple *def_stmt = SSA_NAME_DEF_STMT (arg); >> @@ -434,7 +435,8 @@ get_rank (tree e) >>if (gimple_code (stmt) == GIMPLE_PHI) >> return phi_rank (stmt); >> >> - if (!is_gimple_assign (stmt)) >> + if (!is_gimple_assign (stmt) >> + && !gimple_nop_p (stmt)) >> return bb_rank[gimple_bb (stmt)->index]; >> >> and >> >> --- a/gcc/tree-ssa.c >> +++ b/gcc/tree-ssa.c >> @@ -752,7 +752,8 @@ verify_use (basic_block bb, basic_block def_bb, >> use_operand_p use_p, >>TREE_VISITED (ssa_name) = 1; >> >>if (gimple_nop_p (SSA_NAME_DEF_STMT (ssa_name)) >> - && SSA_NAME_IS_DEFAULT_DEF (ssa_name)) >> + && (SSA_NAME_IS_DEFAULT_DEF (ssa_name) >> + || SSA_NAME_VAR (ssa_name) == NULL)) >> ; /* Default definitions have empty statements. Nothing to do. */ >>else if (!def_bb) >> { >> >> Does this look OK? > > Hmm, no, this looks bogus. I have removed all the above. > > I think the best thing to do is not promoting default defs at all and instead > promote at the uses. > > /* Create a promoted copy of parameters. */ >
Re: [0/7] Type promotion pass and elimination of zext/sext
On Sun, Nov 8, 2015 at 10:43 AM, Kugan wrote: > > Thanks Richard for the comments. Please find the attached patches which > now passes bootstrap with x86_64-none-linux-gnu, aarch64-linux-gnu and > ppc64-linux-gnu. Regression testing is ongoing. Please find the comments > for your questions/suggestions below. > >> >> I notice >> >> diff --git a/gcc/tree-ssanames.c b/gcc/tree-ssanames.c >> index 82fd4a1..80fcf70 100644 >> --- a/gcc/tree-ssanames.c >> +++ b/gcc/tree-ssanames.c >> @@ -207,7 +207,8 @@ set_range_info (tree name, enum value_range_type >> range_type, >>unsigned int precision = TYPE_PRECISION (TREE_TYPE (name)); >> >>/* Allocate if not available. */ >> - if (ri == NULL) >> + if (ri == NULL >> + || (precision != ri->get_min ().get_precision ())) >> >> and I think you need to clear range info on promoted SSA vars in the >> promotion pass. > > Done. > >> >> The basic "structure" thing still remains. You walk over all uses and >> defs in all stmts >> in promote_all_stmts which ends up calling promote_ssa_if_not_promoted on all >> uses and defs which in turn promotes (the "def") and then fixes up all >> uses in all stmts. > > Done. Not exactly. I still see /* Promote all the stmts in the basic block. */ static void promote_all_stmts (basic_block bb) { gimple_stmt_iterator gsi; ssa_op_iter iter; tree def, use; use_operand_p op; for (gphi_iterator gpi = gsi_start_phis (bb); !gsi_end_p (gpi); gsi_next (&gpi)) { gphi *phi = gpi.phi (); def = PHI_RESULT (phi); promote_ssa (def, &gsi); FOR_EACH_PHI_ARG (op, phi, iter, SSA_OP_USE) { use = USE_FROM_PTR (op); if (TREE_CODE (use) == SSA_NAME && gimple_code (SSA_NAME_DEF_STMT (use)) == GIMPLE_NOP) promote_ssa (use, &gsi); fixup_uses (phi, &gsi, op, use); } you still call promote_ssa on both DEFs and USEs and promote_ssa looks at SSA_NAME_DEF_STMT of the passed arg. Please call promote_ssa just on DEFs and fixup_uses on USEs. Any reason you do not promote debug stmts during the DOM walk? So for each DEF you record in ssa_name_info struct ssa_name_info { tree ssa; tree type; tree promoted_type; }; (the fields need documenting). Add a tree promoted_def to it which you can replace any use of the DEF with. Currently as you call promote_ssa for DEFs and USEs you repeatedly overwrite the entry in ssa_name_info_map with a new copy. So you should assert it wasn't already there. switch (gimple_code (def_stmt)) { case GIMPLE_PHI: { the last { is indented too much it should be indented 2 spaces relative to the 'case' SSA_NAME_RANGE_INFO (def) = NULL; only needed in the case 'def' was promoted itself. Please use reset_flow_sensitive_info (def). >> >> Instead of this you should, in promote_all_stmts, walk over all uses doing >> what >> fixup_uses does and then walk over all defs, doing what promote_ssa does. >> >> +case GIMPLE_NOP: >> + { >> + if (SSA_NAME_VAR (def) == NULL) >> + { >> + /* Promote def by fixing its type for anonymous def. */ >> + TREE_TYPE (def) = promoted_type; >> + } >> + else >> + { >> + /* Create a promoted copy of parameters. */ >> + bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun)); >> >> I think the uninitialized vars are somewhat tricky and it would be best >> to create a new uninit anonymous SSA name for them. You can >> have SSA_NAME_VAR != NULL and def _not_ being a parameter >> btw. > > Done. I also had to do some changes to in couple of other places to > reflect this. > They are: > --- a/gcc/tree-ssa-reassoc.c > +++ b/gcc/tree-ssa-reassoc.c > @@ -302,6 +302,7 @@ phi_rank (gimple *stmt) > { >tree arg = gimple_phi_arg_def (stmt, i); >if (TREE_CODE (arg) == SSA_NAME > + && SSA_NAME_VAR (arg) > && !SSA_NAME_IS_DEFAULT_DEF (arg)) > { > gimple *def_stmt = SSA_NAME_DEF_STMT (arg); > @@ -434,7 +435,8 @@ get_rank (tree e) >if (gimple_code (stmt) == GIMPLE_PHI) > return phi_rank (stmt); > > - if (!is_gimple_assign (stmt)) > + if (!is_gimple_assign (stmt) > + && !gimple_nop_p (stmt)) > return bb_rank[gimple_bb (stmt)->index]; > > and > > --- a/gcc/tree-ssa.c > +++ b/gcc/tree-ssa.c > @@ -752,7 +752,8 @@ verify_use (basic_block bb, basic_block def_bb, > use_operand_p use_p, >TREE_VISITED (ssa_name) = 1; > >if (gimple_nop_p (SSA_NAME_DEF_STMT (ssa_name)) > - && SSA_NAME_IS_DEFAULT_DEF (ssa_name)) > + && (SSA_NAME_IS_DEFAULT_DEF (ssa_name) > + || SSA_NAME_VAR (ssa_name) == NULL)) > ; /* Default definitions have empty statements. Nothing to do. */ >else if (!def_bb) > { > > Does this look OK? Hmm, no, this looks bogus. I think the best thing to do is not promoting default defs at all and instead promote at the uses. /* Cr
Re: [0/7] Type promotion pass and elimination of zext/sext
Thanks Richard for the comments. Please find the attached patches which now passes bootstrap with x86_64-none-linux-gnu, aarch64-linux-gnu and ppc64-linux-gnu. Regression testing is ongoing. Please find the comments for your questions/suggestions below. > > I notice > > diff --git a/gcc/tree-ssanames.c b/gcc/tree-ssanames.c > index 82fd4a1..80fcf70 100644 > --- a/gcc/tree-ssanames.c > +++ b/gcc/tree-ssanames.c > @@ -207,7 +207,8 @@ set_range_info (tree name, enum value_range_type > range_type, >unsigned int precision = TYPE_PRECISION (TREE_TYPE (name)); > >/* Allocate if not available. */ > - if (ri == NULL) > + if (ri == NULL > + || (precision != ri->get_min ().get_precision ())) > > and I think you need to clear range info on promoted SSA vars in the > promotion pass. Done. > > The basic "structure" thing still remains. You walk over all uses and > defs in all stmts > in promote_all_stmts which ends up calling promote_ssa_if_not_promoted on all > uses and defs which in turn promotes (the "def") and then fixes up all > uses in all stmts. Done. > > Instead of this you should, in promote_all_stmts, walk over all uses doing > what > fixup_uses does and then walk over all defs, doing what promote_ssa does. > > +case GIMPLE_NOP: > + { > + if (SSA_NAME_VAR (def) == NULL) > + { > + /* Promote def by fixing its type for anonymous def. */ > + TREE_TYPE (def) = promoted_type; > + } > + else > + { > + /* Create a promoted copy of parameters. */ > + bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun)); > > I think the uninitialized vars are somewhat tricky and it would be best > to create a new uninit anonymous SSA name for them. You can > have SSA_NAME_VAR != NULL and def _not_ being a parameter > btw. Done. I also had to do some changes to in couple of other places to reflect this. They are: --- a/gcc/tree-ssa-reassoc.c +++ b/gcc/tree-ssa-reassoc.c @@ -302,6 +302,7 @@ phi_rank (gimple *stmt) { tree arg = gimple_phi_arg_def (stmt, i); if (TREE_CODE (arg) == SSA_NAME + && SSA_NAME_VAR (arg) && !SSA_NAME_IS_DEFAULT_DEF (arg)) { gimple *def_stmt = SSA_NAME_DEF_STMT (arg); @@ -434,7 +435,8 @@ get_rank (tree e) if (gimple_code (stmt) == GIMPLE_PHI) return phi_rank (stmt); - if (!is_gimple_assign (stmt)) + if (!is_gimple_assign (stmt) + && !gimple_nop_p (stmt)) return bb_rank[gimple_bb (stmt)->index]; and --- a/gcc/tree-ssa.c +++ b/gcc/tree-ssa.c @@ -752,7 +752,8 @@ verify_use (basic_block bb, basic_block def_bb, use_operand_p use_p, TREE_VISITED (ssa_name) = 1; if (gimple_nop_p (SSA_NAME_DEF_STMT (ssa_name)) - && SSA_NAME_IS_DEFAULT_DEF (ssa_name)) + && (SSA_NAME_IS_DEFAULT_DEF (ssa_name) + || SSA_NAME_VAR (ssa_name) == NULL)) ; /* Default definitions have empty statements. Nothing to do. */ else if (!def_bb) { Does this look OK? > > +/* Return true if it is safe to promote the defined SSA_NAME in the STMT > + itself. */ > +static bool > +safe_to_promote_def_p (gimple *stmt) > +{ > + enum tree_code code = gimple_assign_rhs_code (stmt); > + if (gimple_vuse (stmt) != NULL_TREE > + || gimple_vdef (stmt) != NULL_TREE > + || code == ARRAY_REF > + || code == LROTATE_EXPR > + || code == RROTATE_EXPR > + || code == VIEW_CONVERT_EXPR > + || code == BIT_FIELD_REF > + || code == REALPART_EXPR > + || code == IMAGPART_EXPR > + || code == REDUC_MAX_EXPR > + || code == REDUC_PLUS_EXPR > + || code == REDUC_MIN_EXPR) > +return false; > + return true; > > huh, I think this function has an odd name, maybe > can_promote_operation ()? Please > use TREE_CODE_CLASS (code) == tcc_reference for all _REF trees. Done. > > Note that as followup things like the rotates should be "expanded" like > we'd do on RTL (open-coding the thing). And we'd need a way to > specify zero-/sign-extended loads. > > +/* Return true if it is safe to promote the use in the STMT. */ > +static bool > +safe_to_promote_use_p (gimple *stmt) > +{ > + enum tree_code code = gimple_assign_rhs_code (stmt); > + tree lhs = gimple_assign_lhs (stmt); > + > + if (gimple_vuse (stmt) != NULL_TREE > + || gimple_vdef (stmt) != NULL_TREE > > I think the vuse/vdef check is bogus, you can have a use of 'i_3' in say > _2 = a[i_3]; > When I remove this, I see errors in stmts like: unsigned char unsigned int # .MEM_197 = VDEF <.MEM_187> fs_9(D)->fde_encoding = _154; > + || code == VIEW_CONVERT_EXPR > + || code == LROTATE_EXPR > + || code == RROTATE_EXPR > + || code == CONSTRUCTOR > + || code == BIT_FIELD_REF > + || code == COMPLEX_EXPR > + || code == ASM_EXPR > + || VECTOR_TYPE_P (TREE_TYPE (lhs))) > +return false; > + return true; > > ASM_EXPR can never appear here. I think PROMOTE_MODE
Re: [0/7] Type promotion pass and elimination of zext/sext
On Mon, Nov 2, 2015 at 10:17 AM, Kugan wrote: > > > On 29/10/15 02:45, Richard Biener wrote: >> On Tue, Oct 27, 2015 at 1:50 AM, kugan >> wrote: >>> >>> >>> On 23/10/15 01:23, Richard Biener wrote: On Thu, Oct 22, 2015 at 12:50 PM, Kugan wrote: > > > > On 21/10/15 23:45, Richard Biener wrote: >> >> On Tue, Oct 20, 2015 at 10:03 PM, Kugan >> wrote: >>> >>> >>> >>> On 07/09/15 12:53, Kugan wrote: This a new version of the patch posted in https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00226.html. I have done more testing and spitted the patch to make it more easier to review. There are still couple of issues to be addressed and I am working on them. 1. AARCH64 bootstrap now fails with the commit 94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c is mis-compiled in stage2 and fwprop.c is failing. It looks to me that there is a latent issue which gets exposed my patch. I can also reproduce this in x86_64 if I use the same PROMOTE_MODE which is used in aarch64 port. For the time being, I am using patch 0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a workaround. This meeds to be fixed before the patches are ready to be committed. 2. vector-compare-1.c from c-c++-common/torture fails to assemble with -O3 -g Error: unaligned opcodes detected in executable segment. It works fine if I remove the -g. I am looking into it and needs to be fixed as well. >>> >>> >>> Hi Richard, >>> >>> Now that stage 1 is going to close, I would like to get these patches >>> accepted for stage1. I will try my best to address your review comments >>> ASAP. >> >> >> Ok, can you make the whole patch series available so I can poke at the >> implementation a bit? Please state the revision it was rebased on >> (or point me to a git/svn branch the work resides on). >> > > Thanks. Please find the patched rebated against trunk@229156. I have > skipped the test-case readjustment patches. Some quick observations. On x86_64 when building >>> >>> >>> Hi Richard, >>> >>> Thanks for the review. >>> short bar (short y); int foo (short x) { short y = bar (x) + 15; return y; } with -m32 -O2 -mtune=pentiumpro (which ends up promoting HImode regs) I get : _1 = (int) x_10(D); _2 = (_1) sext (16); _11 = bar (_2); _5 = (int) _11; _12 = (unsigned int) _5; _6 = _12 & 65535; _7 = _6 + 15; _13 = (int) _7; _8 = (_13) sext (16); _9 = (_8) sext (16); return _9; which looks fine but the VRP optimization doesn't trigger for the redundant sext (ranges are computed correctly but the 2nd extension is not removed). > > Thanks for the comments. Please fond the attached patches with which I > am now getting > cat .192t.optimized > > ;; Function foo (foo, funcdef_no=0, decl_uid=1406, cgraph_uid=0, > symbol_order=0) > > foo (short int x) > { > signed int _1; > int _2; > signed int _5; > unsigned int _6; > unsigned int _7; > signed int _8; > int _9; > short int _11; > unsigned int _12; > signed int _13; > > : > _1 = (signed int) x_10(D); > _2 = _1; > _11 = bar (_2); > _5 = (signed int) _11; > _12 = (unsigned int) _11; > _6 = _12 & 65535; > _7 = _6 + 15; > _13 = (signed int) _7; > _8 = (_13) sext (16); > _9 = _8; > return _9; > > } > > > There are still some redundancies. The asm difference after RTL > optimizations is > > - addl$15, %eax > + addw$15, %ax > > This also makes me notice trivial match.pd patterns are missing, like for example (simplify (sext (sext@2 @0 @1) @3) (if (tree_int_cst_compare (@1, @3) <= 0) @2 (sext @0 @3))) as VRP doesn't run at -O1 we must rely on those to remove rendudant extensions, otherwise generated code might get worse compared to without the pass(?) >>> >>> >>> Do you think that we should enable this pass only when vrp is enabled. >>> Otherwise, even when we do the simple optimizations you mentioned below, we >>> might not be able to remove all the redundancies. >>> I also notice that the 'short' argument does not get it's sign-extension removed as redundand either even though we have _1 = (int) x_8(D); Found new range for _1: [-32768, 32767] >>> >>> I am looking into it. >>> In the end I suspect that keeping track of the "simple" cases in the promotion pass itself (by keeping a lattice) might be a good idea (after we fix VRP to do its work). In some way whether the ABI guarantees pr
Re: [0/7] Type promotion pass and elimination of zext/sext
On 29/10/15 02:45, Richard Biener wrote: > On Tue, Oct 27, 2015 at 1:50 AM, kugan > wrote: >> >> >> On 23/10/15 01:23, Richard Biener wrote: >>> >>> On Thu, Oct 22, 2015 at 12:50 PM, Kugan >>> wrote: On 21/10/15 23:45, Richard Biener wrote: > > On Tue, Oct 20, 2015 at 10:03 PM, Kugan > wrote: >> >> >> >> On 07/09/15 12:53, Kugan wrote: >>> >>> >>> This a new version of the patch posted in >>> https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00226.html. I have done >>> more testing and spitted the patch to make it more easier to review. >>> There are still couple of issues to be addressed and I am working on >>> them. >>> >>> 1. AARCH64 bootstrap now fails with the commit >>> 94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c is >>> mis-compiled >>> in stage2 and fwprop.c is failing. It looks to me that there is a >>> latent >>> issue which gets exposed my patch. I can also reproduce this in x86_64 >>> if I use the same PROMOTE_MODE which is used in aarch64 port. For the >>> time being, I am using patch >>> 0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a >>> workaround. This meeds to be fixed before the patches are ready to be >>> committed. >>> >>> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with >>> -O3 -g Error: unaligned opcodes detected in executable segment. It >>> works >>> fine if I remove the -g. I am looking into it and needs to be fixed as >>> well. >> >> >> Hi Richard, >> >> Now that stage 1 is going to close, I would like to get these patches >> accepted for stage1. I will try my best to address your review comments >> ASAP. > > > Ok, can you make the whole patch series available so I can poke at the > implementation a bit? Please state the revision it was rebased on > (or point me to a git/svn branch the work resides on). > Thanks. Please find the patched rebated against trunk@229156. I have skipped the test-case readjustment patches. >>> >>> >>> Some quick observations. On x86_64 when building >> >> >> Hi Richard, >> >> Thanks for the review. >> >>> >>> short bar (short y); >>> int foo (short x) >>> { >>>short y = bar (x) + 15; >>>return y; >>> } >>> >>> with -m32 -O2 -mtune=pentiumpro (which ends up promoting HImode regs) >>> I get >>> >>>: >>>_1 = (int) x_10(D); >>>_2 = (_1) sext (16); >>>_11 = bar (_2); >>>_5 = (int) _11; >>>_12 = (unsigned int) _5; >>>_6 = _12 & 65535; >>>_7 = _6 + 15; >>>_13 = (int) _7; >>>_8 = (_13) sext (16); >>>_9 = (_8) sext (16); >>>return _9; >>> >>> which looks fine but the VRP optimization doesn't trigger for the >>> redundant sext >>> (ranges are computed correctly but the 2nd extension is not removed). Thanks for the comments. Please fond the attached patches with which I am now getting cat .192t.optimized ;; Function foo (foo, funcdef_no=0, decl_uid=1406, cgraph_uid=0, symbol_order=0) foo (short int x) { signed int _1; int _2; signed int _5; unsigned int _6; unsigned int _7; signed int _8; int _9; short int _11; unsigned int _12; signed int _13; : _1 = (signed int) x_10(D); _2 = _1; _11 = bar (_2); _5 = (signed int) _11; _12 = (unsigned int) _11; _6 = _12 & 65535; _7 = _6 + 15; _13 = (signed int) _7; _8 = (_13) sext (16); _9 = _8; return _9; } There are still some redundancies. The asm difference after RTL optimizations is - addl$15, %eax + addw$15, %ax >>> >>> This also makes me notice trivial match.pd patterns are missing, like >>> for example >>> >>> (simplify >>> (sext (sext@2 @0 @1) @3) >>> (if (tree_int_cst_compare (@1, @3) <= 0) >>>@2 >>>(sext @0 @3))) >>> >>> as VRP doesn't run at -O1 we must rely on those to remove rendudant >>> extensions, >>> otherwise generated code might get worse compared to without the pass(?) >> >> >> Do you think that we should enable this pass only when vrp is enabled. >> Otherwise, even when we do the simple optimizations you mentioned below, we >> might not be able to remove all the redundancies. >> >>> >>> I also notice that the 'short' argument does not get it's sign-extension >>> removed >>> as redundand either even though we have >>> >>> _1 = (int) x_8(D); >>> Found new range for _1: [-32768, 32767] >>> >> >> I am looking into it. >> >>> In the end I suspect that keeping track of the "simple" cases in the >>> promotion >>> pass itself (by keeping a lattice) might be a good idea (after we fix VRP >>> to do >>> its work). In some way whether the ABI guarantees promoted argument >>> registers might need some other target hook queries. I tried adding it in the attached patch with record_visit_stmt to track whether an ssa would have value overflow or properly zero/sign extended in promoted mode. We can use this to elimi
Re: [0/7] Type promotion pass and elimination of zext/sext
On Tue, Oct 27, 2015 at 1:50 AM, kugan wrote: > > > On 23/10/15 01:23, Richard Biener wrote: >> >> On Thu, Oct 22, 2015 at 12:50 PM, Kugan >> wrote: >>> >>> >>> >>> On 21/10/15 23:45, Richard Biener wrote: On Tue, Oct 20, 2015 at 10:03 PM, Kugan wrote: > > > > On 07/09/15 12:53, Kugan wrote: >> >> >> This a new version of the patch posted in >> https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00226.html. I have done >> more testing and spitted the patch to make it more easier to review. >> There are still couple of issues to be addressed and I am working on >> them. >> >> 1. AARCH64 bootstrap now fails with the commit >> 94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c is >> mis-compiled >> in stage2 and fwprop.c is failing. It looks to me that there is a >> latent >> issue which gets exposed my patch. I can also reproduce this in x86_64 >> if I use the same PROMOTE_MODE which is used in aarch64 port. For the >> time being, I am using patch >> 0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a >> workaround. This meeds to be fixed before the patches are ready to be >> committed. >> >> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with >> -O3 -g Error: unaligned opcodes detected in executable segment. It >> works >> fine if I remove the -g. I am looking into it and needs to be fixed as >> well. > > > Hi Richard, > > Now that stage 1 is going to close, I would like to get these patches > accepted for stage1. I will try my best to address your review comments > ASAP. Ok, can you make the whole patch series available so I can poke at the implementation a bit? Please state the revision it was rebased on (or point me to a git/svn branch the work resides on). >>> >>> Thanks. Please find the patched rebated against trunk@229156. I have >>> skipped the test-case readjustment patches. >> >> >> Some quick observations. On x86_64 when building > > > Hi Richard, > > Thanks for the review. > >> >> short bar (short y); >> int foo (short x) >> { >>short y = bar (x) + 15; >>return y; >> } >> >> with -m32 -O2 -mtune=pentiumpro (which ends up promoting HImode regs) >> I get >> >>: >>_1 = (int) x_10(D); >>_2 = (_1) sext (16); >>_11 = bar (_2); >>_5 = (int) _11; >>_12 = (unsigned int) _5; >>_6 = _12 & 65535; >>_7 = _6 + 15; >>_13 = (int) _7; >>_8 = (_13) sext (16); >>_9 = (_8) sext (16); >>return _9; >> >> which looks fine but the VRP optimization doesn't trigger for the >> redundant sext >> (ranges are computed correctly but the 2nd extension is not removed). >> >> This also makes me notice trivial match.pd patterns are missing, like >> for example >> >> (simplify >> (sext (sext@2 @0 @1) @3) >> (if (tree_int_cst_compare (@1, @3) <= 0) >>@2 >>(sext @0 @3))) >> >> as VRP doesn't run at -O1 we must rely on those to remove rendudant >> extensions, >> otherwise generated code might get worse compared to without the pass(?) > > > Do you think that we should enable this pass only when vrp is enabled. > Otherwise, even when we do the simple optimizations you mentioned below, we > might not be able to remove all the redundancies. > >> >> I also notice that the 'short' argument does not get it's sign-extension >> removed >> as redundand either even though we have >> >> _1 = (int) x_8(D); >> Found new range for _1: [-32768, 32767] >> > > I am looking into it. > >> In the end I suspect that keeping track of the "simple" cases in the >> promotion >> pass itself (by keeping a lattice) might be a good idea (after we fix VRP >> to do >> its work). In some way whether the ABI guarantees promoted argument >> registers might need some other target hook queries. >> >> Now onto the 0002 patch. >> >> +static bool >> +type_precision_ok (tree type) >> +{ >> + return (TYPE_PRECISION (type) == 8 >> + || TYPE_PRECISION (type) == 16 >> + || TYPE_PRECISION (type) == 32); >> +} >> >> that's a weird function to me. You probably want >> TYPE_PRECISION (type) == GET_MODE_PRECISION (TYPE_MODE (type)) >> here? And guard that thing with POINTER_TYPE_P || INTEGRAL_TYPE_P? >> > > I will change this. (I have a patch which I am testing with other changes > you have asked for) > > >> +/* Return the promoted type for TYPE. */ >> +static tree >> +get_promoted_type (tree type) >> +{ >> + tree promoted_type; >> + enum machine_mode mode; >> + int uns; >> + if (POINTER_TYPE_P (type) >> + || !INTEGRAL_TYPE_P (type) >> + || !type_precision_ok (type)) >> +return type; >> + >> + mode = TYPE_MODE (type); >> +#ifdef PROMOTE_MODE >> + uns = TYPE_SIGN (type); >> + PROMOTE_MODE (mode, uns, type); >> +#endif >> + uns = TYPE_SIGN (type); >> + promoted_type = lang_hooks.types.type_for_mode (mode, uns); >> + if (promoted_type >> + &&
Re: [0/7] Type promotion pass and elimination of zext/sext
On 23/10/15 01:23, Richard Biener wrote: On Thu, Oct 22, 2015 at 12:50 PM, Kugan wrote: On 21/10/15 23:45, Richard Biener wrote: On Tue, Oct 20, 2015 at 10:03 PM, Kugan wrote: On 07/09/15 12:53, Kugan wrote: This a new version of the patch posted in https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00226.html. I have done more testing and spitted the patch to make it more easier to review. There are still couple of issues to be addressed and I am working on them. 1. AARCH64 bootstrap now fails with the commit 94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c is mis-compiled in stage2 and fwprop.c is failing. It looks to me that there is a latent issue which gets exposed my patch. I can also reproduce this in x86_64 if I use the same PROMOTE_MODE which is used in aarch64 port. For the time being, I am using patch 0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a workaround. This meeds to be fixed before the patches are ready to be committed. 2. vector-compare-1.c from c-c++-common/torture fails to assemble with -O3 -g Error: unaligned opcodes detected in executable segment. It works fine if I remove the -g. I am looking into it and needs to be fixed as well. Hi Richard, Now that stage 1 is going to close, I would like to get these patches accepted for stage1. I will try my best to address your review comments ASAP. Ok, can you make the whole patch series available so I can poke at the implementation a bit? Please state the revision it was rebased on (or point me to a git/svn branch the work resides on). Thanks. Please find the patched rebated against trunk@229156. I have skipped the test-case readjustment patches. Some quick observations. On x86_64 when building Hi Richard, Thanks for the review. short bar (short y); int foo (short x) { short y = bar (x) + 15; return y; } with -m32 -O2 -mtune=pentiumpro (which ends up promoting HImode regs) I get : _1 = (int) x_10(D); _2 = (_1) sext (16); _11 = bar (_2); _5 = (int) _11; _12 = (unsigned int) _5; _6 = _12 & 65535; _7 = _6 + 15; _13 = (int) _7; _8 = (_13) sext (16); _9 = (_8) sext (16); return _9; which looks fine but the VRP optimization doesn't trigger for the redundant sext (ranges are computed correctly but the 2nd extension is not removed). This also makes me notice trivial match.pd patterns are missing, like for example (simplify (sext (sext@2 @0 @1) @3) (if (tree_int_cst_compare (@1, @3) <= 0) @2 (sext @0 @3))) as VRP doesn't run at -O1 we must rely on those to remove rendudant extensions, otherwise generated code might get worse compared to without the pass(?) Do you think that we should enable this pass only when vrp is enabled. Otherwise, even when we do the simple optimizations you mentioned below, we might not be able to remove all the redundancies. I also notice that the 'short' argument does not get it's sign-extension removed as redundand either even though we have _1 = (int) x_8(D); Found new range for _1: [-32768, 32767] I am looking into it. In the end I suspect that keeping track of the "simple" cases in the promotion pass itself (by keeping a lattice) might be a good idea (after we fix VRP to do its work). In some way whether the ABI guarantees promoted argument registers might need some other target hook queries. Now onto the 0002 patch. +static bool +type_precision_ok (tree type) +{ + return (TYPE_PRECISION (type) == 8 + || TYPE_PRECISION (type) == 16 + || TYPE_PRECISION (type) == 32); +} that's a weird function to me. You probably want TYPE_PRECISION (type) == GET_MODE_PRECISION (TYPE_MODE (type)) here? And guard that thing with POINTER_TYPE_P || INTEGRAL_TYPE_P? I will change this. (I have a patch which I am testing with other changes you have asked for) +/* Return the promoted type for TYPE. */ +static tree +get_promoted_type (tree type) +{ + tree promoted_type; + enum machine_mode mode; + int uns; + if (POINTER_TYPE_P (type) + || !INTEGRAL_TYPE_P (type) + || !type_precision_ok (type)) +return type; + + mode = TYPE_MODE (type); +#ifdef PROMOTE_MODE + uns = TYPE_SIGN (type); + PROMOTE_MODE (mode, uns, type); +#endif + uns = TYPE_SIGN (type); + promoted_type = lang_hooks.types.type_for_mode (mode, uns); + if (promoted_type + && (TYPE_PRECISION (promoted_type) > TYPE_PRECISION (type))) +type = promoted_type; I think what you want to verify is that TYPE_PRECISION (promoted_type) == GET_MODE_PRECISION (mode). And to not even bother with this simply use promoted_type = build_nonstandard_integer_type (GET_MODE_PRECISION (mode), uns); I am changing this too. You use a domwalk but also might create new basic-blocks during it (insert_on_edge_immediate), that's a no-no, commit edge inserts after the domwalk. I am sorry, I dont understand "commit edge inserts after the domwalk" Is there a way to do this in the current implementation? ssa_sets_h
Re: [0/7] Type promotion pass and elimination of zext/sext
On Thu, Oct 22, 2015 at 12:50 PM, Kugan wrote: > > > On 21/10/15 23:45, Richard Biener wrote: >> On Tue, Oct 20, 2015 at 10:03 PM, Kugan >> wrote: >>> >>> >>> On 07/09/15 12:53, Kugan wrote: This a new version of the patch posted in https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00226.html. I have done more testing and spitted the patch to make it more easier to review. There are still couple of issues to be addressed and I am working on them. 1. AARCH64 bootstrap now fails with the commit 94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c is mis-compiled in stage2 and fwprop.c is failing. It looks to me that there is a latent issue which gets exposed my patch. I can also reproduce this in x86_64 if I use the same PROMOTE_MODE which is used in aarch64 port. For the time being, I am using patch 0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a workaround. This meeds to be fixed before the patches are ready to be committed. 2. vector-compare-1.c from c-c++-common/torture fails to assemble with -O3 -g Error: unaligned opcodes detected in executable segment. It works fine if I remove the -g. I am looking into it and needs to be fixed as well. >>> >>> Hi Richard, >>> >>> Now that stage 1 is going to close, I would like to get these patches >>> accepted for stage1. I will try my best to address your review comments >>> ASAP. >> >> Ok, can you make the whole patch series available so I can poke at the >> implementation a bit? Please state the revision it was rebased on >> (or point me to a git/svn branch the work resides on). >> > > Thanks. Please find the patched rebated against trunk@229156. I have > skipped the test-case readjustment patches. Some quick observations. On x86_64 when building short bar (short y); int foo (short x) { short y = bar (x) + 15; return y; } with -m32 -O2 -mtune=pentiumpro (which ends up promoting HImode regs) I get : _1 = (int) x_10(D); _2 = (_1) sext (16); _11 = bar (_2); _5 = (int) _11; _12 = (unsigned int) _5; _6 = _12 & 65535; _7 = _6 + 15; _13 = (int) _7; _8 = (_13) sext (16); _9 = (_8) sext (16); return _9; which looks fine but the VRP optimization doesn't trigger for the redundant sext (ranges are computed correctly but the 2nd extension is not removed). This also makes me notice trivial match.pd patterns are missing, like for example (simplify (sext (sext@2 @0 @1) @3) (if (tree_int_cst_compare (@1, @3) <= 0) @2 (sext @0 @3))) as VRP doesn't run at -O1 we must rely on those to remove rendudant extensions, otherwise generated code might get worse compared to without the pass(?) I also notice that the 'short' argument does not get it's sign-extension removed as redundand either even though we have _1 = (int) x_8(D); Found new range for _1: [-32768, 32767] In the end I suspect that keeping track of the "simple" cases in the promotion pass itself (by keeping a lattice) might be a good idea (after we fix VRP to do its work). In some way whether the ABI guarantees promoted argument registers might need some other target hook queries. Now onto the 0002 patch. +static bool +type_precision_ok (tree type) +{ + return (TYPE_PRECISION (type) == 8 + || TYPE_PRECISION (type) == 16 + || TYPE_PRECISION (type) == 32); +} that's a weird function to me. You probably want TYPE_PRECISION (type) == GET_MODE_PRECISION (TYPE_MODE (type)) here? And guard that thing with POINTER_TYPE_P || INTEGRAL_TYPE_P? +/* Return the promoted type for TYPE. */ +static tree +get_promoted_type (tree type) +{ + tree promoted_type; + enum machine_mode mode; + int uns; + if (POINTER_TYPE_P (type) + || !INTEGRAL_TYPE_P (type) + || !type_precision_ok (type)) +return type; + + mode = TYPE_MODE (type); +#ifdef PROMOTE_MODE + uns = TYPE_SIGN (type); + PROMOTE_MODE (mode, uns, type); +#endif + uns = TYPE_SIGN (type); + promoted_type = lang_hooks.types.type_for_mode (mode, uns); + if (promoted_type + && (TYPE_PRECISION (promoted_type) > TYPE_PRECISION (type))) +type = promoted_type; I think what you want to verify is that TYPE_PRECISION (promoted_type) == GET_MODE_PRECISION (mode). And to not even bother with this simply use promoted_type = build_nonstandard_integer_type (GET_MODE_PRECISION (mode), uns); You use a domwalk but also might create new basic-blocks during it (insert_on_edge_immediate), that's a no-no, commit edge inserts after the domwalk. ssa_sets_higher_bits_bitmap looks unused and we generally don't free dominance info, so please don't do that. I fired off a bootstrap on ppc64-linux which fails building stage1 libgcc with /abuild/rguenther/obj/./gcc/xgcc -B/abuild/rguenther/obj/./gcc/ -B/usr/local/powerpc64-unknown-linux-gnu/bin/ -B/usr/local/powerpc64-unknown-linux-gnu/lib/ -isystem /usr/local/powerpc64-unknown-linux-gnu/include -isystem /usr/local/powerpc64-unknown-linux-gnu/sy
Re: [0/7] Type promotion pass and elimination of zext/sext
On Wed, Oct 21, 2015 at 7:55 PM, Richard Henderson wrote: > On 10/21/2015 03:56 AM, Richard Biener wrote: >> >> On Wed, Oct 21, 2015 at 2:45 PM, Richard Biener >> wrote: >>> >>> On Tue, Oct 20, 2015 at 10:03 PM, Kugan >>> wrote: On 07/09/15 12:53, Kugan wrote: > > > This a new version of the patch posted in > https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00226.html. I have done > more testing and spitted the patch to make it more easier to review. > There are still couple of issues to be addressed and I am working on > them. > > 1. AARCH64 bootstrap now fails with the commit > 94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c is > mis-compiled > in stage2 and fwprop.c is failing. It looks to me that there is a > latent > issue which gets exposed my patch. I can also reproduce this in x86_64 > if I use the same PROMOTE_MODE which is used in aarch64 port. For the > time being, I am using patch > 0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a > workaround. This meeds to be fixed before the patches are ready to be > committed. > > 2. vector-compare-1.c from c-c++-common/torture fails to assemble with > -O3 -g Error: unaligned opcodes detected in executable segment. It > works > fine if I remove the -g. I am looking into it and needs to be fixed as > well. Hi Richard, Now that stage 1 is going to close, I would like to get these patches accepted for stage1. I will try my best to address your review comments ASAP. >>> >>> >>> Ok, can you make the whole patch series available so I can poke at the >>> implementation a bit? Please state the revision it was rebased on >>> (or point me to a git/svn branch the work resides on). >>> * Issue 1 above (AARCH64 bootstrap now fails with the commit) is no longer present as it is fixed in trunk. Patch-6 is no longer needed. * Issue 2 is also reported as known issue * Promotion of PARM_DECLs and RESULT_DECLs in IPA pass and patterns in match.pd for SEXT_EXPR, I would like to propose them as a follow up patch once this is accepted. >>> >>> >>> I thought more about this and don't think it can be made work without a >>> lot of >>> hassle. Instead to get rid of the remaining "badly" typed registers in >>> the >>> function we can key different type requirements on a pass property >>> (PROP_promoted_regs), thus simply change the expectation of the >>> types of function parameters / results according to their promotion. >> >> >> Or maybe we should simply make GIMPLE _always_ adhere to the ABI >> details from the start (gimplification). Note that this does not only >> involve >> PROMOTE_MODE. Note that for what GIMPLE is concerned I'd only >> "lower" passing / returning in registers (whee, and then we have >> things like targetm.calls.split_complex_arg ... not to mention passing >> GIMPLE memory in registers). >> >> Maybe I'm shooting too far here in the attempt to make GIMPLE closer >> to the target (to expose those redundant extensions on GIMPLE) and >> we'll end up with a bigger mess than with not doing this? > > > I'm leary of building this in as early as gimplification, lest we get into > trouble with splitting out bits of the current function for off-loading. > What happens when the cpu and gpu have different promotion rules? Ah, of course. I tend to forget these issues. Richard. > > r~
Re: [0/7] Type promotion pass and elimination of zext/sext
On 21/10/15 23:45, Richard Biener wrote: > On Tue, Oct 20, 2015 at 10:03 PM, Kugan > wrote: >> >> >> On 07/09/15 12:53, Kugan wrote: >>> >>> This a new version of the patch posted in >>> https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00226.html. I have done >>> more testing and spitted the patch to make it more easier to review. >>> There are still couple of issues to be addressed and I am working on them. >>> >>> 1. AARCH64 bootstrap now fails with the commit >>> 94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c is mis-compiled >>> in stage2 and fwprop.c is failing. It looks to me that there is a latent >>> issue which gets exposed my patch. I can also reproduce this in x86_64 >>> if I use the same PROMOTE_MODE which is used in aarch64 port. For the >>> time being, I am using patch >>> 0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a >>> workaround. This meeds to be fixed before the patches are ready to be >>> committed. >>> >>> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with >>> -O3 -g Error: unaligned opcodes detected in executable segment. It works >>> fine if I remove the -g. I am looking into it and needs to be fixed as well. >> >> Hi Richard, >> >> Now that stage 1 is going to close, I would like to get these patches >> accepted for stage1. I will try my best to address your review comments >> ASAP. > > Ok, can you make the whole patch series available so I can poke at the > implementation a bit? Please state the revision it was rebased on > (or point me to a git/svn branch the work resides on). > Thanks. Please find the patched rebated against trunk@229156. I have skipped the test-case readjustment patches. Thanks, Kugan >From 2dc1cccfc59ae6967928b52396227b52a50803d9 Mon Sep 17 00:00:00 2001 From: Kugan Vivekanandarajah Date: Thu, 22 Oct 2015 10:54:31 +1100 Subject: [PATCH 4/4] debug stmt in widen mode --- gcc/gimple-ssa-type-promote.c | 82 +-- 1 file changed, 79 insertions(+), 3 deletions(-) diff --git a/gcc/gimple-ssa-type-promote.c b/gcc/gimple-ssa-type-promote.c index e62a7c6..c0b6aa1 100644 --- a/gcc/gimple-ssa-type-promote.c +++ b/gcc/gimple-ssa-type-promote.c @@ -589,10 +589,86 @@ fixup_uses (tree use, tree promoted_type, tree old_type) { case GIMPLE_DEBUG: { - gsi = gsi_for_stmt (stmt); - gsi_remove (&gsi, true); - break; + /* Change the GIMPLE_DEBUG stmt such that the value bound is + computed in promoted_type and then converted to required + type. */ + tree op, new_op = NULL_TREE; + gdebug *copy = NULL, *gs = as_a (stmt); + enum tree_code code; + + /* Get the value that is bound in debug stmt. */ + switch (gs->subcode) + { + case GIMPLE_DEBUG_BIND: + op = gimple_debug_bind_get_value (gs); + break; + case GIMPLE_DEBUG_SOURCE_BIND: + op = gimple_debug_source_bind_get_value (gs); + break; + default: + gcc_unreachable (); + } + + code = TREE_CODE (op); + /* Convert the value computed in promoted_type to + old_type. */ + if (code == SSA_NAME && use == op) + new_op = build1 (NOP_EXPR, old_type, use); + else if (TREE_CODE_CLASS (TREE_CODE (op)) == tcc_unary + && code != NOP_EXPR) + { + tree op0 = TREE_OPERAND (op, 0); + if (op0 == use) + { + tree temp = build1 (code, promoted_type, op0); + new_op = build1 (NOP_EXPR, old_type, temp); + } + } + else if (TREE_CODE_CLASS (TREE_CODE (op)) == tcc_binary + /* Skip codes that are rejected in safe_to_promote_use_p. */ + && code != LROTATE_EXPR + && code != RROTATE_EXPR + && code != COMPLEX_EXPR) + { + tree op0 = TREE_OPERAND (op, 0); + tree op1 = TREE_OPERAND (op, 1); + if (op0 == use || op1 == use) + { + if (TREE_CODE (op0) == INTEGER_CST) + op0 = convert_int_cst (promoted_type, op0, SIGNED); + if (TREE_CODE (op1) == INTEGER_CST) + op1 = convert_int_cst (promoted_type, op1, SIGNED); + tree temp = build2 (code, promoted_type, op0, op1); + new_op = build1 (NOP_EXPR, old_type, temp); + } + } + + /* Create new GIMPLE_DEBUG stmt with the new value (new_op) to + be bound, if new value has been calculated */ + if (new_op) + { + if (gimple_debug_bind_p (stmt)) + { + copy = gimple_build_debug_bind + (gimple_debug_bind_get_var (stmt), + new_op, + stmt); + } + if (gimple_debug_source_bind_p (stmt)) + { + copy = gimple_build_debug_source_bind + (gimple_debug_source_bind_get_var (stmt), new_op, + stmt); + } + + if (copy) + { + gsi = gsi_for_stmt (stmt); + gsi_replace (&gsi, copy, false); + } + } } + break; case GIMPLE_ASM: case GIMPLE_CALL: -- 1.9.1 >From 1044b1b5ebf8ad696a942207b031e3668ab2a0de Mon Sep 17 00:00:00 2001 From: Kugan Vivekanandarajah Date: Thu, 22 Oct 2015 10:53:56 +1100 Subject: [PATCH 3/4]
Re: [0/7] Type promotion pass and elimination of zext/sext
On 10/21/2015 03:56 AM, Richard Biener wrote: On Wed, Oct 21, 2015 at 2:45 PM, Richard Biener wrote: On Tue, Oct 20, 2015 at 10:03 PM, Kugan wrote: On 07/09/15 12:53, Kugan wrote: This a new version of the patch posted in https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00226.html. I have done more testing and spitted the patch to make it more easier to review. There are still couple of issues to be addressed and I am working on them. 1. AARCH64 bootstrap now fails with the commit 94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c is mis-compiled in stage2 and fwprop.c is failing. It looks to me that there is a latent issue which gets exposed my patch. I can also reproduce this in x86_64 if I use the same PROMOTE_MODE which is used in aarch64 port. For the time being, I am using patch 0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a workaround. This meeds to be fixed before the patches are ready to be committed. 2. vector-compare-1.c from c-c++-common/torture fails to assemble with -O3 -g Error: unaligned opcodes detected in executable segment. It works fine if I remove the -g. I am looking into it and needs to be fixed as well. Hi Richard, Now that stage 1 is going to close, I would like to get these patches accepted for stage1. I will try my best to address your review comments ASAP. Ok, can you make the whole patch series available so I can poke at the implementation a bit? Please state the revision it was rebased on (or point me to a git/svn branch the work resides on). * Issue 1 above (AARCH64 bootstrap now fails with the commit) is no longer present as it is fixed in trunk. Patch-6 is no longer needed. * Issue 2 is also reported as known issue * Promotion of PARM_DECLs and RESULT_DECLs in IPA pass and patterns in match.pd for SEXT_EXPR, I would like to propose them as a follow up patch once this is accepted. I thought more about this and don't think it can be made work without a lot of hassle. Instead to get rid of the remaining "badly" typed registers in the function we can key different type requirements on a pass property (PROP_promoted_regs), thus simply change the expectation of the types of function parameters / results according to their promotion. Or maybe we should simply make GIMPLE _always_ adhere to the ABI details from the start (gimplification). Note that this does not only involve PROMOTE_MODE. Note that for what GIMPLE is concerned I'd only "lower" passing / returning in registers (whee, and then we have things like targetm.calls.split_complex_arg ... not to mention passing GIMPLE memory in registers). Maybe I'm shooting too far here in the attempt to make GIMPLE closer to the target (to expose those redundant extensions on GIMPLE) and we'll end up with a bigger mess than with not doing this? I'm leary of building this in as early as gimplification, lest we get into trouble with splitting out bits of the current function for off-loading. What happens when the cpu and gpu have different promotion rules? r~
Re: [0/7] Type promotion pass and elimination of zext/sext
On Wed, 21 Oct 2015, Richard Biener wrote: > Or maybe we should simply make GIMPLE _always_ adhere to the ABI > details from the start (gimplification). Note that this does not only involve > PROMOTE_MODE. Note that for what GIMPLE is concerned I'd only > "lower" passing / returning in registers (whee, and then we have > things like targetm.calls.split_complex_arg ... not to mention passing > GIMPLE memory in registers). > > Maybe I'm shooting too far here in the attempt to make GIMPLE closer > to the target (to expose those redundant extensions on GIMPLE) and > we'll end up with a bigger mess than with not doing this? I don't know at what point target-specific promotion should appear, but right now it's visible before then (front ends use targetm.calls.promote_prototypes), which is definitely too early. -- Joseph S. Myers jos...@codesourcery.com
Re: [0/7] Type promotion pass and elimination of zext/sext
On Wed, Oct 21, 2015 at 2:45 PM, Richard Biener wrote: > On Tue, Oct 20, 2015 at 10:03 PM, Kugan > wrote: >> >> >> On 07/09/15 12:53, Kugan wrote: >>> >>> This a new version of the patch posted in >>> https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00226.html. I have done >>> more testing and spitted the patch to make it more easier to review. >>> There are still couple of issues to be addressed and I am working on them. >>> >>> 1. AARCH64 bootstrap now fails with the commit >>> 94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c is mis-compiled >>> in stage2 and fwprop.c is failing. It looks to me that there is a latent >>> issue which gets exposed my patch. I can also reproduce this in x86_64 >>> if I use the same PROMOTE_MODE which is used in aarch64 port. For the >>> time being, I am using patch >>> 0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a >>> workaround. This meeds to be fixed before the patches are ready to be >>> committed. >>> >>> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with >>> -O3 -g Error: unaligned opcodes detected in executable segment. It works >>> fine if I remove the -g. I am looking into it and needs to be fixed as well. >> >> Hi Richard, >> >> Now that stage 1 is going to close, I would like to get these patches >> accepted for stage1. I will try my best to address your review comments >> ASAP. > > Ok, can you make the whole patch series available so I can poke at the > implementation a bit? Please state the revision it was rebased on > (or point me to a git/svn branch the work resides on). > >> * Issue 1 above (AARCH64 bootstrap now fails with the commit) is no >> longer present as it is fixed in trunk. Patch-6 is no longer needed. >> >> * Issue 2 is also reported as known issue >> >> * Promotion of PARM_DECLs and RESULT_DECLs in IPA pass and patterns in >> match.pd for SEXT_EXPR, I would like to propose them as a follow up >> patch once this is accepted. > > I thought more about this and don't think it can be made work without a lot of > hassle. Instead to get rid of the remaining "badly" typed registers in the > function we can key different type requirements on a pass property > (PROP_promoted_regs), thus simply change the expectation of the > types of function parameters / results according to their promotion. Or maybe we should simply make GIMPLE _always_ adhere to the ABI details from the start (gimplification). Note that this does not only involve PROMOTE_MODE. Note that for what GIMPLE is concerned I'd only "lower" passing / returning in registers (whee, and then we have things like targetm.calls.split_complex_arg ... not to mention passing GIMPLE memory in registers). Maybe I'm shooting too far here in the attempt to make GIMPLE closer to the target (to expose those redundant extensions on GIMPLE) and we'll end up with a bigger mess than with not doing this? Richard. > The promotion pass would set PROP_promoted_regs then. > > I will look over the patch(es) this week but as said I'd like to play with > some code examples myself and thus like to have the current patchset > in a more easily accessible form (and sure to apply to some rev.). > > Thanks, > Richard. > >> * I am happy to turn this pass off by default till IPA and match.pd >> changes are accepted. I can do regular testing to make sure that this >> pass works properly till we enable it by default. >> >> >> Please let me know what you think, >> >> Thanks, >> Kugan
Re: [0/7] Type promotion pass and elimination of zext/sext
On Tue, Oct 20, 2015 at 10:03 PM, Kugan wrote: > > > On 07/09/15 12:53, Kugan wrote: >> >> This a new version of the patch posted in >> https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00226.html. I have done >> more testing and spitted the patch to make it more easier to review. >> There are still couple of issues to be addressed and I am working on them. >> >> 1. AARCH64 bootstrap now fails with the commit >> 94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c is mis-compiled >> in stage2 and fwprop.c is failing. It looks to me that there is a latent >> issue which gets exposed my patch. I can also reproduce this in x86_64 >> if I use the same PROMOTE_MODE which is used in aarch64 port. For the >> time being, I am using patch >> 0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a >> workaround. This meeds to be fixed before the patches are ready to be >> committed. >> >> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with >> -O3 -g Error: unaligned opcodes detected in executable segment. It works >> fine if I remove the -g. I am looking into it and needs to be fixed as well. > > Hi Richard, > > Now that stage 1 is going to close, I would like to get these patches > accepted for stage1. I will try my best to address your review comments > ASAP. Ok, can you make the whole patch series available so I can poke at the implementation a bit? Please state the revision it was rebased on (or point me to a git/svn branch the work resides on). > * Issue 1 above (AARCH64 bootstrap now fails with the commit) is no > longer present as it is fixed in trunk. Patch-6 is no longer needed. > > * Issue 2 is also reported as known issue > > * Promotion of PARM_DECLs and RESULT_DECLs in IPA pass and patterns in > match.pd for SEXT_EXPR, I would like to propose them as a follow up > patch once this is accepted. I thought more about this and don't think it can be made work without a lot of hassle. Instead to get rid of the remaining "badly" typed registers in the function we can key different type requirements on a pass property (PROP_promoted_regs), thus simply change the expectation of the types of function parameters / results according to their promotion. The promotion pass would set PROP_promoted_regs then. I will look over the patch(es) this week but as said I'd like to play with some code examples myself and thus like to have the current patchset in a more easily accessible form (and sure to apply to some rev.). Thanks, Richard. > * I am happy to turn this pass off by default till IPA and match.pd > changes are accepted. I can do regular testing to make sure that this > pass works properly till we enable it by default. > > > Please let me know what you think, > > Thanks, > Kugan
Re: [0/7] Type promotion pass and elimination of zext/sext
On 07/09/15 12:53, Kugan wrote: > > This a new version of the patch posted in > https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00226.html. I have done > more testing and spitted the patch to make it more easier to review. > There are still couple of issues to be addressed and I am working on them. > > 1. AARCH64 bootstrap now fails with the commit > 94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c is mis-compiled > in stage2 and fwprop.c is failing. It looks to me that there is a latent > issue which gets exposed my patch. I can also reproduce this in x86_64 > if I use the same PROMOTE_MODE which is used in aarch64 port. For the > time being, I am using patch > 0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a > workaround. This meeds to be fixed before the patches are ready to be > committed. > > 2. vector-compare-1.c from c-c++-common/torture fails to assemble with > -O3 -g Error: unaligned opcodes detected in executable segment. It works > fine if I remove the -g. I am looking into it and needs to be fixed as well. Hi Richard, Now that stage 1 is going to close, I would like to get these patches accepted for stage1. I will try my best to address your review comments ASAP. * Issue 1 above (AARCH64 bootstrap now fails with the commit) is no longer present as it is fixed in trunk. Patch-6 is no longer needed. * Issue 2 is also reported as known issue * Promotion of PARM_DECLs and RESULT_DECLs in IPA pass and patterns in match.pd for SEXT_EXPR, I would like to propose them as a follow up patch once this is accepted. * I am happy to turn this pass off by default till IPA and match.pd changes are accepted. I can do regular testing to make sure that this pass works properly till we enable it by default. Please let me know what you think, Thanks, Kugan
RE: [0/7] Type promotion pass and elimination of zext/sext
> Renlin Li wrote: > Hi Andrew, > > Previously, there is a discussion thread in binutils mailing list: > > https://sourceware.org/ml/binutils/2015-04/msg00032.html > > Nick proposed a way to fix, Richard Henderson hold similar opinion as you. Both Nick and Richard H seem to think it is an issue with unaligned instructions rather than an alignment bug in the debug code in the assembler (probably due to the misleading error message). Although it would work, since we don't have/need unaligned instructions that proposed patch is not the right fix for this issue. Anyway aligning the debug tables correctly should be a safe and trivial fix. Wilco
Re: [0/7] Type promotion pass and elimination of zext/sext
Hi Andrew, Previously, there is a discussion thread in binutils mailing list: https://sourceware.org/ml/binutils/2015-04/msg00032.html Nick proposed a way to fix, Richard Henderson hold similar opinion as you. Regards, Renlin On 07/09/15 12:45, pins...@gmail.com wrote: On Sep 7, 2015, at 7:22 PM, Kugan wrote: On 07/09/15 20:46, Wilco Dijkstra wrote: Kugan wrote: 2. vector-compare-1.c from c-c++-common/torture fails to assemble with -O3 -g Error: unaligned opcodes detected in executable segment. It works fine if I remove the -g. I am looking into it and needs to be fixed as well. This is a known assembler bug I found a while back, Renlin is looking into it. Basically when debug tables are inserted at the end of a code section the assembler doesn't align to the alignment required by the debug tables. This is precisely what seems to be happening. Renlin, could you please let me know when you have a patch (even if it is a prototype or a hack). I had noticed that but I read through the assembler code and it sounded very much like it was a designed this way and that the compiler was not supposed to emit assembly like this and fix up the alignment. Thanks, Andrew Thanks, Kugan
RE: [0/7] Type promotion pass and elimination of zext/sext
> pins...@gmail.com wrote: > > On Sep 7, 2015, at 7:22 PM, Kugan wrote: > > > > > > > > On 07/09/15 20:46, Wilco Dijkstra wrote: > >>> Kugan wrote: > >>> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with > >>> -O3 -g Error: unaligned opcodes detected in executable segment. It works > >>> fine if I remove the -g. I am looking into it and needs to be fixed as > >>> well. > >> > >> This is a known assembler bug I found a while back, Renlin is looking into > >> it. > >> Basically when debug tables are inserted at the end of a code section the > >> assembler doesn't align to the alignment required by the debug tables. > > > > This is precisely what seems to be happening. Renlin, could you please > > let me know when you have a patch (even if it is a prototype or a hack). > > > I had noticed that but I read through the assembler code and it sounded very > much like it was > a designed this way and that the compiler was not supposed to emit assembly > like this and fix > up the alignment. No, the bug is introduced solely by the assembler - there is no way to avoid it as you can't expect users to align the end of the code section to an unspecified debug alignment (which could potentially vary depending on the generated debug info). The assembler aligns unaligned instructions without a warning, and doesn't require the section size to be a multiple of the section alignment, ie. the design is that the assembler can deal with any alignment. Wilco
Re: [0/7] Type promotion pass and elimination of zext/sext
> On Sep 7, 2015, at 7:22 PM, Kugan wrote: > > > > On 07/09/15 20:46, Wilco Dijkstra wrote: >>> Kugan wrote: >>> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with >>> -O3 -g Error: unaligned opcodes detected in executable segment. It works >>> fine if I remove the -g. I am looking into it and needs to be fixed as well. >> >> This is a known assembler bug I found a while back, Renlin is looking into >> it. >> Basically when debug tables are inserted at the end of a code section the >> assembler doesn't align to the alignment required by the debug tables. > > This is precisely what seems to be happening. Renlin, could you please > let me know when you have a patch (even if it is a prototype or a hack). I had noticed that but I read through the assembler code and it sounded very much like it was a designed this way and that the compiler was not supposed to emit assembly like this and fix up the alignment. Thanks, Andrew > > Thanks, > Kugan
Re: [0/7] Type promotion pass and elimination of zext/sext
On 07/09/15 20:46, Wilco Dijkstra wrote: >> Kugan wrote: >> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with >> -O3 -g Error: unaligned opcodes detected in executable segment. It works >> fine if I remove the -g. I am looking into it and needs to be fixed as well. > > This is a known assembler bug I found a while back, Renlin is looking into it. > Basically when debug tables are inserted at the end of a code section the > assembler doesn't align to the alignment required by the debug tables. This is precisely what seems to be happening. Renlin, could you please let me know when you have a patch (even if it is a prototype or a hack). Thanks, Kugan
RE: [0/7] Type promotion pass and elimination of zext/sext
> Kugan wrote: > 2. vector-compare-1.c from c-c++-common/torture fails to assemble with > -O3 -g Error: unaligned opcodes detected in executable segment. It works > fine if I remove the -g. I am looking into it and needs to be fixed as well. This is a known assembler bug I found a while back, Renlin is looking into it. Basically when debug tables are inserted at the end of a code section the assembler doesn't align to the alignment required by the debug tables. Wilco
[0/7] Type promotion pass and elimination of zext/sext
This a new version of the patch posted in https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00226.html. I have done more testing and spitted the patch to make it more easier to review. There are still couple of issues to be addressed and I am working on them. 1. AARCH64 bootstrap now fails with the commit 94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c is mis-compiled in stage2 and fwprop.c is failing. It looks to me that there is a latent issue which gets exposed my patch. I can also reproduce this in x86_64 if I use the same PROMOTE_MODE which is used in aarch64 port. For the time being, I am using patch 0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a workaround. This meeds to be fixed before the patches are ready to be committed. 2. vector-compare-1.c from c-c++-common/torture fails to assemble with -O3 -g Error: unaligned opcodes detected in executable segment. It works fine if I remove the -g. I am looking into it and needs to be fixed as well. In the meantime, I would appreciate if you take some time to review this. I have bootstrapped on x86_64-linux-gnu, arm-linux-gnu and aarch-64-linux-gnu (with the workaround) and regression tested. Thanks, Kugan