Re: [PATCH 2/2] Enable elimination of zext/sext
On 08/09/14 19:48, Richard Biener wrote: On Sun, Sep 7, 2014 at 11:50 AM, Kugan kugan.vivekanandara...@linaro.org wrote: On 05/09/14 19:50, Richard Biener wrote: Well - the best way would be to expose the target specifics to GIMPLE at some point in the optimization pipeline. My guess would be that it's appropriate after loop optimizations (but maybe before induction variable optimization). That is, have a pass that applies register promotion to all SSA names in the function, inserting appropriate truncations and extensions. That way you'd never see (set (subreg...) on RTL. The VRP and DOM passes running after that pass would then be able to aggressively optimize redundant truncations and extensions. Effects on debug information are to be considered. You can change the type of SSA names in-place but you don't want to do that for user DECLs (and we can't have the SSA name type and its DECL type differ - and not sure if we might want to lift that restriction). Thanks. I will try to implement this. I still would like to keep the VRP based approach as there are some cases that I think can only be done with range info. For example: short foo(unsigned char c) { c = c (unsigned char)0x0F; if( c 7 ) return((short)(c - 5)); else return(( short )c); } So, how about adding and setting the overflow/wrap around flag to range_info. We now set static_flag for VR_RANG/VR_ANTI_RANGE. If we go back to the max + 1, min - 1 for VR_ANTI_RANGE, we can use this static_flag to encode overflow/wrap around. Will that be something acceptable? You mean tracking in the VRP lattice whether a value wrapped around (or was assumed not to due to undefined behavior)? I'm not sure this is easy to do correctly (VRP is large). Note that I don't think we'll lose the testcase you quoted if the promotion pass runs before VRP2. We'd have as input to VRP2 sth like (assuming promote mode would promote to SImode) SImode tem_2 = (unsigned int)c_1(D); tem_3 = tem_3 0xF; if (tem_3 7) { tem_4 = tem_3 - 5; short _5 = (short)_4; tem_5 = (unsigned int)_5; return tem_5; } else { short _6 = (short)_3; return _6; } VRP should be able to remove the (unsigned int)(short) sign-extension of tem_4. note that both incoming registers and return registers are interesting. For simplicity I suggest to not promote them on GIMPLE. What you'd lose in VRP2 is the smaller value-ranges you'd get from (undefined) wrapping. You could recover the undefinedness by looking at SSA names recorded value-range and transfering that in the promotion pass (but I'm not sure if you want to open the can of latent signed overflow bugs in programs even more for PROMOTE_MODE targets...) Thanks. In the meantime I would like to revert the patch which is enabling zero/sign extension. I have bootstrapped it in x86_64 and regression testing is ongoing. Is this OK ? Thanks, Kugan gcc/ChangeLog: 2014-09-09 Kugan Vivekanandarajah kug...@linaro.org Revert r213751: * calls.c (precompute_arguments): Check promoted_for_signed_and_unsigned_p and set the promoted mode. (promoted_for_signed_and_unsigned_p): New function. (expand_expr_real_1): Check promoted_for_signed_and_unsigned_p and set the promoted mode. * expr.h (promoted_for_signed_and_unsigned_p): New function definition. * cfgexpand.c (expand_gimple_stmt_1): Call emit_move_insn if SUBREG is promoted with SRP_SIGNED_AND_UNSIGNED. diff --git a/gcc/calls.c b/gcc/calls.c index 03ed9c8..345331f 100644 --- a/gcc/calls.c +++ b/gcc/calls.c @@ -1486,10 +1486,7 @@ precompute_arguments (int num_actuals, struct arg_data *args) args[i].initial_value = gen_lowpart_SUBREG (mode, args[i].value); SUBREG_PROMOTED_VAR_P (args[i].initial_value) = 1; - if (promoted_for_signed_and_unsigned_p (args[i].tree_value, mode)) - SUBREG_PROMOTED_SET (args[i].initial_value, SRP_SIGNED_AND_UNSIGNED); - else - SUBREG_PROMOTED_SET (args[i].initial_value, args[i].unsignedp); + SUBREG_PROMOTED_SET (args[i].initial_value, args[i].unsignedp); } } } diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c index db76897..8916305 100644 --- a/gcc/cfgexpand.c +++ b/gcc/cfgexpand.c @@ -3325,13 +3325,7 @@ expand_gimple_stmt_1 (gimple stmt) GET_MODE (target), temp, unsignedp); } - if ((SUBREG_PROMOTED_GET (target) == SRP_SIGNED_AND_UNSIGNED) -(GET_CODE (temp) == SUBREG) -(GET_MODE (target) == GET_MODE (temp)) -(GET_MODE (SUBREG_REG (target)) == GET_MODE (SUBREG_REG (temp - emit_move_insn (SUBREG_REG (target), SUBREG_REG (temp)); - else -
Re: [PATCH 2/2] Enable elimination of zext/sext
On Tue, Sep 9, 2014 at 12:06 PM, Kugan kugan.vivekanandara...@linaro.org wrote: On 08/09/14 19:48, Richard Biener wrote: On Sun, Sep 7, 2014 at 11:50 AM, Kugan kugan.vivekanandara...@linaro.org wrote: On 05/09/14 19:50, Richard Biener wrote: Well - the best way would be to expose the target specifics to GIMPLE at some point in the optimization pipeline. My guess would be that it's appropriate after loop optimizations (but maybe before induction variable optimization). That is, have a pass that applies register promotion to all SSA names in the function, inserting appropriate truncations and extensions. That way you'd never see (set (subreg...) on RTL. The VRP and DOM passes running after that pass would then be able to aggressively optimize redundant truncations and extensions. Effects on debug information are to be considered. You can change the type of SSA names in-place but you don't want to do that for user DECLs (and we can't have the SSA name type and its DECL type differ - and not sure if we might want to lift that restriction). Thanks. I will try to implement this. I still would like to keep the VRP based approach as there are some cases that I think can only be done with range info. For example: short foo(unsigned char c) { c = c (unsigned char)0x0F; if( c 7 ) return((short)(c - 5)); else return(( short )c); } So, how about adding and setting the overflow/wrap around flag to range_info. We now set static_flag for VR_RANG/VR_ANTI_RANGE. If we go back to the max + 1, min - 1 for VR_ANTI_RANGE, we can use this static_flag to encode overflow/wrap around. Will that be something acceptable? You mean tracking in the VRP lattice whether a value wrapped around (or was assumed not to due to undefined behavior)? I'm not sure this is easy to do correctly (VRP is large). Note that I don't think we'll lose the testcase you quoted if the promotion pass runs before VRP2. We'd have as input to VRP2 sth like (assuming promote mode would promote to SImode) SImode tem_2 = (unsigned int)c_1(D); tem_3 = tem_3 0xF; if (tem_3 7) { tem_4 = tem_3 - 5; short _5 = (short)_4; tem_5 = (unsigned int)_5; return tem_5; } else { short _6 = (short)_3; return _6; } VRP should be able to remove the (unsigned int)(short) sign-extension of tem_4. note that both incoming registers and return registers are interesting. For simplicity I suggest to not promote them on GIMPLE. What you'd lose in VRP2 is the smaller value-ranges you'd get from (undefined) wrapping. You could recover the undefinedness by looking at SSA names recorded value-range and transfering that in the promotion pass (but I'm not sure if you want to open the can of latent signed overflow bugs in programs even more for PROMOTE_MODE targets...) Thanks. In the meantime I would like to revert the patch which is enabling zero/sign extension. I have bootstrapped it in x86_64 and regression testing is ongoing. Is this OK ? Ok. Thanks, Richard. Thanks, Kugan gcc/ChangeLog: 2014-09-09 Kugan Vivekanandarajah kug...@linaro.org Revert r213751: * calls.c (precompute_arguments): Check promoted_for_signed_and_unsigned_p and set the promoted mode. (promoted_for_signed_and_unsigned_p): New function. (expand_expr_real_1): Check promoted_for_signed_and_unsigned_p and set the promoted mode. * expr.h (promoted_for_signed_and_unsigned_p): New function definition. * cfgexpand.c (expand_gimple_stmt_1): Call emit_move_insn if SUBREG is promoted with SRP_SIGNED_AND_UNSIGNED.
Re: [PATCH 2/2] Enable elimination of zext/sext
On Sun, Sep 7, 2014 at 11:50 AM, Kugan kugan.vivekanandara...@linaro.org wrote: On 05/09/14 19:50, Richard Biener wrote: Well - the best way would be to expose the target specifics to GIMPLE at some point in the optimization pipeline. My guess would be that it's appropriate after loop optimizations (but maybe before induction variable optimization). That is, have a pass that applies register promotion to all SSA names in the function, inserting appropriate truncations and extensions. That way you'd never see (set (subreg...) on RTL. The VRP and DOM passes running after that pass would then be able to aggressively optimize redundant truncations and extensions. Effects on debug information are to be considered. You can change the type of SSA names in-place but you don't want to do that for user DECLs (and we can't have the SSA name type and its DECL type differ - and not sure if we might want to lift that restriction). Thanks. I will try to implement this. I still would like to keep the VRP based approach as there are some cases that I think can only be done with range info. For example: short foo(unsigned char c) { c = c (unsigned char)0x0F; if( c 7 ) return((short)(c - 5)); else return(( short )c); } So, how about adding and setting the overflow/wrap around flag to range_info. We now set static_flag for VR_RANG/VR_ANTI_RANGE. If we go back to the max + 1, min - 1 for VR_ANTI_RANGE, we can use this static_flag to encode overflow/wrap around. Will that be something acceptable? You mean tracking in the VRP lattice whether a value wrapped around (or was assumed not to due to undefined behavior)? I'm not sure this is easy to do correctly (VRP is large). Note that I don't think we'll lose the testcase you quoted if the promotion pass runs before VRP2. We'd have as input to VRP2 sth like (assuming promote mode would promote to SImode) SImode tem_2 = (unsigned int)c_1(D); tem_3 = tem_3 0xF; if (tem_3 7) { tem_4 = tem_3 - 5; short _5 = (short)_4; tem_5 = (unsigned int)_5; return tem_5; } else { short _6 = (short)_3; return _6; } VRP should be able to remove the (unsigned int)(short) sign-extension of tem_4. note that both incoming registers and return registers are interesting. For simplicity I suggest to not promote them on GIMPLE. What you'd lose in VRP2 is the smaller value-ranges you'd get from (undefined) wrapping. You could recover the undefinedness by looking at SSA names recorded value-range and transfering that in the promotion pass (but I'm not sure if you want to open the can of latent signed overflow bugs in programs even more for PROMOTE_MODE targets...) Richard. Thanks again, Kugan
Re: [PATCH 2/2] Enable elimination of zext/sext
On 05/09/14 19:50, Richard Biener wrote: Well - the best way would be to expose the target specifics to GIMPLE at some point in the optimization pipeline. My guess would be that it's appropriate after loop optimizations (but maybe before induction variable optimization). That is, have a pass that applies register promotion to all SSA names in the function, inserting appropriate truncations and extensions. That way you'd never see (set (subreg...) on RTL. The VRP and DOM passes running after that pass would then be able to aggressively optimize redundant truncations and extensions. Effects on debug information are to be considered. You can change the type of SSA names in-place but you don't want to do that for user DECLs (and we can't have the SSA name type and its DECL type differ - and not sure if we might want to lift that restriction). Thanks. I will try to implement this. I still would like to keep the VRP based approach as there are some cases that I think can only be done with range info. For example: short foo(unsigned char c) { c = c (unsigned char)0x0F; if( c 7 ) return((short)(c - 5)); else return(( short )c); } So, how about adding and setting the overflow/wrap around flag to range_info. We now set static_flag for VR_RANG/VR_ANTI_RANGE. If we go back to the max + 1, min - 1 for VR_ANTI_RANGE, we can use this static_flag to encode overflow/wrap around. Will that be something acceptable? Thanks again, Kugan
Re: [PATCH 2/2] Enable elimination of zext/sext
On Fri, Sep 5, 2014 at 3:33 AM, Kugan kugan.vivekanandara...@linaro.org wrote: Here is an attempt to do the value range computation in promoted_mode's type when it is overflowing. Bootstrapped on x86-84. Err - I think you misunderstood this as a suggestion to do this ;) value-ranges should be computed according to the type not according to the (promoted) mode. Otherwise we will miss optimization opportunities. Oops, sorry, I had my doubts about making trees aware of back-end stuff. Coming back to the original problem, what would be the best approach to handle this. Looking at the VRP pass, it seems to me that only MULT_EXPR and LSHIFT_EXPR are truncating values this way. All other operation are setting it to type_min, type_max. Can we rely on this ? No, that doesn't sound like a good thing to do. Is this error not showing up in PROMOTED_MODE = word_mode (and the mode precision of register from which we SUBREG is = word_mode precision) is just a coincidence. Can we rely on this? Sounds like a coincidence to me. Is there anyway we can fix this? Well - the best way would be to expose the target specifics to GIMPLE at some point in the optimization pipeline. My guess would be that it's appropriate after loop optimizations (but maybe before induction variable optimization). That is, have a pass that applies register promotion to all SSA names in the function, inserting appropriate truncations and extensions. That way you'd never see (set (subreg...) on RTL. The VRP and DOM passes running after that pass would then be able to aggressively optimize redundant truncations and extensions. Effects on debug information are to be considered. You can change the type of SSA names in-place but you don't want to do that for user DECLs (and we can't have the SSA name type and its DECL type differ - and not sure if we might want to lift that restriction). Richard. Thanks again, Kugan
Re: [PATCH 2/2] Enable elimination of zext/sext
On Thu, Sep 4, 2014 at 5:41 AM, Kugan kugan.vivekanandara...@linaro.org wrote: I added this part of the code (in cfgexpand.c) to handle binary/unary/.. gimple operations and used the LHS value range to infer the assigned value range. I will revert this part of the code as this is wrong. I dont think checking promoted_mode for temp will be necessary here as convert_move will handle it correctly if promoted_mode is set for temp. Thus, I will reimplement setting promoted_mode to temp (in expand_expr_real_2) based on the gimple statement content on RHS. i.e. by looking at the RHS operands and its value ranges and by calculating the resulting value range. Does this sound OK to you. No, this sounds backward again and won't work because those operands again could be just truncated - thus you can't rely on their value-range. What you would need is VRP computing value-ranges in the promoted mode from the start (and it doesn't do that). Hi Richard, Here is an attempt to do the value range computation in promoted_mode's type when it is overflowing. Bootstrapped on x86-84. Err - I think you misunderstood this as a suggestion to do this ;) value-ranges should be computed according to the type not according to the (promoted) mode. Otherwise we will miss optimization opportunities. Richard. Based on your feedback, I will do more testing on this. Thanks for your time, Kugan gcc/ChangeLog: 2014-09-04 Kugan Vivekanandarajah kug...@linaro.org * tree-ssa-ccp.c (ccp_finalize): Adjust the nonzero_bits precision to the type. (evaluate_stmt): Likewise. * tree-ssanames.c (set_range_info): Adjust if the precision of stored value range is different. * tree-vrp.c (normalize_int_cst_precision): New function. (set_value_range): Add assert to check precision. (set_and_canonicalize_value_range): Call normalize_int_cst_precision on min and max. (promoted_type): New function. (promote_unary_vr): Likewise. (promote_binary_vr): Likewise. (extract_range_from_binary_expr_1): Adjust type to match value range. Store value ranges in promoted type if they overflow. (extract_range_from_unary_expr_1): Likewise. (adjust_range_with_scev): Call normalize_int_cst_precision on min and max. (vrp_visit_assignment_or_call): Likewise. (simplify_bit_ops_using_ranges): Adjust the value range precision. (test_for_singularity): Likewise. (simplify_stmt_for_jump_threading): Likewise. (extract_range_from_assert): Likewise.
Re: [PATCH 2/2] Enable elimination of zext/sext
Here is an attempt to do the value range computation in promoted_mode's type when it is overflowing. Bootstrapped on x86-84. Err - I think you misunderstood this as a suggestion to do this ;) value-ranges should be computed according to the type not according to the (promoted) mode. Otherwise we will miss optimization opportunities. Oops, sorry, I had my doubts about making trees aware of back-end stuff. Coming back to the original problem, what would be the best approach to handle this. Looking at the VRP pass, it seems to me that only MULT_EXPR and LSHIFT_EXPR are truncating values this way. All other operation are setting it to type_min, type_max. Can we rely on this ? Is this error not showing up in PROMOTED_MODE = word_mode (and the mode precision of register from which we SUBREG is = word_mode precision) is just a coincidence. Can we rely on this? Is there anyway we can fix this? Thanks again, Kugan
Re: [PATCH 2/2] Enable elimination of zext/sext
I added this part of the code (in cfgexpand.c) to handle binary/unary/.. gimple operations and used the LHS value range to infer the assigned value range. I will revert this part of the code as this is wrong. I dont think checking promoted_mode for temp will be necessary here as convert_move will handle it correctly if promoted_mode is set for temp. Thus, I will reimplement setting promoted_mode to temp (in expand_expr_real_2) based on the gimple statement content on RHS. i.e. by looking at the RHS operands and its value ranges and by calculating the resulting value range. Does this sound OK to you. No, this sounds backward again and won't work because those operands again could be just truncated - thus you can't rely on their value-range. What you would need is VRP computing value-ranges in the promoted mode from the start (and it doesn't do that). Hi Richard, Here is an attempt to do the value range computation in promoted_mode's type when it is overflowing. Bootstrapped on x86-84. Based on your feedback, I will do more testing on this. Thanks for your time, Kugan gcc/ChangeLog: 2014-09-04 Kugan Vivekanandarajah kug...@linaro.org * tree-ssa-ccp.c (ccp_finalize): Adjust the nonzero_bits precision to the type. (evaluate_stmt): Likewise. * tree-ssanames.c (set_range_info): Adjust if the precision of stored value range is different. * tree-vrp.c (normalize_int_cst_precision): New function. (set_value_range): Add assert to check precision. (set_and_canonicalize_value_range): Call normalize_int_cst_precision on min and max. (promoted_type): New function. (promote_unary_vr): Likewise. (promote_binary_vr): Likewise. (extract_range_from_binary_expr_1): Adjust type to match value range. Store value ranges in promoted type if they overflow. (extract_range_from_unary_expr_1): Likewise. (adjust_range_with_scev): Call normalize_int_cst_precision on min and max. (vrp_visit_assignment_or_call): Likewise. (simplify_bit_ops_using_ranges): Adjust the value range precision. (test_for_singularity): Likewise. (simplify_stmt_for_jump_threading): Likewise. (extract_range_from_assert): Likewise. diff --git a/gcc/tree-ssa-ccp.c b/gcc/tree-ssa-ccp.c index a90f708..1733073 100644 --- a/gcc/tree-ssa-ccp.c +++ b/gcc/tree-ssa-ccp.c @@ -916,7 +916,11 @@ ccp_finalize (void) unsigned int precision = TYPE_PRECISION (TREE_TYPE (val-value)); wide_int nonzero_bits = wide_int::from (val-mask, precision, UNSIGNED) | val-value; - nonzero_bits = get_nonzero_bits (name); + wide_int nonzero_bits_name = get_nonzero_bits (name); + if (precision != nonzero_bits_name.get_precision ()) + nonzero_bits = wi::shwi (*nonzero_bits.get_val (), +nonzero_bits_name.get_precision ()); + nonzero_bits = nonzero_bits_name; set_nonzero_bits (name, nonzero_bits); } } @@ -1852,6 +1856,8 @@ evaluate_stmt (gimple stmt) { tree lhs = gimple_get_lhs (stmt); wide_int nonzero_bits = get_nonzero_bits (lhs); + if (TYPE_PRECISION (TREE_TYPE (lhs)) != nonzero_bits.get_precision ()) + nonzero_bits = wide_int_to_tree (TREE_TYPE (lhs), nonzero_bits); if (nonzero_bits != -1) { if (!is_constant) diff --git a/gcc/tree-ssanames.c b/gcc/tree-ssanames.c index 3af80a0..459c669 100644 --- a/gcc/tree-ssanames.c +++ b/gcc/tree-ssanames.c @@ -192,7 +192,7 @@ set_range_info (tree name, enum value_range_type range_type, gcc_assert (!POINTER_TYPE_P (TREE_TYPE (name))); gcc_assert (range_type == VR_RANGE || range_type == VR_ANTI_RANGE); range_info_def *ri = SSA_NAME_RANGE_INFO (name); - unsigned int precision = TYPE_PRECISION (TREE_TYPE (name)); + unsigned int precision = min.get_precision (); /* Allocate if not available. */ if (ri == NULL) @@ -204,6 +204,15 @@ set_range_info (tree name, enum value_range_type range_type, SSA_NAME_RANGE_INFO (name) = ri; ri-set_nonzero_bits (wi::shwi (-1, precision)); } + else if (ri-get_min ().get_precision () != precision) +{ + size_t size = (sizeof (range_info_def) ++ trailing_wide_ints 3::extra_size (precision)); + ri = static_castrange_info_def * (ggc_realloc (ri, size)); + ri-ints.set_precision (precision); + SSA_NAME_RANGE_INFO (name) = ri; + ri-set_nonzero_bits (wi::shwi (-1, precision)); +} /* Record the range type. */ if (SSA_NAME_RANGE_TYPE (name) != range_type) diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c index d16fd8a..772676a 100644 --- a/gcc/tree-vrp.c +++ b/gcc/tree-vrp.c @@ -61,6 +61,7 @@ along with GCC; see the file COPYING3. If not see #include optabs.h #include tree-ssa-threadedge.h #include wide-int.h +#include
Re: [PATCH 2/2] Enable elimination of zext/sext
On Wed, Aug 27, 2014 at 12:25:14PM +0200, Uros Bizjak wrote: Something like following (untested) patch that also fixes the testcase perhaps? -- cut here-- Index: cfgexpand.c === --- cfgexpand.c (revision 214445) +++ cfgexpand.c (working copy) @@ -3322,6 +3322,7 @@ expand_gimple_stmt_1 (gimple stmt) if ((SUBREG_PROMOTED_GET (target) == SRP_SIGNED_AND_UNSIGNED) (GET_CODE (temp) == SUBREG) +SUBREG_PROMOTED_VAR_P (temp) (GET_MODE (target) == GET_MODE (temp)) (GET_MODE (SUBREG_REG (target)) == GET_MODE (SUBREG_REG (temp Looks like a wrong order of the predicates in any case, first you should check if it is a SUBREG, then SUBREG_PROMOTED_VAR_P and only then SUBREG_PROMOTED_GET. Also, the extra ()s around single line conditions are unnecessary. emit_move_insn (SUBREG_REG (target), SUBREG_REG (temp)); -- cut here Uros. Jakub
Re: [PATCH 2/2] Enable elimination of zext/sext
On Mon, Sep 1, 2014 at 10:47 AM, Jakub Jelinek ja...@redhat.com wrote: On Wed, Aug 27, 2014 at 12:25:14PM +0200, Uros Bizjak wrote: Something like following (untested) patch that also fixes the testcase perhaps? -- cut here-- Index: cfgexpand.c === --- cfgexpand.c (revision 214445) +++ cfgexpand.c (working copy) @@ -3322,6 +3322,7 @@ expand_gimple_stmt_1 (gimple stmt) if ((SUBREG_PROMOTED_GET (target) == SRP_SIGNED_AND_UNSIGNED) (GET_CODE (temp) == SUBREG) +SUBREG_PROMOTED_VAR_P (temp) (GET_MODE (target) == GET_MODE (temp)) (GET_MODE (SUBREG_REG (target)) == GET_MODE (SUBREG_REG (temp Looks like a wrong order of the predicates in any case, first you should check if it is a SUBREG, then SUBREG_PROMOTED_VAR_P and only then SUBREG_PROMOTED_GET. Also, the extra ()s around single line conditions are unnecessary. This comment applies to the original code, not the patched line, I guess. Uros.
Re: [PATCH 2/2] Enable elimination of zext/sext
On Thu, 28 Aug 2014, Kugan wrote: On 27/08/14 23:02, Kugan wrote: On 27/08/14 20:01, Uros Bizjak wrote: Hello! 2014-08-07 Kugan Vivekanandarajah kug...@linaro.org * calls.c (precompute_arguments): Check promoted_for_signed_and_unsigned_p and set the promoted mode. (promoted_for_signed_and_unsigned_p): New function. (expand_expr_real_1): Check promoted_for_signed_and_unsigned_p and set the promoted mode. * expr.h (promoted_for_signed_and_unsigned_p): New function definition. * cfgexpand.c (expand_gimple_stmt_1): Call emit_move_insn if SUBREG is promoted with SRP_SIGNED_AND_UNSIGNED. This patch regresses: Running target unix FAIL: libgomp.fortran/simd7.f90 -O2 execution test FAIL: libgomp.fortran/simd7.f90 -Os execution test [snip] When compiling this code, we have: lhs = _63 target = (subreg/s/v/u:SI (reg:DI 145 [ D.1694 ]) 0) temp = (subreg:SI (reg:DI 540) 0) So, the code assumes that it is possible to copy (reg:DI 540) directly to (reg:DI 154). However, this is not the case, since we still have garbage in the top 32bits. Reverting the part above fixes the runtime failure, since (insn 599) is now: (insn 599 598 0 (set (reg:DI 145 [ D.1694 ]) (zero_extend:DI (subreg:SI (reg:DI 540) 0))) -1 (nil)) It looks to me that we have also to check the temp with SUBREG_PROMOTED_*. Sorry for the breakage. I am looking into this now and I can reproduce it on qemu-alpha. I have noticed the following VRP data which is used in deciding this erroneous removal. It seems suspicious to me. _343: [2147483652, 2147483715] _344: [8, 134] _345: [8, 134] _343 = ivtmp.179_52 + 2147483645; _344 = _343 * 2; _345 = (integer(kind=4)) _344; Error comes from the third statement. In tree-vrp.c, in extract_range_from_binary_expr_1, there is a loss of precision and the value_range is truncated. For the test-case provided by Uros, it is _344 = _343 * 2; [...,0x10008], precision = 384 [...,0x10086], precision = 384 and it is converted to following when it goes from wide_int to tree. [8, 134] Why do you believe that is wrong? Assuming _344 has a 32 bit type with wrapping overflow, this is just doing the wrapping modulo 2^32. -- Marc Glisse
Re: [PATCH 2/2] Enable elimination of zext/sext
On 28/08/14 16:44, Marc Glisse wrote: On Thu, 28 Aug 2014, Kugan wrote: On 27/08/14 23:02, Kugan wrote: On 27/08/14 20:01, Uros Bizjak wrote: Hello! 2014-08-07 Kugan Vivekanandarajah kug...@linaro.org * calls.c (precompute_arguments): Check promoted_for_signed_and_unsigned_p and set the promoted mode. (promoted_for_signed_and_unsigned_p): New function. (expand_expr_real_1): Check promoted_for_signed_and_unsigned_p and set the promoted mode. * expr.h (promoted_for_signed_and_unsigned_p): New function definition. * cfgexpand.c (expand_gimple_stmt_1): Call emit_move_insn if SUBREG is promoted with SRP_SIGNED_AND_UNSIGNED. This patch regresses: Running target unix FAIL: libgomp.fortran/simd7.f90 -O2 execution test FAIL: libgomp.fortran/simd7.f90 -Os execution test [snip] When compiling this code, we have: lhs = _63 target = (subreg/s/v/u:SI (reg:DI 145 [ D.1694 ]) 0) temp = (subreg:SI (reg:DI 540) 0) So, the code assumes that it is possible to copy (reg:DI 540) directly to (reg:DI 154). However, this is not the case, since we still have garbage in the top 32bits. Reverting the part above fixes the runtime failure, since (insn 599) is now: (insn 599 598 0 (set (reg:DI 145 [ D.1694 ]) (zero_extend:DI (subreg:SI (reg:DI 540) 0))) -1 (nil)) It looks to me that we have also to check the temp with SUBREG_PROMOTED_*. Sorry for the breakage. I am looking into this now and I can reproduce it on qemu-alpha. I have noticed the following VRP data which is used in deciding this erroneous removal. It seems suspicious to me. _343: [2147483652, 2147483715] _344: [8, 134] _345: [8, 134] _343 = ivtmp.179_52 + 2147483645; _344 = _343 * 2; _345 = (integer(kind=4)) _344; Error comes from the third statement. In tree-vrp.c, in extract_range_from_binary_expr_1, there is a loss of precision and the value_range is truncated. For the test-case provided by Uros, it is _344 = _343 * 2; [...,0x10008], precision = 384 [...,0x10086], precision = 384 and it is converted to following when it goes from wide_int to tree. [8, 134] Why do you believe that is wrong? Assuming _344 has a 32 bit type with wrapping overflow, this is just doing the wrapping modulo 2^32. Indeed. I missed the TYPE_OVERFLOW_WRAPS check earlier. Thanks for pointing me to that. Kugan
Re: [PATCH 2/2] Enable elimination of zext/sext
On 27/08/14 20:07, Richard Biener wrote: On Wed, Aug 27, 2014 at 12:01 PM, Uros Bizjak ubiz...@gmail.com wrote: Hello! 2014-08-07 Kugan Vivekanandarajah kug...@linaro.org * calls.c (precompute_arguments): Check promoted_for_signed_and_unsigned_p and set the promoted mode. (promoted_for_signed_and_unsigned_p): New function. (expand_expr_real_1): Check promoted_for_signed_and_unsigned_p and set the promoted mode. * expr.h (promoted_for_signed_and_unsigned_p): New function definition. * cfgexpand.c (expand_gimple_stmt_1): Call emit_move_insn if SUBREG is promoted with SRP_SIGNED_AND_UNSIGNED. This patch regresses: Running target unix FAIL: libgomp.fortran/simd7.f90 -O2 execution test FAIL: libgomp.fortran/simd7.f90 -Os execution test on alphaev6-linux-gnu. The problem can be illustrated with attached testcase with a crosscompiler to alphaev68-linux-gnu (-O2 -fopenmp). The problem is in missing SImode extension after DImode shift of SImode subregs for this part: --cut here-- # test.23_12 = PHI 0(37), 1(36) _242 = ivtmp.181_73 + 2147483645; _240 = _242 * 2; _63 = (integer(kind=4)) _240; if (ubound.6_99 = 2) goto bb 39; else goto bb 40; ;;succ: 39 ;;40 ;; basic block 39, loop depth 1 ;;pred: 38 pretmp_337 = test.23_12 | l_76; goto bb 45; ;;succ: 45 ;; basic block 40, loop depth 1 ;;pred: 38 _11 = *c_208[0]; if (_11 != _63) goto bb 45; else goto bb 42; --cut here-- this expands to: (code_label 592 591 593 35 [0 uses]) (note 593 592 0 NOTE_INSN_BASIC_BLOCK) ;; _63 = (integer(kind=4)) _240; (insn 594 593 595 (set (reg:SI 538) (const_int 1073741824 [0x4000])) -1 (nil)) (insn 595 594 596 (set (reg:SI 539) (plus:SI (reg:SI 538) (const_int 1073741824 [0x4000]))) -1 (nil)) (insn 596 595 597 (set (reg:SI 537) (plus:SI (reg:SI 539) (const_int -3 [0xfffd]))) -1 (expr_list:REG_EQUAL (const_int 2147483645 [0x7ffd]) (nil))) (insn 597 596 598 (set (reg:SI 536 [ D.1700 ]) (plus:SI (subreg/s/v/u:SI (reg:DI 144 [ ivtmp.181 ]) 0) (reg:SI 537))) -1 (nil)) (insn 598 597 599 (set (reg:DI 540) (ashift:DI (subreg:DI (reg:SI 536 [ D.1700 ]) 0) (const_int 1 [0x1]))) -1 (nil)) (insn 599 598 0 (set (reg:DI 145 [ D.1694 ]) (reg:DI 540)) -1 (nil)) ... (note 610 609 0 NOTE_INSN_BASIC_BLOCK) ;; _11 = *c_208[0]; (insn 611 610 0 (set (reg:DI 120 [ D.1694 ]) (sign_extend:DI (mem:SI (reg/v/f:DI 227 [ c ]) [7 *c_208+0 S4 A128]))) simd7.f90:12 -1 (nil)) ;; if (_11 != _63) (insn 612 611 613 40 (set (reg:DI 545) (eq:DI (reg:DI 120 [ D.1694 ]) (reg:DI 145 [ D.1694 ]))) simd7.f90:12 -1 (nil)) (jump_insn 613 612 616 40 (set (pc) (if_then_else (eq (reg:DI 545) (const_int 0 [0])) (label_ref 0) (pc))) simd7.f90:12 -1 (int_list:REG_BR_PROB 450 (nil))) which results in following asm: $L35: addl $25,$7,$2 # 597addsi3/1[length = 4] addq $2,$2,$2 # 598ashldi3/1[length = 4] -- here bne $24,$L145 # 601*bcc_normal[length = 4] lda $4,4($20) # 627*adddi_internal/2[length = 4] ldl $8,0($20) # 611*extendsidi2_1/2[length = 4] lda $3,3($31) # 74*movdi/2[length = 4] cmpeq $8,$2,$2 # 612*setcc_internal[length = 4] -- compare bne $2,$L40 # 613*bcc_normal[length = 4] br $31,$L88 # 2403jump[length = 4] .align 4 ... Tracking the values with the debugger shows wrong calculation: 0x00012000108c +1788: addlt10,t12,t1 0x000120001090 +1792: addqt1,t1,t1 ... 0x0001200010a4 +1812: cmpeq t6,t1,t1 0x0001200010a8 +1816: bne t1,0x1200010c0 foo_+1840 (gdb) si 0x00012000108c 17 l = l .or. any (b /= 7 + i) (gdb) i r t10 t12 t100x7 7 t120x7ffd 2147483645 (gdb) si 0x000120001090 17 l = l .or. any (b /= 7 + i) (gdb) i r t1 t1 0x8004 -2147483644 (gdb) si 18 l = l .or. any (c /= 8 + 2 * i) (gdb) i r t1 t1 0x0008 -4294967288 At this point, the calculation should zero-extend SImode value to full DImode, since compare operates on DImode values. The problematic insn is (insn 599), which is now a DImode assignment instead of zero-extend, due to: --- a/gcc/cfgexpand.c +++ b/gcc/cfgexpand.c @@ -3309,7 +3309,13 @@ expand_gimple_stmt_1 (gimple stmt) GET_MODE (target), temp, unsignedp); } - convert_move (SUBREG_REG (target), temp, unsignedp); + if ((SUBREG_PROMOTED_GET (target) == SRP_SIGNED_AND_UNSIGNED) + (GET_CODE (temp) == SUBREG)
Re: [PATCH 2/2] Enable elimination of zext/sext
On Thu, Aug 28, 2014 at 9:50 AM, Kugan kugan.vivekanandara...@linaro.org wrote: On 27/08/14 20:07, Richard Biener wrote: On Wed, Aug 27, 2014 at 12:01 PM, Uros Bizjak ubiz...@gmail.com wrote: Hello! 2014-08-07 Kugan Vivekanandarajah kug...@linaro.org * calls.c (precompute_arguments): Check promoted_for_signed_and_unsigned_p and set the promoted mode. (promoted_for_signed_and_unsigned_p): New function. (expand_expr_real_1): Check promoted_for_signed_and_unsigned_p and set the promoted mode. * expr.h (promoted_for_signed_and_unsigned_p): New function definition. * cfgexpand.c (expand_gimple_stmt_1): Call emit_move_insn if SUBREG is promoted with SRP_SIGNED_AND_UNSIGNED. This patch regresses: Running target unix FAIL: libgomp.fortran/simd7.f90 -O2 execution test FAIL: libgomp.fortran/simd7.f90 -Os execution test on alphaev6-linux-gnu. The problem can be illustrated with attached testcase with a crosscompiler to alphaev68-linux-gnu (-O2 -fopenmp). The problem is in missing SImode extension after DImode shift of SImode subregs for this part: --cut here-- # test.23_12 = PHI 0(37), 1(36) _242 = ivtmp.181_73 + 2147483645; _240 = _242 * 2; _63 = (integer(kind=4)) _240; if (ubound.6_99 = 2) goto bb 39; else goto bb 40; ;;succ: 39 ;;40 ;; basic block 39, loop depth 1 ;;pred: 38 pretmp_337 = test.23_12 | l_76; goto bb 45; ;;succ: 45 ;; basic block 40, loop depth 1 ;;pred: 38 _11 = *c_208[0]; if (_11 != _63) goto bb 45; else goto bb 42; --cut here-- this expands to: (code_label 592 591 593 35 [0 uses]) (note 593 592 0 NOTE_INSN_BASIC_BLOCK) ;; _63 = (integer(kind=4)) _240; (insn 594 593 595 (set (reg:SI 538) (const_int 1073741824 [0x4000])) -1 (nil)) (insn 595 594 596 (set (reg:SI 539) (plus:SI (reg:SI 538) (const_int 1073741824 [0x4000]))) -1 (nil)) (insn 596 595 597 (set (reg:SI 537) (plus:SI (reg:SI 539) (const_int -3 [0xfffd]))) -1 (expr_list:REG_EQUAL (const_int 2147483645 [0x7ffd]) (nil))) (insn 597 596 598 (set (reg:SI 536 [ D.1700 ]) (plus:SI (subreg/s/v/u:SI (reg:DI 144 [ ivtmp.181 ]) 0) (reg:SI 537))) -1 (nil)) (insn 598 597 599 (set (reg:DI 540) (ashift:DI (subreg:DI (reg:SI 536 [ D.1700 ]) 0) (const_int 1 [0x1]))) -1 (nil)) (insn 599 598 0 (set (reg:DI 145 [ D.1694 ]) (reg:DI 540)) -1 (nil)) ... (note 610 609 0 NOTE_INSN_BASIC_BLOCK) ;; _11 = *c_208[0]; (insn 611 610 0 (set (reg:DI 120 [ D.1694 ]) (sign_extend:DI (mem:SI (reg/v/f:DI 227 [ c ]) [7 *c_208+0 S4 A128]))) simd7.f90:12 -1 (nil)) ;; if (_11 != _63) (insn 612 611 613 40 (set (reg:DI 545) (eq:DI (reg:DI 120 [ D.1694 ]) (reg:DI 145 [ D.1694 ]))) simd7.f90:12 -1 (nil)) (jump_insn 613 612 616 40 (set (pc) (if_then_else (eq (reg:DI 545) (const_int 0 [0])) (label_ref 0) (pc))) simd7.f90:12 -1 (int_list:REG_BR_PROB 450 (nil))) which results in following asm: $L35: addl $25,$7,$2 # 597addsi3/1[length = 4] addq $2,$2,$2 # 598ashldi3/1[length = 4] -- here bne $24,$L145 # 601*bcc_normal[length = 4] lda $4,4($20) # 627*adddi_internal/2[length = 4] ldl $8,0($20) # 611*extendsidi2_1/2[length = 4] lda $3,3($31) # 74*movdi/2[length = 4] cmpeq $8,$2,$2 # 612*setcc_internal[length = 4] -- compare bne $2,$L40 # 613*bcc_normal[length = 4] br $31,$L88 # 2403jump[length = 4] .align 4 ... Tracking the values with the debugger shows wrong calculation: 0x00012000108c +1788: addlt10,t12,t1 0x000120001090 +1792: addqt1,t1,t1 ... 0x0001200010a4 +1812: cmpeq t6,t1,t1 0x0001200010a8 +1816: bne t1,0x1200010c0 foo_+1840 (gdb) si 0x00012000108c 17 l = l .or. any (b /= 7 + i) (gdb) i r t10 t12 t100x7 7 t120x7ffd 2147483645 (gdb) si 0x000120001090 17 l = l .or. any (b /= 7 + i) (gdb) i r t1 t1 0x8004 -2147483644 (gdb) si 18 l = l .or. any (c /= 8 + 2 * i) (gdb) i r t1 t1 0x0008 -4294967288 At this point, the calculation should zero-extend SImode value to full DImode, since compare operates on DImode values. The problematic insn is (insn 599), which is now a DImode assignment instead of zero-extend, due to: --- a/gcc/cfgexpand.c +++ b/gcc/cfgexpand.c @@ -3309,7 +3309,13 @@ expand_gimple_stmt_1 (gimple stmt) GET_MODE (target), temp, unsignedp); } - convert_move (SUBREG_REG (target), temp, unsignedp); + if
Re: [PATCH 2/2] Enable elimination of zext/sext
Hello! 2014-08-07 Kugan Vivekanandarajah kug...@linaro.org * calls.c (precompute_arguments): Check promoted_for_signed_and_unsigned_p and set the promoted mode. (promoted_for_signed_and_unsigned_p): New function. (expand_expr_real_1): Check promoted_for_signed_and_unsigned_p and set the promoted mode. * expr.h (promoted_for_signed_and_unsigned_p): New function definition. * cfgexpand.c (expand_gimple_stmt_1): Call emit_move_insn if SUBREG is promoted with SRP_SIGNED_AND_UNSIGNED. This patch regresses: Running target unix FAIL: libgomp.fortran/simd7.f90 -O2 execution test FAIL: libgomp.fortran/simd7.f90 -Os execution test on alphaev6-linux-gnu. The problem can be illustrated with attached testcase with a crosscompiler to alphaev68-linux-gnu (-O2 -fopenmp). The problem is in missing SImode extension after DImode shift of SImode subregs for this part: --cut here-- # test.23_12 = PHI 0(37), 1(36) _242 = ivtmp.181_73 + 2147483645; _240 = _242 * 2; _63 = (integer(kind=4)) _240; if (ubound.6_99 = 2) goto bb 39; else goto bb 40; ;;succ: 39 ;;40 ;; basic block 39, loop depth 1 ;;pred: 38 pretmp_337 = test.23_12 | l_76; goto bb 45; ;;succ: 45 ;; basic block 40, loop depth 1 ;;pred: 38 _11 = *c_208[0]; if (_11 != _63) goto bb 45; else goto bb 42; --cut here-- this expands to: (code_label 592 591 593 35 [0 uses]) (note 593 592 0 NOTE_INSN_BASIC_BLOCK) ;; _63 = (integer(kind=4)) _240; (insn 594 593 595 (set (reg:SI 538) (const_int 1073741824 [0x4000])) -1 (nil)) (insn 595 594 596 (set (reg:SI 539) (plus:SI (reg:SI 538) (const_int 1073741824 [0x4000]))) -1 (nil)) (insn 596 595 597 (set (reg:SI 537) (plus:SI (reg:SI 539) (const_int -3 [0xfffd]))) -1 (expr_list:REG_EQUAL (const_int 2147483645 [0x7ffd]) (nil))) (insn 597 596 598 (set (reg:SI 536 [ D.1700 ]) (plus:SI (subreg/s/v/u:SI (reg:DI 144 [ ivtmp.181 ]) 0) (reg:SI 537))) -1 (nil)) (insn 598 597 599 (set (reg:DI 540) (ashift:DI (subreg:DI (reg:SI 536 [ D.1700 ]) 0) (const_int 1 [0x1]))) -1 (nil)) (insn 599 598 0 (set (reg:DI 145 [ D.1694 ]) (reg:DI 540)) -1 (nil)) ... (note 610 609 0 NOTE_INSN_BASIC_BLOCK) ;; _11 = *c_208[0]; (insn 611 610 0 (set (reg:DI 120 [ D.1694 ]) (sign_extend:DI (mem:SI (reg/v/f:DI 227 [ c ]) [7 *c_208+0 S4 A128]))) simd7.f90:12 -1 (nil)) ;; if (_11 != _63) (insn 612 611 613 40 (set (reg:DI 545) (eq:DI (reg:DI 120 [ D.1694 ]) (reg:DI 145 [ D.1694 ]))) simd7.f90:12 -1 (nil)) (jump_insn 613 612 616 40 (set (pc) (if_then_else (eq (reg:DI 545) (const_int 0 [0])) (label_ref 0) (pc))) simd7.f90:12 -1 (int_list:REG_BR_PROB 450 (nil))) which results in following asm: $L35: addl $25,$7,$2 # 597addsi3/1[length = 4] addq $2,$2,$2 # 598ashldi3/1[length = 4] -- here bne $24,$L145 # 601*bcc_normal[length = 4] lda $4,4($20) # 627*adddi_internal/2[length = 4] ldl $8,0($20) # 611*extendsidi2_1/2[length = 4] lda $3,3($31) # 74*movdi/2[length = 4] cmpeq $8,$2,$2 # 612*setcc_internal[length = 4] -- compare bne $2,$L40 # 613*bcc_normal[length = 4] br $31,$L88 # 2403jump[length = 4] .align 4 ... Tracking the values with the debugger shows wrong calculation: 0x00012000108c +1788: addlt10,t12,t1 0x000120001090 +1792: addqt1,t1,t1 ... 0x0001200010a4 +1812: cmpeq t6,t1,t1 0x0001200010a8 +1816: bne t1,0x1200010c0 foo_+1840 (gdb) si 0x00012000108c 17 l = l .or. any (b /= 7 + i) (gdb) i r t10 t12 t100x7 7 t120x7ffd 2147483645 (gdb) si 0x000120001090 17 l = l .or. any (b /= 7 + i) (gdb) i r t1 t1 0x8004 -2147483644 (gdb) si 18 l = l .or. any (c /= 8 + 2 * i) (gdb) i r t1 t1 0x0008 -4294967288 At this point, the calculation should zero-extend SImode value to full DImode, since compare operates on DImode values. The problematic insn is (insn 599), which is now a DImode assignment instead of zero-extend, due to: --- a/gcc/cfgexpand.c +++ b/gcc/cfgexpand.c @@ -3309,7 +3309,13 @@ expand_gimple_stmt_1 (gimple stmt) GET_MODE (target), temp, unsignedp); } - convert_move (SUBREG_REG (target), temp, unsignedp); + if ((SUBREG_PROMOTED_GET (target) == SRP_SIGNED_AND_UNSIGNED) + (GET_CODE (temp) == SUBREG) + (GET_MODE (target) == GET_MODE (temp)) + (GET_MODE (SUBREG_REG (target)) == GET_MODE (SUBREG_REG (temp + emit_move_insn (SUBREG_REG (target), SUBREG_REG (temp)); + else + convert_move (SUBREG_REG (target), temp,
Re: [PATCH 2/2] Enable elimination of zext/sext
On Wed, Aug 27, 2014 at 12:01 PM, Uros Bizjak ubiz...@gmail.com wrote: Hello! 2014-08-07 Kugan Vivekanandarajah kug...@linaro.org * calls.c (precompute_arguments): Check promoted_for_signed_and_unsigned_p and set the promoted mode. (promoted_for_signed_and_unsigned_p): New function. (expand_expr_real_1): Check promoted_for_signed_and_unsigned_p and set the promoted mode. * expr.h (promoted_for_signed_and_unsigned_p): New function definition. * cfgexpand.c (expand_gimple_stmt_1): Call emit_move_insn if SUBREG is promoted with SRP_SIGNED_AND_UNSIGNED. This patch regresses: Running target unix FAIL: libgomp.fortran/simd7.f90 -O2 execution test FAIL: libgomp.fortran/simd7.f90 -Os execution test on alphaev6-linux-gnu. The problem can be illustrated with attached testcase with a crosscompiler to alphaev68-linux-gnu (-O2 -fopenmp). The problem is in missing SImode extension after DImode shift of SImode subregs for this part: --cut here-- # test.23_12 = PHI 0(37), 1(36) _242 = ivtmp.181_73 + 2147483645; _240 = _242 * 2; _63 = (integer(kind=4)) _240; if (ubound.6_99 = 2) goto bb 39; else goto bb 40; ;;succ: 39 ;;40 ;; basic block 39, loop depth 1 ;;pred: 38 pretmp_337 = test.23_12 | l_76; goto bb 45; ;;succ: 45 ;; basic block 40, loop depth 1 ;;pred: 38 _11 = *c_208[0]; if (_11 != _63) goto bb 45; else goto bb 42; --cut here-- this expands to: (code_label 592 591 593 35 [0 uses]) (note 593 592 0 NOTE_INSN_BASIC_BLOCK) ;; _63 = (integer(kind=4)) _240; (insn 594 593 595 (set (reg:SI 538) (const_int 1073741824 [0x4000])) -1 (nil)) (insn 595 594 596 (set (reg:SI 539) (plus:SI (reg:SI 538) (const_int 1073741824 [0x4000]))) -1 (nil)) (insn 596 595 597 (set (reg:SI 537) (plus:SI (reg:SI 539) (const_int -3 [0xfffd]))) -1 (expr_list:REG_EQUAL (const_int 2147483645 [0x7ffd]) (nil))) (insn 597 596 598 (set (reg:SI 536 [ D.1700 ]) (plus:SI (subreg/s/v/u:SI (reg:DI 144 [ ivtmp.181 ]) 0) (reg:SI 537))) -1 (nil)) (insn 598 597 599 (set (reg:DI 540) (ashift:DI (subreg:DI (reg:SI 536 [ D.1700 ]) 0) (const_int 1 [0x1]))) -1 (nil)) (insn 599 598 0 (set (reg:DI 145 [ D.1694 ]) (reg:DI 540)) -1 (nil)) ... (note 610 609 0 NOTE_INSN_BASIC_BLOCK) ;; _11 = *c_208[0]; (insn 611 610 0 (set (reg:DI 120 [ D.1694 ]) (sign_extend:DI (mem:SI (reg/v/f:DI 227 [ c ]) [7 *c_208+0 S4 A128]))) simd7.f90:12 -1 (nil)) ;; if (_11 != _63) (insn 612 611 613 40 (set (reg:DI 545) (eq:DI (reg:DI 120 [ D.1694 ]) (reg:DI 145 [ D.1694 ]))) simd7.f90:12 -1 (nil)) (jump_insn 613 612 616 40 (set (pc) (if_then_else (eq (reg:DI 545) (const_int 0 [0])) (label_ref 0) (pc))) simd7.f90:12 -1 (int_list:REG_BR_PROB 450 (nil))) which results in following asm: $L35: addl $25,$7,$2 # 597addsi3/1[length = 4] addq $2,$2,$2 # 598ashldi3/1[length = 4] -- here bne $24,$L145 # 601*bcc_normal[length = 4] lda $4,4($20) # 627*adddi_internal/2[length = 4] ldl $8,0($20) # 611*extendsidi2_1/2[length = 4] lda $3,3($31) # 74*movdi/2[length = 4] cmpeq $8,$2,$2 # 612*setcc_internal[length = 4] -- compare bne $2,$L40 # 613*bcc_normal[length = 4] br $31,$L88 # 2403jump[length = 4] .align 4 ... Tracking the values with the debugger shows wrong calculation: 0x00012000108c +1788: addlt10,t12,t1 0x000120001090 +1792: addqt1,t1,t1 ... 0x0001200010a4 +1812: cmpeq t6,t1,t1 0x0001200010a8 +1816: bne t1,0x1200010c0 foo_+1840 (gdb) si 0x00012000108c 17 l = l .or. any (b /= 7 + i) (gdb) i r t10 t12 t100x7 7 t120x7ffd 2147483645 (gdb) si 0x000120001090 17 l = l .or. any (b /= 7 + i) (gdb) i r t1 t1 0x8004 -2147483644 (gdb) si 18 l = l .or. any (c /= 8 + 2 * i) (gdb) i r t1 t1 0x0008 -4294967288 At this point, the calculation should zero-extend SImode value to full DImode, since compare operates on DImode values. The problematic insn is (insn 599), which is now a DImode assignment instead of zero-extend, due to: --- a/gcc/cfgexpand.c +++ b/gcc/cfgexpand.c @@ -3309,7 +3309,13 @@ expand_gimple_stmt_1 (gimple stmt) GET_MODE (target), temp, unsignedp); } - convert_move (SUBREG_REG (target), temp, unsignedp); + if ((SUBREG_PROMOTED_GET (target) == SRP_SIGNED_AND_UNSIGNED) + (GET_CODE (temp) == SUBREG) + (GET_MODE (target) == GET_MODE
Re: [PATCH 2/2] Enable elimination of zext/sext
On Wed, Aug 27, 2014 at 12:25 PM, Uros Bizjak ubiz...@gmail.com wrote: On Wed, Aug 27, 2014 at 12:07 PM, Richard Biener richard.guent...@gmail.com wrote: 2014-08-07 Kugan Vivekanandarajah kug...@linaro.org * calls.c (precompute_arguments): Check promoted_for_signed_and_unsigned_p and set the promoted mode. (promoted_for_signed_and_unsigned_p): New function. (expand_expr_real_1): Check promoted_for_signed_and_unsigned_p and set the promoted mode. * expr.h (promoted_for_signed_and_unsigned_p): New function definition. * cfgexpand.c (expand_gimple_stmt_1): Call emit_move_insn if SUBREG is promoted with SRP_SIGNED_AND_UNSIGNED. This patch regresses: Running target unix FAIL: libgomp.fortran/simd7.f90 -O2 execution test FAIL: libgomp.fortran/simd7.f90 -Os execution test on alphaev6-linux-gnu. So, the code assumes that it is possible to copy (reg:DI 540) directly to (reg:DI 154). However, this is not the case, since we still have garbage in the top 32bits. Reverting the part above fixes the runtime failure, since (insn 599) is now: (insn 599 598 0 (set (reg:DI 145 [ D.1694 ]) (zero_extend:DI (subreg:SI (reg:DI 540) 0))) -1 (nil)) It looks to me that we have also to check the temp with SUBREG_PROMOTED_*. Yeah, that makes sense. Something like following (untested) patch that also fixes the testcase perhaps? Yes (though I'm not really familiar with the RTL side here and the comment before SUBREG_PROMOTED_VAR_P looks odd) Richard. -- cut here-- Index: cfgexpand.c === --- cfgexpand.c (revision 214445) +++ cfgexpand.c (working copy) @@ -3322,6 +3322,7 @@ expand_gimple_stmt_1 (gimple stmt) if ((SUBREG_PROMOTED_GET (target) == SRP_SIGNED_AND_UNSIGNED) (GET_CODE (temp) == SUBREG) +SUBREG_PROMOTED_VAR_P (temp) (GET_MODE (target) == GET_MODE (temp)) (GET_MODE (SUBREG_REG (target)) == GET_MODE (SUBREG_REG (temp emit_move_insn (SUBREG_REG (target), SUBREG_REG (temp)); -- cut here Uros.
Re: [PATCH 2/2] Enable elimination of zext/sext
On Wed, Aug 27, 2014 at 12:07 PM, Richard Biener richard.guent...@gmail.com wrote: 2014-08-07 Kugan Vivekanandarajah kug...@linaro.org * calls.c (precompute_arguments): Check promoted_for_signed_and_unsigned_p and set the promoted mode. (promoted_for_signed_and_unsigned_p): New function. (expand_expr_real_1): Check promoted_for_signed_and_unsigned_p and set the promoted mode. * expr.h (promoted_for_signed_and_unsigned_p): New function definition. * cfgexpand.c (expand_gimple_stmt_1): Call emit_move_insn if SUBREG is promoted with SRP_SIGNED_AND_UNSIGNED. This patch regresses: Running target unix FAIL: libgomp.fortran/simd7.f90 -O2 execution test FAIL: libgomp.fortran/simd7.f90 -Os execution test on alphaev6-linux-gnu. So, the code assumes that it is possible to copy (reg:DI 540) directly to (reg:DI 154). However, this is not the case, since we still have garbage in the top 32bits. Reverting the part above fixes the runtime failure, since (insn 599) is now: (insn 599 598 0 (set (reg:DI 145 [ D.1694 ]) (zero_extend:DI (subreg:SI (reg:DI 540) 0))) -1 (nil)) It looks to me that we have also to check the temp with SUBREG_PROMOTED_*. Yeah, that makes sense. Something like following (untested) patch that also fixes the testcase perhaps? -- cut here-- Index: cfgexpand.c === --- cfgexpand.c (revision 214445) +++ cfgexpand.c (working copy) @@ -3322,6 +3322,7 @@ expand_gimple_stmt_1 (gimple stmt) if ((SUBREG_PROMOTED_GET (target) == SRP_SIGNED_AND_UNSIGNED) (GET_CODE (temp) == SUBREG) +SUBREG_PROMOTED_VAR_P (temp) (GET_MODE (target) == GET_MODE (temp)) (GET_MODE (SUBREG_REG (target)) == GET_MODE (SUBREG_REG (temp emit_move_insn (SUBREG_REG (target), SUBREG_REG (temp)); -- cut here Uros.
Re: [PATCH 2/2] Enable elimination of zext/sext
On 27/08/14 20:01, Uros Bizjak wrote: Hello! 2014-08-07 Kugan Vivekanandarajah kug...@linaro.org * calls.c (precompute_arguments): Check promoted_for_signed_and_unsigned_p and set the promoted mode. (promoted_for_signed_and_unsigned_p): New function. (expand_expr_real_1): Check promoted_for_signed_and_unsigned_p and set the promoted mode. * expr.h (promoted_for_signed_and_unsigned_p): New function definition. * cfgexpand.c (expand_gimple_stmt_1): Call emit_move_insn if SUBREG is promoted with SRP_SIGNED_AND_UNSIGNED. This patch regresses: Running target unix FAIL: libgomp.fortran/simd7.f90 -O2 execution test FAIL: libgomp.fortran/simd7.f90 -Os execution test [snip] When compiling this code, we have: lhs = _63 target = (subreg/s/v/u:SI (reg:DI 145 [ D.1694 ]) 0) temp = (subreg:SI (reg:DI 540) 0) So, the code assumes that it is possible to copy (reg:DI 540) directly to (reg:DI 154). However, this is not the case, since we still have garbage in the top 32bits. Reverting the part above fixes the runtime failure, since (insn 599) is now: (insn 599 598 0 (set (reg:DI 145 [ D.1694 ]) (zero_extend:DI (subreg:SI (reg:DI 540) 0))) -1 (nil)) It looks to me that we have also to check the temp with SUBREG_PROMOTED_*. Sorry for the breakage. I am looking into this now and I can reproduce it on qemu-alpha. I have noticed the following VRP data which is used in deciding this erroneous removal. It seems suspicious to me. _343: [2147483652, 2147483715] _344: [8, 134] _345: [8, 134] _343 = ivtmp.179_52 + 2147483645; _344 = _343 * 2; _345 = (integer(kind=4)) _344; Error comes from the third statement. Thanks, Kugan
Re: [PATCH 2/2] Enable elimination of zext/sext
On 27/08/14 23:02, Kugan wrote: On 27/08/14 20:01, Uros Bizjak wrote: Hello! 2014-08-07 Kugan Vivekanandarajah kug...@linaro.org * calls.c (precompute_arguments): Check promoted_for_signed_and_unsigned_p and set the promoted mode. (promoted_for_signed_and_unsigned_p): New function. (expand_expr_real_1): Check promoted_for_signed_and_unsigned_p and set the promoted mode. * expr.h (promoted_for_signed_and_unsigned_p): New function definition. * cfgexpand.c (expand_gimple_stmt_1): Call emit_move_insn if SUBREG is promoted with SRP_SIGNED_AND_UNSIGNED. This patch regresses: Running target unix FAIL: libgomp.fortran/simd7.f90 -O2 execution test FAIL: libgomp.fortran/simd7.f90 -Os execution test [snip] When compiling this code, we have: lhs = _63 target = (subreg/s/v/u:SI (reg:DI 145 [ D.1694 ]) 0) temp = (subreg:SI (reg:DI 540) 0) So, the code assumes that it is possible to copy (reg:DI 540) directly to (reg:DI 154). However, this is not the case, since we still have garbage in the top 32bits. Reverting the part above fixes the runtime failure, since (insn 599) is now: (insn 599 598 0 (set (reg:DI 145 [ D.1694 ]) (zero_extend:DI (subreg:SI (reg:DI 540) 0))) -1 (nil)) It looks to me that we have also to check the temp with SUBREG_PROMOTED_*. Sorry for the breakage. I am looking into this now and I can reproduce it on qemu-alpha. I have noticed the following VRP data which is used in deciding this erroneous removal. It seems suspicious to me. _343: [2147483652, 2147483715] _344: [8, 134] _345: [8, 134] _343 = ivtmp.179_52 + 2147483645; _344 = _343 * 2; _345 = (integer(kind=4)) _344; Error comes from the third statement. In tree-vrp.c, in extract_range_from_binary_expr_1, there is a loss of precision and the value_range is truncated. For the test-case provided by Uros, it is _344 = _343 * 2; [...,0x10008], precision = 384 [...,0x10086], precision = 384 and it is converted to following when it goes from wide_int to tree. [8, 134] How about doing something like this to fix it. diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c index d16fd8a..c0fb902 100644 --- a/gcc/tree-vrp.c +++ b/gcc/tree-vrp.c @@ -2625,6 +2625,8 @@ extract_range_from_binary_expr_1 (value_range_t *vr, wi::extended_tree WIDE_INT_MAX_PRECISION * 2 vrp_int_cst; vrp_int sizem1 = wi::mask vrp_int (prec, false); vrp_int size = sizem1 + 1; + vrp_int type_min = vrp_int_cst (TYPE_MIN_VALUE (expr_type)); + vrp_int type_max = vrp_int_cst (TYPE_MAX_VALUE (expr_type)); /* Extend the values using the sign of the result to PREC2. From here on out, everthing is just signed math no matter @@ -2688,7 +2690,9 @@ extract_range_from_binary_expr_1 (value_range_t *vr, /* diff = max - min. */ prod2 = prod3 - prod0; - if (wi::geu_p (prod2, sizem1)) + if (wi::geu_p (prod2, sizem1) + || wi::lts_p (prod0, type_min) + || wi::gts_p (prod3, type_max)) { /* the range covers all values. */ set_value_range_to_varying (vr); If this looks reasonable I will do proper testing and post the results with the Changelog. Thanks, Kugan
Re: [PATCH 2/2] Enable elimination of zext/sext
On Tue, Aug 5, 2014 at 4:21 PM, Jakub Jelinek ja...@redhat.com wrote: On Tue, Aug 05, 2014 at 04:17:41PM +0200, Richard Biener wrote: what's the semantic of setting SRP_SIGNED_AND_UNSIGNED on the subreg? That is, for the created (subreg:lhs_mode (reg:PROMOTE_MODE of ssa N))? SRP_SIGNED_AND_UNSIGNED on a subreg should mean that the subreg is both zero and sign extended, which means that the topmost bit of the narrower mode is known to be zero, and all bits above it in the wider mode are known to be zero too. SRP_SIGNED means that the topmost bit of the narrower mode is either 0 or 1 and depending on that the above wider mode bits are either all 0 or all 1. SRP_UNSIGNED means that regardless of the topmost bit value, all above wider mode bits are 0. Ok, then from the context of the patch we already know that either SRP_UNSIGNED or SRP_SIGNED is true which means that the value is sign- or zero-extended. I suppose inside promoted_for_type_p TYPE_MODE (TREE_TYPE (ssa)) == lhs_mode, I'm not sure why you pass !unsignedp as lhs_uns. Now, from 'ssa' alone we can't tell anything about a larger mode registers value if that is either zero- or sign-extended. But we know that those bits are properly zero-extended if unsignedp and properly sign-extended if !unsignedp? So what the predicate tries to prove is that sign- and zero-extending results in the same larger-mode value. This is true if the MSB of the smaller mode is not set. Let's assume that smaller mode is that of 'ssa' then the test is just return (!tree_int_cst_sign_bit (min) !tree_int_cst_sign_bit (max)); no? Thanks, Richard. Jakub
Re: [PATCH 2/2] Enable elimination of zext/sext
On 06/08/14 22:09, Richard Biener wrote: On Tue, Aug 5, 2014 at 4:21 PM, Jakub Jelinek ja...@redhat.com wrote: On Tue, Aug 05, 2014 at 04:17:41PM +0200, Richard Biener wrote: what's the semantic of setting SRP_SIGNED_AND_UNSIGNED on the subreg? That is, for the created (subreg:lhs_mode (reg:PROMOTE_MODE of ssa N))? SRP_SIGNED_AND_UNSIGNED on a subreg should mean that the subreg is both zero and sign extended, which means that the topmost bit of the narrower mode is known to be zero, and all bits above it in the wider mode are known to be zero too. SRP_SIGNED means that the topmost bit of the narrower mode is either 0 or 1 and depending on that the above wider mode bits are either all 0 or all 1. SRP_UNSIGNED means that regardless of the topmost bit value, all above wider mode bits are 0. Ok, then from the context of the patch we already know that either SRP_UNSIGNED or SRP_SIGNED is true which means that the value is sign- or zero-extended. I suppose inside promoted_for_type_p TYPE_MODE (TREE_TYPE (ssa)) == lhs_mode, I'm not sure why you pass !unsignedp as lhs_uns. In expand_expr_real_1, it is already known that it is promoted for unsigned_p and we are setting SUBREG_PROMOTED_SET (temp, unsignedp). If we can prove that it is also promoted for !unsignedp, we can set SUBREG_PROMOTED_SET (temp, SRP_SIGNED_AND_UNSIGNED). promoted_for_type_p should prove this based on the value range info. Now, from 'ssa' alone we can't tell anything about a larger mode registers value if that is either zero- or sign-extended. But we know that those bits are properly zero-extended if unsignedp and properly sign-extended if !unsignedp? So what the predicate tries to prove is that sign- and zero-extending results in the same larger-mode value. This is true if the MSB of the smaller mode is not set. Let's assume that smaller mode is that of 'ssa' then the test is just return (!tree_int_cst_sign_bit (min) !tree_int_cst_sign_bit (max)); no? hmm, is this because we will never have a call to promoted_for_type_p with same sign (ignoring PROMOTE_MODE) for 'ssa' and the larger mode. The case with larger mode signed and 'ssa' unsigned will not work. Therefore larger mode unsigned and 'ssa' signed will be the only case that we should consider. However, with PROMOTE_MODE, isnt that we will miss some cases with this. Thanks, Kugan
Re: [PATCH 2/2] Enable elimination of zext/sext
On Wed, Aug 6, 2014 at 3:21 PM, Kugan kugan.vivekanandara...@linaro.org wrote: On 06/08/14 22:09, Richard Biener wrote: On Tue, Aug 5, 2014 at 4:21 PM, Jakub Jelinek ja...@redhat.com wrote: On Tue, Aug 05, 2014 at 04:17:41PM +0200, Richard Biener wrote: what's the semantic of setting SRP_SIGNED_AND_UNSIGNED on the subreg? That is, for the created (subreg:lhs_mode (reg:PROMOTE_MODE of ssa N))? SRP_SIGNED_AND_UNSIGNED on a subreg should mean that the subreg is both zero and sign extended, which means that the topmost bit of the narrower mode is known to be zero, and all bits above it in the wider mode are known to be zero too. SRP_SIGNED means that the topmost bit of the narrower mode is either 0 or 1 and depending on that the above wider mode bits are either all 0 or all 1. SRP_UNSIGNED means that regardless of the topmost bit value, all above wider mode bits are 0. Ok, then from the context of the patch we already know that either SRP_UNSIGNED or SRP_SIGNED is true which means that the value is sign- or zero-extended. I suppose inside promoted_for_type_p TYPE_MODE (TREE_TYPE (ssa)) == lhs_mode, I'm not sure why you pass !unsignedp as lhs_uns. In expand_expr_real_1, it is already known that it is promoted for unsigned_p and we are setting SUBREG_PROMOTED_SET (temp, unsignedp). If we can prove that it is also promoted for !unsignedp, we can set SUBREG_PROMOTED_SET (temp, SRP_SIGNED_AND_UNSIGNED). promoted_for_type_p should prove this based on the value range info. Now, from 'ssa' alone we can't tell anything about a larger mode registers value if that is either zero- or sign-extended. But we know that those bits are properly zero-extended if unsignedp and properly sign-extended if !unsignedp? So what the predicate tries to prove is that sign- and zero-extending results in the same larger-mode value. This is true if the MSB of the smaller mode is not set. Let's assume that smaller mode is that of 'ssa' then the test is just return (!tree_int_cst_sign_bit (min) !tree_int_cst_sign_bit (max)); no? hmm, is this because we will never have a call to promoted_for_type_p with same sign (ignoring PROMOTE_MODE) for 'ssa' and the larger mode. The case with larger mode signed and 'ssa' unsigned will not work. Therefore larger mode unsigned and 'ssa' signed will be the only case that we should consider. However, with PROMOTE_MODE, isnt that we will miss some cases with this. No, PROMOTE_MODE will still either sign- or zero-extend. If either results in zeros in the upper bits then PROMOTE_MODE doesn't matter. Richard. Thanks, Kugan
Re: [PATCH 2/2] Enable elimination of zext/sext
On 06/08/14 23:29, Richard Biener wrote: On Wed, Aug 6, 2014 at 3:21 PM, Kugan kugan.vivekanandara...@linaro.org wrote: On 06/08/14 22:09, Richard Biener wrote: On Tue, Aug 5, 2014 at 4:21 PM, Jakub Jelinek ja...@redhat.com wrote: On Tue, Aug 05, 2014 at 04:17:41PM +0200, Richard Biener wrote: what's the semantic of setting SRP_SIGNED_AND_UNSIGNED on the subreg? That is, for the created (subreg:lhs_mode (reg:PROMOTE_MODE of ssa N))? SRP_SIGNED_AND_UNSIGNED on a subreg should mean that the subreg is both zero and sign extended, which means that the topmost bit of the narrower mode is known to be zero, and all bits above it in the wider mode are known to be zero too. SRP_SIGNED means that the topmost bit of the narrower mode is either 0 or 1 and depending on that the above wider mode bits are either all 0 or all 1. SRP_UNSIGNED means that regardless of the topmost bit value, all above wider mode bits are 0. Ok, then from the context of the patch we already know that either SRP_UNSIGNED or SRP_SIGNED is true which means that the value is sign- or zero-extended. I suppose inside promoted_for_type_p TYPE_MODE (TREE_TYPE (ssa)) == lhs_mode, I'm not sure why you pass !unsignedp as lhs_uns. In expand_expr_real_1, it is already known that it is promoted for unsigned_p and we are setting SUBREG_PROMOTED_SET (temp, unsignedp). If we can prove that it is also promoted for !unsignedp, we can set SUBREG_PROMOTED_SET (temp, SRP_SIGNED_AND_UNSIGNED). promoted_for_type_p should prove this based on the value range info. Now, from 'ssa' alone we can't tell anything about a larger mode registers value if that is either zero- or sign-extended. But we know that those bits are properly zero-extended if unsignedp and properly sign-extended if !unsignedp? So what the predicate tries to prove is that sign- and zero-extending results in the same larger-mode value. This is true if the MSB of the smaller mode is not set. Let's assume that smaller mode is that of 'ssa' then the test is just return (!tree_int_cst_sign_bit (min) !tree_int_cst_sign_bit (max)); no? hmm, is this because we will never have a call to promoted_for_type_p with same sign (ignoring PROMOTE_MODE) for 'ssa' and the larger mode. The case with larger mode signed and 'ssa' unsigned will not work. Therefore larger mode unsigned and 'ssa' signed will be the only case that we should consider. However, with PROMOTE_MODE, isnt that we will miss some cases with this. No, PROMOTE_MODE will still either sign- or zero-extend. If either results in zeros in the upper bits then PROMOTE_MODE doesn't matter. Thanks for the explanation. Please find the attached patch that implements this. I have updated the comments and predicate to match this. Bootstrap tested on x86_64-unknown-linux-gnu and regression tested on x86_64-unknown-linux-gnu and arm-none-linux-gnueabi with no new regressions. Is this OK? Thanks, Kugan gcc/ 2014-08-07 Kugan Vivekanandarajah kug...@linaro.org * calls.c (precompute_arguments): Check promoted_for_signed_and_unsigned_p and set the promoted mode. (promoted_for_signed_and_unsigned_p): New function. (expand_expr_real_1): Check promoted_for_signed_and_unsigned_p and set the promoted mode. * expr.h (promoted_for_signed_and_unsigned_p): New function definition. * cfgexpand.c (expand_gimple_stmt_1): Call emit_move_insn if SUBREG is promoted with SRP_SIGNED_AND_UNSIGNED. gcc/testsuite 2014-08-07 Kugan Vivekanandarajah kug...@linaro.org * gcc.dg/zero_sign_ext_test.c: New test. diff --git a/gcc/calls.c b/gcc/calls.c index 00c5028..4285ec1 100644 --- a/gcc/calls.c +++ b/gcc/calls.c @@ -1484,7 +1484,10 @@ precompute_arguments (int num_actuals, struct arg_data *args) args[i].initial_value = gen_lowpart_SUBREG (mode, args[i].value); SUBREG_PROMOTED_VAR_P (args[i].initial_value) = 1; - SUBREG_PROMOTED_SET (args[i].initial_value, args[i].unsignedp); + if (promoted_for_signed_and_unsigned_p (args[i].tree_value, mode)) + SUBREG_PROMOTED_SET (args[i].initial_value, SRP_SIGNED_AND_UNSIGNED); + else + SUBREG_PROMOTED_SET (args[i].initial_value, args[i].unsignedp); } } } diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c index f98c322..b14626c 100644 --- a/gcc/cfgexpand.c +++ b/gcc/cfgexpand.c @@ -3309,7 +3309,13 @@ expand_gimple_stmt_1 (gimple stmt) GET_MODE (target), temp, unsignedp); } - convert_move (SUBREG_REG (target), temp, unsignedp); + if ((SUBREG_PROMOTED_GET (target) == SRP_SIGNED_AND_UNSIGNED) +(GET_CODE (temp) == SUBREG) +(GET_MODE (target) == GET_MODE (temp)) +(GET_MODE (SUBREG_REG (target)) == GET_MODE (SUBREG_REG (temp +
Re: [PATCH 2/2] Enable elimination of zext/sext
On Fri, Aug 1, 2014 at 6:03 PM, Kugan kugan.vivekanandara...@linaro.org wrote: if (rhs_uns) return wi::ge_p (min, 0); // if min = 0 then range contains positive values else return wi::le_p (max, wi::max_value (TYPE_PRECISION (TREE_TYPE (ssa)), SIGNED); // if max = signed-max-of-type then range doesn't need sign-extension I think we will have to check that ssa has necessary sign/zero extension when assigned to lhs_type. If PROMOTE_MODE tells us that ssa's type will be interpreted differently, the value range of ssa also will have corresponding range. In this cases, shouldn’t we have to check for upper and lower limit for both min and max? Hmm? That's exactly what the check is testing... we know that min = max thus if min = 0 then max = 0. zero_extension will never do anything on [0, INF] If max MAX-SIGNED then sign-extension will not do anything. Ok, sign-extension will do sth for negative values still. So rather if (rhs_uns) return wi::geu_p (min, 0); else return wi::ges_p (min, 0) wi::les_p (max, wi::max_value (TYPE_PRECISION (TREE_TYPE (ssa)), SIGNED)); ? Thanks for the explanation. I agree. Don’t we have to however check this on lhs_uns as this function is checking if ssa is promoted for lhs_sign and lhs_mode? Here is an attempt based on this. I ran regression testing with arm-none-linux-gnueabi on qemu-arm without any new regressions. Sine I am not comparing value ranges to see if it can be represented in lhs_sigh, I can now skip the PROMOTED_MODE check. Now I'm lost. You call this function from two contexts: diff --git a/gcc/calls.c b/gcc/calls.c index a3e6faa..eac512f 100644 --- a/gcc/calls.c +++ b/gcc/calls.c @@ -1484,7 +1484,10 @@ precompute_arguments (int num_actuals, struct arg_data *args) args[i].initial_value = gen_lowpart_SUBREG (mode, args[i].value); SUBREG_PROMOTED_VAR_P (args[i].initial_value) = 1; - SUBREG_PROMOTED_SET (args[i].initial_value, args[i].unsignedp); + if (is_promoted_for_type (args[i].tree_value, mode, !args[i].unsignedp)) + SUBREG_PROMOTED_SET (args[i].initial_value, SRP_SIGNED_AND_UNSIGNED); + else + SUBREG_PROMOTED_SET (args[i].initial_value, args[i].unsignedp); and @@ -9527,7 +9587,10 @@ expand_expr_real_1 (tree exp, rtx target, enum machine_mode tmode, temp = gen_lowpart_SUBREG (mode, decl_rtl); SUBREG_PROMOTED_VAR_P (temp) = 1; - SUBREG_PROMOTED_SET (temp, unsignedp); + if (is_promoted_for_type (ssa_name, mode, !unsignedp)) + SUBREG_PROMOTED_SET (temp, SRP_SIGNED_AND_UNSIGNED); + else + SUBREG_PROMOTED_SET (temp, unsignedp); return temp; } what's the semantic of setting SRP_SIGNED_AND_UNSIGNED on the subreg? That is, for the created (subreg:lhs_mode (reg:PROMOTE_MODE of ssa N))? it seems that we need to verify that 'ssa', when promoted, does not have bits set above the target modes MSB when we know it is zero-extended (according to PROMOTE_MODE)? Or has all bits set to one and is sign-extended (according to PROMOTE_MODE)? Now it seems that the promotion is according to promote_{function,decl}_mode in expand_expr_real_1 and according to promote_mode in calls.c. The function comment above promoted_for_type_p needs to be more elaborate on what invariant it checks. As you pass in the subreg mode but you need to verify the larger mode is properly extended. I am still using wide_int::from (instead of wi::max_value) to get the limit as I have to match the precision with min, max precision. otherwise wide_int comparisons will not work. Is there a better way for this? I don't understand. wi::max_value takes a precision argument. /* Return TRUE if value in SSA is already zero/sign extended for lhs type (type here is the combination of LHS_MODE and LHS_UNS) using value range information stored. Return FALSE otherwise. */ bool promoted_for_type_p (tree ssa, enum machine_mode lhs_mode, bool lhs_uns) { wide_int min, max, limit; tree lhs_type; bool rhs_uns; signop rhs_signop; if (ssa == NULL_TREE || TREE_CODE (ssa) != SSA_NAME || !INTEGRAL_TYPE_P (TREE_TYPE (ssa))) return false; /* Return FALSE if value_range is not recorded for SSA. */ if (get_range_info (ssa, min, max) != VR_RANGE) return false; rhs_uns = TYPE_UNSIGNED (TREE_TYPE (ssa)); rhs_signop = rhs_uns ? UNSIGNED : SIGNED; lhs_type = lang_hooks.types.type_for_mode (lhs_mode, lhs_uns); limit = wide_int::from (TYPE_MAX_VALUE (lhs_type), TYPE_PRECISION (TREE_TYPE (ssa)), SIGNED); if (lhs_uns) /* If min = 0 then range contains positive values and doesnt need zero-extension. */ return wi::ge_p (min, 0, rhs_signop); else /* If min = 0 and max = signed-max-of-type then range doesn't need sign-extension. */ return
Re: [PATCH 2/2] Enable elimination of zext/sext
On Tue, Aug 05, 2014 at 04:17:41PM +0200, Richard Biener wrote: what's the semantic of setting SRP_SIGNED_AND_UNSIGNED on the subreg? That is, for the created (subreg:lhs_mode (reg:PROMOTE_MODE of ssa N))? SRP_SIGNED_AND_UNSIGNED on a subreg should mean that the subreg is both zero and sign extended, which means that the topmost bit of the narrower mode is known to be zero, and all bits above it in the wider mode are known to be zero too. SRP_SIGNED means that the topmost bit of the narrower mode is either 0 or 1 and depending on that the above wider mode bits are either all 0 or all 1. SRP_UNSIGNED means that regardless of the topmost bit value, all above wider mode bits are 0. Jakub
Re: [PATCH 2/2] Enable elimination of zext/sext
On 02/08/14 02:03, Kugan wrote: if (rhs_uns) return wi::ge_p (min, 0); // if min = 0 then range contains positive values else return wi::le_p (max, wi::max_value (TYPE_PRECISION (TREE_TYPE (ssa)), SIGNED); // if max = signed-max-of-type then range doesn't need sign-extension I think we will have to check that ssa has necessary sign/zero extension when assigned to lhs_type. If PROMOTE_MODE tells us that ssa's type will be interpreted differently, the value range of ssa also will have corresponding range. In this cases, shouldn’t we have to check for upper and lower limit for both min and max? Hmm? That's exactly what the check is testing... we know that min = max thus if min = 0 then max = 0. zero_extension will never do anything on [0, INF] If max MAX-SIGNED then sign-extension will not do anything. Ok, sign-extension will do sth for negative values still. So rather if (rhs_uns) return wi::geu_p (min, 0); else return wi::ges_p (min, 0) wi::les_p (max, wi::max_value (TYPE_PRECISION (TREE_TYPE (ssa)), SIGNED)); ? Looking at your comments again, I think we have to consider three things here. To be able assign to LHS (of lhs_uns and lhs_mode) without conversion of RHS (tree SSA) * If we ignore the mode changes (i.e. LHS_mode can be different in terms of precision) and ignore PROMOTE_MODE and consider only the sign of LHS and RHS if (lhs_uns) return wi::ge_p (min, 0, rhs_signop); // if min = 0 then range contains positive values else if (rhs_uns) // if max = signed-max-of-type then range doesn't need sign-extension return wi::le_p (max, wi::max_value (TYPE_PRECISION (TREE_TYPE (ssa)), SIGNED); else return true; * However, if we consider the PROMOTE_MODE might change the RHS sign if (lhs_uns) { return wi::ge_p (min, 0, rhs_signop); } else { signed_max = wide_int::from (TYPE_MAX_VALUE (lhs_type), TYPE_PRECISION (TREE_TYPE (ssa)), rhs_signop); if (rhs_uns) /* If PROMOTE_MODE changed an RHS signed to unsigned and SSA contains negative value range, we still have to do sign-extend. */ return wi::ge_p (min, 0, TYPE_SIGN (TREE_TYPE (ssa))) wi::le_p (max, signed_max, rhs_signop); else /* If PROMOTE_MODE changed an RHS unsigned to signed and SSA contains value range more than signed-max-of-type, we still have to do sign-extend. */ return wi::le_p (max, signed_max, TYPE_SIGN (TREE_TYPE (ssa))); } * If we also consider that LHS mode and RHS mode precision can be different if (lhs_uns) { unsigned_max = wide_int::from (TYPE_MAX_VALUE (lhs_type), TYPE_PRECISION (TREE_TYPE (ssa)), rhs_signop); /* If min = 0 then range contains positive values and doesnt need zero-extension. If max = unsigned-max-of-type, then value fits type. */ return wi::ge_p (min, 0, rhs_signop) wi::le_p (max, unsigned_max, rhs_signop); } else { signed_max = wide_int::from (TYPE_MAX_VALUE (lhs_type), TYPE_PRECISION (TREE_TYPE (ssa)), rhs_signop); signed_min = wide_int::from (TYPE_MIN_VALUE (lhs_type), TYPE_PRECISION (TREE_TYPE (ssa)), rhs_signop); if (rhs_uns) /* If PROMOTE_MODE changed an RHS signed to unsigned and SSA contains negative value range, we still have to do sign-extend. */ return wi::ge_p (min, 0, TYPE_SIGN (TREE_TYPE (ssa))) wi::le_p (max, signed_max, rhs_signop); else /* If PROMOTE_MODE changed an RHS unsigned to signed and SSA contains value range more than signed-max-of-type, we still have to do sign-extend. */ return wi::le_p (max, signed_max, TYPE_SIGN (TREE_TYPE (ssa))) wi::ge_p (min, signed_min, rhs_signop); } } Since we can have PROMOTE_MODE changing the sign and LHS mode and RHS mode precision can be different, the check should be the third one. Does that make sense or am I still missing it? Thanks again for your time, Kugan
Re: [PATCH 2/2] Enable elimination of zext/sext
On Fri, Aug 1, 2014 at 6:51 AM, Kugan kugan.vivekanandara...@linaro.org wrote: + lhs_type = lang_hooks.types.type_for_mode (lhs_mode, lhs_uns); ... + ((!lhs_uns !wi::neg_p (min, TYPE_SIGN (lhs_type))) ... + type_min = wide_int::from (TYPE_MIN_VALUE (lhs_type), prec, +TYPE_SIGN (TREE_TYPE (ssa))); + type_max = wide_int::from (TYPE_MAX_VALUE (lhs_type), prec, +TYPE_SIGN (TREE_TYPE (ssa))); you shouldn't try getting at lhs_type. Btw, do you want to constrain lhs_mode to MODE_INTs somewhere? Is this in addition to !INTEGRAL_TYPE_P (TREE_TYPE (ssa)))? Do you mean that I should check lhs_mode as well? No, that's probably enough. For TYPE_SIGN use lhs_uns instead, for the min/max value you should use wi::min_value () and wi::max_value () instead. You are still using TYPE_SIGN (TREE_TYPE (ssa)) here and later, but we computed rhs_uns properly using PROMOTE_MODE. I think the code with re-setting lhs_uns if rhs_uns != lhs_uns and later using TYPE_SIGN again is pretty hard to follow. Btw, it seems you need to conditionalize the call to PROMOTE_MODE on its availability. Isn't it simply about choosing a proper range we need to restrict ssa to? That is, dependent on rhs_uns computed by PROMOTE_MODE, simply: + mode = TYPE_MODE (TREE_TYPE (ssa)); + rhs_uns = TYPE_UNSIGNED (TREE_TYPE (ssa)); #ifdef PROMOTE_MODE + PROMOTE_MODE (mode, rhs_uns, TREE_TYPE (ssa)); #endif if (rhs_uns) return wi::ge_p (min, 0); // if min = 0 then range contains positive values else return wi::le_p (max, wi::max_value (TYPE_PRECISION (TREE_TYPE (ssa)), SIGNED); // if max = signed-max-of-type then range doesn't need sign-extension I think we will have to check that ssa has necessary sign/zero extension when assigned to lhs_type. If PROMOTE_MODE tells us that ssa's type will be interpreted differently, the value range of ssa also will have corresponding range. In this cases, shouldn’t we have to check for upper and lower limit for both min and max? Hmm? That's exactly what the check is testing... we know that min = max thus if min = 0 then max = 0. zero_extension will never do anything on [0, INF] If max MAX-SIGNED then sign-extension will not do anything. Ok, sign-extension will do sth for negative values still. So rather if (rhs_uns) return wi::geu_p (min, 0); else return wi::ges_p (min, 0) wi::les_p (max, wi::max_value (TYPE_PRECISION (TREE_TYPE (ssa)), SIGNED)); ? I don't like the use of int_fits_type_p you propose. Richard. How about this? bool promoted_for_type_p (tree ssa, enum machine_mode lhs_mode, bool lhs_uns) { wide_int min, max; tree lhs_type, rhs_type; bool rhs_uns; enum machine_mode rhs_mode; tree min_tree, max_tree; if (ssa == NULL_TREE || TREE_CODE (ssa) != SSA_NAME || !INTEGRAL_TYPE_P (TREE_TYPE (ssa))) return false; /* Return FALSE if value_range is not recorded for SSA. */ if (get_range_info (ssa, min, max) != VR_RANGE) return false; rhs_uns = TYPE_UNSIGNED (TREE_TYPE (ssa)); if (rhs_uns != lhs_uns) { /* Signedness of LHS and RHS differs and values also cannot be represented in LHS range. */ unsigned int prec = min.get_precision (); if ((lhs_uns wi::neg_p (min, rhs_uns ? UNSIGNED : SIGNED)) || (!lhs_uns !wi::le_p (max, wi::max_value (prec, SIGNED), rhs_uns ? UNSIGNED : SIGNED))) return false; } /* In some architectures, modes are promoted and sign changed with target defined PROMOTE_MODE macro. If PROMOTE_MODE tells you to promote _not_ according to ssa's sign then honour that. */ rhs_mode = TYPE_MODE (TREE_TYPE (ssa)); #ifdef PROMOTE_MODE PROMOTE_MODE (rhs_mode, rhs_uns, TREE_TYPE (ssa)); #endif rhs_type = lang_hooks.types.type_for_mode (rhs_mode, rhs_uns); lhs_type = lang_hooks.types.type_for_mode (lhs_mode, lhs_uns); min_tree = wide_int_to_tree (rhs_type, min); max_tree = wide_int_to_tree (rhs_type, max); /* Check if values lies in-between the type range. */ if (int_fits_type_p (min_tree, lhs_type) int_fits_type_p (max_tree, lhs_type)) return true; else return false; } Thanks, Kugan
Re: [PATCH 2/2] Enable elimination of zext/sext
if (rhs_uns) return wi::ge_p (min, 0); // if min = 0 then range contains positive values else return wi::le_p (max, wi::max_value (TYPE_PRECISION (TREE_TYPE (ssa)), SIGNED); // if max = signed-max-of-type then range doesn't need sign-extension I think we will have to check that ssa has necessary sign/zero extension when assigned to lhs_type. If PROMOTE_MODE tells us that ssa's type will be interpreted differently, the value range of ssa also will have corresponding range. In this cases, shouldn’t we have to check for upper and lower limit for both min and max? Hmm? That's exactly what the check is testing... we know that min = max thus if min = 0 then max = 0. zero_extension will never do anything on [0, INF] If max MAX-SIGNED then sign-extension will not do anything. Ok, sign-extension will do sth for negative values still. So rather if (rhs_uns) return wi::geu_p (min, 0); else return wi::ges_p (min, 0) wi::les_p (max, wi::max_value (TYPE_PRECISION (TREE_TYPE (ssa)), SIGNED)); ? Thanks for the explanation. I agree. Don’t we have to however check this on lhs_uns as this function is checking if ssa is promoted for lhs_sign and lhs_mode? Here is an attempt based on this. I ran regression testing with arm-none-linux-gnueabi on qemu-arm without any new regressions. Sine I am not comparing value ranges to see if it can be represented in lhs_sigh, I can now skip the PROMOTED_MODE check. I am still using wide_int::from (instead of wi::max_value) to get the limit as I have to match the precision with min, max precision. otherwise wide_int comparisons will not work. Is there a better way for this? /* Return TRUE if value in SSA is already zero/sign extended for lhs type (type here is the combination of LHS_MODE and LHS_UNS) using value range information stored. Return FALSE otherwise. */ bool promoted_for_type_p (tree ssa, enum machine_mode lhs_mode, bool lhs_uns) { wide_int min, max, limit; tree lhs_type; bool rhs_uns; signop rhs_signop; if (ssa == NULL_TREE || TREE_CODE (ssa) != SSA_NAME || !INTEGRAL_TYPE_P (TREE_TYPE (ssa))) return false; /* Return FALSE if value_range is not recorded for SSA. */ if (get_range_info (ssa, min, max) != VR_RANGE) return false; rhs_uns = TYPE_UNSIGNED (TREE_TYPE (ssa)); rhs_signop = rhs_uns ? UNSIGNED : SIGNED; lhs_type = lang_hooks.types.type_for_mode (lhs_mode, lhs_uns); limit = wide_int::from (TYPE_MAX_VALUE (lhs_type), TYPE_PRECISION (TREE_TYPE (ssa)), SIGNED); if (lhs_uns) /* If min = 0 then range contains positive values and doesnt need zero-extension. */ return wi::ge_p (min, 0, rhs_signop); else /* If min = 0 and max = signed-max-of-type then range doesn't need sign-extension. */ return wi::ge_p (min, 0, rhs_signop) wi::le_p (max, limit, rhs_signop); } Thanks, Kugan
Re: [PATCH 2/2] Enable elimination of zext/sext
+ lhs_type = lang_hooks.types.type_for_mode (lhs_mode, lhs_uns); ... + ((!lhs_uns !wi::neg_p (min, TYPE_SIGN (lhs_type))) ... + type_min = wide_int::from (TYPE_MIN_VALUE (lhs_type), prec, +TYPE_SIGN (TREE_TYPE (ssa))); + type_max = wide_int::from (TYPE_MAX_VALUE (lhs_type), prec, +TYPE_SIGN (TREE_TYPE (ssa))); you shouldn't try getting at lhs_type. Btw, do you want to constrain lhs_mode to MODE_INTs somewhere? Is this in addition to !INTEGRAL_TYPE_P (TREE_TYPE (ssa)))? Do you mean that I should check lhs_mode as well? For TYPE_SIGN use lhs_uns instead, for the min/max value you should use wi::min_value () and wi::max_value () instead. You are still using TYPE_SIGN (TREE_TYPE (ssa)) here and later, but we computed rhs_uns properly using PROMOTE_MODE. I think the code with re-setting lhs_uns if rhs_uns != lhs_uns and later using TYPE_SIGN again is pretty hard to follow. Btw, it seems you need to conditionalize the call to PROMOTE_MODE on its availability. Isn't it simply about choosing a proper range we need to restrict ssa to? That is, dependent on rhs_uns computed by PROMOTE_MODE, simply: + mode = TYPE_MODE (TREE_TYPE (ssa)); + rhs_uns = TYPE_UNSIGNED (TREE_TYPE (ssa)); #ifdef PROMOTE_MODE + PROMOTE_MODE (mode, rhs_uns, TREE_TYPE (ssa)); #endif if (rhs_uns) return wi::ge_p (min, 0); // if min = 0 then range contains positive values else return wi::le_p (max, wi::max_value (TYPE_PRECISION (TREE_TYPE (ssa)), SIGNED); // if max = signed-max-of-type then range doesn't need sign-extension I think we will have to check that ssa has necessary sign/zero extension when assigned to lhs_type. If PROMOTE_MODE tells us that ssa's type will be interpreted differently, the value range of ssa also will have corresponding range. In this cases, shouldn’t we have to check for upper and lower limit for both min and max? How about this? bool promoted_for_type_p (tree ssa, enum machine_mode lhs_mode, bool lhs_uns) { wide_int min, max; tree lhs_type, rhs_type; bool rhs_uns; enum machine_mode rhs_mode; tree min_tree, max_tree; if (ssa == NULL_TREE || TREE_CODE (ssa) != SSA_NAME || !INTEGRAL_TYPE_P (TREE_TYPE (ssa))) return false; /* Return FALSE if value_range is not recorded for SSA. */ if (get_range_info (ssa, min, max) != VR_RANGE) return false; rhs_uns = TYPE_UNSIGNED (TREE_TYPE (ssa)); if (rhs_uns != lhs_uns) { /* Signedness of LHS and RHS differs and values also cannot be represented in LHS range. */ unsigned int prec = min.get_precision (); if ((lhs_uns wi::neg_p (min, rhs_uns ? UNSIGNED : SIGNED)) || (!lhs_uns !wi::le_p (max, wi::max_value (prec, SIGNED), rhs_uns ? UNSIGNED : SIGNED))) return false; } /* In some architectures, modes are promoted and sign changed with target defined PROMOTE_MODE macro. If PROMOTE_MODE tells you to promote _not_ according to ssa's sign then honour that. */ rhs_mode = TYPE_MODE (TREE_TYPE (ssa)); #ifdef PROMOTE_MODE PROMOTE_MODE (rhs_mode, rhs_uns, TREE_TYPE (ssa)); #endif rhs_type = lang_hooks.types.type_for_mode (rhs_mode, rhs_uns); lhs_type = lang_hooks.types.type_for_mode (lhs_mode, lhs_uns); min_tree = wide_int_to_tree (rhs_type, min); max_tree = wide_int_to_tree (rhs_type, max); /* Check if values lies in-between the type range. */ if (int_fits_type_p (min_tree, lhs_type) int_fits_type_p (max_tree, lhs_type)) return true; else return false; } Thanks, Kugan
Re: [PATCH 2/2] Enable elimination of zext/sext
On Mon, Jul 14, 2014 at 4:57 AM, Kugan kugan.vivekanandara...@linaro.org wrote: On 11/07/14 22:47, Richard Biener wrote: On Fri, Jul 11, 2014 at 1:52 PM, Kugan kugan.vivekanandara...@linaro.org wrote: Thanks foe the review and suggestions. On 10/07/14 22:15, Richard Biener wrote: On Mon, Jul 7, 2014 at 8:55 AM, Kugan kugan.vivekanandara...@linaro.org wrote: [...] For -fwrapv, it is due to how PROMOTE_MODE is defined in arm back-end. In the test-case, a function (which has signed char return type) returns -1 in one of the paths. ARM PROMOTE_MODE changes that to 255 and relies on zero/sign extension generated by RTL again for the correct value. I saw some other targets also defining similar think. I am therefore skipping removing zero/sign extension if the ssa variable can be set to negative integer constants. Hm? I think you should rather check that you are removing a sign-/zero-extension - PROMOTE_MODE tells you if it will sign- or zero-extend. Definitely + /* In some architectures, negative integer constants are truncated and + sign changed with target defined PROMOTE_MODE macro. This will impact + the value range seen here and produce wrong code if zero/sign extensions + are eliminated. Therefore, return false if this SSA can have negative + integers. */ + if (is_gimple_assign (stmt) + (TREE_CODE_CLASS (gimple_assign_rhs_code (stmt)) == tcc_unary)) +{ + tree rhs1 = gimple_assign_rhs1 (stmt); + if (TREE_CODE (rhs1) == INTEGER_CST + !TYPE_UNSIGNED (TREE_TYPE (ssa)) + tree_int_cst_compare (rhs1, integer_zero_node) == -1) + return false; looks completely bogus ... (an unary op with a constant operand?) instead you want to do sth like I see that unary op with a constant operand is not possible in gimple. What I wanted to check here is any sort of constant loads; but seems that will not happen in gimple. Is PHI statements the only possible statements where we will end up with such constants. No, in theory you can have ssa_1 = -1; but that's not unary but a GIMPLE_SINGLE_RHS and thus gimple_assign_rhs_code (stmt) == INTEGER_CST. mode = TYPE_MODE (TREE_TYPE (ssa)); rhs_uns = TYPE_UNSIGNED (TREE_TYPE (ssa)); PROMOTE_MODE (mode, rhs_uns, TREE_TYPE (ssa)); instead of initializing rhs_uns from ssas type. That is, if PROMOTE_MODE tells you to promote _not_ according to ssas sign then honor that. This is triggered in pr43017.c in function foo for arm-none-linux-gnueabi. where, the gimple statement that cause this looks like: . # _3 = PHI _17(7), -1(2) bb43: return _3; ARM PROMOTE_MODE changes the sign for integer constants only and hence looking at the variable with PROMOTE_MODE is not changing the sign in this case. #define PROMOTE_MODE(MODE, UNSIGNEDP, TYPE) \ if (GET_MODE_CLASS (MODE) == MODE_INT \ GET_MODE_SIZE (MODE) 4) \ { \ if (MODE == QImode) \ UNSIGNEDP = 1; \ else if (MODE == HImode) \ UNSIGNEDP = 1; \ (MODE) = SImode; \ } Where does it only apply for constants? It applies to all QImode and HImode entities. oops, sorry. I don’t know what I was thinking or looking at when I wrote that :( It indeed fixes my problems. Thanks for that. Here is the modified patch. Bootstrapped and regression tested for 86_64-unknown-linux-gnu and arm-none-linux-gnueabi with no new regressions. Is this OK? + lhs_type = lang_hooks.types.type_for_mode (lhs_mode, lhs_uns); ... + ((!lhs_uns !wi::neg_p (min, TYPE_SIGN (lhs_type))) ... + type_min = wide_int::from (TYPE_MIN_VALUE (lhs_type), prec, +TYPE_SIGN (TREE_TYPE (ssa))); + type_max = wide_int::from (TYPE_MAX_VALUE (lhs_type), prec, +TYPE_SIGN (TREE_TYPE (ssa))); you shouldn't try getting at lhs_type. Btw, do you want to constrain lhs_mode to MODE_INTs somewhere? For TYPE_SIGN use lhs_uns instead, for the min/max value you should use wi::min_value () and wi::max_value () instead. You are still using TYPE_SIGN (TREE_TYPE (ssa)) here and later, but we computed rhs_uns properly using PROMOTE_MODE. I think the code with re-setting lhs_uns if rhs_uns != lhs_uns and later using TYPE_SIGN again is pretty hard to follow. Btw, it seems you need to conditionalize the call to PROMOTE_MODE on its availability. Isn't it simply about choosing a proper range we need to restrict ssa to? That is, dependent on rhs_uns computed by PROMOTE_MODE, simply: + mode = TYPE_MODE (TREE_TYPE (ssa)); + rhs_uns = TYPE_UNSIGNED (TREE_TYPE (ssa)); #ifdef PROMOTE_MODE + PROMOTE_MODE (mode, rhs_uns, TREE_TYPE (ssa)); #endif if (rhs_uns) return wi::ge_p (min, 0); // if min = 0 then range contains positive values else
Re: [PATCH 2/2] Enable elimination of zext/sext
On 14 July 2014 04:58:17 Kugan kugan.vivekanandara...@linaro.org wrote: On 11/07/14 22:47, Richard Biener wrote: On Fri, Jul 11, 2014 at 1:52 PM, Kugan kugan.vivekanandara...@linaro.org wrote: Thanks foe the review and suggestions. On 10/07/14 22:15, Richard Biener wrote: On Mon, Jul 7, 2014 at 8:55 AM, Kugan kugan.vivekanandara...@linaro.org wrote: [...] For -fwrapv, it is due to how PROMOTE_MODE is defined in arm back-end. In the test-case, a function (which has signed char return type) returns -1 in one of the paths. ARM PROMOTE_MODE changes that to 255 and relies on zero/sign extension generated by RTL again for the correct value. I saw some other targets also defining similar think. I am therefore skipping removing zero/sign extension if the ssa variable can be set to negative integer constants. Hm? I think you should rather check that you are removing a sign-/zero-extension - PROMOTE_MODE tells you if it will sign- or zero-extend. Definitely + /* In some architectures, negative integer constants are truncated and + sign changed with target defined PROMOTE_MODE macro. This will impact + the value range seen here and produce wrong code if zero/sign extensions + are eliminated. Therefore, return false if this SSA can have negative + integers. */ + if (is_gimple_assign (stmt) + (TREE_CODE_CLASS (gimple_assign_rhs_code (stmt)) == tcc_unary)) +{ + tree rhs1 = gimple_assign_rhs1 (stmt); + if (TREE_CODE (rhs1) == INTEGER_CST + !TYPE_UNSIGNED (TREE_TYPE (ssa)) + tree_int_cst_compare (rhs1, integer_zero_node) == -1) + return false; looks completely bogus ... (an unary op with a constant operand?) instead you want to do sth like I see that unary op with a constant operand is not possible in gimple. What I wanted to check here is any sort of constant loads; but seems that will not happen in gimple. Is PHI statements the only possible statements where we will end up with such constants. No, in theory you can have ssa_1 = -1; but that's not unary but a GIMPLE_SINGLE_RHS and thus gimple_assign_rhs_code (stmt) == INTEGER_CST. mode = TYPE_MODE (TREE_TYPE (ssa)); rhs_uns = TYPE_UNSIGNED (TREE_TYPE (ssa)); PROMOTE_MODE (mode, rhs_uns, TREE_TYPE (ssa)); instead of initializing rhs_uns from ssas type. That is, if PROMOTE_MODE tells you to promote _not_ according to ssas sign then honor that. This is triggered in pr43017.c in function foo for arm-none-linux-gnueabi. where, the gimple statement that cause this looks like: . # _3 = PHI _17(7), -1(2) bb43: return _3; ARM PROMOTE_MODE changes the sign for integer constants only and hence looking at the variable with PROMOTE_MODE is not changing the sign in this case. #define PROMOTE_MODE(MODE, UNSIGNEDP, TYPE) \ if (GET_MODE_CLASS (MODE) == MODE_INT \ GET_MODE_SIZE (MODE) 4) \ { \ if (MODE == QImode) \ UNSIGNEDP = 1; \ else if (MODE == HImode) \ UNSIGNEDP = 1; \ (MODE) = SImode; \ } Where does it only apply for constants? It applies to all QImode and HImode entities. oops, sorry. I don’t know what I was thinking or looking at when I wrote that :( It indeed fixes my problems. Thanks for that. Here is the modified patch. Bootstrapped and regression tested for 86_64-unknown-linux-gnu and arm-none-linux-gnueabi with no new regressions. Is this OK? Thanks, Kugan gcc/ 2014-07-14 Kugan Vivekanandarajah kug...@linaro.org * calls.c (precompute_arguments): Check is_promoted_for_type and set the promoted mode. (is_promoted_for_type): New function. Don't we name predicates more like promoted_for_type_p? Thanks, (expand_expr_real_1): Check is_promoted_for_type and set the promoted mode. * expr.h (is_promoted_for_type): New function definition. * cfgexpand.c (expand_gimple_stmt_1): Call emit_move_insn if SUBREG is promoted with SRP_SIGNED_AND_UNSIGNED. gcc/testsuite 2014-07-14 Kugan Vivekanandarajah kug...@linaro.org * gcc.dg/zero_sign_ext_test.c: New test. Sent with AquaMail for Android http://www.aqua-mail.com
Re: [PATCH 2/2] Enable elimination of zext/sext
On 11/07/14 22:47, Richard Biener wrote: On Fri, Jul 11, 2014 at 1:52 PM, Kugan kugan.vivekanandara...@linaro.org wrote: Thanks foe the review and suggestions. On 10/07/14 22:15, Richard Biener wrote: On Mon, Jul 7, 2014 at 8:55 AM, Kugan kugan.vivekanandara...@linaro.org wrote: [...] For -fwrapv, it is due to how PROMOTE_MODE is defined in arm back-end. In the test-case, a function (which has signed char return type) returns -1 in one of the paths. ARM PROMOTE_MODE changes that to 255 and relies on zero/sign extension generated by RTL again for the correct value. I saw some other targets also defining similar think. I am therefore skipping removing zero/sign extension if the ssa variable can be set to negative integer constants. Hm? I think you should rather check that you are removing a sign-/zero-extension - PROMOTE_MODE tells you if it will sign- or zero-extend. Definitely + /* In some architectures, negative integer constants are truncated and + sign changed with target defined PROMOTE_MODE macro. This will impact + the value range seen here and produce wrong code if zero/sign extensions + are eliminated. Therefore, return false if this SSA can have negative + integers. */ + if (is_gimple_assign (stmt) + (TREE_CODE_CLASS (gimple_assign_rhs_code (stmt)) == tcc_unary)) +{ + tree rhs1 = gimple_assign_rhs1 (stmt); + if (TREE_CODE (rhs1) == INTEGER_CST + !TYPE_UNSIGNED (TREE_TYPE (ssa)) + tree_int_cst_compare (rhs1, integer_zero_node) == -1) + return false; looks completely bogus ... (an unary op with a constant operand?) instead you want to do sth like I see that unary op with a constant operand is not possible in gimple. What I wanted to check here is any sort of constant loads; but seems that will not happen in gimple. Is PHI statements the only possible statements where we will end up with such constants. No, in theory you can have ssa_1 = -1; but that's not unary but a GIMPLE_SINGLE_RHS and thus gimple_assign_rhs_code (stmt) == INTEGER_CST. mode = TYPE_MODE (TREE_TYPE (ssa)); rhs_uns = TYPE_UNSIGNED (TREE_TYPE (ssa)); PROMOTE_MODE (mode, rhs_uns, TREE_TYPE (ssa)); instead of initializing rhs_uns from ssas type. That is, if PROMOTE_MODE tells you to promote _not_ according to ssas sign then honor that. This is triggered in pr43017.c in function foo for arm-none-linux-gnueabi. where, the gimple statement that cause this looks like: . # _3 = PHI _17(7), -1(2) bb43: return _3; ARM PROMOTE_MODE changes the sign for integer constants only and hence looking at the variable with PROMOTE_MODE is not changing the sign in this case. #define PROMOTE_MODE(MODE, UNSIGNEDP, TYPE) \ if (GET_MODE_CLASS (MODE) == MODE_INT \ GET_MODE_SIZE (MODE) 4) \ { \ if (MODE == QImode) \ UNSIGNEDP = 1; \ else if (MODE == HImode) \ UNSIGNEDP = 1; \ (MODE) = SImode; \ } Where does it only apply for constants? It applies to all QImode and HImode entities. oops, sorry. I don’t know what I was thinking or looking at when I wrote that :( It indeed fixes my problems. Thanks for that. Here is the modified patch. Bootstrapped and regression tested for 86_64-unknown-linux-gnu and arm-none-linux-gnueabi with no new regressions. Is this OK? Thanks, Kugan gcc/ 2014-07-14 Kugan Vivekanandarajah kug...@linaro.org * calls.c (precompute_arguments): Check is_promoted_for_type and set the promoted mode. (is_promoted_for_type): New function. (expand_expr_real_1): Check is_promoted_for_type and set the promoted mode. * expr.h (is_promoted_for_type): New function definition. * cfgexpand.c (expand_gimple_stmt_1): Call emit_move_insn if SUBREG is promoted with SRP_SIGNED_AND_UNSIGNED. gcc/testsuite 2014-07-14 Kugan Vivekanandarajah kug...@linaro.org * gcc.dg/zero_sign_ext_test.c: New test. diff --git a/gcc/calls.c b/gcc/calls.c index a3e6faa..eac512f 100644 --- a/gcc/calls.c +++ b/gcc/calls.c @@ -1484,7 +1484,10 @@ precompute_arguments (int num_actuals, struct arg_data *args) args[i].initial_value = gen_lowpart_SUBREG (mode, args[i].value); SUBREG_PROMOTED_VAR_P (args[i].initial_value) = 1; - SUBREG_PROMOTED_SET (args[i].initial_value, args[i].unsignedp); + if (is_promoted_for_type (args[i].tree_value, mode, !args[i].unsignedp)) + SUBREG_PROMOTED_SET (args[i].initial_value, SRP_SIGNED_AND_UNSIGNED); + else + SUBREG_PROMOTED_SET (args[i].initial_value, args[i].unsignedp); } } } diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c index
Re: [PATCH 2/2] Enable elimination of zext/sext
Thanks foe the review and suggestions. On 10/07/14 22:15, Richard Biener wrote: On Mon, Jul 7, 2014 at 8:55 AM, Kugan kugan.vivekanandara...@linaro.org wrote: [...] For -fwrapv, it is due to how PROMOTE_MODE is defined in arm back-end. In the test-case, a function (which has signed char return type) returns -1 in one of the paths. ARM PROMOTE_MODE changes that to 255 and relies on zero/sign extension generated by RTL again for the correct value. I saw some other targets also defining similar think. I am therefore skipping removing zero/sign extension if the ssa variable can be set to negative integer constants. Hm? I think you should rather check that you are removing a sign-/zero-extension - PROMOTE_MODE tells you if it will sign- or zero-extend. Definitely + /* In some architectures, negative integer constants are truncated and + sign changed with target defined PROMOTE_MODE macro. This will impact + the value range seen here and produce wrong code if zero/sign extensions + are eliminated. Therefore, return false if this SSA can have negative + integers. */ + if (is_gimple_assign (stmt) + (TREE_CODE_CLASS (gimple_assign_rhs_code (stmt)) == tcc_unary)) +{ + tree rhs1 = gimple_assign_rhs1 (stmt); + if (TREE_CODE (rhs1) == INTEGER_CST + !TYPE_UNSIGNED (TREE_TYPE (ssa)) + tree_int_cst_compare (rhs1, integer_zero_node) == -1) + return false; looks completely bogus ... (an unary op with a constant operand?) instead you want to do sth like I see that unary op with a constant operand is not possible in gimple. What I wanted to check here is any sort of constant loads; but seems that will not happen in gimple. Is PHI statements the only possible statements where we will end up with such constants. mode = TYPE_MODE (TREE_TYPE (ssa)); rhs_uns = TYPE_UNSIGNED (TREE_TYPE (ssa)); PROMOTE_MODE (mode, rhs_uns, TREE_TYPE (ssa)); instead of initializing rhs_uns from ssas type. That is, if PROMOTE_MODE tells you to promote _not_ according to ssas sign then honor that. This is triggered in pr43017.c in function foo for arm-none-linux-gnueabi. where, the gimple statement that cause this looks like: . # _3 = PHI _17(7), -1(2) bb43: return _3; ARM PROMOTE_MODE changes the sign for integer constants only and hence looking at the variable with PROMOTE_MODE is not changing the sign in this case. #define PROMOTE_MODE(MODE, UNSIGNEDP, TYPE) \ if (GET_MODE_CLASS (MODE) == MODE_INT \ GET_MODE_SIZE (MODE) 4) \ { \ if (MODE == QImode) \ UNSIGNEDP = 1; \ else if (MODE == HImode) \ UNSIGNEDP = 1; \ (MODE) = SImode; \ } As for the -fno-strict-overflow case, if the variables overflows, in VRP dumps, I see +INF(OVF), but the value range stored in ssa has TYPE_MAX. We therefore should limit the comparison to (TYPE_MIN VR_MIN VR_MAX TYPE_MAX) instead of (TYPE_MIN = VR_MIN VR_MAX = TYPE_MAX) when checking to be sure that this is not the overflowing case. Attached patch changes this. I don't think that's necessary - the overflow cases happen only when that overflow has undefined behavior, thus any valid program will have values = MAX. I see that you have now removed +INF(OVF). I will change it this way. Thanks again, Kugan
Re: [PATCH 2/2] Enable elimination of zext/sext
On Fri, Jul 11, 2014 at 1:52 PM, Kugan kugan.vivekanandara...@linaro.org wrote: Thanks foe the review and suggestions. On 10/07/14 22:15, Richard Biener wrote: On Mon, Jul 7, 2014 at 8:55 AM, Kugan kugan.vivekanandara...@linaro.org wrote: [...] For -fwrapv, it is due to how PROMOTE_MODE is defined in arm back-end. In the test-case, a function (which has signed char return type) returns -1 in one of the paths. ARM PROMOTE_MODE changes that to 255 and relies on zero/sign extension generated by RTL again for the correct value. I saw some other targets also defining similar think. I am therefore skipping removing zero/sign extension if the ssa variable can be set to negative integer constants. Hm? I think you should rather check that you are removing a sign-/zero-extension - PROMOTE_MODE tells you if it will sign- or zero-extend. Definitely + /* In some architectures, negative integer constants are truncated and + sign changed with target defined PROMOTE_MODE macro. This will impact + the value range seen here and produce wrong code if zero/sign extensions + are eliminated. Therefore, return false if this SSA can have negative + integers. */ + if (is_gimple_assign (stmt) + (TREE_CODE_CLASS (gimple_assign_rhs_code (stmt)) == tcc_unary)) +{ + tree rhs1 = gimple_assign_rhs1 (stmt); + if (TREE_CODE (rhs1) == INTEGER_CST + !TYPE_UNSIGNED (TREE_TYPE (ssa)) + tree_int_cst_compare (rhs1, integer_zero_node) == -1) + return false; looks completely bogus ... (an unary op with a constant operand?) instead you want to do sth like I see that unary op with a constant operand is not possible in gimple. What I wanted to check here is any sort of constant loads; but seems that will not happen in gimple. Is PHI statements the only possible statements where we will end up with such constants. No, in theory you can have ssa_1 = -1; but that's not unary but a GIMPLE_SINGLE_RHS and thus gimple_assign_rhs_code (stmt) == INTEGER_CST. mode = TYPE_MODE (TREE_TYPE (ssa)); rhs_uns = TYPE_UNSIGNED (TREE_TYPE (ssa)); PROMOTE_MODE (mode, rhs_uns, TREE_TYPE (ssa)); instead of initializing rhs_uns from ssas type. That is, if PROMOTE_MODE tells you to promote _not_ according to ssas sign then honor that. This is triggered in pr43017.c in function foo for arm-none-linux-gnueabi. where, the gimple statement that cause this looks like: . # _3 = PHI _17(7), -1(2) bb43: return _3; ARM PROMOTE_MODE changes the sign for integer constants only and hence looking at the variable with PROMOTE_MODE is not changing the sign in this case. #define PROMOTE_MODE(MODE, UNSIGNEDP, TYPE) \ if (GET_MODE_CLASS (MODE) == MODE_INT \ GET_MODE_SIZE (MODE) 4) \ { \ if (MODE == QImode) \ UNSIGNEDP = 1; \ else if (MODE == HImode) \ UNSIGNEDP = 1; \ (MODE) = SImode; \ } Where does it only apply for constants? It applies to all QImode and HImode entities. As for the -fno-strict-overflow case, if the variables overflows, in VRP dumps, I see +INF(OVF), but the value range stored in ssa has TYPE_MAX. We therefore should limit the comparison to (TYPE_MIN VR_MIN VR_MAX TYPE_MAX) instead of (TYPE_MIN = VR_MIN VR_MAX = TYPE_MAX) when checking to be sure that this is not the overflowing case. Attached patch changes this. I don't think that's necessary - the overflow cases happen only when that overflow has undefined behavior, thus any valid program will have values = MAX. I see that you have now removed +INF(OVF). I will change it this way. I have not removed anything, I just fixed a bug. Richard. Thanks again, Kugan
Re: [PATCH 2/2] Enable elimination of zext/sext
On Mon, Jul 7, 2014 at 8:55 AM, Kugan kugan.vivekanandara...@linaro.org wrote: For -fwrapv I don't see why you'd get into trouble ever, the VRP computation should be well aware of the -fwrapv semantics and the value ranges should reflect that. For -fno-strict-overflow, I have no idea since it is very weirdly defined. In any case, for your example above, the loop is always well defined, because for char/short a++ is performed as: a = (short) ((int) a + 1) So, if the patch turns it into infinite loop, with -Os -fno-strict-overflow or -Os, it is simply a problem with the patch. VR [1, 32768] looks correct, a++ is performed only if a is = 0, therefore before addition [0, 32767]. But from VR [1, 32768] you can't optimize away the sign extension, make sure you don't have there off-by-one? I have fixed the above bug yesterday. It would be nice if the patch contained some testcases, it is easy to construct testcases where you have arbitrary VRs on some SSA_NAMEs, you just need something to stick the VR on, so you can do something like: type foo (type a) { if (a VR_min + 1 || a VR_max + 1) return; // If VR_min is type minimum or VR_max type maximum this needs to be adjusted of course. a = a + 1; // now you can try some cast that your optimization would try to optimize return a; } Or void bar (type a) { a = (a mask) + bias; (or similarly) } Make sure to cover the boundary cases, where VR minimum or maximum still allow optimizing away zero and/or sign extensions, and another case where they are +- 1 and already don't allow it. Hi Jakub, For -fwrapv, it is due to how PROMOTE_MODE is defined in arm back-end. In the test-case, a function (which has signed char return type) returns -1 in one of the paths. ARM PROMOTE_MODE changes that to 255 and relies on zero/sign extension generated by RTL again for the correct value. I saw some other targets also defining similar think. I am therefore skipping removing zero/sign extension if the ssa variable can be set to negative integer constants. Hm? I think you should rather check that you are removing a sign-/zero-extension - PROMOTE_MODE tells you if it will sign- or zero-extend. Definitely + /* In some architectures, negative integer constants are truncated and + sign changed with target defined PROMOTE_MODE macro. This will impact + the value range seen here and produce wrong code if zero/sign extensions + are eliminated. Therefore, return false if this SSA can have negative + integers. */ + if (is_gimple_assign (stmt) + (TREE_CODE_CLASS (gimple_assign_rhs_code (stmt)) == tcc_unary)) +{ + tree rhs1 = gimple_assign_rhs1 (stmt); + if (TREE_CODE (rhs1) == INTEGER_CST + !TYPE_UNSIGNED (TREE_TYPE (ssa)) + tree_int_cst_compare (rhs1, integer_zero_node) == -1) + return false; looks completely bogus ... (an unary op with a constant operand?) instead you want to do sth like mode = TYPE_MODE (TREE_TYPE (ssa)); rhs_uns = TYPE_UNSIGNED (TREE_TYPE (ssa)); PROMOTE_MODE (mode, rhs_uns, TREE_TYPE (ssa)); instead of initializing rhs_uns from ssas type. That is, if PROMOTE_MODE tells you to promote _not_ according to ssas sign then honor that. As for the -fno-strict-overflow case, if the variables overflows, in VRP dumps, I see +INF(OVF), but the value range stored in ssa has TYPE_MAX. We therefore should limit the comparison to (TYPE_MIN VR_MIN VR_MAX TYPE_MAX) instead of (TYPE_MIN = VR_MIN VR_MAX = TYPE_MAX) when checking to be sure that this is not the overflowing case. Attached patch changes this. I don't think that's necessary - the overflow cases happen only when that overflow has undefined behavior, thus any valid program will have values = MAX. Richard. I have bootstrapped on x86_64-unknown-linux-gnu and regression tested for x86_64-unknown-linux-gnu, arm-none-linux-gnueabi (using qemu), aarch64_be-none-elf (Foundation model), aarch64-none-elf --with-abi=ilp32 (Foundation model) and s390x-ibm-linux (64bit, using qemu) with no new regression. Is this OK? Thanks, Kugan gcc/ 2014-07-07 Kugan Vivekanandarajah kug...@linaro.org * calls.c (precompute_arguments): Check is_promoted_for_type and set the promoted mode. (is_promoted_for_type): New function. (expand_expr_real_1): Check is_promoted_for_type and set the promoted mode. * expr.h (is_promoted_for_type): New function definition. * cfgexpand.c (expand_gimple_stmt_1): Call emit_move_insn if SUBREG is promoted with SRP_SIGNED_AND_UNSIGNED. gcc/testsuite 2014-07-07 Kugan Vivekanandarajah kug...@linaro.org * gcc.dg/zero_sign_ext_test.c: New test.
Re: [PATCH 2/2] Enable elimination of zext/sext
For -fwrapv I don't see why you'd get into trouble ever, the VRP computation should be well aware of the -fwrapv semantics and the value ranges should reflect that. For -fno-strict-overflow, I have no idea since it is very weirdly defined. In any case, for your example above, the loop is always well defined, because for char/short a++ is performed as: a = (short) ((int) a + 1) So, if the patch turns it into infinite loop, with -Os -fno-strict-overflow or -Os, it is simply a problem with the patch. VR [1, 32768] looks correct, a++ is performed only if a is = 0, therefore before addition [0, 32767]. But from VR [1, 32768] you can't optimize away the sign extension, make sure you don't have there off-by-one? It would be nice if the patch contained some testcases, it is easy to construct testcases where you have arbitrary VRs on some SSA_NAMEs, you just need something to stick the VR on, so you can do something like: type foo (type a) { if (a VR_min + 1 || a VR_max + 1) return; // If VR_min is type minimum or VR_max type maximum this needs to be adjusted of course. a = a + 1; // now you can try some cast that your optimization would try to optimize return a; } Or void bar (type a) { a = (a mask) + bias; (or similarly) } Make sure to cover the boundary cases, where VR minimum or maximum still allow optimizing away zero and/or sign extensions, and another case where they are +- 1 and already don't allow it. Hi Jakub, For -fwrapv, it is due to how PROMOTE_MODE is defined in arm back-end. In the test-case, a function (which has signed char return type) returns -1 in one of the paths. ARM PROMOTE_MODE changes that to 255 and relies on zero/sign extension generated by RTL again for the correct value. I saw some other targets also defining similar think. I am therefore skipping removing zero/sign extension if the ssa variable can be set to negative integer constants. As for the -fno-strict-overflow case, if the variables overflows, in VRP dumps, I see +INF(OVF), but the value range stored in ssa has TYPE_MAX. We therefore should limit the comparison to (TYPE_MIN VR_MIN VR_MAX TYPE_MAX) instead of (TYPE_MIN = VR_MIN VR_MAX = TYPE_MAX) when checking to be sure that this is not the overflowing case. Attached patch changes this. I have bootstrapped on x86_64-unknown-linux-gnu and regression tested for x86_64-unknown-linux-gnu, arm-none-linux-gnueabi (using qemu), aarch64_be-none-elf (Foundation model), aarch64-none-elf --with-abi=ilp32 (Foundation model) and s390x-ibm-linux (64bit, using qemu) with no new regression. Is this OK? Thanks, Kugan gcc/ 2014-07-07 Kugan Vivekanandarajah kug...@linaro.org * calls.c (precompute_arguments): Check is_promoted_for_type and set the promoted mode. (is_promoted_for_type): New function. (expand_expr_real_1): Check is_promoted_for_type and set the promoted mode. * expr.h (is_promoted_for_type): New function definition. * cfgexpand.c (expand_gimple_stmt_1): Call emit_move_insn if SUBREG is promoted with SRP_SIGNED_AND_UNSIGNED. gcc/testsuite 2014-07-07 Kugan Vivekanandarajah kug...@linaro.org * gcc.dg/zero_sign_ext_test.c: New test. diff --git a/gcc/calls.c b/gcc/calls.c index a3e6faa..eac512f 100644 --- a/gcc/calls.c +++ b/gcc/calls.c @@ -1484,7 +1484,10 @@ precompute_arguments (int num_actuals, struct arg_data *args) args[i].initial_value = gen_lowpart_SUBREG (mode, args[i].value); SUBREG_PROMOTED_VAR_P (args[i].initial_value) = 1; - SUBREG_PROMOTED_SET (args[i].initial_value, args[i].unsignedp); + if (is_promoted_for_type (args[i].tree_value, mode, !args[i].unsignedp)) + SUBREG_PROMOTED_SET (args[i].initial_value, SRP_SIGNED_AND_UNSIGNED); + else + SUBREG_PROMOTED_SET (args[i].initial_value, args[i].unsignedp); } } } diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c index b7a34a2..ac6776d 100644 --- a/gcc/cfgexpand.c +++ b/gcc/cfgexpand.c @@ -3309,7 +3309,13 @@ expand_gimple_stmt_1 (gimple stmt) GET_MODE (target), temp, unsignedp); } - convert_move (SUBREG_REG (target), temp, unsignedp); + if ((SUBREG_PROMOTED_GET (target) == SRP_SIGNED_AND_UNSIGNED) +(GET_CODE (temp) == SUBREG) +(GET_MODE (target) == GET_MODE (temp)) +(GET_MODE (SUBREG_REG (target)) == GET_MODE (SUBREG_REG (temp + emit_move_insn (SUBREG_REG (target), SUBREG_REG (temp)); + else + convert_move (SUBREG_REG (target), temp, unsignedp); } else if (nontemporal emit_storent_insn (target, temp)) ; diff --git a/gcc/expr.c b/gcc/expr.c index 10f4a96..68708c1 100644 --- a/gcc/expr.c +++ b/gcc/expr.c @@ -68,6 +68,7 @@ along
Re: [PATCH 2/2] Enable elimination of zext/sext
On 24/06/14 22:21, Jakub Jelinek wrote: On Tue, Jun 24, 2014 at 09:53:35PM +1000, Kugan wrote: 2014-06-24 Kugan Vivekanandarajah kug...@linaro.org * gcc/calls.c (precompute_arguments: Check is_promoted_for_type and set the promoted mode. (is_promoted_for_type) : New function. (expand_expr_real_1) : Check is_promoted_for_type and set the promoted mode. * gcc/expr.h (is_promoted_for_type) : New function definition. * gcc/cfgexpand.c (expand_gimple_stmt_1) : Call emit_move_insn if SUBREG is promoted with SRP_SIGNED_AND_UNSIGNED. Similarly to the other patch, no gcc/ prefix in ChangeLog, no space before :, watch for too long lines, remove useless ()s around conditions. Changed it. +bool +is_promoted_for_type (tree ssa, enum machine_mode lhs_mode, bool lhs_uns) +{ + wide_int type_min, type_max; + wide_int min, max, limit; + unsigned int prec; + tree lhs_type; + bool rhs_uns; + + if (flag_wrapv Why? + || (flag_strict_overflow == false) Why? Also, that would be !flag_strict_overflow instead of (flag_strict_overflow == false) For these flags, value ranges generated are not usable for extension eliminations. Therefore, without this some of the test cases in regression fails. For example: short a; void foo (void) { for (a = 0; a = 0; a++) ; } -Os -fno-strict-overflow produces the following range for the index increment and hence goes into infinite loop. _10: [1, 32768] _10 = _4 + 1; + || (ssa == NULL_TREE) + || (TREE_CODE (ssa) != SSA_NAME) + || !INTEGRAL_TYPE_P (TREE_TYPE (ssa)) + || POINTER_TYPE_P (TREE_TYPE (ssa))) All pointer types are !INTEGRAL_TYPE_P, so the last condition doesn't make any sense. I have changed this. Please see the attached patch. Thanks, Kugan gcc/ 2014-06-25 Kugan Vivekanandarajah kug...@linaro.org * calls.c (precompute_arguments): Check is_promoted_for_type and set the promoted mode. (is_promoted_for_type): New function. (expand_expr_real_1): Check is_promoted_for_type and set the promoted mode. * expr.h (is_promoted_for_type): New function definition. * cfgexpand.c (expand_gimple_stmt_1): Call emit_move_insn if SUBREG is promoted with SRP_SIGNED_AND_UNSIGNED. diff --git a/gcc/calls.c b/gcc/calls.c index a3e6faa..eac512f 100644 --- a/gcc/calls.c +++ b/gcc/calls.c @@ -1484,7 +1484,10 @@ precompute_arguments (int num_actuals, struct arg_data *args) args[i].initial_value = gen_lowpart_SUBREG (mode, args[i].value); SUBREG_PROMOTED_VAR_P (args[i].initial_value) = 1; - SUBREG_PROMOTED_SET (args[i].initial_value, args[i].unsignedp); + if (is_promoted_for_type (args[i].tree_value, mode, !args[i].unsignedp)) + SUBREG_PROMOTED_SET (args[i].initial_value, SRP_SIGNED_AND_UNSIGNED); + else + SUBREG_PROMOTED_SET (args[i].initial_value, args[i].unsignedp); } } } diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c index e8cd87f..0540b4d 100644 --- a/gcc/cfgexpand.c +++ b/gcc/cfgexpand.c @@ -3309,7 +3309,13 @@ expand_gimple_stmt_1 (gimple stmt) GET_MODE (target), temp, unsignedp); } - convert_move (SUBREG_REG (target), temp, unsignedp); + if ((SUBREG_PROMOTED_GET (target) == SRP_SIGNED_AND_UNSIGNED) +(GET_CODE (temp) == SUBREG) +(GET_MODE (target) == GET_MODE (temp)) +(GET_MODE (SUBREG_REG (target)) == GET_MODE (SUBREG_REG (temp + emit_move_insn (SUBREG_REG (target), SUBREG_REG (temp)); + else + convert_move (SUBREG_REG (target), temp, unsignedp); } else if (nontemporal emit_storent_insn (target, temp)) ; diff --git a/gcc/expr.c b/gcc/expr.c index f9103a5..15da092 100644 --- a/gcc/expr.c +++ b/gcc/expr.c @@ -9210,6 +9210,59 @@ expand_expr_real_2 (sepops ops, rtx target, enum machine_mode tmode, } #undef REDUCE_BIT_FIELD +/* Return TRUE if value in SSA is already zero/sign extended for lhs type + (type here is the combination of LHS_MODE and LHS_UNS) using value range + information stored. Return FALSE otherwise. */ +bool +is_promoted_for_type (tree ssa, enum machine_mode lhs_mode, bool lhs_uns) +{ + wide_int type_min, type_max; + wide_int min, max, limit; + unsigned int prec; + tree lhs_type; + bool rhs_uns; + + if (flag_wrapv + || !flag_strict_overflow + || ssa == NULL_TREE + || TREE_CODE (ssa) != SSA_NAME + || !INTEGRAL_TYPE_P (TREE_TYPE (ssa))) +return false; + + /* Return FALSE if value_range is not recorded for SSA. */ + if (get_range_info (ssa, min, max) != VR_RANGE) +return false; + + lhs_type = lang_hooks.types.type_for_mode (lhs_mode, lhs_uns); + rhs_uns = TYPE_UNSIGNED (TREE_TYPE
Re: [PATCH 2/2] Enable elimination of zext/sext
On Wed, Jun 25, 2014 at 06:14:57PM +1000, Kugan wrote: For these flags, value ranges generated are not usable for extension eliminations. Therefore, without this some of the test cases in regression fails. For example: short a; void foo (void) { for (a = 0; a = 0; a++) ; } -Os -fno-strict-overflow produces the following range for the index increment and hence goes into infinite loop. _10: [1, 32768] _10 = _4 + 1; For -fwrapv I don't see why you'd get into trouble ever, the VRP computation should be well aware of the -fwrapv semantics and the value ranges should reflect that. For -fno-strict-overflow, I have no idea since it is very weirdly defined. In any case, for your example above, the loop is always well defined, because for char/short a++ is performed as: a = (short) ((int) a + 1) So, if the patch turns it into infinite loop, with -Os -fno-strict-overflow or -Os, it is simply a problem with the patch. VR [1, 32768] looks correct, a++ is performed only if a is = 0, therefore before addition [0, 32767]. But from VR [1, 32768] you can't optimize away the sign extension, make sure you don't have there off-by-one? It would be nice if the patch contained some testcases, it is easy to construct testcases where you have arbitrary VRs on some SSA_NAMEs, you just need something to stick the VR on, so you can do something like: type foo (type a) { if (a VR_min + 1 || a VR_max + 1) return; // If VR_min is type minimum or VR_max type maximum this needs to be adjusted of course. a = a + 1; // now you can try some cast that your optimization would try to optimize return a; } Or void bar (type a) { a = (a mask) + bias; (or similarly) } Make sure to cover the boundary cases, where VR minimum or maximum still allow optimizing away zero and/or sign extensions, and another case where they are +- 1 and already don't allow it. Jakub
Re: [PATCH 2/2] Enable elimination of zext/sext
On Tue, Jun 24, 2014 at 09:53:35PM +1000, Kugan wrote: 2014-06-24 Kugan Vivekanandarajah kug...@linaro.org * gcc/calls.c (precompute_arguments: Check is_promoted_for_type and set the promoted mode. (is_promoted_for_type) : New function. (expand_expr_real_1) : Check is_promoted_for_type and set the promoted mode. * gcc/expr.h (is_promoted_for_type) : New function definition. * gcc/cfgexpand.c (expand_gimple_stmt_1) : Call emit_move_insn if SUBREG is promoted with SRP_SIGNED_AND_UNSIGNED. Similarly to the other patch, no gcc/ prefix in ChangeLog, no space before :, watch for too long lines, remove useless ()s around conditions. +bool +is_promoted_for_type (tree ssa, enum machine_mode lhs_mode, bool lhs_uns) +{ + wide_int type_min, type_max; + wide_int min, max, limit; + unsigned int prec; + tree lhs_type; + bool rhs_uns; + + if (flag_wrapv Why? + || (flag_strict_overflow == false) Why? Also, that would be !flag_strict_overflow instead of (flag_strict_overflow == false) + || (ssa == NULL_TREE) + || (TREE_CODE (ssa) != SSA_NAME) + || !INTEGRAL_TYPE_P (TREE_TYPE (ssa)) + || POINTER_TYPE_P (TREE_TYPE (ssa))) All pointer types are !INTEGRAL_TYPE_P, so the last condition doesn't make any sense. Jakub