Re: [PATCH GCC 9/9]Prove no-overflow in computation of LOOP_VINFO_NITERS and improve code generation
On Mon, Sep 12, 2016 at 8:58 PM, Jeff Law wrote:
> On 09/06/2016 12:54 PM, Bin Cheng wrote:
>>
>> Hi,
>> LOOP_VINFO_NITERS is computed as LOOP_VINFO_NITERSM1 + 1, which could
>> overflow in loop niters' type. Vectorizer needs to generate more code
>> computing vectorized niters if overflow does happen. However, For common
>> loops, there is no overflow actually, this patch tries to prove the
>> no-overflow information and use that to improve code generation. At the
>> moment, no-overflow information comes either from loop niter analysis, or
>> the truth that we know loop is peeled for non-zero iterations in prologue
>> peeling. For the latter case, it doesn't matter if the original
>> LOOP_VINFO_NITERS overflows or not, because computation LOOP_VINFO_NITERS -
>> LOOP_VINFO_PEELING_FOR_ALIGNMENT cancels the overflow by underflow.
>>
>> Thanks,
>> bin
>>
>> 2016-09-01 Bin Cheng
>>
>> * tree-vect-loop.c (loop_niters_no_overflow): New func.
>> (vect_transform_loop): Call loop_niters_no_overflow. Pass the
>> no-overflow information to vect_do_peeling_for_loop_bound and
>> vect_gen_vector_loop_niters.
>>
> OK when prereqs are all approved.

Hi,
I revised this patch to use widest_int comparison on trees rather than int.
The attached new patch is committed. I also committed all patches in the peel
refactoring patch set; they are posted at:
  https://gcc.gnu.org/ml/gcc-patches/2016-09/msg00326.html
  https://gcc.gnu.org/ml/gcc-patches/2016-10/msg01012.html
  https://gcc.gnu.org/ml/gcc-patches/2016-09/msg00328.html
  https://gcc.gnu.org/ml/gcc-patches/2016-09/msg00329.html
  https://gcc.gnu.org/ml/gcc-patches/2016-09/msg00330.html
  https://gcc.gnu.org/ml/gcc-patches/2016-09/msg00331.html
  https://gcc.gnu.org/ml/gcc-patches/2016-09/msg00332.html
  https://gcc.gnu.org/ml/gcc-patches/2016-09/msg00333.html
The patch set was bootstrapped and tested again on x86_64 and AArch64. No
regressions were found. I will keep an eye on possible fallout.

Thanks,
bin
>
> jeff

diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index 0470445..9cca9b7 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -6620,6 +6620,39 @@ vect_loop_kill_debug_uses (struct loop *loop, gimple *stmt)
     }
 }
 
+/* Given loop represented by LOOP_VINFO, return true if computation of
+   LOOP_VINFO_NITERS (= LOOP_VINFO_NITERSM1 + 1) doesn't overflow, false
+   otherwise. */
+
+static bool
+loop_niters_no_overflow (loop_vec_info loop_vinfo)
+{
+  /* Constant case. */
+  if (LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo))
+    {
+      tree cst_niters = LOOP_VINFO_NITERS (loop_vinfo);
+      tree cst_nitersm1 = LOOP_VINFO_NITERSM1 (loop_vinfo);
+
+      gcc_assert (TREE_CODE (cst_niters) == INTEGER_CST);
+      gcc_assert (TREE_CODE (cst_nitersm1) == INTEGER_CST);
+      if (wi::to_widest (cst_nitersm1) < wi::to_widest (cst_niters))
+        return true;
+    }
+
+  widest_int max;
+  struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
+  /* Check the upper bound of loop niters. */
+  if (get_max_loop_iterations (loop, &max))
+    {
+      tree type = TREE_TYPE (LOOP_VINFO_NITERS (loop_vinfo));
+      signop sgn = TYPE_SIGN (type);
+      widest_int type_max = widest_int::from (wi::max_value (type), sgn);
+      if (max < type_max)
+        return true;
+    }
+  return false;
+}
+
 /* Function vect_transform_loop.
 
    The analysis phase has determined that the loop is vectorizable.
@@ -6707,8 +6740,9 @@ vect_transform_loop (loop_vec_info loop_vinfo)
   tree niters = vect_build_loop_niters (loop_vinfo);
   LOOP_VINFO_NITERS_UNCHANGED (loop_vinfo) = niters;
   tree nitersm1 = unshare_expr (LOOP_VINFO_NITERSM1 (loop_vinfo));
+  bool niters_no_overflow = loop_niters_no_overflow (loop_vinfo);
   vect_do_peeling (loop_vinfo, niters, nitersm1, &niters_vector, th,
-                   check_profitability, false);
+                   check_profitability, niters_no_overflow);
   if (niters_vector == NULL_TREE)
     {
       if (LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo))
@@ -6717,7 +6751,7 @@ vect_transform_loop (loop_vec_info loop_vinfo)
                            LOOP_VINFO_INT_NITERS (loop_vinfo) / vf);
       else
         vect_gen_vector_loop_niters (loop_vinfo, niters, &niters_vector,
-                                     false);
+                                     niters_no_overflow);
     }
 
   /* 1) Make sure the loop header has exactly two entries
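For readers following along, here is a minimal sketch of why the no-overflow flag can save generated code. It is not the GCC implementation: it assumes a 32-bit unsigned niters type and a power-of-two vectorization factor greater than one, and all names (niters_t, vector_niters_cheap, vector_niters_safe) are hypothetical. The wrap-safe form additionally assumes the vector loop runs at least once.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the loop niters type (assumed 32-bit unsigned).
using niters_t = std::uint32_t;

// NITERS = NITERSM1 + 1, computed in the niters type.  It wraps to 0 when
// NITERSM1 is the type maximum.
inline niters_t niters_from_nitersm1 (niters_t nitersm1)
{
  return static_cast<niters_t> (nitersm1 + 1u);
}

// Cheap form: a single shift.  Only correct when NITERS did not wrap.
inline niters_t vector_niters_cheap (niters_t niters, unsigned log2_vf)
{
  assert (log2_vf < 32);
  return niters >> log2_vf;
}

// Wrap-safe form: one extra subtract and one extra add.  Assumes the true
// iteration count is at least VF, and VF > 1 if NITERS wrapped.
inline niters_t vector_niters_safe (niters_t niters, unsigned log2_vf)
{
  assert (log2_vf < 32);
  const niters_t vf = niters_t (1) << log2_vf;
  const niters_t shifted = static_cast<niters_t> (niters - vf) >> log2_vf;
  return static_cast<niters_t> (shifted + 1u);
}
```

With NITERSM1 equal to the type maximum and VF = 4, the stored NITERS is 0: the cheap form yields 0 vector iterations, while the wrap-safe form yields 2^30. When the wrap is ruled out, both forms agree and the cheaper one suffices.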
Re: [PATCH GCC 9/9]Prove no-overflow in computation of LOOP_VINFO_NITERS and improve code generation
On 09/06/2016 12:54 PM, Bin Cheng wrote:
> Hi,
> LOOP_VINFO_NITERS is computed as LOOP_VINFO_NITERSM1 + 1, which could
> overflow in loop niters' type. Vectorizer needs to generate more code
> computing vectorized niters if overflow does happen. However, For common
> loops, there is no overflow actually, this patch tries to prove the
> no-overflow information and use that to improve code generation. At the
> moment, no-overflow information comes either from loop niter analysis, or
> the truth that we know loop is peeled for non-zero iterations in prologue
> peeling. For the latter case, it doesn't matter if the original
> LOOP_VINFO_NITERS overflows or not, because computation LOOP_VINFO_NITERS -
> LOOP_VINFO_PEELING_FOR_ALIGNMENT cancels the overflow by underflow.
>
> Thanks,
> bin
>
> 2016-09-01 Bin Cheng
>
> * tree-vect-loop.c (loop_niters_no_overflow): New func.
> (vect_transform_loop): Call loop_niters_no_overflow. Pass the
> no-overflow information to vect_do_peeling_for_loop_bound and
> vect_gen_vector_loop_niters.

OK when prereqs are all approved.

jeff
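The "cancels the overflow by underflow" remark in the quoted description can be checked with a small model. This is only a sketch under stated assumptions: a 32-bit unsigned niters type, a 64-bit type standing in for an exact count, and hypothetical helper names that do not appear in GCC.

```cpp
#include <cassert>
#include <cstdint>

using niters_t = std::uint32_t;  // hypothetical niters type
using wide_t = std::uint64_t;    // wide enough to hold the true count

// Exact number of iterations left after the prologue peels PEEL of them.
inline wide_t remaining_exact (niters_t nitersm1, niters_t peel)
{
  return static_cast<wide_t> (nitersm1) + 1u - peel;
}

// The same quantity computed entirely in the niters type: NITERS may wrap
// to 0, and subtracting a non-zero PEEL wraps it back.
inline niters_t remaining_in_niters_type (niters_t nitersm1, niters_t peel)
{
  // PEEL must not exceed the true iteration count.
  assert (static_cast<wide_t> (peel) <= static_cast<wide_t> (nitersm1) + 1u);
  const niters_t niters = static_cast<niters_t> (nitersm1 + 1u);
  return static_cast<niters_t> (niters - peel);
}
```

For NITERSM1 at the type maximum and PEEL = 3, both functions give 4294967293. With PEEL = 0 the narrow result is 0 while the exact count is 2^32, which is why the argument only holds when a non-zero number of iterations is known to be peeled.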
Re: [PATCH GCC 9/9]Prove no-overflow in computation of LOOP_VINFO_NITERS and improve code generation
Hi Bin,

On 07/09/16 17:52, Bin.Cheng wrote:
> On Wed, Sep 7, 2016 at 1:10 AM, kugan wrote:
>> Hi Bin,
>>
>> On 07/09/16 04:54, Bin Cheng wrote:
>>> Hi,
>>> LOOP_VINFO_NITERS is computed as LOOP_VINFO_NITERSM1 + 1, which could
>>> overflow in loop niters' type. Vectorizer needs to generate more code
>>> computing vectorized niters if overflow does happen. However, For common
>>> loops, there is no overflow actually, this patch tries to prove the
>>> no-overflow information and use that to improve code generation. At the
>>> moment, no-overflow information comes either from loop niter analysis, or
>>> the truth that we know loop is peeled for non-zero iterations in prologue
>>> peeling. For the latter case, it doesn't matter if the original
>>> LOOP_VINFO_NITERS overflows or not, because computation LOOP_VINFO_NITERS -
>>> LOOP_VINFO_PEELING_FOR_ALIGNMENT cancels the overflow by underflow.
>>>
>>> Thanks,
>>> bin
>>>
>>> 2016-09-01 Bin Cheng
>>>
>>> * tree-vect-loop.c (loop_niters_no_overflow): New func.
>>> (vect_transform_loop): Call loop_niters_no_overflow. Pass the
>>> no-overflow information to vect_do_peeling_for_loop_bound and
>>> vect_gen_vector_loop_niters.
>>>
>>> 009-prove-no_overflow-for-vect-niters-20160902.txt
>>>
>>> diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
>>> index 0d37f55..2ef1f9b 100644
>>> --- a/gcc/tree-vect-loop.c
>>> +++ b/gcc/tree-vect-loop.c
>>> @@ -6610,6 +6610,38 @@ vect_loop_kill_debug_uses (struct loop *loop, gimple *stmt)
>>>      }
>>>  }
>>>
>>> +/* Given loop represented by LOOP_VINFO, return true if computation of
>>> +   LOOP_VINFO_NITERS (= LOOP_VINFO_NITERSM1 + 1) doesn't overflow, false
>>> +   otherwise. */
>>> +
>>> +static bool
>>> +loop_niters_no_overflow (loop_vec_info loop_vinfo)
>>> +{
>>> +  /* Constant case. */
>>> +  if (LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo))
>>> +    {
>>> +      int cst_niters = LOOP_VINFO_INT_NITERS (loop_vinfo);
>>
>> Wouldn't it truncate by assigning this to int?
>
> Probably, now I think it's unnecessary to use int version niters here,
> LOOP_VINFO_NITERS can be used directly.
>
>>> +      tree cst_nitersm1 = LOOP_VINFO_NITERSM1 (loop_vinfo);
>>> +
>>> +      gcc_assert (TREE_CODE (cst_nitersm1) == INTEGER_CST);
>>> +      if (wi::to_widest (cst_nitersm1) < cst_niters)
>>
>> Shouldn't you have do the addition and comparison in the type of the loop
>> index instead of widest_int to see if that overflows?
>
> You mean the type of loop niters? NITERS is computed from NITERSM1 + 1, I
> don't think we need to do it again here.

Imagine that you have LOOP_VINFO_NITERSM1 as TYPE_MAX (loop niters type). In
this case, when you add 1, it will overflow in loop niters type but not when
you do the computation in widest_int. But, as you said, if NITERS is already
computed in loop niters type, yes this compare should be sufficient.

You could do the comparison as wide_int or tree. I think, this would make it
clearer.

Thanks,
Kugan

> Thanks,
> bin
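The point settled in this exchange can be illustrated with a small model, assuming a 32-bit unsigned niters type and a 64-bit type standing in for widest_int. The helper names are hypothetical and nothing here is GCC code: comparing NITERSM1 with the NITERS value already computed in the narrow type detects the wrap, whereas redoing the addition in the wide type cannot.

```cpp
#include <cassert>
#include <cstdint>

using niters_t = std::uint32_t;  // hypothetical niters type
using wide_t = std::uint64_t;    // stand-in for a wider type such as widest_int

// NITERS as already stored: NITERSM1 + 1 evaluated in the niters type.
inline niters_t stored_niters (niters_t nitersm1)
{
  return static_cast<niters_t> (nitersm1 + 1u);
}

// Check in the spirit of the thread: widen both stored values and compare.
// The result is false exactly when the narrow addition wrapped to 0.
inline bool no_overflow_by_compare (niters_t nitersm1)
{
  return static_cast<wide_t> (nitersm1)
         < static_cast<wide_t> (stored_niters (nitersm1));
}

// A check that does not work: the wide addition never wraps for 32-bit
// inputs, so it cannot observe overflow in the narrow type.
inline bool wide_add_wraps (niters_t nitersm1)
{
  const wide_t wide = static_cast<wide_t> (nitersm1);
  return wide + 1u < wide;  // always false here
}
```

For NITERSM1 at the type maximum, no_overflow_by_compare returns false (the wrap is seen), while wide_add_wraps also returns false (the wrap is missed).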
Re: [PATCH GCC 9/9]Prove no-overflow in computation of LOOP_VINFO_NITERS and improve code generation
On Wed, Sep 7, 2016 at 1:10 AM, kugan wrote:
> Hi Bin,
>
>
> On 07/09/16 04:54, Bin Cheng wrote:
>>
>> Hi,
>> LOOP_VINFO_NITERS is computed as LOOP_VINFO_NITERSM1 + 1, which could
>> overflow in loop niters' type. Vectorizer needs to generate more code
>> computing vectorized niters if overflow does happen. However, For common
>> loops, there is no overflow actually, this patch tries to prove the
>> no-overflow information and use that to improve code generation. At the
>> moment, no-overflow information comes either from loop niter analysis, or
>> the truth that we know loop is peeled for non-zero iterations in prologue
>> peeling. For the latter case, it doesn't matter if the original
>> LOOP_VINFO_NITERS overflows or not, because computation LOOP_VINFO_NITERS -
>> LOOP_VINFO_PEELING_FOR_ALIGNMENT cancels the overflow by underflow.
>>
>> Thanks,
>> bin
>>
>> 2016-09-01 Bin Cheng
>>
>> * tree-vect-loop.c (loop_niters_no_overflow): New func.
>> (vect_transform_loop): Call loop_niters_no_overflow. Pass the
>> no-overflow information to vect_do_peeling_for_loop_bound and
>> vect_gen_vector_loop_niters.
>>
>>
>> 009-prove-no_overflow-for-vect-niters-20160902.txt
>>
>>
>> diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
>> index 0d37f55..2ef1f9b 100644
>> --- a/gcc/tree-vect-loop.c
>> +++ b/gcc/tree-vect-loop.c
>> @@ -6610,6 +6610,38 @@ vect_loop_kill_debug_uses (struct loop *loop,
>> gimple *stmt)
>>      }
>>  }
>>
>> +/* Given loop represented by LOOP_VINFO, return true if computation of
>> +   LOOP_VINFO_NITERS (= LOOP_VINFO_NITERSM1 + 1) doesn't overflow, false
>> +   otherwise. */
>> +
>> +static bool
>> +loop_niters_no_overflow (loop_vec_info loop_vinfo)
>> +{
>> +  /* Constant case. */
>> +  if (LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo))
>> +    {
>> +      int cst_niters = LOOP_VINFO_INT_NITERS (loop_vinfo);
>
>
> Wouldn't it truncate by assigning this to int?

Probably, now I think it's unnecessary to use int version niters here,
LOOP_VINFO_NITERS can be used directly.

>
>
>> +      tree cst_nitersm1 = LOOP_VINFO_NITERSM1 (loop_vinfo);
>> +
>> +      gcc_assert (TREE_CODE (cst_nitersm1) == INTEGER_CST);
>> +      if (wi::to_widest (cst_nitersm1) < cst_niters)
>
>
> Shouldn't you have do the addition and comparison in the type of the loop
> index instead of widest_int to see if that overflows?

You mean the type of loop niters? NITERS is computed from NITERSM1 + 1, I
don't think we need to do it again here.

Thanks,
bin
Re: [PATCH GCC 9/9]Prove no-overflow in computation of LOOP_VINFO_NITERS and improve code generation
Hi Bin,

On 07/09/16 04:54, Bin Cheng wrote:
> Hi,
> LOOP_VINFO_NITERS is computed as LOOP_VINFO_NITERSM1 + 1, which could
> overflow in loop niters' type. Vectorizer needs to generate more code
> computing vectorized niters if overflow does happen. However, For common
> loops, there is no overflow actually, this patch tries to prove the
> no-overflow information and use that to improve code generation. At the
> moment, no-overflow information comes either from loop niter analysis, or
> the truth that we know loop is peeled for non-zero iterations in prologue
> peeling. For the latter case, it doesn't matter if the original
> LOOP_VINFO_NITERS overflows or not, because computation LOOP_VINFO_NITERS -
> LOOP_VINFO_PEELING_FOR_ALIGNMENT cancels the overflow by underflow.
>
> Thanks,
> bin
>
> 2016-09-01 Bin Cheng
>
> * tree-vect-loop.c (loop_niters_no_overflow): New func.
> (vect_transform_loop): Call loop_niters_no_overflow. Pass the
> no-overflow information to vect_do_peeling_for_loop_bound and
> vect_gen_vector_loop_niters.
>
> 009-prove-no_overflow-for-vect-niters-20160902.txt
>
> diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
> index 0d37f55..2ef1f9b 100644
> --- a/gcc/tree-vect-loop.c
> +++ b/gcc/tree-vect-loop.c
> @@ -6610,6 +6610,38 @@ vect_loop_kill_debug_uses (struct loop *loop, gimple *stmt)
>      }
>  }
>
> +/* Given loop represented by LOOP_VINFO, return true if computation of
> +   LOOP_VINFO_NITERS (= LOOP_VINFO_NITERSM1 + 1) doesn't overflow, false
> +   otherwise. */
> +
> +static bool
> +loop_niters_no_overflow (loop_vec_info loop_vinfo)
> +{
> +  /* Constant case. */
> +  if (LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo))
> +    {
> +      int cst_niters = LOOP_VINFO_INT_NITERS (loop_vinfo);

Wouldn't it truncate by assigning this to int?

> +      tree cst_nitersm1 = LOOP_VINFO_NITERSM1 (loop_vinfo);
> +
> +      gcc_assert (TREE_CODE (cst_nitersm1) == INTEGER_CST);
> +      if (wi::to_widest (cst_nitersm1) < cst_niters)

Shouldn't you have do the addition and comparison in the type of the loop
index instead of widest_int to see if that overflows?

Thanks,
Kugan

> +        return true;
> +    }
> +
> +  widest_int max;
> +  struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
> +  /* Check the upper bound of loop niters. */
> +  if (get_max_loop_iterations (loop, &max))
> +    {
> +      tree type = TREE_TYPE (LOOP_VINFO_NITERS (loop_vinfo));
> +      signop sgn = TYPE_SIGN (type);
> +      widest_int type_max = widest_int::from (wi::max_value (type), sgn);
> +      if (max < type_max)
> +        return true;
> +    }
> +  return false;
> +}
> +
>  /* Function vect_transform_loop.
>
>     The analysis phase has determined that the loop is vectorizable.
> @@ -6697,8 +6729,9 @@ vect_transform_loop (loop_vec_info loop_vinfo)
>    tree niters = vect_build_loop_niters (loop_vinfo);
>    LOOP_VINFO_NITERS_UNCHANGED (loop_vinfo) = niters;
>    tree nitersm1 = unshare_expr (LOOP_VINFO_NITERSM1 (loop_vinfo));
> +  bool niters_no_overflow = loop_niters_no_overflow (loop_vinfo);
>    vect_do_peeling (loop_vinfo, niters, nitersm1, &niters_vector, th,
> -                   check_profitability, false);
> +                   check_profitability, niters_no_overflow);
>    if (niters_vector == NULL_TREE)
>      {
>        if (LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo))
> @@ -6707,7 +6740,7 @@ vect_transform_loop (loop_vec_info loop_vinfo)
>                            LOOP_VINFO_INT_NITERS (loop_vinfo) / vf);
>        else
>          vect_gen_vector_loop_niters (loop_vinfo, niters, &niters_vector,
> -                                     false);
> +                                     niters_no_overflow);
>      }
>
>    /* 1) Make sure the loop header has exactly two entries
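The second half of the quoted function (the upper-bound check) can be modelled the same way. This sketch assumes the bound reported by niter analysis limits the latch execution count (that is, NITERSM1), that the niters type is 32-bit unsigned, and that a 64-bit integer can stand in for widest_int. The function name is hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

using niters_t = std::uint32_t;  // hypothetical niters type
using wide_t = std::uint64_t;    // stand-in for widest_int

// If the iteration bound is strictly below the maximum of the niters type,
// then bound + 1 still fits in that type, so NITERSM1 + 1 cannot wrap.
inline bool bound_proves_no_overflow (wide_t max_iters)
{
  return max_iters < std::numeric_limits<niters_t>::max ();
}
```

A bound equal to the type maximum proves nothing, since NITERSM1 could then be the maximum itself; that is why the comparison is strict.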