Re: [PATCH GCC]Catch more MEM_REFs sharing common addressing part in gimple strength reduction
On 09/18/13 02:26, bin.cheng wrote:

-----Original Message-----
From: Dominique Dhumieres [mailto:domi...@lps.ens.fr]
Sent: Wednesday, September 18, 2013 1:47 AM
To: gcc-patches@gcc.gnu.org
Cc: hjl.to...@gmail.com; Bin Cheng
Subject: Re: [PATCH GCC]Catch more MEM_REFs sharing common addressing part in gimple strength reduction

The new test gcc.dg/tree-ssa/slsr-39.c fails in 64-bit mode (see
http://gcc.gnu.org/ml/gcc-regression/2013-09/msg00455.html ).
Looking for "MEM" in the dump returns

  _12 = MEM[(int[50] *)_17];
  MEM[(int[50] *)_20] = _13;

Thanks for reporting.  I think this can be fixed by the patch:
http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00761.html

Just a quick update on the patch: the proposed patch didn't pass the x86_64 bootstrap, and I'm working on a better fix.

Thanks,
Yufeng
Re: [PATCH GCC]Catch more MEM_REFs sharing common addressing part in gimple strength reduction
The new test gcc.dg/tree-ssa/slsr-39.c fails in 64 bit mode (see
http://gcc.gnu.org/ml/gcc-regression/2013-09/msg00455.html ).
Looking for "MEM" in the dump returns

  _12 = MEM[(int[50] *)_17];
  MEM[(int[50] *)_20] = _13;

TIA

Dominique
RE: [PATCH GCC]Catch more MEM_REFs sharing common addressing part in gimple strength reduction
-----Original Message-----
From: Dominique Dhumieres [mailto:domi...@lps.ens.fr]
Sent: Wednesday, September 18, 2013 1:47 AM
To: gcc-patches@gcc.gnu.org
Cc: hjl.to...@gmail.com; Bin Cheng
Subject: Re: [PATCH GCC]Catch more MEM_REFs sharing common addressing part in gimple strength reduction

The new test gcc.dg/tree-ssa/slsr-39.c fails in 64-bit mode (see
http://gcc.gnu.org/ml/gcc-regression/2013-09/msg00455.html ).
Looking for "MEM" in the dump returns

  _12 = MEM[(int[50] *)_17];
  MEM[(int[50] *)_20] = _13;

Thanks for reporting.  I think this can be fixed by the patch:
http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00761.html

Thanks.
bin
RE: [PATCH GCC]Catch more MEM_REFs sharing common addressing part in gimple strength reduction
-----Original Message-----
From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-ow...@gcc.gnu.org] On Behalf Of bin.cheng
Sent: Thursday, September 12, 2013 2:13 PM
To: 'Bill Schmidt'; Yufeng Zhang; Yufeng Zhang
Cc: Richard Biener; GCC Patches
Subject: RE: [PATCH GCC]Catch more MEM_REFs sharing common addressing part in gimple strength reduction

On Tue, Sep 10, 2013 at 9:30 PM, Bill Schmidt wschm...@linux.vnet.ibm.com wrote:
On Tue, 2013-09-10 at 15:41 +0800, bin.cheng wrote:
On Mon, Sep 9, 2013 at 11:35 PM, Bill Schmidt wschm...@linux.vnet.ibm.com wrote:

I rely on size_binop to convert T2 into sizetype, because T2' may be of some other type; otherwise there will be an ssa_verify error later.

OK, I see now.  I had thought this was handled by fold_build2, but apparently not.  I guess all T2's formerly handled were already sizetype as expected.  Thanks for the explanation!

So, wouldn't it suffice to change t2 to fold_convert (sizetype, t2) in the argument list to fold_build2?  It's picking nits, but that would be slightly more efficient.

Hi Bill,
This is the 2nd version of the patch with your comments incorporated.  Bootstrapped and re-tested on x86; re-test on ARM ongoing.  Is it OK if the tests pass?

Looks good to me!  Thanks, Bin.

Sorry, I have to hold on this patch since it causes several test failures on ARM.  Will investigate and get back ASAP.

The reported failure is a false alarm and happens on trunk too; I must have compared the wrong testing results.  Since there is no regression and the patch was approved before, I will apply it to trunk.

Thanks.
bin
RE: [PATCH GCC]Catch more MEM_REFs sharing common addressing part in gimple strength reduction
On Tue, Sep 10, 2013 at 9:30 PM, Bill Schmidt wschm...@linux.vnet.ibm.com wrote:
On Tue, 2013-09-10 at 15:41 +0800, bin.cheng wrote:
On Mon, Sep 9, 2013 at 11:35 PM, Bill Schmidt wschm...@linux.vnet.ibm.com wrote:

I rely on size_binop to convert T2 into sizetype, because T2' may be of some other type; otherwise there will be an ssa_verify error later.

OK, I see now.  I had thought this was handled by fold_build2, but apparently not.  I guess all T2's formerly handled were already sizetype as expected.  Thanks for the explanation!

So, wouldn't it suffice to change t2 to fold_convert (sizetype, t2) in the argument list to fold_build2?  It's picking nits, but that would be slightly more efficient.

Hi Bill,
This is the 2nd version of the patch with your comments incorporated.  Bootstrapped and re-tested on x86; re-test on ARM ongoing.  Is it OK if the tests pass?

Looks good to me!  Thanks, Bin.

Sorry, I have to hold on this patch since it causes several test failures on ARM.  Will investigate and get back ASAP.

Thanks.
bin
RE: [PATCH GCC]Catch more MEM_REFs sharing common addressing part in gimple strength reduction
On Mon, Sep 9, 2013 at 11:35 PM, Bill Schmidt wschm...@linux.vnet.ibm.com wrote:

I rely on size_binop to convert T2 into sizetype, because T2' may be of some other type; otherwise there will be an ssa_verify error later.

OK, I see now.  I had thought this was handled by fold_build2, but apparently not.  I guess all T2's formerly handled were already sizetype as expected.  Thanks for the explanation!

So, wouldn't it suffice to change t2 to fold_convert (sizetype, t2) in the argument list to fold_build2?  It's picking nits, but that would be slightly more efficient.

Hi Bill,
This is the 2nd version of the patch with your comments incorporated.  Bootstrapped and re-tested on x86; re-test on ARM ongoing.  Is it OK if the tests pass?

Thanks.
bin

Index: gcc/testsuite/gcc.dg/tree-ssa/slsr-39.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/slsr-39.c	(revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/slsr-39.c	(revision 0)
@@ -0,0 +1,26 @@
+/* Verify straight-line strength reduction for back-tracing
+   CAND_ADD for T2 in:
+
+     *PBASE:    T1
+     *POFFSET:  MULT_EXPR (T2, C3)
+     *PINDEX:   C1 + (C2 * C3) + C4  */
+
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-slsr" } */
+
+typedef int arr_2[50][50];
+
+void foo (arr_2 a2, int v1)
+{
+  int i, j;
+
+  i = v1 + 5;
+  j = i;
+  a2 [i] [j++] = i;
+  a2 [i] [j++] = i;
+  a2 [i] [i-1] += 1;
+  return;
+}
+
+/* { dg-final { scan-tree-dump-times "MEM" 4 "slsr" } } */
+/* { dg-final { cleanup-tree-dump "slsr" } } */
Index: gcc/gimple-ssa-strength-reduction.c
===================================================================
--- gcc/gimple-ssa-strength-reduction.c	(revision 202067)
+++ gcc/gimple-ssa-strength-reduction.c	(working copy)
@@ -750,6 +750,57 @@ slsr_process_phi (gimple phi, bool speed)
   add_cand_for_stmt (phi, c);
 }
 
+/* Given PBASE which is a pointer to tree, look up the defining
+   statement for it and check whether the candidate is in the
+   form of:
+
+     X = B + (1 * S), S is integer constant
+     X = B + (i * S), S is integer one
+
+   If so, set PBASE to the candidate's base_expr and return double
+   int (i * S).
+   Otherwise, just return double int zero.  */
+
+static double_int
+backtrace_base_for_ref (tree *pbase)
+{
+  tree base_in = *pbase;
+  slsr_cand_t base_cand;
+
+  STRIP_NOPS (base_in);
+  if (TREE_CODE (base_in) != SSA_NAME)
+    return tree_to_double_int (integer_zero_node);
+
+  base_cand = base_cand_from_table (base_in);
+
+  while (base_cand && base_cand->kind != CAND_PHI)
+    {
+      if (base_cand->kind == CAND_ADD
+	  && base_cand->index.is_one ()
+	  && TREE_CODE (base_cand->stride) == INTEGER_CST)
+	{
+	  /* X = B + (1 * S), S is integer constant.  */
+	  *pbase = base_cand->base_expr;
+	  return tree_to_double_int (base_cand->stride);
+	}
+      else if (base_cand->kind == CAND_ADD
+	       && TREE_CODE (base_cand->stride) == INTEGER_CST
+	       && integer_onep (base_cand->stride))
+	{
+	  /* X = B + (i * S), S is integer one.  */
+	  *pbase = base_cand->base_expr;
+	  return base_cand->index;
+	}
+
+      if (base_cand->next_interp)
+	base_cand = lookup_cand (base_cand->next_interp);
+      else
+	base_cand = NULL;
+    }
+
+  return tree_to_double_int (integer_zero_node);
+}
+
 /* Look for the following pattern:
 
     *PBASE:    MEM_REF (T1, C1)
@@ -767,8 +818,15 @@ slsr_process_phi (gimple phi, bool speed)
 
     *PBASE:    T1
     *POFFSET:  MULT_EXPR (T2, C3)
-    *PINDEX:   C1 + (C2 * C3) + C4  */
+    *PINDEX:   C1 + (C2 * C3) + C4
 
+   When T2 is recorded by a CAND_ADD in the form of (T2' + C5), it
+   will be further restructured to:
+
+    *PBASE:    T1
+    *POFFSET:  MULT_EXPR (T2', C3)
+    *PINDEX:   C1 + (C2 * C3) + C4 + (C5 * C3)  */
+
 static bool
 restructure_reference (tree *pbase, tree *poffset, double_int *pindex,
 		       tree *ptype)
@@ -777,7 +835,7 @@ restructure_reference (tree *pbase, tree *poffset,
   double_int index = *pindex;
   double_int bpu = double_int::from_uhwi (BITS_PER_UNIT);
   tree mult_op0, mult_op1, t1, t2, type;
-  double_int c1, c2, c3, c4;
+  double_int c1, c2, c3, c4, c5;
 
   if (!base
       || !offset
@@ -823,11 +881,12 @@ restructure_reference (tree *pbase, tree *poffset,
     }
 
   c4 = index.udiv (bpu, FLOOR_DIV_EXPR);
+  c5 = backtrace_base_for_ref (&t2);
 
   *pbase = t1;
-  *poffset = fold_build2 (MULT_EXPR, sizetype, t2,
+  *poffset = fold_build2 (MULT_EXPR, sizetype, fold_convert (sizetype, t2),
 			  double_int_to_tree (sizetype, c3));
-  *pindex = c1 + c2 * c3 + c4;
+  *pindex = c1 + c2 * c3 + c4 + c5 * c3;
   *ptype = type;
 
   return true;
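To see concretely what the new test exercises, here is a small, self-contained sketch of the address arithmetic in slsr-39.c (illustrative only, not part of the patch or of GCC): after "j = i; a2[i][j++] = ...;" the second store addresses a2[i][i + 1], and once the back-trace folds "j = i + 1" into the constant part of the index, both stores share the same non-constant addressing part and differ only by a constant.

  #include <assert.h>

  #define N 50

  int
  main (void)
  {
    static int a2[N][N];
    int v1 = 3;
    int i = v1 + 5;

    char *base = (char *) a2;
    /* Shared, non-constant part of both row stores: (i * N + i) elements.  */
    long common = ((long) i * N + i) * (long) sizeof (int);

    int *first  = (int *) (base + common);                 /* a2[i][i]      */
    int *second = (int *) (base + common + sizeof (int));  /* a2[i][i + 1]  */

    assert (first == &a2[i][i]);
    assert (second == &a2[i][i + 1]);
    return 0;
  }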
RE: [PATCH GCC]Catch more MEM_REFs sharing common addressing part in gimple strength reduction
On Tue, 2013-09-10 at 15:41 +0800, bin.cheng wrote:
On Mon, Sep 9, 2013 at 11:35 PM, Bill Schmidt wschm...@linux.vnet.ibm.com wrote:

I rely on size_binop to convert T2 into sizetype, because T2' may be of some other type; otherwise there will be an ssa_verify error later.

OK, I see now.  I had thought this was handled by fold_build2, but apparently not.  I guess all T2's formerly handled were already sizetype as expected.  Thanks for the explanation!

So, wouldn't it suffice to change t2 to fold_convert (sizetype, t2) in the argument list to fold_build2?  It's picking nits, but that would be slightly more efficient.

Hi Bill,
This is the 2nd version of the patch with your comments incorporated.  Bootstrapped and re-tested on x86; re-test on ARM ongoing.  Is it OK if the tests pass?

Looks good to me!  Thanks, Bin.

Bill

Thanks.
bin
RE: [PATCH GCC]Catch more MEM_REFs sharing common addressing part in gimple strength reduction
Thanks for reviewing; I will correct all the stupid spelling problems in the next version of the patch.

On Mon, Sep 9, 2013 at 8:15 AM, Bill Schmidt wschm...@linux.vnet.ibm.com wrote:

+   int (i * S).
+   Otherwise, just return double int zero.  */

This is sufficient, since you are properly checking the next_interp chain.  Another possible form would be X = (B + i) * 1, but if this is present, then one of the forms you're checking for should also be present, so there's no need to check the MULT_CANDs.

I'm not very sure here since I didn't check MULT_CAND in the patch.  Could you please explain more about this?

+
+static double_int
+backtrace_base_for_ref (tree *pbase)
+{
+  tree base_in = *pbase;
+  slsr_cand_t base_cand;
+
+  STRIP_NOPS (base_in);
+  if (TREE_CODE (base_in) != SSA_NAME)
+    return tree_to_double_int (integer_zero_node);
+
+  base_cand = base_cand_from_table (base_in);
+
+  while (base_cand && base_cand->kind != CAND_PHI)
+    {
+      if (base_cand->kind == CAND_ADD
+	  && base_cand->index.is_one ()
+	  && TREE_CODE (base_cand->stride) == INTEGER_CST)
+	{
+	  /* X = B + (1 * S), S is integer constant.  */
+	  *pbase = base_cand->base_expr;
+	  return tree_to_double_int (base_cand->stride);
+	}
+      else if (base_cand->kind == CAND_ADD
+	       && TREE_CODE (base_cand->stride) == INTEGER_CST
+	       && integer_onep (base_cand->stride))
+	{
+	  /* X = B + (i * S), S is integer one.  */
+	  *pbase = base_cand->base_expr;
+	  return base_cand->index;
+	}
+
+      if (base_cand->next_interp)
+	base_cand = lookup_cand (base_cand->next_interp);
+      else
+	base_cand = NULL;
+    }
+
+  return tree_to_double_int (integer_zero_node);
+}
+
 /* Look for the following pattern:
 
     *PBASE:    MEM_REF (T1, C1)
@@ -767,8 +818,15 @@ slsr_process_phi (gimple phi, bool speed)
 
     *PBASE:    T1
     *POFFSET:  MULT_EXPR (T2, C3)
-    *PINDEX:   C1 + (C2 * C3) + C4  */
+    *PINDEX:   C1 + (C2 * C3) + C4
 
+   When T2 is recorded by an CAND_ADD in the form of (T2' + C5), It
                           ^                                      ^
                           a                                      it
+   will be further restructured to:
+
+    *PBASE:    T1
+    *POFFSET:  MULT_EXPR (T2', C3)
+    *PINDEX:   C1 + (C2 * C3) + C4 + (C5 * C3)  */
+
 static bool
 restructure_reference (tree *pbase, tree *poffset, double_int *pindex,
 		       tree *ptype)
@@ -777,7 +835,7 @@ restructure_reference (tree *pbase, tree *poffset,
   double_int index = *pindex;
   double_int bpu = double_int::from_uhwi (BITS_PER_UNIT);
   tree mult_op0, mult_op1, t1, t2, type;
-  double_int c1, c2, c3, c4;
+  double_int c1, c2, c3, c4, c5;
 
   if (!base
       || !offset
@@ -823,11 +881,12 @@ restructure_reference (tree *pbase, tree *poffset,
     }
 
   c4 = index.udiv (bpu, FLOOR_DIV_EXPR);
+  c5 = backtrace_base_for_ref (&t2);
 
   *pbase = t1;
-  *poffset = fold_build2 (MULT_EXPR, sizetype, t2,
-			  double_int_to_tree (sizetype, c3));
-  *pindex = c1 + c2 * c3 + c4;
+  *poffset = size_binop (MULT_EXPR, fold_convert (sizetype, t2),
+			 double_int_to_tree (sizetype, c3));

I am not sure why you changed this call.  fold_build2 is a more efficient call than size_binop.  size_binop makes several checks that will fail in this case, and then calls fold_build2_loc, right?  Not a big deal, but seems like changing it back would be better.  Perhaps I'm missing something (as usual ;).

I rely on size_binop to convert T2 into sizetype, because T2' may be of some other type; otherwise there will be an ssa_verify error later.

Thanks.
bin
RE: [PATCH GCC]Catch more MEM_REFs sharing common addressing part in gimple strength reduction
On Mon, 2013-09-09 at 14:25 +0800, bin.cheng wrote:

Thanks for reviewing; I will correct all the stupid spelling problems in the next version of the patch.

On Mon, Sep 9, 2013 at 8:15 AM, Bill Schmidt wschm...@linux.vnet.ibm.com wrote:

+   int (i * S).
+   Otherwise, just return double int zero.  */

This is sufficient, since you are properly checking the next_interp chain.  Another possible form would be X = (B + i) * 1, but if this is present, then one of the forms you're checking for should also be present, so there's no need to check the MULT_CANDs.

I'm not very sure here since I didn't check MULT_CAND in the patch.  Could you please explain more about this?

Sorry, perhaps I shouldn't have mentioned it.  I was simply stating that, although a candidate representing B + i could be represented with a CAND_MULT as shown, there is no need for you to check it (as you don't) since there will also be a corresponding CAND_ADD in one of the other forms.  Since you are walking the next_interp chain, this works.  In other words, the code is fine as is.  I was just thinking out loud about other candidate types.

+
+static double_int
+backtrace_base_for_ref (tree *pbase)
+{
+  tree base_in = *pbase;
+  slsr_cand_t base_cand;
+
+  STRIP_NOPS (base_in);
+  if (TREE_CODE (base_in) != SSA_NAME)
+    return tree_to_double_int (integer_zero_node);
+
+  base_cand = base_cand_from_table (base_in);
+
+  while (base_cand && base_cand->kind != CAND_PHI)
+    {
+      if (base_cand->kind == CAND_ADD
+	  && base_cand->index.is_one ()
+	  && TREE_CODE (base_cand->stride) == INTEGER_CST)
+	{
+	  /* X = B + (1 * S), S is integer constant.  */
+	  *pbase = base_cand->base_expr;
+	  return tree_to_double_int (base_cand->stride);
+	}
+      else if (base_cand->kind == CAND_ADD
+	       && TREE_CODE (base_cand->stride) == INTEGER_CST
+	       && integer_onep (base_cand->stride))
+	{
+	  /* X = B + (i * S), S is integer one.  */
+	  *pbase = base_cand->base_expr;
+	  return base_cand->index;
+	}
+
+      if (base_cand->next_interp)
+	base_cand = lookup_cand (base_cand->next_interp);
+      else
+	base_cand = NULL;
+    }
+
+  return tree_to_double_int (integer_zero_node);
+}
+
 /* Look for the following pattern:
 
     *PBASE:    MEM_REF (T1, C1)
@@ -767,8 +818,15 @@ slsr_process_phi (gimple phi, bool speed)
 
     *PBASE:    T1
     *POFFSET:  MULT_EXPR (T2, C3)
-    *PINDEX:   C1 + (C2 * C3) + C4  */
+    *PINDEX:   C1 + (C2 * C3) + C4
 
+   When T2 is recorded by an CAND_ADD in the form of (T2' + C5), It
                           ^                                      ^
                           a                                      it
+   will be further restructured to:
+
+    *PBASE:    T1
+    *POFFSET:  MULT_EXPR (T2', C3)
+    *PINDEX:   C1 + (C2 * C3) + C4 + (C5 * C3)  */
+
 static bool
 restructure_reference (tree *pbase, tree *poffset, double_int *pindex,
 		       tree *ptype)
@@ -777,7 +835,7 @@ restructure_reference (tree *pbase, tree *poffset,
   double_int index = *pindex;
   double_int bpu = double_int::from_uhwi (BITS_PER_UNIT);
   tree mult_op0, mult_op1, t1, t2, type;
-  double_int c1, c2, c3, c4;
+  double_int c1, c2, c3, c4, c5;
 
   if (!base
      || !offset
@@ -823,11 +881,12 @@ restructure_reference (tree *pbase, tree *poffset,
     }
 
   c4 = index.udiv (bpu, FLOOR_DIV_EXPR);
+  c5 = backtrace_base_for_ref (&t2);
 
   *pbase = t1;
-  *poffset = fold_build2 (MULT_EXPR, sizetype, t2,
-			  double_int_to_tree (sizetype, c3));
-  *pindex = c1 + c2 * c3 + c4;
+  *poffset = size_binop (MULT_EXPR, fold_convert (sizetype, t2),
+			 double_int_to_tree (sizetype, c3));

I am not sure why you changed this call.  fold_build2 is a more efficient call than size_binop.  size_binop makes several checks that will fail in this case, and then calls fold_build2_loc, right?  Not a big deal, but seems like changing it back would be better.  Perhaps I'm missing something (as usual ;).

I rely on size_binop to convert T2 into sizetype, because T2' may be of some other type; otherwise there will be an ssa_verify error later.

OK, I see now.  I had thought this was handled by fold_build2, but apparently not.  I guess all T2's formerly handled were already sizetype as expected.  Thanks for the explanation!

Bill

Thanks.
bin
RE: [PATCH GCC]Catch more MEM_REFs sharing common addressing part in gimple strength reduction
On Mon, 2013-09-09 at 10:20 -0500, Bill Schmidt wrote:
On Mon, 2013-09-09 at 14:25 +0800, bin.cheng wrote:

Thanks for reviewing; I will correct all the stupid spelling problems in the next version of the patch.

On Mon, Sep 9, 2013 at 8:15 AM, Bill Schmidt wschm...@linux.vnet.ibm.com wrote:

+   int (i * S).
+   Otherwise, just return double int zero.  */

This is sufficient, since you are properly checking the next_interp chain.  Another possible form would be X = (B + i) * 1, but if this is present, then one of the forms you're checking for should also be present, so there's no need to check the MULT_CANDs.

I'm not very sure here since I didn't check MULT_CAND in the patch.  Could you please explain more about this?

Sorry, perhaps I shouldn't have mentioned it.  I was simply stating that, although a candidate representing B + i could be represented with a CAND_MULT as shown, there is no need for you to check it (as you don't) since there will also be a corresponding CAND_ADD in one of the other forms.  Since you are walking the next_interp chain, this works.  In other words, the code is fine as is.  I was just thinking out loud about other candidate types.

+
+static double_int
+backtrace_base_for_ref (tree *pbase)
+{
+  tree base_in = *pbase;
+  slsr_cand_t base_cand;
+
+  STRIP_NOPS (base_in);
+  if (TREE_CODE (base_in) != SSA_NAME)
+    return tree_to_double_int (integer_zero_node);
+
+  base_cand = base_cand_from_table (base_in);
+
+  while (base_cand && base_cand->kind != CAND_PHI)
+    {
+      if (base_cand->kind == CAND_ADD
+	  && base_cand->index.is_one ()
+	  && TREE_CODE (base_cand->stride) == INTEGER_CST)
+	{
+	  /* X = B + (1 * S), S is integer constant.  */
+	  *pbase = base_cand->base_expr;
+	  return tree_to_double_int (base_cand->stride);
+	}
+      else if (base_cand->kind == CAND_ADD
+	       && TREE_CODE (base_cand->stride) == INTEGER_CST
+	       && integer_onep (base_cand->stride))
+	{
+	  /* X = B + (i * S), S is integer one.  */
+	  *pbase = base_cand->base_expr;
+	  return base_cand->index;
+	}
+
+      if (base_cand->next_interp)
+	base_cand = lookup_cand (base_cand->next_interp);
+      else
+	base_cand = NULL;
+    }
+
+  return tree_to_double_int (integer_zero_node);
+}
+
 /* Look for the following pattern:
 
     *PBASE:    MEM_REF (T1, C1)
@@ -767,8 +818,15 @@ slsr_process_phi (gimple phi, bool speed)
 
     *PBASE:    T1
     *POFFSET:  MULT_EXPR (T2, C3)
-    *PINDEX:   C1 + (C2 * C3) + C4  */
+    *PINDEX:   C1 + (C2 * C3) + C4
 
+   When T2 is recorded by an CAND_ADD in the form of (T2' + C5), It
                           ^                                      ^
                           a                                      it
+   will be further restructured to:
+
+    *PBASE:    T1
+    *POFFSET:  MULT_EXPR (T2', C3)
+    *PINDEX:   C1 + (C2 * C3) + C4 + (C5 * C3)  */
+
 static bool
 restructure_reference (tree *pbase, tree *poffset, double_int *pindex,
 		       tree *ptype)
@@ -777,7 +835,7 @@ restructure_reference (tree *pbase, tree *poffset,
   double_int index = *pindex;
   double_int bpu = double_int::from_uhwi (BITS_PER_UNIT);
   tree mult_op0, mult_op1, t1, t2, type;
-  double_int c1, c2, c3, c4;
+  double_int c1, c2, c3, c4, c5;
 
   if (!base
      || !offset
@@ -823,11 +881,12 @@ restructure_reference (tree *pbase, tree *poffset,
     }
 
   c4 = index.udiv (bpu, FLOOR_DIV_EXPR);
+  c5 = backtrace_base_for_ref (&t2);
 
   *pbase = t1;
-  *poffset = fold_build2 (MULT_EXPR, sizetype, t2,
-			  double_int_to_tree (sizetype, c3));
-  *pindex = c1 + c2 * c3 + c4;
+  *poffset = size_binop (MULT_EXPR, fold_convert (sizetype, t2),
+			 double_int_to_tree (sizetype, c3));

I am not sure why you changed this call.  fold_build2 is a more efficient call than size_binop.  size_binop makes several checks that will fail in this case, and then calls fold_build2_loc, right?  Not a big deal, but seems like changing it back would be better.  Perhaps I'm missing something (as usual ;).

I rely on size_binop to convert T2 into sizetype, because T2' may be of some other type; otherwise there will be an ssa_verify error later.

OK, I see now.  I had thought this was handled by fold_build2, but apparently not.  I guess all T2's formerly handled were already sizetype as expected.  Thanks for the explanation!

So, wouldn't it suffice to change t2 to fold_convert (sizetype, t2) in the argument list to fold_build2?  It's picking nits, but that would be slightly more efficient.

Bill

Bill

Thanks.
bin
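For reference, the tweak Bill is suggesting here is the one that ended up in the second version of the patch posted elsewhere in this thread: keep fold_build2 and convert t2 directly in its argument list, i.e.

  *poffset = fold_build2 (MULT_EXPR, sizetype, fold_convert (sizetype, t2),
			  double_int_to_tree (sizetype, c3));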
Re: [PATCH GCC]Catch more MEM_REFs sharing common addressing part in gimple strength reduction
On Mon, 2013-09-02 at 11:15 +0200, Richard Biener wrote:
On Mon, Sep 2, 2013 at 8:56 AM, bin.cheng bin.ch...@arm.com wrote:

Hi,
The gimple-ssa-strength-reduction pass handles CAND_REFs in order to find different MEM_REFs sharing a common part in their addressing expressions.  If such MEM_REFs are found, the pass rewrites them, producing a more efficient addressing expression for the RTL passes.

The pass analyzes the addressing expression in each MEM_REF to see if it can be formalized as follows:

  base:    MEM_REF (T1, C1)
  offset:  MULT_EXPR (PLUS_EXPR (T2, C2), C3)
  bitpos:  C4 * BITS_PER_UNIT

It then restructures it into the form below:

  MEM_REF (POINTER_PLUS_EXPR (T1, MULT_EXPR (T2, C3)),
	   C1 + (C2 * C3) + C4)

Finally, it rewrites the MEM_REFs if two or more of them share a common (non-constant) part.

The problem is that it doesn't back-trace T2.  If T2 is recorded as a CAND_ADD in the form of T2' + C5, the MEM_REF should be restructured into:

  MEM_REF (POINTER_PLUS_EXPR (T1, MULT_EXPR (T2', C3)),
	   C1 + (C2 * C3) + C4 + (C5 * C3))

The patch also includes a test case to illustrate the problem.  Bootstrapped and tested on x86/x86_64/arm-a15, is it OK?

This looks ok to me if Bill is ok with it.

This is a good generalization and I'm fine with it.  There are a few minor nits that should be corrected, outlined below.

Thanks,
Richard.

Thanks.
bin

2013-09-02  Bin Cheng  bin.ch...@arm.com

	* gimple-ssa-strength-reduction.c (backtrace_base_for_ref): New.
	(restructure_reference): Call backtrace_base_for_ref.

gcc/testsuite/ChangeLog
2013-09-02  Bin Cheng  bin.ch...@arm.com

	* gcc.dg/tree-ssa/slsr-39.c: New test.

Index: gcc/testsuite/gcc.dg/tree-ssa/slsr-39.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/slsr-39.c	(revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/slsr-39.c	(revision 0)
@@ -0,0 +1,26 @@
+/* Verify straight-line strength reduction for back-tracing
+   CADN_ADD for T2 in:
    CAND_ADD
+
+     *PBASE:    T1
+     *POFFSET:  MULT_EXPR (T2, C3)
+     *PINDEX:   C1 + (C2 * C3) + C4  */
+
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-slsr" } */
+
+typedef int arr_2[50][50];
+
+void foo (arr_2 a2, int v1)
+{
+  int i, j;
+
+  i = v1 + 5;
+  j = i;
+  a2 [i] [j++] = i;
+  a2 [i] [j++] = i;
+  a2 [i] [i-1] += 1;
+  return;
+}
+
+/* { dg-final { scan-tree-dump-times "MEM" 4 "slsr" } } */
+/* { dg-final { cleanup-tree-dump "slsr" } } */
Index: gcc/gimple-ssa-strength-reduction.c
===================================================================
--- gcc/gimple-ssa-strength-reduction.c	(revision 202067)
+++ gcc/gimple-ssa-strength-reduction.c	(working copy)
@@ -750,6 +750,57 @@ slsr_process_phi (gimple phi, bool speed)
   add_cand_for_stmt (phi, c);
 }
 
+/* Given PBASE which is a pointer to tree, loop up the defining
                                            look up
+   statement for it and check whether the candidate is in the
+   form of:
+
+     X = B + (1 * S), S is integer constant
+     X = B + (i * S), S is integer one
+
+   If so, set PBASE to the candiate's base_expr and return double
                            candidate's
+   int (i * S).
+   Otherwise, just return double int zero.  */

This is sufficient, since you are properly checking the next_interp chain.  Another possible form would be X = (B + i) * 1, but if this is present, then one of the forms you're checking for should also be present, so there's no need to check the MULT_CANDs.

+
+static double_int
+backtrace_base_for_ref (tree *pbase)
+{
+  tree base_in = *pbase;
+  slsr_cand_t base_cand;
+
+  STRIP_NOPS (base_in);
+  if (TREE_CODE (base_in) != SSA_NAME)
+    return tree_to_double_int (integer_zero_node);
+
+  base_cand = base_cand_from_table (base_in);
+
+  while (base_cand && base_cand->kind != CAND_PHI)
+    {
+      if (base_cand->kind == CAND_ADD
+	  && base_cand->index.is_one ()
+	  && TREE_CODE (base_cand->stride) == INTEGER_CST)
+	{
+	  /* X = B + (1 * S), S is integer constant.  */
+	  *pbase = base_cand->base_expr;
+	  return tree_to_double_int (base_cand->stride);
+	}
+      else if (base_cand->kind == CAND_ADD
+	       && TREE_CODE (base_cand->stride) == INTEGER_CST
+	       && integer_onep (base_cand->stride))
+	{
+	  /* X = B + (i * S), S is integer one.  */
+	  *pbase = base_cand->base_expr;
+	  return base_cand->index;
+	}
+
+      if (base_cand->next_interp)
+	base_cand = lookup_cand (base_cand->next_interp);
+      else
+	base_cand = NULL;
+    }
+
+  return tree_to_double_int (integer_zero_node);
+}
+
 /* Look for the following pattern:
 
     *PBASE:    MEM_REF (T1, C1)
@@ -767,8 +818,15 @@ slsr_process_phi (gimple phi, bool speed)
 
     *PBASE:    T1
     *POFFSET:  MULT_EXPR (T2, C3)
-    *PINDEX:   C1 + (C2 * C3) + C4  */
+    *PINDEX:   C1 + (C2 * C3) + C4
 
+   When T2 is recorded by an CAND_ADD in the form of (T2' + C5), It
Re: [PATCH GCC]Catch more MEM_REFs sharing common addressing part in gimple strength reduction
On Mon, 2013-09-02 at 11:15 +0200, Richard Biener wrote:
On Mon, Sep 2, 2013 at 8:56 AM, bin.cheng bin.ch...@arm.com wrote:

Hi,
The gimple-ssa-strength-reduction pass handles CAND_REFs in order to find different MEM_REFs sharing a common part in their addressing expressions.  If such MEM_REFs are found, the pass rewrites them, producing a more efficient addressing expression for the RTL passes.

The pass analyzes the addressing expression in each MEM_REF to see if it can be formalized as follows:

  base:    MEM_REF (T1, C1)
  offset:  MULT_EXPR (PLUS_EXPR (T2, C2), C3)
  bitpos:  C4 * BITS_PER_UNIT

It then restructures it into the form below:

  MEM_REF (POINTER_PLUS_EXPR (T1, MULT_EXPR (T2, C3)),
	   C1 + (C2 * C3) + C4)

Finally, it rewrites the MEM_REFs if two or more of them share a common (non-constant) part.

The problem is that it doesn't back-trace T2.  If T2 is recorded as a CAND_ADD in the form of T2' + C5, the MEM_REF should be restructured into:

  MEM_REF (POINTER_PLUS_EXPR (T1, MULT_EXPR (T2', C3)),
	   C1 + (C2 * C3) + C4 + (C5 * C3))

The patch also includes a test case to illustrate the problem.  Bootstrapped and tested on x86/x86_64/arm-a15, is it OK?

This looks ok to me if Bill is ok with it.

Sorry, I've been on vacation and haven't been checking in until now.  I'll have a look at this tomorrow -- sounds good on the surface!

Thanks,
Bill

Thanks,
Richard.

Thanks.
bin

2013-09-02  Bin Cheng  bin.ch...@arm.com

	* gimple-ssa-strength-reduction.c (backtrace_base_for_ref): New.
	(restructure_reference): Call backtrace_base_for_ref.

gcc/testsuite/ChangeLog
2013-09-02  Bin Cheng  bin.ch...@arm.com

	* gcc.dg/tree-ssa/slsr-39.c: New test.
Re: [PATCH GCC]Catch more MEM_REFs sharing common addressing part in gimple strength reduction
On Mon, Sep 2, 2013 at 8:56 AM, bin.cheng bin.ch...@arm.com wrote:

Hi,
The gimple-ssa-strength-reduction pass handles CAND_REFs in order to find different MEM_REFs sharing a common part in their addressing expressions.  If such MEM_REFs are found, the pass rewrites them, producing a more efficient addressing expression for the RTL passes.

The pass analyzes the addressing expression in each MEM_REF to see if it can be formalized as follows:

  base:    MEM_REF (T1, C1)
  offset:  MULT_EXPR (PLUS_EXPR (T2, C2), C3)
  bitpos:  C4 * BITS_PER_UNIT

It then restructures it into the form below:

  MEM_REF (POINTER_PLUS_EXPR (T1, MULT_EXPR (T2, C3)),
	   C1 + (C2 * C3) + C4)

Finally, it rewrites the MEM_REFs if two or more of them share a common (non-constant) part.

The problem is that it doesn't back-trace T2.  If T2 is recorded as a CAND_ADD in the form of T2' + C5, the MEM_REF should be restructured into:

  MEM_REF (POINTER_PLUS_EXPR (T1, MULT_EXPR (T2', C3)),
	   C1 + (C2 * C3) + C4 + (C5 * C3))

The patch also includes a test case to illustrate the problem.  Bootstrapped and tested on x86/x86_64/arm-a15, is it OK?

This looks ok to me if Bill is ok with it.

Thanks,
Richard.

Thanks.
bin

2013-09-02  Bin Cheng  bin.ch...@arm.com

	* gimple-ssa-strength-reduction.c (backtrace_base_for_ref): New.
	(restructure_reference): Call backtrace_base_for_ref.

gcc/testsuite/ChangeLog
2013-09-02  Bin Cheng  bin.ch...@arm.com

	* gcc.dg/tree-ssa/slsr-39.c: New test.
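The restructuring described above is byte-offset algebra.  The following minimal sketch (illustrative values only, not GCC code) checks the identity, including the extra C5 * C3 term that the back-trace of T2 = T2' + C5 folds into the constant index:

  #include <assert.h>

  /* C1 + (T2 + C2) * C3 + C4
       == T2  * C3 + (C1 + C2 * C3 + C4)                 -- existing rewrite
       == T2p * C3 + (C1 + C2 * C3 + C4 + C5 * C3)       -- with T2 = T2p + C5  */
  int
  main (void)
  {
    long c1 = 8, c2 = 3, c3 = 4, c4 = 2, c5 = 7;
    long t2p = 11;            /* T2' */
    long t2 = t2p + c5;       /* T2, as recorded by a CAND_ADD */

    long original   = c1 + (t2 + c2) * c3 + c4;
    long rewritten  = t2 * c3 + (c1 + c2 * c3 + c4);
    long backtraced = t2p * c3 + (c1 + c2 * c3 + c4 + c5 * c3);

    assert (original == rewritten);
    assert (rewritten == backtraced);
    return 0;
  }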