Re: [PATCH] Fix PR18589

2012-04-12 Thread Richard Guenther
On Thu, Apr 5, 2012 at 3:49 PM, William J. Schmidt
wschm...@linux.vnet.ibm.com wrote:
 On Thu, 2012-04-05 at 11:23 +0200, Richard Guenther wrote:
 On Wed, Apr 4, 2012 at 9:15 PM, William J. Schmidt
 wschm...@linux.vnet.ibm.com wrote:
 
  Unfortunately this seems to be necessary if I name the two passes
  reassoc1 and reassoc2.  If I try to name both of them reassoc I
  get failures in other tests like gfortran.dg/reassoc_4, where
  -fdump-tree-reassoc1 doesn't work.  Unless I'm missing something
  obvious, I think I need to keep that change.

 Hm, naming them reassoc1 and reassoc2 is a hack.  Naming both
 reassoc will not trigger re-naming them to reassoc1 and reassoc2
 I think.  How ugly.  Especially that -fdump-tree-reassoc will no longer
 work.  Maybe instead of using two pass structs resort to using
 the existing hack with using first_pass_instance and 
 TODO_mark_first_instance.

 OK, that seems to be the best among evils.  Using the
 first_pass_instance hack, the patch is transformed as below.
 Regstrapped on powerpc64-linux, no additional failures.  OK for trunk?

Ok.

Thanks,
Richard.

 Thanks,
 Bill


 gcc:

 2012-04-05  Bill Schmidt  wschm...@linux.vnet.ibm.com

        PR tree-optimization/18589
        * tree-ssa-reassoc.c (reassociate_stats): Add two fields.
        (operand_entry): Add count field.
        (add_repeat_to_ops_vec): New function.
        (completely_remove_stmt): Likewise.
        (remove_def_if_absorbed_call): Likewise.
        (remove_visited_stmt_chain): Remove feeding builtin pow/powi calls.
        (acceptable_pow_call): New function.
        (linearize_expr_tree): Look for builtin pow/powi calls and add operand
        entries with repeat counts when found.
        (repeat_factor_d): New struct and associated typedefs.
        (repeat_factor_vec): New static vector variable.
        (compare_repeat_factors): New function.
        (get_reassoc_pow_ssa_name): Likewise.
        (attempt_builtin_powi): Likewise.
        (reassociate_bb): Call attempt_builtin_powi.
        (fini_reassoc): Two new calls to statistics_counter_event.

 gcc/testsuite:

 2012-04-05  Bill Schmidt  wschm...@linux.vnet.ibm.com

        PR tree-optimization/18589
        * gcc.dg/tree-ssa/pr18589-1.c: New test.
        * gcc.dg/tree-ssa/pr18589-2.c: Likewise.
        * gcc.dg/tree-ssa/pr18589-3.c: Likewise.
        * gcc.dg/tree-ssa/pr18589-4.c: Likewise.
        * gcc.dg/tree-ssa/pr18589-5.c: Likewise.
        * gcc.dg/tree-ssa/pr18589-6.c: Likewise.
        * gcc.dg/tree-ssa/pr18589-7.c: Likewise.
        * gcc.dg/tree-ssa/pr18589-8.c: Likewise.
        * gcc.dg/tree-ssa/pr18589-9.c: Likewise.
        * gcc.dg/tree-ssa/pr18589-10.c: Likewise.


 Index: gcc/testsuite/gcc.dg/tree-ssa/pr18589-4.c
 ===
 --- gcc/testsuite/gcc.dg/tree-ssa/pr18589-4.c   (revision 0)
 +++ gcc/testsuite/gcc.dg/tree-ssa/pr18589-4.c   (revision 0)
 @@ -0,0 +1,10 @@
 +/* { dg-do compile } */
 +/* { dg-options -O3 -ffast-math -fdump-tree-optimized } */
 +
 +double baz (double x, double y, double z, double u)
 +{
 +  return x * x * y * y * y * z * z * z * z * u;
 +}
 +
 +/* { dg-final { scan-tree-dump-times  \\*  6 optimized } } */
 +/* { dg-final { cleanup-tree-dump optimized } } */
 Index: gcc/testsuite/gcc.dg/tree-ssa/pr18589-5.c
 ===
 --- gcc/testsuite/gcc.dg/tree-ssa/pr18589-5.c   (revision 0)
 +++ gcc/testsuite/gcc.dg/tree-ssa/pr18589-5.c   (revision 0)
 @@ -0,0 +1,10 @@
 +/* { dg-do compile } */
 +/* { dg-options -O3 -ffast-math -fdump-tree-optimized } */
 +
 +double baz (double x, double y, double z, double u)
 +{
 +  return x * x * x * y * y * y * z * z * z * z * u * u * u * u;
 +}
 +
 +/* { dg-final { scan-tree-dump-times  \\*  6 optimized } } */
 +/* { dg-final { cleanup-tree-dump optimized } } */
 Index: gcc/testsuite/gcc.dg/tree-ssa/pr18589-6.c
 ===
 --- gcc/testsuite/gcc.dg/tree-ssa/pr18589-6.c   (revision 0)
 +++ gcc/testsuite/gcc.dg/tree-ssa/pr18589-6.c   (revision 0)
 @@ -0,0 +1,10 @@
 +/* { dg-do compile } */
 +/* { dg-options -O3 -ffast-math -fdump-tree-optimized } */
 +
 +double baz (double x, double y)
 +{
 +  return __builtin_pow (x, 3.0) * __builtin_pow (y, 4.0);
 +}
 +
 +/* { dg-final { scan-tree-dump-times  \\*  4 optimized } } */
 +/* { dg-final { cleanup-tree-dump optimized } } */
 Index: gcc/testsuite/gcc.dg/tree-ssa/pr18589-7.c
 ===
 --- gcc/testsuite/gcc.dg/tree-ssa/pr18589-7.c   (revision 0)
 +++ gcc/testsuite/gcc.dg/tree-ssa/pr18589-7.c   (revision 0)
 @@ -0,0 +1,10 @@
 +/* { dg-do compile } */
 +/* { dg-options -O3 -ffast-math -fdump-tree-optimized } */
 +
 +float baz (float x, float y)
 +{
 +  return x * x * x * x * y * y * y * y;
 +}
 +
 +/* { dg-final { scan-tree-dump-times  \\*  3 optimized } } */
 +/* { dg-final { cleanup-tree-dump optimized 

Re: [PATCH] Fix PR18589

2012-04-12 Thread H.J. Lu
On Thu, Apr 5, 2012 at 6:49 AM, William J. Schmidt
wschm...@linux.vnet.ibm.com wrote:
 On Thu, 2012-04-05 at 11:23 +0200, Richard Guenther wrote:
 On Wed, Apr 4, 2012 at 9:15 PM, William J. Schmidt
 wschm...@linux.vnet.ibm.com wrote:
 
  Unfortunately this seems to be necessary if I name the two passes
  reassoc1 and reassoc2.  If I try to name both of them reassoc I
  get failures in other tests like gfortran.dg/reassoc_4, where
  -fdump-tree-reassoc1 doesn't work.  Unless I'm missing something
  obvious, I think I need to keep that change.

 Hm, naming them reassoc1 and reassoc2 is a hack.  Naming both
 reassoc will not trigger re-naming them to reassoc1 and reassoc2
 I think.  How ugly.  Especially that -fdump-tree-reassoc will no longer
 work.  Maybe instead of using two pass structs resort to using
 the existing hack with using first_pass_instance and 
 TODO_mark_first_instance.

 OK, that seems to be the best among evils.  Using the
 first_pass_instance hack, the patch is transformed as below.
 Regstrapped on powerpc64-linux, no additional failures.  OK for trunk?

 Thanks,
 Bill


 gcc:

 2012-04-05  Bill Schmidt  wschm...@linux.vnet.ibm.com

        PR tree-optimization/18589
        * tree-ssa-reassoc.c (reassociate_stats): Add two fields.
        (operand_entry): Add count field.
        (add_repeat_to_ops_vec): New function.
        (completely_remove_stmt): Likewise.
        (remove_def_if_absorbed_call): Likewise.
        (remove_visited_stmt_chain): Remove feeding builtin pow/powi calls.
        (acceptable_pow_call): New function.
        (linearize_expr_tree): Look for builtin pow/powi calls and add operand
        entries with repeat counts when found.
        (repeat_factor_d): New struct and associated typedefs.
        (repeat_factor_vec): New static vector variable.
        (compare_repeat_factors): New function.
        (get_reassoc_pow_ssa_name): Likewise.
        (attempt_builtin_powi): Likewise.
        (reassociate_bb): Call attempt_builtin_powi.
        (fini_reassoc): Two new calls to statistics_counter_event.


It breaks bootstrap on Linux/ia32:

../../src-trunk/gcc/tree-ssa-reassoc.c: In function 'void
attempt_builtin_powi(gimple, VEC_operand_entry_t_heap**,
tree_node**)':
../../src-trunk/gcc/tree-ssa-reassoc.c:3189:41: error: format '%ld'
expects argument of type 'long int', but argument 3 has type 'long
long int' [-Werror=format]
 fprintf (dump_file, )^%ld\n, power);
 ^
../../src-trunk/gcc/tree-ssa-reassoc.c:3222:44: error: format '%ld'
expects argument of type 'long int', but argument 3 has type 'long
long int' [-Werror=format]
fprintf (dump_file, )^%ld\n, power);
^
cc1plus: all warnings being treated as errors


H.J.


Re: [PATCH] Fix PR18589

2012-04-12 Thread William J. Schmidt
On Thu, 2012-04-12 at 09:50 -0700, H.J. Lu wrote:
 On Thu, Apr 5, 2012 at 6:49 AM, William J. Schmidt
 wschm...@linux.vnet.ibm.com wrote:
  On Thu, 2012-04-05 at 11:23 +0200, Richard Guenther wrote:
  On Wed, Apr 4, 2012 at 9:15 PM, William J. Schmidt
  wschm...@linux.vnet.ibm.com wrote:
  
   Unfortunately this seems to be necessary if I name the two passes
   reassoc1 and reassoc2.  If I try to name both of them reassoc I
   get failures in other tests like gfortran.dg/reassoc_4, where
   -fdump-tree-reassoc1 doesn't work.  Unless I'm missing something
   obvious, I think I need to keep that change.
 
  Hm, naming them reassoc1 and reassoc2 is a hack.  Naming both
  reassoc will not trigger re-naming them to reassoc1 and reassoc2
  I think.  How ugly.  Especially that -fdump-tree-reassoc will no longer
  work.  Maybe instead of using two pass structs resort to using
  the existing hack with using first_pass_instance and 
  TODO_mark_first_instance.
 
  OK, that seems to be the best among evils.  Using the
  first_pass_instance hack, the patch is transformed as below.
  Regstrapped on powerpc64-linux, no additional failures.  OK for trunk?
 
  Thanks,
  Bill
 
 
  gcc:
 
  2012-04-05  Bill Schmidt  wschm...@linux.vnet.ibm.com
 
 PR tree-optimization/18589
 * tree-ssa-reassoc.c (reassociate_stats): Add two fields.
 (operand_entry): Add count field.
 (add_repeat_to_ops_vec): New function.
 (completely_remove_stmt): Likewise.
 (remove_def_if_absorbed_call): Likewise.
 (remove_visited_stmt_chain): Remove feeding builtin pow/powi calls.
 (acceptable_pow_call): New function.
 (linearize_expr_tree): Look for builtin pow/powi calls and add 
  operand
 entries with repeat counts when found.
 (repeat_factor_d): New struct and associated typedefs.
 (repeat_factor_vec): New static vector variable.
 (compare_repeat_factors): New function.
 (get_reassoc_pow_ssa_name): Likewise.
 (attempt_builtin_powi): Likewise.
 (reassociate_bb): Call attempt_builtin_powi.
 (fini_reassoc): Two new calls to statistics_counter_event.
 
 
 It breaks bootstrap on Linux/ia32:
 
 ../../src-trunk/gcc/tree-ssa-reassoc.c: In function 'void
 attempt_builtin_powi(gimple, VEC_operand_entry_t_heap**,
 tree_node**)':
 ../../src-trunk/gcc/tree-ssa-reassoc.c:3189:41: error: format '%ld'
 expects argument of type 'long int', but argument 3 has type 'long
 long int' [-Werror=format]
  fprintf (dump_file, )^%ld\n, power);
  ^
 ../../src-trunk/gcc/tree-ssa-reassoc.c:3222:44: error: format '%ld'
 expects argument of type 'long int', but argument 3 has type 'long
 long int' [-Werror=format]
 fprintf (dump_file, )^%ld\n, power);
 ^
 cc1plus: all warnings being treated as errors
 
 
 H.J.
 

Whoops.  Looks like I need to use HOST_WIDE_INT_PRINT_DEC instead of %ld
in those spots.  I'll get a fix prepared.



Re: [PATCH] Fix PR18589

2012-04-12 Thread William J. Schmidt
On Thu, 2012-04-12 at 09:50 -0700, H.J. Lu wrote:
 On Thu, Apr 5, 2012 at 6:49 AM, William J. Schmidt
 wschm...@linux.vnet.ibm.com wrote:
  On Thu, 2012-04-05 at 11:23 +0200, Richard Guenther wrote:
  On Wed, Apr 4, 2012 at 9:15 PM, William J. Schmidt
  wschm...@linux.vnet.ibm.com wrote:
  
   Unfortunately this seems to be necessary if I name the two passes
   reassoc1 and reassoc2.  If I try to name both of them reassoc I
   get failures in other tests like gfortran.dg/reassoc_4, where
   -fdump-tree-reassoc1 doesn't work.  Unless I'm missing something
   obvious, I think I need to keep that change.
 
  Hm, naming them reassoc1 and reassoc2 is a hack.  Naming both
  reassoc will not trigger re-naming them to reassoc1 and reassoc2
  I think.  How ugly.  Especially that -fdump-tree-reassoc will no longer
  work.  Maybe instead of using two pass structs resort to using
  the existing hack with using first_pass_instance and 
  TODO_mark_first_instance.
 
  OK, that seems to be the best among evils.  Using the
  first_pass_instance hack, the patch is transformed as below.
  Regstrapped on powerpc64-linux, no additional failures.  OK for trunk?
 
  Thanks,
  Bill
 
 
  gcc:
 
  2012-04-05  Bill Schmidt  wschm...@linux.vnet.ibm.com
 
 PR tree-optimization/18589
 * tree-ssa-reassoc.c (reassociate_stats): Add two fields.
 (operand_entry): Add count field.
 (add_repeat_to_ops_vec): New function.
 (completely_remove_stmt): Likewise.
 (remove_def_if_absorbed_call): Likewise.
 (remove_visited_stmt_chain): Remove feeding builtin pow/powi calls.
 (acceptable_pow_call): New function.
 (linearize_expr_tree): Look for builtin pow/powi calls and add 
  operand
 entries with repeat counts when found.
 (repeat_factor_d): New struct and associated typedefs.
 (repeat_factor_vec): New static vector variable.
 (compare_repeat_factors): New function.
 (get_reassoc_pow_ssa_name): Likewise.
 (attempt_builtin_powi): Likewise.
 (reassociate_bb): Call attempt_builtin_powi.
 (fini_reassoc): Two new calls to statistics_counter_event.
 
 
 It breaks bootstrap on Linux/ia32:
 
 ../../src-trunk/gcc/tree-ssa-reassoc.c: In function 'void
 attempt_builtin_powi(gimple, VEC_operand_entry_t_heap**,
 tree_node**)':
 ../../src-trunk/gcc/tree-ssa-reassoc.c:3189:41: error: format '%ld'
 expects argument of type 'long int', but argument 3 has type 'long
 long int' [-Werror=format]
  fprintf (dump_file, )^%ld\n, power);
  ^
 ../../src-trunk/gcc/tree-ssa-reassoc.c:3222:44: error: format '%ld'
 expects argument of type 'long int', but argument 3 has type 'long
 long int' [-Werror=format]
 fprintf (dump_file, )^%ld\n, power);
 ^
 cc1plus: all warnings being treated as errors
 
 
 H.J.
 

Thanks, H.J.  Sorry for the problem!

Fixing as follows.  I'll plan to commit as obvious shortly.


2012-04-12  Bill Schmidt  wschm...@linux.vnet.ibm.com

* tree-ssa-reassoc.c (attempt_builtin_powi_stats): Change %ld to
HOST_WIDE_INT_PRINT_DEC in format strings.


Index: gcc/tree-ssa-reassoc.c
===
--- gcc/tree-ssa-reassoc.c  (revision 186384)
+++ gcc/tree-ssa-reassoc.c  (working copy)
@@ -3186,7 +3186,8 @@ attempt_builtin_powi (gimple stmt, VEC(operand_ent
  if (elt  vec_len - 1)
fputs ( * , dump_file);
}
- fprintf (dump_file, )^%ld\n, power);
+ fprintf (dump_file, )^HOST_WIDE_INT_PRINT_DEC\n,
+  power);
}
}
}
@@ -3219,7 +3220,7 @@ attempt_builtin_powi (gimple stmt, VEC(operand_ent
  if (elt  vec_len - 1)
fputs ( * , dump_file);
}
- fprintf (dump_file, )^%ld\n, power);
+ fprintf (dump_file, )^HOST_WIDE_INT_PRINT_DEC\n, power);
}
 
  reassociate_stats.pows_created++;




Re: [PATCH] Fix PR18589

2012-04-05 Thread Richard Guenther
On Wed, Apr 4, 2012 at 9:15 PM, William J. Schmidt
wschm...@linux.vnet.ibm.com wrote:
 On Wed, 2012-04-04 at 15:08 +0200, Richard Guenther wrote:
 On Wed, Apr 4, 2012 at 2:35 PM, William J. Schmidt
 wschm...@linux.vnet.ibm.com wrote:
  On Wed, 2012-04-04 at 13:35 +0200, Richard Guenther wrote:
  On Tue, Apr 3, 2012 at 10:25 PM, William J. Schmidt
  wschm...@linux.vnet.ibm.com wrote:
  
  
   On Wed, 2012-03-28 at 15:57 +0200, Richard Guenther wrote:
   On Tue, Mar 6, 2012 at 9:49 PM, William J. Schmidt
   wschm...@linux.vnet.ibm.com wrote:
Hi,
   
This is a re-post of the patch I posted for comments in January to
address http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18589.  The patch
modifies reassociation to expose repeated factors from __builtin_pow*
calls, optimally reassociate repeated factors, and possibly 
reconstitute
__builtin_powi calls from the results of reassociation.
   
Bootstrapped and passes regression tests for powerpc64-linux-gnu.  I
expect there may need to be some small changes, but I am targeting 
this
for trunk approval.
   
Thanks very much for the review,
  
   Hmm.  How much work would it be to extend the reassoc 'IL' to allow
   a repeat factor per op?  I realize what you do is all within what 
   reassoc
   already does though ideally we would not require any GIMPLE IL changes
   for building up / optimizing the reassoc IL but only do so when we 
   commit
   changes.
  
   Thanks,
   Richard.
  
   Hi Richard,
  
   I've revised my patch along these lines; see the new version below.
   While testing it I realized I could do a better job of reducing the
   number of multiplies, so there are some changes to that logic as well,
   and a couple of additional test cases.  Regstrapped successfully on
   powerpc64-linux.
  
   Hope this looks better!
 
  Yes indeed.  A few observations though.  You didn't integrate
  attempt_builtin_powi
  with optimize_ops_list - presumably because it's result does not really 
  fit
  the single-operation assumption?  But note that undistribute_ops_list and
  optimize_range_tests have the same issue.  Thus, I'd have prefered if
  attempt_builtin_powi worked in the same way, remove the parts of the
  ops list it consumed and stick an operand for its result there instead.
  That should simplify things (not having that special powi_result) and
  allow for multiple powi results in a single op list?
 
  Multiple powi results are already handled, but yes, what you're
  suggesting would simplify things by eliminating the need to create
  explicit multiplies to join them and the cached-multiply results
  together.  Sounds reasonable on the surface; it just hadn't occurred to
  me to do it this way.  I'll have a look.
 
  Any other major concerns while I'm reworking this?

 No, the rest looks fine (you should not need to repace
 -fdump-tree-reassoc-details
 with -fdump-tree-reassoc1-details -fdump-tree-reassoc2-details in the first
 testcase).

 Unfortunately this seems to be necessary if I name the two passes
 reassoc1 and reassoc2.  If I try to name both of them reassoc I
 get failures in other tests like gfortran.dg/reassoc_4, where
 -fdump-tree-reassoc1 doesn't work.  Unless I'm missing something
 obvious, I think I need to keep that change.

Hm, naming them reassoc1 and reassoc2 is a hack.  Naming both
reassoc will not trigger re-naming them to reassoc1 and reassoc2
I think.  How ugly.  Especially that -fdump-tree-reassoc will no longer
work.  Maybe instead of using two pass structs resort to using
the existing hack with using first_pass_instance and TODO_mark_first_instance.

 Frankly I was surprised and relieved that there weren't more tests that
 used the generic -fdump-tree-reassoc.

 Thanks,
 Bill

 Thanks,
 Richard.

  Thanks,
  Bill
 
  Thanks,
  Richard.
 
 
 





Re: [PATCH] Fix PR18589

2012-04-05 Thread William J. Schmidt
On Thu, 2012-04-05 at 11:23 +0200, Richard Guenther wrote:
 On Wed, Apr 4, 2012 at 9:15 PM, William J. Schmidt
 wschm...@linux.vnet.ibm.com wrote:
  
  Unfortunately this seems to be necessary if I name the two passes
  reassoc1 and reassoc2.  If I try to name both of them reassoc I
  get failures in other tests like gfortran.dg/reassoc_4, where
  -fdump-tree-reassoc1 doesn't work.  Unless I'm missing something
  obvious, I think I need to keep that change.
 
 Hm, naming them reassoc1 and reassoc2 is a hack.  Naming both
 reassoc will not trigger re-naming them to reassoc1 and reassoc2
 I think.  How ugly.  Especially that -fdump-tree-reassoc will no longer
 work.  Maybe instead of using two pass structs resort to using
 the existing hack with using first_pass_instance and TODO_mark_first_instance.

OK, that seems to be the best among evils.  Using the
first_pass_instance hack, the patch is transformed as below.
Regstrapped on powerpc64-linux, no additional failures.  OK for trunk?

Thanks,
Bill


gcc:

2012-04-05  Bill Schmidt  wschm...@linux.vnet.ibm.com

PR tree-optimization/18589
* tree-ssa-reassoc.c (reassociate_stats): Add two fields.
(operand_entry): Add count field.
(add_repeat_to_ops_vec): New function.
(completely_remove_stmt): Likewise.
(remove_def_if_absorbed_call): Likewise.
(remove_visited_stmt_chain): Remove feeding builtin pow/powi calls.
(acceptable_pow_call): New function.
(linearize_expr_tree): Look for builtin pow/powi calls and add operand
entries with repeat counts when found.
(repeat_factor_d): New struct and associated typedefs.
(repeat_factor_vec): New static vector variable.
(compare_repeat_factors): New function.
(get_reassoc_pow_ssa_name): Likewise.
(attempt_builtin_powi): Likewise.
(reassociate_bb): Call attempt_builtin_powi.
(fini_reassoc): Two new calls to statistics_counter_event.

gcc/testsuite:

2012-04-05  Bill Schmidt  wschm...@linux.vnet.ibm.com

PR tree-optimization/18589
* gcc.dg/tree-ssa/pr18589-1.c: New test.
* gcc.dg/tree-ssa/pr18589-2.c: Likewise.
* gcc.dg/tree-ssa/pr18589-3.c: Likewise.
* gcc.dg/tree-ssa/pr18589-4.c: Likewise.
* gcc.dg/tree-ssa/pr18589-5.c: Likewise.
* gcc.dg/tree-ssa/pr18589-6.c: Likewise.
* gcc.dg/tree-ssa/pr18589-7.c: Likewise.
* gcc.dg/tree-ssa/pr18589-8.c: Likewise.
* gcc.dg/tree-ssa/pr18589-9.c: Likewise.
* gcc.dg/tree-ssa/pr18589-10.c: Likewise.


Index: gcc/testsuite/gcc.dg/tree-ssa/pr18589-4.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/pr18589-4.c   (revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/pr18589-4.c   (revision 0)
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options -O3 -ffast-math -fdump-tree-optimized } */
+
+double baz (double x, double y, double z, double u)
+{
+  return x * x * y * y * y * z * z * z * z * u;
+}
+
+/* { dg-final { scan-tree-dump-times  \\*  6 optimized } } */
+/* { dg-final { cleanup-tree-dump optimized } } */
Index: gcc/testsuite/gcc.dg/tree-ssa/pr18589-5.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/pr18589-5.c   (revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/pr18589-5.c   (revision 0)
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options -O3 -ffast-math -fdump-tree-optimized } */
+
+double baz (double x, double y, double z, double u)
+{
+  return x * x * x * y * y * y * z * z * z * z * u * u * u * u;
+}
+
+/* { dg-final { scan-tree-dump-times  \\*  6 optimized } } */
+/* { dg-final { cleanup-tree-dump optimized } } */
Index: gcc/testsuite/gcc.dg/tree-ssa/pr18589-6.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/pr18589-6.c   (revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/pr18589-6.c   (revision 0)
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options -O3 -ffast-math -fdump-tree-optimized } */
+
+double baz (double x, double y)
+{
+  return __builtin_pow (x, 3.0) * __builtin_pow (y, 4.0);
+}
+
+/* { dg-final { scan-tree-dump-times  \\*  4 optimized } } */
+/* { dg-final { cleanup-tree-dump optimized } } */
Index: gcc/testsuite/gcc.dg/tree-ssa/pr18589-7.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/pr18589-7.c   (revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/pr18589-7.c   (revision 0)
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options -O3 -ffast-math -fdump-tree-optimized } */
+
+float baz (float x, float y)
+{
+  return x * x * x * x * y * y * y * y;
+}
+
+/* { dg-final { scan-tree-dump-times  \\*  3 optimized } } */
+/* { dg-final { cleanup-tree-dump optimized } } */
Index: gcc/testsuite/gcc.dg/tree-ssa/pr18589-8.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/pr18589-8.c   

Re: [PATCH] Fix PR18589

2012-04-04 Thread Richard Guenther
On Tue, Apr 3, 2012 at 10:25 PM, William J. Schmidt
wschm...@linux.vnet.ibm.com wrote:


 On Wed, 2012-03-28 at 15:57 +0200, Richard Guenther wrote:
 On Tue, Mar 6, 2012 at 9:49 PM, William J. Schmidt
 wschm...@linux.vnet.ibm.com wrote:
  Hi,
 
  This is a re-post of the patch I posted for comments in January to
  address http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18589.  The patch
  modifies reassociation to expose repeated factors from __builtin_pow*
  calls, optimally reassociate repeated factors, and possibly reconstitute
  __builtin_powi calls from the results of reassociation.
 
  Bootstrapped and passes regression tests for powerpc64-linux-gnu.  I
  expect there may need to be some small changes, but I am targeting this
  for trunk approval.
 
  Thanks very much for the review,

 Hmm.  How much work would it be to extend the reassoc 'IL' to allow
 a repeat factor per op?  I realize what you do is all within what reassoc
 already does though ideally we would not require any GIMPLE IL changes
 for building up / optimizing the reassoc IL but only do so when we commit
 changes.

 Thanks,
 Richard.

 Hi Richard,

 I've revised my patch along these lines; see the new version below.
 While testing it I realized I could do a better job of reducing the
 number of multiplies, so there are some changes to that logic as well,
 and a couple of additional test cases.  Regstrapped successfully on
 powerpc64-linux.

 Hope this looks better!

Yes indeed.  A few observations though.  You didn't integrate
attempt_builtin_powi
with optimize_ops_list - presumably because it's result does not really fit
the single-operation assumption?  But note that undistribute_ops_list and
optimize_range_tests have the same issue.  Thus, I'd have prefered if
attempt_builtin_powi worked in the same way, remove the parts of the
ops list it consumed and stick an operand for its result there instead.
That should simplify things (not having that special powi_result) and
allow for multiple powi results in a single op list?

Thanks,
Richard.

 Thanks,
 Bill


 gcc:

 2012-04-03  Bill Schmidt  wschm...@linux.vnet.ibm.com

        PR tree-optimization/18589
        * tree-pass.h: Replace pass_reassoc with pass_early_reassoc and
        pass_late_reassoc.
        * passes.c (init_optimization_passes): Change pass_reassoc calls to
        pass_early_reassoc and pass_late_reassoc.
        * tree-ssa-reassoc.c (reassociate_stats): Add two fields.
        (operand_entry): Add count field.
        (early_reassoc): New static var.
        (add_repeat_to_ops_vec): New function.
        (completely_remove_stmt): Likewise.
        (remove_def_if_absorbed_call): Likewise.
        (remove_visited_stmt_chain): Remove feeding builtin pow/powi calls.
        (acceptable_pow_call): New function.
        (linearize_expr_tree): Look for builtin pow/powi calls and add operand
        entries with repeat counts when found.
        (repeat_factor_d): New struct and associated typedefs.
        (repeat_factor_vec): New static vector variable.
        (compare_repeat_factors): New function.
        (get_reassoc_pow_ssa_name): Likewise.
        (attempt_builtin_powi): Likewise.
        (reassociate_bb): Attempt to create __builtin_powi calls, and multiply
        their results by any leftover reassociated factors; remove builtin
        pow/powi calls that were absorbed by reassociation.
        (fini_reassoc): Two new calls to statistics_counter_event.
        (execute_early_reassoc): New function.
        (execute_late_reassoc): Likewise.
        (pass_early_reassoc): Replace pass_reassoc, renamed to reassoc1,
        call execute_early_reassoc.
        (pass_late_reassoc): New gimple_opt_pass named reassoc2 that calls
        execute_late_reassoc.

 gcc/testsuite:

 2012-04-03  Bill Schmidt  wschm...@linux.vnet.ibm.com

        PR tree-optimization/18589
        * gcc.dg/pr46309.c: Change -fdump-tree-reassoc-details to
        -fdump-tree-reassoc[12]-details.
        * gcc.dg/tree-ssa/pr18589-1.c: New test.
        * gcc.dg/tree-ssa/pr18589-2.c: Likewise.
        * gcc.dg/tree-ssa/pr18589-3.c: Likewise.
        * gcc.dg/tree-ssa/pr18589-4.c: Likewise.
        * gcc.dg/tree-ssa/pr18589-5.c: Likewise.
        * gcc.dg/tree-ssa/pr18589-6.c: Likewise.
        * gcc.dg/tree-ssa/pr18589-7.c: Likewise.
        * gcc.dg/tree-ssa/pr18589-8.c: Likewise.
        * gcc.dg/tree-ssa/pr18589-9.c: Likewise.
        * gcc.dg/tree-ssa/pr18589-10.c: Likewise.


 Index: gcc/tree-pass.h
 ===
 --- gcc/tree-pass.h     (revision 186108)
 +++ gcc/tree-pass.h     (working copy)
 @@ -441,7 +441,8 @@ extern struct gimple_opt_pass pass_copy_prop;
  extern struct gimple_opt_pass pass_vrp;
  extern struct gimple_opt_pass pass_uncprop;
  extern struct gimple_opt_pass pass_return_slot;
 -extern struct gimple_opt_pass pass_reassoc;
 +extern struct gimple_opt_pass pass_early_reassoc;
 +extern struct gimple_opt_pass pass_late_reassoc;

Re: [PATCH] Fix PR18589

2012-04-04 Thread William J. Schmidt
On Wed, 2012-04-04 at 13:35 +0200, Richard Guenther wrote:
 On Tue, Apr 3, 2012 at 10:25 PM, William J. Schmidt
 wschm...@linux.vnet.ibm.com wrote:
 
 
  On Wed, 2012-03-28 at 15:57 +0200, Richard Guenther wrote:
  On Tue, Mar 6, 2012 at 9:49 PM, William J. Schmidt
  wschm...@linux.vnet.ibm.com wrote:
   Hi,
  
   This is a re-post of the patch I posted for comments in January to
   address http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18589.  The patch
   modifies reassociation to expose repeated factors from __builtin_pow*
   calls, optimally reassociate repeated factors, and possibly reconstitute
   __builtin_powi calls from the results of reassociation.
  
   Bootstrapped and passes regression tests for powerpc64-linux-gnu.  I
   expect there may need to be some small changes, but I am targeting this
   for trunk approval.
  
   Thanks very much for the review,
 
  Hmm.  How much work would it be to extend the reassoc 'IL' to allow
  a repeat factor per op?  I realize what you do is all within what reassoc
  already does though ideally we would not require any GIMPLE IL changes
  for building up / optimizing the reassoc IL but only do so when we commit
  changes.
 
  Thanks,
  Richard.
 
  Hi Richard,
 
  I've revised my patch along these lines; see the new version below.
  While testing it I realized I could do a better job of reducing the
  number of multiplies, so there are some changes to that logic as well,
  and a couple of additional test cases.  Regstrapped successfully on
  powerpc64-linux.
 
  Hope this looks better!
 
 Yes indeed.  A few observations though.  You didn't integrate
 attempt_builtin_powi
 with optimize_ops_list - presumably because it's result does not really fit
 the single-operation assumption?  But note that undistribute_ops_list and
 optimize_range_tests have the same issue.  Thus, I'd have prefered if
 attempt_builtin_powi worked in the same way, remove the parts of the
 ops list it consumed and stick an operand for its result there instead.
 That should simplify things (not having that special powi_result) and
 allow for multiple powi results in a single op list?

Multiple powi results are already handled, but yes, what you're
suggesting would simplify things by eliminating the need to create
explicit multiplies to join them and the cached-multiply results
together.  Sounds reasonable on the surface; it just hadn't occurred to
me to do it this way.  I'll have a look.

Any other major concerns while I'm reworking this?

Thanks,
Bill
 
 Thanks,
 Richard.
 




Re: [PATCH] Fix PR18589

2012-04-04 Thread Richard Guenther
On Wed, Apr 4, 2012 at 2:35 PM, William J. Schmidt
wschm...@linux.vnet.ibm.com wrote:
 On Wed, 2012-04-04 at 13:35 +0200, Richard Guenther wrote:
 On Tue, Apr 3, 2012 at 10:25 PM, William J. Schmidt
 wschm...@linux.vnet.ibm.com wrote:
 
 
  On Wed, 2012-03-28 at 15:57 +0200, Richard Guenther wrote:
  On Tue, Mar 6, 2012 at 9:49 PM, William J. Schmidt
  wschm...@linux.vnet.ibm.com wrote:
   Hi,
  
   This is a re-post of the patch I posted for comments in January to
   address http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18589.  The patch
   modifies reassociation to expose repeated factors from __builtin_pow*
   calls, optimally reassociate repeated factors, and possibly reconstitute
   __builtin_powi calls from the results of reassociation.
  
   Bootstrapped and passes regression tests for powerpc64-linux-gnu.  I
   expect there may need to be some small changes, but I am targeting this
   for trunk approval.
  
   Thanks very much for the review,
 
  Hmm.  How much work would it be to extend the reassoc 'IL' to allow
  a repeat factor per op?  I realize what you do is all within what reassoc
  already does though ideally we would not require any GIMPLE IL changes
  for building up / optimizing the reassoc IL but only do so when we commit
  changes.
 
  Thanks,
  Richard.
 
  Hi Richard,
 
  I've revised my patch along these lines; see the new version below.
  While testing it I realized I could do a better job of reducing the
  number of multiplies, so there are some changes to that logic as well,
  and a couple of additional test cases.  Regstrapped successfully on
  powerpc64-linux.
 
  Hope this looks better!

 Yes indeed.  A few observations though.  You didn't integrate
 attempt_builtin_powi
 with optimize_ops_list - presumably because it's result does not really fit
 the single-operation assumption?  But note that undistribute_ops_list and
 optimize_range_tests have the same issue.  Thus, I'd have prefered if
 attempt_builtin_powi worked in the same way, remove the parts of the
 ops list it consumed and stick an operand for its result there instead.
 That should simplify things (not having that special powi_result) and
 allow for multiple powi results in a single op list?

 Multiple powi results are already handled, but yes, what you're
 suggesting would simplify things by eliminating the need to create
 explicit multiplies to join them and the cached-multiply results
 together.  Sounds reasonable on the surface; it just hadn't occurred to
 me to do it this way.  I'll have a look.

 Any other major concerns while I'm reworking this?

No, the rest looks fine (you should not need to repace
-fdump-tree-reassoc-details
with -fdump-tree-reassoc1-details -fdump-tree-reassoc2-details in the first
testcase).

Thanks,
Richard.

 Thanks,
 Bill

 Thanks,
 Richard.





Re: [PATCH] Fix PR18589

2012-04-04 Thread William J. Schmidt
On Wed, 2012-04-04 at 15:08 +0200, Richard Guenther wrote:
 On Wed, Apr 4, 2012 at 2:35 PM, William J. Schmidt
 wschm...@linux.vnet.ibm.com wrote:
  On Wed, 2012-04-04 at 13:35 +0200, Richard Guenther wrote:
  On Tue, Apr 3, 2012 at 10:25 PM, William J. Schmidt
  wschm...@linux.vnet.ibm.com wrote:
  
  
   On Wed, 2012-03-28 at 15:57 +0200, Richard Guenther wrote:
   On Tue, Mar 6, 2012 at 9:49 PM, William J. Schmidt
   wschm...@linux.vnet.ibm.com wrote:
Hi,
   
This is a re-post of the patch I posted for comments in January to
address http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18589.  The patch
modifies reassociation to expose repeated factors from __builtin_pow*
calls, optimally reassociate repeated factors, and possibly 
reconstitute
__builtin_powi calls from the results of reassociation.
   
Bootstrapped and passes regression tests for powerpc64-linux-gnu.  I
expect there may need to be some small changes, but I am targeting 
this
for trunk approval.
   
Thanks very much for the review,
  
   Hmm.  How much work would it be to extend the reassoc 'IL' to allow
   a repeat factor per op?  I realize what you do is all within what 
   reassoc
   already does though ideally we would not require any GIMPLE IL changes
   for building up / optimizing the reassoc IL but only do so when we 
   commit
   changes.
  
   Thanks,
   Richard.
  
   Hi Richard,
  
   I've revised my patch along these lines; see the new version below.
   While testing it I realized I could do a better job of reducing the
   number of multiplies, so there are some changes to that logic as well,
   and a couple of additional test cases.  Regstrapped successfully on
   powerpc64-linux.
  
   Hope this looks better!
 
  Yes indeed.  A few observations though.  You didn't integrate
  attempt_builtin_powi
  with optimize_ops_list - presumably because it's result does not really fit
  the single-operation assumption?  But note that undistribute_ops_list and
  optimize_range_tests have the same issue.  Thus, I'd have prefered if
  attempt_builtin_powi worked in the same way, remove the parts of the
  ops list it consumed and stick an operand for its result there instead.
  That should simplify things (not having that special powi_result) and
  allow for multiple powi results in a single op list?
 
  Multiple powi results are already handled, but yes, what you're
  suggesting would simplify things by eliminating the need to create
  explicit multiplies to join them and the cached-multiply results
  together.  Sounds reasonable on the surface; it just hadn't occurred to
  me to do it this way.  I'll have a look.
 
  Any other major concerns while I'm reworking this?
 
 No, the rest looks fine (you should not need to repace
 -fdump-tree-reassoc-details
 with -fdump-tree-reassoc1-details -fdump-tree-reassoc2-details in the first
 testcase).

Unfortunately this seems to be necessary if I name the two passes
reassoc1 and reassoc2.  If I try to name both of them reassoc I
get failures in other tests like gfortran.dg/reassoc_4, where
-fdump-tree-reassoc1 doesn't work.  Unless I'm missing something
obvious, I think I need to keep that change.

Frankly I was surprised and relieved that there weren't more tests that
used the generic -fdump-tree-reassoc.

Thanks,
Bill
 
 Thanks,
 Richard.
 
  Thanks,
  Bill
 
  Thanks,
  Richard.
 
 
 
 




Re: [PATCH] Fix PR18589

2012-04-04 Thread William J. Schmidt
On Wed, 2012-04-04 at 13:35 +0200, Richard Guenther wrote:
 On Tue, Apr 3, 2012 at 10:25 PM, William J. Schmidt
 wschm...@linux.vnet.ibm.com wrote:
 
  Hi Richard,
 
  I've revised my patch along these lines; see the new version below.
  While testing it I realized I could do a better job of reducing the
  number of multiplies, so there are some changes to that logic as well,
  and a couple of additional test cases.  Regstrapped successfully on
  powerpc64-linux.
 
  Hope this looks better!
 
 Yes indeed.  A few observations though.  You didn't integrate
 attempt_builtin_powi
 with optimize_ops_list - presumably because it's result does not really fit
 the single-operation assumption?  But note that undistribute_ops_list and
 optimize_range_tests have the same issue.  Thus, I'd have prefered if
 attempt_builtin_powi worked in the same way, remove the parts of the
 ops list it consumed and stick an operand for its result there instead.
 That should simplify things (not having that special powi_result) and
 allow for multiple powi results in a single op list?

An excellent suggestion.  I've implemented it below and it is indeed
much cleaner this way.

Bootstrapped/regression tested with no new failures on powerpc64-linux.
Is this incarnation OK for trunk?

Thanks,
Bill

 
 Thanks,
 Richard.
 
  Thanks,
  Bill


gcc:

2012-04-04  Bill Schmidt  wschm...@linux.vnet.ibm.com

PR tree-optimization/18589
* tree-pass.h: Replace pass_reassoc with pass_early_reassoc and
pass_late_reassoc.
* passes.c (init_optimization_passes): Change pass_reassoc calls to
pass_early_reassoc and pass_late_reassoc.
* tree-ssa-reassoc.c (reassociate_stats): Add two fields.
(operand_entry): Add count field.
(early_reassoc): New static var.
(add_repeat_to_ops_vec): New function.
(completely_remove_stmt): Likewise.
(remove_def_if_absorbed_call): Likewise.
(remove_visited_stmt_chain): Remove feeding builtin pow/powi calls.
(acceptable_pow_call): New function.
(linearize_expr_tree): Look for builtin pow/powi calls and add operand
entries with repeat counts when found.
(repeat_factor_d): New struct and associated typedefs.
(repeat_factor_vec): New static vector variable.
(compare_repeat_factors): New function.
(get_reassoc_pow_ssa_name): Likewise.
(attempt_builtin_powi): Likewise.
(reassociate_bb): Call attempt_builtin_powi.
(fini_reassoc): Two new calls to statistics_counter_event.
(execute_early_reassoc): New function.
(execute_late_reassoc): Likewise.
(pass_early_reassoc): Rename from pass_reassoc, call
execute_early_reassoc.
(pass_late_reassoc): New gimple_opt_pass that calls
execute_late_reassoc.

gcc/testsuite:

2012-04-04  Bill Schmidt  wschm...@linux.vnet.ibm.com

PR tree-optimization/18589
* gcc.dg/pr46309.c: Change -fdump-tree-reassoc-details to
-fdump-tree-reassoc1-details and -fdump-tree-reassoc2-details.
* gcc.dg/tree-ssa/pr18589-1.c: New test.
* gcc.dg/tree-ssa/pr18589-2.c: Likewise.
* gcc.dg/tree-ssa/pr18589-3.c: Likewise.
* gcc.dg/tree-ssa/pr18589-4.c: Likewise.
* gcc.dg/tree-ssa/pr18589-5.c: Likewise.
* gcc.dg/tree-ssa/pr18589-6.c: Likewise.
* gcc.dg/tree-ssa/pr18589-7.c: Likewise.
* gcc.dg/tree-ssa/pr18589-8.c: Likewise.
* gcc.dg/tree-ssa/pr18589-9.c: Likewise.
* gcc.dg/tree-ssa/pr18589-10.c: Likewise.


Index: gcc/tree-pass.h
===
--- gcc/tree-pass.h (revision 186108)
+++ gcc/tree-pass.h (working copy)
@@ -441,7 +441,8 @@ extern struct gimple_opt_pass pass_copy_prop;
 extern struct gimple_opt_pass pass_vrp;
 extern struct gimple_opt_pass pass_uncprop;
 extern struct gimple_opt_pass pass_return_slot;
-extern struct gimple_opt_pass pass_reassoc;
+extern struct gimple_opt_pass pass_early_reassoc;
+extern struct gimple_opt_pass pass_late_reassoc;
 extern struct gimple_opt_pass pass_rebuild_cgraph_edges;
 extern struct gimple_opt_pass pass_remove_cgraph_callee_edges;
 extern struct gimple_opt_pass pass_build_cgraph_edges;
Index: gcc/testsuite/gcc.dg/pr46309.c
===
--- gcc/testsuite/gcc.dg/pr46309.c  (revision 186108)
+++ gcc/testsuite/gcc.dg/pr46309.c  (working copy)
@@ -1,6 +1,6 @@
 /* PR tree-optimization/46309 */
 /* { dg-do compile } */
-/* { dg-options -O2 -fdump-tree-reassoc-details } */
+/* { dg-options -O2 -fdump-tree-reassoc1-details 
-fdump-tree-reassoc2-details } */
 /* The transformation depends on BRANCH_COST being greater than 1
(see the notes in the PR), so try to force that.  */
 /* { dg-additional-options -mtune=octeon2 { target mips*-*-* } } */
Index: gcc/testsuite/gcc.dg/tree-ssa/pr18589-4.c

Re: [PATCH] Fix PR18589

2012-04-03 Thread William J. Schmidt


On Wed, 2012-03-28 at 15:57 +0200, Richard Guenther wrote:
 On Tue, Mar 6, 2012 at 9:49 PM, William J. Schmidt
 wschm...@linux.vnet.ibm.com wrote:
  Hi,
 
  This is a re-post of the patch I posted for comments in January to
  address http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18589.  The patch
  modifies reassociation to expose repeated factors from __builtin_pow*
  calls, optimally reassociate repeated factors, and possibly reconstitute
  __builtin_powi calls from the results of reassociation.
 
  Bootstrapped and passes regression tests for powerpc64-linux-gnu.  I
  expect there may need to be some small changes, but I am targeting this
  for trunk approval.
 
  Thanks very much for the review,
 
 Hmm.  How much work would it be to extend the reassoc 'IL' to allow
 a repeat factor per op?  I realize what you do is all within what reassoc
 already does though ideally we would not require any GIMPLE IL changes
 for building up / optimizing the reassoc IL but only do so when we commit
 changes.
 
 Thanks,
 Richard.

Hi Richard,

I've revised my patch along these lines; see the new version below.
While testing it I realized I could do a better job of reducing the
number of multiplies, so there are some changes to that logic as well,
and a couple of additional test cases.  Regstrapped successfully on
powerpc64-linux.

Hope this looks better!

Thanks,
Bill


gcc:

2012-04-03  Bill Schmidt  wschm...@linux.vnet.ibm.com

PR tree-optimization/18589
* tree-pass.h: Replace pass_reassoc with pass_early_reassoc and
pass_late_reassoc.
* passes.c (init_optimization_passes): Change pass_reassoc calls to
pass_early_reassoc and pass_late_reassoc.
* tree-ssa-reassoc.c (reassociate_stats): Add two fields.
(operand_entry): Add count field.
(early_reassoc): New static var.
(add_repeat_to_ops_vec): New function.
(completely_remove_stmt): Likewise.
(remove_def_if_absorbed_call): Likewise.
(remove_visited_stmt_chain): Remove feeding builtin pow/powi calls.
(acceptable_pow_call): New function.
(linearize_expr_tree): Look for builtin pow/powi calls and add operand
entries with repeat counts when found.
(repeat_factor_d): New struct and associated typedefs.
(repeat_factor_vec): New static vector variable.
(compare_repeat_factors): New function.
(get_reassoc_pow_ssa_name): Likewise.
(attempt_builtin_powi): Likewise.
(reassociate_bb): Attempt to create __builtin_powi calls, and multiply
their results by any leftover reassociated factors; remove builtin
pow/powi calls that were absorbed by reassociation.
(fini_reassoc): Two new calls to statistics_counter_event.
(execute_early_reassoc): New function.
(execute_late_reassoc): Likewise.
(pass_early_reassoc): Replace pass_reassoc, renamed to reassoc1,
call execute_early_reassoc.
(pass_late_reassoc): New gimple_opt_pass named reassoc2 that calls
execute_late_reassoc.

gcc/testsuite:

2012-04-03  Bill Schmidt  wschm...@linux.vnet.ibm.com

PR tree-optimization/18589
* gcc.dg/pr46309.c: Change -fdump-tree-reassoc-details to
-fdump-tree-reassoc[12]-details.
* gcc.dg/tree-ssa/pr18589-1.c: New test.
* gcc.dg/tree-ssa/pr18589-2.c: Likewise.
* gcc.dg/tree-ssa/pr18589-3.c: Likewise.
* gcc.dg/tree-ssa/pr18589-4.c: Likewise.
* gcc.dg/tree-ssa/pr18589-5.c: Likewise.
* gcc.dg/tree-ssa/pr18589-6.c: Likewise.
* gcc.dg/tree-ssa/pr18589-7.c: Likewise.
* gcc.dg/tree-ssa/pr18589-8.c: Likewise.
* gcc.dg/tree-ssa/pr18589-9.c: Likewise.
* gcc.dg/tree-ssa/pr18589-10.c: Likewise.


Index: gcc/tree-pass.h
===
--- gcc/tree-pass.h (revision 186108)
+++ gcc/tree-pass.h (working copy)
@@ -441,7 +441,8 @@ extern struct gimple_opt_pass pass_copy_prop;
 extern struct gimple_opt_pass pass_vrp;
 extern struct gimple_opt_pass pass_uncprop;
 extern struct gimple_opt_pass pass_return_slot;
-extern struct gimple_opt_pass pass_reassoc;
+extern struct gimple_opt_pass pass_early_reassoc;
+extern struct gimple_opt_pass pass_late_reassoc;
 extern struct gimple_opt_pass pass_rebuild_cgraph_edges;
 extern struct gimple_opt_pass pass_remove_cgraph_callee_edges;
 extern struct gimple_opt_pass pass_build_cgraph_edges;
Index: gcc/testsuite/gcc.dg/pr46309.c
===
--- gcc/testsuite/gcc.dg/pr46309.c  (revision 186108)
+++ gcc/testsuite/gcc.dg/pr46309.c  (working copy)
@@ -1,6 +1,6 @@
 /* PR tree-optimization/46309 */
 /* { dg-do compile } */
-/* { dg-options -O2 -fdump-tree-reassoc-details } */
+/* { dg-options -O2 -fdump-tree-reassoc1-details 
-fdump-tree-reassoc2-details } */
 /* The transformation depends on BRANCH_COST being greater than 1
(see the 

Re: [PATCH] Fix PR18589

2012-03-28 Thread Richard Guenther
On Tue, Mar 6, 2012 at 9:49 PM, William J. Schmidt
wschm...@linux.vnet.ibm.com wrote:
 Hi,

 This is a re-post of the patch I posted for comments in January to
 address http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18589.  The patch
 modifies reassociation to expose repeated factors from __builtin_pow*
 calls, optimally reassociate repeated factors, and possibly reconstitute
 __builtin_powi calls from the results of reassociation.

 Bootstrapped and passes regression tests for powerpc64-linux-gnu.  I
 expect there may need to be some small changes, but I am targeting this
 for trunk approval.

 Thanks very much for the review,

Hmm.  How much work would it be to extend the reassoc 'IL' to allow
a repeat factor per op?  I realize what you do is all within what reassoc
already does though ideally we would not require any GIMPLE IL changes
for building up / optimizing the reassoc IL but only do so when we commit
changes.

Thanks,
Richard.

 Bill


 gcc:

 2012-03-06  Bill Schmidt  wschm...@linux.vnet.ibm.com

        * tree-pass.h: Replace pass_reassoc with pass_early_reassoc and
        pass_late_reassoc.
        * passes.c (init_optimization_passes): Change pass_reassoc calls to
        pass_early_reassoc and pass_late_reassoc.
        * tree-ssa-reassoc.c (reassociate_stats): Add two fields.
        (early_reassoc): New static var.
        (MAX_POW_EXPAND): New #define'd constant.
        (linear_expand_pow_common): New function.
        (linear_expand_powi): Likewise.
        (linear_expand_pow): Likewise.
        (break_up_subtract_bb): Attempt to expand __builtin_pow[i].
        (repeat_factor_d): New struct and associated typedefs.
        (repeat_factor_vec): New static vector variable.
        (compare_repeat_factors): New function.
        (get_reassoc_pow_ssa_name): Likewise.
        (attempt_builtin_powi): Likewise.
        (reassociate_bb): Attempt to reconstitute __builtin_powi calls, and
        multiply their results by any leftover reassociated factors.
        (fini_reassoc): Two new calls to statistics_counter_event.
        (execute_early_reassoc): New function.
        (execute_late_reassoc): Likewise.
        (pass_early_reassoc): Replace pass_reassoc, renamed to reassoc1,
        call execute_early_reassoc.
        (pass_late_reassoc): New gimple_opt_pass named reassoc2 that calls
        execute_late_reassoc.

 gcc/testsuite:

 2012-03-06  Bill Schmidt  wschm...@linux.vnet.ibm.com

        * gcc.dg/pr46309.c: Change -fdump-tree-reassoc-details to
        -fdump-tree-reassoc[12]-details.
        * gcc.dg/tree-ssa/pr18589-1.c: New test.
        * gcc.dg/tree-ssa/pr18589-2.c: Likewise.
        * gcc.dg/tree-ssa/pr18589-3.c: Likewise.
        * gcc.dg/tree-ssa/pr18589-4.c: Likewise.
        * gcc.dg/tree-ssa/pr18589-5.c: Likewise.
        * gcc.dg/tree-ssa/pr18589-6.c: Likewise.
        * gcc.dg/tree-ssa/pr18589-7.c: Likewise.
        * gcc.dg/tree-ssa/pr18589-8.c: Likewise.


 Index: gcc/tree-pass.h
 ===
 --- gcc/tree-pass.h     (revision 184997)
 +++ gcc/tree-pass.h     (working copy)
 @@ -440,7 +440,8 @@ extern struct gimple_opt_pass pass_copy_prop;
  extern struct gimple_opt_pass pass_vrp;
  extern struct gimple_opt_pass pass_uncprop;
  extern struct gimple_opt_pass pass_return_slot;
 -extern struct gimple_opt_pass pass_reassoc;
 +extern struct gimple_opt_pass pass_early_reassoc;
 +extern struct gimple_opt_pass pass_late_reassoc;
  extern struct gimple_opt_pass pass_rebuild_cgraph_edges;
  extern struct gimple_opt_pass pass_remove_cgraph_callee_edges;
  extern struct gimple_opt_pass pass_build_cgraph_edges;
 Index: gcc/testsuite/gcc.dg/pr46309.c
 ===
 --- gcc/testsuite/gcc.dg/pr46309.c      (revision 184997)
 +++ gcc/testsuite/gcc.dg/pr46309.c      (working copy)
 @@ -1,6 +1,6 @@
  /* PR tree-optimization/46309 */
  /* { dg-do compile } */
 -/* { dg-options -O2 -fdump-tree-reassoc-details } */
 +/* { dg-options -O2 -fdump-tree-reassoc1-details 
 -fdump-tree-reassoc2-details } */
  /* The transformation depends on BRANCH_COST being greater than 1
    (see the notes in the PR), so try to force that.  */
  /* { dg-additional-options -mtune=octeon2 { target mips*-*-* } } */
 Index: gcc/testsuite/gcc.dg/tree-ssa/pr18589-4.c
 ===
 --- gcc/testsuite/gcc.dg/tree-ssa/pr18589-4.c   (revision 0)
 +++ gcc/testsuite/gcc.dg/tree-ssa/pr18589-4.c   (revision 0)
 @@ -0,0 +1,10 @@
 +/* { dg-do compile } */
 +/* { dg-options -O3 -ffast-math -fdump-tree-optimized } */
 +
 +double baz (double x, double y, double z, double u)
 +{
 +  return x * x * y * y * y * z * z * z * z * u;
 +}
 +
 +/* { dg-final { scan-tree-dump-times  \\*  7 optimized } } */
 +/* { dg-final { cleanup-tree-dump optimized } } */
 Index: gcc/testsuite/gcc.dg/tree-ssa/pr18589-5.c
 ===
 

Re: [PATCH] Fix PR18589

2012-03-28 Thread William J. Schmidt


On Wed, 2012-03-28 at 15:57 +0200, Richard Guenther wrote:
 On Tue, Mar 6, 2012 at 9:49 PM, William J. Schmidt
 wschm...@linux.vnet.ibm.com wrote:
  Hi,
 
  This is a re-post of the patch I posted for comments in January to
  address http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18589.  The patch
  modifies reassociation to expose repeated factors from __builtin_pow*
  calls, optimally reassociate repeated factors, and possibly reconstitute
  __builtin_powi calls from the results of reassociation.
 
  Bootstrapped and passes regression tests for powerpc64-linux-gnu.  I
  expect there may need to be some small changes, but I am targeting this
  for trunk approval.
 
  Thanks very much for the review,
 
 Hmm.  How much work would it be to extend the reassoc 'IL' to allow
 a repeat factor per op?  I realize what you do is all within what reassoc
 already does though ideally we would not require any GIMPLE IL changes
 for building up / optimizing the reassoc IL but only do so when we commit
 changes.

Ah, I take your point.  I will look into it.  We still need the
additional data structures to allow sorting by factor repeat counts, but
perhaps expanding the builtins can be avoided until it's proven
necessary.  The patch as submitted may be slightly easier to implement
and understand, but I agree it would be better to avoid changing GIMPLE
unnecessarily if possible.  I'll get back to you shortly.

Thanks,
Bill

 
 Thanks,
 Richard.
 




[PATCH] Fix PR18589

2012-03-06 Thread William J. Schmidt
Hi,

This is a re-post of the patch I posted for comments in January to
address http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18589.  The patch
modifies reassociation to expose repeated factors from __builtin_pow*
calls, optimally reassociate repeated factors, and possibly reconstitute
__builtin_powi calls from the results of reassociation.

Bootstrapped and passes regression tests for powerpc64-linux-gnu.  I
expect there may need to be some small changes, but I am targeting this
for trunk approval.

Thanks very much for the review,

Bill


gcc:

2012-03-06  Bill Schmidt  wschm...@linux.vnet.ibm.com

* tree-pass.h: Replace pass_reassoc with pass_early_reassoc and
pass_late_reassoc.
* passes.c (init_optimization_passes): Change pass_reassoc calls to
pass_early_reassoc and pass_late_reassoc.
* tree-ssa-reassoc.c (reassociate_stats): Add two fields.
(early_reassoc): New static var.
(MAX_POW_EXPAND): New #define'd constant.
(linear_expand_pow_common): New function.
(linear_expand_powi): Likewise.
(linear_expand_pow): Likewise.
(break_up_subtract_bb): Attempt to expand __builtin_pow[i].
(repeat_factor_d): New struct and associated typedefs.
(repeat_factor_vec): New static vector variable.
(compare_repeat_factors): New function.
(get_reassoc_pow_ssa_name): Likewise.
(attempt_builtin_powi): Likewise.
(reassociate_bb): Attempt to reconstitute __builtin_powi calls, and
multiply their results by any leftover reassociated factors.
(fini_reassoc): Two new calls to statistics_counter_event.
(execute_early_reassoc): New function.
(execute_late_reassoc): Likewise.
(pass_early_reassoc): Replace pass_reassoc, renamed to reassoc1,
call execute_early_reassoc.
(pass_late_reassoc): New gimple_opt_pass named reassoc2 that calls
execute_late_reassoc.

gcc/testsuite:

2012-03-06  Bill Schmidt  wschm...@linux.vnet.ibm.com

* gcc.dg/pr46309.c: Change -fdump-tree-reassoc-details to
-fdump-tree-reassoc[12]-details.
* gcc.dg/tree-ssa/pr18589-1.c: New test.
* gcc.dg/tree-ssa/pr18589-2.c: Likewise.
* gcc.dg/tree-ssa/pr18589-3.c: Likewise.
* gcc.dg/tree-ssa/pr18589-4.c: Likewise.
* gcc.dg/tree-ssa/pr18589-5.c: Likewise.
* gcc.dg/tree-ssa/pr18589-6.c: Likewise.
* gcc.dg/tree-ssa/pr18589-7.c: Likewise.
* gcc.dg/tree-ssa/pr18589-8.c: Likewise.


Index: gcc/tree-pass.h
===
--- gcc/tree-pass.h (revision 184997)
+++ gcc/tree-pass.h (working copy)
@@ -440,7 +440,8 @@ extern struct gimple_opt_pass pass_copy_prop;
 extern struct gimple_opt_pass pass_vrp;
 extern struct gimple_opt_pass pass_uncprop;
 extern struct gimple_opt_pass pass_return_slot;
-extern struct gimple_opt_pass pass_reassoc;
+extern struct gimple_opt_pass pass_early_reassoc;
+extern struct gimple_opt_pass pass_late_reassoc;
 extern struct gimple_opt_pass pass_rebuild_cgraph_edges;
 extern struct gimple_opt_pass pass_remove_cgraph_callee_edges;
 extern struct gimple_opt_pass pass_build_cgraph_edges;
Index: gcc/testsuite/gcc.dg/pr46309.c
===
--- gcc/testsuite/gcc.dg/pr46309.c  (revision 184997)
+++ gcc/testsuite/gcc.dg/pr46309.c  (working copy)
@@ -1,6 +1,6 @@
 /* PR tree-optimization/46309 */
 /* { dg-do compile } */
-/* { dg-options -O2 -fdump-tree-reassoc-details } */
+/* { dg-options -O2 -fdump-tree-reassoc1-details 
-fdump-tree-reassoc2-details } */
 /* The transformation depends on BRANCH_COST being greater than 1
(see the notes in the PR), so try to force that.  */
 /* { dg-additional-options -mtune=octeon2 { target mips*-*-* } } */
Index: gcc/testsuite/gcc.dg/tree-ssa/pr18589-4.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/pr18589-4.c   (revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/pr18589-4.c   (revision 0)
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options -O3 -ffast-math -fdump-tree-optimized } */
+
+double baz (double x, double y, double z, double u)
+{
+  return x * x * y * y * y * z * z * z * z * u;
+}
+
+/* { dg-final { scan-tree-dump-times  \\*  7 optimized } } */
+/* { dg-final { cleanup-tree-dump optimized } } */
Index: gcc/testsuite/gcc.dg/tree-ssa/pr18589-5.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/pr18589-5.c   (revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/pr18589-5.c   (revision 0)
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options -O3 -ffast-math -fdump-tree-optimized } */
+
+double baz (double x, double y, double z, double u)
+{
+  return x * x * x * y * y * y * z * z * z * z * u * u * u * u;
+}
+
+/* { dg-final { scan-tree-dump-times  \\*  6 optimized } } */
+/* { dg-final { cleanup-tree-dump optimized }