Re: [PATCH GCC][6/6]Restrict predcom using register pressure information

2018-05-29 Thread Bin.Cheng
On Tue, May 29, 2018 at 6:18 PM, David Malcolm  wrote:
> On Tue, 2018-05-29 at 17:04 +0100, Bin.Cheng wrote:
>> On Fri, May 4, 2018 at 5:24 PM, Bin Cheng  wrote:
>> > Hi,
>> > This patch restricts predcom pass using register pressure
>> > information.
>> > In case of high register pressure, we now prune additional chains
>> > as well
>> > as disable unrolling in predcom.  In general, I think this patch
>> > set is
>> > useful.
>> >
>> > Bootstrap and test on x86_64 ongoing.  Any comments?
>>
>> Simple update in line with changes in previous patch.
>>
>> Thanks,
>> bin
>> >
>> > Thanks,
>> > bin
>> > 2018-04-27  Bin Cheng  
>> >
>> > * tree-predcom.c (stor-layout.h, tree-ssa-live.h): Include.
>> > (REG_RELAX_RATIO, prune_chains): New.
>> > (tree_predictive_commoning_loop): Compute reg pressure
>> > using class
>> > region.  Prune chains based on reg pressure.  Force to not
>> > unroll
>> > if reg pressure is high.
>
> [...snip...]
>
>> @@ -3239,6 +3301,11 @@ tree_predictive_commoning_loop (struct loop *loop)
>>    /* Try to combine the chains that are always worked with together.  */
>>    try_combine_chains (loop, &chains);
>>
>> +  region = new lr_region (loop, max_pressure, NULL, NULL, NULL);
>> +  high_pressure_p = region->calculate_pressure ();
>> +  delete region;
>> +  prune_chains (&chains, max_pressure);
>> +
>
> Possibly a silly question, but why the new/delete of "region" here?
> Couldn't this just be an on-stack object, with something like:
Yes, right.  It used to contain dynamically allocated memory, so I deleted
it early.  Now a local object will do.

Thanks,
bin
>
>   lr_region region (loop, max_pressure, NULL, NULL, NULL);
>   high_pressure_p = region.calculate_pressure ();
>   prune_chains (&chains, max_pressure);
>
> or:
>
>   {
>     lr_region region (loop, max_pressure, NULL, NULL, NULL);
>     high_pressure_p = region.calculate_pressure ();
>   }
>   prune_chains (&chains, max_pressure);
>
> if it's important to do the cleanup before prune_chains?
>
> Dave


Re: [PATCH GCC][6/6]Restrict predcom using register pressure information

2018-05-29 Thread David Malcolm
On Tue, 2018-05-29 at 17:04 +0100, Bin.Cheng wrote:
> On Fri, May 4, 2018 at 5:24 PM, Bin Cheng  wrote:
> > Hi,
> > This patch restricts predcom pass using register pressure
> > information.
> > In case of high register pressure, we now prune additional chains
> > as well
> > as disable unrolling in predcom.  In general, I think this patch
> > set is
> > useful.
> > 
> > Bootstrap and test on x86_64 ongoing.  Any comments?
> 
> Simple update in line with changes in previous patch.
> 
> Thanks,
> bin
> > 
> > Thanks,
> > bin
> > 2018-04-27  Bin Cheng  
> > 
> > * tree-predcom.c (stor-layout.h, tree-ssa-live.h): Include.
> > (REG_RELAX_RATIO, prune_chains): New.
> > (tree_predictive_commoning_loop): Compute reg pressure
> > using class
> > region.  Prune chains based on reg pressure.  Force to not
> > unroll
> > if reg pressure is high.

[...snip...]

> @@ -3239,6 +3301,11 @@ tree_predictive_commoning_loop (struct loop *loop)
>    /* Try to combine the chains that are always worked with together.  */
>    try_combine_chains (loop, &chains);
>  
> +  region = new lr_region (loop, max_pressure, NULL, NULL, NULL);
> +  high_pressure_p = region->calculate_pressure ();
> +  delete region;
> +  prune_chains (&chains, max_pressure);
> +

Possibly a silly question, but why the new/delete of "region" here?
Couldn't this just be an on-stack object, with something like:

  lr_region region (loop, max_pressure, NULL, NULL, NULL);
  high_pressure_p = region.calculate_pressure ();
  prune_chains (&chains, max_pressure);

or:

  {
    lr_region region (loop, max_pressure, NULL, NULL, NULL);
    high_pressure_p = region.calculate_pressure ();
  }
  prune_chains (&chains, max_pressure);

if it's important to do the cleanup before prune_chains?
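
The difference between the two variants above is only when the destructor runs. A minimal, self-contained sketch of that ordering follows; region_stub and prune_stub are hypothetical stand-ins invented for illustration, not the real lr_region or prune_chains, and the log vector exists only to make the order observable.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for lr_region: it records when it is constructed,
// used, and destroyed so that the order of events can be inspected.
struct region_stub
{
  explicit region_stub (std::vector<std::string> &log) : m_log (log)
  {
    m_log.push_back ("construct");
  }
  ~region_stub () { m_log.push_back ("destroy"); }

  bool calculate_pressure ()
  {
    m_log.push_back ("calculate");
    return true;
  }

  std::vector<std::string> &m_log;
};

// Hypothetical stand-in for prune_chains: it only records that it ran.
static void
prune_stub (std::vector<std::string> &log)
{
  log.push_back ("prune");
}

// Variant with an inner block: the region is destroyed before pruning.
static bool
scoped_variant (std::vector<std::string> &log)
{
  bool high_pressure_p;
  {
    region_stub region (log);
    high_pressure_p = region.calculate_pressure ();
  }
  // The region is already gone at this point.
  assert (log.back () == "destroy");
  prune_stub (log);
  return high_pressure_p;
}

// Variant without the block: the region lives until the function returns,
// so its cleanup happens after pruning.
static bool
function_scope_variant (std::vector<std::string> &log)
{
  region_stub region (log);
  bool high_pressure_p = region.calculate_pressure ();
  // The region is still alive while pruning runs.
  assert (log.back () == "calculate");
  prune_stub (log);
  return high_pressure_p;
}
```

Calling scoped_variant on an empty log records construct, calculate, destroy, prune; function_scope_variant records construct, calculate, prune, destroy. Either form avoids the explicit new/delete pair; the inner block only matters if the cleanup has to finish before pruning starts.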

Dave


Re: [PATCH GCC][6/6]Restrict predcom using register pressure information

2018-05-29 Thread Bin.Cheng
On Fri, May 4, 2018 at 5:24 PM, Bin Cheng  wrote:
> Hi,
> This patch restricts predcom pass using register pressure information.
> In case of high register pressure, we now prune additional chains as well
> as disable unrolling in predcom.  In general, I think this patch set is
> useful.
>
> Bootstrap and test on x86_64 ongoing.  Any comments?
Simple update in line with changes in previous patch.

Thanks,
bin
>
> Thanks,
> bin
> 2018-04-27  Bin Cheng  
>
> * tree-predcom.c (stor-layout.h, tree-ssa-live.h): Include.
> (REG_RELAX_RATIO, prune_chains): New.
> (tree_predictive_commoning_loop): Compute reg pressure using class
> region.  Prune chains based on reg pressure.  Force to not unroll
> if reg pressure is high.
From b78c779907b98930fc4b36e5558d6f315bb4475b Mon Sep 17 00:00:00 2001
From: Bin Cheng 
Date: Wed, 25 Apr 2018 16:30:41 +0100
Subject: [PATCH 6/6] pcom-reg-pressure-20180428

---
 gcc/tree-predcom.c | 74 ++
 1 file changed, 74 insertions(+)

diff --git a/gcc/tree-predcom.c b/gcc/tree-predcom.c
index aeadbf7..60316e9 100644
--- a/gcc/tree-predcom.c
+++ b/gcc/tree-predcom.c
@@ -217,6 +217,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-pass.h"
 #include "ssa.h"
 #include "gimple-pretty-print.h"
+#include "stor-layout.h"
 #include "alias.h"
 #include "fold-const.h"
 #include "cfgloop.h"
@@ -227,6 +228,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-ssa-loop-ivopts.h"
 #include "tree-ssa-loop-manip.h"
 #include "tree-ssa-loop-niter.h"
+#include "tree-ssa-live.h"
 #include "tree-ssa-loop.h"
 #include "tree-into-ssa.h"
 #include "tree-dfa.h"
@@ -242,6 +244,10 @@ along with GCC; see the file COPYING3.  If not see
 
 #define MAX_DISTANCE (target_avail_regs[GENERAL_REGS] < 16 ? 4 : 8)
 
+/* The ratio by which register pressure check is relaxed.  */
+
+#define REG_RELAX_RATIO (2)
+
 /* Data references (or phi nodes that carry data reference values across
    loop iterations).  */
 
@@ -3156,6 +3162,59 @@ insert_init_seqs (struct loop *loop, vec<chain_p> chains)
   }
 }
 
+/* Prune chains causing high register pressure.  */
+
+static void
+prune_chains (vec<chain_p> *chains, unsigned *max_pressure)
+{
+  bool pruned_p = false;
+  machine_mode mode;
+  enum reg_class cl;
+  unsigned i, new_pressure;
+
+  for (i = 0; i < chains->length ();)
+    {
+      chain_p chain = (*chains)[i];
+      /* Always allow combined chain and zero-length chain.  */
+      if (chain->combined || chain->type == CT_COMBINATION
+	  || chain->length == 0 || chain->type == CT_STORE_STORE)
+	{
+	  i++;
+	  continue;
+	}
+
+      gcc_assert (chain->refs.length () > 0);
+      mode = TYPE_MODE (TREE_TYPE (chain->refs[0]->ref->ref));
+      /* Bypass chain that doesn't contribute to any reg_class, although
+	 something could be wrong when mapping type mode to reg_class.  */
+      if (ira_mode_classes[mode] == NO_REGS)
+	{
+	  i++;
+	  continue;
+	}
+
+      cl = ira_pressure_class_translate[ira_mode_classes[mode]];
+      /* Prune chain if it causes higher register pressure than available
+	 registers; otherwise keep the chain and update register pressure
+	 information.  */
+      new_pressure = max_pressure[cl] + chain->length - 1;
+      if (new_pressure <= target_avail_regs[cl] * REG_RELAX_RATIO)
+	{
+	  i++;
+	  max_pressure[cl] = new_pressure;
+	}
+      else
+	{
+	  release_chain (chain);
+	  chains->unordered_remove (i);
+	  pruned_p = true;
+	}
+    }
+
+  if (pruned_p && dump_file && (dump_flags & TDF_DETAILS))
+    fprintf (dump_file, "Prune chain because of high reg pressure\n");
+}
+
 /* Performs predictive commoning for LOOP.  Sets bit 1<<0 of return value
    if LOOP was unrolled; Sets bit 1<<1 of return value if loop closed ssa
    form was corrupted.  */
@@ -3171,6 +3230,9 @@ tree_predictive_commoning_loop (struct loop *loop)
   struct tree_niter_desc desc;
   bool unroll = false, loop_closed_ssa = false;
   edge exit;
+  lr_region *region;
+  unsigned max_pressure[N_REG_CLASSES];
+  bool high_pressure_p;
 
   if (dump_file && (dump_flags & TDF_DETAILS))
     fprintf (dump_file, "Processing loop %d\n",  loop->num);
@@ -3239,6 +3301,11 @@ tree_predictive_commoning_loop (struct loop *loop)
   /* Try to combine the chains that are always worked with together.  */
   try_combine_chains (loop, &chains);
 
+  region = new lr_region (loop, max_pressure, NULL, NULL, NULL);
+  high_pressure_p = region->calculate_pressure ();
+  delete region;
+  prune_chains (&chains, max_pressure);
+
   insert_init_seqs (loop, chains);
 
   if (dump_file && (dump_flags & TDF_DETAILS))
@@ -3250,6 +3317,13 @@ tree_predictive_commoning_loop (struct loop *loop)
   /* Determine the unroll factor, and if the loop should be unrolled, ensure
      that its number of iterations is divisible by the factor.  */
   unroll_factor = determine_unroll_factor (chains);
+  /* Force to not unroll if register pressure is high.  */
+  if 

[PATCH GCC][6/6]Restrict predcom using register pressure information

2018-05-04 Thread Bin Cheng
Hi,
This patch restricts predcom pass using register pressure information.
In case of high register pressure, we now prune additional chains as well
as disable unrolling in predcom.  In general, I think this patch set is
useful.
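
The pruning and unrolling rules described above can be sketched in isolation. The following is a simplified illustration, not the patch itself: it assumes a single register class, and chain_stub, prune_chains_sketch and clamp_unroll_factor are hypothetical names introduced here. Only the budget check (running pressure plus chain length minus one, compared against available registers times a relax ratio) and the forced unroll factor of 1 are taken from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified chain: only the fields the budget check needs.
struct chain_stub
{
  unsigned length;   // plays the role of chain->length in the patch
  bool always_keep;  // stands in for combined and store-store chains
};

// Keep a chain only while the running pressure stays within
// avail_regs * relax_ratio.  Returns true if any chain was pruned.
static bool
prune_chains_sketch (std::vector<chain_stub> &chains, unsigned &pressure,
                     unsigned avail_regs, unsigned relax_ratio)
{
  assert (relax_ratio > 0);
  bool pruned_p = false;

  for (std::size_t i = 0; i < chains.size ();)
    {
      chain_stub chain = chains[i];
      // Chains that are always allowed do not touch the budget.
      if (chain.always_keep || chain.length == 0)
        {
          i++;
          continue;
        }

      unsigned new_pressure = pressure + chain.length - 1;
      if (new_pressure <= avail_regs * relax_ratio)
        {
          // The chain fits: keep it and charge it to the budget.
          pressure = new_pressure;
          i++;
        }
      else
        {
          // Unordered removal: move the last element into slot i and do
          // not advance i, so the moved element is examined next.
          chains[i] = chains.back ();
          chains.pop_back ();
          pruned_p = true;
        }
    }
  return pruned_p;
}

// Force the unroll factor back to 1 when register pressure is high.
static unsigned
clamp_unroll_factor (bool high_pressure_p, unsigned unroll_factor)
{
  return (high_pressure_p && unroll_factor > 1) ? 1 : unroll_factor;
}
```

Because kept chains are charged to the running pressure as they are visited, the outcome depends on the order of the chains: an early long chain can use up the budget that later short chains would have fit into.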

Bootstrap and test on x86_64 ongoing.  Any comments?

Thanks,
bin
2018-04-27  Bin Cheng  

* tree-predcom.c (stor-layout.h, tree-ssa-live.h): Include.
(REG_RELAX_RATIO, prune_chains): New.
(tree_predictive_commoning_loop): Compute reg pressure using class
region.  Prune chains based on reg pressure.  Force to not unroll
if reg pressure is high.
From 1b488665f8fea619c4ce35f71650c342df69de2f Mon Sep 17 00:00:00 2001
From: Bin Cheng 
Date: Wed, 25 Apr 2018 16:30:41 +0100
Subject: [PATCH 6/6] pcom-reg-pressure-20180423

---
 gcc/tree-predcom.c | 74 ++
 1 file changed, 74 insertions(+)

diff --git a/gcc/tree-predcom.c b/gcc/tree-predcom.c
index aeadbf7..d0c18b3 100644
--- a/gcc/tree-predcom.c
+++ b/gcc/tree-predcom.c
@@ -217,6 +217,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-pass.h"
 #include "ssa.h"
 #include "gimple-pretty-print.h"
+#include "stor-layout.h"
 #include "alias.h"
 #include "fold-const.h"
 #include "cfgloop.h"
@@ -227,6 +228,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-ssa-loop-ivopts.h"
 #include "tree-ssa-loop-manip.h"
 #include "tree-ssa-loop-niter.h"
+#include "tree-ssa-live.h"
 #include "tree-ssa-loop.h"
 #include "tree-into-ssa.h"
 #include "tree-dfa.h"
@@ -242,6 +244,10 @@ along with GCC; see the file COPYING3.  If not see
 
 #define MAX_DISTANCE (target_avail_regs[GENERAL_REGS] < 16 ? 4 : 8)
 
+/* The ratio by which register pressure check is relaxed.  */
+
+#define REG_RELAX_RATIO (2)
+
 /* Data references (or phi nodes that carry data reference values across
    loop iterations).  */
 
@@ -3156,6 +3162,59 @@ insert_init_seqs (struct loop *loop, vec<chain_p> chains)
   }
 }
 
+/* Prune chains causing high register pressure.  */
+
+static void
+prune_chains (vec<chain_p> *chains, unsigned *max_pressure)
+{
+  bool pruned_p = false;
+  machine_mode mode;
+  enum reg_class cl;
+  unsigned i, new_pressure;
+
+  for (i = 0; i < chains->length ();)
+    {
+      chain_p chain = (*chains)[i];
+      /* Always allow combined chain and zero-length chain.  */
+      if (chain->combined || chain->type == CT_COMBINATION
+	  || chain->length == 0 || chain->type == CT_STORE_STORE)
+	{
+	  i++;
+	  continue;
+	}
+
+      gcc_assert (chain->refs.length () > 0);
+      mode = TYPE_MODE (TREE_TYPE (chain->refs[0]->ref->ref));
+      /* Bypass chain that doesn't contribute to any reg_class, although
+	 something could be wrong when mapping type mode to reg_class.  */
+      if (ira_mode_classes[mode] == NO_REGS)
+	{
+	  i++;
+	  continue;
+	}
+
+      cl = ira_pressure_class_translate[ira_mode_classes[mode]];
+      /* Prune chain if it causes higher register pressure than available
+	 registers; otherwise keep the chain and update register pressure
+	 information.  */
+      new_pressure = max_pressure[cl] + chain->length - 1;
+      if (new_pressure <= target_avail_regs[cl] * REG_RELAX_RATIO)
+	{
+	  i++;
+	  max_pressure[cl] = new_pressure;
+	}
+      else
+	{
+	  release_chain (chain);
+	  chains->unordered_remove (i);
+	  pruned_p = true;
+	}
+    }
+
+  if (pruned_p && dump_file && (dump_flags & TDF_DETAILS))
+    fprintf (dump_file, "Prune chain because of high reg pressure\n");
+}
+
 /* Performs predictive commoning for LOOP.  Sets bit 1<<0 of return value
    if LOOP was unrolled; Sets bit 1<<1 of return value if loop closed ssa
    form was corrupted.  */
@@ -3171,6 +3230,9 @@ tree_predictive_commoning_loop (struct loop *loop)
   struct tree_niter_desc desc;
   bool unroll = false, loop_closed_ssa = false;
   edge exit;
+  lr_region *region;
+  unsigned max_pressure[N_REG_CLASSES];
+  bool high_pressure_p;
 
   if (dump_file && (dump_flags & TDF_DETAILS))
     fprintf (dump_file, "Processing loop %d\n",  loop->num);
@@ -3239,6 +3301,11 @@ tree_predictive_commoning_loop (struct loop *loop)
   /* Try to combine the chains that are always worked with together.  */
   try_combine_chains (loop, &chains);
 
+  region = new lr_region (loop);
+  high_pressure_p = region->calculate_pressure (max_pressure);
+  delete region;
+  prune_chains (&chains, max_pressure);
+
   insert_init_seqs (loop, chains);
 
   if (dump_file && (dump_flags & TDF_DETAILS))
@@ -3250,6 +3317,13 @@ tree_predictive_commoning_loop (struct loop *loop)
   /* Determine the unroll factor, and if the loop should be unrolled, ensure
      that its number of iterations is divisible by the factor.  */
   unroll_factor = determine_unroll_factor (chains);
+  /* Force to not unroll if register pressure is high.  */
+  if (high_pressure_p && unroll_factor > 1)
+{
+  unroll_factor = 1;
+  if (dump_file && (dump_flags &