Re: [patch] Split parts of cse_insn out to a few new functions

2012-03-26 Thread Richard Sandiford
Steven Bosscher <stevenb@gmail.com> writes:
 On Wed, Mar 21, 2012 at 1:13 AM, Ian Lance Taylor wrote:
 On Tue, Mar 20, 2012 at 2:06 PM, Steven Bosscher wrote:

 This patch splits a couple of pieces of cse_insn out to new functions.
 There are no functional changes, and no code generation differences as
 far as I could tell on x86_64 (-m64 and -m32).

 Likewise for the attached patch.

 The purpose of the patch is to hopefully make cse_insn easier
 to understand. In a follow-up patch, I will make canonicalize_insn run
 only once per insn (it currently, i.e. before and after this patch,
 runs multiple times for CSE on extended basic blocks if a block is in
 multiple extended basic blocks).

 That is what the attached patch does.

 Bootstrapped & tested on x86_64-unknown-linux-gnu.
 OK for trunk?

 Ciao!
 Steven

   * cse.c (cse_canonicalized_basic_blocks): New simple bitmap to
   tag basic blocks that have already been traversed at least once,
   so that all insns have been canonicalized.
   (cse_insn): Call canonicalize_insn only if the basic block that
   contains insn is visited for the first time.
   (cse_extended_basic_block): After visiting all insns in a basic
   block, mark the block in cse_canonicalized_basic_blocks.
   (cse_main): Set up and destroy cse_canonicalized_basic_blocks.

OK, thanks (without the micro-optimisation, as you say).

Out of curiosity, do you still see this bit as useful:

  /* We potentially will process this insn many times.  Therefore,
 drop the REG_EQUAL note if it is equal to the SET_SRC of the
 unique set in INSN.

 Do not do so if the REG_EQUAL note is for a STRICT_LOW_PART,
 because cse_insn handles those specially.  */

?  Does "many times" mean in CSE, or later?

Richard


Re: [patch] Split parts of cse_insn out to a few new functions

2012-03-26 Thread Steven Bosscher
On Mon, Mar 26, 2012 at 9:02 PM, Richard Sandiford
rdsandif...@googlemail.com wrote:
       * cse.c (cse_canonicalized_basic_blocks): New simple bitmap to
       tag basic blocks that have already been traversed at least once,
       so that all insns have been canonicalized.
       (cse_insn): Call canonicalize_insn only if the basic block that
       contains insn is visited for the first time.
       (cse_extended_basic_block): After visiting all insns in a basic
       block, mark the block in cse_canonicalized_basic_blocks.
       (cse_main): Set up and destroy cse_canonicalized_basic_blocks.

 OK, thanks (without the micro-optimisation, as you say).

Thanks, will commit!


 Out of curiosity, do you still see this bit as useful:

      /* We potentially will process this insn many times.  Therefore,
         drop the REG_EQUAL note if it is equal to the SET_SRC of the
         unique set in INSN.

         Do not do so if the REG_EQUAL note is for a STRICT_LOW_PART,
         because cse_insn handles those specially.  */

 ?  Does "many times" mean in CSE, or later?

It means in CSE, yes. But with the patch to canonicalize only once,
I suppose this can go away again.
Having said that, I do believe it would be good if we avoid having
REG_EQUAL notes that are rtx_equal_p to the SET_SRC. This happens
quite frequently after CSE. I'm not sure how to clean them up.
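
For illustration, such a cleanup might look roughly like this (untested
sketch; single_set, find_reg_note, rtx_equal_p and remove_note are the
existing helpers, and where it would be safe to call this is exactly
the open question):

/* Untested sketch, not part of this patch: drop a REG_EQUAL note on
   INSN when it duplicates the SET_SRC of INSN's unique set.  */
static void
maybe_remove_redundant_reg_equal (rtx insn)
{
  rtx set = single_set (insn);
  rtx note = find_reg_note (insn, REG_EQUAL, NULL_RTX);

  if (set != NULL_RTX
      && note != NULL_RTX
      && rtx_equal_p (XEXP (note, 0), SET_SRC (set)))
    remove_note (insn, note);
}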

Ciao!
Steven


Re: [patch] Split parts of cse_insn out to a few new functions

2012-03-21 Thread Steven Bosscher
On Wed, Mar 21, 2012 at 1:13 AM, Ian Lance Taylor wrote:
 On Tue, Mar 20, 2012 at 2:06 PM, Steven Bosscher wrote:

 This patch splits a couple of pieces of cse_insn out to new functions.
 There are no functional changes, and no code generation differences as
 far as I could tell on x86_64 (-m64 and -m32).

Likewise for the attached patch.

 The purpose of the patch is to hopefully make cse_insn easier
 to understand. In a follow-up patch, I will make canonicalize_insn run
 only once per insn (it currently, i.e. before and after this patch,
 runs multiple times for CSE on extended basic blocks if a block is in
 multiple extended basic blocks).

That is what the attached patch does.

Bootstrapped & tested on x86_64-unknown-linux-gnu.
OK for trunk?

Ciao!
Steven

* cse.c (cse_canonicalized_basic_blocks): New simple bitmap to
tag basic blocks that have already been traversed at least once,
so that all insns have been canonicalized.
(cse_insn): Call canonicalize_insn only if the basic block that
contains insn is visited for the first time.
(cse_extended_basic_block): After visiting all insns in a basic
block, mark the block in cse_canonicalized_basic_blocks.
(cse_main): Set up and destroy cse_canonicalized_basic_blocks.

(cse_find_path): Micro-optimization, reorder one condition to
avoid a reference to cfun.

Index: cse.c
===================================================================
--- cse.c	(revision 185622)
+++ cse.c	(working copy)
@@ -551,6 +551,10 @@ static bitmap cse_ebb_live_in, cse_ebb_l
already as part of an already processed extended basic block.  */
 static sbitmap cse_visited_basic_blocks;
 
+/* A simple bitmap to track for which basic blocks all insns have been
+   canonicalized already.  */
+static sbitmap cse_canonicalized_basic_blocks;
+
 static bool fixed_base_plus_p (rtx x);
 static int notreg_cost (rtx, enum rtx_code, int);
 static int approx_reg_cost_1 (rtx *, void *);
@@ -4492,8 +4496,10 @@ cse_insn (rtx insn)
   /* Record all the SETs in this instruction.  */
   n_sets = find_sets_in_insn (insn, sets);
 
-  /* Substitute the canonical register where possible.  */
-  canonicalize_insn (insn, sets, n_sets);
+  /* If we have not visited this block before (as part of another extended
+     basic block), substitute the canonical register where possible.  */
+  if (!TEST_BIT (cse_canonicalized_basic_blocks, BLOCK_FOR_INSN (insn)->index))
+    canonicalize_insn (insn, sets, n_sets);
 
   /* If this insn has a REG_EQUAL note, store the equivalent value in SRC_EQV,
  if different, or if the DEST is a STRICT_LOW_PART.  The latter condition
@@ -6254,10 +6260,9 @@ cse_find_path (basic_block first_bb, str
 	  else
 	e = NULL;
 
-	  if (e
-	      && !((e->flags & EDGE_ABNORMAL_CALL) && cfun->has_nonlocal_label)
-	      && e->dest != EXIT_BLOCK_PTR
+	  if (e && e->dest != EXIT_BLOCK_PTR
 	      && single_pred_p (e->dest)
+	      && !((e->flags & EDGE_ABNORMAL_CALL) && cfun->has_nonlocal_label)
 	      /* Avoid visiting basic blocks twice.  The large comment
 		 above explains why this can happen.  */
 	      && !TEST_BIT (cse_visited_basic_blocks, e->dest->index))
@@ -6452,6 +6457,9 @@ cse_extended_basic_block (struct cse_bas
 	}
 	}
 
+  /* We have now canonicalized all insns in this basic block.  */
+  SET_BIT (cse_canonicalized_basic_blocks, bb->index);
+
   /* With non-call exceptions, we are not always able to update
 	 the CFG properly inside cse_insn.  So clean up possibly
 	 redundant EH edges here.  */
@@ -6555,6 +6563,10 @@ cse_main (rtx f ATTRIBUTE_UNUSED, int nr
   cse_visited_basic_blocks = sbitmap_alloc (last_basic_block);
   sbitmap_zero (cse_visited_basic_blocks);
 
+  /* Set up the table of already canonicalized basic blocks.  */
+  cse_canonicalized_basic_blocks = sbitmap_alloc (last_basic_block);
+  sbitmap_zero (cse_canonicalized_basic_blocks);
+
   /* Loop over basic blocks in reverse completion order (RPO),
  excluding the ENTRY and EXIT blocks.  */
   n_blocks = pre_and_rev_post_order_compute (NULL, rc_order, false);
@@ -6598,6 +6610,7 @@ cse_main (rtx f ATTRIBUTE_UNUSED, int nr
   free (reg_eqv_table);
   free (ebb_data.path);
   sbitmap_free (cse_visited_basic_blocks);
+  sbitmap_free (cse_canonicalized_basic_blocks);
   free (rc_order);
   rtl_hooks = general_rtl_hooks;
 


Re: [patch] Split parts of cse_insn out to a few new functions

2012-03-21 Thread Steven Bosscher
On Thu, Mar 22, 2012 at 12:09 AM, Steven Bosscher <stevenb@gmail.com> wrote:

        (cse_find_path): Micro-optimization, reorder one condition to
        avoid a reference to cfun.

Ah, and please ignore this bit. I don't know what I was thinking...


[patch] Split parts of cse_insn out to a few new functions

2012-03-20 Thread Steven Bosscher
Hello,

This patch splits a couple of pieces of cse_insn out to new functions.
There are no functional changes, and no code generation differences as
far as I could tell on x86_64 (-m64 and -m32).

The purpose of the patch is to hopefully make cse_insn easier
to understand. In a follow-up patch, I will make canonicalize_insn run
only once per insn (it currently, i.e. before and after this patch,
runs multiple times for CSE on extended basic blocks if a block is in
multiple extended basic blocks).
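
The planned change boils down to the following pattern (simplified
sketch; the actual follow-up patch, including allocation and freeing
of the bitmap in cse_main, appears elsewhere in this thread):

/* One bit per basic block, recording whether its insns have already
   been canonicalized as part of an earlier extended basic block.  */
static sbitmap cse_canonicalized_basic_blocks;

/* In cse_insn: canonicalize only on the first visit to the block.  */
if (!TEST_BIT (cse_canonicalized_basic_blocks,
               BLOCK_FOR_INSN (insn)->index))
  canonicalize_insn (insn, sets, n_sets);

/* In cse_extended_basic_block, after walking all insns in BB:  */
SET_BIT (cse_canonicalized_basic_blocks, bb->index);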

Bootstrapped & tested on x86_64-unknown-linux-gnu. OK for trunk?

Ciao!
Steven
* cse.c (invalidate_from_sets_and_clobbers, try_back_substitute_reg,
find_sets_in_insn, canonicalize_insn): Split out from ...
(cse_insn): ... here.
(invalidate_from_clobbers): Take an insn instead of the pattern.

Index: cse.c
===================================================================
--- cse.c   (revision 185515)
+++ cse.c   (working copy)
@@ -597,6 +597,7 @@ static void record_jump_cond (enum rtx_c
 static void cse_insn (rtx);
 static void cse_prescan_path (struct cse_basic_block_data *);
 static void invalidate_from_clobbers (rtx);
+static void invalidate_from_sets_and_clobbers (rtx);
 static rtx cse_process_notes (rtx, rtx, bool *);
 static void cse_extended_basic_block (struct cse_basic_block_data *);
 static void count_reg_usage (rtx, int *, rtx, int);
@@ -4089,10 +4090,22 @@ record_jump_cond (enum rtx_code code, en
 }
 
 /* CSE processing for one instruction.
-   First simplify sources and addresses of all assignments
-   in the instruction, using previously-computed equivalents values.
-   Then install the new sources and destinations in the table
-   of available values.  */
+
+   Most "true" common subexpressions are mostly optimized away in GIMPLE,
+   but the few that "leak through" are cleaned up by cse_insn, and complex
+   addressing modes are often formed here.
+
+   The main function is cse_insn, and between here and that function
+   a couple of helper functions are defined to keep the size of cse_insn
+   within reasonable proportions.
+
+   Data is shared between the main and helper functions via struct set,
+   which contains all data related to every set in the instruction that
+   is being processed.
+
+   Note that cse_main processes all sets in the instruction.  Most
+   passes in GCC only process simple SET insns or single_set insns, but
+   CSE processes insns with multiple sets as well.  */
 
 /* Data on one SET contained in the instruction.  */
 
@@ -4128,50 +4141,93 @@ struct set
   /* Table entry for the destination address.  */
   struct table_elt *dest_addr_elt;
 };
+
+/* Special handling for (set REG0 REG1) where REG0 is the
+   "cheapest", cheaper than REG1.  After cse, REG1 will probably not
+   be used in the sequel, so (if easily done) change this insn to
+   (set REG1 REG0) and replace REG1 with REG0 in the previous insn
+   that computed their value.  Then REG1 will become a dead store
+   and won't cloud the situation for later optimizations.
+
+   Do not make this change if REG1 is a hard register, because it will
+   then be used in the sequel and we may be changing a two-operand insn
+   into a three-operand insn.
+   
+   This is the last transformation that cse_insn will try to do.  */
 
 static void
-cse_insn (rtx insn)
+try_back_substitute_reg (rtx set, rtx insn)
 {
-  rtx x = PATTERN (insn);
-  int i;
-  rtx tem;
-  int n_sets = 0;
+  rtx dest = SET_DEST (set);
+  rtx src = SET_SRC (set);
 
-  rtx src_eqv = 0;
-  struct table_elt *src_eqv_elt = 0;
-  int src_eqv_volatile = 0;
-  int src_eqv_in_memory = 0;
-  unsigned src_eqv_hash = 0;
+  if (REG_P (dest)
+      && REG_P (src) && ! HARD_REGISTER_P (src)
+      && REGNO_QTY_VALID_P (REGNO (src)))
+{
+  int src_q = REG_QTY (REGNO (src));
+  struct qty_table_elem *src_ent = qty_table[src_q];
 
-  struct set *sets = (struct set *) 0;
+  if (src_ent->first_reg == REGNO (dest))
+   {
+ /* Scan for the previous nonnote insn, but stop at a basic
+block boundary.  */
+ rtx prev = insn;
+ rtx bb_head = BB_HEAD (BLOCK_FOR_INSN (insn));
+ do
+   {
+ prev = PREV_INSN (prev);
+   }
+ while (prev != bb_head && (NOTE_P (prev) || DEBUG_INSN_P (prev)));
 
-  this_insn = insn;
-#ifdef HAVE_cc0
-  /* Records what this insn does to set CC0.  */
-  this_insn_cc0 = 0;
-  this_insn_cc0_mode = VOIDmode;
-#endif
+ /* Do not swap the registers around if the previous instruction
+attaches a REG_EQUIV note to REG1.
 
-  /* Find all the SETs and CLOBBERs in this instruction.
- Record all the SETs in the array `set' and count them.
- Also determine whether there is a CLOBBER that invalidates
- all memory references, or all references at varying addresses.  */
+??? It's not entirely clear whether we can transfer a REG_EQUIV
+from the pseudo that originally shadowed an 

Re: [patch] Split parts of cse_insn out to a few new functions

2012-03-20 Thread Ian Lance Taylor
On Tue, Mar 20, 2012 at 2:06 PM, Steven Bosscher <stevenb@gmail.com> wrote:

 This patch splits a couple of pieces of cse_insn out to new functions.
 There are no functional changes, and no code generation differences as
 far as I could tell on x86_64 (-m64 and -m32).

 The purpose of the patch is to hopefully make cse_insn easier
 to understand. In a follow-up patch, I will make canonicalize_insn run
 only once per insn (it currently, i.e. before and after this patch,
 runs multiple times for CSE on extended basic blocks if a block is in
 multiple extended basic blocks).

This is OK.

Thanks.

Ian