Hi ,

Based on the conversation in the thread at
http://gcc.gnu.org/ml/gcc/2008-03/msg00513.html , we've tried to get a
pass trying to undo final value replacement going. The initial
implementation was done by Pranav Bhandarkar when he was employed at
Azingo as part of work sponsored by Icera Semiconductor.  I've been
trying to get this working with my private port over here.  We intend
to contribute this back once our copyright assignments are sorted and
if this will be acceptable to all. I've been getting a few compile
time ICEs with this approach and haven't been able to resolve them
well enough yet. Whilst doing so, I wanted to check on the approach as
outlined below and ask if there's anything that we might have missed
or any problem that one can see with us going along these lines.
Thanks for your time and patience.

cheers
Ramana



1) Understanding what scalar evolution does.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Consider the following pseudo code.

function memcpy (src_pointer, dst_pointer)
src_1 = src_pointer;
dst_1 = dst_pointer;
L1:
*dst_1 = *src_1  (Word copy)
dst++;
src++;    <---- Inc by 4 bytes i.e 1 word.
conditional jump to L1
<fallthrough edge>
 /* This is the exit block of loop 1. The following PHI nodes
 are added by loopinit pass to convert the SSA form into "closed loop
 SSA" (see rewrite_into_loop_closed_ssa" in tree-ssa-loop-manip.c */

src_2 = PHI <src_1>
dst_2 = PHI <dst_1>

<fall through edge>
L2:
*dst_2 = *src_2 (Byte Copy)
dst++;
src++;
<conditional jump to L2>


Now scalar evolution convertes this into

Function memcpy (src_pointer, dst_pointer)
 src_1 = src_pointer;
 dst_1 = dst_pointer;
 L1:
 *dst_1 = *src_1  (Word copy)
 dst++;
 src++;    <---- Inc by 4 bytes i.e 1 word.
 conditional jump to L1

<fallthrough edge>
 /* This is the exit block of loop 1. The following PHI nodes
 are added by loopinit pass to convert the SSA form into "closed loop
 SSA" (see rewrite_into_loop_closed_ssa" in tree-ssa-loop-manip.c */

 D.1234_11 = 4 * n (where 'n' is the number of iterations of L1)
 src_2 = src_pointer + D.1234_11
 D.1235_22 = 4 * n
 dst_2 = dst_pointer + D.1235_22

 <fallthrough edge>
 L2:
 *dst_2 = *src_2 (Byte Copy)
 dst++;
 src++;
 <conditional jump to L2>



Therefore a PHI Node is replaced by the final value of src_1 and
dst_1, thus introducing extra computations.


2) How to undo what scalar evolution does.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To undo what scalar evolution does, we need to record the changes that
scalar evolution makes and then after the loop optimizations are run
we need to replace the PHI nodes that were removed earlier in place of
the computations introduced by scalar evolution.

A high level description of the process is listed here. (see
tree-scalar-evolution.c and grep for DXP_SPECIFIC)

   Explanation of this sub-pass.
   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

   Part 1: Record Final Value replacement related changes.
   ~~~~~~
   Final value replacement replaces PHI nodes at the exits of loops with
   computations based on the number of iterations of the loop.
   For e.g.
   <bb 1>
   L1:

   x_1 = src + 4
   ...
   ...
   conditional jump to L1.
   <bb 2> (Loop Exit)
   x_2 = PHI <x_1>  <---- Phi node added by rewrite_into_loop_closed_ssa.
                          (see the loopinit pass).

   Final Value replacement replaces the PHI node in <bb 2> by

   ssa_temp_var = 4 * no_of_iterations_of_loop
   x_2 = src + ssa_temp_var;

   Therefore a PHI node is replaced by computations.

   Recording final value replacement related changes is controlled by the
   variable record_scalar_evolution_changes. When set to a non-zero value
   the function record_changed_stmts records the changes made. The changes
   are recorded in a hashtable changed_stmts_table. The hashtable contains
   the stmt added, the PHI node for which this stmt was added and hashcodes
   for both the stmt and the phi_node. We also note how many computations
   have been added for each of the removed PHI nodes. This is done in a
   link list pointed to by phi_nodes_info_head.

   Part 2: Undo Final value replacement related changes
   ~~~~~~
   This is the part where the new computations are removed and the PHI nodes
   that they are replaced are inserted back in. This replacement is
   contingent to a few conditions.
      a) All the computations that were added are still present in the basic
         block. i.e all the computations are still present in the form in
     which they were added and havent been touched by any of the loop
     optimizations passes that run between the scalar evolution pass (i.e the
     pass when Part 1 is executed) and the 'loopdone' pass. We go
     through the exit basic block and look up each stmt in
     changed_stmt_table. If found we lookup the corresponding PHI node in the
     phi_node_info link list and decrement its count by 1 (count here denotes
     the number of computations added. When count it 0 it means all the
     computatins added in the scalar evolution pass have been found in the
     same form in the loop done pass such a PHI node can be inserted back in
     if 'b' is also true).
      b) If any PHI node has count zero it can be inserted back and its
         corresponding computations removed, iff the argument of the PHI node
     still exists as an SSA variable. This means that we can insert
     a_1 = PHI <D.10_1> if D.10_1 still exists and hasnt been removed by
     any of the passes between the scalar evolution pass and the
     loopdone pass.




3) Implementation details including the issues in 2.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

3.1} Recording the Changes made by scalar evolution.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   a) We record changes in tree-scalar-evolution.c  in the scalar
   evolution pass (see function scev_const_prop). We visit the exit
   block of each loop and If we find a PHI node with only one argument
   then we set 'record_scalar_evolution_changes' to 1. We also set
   'current_phi_node' to this particular PHI node. Then via a call
   to 'record_changed_stmts' from 'force_gimple_operand_bsi' in gimplify.c
   we record the new statement added for this PHI node. This
   information is recorded in a hash table 'changed_stmts_table'. We
   hash on the new statement that is added and store the statement
   along with the PHI node for which this statement was added.

   b) Along with this, we also maintain a 'phi_nodes_info' link list
   (pointed to by 'phi_nodes_info_head'). Each node in this list
   contains a unique PHI node (representing a PHI node that was
   removed by scalar evolution) and a 'count' that denotes the number
   of statements added for that PHI node. 'record_changed_stmts' calls
   'record_stmt_for_current_phi_node' so that the 'count' for the
   'current_phi_node' is incremented in its corresponding
   'phi_nodes_info' node. Therefore in the above memcpy example. We
   would have four entries in 'changed_stmts_table', each containing
   the following tuple.
   <new_stmt_added, hashvalue, PHI_node, PHI_id,hashvalue_for_phi_node> i.e
   the changed_stmts_table would look like

    1. D.1234_11 = 4 * n ,  Hashval, src_2 = PHI <src_1>, 1,Hashval_phi_node
    2. src_2 = src_pointer + D.1234_11,Hashval, src_2 = PHI <src_1>,
1,Hashval_phi_node
    3. D.1235_22 = 4 * n ,  Hashval, dst_2 = PHI <dst_1>, 2,Hashval_phi_node
    2. dst_2 = dst_pointer + D.1235_22,Hashval, dst_2 = PHI <dst_1>,
2,Hashval_phi_node

    The phi_nodes_info link list would look like

  __________________________________     ________________________________
  |  phi_node = dst_2 = PHI <dst_1> |    |phi_node = src_2 = PHI <src_1>|
  | count = 2                       |--->|count = 2                     |
   _________________________________     ________________________________
          ^
          |
          |
     phi_nodes_info_head.

   c) The need for hashvalue_for_phi_node: Since we dont release the
   PHI node, sometimes, the PHI node that we have stored itself gets
   changed (Trees are finally pointers and so are their
   operands). Therefore while inserting we rehash the PHI node and
   check the hashvalue with the stored hashvalue of the PHI node. If
   it is the same only then we reinsert the PHI node. The hashvalue
   for the phi node is generated by 'generate_phi_node_hashcode'

   d) Notes about removing PHI nodes: When -ftree-fvr-undo is enabled,
   then we alter slightly the way a PHI node is removed. Each Basic
   block contains a chain of PHI nodes. When removing a PHI node say
   'A', 'A' is removed from this chain and also 'released' i.e the SSA
   NAMES in 'A' are released for reuse. When -ftree-fvr-undo is
   enabled, we remove 'A' from the chain of PHI nodes for the
   particular block but we _donot_ 'release' it. This also helps
   preserve the immediate uses of PHI node.

3.2) Using the recorded information and undoing final value replacement
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   a) This happens in the function 'scev_finalize'. We again visit the
   exit block of each loop. In the exit block of each loop we go
   through each statement and try to find its corresponding entry in
   'changed_stmts_table'. If found we extract the PHI node that it had
   replaced and find the node for this phi node in phi_nodes_info. We
   then decrement the 'count' member of this node (denoting that one
   statement that was added for this PHI node still exists). If this
   count decrements to Zero we should be able to remove these
   statemens and reinsert the phi nodes if b is true. This is done by
   'traverse_exit_block_match_instructions' and 'dec_phi_node_count'.
   b) We again go through all the statements of the block. For each
   statement we extract the PHI node it had replaced from the
   'changed_stmts_table'. We then look up this PHI node in the
   phi_nodes_info link list and if the count is zero and if the
   argument of the PHI node still exists as a gimple variable we
   remove the current statement. If the argument of the PHI node
   doesnt exist anymore as a gimple variable (This is possible because
   a number of passes run between the time that we record scalar
   evolution informations and now when we are undoing final value
   replacement) then we just increment the count of the PHI node to
   indicate that this PHI node shouldnt be reinserted and the current
   statement is not removed.
   c) Once we have reached the end of the basic block we simply
   travers the phi_nodes_info link list and insert the PHI nodes that
   have 'count' zero. b and c are done by
   'check_mark_phi_nodes_insert' and 'remove_stmts_insert_phi_nodes'.


4) Summary of key datastructures.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  a) changed_stmts_table - Hash table to record scalar evolution
  changes.
  b) phi_nodes_info - Link list, whose each node contains a PHI node
  and a count indicating the number of statements that replaced this
  phi node as a result of scalar evolution analysis.



--
Ramana Radhakrishnan

Reply via email to