Hi , Based on the conversation in the thread at http://gcc.gnu.org/ml/gcc/2008-03/msg00513.html , we've tried to get a pass trying to undo final value replacement going. The initial implementation was done by Pranav Bhandarkar when he was employed at Azingo as part of work sponsored by Icera Semiconductor. I've been trying to get this working with my private port over here. We intend to contribute this back once our copyright assignments are sorted and if this will be acceptable to all. I've been getting a few compile time ICEs with this approach and haven't been able to resolve them well enough yet. Whilst doing so, I wanted to check on the approach as outlined below and ask if there's anything that we might have missed or any problem that one can see with us going along these lines. Thanks for your time and patience.
cheers Ramana 1) Understanding what scalar evolution does. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Consider the following pseudo code. function memcpy (src_pointer, dst_pointer) src_1 = src_pointer; dst_1 = dst_pointer; L1: *dst_1 = *src_1 (Word copy) dst++; src++; <---- Inc by 4 bytes i.e 1 word. conditional jump to L1 <fallthrough edge> /* This is the exit block of loop 1. The following PHI nodes are added by loopinit pass to convert the SSA form into "closed loop SSA" (see rewrite_into_loop_closed_ssa" in tree-ssa-loop-manip.c */ src_2 = PHI <src_1> dst_2 = PHI <dst_1> <fall through edge> L2: *dst_2 = *src_2 (Byte Copy) dst++; src++; <conditional jump to L2> Now scalar evolution convertes this into Function memcpy (src_pointer, dst_pointer) src_1 = src_pointer; dst_1 = dst_pointer; L1: *dst_1 = *src_1 (Word copy) dst++; src++; <---- Inc by 4 bytes i.e 1 word. conditional jump to L1 <fallthrough edge> /* This is the exit block of loop 1. The following PHI nodes are added by loopinit pass to convert the SSA form into "closed loop SSA" (see rewrite_into_loop_closed_ssa" in tree-ssa-loop-manip.c */ D.1234_11 = 4 * n (where 'n' is the number of iterations of L1) src_2 = src_pointer + D.1234_11 D.1235_22 = 4 * n dst_2 = dst_pointer + D.1235_22 <fallthrough edge> L2: *dst_2 = *src_2 (Byte Copy) dst++; src++; <conditional jump to L2> Therefore a PHI Node is replaced by the final value of src_1 and dst_1, thus introducing extra computations. 2) How to undo what scalar evolution does. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ To undo what scalar evolution does, we need to record the changes that scalar evolution makes and then after the loop optimizations are run we need to replace the PHI nodes that were removed earlier in place of the computations introduced by scalar evolution. A high level description of the process is listed here. (see tree-scalar-evolution.c and grep for DXP_SPECIFIC) Explanation of this sub-pass. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Part 1: Record Final Value replacement related changes. ~~~~~~ Final value replacement replaces PHI nodes at the exits of loops with computations based on the number of iterations of the loop. For e.g. <bb 1> L1: x_1 = src + 4 ... ... conditional jump to L1. <bb 2> (Loop Exit) x_2 = PHI <x_1> <---- Phi node added by rewrite_into_loop_closed_ssa. (see the loopinit pass). Final Value replacement replaces the PHI node in <bb 2> by ssa_temp_var = 4 * no_of_iterations_of_loop x_2 = src + ssa_temp_var; Therefore a PHI node is replaced by computations. Recording final value replacement related changes is controlled by the variable record_scalar_evolution_changes. When set to a non-zero value the function record_changed_stmts records the changes made. The changes are recorded in a hashtable changed_stmts_table. The hashtable contains the stmt added, the PHI node for which this stmt was added and hashcodes for both the stmt and the phi_node. We also note how many computations have been added for each of the removed PHI nodes. This is done in a link list pointed to by phi_nodes_info_head. Part 2: Undo Final value replacement related changes ~~~~~~ This is the part where the new computations are removed and the PHI nodes that they are replaced are inserted back in. This replacement is contingent to a few conditions. a) All the computations that were added are still present in the basic block. i.e all the computations are still present in the form in which they were added and havent been touched by any of the loop optimizations passes that run between the scalar evolution pass (i.e the pass when Part 1 is executed) and the 'loopdone' pass. We go through the exit basic block and look up each stmt in changed_stmt_table. If found we lookup the corresponding PHI node in the phi_node_info link list and decrement its count by 1 (count here denotes the number of computations added. When count it 0 it means all the computatins added in the scalar evolution pass have been found in the same form in the loop done pass such a PHI node can be inserted back in if 'b' is also true). b) If any PHI node has count zero it can be inserted back and its corresponding computations removed, iff the argument of the PHI node still exists as an SSA variable. This means that we can insert a_1 = PHI <D.10_1> if D.10_1 still exists and hasnt been removed by any of the passes between the scalar evolution pass and the loopdone pass. 3) Implementation details including the issues in 2. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 3.1} Recording the Changes made by scalar evolution. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ a) We record changes in tree-scalar-evolution.c in the scalar evolution pass (see function scev_const_prop). We visit the exit block of each loop and If we find a PHI node with only one argument then we set 'record_scalar_evolution_changes' to 1. We also set 'current_phi_node' to this particular PHI node. Then via a call to 'record_changed_stmts' from 'force_gimple_operand_bsi' in gimplify.c we record the new statement added for this PHI node. This information is recorded in a hash table 'changed_stmts_table'. We hash on the new statement that is added and store the statement along with the PHI node for which this statement was added. b) Along with this, we also maintain a 'phi_nodes_info' link list (pointed to by 'phi_nodes_info_head'). Each node in this list contains a unique PHI node (representing a PHI node that was removed by scalar evolution) and a 'count' that denotes the number of statements added for that PHI node. 'record_changed_stmts' calls 'record_stmt_for_current_phi_node' so that the 'count' for the 'current_phi_node' is incremented in its corresponding 'phi_nodes_info' node. Therefore in the above memcpy example. We would have four entries in 'changed_stmts_table', each containing the following tuple. <new_stmt_added, hashvalue, PHI_node, PHI_id,hashvalue_for_phi_node> i.e the changed_stmts_table would look like 1. D.1234_11 = 4 * n , Hashval, src_2 = PHI <src_1>, 1,Hashval_phi_node 2. src_2 = src_pointer + D.1234_11,Hashval, src_2 = PHI <src_1>, 1,Hashval_phi_node 3. D.1235_22 = 4 * n , Hashval, dst_2 = PHI <dst_1>, 2,Hashval_phi_node 2. dst_2 = dst_pointer + D.1235_22,Hashval, dst_2 = PHI <dst_1>, 2,Hashval_phi_node The phi_nodes_info link list would look like __________________________________ ________________________________ | phi_node = dst_2 = PHI <dst_1> | |phi_node = src_2 = PHI <src_1>| | count = 2 |--->|count = 2 | _________________________________ ________________________________ ^ | | phi_nodes_info_head. c) The need for hashvalue_for_phi_node: Since we dont release the PHI node, sometimes, the PHI node that we have stored itself gets changed (Trees are finally pointers and so are their operands). Therefore while inserting we rehash the PHI node and check the hashvalue with the stored hashvalue of the PHI node. If it is the same only then we reinsert the PHI node. The hashvalue for the phi node is generated by 'generate_phi_node_hashcode' d) Notes about removing PHI nodes: When -ftree-fvr-undo is enabled, then we alter slightly the way a PHI node is removed. Each Basic block contains a chain of PHI nodes. When removing a PHI node say 'A', 'A' is removed from this chain and also 'released' i.e the SSA NAMES in 'A' are released for reuse. When -ftree-fvr-undo is enabled, we remove 'A' from the chain of PHI nodes for the particular block but we _donot_ 'release' it. This also helps preserve the immediate uses of PHI node. 3.2) Using the recorded information and undoing final value replacement ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ a) This happens in the function 'scev_finalize'. We again visit the exit block of each loop. In the exit block of each loop we go through each statement and try to find its corresponding entry in 'changed_stmts_table'. If found we extract the PHI node that it had replaced and find the node for this phi node in phi_nodes_info. We then decrement the 'count' member of this node (denoting that one statement that was added for this PHI node still exists). If this count decrements to Zero we should be able to remove these statemens and reinsert the phi nodes if b is true. This is done by 'traverse_exit_block_match_instructions' and 'dec_phi_node_count'. b) We again go through all the statements of the block. For each statement we extract the PHI node it had replaced from the 'changed_stmts_table'. We then look up this PHI node in the phi_nodes_info link list and if the count is zero and if the argument of the PHI node still exists as a gimple variable we remove the current statement. If the argument of the PHI node doesnt exist anymore as a gimple variable (This is possible because a number of passes run between the time that we record scalar evolution informations and now when we are undoing final value replacement) then we just increment the count of the PHI node to indicate that this PHI node shouldnt be reinserted and the current statement is not removed. c) Once we have reached the end of the basic block we simply travers the phi_nodes_info link list and insert the PHI nodes that have 'count' zero. b and c are done by 'check_mark_phi_nodes_insert' and 'remove_stmts_insert_phi_nodes'. 4) Summary of key datastructures. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ a) changed_stmts_table - Hash table to record scalar evolution changes. b) phi_nodes_info - Link list, whose each node contains a PHI node and a count indicating the number of statements that replaced this phi node as a result of scalar evolution analysis. -- Ramana Radhakrishnan