------- Comment #39 from rguenth at gcc dot gnu dot org 2008-01-10 15:01 -------
Hmm, looks like I was completely wrong about the cause of the slowdown. We actually run cfg_cleanup after cunroll and merge blocks like

  <BB1>
  ...

  <BB2>
  # SFT.1_2 = PHI <SFT.1_1 (BB1)>
  ...
  # SFT.1000_2 = PHI <SFT.1000_1 (BB1)>

  # SFT.1_3 = VDEF <SFT.1_2>
  ...
  # SFT.1000_3 = VDEF <SFT.1000_2>
  *mem = x;

and in merging the blocks we do (tree_merge_blocks):

  /* Remove all single-valued PHI nodes from block B of the form
     V_i = PHI <V_j> by propagating V_j to all the uses of V_i.  */
  for (phi = phi_nodes (b); phi; phi = phi_nodes (b))
    {
      ...
      replace_uses_by (def, use);
      remove_phi_node (phi, NULL, true);

BUT! For _each_ PHI node whose uses it replaces, replace_uses_by will update the target stmt, and fold it! So a store with 1000 VDEFs gets updated 1000 times during a single block merge. We can do better with VOPs.

Preparing a patch.

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34683
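
To illustrate the cost described above, here is a minimal toy model in C. It is not GCC code: toy_stmt, toy_update_stmt, replace_per_phi and replace_batched are hypothetical names invented for this sketch. It only assumes that updating a statement rescans all of its virtual operands, and counts how much scanning each strategy does. Batching is shown as one possible way to avoid the repeated work, not as the actual patch.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_OPS 1024

/* Hypothetical stand-in for a statement with many virtual operands
   (the "*mem = x" store above).  Not a real GCC data structure.  */
struct toy_stmt
{
  int ops[MAX_OPS];        /* SSA version used by each operand */
  size_t n_ops;            /* number of virtual operands */
  unsigned long scan_work; /* operands visited by all updates so far */
};

static void
toy_init (struct toy_stmt *s, size_t n)
{
  assert (n <= MAX_OPS);
  s->n_ops = n;
  s->scan_work = 0;
  for (size_t i = 0; i < n; i++)
    s->ops[i] = 2;         /* every operand uses a PHI result (_2) */
}

/* Assumption: updating (and folding) a statement rescans every operand,
   so its cost is proportional to the operand count.  */
static void
toy_update_stmt (struct toy_stmt *s)
{
  s->scan_work += s->n_ops;
}

/* One PHI at a time: replace a single operand, then update the whole
   statement.  N operands cost N * N operand visits.  */
static unsigned long
replace_per_phi (struct toy_stmt *s)
{
  for (size_t i = 0; i < s->n_ops; i++)
    {
      s->ops[i] = 1;       /* propagate the PHI argument (_1) */
      toy_update_stmt (s);
    }
  return s->scan_work;
}

/* Batched: replace all operands first, then update the statement once.
   N operands cost N operand visits.  */
static unsigned long
replace_batched (struct toy_stmt *s)
{
  for (size_t i = 0; i < s->n_ops; i++)
    s->ops[i] = 1;
  toy_update_stmt (s);
  return s->scan_work;
}
```

With 1000 operands, as in the SFT.1 ... SFT.1000 example, the per-PHI variant visits operands a million times in this model, while the batched variant visits them a thousand times. Both leave the statement with the same operands.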