This letter contains the first separate patch for ddg.cIt creates additional nessesary anti-dep edges in data dependency graph. 2011/10/12 Ayal Zaks <ayal.z...@gmail.com>:>>> The following situation happens when SMS is enabled without register renaming>>> (-fno-modulo-sched-allow-regmoves). When data dependency graph is built, there>>> is a step when we generate anti-dependencies from last register use to first>>> write of this register at the next iteration.>> is a step when we generate anti-dependencies from all last registers> uses (i.e., of last register def) to first write of this register at> the next iteration.> Right. >>> At this moment we should also>>> create such dependencies to all >>> instructions which clobber the register to>>> prevent this clobbers being >>> before last use is new schedule.>>>>> well, we simply need to connect these >>> last uses to either the first> write *or* the first clobber of this >>> register at the next iteration,> according to whichever is first, no? No, is works now just how you describe. Clobber is considered as a write,and last uses are connected with first write or first clobber.But this is not sufficient: similarly to how there's no dependenciesbetween last uses,there shall be no dependency between "first" clobbers (i.e. clobbersof a registercan be permuted freely). So at least we have to make an additional dependencyedge to each clobber before first write. >>> Here is an model of example:>>>>>> loop {>>> set1 regR>>> use1 regR>>> >>> clobber regR>>> set2 regR>>> use2 regR>>> }>>>>>> If we create only >>> use2->set1 anti-dependency (and no use2->cloober) the>>> following wrong >>> schedule is possible:>>>>>> prologue {>>> set1 regR>>> use1 regR>>> clobber >>> regR>>> }>>> kernel {>>> set2 regR>>> clobber regR (instruction from next >>> iteration in terms of old loop kernel)>>> use2 regR>>> set1 regR (also from >>> next iteration)>>> use1 regR (also from next iteration)>>> }>>> epilogue >>> {>>> set2 regR>>> use2 regR>>> }>>>>> strange; according to prolog (and >>> epilog), clobber should appear after> use1 in the kernel, no? Aren't there >>> (intra-iteration) dependencies> preventing the clobber to skip over use1 >>> and/or set1?> Yeah, this is bad example - I wrongly suggested that there is noanti-dep between use1 and clobber. New example:loop { clobber1 clobber2 set use}When there is no "use->clobber2" anti-dep - the following wrong scheduleis possible. Clobber2 is on stage0, other instructions are on stage1.prologue { clobber2}kernel { clobber1 set clobber2 use}epilogue { clobber1 set use} >>> This problem was succesfully fixed by creating a vector of all >>> clobbering>>> instructions together with first write and adding all needed >>> dependencies.>>>>> seems like an overkill to me; we didn't draw an edge >>> between every> last use and every write, because writes are kept in order >>> by having> output dependences between them. So should be the case with >>> clobbers. Clobbers themselves aren't kept in order - there are no output dependencesbetween them. They may be scheduled in any order. > Presumably, the ddg already has all intra-iteration edges preventing> motion > of clobbering instructions within an iteration, and we are only> missing > inter-iteration deps or deps surfaced by eliminating> anti-deps, right? I think the new version of a fix suits this reasoning.Now it creates dependencies only to clobber instructions before first write.And certainly to first write instruction itself. --Roman zhuykovzhr...@ispras.ru
2011-12-07 Roman Zhuykov <zhr...@ispras.ru> * ddg.c: New VEC. (add_cross_iteration_register_deps): Store information about clobbers before first write to a register. Use collected information to create anti-dependencies from last uses. ---
diff --git a/gcc/ddg.c b/gcc/ddg.c index 2b1cfe8..e3e065c 100644 --- a/gcc/ddg.c +++ b/gcc/ddg.c @@ -49,6 +49,10 @@ along with GCC; see the file COPYING3. If not see /* A flag indicating that a ddg edge belongs to an SCC or not. */ enum edge_flag {NOT_IN_SCC = 0, IN_SCC}; +/* A vector of dependencies needed while processing clobbers. */ +DEF_VEC_P(df_ref); +DEF_VEC_ALLOC_P(df_ref,heap); + /* Forward declarations. */ static void add_backarc_to_ddg (ddg_ptr, ddg_edge_ptr); static void add_backarc_to_scc (ddg_scc_ptr, ddg_edge_ptr); @@ -273,24 +277,52 @@ create_ddg_dep_no_link (ddg_ptr g, ddg_node_ptr from, ddg_node_ptr to, static void add_cross_iteration_register_deps (ddg_ptr g, df_ref last_def) { - int regno = DF_REF_REGNO (last_def); + unsigned int regno = DF_REF_REGNO (last_def); struct df_link *r_use; int has_use_in_bb_p = false; - rtx def_insn = DF_REF_INSN (last_def); + rtx insn, def_insn = DF_REF_INSN (last_def); ddg_node_ptr last_def_node = get_node_of_insn (g, def_insn); ddg_node_ptr use_node; #ifdef ENABLE_CHECKING struct df_rd_bb_info *bb_info = DF_RD_BB_INFO (g->bb); #endif - df_ref first_def = df_bb_regno_first_def_find (g->bb, regno); + static VEC(df_ref,heap) *clobbers; + df_ref first_write_def = NULL; + df_ref *def_rec; + unsigned int uid; + + clobbers = VEC_alloc (df_ref, heap, 0); + + FOR_BB_INSNS (g->bb, insn) + { + if (!INSN_P (insn)) + continue; + uid = INSN_UID (insn); + for (def_rec = DF_INSN_UID_DEFS (uid); *def_rec; def_rec++) + { + df_ref def = *def_rec; + if (DF_REF_REGNO (def) == regno) + { + VEC_safe_push (df_ref, heap, clobbers, def); + if (!(DF_REF_FLAGS (def) + & (DF_REF_MUST_CLOBBER | DF_REF_MAY_CLOBBER))) + { + first_write_def = def; + break; + } + } + } + if (first_write_def != NULL) + break; + } gcc_assert (last_def_node); - gcc_assert (first_def); + gcc_assert (first_write_def); #ifdef ENABLE_CHECKING - if (DF_REF_ID (last_def) != DF_REF_ID (first_def)) + if (DF_REF_ID (last_def) != DF_REF_ID (first_write_def)) gcc_assert (!bitmap_bit_p (&bb_info->gen, - DF_REF_ID (first_def))); + DF_REF_ID (first_write_def))); #endif /* Create inter-loop true dependences and anti dependences. */ @@ -316,29 +348,35 @@ add_cross_iteration_register_deps (ddg_ptr g, df_ref last_def) } else if (!DEBUG_INSN_P (use_insn)) { + unsigned int i; + df_ref curr_def; /* Add anti deps from last_def's uses in the current iteration - to the first def in the next iteration. We do not add ANTI - dep when there is an intra-loop TRUE dep in the opposite - direction, but use regmoves to fix such disregarded ANTI - deps when broken. If the first_def reaches the USE then - there is such a dep. */ - ddg_node_ptr first_def_node = get_node_of_insn (g, - DF_REF_INSN (first_def)); - - gcc_assert (first_def_node); - - /* Always create the edge if the use node is a branch in - order to prevent the creation of reg-moves. - If the address that is being auto-inc or auto-dec in LAST_DEF - is used in USE_INSN then do not remove the edge to make sure - reg-moves will not be created for that address. */ - if (DF_REF_ID (last_def) != DF_REF_ID (first_def) - || !flag_modulo_sched_allow_regmoves - || JUMP_P (use_node->insn) - || autoinc_var_is_used_p (DF_REF_INSN (last_def), use_insn)) - create_ddg_dep_no_link (g, use_node, first_def_node, ANTI_DEP, - REG_DEP, 1); - + to the first def and all clobbers before it in the next iteration. + We do not add ANTI dep when there is an intra-loop TRUE dep + in the opposite direction, but use regmoves to fix such + disregarded ANTI deps when broken. If the curr_def reaches + the USE then there is such a dep. */ + FOR_EACH_VEC_ELT (df_ref, clobbers, i, curr_def) + { + if (DF_REF_ID (last_def) != DF_REF_ID (curr_def) + /* Some hard regs (for ex. CC-flags) can't be renamed. + || HARD_REGISTER_P (DF_REF_REG (last_def)) */ + || !flag_modulo_sched_allow_regmoves + /* If the address that is being auto-inc or auto-dec in LAST_DEF + is used in USE_INSN then do not remove the edge to make sure + reg-moves will not be created for that address. */ + || autoinc_var_is_used_p (DF_REF_INSN (last_def), use_insn) + /* Always create the edge if the use node is a branch in + order to prevent the creation of reg-moves. */ + || JUMP_P (use_node->insn)) + { + ddg_node_ptr curr_def_node = get_node_of_insn (g, + DF_REF_INSN (curr_def)); + gcc_assert (curr_def_node); + create_ddg_dep_no_link (g, use_node, curr_def_node, + ANTI_DEP, REG_DEP, 1); + } + } } } /* Create an inter-loop output dependence between LAST_DEF (which is the @@ -352,14 +390,16 @@ add_cross_iteration_register_deps (ddg_ptr g, df_ref last_def) { ddg_node_ptr dest_node; - if (DF_REF_ID (last_def) == DF_REF_ID (first_def)) + if (DF_REF_ID (last_def) == DF_REF_ID (first_write_def)) return; - dest_node = get_node_of_insn (g, DF_REF_INSN (first_def)); + dest_node = get_node_of_insn (g, DF_REF_INSN (first_write_def)); gcc_assert (dest_node); create_ddg_dep_no_link (g, last_def_node, dest_node, OUTPUT_DEP, REG_DEP, 1); } + + VEC_free (df_ref, heap, clobbers); } /* Build inter-loop dependencies, by looking at DF analysis backwards. */ static void