On Tue, Feb 17, 2015 at 04:41:41PM +0200, Francisco Jerez wrote:
> Tom Stellard <t...@stellard.net> writes:
>
> > On Tue, Feb 17, 2015 at 03:23:05PM +0200, Francisco Jerez wrote:
> >> The round-robin allocation strategy is expected to decrease the amount
> >> of false dependencies created by the register allocator and give the
> >> post-RA scheduling pass more freedom to move instructions around.  On
> >> the other hand it has the disadvantage of increasing fragmentation and
> >> decreasing the number of equally-colored nearby nodes, which increases
> >> the likelihood of failure in presence of optimistically colorable
> >> nodes.
> >>
> >> This patch disables the round-robin strategy for optimistically
> >> colorable nodes.  These typically arise in situations of high register
> >> pressure or for registers with large live intervals, in both cases the
> >> task of the instruction scheduler shouldn't be constrained excessively
> >> by the dense packing of those nodes, and a spill (or on Intel hardware
> >> a fall-back to SIMD8 mode) is invariably worse than a slightly less
> >> optimal scheduling.
> >>
> >
> > Hi Tom,
> >
> > I'm trying to figure out how this will affect r300g, and it seems like
> > from your description that it will be an improvement, because r300g
> > doesn't have a post-ra scheduler and it also can't spill registers.
> >
> > What do you think?
> >
>
> It looks like it won't, apparently i965 is the only caller of
> ra_set_allocate_round_robin() in the tree right now, so it should be the
> only affected back-end.  You could consider enabling it to reduce the
> number of false dependencies introduced by the register allocator -- after
> this patch it shouldn't lead to increased likelihood of register
> allocation failure anymore.  It might however lead to increased register
> usage possibly limiting the number of threads your hardware can run in
> parallel, the answer really depends on whether that's a limiting factor
> for your hardware or not.
> I guess that if you don't have a post-RA scheduling pass the benefit
> you could possibly get from it is rather limited, it's probably safe
> to assume that you don't need it but it might be worth looking into.
>

Ok, thanks for the explanation.  I probably won't have time to
investigate, but it's good knowing this patch is a no-op for r300g, so
I don't need to worry about regressions.

-Tom

> > -Tom
> >
> >
> >> Shader-db results on the i965 driver:
> >>
> >> total instructions in shared programs: 5488539 -> 5488489 (-0.00%)
> >> instructions in affected programs:     1121 -> 1071 (-4.46%)
> >> helped:                                1
> >> HURT:                                  0
> >> GAINED:                                49
> >> LOST:                                  5
> >>
> >> v2: Re-enable round-robin already for the lowest one of the nodes
> >>     pushed optimistically onto the stack (Connor).
> >> ---
> >>  src/util/register_allocate.c | 23 ++++++++++++++++++++++-
> >>  1 file changed, 22 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/src/util/register_allocate.c b/src/util/register_allocate.c
> >> index af7a20c..b1ed273 100644
> >> --- a/src/util/register_allocate.c
> >> +++ b/src/util/register_allocate.c
> >> @@ -168,6 +168,12 @@ struct ra_graph {
> >>
> >>     unsigned int *stack;
> >>     unsigned int stack_count;
> >> +
> >> +   /**
> >> +    * Tracks the start of the set of optimistically-colored registers in the
> >> +    * stack.
> >> +    */
> >> +   unsigned int stack_optimistic_start;
> >>  };
> >>
> >>  /**
> >> @@ -454,6 +460,7 @@ static void
> >>  ra_simplify(struct ra_graph *g)
> >>  {
> >>     bool progress = true;
> >> +   unsigned int stack_optimistic_start = ~0;
> >>     int i;
> >>
> >>     while (progress) {
> >> @@ -483,12 +490,16 @@ ra_simplify(struct ra_graph *g)
> >>
> >>        if (!progress && best_optimistic_node != ~0U) {
> >>           decrement_q(g, best_optimistic_node);
> >> +         stack_optimistic_start =
> >> +            MIN2(stack_optimistic_start, g->stack_count);
> >>           g->stack[g->stack_count] = best_optimistic_node;
> >>           g->stack_count++;
> >>           g->nodes[best_optimistic_node].in_stack = true;
> >>           progress = true;
> >>        }
> >>     }
> >> +
> >> +   g->stack_optimistic_start = stack_optimistic_start;
> >>  }
> >>
> >>  /**
> >> @@ -542,7 +553,17 @@ ra_select(struct ra_graph *g)
> >>        g->nodes[n].reg = r;
> >>        g->stack_count--;
> >>
> >> -      if (g->regs->round_robin)
> >> +      /* Rotate the starting point except for any nodes above the lowest
> >> +       * optimistically colorable node.  The likelihood that we will succeed
> >> +       * at allocating optimistically colorable nodes is highly dependent on
> >> +       * the way that the previous nodes popped off the stack are laid out.
> >> +       * The round-robin strategy increases the fragmentation of the register
> >> +       * file and decreases the number of nearby nodes assigned to the same
> >> +       * color, what increases the likelihood of spilling with respect to the
> >> +       * dense packing strategy.
> >> +       */
> >> +      if (g->regs->round_robin &&
> >> +          g->stack_count <= g->stack_optimistic_start + 1)
> >>           start_search_reg = r + 1;
> >>     }
> >>
> >> --
> >> 2.1.3
> >>
> >> _______________________________________________
> >> mesa-dev mailing list
> >> mesa-dev@lists.freedesktop.org
> >> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
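
For readers who want to see the effect of the ra_select() change in
isolation, here is a minimal, self-contained sketch of the selection loop.
It is not the Mesa code: struct toy_graph, toy_reg_is_free(), toy_select()
and toy_run_independent() are hypothetical names made up for this
illustration, the interference graph is a plain boolean matrix, and register
classes are ignored.  The gating test is written as
stack_count - 1 <= stack_optimistic_start, which is meant to behave like the
patch's condition for every node still on the stack.  Arranged this way the
~0 sentinel (no optimistic nodes) is never incremented, so it cannot wrap
around to 0 in unsigned arithmetic.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define TOY_MAX_NODES 8
#define TOY_NO_REG UINT_MAX

/* Hypothetical, heavily simplified stand-in for struct ra_graph. */
struct toy_graph {
   unsigned int node_count;
   unsigned int reg_count;
   bool round_robin;
   bool adj[TOY_MAX_NODES][TOY_MAX_NODES]; /* interference matrix */
   unsigned int reg[TOY_MAX_NODES];        /* register assigned to each node */
   unsigned int stack[TOY_MAX_NODES];      /* simplification stack */
   unsigned int stack_count;
   unsigned int stack_optimistic_start;    /* UINT_MAX: no optimistic nodes */
};

/* True if no already-colored neighbour of node n holds register r. */
bool
toy_reg_is_free(const struct toy_graph *g, unsigned int n, unsigned int r)
{
   for (unsigned int i = 0; i < g->node_count; i++) {
      if (g->adj[n][i] && g->reg[i] == r)
         return false;
   }
   return true;
}

/* Pops nodes off the stack and colors them.  Returns false if some node
 * cannot be colored (a real allocator would have to spill at that point).
 */
bool
toy_select(struct toy_graph *g)
{
   unsigned int start_search_reg = 0;

   for (unsigned int i = 0; i < g->node_count; i++)
      g->reg[i] = TOY_NO_REG;

   while (g->stack_count != 0) {
      unsigned int n = g->stack[g->stack_count - 1];
      unsigned int r = TOY_NO_REG;

      for (unsigned int ri = 0; ri < g->reg_count; ri++) {
         unsigned int candidate = (start_search_reg + ri) % g->reg_count;
         if (toy_reg_is_free(g, n, candidate)) {
            r = candidate;
            break;
         }
      }
      if (r == TOY_NO_REG)
         return false;

      g->reg[n] = r;
      g->stack_count--;

      /* Rotate the starting point only once the next node to be popped is
       * at or below the lowest optimistically pushed node; nodes above it
       * keep the dense (first-fit from the same start) packing.
       */
      if (g->round_robin &&
          g->stack_count - 1 <= g->stack_optimistic_start)
         start_search_reg = r + 1;
   }
   return true;
}

/* Colors node_count mutually non-interfering nodes that were pushed in
 * index order, and copies the chosen registers into out[0..node_count-1].
 */
bool
toy_run_independent(unsigned int node_count, unsigned int reg_count,
                    bool round_robin, unsigned int stack_optimistic_start,
                    unsigned int *out)
{
   struct toy_graph g = {0};

   assert(node_count <= TOY_MAX_NODES);
   g.node_count = node_count;
   g.reg_count = reg_count;
   g.round_robin = round_robin;
   g.stack_optimistic_start = stack_optimistic_start;
   for (unsigned int i = 0; i < node_count; i++)
      g.stack[i] = i;
   g.stack_count = node_count;

   bool ok = toy_select(&g);
   for (unsigned int i = 0; i < node_count; i++)
      out[i] = g.reg[i];
   return ok;
}
```

With six independent nodes, eight registers and stack_optimistic_start = 3,
toy_run_independent(6, 8, true, 3, regs) leaves nodes 5 and 4 packed into
register 0, while nodes 3 down to 0 rotate through registers 1 to 4.  Passing
UINT_MAX instead (no optimistic nodes) rotates for every node, and passing
round_robin = false packs all six nodes into register 0.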