Jason Ekstrand <ja...@jlekstrand.net> writes: > On Tue, Aug 16, 2016 at 1:54 PM, Francisco Jerez <curroje...@riseup.net> > wrote: > >> This adds a bit of metadata to schedule_node that will be used to >> compare available nodes in the scheduling heuristic code based on >> which of them unblocks the earliest successor exit node. Note that >> assigning exit nodes wouldn't be necessary in a bottom-up scheduler >> because we could achieve the same effect by scheduling the exit nodes >> themselves appropriately. >> > > Hopefully we can actually make that switch soon. > > >> --- >> .../drivers/dri/i965/brw_schedule_instructions.cpp | 59 >> ++++++++++++++++++++++ >> 1 file changed, 59 insertions(+) >> >> diff --git a/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp >> b/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp >> index 19d9ec1..96562cf 100644 >> --- a/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp >> +++ b/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp >> @@ -86,8 +86,34 @@ public: >> * its children, or just the issue_time if it's a leaf node. >> */ >> int delay; >> + >> + /** >> + * Preferred exit node among the (direct or indirect) successors of >> this >> + * node. Among the scheduler nodes blocked by this node, this will be >> the >> + * one that may cause earliest program termination, or NULL if none of >> the >> + * successors is an exit node. >> + */ >> + schedule_node *exit; >> }; >> >> +/** >> + * Lower bound of the scheduling time after which one of the instructions >> + * blocked by this node may lead to program termination. >> + * >> + * exit_unblocked_time() determines a strict partial ordering relation >> '«' on >> + * the set of scheduler nodes as follows: >> + * >> + * n « m <-> exit_unblocked_time(n) < exit_unblocked_time(m) >> + * >> + * which can be used to heuristically order nodes according to how early >> they >> + * can unblock an exit node and lead to program termination. >> + */ >> +static inline int >> +exit_unblocked_time(const schedule_node *n) >> +{ >> + return n->exit ? n->exit->unblocked_time : INT_MAX; >> +} >> + >> void >> schedule_node::set_latency_gen4() >> { >> @@ -460,6 +486,7 @@ public: >> void run(cfg_t *cfg); >> void add_insts_from_block(bblock_t *block); >> void compute_delays(); >> + void compute_exits(); >> virtual void calculate_deps() = 0; >> virtual schedule_node *choose_instruction_to_schedule() = 0; >> >> @@ -772,6 +799,7 @@ schedule_node::schedule_node(backend_instruction >> *inst, >> this->unblocked_time = 0; >> this->cand_generation = 0; >> this->delay = 0; >> + this->exit = NULL; >> >> /* We can't measure Gen6 timings directly but expect them to be much >> * closer to Gen7 than Gen4. >> @@ -812,6 +840,36 @@ instruction_scheduler::compute_delays() >> } >> } >> >> +void >> +instruction_scheduler::compute_exits() >> +{ >> + /* Calculate a lower bound of the scheduling time of each node in the >> + * graph. This is analogous to the node's critical path but calculated >> + * from the top instead of from the bottom of the block. >> + */ >> + foreach_in_list(schedule_node, n, &instructions) { >> + for (int i = 0; i < n->child_count; i++) { >> + n->children[i]->unblocked_time = >> + MAX2(n->children[i]->unblocked_time, >> + n->unblocked_time + issue_time(n->inst) + >> n->child_latency[i]); >> + } >> + } >> > > Dos this calculation affect scheduling later on? I don't think it will > since we're effectively setting n->unblocked_time to the lowest possible > based on a dependency DFS. Later uses *shouldn't* set it to anything > less. Should be easy enough to check with shader-db. >
I've just confirmed that this patch doesn't lead to any shader-db changes (neither cycle count, spilling nor dispatch width changes) on any platform, I'll add a short note in the commit message. > >> + >> + /* Calculate the exit of each node by induction based on the exit >> nodes of >> + * its children. The preferred exit of a node is the one among the >> exit >> + * nodes of its children which can be unblocked first according to the >> + * optimistic unblocked time estimate calculated above. >> + */ >> + foreach_in_list_reverse(schedule_node, n, &instructions) { >> + n->exit = (n->inst->opcode == FS_OPCODE_DISCARD_JUMP ? n : NULL); >> + >> + for (int i = 0; i < n->child_count; i++) { >> + if (exit_unblocked_time(n->children[i]) < >> exit_unblocked_time(n)) >> > > It strikes me as a bit odd that we compare n->children[i] with n rather > than n->exit, but the next line means it's equivalent. > > >> + n->exit = n->children[i]->exit; >> + } >> + } >> +} >> > > Assuming this patch *doesn't* change scheduling in any way, > > Reviewed-by: Jason Ekstrand <ja...@jlekstrand.net> > Thanks! > >> + >> /** >> * Add a dependency between two instruction nodes. >> * >> @@ -1631,6 +1689,7 @@ instruction_scheduler::run(cfg_t *cfg) >> calculate_deps(); >> >> compute_delays(); >> + compute_exits(); >> >> schedule_instructions(block); >> } >> -- >> 2.9.0 >> >> _______________________________________________ >> mesa-dev mailing list >> mesa-dev@lists.freedesktop.org >> https://lists.freedesktop.org/mailman/listinfo/mesa-dev >>
signature.asc
Description: PGP signature
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev