On Mon, Mar 21, 2016 at 6:51 PM, Connor Abbott <cwabbo...@gmail.com> wrote: > So overall, I think that there needs to be some explanation of the > design choices in the implementation. The API documentation is great, > but digging into the implementation might be a little daunting without > knowing e.g. why NEEDS_PHI is a thing. From what I gather, there are > three potential states a phi node can be in, once it's determined by > the usual dominance-frontier algorithm that it needs to be there: > > 1. There's just a NEEDS_PHI entry. IMHO this is a little bit of a > misnomer. This means something like "we might need to insert a phi, if > something uses it." > 2. The phi has been constructed and added to the list of phis > associated with this value, but it hasn't been inserted into the block > yet. This means something like "we know that this phi is actually > needed, but we haven't materialized its sources yet." > 3. The phi has actually been inserted into the block. > > There isn't enough explanation as to why #2 needs to be separate from > #3. I think it made more sense to me once I realized that the "phis" > list is actually a worklist, and nir_phi_builder_finish() is really a > worklist-based algorithm for recursively filling in phi node > dependencies, and when we go from #1 to #2 we're really just inserting > the phi into the worklist. > > On Sat, Feb 13, 2016 at 9:14 PM, Jason Ekstrand <ja...@jlekstrand.net> wrote: >> Right now, we have phi placement code in two places and there are other >> places where it would be nice to be able to do this analysis. Instead of >> repeating it all over the place, this commit adds a helper for placing all >> of the needed phi nodes for a value. >> --- >> src/compiler/Makefile.sources | 2 + >> src/compiler/nir/Makefile.sources | 2 + >> src/compiler/nir/nir_phi_builder.c | 254 >> +++++++++++++++++++++++++++++++++++++ >> src/compiler/nir/nir_phi_builder.h | 84 ++++++++++++ >> 4 files changed, 342 insertions(+) >> create mode 100644 src/compiler/nir/nir_phi_builder.c >> create mode 100644 src/compiler/nir/nir_phi_builder.h >> >> diff --git a/src/compiler/Makefile.sources b/src/compiler/Makefile.sources >> index c9780d6..4a1b120 100644 >> --- a/src/compiler/Makefile.sources >> +++ b/src/compiler/Makefile.sources >> @@ -213,6 +213,8 @@ NIR_FILES = \ >> nir/nir_opt_peephole_select.c \ >> nir/nir_opt_remove_phis.c \ >> nir/nir_opt_undef.c \ >> + nir/nir_phi_builder.c \ >> + nir/nir_phi_builder.h \ >> nir/nir_print.c \ >> nir/nir_remove_dead_variables.c \ >> nir/nir_search.c \ >> diff --git a/src/compiler/nir/Makefile.sources >> b/src/compiler/nir/Makefile.sources >> index 0755a10..7269f9f 100644 >> --- a/src/compiler/nir/Makefile.sources >> +++ b/src/compiler/nir/Makefile.sources >> @@ -57,6 +57,8 @@ NIR_FILES = \ >> nir_opt_peephole_select.c \ >> nir_opt_remove_phis.c \ >> nir_opt_undef.c \ >> + nir_phi_builder.c \ >> + nir_phi_builder.h \ >> nir_print.c \ >> nir_remove_dead_variables.c \ >> nir_search.c \ >> diff --git a/src/compiler/nir/nir_phi_builder.c >> b/src/compiler/nir/nir_phi_builder.c >> new file mode 100644 >> index 0000000..5429083 >> --- /dev/null >> +++ b/src/compiler/nir/nir_phi_builder.c >> @@ -0,0 +1,254 @@ >> +/* >> + * Copyright © 2016 Intel Corporation >> + * >> + * Permission is hereby granted, free of charge, to any person obtaining a >> + * copy of this software and associated documentation files (the >> "Software"), >> + * to deal in the Software without restriction, including without limitation >> + * the rights to use, copy, modify, merge, publish, distribute, sublicense, >> + * and/or sell copies of the Software, and to permit persons to whom the >> + * Software is furnished to do so, subject to the following conditions: >> + * >> + * The above copyright notice and this permission notice (including the next >> + * paragraph) shall be included in all copies or substantial portions of the >> + * Software. >> + * >> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS >> OR >> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, >> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL >> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR >> OTHER >> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING >> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER >> DEALINGS >> + * IN THE SOFTWARE. >> + */ >> + >> +#include "nir_phi_builder.h" >> +#include "nir/nir_vla.h" >> + >> +struct nir_phi_builder { >> + nir_shader *shader; >> + nir_function_impl *impl; >> + >> + /* Copied from the impl for easy access */ >> + unsigned num_blocks; >> + >> + /* Array of all blocks indexed by block->index. */ >> + nir_block **blocks; >> + >> + /* Hold on to the values so we can easily iterate over them. */ >> + struct exec_list values; >> + >> + /* Worklist for phi adding */ >> + unsigned iter_count; >> + unsigned *work; >> + nir_block **W; >> +}; >> + >> +#define NEEDS_PHI ((nir_ssa_def *)(intptr_t)-1) > > Instead of doing this, you could just do something like: > > static nir_ssa_def dummy; > #define NEEDS_PHI &dummy > > We do similar things in the hash table code already. > >> + >> +struct nir_phi_builder_value { >> + struct exec_node node; >> + >> + struct nir_phi_builder *builder; >> + >> + /* Needed so we can create phis and undefs */ >> + unsigned num_components; >> + >> + /* The list of phi nodes associated with this value. Phi nodes are not >> + * added directly. Instead, they are created, the instr->block pointer >> + * set, and then added to this list. Later, in phi_builder_finish, we >> + * set up their sources and add them to the top of their respective >> + * blocks. >> + */ >> + struct exec_list phis; > > In light of my comments above, how about we make this a worklist > instead? This would also eliminate the need for NEEDS_PHI, since we > can just insert the block into the worklist when we realize we need > the phi node's value. > >> + >> + /* Array of SSA defs, indexed by block. If a phi needs to be inserted >> + * in a given block, it will have the magic value NEEDS_PHI. >> + */ >> + nir_ssa_def *defs[0]; >> +}; >> + >> +static bool >> +fill_block_array(nir_block *block, void *void_data) >> +{ >> + nir_block **blocks = void_data; >> + blocks[block->index] = block; >> + return true; >> +} >> + >> +struct nir_phi_builder * >> +nir_phi_builder_create(nir_function_impl *impl) >> +{ >> + struct nir_phi_builder *pb = ralloc(NULL, struct nir_phi_builder); >> + >> + pb->shader = impl->function->shader; >> + pb->impl = impl; >> + >> + assert(impl->valid_metadata & (nir_metadata_block_index | >> + nir_metadata_dominance)); >> + >> + pb->num_blocks = impl->num_blocks; >> + pb->blocks = ralloc_array(pb, nir_block *, pb->num_blocks); >> + nir_foreach_block(impl, fill_block_array, pb->blocks); >> + >> + exec_list_make_empty(&pb->values); >> + >> + pb->iter_count = 0; >> + pb->work = rzalloc_array(pb, unsigned, pb->num_blocks); >> + pb->W = ralloc_array(pb, nir_block *, pb->num_blocks); >> + >> + return pb; >> +} >> + >> +struct nir_phi_builder_value * >> +nir_phi_builder_add_value(struct nir_phi_builder *pb, unsigned >> num_components, >> + const BITSET_WORD *defs) >> +{ >> + struct nir_phi_builder_value *val; >> + unsigned i, w_start = 0, w_end = 0; >> + >> + val = rzalloc_size(pb, sizeof(*val) + sizeof(val->defs[0]) * >> pb->num_blocks); >> + val->builder = pb; >> + val->num_components = num_components; >> + exec_list_make_empty(&val->phis); >> + exec_list_push_tail(&pb->values, &val->node); >> + >> + pb->iter_count++; >> + >> + BITSET_WORD tmp; >> + BITSET_FOREACH_SET(i, tmp, defs, pb->num_blocks) { >> + if (pb->work[i] < pb->iter_count) >> + pb->W[w_end++] = pb->blocks[i]; >> + pb->work[i] = pb->iter_count; >> + } >> + >> + while (w_start != w_end) { >> + nir_block *cur = pb->W[w_start++]; >> + struct set_entry *dom_entry; >> + set_foreach(cur->dom_frontier, dom_entry) { >> + nir_block *next = (nir_block *) dom_entry->key; >> + >> + /* >> + * If there's more than one return statement, then the end block >> + * can be a join point for some definitions. However, there are >> + * no instructions in the end block, so nothing would use those >> + * phi nodes. Of course, we couldn't place those phi nodes >> + * anyways due to the restriction of having no instructions in the >> + * end block... >> + */ >> + if (next == pb->impl->end_block) >> + continue; >> + >> + if (val->defs[next->index] == NULL) { >> + val->defs[next->index] = NEEDS_PHI; >> + >> + if (pb->work[next->index] < pb->iter_count) { >> + pb->work[next->index] = pb->iter_count; >> + pb->W[w_end++] = next; >> + } >> + } >> + } >> + } >> + >> + return val; >> +} >> + >> +void >> +nir_phi_builder_value_set_block_def(struct nir_phi_builder_value *val, >> + nir_block *block, nir_ssa_def *def) >> +{ >> + val->defs[block->index] = def; >> +} >> + >> +nir_ssa_def * >> +nir_phi_builder_value_get_block_def(struct nir_phi_builder_value *val, >> + nir_block *block) >> +{ >> + if (val->defs[block->index] == NULL) { >> + if (block->imm_dom) { >> + /* Grab it from our immediate dominator. We'll stash it here for >> + * easy access later. >> + */ >> + val->defs[block->index] = >> + nir_phi_builder_value_get_block_def(val, block->imm_dom); >> + return val->defs[block->index]; >> + } else { >> + /* No immediate dominator means that this block is either the >> + * start block or unreachable. In either case, the value is >> + * undefined so we need an SSA undef. >> + */ >> + nir_ssa_undef_instr *undef = >> + nir_ssa_undef_instr_create(val->builder->shader, >> + val->num_components); >> + nir_instr_insert(nir_before_cf_list(&val->builder->impl->body), >> + &undef->instr); >> + val->defs[block->index] = &undef->def; >> + return &undef->def; >> + } >> + } else if (val->defs[block->index] == NEEDS_PHI) { >> + /* If we need a phi instruction, go ahead and create one but don't >> + * add it to the program yet. Later, we'll go through and set up phi >> + * sources and add the instructions will be added at that time. >> + */ >> + nir_phi_instr *phi = nir_phi_instr_create(val->builder->shader); >> + nir_ssa_dest_init(&phi->instr, &phi->dest, val->num_components, NULL); >> + phi->instr.block = block; >> + exec_list_push_tail(&val->phis, &phi->instr.node); >> + val->defs[block->index] = &phi->dest.ssa; >> + return &phi->dest.ssa; >> + } else { >> + return val->defs[block->index]; >> + } >> +} >> + >> +static int >> +compare_blocks(const void *_a, const void *_b) >> +{ >> + nir_block * const * a = _a; >> + nir_block * const * b = _b; >> + >> + return (*a)->index - (*b)->index; >> +} >> + >> +void >> +nir_phi_builder_finish(struct nir_phi_builder *pb) >> +{ >> + const unsigned num_blocks = pb->num_blocks; >> + NIR_VLA(nir_block *, preds, num_blocks); >> + >> + foreach_list_typed(struct nir_phi_builder_value, val, node, &pb->values) >> { >> + /* We can't iterate over the list of phis normally because we are >> + * removing them as we go and, in some cases, adding new phis as we >> + * build the source lists of others. >> + */ >> + while (!exec_list_is_empty(&val->phis)) { >> + struct exec_node *head = exec_list_get_head(&val->phis); >> + nir_phi_instr *phi = exec_node_data(nir_phi_instr, head, >> instr.node); >> + assert(phi->instr.type == nir_instr_type_phi); >> + >> + exec_node_remove(&phi->instr.node); >> + >> + /* Construct an array of predecessors. We sort it to ensure >> + * determinism in the phi insertion algorithm. >> + * >> + * XXX: Calling qsort this many times seems expensive. >> + */ >> + int num_preds = 0; >> + struct set_entry *entry; >> + set_foreach(phi->instr.block->predecessors, entry) >> + preds[num_preds++] = (nir_block *)entry->key; >> + qsort(preds, num_preds, sizeof(*preds), compare_blocks); >> + >> + for (unsigned i = 0; i < num_preds; i++) { >> + nir_phi_src *src = ralloc(phi, nir_phi_src); >> + src->pred = preds[i]; >> + src->src = nir_src_for_ssa( >> + nir_phi_builder_value_get_block_def(val, preds[i])); >> + exec_list_push_tail(&phi->srcs, &src->node); >> + } >> + >> + nir_instr_insert(nir_before_block(phi->instr.block), &phi->instr); >> + } >> + } >> + >> + ralloc_free(pb); >> +} >> diff --git a/src/compiler/nir/nir_phi_builder.h >> b/src/compiler/nir/nir_phi_builder.h >> new file mode 100644 >> index 0000000..50251bf >> --- /dev/null >> +++ b/src/compiler/nir/nir_phi_builder.h >> @@ -0,0 +1,84 @@ >> +/* >> + * Copyright © 2016 Intel Corporation >> + * >> + * Permission is hereby granted, free of charge, to any person obtaining a >> + * copy of this software and associated documentation files (the >> "Software"), >> + * to deal in the Software without restriction, including without limitation >> + * the rights to use, copy, modify, merge, publish, distribute, sublicense, >> + * and/or sell copies of the Software, and to permit persons to whom the >> + * Software is furnished to do so, subject to the following conditions: >> + * >> + * The above copyright notice and this permission notice (including the next >> + * paragraph) shall be included in all copies or substantial portions of the >> + * Software. >> + * >> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS >> OR >> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, >> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL >> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR >> OTHER >> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING >> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER >> DEALINGS >> + * IN THE SOFTWARE. >> + */ >> + >> +#pragma once >> + >> +#include "nir.h" >> + >> +struct nir_phi_builder; >> +struct nir_phi_builder_value; >> + >> +/* Create a new phi builder. >> + * >> + * While this is fairly cheap, it does allocate some memory and walk the >> list >> + * of blocks so it's recommended that you only call it once and use it to >> + * build phis for several values. >> + */ >> +struct nir_phi_builder *nir_phi_builder_create(nir_function_impl *impl); >> + >> +/* Register a value with the builder. >> + * >> + * The 'defs' parameter specifies a bitset of blocks in which the given >> value >> + * is defined. This is used to determine where to place the phi nodes. >> + */ >> +struct nir_phi_builder_value * >> +nir_phi_builder_add_value(struct nir_phi_builder *pb, unsigned >> num_components, >> + const BITSET_WORD *defs); >> + >> +/* Register a definition for the given value and block. >> + * >> + * It is safe to call this function as many times as you wish for any given >> + * block/value pair. However, it always replaces whatever was there >> + * previously even if that definition is from a phi node. The phi builder >> + * always uses the latest information it has, so you must be careful about >> the >> + * order in which you register definitions. The final value at the end of >> the >> + * block must be the last value registered. >> + */ >> +void >> +nir_phi_builder_value_set_block_def(struct nir_phi_builder_value *val, >> + nir_block *block, nir_ssa_def *def); >> + >> +/* Get the definition for the given value in the given block. >> + * >> + * This definition will always be the latest definition known for the given >> + * block. If no definition is immediately available, it will crawl up the >> + * dominance tree and insert phi nodes as needed until it finds one. In the >> + * case that no suitable definition is found, it will return the result of a >> + * nir_ssa_undef_instr with the correct number of components. >> + * >> + * Because this function only uses the latest available information for any >> + * given block, you must have already finished registering definitions for >> any >> + * blocks that dominate the current block in order to get the correct >> result. >> + */ > > I think this isn't conservative enough; you have to register *all* the > definitions before you call nir_phi_builder_get_block_def() anywhere, > since it might return a phi node created by a definition in a block > that doesn't dominate the block you're considering. For example, > consider something like: > > a = ... > loop { > ... = a; > ... > a = ...; > }
Err, nevermind about this part. I was thinking, though, that maybe it would be helpful to have a sort of pseudocode as to how you're supposed to use the API in practice that ties all this information together. Something like: each variable, var, has: a bitset var.defs of blocks where the variable is defined a struct nir_phi_builder_value *pb_val /* initialize bitsets */ foreach block: foreach def of variable var: var.defs[block] = true; /* initialize phi builder */ pb = nir_phi_builder_create() foreach var: var.pb_val = nir_phi_builder_add_value(pb, var.defs) foreach block: /* needs to visit dominators first, nir_for_each_block() will be ok */ foreach instruction: foreach use of variable var: replace use with nir_phi_builder_get_block_def(var.pb_val) foreach def of variable var: create ssa def, register with nir_phi_builder_set_block_def(var.pb_val) nir_phi_builder_finish(pb) did I get this right? > > if you call nir_phi_builder_value_get_block_def() on the use of a > before registering the definition below it, then you'll get the first > definition of a instead of the phi node at the top of the loop, since > the builder hasn't yet figured out that a phi node needs to be > inserted there. > >> +nir_ssa_def * >> +nir_phi_builder_value_get_block_def(struct nir_phi_builder_value *val, >> + nir_block *block); >> + >> +/* Finish building phi nodes and free the builder. >> + * >> + * This function does far more than just free memory. Prior to calling >> + * nir_phi_builder_finish, no phi nodes have actually been inserted in the >> + * program. This function is what finishes setting up phi node sources and >> + * adds the phi nodes to the program. >> + */ >> +void nir_phi_builder_finish(struct nir_phi_builder *pb); >> -- >> 2.5.0.400.gff86faf >> >> _______________________________________________ >> mesa-dev mailing list >> mesa-dev@lists.freedesktop.org >> https://lists.freedesktop.org/mailman/listinfo/mesa-dev _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev