On Wed, Dec 9, 2015 at 12:08 PM, Kenneth Graunke <kenn...@whitecape.org> wrote: > On Wednesday, December 09, 2015 08:03:25 AM Jason Ekstrand wrote: >> On Dec 9, 2015 2:51 AM, "Kenneth Graunke" <kenn...@whitecape.org> wrote: >> > >> > My tessellation branch has two additional remap functions. I don't want >> > to replicate this logic there. >> > >> > Signed-off-by: Kenneth Graunke <kenn...@whitecape.org> >> > --- >> > src/mesa/drivers/dri/i965/brw_nir.c | 78 >> ++++++++++++++++++++++++------------- >> > 1 file changed, 50 insertions(+), 28 deletions(-) >> > >> > Hey Jason, >> > >> > If you like this patch, and haven't yet merged your NIR input reworks, >> > feel free to just squash it into your changes. Or, we can land it >> > separately after your changes. It's up to you. >> > >> > Separating this out allows me to reuse this in my new tessellation input >> > and output remapping functions, and also means we don't need to add >> structs >> > for the remap functions...we can just pass the builder, or inputs_read, or >> > the VUE map...and not have to pack multiple things together. >> >> Sure. It does make things simpler. >> >> > --Ken >> > >> > diff --git a/src/mesa/drivers/dri/i965/brw_nir.c >> b/src/mesa/drivers/dri/i965/brw_nir.c >> > index 14ad172..105a175 100644 >> > --- a/src/mesa/drivers/dri/i965/brw_nir.c >> > +++ b/src/mesa/drivers/dri/i965/brw_nir.c >> > @@ -27,15 +27,19 @@ >> > #include "glsl/nir/nir_builder.h" >> > #include "program/prog_to_nir.h" >> > >> > -struct remap_vs_attrs_state { >> > - nir_builder b; >> > - uint64_t inputs_read; >> > -}; >> > - >> > +/** >> > + * In many cases, we just add the base and offset together, so there's no >> > + * reason to keep them separate. Sometimes, combining them is essential: >> > + * if a shader only accesses part of a compound variable (such as a >> matrix >> > + * or array), the variable's base may not actually exist in the VUE map. >> > + * >> > + * This pass adds constant offsets to instr->const_index[0], and resets >> > + * the offset source to 0. Non-constant offsets remain unchanged. >> > + */ >> > static bool >> > -remap_vs_attrs(nir_block *block, void *void_state) >> > +add_const_offset_to_base(nir_block *block, void *closure) >> > { >> > - struct remap_vs_attrs_state *state = void_state; >> > + nir_builder *b = closure; >> > >> > nir_foreach_instr_safe(block, instr) { >> > if (instr->type != nir_instr_type_intrinsic) >> > @@ -43,30 +47,48 @@ remap_vs_attrs(nir_block *block, void *void_state) >> > >> > nir_intrinsic_instr *intrin = nir_instr_as_intrinsic(instr); >> > >> > + if (intrin->intrinsic == nir_intrinsic_load_input || >> > + intrin->intrinsic == nir_intrinsic_load_per_vertex_input || >> > + intrin->intrinsic == nir_intrinsic_load_output || >> > + intrin->intrinsic == nir_intrinsic_load_per_vertex_output || >> > + intrin->intrinsic == nir_intrinsic_store_output || >> > + intrin->intrinsic == nir_intrinsic_store_per_vertex_output) { >> >> This seems a bit scortched-earth. It would be nice if the caller had a bit >> more control. > > We could always add "do input"/"do output" options once we actually need > them. It means adding the structs back, but that's fine. > > I suppose for now we could ignore output intrinsics here. > >> > + nir_src *offset = nir_get_io_offset_src(intrin); >> > + nir_const_value *const_offset = nir_src_as_const_value(*offset); >> > + >> > + if (const_offset) { >> > + intrin->const_index[0] += const_offset->u[0]; >> > + b->cursor = nir_before_instr(&intrin->instr); >> > + nir_instr_rewrite_src(&intrin->instr, offset, >> > + nir_src_for_ssa(nir_imm_int(b, 0))); >> > + } >> >> Else??? It seems that you don't want to run this pass if you think you'll >> ever hit an indirect. I guess it's harmless to just do this for all direct >> things in our driver, but it doesn't sit well. > > Else do nothing. The problem I'm trying to avoid with this logic is > that inputs/outputs which take up multiple slots and are directly > accessed may only have slots assigned for the sub-components that are > actually used. So, I can't use the base offset, and need to move the > base to the slot that's actually used. > > If something is accessed indirectly, we assign all of its slots, so > using the base location is fine.
Ok, that makes sense. >> > + } >> > + } >> > + return true; >> > + >> > +} >> > + >> > +static bool >> > +remap_vs_attrs(nir_block *block, void *closure) >> > +{ >> > + GLbitfield64 inputs_read = *((GLbitfield64 *) closure); >> > + >> > + nir_foreach_instr(block, instr) { >> > + if (instr->type != nir_instr_type_intrinsic) >> > + continue; >> > + >> > + nir_intrinsic_instr *intrin = nir_instr_as_intrinsic(instr); >> > + >> > if (intrin->intrinsic == nir_intrinsic_load_input) { >> > /* Attributes come in a contiguous block, ordered by their >> > * gl_vert_attrib value. That means we can compute the slot >> > * number for an attribute by masking out the enabled attributes >> > * before it and counting the bits. >> > */ >> > - nir_const_value *const_offset = >> nir_src_as_const_value(intrin->src[0]); >> > - >> > - /* We set EmitNoIndirect for VS inputs, so there are no >> indirects. */ >> > - assert(const_offset); >> > - >> > - int attr = intrin->const_index[0] + const_offset->u[0]; >> > - int slot = _mesa_bitcount_64(state->inputs_read & >> > - BITFIELD64_MASK(attr)); >> > + int attr = intrin->const_index[0]; >> > + int slot = _mesa_bitcount_64(inputs_read & >> BITFIELD64_MASK(attr)); >> > >> > - /* The NIR -> FS pass will just add the base and offset >> together, so >> > - * there's no reason to keep them separate. Just put it all in >> > - * const_index[0] and set the offset src[0] to load_const(0). >> > - */ >> > intrin->const_index[0] = 4 * slot; >> > - >> > - state->b.cursor = nir_before_instr(&intrin->instr); >> > - nir_instr_rewrite_src(&intrin->instr, &intrin->src[0], >> > - nir_src_for_ssa(nir_imm_int(&state->b, >> 0))); >> > } >> > } >> > return true; >> > @@ -97,17 +119,17 @@ brw_nir_lower_inputs(nir_shader *nir, >> > * key->inputs_read since the two are identical aside from >> Gen4-5 >> > * edge flag differences. >> > */ >> > - struct remap_vs_attrs_state remap_state = { >> > - .inputs_read = nir->info.inputs_read, >> > - }; >> > + GLbitfield64 inputs_read = nir->info.inputs_read; >> > >> > /* This pass needs actual constants */ >> > nir_opt_constant_folding(nir); >> > >> > nir_foreach_overload(nir, overload) { >> > if (overload->impl) { >> > - nir_builder_init(&remap_state.b, overload->impl); >> > - nir_foreach_block(overload->impl, remap_vs_attrs, >> &remap_state); >> > + nir_builder b; >> > + nir_builder_init(&b, overload->impl); >> > + nir_foreach_block(overload->impl, >> add_const_offset_to_base, &b); >> >> What really bothers me is that this call is in remap_vs_attrs but it stomps >> on store_output intrinsics. That seems bad. > > It's not really part of remap_vs_attrs, but you're right, it is part of > brw_nir_lower_inputs(). So affecting outputs seems rather dirty. > > How about we just drop outputs for now, and I can fix it in my series? That would probably work. I'll make an input-only version for VS, squash it in, and then you can make the full version do whatever you want. >> Overall, I think this is a fine approach but I wish it were more >> fine-grained. Unfortunately, I don't have any hot ideas on how to make it >> better off-hand. >> >> --Jason >> >> > + nir_foreach_block(overload->impl, remap_vs_attrs, >> &inputs_read); >> > } >> > } >> > } >> > -- >> > 2.6.3 >> > _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev