On Sat, Apr 2, 2016 at 11:50 AM, Rob Clark <robdcl...@gmail.com> wrote:
> On Fri, Apr 1, 2016 at 1:46 PM, Jason Ekstrand <ja...@jlekstrand.net> > wrote: > > > > > > On Sat, Mar 26, 2016 at 2:02 PM, Rob Clark <robdcl...@gmail.com> wrote: > >> > >> From: Rob Clark <robcl...@freedesktop.org> > >> > >> A pass to lower complex (struct/array/mat) inputs/outputs to primitive > >> types. This allows, for example, linking that removes unused components > >> of a larger type which is not indirectly accessed. > >> > >> In the near term, it is needed for gallium (mesa/st) support for NIR, > >> since only used components of a type are assigned VBO slots, and we > >> otherwise have no way to represent that to the driver backend. But it > >> should be useful for doing shader linking in NIR. > >> > >> Signed-off-by: Rob Clark <robcl...@freedesktop.org> > >> --- > >> src/compiler/Makefile.sources | 1 + > >> src/compiler/nir/nir.h | 1 + > >> src/compiler/nir/nir_lower_io_types.c | 178 > >> ++++++++++++++++++++++++++++++++++ > >> 3 files changed, 180 insertions(+) > >> create mode 100644 src/compiler/nir/nir_lower_io_types.c > >> > >> diff --git a/src/compiler/Makefile.sources > b/src/compiler/Makefile.sources > >> index ae7efbf..8f1ffdc 100644 > >> --- a/src/compiler/Makefile.sources > >> +++ b/src/compiler/Makefile.sources > >> @@ -196,6 +196,7 @@ NIR_FILES = \ > >> nir/nir_lower_idiv.c \ > >> nir/nir_lower_indirect_derefs.c \ > >> nir/nir_lower_io.c \ > >> + nir/nir_lower_io_types.c \ > >> nir/nir_lower_outputs_to_temporaries.c \ > >> nir/nir_lower_passthrough_edgeflags.c \ > >> nir/nir_lower_phis_to_scalar.c \ > >> diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h > >> index 65f75bd..7f7c459 100644 > >> --- a/src/compiler/nir/nir.h > >> +++ b/src/compiler/nir/nir.h > >> @@ -2172,6 +2172,7 @@ void nir_lower_io(nir_shader *shader, > >> nir_src *nir_get_io_offset_src(nir_intrinsic_instr *instr); > >> nir_src *nir_get_io_vertex_index_src(nir_intrinsic_instr *instr); > >> > >> +void nir_lower_io_types(nir_shader *shader, int (*type_size)(const > struct > >> glsl_type *)); > >> void nir_lower_vars_to_ssa(nir_shader *shader); > >> > >> bool nir_remove_dead_variables(nir_shader *shader); > >> diff --git a/src/compiler/nir/nir_lower_io_types.c > >> b/src/compiler/nir/nir_lower_io_types.c > >> new file mode 100644 > >> index 0000000..b7d9ebe > >> --- /dev/null > >> +++ b/src/compiler/nir/nir_lower_io_types.c > >> @@ -0,0 +1,178 @@ > >> +/* > >> + * Copyright © 2016 Red Hat > >> + * > >> + * Permission is hereby granted, free of charge, to any person > obtaining > >> a > >> + * copy of this software and associated documentation files (the > >> "Software"), > >> + * to deal in the Software without restriction, including without > >> limitation > >> + * the rights to use, copy, modify, merge, publish, distribute, > >> sublicense, > >> + * and/or sell copies of the Software, and to permit persons to whom > the > >> + * Software is furnished to do so, subject to the following conditions: > >> + * > >> + * The above copyright notice and this permission notice (including the > >> next > >> + * paragraph) shall be included in all copies or substantial portions > of > >> the > >> + * Software. > >> + * > >> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, > >> EXPRESS OR > >> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF > >> MERCHANTABILITY, > >> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT > >> SHALL > >> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR > >> OTHER > >> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, > >> ARISING FROM, > >> + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER > DEALINGS > >> IN THE > >> + * SOFTWARE. > >> + */ > >> + > >> +#include "nir.h" > >> +#include "nir_builder.h" > >> + > >> +/* Lower complex (struct/array/mat) input and output vars to primitive > >> types > >> + * (vec4) for linking. All indirect input/output access should already > >> be > >> + * lowered (ie. nir_lower_io_to_temporaries). > >> + */ > >> + > >> +struct lower_io_types_state { > >> + nir_shader *shader; > >> + struct exec_list new_ins; > >> + struct exec_list new_outs; > >> + int (*type_size)(const struct glsl_type *); > >> +}; > >> + > >> +static nir_variable * > >> +get_new_var(struct lower_io_types_state *state, nir_variable *var, > >> unsigned off) > >> +{ > >> + struct exec_list *list; > >> + > >> + if (var->data.mode == nir_var_shader_in) { > >> + list = &state->new_ins; > >> + } else { > >> + assert(var->data.mode == nir_var_shader_out); > >> + list = &state->new_outs; > >> + } > >> + > >> + nir_foreach_variable(nvar, list) { > >> + if (nvar->data.location == (var->data.location + off)) > >> + return nvar; > >> + } > > > > > > Doing a linear search here could get expensive. Why not just have an > array > > that maps locations to variables? I think you have a maximum of 64 > possible > > locations (if you disregard tess) so it's not that memory-intensive. > > hmm, list of intputs and outputs is pretty small/bounded.. do you > *really* expect it to be a problem? > No, it should be fine. It just bothers me a bit :-P > >> > >> + > >> + /* doesn't already exist, so we need to create a new one: */ > >> + /* TODO figure out if scalar vs vec, and if float/int/uint/(double?) > >> + * do we need to fixup interpolation mode for int vs float > components > >> + * of a struct, etc.. > >> + */ > >> + const struct glsl_type *ntype = glsl_vec4_type(); > >> + nir_variable *nvar = nir_variable_create(state->shader, > >> var->data.mode, > >> + ntype, NULL); > > > > > > Do you want to create a new one or clone? Cloning seems better because > that > > would ensure that interp qualifiers etc. get copied over. > > I started out that way.. but decided to avoid it for some reason.. I > think just to avoid the duplicate ralloc_asprintf() for the name, but > there might have been another reason.. > > > Also, I seem to recall it being possible for outputs to have constant > > initializers coming out of GLSL IR. This pass doesn't handle that. > > hmm, I haven't come across any piglit that triggered this.. if you > have some ideas for a .shader_test I could try.. > I don't know of anything off-hand that triggers it. I just seem to recall coming across it at some point. I guess one could put in an assert somewhere and do a piglit run. > >> > >> + > >> + nvar->name = ralloc_asprintf(nvar, "%s@%u", var->name, off); > >> + nvar->data = var->data; > >> + nvar->data.location += off; > >> + > >> + /* nir_variable_create is too clever for it's own good: */ > >> + exec_node_remove(&nvar->node); > >> + exec_node_self_link(&nvar->node); /* no delinit() :-( */ > > > > > > This isn't needed. Re-insertion just stomps whatever was there before. > > maybe.. tbh I'd kinda prefer if _remove() spent the extra couple > instructions to _self_link() rather than leaving booby-traps. List > stuff can fail in spectacular fashion if you get that sort of thing > wrong and extra couple instructions is lost in the noise. (Plus, > everything is already in the cache at this point, so you'll have a > hard time proving to me that you could measure a perf diff..) > Sure... I've made that argument on other linked list implementations in the past. Feel free to send the patch and see what the reaction is :-p > > Incidentally, if you had an array, you could just let > nir_variable_create be > > clever. > > > >> > >> + > >> + exec_list_push_tail(list, &nvar->node); > >> + > >> + /* remove existing var from input/output list: */ > >> + exec_node_remove(&var->node); > >> + exec_node_self_link(&var->node); > > > > > > This isn't needed either. > > > >> > >> + > >> + return nvar; > >> +} > >> + > >> +static unsigned > >> +get_deref_offset(struct lower_io_types_state *state, nir_deref *tail) > >> +{ > >> + unsigned offset = 0; > >> + > >> + while (tail->child != NULL) { > >> + const struct glsl_type *parent_type = tail->type; > >> + tail = tail->child; > >> + > >> + if (tail->deref_type == nir_deref_type_array) { > >> + nir_deref_array *deref_array = nir_deref_as_array(tail); > >> + > >> + /* indirect inputs/outputs should already be lowered! */ > >> + assert(deref_array->deref_array_type == > >> nir_deref_array_type_direct); > >> + > >> + unsigned size = state->type_size(tail->type); > >> + > >> + offset += size * deref_array->base_offset; > >> + } else if (tail->deref_type == nir_deref_type_struct) { > >> + nir_deref_struct *deref_struct = nir_deref_as_struct(tail); > >> + > >> + for (unsigned i = 0; i < deref_struct->index; i++) { > >> + offset += > state->type_size(glsl_get_struct_field(parent_type, > >> i)); > >> + } > >> + } > >> + } > >> + > >> + return offset; > >> +} > >> + > >> +static bool > >> +lower_io_types_block(nir_block *block, void *void_state) > >> +{ > >> + struct lower_io_types_state *state = void_state; > >> + > >> + nir_foreach_instr(block, instr) { > >> + if (instr->type != nir_instr_type_intrinsic) > >> + continue; > >> + > >> + nir_intrinsic_instr *intr = nir_instr_as_intrinsic(instr); > >> + > >> + if ((intr->intrinsic != nir_intrinsic_load_var) && > >> + (intr->intrinsic != nir_intrinsic_store_var)) > >> + continue; > > > > > > You should say somewhere in the top comment that copies need to also be > > lowered prior to this pass. > > ok, I'll add a comment > > >> > >> + > >> + nir_variable *var = intr->variables[0]->var; > >> + > >> + if ((var->data.mode != nir_var_shader_in) && > >> + (var->data.mode != nir_var_shader_out)) > >> + continue; > >> + > >> + /* no need to split up if already primitive */ > >> + if (state->type_size(var->type) == 1) > >> + continue; > >> + > >> + unsigned off = get_deref_offset(state, > &intr->variables[0]->deref); > >> + nir_variable *nvar = get_new_var(state, var, off); > >> + > >> + /* and then re-write the load/store_var deref: */ > >> + intr->variables[0] = nir_deref_var_create(intr, nvar); > >> + } > >> + > >> + return true; > >> +} > >> + > >> +static void > >> +lower_io_types_impl(nir_function_impl *impl, struct > lower_io_types_state > >> *state) > >> +{ > >> + nir_foreach_block(impl, lower_io_types_block, state); > >> + > >> + nir_metadata_preserve(impl, nir_metadata_block_index | > >> + nir_metadata_dominance); > >> +} > >> + > >> + > >> +void > >> +nir_lower_io_types(nir_shader *shader, int (*type_size)(const struct > >> glsl_type *)) > > > > > > Just a side-note (not a request) but we really should have a > > "glsl_type_size_func" typedef somewhere. > > not a bad idea.. although maybe I'll punt until gallium and i965 > type_size fxns are unified (in which case we could drop this arg).. > > > Also, what happened to just using nir_type_size_vec4? Incidentally, I > think > > you could also use the newly exposed component_slots function from my > > series. > > After the lower_io patch, I didn't need the type_size fxn from > freedreno/ir3. That and I punted on sorting out the small deltas > between the gallium and i965 versions.. you seemed to thing some of > those diffs might not matter, but I've had this series dragging on for > long enough that I figured I'd avoid introducing dependencies on > sorting that out. Not like there are a lot of call-sites for this > pass, so I figured no real problem w/ dropping the type_size fxn ptr > in a later patchset. > Sure. I don't want to block on that. I do think component_slots is more-or-less the "correct" thing to do since it uses the exact same units as var->data.location. I'm not too worried about the other comments. Reviewed-by: Jason Ekstrand <ja...@jlekstrand.net> > BR, > -R > > >> > >> +{ > >> + struct lower_io_types_state state; > >> + > >> + /* NOTE: only operates on units of vec4 slots: */ > >> + assert(type_size(glsl_vec4_type()) == 1); > >> + > >> + state.shader = shader; > >> + exec_list_make_empty(&state.new_ins); > >> + exec_list_make_empty(&state.new_outs); > >> + state.type_size = type_size; > >> + > >> + nir_foreach_function(shader, function) { > >> + if (function->impl) > >> + lower_io_types_impl(function->impl, &state); > >> + } > >> + > >> + /* move new in/out vars to shader's lists: */ > >> + exec_list_append(&shader->inputs, &state.new_ins); > >> + exec_list_append(&shader->outputs, &state.new_outs); > >> +} > >> -- > >> 2.5.5 > >> > >> _______________________________________________ > >> mesa-dev mailing list > >> mesa-dev@lists.freedesktop.org > >> https://lists.freedesktop.org/mailman/listinfo/mesa-dev > > > > >
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev