Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
Also consider including a fixes tag. Fixes: 6f5abf31466 "i965: Fix output register sizes when multiple variables share a slot." On Fri, May 18, 2018 at 03:31:00PM +0200, Neil Roberts wrote: > In 6f5abf31466aed this code was fixed to calculate the maximum size of > an attribute in a seperate pass and then allocate the registers to > that size. However this wasn’t taking into account ranges that overlap > but don’t have the same starting location. For example: > > layout(location = 0, component = 0) out float a[4]; > layout(location = 2, component = 1) out float b[4]; > > Previously, if ‘a’ was processed first then it would allocate a > register of size 4 for location 0 and it wouldn’t allocate another > register for location 2 because it would already be covered by the > range of 0. Then if something tries to write to b[2] it would try to > write past the end of the register allocated for ‘a’ and it would hit > an assert. > > This patch changes it to scan for any overlapping ranges that start > within each range to calculate the maximum extent and allocate that > instead. > > Fixed Piglit’s arb_enhanced_layouts/execution/component-layout/ > vs-fs-array-interleave-range.shader_test Verified this particular test is failing in master and the patch makes it pass. > --- > > For what it’s worth I also made a simplified test for this for > VkRunner here: > > https://github.com/Igalia/vkrunner/blob/tests/examples/overlap-outputs.shader_test > > src/intel/compiler/brw_fs_nir.cpp | 25 ++++++++++++++++++------- > 1 file changed, 18 insertions(+), 7 deletions(-) > > diff --git a/src/intel/compiler/brw_fs_nir.cpp > b/src/intel/compiler/brw_fs_nir.cpp > index 1ce89520bf1..b179018e5e8 100644 > --- a/src/intel/compiler/brw_fs_nir.cpp > +++ b/src/intel/compiler/brw_fs_nir.cpp > @@ -67,14 +67,25 @@ fs_visitor::nir_setup_outputs() > vec4s[loc] = MAX2(vec4s[loc], var_vec4s); > } > > - nir_foreach_variable(var, &nir->outputs) { > - const int loc = var->data.driver_location; > - if (outputs[loc].file == BAD_FILE) { > - fs_reg reg = bld.vgrf(BRW_REGISTER_TYPE_F, 4 * vec4s[loc]); > - for (unsigned i = 0; i < vec4s[loc]; i++) { > - outputs[loc + i] = offset(reg, bld, 4 * i); > - } > + for (unsigned loc = 0; loc < ARRAY_SIZE(vec4s);) { > + if (vec4s[loc] == 0) { > + loc++; > + continue; > } > + > + unsigned reg_size = vec4s[loc]; > + > + /* Check if there are any ranges that start within this range and > extend > + * past it. If so, include them in this allocation. > + */ > + for (unsigned i = 1; i < reg_size; i++) > + reg_size = MAX2(vec4s[i + loc] + i, reg_size); > + > + fs_reg reg = bld.vgrf(BRW_REGISTER_TYPE_F, 4 * reg_size); > + for (unsigned i = 0; i < reg_size; i++) > + outputs[loc + i] = offset(reg, bld, 4 * i); > + > + loc += reg_size; > } > } > > -- > 2.14.3 > > _______________________________________________ > mesa-dev mailing list > [email protected] > https://lists.freedesktop.org/mailman/listinfo/mesa-dev _______________________________________________ mesa-dev mailing list [email protected] https://lists.freedesktop.org/mailman/listinfo/mesa-dev
