On 19 January 2012 18:28, Paul Berry <[email protected]> wrote:
> i965 processes assignments of whole structures using > vec4_visitor::emit_block_move, a recursive function which visits each > element of a structure or array (to arbitrary nesting depth) and > copies it from the source to the destination. Then it increments the > source and destination register numbers so that further recursive > invocations will copy the rest of the structure. In addition, it sets > the swizzle field for the source register to an appropriate value of > swizzle_for_size(...) for the size of each element being copied, so > that later optimization passes won't be fooled into thinking that > unused vector elements are live. > > This all works fine. However, emit_block_move also contains an > assertion to verify, before setting the swizzle field for the source > register, that the source register doesn't already contain a > nontrivial swizzle. The intention is to make sure that the caller of > emit_block_move hasn't already done some swizzling of the data before > the call, which emit_block_move would then counteract when it > overwrites the swizzle field. But the assertion is at the lowest > level of nesting of emit_block_move, which means that after the first > element is copied, instead of checking the swizzle field set by the > caller, it checks the swizzle field used when moving the previous > element. That means that if the structure contains elements of > different vector sizes (which therefore require different swizzles), > the assertion will erroneously fire. > > This patch moves the assertion from emit_block_move to the calling > function, vec4_visitor::visit(ir_assignment *). Since the caller is > non-recursive, the assertion will only happen once, and won't be > fooled by emit_block_move's modification of the swizzle field. > > This patch also reverts commit fe006a7 (i965/vs: Fix swizzle related > assertion), which attempted to fix the bug by making the assertion > more lenient, but only worked properly for structures, arrays, and > matrices in which each constituent vector is the same size. > > This fixes the problem described in comment 9 of > https://bugs.freedesktop.org/show_bug.cgi?id=40865. Unfortunately, it > doesn't fix the whole bug, since the test in question is also failing > due to lack of register spilling support in the VS. > > Fixes piglit test fdo40865_c9. No piglit regressions on Sandy Bridge. > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=40865#c9 > I forgot to mention: This is a candidate for the 8.0 release branch. > --- > src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 12 +++++++++--- > 1 files changed, 9 insertions(+), 3 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp > b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp > index 5df2470..10b3c20 100644 > --- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp > +++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp > @@ -1532,9 +1532,6 @@ vec4_visitor::emit_block_move(dst_reg *dst, src_reg > *src, > > dst->writemask = (1 << type->vector_elements) - 1; > > - /* Do we need to worry about swizzling a swizzle? */ > - assert(src->swizzle == BRW_SWIZZLE_NOOP > - || src->swizzle == swizzle_for_size(type->vector_elements)); > src->swizzle = swizzle_for_size(type->vector_elements); > > vec4_instruction *inst = emit(MOV(*dst, *src)); > @@ -1617,6 +1614,15 @@ vec4_visitor::visit(ir_assignment *ir) > emit_bool_to_cond_code(ir->condition, &predicate); > } > > + /* emit_block_move doesn't account for swizzles in the source > register. > + * This should be ok, since the source register is a structure or an > + * array, and those can't be swizzled. But double-check to be sure. > + */ > + assert(src.swizzle == > + (ir->rhs->type->is_matrix() > + ? swizzle_for_size(ir->rhs->type->vector_elements) > + : BRW_SWIZZLE_NOOP)); > + > emit_block_move(&dst, &src, ir->rhs->type, predicate); > return; > } > -- > 1.7.6.5 > >
_______________________________________________ mesa-dev mailing list [email protected] http://lists.freedesktop.org/mailman/listinfo/mesa-dev
