On Fri, Jan 18, 2019 at 6:09 PM Francisco Jerez <curroje...@riseup.net> wrote:
> This is required in combination with the following commit, because > otherwise if a source region with an extended 8+ stride is present in > the instruction (which we're about to declare legal) we'll end up > emitting code that attempts to write to such a region, even though > strides greater than four are still illegal for the destination. > --- > src/intel/compiler/brw_fs_lower_regioning.cpp | 20 ++++++++++++++----- > 1 file changed, 15 insertions(+), 5 deletions(-) > > diff --git a/src/intel/compiler/brw_fs_lower_regioning.cpp > b/src/intel/compiler/brw_fs_lower_regioning.cpp > index 6a3c23892b4..b86e95ed9eb 100644 > --- a/src/intel/compiler/brw_fs_lower_regioning.cpp > +++ b/src/intel/compiler/brw_fs_lower_regioning.cpp > @@ -71,15 +71,25 @@ namespace { > !is_byte_raw_mov(inst)) { > return get_exec_type_size(inst); > } else { > - unsigned stride = inst->dst.stride * type_sz(inst->dst.type); > + /* Calculate the maximum byte stride and the minimum type size > across > + * all source and destination operands. > + */ > + unsigned max_stride = inst->dst.stride * type_sz(inst->dst.type); > + unsigned min_size = type_sz(inst->dst.type); > > for (unsigned i = 0; i < inst->sources; i++) { > - if (!is_uniform(inst->src[i]) && !inst->is_control_source(i)) > - stride = MAX2(stride, inst->src[i].stride * > - type_sz(inst->src[i].type)); > + if (!is_uniform(inst->src[i]) && !inst->is_control_source(i)) > { > + max_stride = MAX2(max_stride, inst->src[i].stride * > + type_sz(inst->src[i].type)); > + min_size = MIN2(min_size, type_sz(inst->src[i].type)); > + } > } > > - return stride; > + /* Attempt to use the largest byte stride among all present > operands, > + * but never exceed a stride of 4 since that would lead to > illegal > + * destination regions during lowering. > + */ > + return MIN2(max_stride, 4 * min_size); > Why not just fall back to tightly packed in this case? I think I can answer my own question: Because using something that's equal to one of the strides reduces the liklihood that we'll need a temporary. If that's the correct answer, then maybe what we want is the maximum of all strides with stride_in_bytes <= 4 * type_sz? --Jason > } > } > > -- > 2.19.2 > > _______________________________________________ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev >
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev