On Fri, Jan 18, 2019 at 6:09 PM Francisco Jerez <curroje...@riseup.net>
wrote:

> This is required in combination with the following commit, because
> otherwise if a source region with an extended 8+ stride is present in
> the instruction (which we're about to declare legal) we'll end up
> emitting code that attempts to write to such a region, even though
> strides greater than four are still illegal for the destination.
> ---
>  src/intel/compiler/brw_fs_lower_regioning.cpp | 20 ++++++++++++++-----
>  1 file changed, 15 insertions(+), 5 deletions(-)
>
> diff --git a/src/intel/compiler/brw_fs_lower_regioning.cpp
> b/src/intel/compiler/brw_fs_lower_regioning.cpp
> index 6a3c23892b4..b86e95ed9eb 100644
> --- a/src/intel/compiler/brw_fs_lower_regioning.cpp
> +++ b/src/intel/compiler/brw_fs_lower_regioning.cpp
> @@ -71,15 +71,25 @@ namespace {
>            !is_byte_raw_mov(inst)) {
>           return get_exec_type_size(inst);
>        } else {
> -         unsigned stride = inst->dst.stride * type_sz(inst->dst.type);
> +         /* Calculate the maximum byte stride and the minimum type size
> across
> +          * all source and destination operands.
> +          */
> +         unsigned max_stride = inst->dst.stride * type_sz(inst->dst.type);
> +         unsigned min_size = type_sz(inst->dst.type);
>
>           for (unsigned i = 0; i < inst->sources; i++) {
> -            if (!is_uniform(inst->src[i]) && !inst->is_control_source(i))
> -               stride = MAX2(stride, inst->src[i].stride *
> -                             type_sz(inst->src[i].type));
> +            if (!is_uniform(inst->src[i]) && !inst->is_control_source(i))
> {
> +               max_stride = MAX2(max_stride, inst->src[i].stride *
> +                                 type_sz(inst->src[i].type));
> +               min_size = MIN2(min_size, type_sz(inst->src[i].type));
> +            }
>           }
>
> -         return stride;
> +         /* Attempt to use the largest byte stride among all present
> operands,
> +          * but never exceed a stride of 4 since that would lead to
> illegal
> +          * destination regions during lowering.
> +          */
> +         return MIN2(max_stride, 4 * min_size);
>

Why not just fall back to tightly packed in this case?  I think I can
answer my own question:  Because using something that's equal to one of the
strides reduces the liklihood that we'll need a temporary.  If that's the
correct answer, then maybe what we want is the maximum of all strides with
stride_in_bytes <= 4 * type_sz?

--Jason


>        }
>     }
>
> --
> 2.19.2
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Reply via email to