On Tue, Dec 4, 2018 at 1:18 AM Iago Toral Quiroga <ito...@igalia.com> wrote:

> The implementation of these opcodes in the generator assumes that their
> arguments are packed, and it generates register regions based on that
> assumption. While this expectation is reasonable for 32-bit,


Expectation, sure, but if someone does ddx(f2f32(d)) where d is a double,
it's broken.  Maybe we should back-port?  Either way

Reviewed-by: Jason Ekstrand <ja...@jlekstrand.net>


> when we
> load 16-bit elements from UBOs we get them with a stride of 2 that we
> then need to pack with a stride of 1. Copy propagation can see through this
> and rewrite ddx/ddy operands to use the original, strided register,
> breaking
> the implementation in the generator.
> ---
>  .../compiler/brw_fs_copy_propagation.cpp      | 21 +++++++++++++++++++
>  1 file changed, 21 insertions(+)
>
> diff --git a/src/intel/compiler/brw_fs_copy_propagation.cpp
> b/src/intel/compiler/brw_fs_copy_propagation.cpp
> index 58d5080b4e9..c01d4ec4a4f 100644
> --- a/src/intel/compiler/brw_fs_copy_propagation.cpp
> +++ b/src/intel/compiler/brw_fs_copy_propagation.cpp
> @@ -361,6 +361,20 @@ can_take_stride(fs_inst *inst, unsigned arg, unsigned
> stride,
>     return true;
>  }
>
> +static bool
> +instruction_requires_packed_data(fs_inst *inst)
> +{
> +   switch (inst->opcode) {
> +   case FS_OPCODE_DDX_FINE:
> +   case FS_OPCODE_DDX_COARSE:
> +   case FS_OPCODE_DDY_FINE:
> +   case FS_OPCODE_DDY_COARSE:
> +      return true;
> +   default:
> +      return false;
> +   }
> +}
> +
>  bool
>  fs_visitor::try_copy_propagate(fs_inst *inst, int arg, acp_entry *entry)
>  {
> @@ -407,6 +421,13 @@ fs_visitor::try_copy_propagate(fs_inst *inst, int
> arg, acp_entry *entry)
>         inst->opcode == SHADER_OPCODE_GEN4_SCRATCH_WRITE)
>        return false;
>
> +   /* Some instructions implemented in the generator backend, such as
> +    * derivatives, assume that their operands are packed so we can't
> +    * generally propagate strided regions to them.
> +    */
> +   if (instruction_requires_packed_data(inst) && entry->src.stride > 1)
> +      return false;
> +
>     /* Bail if the result of composing both strides would exceed the
>      * hardware limit.
>      */
> --
> 2.17.1
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Reply via email to