Samuel Iglesias Gonsálvez <sigles...@igalia.com> writes:

> It is tested empirically that IVB/BYT don't support indirect addressing
> with doubles but it is not documented in the PRM.
>
> This patch applies the same solution than for Cherryview/Broxton and
> takes into account that we cannot double the stride, since the
> hardware will do it internally.
>
> v2:
> - Fix assert to take into account Indirect DF MOVs in IVB and HSW.
>
> Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
> ---
>  src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 27 
> ++++++++++++++++++--------
>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp       | 11 ++++++-----
>  2 files changed, 25 insertions(+), 13 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> index 487f2e90224..dd6cbab773c 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> @@ -418,18 +418,29 @@ fs_generator::generate_mov_indirect(fs_inst *inst,
>        brw_MOV(p, dst, reg);
>     } else {
>        /* Prior to Broadwell, there are only 8 address registers. */
> -      assert(inst->exec_size == 8 || devinfo->gen >= 8);
> +      assert(inst->exec_size == 8 || devinfo->gen >= 8 ||
> +             (devinfo->gen == 7 && inst->exec_size < 8 &&
> +              (type_sz(reg.type) == 8 || type_sz(dst.type) == 8)));
>  
>        /* We use VxH indirect addressing, clobbering a0.0 through a0.7. */
>        struct brw_reg addr = vec8(brw_address_reg(0));
>  
> -      /* The destination stride of an instruction (in bytes) must be greater
> -       * than or equal to the size of the rest of the instruction.  Since the
> -       * address register is of type UW, we can't use a D-type instruction.
> -       * In order to get around this, re retype to UW and use a stride.
> -       */
> -      indirect_byte_offset =
> -         retype(spread(indirect_byte_offset, 2), BRW_REGISTER_TYPE_UW);
> +      if (devinfo->gen != 7 || devinfo->is_haswell || type_sz(reg.type) != 
> 8) {
> +         /* The destination stride of an instruction (in bytes) must be 
> greater
> +          * than or equal to the size of the rest of the instruction.  Since 
> the
> +          * address register is of type UW, we can't use a D-type 
> instruction.
> +          * In order to get around this, re retype to UW and use a stride.
> +          */
> +         indirect_byte_offset =
> +            retype(spread(indirect_byte_offset, 2), BRW_REGISTER_TYPE_UW);
> +      } else {
> +         /* In Ivybridge/Baytrail, when it operates with DF operands, we
> +          * cannot double the stride, since the hardware will do it
> +          * internally. Tested empirically.
> +          */
> +         indirect_byte_offset =
> +            retype(indirect_byte_offset, BRW_REGISTER_TYPE_UW);
> +      }

This doesn't make any sense, indirect_byte_offset is a 32-bit integer
type regardless of the type of the data you're copying, the hardware
won't do any stride doubling for you here.

I've verified on IVB hardware that indirect addressing works for 64-bit
moves, but getting it to do what you want is a bit trickier than on more
recent hardware -- I'll reply with a patch that gets it working.

>  
>        /* There are a number of reasons why we don't use the base offset here.
>         * One reason is that the field is only 9 bits which means we can only
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> index 8f745dff440..85f43b7b144 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> @@ -3822,17 +3822,18 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, 
> nir_intrinsic_instr *instr
>              (instr->num_components - 1) * type_sz(dest.type);
>  
>           fs_reg indirect_chv_high_32bit;
> -         bool is_chv_bxt_64bit =
> -            (devinfo->is_cherryview || devinfo->is_broxton) &&
> -            type_sz(dest.type) == 8;
> -         if (is_chv_bxt_64bit) {
> +         bool is_ivb_byt_chv_bxt_64bit =

Let's use a more sensible name here like "supports_64bit_indirects" and
mark as const...  Or just remove the variable altogether and fold this
into the condition of the second if statement below.

> +            (devinfo->is_cherryview || devinfo->is_broxton ||
> +             devinfo->is_ivybridge || devinfo->is_baytrail) &&
> +           type_sz(dest.type) == 8;
> +         if (is_ivb_byt_chv_bxt_64bit) {
>              indirect_chv_high_32bit = vgrf(glsl_type::uint_type);
>              /* Calculate indirect address to read high 32 bits */
>              bld.ADD(indirect_chv_high_32bit, indirect, brw_imm_ud(4));

The ADD and temporary seem redundant, you could take into account the
additional 32bit offset by applying an immediate offset to the 'src'
region used below.

>           }
>  
>           for (unsigned j = 0; j < instr->num_components; j++) {
> -            if (!is_chv_bxt_64bit) {
> +            if (!is_ivb_byt_chv_bxt_64bit) {

The code below seems pretty broken...  You lower a 64-bit to 64-bit
indirect move into two 64-bit to UD type-converting moves, which is
unlikely to do what you want.  You need to lower it into two raw 32-bit
to 32bit moves (e.g. by using subscript() in the source as well as on
the destination).  Please fix this (and CC the patch to mesa-stable)
regardless of my patch to get 64-bit MOV_INDIRECT working, CHV/BXT are
unlikely to work as expected except by luck.

>                 bld.emit(SHADER_OPCODE_MOV_INDIRECT,
>                          offset(dest, bld, j), offset(src, bld, j),
>                          indirect, brw_imm_ud(read_size));
> -- 
> 2.11.0
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Attachment: signature.asc
Description: PGP signature

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Reply via email to