Re: [Mesa-dev] [PATCH 3/3] i965: Use correct VertStride on align16 instructions.

2017-01-23 Thread Matt Turner
On Mon, Jan 23, 2017 at 2:59 AM, Samuel Iglesias Gonsálvez
 wrote:
> On Fri, 2017-01-20 at 14:25 -0800, Francisco Jerez wrote:
>> Matt Turner  writes:
>>
>> > In commit c35fa7a, we changed the "width" of DF source registers to
>> > 2,
>> > which is conceptually fine. Unfortunately a VertStride of 2 is not
>> > allowed by align16 instructions on IVB/BYT, and the regular
>> > VertStride
>> > of 4 works fine in any case.
>> >
>>
>> I'll try to throw some light onto why this works -- AFAIUI, in
>> Align16
>> mode the vertical stride control doesn't behave as you would expect
>> -- A
>> VertStride=0 does behave as expected and steps zero elements across
>> rows
>> (modulo instruction decompression bugs), but on Gen7 any other value
>> simply behaves as "step by a fixed offset of 16B across rows".  On
>> HSW
>> they explicitly allowed VertStride=2, but I don't think the hardware
>> became any smarter, it's still most likely a 16B fixed offset.  On
>> IVB
>> neither VertStride=2 nor VertStride=4 is supposed to work for our
>> purposes (the former because it's supposedly not supported and the
>> latter because one would expect it to step by 4 DF elements = 32B per
>> 16B row), but they both likely work in practice.  Anyway let's stick
>> to
>> what the docs say is not illegal, a couple more comments below.
>>
>> > See generated_tests/spec/arb_gpu_shader_fp64/execution/built-in-
>> > functions/vs-round-double.shader_test
>> > for example:
>> >
>> > cmp.ge.f0(8)g18<1>DFg1<0>.xyxyDF-g8<2>DF{
>> > align16 1Q };
>> > ERROR: In Align16 mode, only VertStride of 0 or 4 is
>> > allowed
>> > cmp.ge.f0(8)g19<1>DFg1<0>.xyxyDF-g9<2>DF{
>> > align16 2N };
>> > ERROR: In Align16 mode, only VertStride of 0 or 4 is
>> > allowed
>> > ---
>> >  src/mesa/drivers/dri/i965/brw_eu_emit.c | 18 ++
>> >  1 file changed, 14 insertions(+), 4 deletions(-)
>> >
>> > diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c
>> > b/src/mesa/drivers/dri/i965/brw_eu_emit.c
>> > index 888f95e..a01083f 100644
>> > --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
>> > +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
>> > @@ -512,10 +512,15 @@ brw_set_src0(struct brw_codegen *p, brw_inst
>> > *inst, struct brw_reg reg)
>> >  /* This is an oddity of the fact we're using the same
>> >   * descriptions for registers in align_16 as align_1:
>> >   */
>>
>> Maybe move the comment above into the BRW_VERTICAL_STRIDE_8 block so
>> nobody gets confused and thinks that the code you added has to do
>> with
>> our representation of align_16 regions?
>>
>> > -if (reg.vstride == BRW_VERTICAL_STRIDE_8)
>> > + if (reg.vstride == BRW_VERTICAL_STRIDE_8) {
>> >  brw_inst_set_src0_vstride(devinfo, inst,
>> > BRW_VERTICAL_STRIDE_4);
>> > -else
>> > + } else if (devinfo->gen == 7 && !devinfo->is_haswell &&
>> > +reg.type == BRW_REGISTER_TYPE_DF &&
>> > +reg.vstride >= BRW_VERTICAL_STRIDE_1) {
>>
>> I think I'd only do this for BRW_VERTICAL_STRIDE_2, because DF
>> Align16
>> regions with VertStride=4 really behave like VertStride=2.  If the
>> caller expected anything else it's not going to get it.
>>
>> Maybe copy-paste the relevant spec text here and below to explain why
>> we
>> only use BRW_VERTICAL_STRIDE_4?
>>
>
> Matt, I can handle these changes... however, I have not found the
> relevant spec quote. Can you provide it?

I could not find anything useful in the IVB PRM. The HSW PRM has the
second quote below.

In the internal documentation, it says for DevSNB "For Align16 access
mode, only encodings of  and 0011 are allowed. Other codes are
reserved."

and for DevHSW "For Align16 access mode, only encodings of , 0010
and 0011 are allowed. Other codes are reserved."

Presumably the DevSNB behavior applies to IVB as well.

Thanks for handling this.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/3] i965: Use correct VertStride on align16 instructions.

2017-01-23 Thread Samuel Iglesias Gonsálvez
On Fri, 2017-01-20 at 14:25 -0800, Francisco Jerez wrote:
> Matt Turner  writes:
> 
> > In commit c35fa7a, we changed the "width" of DF source registers to
> > 2,
> > which is conceptually fine. Unfortunately a VertStride of 2 is not
> > allowed by align16 instructions on IVB/BYT, and the regular
> > VertStride
> > of 4 works fine in any case.
> > 
> 
> I'll try to throw some light onto why this works -- AFAIUI, in
> Align16
> mode the vertical stride control doesn't behave as you would expect
> -- A
> VertStride=0 does behave as expected and steps zero elements across
> rows
> (modulo instruction decompression bugs), but on Gen7 any other value
> simply behaves as "step by a fixed offset of 16B across rows".  On
> HSW
> they explicitly allowed VertStride=2, but I don't think the hardware
> became any smarter, it's still most likely a 16B fixed offset.  On
> IVB
> neither VertStride=2 nor VertStride=4 is supposed to work for our
> purposes (the former because it's supposedly not supported and the
> latter because one would expect it to step by 4 DF elements = 32B per
> 16B row), but they both likely work in practice.  Anyway let's stick
> to
> what the docs say is not illegal, a couple more comments below.
> 
> > See generated_tests/spec/arb_gpu_shader_fp64/execution/built-in-
> > functions/vs-round-double.shader_test
> > for example:
> > 
> > cmp.ge.f0(8)g18<1>DFg1<0>.xyxyDF-g8<2>DF{
> > align16 1Q };
> > ERROR: In Align16 mode, only VertStride of 0 or 4 is
> > allowed
> > cmp.ge.f0(8)g19<1>DFg1<0>.xyxyDF-g9<2>DF{
> > align16 2N };
> > ERROR: In Align16 mode, only VertStride of 0 or 4 is
> > allowed
> > ---
> >  src/mesa/drivers/dri/i965/brw_eu_emit.c | 18 ++
> >  1 file changed, 14 insertions(+), 4 deletions(-)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > index 888f95e..a01083f 100644
> > --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > @@ -512,10 +512,15 @@ brw_set_src0(struct brw_codegen *p, brw_inst
> > *inst, struct brw_reg reg)
> >      /* This is an oddity of the fact we're using the same
> >       * descriptions for registers in align_16 as align_1:
> >       */
> 
> Maybe move the comment above into the BRW_VERTICAL_STRIDE_8 block so
> nobody gets confused and thinks that the code you added has to do
> with
> our representation of align_16 regions?
> 
> > -    if (reg.vstride == BRW_VERTICAL_STRIDE_8)
> > + if (reg.vstride == BRW_VERTICAL_STRIDE_8) {
> >  brw_inst_set_src0_vstride(devinfo, inst,
> > BRW_VERTICAL_STRIDE_4);
> > -    else
> > + } else if (devinfo->gen == 7 && !devinfo->is_haswell &&
> > +reg.type == BRW_REGISTER_TYPE_DF &&
> > +reg.vstride >= BRW_VERTICAL_STRIDE_1) {
> 
> I think I'd only do this for BRW_VERTICAL_STRIDE_2, because DF
> Align16
> regions with VertStride=4 really behave like VertStride=2.  If the
> caller expected anything else it's not going to get it.
> 
> Maybe copy-paste the relevant spec text here and below to explain why
> we
> only use BRW_VERTICAL_STRIDE_4?
> 

Matt, I can handle these changes... however, I have not found the
relevant spec quote. Can you provide it?

Sam

> > +brw_inst_set_src0_vstride(devinfo, inst,
> > BRW_VERTICAL_STRIDE_4);
> > + } else {
> >  brw_inst_set_src0_vstride(devinfo, inst, reg.vstride);
> > + }
> >    }
> > }
> >  }
> > @@ -594,10 +599,15 @@ brw_set_src1(struct brw_codegen *p, brw_inst
> > *inst, struct brw_reg reg)
> >      /* This is an oddity of the fact we're using the same
> >       * descriptions for registers in align_16 as align_1:
> >       */
> > -    if (reg.vstride == BRW_VERTICAL_STRIDE_8)
> > + if (reg.vstride == BRW_VERTICAL_STRIDE_8) {
> >  brw_inst_set_src1_vstride(devinfo, inst,
> > BRW_VERTICAL_STRIDE_4);
> > -    else
> > + } else if (devinfo->gen == 7 && !devinfo->is_haswell &&
> > +reg.type == BRW_REGISTER_TYPE_DF &&
> > +reg.vstride >= BRW_VERTICAL_STRIDE_1) {
> > +brw_inst_set_src1_vstride(devinfo, inst,
> > BRW_VERTICAL_STRIDE_4);
> > + } else {
> >  brw_inst_set_src1_vstride(devinfo, inst, reg.vstride);
> > + }
> >    }
> > }
> >  }
> > -- 
> > 2.7.3
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/3] i965: Use correct VertStride on align16 instructions.

2017-01-20 Thread Francisco Jerez
Matt Turner  writes:

> In commit c35fa7a, we changed the "width" of DF source registers to 2,
> which is conceptually fine. Unfortunately a VertStride of 2 is not
> allowed by align16 instructions on IVB/BYT, and the regular VertStride
> of 4 works fine in any case.
>

I'll try to throw some light onto why this works -- AFAIUI, in Align16
mode the vertical stride control doesn't behave as you would expect -- A
VertStride=0 does behave as expected and steps zero elements across rows
(modulo instruction decompression bugs), but on Gen7 any other value
simply behaves as "step by a fixed offset of 16B across rows".  On HSW
they explicitly allowed VertStride=2, but I don't think the hardware
became any smarter, it's still most likely a 16B fixed offset.  On IVB
neither VertStride=2 nor VertStride=4 is supposed to work for our
purposes (the former because it's supposedly not supported and the
latter because one would expect it to step by 4 DF elements = 32B per
16B row), but they both likely work in practice.  Anyway let's stick to
what the docs say is not illegal, a couple more comments below.

> See 
> generated_tests/spec/arb_gpu_shader_fp64/execution/built-in-functions/vs-round-double.shader_test
> for example:
>
> cmp.ge.f0(8)g18<1>DFg1<0>.xyxyDF-g8<2>DF{ align16 1Q 
> };
> ERROR: In Align16 mode, only VertStride of 0 or 4 is allowed
> cmp.ge.f0(8)g19<1>DFg1<0>.xyxyDF-g9<2>DF{ align16 2N 
> };
> ERROR: In Align16 mode, only VertStride of 0 or 4 is allowed
> ---
>  src/mesa/drivers/dri/i965/brw_eu_emit.c | 18 ++
>  1 file changed, 14 insertions(+), 4 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c 
> b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> index 888f95e..a01083f 100644
> --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
> +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> @@ -512,10 +512,15 @@ brw_set_src0(struct brw_codegen *p, brw_inst *inst, 
> struct brw_reg reg)
>/* This is an oddity of the fact we're using the same
> * descriptions for registers in align_16 as align_1:
> */

Maybe move the comment above into the BRW_VERTICAL_STRIDE_8 block so
nobody gets confused and thinks that the code you added has to do with
our representation of align_16 regions?

> -  if (reg.vstride == BRW_VERTICAL_STRIDE_8)
> + if (reg.vstride == BRW_VERTICAL_STRIDE_8) {
>  brw_inst_set_src0_vstride(devinfo, inst, BRW_VERTICAL_STRIDE_4);
> -  else
> + } else if (devinfo->gen == 7 && !devinfo->is_haswell &&
> +reg.type == BRW_REGISTER_TYPE_DF &&
> +reg.vstride >= BRW_VERTICAL_STRIDE_1) {

I think I'd only do this for BRW_VERTICAL_STRIDE_2, because DF Align16
regions with VertStride=4 really behave like VertStride=2.  If the
caller expected anything else it's not going to get it.

Maybe copy-paste the relevant spec text here and below to explain why we
only use BRW_VERTICAL_STRIDE_4?

> +brw_inst_set_src0_vstride(devinfo, inst, BRW_VERTICAL_STRIDE_4);
> + } else {
>  brw_inst_set_src0_vstride(devinfo, inst, reg.vstride);
> + }
>}
> }
>  }
> @@ -594,10 +599,15 @@ brw_set_src1(struct brw_codegen *p, brw_inst *inst, 
> struct brw_reg reg)
>/* This is an oddity of the fact we're using the same
> * descriptions for registers in align_16 as align_1:
> */
> -  if (reg.vstride == BRW_VERTICAL_STRIDE_8)
> + if (reg.vstride == BRW_VERTICAL_STRIDE_8) {
>  brw_inst_set_src1_vstride(devinfo, inst, BRW_VERTICAL_STRIDE_4);
> -  else
> + } else if (devinfo->gen == 7 && !devinfo->is_haswell &&
> +reg.type == BRW_REGISTER_TYPE_DF &&
> +reg.vstride >= BRW_VERTICAL_STRIDE_1) {
> +brw_inst_set_src1_vstride(devinfo, inst, BRW_VERTICAL_STRIDE_4);
> + } else {
>  brw_inst_set_src1_vstride(devinfo, inst, reg.vstride);
> + }
>}
> }
>  }
> -- 
> 2.7.3


signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/3] i965: Use correct VertStride on align16 instructions.

2017-01-20 Thread Matt Turner
In commit c35fa7a, we changed the "width" of DF source registers to 2,
which is conceptually fine. Unfortunately a VertStride of 2 is not
allowed by align16 instructions on IVB/BYT, and the regular VertStride
of 4 works fine in any case.

See 
generated_tests/spec/arb_gpu_shader_fp64/execution/built-in-functions/vs-round-double.shader_test
for example:

cmp.ge.f0(8)g18<1>DFg1<0>.xyxyDF-g8<2>DF{ align16 1Q };
ERROR: In Align16 mode, only VertStride of 0 or 4 is allowed
cmp.ge.f0(8)g19<1>DFg1<0>.xyxyDF-g9<2>DF{ align16 2N };
ERROR: In Align16 mode, only VertStride of 0 or 4 is allowed
---
 src/mesa/drivers/dri/i965/brw_eu_emit.c | 18 ++
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c 
b/src/mesa/drivers/dri/i965/brw_eu_emit.c
index 888f95e..a01083f 100644
--- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
+++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
@@ -512,10 +512,15 @@ brw_set_src0(struct brw_codegen *p, brw_inst *inst, 
struct brw_reg reg)
 /* This is an oddity of the fact we're using the same
  * descriptions for registers in align_16 as align_1:
  */
-if (reg.vstride == BRW_VERTICAL_STRIDE_8)
+ if (reg.vstride == BRW_VERTICAL_STRIDE_8) {
 brw_inst_set_src0_vstride(devinfo, inst, BRW_VERTICAL_STRIDE_4);
-else
+ } else if (devinfo->gen == 7 && !devinfo->is_haswell &&
+reg.type == BRW_REGISTER_TYPE_DF &&
+reg.vstride >= BRW_VERTICAL_STRIDE_1) {
+brw_inst_set_src0_vstride(devinfo, inst, BRW_VERTICAL_STRIDE_4);
+ } else {
 brw_inst_set_src0_vstride(devinfo, inst, reg.vstride);
+ }
   }
}
 }
@@ -594,10 +599,15 @@ brw_set_src1(struct brw_codegen *p, brw_inst *inst, 
struct brw_reg reg)
 /* This is an oddity of the fact we're using the same
  * descriptions for registers in align_16 as align_1:
  */
-if (reg.vstride == BRW_VERTICAL_STRIDE_8)
+ if (reg.vstride == BRW_VERTICAL_STRIDE_8) {
 brw_inst_set_src1_vstride(devinfo, inst, BRW_VERTICAL_STRIDE_4);
-else
+ } else if (devinfo->gen == 7 && !devinfo->is_haswell &&
+reg.type == BRW_REGISTER_TYPE_DF &&
+reg.vstride >= BRW_VERTICAL_STRIDE_1) {
+brw_inst_set_src1_vstride(devinfo, inst, BRW_VERTICAL_STRIDE_4);
+ } else {
 brw_inst_set_src1_vstride(devinfo, inst, reg.vstride);
+ }
   }
}
 }
-- 
2.7.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev