Re: [Mesa-dev] [PATCH 3/3] i965: Use correct VertStride on align16 instructions.
On Mon, Jan 23, 2017 at 2:59 AM, Samuel Iglesias Gonsálvezwrote: > On Fri, 2017-01-20 at 14:25 -0800, Francisco Jerez wrote: >> Matt Turner writes: >> >> > In commit c35fa7a, we changed the "width" of DF source registers to >> > 2, >> > which is conceptually fine. Unfortunately a VertStride of 2 is not >> > allowed by align16 instructions on IVB/BYT, and the regular >> > VertStride >> > of 4 works fine in any case. >> > >> >> I'll try to throw some light onto why this works -- AFAIUI, in >> Align16 >> mode the vertical stride control doesn't behave as you would expect >> -- A >> VertStride=0 does behave as expected and steps zero elements across >> rows >> (modulo instruction decompression bugs), but on Gen7 any other value >> simply behaves as "step by a fixed offset of 16B across rows". On >> HSW >> they explicitly allowed VertStride=2, but I don't think the hardware >> became any smarter, it's still most likely a 16B fixed offset. On >> IVB >> neither VertStride=2 nor VertStride=4 is supposed to work for our >> purposes (the former because it's supposedly not supported and the >> latter because one would expect it to step by 4 DF elements = 32B per >> 16B row), but they both likely work in practice. Anyway let's stick >> to >> what the docs say is not illegal, a couple more comments below. >> >> > See generated_tests/spec/arb_gpu_shader_fp64/execution/built-in- >> > functions/vs-round-double.shader_test >> > for example: >> > >> > cmp.ge.f0(8)g18<1>DFg1<0>.xyxyDF-g8<2>DF{ >> > align16 1Q }; >> > ERROR: In Align16 mode, only VertStride of 0 or 4 is >> > allowed >> > cmp.ge.f0(8)g19<1>DFg1<0>.xyxyDF-g9<2>DF{ >> > align16 2N }; >> > ERROR: In Align16 mode, only VertStride of 0 or 4 is >> > allowed >> > --- >> > src/mesa/drivers/dri/i965/brw_eu_emit.c | 18 ++ >> > 1 file changed, 14 insertions(+), 4 deletions(-) >> > >> > diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c >> > b/src/mesa/drivers/dri/i965/brw_eu_emit.c >> > index 888f95e..a01083f 100644 >> > --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c >> > +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c >> > @@ -512,10 +512,15 @@ brw_set_src0(struct brw_codegen *p, brw_inst >> > *inst, struct brw_reg reg) >> > /* This is an oddity of the fact we're using the same >> > * descriptions for registers in align_16 as align_1: >> > */ >> >> Maybe move the comment above into the BRW_VERTICAL_STRIDE_8 block so >> nobody gets confused and thinks that the code you added has to do >> with >> our representation of align_16 regions? >> >> > -if (reg.vstride == BRW_VERTICAL_STRIDE_8) >> > + if (reg.vstride == BRW_VERTICAL_STRIDE_8) { >> > brw_inst_set_src0_vstride(devinfo, inst, >> > BRW_VERTICAL_STRIDE_4); >> > -else >> > + } else if (devinfo->gen == 7 && !devinfo->is_haswell && >> > +reg.type == BRW_REGISTER_TYPE_DF && >> > +reg.vstride >= BRW_VERTICAL_STRIDE_1) { >> >> I think I'd only do this for BRW_VERTICAL_STRIDE_2, because DF >> Align16 >> regions with VertStride=4 really behave like VertStride=2. If the >> caller expected anything else it's not going to get it. >> >> Maybe copy-paste the relevant spec text here and below to explain why >> we >> only use BRW_VERTICAL_STRIDE_4? >> > > Matt, I can handle these changes... however, I have not found the > relevant spec quote. Can you provide it? I could not find anything useful in the IVB PRM. The HSW PRM has the second quote below. In the internal documentation, it says for DevSNB "For Align16 access mode, only encodings of and 0011 are allowed. Other codes are reserved." and for DevHSW "For Align16 access mode, only encodings of , 0010 and 0011 are allowed. Other codes are reserved." Presumably the DevSNB behavior applies to IVB as well. Thanks for handling this. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 3/3] i965: Use correct VertStride on align16 instructions.
On Fri, 2017-01-20 at 14:25 -0800, Francisco Jerez wrote: > Matt Turnerwrites: > > > In commit c35fa7a, we changed the "width" of DF source registers to > > 2, > > which is conceptually fine. Unfortunately a VertStride of 2 is not > > allowed by align16 instructions on IVB/BYT, and the regular > > VertStride > > of 4 works fine in any case. > > > > I'll try to throw some light onto why this works -- AFAIUI, in > Align16 > mode the vertical stride control doesn't behave as you would expect > -- A > VertStride=0 does behave as expected and steps zero elements across > rows > (modulo instruction decompression bugs), but on Gen7 any other value > simply behaves as "step by a fixed offset of 16B across rows". On > HSW > they explicitly allowed VertStride=2, but I don't think the hardware > became any smarter, it's still most likely a 16B fixed offset. On > IVB > neither VertStride=2 nor VertStride=4 is supposed to work for our > purposes (the former because it's supposedly not supported and the > latter because one would expect it to step by 4 DF elements = 32B per > 16B row), but they both likely work in practice. Anyway let's stick > to > what the docs say is not illegal, a couple more comments below. > > > See generated_tests/spec/arb_gpu_shader_fp64/execution/built-in- > > functions/vs-round-double.shader_test > > for example: > > > > cmp.ge.f0(8)g18<1>DFg1<0>.xyxyDF-g8<2>DF{ > > align16 1Q }; > > ERROR: In Align16 mode, only VertStride of 0 or 4 is > > allowed > > cmp.ge.f0(8)g19<1>DFg1<0>.xyxyDF-g9<2>DF{ > > align16 2N }; > > ERROR: In Align16 mode, only VertStride of 0 or 4 is > > allowed > > --- > > src/mesa/drivers/dri/i965/brw_eu_emit.c | 18 ++ > > 1 file changed, 14 insertions(+), 4 deletions(-) > > > > diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c > > b/src/mesa/drivers/dri/i965/brw_eu_emit.c > > index 888f95e..a01083f 100644 > > --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c > > +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c > > @@ -512,10 +512,15 @@ brw_set_src0(struct brw_codegen *p, brw_inst > > *inst, struct brw_reg reg) > > /* This is an oddity of the fact we're using the same > > * descriptions for registers in align_16 as align_1: > > */ > > Maybe move the comment above into the BRW_VERTICAL_STRIDE_8 block so > nobody gets confused and thinks that the code you added has to do > with > our representation of align_16 regions? > > > - if (reg.vstride == BRW_VERTICAL_STRIDE_8) > > + if (reg.vstride == BRW_VERTICAL_STRIDE_8) { > > brw_inst_set_src0_vstride(devinfo, inst, > > BRW_VERTICAL_STRIDE_4); > > - else > > + } else if (devinfo->gen == 7 && !devinfo->is_haswell && > > +reg.type == BRW_REGISTER_TYPE_DF && > > +reg.vstride >= BRW_VERTICAL_STRIDE_1) { > > I think I'd only do this for BRW_VERTICAL_STRIDE_2, because DF > Align16 > regions with VertStride=4 really behave like VertStride=2. If the > caller expected anything else it's not going to get it. > > Maybe copy-paste the relevant spec text here and below to explain why > we > only use BRW_VERTICAL_STRIDE_4? > Matt, I can handle these changes... however, I have not found the relevant spec quote. Can you provide it? Sam > > +brw_inst_set_src0_vstride(devinfo, inst, > > BRW_VERTICAL_STRIDE_4); > > + } else { > > brw_inst_set_src0_vstride(devinfo, inst, reg.vstride); > > + } > > } > > } > > } > > @@ -594,10 +599,15 @@ brw_set_src1(struct brw_codegen *p, brw_inst > > *inst, struct brw_reg reg) > > /* This is an oddity of the fact we're using the same > > * descriptions for registers in align_16 as align_1: > > */ > > - if (reg.vstride == BRW_VERTICAL_STRIDE_8) > > + if (reg.vstride == BRW_VERTICAL_STRIDE_8) { > > brw_inst_set_src1_vstride(devinfo, inst, > > BRW_VERTICAL_STRIDE_4); > > - else > > + } else if (devinfo->gen == 7 && !devinfo->is_haswell && > > +reg.type == BRW_REGISTER_TYPE_DF && > > +reg.vstride >= BRW_VERTICAL_STRIDE_1) { > > +brw_inst_set_src1_vstride(devinfo, inst, > > BRW_VERTICAL_STRIDE_4); > > + } else { > > brw_inst_set_src1_vstride(devinfo, inst, reg.vstride); > > + } > > } > > } > > } > > -- > > 2.7.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 3/3] i965: Use correct VertStride on align16 instructions.
Matt Turnerwrites: > In commit c35fa7a, we changed the "width" of DF source registers to 2, > which is conceptually fine. Unfortunately a VertStride of 2 is not > allowed by align16 instructions on IVB/BYT, and the regular VertStride > of 4 works fine in any case. > I'll try to throw some light onto why this works -- AFAIUI, in Align16 mode the vertical stride control doesn't behave as you would expect -- A VertStride=0 does behave as expected and steps zero elements across rows (modulo instruction decompression bugs), but on Gen7 any other value simply behaves as "step by a fixed offset of 16B across rows". On HSW they explicitly allowed VertStride=2, but I don't think the hardware became any smarter, it's still most likely a 16B fixed offset. On IVB neither VertStride=2 nor VertStride=4 is supposed to work for our purposes (the former because it's supposedly not supported and the latter because one would expect it to step by 4 DF elements = 32B per 16B row), but they both likely work in practice. Anyway let's stick to what the docs say is not illegal, a couple more comments below. > See > generated_tests/spec/arb_gpu_shader_fp64/execution/built-in-functions/vs-round-double.shader_test > for example: > > cmp.ge.f0(8)g18<1>DFg1<0>.xyxyDF-g8<2>DF{ align16 1Q > }; > ERROR: In Align16 mode, only VertStride of 0 or 4 is allowed > cmp.ge.f0(8)g19<1>DFg1<0>.xyxyDF-g9<2>DF{ align16 2N > }; > ERROR: In Align16 mode, only VertStride of 0 or 4 is allowed > --- > src/mesa/drivers/dri/i965/brw_eu_emit.c | 18 ++ > 1 file changed, 14 insertions(+), 4 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c > b/src/mesa/drivers/dri/i965/brw_eu_emit.c > index 888f95e..a01083f 100644 > --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c > +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c > @@ -512,10 +512,15 @@ brw_set_src0(struct brw_codegen *p, brw_inst *inst, > struct brw_reg reg) >/* This is an oddity of the fact we're using the same > * descriptions for registers in align_16 as align_1: > */ Maybe move the comment above into the BRW_VERTICAL_STRIDE_8 block so nobody gets confused and thinks that the code you added has to do with our representation of align_16 regions? > - if (reg.vstride == BRW_VERTICAL_STRIDE_8) > + if (reg.vstride == BRW_VERTICAL_STRIDE_8) { > brw_inst_set_src0_vstride(devinfo, inst, BRW_VERTICAL_STRIDE_4); > - else > + } else if (devinfo->gen == 7 && !devinfo->is_haswell && > +reg.type == BRW_REGISTER_TYPE_DF && > +reg.vstride >= BRW_VERTICAL_STRIDE_1) { I think I'd only do this for BRW_VERTICAL_STRIDE_2, because DF Align16 regions with VertStride=4 really behave like VertStride=2. If the caller expected anything else it's not going to get it. Maybe copy-paste the relevant spec text here and below to explain why we only use BRW_VERTICAL_STRIDE_4? > +brw_inst_set_src0_vstride(devinfo, inst, BRW_VERTICAL_STRIDE_4); > + } else { > brw_inst_set_src0_vstride(devinfo, inst, reg.vstride); > + } >} > } > } > @@ -594,10 +599,15 @@ brw_set_src1(struct brw_codegen *p, brw_inst *inst, > struct brw_reg reg) >/* This is an oddity of the fact we're using the same > * descriptions for registers in align_16 as align_1: > */ > - if (reg.vstride == BRW_VERTICAL_STRIDE_8) > + if (reg.vstride == BRW_VERTICAL_STRIDE_8) { > brw_inst_set_src1_vstride(devinfo, inst, BRW_VERTICAL_STRIDE_4); > - else > + } else if (devinfo->gen == 7 && !devinfo->is_haswell && > +reg.type == BRW_REGISTER_TYPE_DF && > +reg.vstride >= BRW_VERTICAL_STRIDE_1) { > +brw_inst_set_src1_vstride(devinfo, inst, BRW_VERTICAL_STRIDE_4); > + } else { > brw_inst_set_src1_vstride(devinfo, inst, reg.vstride); > + } >} > } > } > -- > 2.7.3 signature.asc Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 3/3] i965: Use correct VertStride on align16 instructions.
In commit c35fa7a, we changed the "width" of DF source registers to 2, which is conceptually fine. Unfortunately a VertStride of 2 is not allowed by align16 instructions on IVB/BYT, and the regular VertStride of 4 works fine in any case. See generated_tests/spec/arb_gpu_shader_fp64/execution/built-in-functions/vs-round-double.shader_test for example: cmp.ge.f0(8)g18<1>DFg1<0>.xyxyDF-g8<2>DF{ align16 1Q }; ERROR: In Align16 mode, only VertStride of 0 or 4 is allowed cmp.ge.f0(8)g19<1>DFg1<0>.xyxyDF-g9<2>DF{ align16 2N }; ERROR: In Align16 mode, only VertStride of 0 or 4 is allowed --- src/mesa/drivers/dri/i965/brw_eu_emit.c | 18 ++ 1 file changed, 14 insertions(+), 4 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c b/src/mesa/drivers/dri/i965/brw_eu_emit.c index 888f95e..a01083f 100644 --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c @@ -512,10 +512,15 @@ brw_set_src0(struct brw_codegen *p, brw_inst *inst, struct brw_reg reg) /* This is an oddity of the fact we're using the same * descriptions for registers in align_16 as align_1: */ -if (reg.vstride == BRW_VERTICAL_STRIDE_8) + if (reg.vstride == BRW_VERTICAL_STRIDE_8) { brw_inst_set_src0_vstride(devinfo, inst, BRW_VERTICAL_STRIDE_4); -else + } else if (devinfo->gen == 7 && !devinfo->is_haswell && +reg.type == BRW_REGISTER_TYPE_DF && +reg.vstride >= BRW_VERTICAL_STRIDE_1) { +brw_inst_set_src0_vstride(devinfo, inst, BRW_VERTICAL_STRIDE_4); + } else { brw_inst_set_src0_vstride(devinfo, inst, reg.vstride); + } } } } @@ -594,10 +599,15 @@ brw_set_src1(struct brw_codegen *p, brw_inst *inst, struct brw_reg reg) /* This is an oddity of the fact we're using the same * descriptions for registers in align_16 as align_1: */ -if (reg.vstride == BRW_VERTICAL_STRIDE_8) + if (reg.vstride == BRW_VERTICAL_STRIDE_8) { brw_inst_set_src1_vstride(devinfo, inst, BRW_VERTICAL_STRIDE_4); -else + } else if (devinfo->gen == 7 && !devinfo->is_haswell && +reg.type == BRW_REGISTER_TYPE_DF && +reg.vstride >= BRW_VERTICAL_STRIDE_1) { +brw_inst_set_src1_vstride(devinfo, inst, BRW_VERTICAL_STRIDE_4); + } else { brw_inst_set_src1_vstride(devinfo, inst, reg.vstride); + } } } } -- 2.7.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev