Re: [Mesa-dev] [PATCH] nv50/ir: remove dnz flag when converting MAD to ADD due to optimizations

2018-11-24 Thread Karol Herbst
Yeah, sounds fine. I wasn't 100% sure what the dnz flag does. With the
addition below:
Reviewed-by: Karol Herbst 

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index 307d8762506..202faf0746a 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
@@ -1094,6 +1094,7 @@ ConstantFolding::opnd(Instruction *i, ImmediateValue &imm0, int s)
  if (imm0.isNegative())
 i->src(t).mod = i->src(t).mod ^ Modifier(NV50_IR_MOD_NEG);
  i->op = OP_ADD;
+ i->dnz = 0;
  i->setSrc(s, i->getSrc(t));
  i->src(s).mod = i->src(t).mod;
   } else

shader:
FRAG
PROPERTY FS_COORD_ORIGIN UPPER_LEFT
PROPERTY MUL_ZERO_WINS 1
DCL IN[0], COLOR, COLOR
DCL IN[1], TEXCOORD[0], PERSPECTIVE
DCL OUT[0], COLOR
DCL OUT[1], COLOR[1]
DCL CONST[0][0..129]
DCL TEMP[0..2]
IMM[0] FLT32 {   -0.,-1., 2.,-0.5000}
 0: ADD TEMP[0].x, -CONST[0][112]., IN[1].
 1: CMP TEMP[0], TEMP[0]., IMM[0]., IMM[0].
 2: KILL_IF TEMP[0]
 3: MUL TEMP[0].xyz, CONST[0][0], IN[0]
 4: MOV TEMP[0].w, IN[0].
 5: MUL TEMP[1].xyz, TEMP[0], IMM[0].
 6: MUL OUT[0].w, TEMP[0]., CONST[0][0].
 7: MAD_SAT TEMP[0].w, IN[1]., CONST[0][128]., CONST[0][128].
 8: MUL TEMP[0].w, TEMP[0]., CONST[0][129].
 9: MOV TEMP[2].z, IMM[0].
10: MAD TEMP[0].xyz, TEMP[2]., -TEMP[0], CONST[0][129]
11: MAD OUT[0].xyz, TEMP[0]., TEMP[0], TEMP[1]
12: MOV OUT[1], -IMM[0].wwwy
13: END

On Sun, Nov 25, 2018 at 3:58 AM Ilia Mirkin  wrote:
>
> The dnz flag only applies to multiplications (e.g. to make 0 * Infinity
> become 0 instead of NaN). Once we optimize a MAD into an ADD, the dnz
> flag no longer makes sense, and it upsets the GM107 emitter (since it looks
> at the ftz and dnz flags together).
>
> Signed-off-by: Ilia Mirkin 
> ---
>  src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> index 04d26dcbf53..307d8762506 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> @@ -740,6 +740,7 @@ ConstantFolding::expr(Instruction *i,
>// restrictions, so move it into a separate LValue.
>bld.setPosition(i, false);
>i->op = OP_ADD;
> +  i->dnz = 0;
>i->setSrc(1, bld.mkMov(bld.getSSA(type), i->getSrc(0), 
> type)->getDef(0));
>i->setSrc(0, i->getSrc(2));
>i->src(0).mod = i->src(2).mod;
> @@ -1131,6 +1132,7 @@ ConstantFolding::opnd(Instruction *i, ImmediateValue 
> , int s)
>   i->setSrc(1, i->getSrc(2));
>   i->src(1).mod = i->src(2).mod;
>   i->setSrc(2, NULL);
> + i->dnz = 0;
>   i->op = OP_ADD;
>} else
>if (!isFloatType(i->dType) && !i->subOp && !i->src(t).mod && 
> !i->src(2).mod) {
> --
> 2.18.1
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] nv50/ir: remove dnz flag when converting MAD to ADD due to optimizations

2018-11-24 Thread Ilia Mirkin
The dnz flag only applies to multiplications (e.g. to make 0 * Infinity
become 0 instead of NaN). Once we optimize a MAD into an ADD, the dnz
flag no longer makes sense, and it upsets the GM107 emitter (since it looks
at the ftz and dnz flags together).

Signed-off-by: Ilia Mirkin 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index 04d26dcbf53..307d8762506 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
@@ -740,6 +740,7 @@ ConstantFolding::expr(Instruction *i,
   // restrictions, so move it into a separate LValue.
   bld.setPosition(i, false);
   i->op = OP_ADD;
+  i->dnz = 0;
   i->setSrc(1, bld.mkMov(bld.getSSA(type), i->getSrc(0), type)->getDef(0));
   i->setSrc(0, i->getSrc(2));
   i->src(0).mod = i->src(2).mod;
@@ -1131,6 +1132,7 @@ ConstantFolding::opnd(Instruction *i, ImmediateValue &imm0, int s)
  i->setSrc(1, i->getSrc(2));
  i->src(1).mod = i->src(2).mod;
  i->setSrc(2, NULL);
+ i->dnz = 0;
  i->op = OP_ADD;
   } else
   if (!isFloatType(i->dType) && !i->subOp && !i->src(t).mod && !i->src(2).mod) {
-- 
2.18.1
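
For readers unfamiliar with the flag, the multiplication behaviour described
in the commit message can be modelled in plain C. This is only a sketch, not
nouveau code: mul_dnz, mul_ieee and dnz_examples are hypothetical names, and
the model assumes that a zero operand wins over any other operand, as the
"0 * Infinity becomes 0" example suggests.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical model of a multiply with the dnz flag set: if either
 * operand is zero the result is zero, even when the other operand is
 * Infinity (a plain IEEE 754 multiply would give NaN). */
float mul_dnz(float a, float b)
{
   if (a == 0.0f || b == 0.0f)
      return 0.0f;
   return a * b;
}

/* Plain IEEE 754 multiply, for comparison. */
float mul_ieee(float a, float b)
{
   return a * b;
}

/* Usage: the cases mentioned in the commit message. */
void dnz_examples(void)
{
   assert(mul_dnz(0.0f, INFINITY) == 0.0f);   /* zero wins */
   assert(isnan(mul_ieee(0.0f, INFINITY)));   /* IEEE result is NaN */
   assert(mul_dnz(2.0f, 3.0f) == 6.0f);       /* otherwise unchanged */
}
```

An addition has no such special case, which is why the flag carries no
meaning once the instruction has become an ADD.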



Re: [Mesa-dev] [PATCH 2/2] nv50, nvc0: Fix gallium nine regression regarding sampler bindings

2018-11-24 Thread Karol Herbst
On Sun, Nov 25, 2018 at 2:11 AM Ilia Mirkin  wrote:
>
> Using this approach, num_samplers will never go down. Also, this
> applies to more than just samplers -- textures, everything else.

But is this a problem? I checked where num_samplers is used, and I
don't see that its actual value makes much of a difference. Sure, we
end up with unused but bound objects, but the original change inside
gallium also stopped caring about this completely.

Maybe I overlooked something, though.
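
The difference between the two bookkeeping schemes under discussion can be
sketched with a reduced model. Everything here is hypothetical (the struct,
bind_exact, bind_keep, and MAX_SAMPLERS as a stand-in for PIPE_MAX_SAMPLERS),
and the tsc lock/unlock handling of the real drivers is left out.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SAMPLERS 16   /* stand-in for PIPE_MAX_SAMPLERS */
#define MAX2(a, b) ((a) > (b) ? (a) : (b))

/* Hypothetical, reduced model of one shader stage's sampler slots. */
struct stage_samplers {
   const void *slot[MAX_SAMPLERS];
   unsigned num;   /* plays the role of num_samplers[s] */
};

/* Old behaviour: slots at index >= nr are unbound and num follows nr. */
void bind_exact(struct stage_samplers *st, unsigned nr, const void **hwcso)
{
   unsigned i;

   assert(nr <= MAX_SAMPLERS);
   for (i = 0; i < nr; ++i)
      st->slot[i] = hwcso[i];
   for (; i < st->num; ++i)
      st->slot[i] = NULL;
   st->num = nr;
}

/* Patched behaviour: trailing slots stay bound and num only ever grows. */
void bind_keep(struct stage_samplers *st, unsigned nr, const void **hwcso)
{
   assert(nr <= MAX_SAMPLERS);
   for (unsigned i = 0; i < nr; ++i)
      st->slot[i] = hwcso[i];
   st->num = MAX2(st->num, nr);
}
```

Binding three samplers and then one leaves num at 1 with bind_exact but at 3
with bind_keep, which is the "never goes down" behaviour pointed out above.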

> On Sat, Nov 24, 2018 at 6:04 PM Karol Herbst  wrote:
> >
> > The new approach is that samplers don't get unbound even if they won't be used
> > in a draw and we should just leave them be as well.
> >
> > Fixes a regression in multiple Windows games using gallium nine and nouveau.
> >
> > Fixes: 4d6fab245eec3880e2a59424a579851f44857ce8
> >"cso: don't track the number of sampler states bound"
> > Signed-off-by: Karol Herbst 
> > ---
> >  src/gallium/drivers/nouveau/nv50/nv50_state.c | 13 ++---
> >  src/gallium/drivers/nouveau/nvc0/nvc0_state.c | 17 +
> >  2 files changed, 7 insertions(+), 23 deletions(-)
> >
> > diff --git a/src/gallium/drivers/nouveau/nv50/nv50_state.c 
> > b/src/gallium/drivers/nouveau/nv50/nv50_state.c
> > index fb4a259ce16..59437b22c9c 100644
> > --- a/src/gallium/drivers/nouveau/nv50/nv50_state.c
> > +++ b/src/gallium/drivers/nouveau/nv50/nv50_state.c
> > @@ -600,25 +600,16 @@ static inline void
> >  nv50_stage_sampler_states_bind(struct nv50_context *nv50, int s,
> > unsigned nr, void **hwcso)
> >  {
> > -   unsigned i;
> > -
> > assert(nr <= PIPE_MAX_SAMPLERS);
> > -   for (i = 0; i < nr; ++i) {
> > +   for (unsigned i = 0; i < nr; ++i) {
> >struct nv50_tsc_entry *old = nv50->samplers[s][i];
> >
> >nv50->samplers[s][i] = nv50_tsc_entry(hwcso[i]);
> >if (old)
> >   nv50_screen_tsc_unlock(nv50->screen, old);
> > }
> > -   assert(nv50->num_samplers[s] <= PIPE_MAX_SAMPLERS);
> > -   for (; i < nv50->num_samplers[s]; ++i) {
> > -  if (nv50->samplers[s][i]) {
> > - nv50_screen_tsc_unlock(nv50->screen, nv50->samplers[s][i]);
> > - nv50->samplers[s][i] = NULL;
> > -  }
> > -   }
> >
> > -   nv50->num_samplers[s] = nr;
> > +   nv50->num_samplers[s] = MAX2(nv50->num_samplers[s], nr);
> >
> > nv50->dirty_3d |= NV50_NEW_3D_SAMPLERS;
> >  }
> > diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c 
> > b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
> > index f2393cb27b5..12765af8585 100644
> > --- a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
> > +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
> > @@ -449,10 +449,12 @@ nvc0_sampler_state_delete(struct pipe_context *pipe, 
> > void *hwcso)
> >  {
> > unsigned s, i;
> >
> > -   for (s = 0; s < 6; ++s)
> > +   for (s = 0; s < 6; ++s) {
> >for (i = 0; i < nvc0_context(pipe)->num_samplers[s]; ++i)
> >   if (nvc0_context(pipe)->samplers[s][i] == hwcso)
> >  nvc0_context(pipe)->samplers[s][i] = NULL;
> > +  nvc0_context(pipe)->num_samplers[s] = 0;
> > +   }
> >
> > nvc0_screen_tsc_free(nvc0_context(pipe)->screen, nv50_tsc_entry(hwcso));
> >
> > @@ -464,9 +466,7 @@ nvc0_stage_sampler_states_bind(struct nvc0_context 
> > *nvc0,
> > unsigned s,
> > unsigned nr, void **hwcso)
> >  {
> > -   unsigned i;
> > -
> > -   for (i = 0; i < nr; ++i) {
> > +   for (unsigned i = 0; i < nr; ++i) {
> >struct nv50_tsc_entry *old = nvc0->samplers[s][i];
> >
> >if (hwcso[i] == old)
> > @@ -477,14 +477,7 @@ nvc0_stage_sampler_states_bind(struct nvc0_context 
> > *nvc0,
> >if (old)
> >   nvc0_screen_tsc_unlock(nvc0->screen, old);
> > }
> > -   for (; i < nvc0->num_samplers[s]; ++i) {
> > -  if (nvc0->samplers[s][i]) {
> > - nvc0_screen_tsc_unlock(nvc0->screen, nvc0->samplers[s][i]);
> > - nvc0->samplers[s][i] = NULL;
> > -  }
> > -   }
> > -
> > -   nvc0->num_samplers[s] = nr;
> > +   nvc0->num_samplers[s] = MAX2(nvc0->num_samplers[s], nr);
> >  }
> >
> >  static void
> > --
> > 2.19.1
> >


Re: [Mesa-dev] [PATCH 1/2] nv50/ir: don't optimize dnz muls to add

2018-11-24 Thread Karol Herbst
Yeah, I was hitting some asserts with a d3d trace. The issue is that
we optimize some MADs/MULs that have dnz set into ADDs, but the emitter
isn't able to emit an ADD with the dnz flag, at least on gm107.

Example TGSI triggering it (there are more cases, though):
VERT
PROPERTY NEXT_SHADER FRAG
PROPERTY MUL_ZERO_WINS 1
DCL IN[0]
DCL IN[1]
DCL IN[2]
DCL OUT[0], POSITION
DCL OUT[1], COLOR
DCL OUT[2].xy, TEXCOORD[0]
DCL CONST[0][0..240]
DCL TEMP[0]
IMM[0] FLT32 {1., 0., 0., 0.}
 0: MUL TEMP[0], CONST[0][1], IN[0].
 1: MAD TEMP[0], IN[0]., CONST[0][0], TEMP[0]
 2: MAD TEMP[0], IN[0]., CONST[0][2], TEMP[0]
 3: MAD TEMP[0], IN[0]., CONST[0][3], TEMP[0]
 4: ADD OUT[0], TEMP[0], CONST[0][240]
 5: MUL OUT[1], CONST[0][4], IN[1]
 6: MAD TEMP[0].xyz, IN[2].xyxw, IMM[0].xxyy, IMM[0].yyxy
 7: DP3 OUT[2].x, TEMP[0], CONST[0][16].xyww
 8: DP3 OUT[2].y, TEMP[0], CONST[0][17].xyww
 9: END
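
The failure mode can be modelled with a toy IR. This is a sketch only: the
toy_* names are hypothetical and the instruction is reduced to an opcode plus
the flag. It shows one way out, namely dropping the flag during the rewrite so
that the emitter never sees an ADD with dnz set.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, heavily reduced stand-in for an IR instruction. */
enum toy_op { TOY_MUL, TOY_MAD, TOY_ADD };

struct toy_insn {
   enum toy_op op;
   bool dnz;   /* only meaningful while the op multiplies */
};

/* Rewrite "x * 2" as "x + x"; the flag goes away with the multiply. */
void toy_mul2_to_add(struct toy_insn *i)
{
   i->op = TOY_ADD;
   i->dnz = false;
}

/* Models an emitter that has no encoding for an ADD with dnz. */
bool toy_can_emit(const struct toy_insn *i)
{
   return !(i->op == TOY_ADD && i->dnz);
}

/* Models the assert that fires when a stale flag reaches the emitter. */
void toy_emit(const struct toy_insn *i)
{
   assert(toy_can_emit(i));
}
```
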
On Sun, Nov 25, 2018 at 2:12 AM Ilia Mirkin  wrote:
>
> Can you elaborate as to what the issue is? The dnz flag is set when we
> want to make NaN -> Infinity. Do you have a concrete TGSI program that
> triggers issues?
> On Sat, Nov 24, 2018 at 6:04 PM Karol Herbst  wrote:
> >
> > fixes asserts with gallium nine
> >
> > Signed-off-by: Karol Herbst 
> > ---
> >  src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 6 --
> >  1 file changed, 4 insertions(+), 2 deletions(-)
> >
> > diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp 
> > b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> > index 04d26dcbf53..0a284572ede 100644
> > --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> > +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> > @@ -557,6 +557,8 @@ ConstantFolding::expr(Instruction *i,
> > case OP_MAD:
> > case OP_FMA:
> > case OP_MUL:
> > +  if (i->dnz && i->op != OP_MUL)
> > + return;
> >if (i->dnz && i->dType == TYPE_F32) {
> >   if (!isfinite(a->data.f32))
> >  a->data.f32 = 0.0f;
> > @@ -1089,7 +1091,7 @@ ConstantFolding::opnd(Instruction *i, ImmediateValue 
> > , int s)
> >  i->src(0).mod = 0;
> >   i->setSrc(1, NULL);
> >} else
> > -  if (!i->postFactor && (imm0.isInteger(2) || imm0.isInteger(-2))) {
> > +  if (!i->postFactor && !i->dnz && (imm0.isInteger(2) || 
> > imm0.isInteger(-2))) {
> >   if (imm0.isNegative())
> >  i->src(t).mod = i->src(t).mod ^ Modifier(NV50_IR_MOD_NEG);
> >   i->op = OP_ADD;
> > @@ -1120,7 +1122,7 @@ ConstantFolding::opnd(Instruction *i, ImmediateValue 
> > , int s)
> >   if (i->op != OP_CVT)
> >  i->src(0).mod = 0;
> >} else
> > -  if (i->subOp != NV50_IR_SUBOP_MUL_HIGH &&
> > +  if (i->subOp != NV50_IR_SUBOP_MUL_HIGH && !i->dnz &&
> >(imm0.isInteger(1) || imm0.isInteger(-1))) {
> >   if (imm0.isNegative())
> >  i->src(t).mod = i->src(t).mod ^ Modifier(NV50_IR_MOD_NEG);
> > --
> > 2.19.1
> >


Re: [Mesa-dev] [PATCH 1/2] nv50/ir: don't optimize dnz muls to add

2018-11-24 Thread Ilia Mirkin
Can you elaborate as to what the issue is? The dnz flag is set when we
want to make NaN -> Infinity. Do you have a concrete TGSI program that
triggers issues?
On Sat, Nov 24, 2018 at 6:04 PM Karol Herbst  wrote:
>
> fixes asserts with gallium nine
>
> Signed-off-by: Karol Herbst 
> ---
>  src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> index 04d26dcbf53..0a284572ede 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> @@ -557,6 +557,8 @@ ConstantFolding::expr(Instruction *i,
> case OP_MAD:
> case OP_FMA:
> case OP_MUL:
> +  if (i->dnz && i->op != OP_MUL)
> + return;
>if (i->dnz && i->dType == TYPE_F32) {
>   if (!isfinite(a->data.f32))
>  a->data.f32 = 0.0f;
> @@ -1089,7 +1091,7 @@ ConstantFolding::opnd(Instruction *i, ImmediateValue 
> , int s)
>  i->src(0).mod = 0;
>   i->setSrc(1, NULL);
>} else
> -  if (!i->postFactor && (imm0.isInteger(2) || imm0.isInteger(-2))) {
> +  if (!i->postFactor && !i->dnz && (imm0.isInteger(2) || 
> imm0.isInteger(-2))) {
>   if (imm0.isNegative())
>  i->src(t).mod = i->src(t).mod ^ Modifier(NV50_IR_MOD_NEG);
>   i->op = OP_ADD;
> @@ -1120,7 +1122,7 @@ ConstantFolding::opnd(Instruction *i, ImmediateValue 
> , int s)
>   if (i->op != OP_CVT)
>  i->src(0).mod = 0;
>} else
> -  if (i->subOp != NV50_IR_SUBOP_MUL_HIGH &&
> +  if (i->subOp != NV50_IR_SUBOP_MUL_HIGH && !i->dnz &&
>(imm0.isInteger(1) || imm0.isInteger(-1))) {
>   if (imm0.isNegative())
>  i->src(t).mod = i->src(t).mod ^ Modifier(NV50_IR_MOD_NEG);
> --
> 2.19.1
>


Re: [Mesa-dev] [PATCH 2/2] nv50, nvc0: Fix gallium nine regression regarding sampler bindings

2018-11-24 Thread Ilia Mirkin
Using this approach, num_samplers will never go down. Also, this
applies to more than just samplers -- textures, everything else.
On Sat, Nov 24, 2018 at 6:04 PM Karol Herbst  wrote:
>
> The new approach is that samplers don't get unbound even if they won't be used
> in a draw and we should just leave them be as well.
>
> Fixes a regression in multiple Windows games using gallium nine and nouveau.
>
> Fixes: 4d6fab245eec3880e2a59424a579851f44857ce8
>"cso: don't track the number of sampler states bound"
> Signed-off-by: Karol Herbst 
> ---
>  src/gallium/drivers/nouveau/nv50/nv50_state.c | 13 ++---
>  src/gallium/drivers/nouveau/nvc0/nvc0_state.c | 17 +
>  2 files changed, 7 insertions(+), 23 deletions(-)
>
> diff --git a/src/gallium/drivers/nouveau/nv50/nv50_state.c 
> b/src/gallium/drivers/nouveau/nv50/nv50_state.c
> index fb4a259ce16..59437b22c9c 100644
> --- a/src/gallium/drivers/nouveau/nv50/nv50_state.c
> +++ b/src/gallium/drivers/nouveau/nv50/nv50_state.c
> @@ -600,25 +600,16 @@ static inline void
>  nv50_stage_sampler_states_bind(struct nv50_context *nv50, int s,
> unsigned nr, void **hwcso)
>  {
> -   unsigned i;
> -
> assert(nr <= PIPE_MAX_SAMPLERS);
> -   for (i = 0; i < nr; ++i) {
> +   for (unsigned i = 0; i < nr; ++i) {
>struct nv50_tsc_entry *old = nv50->samplers[s][i];
>
>nv50->samplers[s][i] = nv50_tsc_entry(hwcso[i]);
>if (old)
>   nv50_screen_tsc_unlock(nv50->screen, old);
> }
> -   assert(nv50->num_samplers[s] <= PIPE_MAX_SAMPLERS);
> -   for (; i < nv50->num_samplers[s]; ++i) {
> -  if (nv50->samplers[s][i]) {
> - nv50_screen_tsc_unlock(nv50->screen, nv50->samplers[s][i]);
> - nv50->samplers[s][i] = NULL;
> -  }
> -   }
>
> -   nv50->num_samplers[s] = nr;
> +   nv50->num_samplers[s] = MAX2(nv50->num_samplers[s], nr);
>
> nv50->dirty_3d |= NV50_NEW_3D_SAMPLERS;
>  }
> diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c 
> b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
> index f2393cb27b5..12765af8585 100644
> --- a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
> +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
> @@ -449,10 +449,12 @@ nvc0_sampler_state_delete(struct pipe_context *pipe, 
> void *hwcso)
>  {
> unsigned s, i;
>
> -   for (s = 0; s < 6; ++s)
> +   for (s = 0; s < 6; ++s) {
>for (i = 0; i < nvc0_context(pipe)->num_samplers[s]; ++i)
>   if (nvc0_context(pipe)->samplers[s][i] == hwcso)
>  nvc0_context(pipe)->samplers[s][i] = NULL;
> +  nvc0_context(pipe)->num_samplers[s] = 0;
> +   }
>
> nvc0_screen_tsc_free(nvc0_context(pipe)->screen, nv50_tsc_entry(hwcso));
>
> @@ -464,9 +466,7 @@ nvc0_stage_sampler_states_bind(struct nvc0_context *nvc0,
> unsigned s,
> unsigned nr, void **hwcso)
>  {
> -   unsigned i;
> -
> -   for (i = 0; i < nr; ++i) {
> +   for (unsigned i = 0; i < nr; ++i) {
>struct nv50_tsc_entry *old = nvc0->samplers[s][i];
>
>if (hwcso[i] == old)
> @@ -477,14 +477,7 @@ nvc0_stage_sampler_states_bind(struct nvc0_context *nvc0,
>if (old)
>   nvc0_screen_tsc_unlock(nvc0->screen, old);
> }
> -   for (; i < nvc0->num_samplers[s]; ++i) {
> -  if (nvc0->samplers[s][i]) {
> - nvc0_screen_tsc_unlock(nvc0->screen, nvc0->samplers[s][i]);
> - nvc0->samplers[s][i] = NULL;
> -  }
> -   }
> -
> -   nvc0->num_samplers[s] = nr;
> +   nvc0->num_samplers[s] = MAX2(nvc0->num_samplers[s], nr);
>  }
>
>  static void
> --
> 2.19.1
>


[Mesa-dev] [PATCH 1/2] nv50/ir: don't optimize dnz muls to add

2018-11-24 Thread Karol Herbst
fixes asserts with gallium nine

Signed-off-by: Karol Herbst 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index 04d26dcbf53..0a284572ede 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
@@ -557,6 +557,8 @@ ConstantFolding::expr(Instruction *i,
case OP_MAD:
case OP_FMA:
case OP_MUL:
+  if (i->dnz && i->op != OP_MUL)
+ return;
   if (i->dnz && i->dType == TYPE_F32) {
  if (!isfinite(a->data.f32))
 a->data.f32 = 0.0f;
@@ -1089,7 +1091,7 @@ ConstantFolding::opnd(Instruction *i, ImmediateValue &imm0, int s)
 i->src(0).mod = 0;
  i->setSrc(1, NULL);
   } else
-  if (!i->postFactor && (imm0.isInteger(2) || imm0.isInteger(-2))) {
+  if (!i->postFactor && !i->dnz && (imm0.isInteger(2) || imm0.isInteger(-2))) {
  if (imm0.isNegative())
 i->src(t).mod = i->src(t).mod ^ Modifier(NV50_IR_MOD_NEG);
  i->op = OP_ADD;
@@ -1120,7 +1122,7 @@ ConstantFolding::opnd(Instruction *i, ImmediateValue &imm0, int s)
  if (i->op != OP_CVT)
 i->src(0).mod = 0;
   } else
-  if (i->subOp != NV50_IR_SUBOP_MUL_HIGH &&
+  if (i->subOp != NV50_IR_SUBOP_MUL_HIGH && !i->dnz &&
   (imm0.isInteger(1) || imm0.isInteger(-1))) {
  if (imm0.isNegative())
 i->src(t).mod = i->src(t).mod ^ Modifier(NV50_IR_MOD_NEG);
-- 
2.19.1



[Mesa-dev] [PATCH 0/2] gallium nine fixes for nouveau

2018-11-24 Thread Karol Herbst
Patch 1 fixes some compiler asserts I was running into.

Maybe we can just do those optimizations anyway and simply drop the dnz flag
on the ADD, as long as the instructions aren't marked as being precise.

Patch 2 tries to fix our outstanding issue with bound samplers with nine.

I don't really know if this is the correct fix for it, but it makes sense to me
after reading the commit message of the gallium change. There are no piglit
regressions, and games are working again using gallium nine and nouveau. Maybe
we should drop that entire num_samplers handling altogether?

Karol Herbst (2):
  nv50/ir: don't optimize dnz muls to add
  nv50,nvc0: Fix gallium nine regression regarding sampler bindings

 .../nouveau/codegen/nv50_ir_peephole.cpp|  6 --
 src/gallium/drivers/nouveau/nv50/nv50_state.c   | 13 ++---
 src/gallium/drivers/nouveau/nvc0/nvc0_state.c   | 17 +
 3 files changed, 11 insertions(+), 25 deletions(-)

-- 
2.19.1



[Mesa-dev] [PATCH 2/2] nv50, nvc0: Fix gallium nine regression regarding sampler bindings

2018-11-24 Thread Karol Herbst
The new approach is that samplers don't get unbound even if they won't be used
in a draw and we should just leave them be as well.

Fixes a regression in multiple Windows games using gallium nine and nouveau.

Fixes: 4d6fab245eec3880e2a59424a579851f44857ce8
   "cso: don't track the number of sampler states bound"
Signed-off-by: Karol Herbst 
---
 src/gallium/drivers/nouveau/nv50/nv50_state.c | 13 ++---
 src/gallium/drivers/nouveau/nvc0/nvc0_state.c | 17 +
 2 files changed, 7 insertions(+), 23 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nv50/nv50_state.c b/src/gallium/drivers/nouveau/nv50/nv50_state.c
index fb4a259ce16..59437b22c9c 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_state.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_state.c
@@ -600,25 +600,16 @@ static inline void
 nv50_stage_sampler_states_bind(struct nv50_context *nv50, int s,
unsigned nr, void **hwcso)
 {
-   unsigned i;
-
assert(nr <= PIPE_MAX_SAMPLERS);
-   for (i = 0; i < nr; ++i) {
+   for (unsigned i = 0; i < nr; ++i) {
   struct nv50_tsc_entry *old = nv50->samplers[s][i];
 
   nv50->samplers[s][i] = nv50_tsc_entry(hwcso[i]);
   if (old)
  nv50_screen_tsc_unlock(nv50->screen, old);
}
-   assert(nv50->num_samplers[s] <= PIPE_MAX_SAMPLERS);
-   for (; i < nv50->num_samplers[s]; ++i) {
-  if (nv50->samplers[s][i]) {
- nv50_screen_tsc_unlock(nv50->screen, nv50->samplers[s][i]);
- nv50->samplers[s][i] = NULL;
-  }
-   }
 
-   nv50->num_samplers[s] = nr;
+   nv50->num_samplers[s] = MAX2(nv50->num_samplers[s], nr);
 
nv50->dirty_3d |= NV50_NEW_3D_SAMPLERS;
 }
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
index f2393cb27b5..12765af8585 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
@@ -449,10 +449,12 @@ nvc0_sampler_state_delete(struct pipe_context *pipe, void *hwcso)
 {
unsigned s, i;
 
-   for (s = 0; s < 6; ++s)
+   for (s = 0; s < 6; ++s) {
   for (i = 0; i < nvc0_context(pipe)->num_samplers[s]; ++i)
  if (nvc0_context(pipe)->samplers[s][i] == hwcso)
 nvc0_context(pipe)->samplers[s][i] = NULL;
+  nvc0_context(pipe)->num_samplers[s] = 0;
+   }
 
nvc0_screen_tsc_free(nvc0_context(pipe)->screen, nv50_tsc_entry(hwcso));
 
@@ -464,9 +466,7 @@ nvc0_stage_sampler_states_bind(struct nvc0_context *nvc0,
unsigned s,
unsigned nr, void **hwcso)
 {
-   unsigned i;
-
-   for (i = 0; i < nr; ++i) {
+   for (unsigned i = 0; i < nr; ++i) {
   struct nv50_tsc_entry *old = nvc0->samplers[s][i];
 
   if (hwcso[i] == old)
@@ -477,14 +477,7 @@ nvc0_stage_sampler_states_bind(struct nvc0_context *nvc0,
   if (old)
  nvc0_screen_tsc_unlock(nvc0->screen, old);
}
-   for (; i < nvc0->num_samplers[s]; ++i) {
-  if (nvc0->samplers[s][i]) {
- nvc0_screen_tsc_unlock(nvc0->screen, nvc0->samplers[s][i]);
- nvc0->samplers[s][i] = NULL;
-  }
-   }
-
-   nvc0->num_samplers[s] = nr;
+   nvc0->num_samplers[s] = MAX2(nvc0->num_samplers[s], nr);
 }
 
 static void
-- 
2.19.1



[Mesa-dev] [Bug 108245] RADV/Vega: Low mip levels of large BCn textures get corrupted by vkCmdCopyBufferToImage

2018-11-24 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108245

--- Comment #6 from Bas Nieuwenhuizen  ---
https://patchwork.freedesktop.org/patch/263716/ fixes the testcase (and
hopefully does not regress anything else)

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


[Mesa-dev] [PATCH] radv: Clamp gfx9 image view extents to the allocated image extents.

2018-11-24 Thread Bas Nieuwenhuizen
This mirrors AMDVLK. It looks like going over the alignment of the height
actually starts to change the addressing. The extra miplevels seem to
work with this change.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108245
Fixes: f6cc15dccd5 "radv/gfx9: fix block compression texture views. (v2)"
---
 src/amd/vulkan/radv_image.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/src/amd/vulkan/radv_image.c b/src/amd/vulkan/radv_image.c
index 7492bf48b51..ba8e28f0e23 100644
--- a/src/amd/vulkan/radv_image.c
+++ b/src/amd/vulkan/radv_image.c
@@ -1175,8 +1175,6 @@ radv_image_view_init(struct radv_image_view *iview,
 if (device->physical_device->rad_info.chip_class >= GFX9 &&
 vk_format_is_compressed(image->vk_format) &&
 !vk_format_is_compressed(iview->vk_format)) {
-unsigned rounded_img_w = util_next_power_of_two(iview->extent.width);
-unsigned rounded_img_h = util_next_power_of_two(iview->extent.height);
 unsigned lvl_width  = radv_minify(image->info.width , range->baseMipLevel);
 unsigned lvl_height = radv_minify(image->info.height, range->baseMipLevel);
 
@@ -1186,8 +1184,8 @@ radv_image_view_init(struct radv_image_view *iview,
 lvl_width <<= range->baseMipLevel;
 lvl_height <<= range->baseMipLevel;
 
-iview->extent.width = CLAMP(lvl_width, iview->extent.width, rounded_img_w);
-iview->extent.height = CLAMP(lvl_height, iview->extent.height, rounded_img_h);
+iview->extent.width = CLAMP(lvl_width, iview->extent.width, iview->image->surface.u.gfx9.surf_pitch);
+iview->extent.height = CLAMP(lvl_height, iview->extent.height, iview->image->surface.u.gfx9.surf_height);
 }
}
 
-- 
2.19.1
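
The effect of changing the upper bound of the clamp can be illustrated in
isolation. This is a simplified sketch under assumptions: next_pow2 stands in
for util_next_power_of_two, clamp_old and clamp_new are hypothetical helpers,
and the example numbers are made up rather than taken from a real surface.

```c
#include <assert.h>

#define CLAMP(x, lo, hi) ((x) > (lo) ? ((x) > (hi) ? (hi) : (x)) : (lo))

/* Smallest power of two >= x, for 0 < x <= 2^31. */
unsigned next_pow2(unsigned x)
{
   unsigned p = 1;

   assert(x > 0 && x <= (1u << 31));
   while (p < x)
      p <<= 1;
   return p;
}

/* Old upper bound: the next power of two of the view extent. */
unsigned clamp_old(unsigned lvl, unsigned view_extent)
{
   return CLAMP(lvl, view_extent, next_pow2(view_extent));
}

/* New upper bound: the extent the surface actually allocated
 * (assumed here to be at least as large as the view extent). */
unsigned clamp_new(unsigned lvl, unsigned view_extent, unsigned allocated)
{
   assert(allocated >= view_extent);
   return CLAMP(lvl, view_extent, allocated);
}
```

With a view extent of 100, a computed level extent of 120 and an allocation
of 104, the old bound lets 120 through (it is below 128), while the new bound
limits the result to 104.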



[Mesa-dev] [PATCH] radv: Fix opaque metadata descriptor last layer.

2018-11-24 Thread Bas Nieuwenhuizen
We used the layer count, which results in an off-by-one error.

Not sure this really affects anything.

Fixes: f4e499ec791 "radv: add initial non-conformant radv vulkan driver"
---
 src/amd/vulkan/radv_image.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/amd/vulkan/radv_image.c b/src/amd/vulkan/radv_image.c
index 7492bf48b51..f0b1d31c5bd 100644
--- a/src/amd/vulkan/radv_image.c
+++ b/src/amd/vulkan/radv_image.c
@@ -691,7 +691,7 @@ radv_query_opaque_metadata(struct radv_device *device,
si_make_texture_descriptor(device, image, false,
   (VkImageViewType)image->type, image->vk_format,
   , 0, image->info.levels - 1, 0,
-  image->info.array_size,
+  image->info.array_size - 1,
   image->info.width, image->info.height,
   image->info.depth,
   desc, NULL);
-- 
2.19.1
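
The off-by-one can be stated as a small sketch, assuming (as the patch
subject implies) that the descriptor takes an inclusive last-layer index
rather than a count. Both helper names are hypothetical.

```c
#include <assert.h>

/* Index of the last layer of an image with array_size layers. */
unsigned last_layer_index(unsigned array_size)
{
   assert(array_size > 0);
   return array_size - 1;
}

/* Number of layers covered by an inclusive [first, last] range. */
unsigned layers_in_range(unsigned first, unsigned last)
{
   assert(last >= first);
   return last - first + 1;
}
```

Passing the count itself as the last index would describe one layer more
than the image has.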



[Mesa-dev] [Bug 108848] 3d image broken in Dragon age: origins

2018-11-24 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108848

--- Comment #9 from Axel Davy  ---
You can also try to reduce the memory footprint of pulseaudio, see the trick
described here:
https://www.winehq.org/pipermail/wine-devel/2018-November/134954.html

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


[Mesa-dev] [Bug 108853] OSMesaGetDepthBuffer flipped vertically

2018-11-24 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108853

Bug ID: 108853
   Summary: OSMesaGetDepthBuffer flipped vertically
   Product: Mesa
   Version: 18.2
  Hardware: All
OS: All
Status: NEW
  Severity: normal
  Priority: medium
 Component: Drivers/OSMesa
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: popi...@basilisk.fr
QA Contact: mesa-dev@lists.freedesktop.org

When installing OSMesa using

./configure --prefix=$install_dir \
    --enable-opengl --disable-gles1 --disable-gles2 \
    --disable-va --disable-xvmc --disable-vdpau \
    --enable-shared-glapi \
    --disable-texture-float \
    --with-gallium-drivers=swrast \
    --disable-dri --with-dri-drivers= \
    --disable-egl --with-platforms= --disable-gbm \
    --enable-glx --with-platforms=x11 \
    --disable-osmesa --enable-gallium-osmesa

i.e. using the gallium osmesa implementation, the depth buffer returned by
OSMesaGetDepthBuffer is flipped vertically relative to the OSMesa framebuffer.

Using the non-gallium osmesa implementation returns a depth buffer correctly
aligned with the framebuffer.

The gallium-returned depth buffer can be fixed using something like:

fbdepth_t * framebuffer_depth (framebuffer * p)
{
  unsigned int * depth;
  GLint width, height, bytesPerValue;
  OSMesaGetDepthBuffer (p->ctx, &width, &height, &bytesPerValue,
                        (void **)&depth);
  assert (p->width == width && p->height == height && bytesPerValue == 4);
  assert (sizeof(fbdepth_t) == bytesPerValue);
#if GALLIUM
  // fix for bug in gallium/libosmesa
  // the depth buffer is flipped vertically
  for (GLint j = 0; j < height/2; j++)
    for (GLint i = 0; i < width; i++) {
      unsigned int tmp = depth[j*width + i];
      depth[j*width + i] = depth[(height - 1 - j)*width + i];
      depth[(height - 1 - j)*width + i] = tmp;
    }
#endif // GALLIUM
  return depth;
}
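
The same workaround can be written independently of OSMesa by swapping whole
rows. This is a sketch, assuming a tightly packed buffer of 32-bit values;
flip_rows_u32 and its caller-provided scratch row are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Flip a tightly packed width x height image of 32-bit values
 * vertically, in place, swapping whole rows through a scratch row. */
void flip_rows_u32(uint32_t *pix, size_t width, size_t height,
                   uint32_t *scratch /* at least width elements */)
{
   const size_t row_bytes = width * sizeof(uint32_t);

   assert(pix != NULL && scratch != NULL);
   for (size_t j = 0; j < height / 2; j++) {
      uint32_t *top = pix + j * width;
      uint32_t *bottom = pix + (height - 1 - j) * width;

      memcpy(scratch, top, row_bytes);
      memcpy(top, bottom, row_bytes);
      memcpy(bottom, scratch, row_bytes);
   }
}
```

For an odd height the middle row is left untouched, as in the per-element
loop above.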

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


[Mesa-dev] [Bug 108848] 3d image broken in Dragon age: origins

2018-11-24 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108848

--- Comment #8 from Andre Heider  ---
Game works for me with nine, but only after setting LARGE_ADDRESS_AWARE
manually on DAOrigins.exe. (It seems to work without on low texture details,
but I guess it'll crash at some point eventually).
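
For reference, the flag in question is the IMAGE_FILE_LARGE_ADDRESS_AWARE bit
(0x0020) in the Characteristics field of the PE/COFF file header. The sketch
below sets it on an in-memory copy of an executable. The function names are
hypothetical, this is not the tool used above, and it should only ever be
tried on a backup copy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PE_LARGE_ADDRESS_AWARE 0x20u   /* IMAGE_FILE_LARGE_ADDRESS_AWARE */

/* Read a little-endian 32-bit value. */
static uint32_t rd32(const uint8_t *p)
{
   return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
          ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Offset of the 16-bit Characteristics field, or 0 if the image does
 * not look like a PE file. */
size_t pe_characteristics_offset(const uint8_t *img, size_t len)
{
   size_t pe;

   if (len < 0x40 || img[0] != 'M' || img[1] != 'Z')
      return 0;
   pe = rd32(img + 0x3c);            /* e_lfanew in the DOS header */
   if (pe > len || len - pe < 24)    /* signature (4) + COFF header (20) */
      return 0;
   if (memcmp(img + pe, "PE\0\0", 4) != 0)
      return 0;
   return pe + 4 + 18;               /* Characteristics is at COFF + 18 */
}

bool pe_is_large_address_aware(const uint8_t *img, size_t len)
{
   size_t off = pe_characteristics_offset(img, len);

   return off != 0 && (img[off] & PE_LARGE_ADDRESS_AWARE) != 0;
}

/* Sets the bit in the low byte of the little-endian field. */
bool pe_set_large_address_aware(uint8_t *img, size_t len)
{
   size_t off = pe_characteristics_offset(img, len);

   if (off == 0)
      return false;
   img[off] |= PE_LARGE_ADDRESS_AWARE;
   return true;
}
```
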

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


[Mesa-dev] [Bug 108848] 3d image broken in Dragon age: origins

2018-11-24 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108848

--- Comment #7 from Axel Davy  ---
After looking further, the crash observed seems to be NineVolumeTexture9
missing checks to properly exit on memory allocation failures.
I'll send a fix for this, but it won't help the game work. I suspect the
problem is, as suggested, that not enough memory is available.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


[Mesa-dev] [Bug 108848] 3d image broken in Dragon age: origins

2018-11-24 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108848

--- Comment #6 from Axel Davy  ---
Is the game 32-bit?

One thing you may want to try is to use a tool to make the exe large address
aware.
Possibly you run out of virtual address space (wine restricts the available
space for executables that are not large address aware, and some Linux libs
take more space than on Windows).

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


[Mesa-dev] [Bug 108848] 3d image broken in Dragon age: origins

2018-11-24 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=108848

--- Comment #5 from Axel Davy  ---
Mesa needs to be built with --enable-debug to have NINE_DEBUG=all do anything.

I can see though that two mmaps are failing on the log with nine.

Getting a new log with NINE_DEBUG=all would help find what causes the failed
mmap, and what arguments were used for the crashing call.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


Re: [Mesa-dev] [PATCH 0/7] winsys/amdgpu: slab allocators and tweaks for address translation

2018-11-24 Thread Christian König

Patch #5 and #6 are Reviewed-by: Christian König 

For patch #7 I think we really need some testing to see whether it gives us
an improvement. As you noted as well, it is rather unlikely that we have
buffers which are slightly smaller than a power of two.


Christian.

On 24.11.18 at 00:40, Marek Olšák wrote:

Hi,

This series changes the slab allocation to 3 slab allocators layered
on top of each other, and increases the max slab entry size to 256 KB
and the max slab size to 2 MB.

There are also tweaks for faster address translation, though we don't
know whether it helps anything.

Please review.

Thanks,
Marek
