[Mesa-dev] [PATCH] nv50: Fix broken assertion

2012-02-07 Thread Daniel Stone
The assertion added in f09910f3 broke nv50 completely by asserting that
the number of elements in a dereferenced pointer (i.e. 1) was greater
than i (which ranged up to six), rather than checking the number of
elements in the containing array.

Signed-off-by: Daniel Stone dan...@fooishbar.org
---
 src/gallium/drivers/nv50/nv50_pc_regalloc.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/src/gallium/drivers/nv50/nv50_pc_regalloc.c 
b/src/gallium/drivers/nv50/nv50_pc_regalloc.c
index 72922cb..12a59cb 100644
--- a/src/gallium/drivers/nv50/nv50_pc_regalloc.c
+++ b/src/gallium/drivers/nv50/nv50_pc_regalloc.c
@@ -421,7 +421,7 @@ phi_opnd_for_bb(struct nv_instruction *phi, struct 
nv_basic_block *b,
int i, j;
 
for (j = -1, i = 0; i  6  phi-src[i]; ++i) {
-  assert(i  Elements(phi-src[i]));
+  assert(i  Elements(phi-src));
   srci = phi-src[i];
   /* if already replaced, check with original source first */
   if (srci-flags  NV_REF_FLAG_REGALLOC_PRIV)
-- 
1.7.9

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] nv50: Fix broken assertion

2012-02-07 Thread Jose Fonseca
Makes sense.

Jose

- Original Message -
 The assertion added in f09910f3 broke nv50 completely by asserting
 that
 the number of elements in a dereferenced pointer (i.e. 1) was greater
 than i (which ranged up to six), rather than checking the number of
 elements in the containing array.
 
 Signed-off-by: Daniel Stone dan...@fooishbar.org
 ---
  src/gallium/drivers/nv50/nv50_pc_regalloc.c |2 +-
  1 files changed, 1 insertions(+), 1 deletions(-)
 
 diff --git a/src/gallium/drivers/nv50/nv50_pc_regalloc.c
 b/src/gallium/drivers/nv50/nv50_pc_regalloc.c
 index 72922cb..12a59cb 100644
 --- a/src/gallium/drivers/nv50/nv50_pc_regalloc.c
 +++ b/src/gallium/drivers/nv50/nv50_pc_regalloc.c
 @@ -421,7 +421,7 @@ phi_opnd_for_bb(struct nv_instruction *phi,
 struct nv_basic_block *b,
 int i, j;
  
 for (j = -1, i = 0; i  6  phi-src[i]; ++i) {
 -  assert(i  Elements(phi-src[i]));
 +  assert(i  Elements(phi-src));
srci = phi-src[i];
/* if already replaced, check with original source first */
if (srci-flags  NV_REF_FLAG_REGALLOC_PRIV)
 --
 1.7.9
 
 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev
 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] nv50: Fix broken assertion

2012-02-07 Thread Christoph Bumiller
On 07.02.2012 13:47, Jose Fonseca wrote:
 Makes sense.

Very much so ...
http://cgit.freedesktop.org/mesa/mesa/commit/?id=189e6c7e81ce35b89d9b52d4bd0d6271a7e9c10f
(of 26 hours ago).


 Jose

 - Original Message -
 The assertion added in f09910f3 broke nv50 completely by asserting
 that
 the number of elements in a dereferenced pointer (i.e. 1) was greater
 than i (which ranged up to six), rather than checking the number of
 elements in the containing array.

 Signed-off-by: Daniel Stone dan...@fooishbar.org
 ---
  src/gallium/drivers/nv50/nv50_pc_regalloc.c |2 +-
  1 files changed, 1 insertions(+), 1 deletions(-)

 diff --git a/src/gallium/drivers/nv50/nv50_pc_regalloc.c
 b/src/gallium/drivers/nv50/nv50_pc_regalloc.c
 index 72922cb..12a59cb 100644
 --- a/src/gallium/drivers/nv50/nv50_pc_regalloc.c
 +++ b/src/gallium/drivers/nv50/nv50_pc_regalloc.c
 @@ -421,7 +421,7 @@ phi_opnd_for_bb(struct nv_instruction *phi,
 struct nv_basic_block *b,
 int i, j;
  
 for (j = -1, i = 0; i  6  phi-src[i]; ++i) {
 -  assert(i  Elements(phi-src[i]));
 +  assert(i  Elements(phi-src));
srci = phi-src[i];
/* if already replaced, check with original source first */
if (srci-flags  NV_REF_FLAG_REGALLOC_PRIV)
 --
 1.7.9

 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev

 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 45277] [bisected] Shading not working properly in Heroes of Newerth

2012-02-07 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=45277

--- Comment #2 from Damien Grassart dam...@grassart.com 2012-02-07 05:11:44 
PST ---
I've also found that removing PIPE_CAP_VERTEX_COLOR_UNCLAMPED as a supported
feature for r600 does fix this bug (src/gallium/drivers/r600/r600_pipe.c:369).
Is the statement in the commit message that r600 can do unclamped vertex colors
perhaps false?

-- 
Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] nv50: Fix broken assertion

2012-02-07 Thread Daniel Stone
On 7 February 2012 13:03, Christoph Bumiller
e0425...@student.tuwien.ac.at wrote:
 On 07.02.2012 13:47, Jose Fonseca wrote:
 Makes sense.

 Very much so ...
 http://cgit.freedesktop.org/mesa/mesa/commit/?id=189e6c7e81ce35b89d9b52d4bd0d6271a7e9c10f
 (of 26 hours ago).

Ha, snap.  Thanks anyway. :)

(Doesn't matter too much in any case, as gnome-shell still locks my
machine solid in under a minute on this particular chip ...)

Cheers,
Daniel
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 5/5] gallivm: enable stores of integer types.

2012-02-07 Thread Dave Airlie
On Mon, Feb 6, 2012 at 9:01 PM, Jose Fonseca jfons...@vmware.com wrote:
 - Original Message -
 On Mon, Feb 6, 2012 at 7:47 PM, Jose Fonseca jfons...@vmware.com
 wrote:
  Dave,
 
  I really see no point of inferring anything when translating MOVs.
  The src/dst value will need to be converted from/to a floating
  point anyway so this is unnecessary complexity AFAICS. If you
  believe this is really necessary please provide a concrete
  example.

 The problem is MOV from immediate, since immediates are stored as the
 correct internal type, so we don't get to infer the src in that case,
 then we try to MOV int4* to float4* and llvm crashes.

 I see.

 In that case I'd suggest you to:

 - in lp_emit_immediate_soa() bitcast _all_ integer immediates to floats 
 (i.e., bld-immediates[] is always of llvm type float, just like 
 bld-temps[], and bld-outputs[], etc.).

 - in emit_fetch_immediate() bitcast back to integer when stype == INT or UINT

I've started looking at this, but it means we have to change
immediates from being constants to being something we can bitcast. At
the moment the immediate fetch just fetches one immediate value from a
vector, not a pointer to an immediate value, making this pointer based
just seems like an unnecessary step and seems to be like it will get
in the way of optimisations that can work on constant values.

Dave.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] gallium/gbm: r300 and r600 now depend on libdrm

2012-02-07 Thread Tobias Droste
fixes undefined references in libradeonwinsys.a when linking

Signed-off-by: Tobias Droste tdro...@gmx.de
---
 src/gallium/targets/gbm/Makefile |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/src/gallium/targets/gbm/Makefile b/src/gallium/targets/gbm/Makefile
index ce56f93..2737b79 100644
--- a/src/gallium/targets/gbm/Makefile
+++ b/src/gallium/targets/gbm/Makefile
@@ -72,11 +72,13 @@ nouveau_SYS = -ldrm_nouveau
 r300_LIBS = \
$(TOP)/src/gallium/winsys/radeon/drm/libradeonwinsys.a \
$(TOP)/src/gallium/drivers/r300/libr300.a
+r300_SYS += -ldrm_radeon
 
 # r600 pipe driver
 r600_LIBS = \
$(TOP)/src/gallium/winsys/radeon/drm/libradeonwinsys.a \
$(TOP)/src/gallium/drivers/r600/libr600.a
+r600_SYS += -ldrm_radeon
 
 # vmwgfx pipe driver
 vmwgfx_LIBS = \
-- 
1.7.7

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 5/5] gallivm: enable stores of integer types.

2012-02-07 Thread Jose Fonseca


- Original Message -
 On Mon, Feb 6, 2012 at 9:01 PM, Jose Fonseca jfons...@vmware.com
 wrote:
  - Original Message -
  On Mon, Feb 6, 2012 at 7:47 PM, Jose Fonseca jfons...@vmware.com
  wrote:
   Dave,
  
   I really see no point of inferring anything when translating
   MOVs.
   The src/dst value will need to be converted from/to a floating
   point anyway so this is unnecessary complexity AFAICS. If you
   believe this is really necessary please provide a concrete
   example.
 
  The problem is MOV from immediate, since immediates are stored as
  the
  correct internal type, so we don't get to infer the src in that
  case,
  then we try to MOV int4* to float4* and llvm crashes.
 
  I see.
 
  In that case I'd suggest you to:
 
  - in lp_emit_immediate_soa() bitcast _all_ integer immediates to
  floats (i.e., bld-immediates[] is always of llvm type float, just
  like bld-temps[], and bld-outputs[], etc.).
 
  - in emit_fetch_immediate() bitcast back to integer when stype ==
  INT or UINT
 
 I've started looking at this, but it means we have to change
 immediates from being constants to being something we can bitcast. At
 the moment the immediate fetch just fetches one immediate value from
 a
 vector, not a pointer to an immediate value, making this pointer
 based
 just seems like an unnecessary step and seems to be like it will get
 in the way of optimisations that can work on constant values.

If you use LLVMConstBitCast() the result is a still a constant.  Nothing is 
lost.

Jose
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] st/mesa: avoid vertex texture update for 0 case

2012-02-07 Thread Dave Airlie
From: Dave Airlie airl...@redhat.com

If we had no vertex textures previously and we have none now,
don't bother doing the enables dance.

Signed-off-by: Dave Airlie airl...@redhat.com
---
 src/mesa/state_tracker/st_atom_texture.c |3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/src/mesa/state_tracker/st_atom_texture.c 
b/src/mesa/state_tracker/st_atom_texture.c
index 88e6128..1b82283 100644
--- a/src/mesa/state_tracker/st_atom_texture.c
+++ b/src/mesa/state_tracker/st_atom_texture.c
@@ -260,6 +260,9 @@ update_vertex_textures(struct st_context *st)
struct gl_vertex_program *vprog = ctx-VertexProgram._Current;
GLuint su;
 
+   if (!vprog-Base.SamplersUsed  st-state.num_vertex_textures == 0)
+   return;
+
st-state.num_vertex_textures = 0;
 
/* loop over sampler units (aka tex image units) */
-- 
1.7.6.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] gallium: add PIPE_CAP_QUADS_FOLLOW_PROVOKING_VERTEX_CONVENTION

2012-02-07 Thread Jose Fonseca
I'm not familiar enough with draw internals to tell whether your changes are 
enough or not.

But I'm OK with this change as is, given that most drivers will not advertise 
PIPE_CAP_QUADS_FOLLOW_PROVOKING_VERTEX_CONVENTION, so only those that do 
advertise need to be tested to ensure that things are working correctly.

Is there a piglit test for QuadsFollowProvokingVertexConvention? It's probably 
a good idea to have one piglit test that tests it and draw clipped primitives 
w/ different fill modes.

Jose

- Original Message -
 Just let the hardware do it if it can and avoid drivers having to
 check for the special case on each draw call.
 
 v2: update the draw module
 ---
  src/gallium/auxiliary/draw/draw_context.c   |7 -
  src/gallium/auxiliary/draw/draw_decompose_tmp.h |   26
  +++---
  src/gallium/auxiliary/draw/draw_gs_tmp.h|1 +
  src/gallium/auxiliary/draw/draw_private.h   |2 +
  src/gallium/auxiliary/draw/draw_pt_decompose.h  |2 +
  src/gallium/auxiliary/draw/draw_so_emit_tmp.h   |1 +
  src/gallium/docs/source/cso/rasterizer.rst  |7 +++--
  src/gallium/docs/source/screen.rst  |2 +
  src/gallium/drivers/i915/i915_screen.c  |1 +
  src/gallium/drivers/llvmpipe/lp_screen.c|2 +
  src/gallium/drivers/nv50/nv50_screen.c  |1 +
  src/gallium/drivers/nvc0/nvc0_screen.c  |1 +
  src/gallium/drivers/nvfx/nvfx_screen.c  |1 +
  src/gallium/drivers/r300/r300_screen.c  |1 +
  src/gallium/drivers/r600/r600_pipe.c|1 +
  src/gallium/drivers/softpipe/sp_screen.c|2 +
  src/gallium/drivers/svga/svga_screen.c  |3 ++
  src/gallium/include/pipe/p_defines.h|3 +-
  src/mesa/state_tracker/st_extensions.c  |4 +-
  19 files changed, 52 insertions(+), 16 deletions(-)
 
 diff --git a/src/gallium/auxiliary/draw/draw_context.c
 b/src/gallium/auxiliary/draw/draw_context.c
 index 3c0b1aa..ad49ce7 100644
 --- a/src/gallium/auxiliary/draw/draw_context.c
 +++ b/src/gallium/auxiliary/draw/draw_context.c
 @@ -93,11 +93,11 @@ draw_create_context(struct pipe_context *pipe,
 boolean try_llvm,
 }
  #endif
  
 +   draw-pipe = pipe;
 +
 if (!draw_init(draw))
goto err_destroy;
  
 -   draw-pipe = pipe;
 -
 return draw;
  
  err_destroy:
 @@ -168,6 +168,9 @@ boolean draw_init(struct draw_context *draw)
 if (!draw_gs_init( draw ))
return FALSE;
  
 +   draw-quads_always_flatshade_last =
 !draw-pipe-screen-get_param(
 +  draw-pipe-screen,
 PIPE_CAP_QUADS_FOLLOW_PROVOKING_VERTEX_CONVENTION);
 +
 return TRUE;
  }
  
 diff --git a/src/gallium/auxiliary/draw/draw_decompose_tmp.h
 b/src/gallium/auxiliary/draw/draw_decompose_tmp.h
 index a142563..ee6877d 100644
 --- a/src/gallium/auxiliary/draw/draw_decompose_tmp.h
 +++ b/src/gallium/auxiliary/draw/draw_decompose_tmp.h
 @@ -193,13 +193,18 @@ FUNC(FUNC_VARS)
  flags = DRAW_PIPE_RESET_STIPPLE |
  DRAW_PIPE_EDGE_FLAG_0 |
  DRAW_PIPE_EDGE_FLAG_1;
 -/* XXX should always emit idx[0] first */
 -/* always emit idx[3] first */
 -TRIANGLE(flags, idx[3], idx[0], idx[1]);
 +/* always emit idx[3] / idx[0] first */
 +if (quads_flatshade_last)
 +   TRIANGLE(flags, idx[3], idx[0], idx[1]);
 +else
 +   TRIANGLE(flags, idx[0], idx[1], idx[2]);
  
  flags = DRAW_PIPE_EDGE_FLAG_1 |
  DRAW_PIPE_EDGE_FLAG_2;
 -TRIANGLE(flags, idx[3], idx[1], idx[2]);
 +if (quads_flatshade_last)
 +   TRIANGLE(flags, idx[3], idx[1], idx[2]);
 +else
 +   TRIANGLE(flags, idx[0], idx[2], idx[3]);
   }
}
break;
 @@ -237,13 +242,18 @@ FUNC(FUNC_VARS)
 flags = DRAW_PIPE_RESET_STIPPLE |
 DRAW_PIPE_EDGE_FLAG_0 |
 DRAW_PIPE_EDGE_FLAG_1;
 -   /* XXX should always emit idx[0] first */
 -   /* always emit idx[3] first */
 -   TRIANGLE(flags, idx[3], idx[2], idx[0]);
 +   /* always emit idx[3] / idx[0 first */
 +   if (quads_flatshade_last)
 +  TRIANGLE(flags, idx[3], idx[2], idx[0]);
 +   else
 +  TRIANGLE(flags, idx[0], idx[3], idx[2]);
  
 flags = DRAW_PIPE_EDGE_FLAG_1 |
 DRAW_PIPE_EDGE_FLAG_2;
 -   TRIANGLE(flags, idx[3], idx[0], idx[1]);
 +   if (quads_flatshade_last)
 +  TRIANGLE(flags, idx[3], idx[0], idx[1]);
 +   else
 +  TRIANGLE(flags, idx[0], idx[1], idx[3]);
  }
   }
}
 diff --git a/src/gallium/auxiliary/draw/draw_gs_tmp.h
 b/src/gallium/auxiliary/draw/draw_gs_tmp.h
 index de7b026..92b6fb7 100644
 --- 

Re: [Mesa-dev] [PATCH] st/mesa: avoid vertex texture update for 0 case

2012-02-07 Thread Jakob Bornecrantz
- Original Message -
 From: Dave Airlie airl...@redhat.com
 
 If we had no vertex textures previously and we have none now,
 don't bother doing the enables dance.
 
 Signed-off-by: Dave Airlie airl...@redhat.com
 ---
  src/mesa/state_tracker/st_atom_texture.c |3 +++
  1 files changed, 3 insertions(+), 0 deletions(-)

Nice find. Does this help performance? Can't we do the same for
fragment or does it just cost more to check (when this is rarely
the case) then just doing the rest of the function?

Cheers, Jakob.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] st/mesa: avoid vertex texture update for 0 case

2012-02-07 Thread Dave Airlie
On Tue, Feb 7, 2012 at 4:29 PM, Jakob Bornecrantz ja...@vmware.com wrote:
 - Original Message -
 From: Dave Airlie airl...@redhat.com

 If we had no vertex textures previously and we have none now,
 don't bother doing the enables dance.

 Signed-off-by: Dave Airlie airl...@redhat.com
 ---
  src/mesa/state_tracker/st_atom_texture.c |    3 +++
  1 files changed, 3 insertions(+), 0 deletions(-)

 Nice find. Does this help performance? Can't we do the same for
 fragment or does it just cost more to check (when this is rarely
 the case) then just doing the rest of the function?

Well I think we generally have a fragment texture so it doesn't save much there,

but I was just playing with noop and nexuiz and saw 0.46 CPU in vertex
texture updates when I knew I had none.

I think we could improve the frag texture updates a fair bit, but I'd
have to a bit more testing on it.

Dave,
(oh I'll fix whitespace in that before pushing).
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] gallium: add PIPE_CAP_QUADS_FOLLOW_PROVOKING_VERTEX_CONVENTION

2012-02-07 Thread Christoph Bumiller
On 02/07/2012 05:24 PM, Jose Fonseca wrote:
 I'm not familiar enough with draw internals to tell whether your changes are 
 enough or not.
 
 But I'm OK with this change as is, given that most drivers will not advertise 
 PIPE_CAP_QUADS_FOLLOW_PROVOKING_VERTEX_CONVENTION, so only those that do 
 advertise need to be tested to ensure that things are working correctly.
 
 Is there a piglit test for QuadsFollowProvokingVertexConvention? It's 
 probably a good idea to have one piglit test that tests it and draw clipped 
 primitives w/ different fill modes.
 

glean's clipFlat tests this, with this patch nv50 and nvc0 will finally
pass.

I tested drawing a clipped quad with softpipe with both the cap enabled
and disabled and it looks like the draw change works.

So, I'll push it soon, then - r600 will get to profit from the cap too,
as Marek already mentioned.

 Jose
 
 - Original Message -
 Just let the hardware do it if it can and avoid drivers having to
 check for the special case on each draw call.

 v2: update the draw module
 ---
  src/gallium/auxiliary/draw/draw_context.c   |7 -
  src/gallium/auxiliary/draw/draw_decompose_tmp.h |   26
  +++---
  src/gallium/auxiliary/draw/draw_gs_tmp.h|1 +
  src/gallium/auxiliary/draw/draw_private.h   |2 +
  src/gallium/auxiliary/draw/draw_pt_decompose.h  |2 +
  src/gallium/auxiliary/draw/draw_so_emit_tmp.h   |1 +
  src/gallium/docs/source/cso/rasterizer.rst  |7 +++--
  src/gallium/docs/source/screen.rst  |2 +
  src/gallium/drivers/i915/i915_screen.c  |1 +
  src/gallium/drivers/llvmpipe/lp_screen.c|2 +
  src/gallium/drivers/nv50/nv50_screen.c  |1 +
  src/gallium/drivers/nvc0/nvc0_screen.c  |1 +
  src/gallium/drivers/nvfx/nvfx_screen.c  |1 +
  src/gallium/drivers/r300/r300_screen.c  |1 +
  src/gallium/drivers/r600/r600_pipe.c|1 +
  src/gallium/drivers/softpipe/sp_screen.c|2 +
  src/gallium/drivers/svga/svga_screen.c  |3 ++
  src/gallium/include/pipe/p_defines.h|3 +-
  src/mesa/state_tracker/st_extensions.c  |4 +-
  19 files changed, 52 insertions(+), 16 deletions(-)

 diff --git a/src/gallium/auxiliary/draw/draw_context.c
 b/src/gallium/auxiliary/draw/draw_context.c
 index 3c0b1aa..ad49ce7 100644
 --- a/src/gallium/auxiliary/draw/draw_context.c
 +++ b/src/gallium/auxiliary/draw/draw_context.c
 @@ -93,11 +93,11 @@ draw_create_context(struct pipe_context *pipe,
 boolean try_llvm,
 }
  #endif
  
 +   draw-pipe = pipe;
 +
 if (!draw_init(draw))
goto err_destroy;
  
 -   draw-pipe = pipe;
 -
 return draw;
  
  err_destroy:
 @@ -168,6 +168,9 @@ boolean draw_init(struct draw_context *draw)
 if (!draw_gs_init( draw ))
return FALSE;
  
 +   draw-quads_always_flatshade_last =
 !draw-pipe-screen-get_param(
 +  draw-pipe-screen,
 PIPE_CAP_QUADS_FOLLOW_PROVOKING_VERTEX_CONVENTION);
 +
 return TRUE;
  }
  
 diff --git a/src/gallium/auxiliary/draw/draw_decompose_tmp.h
 b/src/gallium/auxiliary/draw/draw_decompose_tmp.h
 index a142563..ee6877d 100644
 --- a/src/gallium/auxiliary/draw/draw_decompose_tmp.h
 +++ b/src/gallium/auxiliary/draw/draw_decompose_tmp.h
 @@ -193,13 +193,18 @@ FUNC(FUNC_VARS)
  flags = DRAW_PIPE_RESET_STIPPLE |
  DRAW_PIPE_EDGE_FLAG_0 |
  DRAW_PIPE_EDGE_FLAG_1;
 -/* XXX should always emit idx[0] first */
 -/* always emit idx[3] first */
 -TRIANGLE(flags, idx[3], idx[0], idx[1]);
 +/* always emit idx[3] / idx[0] first */
 +if (quads_flatshade_last)
 +   TRIANGLE(flags, idx[3], idx[0], idx[1]);
 +else
 +   TRIANGLE(flags, idx[0], idx[1], idx[2]);
  
  flags = DRAW_PIPE_EDGE_FLAG_1 |
  DRAW_PIPE_EDGE_FLAG_2;
 -TRIANGLE(flags, idx[3], idx[1], idx[2]);
 +if (quads_flatshade_last)
 +   TRIANGLE(flags, idx[3], idx[1], idx[2]);
 +else
 +   TRIANGLE(flags, idx[0], idx[2], idx[3]);
   }
}
break;
 @@ -237,13 +242,18 @@ FUNC(FUNC_VARS)
 flags = DRAW_PIPE_RESET_STIPPLE |
 DRAW_PIPE_EDGE_FLAG_0 |
 DRAW_PIPE_EDGE_FLAG_1;
 -   /* XXX should always emit idx[0] first */
 -   /* always emit idx[3] first */
 -   TRIANGLE(flags, idx[3], idx[2], idx[0]);
 +   /* always emit idx[3] / idx[0 first */
 +   if (quads_flatshade_last)
 +  TRIANGLE(flags, idx[3], idx[2], idx[0]);
 +   else
 +  TRIANGLE(flags, idx[0], idx[3], idx[2]);
  
 flags = DRAW_PIPE_EDGE_FLAG_1 |
 DRAW_PIPE_EDGE_FLAG_2;
 -   TRIANGLE(flags, idx[3], idx[0], idx[1]);
 +   if 

Re: [Mesa-dev] [PATCH] gallium: add PIPE_CAP_QUADS_FOLLOW_PROVOKING_VERTEX_CONVENTION

2012-02-07 Thread Jose Fonseca


- Original Message -
 On 02/07/2012 05:24 PM, Jose Fonseca wrote:
  I'm not familiar enough with draw internals to tell whether your
  changes are enough or not.
  
  But I'm OK with this change as is, given that most drivers will not
  advertise PIPE_CAP_QUADS_FOLLOW_PROVOKING_VERTEX_CONVENTION, so
  only those that do advertise need to be tested to ensure that
  things are working correctly.
  
  Is there a piglit test for QuadsFollowProvokingVertexConvention?
  It's probably a good idea to have one piglit test that tests it
  and draw clipped primitives w/ different fill modes.
  
 
 glean's clipFlat tests this, with this patch nv50 and nvc0 will
 finally
 pass.
 
 I tested drawing a clipped quad with softpipe with both the cap
 enabled
 and disabled and it looks like the draw change works.
 
 So, I'll push it soon, then - r600 will get to profit from the cap
 too,
 as Marek already mentioned.

Sounds great to me.

Jose
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] updating tex combine for frag program

2012-02-07 Thread Dave Airlie
Hi guys,

Is there any reason we need to update tex combiner state if we are
using shaders?

can we drop update_tex_combine updates to only the case where we don't
have a fragment shader/program unless its from the FF shader?

Dave.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] updating tex combine for frag program

2012-02-07 Thread Brian Paul

On 02/07/2012 01:08 PM, Dave Airlie wrote:

Hi guys,

Is there any reason we need to update tex combiner state if we are
using shaders?

can we drop update_tex_combine updates to only the case where we don't
have a fragment shader/program unless its from the FF shader?


Yes, I think so.

-Brian

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH/RFC] glsl: Avoid excessive loop unrolling.

2012-02-07 Thread Eric Anholt
On Mon, 30 Jan 2012 07:40:15 +0100, Mathias Fröhlich 
mathias.froehl...@gmx.net wrote:
 
 Hi,
 
 I am trying to make the shaders from a sky scattering shader work with mesa.
 One of the problems is that the shader contains a nested 16 x 32 loop with an 
 instruction intensive body. The glsl loop unrolling pass is trying to unroll 
 both loops since both of them have a fixed size =32. Compiling this shader 
 gets stuck in do_common_optimization (= takes more than 10 minutes) since 
 unrolling and then optimizing that huge 16*32*(function body) intruction 
 count 
 kills the runtime behavour of other optimization passes.
 
 Attached is a change that additionally skips loop unrolling if either the 
 loop 
 is a nested loop or the instruction count of the body times the constant 
 iteration count exceeds the former max iteration count times 5 instructions.
 This makes the above shaders at least compile within the usual fraction of a 
 second.
 
 Comments/Review?

Does this also fix our piglit tests for the bug?  If so, note it in the
commit message, and

Reviewed-by: Eric Anholt e...@anholt.net


pgpFCvn7rpPD7.pgp
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] glsl_to_tgsi: Adds support for horizontal location

2012-02-07 Thread Vincent Lejeune
---
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp |   33 ---
 1 files changed, 29 insertions(+), 4 deletions(-)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index ab05896..388e9d7 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -1949,6 +1949,17 @@ glsl_to_tgsi_visitor::visit(ir_dereference_variable *ir)
}
 
this-result = st_src_reg(entry-file, entry-index, var-type);
+   if (ir-type-is_scalar() || ir-type-is_vector())
+  this-result.swizzle = swizzle_for_size(ir-type-vector_elements);
+   else
+  this-result.swizzle = SWIZZLE_NOOP;
+   unsigned swz = this-result.swizzle;
+   this-result.swizzle = MAKE_SWIZZLE4(
+GET_SWZ(swz, 0) + ir-var-horizontal_location % 4,
+GET_SWZ(swz, 1) + ir-var-horizontal_location % 4,
+GET_SWZ(swz, 2) + ir-var-horizontal_location % 4,
+GET_SWZ(swz, 3) + ir-var-horizontal_location % 4
+);
if (!native_integers)
   this-result.type = GLSL_TYPE_FLOAT;
 }
@@ -2169,10 +2180,17 @@ glsl_to_tgsi_visitor::visit(ir_assignment *ir)
   l.writemask = WRITEMASK_XYZW;
} else if (ir-lhs-type-is_scalar() 
   ir-lhs-variable_referenced()-mode == ir_var_out) {
-  /* FINISHME: This hack makes writing to gl_FragDepth, which lives in the
-   * FINISHME: W component of fragment shader output zero, work correctly.
-   */
-  l.writemask = WRITEMASK_XYZW;
+
+   ir_variable *inside_var = ir-lhs-variable_referenced();
+   if (inside_var-location = VERT_RESULT_VAR0) {
+  l.writemask = 1  inside_var-horizontal_location;
+   } else {
+  /* FINISHME: This hack makes writing to gl_FragDepth, which lives in 
the
+* FINISHME: W component of fragment shader output zero, work 
correctly.
+*/
+  l.writemask = WRITEMASK_XYZW;
+   }
+
} else {
   int swizzles[4];
   int first_enabled_chan = 0;
@@ -2180,6 +2198,13 @@ glsl_to_tgsi_visitor::visit(ir_assignment *ir)
 
   l.writemask = ir-write_mask;
 
+   if (ir_variable *inside_var = ir-lhs-variable_referenced()) {
+  if (inside_var-location = VERT_RESULT_VAR0) {
+   l.writemask = l.writemask  inside_var-horizontal_location;
+
+  }
+   }
+
   for (int i = 0; i  4; i++) {
  if (l.writemask  (1  i)) {
 first_enabled_chan = GET_SWZ(r.swizzle, i);
-- 
1.7.7

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: Rewrite the HiZ op

2012-02-07 Thread Eric Anholt
Initial note: I'm very concerned about landing this much driver code
right before a release.  It's very nice code, and easy to read and
review, but I'm still more scared of it than the previous hacks with
known bugs in this timeframe.  For merging after 8.0, I'm fully on
board.

On Mon,  6 Feb 2012 00:33:35 -0800, Chad Versace chad.vers...@linux.intel.com 
wrote:
 +* Ideally, we would not need to flush for the resolve op. But, I suspect
 +* that it's unsafe for CMD_PIPELINE_SELECT to occur multiple times in

I don't see any problem with that.

 +* a single batch, and there is no safe way to ensure that other than by
 +* fencing the resolve with flushes. Ideally, we would just detect if
 +* a batch is in progress and do the right thing, but that would require
 +* the ability to *safely* access brw_context::state::dirty::brw
 +* outside of the brw_state_init() codepath.

What's brw_state_init()?  Did you mean brw_upload_state()?

What concern do you have with safely accessing the dirty bits?  The
context is single-threaded, so that's not an issue.  The batch flush
dirty bit setting happens outside of state upload, too.

 +   /* 3DSTATE_URB
 +*
 +* Assign the entire URB to the VS. Even though the VS disabled, URB space
 +* is still needed because the cliiper loads the VUE's from the URB. From

clipper

 +* the Sandybridge PRM, Volume 2, Part 1, Section 3DSTATE,
 +* Dword 1.15:0 VS Number of URB Entries:
 +* This field is always used (even if VS Function Enable is DISABLED).
 +
 +* The warning below appears in the PRM (Section 3DSTATE_URB), but we can
 +* safely ignore it because this batch contains only one draw call.
 +* Because of URB corruption caused by allocating a previous GS unit
 +* URB entry to the VS unit, software is required to send a “GS NULL
 +* Fence” (Send URB fence with VS URB size == 1 and GS URB size == 0)
 +* plus a dummy DRAW call before any case where VS will be taking over
 +* GS URB space.
 +*/

As noted above, I don't think we really need to flush the batch around
the hiz op.  But that currently is providing a sufficient guarantee for
this requirement: MI_BATCH_BUFFER_START is a full pipeline stall, and
all this requirement is about is getting enough of a pipeline stall, I
think.

 +   /* 3DSTATE_DEPTH_BUFFER */
 +   {
 +  uint32_t width = mt-level[level].width;
 +  uint32_t height = mt-level[level].height;
 +
 +  uint32_t tile_x;
 +  uint32_t tile_y;
 +  uint32_t offset;
 +  {
 + /* Construct a dummy renderbuffer just to extract tile offsets. */
 + struct intel_renderbuffer rb;
 + rb.mt = mt;
 + rb.mt_level = level;
 + rb.mt_layer = layer;
 + intel_renderbuffer_set_draw_offset(rb);
 + offset = intel_renderbuffer_tile_offsets(rb, tile_x, tile_y);
 +  }

I'd remove the block indent here.

The actual body of the code in this patch is:

Reviewed-by: Eric Anholt e...@anholt.net

batch flushing mitigation can come later if we feel like it.


pgp5Tn52Ebkys.pgp
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] updating tex combine for frag program

2012-02-07 Thread Ian Romanick

On 02/07/2012 12:08 PM, Dave Airlie wrote:

Hi guys,

Is there any reason we need to update tex combiner state if we are
using shaders?

can we drop update_tex_combine updates to only the case where we don't
have a fragment shader/program unless its from the FF shader?


I don't recall when update_tex_combine happens.  Does it happen at 
draw-time or state change-time?  If it happens at state change-time, the 
state will need to get updated when program 0 gets bound.  Otherwise 
something like the following would break:


glUseProgram(my_prog);
glTexEnvi(GL_TEXTURE_2D, GL_TEXTURE_ENV_MODE,
GL_DECAL);
glUseProgram(0);
glDrawArrays(...);
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: Rewrite the HiZ op

2012-02-07 Thread Chad Versace
On 02/07/2012 01:10 PM, Eric Anholt wrote:
 Initial note: I'm very concerned about landing this much driver code
 right before a release.  It's very nice code, and easy to read and
 review, but I'm still more scared of it than the previous hacks with
 known bugs in this timeframe.  For merging after 8.0, I'm fully on
 board.

I agree that this is a scary piece of code to commit the day of the release. 
Ken,
Ian, Paul, and I agreed that we should give QA time to digest it and it should
likely land in 8.0.1.

 
 On Mon,  6 Feb 2012 00:33:35 -0800, Chad Versace 
 chad.vers...@linux.intel.com wrote:
 +* Ideally, we would not need to flush for the resolve op. But, I suspect
 +* that it's unsafe for CMD_PIPELINE_SELECT to occur multiple times in
 
 I don't see any problem with that.

I was unsure if it was safe for CMD_PIPELINE_SELECT to occur multiple times
in a batch, so I chose to err on the side of safety and emit a new batch to
eliminate my cause of uncertainty.

 
 +* a single batch, and there is no safe way to ensure that other than by
 +* fencing the resolve with flushes. Ideally, we would just detect if
 +* a batch is in progress and do the right thing, but that would require
 +* the ability to *safely* access brw_context::state::dirty::brw
 +* outside of the brw_state_init() codepath.
 
 What's brw_state_init()?  Did you mean brw_upload_state()?

Oops. I meant brw_upload_state().

 
 What concern do you have with safely accessing the dirty bits?  The
 context is single-threaded, so that's not an issue.  The batch flush
 dirty bit setting happens outside of state upload, too.
 
 +   /* 3DSTATE_URB
 +*
 +* Assign the entire URB to the VS. Even though the VS disabled, URB 
 space
 +* is still needed because the cliiper loads the VUE's from the URB. From
 
 clipper

Typo fixed.

 
 +* the Sandybridge PRM, Volume 2, Part 1, Section 3DSTATE,
 +* Dword 1.15:0 VS Number of URB Entries:
 +* This field is always used (even if VS Function Enable is 
 DISABLED).
 +
 +* The warning below appears in the PRM (Section 3DSTATE_URB), but we can
 +* safely ignore it because this batch contains only one draw call.
 +* Because of URB corruption caused by allocating a previous GS unit
 +* URB entry to the VS unit, software is required to send a “GS NULL
 +* Fence” (Send URB fence with VS URB size == 1 and GS URB size == 0)
 +* plus a dummy DRAW call before any case where VS will be taking 
 over
 +* GS URB space.
 +*/
 
 As noted above, I don't think we really need to flush the batch around
 the hiz op.  But that currently is providing a sufficient guarantee for
 this requirement: MI_BATCH_BUFFER_START is a full pipeline stall, and
 all this requirement is about is getting enough of a pipeline stall, I
 think.
 
 +   /* 3DSTATE_DEPTH_BUFFER */
 +   {
 +  uint32_t width = mt-level[level].width;
 +  uint32_t height = mt-level[level].height;
 +
 +  uint32_t tile_x;
 +  uint32_t tile_y;
 +  uint32_t offset;
 +  {
 + /* Construct a dummy renderbuffer just to extract tile offsets. */
 + struct intel_renderbuffer rb;
 + rb.mt = mt;
 + rb.mt_level = level;
 + rb.mt_layer = layer;
 + intel_renderbuffer_set_draw_offset(rb);
 + offset = intel_renderbuffer_tile_offsets(rb, tile_x, tile_y);
 +  }
 
 I'd remove the block indent here.
 
 The actual body of the code in this patch is:
 
 Reviewed-by: Eric Anholt e...@anholt.net
 
 batch flushing mitigation can come later if we feel like it.

I do want to eliminate the unnecessary flushing, and I would like to do
that and the requisite dirty bit rework in follow-on work. Paul and I
brainstormed on a way to cleanly do this, and I think all of us should
discuss it in-person one day.

Thanks for taking the time to read the patch.


Chad Versace
chad.vers...@linux.intel.com
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] vbo: unmap vertex store before executing lists

2012-02-07 Thread Eric Anholt
On Tue,  7 Feb 2012 13:10:46 -0700, Brian Paul bri...@vmware.com wrote:
 We don't want our VBOs mapped when we're drawing.  This change checks
 if the vertex store VBO is mapped before we execute a list, unmaps it,
 then remaps it after drawing.  This situation pops up when building a
 nested display list in GL_COMPILE_AND_EXECUTE mode.

Do we have any test for this?

Anyway,

Reviewed-by: Eric Anholt e...@anholt.net


pgpp05Vk8QM61.pgp
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] vbo: unmap vertex store before executing lists

2012-02-07 Thread Brian Paul

On 02/07/2012 03:18 PM, Eric Anholt wrote:

On Tue,  7 Feb 2012 13:10:46 -0700, Brian Paulbri...@vmware.com  wrote:

We don't want our VBOs mapped when we're drawing.  This change checks
if the vertex store VBO is mapped before we execute a list, unmaps it,
then remaps it after drawing.  This situation pops up when building a
nested display list in GL_COMPILE_AND_EXECUTE mode.


Do we have any test for this?


The legacy conform dlist.c test hit it.

It's a pretty unusual case anyway.



Anyway,

Reviewed-by: Eric Anholte...@anholt.net


-Brian
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: Rewrite the HiZ op

2012-02-07 Thread Kenneth Graunke

On 02/06/2012 12:33 AM, Chad Versace wrote:

The HiZ op was implemented as a meta-op. This patch reimplements it by
emitting a special HiZ batch. This fixes several known bugs, and likely
a lot of undiscovered ones too.


Reviewed-by: Kenneth Graunke kenn...@whitecape.org

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965/fs: Implement GL_CLAMP behavior on texture rectangles on gen6+.

2012-02-07 Thread Ian Romanick

On 02/04/2012 05:24 AM, Eric Anholt wrote:

We were doing saturate-based clamping on the [0,width] or [0,height]
coordinate, which meant only the first pixel was addressable.

Fixes piglit ARB_texture_rectangle/texwrap-RECT-bordercolor


Tag this for stable branches too.

Reviewed-by: Ian Romanick ian.d.roman...@intel.com


---

I feel dirty.

  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp |   54 +++--
  1 files changed, 49 insertions(+), 5 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index e57e175..ea8cd37 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -1079,12 +1079,18 @@ fs_visitor::visit(ir_texture *ir)
 /* Should be lowered by do_lower_texture_projection */
 assert(!ir-projector);

+   bool needs_gl_clamp = true;
+
+   fs_reg scale_x, scale_y;
+
 /* The 965 requires the EU to do the normalization of GL rectangle
  * texture coordinates.  We use the program parameter state
  * tracking to get the scaling factor.
  */
-   if (intel-gen  6
-   ir-sampler-type-sampler_dimensionality == GLSL_SAMPLER_DIM_RECT) {
+   if (ir-sampler-type-sampler_dimensionality == GLSL_SAMPLER_DIM_RECT
+   (intel-gen  6 ||
+   (intel-gen= 6  (c-key.tex.gl_clamp_mask[0]  (1  sampler) ||
+c-key.tex.gl_clamp_mask[1]  (1  sampler) {
struct gl_program_parameter_list *params = 
c-fp-program.Base.Parameters;
int tokens[STATE_LENGTH] = {
 STATE_INTERNAL,
@@ -1105,8 +,9 @@ fs_visitor::visit(ir_texture *ir)
c-prog_data.param_convert[c-prog_data.nr_params + 1] =
 PARAM_NO_CONVERT;

-  fs_reg scale_x = fs_reg(UNIFORM, c-prog_data.nr_params);
-  fs_reg scale_y = fs_reg(UNIFORM, c-prog_data.nr_params + 1);
+  scale_x = fs_reg(UNIFORM, c-prog_data.nr_params);
+  scale_y = fs_reg(UNIFORM, c-prog_data.nr_params + 1);
+
GLuint index = _mesa_add_state_reference(params,
   (gl_state_index *)tokens);

@@ -1116,7 +1123,14 @@ fs_visitor::visit(ir_texture *ir)
this-param_index[c-prog_data.nr_params] = index;
this-param_offset[c-prog_data.nr_params] = 1;
c-prog_data.nr_params++;
+   }

+   /* The 965 requires the EU to do the normalization of GL rectangle
+* texture coordinates.  We use the program parameter state
+* tracking to get the scaling factor.
+*/
+   if (intel-gen  6
+   ir-sampler-type-sampler_dimensionality == GLSL_SAMPLER_DIM_RECT) {
fs_reg dst = fs_reg(this, ir-coordinate-type);
fs_reg src = coordinate;
coordinate = dst;
@@ -1125,9 +1139,39 @@ fs_visitor::visit(ir_texture *ir)
dst.reg_offset++;
src.reg_offset++;
emit(BRW_OPCODE_MUL, dst, src, scale_y);
+   } else if (ir-sampler-type-sampler_dimensionality == 
GLSL_SAMPLER_DIM_RECT) {
+  /* On gen6+, the sampler handles the rectangle coordinates
+   * natively, without needing rescaling.  But that means we have
+   * to do GL_CLAMP clamping at the [0, width], [0, height] scale,
+   * not [0, 1] like the default case below.
+   */
+  needs_gl_clamp = false;
+
+  for (int i = 0; i  2; i++) {
+if (c-key.tex.gl_clamp_mask[i]  (1  sampler)) {
+   fs_reg chan = coordinate;
+   chan.reg_offset += i;
+
+   inst = emit(BRW_OPCODE_SEL, chan, chan, brw_imm_f(0.0));
+   inst-conditional_mod = BRW_CONDITIONAL_G;
+
+   /* Our parameter comes in as 1.0/width or 1.0/height,
+* because that's what people normally want for doing
+* texture rectangle handling.  We need width or height
+* for clamping, but we don't care enough to make a new
+* parameter type, so just invert back.
+*/
+   fs_reg limit = fs_reg(this, glsl_type::float_type);
+   emit(BRW_OPCODE_MOV, limit, i == 0 ? scale_x : scale_y);
+   emit(SHADER_OPCODE_RCP, limit, limit);
+
+   inst = emit(BRW_OPCODE_SEL, chan, chan, limit);
+   inst-conditional_mod = BRW_CONDITIONAL_L;
+}
+  }
 }

-   if (ir-coordinate) {
+   if (ir-coordinate  needs_gl_clamp) {
for (int i = 0; i  MIN2(ir-coordinate-type-vector_elements, 3); 
i++) {
 if (c-key.tex.gl_clamp_mask[i]  (1  sampler)) {
fs_reg chan = coordinate;


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 45571] fatal error: program/symbol_table.h: No such file or directory

2012-02-07 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=45571

Matt Turner matts...@gmail.com changed:

   What|Removed |Added

 Status|REOPENED|RESOLVED
 Resolution||NOTOURBUG

--- Comment #15 from Matt Turner matts...@gmail.com 2012-02-07 15:23:52 PST 
---
(In reply to comment #14)
 Should we open differents bugs if make realclean doesn't clean everything as
 it should? If so, then I think we can close this bug.

Yes.

-- 
Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] Mesa 8.0 release status

2012-02-07 Thread Ian Romanick

Guys,

It seems that I missed a bunch of patches sitting on master that wanted 
to be picked to 8.0.  I went ahead and picked them over.  My plan is to 
have my QA team run them tonight and do the release tomorrow evening or 
first thing Thursday morning.


There are a few other bugs (e.g., Chad's recent HiZ patches) that need 
fixing.  Those really need a bit more time to cook, so I no intention of 
committing / picking those for 8.0.  My plan is to pick these over in 
the next few days and do an 8.0.1 release in the next week or 10 days.


Does this work for people?
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] mesa: fixes for DrawRangeElements[BaseVertex] index validation

2012-02-07 Thread Roland Scheidegger
Ping? I don't want to commit that without anyone looking at it.

Am 06.02.2012 02:08, schrieb Roland Scheidegger:
 In vbo_exec_DrawRangeElementsBaseVertex, take into account the basevertex.
 As far as I can tell it is completely ok (though maybe stupid) to have
 start/end of 100/199, with _MaxElement being 100, if the basevertex
 is -100 (since the start/end are prior to adding basevertex). The opposite
 is also true (0/99 is not ok if _MaxElement is 100 and and basevertex is 100).
 Previously we would have issued a warning in such cases and worse probably
 fixed end to a bogus value.
 Untested, found by code inspection when looking at bug 40361 (which
 seems unrelated after all).
 
 v2: As per mailing list discussion and the suggestion by Kenneth Graunke just 
 drop
 the questionable effort to fix up the range and revert to have no range 
 instead.
 Rearrange some (disabled) debug bits a bit for slightly less complex code.
 ---
  src/mesa/vbo/vbo_exec_array.c |   63 +++-
  1 files changed, 30 insertions(+), 33 deletions(-)
 
 diff --git a/src/mesa/vbo/vbo_exec_array.c b/src/mesa/vbo/vbo_exec_array.c
 index d6b4d61..1f62e3a 100644
 --- a/src/mesa/vbo/vbo_exec_array.c
 +++ b/src/mesa/vbo/vbo_exec_array.c
 @@ -885,17 +885,33 @@ vbo_exec_DrawRangeElementsBaseVertex(GLenum mode,
end = MIN2(end, 0x);
 }
  
 -   if (end = ctx-Array.ArrayObj-_MaxElement) {
 -  /* the max element is out of bounds of one or more enabled arrays */
 +   if (0) {
 +  printf(glDrawRangeElements{,BaseVertex}
 +  (start %u, end %u, type 0x%x, count %d) ElemBuf %u, 
 +  base %d\n,
 +  start, end, type, count,
 +  ctx-Array.ArrayObj-ElementArrayBufferObj-Name,
 +  basevertex);
 +   }
 +
 +#if 0
 +   check_draw_elements_data(ctx, count, type, indices);
 +#else
 +   (void) check_draw_elements_data;
 +#endif
 +
 +   if ((int)(start + basevertex)  0 ||
 +   end + basevertex = ctx-Array.ArrayObj-_MaxElement) {
 +  /* the max (or min) element is out of bounds of one or more enabled 
 arrays */
warnCount++;
  
if (warnCount  10) {
 - _mesa_warning(ctx, glDraw[Range]Elements(start %u, end %u, count 
 %d, 
 -   type 0x%x, indices=%p)\n
 + _mesa_warning(ctx, glDrawRangeElements[BaseVertex](start %u, end 
 %u, count %d, 
 +   type 0x%x, indices=%p, base %d)\n
 \tend is out of bounds (max=%u)  
 Element Buffer %u (size %d)\n
 \tThis should probably be fixed in the application.,
 -   start, end, count, type, indices,
 +   start, end, count, type, indices, basevertex,
 ctx-Array.ArrayObj-_MaxElement - 1,
 ctx-Array.ArrayObj-ElementArrayBufferObj-Name,
 (int) 
 ctx-Array.ArrayObj-ElementArrayBufferObj-Size);
 @@ -913,14 +929,14 @@ vbo_exec_DrawRangeElementsBaseVertex(GLenum mode,
if (0) {
   GLuint max = _mesa_max_buffer_index(ctx, count, type, indices,
   
 ctx-Array.ArrayObj-ElementArrayBufferObj);
 - if (max = ctx-Array.ArrayObj-_MaxElement) {
 + if (max + basevertex = ctx-Array.ArrayObj-_MaxElement) {
  if (warnCount  10) {
 -   _mesa_warning(ctx, glDraw[Range]Elements(start %u, end %u, 
 - count %d, type 0x%x, indices=%p)\n
 +   _mesa_warning(ctx, glDrawRangeElements[BaseVertex](start %u, 
 end %u, 
 + count %d, type 0x%x, indices=%p, base %d)\n
   \tindex=%u is out of bounds (max=%u)  
   Element Buffer %u (size %d)\n
   \tSkipping the glDrawRangeElements() call,
 - start, end, count, type, indices, max,
 + start, end, count, type, indices, basevertex, 
 max,
   ctx-Array.ArrayObj-_MaxElement - 1,
   
 ctx-Array.ArrayObj-ElementArrayBufferObj-Name,
   (int) 
 ctx-Array.ArrayObj-ElementArrayBufferObj-Size);
 @@ -928,34 +944,15 @@ vbo_exec_DrawRangeElementsBaseVertex(GLenum mode,
   }
   /* XXX we could also find the min index and compare to 'start'
* to see if start is correct.  But it's more likely to get the
 -  * upper bound wrong.
 +  * upper bound wrong, at least for the non-BaseVertex variant...
*/
}
 -
 -  /* Set 'end' to the max possible legal value */
 -  assert(ctx-Array.ArrayObj-_MaxElement = 1);
 -  end = ctx-Array.ArrayObj-_MaxElement - 1;
 -
 -  if (end  start) {
 - return;
 -  }
 -   }
 -
 -   if (0) {
 -  printf(glDraw[Range]Elements{,BaseVertex}
 -  (start %u, end %u, type 0x%x, count %d) 

[Mesa-dev] [PATCH 1/3] i965: Add support for the MAD opcode on SNB.

2012-02-07 Thread Eric Anholt
---
 src/mesa/drivers/dri/i965/brw_defines.h |1 +
 src/mesa/drivers/dri/i965/brw_disasm.c  |  223 ---
 src/mesa/drivers/dri/i965/brw_eu.h  |   17 +++-
 src/mesa/drivers/dri/i965/brw_eu_emit.c |   82 +++-
 src/mesa/drivers/dri/i965/brw_structs.h |   37 +
 5 files changed, 340 insertions(+), 20 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_defines.h 
b/src/mesa/drivers/dri/i965/brw_defines.h
index 029be87..38ce5d7 100644
--- a/src/mesa/drivers/dri/i965/brw_defines.h
+++ b/src/mesa/drivers/dri/i965/brw_defines.h
@@ -620,6 +620,7 @@ enum opcode {
BRW_OPCODE_DPA2 =   88,
BRW_OPCODE_LINE =   89,
BRW_OPCODE_PLN =90,
+   BRW_OPCODE_MAD =91,
BRW_OPCODE_NOP =126,
 
/* These are compiler backend opcodes that get translated into other
diff --git a/src/mesa/drivers/dri/i965/brw_disasm.c 
b/src/mesa/drivers/dri/i965/brw_disasm.c
index a86c8f2..187bc0a 100644
--- a/src/mesa/drivers/dri/i965/brw_disasm.c
+++ b/src/mesa/drivers/dri/i965/brw_disasm.c
@@ -51,6 +51,7 @@ struct {
 [BRW_OPCODE_MACH] = { .name = mach, .nsrc = 2, .ndst = 1 },
 [BRW_OPCODE_LINE] = { .name = line, .nsrc = 2, .ndst = 1 },
 [BRW_OPCODE_PLN] = { .name = pln, .nsrc = 2, .ndst = 1 },
+[BRW_OPCODE_MAD] = { .name = mad, .nsrc = 3, .ndst = 1 },
 [BRW_OPCODE_SAD2] = { .name = sad2, .nsrc = 2, .ndst = 1 },
 [BRW_OPCODE_SADA2] = { .name = sada2, .nsrc = 2, .ndst = 1 },
 [BRW_OPCODE_DP4] = { .name = dp4, .nsrc = 2, .ndst = 1 },
@@ -578,6 +579,28 @@ static int dest (FILE *file, struct brw_instruction *inst)
 return 0;
 }
 
+static int dest_3src (FILE *file, struct brw_instruction *inst)
+{
+interr = 0;
+uint32_t reg_file;
+
+if (inst-bits1.da3src.dest_reg_file)
+   reg_file = BRW_MESSAGE_REGISTER_FILE;
+else
+   reg_file = BRW_GENERAL_REGISTER_FILE;
+
+err |= reg (file, reg_file, inst-bits1.da3src.dest_reg_nr);
+if (err == -1)
+   return 0;
+if (inst-bits1.da3src.dest_subreg_nr)
+   format (file, .%d, inst-bits1.da3src.dest_subreg_nr);
+string (file, 1);
+err |= control (file, writemask, writemask, 
inst-bits1.da3src.dest_writemask, NULL);
+err |= control (file, dest reg encoding, reg_encoding, 
BRW_REGISTER_TYPE_F, NULL);
+
+return 0;
+}
+
 static int src_align1_region (FILE *file,
  GLuint _vert_stride, GLuint _width, GLuint 
_horiz_stride)
 {
@@ -694,6 +717,156 @@ static int src_da16 (FILE *file,
 return err;
 }
 
+static int src0_3src (FILE *file, struct brw_instruction *inst)
+{
+int err = 0;
+GLuint swz_x = (inst-bits2.da3src.src0_swizzle  0)  0x3;
+GLuint swz_y = (inst-bits2.da3src.src0_swizzle  2)  0x3;
+GLuint swz_z = (inst-bits2.da3src.src0_swizzle  4)  0x3;
+GLuint swz_w = (inst-bits2.da3src.src0_swizzle  6)  0x3;
+
+err |= control (file, negate, negate, inst-bits1.da3src.src0_negate, 
NULL);
+err |= control (file, abs, _abs, inst-bits1.da3src.src0_abs, NULL);
+
+err |= reg (file, BRW_GENERAL_REGISTER_FILE, 
inst-bits2.da3src.src0_reg_nr);
+if (err == -1)
+   return 0;
+if (inst-bits2.da3src.src0_subreg_nr)
+   format (file, .%d, inst-bits2.da3src.src0_subreg_nr);
+string (file, 4,1,1);
+err |= control (file, src da16 reg type, reg_encoding,
+   BRW_REGISTER_TYPE_F, NULL);
+/*
+ * Three kinds of swizzle display:
+ *  identity - nothing printed
+ *  1-all  - print the single channel
+ *  1-1 - print the mapping
+ */
+if (swz_x == BRW_CHANNEL_X 
+   swz_y == BRW_CHANNEL_Y 
+   swz_z == BRW_CHANNEL_Z 
+   swz_w == BRW_CHANNEL_W)
+{
+   ;
+}
+else if (swz_x == swz_y  swz_x == swz_z  swz_x == swz_w)
+{
+   string (file, .);
+   err |= control (file, channel select, chan_sel, swz_x, NULL);
+}
+else
+{
+   string (file, .);
+   err |= control (file, channel select, chan_sel, swz_x, NULL);
+   err |= control (file, channel select, chan_sel, swz_y, NULL);
+   err |= control (file, channel select, chan_sel, swz_z, NULL);
+   err |= control (file, channel select, chan_sel, swz_w, NULL);
+}
+return err;
+}
+
+static int src1_3src (FILE *file, struct brw_instruction *inst)
+{
+int err = 0;
+GLuint swz_x = (inst-bits2.da3src.src1_swizzle  0)  0x3;
+GLuint swz_y = (inst-bits2.da3src.src1_swizzle  2)  0x3;
+GLuint swz_z = (inst-bits2.da3src.src1_swizzle  4)  0x3;
+GLuint swz_w = (inst-bits2.da3src.src1_swizzle  6)  0x3;
+GLuint src1_subreg_nr = (inst-bits2.da3src.src1_subreg_nr_low |
+(inst-bits3.da3src.src1_subreg_nr_high  2));
+
+err |= control (file, negate, negate, inst-bits1.da3src.src1_negate,
+   NULL);
+err |= control (file, abs, _abs, inst-bits1.da3src.src1_abs, NULL);
+
+err |= reg (file, BRW_GENERAL_REGISTER_FILE,
+   inst-bits3.da3src.src1_reg_nr);
+if (err == 

[Mesa-dev] [PATCH 3/3] i965/fs: Add support for MADs.

2012-02-07 Thread Eric Anholt
Improves nexuiz performance 0.28% +/- .15% (n=5) on my gen6.  No
statistically significant performance difference on warsow (n=5).
---
 src/mesa/drivers/dri/i965/brw_fs.h   |1 +
 src/mesa/drivers/dri/i965/brw_fs_emit.cpp|6 
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp |   42 ++
 3 files changed, 49 insertions(+), 0 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
b/src/mesa/drivers/dri/i965/brw_fs.h
index 5fdc055..060aa36 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -529,6 +529,7 @@ public:
fs_inst *emit_math(enum opcode op, fs_reg dst, fs_reg src0);
fs_inst *emit_math(enum opcode op, fs_reg dst, fs_reg src0, fs_reg src1);
bool try_emit_saturate(ir_expression *ir);
+   bool try_emit_mad(ir_expression *ir, int mul_arg);
void emit_bool_to_cond_code(ir_rvalue *condition);
void emit_if_gen6(ir_if *ir);
void emit_unspill(fs_inst *inst, fs_reg reg, uint32_t spill_offset);
diff --git a/src/mesa/drivers/dri/i965/brw_fs_emit.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_emit.cpp
index b68d8cb..344a533 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_emit.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_emit.cpp
@@ -725,6 +725,12 @@ fs_visitor::generate_code()
 brw_set_acc_write_control(p, 0);
 break;
 
+  case BRW_OPCODE_MAD:
+brw_set_access_mode(p, BRW_ALIGN_16);
+brw_MAD(p, dst, src[0], src[1], src[2]);
+brw_set_access_mode(p, BRW_ALIGN_1);
+break;
+
   case BRW_OPCODE_FRC:
 brw_FRC(p, dst, src[0]);
 break;
diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index ea8cd37..efa54b5 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -182,6 +182,44 @@ fs_visitor::try_emit_saturate(ir_expression *ir)
return true;
 }
 
+bool
+fs_visitor::try_emit_mad(ir_expression *ir, int mul_arg)
+{
+   /* 3-src instructions were introduced in gen6. */
+   if (intel-gen  6)
+  return false;
+
+   /* FINISHME: Can we do this for 16-wide at all?  Breaks
+* fs-mix-float-float-float at least.
+*/
+   if (c-dispatch_width == 16)
+  return false;
+
+   /* MAD can only handle floating-point data. */
+   if (ir-type != glsl_type::float_type)
+  return false;
+
+   ir_rvalue *nonmul = ir-operands[1 - mul_arg];
+   ir_expression *mul = ir-operands[mul_arg]-as_expression();
+
+   if (!mul || mul-operation != ir_binop_mul)
+  return false;
+
+   nonmul-accept(this);
+   fs_reg src0 = this-result;
+
+   mul-operands[0]-accept(this);
+   fs_reg src1 = this-result;
+
+   mul-operands[1]-accept(this);
+   fs_reg src2 = this-result;
+
+   this-result = fs_reg(this, ir-type);
+   emit(BRW_OPCODE_MAD, this-result, src0, src1, src2);
+
+   return true;
+}
+
 void
 fs_visitor::visit(ir_expression *ir)
 {
@@ -193,6 +231,10 @@ fs_visitor::visit(ir_expression *ir)
 
if (try_emit_saturate(ir))
   return;
+   if (ir-operation == ir_binop_add) {
+  if (try_emit_mad(ir, 0) || try_emit_mad(ir, 1))
+return;
+   }
 
for (operand = 0; operand  ir-get_num_operands(); operand++) {
   ir-operands[operand]-accept(this);
-- 
1.7.9

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/3] i965/fs: Add missing register allocation for 3rd sources.

2012-02-07 Thread Eric Anholt
Our only instruction with a 3rd source so far was linterp, and that
value was never register-allocated.
---
 src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp
index 0d1712e..45f83f3 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp
@@ -61,6 +61,7 @@ fs_visitor::assign_regs_trivial()
   assign_reg(hw_reg_mapping, inst-dst, reg_width);
   assign_reg(hw_reg_mapping, inst-src[0], reg_width);
   assign_reg(hw_reg_mapping, inst-src[1], reg_width);
+  assign_reg(hw_reg_mapping, inst-src[2], reg_width);
}
 
if (this-grf_used = max_grf) {
@@ -271,6 +272,7 @@ fs_visitor::assign_regs()
   assign_reg(hw_reg_mapping, inst-dst, reg_width);
   assign_reg(hw_reg_mapping, inst-src[0], reg_width);
   assign_reg(hw_reg_mapping, inst-src[1], reg_width);
+  assign_reg(hw_reg_mapping, inst-src[2], reg_width);
}
 
ralloc_free(g);
-- 
1.7.9

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] updating tex combine for frag program

2012-02-07 Thread Dave Airlie
On Tue, Feb 7, 2012 at 10:09 PM, Ian Romanick i...@freedesktop.org wrote:
 On 02/07/2012 12:08 PM, Dave Airlie wrote:

 Hi guys,

 Is there any reason we need to update tex combiner state if we are
 using shaders?

 can we drop update_tex_combine updates to only the case where we don't
 have a fragment shader/program unless its from the FF shader?


 I don't recall when update_tex_combine happens.  Does it happen at draw-time
 or state change-time?  If it happens at state change-time, the state will
 need to get updated when program 0 gets bound.  Otherwise something like the
 following would break:

        glUseProgram(my_prog);
        glTexEnvi(GL_TEXTURE_2D, GL_TEXTURE_ENV_MODE,
                GL_DECAL);
        glUseProgram(0);
        glDrawArrays(...);


IT happens in update_texture_state which happens on _NEW_TEXTURE, so
it should be fine, since with the maintain fixed function stuff we
update the program on _NEW_TEXTURE.

Dave.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev