[Mesa-dev] [PATCH] nv50: Fix broken assertion
The assertion added in f09910f3 broke nv50 completely by asserting that the number of elements in a dereferenced pointer (i.e. 1) was greater than i (which ranged up to six), rather than checking the number of elements in the containing array. Signed-off-by: Daniel Stone dan...@fooishbar.org --- src/gallium/drivers/nv50/nv50_pc_regalloc.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/src/gallium/drivers/nv50/nv50_pc_regalloc.c b/src/gallium/drivers/nv50/nv50_pc_regalloc.c index 72922cb..12a59cb 100644 --- a/src/gallium/drivers/nv50/nv50_pc_regalloc.c +++ b/src/gallium/drivers/nv50/nv50_pc_regalloc.c @@ -421,7 +421,7 @@ phi_opnd_for_bb(struct nv_instruction *phi, struct nv_basic_block *b, int i, j; for (j = -1, i = 0; i 6 phi-src[i]; ++i) { - assert(i Elements(phi-src[i])); + assert(i Elements(phi-src)); srci = phi-src[i]; /* if already replaced, check with original source first */ if (srci-flags NV_REF_FLAG_REGALLOC_PRIV) -- 1.7.9 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] nv50: Fix broken assertion
Makes sense. Jose - Original Message - The assertion added in f09910f3 broke nv50 completely by asserting that the number of elements in a dereferenced pointer (i.e. 1) was greater than i (which ranged up to six), rather than checking the number of elements in the containing array. Signed-off-by: Daniel Stone dan...@fooishbar.org --- src/gallium/drivers/nv50/nv50_pc_regalloc.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/src/gallium/drivers/nv50/nv50_pc_regalloc.c b/src/gallium/drivers/nv50/nv50_pc_regalloc.c index 72922cb..12a59cb 100644 --- a/src/gallium/drivers/nv50/nv50_pc_regalloc.c +++ b/src/gallium/drivers/nv50/nv50_pc_regalloc.c @@ -421,7 +421,7 @@ phi_opnd_for_bb(struct nv_instruction *phi, struct nv_basic_block *b, int i, j; for (j = -1, i = 0; i 6 phi-src[i]; ++i) { - assert(i Elements(phi-src[i])); + assert(i Elements(phi-src)); srci = phi-src[i]; /* if already replaced, check with original source first */ if (srci-flags NV_REF_FLAG_REGALLOC_PRIV) -- 1.7.9 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] nv50: Fix broken assertion
On 07.02.2012 13:47, Jose Fonseca wrote: Makes sense. Very much so ... http://cgit.freedesktop.org/mesa/mesa/commit/?id=189e6c7e81ce35b89d9b52d4bd0d6271a7e9c10f (of 26 hours ago). Jose - Original Message - The assertion added in f09910f3 broke nv50 completely by asserting that the number of elements in a dereferenced pointer (i.e. 1) was greater than i (which ranged up to six), rather than checking the number of elements in the containing array. Signed-off-by: Daniel Stone dan...@fooishbar.org --- src/gallium/drivers/nv50/nv50_pc_regalloc.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/src/gallium/drivers/nv50/nv50_pc_regalloc.c b/src/gallium/drivers/nv50/nv50_pc_regalloc.c index 72922cb..12a59cb 100644 --- a/src/gallium/drivers/nv50/nv50_pc_regalloc.c +++ b/src/gallium/drivers/nv50/nv50_pc_regalloc.c @@ -421,7 +421,7 @@ phi_opnd_for_bb(struct nv_instruction *phi, struct nv_basic_block *b, int i, j; for (j = -1, i = 0; i 6 phi-src[i]; ++i) { - assert(i Elements(phi-src[i])); + assert(i Elements(phi-src)); srci = phi-src[i]; /* if already replaced, check with original source first */ if (srci-flags NV_REF_FLAG_REGALLOC_PRIV) -- 1.7.9 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 45277] [bisected] Shading not working properly in Heroes of Newerth
https://bugs.freedesktop.org/show_bug.cgi?id=45277 --- Comment #2 from Damien Grassart dam...@grassart.com 2012-02-07 05:11:44 PST --- I've also found that removing PIPE_CAP_VERTEX_COLOR_UNCLAMPED as a supported feature for r600 does fix this bug (src/gallium/drivers/r600/r600_pipe.c:369). Is the statement in the commit message that r600 can do unclamped vertex colors perhaps false? -- Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email --- You are receiving this mail because: --- You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] nv50: Fix broken assertion
On 7 February 2012 13:03, Christoph Bumiller e0425...@student.tuwien.ac.at wrote: On 07.02.2012 13:47, Jose Fonseca wrote: Makes sense. Very much so ... http://cgit.freedesktop.org/mesa/mesa/commit/?id=189e6c7e81ce35b89d9b52d4bd0d6271a7e9c10f (of 26 hours ago). Ha, snap. Thanks anyway. :) (Doesn't matter too much in any case, as gnome-shell still locks my machine solid in under a minute on this particular chip ...) Cheers, Daniel ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 5/5] gallivm: enable stores of integer types.
On Mon, Feb 6, 2012 at 9:01 PM, Jose Fonseca jfons...@vmware.com wrote: - Original Message - On Mon, Feb 6, 2012 at 7:47 PM, Jose Fonseca jfons...@vmware.com wrote: Dave, I really see no point of inferring anything when translating MOVs. The src/dst value will need to be converted from/to a floating point anyway so this is unnecessary complexity AFAICS. If you believe this is really necessary please provide a concrete example. The problem is MOV from immediate, since immediates are stored as the correct internal type, so we don't get to infer the src in that case, then we try to MOV int4* to float4* and llvm crashes. I see. In that case I'd suggest you to: - in lp_emit_immediate_soa() bitcast _all_ integer immediates to floats (i.e., bld-immediates[] is always of llvm type float, just like bld-temps[], and bld-outputs[], etc.). - in emit_fetch_immediate() bitcast back to integer when stype == INT or UINT I've started looking at this, but it means we have to change immediates from being constants to being something we can bitcast. At the moment the immediate fetch just fetches one immediate value from a vector, not a pointer to an immediate value, making this pointer based just seems like an unnecessary step and seems to be like it will get in the way of optimisations that can work on constant values. Dave. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] gallium/gbm: r300 and r600 now depend on libdrm
fixes undefined references in libradeonwinsys.a when linking Signed-off-by: Tobias Droste tdro...@gmx.de --- src/gallium/targets/gbm/Makefile |2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/src/gallium/targets/gbm/Makefile b/src/gallium/targets/gbm/Makefile index ce56f93..2737b79 100644 --- a/src/gallium/targets/gbm/Makefile +++ b/src/gallium/targets/gbm/Makefile @@ -72,11 +72,13 @@ nouveau_SYS = -ldrm_nouveau r300_LIBS = \ $(TOP)/src/gallium/winsys/radeon/drm/libradeonwinsys.a \ $(TOP)/src/gallium/drivers/r300/libr300.a +r300_SYS += -ldrm_radeon # r600 pipe driver r600_LIBS = \ $(TOP)/src/gallium/winsys/radeon/drm/libradeonwinsys.a \ $(TOP)/src/gallium/drivers/r600/libr600.a +r600_SYS += -ldrm_radeon # vmwgfx pipe driver vmwgfx_LIBS = \ -- 1.7.7 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 5/5] gallivm: enable stores of integer types.
- Original Message - On Mon, Feb 6, 2012 at 9:01 PM, Jose Fonseca jfons...@vmware.com wrote: - Original Message - On Mon, Feb 6, 2012 at 7:47 PM, Jose Fonseca jfons...@vmware.com wrote: Dave, I really see no point of inferring anything when translating MOVs. The src/dst value will need to be converted from/to a floating point anyway so this is unnecessary complexity AFAICS. If you believe this is really necessary please provide a concrete example. The problem is MOV from immediate, since immediates are stored as the correct internal type, so we don't get to infer the src in that case, then we try to MOV int4* to float4* and llvm crashes. I see. In that case I'd suggest you to: - in lp_emit_immediate_soa() bitcast _all_ integer immediates to floats (i.e., bld-immediates[] is always of llvm type float, just like bld-temps[], and bld-outputs[], etc.). - in emit_fetch_immediate() bitcast back to integer when stype == INT or UINT I've started looking at this, but it means we have to change immediates from being constants to being something we can bitcast. At the moment the immediate fetch just fetches one immediate value from a vector, not a pointer to an immediate value, making this pointer based just seems like an unnecessary step and seems to be like it will get in the way of optimisations that can work on constant values. If you use LLVMConstBitCast() the result is a still a constant. Nothing is lost. Jose ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] st/mesa: avoid vertex texture update for 0 case
From: Dave Airlie airl...@redhat.com If we had no vertex textures previously and we have none now, don't bother doing the enables dance. Signed-off-by: Dave Airlie airl...@redhat.com --- src/mesa/state_tracker/st_atom_texture.c |3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) diff --git a/src/mesa/state_tracker/st_atom_texture.c b/src/mesa/state_tracker/st_atom_texture.c index 88e6128..1b82283 100644 --- a/src/mesa/state_tracker/st_atom_texture.c +++ b/src/mesa/state_tracker/st_atom_texture.c @@ -260,6 +260,9 @@ update_vertex_textures(struct st_context *st) struct gl_vertex_program *vprog = ctx-VertexProgram._Current; GLuint su; + if (!vprog-Base.SamplersUsed st-state.num_vertex_textures == 0) + return; + st-state.num_vertex_textures = 0; /* loop over sampler units (aka tex image units) */ -- 1.7.6.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] gallium: add PIPE_CAP_QUADS_FOLLOW_PROVOKING_VERTEX_CONVENTION
I'm not familiar enough with draw internals to tell whether your changes are enough or not. But I'm OK with this change as is, given that most drivers will not advertise PIPE_CAP_QUADS_FOLLOW_PROVOKING_VERTEX_CONVENTION, so only those that do advertise need to be tested to ensure that things are working correctly. Is there a piglit test for QuadsFollowProvokingVertexConvention? It's probably a good idea to have one piglit test that tests it and draw clipped primitives w/ different fill modes. Jose - Original Message - Just let the hardware do it if it can and avoid drivers having to check for the special case on each draw call. v2: update the draw module --- src/gallium/auxiliary/draw/draw_context.c |7 - src/gallium/auxiliary/draw/draw_decompose_tmp.h | 26 +++--- src/gallium/auxiliary/draw/draw_gs_tmp.h|1 + src/gallium/auxiliary/draw/draw_private.h |2 + src/gallium/auxiliary/draw/draw_pt_decompose.h |2 + src/gallium/auxiliary/draw/draw_so_emit_tmp.h |1 + src/gallium/docs/source/cso/rasterizer.rst |7 +++-- src/gallium/docs/source/screen.rst |2 + src/gallium/drivers/i915/i915_screen.c |1 + src/gallium/drivers/llvmpipe/lp_screen.c|2 + src/gallium/drivers/nv50/nv50_screen.c |1 + src/gallium/drivers/nvc0/nvc0_screen.c |1 + src/gallium/drivers/nvfx/nvfx_screen.c |1 + src/gallium/drivers/r300/r300_screen.c |1 + src/gallium/drivers/r600/r600_pipe.c|1 + src/gallium/drivers/softpipe/sp_screen.c|2 + src/gallium/drivers/svga/svga_screen.c |3 ++ src/gallium/include/pipe/p_defines.h|3 +- src/mesa/state_tracker/st_extensions.c |4 +- 19 files changed, 52 insertions(+), 16 deletions(-) diff --git a/src/gallium/auxiliary/draw/draw_context.c b/src/gallium/auxiliary/draw/draw_context.c index 3c0b1aa..ad49ce7 100644 --- a/src/gallium/auxiliary/draw/draw_context.c +++ b/src/gallium/auxiliary/draw/draw_context.c @@ -93,11 +93,11 @@ draw_create_context(struct pipe_context *pipe, boolean try_llvm, } #endif + draw-pipe = pipe; + if (!draw_init(draw)) goto err_destroy; - draw-pipe = pipe; - return draw; err_destroy: @@ -168,6 +168,9 @@ boolean draw_init(struct draw_context *draw) if (!draw_gs_init( draw )) return FALSE; + draw-quads_always_flatshade_last = !draw-pipe-screen-get_param( + draw-pipe-screen, PIPE_CAP_QUADS_FOLLOW_PROVOKING_VERTEX_CONVENTION); + return TRUE; } diff --git a/src/gallium/auxiliary/draw/draw_decompose_tmp.h b/src/gallium/auxiliary/draw/draw_decompose_tmp.h index a142563..ee6877d 100644 --- a/src/gallium/auxiliary/draw/draw_decompose_tmp.h +++ b/src/gallium/auxiliary/draw/draw_decompose_tmp.h @@ -193,13 +193,18 @@ FUNC(FUNC_VARS) flags = DRAW_PIPE_RESET_STIPPLE | DRAW_PIPE_EDGE_FLAG_0 | DRAW_PIPE_EDGE_FLAG_1; -/* XXX should always emit idx[0] first */ -/* always emit idx[3] first */ -TRIANGLE(flags, idx[3], idx[0], idx[1]); +/* always emit idx[3] / idx[0] first */ +if (quads_flatshade_last) + TRIANGLE(flags, idx[3], idx[0], idx[1]); +else + TRIANGLE(flags, idx[0], idx[1], idx[2]); flags = DRAW_PIPE_EDGE_FLAG_1 | DRAW_PIPE_EDGE_FLAG_2; -TRIANGLE(flags, idx[3], idx[1], idx[2]); +if (quads_flatshade_last) + TRIANGLE(flags, idx[3], idx[1], idx[2]); +else + TRIANGLE(flags, idx[0], idx[2], idx[3]); } } break; @@ -237,13 +242,18 @@ FUNC(FUNC_VARS) flags = DRAW_PIPE_RESET_STIPPLE | DRAW_PIPE_EDGE_FLAG_0 | DRAW_PIPE_EDGE_FLAG_1; - /* XXX should always emit idx[0] first */ - /* always emit idx[3] first */ - TRIANGLE(flags, idx[3], idx[2], idx[0]); + /* always emit idx[3] / idx[0 first */ + if (quads_flatshade_last) + TRIANGLE(flags, idx[3], idx[2], idx[0]); + else + TRIANGLE(flags, idx[0], idx[3], idx[2]); flags = DRAW_PIPE_EDGE_FLAG_1 | DRAW_PIPE_EDGE_FLAG_2; - TRIANGLE(flags, idx[3], idx[0], idx[1]); + if (quads_flatshade_last) + TRIANGLE(flags, idx[3], idx[0], idx[1]); + else + TRIANGLE(flags, idx[0], idx[1], idx[3]); } } } diff --git a/src/gallium/auxiliary/draw/draw_gs_tmp.h b/src/gallium/auxiliary/draw/draw_gs_tmp.h index de7b026..92b6fb7 100644 ---
Re: [Mesa-dev] [PATCH] st/mesa: avoid vertex texture update for 0 case
- Original Message - From: Dave Airlie airl...@redhat.com If we had no vertex textures previously and we have none now, don't bother doing the enables dance. Signed-off-by: Dave Airlie airl...@redhat.com --- src/mesa/state_tracker/st_atom_texture.c |3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) Nice find. Does this help performance? Can't we do the same for fragment or does it just cost more to check (when this is rarely the case) then just doing the rest of the function? Cheers, Jakob. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] st/mesa: avoid vertex texture update for 0 case
On Tue, Feb 7, 2012 at 4:29 PM, Jakob Bornecrantz ja...@vmware.com wrote: - Original Message - From: Dave Airlie airl...@redhat.com If we had no vertex textures previously and we have none now, don't bother doing the enables dance. Signed-off-by: Dave Airlie airl...@redhat.com --- src/mesa/state_tracker/st_atom_texture.c | 3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) Nice find. Does this help performance? Can't we do the same for fragment or does it just cost more to check (when this is rarely the case) then just doing the rest of the function? Well I think we generally have a fragment texture so it doesn't save much there, but I was just playing with noop and nexuiz and saw 0.46 CPU in vertex texture updates when I knew I had none. I think we could improve the frag texture updates a fair bit, but I'd have to a bit more testing on it. Dave, (oh I'll fix whitespace in that before pushing). ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] gallium: add PIPE_CAP_QUADS_FOLLOW_PROVOKING_VERTEX_CONVENTION
On 02/07/2012 05:24 PM, Jose Fonseca wrote: I'm not familiar enough with draw internals to tell whether your changes are enough or not. But I'm OK with this change as is, given that most drivers will not advertise PIPE_CAP_QUADS_FOLLOW_PROVOKING_VERTEX_CONVENTION, so only those that do advertise need to be tested to ensure that things are working correctly. Is there a piglit test for QuadsFollowProvokingVertexConvention? It's probably a good idea to have one piglit test that tests it and draw clipped primitives w/ different fill modes. glean's clipFlat tests this, with this patch nv50 and nvc0 will finally pass. I tested drawing a clipped quad with softpipe with both the cap enabled and disabled and it looks like the draw change works. So, I'll push it soon, then - r600 will get to profit from the cap too, as Marek already mentioned. Jose - Original Message - Just let the hardware do it if it can and avoid drivers having to check for the special case on each draw call. v2: update the draw module --- src/gallium/auxiliary/draw/draw_context.c |7 - src/gallium/auxiliary/draw/draw_decompose_tmp.h | 26 +++--- src/gallium/auxiliary/draw/draw_gs_tmp.h|1 + src/gallium/auxiliary/draw/draw_private.h |2 + src/gallium/auxiliary/draw/draw_pt_decompose.h |2 + src/gallium/auxiliary/draw/draw_so_emit_tmp.h |1 + src/gallium/docs/source/cso/rasterizer.rst |7 +++-- src/gallium/docs/source/screen.rst |2 + src/gallium/drivers/i915/i915_screen.c |1 + src/gallium/drivers/llvmpipe/lp_screen.c|2 + src/gallium/drivers/nv50/nv50_screen.c |1 + src/gallium/drivers/nvc0/nvc0_screen.c |1 + src/gallium/drivers/nvfx/nvfx_screen.c |1 + src/gallium/drivers/r300/r300_screen.c |1 + src/gallium/drivers/r600/r600_pipe.c|1 + src/gallium/drivers/softpipe/sp_screen.c|2 + src/gallium/drivers/svga/svga_screen.c |3 ++ src/gallium/include/pipe/p_defines.h|3 +- src/mesa/state_tracker/st_extensions.c |4 +- 19 files changed, 52 insertions(+), 16 deletions(-) diff --git a/src/gallium/auxiliary/draw/draw_context.c b/src/gallium/auxiliary/draw/draw_context.c index 3c0b1aa..ad49ce7 100644 --- a/src/gallium/auxiliary/draw/draw_context.c +++ b/src/gallium/auxiliary/draw/draw_context.c @@ -93,11 +93,11 @@ draw_create_context(struct pipe_context *pipe, boolean try_llvm, } #endif + draw-pipe = pipe; + if (!draw_init(draw)) goto err_destroy; - draw-pipe = pipe; - return draw; err_destroy: @@ -168,6 +168,9 @@ boolean draw_init(struct draw_context *draw) if (!draw_gs_init( draw )) return FALSE; + draw-quads_always_flatshade_last = !draw-pipe-screen-get_param( + draw-pipe-screen, PIPE_CAP_QUADS_FOLLOW_PROVOKING_VERTEX_CONVENTION); + return TRUE; } diff --git a/src/gallium/auxiliary/draw/draw_decompose_tmp.h b/src/gallium/auxiliary/draw/draw_decompose_tmp.h index a142563..ee6877d 100644 --- a/src/gallium/auxiliary/draw/draw_decompose_tmp.h +++ b/src/gallium/auxiliary/draw/draw_decompose_tmp.h @@ -193,13 +193,18 @@ FUNC(FUNC_VARS) flags = DRAW_PIPE_RESET_STIPPLE | DRAW_PIPE_EDGE_FLAG_0 | DRAW_PIPE_EDGE_FLAG_1; -/* XXX should always emit idx[0] first */ -/* always emit idx[3] first */ -TRIANGLE(flags, idx[3], idx[0], idx[1]); +/* always emit idx[3] / idx[0] first */ +if (quads_flatshade_last) + TRIANGLE(flags, idx[3], idx[0], idx[1]); +else + TRIANGLE(flags, idx[0], idx[1], idx[2]); flags = DRAW_PIPE_EDGE_FLAG_1 | DRAW_PIPE_EDGE_FLAG_2; -TRIANGLE(flags, idx[3], idx[1], idx[2]); +if (quads_flatshade_last) + TRIANGLE(flags, idx[3], idx[1], idx[2]); +else + TRIANGLE(flags, idx[0], idx[2], idx[3]); } } break; @@ -237,13 +242,18 @@ FUNC(FUNC_VARS) flags = DRAW_PIPE_RESET_STIPPLE | DRAW_PIPE_EDGE_FLAG_0 | DRAW_PIPE_EDGE_FLAG_1; - /* XXX should always emit idx[0] first */ - /* always emit idx[3] first */ - TRIANGLE(flags, idx[3], idx[2], idx[0]); + /* always emit idx[3] / idx[0 first */ + if (quads_flatshade_last) + TRIANGLE(flags, idx[3], idx[2], idx[0]); + else + TRIANGLE(flags, idx[0], idx[3], idx[2]); flags = DRAW_PIPE_EDGE_FLAG_1 | DRAW_PIPE_EDGE_FLAG_2; - TRIANGLE(flags, idx[3], idx[0], idx[1]); + if
Re: [Mesa-dev] [PATCH] gallium: add PIPE_CAP_QUADS_FOLLOW_PROVOKING_VERTEX_CONVENTION
- Original Message - On 02/07/2012 05:24 PM, Jose Fonseca wrote: I'm not familiar enough with draw internals to tell whether your changes are enough or not. But I'm OK with this change as is, given that most drivers will not advertise PIPE_CAP_QUADS_FOLLOW_PROVOKING_VERTEX_CONVENTION, so only those that do advertise need to be tested to ensure that things are working correctly. Is there a piglit test for QuadsFollowProvokingVertexConvention? It's probably a good idea to have one piglit test that tests it and draw clipped primitives w/ different fill modes. glean's clipFlat tests this, with this patch nv50 and nvc0 will finally pass. I tested drawing a clipped quad with softpipe with both the cap enabled and disabled and it looks like the draw change works. So, I'll push it soon, then - r600 will get to profit from the cap too, as Marek already mentioned. Sounds great to me. Jose ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] updating tex combine for frag program
Hi guys, Is there any reason we need to update tex combiner state if we are using shaders? can we drop update_tex_combine updates to only the case where we don't have a fragment shader/program unless its from the FF shader? Dave. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] updating tex combine for frag program
On 02/07/2012 01:08 PM, Dave Airlie wrote: Hi guys, Is there any reason we need to update tex combiner state if we are using shaders? can we drop update_tex_combine updates to only the case where we don't have a fragment shader/program unless its from the FF shader? Yes, I think so. -Brian ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH/RFC] glsl: Avoid excessive loop unrolling.
On Mon, 30 Jan 2012 07:40:15 +0100, Mathias Fröhlich mathias.froehl...@gmx.net wrote: Hi, I am trying to make the shaders from a sky scattering shader work with mesa. One of the problems is that the shader contains a nested 16 x 32 loop with an instruction intensive body. The glsl loop unrolling pass is trying to unroll both loops since both of them have a fixed size =32. Compiling this shader gets stuck in do_common_optimization (= takes more than 10 minutes) since unrolling and then optimizing that huge 16*32*(function body) intruction count kills the runtime behavour of other optimization passes. Attached is a change that additionally skips loop unrolling if either the loop is a nested loop or the instruction count of the body times the constant iteration count exceeds the former max iteration count times 5 instructions. This makes the above shaders at least compile within the usual fraction of a second. Comments/Review? Does this also fix our piglit tests for the bug? If so, note it in the commit message, and Reviewed-by: Eric Anholt e...@anholt.net pgpFCvn7rpPD7.pgp Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] glsl_to_tgsi: Adds support for horizontal location
--- src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 33 --- 1 files changed, 29 insertions(+), 4 deletions(-) diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp index ab05896..388e9d7 100644 --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp @@ -1949,6 +1949,17 @@ glsl_to_tgsi_visitor::visit(ir_dereference_variable *ir) } this-result = st_src_reg(entry-file, entry-index, var-type); + if (ir-type-is_scalar() || ir-type-is_vector()) + this-result.swizzle = swizzle_for_size(ir-type-vector_elements); + else + this-result.swizzle = SWIZZLE_NOOP; + unsigned swz = this-result.swizzle; + this-result.swizzle = MAKE_SWIZZLE4( +GET_SWZ(swz, 0) + ir-var-horizontal_location % 4, +GET_SWZ(swz, 1) + ir-var-horizontal_location % 4, +GET_SWZ(swz, 2) + ir-var-horizontal_location % 4, +GET_SWZ(swz, 3) + ir-var-horizontal_location % 4 +); if (!native_integers) this-result.type = GLSL_TYPE_FLOAT; } @@ -2169,10 +2180,17 @@ glsl_to_tgsi_visitor::visit(ir_assignment *ir) l.writemask = WRITEMASK_XYZW; } else if (ir-lhs-type-is_scalar() ir-lhs-variable_referenced()-mode == ir_var_out) { - /* FINISHME: This hack makes writing to gl_FragDepth, which lives in the - * FINISHME: W component of fragment shader output zero, work correctly. - */ - l.writemask = WRITEMASK_XYZW; + + ir_variable *inside_var = ir-lhs-variable_referenced(); + if (inside_var-location = VERT_RESULT_VAR0) { + l.writemask = 1 inside_var-horizontal_location; + } else { + /* FINISHME: This hack makes writing to gl_FragDepth, which lives in the +* FINISHME: W component of fragment shader output zero, work correctly. +*/ + l.writemask = WRITEMASK_XYZW; + } + } else { int swizzles[4]; int first_enabled_chan = 0; @@ -2180,6 +2198,13 @@ glsl_to_tgsi_visitor::visit(ir_assignment *ir) l.writemask = ir-write_mask; + if (ir_variable *inside_var = ir-lhs-variable_referenced()) { + if (inside_var-location = VERT_RESULT_VAR0) { + l.writemask = l.writemask inside_var-horizontal_location; + + } + } + for (int i = 0; i 4; i++) { if (l.writemask (1 i)) { first_enabled_chan = GET_SWZ(r.swizzle, i); -- 1.7.7 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965: Rewrite the HiZ op
Initial note: I'm very concerned about landing this much driver code right before a release. It's very nice code, and easy to read and review, but I'm still more scared of it than the previous hacks with known bugs in this timeframe. For merging after 8.0, I'm fully on board. On Mon, 6 Feb 2012 00:33:35 -0800, Chad Versace chad.vers...@linux.intel.com wrote: +* Ideally, we would not need to flush for the resolve op. But, I suspect +* that it's unsafe for CMD_PIPELINE_SELECT to occur multiple times in I don't see any problem with that. +* a single batch, and there is no safe way to ensure that other than by +* fencing the resolve with flushes. Ideally, we would just detect if +* a batch is in progress and do the right thing, but that would require +* the ability to *safely* access brw_context::state::dirty::brw +* outside of the brw_state_init() codepath. What's brw_state_init()? Did you mean brw_upload_state()? What concern do you have with safely accessing the dirty bits? The context is single-threaded, so that's not an issue. The batch flush dirty bit setting happens outside of state upload, too. + /* 3DSTATE_URB +* +* Assign the entire URB to the VS. Even though the VS disabled, URB space +* is still needed because the cliiper loads the VUE's from the URB. From clipper +* the Sandybridge PRM, Volume 2, Part 1, Section 3DSTATE, +* Dword 1.15:0 VS Number of URB Entries: +* This field is always used (even if VS Function Enable is DISABLED). + +* The warning below appears in the PRM (Section 3DSTATE_URB), but we can +* safely ignore it because this batch contains only one draw call. +* Because of URB corruption caused by allocating a previous GS unit +* URB entry to the VS unit, software is required to send a “GS NULL +* Fence” (Send URB fence with VS URB size == 1 and GS URB size == 0) +* plus a dummy DRAW call before any case where VS will be taking over +* GS URB space. +*/ As noted above, I don't think we really need to flush the batch around the hiz op. But that currently is providing a sufficient guarantee for this requirement: MI_BATCH_BUFFER_START is a full pipeline stall, and all this requirement is about is getting enough of a pipeline stall, I think. + /* 3DSTATE_DEPTH_BUFFER */ + { + uint32_t width = mt-level[level].width; + uint32_t height = mt-level[level].height; + + uint32_t tile_x; + uint32_t tile_y; + uint32_t offset; + { + /* Construct a dummy renderbuffer just to extract tile offsets. */ + struct intel_renderbuffer rb; + rb.mt = mt; + rb.mt_level = level; + rb.mt_layer = layer; + intel_renderbuffer_set_draw_offset(rb); + offset = intel_renderbuffer_tile_offsets(rb, tile_x, tile_y); + } I'd remove the block indent here. The actual body of the code in this patch is: Reviewed-by: Eric Anholt e...@anholt.net batch flushing mitigation can come later if we feel like it. pgp5Tn52Ebkys.pgp Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] updating tex combine for frag program
On 02/07/2012 12:08 PM, Dave Airlie wrote: Hi guys, Is there any reason we need to update tex combiner state if we are using shaders? can we drop update_tex_combine updates to only the case where we don't have a fragment shader/program unless its from the FF shader? I don't recall when update_tex_combine happens. Does it happen at draw-time or state change-time? If it happens at state change-time, the state will need to get updated when program 0 gets bound. Otherwise something like the following would break: glUseProgram(my_prog); glTexEnvi(GL_TEXTURE_2D, GL_TEXTURE_ENV_MODE, GL_DECAL); glUseProgram(0); glDrawArrays(...); ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965: Rewrite the HiZ op
On 02/07/2012 01:10 PM, Eric Anholt wrote: Initial note: I'm very concerned about landing this much driver code right before a release. It's very nice code, and easy to read and review, but I'm still more scared of it than the previous hacks with known bugs in this timeframe. For merging after 8.0, I'm fully on board. I agree that this is a scary piece of code to commit the day of the release. Ken, Ian, Paul, and I agreed that we should give QA time to digest it and it should likely land in 8.0.1. On Mon, 6 Feb 2012 00:33:35 -0800, Chad Versace chad.vers...@linux.intel.com wrote: +* Ideally, we would not need to flush for the resolve op. But, I suspect +* that it's unsafe for CMD_PIPELINE_SELECT to occur multiple times in I don't see any problem with that. I was unsure if it was safe for CMD_PIPELINE_SELECT to occur multiple times in a batch, so I chose to err on the side of safety and emit a new batch to eliminate my cause of uncertainty. +* a single batch, and there is no safe way to ensure that other than by +* fencing the resolve with flushes. Ideally, we would just detect if +* a batch is in progress and do the right thing, but that would require +* the ability to *safely* access brw_context::state::dirty::brw +* outside of the brw_state_init() codepath. What's brw_state_init()? Did you mean brw_upload_state()? Oops. I meant brw_upload_state(). What concern do you have with safely accessing the dirty bits? The context is single-threaded, so that's not an issue. The batch flush dirty bit setting happens outside of state upload, too. + /* 3DSTATE_URB +* +* Assign the entire URB to the VS. Even though the VS disabled, URB space +* is still needed because the cliiper loads the VUE's from the URB. From clipper Typo fixed. +* the Sandybridge PRM, Volume 2, Part 1, Section 3DSTATE, +* Dword 1.15:0 VS Number of URB Entries: +* This field is always used (even if VS Function Enable is DISABLED). + +* The warning below appears in the PRM (Section 3DSTATE_URB), but we can +* safely ignore it because this batch contains only one draw call. +* Because of URB corruption caused by allocating a previous GS unit +* URB entry to the VS unit, software is required to send a “GS NULL +* Fence” (Send URB fence with VS URB size == 1 and GS URB size == 0) +* plus a dummy DRAW call before any case where VS will be taking over +* GS URB space. +*/ As noted above, I don't think we really need to flush the batch around the hiz op. But that currently is providing a sufficient guarantee for this requirement: MI_BATCH_BUFFER_START is a full pipeline stall, and all this requirement is about is getting enough of a pipeline stall, I think. + /* 3DSTATE_DEPTH_BUFFER */ + { + uint32_t width = mt-level[level].width; + uint32_t height = mt-level[level].height; + + uint32_t tile_x; + uint32_t tile_y; + uint32_t offset; + { + /* Construct a dummy renderbuffer just to extract tile offsets. */ + struct intel_renderbuffer rb; + rb.mt = mt; + rb.mt_level = level; + rb.mt_layer = layer; + intel_renderbuffer_set_draw_offset(rb); + offset = intel_renderbuffer_tile_offsets(rb, tile_x, tile_y); + } I'd remove the block indent here. The actual body of the code in this patch is: Reviewed-by: Eric Anholt e...@anholt.net batch flushing mitigation can come later if we feel like it. I do want to eliminate the unnecessary flushing, and I would like to do that and the requisite dirty bit rework in follow-on work. Paul and I brainstormed on a way to cleanly do this, and I think all of us should discuss it in-person one day. Thanks for taking the time to read the patch. Chad Versace chad.vers...@linux.intel.com ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] vbo: unmap vertex store before executing lists
On Tue, 7 Feb 2012 13:10:46 -0700, Brian Paul bri...@vmware.com wrote: We don't want our VBOs mapped when we're drawing. This change checks if the vertex store VBO is mapped before we execute a list, unmaps it, then remaps it after drawing. This situation pops up when building a nested display list in GL_COMPILE_AND_EXECUTE mode. Do we have any test for this? Anyway, Reviewed-by: Eric Anholt e...@anholt.net pgpp05Vk8QM61.pgp Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] vbo: unmap vertex store before executing lists
On 02/07/2012 03:18 PM, Eric Anholt wrote: On Tue, 7 Feb 2012 13:10:46 -0700, Brian Paulbri...@vmware.com wrote: We don't want our VBOs mapped when we're drawing. This change checks if the vertex store VBO is mapped before we execute a list, unmaps it, then remaps it after drawing. This situation pops up when building a nested display list in GL_COMPILE_AND_EXECUTE mode. Do we have any test for this? The legacy conform dlist.c test hit it. It's a pretty unusual case anyway. Anyway, Reviewed-by: Eric Anholte...@anholt.net -Brian ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965: Rewrite the HiZ op
On 02/06/2012 12:33 AM, Chad Versace wrote: The HiZ op was implemented as a meta-op. This patch reimplements it by emitting a special HiZ batch. This fixes several known bugs, and likely a lot of undiscovered ones too. Reviewed-by: Kenneth Graunke kenn...@whitecape.org ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965/fs: Implement GL_CLAMP behavior on texture rectangles on gen6+.
On 02/04/2012 05:24 AM, Eric Anholt wrote: We were doing saturate-based clamping on the [0,width] or [0,height] coordinate, which meant only the first pixel was addressable. Fixes piglit ARB_texture_rectangle/texwrap-RECT-bordercolor Tag this for stable branches too. Reviewed-by: Ian Romanick ian.d.roman...@intel.com --- I feel dirty. src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 54 +++-- 1 files changed, 49 insertions(+), 5 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp index e57e175..ea8cd37 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp @@ -1079,12 +1079,18 @@ fs_visitor::visit(ir_texture *ir) /* Should be lowered by do_lower_texture_projection */ assert(!ir-projector); + bool needs_gl_clamp = true; + + fs_reg scale_x, scale_y; + /* The 965 requires the EU to do the normalization of GL rectangle * texture coordinates. We use the program parameter state * tracking to get the scaling factor. */ - if (intel-gen 6 - ir-sampler-type-sampler_dimensionality == GLSL_SAMPLER_DIM_RECT) { + if (ir-sampler-type-sampler_dimensionality == GLSL_SAMPLER_DIM_RECT + (intel-gen 6 || + (intel-gen= 6 (c-key.tex.gl_clamp_mask[0] (1 sampler) || +c-key.tex.gl_clamp_mask[1] (1 sampler) { struct gl_program_parameter_list *params = c-fp-program.Base.Parameters; int tokens[STATE_LENGTH] = { STATE_INTERNAL, @@ -1105,8 +,9 @@ fs_visitor::visit(ir_texture *ir) c-prog_data.param_convert[c-prog_data.nr_params + 1] = PARAM_NO_CONVERT; - fs_reg scale_x = fs_reg(UNIFORM, c-prog_data.nr_params); - fs_reg scale_y = fs_reg(UNIFORM, c-prog_data.nr_params + 1); + scale_x = fs_reg(UNIFORM, c-prog_data.nr_params); + scale_y = fs_reg(UNIFORM, c-prog_data.nr_params + 1); + GLuint index = _mesa_add_state_reference(params, (gl_state_index *)tokens); @@ -1116,7 +1123,14 @@ fs_visitor::visit(ir_texture *ir) this-param_index[c-prog_data.nr_params] = index; this-param_offset[c-prog_data.nr_params] = 1; c-prog_data.nr_params++; + } + /* The 965 requires the EU to do the normalization of GL rectangle +* texture coordinates. We use the program parameter state +* tracking to get the scaling factor. +*/ + if (intel-gen 6 + ir-sampler-type-sampler_dimensionality == GLSL_SAMPLER_DIM_RECT) { fs_reg dst = fs_reg(this, ir-coordinate-type); fs_reg src = coordinate; coordinate = dst; @@ -1125,9 +1139,39 @@ fs_visitor::visit(ir_texture *ir) dst.reg_offset++; src.reg_offset++; emit(BRW_OPCODE_MUL, dst, src, scale_y); + } else if (ir-sampler-type-sampler_dimensionality == GLSL_SAMPLER_DIM_RECT) { + /* On gen6+, the sampler handles the rectangle coordinates + * natively, without needing rescaling. But that means we have + * to do GL_CLAMP clamping at the [0, width], [0, height] scale, + * not [0, 1] like the default case below. + */ + needs_gl_clamp = false; + + for (int i = 0; i 2; i++) { +if (c-key.tex.gl_clamp_mask[i] (1 sampler)) { + fs_reg chan = coordinate; + chan.reg_offset += i; + + inst = emit(BRW_OPCODE_SEL, chan, chan, brw_imm_f(0.0)); + inst-conditional_mod = BRW_CONDITIONAL_G; + + /* Our parameter comes in as 1.0/width or 1.0/height, +* because that's what people normally want for doing +* texture rectangle handling. We need width or height +* for clamping, but we don't care enough to make a new +* parameter type, so just invert back. +*/ + fs_reg limit = fs_reg(this, glsl_type::float_type); + emit(BRW_OPCODE_MOV, limit, i == 0 ? scale_x : scale_y); + emit(SHADER_OPCODE_RCP, limit, limit); + + inst = emit(BRW_OPCODE_SEL, chan, chan, limit); + inst-conditional_mod = BRW_CONDITIONAL_L; +} + } } - if (ir-coordinate) { + if (ir-coordinate needs_gl_clamp) { for (int i = 0; i MIN2(ir-coordinate-type-vector_elements, 3); i++) { if (c-key.tex.gl_clamp_mask[i] (1 sampler)) { fs_reg chan = coordinate; ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 45571] fatal error: program/symbol_table.h: No such file or directory
https://bugs.freedesktop.org/show_bug.cgi?id=45571 Matt Turner matts...@gmail.com changed: What|Removed |Added Status|REOPENED|RESOLVED Resolution||NOTOURBUG --- Comment #15 from Matt Turner matts...@gmail.com 2012-02-07 15:23:52 PST --- (In reply to comment #14) Should we open differents bugs if make realclean doesn't clean everything as it should? If so, then I think we can close this bug. Yes. -- Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email --- You are receiving this mail because: --- You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] Mesa 8.0 release status
Guys, It seems that I missed a bunch of patches sitting on master that wanted to be picked to 8.0. I went ahead and picked them over. My plan is to have my QA team run them tonight and do the release tomorrow evening or first thing Thursday morning. There are a few other bugs (e.g., Chad's recent HiZ patches) that need fixing. Those really need a bit more time to cook, so I no intention of committing / picking those for 8.0. My plan is to pick these over in the next few days and do an 8.0.1 release in the next week or 10 days. Does this work for people? ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] mesa: fixes for DrawRangeElements[BaseVertex] index validation
Ping? I don't want to commit that without anyone looking at it. Am 06.02.2012 02:08, schrieb Roland Scheidegger: In vbo_exec_DrawRangeElementsBaseVertex, take into account the basevertex. As far as I can tell it is completely ok (though maybe stupid) to have start/end of 100/199, with _MaxElement being 100, if the basevertex is -100 (since the start/end are prior to adding basevertex). The opposite is also true (0/99 is not ok if _MaxElement is 100 and and basevertex is 100). Previously we would have issued a warning in such cases and worse probably fixed end to a bogus value. Untested, found by code inspection when looking at bug 40361 (which seems unrelated after all). v2: As per mailing list discussion and the suggestion by Kenneth Graunke just drop the questionable effort to fix up the range and revert to have no range instead. Rearrange some (disabled) debug bits a bit for slightly less complex code. --- src/mesa/vbo/vbo_exec_array.c | 63 +++- 1 files changed, 30 insertions(+), 33 deletions(-) diff --git a/src/mesa/vbo/vbo_exec_array.c b/src/mesa/vbo/vbo_exec_array.c index d6b4d61..1f62e3a 100644 --- a/src/mesa/vbo/vbo_exec_array.c +++ b/src/mesa/vbo/vbo_exec_array.c @@ -885,17 +885,33 @@ vbo_exec_DrawRangeElementsBaseVertex(GLenum mode, end = MIN2(end, 0x); } - if (end = ctx-Array.ArrayObj-_MaxElement) { - /* the max element is out of bounds of one or more enabled arrays */ + if (0) { + printf(glDrawRangeElements{,BaseVertex} + (start %u, end %u, type 0x%x, count %d) ElemBuf %u, + base %d\n, + start, end, type, count, + ctx-Array.ArrayObj-ElementArrayBufferObj-Name, + basevertex); + } + +#if 0 + check_draw_elements_data(ctx, count, type, indices); +#else + (void) check_draw_elements_data; +#endif + + if ((int)(start + basevertex) 0 || + end + basevertex = ctx-Array.ArrayObj-_MaxElement) { + /* the max (or min) element is out of bounds of one or more enabled arrays */ warnCount++; if (warnCount 10) { - _mesa_warning(ctx, glDraw[Range]Elements(start %u, end %u, count %d, - type 0x%x, indices=%p)\n + _mesa_warning(ctx, glDrawRangeElements[BaseVertex](start %u, end %u, count %d, + type 0x%x, indices=%p, base %d)\n \tend is out of bounds (max=%u) Element Buffer %u (size %d)\n \tThis should probably be fixed in the application., - start, end, count, type, indices, + start, end, count, type, indices, basevertex, ctx-Array.ArrayObj-_MaxElement - 1, ctx-Array.ArrayObj-ElementArrayBufferObj-Name, (int) ctx-Array.ArrayObj-ElementArrayBufferObj-Size); @@ -913,14 +929,14 @@ vbo_exec_DrawRangeElementsBaseVertex(GLenum mode, if (0) { GLuint max = _mesa_max_buffer_index(ctx, count, type, indices, ctx-Array.ArrayObj-ElementArrayBufferObj); - if (max = ctx-Array.ArrayObj-_MaxElement) { + if (max + basevertex = ctx-Array.ArrayObj-_MaxElement) { if (warnCount 10) { - _mesa_warning(ctx, glDraw[Range]Elements(start %u, end %u, - count %d, type 0x%x, indices=%p)\n + _mesa_warning(ctx, glDrawRangeElements[BaseVertex](start %u, end %u, + count %d, type 0x%x, indices=%p, base %d)\n \tindex=%u is out of bounds (max=%u) Element Buffer %u (size %d)\n \tSkipping the glDrawRangeElements() call, - start, end, count, type, indices, max, + start, end, count, type, indices, basevertex, max, ctx-Array.ArrayObj-_MaxElement - 1, ctx-Array.ArrayObj-ElementArrayBufferObj-Name, (int) ctx-Array.ArrayObj-ElementArrayBufferObj-Size); @@ -928,34 +944,15 @@ vbo_exec_DrawRangeElementsBaseVertex(GLenum mode, } /* XXX we could also find the min index and compare to 'start' * to see if start is correct. But it's more likely to get the - * upper bound wrong. + * upper bound wrong, at least for the non-BaseVertex variant... */ } - - /* Set 'end' to the max possible legal value */ - assert(ctx-Array.ArrayObj-_MaxElement = 1); - end = ctx-Array.ArrayObj-_MaxElement - 1; - - if (end start) { - return; - } - } - - if (0) { - printf(glDraw[Range]Elements{,BaseVertex} - (start %u, end %u, type 0x%x, count %d)
[Mesa-dev] [PATCH 1/3] i965: Add support for the MAD opcode on SNB.
--- src/mesa/drivers/dri/i965/brw_defines.h |1 + src/mesa/drivers/dri/i965/brw_disasm.c | 223 --- src/mesa/drivers/dri/i965/brw_eu.h | 17 +++- src/mesa/drivers/dri/i965/brw_eu_emit.c | 82 +++- src/mesa/drivers/dri/i965/brw_structs.h | 37 + 5 files changed, 340 insertions(+), 20 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_defines.h b/src/mesa/drivers/dri/i965/brw_defines.h index 029be87..38ce5d7 100644 --- a/src/mesa/drivers/dri/i965/brw_defines.h +++ b/src/mesa/drivers/dri/i965/brw_defines.h @@ -620,6 +620,7 @@ enum opcode { BRW_OPCODE_DPA2 = 88, BRW_OPCODE_LINE = 89, BRW_OPCODE_PLN =90, + BRW_OPCODE_MAD =91, BRW_OPCODE_NOP =126, /* These are compiler backend opcodes that get translated into other diff --git a/src/mesa/drivers/dri/i965/brw_disasm.c b/src/mesa/drivers/dri/i965/brw_disasm.c index a86c8f2..187bc0a 100644 --- a/src/mesa/drivers/dri/i965/brw_disasm.c +++ b/src/mesa/drivers/dri/i965/brw_disasm.c @@ -51,6 +51,7 @@ struct { [BRW_OPCODE_MACH] = { .name = mach, .nsrc = 2, .ndst = 1 }, [BRW_OPCODE_LINE] = { .name = line, .nsrc = 2, .ndst = 1 }, [BRW_OPCODE_PLN] = { .name = pln, .nsrc = 2, .ndst = 1 }, +[BRW_OPCODE_MAD] = { .name = mad, .nsrc = 3, .ndst = 1 }, [BRW_OPCODE_SAD2] = { .name = sad2, .nsrc = 2, .ndst = 1 }, [BRW_OPCODE_SADA2] = { .name = sada2, .nsrc = 2, .ndst = 1 }, [BRW_OPCODE_DP4] = { .name = dp4, .nsrc = 2, .ndst = 1 }, @@ -578,6 +579,28 @@ static int dest (FILE *file, struct brw_instruction *inst) return 0; } +static int dest_3src (FILE *file, struct brw_instruction *inst) +{ +interr = 0; +uint32_t reg_file; + +if (inst-bits1.da3src.dest_reg_file) + reg_file = BRW_MESSAGE_REGISTER_FILE; +else + reg_file = BRW_GENERAL_REGISTER_FILE; + +err |= reg (file, reg_file, inst-bits1.da3src.dest_reg_nr); +if (err == -1) + return 0; +if (inst-bits1.da3src.dest_subreg_nr) + format (file, .%d, inst-bits1.da3src.dest_subreg_nr); +string (file, 1); +err |= control (file, writemask, writemask, inst-bits1.da3src.dest_writemask, NULL); +err |= control (file, dest reg encoding, reg_encoding, BRW_REGISTER_TYPE_F, NULL); + +return 0; +} + static int src_align1_region (FILE *file, GLuint _vert_stride, GLuint _width, GLuint _horiz_stride) { @@ -694,6 +717,156 @@ static int src_da16 (FILE *file, return err; } +static int src0_3src (FILE *file, struct brw_instruction *inst) +{ +int err = 0; +GLuint swz_x = (inst-bits2.da3src.src0_swizzle 0) 0x3; +GLuint swz_y = (inst-bits2.da3src.src0_swizzle 2) 0x3; +GLuint swz_z = (inst-bits2.da3src.src0_swizzle 4) 0x3; +GLuint swz_w = (inst-bits2.da3src.src0_swizzle 6) 0x3; + +err |= control (file, negate, negate, inst-bits1.da3src.src0_negate, NULL); +err |= control (file, abs, _abs, inst-bits1.da3src.src0_abs, NULL); + +err |= reg (file, BRW_GENERAL_REGISTER_FILE, inst-bits2.da3src.src0_reg_nr); +if (err == -1) + return 0; +if (inst-bits2.da3src.src0_subreg_nr) + format (file, .%d, inst-bits2.da3src.src0_subreg_nr); +string (file, 4,1,1); +err |= control (file, src da16 reg type, reg_encoding, + BRW_REGISTER_TYPE_F, NULL); +/* + * Three kinds of swizzle display: + * identity - nothing printed + * 1-all - print the single channel + * 1-1 - print the mapping + */ +if (swz_x == BRW_CHANNEL_X + swz_y == BRW_CHANNEL_Y + swz_z == BRW_CHANNEL_Z + swz_w == BRW_CHANNEL_W) +{ + ; +} +else if (swz_x == swz_y swz_x == swz_z swz_x == swz_w) +{ + string (file, .); + err |= control (file, channel select, chan_sel, swz_x, NULL); +} +else +{ + string (file, .); + err |= control (file, channel select, chan_sel, swz_x, NULL); + err |= control (file, channel select, chan_sel, swz_y, NULL); + err |= control (file, channel select, chan_sel, swz_z, NULL); + err |= control (file, channel select, chan_sel, swz_w, NULL); +} +return err; +} + +static int src1_3src (FILE *file, struct brw_instruction *inst) +{ +int err = 0; +GLuint swz_x = (inst-bits2.da3src.src1_swizzle 0) 0x3; +GLuint swz_y = (inst-bits2.da3src.src1_swizzle 2) 0x3; +GLuint swz_z = (inst-bits2.da3src.src1_swizzle 4) 0x3; +GLuint swz_w = (inst-bits2.da3src.src1_swizzle 6) 0x3; +GLuint src1_subreg_nr = (inst-bits2.da3src.src1_subreg_nr_low | +(inst-bits3.da3src.src1_subreg_nr_high 2)); + +err |= control (file, negate, negate, inst-bits1.da3src.src1_negate, + NULL); +err |= control (file, abs, _abs, inst-bits1.da3src.src1_abs, NULL); + +err |= reg (file, BRW_GENERAL_REGISTER_FILE, + inst-bits3.da3src.src1_reg_nr); +if (err ==
[Mesa-dev] [PATCH 3/3] i965/fs: Add support for MADs.
Improves nexuiz performance 0.28% +/- .15% (n=5) on my gen6. No statistically significant performance difference on warsow (n=5). --- src/mesa/drivers/dri/i965/brw_fs.h |1 + src/mesa/drivers/dri/i965/brw_fs_emit.cpp|6 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 42 ++ 3 files changed, 49 insertions(+), 0 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h index 5fdc055..060aa36 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.h +++ b/src/mesa/drivers/dri/i965/brw_fs.h @@ -529,6 +529,7 @@ public: fs_inst *emit_math(enum opcode op, fs_reg dst, fs_reg src0); fs_inst *emit_math(enum opcode op, fs_reg dst, fs_reg src0, fs_reg src1); bool try_emit_saturate(ir_expression *ir); + bool try_emit_mad(ir_expression *ir, int mul_arg); void emit_bool_to_cond_code(ir_rvalue *condition); void emit_if_gen6(ir_if *ir); void emit_unspill(fs_inst *inst, fs_reg reg, uint32_t spill_offset); diff --git a/src/mesa/drivers/dri/i965/brw_fs_emit.cpp b/src/mesa/drivers/dri/i965/brw_fs_emit.cpp index b68d8cb..344a533 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_emit.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_emit.cpp @@ -725,6 +725,12 @@ fs_visitor::generate_code() brw_set_acc_write_control(p, 0); break; + case BRW_OPCODE_MAD: +brw_set_access_mode(p, BRW_ALIGN_16); +brw_MAD(p, dst, src[0], src[1], src[2]); +brw_set_access_mode(p, BRW_ALIGN_1); +break; + case BRW_OPCODE_FRC: brw_FRC(p, dst, src[0]); break; diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp index ea8cd37..efa54b5 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp @@ -182,6 +182,44 @@ fs_visitor::try_emit_saturate(ir_expression *ir) return true; } +bool +fs_visitor::try_emit_mad(ir_expression *ir, int mul_arg) +{ + /* 3-src instructions were introduced in gen6. */ + if (intel-gen 6) + return false; + + /* FINISHME: Can we do this for 16-wide at all? Breaks +* fs-mix-float-float-float at least. +*/ + if (c-dispatch_width == 16) + return false; + + /* MAD can only handle floating-point data. */ + if (ir-type != glsl_type::float_type) + return false; + + ir_rvalue *nonmul = ir-operands[1 - mul_arg]; + ir_expression *mul = ir-operands[mul_arg]-as_expression(); + + if (!mul || mul-operation != ir_binop_mul) + return false; + + nonmul-accept(this); + fs_reg src0 = this-result; + + mul-operands[0]-accept(this); + fs_reg src1 = this-result; + + mul-operands[1]-accept(this); + fs_reg src2 = this-result; + + this-result = fs_reg(this, ir-type); + emit(BRW_OPCODE_MAD, this-result, src0, src1, src2); + + return true; +} + void fs_visitor::visit(ir_expression *ir) { @@ -193,6 +231,10 @@ fs_visitor::visit(ir_expression *ir) if (try_emit_saturate(ir)) return; + if (ir-operation == ir_binop_add) { + if (try_emit_mad(ir, 0) || try_emit_mad(ir, 1)) +return; + } for (operand = 0; operand ir-get_num_operands(); operand++) { ir-operands[operand]-accept(this); -- 1.7.9 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/3] i965/fs: Add missing register allocation for 3rd sources.
Our only instruction with a 3rd source so far was linterp, and that value was never register-allocated. --- src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp |2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp b/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp index 0d1712e..45f83f3 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp @@ -61,6 +61,7 @@ fs_visitor::assign_regs_trivial() assign_reg(hw_reg_mapping, inst-dst, reg_width); assign_reg(hw_reg_mapping, inst-src[0], reg_width); assign_reg(hw_reg_mapping, inst-src[1], reg_width); + assign_reg(hw_reg_mapping, inst-src[2], reg_width); } if (this-grf_used = max_grf) { @@ -271,6 +272,7 @@ fs_visitor::assign_regs() assign_reg(hw_reg_mapping, inst-dst, reg_width); assign_reg(hw_reg_mapping, inst-src[0], reg_width); assign_reg(hw_reg_mapping, inst-src[1], reg_width); + assign_reg(hw_reg_mapping, inst-src[2], reg_width); } ralloc_free(g); -- 1.7.9 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] updating tex combine for frag program
On Tue, Feb 7, 2012 at 10:09 PM, Ian Romanick i...@freedesktop.org wrote: On 02/07/2012 12:08 PM, Dave Airlie wrote: Hi guys, Is there any reason we need to update tex combiner state if we are using shaders? can we drop update_tex_combine updates to only the case where we don't have a fragment shader/program unless its from the FF shader? I don't recall when update_tex_combine happens. Does it happen at draw-time or state change-time? If it happens at state change-time, the state will need to get updated when program 0 gets bound. Otherwise something like the following would break: glUseProgram(my_prog); glTexEnvi(GL_TEXTURE_2D, GL_TEXTURE_ENV_MODE, GL_DECAL); glUseProgram(0); glDrawArrays(...); IT happens in update_texture_state which happens on _NEW_TEXTURE, so it should be fine, since with the maintain fixed function stuff we update the program on _NEW_TEXTURE. Dave. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev