[Mesa-dev] [PATCH] ACTIVE_UNIFORM_MAX_LENGTH should include 3 extra characters for arrays.
If the active uniform is an array, then the length of the uniform name
should include the three extra characters for the "[0]" suffix, which is
required by the GL 4.2 spec to be appended to the uniform name in
glGetActiveUniform().

This avoids the situation where the output buffer does not have enough space
to hold the "[0]" suffix, resulting in an incomplete array specification like
"foobar[0".

Change-Id: Icd58cd6a73c9de7bbe5659d757b8009021846019
Signed-off-by: Haixia Shi <h...@chromium.org>
Reviewed-by: Stephane Marchesin <marc...@chromium.org>
---
 src/mesa/main/shaderapi.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
index be69467..68767f4 100644
--- a/src/mesa/main/shaderapi.c
+++ b/src/mesa/main/shaderapi.c
@@ -519,8 +519,11 @@ get_programiv(struct gl_context *ctx, GLuint program, GLenum pname, GLint *param
       for (i = 0; i < shProg->NumUserUniformStorage; i++) {
          /* Add one for the terminating NUL character.
+          * However if the uniform is an array, then add three extra characters
+          * for the appended "[0]" suffix, in addition to the terminating NUL.
           */
-         const GLint len = strlen(shProg->UniformStorage[i].name) + 1;
+         const GLint len = strlen(shProg->UniformStorage[i].name) + 1 +
+             ((shProg->UniformStorage[i].array_elements != 0) ? 3 : 0);

          if (len > max_len)
             max_len = len;
--
1.8.1.3

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

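For illustration only, the length rule from the patch above can be sketched as a standalone C function. The struct uniform_info and the function max_active_uniform_name_length are hypothetical stand-ins, not Mesa's real types or API; only the arithmetic (one byte for the NUL, three more for the "[0]" suffix on arrays) is taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a uniform storage entry (not Mesa's struct). */
struct uniform_info {
    const char *name;         /* name without the "[0]" suffix */
    unsigned array_elements;  /* 0 for non-array uniforms */
};

/* Buffer size needed for the longest reported name: strlen plus the NUL,
 * plus three characters for "[0]" when the uniform is an array. */
static int
max_active_uniform_name_length(const struct uniform_info *u, size_t count)
{
    int max_len = 0;
    size_t i;

    assert(u != NULL || count == 0);

    for (i = 0; i < count; i++) {
        const int len = (int) strlen(u[i].name) + 1 +
                        ((u[i].array_elements != 0) ? 3 : 0);
        if (len > max_len)
            max_len = len;
    }
    return max_len;
}
```

With a scalar named color and an array named foobar, the array entry dominates: six characters for the name, three for "[0]" and one for the NUL.
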
Re: [Mesa-dev] [PATCH 1/2] R600: Emit native instructions for tex
Tests?

-- Sean Silva

Re: [Mesa-dev] [PATCH] radeon/llvm: Build libradeonllvm as a static library
On Mon, 2013-04-01 at 14:11 -0700, Tom Stellard wrote:
> From: Tom Stellard <thomas.stell...@amd.com>
>
> Building libradeonllvm as a shared object has led to a number of bugs
> and build system complications, and I don't think it's necessary for
> such a small library.
>
> This library was originally changed to a shared object to work around
> linker error in egl_static.so, but these appear to be fixed now.
>
> https://bugs.freedesktop.org/show_bug.cgi?id=62226
> ---
>
> Please test to make sure this works for your build configuration.

Tested-by: Michel Dänzer <michel.daen...@amd.com>

--
Earthling Michel Dänzer   | http://www.amd.com
Libre software enthusiast | Debian, X and DRI developer

[Mesa-dev] [PATCH V2] mesa: don't memcmp() off the end of a cache key.
Reported-by: `per` in #intel-gfx

The size of the cache key varies, so store the actual size as well as
the key blob itself, rather than just assuming it's the same as the size
passed in.

NOTE: This is a candidate for stable branches.

V2: Don't leave silly holes in structure; use unsigned instead of
GLuint.

Signed-off-by: Chris Forbes <chr...@ijw.co.nz>
---
 src/mesa/program/prog_cache.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/src/mesa/program/prog_cache.c b/src/mesa/program/prog_cache.c
index 47f926b..1041f35 100644
--- a/src/mesa/program/prog_cache.c
+++ b/src/mesa/program/prog_cache.c
@@ -37,6 +37,7 @@
 struct cache_item
 {
    GLuint hash;
+   unsigned keysize;
    void *key;
    struct gl_program *program;
    struct cache_item *next;
@@ -183,7 +184,10 @@ _mesa_search_program_cache(struct gl_program_cache *cache,
       struct cache_item *c;

       for (c = cache->items[hash % cache->size]; c; c = c->next) {
-         if (c->hash == hash && memcmp(c->key, key, keysize) == 0) {
+         if (c->hash == hash &&
+            c->keysize == keysize &&
+            memcmp(c->key, key, keysize) == 0) {
+
             cache->last = c;
             return c->program;
          }
@@ -207,6 +211,7 @@ _mesa_program_cache_insert(struct gl_context *ctx,
    c->key = malloc(keysize);
    memcpy(c->key, key, keysize);
+   c->keysize = keysize;
    c->program = program;  /* no refcount change */
@@ -235,6 +240,7 @@ _mesa_shader_cache_insert(struct gl_context *ctx,
    c->key = malloc(keysize);
    memcpy(c->key, key, keysize);
+   c->keysize = keysize;
    c->program = (struct gl_program *)program;  /* no refcount change */
--
1.8.2

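As a rough sketch of the idea behind this fix, the lookup below stores the key size with each entry and compares sizes before calling memcmp(), so a longer lookup key never reads past the end of a shorter stored key. The struct item and the function lookup are simplified, hypothetical names and do not reproduce Mesa's prog_cache code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified cache entry; it mirrors only the idea of the patch. */
struct item {
    unsigned hash;
    unsigned keysize;     /* actual size of the stored key blob */
    const void *key;
    int value;
    struct item *next;
};

/* Return the matching entry or NULL. The size check runs before memcmp(),
 * so memcmp() only ever sees two blobs of identical length. */
static struct item *
lookup(struct item *bucket, unsigned hash, const void *key, unsigned keysize)
{
    struct item *c;

    assert(key != NULL);

    for (c = bucket; c; c = c->next) {
        if (c->hash == hash &&
            c->keysize == keysize &&
            memcmp(c->key, key, keysize) == 0)
            return c;
    }
    return NULL;
}
```

Because && short-circuits, the order of the three conditions matters: the cheap hash and size comparisons filter out entries before any key bytes are read.
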
Re: [Mesa-dev] [PATCH 4/4] radeonsi: add instance divisor support v3
On Wed, 2013-03-27 at 16:35 +0100, Christian König wrote:
> From: Christian König <christian.koe...@amd.com>
>
> v2: reduce key size, don't copy key around too much.
> v3: remove key size reduction
>
> Signed-off-by: Christian König <christian.koe...@amd.com>

Reviewed-by: Michel Dänzer <michel.daen...@amd.com>

--
Earthling Michel Dänzer   | http://www.amd.com
Libre software enthusiast | Debian, X and DRI developer

[Mesa-dev] [Bug 62921] [llvmpipe] piglit arb_color_buffer_float-drawpixels GL_RGBA16F regression
https://bugs.freedesktop.org/show_bug.cgi?id=62921

Roland Scheidegger <srol...@vmware.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #1 from Roland Scheidegger <srol...@vmware.com> ---
Fixed by 9b329f4c095a6b0aa5e55519c32fcf4c9d823e2b.

--
You are receiving this mail because:
You are the assignee for the bug.

[Mesa-dev] [PATCH] radeonsi: remove sampler writemask v2
From: Christian König <christian.koe...@amd.com>

v2: fix intrinsic name as well

Signed-off-by: Christian König <christian.koe...@amd.com>
---
 src/gallium/drivers/radeonsi/radeonsi_shader.c | 19 +++++++------------
 1 file changed, 7 insertions(+), 12 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/radeonsi_shader.c b/src/gallium/drivers/radeonsi/radeonsi_shader.c
index 5fdf46e..1c5fa51 100644
--- a/src/gallium/drivers/radeonsi/radeonsi_shader.c
+++ b/src/gallium/drivers/radeonsi/radeonsi_shader.c
@@ -795,10 +795,6 @@ static void tex_fetch_args(
    unsigned count = 0;
    unsigned chan;

-   /* WriteMask */
-   /* XXX: should be optimized using emit_data->inst->Dst[0].Register.WriteMask*/
-   emit_data->args[0] = lp_build_const_int32(bld_base->base.gallivm, 0xf);
-
    /* Fetch and project texture coordinates */
    coords[3] = lp_build_emit_fetch(bld_base, emit_data->inst, 0, TGSI_CHAN_W);
    for (chan = 0; chan < 3; chan++ ) {
@@ -904,20 +900,19 @@ static void tex_fetch_args(
    while (count < util_next_power_of_two(count))
       address[count++] = LLVMGetUndef(LLVMInt32TypeInContext(gallivm->context));

-   emit_data->args[1] = lp_build_gather_values(gallivm, address, count);
+   emit_data->args[0] = lp_build_gather_values(gallivm, address, count);

    /* Resource */
-   emit_data->args[2] = si_shader_ctx->resources[emit_data->inst->Src[1].Register.Index];
+   emit_data->args[1] = si_shader_ctx->resources[emit_data->inst->Src[1].Register.Index];

    /* Sampler */
-   emit_data->args[3] = si_shader_ctx->samplers[emit_data->inst->Src[1].Register.Index];
+   emit_data->args[2] = si_shader_ctx->samplers[emit_data->inst->Src[1].Register.Index];

    /* Dimensions */
-   emit_data->args[4] = lp_build_const_int32(bld_base->base.gallivm, target);
+   emit_data->args[3] = lp_build_const_int32(bld_base->base.gallivm, target);
+
+   emit_data->arg_count = 4;

-   emit_data->arg_count = 5;
-   /* XXX: To optimize, we could use a float or v2f32, if the last bits of
-    * the writemask are clear */
    emit_data->dst_type = LLVMVectorType(
          LLVMFloatTypeInContext(bld_base->base.gallivm->context),
          4);
@@ -931,7 +926,7 @@ static void build_tex_intrinsic(const struct lp_build_tgsi_action * action,
    char intr_name[23];

    sprintf(intr_name, "%sv%ui32", action->intr_name,
-      LLVMGetVectorSize(LLVMTypeOf(emit_data->args[1])));
+      LLVMGetVectorSize(LLVMTypeOf(emit_data->args[0])));

    emit_data->output[emit_data->chan] = build_intrinsic(
       base->gallivm->builder, intr_name, emit_data->dst_type,
--
1.7.9.5

Re: [Mesa-dev] [PATCH] drirc: set always_have_depth_buffer for Topogon
----- Original Message -----
> On 03/29/2013 05:30 PM, Brian Paul wrote:
> Has this bug been reported to the Topogun developer?

Yes, I reported it via http://www.topogun.com/support/contact-us.htm on 29th
September 2012. I have received no reply since, nor did I try a second time.

Also note that there has been no release/update since then either, so it is
possible that a fix has been incorporated but not released.

Jose

> ---
>  src/mesa/drivers/dri/common/drirc | 6 ++++++
>  1 files changed, 6 insertions(+), 0 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/common/drirc b/src/mesa/drivers/dri/common/drirc
> index a13941f..556d1b5 100644
> --- a/src/mesa/drivers/dri/common/drirc
> +++ b/src/mesa/drivers/dri/common/drirc
> @@ -25,5 +25,11 @@
>          <application name="Savage 2" executable="savage2.bin">
>              <option name="disable_glsl_line_continuations" value="true" />
>          </application>
> +        <application name="Topogun (32-bit)" executable="topogun32">
> +            <option name="always_have_depth_buffer" value="true" />
> +        </application>
> +        <application name="Topogun (64-bit)" executable="topogun64">
> +            <option name="always_have_depth_buffer" value="true" />
> +        </application>
>      </device>
>  </driconf>

Re: [Mesa-dev] [PATCH 2/2] gallivm: bring back optimized but incorrect float to smallfloat optimizations
----- Original Message -----
> From: Roland Scheidegger <srol...@vmware.com>
>
> Conceptually the same as previously done in float_to_half.
> Should cut down number of instructions from 14 to 10 or so, but
> will promote some NaNs to Infs, so it's disabled.
> It gets a bit tricky though handling all the cases correctly...
> Passes basic tests either way (though there are no tests testing special
> cases, but some manual tests injecting them seemed promising).
> ---
>  .../auxiliary/gallivm/lp_bld_format_float.c | 124 ++--
>  1 file changed, 86 insertions(+), 38 deletions(-)
>
> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_format_float.c b/src/gallium/auxiliary/gallivm/lp_bld_format_float.c
> index 161e392..61b6a60 100644
> --- a/src/gallium/auxiliary/gallivm/lp_bld_format_float.c
> +++ b/src/gallium/auxiliary/gallivm/lp_bld_format_float.c
> @@ -79,13 +79,15 @@ lp_build_float_to_smallfloat(struct gallivm_state *gallivm,
>  {
>     LLVMBuilderRef builder = gallivm->builder;
>     LLVMValueRef i32_floatexpmask, i32_smallexpmask, magic, normal;
> -   LLVMValueRef rescale_src, tmp, i32_roundmask, small_max;
> -   LLVMValueRef is_nan, i32_qnanbit, src_abs, shift, infcheck_src, res;
> -   LLVMValueRef is_inf, is_nan_or_inf, nan_or_inf, mask;
> +   LLVMValueRef rescale_src, i32_roundmask, small_max;
> +   LLVMValueRef i32_qnanbit, shift, res;
> +   LLVMValueRef is_nan_or_inf, nan_or_inf, mask, srci;
>     struct lp_type f32_type = lp_type_float_vec(32, 32 * i32_type.length);
>     struct lp_build_context f32_bld, i32_bld;
>     LLVMValueRef zero = lp_build_const_vec(gallivm, f32_type, 0.0f);
>     unsigned exponent_start = mantissa_start + mantissa_bits;
> +   boolean always_preserve_nans = true;
> +   boolean maybe_correct_denorm_rounding = true;
>
>     lp_build_context_init(&f32_bld, gallivm, f32_type);
>     lp_build_context_init(&i32_bld, gallivm, i32_type);
> @@ -94,35 +96,41 @@ lp_build_float_to_smallfloat(struct gallivm_state *gallivm,
>                                               ((1 << exponent_bits) - 1) << 23);
>     i32_floatexpmask = lp_build_const_int_vec(gallivm, i32_type, 0xff << 23);
>
> -   src_abs = lp_build_abs(&f32_bld, src);
> -   src_abs = LLVMBuildBitCast(builder, src_abs, i32_bld.vec_type, "");
> +   srci = LLVMBuildBitCast(builder, src, i32_bld.vec_type, "");

Let's use src_int instead of srci (the latter suggests an index more than an
integer).

>     if (has_sign) {
> -      rescale_src = src_abs;
> -      infcheck_src = src_abs;
> -      src = LLVMBuildBitCast(builder, src, i32_bld.vec_type, "");
> +      rescale_src = src;
>     }
>     else {
>        /* clamp to pos range (can still have sign bit if NaN or negative zero) */
> -      rescale_src = lp_build_max(&f32_bld, src, zero);
> -      rescale_src = LLVMBuildBitCast(builder, rescale_src, i32_bld.vec_type, "");
> -      src = LLVMBuildBitCast(builder, src, i32_bld.vec_type, "");
> -      infcheck_src = src;
> +      rescale_src = lp_build_max(&f32_bld, zero, src);
>     }
> +   rescale_src = LLVMBuildBitCast(builder, rescale_src, i32_bld.vec_type, "");
>
>     /* ordinary number */
> -   /* get rid of excess mantissa bits, and while here also potential sign bit */
> -   i32_roundmask = lp_build_const_int_vec(gallivm, i32_type,
> -                                          ~((1 << (23 - mantissa_bits)) - 1) &
> -                                          0x7fffffff);
> +   /*
> +    * get rid of excess mantissa bits and sign bit
> +    * This is only really needed for correct rounding of denorms I think
> +    * but only if we use the preserve NaN path does using
> +    * src_abs instead save us any instruction.
> +    */
> +   if (maybe_correct_denorm_rounding || !always_preserve_nans) {
> +      i32_roundmask = lp_build_const_int_vec(gallivm, i32_type,
> +                                             ~((1 << (23 - mantissa_bits)) - 1) &
> +                                             0x7fffffff);
> +      rescale_src = LLVMBuildBitCast(builder, rescale_src, i32_bld.vec_type, "");
> +      rescale_src = lp_build_and(&i32_bld, rescale_src, i32_roundmask);
> +      rescale_src = LLVMBuildBitCast(builder, rescale_src, f32_bld.vec_type, "");
> +   }
> +   else {
> +      rescale_src = lp_build_abs(&f32_bld, src);
> +   }
>
> -   tmp = lp_build_and(&i32_bld, rescale_src, i32_roundmask);
> -   tmp = LLVMBuildBitCast(builder, tmp, f32_bld.vec_type, "");
>     /* bias exponent (and denormalize if necessary) */
>     magic = lp_build_const_int_vec(gallivm, i32_type,
>                                    ((1 << (exponent_bits - 1)) - 1) << 23);
>     magic = LLVMBuildBitCast(builder, magic, f32_bld.vec_type, "");
> -   normal = lp_build_mul(&f32_bld, tmp, magic);
> +   normal = lp_build_mul(&f32_bld, rescale_src, magic);
>
>     /* clamp to max value - largest non-infinity number */
>     small_max = lp_build_const_int_vec(gallivm, i32_type,
> @@ -141,19 +149,66 @@ lp_build_float_to_smallfloat(struct gallivm_state *gallivm,
>      * (Cannot actually save
Re: [Mesa-dev] [PATCH] gallium/hud: do .xxxx swizzling for the font texture in the fragment shader
On 04/01/2013 07:36 PM, Marek Olšák wrote:
> This allows using L8 and R8 for the font if I8 isn't supported.
> ---
>  src/gallium/auxiliary/hud/hud_context.c | 36 +--
>  1 file changed, 30 insertions(+), 6 deletions(-)

Tested-by: Brian Paul <bri...@vmware.com>

[Mesa-dev] [PATCH 1/2] svga: refactor occlusion query code
From: Brian Paul <bri...@vmware.com>

This is in preparation for adding new query types for the HUD.
---
 src/gallium/drivers/svga/svga_pipe_query.c | 218
 1 file changed, 124 insertions(+), 94 deletions(-)

diff --git a/src/gallium/drivers/svga/svga_pipe_query.c b/src/gallium/drivers/svga/svga_pipe_query.c
index 902f84c..b83c7d4 100644
--- a/src/gallium/drivers/svga/svga_pipe_query.c
+++ b/src/gallium/drivers/svga/svga_pipe_query.c
@@ -44,7 +44,10 @@ struct pipe_query {
 struct svga_query {
    struct pipe_query base;
-   SVGA3dQueryType type;
+   unsigned type;                  /**< PIPE_QUERY_x or SVGA_QUERY_x */
+   SVGA3dQueryType svga_type;      /**< SVGA3D_QUERYTYPE_x, or zero */
+
+   /** For PIPE_QUERY_OCCLUSION_COUNTER / SVGA3D_QUERYTYPE_OCCLUSION */
    struct svga_winsys_buffer *hwbuf;
    volatile SVGA3dQueryResult *queryResult;
    struct pipe_fence_handle *fence;
@@ -79,31 +82,35 @@ static struct pipe_query *svga_create_query( struct pipe_context *pipe,
    if (!sq)
       goto no_sq;

-   sq->type = SVGA3D_QUERYTYPE_OCCLUSION;
-
-   sq->hwbuf = svga_winsys_buffer_create(svga,
-                                         1,
-                                         SVGA_BUFFER_USAGE_PINNED,
-                                         sizeof *sq->queryResult);
-   if(!sq->hwbuf)
-      goto no_hwbuf;
-
-   sq->queryResult = (SVGA3dQueryResult *)sws->buffer_map(sws,
-                                                          sq->hwbuf,
-                                                          PIPE_TRANSFER_WRITE);
-   if(!sq->queryResult)
-      goto no_query_result;
-
-   sq->queryResult->totalSize = sizeof *sq->queryResult;
-   sq->queryResult->state = SVGA3D_QUERYSTATE_NEW;
-
-   /*
-    * We request the buffer to be pinned and assume it is always mapped.
-    *
-    * The reason is that we don't want to wait for fences when checking the
-    * query status.
-    */
-   sws->buffer_unmap(sws, sq->hwbuf);
+   switch (query_type) {
+   case PIPE_QUERY_OCCLUSION_COUNTER:
+      sq->svga_type = SVGA3D_QUERYTYPE_OCCLUSION;
+
+      sq->hwbuf = svga_winsys_buffer_create(svga, 1,
+                                            SVGA_BUFFER_USAGE_PINNED,
+                                            sizeof *sq->queryResult);
+      if (!sq->hwbuf)
+         goto no_hwbuf;
+
+      sq->queryResult = (SVGA3dQueryResult *)
+         sws->buffer_map(sws, sq->hwbuf, PIPE_TRANSFER_WRITE);
+      if (!sq->queryResult)
+         goto no_query_result;
+
+      sq->queryResult->totalSize = sizeof *sq->queryResult;
+      sq->queryResult->state = SVGA3D_QUERYSTATE_NEW;
+
+      /* We request the buffer to be pinned and assume it is always mapped.
+       * The reason is that we don't want to wait for fences when checking the
+       * query status.
+       */
+      sws->buffer_unmap(sws, sq->hwbuf);
+      break;
+   default:
+      assert(!"unexpected query type in svga_create_query()");
+   }
+
+   sq->type = query_type;

    return &sq->base;

@@ -123,8 +130,16 @@ static void svga_destroy_query(struct pipe_context *pipe,
    struct svga_query *sq = svga_query( q );

    SVGA_DBG(DEBUG_QUERY, "%s\n", __FUNCTION__);
-   sws->buffer_destroy(sws, sq->hwbuf);
-   sws->fence_reference(sws, &sq->fence, NULL);
+
+   switch (sq->type) {
+   case PIPE_QUERY_OCCLUSION_COUNTER:
+      sws->buffer_destroy(sws, sq->hwbuf);
+      sws->fence_reference(sws, &sq->fence, NULL);
+      break;
+   default:
+      assert(!"svga: unexpected query type in svga_destroy_query()");
+   }
+
    FREE(sq);
 }

@@ -139,39 +154,42 @@ static void svga_begin_query(struct pipe_context *pipe,

    SVGA_DBG(DEBUG_QUERY, "%s\n", __FUNCTION__);

-   assert(!svga->sq);
-
    /* Need to flush out buffered drawing commands so that they don't
     * get counted in the query results.
     */
    svga_hwtnl_flush_retry(svga);

-   if(sq->queryResult->state == SVGA3D_QUERYSTATE_PENDING) {
-      /* The application doesn't care for the pending query result. We cannot
-       * let go the existing buffer and just get a new one because its storage
-       * may be reused for other purposes and clobbered by the host when it
-       * determines the query result. So the only option here is to wait for
-       * the existing query's result -- not a big deal, given that no sane
-       * application would do this.
-       */
-      uint64_t result;
+   switch (sq->type) {
+   case PIPE_QUERY_OCCLUSION_COUNTER:
+      assert(!svga->sq);
+      if (sq->queryResult->state == SVGA3D_QUERYSTATE_PENDING) {
+         /* The application doesn't care for the pending query result. We cannot
+          * let go the existing buffer and just get a new one because its storage
+          * may be reused for other purposes and clobbered by the host when it
+          * determines the query result. So the only option here is to wait for
+          * the existing query's result -- not a big deal, given that no sane
+          * application would do this.
+          */
+         uint64_t
[Mesa-dev] [PATCH 2/2] svga: add HUD queries for number of draw calls, number of fallbacks
From: Brian Paul <bri...@vmware.com>

The fallbacks count is the number of drawing calls that use a draw module
fallback, such as polygon stipple.
---
 src/gallium/drivers/svga/svga_context.h    |    9 +++++++++
 src/gallium/drivers/svga/svga_pipe_draw.c  |    3 +++
 src/gallium/drivers/svga/svga_pipe_query.c |   27 +++++++++++++++++++++++++++
 src/gallium/drivers/svga/svga_screen.c     |   22 ++++++++++++++++++++++
 4 files changed, 61 insertions(+)

diff --git a/src/gallium/drivers/svga/svga_context.h b/src/gallium/drivers/svga/svga_context.h
index 32671ec..e27778e 100644
--- a/src/gallium/drivers/svga/svga_context.h
+++ b/src/gallium/drivers/svga/svga_context.h
@@ -42,6 +42,11 @@
 #include "svga3d_shaderdefs.h"


+/** Non-GPU queries for gallium HUD */
+#define SVGA_QUERY_DRAW_CALLS   (PIPE_QUERY_DRIVER_SPECIFIC + 0)
+#define SVGA_QUERY_FALLBACKS    (PIPE_QUERY_DRIVER_SPECIFIC + 1)
+
+
 struct draw_vertex_shader;
 struct draw_fragment_shader;
 struct svga_shader_result;
@@ -370,6 +375,10 @@ struct svga_context

    /** List of buffers with queued transfers */
    struct list_head dirty_buffers;
+
+   /** performance / info queries */
+   uint64_t num_draw_calls;  /**< SVGA_QUERY_DRAW_CALLS */
+   uint64_t num_fallbacks;   /**< SVGA_QUERY_FALLBACKS */
 };

 /* A flag for each state_tracker state object:
diff --git a/src/gallium/drivers/svga/svga_pipe_draw.c b/src/gallium/drivers/svga/svga_pipe_draw.c
index e72032e..f0da170 100644
--- a/src/gallium/drivers/svga/svga_pipe_draw.c
+++ b/src/gallium/drivers/svga/svga_pipe_draw.c
@@ -330,6 +330,8 @@ svga_draw_vbo(struct pipe_context *pipe, const struct pipe_draw_info *info)
    enum pipe_error ret = 0;
    boolean needed_swtnl;

+   svga->num_draw_calls++;  /* for SVGA_QUERY_DRAW_CALLS */
+
    if (!u_trim_pipe_prim( info->mode, &count ))
       return;

@@ -358,6 +360,7 @@ svga_draw_vbo(struct pipe_context *pipe, const struct pipe_draw_info *info)
 #endif

    if (svga->state.sw.need_swtnl) {
+      svga->num_fallbacks++;  /* for SVGA_QUERY_FALLBACKS */
       if (!needed_swtnl) {
          /*
           * We're switching from HW to SW TNL.  SW TNL will require mapping all
diff --git a/src/gallium/drivers/svga/svga_pipe_query.c b/src/gallium/drivers/svga/svga_pipe_query.c
index b83c7d4..6fa6fac 100644
--- a/src/gallium/drivers/svga/svga_pipe_query.c
+++ b/src/gallium/drivers/svga/svga_pipe_query.c
@@ -51,6 +51,9 @@ struct svga_query {
    struct svga_winsys_buffer *hwbuf;
    volatile SVGA3dQueryResult *queryResult;
    struct pipe_fence_handle *fence;
+
+   /** For non-GPU SVGA_QUERY_x queries */
+   uint64_t begin_count, end_count;
 };

 /***
@@ -106,6 +109,9 @@ static struct pipe_query *svga_create_query( struct pipe_context *pipe,
        */
       sws->buffer_unmap(sws, sq->hwbuf);
       break;
+   case SVGA_QUERY_DRAW_CALLS:
+   case SVGA_QUERY_FALLBACKS:
+      break;
    default:
       assert(!"unexpected query type in svga_create_query()");
    }
@@ -136,6 +142,10 @@ static void svga_destroy_query(struct pipe_context *pipe,
       sws->buffer_destroy(sws, sq->hwbuf);
       sws->fence_reference(sws, &sq->fence, NULL);
       break;
+   case SVGA_QUERY_DRAW_CALLS:
+   case SVGA_QUERY_FALLBACKS:
+      /* nothing */
+      break;
    default:
       assert(!"svga: unexpected query type in svga_destroy_query()");
    }
@@ -187,6 +197,12 @@ static void svga_begin_query(struct pipe_context *pipe,

       svga->sq = sq;
       break;
+   case SVGA_QUERY_DRAW_CALLS:
+      sq->begin_count = svga->num_draw_calls;
+      break;
+   case SVGA_QUERY_FALLBACKS:
+      sq->begin_count = svga->num_fallbacks;
+      break;
    default:
       assert(!"unexpected query type in svga_begin_query()");
    }
@@ -224,6 +240,12 @@ static void svga_end_query(struct pipe_context *pipe,

       svga->sq = NULL;
       break;
+   case SVGA_QUERY_DRAW_CALLS:
+      sq->end_count = svga->num_draw_calls;
+      break;
+   case SVGA_QUERY_FALLBACKS:
+      sq->end_count = svga->num_fallbacks;
+      break;
    default:
       assert(!"unexpected query type in svga_end_query()");
    }
@@ -277,6 +299,11 @@ static boolean svga_get_query_result(struct pipe_context *pipe,

       *result = (uint64_t)sq->queryResult->result32;
       break;
+   case SVGA_QUERY_DRAW_CALLS:
+      /* fall-through */
+   case SVGA_QUERY_FALLBACKS:
+      vresult->u64 = sq->end_count - sq->begin_count;
+      break;
    default:
       assert(!"unexpected query type in svga_get_query_result");
    }
diff --git a/src/gallium/drivers/svga/svga_screen.c b/src/gallium/drivers/svga/svga_screen.c
index 0558a46..70e2fa8 100644
--- a/src/gallium/drivers/svga/svga_screen.c
+++ b/src/gallium/drivers/svga/svga_screen.c
@@ -491,6 +491,27 @@ svga_fence_finish(struct pipe_screen *screen,
 }


+static int
+svga_get_driver_query_info(struct pipe_screen *screen,
+                           unsigned index,
+                           struct pipe_driver_query_info *info)
+{
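
The non-GPU queries in this patch boil down to snapshotting a running counter at begin and end, and reporting the difference. A minimal sketch of that pattern follows; struct ctx, struct counter_query and the three functions are hypothetical names chosen for illustration and are not the svga driver's API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical context with a monotonically increasing event counter. */
struct ctx {
    uint64_t num_draw_calls;
};

/* Hypothetical query object holding two snapshots of the counter. */
struct counter_query {
    uint64_t begin_count, end_count;
};

static void query_begin(const struct ctx *c, struct counter_query *q)
{
    q->begin_count = c->num_draw_calls;
}

static void query_end(const struct ctx *c, struct counter_query *q)
{
    q->end_count = c->num_draw_calls;
}

/* The result is the number of events counted between begin and end. */
static uint64_t query_result(const struct counter_query *q)
{
    assert(q->end_count >= q->begin_count);
    return q->end_count - q->begin_count;
}
```

No GPU round trip or fence is involved in this scheme, which is presumably why the create and destroy paths in the patch have nothing to do for these query types.
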
Re: [Mesa-dev] [PATCH] st/mesa: fix bitmap, drawpix, drawtex for PIPE_CAP_TGSI_TEXCOORD
On 03/30/2013 08:11 AM, Christoph Bumiller wrote:
> NOTE: Changed the semantic index for the drawtex coordinate to be the
> texture unit index instead of always 0.
> Not sure if this is correct but since the value seems to depend on the
> unit it would make sense to use different varying slots.

Tested-by: Brian Paul <bri...@vmware.com>

Re: [Mesa-dev] [PATCH] st/mesa: fix bitmap, drawpix, drawtex for PIPE_CAP_TGSI_TEXCOORD
On 02.04.2013 16:39, Brian Paul wrote:
> On 03/30/2013 08:11 AM, Christoph Bumiller wrote:
>> NOTE: Changed the semantic index for the drawtex coordinate to be the
>> texture unit index instead of always 0.
>> Not sure if this is correct but since the value seems to depend on the
>> unit it would make sense to use different varying slots.
>
> Tested-by: Brian Paul <bri...@vmware.com>

Thanks!

Just to be sure, you're referring to the part that changes the semantic
index so that TEX0..7 (max units) is used instead of always TEX0, right?

I'll push that as a separate patch then.

Re: [Mesa-dev] [PATCH] PowerPC: Altivec IROUND operation
I don't see the need for, or benefit of, mixing iround (i.e., float -> int)
with round (i.e., float -> float).

If this is a one-off, then you should just call

   lp_build_intrinsic_unary(builder, "llvm.ppc.altivec.vctsxs", ...)

If you really need a generic intrinsic helper for iround, then please add a new

   lp_build_iround_foo(..., enum lp_build_round_mode mode)

which takes enum lp_build_round_mode

   LP_BUILD_ROUND_NEAREST  -> iround
   LP_BUILD_ROUND_FLOOR    -> ifloor
   LP_BUILD_ROUND_CEIL     -> iceil
   LP_BUILD_ROUND_TRUNCATE -> itrunc

Jose

----- Original Message -----
> From: Adhemerval Zanella <azane...@linux.vnet.ibm.com>
>
> This adds another rounding mode to the enum, which happens otherwise to
> match SSE4.1's rounding modes. This should be safe as long as the
> IROUND case never hits the SSE4.1 path.
>
> Reviewed-by: Adam Jackson <a...@redhat.com>
> Signed-off-by: Adhemerval Zanella <azane...@linux.vnet.ibm.com>
> ---
>  src/gallium/auxiliary/gallivm/lp_bld_arit.c | 29 +++--
>  1 file changed, 19 insertions(+), 10 deletions(-)
>
> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_arit.c b/src/gallium/auxiliary/gallivm/lp_bld_arit.c
> index ec05026..021cd6e 100644
> --- a/src/gallium/auxiliary/gallivm/lp_bld_arit.c
> +++ b/src/gallium/auxiliary/gallivm/lp_bld_arit.c
> @@ -1360,10 +1360,17 @@ lp_build_int_to_float(struct lp_build_context *bld,
>  static boolean
>  arch_rounding_available(const struct lp_type type)
>  {
> +   /* SSE4 vector rounding. */
>     if ((util_cpu_caps.has_sse4_1 &&
>         (type.length == 1 || type.width*type.length == 128)) ||
>         (util_cpu_caps.has_avx && type.width*type.length == 256))
>        return TRUE;
> +   /* SSE2 vector to word. */
> +   else if ((util_cpu_caps.has_sse2 &&
> +            ((type.width == 32) && (type.length == 1 || type.length == 4))) ||
> +            (util_cpu_caps.has_avx && type.width == 32 && type.length == 8))
> +      return TRUE;
> +   /* Altivec rounding and vector to word. */
>     else if ((util_cpu_caps.has_altivec &&
>              (type.width == 32 && type.length == 4)))
>        return TRUE;
> @@ -1376,7 +1383,8 @@ enum lp_build_round_mode
>     LP_BUILD_ROUND_NEAREST = 0,
>     LP_BUILD_ROUND_FLOOR = 1,
>     LP_BUILD_ROUND_CEIL = 2,
> -   LP_BUILD_ROUND_TRUNCATE = 3
> +   LP_BUILD_ROUND_TRUNCATE = 3,
> +   LP_BUILD_IROUND = 4
>  };
>
>  /**
> @@ -1400,6 +1408,7 @@ lp_build_round_sse41(struct lp_build_context *bld,
>
>     assert(lp_check_value(type, a));
>     assert(util_cpu_caps.has_sse4_1);
> +   assert(mode != LP_BUILD_IROUND);
>
>     if (type.length == 1) {
>        LLVMTypeRef vec_type;
> @@ -1526,8 +1535,6 @@ lp_build_iround_nearest_sse2(struct lp_build_context *bld,
>  }
>
>
> -/*
> - */
>  static INLINE LLVMValueRef
>  lp_build_round_altivec(struct lp_build_context *bld,
>                         LLVMValueRef a,
> @@ -1536,8 +1543,10 @@ lp_build_round_altivec(struct lp_build_context *bld,
>     LLVMBuilderRef builder = bld->gallivm->builder;
>     const struct lp_type type = bld->type;
>     const char *intrinsic = NULL;
> +   LLVMTypeRef ret_type = bld->vec_type;
>
>     assert(type.floating);
> +   assert(type.width == 32);
>
>     assert(lp_check_value(type, a));
>     assert(util_cpu_caps.has_altivec);
> @@ -1555,9 +1564,12 @@ lp_build_round_altivec(struct lp_build_context *bld,
>     case LP_BUILD_ROUND_TRUNCATE:
>        intrinsic = "llvm.ppc.altivec.vrfiz";
>        break;
> +   case LP_BUILD_IROUND:
> +      ret_type = lp_build_int_vec_type(bld->gallivm, bld->type);
> +      intrinsic = "llvm.ppc.altivec.vctsxs";
>     }
>
> -   return lp_build_intrinsic_unary(builder, intrinsic, bld->vec_type, a);
> +   return lp_build_intrinsic_unary(builder, intrinsic, ret_type, a);
>  }
>
>  static INLINE LLVMValueRef
> @@ -1565,7 +1577,9 @@ lp_build_round_arch(struct lp_build_context *bld,
>                      LLVMValueRef a,
>                      enum lp_build_round_mode mode)
>  {
> -   if (util_cpu_caps.has_sse4_1)
> +   if (util_cpu_caps.has_sse2 && (mode == LP_BUILD_IROUND))
> +      return lp_build_iround_nearest_sse2(bld, a);
> +   else if (util_cpu_caps.has_sse4_1)
>        return lp_build_round_sse41(bld, a, mode);
>     else /* (util_cpu_caps.has_altivec) */
>        return lp_build_round_altivec(bld, a, mode);
> @@ -1893,11 +1907,6 @@ lp_build_iround(struct lp_build_context *bld,
>
>     assert(lp_check_value(type, a));
>
> -   if ((util_cpu_caps.has_sse2 &&
> -       ((type.width == 32) && (type.length == 1 || type.length == 4))) ||
> -       (util_cpu_caps.has_avx && type.width == 32 && type.length == 8)) {
> -      return lp_build_iround_nearest_sse2(bld, a);
> -   }
>     if (arch_rounding_available(type)) {
>        res = lp_build_round_arch(bld, a, LP_BUILD_ROUND_NEAREST);
>     }
> --
> 1.7.11.4

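To make the suggested split concrete, here is a scalar sketch of a float -> int helper selected by a rounding-mode enum, kept separate from any float -> float rounding. The enum round_mode and the function float_to_int are hypothetical names, not gallivm code; the sketch assumes the input fits in an int, and its nearest mode rounds halves away from zero, which may differ from what a hardware conversion instruction does.

```c
#include <assert.h>

/* Hypothetical scalar analogue of the four modes listed in the review. */
enum round_mode {
    ROUND_NEAREST,   /* iround */
    ROUND_FLOOR,     /* ifloor */
    ROUND_CEIL,      /* iceil  */
    ROUND_TRUNCATE   /* itrunc */
};

/* Convert x to int using the requested mode. Assumes x is within int range. */
static int float_to_int(float x, enum round_mode mode)
{
    const int t = (int) x;   /* a C cast truncates toward zero */

    switch (mode) {
    case ROUND_TRUNCATE:
        return t;
    case ROUND_FLOOR:
        return ((float) t > x) ? t - 1 : t;
    case ROUND_CEIL:
        return ((float) t < x) ? t + 1 : t;
    case ROUND_NEAREST:
        /* halves round away from zero here; hardware may round to even */
        return (int) (x < 0.0f ? x - 0.5f : x + 0.5f);
    }
    assert(!"unexpected rounding mode");
    return 0;
}
```

Keeping the integer-returning helper separate means the return type never depends on the mode argument, which is the mixing the review objects to.
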
Re: [Mesa-dev] [PATCH] st/mesa: fix bitmap, drawpix, drawtex for PIPE_CAP_TGSI_TEXCOORD
On 04/02/2013 08:43 AM, Christoph Bumiller wrote:
> On 02.04.2013 16:39, Brian Paul wrote:
>> On 03/30/2013 08:11 AM, Christoph Bumiller wrote:
>>> NOTE: Changed the semantic index for the drawtex coordinate to be the
>>> texture unit index instead of always 0.
>>> Not sure if this is correct but since the value seems to depend on the
>>> unit it would make sense to use different varying slots.
>>
>> Tested-by: Brian Paul <bri...@vmware.com>
>
> Thanks!
>
> Just to be sure, you're referring to the part that changes the semantic
> index so that TEX0..7 (max units) is used instead of always TEX0, right?
>
> I'll push that as a separate patch then.

I only tested the patch that's described by the subject line. I don't
recall seeing the other one.

-Brian

Re: [Mesa-dev] [PATCH] llvmpipe: add print for int128
Looks good to me in principle, but I have some remarks (inline) concerning the implementation. Jose - Original Message - From: Adhemerval Zanella azane...@linux.vnet.ibm.com Reviewed-by: Adam Jackson a...@redhat.com Signed-off-by: Adhemerval Zanella azane...@linux.vnet.ibm.com --- src/gallium/auxiliary/gallivm/lp_bld_printf.c | 56 +++ 1 file changed, 39 insertions(+), 17 deletions(-) diff --git a/src/gallium/auxiliary/gallivm/lp_bld_printf.c b/src/gallium/auxiliary/gallivm/lp_bld_printf.c index 7a6bbd9..71c4d1b 100644 --- a/src/gallium/auxiliary/gallivm/lp_bld_printf.c +++ b/src/gallium/auxiliary/gallivm/lp_bld_printf.c @@ -83,33 +83,47 @@ lp_build_print_value(struct gallivm_state *gallivm, LLVMTypeKind type_kind; LLVMTypeRef type_ref; LLVMValueRef params[2 + LP_MAX_VECTOR_LENGTH]; - char type_fmt[6] = %x; + char type_fmt[20]; char format[2 + 5 * LP_MAX_VECTOR_LENGTH + 2] = %s; - unsigned length; + unsigned vecsize; + unsigned nargs; unsigned i; type_ref = LLVMTypeOf(value); type_kind = LLVMGetTypeKind(type_ref); if (type_kind == LLVMVectorTypeKind) { - length = LLVMGetVectorSize(type_ref); + vecsize = LLVMGetVectorSize(type_ref); + nargs = vecsize; type_ref = LLVMGetElementType(type_ref); type_kind = LLVMGetTypeKind(type_ref); + } else if (LLVMGetIntTypeWidth(type_ref) == 128) { + vecsize = 1; + nargs = 2; } else { - length = 1; + vecsize = 1; + nargs = 1; } if (type_kind == LLVMFloatTypeKind || type_kind == LLVMDoubleTypeKind) { - type_fmt[2] = '.'; - type_fmt[3] = '9'; - type_fmt[4] = 'g'; - type_fmt[5] = '\0'; + snprintf(type_fmt, sizeof type_fmt, %%9g); The . is missing here. } else if (type_kind == LLVMIntegerTypeKind) { - if (LLVMGetIntTypeWidth(type_ref) == 8) { - type_fmt[2] = 'u'; - } else { - type_fmt[2] = 'i'; + unsigned typeWidth = LLVMGetIntTypeWidth(type_ref); + if (LLVMGetIntTypeWidth(type_ref) = 32) { + snprintf(type_fmt, sizeof type_fmt, %%x); This doesn't look equivalent to me neither. Please retain previous behavior for integers = 32. 
+ } else if (typeWidth == 64) { +#if __WORDSIZE == 64 Is __WORDSIZE standard? I'm particularly concerned about windows. You could just use if (sizeof (unsigned)) here and avoid magic macro. Or better, you could just use inttypes.h macros, which are also defined for MSVC. + snprintf(type_fmt, sizeof type_fmt, %%016lx); +#else + snprintf(type_fmt, sizeof type_fmt, %%016llx); +#endif + } else if (typeWidth == 128) { +#if __WORDSIZE == 64 + snprintf(type_fmt, sizeof type_fmt, %%016lx%%016lx); +#else + snprintf(type_fmt, sizeof type_fmt, %%016llx%%016llx); +#endif } } else { /* Unsupported type */ @@ -117,14 +131,22 @@ lp_build_print_value(struct gallivm_state *gallivm, } /* Create format string and arguments */ - assert(strlen(format) + strlen(type_fmt) * length + 2 = sizeof format); + assert(strlen(format) + strlen(type_fmt) * nargs + 2 = sizeof format); params[1] = lp_build_const_string(gallivm, msg); - if (length == 1) { + if (vecsize == 1) { util_strncat(format, type_fmt, sizeof(format) - strlen(format) - 1); - params[2] = value; + if (LLVMGetIntTypeWidth(type_ref) = 64) { + params[2] = value; + } else { + LLVMValueRef shift = LLVMConstInt(LLVMIntTypeInContext(gallivm-context, 128), 64, 0); + LLVMValueRef lshr = LLVMBuildLShr(builder, value, shift, ); + LLVMTypeRef type64 = LLVMInt64TypeInContext(gallivm-context); + params[2] = LLVMBuildTrunc(builder, lshr, type64, ); + params[3] = LLVMBuildTrunc(builder, value, type64, ); + } } else { - for (i = 0; i length; ++i) { + for (i = 0; i vecsize; ++i) { LLVMValueRef param; util_strncat(format, type_fmt, sizeof(format) - strlen(format) - 1); param = LLVMBuildExtractElement(builder, value, lp_build_const_int32(gallivm, i), ); @@ -144,7 +166,7 @@ lp_build_print_value(struct gallivm_state *gallivm, util_strncat(format, \n, sizeof(format) - strlen(format) - 1); params[0] = lp_build_const_string(gallivm, format); - return lp_build_print_args(gallivm, 2 + length, params); + return lp_build_print_args(gallivm, 2 + nargs, 
params); } -- 1.7.11.4
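The review's suggestion to use the inttypes.h macros instead of __WORDSIZE can be sketched in plain C. This is only a minimal sketch under stated assumptions, not the code from the patch: the helper names format_u64_hex and format_u128_hex are hypothetical, and the 128-bit value is assumed to arrive already split into two 64-bit halves (as the patch does with a shift and two truncations).

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: print a 64-bit value as 16 zero-padded hex digits.
 * PRIx64 expands to the correct length modifier for the platform, so no
 * __WORDSIZE check is needed. */
static int
format_u64_hex(char *buf, size_t size, uint64_t value)
{
   assert(buf != NULL && size > 0);
   return snprintf(buf, size, "%016" PRIx64, value);
}

/* Hypothetical helper: print a 128-bit value given as high and low
 * 64-bit halves, high half first. */
static int
format_u128_hex(char *buf, size_t size, uint64_t hi, uint64_t lo)
{
   assert(buf != NULL && size > 0);
   return snprintf(buf, size, "%016" PRIx64 "%016" PRIx64, hi, lo);
}
```

Because the format string is assembled by the preprocessor, the same source line should work on 32-bit and 64-bit targets alike.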
[Mesa-dev] [Bug 62868] solaris build broken with missing ffsll
https://bugs.freedesktop.org/show_bug.cgi?id=62868

Brian Paul bri...@vmware.com changed:

           What    |Removed |Added
----------------------------------------
         Status    |NEW     |RESOLVED
     Resolution    |---     |FIXED

--- Comment #2 from Brian Paul bri...@vmware.com ---
Should be fixed with commit 95df2b28831147b3e7ce2a3b6257bf60c46b4ab4
Re: [Mesa-dev] [PATCH] radeon/llvm: Build libradeonllvm as a static library
On Tue, 2013-04-02 at 10:20 +0200, Michel Dänzer wrote:
> On Mon, 2013-04-01 at 14:11 -0700, Tom Stellard wrote:
> > From: Tom Stellard thomas.stell...@amd.com
> >
> > Building libradeonllvm as a shared object has led to a number of bugs
> > and build system complications, and I don't think it's necessary for
> > such a small library.
> >
> > This library was originally changed to a shared object to work around
> > linker error in egl_static.so, but these appear to be fixed now.
> >
> > https://bugs.freedesktop.org/show_bug.cgi?id=62226
> > ---
> >
> > Please test to make sure this works for your build configuration.
>
> Tested-by: Michel Dänzer michel.daen...@amd.com

Retracted, and patch NACKed: I had forgotten I needed to test starting X
with glamor, which this still breaks.

-- 
Earthling Michel Dänzer | http://www.amd.com
Libre software enthusiast | Debian, X and DRI developer
Re: [Mesa-dev] [PATCH] i965: Reduce code duplication in handling of depth, stencil, and HiZ.
On 03/26/2013 09:54 PM, Paul Berry wrote:
> This patch consolidates duplicate code in the brw_depthbuffer and
> gen7_depthbuffer state atoms. Previously, these state atoms contained
> 5 chunks of code for emitting the _3DSTATE_DEPTH_BUFFER packet (3 for
> Gen4-6 and 2 for Gen7). Also a lot of logic for determining the
> appropriate buffer setup was duplicated between the Gen4-6 and Gen7
> functions.
>
> This refactor splits the code into three separate functions:
> brw_emit_depthbuffer(), which determines the appropriate buffer setup
> in a mostly generation-independent way, brw_emit_depth_stencil_hiz(),
> which emits the appropriate state packets for Gen4-6, and
> gen7_emit_depth_stencil_hiz(), which emits the appropriate state
> packets for Gen7.
>
> Tested using Piglit on Gen5-7 (no regressions).

Okay, nice work, you've successfully dealt with the rat's nest. This is
definitely better.

Reviewed-by: Kenneth Graunke kenn...@whitecape.org
Re: [Mesa-dev] [PATCH 7/8] glsl: Don't emit spurious errors for constant indexes of the wrong type
On 04/01/2013 11:25 AM, Ian Romanick wrote:
> From: Ian Romanick ian.d.roman...@intel.com
>
> Previously the shader
>
>     uniform float x[6];
>     void main() { gl_Position.x = x[1.0]; }
>
> would have generated the errors
>
>     0:2(33): error: array index must be integer type
>     0:2(36): error: array index must be < 6
>
> Now only
>
>     0:2(33): error: array index must be integer type
>
> will be generated.
>
> Signed-off-by: Ian Romanick ian.d.roman...@intel.com
> ---
>  src/glsl/ast_array_index.cpp | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/src/glsl/ast_array_index.cpp b/src/glsl/ast_array_index.cpp
> index 486ff55..c7ebcbd 100644
> --- a/src/glsl/ast_array_index.cpp
> +++ b/src/glsl/ast_array_index.cpp
> @@ -58,7 +58,7 @@ _mesa_ast_array_index_to_hir(void *mem_ctx,
>      * declared size.
>      */
>     ir_constant *const const_index = idx->constant_expression_value();
> -   if (const_index != NULL) {
> +   if (const_index != NULL && idx->type->is_integer()) {
>        const int idx = const_index->value.i[0];
>        const char *type_name = "error";
>        unsigned bound = 0;
> @@ -118,7 +118,7 @@ _mesa_ast_array_index_to_hir(void *mem_ctx,
>              check_builtin_array_max_size(v->name, idx+1, loc, state);
>           }
>        }
> -   } else if (array->type->is_array()) {
> +   } else if (const_index == NULL && array->type->is_array()) {
>        if (array->type->array_size() == 0) {
>           _mesa_glsl_error(&loc, state, "unsized array index must be constant");
>        } else if (array->type->fields.array->is_interface()) {

Aww. Patch 6 cleaned this up so nicely, and now it's getting a bit
uglier again. How about simply doing an early-return above:

   if (!idx->type->is_integer()) {
      _mesa_glsl_error(& idx_loc, state, "array index must be integer type");
      return result;
   } else if (!idx->type->is_scalar()) {
      _mesa_glsl_error(& idx_loc, state, "array index must be scalar");
      return result;
   }

Basically, if you hit those errors, you don't want to continue checking
for more of them.
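The early-return idea from the review can be illustrated outside of the GLSL compiler. The sketch below is a simplified stand-in written in C, not Mesa code: struct example_error_log, example_report and example_check_index are hypothetical names, and the bounds message is made up for illustration. It assumes that a non-integer constant index yields a meaningless raw integer value (here the bit pattern of 1.0f), which is what would trigger the second, spurious error if checking continued.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical error sink that counts messages and remembers the last one. */
struct example_error_log {
   int count;
   const char *last;
};

static void
example_report(struct example_error_log *log, const char *msg)
{
   log->count++;
   log->last = msg;
}

/* Validate a constant array index.  Once the type error is reported we
 * return immediately, so the meaningless raw value is never bounds-checked. */
static void
example_check_index(struct example_error_log *log, bool is_integer,
                    int raw_value, int array_size)
{
   if (!is_integer) {
      example_report(log, "array index must be integer type");
      return;
   }

   if (raw_value < 0 || raw_value >= array_size)
      example_report(log, "array index out of bounds");
}
```

With this shape, an index such as x[1.0] should produce exactly one diagnostic, while a genuinely out-of-range integer index still gets its bounds error.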
Re: [Mesa-dev] [PATCH 1/8] glsl: Make check_build_array_max_size externally visible
On 04/01/2013 11:25 AM, Ian Romanick wrote:
> From: Ian Romanick ian.d.roman...@intel.com
>
> A future commit will try to use this function in a different file.
>
> Signed-off-by: Ian Romanick ian.d.roman...@intel.com

Title of patch should be check_builtin_array_max_size (typo). I was
wondering what a check_build_array function would do :)

I'm not really sure where you're going with patch 8, and I had a few
comments on patch 7 (which you can take or leave). But regardless, this
series is:

Reviewed-by: Kenneth Graunke kenn...@whitecape.org
[Mesa-dev] [PATCH 1/3] mesa: Add new ctx->Stencil._WriteEnabled derived state flag.
i965 needs to know whether stencil writes are enabled in several places,
and gets the test wrong sometimes. While we could create a function to
compute this, it seems generally useful enough to warrant a new piece of
derived state. Also, all the plumbing is already in place.

NOTE: This is a candidate for stable branches.

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/main/mtypes.h  | 1 +
 src/mesa/main/stencil.c | 5 +++++
 2 files changed, 6 insertions(+)

diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index ace6938..e731fe3 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -1015,6 +1015,7 @@ struct gl_stencil_attrib
    GLboolean TestTwoSide;       /**< GL_EXT_stencil_two_side */
    GLubyte ActiveFace;          /**< GL_EXT_stencil_two_side (0 or 2) */
    GLboolean _Enabled;          /**< Enabled and stencil buffer present */
+   GLboolean _WriteEnabled;     /**< _Enabled and non-zero writemasks */
    GLboolean _TestTwoSide;
    GLubyte _BackFace;           /**< Current back stencil state (1 or 2) */
    GLenum Function[3];          /**< Stencil function */
diff --git a/src/mesa/main/stencil.c b/src/mesa/main/stencil.c
index c161808..3308417 100644
--- a/src/mesa/main/stencil.c
+++ b/src/mesa/main/stencil.c
@@ -551,6 +551,11 @@ _mesa_update_stencil(struct gl_context *ctx)
        ctx->Stencil.Ref[0] != ctx->Stencil.Ref[face] ||
        ctx->Stencil.ValueMask[0] != ctx->Stencil.ValueMask[face] ||
        ctx->Stencil.WriteMask[0] != ctx->Stencil.WriteMask[face]);
+
+   ctx->Stencil._WriteEnabled =
+      ctx->Stencil._Enabled &&
+      (ctx->Stencil.WriteMask[0] != 0 ||
+       (ctx->Stencil._TestTwoSide && ctx->Stencil.WriteMask[face] != 0));
 }
-- 
1.8.2
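As a rough illustration of the derived flag described above, here is a standalone C sketch. It is not the Mesa implementation: struct example_stencil and example_stencil_write_enabled are hypothetical stand-ins for the relevant gl_stencil_attrib fields, and the back-face index is passed in explicitly as an assumption about how the caller selects it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, trimmed-down stand-in for the stencil state. */
struct example_stencil {
   bool enabled;                /* stencil test on and a stencil buffer present */
   bool test_two_side;          /* back face uses its own state */
   unsigned char write_mask[3]; /* [0] = front, [1] or [2] = back */
};

/* Writes are enabled only if the stencil test is enabled and either the
 * front write mask is non-zero or, with two-sided stencil, the back one is. */
static bool
example_stencil_write_enabled(const struct example_stencil *s,
                              unsigned back_face)
{
   assert(back_face < 3);
   return s->enabled &&
          (s->write_mask[0] != 0 ||
           (s->test_two_side && s->write_mask[back_face] != 0));
}
```

Computing this once when the stencil state changes means every consumer reads the same answer instead of re-deriving it (and possibly getting it wrong).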
[Mesa-dev] [PATCH 2/3] i965: Fix stencil write enable flag in 3DSTATE_DEPTH_BUFFER on Gen7+.
ctx->Stencil.WriteMask is a statically sized array of 3 elements.
Checking it against 0 actually is a NULL check, and can never fail,
which meant that we always said stencil writes were enabled.

Use the new core Mesa derived state flag to fix this.

NOTE: This is a candidate for stable branches.

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/i965/gen7_misc_state.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/gen7_misc_state.c b/src/mesa/drivers/dri/i965/gen7_misc_state.c
index 2009070..1d3677d 100644
--- a/src/mesa/drivers/dri/i965/gen7_misc_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_misc_state.c
@@ -50,7 +50,7 @@ gen7_emit_depth_stencil_hiz(struct brw_context *brw,
    OUT_BATCH((depth_mt ? depth_mt->region->pitch - 1 : 0) |
              (depthbuffer_format << 18) |
              ((hiz_mt ? 1 : 0) << 22) |
-             ((stencil_mt != NULL && ctx->Stencil.WriteMask != 0) << 27) |
+             ((stencil_mt != NULL && ctx->Stencil._WriteEnabled) << 27) |
              ((ctx->Depth.Mask != 0) << 28) |
              (depth_surface_type << 29));
 
-- 
1.8.2
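The pitfall described above (comparing an array against 0) can be reproduced in a few lines of C. This is a sketch with hypothetical names, not driver code. To keep compilers from warning about an always-true comparison, the buggy variant first stores the decayed array in a pointer; the "fixed" variant simply looks at the element values, which is a simplification of what the _WriteEnabled flag computes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in holding only the write masks. */
struct example_masks {
   unsigned char write_mask[3];
};

/* Mirrors the mistake: the array decays to a pointer to its first
 * element, which is never NULL, so this is true for any mask values. */
static bool
example_write_enabled_buggy(const struct example_masks *s)
{
   const unsigned char *mask_ptr = s->write_mask;
   return mask_ptr != NULL;
}

/* Simplified fix: inspect the mask values instead of the array address. */
static bool
example_write_enabled_fixed(const struct example_masks *s)
{
   return s->write_mask[0] != 0 ||
          s->write_mask[1] != 0 ||
          s->write_mask[2] != 0;
}
```

With all masks set to zero, the buggy check still reports writes as enabled, which matches the behaviour the commit message describes.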
[Mesa-dev] [PATCH 3/3] i965: Use ctx->Stencil._WriteEnabled in DEPTH_STENCIL_STATE.
This is the same computation as the _WriteEnabled flag, so we may as
well use it.

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/i965/gen6_depthstencil.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen6_depthstencil.c b/src/mesa/drivers/dri/i965/gen6_depthstencil.c
index 4ea517f..940d91f 100644
--- a/src/mesa/drivers/dri/i965/gen6_depthstencil.c
+++ b/src/mesa/drivers/dri/i965/gen6_depthstencil.c
@@ -74,11 +74,7 @@ gen6_upload_depth_stencil_state(struct brw_context *brw)
          ds->ds1.bf_stencil_test_mask = ctx->Stencil.ValueMask[back];
       }
 
-      /* Not really sure about this:
-       */
-      if (ctx->Stencil.WriteMask[0] ||
-          (ctx->Stencil._TestTwoSide && ctx->Stencil.WriteMask[back]))
-         ds->ds0.stencil_write_enable = 1;
+      ds->ds0.stencil_write_enable = ctx->Stencil._WriteEnabled;
    }
 
    /* _NEW_DEPTH */
-- 
1.8.2
Re: [Mesa-dev] [PATCH] glsl: Fix array indexing when constant folding built-in functions.
On 1 April 2013 11:43, Kenneth Graunke kenn...@whitecape.org wrote:
> On 04/01/2013 11:30 AM, Ian Romanick wrote:
> > On 03/29/2013 02:13 PM, Paul Berry wrote:
> > > Mesa constant-folds built-in functions by using a miniature GLSL
> > > interpreter (see
> > > ir_function_signature::constant_expression_evaluate_expression_list()).
> > > This interpreter had a bug in its handling of array indexing, which
> > > caused expressions like m[i][j] (where m is a matrix) to be handled
> > > incorrectly. Specifically, it incorrectly treated j as indexing into
> > > the whole matrix (rather than indexing just into the vector m[i]);
> > > as a result the offset computed for m[i] was lost and m[i][j] was
> > > treated as m[j][0].
> > >
> > > Fixes piglit tests inverse-mat[234].{vert,frag}.
> > >
> > > NOTE: This is a candidate for the 9.1 branch.
> >
> > Good catch. The test case fails only in 9.1 and later because it
> > requires OpenGL 3.1, but I think the bug exists in earlier versions.

I'm glad you mentioned this because it prompted me to investigate
further. It turns out that the test case *does* fail in 9.0, because 9.0
supports OpenGL 3.1. The bug doesn't exist in earlier versions, since
the miniature GLSL interpreter technique was implemented in the 9.0
timeframe. I'll update the note to mark this patch as a candidate for
9.1 and 9.0.

> > Reviewed-by: Ian Romanick ian.d.roman...@intel.com
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=57436
>
> I already pushed my work-around to master (but not to 9.1). You can
> revert it when you push this change if you like. I would like to keep
> the change to use dot(), as that seems like an actual improvement. For
> the other changes, I guess I don't have a strong preference.
>
> --Ken

Sounds reasonable to me. Will do.
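The indexing bug discussed in this thread can be pictured with a flat, column-major matrix. The sketch below is an illustration in C under that storage assumption, not the interpreter code itself; example_correct_offset and example_buggy_offset are hypothetical names.

```c
#include <assert.h>

/* Component offset of m[i][j] in a flat column-major matrix with `rows`
 * components per column: m[i] selects column i, then [j] indexes into
 * that vector. */
static unsigned
example_correct_offset(unsigned rows, unsigned i, unsigned j)
{
   assert(j < rows);
   return i * rows + j;
}

/* What the bug amounted to, per the description in the thread: the
 * offset for m[i] is dropped and j is applied to the whole matrix, so
 * the result is the offset of m[j][0]. */
static unsigned
example_buggy_offset(unsigned rows, unsigned i, unsigned j)
{
   assert(j < rows);
   (void) i; /* the column offset is lost */
   return j * rows;
}
```

For a 3x3 matrix, m[1][2] should land on component 5, whereas the buggy computation lands on component 6, which is m[2][0]. The two only agree when i and j are both zero, which may be why simple tests did not catch it.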
[Mesa-dev] [PATCH 1/5] draw/gs: cleanup some debugging code
Signed-off-by: Zack Rusin za...@vmware.com
---
 src/gallium/auxiliary/draw/draw_gs.c |    4 ----
 1 file changed, 4 deletions(-)

diff --git a/src/gallium/auxiliary/draw/draw_gs.c b/src/gallium/auxiliary/draw/draw_gs.c
index b98b133..70db837 100644
--- a/src/gallium/auxiliary/draw/draw_gs.c
+++ b/src/gallium/auxiliary/draw/draw_gs.c
@@ -160,8 +160,6 @@ static void tgsi_fetch_gs_input(struct draw_geometry_shader *shader,
 #if DEBUG_INPUTS
             debug_printf("\tSlot = %d, vs_slot = %d, idx = %d:\n",
                          slot, vs_slot, idx);
-#endif
-#if 1
             assert(!util_is_inf_or_nan(input[vs_slot][0]));
             assert(!util_is_inf_or_nan(input[vs_slot][1]));
             assert(!util_is_inf_or_nan(input[vs_slot][2]));
@@ -249,8 +247,6 @@ llvm_fetch_gs_input(struct draw_geometry_shader *shader,
 #if DEBUG_INPUTS
             debug_printf("\tSlot = %d, vs_slot = %d, i = %d:\n",
                          slot, vs_slot, i);
-#endif
-#if 0
             assert(!util_is_inf_or_nan(input[vs_slot][0]));
             assert(!util_is_inf_or_nan(input[vs_slot][1]));
             assert(!util_is_inf_or_nan(input[vs_slot][2]));
-- 
1.7.10.4
[Mesa-dev] [PATCH 2/5] draw/llvm: use an enum instead of magic numbers
I think this was there before and got accidently removed during a merge. Same code as for the GS context, which is also using an enum instead of hardcoded numbers. Signed-off-by: Zack Rusin za...@vmware.com --- src/gallium/auxiliary/draw/draw_llvm.c |8 src/gallium/auxiliary/draw/draw_llvm.h | 17 +++-- 2 files changed, 15 insertions(+), 10 deletions(-) diff --git a/src/gallium/auxiliary/draw/draw_llvm.c b/src/gallium/auxiliary/draw/draw_llvm.c index d0199bb..5100ce0 100644 --- a/src/gallium/auxiliary/draw/draw_llvm.c +++ b/src/gallium/auxiliary/draw/draw_llvm.c @@ -203,7 +203,7 @@ create_jit_context_type(struct gallivm_state *gallivm, { LLVMTargetDataRef target = gallivm-target; LLVMTypeRef float_type = LLVMFloatTypeInContext(gallivm-context); - LLVMTypeRef elem_types[5]; + LLVMTypeRef elem_types[DRAW_JIT_CTX_NUM_FIELDS]; LLVMTypeRef context_type; elem_types[0] = LLVMArrayType(LLVMPointerType(float_type, 0), /* vs_constants */ @@ -224,11 +224,11 @@ create_jit_context_type(struct gallivm_state *gallivm, #endif LP_CHECK_MEMBER_OFFSET(struct draw_jit_context, vs_constants, - target, context_type, 0); + target, context_type, DRAW_JIT_CTX_CONSTANTS); LP_CHECK_MEMBER_OFFSET(struct draw_jit_context, planes, - target, context_type, 1); + target, context_type, DRAW_JIT_CTX_PLANES); LP_CHECK_MEMBER_OFFSET(struct draw_jit_context, viewport, - target, context_type, 2); + target, context_type, DRAW_JIT_CTX_VIEWPORT); LP_CHECK_MEMBER_OFFSET(struct draw_jit_context, textures, target, context_type, DRAW_JIT_CTX_TEXTURES); diff --git a/src/gallium/auxiliary/draw/draw_llvm.h b/src/gallium/auxiliary/draw/draw_llvm.h index 8df02a2..5909fc1 100644 --- a/src/gallium/auxiliary/draw/draw_llvm.h +++ b/src/gallium/auxiliary/draw/draw_llvm.h @@ -130,18 +130,23 @@ struct draw_jit_context struct draw_jit_sampler samplers[PIPE_MAX_SAMPLERS]; }; +enum { + DRAW_JIT_CTX_CONSTANTS = 0, + DRAW_JIT_CTX_PLANES = 1, + DRAW_JIT_CTX_VIEWPORT= 2, + DRAW_JIT_CTX_TEXTURES= 3, + DRAW_JIT_CTX_SAMPLERS= 4, + 
DRAW_JIT_CTX_NUM_FIELDS +}; #define draw_jit_context_vs_constants(_gallivm, _ptr) \ - lp_build_struct_get_ptr(_gallivm, _ptr, 0, vs_constants) + lp_build_struct_get_ptr(_gallivm, _ptr, DRAW_JIT_CTX_CONSTANTS, vs_constants) #define draw_jit_context_planes(_gallivm, _ptr) \ - lp_build_struct_get(_gallivm, _ptr, 1, planes) + lp_build_struct_get(_gallivm, _ptr, DRAW_JIT_CTX_PLANES, planes) #define draw_jit_context_viewport(_gallivm, _ptr) \ - lp_build_struct_get(_gallivm, _ptr, 2, viewport) - -#define DRAW_JIT_CTX_TEXTURES 3 -#define DRAW_JIT_CTX_SAMPLERS 4 + lp_build_struct_get(_gallivm, _ptr, DRAW_JIT_CTX_VIEWPORT, viewport) #define draw_jit_context_textures(_gallivm, _ptr) \ lp_build_struct_get_ptr(_gallivm, _ptr, DRAW_JIT_CTX_TEXTURES, textures) -- 1.7.10.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
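The pattern used in this patch (an enum of field indices with a trailing count entry) can be sketched on its own. The names below are hypothetical and only mirror the shape of the DRAW_JIT_CTX_* enum; the name table is an invented example of something that can be sized from the count entry.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical field indices; the last entry is not a field but the
 * number of fields, so it must stay at the end. */
enum example_ctx_field {
   EXAMPLE_CTX_CONSTANTS = 0,
   EXAMPLE_CTX_PLANES,
   EXAMPLE_CTX_VIEWPORT,
   EXAMPLE_CTX_TEXTURES,
   EXAMPLE_CTX_SAMPLERS,
   EXAMPLE_CTX_NUM_FIELDS
};

/* Sized by the enum, so adding a field grows the table automatically. */
static const char *const example_field_names[EXAMPLE_CTX_NUM_FIELDS] = {
   [EXAMPLE_CTX_CONSTANTS] = "vs_constants",
   [EXAMPLE_CTX_PLANES]    = "planes",
   [EXAMPLE_CTX_VIEWPORT]  = "viewport",
   [EXAMPLE_CTX_TEXTURES]  = "textures",
   [EXAMPLE_CTX_SAMPLERS]  = "samplers",
};

static const char *
example_field_name(enum example_ctx_field field)
{
   assert(field < EXAMPLE_CTX_NUM_FIELDS);
   return example_field_names[field];
}
```

Compared with bare 0, 1, 2 literals, the index used to build the struct type and the index used to access a member can no longer drift apart silently.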
[Mesa-dev] [PATCH 3/5] draw: remove unused function
we use draw_set_mapped_so_targets nowadays

Signed-off-by: Zack Rusin za...@vmware.com
---
 src/gallium/auxiliary/draw/draw_context.c |    7 -------
 src/gallium/auxiliary/draw/draw_context.h |    5 -----
 2 files changed, 12 deletions(-)

diff --git a/src/gallium/auxiliary/draw/draw_context.c b/src/gallium/auxiliary/draw/draw_context.c
index ceb74df..bb56f1b 100644
--- a/src/gallium/auxiliary/draw/draw_context.c
+++ b/src/gallium/auxiliary/draw/draw_context.c
@@ -735,13 +735,6 @@ draw_set_mapped_so_targets(struct draw_context *draw,
 }
 
 void
-draw_set_mapped_so_buffers(struct draw_context *draw,
-                           void *buffers[PIPE_MAX_SO_BUFFERS],
-                           unsigned num_buffers)
-{
-}
-
-void
 draw_set_so_state(struct draw_context *draw,
                   struct pipe_stream_output_info *state)
 {
diff --git a/src/gallium/auxiliary/draw/draw_context.h b/src/gallium/auxiliary/draw/draw_context.h
index b333457..426fd44 100644
--- a/src/gallium/auxiliary/draw/draw_context.h
+++ b/src/gallium/auxiliary/draw/draw_context.h
@@ -222,11 +222,6 @@ draw_set_mapped_constant_buffer(struct draw_context *draw,
                                 unsigned size);
 
 void
-draw_set_mapped_so_buffers(struct draw_context *draw,
-                           void *buffers[PIPE_MAX_SO_BUFFERS],
-                           unsigned num_buffers);
-
-void
 draw_set_mapped_so_targets(struct draw_context *draw,
                            int num_targets,
                            struct draw_so_target *targets[PIPE_MAX_SO_BUFFERS]);
-- 
1.7.10.4
[Mesa-dev] [PATCH 4/5] llvmpipe: reset so buffers when not appending
We need to reset the internal state of the so buffers or we'll keep
appending even though we're not supposed to.

Signed-off-by: Zack Rusin za...@vmware.com
---
 src/gallium/drivers/llvmpipe/lp_state_so.c |    6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/src/gallium/drivers/llvmpipe/lp_state_so.c b/src/gallium/drivers/llvmpipe/lp_state_so.c
index 58bab39..fa58f79 100644
--- a/src/gallium/drivers/llvmpipe/lp_state_so.c
+++ b/src/gallium/drivers/llvmpipe/lp_state_so.c
@@ -70,6 +70,12 @@ llvmpipe_set_so_targets(struct pipe_context *pipe,
    int i;
    for (i = 0; i < num_targets; i++) {
       pipe_so_target_reference((struct pipe_stream_output_target **)&llvmpipe->so_targets[i], targets[i]);
+      /* if we're not appending then lets reset the internal
+         data of our so target */
+      if (!(append_bitmask & (1 << i)) && llvmpipe->so_targets[i]) {
+         llvmpipe->so_targets[i]->internal_offset = 0;
+         llvmpipe->so_targets[i]->emitted_vertices = 0;
+      }
    }
 
    for (; i < llvmpipe->num_so_targets; i++) {
-- 
1.7.10.4
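The per-target reset keyed on the append bitmask can be tried in isolation. The following C sketch uses hypothetical types and names (struct example_so_target, example_set_so_targets) that only imitate the two fields touched by the patch; reference counting and everything else the real function does is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the bookkeeping of a stream output target. */
struct example_so_target {
   unsigned internal_offset;
   unsigned emitted_vertices;
};

/* Bit i of append_bitmask set means "keep appending to target i".
 * Every other bound (non-NULL) target starts over from offset zero. */
static void
example_set_so_targets(struct example_so_target **targets,
                       unsigned num_targets, unsigned append_bitmask)
{
   unsigned i;

   for (i = 0; i < num_targets; i++) {
      if (!(append_bitmask & (1u << i)) && targets[i] != NULL) {
         targets[i]->internal_offset = 0;
         targets[i]->emitted_vertices = 0;
      }
   }
}
```

For example, with two bound targets and a bitmask of 0x2, only the first target is reset and the second keeps its offset and vertex count.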
[Mesa-dev] [PATCH 5/5] draw/llvmpipe: allow independent so attachments to the vs
When geometry shaders are present, one needs to be able to create an empty geometry shader with stream output that needs to be resolved later and attached to the currently bound vertex shader. Lets add support for it to llvmpipe and draw. draw allows attaching independent stream output info to any vertex shader and llvmpipe resolves at draw time which vertex shader the given empty geometry shader should be linked to. Signed-off-by: Zack Rusin za...@vmware.com --- src/gallium/auxiliary/draw/draw_context.c |9 - src/gallium/auxiliary/draw/draw_context.h |7 +++ src/gallium/auxiliary/draw/draw_private.h |1 - src/gallium/auxiliary/draw/draw_vs.c | 13 + src/gallium/drivers/llvmpipe/lp_draw_arrays.c | 15 +++ src/gallium/drivers/llvmpipe/lp_state_gs.c| 21 - 6 files changed, 43 insertions(+), 23 deletions(-) diff --git a/src/gallium/auxiliary/draw/draw_context.c b/src/gallium/auxiliary/draw/draw_context.c index bb56f1b..2fb9bac 100644 --- a/src/gallium/auxiliary/draw/draw_context.c +++ b/src/gallium/auxiliary/draw/draw_context.c @@ -735,15 +735,6 @@ draw_set_mapped_so_targets(struct draw_context *draw, } void -draw_set_so_state(struct draw_context *draw, - struct pipe_stream_output_info *state) -{ - memcpy(draw-so.state, - state, - sizeof(struct pipe_stream_output_info)); -} - -void draw_set_sampler_views(struct draw_context *draw, unsigned shader_stage, struct pipe_sampler_view **views, diff --git a/src/gallium/auxiliary/draw/draw_context.h b/src/gallium/auxiliary/draw/draw_context.h index 426fd44..1d25b7f 100644 --- a/src/gallium/auxiliary/draw/draw_context.h +++ b/src/gallium/auxiliary/draw/draw_context.h @@ -171,6 +171,9 @@ void draw_bind_vertex_shader(struct draw_context *draw, struct draw_vertex_shader *dvs); void draw_delete_vertex_shader(struct draw_context *draw, struct draw_vertex_shader *dvs); +void draw_vs_attach_so(struct draw_vertex_shader *dvs, + const struct pipe_stream_output_info *info); +void draw_vs_reset_so(struct draw_vertex_shader *dvs); /* @@ -226,10 
+229,6 @@ draw_set_mapped_so_targets(struct draw_context *draw, int num_targets, struct draw_so_target *targets[PIPE_MAX_SO_BUFFERS]); -void -draw_set_so_state(struct draw_context *draw, - struct pipe_stream_output_info *state); - /*** * draw_pt.c diff --git a/src/gallium/auxiliary/draw/draw_private.h b/src/gallium/auxiliary/draw/draw_private.h index 5063c3c..757ed26 100644 --- a/src/gallium/auxiliary/draw/draw_private.h +++ b/src/gallium/auxiliary/draw/draw_private.h @@ -279,7 +279,6 @@ struct draw_context /** Stream output (vertex feedback) state */ struct { - struct pipe_stream_output_info state; struct draw_so_target *targets[PIPE_MAX_SO_BUFFERS]; uint num_targets; } so; diff --git a/src/gallium/auxiliary/draw/draw_vs.c b/src/gallium/auxiliary/draw/draw_vs.c index 266cca7..afec376 100644 --- a/src/gallium/auxiliary/draw/draw_vs.c +++ b/src/gallium/auxiliary/draw/draw_vs.c @@ -245,3 +245,16 @@ draw_vs_get_emit( struct draw_context *draw, return draw-vs.emit; } + +void +draw_vs_attach_so(struct draw_vertex_shader *dvs, + const struct pipe_stream_output_info *info) +{ + dvs-state.stream_output = *info; +} + +void +draw_vs_reset_so(struct draw_vertex_shader *dvs) +{ + memset(dvs-state.stream_output, 0, sizeof(dvs-state.stream_output)); +} diff --git a/src/gallium/drivers/llvmpipe/lp_draw_arrays.c b/src/gallium/drivers/llvmpipe/lp_draw_arrays.c index ae00c49..efeca25 100644 --- a/src/gallium/drivers/llvmpipe/lp_draw_arrays.c +++ b/src/gallium/drivers/llvmpipe/lp_draw_arrays.c @@ -101,6 +101,13 @@ llvmpipe_draw_vbo(struct pipe_context *pipe, const struct pipe_draw_info *info) llvmpipe_prepare_geometry_sampling(lp, lp-num_sampler_views[PIPE_SHADER_GEOMETRY], lp-sampler_views[PIPE_SHADER_GEOMETRY]); + if (lp-gs !lp-gs-shader.tokens) { + /* we have an empty geometry shader with stream output, so + attach the stream output info to the current vertex shader */ + if (lp-vs) { + draw_vs_attach_so(lp-vs-draw_data, lp-gs-shader.stream_output); + } + } /* draw! 
*/ draw_vbo(draw, info); @@ -116,6 +123,14 @@ llvmpipe_draw_vbo(struct pipe_context *pipe, const struct pipe_draw_info *info) } draw_set_mapped_so_targets(draw, 0, NULL); + if (lp-gs !lp-gs-shader.tokens) { + /* we have attached stream output to the vs for rendering, + now lets reset it */ + if (lp-vs) { +
[Mesa-dev] [PATCH 1/2] i965: Turn brw-urb.vs_size and gs_size into local variables.
These variables are only used within a single function, so we may as well make them local variables. Signed-off-by: Kenneth Graunke kenn...@whitecape.org --- src/mesa/drivers/dri/i965/brw_context.h | 9 - src/mesa/drivers/dri/i965/gen6_urb.c| 18 +- src/mesa/drivers/dri/i965/gen7_urb.c| 7 +++ 3 files changed, 12 insertions(+), 22 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h index ea5b62a..d3a5042 100644 --- a/src/mesa/drivers/dri/i965/brw_context.h +++ b/src/mesa/drivers/dri/i965/brw_context.h @@ -859,15 +859,6 @@ struct brw_context GLuint nr_sf_entries; GLuint nr_cs_entries; - /* gen6: - * The length of each URB entry owned by the VS (or GS), as - * a number of 1024-bit (128-byte) rows. Should be = 1. - * - * gen7: Same meaning, but in 512-bit (64-byte) rows. - */ - GLuint vs_size; - GLuint gs_size; - GLuint vs_start; GLuint gs_start; GLuint clip_start; diff --git a/src/mesa/drivers/dri/i965/gen6_urb.c b/src/mesa/drivers/dri/i965/gen6_urb.c index 2d69cbe..aa985de 100644 --- a/src/mesa/drivers/dri/i965/gen6_urb.c +++ b/src/mesa/drivers/dri/i965/gen6_urb.c @@ -54,7 +54,7 @@ gen6_upload_urb( struct brw_context *brw ) int total_urb_size = brw-urb.size * 1024; /* in bytes */ /* CACHE_NEW_VS_PROG */ - brw-urb.vs_size = MAX2(brw-vs.prog_data-urb_entry_size, 1); + unsigned vs_size = MAX2(brw-vs.prog_data-urb_entry_size, 1); /* We use the same VUE layout for VS outputs and GS outputs (as it's what * the SF and Clipper expect), so we can simply make the GS URB entry size @@ -62,14 +62,14 @@ gen6_upload_urb( struct brw_context *brw ) * where we have few vertex attributes and a lot of varyings, since the VS * size is determined by the larger of the two. For now, it's safe. 
*/ - brw-urb.gs_size = brw-urb.vs_size; + unsigned gs_size = vs_size; /* Calculate how many entries fit in each stage's section of the URB */ if (brw-gs.prog_active) { - nr_vs_entries = (total_urb_size/2) / (brw-urb.vs_size * 128); - nr_gs_entries = (total_urb_size/2) / (brw-urb.gs_size * 128); + nr_vs_entries = (total_urb_size/2) / (vs_size * 128); + nr_gs_entries = (total_urb_size/2) / (gs_size * 128); } else { - nr_vs_entries = total_urb_size / (brw-urb.vs_size * 128); + nr_vs_entries = total_urb_size / (vs_size * 128); nr_gs_entries = 0; } @@ -87,14 +87,14 @@ gen6_upload_urb( struct brw_context *brw ) assert(brw-urb.nr_vs_entries = 24); assert(brw-urb.nr_vs_entries % 4 == 0); assert(brw-urb.nr_gs_entries % 4 == 0); - assert(brw-urb.vs_size 5); - assert(brw-urb.gs_size 5); + assert(vs_size 5); + assert(gs_size 5); BEGIN_BATCH(3); OUT_BATCH(_3DSTATE_URB 16 | (3 - 2)); - OUT_BATCH(((brw-urb.vs_size - 1) GEN6_URB_VS_SIZE_SHIFT) | + OUT_BATCH(((vs_size - 1) GEN6_URB_VS_SIZE_SHIFT) | ((brw-urb.nr_vs_entries) GEN6_URB_VS_ENTRIES_SHIFT)); - OUT_BATCH(((brw-urb.gs_size - 1) GEN6_URB_GS_SIZE_SHIFT) | + OUT_BATCH(((gs_size - 1) GEN6_URB_GS_SIZE_SHIFT) | ((brw-urb.nr_gs_entries) GEN6_URB_GS_ENTRIES_SHIFT)); ADVANCE_BATCH(); diff --git a/src/mesa/drivers/dri/i965/gen7_urb.c b/src/mesa/drivers/dri/i965/gen7_urb.c index 481497b..dafe1ad 100644 --- a/src/mesa/drivers/dri/i965/gen7_urb.c +++ b/src/mesa/drivers/dri/i965/gen7_urb.c @@ -82,9 +82,9 @@ gen7_upload_urb(struct brw_context *brw) int handle_region_size = (brw-urb.size - 16) * 1024; /* bytes */ /* CACHE_NEW_VS_PROG */ - brw-urb.vs_size = MAX2(brw-vs.prog_data-urb_entry_size, 1); + unsigned vs_size = MAX2(brw-vs.prog_data-urb_entry_size, 1); - int nr_vs_entries = handle_region_size / (brw-urb.vs_size * 64); + int nr_vs_entries = handle_region_size / (vs_size * 64); if (nr_vs_entries brw-urb.max_vs_entries) nr_vs_entries = brw-urb.max_vs_entries; @@ -100,8 +100,7 @@ gen7_upload_urb(struct brw_context *brw) 
assert(!brw-gs.prog_active); gen7_emit_vs_workaround_flush(intel); - gen7_emit_urb_state(brw, brw-urb.nr_vs_entries, brw-urb.vs_size, - brw-urb.vs_start); + gen7_emit_urb_state(brw, brw-urb.nr_vs_entries, vs_size, brw-urb.vs_start); } void -- 1.8.1.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/2] i965: Use a variable for the push constant size in kB.
This clarifies that the offset of 2 is actually 16 kB / 8kB units. It
also keys both computations off of a single variable, which should make
it easier to change in the future.

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/i965/gen7_urb.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen7_urb.c b/src/mesa/drivers/dri/i965/gen7_urb.c
index dafe1ad..5ac3885 100644
--- a/src/mesa/drivers/dri/i965/gen7_urb.c
+++ b/src/mesa/drivers/dri/i965/gen7_urb.c
@@ -78,8 +78,9 @@ static void
 gen7_upload_urb(struct brw_context *brw)
 {
    struct intel_context *intel = &brw->intel;
+   const int push_size_kB = 16;
    /* Total space for entries is URB size - 16kB for push constants */
-   int handle_region_size = (brw->urb.size - 16) * 1024; /* bytes */
+   int handle_region_size = (brw->urb.size - push_size_kB) * 1024; /* bytes */
 
    /* CACHE_NEW_VS_PROG */
    unsigned vs_size = MAX2(brw->vs.prog_data->urb_entry_size, 1);
@@ -92,7 +93,7 @@ gen7_upload_urb(struct brw_context *brw)
    brw->urb.nr_vs_entries = ROUND_DOWN_TO(nr_vs_entries, 8);
 
    /* URB Starting Addresses are specified in multiples of 8kB. */
-   brw->urb.vs_start = 2; /* skip over push constants */
+   brw->urb.vs_start = push_size_kB / 8; /* skip over push constants */
 
    assert(brw->urb.nr_vs_entries % 8 == 0);
    assert(brw->urb.nr_gs_entries % 8 == 0);
-- 
1.8.1.1
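The arithmetic behind the two values derived from push_size_kB can be checked with a small C sketch. The function names and the 128 kB URB size used in the note below are assumptions for illustration only; the 8 kB start-address unit and the 16 kB push constant size come from the patch.

```c
#include <assert.h>

/* URB starting addresses are expressed in 8 kB units (from the patch). */
enum { EXAMPLE_URB_START_UNIT_KB = 8 };

/* Bytes left for URB entries once the push constant space is set aside.
 * Hypothetical helper; sizes are given in kB. */
static int
example_handle_region_bytes(int urb_size_kb, int push_size_kb)
{
   assert(urb_size_kb >= push_size_kb);
   return (urb_size_kb - push_size_kb) * 1024;
}

/* Start of the VS section, in 8 kB units, skipping the push constants. */
static int
example_vs_start(int push_size_kb)
{
   assert(push_size_kb % EXAMPLE_URB_START_UNIT_KB == 0);
   return push_size_kb / EXAMPLE_URB_START_UNIT_KB;
}
```

With a push constant size of 16 kB the VS start comes out as 2, matching the constant the patch replaces, and an assumed 128 kB URB would leave 112 kB (114688 bytes) for entries.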
[Mesa-dev] [PATCH] register_allocate: Fix the type of best_benefit.
---
 src/mesa/program/register_allocate.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/src/mesa/program/register_allocate.c b/src/mesa/program/register_allocate.c
index a9064c3..7d11b73 100644
--- a/src/mesa/program/register_allocate.c
+++ b/src/mesa/program/register_allocate.c
@@ -561,7 +561,7 @@ int
 ra_get_best_spill_node(struct ra_graph *g)
 {
    unsigned int best_node = -1;
-   unsigned int best_benefit = 0.0;
+   float best_benefit = 0.0;
    unsigned int n;
 
    for (n = 0; n < g->count; n++) {
-- 
1.7.8.6
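To see why the type of best_benefit matters, here is a small C sketch of a "pick the best candidate" loop. It is an assumption about the general shape of such a loop, not a copy of ra_get_best_spill_node(), whose body is not shown in the patch; both function names are hypothetical.

```c
#include <assert.h>

/* Running maximum kept as a float: the first strictly largest value wins. */
static int
example_best_node_float(const float *benefit, unsigned count)
{
   float best_benefit = 0.0f;
   int best_node = -1;
   unsigned n;

   for (n = 0; n < count; n++) {
      if (benefit[n] > best_benefit) {
         best_benefit = benefit[n];
         best_node = (int) n;
      }
   }
   return best_node;
}

/* Running maximum kept as an unsigned int: each stored value loses its
 * fractional part, so a later, smaller benefit can still compare greater. */
static int
example_best_node_truncating(const float *benefit, unsigned count)
{
   unsigned int best_benefit = 0;
   int best_node = -1;
   unsigned n;

   for (n = 0; n < count; n++) {
      if (benefit[n] > best_benefit) {
         best_benefit = (unsigned int) benefit[n]; /* 2.9 becomes 2 */
         best_node = (int) n;
      }
   }
   return best_node;
}
```

Given the benefits 2.9 and 2.5, the float version picks the first node, while the truncating version stores 2 and then lets 2.5 win. So the change can plausibly alter which node gets spilled whenever benefits are not whole numbers.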
[Mesa-dev] [PATCH] r600g: Fix UMAD on Cayman
The multiplication part of tgsi_umad did not work on Cayman, because it did not populate the correct vector slots. --- src/gallium/drivers/r600/r600_shader.c | 45 -- 1 file changed, 32 insertions(+), 13 deletions(-) diff --git a/src/gallium/drivers/r600/r600_shader.c b/src/gallium/drivers/r600/r600_shader.c index 82885d1..6c4cc8f 100644 --- a/src/gallium/drivers/r600/r600_shader.c +++ b/src/gallium/drivers/r600/r600_shader.c @@ -5840,7 +5840,7 @@ static int tgsi_umad(struct r600_shader_ctx *ctx) { struct tgsi_full_instruction *inst = ctx-parse.FullToken.FullInstruction; struct r600_bytecode_alu alu; - int i, j, r; + int i, j, k, r; int lasti = tgsi_last_instruction(inst-Dst[0].Register.WriteMask); /* src0 * src1 */ @@ -5848,21 +5848,40 @@ static int tgsi_umad(struct r600_shader_ctx *ctx) if (!(inst-Dst[0].Register.WriteMask (1 i))) continue; - memset(alu, 0, sizeof(struct r600_bytecode_alu)); + if (ctx-bc-chip_class == CAYMAN) { + for (j = 0 ; j 4; j++) { + memset(alu, 0, sizeof(struct r600_bytecode_alu)); - alu.dst.chan = i; - alu.dst.sel = ctx-temp_reg; - alu.dst.write = 1; + alu.op = ALU_OP2_MULLO_UINT; + for (k = 0; k inst-Instruction.NumSrcRegs; k++) { + r600_bytecode_src(alu.src[k], ctx-src[k], i); + } + tgsi_dst(ctx, inst-Dst[0], j, alu.dst); + alu.dst.sel = ctx-temp_reg; + alu.dst.write = (j == i); + if (j == 3) + alu.last = 1; + r = r600_bytecode_add_alu(ctx-bc, alu); + if (r) + return r; + } + } else { + memset(alu, 0, sizeof(struct r600_bytecode_alu)); - alu.op = ALU_OP2_MULLO_UINT; - for (j = 0; j 2; j++) { - r600_bytecode_src(alu.src[j], ctx-src[j], i); - } + alu.dst.chan = i; + alu.dst.sel = ctx-temp_reg; + alu.dst.write = 1; - alu.last = 1; - r = r600_bytecode_add_alu(ctx-bc, alu); - if (r) - return r; + alu.op = ALU_OP2_MULLO_UINT; + for (j = 0; j 2; j++) { + r600_bytecode_src(alu.src[j], ctx-src[j], i); + } + + alu.last = 1; + r = r600_bytecode_add_alu(ctx-bc, alu); + if (r) + return r; + } } -- 1.8.2 ___ mesa-dev mailing list 
mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] register_allocate: Fix the type of best_benefit.
On Tue, Apr 02, 2013 at 01:38:07PM -0700, Matt Turner wrote:
> ---

Nice catch, will this change have any effect on the compiled code?

-Tom

>  src/mesa/program/register_allocate.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/src/mesa/program/register_allocate.c b/src/mesa/program/register_allocate.c
> index a9064c3..7d11b73 100644
> --- a/src/mesa/program/register_allocate.c
> +++ b/src/mesa/program/register_allocate.c
> @@ -561,7 +561,7 @@ int
>  ra_get_best_spill_node(struct ra_graph *g)
>  {
>     unsigned int best_node = -1;
> -   unsigned int best_benefit = 0.0;
> +   float best_benefit = 0.0;
>     unsigned int n;
>
>     for (n = 0; n < g->count; n++) {
> --
> 1.7.8.6
[Mesa-dev] [PATCH] Avoid spurious GCC warnings in STATIC_ASSERT() macro.
GCC 4.8 now warns about typedefs that are local to a scope and not
used anywhere within that scope.  This produces spurious warnings with
the STATIC_ASSERT() macro (which uses a typedef to provoke a compile
error in the event of an assertion failure).

This patch avoids the warning using the GCC __attribute__((unused))
syntax.
---
 src/mesa/main/compiler.h | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/src/mesa/main/compiler.h b/src/mesa/main/compiler.h
index 8b23665..ddeb61d 100644
--- a/src/mesa/main/compiler.h
+++ b/src/mesa/main/compiler.h
@@ -249,6 +249,12 @@ static INLINE GLuint CPU_TO_LE32(GLuint x)
 #endif

+#if (__GNUC__ >= 3)
+#define GCC_ATTRIBUTE_UNUSED __attribute__((unused))
+#else
+#define GCC_ATTRIBUTE_UNUSED
+#endif
+
 /**
  * Static (compile-time) assertion.
  * Basically, use COND to dimension an array.  If COND is false/zero the
@@ -256,7 +262,7 @@ static INLINE GLuint CPU_TO_LE32(GLuint x)
  */
 #define STATIC_ASSERT(COND) \
    do { \
-      typedef int static_assertion_failed[(!!(COND))*2-1]; \
+      typedef int static_assertion_failed[(!!(COND))*2-1] GCC_ATTRIBUTE_UNUSED; \
    } while (0)

-- 
1.8.2
Re: [Mesa-dev] [PATCH] Avoid spurious GCC warnings in STATIC_ASSERT() macro.
On 04/02/2013 04:16 PM, Paul Berry wrote:
> GCC 4.8 now warns about typedefs that are local to a scope and not
> used anywhere within that scope.  This produces spurious warnings with
> the STATIC_ASSERT() macro (which uses a typedef to provoke a compile
> error in the event of an assertion failure).
>
> This patch avoids the warning using the GCC __attribute__((unused))
> syntax.
> ---
>  src/mesa/main/compiler.h | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/src/mesa/main/compiler.h b/src/mesa/main/compiler.h
> index 8b23665..ddeb61d 100644
> --- a/src/mesa/main/compiler.h
> +++ b/src/mesa/main/compiler.h
> @@ -249,6 +249,12 @@ static INLINE GLuint CPU_TO_LE32(GLuint x)
>  #endif
>
> +#if (__GNUC__ >= 3)
> +#define GCC_ATTRIBUTE_UNUSED __attribute__((unused))
> +#else
> +#define GCC_ATTRIBUTE_UNUSED
> +#endif
> +
>  /**
>   * Static (compile-time) assertion.
>   * Basically, use COND to dimension an array.  If COND is false/zero the
> @@ -256,7 +262,7 @@ static INLINE GLuint CPU_TO_LE32(GLuint x)
>   */
>  #define STATIC_ASSERT(COND) \
>     do { \
> -      typedef int static_assertion_failed[(!!(COND))*2-1]; \
> +      typedef int static_assertion_failed[(!!(COND))*2-1] GCC_ATTRIBUTE_UNUSED; \
>     } while (0)

Without using gcc-isms, I think this would work too:

#define STATIC_ASSERT(COND) \
   do { \
      int static_assertion_failed[(!!(COND))*2-1]; \
      (void) static_assertion_failed; \
   } while (0)

I don't recall why I used the typedef.

Also, the same macro should probably be updated in
src/gallium/include/pipe/p_compiler.h

-Brian
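For comparison, a small sketch that puts the two proposed forms side by side so they can be compiled together. The macro names STATIC_ASSERT_TYPEDEF, STATIC_ASSERT_ARRAY and ATTRIBUTE_UNUSED, and the function check_sizes, are hypothetical names chosen for this illustration; only the macro bodies follow the thread.

```c
#include <assert.h>

#if defined(__GNUC__) && (__GNUC__ >= 3)
#define ATTRIBUTE_UNUSED __attribute__((unused))
#else
#define ATTRIBUTE_UNUSED
#endif

/* Typedef form from the patch: a false COND yields an array type of
 * size -1, which is a compile error. The attribute silences the
 * unused-local-typedef warning on GCC. */
#define STATIC_ASSERT_TYPEDEF(COND) \
   do { \
      typedef int static_assertion_failed[(!!(COND))*2-1] ATTRIBUTE_UNUSED; \
   } while (0)

/* Variable form from the reply: same size trick, but an actual array is
 * declared and then referenced through a void cast so it counts as used. */
#define STATIC_ASSERT_ARRAY(COND) \
   do { \
      int static_assertion_failed[(!!(COND))*2-1]; \
      (void) static_assertion_failed; \
   } while (0)

static int check_sizes(void)
{
   STATIC_ASSERT_TYPEDEF(sizeof(int) >= 2);
   STATIC_ASSERT_ARRAY(sizeof(char) == 1);
   return 1; /* reached only if both assertions compiled */
}
```

One possible difference, offered as an assumption rather than something stated in the thread: the typedef form declares no object at all, whereas the array form declares a local variable that the compiler then has to optimize away.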
Re: [Mesa-dev] [PATCH 1/5] draw/gs: cleanup some debugging code
On 03/30/2013 07:27 AM, Zack Rusin wrote:
> Signed-off-by: Zack Rusin <za...@vmware.com>
> ---
>  src/gallium/auxiliary/draw/draw_gs.c |    4 ----
>  1 file changed, 4 deletions(-)
>
> diff --git a/src/gallium/auxiliary/draw/draw_gs.c b/src/gallium/auxiliary/draw/draw_gs.c
> index b98b133..70db837 100644
> --- a/src/gallium/auxiliary/draw/draw_gs.c
> +++ b/src/gallium/auxiliary/draw/draw_gs.c
> @@ -160,8 +160,6 @@ static void tgsi_fetch_gs_input(struct draw_geometry_shader *shader,
>  #if DEBUG_INPUTS
>              debug_printf("\tSlot = %d, vs_slot = %d, idx = %d:\n",
>                           slot, vs_slot, idx);
> -#endif
> -#if 1
>              assert(!util_is_inf_or_nan(input[vs_slot][0]));
>              assert(!util_is_inf_or_nan(input[vs_slot][1]));
>              assert(!util_is_inf_or_nan(input[vs_slot][2]));
> @@ -249,8 +247,6 @@ llvm_fetch_gs_input(struct draw_geometry_shader *shader,
>  #if DEBUG_INPUTS
>              debug_printf("\tSlot = %d, vs_slot = %d, i = %d:\n",
>                           slot, vs_slot, i);
> -#endif
> -#if 0
>              assert(!util_is_inf_or_nan(input[vs_slot][0]));
>              assert(!util_is_inf_or_nan(input[vs_slot][1]));
>              assert(!util_is_inf_or_nan(input[vs_slot][2]));

For the series:
Reviewed-by: Brian Paul <bri...@vmware.com>
[Mesa-dev] [PATCH] gallivm: use f16c hw support for float-half and half-float conversion
From: Roland Scheidegger srol...@vmware.com Should be way faster of course on cpus supporting this (includes AMD Bulldozer and Jaguar cores, Intel Ivy Bridge and up (except budget models)). Passes piglit fbo-blending-formats GL_ARB_texture_float -auto on Ivy Bridge. --- src/gallium/auxiliary/gallivm/lp_bld_conv.c | 45 --- src/gallium/auxiliary/gallivm/lp_bld_init.c | 10 ++ src/gallium/auxiliary/util/u_cpu_detect.c |1 + src/gallium/auxiliary/util/u_cpu_detect.h |1 + 4 files changed, 53 insertions(+), 4 deletions(-) diff --git a/src/gallium/auxiliary/gallivm/lp_bld_conv.c b/src/gallium/auxiliary/gallivm/lp_bld_conv.c index 38a577c..eb2d096 100644 --- a/src/gallium/auxiliary/gallivm/lp_bld_conv.c +++ b/src/gallium/auxiliary/gallivm/lp_bld_conv.c @@ -175,9 +175,24 @@ lp_build_half_to_float(struct gallivm_state *gallivm, struct lp_type f32_type = lp_type_float_vec(32, 32 * src_length); struct lp_type i32_type = lp_type_int_vec(32, 32 * src_length); LLVMTypeRef int_vec_type = lp_build_vec_type(gallivm, i32_type); + LLVMValueRef h; + + if (util_cpu_caps.has_f16c HAVE_LLVM = 0x0301 + (src_length == 4 || src_length == 8)) { + const char *intrinsic = NULL; + if (src_length == 4) { + src = lp_build_pad_vector(gallivm, src, 8); + intrinsic = llvm.x86.vcvtph2ps.128; + } + else { + intrinsic = llvm.x86.vcvtph2ps.256; + } + return lp_build_intrinsic_unary(builder, intrinsic, + lp_build_vec_type(gallivm, f32_type), src); + } /* Convert int16 vector to int32 vector by zero ext (might generate bad code) */ - LLVMValueRef h = LLVMBuildZExt(builder, src, int_vec_type, ); + h = LLVMBuildZExt(builder, src, int_vec_type, ); return lp_build_smallfloat_to_float(gallivm, f32_type, h, 10, 5, 0, true); } @@ -204,9 +219,31 @@ lp_build_float_to_half(struct gallivm_state *gallivm, struct lp_type i16_type = lp_type_int_vec(16, 16 * length); LLVMValueRef result; - result = lp_build_float_to_smallfloat(gallivm, i32_type, src, 10, 5, 0, true); - /* Convert int32 vector to int16 vector by trunc 
(might generate bad code) */ - result = LLVMBuildTrunc(builder, result, lp_build_vec_type(gallivm, i16_type), ); + if (util_cpu_caps.has_f16c HAVE_LLVM = 0x0301 + (length == 4 || length == 8)) { + struct lp_type i168_type = lp_type_int_vec(16, 16 * 8); + unsigned mode = 3; /* same as LP_BUILD_ROUND_TRUNCATE */ + LLVMTypeRef i32t = LLVMInt32TypeInContext(gallivm-context); + const char *intrinsic = NULL; + if (length == 4) { + intrinsic = llvm.x86.vcvtps2ph.128; + } + else { + intrinsic = llvm.x86.vcvtps2ph.256; + } + result = lp_build_intrinsic_binary(builder, intrinsic, + lp_build_vec_type(gallivm, i168_type), + src, LLVMConstInt(i32t, mode, 0)); + if (length == 4) { + result = lp_build_extract_range(gallivm, result, 0, 4); + } + } + + else { + result = lp_build_float_to_smallfloat(gallivm, i32_type, src, 10, 5, 0, true); + /* Convert int32 vector to int16 vector by trunc (might generate bad code) */ + result = LLVMBuildTrunc(builder, result, lp_build_vec_type(gallivm, i16_type), ); + } /* * Debugging code. diff --git a/src/gallium/auxiliary/gallivm/lp_bld_init.c b/src/gallium/auxiliary/gallivm/lp_bld_init.c index 050eba7..4fa5887 100644 --- a/src/gallium/auxiliary/gallivm/lp_bld_init.c +++ b/src/gallium/auxiliary/gallivm/lp_bld_init.c @@ -468,6 +468,15 @@ lp_build_init(void) util_cpu_caps.has_avx = 0; } + if (!HAVE_AVX) { + /* + * note these instructions are VEX-only, so can only emit if we use + * avx (don't want to base it on has_avx has_f16c later as that would + * omit it unnecessarily on amd cpus, see above). + */ + util_cpu_caps.has_f16c = 0; + } + #ifdef PIPE_ARCH_PPC_64 /* Set the NJ bit in VSCR to 0 so denormalized values are handled as * specified by IEEE standard (PowerISA 2.06 - Section 6.3). 
This garantees @@ -495,6 +504,7 @@ lp_build_init(void) util_cpu_caps.has_ssse3 = 0; util_cpu_caps.has_sse4_1 = 0; util_cpu_caps.has_avx = 0; + util_cpu_caps.has_f16c = 0; #endif } diff --git a/src/gallium/auxiliary/util/u_cpu_detect.c b/src/gallium/auxiliary/util/u_cpu_detect.c index 0328051..7e6df9d 100644 --- a/src/gallium/auxiliary/util/u_cpu_detect.c +++ b/src/gallium/auxiliary/util/u_cpu_detect.c @@ -279,6 +279,7 @@ util_cpu_detect(void) util_cpu_caps.has_sse4_1 = (regs2[2] 19) 1; util_cpu_caps.has_sse4_2 = (regs2[2] 20) 1; util_cpu_caps.has_avx= (regs2[2] 28) 1; + util_cpu_caps.has_f16c = (regs2[2] 29) 1; util_cpu_caps.has_mmx2 = util_cpu_caps.has_sse; /* SSE cpus supports mmxext too
Re: [Mesa-dev] [PATCH] gallivm: use f16c hw support for float-half and half-float conversion
On 04/02/2013 05:07 PM, srol...@vmware.com wrote:
> From: Roland Scheidegger <srol...@vmware.com>
>
> Should be way faster of course on cpus supporting this (includes AMD
> Bulldozer and Jaguar cores, Intel Ivy Bridge and up (except budget models)).
> Passes piglit fbo-blending-formats GL_ARB_texture_float -auto on Ivy Bridge.
> ---
>  src/gallium/auxiliary/gallivm/lp_bld_conv.c |   45 +++++++++++++++++++++++++++++++++++++++++----
>  src/gallium/auxiliary/gallivm/lp_bld_init.c |   10 ++++++++++
>  src/gallium/auxiliary/util/u_cpu_detect.c   |    1 +
>  src/gallium/auxiliary/util/u_cpu_detect.h   |    1 +
>  4 files changed, 53 insertions(+), 4 deletions(-)

LGTM.

Reviewed-by: Brian Paul <bri...@vmware.com>
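As background for the patch under review, a scalar sketch of the half-to-float conversion that the F16C instructions perform in hardware. This is a plain IEEE 754 binary16 to binary32 widening written for illustration; the function name half_bits_to_float is hypothetical, and the code is neither the gallivm fallback path nor the LLVM intrinsic.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Convert the bit pattern of an IEEE 754 half (1 sign, 5 exponent,
 * 10 mantissa bits) to a float. Widening is exact, so no rounding mode
 * is involved in this direction. */
static float half_bits_to_float(uint16_t h)
{
   uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
   uint32_t exp = (h >> 10) & 0x1fu;
   uint32_t man = h & 0x3ffu;
   uint32_t bits;
   float f;

   if (exp == 0x1fu) {
      /* Inf or NaN: all-ones exponent, mantissa bits carried over. */
      bits = sign | 0x7f800000u | (man << 13);
   } else if (exp != 0) {
      /* Normal number: rebias the exponent from 15 to 127. */
      bits = sign | ((exp + 112u) << 23) | (man << 13);
   } else if (man == 0) {
      /* Signed zero. */
      bits = sign;
   } else {
      /* Half denormal: shift the mantissa up until the implicit one
       * appears, lowering the float exponent by one per shift. */
      uint32_t shift = 0;
      while (!(man & 0x400u)) {
         man <<= 1;
         shift++;
      }
      man &= 0x3ffu;
      bits = sign | ((113u - shift) << 23) | (man << 13);
   }

   memcpy(&f, &bits, sizeof f); /* reinterpret the bits as a float */
   return f;
}
```

For example, half_bits_to_float(0x3c00) gives 1.0f and half_bits_to_float(0x7bff) gives 65504.0f, the largest finite half value.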
Re: [Mesa-dev] [PATCH] i965/fs: Don't immediately schedule instructions that were just made available.
Matt Turner <matts...@gmail.com> writes:

> The original goal of pre-register allocation scheduling was to reduce
> live ranges so we'd use fewer registers and hopefully fit into 16-wide.
> In shader-db, this change causes us to lose 30 16-wide programs, but we
> gain 29... so it's a toss-up. At least by choosing instructions in a
> better order all programs should be slightly faster.

I think this will break the GLES3 test that we created this pass for.

I think we'll get the same performance benefit by round-robining our
allocated registers instead of packing them in the low numbers, which is
what that branch I had mentioned to you was for.
[Mesa-dev] [PATCH] radeonsi: add more cases for copying unsupported formats to resource_copy_region
From: Marek Olšák <mar...@gmail.com>

Ported from r600g commit: 8891b2f9c91b2f6c8625184c23a10b8e55875dc0

NOTE: This is a candidate for the stable branches.
---
 src/gallium/drivers/radeonsi/r600_blit.c |   12 ++++++++++++
 1 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/r600_blit.c b/src/gallium/drivers/radeonsi/r600_blit.c
index f9d2568..f11f110 100644
--- a/src/gallium/drivers/radeonsi/r600_blit.c
+++ b/src/gallium/drivers/radeonsi/r600_blit.c
@@ -429,6 +429,18 @@ static void r600_resource_copy_region(struct pipe_context *ctx,
 			r600_change_format(dst, dst_level, &orig_info[1],
 					   PIPE_FORMAT_R8G8B8A8_UNORM);
 			break;
+		case 8:
+			r600_change_format(src, src_level, &orig_info[0],
+					   PIPE_FORMAT_R16G16B16A16_UINT);
+			r600_change_format(dst, dst_level, &orig_info[1],
+					   PIPE_FORMAT_R16G16B16A16_UINT);
+			break;
+		case 16:
+			r600_change_format(src, src_level, &orig_info[0],
+					   PIPE_FORMAT_R32G32B32A32_UINT);
+			r600_change_format(dst, dst_level, &orig_info[1],
+					   PIPE_FORMAT_R32G32B32A32_UINT);
+			break;
 		default:
 			fprintf(stderr, "Unhandled format %s with blocksize %u\n",
 				util_format_short_name(src->format), blocksize);
-- 
1.7.3.4
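A rough sketch of the idea behind the added cases, as read from the diff: a region copy only moves bits, so a format the blitter cannot handle can be reinterpreted as a four-channel format with the same block size. The helper raw_copy_channel_bits is hypothetical and covers only the three block sizes visible in the hunk; it is not the driver's code.

```c
#include <assert.h>

/* Hypothetical helper: bits per channel of the four-channel stand-in
 * format whose total size equals the block size in bytes. Returns 0 for
 * block sizes this sketch does not cover. */
static unsigned raw_copy_channel_bits(unsigned blocksize)
{
   switch (blocksize) {
   case 4:
      return 8;   /* R8G8B8A8: 4 channels x 8 bits = 4 bytes */
   case 8:
      return 16;  /* R16G16B16A16: 4 channels x 16 bits = 8 bytes */
   case 16:
      return 32;  /* R32G32B32A32: 4 channels x 32 bits = 16 bytes */
   default:
      return 0;   /* the driver reports "Unhandled format" here */
   }
}
```

In each covered case, four channels times the returned bit count equals the block size in bits, which is the property that lets the copy preserve the data unchanged.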
Re: [Mesa-dev] [PATCH] i965/fs: Don't immediately schedule instructions that were just made available.
On Tue, Apr 2, 2013 at 4:48 PM, Eric Anholt <e...@anholt.net> wrote:
> Matt Turner <matts...@gmail.com> writes:
>
>> The original goal of pre-register allocation scheduling was to reduce
>> live ranges so we'd use fewer registers and hopefully fit into 16-wide.
>> In shader-db, this change causes us to lose 30 16-wide programs, but we
>> gain 29... so it's a toss-up. At least by choosing instructions in a
>> better order all programs should be slightly faster.
>
> I think this will break the GLES3 test that we created this pass for.

It does, and I've been trying to figure out another way of solving it.

> I think we'll get the same performance benefit by round-robining our
> allocated registers instead of packing them in the low numbers, which is
> what that branch I had mentioned to you was for.

I don't think so, since the round-robin allocation would have helped
write-after-read stalls, but our hardware doesn't stall on
write-after-read.
Re: [Mesa-dev] [PATCH] radeon/uvd: add UVD implementation
On Tue, Apr 2, 2013 at 4:19 PM, Christian König deathsim...@vodafone.de wrote: diff --git a/configure.ac b/configure.ac index 81d4a3f..93ec1d2 100644 --- a/configure.ac +++ b/configure.ac @@ -1814,6 +1814,7 @@ if test x$with_gallium_drivers != x; then if test x$enable_r600_llvm = xyes -o x$enable_opencl = xyes; then radeon_llvm_check NEED_RADEON_GALLIUM=yes; +NEED_RUVD_GALLIUM=yes; R600_NEED_RADEON_GALLIUM=yes; LLVM_COMPONENTS=${LLVM_COMPONENTS} ipo bitreader asmparser fi @@ -1832,6 +1833,7 @@ if test x$with_gallium_drivers != x; then GALLIUM_DRIVERS_DIRS=$GALLIUM_DRIVERS_DIRS radeonsi radeon_llvm_check NEED_RADEON_GALLIUM=yes; + NEED_RUVD_GALLIUM=yes; gallium_check_st radeon/drm dri-radeonsi xorg-radeonsi vdpau-radeonsi ;; xnouveau) @@ -1987,6 +1989,7 @@ AM_CONDITIONAL(HAVE_GALAHAD_GALLIUM, test x$HAVE_GALAHAD_GALLIUM = xyes) AM_CONDITIONAL(HAVE_IDENTITY_GALLIUM, test x$HAVE_IDENTITY_GALLIUM = xyes) AM_CONDITIONAL(HAVE_NOOP_GALLIUM, test x$HAVE_NOOP_GALLIUM = xyes) AM_CONDITIONAL(NEED_RADEON_GALLIUM, test x$NEED_RADEON_GALLIUM = xyes) +AM_CONDITIONAL(NEED_RUVD_GALLIUM, test x$NEED_RUVD_GALLIUM = xyes) AM_CONDITIONAL(R600_NEED_RADEON_GALLIUM, test x$R600_NEED_RADEON_GALLIUM = xyes) AM_CONDITIONAL(USE_R600_LLVM_COMPILER, test x$USE_R600_LLVM_COMPILER = xyes) AM_CONDITIONAL(HAVE_LOADER_GALLIUM, test x$enable_gallium_loader = xyes) @@ -2062,6 +2065,7 @@ AC_CONFIG_FILES([Makefile src/gallium/drivers/softpipe/Makefile src/gallium/drivers/svga/Makefile src/gallium/drivers/trace/Makefile + src/gallium/drivers/ruvd/Makefile Keep this list in alphabetical order please. src/gallium/state_trackers/Makefile src/gallium/state_trackers/clover/Makefile src/gallium/state_trackers/dri/Makefile diff --git a/docs/README.UVD b/docs/README.UVD new file mode 100644 index 000..36b467e --- /dev/null +++ b/docs/README.UVD @@ -0,0 +1,13 @@ +The software may implement third party technologies (e.g. 
third party +libraries) that are not licensed to you by AMD and for which you may need +to obtain licenses from other parties. Unless explicitly stated otherwise, +these third party technologies are not licensed hereunder. Such third +party technologies include, but are not limited, to H.264, MPEG-2, MPEG-4, +AVC, and VC-1. + +For MPEG-2 Encoding Products ANY USE OF THIS PRODUCT IN ANY MANNER OTHER +THAN PERSONAL USE THAT COMPLIES WITH THE MPEG-2 STANDARD FOR ENCODING VIDEO +INFORMATION FOR PACKAGED MEDIA IS EXPRESSLY PROHIBITED WITHOUT A LICENSE +UNDER APPLICABLE PATENTS IN THE MPEG-2 PATENT PORTFOLIO, WHICH LICENSES IS +AVAILABLE FROM MPEG LA, LLC, 6312 S. Fiddlers Green Circle, Suite 400E, +Greenwood Village, Colorado 80111 U.S.A. diff --git a/src/gallium/drivers/Makefile.am b/src/gallium/drivers/Makefile.am index 3477fee..b78a3e0 100644 --- a/src/gallium/drivers/Makefile.am +++ b/src/gallium/drivers/Makefile.am @@ -64,4 +64,12 @@ endif +if NEED_RADEON_GALLIUM Supposed to be NEED_RUVD_GALLIUM? + +SUBDIRS += ruvd + +endif + + + SUBDIRS += $(GALLIUM_MAKE_DIRS) diff --git a/src/gallium/drivers/ruvd/Makefile.am b/src/gallium/drivers/ruvd/Makefile.am new file mode 100644 index 000..1d183e7 --- /dev/null +++ b/src/gallium/drivers/ruvd/Makefile.am @@ -0,0 +1,16 @@ +include Makefile.sources + +noinst_LTLIBRARIES = libruvd.la + +AM_CFLAGS = \ + -I$(top_srcdir)/src/gallium/include \ + -I$(top_srcdir)/src/gallium/auxiliary \ + -I$(top_srcdir)/src/gallium/drivers \ + -I$(top_srcdir)/include \ + $(RADEON_CFLAGS) \ + $(DEFINES) \ + $(PIC_FLAGS) \ No more PIC_FLAGS. Congratulations. I bet this has been a really long process for you guys. Matt ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] Avoid spurious GCC warnings in STATIC_ASSERT() macro.
Brian Paul <bri...@vmware.com> writes:

> On 04/02/2013 04:16 PM, Paul Berry wrote:
>> GCC 4.8 now warns about typedefs that are local to a scope and not
>> used anywhere within that scope.  This produces spurious warnings with
>> the STATIC_ASSERT() macro (which uses a typedef to provoke a compile
>> error in the event of an assertion failure).
>>
>> This patch avoids the warning using the GCC __attribute__((unused))
>> syntax.
>> ---
>>  src/mesa/main/compiler.h | 8 +++++++-
>>  1 file changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/mesa/main/compiler.h b/src/mesa/main/compiler.h
>> index 8b23665..ddeb61d 100644
>> --- a/src/mesa/main/compiler.h
>> +++ b/src/mesa/main/compiler.h
>> @@ -249,6 +249,12 @@ static INLINE GLuint CPU_TO_LE32(GLuint x)
>>  #endif
>>
>> +#if (__GNUC__ >= 3)
>> +#define GCC_ATTRIBUTE_UNUSED __attribute__((unused))
>> +#else
>> +#define GCC_ATTRIBUTE_UNUSED
>> +#endif
>> +
>>  /**
>>   * Static (compile-time) assertion.
>>   * Basically, use COND to dimension an array.  If COND is false/zero the
>> @@ -256,7 +262,7 @@ static INLINE GLuint CPU_TO_LE32(GLuint x)
>>   */
>>  #define STATIC_ASSERT(COND) \
>>     do { \
>> -      typedef int static_assertion_failed[(!!(COND))*2-1]; \
>> +      typedef int static_assertion_failed[(!!(COND))*2-1] GCC_ATTRIBUTE_UNUSED; \
>>     } while (0)
>
> Without using gcc-isms, I think this would work too:
>
> #define STATIC_ASSERT(COND) \
>    do { \
>       int static_assertion_failed[(!!(COND))*2-1]; \
>       (void) static_assertion_failed; \
>    } while (0)
>
> I don't recall why I used the typedef.
>
> Also, the same macro should probably be updated in
> src/gallium/include/pipe/p_compiler.h

Rusty's CCAN is often a good reference for stuff like this:

http://git.ozlabs.org/?p=ccan;a=blob;f=ccan/build_assert/build_assert.h;h=b9ecd84028e3fbebd1bf009c3c57e8a193e45646;hb=HEAD
[Mesa-dev] [PATCH 1/3] gallivm: minor rho calculation optimization for 1 or 3 coords
From: Roland Scheidegger srol...@vmware.com Using a different packing for the single coord case should save a shuffle. Plus some minor style fixes. --- src/gallium/auxiliary/gallivm/lp_bld_quad.c | 20 +++- src/gallium/auxiliary/gallivm/lp_bld_sample.c | 31 +++-- 2 files changed, 22 insertions(+), 29 deletions(-) diff --git a/src/gallium/auxiliary/gallivm/lp_bld_quad.c b/src/gallium/auxiliary/gallivm/lp_bld_quad.c index 1955add..f2a762a 100644 --- a/src/gallium/auxiliary/gallivm/lp_bld_quad.c +++ b/src/gallium/auxiliary/gallivm/lp_bld_quad.c @@ -81,7 +81,8 @@ lp_build_ddy(struct lp_build_context *bld, /* * Helper for building packed ddx/ddy vector for one coord (scalar per quad * values). The vector will look like this (8-wide): - * dr1dx dr1dy _ _ dr2dx dr2dy _ _ + * dr1dx _ -dr1dy _ dr2dx _ -dr2dy _ + * This only requires one shuffle instead of two for more straightforward packing. */ LLVMValueRef lp_build_packed_ddx_ddy_onecoord(struct lp_build_context *bld, @@ -91,19 +92,15 @@ lp_build_packed_ddx_ddy_onecoord(struct lp_build_context *bld, LLVMBuilderRef builder = gallivm-builder; LLVMValueRef vec1, vec2; - /* same packing as _twocoord, but can use aos swizzle helper */ + /* use aos swizzle helper */ - /* -* XXX could make swizzle1 a noop swizzle by using right top/bottom -* pair for ddy -*/ - static const unsigned char swizzle1[] = { - LP_BLD_QUAD_TOP_LEFT, LP_BLD_QUAD_TOP_LEFT, - LP_BLD_SWIZZLE_DONTCARE, LP_BLD_SWIZZLE_DONTCARE + static const unsigned char swizzle1[] = { /* no-op swizzle */ + LP_BLD_QUAD_TOP_LEFT, LP_BLD_SWIZZLE_DONTCARE, + LP_BLD_QUAD_BOTTOM_LEFT, LP_BLD_SWIZZLE_DONTCARE }; static const unsigned char swizzle2[] = { - LP_BLD_QUAD_TOP_RIGHT, LP_BLD_QUAD_BOTTOM_LEFT, - LP_BLD_SWIZZLE_DONTCARE, LP_BLD_SWIZZLE_DONTCARE + LP_BLD_QUAD_TOP_RIGHT, LP_BLD_SWIZZLE_DONTCARE, + LP_BLD_QUAD_TOP_LEFT, LP_BLD_SWIZZLE_DONTCARE }; vec1 = lp_build_swizzle_aos(bld, a, swizzle1); @@ -120,6 +117,7 @@ lp_build_packed_ddx_ddy_onecoord(struct lp_build_context *bld, * 
Helper for building packed ddx/ddy vector for one coord (scalar per quad * values). The vector will look like this (8-wide): * ds1dx ds1dy dt1dx dt1dy ds2dx ds2dy dt2dx dt2dy + * This only needs 2 (v)shufps. */ LLVMValueRef lp_build_packed_ddx_ddy_twocoord(struct lp_build_context *bld, diff --git a/src/gallium/auxiliary/gallivm/lp_bld_sample.c b/src/gallium/auxiliary/gallivm/lp_bld_sample.c index fc8bae7..9a00897 100644 --- a/src/gallium/auxiliary/gallivm/lp_bld_sample.c +++ b/src/gallium/auxiliary/gallivm/lp_bld_sample.c @@ -226,7 +226,6 @@ lp_build_rho(struct lp_build_sample_context *bld, LLVMValueRef int_size, float_size; LLVMValueRef rho; LLVMValueRef first_level, first_level_vec; - LLVMValueRef abs_ddx_ddy[2]; unsigned length = coord_bld-type.length; unsigned num_quads = length / 4; unsigned i; @@ -279,32 +278,28 @@ lp_build_rho(struct lp_build_sample_context *bld, ddx_ddy[0] = lp_build_packed_ddx_ddy_onecoord(coord_bld, s); } else if (dims = 2) { - ddx_ddy[0] = lp_build_packed_ddx_ddy_twocoord(coord_bld, - s, t); + ddx_ddy[0] = lp_build_packed_ddx_ddy_twocoord(coord_bld, s, t); if (dims 2) { ddx_ddy[1] = lp_build_packed_ddx_ddy_onecoord(coord_bld, r); } } - abs_ddx_ddy[0] = lp_build_abs(coord_bld, ddx_ddy[0]); + ddx_ddy[0] = lp_build_abs(coord_bld, ddx_ddy[0]); if (dims 2) { - abs_ddx_ddy[1] = lp_build_abs(coord_bld, ddx_ddy[1]); - } - else { - abs_ddx_ddy[1] = NULL; + ddx_ddy[1] = lp_build_abs(coord_bld, ddx_ddy[1]); } - if (dims == 1) { - static const unsigned char swizzle1[] = { + if (dims 2) { + static const unsigned char swizzle1[] = { /* no-op swizzle */ 0, LP_BLD_SWIZZLE_DONTCARE, LP_BLD_SWIZZLE_DONTCARE, LP_BLD_SWIZZLE_DONTCARE }; static const unsigned char swizzle2[] = { -1, LP_BLD_SWIZZLE_DONTCARE, +2, LP_BLD_SWIZZLE_DONTCARE, LP_BLD_SWIZZLE_DONTCARE, LP_BLD_SWIZZLE_DONTCARE }; - rho_xvec = lp_build_swizzle_aos(coord_bld, abs_ddx_ddy[0], swizzle1); - rho_yvec = lp_build_swizzle_aos(coord_bld, abs_ddx_ddy[0], swizzle2); + rho_xvec = 
lp_build_swizzle_aos(coord_bld, ddx_ddy[0], swizzle1); + rho_yvec = lp_build_swizzle_aos(coord_bld, ddx_ddy[0], swizzle2); } else if (dims == 2) { static const unsigned char swizzle1[] = { @@ -315,8 +310,8 @@ lp_build_rho(struct lp_build_sample_context *bld, 1, 3, LP_BLD_SWIZZLE_DONTCARE, LP_BLD_SWIZZLE_DONTCARE }; - rho_xvec = lp_build_swizzle_aos(coord_bld, abs_ddx_ddy[0],
[Mesa-dev] [PATCH 2/3] gallivm: do per-pixel cube face selection (finally!!!)
From: Roland Scheidegger <srol...@vmware.com>

This proved to be tricky. The problem is that after selection/mirroring we
cannot calculate reasonable derivatives (if not all pixels in a quad end up
on the same face, the derivatives could get randomly exceedingly large).
However, it is actually quite easy to simply calculate the derivatives before
selection/mirroring and then transform them similarly to the cube coordinates
(they only need selection/projection, but not mirroring, as we're not
interested in the sign bit, of course).

While there is a tiny bit more work to do (need to calculate derivs for 3
coords instead of 2, and additional selects), it also simplifies things
somewhat for the coord selection itself (as we save some broadcast aos
shuffles, and we don't need to calculate the average vector), hence if
derivatives aren't needed this should actually be faster.

Also, this has the benefit that it will (trivially) work for explicit
derivatives too, which we completely ignored before (will be in a separate
commit for better trackability).

Note that while the way of getting rho looks very different, it should result
in nearly the same values as before (the "nearly" is only because before the
code would choose the face based on an average vector, and hence the
derivatives were calculated according to this face, whereas now (for implicit
derivatives) the derivatives are projected onto the face selected for the
first (top-left) pixel in a quad, so not necessarily the same face).

The transformation done might not quite be state-of-the-art, and calculating
length(dx,dy) as max(dx,dy) certainly isn't either, but this stays the same as
before (that is, I think a better transform would _somehow_ take the
derivative major axis into account so that derivative changes in the major
axis wouldn't get ignored).
Should solve some accuracy problems with cubemaps (can easily be seen with the cubemap demo when switching wrapping/filtering), though we still don't do seamless filtering to fix it completely (so not per-sample but per-pixel is certainly better than per-quad and already sufficient for accurate results with nearest tex filter). As for performance, it seems to be a tiny bit faster too (maybe 3% or so with cubemap demo). Which I'd have expected with nearest/nearest filtering where this will be less instructions, but the difference seems to actually be larger with linear/linear_mipmap_linear where it is slightly more instructions, probably the code appears less serialized allowing better scheduling (on a sandy bridge cpu). It actually seems to be now at least as fast as the old path using a conditional when using 128bit vectors too (that is probably more a result of testing with a newer cpu though), for now that old path is still there but unused. --- src/gallium/auxiliary/gallivm/lp_bld_sample.c | 249 ++--- src/gallium/auxiliary/gallivm/lp_bld_sample.h |4 +- src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c |9 +- 3 files changed, 180 insertions(+), 82 deletions(-) diff --git a/src/gallium/auxiliary/gallivm/lp_bld_sample.c b/src/gallium/auxiliary/gallivm/lp_bld_sample.c index 9a00897..5d50921 100644 --- a/src/gallium/auxiliary/gallivm/lp_bld_sample.c +++ b/src/gallium/auxiliary/gallivm/lp_bld_sample.c @@ -207,6 +207,7 @@ lp_build_rho(struct lp_build_sample_context *bld, LLVMValueRef s, LLVMValueRef t, LLVMValueRef r, + LLVMValueRef cube_rho, const struct lp_derivatives *derivs) { struct gallivm_state *gallivm = bld-gallivm; @@ -240,8 +241,22 @@ lp_build_rho(struct lp_build_sample_context *bld, int_size = lp_build_minify(int_size_bld, bld-int_size, first_level_vec); float_size = lp_build_int_to_float(float_size_bld, int_size); - /* XXX ignoring explicit derivs for cube maps for now */ - if (derivs !(bld-static_texture_state-target == PIPE_TEXTURE_CUBE)) { + if 
(cube_rho) { + LLVMValueRef cubesize; + LLVMValueRef index0 = lp_build_const_int32(gallivm, 0); + /* + * If we have derivs too then we have per-pixel cube_rho - doesn't matter + * though until we do per-pixel lod. + * Cube map code did already everything except size mul and per-quad extraction. + */ + /* Could optimize this for single quad just skip the broadcast */ + cubesize = lp_build_extract_broadcast(gallivm, bld-float_size_in_type, +coord_bld-type, float_size, index0); + rho_vec = lp_build_mul(coord_bld, cubesize, cube_rho); + rho = lp_build_pack_aos_scalars(bld-gallivm, coord_bld-type, + perquadf_bld-type, rho_vec, 0); + } + else if (derivs !(bld-static_texture_state-target == PIPE_TEXTURE_CUBE)) { LLVMValueRef ddmax[3]; for (i = 0; i dims; i++) { LLVMValueRef ddx, ddy; @@ -561,6 +576,7 @@ lp_build_lod_selector(struct lp_build_sample_context *bld,
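To make the notion of per-pixel face selection concrete, here is a scalar sketch of the standard OpenGL major-axis rule applied to a single direction vector. The names cube_face, cube_coord and cube_lookup are hypothetical, the tie-break order (x before y before z) is an assumption of this sketch, and the code is not the vectorized gallivm implementation from the patch.

```c
#include <assert.h>

/* Face order assumed to match the usual +X, -X, +Y, -Y, +Z, -Z layout. */
enum cube_face {
   FACE_POS_X, FACE_NEG_X,
   FACE_POS_Y, FACE_NEG_Y,
   FACE_POS_Z, FACE_NEG_Z
};

struct cube_coord {
   enum cube_face face;
   float s, t; /* face coordinates in [0, 1] */
};

static float absf(float v)
{
   return v < 0.0f ? -v : v;
}

/* Select the face from the largest-magnitude component, then project the
 * other two components onto that face. A zero vector is not handled. */
static struct cube_coord cube_lookup(float rx, float ry, float rz)
{
   float ax = absf(rx), ay = absf(ry), az = absf(rz);
   float sc, tc, ma;
   struct cube_coord out;

   if (ax >= ay && ax >= az) {
      out.face = rx >= 0.0f ? FACE_POS_X : FACE_NEG_X;
      sc = rx >= 0.0f ? -rz : rz;
      tc = -ry;
      ma = ax;
   } else if (ay >= az) {
      out.face = ry >= 0.0f ? FACE_POS_Y : FACE_NEG_Y;
      sc = rx;
      tc = ry >= 0.0f ? rz : -rz;
      ma = ay;
   } else {
      out.face = rz >= 0.0f ? FACE_POS_Z : FACE_NEG_Z;
      sc = rz >= 0.0f ? rx : -rx;
      tc = -ry;
      ma = az;
   }

   out.s = 0.5f * (sc / ma + 1.0f);
   out.t = 0.5f * (tc / ma + 1.0f);
   return out;
}
```

Running this independently for each of the four pixels of a quad can yield different faces near a cube edge, which is why the commit message computes derivatives before selection rather than from the selected face coordinates.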
[Mesa-dev] [PATCH 3/3] gallivm: honor explicit derivatives values for cube maps.
From: Roland Scheidegger srol...@vmware.com This is trivial now, though need to make sure we pass all the necessary derivative values (which is 3 each for ddx/ddy not 2). Untested (no piglit test) however since the transform works the same as implicit derivatives this should probably work correctly. --- src/gallium/auxiliary/gallivm/lp_bld_sample.c | 10 ++-- src/gallium/auxiliary/gallivm/lp_bld_sample.h |1 + src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c |2 +- src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c | 66 ++--- 4 files changed, 52 insertions(+), 27 deletions(-) diff --git a/src/gallium/auxiliary/gallivm/lp_bld_sample.c b/src/gallium/auxiliary/gallivm/lp_bld_sample.c index 5d50921..cc04a70 100644 --- a/src/gallium/auxiliary/gallivm/lp_bld_sample.c +++ b/src/gallium/auxiliary/gallivm/lp_bld_sample.c @@ -1287,6 +1287,7 @@ lp_build_cube_lookup(struct lp_build_sample_context *bld, LLVMValueRef s, LLVMValueRef t, LLVMValueRef r, + const struct lp_derivatives *derivs, /* optional */ LLVMValueRef *face, LLVMValueRef *face_s, LLVMValueRef *face_t, @@ -1296,7 +1297,6 @@ lp_build_cube_lookup(struct lp_build_sample_context *bld, LLVMBuilderRef builder = bld-gallivm-builder; struct gallivm_state *gallivm = bld-gallivm; LLVMValueRef si, ti, ri; - boolean implicit_derivs = TRUE; boolean need_derivs = TRUE; if (1 || coord_bld-type.length 4) { @@ -1334,9 +1334,9 @@ lp_build_cube_lookup(struct lp_build_sample_context *bld, assert(PIPE_TEX_FACE_NEG_Z == PIPE_TEX_FACE_POS_Z + 1); /* - * TODO do this only when needed, and implement explicit derivs (trivial). + * TODO do this only when needed. 
*/ - if (need_derivs implicit_derivs) { + if (need_derivs !derivs) { LLVMValueRef ddx_ddy[2], tmp[2]; /* * This isn't quite the same as the ordinary path since @@ -1374,9 +1374,9 @@ lp_build_cube_lookup(struct lp_build_sample_context *bld, dmax[2] = lp_build_max(coord_bld, tmp[0], tmp[1]); } else if (need_derivs) { - /* dmax[0] = lp_build_max(coord_bld, derivs-ddx[0], derivs-ddy[0]); + dmax[0] = lp_build_max(coord_bld, derivs-ddx[0], derivs-ddy[0]); dmax[1] = lp_build_max(coord_bld, derivs-ddx[1], derivs-ddy[1]); - dmax[2] = lp_build_max(coord_bld, derivs-ddx[2], derivs-ddy[2]); */ + dmax[2] = lp_build_max(coord_bld, derivs-ddx[2], derivs-ddy[2]); } si = LLVMBuildBitCast(builder, s, lp_build_vec_type(gallivm, intctype), ); diff --git a/src/gallium/auxiliary/gallivm/lp_bld_sample.h b/src/gallium/auxiliary/gallivm/lp_bld_sample.h index 5026b0a..72af813 100644 --- a/src/gallium/auxiliary/gallivm/lp_bld_sample.h +++ b/src/gallium/auxiliary/gallivm/lp_bld_sample.h @@ -433,6 +433,7 @@ lp_build_cube_lookup(struct lp_build_sample_context *bld, LLVMValueRef s, LLVMValueRef t, LLVMValueRef r, + const struct lp_derivatives *derivs, /* optional */ LLVMValueRef *face, LLVMValueRef *face_s, LLVMValueRef *face_t, diff --git a/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c b/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c index 3b950ea..d2cc0f3 100644 --- a/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c +++ b/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c @@ -1102,7 +1102,7 @@ lp_build_sample_common(struct lp_build_sample_context *bld, */ if (target == PIPE_TEXTURE_CUBE) { LLVMValueRef face, face_s, face_t; - lp_build_cube_lookup(bld, *s, *t, *r, face, face_s, face_t, cube_rho); + lp_build_cube_lookup(bld, *s, *t, *r, derivs, face, face_s, face_t, cube_rho); *s = face_s; /* vec */ *t = face_t; /* vec */ /* use 'r' to indicate cube face */ diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c index facfc82..007e3c9 
100644 --- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c +++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c @@ -1276,8 +1276,7 @@ emit_tex( struct lp_build_tgsi_soa_context *bld, LLVMValueRef offsets[3] = { NULL }; struct lp_derivatives derivs; struct lp_derivatives *deriv_ptr = NULL; - unsigned num_coords; - unsigned dims; + unsigned num_coords, num_derivs, num_offsets; unsigned i; if (!bld-sampler) { @@ -1291,37 +1290,52 @@ emit_tex( struct lp_build_tgsi_soa_context *bld, switch (inst-Texture.Texture) { case TGSI_TEXTURE_1D: num_coords = 1; - dims = 1; + num_offsets = 1; + num_derivs = 1; break; case TGSI_TEXTURE_1D_ARRAY: num_coords = 2; - dims = 1; + num_offsets = 1; + num_derivs = 1;
[Mesa-dev] [PATCH 1/3] intel: Add support for writing to our linear-temporary-CPU-map case.
This will be used for handling updates of large textures.

Reviewed-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/intel/intel_mipmap_tree.c | 25 ++--
 1 file changed, 23 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
index 66cadeb..ffdaec5 100644
--- a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
@@ -1347,9 +1347,30 @@ intel_miptree_unmap_blit(struct intel_context *intel,
                          unsigned int level,
                          unsigned int slice)
 {
-   assert(!(map->mode & GL_MAP_WRITE_BIT));
-
+   struct gl_context *ctx = &intel->ctx;
    drm_intel_bo_unmap(map->bo);
+
+   if (map->mode & GL_MAP_WRITE_BIT) {
+      unsigned int image_x, image_y;
+      int x = map->x;
+      int y = map->y;
+      intel_miptree_get_image_offset(mt, level, slice, &image_x, &image_y);
+      x += image_x;
+      y += image_y;
+
+      bool ok = intelEmitCopyBlit(intel,
+                                  mt->region->cpp,
+                                  map->stride, map->bo,
+                                  0, I915_TILING_NONE,
+                                  mt->region->pitch, mt->region->bo,
+                                  mt->offset, mt->region->tiling,
+                                  0, 0,
+                                  x, y,
+                                  map->w, map->h,
+                                  GL_COPY);
+      WARN_ONCE(!ok, "Failed to blit from linear temporary mapping");
+   }
+
    drm_intel_bo_unreference(map->bo);
 }
 
--
1.7.10.4
[Mesa-dev] [PATCH 3/3] intel: Avoid making tiled miptrees we won't be able to blit.
Doing so was breaking miptree mapping, which we really need to be able to handle. With this change, intel_miptree_map_direct() falls through to doing a CPU mapping on the buffer like we need.

With the previous 2 patches, all of these should be fixed:

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=37871
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=44958
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=53494
Reviewed-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/intel/intel_mipmap_tree.c | 35 ++--
 1 file changed, 21 insertions(+), 14 deletions(-)

diff --git a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
index 5e0cd61..8d2b8a3 100644
--- a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
@@ -354,6 +354,18 @@ intel_miptree_create(struct intel_context *intel,
    etc_format = (format != tex_format) ? tex_format : MESA_FORMAT_NONE;
    base_format = _mesa_get_format_base_format(format);
 
+   mt = intel_miptree_create_layout(intel, target, format,
+                                    first_level, last_level, width0,
+                                    height0, depth0,
+                                    false, num_samples);
+   /*
+    * pitch == 0 || height == 0  indicates the null texture
+    */
+   if (!mt || !mt->total_width || !mt->total_height) {
+      intel_miptree_release(&mt);
+      return NULL;
+   }
+
    if (num_samples > 1) {
       /* From p82 of the Sandy Bridge PRM, dw3[1] of SURFACE_STATE ("Tiled
        * Surface"):
@@ -377,20 +389,15 @@ intel_miptree_create(struct intel_context *intel,
          tiling = I915_TILING_Y;
       else if (force_y_tiling) {
          tiling = I915_TILING_Y;
-      } else if (width0 >= 64)
-         tiling = I915_TILING_X;
-   }
-
-   mt = intel_miptree_create_layout(intel, target, format,
-                                    first_level, last_level, width0,
-                                    height0, depth0,
-                                    false, num_samples);
-   /*
-    * pitch == 0 || height == 0  indicates the null texture
-    */
-   if (!mt || !mt->total_width || !mt->total_height) {
-      intel_miptree_release(&mt);
-      return NULL;
+      } else if (width0 >= 64) {
+         if (ALIGN(mt->total_width * mt->cpp, 512) < 32768) {
+            tiling = I915_TILING_X;
+         } else {
+            perf_debug("%dx%d miptree too large to blit, "
+                       "falling back to untiled",
+                       mt->total_width, mt->total_height);
+         }
+      }
    }
 
    total_width = mt->total_width;
--
1.7.10.4
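The pitch check this patch adds can be tried outside the driver. The sketch below is only an illustration: align_up() is a hypothetical stand-in for Mesa's ALIGN macro and x_tiled_pitch_is_blittable() is a made-up name. The 512-byte alignment and the 32768 limit are the values used in the diff.

```c
#include <stdbool.h>
#include <stdint.h>

/* Limit taken from the diff: pitches of 32768 bytes or more are treated
 * as too large for the blitter. */
#define BLIT_PITCH_LIMIT 32768u

/* Hypothetical helper: round value up to a power-of-two alignment. */
static uint32_t align_up(uint32_t value, uint32_t alignment)
{
    return (value + alignment - 1) & ~(alignment - 1);
}

/* Hypothetical helper: would an X-tiled layout of this width still have a
 * pitch the blitter can address?  cpp is bytes per pixel. */
static bool x_tiled_pitch_is_blittable(uint32_t total_width, uint32_t cpp)
{
    return align_up(total_width * cpp, 512) < BLIT_PITCH_LIMIT;
}
```

Under these assumptions an 8192-wide, 4-byte-per-pixel miptree has a pitch of exactly 32768 and is rejected, which matches the "8192*4bpp == 32768" remark in patch 2/3, while a 4096-wide one (pitch 16384) stays eligible for X tiling.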
[Mesa-dev] [PATCH 2/3] intel: Do temporary CPU maps of textures that are too big to GTT map.
This still fails, since 8192*4bpp == 32768, which is too big to use the blitter on.

Reviewed-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/intel/intel_mipmap_tree.c | 21 +
 1 file changed, 21 insertions(+)

diff --git a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
index ffdaec5..5e0cd61 100644
--- a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
@@ -1703,6 +1703,23 @@ intel_miptree_map_singlesample(struct intel_context *intel,
 {
    struct intel_miptree_map *map;
 
+   /* Estimate the size of the mappable aperture into the GTT.  There's an
+    * ioctl to get the whole GTT size, but not one to get the mappable subset.
+    * It turns out it's basically always 256MB, though some ancient hardware
+    * was smaller.
+    */
+   uint32_t gtt_size = 256 * 1024 * 1024;
+   if (intel->gen == 2)
+      gtt_size = 128 * 1024 * 1024;
+
+   /* We don't want to map two objects such that a memcpy between them would
+    * just fault one mapping in and then the other over and over forever.  So
+    * we would need to divide the GTT size by 2.  Additionally, some GTT is
+    * taken up by things like the framebuffer and the ringbuffer and such, so
+    * be more conservative.
+    */
+   uint32_t max_gtt_map_object_size = gtt_size / 4;
+
    assert(mt->num_samples <= 1);
 
    map = intel_miptree_attach_map(mt, level, slice, x, y, w, h, mode);
@@ -1749,6 +1766,10 @@ intel_miptree_map_singlesample(struct intel_context *intel,
               mt->region->tiling == I915_TILING_X &&
               mt->region->pitch < 32768) {
       intel_miptree_map_blit(intel, mt, map, level, slice);
+   } else if (mt->region->tiling != I915_TILING_NONE &&
+              mt->region->bo->size >= max_gtt_map_object_size) {
+      assert(mt->region->pitch < 32768);
+      intel_miptree_map_blit(intel, mt, map, level, slice);
    } else {
       intel_miptree_map_gtt(intel, mt, map, level, slice);
    }
--
1.7.10.4
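For readers who want to check the arithmetic, here is a minimal sketch of the size heuristic from this patch as standalone C. The enum and both function names are hypothetical stand-ins, not Mesa identifiers; only the 256MB and 128MB aperture estimates and the divide-by-4 safety margin come from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tiling enum for this sketch (not the I915_TILING_* values). */
enum tiling { TILING_NONE, TILING_X, TILING_Y };

/* Largest object we are willing to GTT-map: a quarter of the estimated
 * mappable aperture, as in the patch. */
static uint32_t max_gtt_map_object_size(int gen)
{
    uint32_t gtt_size = 256u * 1024 * 1024;   /* usual mappable aperture */
    if (gen == 2)
        gtt_size = 128u * 1024 * 1024;        /* older hardware was smaller */
    return gtt_size / 4;
}

/* True when a tiled buffer is too big to GTT-map and should go through the
 * linear temporary (blit) path instead. */
static bool should_map_via_blit(int gen, enum tiling tiling, uint64_t bo_size)
{
    return tiling != TILING_NONE && bo_size >= max_gtt_map_object_size(gen);
}
```

With these numbers the threshold works out to 64MB (32MB on gen2). An 8192x8192 texture at 4 bytes per pixel is 256MB, so when tiled it would be sent to the blit path rather than mapped through the GTT.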
Re: [Mesa-dev] [PATCH] register_allocate: Fix the type of best_benefit.
On 04/02/2013 01:38 PM, Matt Turner wrote:
> ---
>  src/mesa/program/register_allocate.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/src/mesa/program/register_allocate.c b/src/mesa/program/register_allocate.c
> index a9064c3..7d11b73 100644
> --- a/src/mesa/program/register_allocate.c
> +++ b/src/mesa/program/register_allocate.c
> @@ -561,7 +561,7 @@ int
>  ra_get_best_spill_node(struct ra_graph *g)
>  {
>     unsigned int best_node = -1;
> -   unsigned int best_benefit = 0.0;
> +   float best_benefit = 0.0;
>     unsigned int n;
>
>     for (n = 0; n < g->count; n++) {

Yikes. Good catch...

Reviewed-by: Kenneth Graunke kenn...@whitecape.org
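To see why the type matters, here is a small self-contained sketch. It is not Mesa's ra_get_best_spill_node(); both functions and the input array are hypothetical, and they only mimic the "keep the index with the largest benefit" loop shape with the two different types for the running maximum.

```c
#include <stddef.h>

/* Running maximum kept as a float: picks the largest benefit. */
static int best_index_float(const float *benefit, size_t count)
{
    int best = -1;
    float best_benefit = 0.0f;

    for (size_t n = 0; n < count; n++) {
        if (benefit[n] > best_benefit) {
            best_benefit = benefit[n];
            best = (int)n;
        }
    }
    return best;
}

/* Same loop, but the running maximum is an unsigned int, so fractional
 * values are truncated toward zero every time they are stored. */
static int best_index_truncating(const float *benefit, size_t count)
{
    int best = -1;
    unsigned int best_benefit = 0;

    for (size_t n = 0; n < count; n++) {
        if (benefit[n] > best_benefit) {
            best_benefit = (unsigned int)benefit[n];   /* 0.75f is stored as 0 */
            best = (int)n;
        }
    }
    return best;
}
```

For benefits of 0.75, 0.25 and 0.5 the float version settles on index 0, while the truncating version keeps resetting its maximum to 0 and ends up returning the last positive entry, index 2.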
Re: [Mesa-dev] [PATCH 1/6] mesa: add texture gather changes
On 03/31/2013 02:10 AM, Chris Forbes wrote: From: Maxence Le Dore maxence.led...@gmail.com --- src/mapi/glapi/gen/ARB_texture_gather.xml | 14 ++ src/mapi/glapi/gen/gl_API.xml | 2 +- src/mesa/main/context.c | 4 src/mesa/main/extensions.c| 1 + src/mesa/main/get.c | 1 + src/mesa/main/get_hash_params.py | 6 ++ src/mesa/main/mtypes.h| 6 ++ src/mesa/main/tests/enum_strings.cpp | 3 +++ 8 files changed, 36 insertions(+), 1 deletion(-) create mode 100644 src/mapi/glapi/gen/ARB_texture_gather.xml diff --git a/src/mapi/glapi/gen/ARB_texture_gather.xml b/src/mapi/glapi/gen/ARB_texture_gather.xml new file mode 100644 index 000..cd331ac --- /dev/null +++ b/src/mapi/glapi/gen/ARB_texture_gather.xml @@ -0,0 +1,14 @@ +?xml version=1.0? +!DOCTYPE OpenGLAPI SYSTEM gl_API.dtd + +OpenGLAPI + +category name=GL_ARB_texture_gather number=72 + + enum name=MIN_PROGRAM_TEXTURE_GATHER_OFFSET_ARB value=0x8E5E/ + enum name=MAX_PROGRAM_TEXTURE_GATHER_OFFSET_ARB value=0x8E5F/ + enum name=MAX_PROGRAM_TEXTURE_GATHER_COMPONENTS_ARB value=0x8F9F/ + +/category + +/OpenGLAPI \ No newline at end of file diff --git a/src/mapi/glapi/gen/gl_API.xml b/src/mapi/glapi/gen/gl_API.xml index 75957dc..9a957d1 100644 --- a/src/mapi/glapi/gen/gl_API.xml +++ b/src/mapi/glapi/gen/gl_API.xml @@ -8188,7 +8188,7 @@ !-- 70. GL_ARB_sample_shading -- xi:include href=ARB_texture_cube_map_array.xml xmlns:xi=http://www.w3.org/2001/XInclude/ -!-- 72. GL_ARB_texture_gather -- +xi:include href=ARB_texture_gather.xml xmlns:xi=http://www.w3.org/2001/XInclude/ !-- 73. GL_ARB_texture_query_lod -- !-- ARB extension number 74 is a WGL extension. 
-- diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c index 0539934..d4e773b 100644 --- a/src/mesa/main/context.c +++ b/src/mesa/main/context.c @@ -647,6 +647,10 @@ _mesa_init_constants(struct gl_context *ctx) ctx-Const.MinProgramTexelOffset = -8; ctx-Const.MaxProgramTexelOffset = 7; + /* GL_ARB_texture_gather */ + ctx-Const.MinProgramTextureGatherOffset = -8; + ctx-Const.MaxProgramTextureGatherOffset = 7; + /* GL_ARB_robustness */ ctx-Const.ResetStrategy = GL_NO_RESET_NOTIFICATION_ARB; diff --git a/src/mesa/main/extensions.c b/src/mesa/main/extensions.c index 3116692..593ed1a 100644 --- a/src/mesa/main/extensions.c +++ b/src/mesa/main/extensions.c @@ -141,6 +141,7 @@ static const struct extension extension_table[] = { { GL_ARB_texture_env_crossbar, o(ARB_texture_env_crossbar),GLL,2001 }, { GL_ARB_texture_env_dot3,o(ARB_texture_env_dot3), GLL,2001 }, { GL_ARB_texture_float, o(ARB_texture_float), GL, 2004 }, + { GL_ARB_texture_gather, o(ARB_texture_gather), GL, 2009 }, { GL_ARB_texture_mirrored_repeat, o(dummy_true), GLL,2001 }, { GL_ARB_texture_multisample, o(ARB_texture_multisample), GL, 2009 }, { GL_ARB_texture_non_power_of_two, o(ARB_texture_non_power_of_two),GL, 2003 }, diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c index 582ef31..a182ab4 100644 --- a/src/mesa/main/get.c +++ b/src/mesa/main/get.c @@ -356,6 +356,7 @@ EXTRA_EXT(ARB_map_buffer_alignment); EXTRA_EXT(ARB_texture_cube_map_array); EXTRA_EXT(ARB_texture_buffer_range); EXTRA_EXT(ARB_texture_multisample); +EXTRA_EXT(ARB_texture_gather); static const int extra_NV_primitive_restart[] = { diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/get_hash_params.py index 7d4f7e2..1941d53 100644 --- a/src/mesa/main/get_hash_params.py +++ b/src/mesa/main/get_hash_params.py @@ -709,6 +709,12 @@ descriptor=[ # GL_ARB_texture_cube_map_array [ TEXTURE_BINDING_CUBE_MAP_ARRAY_ARB, LOC_CUSTOM, TYPE_INT, TEXTURE_CUBE_ARRAY_INDEX, extra_ARB_texture_cube_map_array ], + +# GL_ARB_texture_gather + 
[ MIN_PROGRAM_TEXTURE_GATHER_OFFSET_ARB, CONTEXT_INT(Const.MinProgramTextureGatherOffset), extra_ARB_texture_gather], + [ MAX_PROGRAM_TEXTURE_GATHER_OFFSET_ARB, CONTEXT_INT(Const.MaxProgramTextureGatherOffset), extra_ARB_texture_gather], + [ MAX_PROGRAM_TEXTURE_GATHER_COMPONENTS_ARB, CONTEXT_INT(Const.MaxProgramTextureGatherComponents), extra_ARB_texture_gather], + Maybe drop the ARB suffixes? They shouldn't be necessary. ]}, # Enums restricted to OpenGL Core profile diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h index e47e835..37e4b61 100644 --- a/src/mesa/main/mtypes.h +++ b/src/mesa/main/mtypes.h @@ -2860,6 +2860,11 @@ struct gl_constants /** GL_EXT_gpu_shader4 */ GLint MinProgramTexelOffset, MaxProgramTexelOffset; + /** GL_ARB_texture_gather
Re: [Mesa-dev] [PATCH 4/6] i965/fs: Add support for ir_tg4
On 03/31/2013 02:10 AM, Chris Forbes wrote:
> Lowers ir_tg4 (from textureGather and textureGatherOffset builtins) to
> SHADER_OPCODE_TG4.
>
> The usual post-sampling swizzle workaround can't work for ir_tg4,
> so avoid doing that:
>
> * For R/G/B/A swizzles use the hardware channel select (lives in the same
>   dword in the header as the texel offset), and then don't do anything
>   afterward in the shader.
>
> * For 0/1 swizzles blast the appropriate constant over all the output
>   channels in swizzle_result().
>
> Signed-off-by: Chris Forbes chr...@ijw.co.nz
> ---
>  src/mesa/drivers/dri/i965/brw_fs.h           | 1 +
>  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 55
>  2 files changed, 56 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h
> index d9d17a2..bc93bdf 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.h
> +++ b/src/mesa/drivers/dri/i965/brw_fs.h
> @@ -250,6 +250,7 @@ public:
>     void visit(ir_function *ir);
>     void visit(ir_function_signature *ir);
>
> +   uint32_t gather_channel(ir_texture *ir, int sampler);
>     void swizzle_result(ir_texture *ir, fs_reg orig_val, int sampler);
>
>     bool can_do_source_mods(fs_inst *inst);
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> index 8556b56..2b77883 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> @@ -1119,6 +1119,14 @@ fs_visitor::emit_texture_gen7(ir_texture *ir, fs_reg dst, fs_reg coordinate,
>        base_mrf--;
>     }
>
> +   if (ir->op == ir_tg4 && !header_present) {
> +      /* ir_tg4 needs to place its channel select in the header,
> +       * for interaction with ARB_texture_swizzle */
> +      header_present = true;
> +      mlen++;
> +      base_mrf--;
> +   }

I'm not a fan of duplicating this block. Why not just change the one above's condition to:

   if (ir->op == ir_tg4 || (ir->offset && ir->op != ir_txf)) {

Feel free to keep the comment if you like.

> +
>     if (ir->shadow_comparitor) {
>        emit(MOV(fs_reg(MRF, base_mrf + mlen), shadow_c));
>        mlen += reg_width;
> @@ -1128,6 +1136,7 @@ fs_visitor::emit_texture_gen7(ir_texture *ir, fs_reg dst, fs_reg coordinate,
>     switch (ir->op) {
>     case ir_tex:
>     case ir_lod:
> +   case ir_tg4:
>        break;
>     case ir_txb:
>        emit(MOV(fs_reg(MRF, base_mrf + mlen), lod));
> @@ -1242,6 +1251,7 @@ fs_visitor::emit_texture_gen7(ir_texture *ir, fs_reg dst, fs_reg coordinate,
>     case ir_txf_ms: inst = emit(SHADER_OPCODE_TXF_MS, dst); break;
>     case ir_txs: inst = emit(SHADER_OPCODE_TXS, dst); break;
>     case ir_lod: inst = emit(SHADER_OPCODE_LOD, dst); break;
> +   case ir_tg4: inst = emit(SHADER_OPCODE_TG4, dst); break;
>     }
>     inst->base_mrf = base_mrf;
>     inst->mlen = mlen;
> @@ -1394,6 +1404,7 @@ fs_visitor::visit(ir_texture *ir)
>     switch (ir->op) {
>     case ir_tex:
>     case ir_lod:
> +   case ir_tg4:
>        break;
>     case ir_txb:
>        ir->lod_info.bias->accept(this);
> @@ -1416,6 +1427,8 @@ fs_visitor::visit(ir_texture *ir)
>        ir->lod_info.sample_index->accept(this);
>        sample_index = this->result;
>        break;
> +   default:
> +      assert(!"Unrecognized texture opcode");
>     };
>
>     /* Writemasking doesn't eliminate channels on SIMD8 texture
> @@ -1440,6 +1453,9 @@ fs_visitor::visit(ir_texture *ir)
>     if (ir->offset != NULL && ir->op != ir_txf)
>        inst->texture_offset = brw_texture_offset(ir->offset->as_constant());
>
> +   if (ir->op == ir_tg4)
> +      inst->texture_offset |= gather_channel(ir, sampler) << 16; // M0.2:16-17

Clever. Not a bad approach, but perhaps we should rename the field to something more general. Then again, message_header_bits isn't much better...

>     inst->sampler = sampler;
>
>     if (ir->shadow_comparitor)
> @@ -1460,6 +1476,24 @@ fs_visitor::visit(ir_texture *ir)
>  }
>
>  /**
> + * Set up the gather channel based on the swizzle, for gather4.
> + */
> +uint32_t
> +fs_visitor::gather_channel(ir_texture *ir, int sampler)
> +{
> +   int swiz = GET_SWZ(c->key.tex.swizzles[sampler], 0 /* red */);
> +   switch (swiz) {
> +   case SWIZZLE_X: return 0;
> +   case SWIZZLE_Y: return 1;
> +   case SWIZZLE_Z: return 2;
> +   case SWIZZLE_W: return 3;
> +   default:
> +      /* zero, one swizzles */
> +      return 0;
> +   }
> +}
> +
> +/**
>   * Swizzle the result of a texture result.  This is necessary for
>   * EXT_texture_swizzle as well as DEPTH_TEXTURE_MODE for shadow comparisons.
>   */
> @@ -1468,9 +1502,30 @@ fs_visitor::swizzle_result(ir_texture *ir, fs_reg orig_val, int sampler)
>  {
>     this->result = orig_val;
>
> +   /* txs isn't actually sampling the texture */
>     if (ir->op == ir_txs)
>        return;
>
> +   /* tg4 does the channel select in hardware for 'real' swizzles, but can't
> +    * do the degenerate ZERO/ONE cases, so we do them here:
> +    *
> +    * blast all the output channels with zero or one as appropriate
> +    */

So, if texture swizzling selects the channel...then zero/one are stupid. It might be
Re: [Mesa-dev] [PATCH 5/6] i965/vs: Add support for ir_tg4
On 03/31/2013 02:10 AM, Chris Forbes wrote:
> Pretty much the same as the FS case. Channel select goes in the header,
> post-sampling swizzle only does the 0/1 cases.
>
> Signed-off-by: Chris Forbes chr...@ijw.co.nz
> ---
>  src/mesa/drivers/dri/i965/brw_vec4.h           | 1 +
>  src/mesa/drivers/dri/i965/brw_vec4_emit.cpp    | 2 +-
>  src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 47 --
>  3 files changed, 47 insertions(+), 3 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h b/src/mesa/drivers/dri/i965/brw_vec4.h
> index 1f832d1..36c7312 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4.h
> +++ b/src/mesa/drivers/dri/i965/brw_vec4.h
> @@ -443,6 +443,7 @@ public:
>     void emit_pack_half_2x16(dst_reg dst, src_reg src0);
>     void emit_unpack_half_2x16(dst_reg dst, src_reg src0);
>
> +   uint32_t gather_channel(ir_texture *ir, int sampler);
>     void swizzle_result(ir_texture *ir, src_reg orig_val, int sampler);
>
>     void emit_ndc_computation();
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_emit.cpp b/src/mesa/drivers/dri/i965/brw_vec4_emit.cpp
> index 7938c14..d427469 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_emit.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_emit.cpp
> @@ -354,7 +354,7 @@ vec4_generator::generate_tex(vec4_instruction *inst,
>        brw_MOV(p,
>                retype(brw_vec1_reg(BRW_MESSAGE_REGISTER_FILE, inst->base_mrf, 2),
>                       BRW_REGISTER_TYPE_UD),
> -              brw_imm_uw(inst->texture_offset));
> +              brw_imm_ud(inst->texture_offset));
>        brw_pop_insn_state(p);
>     } else if (inst->header_present) {
>        /* Set up an implied move from g0 to the MRF. */
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
> index 8bd2fd8..95cfc3b 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
> @@ -2128,6 +2128,7 @@ vec4_visitor::visit(ir_texture *ir)
>        break;
>     case ir_txb:
>     case ir_lod:
> +   case ir_tg4:
>        break;
>     }
>
> @@ -2149,15 +2150,21 @@ vec4_visitor::visit(ir_texture *ir)
>     case ir_txs:
>        inst = new(mem_ctx) vec4_instruction(this, SHADER_OPCODE_TXS);
>        break;
> +   case ir_tg4:
> +      inst = new(mem_ctx) vec4_instruction(this, SHADER_OPCODE_TG4);
> +      break;
>     case ir_txb:
>        assert(!"TXB is not valid for vertex shaders.");
>        break;
>     case ir_lod:
>        assert(!"LOD is not valid for vertex shaders.");
>        break;
> +   default:
> +      assert(!"Unrecognized tex op");
>     }
>
> -   bool use_texture_offset = ir->offset != NULL && ir->op != ir_txf;
> +   bool use_texture_offset = (ir->offset != NULL && ir->op != ir_txf)
> +      || ir->op == ir_tg4;

I'd prefer to leave this as is, and instead...

>     /* Texel offsets go in the message header; Gen4 also requires headers. */
>     inst->header_present = use_texture_offset || intel->gen < 5;

   inst->header_present = use_texture_offset || ir->op == ir_tg4 || intel->gen < 5;

> @@ -2168,9 +2175,13 @@ vec4_visitor::visit(ir_texture *ir)
>     inst->dst.writemask = WRITEMASK_XYZW;
>     inst->shadow_compare = ir->shadow_comparitor != NULL;
>
> -   if (use_texture_offset)
> +   if (use_texture_offset && ir->offset)
>        inst->texture_offset = brw_texture_offset(ir->offset->as_constant());

Then you can leave this alone too...

> +   /* Stuff the channel select bits in the top of the texture offset */
> +   if (ir->op == ir_tg4)
> +      inst->texture_offset |= gather_channel(ir, sampler)<<16;
> +
>     /* MRF for the first parameter */
>     int param_base = inst->base_mrf + inst->header_present;
>
> @@ -2290,6 +2301,24 @@ vec4_visitor::visit(ir_texture *ir)
>     swizzle_result(ir, src_reg(inst->dst), sampler);
>  }
>
> +/**
> + * Set up the gather channel based on the swizzle, for gather4.
> + */
> +uint32_t
> +vec4_visitor::gather_channel(ir_texture *ir, int sampler)
> +{
> +   int swiz = GET_SWZ(c->key.tex.swizzles[sampler], 0 /* red */);
> +   switch (swiz) {
> +   case SWIZZLE_X: return 0;
> +   case SWIZZLE_Y: return 1;
> +   case SWIZZLE_Z: return 2;
> +   case SWIZZLE_W: return 3;
> +   default:
> +      /* zero, one swizzles */
> +      return 0;
> +   }
> +}
> +
>  void
>  vec4_visitor::swizzle_result(ir_texture *ir, src_reg orig_val, int sampler)
>  {
> @@ -2304,6 +2333,20 @@ vec4_visitor::swizzle_result(ir_texture *ir, src_reg orig_val, int sampler)
>        return;
>     }
>
> +   /* ir_tg4 does its swizzling in hardware, except for ZERO/ONE degenerate
> +    * cases, which we'll do here
> +    */
> +   if (ir->op == ir_tg4) {
> +      int swiz = GET_SWZ(s,0);
> +      if (swiz != SWIZZLE_ZERO && swiz != SWIZZLE_ONE) {
> +         emit(MOV(swizzled_result, orig_val));
> +         return;
> +      }
> +
> +      emit(MOV(swizzled_result, src_reg(swiz == SWIZZLE_ONE ? 1.0f : 0.0f)));
> +      return;
> +   }

Again, we should probably do this earlier in visit(ir_texture *). Then you can just add || ir->op == ir_tg4 to the above block which short-circuits.

> +
>     int zero_mask = 0, one_mask = 0, copy_mask = 0; int
Re: [Mesa-dev] [PATCH 6/6] i965: Enable ARB_texture_gather on Gen7
On 03/31/2013 04:01 PM, Matt Turner wrote:
> On Sun, Mar 31, 2013 at 2:10 AM, Chris Forbes chr...@ijw.co.nz wrote:
>> Signed-off-by: Chris Forbes chr...@ijw.co.nz
>> ---
>>  src/mesa/drivers/dri/i965/brw_context.c       | 1 +
>>  src/mesa/drivers/dri/intel/intel_extensions.c | 4
>>  2 files changed, 5 insertions(+)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c
>> index ceaf325..e8f9c60 100644
>> --- a/src/mesa/drivers/dri/i965/brw_context.c
>> +++ b/src/mesa/drivers/dri/i965/brw_context.c
>> @@ -210,6 +210,7 @@ brwCreateContext(int api,
>>        ctx->Const.MaxColorTextureSamples = 8;
>>        ctx->Const.MaxDepthTextureSamples = 8;
>>        ctx->Const.MaxIntegerSamples = 8;
>> +      ctx->Const.MaxProgramTextureGatherComponents = 4;
>>     }
>>
>>     /* if conformance mode is set, swrast can handle any size AA point */
>> diff --git a/src/mesa/drivers/dri/intel/intel_extensions.c b/src/mesa/drivers/dri/intel/intel_extensions.c
>> index 9efdee4..450c84d 100755
>> --- a/src/mesa/drivers/dri/intel/intel_extensions.c
>> +++ b/src/mesa/drivers/dri/intel/intel_extensions.c
>> @@ -110,6 +110,10 @@ intelInitExtensions(struct gl_context *ctx)
>>        ctx->Extensions.ARB_texture_multisample = true;
>>     }
>>
>> +   if (intel->gen == 7) {
>> +      ctx->Extensions.ARB_texture_gather = true;
>> +   }
>> +
>
> Put this above the intel->gen (>|=)= 6 blocks?
>
> Also update GL3.txt :)

Nope, because Chris hasn't implemented it for Sandybridge in this series. It would need MaxProgramTextureGatherComponents = 1 for starts, and probably some workarounds...
Re: [Mesa-dev] [RFC PATCH] ARB_texture_gather support for Gen7 i965.
On 03/31/2013 02:10 AM, Chris Forbes wrote:
> This series implements ARB_texture_gather in core mesa, and the driver side
> for Gen7 i965.
>
> Not quite baked -- green/blue/alpha texture swizzles with VS don't work yet.
> Everything else works, though (R/0/1 swizzles in VS; all swizzles in FS;
> textureGather and textureGatherOffset).
>
> The first two patches are pretty much what Maxence sent out, but tidied up
> so they work; the third patch of that original series (st + softpipe) is
> dropped for now, but shouldn't be hard to reinclude.

Patches 1-3 are:
Reviewed-by: Kenneth Graunke kenn...@whitecape.org
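As a rough illustration of the channel-select packing that patches 4/6 and 5/6 of this series describe (the gather channel is OR'ed into bits 16-17 of the same header dword that carries the texel offset), here is a standalone C sketch. The enum, its values and pack_header_dword() are hypothetical names for this example only; the 0..3 channel mapping and the shift by 16 are taken from the patches.

```c
#include <stdint.h>

/* Hypothetical swizzle codes for this sketch only. */
enum swz { SWZ_X, SWZ_Y, SWZ_Z, SWZ_W, SWZ_ZERO, SWZ_ONE };

/* Map the red-channel swizzle to a hardware channel select, mirroring the
 * switch in the patches: X/Y/Z/W become 0..3, ZERO/ONE fall back to 0
 * because they are handled later by swizzle_result(). */
static uint32_t gather_channel(enum swz red_swizzle)
{
    switch (red_swizzle) {
    case SWZ_X: return 0;
    case SWZ_Y: return 1;
    case SWZ_Z: return 2;
    case SWZ_W: return 3;
    default:    return 0;   /* zero, one swizzles */
    }
}

/* Combine the packed texel offset (low bits) with the channel select
 * (bits 16-17) into one 32-bit header value. */
static uint32_t pack_header_dword(uint32_t texel_offset_bits, enum swz red_swizzle)
{
    return texel_offset_bits | (gather_channel(red_swizzle) << 16);
}
```

One consequence visible in this sketch: truncating the packed value to 16 bits discards the channel select entirely, so whatever emits it has to write a full 32-bit immediate. Patch 5/6 does switch brw_imm_uw to brw_imm_ud in generate_tex(); whether that is related to the VS swizzle issue mentioned above is not stated in the thread.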