[Mesa-dev] [PATCH] i965: Fix hangs on HSW since the gen6 blorp fix.
The constant packets for gen6 are too small for gen7, and while IVB
seems happy with them, HSW blows up. Just restore the HSW path to what
it was before the gen6 change, by making gen7-specific functions to set
up these stages.
---
The alternative here would be to emit the correct lengths of packets in
these new functions. But we're not emitting constants for other disabled
stages on gen7+, so I'm leaning toward this variant.

 src/mesa/drivers/dri/i965/gen7_blorp.cpp | 57 ++--
 1 file changed, 55 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen7_blorp.cpp b/src/mesa/drivers/dri/i965/gen7_blorp.cpp
index 1c23866..330e3d5 100644
--- a/src/mesa/drivers/dri/i965/gen7_blorp.cpp
+++ b/src/mesa/drivers/dri/i965/gen7_blorp.cpp
@@ -276,6 +276,39 @@ gen7_blorp_emit_sampler_state(struct brw_context *brw,
 }
 
 
+/* 3DSTATE_VS
+ *
+ * Disable vertex shader.
+ */
+static void
+gen7_blorp_emit_vs_disable(struct brw_context *brw,
+                           const brw_blorp_params *params)
+{
+   struct intel_context *intel = &brw->intel;
+
+   if (intel->gen == 6) {
+      /* From the BSpec, Volume 2a, Part 3 "Vertex Shader", Section
+       * 3DSTATE_VS, Dword 5.0 "VS Function Enable":
+       *
+       *   [DevSNB] A pipeline flush must be programmed prior to a
+       *   3DSTATE_VS command that causes the VS Function Enable to
+       *   toggle. Pipeline flush can be executed by sending a PIPE_CONTROL
+       *   command with CS stall bit set and a post sync operation.
+       */
+      intel_emit_post_sync_nonzero_flush(intel);
+   }
+
+   BEGIN_BATCH(6);
+   OUT_BATCH(_3DSTATE_VS << 16 | (6 - 2));
+   OUT_BATCH(0);
+   OUT_BATCH(0);
+   OUT_BATCH(0);
+   OUT_BATCH(0);
+   OUT_BATCH(0);
+   ADVANCE_BATCH();
+}
+
+
 /* 3DSTATE_HS
  *
  * Disable the hull shader.
@@ -337,6 +370,26 @@ gen7_blorp_emit_ds_disable(struct brw_context *brw,
    ADVANCE_BATCH();
 }
 
+/* 3DSTATE_GS
+ *
+ * Disable the geometry shader.
+ */
+static void
+gen7_blorp_emit_gs_disable(struct brw_context *brw,
+                           const brw_blorp_params *params)
+{
+   struct intel_context *intel = &brw->intel;
+
+   BEGIN_BATCH(7);
+   OUT_BATCH(_3DSTATE_GS << 16 | (7 - 2));
+   OUT_BATCH(0);
+   OUT_BATCH(0);
+   OUT_BATCH(0);
+   OUT_BATCH(0);
+   OUT_BATCH(0);
+   OUT_BATCH(0);
+   ADVANCE_BATCH();
+}
 
 /* 3DSTATE_STREAMOUT
  *
@@ -784,11 +837,11 @@ gen7_blorp_exec(struct intel_context *intel,
                                       wm_surf_offset_texture);
       sampler_offset = gen7_blorp_emit_sampler_state(brw, params);
    }
-   gen6_blorp_emit_vs_disable(brw, params);
+   gen7_blorp_emit_vs_disable(brw, params);
    gen7_blorp_emit_hs_disable(brw, params);
    gen7_blorp_emit_te_disable(brw, params);
    gen7_blorp_emit_ds_disable(brw, params);
-   gen6_blorp_emit_gs_disable(brw, params);
+   gen7_blorp_emit_gs_disable(brw, params);
    gen7_blorp_emit_streamout_disable(brw, params);
    gen6_blorp_emit_clip_disable(brw, params);
    gen7_blorp_emit_sf_config(brw, params);
-- 
1.8.3.rc0
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
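For anyone following the length arithmetic in this patch: the first dword of each packet ORs the opcode (shifted left by 16) with the total dword count minus two, which is why a 6-dword packet is written as (6 - 2) and a 7-dword one as (7 - 2). Here is a minimal sketch of that encoding in plain C. The helper names (cmd_header, cmd_total_dwords), the EXAMPLE_OPCODE value and the 16-bit length field are assumptions made for illustration, not driver API.

```c
#include <stdint.h>

/* Illustrative opcode value only; treat it as an assumption, not a
 * definition taken from the driver headers. */
#define EXAMPLE_OPCODE 0x7810u

/* Header dword: opcode in the upper 16 bits, (total dwords - 2) below. */
static uint32_t cmd_header(uint32_t opcode, uint32_t total_dwords)
{
    return (opcode << 16) | (total_dwords - 2u);
}

/* Total packet length in dwords implied by a header from cmd_header().
 * The 16-bit length field is a simplification for this sketch. */
static uint32_t cmd_total_dwords(uint32_t header)
{
    return (header & 0xffffu) + 2u;
}
```

If the length in the header disagrees with the number of dwords the hardware expects for that packet on a given generation, the command parser ends up misreading what follows, which fits the kind of failure described above.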
Re: [Mesa-dev] [PATCH] ilo: Add missing break statement in aos_tex TGSI_OPCODE_TEX2 case.
On Mon, May 6, 2013 at 3:51 AM, Vinson Lee v...@freedesktop.org wrote:
> Fixes "Missing break in switch" defect reported by Coverity.
>
> Signed-off-by: Vinson Lee v...@freedesktop.org

Applied. Thanks.

> ---
>  src/gallium/drivers/ilo/shader/toy_tgsi.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/src/gallium/drivers/ilo/shader/toy_tgsi.c b/src/gallium/drivers/ilo/shader/toy_tgsi.c
> index c2b1da5..046c646 100644
> --- a/src/gallium/drivers/ilo/shader/toy_tgsi.c
> +++ b/src/gallium/drivers/ilo/shader/toy_tgsi.c
> @@ -357,6 +357,7 @@ aos_tex(struct toy_compiler *tc,
>        break;
>     case TGSI_OPCODE_TEX2:
>        opcode = TOY_OPCODE_TGSI_TEX2;
> +      break;
>     case TGSI_OPCODE_TXB2:
>        opcode = TOY_OPCODE_TGSI_TXB2;
>        break;
> --
> 1.8.1.3

-- 
o...@lunarg.com
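To make the effect of the missing break concrete, here is a small stand-alone sketch. The enum names are hypothetical stand-ins for the TGSI and TOY opcodes; only the control flow mirrors the patch.

```c
/* Hypothetical stand-ins for the source and translated opcodes. */
enum src_op { SRC_TEX2, SRC_TXB2 };
enum dst_op { DST_NONE, DST_TEX2, DST_TXB2 };

/* Before the fix: the TEX2 case runs on into the TXB2 case. */
static enum dst_op translate_buggy(enum src_op op)
{
    enum dst_op out = DST_NONE;
    switch (op) {
    case SRC_TEX2:
        out = DST_TEX2;
        /* falls through */
    case SRC_TXB2:
        out = DST_TXB2;
        break;
    }
    return out;
}

/* After the fix: each case ends with its own break. */
static enum dst_op translate_fixed(enum src_op op)
{
    enum dst_op out = DST_NONE;
    switch (op) {
    case SRC_TEX2:
        out = DST_TEX2;
        break;
    case SRC_TXB2:
        out = DST_TXB2;
        break;
    }
    return out;
}
```

In the buggy variant a TEX2 input is silently translated as TXB2, because the second assignment overwrites the first.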
Re: [Mesa-dev] [PATCH] i965: Fix hangs on HSW since the gen6 blorp fix.
Eric Anholt e...@anholt.net writes:

> The constant packets for gen6 are too small for gen7, and while IVB
> seems happy with them HSW blows up. Just restore the HSW path to what
> it was before the gen6 change, by making gen7-specific functions to
> set up these stages.

Of course, this needs the NOTE for stable branches to get picked back
along with the fix that caused the regression.
Re: [Mesa-dev] [PATCH 4/4] tgsi: fix operand type of TGSI_OPCODE_NOT
On Mon, May 6, 2013 at 6:45 PM, Roland Scheidegger srol...@vmware.com wrote:
> On 05.05.2013 18:34, Chia-I Wu wrote:
>> It should be TGSI_TYPE_UNSIGNED, not TGSI_TYPE_FLOAT. Fixed also
>> gallivm not_emit_cpu() to use uint build context.
>>
>> Signed-off-by: Chia-I Wu olva...@gmail.com
>> ---
>>  src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c |    2 +-
>>  src/gallium/auxiliary/tgsi/tgsi_info.c             |    1 +
>>  2 files changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
>> index dc7c090..1feaa19 100644
>> --- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
>> +++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
>> @@ -1314,7 +1314,7 @@ not_emit_cpu(
>>     struct lp_build_tgsi_context * bld_base,
>>     struct lp_build_emit_data * emit_data)
>>  {
>> -   emit_data->output[emit_data->chan] = lp_build_not(&bld_base->base,
>> +   emit_data->output[emit_data->chan] = lp_build_not(&bld_base->uint_bld,
>>                                                       emit_data->args[0]);
>>  }
>>
>> diff --git a/src/gallium/auxiliary/tgsi/tgsi_info.c b/src/gallium/auxiliary/tgsi/tgsi_info.c
>> index 90bb497..99b1c66 100644
>> --- a/src/gallium/auxiliary/tgsi/tgsi_info.c
>> +++ b/src/gallium/auxiliary/tgsi/tgsi_info.c
>> @@ -276,6 +276,7 @@ tgsi_opcode_infer_type( uint opcode )
>>     case TGSI_OPCODE_MOV:
>>     case TGSI_OPCODE_UCMP:
>>        return TGSI_TYPE_UNTYPED;
>> +   case TGSI_OPCODE_NOT:
>>     case TGSI_OPCODE_SHL:
>>     case TGSI_OPCODE_AND:
>>     case TGSI_OPCODE_OR:
>
> Series looks mostly good to me.
> I think the order might have been supposed to be alphabetic at some
> point, but maybe by number makes more sense.
> I also think that this function leaves something to be desired (even
> though that's nothing new). For example, I believe UCMP is not handled
> correctly: the first src arg is uint (or int, but definitely not
> float) while the other src args ought to be floats, but there's no way
> to have different argument types for different src args. So if you
> have tgsi_exec executing ucmp, it will assume all arguments are ints,
> since it doesn't use the type info here and can't currently handle
> different src types; hence it will do negation on src1/src2 as if they
> were ints. The gallivm code, on the other hand, inherently knows the
> first src is an int for comparison purposes (but does negation on it
> as if it were a float), whereas for the 2nd and 3rd src args it does
> negation/abs as on a float arg, as intended.
> Should probably be fixed at some point (or otherwise forbid ucmp to
> have modifiers on src args and just make them ints).

Yes, I intended not to change what these functions currently return
(other than the fix I need).

The problem is that some opcodes expect different arguments to have
different types, but these functions cannot handle that. Whether we
want to extend the functions probably depends on whether there is a
real need. Drivers may already have special logic for UCMP. If we view
texture units as integer immediates, many texturing opcodes also have
mixed data types, but they are unlikely to matter.

Maybe we can return something like TGSI_TYPE_UNKNOWN so that drivers
are not trapped into treating all arguments of some opcodes as having
one specific data type. Right now, we return mostly TGSI_TYPE_FLOAT for
opcodes that have no dst or no src. We may (or may not) want to change
that too.

> Roland

-- 
o...@lunarg.com
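As a small illustration of why NOT belongs with the integer opcodes: it is a bitwise operation on the raw 32-bit channel value, so the operand has to be read as an unsigned integer rather than as a float. The sketch below is plain C with hypothetical helper names (not_bits, float_as_bits); it does not use any TGSI or gallivm API.

```c
#include <stdint.h>
#include <string.h>

/* NOT as a bitwise operation on the raw 32-bit channel value. */
static uint32_t not_bits(uint32_t bits)
{
    return ~bits;
}

/* Read the storage of a float as 32 raw bits, without converting it. */
static uint32_t float_as_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

For example, not_bits(float_as_bits(1.0f)) flips every bit of the IEEE-754 encoding of 1.0; the result only makes sense as an integer bit pattern, which is the reason for reporting an unsigned type for this opcode.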
[Mesa-dev] non x11/xlib based EGL and software only renderer
Hi,

Is there a possibility in Mesa to have an EGL backend based entirely on
offscreen buffers, together with a software-only GLES renderer? If so,
could someone please guide me on how to build it?

Thanks & Regards,
Divick
Re: [Mesa-dev] [PATCH 3/3] i965: Use Y-tiled blits to untile for cached mappings of miptrees.
On 05/06/2013 04:41 PM, Eric Anholt wrote:
> Fixes a regression in firefox's ReadScreenIntoImageSurface ->
> glReadPixels() path with the introduction of Y tiling.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64213
> ---
>  src/mesa/drivers/dri/intel/intel_mipmap_tree.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
> index 8970228..7f4cb4a 100644
> --- a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
> +++ b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
> @@ -1903,7 +1903,8 @@ intel_miptree_map_singlesample(struct intel_context *intel,
>     else if (intel->has_llc &&
>              !(mode & GL_MAP_WRITE_BIT) &&
>              !mt->compressed &&
> -            mt->region->tiling == I915_TILING_X &&
> +            (mt->region->tiling == I915_TILING_X ||
> +             (intel->gen >= 6 && mt->region->tiling == I915_TILING_Y)) &&
>              mt->region->pitch < 32768) {
>        intel_miptree_map_blit(intel, mt, map, level, slice);
>     } else if (mt->region->tiling != I915_TILING_NONE &&

This patch is fine, but the blitter can handle untiled buffers as well.
It might be even better (and simpler) as:

   (intel->gen >= 6 || mt->region->tiling != I915_TILING_Y)

That said, untiled buffers can also be mapped via the CPU rather than
the GTT with a fence, so maybe it's not as big of a deal.

Either way, this series is:
Reviewed-by: Kenneth Graunke kenn...@whitecape.org
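The two tiling conditions discussed here differ only for untiled buffers. A minimal sketch that isolates just the tiling part of the test, using a hypothetical enum and function names rather than the driver's types:

```c
#include <stdbool.h>

/* Hypothetical stand-in for the I915_TILING_* values. */
enum tiling { TILING_NONE, TILING_X, TILING_Y };

/* Tiling test as written in the patch: X always, Y on gen6 and later. */
static bool blit_tiling_ok_patch(int gen, enum tiling t)
{
    return t == TILING_X || (gen >= 6 && t == TILING_Y);
}

/* Tiling test as suggested in the review: anything but Y before gen6. */
static bool blit_tiling_ok_review(int gen, enum tiling t)
{
    return gen >= 6 || t != TILING_Y;
}
```

Both versions agree for X- and Y-tiled buffers on every generation; the reviewed form additionally accepts TILING_NONE, which is the "blitter can handle untiled buffers as well" point.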
Re: [Mesa-dev] [PATCH] i965: Fix hangs on HSW since the gen6 blorp fix.
On 05/06/2013 09:02 PM, Eric Anholt wrote:
> The constant packets for gen6 are too small for gen7, and while IVB
> seems happy with them HSW blows up. Just restore the HSW path to what
> it was before the gen6 change, by making gen7-specific functions to
> set up these stages.
> ---
> The alternative here would be to emit the correct lengths of packets
> in these new functions. But we're not emitting constants for other
> disabled stages on gen7+, so I'm leaning toward this variant.

Actually, we are in the normal state upload path: gen7_disable_stages
emits zero-filled 3DSTATE_CONSTANT_GS/HS/DS packets.

Given the hangs on Sandybridge, I think it'd be best to explicitly
disable them all in blorp too. It's definitely the safest approach, and
not a ton of code.

--Ken
Re: [Mesa-dev] [PATCH 06/12] glsl: Add lowering pass for ir_triop_vector_insert
On 05/03/2013 04:07 PM, Ian Romanick wrote:
> From: Ian Romanick ian.d.roman...@intel.com
>
> This will eventually replace do_vec_index_to_cond_assign. This
> lowering pass is called in all the places where
> do_vec_index_to_cond_assign or do_vec_index_to_swizzle is called.
>
> v2: Use WRITEMASK_* instead of integer literals. Use a more concise
> method of generating broadcast_index. Both suggested by Eric.
>
> Signed-off-by: Ian Romanick ian.d.roman...@intel.com
> Reviewed-by: Eric Anholt e...@anholt.net
> ---
>  src/glsl/Makefile.sources                  |   1 +
>  src/glsl/glsl_parser_extras.cpp            |   1 +
>  src/glsl/ir_optimization.h                 |   1 +
>  src/glsl/lower_vector_insert.cpp           | 160 +
>  src/mesa/drivers/dri/i965/brw_shader.cpp   |   1 +
>  src/mesa/program/ir_to_mesa.cpp            |   1 +
>  src/mesa/state_tracker/st_glsl_to_tgsi.cpp |   1 +
>  7 files changed, 166 insertions(+)
>  create mode 100644 src/glsl/lower_vector_insert.cpp
>
> diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources
> index 674a05f..8e2dc1b 100644
> --- a/src/glsl/Makefile.sources
> +++ b/src/glsl/Makefile.sources
> @@ -69,6 +69,7 @@ LIBGLSL_FILES = \
>  	$(GLSL_SRCDIR)/lower_vec_index_to_cond_assign.cpp \
>  	$(GLSL_SRCDIR)/lower_vec_index_to_swizzle.cpp \
>  	$(GLSL_SRCDIR)/lower_vector.cpp \
> +	$(GLSL_SRCDIR)/lower_vector_insert.cpp \
>  	$(GLSL_SRCDIR)/lower_output_reads.cpp \
>  	$(GLSL_SRCDIR)/lower_ubo_reference.cpp \
>  	$(GLSL_SRCDIR)/opt_algebraic.cpp \
> diff --git a/src/glsl/glsl_parser_extras.cpp b/src/glsl/glsl_parser_extras.cpp
> index 0992294..d38e967 100644
> --- a/src/glsl/glsl_parser_extras.cpp
> +++ b/src/glsl/glsl_parser_extras.cpp
> @@ -1236,6 +1236,7 @@ do_common_optimization(exec_list *ir, bool linked,
>     progress = do_algebraic(ir) || progress;
>     progress = do_lower_jumps(ir) || progress;
>     progress = do_vec_index_to_swizzle(ir) || progress;
> +   progress = lower_vector_insert(ir, false) || progress;
>     progress = do_swizzle_swizzle(ir) || progress;
>     progress = do_noop_swizzle(ir) || progress;
>
> diff --git a/src/glsl/ir_optimization.h b/src/glsl/ir_optimization.h
> index a8885d7..0216e46 100644
> --- a/src/glsl/ir_optimization.h
> +++ b/src/glsl/ir_optimization.h
> @@ -106,6 +106,7 @@ void lower_ubo_reference(struct gl_shader *shader, exec_list *instructions);
>  void lower_packed_varyings(void *mem_ctx, unsigned location_base,
>                             unsigned locations_used, ir_variable_mode mode,
>                             gl_shader *shader);
> +bool lower_vector_insert(exec_list *instructions, bool lower_nonconstant_index);
>  bool optimize_redundant_jumps(exec_list *instructions);
>  bool optimize_split_arrays(exec_list *instructions, bool linked);
>
> diff --git a/src/glsl/lower_vector_insert.cpp b/src/glsl/lower_vector_insert.cpp
> new file mode 100644
> index 000..3dbc263
> --- /dev/null
> +++ b/src/glsl/lower_vector_insert.cpp
> @@ -0,0 +1,160 @@
> +/*
> + * Copyright © 2013 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the next
> + * paragraph) shall be included in all copies or substantial portions of the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> + * DEALINGS IN THE SOFTWARE.
> + */
> +#include "ir.h"
> +#include "ir_builder.h"
> +#include "ir_rvalue_visitor.h"
> +#include "ir_optimization.h"
> +
> +using namespace ir_builder;
> +
> +class vector_insert_visitor : public ir_rvalue_visitor {
> +public:
> +   vector_insert_visitor(bool lower_nonconstant_index)
> +      : progress(false), lower_nonconstant_index(lower_nonconstant_index)
> +   {
> +      factory.instructions = &factory_instructions;
> +   }
> +
> +   virtual ~vector_insert_visitor()
> +   {
> +      assert(factory_instructions.is_empty());
> +   }
> +
> +   virtual void handle_rvalue(ir_rvalue **rv);
> +
> +   ir_factory factory;
> +   exec_list factory_instructions;
> +   bool progress;
> +   bool lower_nonconstant_index;
> +};
> +
> +
> +void
> +vector_insert_visitor::handle_rvalue(ir_rvalue **rv)
> +{
> +   if (*rv == NULL || (*rv)->ir_type != ir_type_expression)
> +      return;
> +
> +   ir_expression *const expr = (ir_expression *)
Re: [Mesa-dev] non x11/xlib based EGL and software only renderer
On Tue, May 7, 2013 at 1:28 PM, Divick Kishore divick.kish...@gmail.com wrote:
> Hi,
> is there a possibility in mesa to have egl backend based on
> complete offscreen buffers and complete s/w only gles renderer? If
> yes, then could someone please guide me how to build it?

You may try

  $ ./configure --disable-dri --enable-gallium-egl --with-egl-platforms=null \
      --with-gallium-drivers=swrast

It will give you an EGL/GLES driver that uses a software renderer and
supports only pbuffers (and FBOs). You won't be able to ask it to
render to an application-provided buffer though.

> Thanks & Regards,
> Divick

-- 
o...@lunarg.com
Re: [Mesa-dev] Intel HD4000 Epic Citadel (Unreal Engine WebGL demo) performance regression and some comparisons with Windows
As Eric suggested in the bug, with layers.acceleration.force-enabled
set (current Mesa git + patch series to fix Y tiling), the frame rate
goes to 43.7 FPS, compared to 40.5 FPS in Windows 7 (latest Intel
drivers).

Congratulations, we're faster than Windows (assuming that Windows isn't
blacklisted too) :)

On Sat, May 4, 2013 at 4:15 PM, Vedran Rodic vro...@gmail.com wrote:
> Hi,
>
> I bisected a performance regression here:
> https://bugs.freedesktop.org/show_bug.cgi?id=64213
>
> Here's some benchmark numbers:
>
> Core i5-3320M, Firefox 23 nightly build 2013-05-04
> Epic Citadel 1920x1080 (browser window maximized, not game screen)
>
> Windows:           40.5 FPS
> Linux mesa git:    18.9 FPS (2013-05-04 with tiling fix)
> Linux mesa 9.1.1:  18.9 FPS
>
> Vedran
Re: [Mesa-dev] [PATCH 09/12] glsl: Convert lower_clip_distance_visitor to be an ir_rvalue_visitor
On 05/03/2013 04:07 PM, Ian Romanick wrote:
> From: Ian Romanick ian.d.roman...@intel.com
>
> Right now the lower_clip_distance_visitor lowers variable indexing into
> gl_ClipDistance into variable indexing into both the array
> gl_ClipDistanceMESA and the vectors of that array. For example,
>
>     gl_ClipDistance[i] = f;
>
> becomes
>
>     gl_ClipDistanceMESA[i/4][i%4] = f;
>
> However, variable indexing into vectors using ir_dereference_array is
> being removed. Instead, ir_expression with ir_triop_vector_insert will
> be used. The above code will become
>
>     gl_ClipDistanceMESA[i/4] =
>        vector_insert(gl_ClipDistanceMESA[i/4], i % 4, f);
>
> In order to do this, an ir_rvalue_visitor will need to be used. This
> commit is really just a refactor to get ready for that.
>
> v2: Convert tabs to spaces. Suggested by Eric.
>
> Signed-off-by: Ian Romanick ian.d.roman...@intel.com
> Cc: Paul Berry stereotype...@gmail.com
> ---
>  src/glsl/lower_clip_distance.cpp | 136 +--
>  1 file changed, 86 insertions(+), 50 deletions(-)
>
> diff --git a/src/glsl/lower_clip_distance.cpp b/src/glsl/lower_clip_distance.cpp
> index 643807d..19068fb 100644
> --- a/src/glsl/lower_clip_distance.cpp
> +++ b/src/glsl/lower_clip_distance.cpp
> @@ -46,10 +46,10 @@
>   */
>
>  #include "glsl_symbol_table.h"
> -#include "ir_hierarchical_visitor.h"
> +#include "ir_rvalue_visitor.h"
>  #include "ir.h"
>
> -class lower_clip_distance_visitor : public ir_hierarchical_visitor {
> +class lower_clip_distance_visitor : public ir_rvalue_visitor {
>  public:
>     lower_clip_distance_visitor()
>        : progress(false), old_clip_distance_var(NULL),
> @@ -59,11 +59,12 @@ public:
>
>     virtual ir_visitor_status visit(ir_variable *);
>     void create_indices(ir_rvalue*, ir_rvalue *&, ir_rvalue *&);
> -   virtual ir_visitor_status visit_leave(ir_dereference_array *);
>     virtual ir_visitor_status visit_leave(ir_assignment *);
>     void visit_new_assignment(ir_assignment *ir);
>     virtual ir_visitor_status visit_leave(ir_call *);
>
> +   virtual void handle_rvalue(ir_rvalue **rvalue);
> +
>     bool progress;
>
>     /**
> @@ -173,33 +174,35 @@ lower_clip_distance_visitor::create_indices(ir_rvalue *old_index,
>  }
>
>
> -/**
> - * Replace any expression that indexes into the gl_ClipDistance array with an
> - * expression that indexes into one of the vec4's in gl_ClipDistanceMESA and
> - * accesses the appropriate component.
> - */
> -ir_visitor_status
> -lower_clip_distance_visitor::visit_leave(ir_dereference_array *ir)
> +void
> +lower_clip_distance_visitor::handle_rvalue(ir_rvalue **rv)
>  {
>     /* If the gl_ClipDistance var hasn't been declared yet, then
>      * there's no way this deref can refer to it.
>      */
> -   if (!this->old_clip_distance_var)
> -      return visit_continue;
> -
> -   ir_dereference_variable *old_var_ref = ir->array->as_dereference_variable();
> -   if (old_var_ref && old_var_ref->var == this->old_clip_distance_var) {
> -      this->progress = true;
> -      ir_rvalue *array_index;
> -      ir_rvalue *swizzle_index;
> -      this->create_indices(ir->array_index, array_index, swizzle_index);
> -      void *mem_ctx = ralloc_parent(ir);
> -      ir->array = new(mem_ctx) ir_dereference_array(
> -         this->new_clip_distance_var, array_index);
> -      ir->array_index = swizzle_index;
> +   if (!this->old_clip_distance_var || *rv == NULL)
> +      return;
> +
> +   ir_dereference_array *const array = (*rv)->as_dereference_array();
> +   if (array != NULL) {

Writing this as:

   if (array == NULL)
      return;

would have allowed the indentation to stay the same and probably made
the patch easier to follow. But at this point I'm not sure it matters.

> +      /* Replace any expression that indexes into the gl_ClipDistance array
> +       * with an expression that indexes into one of the vec4's in
> +       * gl_ClipDistanceMESA and accesses the appropriate component.
> +       */
> +      ir_dereference_variable *old_var_ref =
> +         array->array->as_dereference_variable();
> +      if (old_var_ref && old_var_ref->var == this->old_clip_distance_var) {
> +         this->progress = true;
> +         ir_rvalue *array_index;
> +         ir_rvalue *swizzle_index;
> +         this->create_indices(array->array_index, array_index, swizzle_index);
> +         void *mem_ctx = ralloc_parent(array);
> +         array->array =
> +            new(mem_ctx) ir_dereference_array(this->new_clip_distance_var,
> +                                              array_index);
> +         array->array_index = swizzle_index;
> +      }
>     }
> -
> -   return visit_continue;
>  }
>
>
> @@ -214,38 +217,71 @@ lower_clip_distance_visitor::visit_leave(ir_assignment *ir)
>  {
>     ir_dereference_variable *lhs_var = ir->lhs->as_dereference_variable();
>     ir_dereference_variable *rhs_var = ir->rhs->as_dereference_variable();
> -   if ((lhs_var && lhs_var->var == this->old_clip_distance_var)
> -       || (rhs_var && rhs_var->var == this->old_clip_distance_var)) {

Splitting this into LHS/RHS cases is fine, but it's unrelated to
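For reference, the index split and the vector_insert form described in the commit message can be modelled in a few lines of plain C. This is only a sketch: struct vec4, the two-element array size, and the function names are assumptions for illustration and are not the GLSL IR classes the patch works on.

```c
/* Assumed packing for this sketch: 8 clip distances in 2 vec4s. */
#define CLIP_VEC4S 2

struct vec4 { float c[4]; };

struct clip_index {
    unsigned array_index;  /* which vec4: i / 4 */
    unsigned component;    /* which component of it: i % 4 */
};

/* Split a gl_ClipDistance index into vec4 index and component. */
static struct clip_index split_clip_index(unsigned i)
{
    struct clip_index idx = { i / 4u, i % 4u };
    return idx;
}

/* vector_insert(v, c, f): a copy of v with component c replaced by f. */
static struct vec4 vector_insert(struct vec4 v, unsigned component, float f)
{
    v.c[component] = f;
    return v;
}

/* gl_ClipDistance[i] = f, expressed as a whole-vector assignment. */
static void store_clip_distance(struct vec4 mesa[CLIP_VEC4S], unsigned i, float f)
{
    struct clip_index idx = split_clip_index(i);
    mesa[idx.array_index] =
        vector_insert(mesa[idx.array_index], idx.component, f);
}
```

The point of the vector_insert form is that the left-hand side is always a whole vec4, so no variable index into a vector is needed on the destination.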
Re: [Mesa-dev] [PATCH] egl/android: Fix error condition for EGL_ANDROID_image_native_buffer
On Mon, May 06, 2013 at 02:23:52PM -0700, Chad Versace wrote:
> Emit EGL_BAD_CONTEXT if the user passes a context to
> eglCreateImageKHR(type=EGL_ANDROID_image_native_buffer).
>
> From the EGL_ANDROID_image_native_buffer spec:
>   * If target is EGL_NATIVE_BUFFER_ANDROID and ctx is not
>     EGL_NO_CONTEXT, the error EGL_BAD_CONTEXT is generated.
>
> Note: This is a candidate for the stable branches.
> CC: Tapani Pälli tapani.pa...@intel.com
> Signed-off-by: Chad Versace chad.vers...@linux.intel.com
> ---
>  src/egl/drivers/dri2/platform_android.c | 16 ++--
>  1 file changed, 14 insertions(+), 2 deletions(-)
>
> diff --git a/src/egl/drivers/dri2/platform_android.c b/src/egl/drivers/dri2/platform_android.c
> index cee4035..ed50907 100644
> --- a/src/egl/drivers/dri2/platform_android.c
> +++ b/src/egl/drivers/dri2/platform_android.c
> @@ -337,7 +337,7 @@ droid_swap_buffers(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface *draw)
>  }
>
>  static _EGLImage *
> -dri2_create_image_android_native_buffer(_EGLDisplay *disp,
> +dri2_create_image_android_native_buffer(_EGLDisplay *disp, _EGLContext *ctx,
>                                          struct ANativeWindowBuffer *buf)
>  {
>     struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp);
> @@ -346,6 +346,18 @@ dri2_create_image_android_native_buffer(_EGLDisplay *disp,
>     uint32_t offsets[3], strides[3], handles[3], tmp;
>     EGLint format;
>
> +   if (ctx != NULL) {

I did a similar check for 'EGL_LINUX_DMA_BUF_EXT'. Technically
'eglapi.c::eglCreateImageKhr()' does a lookup of the context via
'_eglLookupContext()' and also translates 'EGL_NO_CONTEXT' (from NULL
to NULL). Hence I chose to do the check there. But would it be better
for me to do it on the driver side as well, since the target is valid
only for Linux platforms anyway?

> +      /* From the EGL_ANDROID_image_native_buffer spec:
> +       *
> +       *     * If target is EGL_NATIVE_BUFFER_ANDROID and ctx is not
> +       *       EGL_NO_CONTEXT, the error EGL_BAD_CONTEXT is generated.
> +       */
> +      _eglError(EGL_BAD_CONTEXT, "eglCreateEGLImageKHR: for "
> +                "EGL_NATIVE_BUFFER_ANDROID, the context must be "
> +                "EGL_NO_CONTEXT");
> +      return NULL;
> +   }
> +
>     if (!buf || buf->common.magic != ANDROID_NATIVE_BUFFER_MAGIC ||
>         buf->common.version != sizeof(*buf)) {
>        _eglError(EGL_BAD_PARAMETER, "eglCreateEGLImageKHR");
> @@ -479,7 +491,7 @@ droid_create_image_khr(_EGLDriver *drv, _EGLDisplay *disp,
>  {
>     switch (target) {
>     case EGL_NATIVE_BUFFER_ANDROID:
> -      return dri2_create_image_android_native_buffer(disp,
> +      return dri2_create_image_android_native_buffer(disp, ctx,
>              (struct ANativeWindowBuffer *) buffer);
>     default:
>        return dri2_create_image_khr(drv, disp, ctx, target, buffer, attr_list);
> --
> 1.8.1.4
Re: [Mesa-dev] [PATCH] egl/android: Fix error condition for EGL_ANDROID_image_native_buffer
On Tue, May 7, 2013 at 3:49 PM, Pohjolainen, Topi topi.pohjolai...@intel.com wrote:
> On Mon, May 06, 2013 at 02:23:52PM -0700, Chad Versace wrote:
>> Emit EGL_BAD_CONTEXT if the user passes a context to
>> eglCreateImageKHR(type=EGL_ANDROID_image_native_buffer).
>>
>> From the EGL_ANDROID_image_native_buffer spec:
>>   * If target is EGL_NATIVE_BUFFER_ANDROID and ctx is not
>>     EGL_NO_CONTEXT, the error EGL_BAD_CONTEXT is generated.
>>
>> Note: This is a candidate for the stable branches.
>> CC: Tapani Pälli tapani.pa...@intel.com
>> Signed-off-by: Chad Versace chad.vers...@linux.intel.com
>> ---
>>  src/egl/drivers/dri2/platform_android.c | 16 ++--
>>  1 file changed, 14 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/egl/drivers/dri2/platform_android.c b/src/egl/drivers/dri2/platform_android.c
>> index cee4035..ed50907 100644
>> --- a/src/egl/drivers/dri2/platform_android.c
>> +++ b/src/egl/drivers/dri2/platform_android.c
>> @@ -337,7 +337,7 @@ droid_swap_buffers(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface *draw)
>>  }
>>
>>  static _EGLImage *
>> -dri2_create_image_android_native_buffer(_EGLDisplay *disp,
>> +dri2_create_image_android_native_buffer(_EGLDisplay *disp, _EGLContext *ctx,
>>                                          struct ANativeWindowBuffer *buf)
>>  {
>>     struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp);
>> @@ -346,6 +346,18 @@ dri2_create_image_android_native_buffer(_EGLDisplay *disp,
>>     uint32_t offsets[3], strides[3], handles[3], tmp;
>>     EGLint format;
>>
>> +   if (ctx != NULL) {
>
> I did a similar check for the 'EGL_LINUX_DMA_BUF_EXT'. Technically
> 'eglapi.c::eglCreateImageKhr()' does a lookup of the context via
> '_eglLookupContext()' and does a translation of 'EGL_NO_CONTEXT' also (from
> NULL to NULL). Hence I chose to do the check there. But would it be better for
> me to do it also in the driver side as the target is valid only for linux
> platforms anyway?

I would suggest doing the check in the driver for the moment.
eglapi.[ch] should ultimately be generated, IMHO, and adding
extension-specific code there will make the switch harder.

>> +      /* From the EGL_ANDROID_image_native_buffer spec:
>> +       *
>> +       *     * If target is EGL_NATIVE_BUFFER_ANDROID and ctx is not
>> +       *       EGL_NO_CONTEXT, the error EGL_BAD_CONTEXT is generated.
>> +       */
>> +      _eglError(EGL_BAD_CONTEXT, "eglCreateEGLImageKHR: for "
>> +                "EGL_NATIVE_BUFFER_ANDROID, the context must be "
>> +                "EGL_NO_CONTEXT");
>> +      return NULL;
>> +   }
>> +
>>     if (!buf || buf->common.magic != ANDROID_NATIVE_BUFFER_MAGIC ||
>>         buf->common.version != sizeof(*buf)) {
>>        _eglError(EGL_BAD_PARAMETER, "eglCreateEGLImageKHR");
>> @@ -479,7 +491,7 @@ droid_create_image_khr(_EGLDriver *drv, _EGLDisplay *disp,
>>  {
>>     switch (target) {
>>     case EGL_NATIVE_BUFFER_ANDROID:
>> -      return dri2_create_image_android_native_buffer(disp,
>> +      return dri2_create_image_android_native_buffer(disp, ctx,
>>              (struct ANativeWindowBuffer *) buffer);
>>     default:
>>        return dri2_create_image_khr(drv, disp, ctx, target, buffer, attr_list);
>> --
>> 1.8.1.4

-- 
o...@lunarg.com
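The order of the two argument checks in the patch (context first, then buffer) can be modelled outside of EGL as below. Everything here is hypothetical scaffolding for illustration: the enum, struct native_buffer, NATIVE_BUFFER_MAGIC and the function name are made up, and only the order of the checks follows the patch.

```c
#include <stddef.h>

/* Hypothetical result codes standing in for the EGL error values. */
enum image_error { IMG_OK, IMG_BAD_CONTEXT, IMG_BAD_PARAMETER };

/* Hypothetical buffer header; the real one is ANativeWindowBuffer. */
struct native_buffer {
    int magic;
    int version;
};

#define NATIVE_BUFFER_MAGIC 0x1234  /* made-up value */

/* Validate arguments in the same order as the patched driver function:
 * a non-NULL context is rejected before the buffer is even inspected. */
static enum image_error
check_native_buffer_args(const void *ctx, const struct native_buffer *buf)
{
    if (ctx != NULL)
        return IMG_BAD_CONTEXT;

    if (buf == NULL || buf->magic != NATIVE_BUFFER_MAGIC ||
        buf->version != (int) sizeof(*buf))
        return IMG_BAD_PARAMETER;

    return IMG_OK;
}
```

With this ordering, a call that passes both a context and a bad buffer reports the context error, which matches the behaviour the spec text asks for when a context is supplied.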
Re: [Mesa-dev] [RFC] R600/SI: New patterns and intrinsics for GLSL 1.30 support in radeonsi
On Mon, 2013-05-06 at 16:07 -0700, Tom Stellard wrote:
> On Mon, May 06, 2013 at 09:30:58AM -0700, Tom Stellard wrote:
> > On Mon, May 06, 2013 at 06:23:05PM +0200, Michel Dänzer wrote:
> > > AFAICT these new patterns and intrinsics should be sufficient for
> > > full GLSL 1.30 support in radeonsi. I suspect these will need some
> > > polishing, but I wanted to send them out now for initial comments
> > > so they can hopefully make it into the LLVM 3.3 release.
> >
> > These all look good to me, for the series:
> >
> > Reviewed-by: Tom Stellard thomas.stell...@amd.com
>
> Hi Michel,
>
> I went ahead and pushed these patches. They seem pretty safe to me,
> and I think it will make the release manager's life a little easier
> to have these in before the 3.3 branch is made.

Thanks Tom.

-- 
Earthling Michel Dänzer | http://www.amd.com
Libre software enthusiast | Debian, X and DRI developer
[Mesa-dev] [Bug 60197] Mesa Gallium VPATH build is broken
https://bugs.freedesktop.org/show_bug.cgi?id=60197

--- Comment #5 from Quentin Glidic sardemff7+freedesk...@sardemff7.net ---
Any news on this trivial issue?

-- 
You are receiving this mail because:
You are the assignee for the bug.
Re: [Mesa-dev] [PATCH] R600: Remove AMDILPeeopholeOptimizer and replace optimizations with tablegen patterns
Am 06.05.2013 18:48, schrieb Tom Stellard: From: Tom Stellard thomas.stell...@amd.com The BFE optimization was the only one we were actually using, and it was emitting an intrinsic that we don't support. https://bugs.freedesktop.org/show_bug.cgi?id=64201 The patch has my rb. I'm wondering if we shouldn't get ride of all those AMDIL* files, cause we obviously don't use that namespace anymore and at least I sometimes wonder why we still have AMDILISelLowering.cpp and AMDGPUISelLowering.cpp. Christian. --- lib/Target/R600/AMDGPUInstructions.td | 11 + lib/Target/R600/AMDGPUTargetMachine.cpp|1 - lib/Target/R600/AMDILPeepholeOptimizer.cpp | 1215 lib/Target/R600/CMakeLists.txt |1 - lib/Target/R600/R600Instructions.td|1 + test/CodeGen/R600/bfe_uint.ll | 26 + 6 files changed, 38 insertions(+), 1217 deletions(-) delete mode 100644 lib/Target/R600/AMDILPeepholeOptimizer.cpp create mode 100644 test/CodeGen/R600/bfe_uint.ll diff --git a/lib/Target/R600/AMDGPUInstructions.td b/lib/Target/R600/AMDGPUInstructions.td index b44d248..d2620b2 100644 --- a/lib/Target/R600/AMDGPUInstructions.td +++ b/lib/Target/R600/AMDGPUInstructions.td @@ -284,6 +284,17 @@ class SHA256MaPattern Instruction BFI_INT, Instruction XOR : Pat (BFI_INT (XOR i32:$x, i32:$y), i32:$z, i32:$y) ; +// Bitfield extract patterns + +def legalshift32 : ImmLeaf i32, [{return Imm =0 Imm 32;}]; +def bfemask : PatLeaf (imm), [{return isMask_32(N-getZExtValue());}], +SDNodeXFormimm, [{ return CurDAG-getTargetConstant(CountTrailingOnes_32(N-getZExtValue()), MVT::i32);}]; + +class BFEPattern Instruction BFE : Pat + (and (srl i32:$x, legalshift32:$y), bfemask:$z), + (BFE $x, $y, $z) +; + include R600Instructions.td include SIInstrInfo.td diff --git a/lib/Target/R600/AMDGPUTargetMachine.cpp b/lib/Target/R600/AMDGPUTargetMachine.cpp index 0ec67ce..31fbf32 100644 --- a/lib/Target/R600/AMDGPUTargetMachine.cpp +++ b/lib/Target/R600/AMDGPUTargetMachine.cpp @@ -115,7 +115,6 @@ AMDGPUPassConfig::addPreISel() { } bool 
AMDGPUPassConfig::addInstSelector() { - addPass(createAMDGPUPeepholeOpt(*TM)); addPass(createAMDGPUISelDag(getAMDGPUTargetMachine())); const AMDGPUSubtarget ST = TM-getSubtargetAMDGPUSubtarget(); diff --git a/lib/Target/R600/AMDILPeepholeOptimizer.cpp b/lib/Target/R600/AMDILPeepholeOptimizer.cpp deleted file mode 100644 index 3a28038..000 --- a/lib/Target/R600/AMDILPeepholeOptimizer.cpp +++ /dev/null @@ -1,1215 +0,0 @@ -//===-- AMDILPeepholeOptimizer.cpp - AMDGPU Peephole optimizations -===// -// -// The LLVM Compiler Infrastructure -// -// This file is distributed under the University of Illinois Open Source -// License. See LICENSE.TXT for details. -// -/// \file -//==---===// - -#define DEBUG_TYPE PeepholeOpt -#ifdef DEBUG -#define DEBUGME (DebugFlag isCurrentDebugType(DEBUG_TYPE)) -#else -#define DEBUGME 0 -#endif - -#include AMDILDevices.h -#include AMDGPUInstrInfo.h -#include llvm/ADT/Statistic.h -#include llvm/ADT/StringExtras.h -#include llvm/ADT/StringRef.h -#include llvm/ADT/Twine.h -#include llvm/IR/Constants.h -#include llvm/CodeGen/MachineFunction.h -#include llvm/CodeGen/MachineFunctionAnalysis.h -#include llvm/IR/Function.h -#include llvm/IR/Instructions.h -#include llvm/IR/Module.h -#include llvm/Support/Debug.h -#include llvm/Support/MathExtras.h - -#include sstream - -#if 0 -STATISTIC(PointerAssignments, Number of dynamic pointer -assigments discovered); -STATISTIC(PointerSubtract, Number of pointer subtractions discovered); -#endif - -using namespace llvm; -// The Peephole optimization pass is used to do simple last minute optimizations -// that are required for correct code or to remove redundant functions -namespace { - -class OpaqueType; - -class LLVM_LIBRARY_VISIBILITY AMDGPUPeepholeOpt : public FunctionPass { -public: - TargetMachine TM; - static char ID; - AMDGPUPeepholeOpt(TargetMachine tm); - ~AMDGPUPeepholeOpt(); - const char *getPassName() const; - bool runOnFunction(Function F); - bool doInitialization(Module M); - bool 
doFinalization(Module M); - void getAnalysisUsage(AnalysisUsage AU) const; -protected: -private: - // Function to initiate all of the instruction level optimizations. - bool instLevelOptimizations(BasicBlock::iterator *inst); - // Quick check to see if we need to dump all of the pointers into the - // arena. If this is correct, then we set all pointers to exist in arena. This - // is a workaround for aliasing of pointers in a struct/union. - bool dumpAllIntoArena(Function F); - // Because I don't want to invalidate any pointers while in the - // safeNestedForEachFunction. I push atomic conversions to a vector and handle - // it later. This function does the conversions if required. - void
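A note on the BFE pattern added by this patch: it matches `(x >> y) & mask`, where `y` is a legal shift amount in [0, 32) and `mask` is a contiguous run of low set bits, and selects `BFE x, y, width` with `width` taken from the number of trailing ones in the mask. The following is a minimal C sketch of that equivalence, not the LLVM code. The names `is_mask_32`, `count_trailing_ones_32`, `bfe_u32` and `and_srl` are stand-ins written for this note; they only model what the TableGen predicates appear to check, and `bfe_u32` is an assumed reference model of an unsigned bitfield extract rather than a description of the hardware instruction.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if v is a non-empty run of contiguous low bits (0x1, 0x3, 0x7, ...). */
static bool is_mask_32(uint32_t v)
{
    return v != 0 && ((v + 1u) & v) == 0;
}

/* Number of consecutive set bits starting at bit 0. */
static unsigned count_trailing_ones_32(uint32_t v)
{
    unsigned n = 0;
    while (v & 1u) {
        n++;
        v >>= 1;
    }
    return n;
}

/* Assumed reference model: extract `width` bits of x starting at `offset`. */
static uint32_t bfe_u32(uint32_t x, unsigned offset, unsigned width)
{
    assert(offset < 32);            /* mirrors the legalshift32 range check */
    if (width == 0)
        return 0;
    if (width >= 32)
        return x >> offset;
    return (x >> offset) & ((1u << width) - 1u);
}

/* The shape the pattern matches: and(srl(x, y), mask). */
static uint32_t and_srl(uint32_t x, unsigned y, uint32_t mask)
{
    assert(y < 32);
    return (x >> y) & mask;
}
```

Under these assumptions, `and_srl(x, y, m)` and `bfe_u32(x, y, count_trailing_ones_32(m))` agree whenever `is_mask_32(m)` holds, which is the condition the `bfemask` leaf enforces before the rewrite is allowed.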
[Mesa-dev] [Bug 60216] Mesa Gallium can’t build without X
https://bugs.freedesktop.org/show_bug.cgi?id=60216

Quentin Glidic <sardemff7+freedesk...@sardemff7.net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #74125 |0                           |1
        is obsolete|                            |

--- Comment #3 from Quentin Glidic <sardemff7+freedesk...@sardemff7.net> ---
Created attachment 78980
  --> https://bugs.freedesktop.org/attachment.cgi?id=78980&action=edit
Fix gallium/auxiliary build

A new patch to just drop the (wrong) check

--
You are receiving this mail because:
You are the assignee for the bug.
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] non x11/xlib based EGL and software only renderer
Hi Chia,

> $ ./configure --disable-dri --enable-gallium-egl --with-egl-platforms=null \
>     --with-gallium-drivers=swrast
>
> It will give you an EGL/GLES driver that uses a software renderer and
> supports only pbuffers (and FBOs).

The EGL lib built this way still has a dependency on X11, as far as I can
see. Sorry if my question was not clear enough, but what I really want to
achieve is to have GLES apps running without a dependency on any GPU / video
card or X server. It could just be a headless machine. I do not want to
modify the app to explicitly do off-screen rendering, but rather have the
default rendering go to offscreen buffers.

> You won't be able to ask it to render to an application-provided buffer
> though.

Not sure what you mean by an application-provided buffer?

Thanks & Regards,
Divick
Re: [Mesa-dev] non x11/xlib based EGL and software only renderer
On Tue, May 7, 2013 at 5:40 PM, Divick Kishore <divick.kish...@gmail.com> wrote:
> Hi Chia,
>
>> $ ./configure --disable-dri --enable-gallium-egl --with-egl-platforms=null \
>>     --with-gallium-drivers=swrast
>>
>> It will give you an EGL/GLES driver that uses a software renderer and
>> supports only pbuffers (and FBOs).
>
> The EGL lib built this way still has a dependency on X11, as far as I can
> see. Sorry if my question was not clear enough, but what I really want to
> achieve is to have GLES apps running without a dependency on any GPU / video
> card or X server. It could just be a headless machine. I do not want to
> modify the app to explicitly do off-screen rendering, but rather have the
> default rendering go to offscreen buffers.

I haven't tried that for a while, but it should not have X11 dependencies.
You probably need to disable other things as well, such as GLX with
--disable-glx. It might still require X11 at compile time, because
eglplatform.h may include Xlib.h, but it should not need X11 at runtime.

But you need to modify the app to use a pbuffer or FBO. Or you may add some
code to the null platform to treat windows as pbuffers.

>> You won't be able to ask it to render to an application-provided buffer
>> though.
>
> Not sure what you mean by an application-provided buffer?

You cannot allocate a buffer in the app and ask the driver to render to it,
mainly due to alignment requirements.

> Thanks & Regards,
> Divick

--
o...@lunarg.com
[Mesa-dev] [PATCH 0/2] tgsi/exec: clean up exec_tex()
Hi,

This series adds a util function to get the dimension of texture coordinates
given a texture target. The function allows exec_tex() in tgsi_exec.c to be
greatly simplified.

There is a subtle difference in how TXP works on array textures: the layer is
now also projected. You can find the details in the second patch.

I need the util function for ilo. Being able to simplify exec_tex() is just
an excuse to add it to auxiliary/, to be honest.

No regressions were found with piglit quick-driver.tests.
[Mesa-dev] [PATCH 1/2] tgsi: add tgsi_util_get_texture_coord_dim()
This util function returns the dimension of the texture coordinates for a
texture target, and the location of the shadow reference value.

For example, when the texture target is TGSI_TEXTURE_SHADOW2D, the dimension
of the texture coordinates is 2, and the location of the ref value is 2 (that
is, the Z channel).

Signed-off-by: Chia-I Wu <olva...@gmail.com>
---
 src/gallium/auxiliary/tgsi/tgsi_util.c |   91
 src/gallium/auxiliary/tgsi/tgsi_util.h |    3 ++
 2 files changed, 94 insertions(+)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_util.c b/src/gallium/auxiliary/tgsi/tgsi_util.c
index 90179c8..862b79f 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_util.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_util.c
@@ -338,3 +338,94 @@ tgsi_util_get_src_from_ind(const struct tgsi_ind_register *reg)
    return src;
 }
+
+/**
+ * Return the dimension of the texture coordinates (layer included for array
+ * textures), as well as the location of the shadow reference value or the
+ * sample index.
+ */
+int
+tgsi_util_get_texture_coord_dim(int tgsi_tex, int *shadow_or_sample)
+{
+   int dim;
+
+   /*
+    * Depending on the texture target, (src0.xyzw, src1.x) is interpreted
+    * differently:
+    *
+    *   (s, X, X, X, X),               for 1D
+    *   (s, t, X, X, X),               for 2D, RECT
+    *   (s, t, r, X, X),               for 3D, CUBE
+    *
+    *   (s, layer, X, X, X),           for 1D_ARRAY
+    *   (s, t, layer, X, X),           for 2D_ARRAY
+    *   (s, t, r, layer, X),           for CUBE_ARRAY
+    *
+    *   (s, X, shadow, X, X),          for SHADOW1D
+    *   (s, t, shadow, X, X),          for SHADOW2D, SHADOWRECT
+    *   (s, t, r, shadow, X),          for SHADOWCUBE
+    *
+    *   (s, layer, shadow, X, X),      for SHADOW1D_ARRAY
+    *   (s, t, layer, shadow, X),      for SHADOW2D_ARRAY
+    *   (s, t, r, layer, shadow),      for SHADOWCUBE_ARRAY
+    *
+    *   (s, t, sample, X, X),          for 2D_MSAA
+    *   (s, t, layer, sample, X),      for 2D_ARRAY_MSAA
+    */
+   switch (tgsi_tex) {
+   case TGSI_TEXTURE_1D:
+   case TGSI_TEXTURE_SHADOW1D:
+      dim = 1;
+      break;
+   case TGSI_TEXTURE_2D:
+   case TGSI_TEXTURE_RECT:
+   case TGSI_TEXTURE_1D_ARRAY:
+   case TGSI_TEXTURE_SHADOW2D:
+   case TGSI_TEXTURE_SHADOWRECT:
+   case TGSI_TEXTURE_SHADOW1D_ARRAY:
+   case TGSI_TEXTURE_2D_MSAA:
+      dim = 2;
+      break;
+   case TGSI_TEXTURE_3D:
+   case TGSI_TEXTURE_CUBE:
+   case TGSI_TEXTURE_2D_ARRAY:
+   case TGSI_TEXTURE_SHADOWCUBE:
+   case TGSI_TEXTURE_SHADOW2D_ARRAY:
+   case TGSI_TEXTURE_2D_ARRAY_MSAA:
+      dim = 3;
+      break;
+   case TGSI_TEXTURE_CUBE_ARRAY:
+   case TGSI_TEXTURE_SHADOWCUBE_ARRAY:
+      dim = 4;
+      break;
+   default:
+      assert(!"unknown texture target");
+      dim = 0;
+      break;
+   }
+
+   if (shadow_or_sample) {
+      switch (tgsi_tex) {
+      case TGSI_TEXTURE_SHADOW1D:
+         /* there is a gap */
+         *shadow_or_sample = 2;
+         break;
+      case TGSI_TEXTURE_SHADOW2D:
+      case TGSI_TEXTURE_SHADOWRECT:
+      case TGSI_TEXTURE_SHADOWCUBE:
+      case TGSI_TEXTURE_SHADOW1D_ARRAY:
+      case TGSI_TEXTURE_SHADOW2D_ARRAY:
+      case TGSI_TEXTURE_SHADOWCUBE_ARRAY:
+      case TGSI_TEXTURE_2D_MSAA:
+      case TGSI_TEXTURE_2D_ARRAY_MSAA:
+         *shadow_or_sample = dim;
+         break;
+      default:
+         /* no shadow nor sample */
+         *shadow_or_sample = -1;
+         break;
+      }
+   }
+
+   return dim;
+}
diff --git a/src/gallium/auxiliary/tgsi/tgsi_util.h b/src/gallium/auxiliary/tgsi/tgsi_util.h
index d9f8859..c1184c8 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_util.h
+++ b/src/gallium/auxiliary/tgsi/tgsi_util.h
@@ -79,6 +79,9 @@ tgsi_util_get_inst_usage_mask(const struct tgsi_full_instruction *inst,
 struct tgsi_src_register
 tgsi_util_get_src_from_ind(const struct tgsi_ind_register *reg);

+int
+tgsi_util_get_texture_coord_dim(int tgsi_tex, int *shadow_or_sample);
+
 #if defined __cplusplus
 }
 #endif
--
1.7.10.4
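To illustrate how a caller might consume the two return values of the helper in this patch, here is a small standalone sketch. It is not Mesa code: `enum tex_target`, `coord_dim()` and `used_slots()` are hypothetical names, and `coord_dim()` is a reduced stand-in that covers only a handful of targets, following the table in the patch's comment. Slots 0 to 3 model src0.xyzw and slot 4 models src1.x.

```c
#include <assert.h>

/* Hypothetical stand-ins for a few TGSI_TEXTURE_* targets. */
enum tex_target {
    TEX_1D,
    TEX_2D,
    TEX_SHADOW1D,
    TEX_SHADOW2D,
    TEX_2D_MSAA,
    TEX_SHADOW2D_ARRAY,
    TEX_SHADOWCUBE_ARRAY
};

/* Reduced model of the helper: returns the coordinate dimension and writes
 * the slot of the shadow reference / sample index, or -1 if there is none. */
static int coord_dim(enum tex_target t, int *shadow_or_sample)
{
    int dim, extra;

    switch (t) {
    case TEX_1D:               dim = 1; extra = -1; break;
    case TEX_SHADOW1D:         dim = 1; extra = 2;  break; /* gap at slot 1 */
    case TEX_2D:               dim = 2; extra = -1; break;
    case TEX_SHADOW2D:
    case TEX_2D_MSAA:          dim = 2; extra = 2;  break;
    case TEX_SHADOW2D_ARRAY:   dim = 3; extra = 3;  break;
    case TEX_SHADOWCUBE_ARRAY: dim = 4; extra = 4;  break; /* lands in src1.x */
    default:
        assert(!"unknown texture target");
        dim = 0;
        extra = -1;
        break;
    }

    if (shadow_or_sample)
        *shadow_or_sample = extra;
    return dim;
}

/* Bit i set means argument slot i has to be fetched for this target. */
static unsigned used_slots(enum tex_target t)
{
    int extra;
    int dim = coord_dim(t, &extra);
    unsigned mask = (1u << dim) - 1u;

    if (extra >= 0)
        mask |= 1u << extra;
    return mask;
}
```

With this model, SHADOW1D reads slots 0 and 2 (skipping the unused T channel), while SHADOWCUBE_ARRAY reads all five slots, which is why the patch description mentions both src0.xyzw and src1.x.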
[Mesa-dev] [PATCH 2/2] tgsi: clean up exec_tex()
Make use of tgsi_util_get_texture_coord_dim() to replace the big switch table. There is a subtle difference with this change. When TXP is used with an array texture, the layer is now also projected. This behavior matches the TGSI doc. Since GLSL does not allow TXP on an array texture, I am not sure which behavior is correct or preferred. Signed-off-by: Chia-I Wu olva...@gmail.com --- src/gallium/auxiliary/tgsi/tgsi_exec.c | 220 1 file changed, 52 insertions(+), 168 deletions(-) diff --git a/src/gallium/auxiliary/tgsi/tgsi_exec.c b/src/gallium/auxiliary/tgsi/tgsi_exec.c index 75b0663..cb66a40 100644 --- a/src/gallium/auxiliary/tgsi/tgsi_exec.c +++ b/src/gallium/auxiliary/tgsi/tgsi_exec.c @@ -1780,201 +1780,85 @@ exec_tex(struct tgsi_exec_machine *mach, uint modifier, uint sampler) { const uint unit = inst-Src[sampler].Register.Index; - union tgsi_exec_channel r[4], cubearraycomp, cubelod; - const union tgsi_exec_channel *lod = ZeroVec; + const union tgsi_exec_channel *args[5], *proj = NULL; + union tgsi_exec_channel r[5]; enum tgsi_sampler_control control = tgsi_sampler_lod_none; uint chan; int8_t offsets[3]; + int dim, shadow_ref, i; /* always fetch all 3 offsets, overkill but keeps code simple */ fetch_texel_offsets(mach, inst, offsets); assert(modifier != TEX_MODIFIER_LEVEL_ZERO); - if (modifier != TEX_MODIFIER_NONE (sampler == 1)) { - FETCH(r[3], 0, TGSI_CHAN_W); - if (modifier != TEX_MODIFIER_PROJECTED) { - lod = r[3]; - } - } - - if (modifier == TEX_MODIFIER_EXPLICIT_LOD) { - control = tgsi_sampler_lod_explicit; - } else if (modifier == TEX_MODIFIER_LOD_BIAS){ - control = tgsi_sampler_lod_bias; - } - - switch (inst-Texture.Texture) { - case TGSI_TEXTURE_1D: - FETCH(r[0], 0, TGSI_CHAN_X); - - if (modifier == TEX_MODIFIER_PROJECTED) { - micro_div(r[0], r[0], r[3]); - } - - fetch_texel(mach-Sampler, unit, unit, - r[0], ZeroVec, ZeroVec, ZeroVec, lod, /* S, T, P, C, LOD */ - NULL, offsets, control, - r[0], r[1], r[2], r[3]); /* R, G, B, A */ - break; - - case 
TGSI_TEXTURE_SHADOW1D: - FETCH(r[0], 0, TGSI_CHAN_X); - FETCH(r[2], 0, TGSI_CHAN_Z); - - if (modifier == TEX_MODIFIER_PROJECTED) { - micro_div(r[0], r[0], r[3]); - micro_div(r[2], r[2], r[3]); - } - - fetch_texel(mach-Sampler, unit, unit, - r[0], ZeroVec, r[2], ZeroVec, lod, /* S, T, P, C, LOD */ - NULL, offsets, control, - r[0], r[1], r[2], r[3]); /* R, G, B, A */ - break; - - case TGSI_TEXTURE_2D: - case TGSI_TEXTURE_RECT: - FETCH(r[0], 0, TGSI_CHAN_X); - FETCH(r[1], 0, TGSI_CHAN_Y); - - if (modifier == TEX_MODIFIER_PROJECTED) { - micro_div(r[0], r[0], r[3]); - micro_div(r[1], r[1], r[3]); - } + dim = tgsi_util_get_texture_coord_dim(inst-Texture.Texture, shadow_ref); - fetch_texel(mach-Sampler, unit, unit, - r[0], r[1], ZeroVec, ZeroVec, lod,/* S, T, P, C, LOD */ - NULL, offsets, control, - r[0], r[1], r[2], r[3]); /* outputs */ - break; + assert(dim = 4); + if (shadow_ref = 0) + assert(shadow_ref = dim shadow_ref Elements(args)); - case TGSI_TEXTURE_SHADOW2D: - case TGSI_TEXTURE_SHADOWRECT: - FETCH(r[0], 0, TGSI_CHAN_X); - FETCH(r[1], 0, TGSI_CHAN_Y); - FETCH(r[2], 0, TGSI_CHAN_Z); + /* fetch modifier to the last argument */ + if (modifier != TEX_MODIFIER_NONE) { + const int last = Elements(args) - 1; - if (modifier == TEX_MODIFIER_PROJECTED) { - micro_div(r[0], r[0], r[3]); - micro_div(r[1], r[1], r[3]); - micro_div(r[2], r[2], r[3]); + /* fetch modifier from src0.w or src1.x */ + if (sampler == 1) { + assert(dim = TGSI_CHAN_W shadow_ref != TGSI_CHAN_W); + FETCH(r[last], 0, TGSI_CHAN_W); } - - fetch_texel(mach-Sampler, unit, unit, - r[0], r[1], r[2], ZeroVec, lod,/* S, T, P, C, LOD */ - NULL, offsets, control, - r[0], r[1], r[2], r[3]); /* outputs */ - break; - - case TGSI_TEXTURE_1D_ARRAY: - FETCH(r[0], 0, TGSI_CHAN_X); - FETCH(r[1], 0, TGSI_CHAN_Y); - - if (modifier == TEX_MODIFIER_PROJECTED) { - micro_div(r[0], r[0], r[3]); + else { + assert(shadow_ref != 4); + FETCH(r[last], 1, TGSI_CHAN_X); } - fetch_texel(mach-Sampler, unit, unit, - r[0], r[1], ZeroVec, 
ZeroVec, lod, /* S, T, P, C, LOD */ - NULL, offsets, control, - r[0], r[1], r[2], r[3]); /* outputs */ - break; - case TGSI_TEXTURE_SHADOW1D_ARRAY: - FETCH(r[0], 0, TGSI_CHAN_X); - FETCH(r[1], 0, TGSI_CHAN_Y); - FETCH(r[2], 0, TGSI_CHAN_Z); - -
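The behavioral change described in this commit message (with TXP on an array texture, the layer is now projected along with the other coordinates) can be pictured with a scalar sketch. This is an assumption-laden illustration, not the patch's code: the patch itself is cut off above, `project_args()` and `NUM_ARGS` are hypothetical names, and the real interpreter works on `union tgsi_exec_channel` vectors via `micro_div()` rather than on plain floats. The sketch assumes every one of the first `dim` arguments, plus the shadow reference if present, is divided by the projector.

```c
#include <assert.h>

#define NUM_ARGS 5  /* models src0.xyzw plus src1.x, like args[5] in the patch */

/* Divide the first `dim` coordinates (layer included for array targets) and
 * the shadow reference, if any, by the projector w. */
static void project_args(float args[NUM_ARGS], int dim, int shadow_ref, float w)
{
    int i;

    assert(dim >= 0 && dim <= 4);

    for (i = 0; i < dim; i++)
        args[i] /= w;

    /* shadow_ref is -1 when the target has no shadow reference */
    if (shadow_ref >= dim && shadow_ref < NUM_ARGS)
        args[shadow_ref] /= w;
}
```

For a 2D_ARRAY target (`dim` = 3, no shadow reference) this divides s, t and the layer by w, whereas the removed per-target code for 1D_ARRAY shown in the diff divided only `r[0]` and left the layer in `r[1]` untouched.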
Re: [Mesa-dev] [PATCH 2/2] tgsi: clean up exec_tex()
Am 07.05.2013 12:21, schrieb Chia-I Wu: Make use of tgsi_util_get_texture_coord_dim() to replace the big switch table. There is a subtle difference with this change. When TXP is used with an array texture, the layer is now also projected. This behavior matches the TGSI doc. Since GLSL does not allow TXP on an array texture, I am not sure which behavior is correct or preferred. Signed-off-by: Chia-I Wu olva...@gmail.com --- src/gallium/auxiliary/tgsi/tgsi_exec.c | 220 1 file changed, 52 insertions(+), 168 deletions(-) diff --git a/src/gallium/auxiliary/tgsi/tgsi_exec.c b/src/gallium/auxiliary/tgsi/tgsi_exec.c index 75b0663..cb66a40 100644 --- a/src/gallium/auxiliary/tgsi/tgsi_exec.c +++ b/src/gallium/auxiliary/tgsi/tgsi_exec.c @@ -1780,201 +1780,85 @@ exec_tex(struct tgsi_exec_machine *mach, uint modifier, uint sampler) { const uint unit = inst-Src[sampler].Register.Index; - union tgsi_exec_channel r[4], cubearraycomp, cubelod; - const union tgsi_exec_channel *lod = ZeroVec; + const union tgsi_exec_channel *args[5], *proj = NULL; + union tgsi_exec_channel r[5]; enum tgsi_sampler_control control = tgsi_sampler_lod_none; uint chan; int8_t offsets[3]; + int dim, shadow_ref, i; /* always fetch all 3 offsets, overkill but keeps code simple */ fetch_texel_offsets(mach, inst, offsets); assert(modifier != TEX_MODIFIER_LEVEL_ZERO); - if (modifier != TEX_MODIFIER_NONE (sampler == 1)) { - FETCH(r[3], 0, TGSI_CHAN_W); - if (modifier != TEX_MODIFIER_PROJECTED) { - lod = r[3]; - } - } - - if (modifier == TEX_MODIFIER_EXPLICIT_LOD) { - control = tgsi_sampler_lod_explicit; - } else if (modifier == TEX_MODIFIER_LOD_BIAS){ - control = tgsi_sampler_lod_bias; - } - - switch (inst-Texture.Texture) { - case TGSI_TEXTURE_1D: - FETCH(r[0], 0, TGSI_CHAN_X); - - if (modifier == TEX_MODIFIER_PROJECTED) { - micro_div(r[0], r[0], r[3]); - } - - fetch_texel(mach-Sampler, unit, unit, - r[0], ZeroVec, ZeroVec, ZeroVec, lod, /* S, T, P, C, LOD */ - NULL, offsets, control, - r[0], r[1], r[2], r[3]); 
/* R, G, B, A */ - break; - - case TGSI_TEXTURE_SHADOW1D: - FETCH(r[0], 0, TGSI_CHAN_X); - FETCH(r[2], 0, TGSI_CHAN_Z); - - if (modifier == TEX_MODIFIER_PROJECTED) { - micro_div(r[0], r[0], r[3]); - micro_div(r[2], r[2], r[3]); - } - - fetch_texel(mach-Sampler, unit, unit, - r[0], ZeroVec, r[2], ZeroVec, lod, /* S, T, P, C, LOD */ - NULL, offsets, control, - r[0], r[1], r[2], r[3]); /* R, G, B, A */ - break; - - case TGSI_TEXTURE_2D: - case TGSI_TEXTURE_RECT: - FETCH(r[0], 0, TGSI_CHAN_X); - FETCH(r[1], 0, TGSI_CHAN_Y); - - if (modifier == TEX_MODIFIER_PROJECTED) { - micro_div(r[0], r[0], r[3]); - micro_div(r[1], r[1], r[3]); - } + dim = tgsi_util_get_texture_coord_dim(inst-Texture.Texture, shadow_ref); - fetch_texel(mach-Sampler, unit, unit, - r[0], r[1], ZeroVec, ZeroVec, lod,/* S, T, P, C, LOD */ - NULL, offsets, control, - r[0], r[1], r[2], r[3]); /* outputs */ - break; + assert(dim = 4); + if (shadow_ref = 0) + assert(shadow_ref = dim shadow_ref Elements(args)); - case TGSI_TEXTURE_SHADOW2D: - case TGSI_TEXTURE_SHADOWRECT: - FETCH(r[0], 0, TGSI_CHAN_X); - FETCH(r[1], 0, TGSI_CHAN_Y); - FETCH(r[2], 0, TGSI_CHAN_Z); + /* fetch modifier to the last argument */ + if (modifier != TEX_MODIFIER_NONE) { + const int last = Elements(args) - 1; - if (modifier == TEX_MODIFIER_PROJECTED) { - micro_div(r[0], r[0], r[3]); - micro_div(r[1], r[1], r[3]); - micro_div(r[2], r[2], r[3]); + /* fetch modifier from src0.w or src1.x */ + if (sampler == 1) { + assert(dim = TGSI_CHAN_W shadow_ref != TGSI_CHAN_W); + FETCH(r[last], 0, TGSI_CHAN_W); } - - fetch_texel(mach-Sampler, unit, unit, - r[0], r[1], r[2], ZeroVec, lod,/* S, T, P, C, LOD */ - NULL, offsets, control, - r[0], r[1], r[2], r[3]); /* outputs */ - break; - - case TGSI_TEXTURE_1D_ARRAY: - FETCH(r[0], 0, TGSI_CHAN_X); - FETCH(r[1], 0, TGSI_CHAN_Y); - - if (modifier == TEX_MODIFIER_PROJECTED) { - micro_div(r[0], r[0], r[3]); + else { + assert(shadow_ref != 4); + FETCH(r[last], 1, TGSI_CHAN_X); } - fetch_texel(mach-Sampler, 
unit, unit, - r[0], r[1], ZeroVec, ZeroVec, lod, /* S, T, P, C, LOD */ - NULL, offsets, control, - r[0], r[1], r[2],
[Mesa-dev] No configs available with xlib based egl
Hi,

I have compiled mesa with the following options:

.././configure --prefix=~/lib/mesa/swrast/ --build=x86_64-linux-gnu
    --with-gallium-drivers= --with-driver=xlib --enable-egl --enable-gles1
    --enable-gles2 --with-egl-platforms=x11 CFLAGS="-Wall -g -O2"
    CXXFLAGS="-Wall -g -O2"

but when I run a sample app with the following egl config, it returns 0
configs.

   EGLint attr[] = {   // some attributes to set up our egl-interface
      EGL_BUFFER_SIZE, 16,
      EGL_RENDERABLE_TYPE, EGL_OPENGL_ES2_BIT,
      EGL_NONE
   };

   EGLConfig ecfg;
   EGLint num_config;
   if ( !eglChooseConfig( egl_display, attr, &ecfg, 1, &num_config ) ) {
      cerr << "Failed to choose config (eglError: " << eglGetError() << ")" << endl;
      return 1;
   }

The code above prints 'Failed to choose config', while the same code works
fine when I compile with:

../../configure --prefix=~/lib/mesa/dri --build=x86_64-linux-gnu
    --with-driver=dri --with-dri-drivers=swrast
    --with-dri-driverdir=~/lib/mesa/dri/
    --with-dri-searchpath='~/lib/mesa/dri' --enable-glx-tls --enable-xa
    --enable-driglx-direct --with-egl-platforms=x11
    --enable-gallium-llvm=yes --with-gallium-drivers=swrast --enable-gles1
    --enable-gles2 --enable-gallium-egl --disable-glu
    CFLAGS="-Wall -g -O2" CXXFLAGS="-Wall -g -O2"

Could someone please suggest what could be causing this?

Thanks & Regards,
Divick
Re: [Mesa-dev] [PATCH] mesa: Be less casual about texture formats in st_finalize_texture
On 05/06/2013 02:41 PM, Adam Jackson wrote:
> Commit 62452883 removed a hunk like
>
>    if (firstImageFormat != stObj->pt->format)
>       st_view_format = firstImageFormat;
>
> from update_single_texture().  This broke piglit/glx-tfp on AMD Barts
> (and probably others), as that hunk was compensating for the mesa and
> gallium layers disagreeing about the format.
>
> Fix this by not ignoring the alpha channel in st_finalize_texture when
> considering whether two 32-bit formats are sufficiently compatible.

It looks like you're undoing change a2817f6ae by Dave Airlie. Dave should
review this. It's not 100% clear to me what's going on there.

-Brian

> Signed-off-by: Adam Jackson <a...@redhat.com>
> ---
>  src/mesa/state_tracker/st_cb_texture.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/mesa/state_tracker/st_cb_texture.c b/src/mesa/state_tracker/st_cb_texture.c
> index 123ed2b..0f2656c 100644
> --- a/src/mesa/state_tracker/st_cb_texture.c
> +++ b/src/mesa/state_tracker/st_cb_texture.c
> @@ -1567,7 +1567,7 @@ st_finalize_texture(struct gl_context *ctx,
>      */
>     if (stObj->pt) {
>        if (stObj->pt->target != gl_target_to_pipe(stObj->base.Target) ||
> -          !st_sampler_compat_formats(stObj->pt->format, firstImageFormat) ||
> +          stObj->pt->format != firstImageFormat ||
>            stObj->pt->last_level < stObj->lastLevel ||
>            stObj->pt->width0 != ptWidth ||
>            stObj->pt->height0 != ptHeight ||
Re: [Mesa-dev] No configs available with xlib based egl
Perhaps 16-bit color isn't supported? Maybe try other color bits or set
R/G/B individually and see what happens. Also, there is an eglinfo tool
source code in Mesa that can probably tell you a whole lot more.

Patrick

On Tue, May 7, 2013 at 7:56 AM, Divick Kishore <divick.kish...@gmail.com> wrote:
> Hi,
>
> I have compiled mesa with the following options:
>
> .././configure --prefix=~/lib/mesa/swrast/ --build=x86_64-linux-gnu
>     --with-gallium-drivers= --with-driver=xlib --enable-egl --enable-gles1
>     --enable-gles2 --with-egl-platforms=x11 CFLAGS="-Wall -g -O2"
>     CXXFLAGS="-Wall -g -O2"
>
> but when I run a sample app with the following egl config, it returns 0
> configs.
>
>    EGLint attr[] = {   // some attributes to set up our egl-interface
>       EGL_BUFFER_SIZE, 16,
>       EGL_RENDERABLE_TYPE, EGL_OPENGL_ES2_BIT,
>       EGL_NONE
>    };
>
>    EGLConfig ecfg;
>    EGLint num_config;
>    if ( !eglChooseConfig( egl_display, attr, &ecfg, 1, &num_config ) ) {
>       cerr << "Failed to choose config (eglError: " << eglGetError() << ")" << endl;
>       return 1;
>    }
>
> The code above prints 'Failed to choose config', while the same code works
> fine when I compile with:
>
> ../../configure --prefix=~/lib/mesa/dri --build=x86_64-linux-gnu
>     --with-driver=dri --with-dri-drivers=swrast
>     --with-dri-driverdir=~/lib/mesa/dri/
>     --with-dri-searchpath='~/lib/mesa/dri' --enable-glx-tls --enable-xa
>     --enable-driglx-direct --with-egl-platforms=x11
>     --enable-gallium-llvm=yes --with-gallium-drivers=swrast --enable-gles1
>     --enable-gles2 --enable-gallium-egl --disable-glu
>     CFLAGS="-Wall -g -O2" CXXFLAGS="-Wall -g -O2"
>
> Could someone please suggest what could be causing this?
>
> Thanks & Regards,
> Divick
[Mesa-dev] [Bug 64324] New: memory leak of visual info
https://bugs.freedesktop.org/show_bug.cgi?id=64324

          Priority: medium
            Bug ID: 64324
          Assignee: mesa-dev@lists.freedesktop.org
           Summary: memory leak of visual info
          Severity: minor
    Classification: Unclassified
                OS: All
          Reporter: askin...@mathworks.com
          Hardware: Other
            Status: NEW
           Version: unspecified
         Component: Drivers/X11
           Product: Mesa

This is a small leak (64 bytes) that happens as we allocate and free Visuals
and FBConfigs in X11. Brian Paul mentioned that he wouldn't be sure it was a
leak without study, so I figured I'd propose it here, just to track it.

XMesaCreateVisual allocates an XMesaVisual, then allocates an XVisualInfo,
copies into it the XVisualInfo that was passed into the function, and stores
the pointer in the XMesaVisual as visinfo.

XMesaDestroyVisual calls free on both the XVisualInfo and the XMesaVisual.
This is never called.

But destroy_visuals_on_display(), in fakeglx.c, only calls free on the
XMesaVisual. Nothing frees the XVisualInfo that was copied into the pointer
stored in XMesaVisual.

I don't know whether destroy_visuals_on_display() should call
XMesaDestroyVisual(VisualTable[i]), or should just
free(VisualTable[i]->visinfo).

Thanks
andy

--
You are receiving this mail because:
You are the assignee for the bug.
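The ownership problem described in this report can be reproduced in miniature. The sketch below is a hypothetical model, not Mesa code: `struct visual_info`, `struct mesa_visual`, `create_visual()` and `destroy_visual()` are invented names that mirror the roles of XVisualInfo, XMesaVisual, XMesaCreateVisual and XMesaDestroyVisual as the reporter describes them, and the allocation counter exists only to make the leak observable.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Counts blocks that are currently allocated, so a leak shows up as a
 * non-zero value at the end. */
static int live_allocs;

static void *tracked_alloc(size_t n)
{
    void *p = malloc(n);
    if (p)
        live_allocs++;
    return p;
}

static void tracked_free(void *p)
{
    if (p) {
        live_allocs--;
        free(p);
    }
}

/* Hypothetical stand-ins for XVisualInfo and XMesaVisual. */
struct visual_info { int depth; int visual_class; };
struct mesa_visual { struct visual_info *visinfo; };

/* Allocates the outer object and a private copy of the caller's info. */
static struct mesa_visual *create_visual(const struct visual_info *src)
{
    struct mesa_visual *v = tracked_alloc(sizeof *v);
    if (!v)
        return NULL;

    v->visinfo = tracked_alloc(sizeof *v->visinfo);
    if (!v->visinfo) {
        tracked_free(v);
        return NULL;
    }
    memcpy(v->visinfo, src, sizeof *src);
    return v;
}

/* Frees the inner copy first, then the outer object. */
static void destroy_visual(struct mesa_visual *v)
{
    if (!v)
        return;
    tracked_free(v->visinfo);
    tracked_free(v);
}
```

Freeing only the outer object, as the report says the cleanup path does, leaves one block outstanding per visual; calling the paired destroy function releases both. Either of the two fixes the reporter proposes would restore that pairing.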
Re: [Mesa-dev] No configs available with xlib based egl
On 05/07/2013 05:56 AM, Divick Kishore wrote:
> Hi,
>
> I have compiled mesa with the following options:
>
> .././configure --prefix=~/lib/mesa/swrast/ --build=x86_64-linux-gnu
>     --with-gallium-drivers= --with-driver=xlib --enable-egl --enable-gles1
>     --enable-gles2 --with-egl-platforms=x11 CFLAGS="-Wall -g -O2"
>     CXXFLAGS="-Wall -g -O2"
>
> but when I run a sample app with the following egl config, it returns 0
> configs.
>
>    EGLint attr[] = {   // some attributes to set up our egl-interface
>       EGL_BUFFER_SIZE, 16,
>       EGL_RENDERABLE_TYPE, EGL_OPENGL_ES2_BIT,
>       EGL_NONE
>    };
>
>    EGLConfig ecfg;
>    EGLint num_config;
>    if ( !eglChooseConfig( egl_display, attr, &ecfg, 1, &num_config ) ) {
>       cerr << "Failed to choose config (eglError: " << eglGetError() << ")" << endl;
>       return 1;
>    }
>
> The code above prints 'Failed to choose config', while the same code works
> fine when I compile with:
>
> ../../configure --prefix=~/lib/mesa/dri --build=x86_64-linux-gnu
>     --with-driver=dri --with-dri-drivers=swrast
>     --with-dri-driverdir=~/lib/mesa/dri/
>     --with-dri-searchpath='~/lib/mesa/dri' --enable-glx-tls --enable-xa
>     --enable-driglx-direct --with-egl-platforms=x11
>     --enable-gallium-llvm=yes --with-gallium-drivers=swrast --enable-gles1
>     --enable-gles2 --enable-gallium-egl --disable-glu
>     CFLAGS="-Wall -g -O2" CXXFLAGS="-Wall -g -O2"
>
> Could someone please suggest what could be causing this?

I suspect that, in the first build configuration, the built EGL does not
support 16-bit configs. Does your app work if you set EGL_BUFFER_SIZE=24 or
EGL_BUFFER_SIZE=32?
Re: [Mesa-dev] [PATCH] R600/SI: Add lit tests for llvm.SI.imageload and llvm.SI.resinfo intrinsics
Am 07.05.2013 17:37, schrieb Michel Dänzer: From: Michel Dänzer michel.daen...@amd.com Adapted from the llvm.SI.sample test. Signed-off-by: Michel Dänzer michel.daen...@amd.com Reviewed-by: Christian König christian.koe...@amd.com --- test/CodeGen/R600/llvm.SI.imageload.ll | 87 ++ test/CodeGen/R600/llvm.SI.resinfo.ll | 110 + 2 files changed, 197 insertions(+) create mode 100644 test/CodeGen/R600/llvm.SI.imageload.ll create mode 100644 test/CodeGen/R600/llvm.SI.resinfo.ll diff --git a/test/CodeGen/R600/llvm.SI.imageload.ll b/test/CodeGen/R600/llvm.SI.imageload.ll new file mode 100644 index 000..6b321f0 --- /dev/null +++ b/test/CodeGen/R600/llvm.SI.imageload.ll @@ -0,0 +1,87 @@ +;RUN: llc %s -march=r600 -mcpu=verde | FileCheck %s + +;CHECK: IMAGE_LOAD_MIP {{VGPR[0-9]+_VGPR[0-9]+_VGPR[0-9]+_VGPR[0-9]+}}, 15, 0, 0, -1 +;CHECK: IMAGE_LOAD_MIP {{VGPR[0-9]+_VGPR[0-9]+}}, 3, 0, 0, 0 +;CHECK: IMAGE_LOAD_MIP {{VGPR[0-9]+}}, 2, 0, 0, 0 +;CHECK: IMAGE_LOAD_MIP {{VGPR[0-9]+}}, 1, 0, 0, 0 +;CHECK: IMAGE_LOAD_MIP {{VGPR[0-9]+}}, 4, 0, 0, 0 +;CHECK: IMAGE_LOAD_MIP {{VGPR[0-9]+}}, 8, 0, 0, 0 +;CHECK: IMAGE_LOAD_MIP {{VGPR[0-9]+_VGPR[0-9]+}}, 5, 0, 0, 0 +;CHECK: IMAGE_LOAD_MIP {{VGPR[0-9]+_VGPR[0-9]+}}, 12, 0, 0, -1 +;CHECK: IMAGE_LOAD_MIP {{VGPR[0-9]+_VGPR[0-9]+_VGPR[0-9]+}}, 7, 0, 0, 0 +;CHECK: IMAGE_LOAD_MIP {{VGPR[0-9]+}}, 8, 0, 0, -1 + +define void @test(i32 %a1, i32 %a2, i32 %a3, i32 %a4) { + %v1 = insertelement 4 x i32 undef, i32 %a1, i32 0 + %v2 = insertelement 4 x i32 undef, i32 %a1, i32 1 + %v3 = insertelement 4 x i32 undef, i32 %a1, i32 2 + %v4 = insertelement 4 x i32 undef, i32 %a1, i32 3 + %v5 = insertelement 4 x i32 undef, i32 %a2, i32 0 + %v6 = insertelement 4 x i32 undef, i32 %a2, i32 1 + %v10 = insertelement 4 x i32 undef, i32 %a3, i32 1 + %v11 = insertelement 4 x i32 undef, i32 %a3, i32 2 + %v15 = insertelement 4 x i32 undef, i32 %a4, i32 2 + %v16 = insertelement 4 x i32 undef, i32 %a4, i32 3 + %res1 = call 4 x i32 @llvm.SI.imageload.(4 x i32 %v1, + 8 x i32 undef, 
i32 1) + %res2 = call 4 x i32 @llvm.SI.imageload.(4 x i32 %v2, + 8 x i32 undef, i32 2) + %res3 = call 4 x i32 @llvm.SI.imageload.(4 x i32 %v3, + 8 x i32 undef, i32 3) + %res4 = call 4 x i32 @llvm.SI.imageload.(4 x i32 %v4, + 8 x i32 undef, i32 4) + %res5 = call 4 x i32 @llvm.SI.imageload.(4 x i32 %v5, + 8 x i32 undef, i32 5) + %res6 = call 4 x i32 @llvm.SI.imageload.(4 x i32 %v6, + 8 x i32 undef, i32 6) + %res10 = call 4 x i32 @llvm.SI.imageload.(4 x i32 %v10, + 8 x i32 undef, i32 10) + %res11 = call 4 x i32 @llvm.SI.imageload.(4 x i32 %v11, + 8 x i32 undef, i32 11) + %res15 = call 4 x i32 @llvm.SI.imageload.(4 x i32 %v15, + 8 x i32 undef, i32 15) + %res16 = call 4 x i32 @llvm.SI.imageload.(4 x i32 %v16, + 8 x i32 undef, i32 16) + %e1 = extractelement 4 x i32 %res1, i32 0 + %e2 = extractelement 4 x i32 %res2, i32 1 + %e3 = extractelement 4 x i32 %res3, i32 2 + %e4 = extractelement 4 x i32 %res4, i32 3 + %t0 = extractelement 4 x i32 %res5, i32 0 + %t1 = extractelement 4 x i32 %res5, i32 1 + %e5 = add i32 %t0, %t1 + %t2 = extractelement 4 x i32 %res6, i32 0 + %t3 = extractelement 4 x i32 %res6, i32 2 + %e6 = add i32 %t2, %t3 + %t10 = extractelement 4 x i32 %res10, i32 2 + %t11 = extractelement 4 x i32 %res10, i32 3 + %e10 = add i32 %t10, %t11 + %t12 = extractelement 4 x i32 %res11, i32 0 + %t13 = extractelement 4 x i32 %res11, i32 1 + %t14 = extractelement 4 x i32 %res11, i32 2 + %t15 = add i32 %t12, %t13 + %e11 = add i32 %t14, %t15 + %t28 = extractelement 4 x i32 %res15, i32 0 + %t29 = extractelement 4 x i32 %res15, i32 1 + %t30 = extractelement 4 x i32 %res15, i32 2 + %t31 = extractelement 4 x i32 %res15, i32 3 + %t32 = add i32 %t28, %t29 + %t33 = add i32 %t30, %t31 + %e15 = add i32 %t32, %t33 + %e16 = extractelement 4 x i32 %res16, i32 3 + %s1 = add i32 %e1, %e2 + %s2 = add i32 %s1, %e3 + %s3 = add i32 %s2, %e4 + %s4 = add i32 %s3, %e5 + %s5 = add i32 %s4, %e6 + %s9 = add i32 %s5, %e10 + %s10 = add i32 %s9, %e11 + %s14 = add i32 %s10, %e15 + %s15 = add i32 %s14, 
%e16 + %s16 = bitcast i32 %s15 to float + call void @llvm.SI.export(i32 15, i32 0, i32 1, i32 12, i32 0, float %s16, float %s16, float %s16, float %s16) + ret void +} + +declare 4 x i32 @llvm.SI.imageload.(4 x i32, 8 x i32, i32) readnone + +declare void @llvm.SI.export(i32, i32, i32, i32, i32, float, float, float, float) diff --git a/test/CodeGen/R600/llvm.SI.resinfo.ll b/test/CodeGen/R600/llvm.SI.resinfo.ll new file mode 100644 index 000..237cea6 --- /dev/null +++ b/test/CodeGen/R600/llvm.SI.resinfo.ll @@ -0,0 +1,110 @@ +;RUN: llc %s -march=r600 -mcpu=verde | FileCheck %s + +;CHECK: IMAGE_GET_RESINFO {{VGPR[0-9]+_VGPR[0-9]+_VGPR[0-9]+_VGPR[0-9]+}}, 15, 0, 0, -1 +;CHECK: IMAGE_GET_RESINFO {{VGPR[0-9]+_VGPR[0-9]+}}, 3, 0, 0, 0
Re: [Mesa-dev] [PATCH] egl/android: Fix error condition for EGL_ANDROID_image_native_buffer
On 05/07/2013 01:19 AM, Chia-I Wu wrote: On Tue, May 7, 2013 at 3:49 PM, Pohjolainen, Topi topi.pohjolai...@intel.com wrote: On Mon, May 06, 2013 at 02:23:52PM -0700, Chad Versace wrote: Emit EGL_BAD_CONTEXT if the user passes a context to eglCreateImageKHR(type=EGL_ANDROID_image_native_buffer). From the EGL_ANDROID_image_native_buffer spec: * If target is EGL_NATIVE_BUFFER_ANDROID and ctx is not EGL_NO_CONTEXT, the error EGL_BAD_CONTEXT is generated. Note: This is a candidate for the stable branches. CC: Tapani Pälli tapani.pa...@intel.com Signed-off-by: Chad Versace chad.vers...@linux.intel.com --- src/egl/drivers/dri2/platform_android.c | 16 ++-- 1 file changed, 14 insertions(+), 2 deletions(-) diff --git a/src/egl/drivers/dri2/platform_android.c b/src/egl/drivers/dri2/platform_android.c index cee4035..ed50907 100644 --- a/src/egl/drivers/dri2/platform_android.c +++ b/src/egl/drivers/dri2/platform_android.c @@ -337,7 +337,7 @@ droid_swap_buffers(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface *draw) } static _EGLImage * -dri2_create_image_android_native_buffer(_EGLDisplay *disp, +dri2_create_image_android_native_buffer(_EGLDisplay *disp, _EGLContext *ctx, struct ANativeWindowBuffer *buf) { struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp); @@ -346,6 +346,18 @@ dri2_create_image_android_native_buffer(_EGLDisplay *disp, uint32_t offsets[3], strides[3], handles[3], tmp; EGLint format; + if (ctx != NULL) { I did a similar check for the 'EGL_LINUX_DMA_BUF_EXT'. Technically 'eglapi.c::eglCreateImageKhr()' does a lookup of the context via '_eglLookupContext()' and does a translation of 'EGL_NO_CONTEXT' also (from NULL to NULL). Hence I chose to do the check there. But would it be better for me to do it also in the driver side as the target is valid only for linux platforms anyway? I will suggest do the check in the driver for the moment. eglapi.[ch] should be generated ultimately, IMHO, and adding extension-specific there will make the switch harder. 
Topi, I chose to place the check in platform_android.c because this is a
platform-specific extension. There's no sense in wasting cycles in
platform-common code for a platform-specific check.

For dma_buf_import, I think you put the check in the right place. That
extension is not confined to any of Wayland, X11, nor Android, so it makes
sense to put the check in the common code to reduce code duplication.

Chia-I, do you have any concrete plans to generate eglapi.c? What would that
generated file look like?
Re: [Mesa-dev] [PATCH 3/3] i965: Use Y-tiled blits to untile for cached mappings of miptrees.
Kenneth Graunke kenn...@whitecape.org writes:

> On 05/06/2013 04:41 PM, Eric Anholt wrote:
>> Fixes a regression in firefox's ReadScreenIntoImageSurface ->
>> glReadPixels() path with the introduction of Y tiling.
>>
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64213
>> ---
>>  src/mesa/drivers/dri/intel/intel_mipmap_tree.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
>> index 8970228..7f4cb4a 100644
>> --- a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
>> +++ b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
>> @@ -1903,7 +1903,8 @@ intel_miptree_map_singlesample(struct intel_context *intel,
>>     else if (intel->has_llc &&
>>              !(mode & GL_MAP_WRITE_BIT) &&
>>              !mt->compressed &&
>> -            mt->region->tiling == I915_TILING_X &&
>> +            (mt->region->tiling == I915_TILING_X ||
>> +             (intel->gen >= 6 && mt->region->tiling == I915_TILING_Y)) &&
>>              mt->region->pitch < 32768) {
>>        intel_miptree_map_blit(intel, mt, map, level, slice);
>>     } else if (mt->region->tiling != I915_TILING_NONE &&
>
> This patch is fine, but the blitter can handle untiled buffers as well.
> It might be even better (and simpler) as:
>
>    (intel->gen >= 6 || mt->region->tiling != I915_TILING_Y)
>
> That said, untiled buffers can also be mapped via the CPU rather than
> the GTT with a fence, so maybe it's not as big of a deal.

The point of this path is to get an untiled buffer that can be mapped
via the CPU rather than the GTT with a fence. If you don't take it,
you'll map with the CPU or the GTT based on tiling and LLC.
Re: [Mesa-dev] [PATCH 06/12] glsl: Add lowering pass for ir_triop_vector_insert
Ian Romanick i...@freedesktop.org writes:

> From: Ian Romanick ian.d.roman...@intel.com
>
> This will eventually replace do_vec_index_to_cond_assign. This lowering
> pass is called in all the places where do_vec_index_to_cond_assign or
> do_vec_index_to_swizzle is called.
>
> v2: Use WRITEMASK_* instead of integer literals. Use a more concise
> method of generating broadcast_index. Both suggested by Eric.

I still object to using a vec4 cmp_result. It's awful for code
generation. Just compare in the if statements.

IFs vs cond assigns I don't care about -- that's something the driver
should be deciding.
Re: [Mesa-dev] [PATCH 10/12] glsl: Use vector-insert and vector-extract on elements of gl_ClipDistanceMESA
On 3 May 2013 16:07, Ian Romanick i...@freedesktop.org wrote:
> From: Ian Romanick ian.d.roman...@intel.com
>
> Variable indexing into vectors using ir_dereference_array is being
> removed, so this lowering pass has to generate something different.
>
> v2: Convert tabs to spaces. Suggested by Eric.
>
> Signed-off-by: Ian Romanick ian.d.roman...@intel.com
> Cc: Paul Berry stereotype...@gmail.com
> ---
>  src/glsl/lower_clip_distance.cpp | 36 ++--
>  1 file changed, 34 insertions(+), 2 deletions(-)
>
> diff --git a/src/glsl/lower_clip_distance.cpp b/src/glsl/lower_clip_distance.cpp
> index 19068fb..c93c821e 100644
> --- a/src/glsl/lower_clip_distance.cpp
> +++ b/src/glsl/lower_clip_distance.cpp
> @@ -197,10 +197,17 @@ lower_clip_distance_visitor::handle_rvalue(ir_rvalue **rv)
>           ir_rvalue *swizzle_index;
>           this->create_indices(array->array_index, array_index, swizzle_index);
>           void *mem_ctx = ralloc_parent(array);
> -         array->array =
> +
> +         ir_dereference_array *const ClipDistanceMESA_deref =
>              new(mem_ctx) ir_dereference_array(this->new_clip_distance_var,
>                                                array_index);
> -         array->array_index = swizzle_index;
> +
> +         ir_expression *const expr =
> +            new(mem_ctx) ir_expression(ir_binop_vector_extract,
> +                                       ClipDistanceMESA_deref,
> +                                       swizzle_index);
> +
> +         *rv = expr;
>        }
>     }
>  }
> @@ -280,7 +287,32 @@ lower_clip_distance_visitor::visit_leave(ir_assignment *ir)
>        return visit_continue;
>     }
>
> +   /* Handle the LHS as if it were an r-value.  This may cause the LHS to get
> +    * replaced with an ir_expression or ir_binop_vector_extract.  If this
> +    * occurs, replace it with a dereference of the vector, and replace the RHS
> +    * with an ir_triop_vector_insert.
> +    */
>     handle_rvalue((ir_rvalue **)&ir->lhs);
> +   if (ir->lhs->ir_type == ir_type_expression) {
> +      ir_expression *const expr = (ir_expression *) ir->lhs;
> +
> +      /* The expression must be of the form:
> +       *
> +       *     (vector_extract gl_ClipDistanceMESA[i], j).
> +       */
> +      assert(expr->operation == ir_binop_vector_extract);
> +      assert(expr->operands[0]->ir_type == ir_type_dereference_array);
> +
> +      ir_dereference *const new_lhs = (ir_dereference *) expr->operands[0];
> +      ir->rhs = new(ctx) ir_expression(ir_triop_vector_insert,
> +                                       new_lhs->type,
> +                                       new_lhs->clone(ctx, NULL),
> +                                       ir->rhs,
> +                                       expr->operands[1]);
> +      ir->set_lhs(new_lhs);
> +      ir->write_mask = (1U << new_lhs->type->vector_elements) - 1;

We know that the LHS is always a vec4, so I think it would be clearer to
just say:

   ir->write_mask = 0xf;

But I don't feel terribly strongly about it. Regardless of whether you
decide to make this change, the patch is:

Reviewed-by: Paul Berry stereotype...@gmail.com

> +   }
> +
>     return rvalue_visit(ir);
>  }
> --
> 1.8.1.4
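Paul's remark about the write mask can be checked with a few lines of C. This is a minimal sketch, assuming the archived expression originally read `(1U << vector_elements) - 1` (the shift operator is missing from the archive); the helper name is hypothetical.

```c
#include <assert.h>

/*
 * Build a write mask with the low `components` bits set.
 * Valid for GLSL vector sizes 1 through 4.
 */
static unsigned
sketch_full_write_mask(unsigned components)
{
   assert(components >= 1 && components <= 4);
   return (1U << components) - 1;
}
```

For a vec4 the general form and the literal 0xf agree, so the choice between them is about readability rather than behaviour.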
Re: [Mesa-dev] [PATCH 1/3] mesa: Implement ext_framebuffer_multisample_blit_scaled extension
On 1 May 2013 14:10, Anuj Phogat anuj.pho...@gmail.com wrote:
> Signed-off-by: Anuj Phogat anuj.pho...@gmail.com
> ---
>  src/mesa/main/extensions.c |  1 +
>  src/mesa/main/fbobject.c   | 17 ++---
>  src/mesa/main/mtypes.h     |  1 +
>  3 files changed, 16 insertions(+), 3 deletions(-)
>
> diff --git a/src/mesa/main/extensions.c b/src/mesa/main/extensions.c
> index d8c5f53..15c9026 100644
> --- a/src/mesa/main/extensions.c
> +++ b/src/mesa/main/extensions.c
> @@ -183,6 +183,7 @@ static const struct extension extension_table[] = {
>     { "GL_EXT_fog_coord", o(EXT_fog_coord), GLL, 1999 },
>     { "GL_EXT_framebuffer_blit", o(EXT_framebuffer_blit), GL, 2005 },
>     { "GL_EXT_framebuffer_multisample", o(EXT_framebuffer_multisample), GL, 2005 },
> +   { "GL_EXT_framebuffer_multisample_blit_scaled", o(EXT_framebuffer_multisample_blit_scaled), GL, 2011 },
>     { "GL_EXT_framebuffer_object", o(EXT_framebuffer_object), GL, 2000 },
>     { "GL_EXT_framebuffer_sRGB", o(EXT_framebuffer_sRGB), GL, 1998 },
>     { "GL_EXT_gpu_program_parameters", o(EXT_gpu_program_parameters), GLL, 2006 },
> diff --git a/src/mesa/main/fbobject.c b/src/mesa/main/fbobject.c
> index 645a8a3..f5696b1 100644
> --- a/src/mesa/main/fbobject.c
> +++ b/src/mesa/main/fbobject.c
> @@ -2940,8 +2940,18 @@ _mesa_BlitFramebuffer(GLint srcX0, GLint srcY0, GLint srcX1, GLint srcY1,
>        return;
>     }
>
> -   if (filter != GL_NEAREST && filter != GL_LINEAR) {
> -      _mesa_error(ctx, GL_INVALID_ENUM, "glBlitFramebufferEXT(filter)");
> +   if (filter != GL_NEAREST && filter != GL_LINEAR &&
> +       ((filter != GL_SCALED_RESOLVE_FASTEST_EXT &&
> +         filter != GL_SCALED_RESOLVE_NICEST_EXT) ||
> +        !ctx->Extensions.EXT_framebuffer_multisample_blit_scaled)) {
> +      _mesa_error(ctx, GL_INVALID_ENUM, "glBlitFramebufferEXT(filter)");
> +      return;

I believe this is correct, but I find deeply nested conditionals like
this really hard to follow. How about adding a function like:

   bool
   is_valid_blit_filter(const struct gl_context *ctx, GLenum filter)
   {
      switch (filter) {
      case GL_NEAREST:
      case GL_LINEAR:
         return true;
      case GL_SCALED_RESOLVE_FASTEST_EXT:
      case GL_SCALED_RESOLVE_NICEST_EXT:
         return ctx->Extensions.EXT_framebuffer_multisample_blit_scaled;
      default:
         return false;
      }
   }

In any case, this patch is:

Reviewed-by: Paul Berry stereotype...@gmail.com

> +   }
> +
> +   if ((filter == GL_SCALED_RESOLVE_FASTEST_EXT ||
> +        filter == GL_SCALED_RESOLVE_NICEST_EXT) &&
> +       (readFb->Visual.samples == 0 || drawFb->Visual.samples > 0)) {
> +      _mesa_error(ctx, GL_INVALID_OPERATION, "glBlitFramebufferEXT(filter)");
>        return;
>     }
>
> @@ -3174,7 +3184,8 @@ _mesa_BlitFramebuffer(GLint srcX0, GLint srcY0, GLint srcX1, GLint srcY1,
>     }
>
>     /* extra checks for multisample copies... */
> -   if (readFb->Visual.samples > 0 || drawFb->Visual.samples > 0) {
> +   if ((readFb->Visual.samples > 0 || drawFb->Visual.samples > 0) &&
> +       (filter == GL_NEAREST || filter == GL_LINEAR)) {
>        /* src and dest region sizes must be the same */
>        if (abs(srcX1 - srcX0) != abs(dstX1 - dstX0) ||
>            abs(srcY1 - srcY0) != abs(dstY1 - dstY0)) {
> diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
> index 139c6af..9ec0c7d 100644
> --- a/src/mesa/main/mtypes.h
> +++ b/src/mesa/main/mtypes.h
> @@ -3020,6 +3020,7 @@ struct gl_extensions
>     GLboolean EXT_fog_coord;
>     GLboolean EXT_framebuffer_blit;
>     GLboolean EXT_framebuffer_multisample;
> +   GLboolean EXT_framebuffer_multisample_blit_scaled;
>     GLboolean EXT_framebuffer_object;
>     GLboolean EXT_framebuffer_sRGB;
>     GLboolean EXT_gpu_program_parameters;
> --
> 1.8.1.4
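The two filter checks in the patch, written in the switch style Paul suggests, could look roughly like the standalone sketch below. The function name, the flattened parameters, and the `SKETCH_` constants (values copied from the GL headers and the extension spec) are assumptions made for illustration; this is not the code that was committed.

```c
#include <stdbool.h>

/* Enum values as published in the GL headers and the extension spec. */
#define SKETCH_GL_NO_ERROR                   0
#define SKETCH_GL_INVALID_ENUM               0x0500
#define SKETCH_GL_INVALID_OPERATION          0x0502
#define SKETCH_GL_NEAREST                    0x2600
#define SKETCH_GL_LINEAR                     0x2601
#define SKETCH_GL_SCALED_RESOLVE_FASTEST_EXT 0x90BA
#define SKETCH_GL_SCALED_RESOLVE_NICEST_EXT  0x90BB

/*
 * Returns the GL error the blit would raise for `filter`, or
 * SKETCH_GL_NO_ERROR if the filter is acceptable.
 *
 * ext_enabled  - stands in for ctx->Extensions.EXT_framebuffer_multisample_blit_scaled
 * read_samples - stands in for readFb->Visual.samples
 * draw_samples - stands in for drawFb->Visual.samples
 */
static unsigned
sketch_check_blit_filter(bool ext_enabled, unsigned filter,
                         unsigned read_samples, unsigned draw_samples)
{
   switch (filter) {
   case SKETCH_GL_NEAREST:
   case SKETCH_GL_LINEAR:
      return SKETCH_GL_NO_ERROR;

   case SKETCH_GL_SCALED_RESOLVE_FASTEST_EXT:
   case SKETCH_GL_SCALED_RESOLVE_NICEST_EXT:
      /* Unknown enum unless the extension is exposed. */
      if (!ext_enabled)
         return SKETCH_GL_INVALID_ENUM;
      /* Scaled resolve needs a multisampled source and a
       * single-sampled destination. */
      if (read_samples == 0 || draw_samples > 0)
         return SKETCH_GL_INVALID_OPERATION;
      return SKETCH_GL_NO_ERROR;

   default:
      return SKETCH_GL_INVALID_ENUM;
   }
}
```

Folding both checks into one switch keeps each filter's rules in a single place, at the cost of passing the sample counts into the helper.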
Re: [Mesa-dev] [PATCH 06/12] glsl: Add lowering pass for ir_triop_vector_insert
On 05/07/2013 09:53 AM, Eric Anholt wrote:
> Ian Romanick i...@freedesktop.org writes:
>
>> From: Ian Romanick ian.d.roman...@intel.com
>>
>> This will eventually replace do_vec_index_to_cond_assign. This lowering
>> pass is called in all the places where do_vec_index_to_cond_assign or
>> do_vec_index_to_swizzle is called.
>>
>> v2: Use WRITEMASK_* instead of integer literals. Use a more concise
>> method of generating broadcast_index. Both suggested by Eric.
>
> I still object to using a vec4 cmp_result. It's awful for code
> generation. Just compare in the if statements.

So you're telling me that it's unacceptable to continue generating the
same code that we currently generate? ...for a case that happens in how
many real shaders?

> IFs vs cond assigns I don't care about -- that's something the driver
> should be deciding.
[Mesa-dev] [PATCH] nouveau: emit and flush fence in fence_signalled if needed
The Mesa state tracker expects us to emit the fence even if it doesn't
call fence_finish. Notably, this occurs when glClientWaitSync is called
with timeout 0.

Fixes Portal and Left 4 Dead 2, which were both stalling on startup by
repeatedly calling glClientWaitSync with timeout 0 while waiting for
commands to complete.
---
 src/gallium/drivers/nouveau/nouveau_fence.c | 36 ++-
 src/gallium/drivers/nouveau/nouveau_fence.h |  1 +
 2 files changed, 25 insertions(+), 12 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nouveau_fence.c b/src/gallium/drivers/nouveau/nouveau_fence.c
index dea146c..722be01 100644
--- a/src/gallium/drivers/nouveau/nouveau_fence.c
+++ b/src/gallium/drivers/nouveau/nouveau_fence.c
@@ -167,6 +167,25 @@ nouveau_fence_update(struct nouveau_screen *screen, boolean flushed)
    }
 }

+boolean
+nouveau_fence_ensure_flushed(struct nouveau_fence *fence)
+{
+   struct nouveau_screen *screen = fence->screen;
+
+   if (fence->state < NOUVEAU_FENCE_STATE_EMITTED) {
+      nouveau_fence_emit(fence);
+
+      if (fence == screen->fence.current)
+         nouveau_fence_new(screen, &screen->fence.current, FALSE);
+   }
+   if (fence->state < NOUVEAU_FENCE_STATE_FLUSHED) {
+      if (nouveau_pushbuf_kick(screen->pushbuf, screen->pushbuf->channel))
+         return FALSE;
+   }
+
+   return TRUE;
+}
+
 #define NOUVEAU_FENCE_MAX_SPINS (1 << 31)

 boolean
@@ -174,8 +193,9 @@ nouveau_fence_signalled(struct nouveau_fence *fence)
 {
    struct nouveau_screen *screen = fence->screen;

-   if (fence->state >= NOUVEAU_FENCE_STATE_EMITTED)
-      nouveau_fence_update(screen, FALSE);
+   if (!nouveau_fence_ensure_flushed(fence))
+      return FALSE;
+   nouveau_fence_update(screen, FALSE);

    return fence->state == NOUVEAU_FENCE_STATE_SIGNALLED;
 }
@@ -189,16 +209,8 @@ nouveau_fence_wait(struct nouveau_fence *fence)
    /* wtf, someone is waiting on a fence in flush_notify handler? */
    assert(fence->state != NOUVEAU_FENCE_STATE_EMITTING);

-   if (fence->state < NOUVEAU_FENCE_STATE_EMITTED) {
-      nouveau_fence_emit(fence);
-
-      if (fence == screen->fence.current)
-         nouveau_fence_new(screen, &screen->fence.current, FALSE);
-   }
-   if (fence->state < NOUVEAU_FENCE_STATE_FLUSHED) {
-      if (nouveau_pushbuf_kick(screen->pushbuf, screen->pushbuf->channel))
-         return FALSE;
-   }
+   if (!nouveau_fence_ensure_flushed(fence))
+      return FALSE;

    do {
       nouveau_fence_update(screen, FALSE);
diff --git a/src/gallium/drivers/nouveau/nouveau_fence.h b/src/gallium/drivers/nouveau/nouveau_fence.h
index 3984a9a..d497c7f 100644
--- a/src/gallium/drivers/nouveau/nouveau_fence.h
+++ b/src/gallium/drivers/nouveau/nouveau_fence.h
@@ -34,6 +34,7 @@ boolean nouveau_fence_new(struct nouveau_screen *, struct nouveau_fence **,
 boolean nouveau_fence_work(struct nouveau_fence *, void (*)(void *), void *);
 void    nouveau_fence_update(struct nouveau_screen *, boolean flushed);
 void    nouveau_fence_next(struct nouveau_screen *);
+boolean nouveau_fence_ensure_flushed(struct nouveau_fence *);
 boolean nouveau_fence_wait(struct nouveau_fence *);
 boolean nouveau_fence_signalled(struct nouveau_fence *);

--
1.7.9.5
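The control flow this patch introduces can be modelled without the driver. The following is a toy simulation under stated assumptions, not nouveau code: the struct, the `sketch_` names, and the `kick_fails` and `gpu_done` flags are hypothetical stand-ins for nouveau_fence_emit(), nouveau_pushbuf_kick(), and nouveau_fence_update().

```c
#include <stdbool.h>

/* Ordered so that "<" means "has not yet reached". */
enum sketch_fence_state {
   SKETCH_FENCE_AVAILABLE,
   SKETCH_FENCE_EMITTING,
   SKETCH_FENCE_EMITTED,
   SKETCH_FENCE_FLUSHED,
   SKETCH_FENCE_SIGNALLED
};

struct sketch_fence {
   enum sketch_fence_state state;
   bool kick_fails;  /* simulate a failing push buffer kick */
   bool gpu_done;    /* simulate the GPU having passed the fence */
};

/* Emit the fence if needed, then flush it; false if the flush fails. */
static bool
sketch_fence_ensure_flushed(struct sketch_fence *f)
{
   if (f->state < SKETCH_FENCE_EMITTED)
      f->state = SKETCH_FENCE_EMITTED;   /* stands in for the emit call */

   if (f->state < SKETCH_FENCE_FLUSHED) {
      if (f->kick_fails)
         return false;
      f->state = SKETCH_FENCE_FLUSHED;   /* stands in for the kick */
   }
   return true;
}

/* Non-blocking query: a single poll after making sure work was submitted. */
static bool
sketch_fence_signalled(struct sketch_fence *f)
{
   if (!sketch_fence_ensure_flushed(f))
      return false;

   if (f->gpu_done)
      f->state = SKETCH_FENCE_SIGNALLED; /* stands in for the update call */

   return f->state == SKETCH_FENCE_SIGNALLED;
}
```

In this model a caller that polls with a zero timeout still makes progress, because the first poll submits the fence instead of leaving it unemitted.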
Re: [Mesa-dev] [PATCH 3/3] i965: Use Y-tiled blits to untile for cached mappings of miptrees.
On 05/07/2013 09:47 AM, Eric Anholt wrote:
> Kenneth Graunke kenn...@whitecape.org writes:
>
>> On 05/06/2013 04:41 PM, Eric Anholt wrote:
>>> Fixes a regression in firefox's ReadScreenIntoImageSurface ->
>>> glReadPixels() path with the introduction of Y tiling.
>>>
>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64213
>>> ---
>>>  src/mesa/drivers/dri/intel/intel_mipmap_tree.c | 3 ++-
>>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
>>> index 8970228..7f4cb4a 100644
>>> --- a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
>>> +++ b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
>>> @@ -1903,7 +1903,8 @@ intel_miptree_map_singlesample(struct intel_context *intel,
>>>     else if (intel->has_llc &&
>>>              !(mode & GL_MAP_WRITE_BIT) &&
>>>              !mt->compressed &&
>>> -            mt->region->tiling == I915_TILING_X &&
>>> +            (mt->region->tiling == I915_TILING_X ||
>>> +             (intel->gen >= 6 && mt->region->tiling == I915_TILING_Y)) &&
>>>              mt->region->pitch < 32768) {
>>>        intel_miptree_map_blit(intel, mt, map, level, slice);
>>>     } else if (mt->region->tiling != I915_TILING_NONE &&
>>
>> This patch is fine, but the blitter can handle untiled buffers as well.
>> It might be even better (and simpler) as:
>>
>>    (intel->gen >= 6 || mt->region->tiling != I915_TILING_Y)
>>
>> That said, untiled buffers can also be mapped via the CPU rather than
>> the GTT with a fence, so maybe it's not as big of a deal.
>
> The point of this path is to get an untiled buffer that can be mapped
> via the CPU rather than the GTT with a fence. If you don't take it,
> you'll map with the CPU or the GTT based on tiling and LLC.

Right. So it would be stupid to do so.

Reviewed-by: Kenneth Graunke kenn...@whitecape.org
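The condition under review can be read as a standalone predicate. This is a minimal sketch, assuming the comparison that is missing from the archived diff is `pitch < 32768`, and using hypothetical `sketch_` names in place of the intel_context and miptree fields.

```c
#include <stdbool.h>

enum sketch_tiling {
   SKETCH_TILING_NONE,
   SKETCH_TILING_X,
   SKETCH_TILING_Y
};

/*
 * Decide whether a read-only mapping should go through a blit that
 * untiles into a CPU-mappable buffer. Mirrors the patched condition:
 * X tiling on any generation, Y tiling only on gen6 and later.
 */
static bool
sketch_use_blit_map(bool has_llc, bool map_for_write, bool compressed,
                    int gen, enum sketch_tiling tiling, unsigned pitch)
{
   return has_llc &&
          !map_for_write &&
          !compressed &&
          (tiling == SKETCH_TILING_X ||
           (gen >= 6 && tiling == SKETCH_TILING_Y)) &&
          pitch < 32768;
}
```

Untiled buffers deliberately fail the predicate, which matches Eric's point that the path exists only to produce an untiled copy.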
Re: [Mesa-dev] [PATCH 1/3] i965: Count occlusion query samples for CopyPixels using the 2D engine.
On 05/06/2013 04:41 PM, Eric Anholt wrote:
> We accidentally fixed the piglit test for this when introducing Y
> tiling, since this path stopped being executed. In reenabling this path
> for Y tiling, we ended up regressing it again, so just fix it.

Patch 1 and 2 are

Reviewed-by: Ian Romanick ian.d.roman...@intel.com

I'll defer to Ken on patch 3.

> ---
>  src/mesa/drivers/dri/intel/intel_pixel_copy.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/intel/intel_pixel_copy.c b/src/mesa/drivers/dri/intel/intel_pixel_copy.c
> index 5d80fed..34376ba 100644
> --- a/src/mesa/drivers/dri/intel/intel_pixel_copy.c
> +++ b/src/mesa/drivers/dri/intel/intel_pixel_copy.c
> @@ -213,6 +213,9 @@ do_blit_copypixels(struct gl_context * ctx,
>        return false;
>     }
>
> +   if (ctx->Query.CurrentOcclusionObject)
> +      ctx->Query.CurrentOcclusionObject->Result += width * height;
> +
>  out:
>     intel_check_front_buffer_rendering(intel);
>
Re: [Mesa-dev] [PATCH 1/2] glx: Ensure that glXWaitX actually waits for X.
On 05/06/2013 05:39 PM, Eric Anholt wrote:
> Ever since fake front was introduced in
> 63b51b5cf17ddde09b72a2811296f37b9a4c5ad2, we were skipping the XSync()
> in the non-fake-front path, so compositors like Firefox's GL canvas
> were having to manually put it in outside of glXWaitX().
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=52930

Both patches are

Reviewed-by: Ian Romanick ian.d.roman...@intel.com

and

NOTE: This is a candidate for the stable branches.

It might be worth adding the bit about pixmaps from the second commit
message to this one as well. I was unsure how you'd hit this problem
until I read the second patch.

> ---
>  src/glx/dri2_glx.c | 21 +
>  1 file changed, 17 insertions(+), 4 deletions(-)
>
> diff --git a/src/glx/dri2_glx.c b/src/glx/dri2_glx.c
> index 7ce5775..3cdd249 100644
> --- a/src/glx/dri2_glx.c
> +++ b/src/glx/dri2_glx.c
> @@ -639,10 +639,23 @@ dri2_wait_x(struct glx_context *gc)
>     struct dri2_drawable *priv = (struct dri2_drawable *)
>        GetGLXDRIDrawable(gc->currentDpy, gc->currentDrawable);
>
> -   if (priv == NULL || !priv->have_fake_front)
> -      return;
> -
> -   dri2_copy_drawable(priv, DRI2BufferFakeFrontLeft, DRI2BufferFrontLeft);
> +   if (priv != NULL && priv->have_fake_front) {
> +      /* Ask the server to update our copy of the front buffer from the real
> +       * front buffer.  This will round-trip with the server, so we can skip
> +       * the XSync().
> +       */
> +      dri2_copy_drawable(priv, DRI2BufferFakeFrontLeft, DRI2BufferFrontLeft);
> +   } else {
> +      /* From the GLX 1.4 spec, page 33:
> +       *
> +       *     X rendering calls made prior to glXWaitX are guaranteed to be
> +       *     executed before OpenGL rendering calls made after
> +       *     glXWaitX. While the same result can be achieved using XSync,
> +       *     glXWaitX does not require a round trip to the server, and may
> +       *     therefore be more efficient.
> +       */
> +      XSync(gc->currentDpy, False);
> +   }
>  }
>
>  static void
Re: [Mesa-dev] [PATCH 1/2] i965/fs: Make virtual grf live intervals actually cover their used range.
On 04/30/2013 05:21 PM, Eric Anholt wrote: Previously, we would sometimes not consider a write to a register to extend the end of the interval, nor would we consider a read before a write to extend the start. This made for a bunch of complicated logic related to how to treat the results when dead code might be present. Instead, just extend the interval and fix dead code elimination to know how to remove it. Interestingly, this actually results in a tiny bit more optimization: total instructions in shared programs: 1391220 - 1390799 (-0.03%) instructions in affected programs: 14037 - 13616 (-3.00%) Both patches are Reviewed-by: Ian Romanick ian.d.roman...@intel.com --- src/mesa/drivers/dri/i965/brw_fs.cpp | 21 +++--- src/mesa/drivers/dri/i965/brw_fs.h | 4 +- src/mesa/drivers/dri/i965/brw_fs_cse.cpp | 2 +- .../drivers/dri/i965/brw_fs_live_variables.cpp | 76 ++ src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp | 3 +- src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 4 +- 6 files changed, 38 insertions(+), 72 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index a8610ee..0821c05 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -1449,8 +1449,8 @@ fs_visitor::compact_virtual_grfs() remap_table[i] = new_index; virtual_grf_sizes[new_index] = virtual_grf_sizes[i]; if (live_intervals_valid) { -virtual_grf_use[new_index] = virtual_grf_use[i]; -virtual_grf_def[new_index] = virtual_grf_def[i]; +virtual_grf_start[new_index] = virtual_grf_start[i]; +virtual_grf_end[new_index] = virtual_grf_end[i]; } ++new_index; } @@ -1764,10 +1764,8 @@ fs_visitor::opt_algebraic() } /** - * Must be called after calculate_live_intervales() to remove unused - * writes to registers -- register allocation will fail otherwise - * because something deffed but not used won't be considered to - * interfere with other regs. 
+ * Removes any instructions writing a VGRF where that VGRF is not used by any + * later instruction. */ bool fs_visitor::dead_code_eliminate() @@ -1780,9 +1778,12 @@ fs_visitor::dead_code_eliminate() foreach_list_safe(node, this-instructions) { fs_inst *inst = (fs_inst *)node; - if (inst-dst.file == GRF this-virtual_grf_use[inst-dst.reg] = pc) { -inst-remove(); -progress = true; + if (inst-dst.file == GRF) { + assert(this-virtual_grf_end[inst-dst.reg] = pc); + if (this-virtual_grf_end[inst-dst.reg] == pc) { +inst-remove(); +progress = true; + } } pc++; @@ -2194,7 +2195,7 @@ fs_visitor::compute_to_mrf() /* Can't compute-to-MRF this GRF if someone else was going to * read it later. */ - if (this-virtual_grf_use[inst-src[0].reg] ip) + if (this-virtual_grf_end[inst-src[0].reg] ip) continue; /* Found a move of a GRF to a MRF. Let's see if we can go diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h index c9c9856..3df2ce1 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.h +++ b/src/mesa/drivers/dri/i965/brw_fs.h @@ -434,8 +434,8 @@ public: int *virtual_grf_sizes; int virtual_grf_count; int virtual_grf_array_size; - int *virtual_grf_def; - int *virtual_grf_use; + int *virtual_grf_start; + int *virtual_grf_end; bool live_intervals_valid; /* This is the map from UNIFORM hw_reg + reg_offset as generated by diff --git a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp index b5c2200..9b60d9b 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp @@ -194,7 +194,7 @@ fs_visitor::opt_cse_local(bblock_t *block, exec_list *aeb) /* Kill any AEB entries using registers that don't get reused any * more -- a sure sign they'll fail operands_match(). 
*/ -if (src_reg-file == GRF virtual_grf_use[src_reg-reg] ip) { +if (src_reg-file == GRF virtual_grf_end[src_reg-reg] ip) { entry-remove(); ralloc_free(entry); break; diff --git a/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp b/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp index fdcfac6..dd8923e 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp @@ -167,16 +167,16 @@ fs_visitor::calculate_live_intervals() if (this-live_intervals_valid) return; - int *def = ralloc_array(mem_ctx, int, num_vars); - int *use = ralloc_array(mem_ctx, int, num_vars); - ralloc_free(this-virtual_grf_def); - ralloc_free(this-virtual_grf_use); - this-virtual_grf_def = def; -
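The idea in the commit message, that every read and write extends a register's interval and dead code elimination removes writes nothing later touches, can be sketched on a toy instruction list. Assumptions: straight-line code only (no control flow), no side effects on register-writing instructions, and hypothetical `sketch_` types that do not correspond to fs_inst or fs_visitor.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_NO_REG   (-1)
#define SKETCH_MAX_REGS 8

/* Hypothetical toy instruction: one destination, up to two sources. */
struct sketch_inst {
   int dst;       /* register written, or SKETCH_NO_REG */
   int src[2];    /* registers read, or SKETCH_NO_REG */
   bool removed;  /* set by dead code elimination */
};

/* Extend the interval of `reg` so that it covers instruction `ip`. */
static void
sketch_touch(int *start, int *end, int reg, int ip)
{
   if (reg == SKETCH_NO_REG)
      return;
   assert(reg >= 0 && reg < SKETCH_MAX_REGS);
   if (ip < start[reg])
      start[reg] = ip;
   if (ip > end[reg])
      end[reg] = ip;
}

/* Every read and every write extends the interval. */
static void
sketch_calculate_live_intervals(const struct sketch_inst *insts, int count,
                                int *start, int *end)
{
   for (int r = 0; r < SKETCH_MAX_REGS; r++) {
      start[r] = count;  /* sentinel: register not seen yet */
      end[r] = -1;
   }
   for (int ip = 0; ip < count; ip++) {
      if (insts[ip].removed)
         continue;
      sketch_touch(start, end, insts[ip].dst, ip);
      sketch_touch(start, end, insts[ip].src[0], ip);
      sketch_touch(start, end, insts[ip].src[1], ip);
   }
}

/* Remove writes whose register is not touched by any later instruction. */
static bool
sketch_dead_code_eliminate(struct sketch_inst *insts, int count)
{
   int start[SKETCH_MAX_REGS], end[SKETCH_MAX_REGS];
   bool progress = false;

   sketch_calculate_live_intervals(insts, count, start, end);

   for (int pc = 0; pc < count; pc++) {
      int dst = insts[pc].dst;
      if (insts[pc].removed || dst == SKETCH_NO_REG)
         continue;
      /* The write itself guarantees the interval reaches this point. */
      assert(end[dst] >= pc);
      if (end[dst] == pc) {
         insts[pc].removed = true;
         progress = true;
      }
   }
   return progress;
}
```

Because a write always extends its own interval, the elimination test reduces to "the interval ends at this instruction", which is the shape of the check the patch adds.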
Re: [Mesa-dev] [PATCH 06/12] glsl: Add lowering pass for ir_triop_vector_insert
Ian Romanick i...@freedesktop.org writes:

> On 05/07/2013 09:53 AM, Eric Anholt wrote:
>> Ian Romanick i...@freedesktop.org writes:
>>
>>> From: Ian Romanick ian.d.roman...@intel.com
>>>
>>> This will eventually replace do_vec_index_to_cond_assign. This lowering
>>> pass is called in all the places where do_vec_index_to_cond_assign or
>>> do_vec_index_to_swizzle is called.
>>>
>>> v2: Use WRITEMASK_* instead of integer literals. Use a more concise
>>> method of generating broadcast_index. Both suggested by Eric.
>>
>> I still object to using a vec4 cmp_result. It's awful for code
>> generation. Just compare in the if statements.
>
> So you're telling me that it's unacceptable to continue generating the
> same code that we currently generate? ...for a case that happens is how
> many real shaders?

I was just looking at the code in this patch, and saying that it's bad
for drivers, and there's extra complexity in it in order to be bad for
drivers like that.
Re: [Mesa-dev] glxgears performance higher with software renderer compared to h/w drivers
> Doesn't your glxgears tell you about it? Mine does:

It does. I just did not pay enough attention to it, and I did not know
how to stop it from syncing to vsync.

Thanks & Regards,
Divick
Re: [Mesa-dev] [PATCH 3/3] i965: Enable ext_framebuffer_multisample_blit_scaled on intel h/w
On 1 May 2013 14:10, Anuj Phogat anuj.pho...@gmail.com wrote:
> This patch enables ext_framebuffer_multisample_blit_scaled extension
> on intel h/w >= gen6.
>
> Note: Patches for piglit tests to verify this functionality are out
> for review on piglit mailing list. Tests pass for all of the scaling
> factors except 1.3 and 1.8. I'm still investigating what's so special
> about these scaling factors. I'll update my patches to fix the issue
> along with other review comments.
>
> Signed-off-by: Anuj Phogat anuj.pho...@gmail.com

We discussed this a bit in person yesterday, but I'd like to get the
discussion on record: I have some concerns about the image quality of
the method you've implemented. As I understand it, the primary use case
of this extension is to allow the client to do multisampled rendering at
slightly less than screen resolution (e.g. 720p instead of 1080p), and
then blit the result to the screen in one step while keeping most of the
quality benefits of multisampling. Since your implementation is
effectively equivalent to downsampling and then blitting using
GL_NEAREST filtering, my fear is that it will lead to blocky artifacts
that are severe enough to negate the benefit of multisampling in the
first place.

Before we turn this extension on in the Intel driver, I'd like to look
at a comparison of:

(1) your technique
(2) downsampling followed by scaling with GL_LINEAR filtering
(3) The nVidia implementation, in GL_SCALED_RESOLVE_FASTEST_EXT mode
(4) The nVidia implementation, in GL_SCALED_RESOLVE_NICEST_EXT mode
(5) Just rendering the image directly to the single-sampled destination
    buffer

I believe my nVidia card supports this extension; I'll try to gather
images of (3) and (4) today.

> ---
>  src/mesa/drivers/dri/intel/intel_extensions.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/src/mesa/drivers/dri/intel/intel_extensions.c b/src/mesa/drivers/dri/intel/intel_extensions.c
> index 18f19b8..e5c2cd6 100755
> --- a/src/mesa/drivers/dri/intel/intel_extensions.c
> +++ b/src/mesa/drivers/dri/intel/intel_extensions.c
> @@ -97,6 +97,7 @@ intelInitExtensions(struct gl_context *ctx)
>
>     if (intel->gen >= 6) {
>        ctx->Extensions.EXT_framebuffer_multisample = true;
> +      ctx->Extensions.EXT_framebuffer_multisample_blit_scaled = true;
>        ctx->Extensions.ARB_blend_func_extended = !driQueryOptionb(&intel->optionCache, "disable_blend_func_extended");
>        ctx->Extensions.ARB_draw_buffers_blend = true;
>        ctx->Extensions.ARB_ES3_compatibility = true;
> --
> 1.8.1.4
Re: [Mesa-dev] glxgears performance higher with software renderer compared to h/w drivers
Hi Patrick,

> I don't know a whole lot about Mesa's structure, but I know that the
> LLVMpipe driver is supposed to be the fastest software driver for x86
> CPUs. The reason is that it JIT compiles vertex/fragment programs into
> x86/x64 assembly using LLVM as the code generator. LLVM contains
> extensive optimization functions and is much more sophisticated than
> the rtasm module you'll see in Mesa's source tree. Additionally, the
> LLVMpipe driver will utilize multiple cores to execute vertex/fragment
> programs.
>
> For super simple rendering, e.g. flat shaded gears in glxgears, I
> wouldn't be surprised if special-case code (generally part of Mesa's
> 'swrast' module) is faster because it does not require the overhead of
> executing a program for each vertex/pixel, and can possibly operate on
> multiple pixels at once using MMX/SSE2. However, the optimizations for
> those generally target apps <= 2001, while newer apps that make
> extensive use of programmable hardware will likely be far faster on
> LLVM pipe.

I noticed that too, with a simple GLES2 app that does some not-so-simple
rendering. With that app the xlib based renderer was crawling at 8-9 fps,
while the llvm based renderer reached close to 250 fps with a single
thread. I am not sure whether that is expected, or whether the
difference should be that large.

In any case I have not been able to understand clearly the difference
between the xlib based s/w renderer and the gallium based softpipe
renderer (without llvm). Any pointers would be helpful.

Thanks & Regards,
Divick
Re: [Mesa-dev] glxgears performance higher with software renderer compared to h/w drivers
Just in case someone else faces the same issue: I resolved it and was
able to build the gallium softpipe renderer with llvm.

> Could you guide me on how to build llvmpipe driver?

I was able to build the driver with llvm, but I had to modify
configure.ac (in mesa 8.0.5) to avoid the specific dependence on
llvm-2.9. My system had llvm-3.0, and autoconf was detecting llvm as
absent.

Thanks & Regards,
Divick
Re: [Mesa-dev] non x11/xlib based EGL and software only renderer
Hi Chia,

> I haven't tried that for a while, but it should not have X11
> dependencies. You probably need to disable other stuffs such as
> --disable-glx and etc. It might still require X11 at compile time,
> because eglplatform.h may include Xlib.h, but it should not need X11 at
> runtime.

Alright, I will try and update with my findings. But doing ldd it does
show a dependency on X11.so.

> But you need to modify the app to use pbuffer or FBO. Or you may add
> some code to the null platform to treat windows as pbuffers.

What exactly is a null platform? Also, could you please point me to the
directory and potential starting points to support windows as pbuffers?
Modifying apps is a complete no-no.

Thanks & Regards,
Divick
Re: [Mesa-dev] mesa-dev Digest, Vol 38, Issue 79
Hi Patrick,

> Perhaps 16-bit color isn't supported? Maybe try other color bits or
> set R/G/B individually and see what happens. Also, there is an eglinfo
> tool source code in Mesa that can probably tell you a whole lot more.

I see that es and es2 renderables are not supported, due to which
eglChooseConfig returns 0 configs. Is it the default build of mesa with
xlib s/w renderer, or do I need to add anything else to make it support
gles1 and gles2? I did pass --enable-gles1 and --enable-gles2 with
--with-driver=xlib to configure though.

Here is the output from eglinfo:

Configurations:
bf lv colorbuffer dp st ms vis cav bi renderable supported
id sz l r g b a th cl ns b id eat nd gl es es2 vg surfaces
-
0x01 24 0 8 8 8 0 16 8 0 0 0x21TC y win
0x02 24 0 8 8 8 0 16 8 0 0 0x22DC y win
0x03 24 0 8 8 8 0 16 8 0 0 0xa3TC y win
0x04 24 0 8 8 8 0 16 8 0 0 0xa4TC y win
0x05 24 0 8 8 8 0 16 8 0 0 0xa5TC y win
0x06 24 0 8 8 8 0 16 8 0 0 0xa6TC y win
0x07 24 0 8 8 8 0 16 8 0 0 0xa7TC y win
0x08 24 0 8 8 8 0 16 8 0 0 0xa8TC y win
0x09 24 0 8 8 8 0 16 8 0 0 0xa9TC y win
0x0a 24 0 8 8 8 0 16 8 0 0 0xaaTC y win
0x0b 24 0 8 8 8 0 16 8 0 0 0xabTC y win
0x0c 24 0 8 8 8 0 16 8 0 0 0xacTC y win
0x0d 24 0 8 8 8 0 16 8 0 0 0xadTC y win
0x0e 24 0 8 8 8 0 16 8 0 0 0xaeTC y win
0x0f 24 0 8 8 8 0 16 8 0 0 0xafTC y win
0x10 24 0 8 8 8 0 16 8 0 0 0xb0TC y win
0x11 24 0 8 8 8 0 16 8 0 0 0xb1DC y win
0x12 24 0 8 8 8 0 16 8 0 0 0xb2DC y win
0x13 24 0 8 8 8 0 16 8 0 0 0xb3DC y win
0x14 24 0 8 8 8 0 16 8 0 0 0xb4DC y win
0x15 24 0 8 8 8 0 16 8 0 0 0xb5DC y win
0x16 24 0 8 8 8 0 16 8 0 0 0xb6DC y win
0x17 24 0 8 8 8 0 16 8 0 0 0xb7DC y win
0x18 24 0 8 8 8 0 16 8 0 0 0xb8DC y win
0x19 24 0 8 8 8 0 16 8 0 0 0xb9DC y win
0x1a 24 0 8 8 8 0 16 8 0 0 0xbaDC y win
0x1b 24 0 8 8 8 0 16 8 0 0 0xbbDC y win
0x1c 24 0 8 8 8 0 16 8 0 0 0xbcDC y win
0x1d 24 0 8 8 8 0 16 8 0 0 0xbdDC y win
0x1e 24 0 8 8 8 0 16 8 0 0 0xbeDC y win
0x1f 24 0 8 8 8 0 16 8 0 0 0xbfDC y win
0x20 24 0 8 8 8 0 16 8 0 0 0x72TC y win

Thanks & Regards,
Divick
[Mesa-dev] [Bug 64335] New: DispatchList leaked in glxapi.c?
https://bugs.freedesktop.org/show_bug.cgi?id=64335

          Priority: medium
            Bug ID: 64335
          Assignee: mesa-dev@lists.freedesktop.org
           Summary: DispatchList leaked in glxapi.c?
          Severity: normal
    Classification: Unclassified
                OS: All
          Reporter: askin...@mathworks.com
          Hardware: Other
            Status: NEW
           Version: unspecified
         Component: Drivers/X11
           Product: Mesa

get_dispatch() in glxapi.c grows the DispatchList table. It is small, but as
far as I can see, items don't get removed as a Display is closed.

Could/should this get cleaned up, maybe from close_display_callback() in
fakeglx.c?

It is a small leak (we see 24 bytes per Display in 7.2), but I think it would
be better if closed, as this list will just grow. I don't know if the table
pointed to by the structures in this list are cleared up, but I haven't seen
them as leaks.

The relevant code looks similar in repository.
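For illustration only, the cleanup the report asks about could look roughly like the sketch below. It is not the glxapi.c code: the struct, the list head, and all function names are hypothetical stand-ins, and the Display is treated as an opaque pointer key. It only shows the idea of unlinking and freeing the per-Display entry when that Display is closed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for one per-Display entry in the list. */
struct toy_dispatch {
    const void *dpy;              /* key: the Display this entry serves */
    struct toy_dispatch *next;
};

static struct toy_dispatch *toy_dispatch_list = NULL;

/* Look up the entry for dpy, creating it on first use.
 * This is the step that makes the list grow. */
struct toy_dispatch *toy_dispatch_get(const void *dpy)
{
    struct toy_dispatch *d;

    assert(dpy != NULL);
    for (d = toy_dispatch_list; d != NULL; d = d->next)
        if (d->dpy == dpy)
            return d;

    d = calloc(1, sizeof(*d));
    if (d == NULL)
        return NULL;
    d->dpy = dpy;
    d->next = toy_dispatch_list;
    toy_dispatch_list = d;
    return d;
}

/* Unlink and free the entry for dpy, e.g. from a close-display hook.
 * Returns 1 if an entry was removed, 0 if none was found. */
int toy_dispatch_remove(const void *dpy)
{
    struct toy_dispatch **link = &toy_dispatch_list;

    while (*link != NULL) {
        struct toy_dispatch *d = *link;
        if (d->dpy == dpy) {
            *link = d->next;
            free(d);
            return 1;
        }
        link = &d->next;
    }
    return 0;
}

/* Number of entries currently in the list. */
size_t toy_dispatch_count(void)
{
    size_t n = 0;
    const struct toy_dispatch *d;

    for (d = toy_dispatch_list; d != NULL; d = d->next)
        n++;
    return n;
}
```

With something of this shape, a close-display callback would call the remove function once per closed Display, so the list would stop growing across open/close cycles. Whether any tables the real entries point to also need freeing is left open here, as it is in the report.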
Re: [Mesa-dev] [PATCH] nouveau: emit and flush fence in fence_signalled if needed
On 07.05.2013 19:25, Bryan Cain wrote:
> The Mesa state tracker expects us to emit the fence even if it doesn't call
> fence_finish. Notably, this occurs when glClientWaitSync is called with
> timeout 0.
>
> Fixes Portal and Left 4 Dead 2, which were both stalling on startup by
> repeatedly calling glClientWaitSync with timeout 0 while waiting for
> commands to complete.
> ---

I'm not sure I want to do this. pipe_screen::fence_signalled probably
shouldn't flush the command buffer, r600g doesn't seem to do it either.

They should probably call glFlush() before looping on glClientWaitSync,
or, if they don't have anything better to do in the meantime, simply
specify an infinite timeout if they're going to loop forever anyway.

>  src/gallium/drivers/nouveau/nouveau_fence.c | 36 ++-
>  src/gallium/drivers/nouveau/nouveau_fence.h |  1 +
>  2 files changed, 25 insertions(+), 12 deletions(-)
>
> diff --git a/src/gallium/drivers/nouveau/nouveau_fence.c b/src/gallium/drivers/nouveau/nouveau_fence.c
> index dea146c..722be01 100644
> --- a/src/gallium/drivers/nouveau/nouveau_fence.c
> +++ b/src/gallium/drivers/nouveau/nouveau_fence.c
> @@ -167,6 +167,25 @@ nouveau_fence_update(struct nouveau_screen *screen, boolean flushed)
>     }
>  }
>
> +boolean
> +nouveau_fence_ensure_flushed(struct nouveau_fence *fence)
> +{
> +   struct nouveau_screen *screen = fence->screen;
> +
> +   if (fence->state < NOUVEAU_FENCE_STATE_EMITTED) {
> +      nouveau_fence_emit(fence);
> +
> +      if (fence == screen->fence.current)
> +         nouveau_fence_new(screen, &screen->fence.current, FALSE);
> +   }
> +   if (fence->state < NOUVEAU_FENCE_STATE_FLUSHED) {
> +      if (nouveau_pushbuf_kick(screen->pushbuf, screen->pushbuf->channel))
> +         return FALSE;
> +   }
> +
> +   return TRUE;
> +}
> +
>  #define NOUVEAU_FENCE_MAX_SPINS (1 << 31)
>
>  boolean
> @@ -174,8 +193,9 @@ nouveau_fence_signalled(struct nouveau_fence *fence)
>  {
>     struct nouveau_screen *screen = fence->screen;
>
> -   if (fence->state >= NOUVEAU_FENCE_STATE_EMITTED)
> -      nouveau_fence_update(screen, FALSE);
> +   if (!nouveau_fence_ensure_flushed(fence))
> +      return FALSE;
> +   nouveau_fence_update(screen, FALSE);
>
>     return fence->state == NOUVEAU_FENCE_STATE_SIGNALLED;
>  }
> @@ -189,16 +209,8 @@ nouveau_fence_wait(struct nouveau_fence *fence)
>     /* wtf, someone is waiting on a fence in flush_notify handler? */
>     assert(fence->state != NOUVEAU_FENCE_STATE_EMITTING);
>
> -   if (fence->state < NOUVEAU_FENCE_STATE_EMITTED) {
> -      nouveau_fence_emit(fence);
> -
> -      if (fence == screen->fence.current)
> -         nouveau_fence_new(screen, &screen->fence.current, FALSE);
> -   }
> -   if (fence->state < NOUVEAU_FENCE_STATE_FLUSHED) {
> -      if (nouveau_pushbuf_kick(screen->pushbuf, screen->pushbuf->channel))
> -         return FALSE;
> -   }
> +   if (!nouveau_fence_ensure_flushed(fence))
> +      return FALSE;
>
>     do {
>        nouveau_fence_update(screen, FALSE);
> diff --git a/src/gallium/drivers/nouveau/nouveau_fence.h b/src/gallium/drivers/nouveau/nouveau_fence.h
> index 3984a9a..d497c7f 100644
> --- a/src/gallium/drivers/nouveau/nouveau_fence.h
> +++ b/src/gallium/drivers/nouveau/nouveau_fence.h
> @@ -34,6 +34,7 @@ boolean nouveau_fence_new(struct nouveau_screen *, struct nouveau_fence **,
>  boolean nouveau_fence_work(struct nouveau_fence *, void (*)(void *), void *);
>  void    nouveau_fence_update(struct nouveau_screen *, boolean flushed);
>  void    nouveau_fence_next(struct nouveau_screen *);
> +boolean nouveau_fence_ensure_flushed(struct nouveau_fence *);
>  boolean nouveau_fence_wait(struct nouveau_fence *);
>  boolean nouveau_fence_signalled(struct nouveau_fence *);
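As a rough illustration of the behaviour the quoted patch proposes (not of the reviewer's preferred fix, and not the nouveau code itself), the toy model below makes a non-blocking "is it signalled?" query emit and submit the fence first. All names, the reduced state list, and the hw_done flag are assumptions made for the sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy fence states, ordered so that "<" means "not reached yet". */
enum toy_fence_state {
    TOY_FENCE_AVAILABLE,
    TOY_FENCE_EMITTED,
    TOY_FENCE_FLUSHED,
    TOY_FENCE_SIGNALLED
};

struct toy_fence {
    enum toy_fence_state state;
    bool hw_done;   /* stands in for the GPU having passed the fence */
    int kicks;      /* how many times the command buffer was submitted */
};

/* Emit and submit the fence if that has not happened yet. */
static void toy_fence_ensure_flushed(struct toy_fence *f)
{
    if (f->state < TOY_FENCE_EMITTED)
        f->state = TOY_FENCE_EMITTED;   /* fence written to the buffer */
    if (f->state < TOY_FENCE_FLUSHED) {
        f->kicks++;                     /* buffer handed to the hardware */
        f->state = TOY_FENCE_FLUSHED;
    }
}

/* Non-blocking query, as a zero-timeout wait would perform. */
bool toy_fence_signalled(struct toy_fence *f)
{
    toy_fence_ensure_flushed(f);
    if (f->hw_done)
        f->state = TOY_FENCE_SIGNALLED;
    return f->state == TOY_FENCE_SIGNALLED;
}

/* Polls twice: the first poll submits the work, the second does not
 * resubmit. Returns the number of submissions. */
int toy_fence_demo(void)
{
    struct toy_fence f = { TOY_FENCE_AVAILABLE, false, 0 };

    assert(!toy_fence_signalled(&f));
    assert(f.kicks == 1);
    f.hw_done = true;                   /* pretend the GPU finished */
    assert(toy_fence_signalled(&f));
    assert(f.kicks == 1);
    return f.kicks;
}
```

Without the ensure-flushed step, a caller that only ever polls would never cause the fence to be submitted, which matches the stall described in the commit message. The reply above argues the application should flush instead, so this is one of two possible places to put the fix.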
Re: [Mesa-dev] [PATCH] i965: Fix hangs on HSW since the gen6 blorp fix.
On 05/06/2013 09:02 PM, Eric Anholt wrote:
> The constant packets for gen6 are too small for gen7, and while IVB
> seems happy with them HSW blows up. Just restore the HSW path to what
> it was before the gen6 change, by making gen7-specific functions to
> set up these stages.
> ---
>
> The alternative here would be to emit the correct lengths of packets
> in these new functions. But we're not emitting constants for other
> disabled stages on gen7+, so I'm leaning toward this variant.

I'm in favor of emitting zero-filled constant packets on gen7 just as
1dfea559c3f1 does. Neglecting to do so on gen6 caused no problems for us
for several years; then, boom!, the neglect began causing hangs. Let's
proactively apply the same fix here to gen7 so a similar bug doesn't
haunt in the future.

>  src/mesa/drivers/dri/i965/gen7_blorp.cpp | 57 ++--
>  1 file changed, 55 insertions(+), 2 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/gen7_blorp.cpp b/src/mesa/drivers/dri/i965/gen7_blorp.cpp
> index 1c23866..330e3d5 100644
> --- a/src/mesa/drivers/dri/i965/gen7_blorp.cpp
> +++ b/src/mesa/drivers/dri/i965/gen7_blorp.cpp
> @@ -276,6 +276,39 @@ gen7_blorp_emit_sampler_state(struct brw_context *brw,
>  }
>
>
> +/* 3DSTATE_VS
> + *
> + * Disable vertex shader.
> + */
> +static void
> +gen7_blorp_emit_vs_disable(struct brw_context *brw,
> +                           const brw_blorp_params *params)
> +{
> +   struct intel_context *intel = &brw->intel;
> +
> +   if (intel->gen == 6) {

This if-branch is dead code. The function is prefixed gen7.

> +      /* From the BSpec, Volume 2a, Part 3 Vertex Shader, Section
> +       * 3DSTATE_VS, Dword 5.0 VS Function Enable:
> +       *
> +       *   [DevSNB] A pipeline flush must be programmed prior to a
> +       *   3DSTATE_VS command that causes the VS Function Enable to
> +       *   toggle. Pipeline flush can be executed by sending a PIPE_CONTROL
> +       *   command with CS stall bit set and a post sync operation.
> +       */
> +      intel_emit_post_sync_nonzero_flush(intel);
> +   }
> +
> +   BEGIN_BATCH(6);
> +   OUT_BATCH(_3DSTATE_VS << 16 | (6 - 2));
> +   OUT_BATCH(0);
> +   OUT_BATCH(0);
> +   OUT_BATCH(0);
> +   OUT_BATCH(0);
> +   OUT_BATCH(0);
> +   ADVANCE_BATCH();
> +}
> +
> +
>  /* 3DSTATE_HS
>   *
>   * Disable the hull shader.
> @@ -337,6 +370,26 @@ gen7_blorp_emit_ds_disable(struct brw_context *brw,
>     ADVANCE_BATCH();
>  }
>
> +/* 3DSTATE_GS
> + *
> + * Disable the geometry shader.
> + */
> +static void
> +gen7_blorp_emit_gs_disable(struct brw_context *brw,
> +                           const brw_blorp_params *params)
> +{
> +   struct intel_context *intel = &brw->intel;
> +
> +   BEGIN_BATCH(7);
> +   OUT_BATCH(_3DSTATE_GS << 16 | (7 - 2));
> +   OUT_BATCH(0);
> +   OUT_BATCH(0);
> +   OUT_BATCH(0);
> +   OUT_BATCH(0);
> +   OUT_BATCH(0);
> +   OUT_BATCH(0);
> +   ADVANCE_BATCH();
> +}
>
>  /* 3DSTATE_STREAMOUT
>   *
> @@ -784,11 +837,11 @@ gen7_blorp_exec(struct intel_context *intel,
>                                          wm_surf_offset_texture);
>        sampler_offset = gen7_blorp_emit_sampler_state(brw, params);
>     }
> -   gen6_blorp_emit_vs_disable(brw, params);
> +   gen7_blorp_emit_vs_disable(brw, params);
>     gen7_blorp_emit_hs_disable(brw, params);
>     gen7_blorp_emit_te_disable(brw, params);
>     gen7_blorp_emit_ds_disable(brw, params);
> -   gen6_blorp_emit_gs_disable(brw, params);
> +   gen7_blorp_emit_gs_disable(brw, params);
>     gen7_blorp_emit_streamout_disable(brw, params);
>     gen6_blorp_emit_clip_disable(brw, params);
>     gen7_blorp_emit_sf_config(brw, params);
[Mesa-dev] [Bug 64335] DispatchList leaked in glxapi.c?
https://bugs.freedesktop.org/show_bug.cgi?id=64335

Andy Skinner <askin...@mathworks.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|medium                      |low
[Mesa-dev] [PATCH v2] amd_performance_monitor: Fix multi-statement macro 'report'.
Fixes "Nesting level does not match indentation" defect reported by
Coverity.

Signed-off-by: Vinson Lee <v...@freedesktop.org>
---
 tests/spec/amd_performance_monitor/api.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/tests/spec/amd_performance_monitor/api.c b/tests/spec/amd_performance_monitor/api.c
index 7b321cf..3205fc0 100644
--- a/tests/spec/amd_performance_monitor/api.c
+++ b/tests/spec/amd_performance_monitor/api.c
@@ -113,8 +113,10 @@ find_invalid_counter(unsigned *counters, int num_counters)
 }

 #define report(pass) \
-    piglit_report_subtest_result((pass) ? PIGLIT_PASS : PIGLIT_FAIL, __FUNCTION__); \
-    return
+	do { \
+		piglit_report_subtest_result((pass) ? PIGLIT_PASS : PIGLIT_FAIL, __FUNCTION__); \
+		return; \
+	} while (0)

 /**/
--
1.8.2.1
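A minimal sketch of why the do { ... } while (0) wrapper matters for a multi-statement macro. The reporter function and the REPORT macro here are made-up stand-ins, not the piglit API. With the wrapper the macro expands to a single statement, so it can sit under an unbraced if/else; the unwrapped two-statement form would not compile in this position, and under a lone unbraced if its trailing return would run unconditionally.

```c
#include <assert.h>

static int toy_last_reported = -1;  /* what the stand-in reporter saw last */

/* Hypothetical stand-in for a result-reporting call. */
static void toy_report_result(int pass)
{
    toy_last_reported = pass;
}

/* Multi-statement macro wrapped so it expands to exactly one statement. */
#define REPORT(pass)                \
    do {                            \
        toy_report_result(pass);    \
        return;                     \
    } while (0)

/* The unbraced if/else is valid only because REPORT is one statement. */
void toy_check(int value)
{
    if (value > 0)
        REPORT(1);
    else
        REPORT(0);
}

/* Exercises both branches; returns 0 when both behaved as expected. */
int toy_macro_demo(void)
{
    toy_check(5);
    assert(toy_last_reported == 1);
    toy_check(-3);
    assert(toy_last_reported == 0);
    return 0;
}
```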
Re: [Mesa-dev] [PATCH 1/2] glx: Ensure that glXWaitX actually waits for X.
On Tuesday 07 May 2013, Eric Anholt wrote:
> Ever since fake front was introduced in
> 63b51b5cf17ddde09b72a2811296f37b9a4c5ad2, we were skipping the XSync()
> in the non-fake-front path, so compositors like Firefox's GL canvas
> were having to manually put it in outside of glXWaitX().
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=52930
> ---
>  src/glx/dri2_glx.c | 21 +++++++++++++++++----
>  1 file changed, 17 insertions(+), 4 deletions(-)
>
> diff --git a/src/glx/dri2_glx.c b/src/glx/dri2_glx.c
> index 7ce5775..3cdd249 100644
> --- a/src/glx/dri2_glx.c
> +++ b/src/glx/dri2_glx.c
> @@ -639,10 +639,23 @@ dri2_wait_x(struct glx_context *gc)
>     struct dri2_drawable *priv = (struct dri2_drawable *)
>        GetGLXDRIDrawable(gc->currentDpy, gc->currentDrawable);
>
> -   if (priv == NULL || !priv->have_fake_front)
> -      return;
> -
> -   dri2_copy_drawable(priv, DRI2BufferFakeFrontLeft, DRI2BufferFrontLeft);
> +   if (priv != NULL && priv->have_fake_front) {
> +      /* Ask the server to update our copy of the front buffer from the real
> +       * front buffer. This will round-trip with the server, so we can skip
> +       * the XSync().
> +       */
> +      dri2_copy_drawable(priv, DRI2BufferFakeFrontLeft, DRI2BufferFrontLeft);
> +   } else {
> +      /* From the GLX 1.4 spec, page 33:
> +       *
> +       *     X rendering calls made prior to glXWaitX are guaranteed to be
> +       *     executed before OpenGL rendering calls made after
> +       *     glXWaitX. While the same result can be achieved using XSync,
> +       *     glXWaitX does not require a round trip to the server, and may
> +       *     therefore be more efficient.
> +       */
> +      XSync(gc->currentDpy, False);

I think this could be improved a bit by calling xcb_get_input_focus()
followed by xcb_flush() here, and forcing the reply the next time the
command buffer is about to be flushed.

Fredrik
[Mesa-dev] i965: Haswell has been broken for over 5 days
Haswell has been broken on master for a surprisingly long time, since
commit 1dfea559c3 (Thu May 2 11:27:37 2013 -0700). Reverting that commit
fixed it for me.

If it doesn't get properly fixed by the 7th day, I'd like to see the guilty
patch reverted. A full week is too long for a platform under active
development to be down.
[Mesa-dev] [PATCH] intel: Fix render-to-texture in non-FinishRenderTexture cases.
With EGL_KHR_gl_renderbuffer_iamge, we have the ability to render to
renderbuffers that are also textures, so the core Mesa FinishRenderTexture
hook doesn't get called. That hook also wasn't called in various cases
within the driver where we'd update texture contents using the render cache
(like glCopyTexSubImage) that resulted in intel_batchbuffer_emit_mi_flush().
To fix it, track a set of rendered-to BOs in our context, which is cleared
at batch wrap or emit_mi_flush time, and do an emit_mi_flush if one of our
textures is in that set.

This change doesn't turn the other emit_mi_flushes (such as intel_blit.c
operations) into render_cache_set operations yet, as that would increase
the size of our set and we expect that those operations get immediately
flushed anyway.

No statistically significant performance difference in cairo-gl (n=53/54,
slow turbo outliers removed), despite spending ~1% CPU in these set
operations.

Fixes piglit EGL_KHR_gl_renderbuffer_image/renderbuffer-texture.
---
 src/mesa/drivers/dri/i915/i830_texstate.c      |  3 ++
 src/mesa/drivers/dri/i915/i915_texstate.c      |  3 ++
 src/mesa/drivers/dri/i915/intel_tris.c         | 22
 src/mesa/drivers/dri/i965/brw_draw.c           | 23 +---
 src/mesa/drivers/dri/i965/brw_misc_state.c     |  5 +++
 src/mesa/drivers/dri/intel/intel_batchbuffer.c |  5 +++
 src/mesa/drivers/dri/intel/intel_context.h     |  7
 src/mesa/drivers/dri/intel/intel_fbo.c         | 49 ++
 src/mesa/drivers/dri/intel/intel_fbo.h         |  6
 9 files changed, 112 insertions(+), 11 deletions(-)

diff --git a/src/mesa/drivers/dri/i915/i830_texstate.c b/src/mesa/drivers/dri/i915/i830_texstate.c
index f186fac..6b1dbf0 100644
--- a/src/mesa/drivers/dri/i915/i830_texstate.c
+++ b/src/mesa/drivers/dri/i915/i830_texstate.c
@@ -33,6 +33,7 @@

 #include "intel_mipmap_tree.h"
 #include "intel_tex.h"
+#include "intel_fbo.h"

 #include "i830_context.h"
 #include "i830_reg.h"
@@ -128,6 +129,8 @@ i830_update_tex_unit(struct intel_context *intel, GLuint unit, GLuint ss3)
    GLubyte border[4];
    GLuint dst_x, dst_y;

+   intel_render_cache_set_check_flush(intel, intelObj->mt->region->bo);
+
    memset(state, 0, sizeof(*state));

    /*We need to refcount these. */
diff --git a/src/mesa/drivers/dri/i915/i915_texstate.c b/src/mesa/drivers/dri/i915/i915_texstate.c
index 43c802b..148da15 100644
--- a/src/mesa/drivers/dri/i915/i915_texstate.c
+++ b/src/mesa/drivers/dri/i915/i915_texstate.c
@@ -33,6 +33,7 @@

 #include "intel_mipmap_tree.h"
 #include "intel_tex.h"
+#include "intel_fbo.h"

 #include "i915_context.h"
 #include "i915_reg.h"
@@ -151,6 +152,8 @@ i915_update_tex_unit(struct intel_context *intel, GLuint unit, GLuint ss3)
    GLubyte border[4];
    GLfloat maxlod;

+   intel_render_cache_set_check_flush(intel, intelObj->mt->region->bo);
+
    memset(state, 0, sizeof(*state));

    /*We need to refcount these. */
diff --git a/src/mesa/drivers/dri/i915/intel_tris.c b/src/mesa/drivers/dri/i915/intel_tris.c
index 7c60d84..1f27243 100644
--- a/src/mesa/drivers/dri/i915/intel_tris.c
+++ b/src/mesa/drivers/dri/i915/intel_tris.c
@@ -52,6 +52,8 @@
 #include "intel_batchbuffer.h"
 #include "intel_buffers.h"
 #include "intel_reg.h"
+#include "intel_fbo.h"
+#include "intel_mipmap_tree.h"
 #include "i830_context.h"
 #include "i830_reg.h"
 #include "i915_context.h"
@@ -61,6 +63,22 @@ static void intelRasterPrimitive(struct gl_context * ctx, GLenum rprim,
                                  GLuint hwprim);

 static void
+mark_render_cache(struct intel_context *intel)
+{
+   struct gl_context *ctx = &intel->ctx;
+   struct gl_framebuffer *fb = ctx->DrawBuffer;
+   struct intel_renderbuffer *depth_irb =
+      intel_get_renderbuffer(fb, BUFFER_DEPTH);
+   struct intel_renderbuffer *color_irb =
+      intel_renderbuffer(fb->_ColorDrawBuffers[0]);
+
+   if (color_irb)
+      intel_render_cache_set_add_bo(intel, color_irb->mt->region->bo);
+   if (depth_irb)
+      intel_render_cache_set_add_bo(intel, depth_irb->mt->region->bo);
+}
+
+static void
 intel_flush_inline_primitive(struct intel_context *intel)
 {
    GLuint used = intel->batch.used - intel->prim.start_ptr;
@@ -75,6 +93,8 @@ intel_flush_inline_primitive(struct intel_context *intel)
    intel->batch.map[intel->prim.start_ptr] =
       _3DPRIMITIVE | intel->prim.primitive | (used - 2);

+   mark_render_cache(intel);
+
    goto finished;

  do_discard:
@@ -310,6 +330,8 @@ void intel_flush_prim(struct intel_context *intel)
       ADVANCE_BATCH();
    }

+   mark_render_cache(intel);
+
    if (intel->always_flush_cache) {
       intel_batchbuffer_emit_mi_flush(intel);
    }
diff --git a/src/mesa/drivers/dri/i965/brw_draw.c b/src/mesa/drivers/dri/i965/brw_draw.c
index 8c37e0b..670d648 100644
--- a/src/mesa/drivers/dri/i965/brw_draw.c
+++ b/src/mesa/drivers/dri/i965/brw_draw.c
@@ -292,8 +292,8 @@ static void brw_merge_inputs( struct brw_context *brw,
 /*
  * \brief Resolve buffers before drawing.
  *
- * Resolve the depth buffer's
Re: [Mesa-dev] i965: Haswell has been broken for over 5 days
Chad Versace <chad.vers...@linux.intel.com> writes:

> Haswell has been broken on master for a surprisingly long time, since
> commit 1dfea559c3 (Thu May 2 11:27:37 2013 -0700). Reverting that commit
> fixed it for me.
>
> If it doesn't get properly fixed by the 7th day, I'd like to see the
> guilty patch reverted. A full week is too long for a platform under
> active development to be down.

Well, it was broken for 2 working days before I caught the bug myself
and posted a patch, which got (negative) review feedback at midnight.
QA caught the bug shortly after I posted the patch. I think you have
unreasonable expectations of turnaround time here for something that
wasn't even bisected and reported.
[Mesa-dev] [PATCH] i965: Fix hangs on HSW since the gen6 blorp fix.
The constant packets for gen6 are too small for gen7, and while IVB
seems happy with them HSW blows up. Fix it by emitting the correct
packets on gen7, for all stages.

v2: Include the packets instead of just skipping them.

NOTE: This is a candidate for the stable branches.
---
 src/mesa/drivers/dri/i965/gen7_blorp.cpp | 103 ++-
 1 file changed, 101 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen7_blorp.cpp b/src/mesa/drivers/dri/i965/gen7_blorp.cpp
index 1c23866..f55805c 100644
--- a/src/mesa/drivers/dri/i965/gen7_blorp.cpp
+++ b/src/mesa/drivers/dri/i965/gen7_blorp.cpp
@@ -276,6 +276,37 @@ gen7_blorp_emit_sampler_state(struct brw_context *brw,
 }


+/* 3DSTATE_VS
+ *
+ * Disable vertex shader.
+ */
+static void
+gen7_blorp_emit_vs_disable(struct brw_context *brw,
+                           const brw_blorp_params *params)
+{
+   struct intel_context *intel = &brw->intel;
+
+   BEGIN_BATCH(7);
+   OUT_BATCH(_3DSTATE_CONSTANT_VS << 16 | (7 - 2));
+   OUT_BATCH(0);
+   OUT_BATCH(0);
+   OUT_BATCH(0);
+   OUT_BATCH(0);
+   OUT_BATCH(0);
+   OUT_BATCH(0);
+   ADVANCE_BATCH();
+
+   BEGIN_BATCH(6);
+   OUT_BATCH(_3DSTATE_VS << 16 | (6 - 2));
+   OUT_BATCH(0);
+   OUT_BATCH(0);
+   OUT_BATCH(0);
+   OUT_BATCH(0);
+   OUT_BATCH(0);
+   ADVANCE_BATCH();
+}
+
+
 /* 3DSTATE_HS
  *
  * Disable the hull shader.
@@ -287,6 +318,16 @@ gen7_blorp_emit_hs_disable(struct brw_context *brw,
    struct intel_context *intel = &brw->intel;

    BEGIN_BATCH(7);
+   OUT_BATCH(_3DSTATE_CONSTANT_HS << 16 | (7 - 2));
+   OUT_BATCH(0);
+   OUT_BATCH(0);
+   OUT_BATCH(0);
+   OUT_BATCH(0);
+   OUT_BATCH(0);
+   OUT_BATCH(0);
+   ADVANCE_BATCH();
+
+   BEGIN_BATCH(7);
    OUT_BATCH(_3DSTATE_HS << 16 | (7 - 2));
    OUT_BATCH(0);
    OUT_BATCH(0);
@@ -327,6 +368,16 @@ gen7_blorp_emit_ds_disable(struct brw_context *brw,
 {
    struct intel_context *intel = &brw->intel;

+   BEGIN_BATCH(7);
+   OUT_BATCH(_3DSTATE_CONSTANT_DS << 16 | (7 - 2));
+   OUT_BATCH(0);
+   OUT_BATCH(0);
+   OUT_BATCH(0);
+   OUT_BATCH(0);
+   OUT_BATCH(0);
+   OUT_BATCH(0);
+   ADVANCE_BATCH();
+
    BEGIN_BATCH(6);
    OUT_BATCH(_3DSTATE_DS << 16 | (6 - 2));
    OUT_BATCH(0);
@@ -337,6 +388,36 @@ gen7_blorp_emit_ds_disable(struct brw_context *brw,
    ADVANCE_BATCH();
 }

+/* 3DSTATE_GS
+ *
+ * Disable the geometry shader.
+ */
+static void
+gen7_blorp_emit_gs_disable(struct brw_context *brw,
+                           const brw_blorp_params *params)
+{
+   struct intel_context *intel = &brw->intel;
+
+   BEGIN_BATCH(7);
+   OUT_BATCH(_3DSTATE_CONSTANT_GS << 16 | (7 - 2));
+   OUT_BATCH(0);
+   OUT_BATCH(0);
+   OUT_BATCH(0);
+   OUT_BATCH(0);
+   OUT_BATCH(0);
+   OUT_BATCH(0);
+   ADVANCE_BATCH();
+
+   BEGIN_BATCH(7);
+   OUT_BATCH(_3DSTATE_GS << 16 | (7 - 2));
+   OUT_BATCH(0);
+   OUT_BATCH(0);
+   OUT_BATCH(0);
+   OUT_BATCH(0);
+   OUT_BATCH(0);
+   OUT_BATCH(0);
+   ADVANCE_BATCH();
+}

 /* 3DSTATE_STREAMOUT
  *
@@ -573,6 +654,22 @@ gen7_blorp_emit_constant_ps(struct brw_context *brw,
    ADVANCE_BATCH();
 }

+static void
+gen7_blorp_emit_constant_ps_disable(struct brw_context *brw,
+                                    const brw_blorp_params *params)
+{
+   struct intel_context *intel = &brw->intel;
+
+   BEGIN_BATCH(7);
+   OUT_BATCH(_3DSTATE_CONSTANT_PS << 16 | (7 - 2));
+   OUT_BATCH(0);
+   OUT_BATCH(0);
+   OUT_BATCH(0);
+   OUT_BATCH(0);
+   OUT_BATCH(0);
+   OUT_BATCH(0);
+   ADVANCE_BATCH();
+}

 static void
 gen7_blorp_emit_depth_stencil_config(struct brw_context *brw,
@@ -784,11 +881,11 @@ gen7_blorp_exec(struct intel_context *intel,
                                         wm_surf_offset_texture);
       sampler_offset = gen7_blorp_emit_sampler_state(brw, params);
    }
-   gen6_blorp_emit_vs_disable(brw, params);
+   gen7_blorp_emit_vs_disable(brw, params);
    gen7_blorp_emit_hs_disable(brw, params);
    gen7_blorp_emit_te_disable(brw, params);
    gen7_blorp_emit_ds_disable(brw, params);
-   gen6_blorp_emit_gs_disable(brw, params);
+   gen7_blorp_emit_gs_disable(brw, params);
    gen7_blorp_emit_streamout_disable(brw, params);
    gen6_blorp_emit_clip_disable(brw, params);
    gen7_blorp_emit_sf_config(brw, params);
@@ -798,6 +895,8 @@ gen7_blorp_exec(struct intel_context *intel,
                                                      wm_bind_bo_offset);
       gen7_blorp_emit_sampler_state_pointers_ps(brw, params, sampler_offset);
       gen7_blorp_emit_constant_ps(brw, params, wm_push_const_offset);
+   } else {
+      gen7_blorp_emit_constant_ps_disable(brw, params);
    }
    gen7_blorp_emit_ps_config(brw, params, prog_offset, prog_data);
    gen7_blorp_emit_cc_viewport(brw, params);
--
1.8.3.rc0
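The patch writes each header as (opcode << 16 | (N - 2)) right after BEGIN_BATCH(N), i.e. the header carries the packet's total dword count minus two. The toy below sketches that encoding only. The opcode value is made up, the helper names are hypothetical, and reading the length back from the low 8 bits is an assumption of the sketch rather than something stated in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_OPCODE 0x1234u   /* made-up opcode, not a real 3DSTATE value */

/* Write a zero-filled packet of total_dwords dwords into out.
 * Header: opcode in the upper 16 bits, (total_dwords - 2) below it.
 * Returns the number of dwords written. */
size_t toy_emit_zero_packet(uint32_t *out, uint32_t opcode,
                            uint32_t total_dwords)
{
    uint32_t i;

    assert(total_dwords >= 2);
    out[0] = opcode << 16 | (total_dwords - 2);
    for (i = 1; i < total_dwords; i++)
        out[i] = 0;
    return total_dwords;
}

/* Total packet size a reader would derive from the header
 * (assumes the length lives in the low 8 bits). */
uint32_t toy_packet_dwords(uint32_t header)
{
    return (header & 0xffu) + 2;
}
```

Since the reader takes the size from the header, emitting a packet with a size meant for a different hardware generation leaves the following dwords misaligned with what the reader expects, which is consistent with the hang described in the commit message.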
Re: [Mesa-dev] i965: Haswell has been broken for over 5 days
On 05/07/2013 04:08 PM, Eric Anholt wrote:
> Chad Versace <chad.vers...@linux.intel.com> writes:
>
>> Haswell has been broken on master for a surprisingly long time, since
>> commit 1dfea559c3 (Thu May 2 11:27:37 2013 -0700). Reverting that commit
>> fixed it for me.
>>
>> If it doesn't get properly fixed by the 7th day, I'd like to see the
>> guilty patch reverted. A full week is too long for a platform under
>> active development to be down.
>
> Well, it was broken for 2 working days before I caught the bug myself
> and posted a patch, which got (negative) review feedback at midnight.
> QA caught the bug shortly after I posted the patch. I think you have
> unreasonable expectations of turnaround time here for something that
> wasn't even bisected and reported.

I admit I may have an unreasonable expectation here sourced in
frustration. The frustration, though, isn't a good excuse to be
unreasonable.
Re: [Mesa-dev] [PATCH 07/17] mesa: add use a new driver flag for UBO updates instead of _NEW_BUFFER_OBJECT
On Thu, May 2, 2013 at 8:12 PM, Eric Anholt <e...@anholt.net> wrote:
> Marek Olšák <mar...@gmail.com> writes:
>
>> diff --git a/src/mesa/drivers/dri/intel/intel_buffer_objects.c b/src/mesa/drivers/dri/intel/intel_buffer_objects.c
>> index 996518b..f941c56 100644
>> --- a/src/mesa/drivers/dri/intel/intel_buffer_objects.c
>> +++ b/src/mesa/drivers/dri/intel/intel_buffer_objects.c
>> @@ -39,6 +39,10 @@
>>  #include "intel_mipmap_tree.h"
>>  #include "intel_regions.h"
>>
>> +#ifndef I915
>> +#include "brw_context.h"
>> +#endif
>> +
>>  static GLboolean
>>  intel_bufferobj_unmap(struct gl_context * ctx, struct gl_buffer_object *obj);
>>
>> @@ -160,6 +164,14 @@ intel_bufferobj_data(struct gl_context * ctx,
>>           drm_intel_bo_subdata(intel_obj->buffer, 0, size, data);
>>     }
>>
>> +#ifndef I915
>> +   /* BufferData may change a uniform buffer, need to update it */
>> +   {
>> +      struct brw_context *brw = brw_context(ctx);
>> +      brw->state.dirty.brw |= BRW_NEW_UNIFORM_BUFFER;
>> +   }
>> +#endif
>> +
>>     return true;
>>  }
>
> There are also cases where the BO get replaced in subdata, and in
> map_range with INVALIDATE_BUFFER. If those get fixed by moving this
> block into intel_bufferobj_alloc_buffer(), then this (and patch 2-6)
> are:
>
> Reviewed-by: Eric Anholt <e...@anholt.net>
>
>> diff --git a/src/mesa/state_tracker/st_cb_bufferobjects.c b/src/mesa/state_tracker/st_cb_bufferobjects.c
>> index 8ff32c8..d166fe6 100644
>> --- a/src/mesa/state_tracker/st_cb_bufferobjects.c
>> +++ b/src/mesa/state_tracker/st_cb_bufferobjects.c
>> @@ -247,9 +247,11 @@ st_bufferobj_data(struct gl_context *ctx,
>>        if (data)
>>           pipe_buffer_write(pipe, st_obj->buffer, 0, size, data);
>> -      return GL_TRUE;
>>     }
>>
>> +   /* BufferData may change a uniform buffer, need to update it */
>> +   st->dirty.st |= ST_NEW_UNIFORM_BUFFER;
>> +
>>     return GL_TRUE;
>>  }
>
> Do you need to also flag in the case where you've
> PIPE_TRANSFER_DISCARD_BUFFER mapped the BO? I don't know how exactly
> this state gets used, just a possible issue.

No, gallium drivers are responsible for flagging their own internal
states if they decide to discard the buffer.

Marek
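A small sketch of the review suggestion above, under made-up names: if every path that replaces a buffer's storage goes through one allocation helper, setting the dirty bit inside that helper covers all of them at once. The context struct, the flag name, and the three entry points are hypothetical and only stand in for the data, subdata-with-replacement, and invalidating-map cases mentioned in the review.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_NEW_UNIFORM_BUFFER (1u << 0)   /* hypothetical dirty bit */

struct toy_context { uint32_t dirty; };
struct toy_buffer  { int storage_id; };    /* bumps when storage is replaced */

/* The single place where storage is (re)allocated: flag it here once. */
static void toy_alloc_buffer(struct toy_context *ctx, struct toy_buffer *buf)
{
    buf->storage_id++;
    ctx->dirty |= TOY_NEW_UNIFORM_BUFFER;
}

/* Three entry points that all end up replacing the storage. */
void toy_buffer_data(struct toy_context *ctx, struct toy_buffer *buf)
{
    toy_alloc_buffer(ctx, buf);
}

void toy_buffer_subdata_replace(struct toy_context *ctx, struct toy_buffer *buf)
{
    toy_alloc_buffer(ctx, buf);
}

void toy_buffer_map_invalidate(struct toy_context *ctx, struct toy_buffer *buf)
{
    toy_alloc_buffer(ctx, buf);
}

/* State upload: reports whether uniform buffers must be re-bound,
 * then clears the flag. */
int toy_upload_state(struct toy_context *ctx)
{
    int rebind = (ctx->dirty & TOY_NEW_UNIFORM_BUFFER) != 0;

    ctx->dirty &= ~TOY_NEW_UNIFORM_BUFFER;
    return rebind;
}

/* Each replacing path triggers exactly one rebind. Returns 0 on success. */
int toy_dirty_demo(void)
{
    struct toy_context ctx = { 0 };
    struct toy_buffer buf = { 0 };

    assert(toy_upload_state(&ctx) == 0);
    toy_buffer_data(&ctx, &buf);
    assert(toy_upload_state(&ctx) == 1);
    toy_buffer_subdata_replace(&ctx, &buf);
    assert(toy_upload_state(&ctx) == 1);
    toy_buffer_map_invalidate(&ctx, &buf);
    assert(toy_upload_state(&ctx) == 1);
    assert(toy_upload_state(&ctx) == 0);
    assert(buf.storage_id == 3);
    return 0;
}
```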
Re: [Mesa-dev] i965: Haswell has been broken for over 5 days
>>> Haswell has been broken on master for a surprisingly long time, since
>>> commit 1dfea559c3 (Thu May 2 11:27:37 2013 -0700). Reverting that commit
>>> fixed it for me.
>>>
>>> If it doesn't get properly fixed by the 7th day, I'd like to see the
>>> guilty patch reverted. A full week is too long for a platform under
>>> active development to be down.
>>
>> Well, it was broken for 2 working days before I caught the bug myself
>> and posted a patch, which got (negative) review feedback at midnight.
>> QA caught the bug shortly after I posted the patch. I think you have
>> unreasonable expectations of turnaround time here for something that
>> wasn't even bisected and reported.
>
> I admit I may have an unreasonable expectation here sourced in
> frustration. The frustration, though, isn't a good excuse to be
> unreasonable.

Also HSW isn't hw the rest of the world has to worry about, whereas SNB
being broken for 3D at startup on every distro matters more.

It might not matter in your reporting and bonus structure but it sure as
hell matters to users :-)

Dave.
Re: [Mesa-dev] [PATCH] intel: Fix render-to-texture in non-FinishRenderTexture cases.
On 05/07/2013 03:53 PM, Eric Anholt wrote:
> With EGL_KHR_gl_renderbuffer_iamge, we have the ability to render to
                               ^ image
> renderbuffers that are also textures, so the core Mesa FinishRenderTexture
> hook doesn't get called. That hook also wasn't called in various cases
> within the driver where we'd update texture contents using the render cache
> (like glCopyTexSubImage) that resulted in intel_batchbuffer_emit_mi_flush().
> To fix it, track a set of rendered-to BOs in our context, which is cleared
> at batch wrap or emit_mi_flush time, and do an emit_mi_flush if one of our
> textures is in that set.

That sounds like an optimal (if complex) solution for the EGLimage.
When Ken described the bug to me, I was envisioning a simpler fix: if a
renderbuffer is attached to an EGLimage, use FinishRenderTexture. Out
of curiosity, do you think that would have also worked? It seems like
that would be cheaper in the non-EGLimage case, though it doesn't sound
like it matters.

> This change doesn't turn the other emit_mi_flushes (such as intel_blit.c
> operations) into render_cache_set operations yet, as that would increase
> the size of our set and we expect that those operations get immediately
> flushed anyway.
>
> No statistically significant performance difference in cairo-gl (n=53/54,
> slow turbo outliers removed), despite spending ~1% CPU in these set
> operations.
>
> Fixes piglit EGL_KHR_gl_renderbuffer_image/renderbuffer-texture.

Reviewed-by: Ian Romanick <ian.d.roman...@intel.com>

NOTE: This is a candidate for the 9.1 branch.

> ---
>  src/mesa/drivers/dri/i915/i830_texstate.c      |  3 ++
>  src/mesa/drivers/dri/i915/i915_texstate.c      |  3 ++
>  src/mesa/drivers/dri/i915/intel_tris.c         | 22
>  src/mesa/drivers/dri/i965/brw_draw.c           | 23 +---
>  src/mesa/drivers/dri/i965/brw_misc_state.c     |  5 +++
>  src/mesa/drivers/dri/intel/intel_batchbuffer.c |  5 +++
>  src/mesa/drivers/dri/intel/intel_context.h     |  7
>  src/mesa/drivers/dri/intel/intel_fbo.c         | 49 ++
>  src/mesa/drivers/dri/intel/intel_fbo.h         |  6
>  9 files changed, 112 insertions(+), 11 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i915/i830_texstate.c b/src/mesa/drivers/dri/i915/i830_texstate.c
> index f186fac..6b1dbf0 100644
> --- a/src/mesa/drivers/dri/i915/i830_texstate.c
> +++ b/src/mesa/drivers/dri/i915/i830_texstate.c
> @@ -33,6 +33,7 @@
>
>  #include "intel_mipmap_tree.h"
>  #include "intel_tex.h"
> +#include "intel_fbo.h"
>
>  #include "i830_context.h"
>  #include "i830_reg.h"
> @@ -128,6 +129,8 @@ i830_update_tex_unit(struct intel_context *intel, GLuint unit, GLuint ss3)
>     GLubyte border[4];
>     GLuint dst_x, dst_y;
>
> +   intel_render_cache_set_check_flush(intel, intelObj->mt->region->bo);
> +
>     memset(state, 0, sizeof(*state));
>
>     /*We need to refcount these. */
> diff --git a/src/mesa/drivers/dri/i915/i915_texstate.c b/src/mesa/drivers/dri/i915/i915_texstate.c
> index 43c802b..148da15 100644
> --- a/src/mesa/drivers/dri/i915/i915_texstate.c
> +++ b/src/mesa/drivers/dri/i915/i915_texstate.c
> @@ -33,6 +33,7 @@
>
>  #include "intel_mipmap_tree.h"
>  #include "intel_tex.h"
> +#include "intel_fbo.h"
>
>  #include "i915_context.h"
>  #include "i915_reg.h"
> @@ -151,6 +152,8 @@ i915_update_tex_unit(struct intel_context *intel, GLuint unit, GLuint ss3)
>     GLubyte border[4];
>     GLfloat maxlod;
>
> +   intel_render_cache_set_check_flush(intel, intelObj->mt->region->bo);
> +
>     memset(state, 0, sizeof(*state));
>
>     /*We need to refcount these. */
> diff --git a/src/mesa/drivers/dri/i915/intel_tris.c b/src/mesa/drivers/dri/i915/intel_tris.c
> index 7c60d84..1f27243 100644
> --- a/src/mesa/drivers/dri/i915/intel_tris.c
> +++ b/src/mesa/drivers/dri/i915/intel_tris.c
> @@ -52,6 +52,8 @@
>  #include "intel_batchbuffer.h"
>  #include "intel_buffers.h"
>  #include "intel_reg.h"
> +#include "intel_fbo.h"
> +#include "intel_mipmap_tree.h"
>  #include "i830_context.h"
>  #include "i830_reg.h"
>  #include "i915_context.h"
> @@ -61,6 +63,22 @@ static void intelRasterPrimitive(struct gl_context * ctx, GLenum rprim,
>                                   GLuint hwprim);
>
>  static void
> +mark_render_cache(struct intel_context *intel)
> +{
> +   struct gl_context *ctx = &intel->ctx;
> +   struct gl_framebuffer *fb = ctx->DrawBuffer;
> +   struct intel_renderbuffer *depth_irb =
> +      intel_get_renderbuffer(fb, BUFFER_DEPTH);
> +   struct intel_renderbuffer *color_irb =
> +      intel_renderbuffer(fb->_ColorDrawBuffers[0]);
> +
> +   if (color_irb)
> +      intel_render_cache_set_add_bo(intel, color_irb->mt->region->bo);
> +   if (depth_irb)
> +      intel_render_cache_set_add_bo(intel, depth_irb->mt->region->bo);
> +}
> +
> +static void
>  intel_flush_inline_primitive(struct intel_context *intel)
>  {
>     GLuint used = intel->batch.used - intel->prim.start_ptr;
> @@ -75,6 +93,8 @@ intel_flush_inline_primitive(struct intel_context *intel)
>     intel->batch.map[intel->prim.start_ptr] =
>        _3DPRIMITIVE | intel->prim.primitive | (used - 2);
>
> +   mark_render_cache(intel);
> +
>     goto finished;
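The tracking scheme in the commit message (remember which buffers were rendered to, flush only when one of them is about to be sampled, clear the set on flush) can be sketched as below. This is a toy under stated assumptions: the fixed-size array, the struct, and every function name are hypothetical, and the patch's actual set implementation is not shown in the quoted portion.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_SET_MAX 8

struct toy_render_cache {
    const void *bos[TOY_SET_MAX];  /* buffers rendered to since last flush */
    size_t count;
    int flushes;                   /* how many flushes were emitted */
};

static int toy_set_contains(const struct toy_render_cache *rc, const void *bo)
{
    size_t i;

    for (i = 0; i < rc->count; i++)
        if (rc->bos[i] == bo)
            return 1;
    return 0;
}

/* Flushing makes all earlier rendering visible, so the set empties. */
static void toy_flush(struct toy_render_cache *rc)
{
    rc->flushes++;
    rc->count = 0;
}

/* Record bo as a render target. If the toy set is full, flush first. */
void toy_render_cache_add_bo(struct toy_render_cache *rc, const void *bo)
{
    if (toy_set_contains(rc, bo))
        return;
    if (rc->count == TOY_SET_MAX)
        toy_flush(rc);
    rc->bos[rc->count++] = bo;
}

/* Before sampling from bo: flush only if it was rendered to. */
void toy_render_cache_check_flush(struct toy_render_cache *rc, const void *bo)
{
    if (toy_set_contains(rc, bo))
        toy_flush(rc);
}

/* Render to a, then sample b (no flush) and a (one flush).
 * Returns the number of flushes emitted. */
int toy_render_cache_demo(void)
{
    struct toy_render_cache rc = { { NULL }, 0, 0 };
    int a = 0, b = 0;

    toy_render_cache_add_bo(&rc, &a);
    toy_render_cache_check_flush(&rc, &b);
    assert(rc.flushes == 0);
    toy_render_cache_check_flush(&rc, &a);
    assert(rc.flushes == 1);
    toy_render_cache_check_flush(&rc, &a);   /* set was cleared */
    assert(rc.flushes == 1);
    return rc.flushes;
}
```

The simpler alternative raised in the review (using FinishRenderTexture when a renderbuffer is attached to an EGLimage) would avoid the set entirely. The sketch only illustrates the approach the patch takes.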
Re: [Mesa-dev] [PATCH] i965: Fix hangs on HSW since the gen6 blorp fix.
On 05/07/2013 04:36 PM, Eric Anholt wrote:
> The constant packets for gen6 are too small for gen7, and while IVB
> seems happy with them HSW blows up. Fix it by emitting the correct
> packets on gen7, for all stages.
>
> v2: Include the packets instead of just skipping them.
>
> NOTE: This is a candidate for the stable branches.
> ---
>  src/mesa/drivers/dri/i965/gen7_blorp.cpp | 103 ++-
>  1 file changed, 101 insertions(+), 2 deletions(-)

Tested on Haswell.

Reviewed-and-tested-by: Chad Versace <chad.vers...@linux.intel.com>
Re: [Mesa-dev] [PATCH] mesa: Be less casual about texture formats in st_finalize_texture
On Wed, May 8, 2013 at 1:14 AM, Brian Paul <bri...@vmware.com> wrote:
> On 05/06/2013 02:41 PM, Adam Jackson wrote:
>> Commit 62452883 removed a hunk like
>>
>>     if (firstImageFormat != stObj->pt->format)
>>        st_view_format = firstImageFormat;
>>
>> from update_single_texture(). This broke piglit/glx-tfp on AMD Barts
>> (and probably others), as that hunk was compensating for the mesa and
>> gallium layers disagreeing about the format.
>>
>> Fix this by not ignoring the alpha channel in st_finalize_texture when
>> considering whether two 32-bit formats are sufficiently compatible.
>
> It looks like you're undoing change a2817f6ae by Dave Airlie. Dave
> should review this. It's not 100% clear to me what's going on there.

I think I'd rather put back what Marek's change undid than remove the
alpha channel stuff, I put all that in for a good reason, if memory
serves things either went a lot slower or stuff misrendered in
gnome-shell without it.

Dave.
Re: [Mesa-dev] [PATCH] mesa: Be less casual about texture formats in st_finalize_texture
Sorry, I'm not following. Which commit of mine are you referring to?

Marek

On Wed, May 8, 2013 at 2:56 AM, Dave Airlie <airl...@gmail.com> wrote:
> On Wed, May 8, 2013 at 1:14 AM, Brian Paul <bri...@vmware.com> wrote:
>> On 05/06/2013 02:41 PM, Adam Jackson wrote:
>>> Commit 62452883 removed a hunk like
>>>
>>>     if (firstImageFormat != stObj->pt->format)
>>>        st_view_format = firstImageFormat;
>>>
>>> from update_single_texture(). This broke piglit/glx-tfp on AMD Barts
>>> (and probably others), as that hunk was compensating for the mesa
>>> and gallium layers disagreeing about the format.
>>>
>>> Fix this by not ignoring the alpha channel in st_finalize_texture
>>> when considering whether two 32-bit formats are sufficiently
>>> compatible.
>>
>> It looks like you're undoing change a2817f6ae by Dave Airlie. Dave
>> should review this. It's not 100% clear to me what's going on there.
>
> I think I'd rather put back what Marek's change undid than remove the
> alpha channel stuff, I put all that in for a good reason, if memory
> serves things either went a lot slower or stuff misrendered in
> gnome-shell without it.
>
> Dave.
Re: [Mesa-dev] [PATCH] mesa: Be less casual about texture formats in st_finalize_texture
On Wed, May 8, 2013 at 11:14 AM, Marek Olšák <mar...@gmail.com> wrote:
> Sorry, I'm not following. Which commit of mine are you referring to?

The one ajax pointed out in the first message, 62452883

Dave.
[Mesa-dev] [PATCH] gallium/tgsi: clarify (possibly change) TGSI_OPCODE_UCMP definition
From: Roland Scheidegger <srol...@vmware.com>

UCMP while an integer opcode isn't really consistently implemented as
having all integer arguments. softpipe will assume all arguments are
ints, whereas gallivm has the arguments defined as untyped which means
they'll get treated as floats. This means input modifiers will not work
the same.
Fix this by saying only first arg is an integer, which seems more
useful than making all arguments integers - this would be similar to
d3d10 movc opcode.
---
 src/gallium/docs/source/tgsi.rst |    5 +++++
 1 file changed, 5 insertions(+)

diff --git a/src/gallium/docs/source/tgsi.rst b/src/gallium/docs/source/tgsi.rst
index 3af1fb7..852f8a0 100644
--- a/src/gallium/docs/source/tgsi.rst
+++ b/src/gallium/docs/source/tgsi.rst
@@ -1291,6 +1291,11 @@ Support for these opcodes indicated by PIPE_SHADER_CAP_INTEGERS (all of them?)
 
 .. opcode:: UCMP - Integer Conditional Move
 
+.. note::
+
+   Only the first source arg is an integer, the 2nd and 3rd ones are
+   considered floats (for input modifier purposes).
+
 .. math::
 
   dst.x = src0.x ? src1.x : src2.x
-- 
1.7.9.5
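The distinction the patch above draws can be sketched in plain C. This is only an illustrative model, not code from softpipe or gallivm: the `reg4` type and the helpers `ucmp`, `neg_float_bits`, `neg_int_bits`, `float_bits` and `bits_float` are hypothetical names introduced here. It shows the per-component select with src0 tested as an integer, and why it matters whether a negate input modifier on src1/src2 is applied as a float or as an integer operation.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical four-component register, stored as raw 32-bit patterns. */
typedef struct { uint32_t c[4]; } reg4;

/* Reinterpret a float as its bit pattern and back (no value conversion). */
static uint32_t float_bits(float f) { uint32_t u; memcpy(&u, &f, sizeof u); return u; }
static float bits_float(uint32_t u) { float f; memcpy(&f, &u, sizeof f); return f; }

/* Negate modifier applied as a float: flips the IEEE-754 sign bit. */
static uint32_t neg_float_bits(uint32_t bits) { return bits ^ 0x80000000u; }

/* Negate modifier applied as an integer: two's-complement negation. */
static uint32_t neg_int_bits(uint32_t bits) { return 0u - bits; }

/* UCMP per component: dst = src0 ? src1 : src2, with src0 tested as an
 * integer (any non-zero bit pattern counts as true). */
static reg4 ucmp(reg4 src0, reg4 src1, reg4 src2)
{
   reg4 dst;
   for (int i = 0; i < 4; i++)
      dst.c[i] = src0.c[i] != 0 ? src1.c[i] : src2.c[i];
   return dst;
}
```

In this model, negating the bit pattern of 1.0f as a float yields -1.0f, while negating the same bits as an integer yields an unrelated float value, which is the inconsistency between the two implementations that the note in tgsi.rst is meant to rule out.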
Re: [Mesa-dev] [PATCH] i965: Fix hangs on HSW since the gen6 blorp fix.
On 05/07/2013 04:36 PM, Eric Anholt wrote:
> The constant packets for gen6 are too small for gen7, and while IVB
> seems happy with them HSW blows up. Fix it by emitting the correct
> packets on gen7, for all stages.
>
> v2: Include the packets instead of just skipping them.
>
> NOTE: This is a candidate for the stable branches.

Thanks so much for fixing this, Eric.

Reviewed-by: Kenneth Graunke <kenn...@whitecape.org>
Re: [Mesa-dev] [PATCH] egl/android: Fix error condition for EGL_ANDROID_image_native_buffer
On Wed, May 8, 2013 at 12:15 AM, Chad Versace <chad.vers...@linux.intel.com> wrote:
> On 05/07/2013 01:19 AM, Chia-I Wu wrote:
>> On Tue, May 7, 2013 at 3:49 PM, Pohjolainen, Topi <topi.pohjolai...@intel.com> wrote:
>>> On Mon, May 06, 2013 at 02:23:52PM -0700, Chad Versace wrote:
>>>> Emit EGL_BAD_CONTEXT if the user passes a context to
>>>> eglCreateImageKHR(type=EGL_ANDROID_image_native_buffer).
>>>>
>>>> From the EGL_ANDROID_image_native_buffer spec:
>>>>
>>>>     * If target is EGL_NATIVE_BUFFER_ANDROID and ctx is not
>>>>       EGL_NO_CONTEXT, the error EGL_BAD_CONTEXT is generated.
>>>>
>>>> Note: This is a candidate for the stable branches.
>>>> CC: Tapani Pälli <tapani.pa...@intel.com>
>>>> Signed-off-by: Chad Versace <chad.vers...@linux.intel.com>
>>>> ---
>>>>  src/egl/drivers/dri2/platform_android.c | 16 ++++++++++++++--
>>>>  1 file changed, 14 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/src/egl/drivers/dri2/platform_android.c b/src/egl/drivers/dri2/platform_android.c
>>>> index cee4035..ed50907 100644
>>>> --- a/src/egl/drivers/dri2/platform_android.c
>>>> +++ b/src/egl/drivers/dri2/platform_android.c
>>>> @@ -337,7 +337,7 @@ droid_swap_buffers(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface *draw)
>>>>  }
>>>>
>>>>  static _EGLImage *
>>>> -dri2_create_image_android_native_buffer(_EGLDisplay *disp,
>>>> +dri2_create_image_android_native_buffer(_EGLDisplay *disp, _EGLContext *ctx,
>>>>                                          struct ANativeWindowBuffer *buf)
>>>>  {
>>>>     struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp);
>>>> @@ -346,6 +346,18 @@ dri2_create_image_android_native_buffer(_EGLDisplay *disp,
>>>>     uint32_t offsets[3], strides[3], handles[3], tmp;
>>>>     EGLint format;
>>>>
>>>> +   if (ctx != NULL) {
>>>
>>> I did a similar check for the 'EGL_LINUX_DMA_BUF_EXT'. Technically
>>> 'eglapi.c::eglCreateImageKhr()' does a lookup of the context via
>>> '_eglLookupContext()' and does a translation of 'EGL_NO_CONTEXT' also
>>> (from NULL to NULL). Hence I chose to do the check there. But would it
>>> be better for me to do it also in the driver side as the target is
>>> valid only for linux platforms anyway?
>>
>> I will suggest do the check in the driver for the moment.
>> eglapi.[ch] should be generated ultimately, IMHO, and adding
>> extension-specific there will make the switch harder.
>
> Topi, I chose to place the check in platform_android.c because this is
> a platform-specific extension. There's no sense in wasting cycles in
> platform-common code for a platform-specific check.
>
> For dma_buf_import, I think you put the check in the right place. That
> extension is not confined to any of Wayland, X11, nor Android, so it
> makes sense to put the check in the common code to reduce code
> duplication.
>
> Chia-I, do you have any concrete plans to generate eglapi.c? What would
> that generated file look like?

No, I don't yet. The generated file probably still needs to perform
display locking and EGL resource lookup. But parameter checks would
need to be moved elsewhere.

-- 
o...@lunarg.com
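The driver-side check discussed in the thread above can be illustrated with a small stand-alone sketch. It does not use the real EGL headers or Mesa types: `sketch_context`, `sketch_image`, `create_image_native_buffer` and the `SKETCH_` constants are hypothetical stand-ins, and the numeric values are assumed to mirror EGL_SUCCESS and EGL_BAD_CONTEXT. The only behaviour taken from the quoted spec text is that a non-null context must produce a bad-context error for this target.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the EGL error codes (values assumed to match EGL). */
#define SKETCH_EGL_SUCCESS     0x3000
#define SKETCH_EGL_BAD_CONTEXT 0x3006

/* Hypothetical placeholder types; the real driver uses _EGLContext and
 * _EGLImage. */
struct sketch_context { int unused; };
struct sketch_image { int valid; };

static struct sketch_image sketch_the_image = { 1 };

/* Platform-specific image creation: the native-buffer target takes no
 * context, so any non-NULL ctx is rejected before other work is done. */
static struct sketch_image *
create_image_native_buffer(struct sketch_context *ctx, int *error)
{
   if (ctx != NULL) {
      *error = SKETCH_EGL_BAD_CONTEXT;
      return NULL;
   }

   *error = SKETCH_EGL_SUCCESS;
   return &sketch_the_image;
}
```

Keeping the check inside the platform-specific function, as Chad argues, means platform-common entry points need no knowledge of this extension; a check shared by several platforms (the dma_buf case) would instead sit in the common path.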
Re: [Mesa-dev] non x11/xlib based EGL and software only renderer
On Wed, May 8, 2013 at 2:34 AM, Divick Kishore <divick.kish...@gmail.com> wrote:
> Hi Chia,
>
>> I haven't tried that for a while, but it should not have X11
>> dependencies. You probably need to disable other stuffs such as
>> --disable-glx and etc. It might still require X11 at compile time,
>> because eglplatform.h may include Xlib.h, but it should not need X11
>> at runtime.
>
> Alright, I will try and update with my findings. But doing ldd it does
> show dependency on X11.so.

Then something may go wrong over the time...

>> But you need to modify the app to use pbuffer or FBO. Or you may add
>> some code to the null platform to treat windows as pbuffers.
>
> What exactly is a null platform? Also could you please point me to the
> directory and potential starting points to support window as pbuffers?
> Modifying apps is a complete no no.

See src/gallium/state_trackers/egl/null. It is a (fake) display system
that supports no windows nor pixmaps.

> Thanks & Regards,
> Divick

-- 
o...@lunarg.com
Re: [Mesa-dev] [PATCH v2] amd_performance_monitor: Fix multi-statement macro 'report'.
On 05/07/2013 02:04 PM, Vinson Lee wrote:
> Fixes "Nesting level does not match indentation" defect reported by
> Coverity.

That has to be the stupidest defect I've ever heard of. But your patch
looks reasonable nonetheless - most people use do-while blocks like
that, so it's probably a good idea.

Reviewed-by: Kenneth Graunke <kenn...@whitecape.org>

> Signed-off-by: Vinson Lee <v...@freedesktop.org>
> ---
>  tests/spec/amd_performance_monitor/api.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/tests/spec/amd_performance_monitor/api.c b/tests/spec/amd_performance_monitor/api.c
> index 7b321cf..3205fc0 100644
> --- a/tests/spec/amd_performance_monitor/api.c
> +++ b/tests/spec/amd_performance_monitor/api.c
> @@ -113,8 +113,10 @@ find_invalid_counter(unsigned *counters, int num_counters)
>  }
>
>  #define report(pass) \
> -    piglit_report_subtest_result((pass) ? PIGLIT_PASS : PIGLIT_FAIL, __FUNCTION__); \
> -    return
> +	do { \
> +		piglit_report_subtest_result((pass) ? PIGLIT_PASS : PIGLIT_FAIL, __FUNCTION__); \
> +		return; \
> +	} while (0)
>
>  /**/
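Why the do-while form in the patch above is the safer shape for a multi-statement macro can be shown with a small stand-alone sketch. The names here (`log_result`, `REPORT_FRAGILE`, `REPORT_SAFE`, `run_fragile`, `run_safe`) are hypothetical and not taken from piglit; the second statement is a counter bump rather than the `return` used by the real macro, so that the effect can be observed from the caller.

```c
#include <assert.h>

static int calls; /* observable side effects of the macros below */

static void log_result(int pass) { (void)pass; calls++; }

/* Fragile: expands to two statements, so an unbraced `if` guards only
 * the first one and the second always runs. */
#define REPORT_FRAGILE(pass) log_result(pass); calls += 100

/* Robust: do { ... } while (0) makes the expansion a single statement
 * that still requires a trailing semicolon at the call site. */
#define REPORT_SAFE(pass) do { log_result(pass); calls += 100; } while (0)

static int run_fragile(int cond)
{
   calls = 0;
   if (cond)
      REPORT_FRAGILE(1); /* `calls += 100` escapes the if */
   return calls;
}

static int run_safe(int cond)
{
   calls = 0;
   if (cond)
      REPORT_SAFE(1); /* whole body is guarded by the if */
   return calls;
}
```

With a false condition the fragile variant still performs the second statement, while the safe variant does nothing. In the original `report` macro the trailing statement is a `return`, so the same mistake would make a function return unconditionally when the macro is used under an unbraced `if`.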
[Mesa-dev] [PATCH] i965: Actually use the user timeout in glClientWaitSync.
From: Ben Widawsky <b...@bwidawsk.net>

Use the new libdrm functionality to actually do timed waits on the sync
object.

Signed-off-by: Ben Widawsky <b...@bwidawsk.net>
Signed-off-by: Kenneth Graunke <kenn...@whitecape.org>
---
 src/mesa/drivers/dri/intel/intel_syncobj.c | 10 +---------
 1 file changed, 1 insertion(+), 9 deletions(-)

I've been keeping this patch around for ages, waiting for an
application that actually benefited from the timeout working. I still
haven't found one, but it's probably past time to land it anyway.

No piglit changes on Ivybridge.

diff --git a/src/mesa/drivers/dri/intel/intel_syncobj.c b/src/mesa/drivers/dri/intel/intel_syncobj.c
index e965896..9657d9a 100644
--- a/src/mesa/drivers/dri/intel/intel_syncobj.c
+++ b/src/mesa/drivers/dri/intel/intel_syncobj.c
@@ -80,20 +80,12 @@ intel_fence_sync(struct gl_context *ctx, struct gl_sync_object *s,
    intel_flush(ctx);
 }
 
-/* We ignore the user-supplied timeout.  This is weaselly -- we're allowed to
- * round to an implementation-dependent accuracy, and right now our
- * implementation rounds to the wait-forever value.
- *
- * The fix would be a new kernel function to do the GTT transition with a
- * timeout.
- */
 static void intel_client_wait_sync(struct gl_context *ctx, struct gl_sync_object *s,
 				   GLbitfield flags, GLuint64 timeout)
 {
    struct intel_sync_object *sync = (struct intel_sync_object *)s;
 
-   if (sync->bo) {
-      drm_intel_bo_wait_rendering(sync->bo);
+   if (sync->bo && drm_intel_gem_bo_wait(sync->bo, timeout) == 0) {
       s->StatusFlag = 1;
       drm_intel_bo_unreference(sync->bo);
       sync->bo = NULL;
-- 
1.8.2.1
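The control flow introduced by the patch above can be modelled without libdrm. The following is a rough sketch under stated assumptions: `fake_bo`, `fake_sync`, `fake_bo_wait` and `client_wait_sync` are hypothetical stand-ins, and `fake_bo_wait` only imitates the convention assumed for the timed wait (0 when the buffer became idle within the timeout, a negative errno value otherwise). Special timeout values such as "wait forever" are not modelled.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical buffer object: how long the GPU still needs it, in ns. */
struct fake_bo { int64_t ns_until_idle; };

/* Hypothetical sync object mirroring the fields touched by the patch. */
struct fake_sync {
   struct fake_bo *bo;
   int status_flag; /* 1 once the fence is known to be signalled */
};

/* Stand-in for a timed wait: returns 0 if the buffer goes idle within
 * timeout_ns, otherwise consumes the timeout and returns -ETIME. */
static int fake_bo_wait(struct fake_bo *bo, int64_t timeout_ns)
{
   if (bo->ns_until_idle <= timeout_ns) {
      bo->ns_until_idle = 0;
      return 0;
   }
   bo->ns_until_idle -= timeout_ns;
   return -ETIME;
}

/* Same shape as the patched function: the sync object is only marked
 * signalled (and its buffer dropped) when the timed wait succeeded. */
static void client_wait_sync(struct fake_sync *sync, int64_t timeout_ns)
{
   if (sync->bo && fake_bo_wait(sync->bo, timeout_ns) == 0) {
      sync->status_flag = 1;
      sync->bo = NULL;
   }
}
```

A wait that times out leaves the status flag clear and keeps the buffer attached, so the caller can retry later; before the patch the equivalent call would have blocked until rendering finished, regardless of the timeout passed in.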
Re: [Mesa-dev] [PATCH 1/6] glsl: Copy _mesa_shader_type_to_index() to standalone scaffolding.
On 04/17/2013 05:30 PM, Kenneth Graunke wrote:
> We can't include shaderobj.h from the standalone utilities, so we
> unfortunately have to copy this function.
>
> Signed-off-by: Kenneth Graunke <kenn...@whitecape.org>
> ---
>  src/glsl/standalone_scaffolding.h | 17 +++++++++++++++++
>  1 file changed, 17 insertions(+)

Just a quick ping - this series is still awaiting review. It's not
critical or anything, but it'd be nice to land it someday...