Re: [Mesa-dev] [PATCH 7/7] radeon/llvm: enable LICM and DCE pass
On Wed, 2013-03-06 at 11:56 +0100, Christian König wrote:
> On 05.03.2013 18:32, Vincent Lejeune wrote:
> > LICM stands for Loop Invariant Code Motion. Instructions that do not depend on the loop index are moved outside of the loop body. (This solves one of the LLVM-generated code issues Vadim pointed out in another thread.) DCE is Dead Code Elimination... I don't know the difference between classic DCE and aggressive DCE though.
>
> If I understand it correctly, the DCE pass just removes the trivially dead instructions, e.g. those that are not used and have no side effect, while the aggressive DCE pass goes into a bit more depth. For example, imagine a loop with an unused variable being incremented: the increment depends on itself, but it's completely useless...

Thanks for the explanations guys, but the point is really that the change needs to be self-explanatory. :)

--
Earthling Michel Dänzer | http://www.amd.com
Libre software enthusiast | Debian, X and DRI developer
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
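To make the two passes discussed above concrete, here is a minimal hand-written sketch in plain C. It is not output from LLVM and not code from the patch; the function names `sum_before` and `sum_after` are made up for illustration. The first function contains a loop-invariant multiply and a self-dependent counter that is never read anywhere else; the second shows roughly what LICM followed by aggressive DCE would be expected to leave behind.

```c
#include <stddef.h>

/* Before: `scale * bias` does not depend on i, and `unused` only feeds
 * its own increment, so nothing observable depends on it. */
static int sum_before(const int *v, size_t n, int scale, int bias)
{
    int sum = 0;
    int unused = 0;
    for (size_t i = 0; i < n; i++) {
        int k = scale * bias;   /* loop invariant: a candidate for LICM */
        unused = unused + 1;    /* depends only on itself: dead to an aggressive DCE */
        sum += v[i] * k;
    }
    return sum;
}

/* After (conceptually): the invariant is hoisted and the dead counter is gone. */
static int sum_after(const int *v, size_t n, int scale, int bias)
{
    int sum = 0;
    int k = scale * bias;       /* computed once, outside the loop */
    for (size_t i = 0; i < n; i++)
        sum += v[i] * k;
    return sum;
}
```

Both functions return the same value for the same inputs; only the amount of work per iteration differs. A trivial DCE would likely keep `unused`, since each increment has a user (the next increment).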
Re: [Mesa-dev] [RFC] GLX_MESA_query_renderer
On Saturday 02 March 2013, Ian Romanick wrote:
> GLX_RENDERER_OPENGL_CORE_PROFILE_VERSION_MESA           2  Maximum core profile major and minor version supported by the renderer
> GLX_RENDERER_OPENGL_COMPATIBILITY_PROFILE_VERSION_MESA  2  Maximum compatibility profile major and minor version supported by the renderer

I wonder if it would make sense to also have a minimum version in case we ever see implementations that don't support 1.x and 2.x contexts. Or should that case be handled by COMPATIBILITY_PROFILE_VERSION queries returning 0.0.0?

Fredrik
Re: [Mesa-dev] [PATCH] llvmpipe: fix incorrect 'j' array index in dummy texture code
On 06.03.2013 02:09, Brian Paul wrote:
> Use 0 instead.
> ---
>  src/gallium/drivers/llvmpipe/lp_setup.c |    6 +++---
>  1 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/src/gallium/drivers/llvmpipe/lp_setup.c b/src/gallium/drivers/llvmpipe/lp_setup.c
> index 4529775..299fd65 100644
> --- a/src/gallium/drivers/llvmpipe/lp_setup.c
> +++ b/src/gallium/drivers/llvmpipe/lp_setup.c
> @@ -720,9 +720,9 @@ lp_setup_set_fragment_sampler_views(struct lp_setup_context *setup,
>              jit_tex->depth = 1;
>              jit_tex->first_level = 0;
>              jit_tex->last_level = 0;
> -            jit_tex->mip_offsets[j] = 0;
> -            jit_tex->row_stride[j] = 0;
> -            jit_tex->img_stride[j] = 0;
> +            jit_tex->mip_offsets[0] = 0;
> +            jit_tex->row_stride[0] = 0;
> +            jit_tex->img_stride[0] = 0;
>           }
>           else {
>              jit_tex->width = res->width0;

Oops, I think I introduced that bug recently when I needed to restructure the code to handle buffer textures.

Reviewed-by: Roland Scheidegger srol...@vmware.com
Re: [Mesa-dev] [PATCH] Fix build of swrast only without libdrm
On 03/06/2013 01:54 AM, Michel Dänzer wrote:
> Actually, you asked for two different software renderers: --with-dri-drivers=swrast produces lib/swrast_dri.so based on classic swrast, --with-gallium-drivers=swrast produces lib/gallium/swrast_dri.so based on llvmpipe/softpipe. Presumably both are built before and after the change(s) in question, they merely changed which one ends up installed / picked up first at runtime.

Oh my... I've started building nightly drm as well. (Fedora 17 has libdrm_nouveau2.so which I can't make work.) Now I can test VTK against the much more default (wild type) configuration:

export PKG_CONFIG_PATH=/home/kevin/drm_nightly/lib/pkgconfig/
./autogen.sh \
    --prefix=/home/kevin/mesa_nightly \
    --enable-glx \
    --enable-dri \
    --enable-shared-glapi \
    --enable-gallium-llvm \
    --enable-osmesa
[Mesa-dev] [PATCH] Unreference sampler object when it's currently bound to, texture unit.
This change specifically unbinds a sampler object from the texture unit if it's bound to a unit. The spec calls for reverting to the default object when a currently bound sampler object is deleted.

Signed-off-by: Alan Hourihane al...@vmware.com
---
 src/mesa/main/samplerobj.c |   10 ++++++++++
 1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/src/mesa/main/samplerobj.c b/src/mesa/main/samplerobj.c
index 319a444..2279055 100644
--- a/src/mesa/main/samplerobj.c
+++ b/src/mesa/main/samplerobj.c
@@ -206,9 +206,19 @@ _mesa_DeleteSamplers(GLsizei count, const GLuint *samplers)
    for (i = 0; i < count; i++) {
       if (samplers[i]) {
+         GLsizei j;
          struct gl_sampler_object *sampObj =
             _mesa_lookup_samplerobj(ctx, samplers[i]);
+
          if (sampObj) {
+            /* If the sampler is currently bound, unbind it. */
+            for (j = 0; j < ctx->Const.MaxCombinedTextureImageUnits; j++) {
+               if (ctx->Texture.Unit[j].Sampler == sampObj) {
+                  FLUSH_VERTICES(ctx, _NEW_TEXTURE);
+                  _mesa_reference_sampler_object(ctx, &ctx->Texture.Unit[j].Sampler, NULL);
+               }
+            }
+
             /* The ID is immediately freed for re-use */
             _mesa_HashRemove(ctx->Shared->SamplerObjects, samplers[i]);
             /* But the object exists until its reference count goes to zero */
--
1.7.8.6
Re: [Mesa-dev] [PATCH] Unreference sampler object when it's currently bound to, texture unit.
On 03/06/2013 09:15 AM, Alan Hourihane wrote:
> This change specifically unbinds a sampler object from the texture unit if it's bound to a unit. The spec calls for default object when deleting sampler objects which are currently bound.

Candidate for the stable branches, I think.

> Signed-off-by: Alan Hourihane al...@vmware.com
> ---
>  src/mesa/main/samplerobj.c |   10 ++++++++++
>  1 files changed, 10 insertions(+), 0 deletions(-)
>
> diff --git a/src/mesa/main/samplerobj.c b/src/mesa/main/samplerobj.c
> index 319a444..2279055 100644
> --- a/src/mesa/main/samplerobj.c
> +++ b/src/mesa/main/samplerobj.c
> @@ -206,9 +206,19 @@ _mesa_DeleteSamplers(GLsizei count, const GLuint *samplers)
>     for (i = 0; i < count; i++) {
>        if (samplers[i]) {
> +         GLsizei j;

GLuint j (so that we don't get signed/unsigned comparison warnings from the loop below).

>           struct gl_sampler_object *sampObj =
>              _mesa_lookup_samplerobj(ctx, samplers[i]);
> +
>           if (sampObj) {
> +            /* If the sampler is currently bound, unbind it. */
> +            for (j = 0; j < ctx->Const.MaxCombinedTextureImageUnits; j++) {
> +               if (ctx->Texture.Unit[j].Sampler == sampObj) {
> +                  FLUSH_VERTICES(ctx, _NEW_TEXTURE);
> +                  _mesa_reference_sampler_object(ctx, &ctx->Texture.Unit[j].Sampler, NULL);
> +               }
> +            }
> +
>              /* The ID is immediately freed for re-use */
>              _mesa_HashRemove(ctx->Shared->SamplerObjects, samplers[i]);
>              /* But the object exists until its reference count goes to zero */

Reviewed-by: Brian Paul bri...@vmware.com

Thanks, Alan.

-Brian
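The unbind-on-delete idea in this patch can be sketched outside Mesa with a small reference-counted model. Everything below is hypothetical and only illustrates the pattern: `struct sampler`, `struct context`, `NUM_UNITS` and the helper names are stand-ins, not Mesa types or functions. The loop index is unsigned, in the spirit of the review comment above.

```c
#include <stdlib.h>

#define NUM_UNITS 4  /* hypothetical; stands in for MaxCombinedTextureImageUnits */

struct sampler { unsigned name; int refcount; };
struct context { struct sampler *unit[NUM_UNITS]; };

/* Point *ptr at samp, adjusting reference counts; the old object is
 * freed when its count drops to zero. */
static void reference_sampler(struct sampler **ptr, struct sampler *samp)
{
    if (*ptr == samp)
        return;
    if (*ptr && --(*ptr)->refcount == 0)
        free(*ptr);
    *ptr = samp;
    if (samp)
        samp->refcount++;
}

/* The creator holds one reference, playing the role of the name table. */
static struct sampler *new_sampler(unsigned name)
{
    struct sampler *s = calloc(1, sizeof *s);
    if (s) {
        s->name = name;
        s->refcount = 1;
    }
    return s;
}

/* Delete: unbind from every unit that still points at the object,
 * then drop the creator's reference. */
static void delete_sampler(struct context *ctx, struct sampler *samp)
{
    unsigned j;
    for (j = 0; j < NUM_UNITS; j++) {
        if (ctx->unit[j] == samp)
            reference_sampler(&ctx->unit[j], NULL);
    }
    reference_sampler(&samp, NULL);
}
```

In this model, skipping the unbind loop would leave the units pointing at an object whose name has already been released, which is the situation the patch is meant to avoid.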
Re: [Mesa-dev] [PATCH 1/3] llvmpipe: tweak CMD_BLOCK_MAX and LP_SCENE_MAX_SIZE
Looks good. Thanks for fine tuning these parameters!

Jose

----- Original Message -----
> We advertise a max texture/surfaces size of 8K x 8K but the old values for these limits didn't actually allow us to handle that surface size.
>
> For 8K x 8K we'll have 16384 bins. Each bin needs at least one cmd_block object which was 2192 bytes in size. Since 16384 * 2192 exceeded LP_SCENE_MAX_SIZE we'd silently fail in lp_scene_new_data_block() and not draw the complete scene.
>
> By reducing CMD_BLOCK_MAX to 29 we get nice 512-byte cmd_blocks. And by increasing LP_SCENE_MAX_SIZE to 9 MB we can allocate enough command blocks for 8K x 8K, plus a few regular data blocks.
>
> Fixes the (improved) piglit fbo-maxsize test.
>
> Note: This is a candidate for the stable branches.
> ---
>  src/gallium/drivers/llvmpipe/lp_scene.h |   10 ++++++++--
>  1 files changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/src/gallium/drivers/llvmpipe/lp_scene.h b/src/gallium/drivers/llvmpipe/lp_scene.h
> index b1db61b..801829d 100644
> --- a/src/gallium/drivers/llvmpipe/lp_scene.h
> +++ b/src/gallium/drivers/llvmpipe/lp_scene.h
> @@ -49,12 +49,18 @@ struct lp_rast_state;
>  #define TILES_Y (LP_MAX_HEIGHT / TILE_SIZE)
>
> -#define CMD_BLOCK_MAX 128
> +/* Commands per command block (ideally so sizeof(cmd_block) is a power of
> + * two in size.)
> + */
> +#define CMD_BLOCK_MAX 29
> +
> +/* Bytes per data block.
> + */
>  #define DATA_BLOCK_SIZE (64 * 1024)
>
>  /* Scene temporary storage is clamped to this size:
>   */
> -#define LP_SCENE_MAX_SIZE (4*1024*1024)
> +#define LP_SCENE_MAX_SIZE (9*1024*1024)
>
>  /* The maximum amount of texture storage referenced by a scene is
>   * clamped ot this size:
> --
> 1.7.3.4
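The arithmetic in the commit message can be checked with a few lines of C. This is only a back-of-the-envelope sketch: the 2192-byte and 512-byte block sizes and the 4 MB and 9 MB limits are taken from the message above, while the 64-pixel tile edge is an assumption chosen because it reproduces the quoted 16384 bins. The helper names are made up.

```c
#include <stddef.h>

enum {
    SURFACE_DIM = 8192,  /* 8K x 8K, from the commit message */
    TILE_DIM    = 64     /* assumed tile edge; yields the 16384 bins quoted */
};

/* Number of bins covering the whole surface. */
static size_t num_bins(void)
{
    return (size_t)(SURFACE_DIM / TILE_DIM) * (SURFACE_DIM / TILE_DIM);
}

/* Bytes needed just to give every bin one cmd_block of the given size. */
static size_t min_cmd_bytes(size_t cmd_block_size)
{
    return num_bins() * cmd_block_size;
}

/* Does one cmd_block per bin fit within the scene size limit? */
static int fits(size_t cmd_block_size, size_t scene_max)
{
    return min_cmd_bytes(cmd_block_size) <= scene_max;
}
```

With the old values, 16384 * 2192 = 35,913,728 bytes, far above the 4 MB limit. With 512-byte blocks the minimum is exactly 8 MB, so a 9 MB limit leaves 1 MB, or sixteen 64 KB data blocks, for everything else.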
Re: [Mesa-dev] [PATCH] GLSL: fix too eager constant variable optimization
On 3 March 2013 00:44, Aras Pranckevicius a...@unity3d.com wrote:
> > > opt_constant_variable was marking a variable as constant as long as there was exactly one constant assignment to it, but did not take into account that this assignment might be in a dynamic branch or a loop. Was happening on a fragment shader like this:
> > >
> > > uniform float mode;
> > > float func (float c) {
> > >     if (mode == 2.0)
> > >         return c; // was returning 0.1 after optimization!
> > >     if (mode == 3.0)
> > >         discard;
> > >     if (mode == 10.0)
> > >         c = 0.1;
> > >     return c;
> > > }
> > > void main() {
> > >     vec4 c = gl_FragCoord;
> > >     c.x = func(c.x);
> > >     gl_FragColor = c;
> > > }
> > >
> > > Now, looking further this optimization pass should also not mark variables as const if there was a dereference of them before that first assignment. I had code to do this (a hashtable that would track dereferences before assignment is done). But couldn't come up with a test case that would break the whole set of optimizations that Mesa does (lower jumps, or inlining, ... were getting in the way and hide the bug).
> >
> > I'm not sure I agree with this. The real problem with the example code you showed above is that there's an implicit write to the variable c at the top of the function, so c is not in fact constant--it's written twice. What we should really do is modify the optimization pass so that it's aware of the implicit write that happens in in and inout function args.
>
> Attached version two of the patch which does what you suggest - any ir_var_in, ir_var_const_in or ir_var_inout function args are being marked as assigned to. Fixes the issue just as well as my initial patch on several shaders that were problematic before.

Ok, I've taken a deeper look now, and I still have some concerns about this patch:

- The patch doesn't compile cleanly on master. In particular, it looks like it was made using a version of the code prior to commit 42a29d8 (glsl: Eliminate ambiguity between function ins/outs and shader ins/outs).

- It seems kludgy to add a visitor for ir_function_signature that loops through all the parameters, since the default hierarchical visitor for ir_function_signature already does that. Why not just modify the ir_variable visitor so that it increments entry->assignment_count when it visits a variable whose mode is ir_var_function_in or ir_var_function_inout? (Note that the ability to distinguish between function in variables and shader in variables was added in commit 42a29d8, the commit I mentioned above).

- This optimization pass runs in different ways depending on whether it's being run on an unlinked shader or a linked shader. When it's run on an unlinked shader, it runs via do_constant_variable_unlinked(), which finds each function and visits the function body individually. As such, it never visits the parameter declarations (or the ir_function_signature node), so the assignment_entry structure it creates for the variable c (in your example above) has our_scope=false, and no optimization is performed. So the bug doesn't manifest itself, and even if it did, your patch would have no effect, since the ir_function_signature node isn't visited. When the optimization pass is run on a linked shader, function inlining has already been performed, so there is only one ir_function_signature node left (the one for main, which takes no parameters). So again, the bug doesn't manifest itself, and even if it did, your patch would have no effect.

Although I agree that opt_constant_variable has problems and could be improved, it seems like I must be missing some crucial piece of information here, since I can't reproduce the bug and I can't understand why your patch would have any effect. I've made a shader_runner test to try to demonstrate the problem (attached), and it works fine on mesa master as well as the 9.1, 9.0, 8.0, and 7.11 branches. Can you help me understand what I'm missing?

Finally, in the future would you mind posting patches to the mailing list as inline text rather than attachments? (git send-email is a convenient way to do this.) It makes them far easier to review since we can comment on the code by simply hitting Reply-all and making review comments alongside the code. Thanks!

Attachment: fp-bogus-constant.shader_test (binary data)
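The "implicit write" argument from this thread can be illustrated with a toy model in C. This is not Mesa's opt_constant_variable code; the enum, the struct and the function names are hypothetical and only mimic the counting idea: a function in/inout parameter starts with one assignment already counted, so a single constant assignment in the body no longer qualifies it as constant.

```c
#include <stddef.h>

/* Hypothetical variable modes, loosely mirroring the ones named in the thread. */
enum var_mode { MODE_LOCAL, MODE_FUNCTION_IN, MODE_FUNCTION_INOUT };

struct var_entry {
    enum var_mode mode;
    int assignment_count;
    int has_constant_rhs;
};

/* Visiting the declaration: in/inout parameters carry an implicit
 * write from the caller, so they start at one assignment. */
static void visit_variable(struct var_entry *e, enum var_mode mode)
{
    e->mode = mode;
    e->assignment_count =
        (mode == MODE_FUNCTION_IN || mode == MODE_FUNCTION_INOUT) ? 1 : 0;
    e->has_constant_rhs = 0;
}

/* Visiting an explicit assignment in the function body. */
static void visit_assignment(struct var_entry *e, int rhs_is_constant)
{
    e->assignment_count++;
    if (rhs_is_constant)
        e->has_constant_rhs = 1;
}

/* Only a variable written exactly once, with a constant, qualifies. */
static int can_treat_as_constant(const struct var_entry *e)
{
    return e->assignment_count == 1 && e->has_constant_rhs;
}
```

Applied to the shader in the thread, the parameter `c` of `func` would end up with two counted writes (the implicit one plus `c = 0.1`), so it would not be folded to 0.1.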
Re: [Mesa-dev] [PATCH] Fix glGetInteger*(GL_SAMPLER_BINDING).
On 03/06/2013 11:23 AM, Alan Hourihane wrote:
> If the sampler object has been deleted on another context, an alternative context may reference the old sampler. So ensure the sampler object still exists.

Alan, is this specified somewhere in a spec? I can't find a description of this behaviour and we don't do this for texture objects or buffer objects, etc.

-Brian
[Mesa-dev] Problem with EGL_KHR_image_pixmap extension
Hi, I do not know if it's the right place for this, but I sent an email to the Mesa-users mailing list explaining the problem, and I have no answers, so if anyone can help, this is what happens: I'm using Mesa 9.1, EGL and xcb, to make texture from pixmap. Using EGL_KHR_image_pixmap, i get errors in textures. I have also checked texture_from_pixmap in the EGL demos, and the demo not work. I have no errors with Mesa 9.0, my test code and demo work well. Linux: archlinux Mesa: mesa-9.1-2, (mesa-demos from git) Drivers: intel-dri 9.1-2, xf86-video-intel 2.21.3-1 Kernel: 3.8.0-2-ARCH (same results with kernel 3.7.9) I've re-checked the test-code with mesa 9.0, and the only error is that the texture (texture_a) is upside down... Test code: // gcc -Wall -O3 egl-xcb-texture.c -o egl-xcb-texture -lEGL -lGLESv1_CM -lGLESv2 -lxcb -lxcb-util #include xcb/xcb.h #include xcb/xcb_util.h #include GLES/gl.h #include GLES/glext.h #include GLES2/gl2.h #include GLES2/gl2ext.h //#include GLES3/gl3.h //#include GLES3/gl3ext.h #include EGL/egl.h #include EGL/eglext.h #include stdio.h #include stdlib.h #include string.h float vert_tex_data[] = { 1.00f, 1.00f, 0.00f,// Position 0 1.00f, 1.00f, // TexCoord 0 -1.00f, 1.00f, 0.00f,// Position 1 0.00f, 1.00f,// TexCoord 1 -1.00f, -1.00f, 0.00f, // Position 2 0.00f, 0.00f, // TexCoord 2 1.00f, -1.00f, 0.00f, // Position 3 1.00f, 0.00f};// TexCoord 3 const GLushort index_data[] = {0, 1, 2, 2, 3, 0,}; struct ctx_t { xcb_connection_t *conn; int num_scr; xcb_screen_t *screen; const xcb_setup_t *setup; xcb_generic_event_t *xcb_event; xcb_window_t window; xcb_pixmap_t xpix; GLuint texture_a; GLuint texture_b; EGLDisplay egl_dpy; EGLConfig egl_cfg; EGLContext egl_context; EGLSurface egl_surface; GLuint shader_program; GLint loc_v_position; GLint loc_tex_coord; GLint loc_texture; PFNEGLCREATEIMAGEKHRPROC eglCreateImageKHR; PFNEGLDESTROYIMAGEKHRPROC eglDestroyImageKHR; PFNGLEGLIMAGETARGETTEXTURE2DOESPROC glEGLImageTargetTexture2DOES; }; void 
new_xwindow(struct ctx_t *ctx) { uint32_t values[5], mask; xcb_colormap_t colormap = 0; xcb_visualtype_t *visual = NULL; xcb_visualid_t visual_id; visual = xcb_aux_find_visual_by_attrs(ctx-screen, XCB_VISUAL_CLASS_TRUE_COLOR, 32); visual_id = visual-visual_id; colormap = xcb_generate_id(ctx-conn); xcb_create_colormap(ctx-conn, XCB_COLORMAP_ALLOC_NONE, colormap, ctx-screen-root, visual_id); mask = XCB_CW_BACK_PIXEL | XCB_CW_BORDER_PIXEL | XCB_CW_BIT_GRAVITY | XCB_CW_EVENT_MASK | XCB_CW_COLORMAP; values[0] = 0; values[1] = 0; values[2] = XCB_GRAVITY_NORTH_WEST; values[3] = XCB_EVENT_MASK_EXPOSURE | XCB_EVENT_MASK_KEY_PRESS; values[4] = colormap; ctx-window = xcb_generate_id(ctx-conn); xcb_create_window(ctx-conn, 32, ctx-window, ctx-screen-root, 0, 0, 800, 600, 0, //border XCB_WINDOW_CLASS_INPUT_OUTPUT, visual_id, mask, values); xcb_free_colormap(ctx-conn, colormap); xcb_flush(ctx-conn); } void make_pixmap(struct ctx_t *ctx, uint16_t w, uint16_t h, uint8_t depth) { xcb_gcontext_t gc = 0; xcb_rectangle_t rec = {0, 0, w - 1, h - 1}; xcb_arc_t arc[] = {{0, 0, w - 1, h - 1, 0, 90 * 64}, {0, 0, w - 1, h - 1, 90 * 64, 90 * 64}, {0, 0, w - 1, h - 1, 180 * 64, 90 * 64}, {0, 0, w - 1, h - 1, 270 * 64, 90 * 64}}; uint32_t red = 0xff, yellow = 0x808000, blue = 0xff, green = 0x00ff00, violet = 0xff00ff; ctx-xpix = xcb_generate_id(ctx-conn); xcb_create_pixmap(ctx-conn, depth, ctx-xpix, ctx-screen-root, w, h); gc = xcb_generate_id(ctx-conn); xcb_create_gc(ctx-conn, gc, ctx-xpix, XCB_GC_FOREGROUND, yellow); xcb_poly_fill_rectangle(ctx-conn, ctx-xpix, gc, 1, rec); xcb_change_gc(ctx-conn, gc, XCB_GC_FOREGROUND, blue); xcb_poly_rectangle(ctx-conn, ctx-xpix, gc, 1, rec); xcb_change_gc(ctx-conn, gc, XCB_GC_FOREGROUND, red); xcb_poly_fill_arc(ctx-conn, ctx-xpix, gc, 1, arc[0]); xcb_change_gc(ctx-conn, gc, XCB_GC_FOREGROUND, blue); xcb_poly_fill_arc(ctx-conn, ctx-xpix, gc, 1, arc[1]); xcb_change_gc(ctx-conn, gc, XCB_GC_FOREGROUND, green); xcb_poly_fill_arc(ctx-conn, ctx-xpix, gc, 1, 
arc[2]); xcb_change_gc(ctx-conn, gc, XCB_GC_FOREGROUND, violet); xcb_poly_fill_arc(ctx-conn, ctx-xpix, gc, 1, arc[3]); xcb_copy_area(ctx-conn, ctx-xpix, ctx-window, gc, 0, 0, 0, 0, w, h); xcb_free_gc(ctx-conn, gc); } void mirror(uint8_t *src, int len, int stride, uint8_t *dst) { uint8_t *out = dst + len -
[Mesa-dev] [PATCH] Fix glGetInteger*(GL_SAMPLER_BINDING).
If the sampler object has been deleted on another context, an alternative context may reference the old sampler. So ensure the sampler object still exists.

Signed-off-by: Alan Hourihane al...@vmware.com
---
 src/mesa/main/get.c        |   12 +++++++++++-
 src/mesa/main/samplerobj.c |    2 +-
 src/mesa/main/samplerobj.h |    2 ++
 3 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
index 2399f9c..c627dd6 100644
--- a/src/mesa/main/get.c
+++ b/src/mesa/main/get.c
@@ -34,6 +34,7 @@
 #include "state.h"
 #include "texcompress.h"
 #include "framebuffer.h"
+#include "samplerobj.h"

 /* This is a table driven implemetation of the glGet*v() functions.
  * The basic idea is that most getters just look up an int somewhere
@@ -827,7 +828,16 @@ find_custom_value(struct gl_context *ctx, const struct value_desc *d, union valu
       {
          struct gl_sampler_object *samp =
             ctx->Texture.Unit[ctx->Texture.CurrentUnit].Sampler;
-         v->value_int = samp ? samp->Name : 0;
+
+         /*
+          * The sampler object may have been deleted on another context,
+          * so we try to lookup the sampler object before returning it's Name.
+          */
+         if (samp && _mesa_lookup_samplerobj(ctx, samp->Name)) {
+            v->value_int = samp->Name;
+         } else {
+            v->value_int = 0;
+         }
       }
       break;
    /* GL_ARB_uniform_buffer_object */
diff --git a/src/mesa/main/samplerobj.c b/src/mesa/main/samplerobj.c
index 4664cc3..5cff329 100644
--- a/src/mesa/main/samplerobj.c
+++ b/src/mesa/main/samplerobj.c
@@ -40,7 +40,7 @@
 #include "main/samplerobj.h"

-static struct gl_sampler_object *
+struct gl_sampler_object *
 _mesa_lookup_samplerobj(struct gl_context *ctx, GLuint name)
 {
    if (name == 0)
diff --git a/src/mesa/main/samplerobj.h b/src/mesa/main/samplerobj.h
index 3114257..69e3899 100644
--- a/src/mesa/main/samplerobj.h
+++ b/src/mesa/main/samplerobj.h
@@ -62,6 +62,8 @@ _mesa_reference_sampler_object(struct gl_context *ctx,
       _mesa_reference_sampler_object_(ctx, ptr, samp);
 }

+extern struct gl_sampler_object *
+_mesa_lookup_samplerobj(struct gl_context *ctx, GLuint name);
 extern struct gl_sampler_object *
 _mesa_new_sampler_object(struct gl_context *ctx, GLuint name);
--
1.7.8.6
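The query logic in this patch boils down to "report the bound name only if that name is still registered". A minimal stand-alone sketch of that check follows; the name table, `MAX_NAMES`, and the function names are all hypothetical and much simpler than Mesa's shared hash table.

```c
#include <stddef.h>

#define MAX_NAMES 8  /* hypothetical table size */

struct sampler { unsigned name; };

/* Shared name table: a slot becomes NULL once its name has been deleted. */
struct shared { struct sampler *by_name[MAX_NAMES]; };

static struct sampler *lookup_sampler(const struct shared *sh, unsigned name)
{
    if (name == 0 || name >= MAX_NAMES)
        return NULL;
    return sh->by_name[name];
}

/* Report the bound sampler's name only while that name is still
 * registered; otherwise report 0, as if nothing were bound. */
static unsigned get_sampler_binding(const struct shared *sh,
                                    const struct sampler *bound)
{
    if (bound && lookup_sampler(sh, bound->name))
        return bound->name;
    return 0;
}
```

Whether a query should behave this way at all is the open question in the replies to this patch; the sketch only shows what the proposed check does.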
Re: [Mesa-dev] [PATCH] Fix glGetInteger*(GL_SAMPLER_BINDING).
On 03/06/13 18:36, Brian Paul wrote:
> On 03/06/2013 11:23 AM, Alan Hourihane wrote:
> > If the sampler object has been deleted on another context, an alternative context may reference the old sampler. So ensure the sampler object still exists.
>
> Alan, is this specified somewhere in a spec? I can't find a description of this behaviour and we don't do this for texture objects or buffer objects, etc.

I can't see it specifically mentioned, apart from the note that when deleting the sampler object it should be unbound from the texture unit, and I did consider whether to do this for buffer texture objects too. But getting the GL_SAMPLER_BINDING id when switching contexts and attempting to re-bind with glBindSampler() gives a GL error, which seems wrong to me. I checked with the NVIDIA driver and no GL error is generated.

Alan.
[Mesa-dev] [PATCH] vbo: fix crash found with shared display lists
This fixes a crash when a display list is created in one context but executed from a second one. The vbo_save_context::vertex_store member will be NULL if we never created a display list with the context. Just check for that before dereferencing the pointer.

Fixes http://bugzilla.redhat.com/show_bug.cgi?id=918661

Note: This is a candidate for the stable branches.
---
 src/mesa/vbo/vbo_save_draw.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/src/mesa/vbo/vbo_save_draw.c b/src/mesa/vbo/vbo_save_draw.c
index efb386e..f5b5c41 100644
--- a/src/mesa/vbo/vbo_save_draw.c
+++ b/src/mesa/vbo/vbo_save_draw.c
@@ -253,7 +253,7 @@ vbo_save_playback_vertex_list(struct gl_context *ctx, void *data)
    struct vbo_save_context *save = &vbo_context(ctx)->save;
    GLboolean remap_vertex_store = GL_FALSE;

-   if (save->vertex_store->buffer) {
+   if (save->vertex_store && save->vertex_store->buffer) {
       /* The vertex store is currently mapped but we're about to replay
        * a display list.  This can happen when a nested display list is
        * being build with GL_COMPILE_AND_EXECUTE.
--
1.7.3.4
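The one-line fix relies on the short-circuit behaviour of `&&` in C. A minimal sketch with hypothetical stand-in structs (not the real vbo types) shows the guarded check in isolation:

```c
#include <stddef.h>

/* Hypothetical stand-ins for the structures named in the patch. */
struct vertex_store { void *buffer; };
struct save_context { struct vertex_store *vertex_store; };  /* NULL until a list is compiled in this context */

/* Returns 1 if the store exists and is currently mapped, 0 otherwise. */
static int store_is_mapped(const struct save_context *save)
{
    /* && short-circuits: ->buffer is only read when vertex_store is non-NULL. */
    return save->vertex_store && save->vertex_store->buffer;
}
```

A context that only replays lists compiled elsewhere would take the `vertex_store == NULL` path and simply report "not mapped" instead of dereferencing a null pointer.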
[Mesa-dev] [Bug 58812] Infinite loop in ./configure && make if automake is absent
https://bugs.freedesktop.org/show_bug.cgi?id=58812

Matt Turner matts...@gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |REOPENED

--- Comment #3 from Matt Turner matts...@gmail.com ---
Still not sure what causes this, but I saw another report of it on IRC today. The reporter said that installing libtool (which he didn't previously have) and regenerating the autoconf files (via ./autogen.sh) fixed the problem for him.

--
You are receiving this mail because:
You are the assignee for the bug.
Re: [Mesa-dev] [PATCH] GLSL: fix too eager constant variable optimization
> - The patch doesn't compile cleanly on master. In particular, it looks like it was made using a version of the code prior to commit 42a29d8 (glsl: Eliminate ambiguity between function ins/outs and shader ins/outs).

Whoops, indeed. I made it in my own modified Mesa fork (GLSL Optimizer, https://github.com/aras-p/glsl-optimizer) that's not up to date with latest master.

> - It seems kludgy to add a visitor for ir_function_signature that loops through all the parameters, since the default hierarchical visitor for ir_function_signature already does that. Why not just modify the ir_variable visitor so that it increments entry->assignment_count when it visits a variable whose mode is ir_var_function_in or ir_var_function_inout? (Note that the ability to distinguish between function in variables and shader in variables was added in commit 42a29d8, the commit I mentioned above).

Good point, that should be a better approach.

> Although I agree that opt_constant_variable has problems and could be improved, it seems like I must be missing some crucial piece of information here, since I can't reproduce the bug and I can't understand why your patch would have any effect. I've made a shader_runner test to try to demonstrate the problem (attached), and it works fine on mesa master as well as the 9.1, 9.0, 8.0, and 7.11 branches. Can you help me understand what I'm missing?

Yeah, perhaps it can't ever manifest in Mesa's context. In my fork, I'm using Mesa's GLSL compiler optimization passes to do a GLSL-to-GLSL compiler (sounds weird? it kind of is, but works around many mobile platform drivers being really, really weak at GLSL optimization). I do optimization passes very similarly to Mesa, but slightly differently in places. But since it's not clear whether my improvements bring any benefit in the context of Mesa, maybe my patch should just be ignored. And sorry for the wasted time then.

> Finally, in the future would you mind posting patches to the mailing list as inline text rather than attachments? (git send-email is a convenient way to do this.)

Will do.

--
Aras Pranckevičius
work: http://unity3d.com
home: http://aras-p.info
Re: [Mesa-dev] [PATCH 1/7] radeonsi: switch to v*i8 for resources and samplers
On Tue, Mar 05, 2013 at 07:51:02PM +0100, Tom Stellard wrote: On Tue, Mar 05, 2013 at 03:27:19PM +0100, Christian König wrote: From: Christian König christian.koe...@amd.com Signed-off-by: Christian König christian.koe...@amd.com This series has my r-b, but I'd like to test it on r600, before you push it. Hi Christian, There is one issue I had to fix, so please add the attached patch to the series when you commit. There is also a related change to the LLVM patchset, but I'll send that fix to the llvm list. With these two fixes, both series are safe to commit. Also, since this change will make mesa incompatible with older LLVM versions, could you commit the LLVM changes first and then put the revision number in a file called LLVM_REVISION.txt in drivers/radeon. I think keeping track of this will help us if we need to bisect. Thanks, Tom -Tom --- src/gallium/drivers/radeonsi/radeonsi_shader.c | 31 +--- 1 file changed, 12 insertions(+), 19 deletions(-) diff --git a/src/gallium/drivers/radeonsi/radeonsi_shader.c b/src/gallium/drivers/radeonsi/radeonsi_shader.c index 7922928..958d3a3 100644 --- a/src/gallium/drivers/radeonsi/radeonsi_shader.c +++ b/src/gallium/drivers/radeonsi/radeonsi_shader.c @@ -84,10 +84,9 @@ static struct si_shader_context * si_shader_context( enum sgpr_type { SGPR_CONST_PTR_F32, - SGPR_CONST_PTR_V4I32, - SGPR_CONST_PTR_V8I32, - SGPR_I32, - SGPR_I64 + SGPR_CONST_PTR_V16I8, + SGPR_CONST_PTR_V32I8, + SGPR_I32 }; /** @@ -149,22 +148,17 @@ static LLVMValueRef use_sgpr( ret_type = LLVMInt32TypeInContext(gallivm-context); break; - case SGPR_I64: + case SGPR_CONST_PTR_V16I8: assert(sgpr % 2 == 0); - ret_type= LLVMInt64TypeInContext(gallivm-context); - break; - - case SGPR_CONST_PTR_V4I32: - assert(sgpr % 2 == 0); - ret_type = LLVMInt32TypeInContext(gallivm-context); - ret_type = LLVMVectorType(ret_type, 4); + ret_type = LLVMInt8TypeInContext(gallivm-context); + ret_type = LLVMVectorType(ret_type, 16); ret_type = LLVMPointerType(ret_type, 
CONST_ADDR_SPACE); break; - case SGPR_CONST_PTR_V8I32: + case SGPR_CONST_PTR_V32I8: assert(sgpr % 2 == 0); - ret_type = LLVMInt32TypeInContext(gallivm-context); - ret_type = LLVMVectorType(ret_type, 8); + ret_type = LLVMInt8TypeInContext(gallivm-context); + ret_type = LLVMVectorType(ret_type, 32); ret_type = LLVMPointerType(ret_type, CONST_ADDR_SPACE); break; @@ -197,7 +191,7 @@ static void declare_input_vs( unsigned chan; /* Load the T list */ - t_list_ptr = use_sgpr(base-gallivm, SGPR_CONST_PTR_V4I32, SI_SGPR_VERTEX_BUFFER); + t_list_ptr = use_sgpr(base-gallivm, SGPR_CONST_PTR_V16I8, SI_SGPR_VERTEX_BUFFER); t_offset = lp_build_const_int32(base-gallivm, input_index); @@ -478,7 +472,6 @@ static void si_llvm_init_export_args(struct lp_build_tgsi_context *bld_base, int cbuf = target - V_008DFC_SQ_EXP_MRT; if (cbuf = 0 cbuf 8) { - struct r600_context *rctx = si_shader_ctx-rctx; compressed = (si_shader_ctx-key.export_16bpc cbuf) 0x1; if (compressed) @@ -945,14 +938,14 @@ static void tex_fetch_args( emit_data-args[1] = lp_build_gather_values(gallivm, address, count); /* Resource */ - ptr = use_sgpr(bld_base-base.gallivm, SGPR_CONST_PTR_V8I32, SI_SGPR_RESOURCE); + ptr = use_sgpr(bld_base-base.gallivm, SGPR_CONST_PTR_V32I8, SI_SGPR_RESOURCE); offset = lp_build_const_int32(bld_base-base.gallivm, emit_data-inst-Src[1].Register.Index); emit_data-args[2] = build_indexed_load(bld_base-base.gallivm, ptr, offset); /* Sampler */ - ptr = use_sgpr(bld_base-base.gallivm, SGPR_CONST_PTR_V4I32, SI_SGPR_SAMPLER); + ptr = use_sgpr(bld_base-base.gallivm, SGPR_CONST_PTR_V16I8, SI_SGPR_SAMPLER); offset = lp_build_const_int32(bld_base-base.gallivm, emit_data-inst-Src[1].Register.Index); emit_data-args[3] = build_indexed_load(bld_base-base.gallivm, -- 1.7.9.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev From 8839de966658f7e31d739c5e6ba430f976984451 Mon Sep 17 00:00:00 2001 From: Tom Stellard thomas.stell...@amd.com Date: Wed, 
6 Mar 2013 13:34:56 -0500 Subject: [PATCH] r600g/llvm: Update CONSTANT_BUFFER address space definition to match recent LLVM changes. --- src/gallium/drivers/r600/r600_llvm.c |2 +- 1 files changed, 1
[Mesa-dev] [Bug 61919] New: make fails without C_INCLUDE_PATH: xlib_sw_winsys.c:49:33: fatal error: X11/extensions/XShm.h: No such file or directory
https://bugs.freedesktop.org/show_bug.cgi?id=61919

          Priority: medium
            Bug ID: 61919
          Assignee: mesa-dev@lists.freedesktop.org
           Summary: make fails without C_INCLUDE_PATH: xlib_sw_winsys.c:49:33: fatal error: X11/extensions/XShm.h: No such file or directory
          Severity: normal
    Classification: Unclassified
                OS: All
          Reporter: dar...@chaosreigns.com
          Hardware: Other
            Status: NEW
           Version: unspecified
         Component: Other
           Product: Mesa

Making all in xlib
make[4]: Entering directory `/home/darxus/source/mesa/src/gallium/winsys/sw/xlib'
  CC     xlib_sw_winsys.lo
xlib_sw_winsys.c:49:33: fatal error: X11/extensions/XShm.h: No such file or directory

Built libX11 from source, installed into $installdir. Building mesa with:

./autogen.sh --prefix=$installdir --enable-gles2 --disable-gallium-egl --with-egl-platforms=wayland,x11,drm --enable-gbm --enable-shared-glapi --with-gallium-drivers=r300,r600,swrast,nouveau,svga

I have these variables set, which I believe should be sufficient for mesa to find libX11 in $installdir:

PKG_CONFIG_PATH=$installdir/lib/pkgconfig/:$installdir/share/pkgconfig/
ACLOCAL=aclocal -I $installdir/share/aclocal
LIBRARY_PATH=$installdir/lib
PATH=$installdir/bin:$PATH

With this set, it does not fail, but I believe this should not be required:

C_INCLUDE_PATH=$installdir/include

35189d768bf80fdedbb6e70f49215cc8b734f343 is the first bad commit
commit 35189d768bf80fdedbb6e70f49215cc8b734f343
Author: Matt Turner matts...@gmail.com
Date:   Mon Mar 4 10:23:54 2013 -0800

    configure.ac: Don't check for X11 unconditionally.

    X11 is already checked conditionally below. Fixes OSMesa-only
    configurations to not require X11.

    Note: This is a candidate for the 9.1 branch.
    Reviewed-by: Brian Paul bri...@vmware.com

:100644 100644 785259554bbb833bc6d03c50414b8262bc553341 1b13d06be3ac263088b5d2d15923c383453d16b6 M configure.ac
bisect run success
[Mesa-dev] [Bug 61907] Indirect rendering of multi-texture vertex arrays broken
https://bugs.freedesktop.org/show_bug.cgi?id=61907

--- Comment #1 from Colin McDonald cjmmail10...@yahoo.co.uk ---

I've realised that I was so focused on describing my code changes that I omitted to adequately describe the problem symptoms, other than to say that multi-texturing is broken.

OpenSceneGraph makes extensive use of glDrawArrays and glDrawElements. Arrays of texture coordinates are set up using glTexCoordPointer, for the active texture unit selected with glClientActiveTexture. For texture units > 0, indirect rendering to a remote display results in those texture coordinates being either ignored, or corrupting the output of the unit 0 texture.

This is because of the __glXInitVertexArrayState initialisation problem, which fails to get GL_MAX_TEXTURE_UNITS, and consequently rejects all texture units > 0.

The incorrect op codes and the wrong output order for double texture coords have the potential to cause protocol errors and/or crash the remote display server, but we don't actually get that because multi-texture output is ignored due to the first problem. It is only when the first problem is fixed that the others become apparent.
[Mesa-dev] [Bug 61907] Indirect rendering of multi-texture vertex arrays broken
https://bugs.freedesktop.org/show_bug.cgi?id=61907

--- Comment #2 from Brian Paul bri...@vmware.com ---

Just a quick note: it would be great to have a piglit test that exercises this issue (if there isn't one already).
[Mesa-dev] [PATCH] llvmpipe: remove the power of two sizeof(struct cmd_block) assertion
It fails on 32-bit systems (I only tested on 64-bit). Power of two size isn't required, so just remove the assertion.
---
 src/gallium/drivers/llvmpipe/lp_scene.c |    7 ---
 1 files changed, 0 insertions(+), 7 deletions(-)

diff --git a/src/gallium/drivers/llvmpipe/lp_scene.c b/src/gallium/drivers/llvmpipe/lp_scene.c
index dd0943e..a0912eb 100644
--- a/src/gallium/drivers/llvmpipe/lp_scene.c
+++ b/src/gallium/drivers/llvmpipe/lp_scene.c
@@ -76,13 +76,6 @@ lp_scene_create( struct pipe_context *pipe )
       assert(maxCommandBytes < LP_SCENE_MAX_SIZE);
       /* We'll also need space for at least one other data block */
       assert(maxCommandPlusData <= LP_SCENE_MAX_SIZE);
-
-      /* Ideally, the size of a cmd_block object will be a power of two
-       * in order to avoid wasting space when we allocation them from
-       * data blocks (which are power of two also).
-       */
-      assert(sizeof(struct cmd_block) ==
-             util_next_power_of_two(sizeof(struct cmd_block)));
    }
 #endif
 
-- 
1.7.3.4
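For readers wondering why the removed assertion can hold on one architecture and fail on another, here is a minimal sketch. The struct below is a made-up stand-in (demo_cmd_block is not the real llvmpipe cmd_block layout) and the demo_* helpers are hypothetical, not Mesa's util_next_power_of_two(). It only shows that a struct containing pointers changes size with the pointer width, so a "size is a power of two" check is architecture dependent.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a command block; NOT the real llvmpipe layout.
 * Its size depends on pointer width, so it differs between 32- and 64-bit. */
struct demo_cmd_block {
   uint8_t cmd[8];
   void *arg[8];
   unsigned count;
   struct demo_cmd_block *next;
};

/* Returns 1 if n is a power of two, 0 otherwise (0 is not a power of two). */
int demo_is_power_of_two(size_t n)
{
   return n != 0 && (n & (n - 1)) == 0;
}

/* Rounds n up to the next power of two; returns n if it already is one
 * (and 1 for n == 0). */
size_t demo_next_power_of_two(size_t n)
{
   size_t p = 1;
   assert(n <= SIZE_MAX / 2 + 1);   /* larger values would overflow p */
   while (p < n)
      p <<= 1;
   return p;
}

/* Bytes that would be wasted per block if each block were padded up to a
 * power of two. */
size_t demo_wasted_bytes(void)
{
   size_t size = sizeof(struct demo_cmd_block);
   return demo_next_power_of_two(size) - size;
}
```

If the size really had to be a power of two, a compile-time check would catch it earlier than a runtime assert, but as the patch says, nothing depends on it.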
Re: [Mesa-dev] [PATCH] Fix glGetInteger*(GL_SAMPLER_BINDING).
On 03/06/2013 11:46 AM, Alan Hourihane wrote:
> On 03/06/13 18:36, Brian Paul wrote:
>> On 03/06/2013 11:23 AM, Alan Hourihane wrote:
>>> If the sampler object has been deleted on another context, an
>>> alternative context may reference the old sampler. So ensure the
>>> sampler object still exists.
>>
>> Alan, is this specified somewhere in a spec? I can't find a
>> description of this behaviour and we don't do this for texture
>> objects or buffer objects, etc.
>
> I can't see it specifically mentioned, apart from the note that when
> deleting the sampler object it should be unbound from the texture unit,
> and I did consider the case of buffer texture objects whether to do this
> there too.
>
> But getting the GL_SAMPLER_BINDING id when switching contexts and
> attempting to re-bind with glBindSampler() gives a GL error, which seems
> wrong to me. I checked with the NVIDIA driver and no GL error is
> generated.

OK, sounds good.

BTW, just a few minor comments on your patch:

The subject of the patch should start with "mesa: fix glGetInteger..."

And, candidate for the stable branches?

>>> If the sampler object has been deleted on another context, an
>>> alternative context may reference the old sampler. So ensure the
>>> sampler object still exists.
>>>
>>> Signed-off-by: Alan Hourihane al...@vmware.com
>>> ---
>>>  src/mesa/main/get.c        |   12 +++-
>>>  src/mesa/main/samplerobj.c |    2 +-
>>>  src/mesa/main/samplerobj.h |    2 ++
>>>  3 files changed, 14 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
>>> index 2399f9c..c627dd6 100644
>>> --- a/src/mesa/main/get.c
>>> +++ b/src/mesa/main/get.c
>>> @@ -34,6 +34,7 @@
>>>  #include "state.h"
>>>  #include "texcompress.h"
>>>  #include "framebuffer.h"
>>> +#include "samplerobj.h"
>>>
>>>  /* This is a table driven implemetation of the glGet*v() functions.
>>>   * The basic idea is that most getters just look up an int somewhere
>>> @@ -827,7 +828,16 @@ find_custom_value(struct gl_context *ctx, const struct value_desc *d, union valu
>>>        {
>>>           struct gl_sampler_object *samp =
>>>              ctx->Texture.Unit[ctx->Texture.CurrentUnit].Sampler;
>>> -         v->value_int = samp ? samp->Name : 0;
>>> +
>>> +         /*
>>> +          * The sampler object may have been deleted on another context,
>>> +          * so we try to lookup the sampler object before returning it's Name.

its

>>> +          */
>>> +         if (samp && _mesa_lookup_samplerobj(ctx, samp->Name)) {
>>> +            v->value_int = samp->Name;
>>> +         } else {
>>> +            v->value_int = 0;
>>> +         }

Looks like a mix of spaces and tabs for indentation. Can you just use spaces?

>>>        }
>>>        break;
>>>     /* GL_ARB_uniform_buffer_object */
>>> diff --git a/src/mesa/main/samplerobj.c b/src/mesa/main/samplerobj.c
>>> index 4664cc3..5cff329 100644
>>> --- a/src/mesa/main/samplerobj.c
>>> +++ b/src/mesa/main/samplerobj.c
>>> @@ -40,7 +40,7 @@
>>>  #include "main/samplerobj.h"
>>>
>>>
>>> -static struct gl_sampler_object *
>>> +struct gl_sampler_object *
>>>  _mesa_lookup_samplerobj(struct gl_context *ctx, GLuint name)
>>>  {
>>>     if (name == 0)
>>> diff --git a/src/mesa/main/samplerobj.h b/src/mesa/main/samplerobj.h
>>> index 3114257..69e3899 100644
>>> --- a/src/mesa/main/samplerobj.h
>>> +++ b/src/mesa/main/samplerobj.h
>>> @@ -62,6 +62,8 @@ _mesa_reference_sampler_object(struct gl_context *ctx,
>>>        _mesa_reference_sampler_object_(ctx, ptr, samp);
>>>  }
>>>
>>> +extern struct gl_sampler_object *
>>> +_mesa_lookup_samplerobj(struct gl_context *ctx, GLuint name);
>>>
>>>  extern struct gl_sampler_object *
>>>  _mesa_new_sampler_object(struct gl_context *ctx, GLuint name);

Reviewed-by: Brian Paul bri...@vmware.com
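To make the idea in the patch easier to follow, here is a rough standalone sketch of "look the name up in the shared table before reporting the binding". Everything named demo_* is invented for illustration. The real code uses struct gl_context and _mesa_lookup_samplerobj(), whose internals are not shown in this thread, so treat this as an assumption about the pattern rather than how Mesa implements it.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_SAMPLERS 8

/* Hypothetical table of live sampler names shared between contexts
 * (0 marks a free slot). */
struct demo_shared {
   unsigned live[DEMO_MAX_SAMPLERS];
};

struct demo_sampler {
   unsigned name;
};

struct demo_context {
   struct demo_shared *shared;
   struct demo_sampler *bound;   /* may be stale after a delete elsewhere */
};

/* Returns 1 if `name` is still present in the shared table. */
int demo_lookup(const struct demo_shared *sh, unsigned name)
{
   if (name == 0)
      return 0;
   for (size_t i = 0; i < DEMO_MAX_SAMPLERS; i++) {
      if (sh->live[i] == name)
         return 1;
   }
   return 0;
}

/* Reports the bound sampler name, or 0 if the object no longer exists. */
unsigned demo_get_sampler_binding(const struct demo_context *ctx)
{
   assert(ctx != NULL && ctx->shared != NULL);
   if (ctx->bound && demo_lookup(ctx->shared, ctx->bound->name))
      return ctx->bound->name;
   return 0;
}
```

With this shape, a context that still holds a pointer to a sampler deleted elsewhere reports 0 instead of a name that a later bind call would reject.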
[Mesa-dev] [PATCH] util: fix wrong shift val for a cpu feature detect
While I was trying to understand how CPU feature detection works, I noticed that everything was consistent except for the TSC (Time Stamp Counter). I found many pages on the net that suggest that the right value for the corresponding shift would be 4 and not 8. Even the commented bitfields that represent the bit flag for each CPU feature seem to suggest that.

From 17adc106ce2718343dd17750c928137441ef3086 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Maxence=20Le=20Dor=C3=A9?= maxence.led...@gmail.com
Date: Thu, 7 Mar 2013 02:30:03 +0100
Subject: [PATCH] util: fix wrong shift val for a cpu feature detect

---
 src/gallium/auxiliary/util/u_cpu_detect.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/util/u_cpu_detect.c b/src/gallium/auxiliary/util/u_cpu_detect.c
index d7f0be4..0328051 100644
--- a/src/gallium/auxiliary/util/u_cpu_detect.c
+++ b/src/gallium/auxiliary/util/u_cpu_detect.c
@@ -270,7 +270,7 @@ util_cpu_detect(void)
             util_cpu_caps.x86_cpu_type = 8 + ((regs2[0] >> 20) & 255); /* use extended family (P4, IA64) */
 
          /* general feature flags */
-         util_cpu_caps.has_tsc  = (regs2[3] >>  8) & 1; /* 0x0000010 */
+         util_cpu_caps.has_tsc  = (regs2[3] >>  4) & 1; /* 0x0000010 */
          util_cpu_caps.has_mmx  = (regs2[3] >> 23) & 1; /* 0x0800000 */
          util_cpu_caps.has_sse  = (regs2[3] >> 25) & 1; /* 0x2000000 */
          util_cpu_caps.has_sse2 = (regs2[3] >> 26) & 1; /* 0x4000000 */
-- 
1.7.9.5
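As a quick sanity check of the shift values, a small sketch of the bit test follows. The helper name cpuid_has_bit and the enum are made up for this example and are not part of u_cpu_detect.c. The bit positions are the standard CPUID leaf 1 EDX flags (TSC is bit 4, MMX 23, SSE 25, SSE2 26), and the mask in the source comment for TSC equals 1 << 4.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions in EDX for CPUID leaf 1 (standard feature flags). */
enum {
   CPUID_EDX_TSC  = 4,
   CPUID_EDX_MMX  = 23,
   CPUID_EDX_SSE  = 25,
   CPUID_EDX_SSE2 = 26
};

/* Returns 1 if bit `bit` of `reg` is set, 0 otherwise. */
unsigned cpuid_has_bit(uint32_t reg, unsigned bit)
{
   assert(bit < 32);
   return (reg >> bit) & 1u;
}
```

For an EDX value of 0x00000010, cpuid_has_bit(edx, CPUID_EDX_TSC) yields 1 while a shift of 8 yields 0, which is the mismatch the patch fixes. Bit 8 of EDX reports a different feature (CMPXCHG8B), so the old code was reading an unrelated flag.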
Re: [Mesa-dev] [PATCH] util: fix wrong shift val for a cpu feature detect
On Wed, Mar 6, 2013 at 5:39 PM, Maxence Le Doré maxence.led...@gmail.com wrote:
> While I was trying to understand how CPU feature detection works, I
> noticed that everything was consistent except for the TSC (Time Stamp
> Counter). I found many pages on the net that suggest that the right
> value for the corresponding shift would be 4 and not 8. Even the
> commented bitfields that represent the bit flag for each CPU feature
> seem to suggest that.
>
> From 17adc106ce2718343dd17750c928137441ef3086 Mon Sep 17 00:00:00 2001
> From: =?UTF-8?q?Maxence=20Le=20Dor=C3=A9?= maxence.led...@gmail.com
> Date: Thu, 7 Mar 2013 02:30:03 +0100
> Subject: [PATCH] util: fix wrong shift val for a cpu feature detect
>
> ---
>  src/gallium/auxiliary/util/u_cpu_detect.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/gallium/auxiliary/util/u_cpu_detect.c b/src/gallium/auxiliary/util/u_cpu_detect.c
> index d7f0be4..0328051 100644
> --- a/src/gallium/auxiliary/util/u_cpu_detect.c
> +++ b/src/gallium/auxiliary/util/u_cpu_detect.c
> @@ -270,7 +270,7 @@ util_cpu_detect(void)
>              util_cpu_caps.x86_cpu_type = 8 + ((regs2[0] >> 20) & 255); /* use extended family (P4, IA64) */
>
>           /* general feature flags */
> -         util_cpu_caps.has_tsc  = (regs2[3] >>  8) & 1; /* 0x0000010 */
> +         util_cpu_caps.has_tsc  = (regs2[3] >>  4) & 1; /* 0x0000010 */
>           util_cpu_caps.has_mmx  = (regs2[3] >> 23) & 1; /* 0x0800000 */
>           util_cpu_caps.has_sse  = (regs2[3] >> 25) & 1; /* 0x2000000 */
>           util_cpu_caps.has_sse2 = (regs2[3] >> 26) & 1; /* 0x4000000 */
> --
> 1.7.9.5

Indeed, this matches Wikipedia [1] and even the comment on the same line. :)

Reviewed-by: Matt Turner matts...@gmail.com

[1] http://en.wikipedia.org/wiki/CPUID#EAX.3D1:_Processor_Info_and_Feature_Bits
[Mesa-dev] [Bug 61933] New: Weston EGL clients hang with software rendering upon startup
https://bugs.freedesktop.org/show_bug.cgi?id=61933

Priority: medium
Bug ID: 61933
Assignee: mesa-dev@lists.freedesktop.org
Summary: Weston EGL clients hang with software rendering upon startup
Severity: normal
Classification: Unclassified
OS: All
Reporter: nerdopol...@verizon.net
Hardware: Other
Status: NEW
Version: unspecified
Component: Mesa core
Product: Mesa

Created attachment 76068
  --> https://bugs.freedesktop.org/attachment.cgi?id=76068&action=edit
gdb output of weston client

This seems to be an issue that affects me in mesa master. I am able to apply this patch http://cgit.freedesktop.org/mesa/mesa/commit/?id=6dbe94c12cd1b3b912a7083055178e0dfd7372af on 9.0, and weston clients then work with 9.0 with software rendering on virtualbox.

However, this patch is already in master, so there must be a regression between 9.0 and master?
[Mesa-dev] [PATCH] intel: Remove some unused debug flags.
I was looking at the list to see what might be interesting to document for application developers, and it turns out some are completely dead.
---
 src/mesa/drivers/dri/intel/intel_context.c |    4 ----
 src/mesa/drivers/dri/intel/intel_context.h |    4 ----
 2 files changed, 8 deletions(-)

diff --git a/src/mesa/drivers/dri/intel/intel_context.c b/src/mesa/drivers/dri/intel/intel_context.c
index 9e508f7..7651b46 100644
--- a/src/mesa/drivers/dri/intel/intel_context.c
+++ b/src/mesa/drivers/dri/intel/intel_context.c
@@ -470,7 +470,6 @@ static const struct dri_debug_control debug_control[] = {
    { "mip",   DEBUG_MIPTREE},
    { "fall",  DEBUG_PERF},
    { "perf",  DEBUG_PERF},
-   { "verb",  DEBUG_VERBOSE},
    { "bat",   DEBUG_BATCH},
    { "pix",   DEBUG_PIXEL},
    { "buf",   DEBUG_BUFMGR},
@@ -483,10 +482,7 @@ static const struct dri_debug_control debug_control[] = {
    { "vert",  DEBUG_VERTS },
    { "dri",   DEBUG_DRI },
    { "sf",    DEBUG_SF },
-   { "san",   DEBUG_SANITY },
-   { "sleep", DEBUG_SLEEP },
    { "stats", DEBUG_STATS },
-   { "tile",  DEBUG_TILE },
    { "wm",    DEBUG_WM },
    { "urb",   DEBUG_URB },
    { "vs",    DEBUG_VS },
diff --git a/src/mesa/drivers/dri/intel/intel_context.h b/src/mesa/drivers/dri/intel/intel_context.h
index 3d2d3ef..5a49603 100644
--- a/src/mesa/drivers/dri/intel/intel_context.h
+++ b/src/mesa/drivers/dri/intel/intel_context.h
@@ -420,7 +420,6 @@ extern int INTEL_DEBUG;
 #define DEBUG_BLIT      0x8
 #define DEBUG_MIPTREE   0x10
 #define DEBUG_PERF      0x20
-#define DEBUG_VERBOSE   0x40
 #define DEBUG_BATCH     0x80
 #define DEBUG_PIXEL     0x100
 #define DEBUG_BUFMGR    0x200
@@ -432,10 +431,7 @@ extern int INTEL_DEBUG;
 #define DEBUG_VERTS     0x8000
 #define DEBUG_DRI       0x10000
 #define DEBUG_SF        0x20000
-#define DEBUG_SANITY    0x40000
-#define DEBUG_SLEEP     0x80000
 #define DEBUG_STATS     0x100000
-#define DEBUG_TILE      0x200000
 #define DEBUG_WM        0x400000
 #define DEBUG_URB       0x800000
 #define DEBUG_VS        0x1000000
-- 
1.7.10.4
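For anyone unfamiliar with how a name/flag table like debug_control[] is typically consumed, here is a minimal sketch that turns a comma-separated string into a bitmask. demo_parse_debug and the DEMO_* names are hypothetical. The actual DRI parsing helper is not shown in this thread and may match names differently, so this is only an assumption about the general shape.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative flag values; the DEMO_ prefix marks them as made up here. */
#define DEMO_DEBUG_PERF  0x20
#define DEMO_DEBUG_BATCH 0x80
#define DEMO_DEBUG_STATS 0x100000

struct demo_debug_control {
   const char *name;
   unsigned flag;
};

/* Several names may map to the same flag, as "fall" and "perf" do above. */
static const struct demo_debug_control demo_controls[] = {
   { "fall",  DEMO_DEBUG_PERF },
   { "perf",  DEMO_DEBUG_PERF },
   { "bat",   DEMO_DEBUG_BATCH },
   { "stats", DEMO_DEBUG_STATS },
   { NULL, 0 }
};

/* Parses a comma-separated list such as "perf,bat" into a flag mask.
 * Unknown names are silently ignored. */
unsigned demo_parse_debug(const char *str)
{
   unsigned flags = 0;
   assert(str != NULL);
   while (*str) {
      size_t len = strcspn(str, ",");   /* length of the current token */
      for (const struct demo_debug_control *c = demo_controls; c->name; c++) {
         if (strlen(c->name) == len && strncmp(str, c->name, len) == 0)
            flags |= c->flag;
      }
      str += len;
      if (*str == ',')
         str++;
   }
   return flags;
}
```

A side effect of removing a table entry, as the patch does, is that a string naming the removed flag simply sets nothing in a parser of this kind.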
[Mesa-dev] [PATCH 2/5] i965/fs: Add a comment about an implementation detail.
I was going to fix the code above like the previous commit, but we already had that covered (otherwise all our uniform access would have been broken, unlike just pull constants).
---
 src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp |    4 ++++
 1 file changed, 4 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp b/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp
index d1147f5..b8936dc 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp
@@ -310,6 +310,10 @@ fs_visitor::setup_payload_interference(struct ra_graph *g,
        * node.
        */
       for (int j = 0; j < this->virtual_grf_count; j++) {
+         /* Note that we use a <= comparison, unlike virtual_grf_interferes(),
+          * in order to not have to worry about the uniform issue described in
+          * calculate_live_intervals().
+          */
          if (this->virtual_grf_def[j] <= payload_last_use_ip[i] ||
              this->virtual_grf_use[j] <= payload_last_use_ip[i]) {
             ra_add_node_interference(g, first_payload_node + i, j);
-- 
1.7.10.4
[Mesa-dev] [PATCH 1/5] i965/fs: Fix register allocation for uniform pull constants in 16-wide.
We were allowing a compressed instruction to write a register that contained the last use of a uniform pull constant (either UBO load or push constant spillover), so it would get half its values smashed.

Since we need to see the actual instruction to decide this, move the pre-gen6 pixel_x/y logic here, which should improve the performance of register allocation since virtual_grf_interferes() is called more than once per instruction.

NOTE: This is a candidate for the stable branches.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61317
---
 .../drivers/dri/i965/brw_fs_live_variables.cpp |   54 +++-
 1 file changed, 31 insertions(+), 23 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp b/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp
index db8f397..4c7991d 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp
@@ -190,6 +190,37 @@ fs_visitor::calculate_live_intervals()
             int reg = inst->src[i].reg;
 
             use[reg] = ip;
+
+            /* In most cases, a register can be written over safely by the
+             * same instruction that is its last use. For a single
+             * instruction, the sources are dereferenced before writing of the
+             * destination starts (naturally). This gets more complicated for
+             * simd16, because the instruction:
+             *
+             * mov(16) g4<1>F g4<8,8,1>F g6<8,8,1>F
+             *
+             * is actually decoded in hardware as:
+             *
+             * mov(8) g4<1>F g4<8,8,1>F g6<8,8,1>F
+             * mov(8) g5<1>F g5<8,8,1>F g7<8,8,1>F
+             *
+             * Which is safe. However, if we have uniform accesses
+             * happening, we get into trouble:
+             *
+             * mov(8) g4<1>F g4<0,1,0>F g6<8,8,1>F
+             * mov(8) g5<1>F g4<0,1,0>F g7<8,8,1>F
+             *
+             * Now our destination for the first instruction overwrote the
+             * second instruction's src0, and we get garbage for those 8
+             * pixels. There's a similar issue for the pre-gen6
+             * pixel_x/pixel_y, which are registers of 16-bit values and thus
+             * would get stomped by the first decode as well.
+             */
+            if (dispatch_width == 16 && (inst->src[i].smear ||
+                                         (this->pixel_x.reg == reg ||
+                                          this->pixel_y.reg == reg))) {
+               use[reg]++;
+            }
          }
       }
 
@@ -264,28 +295,5 @@ fs_visitor::virtual_grf_interferes(int a, int b)
    int start = MAX2(a_def, b_def);
    int end = MIN2(a_use, b_use);
 
-   /* If the register is used to store 16 values of less than float
-    * size (only the case for pixel_[xy]), then we can't allocate
-    * another dword-sized thing to that register that would be used in
-    * the same instruction. This is because when the GPU decodes (for
-    * example):
-    *
-    * (declare (in ) vec4 gl_FragCoord@0x97766a0)
-    * add(16) g6<1>F g6<8,8,1>UW 0.5F { align1 compr };
-    *
-    * it's actually processed as:
-    * add(8) g6<1>F g6<8,8,1>UW 0.5F { align1 };
-    * add(8) g7<1>F g6.8<8,8,1>UW 0.5F { align1 sechalf };
-    *
-    * so our second half values in g6 got overwritten in the first
-    * half.
-    */
-   if (dispatch_width == 16 && (this->pixel_x.reg == a ||
-                                this->pixel_x.reg == b ||
-                                this->pixel_y.reg == a ||
-                                this->pixel_y.reg == b)) {
-      return start <= end;
-   }
-
    return start < end;
 }
-- 
1.7.10.4
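The effect of bumping the last use by one can be seen in a toy model of the interference test. This is only a sketch under simplifying assumptions: demo_interval and the demo_* functions are invented here, and the real code tracks def/use per virtual GRF inside fs_visitor rather than in standalone structs.

```c
#include <assert.h>

/* Hypothetical live interval: first definition and last use, both given as
 * instruction indices. */
struct demo_interval {
   int def;
   int use;
};

static int demo_max(int a, int b) { return a > b ? a : b; }
static int demo_min(int a, int b) { return a < b ? a : b; }

/* Two intervals interfere only if they strictly overlap. A register whose
 * last use is the same instruction that defines the other may share. */
int demo_interferes(struct demo_interval a, struct demo_interval b)
{
   int start = demo_max(a.def, b.def);
   int end = demo_min(a.use, b.use);
   return start < end;
}

/* Extends the last use by one instruction when the same-instruction
 * overwrite is unsafe (16-wide with a uniform or pixel_x/y source). */
struct demo_interval demo_extend_if_unsafe(struct demo_interval iv,
                                           int dispatch_width,
                                           int is_uniform_or_pixel_xy)
{
   assert(dispatch_width == 8 || dispatch_width == 16);
   if (dispatch_width == 16 && is_uniform_or_pixel_xy)
      iv.use++;
   return iv;
}
```

With a source live over [0, 5] and a destination defined at instruction 5, the plain test reports no interference, so both could share a register. After the extension the source is live through 6 and the two are kept apart, which mirrors the use[reg]++ in the patch.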
[Mesa-dev] [PATCH 3/5] i965/fs: Fix broken rendering in large shaders with UBO loads.
The lowering process creates a new vgrf on gen7 that should be represented in live interval analysis. As-is, it was getting a conflicting allocation with gl_FragDepth in the dolphin emulator, producing broken rendering.

NOTE: This is a candidate for the 9.1 branch.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61317
---
 src/mesa/drivers/dri/i965/brw_fs.cpp |    2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 927cf13..b97a19e 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -2499,6 +2499,8 @@ fs_visitor::lower_uniform_pull_constant_loads()
          inst->insert_before(setup2);
          inst->opcode = FS_OPCODE_UNIFORM_PULL_CONSTANT_LOAD_GEN7;
          inst->src[1] = payload;
+
+         this->live_intervals_valid = false;
       } else {
          /* Before register allocation, we didn't tell the scheduler about the
           * MRF we use. We know it's safe to use this MRF because nothing
-- 
1.7.10.4
[Mesa-dev] [PATCH 5/5] i965/fs: Also do the gen4 SEND dependency workaround against other SENDs.
We were handling the dependency workaround for the first written reg of a send preceding the one we're fixing up, but didn't consider the other regs. Thus if you had two sampler calls that got allocated to the same set of regs, one might, rarely, overwrite the other. This was occurring in XBMC's GLSL shaders.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=44567
NOTE: This is a candidate for the stable branches.
---
 src/mesa/drivers/dri/i965/brw_fs.cpp |   24 +++-
 1 file changed, 15 insertions(+), 9 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 5380abf..8ce3954 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -2300,7 +2300,8 @@ clear_deps_for_inst_src(fs_inst *inst, int dispatch_width, bool *deps,
 void
 fs_visitor::insert_gen4_pre_send_dependency_workarounds(fs_inst *inst)
 {
-   int write_len = inst->regs_written() * dispatch_width / 8;
+   int reg_size = dispatch_width / 8;
+   int write_len = inst->regs_written() * reg_size;
    int first_write_grf = inst->dst.reg;
    bool needs_dep[BRW_MAX_MRF];
    assert(write_len < (int)sizeof(needs_dep) - 1);
@@ -2339,14 +2340,19 @@ fs_visitor::insert_gen4_pre_send_dependency_workarounds(fs_inst *inst)
        * instruction but a MOV that might have left us an outstanding
        * dependency has more latency than a MOV.
        */
-      if (scan_inst->dst.file == GRF &&
-          scan_inst->dst.reg >= first_write_grf &&
-          scan_inst->dst.reg < first_write_grf + write_len &&
-          needs_dep[scan_inst->dst.reg - first_write_grf]) {
-         inst->insert_before(DEP_RESOLVE_MOV(scan_inst->dst.reg));
-         needs_dep[scan_inst->dst.reg - first_write_grf] = false;
-         if (scan_inst_16wide)
-            needs_dep[scan_inst->dst.reg - first_write_grf + 1] = false;
+      if (scan_inst->dst.file == GRF) {
+         for (int i = 0; i < scan_inst->regs_written(); i++) {
+            int reg = scan_inst->dst.reg + i * reg_size;
+
+            if (reg >= first_write_grf &&
+                reg < first_write_grf + write_len &&
+                needs_dep[reg - first_write_grf]) {
+               inst->insert_before(DEP_RESOLVE_MOV(reg));
+               needs_dep[reg - first_write_grf] = false;
+               if (scan_inst_16wide)
+                  needs_dep[reg - first_write_grf + 1] = false;
+            }
+         }
       }
 
       /* Clear the flag for registers that actually got read (as expected). */
-- 
1.7.10.4
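A stripped-down sketch of the new loop may help show what changed: every register the earlier instruction wrote is checked against the window, not just the first one. demo_resolve_deps is a hypothetical helper. It only counts and clears flags, leaves out the DEP_RESOLVE_MOV insertion and the extra 16-wide clearing, and is not the driver code.

```c
#include <assert.h>
#include <stdbool.h>

/* Walks every register written by an earlier instruction and clears the
 * needs_dep flag for those that fall inside the window
 * [first_write_grf, first_write_grf + write_len).
 * Returns how many registers were resolved. */
int demo_resolve_deps(int scan_dst_reg, int scan_regs_written, int reg_size,
                      int first_write_grf, int write_len, bool *needs_dep)
{
   int resolved = 0;
   assert(reg_size == 1 || reg_size == 2);
   for (int i = 0; i < scan_regs_written; i++) {
      int reg = scan_dst_reg + i * reg_size;

      if (reg >= first_write_grf &&
          reg < first_write_grf + write_len &&
          needs_dep[reg - first_write_grf]) {
         needs_dep[reg - first_write_grf] = false;
         resolved++;
      }
   }
   return resolved;
}
```

For example, an earlier write to registers 10..13 against a window starting at 12 resolves two registers here, whereas checking only the first written register (10) would have found none, which is the missed case the commit message describes.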
[Mesa-dev] [PATCH 4/5] i965/fs: Switch to using sampler LD messages for uniform pull constants.
When forcing the compiler to always generate pull constants instead of push constants (in order to have an easy-to-use test case), this improves performance of my old GLSL demo by 23.3553% +/- 1.42968% (n=7).
---
 src/mesa/drivers/dri/i965/brw_defines.h   |    2 +-
 src/mesa/drivers/dri/i965/brw_fs.cpp      |   39 +++--
 src/mesa/drivers/dri/i965/brw_fs.h        |    7 ++--
 src/mesa/drivers/dri/i965/brw_fs_emit.cpp |   54 +
 4 files changed, 50 insertions(+), 52 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_defines.h b/src/mesa/drivers/dri/i965/brw_defines.h
index d9b7f9a..6414e69 100644
--- a/src/mesa/drivers/dri/i965/brw_defines.h
+++ b/src/mesa/drivers/dri/i965/brw_defines.h
@@ -727,7 +727,7 @@ enum opcode {
    FS_OPCODE_VARYING_PULL_CONSTANT_LOAD_GEN7,
    FS_OPCODE_MOV_DISPATCH_TO_FLAGS,
    FS_OPCODE_DISCARD_JUMP,
-   FS_OPCODE_SET_GLOBAL_OFFSET,
+   FS_OPCODE_SET_SIMD4X2_OFFSET,
    FS_OPCODE_PACK_HALF_2x16_SPLIT,
    FS_OPCODE_UNPACK_HALF_2x16_SPLIT_X,
    FS_OPCODE_UNPACK_HALF_2x16_SPLIT_Y,
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index b97a19e..5380abf 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -2461,6 +2461,11 @@ fs_visitor::insert_gen4_send_dependency_workarounds()
  * scheduling full flexibility, while the conversion to native instructions
  * allows the post-register-allocation scheduler the best information
  * possible.
+ *
+ * Note that execution masking for setting up pull constant loads is special:
+ * the channels that need to be written are unrelated to the current execution
+ * mask, since a later instruction will use one of the result channels as a
+ * source operand for all 8 or 16 of its channels.
  */
 void
 fs_visitor::lower_uniform_pull_constant_loads()
@@ -2477,26 +2482,24 @@ fs_visitor::lower_uniform_pull_constant_loads()
                 const_offset_reg.type == BRW_REGISTER_TYPE_UD);
          const_offset_reg.imm.u /= 16;
          fs_reg payload = fs_reg(this, glsl_type::uint_type);
-         struct brw_reg g0 = retype(brw_vec8_grf(0, 0),
-                                    BRW_REGISTER_TYPE_UD);
-
-         fs_inst *setup1 = MOV(payload, fs_reg(g0));
-         setup1->force_writemask_all = true;
-         /* We don't need the second half of this vgrf to be filled with g1
-          * in the 16-wide case, but if we use force_uncompressed then live
-          * variable analysis won't consider this a def!
+
+         /* This is actually going to be a MOV, but since only the first dword
+          * is accessed, we have a special opcode to do just that one. Note
+          * that this needs to be an operation that will be considered a def
+          * by live variable analysis, or register allocation will explode.
           */
+         fs_inst *setup = new(mem_ctx) fs_inst(FS_OPCODE_SET_SIMD4X2_OFFSET,
+                                               payload, const_offset_reg);
+         setup->force_writemask_all = true;
 
-         fs_inst *setup2 = new(mem_ctx) fs_inst(FS_OPCODE_SET_GLOBAL_OFFSET,
-                                                payload, payload,
-                                                const_offset_reg);
+         setup->ir = inst->ir;
+         setup->annotation = inst->annotation;
+         inst->insert_before(setup);
 
-         setup1->ir = inst->ir;
-         setup1->annotation = inst->annotation;
-         inst->insert_before(setup1);
-         setup2->ir = inst->ir;
-         setup2->annotation = inst->annotation;
-         inst->insert_before(setup2);
+         /* Similarly, this will only populate the first 4 channels of the
+          * result register (since we only use smear values from 0-3), but we
+          * don't tell the optimizer.
+          */
          inst->opcode = FS_OPCODE_UNIFORM_PULL_CONSTANT_LOAD_GEN7;
          inst->src[1] = payload;
 
@@ -2533,7 +2536,7 @@ fs_visitor::dump_instruction(fs_inst *inst)
    case FS_OPCODE_UNIFORM_PULL_CONSTANT_LOAD_GEN7:
       printf("uniform_pull_const_gen7");
       break;
-   case FS_OPCODE_SET_GLOBAL_OFFSET:
+   case FS_OPCODE_SET_SIMD4X2_OFFSET:
       printf("set_global_offset");
       break;
    default:
diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h
index f7ccc79..febd56b 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -546,10 +546,9 @@ private:
                                                  struct brw_reg index,
                                                  struct brw_reg offset);
    void generate_mov_dispatch_to_flags(fs_inst *inst);
-   void generate_set_global_offset(fs_inst *inst,
-                                   struct brw_reg dst,
-                                   struct brw_reg src,
-                                   struct brw_reg offset);
+   void generate_set_simd4x2_offset(fs_inst *inst,
+
[Mesa-dev] glxgears is faster but 3D render is so slow
Hi,

I built mesa 9.1 with the following configuration:

--enable-xlib-glx --disable-dri --with-gallium-drivers=swrast --enable-osmesa --with-osmesa-bits=32

The mesa package is used by TurboVNC, so xlib glx has to be used instead of DRI. It gives a much higher frame rate in glxgears, but it has huge problems rendering 3D models in the CHIMERA application.

What puzzles me is that with the Mesa 7.11 provided by the CentOS system, rotating a 3D picture in CHIMERA works fine even though glxgears is very slow (250 FPS). It is not clear to me whether the slower 3D rendering in my own mesa build is caused by not using DRI, or whether I have missed some configuration options.

I would appreciate any advice.

Thank you.

Kind regards.

Jupiter