Re: [Mesa-dev] Debugging into Mesa Driver
On Wed, 4 Jun 2014 21:46:38 -0700
roshan chaudhari rgc...@gmail.com wrote:

> Thanks for reply. I have added CFLAGS='-Og -ggdb3' CXXFLAGS='-Og -ggdb3'
> into configure file and ran

The advice was to add those to the configure command line, not into the
configure file. Undo all your edits to Mesa and start from scratch.

> ./configure --enable-debug ; make; make install
>
> but still it did not step into driver.

So you tried exactly what Ian suggested, and the breakpoint did not
trigger?

Also, are you perhaps trying to debug a 32-bit application on an
otherwise 64-bit system?

Thanks,
pq

> On Wed, Jun 4, 2014 at 1:09 PM, Ian Romanick i...@freedesktop.org wrote:
>
> > On 06/04/2014 11:14 AM, roshan chaudhari wrote:
> > > Hello,
> > >
> > > I just cloned the mesa driver from git repository. I am trying to
> > > debug the opengl application with mesa driver. I am not sure which
> > > flag to enable for debugging and where, I built a driver with
> > > -enable-debug in Makefile and added --DEBUG in CFLAGS in Makefile
> > > but still when I try to step into driver code it does not allow me.
> > > I am doing it with gdb in ubuntu.
> >
> > Modifying the CFLAGS in the Makefile is likely to cause problems.
> > Instead, try
> >
> >     CFLAGS='-Og -ggdb3' CXXFLAGS='-Og -ggdb3' ./configure --enable-debug <your other configure options>
> >
> > If your version of GCC is too old, you will need to use -O0 instead
> > of -Og.
> >
> > That should be sufficient. You can verify this by doing
> >
> >     gdb $(which glxgears)
> >
> > Then, at the gdb prompt,
> >
> >     break _mesa_Clear
> >
> > It will ask
> >
> >     Function _mesa_Clear not defined.
> >     Make breakpoint pending on future shared library load? (y or [n])
> >
> > Answer 'y'. Then,
> >
> >     run
> >
> > If it stops in _mesa_Clear, you're good to go.

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Debugging into Mesa Driver
On Wed, 2014-06-04 at 21:46 -0700, roshan chaudhari wrote:
> Thanks for reply. I have added CFLAGS='-Og -ggdb3' CXXFLAGS='-Og -ggdb3'
> into configure file and ran
>
> ./configure --enable-debug ; make; make install

Where are you installing it to? You don't really want to blow away
Ubuntu's drivers, as you will have nothing to revert to if something goes
wrong. My guess is you're installing it somewhere Ubuntu doesn't know to
look, so you are not actually using the driver you're building.

You should take a look at the build guide for the radeon drivers to see
how to build and set up Mesa without overwriting your distro's drivers [1].

I also wrote a half-finished intro to Mesa [2] that some people have found
useful, as well as a guide on how to start contributing to Mesa [3].

[1] http://www.x.org/wiki/radeonBuildHowTo/
[2] http://www.itsqueeze.com/2013/09/introduction-to-mesa-through-example/
[3] http://www.itsqueeze.com/2013/11/how-to-start-contributing-to-mesa3d/
Re: [Mesa-dev] [PATCH] Implement the ARB_clear_texture extension
Hi Neil, I'd like to have full hardware acceleration for Gallium drivers before advertising the extension for them. I wouldn't like to have extensions which are only implemented in software where hardware support is preferable. Therefore, not advertising the extension is the way to go if you are not interested in Gallium. Thanks, Marek On Wed, Jun 4, 2014 at 8:12 PM, Neil Roberts n...@linux.intel.com wrote: The clear texture extension is used to clear a texture to a given value without having to provide a buffer for the whole texture and without having to create an FBO. This patch provides a generic implementation that works with any driver. There are two approaches, the first being in meta.c which tries to create a GL framebuffer to the texture and calls glClear. This can fail if the FBO extension is not supported or if the texture can't be used as a render target. In that case it will fall back to an implementation in texstore.c which maps a region of the texture and just directly writes in the values. A small problem with this patch is that the fallback approach that maps the texture doesn't seem to work with depth-stencil textures. However I think this may be a general bug with mapping depth-stencil textures because I seem to get the same issue if I try to update the texture using glTexSubImage2D as well. You can replicate this if you run the Piglit test arb_clear_texture-depth-stencil and set MESA_EXTENSION_OVERRIDE to -GL_ARB_framebuffer_object. 
--- src/mapi/glapi/gen/ARB_clear_texture.xml | 32 src/mapi/glapi/gen/gl_API.xml| 6 +- src/mesa/drivers/common/driverfuncs.c| 1 + src/mesa/drivers/common/meta.c | 143 ++ src/mesa/drivers/common/meta.h | 14 ++ src/mesa/main/dd.h | 14 ++ src/mesa/main/extensions.c | 1 + src/mesa/main/teximage.c | 251 ++- src/mesa/main/teximage.h | 12 ++ src/mesa/main/texstore.c | 70 + src/mesa/main/texstore.h | 7 + 11 files changed, 549 insertions(+), 2 deletions(-) create mode 100644 src/mapi/glapi/gen/ARB_clear_texture.xml diff --git a/src/mapi/glapi/gen/ARB_clear_texture.xml b/src/mapi/glapi/gen/ARB_clear_texture.xml new file mode 100644 index 000..9bb400a --- /dev/null +++ b/src/mapi/glapi/gen/ARB_clear_texture.xml @@ -0,0 +1,32 @@ +?xml version=1.0? +!DOCTYPE OpenGLAPI SYSTEM gl_API.dtd + +OpenGLAPI + +category name=GL_ARB_clear_texture number=145 + +function name =ClearTexImage offset=assign +param name=texture type=GLuint/ +param name=level type=GLint/ +param name=format type=GLenum/ +param name=type type=GLenum/ +param name=data type=const GLvoid */ +/function + +function name =ClearTexSubImage offset=assign +param name=texture type=GLuint/ +param name=level type=GLint/ +param name=xoffset type=GLint/ +param name=yoffset type=GLint/ +param name=zoffset type=GLint/ +param name=width type=GLsizei/ +param name=height type=GLsizei/ +param name=depth type=GLsizei/ +param name=format type=GLenum/ +param name=type type=GLenum/ +param name=data type=const GLvoid */ +/function + +/category + +/OpenGLAPI diff --git a/src/mapi/glapi/gen/gl_API.xml b/src/mapi/glapi/gen/gl_API.xml index 0791bfc..181263a 100644 --- a/src/mapi/glapi/gen/gl_API.xml +++ b/src/mapi/glapi/gen/gl_API.xml @@ -8344,7 +8344,11 @@ /function /category -!-- ARB extensions #145...#146 -- +!-- ARB extension #145 -- + +xi:include href=ARB_clear_texture.xml xmlns:xi=http://www.w3.org/2001/XInclude/ + +!-- ARB extension #147 -- xi:include href=ARB_multi_bind.xml xmlns:xi=http://www.w3.org/2001/XInclude/ diff --git 
a/src/mesa/drivers/common/driverfuncs.c b/src/mesa/drivers/common/driverfuncs.c index 6ece5d8..4f0f7a6 100644 --- a/src/mesa/drivers/common/driverfuncs.c +++ b/src/mesa/drivers/common/driverfuncs.c @@ -95,6 +95,7 @@ _mesa_init_driver_functions(struct dd_function_table *driver) driver-TexImage = _mesa_store_teximage; driver-TexSubImage = _mesa_store_texsubimage; driver-GetTexImage = _mesa_meta_GetTexImage; + driver-ClearTexSubImage = _mesa_meta_ClearTexSubImage; driver-CopyTexSubImage = _mesa_meta_CopyTexSubImage; driver-GenerateMipmap = _mesa_meta_GenerateMipmap; driver-TestProxyTexImage = _mesa_test_proxy_teximage; diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c index fec0d2b..e4aa8b2 100644 --- a/src/mesa/drivers/common/meta.c +++ b/src/mesa/drivers/common/meta.c @@ -40,6 +40,7 @@ #include main/blit.h #include main/bufferobj.h #include main/buffers.h +#include main/clear.h #include main/colortab.h #include main/condrender.h #include main/depth.h @@
[Mesa-dev] [PATCH] i965: Fix unmapping depth-stencil textures with non-zero offset
Hi,

This should fix the problem with depth-stencil textures mentioned in
the commit message for my GL_ARB_clear_texture patch here:

http://lists.freedesktop.org/archives/mesa-dev/2014-June/060739.html

- Neil

--- >8 --- (use git am --scissors to automatically chop here)

intel_miptree_unmap_depthstencil was not taking into account the
offset of the mapping when calculating the address to store the depth
value. This was making glTexSubImage2D update the wrong texels if it
was used with a non-zero offset on a depth-stencil texture. This patch
makes the loop look more like the equivalent one in
intel_miptree_map_depthstencil so that it will use the right location.
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index dd7e57a..2fee6e48 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -2136,13 +2136,14 @@ intel_miptree_unmap_depthstencil(struct brw_context *brw,

       for (uint32_t y = 0; y < map->h; y++) {
          for (uint32_t x = 0; x < map->w; x++) {
+            int map_x = map->x + x, map_y = map->y + y;
             ptrdiff_t s_offset = intel_offset_S8(s_mt->pitch,
-                                                 x + s_image_x + map->x,
-                                                 y + s_image_y + map->y,
+                                                 map_x + s_image_x,
+                                                 map_y + s_image_y,
                                                  brw->has_swizzling);
-            ptrdiff_t z_offset = ((y + z_image_y) *
+            ptrdiff_t z_offset = ((map_y + z_image_y) *
                                   (z_mt->pitch / 4) +
-                                  (x + z_image_x));
+                                  (map_x + z_image_x));

             if (map_z32f_x24s8) {
                z_map[z_offset] = packed_map[(y * map->w + x) * 2 + 0];
-- 
1.9.0
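The offset arithmetic that this fix restores can be looked at in isolation. The following is only a minimal sketch under stated assumptions, not the driver code: struct toy_map and toy_z_offset() are hypothetical names, the depth buffer is assumed to be linear with a pitch in bytes and 4-byte texels, and the tiled stencil path (intel_offset_S8) is left out entirely.

```c
#include <stddef.h>

/* Hypothetical description of a mapped sub-rectangle of a texture image. */
struct toy_map {
   int x, y;   /* origin of the mapped region inside the image */
   int w, h;   /* size of the mapped region */
};

/* Index, in 32-bit texels, of mapped texel (x, y) inside a linear depth
 * buffer. image_x/image_y locate the image inside the buffer and
 * pitch_bytes is the row pitch in bytes. The map origin has to be added
 * to the loop coordinates, which is the part the patch puts back. */
static ptrdiff_t
toy_z_offset(const struct toy_map *map, int x, int y,
             int image_x, int image_y, int pitch_bytes)
{
   int map_x = map->x + x;
   int map_y = map->y + y;

   return (ptrdiff_t)(map_y + image_y) * (pitch_bytes / 4) + (map_x + image_x);
}
```

With a 4-texel-wide buffer (pitch of 16 bytes) and a 1x1 mapping at (1, 1), the first mapped texel lands at index 5. Dropping the map origin would give index 0 instead, which matches the "wrong texels" symptom described in the commit message.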
[Mesa-dev] [PATCH piglit] Test using glTexSubImage2D with packed depth-stencil textures
This adds a test for updating a sub-region of a texture created with the GL_EXT_packed_depth_stencil extension. Currently this seems to trigger a bug on the i965 driver. --- tests/all.py | 1 + .../ext_packed_depth_stencil/CMakeLists.gl.txt | 1 + tests/spec/ext_packed_depth_stencil/texsubimage.c | 157 + 3 files changed, 159 insertions(+) create mode 100644 tests/spec/ext_packed_depth_stencil/texsubimage.c diff --git a/tests/all.py b/tests/all.py index 1652d7c..e692d5c 100644 --- a/tests/all.py +++ b/tests/all.py @@ -2497,6 +2497,7 @@ add_depthstencil_render_miplevels_tests( ext_packed_depth_stencil['fbo-clear-formats stencil'] = concurrent_test('fbo-clear-formats GL_EXT_packed_depth_stencil stencil') ext_packed_depth_stencil['DEPTH_STENCIL texture'] = concurrent_test('ext_packed_depth_stencil-depth-stencil-texture') ext_packed_depth_stencil['getteximage'] = concurrent_test('ext_packed_depth_stencil-getteximage') +ext_packed_depth_stencil['texsubimage'] = concurrent_test('ext_packed_depth_stencil-texsubimage') oes_packed_depth_stencil = {} spec['OES_packed_depth_stencil'] = oes_packed_depth_stencil diff --git a/tests/spec/ext_packed_depth_stencil/CMakeLists.gl.txt b/tests/spec/ext_packed_depth_stencil/CMakeLists.gl.txt index 99439f0..0d81da1 100644 --- a/tests/spec/ext_packed_depth_stencil/CMakeLists.gl.txt +++ b/tests/spec/ext_packed_depth_stencil/CMakeLists.gl.txt @@ -12,5 +12,6 @@ link_libraries ( piglit_add_executable (ext_packed_depth_stencil-depth-stencil-texture depth-stencil-texture.c) piglit_add_executable (ext_packed_depth_stencil-readpixels-24_8 readpixels-24_8.c) piglit_add_executable (ext_packed_depth_stencil-getteximage getteximage.c) +piglit_add_executable (ext_packed_depth_stencil-texsubimage texsubimage.c) # vim: ft=cmake: diff --git a/tests/spec/ext_packed_depth_stencil/texsubimage.c b/tests/spec/ext_packed_depth_stencil/texsubimage.c new file mode 100644 index 000..7238a49 --- /dev/null +++ b/tests/spec/ext_packed_depth_stencil/texsubimage.c @@ 
-0,0 +1,157 @@ +/* + * Copyright (c) 2014 Intel Corporation + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the Software), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. + */ + +/** @file cube.c + * + * A test of using glTexSubImage2D to update a region of a + * depth-stencil texture. A 4x4 depth-stencil is created and then two + * of the texels are set using different values. The whole texture is + * read back using glGetTexImage and compared to the expected values. + * + * This currently fails on the i965 driver. 
+ */ + +#include piglit-util-gl-common.h + +PIGLIT_GL_TEST_CONFIG_BEGIN + + config.supports_gl_compat_version = 13; + + config.window_visual = PIGLIT_GL_VISUAL_RGB | PIGLIT_GL_VISUAL_DOUBLE; + +PIGLIT_GL_TEST_CONFIG_END + +static GLuint +create_texture(void) +{ + static const GLubyte data[] = { + 0xff, 0xff, 0xff, 0xff, + 0x04, 0x05, 0x06, 0x07, + 0xff, 0xff, 0xff, 0xff, + 0x0c, 0x0d, 0x0e, 0x0f, + }; + GLuint tex; + + glGenTextures(1, tex); + glBindTexture(GL_TEXTURE_2D, tex); + glTexImage2D(GL_TEXTURE_2D, +0, /* level */ +GL_DEPTH24_STENCIL8_EXT, +2, 2, /* width/height */ +0, /* border */ +GL_DEPTH_STENCIL_EXT, +GL_UNSIGNED_INT_24_8_EXT, +data); + + return tex; +} + +static void +update_texture(void) +{ + static const GLubyte bottom_left_pixel[] = { + 0x00, 0x01, 0x02, 0x03 + }; + static const GLubyte top_left_pixel[] = { + 0x08, 0x09, 0x0a, 0x0b + }; + glTexSubImage2D(GL_TEXTURE_2D, + 0, /* level */ + 0, 0, /* x/y */ + 1, 1, /* width/height */ +
[Mesa-dev] [PATCH v2 0/4] i965: Add runtime checks for line antialiasing in Gen < 6.
Updated series based on review comments. The most important change in the
series is the use of JMPI instead of IF/THEN/ELSE: in order to use JMPI I
moved this part of the code to the generator (the previous patch did the
conditional in the visitor), because we need to count the number of
instructions we have to jump over and some of these instructions will be in
the generator anyway, so producing the JMPI instruction in the visitor looks
like a bad idea in this case.

Patch 1 is irrelevant for this series. It is a fix for a problem I ran into
when developing the first version based on IF/THEN/ELSE.
Patch 2: Provides the generator with the information it requires to handle
the runtime conditional in Gen < 6.
Patch 3: Makes brw_land_fwd_jump generally available again.
Patch 4: Implements JMPI based runtime checks for line antialiasing in
Gen < 6.

Tested in Ironlake.

Iago Toral Quiroga (4):
  i965: Always set a valid block end pointer
  i965/fs: Let the gen < 8 generator know about runtime_check_aads_emit
  Revert "i965: Move brw_land_fwd_jump() to compilation unit of its use."
  i965/fs: Add Gen < 6 runtime checks for line antialiasing.

 src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp |  2 +-
 src/mesa/drivers/dri/i965/brw_cfg.cpp           |  5 ++
 src/mesa/drivers/dri/i965/brw_eu.h              |  4 ++
 src/mesa/drivers/dri/i965/brw_eu_emit.c         | 17 +
 src/mesa/drivers/dri/i965/brw_fs.cpp            |  2 +-
 src/mesa/drivers/dri/i965/brw_fs.h              |  6 ++
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp  | 92 +
 src/mesa/drivers/dri/i965/brw_sf_emit.c         | 16 -
 8 files changed, 98 insertions(+), 46 deletions(-)

-- 
1.9.1
[Mesa-dev] [PATCH v2 3/4] Revert "i965: Move brw_land_fwd_jump() to compilation unit of its use."
This reverts commit f3cb2e6ed7059b22752a6b7d7a98c07ba6b5552e.

brw_land_fwd_jump() is convenient wherever we produce JMPI instructions and
we will use JMPI to implement framebuffer writes that involve line
antialiasing in gen < 6.
---
 src/mesa/drivers/dri/i965/brw_eu.h      |  4 ++++
 src/mesa/drivers/dri/i965/brw_eu_emit.c | 17 +++++++++++++++++
 src/mesa/drivers/dri/i965/brw_sf_emit.c | 16 ----------------
 3 files changed, 21 insertions(+), 16 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_eu.h b/src/mesa/drivers/dri/i965/brw_eu.h
index c9e5a4b..5d4 100644
--- a/src/mesa/drivers/dri/i965/brw_eu.h
+++ b/src/mesa/drivers/dri/i965/brw_eu.h
@@ -340,6 +340,10 @@ struct brw_instruction *brw_CONT(struct brw_compile *p);
 struct brw_instruction *gen6_CONT(struct brw_compile *p);
 struct brw_instruction *gen6_HALT(struct brw_compile *p);

+/* Forward jumps:
+ */
+void brw_land_fwd_jump(struct brw_compile *p, int jmp_insn_idx);
+
 struct brw_instruction *brw_JMPI(struct brw_compile *p, struct brw_reg index,
                                  unsigned predicate_control);

diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c b/src/mesa/drivers/dri/i965/brw_eu_emit.c
index b89070b..9d7cbd9 100644
--- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
+++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
@@ -1777,6 +1777,23 @@ struct brw_instruction *brw_WHILE(struct brw_compile *p)
    return insn;
 }

+/* FORWARD JUMPS:
+ */
+void brw_land_fwd_jump(struct brw_compile *p, int jmp_insn_idx)
+{
+   struct brw_context *brw = p->brw;
+   struct brw_instruction *jmp_insn = &p->store[jmp_insn_idx];
+   unsigned jmpi = 1;
+
+   if (brw->gen >= 5)
+      jmpi = 2;
+
+   assert(jmp_insn->header.opcode == BRW_OPCODE_JMPI);
+   assert(jmp_insn->bits1.da1.src1_reg_file == BRW_IMMEDIATE_VALUE);
+
+   jmp_insn->bits3.ud = jmpi * (p->nr_insn - jmp_insn_idx - 1);
+}
+
 /* To integrate with the above, it makes sense that the comparison
  * instruction should populate the flag register. It might be simpler
  * just to use the flag reg for most WM tasks?
diff --git a/src/mesa/drivers/dri/i965/brw_sf_emit.c b/src/mesa/drivers/dri/i965/brw_sf_emit.c
index b526a5c..693627c 100644
--- a/src/mesa/drivers/dri/i965/brw_sf_emit.c
+++ b/src/mesa/drivers/dri/i965/brw_sf_emit.c
@@ -729,22 +729,6 @@ void brw_emit_point_setup(struct brw_sf_compile *c, bool allocate)
    brw_set_default_predicate_control(p, BRW_PREDICATE_NONE);
 }

-static void
-brw_land_fwd_jump(struct brw_compile *p, int jmp_insn_idx)
-{
-   struct brw_context *brw = p->brw;
-   struct brw_instruction *jmp_insn = &p->store[jmp_insn_idx];
-   unsigned jmpi = 1;
-
-   if (brw->gen >= 5)
-      jmpi = 2;
-
-   assert(jmp_insn->header.opcode == BRW_OPCODE_JMPI);
-   assert(jmp_insn->bits1.da1.src1_reg_file == BRW_IMMEDIATE_VALUE);
-
-   jmp_insn->bits3.ud = jmpi * (p->nr_insn - jmp_insn_idx - 1);
-}
-
 void brw_emit_anyprim_setup( struct brw_sf_compile *c )
 {
    struct brw_compile *p = c->func;
-- 
1.9.1
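The landing step that brw_land_fwd_jump() performs can be sketched on its own. This is a minimal illustration under stated assumptions rather than the driver code: struct toy_insn, toy_jmpi_scale() and toy_land_fwd_jump() are hypothetical names, the instruction store is reduced to the two fields needed here, and the scale factor of 2 for gen 5 and later assumes the garbled comparison in the archived patch reads as "gen >= 5".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for one slot of the instruction store; only the
 * fields needed to illustrate the patching step are modelled. */
struct toy_insn {
   int is_jmpi;        /* nonzero if this slot holds a JMPI */
   uint32_t jump_imm;  /* immediate jump distance, filled in later */
};

/* The patch multiplies the distance by 2 from gen 5 on and by 1 before. */
static unsigned
toy_jmpi_scale(int gen)
{
   return gen >= 5 ? 2 : 1;
}

/* Patch the JMPI stored at jmp_insn_idx so that it lands just past the
 * last instruction emitted so far. nr_insn is the current instruction
 * count, so (nr_insn - jmp_insn_idx - 1) instructions are skipped. */
static void
toy_land_fwd_jump(struct toy_insn *store, int nr_insn, int jmp_insn_idx,
                  int gen)
{
   struct toy_insn *jmp_insn = &store[jmp_insn_idx];

   assert(jmp_insn->is_jmpi);
   jmp_insn->jump_imm = toy_jmpi_scale(gen) * (nr_insn - jmp_insn_idx - 1);
}
```

The pattern is to emit the JMPI first with a placeholder distance, remember its index, emit the instructions to be skipped, and only then call the landing helper. That is why the cover letter talks about needing to count the instructions to jump over in the generator.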
[Mesa-dev] [PATCH v2 2/4] i965/fs: Let the gen < 8 generator know about runtime_check_aads_emit
In gen < 6 we need to produce conditional code based on this flag
when doing framebuffer writes.
---
 src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp | 2 +-
 src/mesa/drivers/dri/i965/brw_fs.cpp            | 2 +-
 src/mesa/drivers/dri/i965/brw_fs.h              | 2 ++
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp  | 4 +++-
 4 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp b/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp
index 33fa606..a2e008b 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp
+++ b/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp
@@ -31,7 +31,7 @@ brw_blorp_eu_emitter::brw_blorp_eu_emitter(struct brw_context *brw,
      generator(brw, mem_ctx,
                rzalloc(mem_ctx, struct brw_wm_prog_key),
                rzalloc(mem_ctx, struct brw_wm_prog_data),
-               NULL, NULL, false, debug_flag)
+               NULL, NULL, false, false, debug_flag)
 {
 }

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 3fa8334..a8ca9bc 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -3211,7 +3211,7 @@ brw_wm_fs_emit(struct brw_context *brw,
                                      final_assembly_size);
    } else {
       fs_generator g(brw, mem_ctx, key, prog_data, prog, fp, v.do_dual_src,
-                     INTEL_DEBUG & DEBUG_WM);
+                     v.runtime_check_aads_emit, INTEL_DEBUG & DEBUG_WM);
       assembly = g.generate_assembly(&v.instructions, simd16_instructions,
                                      final_assembly_size);
    }
diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h
index d91b966..02311a6 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -607,6 +607,7 @@ public:
                 struct gl_shader_program *prog,
                 struct gl_fragment_program *fp,
                 bool dual_source_output,
+                bool runtime_check_aads_emit,
                 bool debug_flag);
    ~fs_generator();

@@ -716,6 +717,7 @@ private:

    exec_list discard_halt_patches;
    bool dual_source_output;
+   bool runtime_check_aads_emit;
    const bool debug_flag;
    void *mem_ctx;
 };
diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
index 3ff7682..f4e4826 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
@@ -43,10 +43,12 @@ fs_generator::fs_generator(struct brw_context *brw,
                            struct gl_shader_program *prog,
                            struct gl_fragment_program *fp,
                            bool dual_source_output,
+                           bool runtime_check_aads_emit,
                            bool debug_flag)
    : brw(brw), key(key), prog_data(prog_data), prog(prog), fp(fp),
-     dual_source_output(dual_source_output), debug_flag(debug_flag),
+     dual_source_output(dual_source_output),
+     runtime_check_aads_emit(runtime_check_aads_emit), debug_flag(debug_flag),
      mem_ctx(mem_ctx)
 {
    ctx = &brw->ctx;
-- 
1.9.1
[Mesa-dev] [PATCH v2 4/4] i965/fs: Add Gen < 6 runtime checks for line antialiasing.
In Gen < 6 the hardware generates a runtime bit that indicates whether AA
data has to be sent as part of the framebuffer write SEND message. This
affects the specific case where we have set up antialiased line rendering
and we render polygons which have one face set up in GL_LINE mode (line
antialiasing will be used) and the other one in GL_FILL mode (no line
antialiasing needed).

Currently we are not doing this runtime test and instead we always send AA
data, which produces incorrect rendering of the GL_FILL face of the polygon
in the aforementioned scenario (verified in ironlake and gm45).

In Gen4 this is, likely, a regression introduced with commit 098acf6c843.
In Gen5 this has never worked properly. Gen > 5 are not affected by this.

The patch fixes the problem by adding the appropriate runtime check and
adjusting the framebuffer write message accordingly in the conflictive
scenario.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78679
---
 src/mesa/drivers/dri/i965/brw_fs.h             |  4 ++
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 88 ++
 2 files changed, 65 insertions(+), 27 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h
index 02311a6..cda344e 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -617,6 +617,10 @@ public:

 private:
    void generate_code(exec_list *instructions);
+   void fire_fb_write(fs_inst *inst,
+                      GLuint base_reg,
+                      struct brw_reg implied_header,
+                      GLuint nr);
    void generate_fb_write(fs_inst *inst);
    void generate_blorp_fb_write(fs_inst *inst);
    void generate_pixel_xy(struct brw_reg dst, bool is_x);
diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
index f4e4826..04c9b74 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
@@ -98,11 +98,47 @@ fs_generator::patch_discard_jumps_to_fb_writes()
 }

 void
+fs_generator::fire_fb_write(fs_inst *inst,
+                            GLuint base_reg,
+                            struct brw_reg implied_header,
+                            GLuint nr)
+{
+   uint32_t msg_control;
+
+   if (brw->gen < 6) {
+      brw_MOV(p,
+              brw_message_reg(base_reg + 1),
+              brw_vec8_grf(1, 0));
+   }
+
+   if (this->dual_source_output)
+      msg_control = BRW_DATAPORT_RENDER_TARGET_WRITE_SIMD8_DUAL_SOURCE_SUBSPAN01;
+   else if (dispatch_width == 16)
+      msg_control = BRW_DATAPORT_RENDER_TARGET_WRITE_SIMD16_SINGLE_SOURCE;
+   else
+      msg_control = BRW_DATAPORT_RENDER_TARGET_WRITE_SIMD8_SINGLE_SOURCE_SUBSPAN01;
+
+   uint32_t surf_index =
+      prog_data->binding_table.render_target_start + inst->target;
+
+   brw_fb_WRITE(p,
+                dispatch_width,
+                base_reg,
+                implied_header,
+                msg_control,
+                surf_index,
+                nr,
+                0,
+                inst->eot,
+                inst->header_present);
+
+   brw_mark_surface_used(&prog_data->base, surf_index);
+}
+
+void
 fs_generator::generate_fb_write(fs_inst *inst)
 {
-   bool eot = inst->eot;
    struct brw_reg implied_header;
-   uint32_t msg_control;

    /* Header is 2 regs, g0 and g1 are the contents. g0 will be implied
     * move, here's g1.
@@ -155,38 +191,36 @@ fs_generator::generate_fb_write(fs_inst *inst)
          implied_header = brw_null_reg();
       } else {
          implied_header = retype(brw_vec8_grf(0, 0), BRW_REGISTER_TYPE_UW);
-
-         brw_MOV(p,
-                 brw_message_reg(inst->base_mrf + 1),
-                 brw_vec8_grf(1, 0));
       }
    } else {
       implied_header = brw_null_reg();
    }

-   if (this->dual_source_output)
-      msg_control = BRW_DATAPORT_RENDER_TARGET_WRITE_SIMD8_DUAL_SOURCE_SUBSPAN01;
-   else if (dispatch_width == 16)
-      msg_control = BRW_DATAPORT_RENDER_TARGET_WRITE_SIMD16_SINGLE_SOURCE;
-   else
-      msg_control = BRW_DATAPORT_RENDER_TARGET_WRITE_SIMD8_SINGLE_SOURCE_SUBSPAN01;
+   if (!runtime_check_aads_emit) {
+      fire_fb_write(inst, inst->base_mrf, implied_header, inst->mlen);
+   } else {
+      /* This can only happen in gen < 6 */
+      struct brw_reg v1_null_ud = vec1(retype(brw_null_reg(), BRW_REGISTER_TYPE_UD));
+
+      /* Check runtime bit to detect if we have to send AA data or not */
+      brw_set_default_compression_control(p, BRW_COMPRESSION_NONE);
+      brw_AND(p,
+              v1_null_ud,
+              retype(brw_vec1_grf(1, 6), BRW_REGISTER_TYPE_UD),
+              brw_imm_ud(1<<26));
+      brw_last_inst->header.destreg__conditionalmod = BRW_CONDITIONAL_NZ;
+
+      int jmp = brw_JMPI(p, brw_imm_ud(0), BRW_PREDICATE_NORMAL) - p->store;
+      brw_last_inst->header.execution_size = BRW_EXECUTE_1;
+      {
+         /* Don't send AA data */
+         fire_fb_write(inst, inst->base_mrf+1,
[Mesa-dev] [PATCH v2 1/4] i965: Always set a valid block end pointer
When an instruction stream ends in a block structure (like an IF/ELSE/ENDIF)
the last block's end pointer will not be set, leading to a crash later
on in fs_live_variables::setup_def_use().

If we have not assigned the end pointer of the last block, set it
to the last instruction.
---
 src/mesa/drivers/dri/i965/brw_cfg.cpp | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_cfg.cpp b/src/mesa/drivers/dri/i965/brw_cfg.cpp
index 6bf99f1..d4647c4 100644
--- a/src/mesa/drivers/dri/i965/brw_cfg.cpp
+++ b/src/mesa/drivers/dri/i965/brw_cfg.cpp
@@ -257,6 +257,11 @@ cfg_t::cfg_t(exec_list *instructions)
       }
    }

+   /* If the instruction stream ended with a block structure we need to
+      set the block's end pointer to the last instruction here */
+   if (!cur->end)
+      cur->end = (backend_instruction *)instructions->get_tail();
+
    cur->end_ip = ip;

    make_block_array();
-- 
1.9.1
[Mesa-dev] [Bug 79688] [dri3] Latest git breaks PRIME Offloading to Nouveau GPU
https://bugs.freedesktop.org/show_bug.cgi?id=79688

Chris Wilson ch...@chris-wilson.co.uk changed:

           What    |Removed                              |Added
----------------------------------------------------------------------------
           Assignee|ch...@chris-wilson.co.uk             |mesa-dev@lists.freedesktop.org
         QA Contact|intel-gfx-bugs@lists.freedesktop.org |
            Summary|Latest git breaks PRIME              |[dri3] Latest git breaks PRIME
                   |Offloading to Nouveau GPU            |Offloading to Nouveau GPU
            Product|xorg                                 |Mesa
          Component|Driver/intel                         |GLX

-- 
You are receiving this mail because:
You are the assignee for the bug.
Re: [Mesa-dev] Debugging into Mesa Driver
On 06/04/2014 09:46 PM, roshan chaudhari wrote:
> Thanks for reply. I have added CFLAGS='-Og -ggdb3' CXXFLAGS='-Og -ggdb3'
> into configure file and ran
>
> ./configure --enable-debug ; make; make install
>
> but still it did not step into driver.

Are you sure it's using the driver you built instead of the system
installed driver?

    LIBGL_DEBUG=verbose glxgears

will give some information about the driver being used (see below). If
that doesn't match the driver installed, you'll need to use
LD_LIBRARY_PATH and possibly LIBGL_DRIVERS_PATH to get the right one.

libGL: OpenDriver: trying /usr/lib64/dri/tls/i965_dri.so
libGL: OpenDriver: trying /usr/lib64/dri/i965_dri.so
libGL: Can't open configuration file /home/idr/.drirc: No such file or directory.
libGL: Can't open configuration file /home/idr/.drirc: No such file or directory.

> On Wed, Jun 4, 2014 at 1:09 PM, Ian Romanick i...@freedesktop.org wrote:
>
> > On 06/04/2014 11:14 AM, roshan chaudhari wrote:
> > > Hello,
> > >
> > > I just cloned the mesa driver from git repository. I am trying to
> > > debug the opengl application with mesa driver. I am not sure which
> > > flag to enable for debugging and where, I built a driver with
> > > -enable-debug in Makefile and added --DEBUG in CFLAGS in Makefile
> > > but still when I try to step into driver code it does not allow me.
> > > I am doing it with gdb in ubuntu.
> >
> > Modifying the CFLAGS in the Makefile is likely to cause problems.
> > Instead, try
> >
> >     CFLAGS='-Og -ggdb3' CXXFLAGS='-Og -ggdb3' ./configure --enable-debug <your other configure options>
> >
> > If your version of GCC is too old, you will need to use -O0 instead
> > of -Og.
> >
> > That should be sufficient. You can verify this by doing
> >
> >     gdb $(which glxgears)
> >
> > Then, at the gdb prompt,
> >
> >     break _mesa_Clear
> >
> > It will ask
> >
> >     Function _mesa_Clear not defined.
> >     Make breakpoint pending on future shared library load? (y or [n])
> >
> > Answer 'y'. Then,
> >
> >     run
> >
> > If it stops in _mesa_Clear, you're good to go.
> >
> > > Can anyone please help me with that?
> > >
> > > --
> > > Thanks,
> > > Roshan
Re: [Mesa-dev] [PATCH] configure.ac: Do not use Pthreads with MinGW.
Vinson,

As I said in another reply, I think this is the right thing to do on Windows.

Please submit this and drop the c11: .. patch. Thanks.

Jose

----- Original Message -----
> Match the behavior of the SCons MinGW build.
>
> This patch also fixes these build errors.
>
>   CC glapi_entrypoint.lo
> glapi_entrypoint.c: In function 'init_glapi_relocs_once':
> glapi_entrypoint.c:341:4: error: unknown type name 'pthread_once_t'
>     static pthread_once_t once_control = PTHREAD_ONCE_INIT;
>     ^
> glapi_entrypoint.c:341:41: error: 'PTHREAD_ONCE_INIT' undeclared (first use in this function)
>     static pthread_once_t once_control = PTHREAD_ONCE_INIT;
>                                          ^
> glapi_entrypoint.c:341:41: note: each undeclared identifier is reported only once for each function it appears in
> glapi_entrypoint.c:342:4: error: implicit declaration of function 'pthread_once' [-Werror=implicit-function-declaration]
>     pthread_once( &once_control, init_glapi_relocs );
>     ^
>
> Signed-off-by: Vinson Lee v...@freedesktop.org
> ---
>  configure.ac | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/configure.ac b/configure.ac
> index 9c64400..ab3b91d 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -552,7 +552,12 @@ dnl See if posix_memalign is available
>  AC_CHECK_FUNC([posix_memalign], [DEFINES="$DEFINES -DHAVE_POSIX_MEMALIGN"])
>
>  dnl Check for pthreads
> -AX_PTHREAD
> +case $host_os in
> +mingw*)
> +    ;;
> +*)
> +    AX_PTHREAD
> +esac
>  dnl AX_PTHREADS leaves PTHREAD_LIBS empty for gcc and sets PTHREAD_CFLAGS
>  dnl to -pthread, which causes problems if we need -lpthread to appear in
>  dnl pkgconfig files.
> --
> 1.9.3
[Mesa-dev] [PATCH] mesa: Fix substitution of large shaders
The fixed size is insufficient for shaders I'm debugging. Rather than just bump it up, make it dynamic.

Thanks,

-C

Signed-off-by: Cody Northrop c...@lunarg.com
---
 src/mesa/main/shaderapi.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
index 6f84acd..e63c124 100644
--- a/src/mesa/main/shaderapi.c
+++ b/src/mesa/main/shaderapi.c
@@ -1392,7 +1392,7 @@ _mesa_LinkProgram(GLhandleARB programObj)
 static GLcharARB *
 read_shader(const char *fname)
 {
-   const int max = 50*1000;
+   int shader_size = 0;
    FILE *f = fopen(fname, "r");
    GLcharARB *buffer, *shader;
    int len;
@@ -1401,8 +1401,16 @@ read_shader(const char *fname)
       return NULL;
    }
 
-   buffer = malloc(max);
-   len = fread(buffer, 1, max, f);
+   /* allocate enough room for the entire shader */
+   fseek(f, 0, SEEK_END);
+   shader_size = ftell(f);
+   rewind(f);
+   assert(shader_size);
+
+   buffer = malloc(shader_size);
+   assert(buffer);
+
+   len = fread(buffer, 1, shader_size, f);
    buffer[len] = 0;
 
    fclose(f);
-- 
1.8.3.2
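For readers following the idea of sizing the buffer from the file itself, here is a minimal standalone sketch in C. It is not the Mesa function: `read_text_file` is a hypothetical helper, and the choices to reserve one extra byte for the terminator and to return NULL on failure instead of asserting are assumptions made for this illustration.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper, not the Mesa function: read a whole file into a
 * freshly allocated, NUL-terminated buffer. Returns NULL on any failure;
 * the caller frees the result. */
static char *
read_text_file(const char *fname)
{
   FILE *f = fopen(fname, "rb");
   char *buffer = NULL;
   long size;
   size_t len;

   if (!f)
      return NULL;

   /* Measure the file by seeking to its end, then return to the start. */
   if (fseek(f, 0, SEEK_END) != 0 || (size = ftell(f)) < 0 ||
       fseek(f, 0, SEEK_SET) != 0) {
      fclose(f);
      return NULL;
   }

   /* One extra byte so the terminator written below always fits. */
   buffer = malloc((size_t)size + 1);
   if (buffer) {
      len = fread(buffer, 1, (size_t)size, f);
      buffer[len] = '\0';
   }

   fclose(f);
   return buffer;
}
```

Since fread() never returns more than the requested count, `len` is at most `size`, so the terminator lands inside the allocation even for an empty file.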
[Mesa-dev] [PATCH] i965: Fix else and brace placement in brw_eu_emit.c.
I'm making a lot of changes to this area, and I figured I may as well not conflate these trivial changes. Signed-off-by: Kenneth Graunke kenn...@whitecape.org --- src/mesa/drivers/dri/i965/brw_eu_emit.c | 41 +++-- 1 file changed, 13 insertions(+), 28 deletions(-) The easiest opportunity to pad your review stats :) diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c b/src/mesa/drivers/dri/i965/brw_eu_emit.c index b89070b..e39c31c 100644 --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c @@ -178,8 +178,7 @@ brw_set_dest(struct brw_compile *p, struct brw_instruction *insn, if (dest.hstride == BRW_HORIZONTAL_STRIDE_0) dest.hstride = BRW_HORIZONTAL_STRIDE_1; insn-bits1.da1.dest_horiz_stride = dest.hstride; - } - else { + } else { insn-bits1.da16.dest_subreg_nr = dest.subnr / 16; insn-bits1.da16.dest_writemask = dest.dw1.bits.writemask; if (dest.file == BRW_GENERAL_REGISTER_FILE || @@ -192,8 +191,7 @@ brw_set_dest(struct brw_compile *p, struct brw_instruction *insn, */ insn-bits1.da16.dest_horiz_stride = 1; } - } - else { + } else { insn-bits1.ia1.dest_subreg_nr = dest.subnr; /* These are different sizes in align1 vs align16: @@ -203,8 +201,7 @@ brw_set_dest(struct brw_compile *p, struct brw_instruction *insn, if (dest.hstride == BRW_HORIZONTAL_STRIDE_0) dest.hstride = BRW_HORIZONTAL_STRIDE_1; insn-bits1.ia1.dest_horiz_stride = dest.hstride; - } - else { + } else { insn-bits1.ia16.dest_indirect_offset = dest.dw1.bits.indirect_offset; /* even ignored in da16, still need to set as '01' */ insn-bits1.ia16.dest_horiz_stride = 1; @@ -394,26 +391,21 @@ brw_set_src0(struct brw_compile *p, struct brw_instruction *insn, insn-bits1.da1.src0_reg_type = BRW_HW_REG_TYPE_UD; insn-bits1.da1.dest_reg_type = BRW_HW_REG_TYPE_UD; } - } - else - { + } else { if (reg.address_mode == BRW_ADDRESS_DIRECT) { if (insn-header.access_mode == BRW_ALIGN_1) { insn-bits2.da1.src0_subreg_nr = reg.subnr; insn-bits2.da1.src0_reg_nr = reg.nr; -} -else { +} else { 
insn-bits2.da16.src0_subreg_nr = reg.subnr / 16; insn-bits2.da16.src0_reg_nr = reg.nr; } - } - else { + } else { insn-bits2.ia1.src0_subreg_nr = reg.subnr; if (insn-header.access_mode == BRW_ALIGN_1) { insn-bits2.ia1.src0_indirect_offset = reg.dw1.bits.indirect_offset; -} -else { +} else { insn-bits2.ia16.src0_subreg_nr = reg.dw1.bits.indirect_offset; } } @@ -424,14 +416,12 @@ brw_set_src0(struct brw_compile *p, struct brw_instruction *insn, insn-bits2.da1.src0_horiz_stride = BRW_HORIZONTAL_STRIDE_0; insn-bits2.da1.src0_width = BRW_WIDTH_1; insn-bits2.da1.src0_vert_stride = BRW_VERTICAL_STRIDE_0; -} -else { +} else { insn-bits2.da1.src0_horiz_stride = reg.hstride; insn-bits2.da1.src0_width = reg.width; insn-bits2.da1.src0_vert_stride = reg.vstride; } - } - else { + } else { insn-bits2.da16.src0_swz_x = BRW_GET_SWZ(reg.dw1.bits.swizzle, BRW_CHANNEL_X); insn-bits2.da16.src0_swz_y = BRW_GET_SWZ(reg.dw1.bits.swizzle, BRW_CHANNEL_Y); insn-bits2.da16.src0_swz_z = BRW_GET_SWZ(reg.dw1.bits.swizzle, BRW_CHANNEL_Z); @@ -475,8 +465,7 @@ brw_set_src1(struct brw_compile *p, if (reg.file == BRW_IMMEDIATE_VALUE) { insn-bits3.ud = reg.dw1.ud; - } - else { + } else { /* This is a hardware restriction, which may or may not be lifted * in the future: */ @@ -486,8 +475,7 @@ brw_set_src1(struct brw_compile *p, if (insn-header.access_mode == BRW_ALIGN_1) { insn-bits3.da1.src1_subreg_nr = reg.subnr; insn-bits3.da1.src1_reg_nr = reg.nr; - } - else { + } else { insn-bits3.da16.src1_subreg_nr = reg.subnr / 16; insn-bits3.da16.src1_reg_nr = reg.nr; } @@ -498,14 +486,12 @@ brw_set_src1(struct brw_compile *p, insn-bits3.da1.src1_horiz_stride = BRW_HORIZONTAL_STRIDE_0; insn-bits3.da1.src1_width = BRW_WIDTH_1; insn-bits3.da1.src1_vert_stride = BRW_VERTICAL_STRIDE_0; -} -else { +} else { insn-bits3.da1.src1_horiz_stride = reg.hstride; insn-bits3.da1.src1_width = reg.width; insn-bits3.da1.src1_vert_stride = reg.vstride; } - } - else { + } else { insn-bits3.da16.src1_swz_x = 
BRW_GET_SWZ(reg.dw1.bits.swizzle, BRW_CHANNEL_X); insn-bits3.da16.src1_swz_y = BRW_GET_SWZ(reg.dw1.bits.swizzle, BRW_CHANNEL_Y); insn-bits3.da16.src1_swz_z
[Mesa-dev] Fix negation source modifier when used with logical instructions on Broadwell (v2)
v2 of the fix.

Abdiel Janulgue (6):
  i965/fs: Refactor check for potential copy propagated instructions.
  i965/fs: skip copy-propagate for logical instructions with negated src entries
  i965/fs: copy propagate 'NOT' instruction when used with logical operation
  i965/vec4: skip copy-propagate for logical instructions with negated src entries
  i965/vec4: copy propagate 'NOT' instruction when used with logical operation
  i965/disasm: Properly debug negate source modifier for logical instructions

 src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp   | 55
 src/mesa/drivers/dri/i965/brw_vec4.h                    |  4 +-
 src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp | 84 ++-
 src/mesa/drivers/dri/i965/gen8_disasm.c                 | 24 +++--
 4 files changed, 128 insertions(+), 39 deletions(-)
[Mesa-dev] [PATCH v2 1/6] i965/fs: Refactor check for potential copy propagated instructions.
Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
Reviewed-by: Matt Turner matts...@gmail.com
---
 .../drivers/dri/i965/brw_fs_copy_propagation.cpp | 27 +++++++++++++++++----------
 1 file changed, 17 insertions(+), 10 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
index a1aff21..d3d59aa 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
@@ -478,6 +478,22 @@ fs_visitor::try_constant_propagate(fs_inst *inst, acp_entry *entry)
    return progress;
 }
 
+
+static bool
+can_propagate_from(fs_inst *inst)
+{
+   return (inst->opcode == BRW_OPCODE_MOV &&
+           inst->dst.file == GRF &&
+           ((inst->src[0].file == GRF &&
+             (inst->src[0].reg != inst->dst.reg ||
+              inst->src[0].reg_offset != inst->dst.reg_offset)) ||
+            inst->src[0].file == UNIFORM ||
+            inst->src[0].file == IMM) &&
+           inst->src[0].type == inst->dst.type &&
+           !inst->saturate &&
+           !inst->is_partial_write());
+}
+
 /* Walks a basic block and does copy propagation on it using the acp
  * list.
  */
@@ -532,16 +548,7 @@ fs_visitor::opt_copy_propagate_local(void *copy_prop_ctx, bblock_t *block,
       /* If this instruction's source could potentially be folded into the
        * operand of another instruction, add it to the ACP.
        */
-      if (inst->opcode == BRW_OPCODE_MOV &&
-          inst->dst.file == GRF &&
-          ((inst->src[0].file == GRF &&
-            (inst->src[0].reg != inst->dst.reg ||
-             inst->src[0].reg_offset != inst->dst.reg_offset)) ||
-           inst->src[0].file == UNIFORM ||
-           inst->src[0].file == IMM) &&
-          inst->src[0].type == inst->dst.type &&
-          !inst->saturate &&
-          !inst->is_partial_write()) {
+      if (can_propagate_from(inst)) {
          acp_entry *entry = ralloc(copy_prop_ctx, acp_entry);
          entry->dst = inst->dst;
          entry->src = inst->src[0];
-- 
1.9.1
[Mesa-dev] [PATCH v2 2/6] i965/fs: skip copy-propagate for logical instructions with negated src entries
The negation source modifier on src registers has changed meaning in Broadwell when used with logical operations. Don't copy propagate when the negate src modifier is set and the destination instruction is a logical op.

Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
---
 src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
index d3d59aa..aa506f5 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
@@ -42,6 +42,7 @@ namespace { /* avoid conflict with opt_copy_propagation_elements */
 struct acp_entry : public exec_node {
    fs_reg dst;
    fs_reg src;
+   enum opcode opcode;
 };
 
 struct block_data {
@@ -272,6 +273,15 @@ fs_copy_prop_dataflow::dump_block_data() const
    }
 }
 
+static bool
+is_logic_op(enum opcode opcode)
+{
+   return (opcode == BRW_OPCODE_AND ||
+           opcode == BRW_OPCODE_OR  ||
+           opcode == BRW_OPCODE_XOR ||
+           opcode == BRW_OPCODE_NOT);
+}
+
 bool
 fs_visitor::try_copy_propagate(fs_inst *inst, int arg, acp_entry *entry)
 {
@@ -330,6 +340,14 @@ fs_visitor::try_copy_propagate(fs_inst *inst, int arg, acp_entry *entry)
    if (has_source_modifiers && entry->dst.type != inst->src[arg].type)
       return false;
 
+   if (brw->gen >= 8) {
+      if (entry->src.negate) {
+         if (is_logic_op(inst->opcode)) {
+            return false;
+         }
+      }
+   }
+
    inst->src[arg].file = entry->src.file;
    inst->src[arg].reg = entry->src.reg;
    inst->src[arg].reg_offset = entry->src.reg_offset;
@@ -552,6 +570,7 @@ fs_visitor::opt_copy_propagate_local(void *copy_prop_ctx, bblock_t *block,
          acp_entry *entry = ralloc(copy_prop_ctx, acp_entry);
          entry->dst = inst->dst;
          entry->src = inst->src[0];
+         entry->opcode = inst->opcode;
          acp[entry->dst.reg % ACP_HASH_SIZE].push_tail(entry);
       }
    }
-- 
1.9.1
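The reason for the guard can be shown outside the compiler. The following is a minimal sketch in plain C, not driver code: every `toy_` name is hypothetical, and it assumes that on gen8 the negate bit on a logical instruction's source means bitwise NOT (the disassembler patch 6/6 in this series prints it as `~`) while everywhere else it means arithmetic negation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the BRW opcodes named in the patch. */
enum toy_opcode { TOY_MOV, TOY_ADD, TOY_AND, TOY_OR, TOY_XOR, TOY_NOT };

static bool
toy_is_logic_op(enum toy_opcode op)
{
   return op == TOY_AND || op == TOY_OR || op == TOY_XOR || op == TOY_NOT;
}

/* Value a source with the negate bit set contributes to instruction `op`,
 * under the assumption stated above. */
static uint32_t
toy_apply_negate(int gen, enum toy_opcode op, uint32_t v)
{
   if (gen >= 8 && toy_is_logic_op(op))
      return ~v;        /* assumed Broadwell meaning: bitwise NOT */
   return 0u - v;       /* arithmetic negation, modulo 2^32 */
}

/* Same shape as the check the patch adds to try_copy_propagate(): a negated
 * source must not be folded into a logical instruction on gen8, because the
 * modifier would be reinterpreted. */
static bool
toy_can_propagate_negated_src(int gen, enum toy_opcode op, bool src_negate)
{
   if (gen >= 8 && src_negate && toy_is_logic_op(op))
      return false;
   return true;
}
```

For a source value of 5, the two interpretations give 0xFFFFFFFA (bitwise NOT) and 0xFFFFFFFB (negation), which is why folding `MOV tmp, -a` into an `AND` would change the result on gen8.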
[Mesa-dev] [PATCH v2 3/6] i965/fs: copy propagate 'NOT' instruction when used with logical operation
On Broadwell, this reduces the instruction to a single operation when NOT is used with a logical instruction. Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com --- src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp | 17 + 1 file changed, 13 insertions(+), 4 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp index aa506f5..54d2cb4 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp @@ -341,7 +341,11 @@ fs_visitor::try_copy_propagate(fs_inst *inst, int arg, acp_entry *entry) return false; if (brw-gen = 8) { - if (entry-src.negate) { + if (entry-opcode == BRW_OPCODE_NOT) { + if (!is_logic_op(inst-opcode)) { +return false; + } + } else if (entry-src.negate) { if (is_logic_op(inst-opcode)) { return false; } @@ -359,6 +363,10 @@ fs_visitor::try_copy_propagate(fs_inst *inst, int arg, acp_entry *entry) inst-src[arg].negate ^= entry-src.negate; } + if (brw-gen =8 entry-opcode == BRW_OPCODE_NOT) { + inst-src[arg].negate ^= !entry-src.negate; + } + return true; } @@ -498,9 +506,10 @@ fs_visitor::try_constant_propagate(fs_inst *inst, acp_entry *entry) } static bool -can_propagate_from(fs_inst *inst) +can_propagate_from(struct brw_context *brw, fs_inst *inst) { - return (inst-opcode == BRW_OPCODE_MOV + return ((inst-opcode == BRW_OPCODE_MOV || +(inst-opcode == BRW_OPCODE_NOT brw-gen =8)) inst-dst.file == GRF ((inst-src[0].file == GRF (inst-src[0].reg != inst-dst.reg || @@ -566,7 +575,7 @@ fs_visitor::opt_copy_propagate_local(void *copy_prop_ctx, bblock_t *block, /* If this instruction's source could potentially be folded into the * operand of another instruction, add it to the ACP. 
*/ - if (can_propagate_from(inst)) { + if (can_propagate_from(brw, inst)) { acp_entry *entry = ralloc(copy_prop_ctx, acp_entry); entry-dst = inst-dst; entry-src = inst-src[0]; -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
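The saving described in this patch rests on a simple identity: a separate NOT followed by a logical instruction computes the same value as the logical instruction alone with its source marked as bitwise-inverted. A minimal sketch in plain C, with hypothetical `toy_` names and covering only the simple case where the NOT's own source carries no modifiers:

```c
#include <stdbool.h>
#include <stdint.h>

/* Two instructions: NOT tmp, a ; AND dst, tmp, b */
static uint32_t
toy_not_then_and(uint32_t a, uint32_t b)
{
   uint32_t tmp = ~a;
   return tmp & b;
}

/* One instruction: AND dst, a (with an optional bitwise-not source
 * modifier), b. Setting `invert_a` models the negate bit that the patch
 * toggles when a NOT is propagated into a logical instruction. */
static uint32_t
toy_and_with_modifier(uint32_t a, bool invert_a, uint32_t b)
{
   uint32_t src0 = invert_a ? ~a : a;
   return src0 & b;
}
```

With the modifier set, `toy_and_with_modifier(a, true, b)` equals `toy_not_then_and(a, b)` for every input, so the NOT can be dropped once its result has no other users.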
[Mesa-dev] [PATCH v2 4/6] i965/vec4: skip copy-propagate for logical instructions with negated src entries
The negation source modifier on src registers has changed meaning in Broadwell when used with logical operations. Don't copy propagate when negate src modifier is set and when the destination instruction is a logical op. Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com --- src/mesa/drivers/dri/i965/brw_vec4.h | 4 +- .../drivers/dri/i965/brw_vec4_copy_propagation.cpp | 68 +++--- 2 files changed, 49 insertions(+), 23 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h b/src/mesa/drivers/dri/i965/brw_vec4.h index fd58b3c..51da46c 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.h +++ b/src/mesa/drivers/dri/i965/brw_vec4.h @@ -228,6 +228,8 @@ writemask(dst_reg reg, unsigned mask) return reg; } +struct copy_entry; + class vec4_instruction : public backend_instruction { public: DECLARE_RALLOC_CXX_OPERATORS(vec4_instruction) @@ -498,7 +500,7 @@ public: vec4_instruction *last_rhs_inst); bool try_copy_propagation(vec4_instruction *inst, int arg, - src_reg *values[4]); + struct copy_entry *entry); /** Walks an exec_list of ir_instruction and sends it through this visitor. 
*/ void visit_instructions(const exec_list *list); diff --git a/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp b/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp index 83cf191..e537895 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp @@ -36,6 +36,11 @@ extern C { namespace brw { +struct copy_entry { + src_reg *value[4]; + enum opcode opcode; +}; + static bool is_direct_copy(vec4_instruction *inst) { @@ -195,24 +200,33 @@ try_constant_propagation(vec4_instruction *inst, int arg, src_reg *values[4]) return false; } +static bool +is_logic_op(enum opcode opcode) +{ + return (opcode == BRW_OPCODE_AND || + opcode == BRW_OPCODE_OR || + opcode == BRW_OPCODE_XOR || + opcode == BRW_OPCODE_NOT); +} + bool vec4_visitor::try_copy_propagation(vec4_instruction *inst, int arg, - src_reg *values[4]) + struct copy_entry *entry) { /* For constant propagation, we only handle the same constant * across all 4 channels. Some day, we should handle the 8-bit * float vector format, which would let us constant propagate * vectors better. */ - src_reg value = *values[0]; + src_reg value = *(entry-value[0]); for (int i = 1; i 4; i++) { /* This is equals() except we don't care about the swizzle. 
*/ - if (value.file != values[i]-file || - value.reg != values[i]-reg || - value.reg_offset != values[i]-reg_offset || - value.type != values[i]-type || - value.negate != values[i]-negate || - value.abs != values[i]-abs) { + if (value.file != entry-value[i]-file || + value.reg != entry-value[i]-reg || + value.reg_offset != entry-value[i]-reg_offset || + value.type != entry-value[i]-type || + value.negate != entry-value[i]-negate || + value.abs != entry-value[i]-abs) { return false; } } @@ -223,7 +237,7 @@ vec4_visitor::try_copy_propagation(vec4_instruction *inst, int arg, */ int s[4]; for (int i = 0; i 4; i++) { - s[i] = BRW_GET_SWZ(values[i]-swizzle, + s[i] = BRW_GET_SWZ(entry-value[i]-swizzle, BRW_GET_SWZ(inst-src[arg].swizzle, i)); } value.swizzle = BRW_SWIZZLE4(s[0], s[1], s[2], s[3]); @@ -233,6 +247,14 @@ vec4_visitor::try_copy_propagation(vec4_instruction *inst, int arg, value.file != ATTR) return false; + if (brw-gen =8) { + if (value.negate) { + if (is_logic_op(inst-opcode)) { +return false; + } + } + } + if (inst-src[arg].abs) { value.negate = false; value.abs = true; @@ -284,9 +306,9 @@ bool vec4_visitor::opt_copy_propagation() { bool progress = false; - src_reg *cur_value[virtual_grf_reg_count][4]; + struct copy_entry entries[virtual_grf_reg_count]; - memset(cur_value, 0, sizeof(cur_value)); + memset(entries, 0, sizeof(entries)); foreach_list(node, this-instructions) { vec4_instruction *inst = (vec4_instruction *)node; @@ -299,7 +321,7 @@ vec4_visitor::opt_copy_propagation() * src/glsl/opt_copy_propagation.cpp to track available copies. */ if (!is_dominated_by_previous_instruction(inst)) { -memset(cur_value, 0, sizeof(cur_value)); +memset(entries, 0, sizeof(entries)); continue; } @@ -320,31 +342,32 @@ vec4_visitor::opt_copy_propagation() /* Find the regs that each swizzle component came from. */ -src_reg *values[4]; +struct copy_entry entry; int c; for (c = 0;
[Mesa-dev] [PATCH v2 6/6] i965/disasm: Properly debug negate source modifier for logical instructions
Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com --- src/mesa/drivers/dri/i965/gen8_disasm.c | 24 +--- 1 file changed, 21 insertions(+), 3 deletions(-) diff --git a/src/mesa/drivers/dri/i965/gen8_disasm.c b/src/mesa/drivers/dri/i965/gen8_disasm.c index 04f8538..98e2453 100644 --- a/src/mesa/drivers/dri/i965/gen8_disasm.c +++ b/src/mesa/drivers/dri/i965/gen8_disasm.c @@ -50,6 +50,8 @@ static const char *const m_negate[2] = { , - }; static const char *const m_abs[2] = { , (abs) }; +static const char *const m_bitnot[2] = { , ~ }; + static const char *const m_vert_stride[16] = { 0, 1, @@ -511,13 +513,23 @@ src_swizzle(FILE *file, unsigned x, unsigned y, unsigned z, unsigned w) return err; } +static bool +is_logic_instruction(unsigned opcode) +{ + return (opcode == BRW_OPCODE_AND || + opcode == BRW_OPCODE_NOT || + opcode == BRW_OPCODE_OR || + opcode == BRW_OPCODE_XOR); +} + static int -src_da1(FILE *file, unsigned type, unsigned reg_file, +src_da1(FILE *file, unsigned opcode, unsigned type, unsigned reg_file, unsigned vert_stride, unsigned _width, unsigned horiz_stride, unsigned reg_num, unsigned sub_reg_num, unsigned _abs, unsigned negate) { int err = 0; - err |= control(file, negate, m_negate, negate, NULL); + err |= control(file, negate, is_logic_instruction(opcode) ? + m_bitnot : m_negate, negate, NULL); err |= control(file, abs, m_abs, _abs, NULL); err |= reg(file, reg_file, reg_num); @@ -532,6 +544,7 @@ src_da1(FILE *file, unsigned type, unsigned reg_file, static int src_da16(FILE *file, + unsigned opcode, unsigned _reg_type, unsigned reg_file, unsigned vert_stride, @@ -545,7 +558,8 @@ src_da16(FILE *file, unsigned swz_w) { int err = 0; - err |= control(file, negate, m_negate, negate, NULL); + err |= control(file, negate, is_logic_instruction(opcode) ? 
+ m_bitnot : m_negate, negate, NULL); err |= control(file, abs, m_abs, _abs, NULL); err |= reg(file, reg_file, _reg_nr); @@ -714,6 +728,7 @@ src0(FILE *file, struct gen8_instruction *inst) if (gen8_access_mode(inst) == BRW_ALIGN_1) { assert(gen8_src0_address_mode(inst) == BRW_ADDRESS_DIRECT); return src_da1(file, + gen8_opcode(inst), gen8_src0_reg_type(inst), gen8_src0_reg_file(inst), gen8_src0_vert_stride(inst), @@ -726,6 +741,7 @@ src0(FILE *file, struct gen8_instruction *inst) } else { assert(gen8_src0_address_mode(inst) == BRW_ADDRESS_DIRECT); return src_da16(file, + gen8_opcode(inst), gen8_src0_reg_type(inst), gen8_src0_reg_file(inst), gen8_src0_vert_stride(inst), @@ -749,6 +765,7 @@ src1(FILE *file, struct gen8_instruction *inst) if (gen8_access_mode(inst) == BRW_ALIGN_1) { assert(gen8_src1_address_mode(inst) == BRW_ADDRESS_DIRECT); return src_da1(file, + gen8_opcode(inst), gen8_src1_reg_type(inst), gen8_src1_reg_file(inst), gen8_src1_vert_stride(inst), @@ -761,6 +778,7 @@ src1(FILE *file, struct gen8_instruction *inst) } else { assert(gen8_src1_address_mode(inst) == BRW_ADDRESS_DIRECT); return src_da16(file, + gen8_opcode(inst), gen8_src1_reg_type(inst), gen8_src1_reg_file(inst), gen8_src1_vert_stride(inst), -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
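The disassembler change boils down to choosing a different string table for the negate bit depending on the opcode. A minimal sketch in plain C of that selection; the `toy_` names are hypothetical, and the table contents (`""`/`"-"` and `""`/`"~"`) are an assumption about what the flattened `m_negate` and `m_bitnot` initializers above originally contained.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the BRW opcodes named in the patch. */
enum toy_opcode { TOY_MOV, TOY_ADD, TOY_AND, TOY_NOT, TOY_OR, TOY_XOR };

/* Index 0: modifier bit clear, index 1: modifier bit set. */
static const char *const toy_negate[2] = { "", "-" };
static const char *const toy_bitnot[2] = { "", "~" };

static bool
toy_is_logic_instruction(enum toy_opcode op)
{
   return op == TOY_AND || op == TOY_NOT || op == TOY_OR || op == TOY_XOR;
}

/* Prefix to print in front of a source operand for a given negate bit. */
static const char *
toy_negate_prefix(enum toy_opcode op, unsigned negate)
{
   const char *const *table =
      toy_is_logic_instruction(op) ? toy_bitnot : toy_negate;
   return table[negate & 1];
}
```

Keeping both tables the same shape means the call site only swaps the table pointer, which is how the patch threads the opcode through `src_da1()` and `src_da16()` without touching `control()`.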
[Mesa-dev] [PATCH v2 5/6] i965/vec4: copy propagate 'NOT' instruction when used with logical operation
On Broadwell, this reduces the instruction to a single operation when NOT is used with a logical instruction. Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com --- .../drivers/dri/i965/brw_vec4_copy_propagation.cpp | 20 +++- 1 file changed, 15 insertions(+), 5 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp b/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp index e537895..5eb4eb4 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp @@ -42,9 +42,11 @@ struct copy_entry { }; static bool -is_direct_copy(vec4_instruction *inst) +can_propagate_from(struct brw_context *brw, vec4_instruction *inst) + { - return (inst-opcode == BRW_OPCODE_MOV + return ((inst-opcode == BRW_OPCODE_MOV || +(inst-opcode == BRW_OPCODE_NOT brw-gen = 8)) !inst-predicate inst-dst.file == GRF !inst-saturate @@ -248,7 +250,11 @@ vec4_visitor::try_copy_propagation(vec4_instruction *inst, int arg, return false; if (brw-gen =8) { - if (value.negate) { + if (entry-opcode == BRW_OPCODE_NOT) { + if (!is_logic_op(inst-opcode)) { +return false; + } + } else if (value.negate) { if (is_logic_op(inst-opcode)) { return false; } @@ -299,6 +305,10 @@ vec4_visitor::try_copy_propagation(vec4_instruction *inst, int arg, value.type = inst-src[arg].type; inst-src[arg] = value; + + if (brw-gen =8 entry-opcode == BRW_OPCODE_NOT) + inst-src[arg].negate ^= !value.negate; + return true; } @@ -380,10 +390,10 @@ vec4_visitor::opt_copy_propagation() * the value is the newly propagated source. Otherwise, we don't know * the new value, so clear it. */ -bool direct_copy = is_direct_copy(inst); +bool propagate = can_propagate_from(brw, inst); for (int i = 0; i 4; i++) { if (inst-dst.writemask (1 i)) { - entries[reg].value[i] = direct_copy ? inst-src[0] : NULL; + entries[reg].value[i] = propagate ? 
inst-src[0] : NULL; entries[reg].opcode = inst-opcode; } } -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/3] radeon/compute: Implement PIPE_COMPUTE_CAP_MAX_COMPUTE_UNITS
On Tue, 2014-06-03 at 08:55 -0400, Alex Deucher wrote:

On Mon, Jun 2, 2014 at 7:34 PM, Bruno Jimenez brunoji...@gmail.com wrote:

On Mon, 2014-06-02 at 16:16 -0400, Alex Deucher wrote:

On Sat, May 31, 2014 at 7:13 AM, Bruno Jimenez brunoji...@gmail.com wrote:

On Fri, 2014-05-30 at 19:33 -0400, Alex Deucher wrote:

On Fri, May 30, 2014 at 11:31 AM, Bruno Jiménez brunoji...@gmail.com wrote:

The data has been extracted from: AMD Accelerated Parallel Processing OpenCL Programming Guide (rev 2.7) Appendix D: Device Parameters

You should add a query for the number of compute units to the RADEON_INFO ioctl and then just ask the kernel how many CUs/SIMDs the hw has. This will properly handle all boards (harvest, etc.) since we can read the actual number of CUs off the GPU.

Alex

Hi,

At first I tried to do so (as for the maximum clock frequency), but I couldn't find how to query that value, nor many docs about what I could ask the kernel for. I think I have now found the appropriate docs, and I will try again to query the kernel later.

You'd need to add a new query. It doesn't look like we expose this yet. The attached untested patch should mostly do the trick.

Alex

Honestly, I would never have been able to come up with this. I tried querying for MAX_PIPES, MAX_SE and MAX_SH_PER_SE (only for SI), and multiplying them together. It did work for my little CEDAR, but getting a 2 is easy, and what it would return for other cards didn't look right. Should I try this patch on top of kernel 3.14.4, or should I use another version?

With a couple of changes, it applied cleanly to 3.14.5 (Arch's stable). And with the attached patch as #2 for my series I can get the correct number of compute units for my CEDAR. But I don't know how or where I should add this new query param, given that it hasn't been added to the kernel yet. For now I have hardcoded the '0x20'.

Thanks for everything, Alex!

Bruno

It was against Dave's drm-next, but it may apply to 3.14 as well.
Alex Thanks in advance and sorry for any inconvenience. Bruno Sorry for any inconvenience. Bruno --- src/gallium/drivers/radeon/r600_pipe_common.c | 90 +++ 1 file changed, 90 insertions(+) diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c b/src/gallium/drivers/radeon/r600_pipe_common.c index 70c4d1a..c4abacd 100644 --- a/src/gallium/drivers/radeon/r600_pipe_common.c +++ b/src/gallium/drivers/radeon/r600_pipe_common.c @@ -422,6 +422,89 @@ const char *r600_get_llvm_processor_name(enum radeon_family family) } } +static uint32_t radeon_max_compute_units(enum radeon_family family) +{ + switch (family) { + case CHIP_CEDAR: + return 2; + + /* Redwood PRO2: 4 +* Redwood PRO: 5 +* Redwood XT: 5 */ + case CHIP_REDWOOD: + return 4; + + /* Juniper LE: 9 +* Juniper XT: 10 */ + case CHIP_JUNIPER: + return 9; + + /* Cypress LE: 14 +* Cypress PRO: 18 +* Cypress XT: 20 */ + case CHIP_CYPRESS: + return 14; + + case CHIP_HEMLOCK: + return 40; + + /* XXX: is Zacate really equal to Ontario? +* Zacate E-350: 2 +* Zacate E-240: 2 +* Ontario C-50: 2 +* Ontario C-30: 2 */ + case CHIP_PALM: + return 2; + + /* Caicos: 2 +* Seymour LP: 2 +* Seymour PRO: 2 +* Seymour XT: 2 +* Seymour XTX: 2 */ + case CHIP_CAICOS: + return 2; + + /* Turks PRO:6 +* Turks XT: 6 +* Whistler LP: 6 +* Whistler PRO: 6 +* Whistler XT: 6 */ + case CHIP_TURKS: + return 6; + + /* Barts LE: 10 +* Barts PRO: 12 +* Barts XT: 14 +* Blackcomb PRO: 12 */ + case CHIP_BARTS: + return 10; + + /* Cayman PRO: 22 +* Cayman XT: 24 +* Cayman Gemini: 48 */ + case CHIP_CAYMAN: + return 22; + + /* Verde PRO: 8 +* Verde XT: 10 */ + case CHIP_VERDE: + return 8; + + /* Pitcairn PRO: 16 +* Pitcairn XT: 20 */ + case CHIP_PITCAIRN: + return 16; + + /* Tahiti PRO: 28 +* Tahiti XT: 32 */ + case CHIP_TAHITI: + return 28; + + default: + return 1; + } +} + static int
Re: [Mesa-dev] [PATCH] i965: Fix else and brace placement in brw_eu_emit.c.
Reviewed-by: Jordan Justen jordan.l.jus...@intel.com On Thu, Jun 5, 2014 at 10:56 AM, Kenneth Graunke kenn...@whitecape.org wrote: I'm making a lot of changes to this area, and I figured I may as well not conflate these trivial changes. Signed-off-by: Kenneth Graunke kenn...@whitecape.org --- src/mesa/drivers/dri/i965/brw_eu_emit.c | 41 +++-- 1 file changed, 13 insertions(+), 28 deletions(-) The easiest opportunity to pad your review stats :) diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c b/src/mesa/drivers/dri/i965/brw_eu_emit.c index b89070b..e39c31c 100644 --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c @@ -178,8 +178,7 @@ brw_set_dest(struct brw_compile *p, struct brw_instruction *insn, if (dest.hstride == BRW_HORIZONTAL_STRIDE_0) dest.hstride = BRW_HORIZONTAL_STRIDE_1; insn-bits1.da1.dest_horiz_stride = dest.hstride; - } - else { + } else { insn-bits1.da16.dest_subreg_nr = dest.subnr / 16; insn-bits1.da16.dest_writemask = dest.dw1.bits.writemask; if (dest.file == BRW_GENERAL_REGISTER_FILE || @@ -192,8 +191,7 @@ brw_set_dest(struct brw_compile *p, struct brw_instruction *insn, */ insn-bits1.da16.dest_horiz_stride = 1; } - } - else { + } else { insn-bits1.ia1.dest_subreg_nr = dest.subnr; /* These are different sizes in align1 vs align16: @@ -203,8 +201,7 @@ brw_set_dest(struct brw_compile *p, struct brw_instruction *insn, if (dest.hstride == BRW_HORIZONTAL_STRIDE_0) dest.hstride = BRW_HORIZONTAL_STRIDE_1; insn-bits1.ia1.dest_horiz_stride = dest.hstride; - } - else { + } else { insn-bits1.ia16.dest_indirect_offset = dest.dw1.bits.indirect_offset; /* even ignored in da16, still need to set as '01' */ insn-bits1.ia16.dest_horiz_stride = 1; @@ -394,26 +391,21 @@ brw_set_src0(struct brw_compile *p, struct brw_instruction *insn, insn-bits1.da1.src0_reg_type = BRW_HW_REG_TYPE_UD; insn-bits1.da1.dest_reg_type = BRW_HW_REG_TYPE_UD; } - } - else - { + } else { if (reg.address_mode == BRW_ADDRESS_DIRECT) { if 
(insn-header.access_mode == BRW_ALIGN_1) { insn-bits2.da1.src0_subreg_nr = reg.subnr; insn-bits2.da1.src0_reg_nr = reg.nr; -} -else { +} else { insn-bits2.da16.src0_subreg_nr = reg.subnr / 16; insn-bits2.da16.src0_reg_nr = reg.nr; } - } - else { + } else { insn-bits2.ia1.src0_subreg_nr = reg.subnr; if (insn-header.access_mode == BRW_ALIGN_1) { insn-bits2.ia1.src0_indirect_offset = reg.dw1.bits.indirect_offset; -} -else { +} else { insn-bits2.ia16.src0_subreg_nr = reg.dw1.bits.indirect_offset; } } @@ -424,14 +416,12 @@ brw_set_src0(struct brw_compile *p, struct brw_instruction *insn, insn-bits2.da1.src0_horiz_stride = BRW_HORIZONTAL_STRIDE_0; insn-bits2.da1.src0_width = BRW_WIDTH_1; insn-bits2.da1.src0_vert_stride = BRW_VERTICAL_STRIDE_0; -} -else { +} else { insn-bits2.da1.src0_horiz_stride = reg.hstride; insn-bits2.da1.src0_width = reg.width; insn-bits2.da1.src0_vert_stride = reg.vstride; } - } - else { + } else { insn-bits2.da16.src0_swz_x = BRW_GET_SWZ(reg.dw1.bits.swizzle, BRW_CHANNEL_X); insn-bits2.da16.src0_swz_y = BRW_GET_SWZ(reg.dw1.bits.swizzle, BRW_CHANNEL_Y); insn-bits2.da16.src0_swz_z = BRW_GET_SWZ(reg.dw1.bits.swizzle, BRW_CHANNEL_Z); @@ -475,8 +465,7 @@ brw_set_src1(struct brw_compile *p, if (reg.file == BRW_IMMEDIATE_VALUE) { insn-bits3.ud = reg.dw1.ud; - } - else { + } else { /* This is a hardware restriction, which may or may not be lifted * in the future: */ @@ -486,8 +475,7 @@ brw_set_src1(struct brw_compile *p, if (insn-header.access_mode == BRW_ALIGN_1) { insn-bits3.da1.src1_subreg_nr = reg.subnr; insn-bits3.da1.src1_reg_nr = reg.nr; - } - else { + } else { insn-bits3.da16.src1_subreg_nr = reg.subnr / 16; insn-bits3.da16.src1_reg_nr = reg.nr; } @@ -498,14 +486,12 @@ brw_set_src1(struct brw_compile *p, insn-bits3.da1.src1_horiz_stride = BRW_HORIZONTAL_STRIDE_0; insn-bits3.da1.src1_width = BRW_WIDTH_1; insn-bits3.da1.src1_vert_stride = BRW_VERTICAL_STRIDE_0; -} -else { +} else { insn-bits3.da1.src1_horiz_stride = reg.hstride; 
insn-bits3.da1.src1_width = reg.width; insn-bits3.da1.src1_vert_stride = reg.vstride;
[Mesa-dev] [Bug 79706] New: [TRACKER] Mesa regression tracker
https://bugs.freedesktop.org/show_bug.cgi?id=79706

          Priority: medium
            Bug ID: 79706
          Assignee: mesa-dev@lists.freedesktop.org
           Summary: [TRACKER] Mesa regression tracker
          Severity: normal
    Classification: Unclassified
                OS: All
          Reporter: kenn...@whitecape.org
          Hardware: Other
            Status: NEW
           Version: unspecified
         Component: Other
           Product: Mesa

This is a tracker bug for long-standing Mesa regressions.

-- 
You are receiving this mail because:
You are the assignee for the bug.
[Mesa-dev] [Bug 79039] [TRACKER] Mesa 10.2 release tracker
https://bugs.freedesktop.org/show_bug.cgi?id=79039

Kenneth Graunke kenn...@whitecape.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Depends on|44519                       |

-- 
You are receiving this mail because:
You are the assignee for the bug.
[Mesa-dev] [Bug 44519] translate_test generic regression
https://bugs.freedesktop.org/show_bug.cgi?id=44519

Kenneth Graunke kenn...@whitecape.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Blocks|79039                       |79706

-- 
You are receiving this mail because:
You are the assignee for the bug.
[Mesa-dev] [Bug 79706] [TRACKER] Mesa regression tracker
https://bugs.freedesktop.org/show_bug.cgi?id=79706

Kenneth Graunke kenn...@whitecape.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Depends on|                            |44519

-- 
You are receiving this mail because:
You are the assignee for the bug.
[Mesa-dev] [Bug 45348] [swrast] piglit fbo-drawbuffers-arbfp regression
https://bugs.freedesktop.org/show_bug.cgi?id=45348

Kenneth Graunke kenn...@whitecape.org changed:

What   |Removed |Added
Blocks |79039   |79706
[Mesa-dev] [Bug 79039] [TRACKER] Mesa 10.2 release tracker
https://bugs.freedesktop.org/show_bug.cgi?id=79039

Kenneth Graunke kenn...@whitecape.org changed:

What       |Removed |Added
Depends on |45348   |
[Mesa-dev] [Bug 79706] [TRACKER] Mesa regression tracker
https://bugs.freedesktop.org/show_bug.cgi?id=79706

Kenneth Graunke kenn...@whitecape.org changed:

What       |Removed |Added
Depends on |        |45348
[Mesa-dev] [Bug 79039] [TRACKER] Mesa 10.2 release tracker
https://bugs.freedesktop.org/show_bug.cgi?id=79039

Kenneth Graunke kenn...@whitecape.org changed:

What       |Removed |Added
Depends on |49713   |
[Mesa-dev] [Bug 79706] [TRACKER] Mesa regression tracker
https://bugs.freedesktop.org/show_bug.cgi?id=79706

Kenneth Graunke kenn...@whitecape.org changed:

What       |Removed |Added
Depends on |        |49713
[Mesa-dev] [Bug 61153] [softpipe] piglit interpolation-noperspective-gl_BackColor-flat-vertex regression
https://bugs.freedesktop.org/show_bug.cgi?id=61153

Kenneth Graunke kenn...@whitecape.org changed:

What   |Removed |Added
Blocks |79039   |79706
[Mesa-dev] [Bug 79706] [TRACKER] Mesa regression tracker
https://bugs.freedesktop.org/show_bug.cgi?id=79706

Kenneth Graunke kenn...@whitecape.org changed:

What       |Removed |Added
Depends on |        |61153
[Mesa-dev] [Bug 79039] [TRACKER] Mesa 10.2 release tracker
https://bugs.freedesktop.org/show_bug.cgi?id=79039

Kenneth Graunke kenn...@whitecape.org changed:

What       |Removed |Added
Depends on |59777   |
[Mesa-dev] [Bug 79706] [TRACKER] Mesa regression tracker
https://bugs.freedesktop.org/show_bug.cgi?id=79706

Kenneth Graunke kenn...@whitecape.org changed:

What       |Removed |Added
Depends on |        |59777
[Mesa-dev] [Bug 61326] [softpipe] piglit interpolation-noperspective-gl_BackSecondaryColor-flat-vertex regression
https://bugs.freedesktop.org/show_bug.cgi?id=61326

Kenneth Graunke kenn...@whitecape.org changed:

What   |Removed |Added
Blocks |79039   |79706
[Mesa-dev] [Bug 79706] [TRACKER] Mesa regression tracker
https://bugs.freedesktop.org/show_bug.cgi?id=79706

Kenneth Graunke kenn...@whitecape.org changed:

What       |Removed |Added
Depends on |        |61326
[Mesa-dev] [Bug 79039] [TRACKER] Mesa 10.2 release tracker
https://bugs.freedesktop.org/show_bug.cgi?id=79039

Kenneth Graunke kenn...@whitecape.org changed:

What       |Removed |Added
Depends on |61326   |
[Mesa-dev] [Bug 59777] [softpipe] piglit interpolation-noperspective-gl_BackColor-flat-distance regression
https://bugs.freedesktop.org/show_bug.cgi?id=59777

Kenneth Graunke kenn...@whitecape.org changed:

What   |Removed |Added
Blocks |79039   |79706
[Mesa-dev] [Bug 79039] [TRACKER] Mesa 10.2 release tracker
https://bugs.freedesktop.org/show_bug.cgi?id=79039

--- Comment #1 from Kenneth Graunke kenn...@whitecape.org ---
To be clear, this bug should track regressions from 10.1 to 10.2, so that we can avoid introducing regressions in the upcoming release.

I've created bug 79706 for other, long-standing regressions, so that we don't lose those.
[Mesa-dev] [Bug 79039] [TRACKER] Mesa 10.2 release tracker
https://bugs.freedesktop.org/show_bug.cgi?id=79039

Kenneth Graunke kenn...@whitecape.org changed:

What       |Removed |Added
Depends on |78691   |
Re: [Mesa-dev] [PATCH] mesa: Fix substitution of large shaders
On 06/05/2014 10:47 AM, Cody Northrop wrote:
> The fixed size is insufficient for shaders I'm debugging. Rather than
> just bump it up, make it dynamic.
>
> Thanks,
>
> -C
>
> Signed-off-by: Cody Northrop c...@lunarg.com
> ---
>  src/mesa/main/shaderapi.c | 14 +++---
>  1 file changed, 11 insertions(+), 3 deletions(-)
>
> diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
> index 6f84acd..e63c124 100644
> --- a/src/mesa/main/shaderapi.c
> +++ b/src/mesa/main/shaderapi.c
> @@ -1392,7 +1392,7 @@ _mesa_LinkProgram(GLhandleARB programObj)
>  static GLcharARB *
>  read_shader(const char *fname)
>  {
> -   const int max = 50*1000;
> +   int shader_size = 0;
>     FILE *f = fopen(fname, "r");
>     GLcharARB *buffer, *shader;
>     int len;
> @@ -1401,8 +1401,16 @@ read_shader(const char *fname)
>        return NULL;
>     }
>
> -   buffer = malloc(max);
> -   len = fread(buffer, 1, max, f);
> +   /* allocate enough room for the entire shader */
> +   fseek(f, 0, SEEK_END);
> +   shader_size = ftell(f);
> +   rewind(f);
> +   assert(shader_size);
> +
> +   buffer = malloc(shader_size);

Do you have to add one for the terminating zero?

> +   assert(buffer);
> +
> +   len = fread(buffer, 1, shader_size, f);
>     buffer[len] = 0;
>
>     fclose(f);
> --

I thought I was the only person who ever used this code!

Other than the one question above this looks alright.

Reviewed-by: Brian Paul bri...@vmware.com
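On the terminating-zero question: the patch stores buffer[len] = 0, and len can equal shader_size when fread() returns the whole file, so that store lands one byte past a malloc(shader_size) allocation. Below is a minimal sketch of the same read with the extra byte reserved. It is not the Mesa function: read_text_file is a hypothetical name, and returning NULL on failure instead of asserting is a choice made for the example.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not the Mesa read_shader): reads a whole file into
 * a freshly allocated, NUL-terminated buffer. Returns NULL on any failure;
 * the caller frees the result. */
static char *
read_text_file(const char *fname)
{
   FILE *f = fopen(fname, "r");
   char *buffer;
   long size;
   size_t len;

   if (!f)
      return NULL;

   /* Measure the file. ftell() returns -1 on error. */
   if (fseek(f, 0, SEEK_END) != 0 || (size = ftell(f)) < 0) {
      fclose(f);
      return NULL;
   }
   rewind(f);

   /* One extra byte so that buffer[len] stays in bounds when len == size. */
   buffer = malloc((size_t)size + 1);
   if (buffer) {
      /* fread() may return fewer bytes than requested; terminate at len. */
      len = fread(buffer, 1, (size_t)size, f);
      buffer[len] = '\0';
   }

   fclose(f);
   return buffer;
}
```

Reserving size + 1 bytes also makes an empty file a valid input (a one-byte buffer holding only the terminator), so the sketch does not need a non-zero-size check.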
[Mesa-dev] [Bug 79629] [dri3] piglit glx_GLX_ARB_create_context_current_with_no_framebuffer fails
https://bugs.freedesktop.org/show_bug.cgi?id=79629

lu hua huax...@intel.com changed:

What   |Removed  |Added
Status |NEEDINFO |NEW
CC     |         |huax...@intel.com

--- Comment #3 from lu hua huax...@intel.com ---
With xorg.conf set as below, it works well.

Section "Device"
    Identifier "Intel"
    Option "DRI" "2"
EndSection
[Mesa-dev] [Bug 79688] [dri3] Latest git breaks PRIME Offloading to Nouveau GPU
https://bugs.freedesktop.org/show_bug.cgi?id=79688

--- Comment #1 from Axel Davy veb...@hotmail.fr ---
This is due to the Mesa DRI3 code not taking care of the DRI_PRIME env var.

As a temporary fix, the user can set LIBGL_DRI3_DISABLE in addition to DRI_PRIME when wanting to use the secondary card.

A temporary patch could be merged to not try the DRI3 path when DRI_PRIME is set.

The complete fix should be DRI3 DRI_PRIME support.
http://lists.freedesktop.org/archives/mesa-dev/2014-May/060131.html

I haven't had time yet to rewrite the gallium dri3 code and to rebase the patches, but they should be ready soon.
[Mesa-dev] [Bug 79711] New: Crash bug still exists in glx libs, 2 years after I sent you a patch to fix it.
https://bugs.freedesktop.org/show_bug.cgi?id=79711

Priority: medium
Bug ID: 79711
Assignee: mesa-dev@lists.freedesktop.org
Summary: Crash bug still exists in glx libs, 2 years after I sent you a patch to fix it.
Severity: critical
Classification: Unclassified
OS: Linux (All)
Reporter: danm...@gmail.com
Hardware: x86-64 (AMD64)
Status: NEW
Version: 10.1
Component: GLX
Product: Mesa

Observe the saga of https://bugs.freedesktop.org/show_bug.cgi?id=54372

Mesa libglx has a NULL pointer dereference bug, discovered by app developer.
App developer debugs into libglx, understands problem, prepares patch.
App developer sends patch to fix NULL pointer dereference to debian x-strike-force, and freedesktop.org.
Nobody at debian or freedesktop.org ever fixes libglx, nobody ever even responds to the patch.
App developer pings every 3 months for a year, before giving up.
A year after giving up, App developer tries latest mesa glx, finds the same TWO YEAR OLD bug still exists.

Seriously? WTF? How many years until you apply the patch?
[Mesa-dev] [Bug 54372] GLX_INTEL_swap_event crashes driver when swapping window buffers
https://bugs.freedesktop.org/show_bug.cgi?id=54372

Alan Coopersmith alan.coopersm...@oracle.com changed:

What       |Removed |Added
QA Contact |        |mesa-dev@lists.freedesktop.org
[Mesa-dev] [Bug 79706] [TRACKER] Mesa regression tracker
https://bugs.freedesktop.org/show_bug.cgi?id=79706

Vinson Lee v...@freedesktop.org changed:

What       |Removed |Added
Depends on |        |79098
[Mesa-dev] [Bug 79098] x86/common_x86.c:51:19: error: cpuid.h: No such file or directory
https://bugs.freedesktop.org/show_bug.cgi?id=79098

Vinson Lee v...@freedesktop.org changed:

What   |Removed |Added
Blocks |        |79706
Re: [Mesa-dev] [PATCH 2/6] glsl: Rebalance expression trees that are reduction operations.
On Tue, Mar 11, 2014 at 3:49 PM, Eric Anholt e...@anholt.net wrote:
> Matt Turner matts...@gmail.com writes:
>
>> The intention of this pass was to give us better instruction scheduling
>> opportunities, but it unexpectedly reduced some instruction counts as
>> well:
>>
>> total instructions in shared programs: 139 -> 1666073 (-0.03%)
>> instructions in affected programs:     54612 -> 54046 (-1.04%)
>>
>> (and trades 4 SIMD16 programs in SS3)
>
> Patches 1, 3, 4, 6 are:
>
> Reviewed-by: Eric Anholt e...@anholt.net
>
> I got lost on this one, though...
>
>> diff --git a/src/glsl/opt_rebalance_tree.cpp b/src/glsl/opt_rebalance_tree.cpp
>> new file mode 100644
>> index 000..91aa999
>> --- /dev/null
>> +++ b/src/glsl/opt_rebalance_tree.cpp
>
>> +/**
>> + * \file opt_rebalance_tree.cpp
>> + *
>> + * Rebalances a reduction expression tree.
>> + *
>> + * For reduction operations (e.g., x + y + z + w) we generate an expression
>> + * tree like
>> + *
>> + *        +
>> + *       / \
>> + *      +   w
>> + *     / \
>> + *    +   z
>> + *   / \
>> + *  x   y
>> + *
>> + * which we can rebalance into
>> + *
>> + *       +
>> + *      / \
>> + *     /   \
>> + *    +     +
>> + *   / \   / \
>> + *  x   y z   w
>> + *
>> + * to get a better instruction scheduling.
>> + *
>> + * See "Tree Rebalancing in Optimal Editor Time and Space" by Quentin F. Stout
>> + * and Bette L. Warren.
>> + */
>> +
>> +#include "ir.h"
>> +#include "ir_visitor.h"
>> +#include "ir_rvalue_visitor.h"
>> +#include "ir_optimization.h"
>> +
>> +/* The DSW algorithm generates a degenerate tree (really, a linked list) in
>> + * tree_to_vine(). We'd rather not leave a binary expression with only one
>> + * operand, so trivial modifications (the ternary operators below) are needed
>> + * to ensure that we only rotate around the ir_expression nodes of the tree.
>> + */
>
> Why do we care about having NULL remainder.left briefly, if we're
> definitely going to be rebalancing back?

It enable[sd] a lot easier debugging by allowing you to ir->print(), which isn't possible with a degenerate tree.
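For readers following the review, here is a minimal sketch of the DSW idea on a plain binary tree in C: flatten the tree into a right-leaning vine with right rotations, then fold the vine back up with rounds of left rotations. It is an illustration under stated assumptions, not the patch's code. The patch rotates only around ir_expression nodes so that operands stay leaves, whereas this sketch treats every node alike. The struct node type, the pseudo-root convention, and the compress(), rebalance(), and height() helpers are hypothetical names for the example, and vine_to_tree() follows the four backbone steps quoted later in this message.

```c
#include <stddef.h>

/* Plain binary tree node; 'key' stands in for whatever the tree stores. */
struct node {
   int key;
   struct node *left, *right;
};

/* Flatten the tree hanging off pseudo_root->right into a right-leaning
 * vine (a linked list through ->right) using right rotations. Returns the
 * number of nodes on the vine. */
static unsigned
tree_to_vine(struct node *pseudo_root)
{
   unsigned size = 0;
   struct node *tail = pseudo_root;
   struct node *rest = pseudo_root->right;

   while (rest != NULL) {
      if (rest->left == NULL) {
         /* Nothing to rotate here: move down the vine. */
         tail = rest;
         rest = rest->right;
         size++;
      } else {
         /* Rotate right around 'rest'. */
         struct node *tmp = rest->left;
         rest->left = tmp->right;
         tmp->right = rest;
         rest = tmp;
         tail->right = tmp;
      }
   }
   return size;
}

/* Perform 'count' left rotations down the backbone; each one moves a
 * backbone node into a left subtree, shortening the backbone by one. */
static void
compress(struct node *pseudo_root, unsigned count)
{
   struct node *scanner = pseudo_root;

   for (unsigned i = 0; i < count; i++) {
      struct node *child = scanner->right;
      scanner->right = child->right;
      scanner = scanner->right;
      child->right = scanner->left;
      scanner->left = child;
   }
}

/* Fold a vine of 'size' nodes back into a tree. */
static void
vine_to_tree(struct node *pseudo_root, unsigned size)
{
   unsigned n = size;

   while (n > 0) {
      n -= 1;                    /* 1. reduce the backbone length by 1 */
      unsigned m = n / 2;        /* 2. halve it, rounding down         */
      if (m == 0)                /* 3. no transformations left: exit   */
         break;
      compress(pseudo_root, m);
      n -= m;                    /* backbone is now m nodes shorter    */
   }                             /* 4. return to 1                     */
}

/* Rebalance the tree stored at pseudo_root->right in place. */
static void
rebalance(struct node *pseudo_root)
{
   vine_to_tree(pseudo_root, tree_to_vine(pseudo_root));
}

/* Number of nodes on the longest root-to-leaf path. */
static unsigned
height(const struct node *n)
{
   if (n == NULL)
      return 0;
   unsigned l = height(n->left);
   unsigned r = height(n->right);
   return 1 + (l > r ? l : r);
}
```

To use it, hang the tree off the right pointer of a dummy struct node and call rebalance() on the dummy. For a left-leaning chain of seven nodes this reduces the height from 7 to 3. That is the same effect the file comment describes for x + y + z + w, where the chain of three dependent additions becomes two levels.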
>> +static unsigned
>> +tree_to_vine(ir_expression *root)
>> +{
>> +   unsigned size = 0;
>> +   ir_rvalue *vine_tail = root;
>> +   ir_rvalue *remainder = root->operands[1];
>> +
>> +   while (remainder != NULL) {
>> +      ir_expression *remainder_left = remainder->as_expression() ?
>> +         remainder->as_expression()->operands[0]->as_expression() : NULL;
>
> A remainder_expr = remainder->as_expression(); temp would have kept me
> from misreading this function a couple of times.

Sure.

>> +
>> +      if (remainder_left == NULL) {
>> +         /* move vine_tail down one */
>> +         vine_tail = remainder;
>> +         remainder = remainder->as_expression() ?
>> +            remainder->as_expression()->operands[1] : NULL;
>> +         size++;
>> +      } else {
>> +         /* rotate */
>> +         ir_expression *tempptr = remainder_left;
>> +         remainder->as_expression()->operands[0] = tempptr->operands[1];
>> +         tempptr->operands[1] = remainder;
>> +         remainder = tempptr;
>> +         vine_tail->as_expression()->operands[1] = tempptr;
>> +      }
>> +   }
>> +
>> +   return size;
>> +}
>
>> +static void
>> +compression(ir_expression *root, unsigned count)
>> +{
>> +   ir_expression *scanner = root;
>> +
>> +   for (unsigned i = 0; i < count; i++) {
>> +      ir_expression *child = scanner->operands[1]->as_expression();
>> +      scanner->operands[1] = child->operands[1];
>> +      scanner = scanner->operands[1]->as_expression();
>> +      child->operands[1] = scanner->operands[0];
>> +      scanner->operands[0] = child;
>> +   }
>> +}
>> +
>> +static void
>> +vine_to_tree(ir_expression *root, unsigned size)
>> +{
>> +   int n = size - 1;
>> +   for (int m = n / 2; m > 0; m = n / 2) {
>> +      compression(root, m);
>> +      n -= m + 1;
>> +   }
>> +}
>
> These two functions need some comments.

I'm not sure what those needed comments are. I don't really know what I can meaningfully say, short of giving an excerpt of the paper or a citation.

I could never get an implementation of vine_to_tree that matches the paper to work properly, so I ultimately went with an implementation described here:

http://penguin.ewu.edu/~trolfe/DSWpaper/

which actually provides a pretty good explanation (not sure I noticed it before):

   1. Reduce the length of the backbone by 1
   2. Divide the length of the backbone by 2 [rounding down if the
      length is not even] to find the number of transformations, m.
   3. If m is zero, exit; otherwise perform m transformations on the
      backbone.
   4. Return to 1.

Including that link in a comment definitely seems like a good idea.

>> +
>> +namespace {
>> +
>> +class ir_rebalance_visitor : public ir_rvalue_enter_visitor {
>> +public:
>> +   ir_rebalance_visitor()
>> +   {
>> +      progress = false;
>> +   }
>> +
>> +   void handle_rvalue(ir_rvalue **rvalue);
>> +
>> +   bool progress;
>> +};
>> +
>> +struct is_reduction_data {
>> +   ir_expression_operation operation;
>> +   const glsl_type *type;
>> +   unsigned num_expr;
>> +   bool is_reduction;
>> +   bool contains_constant;
>> +};
>> +
>> +} /* anonymous namespace */
>> +
>> +static bool
>> +is_reduction_operation(ir_expression_operation operation)
>> +{
>> +   switch (operation)