Re: [Mesa-dev] [PATCH 00/17] i965/vs: Generalize VS compiler back-end in preparation for GS.
Reviewed-by: Jordan Justen jordan.l.jus...@intel.com

On Sun, Apr 7, 2013 at 3:53 PM, Paul Berry stereotype...@gmail.com wrote:
> This patch series lays the groundwork for the i965 geometry shader
> back-end by separating the functions and data structures which are
> specific to vertex shaders from those that can also be used to compile
> geometry shaders. (Following a naming convention that is already
> present in the codebase, this common code is referred to as vec4 code,
> since in the future we should be able to use it for any shader stage
> where the hardware expects a pair of vec4s to be stored in each
> register. This includes tessellation control and tessellation
> evaluation shaders.)
>
> In particular, the following structs/classes have been split into a
> base class containing vec4-generic data, and a derived class containing
> VS-specific data:
>
> - brw_vs_compile (new base struct is brw_vec4_compile)
> - brw_vs_prog_key (new base struct is brw_vec4_prog_key)
> - brw_vs_prog_data (new base struct is brw_vec4_prog_data)
> - vec4_visitor (new derived class is vec4_vs_visitor)
>
> In the case of vec4_visitor, standard C++ inheritance is used, and
> VS-specific behaviours are moved into virtual functions. The other
> three cases use C-style inheritance (the derived struct has an explicit
> base element, and there are no virtual functions), since these structs
> need to be accessible from plain C code.
>
> In addition, small modifications have been made to the vec4_generator
> class and the brw_compute_vue_map() function to generalize them for use
> by both vertex and geometry shaders.
>
> To keep merge conflicts to a minimum (since this patch series has been
> in development for several weeks), I've tried to minimize the amount of
> code motion introduced by this change. As a result, the patch series
> leaves vec4_vs_visitor functions scattered in several files
> (brw_vec4.cpp, brw_vec4_visitor.cpp, and brw_vec4_vp.cpp). Once this
> series lands, I'd like to follow up with a patch series that moves all
> of the vec4_vs_visitor functions to a new brw_vec4_vs_visitor.cpp file,
> and moves the class declaration to a corresponding header. I'll wait
> until this patch series has landed before starting on that, and try not
> to do it while anyone is in the middle of major VS back-end work.
>
> Note that this patch series must be applied atop the patch
> i965/vs: Fix DEBUG_SHADER_TIME when VS terminates with 2 URB writes,
> which I sent out for review this morning. You can find the complete
> series in context at branch gs-backend-prep of
> git://github.com/stereotype441/mesa.git.
>
> In order to verify that it's actually possible to build geometry shader
> functionality atop these changes, I have begun prototyping an
> implementation of geometry shaders which passes all existing geometry
> shader Piglit tests. (Sadly, there are many more tests left to write,
> and features left to implement!) I'll be sending out those patches in
> the coming month(s), as they mature. You can find that series in branch
> gs of git://github.com/stereotype441/mesa.git. Note that the gs branch
> is *highly volatile*, so if you want to base work on it, please let me
> know so we can coordinate.
>
> Piglit-tested on i965/Gen7--no regressions.

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 4/4] radeonsi: Handle new format for configuration values emitted by the LLVM backend
On Fri, 2013-04-05 at 14:54 -0400, Tom Stellard wrote:
> From: Tom Stellard thomas.stell...@amd.com
>
> Instead of emitting configuration values (e.g. number of gprs used)
> in a predefined order, the LLVM backend now emits these values in
> register/value pairs. The first dword contains the register address and
> the second dword contains the value to write.
> ---
>  src/gallium/drivers/radeonsi/radeonsi_shader.c | 26 +++--
>  1 file changed, 23 insertions(+), 3 deletions(-)
>
> diff --git a/src/gallium/drivers/radeonsi/radeonsi_shader.c b/src/gallium/drivers/radeonsi/radeonsi_shader.c
> index 0aeecc2..78c1cf4 100644
> --- a/src/gallium/drivers/radeonsi/radeonsi_shader.c
> +++ b/src/gallium/drivers/radeonsi/radeonsi_shader.c
> @@ -1175,9 +1175,29 @@ int si_pipe_shader_create(
>  		}
>  	}
>
> -	shader->num_sgprs = util_le32_to_cpu(*(uint32_t*)binary.config);
> -	shader->num_vgprs = util_le32_to_cpu(*(uint32_t*)(binary.config + 4));
> -	shader->spi_ps_input_ena = util_le32_to_cpu(*(uint32_t*)(binary.config + 8));
> +	/* XXX: We may be able to emit some of these values directly rather than
> +	 * extracting fields to be emitted later.
> +	 */
> +	for (i = 0; i < binary.config_size; i+= 8) {
> +		unsigned reg = util_le32_to_cpu(*(uint32_t*)(binary.config + i));
> +		unsigned value = util_le32_to_cpu(*(uint32_t*)(binary.config + i + 4));
> +		switch (reg) {
> +		case R_00B028_SPI_SHADER_PGM_RSRC1_PS:
> +		case R_00B128_SPI_SHADER_PGM_RSRC1_VS:
> +		case R_00B228_SPI_SHADER_PGM_RSRC1_GS:
> +		case R_00B848_COMPUTE_PGM_RSRC1:
> +			shader->num_sgprs = (G_00B028_SGPRS(value) * 8) + 1;
> +			shader->num_vgprs = (G_00B028_VGPRS(value) * 4) + 1;

This results in the correct values being written to the registers, but I
think something like

			shader->num_sgprs = (G_00B028_SGPRS(value) + 1) * 8;
			shader->num_vgprs = (G_00B028_VGPRS(value) + 1) * 4;

makes clearer how many GPRs are allocated in the hardware.

--
Earthling Michel Dänzer | http://www.amd.com
Libre software enthusiast | Debian, X and DRI developer
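A minimal sketch of why both decodings discussed above can end up programming the same register field. It assumes the driver later re-encodes the counts as (num_gprs - 1) / granule using integer division; that re-encode step and all helper names are assumptions made for illustration and are not taken from the patch.

```c
#include <assert.h>

/* Hypothetical helpers: 'field' stands for the raw SGPRS/VGPRS value read
 * from the RSRC1 register, 'granule' is 8 for SGPRs and 4 for VGPRs. */

/* Decoding used in the patch: (field * granule) + 1. */
static unsigned decode_as_patch(unsigned field, unsigned granule)
{
    return field * granule + 1;
}

/* Decoding suggested in the review: (field + 1) * granule. */
static unsigned decode_as_review(unsigned field, unsigned granule)
{
    return (field + 1) * granule;
}

/* Assumed re-encode step used when the register is written later. */
static unsigned encode_field(unsigned num_gprs, unsigned granule)
{
    assert(num_gprs > 0);
    return (num_gprs - 1) / granule;
}
```

For a field value of 3 and a granule of 8, the first form yields 25 and the second 32. Under the assumed re-encode both map back to 3, which would be consistent with the remark that the written values are correct either way while the second form states the allocated count more directly.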
Re: [Mesa-dev] [PATCH 2/2] radeonsi: add support for compressed texture
On Fri, 2013-04-05 at 17:36 -0400, j.gli...@gmail.com wrote:
> From: Jerome Glisse jgli...@redhat.com
>
> Most tests pass; the issues are with border color and swizzle.

FWIW, those issues are there with non-compressed formats as well. I'm
afraid we might need to change the hardware border colour depending on
the swizzle.

> diff --git a/src/gallium/drivers/radeonsi/si_state.c b/src/gallium/drivers/radeonsi/si_state.c
> index ca9e8b4..d968b95 100644
> --- a/src/gallium/drivers/radeonsi/si_state.c
> +++ b/src/gallium/drivers/radeonsi/si_state.c
> @@ -1206,6 +1209,51 @@ static uint32_t si_translate_texformat(struct pipe_screen *screen,
>  	}
>
>  	/* TODO compressed formats */

Remove this comment?

> @@ -1541,67 +1589,16 @@ boolean si_is_format_supported(struct pipe_screen *screen,
>  	return retval == usage;
>  }
>
> -static unsigned si_tile_mode_index(struct r600_resource_texture *rtex, unsigned level)
> -{
> -	if (util_format_is_depth_or_stencil(rtex->real_format)) {
> -		if (rtex->surface.level[level].mode == RADEON_SURF_MODE_1D) {
> -			return 4;
> -		} else if (rtex->surface.level[level].mode == RADEON_SURF_MODE_2D) {
> -			switch (rtex->real_format) {
> -			case PIPE_FORMAT_Z16_UNORM:
> -				return 5;
> -			case PIPE_FORMAT_S8_UINT_Z24_UNORM:
> -			case PIPE_FORMAT_X8Z24_UNORM:
> -			case PIPE_FORMAT_Z24X8_UNORM:
> -			case PIPE_FORMAT_Z24_UNORM_S8_UINT:
> -			case PIPE_FORMAT_Z32_FLOAT:
> -			case PIPE_FORMAT_Z32_FLOAT_S8X24_UINT:
> -				return 6;
> -			default:
> -				return 7;
> -			}
> -		}
> -	}
> +static unsigned si_tile_mode_index(struct r600_resource_texture *rtex, unsigned level, bool stencil)
> +{
> +	unsigned tile_mode_index = 0;
>
> -	switch (rtex->surface.level[level].mode) {
> -	default:
> -		assert(!"Invalid surface mode");
> -		/* Fall through */
> -	case RADEON_SURF_MODE_LINEAR_ALIGNED:
> -		return 8;
> -	case RADEON_SURF_MODE_1D:
> -		if (rtex->surface.flags & RADEON_SURF_SCANOUT)
> -			return 9;
> -		else
> -			return 13;
> -	case RADEON_SURF_MODE_2D:
> -		if (rtex->surface.flags & RADEON_SURF_SCANOUT) {
> -			switch (util_format_get_blocksize(rtex->real_format)) {
> -			case 1:
> -				return 10;
> -			case 2:
> -				return 11;
> -			default:
> -				assert(!"Invalid block size");
> -				/* Fall through */
> -			case 4:
> -				return 12;
> -			}
> -		} else {
> -			switch (util_format_get_blocksize(rtex->real_format)) {
> -			case 1:
> -				return 14;
> -			case 2:
> -				return 15;
> -			case 4:
> -				return 16;
> -			case 8:
> -				return 17;
> -			default:
> -				return 13;
> -			}
> -		}
> +	if (stencil) {
> +		tile_mode_index = rtex->surface.stencil_tiling_index[level];
> +	} else {
> +		tile_mode_index = rtex->surface.tiling_index[level];
>  	}
> +	return tile_mode_index;
>  }
>
>  /*
> @@ -1638,7 +1635,7 @@ static void si_cb(struct r600_context *rctx, struct si_pm4_state *pm4,
>  		slice = slice - 1;
>  	}
>
> -	tile_mode_index = si_tile_mode_index(rtex, level);
> +	tile_mode_index = si_tile_mode_index(rtex, level, false);
>  	desc = util_format_description(surf->base.format);
>  	for (i = 0; i < 4; i++) {
> @@ -1780,15 +1777,9 @@ static void si_db(struct r600_context *rctx, struct si_pm4_state *pm4,
>  	else
>  		s_info = S_028044_FORMAT(V_028044_STENCIL_INVALID);
>
> -	tile_mode_index = si_tile_mode_index(rtex, level);
> -	if (tile_mode_index < 4 || tile_mode_index > 7) {
> -		R600_ERR("Invalid DB tiling mode %d!\n",
> -			 rtex->surface.level[level].mode);
> -		si_pm4_set_reg(pm4, R_028040_DB_Z_INFO, S_028040_FORMAT(V_028040_Z_INVALID));
> -		si_pm4_set_reg(pm4, R_028044_DB_STENCIL_INFO, S_028044_FORMAT(V_028044_STENCIL_INVALID));
> -		return;
> -	}
> +	tile_mode_index = si_tile_mode_index(rtex, level, false);
>  	z_info |= S_028040_TILE_MODE_INDEX(tile_mode_index);
> +
Re: [Mesa-dev] [PATCH 2/2] radeonsi: add support for compressed texture
On Mon, Apr 8, 2013 at 11:29 AM, Michel Dänzer mic...@daenzer.net wrote:
> On Fri, 2013-04-05 at 17:36 -0400, j.gli...@gmail.com wrote:
> > From: Jerome Glisse jgli...@redhat.com
> >
> > Most tests pass; the issues are with border color and swizzle.
>
> FWIW, those issues are there with non-compressed formats as well. I'm
> afraid we might need to change the hardware border colour depending on
> the swizzle.

I don't think so. The issue with the swizzled border color seems to be a
bad hardware design decision present since r600 rather than a hardware
bug. I tried fixing it for older chipsets with no success. I doubt the
hw designers fixed this for SI.

The problem is the hardware tries to guess what the border color swizzle
is from the combined pipe_format+sampler view swizzle combination. You
need 2 texture swizzle states in the texture unit for the border color
to be swizzled correctly, because texels must be swizzled by the
pipe_format swizzle and sampler view swizzle, but the border color must
be swizzled by the sampler view only.

The main problem is that the hardware internally tries to undo the
pipe_format swizzle in a way that just doesn't work. I don't remember
the exact swizzles being used by hardware, but I got crazy cases like if
I set texture swizzle to ywzx, the border color will be ywyy. There is
no way to access those zx components of the border color for that
specific swizzling.

For some cases, the hardware succeeds in guessing what the border color
should be, e.g. if I set texture swizzle to .zyxw, the returned border
color will be .xyzw (and that would be correct if the swizzle came from
pipe_format, and incorrect if the swizzle came from sampler view).

It was easy with r300, because I could just undo pipe_format swizzling
before passing the border color to the hardware.

Marek
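The r300 approach mentioned above can be modelled roughly as follows. This is a sketch under simplifying assumptions, not driver code: swizzles are treated as pure permutations of the four components (no constant 0/1 selectors), the hardware is assumed to apply the combined format+view swizzle to the border color, and every name here is hypothetical.

```c
#include <assert.h>

/* A swizzle maps each output component to a source component index (0..3). */
typedef struct { unsigned char sel[4]; } swizzle4;

/* out[i] = in[swz.sel[i]] */
static void apply_swizzle(const swizzle4 *swz, const float in[4], float out[4])
{
    for (int i = 0; i < 4; i++)
        out[i] = in[swz->sel[i]];
}

/* Combined swizzle the hardware is assumed to use: format first, then view,
 * i.e. combined.sel[i] = format.sel[view.sel[i]]. */
static swizzle4 combine_swizzles(const swizzle4 *format, const swizzle4 *view)
{
    swizzle4 c;
    for (int i = 0; i < 4; i++)
        c.sel[i] = format->sel[view->sel[i]];
    return c;
}

/* Pre-apply the inverse of the format swizzle so that the combined swizzle
 * leaves only the view swizzle in effect: out[format.sel[j]] = border[j]. */
static void undo_format_swizzle(const swizzle4 *format, const float border[4],
                                float out[4])
{
    unsigned seen = 0;
    for (int j = 0; j < 4; j++) {
        assert(format->sel[j] < 4);
        seen |= 1u << format->sel[j];
        out[format->sel[j]] = border[j];
    }
    assert(seen == 0xf); /* the format swizzle must be a permutation */
}
```

With a BGRA-style format swizzle {2, 1, 0, 3} and a view swizzle {1, 3, 2, 0}, feeding the pre-adjusted border color through the combined swizzle gives the same result as applying the view swizzle alone to the original border color. Per the message above, this trick does not carry over to r600 and later, where the hardware's own attempt to undo the format swizzle gets in the way.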
Re: [Mesa-dev] [PATCH 2/2] radeonsi: add support for compressed texture
On 08.04.2013 12:03, Marek Olšák wrote:
> On Mon, Apr 8, 2013 at 11:29 AM, Michel Dänzer mic...@daenzer.net wrote:
> > On Fri, 2013-04-05 at 17:36 -0400, j.gli...@gmail.com wrote:
> > > From: Jerome Glisse jgli...@redhat.com
> > >
> > > Most tests pass; the issues are with border color and swizzle.
> >
> > FWIW, those issues are there with non-compressed formats as well. I'm
> > afraid we might need to change the hardware border colour depending
> > on the swizzle.
>
> I don't think so. The issue with the swizzled border color seems to be
> a bad hardware design decision present since r600 rather than a
> hardware bug. I tried fixing it for older chipsets with no success. I
> doubt the hw designers fixed this for SI.
>
> The problem is the hardware tries to guess what the border color
> swizzle is from the combined pipe_format+sampler view swizzle
> combination. You need 2 texture swizzle states in the texture unit for
> the border color to be swizzled correctly, because texels must be
> swizzled by the pipe_format swizzle and sampler view swizzle, but the
> border color must be swizzled by the sampler view only.
>
> The main problem is that the hardware internally tries to undo the
> pipe_format swizzle in a way that just doesn't work. I don't remember
> the exact swizzles being used by hardware, but I got crazy cases like
> if I set texture swizzle to ywzx, the border color will be ywyy. There
> is no way to access those zx components of the border color for that
> specific swizzling.
>
> For some cases, the hardware succeeds in guessing what the border
> color should be, e.g. if I set texture swizzle to .zyxw, the returned
> border color will be .xyzw (and that would be correct if the swizzle
> came from pipe_format, and incorrect if the swizzle came from sampler
> view).
>
> It was easy with r300, because I could just undo pipe_format swizzling
> before passing the border color to the hardware.

Ah yes, border colour swizzle, it's a problem on NV, too. Because the
border colour isn't getting swizzled at all [as far as we know].

The main issue is the separation of samplers and textures in gallium; if
that wasn't the case, samplers and textures would be coupled and the
sampler state could be set according to texture view state (if it's just
OpenGL; and if it's just D3D there's no swizzle).

So, I just leave it broken. I can't destroy the elegant separation
because of such an unimportant detail, that hurts too much.

(Also, if someone was to use multiple samplers and views in gallium and
index them dynamically, I'd have to set up all combinations of textures
and samplers, which is simply ridiculous. And now I'm going to look for
some secret sampler setup bit that says swizzle according to texture
view state. Maybe looking into the future of OpenGL someone's been wise
enough to add that. But then, I'd have the same problem as you. An
intensity texture simply doesn't have separate values for R,G,B,A.)

Possible solution: Maybe the state tracker could just do the swizzling,
because it knows that samplers and views are coupled, and it knows the
swizzle?

> Marek
Re: [Mesa-dev] [PATCH 2/2] radeonsi: add support for compressed texture
Christoph,

You're talking about something entirely different. I was trying to
explain that a correct swizzled border color is *impossible* on r600 and
later chipsets. I think your hardware is actually good and can do
swizzled border color with a little bit of driver work you refuse to do.
:) You have the option, we don't.

The fact D3D doesn't have sampler swizzling actually explains a lot.

In any case, all radeon drivers should be able to pass the normal
(unswizzled) border color tests.

Marek

On Mon, Apr 8, 2013 at 1:01 PM, Christoph Bumiller e0425...@student.tuwien.ac.at wrote:
> On 08.04.2013 12:03, Marek Olšák wrote:
> > [...]
>
> Ah yes, border colour swizzle, it's a problem on NV, too. Because the
> border colour isn't getting swizzled at all [as far as we know].
>
> The main issue is the separation of samplers and textures in gallium;
> if that wasn't the case, samplers and textures would be coupled and the
> sampler state could be set according to texture view state (if it's
> just OpenGL; and if it's just D3D there's no swizzle).
>
> So, I just leave it broken. I can't destroy the elegant separation
> because of such an unimportant detail, that hurts too much.
>
> (Also, if someone was to use multiple samplers and views in gallium and
> index them dynamically, I'd have to set up all combinations of textures
> and samplers, which is simply ridiculous. And now I'm going to look for
> some secret sampler setup bit that says swizzle according to texture
> view state. Maybe looking into the future of OpenGL someone's been wise
> enough to add that. But then, I'd have the same problem as you. An
> intensity texture simply doesn't have separate values for R,G,B,A.)
>
> Possible solution: Maybe the state tracker could just do the swizzling,
> because it knows that samplers and views are coupled, and it knows the
> swizzle?
[Mesa-dev] [Bug 63269] New: explicitly symlinking libraries without libtool breaks OpenBSD build
https://bugs.freedesktop.org/show_bug.cgi?id=63269

Priority: medium
Bug ID: 63269
Assignee: mesa-dev@lists.freedesktop.org
Summary: explicitly symlinking libraries without libtool breaks OpenBSD build
Severity: normal
Classification: Unclassified
OS: OpenBSD
Reporter: j...@openbsd.org
Hardware: x86-64 (AMD64)
Status: NEW
Version: git
Component: Other
Product: Mesa

Many of the Mesa Makefiles now contain a comment like:

# Provide compatibility with scripts for the old Mesa build system for
# a while by putting a link to the driver into /lib of the build tree

It is followed by commands that explicitly assume libraries work like
they do on Linux:

../../../bin/install-sh -c -d ../../../lib
ln -f .libs/libglapi.so.0.0.0 ../../../lib/libglapi.so.0.0.0
ln: .libs/libglapi.so.0.0.0: No such file or directory

The .libs dir already contains a libglapi.so.0.0 here. There is no
symbol versioning on OpenBSD, and libraries use just so.major.minor.

Would it be possible to remove this 'compatibility with scripts' so Mesa
can build on more platforms?

--
You are receiving this mail because:
You are the assignee for the bug.
Re: [Mesa-dev] [PATCH] tgsi: Ensure struct tgsi_ind_register field Index is initialized.
On 04/06/2013 10:33 PM, Vinson Lee wrote:
> Fixes uninitialized scalar variable defect reported by Coverity.
>
> Signed-off-by: Vinson Lee v...@freedesktop.org
> ---
>  src/gallium/auxiliary/tgsi/tgsi_build.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/src/gallium/auxiliary/tgsi/tgsi_build.c b/src/gallium/auxiliary/tgsi/tgsi_build.c
> index 509bc5c..523430b 100644
> --- a/src/gallium/auxiliary/tgsi/tgsi_build.c
> +++ b/src/gallium/auxiliary/tgsi/tgsi_build.c
> @@ -835,6 +835,7 @@ tgsi_default_ind_register( void )
>     struct tgsi_ind_register ind_register;
>
>     ind_register.File = TGSI_FILE_NULL;
> +   ind_register.Index = 0;
>     ind_register.Swizzle = TGSI_SWIZZLE_X;
>     ind_register.ArrayID = 0;

Reviewed-by: Brian Paul bri...@vmware.com
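A small sketch of the pattern the fix above follows: a "default" constructor that assigns every field before returning the struct by value. The struct below is a hypothetical stand-in; the field names follow the patch, while the bit widths and the constant values are assumptions for illustration only.

```c
#include <assert.h>

/* Hypothetical stand-in for a small register-descriptor struct. */
struct ind_register_sketch {
    unsigned   File    : 4;
    signed int Index   : 16;
    unsigned   Swizzle : 2;
    unsigned   ArrayID : 10;
};

/* Illustrative constants, not the real TGSI values. */
enum { FILE_NULL_SKETCH = 0, SWIZZLE_X_SKETCH = 0 };

/* Every field is assigned before the struct is returned by value, so no
 * indeterminate bits are copied to the caller. */
static struct ind_register_sketch default_ind_register_sketch(void)
{
    struct ind_register_sketch r;

    r.File = FILE_NULL_SKETCH;
    r.Index = 0;
    r.Swizzle = SWIZZLE_X_SKETCH;
    r.ArrayID = 0;

    return r;
}
```

Leaving out any one of the four assignments would return a struct with an indeterminate field, which is the kind of defect a static analyzer reports.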
Re: [Mesa-dev] [PATCH] st/mesa: fix levels in initial texture creation
On 04/06/2013 10:31 PM, Dave Airlie wrote:
> From: Dave Airlie airl...@redhat.com
>
> calim pointed out we were getting mipmap levels for array multisamples,
> this didn't make sense.
>
> So then I noticed this function takes last_level so we are passing in
> a too high value here. I think this should fix the case he was seeing.
>
> Signed-off-by: Dave Airlie airl...@redhat.com
> ---
>  src/mesa/state_tracker/st_cb_texture.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/mesa/state_tracker/st_cb_texture.c b/src/mesa/state_tracker/st_cb_texture.c
> index 25ee352..85b5609 100644
> --- a/src/mesa/state_tracker/st_cb_texture.c
> +++ b/src/mesa/state_tracker/st_cb_texture.c
> @@ -1691,7 +1691,7 @@ st_AllocTextureStorage(struct gl_context *ctx,
>     stObj->pt = st_texture_create(st,
>                                   gl_target_to_pipe(texObj->Target),
>                                   fmt,
> -                                 levels,
> +                                 levels - 1,
>                                   ptWidth,
>                                   ptHeight,
>                                   ptDepth,

Reviewed-by: Brian Paul bri...@vmware.com
Re: [Mesa-dev] [PATCH] clover: Fix linkage of libOpenCL
On Thu, Apr 04, 2013 at 11:26:45PM +0200, Niels Ole Salscheider wrote:
> Clover needs the irreader component of llvm
>
> v2: Check for irreader component
>
> irreader is only available with LLVM 3.3 >= 177971
>
> Signed-off-by: Niels Ole Salscheider niels_...@salscheider-online.de

I've pushed this, thanks.

btw, I also pushed your libclc build fixes to my libclc repo.

-Tom

> ---
>  configure.ac | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/configure.ac b/configure.ac
> index 81d4a3f..fea5868 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -1650,6 +1650,10 @@ if test "x$enable_gallium_llvm" = xyes; then
>
>          if test "x$enable_opencl" = xyes; then
>              LLVM_COMPONENTS="${LLVM_COMPONENTS} ipo linker instrumentation"
> +            # LLVM 3.3 >= 177971 requires IRReader
> +            if $LLVM_CONFIG --components | grep -q '\<irreader\>'; then
> +                LLVM_COMPONENTS="${LLVM_COMPONENTS} irreader"
> +            fi
>          fi
>          LLVM_LDFLAGS=`$LLVM_CONFIG --ldflags`
>          LLVM_BINDIR=`$LLVM_CONFIG --bindir`
> --
> 1.8.2
Re: [Mesa-dev] [PATCH 01/17] i965/vs: Make type of vec4_visitor::vp more generic.
Paul Berry stereotype...@gmail.com writes:

> The vec4_visitor functions don't use any VS specific data from
> vec4_visitor::vp. So rename it to just "p" and change its type from
> "struct gl_vertex_program *" to "struct gl_program *". This will allow
> the code to be re-used for geometry shaders.

In many other places in the driver, "p" is a brw_compile. I'd rather not
overload that name. In a couple other cases where we've had both the
gl_shader_program and the gl_program, the shader_program becomes
"shader_prog" (only about 8 instances in brw_vec4) and gl_program gets
to be just "prog".
Re: [Mesa-dev] [PATCH 1/5] i965: Remove the BRW_NEW_INPUT_DIMENSIONS flag.
Kenneth Graunke kenn...@whitecape.org writes:

> When I removed the proj_attrib_mask optimization, I also removed the
> last consumer of this bit without realizing it. Since nobody uses it,
> there's no point in flagging it.

Series is:

Reviewed-by: Eric Anholt e...@anholt.net
Re: [Mesa-dev] [PATCH 11/17] i965/vs: move VS-specific data members to vs_vec4_visitor.
Paul Berry stereotype...@gmail.com writes:

> This patch moves the following data structures from vec4_visitor to
> vec4_vs_visitor, since they contain VS-specific data:
>
> - struct brw_vs_compile *c
> - struct brw_vs_prog_data *prog_data
> - src_reg *vp_temp_regs
> - src_reg vp_addr_reg
>
> Since brw_vs_compile and brw_vs_prog_data also contain vec4-generic
> data, the following pointers are added to the base class, to allow it
> to access the vec4-generic portions of these data structures:
>
> - struct brw_vec4_compile *vec4_compile
> - struct brw_vec4_prog_key *vec4_key
> - struct brw_vec4_prog_data *vec4_prog_data

I would lean toward the base class (which contains most of the members
and usages, I think) having the short name, and the derived class having
the more specific name.

Either way, patches 7-11 are:

Reviewed-by: Eric Anholt e...@anholt.net
Re: [Mesa-dev] [PATCH 02/17] i965: Generalize computation of VUE map in preparation for GS.
Paul Berry stereotype...@gmail.com writes:

> This patch modifies the arguments to brw_compute_vue_map() so that they
> no longer bake in the assumption that we are generating a VUE map for
> vertex shader outputs. It also makes the function non-static so that we
> can re-use it for geometry shader outputs.

Reviewed-by: Eric Anholt e...@anholt.net
Re: [Mesa-dev] [PATCH 06/17] i965/vs: split brw_vs_prog_data into generic and VS-specific parts.
Paul Berry stereotype...@gmail.com writes:

> -/* Note: brw_vs_prog_data_compare() must be updated when adding fields to this
> - * struct!
> +
> +/* Note: brw_vec4_prog_data_compare() must be updated when adding fields to
> + * this struct!
>   */
> -struct brw_vs_prog_data {
> +struct brw_vec4_prog_data {
>     struct brw_vue_map vue_map;
>     GLuint curb_read_length;
> -   GLuint urb_read_length;
>     GLuint total_grf;
>     GLuint nr_params;       /**< number of float params/constants */
>     GLuint nr_pull_params;  /**< number of dwords referenced by pull_param[] */
>     GLuint total_scratch;
> +   int num_surfaces;
> +
> +   /* These pointers must appear last.  See brw_vec4_prog_data_compare(). */
> +   const float **param;
> +   const float **pull_param;
> +};
> +
> +
> +/* Note: brw_vs_prog_data_compare() must be updated when adding fields to this
> + * struct!
> + */
> +struct brw_vs_prog_data {
> +   struct brw_vec4_prog_data base;
> +
> +   GLuint urb_read_length;

There's a URB read length in the GS state packet, so it seems like you'd
want this field in the GS case as well as VS. I'm confused. I also would
have expected urb_entry_size in GS.
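The C-style inheritance in the quoted hunk, and the "pointers must appear last" rule, can be sketched as below. This is an illustration under assumptions, not the driver's code: the struct and function names are hypothetical, the field set is reduced, and param is simplified to a flat array of floats.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the vec4-generic base struct. */
struct vec4_prog_data_sketch {
    unsigned curb_read_length;
    unsigned total_grf;
    unsigned nr_params;      /* number of floats in param[] */
    unsigned total_scratch;

    /* Pointer must appear last: the compare below stops right before it. */
    const float *param;
};

/* Hypothetical derived struct: explicit base element, no virtual functions. */
struct vs_prog_data_sketch {
    struct vec4_prog_data_sketch base;
    unsigned urb_read_length;
};

/* Compare the plain fields bytewise, then the pointed-to data by value.
 * Callers are expected to memset() the structs to zero before filling them
 * in, so that any padding bytes compare equal. */
static bool vec4_prog_data_equal(const struct vec4_prog_data_sketch *a,
                                 const struct vec4_prog_data_sketch *b)
{
    assert(a != NULL && b != NULL);
    if (memcmp(a, b, offsetof(struct vec4_prog_data_sketch, param)) != 0)
        return false;
    if (a->nr_params == 0)
        return true;
    return memcmp(a->param, b->param, a->nr_params * sizeof(float)) == 0;
}

/* The derived compare reuses the base compare, then checks its own fields. */
static bool vs_prog_data_equal(const struct vs_prog_data_sketch *a,
                               const struct vs_prog_data_sketch *b)
{
    return vec4_prog_data_equal(&a->base, &b->base) &&
           a->urb_read_length == b->urb_read_length;
}
```

Keeping the pointers at the end lets a single memcmp() cover every plain field without comparing addresses, which is presumably why the note in the struct asks for the compare function to be updated whenever fields are added.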
Re: [Mesa-dev] [PATCH 12/17] i965/vs: Generalize data structures pointed to by vec4_generator.
Paul Berry stereotype...@gmail.com writes:

> This patch removes the following field from vec4_generator, since it is
> not used:
>
> - struct brw_vs_compile *c
>
> And changes the following field:
>
> - struct gl_vertex_program *vp => struct gl_program *glprog

Same comment about prog/shader_prog as a naming solution.
Re: [Mesa-dev] [PATCH 2/2] radeonsi: add support for compressed texture
On 04/08/2013 02:03 PM, Marek Olšák wrote:
> On Mon, Apr 8, 2013 at 11:29 AM, Michel Dänzer mic...@daenzer.net wrote:
> > On Fri, 2013-04-05 at 17:36 -0400, j.gli...@gmail.com wrote:
> > > From: Jerome Glisse jgli...@redhat.com
> > >
> > > Most tests pass; the issues are with border color and swizzle.
> >
> > FWIW, those issues are there with non-compressed formats as well. I'm
> > afraid we might need to change the hardware border colour depending
> > on the swizzle.
>
> I don't think so. The issue with the swizzled border color seems to be
> a bad hardware design decision present since r600 rather than a
> hardware bug. I tried fixing it for older chipsets with no success. I
> doubt the hw designers fixed this for SI.
>
> The problem is the hardware tries to guess what the border color
> swizzle is from the combined pipe_format+sampler view swizzle
> combination. You need 2 texture swizzle states in the texture unit for
> the border color to be swizzled correctly, because texels must be
> swizzled by the pipe_format swizzle and sampler view swizzle, but the
> border color must be swizzled by the sampler view only.
>
> The main problem is that the hardware internally tries to undo the
> pipe_format swizzle in a way that just doesn't work. I don't remember
> the exact swizzles being used by hardware, but I got crazy cases like
> if I set texture swizzle to ywzx, the border color will be ywyy. There
> is no way to access those zx components of the border color for that
> specific swizzling.
>
> For some cases, the hardware succeeds in guessing what the border
> color should be, e.g. if I set texture swizzle to .zyxw, the returned
> border color will be .xyzw (and that would be correct if the swizzle
> came from pipe_format, and incorrect if the swizzle came from sampler
> view).

I also looked into this issue some time ago (on evergreen) and IIRC I
found that the swizzle is actually applied twice to border color in most
cases (at least when swizzle_y is not 2 or 3). I think it's just a bug
(or we are missing something in the hw configuration).

Anyway, according to my tests, in many cases (960 of 1296 total
swizzles, 74%) it's possible to apply some precomputed swizzle to border
color before writing it to the registers to get the correct result in
the end, but I'm not sure if it makes sense to implement that.

Vadim

> It was easy with r300, because I could just undo pipe_format swizzling
> before passing the border color to the hardware.
>
> Marek
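The figures quoted above can be checked with a few lines of arithmetic. The sketch assumes the total of 1296 comes from each of the four components independently selecting one of six sources (x, y, z, w, constant 0, constant 1); that reading is an inference, not something stated in the message, and the names are hypothetical.

```c
#include <assert.h>

/* Assumption: 4 output components, each choosing among 6 possible sources. */
enum { SWIZZLE_SOURCES = 6, SWIZZLE_COMPONENTS = 4 };

/* Number of distinct 4-component swizzles under the assumption above. */
static unsigned total_swizzles(void)
{
    unsigned total = 1;
    for (int i = 0; i < SWIZZLE_COMPONENTS; i++)
        total *= SWIZZLE_SOURCES;
    return total;
}

/* Share of swizzles that can be fixed up, as a whole percentage
 * (rounded down by integer division). */
static unsigned fixable_percent(unsigned fixable, unsigned total)
{
    assert(total > 0);
    return fixable * 100u / total;
}
```

Under that assumption the total is 6^4 = 1296, and 960 of 1296 rounds down to 74%, matching the numbers in the message.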
Re: [Mesa-dev] remove mfeatures.h, take two
Ready to commit?
[Mesa-dev] [PATCH 1/2] mesa: Update comments to match newer specs.
Old GL 1.x specs used 'b' but newer specs use 'p'. The line immediately
above the second hunk also uses 'p'.
---
 src/mesa/main/mtypes.h |    2 +-
 src/mesa/main/texobj.c |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 008f68b..3d8f359 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -1171,7 +1171,7 @@ struct gl_texture_object
    GLint MaxLevel;          /**< max mipmap level, OpenGL 1.2 */
    GLint ImmutableLevels;   /**< ES 3.0 / ARB_texture_view */
    GLint _MaxLevel;         /**< actual max mipmap level (q in the spec) */
-   GLfloat _MaxLambda;      /**< = _MaxLevel - BaseLevel (q - b in spec) */
+   GLfloat _MaxLambda;      /**< = _MaxLevel - BaseLevel (q - p in spec) */
    GLint CropRect[4];       /**< GL_OES_draw_texture */
    GLenum Swizzle[4];       /**< GL_EXT_texture_swizzle */
    GLuint _Swizzle;         /**< same as Swizzle, but SWIZZLE_* format */
diff --git a/src/mesa/main/texobj.c b/src/mesa/main/texobj.c
index 66377c8..d0fcb12 100644
--- a/src/mesa/main/texobj.c
+++ b/src/mesa/main/texobj.c
@@ -553,7 +553,7 @@ _mesa_test_texobj_completeness( const struct gl_context *ctx,
    t->_MaxLevel = MIN2(t->_MaxLevel, t->MaxLevel);
    t->_MaxLevel = MIN2(t->_MaxLevel, maxLevels - 1); /* 'q' in the GL spec */

-   /* Compute _MaxLambda = q - b (see the 1.2 spec) used during mipmapping */
+   /* Compute _MaxLambda = q - p in the spec used during mipmapping */
    t->_MaxLambda = (GLfloat) (t->_MaxLevel - baseLevel);

    if (t->Immutable) {
--
1.7.8.6
[Mesa-dev] [PATCH 2/2] mesa: Use MIN3 instead of two MIN2s.
---
 src/mesa/main/texobj.c |    9 +++++----
 1 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/src/mesa/main/texobj.c b/src/mesa/main/texobj.c
index d0fcb12..28b8130 100644
--- a/src/mesa/main/texobj.c
+++ b/src/mesa/main/texobj.c
@@ -548,10 +548,11 @@ _mesa_test_texobj_completeness( const struct gl_context *ctx,
 
    ASSERT(maxLevels > 0);
 
-   t->_MaxLevel =
-      baseLevel + baseImage->MaxNumLevels - 1; /* 'p' in the GL spec */
-   t->_MaxLevel = MIN2(t->_MaxLevel, t->MaxLevel);
-   t->_MaxLevel = MIN2(t->_MaxLevel, maxLevels - 1); /* 'q' in the GL spec */
+   t->_MaxLevel = MIN3(t->MaxLevel,
+                       /* 'p' in the GL spec */
+                       baseLevel + baseImage->MaxNumLevels - 1,
+                       /* 'q' in the GL spec */
+                       maxLevels - 1);
 
    /* Compute _MaxLambda = q - p in the spec used during mipmapping */
    t->_MaxLambda = (GLfloat) (t->_MaxLevel - baseLevel);
-- 
1.7.8.6
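To see why the single MIN3 in the patch above computes the same value as the assignment followed by two chained MIN2 calls, here is a small self-contained C sketch. The macro definitions and function names are illustrative assumptions (written in the usual ternary form), not copied from Mesa's headers; the parameters stand in for baseLevel, baseImage->MaxNumLevels, t->MaxLevel and maxLevels.

```c
/* Illustrative definitions, assumed to behave like Mesa's macros of the
 * same names; note that the arguments may be evaluated more than once. */
#define MIN2(a, b)    ((a) < (b) ? (a) : (b))
#define MIN3(a, b, c) MIN2(MIN2(a, b), c)

/* The old form: start from the last level the base image provides, then
 * clamp twice. */
static int max_level_two_steps(int base_level, int num_levels,
                               int max_level, int max_levels)
{
    int level = base_level + num_levels - 1;
    level = MIN2(level, max_level);
    level = MIN2(level, max_levels - 1);
    return level;
}

/* The new form: one three-way minimum over the same three candidates. */
static int max_level_min3(int base_level, int num_levels,
                          int max_level, int max_levels)
{
    return MIN3(max_level,
                base_level + num_levels - 1,
                max_levels - 1);
}
```

Because taking a minimum is associative and commutative, the order in which the three candidates are combined does not change the result, so the two functions agree for all inputs.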
Re: [Mesa-dev] [PATCH] i965: Use software primitive restart when transform feedback active.
Reviewed-by: Jordan Justen jordan.l.jus...@intel.com

On Sat, Apr 6, 2013 at 8:25 PM, Paul Berry stereotype...@gmail.com wrote:
> When transform feedback is active, the driver manually counts the
> number of primitives that run through the pipeline, so that if a batch
> buffer flush happens, the next batch buffer can pick up transform
> feedback where the last batch buffer left off.
>
> Hardware-accelerated primitive restart interferes with this process
> (because it makes the primitive count depend not just on the number of
> vertices entering the pipeline, but also on the contents of the index
> buffer). So, when transform feedback is active, we need to fall back
> to the software implementation of primitive restart.
>
> Fixes piglit test "spec/!OpenGL 3.1/primitive-restart-xfb flush".
>
> NOTE: This is a candidate for stable release branches.
> ---
>  src/mesa/drivers/dri/i965/brw_primitive_restart.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_primitive_restart.c b/src/mesa/drivers/dri/i965/brw_primitive_restart.c
> index e6902b4..d0f0038 100644
> --- a/src/mesa/drivers/dri/i965/brw_primitive_restart.c
> +++ b/src/mesa/drivers/dri/i965/brw_primitive_restart.c
> @@ -27,6 +27,7 @@
>
>  #include "main/imports.h"
>  #include "main/bufferobj.h"
> +#include "main/transformfeedback.h"
>
>  #include "brw_context.h"
>  #include "brw_defines.h"
> @@ -81,11 +82,18 @@ can_cut_index_handle_prims(struct gl_context *ctx,
>     struct brw_context *brw = brw_context(ctx);
>
>     if (brw->sol.counting_primitives_generated ||
> -       brw->sol.counting_primitives_written) {
> +       brw->sol.counting_primitives_written ||
> +       _mesa_is_xfb_active_and_unpaused(ctx)) {
>        /* Counting primitives generated in hardware is not currently
>         * supported, so take the software path. We need to investigate
>         * the *_PRIMITIVES_COUNT registers to allow this to be handled
>         * entirely in hardware.
> +       *
> +       * Note that when transform feedback is active, we also count primitives
> +       * (even if the client hasn't requested it), since that is the only way
> +       * we can start at the proper place in the transform feedback buffer
> +       * after a flush. So we also have to fall back to software when
> +       * transform feedback is active and unpaused.
>         */
>        return false;
>     }
> --
> 1.8.2
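The commit message above says that with primitive restart the primitive count depends on the contents of the index buffer, not just on the number of vertices. A minimal C sketch of that effect for a triangle strip follows; the function name and parameters are hypothetical and this is not the i965 driver's counting code.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Count the triangles produced by drawing `count` indices as a triangle
 * strip. Without restart, N indices always give max(N - 2, 0) triangles.
 * With restart enabled, every occurrence of `restart_index` ends the
 * current strip, so the total depends on where those markers sit in the
 * index data.
 */
static size_t count_strip_triangles(const uint32_t *indices, size_t count,
                                    uint32_t restart_index,
                                    int restart_enabled)
{
    size_t triangles = 0;
    size_t run = 0;   /* vertices seen in the current strip */

    for (size_t i = 0; i < count; i++) {
        if (restart_enabled && indices[i] == restart_index) {
            run = 0;  /* start a new strip; the marker emits no vertex */
            continue;
        }
        run++;
        if (run >= 3)
            triangles++;  /* each vertex after the second closes a triangle */
    }
    return triangles;
}
```

Two index buffers of the same length can therefore yield different primitive counts, which is why a count derived only from the vertex count is not enough once hardware restart is in use.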
Re: [Mesa-dev] [PATCH] i965/vs: Fix DEBUG_SHADER_TIME when VS terminates with 2 URB writes.
On 04/07/2013 06:42 AM, Paul Berry wrote:
> The call to emit_shader_time_end() before the second URB write was
> conditioned with "if (eot)", but eot is always false in this code
> path, so emit_shader_time_end() was never being called for vertex
> shaders that performed 2 URB writes.

I had to look at that code for way too long to convince myself that your
patch was correct. I think it might be better to remove both the
conditional emit_shader_time_end calls and put this block of code at the
very bottom (unless emit_shader_time_end has some side effect that I
don't see):

    if (inst->eot) {
       if (INTEL_DEBUG & DEBUG_SHADER_TIME)
          emit_shader_time_end();
    }

Or does the last URB write have to be the last instruction?

> ---
>  src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 6 ++----
>  1 file changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
> index 8bd2fd8..ca1cfe8 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
> @@ -2664,10 +2664,8 @@ vec4_visitor::emit_urb_writes()
>           emit_urb_slot(mrf++, c->prog_data.vue_map.slot_to_varying[slot]);
>        }
>
> -      if (eot) {
> -         if (INTEL_DEBUG & DEBUG_SHADER_TIME)
> -            emit_shader_time_end();
> -      }
> +      if (INTEL_DEBUG & DEBUG_SHADER_TIME)
> +         emit_shader_time_end();
>
>        current_annotation = "URB write";
>        inst = emit(VS_OPCODE_URB_WRITE);
Re: [Mesa-dev] [PATCH 1/2] mesa: Update comments to match newer specs.
On 04/08/2013 10:29 AM, Matt Turner wrote:
> Old GL 1.x specs used 'b' but newer specs use 'p'. The line immediately
> above the second hunk also uses 'p'.

Series is

Reviewed-by: Ian Romanick ian.d.roman...@intel.com

> ---
>  src/mesa/main/mtypes.h |    2 +-
>  src/mesa/main/texobj.c |    2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
> index 008f68b..3d8f359 100644
> --- a/src/mesa/main/mtypes.h
> +++ b/src/mesa/main/mtypes.h
> @@ -1171,7 +1171,7 @@ struct gl_texture_object
>     GLint MaxLevel;             /**< max mipmap level, OpenGL 1.2 */
>     GLint ImmutableLevels;      /**< ES 3.0 / ARB_texture_view */
>     GLint _MaxLevel;            /**< actual max mipmap level (q in the spec) */
> -   GLfloat _MaxLambda;         /**< = _MaxLevel - BaseLevel (q - b in spec) */
> +   GLfloat _MaxLambda;         /**< = _MaxLevel - BaseLevel (q - p in spec) */
>     GLint CropRect[4];          /**< GL_OES_draw_texture */
>     GLenum Swizzle[4];          /**< GL_EXT_texture_swizzle */
>     GLuint _Swizzle;            /**< same as Swizzle, but SWIZZLE_* format */
> diff --git a/src/mesa/main/texobj.c b/src/mesa/main/texobj.c
> index 66377c8..d0fcb12 100644
> --- a/src/mesa/main/texobj.c
> +++ b/src/mesa/main/texobj.c
> @@ -553,7 +553,7 @@ _mesa_test_texobj_completeness( const struct gl_context *ctx,
>     t->_MaxLevel = MIN2(t->_MaxLevel, t->MaxLevel);
>     t->_MaxLevel = MIN2(t->_MaxLevel, maxLevels - 1); /* 'q' in the GL spec */
>
> -   /* Compute _MaxLambda = q - b (see the 1.2 spec) used during mipmapping */
> +   /* Compute _MaxLambda = q - p in the spec used during mipmapping */
>     t->_MaxLambda = (GLfloat) (t->_MaxLevel - baseLevel);
>
>     if (t->Immutable) {
Re: [Mesa-dev] [PATCH] i965: Use software primitive restart when transform feedback active.
On 04/06/2013 08:25 PM, Paul Berry wrote:
> When transform feedback is active, the driver manually counts the
> number of primitives that run through the pipeline, so that if a batch
> buffer flush happens, the next batch buffer can pick up transform
> feedback where the last batch buffer left off.
>
> Hardware-accelerated primitive restart interferes with this process
> (because it makes the primitive count depend not just on the number of
> vertices entering the pipeline, but also on the contents of the index
> buffer). So, when transform feedback is active, we need to fall back
> to the software implementation of primitive restart.
>
> Fixes piglit test "spec/!OpenGL 3.1/primitive-restart-xfb flush".
>
> NOTE: This is a candidate for stable release branches.

Oof. This shouldn't be a performance hit on too many applications,
thankfully. Do we know when we're going to get real hardware counting
support? :(

Reviewed-by: Ian Romanick ian.d.roman...@intel.com

> ---
>  src/mesa/drivers/dri/i965/brw_primitive_restart.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_primitive_restart.c b/src/mesa/drivers/dri/i965/brw_primitive_restart.c
> index e6902b4..d0f0038 100644
> --- a/src/mesa/drivers/dri/i965/brw_primitive_restart.c
> +++ b/src/mesa/drivers/dri/i965/brw_primitive_restart.c
> @@ -27,6 +27,7 @@
>
>  #include "main/imports.h"
>  #include "main/bufferobj.h"
> +#include "main/transformfeedback.h"
>
>  #include "brw_context.h"
>  #include "brw_defines.h"
> @@ -81,11 +82,18 @@ can_cut_index_handle_prims(struct gl_context *ctx,
>     struct brw_context *brw = brw_context(ctx);
>
>     if (brw->sol.counting_primitives_generated ||
> -       brw->sol.counting_primitives_written) {
> +       brw->sol.counting_primitives_written ||
> +       _mesa_is_xfb_active_and_unpaused(ctx)) {
>        /* Counting primitives generated in hardware is not currently
>         * supported, so take the software path. We need to investigate
>         * the *_PRIMITIVES_COUNT registers to allow this to be handled
>         * entirely in hardware.
> +       *
> +       * Note that when transform feedback is active, we also count primitives
> +       * (even if the client hasn't requested it), since that is the only way
> +       * we can start at the proper place in the transform feedback buffer
> +       * after a flush. So we also have to fall back to software when
> +       * transform feedback is active and unpaused.
>         */
>        return false;
>     }
[Mesa-dev] [PATCH 1/2] radeonsi: add 2d tiling support for texture v3
From: Jerome Glisse jgli...@redhat.com v2: Remove left over code v3: Restage properly the commit so hunk of first one are not in second one. Signed-off-by: Jerome Glisse jgli...@redhat.com --- src/gallium/drivers/radeonsi/r600_texture.c | 11 ++-- src/gallium/drivers/radeonsi/si_state.c | 81 + 2 files changed, 20 insertions(+), 72 deletions(-) diff --git a/src/gallium/drivers/radeonsi/r600_texture.c b/src/gallium/drivers/radeonsi/r600_texture.c index 1b8382f..8992f9a 100644 --- a/src/gallium/drivers/radeonsi/r600_texture.c +++ b/src/gallium/drivers/radeonsi/r600_texture.c @@ -47,7 +47,6 @@ static void r600_copy_to_staging_texture(struct pipe_context *ctx, struct r600_t transfer-box); } - /* Copy from a transfer's staging texture to a full GPU one. */ static void r600_copy_from_staging_texture(struct pipe_context *ctx, struct r600_transfer *rtransfer) { @@ -152,12 +151,12 @@ static int r600_init_surface(struct r600_screen *rscreen, if (!is_flushed_depth is_depth) { surface-flags |= RADEON_SURF_ZBUFFER; - if (is_stencil) { surface-flags |= RADEON_SURF_SBUFFER | RADEON_SURF_HAS_SBUFFER_MIPTREE; } } + surface-flags |= RADEON_SURF_HAS_TILE_MODE_INDEX; return 0; } @@ -530,7 +529,11 @@ struct pipe_resource *si_texture_create(struct pipe_screen *screen, if (!(templ-flags R600_RESOURCE_FLAG_TRANSFER) !(templ-bind PIPE_BIND_SCANOUT)) { - array_mode = V_009910_ARRAY_1D_TILED_THIN1; + if (util_format_is_compressed(templ-format)) { + array_mode = V_009910_ARRAY_1D_TILED_THIN1; + } else { + array_mode = V_009910_ARRAY_2D_TILED_THIN1; + } } r = r600_init_surface(rscreen, surface, templ, array_mode, @@ -620,6 +623,8 @@ struct pipe_resource *si_texture_from_handle(struct pipe_screen *screen, if (r) { return NULL; } + /* always set the scanout flags */ + surface.flags |= RADEON_SURF_SCANOUT; return (struct pipe_resource *)r600_texture_create_object(screen, templ, array_mode, stride, 0, buf, FALSE, surface); } diff --git a/src/gallium/drivers/radeonsi/si_state.c 
b/src/gallium/drivers/radeonsi/si_state.c index ca9e8b4..61ede64 100644 --- a/src/gallium/drivers/radeonsi/si_state.c +++ b/src/gallium/drivers/radeonsi/si_state.c @@ -1541,67 +1541,16 @@ boolean si_is_format_supported(struct pipe_screen *screen, return retval == usage; } -static unsigned si_tile_mode_index(struct r600_resource_texture *rtex, unsigned level) -{ - if (util_format_is_depth_or_stencil(rtex-real_format)) { - if (rtex-surface.level[level].mode == RADEON_SURF_MODE_1D) { - return 4; - } else if (rtex-surface.level[level].mode == RADEON_SURF_MODE_2D) { - switch (rtex-real_format) { - case PIPE_FORMAT_Z16_UNORM: - return 5; - case PIPE_FORMAT_S8_UINT_Z24_UNORM: - case PIPE_FORMAT_X8Z24_UNORM: - case PIPE_FORMAT_Z24X8_UNORM: - case PIPE_FORMAT_Z24_UNORM_S8_UINT: - case PIPE_FORMAT_Z32_FLOAT: - case PIPE_FORMAT_Z32_FLOAT_S8X24_UINT: - return 6; - default: - return 7; - } - } - } +static unsigned si_tile_mode_index(struct r600_resource_texture *rtex, unsigned level, bool stencil) +{ + unsigned tile_mode_index = 0; - switch (rtex-surface.level[level].mode) { - default: - assert(!Invalid surface mode); - /* Fall through */ - case RADEON_SURF_MODE_LINEAR_ALIGNED: - return 8; - case RADEON_SURF_MODE_1D: - if (rtex-surface.flags RADEON_SURF_SCANOUT) - return 9; - else - return 13; - case RADEON_SURF_MODE_2D: - if (rtex-surface.flags RADEON_SURF_SCANOUT) { - switch (util_format_get_blocksize(rtex-real_format)) { - case 1: - return 10; - case 2: - return 11; - default: - assert(!Invalid block size); - /* Fall through */ - case 4: - return 12; - } - } else { - switch (util_format_get_blocksize(rtex-real_format)) { -
[Mesa-dev] [PATCH 2/2] radeonsi: add support for compressed texture v2
From: Jerome Glisse jgli...@redhat.com Most test pass, issue are with border color and swizzle. Based on ircnickmaelcum patch. v2: Restaged commit hunk Signed-off-by: Jerome Glisse jgli...@redhat.com --- src/gallium/drivers/radeonsi/si_state.c | 71 - src/gallium/drivers/radeonsi/sid.h | 7 2 files changed, 76 insertions(+), 2 deletions(-) diff --git a/src/gallium/drivers/radeonsi/si_state.c b/src/gallium/drivers/radeonsi/si_state.c index 61ede64..a39843c 100644 --- a/src/gallium/drivers/radeonsi/si_state.c +++ b/src/gallium/drivers/radeonsi/si_state.c @@ -30,6 +30,7 @@ #include util/u_helpers.h #include util/u_math.h #include util/u_pack_color.h +#include util/u_format_s3tc.h #include tgsi/tgsi_parse.h #include radeonsi_pipe.h #include radeonsi_shader.h @@ -1164,6 +1165,8 @@ static uint32_t si_translate_texformat(struct pipe_screen *screen, const struct util_format_description *desc, int first_non_void) { + struct r600_screen *rscreen = (struct r600_screen*)screen; + bool enable_s3tc = rscreen-info.drm_minor = 31; boolean uniform = TRUE; int i; @@ -1205,7 +1208,51 @@ static uint32_t si_translate_texformat(struct pipe_screen *screen, break; } - /* TODO compressed formats */ + if (desc-layout == UTIL_FORMAT_LAYOUT_RGTC) { + if (!enable_s3tc) + goto out_unknown; + + switch (format) { + case PIPE_FORMAT_RGTC1_SNORM: + case PIPE_FORMAT_LATC1_SNORM: + case PIPE_FORMAT_RGTC1_UNORM: + case PIPE_FORMAT_LATC1_UNORM: + return V_008F14_IMG_DATA_FORMAT_BC4; + case PIPE_FORMAT_RGTC2_SNORM: + case PIPE_FORMAT_LATC2_SNORM: + case PIPE_FORMAT_RGTC2_UNORM: + case PIPE_FORMAT_LATC2_UNORM: + return V_008F14_IMG_DATA_FORMAT_BC5; + default: + goto out_unknown; + } + } + + if (desc-layout == UTIL_FORMAT_LAYOUT_S3TC) { + + if (!enable_s3tc) + goto out_unknown; + + if (!util_format_s3tc_enabled) { + goto out_unknown; + } + + switch (format) { + case PIPE_FORMAT_DXT1_RGB: + case PIPE_FORMAT_DXT1_RGBA: + case PIPE_FORMAT_DXT1_SRGB: + case PIPE_FORMAT_DXT1_SRGBA: + return 
V_008F14_IMG_DATA_FORMAT_BC1; + case PIPE_FORMAT_DXT3_RGBA: + case PIPE_FORMAT_DXT3_SRGBA: + return V_008F14_IMG_DATA_FORMAT_BC2; + case PIPE_FORMAT_DXT5_RGBA: + case PIPE_FORMAT_DXT5_SRGBA: + return V_008F14_IMG_DATA_FORMAT_BC3; + default: + goto out_unknown; + } + } if (format == PIPE_FORMAT_R9G9B9E5_FLOAT) { return V_008F14_IMG_DATA_FORMAT_5_9_9_9; @@ -2109,7 +2156,27 @@ static struct pipe_sampler_view *si_create_sampler_view(struct pipe_context *ctx break; default: if (first_non_void 0) { - num_format = V_008F14_IMG_NUM_FORMAT_FLOAT; + if (util_format_is_compressed(pipe_format)) { + switch (pipe_format) { + case PIPE_FORMAT_DXT1_SRGB: + case PIPE_FORMAT_DXT1_SRGBA: + case PIPE_FORMAT_DXT3_SRGBA: + case PIPE_FORMAT_DXT5_SRGBA: + num_format = V_008F14_IMG_NUM_FORMAT_SRGB; + break; + case PIPE_FORMAT_RGTC1_SNORM: + case PIPE_FORMAT_LATC1_SNORM: + case PIPE_FORMAT_RGTC2_SNORM: + case PIPE_FORMAT_LATC2_SNORM: + num_format = V_008F14_IMG_NUM_FORMAT_SNORM; + break; + default: + num_format = V_008F14_IMG_NUM_FORMAT_UNORM; + break; + } + } else { + num_format = V_008F14_IMG_NUM_FORMAT_FLOAT; + } } else if (desc-colorspace == UTIL_FORMAT_COLORSPACE_SRGB) { num_format = V_008F14_IMG_NUM_FORMAT_SRGB; } else { diff --git a/src/gallium/drivers/radeonsi/sid.h b/src/gallium/drivers/radeonsi/sid.h index 8528981..2722c79 100644 --- a/src/gallium/drivers/radeonsi/sid.h +++
Re: [Mesa-dev] [PATCH 3/4] st/mesa: add support for ARB_texture_multisample
On 04/06/2013 03:05 AM, Dave Airlie wrote: From: Dave Airlie airl...@redhat.com This adds support to the mesa state tracker for ARB_texture_multisample. hardware doesn't seem to use a different texture instructions, so I don't think we need to create one for TGSI at this time. Thanks to Marek for fixes to sample number picking. Reviewed-by: Marek Olšák mar...@gmail.com Signed-off-by: Dave Airlie airl...@redhat.com --- src/mesa/state_tracker/st_atom_framebuffer.c | 1 + src/mesa/state_tracker/st_atom_msaa.c| 2 ++ src/mesa/state_tracker/st_cb_bitmap.c| 4 +-- src/mesa/state_tracker/st_cb_drawpixels.c| 2 +- src/mesa/state_tracker/st_cb_fbo.c | 2 +- src/mesa/state_tracker/st_cb_texture.c | 41 src/mesa/state_tracker/st_extensions.c | 6 +++- src/mesa/state_tracker/st_gen_mipmap.c | 1 + src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 17 ++-- src/mesa/state_tracker/st_mesa_to_tgsi.c | 2 ++ src/mesa/state_tracker/st_texture.c | 8 +- src/mesa/state_tracker/st_texture.h | 1 + 12 files changed, 72 insertions(+), 15 deletions(-) diff --git a/src/mesa/state_tracker/st_atom_framebuffer.c b/src/mesa/state_tracker/st_atom_framebuffer.c index 3df8691..c752640 100644 --- a/src/mesa/state_tracker/st_atom_framebuffer.c +++ b/src/mesa/state_tracker/st_atom_framebuffer.c @@ -59,6 +59,7 @@ update_renderbuffer_surface(struct st_context *st, enum pipe_format format = st-ctx-Color.sRGBEnabled ? 
resource-format : util_format_linear(resource-format); if (!strb-surface || + strb-surface-texture-nr_samples != strb-Base.NumSamples || strb-surface-format != format || strb-surface-texture != resource || strb-surface-width != rtt_width || diff --git a/src/mesa/state_tracker/st_atom_msaa.c b/src/mesa/state_tracker/st_atom_msaa.c index 9baa4fc..b749a17 100644 --- a/src/mesa/state_tracker/st_atom_msaa.c +++ b/src/mesa/state_tracker/st_atom_msaa.c @@ -63,6 +63,8 @@ static void update_sample_mask( struct st_context *st ) sample_mask = ~sample_mask; } /* TODO merge with app-supplied sample mask */ + if (st-ctx-Multisample.SampleMask) + sample_mask = st-ctx-Multisample.SampleMaskValue; } /* mask off unused bits or don't care? */ diff --git a/src/mesa/state_tracker/st_cb_bitmap.c b/src/mesa/state_tracker/st_cb_bitmap.c index 0513814..ee66ab3 100644 --- a/src/mesa/state_tracker/st_cb_bitmap.c +++ b/src/mesa/state_tracker/st_cb_bitmap.c @@ -299,7 +299,7 @@ make_bitmap_texture(struct gl_context *ctx, GLsizei width, GLsizei height, * Create texture to hold bitmap pattern. 
*/ pt = st_texture_create(st, st-internal_target, st-bitmap.tex_format, - 0, width, height, 1, 1, + 0, width, height, 1, 1, 0, PIPE_BIND_SAMPLER_VIEW); if (!pt) { _mesa_unmap_pbo_source(ctx, unpack); @@ -568,7 +568,7 @@ reset_cache(struct st_context *st) cache-texture = st_texture_create(st, PIPE_TEXTURE_2D, st-bitmap.tex_format, 0, BITMAP_CACHE_WIDTH, BITMAP_CACHE_HEIGHT, - 1, 1, + 1, 1, 0, PIPE_BIND_SAMPLER_VIEW); } diff --git a/src/mesa/state_tracker/st_cb_drawpixels.c b/src/mesa/state_tracker/st_cb_drawpixels.c index b25b776..db2f03a 100644 --- a/src/mesa/state_tracker/st_cb_drawpixels.c +++ b/src/mesa/state_tracker/st_cb_drawpixels.c @@ -466,7 +466,7 @@ alloc_texture(struct st_context *st, GLsizei width, GLsizei height, struct pipe_resource *pt; pt = st_texture_create(st, st-internal_target, texFormat, 0, - width, height, 1, 1, PIPE_BIND_SAMPLER_VIEW); + width, height, 1, 1, 0, PIPE_BIND_SAMPLER_VIEW); return pt; } diff --git a/src/mesa/state_tracker/st_cb_fbo.c b/src/mesa/state_tracker/st_cb_fbo.c index 4452e52..127b123 100644 --- a/src/mesa/state_tracker/st_cb_fbo.c +++ b/src/mesa/state_tracker/st_cb_fbo.c @@ -433,7 +433,7 @@ st_render_texture(struct gl_context *ctx, strb-rtt_level = att-TextureLevel; strb-rtt_face = att-CubeMapFace; strb-rtt_slice = att-Zoffset; - + rb-NumSamples = texImage-NumSamples; rb-Width = texImage-Width2; rb-Height = texImage-Height2; rb-_BaseFormat = texImage-_BaseFormat; diff --git a/src/mesa/state_tracker/st_cb_texture.c b/src/mesa/state_tracker/st_cb_texture.c index 0cd0d77..25ee352 100644 --- a/src/mesa/state_tracker/st_cb_texture.c +++ b/src/mesa/state_tracker/st_cb_texture.c @@ -78,6 +78,8 @@ gl_target_to_pipe(GLenum target) case GL_TEXTURE_2D: case GL_PROXY_TEXTURE_2D: case GL_TEXTURE_EXTERNAL_OES: + case GL_TEXTURE_2D_MULTISAMPLE: + case GL_PROXY_TEXTURE_2D_MULTISAMPLE:
[Mesa-dev] [Bug 56542] [bisected] Piglit gl_select tests crash on exit
https://bugs.freedesktop.org/show_bug.cgi?id=56542

--- Comment #3 from Jerome Glisse gli...@freedesktop.org ---
Dunno if it's a known freeglut bug. They could argue it's not a bug, but really, using atexit to do Xorg cleanup is bad.
Re: [Mesa-dev] [PATCH 3/3] glsl/linker: Adapt flat varying handling in preparation for geometry shaders.
On 04/06/2013 07:49 PM, Paul Berry wrote:
> When a varying is consumed by transform feedback, but is not used by
> the fragment shader, assign_varying_locations() sets its interpolation
> type to "flat" in order to ensure that lower_packed_varyings never has
> to deal with non-flat integral varyings (the GLSL spec doesn't require
> integral vertex outputs to be "flat" if they aren't consumed by the
> fragment shader).
>
> A similar situation will arise when geometry shader support is added,
> since the GLSL spec only requires integral vertex shader outputs to be
> flat when they are consumed by the geometry shader. This patch

fragment?

> modifies the linker to handle this situation too.
> ---
>  src/glsl/link_varyings.cpp | 30 ++++++++++++++++++++----------
>  1 file changed, 20 insertions(+), 10 deletions(-)
>
> diff --git a/src/glsl/link_varyings.cpp b/src/glsl/link_varyings.cpp
> index 431d8fd..7e90beb 100644
> --- a/src/glsl/link_varyings.cpp
> +++ b/src/glsl/link_varyings.cpp
> @@ -541,7 +541,7 @@ store_tfeedback_info(struct gl_context *ctx, struct gl_shader_program *prog,
>  class varying_matches
>  {
>  public:
> -   varying_matches(bool disable_varying_packing);
> +   varying_matches(bool disable_varying_packing, bool consumer_is_fs);
>     ~varying_matches();
>     void record(ir_variable *producer_var, ir_variable *consumer_var);
>     unsigned assign_locations();
> @@ -621,11 +621,15 @@ private:
>      * it was allocated.
>      */
>     unsigned matches_capacity;
> +
> +   const bool consumer_is_fs;
>  };
>
>
> -varying_matches::varying_matches(bool disable_varying_packing)
> -   : disable_varying_packing(disable_varying_packing)
> +varying_matches::varying_matches(bool disable_varying_packing,
> +                                 bool consumer_is_fs)
> +   : disable_varying_packing(disable_varying_packing),
> +     consumer_is_fs(consumer_is_fs)
>  {
>     /* Note: this initial capacity is rather arbitrarily chosen to be large
>      * enough for many cases without wasting an unreasonable amount of space.
> @@ -672,12 +676,12 @@ varying_matches::record(ir_variable *producer_var, ir_variable *consumer_var)
>        return;
>     }
>
> -   if (consumer_var == NULL) {
> -      /* Since there is no consumer_var, the interpolation type of this
> -       * varying cannot possibly affect rendering. Also, since the GL spec
> -       * only requires integer varyings to be flat when they are fragment
> -       * shader inputs, it is possible that this variable is non-flat and is
> -       * (or contains) an integer.
> +   if (consumer_var == NULL || !consumer_is_fs) {
> +      /* Since this varying is not being consumed by the fragment shader, its
> +       * interpolation type varying cannot possibly affect rendering. Also,
> +       * since the GL spec only requires integer varyings to be flat when
> +       * they are fragment shader inputs, it is possible that this variable is
> +       * non-flat and is (or contains) an integer.
>         *
>         * lower_packed_varyings requires all integer varyings to flat,
>         * regardless of where they appear. We can trivially satisfy that
> @@ -685,6 +689,11 @@ varying_matches::record(ir_variable *producer_var, ir_variable *consumer_var)
>         */
>        producer_var->centroid = false;
>        producer_var->interpolation = INTERP_QUALIFIER_FLAT;
> +
> +      if (consumer_var) {
> +         consumer_var->centroid = false;
> +         consumer_var->interpolation = INTERP_QUALIFIER_FLAT;
> +      }
>     }
>
>     if (this->num_matches == this->matches_capacity) {
> @@ -979,7 +988,8 @@ assign_varying_locations(struct gl_context *ctx,
>  {
>     const unsigned producer_base = VARYING_SLOT_VAR0;
>     const unsigned consumer_base = VARYING_SLOT_VAR0;
> -   varying_matches matches(ctx->Const.DisableVaryingPacking);
> +   varying_matches matches(ctx->Const.DisableVaryingPacking,
> +                           consumer && consumer->Type == GL_FRAGMENT_SHADER);
>     hash_table *tfeedback_candidates
>        = hash_table_ctor(0, hash_table_string_hash, hash_table_string_compare);
>     hash_table *consumer_inputs
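The rule the patch above adds to varying_matches::record() can be summarised in a few lines of plain C. This is only a sketch of the decision logic: the struct, enum and function names are hypothetical stand-ins, not Mesa's ir_variable or its interpolation qualifiers.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the linker's IR types. */
enum interp_qualifier { INTERP_SMOOTH, INTERP_FLAT, INTERP_NOPERSPECTIVE };

struct varying {
    enum interp_qualifier interpolation;
    bool centroid;
};

/*
 * If the varying is not read by a fragment shader (either there is no
 * consumer variable at all, or the consuming stage is not the fragment
 * shader), its interpolation cannot affect rendering, so both ends can
 * be forced to flat. Varyings that do feed a fragment shader are left
 * untouched.
 */
static void record_match(struct varying *producer, struct varying *consumer,
                         bool consumer_is_fs)
{
    if (consumer == NULL || !consumer_is_fs) {
        producer->centroid = false;
        producer->interpolation = INTERP_FLAT;

        if (consumer != NULL) {
            consumer->centroid = false;
            consumer->interpolation = INTERP_FLAT;
        }
    }
}
```

The three cases are: a fragment shader consumer (nothing changes), a non-fragment consumer such as a future geometry shader (both variables become flat), and no consumer at all, for example a transform-feedback-only output (only the producer is changed).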
Re: [Mesa-dev] [PATCH] i965: Use software primitive restart when transform feedback active.
On 8 April 2013 10:40, Ian Romanick i...@freedesktop.org wrote:
> On 04/06/2013 08:25 PM, Paul Berry wrote:
>> When transform feedback is active, the driver manually counts the
>> number of primitives that run through the pipeline, so that if a batch
>> buffer flush happens, the next batch buffer can pick up transform
>> feedback where the last batch buffer left off.
>>
>> Hardware-accelerated primitive restart interferes with this process
>> (because it makes the primitive count depend not just on the number of
>> vertices entering the pipeline, but also on the contents of the index
>> buffer). So, when transform feedback is active, we need to fall back
>> to the software implementation of primitive restart.
>>
>> Fixes piglit test "spec/!OpenGL 3.1/primitive-restart-xfb flush".
>>
>> NOTE: This is a candidate for stable release branches.
>
> Oof. This shouldn't be a performance hit on too many applications,
> thankfully. Do we know when we're going to get real hardware counting
> support? :(

We just had a discussion about that this morning. There's no hardware
limitation, just kernel limitations. As far as this bug is concerned,
all we need is hardware context support (which we have today). I
believe Eric is working on this. As for the GL_PRIMITIVES_GENERATED and
GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN queries, I believe we need some
kernel changes to allow us to read the hardware counters. I believe
Eric is pinging some of the kernel folks on IRC to request that.

All of this stuff needs to get sorted out before we can implement
geometry shaders, so I'm highly motivated to keep an eye on it and make
sure it gets settled soon :)

> Reviewed-by: Ian Romanick ian.d.roman...@intel.com
>
>> ---
>>  src/mesa/drivers/dri/i965/brw_primitive_restart.c | 10 +++++++++-
>>  1 file changed, 9 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_primitive_restart.c b/src/mesa/drivers/dri/i965/brw_primitive_restart.c
>> index e6902b4..d0f0038 100644
>> --- a/src/mesa/drivers/dri/i965/brw_primitive_restart.c
>> +++ b/src/mesa/drivers/dri/i965/brw_primitive_restart.c
>> @@ -27,6 +27,7 @@
>>
>>  #include "main/imports.h"
>>  #include "main/bufferobj.h"
>> +#include "main/transformfeedback.h"
>>
>>  #include "brw_context.h"
>>  #include "brw_defines.h"
>> @@ -81,11 +82,18 @@ can_cut_index_handle_prims(struct gl_context *ctx,
>>     struct brw_context *brw = brw_context(ctx);
>>
>>     if (brw->sol.counting_primitives_generated ||
>> -       brw->sol.counting_primitives_written) {
>> +       brw->sol.counting_primitives_written ||
>> +       _mesa_is_xfb_active_and_unpaused(ctx)) {
>>        /* Counting primitives generated in hardware is not currently
>>         * supported, so take the software path. We need to investigate
>>         * the *_PRIMITIVES_COUNT registers to allow this to be handled
>>         * entirely in hardware.
>> +       *
>> +       * Note that when transform feedback is active, we also count primitives
>> +       * (even if the client hasn't requested it), since that is the only way
>> +       * we can start at the proper place in the transform feedback buffer
>> +       * after a flush. So we also have to fall back to software when
>> +       * transform feedback is active and unpaused.
>>         */
>>        return false;
>>     }
Re: [Mesa-dev] [PATCH 3/3] glsl/linker: Adapt flat varying handling in preparation for geometry shaders.
On 8 April 2013 10:57, Ian Romanick i...@freedesktop.org wrote:
> On 04/06/2013 07:49 PM, Paul Berry wrote:
>> When a varying is consumed by transform feedback, but is not used by
>> the fragment shader, assign_varying_locations() sets its interpolation
>> type to "flat" in order to ensure that lower_packed_varyings never has
>> to deal with non-flat integral varyings (the GLSL spec doesn't require
>> integral vertex outputs to be "flat" if they aren't consumed by the
>> fragment shader).
>>
>> A similar situation will arise when geometry shader support is added,
>> since the GLSL spec only requires integral vertex shader outputs to be
>> flat when they are consumed by the geometry shader. This patch
>
> fragment?

Oops, yes. Thanks.

>> modifies the linker to handle this situation too.
>> ---
>>  src/glsl/link_varyings.cpp | 30 ++++++++++++++++++++----------
>>  1 file changed, 20 insertions(+), 10 deletions(-)
>>
>> diff --git a/src/glsl/link_varyings.cpp b/src/glsl/link_varyings.cpp
>> index 431d8fd..7e90beb 100644
>> --- a/src/glsl/link_varyings.cpp
>> +++ b/src/glsl/link_varyings.cpp
>> @@ -541,7 +541,7 @@ store_tfeedback_info(struct gl_context *ctx, struct gl_shader_program *prog,
>>  class varying_matches
>>  {
>>  public:
>> -   varying_matches(bool disable_varying_packing);
>> +   varying_matches(bool disable_varying_packing, bool consumer_is_fs);
>>     ~varying_matches();
>>     void record(ir_variable *producer_var, ir_variable *consumer_var);
>>     unsigned assign_locations();
>> @@ -621,11 +621,15 @@ private:
>>      * it was allocated.
>>      */
>>     unsigned matches_capacity;
>> +
>> +   const bool consumer_is_fs;
>>  };
>>
>>
>> -varying_matches::varying_matches(bool disable_varying_packing)
>> -   : disable_varying_packing(disable_varying_packing)
>> +varying_matches::varying_matches(bool disable_varying_packing,
>> +                                 bool consumer_is_fs)
>> +   : disable_varying_packing(disable_varying_packing),
>> +     consumer_is_fs(consumer_is_fs)
>>  {
>>     /* Note: this initial capacity is rather arbitrarily chosen to be large
>>      * enough for many cases without wasting an unreasonable amount of space.
>> @@ -672,12 +676,12 @@ varying_matches::record(ir_variable *producer_var, ir_variable *consumer_var)
>>        return;
>>     }
>>
>> -   if (consumer_var == NULL) {
>> -      /* Since there is no consumer_var, the interpolation type of this
>> -       * varying cannot possibly affect rendering. Also, since the GL spec
>> -       * only requires integer varyings to be flat when they are fragment
>> -       * shader inputs, it is possible that this variable is non-flat and is
>> -       * (or contains) an integer.
>> +   if (consumer_var == NULL || !consumer_is_fs) {
>> +      /* Since this varying is not being consumed by the fragment shader, its
>> +       * interpolation type varying cannot possibly affect rendering. Also,
>> +       * since the GL spec only requires integer varyings to be flat when
>> +       * they are fragment shader inputs, it is possible that this variable is
>> +       * non-flat and is (or contains) an integer.
>>         *
>>         * lower_packed_varyings requires all integer varyings to flat,
>>         * regardless of where they appear. We can trivially satisfy that
>> @@ -685,6 +689,11 @@ varying_matches::record(ir_variable *producer_var, ir_variable *consumer_var)
>>         */
>>        producer_var->centroid = false;
>>        producer_var->interpolation = INTERP_QUALIFIER_FLAT;
>> +
>> +      if (consumer_var) {
>> +         consumer_var->centroid = false;
>> +         consumer_var->interpolation = INTERP_QUALIFIER_FLAT;
>> +      }
>>     }
>>
>>     if (this->num_matches == this->matches_capacity) {
>> @@ -979,7 +988,8 @@ assign_varying_locations(struct gl_context *ctx,
>>  {
>>     const unsigned producer_base = VARYING_SLOT_VAR0;
>>     const unsigned consumer_base = VARYING_SLOT_VAR0;
>> -   varying_matches matches(ctx->Const.DisableVaryingPacking);
>> +   varying_matches matches(ctx->Const.DisableVaryingPacking,
>> +                           consumer && consumer->Type == GL_FRAGMENT_SHADER);
>>     hash_table *tfeedback_candidates
>>        = hash_table_ctor(0, hash_table_string_hash, hash_table_string_compare);
>>     hash_table *consumer_inputs
Re: [Mesa-dev] [PATCH] i965: Use software primitive restart when transform feedback active.
On 04/06/2013 08:25 PM, Paul Berry wrote:
> When transform feedback is active, the driver manually counts the
> number of primitives that run through the pipeline, so that if a batch
> buffer flush happens, the next batch buffer can pick up transform
> feedback where the last batch buffer left off.
>
> Hardware-accelerated primitive restart interferes with this process
> (because it makes the primitive count depend not just on the number of
> vertices entering the pipeline, but also on the contents of the index
> buffer). So, when transform feedback is active, we need to fall back
> to the software implementation of primitive restart.
>
> Fixes piglit test "spec/!OpenGL 3.1/primitive-restart-xfb flush".
>
> NOTE: This is a candidate for stable release branches.
> ---
>  src/mesa/drivers/dri/i965/brw_primitive_restart.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_primitive_restart.c b/src/mesa/drivers/dri/i965/brw_primitive_restart.c
> index e6902b4..d0f0038 100644
> --- a/src/mesa/drivers/dri/i965/brw_primitive_restart.c
> +++ b/src/mesa/drivers/dri/i965/brw_primitive_restart.c
> @@ -27,6 +27,7 @@
>
>  #include "main/imports.h"
>  #include "main/bufferobj.h"
> +#include "main/transformfeedback.h"
>
>  #include "brw_context.h"
>  #include "brw_defines.h"
> @@ -81,11 +82,18 @@ can_cut_index_handle_prims(struct gl_context *ctx,
>     struct brw_context *brw = brw_context(ctx);
>
>     if (brw->sol.counting_primitives_generated ||
> -       brw->sol.counting_primitives_written) {
> +       brw->sol.counting_primitives_written ||
> +       _mesa_is_xfb_active_and_unpaused(ctx)) {
>        /* Counting primitives generated in hardware is not currently
>         * supported, so take the software path. We need to investigate
>         * the *_PRIMITIVES_COUNT registers to allow this to be handled
>         * entirely in hardware.
> +       *
> +       * Note that when transform feedback is active, we also count primitives
> +       * (even if the client hasn't requested it), since that is the only way
> +       * we can start at the proper place in the transform feedback buffer
> +       * after a flush. So we also have to fall back to software when
> +       * transform feedback is active and unpaused.
>         */
>        return false;
>     }

Gah. This is unfortunate, but necessary.

Reviewed-by: Kenneth Graunke kenn...@whitecape.org
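As a reading aid, the fallback condition from the patch can be sketched on its own. This is a minimal C sketch under stated assumptions, not the driver code: `struct sol_state` and `must_use_software_restart()` are hypothetical, and the `xfb_active_and_unpaused` flag stands in for the result of _mesa_is_xfb_active_and_unpaused(ctx).

```c
#include <stdbool.h>

/* Hypothetical stand-in for the pieces of driver state the check reads. */
struct sol_state {
   bool counting_primitives_generated;
   bool counting_primitives_written;
   bool xfb_active_and_unpaused;
};

/* Returns true when the hardware cut index must not be used, because the
 * driver is counting primitives in software (either for a query, or to
 * know where to resume in the transform feedback buffer after a flush).
 */
static bool
must_use_software_restart(const struct sol_state *sol)
{
   return sol->counting_primitives_generated ||
          sol->counting_primitives_written ||
          sol->xfb_active_and_unpaused;
}
```

Any one of the three conditions is enough to force the software path; the third is the one this patch adds.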
Re: [Mesa-dev] [PATCH] i965/vs: Fix DEBUG_SHADER_TIME when VS terminates with 2 URB writes.
On 8 April 2013 10:37, Ian Romanick i...@freedesktop.org wrote:

> On 04/07/2013 06:42 AM, Paul Berry wrote:
>> The call to emit_shader_time_end() before the second URB write was
>> conditioned with "if (eot)", but eot is always false in this code
>> path, so emit_shader_time_end() was never being called for vertex
>> shaders that performed 2 URB writes.
>
> I had to look at that code for way too long to convince myself that
> your patch was correct. I think it might be better to remove both the
> conditional emit_shader_time_end calls and put this block of code at
> the very bottom (unless emit_shader_time_end has some side effect that
> I don't see):
>
>    if (inst->eot) {
>       if (INTEL_DEBUG & DEBUG_SHADER_TIME)
>          emit_shader_time_end();
>    }
>
> Or does the last URB write have to be the last instruction?

The last URB write has to be the last instruction, since it's actually the URB write that ends the thread ("eot" stands for "end of thread").

For GL 3.2 we're going to need to refactor this function to use a loop, since GL 3.2 doubles the number of varying components permitted for VS->GS linkage (so we'll need up to 4 URB writes instead of 2). I think once that change is made the function is going to be a lot easier to follow. Maybe I should just do that refactor now?
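The loop-shaped refactor mentioned above might look roughly like the following C sketch. It only models the ordering constraint discussed in this thread (the shader-time code must run before the final URB write, because that write carries eot and ends the thread). The `enum event` values and `emit_urb_writes()` are hypothetical and do not correspond to the real vec4 visitor API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record of what the back-end would emit, in order. */
enum event { EV_SHADER_TIME_END, EV_URB_WRITE, EV_URB_WRITE_EOT };

/* Emits `num_writes` URB writes into `out` (which must have room for
 * num_writes + 1 entries) and returns the number of events written.
 * Only the last write has eot set, and the shader-time epilogue is
 * placed immediately before it.
 */
static size_t
emit_urb_writes(int num_writes, bool shader_time, enum event *out)
{
   size_t n = 0;

   for (int i = 0; i < num_writes; i++) {
      bool last = (i == num_writes - 1);

      if (last && shader_time)
         out[n++] = EV_SHADER_TIME_END;

      out[n++] = last ? EV_URB_WRITE_EOT : EV_URB_WRITE;
   }

   return n;
}
```

With this shape the number of writes (2 today, up to 4 under the GL 3.2 limits mentioned above) no longer changes where the timing code goes.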
[Mesa-dev] [PATCH 1/3] intel: Refactor selection of miptree tiling
From: Chad Versace chad.vers...@linux.intel.com

This patch (1) extracts from intel_miptree_create() the spaghetti logic that selects the tiling format, (2) rewrites that spaghetti into a lucid form, and (3) moves it to a new function, intel_miptree_choose_tiling().

No behavioral change.

As a bonus, it is now evident that the force_y_tiling parameter to intel_miptree_create() does not really force Y tiling.

Signed-off-by: Chad Versace chad.vers...@linux.intel.com
Reviewed-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/intel/intel_mipmap_tree.c | 90 +++---
 1 file changed, 54 insertions(+), 36 deletions(-)

diff --git a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
index 66cadeb..402972a 100644
--- a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
@@ -297,6 +297,57 @@ intel_miptree_create_layout(struct intel_context *intel,
    return mt;
 }
 
+/**
+ * \brief Helper function for intel_miptree_create().
+ */
+static uint32_t
+intel_miptree_choose_tiling(struct intel_context *intel,
+                            gl_format format,
+                            uint32_t width0,
+                            uint32_t num_samples,
+                            bool force_y_tiling)
+{
+
+   if (format == MESA_FORMAT_S8) {
+      /* The stencil buffer is W tiled. However, we request from the kernel a
+       * non-tiled buffer because the GTT is incapable of W fencing.
+       */
+      return I915_TILING_NONE;
+   }
+
+   if (!intel->use_texture_tiling || _mesa_is_format_compressed(format))
+      return I915_TILING_NONE;
+
+   if (force_y_tiling)
+      return I915_TILING_Y;
+
+   if (num_samples > 1) {
+      /* From p82 of the Sandy Bridge PRM, dw3[1] of SURFACE_STATE ("Tiled
+       * Surface"):
+       *
+       *   [DevSNB+]: For multi-sample render targets, this field must be
+       *   1. MSRTs can only be tiled.
+       *
+       * Our usual reason for preferring X tiling (fast blits using the
+       * blitting engine) doesn't apply to MSAA, since we'll generally be
+       * downsampling or upsampling when blitting between the MSAA buffer
+       * and another buffer, and the blitting engine doesn't support that.
+       * So use Y tiling, since it makes better use of the cache.
+       */
+      return I915_TILING_Y;
+   }
+
+   GLenum base_format = _mesa_get_format_base_format(format);
+   if (intel->gen >= 4 &&
+       (base_format == GL_DEPTH_COMPONENT ||
+        base_format == GL_DEPTH_STENCIL_EXT))
+      return I915_TILING_Y;
+
+   if (width0 >= 64)
+      return I915_TILING_X;
+
+   return I915_TILING_NONE;
+}
 
 struct intel_mipmap_tree *
 intel_miptree_create(struct intel_context *intel,
@@ -312,8 +363,6 @@ intel_miptree_create(struct intel_context *intel,
                      bool force_y_tiling)
 {
    struct intel_mipmap_tree *mt;
-   uint32_t tiling = I915_TILING_NONE;
-   GLenum base_format;
    gl_format tex_format = format;
    gl_format etc_format = MESA_FORMAT_NONE;
    GLuint total_width, total_height;
@@ -352,35 +401,6 @@ intel_miptree_create(struct intel_context *intel,
    }
 
    etc_format = (format != tex_format) ? tex_format : MESA_FORMAT_NONE;
-   base_format = _mesa_get_format_base_format(format);
-
-   if (num_samples > 1) {
-      /* From p82 of the Sandy Bridge PRM, dw3[1] of SURFACE_STATE ("Tiled
-       * Surface"):
-       *
-       *   [DevSNB+]: For multi-sample render targets, this field must be
-       *   1. MSRTs can only be tiled.
-       *
-       * Our usual reason for preferring X tiling (fast blits using the
-       * blitting engine) doesn't apply to MSAA, since we'll generally be
-       * downsampling or upsampling when blitting between the MSAA buffer
-       * and another buffer, and the blitting engine doesn't support that.
-       * So use Y tiling, since it makes better use of the cache.
-       */
-      force_y_tiling = true;
-   }
-
-   if (intel->use_texture_tiling && !_mesa_is_format_compressed(format)) {
-      if (intel->gen >= 4 &&
-          (base_format == GL_DEPTH_COMPONENT ||
-           base_format == GL_DEPTH_STENCIL_EXT))
-         tiling = I915_TILING_Y;
-      else if (force_y_tiling) {
-         tiling = I915_TILING_Y;
-      } else if (width0 >= 64)
-         tiling = I915_TILING_X;
-   }
-
    mt = intel_miptree_create_layout(intel, target, format,
                                     first_level, last_level, width0,
                                     height0, depth0,
@@ -397,15 +417,13 @@ intel_miptree_create(struct intel_context *intel,
    total_height = mt->total_height;
 
    if (format == MESA_FORMAT_S8) {
-      /* The stencil buffer is W tiled. However, we request from the kernel a
-       * non-tiled buffer because the GTT is incapable of W fencing. So round
-       * up the width and height to match the size of W tiles (64x64).
-       */
-      tiling = I915_TILING_NONE;
+      /* Align to size
[Mesa-dev] [PATCH 2/3] i965: Use tiling even for compressed textures.
The code has no rationale for why we would force compressed textures to be untiled, and it appears to work fine. Git archeology indicates that it's been that way dating back to when we first started tiling.

Improves performance in GLB27_TRex_C24Z16_FixedTimeStep at 1280x720 by 10.0529% +/- 0.573075% (n=12).

Improves performance in Xonotic by 4.56409% +/- 0.27965% (n=3).

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/intel/intel_mipmap_tree.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
index 402972a..8dd04be 100644
--- a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
@@ -315,7 +315,7 @@ intel_miptree_choose_tiling(struct intel_context *intel,
       return I915_TILING_NONE;
    }
 
-   if (!intel->use_texture_tiling || _mesa_is_format_compressed(format))
+   if (!intel->use_texture_tiling)
       return I915_TILING_NONE;
 
    if (force_y_tiling)
-- 
1.8.1.1
[Mesa-dev] [PATCH 3/3] i965: Prefer Y-tiling on Gen6+.
In the past, we preferred X-tiling for color buffers because our BLT code couldn't handle Y-tiling. However, the BLT paths have been largely replaced by BLORP on Gen6+, which can handle any kind of tiling.

We hadn't measured any performance improvement in the past, but that's probably because compressed textures were all uncompressed anyway.

Improves performance in GLB27_TRex_C24Z16_FixedTime by 7.69231%.

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/intel/intel_mipmap_tree.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
index 8dd04be..6a9f08c 100644
--- a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
@@ -344,7 +344,7 @@ intel_miptree_choose_tiling(struct intel_context *intel,
       return I915_TILING_Y;
 
    if (width0 >= 64)
-      return I915_TILING_X;
+      return intel->gen >= 6 ? I915_TILING_Y : I915_TILING_X;
 
    return I915_TILING_NONE;
 }
-- 
1.8.1.1
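Taken together, patches 1 to 3 of this series leave the tiling choice as a straight sequence of early returns. The C sketch below restates that order in a self-contained form. It is an illustration under stated assumptions, not the driver function: `enum tiling`, `enum base_format`, `struct tiling_query` and `choose_tiling()` are hypothetical stand-ins for the I915_TILING_* values, the Mesa format queries and intel_miptree_choose_tiling().

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for I915_TILING_* and the Mesa format queries. */
enum tiling { TILING_NONE, TILING_X, TILING_Y };
enum base_format { BASE_COLOR, BASE_DEPTH, BASE_DEPTH_STENCIL, BASE_STENCIL8 };

struct tiling_query {
   int gen;                  /* hardware generation, e.g. 5, 6, 7 */
   bool use_texture_tiling;  /* the driconf option still present in this series */
   enum base_format base;
   uint32_t width0;
   uint32_t num_samples;
   bool force_y_tiling;
};

static enum tiling
choose_tiling(const struct tiling_query *q)
{
   /* Separate stencil is W tiled, but is requested untiled from the kernel. */
   if (q->base == BASE_STENCIL8)
      return TILING_NONE;

   if (!q->use_texture_tiling)
      return TILING_NONE;

   if (q->force_y_tiling)
      return TILING_Y;

   /* Multisampled render targets must be tiled; Y is preferred. */
   if (q->num_samples > 1)
      return TILING_Y;

   if (q->gen >= 4 &&
       (q->base == BASE_DEPTH || q->base == BASE_DEPTH_STENCIL))
      return TILING_Y;

   /* Patch 3/3: wide enough surfaces prefer Y on Gen6+, X before that. */
   if (q->width0 >= 64)
      return q->gen >= 6 ? TILING_Y : TILING_X;

   return TILING_NONE;
}
```

Note that after patch 2/3 there is no longer a check for compressed formats, so compressed textures follow the same width-based rule as any other color format.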
[Mesa-dev] [PATCH] i965: Skip resetting SOL offsets at batch start when contexts are present.
We won't be able to compute them in software with the advent of geometry shaders.

Fixes piglit "OpenGL 3.1/primitive-restart-xfb flush"

NOTE: This is a candidate for the 9.1 branch.
---
 src/mesa/drivers/dri/i965/gen6_sol.c       |  9 +++++++++
 src/mesa/drivers/dri/i965/gen7_sol_state.c | 18 ++++++++++++------
 2 files changed, 21 insertions(+), 6 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen6_sol.c b/src/mesa/drivers/dri/i965/gen6_sol.c
index 9c09ade..a7b63f6 100644
--- a/src/mesa/drivers/dri/i965/gen6_sol.c
+++ b/src/mesa/drivers/dri/i965/gen6_sol.c
@@ -159,6 +159,7 @@ brw_begin_transform_feedback(struct gl_context *ctx, GLenum mode,
                              struct gl_transform_feedback_object *obj)
 {
    struct brw_context *brw = brw_context(ctx);
+   struct intel_context *intel = &brw->intel;
    const struct gl_shader_program *vs_prog =
       ctx->Shader.CurrentVertexProgram;
    const struct gl_transform_feedback_info *linked_xfb_info =
@@ -180,6 +181,14 @@ brw_begin_transform_feedback(struct gl_context *ctx, GLenum mode,
    brw->sol.svbi_0_starting_index = 0;
    brw->sol.svbi_0_max_index = max_index;
    brw->sol.offset_0_batch_start = 0;
+
+   if (intel->gen >= 7) {
+      /* Ask the kernel to reset the SO offsets for any previous transform
+       * feedback, so we start at the start of the user's buffer. (note: these
+       * are not the query counters)
+       */
+      intel->batch.needs_sol_reset = true;
+   }
 }
 
 void
diff --git a/src/mesa/drivers/dri/i965/gen7_sol_state.c b/src/mesa/drivers/dri/i965/gen7_sol_state.c
index c83b2df..03709ea 100644
--- a/src/mesa/drivers/dri/i965/gen7_sol_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_sol_state.c
@@ -82,12 +82,14 @@ upload_3dstate_so_buffers(struct brw_context *brw)
       end = ALIGN(start + xfb_obj->Size[i], 4);
       assert(end <= bo->size);
 
-      /* Offset the starting offset by the current vertex index into the
-       * feedback buffer, offset register is always set to 0 at the start of the
-       * batchbuffer.
+      /* If we don't have hardware contexts, then we reset our offsets at the
+       * start of every batch, so we track the number of vertices written in
+       * software and increment our pointers by that many.
        */
-      start += brw->sol.offset_0_batch_start * stride;
-      assert(start <= end);
+      if (!intel->hw_ctx) {
+         start += brw->sol.offset_0_batch_start * stride;
+         assert(start <= end);
+      }
 
       BEGIN_BATCH(4);
       OUT_BATCH(_3DSTATE_SO_BUFFER << 16 | (4 - 2));
@@ -244,7 +246,11 @@ upload_sol_state(struct brw_context *brw)
       /* BRW_NEW_VUE_MAP_GEOM_OUT */
       upload_3dstate_so_decl_list(brw, &brw->vue_map_geom_out);
 
-      intel->batch.needs_sol_reset = true;
+      /* If we don't have hardware contexts, then some other client may have
+       * changed the SO write offsets, and we need to rewrite them.
+       */
+      if (!intel->hw_ctx)
+         intel->batch.needs_sol_reset = true;
    }
 
    /* Finally, set up the SOL stage. This command must always follow updates to
-- 
1.7.10.4
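The change to the buffer start address in upload_3dstate_so_buffers() comes down to one conditional. The C sketch below shows that arithmetic only. It assumes, as the patch comments describe, that without a hardware context the write offset is reset at every batch and the driver has to advance the pointer itself. `so_buffer_start()` and its parameter names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Computes the start address to program for a streamout buffer.
 *
 * Without a hardware context the hardware offset restarts at 0 in every
 * batch, so the driver skips past what earlier batches already wrote.
 * With a hardware context the hardware offset is preserved across
 * batches and the base address is left alone.
 */
static uint32_t
so_buffer_start(uint32_t buffer_offset, uint32_t verts_written_before_batch,
                uint32_t stride, bool has_hw_ctx)
{
   uint32_t start = buffer_offset;

   if (!has_hw_ctx)
      start += verts_written_before_batch * stride;

   return start;
}
```

For example, with 10 vertices already written and a 16-byte stride, the software-tracked path starts 160 bytes into the buffer, while the hardware-context path starts at the original offset.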
Re: [Mesa-dev] [PATCH 3/3] i965: Prefer Y-tiling on Gen6+.
On Mon, Apr 8, 2013 at 7:27 PM, Kenneth Graunke kenn...@whitecape.org wrote:
> In the past, we preferred X-tiling for color buffers because our BLT
> code couldn't handle Y-tiling. However, the BLT paths have been
> largely replaced by BLORP on Gen6+, which can handle any kind of
> tiling.
>
> We hadn't measured any performance improvement in the past, but that's
> probably because compressed textures were all uncompressed anyway.

s/uncompressed/untiled/

Series is Reviewed-by: Matt Turner matts...@gmail.com
[Mesa-dev] [PATCH] intel: Remove the texture_tiling driconf option.
This option can force textures to be untiled. However, on Gen6+, depth buffers must be Y-tiled. MSAA buffers also must be Y-tiled. So setting this option on even a trivial application like glxgears causes assertion failures in a debug build, and likely GPU hangs in a release build.

It's just giving users a license to shoot themselves in the foot.

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/intel/intel_context.c     | 2 --
 src/mesa/drivers/dri/intel/intel_context.h     | 1 -
 src/mesa/drivers/dri/intel/intel_mipmap_tree.c | 3 ---
 src/mesa/drivers/dri/intel/intel_screen.c      | 6 +-----
 4 files changed, 1 insertion(+), 11 deletions(-)

diff --git a/src/mesa/drivers/dri/intel/intel_context.c b/src/mesa/drivers/dri/intel/intel_context.c
index bf4045e..990fbea 100644
--- a/src/mesa/drivers/dri/intel/intel_context.c
+++ b/src/mesa/drivers/dri/intel/intel_context.c
@@ -811,8 +811,6 @@ intelInitContext(struct intel_context *intel,
 
    intel_fbo_init(intel);
 
-   intel->use_texture_tiling = driQueryOptionb(&intel->optionCache,
-                                               "texture_tiling");
    intel->use_early_z = driQueryOptionb(&intel->optionCache, "early_z");
 
    if (!driQueryOptionb(&intel->optionCache, "hiz")) {
diff --git a/src/mesa/drivers/dri/intel/intel_context.h b/src/mesa/drivers/dri/intel/intel_context.h
index b2ded49..22d29be 100644
--- a/src/mesa/drivers/dri/intel/intel_context.h
+++ b/src/mesa/drivers/dri/intel/intel_context.h
@@ -343,7 +343,6 @@ struct intel_context
     */
    bool is_front_buffer_reading;
 
-   bool use_texture_tiling;
    bool use_early_z;
 
    int driFd;
diff --git a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
index 6a9f08c..9aff109 100644
--- a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
@@ -315,9 +315,6 @@ intel_miptree_choose_tiling(struct intel_context *intel,
       return I915_TILING_NONE;
    }
 
-   if (!intel->use_texture_tiling)
-      return I915_TILING_NONE;
-
    if (force_y_tiling)
       return I915_TILING_Y;
 
diff --git a/src/mesa/drivers/dri/intel/intel_screen.c b/src/mesa/drivers/dri/intel/intel_screen.c
index 3ca10c8..ccd513e 100644
--- a/src/mesa/drivers/dri/intel/intel_screen.c
+++ b/src/mesa/drivers/dri/intel/intel_screen.c
@@ -55,10 +55,6 @@ PUBLIC const char __driConfigOptions[] =
          DRI_CONF_DESC_END
       DRI_CONF_OPT_END
 
-      DRI_CONF_OPT_BEGIN(texture_tiling, bool, true)
-         DRI_CONF_DESC(en, "Enable texture tiling")
-      DRI_CONF_OPT_END
-
       DRI_CONF_OPT_BEGIN(hiz, bool, true)
          DRI_CONF_DESC(en, "Enable Hierarchical Z on gen6+")
       DRI_CONF_OPT_END
@@ -95,7 +91,7 @@ PUBLIC const char __driConfigOptions[] =
    DRI_CONF_SECTION_END
 DRI_CONF_END;
 
-const GLuint __driNConfigOptions = 17;
+const GLuint __driNConfigOptions = 16;
 
 #include "intel_batchbuffer.h"
 #include "intel_buffers.h"
-- 
1.8.1.1
Re: [Mesa-dev] [PATCH 3/3] i965: Prefer Y-tiling on Gen6+.
Kenneth Graunke kenn...@whitecape.org writes:

> In the past, we preferred X-tiling for color buffers because our BLT
> code couldn't handle Y-tiling. However, the BLT paths have been
> largely replaced by BLORP on Gen6+, which can handle any kind of
> tiling.
>
> We hadn't measured any performance improvement in the past, but that's
> probably because compressed textures were all uncompressed anyway.
>
> Improves performance in GLB27_TRex_C24Z16_FixedTime by 7.69231%.

This series is:

Reviewed-by: Eric Anholt e...@anholt.net
Re: [Mesa-dev] [PATCH 1/2] mesa: fix glGet queries depending on derived framebuffer state
Marek Olšák mar...@gmail.com writes:

> ctx->DrawBuffer->Visual might be invalid if (NewState & _NEW_BUFFERS) != 0.
>
> NOTE: This is a candidate for stable branches.
> ---
>  src/mesa/main/get_hash_params.py | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/get_hash_params.py
> index 4ef2324..580e62f 100644
> --- a/src/mesa/main/get_hash_params.py
> +++ b/src/mesa/main/get_hash_params.py
> @@ -8,7 +8,7 @@ descriptor=[
>    [ "COLOR_WRITEMASK", "LOC_CUSTOM, TYPE_INT_4, 0, NO_EXTRA" ],
>    [ "CULL_FACE", "CONTEXT_BOOL(Polygon.CullFlag), NO_EXTRA" ],
>    [ "CULL_FACE_MODE", "CONTEXT_ENUM(Polygon.CullFaceMode), NO_EXTRA" ],
> -  [ "DEPTH_BITS", "BUFFER_INT(Visual.depthBits), NO_EXTRA" ],
> +  [ "DEPTH_BITS", "BUFFER_INT(Visual.depthBits), extra_new_buffers" ],
>    [ "DEPTH_CLEAR_VALUE", "CONTEXT_FIELD(Depth.Clear, TYPE_DOUBLEN), NO_EXTRA" ],
>    [ "DEPTH_FUNC", "CONTEXT_ENUM(Depth.Func), NO_EXTRA" ],
>    [ "DEPTH_RANGE", "CONTEXT_FIELD(Viewport.Near, TYPE_FLOATN_2), NO_EXTRA" ],
> @@ -31,7 +31,7 @@ descriptor=[
>    [ "RED_BITS", "BUFFER_INT(Visual.redBits), extra_new_buffers" ],
>    [ "SCISSOR_BOX", "LOC_CUSTOM, TYPE_INT_4, 0, NO_EXTRA" ],
>    [ "SCISSOR_TEST", "CONTEXT_BOOL(Scissor.Enabled), NO_EXTRA" ],
> -  [ "STENCIL_BITS", "BUFFER_INT(Visual.stencilBits), NO_EXTRA" ],
> +  [ "STENCIL_BITS", "BUFFER_INT(Visual.stencilBits), extra_new_buffers" ],
>    [ "STENCIL_CLEAR_VALUE", "CONTEXT_INT(Stencil.Clear), NO_EXTRA" ],
>    [ "STENCIL_FAIL", "LOC_CUSTOM, TYPE_ENUM, NO_OFFSET, NO_EXTRA" ],
>    [ "STENCIL_FUNC", "LOC_CUSTOM, TYPE_ENUM, NO_OFFSET, NO_EXTRA" ],
> @@ -80,8 +80,8 @@ descriptor=[
>    [ "SAMPLE_COVERAGE_ARB", "CONTEXT_BOOL(Multisample.SampleCoverage), NO_EXTRA" ],
>    [ "SAMPLE_COVERAGE_VALUE_ARB", "CONTEXT_FLOAT(Multisample.SampleCoverageValue), NO_EXTRA" ],
>    [ "SAMPLE_COVERAGE_INVERT_ARB", "CONTEXT_BOOL(Multisample.SampleCoverageInvert), NO_EXTRA" ],
> -  [ "SAMPLE_BUFFERS_ARB", "BUFFER_INT(Visual.sampleBuffers), NO_EXTRA" ],
> -  [ "SAMPLES_ARB", "BUFFER_INT(Visual.samples), NO_EXTRA" ],
> +  [ "SAMPLE_BUFFERS_ARB", "BUFFER_INT(Visual.sampleBuffers), extra_new_buffers" ],
> +  [ "SAMPLES_ARB", "BUFFER_INT(Visual.samples), extra_new_buffers" ],

Don't RGBA_FLOAT_MODE_ARB and FRAMEBUFFER_SRGB_CAPABLE_EXT also need this treatment?
Re: [Mesa-dev] [PATCH 2/2] mesa: update derived framebuffer state in GetMultisamplefv
Marek Olšák mar...@gmail.com writes:

> This makes sure that ctx->DrawBuffer->Visual.samples is up-to-date.

Reviewed-by: Eric Anholt e...@anholt.net
[Mesa-dev] [PATCH 2/2] i965/gen7.5: Allow HW primitive restart for all primitive types.
Gen7.5 (Haswell) hardware supports primitive restart for all primitive types. It also handles all possible primitive restart indices.

Rather than specialize both can_cut_index_handle_restart_index() and the switch statement in can_cut_index_handle_prims() for Haswell, just return early if the hardware is Haswell because we know it can handle everything.

Reviewed-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/i965/brw_primitive_restart.c | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_primitive_restart.c b/src/mesa/drivers/dri/i965/brw_primitive_restart.c
index e6902b4..10581b3 100644
--- a/src/mesa/drivers/dri/i965/brw_primitive_restart.c
+++ b/src/mesa/drivers/dri/i965/brw_primitive_restart.c
@@ -36,18 +36,12 @@
 
 /**
  * Check if the hardware's cut index support can handle the primitive
- * restart index value.
+ * restart index value (pre-Haswell only).
  */
 static bool
 can_cut_index_handle_restart_index(struct gl_context *ctx,
                                    const struct _mesa_index_buffer *ib)
 {
-   struct intel_context *intel = intel_context(ctx);
-
-   /* Haswell supports an arbitrary cut index. */
-   if (intel->is_haswell)
-      return true;
-
    bool cut_index_will_work;
 
    switch (ib->type) {
@@ -78,6 +72,7 @@ can_cut_index_handle_prims(struct gl_context *ctx,
                            GLuint nr_prims,
                            const struct _mesa_index_buffer *ib)
 {
+   struct intel_context *intel = intel_context(ctx);
    struct brw_context *brw = brw_context(ctx);
 
    if (brw->sol.counting_primitives_generated ||
@@ -90,6 +85,10 @@ can_cut_index_handle_prims(struct gl_context *ctx,
       return false;
    }
 
+   /* Otherwise Haswell can do it all. */
+   if (intel->is_haswell)
+      return true;
+
    if (!can_cut_index_handle_restart_index(ctx, ib)) {
       /* The primitive restart index can't be handled, so take
        * the software path
-- 
1.8.2
[Mesa-dev] [PATCH 1/2] i965: Only use brw_draw.c's trim() function when necessary.
brw_draw.c contains a trim() function which modifies the vertex count for quads and quad strips in order to discard dangling vertices. In principle this shouldn't be necessary, since hardware since Gen4 is capable of discarding dangling vertices by itself. However, it's necessary because as a hack to speed up rendering on Gen 4-5, we sometimes convert quads to trifans and quad strips to tristrips. The trim() function isn't necessary on Gen6 and up.

This patch documents why and when the trim() function is necessary, and avoids calling it when it's not needed.

This will avoid creating problems when we enable hardware support for primitive restart of quads and quad strips on Haswell.

Reviewed-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/i965/brw_draw.c | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_draw.c b/src/mesa/drivers/dri/i965/brw_draw.c
index 809bcc5..43a4f05 100644
--- a/src/mesa/drivers/dri/i965/brw_draw.c
+++ b/src/mesa/drivers/dri/i965/brw_draw.c
@@ -136,6 +136,14 @@ static void gen6_set_prim(struct brw_context *brw,
 }
 
 
+/**
+ * The hardware is capable of removing dangling vertices on its own; however,
+ * prior to Gen6, we sometimes convert quads into trifans (and quad strips
+ * into tristrips), since pre-Gen6 hardware requires a GS to render quads.
+ * This function manually trims dangling vertices from a draw call involving
+ * quads so that those dangling vertices won't get drawn when we convert to
+ * trifans/tristrips.
+ */
 static GLuint trim(GLenum prim, GLuint length)
 {
    if (prim == GL_QUAD_STRIP)
@@ -171,7 +179,11 @@ static void brw_emit_prim(struct brw_context *brw,
       start_vertex_location += brw->vb.start_vertex_bias;
    }
 
-   verts_per_instance = trim(prim->mode, prim->count);
+   /* We only need to trim the primitive count on pre-Gen6. */
+   if (intel->gen < 6)
+      verts_per_instance = trim(prim->mode, prim->count);
+   else
+      verts_per_instance = prim->count;
 
    /* If nothing to emit, just return. */
    if (verts_per_instance == 0)
@@ -228,7 +240,7 @@ static void gen7_emit_prim(struct brw_context *brw,
       start_vertex_location += brw->vb.start_vertex_bias;
    }
 
-   verts_per_instance = trim(prim->mode, prim->count);
+   verts_per_instance = prim->count;
 
    /* If nothing to emit, just return. */
    if (verts_per_instance == 0)
-- 
1.8.2
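To make "dangling vertices" concrete, here is a small C sketch of the trimming arithmetic and of the generation check the patch introduces. The diff only shows the first line of trim(), so the arithmetic below is an assumption based on the geometry (a quad consumes 4 vertices; a quad strip needs 4 vertices for the first quad and 2 for each further one). `enum prim_mode`, `trim_dangling()` and `verts_per_instance()` are hypothetical names.

```c
/* Hypothetical stand-in for the GL primitive modes that matter here. */
enum prim_mode { PRIM_TRIANGLES, PRIM_QUADS, PRIM_QUAD_STRIP };

/* Drops vertices that cannot form a complete quad (assumed arithmetic). */
static unsigned
trim_dangling(enum prim_mode mode, unsigned count)
{
   switch (mode) {
   case PRIM_QUADS:
      return count - count % 4;
   case PRIM_QUAD_STRIP:
      return count > 3 ? count - count % 2 : 0;
   default:
      return count;
   }
}

/* Mirrors the check added to brw_emit_prim(): only pre-Gen6 converts
 * quads to trifans/tristrips, so only pre-Gen6 needs manual trimming.
 */
static unsigned
verts_per_instance(int gen, enum prim_mode mode, unsigned count)
{
   return gen < 6 ? trim_dangling(mode, count) : count;
}
```

For a 10-vertex GL_QUADS draw this yields 8 vertices before Gen6 and the untouched 10 on Gen6 and later, where the hardware discards the two leftover vertices itself.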
[Mesa-dev] [PATCH 1/2] i965/fs/gen7: Allow reads from MRFs.
Since they're actually GRFs, we can read from them.

total instructions in shared programs: 852751 -> 851371 (-0.16%)
instructions in affected programs: 227286 -> 225906 (-0.61%)
(no regressions)
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 22 ++++++++++++----------
 1 files changed, 12 insertions(+), 10 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index c12ba45..57be319 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -2121,16 +2121,18 @@ fs_visitor::compute_to_mrf()
          /* You can't read from an MRF, so if someone else reads our
           * MRF's source GRF that we wanted to rewrite, that stops us.
           */
-         bool interfered = false;
-         for (int i = 0; i < 3; i++) {
-            if (scan_inst->src[i].file == GRF &&
-                scan_inst->src[i].reg == inst->src[0].reg &&
-                scan_inst->src[i].reg_offset == inst->src[0].reg_offset) {
-               interfered = true;
-            }
-         }
-         if (interfered)
-            break;
+         if (intel->gen < 7) {
+            bool interfered = false;
+            for (int i = 0; i < 3; i++) {
+               if (scan_inst->src[i].file == GRF &&
+                   scan_inst->src[i].reg == inst->src[0].reg &&
+                   scan_inst->src[i].reg_offset == inst->src[0].reg_offset) {
+                  interfered = true;
+               }
+            }
+            if (interfered)
+               break;
+         }
 
          if (scan_inst->dst.file == MRF) {
             /* If somebody else writes our MRF here, we can't
-- 
1.7.8.6
[Mesa-dev] [PATCH 2/2] i965/vs/gen7: Allow reads from MRFs.
Since they're actually GRFs, we can read from them.

total instructions in shared programs: 344973 -> 342483 (-0.72%)
instructions in affected programs: 245602 -> 243112 (-1.01%)
(no regressions)
---
 src/mesa/drivers/dri/i965/brw_vec4.cpp | 23 +++++++++++++----------
 1 files changed, 13 insertions(+), 10 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index c58fb44..e337738 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -927,16 +927,18 @@ vec4_visitor::opt_register_coalesce()
           * GRF we're trying to coalesce to, we don't actually handle
           * rewriting sources so bail in that case as well.
           */
-         bool interfered = false;
-         for (int i = 0; i < 3; i++) {
-            if (scan_inst->src[i].file == GRF &&
-                scan_inst->src[i].reg == inst->src[0].reg &&
-                scan_inst->src[i].reg_offset == inst->src[0].reg_offset) {
-               interfered = true;
-            }
-         }
-         if (interfered)
-            break;
+         if (intel->gen < 7) {
+            bool interfered = false;
+            for (int i = 0; i < 3; i++) {
+               if (scan_inst->src[i].file == GRF &&
+                   scan_inst->src[i].reg == inst->src[0].reg &&
+                   scan_inst->src[i].reg_offset == inst->src[0].reg_offset) {
+                  interfered = true;
+               }
+            }
+            if (interfered)
+               break;
+         }
 
          /* If somebody else writes our destination here, we can't coalesce
           * before that.
@@ -956,6 +958,7 @@ vec4_visitor::opt_register_coalesce()
                   break;
                }
             } else {
+               bool interfered = false;
                for (int i = 0; i < 3; i++) {
                   if (scan_inst->src[i].file == inst->dst.file &&
                       scan_inst->src[i].reg == inst->dst.reg &&
-- 
1.7.8.6
Re: [Mesa-dev] [PATCH 2/2] i965/gen7.5: Allow HW primitive restart for all primitive types.
Series Reviewed-by: Jordan Justen jordan.l.jus...@intel.com

On Mon, Apr 8, 2013 at 11:57 AM, Paul Berry stereotype...@gmail.com wrote:
> Gen7.5 (Haswell) hardware supports primitive restart for all primitive
> types. It also handles all possible primitive restart indices.
>
> Rather than specialize both can_cut_index_handle_restart_index() and
> the switch statement in can_cut_index_handle_prims() for Haswell, just
> return early if the hardware is Haswell because we know it can handle
> everything.
>
> Reviewed-by: Kenneth Graunke kenn...@whitecape.org
Re: [Mesa-dev] [PATCH] intel: Remove the texture_tiling driconf option.
Kenneth Graunke kenn...@whitecape.org writes:

> This option can force textures to be untiled. However, on Gen6+,
> depth buffers must be Y-tiled. MSAA buffers also must be Y-tiled.
> So setting this option on even a trivial application like glxgears
> causes assertion failures in a debug build, and likely GPU hangs in a
> release build.
>
> It's just giving users a license to shoot themselves in the foot.
>
> Signed-off-by: Kenneth Graunke kenn...@whitecape.org

Long long ago, it was occasionally useful for testing. No more.

Reviewed-by: Eric Anholt e...@anholt.net
Re: [Mesa-dev] [PATCH] intel: Allocate hiz in intel_renderbuffer_move_to_temp()
On 5 April 2013 16:51, Paul Berry stereotype...@gmail.com wrote:

> On 5 April 2013 15:28, Chad Versace chad.vers...@linux.intel.com wrote:
>> When moving the renderbuffer to a new miptree, we neglected to
>> allocate the hiz buffer for the new miptree. Oops.
>>
>> Fixes all Piglit depthstencil-render-miplevels tests from crash to
>> pass on Sandybridge.
>>
>> CC: Paul Berry stereotype...@gmail.com
>> CC: Eric Anholt e...@anholt.net
>> Signed-off-by: Chad Versace chad.vers...@linux.intel.com
>
> I haven't had a chance to review this yet, but: Candidate for the 9.1
> stable release branch?

Ok, if this is marked as a candidate for 9.1, then it is:

Reviewed-by: Paul Berry stereotype...@gmail.com

>> ---
>>  src/mesa/drivers/dri/intel/intel_fbo.c | 4 ++++
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/src/mesa/drivers/dri/intel/intel_fbo.c b/src/mesa/drivers/dri/intel/intel_fbo.c
>> index b91d6e0..2977568 100644
>> --- a/src/mesa/drivers/dri/intel/intel_fbo.c
>> +++ b/src/mesa/drivers/dri/intel/intel_fbo.c
>> @@ -1010,6 +1010,10 @@ intel_renderbuffer_move_to_temp(struct intel_context *intel,
>>                                   irb->mt->num_samples,
>>                                   false /* force_y_tiling */);
>>
>> +   if (intel->vtbl.is_hiz_depth_format(intel, new_mt->format)) {
>> +      intel_miptree_alloc_hiz(intel, new_mt, irb->mt->num_samples);
>> +   }
>> +
>>     intel_miptree_copy_teximage(intel, intel_image, new_mt, invalidate);
>>
>>     intel_miptree_reference(&irb->mt, intel_image->mt);
>> --
>> 1.8.1.4
Re: [Mesa-dev] [PATCH 2/2] i965/gen7.5: Allow HW primitive restart for all primitive types.
Paul Berry stereotype...@gmail.com writes:

> Gen7.5 (Haswell) hardware supports primitive restart for all primitive
> types. It also handles all possible primitive restart indices.
>
> Rather than specialize both can_cut_index_handle_restart_index() and the
> switch statement in can_cut_index_handle_prims() for Haswell, just return
> early if the hardware is Haswell because we know it can handle
> everything.
>
> Reviewed-by: Kenneth Graunke kenn...@whitecape.org

Series is:

Reviewed-by: Eric Anholt e...@anholt.net
Re: [Mesa-dev] [PATCH 1/2] i965/fs/gen7: Allow reads from MRFs.
Matt Turner matts...@gmail.com writes:

> Since they're actually GRFs, we can read from them.
>
> total instructions in shared programs: 852751 -> 851371 (-0.16%)
> instructions in affected programs:     227286 -> 225906 (-0.61%)
>
> (no regressions)

I don't see you actually rewriting these GRF reads to be the new MRF, so
they'll now be reading uninitialized values.
Re: [Mesa-dev] Mesa 9.1.2? (was Re: Mesa (9.1): 21 new commits)
On 04/05/2013 07:51 PM, Jordan Justen wrote:
> On Fri, Apr 5, 2013 at 7:03 PM, Ian Romanick i...@freedesktop.org wrote:
>> I just cherry picked (almost) all of the marked patches from master that
>> have been out for two weeks or more. There are a couple that I did not
>> pick.
>>
>> With all that out of the way... how does a Mesa 9.1.2 release next Friday
>> sound? 43 patches have been cherry picked since 9.1.1, so it seems like a
>> good time.
>
> 0967c362 brings gen6 from not working at all on TF2, to somewhat working
> with major issues. So, if it is not considered too risky, then getting it
> onto 9.1 might be nice.

The problem is that patch seems to depend on a pile of other patches... at
least 8fbc22e8, but maybe also 463ef47 and a593a1b. Perhaps someone can
recommend an alternate patch specifically for 9.1?
Re: [Mesa-dev] [PATCH] i965/vs: Fix DEBUG_SHADER_TIME when VS terminates with 2 URB writes.
On 04/08/2013 11:17 AM, Paul Berry wrote:
> On 8 April 2013 10:37, Ian Romanick i...@freedesktop.org wrote:
>> On 04/07/2013 06:42 AM, Paul Berry wrote:
>>> The call to emit_shader_time_end() before the second URB write was
>>> conditioned with "if (eot)", but eot is always false in this code path,
>>> so emit_shader_time_end() was never being called for vertex shaders
>>> that performed 2 URB writes.
>>
>> I had to look at that code for way too long to convince myself that your
>> patch was correct. I think it might be better to remove both the
>> conditional emit_shader_time_end calls and put this block of code at the
>> very bottom (unless emit_shader_time_end has some side effect that I
>> don't see):
>>
>>    if (inst->eot) {
>>       if (INTEL_DEBUG & DEBUG_SHADER_TIME)
>>          emit_shader_time_end();
>>    }
>>
>> Or does the last URB write have to be the last instruction?
>
> The last URB write has to be the last instruction, since it's actually
> the URB write that ends the thread ("eot" stands for "end of thread").

I suspected it was something like that.

> For GL 3.2 we're going to need to refactor this function to use a loop,
> since GL 3.2 doubles the number of varying components permitted for
> VS-GS linkage (so we'll need up to 4 URB writes instead of 2). I think
> once that change is made the function is going to be a lot easier to
> follow. Maybe I should just do that refactor now?

It's up to you. I think the code in your patch is okay for now.

Reviewed-by: Ian Romanick ian.d.roman...@intel.com
[Mesa-dev] [PATCH 00/12] Death to array dereferences of vectors!
This series gradually replaces array dereferences of vectors with two
expressions. It takes so many patches because changes are needed to the
existing lowering passes and because several places in the code generate
array dereferences of vectors (e.g., lowering accesses to gl_ClipDistance).
There is also some challenge in dealing with function inout parameters that
are indexed vectors.

The two new expressions are ir_binop_vector_extract and
ir_triop_vector_insert. The former has a vector operand and a scalar
operand. The result is the scalar value from the vector specified by the
scalar. The latter takes a vector and two scalars. The result is a new
vector with one indexed field replaced by a scalar value.

Together this series fixes piglit tests glsl-vs-channel-overwrite-01 and
glsl-vs-channel-overwrite-03.
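To make the semantics of the two new expressions concrete, here is a minimal C++ sketch that models them on a plain four-component array. The names vec4, vector_extract, and vector_insert below are hypothetical stand-ins chosen for illustration; they are not Mesa types or functions, and the real opcodes operate on IR nodes rather than on values.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a GLSL vec4; not a Mesa type.
using vec4 = std::array<float, 4>;

// Models ir_binop_vector_extract: (vector, index) -> scalar.
float vector_extract(const vec4 &v, std::size_t idx)
{
   assert(idx < v.size());   // out-of-range behaviour is left undefined here
   return v[idx];
}

// Models ir_triop_vector_insert: (vector, scalar, index) -> new vector.
// The argument is taken by value, so the caller's vector is not modified.
vec4 vector_insert(vec4 v, float value, std::size_t idx)
{
   assert(idx < v.size());
   v[idx] = value;
   return v;
}
```

The insert form returns a new vector instead of writing through an lvalue. Under this model an indexed store such as v[i] = x can be expressed as v = vector_insert(v, x, i), which is presumably why the series can retire array dereferences of vectors on the left-hand side.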
[Mesa-dev] [PATCH 01/12] glsl: Add ir_binop_vector_extract
From: Ian Romanick ian.d.roman...@intel.com The new opcode is used to get a single field from a vector. The field index may not be constant. This will eventually replace ir_dereference_array of vectors. This is similar to the extractelement instruction in LLVM IR. http://llvm.org/docs/LangRef.html#extractelement-instruction Signed-off-by: Ian Romanick ian.d.roman...@intel.com --- src/glsl/ir.cpp | 5 + src/glsl/ir.h | 10 +- src/glsl/ir_constant_expression.cpp | 35 --- src/glsl/ir_validate.cpp| 6 ++ src/mesa/program/ir_to_mesa.cpp | 1 + 5 files changed, 53 insertions(+), 4 deletions(-) diff --git a/src/glsl/ir.cpp b/src/glsl/ir.cpp index 05b77da..f4596db 100644 --- a/src/glsl/ir.cpp +++ b/src/glsl/ir.cpp @@ -399,6 +399,10 @@ ir_expression::ir_expression(int op, ir_rvalue *op0, ir_rvalue *op1) this-type = op0-type; break; + case ir_binop_vector_extract: + this-type = op0-type-get_scalar_type(); + break; + default: assert(!not reached: missing automatic type setup for ir_expression); this-type = glsl_type::float_type; @@ -505,6 +509,7 @@ static const char *const operator_strs[] = { pow, packHalf2x16_split, ubo_load, + vector_extract, lrp, vector, }; diff --git a/src/glsl/ir.h b/src/glsl/ir.h index 0c3e399..4da54fc 100644 --- a/src/glsl/ir.h +++ b/src/glsl/ir.h @@ -1115,9 +1115,17 @@ enum ir_expression_operation { ir_binop_ubo_load, /** +* Extract a scalar from a vector +* +* operand0 is the vector +* operand1 is the index of the field to read from operand0 +*/ + ir_binop_vector_extract, + + /** * A sentinel marking the last of the binary operations. 
*/ - ir_last_binop = ir_binop_ubo_load, + ir_last_binop = ir_binop_vector_extract, ir_triop_lrp, diff --git a/src/glsl/ir_constant_expression.cpp b/src/glsl/ir_constant_expression.cpp index c09e56a..e802e6c 100644 --- a/src/glsl/ir_constant_expression.cpp +++ b/src/glsl/ir_constant_expression.cpp @@ -391,9 +391,16 @@ ir_expression::constant_expression_value(struct hash_table *variable_context) } if (op[1] != NULL) - assert(op[0]-type-base_type == op[1]-type-base_type || -this-operation == ir_binop_lshift || -this-operation == ir_binop_rshift); + switch (this-operation) { + case ir_binop_lshift: + case ir_binop_rshift: + case ir_binop_vector_extract: +break; + + default: +assert(op[0]-type-base_type == op[1]-type-base_type); +break; + } bool op0_scalar = op[0]-type-is_scalar(); bool op1_scalar = op[1] != NULL op[1]-type-is_scalar(); @@ -1230,6 +1237,28 @@ ir_expression::constant_expression_value(struct hash_table *variable_context) } break; + case ir_binop_vector_extract: { + const int c = op[1]-value.i[0]; + + switch (op[0]-type-base_type) { + case GLSL_TYPE_UINT: +data.u[0] = op[0]-value.u[c]; +break; + case GLSL_TYPE_INT: +data.i[0] = op[0]-value.i[c]; +break; + case GLSL_TYPE_FLOAT: +data.f[0] = op[0]-value.f[c]; +break; + case GLSL_TYPE_BOOL: +data.b[0] = op[0]-value.b[c]; +break; + default: +assert(0); + } + break; + } + case ir_binop_bit_xor: for (unsigned c = 0, c0 = 0, c1 = 0; c components; diff --git a/src/glsl/ir_validate.cpp b/src/glsl/ir_validate.cpp index 699c192..83519cf 100644 --- a/src/glsl/ir_validate.cpp +++ b/src/glsl/ir_validate.cpp @@ -468,6 +468,12 @@ ir_validate::visit_leave(ir_expression *ir) assert(ir-operands[1]-type == glsl_type::uint_type); break; + case ir_binop_vector_extract: + assert(ir-operands[0]-type-is_vector()); + assert(ir-operands[1]-type-is_scalar() + ir-operands[1]-type-is_integer()); + break; + case ir_triop_lrp: assert(ir-operands[0]-type-base_type == GLSL_TYPE_FLOAT); assert(ir-operands[0]-type == ir-operands[1]-type); 
diff --git a/src/mesa/program/ir_to_mesa.cpp b/src/mesa/program/ir_to_mesa.cpp index 14cf5ba..7d351c0 100644 --- a/src/mesa/program/ir_to_mesa.cpp +++ b/src/mesa/program/ir_to_mesa.cpp @@ -1485,6 +1485,7 @@ ir_to_mesa_visitor::visit(ir_expression *ir) emit(ir, OPCODE_LRP, result_dst, op[2], op[1], op[0]); break; + case ir_binop_vector_extract: case ir_quadop_vector: /* This operation should have already been handled. */ -- 1.8.1.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 02/12] glsl: Add ir_triop_vector_insert
From: Ian Romanick ian.d.roman...@intel.com

The new opcode is used to generate a new vector with a single field from
the source vector replaced. This will eventually replace
ir_dereference_array of vectors in the LHS of assignments.

Signed-off-by: Ian Romanick ian.d.roman...@intel.com
---
 src/glsl/ir.cpp                 |  1 +
 src/glsl/ir.h                   | 11 ++++++++++-
 src/glsl/ir_validate.cpp        |  9 +++++++++
 src/mesa/program/ir_to_mesa.cpp |  1 +
 4 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/src/glsl/ir.cpp b/src/glsl/ir.cpp
index f4596db..336ff95 100644
--- a/src/glsl/ir.cpp
+++ b/src/glsl/ir.cpp
@@ -511,6 +511,7 @@ static const char *const operator_strs[] = {
    "ubo_load",
    "vector_extract",
    "lrp",
+   "vector_insert",
    "vector",
 };
 
diff --git a/src/glsl/ir.h b/src/glsl/ir.h
index 4da54fc..7106cde 100644
--- a/src/glsl/ir.h
+++ b/src/glsl/ir.h
@@ -1130,9 +1130,18 @@ enum ir_expression_operation {
    ir_triop_lrp,
 
    /**
+    * Generate a value with one field of a vector changed
+    *
+    * operand0 is the vector
+    * operand1 is the value to write into the vector result
+    * operand2 is the index in operand0 to be modified
+    */
+   ir_triop_vector_insert,
+
+   /**
     * A sentinel marking the last of the ternary operations.
     */
-   ir_last_triop = ir_triop_lrp,
+   ir_last_triop = ir_triop_vector_insert,
 
    ir_quadop_vector,
 
diff --git a/src/glsl/ir_validate.cpp b/src/glsl/ir_validate.cpp
index 83519cf..f304af4 100644
--- a/src/glsl/ir_validate.cpp
+++ b/src/glsl/ir_validate.cpp
@@ -480,6 +480,15 @@ ir_validate::visit_leave(ir_expression *ir)
       assert(ir->operands[2]->type == ir->operands[0]->type || ir->operands[2]->type == glsl_type::float_type);
       break;
 
+   case ir_triop_vector_insert:
+      assert(ir->operands[0]->type->is_vector());
+      assert(ir->operands[1]->type->is_scalar());
+      assert(ir->operands[0]->type->base_type == ir->operands[1]->type->base_type);
+      assert(ir->operands[2]->type->is_scalar()
+             && ir->operands[2]->type->is_integer());
+      assert(ir->type == ir->operands[0]->type);
+      break;
+
    case ir_quadop_vector:
       /* The vector operator collects some number of scalars and generates a
        * vector from them.
diff --git a/src/mesa/program/ir_to_mesa.cpp b/src/mesa/program/ir_to_mesa.cpp
index 7d351c0..eb64347 100644
--- a/src/mesa/program/ir_to_mesa.cpp
+++ b/src/mesa/program/ir_to_mesa.cpp
@@ -1486,6 +1486,7 @@ ir_to_mesa_visitor::visit(ir_expression *ir)
       break;
 
    case ir_binop_vector_extract:
+   case ir_triop_vector_insert:
    case ir_quadop_vector:
       /* This operation should have already been handled.
        */
--
1.8.1.4
[Mesa-dev] [PATCH 03/12] glsl: Refactor part of convert_vec_index_to_cond_assign
From: Ian Romanick ian.d.roman...@intel.com Use a first function that extract the vector being indexed and the index from the deref. Call the second function that does the real work. Coming patches will add a new ir_expression for variable indexing into a vector. Having the lowering pass split into two functions will make it much easier to lower the new ir_expression. Signed-off-by: Ian Romanick ian.d.roman...@intel.com --- src/glsl/lower_vec_index_to_cond_assign.cpp | 47 ++--- 1 file changed, 30 insertions(+), 17 deletions(-) diff --git a/src/glsl/lower_vec_index_to_cond_assign.cpp b/src/glsl/lower_vec_index_to_cond_assign.cpp index f85875f..6572cc4 100644 --- a/src/glsl/lower_vec_index_to_cond_assign.cpp +++ b/src/glsl/lower_vec_index_to_cond_assign.cpp @@ -53,6 +53,9 @@ public: } ir_rvalue *convert_vec_index_to_cond_assign(ir_rvalue *val); + ir_rvalue *convert_vec_index_to_cond_assign(void *mem_ctx, + ir_rvalue *orig_vector, + ir_rvalue *orig_index); virtual ir_visitor_status visit_enter(ir_expression *); virtual ir_visitor_status visit_enter(ir_swizzle *); @@ -65,24 +68,15 @@ public: }; ir_rvalue * -ir_vec_index_to_cond_assign_visitor::convert_vec_index_to_cond_assign(ir_rvalue *ir) +ir_vec_index_to_cond_assign_visitor::convert_vec_index_to_cond_assign(void *mem_ctx, + ir_rvalue *orig_vector, + ir_rvalue *orig_index) { - ir_dereference_array *orig_deref = ir-as_dereference_array(); ir_assignment *assign, *value_assign; ir_variable *index, *var, *value; ir_dereference *deref, *deref_value; unsigned i; - if (!orig_deref) - return ir; - - if (orig_deref-array-type-is_matrix() || - orig_deref-array-type-is_array()) - return ir; - - void *mem_ctx = ralloc_parent(ir); - - assert(orig_deref-array_index-type-base_type == GLSL_TYPE_INT); exec_list list; @@ -92,15 +86,15 @@ ir_vec_index_to_cond_assign_visitor::convert_vec_index_to_cond_assign(ir_rvalue ir_var_temporary); list.push_tail(index); deref = new(base_ir) ir_dereference_variable(index); - assign = new(base_ir) 
ir_assignment(deref, orig_deref-array_index, NULL); + assign = new(base_ir) ir_assignment(deref, orig_index, NULL); list.push_tail(assign); /* Store the value inside a temp, thus avoiding matrixes duplication */ - value = new(base_ir) ir_variable(orig_deref-array-type, vec_value_tmp, + value = new(base_ir) ir_variable(orig_vector-type, vec_value_tmp, ir_var_temporary); list.push_tail(value); deref_value = new(base_ir) ir_dereference_variable(value); - value_assign = new(base_ir) ir_assignment(deref_value, orig_deref-array); + value_assign = new(base_ir) ir_assignment(deref_value, orig_vector); list.push_tail(value_assign); /* Temporary where we store whichever value we swizzle out. */ @@ -113,11 +107,11 @@ ir_vec_index_to_cond_assign_visitor::convert_vec_index_to_cond_assign(ir_rvalue */ ir_rvalue *const cond_deref = compare_index_block(list, index, 0, - orig_deref-array-type-vector_elements, + orig_vector-type-vector_elements, mem_ctx); /* Generate a conditional move of each vector element to the temp. 
*/ - for (i = 0; i orig_deref-array-type-vector_elements; i++) { + for (i = 0; i orig_vector-type-vector_elements; i++) { ir_rvalue *condition_swizzle = new(base_ir) ir_swizzle(cond_deref-clone(ir, NULL), i, 0, 0, 0, 1); @@ -142,6 +136,25 @@ ir_vec_index_to_cond_assign_visitor::convert_vec_index_to_cond_assign(ir_rvalue return new(base_ir) ir_dereference_variable(var); } +ir_rvalue * +ir_vec_index_to_cond_assign_visitor::convert_vec_index_to_cond_assign(ir_rvalue *ir) +{ + ir_dereference_array *orig_deref = ir-as_dereference_array(); + + if (!orig_deref) + return ir; + + if (orig_deref-array-type-is_matrix() || + orig_deref-array-type-is_array()) + return ir; + + assert(orig_deref-array_index-type-base_type == GLSL_TYPE_INT); + + return convert_vec_index_to_cond_assign(ralloc_parent(ir), + orig_deref-array, + orig_deref-array_index); +} + ir_visitor_status ir_vec_index_to_cond_assign_visitor::visit_enter(ir_expression *ir) { -- 1.8.1.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
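The "conditional move of each vector element to the temp" that this pass generates can be illustrated outside the IR. The following is only a sketch under assumptions: extract_by_cond_moves is a hypothetical helper, not a Mesa function, and the 0.0f starting value is this sketch's own choice, since the patch does not say what the temporary holds when no index matches.

```cpp
#include <array>
#include <cstddef>

// Hypothetical helper: selects value[index] without indexing by a variable.
// Each iteration corresponds to one conditional move in the lowered IR.
template <std::size_t N>
float extract_by_cond_moves(const std::array<float, N> &value, int index)
{
   float result = 0.0f;   // sketch-only default for a non-matching index

   for (std::size_t i = 0; i < N; i++) {
      // Compare the index against each constant lane number...
      const bool cond = (index == static_cast<int>(i));
      // ...and move that lane into the temporary only when it matches.
      result = cond ? value[i] : result;
   }

   return result;
}
```

Every lane is read with a constant subscript, so the result needs no variable indexing into the vector. The cost is one comparison and one conditional move per component.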
[Mesa-dev] [PATCH 05/12] glsl: Lower ir_binop_vector_extract to conditional moves
From: Ian Romanick ian.d.roman...@intel.com Lower ir_binop_vector_extract with a non-constant index to a series of conditional moves. This is exactly like ir_dereference_array of a vector with a non-constant index. Signed-off-by: Ian Romanick ian.d.roman...@intel.com --- src/glsl/lower_vec_index_to_cond_assign.cpp | 45 + 1 file changed, 39 insertions(+), 6 deletions(-) diff --git a/src/glsl/lower_vec_index_to_cond_assign.cpp b/src/glsl/lower_vec_index_to_cond_assign.cpp index 6572cc4..2cd540c 100644 --- a/src/glsl/lower_vec_index_to_cond_assign.cpp +++ b/src/glsl/lower_vec_index_to_cond_assign.cpp @@ -55,7 +55,10 @@ public: ir_rvalue *convert_vec_index_to_cond_assign(ir_rvalue *val); ir_rvalue *convert_vec_index_to_cond_assign(void *mem_ctx, ir_rvalue *orig_vector, - ir_rvalue *orig_index); + ir_rvalue *orig_index, + const glsl_type *type); + + ir_rvalue *convert_vector_extract_to_cond_assign(ir_rvalue *ir); virtual ir_visitor_status visit_enter(ir_expression *); virtual ir_visitor_status visit_enter(ir_swizzle *); @@ -70,7 +73,8 @@ public: ir_rvalue * ir_vec_index_to_cond_assign_visitor::convert_vec_index_to_cond_assign(void *mem_ctx, ir_rvalue *orig_vector, - ir_rvalue *orig_index) + ir_rvalue *orig_index, + const glsl_type *type) { ir_assignment *assign, *value_assign; ir_variable *index, *var, *value; @@ -98,7 +102,7 @@ ir_vec_index_to_cond_assign_visitor::convert_vec_index_to_cond_assign(void *mem_ list.push_tail(value_assign); /* Temporary where we store whichever value we swizzle out. */ - var = new(base_ir) ir_variable(ir-type, vec_index_tmp_v, + var = new(base_ir) ir_variable(type, vec_index_tmp_v, ir_var_temporary); list.push_tail(var); @@ -113,7 +117,7 @@ ir_vec_index_to_cond_assign_visitor::convert_vec_index_to_cond_assign(void *mem_ /* Generate a conditional move of each vector element to the temp. 
*/ for (i = 0; i orig_vector-type-vector_elements; i++) { ir_rvalue *condition_swizzle = -new(base_ir) ir_swizzle(cond_deref-clone(ir, NULL), i, 0, 0, 0, 1); +new(base_ir) ir_swizzle(cond_deref-clone(mem_ctx, NULL), i, 0, 0, 0, 1); /* Just clone the rest of the deref chain when trying to get at the * underlying variable. @@ -152,7 +156,22 @@ ir_vec_index_to_cond_assign_visitor::convert_vec_index_to_cond_assign(ir_rvalue return convert_vec_index_to_cond_assign(ralloc_parent(ir), orig_deref-array, - orig_deref-array_index); + orig_deref-array_index, + ir-type); +} + +ir_rvalue * +ir_vec_index_to_cond_assign_visitor::convert_vector_extract_to_cond_assign(ir_rvalue *ir) +{ + ir_expression *const expr = ir-as_expression(); + + if (expr == NULL || expr-operation != ir_binop_vector_extract) + return ir; + + return convert_vec_index_to_cond_assign(ralloc_parent(ir), + expr-operands[0], + expr-operands[1], + ir-type); } ir_visitor_status @@ -162,6 +181,7 @@ ir_vec_index_to_cond_assign_visitor::visit_enter(ir_expression *ir) for (i = 0; i ir-get_num_operands(); i++) { ir-operands[i] = convert_vec_index_to_cond_assign(ir-operands[i]); + ir-operands[i] = convert_vector_extract_to_cond_assign(ir-operands[i]); } return visit_continue; @@ -175,6 +195,7 @@ ir_vec_index_to_cond_assign_visitor::visit_enter(ir_swizzle *ir) * using swizzling of scalars for vector construction. 
*/ ir-val = convert_vec_index_to_cond_assign(ir-val); + ir-val = convert_vector_extract_to_cond_assign(ir-val); return visit_continue; } @@ -188,8 +209,12 @@ ir_vec_index_to_cond_assign_visitor::visit_leave(ir_assignment *ir) unsigned i; ir-rhs = convert_vec_index_to_cond_assign(ir-rhs); - if (ir-condition) + ir-rhs = convert_vector_extract_to_cond_assign(ir-rhs); + + if (ir-condition) { ir-condition = convert_vec_index_to_cond_assign(ir-condition); + ir-condition = convert_vector_extract_to_cond_assign(ir-condition); + } /* Last, handle the LHS */ ir_dereference_array *orig_deref = ir-lhs-as_dereference_array(); @@ -279,6 +304,12 @@ ir_vec_index_to_cond_assign_visitor::visit_enter(ir_call *ir) if (new_param !=
[Mesa-dev] [PATCH 04/12] glsl: Lower ir_binop_vector_extract to swizzle
From: Ian Romanick ian.d.roman...@intel.com Lower ir_binop_vector_extract with a constant index to a swizzle. This is exactly like ir_dereference_array of a vector with a constant index. Signed-off-by: Ian Romanick ian.d.roman...@intel.com --- src/glsl/lower_vec_index_to_swizzle.cpp | 45 + 1 file changed, 45 insertions(+) diff --git a/src/glsl/lower_vec_index_to_swizzle.cpp b/src/glsl/lower_vec_index_to_swizzle.cpp index 264d6dc..ad09dd2 100644 --- a/src/glsl/lower_vec_index_to_swizzle.cpp +++ b/src/glsl/lower_vec_index_to_swizzle.cpp @@ -47,6 +47,7 @@ public: } ir_rvalue *convert_vec_index_to_swizzle(ir_rvalue *val); + ir_rvalue *convert_vector_extract_to_swizzle(ir_rvalue *val); virtual ir_visitor_status visit_enter(ir_expression *); virtual ir_visitor_status visit_enter(ir_swizzle *); @@ -98,6 +99,40 @@ ir_vec_index_to_swizzle_visitor::convert_vec_index_to_swizzle(ir_rvalue *ir) return new(ctx) ir_swizzle(deref-array, i, 0, 0, 0, 1); } +ir_rvalue * +ir_vec_index_to_swizzle_visitor::convert_vector_extract_to_swizzle(ir_rvalue *ir) +{ + ir_expression *const expr = ir-as_expression(); + if (expr == NULL || expr-operation != ir_binop_vector_extract) + return ir; + + ir_constant *const idx = expr-operands[1]-constant_expression_value(); + if (idx == NULL) + return ir; + + void *ctx = ralloc_parent(ir); + this-progress = true; + + /* Page 40 of the GLSL 1.20 spec says: +* +* When indexing with non-constant expressions, behavior is undefined +* if the index is negative, or greater than or equal to the size of +* the vector. +* +* The quoted spec text mentions non-constant expressions, but this code +* operates on constants. These constants are the result of non-constant +* expressions that have been optimized to constants. The common case here +* is a loop counter from an unrolled loop that is used to index a vector. +* +* The ir_swizzle constructor gets angry if the index is negative or too +* large. For simplicity sake, just clamp the index to [0, size-1]. 
+*/ + const int i = MIN2(MAX2(idx-value.i[0], 0), + ((int) expr-operands[0]-type-vector_elements - 1)); + + return new(ctx) ir_swizzle(expr-operands[0], i, 0, 0, 0, 1); +} + ir_visitor_status ir_vec_index_to_swizzle_visitor::visit_enter(ir_expression *ir) { @@ -105,6 +140,7 @@ ir_vec_index_to_swizzle_visitor::visit_enter(ir_expression *ir) for (i = 0; i ir-get_num_operands(); i++) { ir-operands[i] = convert_vec_index_to_swizzle(ir-operands[i]); + ir-operands[i] = convert_vector_extract_to_swizzle(ir-operands[i]); } return visit_continue; @@ -127,6 +163,7 @@ ir_vec_index_to_swizzle_visitor::visit_enter(ir_assignment *ir) { ir-set_lhs(convert_vec_index_to_swizzle(ir-lhs)); ir-rhs = convert_vec_index_to_swizzle(ir-rhs); + ir-rhs = convert_vector_extract_to_swizzle(ir-rhs); return visit_continue; } @@ -140,6 +177,12 @@ ir_vec_index_to_swizzle_visitor::visit_enter(ir_call *ir) if (new_param != param) { param-replace_with(new_param); + } else { +new_param = convert_vec_index_to_swizzle(param); + +if (new_param != param) { + param-replace_with(new_param); +} } } @@ -151,6 +194,7 @@ ir_vec_index_to_swizzle_visitor::visit_enter(ir_return *ir) { if (ir-value) { ir-value = convert_vec_index_to_swizzle(ir-value); + ir-value = convert_vector_extract_to_swizzle(ir-value); } return visit_continue; @@ -160,6 +204,7 @@ ir_visitor_status ir_vec_index_to_swizzle_visitor::visit_enter(ir_if *ir) { ir-condition = convert_vec_index_to_swizzle(ir-condition); + ir-condition = convert_vector_extract_to_swizzle(ir-condition); return visit_continue; } -- 1.8.1.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
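The clamping step in the patch above is small enough to show in isolation. This is a standalone sketch, assuming only what the comment in the patch states (the index is clamped to [0, size-1] before the swizzle is built); clamp_vector_index is a hypothetical name, and std::min/std::max stand in for Mesa's MIN2/MAX2 macros.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper mirroring MIN2(MAX2(idx, 0), vector_elements - 1).
int clamp_vector_index(int idx, unsigned vector_elements)
{
   assert(vector_elements >= 1);   // GLSL vectors have 2 to 4 components
   const int last = static_cast<int>(vector_elements) - 1;
   return std::min(std::max(idx, 0), last);
}
```

For a four-component vector, an index of -3 becomes 0 and an index of 7 becomes 3, so the value handed to the swizzle constructor is always a valid component number.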
[Mesa-dev] [PATCH 06/12] glsl: Add lowering pass for ir_triop_vector_insert
From: Ian Romanick ian.d.roman...@intel.com This will eventually replace do_vec_index_to_cond_assign. This lowering pass is called in all the places where do_vec_index_to_cond_assign or do_vec_index_to_swizzle is called. Signed-off-by: Ian Romanick ian.d.roman...@intel.com --- src/glsl/Makefile.sources | 1 + src/glsl/glsl_parser_extras.cpp| 1 + src/glsl/ir_optimization.h | 1 + src/glsl/lower_vector_insert.cpp | 157 + src/mesa/drivers/dri/i965/brw_shader.cpp | 1 + src/mesa/program/ir_to_mesa.cpp| 1 + src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 1 + 7 files changed, 163 insertions(+) create mode 100644 src/glsl/lower_vector_insert.cpp diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources index 674a05f..8e2dc1b 100644 --- a/src/glsl/Makefile.sources +++ b/src/glsl/Makefile.sources @@ -69,6 +69,7 @@ LIBGLSL_FILES = \ $(GLSL_SRCDIR)/lower_vec_index_to_cond_assign.cpp \ $(GLSL_SRCDIR)/lower_vec_index_to_swizzle.cpp \ $(GLSL_SRCDIR)/lower_vector.cpp \ + $(GLSL_SRCDIR)/lower_vector_insert.cpp \ $(GLSL_SRCDIR)/lower_output_reads.cpp \ $(GLSL_SRCDIR)/lower_ubo_reference.cpp \ $(GLSL_SRCDIR)/opt_algebraic.cpp \ diff --git a/src/glsl/glsl_parser_extras.cpp b/src/glsl/glsl_parser_extras.cpp index 0992294..d38e967 100644 --- a/src/glsl/glsl_parser_extras.cpp +++ b/src/glsl/glsl_parser_extras.cpp @@ -1236,6 +1236,7 @@ do_common_optimization(exec_list *ir, bool linked, progress = do_algebraic(ir) || progress; progress = do_lower_jumps(ir) || progress; progress = do_vec_index_to_swizzle(ir) || progress; + progress = lower_vector_insert(ir, false) || progress; progress = do_swizzle_swizzle(ir) || progress; progress = do_noop_swizzle(ir) || progress; diff --git a/src/glsl/ir_optimization.h b/src/glsl/ir_optimization.h index a8885d7..0216e46 100644 --- a/src/glsl/ir_optimization.h +++ b/src/glsl/ir_optimization.h @@ -106,6 +106,7 @@ void lower_ubo_reference(struct gl_shader *shader, exec_list *instructions); void lower_packed_varyings(void *mem_ctx, unsigned 
location_base, unsigned locations_used, ir_variable_mode mode, gl_shader *shader); +bool lower_vector_insert(exec_list *instructions, bool lower_nonconstant_index); bool optimize_redundant_jumps(exec_list *instructions); bool optimize_split_arrays(exec_list *instructions, bool linked); diff --git a/src/glsl/lower_vector_insert.cpp b/src/glsl/lower_vector_insert.cpp new file mode 100644 index 000..da1485c --- /dev/null +++ b/src/glsl/lower_vector_insert.cpp @@ -0,0 +1,157 @@ +/* + * Copyright © 2013 Intel Corporation + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the Software), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER + * DEALINGS IN THE SOFTWARE. 
+ */ +#include ir.h +#include ir_builder.h +#include ir_rvalue_visitor.h +#include ir_optimization.h + +using namespace ir_builder; + +class vector_insert_visitor : public ir_rvalue_visitor { +public: + vector_insert_visitor(bool lower_nonconstant_index) + : progress(false), lower_nonconstant_index(lower_nonconstant_index) + { + factory.instructions = factory_instructions; + } + + virtual ~vector_insert_visitor() + { + assert(factory_instructions.is_empty()); + } + + virtual void handle_rvalue(ir_rvalue **rv); + + ir_factory factory; + exec_list factory_instructions; + bool progress; + bool lower_nonconstant_index; +}; + + +void +vector_insert_visitor::handle_rvalue(ir_rvalue **rv) +{ + if (*rv == NULL || (*rv)-ir_type != ir_type_expression) + return; + + ir_expression *const expr = (ir_expression *) *rv; + + if (likely(expr-operation != ir_triop_vector_insert)) + return; + + factory.mem_ctx = ralloc_parent(expr); + + ir_constant *const idx = expr-operands[2]-constant_expression_value(); + if (idx != NULL) { + /*
[Mesa-dev] [PATCH 07/12] glsl: Convert ir_binop_vector_extract in the LHS to ir_triop_vector_insert
From: Ian Romanick ian.d.roman...@intel.com

The ast_array_index code can't know whether to generate an
ir_binop_vector_extract or an ir_triop_vector_insert. Instead it will
always generate ir_binop_vector_extract, and the LHS and RHS have to be
re-written.

Signed-off-by: Ian Romanick ian.d.roman...@intel.com
---
 src/glsl/ast_to_hir.cpp | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index a0ec71c..5414e18 100644
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -672,6 +672,30 @@ do_assignment(exec_list *instructions, struct _mesa_glsl_parse_state *state,
    void *ctx = state;
    bool error_emitted = (lhs->type->is_error() || rhs->type->is_error());
 
+   /* If the assignment LHS comes back as an ir_binop_vector_extract
+    * expression, move it to the RHS as an ir_triop_vector_insert.
+    */
+   if (lhs->ir_type == ir_type_expression) {
+      ir_expression *const expr = lhs->as_expression();
+
+      if (unlikely(expr->operation == ir_binop_vector_extract)) {
+         ir_rvalue *new_rhs =
+            validate_assignment(state, lhs->type, rhs, is_initializer);
+
+         if (new_rhs == NULL) {
+            _mesa_glsl_error(& lhs_loc, state, "type mismatch");
+            return lhs;
+         } else {
+            rhs = new(ctx) ir_expression(ir_triop_vector_insert,
+                                         expr->operands[0]->type,
+                                         expr->operands[0],
+                                         new_rhs,
+                                         expr->operands[1]);
+            lhs = expr->operands[0]->clone(ctx, NULL);
+         }
+      }
+   }
+
    ir_variable *lhs_var = lhs->variable_referenced();
    if (lhs_var)
       lhs_var->assigned = true;
--
1.8.1.4
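The rewrite performed in do_assignment can be sketched on a toy expression tree. Everything below is hypothetical and for illustration only: Node, make_node, and rewrite_assignment are invented names, the tree is far simpler than Mesa's ir_rvalue hierarchy, and the sketch shares the vector node between the new LHS and RHS where the patch clones it.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical miniature IR node; not a Mesa type.
struct Node {
   std::string op;     // "var", "vector_extract", or "vector_insert"
   std::string name;   // only meaningful when op == "var"
   std::vector<std::shared_ptr<Node>> operands;
};
using NodePtr = std::shared_ptr<Node>;

NodePtr make_node(std::string op, std::string name,
                  std::vector<NodePtr> operands = {})
{
   return std::make_shared<Node>(
      Node{std::move(op), std::move(name), std::move(operands)});
}

// Turns "extract(vec, idx) = rhs" into "vec = insert(vec, rhs, idx)".
// Any other LHS is returned unchanged. Returns the (lhs, rhs) pair to use.
std::pair<NodePtr, NodePtr> rewrite_assignment(NodePtr lhs, NodePtr rhs)
{
   if (lhs->op != "vector_extract")
      return {lhs, rhs};

   NodePtr vec = lhs->operands[0];
   NodePtr idx = lhs->operands[1];

   // Operand order follows the patch: vector, new value, index.
   NodePtr new_rhs = make_node("vector_insert", "", {vec, rhs, idx});
   return {vec, new_rhs};
}
```

After the rewrite the left-hand side is the whole vector, which is an lvalue, and the indexed store lives entirely in the right-hand expression.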
[Mesa-dev] [PATCH 08/12] glsl: Generate ir_binop_vector_extract for indexing of vectors
From: Ian Romanick ian.d.roman...@intel.com

Now ir_dereference_array of a vector will never occur in the RHS of an
expression.

Signed-off-by: Ian Romanick ian.d.roman...@intel.com
---
 src/glsl/ast_array_index.cpp | 23 +++++++++++++++++------
 1 file changed, 17 insertions(+), 6 deletions(-)

diff --git a/src/glsl/ast_array_index.cpp b/src/glsl/ast_array_index.cpp
index 862f64c..e7bc299 100644
--- a/src/glsl/ast_array_index.cpp
+++ b/src/glsl/ast_array_index.cpp
@@ -31,17 +31,13 @@ _mesa_ast_array_index_to_hir(void *mem_ctx,
                              ir_rvalue *array, ir_rvalue *idx,
                              YYLTYPE &loc, YYLTYPE &idx_loc)
 {
-   ir_rvalue *result = new(mem_ctx) ir_dereference_array(array, idx);
-
    if (!array->type->is_error()
        && !array->type->is_array()
        && !array->type->is_matrix()
-       && !array->type->is_vector()) {
+       && !array->type->is_vector())
       _mesa_glsl_error(& idx_loc, state,
                        "cannot dereference non-array / non-matrix / "
                        "non-vector");
-      result->type = glsl_type::error_type;
-   }
 
    if (!idx->type->is_error()) {
       if (!idx->type->is_integer()) {
@@ -174,5 +170,20 @@ _mesa_ast_array_index_to_hir(void *mem_ctx,
       }
    }
 
-   return result;
+   /* After performing all of the error checking, generate the IR for the
+    * expression.
+    */
+   if (array->type->is_array()
+       || array->type->is_matrix()) {
+      return new(mem_ctx) ir_dereference_array(array, idx);
+   } else if (array->type->is_vector()) {
+      return new(mem_ctx) ir_expression(ir_binop_vector_extract, array, idx);
+   } else if (array->type->is_error()) {
+      return array;
+   } else {
+      ir_rvalue *result = new(mem_ctx) ir_dereference_array(array, idx);
+      result->type = glsl_type::error_type;
+
+      return result;
+   }
 }
--
1.8.1.4
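The four-way decision at the end of _mesa_ast_array_index_to_hir can be summarised as a small dispatch table. The sketch below is only a model of that decision: BaseKind, IndexIR, and choose_index_ir are hypothetical names, and the real code inspects glsl_type objects and builds IR nodes instead of returning an enum.

```cpp
// Hypothetical classification of the type being indexed.
enum class BaseKind { Array, Matrix, Vector, Error, Other };

// Hypothetical labels for the IR the patch generates in each case.
enum class IndexIR {
   DereferenceArray,        // ir_dereference_array(array, idx)
   VectorExtract,           // ir_expression(ir_binop_vector_extract, ...)
   PassThroughError,        // return the already error-typed operand
   ErrorTypedDereference    // ir_dereference_array with error_type
};

IndexIR choose_index_ir(BaseKind kind)
{
   switch (kind) {
   case BaseKind::Array:
   case BaseKind::Matrix:
      return IndexIR::DereferenceArray;
   case BaseKind::Vector:
      return IndexIR::VectorExtract;
   case BaseKind::Error:
      return IndexIR::PassThroughError;
   case BaseKind::Other:
      return IndexIR::ErrorTypedDereference;
   }

   // Not reached for valid enum values; keeps compilers quiet.
   return IndexIR::ErrorTypedDereference;
}
```

Only the vector case changes relative to the old code path, which produced an ir_dereference_array for every indexable type.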
[Mesa-dev] [PATCH 11/12] glsl: Generate correct ir_binop_vector_extract code for out and inout parameters
From: Ian Romanick <ian.d.roman...@intel.com>

Like with type conversions on out parameters, some extra copies need to
occur to handle these cases.  The fundamental problem is that
ir_binop_vector_extract is not an lvalue, but out and inout parameters
must be lvalues.  A previous patch dealt with a similar problem in the
LHS of ir_assignment.

Signed-off-by: Ian Romanick <ian.d.roman...@intel.com>
---
 src/glsl/ast_function.cpp | 149 +++---
 1 file changed, 102 insertions(+), 47 deletions(-)

diff --git a/src/glsl/ast_function.cpp b/src/glsl/ast_function.cpp
index 26f72cf..0d32241 100644
--- a/src/glsl/ast_function.cpp
+++ b/src/glsl/ast_function.cpp
@@ -165,10 +165,18 @@ verify_parameter_modes(_mesa_glsl_parse_state *state,
                              actual->variable_referenced()->name);
             return false;
          } else if (!actual->is_lvalue()) {
-            _mesa_glsl_error(&loc, state,
-                             "function parameter '%s %s' is not an lvalue",
-                             mode, formal->name);
-            return false;
+            /* Even though ir_binop_vector_extract is not an l-value, let it
+             * slip through.  generate_call will handle it correctly.
+             */
+            ir_expression *const expr = ((ir_rvalue *) actual)->as_expression();
+            if (expr == NULL
+                || expr->operation != ir_binop_vector_extract
+                || !expr->operands[0]->is_lvalue()) {
+               _mesa_glsl_error(&loc, state,
+                                "function parameter '%s %s' is not an lvalue",
+                                mode, formal->name);
+               return false;
+            }
          }
       }
 
@@ -178,6 +186,93 @@ verify_parameter_modes(_mesa_glsl_parse_state *state,
    return true;
 }
 
+static void
+fix_parameter(void *mem_ctx, ir_rvalue *actual, const glsl_type *formal_type,
+              exec_list *before_instructions, exec_list *after_instructions,
+              bool parameter_is_inout)
+{
+   ir_expression *const expr = actual->as_expression();
+
+   /* If the types match exactly and the parameter is not a vector-extract,
+    * nothing needs to be done to fix the parameter.
+    */
+   if (formal_type == actual->type
+       && (expr == NULL || expr->operation != ir_binop_vector_extract))
+      return;
+
+   /* To convert an out parameter, we need to create a temporary variable to
+    * hold the value before conversion, and then perform the conversion after
+    * the function call returns.
+    *
+    * This has the effect of transforming code like this:
+    *
+    *   void f(out int x);
+    *   float value;
+    *   f(value);
+    *
+    * Into IR that's equivalent to this:
+    *
+    *   void f(out int x);
+    *   float value;
+    *   int out_parameter_conversion;
+    *   f(out_parameter_conversion);
+    *   value = float(out_parameter_conversion);
+    *
+    * If the parameter is an ir_expression of ir_binop_vector_extract,
+    * additional conversion is needed in the post-call re-write.
+    */
+   ir_variable *tmp =
+      new(mem_ctx) ir_variable(formal_type, "inout_tmp", ir_var_temporary);
+
+   before_instructions->push_tail(tmp);
+
+   /* If the parameter is an inout parameter, copy the value of the actual
+    * parameter to the new temporary.  Note that no type conversion is allowed
+    * here because inout parameters must match types exactly.
+    */
+   if (parameter_is_inout) {
+      /* Inout parameters should never require conversion, since that would
+       * require an implicit conversion to exist both to and from the formal
+       * parameter type, and there are no bidirectional implicit conversions.
+       */
+      assert (actual->type == formal_type);
+
+      ir_dereference_variable *const deref_tmp_1 =
+         new(mem_ctx) ir_dereference_variable(tmp);
+      ir_assignment *const assignment =
+         new(mem_ctx) ir_assignment(deref_tmp_1, actual);
+      before_instructions->push_tail(assignment);
+   }
+
+   /* Replace the parameter in the call with a dereference of the new
+    * temporary.
+    */
+   ir_dereference_variable *const deref_tmp_2 =
+      new(mem_ctx) ir_dereference_variable(tmp);
+   actual->replace_with(deref_tmp_2);
+
+
+   /* Copy the temporary variable to the actual parameter with optional
+    * type conversion applied.
+    */
+   ir_rvalue *rhs = new(mem_ctx) ir_dereference_variable(tmp);
+   if (actual->type != formal_type)
+      rhs = convert_component(rhs, actual->type);
+
+   ir_rvalue *lhs = actual;
+   if (expr != NULL && expr->operation == ir_binop_vector_extract) {
+      rhs = new(mem_ctx) ir_expression(ir_triop_vector_insert,
+                                       expr->operands[0]->type,
+                                       expr->operands[0]->clone(mem_ctx, NULL),
+                                       rhs,
+                                       expr->operands[1]->clone(mem_ctx, NULL));
+      lhs = expr->operands[0]->clone(mem_ctx, NULL);
+   }
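The copy-in / call / copy-out sequence that fix_parameter builds can be tried outside the compiler. The following is only a minimal sketch in plain C++, not Mesa code: Vec4, vector_extract, vector_insert, scale_by_two and call_on_component are hypothetical names invented for the illustration, and the operand order of vector_insert (vector, scalar, index) simply follows the ir_triop_vector_insert construction in the hunk above.

```cpp
#include <array>
#include <cassert>

using Vec4 = std::array<float, 4>;

// Stand-in for ir_binop_vector_extract: reads one component, yields an r-value.
float vector_extract(const Vec4 &v, int index)
{
   assert(index >= 0 && index < 4);
   return v[index];
}

// Stand-in for ir_triop_vector_insert: returns a copy of v with one
// component replaced.  It never modifies its operand in place.
Vec4 vector_insert(Vec4 v, float scalar, int index)
{
   assert(index >= 0 && index < 4);
   v[index] = scalar;
   return v;
}

// Hypothetical callee with an inout parameter.
void scale_by_two(float &x)
{
   x *= 2.0f;
}

// Roughly what "scale_by_two(v[i])" turns into after the rewrite.
Vec4 call_on_component(Vec4 v, int i)
{
   float inout_tmp = vector_extract(v, i);  // copy-in, needed for inout only
   scale_by_two(inout_tmp);                 // the call sees an ordinary l-value
   v = vector_insert(v, inout_tmp, i);      // copy-out after the call returns
   return v;
}
```

Calling call_on_component on {1, 2, 3, 4} with i = 2 doubles only the third component. The whole vector is reassigned after the call, which is why the patch clones operands[0] for both the new LHS and the first operand of the insert.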
[Mesa-dev] [PATCH 09/12] glsl: Convert lower_clip_distance_visitor to be an ir_rvalue_visitor
From: Ian Romanick <ian.d.roman...@intel.com>

Right now the lower_clip_distance_visitor lowers variable indexing into
gl_ClipDistance into variable indexing into both the array
gl_ClipDistanceMESA and the vectors of that array.  For example,

    gl_ClipDistance[i] = f;

becomes

    gl_ClipDistanceMESA[i/4][i%4] = f;

However, variable indexing into vectors using ir_dereference_array is
being removed.  Instead, ir_expression with ir_triop_vector_insert will
be used.  The above code will become

    gl_ClipDistanceMESA[i/4] =
       vector_insert(gl_ClipDistanceMESA[i/4], i % 4, f);

In order to do this, an ir_rvalue_visitor will need to be used.  This
commit is really just a refactor to get ready for that.

Signed-off-by: Ian Romanick <ian.d.roman...@intel.com>
Cc: Paul Berry <stereotype...@gmail.com>
---
 src/glsl/lower_clip_distance.cpp | 136 +--
 1 file changed, 86 insertions(+), 50 deletions(-)

diff --git a/src/glsl/lower_clip_distance.cpp b/src/glsl/lower_clip_distance.cpp
index 643807d..26a0feb 100644
--- a/src/glsl/lower_clip_distance.cpp
+++ b/src/glsl/lower_clip_distance.cpp
@@ -46,10 +46,10 @@
  */
 
 #include "glsl_symbol_table.h"
-#include "ir_hierarchical_visitor.h"
+#include "ir_rvalue_visitor.h"
 #include "ir.h"
 
-class lower_clip_distance_visitor : public ir_hierarchical_visitor {
+class lower_clip_distance_visitor : public ir_rvalue_visitor {
 public:
    lower_clip_distance_visitor()
       : progress(false), old_clip_distance_var(NULL),
@@ -59,11 +59,12 @@ public:
 
    virtual ir_visitor_status visit(ir_variable *);
    void create_indices(ir_rvalue*, ir_rvalue *&, ir_rvalue *&);
-   virtual ir_visitor_status visit_leave(ir_dereference_array *);
    virtual ir_visitor_status visit_leave(ir_assignment *);
    void visit_new_assignment(ir_assignment *ir);
    virtual ir_visitor_status visit_leave(ir_call *);
 
+   virtual void handle_rvalue(ir_rvalue **rvalue);
+
    bool progress;
 
    /**
@@ -173,33 +174,35 @@ lower_clip_distance_visitor::create_indices(ir_rvalue *old_index,
 }
 
 
-/**
- * Replace any expression that indexes into the gl_ClipDistance array with an
- * expression that indexes into one of the vec4's in gl_ClipDistanceMESA and
- * accesses the appropriate component.
- */
-ir_visitor_status
-lower_clip_distance_visitor::visit_leave(ir_dereference_array *ir)
+void
+lower_clip_distance_visitor::handle_rvalue(ir_rvalue **rv)
 {
    /* If the gl_ClipDistance var hasn't been declared yet, then
     * there's no way this deref can refer to it.
     */
-   if (!this->old_clip_distance_var)
-      return visit_continue;
-
-   ir_dereference_variable *old_var_ref = ir->array->as_dereference_variable();
-   if (old_var_ref && old_var_ref->var == this->old_clip_distance_var) {
-      this->progress = true;
-      ir_rvalue *array_index;
-      ir_rvalue *swizzle_index;
-      this->create_indices(ir->array_index, array_index, swizzle_index);
-      void *mem_ctx = ralloc_parent(ir);
-      ir->array = new(mem_ctx) ir_dereference_array(
-         this->new_clip_distance_var, array_index);
-      ir->array_index = swizzle_index;
+   if (!this->old_clip_distance_var || *rv == NULL)
+      return;
+
+   ir_dereference_array *const array = (*rv)->as_dereference_array();
+   if (array != NULL) {
+      /* Replace any expression that indexes into the gl_ClipDistance array
+       * with an expression that indexes into one of the vec4's in
+       * gl_ClipDistanceMESA and accesses the appropriate component.
+       */
+      ir_dereference_variable *old_var_ref =
+         array->array->as_dereference_variable();
+      if (old_var_ref && old_var_ref->var == this->old_clip_distance_var) {
+         this->progress = true;
+         ir_rvalue *array_index;
+         ir_rvalue *swizzle_index;
+         this->create_indices(array->array_index, array_index, swizzle_index);
+         void *mem_ctx = ralloc_parent(array);
+         array->array =
+            new(mem_ctx) ir_dereference_array(this->new_clip_distance_var,
+                                              array_index);
+         array->array_index = swizzle_index;
+      }
    }
-
-   return visit_continue;
 }
 
 
@@ -214,38 +217,71 @@ lower_clip_distance_visitor::visit_leave(ir_assignment *ir)
 {
    ir_dereference_variable *lhs_var = ir->lhs->as_dereference_variable();
    ir_dereference_variable *rhs_var = ir->rhs->as_dereference_variable();
-   if ((lhs_var && lhs_var->var == this->old_clip_distance_var)
-       || (rhs_var && rhs_var->var == this->old_clip_distance_var)) {
-      /* LHS or RHS of the assignment is the entire gl_ClipDistance array.
-       * Since we are reshaping gl_ClipDistance from an array of floats to an
-       * array of vec4's, this isn't going to work as a bulk assignment
-       * anymore, so unroll it to element-by-element assignments and lower
-       * each of them.
-       *
-       * Note: to unroll into element-by-element assignments, we
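The index reshaping described in the commit message (element i of the float array lives in component i%4 of vec4 i/4) can be checked with a few lines of standalone C++. This is a sketch under stated assumptions, not the pass itself: ClipIndex, split_clip_index and read_clip_distance are hypothetical names, and the fixed size of eight clip distances (two vec4s) is an assumption made to keep the example small.

```cpp
#include <array>
#include <cassert>

using Vec4 = std::array<float, 4>;

// Hypothetical pair of indices produced for one gl_ClipDistance[i] access.
struct ClipIndex {
   int array_index;   // which vec4 of gl_ClipDistanceMESA
   int component;     // which component inside that vec4
};

// Split a flat index the way the commit message describes: i/4 and i%4.
ClipIndex split_clip_index(int i)
{
   assert(i >= 0 && i < 8);   // assumes at most 8 clip distances
   return ClipIndex{ i / 4, i % 4 };
}

// Read element i of the conceptual float[8] from its vec4[2] storage.
float read_clip_distance(const std::array<Vec4, 2> &clip_distance_mesa, int i)
{
   const ClipIndex idx = split_clip_index(i);
   return clip_distance_mesa[idx.array_index][idx.component];
}
```

For example, index 6 maps to vec4 number 1, component 2.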
[Mesa-dev] [PATCH 12/12] glsl: Death to array dereferences of vectors!
From: Ian Romanick <ian.d.roman...@intel.com>

Now that all the places that used to generate array dereferences of
vectors have been changed to generate either ir_binop_vector_extract or
ir_triop_vector_insert (or both), remove all support for dealing with
this deprecated construct.

As an added safeguard, modify ir_validate to reject ir_dereference_array
of a vector.

Signed-off-by: Ian Romanick <ian.d.roman...@intel.com>
---
 src/glsl/ir_validate.cpp                    |  29 +++
 src/glsl/lower_vec_index_to_cond_assign.cpp | 116 +---
 src/glsl/lower_vec_index_to_swizzle.cpp     |  56 +-
 3 files changed, 32 insertions(+), 169 deletions(-)

diff --git a/src/glsl/ir_validate.cpp b/src/glsl/ir_validate.cpp
index f304af4..4957146 100644
--- a/src/glsl/ir_validate.cpp
+++ b/src/glsl/ir_validate.cpp
@@ -69,6 +69,8 @@ public:
    virtual ir_visitor_status visit_leave(ir_expression *ir);
    virtual ir_visitor_status visit_leave(ir_swizzle *ir);
 
+   virtual ir_visitor_status visit_enter(class ir_dereference_array *);
+
    virtual ir_visitor_status visit_enter(ir_assignment *ir);
    virtual ir_visitor_status visit_enter(ir_call *ir);
 
@@ -102,6 +104,33 @@ ir_validate::visit(ir_dereference_variable *ir)
 }
 
 ir_visitor_status
+ir_validate::visit_enter(class ir_dereference_array *ir)
+{
+   if (!ir->array->type->is_array() && !ir->array->type->is_matrix()) {
+      printf("ir_dereference_array @ %p does not specify an array or a "
+             "matrix\n",
+             (void *) ir);
+      ir->print();
+      printf("\n");
+      abort();
+   }
+
+   if (!ir->array_index->type->is_scalar()) {
+      printf("ir_dereference_array @ %p does not have scalar index: %s\n",
+             (void *) ir, ir->array_index->type->name);
+      abort();
+   }
+
+   if (!ir->array_index->type->is_integer()) {
+      printf("ir_dereference_array @ %p does not have integer index: %s\n",
+             (void *) ir, ir->array_index->type->name);
+      abort();
+   }
+
+   return visit_continue;
+}
+
+ir_visitor_status
 ir_validate::visit_enter(ir_if *ir)
 {
    if (ir->condition->type != glsl_type::bool_type) {
diff --git a/src/glsl/lower_vec_index_to_cond_assign.cpp b/src/glsl/lower_vec_index_to_cond_assign.cpp
index 2cd540c..f74e1d9 100644
--- a/src/glsl/lower_vec_index_to_cond_assign.cpp
+++ b/src/glsl/lower_vec_index_to_cond_assign.cpp
@@ -52,7 +52,6 @@ public:
       progress = false;
    }
 
-   ir_rvalue *convert_vec_index_to_cond_assign(ir_rvalue *val);
    ir_rvalue *convert_vec_index_to_cond_assign(void *mem_ctx,
                                                ir_rvalue *orig_vector,
                                                ir_rvalue *orig_index,
@@ -141,26 +140,6 @@ ir_vec_index_to_cond_assign_visitor::convert_vec_index_to_cond_assign(void *mem_
 }
 
 ir_rvalue *
-ir_vec_index_to_cond_assign_visitor::convert_vec_index_to_cond_assign(ir_rvalue *ir)
-{
-   ir_dereference_array *orig_deref = ir->as_dereference_array();
-
-   if (!orig_deref)
-      return ir;
-
-   if (orig_deref->array->type->is_matrix() ||
-       orig_deref->array->type->is_array())
-      return ir;
-
-   assert(orig_deref->array_index->type->base_type == GLSL_TYPE_INT);
-
-   return convert_vec_index_to_cond_assign(ralloc_parent(ir),
-                                           orig_deref->array,
-                                           orig_deref->array_index,
-                                           ir->type);
-}
-
-ir_rvalue *
 ir_vec_index_to_cond_assign_visitor::convert_vector_extract_to_cond_assign(ir_rvalue *ir)
 {
    ir_expression *const expr = ir->as_expression();
@@ -180,7 +159,6 @@ ir_vec_index_to_cond_assign_visitor::visit_enter(ir_expression *ir)
    unsigned int i;
 
    for (i = 0; i < ir->get_num_operands(); i++) {
-      ir->operands[i] = convert_vec_index_to_cond_assign(ir->operands[i]);
       ir->operands[i] = convert_vector_extract_to_cond_assign(ir->operands[i]);
    }
 
@@ -194,7 +172,6 @@ ir_vec_index_to_cond_assign_visitor::visit_enter(ir_swizzle *ir)
     * the result of indexing a vector is.  But maybe at some point we'll end up
     * using swizzling of scalars for vector construction.
     */
-   ir->val = convert_vec_index_to_cond_assign(ir->val);
    ir->val = convert_vector_extract_to_cond_assign(ir->val);
 
    return visit_continue;
@@ -203,95 +180,12 @@ ir_vec_index_to_cond_assign_visitor::visit_enter(ir_swizzle *ir)
 ir_visitor_status
 ir_vec_index_to_cond_assign_visitor::visit_leave(ir_assignment *ir)
 {
-   ir_variable *index, *var;
-   ir_dereference_variable *deref;
-   ir_assignment *assign;
-   unsigned i;
-
-   ir->rhs = convert_vec_index_to_cond_assign(ir->rhs);
    ir->rhs = convert_vector_extract_to_cond_assign(ir->rhs);
 
    if (ir->condition) {
-      ir->condition = convert_vec_index_to_cond_assign(ir->condition);
       ir->condition = convert_vector_extract_to_cond_assign(ir->condition);
    }
 
-   /* Last, handle the LHS */
-   ir_dereference_array *orig_deref = ir->lhs->as_dereference_array();
-
-   if
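For readers unfamiliar with what "vec index to cond assign" means, the idea is that a variable component index is replaced by one comparison and one guarded assignment per component. The sketch below illustrates that idea in plain C++ only; extract_with_cond_assign is a hypothetical name, the zero initial value of the temporary is an assumption of the sketch, and none of this is the pass's actual IR-building code.

```cpp
#include <array>
#include <cassert>

using Vec4 = std::array<float, 4>;

// Select vec[index] without ever indexing the vector by a variable:
// each component is read with a constant index and stored only when
// its comparison against the runtime index succeeds.
float extract_with_cond_assign(const Vec4 &vec, int index)
{
   assert(index >= 0 && index < 4);

   float result = 0.0f;   // placeholder initial value (assumption of this sketch)

   const bool is0 = (index == 0);
   const bool is1 = (index == 1);
   const bool is2 = (index == 2);
   const bool is3 = (index == 3);

   if (is0) result = vec[0];   // conditional assignment, constant component
   if (is1) result = vec[1];
   if (is2) result = vec[2];
   if (is3) result = vec[3];

   return result;
}
```

After this patch the pass only needs to recognize ir_binop_vector_extract as the input form, since ir_dereference_array of a vector can no longer appear.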
[Mesa-dev] [PATCH 10/12] glsl: Use vector-insert and vector-extract on elements of gl_ClipDistanceMESA
From: Ian Romanick <ian.d.roman...@intel.com>

Variable indexing into vectors using ir_dereference_array is being
removed, so this lowering pass has to generate something different.

Signed-off-by: Ian Romanick <ian.d.roman...@intel.com>
Cc: Paul Berry <stereotype...@gmail.com>
---
 src/glsl/lower_clip_distance.cpp | 36 ++--
 1 file changed, 34 insertions(+), 2 deletions(-)

diff --git a/src/glsl/lower_clip_distance.cpp b/src/glsl/lower_clip_distance.cpp
index 26a0feb..fd6e3f0 100644
--- a/src/glsl/lower_clip_distance.cpp
+++ b/src/glsl/lower_clip_distance.cpp
@@ -197,10 +197,17 @@ lower_clip_distance_visitor::handle_rvalue(ir_rvalue **rv)
          ir_rvalue *swizzle_index;
          this->create_indices(array->array_index, array_index, swizzle_index);
          void *mem_ctx = ralloc_parent(array);
-         array->array =
+
+         ir_dereference_array *const ClipDistanceMESA_deref =
             new(mem_ctx) ir_dereference_array(this->new_clip_distance_var,
                                               array_index);
-         array->array_index = swizzle_index;
+
+         ir_expression *const expr =
+            new(mem_ctx) ir_expression(ir_binop_vector_extract,
+                                       ClipDistanceMESA_deref,
+                                       swizzle_index);
+
+         *rv = expr;
       }
    }
 }
@@ -280,7 +287,32 @@ lower_clip_distance_visitor::visit_leave(ir_assignment *ir)
       return visit_continue;
    }
 
+   /* Handle the LHS as if it were an r-value.  This may cause the LHS to get
+    * replaced with an ir_expression of ir_binop_vector_extract.  If this
+    * occurs, replace it with a dereference of the vector, and replace the RHS
+    * with an ir_triop_vector_insert.
+    */
    handle_rvalue((ir_rvalue **)&ir->lhs);
+   if (ir->lhs->ir_type == ir_type_expression) {
+      ir_expression *const expr = (ir_expression *) ir->lhs;
+
+      /* The expression must be of the form:
+       *
+       *     (vector_extract gl_ClipDistanceMESA[i], j).
+       */
+      assert(expr->operation == ir_binop_vector_extract);
+      assert(expr->operands[0]->ir_type == ir_type_dereference_array);
+
+      ir_dereference *const new_lhs = (ir_dereference *) expr->operands[0];
+      ir->rhs = new(ctx) ir_expression(ir_triop_vector_insert,
+                                       new_lhs->type,
+                                       new_lhs->clone(ctx, NULL),
+                                       ir->rhs,
+                                       expr->operands[1]);
+      ir->set_lhs(new_lhs);
+      ir->write_mask = (1U << new_lhs->type->vector_elements) - 1;
+   }
+
    return rvalue_visit(ir);
 }
 
--
1.8.1.4
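The write_mask update in the last hunk is easy to overlook: once the LHS becomes the whole vec4 rather than a single component, every component has to be enabled for writing. A small sketch of that arithmetic follows, in plain C++ with hypothetical names (full_write_mask, masked_assign); the masked assignment is only a toy model of how a write mask is usually interpreted, not Mesa's ir_assignment.

```cpp
#include <array>
#include <cassert>

using Vec4 = std::array<float, 4>;

// Same expression as in the patch: one bit per vector component.
unsigned full_write_mask(unsigned vector_elements)
{
   assert(vector_elements >= 1 && vector_elements <= 4);
   return (1U << vector_elements) - 1;
}

// Toy model of a masked vector assignment: only components whose bit
// is set in write_mask are copied from src to dst.
Vec4 masked_assign(Vec4 dst, const Vec4 &src, unsigned write_mask)
{
   for (unsigned c = 0; c < 4; ++c) {
      if (write_mask & (1U << c))
         dst[c] = src[c];
   }
   return dst;
}
```

For a vec4 the mask is 0xF. If a single-component mask were left in place, only that one component of the vector_insert result would be stored.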
Re: [Mesa-dev] [PATCH 3/3] i965: Prefer Y-tiling on Gen6+.
On Mon, Apr 08, 2013 at 07:27:38PM -0700, Kenneth Graunke wrote:
> In the past, we preferred X-tiling for color buffers because our BLT
> code couldn't handle Y-tiling.  However, the BLT paths have been largely
> replaced by BLORP on Gen6+, which can handle any kind of tiling.
>
> We hadn't measured any performance improvement in the past, but that's
> probably because compressed textures were all uncompressed anyway.

Long ago, when I drew diagrams showing which pixels lie in which
cachelines while enabling tiling on i915g, I figured that, at least for the
4x4 block compressed layouts with 128 bits per block, X and Y tiling should
result in about equally optimal layouts (the cachelines just stack
differently): X-tiled actually gives you an 8x8 grid of 4x4 blocks, so I
figured that would be better for TLB efficiency.

Anyway, I've never done real benchmarks. I'm just curious that you attribute
all the speedup here to compressed textures, and I wonder a bit what it
would look like if (some of) the compressed layouts kept using X-tiling.
But it's getting a bit late here ;-)
-Daniel

> Improves performance in GLB27_TRex_C24Z16_FixedTime by 7.69231%.
>
> Signed-off-by: Kenneth Graunke <kenn...@whitecape.org>
> ---
>  src/mesa/drivers/dri/intel/intel_mipmap_tree.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
> index 8dd04be..6a9f08c 100644
> --- a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
> +++ b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
> @@ -344,7 +344,7 @@ intel_miptree_choose_tiling(struct intel_context *intel,
>        return I915_TILING_Y;
>
>     if (width0 >= 64)
> -      return I915_TILING_X;
> +      return intel->gen >= 6 ? I915_TILING_Y : I915_TILING_X;
>
>     return I915_TILING_NONE;
>  }
> --
> 1.8.1.1

--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
Re: [Mesa-dev] [PATCH 3/3] i965: Prefer Y-tiling on Gen6+.
On Tue, Apr 09, 2013 at 01:17:39AM +0200, Daniel Vetter wrote:
> On Mon, Apr 08, 2013 at 07:27:38PM -0700, Kenneth Graunke wrote:
> > In the past, we preferred X-tiling for color buffers because our BLT
> > code couldn't handle Y-tiling.  However, the BLT paths have been largely
> > replaced by BLORP on Gen6+, which can handle any kind of tiling.
> >
> > We hadn't measured any performance improvement in the past, but that's
> > probably because compressed textures were all uncompressed anyway.
>
> Long ago, when I drew diagrams showing which pixels lie in which
> cachelines while enabling tiling on i915g, I figured that, at least for the
> 4x4 block compressed layouts with 128 bits per block, X and Y tiling should
> result in about equally optimal layouts (the cachelines just stack
> differently): X-tiled actually gives you an 8x8 grid of 4x4 blocks, so I
> figured that would be better for TLB efficiency.

Blergh, can't do math: it should be 8x32 or 32x8 grids of 4x4 blocks in a
tile. So on a quick look, X- and Y-tiled are about equally nicely laid out.
I mixed up the 8x8 with the cacheline pattern of Y-tiled, where each
cacheline is a 4x4 pixel block (at least for 32-bit-per-pixel stuff).
-Daniel

> Anyway, I've never done real benchmarks. I'm just curious that you attribute
> all the speedup here to compressed textures, and I wonder a bit what it
> would look like if (some of) the compressed layouts kept using X-tiling.
> But it's getting a bit late here ;-)
> -Daniel
>
> > Improves performance in GLB27_TRex_C24Z16_FixedTime by 7.69231%.
> >
> > Signed-off-by: Kenneth Graunke <kenn...@whitecape.org>
> > ---
> >  src/mesa/drivers/dri/intel/intel_mipmap_tree.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
> > index 8dd04be..6a9f08c 100644
> > --- a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
> > +++ b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
> > @@ -344,7 +344,7 @@ intel_miptree_choose_tiling(struct intel_context *intel,
> >        return I915_TILING_Y;
> >
> >     if (width0 >= 64)
> > -      return I915_TILING_X;
> > +      return intel->gen >= 6 ? I915_TILING_Y : I915_TILING_X;
> >
> >     return I915_TILING_NONE;
> >  }
> > --
> > 1.8.1.1
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch

--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
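The corrected figures (32x8 versus 8x32 blocks per tile) can be reproduced with a little arithmetic. The sketch below assumes the usual 4 KiB tile geometry on this hardware, 512 bytes by 8 rows for X-tiling and 128 bytes by 32 rows for Y-tiling; those two numbers are not stated in the thread and are an assumption here, as are the names TileLayout, BlockGrid and blocks_per_tile. The 16 bytes per block follows from the "128 bits per block" in the mail above.

```cpp
#include <cassert>

// Assumed tile geometry in bytes per row and rows per tile.
struct TileLayout {
   int row_bytes;
   int rows;
};

const TileLayout x_tile = { 512, 8 };    // assumption: X-tile is 512 B x 8 rows
const TileLayout y_tile = { 128, 32 };   // assumption: Y-tile is 128 B x 32 rows

// A 4x4 compressed block at 128 bits per block occupies 16 bytes.
const int block_bytes = 128 / 8;

struct BlockGrid {
   int blocks_wide;
   int blocks_high;
};

// For block-compressed data each memory row holds one row of blocks,
// so the tile covers (row_bytes / block_bytes) x rows blocks.
BlockGrid blocks_per_tile(const TileLayout &tile)
{
   assert(tile.row_bytes % block_bytes == 0);
   return BlockGrid{ tile.row_bytes / block_bytes, tile.rows };
}
```

Under these assumptions an X-tile covers 32x8 blocks and a Y-tile covers 8x32 blocks, 256 blocks either way, which matches the correction above.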
Re: [Mesa-dev] remove mfeatures.h, take two
On 04/08/2013 11:26 AM, Matt Turner wrote:
> Ready to commit?

Thanks for the reminder. I think it's ready, but IIRC only one person besides myself really tested it. I think I could cherry-pick the commits a few at a time to master...

-Brian
Re: [Mesa-dev] [PATCH] r600g: Fix UMAD on Cayman
Pushed, thanks. The transform feedback test still doesn't pass, but at least the hardlocks are gone.

Marek

On Sun, Apr 7, 2013 at 6:29 PM, Martin Andersson <g02ma...@gmail.com> wrote:
> If there are no objections or comments on this, it would be nice if
> someone could commit it.
>
> //Martin
>
> On Tue, Apr 2, 2013 at 10:43 PM, Martin Andersson <g02ma...@gmail.com> wrote:
> > The multiplication part of tgsi_umad did not work on Cayman, because it did not
> > populate the correct vector slots.
> > ---
> >  src/gallium/drivers/r600/r600_shader.c | 45 --
> >  1 file changed, 32 insertions(+), 13 deletions(-)
> >
> > diff --git a/src/gallium/drivers/r600/r600_shader.c b/src/gallium/drivers/r600/r600_shader.c
> > index 82885d1..6c4cc8f 100644
> > --- a/src/gallium/drivers/r600/r600_shader.c
> > +++ b/src/gallium/drivers/r600/r600_shader.c
> > @@ -5840,7 +5840,7 @@ static int tgsi_umad(struct r600_shader_ctx *ctx)
> >  {
> >  	struct tgsi_full_instruction *inst = &ctx->parse.FullToken.FullInstruction;
> >  	struct r600_bytecode_alu alu;
> > -	int i, j, r;
> > +	int i, j, k, r;
> >  	int lasti = tgsi_last_instruction(inst->Dst[0].Register.WriteMask);
> >
> >  	/* src0 * src1 */
> > @@ -5848,21 +5848,40 @@ static int tgsi_umad(struct r600_shader_ctx *ctx)
> >  		if (!(inst->Dst[0].Register.WriteMask & (1 << i)))
> >  			continue;
> >
> > -		memset(&alu, 0, sizeof(struct r600_bytecode_alu));
> > +		if (ctx->bc->chip_class == CAYMAN) {
> > +			for (j = 0 ; j < 4; j++) {
> > +				memset(&alu, 0, sizeof(struct r600_bytecode_alu));
> >
> > -		alu.dst.chan = i;
> > -		alu.dst.sel = ctx->temp_reg;
> > -		alu.dst.write = 1;
> > +				alu.op = ALU_OP2_MULLO_UINT;
> > +				for (k = 0; k < inst->Instruction.NumSrcRegs; k++) {
> > +					r600_bytecode_src(&alu.src[k], &ctx->src[k], i);
> > +				}
> > +				tgsi_dst(ctx, &inst->Dst[0], j, &alu.dst);
> > +				alu.dst.sel = ctx->temp_reg;
> > +				alu.dst.write = (j == i);
> > +				if (j == 3)
> > +					alu.last = 1;
> > +				r = r600_bytecode_add_alu(ctx->bc, &alu);
> > +				if (r)
> > +					return r;
> > +			}
> > +		} else {
> > +			memset(&alu, 0, sizeof(struct r600_bytecode_alu));
> >
> > -		alu.op = ALU_OP2_MULLO_UINT;
> > -		for (j = 0; j < 2; j++) {
> > -			r600_bytecode_src(&alu.src[j], &ctx->src[j], i);
> > -		}
> > +			alu.dst.chan = i;
> > +			alu.dst.sel = ctx->temp_reg;
> > +			alu.dst.write = 1;
> >
> > -		alu.last = 1;
> > -		r = r600_bytecode_add_alu(ctx->bc, &alu);
> > -		if (r)
> > -			return r;
> > +			alu.op = ALU_OP2_MULLO_UINT;
> > +			for (j = 0; j < 2; j++) {
> > +				r600_bytecode_src(&alu.src[j], &ctx->src[j], i);
> > +			}
> > +
> > +			alu.last = 1;
> > +			r = r600_bytecode_add_alu(ctx->bc, &alu);
> > +			if (r)
> > +				return r;
> > +		}
> >  	}
> >
> > --
> > 1.8.2
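The shape of the Cayman branch (the same MULLO_UINT issued in all four vector slots, with only slot i allowed to write and slot 3 closing the group) can be modelled in a few lines. This is a toy model only: AluOp and emit_mullo_cayman are hypothetical names, and nothing here touches the real r600 bytecode structures.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, heavily simplified picture of one ALU slot.
struct AluOp {
   int src_chan;   // source channel read by every slot
   int dst_chan;   // vector slot this op occupies
   bool write;     // whether the result is actually written
   bool last;      // marks the end of the instruction group
};

// Emit the multiply for destination channel i the way the patched
// Cayman branch does: four ops, one per slot, a single writer.
std::vector<AluOp> emit_mullo_cayman(int i)
{
   assert(i >= 0 && i < 4);

   std::vector<AluOp> group;
   for (int j = 0; j < 4; ++j) {
      AluOp op;
      op.src_chan = i;         // all slots read channel i of the sources
      op.dst_chan = j;         // slot j
      op.write = (j == i);     // only the slot matching i keeps its result
      op.last = (j == 3);      // the fourth slot terminates the group
      group.push_back(op);
   }
   return group;
}
```

The non-Cayman branch corresponds to a single op with dst_chan = i, write = true and last = true.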
[Mesa-dev] [Bug 63117] OSMesa Gallium Empty Output
https://bugs.freedesktop.org/show_bug.cgi?id=63117

--- Comment #2 from Brian Paul <bri...@vmware.com> ---
Created attachment 77643
  --> https://bugs.freedesktop.org/attachment.cgi?id=77643&action=edit
patch for osmesa.c

Kevin, can you try this patch?

I think the unique thing that vtk is doing is calling OSMesaMakeCurrent() several times per frame. Each time OSMesaMakeCurrent() was called, we were creating new gallium drawing surfaces, so any previous rendering to the frame was getting lost. The patch tries to re-use gallium surfaces from one MakeCurrent to the next.
Re: [Mesa-dev] [PATCH] i965/vs: Fix DEBUG_SHADER_TIME when VS terminates with 2 URB writes.
On 04/07/2013 06:42 AM, Paul Berry wrote:
> The call to emit_shader_time_end() before the second URB write was
> conditioned with "if (eot)", but eot is always false in this code path,
> so emit_shader_time_end() was never being called for vertex shaders
> that performed 2 URB writes.
> ---
>  src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 6 ++
>  1 file changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
> index 8bd2fd8..ca1cfe8 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
> @@ -2664,10 +2664,8 @@ vec4_visitor::emit_urb_writes()
>           emit_urb_slot(mrf++, c->prog_data.vue_map.slot_to_varying[slot]);
>        }
>
> -      if (eot) {
> -         if (INTEL_DEBUG & DEBUG_SHADER_TIME)
> -            emit_shader_time_end();
> -      }
> +      if (INTEL_DEBUG & DEBUG_SHADER_TIME)
> +         emit_shader_time_end();
>
>        current_annotation = "URB write";
>        inst = emit(VS_OPCODE_URB_WRITE);

Yeah...sorry for missing this in the last round of review.

Reviewed-by: Kenneth Graunke <kenn...@whitecape.org>