Re: [Mesa-dev] Rename mesa/src/util (Was: gallium/util: add u_bit_scan64)
On 02/04/2015 07:04 AM, Jose Fonseca wrote:
> This change broke MinGW/MSVC builds because ffsll is not available there.
>
> There is a ffsll C fallback, but it's in src/mesa/main/imports.[ch]. So
> rather than duplicating it in src/gallium/auxiliary/util/u_math.h I'd
> prefer to move it to src/util.
>
> And here lies the problem: what header name should be used for math
> helpers?
>
> I think the filenames in src/util and the directory itself are poorly
> named for something that is meant to be included by so many other
> components:
> - there is no unique prefix in most headers
> - util/ clashes with src/gallium/auxiliary/util/
>
> Hence I'd like to propose to:
> - rename src/util to something unique (e.g., cgrt, for Common Graphics
>   RunTime)
>
> And maybe:
> - prefix all header/source files in there with a unique cgrt_* prefix too
>
> And maybe in the future:
> - use the cgrt_* prefix for symbols too.

Actually, I've been wondering for a while now why we don't just use
Gallium's utility code in Mesa for stuff like this. It's pretty well
sorted out there already.

-Brian
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
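For readers unfamiliar with the helper under discussion, here is a minimal sketch of what a portable ffsll fallback and a u_bit_scan64-style consumer could look like. The names util_ffsll_fallback and bit_scan64 are hypothetical, chosen for illustration only; this is not the code in imports.[ch] or u_math.h.

```c
#include <stdint.h>

/* Hypothetical fallback with the same contract as ffsll(): returns the
 * 1-based index of the least significant set bit, or 0 if v is zero. */
static int
util_ffsll_fallback(uint64_t v)
{
   int bit = 1;

   if (v == 0)
      return 0;

   /* Skip runs of zero low-order bits, halving the window each step. */
   if ((v & 0xffffffffu) == 0) { v >>= 32; bit += 32; }
   if ((v & 0xffffu) == 0)     { v >>= 16; bit += 16; }
   if ((v & 0xffu) == 0)       { v >>= 8;  bit += 8; }
   if ((v & 0xfu) == 0)        { v >>= 4;  bit += 4; }
   if ((v & 0x3u) == 0)        { v >>= 2;  bit += 2; }
   if ((v & 0x1u) == 0)        { bit += 1; }

   return bit;
}

/* Hypothetical bit-scan helper: returns the 0-based index of the lowest
 * set bit in *mask and clears it. *mask must be non-zero. */
static int
bit_scan64(uint64_t *mask)
{
   int i = util_ffsll_fallback(*mask) - 1;

   *mask &= ~(UINT64_C(1) << i);
   return i;
}
```

A caller would typically loop with `while (mask) { int i = bit_scan64(&mask); ... }`, which is why the helper needs an ffsll that exists on every supported compiler.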
[Mesa-dev] [Bug 88962] [osmesa] Crash on postprocessing if z buffer is NULL
https://bugs.freedesktop.org/show_bug.cgi?id=88962

--- Comment #4 from Brian Paul bri...@vmware.com ---
The first patch looks good. I posted something similar to mesa-dev
yesterday.

But the second patch doesn't seem right. There could be post-process
stages that don't care about the depth/stencil buffer but we'd still want
them to run. So instead, I think the pp_mlaa.c code should check for a
missing/null depth/stencil buffer.

--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
Re: [Mesa-dev] [PATCH 1/3] i965/fs: Implement the WaCMPInstFlagDepClearedEarly work-around.
On Tuesday, February 03, 2015 10:17:36 PM Matt Turner wrote:
> Prevents piglit regressions from the next patch.
> ---
>  src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 37 +-
>  1 file changed, 36 insertions(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> index 77d4908..8cd36f8 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> @@ -1734,7 +1734,42 @@ fs_generator::generate_code(const cfg_t *cfg, int dispatch_width)
>           brw_F16TO32(p, dst, src[0]);
>           break;
>        case BRW_OPCODE_CMP:
> -         brw_CMP(p, dst, inst->conditional_mod, src[0], src[1]);
> +         /* The Ivybridge/BayTrail WaCMPInstFlagDepClearedEarly workaround says
> +          * that when the destination is a GRF that the dependency-clear bit on
> +          * the flag register is cleared early.
> +          *
> +          * Suggested workarounds are to disable coissuing CMP instructions
> +          * or to split CMP(16) instructions into two CMP(8) instructions.
> +          *
> +          * We choose to split into CMP(8) instructions since disabling
> +          * coissuing would affect CMP instructions not otherwise affected by
> +          * the errata.
> +          */
> +         if (dispatch_width == 16 && brw->gen == 7 && !brw->is_haswell) {
> +            if (dst.file == BRW_GENERAL_REGISTER_FILE) {
> +               brw_set_default_compression_control(p, BRW_COMPRESSION_NONE);
> +               brw_CMP(p, firsthalf(dst), inst->conditional_mod,
> +                          firsthalf(src[0]), firsthalf(src[1]));
> +               brw_set_default_compression_control(p, BRW_COMPRESSION_2NDHALF);
> +               brw_CMP(p, sechalf(dst), inst->conditional_mod,
> +                          sechalf(src[0]), sechalf(src[1]));
> +               brw_set_default_compression_control(p, BRW_COMPRESSION_COMPRESSED);
> +
> +               multiple_instructions_emitted = true;
> +            } else if (dst.file == BRW_ARCHITECTURE_REGISTER_FILE) {
> +               /* For unknown reasons, the aforementioned workaround is not
> +                * sufficient. Overriding the type when the destination is the
> +                * null register is necessary but not sufficient by itself.
> +                */
> +               assert(dst.nr == BRW_ARF_NULL);
> +               dst.type = BRW_REGISTER_TYPE_D;
> +               brw_CMP(p, dst, inst->conditional_mod, src[0], src[1]);
> +            } else {
> +               unreachable("not reached");
> +            }
> +         } else {
> +            brw_CMP(p, dst, inst->conditional_mod, src[0], src[1]);
> +         }
>           break;
>        case BRW_OPCODE_SEL:
>           brw_SEL(p, dst, src[0], src[1]);

These three patches seem reasonable to me.
Reviewed-by: Kenneth Graunke kenn...@whitecape.org

I do wonder, though - overriding the destination type to D effectively
disables coissuing for a single instruction. I know the documentation
claims that source types = F and destination types = D isn't legal, but
we've been doing it for ages with no apparent issues. This might be a less
expensive workaround.

Then again, if it were that easy, presumably the other driver team
would've done it that way...
Re: [Mesa-dev] [PATCH] i965/fs: Fix saturate on MAD and LRP with the NIR backend.
Hi,

On 02/03/2015 05:06 PM, Jason Ekstrand wrote:
> Ooh! I bet this fixes our rendering problems on some of those benchmarks
> too! I was wondering why generating MAD was causing problems.

With NIR, today's Mesa correctly renders all the benchmarks in which I had
noticed rendering artifacts last week.

The performance gap (in uniform-heavy things) is still there though. :-)

- Eero

> Reviewed-by: Jason Ekstrand jason.ekstr...@intel.com
>
> On Feb 3, 2015 1:18 AM, Kenneth Graunke kenn...@whitecape.org wrote:
>> Fixes misrendering in Witcher 2 with INTEL_USE_NIR=1, and probably many
>> other programs.
>>
>> Signed-off-by: Kenneth Graunke kenn...@whitecape.org
>> ---
>>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 6 ++++--
>>  1 file changed, 4 insertions(+), 2 deletions(-)
>>
>> This depends on Jason's 3 patch series that removes emit_percomp.
>> It's available in the 'nir-madfix' branch of my tree.
>>
>> This was caught by tests/spec/arb_fragment_program/lrp_sat.shader_test
>> with my in-progress Mesa IR -> NIR converter code, so I don't think we
>> need to write more Piglit tests. We just don't have a GLSL based one.
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> index fbb1622..153a1be 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> @@ -1070,12 +1070,14 @@ fs_visitor::nir_emit_alu(nir_alu_instr *instr)
>>        break;
>>
>>     case nir_op_ffma:
>> -      emit(MAD(result, op[2], op[1], op[0]));
>> +      inst = emit(MAD(result, op[2], op[1], op[0]));
>> +      inst->saturate = instr->dest.saturate;
>>        break;
>>
>>     case nir_op_flrp:
>>        /* TODO emulate for gen < 6 */
>> -      emit(LRP(result, op[2], op[1], op[0]));
>> +      inst = emit(LRP(result, op[2], op[1], op[0]));
>> +      inst->saturate = instr->dest.saturate;
>>        break;
>>
>>     case nir_op_bcsel:
>> --
>> 2.2.2
[Mesa-dev] [Bug 79629] [Dri3 bisected] piglit glx_GLX_ARB_create_context_current_with_no_framebuffer fail
https://bugs.freedesktop.org/show_bug.cgi?id=79629

--- Comment #17 from Chris Wilson ch...@chris-wilson.co.uk ---
(In reply to Eero Tamminen from comment #14)
> (In reply to XiongZhang from comment #13)
> > Yes, I test with DRI3. When I compile xf86-video-intel and mesa, I add
> > the --enable-dri3 option. Without the proposed patch, I could easily
> > reproduce this issue. And after adding the DRI 2 option to Xorg.conf,
> > this issue disappears.
>
> I assume you refer to the original bug and versions of SW at that time.
>
> In the meanwhile the X intel driver defaulted back to DRI2 from DRI3. I
> think you now need to set DRI to version 3 specifically in xorg.conf.
> Chris?

Yes. You also needed to manually --enable-dri3. I've just fixed that so
we build DRI3 (if available) and just limit to DRI2 at runtime (and so
allow selection of DRI3 with a normal build).

commit db629a38342883176d58357fa014176c9e45115d
Author: Chris Wilson ch...@chris-wilson.co.uk
Date: Wed Feb 4 09:34:14 2015 +

    Allow runtime selection between DRI levels

    Rather than imposing a maximum DRI level at compile time by compiling
    out unwanted protocol handlers, default to limiting it at runtime so
    that we can switch between any level.

    Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk
Re: [Mesa-dev] [PATCH] i965/fs: Fix saturate on MAD and LRP with the NIR backend.
On Wednesday, February 04, 2015 11:39:07 AM Eero Tamminen wrote:
> Hi,
>
> On 02/03/2015 05:06 PM, Jason Ekstrand wrote:
> > Ooh! I bet this fixes our rendering problems on some of those
> > benchmarks too! I was wondering why generating MAD was causing
> > problems.
>
> With NIR, today's Mesa correctly renders all the benchmarks in which I
> had noticed rendering artifacts last week.
>
> The performance gap (in uniform-heavy things) is still there though. :-)
>
> - Eero

Awesome. Thanks for testing it!

--Ken
[Mesa-dev] [Bug 79629] [Dri3 bisected] piglit glx_GLX_ARB_create_context_current_with_no_framebuffer fail
https://bugs.freedesktop.org/show_bug.cgi?id=79629

--- Comment #15 from XiongZhang xiong.y.zh...@intel.com ---
(In reply to Eero Tamminen from comment #14)
> (In reply to XiongZhang from comment #13)
> > Yes, I test with DRI3. When I compile xf86-video-intel and mesa, I add
> > the --enable-dri3 option. Without the proposed patch, I could easily
> > reproduce this issue. And after adding the DRI 2 option to Xorg.conf,
> > this issue disappears.
>
> I assume you refer to the original bug and versions of SW at that time.
>
> In the meanwhile the X intel driver defaulted back to DRI2 from DRI3. I
> think you now need to set DRI to version 3 specifically in xorg.conf.
> Chris?

I use last week's mesa and xf86-video-intel. With DRI set to version 3 in
xorg.conf, the proposed patch also fixes this issue.
[Mesa-dev] [Bug 79629] [Dri3 bisected] piglit glx_GLX_ARB_create_context_current_with_no_framebuffer fail
https://bugs.freedesktop.org/show_bug.cgi?id=79629

--- Comment #16 from fangxun xunx.f...@intel.com ---
(In reply to Chris Wilson from comment #12)
> Do you mind reconfirming that you actually tested with DRI3 since the
> proposed patch is not yet included?

Sorry, I tested with DRI2, not DRI3.
It still fails on DRI3 (built with --enable-dri3, DRI set to version 3 in
xorg.conf).
It passes with the attached patch on DRI3.
[Mesa-dev] [Bug 88967] Black textures Radeon m HD 5650
https://bugs.freedesktop.org/show_bug.cgi?id=88967

--- Comment #4 from Benjamin Bellec b.bel...@gmail.com ---
I guess X3TC is a 32-bit game. Do you have the 32-bit version of
libtxc-dxtn-s2tc0, and not only the 64-bit one?

Also, what does this command give:
$ glxinfo | grep GL_EXT_texture_compression_s3tc
[Mesa-dev] [Bug 88967] Black textures Radeon m HD 5650
https://bugs.freedesktop.org/show_bug.cgi?id=88967

--- Comment #2 from Grimdoll worm-...@yandex.ru ---
My system already has the latest libtxc-dxtn-s2tc0.
Re: [Mesa-dev] [PATCH 1/3] i965/fs: Implement the WaCMPInstFlagDepClearedEarly work-around.
On Wed, Feb 4, 2015 at 2:18 AM, Francisco Jerez curroje...@riseup.net wrote:
> Matt Turner matts...@gmail.com writes:
>
>> Prevents piglit regressions from the next patch.
>> ---
>>  src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 37 +-
>>  1 file changed, 36 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
>> index 77d4908..8cd36f8 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
>> @@ -1734,7 +1734,42 @@ fs_generator::generate_code(const cfg_t *cfg, int dispatch_width)
>>           brw_F16TO32(p, dst, src[0]);
>>           break;
>>        case BRW_OPCODE_CMP:
>> -         brw_CMP(p, dst, inst->conditional_mod, src[0], src[1]);
>> +         /* The Ivybridge/BayTrail WaCMPInstFlagDepClearedEarly workaround says
>> +          * that when the destination is a GRF that the dependency-clear bit on
>> +          * the flag register is cleared early.
>> +          *
>> +          * Suggested workarounds are to disable coissuing CMP instructions
>> +          * or to split CMP(16) instructions into two CMP(8) instructions.
>> +          *
>> +          * We choose to split into CMP(8) instructions since disabling
>> +          * coissuing would affect CMP instructions not otherwise affected by
>> +          * the errata.
>> +          */
>> +         if (dispatch_width == 16 && brw->gen == 7 && !brw->is_haswell) {
>> +            if (dst.file == BRW_GENERAL_REGISTER_FILE) {
>> +               brw_set_default_compression_control(p, BRW_COMPRESSION_NONE);
>> +               brw_CMP(p, firsthalf(dst), inst->conditional_mod,
>> +                          firsthalf(src[0]), firsthalf(src[1]));
>> +               brw_set_default_compression_control(p, BRW_COMPRESSION_2NDHALF);
>> +               brw_CMP(p, sechalf(dst), inst->conditional_mod,
>> +                          sechalf(src[0]), sechalf(src[1]));
>> +               brw_set_default_compression_control(p, BRW_COMPRESSION_COMPRESSED);
>> +
>> +               multiple_instructions_emitted = true;
>> +            } else if (dst.file == BRW_ARCHITECTURE_REGISTER_FILE) {
>> +               /* For unknown reasons, the aforementioned workaround is not
>> +                * sufficient. Overriding the type when the destination is the
>> +                * null register is necessary but not sufficient by itself.
>> +                */
>> +               assert(dst.nr == BRW_ARF_NULL);
>> +               dst.type = BRW_REGISTER_TYPE_D;
>> +               brw_CMP(p, dst, inst->conditional_mod, src[0], src[1]);
>
> What do you mean by not sufficient? This is quite a common use-case of
> the CMP instruction... Any idea what should be done?

Implementing the WaCMPInstFlagDepClearedEarly workaround by splitting
CMP(16) -> 2x CMP(8) and copying src0's type to dst in *_visitor::CMP
leads to some piglit failures (glsl-fs-atan-2.shader_test for instance).

But overriding the null destination type to D after copying src0's type
to dst in *_visitor::CMP leads to other piglit failures
(fs-bool-less-compare-{true,false}).

So it seems that both of these are necessary but not sufficient, and
frustratingly I cannot find any documentation for why the things fail
when the null destination's type is float.
[Mesa-dev] [Bug 88967] Black textures Radeon m HD 5650
https://bugs.freedesktop.org/show_bug.cgi?id=88967

--- Comment #3 from Grimdoll worm-...@yandex.ru ---
When I launch the X3TC game from the console, after the game menu appears
I just see something like "can not to load texture, using dummy... can not
load dummy..." in the console, and this message repeats infinitely.
[Mesa-dev] [Bug 88967] Black textures Radeon m HD 5650
https://bugs.freedesktop.org/show_bug.cgi?id=88967

--- Comment #1 from Sven Arvidsson s...@whiz.se ---
Sounds like you might need to install libtxc-dxtn0 for S3TC support.

See http://dri.freedesktop.org/wiki/S3TC/
[Mesa-dev] [Bug 88967] Black textures Radeon m HD 5650
https://bugs.freedesktop.org/show_bug.cgi?id=88967

--- Comment #10 from Grimdoll worm-...@yandex.ru ---
ERROR: cannot open `/usr/lib/libtxc_dxtn.so' (No such file or directory)
for all of those commands

grimdoll@grimdoll-Aspire-5740:~$ glxinfo | grep "OpenGL core profile version string"
OpenGL core profile version string: 3.3 (Core Profile) Mesa 10.5.0-devel (git-6fd4a61 2015-02-04 trusty-oibaf-ppa)
Re: [Mesa-dev] [PATCH 1/5] i965: Mark UB/B immediates as unreachable.
On Wed, Feb 4, 2015 at 12:22 PM, Kenneth Graunke kenn...@whitecape.org wrote:
> On Friday, January 30, 2015 03:54:28 PM Matt Turner wrote:
>> ---
>>  src/mesa/drivers/dri/i965/brw_shader.cpp | 5 +----
>>  1 file changed, 1 insertion(+), 4 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp b/src/mesa/drivers/dri/i965/brw_shader.cpp
>> index 678390e..c393bfc 100644
>> --- a/src/mesa/drivers/dri/i965/brw_shader.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
>> @@ -602,11 +602,8 @@ brw_saturate_immediate(enum brw_reg_type type, struct brw_reg *reg)
>>        sat_imm.f = CLAMP(imm.f, 0.0f, 1.0f);
>>        break;
>>     case BRW_REGISTER_TYPE_UB:
>> -      sat_imm.ud = CLAMP(imm.ud, 0, UCHAR_MAX);
>> -      break;
>>     case BRW_REGISTER_TYPE_B:
>> -      sat_imm.d = CLAMP(imm.d, CHAR_MIN, CHAR_MAX);
>> -      break;
>> +      unreachable("no UB/B immediates");
>>     case BRW_REGISTER_TYPE_V:
>>     case BRW_REGISTER_TYPE_UV:
>>     case BRW_REGISTER_TYPE_VF:
>
> Please justify this change in your commit message - it's not immediately
> obvious. Does the GPU not allow saturate on B/UB values?

I already pushed it yesterday, but the reason is that byte immediates
*don't exist* :)
[Mesa-dev] [Bug 88967] Black textures Radeon m HD 5650
https://bugs.freedesktop.org/show_bug.cgi?id=88967

--- Comment #8 from Grimdoll worm-...@yandex.ru ---
To be honest, I do not know. I tried so many guides to install xorg
drivers and proprietary drivers. Maybe it was installed during the
installation of Linux Mint.
[Mesa-dev] [PATCH 3/7] glsl: Add initial functions to implement an on-disk cache
From: Kristian Høgsberg k...@bitplanet.net

This code provides for an on-disk cache of objects. Objects are stored
and retrieved (in ~/.cache/mesa) via names that are arbitrary 20-byte
sequences, (intended to be SHA-1 hashes of something identifying for
the content).

The cache is limited to a maximum number of entries (1024 in this
patch), and uses random replacement. These attributes are managed via
an index file that is stored in the cache directory and mmapped. This
file is indexed by the low-order bytes of the cached object's names
and each entry stores the complete name. So a quick comparison of the
index entry verifies whether the cache has an item, or whether an
existing item should be replaced.

Note: Some FIXME comments are still present in this commit. These will
be addressed in subsequent commits, (and before any of this code gets
any active use).
---
 src/glsl/Makefile.am      |   4 +
 src/glsl/Makefile.sources |   3 +
 src/glsl/cache.c          | 230 ++
 3 files changed, 237 insertions(+)
 create mode 100644 src/glsl/cache.c

diff --git a/src/glsl/Makefile.am b/src/glsl/Makefile.am
index 01123bc..604af51 100644
--- a/src/glsl/Makefile.am
+++ b/src/glsl/Makefile.am
@@ -137,6 +137,10 @@ libglsl_la_SOURCES = \
 	$(LIBGLSL_FILES)				\
 	$(NIR_FILES)
 
+if ENABLE_SHADER_CACHE
+libglsl_la_SOURCES += $(LIBGLSL_SHADER_CACHE_FILES)
+endif
+
 glsl_compiler_SOURCES = \
 	$(top_srcdir)/src/mesa/main/imports.c \
 	$(top_srcdir)/src/mesa/program/prog_hash_table.c \
diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources
index 8375f6e..c5b742c 100644
--- a/src/glsl/Makefile.sources
+++ b/src/glsl/Makefile.sources
@@ -179,6 +179,9 @@ LIBGLSL_FILES = \
 	$(GLSL_SRCDIR)/s_expression.cpp \
 	$(GLSL_SRCDIR)/s_expression.h
 
+LIBGLSL_SHADER_CACHE_FILES = \
+	$(GLSL_SRCDIR)/cache.c
+
 # glsl_compiler
 
 GLSL_COMPILER_CXX_FILES = \
diff --git a/src/glsl/cache.c b/src/glsl/cache.c
new file mode 100644
index 000..fd087db
--- /dev/null
+++ b/src/glsl/cache.c
@@ -0,0 +1,230 @@
+/*
+ * Copyright © 2014 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include <string.h>
+#include <stdlib.h>
+#include <stdio.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <sys/mman.h>
+#include <unistd.h>
+#include <fcntl.h>
+#include <pwd.h>
+
+#include "util/mesa-sha1.h"
+
+#include "cache.h"
+
+#define INDEX_SIZE 1024
+struct program_cache {
+   unsigned char *index;
+   char path[256];
+};
+
+struct program_cache *
+cache_create(void)
+{
+   struct program_cache *cache;
+   char index_file[256], buffer[512];
+   struct stat sb;
+   size_t size;
+   int fd;
+   struct passwd pwd, *result;
+
+   getpwuid_r(getuid(), &pwd, buffer, sizeof buffer, &result);
+   if (result == NULL)
+      return NULL;
+   snprintf(index_file, sizeof index_file,
+            "%s/.cache/mesa/index", pwd.pw_dir);
+
+   fd = open(index_file, O_RDWR | O_CREAT | O_CLOEXEC, 0644);
+   if (fd == -1) {
+      /* FIXME: Check for ENOENT and mkdir on demand */
+      return NULL;
+   }
+
+   if (fstat(fd, &sb) == -1) {
+      close(fd);
+      return NULL;
+   }
+
+   size = INDEX_SIZE * CACHE_KEY_SIZE;
+   if (sb.st_size == 0) {
+      if (ftruncate(fd, size) == -1) {
+         close(fd);
+         return NULL;
+      }
+      fsync(fd);
+   }
+
+   cache = (struct program_cache *) malloc(sizeof *cache);
+   if (cache == NULL) {
+      close(fd);
+      return NULL;
+   }
+
+   snprintf(cache->path, sizeof cache->path,
+            "%s/.cache/mesa", pwd.pw_dir);
+
+   /* FIXME: We map this shared, which is a start, but we need to think about
+    * how to make it multi-process safe. */
+   cache->index = (unsigned char *)
+      mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
+   close(fd);
+   if (cache->index == MAP_FAILED) {
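The index scheme described in the commit message above (slot chosen from the low-order bytes of the key, full key stored in the slot) can be sketched as follows. This is only an illustration: a plain static array stands in for the mmapped index file, the helper names are hypothetical, and treating the first two key bytes as the "low-order" bytes is an assumption, not something the patch text confirms.

```c
#include <stdint.h>
#include <string.h>

#define INDEX_SIZE 1024
#define CACHE_KEY_SIZE 20

/* Stand-in for the mmapped index file: INDEX_SIZE slots of one key each. */
static uint8_t cache_index[INDEX_SIZE * CACHE_KEY_SIZE];

/* Hypothetical helper: locate the slot for a key. Assumes the first two
 * bytes of the key are the ones used to pick the slot. */
static uint8_t *
index_entry(const uint8_t *key)
{
   unsigned slot = (key[0] | (key[1] << 8)) % INDEX_SIZE;

   return &cache_index[slot * CACHE_KEY_SIZE];
}

/* The key is present only if the slot holds exactly this key. Note that
 * an all-zero key would match an empty slot in this simplified version. */
static int
index_has(const uint8_t *key)
{
   return memcmp(index_entry(key), key, CACHE_KEY_SIZE) == 0;
}

/* Claim the slot for this key, evicting whatever key was there before. */
static void
index_mark(const uint8_t *key)
{
   memcpy(index_entry(key), key, CACHE_KEY_SIZE);
}
```

Since well-distributed hashes land in effectively random slots, two keys that collide simply evict each other, which is how a fixed-size table like this ends up behaving as random replacement.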
[Mesa-dev] [PATCH 1/7] glsl: Add cache.h, defining an API for a persistent cache of objects
This API forms the base infrastructure for the future shader cache.

At this point, the cache is simply a persistent, on-disk store of
objects stored and retrieved by 20-byte keys.
---
 src/glsl/cache.h | 121 +++
 1 file changed, 121 insertions(+)
 create mode 100644 src/glsl/cache.h

diff --git a/src/glsl/cache.h b/src/glsl/cache.h
new file mode 100644
index 000..5e9b3a8
--- /dev/null
+++ b/src/glsl/cache.h
@@ -0,0 +1,121 @@
+/*
+ * Copyright © 2014 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#pragma once
+#ifndef CACHE_H
+#define CACHE_H
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <stdint.h>
+
+/* These functions implement a persistent, (on disk), cache of objects.
+ *
+ * Objects are stored and retrieved from the cache using keys which are each a
+ * sequence of 20 arbitrary bytes. This 20-byte size is chosen to allow for a
+ * SHA-1 signature to be used as the key, (but nothing in cache.c actually
+ * computes or relies on the keys being SHA-1). See mesa-sha1.h and
+ * _mesa_sha1_compute for assistance in computing SHA-1 signatures.
+ */
+
+/* Size of cache keys in bytes. */
+#define CACHE_KEY_SIZE 20
+
+typedef uint8_t cache_key[CACHE_KEY_SIZE];
+
+/**
+ * Create a new cache object.
+ *
+ * This function creates the handle necessary for all subsequent cache_*
+ * functions.
+ */
+struct program_cache *
+cache_create(void);
+
+/**
+ * Store an item in the cache under the name \key.
+ *
+ * The item can be retrieved later with cache_get(), (unless the item has
+ * been evicted).
+ *
+ * Any call to cache_put() may cause an existing, random, item to be evicted
+ * from the cache.
+ */
+void
+cache_put(struct program_cache *cache, cache_key key,
+          const void *data, size_t size);
+
+/**
+ * Mark the cache name \key as used in the cache, (without storing any
+ * associated data).
+ *
+ * A call to cache_mark() is conceptually the same as a call to cache_put()
+ * but without any associated data. Following such a call, cache_get()
+ * cannot be usefully used, (since there is no data to return), but
+ * cache_has() can be used to check whether key has been marked.
+ *
+ * Any call to cache_mark() may cause an existing, random, item to be evicted
+ * from the cache.
+ */
+void
+cache_mark(struct program_cache *cache, cache_key key);
+
+/**
+ * Return an item previously stored in the cache with the name \key.
+ *
+ * The item must have been previously stored with a call to cache_put().
+ *
+ * If \size is non-NULL, then, on successful return, it will be set to the
+ * size of the object.
+ *
+ * \return A pointer to the stored object, (or NULL if the object is not
+ * found, or if any error occurs such as memory allocation failure or a
+ * filesystem error). The returned data is malloc'ed so the caller should call
+ * free() when done with it.
+ */
+uint8_t *
+cache_get(struct program_cache *cache, cache_key key, size_t *size);
+
+/**
+ * A lightweight test whether the given key is currently in the cache.
+ *
+ * This test is lightweight in that it makes no syscalls and will not hit the
+ * disk. It is implemented via a small array of \key signatures.
+ *
+ * Return value: True if the item is in the cache (at this instant), false
+ * otherwise.
+ *
+ * Note: After cache_has(), a subsequent call to cache_get()
+ * might fail, (if another user caused the item to be evicted in the
+ * meantime).
+ */
+int
+cache_has(struct program_cache *cache, cache_key key);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* CACHE_H */
--
2.1.4
[Mesa-dev] Low-level infrastructure for the shader cache
Hi folks,

This series adds a layer of code to store a cache of objects on
disk. Thanks to Kristian Høgsberg for the initial proof-of-concept
implementation here.

I've taken his original code and added my own cleanups and
documentation to the cache API. I've also fixed up a couple of the
items he had left as FIXMEs.

In this series, my cleaned-up API arrives first, so you won't see the
changes I've made there. But, for the implementation fixes, I've left
his code as originally written and have separate commits for my
fixes. I think this is the history we want for sake of review and
maintainability. The code compiles at all points, (and is never used),
so it's not like there are any regressions left lingering due to the
unsquashed history.

Like I said, none of the code here is actually used yet, (that will be
a future series of patches, where we'll finally have an actual shader
cache working). But I think the cache API and implementation can be
productively reviewed now.

There is one FIXME comment in the implementation here that will still
need to be addressed. I'm soliciting some thoughts from others on how
to do it. The issue is around the mmapped index file used for
random-cache replacement, and how to ensure things stay consistent
with multiple processes.

Kristian's comments in this area are as follows.

First, when mapping the index file:

   /* FIXME: We map this shared, which is a start, but we need to think about
    * how to make it multi-process safe. */
   cache->index = (unsigned char *)
      mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

Then, when reading an entry from the cache file, (and potentially
unlinking the old file being replaced):

   /* FIXME: We'll need an fsync here and think about races... maybe even need
    * an flock to avoid leaking files. Or maybe fsync, then read back and
    * verify the entry is still ours, delete it if somebody else overwrote
    * it. */

So please follow up in replies if you have good ideas on how to best
address these comments.

Other than that, I think the code in this series is sound. Please do
let me know what you see that I've missed.

-Carl
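As one possible starting point for the discussion above, here is a rough sketch of the "flock, fsync, then read back and verify" idea from the second FIXME. It is not part of the series: index_write_locked is a hypothetical helper, and it goes through pwrite()/pread() on the index file descriptor rather than through the mmapped region, purely to keep the example self-contained.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/file.h>
#include <unistd.h>

#define CACHE_KEY_SIZE 20

/* Hypothetical helper: write `key` into index slot `slot` of the open
 * index file `fd` while holding an exclusive advisory lock.
 * Returns 0 if the entry is on disk and still ours, -1 otherwise. */
static int
index_write_locked(int fd, unsigned slot, const unsigned char *key)
{
   off_t offset = (off_t) slot * CACHE_KEY_SIZE;
   unsigned char check[CACHE_KEY_SIZE];
   int ret = -1;

   /* Blocks until no other cooperating process holds the lock. */
   if (flock(fd, LOCK_EX) == -1)
      return -1;

   if (pwrite(fd, key, CACHE_KEY_SIZE, offset) != CACHE_KEY_SIZE)
      goto unlock;
   if (fsync(fd) == -1)
      goto unlock;

   /* Read the slot back and verify that it still holds our key. */
   if (pread(fd, check, CACHE_KEY_SIZE, offset) != CACHE_KEY_SIZE)
      goto unlock;
   if (memcmp(check, key, CACHE_KEY_SIZE) == 0)
      ret = 0;

unlock:
   flock(fd, LOCK_UN);
   return ret;
}
```

flock() is advisory, so this only helps if every process touching the index takes the same lock, including around any unlink of the file being replaced. Whether that cost is acceptable on the lookup path is exactly the open question.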
[Mesa-dev] [PATCH 5/7] glsl: Make cache directory if it does not already exist
With this patch, there are now three different options for the shader cache directory, (considered in order until the first variable is set): $MESA_GLSL_CACHE_DIR $XDG_CACHE_HOME/mesa user-home-directory/.cache/mesa Also with this patch, once the desired path is determined, the directory is created if it does not exist, (but this code will not create an arbitrary number of parent directories if they don't exist). --- src/glsl/cache.c | 170 +++ 1 file changed, 147 insertions(+), 23 deletions(-) diff --git a/src/glsl/cache.c b/src/glsl/cache.c index da71868..79e8afb 100644 --- a/src/glsl/cache.c +++ b/src/glsl/cache.c @@ -30,76 +30,200 @@ #include unistd.h #include fcntl.h #include pwd.h +#include errno.h #include util/mesa-sha1.h +#include util/ralloc.h +#include main/errors.h #include cache.h #define INDEX_SIZE 1024 struct program_cache { unsigned char *index; - char path[256]; + char *path; }; +/* Create a directory named 'path' if it does not already exist. + * + * Returns: 0 if path already exists as a directory or if created. + * -1 in all other cases. + */ +static int +mkdir_if_needed(char *path) +{ + struct stat sb; + + /* If the path exists already, then our work is done if it'sa directory, +* but it's an error if it is not. +*/ + if (stat(path, sb) == 0) { + if (S_ISDIR(sb.st_mode)) { + return 0; + } else { + _mesa_warning(NULL, + Cannot use %s for shader cache (not a directory) + ---disabling.\n, path); + return -1; + } + } + + if (mkdir(path, 0755) == 0) + return 0; + + _mesa_warning(NULL, + Failed to create %s for shader cache (%s)---disabling.\n, + path, strerror(errno)); + + return -1; +} + +/* Concatenate an existing path and a new name to form a new path. If the new + * path does not exist as a directory, create it then return the resulting + * name of the new path (ralloc'ed off of 'ctx'). 
+ * + * Returns NULL on any error, such as: + * + * path does not exist or is not a directory + * path/name exists but is not a directory + * path/name cannot be created as a directory + */ +static char * +concatenate_and_mkdir(void *ctx, char *path, char *name) +{ + char *new_path; + struct stat sb; + + if (stat(path, sb) != 0 || ! S_ISDIR(sb.st_mode)) + return NULL; + + new_path = ralloc_asprintf(ctx, %s/%s, path, name); + + if (mkdir_if_needed(new_path) == 0) + return new_path; + else + return NULL; +} + struct program_cache * cache_create(void) { - struct program_cache *cache; - char index_file[256], buffer[512]; + void *ctx = ralloc_context(NULL); + struct program_cache *cache = NULL; + char *path, *index_file; struct stat sb; size_t size; - int fd; - struct passwd pwd, *result; + int fd = -1; /* At user request, disable shader cache entirely. */ if (getenv(MESA_GLSL_CACHE_DISABLE)) - return NULL; + goto done; + + /* Determine path for cache based on the first defined name as follows: +* +* $MESA_GLSL_CACHE_DIR +* $XDG_CACHE_HOME/mesa +* pwd.pw_dir/.cache/mesa +*/ + path = getenv(MESA_GLSL_CACHE_DIR); + if (path mkdir_if_needed(path) == -1) { + goto done; + } - getpwuid_r(getuid(), pwd, buffer, sizeof buffer, result); - if (result == NULL) - return NULL; - snprintf(index_file, sizeof index_file, -%s/.cache/mesa/index, pwd.pw_dir); + if (path == NULL) { + char *xdg_cache_home = getenv(XDG_CACHE_HOME); + + if (xdg_cache_home) { + if (mkdir_if_needed(xdg_cache_home) == -1) +goto done; + + path = concatenate_and_mkdir(ctx, xdg_cache_home, mesa); + if (path == NULL) +goto done; + } + } + + if (path == NULL) { + char *buf; + size_t buf_size; + struct passwd pwd, *result; + + buf_size = sysconf(_SC_GETPW_R_SIZE_MAX); + if (buf_size == -1) + buf_size = 512; + + /* Loop until buf_size is large enough to query the directory */ + while (1) { + buf = ralloc_size(ctx, buf_size); + + getpwuid_r(getuid(), pwd, buf, buf_size, result); + if (result) +break; + + if (errno == 
ERANGE) { +ralloc_free(buf); +buf = NULL; +buf_size *= 2; + } else { +goto done; + } + } + + path = concatenate_and_mkdir(ctx, pwd.pw_dir, .cache); + if (path == NULL) + goto done; + + path = concatenate_and_mkdir(ctx, path, mesa); + if (path == NULL) + goto done; + } + + index_file = ralloc_asprintf(ctx, %s/%s, path, index); fd = open(index_file, O_RDWR | O_CREAT | O_CLOEXEC, 0644); if (fd == -1) { - /* FIXME: Check for ENOENT and mkdir on demand */ - return NULL; +
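The lookup order from the commit message above can be sketched in isolation. This is a minimal illustration, not the patch's code: the names demo_mkdir_if_needed and demo_choose_cache_dir are hypothetical, the three candidate values are passed in as arguments (a caller would presumably supply getenv() results and the passwd home directory), and a fixed output buffer is used only to keep the sketch short.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: create 'path' as a directory unless it already is
 * one. Returns 0 on success, -1 otherwise (for example when 'path' exists
 * but is a regular file, or when a parent directory is missing). */
static int
demo_mkdir_if_needed(const char *path)
{
   struct stat sb;

   if (stat(path, &sb) == 0)
      return S_ISDIR(sb.st_mode) ? 0 : -1;

   return mkdir(path, 0755) == 0 ? 0 : -1;
}

/* Hypothetical helper: pick the cache directory from the three candidates
 * in priority order. Any argument may be NULL, meaning "not set". Writes
 * the result to 'out' and returns 0, or returns -1 if no candidate is set
 * or 'out' is too small to hold the path. */
static int
demo_choose_cache_dir(const char *override_dir, const char *xdg_cache_home,
                      const char *home_dir, char *out, size_t out_size)
{
   int n;

   if (override_dir)
      n = snprintf(out, out_size, "%s", override_dir);
   else if (xdg_cache_home)
      n = snprintf(out, out_size, "%s/mesa", xdg_cache_home);
   else if (home_dir)
      n = snprintf(out, out_size, "%s/.cache/mesa", home_dir);
   else
      return -1;

   return (n < 0 || (size_t) n >= out_size) ? -1 : 0;
}
```

Like the helper in the patch, demo_mkdir_if_needed creates only the last path component, so a caller would have to create $HOME/.cache before $HOME/.cache/mesa.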
[Mesa-dev] [PATCH 7/7] glsl/cache: Write newly cached files atomically via rename()
Instead of writing directly to the desired filename, with this patch we
instead first write to filename.tmp and then use rename() to atomically
rename from filename.tmp to filename.

This ensures that any process that opens filename for reading will never
see any partially written file.
---
 src/glsl/cache.c | 41 +++++++++++++++++++++++++----------------
 1 file changed, 25 insertions(+), 16 deletions(-)

diff --git a/src/glsl/cache.c b/src/glsl/cache.c
index 0bbf659..e0cdc1a 100644
--- a/src/glsl/cache.c
+++ b/src/glsl/cache.c
@@ -323,9 +323,9 @@ cache_put(struct program_cache *cache,
    uint32_t *s = (uint32_t *) key;
    int i = *s & (INDEX_SIZE - 1);
    unsigned char *entry;
-   int fd, ret;
+   int fd = -1, ret;
    size_t len;
-   char *filename;
+   char *filename = NULL, *filename_tmp = NULL;
    const char *p = (const char *) data;
 
    /* FIXME: We'll need an fsync here and think about races... maybe even need
@@ -336,40 +336,49 @@ cache_put(struct program_cache *cache,
    entry = &cache->index[i * CACHE_KEY_SIZE];
    filename = get_cache_file(cache, entry);
    if (filename == NULL)
-      return;
+      goto done;
 
    unlink(filename);
    ralloc_free(filename);
+   filename = NULL;
 
    memcpy(entry, key, CACHE_KEY_SIZE);
 
    if (data == NULL)
-      return;
-
-   /* FIXME: We should write the file to a name like sha1-foo, close it and
-    * then rename(2) it to sha1 to make sure some other mesa process doesn't
-    * open it and gets a partial result. Racing with another mesa writing the
-    * same file is ok, since they'll both write the same contents, and whoever
-    * finishes first will move the complete file in place. */
+      goto done;
 
    filename = get_cache_file(cache, key);
    if (filename == NULL)
-      return;
+      goto done;
 
-   fd = open(filename, O_WRONLY | O_CLOEXEC | O_CREAT, 0644);
-   ralloc_free(filename);
+   filename_tmp = ralloc_asprintf(cache, "%s.tmp", filename);
+   if (filename_tmp == NULL)
+      goto done;
+
+   fd = open(filename_tmp, O_WRONLY | O_CLOEXEC | O_CREAT, 0644);
    if (fd == -1)
-      return;
+      goto done;
 
    for (len = 0; len < size; len += ret) {
       ret = write(fd, p + len, size - len);
       if (ret == -1) {
-         unlink(filename);
-         break;
+         unlink(filename_tmp);
+         goto done;
       }
    }
 
    close(fd);
+   fd = -1;
+
+   rename(filename_tmp, filename);
+
+ done:
+   if (filename_tmp)
+      ralloc_free(filename_tmp);
+   if (filename)
+      ralloc_free(filename);
+   if (fd != -1)
+      close(fd);
 }
 
 void
-- 
2.1.4
[Mesa-dev] [PATCH 6/7] glsl/cache.c: Dynamically allocate filenames internal to cache
The user can put the cache directory anywhere, so it's not safe to use fixed-size arrays to store filenames. Instead, allocate the cache pointer itself as a ralloc context and use that to dynamically allocate all filenames. While making this change, simplify the error handling in cache_get with a new goto FAIL block so the cleanup code exists in a single place, rather than being spread throughout the function over and over. --- src/glsl/cache.c | 87 +++- 1 file changed, 54 insertions(+), 33 deletions(-) diff --git a/src/glsl/cache.c b/src/glsl/cache.c index 79e8afb..0bbf659 100644 --- a/src/glsl/cache.c +++ b/src/glsl/cache.c @@ -197,14 +197,14 @@ cache_create(void) fsync(fd); } - cache = (struct program_cache *) malloc(sizeof *cache); + cache = ralloc(NULL, struct program_cache); if (cache == NULL) { goto done; } cache-path = strdup(path); if (cache-path == NULL) { - free (cache); + ralloc_free (cache); cache = NULL; goto done; } @@ -214,7 +214,7 @@ cache_create(void) cache-index = (unsigned char *) mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0); if (cache-index == MAP_FAILED) { - free(cache); + ralloc_free(cache); cache = NULL; goto done; } @@ -227,16 +227,18 @@ cache_create(void) return cache; } +/* Return a filename within the cache's directory corresponding to 'key'. The + * returned filename is ralloced with 'cache' as the parent context. + * + * Returns NULL if out of memory. 
+ */ static char * -get_cache_file(struct program_cache *cache, - char *buffer, size_t size, cache_key key) +get_cache_file(struct program_cache *cache, cache_key key) { char buf[41]; - snprintf(buffer, size, %s/%s, -cache-path, _mesa_sha1_format(buf, key)); - - return buffer; + return ralloc_asprintf(cache, %s/%s, cache-path, + _mesa_sha1_format(buf, key)); } int @@ -261,59 +263,69 @@ cache_has(struct program_cache *cache, cache_key key) uint8_t * cache_get(struct program_cache *cache, cache_key key, size_t *size) { - int fd, ret, len; + int fd = -1, ret, len; struct stat sb; - char filename[256], *data; + char *filename = NULL; + uint8_t *data = NULL; if (size) *size = 0; if (!cache_has(cache, key)) - return NULL; + goto fail; - get_cache_file(cache, filename, sizeof filename, key); + filename = get_cache_file(cache, key); + if (filename == NULL) + goto fail; fd = open(filename, O_RDONLY | O_CLOEXEC); if (fd == -1) - return NULL; + goto fail; - if (fstat(fd, sb) == -1) { - close(fd); - return NULL; - } + if (fstat(fd, sb) == -1) + goto fail; - data = (char *) malloc(sb.st_size); - if (data == NULL) { - close(fd); - return NULL; - } + data = malloc(sb.st_size); + if (data == NULL) + goto fail; for (len = 0; len sb.st_size; len += ret) { ret = read(fd, data + len, sb.st_size - len); - if (ret == -1) { - free(data); - close(fd); - return NULL; - } + if (ret == -1) + goto fail; } + ralloc_free(filename); close(fd); if (size) *size = sb.st_size; - return (void *) data; + return data; + + fail: + if (data) + free(data); + if (filename) + ralloc_free(filename); + if (fd != -1) + close(fd); + + return NULL; } void -cache_put(struct program_cache *cache, cache_key key, const void *data, size_t size) +cache_put(struct program_cache *cache, + cache_key key, + const void *data, + size_t size) { uint32_t *s = (uint32_t *) key; int i = *s (INDEX_SIZE - 1); unsigned char *entry; int fd, ret; size_t len; - char filename[256]; + char *filename; const char *p = (const char *) 
data; /* FIXME: We'll need an fsync here and think about races... maybe even need @@ -322,8 +334,13 @@ cache_put(struct program_cache *cache, cache_key key, const void *data, size_t s * it. */ entry = cache-index[i * CACHE_KEY_SIZE]; - get_cache_file(cache, filename, sizeof filename, entry); + filename = get_cache_file(cache, entry); + if (filename == NULL) + return; + unlink(filename); + ralloc_free(filename); + memcpy(entry, key, CACHE_KEY_SIZE); if (data == NULL) @@ -335,8 +352,12 @@ cache_put(struct program_cache *cache, cache_key key, const void *data, size_t s * same file is ok, since they'll both write the same contents, and whoever * finishes first will move the complete file in place. */ - get_cache_file(cache, filename, sizeof filename, key); + filename = get_cache_file(cache, key); + if (filename == NULL) + return; + fd = open(filename, O_WRONLY | O_CLOEXEC | O_CREAT, 0644); + ralloc_free(filename); if (fd == -1) return;
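The point about fixed-size arrays can be illustrated without ralloc. The sketch below is an assumption-laden stand-in: demo_join_path is a hypothetical name, and it uses a two-pass snprintf() with malloc() to size the buffer exactly, which is one common way to get the effect that ralloc_asprintf provides in the patch.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: return a newly allocated "dir/name" string of
 * exactly the needed length, or NULL on failure. The caller owns the
 * result and must free() it. */
static char *
demo_join_path(const char *dir, const char *name)
{
   /* First pass: with a NULL buffer and size 0, snprintf only reports how
    * many characters the formatted string needs (excluding the NUL). */
   int n = snprintf(NULL, 0, "%s/%s", dir, name);
   char *path;

   if (n < 0)
      return NULL;

   path = malloc((size_t) n + 1);
   if (path == NULL)
      return NULL;

   /* Second pass: format into the exactly sized buffer. */
   snprintf(path, (size_t) n + 1, "%s/%s", dir, name);
   return path;
}
```

Unlike a char[256] buffer, this cannot silently truncate a long cache directory; the cost is that every caller has to free the result, which is why the patch gathers the cleanup into a single fail block.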
[Mesa-dev] [PATCH 2/7] glsl: Add stubs for the case of --disable-shader-cache
If Mesa is being compiled with no shader cache, (whether due to explicit
user request or due to a missing library dependency), then we want to
incur no cost on the implementation.

To achieve this with as little fuss as possible, (that is, without
sprinkling #ifdef throughout every call into cache functions), we
implement inlined stubs to make all of the called functions do nothing.

For these stubs, the configure script is updated to provide a new
ENABLE_SHADER_CACHE preprocessor definition that can be queried at compile
time.
---
 configure.ac              |  3 +++
 src/glsl/Makefile.sources |  1 +
 src/glsl/cache.h          | 39 +++++++++++++++++++++++++++++++++++++++
 3 files changed, 43 insertions(+)

diff --git a/configure.ac b/configure.ac
index a4c5c74..c081b7b 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1062,6 +1062,9 @@ if test x$with_sha1 = x; then
     fi
 fi
 AM_CONDITIONAL([ENABLE_SHADER_CACHE], [test x$enable_shader_cache = xyes])
+if test x$enable_shader_cache = xyes; then
+   AC_DEFINE([ENABLE_SHADER_CACHE], [1], [Enable shader cache])
+fi
 
 # Check for libdrm
 PKG_CHECK_MODULES([LIBDRM], [libdrm >= $LIBDRM_REQUIRED],
diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources
index 6237627..8375f6e 100644
--- a/src/glsl/Makefile.sources
+++ b/src/glsl/Makefile.sources
@@ -67,6 +67,7 @@ LIBGLSL_FILES = \
 	$(GLSL_SRCDIR)/ast_type.cpp \
 	$(GLSL_SRCDIR)/blob.c \
 	$(GLSL_SRCDIR)/blob.h \
+	$(GLSL_SRCDIR)/cache.h \
 	$(GLSL_SRCDIR)/builtin_functions.cpp \
 	$(GLSL_SRCDIR)/builtin_type_macros.h \
 	$(GLSL_SRCDIR)/builtin_types.cpp \
diff --git a/src/glsl/cache.h b/src/glsl/cache.h
index 5e9b3a8..51be663 100644
--- a/src/glsl/cache.h
+++ b/src/glsl/cache.h
@@ -45,6 +45,10 @@ extern "C" {
 
 typedef uint8_t cache_key[CACHE_KEY_SIZE];
 
+/* Provide inlined stub functions if the shader cache is disabled. */
+
+#ifdef ENABLE_SHADER_CACHE
+
 /**
  * Create a new cache object.
  *
@@ -114,6 +118,41 @@ cache_get(struct program_cache *cache, cache_key key, size_t *size);
 int
 cache_has(struct program_cache *cache, cache_key key);
 
+#else
+
+static inline struct program_cache *
+cache_create(void)
+{
+   return NULL;
+}
+
+static inline void
+cache_put(struct program_cache *cache, cache_key key,
+          const void *data, size_t size)
+{
+   return;
+}
+
+static inline void
+cache_mark(struct program_cache *cache, cache_key key)
+{
+   return;
+}
+
+static inline uint8_t *
+cache_get(struct program_cache *cache, cache_key key, size_t *size)
+{
+   return NULL;
+}
+
+static inline int
+cache_has(struct program_cache *cache, cache_key key)
+{
+   return 0;
+}
+
+#endif /* ENABLE_SHADER_CACHE */
+
 #ifdef __cplusplus
 }
 #endif
-- 
2.1.4
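The stub approach in this patch is a general C header pattern, sketched below with made-up names. Everything prefixed demo_ and the DEMO_ENABLE_CACHE macro are hypothetical, chosen so the sketch does not pretend to be Mesa's cache.h; the idea is that callers compile unchanged whether or not the feature is built in.

```c
#include <stddef.h>
#include <stdint.h>

struct demo_cache;   /* opaque; only ever handled through pointers */

#ifdef DEMO_ENABLE_CACHE

/* Feature enabled: plain declarations, implemented in a separate .c file. */
struct demo_cache *demo_cache_create(void);
void demo_cache_put(struct demo_cache *cache, const uint8_t *key,
                    const void *data, size_t size);
int demo_cache_has(struct demo_cache *cache, const uint8_t *key);

#else

/* Feature disabled: inline stubs that do nothing and report "no cache",
 * so no caller needs its own #ifdef. */
static inline struct demo_cache *
demo_cache_create(void)
{
   return NULL;
}

static inline void
demo_cache_put(struct demo_cache *cache, const uint8_t *key,
               const void *data, size_t size)
{
   (void) cache; (void) key; (void) data; (void) size;
}

static inline int
demo_cache_has(struct demo_cache *cache, const uint8_t *key)
{
   (void) cache; (void) key;
   return 0;
}

#endif /* DEMO_ENABLE_CACHE */
```

Because the stubs are static inline and have no side effects, a compiler can usually remove the calls entirely, which matches the "no cost" goal stated in the commit message.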
[Mesa-dev] [PATCH 4/7] glsl: Add environment variable to disable shader cache
This patch adds support for a new variable, MESA_GLSL_CACHE_DISABLE. If
this variable is set, then all use of the shader cache will be disabled
at run time.
---
 docs/envvars.html | 1 +
 src/glsl/cache.c  | 4 ++++
 2 files changed, 5 insertions(+)

diff --git a/docs/envvars.html b/docs/envvars.html
index 31d14a4..65db3a9 100644
--- a/docs/envvars.html
+++ b/docs/envvars.html
@@ -94,6 +94,7 @@ This is only valid for versions &gt;= 3.0.
 glGetString(GL_SHADING_LANGUAGE_VERSION). Valid values are integers, such as
 130. Mesa will not really implement all the features of the given language version
 if it's higher than what's normally reported. (for developers only)
+<li>MESA_GLSL_CACHE_DISABLE - if set, disables the GLSL shader cache
 <li>MESA_GLSL - <a href="shading.html#envvars">shading language compiler options</a>
 </ul>
 
diff --git a/src/glsl/cache.c b/src/glsl/cache.c
index fd087db..da71868 100644
--- a/src/glsl/cache.c
+++ b/src/glsl/cache.c
@@ -51,6 +51,10 @@ cache_create(void)
    int fd;
    struct passwd pwd, *result;
 
+   /* At user request, disable shader cache entirely. */
+   if (getenv("MESA_GLSL_CACHE_DISABLE"))
+      return NULL;
+
    getpwuid_r(getuid(), &pwd, buffer, sizeof buffer, &result);
    if (result == NULL)
       return NULL;
-- 
2.1.4
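One consequence of testing only whether the variable is set is worth spelling out with a small sketch. The function name demo_cache_disabled is hypothetical; the check itself mirrors the getenv() test in the patch.

```c
#include <stdlib.h>

/* Hypothetical helper: returns 1 if the shader cache should be disabled.
 * getenv() returns NULL only when the variable is absent, so any value,
 * including "0" or the empty string, counts as "set". */
static int
demo_cache_disabled(void)
{
   return getenv("MESA_GLSL_CACHE_DISABLE") != NULL;
}
```

With this check, MESA_GLSL_CACHE_DISABLE=0 still disables the cache; re-enabling it requires unsetting the variable.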
[Mesa-dev] [Bug 88967] Black textures Radeon m HD 5650
https://bugs.freedesktop.org/show_bug.cgi?id=88967

--- Comment #12 from Grimdoll worm-...@yandex.ru ---
grimdoll@grimdoll-Aspire-5740:~ whereis libtxc_dxtn.so
libtxc_dxtn:

I don't understand what is wrong with it =)
[Mesa-dev] [Bug 88967] Black textures Radeon m HD 5650
https://bugs.freedesktop.org/show_bug.cgi?id=88967

--- Comment #5 from Grimdoll worm-...@yandex.ru ---
How do I check the version of libtxc-dxtn-s2tc0?

grimdoll@grimdoll-Aspire-5740:~ glxinfo |grep GL_EXT_texture_compression_s3tc
    GL_EXT_texture_compression_rgtc, GL_EXT_texture_compression_s3tc,
    GL_EXT_texture_compression_s3tc, GL_EXT_texture_cube_map,

I have no idea what this means, sorry, but thanks for the help.
Re: [Mesa-dev] Rename mesa/src/util (Was: gallium/util: add u_bit_scan64)
On 04/02/15 20:18, Kenneth Graunke wrote:
> On Wednesday, February 04, 2015 02:04:38 PM Jose Fonseca wrote:
>> This change broke MinGW/MSVC builds because ffsll is not available
>> there. There is a ffsll C fallback, but it's in
>> src/mesa/main/imports.[ch]. So rather than duplicating it in
>> src/gallium/auxiliary/util/u_math.h I'd prefer move it to src/util.
>>
>> And here lies the problem: what header name should be used for math
>> helpers? I think the filenames in src/util and the directory itself is
>> poorly named for something that is meant to be included by some many
>> other components:
>> - there is no unique prefix in most headers
>> - util/ clashes with src/gallium/auxiliary/util/
>>
>> Hence I'd like to propose to:
>> - rename src/util to something unique (e.g, cgrt, for Common Graphics
>>   RunTime
>>
>> And maybe:
>> - prefix all header/source files in there with a cgrt_* unique prefix too
>>
>> And maybe in the future
>> - use cgrt_* prefix for symbols too.
>>
>> Jose
>
> util is meant to be for shared utility across the entire code base -
> both Mesa and Gallium. It's been growing slowly as people move things
> there. It might make sense to move a lot of src/gallium/auxiliary/util
> there, in fact - there's always been a lot of duplication between Mesa
> and Gallium's utility code. But that's up to the Gallium developers.
>
> I think that util is precisely the right name. If a new contributor
> wants to find a hash table, or a set, or some macros...they're going to
> look for utility code. src/util is obviously named and easy to find.
>
> I think any acronym like cgrt is going to confuse people. src/cgrt
> sounds like some obscure part of the system I can ignore for now -
> easily overlooked, and what does the acronym mean anyway...
>
> We chose not to add the u_ prefix, partly for historical reasons (Mesa
> never used one), but also specifically to avoid clashing with
> src/gallium/auxiliary/util. Most people don't put src/util in their
> include path, and instead use #include util/ralloc.h - which already is
> a prefix of sorts. What additional value does u_ provide?

We had src/util in the include path. But now we don't, so yes, header
collision is less likely.

There's a problem with the symbols though -- gallium can be (and is)
embedded in other software systems -- it's not just used to make OpenGL
drivers. And if src/util is to be a dependency of gallium, it means it
ends up being statically linked against other stuff too.

And if everybody just uses the most obvious header names, and the most
obvious symbol names, it's just a matter of time until a collision
happens.

That said, it looks the symbols so far in u_mesa

> I think you should just invent a header name and put it there. math.h
> does sound fairly generic. If you're just reimplementing things like
> ffsll that are usually provided by your system, it might make sense to
> call it something like os_compat.h (along the lines of c99_compat.h).
>
> Or maybe Brian is right - we could just move Gallium's utility code to
> src/util and use it everywhere. It'd be nice to not have two sets.

To be clear: I'm all for moving as much code as possible from
src/gallium/auxiliary to src/util -- that's my objective here.

But I believe that not all code in src/gallium/auxiliary/util can be
moved into src/util, as some of it is gallium specific (depends on gallium
types, helpers, etc), so merely moving files won't work generally: the
gallium-specific stuff needs to stay behind, and therefore must co-exist
without colliding with the stuff that gets moved into src/util.

Even u_math.[ch] can't be trivially moved -- it depends on u_debug.[ch],
which has a bunch of gallium specific stuff. Moving all this in one go
will be tricky. Doing it piece by piece seems safer and more likely to
succeed.

I don't feel strongly about this. But it's a matter of practicality: I
can't afford to take a week off my main work to move the bulk of
src/gallium/auxiliary/util into src/util, but I can take a couple of hours
to move a sub-module, or a subset of it.

Maybe I can approach it from a different angle: if I get the things that
pretty much everything else in src/gallium/auxiliary/util depends on, like
p_config.h and u_debug.h, out of the way, then the rest will be easier to
migrate...

Jose
[Mesa-dev] [Bug 88967] Black textures Radeon m HD 5650
https://bugs.freedesktop.org/show_bug.cgi?id=88967

Benjamin Bellec b.bel...@gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |b.bel...@gmail.com

--- Comment #7 from Benjamin Bellec b.bel...@gmail.com ---
How did you install the s2tc lib?
[Mesa-dev] [Bug 88967] Black textures Radeon m HD 5650
https://bugs.freedesktop.org/show_bug.cgi?id=88967

--- Comment #9 from Benjamin Bellec b.bel...@gmail.com ---
OK. Could you tell us what these commands report:

$ file /usr/lib32/libtxc_dxtn.so
$ file /usr/lib/libtxc_dxtn.so
$ file /usr/local/lib32/libtxc_dxtn.so
$ file /usr/local/lib/libtxc_dxtn.so

And also:

$ glxinfo | grep "OpenGL core profile version string"
[Mesa-dev] [Bug 88967] Black textures Radeon m HD 5650
https://bugs.freedesktop.org/show_bug.cgi?id=88967

--- Comment #11 from Benjamin Bellec b.bel...@gmail.com ---
I should have asked you for this instead:

$ whereis libtxc_dxtn.so
[Mesa-dev] [PATCH] nir: Fix logic error.
---
I don't know if this is right, but what we had before was definitely
wrong. (And gcc warned about it!)

 src/glsl/nir/nir_lower_phis_to_scalar.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/glsl/nir/nir_lower_phis_to_scalar.c b/src/glsl/nir/nir_lower_phis_to_scalar.c
index 3bb5cc7..7c2f539 100644
--- a/src/glsl/nir/nir_lower_phis_to_scalar.c
+++ b/src/glsl/nir/nir_lower_phis_to_scalar.c
@@ -65,9 +65,9 @@ is_phi_src_scalarizable(nir_phi_src *src,
        * are ok too.
        */
       return nir_op_infos[src_alu->op].output_size == 0 ||
-             src_alu->op != nir_op_vec2 ||
-             src_alu->op != nir_op_vec3 ||
-             src_alu->op != nir_op_vec4;
+             (src_alu->op != nir_op_vec2 &&
+              src_alu->op != nir_op_vec3 &&
+              src_alu->op != nir_op_vec4);
    }
 
    case nir_instr_type_phi:
-- 
2.0.4
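Why the old expression was "definitely wrong" can be shown with a reduced example. The enum and function names below are hypothetical stand-ins for the NIR opcodes; only the boolean structure is taken from the patch.

```c
#include <stdbool.h>

enum demo_op { DEMO_OP_ADD, DEMO_OP_VEC2, DEMO_OP_VEC3, DEMO_OP_VEC4 };

/* Old form: a value cannot equal VEC2, VEC3 and VEC4 at the same time, so
 * at least one of the three inequalities always holds and the result is
 * true for every input. */
static bool
demo_not_vec_buggy(enum demo_op op)
{
   return op != DEMO_OP_VEC2 || op != DEMO_OP_VEC3 || op != DEMO_OP_VEC4;
}

/* Fixed form: true only when 'op' is none of the three vec opcodes. */
static bool
demo_not_vec_fixed(enum demo_op op)
{
   return op != DEMO_OP_VEC2 && op != DEMO_OP_VEC3 && op != DEMO_OP_VEC4;
}
```

By De Morgan's law the fixed form is the negation of "op is vec2, vec3 or vec4", which is presumably what the surrounding comment in the source intended.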
[Mesa-dev] [Bug 88967] Black textures Radeon m HD 5650
https://bugs.freedesktop.org/show_bug.cgi?id=88967

--- Comment #6 from Grimdoll worm-...@yandex.ru ---
I would like to show what's happening in the console while I'm running
Star Conflict, but this game runs through Steam and doesn't show any
useful info in the console, only output about Steam itself.
Re: [Mesa-dev] [PATCH 1/5] i965: Mark UB/B immediates as unreachable.
On Friday, January 30, 2015 03:54:28 PM Matt Turner wrote:
> ---
>  src/mesa/drivers/dri/i965/brw_shader.cpp | 5 +----
>  1 file changed, 1 insertion(+), 4 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp b/src/mesa/drivers/dri/i965/brw_shader.cpp
> index 678390e..c393bfc 100644
> --- a/src/mesa/drivers/dri/i965/brw_shader.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
> @@ -602,11 +602,8 @@ brw_saturate_immediate(enum brw_reg_type type, struct brw_reg *reg)
>        sat_imm.f = CLAMP(imm.f, 0.0f, 1.0f);
>        break;
>     case BRW_REGISTER_TYPE_UB:
> -      sat_imm.ud = CLAMP(imm.ud, 0, UCHAR_MAX);
> -      break;
>     case BRW_REGISTER_TYPE_B:
> -      sat_imm.d = CLAMP(imm.d, CHAR_MIN, CHAR_MAX);
> -      break;
> +      unreachable("no UB/B immediates");
>     case BRW_REGISTER_TYPE_V:
>     case BRW_REGISTER_TYPE_UV:
>     case BRW_REGISTER_TYPE_VF:

Please justify this change in your commit message - it's not immediately
obvious. Does the GPU not allow saturate on B/UB values?
[Mesa-dev] [PATCH 2/2] gallium/hud: also try R8_UNORM format for font texture
Convert the code to try formats from an array rather than a bunch of
if/else cases.
---
 src/gallium/auxiliary/hud/font.c | 24 +++++++++++++++---------
 1 file changed, 15 insertions(+), 9 deletions(-)

diff --git a/src/gallium/auxiliary/hud/font.c b/src/gallium/auxiliary/hud/font.c
index 03e35d9..60e8ae5 100644
--- a/src/gallium/auxiliary/hud/font.c
+++ b/src/gallium/auxiliary/hud/font.c
@@ -57,6 +57,7 @@
 #include "pipe/p_state.h"
 #include "pipe/p_context.h"
 #include "util/u_inlines.h"
+#include "util/u_memory.h"
 
 typedef unsigned char GLubyte;/* 1-byte unsigned */
 typedef struct tagSFG_Font SFG_Font;
@@ -373,24 +374,29 @@ static boolean
 util_font_create_fixed_8x13(struct pipe_context *pipe,
                             struct util_font *out_font)
 {
+   static const enum pipe_format formats[] = {
+      PIPE_FORMAT_I8_UNORM,
+      PIPE_FORMAT_L8_UNORM,
+      PIPE_FORMAT_R8_UNORM
+   };
    struct pipe_screen *screen = pipe->screen;
    struct pipe_resource tex_templ, *tex;
    struct pipe_transfer *transfer = NULL;
    char *map;
-   enum pipe_format tex_format;
+   enum pipe_format tex_format = PIPE_FORMAT_NONE;
    int i;
 
-   if (screen->is_format_supported(screen, PIPE_FORMAT_I8_UNORM,
+   for (i = 0; i < Elements(formats); i++) {
+      if (screen->is_format_supported(screen, formats[i],
                                    PIPE_TEXTURE_RECT, 0,
                                    PIPE_BIND_SAMPLER_VIEW)) {
-      tex_format = PIPE_FORMAT_I8_UNORM;
-   }
-   else if (screen->is_format_supported(screen, PIPE_FORMAT_L8_UNORM,
-                                        PIPE_TEXTURE_RECT, 0,
-                                        PIPE_BIND_SAMPLER_VIEW)) {
-      tex_format = PIPE_FORMAT_L8_UNORM;
+         tex_format = formats[i];
+         break;
+      }
    }
-   else {
+
+   if (tex_format == PIPE_FORMAT_NONE) {
+      debug_printf("Unable to find texture format for font.\n");
       return FALSE;
    }
 
-- 
1.9.1
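The "try candidates from an array" structure used in this patch can be reduced to a small standalone sketch. The demo_ names, the enum values and the callback type are hypothetical; in the patch the capability query is the screen's is_format_supported hook.

```c
#include <stdbool.h>
#include <stddef.h>

enum demo_format {
   DEMO_FORMAT_NONE,
   DEMO_FORMAT_I8,
   DEMO_FORMAT_L8,
   DEMO_FORMAT_R8
};

/* Hypothetical stand-in for a driver capability query. */
typedef bool (*demo_supported_fn)(enum demo_format format);

/* Return the first candidate the driver reports as supported, in array
 * order, or DEMO_FORMAT_NONE if none is supported. */
static enum demo_format
demo_pick_format(const enum demo_format *candidates, size_t count,
                 demo_supported_fn is_supported)
{
   size_t i;

   for (i = 0; i < count; i++) {
      if (is_supported(candidates[i]))
         return candidates[i];
   }

   return DEMO_FORMAT_NONE;
}

/* Two fake drivers for illustration. */
static bool demo_only_r8(enum demo_format f) { return f == DEMO_FORMAT_R8; }
static bool demo_nothing(enum demo_format f) { (void) f; return false; }
```

Adding another fallback then means appending one array element rather than another else-if branch, and the array order documents the preference order.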
Re: [Mesa-dev] [PATCH] [v2] i965: implement ARB_pipeline_statistics_query
On Mon, Feb 02, 2015 at 11:37:26PM -0500, Ilia Mirkin wrote:
> On Mon, Dec 8, 2014 at 9:50 PM, Ben Widawsky b...@bwidawsk.net wrote:
> > Thanks. All the requests look good, and I'll post it in v3.
>
> What happened to this patch? It was pretty close... should be easy to
> add gallium support for it too once it's in...

Hi. I just have a crapload of stuff to do which is more important. I'll
try to get the v4 out early next week.
Re: [Mesa-dev] [PATCH 3/3] tgsi: add support for flt64 constants
On Wed, Feb 4, 2015 at 8:08 PM, Dave Airlie airl...@gmail.com wrote:
> From: Dave Airlie airl...@redhat.com
>
> These act like flt32 except they take up two slots, and you can only add
> 2 x flt64 constants in one slot.
>
> The main reason they are different is we don't want to match half a
> flt64 constants against a flt32 constant in the matching code, we need
> to make sure we treat both parts of the flt64 as an single structure.
>
> Cleaned up printing/parsing by Ilia Mirkin imir...@alum.mit.edu
>
> Signed-off-by: Dave Airlie airl...@redhat.com

Reviewed-by: Ilia Mirkin imir...@alum.mit.edu

(for patch 2 and 3. had comments on 1)
Re: [Mesa-dev] [PATCH] st/mesa: Don't use PIPE_USAGE_STREAM for GL_PIXEL_UNPACK_BUFFER_ARB
On 05.02.2015 13:54, Ilia Mirkin wrote:
> On Wed, Feb 4, 2015 at 11:45 PM, Michel Dänzer mic...@daenzer.net wrote:
>> On 05.02.2015 12:48, Ilia Mirkin wrote:
>>> Is there a benchmark that demonstrates this? I'd like to test it out
>>> with nouveau.
>>
>> Mesa demos src/tests/streaming_rect on Kaveri (radeonsi):
>>
>> Unpatched: 42 frames in 1.023 seconds = 41.056 FPS
>> Patched: 615 frames in 1.000 seconds = 615.000 FPS
>
> Hm, 260fps for me on a GF108 (nvc1) with or without the patch (with
> vblank_mode=0 LIBGL_DRI3_DISABLE=1). I guess with nouveau you don't get
> that uncacheable nonsense?

I guess so, but note that the 'uncacheable nonsense' (i.e.
write-combining for CPU mappings of BOs in GART) accounts for about half
of the extra boost I'm getting with Fredrik's patches.

Anyway, if you look at the description of PIPE_USAGE_* in
src/gallium/docs/source/screen.rst, PIPE_USAGE_STAGING is the only one
which can be expected to be fast for CPU reads.

-- 
Earthling Michel Dänzer | http://www.amd.com
Libre software enthusiast | Mesa and X developer
Re: [Mesa-dev] [PATCH 3/7] glsl: Add initial functions to implement an on-disk cache
On Wed, Feb 4, 2015 at 1:52 PM, Carl Worth cwo...@cworth.org wrote: From: Kristian Høgsberg k...@bitplanet.net This code provides for an on-disk cache of objects. Objects are stored and retrieved (in ~/.cache/mesa) via names that are arbitrary 20-byte sequences, (intended to be SHA-1 hashes of something identifying for the content). The cache is limited to a maximum number of entries (1024 in this patch), and uses random replacement. These attributes are managed via an index file that is stored in the cache directory and mmapped. This file is indexed by the low-order bytes of the cached object's names and each entry stores the complete name. So a quick comparison of the index entry verifies whether the cache has an item, or whether an existing item should be replaced. Note: Some FIXME comments are still present in this commit. These will be addressed in subsequent commits, (and before any of this code gets any active use). --- src/glsl/Makefile.am | 4 + src/glsl/Makefile.sources | 3 + src/glsl/cache.c | 230 ++ 3 files changed, 237 insertions(+) create mode 100644 src/glsl/cache.c diff --git a/src/glsl/Makefile.am b/src/glsl/Makefile.am index 01123bc..604af51 100644 --- a/src/glsl/Makefile.am +++ b/src/glsl/Makefile.am @@ -137,6 +137,10 @@ libglsl_la_SOURCES = \ $(LIBGLSL_FILES)\ $(NIR_FILES) +if ENABLE_SHADER_CACHE +libglsl_la_SOURCES += $(LIBGLSL_SHADER_CACHE_FILES) +endif + glsl_compiler_SOURCES = \ $(top_srcdir)/src/mesa/main/imports.c \ $(top_srcdir)/src/mesa/program/prog_hash_table.c \ diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources index 8375f6e..c5b742c 100644 --- a/src/glsl/Makefile.sources +++ b/src/glsl/Makefile.sources @@ -179,6 +179,9 @@ LIBGLSL_FILES = \ $(GLSL_SRCDIR)/s_expression.cpp \ $(GLSL_SRCDIR)/s_expression.h +LIBGLSL_SHADER_CACHE_FILES = \ + $(GLSL_SRCDIR)/cache.c Rebase needed. I removed GLSL_SRCDIR a week and a half ago. 
I don't think it makes a difference to autotools, but I think I'd list cache.h here instead of in LIBGLSL_FILES. + # glsl_compiler GLSL_COMPILER_CXX_FILES = \ diff --git a/src/glsl/cache.c b/src/glsl/cache.c new file mode 100644 index 000..fd087db --- /dev/null +++ b/src/glsl/cache.c @@ -0,0 +1,230 @@ +/* + * Copyright © 2014 Intel Corporation + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the Software), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. 
+ */ + +#include string.h +#include stdlib.h +#include stdio.h +#include sys/types.h +#include sys/stat.h +#include sys/mman.h +#include unistd.h +#include fcntl.h +#include pwd.h + +#include util/mesa-sha1.h + +#include cache.h + +#define INDEX_SIZE 1024 +struct program_cache { + unsigned char *index; + char path[256]; +}; + +struct program_cache * +cache_create(void) +{ + struct program_cache *cache; + char index_file[256], buffer[512]; + struct stat sb; + size_t size; + int fd; + struct passwd pwd, *result; + + getpwuid_r(getuid(), pwd, buffer, sizeof buffer, result); + if (result == NULL) + return NULL; + snprintf(index_file, sizeof index_file, +%s/.cache/mesa/index, pwd.pw_dir); + + fd = open(index_file, O_RDWR | O_CREAT | O_CLOEXEC, 0644); + if (fd == -1) { + /* FIXME: Check for ENOENT and mkdir on demand */ + return NULL; + } + + if (fstat(fd, sb) == -1) { + close(fd); + return NULL; + } + + size = INDEX_SIZE * CACHE_KEY_SIZE; + if (sb.st_size == 0) { + if (ftruncate(fd, size) == -1) { + close(fd); + return NULL; + } + fsync(fd); + } + + cache = (struct program_cache *) malloc(sizeof *cache); Don't cast malloc. + if (cache == NULL) { +
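The index scheme described in the quoted commit message (a fixed number of slots, selected by the low-order bits of the key, each holding one complete key) can be sketched in memory. This is an assumption-based illustration: the demo_ names are hypothetical, the table is a plain array rather than an mmapped file, and the first four key bytes are read with memcpy instead of a pointer cast.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_KEY_SIZE   20     /* length of a SHA-1 digest in bytes */
#define DEMO_INDEX_SIZE 1024   /* number of slots; must be a power of two */

/* Map a key to its slot using the low-order bits of its first 32 bits. */
static size_t
demo_slot(const uint8_t key[DEMO_KEY_SIZE])
{
   uint32_t s;

   memcpy(&s, key, sizeof s);
   return s & (DEMO_INDEX_SIZE - 1);
}

/* The index has 'key' only if the slot stores exactly that key. */
static int
demo_index_has(const uint8_t *table, const uint8_t key[DEMO_KEY_SIZE])
{
   return memcmp(table + demo_slot(key) * DEMO_KEY_SIZE,
                 key, DEMO_KEY_SIZE) == 0;
}

/* Store 'key', replacing whatever key previously occupied the slot. */
static void
demo_index_put(uint8_t *table, const uint8_t key[DEMO_KEY_SIZE])
{
   memcpy(table + demo_slot(key) * DEMO_KEY_SIZE, key, DEMO_KEY_SIZE);
}
```

Two keys that share a slot evict each other, which is how the entry limit is enforced without any bookkeeping. In a zero-filled table an all-zero key would compare as present, which seems acceptable only because keys are meant to be SHA-1 digests.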
[Mesa-dev] [PATCH] GL: Update glext.h to Revision 29735.
Khronos modified glext.h to get rid of GL_TEXTURE_BINDING, a special
enum added for ARB_direct_state_access. This enum was ruled
unimplementable.
---
 include/GL/glext.h | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/include/GL/glext.h b/include/GL/glext.h
index 0ca89ca..a3873a6 100644
--- a/include/GL/glext.h
+++ b/include/GL/glext.h
@@ -33,7 +33,7 @@ extern "C" {
 **   used to make the header, and the header can be found at
 **   http://www.opengl.org/registry/
 **
-** Khronos $Revision: 29537 $ on $Date: 2015-01-22 02:32:35 -0800 (Thu, 22 Jan 2015) $
+** Khronos $Revision: 29735 $ on $Date: 2015-02-02 19:00:01 -0800 (Mon, 02 Feb 2015) $
 */
 
 #if defined(_WIN32) && !defined(APIENTRY) && !defined(__CYGWIN__) && !defined(__SCITECH_SNAP__)
@@ -53,7 +53,7 @@ extern "C" {
 #define GLAPI extern
 #endif
 
-#define GL_GLEXT_VERSION 20150122
+#define GL_GLEXT_VERSION 20150202
 
 /* Generated C header for:
  * API: gl
@@ -2594,7 +2594,6 @@ GLAPI void APIENTRY glBindVertexBuffers (GLuint first, GLsizei count, const GLui
 #define GL_MAX_COMBINED_CLIP_AND_CULL_DISTANCES 0x82FA
 #define GL_TEXTURE_TARGET                 0x1006
 #define GL_QUERY_TARGET                   0x82EA
-#define GL_TEXTURE_BINDING                0x82EB
 #define GL_GUILTY_CONTEXT_RESET           0x8253
 #define GL_INNOCENT_CONTEXT_RESET         0x8254
 #define GL_UNKNOWN_CONTEXT_RESET          0x8255
@@ -11402,10 +11401,10 @@ GLAPI void APIENTRY glReferencePlaneSGIX (const GLdouble *equation);
 
 #ifndef GL_SGIX_resample
 #define GL_SGIX_resample 1
-#define GL_PACK_RESAMPLE_SGIX             0x842C
-#define GL_UNPACK_RESAMPLE_SGIX           0x842D
-#define GL_RESAMPLE_REPLICATE_SGIX        0x842E
-#define GL_RESAMPLE_ZERO_FILL_SGIX        0x842F
+#define GL_PACK_RESAMPLE_SGIX             0x842E
+#define GL_UNPACK_RESAMPLE_SGIX           0x842F
+#define GL_RESAMPLE_REPLICATE_SGIX        0x8433
+#define GL_RESAMPLE_ZERO_FILL_SGIX        0x8434
 #define GL_RESAMPLE_DECIMATE_SGIX         0x8430
 #endif /* GL_SGIX_resample */
 
--
2.1.0
[Mesa-dev] [Bug 88962] [osmesa] Crash on postprocessing if z buffer is NULL
https://bugs.freedesktop.org/show_bug.cgi?id=88962

--- Comment #5 from Park, Jeongmin pjm0...@gmail.com ---
Sorry about the first patch. I didn't see that.

In dri_sw.c, it checks for the z/s buffer:

   if (ctx->pp && drawable->textures[ST_ATTACHMENT_DEPTH_STENCIL])
      pp_run(ctx->pp, ptex, ptex,
             drawable->textures[ST_ATTACHMENT_DEPTH_STENCIL]);

and in dri_drawable.c:

   if (ctx->pp && src && zsbuf)
      pp_run(ctx->pp, src, src, zsbuf);

Should I remove these checks and add the checks in pp_mlaa.c instead?
Re: [Mesa-dev] [PATCH 3/7] glsl: Add initial functions to implement an on-disk cache
On Wed, Feb 04 2015, Kenneth Graunke wrote:
> The cache will need to be much larger than 1024 entries - perhaps by an
> order of magnitude.

Thanks for the feedback. I had meant to add a comment next to that 1024 in
the code along the lines of "This value was chosen arbitrarily. An
appropriate value will need to be found with testing."

> We should respect $XDG_CACHE_HOME from the XDG base directory spec:
> http://standards.freedesktop.org/basedir-spec/basedir-spec-latest.html
>
> Also, the snprintf makes me a bit nervous

See patch 5/7. It fixes both of these problems. (My apologies for setting
you up to review code that was replaced later in the series---but I'm
always a little hesitant to squash things too much when only some of the
code is actually mine.)

> Typo: unnecessary compile (missing 'n'). Also, */ goes on its own line.

Thanks. I've fixed both of these now.

-Carl
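For readers following the $XDG_CACHE_HOME point, here is a minimal sketch of the lookup order the XDG base directory spec describes: use $XDG_CACHE_HOME when it is set to an absolute path, otherwise fall back to $HOME/.cache. This is not the code from patch 5/7; the helper name cache_dir_from_env and the trailing "mesa" component are assumptions made for illustration, and the path is sized dynamically so no fixed-size buffer is involved.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not from the patch series): returns a malloc'd
 * cache directory path, or NULL if no usable base directory is known.
 * The caller passes in the values of $XDG_CACHE_HOME and $HOME, e.g.
 * from getenv(), and owns the returned string.
 */
static char *
cache_dir_from_env(const char *xdg_cache_home, const char *home)
{
   const char *base, *suffix;
   size_t len;
   char *path;

   if (xdg_cache_home != NULL && xdg_cache_home[0] == '/') {
      /* The spec requires an absolute path; anything else is ignored. */
      base = xdg_cache_home;
      suffix = "/mesa";
   } else if (home != NULL && home[0] != '\0') {
      /* Default location when $XDG_CACHE_HOME is unset or invalid. */
      base = home;
      suffix = "/.cache/mesa";
   } else {
      return NULL;
   }

   /* Size the buffer from the inputs instead of guessing a maximum. */
   len = strlen(base) + strlen(suffix) + 1;
   path = malloc(len);
   if (path == NULL)
      return NULL;

   snprintf(path, len, "%s%s", base, suffix);
   return path;
}
```

A caller would typically invoke it as cache_dir_from_env(getenv("XDG_CACHE_HOME"), getenv("HOME")) and free() the result; a getpwuid_r() fallback for an unset $HOME is left out of the sketch.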
Re: [Mesa-dev] [PATCH 6/7] glsl/cache.c: Dynamically allocate filenames internal to cache
On Wed, Feb 4, 2015 at 1:53 PM, Carl Worth cwo...@cworth.org wrote:
> The user can put the cache directory anywhere, so it's not safe to use
> fixed-size arrays to store filenames. Instead, allocate the cache pointer
> itself as a ralloc context and use that to dynamically allocate all
> filenames.
>
> While making this change, simplify the error handling in cache_get with
> a new goto FAIL block so the cleanup code exists in a single place,
> rather than being spread throughout the function over and over.

Is there some reason we should do this as a separate patch instead of
just squashing it?
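The two ideas in that commit message (dynamically sized filenames and a single cleanup block reached by goto) can be sketched as follows. This is an illustration only, not the code from the patch: it uses plain malloc/free where the patch uses a ralloc context, and the names join_path and read_entry are hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: allocate "<dir>/<name>" with no fixed-size buffer. */
static char *
join_path(const char *dir, const char *name)
{
   size_t len = strlen(dir) + 1 + strlen(name) + 1;
   char *path = malloc(len);

   if (path != NULL)
      snprintf(path, len, "%s/%s", dir, name);
   return path;
}

/* Hypothetical reader: returns a malloc'd copy of the file contents and
 * stores its length in *size, or returns NULL on any failure (in which
 * case *size is left untouched).
 */
static void *
read_entry(const char *dir, const char *name, size_t *size)
{
   char *filename = NULL;
   FILE *file = NULL;
   void *data = NULL;
   long length;

   filename = join_path(dir, name);
   if (filename == NULL)
      goto fail;

   file = fopen(filename, "rb");
   if (file == NULL)
      goto fail;

   if (fseek(file, 0, SEEK_END) != 0 || (length = ftell(file)) < 0)
      goto fail;
   rewind(file);

   /* Allocate at least one byte so an empty file is not mistaken for
    * an allocation failure.
    */
   data = malloc(length > 0 ? (size_t) length : 1);
   if (data == NULL)
      goto fail;

   if (fread(data, 1, (size_t) length, file) != (size_t) length)
      goto fail;

   *size = (size_t) length;
   fclose(file);
   free(filename);
   return data;

fail:
   /* Single cleanup path: every pointer stays NULL until it owns
    * something, so this block is safe from any failure point above.
    */
   free(data);
   if (file != NULL)
      fclose(file);
   free(filename);
   return NULL;
}
```

Initializing every resource to NULL up front is what keeps the fail block unconditional; with a ralloc context the frees would presumably collapse into one call on the context, which is part of the appeal of the approach in the patch.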
Re: [Mesa-dev] [PATCH 5/7] glsl: Make cache directory if it does not already exist
On Wed, Feb 4, 2015 at 1:52 PM, Carl Worth cwo...@cworth.org wrote:
> -   snprintf(cache->path, sizeof cache->path,
> -            "%s/.cache/mesa", pwd.pw_dir);
> +   cache->path = strdup(path);
> +   if (cache->path == NULL) {
> +      free (cache);

No space after the function name.

> +      cache = NULL;
> +      goto done;
> +   }
Re: [Mesa-dev] [PATCH 6/7] glsl/cache.c: Dynamically allocate filenames internal to cache
On Wed, Feb 04 2015, Matt Turner wrote:
> Is there some reason we should do this as a separate patch instead of
> just squashing it?

Not really. I've just squashed it, (while also fixing the space before the
parenthesis---I should probably add a pre-commit check for those to help me
break that habit. That, and the '*/' on its own line as well. I'm never
going to get used to that without help).

-Carl
Re: [Mesa-dev] [PATCH 3/7] glsl: Add initial functions to implement an on-disk cache
On Wed, Feb 04 2015, Matt Turner wrote:
> Rebase needed. I removed GLSL_SRCDIR a week and a half ago.

Was this the removal of all those subdir-objects warnings? Thank you!

> I don't think it makes a difference to autotools, but I think I'd list
> cache.h here instead of in LIBGLSL_FILES.

Actually, that was sort of intentional. The cache.c file is only
conditionally compiled, but cache.h has the inline stub implementations
that we want to compile unconditionally. So it felt right to put the .h
in the unconditional list.

Of course, maybe that doesn't matter at all. It's the .c files that
include the .h file that are going to result in the inline stubs being
provided.

Now that we're on the subject, what does automake even do with .h files
that are listed in these lists? Do they need to be there for non-srcdir
builds to work or something?

If it truly doesn't matter, then I would prefer to see the cache.h next
to the cache.c, yes.

> > +   cache = (struct program_cache *) malloc(sizeof *cache);
>
> Don't cast malloc.

I'll blame krh on this one, (and he can in turn blame it on bad habits
from programming in C++ too much). :-)

> > +   return (void *) data;
>
> I don't think you need this cast either?

Just removing that cast doesn't quite do the trick because there's a
potential signedness difference between char and uint8_t. But the cast is
not the way I want to fix this, so thanks for pointing that out.

(I have gone back and forth on whether this function should return a
uint8_t * or a void *. There's a later patch in the next series that
still expects a direct assignment from cache_get() to some data-structure
pointer to work without a cast. So I may change cache_get back to
returning a void *. But even then, the cast above won't be needed.)

Thanks for your attention to detail.

-Carl
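The "inline stub" arrangement described above (a header that is always included, backed by a .c file that is only sometimes built) usually looks something like the sketch below. The macro name ENABLE_SHADER_CACHE and the cache_destroy function are assumptions for illustration; only cache_create and struct program_cache appear in the quoted patch, and the real cache.h may differ.

```c
#include <stddef.h>

struct program_cache;

#ifdef ENABLE_SHADER_CACHE   /* hypothetical build-time switch */

/* Real implementations live in cache.c, which is only compiled when
 * the switch is on.
 */
struct program_cache *cache_create(void);
void cache_destroy(struct program_cache *cache);

#else

/* Stubs compiled into every includer when cache.c is left out, so
 * callers need no #ifdef of their own.
 */
static inline struct program_cache *
cache_create(void)
{
   return NULL;
}

static inline void
cache_destroy(struct program_cache *cache)
{
   (void) cache;
}

#endif
```

With this layout a caller writes cache = cache_create(); unconditionally and treats NULL as "no cache available", which is the same result the real function gives on failure.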
Re: [Mesa-dev] [PATCH] GL: Update glext.h to Revision 29735.
In the past I think we usually mentioned the date version (20150202) in
the commit msg. But not a big deal.

Reviewed-by: Brian Paul bri...@vmware.com

On 02/04/2015 05:47 PM, Laura Ekstrand wrote:
> Khronos modified glext.h to get rid of GL_TEXTURE_BINDING, a special
> enum added for ARB_direct_state_access. This enum was ruled
> unimplementable.
> ---
>  include/GL/glext.h | 13 ++++++-------
>  1 file changed, 6 insertions(+), 7 deletions(-)
>
> diff --git a/include/GL/glext.h b/include/GL/glext.h
> index 0ca89ca..a3873a6 100644
> --- a/include/GL/glext.h
> +++ b/include/GL/glext.h
> @@ -33,7 +33,7 @@ extern "C" {
>  **   used to make the header, and the header can be found at
>  **   http://www.opengl.org/registry/
>  **
> -** Khronos $Revision: 29537 $ on $Date: 2015-01-22 02:32:35 -0800 (Thu, 22 Jan 2015) $
> +** Khronos $Revision: 29735 $ on $Date: 2015-02-02 19:00:01 -0800 (Mon, 02 Feb 2015) $
>  */
>
>  #if defined(_WIN32) && !defined(APIENTRY) && !defined(__CYGWIN__) && !defined(__SCITECH_SNAP__)
> @@ -53,7 +53,7 @@ extern "C" {
>  #define GLAPI extern
>  #endif
>
> -#define GL_GLEXT_VERSION 20150122
> +#define GL_GLEXT_VERSION 20150202
>
>  /* Generated C header for:
>   * API: gl
> @@ -2594,7 +2594,6 @@ GLAPI void APIENTRY glBindVertexBuffers (GLuint first, GLsizei count, const GLui
>  #define GL_MAX_COMBINED_CLIP_AND_CULL_DISTANCES 0x82FA
>  #define GL_TEXTURE_TARGET                 0x1006
>  #define GL_QUERY_TARGET                   0x82EA
> -#define GL_TEXTURE_BINDING                0x82EB
>  #define GL_GUILTY_CONTEXT_RESET           0x8253
>  #define GL_INNOCENT_CONTEXT_RESET         0x8254
>  #define GL_UNKNOWN_CONTEXT_RESET          0x8255
> @@ -11402,10 +11401,10 @@ GLAPI void APIENTRY glReferencePlaneSGIX (const GLdouble *equation);
>
>  #ifndef GL_SGIX_resample
>  #define GL_SGIX_resample 1
> -#define GL_PACK_RESAMPLE_SGIX             0x842C
> -#define GL_UNPACK_RESAMPLE_SGIX           0x842D
> -#define GL_RESAMPLE_REPLICATE_SGIX        0x842E
> -#define GL_RESAMPLE_ZERO_FILL_SGIX        0x842F
> +#define GL_PACK_RESAMPLE_SGIX             0x842E
> +#define GL_UNPACK_RESAMPLE_SGIX           0x842F
> +#define GL_RESAMPLE_REPLICATE_SGIX        0x8433
> +#define GL_RESAMPLE_ZERO_FILL_SGIX        0x8434
>  #define GL_RESAMPLE_DECIMATE_SGIX         0x8430
>  #endif /* GL_SGIX_resample */
[Mesa-dev] [PATCH 3/3] util/hash_table: Try to hit a double-insertion bug in the collision test
---
 src/util/tests/hash_table/collision.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/src/util/tests/hash_table/collision.c b/src/util/tests/hash_table/collision.c
index b76782b..a2210c3 100644
--- a/src/util/tests/hash_table/collision.c
+++ b/src/util/tests/hash_table/collision.c
@@ -36,14 +36,19 @@ main(int argc, char **argv)
    struct hash_table *ht;
    const char *str1 = "test1";
    const char *str2 = "test2";
-   struct hash_entry *entry1, *entry2;
+   const char *str3 = "test3";
+   struct hash_entry *entry1, *entry2, *search_entry;
    uint32_t bad_hash = 5;
    int i;
 
    ht = _mesa_hash_table_create(NULL, NULL, _mesa_key_string_equal);
 
+   /* Insert some items.  Inserting 3 items forces a rehash and the new
+    * table size is big enough that we don't get rehashes later.
+    */
    _mesa_hash_table_insert_pre_hashed(ht, bad_hash, str1, NULL);
    _mesa_hash_table_insert_pre_hashed(ht, bad_hash, str2, NULL);
+   _mesa_hash_table_insert_pre_hashed(ht, bad_hash, str3, NULL);
 
    entry1 = _mesa_hash_table_search_pre_hashed(ht, bad_hash, str1);
    assert(entry1->key == str1);
@@ -60,6 +65,13 @@ main(int argc, char **argv)
    entry2 = _mesa_hash_table_search_pre_hashed(ht, bad_hash, str2);
    assert(entry2->key == str2);
 
+   /* Try inserting #2 again and make sure it gets overwritten */
+   _mesa_hash_table_insert_pre_hashed(ht, bad_hash, str2, NULL);
+   entry2 = _mesa_hash_table_search_pre_hashed(ht, bad_hash, str2);
+   hash_table_foreach(ht, search_entry) {
+      assert(search_entry == entry2 || search_entry->key != str2);
+   }
+
    /* Put str1 back, then spam junk into the table to force a
     * resize and make sure we can still find them both.
     */
--
2.2.2
[Mesa-dev] [PATCH 1/3] util/hash_table: Do a full search when adding new items
Previously, the hash_table_insert function would bail early if it found a
deleted slot that it could re-use. However, this is a problem if the key
being inserted is already in the hash table but further down the list. If
this happens, the element ends up getting inserted in the hash table
twice. This commit makes it so that we walk over all of the possible
entries for the given key and then, if we don't find the key, place it in
the available free entry we found.
---
 src/util/hash_table.c | 23 ++++++++++++++++-------
 1 file changed, 16 insertions(+), 7 deletions(-)

diff --git a/src/util/hash_table.c b/src/util/hash_table.c
index fd8e2ea..3247593 100644
--- a/src/util/hash_table.c
+++ b/src/util/hash_table.c
@@ -267,6 +267,7 @@ hash_table_insert(struct hash_table *ht, uint32_t hash,
                   const void *key, void *data)
 {
    uint32_t start_hash_address, hash_address;
+   struct hash_entry *available_entry = NULL;
 
    if (ht->entries >= ht->max_entries) {
       _mesa_hash_table_rehash(ht, ht->size_index + 1);
@@ -281,13 +282,11 @@ hash_table_insert(struct hash_table *ht, uint32_t hash,
       uint32_t double_hash;
 
       if (!entry_is_present(ht, entry)) {
-         if (entry_is_deleted(ht, entry))
-            ht->deleted_entries--;
-         entry->hash = hash;
-         entry->key = key;
-         entry->data = data;
-         ht->entries++;
-         return entry;
+         /* Stash the first available entry we find */
+         if (available_entry == NULL)
+            available_entry = entry;
+         if (entry_is_free(entry))
+            break;
       }
 
       /* Implement replacement when another insert happens
@@ -314,6 +313,16 @@ hash_table_insert(struct hash_table *ht, uint32_t hash,
       hash_address = (hash_address + double_hash) % ht->size;
    } while (hash_address != start_hash_address);
 
+   if (available_entry) {
+      if (entry_is_deleted(ht, available_entry))
+         ht->deleted_entries--;
+      available_entry->hash = hash;
+      available_entry->key = key;
+      available_entry->data = data;
+      ht->entries++;
+      return available_entry;
+   }
+
    /* We could hit here if a required resize failed. An unchecked-malloc
     * application could ignore this result.
     */
--
2.2.2
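To make the failure mode in this commit message concrete, here is a stripped-down sketch of the "search the whole probe sequence before reusing a deleted slot" idea. It is not Mesa's hash_table.c: the table is a fixed size, probing is linear rather than double hashing, keys are compared by pointer, and the names struct table, table_insert and table_remove are made up for the example.

```c
#include <stddef.h>
#include <stdint.h>

#define TABLE_SIZE 8

struct entry {
   uint32_t hash;
   const void *key;   /* NULL = free, deleted_key = deleted (tombstone) */
   void *data;
};

struct table {
   struct entry slots[TABLE_SIZE];
   unsigned entries;
   unsigned deleted_entries;
};

/* Address of this object serves as the tombstone marker. */
static const int deleted_key_value;
static const void *deleted_key = &deleted_key_value;

static int
entry_is_free(const struct entry *e)
{
   return e->key == NULL;
}

static int
entry_is_deleted(const struct entry *e)
{
   return e->key == deleted_key;
}

static int
entry_is_present(const struct entry *e)
{
   return e->key != NULL && e->key != deleted_key;
}

static struct entry *
table_insert(struct table *t, uint32_t hash, const void *key, void *data)
{
   struct entry *available = NULL;
   uint32_t start = hash % TABLE_SIZE, addr = start;

   do {
      struct entry *e = &t->slots[addr];

      if (!entry_is_present(e)) {
         /* Remember the first reusable slot, but keep walking past
          * tombstones: the key may still live further down the chain.
          */
         if (available == NULL)
            available = e;
         /* A truly free slot ends the chain, so the key is not here. */
         if (entry_is_free(e))
            break;
      } else if (e->hash == hash && e->key == key) {
         /* Key already present: overwrite instead of adding a copy. */
         e->data = data;
         return e;
      }

      addr = (addr + 1) % TABLE_SIZE;
   } while (addr != start);

   if (available != NULL) {
      if (entry_is_deleted(available))
         t->deleted_entries--;
      available->hash = hash;
      available->key = key;
      available->data = data;
      t->entries++;
      return available;
   }

   return NULL;   /* table full; this sketch never resizes */
}

static void
table_remove(struct table *t, struct entry *e)
{
   e->key = deleted_key;
   t->entries--;
   t->deleted_entries++;
}
```

The scenario from the commit message is: insert A and B with the same hash, remove A, then insert B again. An insert that returns at the first tombstone would store a second copy of B in A's old slot; the version above walks on, finds the existing B, and overwrites it.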
[Mesa-dev] [Bug 88967] Black textures Radeon m HD 5650
https://bugs.freedesktop.org/show_bug.cgi?id=88967

--- Comment #13 from Michel Dänzer mic...@daenzer.net ---
What does this say:

   locate libtxc_dxtn.so | xargs file

Or, you can just try

   sudo apt-get install libtxc-dxtn0:i386

which should either install the 32-bit version of the library or tell you
that it's already installed.
Re: [Mesa-dev] [PATCH] gallium: add double opcodes and TGSI execution (v2.1)
On 30 December 2014 at 08:14, Roland Scheidegger srol...@vmware.com wrote:
> Just minor nits, looks good to me otherwise.
> I agree with others that probably the round family of functions should
> be added too (but could be done in another patch). Maybe could have one
> cap bit then (so some implementations only doing what's required by sm5,
> hence missing things like round and rsq and everybody else being able to
> do everything).
>
> Roland
>
> > -             const struct tgsi_full_src_register *reg,
> > -             const uint chan_index,
> > -             enum tgsi_exec_datatype src_datatype)
> > +fetch_source_d(const struct tgsi_exec_machine *mach,
>
> I think the _d in the name is a bit misleading here since this fetches
> any type, not just floats. Unless this stands for something else...

It's one of those "extending an API" functions; for non-doubles you should
still be using fetch_source, which wraps this.

I'm not sure I can think of a meaningful name; fetch_source2 or
fetch_source_ex kinda sound just as pointless.

I'll leave it as-is for now as I'm not sure it really makes things worse.

Dave.
[Mesa-dev] [PATCH 3/3] tgsi: add support for flt64 constants
From: Dave Airlie airl...@redhat.com These act like flt32 except they take up two slots, and you can only add 2 x flt64 constants in one slot. The main reason they are different is we don't want to match half a flt64 constants against a flt32 constant in the matching code, we need to make sure we treat both parts of the flt64 as an single structure. Cleaned up printing/parsing by Ilia Mirkin imir...@alum.mit.edu Signed-off-by: Dave Airlie airl...@redhat.com --- src/gallium/auxiliary/tgsi/tgsi_dump.c | 8 src/gallium/auxiliary/tgsi/tgsi_parse.c| 1 + src/gallium/auxiliary/tgsi/tgsi_strings.c | 5 +- src/gallium/auxiliary/tgsi/tgsi_strings.h | 2 +- src/gallium/auxiliary/tgsi/tgsi_text.c | 22 + src/gallium/auxiliary/tgsi/tgsi_ureg.c | 75 -- src/gallium/auxiliary/tgsi/tgsi_ureg.h | 5 ++ src/gallium/include/pipe/p_shader_tokens.h | 1 + 8 files changed, 113 insertions(+), 6 deletions(-) diff --git a/src/gallium/auxiliary/tgsi/tgsi_dump.c b/src/gallium/auxiliary/tgsi/tgsi_dump.c index 972a37e..7ae4049 100644 --- a/src/gallium/auxiliary/tgsi/tgsi_dump.c +++ b/src/gallium/auxiliary/tgsi/tgsi_dump.c @@ -83,6 +83,7 @@ dump_enum( #define INSTID(I) ctx-dump_printf( ctx, % 3u, I ) #define SID(I) ctx-dump_printf( ctx, %d, I ) #define FLT(F) ctx-dump_printf( ctx, %10.4f, F ) +#define DBL(D) ctx-dump_printf( ctx, %10.8f, D ) #define ENM(E,ENUMS)dump_enum( ctx, E, ENUMS, sizeof( ENUMS ) / sizeof( *ENUMS ) ) const char * @@ -238,6 +239,13 @@ dump_imm_data(struct tgsi_iterate_context *iter, assert( num_tokens = 4 ); for (i = 0; i num_tokens; i++) { switch (data_type) { + case TGSI_IMM_FLOAT64: { + union di d; + d.ui = data[i].Uint | (uint64_t)data[i+1].Uint 32; + DBL( d.d ); + i++; + break; + } case TGSI_IMM_FLOAT32: FLT( data[i].Float ); break; diff --git a/src/gallium/auxiliary/tgsi/tgsi_parse.c b/src/gallium/auxiliary/tgsi/tgsi_parse.c index 9cc8383..1162b26 100644 --- a/src/gallium/auxiliary/tgsi/tgsi_parse.c +++ b/src/gallium/auxiliary/tgsi/tgsi_parse.c @@ -148,6 +148,7 @@ 
tgsi_parse_token( switch (imm-Immediate.DataType) { case TGSI_IMM_FLOAT32: + case TGSI_IMM_FLOAT64: for (i = 0; i imm_count; i++) { next_token(ctx, imm-u[i].Float); } diff --git a/src/gallium/auxiliary/tgsi/tgsi_strings.c b/src/gallium/auxiliary/tgsi/tgsi_strings.c index bd97544..9b727cf 100644 --- a/src/gallium/auxiliary/tgsi/tgsi_strings.c +++ b/src/gallium/auxiliary/tgsi/tgsi_strings.c @@ -181,11 +181,12 @@ const char *tgsi_fs_coord_pixel_center_names[2] = INTEGER }; -const char *tgsi_immediate_type_names[3] = +const char *tgsi_immediate_type_names[4] = { FLT32, UINT32, - INT32 + INT32, + FLT64 }; diff --git a/src/gallium/auxiliary/tgsi/tgsi_strings.h b/src/gallium/auxiliary/tgsi/tgsi_strings.h index c842746..90014a2 100644 --- a/src/gallium/auxiliary/tgsi/tgsi_strings.h +++ b/src/gallium/auxiliary/tgsi/tgsi_strings.h @@ -58,7 +58,7 @@ extern const char *tgsi_fs_coord_origin_names[2]; extern const char *tgsi_fs_coord_pixel_center_names[2]; -extern const char *tgsi_immediate_type_names[3]; +extern const char *tgsi_immediate_type_names[4]; const char * diff --git a/src/gallium/auxiliary/tgsi/tgsi_text.c b/src/gallium/auxiliary/tgsi/tgsi_text.c index f965b01..5069d13 100644 --- a/src/gallium/auxiliary/tgsi/tgsi_text.c +++ b/src/gallium/auxiliary/tgsi/tgsi_text.c @@ -232,6 +232,24 @@ static boolean parse_float( const char **pcur, float *val ) return TRUE; } +static boolean parse_double( const char **pcur, uint32_t *val0, uint32_t *val1) +{ + const char *cur = *pcur; + union { + double dval; + uint32_t uval[2]; + } v; + + v.dval = strtod(cur, pcur); + if (*pcur == cur) + return FALSE; + + *val0 = v.uval[0]; + *val1 = v.uval[1]; + + return TRUE; +} + struct translate_ctx { const char *text; @@ -1104,6 +1122,10 @@ static boolean parse_immediate_data(struct translate_ctx *ctx, unsigned type, } switch (type) { + case TGSI_IMM_FLOAT64: + ret = parse_double(ctx-cur, values[i].Uint, values[i+1].Uint); + i++; + break; case TGSI_IMM_FLOAT32: ret = parse_float(ctx-cur, 
values[i].Float); break; diff --git a/src/gallium/auxiliary/tgsi/tgsi_ureg.c b/src/gallium/auxiliary/tgsi/tgsi_ureg.c index f524dfb..bc14cfd 100644 --- a/src/gallium/auxiliary/tgsi/tgsi_ureg.c +++ b/src/gallium/auxiliary/tgsi/tgsi_ureg.c @@ -650,7 +650,48 @@ ureg_DECL_sampler_view(struct ureg_program *ureg, } static int +match_or_expand_immediate64( const unsigned *v, + int type, + unsigned nr, + unsigned *v2, + unsigned *pnr2, +
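The "two slots per flt64" layout used by dump_imm_data and parse_double above comes down to splitting a double's 64-bit pattern into a low and a high 32-bit word and joining them again. The sketch below shows that round trip with explicit shifts; the function names are hypothetical, and it assumes double is IEEE 754 binary64. Note that the patch's parse_double splits through a union of uint32_t[2], so its word order follows the host byte order, whereas the dump side uses shifts; the sketch uses shifts in both directions.

```c
#include <stdint.h>
#include <string.h>

/* Split a double into two 32-bit words, low word first. memcpy is used
 * to reinterpret the bits without violating aliasing rules.
 */
static void
double_to_words(double d, uint32_t *lo, uint32_t *hi)
{
   uint64_t bits;

   memcpy(&bits, &d, sizeof bits);
   *lo = (uint32_t) bits;
   *hi = (uint32_t) (bits >> 32);
}

/* Rebuild the double from the two words, mirroring the
 * "lo | (uint64_t)hi << 32" expression in the dump code.
 */
static double
words_to_double(uint32_t lo, uint32_t hi)
{
   uint64_t bits = (uint64_t) lo | ((uint64_t) hi << 32);
   double d;

   memcpy(&d, &bits, sizeof d);
   return d;
}
```

Because both halves must travel together, any code that matches or deduplicates immediates has to compare the pair as a unit, which is the reason the commit message gives for keeping flt64 separate from flt32.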
[Mesa-dev] [PATCH 1/3] gallium: add double opcodes and TGSI execution (v3.1)
This patch adds support for a set of double opcodes to TGSI. It is an update of work done originally by Michal Krol on the gallium-double-opcodes branch. The opcodes have a hint where they came from in the header file. v2: add unsigned/int - double v2.1: update docs. v3: add DRSQ (Glenn), fix review comments (Glenn). v3.1: add DRSQ docs, fix typo (Roland) This is based on code by Michael Krol mic...@vmware.com Signed-off-by: Dave Airlie airl...@redhat.com --- src/gallium/auxiliary/tgsi/tgsi_exec.c | 753 - src/gallium/auxiliary/tgsi/tgsi_info.c | 25 +- src/gallium/docs/source/tgsi.rst | 84 +++- src/gallium/include/pipe/p_shader_tokens.h | 27 +- 4 files changed, 870 insertions(+), 19 deletions(-) diff --git a/src/gallium/auxiliary/tgsi/tgsi_exec.c b/src/gallium/auxiliary/tgsi/tgsi_exec.c index 834568b..57385fc 100644 --- a/src/gallium/auxiliary/tgsi/tgsi_exec.c +++ b/src/gallium/auxiliary/tgsi/tgsi_exec.c @@ -72,6 +72,16 @@ #define TILE_BOTTOM_LEFT 2 #define TILE_BOTTOM_RIGHT 3 +union tgsi_double_channel { + double d[TGSI_QUAD_SIZE]; + unsigned u[TGSI_QUAD_SIZE][2]; +}; + +struct tgsi_double_vector { + union tgsi_double_channel xy; + union tgsi_double_channel zw; +}; + static void micro_abs(union tgsi_exec_channel *dst, const union tgsi_exec_channel *src) @@ -147,6 +157,55 @@ micro_cos(union tgsi_exec_channel *dst, } static void +micro_d2f(union tgsi_exec_channel *dst, + const union tgsi_double_channel *src) +{ + dst-f[0] = (float)src-d[0]; + dst-f[1] = (float)src-d[1]; + dst-f[2] = (float)src-d[2]; + dst-f[3] = (float)src-d[3]; +} + +static void +micro_d2i(union tgsi_exec_channel *dst, + const union tgsi_double_channel *src) +{ + dst-i[0] = (int)src-d[0]; + dst-i[1] = (int)src-d[1]; + dst-i[2] = (int)src-d[2]; + dst-i[3] = (int)src-d[3]; +} + +static void +micro_d2u(union tgsi_exec_channel *dst, + const union tgsi_double_channel *src) +{ + dst-u[0] = (unsigned)src-d[0]; + dst-u[1] = (unsigned)src-d[1]; + dst-u[2] = (unsigned)src-d[2]; + dst-u[3] = 
(unsigned)src-d[3]; +} +static void +micro_dabs(union tgsi_double_channel *dst, + const union tgsi_double_channel *src) +{ + dst-d[0] = src-d[0] = 0.0 ? src-d[0] : -src-d[0]; + dst-d[1] = src-d[1] = 0.0 ? src-d[1] : -src-d[1]; + dst-d[2] = src-d[2] = 0.0 ? src-d[2] : -src-d[2]; + dst-d[3] = src-d[3] = 0.0 ? src-d[3] : -src-d[3]; +} + +static void +micro_dadd(union tgsi_double_channel *dst, + const union tgsi_double_channel *src) +{ + dst-d[0] = src[0].d[0] + src[1].d[0]; + dst-d[1] = src[0].d[1] + src[1].d[1]; + dst-d[2] = src[0].d[2] + src[1].d[2]; + dst-d[3] = src[0].d[3] + src[1].d[3]; +} + +static void micro_ddx(union tgsi_exec_channel *dst, const union tgsi_exec_channel *src) { @@ -167,6 +226,168 @@ micro_ddy(union tgsi_exec_channel *dst, } static void +micro_ddiv(union tgsi_double_channel *dst, + const union tgsi_double_channel *src) +{ + dst-d[0] = src[0].d[0] / src[1].d[0]; + dst-d[1] = src[0].d[1] / src[1].d[1]; + dst-d[2] = src[0].d[2] / src[1].d[2]; + dst-d[3] = src[0].d[3] / src[1].d[3]; +} + +static void +micro_dmul(union tgsi_double_channel *dst, + const union tgsi_double_channel *src) +{ + dst-d[0] = src[0].d[0] * src[1].d[0]; + dst-d[1] = src[0].d[1] * src[1].d[1]; + dst-d[2] = src[0].d[2] * src[1].d[2]; + dst-d[3] = src[0].d[3] * src[1].d[3]; +} + +static void +micro_dmax(union tgsi_double_channel *dst, + const union tgsi_double_channel *src) +{ + dst-d[0] = src[0].d[0] src[1].d[0] ? src[0].d[0] : src[1].d[0]; + dst-d[1] = src[0].d[1] src[1].d[1] ? src[0].d[1] : src[1].d[1]; + dst-d[2] = src[0].d[2] src[1].d[2] ? src[0].d[2] : src[1].d[2]; + dst-d[3] = src[0].d[3] src[1].d[3] ? src[0].d[3] : src[1].d[3]; +} + +static void +micro_dmin(union tgsi_double_channel *dst, + const union tgsi_double_channel *src) +{ + dst-d[0] = src[0].d[0] src[1].d[0] ? src[0].d[0] : src[1].d[0]; + dst-d[1] = src[0].d[1] src[1].d[1] ? src[0].d[1] : src[1].d[1]; + dst-d[2] = src[0].d[2] src[1].d[2] ? src[0].d[2] : src[1].d[2]; + dst-d[3] = src[0].d[3] src[1].d[3] ? 
src[0].d[3] : src[1].d[3]; +} + +static void +micro_dneg(union tgsi_double_channel *dst, + const union tgsi_double_channel *src) +{ + dst-d[0] = -src-d[0]; + dst-d[1] = -src-d[1]; + dst-d[2] = -src-d[2]; + dst-d[3] = -src-d[3]; +} + +static void +micro_dslt(union tgsi_double_channel *dst, + const union tgsi_double_channel *src) +{ + dst-u[0][0] = src[0].d[0] src[1].d[0] ? ~0U : 0U; + dst-u[1][0] = src[0].d[1] src[1].d[1] ? ~0U : 0U; + dst-u[2][0] = src[0].d[2] src[1].d[2] ? ~0U : 0U; + dst-u[3][0] = src[0].d[3] src[1].d[3] ? ~0U : 0U; +} + +static void +micro_dsne(union tgsi_double_channel *dst, + const union tgsi_double_channel *src) +{ + dst-u[0][0] = src[0].d[0] != src[1].d[0] ? ~0U :
[Mesa-dev] [PATCH 2/3] tgsi: expose doubles for softpipe.
Signed-off-by: Dave Airlie airl...@redhat.com
---
 src/gallium/auxiliary/tgsi/tgsi_exec.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_exec.h b/src/gallium/auxiliary/tgsi/tgsi_exec.h
index cc5a916..256cf72 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_exec.h
+++ b/src/gallium/auxiliary/tgsi/tgsi_exec.h
@@ -456,7 +456,7 @@ tgsi_exec_get_shader_param(enum pipe_shader_cap param)
    case PIPE_SHADER_CAP_TGSI_SQRT_SUPPORTED:
       return 1;
    case PIPE_SHADER_CAP_DOUBLES:
-      return 0;
+      return 1;
    }
    /* if we get here, we missed a shader cap above (and should have seen
     * a compiler warning.)
--
1.9.3
[Mesa-dev] gallium double support
These 3 patches are similar to the last repost, and I'd like to merge
them to get the ball rolling.

The main change is adding DRSQ to this list. Ilia has some other patches
to add the rounding ones, but I'd like to get things started with this
set and work forwards.

Dave.
Re: [Mesa-dev] [PATCH 1/3] gallium: add double opcodes and TGSI execution (v3.1)
On Wed, Feb 4, 2015 at 8:08 PM, Dave Airlie airl...@gmail.com wrote:
> This patch adds support for a set of double opcodes
> to TGSI. It is an update of work done originally
> by Michal Krol on the gallium-double-opcodes branch.
>
> The opcodes have a hint where they came from in the
> header file.
>
> v2: add unsigned/int <-> double
> v2.1: update docs.
> v3: add DRSQ (Glenn), fix review comments (Glenn).
> v3.1: add DRSQ docs, fix typo (Roland)
>
> This is based on code by Michael Krol mic...@vmware.com
>
> Signed-off-by: Dave Airlie airl...@redhat.com
> ---
>  src/gallium/auxiliary/tgsi/tgsi_exec.c     | 753 -
>  src/gallium/auxiliary/tgsi/tgsi_info.c     |  25 +-
>  src/gallium/docs/source/tgsi.rst           |  84 +++-
>  src/gallium/include/pipe/p_shader_tokens.h |  27 +-
>  4 files changed, 870 insertions(+), 19 deletions(-)
>
> diff --git a/src/gallium/auxiliary/tgsi/tgsi_exec.c b/src/gallium/auxiliary/tgsi/tgsi_exec.c
> index 834568b..57385fc 100644
> --- a/src/gallium/auxiliary/tgsi/tgsi_exec.c
> +++ b/src/gallium/auxiliary/tgsi/tgsi_exec.c

> +static void
> +micro_dldexp(union tgsi_double_channel *dst,
> +             const union tgsi_double_channel *src0,
> +             union tgsi_exec_channel *src1)
> +{
> +   dst->d[0] = ldexp(src0->d[0], src1->i[1]);

oops?

> +   dst->d[1] = ldexp(src0->d[1], src1->i[1]);
> +   dst->d[2] = ldexp(src0->d[2], src1->i[2]);
> +   dst->d[3] = ldexp(src0->d[3], src1->i[3]);
> +}
> +

> @@ -1090,11 +1341,11 @@ fetch_src_file_channel(const struct tgsi_exec_machine *mach,
>  }
>
>  static void
> -fetch_source(const struct tgsi_exec_machine *mach,
> -             union tgsi_exec_channel *chan,
> -             const struct tgsi_full_src_register *reg,
> -             const uint chan_index,
> -             enum tgsi_exec_datatype src_datatype)
> +fetch_source_d(const struct tgsi_exec_machine *mach,
> +               union tgsi_exec_channel *chan,
> +               const struct tgsi_full_src_register *reg,
> +               const uint chan_index,
> +               enum tgsi_exec_datatype src_datatype, bool dtype)
>  {
>     union tgsi_exec_channel index;
>     union tgsi_exec_channel index2D;
> @@ -1238,6 +1489,10 @@ fetch_source(const struct tgsi_exec_machine *mach,
>                            &index2D,
>                            chan);
>
> +   /* double modifiers handled by caller */
> +   if (dtype)
> +      return;

Should the below code just get moved to fetch_source? Or does it rely
on local args which makes that a pain? If it's not too hard, I think
it'd be a lot cleaner / clearer than an extra param here.

> +
>     if (reg->Register.Absolute) {
>        if (src_datatype == TGSI_EXEC_DATA_FLOAT) {
>           micro_abs(chan, chan);
> @@ -1256,12 +1511,22 @@ fetch_source(const struct tgsi_exec_machine *mach,
>  }
>
>  static void
> -store_dest(struct tgsi_exec_machine *mach,
> -           const union tgsi_exec_channel *chan,
> -           const struct tgsi_full_dst_register *reg,
> -           const struct tgsi_full_instruction *inst,
> -           uint chan_index,
> -           enum tgsi_exec_datatype dst_datatype)
> +fetch_source(const struct tgsi_exec_machine *mach,
> +             union tgsi_exec_channel *chan,
> +             const struct tgsi_full_src_register *reg,
> +             const uint chan_index,
> +             enum tgsi_exec_datatype src_datatype)
> +{
> +   fetch_source_d(mach, chan, reg, chan_index, src_datatype, false);
> +}
> +
> +static void
> +store_dest_optsat(struct tgsi_exec_machine *mach,
> +                  const union tgsi_exec_channel *chan,
> +                  const struct tgsi_full_dst_register *reg,
> +                  const struct tgsi_full_instruction *inst,
> +                  uint chan_index,
> +                  enum tgsi_exec_datatype dst_datatype, bool sat)
>  {
>     uint i;
>     union tgsi_exec_channel null;
> @@ -1471,6 +1736,14 @@ store_dest(struct tgsi_exec_machine *mach,
>        }
>     }
>
> +   if (!sat) {

Same comment here about moving the sat code into store_dest.

> +      /* doubles path */
> +      for (i = 0; i < TGSI_QUAD_SIZE; i++)
> +         if (execmask & (1 << i))
> +            dst->i[i] = chan->i[i];
> +      return;
> +   }
> +
>     switch (inst->Instruction.Saturate) {
>     case TGSI_SAT_NONE:
>        for (i = 0; i < TGSI_QUAD_SIZE; i++)
> @@ -1505,8 +1778,20 @@ store_dest(struct tgsi_exec_machine *mach,
>     default:
>        assert( 0 );
>     }
> +

stray newline

>  }
>
> +static void
> +store_dest(struct tgsi_exec_machine *mach,
> +           const union tgsi_exec_channel *chan,
> +           const struct tgsi_full_dst_register *reg,
> +           const struct tgsi_full_instruction *inst,
> +           uint chan_index,
> +           enum tgsi_exec_datatype dst_datatype)
> +{
> +   store_dest_optsat(mach, chan, reg, inst, chan_index,
> +                     dst_datatype, true);
> +}

Missing newline

>  #define FETCH(VAL,INDEX,CHAN)\
>      fetch_source(mach, VAL, &inst->Src[INDEX], CHAN, TGSI_EXEC_DATA_FLOAT)
>
> @@ -2980,6 +3265,354 @@ exec_endswitch(struct tgsi_exec_machine *mach)
Re: [Mesa-dev] [PATCH 3/7] glsl: Add initial functions to implement an on-disk cache
On Wed, Feb 4, 2015 at 6:04 PM, Carl Worth cwo...@cworth.org wrote:
> On Wed, Feb 04 2015, Matt Turner wrote:
> > Rebase needed. I removed GLSL_SRCDIR a week and a half ago.
>
> Was this the removal of all those subdir-objects warnings? Thank you!

Yes! Just one warning remaining :).

> > I don't think it makes a difference to autotools, but I think I'd list
> > cache.h here instead of in LIBGLSL_FILES.
>
> Actually, that was sort of intentional. The cache.c file is only
> conditionally compiled, but cache.h has the inline stub implementations
> that we want to compile unconditionally. So it felt right to put the .h
> in the unconditional list.
>
> Of course, maybe that doesn't matter at all. It's the .c files that
> include the .h file that are going to result in the inline stubs being
> provided.
>
> Now that we're on the subject, what does automake even do with .h files
> that are listed in these lists? Do they need to be there for non-srcdir
> builds to work or something?

It's just used to tell the `dist` rule to include it in the tarball.

> If it truly doesn't matter, then I would prefer to see the cache.h next
> to the cache.c, yes.

Yeah, I don't think it matters. I think make itself figures out
dependencies on headers.
Re: [Mesa-dev] [PATCH 00/13] add fp64 support to mesa and glsl compiler
Oh, and you can find this branch at

https://github.com/imirkin/mesa.git fp64-2
https://github.com/imirkin/mesa/commits/fp64-2

This includes the gallium, st/mesa, and nvc0 patches as well.

On Thu, Feb 5, 2015 at 2:27 AM, Ilia Mirkin imir...@alum.mit.edu wrote:
> These patches have been around for over 6 months with little change. The
> "glsl: add double support" patch is a bit large and includes many of
> Tapani and Topi's follow-on fix patches. There's probably still bugs in
> there, but at this point, I think it makes sense to push them soon to
> avoid bitrot.
>
> There's a st/mesa implementation, and nvc0 driver support which works as
> well as softpipe (a handful of issues left over in st/mesa to do with
> initializers). I believe there are cayman patches as well, not sure what
> the status is.
>
> Dave Airlie (11):
>   glapi: add ARB_gpu_shader_fp64 (v2)
>   mesa: add ARB_gpu_shader_fp64 extension info (v2)
>   mesa: add mesa_type_is_double helper function (v2)
>   glsl: add double type
>   mesa: add double uniform support. (v4)
>   glsl: add ARB_gpu_shader_fp64 to the glsl extensions. (v2)
>   glsl: add double support
>   glsl: enable/disable certain lowering passes for doubles
>   glsl/lower_instructions: add double lowering passes
>   glsl: implement double builtin functions
>   glsl: lower double optional passes (v2)
>
> Ilia Mirkin (1):
>   glsl: add a lowering pass for frexp/ldexp with double arguments
>
> Tapani Pälli (1):
>   glsl: validate output types for shader stages
>
>  src/glsl/ast.h | 2 +
>  src/glsl/ast_function.cpp | 67 +-
>  src/glsl/ast_to_hir.cpp | 85 ++-
>  src/glsl/builtin_functions.cpp | 751 ++---
>  src/glsl/builtin_type_macros.h | 16 +
>  src/glsl/builtin_types.cpp | 30 +
>  src/glsl/glcpp/glcpp-parse.y | 3 +
>  src/glsl/glsl_lexer.ll | 42 +-
>  src/glsl/glsl_parser.yy | 33 +-
>  src/glsl/glsl_parser_extras.cpp | 5 +
>  src/glsl/glsl_parser_extras.h | 7 +
>  src/glsl/glsl_types.cpp | 109 ++-
>  src/glsl/glsl_types.h | 19 +-
>  src/glsl/ir.cpp | 104 ++-
>  src/glsl/ir.h | 21 +
>  src/glsl/ir_builder.cpp | 23 +
>  src/glsl/ir_builder.h | 5 +
>  src/glsl/ir_clone.cpp | 1 +
>  src/glsl/ir_constant_expression.cpp | 234 ++-
>  src/glsl/ir_optimization.h | 2 +
>  src/glsl/ir_print_visitor.cpp | 11 +
>  src/glsl/ir_set_program_inouts.cpp | 24 +-
>  src/glsl/ir_validate.cpp | 61 +-
>  src/glsl/link_uniform_initializers.cpp | 7 +-
>  src/glsl/link_uniforms.cpp | 8 +-
>  src/glsl/link_varyings.cpp | 3 +-
>  src/glsl/loop_controls.cpp | 19 +-
>  src/glsl/lower_instructions.cpp | 580 +++-
>  src/glsl/lower_mat_op_to_vec.cpp | 2 +
>  src/glsl/lower_ubo_reference.cpp | 13 +-
>  src/glsl/opt_constant_propagation.cpp | 3 +
>  src/glsl/opt_minmax.cpp | 13 +
>  src/glsl/standalone_scaffolding.cpp | 1 +
>  src/mapi/glapi/gen/ARB_gpu_shader_fp64.xml | 143
>  src/mapi/glapi/gen/ARB_separate_shader_objects.xml | 2 -
>  src/mapi/glapi/gen/Makefile.am | 1 +
>  src/mapi/glapi/gen/gl_API.xml | 2 +
>  src/mesa/main/extensions.c | 1 +
>  src/mesa/main/mtypes.h | 1 +
>  src/mesa/main/tests/dispatch_sanity.cpp | 70 +-
>  src/mesa/main/uniform_query.cpp | 27 +-
>  src/mesa/main/uniforms.c | 380 ++-
>  src/mesa/main/uniforms.h | 92 ++-
>  src/mesa/program/ir_to_mesa.cpp | 27 +-
>  src/mesa/program/prog_parameter.c | 16 +-
>  src/mesa/program/prog_parameter.h | 22 +
>  46 files changed, 2649 insertions(+), 439 deletions(-)
>  create mode 100644 src/mapi/glapi/gen/ARB_gpu_shader_fp64.xml
>
> --
> 2.0.5
Re: [Mesa-dev] [PATCH 01/13] glapi: add ARB_gpu_shader_fp64 (v2)
On Wed, Feb 4, 2015 at 11:27 PM, Ilia Mirkin imir...@alum.mit.edu wrote:
> +void GLAPIENTRY
> +_mesa_UniformMatrix3dv(GLint location, GLsizei count, GLboolean transpose,
> + const GLdouble * value)

Sloppy whitespace. Align it with the ( on the previous line. Also * goes with the argument name. Both of these occur elsewhere in this patch.
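For reference, a minimal sketch of the layout the review asks for: the continuation line starts under the first parameter, and the * sits next to the argument name. The typedefs, the empty GLAPIENTRY macro, the stub body and last_first_element are stand-ins invented here so the fragment compiles on its own; they are not part of the patch.

```cpp
// Stand-in typedefs and macro so the fragment compiles outside Mesa;
// in the tree these come from the GL headers.
typedef int GLint;
typedef int GLsizei;
typedef unsigned char GLboolean;
typedef double GLdouble;
#define GLAPIENTRY

// Hypothetical stub state: records the first element the call was handed.
static GLdouble last_first_element;

void GLAPIENTRY
_mesa_UniformMatrix3dv(GLint location, GLsizei count, GLboolean transpose,
                       const GLdouble *value)
{
   (void) location;
   (void) transpose;
   if (count > 0 && value)
      last_first_element = value[0];
}
```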
Re: [Mesa-dev] [PATCH 3/7] glsl: Add initial functions to implement an on-disk cache
On Thu, Feb 5, 2015 at 1:31 AM, Kenneth Graunke kenn...@whitecape.org wrote:
> On Wednesday, February 04, 2015 01:52:57 PM Carl Worth wrote:
>> From: Kristian Høgsberg k...@bitplanet.net
>>
>> This code provides for an on-disk cache of objects. Objects are stored
>> and retrieved (in ~/.cache/mesa) via names that are arbitrary 20-byte
>> sequences, (intended to be SHA-1 hashes of something identifying for
>> the content). The cache is limited to a maximum number of entries (1024
>> in this patch), and uses random replacement. These attributes are
>> managed via
>
> Hi Carl,
>
> The cache will need to be much larger than 1024 entries - perhaps by an
> order of magnitude. For example, Shadowrun Returns uses 1299 shaders,
> Left 4 Dead 2 uses 1849 shaders, and Natural Selection 2 uses 2719
> shaders. A single application could overflow the cache :)

Seconded. Many Unity games end up loading anywhere from several hundred to several thousand shaders.

--
Aras Pranckevičius
work: http://unity3d.com
home: http://aras-p.info
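To make the size concern concrete, here is a rough in-memory sketch of a fixed-capacity index keyed by 20-byte names with random replacement. It is not the code from the patch: every name below (sketch_cache_index, sketch_index_find, sketch_index_put) is hypothetical, lookup is a plain linear scan, and nothing touches the disk.

```cpp
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical names throughout; this is not the API from the patch.
static const unsigned SKETCH_KEY_SIZE = 20;      // e.g. a SHA-1 digest
static const unsigned SKETCH_MAX_ENTRIES = 1024; // limit quoted in the thread

struct sketch_cache_index {
   uint8_t keys[SKETCH_MAX_ENTRIES][SKETCH_KEY_SIZE];
   unsigned count;   // number of slots filled so far
};

// Return the slot holding `key`, or -1 if it is not present (linear scan).
static int
sketch_index_find(const sketch_cache_index *idx, const uint8_t *key)
{
   for (unsigned i = 0; i < idx->count; i++) {
      if (memcmp(idx->keys[i], key, SKETCH_KEY_SIZE) == 0)
         return (int) i;
   }
   return -1;
}

// Insert `key`; once the index is full, overwrite a randomly chosen slot.
static int
sketch_index_put(sketch_cache_index *idx, const uint8_t *key)
{
   int slot = sketch_index_find(idx, key);
   if (slot >= 0)
      return slot;

   if (idx->count < SKETCH_MAX_ENTRIES)
      slot = (int) idx->count++;
   else
      slot = rand() % (int) SKETCH_MAX_ENTRIES;   // random replacement

   memcpy(idx->keys[slot], key, SKETCH_KEY_SIZE);
   return slot;
}
```

With a 1024-entry limit, inserting the 1299 distinct shaders mentioned for Shadowrun Returns necessarily evicts at least 275 of them, so a second run of the same application would miss on those.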
Re: [Mesa-dev] [PATCH 04/13] glsl: add double type
On Wed, Feb 4, 2015 at 11:27 PM, Ilia Mirkin imir...@alum.mit.edu wrote:
> From: Dave Airlie airl...@redhat.com
>
> This just adds a placeholder for the GLSL_TYPE_DOUBLE. This causes a lot
> of warnings about unchecked type in switch statements - fix them later.

This isn't used until patch 7. Why don't you just squash it into 7?
[Mesa-dev] [Bug 88967] Black textures Radeon m HD 5650
https://bugs.freedesktop.org/show_bug.cgi?id=88967

--- Comment #14 from Grimdoll worm-...@yandex.ru ---
Ok, thanks, now I installed it and:

grimdoll@grimdoll-Aspire-5740:~ locate libtxc_dxtn.so | xargs file
/usr/lib/x86_64-linux-gnu/libtxc_dxtn.so: symbolic link to `/etc/alternatives/libtxc-dxtn-x86_64-linux-gnu'
[Mesa-dev] [PATCH 01/13] glapi: add ARB_gpu_shader_fp64 (v2)
From: Dave Airlie airl...@redhat.com Just add the xml file covering this extension, and dummy interface files in mesa, and fix up sanity tests. v2: Enable ProgramUniform*d* from ARB_separate_shader_objects (Ian) use 40 instead of 43 for dispatch_sanity.cpp (Chris) uncomment PU sanity tests. Signed-off-by: Dave Airlie airl...@redhat.com --- src/mapi/glapi/gen/ARB_gpu_shader_fp64.xml | 143 +++ src/mapi/glapi/gen/ARB_separate_shader_objects.xml | 2 - src/mapi/glapi/gen/Makefile.am | 1 + src/mapi/glapi/gen/gl_API.xml | 2 + src/mesa/main/tests/dispatch_sanity.cpp| 70 src/mesa/main/uniforms.c | 195 + src/mesa/main/uniforms.h | 89 ++ 7 files changed, 465 insertions(+), 37 deletions(-) create mode 100644 src/mapi/glapi/gen/ARB_gpu_shader_fp64.xml diff --git a/src/mapi/glapi/gen/ARB_gpu_shader_fp64.xml b/src/mapi/glapi/gen/ARB_gpu_shader_fp64.xml new file mode 100644 index 000..4f860ef --- /dev/null +++ b/src/mapi/glapi/gen/ARB_gpu_shader_fp64.xml @@ -0,0 +1,143 @@ +?xml version=1.0? +!DOCTYPE OpenGLAPI SYSTEM gl_API.dtd + +OpenGLAPI + +category name=GL_ARB_gpu_shader_fp64 number=89 + +function name=Uniform1d offset=assign +param name=location type=GLint/ +param name=x type=GLdouble/ +/function + +function name=Uniform2d offset=assign +param name=location type=GLint/ +param name=x type=GLdouble/ +param name=y type=GLdouble/ +/function + +function name=Uniform3d offset=assign +param name=location type=GLint/ +param name=x type=GLdouble/ +param name=y type=GLdouble/ +param name=z type=GLdouble/ +/function + +function name=Uniform4d offset=assign +param name=location type=GLint/ +param name=x type=GLdouble/ +param name=y type=GLdouble/ +param name=z type=GLdouble/ +param name=w type=GLdouble/ +/function + +function name=Uniform1dv offset=assign +param name=location type=GLint/ +param name=count type=GLsizei/ +param name=value type=const GLdouble */ +/function + +function name=Uniform2dv offset=assign +param name=location type=GLint/ +param name=count type=GLsizei/ +param 
name=value type=const GLdouble */ +/function + +function name=Uniform3dv offset=assign +param name=location type=GLint/ +param name=count type=GLsizei/ +param name=value type=const GLdouble */ +/function + +function name=Uniform4dv offset=assign +param name=location type=GLint/ +param name=count type=GLsizei/ +param name=value type=const GLdouble */ +/function + +function name=UniformMatrix2dv offset=assign +param name=location type=GLint/ +param name=count type=GLsizei/ +param name=transpose type=GLboolean/ +param name=value type=const GLdouble */ +/function + +function name=UniformMatrix3dv offset=assign +param name=location type=GLint/ +param name=count type=GLsizei/ +param name=transpose type=GLboolean/ +param name=value type=const GLdouble */ +/function + +function name=UniformMatrix4dv offset=assign +param name=location type=GLint/ +param name=count type=GLsizei/ +param name=transpose type=GLboolean/ +param name=value type=const GLdouble */ +/function + +function name=UniformMatrix2x3dv offset=assign +param name=location type=GLint/ +param name=count type=GLsizei/ +param name=transpose type=GLboolean/ +param name=value type=const GLdouble */ +/function + +function name=UniformMatrix2x4dv offset=assign +param name=location type=GLint/ +param name=count type=GLsizei/ +param name=transpose type=GLboolean/ +param name=value type=const GLdouble */ +/function + +function name=UniformMatrix3x2dv offset=assign +param name=location type=GLint/ +param name=count type=GLsizei/ +param name=transpose type=GLboolean/ +param name=value type=const GLdouble */ +/function + +function name=UniformMatrix3x4dv offset=assign +param name=location type=GLint/ +param name=count type=GLsizei/ +param name=transpose type=GLboolean/ +param name=value type=const GLdouble */ +/function + +function name=UniformMatrix4x2dv offset=assign +param name=location type=GLint/ +param name=count type=GLsizei/ +param name=transpose type=GLboolean/ +param name=value type=const GLdouble */ +/function + 
+function name=UniformMatrix4x3dv offset=assign +param name=location type=GLint/ +param name=count type=GLsizei/ +param name=transpose type=GLboolean/ +param name=value type=const GLdouble
[Mesa-dev] [PATCH 02/13] mesa: add ARB_gpu_shader_fp64 extension info (v2)
From: Dave Airlie airl...@redhat.com

This just adds the entries to extensions.c and mtypes.h

v2: use core profile only (Ian)

Signed-off-by: Dave Airlie airl...@redhat.com
---
 src/mesa/main/extensions.c | 1 +
 src/mesa/main/mtypes.h | 1 +
 2 files changed, 2 insertions(+)

diff --git a/src/mesa/main/extensions.c b/src/mesa/main/extensions.c
index 220b220..51b16e6 100644
--- a/src/mesa/main/extensions.c
+++ b/src/mesa/main/extensions.c
@@ -121,6 +121,7 @@ static const struct extension extension_table[] = {
    { "GL_ARB_framebuffer_sRGB",    o(EXT_framebuffer_sRGB),    GL,  1998 },
    { "GL_ARB_get_program_binary",  o(dummy_true),              GL,  2010 },
    { "GL_ARB_gpu_shader5",         o(ARB_gpu_shader5),         GLC, 2010 },
+   { "GL_ARB_gpu_shader_fp64",     o(ARB_gpu_shader_fp64),     GLC, 2010 },
    { "GL_ARB_half_float_pixel",    o(dummy_true),              GL,  2003 },
    { "GL_ARB_half_float_vertex",   o(ARB_half_float_vertex),   GL,  2008 },
    { "GL_ARB_instanced_arrays",    o(ARB_instanced_arrays),    GL,  2008 },
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 1c33ef4..1c09fc4 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -3748,6 +3748,7 @@ struct gl_extensions
    GLboolean ARB_explicit_uniform_location;
    GLboolean ARB_geometry_shader4;
    GLboolean ARB_gpu_shader5;
+   GLboolean ARB_gpu_shader_fp64;
    GLboolean ARB_half_float_vertex;
    GLboolean ARB_instanced_arrays;
    GLboolean ARB_internalformat_query;
--
2.0.5
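The two hunks above go together: the table row names a flag by its offset inside the extensions struct, and the struct gains the matching boolean. A cut-down sketch of that pattern follows, assuming o() expands to an offsetof on the flags struct. All sketch_* names are hypothetical stand-ins, and the real table carries more columns (API set, year) than shown here.

```cpp
#include <stddef.h>
#include <string.h>

// Hypothetical, cut-down stand-in for the struct of per-extension flags.
struct sketch_extensions {
   unsigned char ARB_gpu_shader5;
   unsigned char ARB_gpu_shader_fp64;
};

struct sketch_extension_entry {
   const char *name;   // string advertised to applications
   size_t offset;      // byte offset of the enable flag in sketch_extensions
};

#define o(x) offsetof(struct sketch_extensions, x)

static const sketch_extension_entry sketch_table[] = {
   { "GL_ARB_gpu_shader5",     o(ARB_gpu_shader5) },
   { "GL_ARB_gpu_shader_fp64", o(ARB_gpu_shader_fp64) },
};

// Set the flag for `name`; returns false if the name is not in the table.
static bool
sketch_enable(sketch_extensions *ext, const char *name)
{
   for (size_t i = 0; i < sizeof(sketch_table) / sizeof(sketch_table[0]); i++) {
      if (strcmp(sketch_table[i].name, name) == 0) {
         ((unsigned char *) ext)[sketch_table[i].offset] = 1;
         return true;
      }
   }
   return false;
}
```

A driver-side caller would then do something like sketch_enable(&ext, "GL_ARB_gpu_shader_fp64") and test ext.ARB_gpu_shader_fp64 wherever the feature is gated.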
[Mesa-dev] [PATCH 03/13] mesa: add mesa_type_is_double helper function (v2)
From: Dave Airlie airl...@gmail.com

This is a helper to return if a type is based on a double.

v2: GLboolean->bool (Ian)

Reviewed-by: Ian Romanick ian.d.roman...@intel.com
Signed-off-by: Dave Airlie airl...@redhat.com
---
 src/mesa/program/prog_parameter.h | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/src/mesa/program/prog_parameter.h b/src/mesa/program/prog_parameter.h
index 6b3b3c2..bcbe142 100644
--- a/src/mesa/program/prog_parameter.h
+++ b/src/mesa/program/prog_parameter.h
@@ -151,6 +151,28 @@ _mesa_lookup_parameter_constant(const struct gl_program_parameter_list *list,
                                 const gl_constant_value v[], GLuint vSize,
                                 GLint *posOut, GLuint *swizzleOut);
 
+static INLINE bool mesa_type_is_double(int dataType)
+{
+   switch (dataType) {
+   case GL_DOUBLE:
+   case GL_DOUBLE_VEC2:
+   case GL_DOUBLE_VEC3:
+   case GL_DOUBLE_VEC4:
+   case GL_DOUBLE_MAT2:
+   case GL_DOUBLE_MAT2x3:
+   case GL_DOUBLE_MAT2x4:
+   case GL_DOUBLE_MAT3:
+   case GL_DOUBLE_MAT3x2:
+   case GL_DOUBLE_MAT3x4:
+   case GL_DOUBLE_MAT4:
+   case GL_DOUBLE_MAT4x2:
+   case GL_DOUBLE_MAT4x3:
+      return GL_TRUE;
+   default:
+      return GL_FALSE;
+   }
+}
+
 #ifdef __cplusplus
 }
 #endif
--
2.0.5
[Mesa-dev] [PATCH 04/13] glsl: add double type
From: Dave Airlie airl...@redhat.com

This just adds a placeholder for the GLSL_TYPE_DOUBLE. This causes a lot of warnings about unchecked type in switch statements - fix them later.

Signed-off-by: Dave Airlie airl...@redhat.com
---
 src/glsl/glsl_types.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/glsl/glsl_types.h b/src/glsl/glsl_types.h
index 441015c..f0d4ea8 100644
--- a/src/glsl/glsl_types.h
+++ b/src/glsl/glsl_types.h
@@ -51,6 +51,7 @@ enum glsl_base_type {
    GLSL_TYPE_UINT = 0,
    GLSL_TYPE_INT,
    GLSL_TYPE_FLOAT,
+   GLSL_TYPE_DOUBLE,
    GLSL_TYPE_BOOL,
    GLSL_TYPE_SAMPLER,
    GLSL_TYPE_IMAGE,
--
2.0.5
[Mesa-dev] [PATCH 12/13] glsl: lower double optional passes (v2)
From: Dave Airlie airl...@gmail.com These lowering passes are optional for the backend to request, currently the TGSI softpipe backend most likely the r600g backend would want to use these passes as is. They aim to hit the gallium opcodes from the standard rounding/truncation functions. v2: also lower floor in mod_to_floor Signed-off-by: Dave Airlie airl...@redhat.com --- src/glsl/ir_optimization.h | 1 + src/glsl/lower_instructions.cpp | 212 2 files changed, 213 insertions(+) diff --git a/src/glsl/ir_optimization.h b/src/glsl/ir_optimization.h index 912d910..9f91e2f 100644 --- a/src/glsl/ir_optimization.h +++ b/src/glsl/ir_optimization.h @@ -41,6 +41,7 @@ #define CARRY_TO_ARITH 0x200 #define BORROW_TO_ARITH0x400 #define SAT_TO_CLAMP 0x800 +#define DOPS_TO_DFRAC 0x1000 /** * \see class lower_packing_builtins_visitor diff --git a/src/glsl/lower_instructions.cpp b/src/glsl/lower_instructions.cpp index 140b6d4..bf45c95 100644 --- a/src/glsl/lower_instructions.cpp +++ b/src/glsl/lower_instructions.cpp @@ -42,6 +42,7 @@ * - CARRY_TO_ARITH * - BORROW_TO_ARITH * - SAT_TO_CLAMP + * - DOPS_TO_DFRAC * * SUB_TO_ADD_NEG: * --- @@ -112,6 +113,9 @@ * - * Converts ir_unop_saturate into min(max(x, 0.0), 1.0) * + * DOPS_TO_DFRAC: + * -- + * Converts double trunc, ceil, floor, round to fract */ #include main/core.h /* for M_LOG2E */ @@ -151,6 +155,11 @@ private: void sat_to_clamp(ir_expression *); void double_dot_to_fma(ir_expression *); void double_lrp(ir_expression *); + void dceil_to_dfrac(ir_expression *); + void dfloor_to_dfrac(ir_expression *); + void dround_even_to_dfrac(ir_expression *); + void dtrunc_to_dfrac(ir_expression *); + void dsign_to_csel(ir_expression *); }; } /* anonymous namespace */ @@ -315,6 +324,9 @@ lower_instructions_visitor::mod_to_floor(ir_expression *ir) ir_expression *const floor_expr = new(ir) ir_expression(ir_unop_floor, x-type, div_expr); + if (lowering(DOPS_TO_DFRAC) ir-type-is_double()) + dfloor_to_dfrac(floor_expr); + ir_expression *const mul_expr 
= new(ir) ir_expression(ir_binop_mul, new(ir) ir_dereference_variable(y), @@ -596,6 +608,182 @@ lower_instructions_visitor::double_lrp(ir_expression *ir) this-progress = true; } +void +lower_instructions_visitor::dceil_to_dfrac(ir_expression *ir) +{ + /* +* frtemp = frac(x); +* temp = sub(x, frtemp); +* result = temp + ((frtemp != 0.0) ? 1.0 : 0.0); +*/ + ir_instruction i = *base_ir; + ir_constant *zero = new(ir) ir_constant(0.0, ir-operands[0]-type-vector_elements); + ir_constant *one = new(ir) ir_constant(1.0, ir-operands[0]-type-vector_elements); + ir_variable *frtemp = new(ir) ir_variable(ir-operands[0]-type, frtemp, + ir_var_temporary); + ir_variable *temp = new(ir) ir_variable(ir-operands[0]-type, temp, + ir_var_temporary); + ir_variable *t2 = new(ir) ir_variable(ir-operands[0]-type, t2, + ir_var_temporary); + + i.insert_before(frtemp); + i.insert_before(assign(frtemp, fract(ir-operands[0]))); + + i.insert_before(temp); + i.insert_before(assign(temp, sub(ir-operands[0]-clone(ir, NULL), frtemp))); + + i.insert_before(t2); + i.insert_before(assign(t2, csel(nequal(frtemp, zero), one, zero-clone(ir, NULL; + ir-operation = ir_binop_add; + ir-operands[0] = new(ir) ir_dereference_variable(temp); + ir-operands[1] = new(ir) ir_dereference_variable(t2); +} + +void +lower_instructions_visitor::dfloor_to_dfrac(ir_expression *ir) +{ + /* +* frtemp = frac(x); +* result = sub(x, frtemp); +*/ + ir_instruction i = *base_ir; + ir_variable *frtemp = new(ir) ir_variable(ir-operands[0]-type, frtemp, + ir_var_temporary); + + i.insert_before(frtemp); + i.insert_before(assign(frtemp, fract(ir-operands[0]-clone(ir, NULL; + + ir-operation = ir_binop_sub; + ir-operands[1] = new(ir) ir_dereference_variable(frtemp); +} +void +lower_instructions_visitor::dround_even_to_dfrac(ir_expression *ir) +{ + /* +* insane but works +* temp = x + 0.5; +* frtemp = frac(temp); +* t2 = sub(temp, frtemp); +* if (frac(x) == 0.5) +* result = frac(t2 * 0.5) == 0 ? 
t2 : t2 - 1; +* else +* result = t2; + +*/ + const unsigned vec_elem = ir-type-vector_elements; + const glsl_type *bvec = glsl_type::get_instance(GLSL_TYPE_BOOL, vec_elem, 1); + ir_instruction i = *base_ir; + ir_variable *frtemp = new(ir) ir_variable(ir-operands[0]-type, frtemp, + ir_var_temporary); + ir_variable *temp = new(ir) ir_variable(ir-operands[0]-type, temp, + ir_var_temporary); + ir_variable
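
The arithmetic spelled out in the comments of the patch above can be checked on the CPU with a small sketch. This is plain scalar C++ and not the IR-building code from the patch: sketch_fract stands in for a hardware fract opcode (GLSL defines fract(x) as x - floor(x)), all sketch_* names are hypothetical, and trunc is left out because the archived message is cut off before that function.

```cpp
#include <cmath>

// Stand-in for a hardware "fract" opcode: GLSL defines fract(x) = x - floor(x),
// so the result is always in [0, 1), also for negative x.
static double sketch_fract(double x) { return x - std::floor(x); }

// floor(x) = x - fract(x)
static double sketch_floor(double x) { return x - sketch_fract(x); }

// ceil(x) = (x - fract(x)) + (fract(x) != 0 ? 1 : 0)
static double sketch_ceil(double x)
{
   double frtemp = sketch_fract(x);
   double temp = x - frtemp;
   return temp + (frtemp != 0.0 ? 1.0 : 0.0);
}

// roundEven(x): round half up first, then step back by one when x was an
// exact tie and the rounded value came out odd.
static double sketch_round_even(double x)
{
   double temp = x + 0.5;
   double frtemp = sketch_fract(temp);
   double t2 = temp - frtemp;
   if (sketch_fract(x) == 0.5)
      return sketch_fract(t2 * 0.5) == 0.0 ? t2 : t2 - 1.0;
   return t2;
}
```

For example, sketch_round_even(2.5) first rounds up to 3, sees that 3 is odd (fract(1.5) is not zero) and returns 2, while sketch_round_even(3.5) returns 4 unchanged.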
[Mesa-dev] [PATCH 13/13] glsl: add a lowering pass for frexp/ldexp with double arguments
Signed-off-by: Ilia Mirkin imir...@alum.mit.edu --- src/glsl/ir_optimization.h | 1 + src/glsl/lower_instructions.cpp | 279 +++- 2 files changed, 279 insertions(+), 1 deletion(-) diff --git a/src/glsl/ir_optimization.h b/src/glsl/ir_optimization.h index 9f91e2f..7eb861a 100644 --- a/src/glsl/ir_optimization.h +++ b/src/glsl/ir_optimization.h @@ -42,6 +42,7 @@ #define BORROW_TO_ARITH0x400 #define SAT_TO_CLAMP 0x800 #define DOPS_TO_DFRAC 0x1000 +#define DFREXP_DLDEXP_TO_ARITH0x2000 /** * \see class lower_packing_builtins_visitor diff --git a/src/glsl/lower_instructions.cpp b/src/glsl/lower_instructions.cpp index bf45c95..248265d 100644 --- a/src/glsl/lower_instructions.cpp +++ b/src/glsl/lower_instructions.cpp @@ -38,6 +38,7 @@ * - LOG_TO_LOG2 * - MOD_TO_FLOOR * - LDEXP_TO_ARITH + * - DFREXP_TO_ARITH * - BITFIELD_INSERT_TO_BFM_BFI * - CARRY_TO_ARITH * - BORROW_TO_ARITH @@ -91,7 +92,12 @@ * * LDEXP_TO_ARITH: * - - * Converts ir_binop_ldexp to arithmetic and bit operations. + * Converts ir_binop_ldexp to arithmetic and bit operations for float sources. + * + * DFREXP_DLDEXP_TO_ARITH: + * --- + * Converts ir_binop_ldexp, ir_unop_frexp_sig, and ir_unop_frexp_exp to + * arithmetic and bit ops for double arguments. * * BITFIELD_INSERT_TO_BFM_BFI: * --- @@ -150,6 +156,9 @@ private: void log_to_log2(ir_expression *); void bitfield_insert_to_bfm_bfi(ir_expression *); void ldexp_to_arith(ir_expression *); + void dldexp_to_arith(ir_expression *); + void dfrexp_sig_to_arith(ir_expression *); + void dfrexp_exp_to_arith(ir_expression *); void carry_to_arith(ir_expression *); void borrow_to_arith(ir_expression *); void sat_to_clamp(ir_expression *); @@ -483,6 +492,262 @@ lower_instructions_visitor::ldexp_to_arith(ir_expression *ir) } void +lower_instructions_visitor::dldexp_to_arith(ir_expression *ir) +{ + /* See ldexp_to_arith for structure. Uses frexp_exp to extract the exponent +* from the significand. 
+*/ + + const unsigned vec_elem = ir-type-vector_elements; + + /* Types */ + const glsl_type *ivec = glsl_type::get_instance(GLSL_TYPE_INT, vec_elem, 1); + const glsl_type *bvec = glsl_type::get_instance(GLSL_TYPE_BOOL, vec_elem, 1); + + /* Constants */ + ir_constant *zeroi = ir_constant::zero(ir, ivec); + + ir_constant *sign_mask = new(ir) ir_constant(0x8000u); + + ir_constant *exp_shift = new(ir) ir_constant(20); + ir_constant *exp_width = new(ir) ir_constant(11); + ir_constant *exp_bias = new(ir) ir_constant(1022, vec_elem); + + /* Temporary variables */ + ir_variable *x = new(ir) ir_variable(ir-type, x, ir_var_temporary); + ir_variable *exp = new(ir) ir_variable(ivec, exp, ir_var_temporary); + + ir_variable *zero_sign_x = new(ir) ir_variable(ir-type, zero_sign_x, + ir_var_temporary); + + ir_variable *extracted_biased_exp = + new(ir) ir_variable(ivec, extracted_biased_exp, ir_var_temporary); + ir_variable *resulting_biased_exp = + new(ir) ir_variable(ivec, resulting_biased_exp, ir_var_temporary); + + ir_variable *is_not_zero_or_underflow = + new(ir) ir_variable(bvec, is_not_zero_or_underflow, ir_var_temporary); + + ir_instruction i = *base_ir; + + /* Copy x and exp arguments. */ + i.insert_before(x); + i.insert_before(assign(x, ir-operands[0])); + i.insert_before(exp); + i.insert_before(assign(exp, ir-operands[1])); + + ir_expression *frexp_exp = expr(ir_unop_frexp_exp, x); + if (lowering(DFREXP_DLDEXP_TO_ARITH)) + dfrexp_exp_to_arith(frexp_exp); + + /* Extract the biased exponent from x. */ + i.insert_before(extracted_biased_exp); + i.insert_before(assign(extracted_biased_exp, add(frexp_exp, exp_bias))); + + i.insert_before(resulting_biased_exp); + i.insert_before(assign(resulting_biased_exp, + add(extracted_biased_exp, exp))); + + /* Test if result is ±0.0, subnormal, or underflow by checking if the +* resulting biased exponent would be less than 0x1. If so, the result is +* 0.0 with the sign of x. 
(Actually, invert the conditions so that +* immediate values are the second arguments, which is better for i965) +* TODO: Implement in a vector fashion. +*/ + i.insert_before(zero_sign_x); + for (unsigned elem = 0; elem vec_elem; elem++) { + ir_variable *unpacked = + new(ir) ir_variable(glsl_type::uvec2_type, unpacked, ir_var_temporary); + i.insert_before(unpacked); + i.insert_before( +assign(unpacked, + expr(ir_unop_unpack_double_2x32, swizzle(x, elem, 1; + i.insert_before(assign(unpacked, bit_and(swizzle_y(unpacked), sign_mask-clone(ir, NULL)), + WRITEMASK_Y)); + i.insert_before(assign(unpacked, ir_constant::zero(ir, glsl_type::uint_type), WRITEMASK_X));
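
The constants in the patch above (shift 20, width 11, bias 1022) describe where the exponent of a double sits in the upper 32-bit word once the value is unpacked into two words. A scalar sketch of that bit manipulation follows, assuming IEEE 754 binary64 doubles. The sketch_* names are hypothetical, zero and subnormal inputs are flushed to signed zero, and overflow of the exponent field is deliberately not handled; this is an illustration, not the lowering pass itself.

```cpp
#include <stdint.h>
#include <string.h>

static const int SKETCH_EXP_SHIFT = 20;   // exponent starts at bit 20 of the high word
static const int SKETCH_EXP_WIDTH = 11;   // 11 exponent bits
static const int SKETCH_EXP_BIAS = 1022;  // bias that yields the frexp() convention

// Split a double into two 32-bit words: out[0] is the low word, out[1] the high word.
static void sketch_unpack(double d, uint32_t out[2])
{
   uint64_t bits;
   memcpy(&bits, &d, sizeof bits);
   out[0] = (uint32_t) bits;
   out[1] = (uint32_t) (bits >> 32);
}

static double sketch_pack(const uint32_t in[2])
{
   uint64_t bits = ((uint64_t) in[1] << 32) | in[0];
   double d;
   memcpy(&d, &bits, sizeof d);
   return d;
}

// Exponent e such that x = m * 2^e with 0.5 <= |m| < 1, for normal nonzero x.
static int sketch_frexp_exp(double x)
{
   uint32_t w[2];
   sketch_unpack(x, w);
   int biased = (int) ((w[1] >> SKETCH_EXP_SHIFT) & ((1u << SKETCH_EXP_WIDTH) - 1u));
   return biased - SKETCH_EXP_BIAS;
}

// x * 2^exp by rewriting the exponent field; results that would be zero,
// subnormal or underflow become 0.0 with the sign of x. Overflow is not handled.
static double sketch_ldexp(double x, int exp)
{
   uint32_t w[2];
   sketch_unpack(x, w);
   int extracted_biased_exp = sketch_frexp_exp(x) + SKETCH_EXP_BIAS;
   int resulting_biased_exp = extracted_biased_exp + exp;

   if (extracted_biased_exp == 0 || resulting_biased_exp < 1) {
      w[0] = 0;
      w[1] &= 0x80000000u;   // keep only the sign bit
      return sketch_pack(w);
   }

   uint32_t mask = ((1u << SKETCH_EXP_WIDTH) - 1u) << SKETCH_EXP_SHIFT;
   w[1] = (w[1] & ~mask) | ((uint32_t) resulting_biased_exp << SKETCH_EXP_SHIFT);
   return sketch_pack(w);
}
```

For 8.0 the exponent field holds 1026, so sketch_frexp_exp returns 4, matching 8.0 = 0.5 * 2^4.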
[Mesa-dev] [PATCH 10/13] glsl/lower_instructions: add double lowering passes
From: Dave Airlie airl...@gmail.com

This lowers double dot product and lrp to fma.

Signed-off-by: Dave Airlie airl...@redhat.com
---
 src/glsl/lower_instructions.cpp | 83 +
 1 file changed, 83 insertions(+)

diff --git a/src/glsl/lower_instructions.cpp b/src/glsl/lower_instructions.cpp
index 5c1d6aa..140b6d4 100644
--- a/src/glsl/lower_instructions.cpp
+++ b/src/glsl/lower_instructions.cpp
@@ -115,6 +115,7 @@
  */
 
 #include "main/core.h" /* for M_LOG2E */
+#include "program/prog_instruction.h" /* for swizzle */
 #include "glsl_types.h"
 #include "ir.h"
 #include "ir_builder.h"
@@ -148,6 +149,8 @@ private:
    void carry_to_arith(ir_expression *);
    void borrow_to_arith(ir_expression *);
    void sat_to_clamp(ir_expression *);
+   void double_dot_to_fma(ir_expression *);
+   void double_lrp(ir_expression *);
 };
 
 } /* anonymous namespace */
@@ -521,10 +524,90 @@ lower_instructions_visitor::sat_to_clamp(ir_expression *ir)
    this->progress = true;
 }
 
+void
+lower_instructions_visitor::double_dot_to_fma(ir_expression *ir)
+{
+   ir_variable *temp = new(ir) ir_variable(ir->operands[0]->type->get_base_type(), "dot_res",
+                                           ir_var_temporary);
+   this->base_ir->insert_before(temp);
+
+   int nc = ir->operands[0]->type->components();
+   for (int i = nc - 1; i >= 1; i--) {
+      ir_assignment *assig;
+      if (i == (nc - 1)) {
+         assig = assign(temp, mul(swizzle(ir->operands[0]->clone(ir, NULL), i, 1),
+                                  swizzle(ir->operands[1]->clone(ir, NULL), i, 1)));
+      } else {
+         assig = assign(temp, fma(swizzle(ir->operands[0]->clone(ir, NULL), i, 1),
+                                  swizzle(ir->operands[1]->clone(ir, NULL), i, 1),
+                                  temp));
+      }
+      this->base_ir->insert_before(assig);
+   }
+
+   ir->operation = ir_triop_fma;
+   ir->operands[0] = swizzle(ir->operands[0], 0, 1);
+   ir->operands[1] = swizzle(ir->operands[1], 0, 1);
+   ir->operands[2] = new(ir) ir_dereference_variable(temp);
+
+   this->progress = true;
+
+}
+
+void
+lower_instructions_visitor::double_lrp(ir_expression *ir)
+{
+   ir_assignment *assig;
+   ir_constant *one = new(ir) ir_constant(1.0, ir->operands[2]->type->vector_elements);
+   ir_variable *temp = new(ir) ir_variable(ir->operands[0]->type, "lrp_res",
+                                           ir_var_temporary);
+   ir_variable *t2 = new(ir) ir_variable(ir->operands[0]->type, "aval",
+                                         ir_var_temporary);
+   int swizval;
+   this->base_ir->insert_before(temp);
+   this->base_ir->insert_before(t2);
+
+   assig = assign(temp, mul(sub(one, ir->operands[2]), ir->operands[0]));
+   this->base_ir->insert_before(assig);
+
+   switch (ir->operands[2]->type->vector_elements) {
+   case 1:
+      swizval = SWIZZLE_;
+      break;
+   case 2:
+      swizval = MAKE_SWIZZLE4(SWIZZLE_X, SWIZZLE_Y, SWIZZLE_X, SWIZZLE_X);
+      break;
+   case 3:
+      swizval = MAKE_SWIZZLE4(SWIZZLE_X, SWIZZLE_Y, SWIZZLE_Z, SWIZZLE_X);
+      break;
+   case 4:
+   default:
+      swizval = SWIZZLE_XYZW;
+      break;
+   }
+   assig = assign(t2, swizzle(ir->operands[2], swizval, ir->operands[0]->type->vector_elements));
+   this->base_ir->insert_before(assig);
+
+   ir->operation = ir_triop_fma;
+   ir->operands[0] = new(ir) ir_dereference_variable(t2);
+   ir->operands[1] = ir->operands[1];
+   ir->operands[2] = new(ir) ir_dereference_variable(temp);
+
+   this->progress = true;
+}
+
 ir_visitor_status
 lower_instructions_visitor::visit_leave(ir_expression *ir)
 {
    switch (ir->operation) {
+   case ir_binop_dot:
+      if (ir->operands[0]->type->is_double())
+         double_dot_to_fma(ir);
+      break;
+   case ir_triop_lrp:
+      if (ir->operands[0]->type->is_double())
+         double_lrp(ir);
+      break;
    case ir_binop_sub:
       if (lowering(SUB_TO_ADD_NEG))
          sub_to_add_neg(ir);
--
2.0.5
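What the two lowerings in this patch compute can be written out on scalars as follows. This is a sketch of the arithmetic only, using std::fma, and not the IR rewriting itself; sketch_dot and sketch_mix are hypothetical names. One deliberate difference: temp starts at zero here so that the one-component case is well defined.

```cpp
#include <cmath>

// dot(a, b) over n >= 1 components, as a multiply followed by a chain of
// fused multiply-adds running from the last component down to the first.
static double sketch_dot(const double *a, const double *b, int n)
{
   double temp = 0.0;   // assumption of this sketch: zero for the n == 1 case
   for (int i = n - 1; i >= 1; i--) {
      if (i == n - 1)
         temp = a[i] * b[i];
      else
         temp = std::fma(a[i], b[i], temp);
   }
   return std::fma(a[0], b[0], temp);
}

// mix(x, y, a) = x * (1 - a) + y * a, as one multiply and one fma.
static double sketch_mix(double x, double y, double a)
{
   double temp = (1.0 - a) * x;
   return std::fma(a, y, temp);
}
```

For a three-component dot product this issues one multiply and two fmas, which is the shape of the instruction sequence the pass inserts before the rewritten expression.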
[Mesa-dev] [PATCH 11/13] glsl: implement double builtin functions
From: Dave Airlie airl...@gmail.com This implements the bulk of the builtin functions for fp64 support. Signed-off-by: Dave Airlie airl...@redhat.com --- src/glsl/builtin_functions.cpp | 751 +++-- 1 file changed, 492 insertions(+), 259 deletions(-) diff --git a/src/glsl/builtin_functions.cpp b/src/glsl/builtin_functions.cpp index bb7fbcd..fb31dad 100644 --- a/src/glsl/builtin_functions.cpp +++ b/src/glsl/builtin_functions.cpp @@ -381,6 +381,12 @@ gs_streams(const _mesa_glsl_parse_state *state) return gpu_shader5(state) gs_only(state); } +static bool +fp64(const _mesa_glsl_parse_state *state) +{ + return state-has_double(); +} + /** @} */ /**/ @@ -436,6 +442,7 @@ private: ir_constant *imm(float f, unsigned vector_elements=1); ir_constant *imm(int i, unsigned vector_elements=1); ir_constant *imm(unsigned u, unsigned vector_elements=1); + ir_constant *imm(double d, unsigned vector_elements=1); ir_constant *imm(const glsl_type *type, const ir_constant_data ); ir_dereference_variable *var_ref(ir_variable *var); ir_dereference_array *array_ref(ir_variable *var, int i); @@ -526,29 +533,29 @@ private: B1(log) B1(exp2) B1(log2) - B1(sqrt) - B1(inversesqrt) - B1(abs) - B1(sign) - B1(floor) - B1(trunc) - B1(round) - B1(roundEven) - B1(ceil) - B1(fract) + BA1(sqrt) + BA1(inversesqrt) + BA1(abs) + BA1(sign) + BA1(floor) + BA1(trunc) + BA1(round) + BA1(roundEven) + BA1(ceil) + BA1(fract) B2(mod) - B1(modf) + BA1(modf) BA2(min) BA2(max) BA2(clamp) - B2(mix_lrp) + BA2(mix_lrp) ir_function_signature *_mix_sel(builtin_available_predicate avail, const glsl_type *val_type, const glsl_type *blend_type); - B2(step) - B2(smoothstep) - B1(isnan) - B1(isinf) + BA2(step) + BA2(smoothstep) + BA1(isnan) + BA1(isinf) B1(floatBitsToInt) B1(floatBitsToUint) B1(intBitsToFloat) @@ -563,24 +570,27 @@ private: ir_function_signature *_unpackSnorm4x8(builtin_available_predicate avail); ir_function_signature *_packHalf2x16(builtin_available_predicate avail); ir_function_signature 
*_unpackHalf2x16(builtin_available_predicate avail); - B1(length) - B1(distance); - B1(dot); - B1(cross); - B1(normalize); + ir_function_signature *_packDouble2x32(builtin_available_predicate avail); + ir_function_signature *_unpackDouble2x32(builtin_available_predicate avail); + + BA1(length) + BA1(distance); + BA1(dot); + BA1(cross); + BA1(normalize); B0(ftransform); - B1(faceforward); - B1(reflect); - B1(refract); - B1(matrixCompMult); - B1(outerProduct); - B0(determinant_mat2); - B0(determinant_mat3); - B0(determinant_mat4); - B0(inverse_mat2); - B0(inverse_mat3); - B0(inverse_mat4); - B1(transpose); + BA1(faceforward); + BA1(reflect); + BA1(refract); + BA1(matrixCompMult); + BA1(outerProduct); + BA1(determinant_mat2); + BA1(determinant_mat3); + BA1(determinant_mat4); + BA1(inverse_mat2); + BA1(inverse_mat3); + BA1(inverse_mat4); + BA1(transpose); BA1(lessThan); BA1(lessThanEqual); BA1(greaterThan); @@ -644,9 +654,10 @@ private: B1(bitCount) B1(findLSB) B1(findMSB) - B1(fma) + BA1(fma) B2(ldexp) B2(frexp) + B2(dfrexp) B1(uaddCarry) B1(usubBorrow) B1(mulExtended) @@ -815,6 +826,42 @@ builtin_builder::create_builtins() _##NAME(glsl_type::vec4_type), \ NULL); +#define FD(NAME) \ + add_function(#NAME, \ +_##NAME(always_available, glsl_type::float_type), \ +_##NAME(always_available, glsl_type::vec2_type), \ +_##NAME(always_available, glsl_type::vec3_type), \ +_##NAME(always_available, glsl_type::vec4_type), \ +_##NAME(fp64, glsl_type::double_type), \ +_##NAME(fp64, glsl_type::dvec2_type),\ +_##NAME(fp64, glsl_type::dvec3_type), \ +_##NAME(fp64, glsl_type::dvec4_type), \ +NULL); + +#define FD130(NAME) \ + add_function(#NAME, \ +_##NAME(v130, glsl_type::float_type), \ +_##NAME(v130, glsl_type::vec2_type), \ +_##NAME(v130, glsl_type::vec3_type), \ +_##NAME(v130, glsl_type::vec4_type), \ +_##NAME(fp64, glsl_type::double_type), \ +_##NAME(fp64, glsl_type::dvec2_type),\ +_##NAME(fp64, glsl_type::dvec3_type), \ +_##NAME(fp64, glsl_type::dvec4_type), \ +NULL); + +#define 
FDGS5(NAME) \ +
[Mesa-dev] [PATCH 08/13] glsl: validate output types for shader stages
From: Tapani Pälli tapani.pa...@intel.com

Patch fixes Piglit test:
   arb_gpu_shader_fp64/preprocessor/fs-output-double.frag

and adds additional validation for shader outputs.

Signed-off-by: Tapani Pälli tapani.pa...@intel.com
Signed-off-by: Dave Airlie airl...@redhat.com
---
 src/glsl/ast_to_hir.cpp | 47 +++
 1 file changed, 47 insertions(+)

diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index b2f9165..bd37e19 100644
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -3623,6 +3623,53 @@ ast_declarator_list::hir(exec_list *instructions,
             handle_geometry_shader_input_decl(state, loc, var);
          }
+      } else if (var->data.mode == ir_var_shader_out) {
+         const glsl_type *check_type = var->type;
+         while (check_type->is_array())
+            check_type = check_type->element_type();
+
+         /* From section 4.3.6 (Output variables) of the GLSL 4.40 spec:
+          *
+          *     It is a compile-time error to declare a vertex, tessellation
+          *     evaluation, tessellation control, or geometry shader output
+          *     that contains any of the following:
+          *
+          *     * A Boolean type (bool, bvec2 ...)
+          *     * An opaque type
+          */
+         if (check_type->is_boolean() || check_type->contains_opaque())
+            _mesa_glsl_error(&loc, state,
+                             "%s shader output cannot have type %s",
+                             _mesa_shader_stage_to_string(state->stage),
+                             check_type->name);
+
+         /* From section 4.3.6 (Output variables) of the GLSL 4.40 spec:
+          *
+          *     It is a compile-time error to declare a fragment shader output
+          *     that contains any of the following:
+          *
+          *     * A Boolean type (bool, bvec2 ...)
+          *     * A double-precision scalar or vector (double, dvec2 ...)
+          *     * An opaque type
+          *     * Any matrix type
+          *     * A structure
+          */
+         if (state->stage == MESA_SHADER_FRAGMENT) {
+            if (check_type->is_record() || check_type->is_matrix())
+               _mesa_glsl_error(&loc, state,
+                                "fragment shader output "
+                                "cannot have struct or array type");
+            switch (check_type->base_type) {
+            case GLSL_TYPE_UINT:
+            case GLSL_TYPE_INT:
+            case GLSL_TYPE_FLOAT:
+               break;
+            default:
+               _mesa_glsl_error(&loc, state,
+                                "fragment shader output cannot have "
+                                "type %s", check_type->name);
+            }
+         }
       }

       /* Integer fragment inputs must be qualified with 'flat'. In GLSL ES,
--
2.0.5
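For readers following the validation logic, here is a minimal standalone sketch of the same two-step check: strip all array levels, then test the remaining type against the per-stage rules. The struct type, the enum values and both function names are hypothetical stand-ins, not Mesa's glsl_type API, and the opaque-type test is reduced to a single sampler case rather than a recursive contains_opaque().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for glsl_type; not Mesa's real class. */
enum base_type {
   TYPE_UINT, TYPE_INT, TYPE_FLOAT, TYPE_DOUBLE,
   TYPE_BOOL, TYPE_SAMPLER, TYPE_STRUCT, TYPE_ARRAY
};

struct type {
   enum base_type base;
   bool is_matrix;
   const struct type *element;   /* only meaningful when base == TYPE_ARRAY */
};

/* Strip every level of array nesting, like the while loop in the patch. */
static const struct type *
strip_arrays(const struct type *t)
{
   assert(t != NULL);
   while (t->base == TYPE_ARRAY)
      t = t->element;
   return t;
}

/* All stages: no booleans, no opaque types (only samplers modelled here).
 * Fragment stage: additionally only uint, int and float non-matrix types.
 */
static bool
output_type_allowed(const struct type *t, bool is_fragment_stage)
{
   const struct type *check = strip_arrays(t);

   if (check->base == TYPE_BOOL || check->base == TYPE_SAMPLER)
      return false;
   if (!is_fragment_stage)
      return true;
   if (check->base == TYPE_STRUCT || check->is_matrix)
      return false;
   return check->base == TYPE_UINT ||
          check->base == TYPE_INT ||
          check->base == TYPE_FLOAT;
}
```

With this model, an array of arrays of double is accepted as a vertex shader output but rejected as a fragment shader output, which is the case the Piglit test above exercises.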
[Mesa-dev] [PATCH 06/13] glsl: add ARB_gpu_shader_fp64 to the glsl extensions. (v2)
From: Dave Airlie airl...@redhat.com v2: add define bit (Tapani Pälli) Patch makes following Piglit tests pass: arb_gpu_shader_fp64/preprocessor/define.vert arb_gpu_shader_fp64/preprocessor/define.frag Reviewed-by: Ian Romanick ian.d.roman...@intel.com Signed-off-by: Dave Airlie airl...@redhat.com --- src/glsl/glcpp/glcpp-parse.y| 3 +++ src/glsl/glsl_parser_extras.cpp | 1 + src/glsl/glsl_parser_extras.h | 2 ++ src/glsl/standalone_scaffolding.cpp | 1 + 4 files changed, 7 insertions(+) diff --git a/src/glsl/glcpp/glcpp-parse.y b/src/glsl/glcpp/glcpp-parse.y index e5bebe5..c2f5223 100644 --- a/src/glsl/glcpp/glcpp-parse.y +++ b/src/glsl/glcpp/glcpp-parse.y @@ -2445,6 +2445,9 @@ _glcpp_parser_handle_version_declaration(glcpp_parser_t *parser, intmax_t versio if (extensions-ARB_gpu_shader5) add_builtin_define(parser, GL_ARB_gpu_shader5, 1); + if (extensions-ARB_gpu_shader_fp64) + add_builtin_define(parser, GL_ARB_gpu_shader_fp64, 1); + if (extensions-AMD_vertex_shader_layer) add_builtin_define(parser, GL_AMD_vertex_shader_layer, 1); diff --git a/src/glsl/glsl_parser_extras.cpp b/src/glsl/glsl_parser_extras.cpp index ccdf031..cb19ce1 100644 --- a/src/glsl/glsl_parser_extras.cpp +++ b/src/glsl/glsl_parser_extras.cpp @@ -527,6 +527,7 @@ static const _mesa_glsl_extension _mesa_glsl_supported_extensions[] = { EXT(ARB_fragment_coord_conventions, true, false, ARB_fragment_coord_conventions), EXT(ARB_fragment_layer_viewport,true, false, ARB_fragment_layer_viewport), EXT(ARB_gpu_shader5,true, false, ARB_gpu_shader5), + EXT(ARB_gpu_shader_fp64,true, false, ARB_gpu_shader_fp64), EXT(ARB_sample_shading, true, false, ARB_sample_shading), EXT(ARB_separate_shader_objects,true, false, dummy_true), EXT(ARB_shader_atomic_counters, true, false, ARB_shader_atomic_counters), diff --git a/src/glsl/glsl_parser_extras.h b/src/glsl/glsl_parser_extras.h index 843fdae..dafee4e 100644 --- a/src/glsl/glsl_parser_extras.h +++ b/src/glsl/glsl_parser_extras.h @@ -414,6 +414,8 @@ struct 
_mesa_glsl_parse_state { bool ARB_fragment_layer_viewport_warn; bool ARB_gpu_shader5_enable; bool ARB_gpu_shader5_warn; + bool ARB_gpu_shader_fp64_enable; + bool ARB_gpu_shader_fp64_warn; bool ARB_sample_shading_enable; bool ARB_sample_shading_warn; bool ARB_separate_shader_objects_enable; diff --git a/src/glsl/standalone_scaffolding.cpp b/src/glsl/standalone_scaffolding.cpp index 67b0d0c..ad0d75b 100644 --- a/src/glsl/standalone_scaffolding.cpp +++ b/src/glsl/standalone_scaffolding.cpp @@ -127,6 +127,7 @@ void initialize_context_to_defaults(struct gl_context *ctx, gl_api api) ctx-Extensions.ARB_fragment_coord_conventions = true; ctx-Extensions.ARB_fragment_layer_viewport = true; ctx-Extensions.ARB_gpu_shader5 = true; + ctx-Extensions.ARB_gpu_shader_fp64 = true; ctx-Extensions.ARB_sample_shading = true; ctx-Extensions.ARB_shader_bit_encoding = true; ctx-Extensions.ARB_shader_stencil_export = true; -- 2.0.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 07/13] glsl: add double support
From: Dave Airlie airl...@gmail.com This adds the guts of the fp64 implementation to the GLSL compiler. - builtin double types - double constant support - lexer parsing for double types (lf, LF) - enforcing flat on double fs inputs - double operations (d2f,f2d, pack/unpack, frexp - in 2 parts) - ir builder bits. - double constant expression handling v2: add has_double check (Ian) add d2i, i2d, d2u, u2d (Tapani + Ian) remove extra -type setting (Ian) v3: include fixes from Tapani and Topi Signed-off-by: Dave Airlie airl...@redhat.com --- src/glsl/ast.h | 2 + src/glsl/ast_function.cpp | 67 -- src/glsl/ast_to_hir.cpp| 38 +- src/glsl/builtin_type_macros.h | 16 +++ src/glsl/builtin_types.cpp | 30 + src/glsl/glsl_lexer.ll | 42 +- src/glsl/glsl_parser.yy| 33 - src/glsl/glsl_parser_extras.cpp| 4 + src/glsl/glsl_parser_extras.h | 5 + src/glsl/glsl_types.cpp| 109 --- src/glsl/glsl_types.h | 18 ++- src/glsl/ir.cpp| 104 ++- src/glsl/ir.h | 21 +++ src/glsl/ir_builder.cpp| 23 src/glsl/ir_builder.h | 5 + src/glsl/ir_clone.cpp | 1 + src/glsl/ir_constant_expression.cpp| 234 - src/glsl/ir_print_visitor.cpp | 11 ++ src/glsl/ir_set_program_inouts.cpp | 24 +++- src/glsl/ir_validate.cpp | 61 - src/glsl/link_uniform_initializers.cpp | 7 +- src/glsl/link_uniforms.cpp | 8 +- src/glsl/link_varyings.cpp | 3 +- src/glsl/loop_controls.cpp | 19 ++- src/glsl/lower_mat_op_to_vec.cpp | 2 + src/glsl/lower_ubo_reference.cpp | 13 +- src/glsl/opt_constant_propagation.cpp | 3 + src/glsl/opt_minmax.cpp| 13 ++ src/mesa/program/ir_to_mesa.cpp| 10 ++ 29 files changed, 825 insertions(+), 101 deletions(-) diff --git a/src/glsl/ast.h b/src/glsl/ast.h index 6995ae8..ef74e51 100644 --- a/src/glsl/ast.h +++ b/src/glsl/ast.h @@ -189,6 +189,7 @@ enum ast_operators { ast_uint_constant, ast_float_constant, ast_bool_constant, + ast_double_constant, ast_sequence, ast_aggregate @@ -236,6 +237,7 @@ public: float float_constant; unsigned uint_constant; int bool_constant; + double double_constant; } primary_expression; 
diff --git a/src/glsl/ast_function.cpp b/src/glsl/ast_function.cpp index cbff9d8..b3dc861 100644 --- a/src/glsl/ast_function.cpp +++ b/src/glsl/ast_function.cpp @@ -573,6 +573,9 @@ convert_component(ir_rvalue *src, const glsl_type *desired_type) result = new(ctx) ir_expression(ir_unop_i2u, new(ctx) ir_expression(ir_unop_b2i, src)); break; + case GLSL_TYPE_DOUBLE: +result = new(ctx) ir_expression(ir_unop_d2u, src); +break; } break; case GLSL_TYPE_INT: @@ -586,6 +589,9 @@ convert_component(ir_rvalue *src, const glsl_type *desired_type) case GLSL_TYPE_BOOL: result = new(ctx) ir_expression(ir_unop_b2i, src); break; + case GLSL_TYPE_DOUBLE: +result = new(ctx) ir_expression(ir_unop_d2i, src); +break; } break; case GLSL_TYPE_FLOAT: @@ -599,6 +605,9 @@ convert_component(ir_rvalue *src, const glsl_type *desired_type) case GLSL_TYPE_BOOL: result = new(ctx) ir_expression(ir_unop_b2f, desired_type, src, NULL); break; + case GLSL_TYPE_DOUBLE: +result = new(ctx) ir_expression(ir_unop_d2f, desired_type, src, NULL); +break; } break; case GLSL_TYPE_BOOL: @@ -613,8 +622,28 @@ convert_component(ir_rvalue *src, const glsl_type *desired_type) case GLSL_TYPE_FLOAT: result = new(ctx) ir_expression(ir_unop_f2b, desired_type, src, NULL); break; + case GLSL_TYPE_DOUBLE: +result = new(ctx) ir_expression(ir_unop_f2b, + new(ctx) ir_expression(ir_unop_d2f, src)); +break; } break; + case GLSL_TYPE_DOUBLE: + switch (b) { + case GLSL_TYPE_INT: + result = new(ctx) ir_expression(ir_unop_i2d, src); +break; + case GLSL_TYPE_UINT: + result = new(ctx) ir_expression(ir_unop_u2d, src); +break; + case GLSL_TYPE_BOOL: + result = new(ctx) ir_expression(ir_unop_f2d, + new(ctx) ir_expression(ir_unop_b2f, src)); +break; + case GLSL_TYPE_FLOAT: +result = new(ctx) ir_expression(ir_unop_f2d, desired_type, src, NULL); +break; + } } assert(result != NULL); @@ -711,9 +740,9 @@ process_vec_mat_constructor(exec_list *instructions, /* Apply implicit conversions (not the scalar constructor rules!). See * the spec
[Mesa-dev] [PATCH 05/13] mesa: add double uniform support. (v4)
From: Dave Airlie airl...@redhat.com This adds support for the new uniform interfaces from ARB_gpu_shader_fp64. v2: support ARB_separate_shader_objects ProgramUniform*d* (Ian) don't allow boolean uniforms to be updated (issue 15) (Ian) v3: fix size_mul v4: Teach uniform update to take into account double precision (Topi) Signed-off-by: Dave Airlie airl...@redhat.com --- src/mesa/main/uniform_query.cpp | 27 +++--- src/mesa/main/uniforms.c | 185 ++ src/mesa/main/uniforms.h | 3 +- src/mesa/program/ir_to_mesa.cpp | 17 +++- src/mesa/program/prog_parameter.c | 16 ++-- 5 files changed, 210 insertions(+), 38 deletions(-) diff --git a/src/mesa/main/uniform_query.cpp b/src/mesa/main/uniform_query.cpp index d36f506..7db8c36 100644 --- a/src/mesa/main/uniform_query.cpp +++ b/src/mesa/main/uniform_query.cpp @@ -469,6 +469,9 @@ log_uniform(const void *values, enum glsl_base_type basicType, case GLSL_TYPE_FLOAT: printf(%g , v[i].f); break; + case GLSL_TYPE_DOUBLE: +printf(%g , *(double* )v[i * 2].f); +break; default: assert(!Should not get here.); break; @@ -529,11 +532,11 @@ _mesa_propagate_uniforms_to_driver_storage(struct gl_uniform_storage *uni, */ const unsigned components = MAX2(1, uni-type-vector_elements); const unsigned vectors = MAX2(1, uni-type-matrix_columns); - + const int dmul = uni-type-base_type == GLSL_TYPE_DOUBLE ? 2 : 1; /* Store the data in the driver's requested type in the driver's storage * areas. */ - unsigned src_vector_byte_stride = components * 4; + unsigned src_vector_byte_stride = components * 4 * dmul; for (i = 0; i uni-num_driver_storage; i++) { struct gl_uniform_driver_storage *const store = uni-driver_storage[i]; @@ -608,6 +611,7 @@ _mesa_uniform(struct gl_context *ctx, struct gl_shader_program *shProg, unsigned src_components) { unsigned offset; + int size_mul = basicType == GLSL_TYPE_DOUBLE ? 
2 : 1; struct gl_uniform_storage *const uni = validate_uniform_parameters(ctx, shProg, location, count, @@ -615,15 +619,13 @@ _mesa_uniform(struct gl_context *ctx, struct gl_shader_program *shProg, if (uni == NULL) return; - /* Verify that the types are compatible. -*/ const unsigned components = uni-type-is_sampler() ? 1 : uni-type-vector_elements; bool match; switch (uni-type-base_type) { case GLSL_TYPE_BOOL: - match = true; + match = (basicType != GLSL_TYPE_DOUBLE); break; case GLSL_TYPE_SAMPLER: case GLSL_TYPE_IMAGE: @@ -710,8 +712,8 @@ _mesa_uniform(struct gl_context *ctx, struct gl_shader_program *shProg, /* Store the data in the actual type backing storage for the uniform. */ if (!uni-type-is_boolean()) { - memcpy(uni-storage[components * offset], values, -sizeof(uni-storage[0]) * components * count); + memcpy(uni-storage[size_mul * components * offset], values, +sizeof(uni-storage[0]) * components * count * size_mul); } else { const union gl_constant_value *src = (const union gl_constant_value *) values; @@ -808,13 +810,14 @@ extern C void _mesa_uniform_matrix(struct gl_context *ctx, struct gl_shader_program *shProg, GLuint cols, GLuint rows, GLint location, GLsizei count, - GLboolean transpose, const GLfloat *values) + GLboolean transpose, + const GLvoid *values, GLenum type) { unsigned offset; unsigned vectors; unsigned components; unsigned elements; - + int size_mul = mesa_type_is_double(type) ? 
2 : 1; struct gl_uniform_storage *const uni = validate_uniform_parameters(ctx, shProg, location, count, offset, glUniformMatrix); @@ -852,7 +855,7 @@ _mesa_uniform_matrix(struct gl_context *ctx, struct gl_shader_program *shProg, } if (unlikely(ctx-_Shader-Flags GLSL_UNIFORMS)) { - log_uniform(values, GLSL_TYPE_FLOAT, components, vectors, count, + log_uniform(values, uni-type-base_type, components, vectors, count, bool(transpose), shProg, location, uni); } @@ -879,11 +882,11 @@ _mesa_uniform_matrix(struct gl_context *ctx, struct gl_shader_program *shProg, if (!transpose) { memcpy(uni-storage[elements * offset], values, -sizeof(uni-storage[0]) * elements * count); +sizeof(uni-storage[0]) * elements * count * size_mul); } else { /* Copy and transpose the matrix. */ - const float *src = values; + const float *src = (const float *)values; float *dst = uni-storage[elements * offset].f; for (int i = 0; i count; i++) { diff --git a/src/mesa/main/uniforms.c b/src/mesa/main/uniforms.c index 2a6bd4b..8c1cb9a 100644 --- a/src/mesa/main/uniforms.c +++
[Mesa-dev] [PATCH 09/13] glsl: enable/disable certain lowering passes for doubles
From: Dave Airlie airl...@gmail.com

We want to restrict some lowering passes to floats only, and enable other for doubles.

Signed-off-by: Dave Airlie airl...@redhat.com
---
 src/glsl/lower_instructions.cpp | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/glsl/lower_instructions.cpp b/src/glsl/lower_instructions.cpp
index 09afe55..5c1d6aa 100644
--- a/src/glsl/lower_instructions.cpp
+++ b/src/glsl/lower_instructions.cpp
@@ -306,7 +306,7 @@ lower_instructions_visitor::mod_to_floor(ir_expression *ir)
    /* Don't generate new IR that would need to be lowered in an additional
     * pass.
     */
-   if (lowering(DIV_TO_MUL_RCP))
+   if (lowering(DIV_TO_MUL_RCP) && ir->type->is_float())
       div_to_mul_rcp(div_expr);

    ir_expression *const floor_expr =
@@ -548,7 +548,7 @@ lower_instructions_visitor::visit_leave(ir_expression *ir)
       break;

    case ir_binop_mod:
-      if (lowering(MOD_TO_FLOOR) && ir->type->is_float())
+      if (lowering(MOD_TO_FLOOR) && (ir->type->is_float() || ir->type->is_double()))
          mod_to_floor(ir);
       break;

@@ -563,7 +563,7 @@ lower_instructions_visitor::visit_leave(ir_expression *ir)
       break;

    case ir_binop_ldexp:
-      if (lowering(LDEXP_TO_ARITH))
+      if (lowering(LDEXP_TO_ARITH) && ir->type->is_float())
          ldexp_to_arith(ir);
       break;

--
2.0.5
Re: [Mesa-dev] [PATCH 05/13] mesa: add double uniform support. (v4)
On Wed, Feb 4, 2015 at 11:27 PM, Ilia Mirkin imir...@alum.mit.edu wrote:
> diff --git a/src/mesa/program/prog_parameter.c b/src/mesa/program/prog_parameter.c
> index 0ef4641..e1bbc00 100644
> --- a/src/mesa/program/prog_parameter.c
> +++ b/src/mesa/program/prog_parameter.c
> @@ -111,7 +111,13 @@ _mesa_add_parameter(struct gl_program_parameter_list *paramList,
>                      const gl_state_index state[STATE_LENGTH])
>  {
>     const GLuint oldNum = paramList->NumParameters;
> -   const GLuint sz4 = (size + 3) / 4; /* no. of new param slots needed */
> +   GLuint sz4 = (size + 3) / 4; /* no. of new param slots needed */
> +   int actual_size = size;
> +
> +   if (mesa_type_is_double(datatype)) {
> +      actual_size *= 2;
> +      sz4 = ((actual_size + 3) / 4);
> +   }
>
>     assert(size > 0);

Does this really need to be updated? I thought this code was just for ARB vp/fp programs.

> @@ -150,15 +156,15 @@ _mesa_add_parameter(struct gl_program_parameter_list *paramList,
>           struct gl_program_parameter *p = paramList->Parameters + oldNum + i;
>           p->Name = name ? _mesa_strdup(name) : NULL;
>           p->Type = type;
> -         p->Size = size;
> +         p->Size = actual_size;
>           p->DataType = datatype;
>           if (values) {
> -            if (size >= 4) {
> +            if (actual_size >= 4) {
>                 COPY_4V(paramList->ParameterValues[oldNum + i], values);
>              } else {
>                 /* copy 1, 2 or 3 values */
> -               GLuint remaining = size % 4;
> +               GLuint remaining = actual_size % 4;
>                 assert(remaining < 4);
>                 for (j = 0; j < remaining; j++) {
>                    paramList->ParameterValues[oldNum + i][j].f = values[j].f;
> @@ -176,7 +182,7 @@ _mesa_add_parameter(struct gl_program_parameter_list *paramList,
>              for (j = 0; j < 4; j++)
>                 paramList->ParameterValues[oldNum + i][j].f = 0;
>           }
> -         size -= 4;
> +         actual_size -= 4;
>        }
>
>        if (state) {
> --
> 2.0.5
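As a small illustration of the slot arithmetic in the quoted hunk: each parameter slot holds four 32-bit components, and a double occupies two of them, so the component count is doubled before rounding up to whole slots. The function name param_slots and the is_double flag are hypothetical; the flag merely stands in for the patch's mesa_type_is_double(datatype) test.

```c
#include <assert.h>
#include <stdbool.h>

/* Number of 4-component parameter slots needed for `size` components.
 * `is_double` is a hypothetical stand-in for mesa_type_is_double(datatype).
 */
static unsigned
param_slots(unsigned size, bool is_double)
{
   assert(size > 0);

   /* A double takes two 32-bit components. */
   unsigned actual_size = is_double ? size * 2 : size;

   /* Round up to a whole number of 4-component slots. */
   return (actual_size + 3) / 4;
}
```

For example, a vec3 fits in one slot while a dvec3 needs two, and a 4x4 matrix goes from four slots to eight.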
Re: [Mesa-dev] [PATCH 03/13] mesa: add mesa_type_is_double helper function (v2)
On Wed, Feb 4, 2015 at 11:27 PM, Ilia Mirkin imir...@alum.mit.edu wrote:
> From: Dave Airlie airl...@gmail.com
>
> This is a helper to return if a type is based on a double.
>
> v2: GLboolean->bool (Ian)
>
> Reviewed-by: Ian Romanick ian.d.roman...@intel.com
> Signed-off-by: Dave Airlie airl...@redhat.com
> ---
>  src/mesa/program/prog_parameter.h | 22 ++
>  1 file changed, 22 insertions(+)
>
> diff --git a/src/mesa/program/prog_parameter.h b/src/mesa/program/prog_parameter.h
> index 6b3b3c2..bcbe142 100644
> --- a/src/mesa/program/prog_parameter.h
> +++ b/src/mesa/program/prog_parameter.h
> @@ -151,6 +151,28 @@ _mesa_lookup_parameter_constant(const struct gl_program_parameter_list *list,
>                                  const gl_constant_value v[], GLuint vSize,
>                                  GLint *posOut, GLuint *swizzleOut);
>
> +static INLINE bool mesa_type_is_double(int dataType)

Other places in this file we use 'inline' so it's safe to use here.

Since it is static, I don't think we need the 'mesa_' prefix, but if we want the prefix shouldn't it start with an underscore?
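For illustration only, a helper of the kind under review might look roughly like the sketch below. The body of the real function is not quoted in this message, so everything here is an assumption: the enum values are hypothetical placeholders rather than the real GL type enums, and type_is_double / type_size_mul are made-up names (following the review suggestion of plain 'inline' and no 'mesa_' prefix).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical placeholders; the real helper would switch on GL enums. */
enum data_type {
   DT_FLOAT, DT_FLOAT_VEC4, DT_INT,
   DT_DOUBLE, DT_DOUBLE_VEC2, DT_DOUBLE_VEC3, DT_DOUBLE_VEC4, DT_DOUBLE_MAT4
};

/* Returns true if the type is built from double-precision components. */
static inline bool
type_is_double(int data_type)
{
   switch (data_type) {
   case DT_DOUBLE:
   case DT_DOUBLE_VEC2:
   case DT_DOUBLE_VEC3:
   case DT_DOUBLE_VEC4:
   case DT_DOUBLE_MAT4:
      return true;
   default:
      return false;
   }
}

/* Storage multiplier: one double occupies two 32-bit slots. */
static inline unsigned
type_size_mul(int data_type)
{
   return type_is_double(data_type) ? 2u : 1u;
}
```

The multiplier corresponds to how the uniform patch in this series scales its copy sizes (size_mul).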
Re: [Mesa-dev] [PATCH 06/13] glsl: add ARB_gpu_shader_fp64 to the glsl extensions. (v2)
Reviewed-by: Matt Turner matts...@gmail.com
Re: [Mesa-dev] [PATCH 1/3] i965/fs: Implement the WaCMPInstFlagDepClearedEarly work-around.
Matt Turner matts...@gmail.com writes:

> Prevents piglit regressions from the next patch.
> ---
>  src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 37 +-
>  1 file changed, 36 insertions(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> index 77d4908..8cd36f8 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> @@ -1734,7 +1734,42 @@ fs_generator::generate_code(const cfg_t *cfg, int dispatch_width)
>           brw_F16TO32(p, dst, src[0]);
>           break;
>        case BRW_OPCODE_CMP:
> -         brw_CMP(p, dst, inst->conditional_mod, src[0], src[1]);
> +         /* The Ivybridge/BayTrail WaCMPInstFlagDepClearedEarly workaround says
> +          * that when the destination is a GRF that the dependency-clear bit on
> +          * the flag register is cleared early.
> +          *
> +          * Suggested workarounds are to disable coissuing CMP instructions
> +          * or to split CMP(16) instructions into two CMP(8) instructions.
> +          *
> +          * We choose to split into CMP(8) instructions since disabling
> +          * coissuing would affect CMP instructions not otherwise affected by
> +          * the errata.
> +          */
> +         if (dispatch_width == 16 && brw->gen == 7 && !brw->is_haswell) {
> +            if (dst.file == BRW_GENERAL_REGISTER_FILE) {
> +               brw_set_default_compression_control(p, BRW_COMPRESSION_NONE);
> +               brw_CMP(p, firsthalf(dst), inst->conditional_mod,
> +                       firsthalf(src[0]), firsthalf(src[1]));
> +               brw_set_default_compression_control(p, BRW_COMPRESSION_2NDHALF);
> +               brw_CMP(p, sechalf(dst), inst->conditional_mod,
> +                       sechalf(src[0]), sechalf(src[1]));
> +               brw_set_default_compression_control(p, BRW_COMPRESSION_COMPRESSED);
> +
> +               multiple_instructions_emitted = true;
> +            } else if (dst.file == BRW_ARCHITECTURE_REGISTER_FILE) {
> +               /* For unknown reasons, the aforementioned workaround is not
> +                * sufficient. Overriding the type when the destination is the
> +                * null register is necessary but not sufficient by itself.
> +                */
> +               assert(dst.nr == BRW_ARF_NULL);
> +               dst.type = BRW_REGISTER_TYPE_D;
> +               brw_CMP(p, dst, inst->conditional_mod, src[0], src[1]);

What do you mean by "not sufficient"? This is quite a common use-case of the CMP instruction... Any idea what should be done?

> +            } else {
> +               unreachable("not reached");
> +            }
> +         } else {
> +            brw_CMP(p, dst, inst->conditional_mod, src[0], src[1]);
> +         }
>           break;
>        case BRW_OPCODE_SEL:
>           brw_SEL(p, dst, src[0], src[1]);
> --
> 2.0.4
Re: [Mesa-dev] [PATCH v2] Fixing an x86 FPU bug.
Hi Marius, This patch does indeed make the Piglit test pass but it doesn't seem like a complete fix. It looks like there are still a number of places that are copying via floats that the test wouldn't catch. There are also lots of places that are still using GLfloat to store the attribute values and then these are cast to gl_constant_value* whenever they are used in combination with the places that the patch does change. I think if we wanted to do a complete patch then we should really change all places that are storing attribute values to use gl_constant_value to make it clear that they aren't really just floats. If it is too much of a change just to fix this relatively minor bug then perhaps it would be better to do a simpler change that just fixes this one case using a memcpy or something instead of changing so much code. I don't really like the idea of only changing some of the places to gl_constant_value and leaving the rest as GLfloat because it seems even more confusing that way. I don't think it makes sense to add this patch to the stable branch. The bug has presumably been there since the introduction of integer attributes in 2012 (see acf438f537) and nobody has complained so it doesn't seem particularly urgent. The patch is non-trivial so I think there is a risk it will introduce more bugs. I've made some comments inline below. marius.pre...@intel.com writes: From: Marius Predut marius.pre...@intel.com On 32-bit, for floating point operations is used x86 FPU registers instead SSE, reason for when reinterprets an integer as a float result is unexpected (modify floats when they are written to memory). The defect was checked with and without -O3 compiler flag. 
CC: mesa-sta...@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82668 Signed-off-by: Marius Predut marius.pre...@intel.com --- src/mesa/main/context.c |3 ++- src/mesa/main/macros.h| 47 - src/mesa/vbo/vbo_attrib_tmp.h | 20 ++ src/mesa/vbo/vbo_exec.h |3 ++- src/mesa/vbo/vbo_exec_api.c | 31 +-- src/mesa/vbo/vbo_exec_eval.c | 22 ++- src/mesa/vbo/vbo_save_api.c | 16 +++--- src/mesa/vbo/vbo_save_draw.c |4 ++-- 8 files changed, 90 insertions(+), 56 deletions(-) diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c index 400c158..11ab8a9 100644 --- a/src/mesa/main/context.c +++ b/src/mesa/main/context.c @@ -134,6 +134,7 @@ #include math/m_matrix.h #include main/dispatch.h /* for _gloffset_COUNT */ #include uniforms.h +#include macros.h #ifdef USE_SPARC_ASM #include sparc/sparc.h @@ -656,7 +657,7 @@ _mesa_init_constants(struct gl_constants *consts, gl_api api) consts-MaxSamples = 0; /* GLSL default if NativeIntegers == FALSE */ - consts-UniformBooleanTrue = FLT_AS_UINT(1.0f); + consts-UniformBooleanTrue = FLOAT_AS_UNION(1.0f).u; /* GL_ARB_sync */ consts-MaxServerWaitTimeout = 0x1fff7fffULL; diff --git a/src/mesa/main/macros.h b/src/mesa/main/macros.h index cd5f2d6..2651ffc 100644 --- a/src/mesa/main/macros.h +++ b/src/mesa/main/macros.h @@ -32,6 +32,7 @@ #define MACROS_H #include imports.h +#include program/prog_parameter.h /** @@ -170,27 +171,26 @@ extern GLfloat _mesa_ubyte_to_float_color_tab[256]; ub = ((GLubyte) F_TO_I((f) * 255.0F)) #endif -static inline GLfloat INT_AS_FLT(GLint i) +static union gl_constant_value UINT_AS_UNION(GLuint u) { - fi_type tmp; - tmp.i = i; - return tmp.f; + union gl_constant_value tmp; + tmp.u = u; + return tmp; } -static inline GLfloat UINT_AS_FLT(GLuint u) +static inline union gl_constant_value INT_AS_UNION(GLint i) { - fi_type tmp; - tmp.u = u; - return tmp.f; + union gl_constant_value tmp; + tmp.i = i; + return tmp; } -static inline unsigned FLT_AS_UINT(float f) +static inline union gl_constant_value 
FLOAT_AS_UNION(GLfloat f) { - fi_type tmp; + union gl_constant_value tmp; tmp.f = f; - return tmp.u; + return tmp; } - /** * Convert a floating point value to an unsigned fixed point value. * @@ -382,6 +382,15 @@ do {\ V[3] = V3; \ } while(0) +/** Assignment union*/ +#define ASSIGN_4V_UNION( V, V0, V1, V2, V3 ) \ +do {\ +V[0].f = V0; \ +V[1].f = V1; \ +V[2].f = V2; \ +V[3].f = V3; \ +} while(0) + I think it would be better not to have this macro and just use the ASSIGN_4V macro to copy the gl_constant_value structs directly. This macro is copying the union values via a floating-point operation which is what we're trying to avoid. Technically it doesn't really matter in this case because
[Mesa-dev] [Bug 88962] [osmesa] Crash on postprocessing if z buffer is NULL
https://bugs.freedesktop.org/show_bug.cgi?id=88962

--- Comment #3 from Alex Deucher alexdeuc...@gmail.com ---
Please send the patches to the mesa-dev mailing list if you haven't already.
[Mesa-dev] Rename mesa/src/util (Was: gallium/util: add u_bit_scan64)
This change broke MinGW/MSVC builds because ffsll is not available there.

There is a ffsll C fallback, but it's in src/mesa/main/imports.[ch]. So rather than duplicating it in src/gallium/auxiliary/util/u_math.h, I'd prefer to move it to src/util.

And here lies the problem: what header name should be used for math helpers? I think the filenames in src/util and the directory itself are poorly named for something that is meant to be included by so many other components:

- there is no unique prefix in most headers
- util/ clashes with src/gallium/auxiliary/util/

Hence I'd like to propose to:

- rename src/util to something unique (e.g., cgrt, for Common Graphics RunTime)

And maybe:

- prefix all header/source files in there with a unique cgrt_* prefix too

And maybe in the future:

- use the cgrt_* prefix for symbols too.

Jose

On 01/02/15 17:15, Marek Olšák wrote:
> From: Marek Olšák marek.ol...@amd.com
>
> Same as u_bit_scan, but for uint64_t.
> ---
>  src/gallium/auxiliary/util/u_math.h | 7 +++
>  1 file changed, 7 insertions(+)
>
> diff --git a/src/gallium/auxiliary/util/u_math.h b/src/gallium/auxiliary/util/u_math.h
> index 19c7343..f5d3487 100644
> --- a/src/gallium/auxiliary/util/u_math.h
> +++ b/src/gallium/auxiliary/util/u_math.h
> @@ -587,6 +587,13 @@ u_bit_scan(unsigned *mask)
>     return i;
>  }
>
> +static INLINE int
> +u_bit_scan64(uint64_t *mask)
> +{
> +   int i = ffsll(*mask) - 1;
> +   *mask &= ~(1llu << i);
> +   return i;
> +}
>
>  /**
>   * Return float bits.
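To make the portability issue concrete, here is a minimal sketch of a bit scan that does not depend on ffsll at all. This is not the fallback from src/mesa/main/imports.[ch]; fallback_ffsll and bit_scan64 are hypothetical names, and the loop is a deliberately simple implementation rather than a fast one.

```c
#include <assert.h>
#include <stdint.h>

/* Portable ffsll replacement: returns the 1-based index of the least
 * significant set bit, or 0 if no bit is set.
 */
static int
fallback_ffsll(uint64_t v)
{
   int bit = 1;

   if (v == 0)
      return 0;
   while ((v & 1) == 0) {
      v >>= 1;
      bit++;
   }
   return bit;
}

/* Same idea as u_bit_scan64 in the quoted patch: return the index of the
 * lowest set bit and clear that bit in the mask. The mask must be non-zero.
 */
static int
bit_scan64(uint64_t *mask)
{
   assert(*mask != 0);

   int i = fallback_ffsll(*mask) - 1;
   *mask &= ~(UINT64_C(1) << i);
   return i;
}
```

A caller would typically loop with while (mask) { int i = bit_scan64(&mask); ... } to visit every set bit once.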
[Mesa-dev] New stable-branch 10.4 candidate pushed
Hello list,

A candidate for the Mesa 10.4.4 release is now available. The current patch queue is as follows:
 - 7 queued
 - 4 nominated (outstanding)
 - and 1 rejected (obsolete) patches

In a nutshell this gives us a couple of build fixes, brings back one of Mario's dri3 patches, addresses an osmesa texture addressing bug and adds a couple of smaller patches.

Take a look at section "Mesa stable queue" for more information.

Testing
-------
The following results are against piglit a68d27e7254.

Changes - classic i965(snb)
---------------------------
None.

Changes - swrast classic, gallium
---------------------------------
None.

Testing reports/general approval
--------------------------------
Any testing reports (or general approval of the state of the branch) will be greatly appreciated.

The plan is to have 10.4.4 this Friday (6th February).

If you have any questions or comments that you would like to share before the release, please go ahead.

Cheers,
Emil

Mesa stable queue
-----------------

Nominated (2)
=============

Mario Kleiner (1):
      glx: Handle out-of-sequence swap completion events correctly.

Marius Predut (1):
      Fixing an x86 FPU bug.

Rejected/Obsolete (1)
=====================

Marius Predut (1):
      Remove UINT_AS_FLT, INT_AS_FLT, FLOAT_AS_FLT macros.No functional changes, only bug fixed.

Queued (7)
==========

Brian Paul (1):
      mesa: fix display list 8-byte alignment issue

José Fonseca (1):
      egl: Pass the correct X visual depth to xcb_put_image().

Mario Kleiner (1):
      glx/dri3: Request non-vsynced Present for swapinterval zero. (v3)

Matt Turner (1):
      gallium/util: Don't use __builtin_clrsb in util_last_bit().

Niels Ole Salscheider (1):
      configure: Link against all LLVM targets when building clover

Park, Jeongmin (1):
      st/osmesa: Fix osbuffer->textures indexing

Ville Syrjälä (1):
      i965: Fix max_wm_threads for CHV
[Mesa-dev] [Bug 88967] Black textures Radeon m HD 5650
https://bugs.freedesktop.org/show_bug.cgi?id=88967

            Bug ID: 88967
           Summary: Black textures Radeon m HD 5650
           Product: Mesa
           Version: unspecified
          Hardware: x86-64 (AMD64)
                OS: Linux (All)
            Status: NEW
          Severity: normal
          Priority: medium
         Component: Other
          Assignee: mesa-dev@lists.freedesktop.org
          Reporter: worm-...@yandex.ru
        QA Contact: mesa-dev@lists.freedesktop.org

After installing Linux Mint KDE 17.1, the xorg driver was used automatically, and textures in games such as X3 Terran Conflict and Star Conflict are black. With the proprietary fglrx driver 2:13.350.1-0ubuntu2, textures look fine, but Shockwave Flash works badly (freezes and other problems). The current version of the xorg driver is 1:7.5.99+git1501140731.c80ea1~gd~t. My video card is an ATI Radeon Mobility HD 5650. I don't know what to do. If I need to attach something, please describe how to do it, thanks.
Re: [Mesa-dev] [PATCH 3/7] glsl: Add initial functions to implement an on-disk cache
Hi;

On 02/04/2015 11:52 PM, Carl Worth wrote:
> From: Kristian Høgsberg k...@bitplanet.net
>
> This code provides for an on-disk cache of objects. Objects are stored
> and retrieved (in ~/.cache/mesa) via names that are arbitrary 20-byte
> sequences, (intended to be SHA-1 hashes of something identifying for
> the content). The cache is limited to a maximum number of entries
> (1024 in this patch), and uses random replacement. These attributes
> are managed via

What would you think about changing this to use some defined maximum size (in MB)? I think for the user, size is what matters, and it could be a configurable option; the number of items seems a bit vague and hard to predict (?)

8<

> +uint8_t *
> +cache_get(struct program_cache *cache, cache_key key, size_t *size)
> +{
> +   int fd, ret, len;
> +   struct stat sb;
> +   char filename[256], *data;
> +
> +   if (size)
> +      *size = 0;
> +
> +   if (!cache_has(cache, key))
> +      return NULL;
> +
> +   get_cache_file(cache, filename, sizeof filename, key);
> +
> +   fd = open(filename, O_RDONLY | O_CLOEXEC);
> +   if (fd == -1)
> +      return NULL;
> +
> +   if (fstat(fd, &sb) == -1) {
> +      close(fd);
> +      return NULL;
> +   }
> +
> +   data = (char *) malloc(sb.st_size);
> +   if (data == NULL) {
> +      close(fd);
> +      return NULL;
> +   }

Will there be some further verification here that the file contents are what is expected, or is this done at a higher level where the cache is called?

// Tapani
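The quoted excerpt stops after the malloc, so as a rough sketch of how such a read path could be completed, the function below reads a whole file into a heap buffer and treats a short read as a failure. It is not the code from the patch: read_whole_file is a hypothetical name, it takes a plain path instead of a cache key, it also rejects empty files, and it does not retry on EINTR. The only content check is the size, so any verification of the data itself would still have to happen elsewhere.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Read a whole file into a malloc'd buffer. On success returns the buffer
 * (to be released with free()) and stores its length in *size. On any
 * failure returns NULL and leaves *size at 0.
 */
static uint8_t *
read_whole_file(const char *filename, size_t *size)
{
   assert(filename != NULL && size != NULL);
   *size = 0;

   int fd = open(filename, O_RDONLY);
   if (fd == -1)
      return NULL;

   struct stat sb;
   if (fstat(fd, &sb) == -1 || sb.st_size <= 0) {
      close(fd);
      return NULL;
   }

   uint8_t *data = malloc((size_t) sb.st_size);
   if (data == NULL) {
      close(fd);
      return NULL;
   }

   /* read() may return fewer bytes than requested, so loop until done. */
   size_t done = 0;
   while (done < (size_t) sb.st_size) {
      ssize_t n = read(fd, data + done, (size_t) sb.st_size - done);
      if (n <= 0) {   /* error, or the file is shorter than fstat reported */
         free(data);
         close(fd);
         return NULL;
      }
      done += (size_t) n;
   }

   close(fd);
   *size = done;
   return data;
}
```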
[Mesa-dev] [PATCH] st/mesa: Don't use PIPE_USAGE_STREAM for GL_PIXEL_UNPACK_BUFFER_ARB
From: Michel Dänzer michel.daen...@amd.com

The latter currently implies CPU access, so we have to avoid getting uncacheable memory.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88658
Cc: 10.3 10.4 mesa-sta...@lists.freedestkop.org
Signed-off-by: Michel Dänzer michel.daen...@amd.com
---
 src/mesa/state_tracker/st_cb_bufferobjects.c | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/src/mesa/state_tracker/st_cb_bufferobjects.c b/src/mesa/state_tracker/st_cb_bufferobjects.c
index 55f3644..90f786c 100644
--- a/src/mesa/state_tracker/st_cb_bufferobjects.c
+++ b/src/mesa/state_tracker/st_cb_bufferobjects.c
@@ -256,8 +256,15 @@ st_bufferobj_data(struct gl_context *ctx,
       break;
    case GL_STREAM_DRAW:
    case GL_STREAM_COPY:
-      pipe_usage = PIPE_USAGE_STREAM;
-      break;
+      /* XXX: Remove this test and fall-through when we have PBO unpacking
+       * acceleration. Right now, PBO unpacking is done by the CPU, so we
+       * have to make sure CPU reads are fast.
+       */
+      if (target != GL_PIXEL_UNPACK_BUFFER_ARB) {
+         pipe_usage = PIPE_USAGE_STREAM;
+         break;
+      }
+      /* fall through */
    case GL_STATIC_READ:
    case GL_DYNAMIC_READ:
    case GL_STREAM_READ:
--
2.1.4
[Mesa-dev] [PATCH 03/10] i965/cfg: Add function to generate a dot file of the CFG.
---
 src/mesa/drivers/dri/i965/brw_cfg.cpp | 14 ++++++++++++++
 src/mesa/drivers/dri/i965/brw_cfg.h   |  1 +
 2 files changed, 15 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_cfg.cpp b/src/mesa/drivers/dri/i965/brw_cfg.cpp
index 068d4d6..2d6a181 100644
--- a/src/mesa/drivers/dri/i965/brw_cfg.cpp
+++ b/src/mesa/drivers/dri/i965/brw_cfg.cpp
@@ -490,3 +490,17 @@ cfg_t::intersect(bblock_t *b1, bblock_t *b2)
    assert(b1);
    return b1;
 }
+
+void
+cfg_t::dump_cfg()
+{
+   printf("digraph CFG {\n");
+   for (int b = 0; b < num_blocks; b++) {
+      bblock_t *block = this->blocks[b];
+
+      foreach_list_typed_safe (bblock_link, child, link, &block->children) {
+         printf("\t%d -> %d\n", b, child->block->num);
+      }
+   }
+   printf("}\n");
+}
diff --git a/src/mesa/drivers/dri/i965/brw_cfg.h b/src/mesa/drivers/dri/i965/brw_cfg.h
index 215f248..4d4eb2d 100644
--- a/src/mesa/drivers/dri/i965/brw_cfg.h
+++ b/src/mesa/drivers/dri/i965/brw_cfg.h
@@ -274,6 +274,7 @@ struct cfg_t {
    static bblock_t *intersect(bblock_t *b1, bblock_t *b2);
 
    void dump(backend_visitor *v);
+   void dump_cfg();
 #endif
    void *mem_ctx;
 
-- 
2.0.4
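For readers who want to try the dot output outside the driver, the same idea can be sketched on a plain adjacency list. The function name `cfg_to_dot` and the `successors` representation are assumptions made for illustration and are not part of the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// successors[b] lists the block numbers that block b can branch to.
// Produces the same "digraph CFG { ... }" text layout as dump_cfg() above,
// but returns it as a string instead of printing it.
std::string cfg_to_dot(const std::vector<std::vector<int>> &successors)
{
   std::ostringstream out;
   out << "digraph CFG {\n";
   for (size_t b = 0; b < successors.size(); b++) {
      for (int child : successors[b])
         out << "\t" << b << " -> " << child << "\n";
   }
   out << "}\n";
   return out.str();
}
```

The resulting text can be saved to a file and rendered with Graphviz, for example with `dot -Tpng cfg.dot -o cfg.png`.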
[Mesa-dev] [PATCH 02/10] i965/cfg: Calculate the immediate dominators.
---
 src/mesa/drivers/dri/i965/brw_cfg.cpp              | 68 +-
 src/mesa/drivers/dri/i965/brw_cfg.h                |  7 ++-
 .../drivers/dri/i965/brw_dead_control_flow.cpp     |  5 +-
 3 files changed, 75 insertions(+), 5 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_cfg.cpp b/src/mesa/drivers/dri/i965/brw_cfg.cpp
index ca5b01c..068d4d6 100644
--- a/src/mesa/drivers/dri/i965/brw_cfg.cpp
+++ b/src/mesa/drivers/dri/i965/brw_cfg.cpp
@@ -51,7 +51,7 @@ link(void *mem_ctx, bblock_t *block)
 }
 
 bblock_t::bblock_t(cfg_t *cfg) :
-   cfg(cfg), start_ip(0), end_ip(0), num(0)
+   cfg(cfg), idom(NULL), start_ip(0), end_ip(0), num(0)
 {
    instructions.make_empty();
    parents.make_empty();
@@ -157,6 +157,7 @@ cfg_t::cfg_t(exec_list *instructions)
    block_list.make_empty();
    blocks = NULL;
    num_blocks = 0;
+   idom_dirty = true;
 
    bblock_t *cur = NULL;
    int ip = 0;
@@ -409,10 +410,13 @@ cfg_t::make_block_array()
 }
 
 void
-cfg_t::dump(backend_visitor *v) const
+cfg_t::dump(backend_visitor *v)
 {
+   if (idom_dirty)
+      calculate_idom();
+
    foreach_block (block, this) {
-      fprintf(stderr, "START B%d", block->num);
+      fprintf(stderr, "START B%d IDOM(B%d)", block->num, block->idom->num);
       foreach_list_typed(bblock_link, link, link, &block->parents) {
          fprintf(stderr, " <-B%d",
                  link->block->num);
@@ -428,3 +432,61 @@ cfg_t::dump(backend_visitor *v) const
       fprintf(stderr, "\n");
    }
 }
+
+/* Calculates the immediate dominator of each block, according to "A Simple,
+ * Fast Dominance Algorithm" by Keith D. Cooper, Timothy J. Harvey, and Ken
+ * Kennedy.
+ *
+ * The authors claim that for control flow graphs of sizes normally encountered
+ * (less than 1000 nodes) that this algorithm is significantly faster than
+ * others like Lengauer-Tarjan.
+ */
+void
+cfg_t::calculate_idom()
+{
+   foreach_block(block, this) {
+      block->idom = NULL;
+   }
+   blocks[0]->idom = blocks[0];
+
+   bool changed;
+   do {
+      changed = false;
+
+      foreach_block(block, this) {
+         if (block->num == 0)
+            continue;
+
+         bblock_t *new_idom = NULL;
+         foreach_list_typed(bblock_link, parent, link, &block->parents) {
+            if (parent->block->idom) {
+               if (new_idom == NULL) {
+                  new_idom = parent->block;
+               } else if (parent->block->idom != NULL) {
+                  new_idom = intersect(parent->block, new_idom);
+               }
+            }
+         }
+
+         if (block->idom != new_idom) {
+            block->idom = new_idom;
+            changed = true;
+         }
+      }
+   } while (changed);
+
+   idom_dirty = false;
+}
+
+bblock_t *
+cfg_t::intersect(bblock_t *b1, bblock_t *b2)
+{
+   while (b1->num != b2->num) {
+      while (b1->num > b2->num)
+         b1 = b1->idom;
+      while (b2->num > b1->num)
+         b2 = b2->idom;
+   }
+   assert(b1);
+   return b1;
+}
diff --git a/src/mesa/drivers/dri/i965/brw_cfg.h b/src/mesa/drivers/dri/i965/brw_cfg.h
index 0b60fec..215f248 100644
--- a/src/mesa/drivers/dri/i965/brw_cfg.h
+++ b/src/mesa/drivers/dri/i965/brw_cfg.h
@@ -81,6 +81,7 @@ struct bblock_t {
 
    struct exec_node link;
    struct cfg_t *cfg;
+   struct bblock_t *idom;
 
    int start_ip;
    int end_ip;
@@ -269,8 +270,10 @@ struct cfg_t {
    bblock_t *new_block();
    void set_next_block(bblock_t **cur, bblock_t *block, int ip);
    void make_block_array();
+   void calculate_idom();
+   static bblock_t *intersect(bblock_t *b1, bblock_t *b2);
 
-   void dump(backend_visitor *v) const;
+   void dump(backend_visitor *v);
 #endif
    void *mem_ctx;
 
@@ -278,6 +281,8 @@ struct cfg_t {
    struct exec_list block_list;
    struct bblock_t **blocks;
    int num_blocks;
+
+   bool idom_dirty;
 };
 
 /* Note that this is implemented with a double for loop -- break will
diff --git a/src/mesa/drivers/dri/i965/brw_dead_control_flow.cpp b/src/mesa/drivers/dri/i965/brw_dead_control_flow.cpp
index 03f838d..4d68e12 100644
--- a/src/mesa/drivers/dri/i965/brw_dead_control_flow.cpp
+++ b/src/mesa/drivers/dri/i965/brw_dead_control_flow.cpp
@@ -114,8 +114,11 @@ dead_control_flow_eliminate(backend_visitor *v)
       }
    }
 
-   if (progress)
+   if (progress) {
       v->invalidate_live_intervals();
+      v->cfg->idom_dirty = true;
+   }
+
 
    return progress;
 }
-- 
2.0.4
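The iterative scheme in this patch can be tried in isolation with a small standalone sketch. It assumes blocks are numbered so that every block's immediate dominator has a smaller number (block 0 is the entry, numbering in reverse postorder), which is what makes walking the higher-numbered block upwards in `intersect` terminate. The names `compute_idom`, `preds` and `UNSET` are hypothetical and not taken from the driver.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

static const int UNSET = -1;   // marks blocks whose idom is not known yet

// Walk both blocks up the (partial) dominator tree until they meet.
static int intersect(const std::vector<int> &idom, int b1, int b2)
{
   while (b1 != b2) {
      while (b1 > b2)
         b1 = idom[b1];
      while (b2 > b1)
         b2 = idom[b2];
   }
   return b1;
}

// preds[b] lists the predecessors of block b. Returns idom[b] for each block;
// the entry block is its own immediate dominator, as in the patch.
std::vector<int> compute_idom(const std::vector<std::vector<int>> &preds)
{
   std::vector<int> idom(preds.size(), UNSET);
   if (preds.empty())
      return idom;
   idom[0] = 0;

   bool changed;
   do {
      changed = false;
      for (size_t b = 1; b < preds.size(); b++) {
         int new_idom = UNSET;
         for (int p : preds[b]) {
            if (idom[p] == UNSET)
               continue;   // predecessor not processed yet, skip for now
            new_idom = (new_idom == UNSET) ? p : intersect(idom, p, new_idom);
         }
         if (idom[b] != new_idom) {
            idom[b] = new_idom;
            changed = true;
         }
      }
   } while (changed);

   return idom;
}
```

For a diamond (0 branches to 1 and 2, both join in 3) every block ends up with block 0 as its immediate dominator. For a loop where block 2 jumps back to block 1, blocks 2 and 3 are both immediately dominated by block 1.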
[Mesa-dev] [PATCH 00/10] i965/fs: Combine constants and unconditionally emit MADs
total instructions in shared programs: 5895414 -> 5747578 (-2.51%)
instructions in affected programs:     3618111 -> 3470275 (-4.09%)
helped:                                20492
HURT:                                  4449
GAINED:                                54
LOST:                                  146

and with NIR, that already emits MADs always:

total instructions in shared programs: 7992936 -> 7772474 (-2.76%)
instructions in affected programs:     3738730 -> 3518268 (-5.90%)
helped:                                22082
HURT:                                  3445
GAINED:                                70
LOST:                                  78

There are some particularly exciting individual improvements too:

kerbal-space-program/667: FS SIMD8: 1024 -> 561 (-45.21%)
kerbal-space-program/676: FS SIMD8: 1021 -> 558 (-45.35%)
rocketbirds-hardboiled-chicken/fp-2: FS SIMD8: 1076 -> 582 (-45.91%)

Changing the fs_visitor and fp backend to always emit MADs is a little like moving the goal posts for switching to NIR, but I think always emitting MADs is the right direction, and of course we'll discover more improvements the more we work with the code.
[Mesa-dev] [PATCH 01/10] i965/cfg: Allow cfg::dump to be called without a visitor.
The fs_visitor's dump_instruction() implementation calls cfg_t() indirectly
through calculate_live_intervals, so if you have an infinite loop in the CFG
code, you can't call cfg::dump(fs_visitor *) to debug it.
---
 src/mesa/drivers/dri/i965/brw_cfg.cpp | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_cfg.cpp b/src/mesa/drivers/dri/i965/brw_cfg.cpp
index 62cc239..ca5b01c 100644
--- a/src/mesa/drivers/dri/i965/brw_cfg.cpp
+++ b/src/mesa/drivers/dri/i965/brw_cfg.cpp
@@ -418,7 +418,8 @@ cfg_t::dump(backend_visitor *v) const
                  link->block->num);
       }
       fprintf(stderr, "\n");
-      block->dump(v);
+      if (v != NULL)
+         block->dump(v);
       fprintf(stderr, "END B%d", block->num);
       foreach_list_typed(bblock_link, link, link, &block->children) {
          fprintf(stderr, " ->B%d",
-- 
2.0.4
[Mesa-dev] [PATCH 07/10] i965/fs: Add pass to combine immediates.
total instructions in shared programs: 5885407 -> 5940958 (0.94%)
instructions in affected programs:     3617311 -> 3672862 (1.54%)
helped:                                3
HURT:                                  23556
GAINED:                                31
LOST:                                  165

... but will allow us to always emit MAD instructions.
---
 src/mesa/drivers/dri/i965/Makefile.sources         |   1 +
 src/mesa/drivers/dri/i965/brw_fs.cpp               |   2 +
 src/mesa/drivers/dri/i965/brw_fs.h                 |   1 +
 .../drivers/dri/i965/brw_fs_combine_constants.cpp  | 254 +
 4 files changed, 258 insertions(+)
 create mode 100644 src/mesa/drivers/dri/i965/brw_fs_combine_constants.cpp

diff --git a/src/mesa/drivers/dri/i965/Makefile.sources b/src/mesa/drivers/dri/i965/Makefile.sources
index 37697f3..c69441b 100644
--- a/src/mesa/drivers/dri/i965/Makefile.sources
+++ b/src/mesa/drivers/dri/i965/Makefile.sources
@@ -40,6 +40,7 @@ i965_FILES = \
 	brw_ff_gs.h \
 	brw_fs_channel_expressions.cpp \
 	brw_fs_cmod_propagation.cpp \
+	brw_fs_combine_constants.cpp \
 	brw_fs_copy_propagation.cpp \
 	brw_fs.cpp \
 	brw_fs_cse.cpp \
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 3142ab4..69602a7 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -3612,6 +3612,8 @@ fs_visitor::optimize()
       OPT(dead_code_eliminate);
    }
 
+   opt_combine_constants();
+
    lower_uniform_pull_constant_loads();
 }
 
diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h
index b95e2c0..c160bd6 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -471,6 +471,7 @@ public:
    void no16(const char *msg, ...);
    void lower_uniform_pull_constant_loads();
    bool lower_load_payload();
+   void opt_combine_constants();
 
    void emit_dummy_fs();
    void emit_repclear_shader();
diff --git a/src/mesa/drivers/dri/i965/brw_fs_combine_constants.cpp b/src/mesa/drivers/dri/i965/brw_fs_combine_constants.cpp
new file mode 100644
index 0000000..ad4a965
--- /dev/null
+++ b/src/mesa/drivers/dri/i965/brw_fs_combine_constants.cpp
@@ -0,0 +1,254 @@
+/*
+ * Copyright © 2014 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+/** @file brw_fs_combine_constants.cpp
+ *
+ * This file contains the opt_combine_constants() pass that runs after the
+ * regular optimization loop. It passes over the instruction list and
+ * selectively promotes immediate values to registers by emitting a mov(1)
+ * instruction.
+ *
+ * This is useful on Gen 7 particularly, because a few instructions can be
+ * coissued (i.e., issued in the same cycle as another thread on the same EU
+ * issues an instruction) under some circumstances, one of which is that they
+ * cannot use immediate values.
+ */
+
+#include "brw_fs.h"
+#include "brw_fs_live_variables.h"
+#include "brw_cfg.h"
+
+static bool
+could_coissue(const fs_inst *inst)
+{
+   /* MAD can coissue, but while the PRM lists various restrictions for Align1
+    * instructions related to data alignment and regioning, it doesn't list
+    * similar restrictions for Align16 instructions. I don't expect that there
+    * are any restrictions, since Align16 doesn't allow the kinds of operations
+    * that are restricted in Align1 mode.
+    *
+    * Since MAD is an Align16 instruction we assume it can always coissue.
+    */
+   switch (inst->opcode) {
+   case BRW_OPCODE_MOV:
+   case BRW_OPCODE_CMP:
+   case BRW_OPCODE_ADD:
+   case BRW_OPCODE_MUL:
+      return true;
+   default:
+      return false;
+   }
+}
+
+struct reg_link {
+   DECLARE_RALLOC_CXX_OPERATORS(reg_link)
+
+   reg_link(fs_reg *reg) : reg(reg) {}
+
+   struct exec_node link;
+   fs_reg *reg;
+};
+
+static struct exec_node *
+link(void *mem_ctx,
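The counting step described in the file comment above can be sketched on its own, as a rough model rather than the pass itself. The `Op` enum, the `Inst` struct and `immediates_to_promote` are hypothetical names; the only parts taken from the patch are the idea of counting immediate uses in coissue-able instructions and the opcode list from `could_coissue()`. The threshold is left as a parameter.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical stand-ins for the driver's opcodes and instructions.
enum class Op { MOV, CMP, ADD, MUL, OTHER };

struct Inst {
   Op op;
   std::vector<float> imms;   // immediate operands used by this instruction
};

// Mirrors the opcode list in could_coissue() from the patch.
static bool could_coissue(Op op)
{
   return op == Op::MOV || op == Op::CMP || op == Op::ADD || op == Op::MUL;
}

// Count how often each immediate is used by coissue-able instructions and
// return the values used at least min_uses times, in ascending order.
std::vector<float> immediates_to_promote(const std::vector<Inst> &insts,
                                         unsigned min_uses)
{
   std::map<float, unsigned> uses;
   for (const Inst &inst : insts) {
      if (!could_coissue(inst.op))
         continue;
      for (float v : inst.imms)
         uses[v]++;
   }

   std::vector<float> promoted;
   for (const auto &entry : uses) {
      if (entry.second >= min_uses)
         promoted.push_back(entry.first);
   }
   return promoted;
}
```

Each promoted value would then be loaded into a register once, and the instructions that used it would read the register instead of carrying the immediate.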
[Mesa-dev] [PATCH 04/10] i965/cfg: Add function to generate a dot file of the dominator tree.
---
 src/mesa/drivers/dri/i965/brw_cfg.cpp | 10 ++++++++++
 src/mesa/drivers/dri/i965/brw_cfg.h   |  1 +
 2 files changed, 11 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_cfg.cpp b/src/mesa/drivers/dri/i965/brw_cfg.cpp
index 2d6a181..095c34f 100644
--- a/src/mesa/drivers/dri/i965/brw_cfg.cpp
+++ b/src/mesa/drivers/dri/i965/brw_cfg.cpp
@@ -504,3 +504,13 @@ cfg_t::dump_cfg()
    }
    printf("}\n");
 }
+
+void
+cfg_t::dump_domtree()
+{
+   printf("digraph CFG {\n");
+   foreach_block(block, this) {
+      printf("\t%d -> %d\n", block->idom->num, block->num);
+   }
+   printf("}\n");
+}
diff --git a/src/mesa/drivers/dri/i965/brw_cfg.h b/src/mesa/drivers/dri/i965/brw_cfg.h
index 4d4eb2d..56d7d07 100644
--- a/src/mesa/drivers/dri/i965/brw_cfg.h
+++ b/src/mesa/drivers/dri/i965/brw_cfg.h
@@ -275,6 +275,7 @@ struct cfg_t {
 
    void dump(backend_visitor *v);
    void dump_cfg();
+   void dump_domtree();
 #endif
    void *mem_ctx;
 
-- 
2.0.4
[Mesa-dev] [PATCH 08/10] i965/fs: Allow immediates in MAD and LRP instructions.
And then the opt_combine_constants() pass will pull them out into
registers. This will allow us to do some algebraic optimizations on MAD
and LRP.

total instructions in shared programs: 5946656 -> 5931320 (-0.26%)
instructions in affected programs:     778247 -> 762911 (-1.97%)
helped:                                3780
HURT:                                  6
GAINED:                                12
LOST:                                  12
---
 .../drivers/dri/i965/brw_fs_combine_constants.cpp  | 22 +++++++++++++++++++---
 .../drivers/dri/i965/brw_fs_copy_propagation.cpp   |  6 ++++++
 2 files changed, 25 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_combine_constants.cpp b/src/mesa/drivers/dri/i965/brw_fs_combine_constants.cpp
index ad4a965..e9341b7 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_combine_constants.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_combine_constants.cpp
@@ -60,6 +60,18 @@ could_coissue(const fs_inst *inst)
    }
 }
 
+static bool
+must_promote_imm(const fs_inst *inst)
+{
+   switch (inst->opcode) {
+   case BRW_OPCODE_MAD:
+   case BRW_OPCODE_LRP:
+      return true;
+   default:
+      return false;
+   }
+}
+
 struct reg_link {
    DECLARE_RALLOC_CXX_OPERATORS(reg_link)
 
@@ -86,6 +98,7 @@ struct imm {
    uint16_t reg;
 
    uint16_t uses_by_coissue;
+   bool must_promote;
 
    bool inserted;
 
@@ -153,12 +166,13 @@ fs_visitor::opt_combine_constants()
    unsigned ip = -1;
 
    /* Make a pass through all instructions and count the number of times each
-    * constant is used by coissueable instructions.
+    * constant is used by coissueable instructions or instructions that cannot
+    * take immediate arguments.
     */
    foreach_block_and_inst(block, fs_inst, inst, cfg) {
       ip++;
 
-      if (!could_coissue(inst))
+      if (!could_coissue(inst) && !must_promote_imm(inst))
          continue;
 
       for (int i = 0; i < inst->sources; i++) {
@@ -175,6 +189,7 @@ fs_visitor::opt_combine_constants()
             imm->block = intersection;
             imm->uses->push_tail(link(const_ctx, &inst->src[i]));
             imm->uses_by_coissue += could_coissue(inst);
+            imm->must_promote = imm->must_promote || must_promote_imm(inst);
             imm->last_use_ip = ip;
          } else {
             imm = new_imm(&table, const_ctx);
@@ -184,6 +199,7 @@ fs_visitor::opt_combine_constants()
             imm->uses->push_tail(link(const_ctx, &inst->src[i]));
             imm->val = val;
             imm->uses_by_coissue = could_coissue(inst);
+            imm->must_promote = must_promote_imm(inst);
             imm->inserted = false;
             imm->first_use_ip = ip;
             imm->last_use_ip = ip;
@@ -197,7 +213,7 @@ fs_visitor::opt_combine_constants()
    for (int i = 0; i < table.len;) {
       struct imm *imm = &table.imm[i];
 
-      if (imm->uses_by_coissue < 4) {
+      if (!imm->must_promote && imm->uses_by_coissue < 4) {
          table.imm[i] = table.imm[table.len - 1];
          table.len--;
          continue;
diff --git a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
index 68a266c..1b47851 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
@@ -574,6 +574,12 @@ fs_visitor::try_constant_propagate(fs_inst *inst, acp_entry *entry)
             progress = true;
             break;
 
+         case BRW_OPCODE_MAD:
+         case BRW_OPCODE_LRP:
+            inst->src[i] = val;
+            progress = true;
+            break;
+
          default:
             break;
          }
-- 
2.0.4
[Mesa-dev] [PATCH 09/10] i965/fs: Emit MAD instructions when possible.
Previously we didn't emit MAD instructions since they cannot take
immediate arguments, but with the opt_combine_constants() pass we can
handle this properly.

total instructions in shared programs: 5920017 -> 5733278 (-3.15%)
instructions in affected programs:     3625153 -> 3438414 (-5.15%)
helped:                                22017
HURT:                                  870
GAINED:                                91
LOST:                                  49

Without constant pooling, this patch is a complete loss:

total instructions in shared programs: 5912589 -> 5987888 (1.27%)
instructions in affected programs:     3190050 -> 3265349 (2.36%)
helped:                                1564
HURT:                                  17827
GAINED:                                27
LOST:                                  101

And since the constant pooling patch by itself hurt a bunch of things,
from before constant pooling to this patch the results are:

total instructions in shared programs: 5895414 -> 5747946 (-2.50%)
instructions in affected programs:     3617993 -> 3470525 (-4.08%)
helped:                                20478
HURT:                                  4469
GAINED:                                54
LOST:                                  146
---
 src/mesa/drivers/dri/i965/brw_fs_fp.cpp      | 11 ++++++++---
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp |  5 -----
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_fp.cpp b/src/mesa/drivers/dri/i965/brw_fs_fp.cpp
index 7f2874d..382a54a 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_fp.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_fp.cpp
@@ -316,9 +316,14 @@ fs_visitor::emit_fragment_program_code()
       case OPCODE_MAD:
          for (int i = 0; i < 4; i++) {
             if (fpi->DstReg.WriteMask & (1 << i)) {
-               fs_reg temp = vgrf(glsl_type::float_type);
-               emit(MUL(temp, offset(src[0], i), offset(src[1], i)));
-               emit(ADD(offset(dst, i), temp, offset(src[2], i)));
+               if (brw->gen >= 6) {
+                  emit(MAD(offset(dst, i), offset(src[2], i),
+                           offset(src[1], i), offset(src[0], i)));
+               } else {
+                  fs_reg temp = vgrf(glsl_type::float_type);
+                  emit(MUL(temp, offset(src[0], i), offset(src[1], i)));
+                  emit(ADD(offset(dst, i), temp, offset(src[2], i)));
+               }
             }
          }
          break;
diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index 6cddcf5..3ac51f9 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -456,11 +456,6 @@ fs_visitor::try_emit_mad(ir_expression *ir)
       return false;
    }
 
-   if (nonmul->as_constant() ||
-       mul->operands[0]->as_constant() ||
-       mul->operands[1]->as_constant())
-      return false;
-
    nonmul->accept(this);
    fs_reg src0 = this->result;
 
-- 
2.0.4
[Mesa-dev] [PATCH 05/10] i965/fs: Remove force_writemask_all assertion for execsize 8.
This doesn't seem to be necessary.
---
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 1 -
 1 file changed, 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
index 8cd36f8..2820b8d 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
@@ -1611,7 +1611,6 @@ fs_generator::generate_code(const cfg_t *cfg, int dispatch_width)
       case 1:
       case 2:
       case 4:
-         assert(inst->force_writemask_all);
          brw_set_default_compression_control(p, BRW_COMPRESSION_NONE);
          break;
       case 8:
-- 
2.0.4
[Mesa-dev] [PATCH 06/10] i965/fs: Consider subregister writes to be writing a register.
Since .stride is 0 for a scalar write, we calculated regs_written = 0 for
a mov(1) instruction. Instruction scheduling loops from 0 to regs_written
when calculating dependencies, so this would cause problems.
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 2046eba..3142ab4 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -123,7 +123,7 @@ fs_inst::init(enum opcode opcode, uint8_t exec_size, const fs_reg &dst,
    case HW_REG:
    case MRF:
    case ATTR:
-      this->regs_written = (dst.width * dst.stride * type_sz(dst.type) + 31) / 32;
+      this->regs_written = MAX2((dst.width * dst.stride * type_sz(dst.type) + 31) / 32, 1);
       break;
    case BAD_FILE:
       this->regs_written = 0;
-- 
2.0.4
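The arithmetic behind this fix can be checked with a tiny standalone helper. The free function `regs_written` below is hypothetical and only mirrors the expression in the diff: the byte count is rounded up to whole 32-byte registers and then clamped to at least one.

```cpp
#include <algorithm>
#include <cassert>

// Number of 32-byte registers written by a destination with the given
// width (channels), stride (elements between channels) and element size in
// bytes. A stride of 0 (scalar write) would give 0 without the clamp.
unsigned regs_written(unsigned width, unsigned stride, unsigned type_size)
{
   unsigned bytes = width * stride * type_size;
   return std::max((bytes + 31) / 32, 1u);
}
```

For example, eight packed floats (8 * 1 * 4 = 32 bytes) occupy one register and sixteen occupy two, while a scalar write with stride 0 now reports one register instead of zero.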
[Mesa-dev] [PATCH 10/10] i965/fs: Add algebraic optimizations for MAD.
total instructions in shared programs: 5764176 -> 5763808 (-0.01%)
instructions in affected programs:     25121 -> 24753 (-1.46%)
helped:                                164
HURT:                                  2
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 69602a7..6f36e39 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -2417,6 +2417,32 @@ fs_visitor::opt_algebraic()
             }
          }
          break;
+      case BRW_OPCODE_MAD:
+         if (inst->src[1].is_zero() || inst->src[2].is_zero()) {
+            inst->opcode = BRW_OPCODE_MOV;
+            inst->src[1] = reg_undef;
+            inst->src[2] = reg_undef;
+            progress = true;
+         } else if (inst->src[0].is_zero()) {
+            inst->opcode = BRW_OPCODE_MUL;
+            inst->src[0] = inst->src[2];
+            inst->src[2] = reg_undef;
+         } else if (inst->src[1].is_one()) {
+            inst->opcode = BRW_OPCODE_ADD;
+            inst->src[1] = inst->src[2];
+            inst->src[2] = reg_undef;
+            progress = true;
+         } else if (inst->src[2].is_one()) {
+            inst->opcode = BRW_OPCODE_ADD;
+            inst->src[2] = reg_undef;
+            progress = true;
+         } else if (inst->src[1].file == IMM && inst->src[2].file == IMM) {
+            inst->opcode = BRW_OPCODE_MUL;
+            inst->src[1].fixed_hw_reg.dw1.f *= inst->src[2].fixed_hw_reg.dw1.f;
+            inst->src[2] = reg_undef;
+            progress = true;
+         }
+         break;
       case SHADER_OPCODE_RCP: {
          fs_inst *prev = (fs_inst *)inst->prev;
          if (prev->opcode == SHADER_OPCODE_SQRT) {
-- 
2.0.4
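The zero and one cases in this patch follow from reading MAD as src0 + src1 * src2, which is how the first four branches treat it. A minimal sketch of just those identities, detached from the driver's data structures, might look as follows. The `Src` struct, the `imm`/`reg` helpers and `simplify_mad` are hypothetical, and the case where both multiplicands are immediates is deliberately left out.

```cpp
#include <cassert>

enum class Op { MAD, MOV, MUL, ADD };

// Hypothetical operand: either a known immediate or an unknown register.
struct Src {
   bool is_imm;
   float value;
};

static Src imm(float v) { return Src{true, v}; }
static Src reg() { return Src{false, 0.0f}; }

static bool is_const(const Src &s, float k) { return s.is_imm && s.value == k; }

// Picks the cheaper opcode for a + b * c, assuming MAD computes exactly that.
Op simplify_mad(const Src &a, const Src &b, const Src &c)
{
   if (is_const(b, 0.0f) || is_const(c, 0.0f))
      return Op::MOV;   // a + 0       -> a
   if (is_const(a, 0.0f))
      return Op::MUL;   // 0 + b * c   -> b * c
   if (is_const(b, 1.0f))
      return Op::ADD;   // a + 1 * c   -> a + c
   if (is_const(c, 1.0f))
      return Op::ADD;   // a + b * 1   -> a + b
   return Op::MAD;      // nothing to simplify
}
```

Note that these rewrites treat x * 0 as 0, which is the usual shader-compiler convention but is not exact for NaN or infinite inputs.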
Re: [Mesa-dev] [PATCH] st/mesa: Don't use PIPE_USAGE_STREAM for GL_PIXEL_UNPACK_BUFFER_ARB
On Wed, Feb 4, 2015 at 11:45 PM, Michel Dänzer mic...@daenzer.net wrote:
> On 05.02.2015 12:48, Ilia Mirkin wrote:
>> Is there a benchmark that demonstrates this? I'd like to test it out
>> with nouveau.
>
> Mesa demos src/tests/streaming_rect on Kaveri (radeonsi):
>
> Unpatched: 42 frames in 1.023 seconds = 41.056 FPS
> Patched: 615 frames in 1.000 seconds = 615.000 FPS

Hm, 260fps for me on a GF108 (nvc1) with or without the patch (with vblank_mode=0 LIBGL_DRI3_DISABLE=1). I guess with nouveau you don't get that uncacheable nonsense?
Re: [Mesa-dev] [PATCH] st/mesa: Don't use PIPE_USAGE_STREAM for GL_PIXEL_UNPACK_BUFFER_ARB
Is there a benchmark that demonstrates this? I'd like to test it out with nouveau.

On Wed, Feb 4, 2015 at 10:47 PM, Michel Dänzer mic...@daenzer.net wrote:
> From: Michel Dänzer michel.daen...@amd.com
>
> The latter currently implies CPU access, so we have to avoid getting
> uncacheable memory.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88658
> Cc: 10.3 10.4 mesa-sta...@lists.freedesktop.org
> Signed-off-by: Michel Dänzer michel.daen...@amd.com
> ---
>  src/mesa/state_tracker/st_cb_bufferobjects.c | 11 +++++++++--
>  1 file changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/src/mesa/state_tracker/st_cb_bufferobjects.c b/src/mesa/state_tracker/st_cb_bufferobjects.c
> index 55f3644..90f786c 100644
> --- a/src/mesa/state_tracker/st_cb_bufferobjects.c
> +++ b/src/mesa/state_tracker/st_cb_bufferobjects.c
> @@ -256,8 +256,15 @@ st_bufferobj_data(struct gl_context *ctx,
>        break;
>     case GL_STREAM_DRAW:
>     case GL_STREAM_COPY:
> -      pipe_usage = PIPE_USAGE_STREAM;
> -      break;
> +      /* XXX: Remove this test and fall-through when we have PBO unpacking
> +       * acceleration. Right now, PBO unpacking is done by the CPU, so we
> +       * have to make sure CPU reads are fast.
> +       */
> +      if (target != GL_PIXEL_UNPACK_BUFFER_ARB) {
> +         pipe_usage = PIPE_USAGE_STREAM;
> +         break;
> +      }
> +      /* fall through */
>     case GL_STATIC_READ:
>     case GL_DYNAMIC_READ:
>     case GL_STREAM_READ:
> --
> 2.1.4