[Mesa-dev] [PATCH] gallivm: Fix build with LLVM >= 3.6 r215967.
This LLVM 3.6 commit changed EngineBuilder constructor.

commit 3f4ed32b4398eaf4fe0080d8001ba01e6c2f43c8
Author: Rafael Espindola rafael.espind...@gmail.com
Date:   Tue Aug 19 04:04:25 2014 +

    Make it explicit that ExecutionEngine takes ownership of the modules.

    git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215967 91177308-0d34-0410-b5e6-96231b3b80d8

Signed-off-by: Vinson Lee v...@freedesktop.org
---
 src/gallium/auxiliary/gallivm/lp_bld_misc.cpp | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp b/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp
index 6bea964..55aa8b9 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp
+++ b/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp
@@ -421,7 +421,11 @@ lp_build_create_jit_compiler_for_module(LLVMExecutionEngineRef *OutJIT,
    using namespace llvm;
 
    std::string Error;
+#if HAVE_LLVM >= 0x0306
+   EngineBuilder builder(std::unique_ptr<Module>(unwrap(M)));
+#else
    EngineBuilder builder(unwrap(M));
+#endif
 
    /**
     * LLVM 3.1+ haven't more extern unsigned llvm::StackAlignmentOverride and
-- 
1.9.2

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
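The ownership change behind this patch can be sketched without LLVM. The following is a minimal illustration, assuming hypothetical stand-in types: `Module`, `EngineBuilder` and `make_builder` here are NOT the LLVM classes, only small models of a constructor that takes `std::unique_ptr` by value so that ownership of the module visibly passes to the builder.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a module type; not llvm::Module.
struct Module {
    std::string name;
};

// Hypothetical stand-in for a builder; not llvm::EngineBuilder.
class EngineBuilder {
public:
    // Taking std::unique_ptr by value documents that the builder owns the module.
    explicit EngineBuilder(std::unique_ptr<Module> m) : module_(std::move(m)) {}

    const Module *module() const { return module_.get(); }

private:
    std::unique_ptr<Module> module_;
};

// Mirrors the shape of the patched call site: a raw pointer obtained elsewhere
// is wrapped in a unique_ptr, and that unique_ptr is handed to the builder.
EngineBuilder make_builder(Module *raw) {
    return EngineBuilder(std::unique_ptr<Module>(raw));
}
```

After `make_builder(raw)` returns, the caller must not delete `raw`; the builder's destructor does that.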
[Mesa-dev] [PATCH] gallivm: Fix build against LLVM SVN >= r215967
From: Michel Dänzer michel.daen...@amd.com

Signed-off-by: Michel Dänzer michel.daen...@amd.com
---
 src/gallium/auxiliary/gallivm/lp_bld_misc.cpp | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp b/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp
index 6bea964..55aa8b9 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp
+++ b/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp
@@ -421,7 +421,11 @@ lp_build_create_jit_compiler_for_module(LLVMExecutionEngineRef *OutJIT,
    using namespace llvm;
 
    std::string Error;
+#if HAVE_LLVM >= 0x0306
+   EngineBuilder builder(std::unique_ptr<Module>(unwrap(M)));
+#else
    EngineBuilder builder(unwrap(M));
+#endif
 
    /**
     * LLVM 3.1+ haven't more extern unsigned llvm::StackAlignmentOverride and
-- 
2.1.0
[Mesa-dev] [PATCH] st/clover: Fix build against LLVM SVN >= r215967
From: Michel Dänzer michel.daen...@amd.com

Signed-off-by: Michel Dänzer michel.daen...@amd.com
---
 src/gallium/state_trackers/clover/llvm/invocation.cpp | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp b/src/gallium/state_trackers/clover/llvm/invocation.cpp
index 5d2efc4..2643cc3 100644
--- a/src/gallium/state_trackers/clover/llvm/invocation.cpp
+++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp
@@ -234,7 +234,11 @@ namespace {
       memcpy(address_spaces, c.getTarget().getAddressSpaceMap(),
              sizeof(address_spaces));
 
+#if HAVE_LLVM >= 0x0306
+      return act.takeModule().get();
+#else
       return act.takeModule();
+#endif
    }
 
    void
-- 
2.1.0
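A general C++ point is worth keeping in mind when reading the hunk above. If a function returns a `std::unique_ptr` by value (whether `takeModule()` does so at that LLVM revision is an assumption here, not something this thread confirms), then calling `.get()` on the temporary yields a pointer to an object that is destroyed at the end of the full expression, while `.release()` transfers ownership to the caller. The sketch below uses hypothetical names (`Module`, `take_module`) and only demonstrates the language rule; it makes no claim about how the clover code actually behaves.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in type that counts live instances.
struct Module {
    static int live;
    Module() { ++live; }
    ~Module() { --live; }
};
int Module::live = 0;

// Hypothetical factory that returns ownership by value.
std::unique_ptr<Module> take_module() {
    return std::unique_ptr<Module>(new Module());
}

// release() hands the raw pointer to the caller, who must delete it later.
Module *take_raw_released() {
    return take_module().release();
}

// get() on a temporary: the unique_ptr dies at the end of the full
// expression, so the object is already gone when the next statement runs.
// The function reports the live count instead of returning the stale pointer.
int live_after_get_on_temporary() {
    Module *stale = take_module().get();
    (void)stale;  // must never be dereferenced
    return Module::live;
}
```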
Re: [Mesa-dev] [PATCH] gallivm: Fix build with LLVM >= 3.6 r215967.
On 20.08.2014 15:17, Vinson Lee wrote:
> This LLVM 3.6 commit changed EngineBuilder constructor.
>
> commit 3f4ed32b4398eaf4fe0080d8001ba01e6c2f43c8
> Author: Rafael Espindola rafael.espind...@gmail.com
> Date:   Tue Aug 19 04:04:25 2014 +
>
>     Make it explicit that ExecutionEngine takes ownership of the modules.
>
>     git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215967 91177308-0d34-0410-b5e6-96231b3b80d8
>
> Signed-off-by: Vinson Lee v...@freedesktop.org
> ---
>  src/gallium/auxiliary/gallivm/lp_bld_misc.cpp | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp b/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp
> index 6bea964..55aa8b9 100644
> --- a/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp
> +++ b/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp
> @@ -421,7 +421,11 @@ lp_build_create_jit_compiler_for_module(LLVMExecutionEngineRef *OutJIT,
>     using namespace llvm;
>
>     std::string Error;
> +#if HAVE_LLVM >= 0x0306
> +   EngineBuilder builder(std::unique_ptr<Module>(unwrap(M)));
> +#else
>     EngineBuilder builder(unwrap(M));
> +#endif
>
>     /**
>      * LLVM 3.1+ haven't more extern unsigned llvm::StackAlignmentOverride and

I pushed yours, since you beat me by two minutes (and have a more detailed
commit log). :)

-- 
Earthling Michel Dänzer            |                  http://www.amd.com
Libre software enthusiast          |                Mesa and X developer
Re: [Mesa-dev] [RFC PATCH 00/16] A new IR for Mesa
Connor Abbott cwabbo...@gmail.com writes:

On Tue, Aug 19, 2014 at 11:40 AM, Francisco Jerez curroje...@riseup.net wrote:
Tom Stellard t...@stellard.net writes:
On Tue, Aug 19, 2014 at 11:04:59AM -0400, Connor Abbott wrote:
On Mon, Aug 18, 2014 at 8:52 PM, Michel Dänzer mic...@daenzer.net wrote:
On 19.08.2014 01:28, Connor Abbott wrote:
On Mon, Aug 18, 2014 at 4:32 AM, Michel Dänzer mic...@daenzer.net wrote:
On 16.08.2014 09:12, Connor Abbott wrote:

I know what you might be thinking right now. Wait, *another* IR? Don't we already have like 5 of those, not counting all the driver-specific ones? Isn't this stuff complicated enough already? Well, there are some pretty good reasons to start afresh (again...). In the years we've been using GLSL IR, we've come to realize that, in fact, it's not what we want *at all* to do optimizations on.

Did you evaluate using LLVM IR instead of inventing yet another one?

-- 
Earthling Michel Dänzer            |                  http://www.amd.com
Libre software enthusiast          |                Mesa and X developer

Yes. See

http://lists.freedesktop.org/archives/mesa-dev/2014-February/053502.html

and

http://lists.freedesktop.org/archives/mesa-dev/2014-February/053522.html

I know Ian can't deal with LLVM for some reason. I was wondering if *you* evaluated it, and if so, why you rejected it.

-- 
Earthling Michel Dänzer            |                  http://www.amd.com
Libre software enthusiast          |                Mesa and X developer

Well, first of all, the fact that Ian and Ken don't want to use it means that any plan to use LLVM for the Intel driver is dead in the water anyways - you can translate NIR into LLVM if you want, but for i965 we want to share optimizations between our 2 backends (FS and vec4) that we can't do today in GLSL IR so this is what we want to use for that, and since nobody else does anything with the core GLSL compiler except when they have to, when we start moving things out of GLSL IR this will probably replace GLSL IR as the infrastructure that all Mesa drivers use.

But with that in mind, here are a few reasons why we wouldn't want to use LLVM:

* LLVM wasn't built to understand structured CFG's, meaning that you need to re-structurize it using a pass that's fragile and prone to break if some other pass optimizes the shader in a way that makes it non-structured (i.e. not expressible in terms of loops and if statements). This loss of information also means that passes that need to know things like, for example, the loop nesting depth need to do an analysis pass whereas with NIR you can just walk up the control flow tree and count the number of loops we hit.

LLVM has a pass to structurize the CFG. We use it in the radeon drivers, and it is run after all of the other LLVM optimizations which have no concept of structured CFG. It's not bug free, but it works really well even with all of the complex OpenCL kernels we throw at it.

Your point about losing information when the CFG is de-structurized is valid, but for things like loop depth, I'm not sure why we couldn't write an LLVM analysis pass for this (if one doesn't already exist).

I don't think this is such a big deal either. At least the structurization pass used on newer AMD hardware isn't fragile in the way you seem to imply -- AFAIK (unlike the old AMDIL heuristic algorithm) it's guaranteed to give you a valid structurized output no matter what the previous optimization passes have done to the CFG, modulo bugs. I admit that the situation is nevertheless suboptimal. Ideally this information wouldn't get lost along the way. For the long term we may want to represent structured control flow directly in the IR as you say, I just don't see how reinventing the IR saves us any work if we could just fix the existing one.

It seems to me that something like how we represent control flow is a pretty fundamental part of the IR - it affects any optimization pass that needs to do anything beyond adding and removing instructions. How would you fix that, especially given that LLVM is primarily designed for CPU's where you don't want to be restricted to structured control flow at all? It seems like our goals (preserve the structure) conflict with the way LLVM has been designed.

I think we can fix this by introducing new structured variants of the branch instruction in a way that doesn't alter the fundamental structure of the IR. E.g. an if branch could look like:

  ifbr i1 cond, label iftrue, label iffalse, label join

Where both branches are guaranteed to converge at join. Sure, this will require fixing many assumptions, but on the one hand it's not immediately required (as we can address this problem for the time being using the same solution AMD uses) and on the other hand it's still less work than starting from scratch.

* LLVM doesn't do modifiers, meaning that
[Mesa-dev] [PATCHv3 07/16] glsl: protect anonymous struct id with a mutex
There may be two contexts compiling shaders at the same time, and we want the
anonymous struct id to be globally unique.

Signed-off-by: Chia-I Wu o...@lunarg.com
Reviewed-by: Brian Paul bri...@vmware.com
Reviewed-by: Ian Romanick ian.d.roman...@intel.com
---
 src/glsl/glsl_parser_extras.cpp | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/src/glsl/glsl_parser_extras.cpp b/src/glsl/glsl_parser_extras.cpp
index 490c3c8..b17cdb1 100644
--- a/src/glsl/glsl_parser_extras.cpp
+++ b/src/glsl/glsl_parser_extras.cpp
@@ -1350,9 +1350,15 @@ ast_struct_specifier::ast_struct_specifier(const char *identifier,
                                            ast_declarator_list *declarator_list)
 {
    if (identifier == NULL) {
+      static mtx_t mutex = _MTX_INITIALIZER_NP;
       static unsigned anon_count = 1;
-      identifier = ralloc_asprintf(this, "#anon_struct_%04x", anon_count);
-      anon_count++;
+      unsigned count;
+
+      mtx_lock(&mutex);
+      count = anon_count++;
+      mtx_unlock(&mutex);
+
+      identifier = ralloc_asprintf(this, "#anon_struct_%04x", count);
    }
    name = identifier;
    this->declarations.push_degenerate_list_at_head(&declarator_list->link);
-- 
2.0.0.rc2
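The pattern in this patch (take the lock, copy and bump the shared counter, drop the lock, then format the name outside the critical section) can be sketched in portable C++. This is a minimal illustration under assumptions: `next_anon_name` and `count_distinct_names` are hypothetical helpers, and `std::mutex` replaces the `mtx_t` used by the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <mutex>
#include <set>
#include <string>
#include <thread>
#include <vector>

// Hypothetical helper: returns a name that is unique across all threads.
std::string next_anon_name() {
    static std::mutex mutex;
    static unsigned anon_count = 1;
    unsigned count;
    {
        // Only the read-and-increment is protected; formatting happens unlocked.
        std::lock_guard<std::mutex> lock(mutex);
        count = anon_count++;
    }
    char buf[32];
    std::snprintf(buf, sizeof(buf), "#anon_struct_%04x", count);
    return buf;
}

// Hypothetical helper: runs several threads that each request names, then
// returns how many distinct names were handed out in total.
std::size_t count_distinct_names(int threads, int per_thread) {
    std::vector<std::vector<std::string>> results(threads);
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&results, t, per_thread] {
            for (int i = 0; i < per_thread; ++i)
                results[t].push_back(next_anon_name());
        });
    }
    for (auto &w : workers)
        w.join();
    std::set<std::string> distinct;
    for (const auto &r : results)
        distinct.insert(r.begin(), r.end());
    return distinct.size();
}
```

Without the lock, two threads could read the same `anon_count` value and produce identical names.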
[Mesa-dev] [PATCHv3 03/16] util: initialize locale_t with a static object
_mesa_strtod and _mesa_strtof may be called from multiple threads. They need
to be thread-safe.

Signed-off-by: Chia-I Wu o...@lunarg.com
Reviewed-by: Brian Paul bri...@vmware.com
Reviewed-by: Ian Romanick ian.d.roman...@intel.com

v2: platform checks are now done in configure.ac
---
 src/util/strtod.cpp | 18 ++++++++----------
 1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/src/util/strtod.cpp b/src/util/strtod.cpp
index 2a3e8eb..2b4dd98 100644
--- a/src/util/strtod.cpp
+++ b/src/util/strtod.cpp
@@ -36,6 +36,12 @@
 #include "strtod.h"
 
 
+#if defined(_GNU_SOURCE) && defined(HAVE_XLOCALE_H)
+static struct locale_initializer {
+   locale_initializer() { loc = newlocale(LC_CTYPE_MASK, "C", NULL); }
+   locale_t loc;
+} loc_init;
+#endif
 
 /**
  * Wrapper around strtod which uses the "C" locale so the decimal
@@ -45,11 +51,7 @@ double
 _mesa_strtod(const char *s, char **end)
 {
 #if defined(_GNU_SOURCE) && defined(HAVE_XLOCALE_H)
-   static locale_t loc = NULL;
-   if (!loc) {
-      loc = newlocale(LC_CTYPE_MASK, "C", NULL);
-   }
-   return strtod_l(s, end, loc);
+   return strtod_l(s, end, loc_init.loc);
 #else
    return strtod(s, end);
 #endif
@@ -64,11 +66,7 @@ float
 _mesa_strtof(const char *s, char **end)
 {
 #if defined(_GNU_SOURCE) && defined(HAVE_XLOCALE_H)
-   static locale_t loc = NULL;
-   if (!loc) {
-      loc = newlocale(LC_CTYPE_MASK, "C", NULL);
-   }
-   return strtof_l(s, end, loc);
+   return strtof_l(s, end, loc_init.loc);
 #elif defined(HAVE_STRTOF)
    return strtof(s, end);
 #else
-- 
2.0.0.rc2
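The idea of the patch is to replace a lazily initialized function-local `locale_t` (a race when two threads make the first call at once) with an object whose constructor runs during static initialization. The sketch below illustrates that idea under assumptions: it targets a glibc-like platform where `strtod_l` is declared by `<stdlib.h>` (elsewhere it may live in `<xlocale.h>` or be absent), the names `c_locale_holder` and `parse_double_c_locale` are hypothetical, and it uses `LC_ALL_MASK` rather than the `LC_CTYPE_MASK` of the patch.

```cpp
#include <cassert>
#include <cstdlib>
#include <locale.h>   // newlocale, freelocale, locale_t
#include <stdlib.h>   // strtod_l on glibc-like platforms (assumption)

namespace {

// The constructor runs during static initialization, before main() and
// therefore before any worker threads created from main() exist.
struct c_locale_holder {
    c_locale_holder() : loc(newlocale(LC_ALL_MASK, "C", (locale_t)0)) {}
    ~c_locale_holder() {
        if (loc)
            freelocale(loc);
    }
    locale_t loc;
} c_locale;

}  // namespace

// Hypothetical wrapper: parses with the "C" locale so '.' is always the
// decimal point; falls back to plain strtod if the locale was not created.
double parse_double_c_locale(const char *s, char **end) {
    if (!c_locale.loc)
        return std::strtod(s, end);
    return strtod_l(s, end, c_locale.loc);
}
```

One caveat of this design: code in another translation unit's static initializer could still call the wrapper before `c_locale` is constructed, since initialization order across translation units is unspecified.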
[Mesa-dev] [PATCHv3 11/16] mesa: add infrastructure for threaded shader compilation
Add _mesa_enable_glsl_threadpool to enable the thread pool for a context, and
add ctx->Const.DeferCompileShader and ctx->Const.DeferLinkProgram to
fine-control what gets threaded.

Setting DeferCompileShader to true will make _mesa_glsl_compile_shader be
executed in a worker thread. The function is thread-safe so there is no
restriction on DeferCompileShader.

Setting DeferLinkProgram to true will make _mesa_glsl_link_shader be executed
in a worker thread. The function is thread-safe only when certain driver
functions (as documented in struct gl_constants) are thread-safe. It is
drivers' responsibility to fix those driver functions before setting
DeferLinkProgram.

When DeferLinkProgram is set, drivers are not supposed to inspect the context
in their LinkShader callbacks. Instead, NotifyLinkShader is added. Drivers
should inspect the context in NotifyLinkShader and save what they need for
LinkShader in gl_shader_program.

As a final note, most applications will not benefit from threaded shader
compilation because they check GL_COMPILE_STATUS/GL_LINK_STATUS immediately,
giving the worker threads no time to do their jobs. A possible improvement is
to split LinkShader into two parts: the first part links and error checks
while the second part optimizes and generates the machine code. With the
split, we can always defer the second part to the thread pool.
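The final note above can be made concrete with a small model. This is only a sketch under assumptions: `DeferredShader` is a hypothetical class, `std::async` stands in for the thread pool, and the "compiler" is a placeholder that merely checks for a non-empty source string. It shows why a status query issued right after the compile request has to block on the worker.

```cpp
#include <cassert>
#include <future>
#include <string>
#include <utility>

// Hypothetical model of a shader whose compilation is deferred to a worker.
class DeferredShader {
public:
    explicit DeferredShader(std::string source) : source_(std::move(source)) {}

    // Analogous to a deferred compile request: returns immediately while the
    // placeholder "compile" runs on another thread.
    void compile() {
        std::string src = source_;
        result_ = std::async(std::launch::async,
                             [src] { return !src.empty(); });
    }

    // Analogous to querying the compile status: if the worker has not
    // finished yet, the caller waits here, so an immediate query gains nothing.
    bool status() {
        if (result_.valid())
            status_ = result_.get();
        return status_;
    }

private:
    std::string source_;
    std::future<bool> result_;
    bool status_ = false;
};
```

An application only benefits if it does other work between `compile()` and `status()`.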
Signed-off-by: Chia-I Wu o...@lunarg.com
Reviewed-by: Brian Paul bri...@vmware.com
Reviewed-by: Ian Romanick ian.d.roman...@intel.com

v2:
 - replace void *TaskData by struct gl_context *TaskContext
 - use bool instead of GLboolean internally
 - add more comments to the newly added functions
---
 src/mesa/main/context.c     |  29 +++
 src/mesa/main/context.h     |   3 ++
 src/mesa/main/dd.h          |   8 +++
 src/mesa/main/mtypes.h      |  34
 src/mesa/main/pipelineobj.c |  18 +++
 src/mesa/main/shaderapi.c   | 124 +++-
 src/mesa/main/shaderobj.c   |  84 +++---
 src/mesa/main/shaderobj.h   |  55 ++--
 8 files changed, 332 insertions(+), 23 deletions(-)

diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c
index 7a1b6f6..54d1248 100644
--- a/src/mesa/main/context.c
+++ b/src/mesa/main/context.c
@@ -112,6 +112,7 @@
 #include "points.h"
 #include "polygon.h"
 #include "queryobj.h"
+#include "shaderapi.h"
 #include "syncobj.h"
 #include "rastpos.h"
 #include "remap.h"
@@ -133,6 +134,7 @@
 #include "math/m_matrix.h"
 #include "main/dispatch.h" /* for _gloffset_COUNT */
 #include "util/simple_list.h"
+#include "util/threadpool.h"
 
 #ifdef USE_SPARC_ASM
 #include "sparc/sparc.h"
@@ -1187,6 +1189,27 @@ _mesa_create_context(gl_api api,
    }
 }
 
+void
+_mesa_enable_glsl_threadpool(struct gl_context *ctx, int max_threads)
+{
+   if (!ctx->ThreadPool)
+      ctx->ThreadPool = _mesa_threadpool_get_singleton(max_threads);
+}
+
+static void
+wait_shader_object_cb(GLuint id, void *data, void *userData)
+{
+   struct gl_context *ctx = (struct gl_context *) userData;
+   struct gl_shader *sh = (struct gl_shader *) data;
+
+   if (_mesa_validate_shader_target(ctx, sh->Type)) {
+      _mesa_wait_shaders(ctx, &sh, 1);
+   }
+   else {
+      struct gl_shader_program *shProg = (struct gl_shader_program *) data;
+      _mesa_wait_shader_program(ctx, shProg);
+   }
+}
 
 /**
  * Free the data associated with the given context.
@@ -1205,6 +1228,12 @@ _mesa_free_context_data( struct gl_context *ctx )
       _mesa_make_current(ctx, NULL, NULL);
    }
 
+   if (ctx->ThreadPool) {
+      _mesa_HashWalk(ctx->Shared->ShaderObjects, wait_shader_object_cb, ctx);
+      _mesa_threadpool_unref(ctx->ThreadPool);
+      ctx->ThreadPool = NULL;
+   }
+
    /* unreference WinSysDraw/Read buffers */
    _mesa_reference_framebuffer(&ctx->WinSysDrawBuffer, NULL);
    _mesa_reference_framebuffer(&ctx->WinSysReadBuffer, NULL);
diff --git a/src/mesa/main/context.h b/src/mesa/main/context.h
index d902ea7..e81d4f7 100644
--- a/src/mesa/main/context.h
+++ b/src/mesa/main/context.h
@@ -118,6 +118,9 @@ _mesa_create_context(gl_api api,
                      const struct dd_function_table *driverFunctions);
 
 extern void
+_mesa_enable_glsl_threadpool(struct gl_context *ctx, int max_threads);
+
+extern void
 _mesa_free_context_data( struct gl_context *ctx );
 
 extern void
diff --git a/src/mesa/main/dd.h b/src/mesa/main/dd.h
index c130b14..9310002 100644
--- a/src/mesa/main/dd.h
+++ b/src/mesa/main/dd.h
@@ -477,6 +477,14 @@ struct dd_function_table {
     */
    /*@{*/
    /**
+    * Called when a shader program is to be linked.
+    *
+    * This is optional and gives drivers an opportunity to inspect the context
+    * and prepare for LinkShader, which may be deferred to another thread.
+    */
+   void (*NotifyLinkShader)(struct gl_context *ctx,
+                            struct gl_shader_program *shader);
+   /**
     * Called when a shader program is linked.
     *
     * This gives drivers an opportunity to clone
[Mesa-dev] [PATCHv3 05/16] util: add a generic thread pool data structure
It can be used to implement, for example, threaded glCompileShader and
glLinkProgram. Two basic tests are included to verify the basic functions,
and to give us some confidence about its thread-safety.

v2: allow tasks to complete other tasks

Signed-off-by: Chia-I Wu o...@lunarg.com
Reviewed-by: Brian Paul bri...@vmware.com

v3: move to src/util/ and mention the tests
---
 configure.ac                                  |   3 +-
 src/util/Makefile.am                          |   5 +-
 src/util/Makefile.sources                     |   3 +-
 src/util/tests/threadpool/Makefile.am         |  36 ++
 src/util/tests/threadpool/threadpool_test.cpp | 137
 src/util/threadpool.c                         | 476 ++
 src/util/threadpool.h                         |  67
 7 files changed, 724 insertions(+), 3 deletions(-)
 create mode 100644 src/util/tests/threadpool/Makefile.am
 create mode 100644 src/util/tests/threadpool/threadpool_test.cpp
 create mode 100644 src/util/threadpool.c
 create mode 100644 src/util/threadpool.h

diff --git a/configure.ac b/configure.ac
index 57e9f7d..2f7268f 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2261,7 +2261,8 @@ AC_CONFIG_FILES([Makefile
                 src/mesa/drivers/x11/Makefile
                 src/mesa/main/tests/Makefile
                 src/util/Makefile
-                src/util/tests/hash_table/Makefile])
+                src/util/tests/hash_table/Makefile
+                src/util/tests/threadpool/Makefile])
 
 dnl Sort the dirs alphabetically
 GALLIUM_TARGET_DIRS=`echo $GALLIUM_TARGET_DIRS|tr " " "\n"|sort -u|tr "\n" " "`
diff --git a/src/util/Makefile.am b/src/util/Makefile.am
index 4733a1a..da6815e 100644
--- a/src/util/Makefile.am
+++ b/src/util/Makefile.am
@@ -19,7 +19,7 @@
 # FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
 # IN THE SOFTWARE.
 
-SUBDIRS = . tests/hash_table
+SUBDIRS = . tests/hash_table tests/threadpool
 
 include Makefile.sources
 
@@ -34,6 +34,9 @@ libmesautil_la_SOURCES = \
         $(MESA_UTIL_FILES) \
         $(MESA_UTIL_GENERATED_FILES)
 
+libmesautil_la_LIBADD = \
+        $(PTHREAD_LIBS)
+
 BUILT_SOURCES = $(MESA_UTIL_GENERATED_FILES)
 CLEANFILES = $(BUILT_SOURCES)
diff --git a/src/util/Makefile.sources b/src/util/Makefile.sources
index 86466dc..65f98f7 100644
--- a/src/util/Makefile.sources
+++ b/src/util/Makefile.sources
@@ -1,7 +1,8 @@
 MESA_UTIL_FILES := \
         hash_table.c \
         ralloc.c \
-        strtod.cpp
+        strtod.cpp \
+        threadpool.c
 
 MESA_UTIL_GENERATED_FILES = \
         format_srgb.c
diff --git a/src/util/tests/threadpool/Makefile.am b/src/util/tests/threadpool/Makefile.am
new file mode 100644
index 000..2aa010c
--- /dev/null
+++ b/src/util/tests/threadpool/Makefile.am
@@ -0,0 +1,36 @@
+# Copyright © 2009 Intel Corporation
+#
+#  Permission is hereby granted, free of charge, to any person obtaining a
+#  copy of this software and associated documentation files (the "Software"),
+#  to deal in the Software without restriction, including without limitation
+#  on the rights to use, copy, modify, merge, publish, distribute, sub
+#  license, and/or sell copies of the Software, and to permit persons to whom
+#  the Software is furnished to do so, subject to the following conditions:
+#
+#  The above copyright notice and this permission notice (including the next
+#  paragraph) shall be included in all copies or substantial portions of the
+#  Software.
+#
+#  THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+#  IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+#  FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.  IN NO EVENT SHALL
+#  ADAM JACKSON BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
+#  IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+#  CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+
+AM_CPPFLAGS = \
+        -I$(top_srcdir)/include \
+        -I$(top_srcdir)/src/gtest/include \
+        -I$(top_srcdir)/src/util \
+        $(DEFINES)
+
+TESTS = threadpool-test
+
+check_PROGRAMS = threadpool-test
+
+threadpool_test_SOURCES = threadpool_test.cpp
+threadpool_test_CFLAGS = $(PTHREAD_CFLAGS)
+threadpool_test_LDADD = \
+        $(top_builddir)/src/util/libmesautil.la \
+        $(top_builddir)/src/gtest/libgtest.la \
+        $(PTHREAD_LIBS)
diff --git a/src/util/tests/threadpool/threadpool_test.cpp b/src/util/tests/threadpool/threadpool_test.cpp
new file mode 100644
index 000..63f55c5
--- /dev/null
+++ b/src/util/tests/threadpool/threadpool_test.cpp
@@ -0,0 +1,137 @@
+/*
+ * Copyright © 2014 LunarG, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish,
[Mesa-dev] [PATCHv3 02/16] configure: check for xlocale.h and strtof
This assumes that the presence of xlocale.h implies newlocale and strtof_l.
SCons is updated to define HAVE_XLOCALE_H on linux and darwin.

Signed-off-by: Chia-I Wu o...@lunarg.com
---
 configure.ac        |  3 +++
 scons/gallium.py    |  4
 src/util/strtod.cpp | 12
 3 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/configure.ac b/configure.ac
index be6898f..57e9f7d 100644
--- a/configure.ac
+++ b/configure.ac
@@ -494,6 +494,9 @@ if test x$enable_asm = xyes; then
     esac
 fi
 
+AC_CHECK_HEADER([xlocale.h], [DEFINES="$DEFINES -DHAVE_XLOCALE_H"])
+AC_CHECK_FUNC([strtof], [DEFINES="$DEFINES -DHAVE_STRTOF"])
+
 dnl Check to see if dlopen is in default libraries (like Solaris, which
 dnl has it in libc), or if libdl is needed to get it.
 AC_CHECK_FUNC([dlopen], [DEFINES="$DEFINES -DHAVE_DLOPEN"],
diff --git a/scons/gallium.py b/scons/gallium.py
index e915319..70b40f6 100755
--- a/scons/gallium.py
+++ b/scons/gallium.py
@@ -301,6 +301,10 @@ def generate(env):
         cppdefines += ['HAVE_ALIAS']
     else:
         cppdefines += ['GLX_ALIAS_UNSUPPORTED']
+
+    if env['platform'] in ('linux', 'darwin'):
+        cppdefines += ['HAVE_XLOCALE_H']
+
     if env['platform'] == 'haiku':
         cppdefines += [
             'HAVE_PTHREAD',
diff --git a/src/util/strtod.cpp b/src/util/strtod.cpp
index 2f1d229..2a3e8eb 100644
--- a/src/util/strtod.cpp
+++ b/src/util/strtod.cpp
@@ -28,7 +28,7 @@
 
 #ifdef _GNU_SOURCE
 #include <locale.h>
-#ifdef __APPLE__
+#ifdef HAVE_XLOCALE_H
 #include <xlocale.h>
 #endif
 #endif
@@ -44,9 +44,7 @@
 double
 _mesa_strtod(const char *s, char **end)
 {
-#if defined(_GNU_SOURCE) && !defined(__CYGWIN__) && !defined(__FreeBSD__) && \
-   !defined(ANDROID) && !defined(__HAIKU__) && !defined(__UCLIBC__) && \
-   !defined(__NetBSD__)
+#if defined(_GNU_SOURCE) && defined(HAVE_XLOCALE_H)
    static locale_t loc = NULL;
    if (!loc) {
       loc = newlocale(LC_CTYPE_MASK, "C", NULL);
@@ -65,15 +63,13 @@ _mesa_strtod(const char *s, char **end)
 float
 _mesa_strtof(const char *s, char **end)
 {
-#if defined(_GNU_SOURCE) && !defined(__CYGWIN__) && !defined(__FreeBSD__) && \
-   !defined(ANDROID) && !defined(__HAIKU__) && !defined(__UCLIBC__) && \
-   !defined(__NetBSD__)
+#if defined(_GNU_SOURCE) && defined(HAVE_XLOCALE_H)
    static locale_t loc = NULL;
    if (!loc) {
       loc = newlocale(LC_CTYPE_MASK, "C", NULL);
    }
    return strtof_l(s, end, loc);
-#elif defined(_ISOC99_SOURCE) || (defined(_XOPEN_SOURCE) && _XOPEN_SOURCE >= 600)
+#elif defined(HAVE_STRTOF)
    return strtof(s, end);
 #else
    return (float) strtod(s, end);
-- 
2.0.0.rc2
[Mesa-dev] [PATCHv3 10/16] mesa: protect the debug state with a mutex
We are about to change mesa to spawn threads for deferred glCompileShader and
glLinkProgram, and we need to make sure those threads can send compiler
warnings/errors to the debug output safely.

Signed-off-by: Chia-I Wu o...@lunarg.com
Reviewed-by: Brian Paul bri...@vmware.com
Reviewed-by: Ian Romanick ian.d.roman...@intel.com
---
 src/mesa/main/errors.c | 172 +++--
 src/mesa/main/mtypes.h |   1 +
 2 files changed, 126 insertions(+), 47 deletions(-)

diff --git a/src/mesa/main/errors.c b/src/mesa/main/errors.c
index 6b55a1d..218b4ee 100644
--- a/src/mesa/main/errors.c
+++ b/src/mesa/main/errors.c
@@ -677,22 +677,41 @@ debug_pop_group(struct gl_debug_state *debug)
 
 
 /**
- * Return debug state for the context.  The debug state will be allocated
- * and initialized upon the first call.
+ * Lock and return debug state for the context.  The debug state will be
+ * allocated and initialized upon the first call.  When NULL is returned, the
+ * debug state is not locked.
  */
 static struct gl_debug_state *
-_mesa_get_debug_state(struct gl_context *ctx)
+_mesa_lock_debug_state(struct gl_context *ctx)
 {
+   mtx_lock(&ctx->DebugMutex);
+
    if (!ctx->Debug) {
       ctx->Debug = debug_create();
       if (!ctx->Debug) {
-         _mesa_error(ctx, GL_OUT_OF_MEMORY, "allocating debug state");
+         GET_CURRENT_CONTEXT(cur);
+         mtx_unlock(&ctx->DebugMutex);
+
+         /*
+          * This function may be called from other threads.  When that is the
+          * case, we cannot record this OOM error.
+          */
+         if (ctx == cur)
+            _mesa_error(ctx, GL_OUT_OF_MEMORY, "allocating debug state");
+
+         return NULL;
       }
    }
 
    return ctx->Debug;
 }
 
+static void
+_mesa_unlock_debug_state(struct gl_context *ctx)
+{
+   mtx_unlock(&ctx->DebugMutex);
+}
+
 /**
  * Set the integer debug state specified by \p pname.  This can be called from
  * _mesa_set_enable for example.
@@ -700,7 +719,7 @@ _mesa_get_debug_state(struct gl_context *ctx)
 bool
 _mesa_set_debug_state_int(struct gl_context *ctx, GLenum pname, GLint val)
 {
-   struct gl_debug_state *debug = _mesa_get_debug_state(ctx);
+   struct gl_debug_state *debug = _mesa_lock_debug_state(ctx);
 
    if (!debug)
       return false;
@@ -717,6 +736,8 @@ _mesa_set_debug_state_int(struct gl_context *ctx, GLenum pname, GLint val)
       break;
    }
 
+   _mesa_unlock_debug_state(ctx);
+
    return true;
 }
 
@@ -730,9 +751,12 @@ _mesa_get_debug_state_int(struct gl_context *ctx, GLenum pname)
    struct gl_debug_state *debug;
    GLint val;
 
+   mtx_lock(&ctx->DebugMutex);
    debug = ctx->Debug;
-   if (!debug)
+   if (!debug) {
+      mtx_unlock(&ctx->DebugMutex);
       return 0;
+   }
 
    switch (pname) {
    case GL_DEBUG_OUTPUT:
@@ -757,6 +781,8 @@ _mesa_get_debug_state_int(struct gl_context *ctx, GLenum pname)
       break;
    }
 
+   mtx_unlock(&ctx->DebugMutex);
+
    return val;
 }
 
@@ -770,9 +796,12 @@ _mesa_get_debug_state_ptr(struct gl_context *ctx, GLenum pname)
    struct gl_debug_state *debug;
    void *val;
 
+   mtx_lock(&ctx->DebugMutex);
    debug = ctx->Debug;
-   if (!debug)
+   if (!debug) {
+      mtx_unlock(&ctx->DebugMutex);
       return NULL;
+   }
 
    switch (pname) {
    case GL_DEBUG_CALLBACK_FUNCTION_ARB:
@@ -787,9 +816,49 @@ _mesa_get_debug_state_ptr(struct gl_context *ctx, GLenum pname)
       break;
    }
 
+   mtx_unlock(&ctx->DebugMutex);
+
    return val;
 }
 
+/**
+ * Insert a debug message.  The mutex is assumed to be locked, and will be
+ * unlocked by this call.
+ */
+static void
+log_msg_locked_and_unlock(struct gl_context *ctx,
+                          enum mesa_debug_source source,
+                          enum mesa_debug_type type, GLuint id,
+                          enum mesa_debug_severity severity,
+                          GLint len, const char *buf)
+{
+   struct gl_debug_state *debug = ctx->Debug;
+
+   if (!debug_is_message_enabled(debug, source, type, id, severity)) {
+      _mesa_unlock_debug_state(ctx);
+      return;
+   }
+
+   if (ctx->Debug->Callback) {
+      GLenum gl_source = debug_source_enums[source];
+      GLenum gl_type = debug_type_enums[type];
+      GLenum gl_severity = debug_severity_enums[severity];
+      GLDEBUGPROC callback = ctx->Debug->Callback;
+      const void *data = ctx->Debug->CallbackData;
+
+      /*
+       * When ctx->Debug->SyncOutput is GL_FALSE, the client is prepared for
+       * unsynchronous calls.  When it is GL_TRUE, we will not spawn threads.
+       * In either case, we can call the callback unlocked.
+       */
+      _mesa_unlock_debug_state(ctx);
+      callback(gl_source, gl_type, id, gl_severity, len, buf, data);
+   }
+   else {
+      debug_log_message(ctx->Debug, source, type, id, severity, len, buf);
+      _mesa_unlock_debug_state(ctx);
+   }
+}
 
 /**
  * Log a client or driver debug message.
@@ -799,24 +868,12 @@ log_msg(struct gl_context *ctx, enum mesa_debug_source source, enum
[Mesa-dev] [PATCHv3 16/16] i965: enable threaded precompile
Inherit gl_shader_program and add save/restore functions to save precompile
results in the shader programs. When DeferLinkProgram is set, we will save
the precompile results instead of uploading them immediately because we may
be on a different thread.

A few other modifications are also needed.
brw_shader_program_precompile_key is introduced and initialized in
NotifyLinkShader because we cannot inspect the context during precompiling.

Signed-off-by: Chia-I Wu o...@lunarg.com
Acked-by: Ian Romanick ian.d.roman...@intel.com
---
 src/mesa/drivers/dri/i965/brw_context.c  |   4 +-
 src/mesa/drivers/dri/i965/brw_fs.cpp     |  33 --
 src/mesa/drivers/dri/i965/brw_program.c  |   1 +
 src/mesa/drivers/dri/i965/brw_shader.cpp | 177 ++-
 src/mesa/drivers/dri/i965/brw_shader.h   |  44
 src/mesa/drivers/dri/i965/brw_vec4_gs.c  |  37 +--
 src/mesa/drivers/dri/i965/brw_vs.c       |  36 +--
 src/mesa/drivers/dri/i965/brw_wm.c       |  23 ++--
 src/mesa/drivers/dri/i965/brw_wm.h       |   5 +-
 9 files changed, 310 insertions(+), 50 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c
index b02128c..70e61f7 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -839,8 +839,8 @@ brwCreateContext(gl_api api,
    if (INTEL_DEBUG & DEBUG_SHADER_TIME)
       brw_init_shader_time(brw);
 
-   /* brw_shader_precompile is not thread-safe */
-   if (brw->precompile)
+   /* brw_shader_precompile is not thread-safe when debug flags are set */
+   if (brw->precompile && (INTEL_DEBUG || brw->perf_debug))
       ctx->Const.DeferLinkProgram = GL_FALSE;
 
    _mesa_compute_version(ctx);
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 5c70f50..393a262 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -3452,6 +3452,8 @@ bool
 brw_fs_precompile(struct gl_context *ctx, struct gl_shader_program *prog)
 {
    struct brw_context *brw = brw_context(ctx);
+   const struct brw_shader_program_precompile_key *pre_key =
+      brw_shader_program_get_precompile_key(prog);
    struct brw_wm_prog_key key;
 
    if (!prog->_LinkedShaders[MESA_SHADER_FRAGMENT])
@@ -3493,7 +3495,7 @@ brw_fs_precompile(struct gl_context *ctx, struct gl_shader_program *prog)
    }
 
    if (fp->Base.InputsRead & VARYING_BIT_POS) {
-      key.drawable_height = ctx->DrawBuffer->Height;
+      key.drawable_height = pre_key->fbo_height;
    }
 
    key.nr_color_regions = _mesa_bitcount_64(fp->Base.OutputsWritten &
@@ -3501,7 +3503,7 @@ brw_fs_precompile(struct gl_context *ctx, struct gl_shader_program *prog)
          BITFIELD64_BIT(FRAG_RESULT_SAMPLE_MASK)));
 
    if ((fp->Base.InputsRead & VARYING_BIT_POS) || program_uses_dfdy) {
-      key.render_to_fbo = _mesa_is_user_fbo(ctx->DrawBuffer) ||
+      key.render_to_fbo = pre_key->is_user_fbo ||
                           key.nr_color_regions > 1;
    }
 
@@ -3513,13 +3515,28 @@ brw_fs_precompile(struct gl_context *ctx, struct gl_shader_program *prog)
 
    key.program_string_id = bfp->id;
 
-   uint32_t old_prog_offset = brw->wm.base.prog_offset;
-   struct brw_wm_prog_data *old_prog_data = brw->wm.prog_data;
+   struct brw_wm_compile c;
 
-   bool success = do_wm_prog(brw, prog, bfp, &key);
+   brw_wm_init_compile(brw, prog, bfp, &key, &c);
+   if (!brw_wm_do_compile(brw, &c)) {
+      brw_wm_clear_compile(brw, &c);
+      return false;
+   }
+
+   if (brw->ctx.Const.DeferLinkProgram) {
+      brw_shader_program_save_wm_compile(prog, &c);
+   }
+   else {
+      uint32_t old_prog_offset = brw->wm.base.prog_offset;
+      struct brw_wm_prog_data *old_prog_data = brw->wm.prog_data;
 
-   brw->wm.base.prog_offset = old_prog_offset;
-   brw->wm.prog_data = old_prog_data;
+      brw_wm_upload_compile(brw, &c);
 
-   return success;
+      brw->wm.base.prog_offset = old_prog_offset;
+      brw->wm.prog_data = old_prog_data;
+   }
+
+   brw_wm_clear_compile(brw, &c);
+
+   return true;
 }
diff --git a/src/mesa/drivers/dri/i965/brw_program.c b/src/mesa/drivers/dri/i965/brw_program.c
index d782b4f..35fd69a 100644
--- a/src/mesa/drivers/dri/i965/brw_program.c
+++ b/src/mesa/drivers/dri/i965/brw_program.c
@@ -259,6 +259,7 @@ void brwInitFragProgFuncs( struct dd_function_table *functions )
    functions->NewShader = brw_new_shader;
    functions->NewShaderProgram = brw_new_shader_program;
    functions->LinkShader = brw_link_shader;
+   functions->NotifyLinkShader = brw_notify_link_shader;
 }
 
 void
diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp b/src/mesa/drivers/dri/i965/brw_shader.cpp
index 28db29a..29f4a19 100644
--- a/src/mesa/drivers/dri/i965/brw_shader.cpp
+++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
@@ -25,14 +25,52 @@ extern "C" {
 #include "main/macros.h"
 #include "brw_context.h"
 }
+#include "brw_shader.h"
 #include "brw_vs.h"
 #include "brw_vec4_gs.h"
+#include "brw_vec4_gs_visitor.h"
 #include "brw_fs.h"
 #include "brw_cfg.h"
[Mesa-dev] [PATCHv3 04/16] util: move simple_list.h from core to util
It belongs to util, and we will need it from within util. Signed-off-by: Chia-I Wu o...@lunarg.com --- src/mesa/drivers/dri/i915/i830_texblend.c | 2 +- src/mesa/drivers/dri/i915/intel_syncobj.c | 2 +- src/mesa/drivers/dri/r200/r200_cmdbuf.c| 2 +- src/mesa/drivers/dri/r200/r200_context.c | 3 +- src/mesa/drivers/dri/r200/r200_ioctl.h | 2 +- src/mesa/drivers/dri/r200/r200_swtcl.c | 2 +- src/mesa/drivers/dri/r200/r200_tex.c | 2 +- .../drivers/dri/radeon/radeon_common_context.c | 2 +- src/mesa/drivers/dri/radeon/radeon_context.c | 3 +- src/mesa/drivers/dri/radeon/radeon_dma.c | 2 +- src/mesa/drivers/dri/radeon/radeon_ioctl.c | 2 +- src/mesa/drivers/dri/radeon/radeon_ioctl.h | 2 +- src/mesa/drivers/dri/radeon/radeon_mipmap_tree.c | 2 +- src/mesa/drivers/dri/radeon/radeon_queryobj.c | 2 +- src/mesa/drivers/dri/radeon/radeon_queryobj.h | 2 +- src/mesa/drivers/dri/radeon/radeon_state.c | 2 +- src/mesa/drivers/dri/radeon/radeon_swtcl.c | 3 +- src/mesa/drivers/dri/radeon/radeon_tex.c | 2 +- src/mesa/main/context.c| 2 +- src/mesa/main/enable.c | 2 +- src/mesa/main/errors.c | 1 + src/mesa/main/light.c | 2 +- src/mesa/main/mtypes.h | 1 - src/mesa/main/simple_list.h| 210 - src/mesa/program/prog_hash_table.c | 2 +- src/mesa/tnl/t_context.c | 1 + src/mesa/tnl/t_rasterpos.c | 2 +- src/mesa/tnl/t_vb_light.c | 2 +- src/mesa/tnl/t_vertex_generic.c| 2 +- src/mesa/tnl/t_vertex_sse.c| 2 +- src/util/simple_list.h | 210 + 31 files changed, 241 insertions(+), 237 deletions(-) delete mode 100644 src/mesa/main/simple_list.h create mode 100644 src/util/simple_list.h diff --git a/src/mesa/drivers/dri/i915/i830_texblend.c b/src/mesa/drivers/dri/i915/i830_texblend.c index 236be59..f55d941 100644 --- a/src/mesa/drivers/dri/i915/i830_texblend.c +++ b/src/mesa/drivers/dri/i915/i830_texblend.c @@ -28,9 +28,9 @@ #include main/glheader.h #include main/macros.h #include main/mtypes.h -#include main/simple_list.h #include main/enums.h #include main/mm.h +#include util/simple_list.h #include intel_screen.h 
#include intel_tex.h diff --git a/src/mesa/drivers/dri/i915/intel_syncobj.c b/src/mesa/drivers/dri/i915/intel_syncobj.c index 9657d9a..95d0b16 100644 --- a/src/mesa/drivers/dri/i915/intel_syncobj.c +++ b/src/mesa/drivers/dri/i915/intel_syncobj.c @@ -38,8 +38,8 @@ * performance bottleneck, though. */ -#include main/simple_list.h #include main/imports.h +#include util/simple_list.h #include intel_context.h #include intel_batchbuffer.h diff --git a/src/mesa/drivers/dri/r200/r200_cmdbuf.c b/src/mesa/drivers/dri/r200/r200_cmdbuf.c index 1e6c0d8..13ac5af 100644 --- a/src/mesa/drivers/dri/r200/r200_cmdbuf.c +++ b/src/mesa/drivers/dri/r200/r200_cmdbuf.c @@ -35,7 +35,7 @@ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. #include main/imports.h #include main/macros.h #include main/context.h -#include main/simple_list.h +#include util/simple_list.h #include radeon_common.h #include r200_context.h diff --git a/src/mesa/drivers/dri/r200/r200_context.c b/src/mesa/drivers/dri/r200/r200_context.c index 7815c4e..41040a6 100644 --- a/src/mesa/drivers/dri/r200/r200_context.c +++ b/src/mesa/drivers/dri/r200/r200_context.c @@ -37,7 +37,6 @@ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. #include main/api_arrayelt.h #include main/api_exec.h #include main/context.h -#include main/simple_list.h #include main/imports.h #include main/extensions.h #include main/version.h @@ -50,6 +49,8 @@ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. #include tnl/tnl.h #include tnl/t_pipeline.h +#include util/simple_list.h + #include drivers/common/driverfuncs.h #include r200_context.h diff --git a/src/mesa/drivers/dri/r200/r200_ioctl.h b/src/mesa/drivers/dri/r200/r200_ioctl.h index ab5f822..384787c 100644 --- a/src/mesa/drivers/dri/r200/r200_ioctl.h +++ b/src/mesa/drivers/dri/r200/r200_ioctl.h @@ -35,7 +35,7 @@ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
#ifndef __R200_IOCTL_H__ #define __R200_IOCTL_H__ -#include main/simple_list.h +#include util/simple_list.h #include radeon_dri.h #include radeon_bo_gem.h diff --git a/src/mesa/drivers/dri/r200/r200_swtcl.c b/src/mesa/drivers/dri/r200/r200_swtcl.c index 07c64f8..c324d53 100644 --- a/src/mesa/drivers/dri/r200/r200_swtcl.c +++ b/src/mesa/drivers/dri/r200/r200_swtcl.c @@ -39,7 +39,6 @@ WITH THE SOFTWARE OR THE
[Mesa-dev] [PATCHv3 08/16] glsl: protect glsl_type with a mutex
glsl_type has several static hash tables and a static ralloc context. They need to be protected by a mutex as they are not thread-safe. Signed-off-by: Chia-I Wu o...@lunarg.com Reviewed-by: Brian Paul bri...@vmware.com Reviewed-by: Ian Romanick ian.d.roman...@intel.com --- src/glsl/glsl_types.cpp | 57 +++-- src/glsl/glsl_types.h | 15 + 2 files changed, 62 insertions(+), 10 deletions(-) diff --git a/src/glsl/glsl_types.cpp b/src/glsl/glsl_types.cpp index 66e9b13..74ec40f 100644 --- a/src/glsl/glsl_types.cpp +++ b/src/glsl/glsl_types.cpp @@ -29,6 +29,7 @@ extern C { #include program/hash_table.h } +mtx_t glsl_type::mutex = _MTX_INITIALIZER_NP; hash_table *glsl_type::array_types = NULL; hash_table *glsl_type::record_types = NULL; hash_table *glsl_type::interface_types = NULL; @@ -53,9 +54,14 @@ glsl_type::glsl_type(GLenum gl_type, vector_elements(vector_elements), matrix_columns(matrix_columns), length(0) { + mtx_lock(glsl_type::mutex); + init_ralloc_type_ctx(); assert(name != NULL); this-name = ralloc_strdup(this-mem_ctx, name); + + mtx_unlock(glsl_type::mutex); + /* Neither dimension is zero or both dimensions are zero. 
*/ assert((vector_elements == 0) == (matrix_columns == 0)); @@ -71,9 +77,14 @@ glsl_type::glsl_type(GLenum gl_type, glsl_base_type base_type, sampler_array(array), sampler_type(type), interface_packing(0), length(0) { + mtx_lock(glsl_type::mutex); + init_ralloc_type_ctx(); assert(name != NULL); this-name = ralloc_strdup(this-mem_ctx, name); + + mtx_unlock(glsl_type::mutex); + memset( fields, 0, sizeof(fields)); if (base_type == GLSL_TYPE_SAMPLER) { @@ -95,11 +106,14 @@ glsl_type::glsl_type(const glsl_struct_field *fields, unsigned num_fields, { unsigned int i; + mtx_lock(glsl_type::mutex); + init_ralloc_type_ctx(); assert(name != NULL); this-name = ralloc_strdup(this-mem_ctx, name); this-fields.structure = ralloc_array(this-mem_ctx, glsl_struct_field, length); + for (i = 0; i length; i++) { this-fields.structure[i].type = fields[i].type; this-fields.structure[i].name = ralloc_strdup(this-fields.structure, @@ -110,6 +124,8 @@ glsl_type::glsl_type(const glsl_struct_field *fields, unsigned num_fields, this-fields.structure[i].sample = fields[i].sample; this-fields.structure[i].matrix_layout = fields[i].matrix_layout; } + + mtx_unlock(glsl_type::mutex); } glsl_type::glsl_type(const glsl_struct_field *fields, unsigned num_fields, @@ -123,6 +139,8 @@ glsl_type::glsl_type(const glsl_struct_field *fields, unsigned num_fields, { unsigned int i; + mtx_lock(glsl_type::mutex); + init_ralloc_type_ctx(); assert(name != NULL); this-name = ralloc_strdup(this-mem_ctx, name); @@ -138,6 +156,8 @@ glsl_type::glsl_type(const glsl_struct_field *fields, unsigned num_fields, this-fields.structure[i].sample = fields[i].sample; this-fields.structure[i].matrix_layout = fields[i].matrix_layout; } + + mtx_unlock(glsl_type::mutex); } @@ -285,6 +305,8 @@ const glsl_type *glsl_type::get_scalar_type() const void _mesa_glsl_release_types(void) { + mtx_lock(glsl_type::mutex); + if (glsl_type::array_types != NULL) { hash_table_dtor(glsl_type::array_types); glsl_type::array_types = NULL; @@ -294,6 
+316,8 @@ _mesa_glsl_release_types(void) hash_table_dtor(glsl_type::record_types); glsl_type::record_types = NULL; } + + mtx_unlock(glsl_type::mutex); } @@ -316,7 +340,10 @@ glsl_type::glsl_type(const glsl_type *array, unsigned length) : * NUL. */ const unsigned name_length = strlen(array-name) + 10 + 3; + + mtx_lock(glsl_type::mutex); char *const n = (char *) ralloc_size(this-mem_ctx, name_length); + mtx_unlock(glsl_type::mutex); if (length == 0) snprintf(n, name_length, %s[], array-name); @@ -452,12 +479,6 @@ glsl_type::get_instance(unsigned base_type, unsigned rows, unsigned columns) const glsl_type * glsl_type::get_array_instance(const glsl_type *base, unsigned array_size) { - - if (array_types == NULL) { - array_types = hash_table_ctor(64, hash_table_string_hash, - hash_table_string_compare); - } - /* Generate a name using the base type pointer in the key. This is * done because the name of the base type may not be unique across * shaders. For example, two shaders may have different record types @@ -466,9 +487,19 @@ glsl_type::get_array_instance(const glsl_type *base, unsigned array_size) char key[128]; snprintf(key, sizeof(key), %p[%u], (void *) base, array_size); + mtx_lock(glsl_type::mutex); + + if (array_types == NULL) { + array_types = hash_table_ctor(64, hash_table_string_hash, + hash_table_string_compare); + } + const
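The locking pattern in this patch (build the lookup key without the lock, then hold the mutex across the lazy table creation, lookup, and insert) can be sketched roughly as below. This is only an illustration under stated assumptions: `type_entry`, `get_array_type_id`, and the linked list are hypothetical stand-ins for glsl_type's static hash tables, and a pthread mutex is swapped in for Mesa's `mtx_t` / `_MTX_INITIALIZER_NP`.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for glsl_type::array_types. A linked list keeps the
 * sketch short; the patch itself uses a hash table. */
struct type_entry {
   char key[64];
   unsigned id;
   struct type_entry *next;
};

/* Assumption: pthread mutex in place of Mesa's mtx_t. */
static pthread_mutex_t type_mutex = PTHREAD_MUTEX_INITIALIZER;
static struct type_entry *array_types = NULL;
static unsigned next_type_id = 1;

unsigned
get_array_type_id(const void *base, unsigned array_size)
{
   char key[64];
   struct type_entry *e;
   unsigned id;

   /* Building the key touches no shared state, so it stays outside the lock. */
   snprintf(key, sizeof(key), "%p[%u]", (void *) base, array_size);

   pthread_mutex_lock(&type_mutex);

   /* Lookup and insert must happen under the same lock acquisition;
    * otherwise two threads could both miss and insert duplicates. */
   for (e = array_types; e != NULL; e = e->next) {
      if (strcmp(e->key, key) == 0)
         break;
   }

   if (e == NULL) {
      e = calloc(1, sizeof(*e));
      assert(e != NULL);
      memcpy(e->key, key, sizeof(key));
      e->id = next_type_id++;
      e->next = array_types;
      array_types = e;
   }

   id = e->id;
   pthread_mutex_unlock(&type_mutex);

   return id;
}
```

Copying the result into a local before unlocking means the caller never reads the shared table without holding the mutex.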
[Mesa-dev] [PATCHv3 12/16] i965: add drirc option multithread_glsl_compiler
Setting it to a non-zero value N will cause shader compilation to be deferred to a thread pool. When N is greater than 1, it indicates the maximum number of threads in the pool. When N is 1, the number of threads is up to the driver (two for i965). Signed-off-by: Chia-I Wu o...@lunarg.com Reviewed-by: Ian Romanick ian.d.roman...@intel.com --- src/mesa/drivers/dri/common/xmlpool/t_options.h | 4 ++++ src/mesa/drivers/dri/i965/brw_context.c | 15 +++++++++++++++ src/mesa/drivers/dri/i965/intel_screen.c | 2 ++ 3 files changed, 21 insertions(+) diff --git a/src/mesa/drivers/dri/common/xmlpool/t_options.h b/src/mesa/drivers/dri/common/xmlpool/t_options.h index b73a662..7ac0298 100644 --- a/src/mesa/drivers/dri/common/xmlpool/t_options.h +++ b/src/mesa/drivers/dri/common/xmlpool/t_options.h @@ -298,6 +298,10 @@ DRI_CONF_OPT_BEGIN_V(texture_heaps,enum,def,"0:2") \ DRI_CONF_DESC_END \ DRI_CONF_OPT_END +#define DRI_CONF_MULTITHREAD_GLSL_COMPILER(def) \ +DRI_CONF_OPT_BEGIN(multithread_glsl_compiler, int, def) \ +DRI_CONF_DESC(en,gettext("Enable multithreading in the GLSL compiler")) \ +DRI_CONF_OPT_END /** diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c index 216b788..b02128c 100644 --- a/src/mesa/drivers/dri/i965/brw_context.c +++ b/src/mesa/drivers/dri/i965/brw_context.c @@ -624,6 +624,17 @@ brw_process_driconf_options(struct brw_context *brw) ctx->Const.AllowGLSLExtensionDirectiveMidShader = driQueryOptionb(options, "allow_glsl_extension_directive_midshader"); + + const int multithread_glsl_compiler = + driQueryOptioni(options, "multithread_glsl_compiler"); + if (multithread_glsl_compiler > 0) { + const int max_threads = (multithread_glsl_compiler > 1) ? 
+ multithread_glsl_compiler : 2; + + _mesa_enable_glsl_threadpool(ctx, max_threads); + ctx->Const.DeferCompileShader = GL_TRUE; + ctx->Const.DeferLinkProgram = GL_TRUE; + } } GLboolean @@ -828,6 +839,10 @@ brwCreateContext(gl_api api, if (INTEL_DEBUG & DEBUG_SHADER_TIME) brw_init_shader_time(brw); + /* brw_shader_precompile is not thread-safe */ + if (brw->precompile) + ctx->Const.DeferLinkProgram = GL_FALSE; + _mesa_compute_version(ctx); _mesa_initialize_dispatch_tables(ctx); diff --git a/src/mesa/drivers/dri/i965/intel_screen.c b/src/mesa/drivers/dri/i965/intel_screen.c index 9e743ee..95850c1 100644 --- a/src/mesa/drivers/dri/i965/intel_screen.c +++ b/src/mesa/drivers/dri/i965/intel_screen.c @@ -48,6 +48,8 @@ static const __DRIconfigOptionsExtension brw_config_options = { DRI_CONF_BEGIN DRI_CONF_SECTION_PERFORMANCE DRI_CONF_VBLANK_MODE(DRI_CONF_VBLANK_ALWAYS_SYNC) + DRI_CONF_MULTITHREAD_GLSL_COMPILER(0) + /* Options correspond to DRI_CONF_BO_REUSE_DISABLED, * DRI_CONF_BO_REUSE_ALL */ -- 2.0.0.rc2
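How the option value maps to driver state, as described in the commit message, can be sketched as a pure function. The struct and function names here are hypothetical and only mirror the flags the patch sets (`DeferCompileShader`, `DeferLinkProgram`); the default of two threads is the i965 choice stated above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical summary of the state the patch derives from the option. */
struct compile_config {
   int max_threads;      /* 0 means the thread pool is not enabled */
   bool defer_compile;   /* mirrors ctx->Const.DeferCompileShader */
   bool defer_link;      /* mirrors ctx->Const.DeferLinkProgram */
};

struct compile_config
resolve_compile_config(int option, bool precompile)
{
   struct compile_config cfg = { 0, false, false };

   if (option > 0) {
      /* N > 1 is an explicit limit; N == 1 leaves the count to the driver,
       * which the commit message gives as two for i965. */
      cfg.max_threads = (option > 1) ? option : 2;
      cfg.defer_compile = true;
      cfg.defer_link = true;
   }

   /* Precompile is not thread-safe in the patch, so linking is never
    * deferred when it is on. */
   if (precompile)
      cfg.defer_link = false;

   /* Deferred linking only makes sense with deferred compilation. */
   assert(!cfg.defer_link || cfg.defer_compile);

   return cfg;
}
```

With the default `DRI_CONF_MULTITHREAD_GLSL_COMPILER(0)` the result is all zeros, so nothing changes unless the user opts in through drirc.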
[Mesa-dev] [PATCHv3 14/16] i965: refactor do_gs_prog
Split do_gs_prog into brw_gs_init_compile, brw_gs_do_compile, brw_gs_upload_compile, and brw_gs_clear_compile. Signed-off-by: Chia-I Wu o...@lunarg.com Acked-by: Ian Romanick ian.d.roman...@intel.com --- src/mesa/drivers/dri/i965/brw_vec4_gs.c | 161 1 file changed, 102 insertions(+), 59 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs.c b/src/mesa/drivers/dri/i965/brw_vec4_gs.c index 5b2ed51..04407b8 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_gs.c +++ b/src/mesa/drivers/dri/i965/brw_vec4_gs.c @@ -33,22 +33,29 @@ #include "brw_state.h" -static bool -do_gs_prog(struct brw_context *brw, - struct gl_shader_program *prog, - struct brw_geometry_program *gp, - struct brw_gs_prog_key *key) +static void +brw_gs_init_compile(struct brw_context *brw, +struct gl_shader_program *prog, +struct brw_geometry_program *gp, +const struct brw_gs_prog_key *key, +struct brw_gs_compile *c) { - struct brw_stage_state *stage_state = &brw->gs.base; - struct brw_gs_compile c; - memset(&c, 0, sizeof(c)); - c.key = *key; - c.gp = gp; + memset(c, 0, sizeof(*c)); - c.prog_data.include_primitive_id = - (gp->program.Base.InputsRead & VARYING_BIT_PRIMITIVE_ID) != 0; + c->key = *key; + c->gp = gp; + c->base.shader_prog = prog; + c->base.mem_ctx = ralloc_context(NULL); +} - c.prog_data.invocations = gp->program.Invocations; +static bool +brw_gs_do_compile(struct brw_context *brw, + struct brw_gs_compile *c) +{ + c->prog_data.include_primitive_id = + (c->gp->program.Base.InputsRead & VARYING_BIT_PRIMITIVE_ID) != 0; + + c->prog_data.invocations = c->gp->program.Invocations; /* Allocate the references to the uniforms that will end up in the * prog_data associated with the compiled program, and which will be freed @@ -58,34 +65,37 @@ do_gs_prog(struct brw_context *brw, * padding around uniform values below vec4 size, so the worst case is that * every uniform is a float which gets padded to the size of a vec4. 
*/ - struct gl_shader *gs = prog-_LinkedShaders[MESA_SHADER_GEOMETRY]; + struct gl_shader *gs = + c-base.shader_prog-_LinkedShaders[MESA_SHADER_GEOMETRY]; int param_count = gs-num_uniform_components * 4; /* We also upload clip plane data as uniforms */ param_count += MAX_CLIP_PLANES * 4; - c.prog_data.base.base.param = + c-prog_data.base.base.param = rzalloc_array(NULL, const gl_constant_value *, param_count); - c.prog_data.base.base.pull_param = + c-prog_data.base.base.pull_param = rzalloc_array(NULL, const gl_constant_value *, param_count); /* Setting nr_params here NOT to the size of the param and pull_param * arrays, but to the number of uniform components vec4_visitor * needs. vec4_visitor::setup_uniforms() will set it back to a proper value. */ - c.prog_data.base.base.nr_params = ALIGN(param_count, 4) / 4 + gs-num_samplers; + c-prog_data.base.base.nr_params = + ALIGN(param_count, 4) / 4 + gs-num_samplers; - if (gp-program.OutputType == GL_POINTS) { + if (c-gp-program.OutputType == GL_POINTS) { /* When the output type is points, the geometry shader may output data * to multiple streams, and EndPrimitive() has no effect. So we * configure the hardware to interpret the control data as stream ID. */ - c.prog_data.control_data_format = GEN7_GS_CONTROL_DATA_FORMAT_GSCTL_SID; + c-prog_data.control_data_format = + GEN7_GS_CONTROL_DATA_FORMAT_GSCTL_SID; /* We only have to emit control bits if we are using streams */ - if (prog-Geom.UsesStreams) - c.control_data_bits_per_vertex = 2; + if (c-base.shader_prog-Geom.UsesStreams) + c-control_data_bits_per_vertex = 2; else - c.control_data_bits_per_vertex = 0; + c-control_data_bits_per_vertex = 0; } else { /* When the output type is triangle_strip or line_strip, EndPrimitive() * may be used to terminate the current strip and start a new one @@ -93,32 +103,34 @@ do_gs_prog(struct brw_context *brw, * streams is not supported. So we configure the hardware to interpret * the control data as EndPrimitive information (a.k.a. 
cut bits). */ - c.prog_data.control_data_format = GEN7_GS_CONTROL_DATA_FORMAT_GSCTL_CUT; + c-prog_data.control_data_format = + GEN7_GS_CONTROL_DATA_FORMAT_GSCTL_CUT; /* We only need to output control data if the shader actually calls * EndPrimitive(). */ - c.control_data_bits_per_vertex = gp-program.UsesEndPrimitive ? 1 : 0; + c-control_data_bits_per_vertex = + c-gp-program.UsesEndPrimitive ? 1 : 0; } - c.control_data_header_size_bits = - gp-program.VerticesOut * c.control_data_bits_per_vertex; + c-control_data_header_size_bits = +
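The control-data selection that this hunk carries over unchanged (stream IDs for point output, cut bits otherwise, header size as vertices times bits per vertex) can be written as a small standalone function. The enum, struct, and function names below are hypothetical; only the rules themselves come from the comments and assignments in the diff.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names standing in for the GEN7_GS_CONTROL_DATA_FORMAT_*
 * values used in the patch. */
enum control_data_format {
   CONTROL_DATA_FORMAT_CUT,   /* EndPrimitive ("cut") bits */
   CONTROL_DATA_FORMAT_SID    /* stream IDs */
};

struct gs_control_data {
   enum control_data_format format;
   unsigned bits_per_vertex;
   unsigned header_size_bits;
};

struct gs_control_data
compute_gs_control_data(bool output_is_points, bool uses_streams,
                        bool uses_end_primitive, unsigned vertices_out)
{
   struct gs_control_data cd;

   if (output_is_points) {
      /* Points may go to multiple streams and EndPrimitive() has no effect,
       * so the control data carries stream IDs, and only when streams are
       * actually used. */
      cd.format = CONTROL_DATA_FORMAT_SID;
      cd.bits_per_vertex = uses_streams ? 2 : 0;
   } else {
      /* Strips can be cut by EndPrimitive(); one bit per vertex is needed
       * only if the shader calls it. */
      cd.format = CONTROL_DATA_FORMAT_CUT;
      cd.bits_per_vertex = uses_end_primitive ? 1 : 0;
   }

   assert(cd.bits_per_vertex <= 2);
   cd.header_size_bits = vertices_out * cd.bits_per_vertex;

   return cd;
}
```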
[Mesa-dev] [PATCHv3 15/16] i965: refactor do_wm_prog
Split do_wm_prog into brw_wm_init_compile, brw_wm_do_compile, brw_wm_upload_compile, and brw_wm_clear_compile. Add struct brw_wm_compile to be passed between them. Signed-off-by: Chia-I Wu o...@lunarg.com Acked-by: Ian Romanick ian.d.roman...@intel.com --- src/mesa/drivers/dri/i965/brw_wm.c | 116 - src/mesa/drivers/dri/i965/brw_wm.h | 30 ++ 2 files changed, 106 insertions(+), 40 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_wm.c b/src/mesa/drivers/dri/i965/brw_wm.c index 2e3cd4b..329e82c 100644 --- a/src/mesa/drivers/dri/i965/brw_wm.c +++ b/src/mesa/drivers/dri/i965/brw_wm.c @@ -135,27 +135,30 @@ brw_wm_prog_data_compare(const void *in_a, const void *in_b) return true; } -/** - * All Mesa program -> GPU code generation goes through this function. - * Depending on the instructions used (i.e. flow control instructions) - * we'll use one of two code generators. - */ -bool do_wm_prog(struct brw_context *brw, - struct gl_shader_program *prog, - struct brw_fragment_program *fp, - struct brw_wm_prog_key *key) +void +brw_wm_init_compile(struct brw_context *brw, +struct gl_shader_program *prog, +struct brw_fragment_program *fp, +const struct brw_wm_prog_key *key, +struct brw_wm_compile *c) +{ + memset(c, 0, sizeof(*c)); + + c->shader_prog = prog; + c->fp = fp; + c->key = key; + c->mem_ctx = ralloc_context(NULL); +} + +bool +brw_wm_do_compile(struct brw_context *brw, + struct brw_wm_compile *c) { struct gl_context *ctx = &brw->ctx; - void *mem_ctx = ralloc_context(NULL); - struct brw_wm_prog_data prog_data; - const GLuint *program; struct gl_shader *fs = NULL; - GLuint program_size; - if (prog) - fs = prog->_LinkedShaders[MESA_SHADER_FRAGMENT]; - - memset(&prog_data, 0, sizeof(prog_data)); + if (c->shader_prog) + fs = c->shader_prog->_LinkedShaders[MESA_SHADER_FRAGMENT]; /* Allocate the references to the uniforms that will end up in the * prog_data associated with the compiled program, and which will be freed @@ -165,43 +168,76 @@ bool do_wm_prog(struct brw_context *brw, if (fs) { 
param_count = fs-num_uniform_components; } else { - param_count = fp-program.Base.Parameters-NumParameters * 4; + param_count = c-fp-program.Base.Parameters-NumParameters * 4; } /* The backend also sometimes adds params for texture size. */ param_count += 2 * ctx-Const.Program[MESA_SHADER_FRAGMENT].MaxTextureImageUnits; - prog_data.base.param = + c-prog_data.base.param = rzalloc_array(NULL, const gl_constant_value *, param_count); - prog_data.base.pull_param = + c-prog_data.base.pull_param = rzalloc_array(NULL, const gl_constant_value *, param_count); - prog_data.base.nr_params = param_count; + c-prog_data.base.nr_params = param_count; - prog_data.barycentric_interp_modes = - brw_compute_barycentric_interp_modes(brw, key-flat_shade, - key-persample_shading, - fp-program); + c-prog_data.barycentric_interp_modes = + brw_compute_barycentric_interp_modes(brw, c-key-flat_shade, + c-key-persample_shading, + c-fp-program); - program = brw_wm_fs_emit(brw, mem_ctx, key, prog_data, -fp-program, prog, program_size); - if (program == NULL) { - ralloc_free(mem_ctx); + c-program = brw_wm_fs_emit(brw, c-mem_ctx, c-key, c-prog_data, + c-fp-program, c-shader_prog, + c-program_size); + if (c-program == NULL) return false; - } - - if (prog_data.total_scratch) { - brw_get_scratch_bo(brw, brw-wm.base.scratch_bo, -prog_data.total_scratch * brw-max_wm_threads); - } if (unlikely(INTEL_DEBUG DEBUG_WM)) fprintf(stderr, \n); + return true; +} + +void +brw_wm_upload_compile(struct brw_context *brw, + const struct brw_wm_compile *c) +{ + if (c-prog_data.total_scratch) { + brw_get_scratch_bo(brw, brw-wm.base.scratch_bo, + c-prog_data.total_scratch * brw-max_wm_threads); + } + brw_upload_cache(brw-cache, BRW_WM_PROG, - key, sizeof(struct brw_wm_prog_key), - program, program_size, - prog_data, sizeof(prog_data), - brw-wm.base.prog_offset, brw-wm.prog_data); +c-key, sizeof(struct brw_wm_prog_key), +c-program, c-program_size, +c-prog_data, sizeof(c-prog_data), +brw-wm.base.prog_offset, 
brw-wm.prog_data); +} + +void +brw_wm_clear_compile(struct brw_context *brw, + struct brw_wm_compile *c) +{ +
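The point of the four-way split is that the expensive step only touches the per-compile state, so it can run on a worker thread, while the upload step is the only one that writes to shared context state. A rough, self-contained sketch of that lifecycle follows. Everything in it is hypothetical (`compile_state`, `program_cache`, and the `compile_*` functions stand in for `brw_wm_compile`, the context's program cache, and the `brw_wm_*_compile` functions), and the "compilation" is just a string copy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-compile state; plays the role of struct brw_wm_compile. */
struct compile_state {
   const char *source;
   char *program;          /* owned by the state, released in compile_clear */
   size_t program_size;
};

/* Hypothetical shared destination; plays the role of the context's cache. */
struct program_cache {
   char data[64];
   size_t size;
};

void
compile_init(struct compile_state *c, const char *source)
{
   memset(c, 0, sizeof(*c));
   c->source = source;
}

/* Touches only *c, so it is safe to run off the context's thread. */
bool
compile_do(struct compile_state *c)
{
   if (c->source == NULL || c->source[0] == '\0')
      return false;

   c->program_size = strlen(c->source);
   c->program = malloc(c->program_size + 1);
   if (c->program == NULL)
      return false;

   memcpy(c->program, c->source, c->program_size + 1);
   return true;
}

/* The only step that writes shared state. */
void
compile_upload(struct program_cache *cache, const struct compile_state *c)
{
   assert(c->program != NULL && c->program_size < sizeof(cache->data));
   memcpy(cache->data, c->program, c->program_size + 1);
   cache->size = c->program_size;
}

void
compile_clear(struct compile_state *c)
{
   free(c->program);
   memset(c, 0, sizeof(*c));
}

/* Synchronous path: all four steps back to back. */
bool
compile_and_upload(struct program_cache *cache, const char *source)
{
   struct compile_state c;

   compile_init(&c, source);
   if (!compile_do(&c)) {
      compile_clear(&c);
      return false;
   }
   compile_upload(cache, &c);
   compile_clear(&c);
   return true;
}
```

A deferred path would keep the `compile_state` alive after `compile_do` and call `compile_upload` later from the context's own thread, which is the reason for not folding the steps back into one function.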
[Mesa-dev] [PATCHv3 09/16] glsl: integrate with the singleton thread pool
The singleton thread pool will be used by contexts to queue compilation tasks. We need to control its lifetime from the compiler. Signed-off-by: Chia-I Wu o...@lunarg.com --- src/glsl/glsl_parser_extras.cpp | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/src/glsl/glsl_parser_extras.cpp b/src/glsl/glsl_parser_extras.cpp index b17cdb1..9342908 100644 --- a/src/glsl/glsl_parser_extras.cpp +++ b/src/glsl/glsl_parser_extras.cpp @@ -32,6 +32,7 @@ extern "C" { } #include "util/ralloc.h" +#include "util/threadpool.h" #include "ast.h" #include "glsl_parser_extras.h" #include "glsl_parser.h" @@ -1626,6 +1627,8 @@ extern "C" { void _mesa_destroy_shader_compiler(void) { + _mesa_threadpool_destroy_singleton(); + _mesa_destroy_shader_compiler_caches(); _mesa_glsl_release_types(); @@ -1639,6 +1642,7 @@ _mesa_destroy_shader_compiler(void) void _mesa_destroy_shader_compiler_caches(void) { + _mesa_threadpool_wait_singleton(); _mesa_glsl_release_builtin_functions(); } -- 2.0.0.rc2
[Mesa-dev] [PATCHv3 01/16] util: add _mesa_strtod and _mesa_strtof
Both core mesa and glsl have their own wrappers for strtof_l. Merge and move them to util/. They are compiled with a C++ compiler so that we can make them thread-safe in a following commit. Signed-off-by: Chia-I Wu o...@lunarg.com --- src/glsl/Makefile.sources| 3 +- src/glsl/glsl_lexer.ll | 12 +++--- src/glsl/s_expression.cpp| 2 +- src/glsl/s_expression.h | 2 +- src/glsl/strtod.c| 79 --- src/glsl/strtod.h| 46 --- src/mesa/main/imports.c | 19 -- src/mesa/main/imports.h | 3 -- src/mesa/program/program_lexer.l | 1 + src/util/Makefile.sources| 3 +- src/util/strtod.cpp | 81 src/util/strtod.h| 46 +++ 12 files changed, 139 insertions(+), 158 deletions(-) delete mode 100644 src/glsl/strtod.c delete mode 100644 src/glsl/strtod.h create mode 100644 src/util/strtod.cpp create mode 100644 src/util/strtod.h diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources index 2131dda..472ad89 100644 --- a/src/glsl/Makefile.sources +++ b/src/glsl/Makefile.sources @@ -101,8 +101,7 @@ LIBGLSL_FILES = \ $(GLSL_SRCDIR)/opt_swizzle_swizzle.cpp \ $(GLSL_SRCDIR)/opt_tree_grafting.cpp \ $(GLSL_SRCDIR)/opt_vectorize.cpp \ - $(GLSL_SRCDIR)/s_expression.cpp \ - $(GLSL_SRCDIR)/strtod.c + $(GLSL_SRCDIR)/s_expression.cpp # glsl_compiler diff --git a/src/glsl/glsl_lexer.ll b/src/glsl/glsl_lexer.ll index b7c4aad..ed2f26d 100644 --- a/src/glsl/glsl_lexer.ll +++ b/src/glsl/glsl_lexer.ll @@ -23,7 +23,7 @@ */ #include ctype.h #include limits.h -#include strtod.h +#include util/strtod.h #include ast.h #include glsl_parser_extras.h #include glsl_parser.h @@ -448,23 +448,23 @@ layout{ } [0-9]+\.[0-9]+([eE][+-]?[0-9]+)?[fF]? 
{ - yylval-real = glsl_strtof(yytext, NULL); + yylval-real = _mesa_strtof(yytext, NULL); return FLOATCONSTANT; } \.[0-9]+([eE][+-]?[0-9]+)?[fF]?{ - yylval-real = glsl_strtof(yytext, NULL); + yylval-real = _mesa_strtof(yytext, NULL); return FLOATCONSTANT; } [0-9]+\.([eE][+-]?[0-9]+)?[fF]?{ - yylval-real = glsl_strtof(yytext, NULL); + yylval-real = _mesa_strtof(yytext, NULL); return FLOATCONSTANT; } [0-9]+[eE][+-]?[0-9]+[fF]? { - yylval-real = glsl_strtof(yytext, NULL); + yylval-real = _mesa_strtof(yytext, NULL); return FLOATCONSTANT; } [0-9]+[fF] { - yylval-real = glsl_strtof(yytext, NULL); + yylval-real = _mesa_strtof(yytext, NULL); return FLOATCONSTANT; } diff --git a/src/glsl/s_expression.cpp b/src/glsl/s_expression.cpp index 1a28e1d..2928a4d 100644 --- a/src/glsl/s_expression.cpp +++ b/src/glsl/s_expression.cpp @@ -73,7 +73,7 @@ read_atom(void *ctx, const char *src, char *symbol_buffer) } else { // Check if the atom is a number. char *float_end = NULL; - float f = glsl_strtof(src, float_end); + float f = _mesa_strtof(src, float_end); if (float_end != src) { char *int_end = NULL; int i = strtol(src, int_end, 10); diff --git a/src/glsl/s_expression.h b/src/glsl/s_expression.h index 642af19..1d47535 100644 --- a/src/glsl/s_expression.h +++ b/src/glsl/s_expression.h @@ -27,7 +27,7 @@ #define S_EXPRESSION_H #include main/core.h /* for Elements */ -#include strtod.h +#include util/strtod.h #include list.h /* Type-safe downcasting macros (also safe to pass NULL) */ diff --git a/src/glsl/strtod.c b/src/glsl/strtod.c deleted file mode 100644 index 5d4346b..000 --- a/src/glsl/strtod.c +++ /dev/null @@ -1,79 +0,0 @@ -/* - * Copyright 2010 VMware, Inc. - * All Rights Reserved. 
- * - * Permission is hereby granted, free of charge, to any person obtaining a - * copy of this software and associated documentation files (the - * Software), to deal in the Software without restriction, including - * without limitation the rights to use, copy, modify, merge, publish, - * distribute, sub license, and/or sell copies of the Software, and to - * permit persons to whom the Software is furnished to do so, subject to - * the following conditions: - * - * The above copyright notice and this permission notice (including the - * next paragraph) shall be included in all copies or substantial portions - * of the Software. - * - * THE SOFTWARE IS
[Mesa-dev] [PATCHv3 00/16] multithread shader compiler
Hi, This is v3 of the series. It should include all the changes I promised to make. There are some new or split patches because _mesa_strtof, simple_list, and the thread pool have moved to src/util/. To summarize: Patches 1-3 merge the mesa and glsl strtof wrappers and move them to src/util/. They go on to clean up the #ifdef hell and make the wrappers thread-safe. Patches 4-6 add a generic thread pool to src/util/. Patch 4 moves simple_list.h from core to util because the thread pool needs it. Patches 7-11 fix thread-safety issues in the frontend compiler and add the infrastructure for multithreaded compilation to the core. Patches 12-16 fix i965 and add the drirc option to enable the multithreaded compiler.
[Mesa-dev] [PATCHv3 06/16] util: allow the thread pool to be used as a singleton
To have real control over the number of driver threads, we almost never want more than a single thread pool. Signed-off-by: Chia-I Wu o...@lunarg.com Reviewed-by: Brian Paul bri...@vmware.com Reviewed-by: Ian Romanick ian.d.roman...@intel.com v2: split glsl changes to another commit --- src/util/threadpool.c | 72 +++ src/util/threadpool.h | 9 +++ 2 files changed, 81 insertions(+) diff --git a/src/util/threadpool.c b/src/util/threadpool.c index 224f411..9dc91f9 100644 --- a/src/util/threadpool.c +++ b/src/util/threadpool.c @@ -55,6 +55,7 @@ struct _mesa_threadpool_task { struct _mesa_threadpool { mtx_t mutex; int refcnt; + bool shutdown; enum _mesa_threadpool_control thread_control; thrd_t *threads; @@ -168,6 +169,12 @@ _mesa_threadpool_queue_task(struct _mesa_threadpool *pool, mtx_lock(&pool->mutex); + if (unlikely(pool->shutdown)) { + mtx_unlock(&pool->mutex); + free(task); + return NULL; + } + /* someone is joining with the threads */ while (unlikely(pool->thread_control != MESA_THREADPOOL_NORMAL)) cnd_wait(&pool->thread_joined, &pool->mutex); @@ -379,6 +386,17 @@ _mesa_threadpool_join(struct _mesa_threadpool *pool, bool graceful) } /** + * After this call, no task can be queued. + */ +static void +_mesa_threadpool_set_shutdown(struct _mesa_threadpool *pool) +{ + mtx_lock(&pool->mutex); + pool->shutdown = true; + mtx_unlock(&pool->mutex); +} + +/** * Decrease the reference count. Destroy \p pool when the reference count * reaches zero. */ @@ -474,3 +492,57 @@ _mesa_threadpool_create(int max_threads) return pool; } + +static mtx_t threadpool_lock = _MTX_INITIALIZER_NP; +static struct _mesa_threadpool *threadpool; + +/** + * Get the singleton thread pool. \p max_threads is honored only by the first + * call to this function. 
+ */ +struct _mesa_threadpool * +_mesa_threadpool_get_singleton(int max_threads) +{ + mtx_lock(&threadpool_lock); + if (!threadpool) + threadpool = _mesa_threadpool_create(max_threads); + if (threadpool) + _mesa_threadpool_ref(threadpool); + mtx_unlock(&threadpool_lock); + + return threadpool; +} + +/** + * Wait until all tasks are completed and threads are joined. + */ +void +_mesa_threadpool_wait_singleton(void) +{ + mtx_lock(&threadpool_lock); + if (threadpool) + _mesa_threadpool_join(threadpool, true); + mtx_unlock(&threadpool_lock); +} + +/** + * Destroy the singleton thread pool. + */ +void +_mesa_threadpool_destroy_singleton(void) +{ + mtx_lock(&threadpool_lock); + if (threadpool) { + /* + * No new task is allowed since this point. But whoever owns references + * to the pool can still complete tasks that have been queued (which + * will simply destroy the tasks as all tasks are marked cancelled). + */ + _mesa_threadpool_set_shutdown(threadpool); + + _mesa_threadpool_join(threadpool, false); + _mesa_threadpool_unref(threadpool); + threadpool = NULL; + } + mtx_unlock(&threadpool_lock); +} diff --git a/src/util/threadpool.h b/src/util/threadpool.h index 48e4a47..aeda9d3 100644 --- a/src/util/threadpool.h +++ b/src/util/threadpool.h @@ -60,6 +60,15 @@ bool _mesa_threadpool_complete_task(struct _mesa_threadpool *pool, struct _mesa_threadpool_task *task); +struct _mesa_threadpool * +_mesa_threadpool_get_singleton(int max_threads); + +void +_mesa_threadpool_wait_singleton(void); + +void +_mesa_threadpool_destroy_singleton(void); + #ifdef __cplusplus } #endif -- 2.0.0.rc2
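The singleton's lifetime rules (first caller fixes the thread limit, every caller gets its own reference, destroy only drops the singleton's reference and blocks new work) can be sketched without any worker threads. This is a simplified illustration, not the patch's implementation: `struct pool` and the `pool_*` functions are hypothetical, a single pthread mutex is assumed in place of the separate pool and singleton `mtx_t` locks, and queued tasks are reduced to a yes/no answer.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, thread-less stand-in for struct _mesa_threadpool. */
struct pool {
   int refcnt;
   int max_threads;
   bool shutdown;
};

/* Assumption: one lock for everything, to keep the sketch short. */
static pthread_mutex_t singleton_lock = PTHREAD_MUTEX_INITIALIZER;
static struct pool *singleton;

struct pool *
pool_get_singleton(int max_threads)
{
   struct pool *p;

   pthread_mutex_lock(&singleton_lock);
   if (singleton == NULL) {
      singleton = calloc(1, sizeof(*singleton));
      if (singleton != NULL) {
         singleton->refcnt = 1;   /* reference held by the singleton slot */
         singleton->max_threads = max_threads;
      }
   }
   if (singleton != NULL)
      singleton->refcnt++;        /* reference handed to the caller */
   p = singleton;
   pthread_mutex_unlock(&singleton_lock);

   return p;
}

void
pool_unref(struct pool *p)
{
   bool destroy;

   pthread_mutex_lock(&singleton_lock);
   assert(p->refcnt > 0);
   destroy = (--p->refcnt == 0);
   pthread_mutex_unlock(&singleton_lock);

   if (destroy)
      free(p);
}

/* Returns false once the pool has been shut down. */
bool
pool_can_queue(struct pool *p)
{
   bool ok;

   pthread_mutex_lock(&singleton_lock);
   ok = !p->shutdown;
   pthread_mutex_unlock(&singleton_lock);

   return ok;
}

void
pool_destroy_singleton(void)
{
   struct pool *p;

   pthread_mutex_lock(&singleton_lock);
   p = singleton;
   singleton = NULL;
   if (p != NULL)
      p->shutdown = true;   /* existing holders can no longer queue work */
   pthread_mutex_unlock(&singleton_lock);

   if (p != NULL)
      pool_unref(p);        /* drop only the singleton's own reference */
}
```

Because holders keep their references past `pool_destroy_singleton`, the pool's memory stays valid until the last `pool_unref`, and a later `pool_get_singleton` creates a fresh pool with its own limit.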
[Mesa-dev] [PATCHv3 13/16] i965: refactor do_vs_prog
Split do_vs_prog into brw_vs_init_compile, brw_vs_do_compile, brw_vs_upload_compile, and brw_vs_clear_compile. Signed-off-by: Chia-I Wu o...@lunarg.com Acked-by: Ian Romanick ian.d.roman...@intel.com --- src/mesa/drivers/dri/i965/brw_vec4.h | 6 ++ src/mesa/drivers/dri/i965/brw_vs.c | 121 ++- src/mesa/drivers/dri/i965/brw_vs.h | 1 + 3 files changed, 83 insertions(+), 45 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h b/src/mesa/drivers/dri/i965/brw_vec4.h index f0239cb..f0e9f10 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.h +++ b/src/mesa/drivers/dri/i965/brw_vec4.h @@ -47,6 +47,12 @@ extern "C" { struct brw_vec4_compile { GLuint last_scratch; /**< measured in 32-byte (register size) units */ + + struct gl_shader_program *shader_prog; + + void *mem_ctx; + const unsigned *program; + unsigned program_size; }; diff --git a/src/mesa/drivers/dri/i965/brw_vs.c b/src/mesa/drivers/dri/i965/brw_vs.c index 4574c3e..8e3dcf4 100644 --- a/src/mesa/drivers/dri/i965/brw_vs.c +++ b/src/mesa/drivers/dri/i965/brw_vs.c @@ -187,31 +187,31 @@ brw_vs_prog_data_compare(const void *in_a, const void *in_b) return true; } -static bool -do_vs_prog(struct brw_context *brw, - struct gl_shader_program *prog, - struct brw_vertex_program *vp, - struct brw_vs_prog_key *key) +static void +brw_vs_init_compile(struct brw_context *brw, +struct gl_shader_program *prog, +struct brw_vertex_program *vp, +const struct brw_vs_prog_key *key, +struct brw_vs_compile *c) { - GLuint program_size; - const GLuint *program; - struct brw_vs_compile c; - struct brw_vs_prog_data prog_data; - struct brw_stage_prog_data *stage_prog_data = &prog_data.base.base; - void *mem_ctx; - int i; - struct gl_shader *vs = NULL; - - if (prog) - vs = prog->_LinkedShaders[MESA_SHADER_VERTEX]; + memset(c, 0, sizeof(*c)); - memset(&c, 0, sizeof(c)); - memcpy(&c.key, key, sizeof(*key)); - memset(&prog_data, 0, sizeof(prog_data)); + memcpy(&c->key, key, sizeof(*key)); + c->vp = vp; + c->base.shader_prog = prog; + c->base.mem_ctx = 
ralloc_context(NULL); +} - mem_ctx = ralloc_context(NULL); +static bool +brw_vs_do_compile(struct brw_context *brw, + struct brw_vs_compile *c) +{ + struct brw_stage_prog_data *stage_prog_data = c-prog_data.base.base; + struct gl_shader *vs = NULL; + int i; - c.vp = vp; + if (c-base.shader_prog) + vs = c-base.shader_prog-_LinkedShaders[MESA_SHADER_VERTEX]; /* Allocate the references to the uniforms that will end up in the * prog_data associated with the compiled program, and which will be freed @@ -226,12 +226,12 @@ do_vs_prog(struct brw_context *brw, param_count = vs-num_uniform_components * 4; } else { - param_count = vp-program.Base.Parameters-NumParameters * 4; + param_count = c-vp-program.Base.Parameters-NumParameters * 4; } /* vec4_visitor::setup_uniform_clipplane_values() also uploads user clip * planes as uniforms. */ - param_count += c.key.base.nr_userclip_plane_consts * 4; + param_count += c-key.base.nr_userclip_plane_consts * 4; stage_prog_data-param = rzalloc_array(NULL, const gl_constant_value *, param_count); @@ -247,12 +247,12 @@ do_vs_prog(struct brw_context *brw, stage_prog_data-nr_params += vs-num_samplers; } - GLbitfield64 outputs_written = vp-program.Base.OutputsWritten; - prog_data.inputs_read = vp-program.Base.InputsRead; + GLbitfield64 outputs_written = c-vp-program.Base.OutputsWritten; + c-prog_data.inputs_read = c-vp-program.Base.InputsRead; - if (c.key.copy_edgeflag) { + if (c-key.copy_edgeflag) { outputs_written |= BITFIELD64_BIT(VARYING_SLOT_EDGE); - prog_data.inputs_read |= VERT_BIT_EDGEFLAG; + c-prog_data.inputs_read |= VERT_BIT_EDGEFLAG; } if (brw-gen 6) { @@ -263,7 +263,7 @@ do_vs_prog(struct brw_context *brw, * coords, which would be a pain to handle. 
*/ for (i = 0; i 8; i++) { - if (c.key.point_coord_replace (1 i)) + if (c-key.point_coord_replace (1 i)) outputs_written |= BITFIELD64_BIT(VARYING_SLOT_TEX0 + i); } @@ -278,45 +278,76 @@ do_vs_prog(struct brw_context *brw, * distance varying slots whenever clipping is enabled, even if the vertex * shader doesn't write to gl_ClipDistance. */ - if (c.key.base.userclip_active) { + if (c-key.base.userclip_active) { outputs_written |= BITFIELD64_BIT(VARYING_SLOT_CLIP_DIST0); outputs_written |= BITFIELD64_BIT(VARYING_SLOT_CLIP_DIST1); } - brw_compute_vue_map(brw, prog_data.base.vue_map, outputs_written); + brw_compute_vue_map(brw, c-prog_data.base.vue_map, outputs_written); if (0) { - _mesa_fprint_program_opt(stderr, c.vp-program.Base, PROG_PRINT_DEBUG, -
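As a rough illustration of the split described in the commit message, the sketch below separates one monolithic compile routine into init / do / upload / clear phases that share a context struct. All names here (Compile, CompileKey, compile_all and the four phase functions) are hypothetical stand-ins and are not taken from the i965 driver; the point is only the shape of the refactoring.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-ins; none of these names come from the i965 driver.
struct CompileKey { unsigned flags; };

struct Compile {
   CompileKey key;                 // private copy of the caller's key
   std::vector<unsigned> program;  // generated code, filled by do_compile()
   bool uploaded;                  // set once the result has been handed over
};

// Phase 1: reset the context and copy in the inputs.
static void init_compile(Compile *c, const CompileKey *key)
{
   *c = Compile{};
   std::memcpy(&c->key, key, sizeof(*key));
}

// Phase 2: produce the program; failure is reported through the return value.
static bool do_compile(Compile *c)
{
   c->program.assign({0x1u, c->key.flags});
   return !c->program.empty();
}

// Phase 3: hand the result over (here it is only marked as uploaded).
static void upload_compile(Compile *c) { c->uploaded = true; }

// Phase 4: release everything the context owns.
static void clear_compile(Compile *c)
{
   c->program.clear();
   c->uploaded = false;
}

// The old single entry point, rebuilt from the four phases.
static bool compile_all(const CompileKey *key, std::size_t *size_out)
{
   Compile c;
   init_compile(&c, key);
   const bool ok = do_compile(&c);
   if (ok) {
      upload_compile(&c);
      *size_out = c.program.size();
   }
   clear_compile(&c);
   return ok;
}
```

Keeping the phases separate means a caller could, in principle, run them at different times while the context struct carries the state in between.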
[Mesa-dev] [PATCH 2/2] mesa/st: add ARB_texture_view support
Signed-off-by: Ilia Mirkin imir...@alum.mit.edu
---

No piglit regressions on nvc0 except for gl-3.0-render-integer, which appears
to now fail even without this commit, despite the fact that I'm fairly sure it
used to work fine. Same failure with llvmpipe...

It's most likely that I've missed some details. It's unclear whether e.g.
glGenerateMipmap should work on a view. However the piglits that exist do all
pass on nvc0 and llvmpipe.

 docs/GL3.txt                             |  2 +-
 docs/relnotes/10.3.html                  |  1 +
 src/mesa/state_tracker/st_atom_texture.c | 28 +++
 src/mesa/state_tracker/st_cb_fbo.c       | 10 ++
 src/mesa/state_tracker/st_cb_texture.c   | 62 +++-
 src/mesa/state_tracker/st_extensions.c   |  1 +
 src/mesa/state_tracker/st_format.c       |  5 +--
 src/mesa/state_tracker/st_texture.c      | 15 ++--
 8 files changed, 105 insertions(+), 19 deletions(-)

diff --git a/docs/GL3.txt b/docs/GL3.txt
index 76412c3..5b25865 100644
--- a/docs/GL3.txt
+++ b/docs/GL3.txt
@@ -166,7 +166,7 @@ GL 4.3, GLSL 4.30:
   GL_ARB_texture_buffer_range                          DONE (nv50, nvc0, i965, r600, radeonsi)
   GL_ARB_texture_query_levels                          DONE (all drivers that support GLSL 1.30)
   GL_ARB_texture_storage_multisample                   DONE (all drivers that support GL_ARB_texture_multisample)
-  GL_ARB_texture_view                                  DONE (i965)
+  GL_ARB_texture_view                                  DONE (i965, nv30, nv50, nvc0, r300, r600, radeonsi, llvmpipe, softpipe)
   GL_ARB_vertex_attrib_binding                         DONE (all drivers)


diff --git a/docs/relnotes/10.3.html b/docs/relnotes/10.3.html
index fa4ea23..852aec9 100644
--- a/docs/relnotes/10.3.html
+++ b/docs/relnotes/10.3.html
@@ -63,6 +63,7 @@ Note: some of the new features are only available with certain drivers.
 <li>GL_ARB_texture_gather on r600, radeonsi</li>
 <li>GL_ARB_texture_query_levels on nv50, nvc0, llvmpipe, r600, radeonsi, softpipe</li>
 <li>GL_ARB_texture_query_lod on r600, radeonsi</li>
+<li>GL_ARB_texture_view on nv30, nv50, nvc0, r300, r600, radeonsi, llvmpipe, softpipe</li>
 <li>GL_ARB_viewport_array on nvc0</li>
 <li>GL_AMD_vertex_shader_viewport_index on i965/gen7+, r600</li>
 <li>GL_OES_compressed_ETC1_RGB8_texture on nv30, nv50, nvc0, r300, r600, radeonsi, softpipe, llvmpipe</li>
diff --git a/src/mesa/state_tracker/st_atom_texture.c b/src/mesa/state_tracker/st_atom_texture.c
index 03d0593..8f62494 100644
--- a/src/mesa/state_tracker/st_atom_texture.c
+++ b/src/mesa/state_tracker/st_atom_texture.c
@@ -192,9 +192,9 @@ get_texture_format_swizzle(const struct st_texture_object *stObj)
    return swizzle_swizzle(stObj->base._Swizzle, tex_swizzle);
 }

-
+
 /**
- * Return TRUE if the texture's sampler view swizzle is equal to
+ * Return TRUE if the texture's sampler view swizzle is not equal to
  * the texture's swizzle.
  *
  * \param stObj  the st texture object,
@@ -214,9 +214,20 @@ check_sampler_swizzle(const struct st_texture_object *stObj,

 static unsigned last_level(struct st_texture_object *stObj)
 {
-   return MIN2(stObj->base._MaxLevel, stObj->pt->last_level);
+   unsigned ret = MIN2(stObj->base.MinLevel + stObj->base._MaxLevel,
+                       stObj->pt->last_level);
+   if (stObj->base.Immutable)
+      ret = MIN2(ret, stObj->base.MinLevel + stObj->base.NumLevels - 1);
+   return ret;
 }

+static unsigned last_layer(struct st_texture_object *stObj)
+{
+   if (stObj->base.Immutable)
+      return MIN2(stObj->base.MinLayer + stObj->base.NumLayers - 1,
+                  stObj->pt->array_size - 1);
+   return stObj->pt->array_size - 1;
+}

 static struct pipe_sampler_view *
 st_create_texture_sampler_view_from_stobj(struct pipe_context *pipe,
@@ -249,9 +260,12 @@ st_create_texture_sampler_view_from_stobj(struct pipe_context *pipe,
       templ.u.buf.first_element = f;
       templ.u.buf.last_element = f + (n - 1);
    } else {
-      templ.u.tex.first_level = stObj->base.BaseLevel;
+      templ.u.tex.first_level = stObj->base.MinLevel + stObj->base.BaseLevel;
       templ.u.tex.last_level = last_level(stObj);
       assert(templ.u.tex.first_level <= templ.u.tex.last_level);
+      templ.u.tex.first_layer = stObj->base.MinLayer;
+      templ.u.tex.last_layer = last_layer(stObj);
+      assert(templ.u.tex.first_layer <= templ.u.tex.last_layer);
    }

    if (swizzle != SWIZZLE_NOOP) {
@@ -287,8 +301,10 @@ st_get_texture_sampler_view_from_stobj(struct st_context *st,
    if (*sv) {
       if (check_sampler_swizzle(stObj, *sv) ||
           (format != (*sv)->format) ||
-          stObj->base.BaseLevel != (*sv)->u.tex.first_level ||
-          last_level(stObj) != (*sv)->u.tex.last_level) {
+          stObj->base.MinLevel + stObj->base.BaseLevel != (*sv)->u.tex.first_level ||
+          last_level(stObj) != (*sv)->u.tex.last_level ||
+          stObj->base.MinLayer != (*sv)->u.tex.first_layer ||
+
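The level and layer clamping that last_level() and last_layer() perform in the patch can be tried in isolation. The following is a minimal sketch, assuming simplified stand-in structs (View and Resource are hypothetical names, not Mesa types) that carry only the fields the two helpers read:

```cpp
#include <algorithm>

// Hypothetical, simplified stand-ins for the fields used in the patch.
struct View {
   unsigned min_level, num_levels;  // level window selected by the view
   unsigned max_level;              // max-level clamp, relative to the view
   unsigned min_layer, num_layers;  // layer window selected by the view
   bool immutable;                  // only immutable textures can be views
};

struct Resource {
   unsigned last_level;  // last mip level present in the backing resource
   unsigned array_size;  // number of layers in the backing resource
};

// Last mip level visible through the view, in resource coordinates.
static unsigned last_level(const View &v, const Resource &r)
{
   unsigned ret = std::min(v.min_level + v.max_level, r.last_level);
   if (v.immutable)
      ret = std::min(ret, v.min_level + v.num_levels - 1);
   return ret;
}

// Last layer visible through the view, in resource coordinates.
static unsigned last_layer(const View &v, const Resource &r)
{
   if (v.immutable)
      return std::min(v.min_layer + v.num_layers - 1, r.array_size - 1);
   return r.array_size - 1;
}
```

For example, a view starting at level 2 with 3 levels on a resource whose last level is 9 ends at level 4, and a 2-layer view starting at layer 1 of a 6-layer resource ends at layer 2.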
[Mesa-dev] [PATCH 1/2] mesa: force height of 1D textures to be 1 in texture views
Signed-off-by: Ilia Mirkin imir...@alum.mit.edu
---
 src/mesa/main/textureview.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/mesa/main/textureview.c b/src/mesa/main/textureview.c
index b3521e2..6e86a9a 100644
--- a/src/mesa/main/textureview.c
+++ b/src/mesa/main/textureview.c
@@ -536,6 +536,9 @@ _mesa_TextureView(GLuint texture, GLenum target, GLuint origtexture,
    /* Adjust width, height, depth to be appropriate for new target */
    switch (target) {
    case GL_TEXTURE_1D:
+      height = 1;
+      break;
+
    case GL_TEXTURE_3D:
       break;

--
1.8.5.5
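The effect of the new case can be shown with a small sketch. It assumes a hypothetical Target enum and Dims struct rather than the real GL enums and Mesa image fields, and it models only the two cases visible in the hunk. A plausible motivation, stated here as an assumption, is that the source texture may be a 1D array whose layer count is kept in the height field, so a plain 1D view of it must not inherit that value.

```cpp
// Hypothetical stand-ins for the GL target enum and the image dimensions.
enum class Target { Tex1D, Tex2D, Tex3D };

struct Dims { unsigned width, height, depth; };

// Adjust the dimensions inherited from the original texture so they suit
// the target of the new view. Only the cases shown in the hunk are modeled.
static Dims adjust_for_target(Target target, Dims d)
{
   switch (target) {
   case Target::Tex1D:
      d.height = 1;  // a 1D view has no second dimension
      break;
   case Target::Tex3D:
      break;         // dimensions carry over unchanged
   default:
      break;         // other targets are outside this sketch
   }
   return d;
}
```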
Re: [Mesa-dev] [PATCH] st/clover: Fix build against LLVM SVN = r215967
Michel Dänzer mic...@daenzer.net writes:

> From: Michel Dänzer michel.daen...@amd.com
>
> Signed-off-by: Michel Dänzer michel.daen...@amd.com
> ---
>  src/gallium/state_trackers/clover/llvm/invocation.cpp | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp b/src/gallium/state_trackers/clover/llvm/invocation.cpp
> index 5d2efc4..2643cc3 100644
> --- a/src/gallium/state_trackers/clover/llvm/invocation.cpp
> +++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp
> @@ -234,7 +234,11 @@ namespace {
>        memcpy(address_spaces, c.getTarget().getAddressSpaceMap(),
>               sizeof(address_spaces));
>
> +#if HAVE_LLVM >= 0x0306
> +      return act.takeModule().get();

You probably want to call .release() instead and deallocate it manually
later on, otherwise the module will be destroyed here before the end of
the function.

Thanks.

> +#else
>        return act.takeModule();
> +#endif
>     }
>
>     void
> --
> 2.1.0
Re: [Mesa-dev] [PATCH] st/clover: Fix build against LLVM SVN = r215967
On 20.08.2014 15:48, Francisco Jerez wrote:
> Michel Dänzer mic...@daenzer.net writes:
>
>> From: Michel Dänzer michel.daen...@amd.com
>>
>> Signed-off-by: Michel Dänzer michel.daen...@amd.com
>> ---
>>  src/gallium/state_trackers/clover/llvm/invocation.cpp | 4 ++++
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp b/src/gallium/state_trackers/clover/llvm/invocation.cpp
>> index 5d2efc4..2643cc3 100644
>> --- a/src/gallium/state_trackers/clover/llvm/invocation.cpp
>> +++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp
>> @@ -234,7 +234,11 @@ namespace {
>>        memcpy(address_spaces, c.getTarget().getAddressSpaceMap(),
>>               sizeof(address_spaces));
>>
>> +#if HAVE_LLVM >= 0x0306
>> +      return act.takeModule().get();
>
> You probably want to call .release() instead

Right, that works better, i.e. doesn't crash. :)

> and deallocate it manually later on, otherwise the module will be
> destroyed here before the end of the function.

Are you sure anything else needs to be done for destruction? valgrind
doesn't seem to show any leaks obviously related to this.

If something else does need to be done, I'll have to defer to you or
someone else for the proper fix.

--
Earthling Michel Dänzer            |                  http://www.amd.com
Libre software enthusiast          |                Mesa and X developer
Re: [Mesa-dev] [RFC PATCH 00/16] A new IR for Mesa
> I think we can fix this by introducing new structured variants of the
> branch instruction in a way that doesn't alter the fundamental structure
> of the IR. E.g. an if branch could look like:
>
>   ifbr i1 cond, label iftrue, label iffalse, label join
>
> Where both branches are guaranteed to converge at join. Sure, this will
> require fixing many assumptions, but on the one hand it's not
> immediately required (as we can address this problem for the time being
> using the same solution AMD uses) and on the other hand it's still less
> work than starting from scratch.

Well, I wrote the structurizer pass in LLVM you are talking about here
and from my experience you really don't want any structured form of
control flow in the IR.

Structured control flow is just a specialized form of unstructured
control flow and even if it looks rather awkward at first glance it is
indeed simpler to destructurize the compiler generated control flow for
optimization and structurize again for instruction selection.

The only reason I've annotated the LLVM IR with specialized intrinsics
for the SI backend was laziness and I wouldn't do that again given the
chance.

>> And it's very likely that these backends, which probably aren't using
>> SSA due to the aforementioned difficulties, will also benefit from
>> having modifiers already folded for them - this is something that's
>> already a problem for i965 vec4 backend and that NIR will help a lot.
>
> Well, I have the impression that much of the reason why the i965 vec4
> backend has lagged behind so much in comparison with the fs backend is
> precisely because it's so annoying to optimize vec4 code. It seems
> painful to me that you have this built into the core instruction set so
> generic optimization passes will have to be explicitly aware of it. I
> wouldn't be surprised if the i965 vec4 benefited at least as much from
> scalarizing the code, performing optimizations there, and re-vectorizing
> afterwards.

Completely agree. Being able to do vectorization in an IR is important,
but you shouldn't try to handle backend specific swizzle operations and
vectorizing restrictions in the IR.

Just looking at the swizzle restrictions of R600 for example and I
really can't imagine that you want to represent this in a common IR
between all different drivers.

Regards,
Christian.

On 20.08.2014 at 08:33, Francisco Jerez wrote:
> Connor Abbott cwabbo...@gmail.com writes:
>
>> On Tue, Aug 19, 2014 at 11:40 AM, Francisco Jerez curroje...@riseup.net wrote:
>>> Tom Stellard t...@stellard.net writes:
>>>
>>>> On Tue, Aug 19, 2014 at 11:04:59AM -0400, Connor Abbott wrote:
>>>>> On Mon, Aug 18, 2014 at 8:52 PM, Michel Dänzer mic...@daenzer.net wrote:
>>>>>> On 19.08.2014 01:28, Connor Abbott wrote:
>>>>>>> On Mon, Aug 18, 2014 at 4:32 AM, Michel Dänzer mic...@daenzer.net wrote:
>>>>>>>> On 16.08.2014 09:12, Connor Abbott wrote:
>>>>>>>>> I know what you might be thinking right now. Wait, *another* IR?
>>>>>>>>> Don't we already have like 5 of those, not counting all the
>>>>>>>>> driver-specific ones? Isn't this stuff complicated enough
>>>>>>>>> already? Well, there are some pretty good reasons to start
>>>>>>>>> afresh (again...). In the years we've been using GLSL IR, we've
>>>>>>>>> come to realize that, in fact, it's not what we want *at all* to
>>>>>>>>> do optimizations on.
>>>>>>>>
>>>>>>>> Did you evaluate using LLVM IR instead of inventing yet another one?
>>>>>>>>
>>>>>>>> --
>>>>>>>> Earthling Michel Dänzer            |                  http://www.amd.com
>>>>>>>> Libre software enthusiast          |                Mesa and X developer
>>>>>>>
>>>>>>> Yes. See
>>>>>>>
>>>>>>> http://lists.freedesktop.org/archives/mesa-dev/2014-February/053502.html
>>>>>>>
>>>>>>> and
>>>>>>>
>>>>>>> http://lists.freedesktop.org/archives/mesa-dev/2014-February/053522.html
>>>>>>
>>>>>> I know Ian can't deal with LLVM for some reason. I was wondering if
>>>>>> *you* evaluated it, and if so, why you rejected it.
>>>>>>
>>>>>> --
>>>>>> Earthling Michel Dänzer            |                  http://www.amd.com
>>>>>> Libre software enthusiast          |                Mesa and X developer
>>>>>
>>>>> Well, first of all, the fact that Ian and Ken don't want to use it
>>>>> means that any plan to use LLVM for the Intel driver is dead in the
>>>>> water anyways - you can translate NIR into LLVM if you want, but for
>>>>> i965 we want to share optimizations between our 2 backends (FS and
>>>>> vec4) that we can't do today in GLSL IR so this is what we want to
>>>>> use for that, and since nobody else does anything with the core GLSL
>>>>> compiler except when they have to, when we start moving things out
>>>>> of GLSL IR this will probably replace GLSL IR as the infrastructure
>>>>> that all Mesa drivers use. But with that in mind, here are a few
>>>>> reasons why we wouldn't want to use LLVM:
>>>>>
>>>>> * LLVM wasn't built to understand structured CFG's, meaning that you
>>>>> need to re-structurize it using a pass that's fragile and prone to
>>>>> break if some other pass optimizes the shader in a way that makes it
>>>>> non-structured (i.e. not expressible in terms of loops and if
>>>>> statements). This loss of information also means that passes that
>>>>> need to know things like, for example, the loop nesting depth need
>>>>> to do an analysis pass whereas with NIR you can just walk up the
Re: [Mesa-dev] Clamp/saturate optimizations v3
On 20.08.2014 05:40, Matt Turner wrote:
> Patches 2-4, (5-9 already reviewed), 10, 13-16, (17 already reviewed) are
>
> Reviewed-by: Matt Turner matts...@gmail.com
>
> I've requested a change come before patch 1, and then rebased patch 1
> should be an easy R-b.
>
> I'll need to take a closer look at 11 and 12.

Thanks for the review! I'll send the updated changes
Re: [Mesa-dev] [PATCH] st/clover: Fix build against LLVM SVN = r215967
Michel Dänzer mic...@daenzer.net writes:

> On 20.08.2014 15:48, Francisco Jerez wrote:
>> Michel Dänzer mic...@daenzer.net writes:
>>
>>> From: Michel Dänzer michel.daen...@amd.com
>>>
>>> Signed-off-by: Michel Dänzer michel.daen...@amd.com
>>> ---
>>>  src/gallium/state_trackers/clover/llvm/invocation.cpp | 4 ++++
>>>  1 file changed, 4 insertions(+)
>>>
>>> diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp b/src/gallium/state_trackers/clover/llvm/invocation.cpp
>>> index 5d2efc4..2643cc3 100644
>>> --- a/src/gallium/state_trackers/clover/llvm/invocation.cpp
>>> +++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp
>>> @@ -234,7 +234,11 @@ namespace {
>>>        memcpy(address_spaces, c.getTarget().getAddressSpaceMap(),
>>>               sizeof(address_spaces));
>>>
>>> +#if HAVE_LLVM >= 0x0306
>>> +      return act.takeModule().get();
>>
>> You probably want to call .release() instead
>
> Right, that works better, i.e. doesn't crash. :)
>
>> and deallocate it manually later on, otherwise the module will be
>> destroyed here before the end of the function.
>
> Are you sure anything else needs to be done for destruction? valgrind
> doesn't seem to show any leaks obviously related to this.
>
> If something else does need to be done, I'll have to defer to you or
> someone else for the proper fix.

Yeah, I'm afraid. Apparently since clang r215979
CodeGenAction::takeModule() gives up ownership on the module object.

> --
> Earthling Michel Dänzer            |                  http://www.amd.com
> Libre software enthusiast          |                Mesa and X developer
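The difference between .get() and .release() discussed in this thread can be reproduced without LLVM. The following is a minimal sketch, assuming a stand-in Module type with an instance counter and a hypothetical take_module() that plays the role of a takeModule() returning std::unique_ptr; neither name is LLVM or clang API.

```cpp
#include <memory>

// Stand-in for llvm::Module that counts how many instances are alive.
struct Module {
   static int live;
   Module() { ++live; }
   ~Module() { --live; }
};
int Module::live = 0;

// Hypothetical stand-in for a call that hands ownership to the caller.
static std::unique_ptr<Module> take_module()
{
   return std::unique_ptr<Module>(new Module);
}

// Mirrors "return act.takeModule().get();": the temporary unique_ptr is
// destroyed at the end of the return statement and deletes the Module, so
// the returned pointer dangles and must not be dereferenced.
static Module *build_with_get()
{
   return take_module().get();
}

// Mirrors the suggested fix: release() detaches the Module from the
// unique_ptr, so it survives, and the caller now has to delete it.
static Module *build_with_release()
{
   return take_module().release();
}
```

After build_with_get() returns, Module::live is back to 0, which matches the crash reported above. After build_with_release() it stays at 1 until the caller deletes the object, which is the manual deallocation Francisco refers to.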
[Mesa-dev] [PATCH V4 1/3] mesa: implement GL_MAX_VERTEX_ATTRIB_STRIDE
V2: moved test for the VertexAttrib*Pointer() functions to update_array(),
and made constant available for drivers to set

Signed-off-by: Timothy Arceri t_arc...@yahoo.com.au
---
Although 4.4 is a while away GL_MAX_VERTEX_ATTRIB_STRIDE is used in the
ARB_direct_state_access spec so it seemed worthwhile adding this now.

I've added MAX_VERTEX_ATTRIB_STRIDE to ARB_vertex_attrib_binding.xml as it
didn't seem like it was worth putting it somewhere on its own as it's
really just a bug fix. Let me know if this should be moved.

Finally I've assumed that 2048 is an ok value for i965.

V4: add cap for all gallium drivers set to default (except r600g)
V3: adds values for r600g and radeonsi (I'm unable to test either of these
patches)

 src/mapi/glapi/gen/ARB_vertex_attrib_binding.xml |  1 +
 src/mesa/main/context.c                          |  3 +++
 src/mesa/main/get_hash_params.py                 |  3 +++
 src/mesa/main/mtypes.h                           |  3 +++
 src/mesa/main/varray.c                           | 22 ++
 5 files changed, 32 insertions(+)

diff --git a/src/mapi/glapi/gen/ARB_vertex_attrib_binding.xml b/src/mapi/glapi/gen/ARB_vertex_attrib_binding.xml
index 0ee6a3c..7e62688 100644
--- a/src/mapi/glapi/gen/ARB_vertex_attrib_binding.xml
+++ b/src/mapi/glapi/gen/ARB_vertex_attrib_binding.xml
@@ -53,6 +53,7 @@
     <enum name="VERTEX_BINDING_STRIDE" value="0x82D8"/>
     <enum name="MAX_VERTEX_ATTRIB_RELATIVE_OFFSET" value="0x82D9"/>
     <enum name="MAX_VERTEX_ATTRIB_BINDINGS" value="0x82DA"/>
+    <enum name="MAX_VERTEX_ATTRIB_STRIDE" value="0x82E5"/>

 </category>
 </OpenGLAPI>
diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c
index 2320842..fbdbd68 100644
--- a/src/mesa/main/context.c
+++ b/src/mesa/main/context.c
@@ -670,6 +670,9 @@ _mesa_init_constants(struct gl_constants *consts, gl_api api)
                           ? GL_CONTEXT_CORE_PROFILE_BIT
                           : GL_CONTEXT_COMPATIBILITY_PROFILE_BIT;

+   /* GL 4.4 */
+   consts->MaxVertexAttribStride = 2048;
+
    /** GL_EXT_gpu_shader4 */
    consts->MinProgramTexelOffset = -8;
    consts->MaxProgramTexelOffset = 7;
diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/get_hash_params.py
index ff85820..aace8a5 100644
--- a/src/mesa/main/get_hash_params.py
+++ b/src/mesa/main/get_hash_params.py
@@ -712,6 +712,9 @@ descriptor=[
   [ "MAX_GEOMETRY_INPUT_COMPONENTS", "CONTEXT_INT(Const.Program[MESA_SHADER_GEOMETRY].MaxInputComponents), extra_version_32" ],
   [ "MAX_GEOMETRY_OUTPUT_COMPONENTS", "CONTEXT_INT(Const.Program[MESA_SHADER_GEOMETRY].MaxOutputComponents), extra_version_32" ],

+# GL 4.4
+  [ "MAX_VERTEX_ATTRIB_STRIDE", "CONTEXT_ENUM(Const.MaxVertexAttribStride), NO_EXTRA" ],
+
 # GL_ARB_robustness
   [ "RESET_NOTIFICATION_STRATEGY_ARB", "CONTEXT_ENUM(Const.ResetStrategy), NO_EXTRA" ],

diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index cb2a4df..adb6788 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -3414,6 +3414,9 @@ struct gl_constants
    /** OpenGL version 3.2 */
    GLbitfield ProfileMask;   /**< Mask of CONTEXT_x_PROFILE_BIT */

+   /** OpenGL version 4.4 */
+   GLuint MaxVertexAttribStride;
+
    /** GL_EXT_transform_feedback */
    GLuint MaxTransformFeedbackBuffers;
    GLuint MaxTransformFeedbackSeparateComponents;
diff --git a/src/mesa/main/varray.c b/src/mesa/main/varray.c
index 5d3cc2a..7d169f9 100644
--- a/src/mesa/main/varray.c
+++ b/src/mesa/main/varray.c
@@ -424,6 +424,13 @@ update_array(struct gl_context *ctx,
       return;
    }

+   if (ctx->API == API_OPENGL_CORE && ctx->Version >= 44 &&
+       stride > ctx->Const.MaxVertexAttribStride) {
+      _mesa_error(ctx, GL_INVALID_VALUE, "%s(stride=%d > "
+                  "GL_MAX_VERTEX_ATTRIB_STRIDE)", func, stride);
+      return;
+   }
+
    /* Page 29 (page 44 of the PDF) of the OpenGL 3.3 spec says:
     *
     *     An INVALID_OPERATION error is generated under any of the following
@@ -1437,6 +1444,13 @@ _mesa_BindVertexBuffer(GLuint bindingIndex, GLuint buffer, GLintptr offset,
       return;
    }

+   if (ctx->API == API_OPENGL_CORE && ctx->Version >= 44 &&
+       stride > ctx->Const.MaxVertexAttribStride) {
+      _mesa_error(ctx, GL_INVALID_VALUE, "glBindVertexBuffer(stride=%d > "
+                  "GL_MAX_VERTEX_ATTRIB_STRIDE)", stride);
+      return;
+   }
+
    if (buffer == vao->VertexBinding[VERT_ATTRIB_GENERIC(bindingIndex)].BufferObj->Name) {
       vbo = vao->VertexBinding[VERT_ATTRIB_GENERIC(bindingIndex)].BufferObj;
    } else if (buffer != 0) {
@@ -1565,6 +1579,14 @@ _mesa_BindVertexBuffers(GLuint first, GLsizei count, const GLuint *buffers,
          continue;
       }

+      if (ctx->API == API_OPENGL_CORE && ctx->Version >= 44 &&
+          strides[i] > ctx->Const.MaxVertexAttribStride) {
+         _mesa_error(ctx, GL_INVALID_VALUE,
+                     "glBindVertexBuffers(strides[%u]=%d > "
+                     "GL_MAX_VERTEX_ATTRIB_STRIDE)", i, strides[i]);
+         continue;
+      }
+
       if (buffers[i])
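The three checks added to varray.c share one condition: the limit is enforced only for core-profile contexts of version 4.4 or later. A minimal sketch of that predicate, assuming a hypothetical Context struct and Api enum in place of gl_context and the Mesa API enum:

```cpp
// Hypothetical stand-ins for the context fields the check reads.
enum Api { API_COMPAT, API_CORE };

struct Context {
   Api api;           // which profile the context was created with
   unsigned version;  // GL version times ten, e.g. 44 for OpenGL 4.4
   int max_stride;    // value reported for GL_MAX_VERTEX_ATTRIB_STRIDE
};

// Returns true when the stride is accepted. The patch raises
// GL_INVALID_VALUE and bails out where this function returns false.
static bool stride_ok(const Context &ctx, int stride)
{
   if (ctx.api == API_CORE && ctx.version >= 44 && stride > ctx.max_stride)
      return false;
   return true;
}
```

With a limit of 2048, a stride of 2048 passes and 2049 is rejected on a 4.4 core context, while the same oversized stride is still accepted on a 4.3 core or a compatibility context.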
[Mesa-dev] [PATCH V4 3/3] docs: mark GL_MAX_VERTEX_ATTRIB_STRIDE as done
Signed-off-by: Timothy Arceri t_arc...@yahoo.com.au
---
 docs/GL3.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/GL3.txt b/docs/GL3.txt
index 76412c3..af26214 100644
--- a/docs/GL3.txt
+++ b/docs/GL3.txt
@@ -172,7 +172,7 @@ GL 4.3, GLSL 4.30:

 GL 4.4, GLSL 4.40:

-  GL_MAX_VERTEX_ATTRIB_STRIDE                          not started
+  GL_MAX_VERTEX_ATTRIB_STRIDE                          DONE (all drivers)
   GL_ARB_buffer_storage                                DONE (i965, nv30, nv50, nvc0, r300, r600, radeonsi)
   GL_ARB_clear_texture                                 DONE (i965)
   GL_ARB_enhanced_layouts                              not started
--
1.9.3
[Mesa-dev] [PATCH V4 2/3] gallium: add cap for MAX_VERTEX_ATTRIB_STRIDE
Signed-off-by: Timothy Arceri t_arc...@yahoo.com.au
---
Note: I have only compile tested this patch with ilo.

 src/gallium/docs/source/screen.rst               | 1 +
 src/gallium/drivers/freedreno/freedreno_screen.c | 3 +++
 src/gallium/drivers/i915/i915_screen.c           | 3 +++
 src/gallium/drivers/ilo/ilo_screen.c             | 2 ++
 src/gallium/drivers/llvmpipe/lp_screen.c         | 2 ++
 src/gallium/drivers/nouveau/nv30/nv30_screen.c   | 2 ++
 src/gallium/drivers/nouveau/nv50/nv50_screen.c   | 2 ++
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.c   | 2 ++
 src/gallium/drivers/r300/r300_screen.c           | 3 +++
 src/gallium/drivers/r600/r600_pipe.c             | 3 +++
 src/gallium/drivers/radeonsi/si_pipe.c           | 3 +++
 src/gallium/drivers/softpipe/sp_screen.c         | 2 ++
 src/gallium/drivers/svga/svga_screen.c           | 2 ++
 src/gallium/drivers/vc4/vc4_screen.c             | 3 +++
 src/gallium/include/pipe/p_defines.h             | 1 +
 src/mesa/state_tracker/st_extensions.c           | 3 +++
 16 files changed, 37 insertions(+)

diff --git a/src/gallium/docs/source/screen.rst b/src/gallium/docs/source/screen.rst
index eee254e..13bf705 100644
--- a/src/gallium/docs/source/screen.rst
+++ b/src/gallium/docs/source/screen.rst
@@ -225,6 +225,7 @@ The integer capabilities:
   memory and GART.
 * ``PIPE_CAP_CONDITIONAL_RENDER_INVERTED``: Whether the driver supports inverted
   condition for conditional rendering.
+* ``PIPE_CAP_MAX_VERTEX_ATTRIB_STRIDE``: The maximum supported vertex stride.


 .. _pipe_capf:
diff --git a/src/gallium/drivers/freedreno/freedreno_screen.c b/src/gallium/drivers/freedreno/freedreno_screen.c
index ab1a740..81a6c84 100644
--- a/src/gallium/drivers/freedreno/freedreno_screen.c
+++ b/src/gallium/drivers/freedreno/freedreno_screen.c
@@ -233,6 +233,9 @@ fd_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
    case PIPE_CAP_MAX_VERTEX_STREAMS:
       return 0;

+   case PIPE_CAP_MAX_VERTEX_ATTRIB_STRIDE:
+      return 2048;
+
    /* Texturing. */
    case PIPE_CAP_MAX_TEXTURE_2D_LEVELS:
    case PIPE_CAP_MAX_TEXTURE_3D_LEVELS:
diff --git a/src/gallium/drivers/i915/i915_screen.c b/src/gallium/drivers/i915/i915_screen.c
index 40976b3..55f8e71 100644
--- a/src/gallium/drivers/i915/i915_screen.c
+++ b/src/gallium/drivers/i915/i915_screen.c
@@ -275,6 +275,9 @@ i915_get_param(struct pipe_screen *screen, enum pipe_cap cap)
    case PIPE_CAP_MAX_VERTEX_STREAMS:
       return 0;

+   case PIPE_CAP_MAX_VERTEX_ATTRIB_STRIDE:
+      return 2048;
+
    /* Fragment coordinate conventions. */
    case PIPE_CAP_TGSI_FS_COORD_ORIGIN_UPPER_LEFT:
    case PIPE_CAP_TGSI_FS_COORD_PIXEL_CENTER_HALF_INTEGER:
diff --git a/src/gallium/drivers/ilo/ilo_screen.c b/src/gallium/drivers/ilo/ilo_screen.c
index 15658da..1e034f8 100644
--- a/src/gallium/drivers/ilo/ilo_screen.c
+++ b/src/gallium/drivers/ilo/ilo_screen.c
@@ -382,6 +382,8 @@ ilo_get_param(struct pipe_screen *screen, enum pipe_cap param)
    case PIPE_CAP_VERTEX_BUFFER_STRIDE_4BYTE_ALIGNED_ONLY:
    case PIPE_CAP_VERTEX_ELEMENT_SRC_OFFSET_4BYTE_ALIGNED_ONLY:
       return false;
+   case PIPE_CAP_MAX_VERTEX_ATTRIB_STRIDE:
+      return 2048;
    case PIPE_CAP_COMPUTE:
       return false; /* TODO */
    case PIPE_CAP_USER_INDEX_BUFFERS:
diff --git a/src/gallium/drivers/llvmpipe/lp_screen.c b/src/gallium/drivers/llvmpipe/lp_screen.c
index 2a6e673..8625f0c 100644
--- a/src/gallium/drivers/llvmpipe/lp_screen.c
+++ b/src/gallium/drivers/llvmpipe/lp_screen.c
@@ -195,6 +195,8 @@ llvmpipe_get_param(struct pipe_screen *screen, enum pipe_cap param)
       return 1024;
    case PIPE_CAP_MAX_VERTEX_STREAMS:
       return 1;
+   case PIPE_CAP_MAX_VERTEX_ATTRIB_STRIDE:
+      return 2048;
    case PIPE_CAP_STREAM_OUTPUT_PAUSE_RESUME:
       return 1;
    case PIPE_CAP_TGSI_CAN_COMPACT_CONSTANTS:
diff --git a/src/gallium/drivers/nouveau/nv30/nv30_screen.c b/src/gallium/drivers/nouveau/nv30/nv30_screen.c
index 80d6943..15aba8a 100644
--- a/src/gallium/drivers/nouveau/nv30/nv30_screen.c
+++ b/src/gallium/drivers/nouveau/nv30/nv30_screen.c
@@ -71,6 +71,8 @@ nv30_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
       return 16;
    case PIPE_CAP_MAX_VIEWPORTS:
       return 1;
+   case PIPE_CAP_MAX_VERTEX_ATTRIB_STRIDE:
+      return 2048;
    /* supported capabilities */
    case PIPE_CAP_TWO_SIDED_STENCIL:
    case PIPE_CAP_ANISOTROPIC_FILTER:
diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.c b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
index 99dcdc5..1c08c3b 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_screen.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
@@ -117,6 +117,8 @@ nv50_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
       return 1024;
    case PIPE_CAP_MAX_VERTEX_STREAMS:
       return 1;
+   case PIPE_CAP_MAX_VERTEX_ATTRIB_STRIDE:
+      return 2048;
    case PIPE_CAP_CONSTANT_BUFFER_OFFSET_ALIGNMENT:
Re: [Mesa-dev] [PATCH 00/37] Geometry shader support in Sandy Bridge
On 2014-08-16 09:11, Jordan Justen wrote:
> On Thu, Aug 14, 2014 at 4:11 AM, Iago Toral Quiroga ito...@igalia.com wrote:
>> Hi,
>>
>> this series brings support for geometry shaders in Sandy Bridge (gen6)
>> and is combined work from Samuel and myself.
>>
>> A few notes:
>>
>> 1.- Some patches have been based on original work by Ilia Mirkin,
>> specifically the idea of using arrays to buffer the output of the GS,
>> subclassing the vec4_gs_visitor for gen6 and generalizing
>> emit_urb_slot().
>>
>> 2.- Geometry shaders were already being used in gen6 to implement
>> transform feedback support for vertex shaders. We have not changed
>> this. These patches focus on adding support for user-provided geometry
>> shaders and transform feedback support for the geometry shader stage.
>> In the future it probably makes sense to merge transform feedback
>> support for the vertex shader stage in our implementation so there is
>> only one code path for geometry shaders in gen6, but it is probably
>> better to tackle that at a later moment, once we have merged this work.
>>
>> 2.- On Ivy Bridge there are no piglit regressions.
>>
>> 3.- On Sandy Bridge we get these results after enabling OpenGL 3.2 and
>> GLSL 1.50 (*1):
>>
>> crash: +0
>> fail:  +15 (*2)
>> pass:  +3265
>> skip:  -3280
>
> Maybe a list of the failures? Or posting the piglit comparison results
> might be helpful. For example:
> http://people.freedesktop.org/~kwg/stuff/bdw-2014-05-13/summary/regressions.html
>
> This is not really a big deal, but it would just be nice to quickly
> see what tests are failing.
>
>> (*1) Including Jordan's patches from the series "Gen6 render surface
>> state changes" since these are required to enable layered rendering in
>> geometry shaders. The numbers were obtained by comparing master with
>> Jordan's patches on top (OpenGL 3.1, GLSL 1.40) against master with
>> these and Jordan's patches on top (OpenGL 3.2, GLSL 1.50)
>
> I finally pushed my gen6-layered series to master. (a1dca70)
>
> I wonder if you might push these patches to a publicly available branch?
>
> Thanks!
>
> -Jordan

Sure. Samuel, can you do this?

Iago

>> (*2) These are mostly tests that either fail in Ivy Bridge too, are GS
>> variants of tests that also fail for the VS/FS stages or relate to
>> other aspects of OpenGL 3.2 that are not related with geometry shaders.
>>
>> 4.- With these patches, the following piglit test hangs:
>>
>> bin/glsl-1.50-geometry-primitive-id-restart GL_TRIANGLE_STRIP_ADJACENCY
>>
>> This problem seems to be unrelated to our implementation, since the
>> hang happens only for that primitive type, only when using
>> glDrawElements() (so glDrawArrays works fine), and only in specific
>> cases where the list of indices provided includes repeated indices with
>> a certain pattern. Actually, this test hangs even if we have a geometry
>> shader that does nothing (i.e. an empty main function), where the code
>> we generate is trivial and works with any other primitive type. Based
>> on this, I conclude that this is a problem originating somewhere else,
>> I think probably a hardware bug. Because of this, piglit runs with
>> these patches should exclude this test by including
>> -x primitive-id-restart. The offending piglit test can be trivially
>> reworked to avoid repeating indices in the call to glDrawElements()
>> too. I'll develop this issue further in another thread so we can decide
>> what to do about this problem.
>>
>> I'll be on holidays for the next two weeks, starting tomorrow, but
>> Samuel will be around from Tuesday next week so he can start acting on
>> the review feedback we get.
>>
>> A quick summary of the patches:
>> - Patch 1: is actually about gen7, but since gen6's dispatch mode for
>>   geometry shaders is equivalent to gen7's SINGLE mode it makes sense to
>>   do this first.
>> - Patches 2-4 refactor 3DSTATE_GS to accommodate the code path for
>>   user-provided geometry shaders while keeping the original code that
>>   handles TF support in vertex shaders.
>> - Patches 5-13 implement generator opcodes, configure state packets and
>>   handle required URB space.
>> - Patches 14-15 generalize emit_urb_slot() so we can reuse that code.
>> - Patches 16-19 are the gen6 geometry shader visitor implementation.
>> - Patches 20-21 implement gl_PrimitiveIDIn.
>> - Patch 22 makes sure we compute the right VUE map for user-provided GS.
>> - Patch 23 enables texture related functions in the GS stage.
>> - Patches 24-33 mostly implement transform feedback
>> - Patch 34 handles uploading of ubo and pull constant surfaces
>> - Patch 35 makes gen6 use this implementation of geometry shaders
>> - Patches 36-37 enable GLSL 1.5 and OpenGL 3.2 in gen6
>>
>> Iago Toral Quiroga (23):
>>   i965/gs: Use single dispatch mode as fallback to dual object mode when
>>     possible.
>>   i965/gen6/gs: Setup constant push buffers for gen6 geometry shaders.
>>   i965/gen6/gs: Implement GS_OPCODE_FF_SYNC.
>>   i965/gen6/gs: Implement GS_OPCODE_URB_WRITE_ALLOCATE.
>>   i965/gen6/gs: Add instruction URB flags to geometry shaders EOT
>>     message.
>>   i965/gen6/gs: Compute URB entry size for user-provided geometry
>>     shaders.
>>   i965/gen6/gs: Enable
Re: [Mesa-dev] [RFC PATCH 00/16] A new IR for Mesa
On 20.08.2014 00:04, Connor Abbott wrote: On Mon, Aug 18, 2014 at 8:52 PM, Michel Dänzer mic...@daenzer.net wrote: On 19.08.2014 01:28, Connor Abbott wrote: On Mon, Aug 18, 2014 at 4:32 AM, Michel Dänzer mic...@daenzer.net wrote: On 16.08.2014 09:12, Connor Abbott wrote: I know what you might be thinking right now. Wait, *another* IR? Don't we already have like 5 of those, not counting all the driver-specific ones? Isn't this stuff complicated enough already? Well, there are some pretty good reasons to start afresh (again...). In the years we've been using GLSL IR, we've come to realize that, in fact, it's not what we want *at all* to do optimizations on. Did you evaluate using LLVM IR instead of inventing yet another one? Yes. See http://lists.freedesktop.org/archives/mesa-dev/2014-February/053502.html and http://lists.freedesktop.org/archives/mesa-dev/2014-February/053522.html I know Ian can't deal with LLVM for some reason. I was wondering if *you* evaluated it, and if so, why you rejected it. First of all, thank you for sharing more specific information than 'table-flipping rage'. * LLVM is on a different release schedule (6 months vs. 3 months), has a different review process, etc., which means that to add support for new functionality that involves shaders, we now have to submit patches to two separate projects, and then 2 months later when we ship Mesa it turns out that nobody can actually use the new feature because it depends upon an unreleased version of LLVM that won't be released for another 3 months and then packaged by distros even later... This has indeed been frustrating at times, but it's better now for backend changes since Tom has been making LLVM point releases. As for the GLSL frontend, I agree with Tom that it shouldn't require that much direct interaction with the LLVM project. we've already had problems where distros refused to ship newer Mesa releases because radeon depended on a version of LLVM newer than the one they were shipping, [...] 
That's news to me, can you be more specific? That sounds like basically a distro issue though, since different LLVM versions can be installed in parallel (and the one used by default doesn't have to be the newest one). And it even works if another part of the same process uses a different version of LLVM. -- Earthling Michel Dänzer| http://www.amd.com Libre software enthusiast |Mesa and X developer
Re: [Mesa-dev] [PATCH] pipe-loader: Fix memory leak v2
On 20 August 2014 00:49, Tom Stellard thomas.stell...@amd.com wrote:

CC: 10.2 mesa-sta...@lists.freedesktop.org

v2:
  - Change driver_name to char*

I knew there was a reason as to why I put a comment in there. Thanks for tracking it down Tom.

Reviewed-by: Emil Velikov emil.l.veli...@gmail.com

---
 src/gallium/auxiliary/pipe-loader/pipe_loader.h     | 2 +-
 src/gallium/auxiliary/pipe-loader/pipe_loader_drm.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/auxiliary/pipe-loader/pipe_loader.h b/src/gallium/auxiliary/pipe-loader/pipe_loader.h
index 8ff00b1..6127a6a 100644
--- a/src/gallium/auxiliary/pipe-loader/pipe_loader.h
+++ b/src/gallium/auxiliary/pipe-loader/pipe_loader.h
@@ -67,7 +67,7 @@ struct pipe_loader_device {
       } pci;
    } u; /** Discriminated by \a type */
-   const char *driver_name;
+   char *driver_name;
    const struct pipe_loader_ops *ops;
 };
diff --git a/src/gallium/auxiliary/pipe-loader/pipe_loader_drm.c b/src/gallium/auxiliary/pipe-loader/pipe_loader_drm.c
index 1bbaf19..88056f5 100644
--- a/src/gallium/auxiliary/pipe-loader/pipe_loader_drm.c
+++ b/src/gallium/auxiliary/pipe-loader/pipe_loader_drm.c
@@ -256,7 +256,7 @@ pipe_loader_drm_release(struct pipe_loader_device **dev)
       util_dl_close(ddev->lib);
    close(ddev->fd);
-   /* XXX: Free ddev->base.driver_name - strdup at loader_get_driver_for_fd */
+   FREE(ddev->base.driver_name);
    FREE(ddev);
    *dev = NULL;
 }
--
1.8.3.1
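For readers following the fix above, here is a minimal sketch of the ownership pattern it relies on: a string duplicated when the device is probed has to be freed when the device is released, and the field is declared non-const because the struct owns the allocation. Every name here (demo_device, demo_probe, demo_release, demo_strdup) is hypothetical and only stands in for the pipe-loader code; this is not the Mesa implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for struct pipe_loader_device; only the field
// that matters for the leak is modelled.
struct demo_device {
   char *driver_name;   // owned by the device: allocated at probe, freed at release
};

// Duplicate a string with malloc so that free() is the matching deallocator.
char *demo_strdup(const char *s) {
   std::size_t n = std::strlen(s) + 1;
   char *copy = static_cast<char *>(std::malloc(n));
   if (copy)
      std::memcpy(copy, s, n);
   return copy;
}

// Allocate a device and take a private copy of the driver name.
demo_device *demo_probe(const char *name) {
   demo_device *dev = static_cast<demo_device *>(std::malloc(sizeof(*dev)));
   if (!dev)
      return nullptr;
   dev->driver_name = demo_strdup(name);
   if (!dev->driver_name) {
      std::free(dev);
      return nullptr;
   }
   return dev;
}

// Release everything the device owns, then clear the caller's pointer.
void demo_release(demo_device **dev) {
   if (!dev || !*dev)
      return;
   std::free((*dev)->driver_name);   // without this line the name leaks
   std::free(*dev);
   *dev = nullptr;
}
```

A caller would pair demo_probe("i965") with demo_release(&dev); calling demo_release a second time on the now-null pointer is a no-op in this sketch.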
Re: [Mesa-dev] [PATCH 00/37] Geometry shader support in Sandy Bridge
On Wed, 2014-08-20 at 11:16 +0200, Iago Toral wrote:
On 2014-08-16 09:11, Jordan Justen wrote:
On Thu, Aug 14, 2014 at 4:11 AM, Iago Toral Quiroga ito...@igalia.com wrote:

Hi, this series brings support for geometry shaders in Sandy Bridge (gen6) and is combined work from Samuel and myself. A few notes:

1.- Some patches have been based on original work by Ilia Mirkin, specifically the idea of using arrays to buffer the output of the GS, subclassing the vec4_gs_visitor for gen6 and generalizing emit_urb_slot().

2.- Geometry shaders were already being used in gen6 to implement transform feedback support for vertex shaders. We have not changed this. These patches focus on adding support for user-provided geometry shaders and transform feedback support for the geometry shader stage. In the future it probably makes sense to merge transform feedback support for the vertex shader stage in our implementation so there is only one code path for geometry shaders in gen6, but it is probably better to tackle that at a later moment, once we have merged this work.

2.- On Ivy Bridge there are no piglit regressions.

3.- On Sandy Bridge we get these results after enabling OpenGL 3.2 and GLSL 1.50 (*1):

crash: +0
fail: +15 (*2)
pass: +3265
skip: -3280

Maybe a list of the failures? Or posting the piglit comparison results might be helpful. For example: http://people.freedesktop.org/~kwg/stuff/bdw-2014-05-13/summary/regressions.html

This is not really a big deal, but it would just be nice to quickly see what tests are failing.

(*1) Including Jordan's patches from the series Gen6 render surface state changes since these are required to enable layered rendering in geometry shaders. The numbers were obtained by comparing master with Jordan's patches on top (OpenGL 3.1, GLSL 1.40) against master with these and Jordan's patches on top (OpenGL 3.2, GLSL 1.50)

I finally pushed my gen6-layered series to master. (a1dca70) I wonder if you might push these patches to a publicly available branch?
Thanks! -Jordan

Sure. Samuel, can you do this?

Sure! The public branch with the submitted patches rebased on top of yesterday's master is here: https://github.com/samuelig/mesa/tree/gs-support-snb-for-submission

And the piglit comparison between yesterday's master, which already has Jordan's patches for SNB (OpenGL 3.1, GLSL 1.40), and our patches (OpenGL 3.2, GLSL 1.50) is here: http://samuelig.es/mesa-dev/all-submitted-patches-19-aug/

Sorry for the delay, uploading the whole piglit HTML output is taking a lot of time with my Internet connection :-S If you find that some files are missing, just try again later (the FTP transfer is still uploading files).

Best regards,
Samuel
Re: [Mesa-dev] [PATCH 37/37] i965/gen6: enable OpenGL 3.2
On Thu, 2014-08-14 at 08:46 -0700, Matt Turner wrote:
On Thu, Aug 14, 2014 at 4:12 AM, Iago Toral Quiroga ito...@igalia.com wrote:

From: Samuel Iglesias Gonsalvez sigles...@igalia.com
Signed-off-by: Samuel Iglesias Gonsalvez sigles...@igalia.com
---

I'd squash the last two patches together. I think it's likely we can go to GL 3.3 on Sandybridge, but we'd probably like to take a look at the piglit results first, so this patch that increases it to 3.2 seems fine. Both of these (squashed together) are

Reviewed-by: Matt Turner matts...@gmail.com

Thanks Matt. I am going to do the squash for the second version of the patches. I will wait a few days to receive more feedback before sending the second version :-)

Sam
[Mesa-dev] EGL_KHR_vg_parent_image extension
Hello. I want to use the EGL_KHR_vg_parent_image extension. I grepped the Mesa source code but found nothing. Then I searched on Google and found this: http://lists.freedesktop.org/archives/mesa-dev/2010-September/002900.html Then I looked at the Mesa cgit and it looks like this was not merged. So Mesa does not support the EGL_KHR_vg_parent_image extension? Thanks.
Re: [Mesa-dev] [RFC PATCH 00/16] A new IR for Mesa
On Tue, Aug 19, 2014 at 11:33 PM, Francisco Jerez curroje...@riseup.net wrote: Connor Abbott cwabbo...@gmail.com writes: On Tue, Aug 19, 2014 at 11:40 AM, Francisco Jerez curroje...@riseup.net wrote: Tom Stellard t...@stellard.net writes: On Tue, Aug 19, 2014 at 11:04:59AM -0400, Connor Abbott wrote: On Mon, Aug 18, 2014 at 8:52 PM, Michel Dänzer mic...@daenzer.net wrote: On 19.08.2014 01:28, Connor Abbott wrote: On Mon, Aug 18, 2014 at 4:32 AM, Michel Dänzer mic...@daenzer.net wrote: On 16.08.2014 09:12, Connor Abbott wrote: I know what you might be thinking right now. Wait, *another* IR? Don't we already have like 5 of those, not counting all the driver-specific ones? Isn't this stuff complicated enough already? Well, there are some pretty good reasons to start afresh (again...). In the years we've been using GLSL IR, we've come to realize that, in fact, it's not what we want *at all* to do optimizations on. Did you evaluate using LLVM IR instead of inventing yet another one? -- Earthling Michel Dänzer| http://www.amd.com Libre software enthusiast |Mesa and X developer Yes. See http://lists.freedesktop.org/archives/mesa-dev/2014-February/053502.html and http://lists.freedesktop.org/archives/mesa-dev/2014-February/053522.html I know Ian can't deal with LLVM for some reason. I was wondering if *you* evaluated it, and if so, why you rejected it. 
-- Earthling Michel Dänzer| http://www.amd.com Libre software enthusiast |Mesa and X developer Well, first of all, the fact that Ian and Ken don't want to use it means that any plan to use LLVM for the Intel driver is dead in the water anyways - you can translate NIR into LLVM if you want, but for i965 we want to share optimizations between our 2 backends (FS and vec4) that we can't do today in GLSL IR so this is what we want to use for that, and since nobody else does anything with the core GLSL compiler except when they have to, when we start moving things out of GLSL IR this will probably replace GLSL IR as the infrastructure that all Mesa drivers use. But with that in mind, here are a few reasons why we wouldn't want to use LLVM: * LLVM wasn't built to understand structured CFG's, meaning that you need to re-structurize it using a pass that's fragile and prone to break if some other pass optimizes the shader in a way that makes it non-structured (i.e. not expressible in terms of loops and if statements). This loss of information also means that passes that need to know things like, for example, the loop nesting depth need to do an analysis pass whereas with NIR you can just walk up the control flow tree and count the number of loops we hit. LLVM has a pass to structurize the CFG. We use it in the radeon drivers, and it is run after all of the other LLVM optimizations which have no concept of structured CFG. It's not bug free, but it works really well even with all of the complex OpenCL kernels we throw at it. Your point about losing information when the CFG is de-structurized is valid, but for things like loop depth, I'm not sure why we couldn't write an LLVM analysis pass for this (if one doesn't already exist). I don't think this is such a big deal either. 
At least the structurization pass used on newer AMD hardware isn't fragile in the way you seem to imply -- AFAIK (unlike the old AMDIL heuristic algorithm) it's guaranteed to give you a valid structurized output no matter what the previous optimization passes have done to the CFG, modulo bugs. I admit that the situation is nevertheless suboptimal. Ideally this information wouldn't get lost along the way. For the long term we may want to represent structured control flow directly in the IR as you say, I just don't see how reinventing the IR saves us any work if we could just fix the existing one. It seems to me that something like how we represent control flow is a pretty fundamental part of the IR - it affects any optimization pass that needs to do anything beyond adding and removing instructions. How would you fix that, especially given that LLVM is primarily designed for CPU's where you don't want to be restricted to structured control flow at all? It seems like our goals (preserve the structure) conflict with the way LLVM has been designed. I think we can fix this by introducing new structured variants of the branch instruction in a way that doesn't alter the fundamental structure of the IR. E.g. an if branch could look like: ifbr i1 cond, label iftrue, label iffalse, label join Where both branches are guaranteed to converge at join. Sure, this will require fixing many assumptions, but on the one hand it's not immediately required (as we can address this problem for the time being using the same solution AMD uses) and on the other hand
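
The loop-nesting-depth point raised earlier in this message ("walk up the control flow tree and count the number of loops we hit") can be illustrated with a small sketch. It assumes a structured control-flow tree in which every node keeps a pointer to its parent; the cf_node and cf_kind types are hypothetical and are not the actual data structures of NIR or LLVM.

```cpp
#include <cassert>

// Hypothetical node kinds for a structured control-flow tree.
enum class cf_kind { function, loop, if_stmt, block };

struct cf_node {
   cf_kind kind;
   const cf_node *parent;   // nullptr for the root of the tree
};

// Count the loops enclosing a node by following parent links up to the
// root. A node that is itself a loop is included in the count.
int loop_depth(const cf_node *node) {
   int depth = 0;
   for (const cf_node *n = node; n != nullptr; n = n->parent) {
      if (n->kind == cf_kind::loop)
         ++depth;
   }
   return depth;
}
```

With an unstructured CFG the same number has to come from a separate loop analysis over the basic-block graph, which is the trade-off being argued about in the thread.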
Re: [Mesa-dev] [RFC PATCH 00/16] A new IR for Mesa
On Wed, Aug 20, 2014 at 2:41 AM, Michel Dänzer mic...@daenzer.net wrote: On 20.08.2014 00:04, Connor Abbott wrote: On Mon, Aug 18, 2014 at 8:52 PM, Michel Dänzer mic...@daenzer.net wrote: On 19.08.2014 01:28, Connor Abbott wrote: On Mon, Aug 18, 2014 at 4:32 AM, Michel Dänzer mic...@daenzer.net wrote: On 16.08.2014 09:12, Connor Abbott wrote: I know what you might be thinking right now. Wait, *another* IR? Don't we already have like 5 of those, not counting all the driver-specific ones? Isn't this stuff complicated enough already? Well, there are some pretty good reasons to start afresh (again...). In the years we've been using GLSL IR, we've come to realize that, in fact, it's not what we want *at all* to do optimizations on. Did you evaluate using LLVM IR instead of inventing yet another one? Yes. See http://lists.freedesktop.org/archives/mesa-dev/2014-February/053502.html and http://lists.freedesktop.org/archives/mesa-dev/2014-February/053522.html I know Ian can't deal with LLVM for some reason. I was wondering if *you* evaluated it, and if so, why you rejected it. First of all, thank you for sharing more specific information than 'table-flipping rage'. * LLVM is on a different release schedule (6 months vs. 3 months), has a different review process, etc., which means that to add support for new functionality that involves shaders, we now have to submit patches to two separate projects, and then 2 months later when we ship Mesa it turns out that nobody can actually use the new feature because it depends upon an unreleased version of LLVM that won't be released for another 3 months and then packaged by distros even later... This has indeed been frustrating at times, but it's better now for backend changes since Tom has been making LLVM point releases. As for the GLSL frontend, I agree with Tom that it shouldn't require that much direct interaction with the LLVM project. 
we've already had problems where distros refused to ship newer Mesa releases because radeon depended on a version of LLVM newer than the one they were shipping, [...] That's news to me, can you be more specific? That sounds like basically a distro issue though, since different LLVM versions can be installed in parallel (and the one used by default doesn't have to be the newest one). And it even works if another part of the same process uses a different version of LLVM. Sorry, I heard about this from one of the other Intel folks (I believe Ian) so they'll have to comment more on it. -- Earthling Michel Dänzer| http://www.amd.com Libre software enthusiast |Mesa and X developer
Re: [Mesa-dev] [RFC PATCH 00/16] A new IR for Mesa
On 20.08.2014 at 14:33, Connor Abbott wrote:
On Tue, Aug 19, 2014 at 11:57 PM, Christian König deathsim...@vodafone.de wrote:

I think we can fix this by introducing new structured variants of the branch instruction in a way that doesn't alter the fundamental structure of the IR. E.g. an if branch could look like:

ifbr i1 cond, label iftrue, label iffalse, label join

Where both branches are guaranteed to converge at join. Sure, this will require fixing many assumptions, but on the one hand it's not immediately required (as we can address this problem for the time being using the same solution AMD uses) and on the other hand it's still less work than starting from scratch.

Well, I wrote the structurizer pass in LLVM you are talking about here, and from my experience you really don't want any structured form of control flow in the IR. Structured control flow is just a specialized form of unstructured control flow, and even if it looks rather awkward at first glance, it is indeed simpler to destructurize the compiler-generated control flow for optimization and structurize again for instruction selection.

That's interesting. I still think that with the right infrastructure, having structured control flow really isn't that bad, and it prevents optimizations from doing work like optimizing if (foo) { break; } into a single conditional branch when clearly that's not very productive. I would suspect that LLVM just isn't very good at structured control flow since it wasn't designed that way, and that's why it seems hard to work with.

Well, maybe I should note that a lot of closed source drivers are using LLVM for their internal IR representation, and as far as I know more or less all of them have a rather structured form of control flow. The problem with LLVM really isn't its IR, because it's not designed to be CPU centric like you obviously think, but rather that LLVM doesn't have a stable interface and is a rather fast moving project.
Actually, for R600 for example, you do want to optimize a pattern like if (foo) { break; } into a conditional branch, because if you look at the ISA you see that the LOOP_BREAK pattern is able to take an additional condition to apply to the current execution mask. When you design a hardware-independent IR, looking at the backend hardware level as you do right now is actually the completely wrong approach. What you need to do is make the IR as simple as possible and then allow specialized operations on it to translate it into the desired machine code. In other words, the logic necessary for code generation shouldn't be inside the IR, because then the IR is specialized to this specific problem. Instead the logic needs to be in the tools that surround the IR.

Regards,
Christian.

The only reason I've annotated the LLVM IR with specialized intrinsics for the SI backend was laziness and I wouldn't do that again given the chance.

And it's very likely that these backends, which probably aren't using SSA due to the aforementioned difficulties, will also benefit from having modifiers already folded for them - this is something that's already a problem for i965 vec4 backend and that NIR will help a lot.

Well, I have the impression that much of the reason why the i965 vec4 backend has lagged behind so much in comparison with the fs backend is precisely because it's so annoying to optimize vec4 code. It seems painful to me that you have this built into the core instruction set so generic optimization passes will have to be explicitly aware of it. I wouldn't be surprised if the i965 vec4 benefited at least as much from scalarizing the code, performing optimizations there, and re-vectorizing afterwards.

We thought about doing something like that, but I don't think it's really that much of a burden when it comes to the rest of the IR.
Most of the difficulty of working with a vec4 representation comes from the fact that instructions can partially update their outputs, and once we convert to SSA that problem goes away since there are no partial updates in SSA. Coming out of SSA is where the difficulty lies, but I still think that's a solvable problem, just a difficult one. Plus, there's the problem of how to do the vectorization - you could do it in SSA, but then you still have the hard bit of coming out of SSA and so you're back to square one, or you could do it once you're out of SSA but then it's a lot harder to reason about since you're back to having partial updates.

Completely agree. Being able to do vectorization in an IR is important, but you shouldn't try to handle backend specific swizzle operations and vectorizing restrictions in the IR. Just looking at the swizzle restrictions of R600, for example, I really can't imagine that you want to represent this in a common IR between all different drivers.

Regards,
Christian.

On 20.08.2014 at 08:33, Francisco Jerez wrote: Connor Abbott cwabbo...@gmail.com writes: On Tue, Aug 19, 2014 at 11:40
Re: [Mesa-dev] EGL_KHR_vg_parent_image extension
On 20/08/14 12:42, Peter Hanzel wrote:

Hello. I want to use the EGL_KHR_vg_parent_image extension. I grepped the Mesa source code but found nothing. Then I searched on Google and found this: http://lists.freedesktop.org/archives/mesa-dev/2010-September/002900.html Then I looked at the Mesa cgit and it looks like this was not merged. So Mesa does not support the EGL_KHR_vg_parent_image extension?

Hi Peter,

As far as I can see, Chia-I requested some trivial changes to the patch, but the original author never replied. In my opinion anyone is welcome to address the comments and resubmit the patch, including you. Feel free to give it a try ;)

-Emil

Thanks.
Re: [Mesa-dev] [RFC PATCH 00/16] A new IR for Mesa
Connor Abbott cwabbo...@gmail.com writes: On Tue, Aug 19, 2014 at 11:33 PM, Francisco Jerez curroje...@riseup.net wrote: Connor Abbott cwabbo...@gmail.com writes: On Tue, Aug 19, 2014 at 11:40 AM, Francisco Jerez curroje...@riseup.net wrote: Tom Stellard t...@stellard.net writes: On Tue, Aug 19, 2014 at 11:04:59AM -0400, Connor Abbott wrote: On Mon, Aug 18, 2014 at 8:52 PM, Michel Dänzer mic...@daenzer.net wrote: On 19.08.2014 01:28, Connor Abbott wrote: On Mon, Aug 18, 2014 at 4:32 AM, Michel Dänzer mic...@daenzer.net wrote: On 16.08.2014 09:12, Connor Abbott wrote: I know what you might be thinking right now. Wait, *another* IR? Don't we already have like 5 of those, not counting all the driver-specific ones? Isn't this stuff complicated enough already? Well, there are some pretty good reasons to start afresh (again...). In the years we've been using GLSL IR, we've come to realize that, in fact, it's not what we want *at all* to do optimizations on. Did you evaluate using LLVM IR instead of inventing yet another one? -- Earthling Michel Dänzer| http://www.amd.com Libre software enthusiast |Mesa and X developer Yes. See http://lists.freedesktop.org/archives/mesa-dev/2014-February/053502.html and http://lists.freedesktop.org/archives/mesa-dev/2014-February/053522.html I know Ian can't deal with LLVM for some reason. I was wondering if *you* evaluated it, and if so, why you rejected it. 
-- Earthling Michel Dänzer| http://www.amd.com Libre software enthusiast |Mesa and X developer Well, first of all, the fact that Ian and Ken don't want to use it means that any plan to use LLVM for the Intel driver is dead in the water anyways - you can translate NIR into LLVM if you want, but for i965 we want to share optimizations between our 2 backends (FS and vec4) that we can't do today in GLSL IR so this is what we want to use for that, and since nobody else does anything with the core GLSL compiler except when they have to, when we start moving things out of GLSL IR this will probably replace GLSL IR as the infrastructure that all Mesa drivers use. But with that in mind, here are a few reasons why we wouldn't want to use LLVM: * LLVM wasn't built to understand structured CFG's, meaning that you need to re-structurize it using a pass that's fragile and prone to break if some other pass optimizes the shader in a way that makes it non-structured (i.e. not expressible in terms of loops and if statements). This loss of information also means that passes that need to know things like, for example, the loop nesting depth need to do an analysis pass whereas with NIR you can just walk up the control flow tree and count the number of loops we hit. LLVM has a pass to structurize the CFG. We use it in the radeon drivers, and it is run after all of the other LLVM optimizations which have no concept of structured CFG. It's not bug free, but it works really well even with all of the complex OpenCL kernels we throw at it. Your point about losing information when the CFG is de-structurized is valid, but for things like loop depth, I'm not sure why we couldn't write an LLVM analysis pass for this (if one doesn't already exist). I don't think this is such a big deal either. 
At least the structurization pass used on newer AMD hardware isn't fragile in the way you seem to imply -- AFAIK (unlike the old AMDIL heuristic algorithm) it's guaranteed to give you a valid structurized output no matter what the previous optimization passes have done to the CFG, modulo bugs. I admit that the situation is nevertheless suboptimal. Ideally this information wouldn't get lost along the way. For the long term we may want to represent structured control flow directly in the IR as you say, I just don't see how reinventing the IR saves us any work if we could just fix the existing one. It seems to me that something like how we represent control flow is a pretty fundamental part of the IR - it affects any optimization pass that needs to do anything beyond adding and removing instructions. How would you fix that, especially given that LLVM is primarily designed for CPU's where you don't want to be restricted to structured control flow at all? It seems like our goals (preserve the structure) conflict with the way LLVM has been designed. I think we can fix this by introducing new structured variants of the branch instruction in a way that doesn't alter the fundamental structure of the IR. E.g. an if branch could look like: ifbr i1 cond, label iftrue, label iffalse, label join Where both branches are guaranteed to converge at join. Sure, this will require fixing many assumptions, but on the one hand it's not immediately required (as we can address this problem for the time being
Re: [Mesa-dev] [PATCH 3/3] clover: ensure compat::string is \0 terminated
EdB e...@sigluy.net writes:

Each time you call c_str() it will grow; maybe you could check whether the string is already \0 terminated before adding it.

Nope, that's not how it works. Every time c_str() is called the size of the underlying array is forced to at least size-of-the-actual-string + 1, so nothing will happen if the array is already big enough.

The way we do it, we use twice the memory every time the vector capacity increases (before freeing the old vec) as we don't use a realloc. I understand c_str() should be used for debugging purposes only, but maybe it could be a problem while debugging huge strings. Or we can keep compat::string the same and remove c_str(). If someone needs it, they could use the std::string conversion operator and call c_str() on the result. In the end, the memory used is the same.

On 2014-08-18 14:35, Francisco Jerez wrote:
EdB edb+m...@sigluy.net writes:

otherwise c_str() is not safe
---
 src/gallium/state_trackers/clover/util/compat.hpp | 54 ---
 1 file changed, 48 insertions(+), 6 deletions(-)

diff --git a/src/gallium/state_trackers/clover/util/compat.hpp b/src/gallium/state_trackers/clover/util/compat.hpp
index 6f0f7cc..7ca1f85 100644
--- a/src/gallium/state_trackers/clover/util/compat.hpp
+++ b/src/gallium/state_trackers/clover/util/compat.hpp
@@ -197,7 +197,7 @@ namespace clover {
             return _p[i];
          }
-      private:
+      protected:
          iterator _p; //memory array
          size_type _s; //size
          size_type _c; //capacity
@@ -306,18 +306,56 @@ namespace clover {
       class string : public vector<char> {
       public:
-         string() : vector() {
+         string() : vector(0, 1) {
+            _p[_s - 1] = '\0';
          }
-         string(const char *p) : vector(p, std::strlen(p)) {
+         string(const char *p) : vector(p, std::strlen(p) + 1) {
+            _p[_s - 1] = '\0';
          }
          template<typename C>
-         string(const C &v) : vector(v) {
+         string(const C &v) : vector(&*v.begin(), v.size() + 1) {
+            _p[_s - 1] = '\0';
          }
-         operator std::string() const {
-            return std::string(begin(), end());
+         void
+         reserve(size_type m) {
+            vector::reserve(m + 1);
+         }
+
+         void
+         resize(size_type m, char x = '\0') {
+            vector::resize(m + 1, x);
+            _p[_s - 1] = '\0';
+         }
+
+         void
+         push_back(char x) {
+            reserve(_s + 1);
+            _p[_s - 1] = x;
+            _p[_s] = '\0';
+            ++_s;
+         }
+
+         size_type
+         size() const {
+            return _s - 1;
+         }
+
+         size_type
+         capacity() const {
+            return _c - 1;
+         }
+
+         iterator
+         end() {
+            return _p + size();
+         }
+
+         const_iterator
+         end() const {
+            return _p + size();
          }

At this point where all methods from the base class need to be redefined it probably stops making sense to use inheritance instead of aggregation. Once we've done that, fixing c_str() gets a lot easier (two lines of code) because we can just declare the container as mutable and fix up the NULL terminator when c_str() is called. Both changes attached.

          const char *
@@ -325,6 +363,10 @@ namespace clover {
             return begin();
          }
+         operator std::string() const {
+            return std::string(begin(), end());
+         }
+
          const char *
          find(const string &s) const {
             for (size_t i = 0; i + s.size() < size(); ++i) {
--
2.0.4
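
The aggregation idea suggested in the review above can be sketched as follows. This is not the attached patch, which is not reproduced in this thread; it is a minimal illustration that assumes a hand-rolled buffer, and the class name demo_string and its helpers grow() and append() are hypothetical. The storage is declared mutable so that a const c_str() can make room for the terminator and write it without changing the logical size.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

class demo_string {
public:
   demo_string() = default;
   explicit demo_string(const char *p) { append(p, std::strlen(p)); }
   demo_string(const demo_string &) = delete;             // copying omitted for brevity
   demo_string &operator=(const demo_string &) = delete;
   ~demo_string() { std::free(buf); }

   std::size_t size() const { return len; }
   void push_back(char c) { append(&c, 1); }

   // Force the capacity to at least size() + 1 and write the terminator
   // just past the logical end. Nothing is reallocated when the buffer
   // is already big enough, and size() is unchanged.
   const char *c_str() const {
      grow(len + 1);
      buf[len] = '\0';
      return buf;
   }

private:
   // Ensure the buffer holds at least `want` bytes; aborts on allocation
   // failure to keep the sketch short.
   void grow(std::size_t want) const {
      if (want <= cap)
         return;
      char *p = static_cast<char *>(std::realloc(buf, want));
      if (!p)
         std::abort();
      buf = p;
      cap = want;
   }

   void append(const char *p, std::size_t n) {
      if (n == 0)
         return;
      grow(len + n);
      std::memcpy(buf + len, p, n);
      len += n;
   }

   mutable char *buf = nullptr;   // mutable: c_str() may grow the storage
   mutable std::size_t cap = 0;
   std::size_t len = 0;
};
```

The sketch grows the buffer by the exact amount requested, which keeps it short; a real container would more likely grow geometrically to avoid repeated reallocation on push_back().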
Re: [Mesa-dev] [RFC PATCH 00/16] A new IR for Mesa
And don't forget that explicit vec4 becomes immensely amusing once you add fp64/double to the problem. OG. On Wed, Aug 20, 2014 at 4:01 PM, Francisco Jerez curroje...@riseup.net wrote: Connor Abbott cwabbo...@gmail.com writes: On Tue, Aug 19, 2014 at 11:33 PM, Francisco Jerez curroje...@riseup.net wrote: Connor Abbott cwabbo...@gmail.com writes: On Tue, Aug 19, 2014 at 11:40 AM, Francisco Jerez curroje...@riseup.net wrote: Tom Stellard t...@stellard.net writes: On Tue, Aug 19, 2014 at 11:04:59AM -0400, Connor Abbott wrote: On Mon, Aug 18, 2014 at 8:52 PM, Michel Dänzer mic...@daenzer.net wrote: On 19.08.2014 01:28, Connor Abbott wrote: On Mon, Aug 18, 2014 at 4:32 AM, Michel Dänzer mic...@daenzer.net wrote: On 16.08.2014 09:12, Connor Abbott wrote: I know what you might be thinking right now. Wait, *another* IR? Don't we already have like 5 of those, not counting all the driver-specific ones? Isn't this stuff complicated enough already? Well, there are some pretty good reasons to start afresh (again...). In the years we've been using GLSL IR, we've come to realize that, in fact, it's not what we want *at all* to do optimizations on. Did you evaluate using LLVM IR instead of inventing yet another one? -- Earthling Michel Dänzer| http://www.amd.com Libre software enthusiast |Mesa and X developer Yes. See http://lists.freedesktop.org/archives/mesa-dev/2014-February/053502.html and http://lists.freedesktop.org/archives/mesa-dev/2014-February/053522.html I know Ian can't deal with LLVM for some reason. I was wondering if *you* evaluated it, and if so, why you rejected it. 
-- Earthling Michel Dänzer| http://www.amd.com Libre software enthusiast |Mesa and X developer Well, first of all, the fact that Ian and Ken don't want to use it means that any plan to use LLVM for the Intel driver is dead in the water anyways - you can translate NIR into LLVM if you want, but for i965 we want to share optimizations between our 2 backends (FS and vec4) that we can't do today in GLSL IR so this is what we want to use for that, and since nobody else does anything with the core GLSL compiler except when they have to, when we start moving things out of GLSL IR this will probably replace GLSL IR as the infrastructure that all Mesa drivers use. But with that in mind, here are a few reasons why we wouldn't want to use LLVM: * LLVM wasn't built to understand structured CFG's, meaning that you need to re-structurize it using a pass that's fragile and prone to break if some other pass optimizes the shader in a way that makes it non-structured (i.e. not expressible in terms of loops and if statements). This loss of information also means that passes that need to know things like, for example, the loop nesting depth need to do an analysis pass whereas with NIR you can just walk up the control flow tree and count the number of loops we hit. LLVM has a pass to structurize the CFG. We use it in the radeon drivers, and it is run after all of the other LLVM optimizations which have no concept of structured CFG. It's not bug free, but it works really well even with all of the complex OpenCL kernels we throw at it. Your point about losing information when the CFG is de-structurized is valid, but for things like loop depth, I'm not sure why we couldn't write an LLVM analysis pass for this (if one doesn't already exist). I don't think this is such a big deal either. 
At least the structurization pass used on newer AMD hardware isn't fragile in the way you seem to imply -- AFAIK (unlike the old AMDIL heuristic algorithm) it's guaranteed to give you a valid structurized output no matter what the previous optimization passes have done to the CFG, modulo bugs. I admit that the situation is nevertheless suboptimal. Ideally this information wouldn't get lost along the way. For the long term we may want to represent structured control flow directly in the IR as you say, I just don't see how reinventing the IR saves us any work if we could just fix the existing one. It seems to me that something like how we represent control flow is a pretty fundamental part of the IR - it affects any optimization pass that needs to do anything beyond adding and removing instructions. How would you fix that, especially given that LLVM is primarily designed for CPU's where you don't want to be restricted to structured control flow at all? It seems like our goals (preserve the structure) conflict with the way LLVM has been designed. I think we can fix this by introducing new structured variants of the branch instruction in a way that doesn't alter the fundamental structure of the IR. E.g. an if branch could look like: ifbr i1 cond, label iftrue, label iffalse, label join Where both
Re: [Mesa-dev] [RFC PATCH 00/16] A new IR for Mesa
On Wed, Aug 20, 2014 at 5:57 AM, Christian König deathsim...@vodafone.de wrote: On 20.08.2014 at 14:33, Connor Abbott wrote: On Tue, Aug 19, 2014 at 11:57 PM, Christian König deathsim...@vodafone.de wrote: I think we can fix this by introducing new structured variants of the branch instruction in a way that doesn't alter the fundamental structure of the IR. E.g. an if branch could look like: ifbr i1 cond, label iftrue, label iffalse, label join Where both branches are guaranteed to converge at join. Sure, this will require fixing many assumptions, but on the one hand it's not immediately required (as we can address this problem for the time being using the same solution AMD uses) and on the other hand it's still less work than starting from scratch. Well, I wrote the structurizer pass in LLVM you are talking about here, and from my experience you really don't want any structured form of control flow in the IR. Structured control flow is just a specialized form of unstructured control flow, and even if it looks rather awkward at first glance it is indeed simpler to destructurize the compiler-generated control flow for optimization and structurize again for instruction selection. That's interesting. I still think that with the right infrastructure, having structured control flow really isn't that bad, and it prevents optimizations from doing work like optimizing if (foo) { break; } into a single conditional branch when clearly that's not very productive. I would suspect that LLVM just isn't very good at structured control flow since it wasn't designed that way, and that's why it seems hard to work with. Well, maybe I should note that a lot of closed-source drivers are using LLVM for their internal IR representation, and as far as I know more or less all of them have a rather structured form of control flow. 
The problem with LLVM really isn't its IR, because it's not CPU-centric by design like you obviously think, but rather that LLVM doesn't have a stable interface and is a rather fast-moving project. Actually, for example for R600 you do want to optimize a pattern like if (foo) { break; } into a conditional branch, because if you look at the ISA you see that the LOOP_BREAK pattern is able to take an additional condition to apply to the current execution mask. When you design a hardware-independent IR, looking at the backend hardware level like you do right now is actually the completely wrong approach. What you need to do is make the IR as simple as possible and then allow specialized operations on it to translate it into the desired machine code. I'm not looking at the backend hardware level here, but at other languages (in this case D3D bytecode) that support the same thing, and therefore it's something that the HW probably has/can do efficiently and something that app developers (especially those translating D3D bytecode into GLSL, of which there are quite a lot) expect. NIR obviously doesn't support every HW's strange restrictions on swizzling and modifiers; backends can do the lowering for that themselves. In other words the logic necessary for code generation shouldn't be inside the IR, because then the IR is specialized to this specific problem. Instead the logic needs to be in the tools that surround the IR. Regards, Christian. These are all good points, and frankly I don't think it would be too bad if we switched to LLVM. Unfortunately, though, I think that the Intel driver won't be using LLVM in the near future, if nothing else for various non-technical reasons I'm not at liberty to discuss, but certainly making the switch to a flat SSA-based IR, in addition to being an improvement over the current state of things, will help us move closer to LLVM and see if it's something we would want to pursue. 
Connor The only reason I've annotated the LLVM IR with specialized intrinsics for the SI backend was laziness and I wouldn't do that again given the chance. And it's very likely that these backends, which probably aren't using SSA due to the aforementioned difficulties, will also benefit from having modifiers already folded for them - this is something that's already a problem for i965 vec4 backend and that NIR will help a lot. Well, I have the impression that much of the reason why the i965 vec4 backend has lagged behind so much in comparison with the fs backend is precisely because it's so annoying to optimize vec4 code. It seems painful to me that you have this built into the core instruction set so generic optimization passes will have to be explicitly aware of it. I wouldn't be surprised if the i965 vec4 benefited at least as much from scalarizing the code, performing optimizations there, and re-vectorizing afterwards. We thought about doing something like that, but I don't think it's really that much of a burden when it comes to the rest of the IR. Most of the difficulty of working with a vec4
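The loop-nesting-depth argument made earlier in this thread (with a structured IR you can "just walk up the control flow tree and count the number of loops we hit") can be sketched roughly as follows. This is only an illustration: the `cf_node` type and its fields are hypothetical names made up for this sketch and are not taken from the NIR patches or from LLVM.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node kinds for a structured control-flow tree. These names
 * are illustrative only and are not NIR's actual types. */
enum cf_node_type {
   CF_NODE_FUNCTION,
   CF_NODE_IF,
   CF_NODE_LOOP,
   CF_NODE_BLOCK,
};

struct cf_node {
   enum cf_node_type type;
   struct cf_node *parent;   /* NULL at the root of the tree */
};

/* Count the loops enclosing a node by following parent pointers up to the
 * root. No separate analysis pass is needed because the nesting is part of
 * the data structure itself. */
static unsigned
cf_node_loop_depth(const struct cf_node *node)
{
   unsigned depth = 0;

   assert(node != NULL);
   for (const struct cf_node *n = node->parent; n != NULL; n = n->parent) {
      if (n->type == CF_NODE_LOOP)
         depth++;
   }
   return depth;
}
```

With an unstructured CFG the same number has to be recovered by a loop analysis over basic blocks, which is the trade-off being debated above.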
Re: [Mesa-dev] [PATCH 0/2] kms-swrast: PRIME and missing defines
On 15/08/14 22:32, Andreas Pokorny wrote:
> Hi,
>
> This adds support for dma_buf fds to kms_swrast. This is especially
> interesting for DRM-capable drivers like qxl or udl. The former recently
> gained PRIME support. The second part adds a few defines that weren't set
> anywhere else, but are necessary for dri_kms_init_screen to be non-empty.

Hi Andreas,

When I added dri_kms_init_screen() I deliberately opted out of dma_buf, as it did not make sense considering the lack of winsys handling. Now it's coming back to haunt me :)

Do you have any rough numbers on the benefit this brings us?

-Emil

> regards
> Andreas
>
> Andreas Pokorny (2):
>   kms-swrast: Support Prime fd handling
>   kms-swrast: defines missing to build kms-swrast
>
>  src/gallium/state_trackers/dri/Makefile.am        |  5 ++
>  src/gallium/state_trackers/dri/dri2.c             |  8 ++
>  src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c | 91 +++
>  3 files changed, 91 insertions(+), 13 deletions(-)
Re: [Mesa-dev] [PATCH 2/2] kms-swrast: defines missing to build kms-swrast
I have pushed a similar patch (commit 16873a6e62e) a couple of days before your post. AFAICS it should already cover this case?

-Emil

On 15/08/14 22:32, Andreas Pokorny wrote:
> ---
>  src/gallium/state_trackers/dri/Makefile.am | 5 +
>  1 file changed, 5 insertions(+)
>
> diff --git a/src/gallium/state_trackers/dri/Makefile.am b/src/gallium/state_trackers/dri/Makefile.am
> index bda75c3..bcbd081 100644
> --- a/src/gallium/state_trackers/dri/Makefile.am
> +++ b/src/gallium/state_trackers/dri/Makefile.am
> @@ -26,6 +26,7 @@ include $(top_srcdir)/src/gallium/Automake.inc
>  AM_CPPFLAGS = \
>  	$(GALLIUM_PIPE_LOADER_DEFINES) \
> +	-DDRI_TARGET \
>  	-DPIPE_SEARCH_DIR=\"$(libdir)/gallium-pipe\" \
>  	-I$(top_srcdir)/include \
>  	-I$(top_srcdir)/src/mapi \
> @@ -37,6 +38,10 @@ AM_CPPFLAGS = \
>  	$(LIBDRM_CFLAGS) \
>  	$(VISIBILITY_CFLAGS)
>
> +if HAVE_GALLIUM_SOFTPIPE
> +AM_CPPFLAGS += \
> +	-DGALLIUM_SOFTPIPE
> +endif
>  if HAVE_GALLIUM_STATIC_TARGETS
>  AM_CPPFLAGS += \
>  	-DGALLIUM_STATIC_TARGETS=1
Re: [Mesa-dev] [PATCH 1/2] kms-swrast: Support Prime fd handling
On 15/08/14 22:32, Andreas Pokorny wrote: Allows using prime fds as display target and from display target. Test for PRIME capability after initializing kms_swrast screen. Signed-off-by: Andreas Pokorny andreas.poko...@canonical.com --- src/gallium/state_trackers/dri/dri2.c | 8 ++ src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c | 91 +++ 2 files changed, 86 insertions(+), 13 deletions(-) diff --git a/src/gallium/state_trackers/dri/dri2.c b/src/gallium/state_trackers/dri/dri2.c index c466de7..e52bd71 100644 --- a/src/gallium/state_trackers/dri/dri2.c +++ b/src/gallium/state_trackers/dri/dri2.c @@ -1327,6 +1327,7 @@ dri_kms_init_screen(__DRIscreen * sPriv) const __DRIconfig **configs; struct dri_screen *screen; struct pipe_screen *pscreen = NULL; + uint64_t cap; screen = CALLOC_STRUCT(dri_screen); if (!screen) @@ -1338,6 +1339,13 @@ dri_kms_init_screen(__DRIscreen * sPriv) sPriv-driverPrivate = (void *)screen; pscreen = kms_swrast_create_screen(screen-fd); + + if (drmGetCap(sPriv-fd, DRM_CAP_PRIME, cap) == 0 + (cap DRM_PRIME_CAP_IMPORT)) { + dri2ImageExtension.createImageFromFds = dri2_from_fds; + dri2ImageExtension.createImageFromDmaBufs = dri2_from_dma_bufs; + } + sPriv-extensions = dri_screen_extensions; /* dri_init_screen_helper checks pscreen for us */ diff --git a/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c b/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c index c9934bb..7246ffc 100644 --- a/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c +++ b/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c @@ -38,6 +38,7 @@ #include sys/mman.h #include unistd.h #include dlfcn.h +#include fcntl.h #include xf86drm.h #include pipe/p_compiler.h @@ -210,6 +211,38 @@ kms_sw_displaytarget_map(struct sw_winsys *ws, return kms_sw_dt-mapped; } +static struct kms_sw_displaytarget * +kms_sw_displaytarget_add_from_prime(struct kms_sw_winsys *kms_sw, int fd) +{ + uint32_t handle; + struct kms_sw_displaytarget * kms_sw_dt; + int ret; + + ret = drmPrimeFDToHandle(kms_sw-fd, fd, 
handle); + + if (ret) + return NULL; + + kms_sw_dt = CALLOC_STRUCT(kms_sw_displaytarget); + if (!kms_sw_dt) + return NULL; + + kms_sw_dt-ref_count = 1; + kms_sw_dt-handle = handle; + kms_sw_dt-size = lseek(fd, 0, SEEK_END); + + if (kms_sw_dt-size == (off_t)-1) { + FREE(kms_sw_dt); + return NULL; + } + + lseek(fd, 0, SEEK_SET); + + list_add(kms_sw_dt-link, kms_sw-bo_list); + + return kms_sw_dt; +} + static void kms_sw_displaytarget_unmap(struct sw_winsys *ws, struct sw_displaytarget *dt) @@ -231,17 +264,38 @@ kms_sw_displaytarget_from_handle(struct sw_winsys *ws, struct kms_sw_winsys *kms_sw = kms_sw_winsys(ws); struct kms_sw_displaytarget *kms_sw_dt; - assert(whandle-type == DRM_API_HANDLE_TYPE_KMS); - - LIST_FOR_EACH_ENTRY(kms_sw_dt, kms_sw-bo_list, link) { - if (kms_sw_dt-handle == whandle-handle) { - kms_sw_dt-ref_count++; - - DEBUG(KMS-DEBUG: imported buffer %u (size %u)\n, kms_sw_dt-handle, kms_sw_dt-size); - - *stride = kms_sw_dt-stride; + assert(whandle-type == DRM_API_HANDLE_TYPE_KMS || + whandle-type == DRM_API_HANDLE_TYPE_FD); + + switch(whandle-type) { + case DRM_API_HANDLE_TYPE_FD: + { + kms_sw_dt = kms_sw_displaytarget_add_from_prime(kms_sw, whandle-handle); + if (kms_sw_dt) { +kms_sw_dt-ref_count++; +kms_sw_dt-width = templ-width0; +kms_sw_dt-height = templ-height0; +if (kms_sw_dt-height) + kms_sw_dt-stride = kms_sw_dt-size/kms_sw_dt-height; +*stride = kms_sw_dt-stride; + } return (struct sw_displaytarget *)kms_sw_dt; } + case DRM_API_HANDLE_TYPE_KMS: + { + LIST_FOR_EACH_ENTRY(kms_sw_dt, kms_sw-bo_list, link) { +if (kms_sw_dt-handle == whandle-handle) { + kms_sw_dt-ref_count++; + + DEBUG(KMS-DEBUG: imported buffer %u (size %u)\n, kms_sw_dt-handle, kms_sw_dt-size); + + *stride = kms_sw_dt-stride; + return (struct sw_displaytarget *)kms_sw_dt; +} + } + } + default: + break; Please formatting the switch so that it matches the one below. 
} assert(0); @@ -253,16 +307,27 @@ kms_sw_displaytarget_get_handle(struct sw_winsys *winsys, struct sw_displaytarget *dt, struct winsys_handle *whandle) { + struct kms_sw_winsys *kms_sw = kms_sw_winsys(winsys); struct kms_sw_displaytarget *kms_sw_dt = kms_sw_displaytarget(dt); - if (whandle-type == DRM_API_HANDLE_TYPE_KMS) { + switch(whandle-type) { + case
Re: [Mesa-dev] [PATCH 2/2] mesa/st: add ARB_texture_view support
Didn't look at it that closely, but I'm pretty surprised this really works. One things ARB_texture_view can do is cast cube maps (and cube map arrays) to 2d arrays and vice versa (also 1d/2d to the respective array type), and we cannot express that in sampler views (yet) (we can't express it in surfaces neither but there it should not matter). Which means the type used in the shader for sampling will not match the sampler view, which sounds quite broken to me. Roland Am 20.08.2014 08:45, schrieb Ilia Mirkin: Signed-off-by: Ilia Mirkin imir...@alum.mit.edu --- No piglit regressions on nvc0 except for gl-3.0-render-integer, which appears to now fail even without this commit, despite the fact that I'm fairly sure it used to work fine. Same failure with llvmpipe... It's most likely that I've missed some details. It's unclear whether e.g. glGenerateMipmap should work on a view. However the piglits that exist do all pass on nvc0 and llvmpipe. docs/GL3.txt | 2 +- docs/relnotes/10.3.html | 1 + src/mesa/state_tracker/st_atom_texture.c | 28 +++ src/mesa/state_tracker/st_cb_fbo.c | 10 ++ src/mesa/state_tracker/st_cb_texture.c | 62 +++- src/mesa/state_tracker/st_extensions.c | 1 + src/mesa/state_tracker/st_format.c | 5 +-- src/mesa/state_tracker/st_texture.c | 15 ++-- 8 files changed, 105 insertions(+), 19 deletions(-) diff --git a/docs/GL3.txt b/docs/GL3.txt index 76412c3..5b25865 100644 --- a/docs/GL3.txt +++ b/docs/GL3.txt @@ -166,7 +166,7 @@ GL 4.3, GLSL 4.30: GL_ARB_texture_buffer_range DONE (nv50, nvc0, i965, r600, radeonsi) GL_ARB_texture_query_levels DONE (all drivers that support GLSL 1.30) GL_ARB_texture_storage_multisample DONE (all drivers that support GL_ARB_texture_multisample) - GL_ARB_texture_view DONE (i965) + GL_ARB_texture_view DONE (i965, nv30, nv50, nvc0, r300, r600, radeonsi, llvmpipe, softpipe) GL_ARB_vertex_attrib_binding DONE (all drivers) diff --git a/docs/relnotes/10.3.html b/docs/relnotes/10.3.html index fa4ea23..852aec9 100644 --- 
a/docs/relnotes/10.3.html +++ b/docs/relnotes/10.3.html @@ -63,6 +63,7 @@ Note: some of the new features are only available with certain drivers. liGL_ARB_texture_gather on r600, radeonsi/li liGL_ARB_texture_query_levels on nv50, nvc0, llvmpipe, r600, radeonsi, softpipe/li liGL_ARB_texture_query_lod on r600, radeonsi/li +liGL_ARB_texture_view on nv30, nv50, nvc0, r300, r600, radeonsi, llvmpipe, softpipe/li liGL_ARB_viewport_array on nvc0/li liGL_AMD_vertex_shader_viewport_index on i965/gen7+, r600/li liGL_OES_compressed_ETC1_RGB8_texture on nv30, nv50, nvc0, r300, r600, radeonsi, softpipe, llvmpipe/li diff --git a/src/mesa/state_tracker/st_atom_texture.c b/src/mesa/state_tracker/st_atom_texture.c index 03d0593..8f62494 100644 --- a/src/mesa/state_tracker/st_atom_texture.c +++ b/src/mesa/state_tracker/st_atom_texture.c @@ -192,9 +192,9 @@ get_texture_format_swizzle(const struct st_texture_object *stObj) return swizzle_swizzle(stObj-base._Swizzle, tex_swizzle); } - + /** - * Return TRUE if the texture's sampler view swizzle is equal to + * Return TRUE if the texture's sampler view swizzle is not equal to * the texture's swizzle. 
* * \param stObj the st texture object, @@ -214,9 +214,20 @@ check_sampler_swizzle(const struct st_texture_object *stObj, static unsigned last_level(struct st_texture_object *stObj) { - return MIN2(stObj-base._MaxLevel, stObj-pt-last_level); + unsigned ret = MIN2(stObj-base.MinLevel + stObj-base._MaxLevel, + stObj-pt-last_level); + if (stObj-base.Immutable) + ret = MIN2(ret, stObj-base.MinLevel + stObj-base.NumLevels - 1); + return ret; } +static unsigned last_layer(struct st_texture_object *stObj) +{ + if (stObj-base.Immutable) + return MIN2(stObj-base.MinLayer + stObj-base.NumLayers - 1, + stObj-pt-array_size - 1); + return stObj-pt-array_size - 1; +} static struct pipe_sampler_view * st_create_texture_sampler_view_from_stobj(struct pipe_context *pipe, @@ -249,9 +260,12 @@ st_create_texture_sampler_view_from_stobj(struct pipe_context *pipe, templ.u.buf.first_element = f; templ.u.buf.last_element = f + (n - 1); } else { - templ.u.tex.first_level = stObj-base.BaseLevel; + templ.u.tex.first_level = stObj-base.MinLevel + stObj-base.BaseLevel; templ.u.tex.last_level = last_level(stObj); assert(templ.u.tex.first_level = templ.u.tex.last_level); + templ.u.tex.first_layer = stObj-base.MinLayer; + templ.u.tex.last_layer = last_layer(stObj); +
Re: [Mesa-dev] [RFC PATCH 00/16] A new IR for Mesa
On Wed, Aug 20, 2014 at 7:01 AM, Francisco Jerez curroje...@riseup.net wrote: Connor Abbott cwabbo...@gmail.com writes: On Tue, Aug 19, 2014 at 11:33 PM, Francisco Jerez curroje...@riseup.net wrote: Connor Abbott cwabbo...@gmail.com writes: On Tue, Aug 19, 2014 at 11:40 AM, Francisco Jerez curroje...@riseup.net wrote: Tom Stellard t...@stellard.net writes: On Tue, Aug 19, 2014 at 11:04:59AM -0400, Connor Abbott wrote: On Mon, Aug 18, 2014 at 8:52 PM, Michel Dänzer mic...@daenzer.net wrote: On 19.08.2014 01:28, Connor Abbott wrote: On Mon, Aug 18, 2014 at 4:32 AM, Michel Dänzer mic...@daenzer.net wrote: On 16.08.2014 09:12, Connor Abbott wrote: I know what you might be thinking right now. Wait, *another* IR? Don't we already have like 5 of those, not counting all the driver-specific ones? Isn't this stuff complicated enough already? Well, there are some pretty good reasons to start afresh (again...). In the years we've been using GLSL IR, we've come to realize that, in fact, it's not what we want *at all* to do optimizations on. Did you evaluate using LLVM IR instead of inventing yet another one? -- Earthling Michel Dänzer| http://www.amd.com Libre software enthusiast |Mesa and X developer Yes. See http://lists.freedesktop.org/archives/mesa-dev/2014-February/053502.html and http://lists.freedesktop.org/archives/mesa-dev/2014-February/053522.html I know Ian can't deal with LLVM for some reason. I was wondering if *you* evaluated it, and if so, why you rejected it. 
-- Earthling Michel Dänzer| http://www.amd.com Libre software enthusiast |Mesa and X developer Well, first of all, the fact that Ian and Ken don't want to use it means that any plan to use LLVM for the Intel driver is dead in the water anyways - you can translate NIR into LLVM if you want, but for i965 we want to share optimizations between our 2 backends (FS and vec4) that we can't do today in GLSL IR so this is what we want to use for that, and since nobody else does anything with the core GLSL compiler except when they have to, when we start moving things out of GLSL IR this will probably replace GLSL IR as the infrastructure that all Mesa drivers use. But with that in mind, here are a few reasons why we wouldn't want to use LLVM: * LLVM wasn't built to understand structured CFG's, meaning that you need to re-structurize it using a pass that's fragile and prone to break if some other pass optimizes the shader in a way that makes it non-structured (i.e. not expressible in terms of loops and if statements). This loss of information also means that passes that need to know things like, for example, the loop nesting depth need to do an analysis pass whereas with NIR you can just walk up the control flow tree and count the number of loops we hit. LLVM has a pass to structurize the CFG. We use it in the radeon drivers, and it is run after all of the other LLVM optimizations which have no concept of structured CFG. It's not bug free, but it works really well even with all of the complex OpenCL kernels we throw at it. Your point about losing information when the CFG is de-structurized is valid, but for things like loop depth, I'm not sure why we couldn't write an LLVM analysis pass for this (if one doesn't already exist). I don't think this is such a big deal either. 
At least the structurization pass used on newer AMD hardware isn't fragile in the way you seem to imply -- AFAIK (unlike the old AMDIL heuristic algorithm) it's guaranteed to give you a valid structurized output no matter what the previous optimization passes have done to the CFG, modulo bugs. I admit that the situation is nevertheless suboptimal. Ideally this information wouldn't get lost along the way. For the long term we may want to represent structured control flow directly in the IR as you say, I just don't see how reinventing the IR saves us any work if we could just fix the existing one. It seems to me that something like how we represent control flow is a pretty fundamental part of the IR - it affects any optimization pass that needs to do anything beyond adding and removing instructions. How would you fix that, especially given that LLVM is primarily designed for CPU's where you don't want to be restricted to structured control flow at all? It seems like our goals (preserve the structure) conflict with the way LLVM has been designed. I think we can fix this by introducing new structured variants of the branch instruction in a way that doesn't alter the fundamental structure of the IR. E.g. an if branch could look like: ifbr i1 cond, label iftrue, label iffalse, label join Where both branches are guaranteed to converge at join. Sure, this will require fixing many assumptions, but on the one hand
Re: [Mesa-dev] [PATCH 2/2] mesa/st: add ARB_texture_view support
Hm, it's not tested. And you're right, that would (most likely) mess up, since it would only have the pipe_resource's target. Any suggestions on how to fix it? Should the target be added to pipe_sampler_view? On Wed, Aug 20, 2014 at 11:25 AM, Roland Scheidegger srol...@vmware.com wrote: Didn't look at it that closely, but I'm pretty surprised this really works. One things ARB_texture_view can do is cast cube maps (and cube map arrays) to 2d arrays and vice versa (also 1d/2d to the respective array type), and we cannot express that in sampler views (yet) (we can't express it in surfaces neither but there it should not matter). Which means the type used in the shader for sampling will not match the sampler view, which sounds quite broken to me. Roland Am 20.08.2014 08:45, schrieb Ilia Mirkin: Signed-off-by: Ilia Mirkin imir...@alum.mit.edu --- No piglit regressions on nvc0 except for gl-3.0-render-integer, which appears to now fail even without this commit, despite the fact that I'm fairly sure it used to work fine. Same failure with llvmpipe... It's most likely that I've missed some details. It's unclear whether e.g. glGenerateMipmap should work on a view. However the piglits that exist do all pass on nvc0 and llvmpipe. 
docs/GL3.txt | 2 +- docs/relnotes/10.3.html | 1 + src/mesa/state_tracker/st_atom_texture.c | 28 +++ src/mesa/state_tracker/st_cb_fbo.c | 10 ++ src/mesa/state_tracker/st_cb_texture.c | 62 +++- src/mesa/state_tracker/st_extensions.c | 1 + src/mesa/state_tracker/st_format.c | 5 +-- src/mesa/state_tracker/st_texture.c | 15 ++-- 8 files changed, 105 insertions(+), 19 deletions(-) diff --git a/docs/GL3.txt b/docs/GL3.txt index 76412c3..5b25865 100644 --- a/docs/GL3.txt +++ b/docs/GL3.txt @@ -166,7 +166,7 @@ GL 4.3, GLSL 4.30: GL_ARB_texture_buffer_range DONE (nv50, nvc0, i965, r600, radeonsi) GL_ARB_texture_query_levels DONE (all drivers that support GLSL 1.30) GL_ARB_texture_storage_multisample DONE (all drivers that support GL_ARB_texture_multisample) - GL_ARB_texture_view DONE (i965) + GL_ARB_texture_view DONE (i965, nv30, nv50, nvc0, r300, r600, radeonsi, llvmpipe, softpipe) GL_ARB_vertex_attrib_binding DONE (all drivers) diff --git a/docs/relnotes/10.3.html b/docs/relnotes/10.3.html index fa4ea23..852aec9 100644 --- a/docs/relnotes/10.3.html +++ b/docs/relnotes/10.3.html @@ -63,6 +63,7 @@ Note: some of the new features are only available with certain drivers. 
liGL_ARB_texture_gather on r600, radeonsi/li liGL_ARB_texture_query_levels on nv50, nvc0, llvmpipe, r600, radeonsi, softpipe/li liGL_ARB_texture_query_lod on r600, radeonsi/li +liGL_ARB_texture_view on nv30, nv50, nvc0, r300, r600, radeonsi, llvmpipe, softpipe/li liGL_ARB_viewport_array on nvc0/li liGL_AMD_vertex_shader_viewport_index on i965/gen7+, r600/li liGL_OES_compressed_ETC1_RGB8_texture on nv30, nv50, nvc0, r300, r600, radeonsi, softpipe, llvmpipe/li diff --git a/src/mesa/state_tracker/st_atom_texture.c b/src/mesa/state_tracker/st_atom_texture.c index 03d0593..8f62494 100644 --- a/src/mesa/state_tracker/st_atom_texture.c +++ b/src/mesa/state_tracker/st_atom_texture.c @@ -192,9 +192,9 @@ get_texture_format_swizzle(const struct st_texture_object *stObj) return swizzle_swizzle(stObj-base._Swizzle, tex_swizzle); } - + /** - * Return TRUE if the texture's sampler view swizzle is equal to + * Return TRUE if the texture's sampler view swizzle is not equal to * the texture's swizzle. * * \param stObj the st texture object, @@ -214,9 +214,20 @@ check_sampler_swizzle(const struct st_texture_object *stObj, static unsigned last_level(struct st_texture_object *stObj) { - return MIN2(stObj-base._MaxLevel, stObj-pt-last_level); + unsigned ret = MIN2(stObj-base.MinLevel + stObj-base._MaxLevel, + stObj-pt-last_level); + if (stObj-base.Immutable) + ret = MIN2(ret, stObj-base.MinLevel + stObj-base.NumLevels - 1); + return ret; } +static unsigned last_layer(struct st_texture_object *stObj) +{ + if (stObj-base.Immutable) + return MIN2(stObj-base.MinLayer + stObj-base.NumLayers - 1, + stObj-pt-array_size - 1); + return stObj-pt-array_size - 1; +} static struct pipe_sampler_view * st_create_texture_sampler_view_from_stobj(struct pipe_context *pipe, @@ -249,9 +260,12 @@ st_create_texture_sampler_view_from_stobj(struct pipe_context *pipe, templ.u.buf.first_element = f; templ.u.buf.last_element = f + (n - 1); } else { - templ.u.tex.first_level = stObj-base.BaseLevel; + 
templ.u.tex.first_level = stObj-base.MinLevel +
[Mesa-dev] [PATCH] r600g: Fix flat/smooth shade state toggle
If only the flat/smooth shade state changed between two calls the prior code would miss updating the hardware state.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=81967
Signed-off-by: Glenn Kennard glenn.kenn...@gmail.com
---
Tested on radeon 6670, no piglit regressions

 src/gallium/drivers/r600/evergreen_state.c   | 2 --
 src/gallium/drivers/r600/r600_shader.h       | 2 +-
 src/gallium/drivers/r600/r600_state.c        | 2 --
 src/gallium/drivers/r600/r600_state_common.c | 6 +++---
 4 files changed, 4 insertions(+), 8 deletions(-)

diff --git a/src/gallium/drivers/r600/evergreen_state.c b/src/gallium/drivers/r600/evergreen_state.c
index 841ad0c..b490145 100644
--- a/src/gallium/drivers/r600/evergreen_state.c
+++ b/src/gallium/drivers/r600/evergreen_state.c
@@ -2927,8 +2927,6 @@ void evergreen_update_ps_state(struct pipe_context *ctx, struct r600_pipe_shader
 	shader->ps_depth_export = z_export | stencil_export;
 	shader->sprite_coord_enable = sprite_coord_enable;
-	if (rctx->rasterizer)
-		shader->flatshade = rctx->rasterizer->flatshade;
 }
 
 void evergreen_update_es_state(struct pipe_context *ctx, struct r600_pipe_shader *shader)
diff --git a/src/gallium/drivers/r600/r600_shader.h b/src/gallium/drivers/r600/r600_shader.h
index d6db8f0..8b32966 100644
--- a/src/gallium/drivers/r600/r600_shader.h
+++ b/src/gallium/drivers/r600/r600_shader.h
@@ -89,6 +89,7 @@ struct r600_shader_key {
 	unsigned alpha_to_one:1;
 	unsigned nr_cbufs:4;
 	unsigned vs_as_es:1;
+	unsigned flatshade:1;
 };
 
 struct r600_shader_array {
@@ -106,7 +107,6 @@ struct r600_pipe_shader {
 	struct r600_command_buffer command_buffer; /* register writes */
 	struct r600_resource	*bo;
 	unsigned		sprite_coord_enable;
-	unsigned		flatshade;
 	unsigned		pa_cl_vs_out_cntl;
 	unsigned		nr_ps_color_outputs;
 	struct r600_shader_key	key;
diff --git a/src/gallium/drivers/r600/r600_state.c b/src/gallium/drivers/r600/r600_state.c
index 607b199..3f5cb2b 100644
--- a/src/gallium/drivers/r600/r600_state.c
+++ b/src/gallium/drivers/r600/r600_state.c
@@ -2532,8 +2532,6 @@ void r600_update_ps_state(struct pipe_context *ctx, struct r600_pipe_shader *sha
 	shader->ps_depth_export = z_export | stencil_export;
 	shader->sprite_coord_enable = sprite_coord_enable;
-	if (rctx->rasterizer)
-		shader->flatshade = rctx->rasterizer->flatshade;
 }
 
 void r600_update_vs_state(struct pipe_context *ctx, struct r600_pipe_shader *shader)
diff --git a/src/gallium/drivers/r600/r600_state_common.c b/src/gallium/drivers/r600/r600_state_common.c
index 7594d0e..d8243d1 100644
--- a/src/gallium/drivers/r600/r600_state_common.c
+++ b/src/gallium/drivers/r600/r600_state_common.c
@@ -699,6 +699,8 @@ static INLINE struct r600_shader_key r600_shader_selector_key(struct pipe_contex
 		/* Dual-source blending only makes sense with nr_cbufs == 1. */
 		if (key.nr_cbufs == 1 && rctx->dual_src_blend)
 			key.nr_cbufs = 2;
+		if (rctx->rasterizer->flatshade)
+			key.flatshade = 1;
 	} else if (sel->type == PIPE_SHADER_VERTEX) {
 		key.vs_as_es = (rctx->gs_shader != NULL);
 	}
@@ -1250,9 +1252,7 @@ static bool r600_update_derived_state(struct r600_context *rctx)
 	}
 
 	if (unlikely(!ps_dirty && rctx->ps_shader && rctx->rasterizer &&
-		((rctx->rasterizer->sprite_coord_enable != rctx->ps_shader->current->sprite_coord_enable) ||
-		(rctx->rasterizer->flatshade != rctx->ps_shader->current->flatshade)))) {
-
+		((rctx->rasterizer->sprite_coord_enable != rctx->ps_shader->current->sprite_coord_enable)))) {
 		if (rctx->b.chip_class >= EVERGREEN)
 			evergreen_update_ps_state(ctx, rctx->ps_shader->current);
 		else
-- 
1.9.1
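The idea behind the patch above, as far as it can be read from the diff, is to move the flat-shade bit out of per-shader cached state and into the shader key, so that a change in that bit alone selects a different shader variant. A minimal sketch of that pattern follows. The `shader_key` and `raster_state` structs and both helper functions are simplified, hypothetical stand-ins written for this note; they are not the r600g definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for a driver shader key. */
struct shader_key {
   unsigned nr_cbufs:4;
   unsigned flatshade:1;
};

/* Hypothetical stand-in for the bound rasterizer state. */
struct raster_state {
   bool flatshade;
};

/* Derive the key from the currently bound state. Every piece of state that
 * changes the generated shader must be reflected here. */
static struct shader_key
make_shader_key(const struct raster_state *rs, unsigned nr_cbufs)
{
   struct shader_key key = { 0 };

   key.nr_cbufs = nr_cbufs;
   if (rs != NULL && rs->flatshade)
      key.flatshade = 1;
   return key;
}

/* A cached variant may be reused only if its key matches the key derived
 * from the current state. Fields are compared explicitly so the result does
 * not depend on padding bits. */
static bool
shader_key_equal(const struct shader_key *a, const struct shader_key *b)
{
   return a->nr_cbufs == b->nr_cbufs && a->flatshade == b->flatshade;
}
```

In this sketch, toggling only `flatshade` between two draws produces two unequal keys, which is the case the commit message says the prior code missed.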
Re: [Mesa-dev] [PATCH 2/2] mesa/st: add ARB_texture_view support
On Wed, Aug 20, 2014 at 11:47 AM, Jose Fonseca jfons...@vmware.com wrote: On 20/08/14 16:31, Ilia Mirkin wrote: Hm, it's not tested. And you're right, that would (most likely) mess up, since it would only have the pipe_resource's target. Any suggestions on how to fix it? Should the target be added to pipe_sampler_view? On Wed, Aug 20, 2014 at 11:25 AM, Roland Scheidegger srol...@vmware.com wrote: Didn't look at it that closely, but I'm pretty surprised this really works. One things ARB_texture_view can do is cast cube maps (and cube map arrays) to 2d arrays and vice versa (also 1d/2d to the respective array type), and we cannot express that in sampler views (yet) (we can't express it in surfaces neither but there it should not matter). Which means the type used in the shader for sampling will not match the sampler view, which sounds quite broken to me. Roland Probably the only sane thing to do eliminate the disctinction between PIPE_TEXTURE_FOO and PIPE_TEXTURE_FOO_ARRAY like in http://msdn.microsoft.com/en-us/library/windows/desktop/ff476202.aspx , e.g.,: enum pipe_texture_target { PIPE_BUFFER = 0, PIPE_TEXTURE_1D = 1, PIPE_TEXTURE_2D = 2, PIPE_TEXTURE_3D = 3, PIPE_TEXTURE_CUBE = 4, // Must have same layout as PIPE_TEXTURE_2D PIPE_TEXTURE_RECT = 5, PIPE_TEXTURE_1D_ARRAY = PIPE_TEXTURE_1D, PIPE_TEXTURE_2D_ARRAY = PIPE_TEXTURE_2D, PIPE_TEXTURE_CUBE_ARRAY = PIPE_TEXTURE_CUBE, PIPE_MAX_TEXTURE_TYPES }; We could also remove PIPE_TEXTURE_CUBE and have cube-maps be PIPE_TEXTURE_2D with a flag, but that's probably a lot of work. Instead, drivers that want to be able to support ARB_texture_view will need to ensure PIPE_TEXTURE_CUBE/PIPE_TEXTURE_2D layout match. Another quick + cheap alternative (at least looking at nv50/nvc0 code) would be to pass a separate target parameter to -create_sampler_view(). That would be enough for nouveau, but perhaps not more generally? 
Take a look at nv50_tex.c:nv50_create_texture_view -- it also needs to work out the depth of the texture (presumably to deal with out-of-bounds accesses) and that is written to the texture info structure. Anyways, I guess I'll have to add a PIPE_CAP_TEXTURE_VIEW if the layouts might not be compatible for some drivers? Or is there something that exists that I should restrict it to? -ilia
Re: [Mesa-dev] [PATCH 2/2] mesa/st: add ARB_texture_view support
Am 20.08.2014 17:47, schrieb Jose Fonseca: On 20/08/14 16:31, Ilia Mirkin wrote: Hm, it's not tested. And you're right, that would (most likely) mess up, since it would only have the pipe_resource's target. Any suggestions on how to fix it? Should the target be added to pipe_sampler_view? On Wed, Aug 20, 2014 at 11:25 AM, Roland Scheidegger srol...@vmware.com wrote: Didn't look at it that closely, but I'm pretty surprised this really works. One things ARB_texture_view can do is cast cube maps (and cube map arrays) to 2d arrays and vice versa (also 1d/2d to the respective array type), and we cannot express that in sampler views (yet) (we can't express it in surfaces neither but there it should not matter). Which means the type used in the shader for sampling will not match the sampler view, which sounds quite broken to me. Roland Probably the only sane thing to do eliminate the disctinction between PIPE_TEXTURE_FOO and PIPE_TEXTURE_FOO_ARRAY like in http://msdn.microsoft.com/en-us/library/windows/desktop/ff476202.aspx , e.g.,: enum pipe_texture_target { PIPE_BUFFER = 0, PIPE_TEXTURE_1D = 1, PIPE_TEXTURE_2D = 2, PIPE_TEXTURE_3D = 3, PIPE_TEXTURE_CUBE = 4, // Must have same layout as PIPE_TEXTURE_2D PIPE_TEXTURE_RECT = 5, PIPE_TEXTURE_1D_ARRAY = PIPE_TEXTURE_1D, PIPE_TEXTURE_2D_ARRAY = PIPE_TEXTURE_2D, PIPE_TEXTURE_CUBE_ARRAY = PIPE_TEXTURE_CUBE, I think you meant PIPE_TXTURE_2D there? PIPE_MAX_TEXTURE_TYPES }; We could also remove PIPE_TEXTURE_CUBE and have cube-maps be PIPE_TEXTURE_2D with a flag, but that's probably a lot of work. Instead, drivers that want to be able to support ARB_texture_view will need to ensure PIPE_TEXTURE_CUBE/PIPE_TEXTURE_2D layout match. Yes, but that's not what I'm talking about. 
Even d3d10, which does not have any distinct cube/array texture layouts (actually 10 still had a separate one for cubes, because there was hw which really required a different layout afaik, but it got abandoned in 10.1), still requires shader resource views to have them (and they must match what's declared in the shader): http://msdn.microsoft.com/en-us/library/windows/desktop/ff476211%28v=vs.85%29.aspx

So, my guess is we should do the same: just have that type in the sampler view. Drivers wishing to support the extension must take the type from the view, and not from the underlying resource. They could presumably get it from the shader itself if they really wanted; this is actually what we do for texture size queries in llvmpipe, but it's more of a necessary hack. You are right, though, that we would not really require distinct types at the resource level, but they don't really get in the way either.

Roland
Re: [Mesa-dev] Mesa: tag mesa-10.2.6: Mesa 10.2.6 release
On 08/19/2014 04:18 PM, Carl Worth wrote:
> Module: Mesa
> Branch: refs/tags/mesa-10.2.6
> Tag: 1d329590143b4236e8c706b80b6551502f5cb780
> URL: http://cgit.freedesktop.org/mesa/mesa/tag/?id=1d329590143b4236e8c706b80b6551502f5cb780
>
> Tagger: Carl Worth cwo...@cworth.org
> Date: Tue Aug 19 15:17:13 2014 -0700
>
> Mesa 10.2.6 release
> ___ mesa-commit mailing list mesa-com...@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-commit

Unfortunately, it looks like 31ce84a81f7166ded07e9cb41e5dfe212dd8fed1 was not included. If people complain, we may need a 10.2.7 release before two weeks.

-Brian
Re: [Mesa-dev] [PATCH 2/2] mesa/st: add ARB_texture_view support
On 20/08/14 17:02, Roland Scheidegger wrote: Am 20.08.2014 17:47, schrieb Jose Fonseca: On 20/08/14 16:31, Ilia Mirkin wrote: Hm, it's not tested. And you're right, that would (most likely) mess up, since it would only have the pipe_resource's target. Any suggestions on how to fix it? Should the target be added to pipe_sampler_view? On Wed, Aug 20, 2014 at 11:25 AM, Roland Scheidegger srol...@vmware.com wrote: Didn't look at it that closely, but I'm pretty surprised this really works. One things ARB_texture_view can do is cast cube maps (and cube map arrays) to 2d arrays and vice versa (also 1d/2d to the respective array type), and we cannot express that in sampler views (yet) (we can't express it in surfaces neither but there it should not matter). Which means the type used in the shader for sampling will not match the sampler view, which sounds quite broken to me. Roland Probably the only sane thing to do eliminate the disctinction between PIPE_TEXTURE_FOO and PIPE_TEXTURE_FOO_ARRAY like in http://msdn.microsoft.com/en-us/library/windows/desktop/ff476202.aspx , e.g.,: enum pipe_texture_target { PIPE_BUFFER = 0, PIPE_TEXTURE_1D = 1, PIPE_TEXTURE_2D = 2, PIPE_TEXTURE_3D = 3, PIPE_TEXTURE_CUBE = 4, // Must have same layout as PIPE_TEXTURE_2D PIPE_TEXTURE_RECT = 5, PIPE_TEXTURE_1D_ARRAY = PIPE_TEXTURE_1D, PIPE_TEXTURE_2D_ARRAY = PIPE_TEXTURE_2D, PIPE_TEXTURE_CUBE_ARRAY = PIPE_TEXTURE_CUBE, I think you meant PIPE_TXTURE_2D there? No, I expclitely left PIPE_TEXTURE_CUBE due to the reasons I explained below. PIPE_MAX_TEXTURE_TYPES }; We could also remove PIPE_TEXTURE_CUBE and have cube-maps be PIPE_TEXTURE_2D with a flag, but that's probably a lot of work. Instead, drivers that want to be able to support ARB_texture_view will need to ensure PIPE_TEXTURE_CUBE/PIPE_TEXTURE_2D layout match. Yes, but that's not what I'm talking about. Even d3d10, which does not have any distinct cube/array texture layouts Precisely. 
D3D10 uses D3D10_RESOURCE_DIMENSION_TEXTURE2D for cubes plus the D3D10_RESOURCE_MISC_TEXTURECUBE misc flag. Which is precisely what I was talking about when I said "We could also remove PIPE_TEXTURE_CUBE and have cube-maps be PIPE_TEXTURE_2D with a flag".

> (actually 10 still had a separate one for cubes, because there was hw which really required a different layout afaik, but it got abandoned in 10.1),

No, D3D10 doesn't have D3D10_RESOURCE_DIMENSION_TEXTURECUBE. D3D 10, 10.1, and 11 all use RESOURCE_DIMENSION_TEXTURE2D + RESOURCE_MISC_TEXTURECUBE for cubemaps or cubemap arrays.

> still requires shader resource views to have them (and they must match what's declared in the shader): http://msdn.microsoft.com/en-us/library/windows/desktop/ff476211%28v=vs.85%29.aspx

Right, there are different enums for resource types and view types.

> So, my guess is we should do the same - just have that type in the sampler view (and drivers wishing to support the extension must take the type from the view, and not the underlying resource - or they could get it from the shader itself, presumably, if they really wanted, this is actually what we do for texture size queries in llvmpipe, but it's more of a necessary hack). You are right though we would not really require distinct types at the resource level, but they don't really get in the way neither.

Yes, we could do the same. But I do think that in that case we should have a separate enum for views, different from pipe_texture_target. And pipe_texture_target would be slimmed down. But if you remove PIPE_TEXTURE_CUBE from pipe_texture_target you'll need to pass that info in a new flag (like D3D10_RESOURCE_MISC_TEXTURECUBE). I don't feel strongly, but I'm not sure this is much more elegant than keeping PIPE_TEXTURE_CUBE around.

Jose
Re: [Mesa-dev] [PATCH 2/2] mesa/st: add ARB_texture_view support
Am 20.08.2014 17:55, schrieb Ilia Mirkin: On Wed, Aug 20, 2014 at 11:47 AM, Jose Fonseca jfons...@vmware.com wrote: On 20/08/14 16:31, Ilia Mirkin wrote: Hm, it's not tested. And you're right, that would (most likely) mess up, since it would only have the pipe_resource's target. Any suggestions on how to fix it? Should the target be added to pipe_sampler_view? On Wed, Aug 20, 2014 at 11:25 AM, Roland Scheidegger srol...@vmware.com wrote: Didn't look at it that closely, but I'm pretty surprised this really works. One things ARB_texture_view can do is cast cube maps (and cube map arrays) to 2d arrays and vice versa (also 1d/2d to the respective array type), and we cannot express that in sampler views (yet) (we can't express it in surfaces neither but there it should not matter). Which means the type used in the shader for sampling will not match the sampler view, which sounds quite broken to me. Roland Probably the only sane thing to do eliminate the disctinction between PIPE_TEXTURE_FOO and PIPE_TEXTURE_FOO_ARRAY like in https://urldefense.proofpoint.com/v1/url?u=http://msdn.microsoft.com/en-us/library/windows/desktop/ff476202.aspxk=oIvRg1%2BdGAgOoM1BIlLLqw%3D%3D%0Ar=F4msKE2WxRzA%2BwN%2B25muztFm5TSPwE8HKJfWfR2NgfY%3D%0Am=x03OmuVWAQgfFbsFB2SLMLwSYavxkU8Zsypu9lEIkpg%3D%0As=4b0ca75d91e6d53d92658d9e334cf2a73a01efe5667464c969f5c085409052ff , e.g.,: enum pipe_texture_target { PIPE_BUFFER = 0, PIPE_TEXTURE_1D = 1, PIPE_TEXTURE_2D = 2, PIPE_TEXTURE_3D = 3, PIPE_TEXTURE_CUBE = 4, // Must have same layout as PIPE_TEXTURE_2D PIPE_TEXTURE_RECT = 5, PIPE_TEXTURE_1D_ARRAY = PIPE_TEXTURE_1D, PIPE_TEXTURE_2D_ARRAY = PIPE_TEXTURE_2D, PIPE_TEXTURE_CUBE_ARRAY = PIPE_TEXTURE_CUBE, PIPE_MAX_TEXTURE_TYPES }; We could also remove PIPE_TEXTURE_CUBE and have cube-maps be PIPE_TEXTURE_2D with a flag, but that's probably a lot of work. Instead, drivers that want to be able to support ARB_texture_view will need to ensure PIPE_TEXTURE_CUBE/PIPE_TEXTURE_2D layout match. 
> Another quick + cheap alternative (at least looking at nv50/nvc0 code) would be to pass a separate target parameter to ->create_sampler_view(). That would be enough for nouveau, but perhaps not more generally? Take a look at nv50_tex.c:nv50_create_texture_view -- it also needs to work out the depth of the texture (presumably to deal with out-of-bounds accesses) and that is written to the texture info structure.

Well, that should be enough, but I don't think it fits our design. We've encapsulated other override information like the format in the view already, and I see no reason why the target cast should be treated any differently.

> Anyways, I guess I'll have to add a PIPE_CAP_TEXTURE_VIEW if the layouts might not be compatible for some drivers? Or is there something that exists that I should restrict it to?

I suspect d3d9-class hw couldn't do it (can r300 access a cube map as a regular 2d texture when sampling?). Usually it's probably the same hw which also does not support array textures, but it can be different (IIRC i965 was one such chipset which really had a different layout for cube maps and arrays in particular, though it would not apply to anything that's supported by ilo).

Roland
Re: [Mesa-dev] [PATCH 2/2] mesa/st: add ARB_texture_view support
On 20/08/14 17:14, Roland Scheidegger wrote: Am 20.08.2014 17:55, schrieb Ilia Mirkin: On Wed, Aug 20, 2014 at 11:47 AM, Jose Fonseca jfons...@vmware.com wrote: On 20/08/14 16:31, Ilia Mirkin wrote: Hm, it's not tested. And you're right, that would (most likely) mess up, since it would only have the pipe_resource's target. Any suggestions on how to fix it? Should the target be added to pipe_sampler_view? On Wed, Aug 20, 2014 at 11:25 AM, Roland Scheidegger srol...@vmware.com wrote: Didn't look at it that closely, but I'm pretty surprised this really works. One things ARB_texture_view can do is cast cube maps (and cube map arrays) to 2d arrays and vice versa (also 1d/2d to the respective array type), and we cannot express that in sampler views (yet) (we can't express it in surfaces neither but there it should not matter). Which means the type used in the shader for sampling will not match the sampler view, which sounds quite broken to me. Roland Probably the only sane thing to do eliminate the disctinction between PIPE_TEXTURE_FOO and PIPE_TEXTURE_FOO_ARRAY like in https://urldefense.proofpoint.com/v1/url?u=http://msdn.microsoft.com/en-us/library/windows/desktop/ff476202.aspxk=oIvRg1%2BdGAgOoM1BIlLLqw%3D%3D%0Ar=F4msKE2WxRzA%2BwN%2B25muztFm5TSPwE8HKJfWfR2NgfY%3D%0Am=x03OmuVWAQgfFbsFB2SLMLwSYavxkU8Zsypu9lEIkpg%3D%0As=4b0ca75d91e6d53d92658d9e334cf2a73a01efe5667464c969f5c085409052ff , e.g.,: enum pipe_texture_target { PIPE_BUFFER = 0, PIPE_TEXTURE_1D = 1, PIPE_TEXTURE_2D = 2, PIPE_TEXTURE_3D = 3, PIPE_TEXTURE_CUBE = 4, // Must have same layout as PIPE_TEXTURE_2D PIPE_TEXTURE_RECT = 5, PIPE_TEXTURE_1D_ARRAY = PIPE_TEXTURE_1D, PIPE_TEXTURE_2D_ARRAY = PIPE_TEXTURE_2D, PIPE_TEXTURE_CUBE_ARRAY = PIPE_TEXTURE_CUBE, PIPE_MAX_TEXTURE_TYPES }; We could also remove PIPE_TEXTURE_CUBE and have cube-maps be PIPE_TEXTURE_2D with a flag, but that's probably a lot of work. 
Instead, drivers that want to be able to support ARB_texture_view will need to ensure PIPE_TEXTURE_CUBE/PIPE_TEXTURE_2D layout match. Another quick + cheap alternative (at least looking at nv50/nvc0 code) would be to pass a separate target parameter to -create_sampler_view(). That would be enough for nouveau, but perhaps not more generally? Take a look at nv50_tex.c:nv50_create_texture_view -- it also needs to work out the depth of the texture (presumably to deal with out-of-bounds accesses) and that is written to the texture info structure. Well that should be enough, but I don't think it fits out design. We've encapsulated other override information like the format in the view already, and I see no reason why the target cast should be treated any different. In other words, you're arguing for: diff --git a/src/gallium/include/pipe/p_state.h b/src/gallium/include/pipe/p_state.h index a82686b..c87ac4e 100644 --- a/src/gallium/include/pipe/p_state.h +++ b/src/gallium/include/pipe/p_state.h @@ -333,6 +333,7 @@ struct pipe_surface struct pipe_reference reference; struct pipe_resource *texture; /** resource into which this is a view */ struct pipe_context *context; /** context this surface belongs to */ + enum pipe_texture target; enum pipe_format format; /* XXX width/height should be removed */ It's a fair point. And I don't object that solution. Of course, for this to work, drivers will need to treat the _ARRAY and non _ARRAY targets the same when determining the texture layout for this to work. I just felt this would be a good oportunity to slim down pipe_texture_target too. I'm not sure the _ARRAY distinction still matters at this level, but I suppose it doesn't hurt. Jose ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] mesa/st: add ARB_texture_view support
On Wed, Aug 20, 2014 at 12:22 PM, Jose Fonseca jfons...@vmware.com wrote: On 20/08/14 17:14, Roland Scheidegger wrote: Am 20.08.2014 17:55, schrieb Ilia Mirkin: On Wed, Aug 20, 2014 at 11:47 AM, Jose Fonseca jfons...@vmware.com wrote: On 20/08/14 16:31, Ilia Mirkin wrote: Hm, it's not tested. And you're right, that would (most likely) mess up, since it would only have the pipe_resource's target. Any suggestions on how to fix it? Should the target be added to pipe_sampler_view? On Wed, Aug 20, 2014 at 11:25 AM, Roland Scheidegger srol...@vmware.com wrote: Didn't look at it that closely, but I'm pretty surprised this really works. One things ARB_texture_view can do is cast cube maps (and cube map arrays) to 2d arrays and vice versa (also 1d/2d to the respective array type), and we cannot express that in sampler views (yet) (we can't express it in surfaces neither but there it should not matter). Which means the type used in the shader for sampling will not match the sampler view, which sounds quite broken to me. Roland Probably the only sane thing to do eliminate the disctinction between PIPE_TEXTURE_FOO and PIPE_TEXTURE_FOO_ARRAY like in https://urldefense.proofpoint.com/v1/url?u=http://msdn.microsoft.com/en-us/library/windows/desktop/ff476202.aspxk=oIvRg1%2BdGAgOoM1BIlLLqw%3D%3D%0Ar=F4msKE2WxRzA%2BwN%2B25muztFm5TSPwE8HKJfWfR2NgfY%3D%0Am=x03OmuVWAQgfFbsFB2SLMLwSYavxkU8Zsypu9lEIkpg%3D%0As=4b0ca75d91e6d53d92658d9e334cf2a73a01efe5667464c969f5c085409052ff , e.g.,: enum pipe_texture_target { PIPE_BUFFER = 0, PIPE_TEXTURE_1D = 1, PIPE_TEXTURE_2D = 2, PIPE_TEXTURE_3D = 3, PIPE_TEXTURE_CUBE = 4, // Must have same layout as PIPE_TEXTURE_2D PIPE_TEXTURE_RECT = 5, PIPE_TEXTURE_1D_ARRAY = PIPE_TEXTURE_1D, PIPE_TEXTURE_2D_ARRAY = PIPE_TEXTURE_2D, PIPE_TEXTURE_CUBE_ARRAY = PIPE_TEXTURE_CUBE, PIPE_MAX_TEXTURE_TYPES }; We could also remove PIPE_TEXTURE_CUBE and have cube-maps be PIPE_TEXTURE_2D with a flag, but that's probably a lot of work. 
Instead, drivers that want to be able to support ARB_texture_view will need to ensure PIPE_TEXTURE_CUBE/PIPE_TEXTURE_2D layout match. Another quick + cheap alternative (at least looking at nv50/nvc0 code) would be to pass a separate target parameter to -create_sampler_view(). That would be enough for nouveau, but perhaps not more generally? Take a look at nv50_tex.c:nv50_create_texture_view -- it also needs to work out the depth of the texture (presumably to deal with out-of-bounds accesses) and that is written to the texture info structure. Well that should be enough, but I don't think it fits out design. We've encapsulated other override information like the format in the view already, and I see no reason why the target cast should be treated any different. In other words, you're arguing for: diff --git a/src/gallium/include/pipe/p_state.h b/src/gallium/include/pipe/p_state.h index a82686b..c87ac4e 100644 --- a/src/gallium/include/pipe/p_state.h +++ b/src/gallium/include/pipe/p_state.h @@ -333,6 +333,7 @@ struct pipe_surface On struct pipe_sampler_view, I thought... unless I'm misunderstanding. This was also my first thought about fixing this after Roland pointed out the issue. struct pipe_reference reference; struct pipe_resource *texture; /** resource into which this is a view */ struct pipe_context *context; /** context this surface belongs to */ + enum pipe_texture target; enum pipe_format format; /* XXX width/height should be removed */ It's a fair point. And I don't object that solution. Of course, for this to work, drivers will need to treat the _ARRAY and non _ARRAY targets the same when determining the texture layout for this to work. I just felt this would be a good oportunity to slim down pipe_texture_target too. I'm not sure the _ARRAY distinction still matters at this level, but I suppose it doesn't hurt. Such a cleanup would probably have to be done by someone with a better understanding of gallium than me. 
OTOH, if you guys feel that doing it the sampler_view way will accrue too much technical debt, that's fine too. Unless I hear otherwise, I'm going to try to do it the pipe_sampler_view way tonight. -ilia
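For readers following the thread, the pipe_sampler_view approach discussed above can be sketched roughly as follows. This is only an illustration under stated assumptions: every `sketch_` name is a hypothetical stand-in, not a real gallium struct or function, and no attempt is made to validate which target casts ARB_texture_view actually permits. The point is just that the view carries its own target, and a driver reads the target from the view rather than from the underlying resource.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for enum pipe_texture_target (values are illustrative). */
enum sketch_texture_target {
   SKETCH_BUFFER,
   SKETCH_TEXTURE_1D,
   SKETCH_TEXTURE_2D,
   SKETCH_TEXTURE_3D,
   SKETCH_TEXTURE_CUBE,
   SKETCH_TEXTURE_RECT,
   SKETCH_TEXTURE_1D_ARRAY,
   SKETCH_TEXTURE_2D_ARRAY,
   SKETCH_TEXTURE_CUBE_ARRAY
};

/* Hypothetical stand-in for struct pipe_resource. */
struct sketch_resource {
   enum sketch_texture_target target; /* target the storage was created with */
   unsigned array_size;               /* number of layers (6 for a single cube) */
};

/* Hypothetical stand-in for struct pipe_sampler_view with the proposed field. */
struct sketch_sampler_view {
   struct sketch_resource *texture;   /* resource into which this is a view */
   enum sketch_texture_target target; /* target the shader will sample with */
   unsigned first_layer;
   unsigned last_layer;
};

/* Build a view; the caller states how the shader will sample the resource. */
static struct sketch_sampler_view
sketch_create_view(struct sketch_resource *res,
                   enum sketch_texture_target view_target,
                   unsigned first_layer, unsigned last_layer)
{
   struct sketch_sampler_view view;

   assert(res != NULL);
   assert(first_layer <= last_layer && last_layer < res->array_size);

   view.texture = res;
   view.target = view_target;
   view.first_layer = first_layer;
   view.last_layer = last_layer;
   return view;
}

/* What a driver would consult when programming the sampler:
 * the view's target, never the resource's. */
static enum sketch_texture_target
sketch_sampling_target(const struct sketch_sampler_view *view)
{
   return view->target;
}
```

With this shape, a cube map resource can be sampled as a 2D array by creating a view with `SKETCH_TEXTURE_2D_ARRAY` over layers 0 to 5, while the resource itself keeps reporting `SKETCH_TEXTURE_CUBE`.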
Re: [Mesa-dev] [PATCH 2/2] mesa/st: add ARB_texture_view support
Am 20.08.2014 18:12, schrieb Jose Fonseca: On 20/08/14 17:02, Roland Scheidegger wrote: Am 20.08.2014 17:47, schrieb Jose Fonseca: On 20/08/14 16:31, Ilia Mirkin wrote: Hm, it's not tested. And you're right, that would (most likely) mess up, since it would only have the pipe_resource's target. Any suggestions on how to fix it? Should the target be added to pipe_sampler_view? On Wed, Aug 20, 2014 at 11:25 AM, Roland Scheidegger srol...@vmware.com wrote: Didn't look at it that closely, but I'm pretty surprised this really works. One things ARB_texture_view can do is cast cube maps (and cube map arrays) to 2d arrays and vice versa (also 1d/2d to the respective array type), and we cannot express that in sampler views (yet) (we can't express it in surfaces neither but there it should not matter). Which means the type used in the shader for sampling will not match the sampler view, which sounds quite broken to me. Roland Probably the only sane thing to do eliminate the disctinction between PIPE_TEXTURE_FOO and PIPE_TEXTURE_FOO_ARRAY like in http://msdn.microsoft.com/en-us/library/windows/desktop/ff476202.aspx , e.g.,: enum pipe_texture_target { PIPE_BUFFER = 0, PIPE_TEXTURE_1D = 1, PIPE_TEXTURE_2D = 2, PIPE_TEXTURE_3D = 3, PIPE_TEXTURE_CUBE = 4, // Must have same layout as PIPE_TEXTURE_2D PIPE_TEXTURE_RECT = 5, PIPE_TEXTURE_1D_ARRAY = PIPE_TEXTURE_1D, PIPE_TEXTURE_2D_ARRAY = PIPE_TEXTURE_2D, PIPE_TEXTURE_CUBE_ARRAY = PIPE_TEXTURE_CUBE, I think you meant PIPE_TXTURE_2D there? No, I expclitely left PIPE_TEXTURE_CUBE due to the reasons I explained below. PIPE_MAX_TEXTURE_TYPES }; We could also remove PIPE_TEXTURE_CUBE and have cube-maps be PIPE_TEXTURE_2D with a flag, but that's probably a lot of work. Instead, drivers that want to be able to support ARB_texture_view will need to ensure PIPE_TEXTURE_CUBE/PIPE_TEXTURE_2D layout match. Yes, but that's not what I'm talking about. Even d3d10, which does not have any distinct cube/array texture layouts Precisely. 
D3D10 uses D3D10_RESOURCE_DIMENSION_TEXTURE2D for cubes plus the D3D10_RESOURCE_MISC_TEXTURECUBE misc flag. Which is precisely what I was talking about when I said We could also remove PIPE_TEXTURE_CUBE and have cube-maps be PIPE_TEXTURE_2D with a flag. (actually 10 still had a separate one for cubes, because there was hw which really required a different layout afaik, but it got abandoned in 10.1), No D3D10 doesn't have D3D10_RESOURCE_DIMENSION_TEXTURECUBE. D3D 10, 10.1, and 11, they all use RESOURCE_DIMENSION_TEXTURE2D + RESOURCE_MISC_TEXTURECUBE for cubemaps or cubemaps arrays. You are right that d3d10 did not rally have a cube layout - they required the misc_texturecube flag as you said (this is really the same thing to me if you have a 2d + flag or explicit cube). But this flag is more or less dead and buried with newer d3d versions, it is there for compatibility reasons (with api level 10_0 you still need it for cube maps, since it didn't allow the target casting from cube to 2d array or vice versa later in the resource view). It was never used for cube map arrays. still requires shader resource views to have them (and they must match what's declared in the shader): http://msdn.microsoft.com/en-us/library/windows/desktop/ff476211%28v=vs.85%29.aspx Right, there are different enums for resource types and view types. So, my guess is we should do the same - just have that type in the sampler view (and drivers wishing to support the extension must take the type from the view, and not the underlying resource - or they could get it from the shader itself, presumably, if they really wanted, this is actually what we do for texture size queries in llvmpipe, but it's more of a necessary hack). You are right though we would not really require distinct types at the resource level, but they don't really get in the way neither. Yes, we could do the same. But I do think that in that case we should have a separate enum for views different from pipe_texture_target. 
> And pipe_texture_target would be slimmed down.

Yes, that would make sense. I'm not sure it's worth the trouble of changing the code though.

> But if you remove PIPE_TEXTURE_CUBE from pipe_texture_target you'll need to pass that info in a new flag (like D3D10_RESOURCE_MISC_TEXTURECUBE). I don't feel strongly, but I'm not sure this is much more elegant than keeping PIPE_TEXTURE_CUBE around.

Ok, understood: PIPE_TEXTURE_CUBE would have to stay (as there's hw which needs to know the distinction from an array). So I guess we could do:

1) slim the pipe_texture_target enum down and use a different pipe_view_target (or whatever) in the sampler view which has all values, and allow all required casts (those involving cubes are probably the only ones which really need to be restricted to drivers supporting ARB_texture_view). It requires
Re: [Mesa-dev] [PATCHv3 11/16] mesa: add infrastructure for threaded shader compilation
On Wednesday 20 August 2014, Chia-I Wu wrote:
> Add _mesa_enable_glsl_threadpool to enable the thread pool for a context, and add ctx->Const.DeferCompileShader and ctx->Const.DeferLinkProgram to fine-control what gets threaded.
>
> Setting DeferCompileShader to true will make _mesa_glsl_compile_shader be executed in a worker thread. The function is thread-safe so there is no restriction on DeferCompileShader.
>
> Setting DeferLinkProgram to true will make _mesa_glsl_link_shader be executed in a worker thread. The function is thread-safe only when certain driver functions (as documented in struct gl_constants) are thread-safe. It is drivers' responsibility to fix those driver functions before setting DeferLinkProgram.
>
> When DeferLinkProgram is set, drivers are not supposed to inspect the context in their LinkShader callbacks. Instead, NotifyLinkShader is added. Drivers should inspect the context in NotifyLinkShader and save what they need for LinkShader in gl_shader_program.
>
> As a final note, most applications will not benefit from threaded shader compilation because they check GL_COMPILE_STATUS/GL_LINK_STATUS immediately, giving the worker threads no time to do their jobs. A possible improvement is to split LinkShader into two parts: the first part links and error checks while the second part optimizes and generates the machine code. With the split, we can always defer the second part to the thread pool.

It looks like _mesa_create_shader_program() needs a bit of work since it also checks the compile status immediately after compiling the shader.

Fredrik
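The point about status queries can be illustrated with a small sketch. It is not the patch's code: it assumes POSIX threads, uses one thread per shader instead of a thread pool, and all `sketch_` names and the trivial "compiler" are hypothetical. It only shows why a caller that asks for the status right after compiling ends up waiting for the worker anyway.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical shader object; not Mesa's struct gl_shader. */
struct sketch_shader {
   pthread_t worker;
   bool deferred;        /* true while a worker thread owns the compile */
   bool compile_status;  /* valid only once the worker has been joined */
   const char *source;
};

/* Stand-in for the real compiler: "succeeds" on any non-empty source. */
static void *sketch_compile_job(void *data)
{
   struct sketch_shader *sh = data;
   sh->compile_status = sh->source != NULL && sh->source[0] != '\0';
   return NULL;
}

/* Like a deferred compile call: hands the work to a thread and returns. */
static void sketch_compile_shader(struct sketch_shader *sh, const char *source)
{
   assert(sh != NULL);
   sh->source = source;
   sh->compile_status = false;
   sh->deferred = pthread_create(&sh->worker, NULL, sketch_compile_job, sh) == 0;
   if (!sh->deferred)
      sketch_compile_job(sh); /* could not start a thread: compile synchronously */
}

/* Like querying GL_COMPILE_STATUS: the answer requires a finished compile,
 * so this blocks until the worker is done. */
static bool sketch_get_compile_status(struct sketch_shader *sh)
{
   if (sh->deferred) {
      pthread_join(sh->worker, NULL);
      sh->deferred = false;
   }
   return sh->compile_status;
}
```

An application that calls `sketch_compile_shader()` on several shaders first and only then queries each status lets the compiles overlap. One that queries right after each compile call serializes them, which is the behaviour described above for `_mesa_create_shader_program()`.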
Re: [Mesa-dev] [PATCH 2/2] mesa/st: add ARB_texture_view support
On 20/08/14 17:33, Ilia Mirkin wrote: On Wed, Aug 20, 2014 at 12:22 PM, Jose Fonseca jfons...@vmware.com wrote: On 20/08/14 17:14, Roland Scheidegger wrote: Am 20.08.2014 17:55, schrieb Ilia Mirkin: On Wed, Aug 20, 2014 at 11:47 AM, Jose Fonseca jfons...@vmware.com wrote: On 20/08/14 16:31, Ilia Mirkin wrote: Hm, it's not tested. And you're right, that would (most likely) mess up, since it would only have the pipe_resource's target. Any suggestions on how to fix it? Should the target be added to pipe_sampler_view? On Wed, Aug 20, 2014 at 11:25 AM, Roland Scheidegger srol...@vmware.com wrote: Didn't look at it that closely, but I'm pretty surprised this really works. One things ARB_texture_view can do is cast cube maps (and cube map arrays) to 2d arrays and vice versa (also 1d/2d to the respective array type), and we cannot express that in sampler views (yet) (we can't express it in surfaces neither but there it should not matter). Which means the type used in the shader for sampling will not match the sampler view, which sounds quite broken to me. Roland Probably the only sane thing to do eliminate the disctinction between PIPE_TEXTURE_FOO and PIPE_TEXTURE_FOO_ARRAY like in https://urldefense.proofpoint.com/v1/url?u=http://msdn.microsoft.com/en-us/library/windows/desktop/ff476202.aspxk=oIvRg1%2BdGAgOoM1BIlLLqw%3D%3D%0Ar=F4msKE2WxRzA%2BwN%2B25muztFm5TSPwE8HKJfWfR2NgfY%3D%0Am=x03OmuVWAQgfFbsFB2SLMLwSYavxkU8Zsypu9lEIkpg%3D%0As=4b0ca75d91e6d53d92658d9e334cf2a73a01efe5667464c969f5c085409052ff , e.g.,: enum pipe_texture_target { PIPE_BUFFER = 0, PIPE_TEXTURE_1D = 1, PIPE_TEXTURE_2D = 2, PIPE_TEXTURE_3D = 3, PIPE_TEXTURE_CUBE = 4, // Must have same layout as PIPE_TEXTURE_2D PIPE_TEXTURE_RECT = 5, PIPE_TEXTURE_1D_ARRAY = PIPE_TEXTURE_1D, PIPE_TEXTURE_2D_ARRAY = PIPE_TEXTURE_2D, PIPE_TEXTURE_CUBE_ARRAY = PIPE_TEXTURE_CUBE, PIPE_MAX_TEXTURE_TYPES }; We could also remove PIPE_TEXTURE_CUBE and have cube-maps be PIPE_TEXTURE_2D with a flag, but that's probably a lot of work. 
Instead, drivers that want to be able to support ARB_texture_view will need to ensure PIPE_TEXTURE_CUBE/PIPE_TEXTURE_2D layout match. Another quick + cheap alternative (at least looking at nv50/nvc0 code) would be to pass a separate target parameter to -create_sampler_view(). That would be enough for nouveau, but perhaps not more generally? Take a look at nv50_tex.c:nv50_create_texture_view -- it also needs to work out the depth of the texture (presumably to deal with out-of-bounds accesses) and that is written to the texture info structure. Well that should be enough, but I don't think it fits out design. We've encapsulated other override information like the format in the view already, and I see no reason why the target cast should be treated any different. In other words, you're arguing for: diff --git a/src/gallium/include/pipe/p_state.h b/src/gallium/include/pipe/p_state.h index a82686b..c87ac4e 100644 --- a/src/gallium/include/pipe/p_state.h +++ b/src/gallium/include/pipe/p_state.h @@ -333,6 +333,7 @@ struct pipe_surface On struct pipe_sampler_view, I thought... unless I'm misunderstanding. Yep. My mistake. This was also my first thought about fixing this after Roland pointed out the issue. struct pipe_reference reference; struct pipe_resource *texture; /** resource into which this is a view */ struct pipe_context *context; /** context this surface belongs to */ + enum pipe_texture target; enum pipe_format format; /* XXX width/height should be removed */ It's a fair point. And I don't object that solution. Of course, for this to work, drivers will need to treat the _ARRAY and non _ARRAY targets the same when determining the texture layout for this to work. I just felt this would be a good oportunity to slim down pipe_texture_target too. I'm not sure the _ARRAY distinction still matters at this level, but I suppose it doesn't hurt. Such a cleanup would probably have to be done by someone with a better understanding of gallium than me. 
OTOH, if you guys feel that doing it the sampler_view way will accrue too much technical debt, that's fine too. Unless I hear otherwise, I'm going to try to do it the pipe_sampler_view way tonight. -ilia
Re: [Mesa-dev] [PATCH 2/2] mesa/st: add ARB_texture_view support
Am 20.08.2014 18:33, schrieb Ilia Mirkin: On Wed, Aug 20, 2014 at 12:22 PM, Jose Fonseca jfons...@vmware.com wrote: On 20/08/14 17:14, Roland Scheidegger wrote: Am 20.08.2014 17:55, schrieb Ilia Mirkin: On Wed, Aug 20, 2014 at 11:47 AM, Jose Fonseca jfons...@vmware.com wrote: On 20/08/14 16:31, Ilia Mirkin wrote: Hm, it's not tested. And you're right, that would (most likely) mess up, since it would only have the pipe_resource's target. Any suggestions on how to fix it? Should the target be added to pipe_sampler_view? On Wed, Aug 20, 2014 at 11:25 AM, Roland Scheidegger srol...@vmware.com wrote: Didn't look at it that closely, but I'm pretty surprised this really works. One things ARB_texture_view can do is cast cube maps (and cube map arrays) to 2d arrays and vice versa (also 1d/2d to the respective array type), and we cannot express that in sampler views (yet) (we can't express it in surfaces neither but there it should not matter). Which means the type used in the shader for sampling will not match the sampler view, which sounds quite broken to me. 
Roland

Probably the only sane thing to do is eliminate the distinction between PIPE_TEXTURE_FOO and PIPE_TEXTURE_FOO_ARRAY like in http://msdn.microsoft.com/en-us/library/windows/desktop/ff476202.aspx , e.g.:

enum pipe_texture_target {
   PIPE_BUFFER = 0,
   PIPE_TEXTURE_1D = 1,
   PIPE_TEXTURE_2D = 2,
   PIPE_TEXTURE_3D = 3,
   PIPE_TEXTURE_CUBE = 4, // Must have same layout as PIPE_TEXTURE_2D
   PIPE_TEXTURE_RECT = 5,
   PIPE_TEXTURE_1D_ARRAY = PIPE_TEXTURE_1D,
   PIPE_TEXTURE_2D_ARRAY = PIPE_TEXTURE_2D,
   PIPE_TEXTURE_CUBE_ARRAY = PIPE_TEXTURE_CUBE,
   PIPE_MAX_TEXTURE_TYPES
};

We could also remove PIPE_TEXTURE_CUBE and have cube-maps be PIPE_TEXTURE_2D with a flag, but that's probably a lot of work. Instead, drivers that want to be able to support ARB_texture_view will need to ensure the PIPE_TEXTURE_CUBE/PIPE_TEXTURE_2D layouts match.

Another quick + cheap alternative (at least looking at nv50/nvc0 code) would be to pass a separate target parameter to ->create_sampler_view(). That would be enough for nouveau, but perhaps not more generally? Take a look at nv50_tex.c:nv50_create_texture_view -- it also needs to work out the depth of the texture (presumably to deal with out-of-bounds accesses) and that is written to the texture info structure.

Well that should be enough, but I don't think it fits our design. We've encapsulated other override information like the format in the view already, and I see no reason why the target cast should be treated any differently.
In other words, you're arguing for: diff --git a/src/gallium/include/pipe/p_state.h b/src/gallium/include/pipe/p_state.h index a82686b..c87ac4e 100644 --- a/src/gallium/include/pipe/p_state.h +++ b/src/gallium/include/pipe/p_state.h @@ -333,6 +333,7 @@ struct pipe_surface On struct pipe_sampler_view, I thought... unless I'm misunderstanding. This was also my first thought about fixing this after Roland pointed out the issue. Yes definitely for pipe_sampler_view - d3d10 also has it on the render target / depth stencil views, though so far I'm not convinced there's any value in that (the addressing of cube maps / arrays, 1d / 1d arrays is entirely the same in all cases, what matters is really the first and last layer only). struct pipe_reference reference; struct pipe_resource *texture; /** resource into which this is a view */ struct pipe_context *context; /** context this surface belongs to */ + enum pipe_texture target; Make it pipe_texture_target target ;-) enum pipe_format format; /* XXX width/height should be removed */ It's a fair point. And I don't object to that solution. Of course, for this to work, drivers will need to treat the _ARRAY and non _ARRAY targets the same when determining the texture layout. I just felt this would be a good opportunity to slim down pipe_texture_target too. I'm not sure the _ARRAY distinction still matters at this level, but I suppose it doesn't hurt. Such a cleanup would probably have to be done by someone with a better understanding of gallium than me. OTOH if you guys feel like doing it the sampler_view way will accrue too much technical debt, that's fine too. Unless I hear otherwise, I'm going to try to do it the pipe_sampler_view way tonight. Yes I think it would be a nice cleanup to split it up into two enums. I was mostly proposing just reusing the same enum and keeping pipe_texture_target the same because it would require less code change. But maybe that could come back to haunt us later, I agree it would
Re: [Mesa-dev] [RFC PATCH 00/16] A new IR for Mesa
On Wed, Aug 20, 2014 at 12:11 PM, Francisco Jerez curroje...@riseup.net wrote: Connor Abbott cwabbo...@gmail.com writes: On Wed, Aug 20, 2014 at 7:01 AM, Francisco Jerez curroje...@riseup.net wrote: Connor Abbott cwabbo...@gmail.com writes: On Tue, Aug 19, 2014 at 11:33 PM, Francisco Jerez curroje...@riseup.net wrote: Connor Abbott cwabbo...@gmail.com writes: On Tue, Aug 19, 2014 at 11:40 AM, Francisco Jerez curroje...@riseup.net wrote: Tom Stellard t...@stellard.net writes: On Tue, Aug 19, 2014 at 11:04:59AM -0400, Connor Abbott wrote: On Mon, Aug 18, 2014 at 8:52 PM, Michel Dänzer mic...@daenzer.net wrote: On 19.08.2014 01:28, Connor Abbott wrote: On Mon, Aug 18, 2014 at 4:32 AM, Michel Dänzer mic...@daenzer.net wrote: On 16.08.2014 09:12, Connor Abbott wrote: I know what you might be thinking right now. Wait, *another* IR? Don't we already have like 5 of those, not counting all the driver-specific ones? Isn't this stuff complicated enough already? Well, there are some pretty good reasons to start afresh (again...). In the years we've been using GLSL IR, we've come to realize that, in fact, it's not what we want *at all* to do optimizations on. Did you evaluate using LLVM IR instead of inventing yet another one? -- Earthling Michel Dänzer| http://www.amd.com Libre software enthusiast |Mesa and X developer Yes. See http://lists.freedesktop.org/archives/mesa-dev/2014-February/053502.html and http://lists.freedesktop.org/archives/mesa-dev/2014-February/053522.html I know Ian can't deal with LLVM for some reason. I was wondering if *you* evaluated it, and if so, why you rejected it. 
-- Earthling Michel Dänzer| http://www.amd.com Libre software enthusiast |Mesa and X developer Well, first of all, the fact that Ian and Ken don't want to use it means that any plan to use LLVM for the Intel driver is dead in the water anyways - you can translate NIR into LLVM if you want, but for i965 we want to share optimizations between our 2 backends (FS and vec4) that we can't do today in GLSL IR so this is what we want to use for that, and since nobody else does anything with the core GLSL compiler except when they have to, when we start moving things out of GLSL IR this will probably replace GLSL IR as the infrastructure that all Mesa drivers use. But with that in mind, here are a few reasons why we wouldn't want to use LLVM: * LLVM wasn't built to understand structured CFG's, meaning that you need to re-structurize it using a pass that's fragile and prone to break if some other pass optimizes the shader in a way that makes it non-structured (i.e. not expressible in terms of loops and if statements). This loss of information also means that passes that need to know things like, for example, the loop nesting depth need to do an analysis pass whereas with NIR you can just walk up the control flow tree and count the number of loops we hit. LLVM has a pass to structurize the CFG. We use it in the radeon drivers, and it is run after all of the other LLVM optimizations which have no concept of structured CFG. It's not bug free, but it works really well even with all of the complex OpenCL kernels we throw at it. Your point about losing information when the CFG is de-structurized is valid, but for things like loop depth, I'm not sure why we couldn't write an LLVM analysis pass for this (if one doesn't already exist). I don't think this is such a big deal either. 
At least the structurization pass used on newer AMD hardware isn't fragile in the way you seem to imply -- AFAIK (unlike the old AMDIL heuristic algorithm) it's guaranteed to give you a valid structurized output no matter what the previous optimization passes have done to the CFG, modulo bugs. I admit that the situation is nevertheless suboptimal. Ideally this information wouldn't get lost along the way. For the long term we may want to represent structured control flow directly in the IR as you say, I just don't see how reinventing the IR saves us any work if we could just fix the existing one. It seems to me that something like how we represent control flow is a pretty fundamental part of the IR - it affects any optimization pass that needs to do anything beyond adding and removing instructions. How would you fix that, especially given that LLVM is primarily designed for CPU's where you don't want to be restricted to structured control flow at all? It seems like our goals (preserve the structure) conflict with the way LLVM has been designed. I think we can fix this by introducing new structured variants of the branch instruction in a way that doesn't alter the fundamental structure of the IR. E.g. an if branch could look like: ifbr i1 cond, label iftrue, label iffalse, label join
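The point quoted in this message about loop nesting depth ("just walk up the control flow tree and count the number of loops we hit") can be illustrated with a small sketch. The node types below are hypothetical and are not the NIR or LLVM data structures; they only assume that every control-flow node keeps a pointer to its parent.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical structured control-flow tree; NOT the actual NIR types. */
enum sketch_cf_type {
   SKETCH_CF_FUNCTION,
   SKETCH_CF_BLOCK,
   SKETCH_CF_IF,
   SKETCH_CF_LOOP
};

struct sketch_cf_node {
   enum sketch_cf_type type;
   const struct sketch_cf_node *parent;   /* NULL for the root */
};

/* Count how many loops enclose 'node' by walking the parent chain.
 * No separate analysis pass is needed because the structure is kept
 * in the IR itself. A loop node is not counted as enclosing itself. */
static unsigned
sketch_loop_depth(const struct sketch_cf_node *node)
{
   unsigned depth = 0;

   assert(node != NULL);
   for (node = node->parent; node != NULL; node = node->parent) {
      if (node->type == SKETCH_CF_LOOP)
         depth++;
   }
   return depth;
}
```

On an unstructured CFG the same number would have to be recovered by a loop analysis (dominators and back edges), which is the extra work this part of the thread is weighing.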
Re: [Mesa-dev] [RFC PATCH 00/16] A new IR for Mesa
On Wednesday, August 20, 2014 06:41:08 PM Michel Dänzer wrote: On 20.08.2014 00:04, Connor Abbott wrote: On Mon, Aug 18, 2014 at 8:52 PM, Michel Dänzer mic...@daenzer.net wrote: On 19.08.2014 01:28, Connor Abbott wrote: On Mon, Aug 18, 2014 at 4:32 AM, Michel Dänzer mic...@daenzer.net wrote: On 16.08.2014 09:12, Connor Abbott wrote: I know what you might be thinking right now. Wait, *another* IR? Don't we already have like 5 of those, not counting all the driver-specific ones? Isn't this stuff complicated enough already? Well, there are some pretty good reasons to start afresh (again...). In the years we've been using GLSL IR, we've come to realize that, in fact, it's not what we want *at all* to do optimizations on. Did you evaluate using LLVM IR instead of inventing yet another one? Yes. See http://lists.freedesktop.org/archives/mesa-dev/2014-February/053502.html and http://lists.freedesktop.org/archives/mesa-dev/2014-February/053522.html I know Ian can't deal with LLVM for some reason. I was wondering if *you* evaluated it, and if so, why you rejected it. First of all, thank you for sharing more specific information than 'table-flipping rage'. * LLVM is on a different release schedule (6 months vs. 3 months), has a different review process, etc., which means that to add support for new functionality that involves shaders, we now have to submit patches to two separate projects, and then 2 months later when we ship Mesa it turns out that nobody can actually use the new feature because it depends upon an unreleased version of LLVM that won't be released for another 3 months and then packaged by distros even later... This has indeed been frustrating at times, but it's better now for backend changes since Tom has been making LLVM point releases. Yeah - absolutely. As for the GLSL frontend, I agree with Tom that it shouldn't require that much direct interaction with the LLVM project. 
we've already had problems where distros refused to ship newer Mesa releases because radeon depended on a version of LLVM newer than the one they were shipping, [...] That's news to me, can you be more specific? That sounds like basically a distro issue though, since different LLVM versions can be installed in parallel (and the one used by default doesn't have to be the newest one). And it even works if another part of the same process uses a different version of LLVM. Yes, one can argue that it's a distribution issue - but it's an extremely painful problem for distributions. For example, Debian was stuck on Mesa 9.2.2 for 4 months (2013-12-08 to 2014-03-22), and I was told this was because of LLVM versioning changes in the other drivers (primarily radeon, I believe, but probably also llvmpipe). Mesa 9.2.2 hung the GPU every 5-10 minutes on Sandybridge, and we fixed that in Mesa 9.2.3. But we couldn't get people to actually ship it, and had to field tons of bug reports from upset users for several months. Gentoo has also had trouble updating for similar reasons; Matt (the Gentoo Mesa package mantainer) can probably comment more. I've also heard stories from friends of mine who use radeonsi that they couldn't get new GL features or compiler fixes unless they upgrade both Mesa /and/ LLVM, and that LLVM was usually either not released or not available in their distribution for a few months. Those are the sorts of things I'd like to avoid. The compiler is easily the most crucial part of a modern graphics stack; splitting it out into a separate repository and project seems like a nightmare for people who care about getting new drivers released and shipped in distributions in a timely fashion. Or, looking at it the other way: today, everything you need as an Intel or (AFAIK) Nouveau 3D user is nicely contained within Mesa. Our community has complete control over when we do those releases. New important bug fixes, performance improvements, or features? 
Ship a new Mesa, and you're done. That's a really nice feature I'd hate to lose. --Ken
Re: [Mesa-dev] [PATCH] r600g: Fix flat/smooth shade state toggle
Generally, only states which need a full shader compilation must be in the shader key. Flatshade is not one of them, because it only causes register updates, so this is not a proper solution. Or am I missing something?

Marek

On Wed, Aug 20, 2014 at 5:34 PM, Glenn Kennard glenn.kenn...@gmail.com wrote:
If only the flat/smooth shade state changed between two calls the prior code would miss updating the hardware state.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=81967
Signed-off-by: Glenn Kennard glenn.kenn...@gmail.com
---
Tested on radeon 6670, no piglit regressions

 src/gallium/drivers/r600/evergreen_state.c   | 2 --
 src/gallium/drivers/r600/r600_shader.h       | 2 +-
 src/gallium/drivers/r600/r600_state.c        | 2 --
 src/gallium/drivers/r600/r600_state_common.c | 6 +++---
 4 files changed, 4 insertions(+), 8 deletions(-)

diff --git a/src/gallium/drivers/r600/evergreen_state.c b/src/gallium/drivers/r600/evergreen_state.c
index 841ad0c..b490145 100644
--- a/src/gallium/drivers/r600/evergreen_state.c
+++ b/src/gallium/drivers/r600/evergreen_state.c
@@ -2927,8 +2927,6 @@ void evergreen_update_ps_state(struct pipe_context *ctx, struct r600_pipe_shader

 	shader->ps_depth_export = z_export | stencil_export;
 	shader->sprite_coord_enable = sprite_coord_enable;
-	if (rctx->rasterizer)
-		shader->flatshade = rctx->rasterizer->flatshade;
 }

 void evergreen_update_es_state(struct pipe_context *ctx, struct r600_pipe_shader *shader)
diff --git a/src/gallium/drivers/r600/r600_shader.h b/src/gallium/drivers/r600/r600_shader.h
index d6db8f0..8b32966 100644
--- a/src/gallium/drivers/r600/r600_shader.h
+++ b/src/gallium/drivers/r600/r600_shader.h
@@ -89,6 +89,7 @@ struct r600_shader_key {
 	unsigned alpha_to_one:1;
 	unsigned nr_cbufs:4;
 	unsigned vs_as_es:1;
+	unsigned flatshade:1;
 };

 struct r600_shader_array {
@@ -106,7 +107,6 @@ struct r600_pipe_shader {
 	struct r600_command_buffer	command_buffer; /* register writes */
 	struct r600_resource	*bo;
 	unsigned		sprite_coord_enable;
-	unsigned		flatshade;
 	unsigned		pa_cl_vs_out_cntl;
 	unsigned		nr_ps_color_outputs;
 	struct r600_shader_key	key;
diff --git a/src/gallium/drivers/r600/r600_state.c b/src/gallium/drivers/r600/r600_state.c
index 607b199..3f5cb2b 100644
--- a/src/gallium/drivers/r600/r600_state.c
+++ b/src/gallium/drivers/r600/r600_state.c
@@ -2532,8 +2532,6 @@ void r600_update_ps_state(struct pipe_context *ctx, struct r600_pipe_shader *sha

 	shader->ps_depth_export = z_export | stencil_export;
 	shader->sprite_coord_enable = sprite_coord_enable;
-	if (rctx->rasterizer)
-		shader->flatshade = rctx->rasterizer->flatshade;
 }

 void r600_update_vs_state(struct pipe_context *ctx, struct r600_pipe_shader *shader)
diff --git a/src/gallium/drivers/r600/r600_state_common.c b/src/gallium/drivers/r600/r600_state_common.c
index 7594d0e..d8243d1 100644
--- a/src/gallium/drivers/r600/r600_state_common.c
+++ b/src/gallium/drivers/r600/r600_state_common.c
@@ -699,6 +699,8 @@ static INLINE struct r600_shader_key r600_shader_selector_key(struct pipe_contex
 		/* Dual-source blending only makes sense with nr_cbufs == 1. */
 		if (key.nr_cbufs == 1 && rctx->dual_src_blend)
 			key.nr_cbufs = 2;
+		if (rctx->rasterizer->flatshade)
+			key.flatshade = 1;
 	} else if (sel->type == PIPE_SHADER_VERTEX) {
 		key.vs_as_es = (rctx->gs_shader != NULL);
 	}
@@ -1250,9 +1252,7 @@ static bool r600_update_derived_state(struct r600_context *rctx)
 	}

 	if (unlikely(!ps_dirty && rctx->ps_shader && rctx->rasterizer &&
-		((rctx->rasterizer->sprite_coord_enable != rctx->ps_shader->current->sprite_coord_enable) ||
-		(rctx->rasterizer->flatshade != rctx->ps_shader->current->flatshade)))) {
-
+		((rctx->rasterizer->sprite_coord_enable != rctx->ps_shader->current->sprite_coord_enable)))) {
 		if (rctx->b.chip_class >= EVERGREEN)
 			evergreen_update_ps_state(ctx, rctx->ps_shader->current);
 		else
--
1.9.1
Re: [Mesa-dev] [RFC PATCH 00/16] A new IR for Mesa
On 20.08.2014 20:13, Kenneth Graunke wrote: On Wednesday, August 20, 2014 06:41:08 PM Michel Dänzer wrote: On 20.08.2014 00:04, Connor Abbott wrote: On Mon, Aug 18, 2014 at 8:52 PM, Michel Dänzer mic...@daenzer.net wrote: On 19.08.2014 01:28, Connor Abbott wrote: On Mon, Aug 18, 2014 at 4:32 AM, Michel Dänzer mic...@daenzer.net wrote: On 16.08.2014 09:12, Connor Abbott wrote: I know what you might be thinking right now. Wait, *another* IR? Don't we already have like 5 of those, not counting all the driver-specific ones? Isn't this stuff complicated enough already? Well, there are some pretty good reasons to start afresh (again...). In the years we've been using GLSL IR, we've come to realize that, in fact, it's not what we want *at all* to do optimizations on. Did you evaluate using LLVM IR instead of inventing yet another one? Yes. See http://lists.freedesktop.org/archives/mesa-dev/2014-February/053502.html and http://lists.freedesktop.org/archives/mesa-dev/2014-February/053522.html I know Ian can't deal with LLVM for some reason. I was wondering if *you* evaluated it, and if so, why you rejected it. First of all, thank you for sharing more specific information than 'table-flipping rage'. * LLVM is on a different release schedule (6 months vs.
3 months), has a different review process, etc., which means that to add support for new functionality that involves shaders, we now have to submit patches to two separate projects, and then 2 months later when we ship Mesa it turns out that nobody can actually use the new feature because it depends upon an unreleased version of LLVM that won't be released for another 3 months and then packaged by distros even later... This has indeed been frustrating at times, but it's better now for backend changes since Tom has been making LLVM point releases. Yeah - absolutely. As for the GLSL frontend, I agree with Tom that it shouldn't require that much direct interaction with the LLVM project. we've already had problems where distros refused to ship newer Mesa releases because radeon depended on a version of LLVM newer than the one they were shipping, [...] That's news to me, can you be more specific? That sounds like basically a distro issue though, since different LLVM versions can be installed in parallel (and the one used by default doesn't have to be the newest one). And it even works if another part of the same process uses a different version of LLVM. Yes, one can argue that it's a distribution issue - but it's an extremely painful problem for distributions. For example, Debian was stuck on Mesa 9.2.2 for 4 months (2013-12-08 to 2014-03-22), and I was told this was because of LLVM versioning changes in the other drivers (primarily radeon, I believe, but probably also llvmpipe). llvmpipe generally runs on pretty old llvm versions, though I didn't check the specifics here... Mesa 9.2.2 hung the GPU every 5-10 minutes on Sandybridge, and we fixed that in Mesa 9.2.3. But we couldn't get people to actually ship it, and had to field tons of bug reports from upset users for several months. I think this also begs the question if changes requiring new external libraries to compile really should be in a point release. 
Gentoo has also had trouble updating for similar reasons; Matt (the Gentoo Mesa package maintainer) can probably comment more. I've also heard stories from friends of mine who use radeonsi that they couldn't get new GL features or compiler fixes unless they upgrade both Mesa /and/ LLVM, and that LLVM was usually either not released or not available in their distribution for a few months. Those are the sorts of things I'd like to avoid. The compiler is easily the most crucial part of a modern graphics stack; splitting it out into a separate repository and project seems like a nightmare for people who care about getting new drivers released and shipped in distributions in a timely fashion. Or, looking at it the other way: today, everything you need as an Intel or (AFAIK) Nouveau 3D user is nicely contained within Mesa. Our community has complete control over when we do those releases. New important bug fixes, performance improvements, or features? Ship a new Mesa, and you're done. That's a really nice feature I'd hate to lose. --Ken Couldn't build scripts download and use an appropriate llvm version automatically if the one installed isn't
Re: [Mesa-dev] [RFC PATCH 00/16] A new IR for Mesa
On Wed, Aug 20, 2014 at 11:28 AM, Roland Scheidegger srol...@vmware.com wrote: On 20.08.2014 20:13, Kenneth Graunke wrote: For example, Debian was stuck on Mesa 9.2.2 for 4 months (2013-12-08 to 2014-03-22), and I was told this was because of LLVM versioning changes in the other drivers (primarily radeon, I believe, but probably also llvmpipe). llvmpipe generally runs on pretty old llvm versions, though I didn't check the specifics here... There are also 49 instances of 'HAVE_LLVM [<>=]' to manage that :) Couldn't build scripts download and use an appropriate llvm version automatically if the one installed isn't sufficient? Though maybe the idea is crazy; I usually try to avoid dealing with such problems ;-). I don't know the specifics of what you're suggesting, but I don't think I need to, to say that that's disgusting.
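As background for the 'HAVE_LLVM' guards mentioned above: the gallivm patches at the top of this digest compare HAVE_LLVM against 0x0306 for LLVM 3.6. The sketch below shows how such a guard works, assuming the major version sits in the high byte and the minor version in the low byte. The `SKETCH_` names are invented for illustration and are not Mesa macros.

```c
#include <assert.h>

/* Hypothetical encoding helper: 3.6 -> 0x0306. The layout is an assumption
 * for this sketch, chosen to match the 0x0306 constant seen in the patches. */
#define SKETCH_LLVM_VERSION(major, minor) (((major) << 8) | (minor))

/* Stand-in for a value normally supplied by the build system. */
#ifndef SKETCH_HAVE_LLVM
#define SKETCH_HAVE_LLVM SKETCH_LLVM_VERSION(3, 6)
#endif

/* Returns 1 when the newer code path (an EngineBuilder that takes
 * ownership of the module) would be compiled in, 0 otherwise. */
static int
sketch_builder_takes_ownership(void)
{
#if SKETCH_HAVE_LLVM >= 0x0306
   return 1;
#else
   return 0;
#endif
}
```

Each such guard is a compile-time fork, so every supported LLVM version adds another path that has to be built and tested, which is part of the maintenance cost being debated here.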
Re: [Mesa-dev] [PATCH] r600g: Fix flat/smooth shade state toggle
On Wed, 20 Aug 2014 20:16:50 +0200, Marek Olšák mar...@gmail.com wrote: Generally, only states which need a full shader compilation must be in the shader key. Flatshade is not one of them, because it only causes register updates, so this is not a proper solution. Or am I missing something? Marek Evergreen/Cayman need to recompile the shader since the interpolation is done using either the INTERP_XY instruction for smooth or INTERP_LOAD_P0 for flat. R600-R700 technically don't need to, but the prior code already does anyway since flat/smooth register setup is done from output values computed when compiling the shader. /Glenn
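The general "shader key" idea being argued about can be sketched as follows. This is a simplified illustration with hypothetical `sketch_` types, not the r600g selector code. It only shows why adding a bit to the key forces a separate compiled variant whenever that bit changes, and it takes no side on whether flatshade belongs in the key.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical key: only state that changes the generated code goes here. */
struct sketch_shader_key {
   unsigned nr_cbufs;
   unsigned flatshade;
};

struct sketch_variant {
   struct sketch_shader_key key;
   int compiled_id;   /* stands in for the compiled shader binary */
};

#define SKETCH_MAX_VARIANTS 8

struct sketch_selector {
   struct sketch_variant variants[SKETCH_MAX_VARIANTS];
   unsigned num_variants;
   int compile_count;
};

static int
sketch_keys_equal(const struct sketch_shader_key *a,
                  const struct sketch_shader_key *b)
{
   return a->nr_cbufs == b->nr_cbufs && a->flatshade == b->flatshade;
}

/* Return the variant matching 'key', "compiling" a new one on a miss. */
static const struct sketch_variant *
sketch_select(struct sketch_selector *sel, const struct sketch_shader_key *key)
{
   struct sketch_variant *v;
   unsigned i;

   for (i = 0; i < sel->num_variants; i++) {
      if (sketch_keys_equal(&sel->variants[i].key, key))
         return &sel->variants[i];
   }

   assert(sel->num_variants < SKETCH_MAX_VARIANTS);
   v = &sel->variants[sel->num_variants++];
   v->key = *key;
   v->compiled_id = ++sel->compile_count;   /* pretend to compile */
   return v;
}
```

State that only affects register programming can be updated without touching such a key, whereas state that changes the emitted instructions needs a distinct variant.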
Re: [Mesa-dev] [RFC PATCH 00/16] A new IR for Mesa
On Wed, Aug 20, 2014 at 11:13 AM, Kenneth Graunke kenn...@whitecape.org wrote: Gentoo has also had trouble updating for similar reasons; Matt (the Gentoo Mesa package maintainer) can probably comment more. Yes, at one point we were stuck two releases behind current Mesa (and this is Gentoo!) because we couldn't get the appropriate version of LLVM stabilized because a number of reverse dependencies didn't work with the new LLVM version. Having multiple versions installed in parallel breaks down pretty easily. Where do the headers go? Where do all the executables go? Do you version all of them and install one for each version? Do other distros allow multiple versions of LLVM to be installed in parallel? How do they manage? I've also heard stories from friends of mine who use radeonsi that they couldn't get new GL features or compiler fixes unless they upgrade both Mesa /and/ LLVM, and that LLVM was usually either not released or not available in their distribution for a few months. I get the sense that this is a problem that a backend in LLVM would cause, but maybe not so if we just used LLVM IR for the GLSL compiler. I think the C API is suitable for this kind of thing as well. Tom?
Re: [Mesa-dev] [RFC PATCH 00/16] A new IR for Mesa
On Wed, Aug 20, 2014 at 11:13:13AM -0700, Kenneth Graunke wrote: On Wednesday, August 20, 2014 06:41:08 PM Michel Dänzer wrote: On 20.08.2014 00:04, Connor Abbott wrote: On Mon, Aug 18, 2014 at 8:52 PM, Michel Dänzer mic...@daenzer.net wrote: On 19.08.2014 01:28, Connor Abbott wrote: On Mon, Aug 18, 2014 at 4:32 AM, Michel Dänzer mic...@daenzer.net wrote: On 16.08.2014 09:12, Connor Abbott wrote: I know what you might be thinking right now. Wait, *another* IR? Don't we already have like 5 of those, not counting all the driver-specific ones? Isn't this stuff complicated enough already? Well, there are some pretty good reasons to start afresh (again...). In the years we've been using GLSL IR, we've come to realize that, in fact, it's not what we want *at all* to do optimizations on. Did you evaluate using LLVM IR instead of inventing yet another one? Yes. See http://lists.freedesktop.org/archives/mesa-dev/2014-February/053502.html and http://lists.freedesktop.org/archives/mesa-dev/2014-February/053522.html I know Ian can't deal with LLVM for some reason. I was wondering if *you* evaluated it, and if so, why you rejected it. First of all, thank you for sharing more specific information than 'table-flipping rage'. * LLVM is on a different release schedule (6 months vs. 3 months), has a different review process, etc., which means that to add support for new functionality that involves shaders, we now have to submit patches to two separate projects, and then 2 months later when we ship Mesa it turns out that nobody can actually use the new feature because it depends upon an unreleased version of LLVM that won't be released for another 3 months and then packaged by distros even later... This has indeed been frustrating at times, but it's better now for backend changes since Tom has been making LLVM point releases. Yeah - absolutely. As for the GLSL frontend, I agree with Tom that it shouldn't require that much direct interaction with the LLVM project. 
we've already had problems where distros refused to ship newer Mesa releases because radeon depended on a version of LLVM newer than the one they were shipping, [...] That's news to me, can you be more specific? That sounds like basically a distro issue though, since different LLVM versions can be installed in parallel (and the one used by default doesn't have to be the newest one). And it even works if another part of the same process uses a different version of LLVM. Yes, one can argue that it's a distribution issue - but it's an extremely painful problem for distributions. For example, Debian was stuck on Mesa 9.2.2 for 4 months (2013-12-08 to 2014-03-22), and I was told this was because of LLVM versioning changes in the other drivers (primarily radeon, I believe, but probably also llvmpipe). Mesa 9.2.2 hung the GPU every 5-10 minutes on Sandybridge, and we fixed that in Mesa 9.2.3. But we couldn't get people to actually ship it, and had to field tons of bug reports from upset users for several months. Gentoo has also had trouble updating for similar reasons; Matt (the Gentoo Mesa package mantainer) can probably comment more. I've also heard stories from friends of mine who use radeonsi that they couldn't get new GL features or compiler fixes unless they upgrade both Mesa /and/ LLVM, and that LLVM was usually either not released or not available in their distribution for a few months. Those are the sorts of things I'd like to avoid. The compiler is easily the most crucial part of a modern graphics stack; splitting it out into a separate repository and project seems like a nightmare for people who care about getting new drivers released and shipped in distributions in a timely fashion. Or, looking at it the other way: today, everything you need as an Intel or (AFAIK) Nouveau 3D user is nicely contained within Mesa. Our community has complete control over when we do those releases. New important bug fixes, performance improvements, or features? 
Ship a new Mesa, and you're done. That's a really nice feature I'd hate to lose.

It has been a challenge to match versions of LLVM and Mesa for radeonsi, but as Michel mentioned this has been made easier now that we are doing LLVM point releases. However, as I mentioned before, if we were using LLVM IR as a common IR it is unlikely that there would be any new features in Mesa that would depend on changes in LLVM. The only things we would need to modify LLVM for would be:

- Extending the C API
- Bug fixes for optimization passes
- Optimization pass improvements

And remember all these changes would be for improving common code that is shared between drivers. All of the important compiler features would still go into the driver specific backends, which for most drivers are a part of Mesa. Even for
Re: [Mesa-dev] [PATCHv3 01/16] util: add _mesa_strtod and _mesa_strtof
On Wednesday, August 20, 2014 02:40:22 PM Chia-I Wu wrote:
Both core mesa and glsl have their own wrappers for strtof_l. Merge and move them to util/. They are compiled with a C++ compiler so that we can make them thread-safe in a following commit.

Signed-off-by: Chia-I Wu o...@lunarg.com
---
 src/glsl/Makefile.sources        | 3 +-
 src/glsl/glsl_lexer.ll           | 12 +++---
 src/glsl/s_expression.cpp        | 2 +-
 src/glsl/s_expression.h          | 2 +-
 src/glsl/strtod.c                | 79 ---
 src/glsl/strtod.h                | 46 ---
 src/mesa/main/imports.c          | 19 --
 src/mesa/main/imports.h          | 3 --
 src/mesa/program/program_lexer.l | 1 +
 src/util/Makefile.sources        | 3 +-
 src/util/strtod.cpp              | 81
 src/util/strtod.h                | 46 +++
 12 files changed, 139 insertions(+), 158 deletions(-)
 delete mode 100644 src/glsl/strtod.c
 delete mode 100644 src/glsl/strtod.h
 create mode 100644 src/util/strtod.cpp
 create mode 100644 src/util/strtod.h

Patches 1-4 are: Reviewed-by: Kenneth Graunke kenn...@whitecape.org
Re: [Mesa-dev] [PATCH] r600g: Fix flat/smooth shade state toggle
The flag is only used to set S_028644_FLAT_SHADE on all r600g chips. I don't see it being used by the shader code generation. Marek On Wed, Aug 20, 2014 at 8:50 PM, Glenn Kennard glenn.kenn...@gmail.com wrote: On Wed, 20 Aug 2014 20:16:50 +0200, Marek Olšák mar...@gmail.com wrote: Generally, only states which need a full shader compilation must be in the shader key. Flatshade is not one of them, because it only causes register updates, so this is not a proper solution. Or am I missing something? Marek Evergreen/Cayman need to recompile the shader since the interpolation is done using either the INTERP_XY instruction for smooth or INTERP_LOAD_P0 for flat. R600-R700 technically don't need to, but the prior code already does anyway since flat/smooth register setup is done from output values computed when compiling the shader. /Glenn
Re: [Mesa-dev] [RFC PATCH 00/16] A new IR for Mesa
On Tue, Aug 19, 2014 at 05:19:15PM -0700, Connor Abbott wrote:
> On Tue, Aug 19, 2014 at 3:57 PM, Tom Stellard t...@stellard.net wrote:
>> On Tue, Aug 19, 2014 at 01:37:56PM -0700, Connor Abbott wrote:
>>> On Tue, Aug 19, 2014 at 11:40 AM, Francisco Jerez curroje...@riseup.net wrote:
>>>> Tom Stellard t...@stellard.net writes:
>>>>> On Tue, Aug 19, 2014 at 11:04:59AM -0400, Connor Abbott wrote:
>>>>>> On Mon, Aug 18, 2014 at 8:52 PM, Michel Dänzer mic...@daenzer.net wrote:
>>>>>>> On 19.08.2014 01:28, Connor Abbott wrote:
>>>>>>>> On Mon, Aug 18, 2014 at 4:32 AM, Michel Dänzer mic...@daenzer.net wrote:
>>>>>>>>> On 16.08.2014 09:12, Connor Abbott wrote:
>>>>>>>>>> I know what you might be thinking right now. Wait, *another* IR? Don't we already have like 5 of those, not counting all the driver-specific ones? Isn't this stuff complicated enough already? Well, there are some pretty good reasons to start afresh (again...). In the years we've been using GLSL IR, we've come to realize that, in fact, it's not what we want *at all* to do optimizations on.
>>>>>>>>>
>>>>>>>>> Did you evaluate using LLVM IR instead of inventing yet another one?
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Earthling Michel Dänzer | http://www.amd.com
>>>>>>>>> Libre software enthusiast | Mesa and X developer
>>>>>>>>
>>>>>>>> Yes. See
>>>>>>>> http://lists.freedesktop.org/archives/mesa-dev/2014-February/053502.html
>>>>>>>> and
>>>>>>>> http://lists.freedesktop.org/archives/mesa-dev/2014-February/053522.html
>>>>>>>
>>>>>>> I know Ian can't deal with LLVM for some reason. I was wondering if *you* evaluated it, and if so, why you rejected it.
>>>>>>>
>>>>>>> --
>>>>>>> Earthling Michel Dänzer | http://www.amd.com
>>>>>>> Libre software enthusiast | Mesa and X developer
>>>>>>
>>>>>> Well, first of all, the fact that Ian and Ken don't want to use it means that any plan to use LLVM for the Intel driver is dead in the water anyways - you can translate NIR into LLVM if you want, but for i965 we want to share optimizations between our 2 backends (FS and vec4) that we can't do today in GLSL IR so this is what we want to use for that, and since nobody else does anything with the core GLSL compiler except when they have to, when we start moving things out of GLSL IR this will probably replace GLSL IR as the infrastructure that all Mesa drivers use. But with that in mind, here are a few reasons why we wouldn't want to use LLVM:
>>>>>>
>>>>>> * LLVM wasn't built to understand structured CFG's, meaning that you need to re-structurize it using a pass that's fragile and prone to break if some other pass optimizes the shader in a way that makes it non-structured (i.e. not expressible in terms of loops and if statements). This loss of information also means that passes that need to know things like, for example, the loop nesting depth need to do an analysis pass whereas with NIR you can just walk up the control flow tree and count the number of loops we hit.
>>>>>
>>>>> LLVM has a pass to structurize the CFG. We use it in the radeon drivers, and it is run after all of the other LLVM optimizations which have no concept of structured CFG. It's not bug free, but it works really well even with all of the complex OpenCL kernels we throw at it.
>>>>>
>>>>> Your point about losing information when the CFG is de-structurized is valid, but for things like loop depth, I'm not sure why we couldn't write an LLVM analysis pass for this (if one doesn't already exist).
>>>>
>>>> I don't think this is such a big deal either. At least the structurization pass used on newer AMD hardware isn't fragile in the way you seem to imply -- AFAIK (unlike the old AMDIL heuristic algorithm) it's guaranteed to give you a valid structurized output no matter what the previous optimization passes have done to the CFG, modulo bugs.
>>>>
>>>> I admit that the situation is nevertheless suboptimal. Ideally this information wouldn't get lost along the way. For the long term we may want to represent structured control flow directly in the IR as you say, I just don't see how reinventing the IR saves us any work if we could just fix the existing one.
>>>
>>> It seems to me that something like how we represent control flow is a pretty fundamental part of the IR - it affects any optimization pass that needs to do anything beyond adding and removing instructions. How would you fix that, especially given that LLVM is primarily designed for CPU's where you don't want to be restricted to structured control flow at all? It seems like our goals (preserve the structure) conflict with the way LLVM has been designed.
>>
>> I think it's important to distinguish between LLVM IR and the tools available to manipulate it. LLVM IR is meant to be a platform independent program representation. There is nothing about the IR that would prevent
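The quoted claim that, with a structured IR, loop nesting depth can be found by walking up the control flow tree can be sketched as follows. This is only an illustration under assumed names: cf_node, cf_kind and loop_nesting_depth are hypothetical and are not the data structures from the NIR patch series.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node kinds in a structured control-flow tree. */
enum cf_kind { CF_FUNCTION, CF_IF, CF_LOOP, CF_BLOCK };

/* Hypothetical tree node; every node knows its enclosing construct. */
struct cf_node {
    enum cf_kind kind;
    struct cf_node *parent;   /* NULL for the root */
};

/* Count the loops that enclose the node by following parent links. */
unsigned loop_nesting_depth(const struct cf_node *node)
{
    unsigned depth = 0;

    assert(node != NULL);
    for (const struct cf_node *n = node->parent; n != NULL; n = n->parent) {
        if (n->kind == CF_LOOP)
            depth++;
    }
    return depth;
}
```

No separate analysis pass is needed here because the nesting is stored in the tree itself; with an unstructured CFG the same number would have to be recomputed from the graph.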
Re: [Mesa-dev] [RFC PATCH 00/16] A new IR for Mesa
On Wed, Aug 20, 2014 at 11:56 AM, Matt Turner matts...@gmail.com wrote:
> On Wed, Aug 20, 2014 at 11:13 AM, Kenneth Graunke kenn...@whitecape.org wrote:
>> Gentoo has also had trouble updating for similar reasons; Matt (the Gentoo Mesa package maintainer) can probably comment more.
>
> Yes, at one point we were stuck two releases behind current Mesa (and this is Gentoo!) because we couldn't get the appropriate version of LLVM stabilized because a number of reverse dependencies didn't work with the new LLVM version.
>
> Having multiple versions installed in parallel breaks down pretty easily. Where do the headers go? Where do all the executables go? Do you version all of them and install one for each version?
>
> Do other distros allow multiple versions of LLVM to be installed in parallel? How do they manage?

For Chrome OS we have multiple versions of LLVM, basically one for each consumer, and each consumer (except for the clang family) links to its version statically. It is tedious but less painful than having to change all the consumers at once (I certainly don't want to update our ASan tools because I upgraded mesa). It's wasteful and by no means ideal, but it's a pragmatic solution to a problem over which I have no control :)

Stéphane

>> I've also heard stories from friends of mine who use radeonsi that they couldn't get new GL features or compiler fixes unless they upgrade both Mesa /and/ LLVM, and that LLVM was usually either not released or not available in their distribution for a few months.
>
> I get the sense that this is a problem that a backend in LLVM would cause, but maybe not so if we just used LLVM IR for the GLSL compiler. I think the C API is suitable for this kind of thing as well. Tom?
Re: [Mesa-dev] [RFC PATCH 00/16] A new IR for Mesa
On Wed, Aug 20, 2014 at 12:17 PM, Tom Stellard t...@stellard.net wrote:
> On Tue, Aug 19, 2014 at 05:19:15PM -0700, Connor Abbott wrote:
>> On Tue, Aug 19, 2014 at 3:57 PM, Tom Stellard t...@stellard.net wrote:
>>>> [...]
>>>
>>> I think it's important to distinguish between LLVM IR and the tools available to manipulate it. LLVM IR is meant to be a platform independent
Re: [Mesa-dev] [PATCH 03/19] glx/drisw: add support for DRI2rendererQueryExtension
On 20/08/14 19:32, Jon TURNEY wrote: On 18/08/2014 13:08, Emil Velikov wrote: On 18/08/14 12:47, Jon TURNEY wrote: On 14/08/2014 23:18, Emil Velikov wrote: The extension is used by GLX_MESA_query_renderer, which can be provided for by hardware and software drivers. v2: Use designated initializers. v3: Move drisw_query_renderer_*() to dri2_query_renderer.c This breaks my build (see [1]) Ouch, I've completely forgot about your recent-ish changes in here. Sorry for the breakage. I guess something like the attached is needed. Possibly dri2_query_renderer.c needs to be renamed, since it's contents now are used for more than dri[23]. My initial plan was to move the functions to dri_common.c, although that caused 'make check' to explode so I've kept them here as per Ian's suggestion. Renaming the file makes sense imho. With a couple of small changes, I believe that you should be safe with dropping the above header and the HAVE_LIBDRM guards below. The small changes: - dri*_query_renderer_* into their respective dri*_priv.h I had a go at writing the patch like that, which seems to work. Revised patch attached. Can you just add glx: or similar prefix in the subject line before committing ? Other than that it looks good imho. Reviewed-by: Emil Velikov emil.l.veli...@gmail.com - Perhaps move a struct from dri2.h to dri2_priv.h I don't know which struct you mean here. I didn't find one I needed to move to make things build. I know that the dri2 waters are quite deep but wasn't sure how murky they are, thus the Perhaps The dri2_convert_glx_query_renderer_attribs() helper function could possibly stand to be given a more generic name. IMHO one could do a few cleanups in glx, and I highly doubt that anyone would object. 
I would be quite happy if anyone bothered :) Thanks Emil 0001-Fix-build-since-679c2ef-glx-drisw-add-support-for-DR.patch From 1f06833a856b98b6c5248f0f001bf5b3a74ae010 Mon Sep 17 00:00:00 2001 From: Jon TURNEY jon.tur...@dronecode.org.uk Date: Sun, 17 Aug 2014 17:22:22 +0100 Subject: [PATCH] Fix build since 679c2ef glx/drisw: add support for DRI2rendererQueryExtension, when only building drisw renderer. v2: - Move dri*_query_renderer_* into their respective dri*_priv.h headers - Drop then unnneeded include of dri2.h from dri2_query_renderer.c - Rename dri2_query_renderer.c as dri_common_query_renderer.c, as it's contents now are used for more than dri[23] Signed-off-by: Jon TURNEY jon.tur...@dronecode.org.uk --- src/glx/Makefile.am | 6 +++--- src/glx/dri2.h | 16 src/glx/dri2_priv.h | 8 src/glx/dri3_priv.h | 9 + ...dri2_query_renderer.c = dri_common_query_renderer.c} | 1 - 5 files changed, 20 insertions(+), 20 deletions(-) rename src/glx/{dri2_query_renderer.c = dri_common_query_renderer.c} (99%) diff --git a/src/glx/Makefile.am b/src/glx/Makefile.am index cdd898e..4515312 100644 --- a/src/glx/Makefile.am +++ b/src/glx/Makefile.am @@ -96,7 +96,8 @@ endif if HAVE_DRICOMMON libglx_la_SOURCES += \ xfont.c \ - dri_common.c + dri_common.c \ + dri_common_query_renderer.c endif if HAVE_DRI2 @@ -104,8 +105,7 @@ libglx_la_SOURCES += \ dri_glx.c \ XF86dri.c \ dri2_glx.c \ - dri2.c \ - dri2_query_renderer.c + dri2.c endif if HAVE_DRI3 diff --git a/src/glx/dri2.h b/src/glx/dri2.h index d07b296..4be5bf8 100644 --- a/src/glx/dri2.h +++ b/src/glx/dri2.h @@ -88,20 +88,4 @@ DRI2CopyRegion(Display * dpy, XID drawable, XserverRegion region, CARD32 dest, CARD32 src); -_X_HIDDEN int -dri2_query_renderer_integer(struct glx_screen *base, int attribute, -unsigned int *value); - -_X_HIDDEN int -dri2_query_renderer_string(struct glx_screen *base, int attribute, - const char **value); - -_X_HIDDEN int -dri3_query_renderer_integer(struct glx_screen *base, int attribute, -unsigned int *value); - 
-_X_HIDDEN int -dri3_query_renderer_string(struct glx_screen *base, int attribute, - const char **value); - #endif diff --git a/src/glx/dri2_priv.h b/src/glx/dri2_priv.h index c21eee5..b93d158 100644 --- a/src/glx/dri2_priv.h +++ b/src/glx/dri2_priv.h @@ -50,3 +50,11 @@ struct dri2_screen { int show_fps_interval; }; + +_X_HIDDEN int +dri2_query_renderer_integer(struct glx_screen *base, int attribute, +unsigned int *value); + +_X_HIDDEN int +dri2_query_renderer_string(struct glx_screen *base, int attribute, + const char **value); diff --git a/src/glx/dri3_priv.h b/src/glx/dri3_priv.h index c0e35ee..248fa28
Re: [Mesa-dev] [RFC PATCH 00/16] A new IR for Mesa
On 20 August 2014 20:13, Kenneth Graunke kenn...@whitecape.org wrote:
> I've also heard stories from friends of mine who use radeonsi that they couldn't get new GL features or compiler fixes unless they upgrade both Mesa /and/ LLVM, and that LLVM was usually either not released or not available in their distribution for a few months.

For whatever it's worth, I have been avoiding radeonsi in part because of the LLVM dependency. Some of the other issues already mentioned aside, I also think it makes it just painful to do bisects over moderate/longer periods of time. I'm sure AMD carefully considered the tradeoff, and that it's worth it for them, but purely as a user/downstream I'd say using LLVM for the radeonsi compiler was a mistake.
Re: [Mesa-dev] [RFC PATCH 00/16] A new IR for Mesa
On Wed, Aug 20, 2014 at 12:16 PM, Stéphane Marchesin stephane.marche...@gmail.com wrote:
> On Wed, Aug 20, 2014 at 11:56 AM, Matt Turner matts...@gmail.com wrote:
>> Having multiple versions installed in parallel breaks down pretty easily. Where do the headers go? Where do all the executables go? Do you version all of them and install one for each version?
>>
>> Do other distros allow multiple versions of LLVM to be installed in parallel? How do they manage?
>
> For Chrome OS we have multiple versions of LLVM, basically one for each consumer, and each consumer (except for the clang family) links to its version statically. It is tedious but less painful than having to change all the consumers at once (I certainly don't want to update our ASan tools because I upgraded mesa). It's wasteful and by no means ideal, but it's a pragmatic solution to a problem over which I have no control :)

Right. That solution would never be acceptable for Gentoo.

The LLVM maintainer in Gentoo also confirmed to me that we used to allow multiple versions of LLVM to be installed side-by-side, but it required a lot of patching and was a large pain.

The fundamental problem here seems to be that the intended usage model for LLVM is that you just statically link it into your project. Seems fine for proprietary software, not so fine for free software.
Re: [Mesa-dev] [RFC PATCH 00/16] A new IR for Mesa
On Wed, Aug 20, 2014 at 11:56:32AM -0700, Matt Turner wrote:
> On Wed, Aug 20, 2014 at 11:13 AM, Kenneth Graunke kenn...@whitecape.org wrote:
>> Gentoo has also had trouble updating for similar reasons; Matt (the Gentoo Mesa package maintainer) can probably comment more.
>
> Yes, at one point we were stuck two releases behind current Mesa (and this is Gentoo!) because we couldn't get the appropriate version of LLVM stabilized because a number of reverse dependencies didn't work with the new LLVM version.
>
> Having multiple versions installed in parallel breaks down pretty easily. Where do the headers go? Where do all the executables go? Do you version all of them and install one for each version?
>
> Do other distros allow multiple versions of LLVM to be installed in parallel? How do they manage?

On one of my (gentoo) dev systems, I have 8 different versions of LLVM installed. All I do is install each version to a different prefix. For example:

/usr/local/llvm/3.6/
/usr/local/llvm/3.5/

When I build a project like mesa which depends on LLVM, I just point it to the prefix of the LLVM version that I want to use. Would distros be able to do something like this?

>> I've also heard stories from friends of mine who use radeonsi that they couldn't get new GL features or compiler fixes unless they upgrade both Mesa /and/ LLVM, and that LLVM was usually either not released or not available in their distribution for a few months.
>
> I get the sense that this is a problem that a backend in LLVM would cause, but maybe not so if we just used LLVM IR for the GLSL compiler. I think the C API is suitable for this kind of thing as well. Tom?

Yes, see my reply to Ken, but basically using LLVM IR for just GLSL would require a much smaller subset of LLVM features, most of which are pretty stable. It would even be using fewer features than what llvmpipe uses, and llvmpipe still works with older LLVM versions.

-Tom
Re: [Mesa-dev] [PATCH] r600g: Fix flat/smooth shade state toggle
On Wed, 20 Aug 2014 21:04:34 +0200, Marek Olšák mar...@gmail.com wrote:
> The flag is only used to set S_028644_FLAT_SHADE on all r600g chips. I don't see it being used by the shader code generation.
>
> Marek

Ah, I see. Will respin the patch with an alternate solution that won't require shader recompilation. Consider v1 dropped.

/Glenn
Re: [Mesa-dev] [RFC PATCH 00/16] A new IR for Mesa
On Wed, Aug 20, 2014 at 12:26:15PM -0700, Connor Abbott wrote:
> On Wed, Aug 20, 2014 at 12:17 PM, Tom Stellard t...@stellard.net wrote:
>> On Tue, Aug 19, 2014 at 05:19:15PM -0700, Connor Abbott wrote:
>>> On Tue, Aug 19, 2014 at 3:57 PM, Tom Stellard t...@stellard.net wrote:
>>>> On Tue, Aug 19, 2014 at 01:37:56PM -0700, Connor Abbott wrote:
>>>>>> [...]
>>>>>
>>>>> It seems to me that something like how we represent control flow is a pretty fundamental part of the IR - it affects any optimization pass that needs to do anything beyond adding and removing instructions. How would you fix that, especially given that LLVM is primarily designed for CPU's where you don't want to be restricted to structured control flow at all? It seems like our goals
[Mesa-dev] [PATCH 3/4] dri/radeon: cleanup the radeon_context vtbl
Remove the set-but-unused, and set-but-empty vtable entries. Most likely a leftover from the dri1 days. Cc: Marek Olšák marek.ol...@amd.com Cc: Michel Dänzer michel.daen...@amd.com Signed-off-by: Emil Velikov emil.l.veli...@gmail.com --- src/mesa/drivers/dri/r200/r200_context.c | 24 -- src/mesa/drivers/dri/r200/r200_state.c | 52 -- src/mesa/drivers/dri/r200/r200_state.h | 1 - src/mesa/drivers/dri/radeon/radeon_common.c| 3 -- .../drivers/dri/radeon/radeon_common_context.h | 4 -- src/mesa/drivers/dri/radeon/radeon_context.c | 26 --- src/mesa/drivers/dri/radeon/radeon_state.c | 52 -- src/mesa/drivers/dri/radeon/radeon_state.h | 1 - 8 files changed, 163 deletions(-) diff --git a/src/mesa/drivers/dri/r200/r200_context.c b/src/mesa/drivers/dri/r200/r200_context.c index 7815c4e..931f437 100644 --- a/src/mesa/drivers/dri/r200/r200_context.c +++ b/src/mesa/drivers/dri/r200/r200_context.c @@ -143,27 +143,6 @@ static void r200InitDriverFuncs( struct dd_function_table *functions ) } -static void r200_get_lock(radeonContextPtr radeon) -{ - r200ContextPtr rmesa = (r200ContextPtr)radeon; - drm_radeon_sarea_t *sarea = radeon-sarea; - - R200_STATECHANGE( rmesa, ctx ); - if (rmesa-radeon.sarea-tiling_enabled) { - rmesa-hw.ctx.cmd[CTX_RB3D_COLORPITCH] |= R200_COLOR_TILE_ENABLE; - } - else rmesa-hw.ctx.cmd[CTX_RB3D_COLORPITCH] = ~R200_COLOR_TILE_ENABLE; - - if ( sarea-ctx_owner != rmesa-radeon.dri.hwContext ) { - sarea-ctx_owner = rmesa-radeon.dri.hwContext; - } - -} - -static void r200_vtbl_emit_cs_header(struct radeon_cs *cs, radeonContextPtr rmesa) -{ -} - static void r200_emit_query_finish(radeonContextPtr radeon) { BATCH_LOCALS(radeon); @@ -180,9 +159,6 @@ static void r200_emit_query_finish(radeonContextPtr radeon) static void r200_init_vtbl(radeonContextPtr radeon) { - radeon-vtbl.get_lock = r200_get_lock; - radeon-vtbl.update_viewport_offset = r200UpdateViewportOffset; - radeon-vtbl.emit_cs_header = r200_vtbl_emit_cs_header; radeon-vtbl.swtcl_flush = r200_swtcl_flush; 
radeon-vtbl.fallback = r200Fallback; radeon-vtbl.update_scissor = r200_vtbl_update_scissor; diff --git a/src/mesa/drivers/dri/r200/r200_state.c b/src/mesa/drivers/dri/r200/r200_state.c index 983430f..2ad8439 100644 --- a/src/mesa/drivers/dri/r200/r200_state.c +++ b/src/mesa/drivers/dri/r200/r200_state.c @@ -1616,58 +1616,6 @@ static void r200DepthRange(struct gl_context *ctx) r200UpdateWindow( ctx ); } -void r200UpdateViewportOffset( struct gl_context *ctx ) -{ - r200ContextPtr rmesa = R200_CONTEXT(ctx); - __DRIdrawable *dPriv = radeon_get_drawable(rmesa-radeon); - GLfloat xoffset = (GLfloat)0; - GLfloat yoffset = (GLfloat)dPriv-h; - const GLfloat *v = ctx-ViewportArray[0]._WindowMap.m; - - float_ui32_type tx; - float_ui32_type ty; - - tx.f = v[MAT_TX] + xoffset; - ty.f = (- v[MAT_TY]) + yoffset; - - if ( rmesa-hw.vpt.cmd[VPT_SE_VPORT_XOFFSET] != tx.ui32 || - rmesa-hw.vpt.cmd[VPT_SE_VPORT_YOFFSET] != ty.ui32 ) - { - /* Note: this should also modify whatever data the context reset - * code uses... 
- */ - R200_STATECHANGE( rmesa, vpt ); - rmesa-hw.vpt.cmd[VPT_SE_VPORT_XOFFSET] = tx.ui32; - rmesa-hw.vpt.cmd[VPT_SE_VPORT_YOFFSET] = ty.ui32; - - /* update polygon stipple x/y screen offset */ - { - GLuint stx, sty; - GLuint m = rmesa-hw.msc.cmd[MSC_RE_MISC]; - - m = ~(R200_STIPPLE_X_OFFSET_MASK | -R200_STIPPLE_Y_OFFSET_MASK); - - /* add magic offsets, then invert */ - stx = 31 - ((-1) R200_STIPPLE_COORD_MASK); - sty = 31 - ((dPriv-h - 1) - R200_STIPPLE_COORD_MASK); - - m |= ((stx R200_STIPPLE_X_OFFSET_SHIFT) | - (sty R200_STIPPLE_Y_OFFSET_SHIFT)); - - if ( rmesa-hw.msc.cmd[MSC_RE_MISC] != m ) { -R200_STATECHANGE( rmesa, msc ); - rmesa-hw.msc.cmd[MSC_RE_MISC] = m; - } - } - } - - radeonUpdateScissor( ctx ); -} - - - /* = * Miscellaneous */ diff --git a/src/mesa/drivers/dri/r200/r200_state.h b/src/mesa/drivers/dri/r200/r200_state.h index a396b06..9111981 100644 --- a/src/mesa/drivers/dri/r200/r200_state.h +++ b/src/mesa/drivers/dri/r200/r200_state.h @@ -43,7 +43,6 @@ extern void r200InitTnlFuncs( struct gl_context *ctx ); extern void r200UpdateMaterial( struct gl_context *ctx ); -extern void r200UpdateViewportOffset( struct gl_context *ctx ); extern void r200UpdateWindow( struct gl_context *ctx ); extern void r200UpdateDrawBuffer(struct gl_context *ctx); diff --git a/src/mesa/drivers/dri/radeon/radeon_common.c b/src/mesa/drivers/dri/radeon/radeon_common.c index 515e55a..966e10a 100644 --- a/src/mesa/drivers/dri/radeon/radeon_common.c +++
[Mesa-dev] [PATCH v2] r600g: Fix flat/smooth shade state toggle
If only the flat/smooth shade state changed between two render calls, the prior code would miss updating the hardware state. Also add a check for sprite coord enable, which could otherwise have the same type of issue.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=81967

Signed-off-by: Glenn Kennard glenn.kenn...@gmail.com
---
V2:
- No new shader variant created
- Also check for sprite coord enable since its state is updated in similar fashion to flatshade.

 src/gallium/drivers/r600/r600_state_common.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/r600/r600_state_common.c b/src/gallium/drivers/r600/r600_state_common.c
index 7594d0e..028d800 100644
--- a/src/gallium/drivers/r600/r600_state_common.c
+++ b/src/gallium/drivers/r600/r600_state_common.c
@@ -1227,7 +1227,9 @@ static bool r600_update_derived_state(struct r600_context *rctx)
 	if (unlikely(!rctx->ps_shader->current))
 		return false;
 
-	if (unlikely(ps_dirty || rctx->pixel_shader.shader != rctx->ps_shader->current)) {
+	if (unlikely(ps_dirty || rctx->pixel_shader.shader != rctx->ps_shader->current ||
+		rctx->rasterizer->sprite_coord_enable != rctx->ps_shader->current->sprite_coord_enable ||
+		rctx->rasterizer->flatshade != rctx->ps_shader->current->flatshade)) {
 
 		if (rctx->cb_misc_state.nr_ps_color_outputs != rctx->ps_shader->current->nr_ps_color_outputs) {
 			rctx->cb_misc_state.nr_ps_color_outputs = rctx->ps_shader->current->nr_ps_color_outputs;
-- 
1.9.1
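The condition added by this patch can be read in isolation as a "does the register state need re-emitting" predicate. The sketch below mirrors that logic with trimmed-down, hypothetical stand-in structs (raster_state, ps_variant) rather than the real r600_context; it is meant only to make the comparison explicit, not to reproduce the driver code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the currently bound rasterizer state. */
struct raster_state {
    bool flatshade;
    unsigned sprite_coord_enable;
};

/* Hypothetical stand-in for the values the current shader state was
 * last set up with. */
struct ps_variant {
    bool flatshade;
    unsigned sprite_coord_enable;
};

/* Return true when the pixel-shader register state must be updated:
 * either the shader itself changed, or only the rasterizer bits that
 * feed its register setup differ from what was used last time. */
bool ps_state_needs_update(bool ps_dirty, bool shader_changed,
                           const struct raster_state *rs,
                           const struct ps_variant *cur)
{
    assert(rs != NULL && cur != NULL);
    return ps_dirty || shader_changed ||
           rs->sprite_coord_enable != cur->sprite_coord_enable ||
           rs->flatshade != cur->flatshade;
}
```

With this shape, a draw call that only toggles flat/smooth shading takes the update path without any new shader variant being compiled.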
[Mesa-dev] [PATCH 4/4] dri/radeon: nuke the remaining references to sarea
It was an 'interesting' feature which I'm glad we no longer use as of dri2.

Cc: Marek Olšák marek.ol...@amd.com
Cc: Michel Dänzer michel.daen...@amd.com
Signed-off-by: Emil Velikov emil.l.veli...@gmail.com
---
 src/mesa/drivers/dri/r200/r200_ioctl.c              | 7 -------
 src/mesa/drivers/dri/radeon/radeon_common_context.h | 1 -
 src/mesa/drivers/dri/radeon/radeon_screen.h         | 3 ---
 3 files changed, 11 deletions(-)

diff --git a/src/mesa/drivers/dri/r200/r200_ioctl.c b/src/mesa/drivers/dri/r200/r200_ioctl.c
index ef0d637..515be92 100644
--- a/src/mesa/drivers/dri/r200/r200_ioctl.c
+++ b/src/mesa/drivers/dri/r200/r200_ioctl.c
@@ -62,13 +62,6 @@ static void r200Clear( struct gl_context *ctx, GLbitfield mask )
       BUFFER_BIT_DEPTH | BUFFER_BIT_STENCIL |
       BUFFER_BIT_COLOR0;
 
-   if ( R200_DEBUG & RADEON_IOCTL ) {
-      if (rmesa->radeon.sarea)
-         fprintf( stderr, "r200Clear %x %d\n", mask, rmesa->radeon.sarea->pfCurrentPage);
-      else
-         fprintf( stderr, "r200Clear %x radeon->sarea is NULL\n", mask);
-   }
-
    radeonFlush( ctx );
 
    hwmask = mask & hwbits;
diff --git a/src/mesa/drivers/dri/radeon/radeon_common_context.h b/src/mesa/drivers/dri/radeon/radeon_common_context.h
index 8330b17..cfed408 100644
--- a/src/mesa/drivers/dri/radeon/radeon_common_context.h
+++ b/src/mesa/drivers/dri/radeon/radeon_common_context.h
@@ -406,7 +406,6 @@ struct radeon_context {
 
    /* Drawable information */
    unsigned int lastStamp;
-   drm_radeon_sarea_t *sarea; /* Private SAREA data */
 
    /* Mirrors of some DRI state */
    struct radeon_dri_mirror dri;
diff --git a/src/mesa/drivers/dri/radeon/radeon_screen.h b/src/mesa/drivers/dri/radeon/radeon_screen.h
index b5cc075..b3e9267 100644
--- a/src/mesa/drivers/dri/radeon/radeon_screen.h
+++ b/src/mesa/drivers/dri/radeon/radeon_screen.h
@@ -45,7 +45,6 @@ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
 #include "dri_util.h"
 #include "radeon_chipset.h"
 #include "radeon_reg.h"
-#include "drm_sarea.h"
 #include "xmlconfig.h"
 
 
@@ -88,7 +87,6 @@ typedef struct radeon_screen {
    __volatile__ uint32_t *scratch;
 
    __DRIscreen *driScreen;
-   unsigned int sarea_priv_offset;
    unsigned int gart_buffer_offset; /* offset in card memory space */
    unsigned int gart_texture_offset; /* offset in card memory space */
    unsigned int gart_base;
@@ -100,7 +98,6 @@ typedef struct radeon_screen {
    int num_gb_pipes;
    int num_z_pipes;
 
-   drm_radeon_sarea_t *sarea; /* Private SAREA data */
    struct radeon_bo_manager *bom;
 
 } radeonScreenRec, *radeonScreenPtr;
-- 
2.0.2
[Mesa-dev] [PATCH 1/4] dri/radeon: drop obsolete radeon_{dri, macros}.h headers
Both have been unused for at least a couple of years. For example the
last user of radeon_macros.h was removed with

commit 8c11f0a88300f7bc3f05a12789c781ba0f4b3cc6
Author: Eric Anholt e...@anholt.net
Date:   Fri Oct 14 13:27:02 2011 -0700

    radeon: Drop the legacy BO manager code.

Cc: Marek Olšák marek.ol...@amd.com
Cc: Michel Dänzer michel.daen...@amd.com
Signed-off-by: Emil Velikov emil.l.veli...@gmail.com
---
 src/mesa/drivers/dri/r200/r200_ioctl.h             |   1 -
 src/mesa/drivers/dri/r200/server/radeon_dri.h      |   1 -
 src/mesa/drivers/dri/r200/server/radeon_macros.h   |   1 -
 src/mesa/drivers/dri/radeon/radeon_screen.c        |   1 -
 src/mesa/drivers/dri/radeon/radeon_screen.h        |   3 +-
 src/mesa/drivers/dri/radeon/server/radeon_dri.h    | 115 --
 src/mesa/drivers/dri/radeon/server/radeon_macros.h | 128 -
 7 files changed, 2 insertions(+), 248 deletions(-)
 delete mode 120000 src/mesa/drivers/dri/r200/server/radeon_dri.h
 delete mode 120000 src/mesa/drivers/dri/r200/server/radeon_macros.h
 delete mode 100644 src/mesa/drivers/dri/radeon/server/radeon_dri.h
 delete mode 100644 src/mesa/drivers/dri/radeon/server/radeon_macros.h

diff --git a/src/mesa/drivers/dri/r200/r200_ioctl.h b/src/mesa/drivers/dri/r200/r200_ioctl.h
index ab5f822..9133a22 100644
--- a/src/mesa/drivers/dri/r200/r200_ioctl.h
+++ b/src/mesa/drivers/dri/r200/r200_ioctl.h
@@ -36,7 +36,6 @@ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
 #define __R200_IOCTL_H__
 
 #include "main/simple_list.h"
-#include "radeon_dri.h"
 
 #include "radeon_bo_gem.h"
 #include "radeon_cs_gem.h"
diff --git a/src/mesa/drivers/dri/r200/server/radeon_dri.h b/src/mesa/drivers/dri/r200/server/radeon_dri.h
deleted file mode 120000
index 27c591d..0000000
--- a/src/mesa/drivers/dri/r200/server/radeon_dri.h
+++ /dev/null
@@ -1 +0,0 @@
-../../radeon/server/radeon_dri.h
\ No newline at end of file
diff --git a/src/mesa/drivers/dri/r200/server/radeon_macros.h b/src/mesa/drivers/dri/r200/server/radeon_macros.h
deleted file mode 120000
index c56cd73..0000000
--- a/src/mesa/drivers/dri/r200/server/radeon_macros.h
+++ /dev/null
@@ -1 +0,0 @@
-../../radeon/server/radeon_macros.h
\ No newline at end of file
diff --git a/src/mesa/drivers/dri/radeon/radeon_screen.c b/src/mesa/drivers/dri/radeon/radeon_screen.c
index 9a6fbbd..044e212 100644
--- a/src/mesa/drivers/dri/radeon/radeon_screen.c
+++ b/src/mesa/drivers/dri/radeon/radeon_screen.c
@@ -45,7 +45,6 @@ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
 #include "swrast/s_renderbuffer.h"
 
 #include "radeon_chipset.h"
-#include "radeon_macros.h"
 #include "radeon_screen.h"
 #include "radeon_common.h"
 #include "radeon_common_context.h"
diff --git a/src/mesa/drivers/dri/radeon/radeon_screen.h b/src/mesa/drivers/dri/radeon/radeon_screen.h
index 9b77627..b5cc075 100644
--- a/src/mesa/drivers/dri/radeon/radeon_screen.h
+++ b/src/mesa/drivers/dri/radeon/radeon_screen.h
@@ -40,8 +40,9 @@ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
  * IMPORTS: these headers contain all the DRI, X and kernel-related
  * definitions that we need.
  */
+#include <xf86drm.h>
+#include <radeon_drm.h>
 #include "dri_util.h"
-#include "radeon_dri.h"
 #include "radeon_chipset.h"
 #include "radeon_reg.h"
 #include "drm_sarea.h"
diff --git a/src/mesa/drivers/dri/radeon/server/radeon_dri.h b/src/mesa/drivers/dri/radeon/server/radeon_dri.h
deleted file mode 100644
index dc51372..0000000
--- a/src/mesa/drivers/dri/radeon/server/radeon_dri.h
+++ /dev/null
@@ -1,115 +0,0 @@
-/**
- * \file server/radeon_dri.h
- * \brief Radeon server-side structures.
- *
- * \author Kevin E. Martin mar...@xfree86.org
- * \author Rickard E. Faith fa...@valinux.com
- */
-
-/*
- * Copyright 2000 ATI Technologies Inc., Markham, Ontario,
- *                VA Linux Systems Inc., Fremont, California.
- *
- * All Rights Reserved.
- *
- * Permission is hereby granted, free of charge, to any person obtaining
- * a copy of this software and associated documentation files (the
- * "Software"), to deal in the Software without restriction, including
- * without limitation on the rights to use, copy, modify, merge,
- * publish, distribute, sublicense, and/or sell copies of the Software,
- * and to permit persons to whom the Software is furnished to do so,
- * subject to the following conditions:
- *
- * The above copyright notice and this permission notice (including the
- * next paragraph) shall be included in all copies or substantial
- * portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
- * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
- * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
- * NON-INFRINGEMENT. IN NO EVENT SHALL ATI, VA LINUX SYSTEMS AND/OR
- * THEIR SUPPLIERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
- * WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
- * OUT OF OR IN CONNECTION WITH THE
[Mesa-dev] [PATCH 2/4] include: move sarea.h next to its only user
The header was used by DRI1 drivers, which we removed a while back.
Now only the DRI1 loader in libGL uses it, so let's move it to
src/glx and prefix it accordingly.

Signed-off-by: Emil Velikov emil.l.veli...@gmail.com
---
 include/GL/internal/sarea.h | 92 -
 src/glx/dri_glx.c           |  2 +-
 src/glx/dri_sarea.h         | 92 +
 3 files changed, 93 insertions(+), 93 deletions(-)
 delete mode 100644 include/GL/internal/sarea.h
 create mode 100644 src/glx/dri_sarea.h

diff --git a/include/GL/internal/sarea.h b/include/GL/internal/sarea.h
deleted file mode 100644
index c3b8bca..0000000
--- a/include/GL/internal/sarea.h
+++ /dev/null
@@ -1,92 +0,0 @@
-/**
- * \file sarea.h
- * SAREA definitions.
- *
- * \author Kevin E. Martin ke...@precisioninsight.com
- * \author Jens Owen jo...@vmware.com
- * \author Rickard E. (Rik) Faith fa...@valinux.com
- */
-
-/*
- * Copyright 1998-1999 Precision Insight, Inc., Cedar Park, Texas.
- * Copyright 2000 VA Linux Systems, Inc.
- * All Rights Reserved.
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the
- * "Software"), to deal in the Software without restriction, including
- * without limitation the rights to use, copy, modify, merge, publish,
- * distribute, sub license, and/or sell copies of the Software, and to
- * permit persons to whom the Software is furnished to do so, subject to
- * the following conditions:
- *
- * The above copyright notice and this permission notice (including the
- * next paragraph) shall be included in all copies or substantial portions
- * of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
- * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
- * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
- * IN NO EVENT SHALL PRECISION INSIGHT AND/OR ITS SUPPLIERS BE LIABLE FOR
- * ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
- * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
- * SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
- */
-
-
-#ifndef _SAREA_H_
-#define _SAREA_H_
-
-#include "xf86drm.h"
-
-/* SAREA area needs to be at least a page */
-#if defined(__alpha__)
-#define SAREA_MAX                       0x2000
-#elif defined(__ia64__)
-#define SAREA_MAX                       0x10000 /* 64kB */
-#else
-/* Intel 830M driver needs at least 8k SAREA */
-#define SAREA_MAX                       0x2000
-#endif
-
-#define SAREA_MAX_DRAWABLES             256
-
-#define SAREA_DRAWABLE_CLAIMED_ENTRY    0x80000000
-
-/**
- * SAREA per drawable information.
- *
- * \sa _XF86DRISAREA.
- */
-typedef struct _XF86DRISAREADrawable {
-    unsigned int stamp;
-    unsigned int flags;
-} XF86DRISAREADrawableRec, *XF86DRISAREADrawablePtr;
-
-/**
- * SAREA frame information.
- *
- * \sa _XF86DRISAREA.
- */
-typedef struct _XF86DRISAREAFrame {
-    unsigned int x;
-    unsigned int y;
-    unsigned int width;
-    unsigned int height;
-    unsigned int fullscreen;
-} XF86DRISAREAFrameRec, *XF86DRISAREAFramePtr;
-
-/**
- * SAREA definition.
- */
-typedef struct _XF86DRISAREA {
-    /** first thing is always the DRM locking structure */
-    drmLock lock;
-    /** \todo Use readers/writer lock for drawable_lock */
-    drmLock drawable_lock;
-    XF86DRISAREADrawableRec drawableTable[SAREA_MAX_DRAWABLES];
-    XF86DRISAREAFrameRec frame;
-    drm_context_t dummy_context;
-} XF86DRISAREARec, *XF86DRISAREAPtr;
-
-#endif
diff --git a/src/glx/dri_glx.c b/src/glx/dri_glx.c
index 5295331..d087751 100644
--- a/src/glx/dri_glx.c
+++ b/src/glx/dri_glx.c
@@ -40,7 +40,7 @@ SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
 #include "glxclient.h"
 #include "xf86dri.h"
 #include "dri2.h"
-#include "sarea.h"
+#include "dri_sarea.h"
 #include <dlfcn.h>
 #include <sys/types.h>
 #include <sys/mman.h>
diff --git a/src/glx/dri_sarea.h b/src/glx/dri_sarea.h
new file mode 100644
index 0000000..fe4529b
--- /dev/null
+++ b/src/glx/dri_sarea.h
@@ -0,0 +1,92 @@
+/**
+ * \file dri_sarea.h
+ * SAREA definitions.
+ *
+ * \author Kevin E. Martin ke...@precisioninsight.com
+ * \author Jens Owen jo...@vmware.com
+ * \author Rickard E. (Rik) Faith fa...@valinux.com
+ */
+
+/*
+ * Copyright 1998-1999 Precision Insight, Inc., Cedar Park, Texas.
+ * Copyright 2000 VA Linux Systems, Inc.
+ * All Rights Reserved.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the
+ * "Software"), to deal in the Software without restriction, including
+ * without limitation the rights to use, copy, modify, merge, publish,
+ * distribute, sub license, and/or sell copies of the Software, and to
+ * permit persons
Re: [Mesa-dev] [RFC PATCH 00/16] A new IR for Mesa
On 20.08.2014 20:45, Matt Turner wrote:
On Wed, Aug 20, 2014 at 11:28 AM, Roland Scheidegger srol...@vmware.com wrote:
On 20.08.2014 20:13, Kenneth Graunke wrote:

For example, Debian was stuck on Mesa 9.2.2 for 4 months (2013-12-08 to 2014-03-22), and I was told this was because of LLVM versioning changes in the other drivers (primarily radeon, I believe, but probably also llvmpipe).

llvmpipe generally runs on pretty old llvm versions, though I didn't check the specifics here...

There are also 49 instances of 'HAVE_LLVM [=]' to manage that :)

That is true, but note that none of them really has anything to do with building the IR or the like; they are all for the JIT, disassembler, etc., because these things aren't doable with the stable C API.

Roland

Couldn't build scripts download and use an appropriate llvm version automatically if the one installed isn't sufficient? Though maybe the idea is crazy; I usually try to avoid dealing with such problems ;-).

I don't know the specifics of what you're suggesting, but I don't think I need to to say that that's disgusting.
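For readers unfamiliar with the 'HAVE_LLVM' checks counted above: they are plain preprocessor comparisons against a packed version number. Below is a minimal sketch in C, assuming the 0xMMmm encoding that appears in the gallivm patches on this list (0x0306 for LLVM 3.6). The fallback value 0x0305, the LLVM_MAJOR/LLVM_MINOR macros and the helper engine_builder_api() are hypothetical names made up for illustration; they are not part of Mesa.

```c
/* Assumption: the build system defines HAVE_LLVM as 0xMMmm, i.e. the
 * major version in the high byte and the minor version in the low byte.
 * The fallback below exists only so the sketch compiles on its own. */
#ifndef HAVE_LLVM
#define HAVE_LLVM 0x0305
#endif

/* Hypothetical helpers to unpack the encoded version. */
#define LLVM_MAJOR(v) ((v) >> 8)
#define LLVM_MINOR(v) ((v) & 0xff)

/* Hypothetical function: reports which code path the guard selects.
 * In real code, each branch would contain the API call that matches
 * the LLVM version being built against. */
static const char *engine_builder_api(void)
{
#if HAVE_LLVM >= 0x0306
   return "unique_ptr";   /* newer constructor: ownership made explicit */
#else
   return "raw pointer";  /* older constructor: plain module pointer */
#endif
}
```

Building the same file with -DHAVE_LLVM=0x0306 would select the other branch. Each such guard is one more code path to keep compiling, which is the maintenance cost the thread is weighing.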
Re: [Mesa-dev] [PATCH 2/2] mesa/st: add ARB_texture_view support
Am 20.08.2014 18:48, schrieb Roland Scheidegger: Am 20.08.2014 18:33, schrieb Ilia Mirkin: On Wed, Aug 20, 2014 at 12:22 PM, Jose Fonseca jfons...@vmware.com wrote: On 20/08/14 17:14, Roland Scheidegger wrote: Am 20.08.2014 17:55, schrieb Ilia Mirkin: On Wed, Aug 20, 2014 at 11:47 AM, Jose Fonseca jfons...@vmware.com wrote: On 20/08/14 16:31, Ilia Mirkin wrote: Hm, it's not tested. And you're right, that would (most likely) mess up, since it would only have the pipe_resource's target. Any suggestions on how to fix it? Should the target be added to pipe_sampler_view? On Wed, Aug 20, 2014 at 11:25 AM, Roland Scheidegger srol...@vmware.com wrote: Didn't look at it that closely, but I'm pretty surprised this really works. One things ARB_texture_view can do is cast cube maps (and cube map arrays) to 2d arrays and vice versa (also 1d/2d to the respective array type), and we cannot express that in sampler views (yet) (we can't express it in surfaces neither but there it should not matter). Which means the type used in the shader for sampling will not match the sampler view, which sounds quite broken to me. 
Roland Probably the only sane thing to do eliminate the disctinction between PIPE_TEXTURE_FOO and PIPE_TEXTURE_FOO_ARRAY like in https://urldefense.proofpoint.com/v1/url?u=http://msdn.microsoft.com/en-us/library/windows/desktop/ff476202.aspxk=oIvRg1%2BdGAgOoM1BIlLLqw%3D%3D%0Ar=F4msKE2WxRzA%2BwN%2B25muztFm5TSPwE8HKJfWfR2NgfY%3D%0Am=x03OmuVWAQgfFbsFB2SLMLwSYavxkU8Zsypu9lEIkpg%3D%0As=4b0ca75d91e6d53d92658d9e334cf2a73a01efe5667464c969f5c085409052ff , e.g.,: enum pipe_texture_target { PIPE_BUFFER = 0, PIPE_TEXTURE_1D = 1, PIPE_TEXTURE_2D = 2, PIPE_TEXTURE_3D = 3, PIPE_TEXTURE_CUBE = 4, // Must have same layout as PIPE_TEXTURE_2D PIPE_TEXTURE_RECT = 5, PIPE_TEXTURE_1D_ARRAY = PIPE_TEXTURE_1D, PIPE_TEXTURE_2D_ARRAY = PIPE_TEXTURE_2D, PIPE_TEXTURE_CUBE_ARRAY = PIPE_TEXTURE_CUBE, PIPE_MAX_TEXTURE_TYPES }; We could also remove PIPE_TEXTURE_CUBE and have cube-maps be PIPE_TEXTURE_2D with a flag, but that's probably a lot of work. Instead, drivers that want to be able to support ARB_texture_view will need to ensure PIPE_TEXTURE_CUBE/PIPE_TEXTURE_2D layout match. Another quick + cheap alternative (at least looking at nv50/nvc0 code) would be to pass a separate target parameter to -create_sampler_view(). That would be enough for nouveau, but perhaps not more generally? Take a look at nv50_tex.c:nv50_create_texture_view -- it also needs to work out the depth of the texture (presumably to deal with out-of-bounds accesses) and that is written to the texture info structure. Well that should be enough, but I don't think it fits out design. We've encapsulated other override information like the format in the view already, and I see no reason why the target cast should be treated any different. 
In other words, you're arguing for: diff --git a/src/gallium/include/pipe/p_state.h b/src/gallium/include/pipe/p_state.h index a82686b..c87ac4e 100644 --- a/src/gallium/include/pipe/p_state.h +++ b/src/gallium/include/pipe/p_state.h @@ -333,6 +333,7 @@ struct pipe_surface On struct pipe_sampler_view, I thought... unless I'm misunderstanding. This was also my first thought about fixing this after Roland pointed out the issue. Yes definitely for pipe_sampler_view - d3d10 also has it on the render target / depth stencil views, though so far I'm not convinced there's any value in that (the addressing of cube maps / arrays, 1d / 1d arrays is entirely the same in all cases, what matters is really the first and last layer only). struct pipe_reference reference; struct pipe_resource *texture; /** resource into which this is a view */ struct pipe_context *context; /** context this surface belongs to */ + enum pipe_texture target; Make it pipe_texture_target target ;-) enum pipe_format format; /* XXX width/height should be removed */ It's a fair point. And I don't object that solution. Of course, for this to work, drivers will need to treat the _ARRAY and non _ARRAY targets the same when determining the texture layout for this to work. I just felt this would be a good oportunity to slim down pipe_texture_target too. I'm not sure the _ARRAY distinction still matters at this level, but I suppose it doesn't hurt. Such a cleanup would probably have to be done by someone with a better understanding of gallium than me. OTOH if you guys feel like doing it the sampler_view way will accrue too much technical debt, that's fine too. Unless I hear otherwise, I'm going to try to do it the pipe_sampler_view way tonight. Yes I think it would be a nice cleanup to split it up into two enums. I was mostly proposing just reusing the same enum and keeping pipe_texture_target the same because it would require less code change. But
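The idea discussed in this thread, giving the sampler view its own target so that it may differ from the target of the underlying resource, could be sketched roughly as follows. This is an illustration only, under stated assumptions: the enum and structs are stripped-down stand-ins rather than the real gallium definitions, and layout_class() and view_target_compatible() are hypothetical helpers. They group targets by memory layout the way the thread suggests (array and non-array variants share a layout, cube shares it with 2D); the compatibility rules of ARB_texture_view itself may well be stricter than this grouping.

```c
#include <stdbool.h>

/* Stand-in for pipe_texture_target; values are illustrative only. */
enum tex_target {
   TEX_BUFFER, TEX_1D, TEX_2D, TEX_3D, TEX_CUBE, TEX_RECT,
   TEX_1D_ARRAY, TEX_2D_ARRAY, TEX_CUBE_ARRAY
};

/* Stand-in for pipe_resource: the target the texture was created with. */
struct resource {
   enum tex_target target;
};

/* Stand-in for pipe_sampler_view with the proposed extra field. */
struct sampler_view {
   struct resource *texture;  /* resource into which this is a view */
   enum tex_target target;    /* target the shader will sample with */
};

/* Hypothetical: targets in the same class are assumed to share a
 * layout, so one can be reinterpreted as another. Class 0 means
 * "no casting, only an identical target is acceptable". */
static int layout_class(enum tex_target t)
{
   switch (t) {
   case TEX_1D:
   case TEX_1D_ARRAY:
      return 1;
   case TEX_2D:
   case TEX_2D_ARRAY:
   case TEX_CUBE:
   case TEX_CUBE_ARRAY:
      return 2;
   default:
      return 0;
   }
}

/* Hypothetical check a state tracker or driver might perform before
 * creating a view whose target differs from the resource's target. */
static bool view_target_compatible(enum tex_target res, enum tex_target view)
{
   if (res == view)
      return true;
   return layout_class(res) != 0 && layout_class(res) == layout_class(view);
}
```

With the target stored in the view, a driver would read view->target rather than view->texture->target when setting up sampling state, which addresses the shader/view mismatch Roland points out.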
Re: [Mesa-dev] [PATCH 2/2] mesa/st: add ARB_texture_view support
On Wed, Aug 20, 2014 at 4:12 PM, Roland Scheidegger srol...@vmware.com wrote: Am 20.08.2014 18:48, schrieb Roland Scheidegger: Am 20.08.2014 18:33, schrieb Ilia Mirkin: On Wed, Aug 20, 2014 at 12:22 PM, Jose Fonseca jfons...@vmware.com wrote: On 20/08/14 17:14, Roland Scheidegger wrote: Am 20.08.2014 17:55, schrieb Ilia Mirkin: On Wed, Aug 20, 2014 at 11:47 AM, Jose Fonseca jfons...@vmware.com wrote: On 20/08/14 16:31, Ilia Mirkin wrote: Hm, it's not tested. And you're right, that would (most likely) mess up, since it would only have the pipe_resource's target. Any suggestions on how to fix it? Should the target be added to pipe_sampler_view? On Wed, Aug 20, 2014 at 11:25 AM, Roland Scheidegger srol...@vmware.com wrote: Didn't look at it that closely, but I'm pretty surprised this really works. One things ARB_texture_view can do is cast cube maps (and cube map arrays) to 2d arrays and vice versa (also 1d/2d to the respective array type), and we cannot express that in sampler views (yet) (we can't express it in surfaces neither but there it should not matter). Which means the type used in the shader for sampling will not match the sampler view, which sounds quite broken to me. 
Roland Probably the only sane thing to do eliminate the disctinction between PIPE_TEXTURE_FOO and PIPE_TEXTURE_FOO_ARRAY like in https://urldefense.proofpoint.com/v1/url?u=http://msdn.microsoft.com/en-us/library/windows/desktop/ff476202.aspxk=oIvRg1%2BdGAgOoM1BIlLLqw%3D%3D%0Ar=F4msKE2WxRzA%2BwN%2B25muztFm5TSPwE8HKJfWfR2NgfY%3D%0Am=x03OmuVWAQgfFbsFB2SLMLwSYavxkU8Zsypu9lEIkpg%3D%0As=4b0ca75d91e6d53d92658d9e334cf2a73a01efe5667464c969f5c085409052ff , e.g.,: enum pipe_texture_target { PIPE_BUFFER = 0, PIPE_TEXTURE_1D = 1, PIPE_TEXTURE_2D = 2, PIPE_TEXTURE_3D = 3, PIPE_TEXTURE_CUBE = 4, // Must have same layout as PIPE_TEXTURE_2D PIPE_TEXTURE_RECT = 5, PIPE_TEXTURE_1D_ARRAY = PIPE_TEXTURE_1D, PIPE_TEXTURE_2D_ARRAY = PIPE_TEXTURE_2D, PIPE_TEXTURE_CUBE_ARRAY = PIPE_TEXTURE_CUBE, PIPE_MAX_TEXTURE_TYPES }; We could also remove PIPE_TEXTURE_CUBE and have cube-maps be PIPE_TEXTURE_2D with a flag, but that's probably a lot of work. Instead, drivers that want to be able to support ARB_texture_view will need to ensure PIPE_TEXTURE_CUBE/PIPE_TEXTURE_2D layout match. Another quick + cheap alternative (at least looking at nv50/nvc0 code) would be to pass a separate target parameter to -create_sampler_view(). That would be enough for nouveau, but perhaps not more generally? Take a look at nv50_tex.c:nv50_create_texture_view -- it also needs to work out the depth of the texture (presumably to deal with out-of-bounds accesses) and that is written to the texture info structure. Well that should be enough, but I don't think it fits out design. We've encapsulated other override information like the format in the view already, and I see no reason why the target cast should be treated any different. 
In other words, you're arguing for: diff --git a/src/gallium/include/pipe/p_state.h b/src/gallium/include/pipe/p_state.h index a82686b..c87ac4e 100644 --- a/src/gallium/include/pipe/p_state.h +++ b/src/gallium/include/pipe/p_state.h @@ -333,6 +333,7 @@ struct pipe_surface On struct pipe_sampler_view, I thought... unless I'm misunderstanding. This was also my first thought about fixing this after Roland pointed out the issue. Yes definitely for pipe_sampler_view - d3d10 also has it on the render target / depth stencil views, though so far I'm not convinced there's any value in that (the addressing of cube maps / arrays, 1d / 1d arrays is entirely the same in all cases, what matters is really the first and last layer only). struct pipe_reference reference; struct pipe_resource *texture; /** resource into which this is a view */ struct pipe_context *context; /** context this surface belongs to */ + enum pipe_texture target; Make it pipe_texture_target target ;-) enum pipe_format format; /* XXX width/height should be removed */ It's a fair point. And I don't object that solution. Of course, for this to work, drivers will need to treat the _ARRAY and non _ARRAY targets the same when determining the texture layout for this to work. I just felt this would be a good oportunity to slim down pipe_texture_target too. I'm not sure the _ARRAY distinction still matters at this level, but I suppose it doesn't hurt. Such a cleanup would probably have to be done by someone with a better understanding of gallium than me. OTOH if you guys feel like doing it the sampler_view way will accrue too much technical debt, that's fine too. Unless I hear otherwise, I'm going to try to do it the pipe_sampler_view way tonight. Yes I think it would be a nice cleanup to split it up into two enums. I was mostly proposing just reusing the same enum and keeping
[Mesa-dev] [Bug 79629] [Dri3 bisected] piglit glx_GLX_ARB_create_context_current_with_no_framebuffer fail
https://bugs.freedesktop.org/show_bug.cgi?id=79629

--- Comment #7 from dog paul.a.parent...@intel.com ---
DRI3 is still being developed/stabilized and its interaction with fences is poorly specified. Mesa does not yet utilize the explicit fences implied in the spec. Chris will disable DRI3 by default in the ddx because there are a number of trivial-to-hit bugs that cause X or the compositor to stop updating. You should cross-reference https://bugs.freedesktop.org/show_bug.cgi?id=81551 for the patch that will identify the issue as being the explicit-vs-implicit fencing issue or something new.

-- 
You are receiving this mail because:
You are the assignee for the bug.
[Mesa-dev] [Bug 79629] [Dri3 bisected] piglit glx_GLX_ARB_create_context_current_with_no_framebuffer fail
https://bugs.freedesktop.org/show_bug.cgi?id=79629

dog paul.a.parent...@intel.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Depends on|                            |81551
Re: [Mesa-dev] [PATCH 17/20] i965: Preserve CFG when deleting dead control flow.
On Tue, Aug 19, 2014 at 12:36 PM, Pohjolainen, Topi topi.pohjolai...@intel.com wrote:
> On Tue, Aug 19, 2014 at 12:03:01PM -0700, Matt Turner wrote:
>> By the way, I committed the first 6 patches of the series (the one
>> touching the generators had started to rot). I think other than 16 and
>> 17, the only ones missing review are the patches that add the insertion
>> and removal methods. I sent new versions of them based on your feedback
>> a few days ago.
>
> Oh, so sorry Matt, I somehow forgot to send my r-b, they are just fine.

To make sure I didn't misunderstand: patches 10 and 11 are R-b, or 10, 11, 16, and 17?

I didn't want to slap your R-b on something that wasn't reviewed yet. :)
Re: [Mesa-dev] [PATCH 2/2] mesa/st: add ARB_texture_view support
Am 20.08.2014 22:27, schrieb Ilia Mirkin: On Wed, Aug 20, 2014 at 4:12 PM, Roland Scheidegger srol...@vmware.com wrote: Am 20.08.2014 18:48, schrieb Roland Scheidegger: Am 20.08.2014 18:33, schrieb Ilia Mirkin: On Wed, Aug 20, 2014 at 12:22 PM, Jose Fonseca jfons...@vmware.com wrote: On 20/08/14 17:14, Roland Scheidegger wrote: Am 20.08.2014 17:55, schrieb Ilia Mirkin: On Wed, Aug 20, 2014 at 11:47 AM, Jose Fonseca jfons...@vmware.com wrote: On 20/08/14 16:31, Ilia Mirkin wrote: Hm, it's not tested. And you're right, that would (most likely) mess up, since it would only have the pipe_resource's target. Any suggestions on how to fix it? Should the target be added to pipe_sampler_view? On Wed, Aug 20, 2014 at 11:25 AM, Roland Scheidegger srol...@vmware.com wrote: Didn't look at it that closely, but I'm pretty surprised this really works. One things ARB_texture_view can do is cast cube maps (and cube map arrays) to 2d arrays and vice versa (also 1d/2d to the respective array type), and we cannot express that in sampler views (yet) (we can't express it in surfaces neither but there it should not matter). Which means the type used in the shader for sampling will not match the sampler view, which sounds quite broken to me. 
Roland Probably the only sane thing to do eliminate the disctinction between PIPE_TEXTURE_FOO and PIPE_TEXTURE_FOO_ARRAY like in https://urldefense.proofpoint.com/v1/url?u=http://msdn.microsoft.com/en-us/library/windows/desktop/ff476202.aspxk=oIvRg1%2BdGAgOoM1BIlLLqw%3D%3D%0Ar=F4msKE2WxRzA%2BwN%2B25muztFm5TSPwE8HKJfWfR2NgfY%3D%0Am=x03OmuVWAQgfFbsFB2SLMLwSYavxkU8Zsypu9lEIkpg%3D%0As=4b0ca75d91e6d53d92658d9e334cf2a73a01efe5667464c969f5c085409052ff , e.g.,: enum pipe_texture_target { PIPE_BUFFER = 0, PIPE_TEXTURE_1D = 1, PIPE_TEXTURE_2D = 2, PIPE_TEXTURE_3D = 3, PIPE_TEXTURE_CUBE = 4, // Must have same layout as PIPE_TEXTURE_2D PIPE_TEXTURE_RECT = 5, PIPE_TEXTURE_1D_ARRAY = PIPE_TEXTURE_1D, PIPE_TEXTURE_2D_ARRAY = PIPE_TEXTURE_2D, PIPE_TEXTURE_CUBE_ARRAY = PIPE_TEXTURE_CUBE, PIPE_MAX_TEXTURE_TYPES }; We could also remove PIPE_TEXTURE_CUBE and have cube-maps be PIPE_TEXTURE_2D with a flag, but that's probably a lot of work. Instead, drivers that want to be able to support ARB_texture_view will need to ensure PIPE_TEXTURE_CUBE/PIPE_TEXTURE_2D layout match. Another quick + cheap alternative (at least looking at nv50/nvc0 code) would be to pass a separate target parameter to -create_sampler_view(). That would be enough for nouveau, but perhaps not more generally? Take a look at nv50_tex.c:nv50_create_texture_view -- it also needs to work out the depth of the texture (presumably to deal with out-of-bounds accesses) and that is written to the texture info structure. Well that should be enough, but I don't think it fits out design. We've encapsulated other override information like the format in the view already, and I see no reason why the target cast should be treated any different. 
In other words, you're arguing for: diff --git a/src/gallium/include/pipe/p_state.h b/src/gallium/include/pipe/p_state.h index a82686b..c87ac4e 100644 --- a/src/gallium/include/pipe/p_state.h +++ b/src/gallium/include/pipe/p_state.h @@ -333,6 +333,7 @@ struct pipe_surface On struct pipe_sampler_view, I thought... unless I'm misunderstanding. This was also my first thought about fixing this after Roland pointed out the issue. Yes definitely for pipe_sampler_view - d3d10 also has it on the render target / depth stencil views, though so far I'm not convinced there's any value in that (the addressing of cube maps / arrays, 1d / 1d arrays is entirely the same in all cases, what matters is really the first and last layer only). struct pipe_reference reference; struct pipe_resource *texture; /** resource into which this is a view */ struct pipe_context *context; /** context this surface belongs to */ + enum pipe_texture target; Make it pipe_texture_target target ;-) enum pipe_format format; /* XXX width/height should be removed */ It's a fair point. And I don't object that solution. Of course, for this to work, drivers will need to treat the _ARRAY and non _ARRAY targets the same when determining the texture layout for this to work. I just felt this would be a good oportunity to slim down pipe_texture_target too. I'm not sure the _ARRAY distinction still matters at this level, but I suppose it doesn't hurt. Such a cleanup would probably have to be done by someone with a better understanding of gallium than me. OTOH if you guys feel like doing it the sampler_view way will accrue too much technical debt, that's fine too. Unless I hear otherwise, I'm going to try to do it the pipe_sampler_view way tonight. Yes I think it would be a nice cleanup to split it up into two enums. I was mostly
[Mesa-dev] [Bug 79629] [Dri3 bisected] piglit glx_GLX_ARB_create_context_current_with_no_framebuffer fail
https://bugs.freedesktop.org/show_bug.cgi?id=79629

Bastien Nocera bugzi...@hadess.net changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bugzi...@hadess.net
[Mesa-dev] [Bug 76188] EGL_EXT_image_dma_buf_import fd ownership is incorrect
https://bugs.freedesktop.org/show_bug.cgi?id=76188

--- Comment #8 from Matt Turner matts...@gmail.com ---
(In reply to comment #7)
> I see little risk in cherry-picking the fix to stable branches. The fix is
> isolated and only *removes* code.
>
> I do see risk in not cherry-picking the fix. If an app uses this extension
> with unfixed Mesa 10.2, then that app will leak file descriptors.

Sounds good to me. I just wanted you or Ian to take a look at the patch before we shipped it in a release. :)
Re: [Mesa-dev] [PATCH 1/4] dri/radeon: drop obsolete radeon_{dri, macros}.h headers
Sorry, I don't know much about these drivers to be able to review this. Marek On Wed, Aug 20, 2014 at 9:54 PM, Emil Velikov emil.l.veli...@gmail.com wrote: Both have been unused for at least a couple of years. For example the last user of radeon_macros.h was removed with commit 8c11f0a88300f7bc3f05a12789c781ba0f4b3cc6 Author: Eric Anholt e...@anholt.net Date: Fri Oct 14 13:27:02 2011 -0700 radeon: Drop the legacy BO manager code. Cc: Marek Olšák marek.ol...@amd.com Cc: Michel Dänzer michel.daen...@amd.com Signed-off-by: Emil Velikov emil.l.veli...@gmail.com --- src/mesa/drivers/dri/r200/r200_ioctl.h | 1 - src/mesa/drivers/dri/r200/server/radeon_dri.h | 1 - src/mesa/drivers/dri/r200/server/radeon_macros.h | 1 - src/mesa/drivers/dri/radeon/radeon_screen.c| 1 - src/mesa/drivers/dri/radeon/radeon_screen.h| 3 +- src/mesa/drivers/dri/radeon/server/radeon_dri.h| 115 -- src/mesa/drivers/dri/radeon/server/radeon_macros.h | 128 - 7 files changed, 2 insertions(+), 248 deletions(-) delete mode 12 src/mesa/drivers/dri/r200/server/radeon_dri.h delete mode 12 src/mesa/drivers/dri/r200/server/radeon_macros.h delete mode 100644 src/mesa/drivers/dri/radeon/server/radeon_dri.h delete mode 100644 src/mesa/drivers/dri/radeon/server/radeon_macros.h diff --git a/src/mesa/drivers/dri/r200/r200_ioctl.h b/src/mesa/drivers/dri/r200/r200_ioctl.h index ab5f822..9133a22 100644 --- a/src/mesa/drivers/dri/r200/r200_ioctl.h +++ b/src/mesa/drivers/dri/r200/r200_ioctl.h @@ -36,7 +36,6 @@ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
#define __R200_IOCTL_H__ #include main/simple_list.h -#include radeon_dri.h #include radeon_bo_gem.h #include radeon_cs_gem.h diff --git a/src/mesa/drivers/dri/r200/server/radeon_dri.h b/src/mesa/drivers/dri/r200/server/radeon_dri.h deleted file mode 12 index 27c591d..000 --- a/src/mesa/drivers/dri/r200/server/radeon_dri.h +++ /dev/null @@ -1 +0,0 @@ -../../radeon/server/radeon_dri.h \ No newline at end of file diff --git a/src/mesa/drivers/dri/r200/server/radeon_macros.h b/src/mesa/drivers/dri/r200/server/radeon_macros.h deleted file mode 12 index c56cd73..000 --- a/src/mesa/drivers/dri/r200/server/radeon_macros.h +++ /dev/null @@ -1 +0,0 @@ -../../radeon/server/radeon_macros.h \ No newline at end of file diff --git a/src/mesa/drivers/dri/radeon/radeon_screen.c b/src/mesa/drivers/dri/radeon/radeon_screen.c index 9a6fbbd..044e212 100644 --- a/src/mesa/drivers/dri/radeon/radeon_screen.c +++ b/src/mesa/drivers/dri/radeon/radeon_screen.c @@ -45,7 +45,6 @@ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. #include swrast/s_renderbuffer.h #include radeon_chipset.h -#include radeon_macros.h #include radeon_screen.h #include radeon_common.h #include radeon_common_context.h diff --git a/src/mesa/drivers/dri/radeon/radeon_screen.h b/src/mesa/drivers/dri/radeon/radeon_screen.h index 9b77627..b5cc075 100644 --- a/src/mesa/drivers/dri/radeon/radeon_screen.h +++ b/src/mesa/drivers/dri/radeon/radeon_screen.h @@ -40,8 +40,9 @@ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * IMPORTS: these headers contain all the DRI, X and kernel-related * definitions that we need. 
*/ +#include xf86drm.h +#include radeon_drm.h #include dri_util.h -#include radeon_dri.h #include radeon_chipset.h #include radeon_reg.h #include drm_sarea.h diff --git a/src/mesa/drivers/dri/radeon/server/radeon_dri.h b/src/mesa/drivers/dri/radeon/server/radeon_dri.h deleted file mode 100644 index dc51372..000 --- a/src/mesa/drivers/dri/radeon/server/radeon_dri.h +++ /dev/null @@ -1,115 +0,0 @@ -/** - * \file server/radeon_dri.h - * \brief Radeon server-side structures. - * - * \author Kevin E. Martin mar...@xfree86.org - * \author Rickard E. Faith fa...@valinux.com - */ - -/* - * Copyright 2000 ATI Technologies Inc., Markham, Ontario, - *VA Linux Systems Inc., Fremont, California. - * - * All Rights Reserved. - * - * Permission is hereby granted, free of charge, to any person obtaining - * a copy of this software and associated documentation files (the - * Software), to deal in the Software without restriction, including - * without limitation on the rights to use, copy, modify, merge, - * publish, distribute, sublicense, and/or sell copies of the Software, - * and to permit persons to whom the Software is furnished to do so, - * subject to the following conditions: - * - * The above copyright notice and this permission notice (including the - * next paragraph) shall be included in all copies or substantial - * portions of the Software. - * - * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, - * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF - * MERCHANTABILITY, FITNESS FOR A
Re: [Mesa-dev] [PATCH 1/4] dri/radeon: drop obsolete radeon_{dri, macros}.h headers
No problems Marek. Your name popped up at the top of the list based on
your recent bugfixes in the area. I believe that Michel and/or Alex will
have some (unfortunate) recollection about these drivers :)

-Emil

On 20/08/14 23:05, Marek Olšák wrote:
> Sorry, I don't know much about these drivers to be able to review this.
>
> Marek
>
> On Wed, Aug 20, 2014 at 9:54 PM, Emil Velikov emil.l.veli...@gmail.com wrote:
>> Both have been unused for at least a couple of years. For example the
>> last user of radeon_macros.h was removed with
>>
>> commit 8c11f0a88300f7bc3f05a12789c781ba0f4b3cc6
>> Author: Eric Anholt e...@anholt.net
>> Date: Fri Oct 14 13:27:02 2011 -0700
>>
>>     radeon: Drop the legacy BO manager code.
>>
>> Cc: Marek Olšák marek.ol...@amd.com
>> Cc: Michel Dänzer michel.daen...@amd.com
>> Signed-off-by: Emil Velikov emil.l.veli...@gmail.com
>> ---
>>  src/mesa/drivers/dri/r200/r200_ioctl.h | 1 -
>>  src/mesa/drivers/dri/r200/server/radeon_dri.h | 1 -
>>  src/mesa/drivers/dri/r200/server/radeon_macros.h | 1 -
>>  src/mesa/drivers/dri/radeon/radeon_screen.c | 1 -
>>  src/mesa/drivers/dri/radeon/radeon_screen.h | 3 +-
>>  src/mesa/drivers/dri/radeon/server/radeon_dri.h | 115 --
>>  src/mesa/drivers/dri/radeon/server/radeon_macros.h | 128 -
>>  7 files changed, 2 insertions(+), 248 deletions(-)
>>  delete mode 12 src/mesa/drivers/dri/r200/server/radeon_dri.h
>>  delete mode 12 src/mesa/drivers/dri/r200/server/radeon_macros.h
>>  delete mode 100644 src/mesa/drivers/dri/radeon/server/radeon_dri.h
>>  delete mode 100644 src/mesa/drivers/dri/radeon/server/radeon_macros.h
>>
>> diff --git a/src/mesa/drivers/dri/r200/r200_ioctl.h b/src/mesa/drivers/dri/r200/r200_ioctl.h
>> index ab5f822..9133a22 100644
>> --- a/src/mesa/drivers/dri/r200/r200_ioctl.h
>> +++ b/src/mesa/drivers/dri/r200/r200_ioctl.h
>> @@ -36,7 +36,6 @@ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
>>  #define __R200_IOCTL_H__
>>  #include main/simple_list.h
>> -#include radeon_dri.h
>>  #include radeon_bo_gem.h
>>  #include radeon_cs_gem.h
>> diff --git a/src/mesa/drivers/dri/r200/server/radeon_dri.h b/src/mesa/drivers/dri/r200/server/radeon_dri.h
>> deleted file mode 12
>> index 27c591d..000
>> --- a/src/mesa/drivers/dri/r200/server/radeon_dri.h
>> +++ /dev/null
>> @@ -1 +0,0 @@
>> -../../radeon/server/radeon_dri.h
>> \ No newline at end of file
>> diff --git a/src/mesa/drivers/dri/r200/server/radeon_macros.h b/src/mesa/drivers/dri/r200/server/radeon_macros.h
>> deleted file mode 12
>> index c56cd73..000
>> --- a/src/mesa/drivers/dri/r200/server/radeon_macros.h
>> +++ /dev/null
>> @@ -1 +0,0 @@
>> -../../radeon/server/radeon_macros.h
>> \ No newline at end of file
>> diff --git a/src/mesa/drivers/dri/radeon/radeon_screen.c b/src/mesa/drivers/dri/radeon/radeon_screen.c
>> index 9a6fbbd..044e212 100644
>> --- a/src/mesa/drivers/dri/radeon/radeon_screen.c
>> +++ b/src/mesa/drivers/dri/radeon/radeon_screen.c
>> @@ -45,7 +45,6 @@ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
>>  #include swrast/s_renderbuffer.h
>>  #include radeon_chipset.h
>> -#include radeon_macros.h
>>  #include radeon_screen.h
>>  #include radeon_common.h
>>  #include radeon_common_context.h
>> diff --git a/src/mesa/drivers/dri/radeon/radeon_screen.h b/src/mesa/drivers/dri/radeon/radeon_screen.h
>> index 9b77627..b5cc075 100644
>> --- a/src/mesa/drivers/dri/radeon/radeon_screen.h
>> +++ b/src/mesa/drivers/dri/radeon/radeon_screen.h
>> @@ -40,8 +40,9 @@ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
>>   * IMPORTS: these headers contain all the DRI, X and kernel-related
>>   * definitions that we need.
>>   */
>> +#include xf86drm.h
>> +#include radeon_drm.h
>>  #include dri_util.h
>> -#include radeon_dri.h
>>  #include radeon_chipset.h
>>  #include radeon_reg.h
>>  #include drm_sarea.h
>> diff --git a/src/mesa/drivers/dri/radeon/server/radeon_dri.h b/src/mesa/drivers/dri/radeon/server/radeon_dri.h
>> deleted file mode 100644
>> index dc51372..000
>> --- a/src/mesa/drivers/dri/radeon/server/radeon_dri.h
>> +++ /dev/null
>> @@ -1,115 +0,0 @@
>> -/**
>> - * \file server/radeon_dri.h
>> - * \brief Radeon server-side structures.
>> - *
>> - * \author Kevin E. Martin mar...@xfree86.org
>> - * \author Rickard E. Faith fa...@valinux.com
>> - */
>> -
>> -/*
>> - * Copyright 2000 ATI Technologies Inc., Markham, Ontario,
>> - *                VA Linux Systems Inc., Fremont, California.
>> - *
>> - * All Rights Reserved.
>> - *
>> - * Permission is hereby granted, free of charge, to any person obtaining
>> - * a copy of this software and associated documentation files (the
>> - * "Software"), to deal in the Software without restriction, including
>> - * without limitation on the rights to use, copy, modify, merge,
>> - * publish, distribute, sublicense, and/or sell copies of the Software,
>> - * and to permit persons to whom the Software is furnished to do so,
>> - * subject to the following conditions:
>> - *
>> - * The above copyright notice and this permission notice (including the
>> - * next paragraph) shall
[Mesa-dev] [Bug 82881] New: test_vec4_register_coalesce regression
https://bugs.freedesktop.org/show_bug.cgi?id=82881

          Priority: medium
            Bug ID: 82881
          Keywords: regression
                CC: matts...@gmail.com
          Assignee: mesa-dev@lists.freedesktop.org
           Summary: test_vec4_register_coalesce regression
          Severity: normal
    Classification: Unclassified
                OS: Linux (All)
          Reporter: v...@freedesktop.org
          Hardware: x86-64 (AMD64)
            Status: NEW
           Version: git
         Component: Mesa core
           Product: Mesa

mesa: 04895f5c601b240df547739da786b7c2b65bdd1e (master 10.3.0-devel)

==============================================================
   Mesa 10.3.0-devel: src/mesa/drivers/dri/i965/test-suite.log
==============================================================

# TOTAL: 3
# PASS:  2
# SKIP:  0
# XFAIL: 0
# FAIL:  1
# XPASS: 0
# ERROR: 0

.. contents:: :depth: 2

FAIL: test_vec4_register_coalesce
=================================

Running main() from gtest_main.cc
[==========] Running 5 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 5 tests from register_coalesce_test
[ RUN      ] register_coalesce_test.test_compute_to_mrf
[       OK ] register_coalesce_test.test_compute_to_mrf (0 ms)
[ RUN      ] register_coalesce_test.test_multiple_use
[       OK ] register_coalesce_test.test_multiple_use (0 ms)
[ RUN      ] register_coalesce_test.test_dp4_mrf
[       OK ] register_coalesce_test.test_dp4_mrf (0 ms)
[ RUN      ] register_coalesce_test.test_dp4_grf
[       OK ] register_coalesce_test.test_dp4_grf (0 ms)
[ RUN      ] register_coalesce_test.test_channel_mul_grf
test_vec4_register_coalesce.cpp:247: Failure
Expected: (mul->dst.reg) != (to.reg), actual: 2 vs 2
[  FAILED  ] register_coalesce_test.test_channel_mul_grf (0 ms)
[----------] 5 tests from register_coalesce_test (0 ms total)

[----------] Global test environment tear-down
[==========] 5 tests from 1 test case ran. (0 ms total)
[  PASSED  ] 4 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] register_coalesce_test.test_channel_mul_grf

 1 FAILED TEST

commit 04895f5c601b240df547739da786b7c2b65bdd1e
Author: Matt Turner matts...@gmail.com
Date:   Fri Aug 15 12:32:23 2014 -0700

    i965/vec4: Allow reswizzling writemasks when swizzle is single-valued.

    total instructions in shared programs: 4288033 -> 4266151 (-0.51%)
    instructions in affected programs:     930915 -> 909033 (-2.35%)
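As a quick sanity check on the figures in the quoted commit message, the two percentages can be recomputed from the instruction counts. This is only a sketch of the arithmetic: the helper name percent_change is hypothetical and is not taken from Mesa or from whatever tool produced the statistics.

```python
def percent_change(before: int, after: int) -> float:
    """Relative change from `before` to `after`, in percent (hypothetical helper)."""
    return (after - before) / before * 100.0


# Instruction counts as quoted in the commit message.
total_before, total_after = 4288033, 4266151
affected_before, affected_after = 930915, 909033

# The same number of instructions is saved in both rows; only the
# denominator differs, which is why the second percentage is larger.
print("instructions saved:", total_before - total_after)
print(f"shared programs:   {percent_change(total_before, total_after):.2f}%")
print(f"affected programs: {percent_change(affected_before, affected_after):.2f}%")
```

Both rows report the same absolute saving, so the commit's optimization numbers are internally consistent; the regression reported here concerns the unit test expectation, not these statistics.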