date:20140820

From: Michel Dänzer michel.daen...@amd.com

Signed-off-by: Michel Dänzer michel.daen...@amd.com
---
 src/gallium/auxiliary/gallivm/lp_bld_misc.cpp | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp 
b/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp
index 6bea964..55aa8b9 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp
+++ b/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp
@@ -421,7 +421,11 @@ 
lp_build_create_jit_compiler_for_module(LLVMExecutionEngineRef *OutJIT,
using namespace llvm;
 
std::string Error;
+#if HAVE_LLVM >= 0x0306
+   EngineBuilder builder(std::unique_ptr<Module>(unwrap(M)));
+#else
EngineBuilder builder(unwrap(M));
+#endif
 
/**
 * LLVM 3.1+ haven't more extern unsigned llvm::StackAlignmentOverride and
-- 
2.1.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] st/clover: Fix build against LLVM SVN >= r215967

From: Michel Dänzer michel.daen...@amd.com

Signed-off-by: Michel Dänzer michel.daen...@amd.com
---
 src/gallium/state_trackers/clover/llvm/invocation.cpp | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp 
b/src/gallium/state_trackers/clover/llvm/invocation.cpp
index 5d2efc4..2643cc3 100644
--- a/src/gallium/state_trackers/clover/llvm/invocation.cpp
+++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp
@@ -234,7 +234,11 @@ namespace {
   memcpy(address_spaces, c.getTarget().getAddressSpaceMap(),
 
sizeof(address_spaces));
 
+#if HAVE_LLVM >= 0x0306
+  return act.takeModule().get();
+#else
   return act.takeModule();
+#endif
}
 
void
-- 
2.1.0


Re: [Mesa-dev] [PATCH] gallivm: Fix build with LLVM >= 3.6 r215967.

On 20.08.2014 15:17, Vinson Lee wrote:
 This LLVM 3.6 commit changed EngineBuilder constructor.
 
 commit 3f4ed32b4398eaf4fe0080d8001ba01e6c2f43c8
 Author: Rafael Espindola rafael.espind...@gmail.com
 Date:   Tue Aug 19 04:04:25 2014 +
 
 Make it explicit that ExecutionEngine takes ownership of the modules.
 
 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215967 
 91177308-0d34-0410-b5e6-96231b3b80d8
 
 Signed-off-by: Vinson Lee v...@freedesktop.org
 ---
  src/gallium/auxiliary/gallivm/lp_bld_misc.cpp | 4 
  1 file changed, 4 insertions(+)
 
 diff --git a/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp 
 b/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp
 index 6bea964..55aa8b9 100644
 --- a/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp
 +++ b/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp
 @@ -421,7 +421,11 @@ 
 lp_build_create_jit_compiler_for_module(LLVMExecutionEngineRef *OutJIT,
 using namespace llvm;
  
 std::string Error;
 +#if HAVE_LLVM >= 0x0306
 +   EngineBuilder builder(std::unique_ptr<Module>(unwrap(M)));
 +#else
 EngineBuilder builder(unwrap(M));
 +#endif
  
 /**
  * LLVM 3.1+ haven't more extern unsigned llvm::StackAlignmentOverride 
 and
 

I pushed yours, since you beat me by two minutes (and have a more
detailed commit log). :)


-- 
Earthling Michel Dänzer|  http://www.amd.com
Libre software enthusiast  |Mesa and X developer

Re: [Mesa-dev] [RFC PATCH 00/16] A new IR for Mesa

Connor Abbott cwabbo...@gmail.com writes:

 On Tue, Aug 19, 2014 at 11:40 AM, Francisco Jerez curroje...@riseup.net 
 wrote:
 Tom Stellard t...@stellard.net writes:

 On Tue, Aug 19, 2014 at 11:04:59AM -0400, Connor Abbott wrote:
 On Mon, Aug 18, 2014 at 8:52 PM, Michel Dänzer mic...@daenzer.net wrote:
  On 19.08.2014 01:28, Connor Abbott wrote:
  On Mon, Aug 18, 2014 at 4:32 AM, Michel Dänzer mic...@daenzer.net 
  wrote:
  On 16.08.2014 09:12, Connor Abbott wrote:
  I know what you might be thinking right now. Wait, *another* IR? 
  Don't
  we already have like 5 of those, not counting all the driver-specific
  ones? Isn't this stuff complicated enough already? Well, there are 
  some
  pretty good reasons to start afresh (again...). In the years we've 
  been
  using GLSL IR, we've come to realize that, in fact, it's not what we
  want *at all* to do optimizations on.
 
  Did you evaluate using LLVM IR instead of inventing yet another one?
 
 
  --
  Earthling Michel Dänzer|  
  http://www.amd.com
  Libre software enthusiast  |Mesa and X 
  developer
 
  Yes. See
 
  http://lists.freedesktop.org/archives/mesa-dev/2014-February/053502.html
 
  and
 
  http://lists.freedesktop.org/archives/mesa-dev/2014-February/053522.html
 
  I know Ian can't deal with LLVM for some reason. I was wondering if
  *you* evaluated it, and if so, why you rejected it.
 
 
  --
  Earthling Michel Dänzer|  http://www.amd.com
  Libre software enthusiast  |Mesa and X developer


 Well, first of all, the fact that Ian and Ken don't want to use it
 means that any plan to use LLVM for the Intel driver is dead in the
 water anyways - you can translate NIR into LLVM if you want, but for
 i965 we want to share optimizations between our 2 backends (FS and
 vec4) that we can't do today in GLSL IR so this is what we want to use
 for that, and since nobody else does anything with the core GLSL
 compiler except when they have to, when we start moving things out of
 GLSL IR this will probably replace GLSL IR as the infrastructure that
 all Mesa drivers use. But with that in mind, here are a few reasons
 why we wouldn't want to use LLVM:

 * LLVM wasn't built to understand structured CFG's, meaning that you
 need to re-structurize it using a pass that's fragile and prone to
 break if some other pass optimizes the shader in a way that makes it
 non-structured (i.e. not expressible in terms of loops and if
 statements). This loss of information also means that passes that need
 to know things like, for example, the loop nesting depth need to do an
 analysis pass whereas with NIR you can just walk up the control flow
 tree and count the number of loops we hit.
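
 As a rough illustration of the "walk up the control flow tree" point
 above, here is a minimal C sketch. The cf_node type and its fields are
 hypothetical stand-ins invented for this example; they are not NIR's
 actual data structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical structured control-flow node; not NIR's real types. */
enum cf_node_type {
   CF_NODE_BLOCK,
   CF_NODE_IF,
   CF_NODE_LOOP,
   CF_NODE_FUNCTION
};

struct cf_node {
   enum cf_node_type type;
   struct cf_node *parent; /* NULL at the root of the tree */
};

/* Count the loops that enclose `node` by following parent links.
 * The node itself is not counted, even if it is a loop. */
static unsigned
loop_nesting_depth(const struct cf_node *node)
{
   unsigned depth = 0;

   assert(node != NULL);
   for (const struct cf_node *n = node->parent; n != NULL; n = n->parent) {
      if (n->type == CF_NODE_LOOP)
         depth++;
   }
   return depth;
}
```

 With a structured tree the depth falls out of the parent chain directly,
 so no separate analysis pass over an unstructured CFG is needed.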


 LLVM has a pass to structurize the CFG.  We use it in the radeon
 drivers, and it is run after all of the other LLVM optimizations which have
 no concept of structured CFG.  It's not bug free, but it works really
 well even with all of the complex OpenCL kernels we throw at it.

 Your point about losing information when the CFG is de-structurized is
 valid, but for things like loop depth, I'm not sure why we couldn't write an
 LLVM analysis pass for this (if one doesn't already exist).


 I don't think this is such a big deal either.  At least the
 structurization pass used on newer AMD hardware isn't fragile in the
 way you seem to imply -- AFAIK (unlike the old AMDIL heuristic
 algorithm) it's guaranteed to give you a valid structurized output no
 matter what the previous optimization passes have done to the CFG,
 modulo bugs.  I admit that the situation is nevertheless suboptimal.
 Ideally this information wouldn't get lost along the way.  For the long
 term we may want to represent structured control flow directly in the IR
 as you say, I just don't see how reinventing the IR saves us any work if
 we could just fix the existing one.

 It seems to me that something like how we represent control flow is a
 pretty fundamental part of the IR - it affects any optimization pass
 that needs to do anything beyond adding and removing instructions. How
 would you fix that, especially given that LLVM is primarily designed
 for CPU's where you don't want to be restricted to structured control
 flow at all? It seems like our goals (preserve the structure) conflict
 with the way LLVM has been designed.

I think we can fix this by introducing new structured variants of the
branch instruction in a way that doesn't alter the fundamental structure
of the IR.  E.g. an if branch could look like:

ifbr i1 cond, label iftrue, label iffalse, label join

Where both branches are guaranteed to converge at join.  Sure, this
will require fixing many assumptions, but on the one hand it's not
immediately required (as we can address this problem for the time being
using the same solution AMD uses) and on the other hand it's still less
work than starting from scratch.


 * LLVM doesn't do modifiers, meaning that

[Mesa-dev] [PATCHv3 07/16] glsl: protect anonymous struct id with a mutex

There may be two contexts compiling shaders at the same time, and we want the
anonymous struct id to be globally unique.
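
The idea can be sketched in plain C as follows. This is only an
illustration: it uses a POSIX pthread mutex instead of the mtx_t wrapper
used in the patch, and the helper names next_anon_id and format_anon_name
are made up for this example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

/* Statically initialized, so no separate init call can race. */
static pthread_mutex_t anon_mutex = PTHREAD_MUTEX_INITIALIZER;
static unsigned anon_count = 1;

/* Hypothetical helper: hand out a process-wide unique id.
 * Only the read-and-increment is done under the lock. */
static unsigned
next_anon_id(void)
{
   unsigned count;

   pthread_mutex_lock(&anon_mutex);
   count = anon_count++;
   pthread_mutex_unlock(&anon_mutex);

   return count;
}

/* Hypothetical helper: format the name outside the critical section.
 * Returns the value of snprintf(). */
static int
format_anon_name(char *buf, size_t size)
{
   assert(buf != NULL && size > 0);
   return snprintf(buf, size, "#anon_struct_%04x", next_anon_id());
}
```

Keeping the critical section down to the counter update means the string
formatting never runs while the lock is held.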

Signed-off-by: Chia-I Wu o...@lunarg.com
Reviewed-by: Brian Paul bri...@vmware.com
Reviewed-by: Ian Romanick ian.d.roman...@intel.com
---
 src/glsl/glsl_parser_extras.cpp | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/src/glsl/glsl_parser_extras.cpp b/src/glsl/glsl_parser_extras.cpp
index 490c3c8..b17cdb1 100644
--- a/src/glsl/glsl_parser_extras.cpp
+++ b/src/glsl/glsl_parser_extras.cpp
@@ -1350,9 +1350,15 @@ ast_struct_specifier::ast_struct_specifier(const char 
*identifier,
   ast_declarator_list *declarator_list)
 {
if (identifier == NULL) {
+  static mtx_t mutex = _MTX_INITIALIZER_NP;
   static unsigned anon_count = 1;
-  identifier = ralloc_asprintf(this, "#anon_struct_%04x", anon_count);
-  anon_count++;
+  unsigned count;
+
+  mtx_lock(&mutex);
+  count = anon_count++;
+  mtx_unlock(&mutex);
+
+  identifier = ralloc_asprintf(this, "#anon_struct_%04x", count);
}
name = identifier;
this->declarations.push_degenerate_list_at_head(&declarator_list->link);
-- 
2.0.0.rc2


[Mesa-dev] [PATCHv3 03/16] util: initialize locale_t with a static object

_mesa_strtod and _mesa_strtof may be called from multiple threads.  They need
to be thread-safe.
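
For comparison, a plain C variant of a thread-safe C-locale strtod might
look like the sketch below. It is not what the patch does: the patch creates
the locale in a C++ static constructor and calls strtod_l, whereas this
sketch guards lazy creation with a pthread mutex and switches the calling
thread's locale with POSIX uselocale. The function names are hypothetical,
and a glibc-like POSIX.1-2008 environment is assumed.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <locale.h>
#include <pthread.h>
#include <stdlib.h>

static pthread_mutex_t c_locale_mutex = PTHREAD_MUTEX_INITIALIZER;
static locale_t c_locale = (locale_t) 0;

/* Create the "C" locale object once; the mutex keeps two threads from
 * racing on the first call. Returns (locale_t) 0 if creation failed. */
static locale_t
get_c_locale(void)
{
   locale_t loc;

   pthread_mutex_lock(&c_locale_mutex);
   if (c_locale == (locale_t) 0)
      c_locale = newlocale(LC_ALL_MASK, "C", (locale_t) 0);
   loc = c_locale;
   pthread_mutex_unlock(&c_locale_mutex);

   return loc;
}

/* Parse a double with '.' as the decimal point regardless of the
 * process locale. uselocale() only affects the calling thread. */
static double
strtod_c_locale(const char *s, char **end)
{
   locale_t loc = get_c_locale();
   locale_t old;
   double value;

   assert(s != NULL);
   if (loc == (locale_t) 0)
      return strtod(s, end); /* fall back to the current locale */

   old = uselocale(loc);
   value = strtod(s, end);
   uselocale(old);

   return value;
}
```

The static-object approach in the patch avoids taking a lock on every call;
the sketch trades that for staying in C.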

Signed-off-by: Chia-I Wu o...@lunarg.com
Reviewed-by: Brian Paul bri...@vmware.com
Reviewed-by: Ian Romanick ian.d.roman...@intel.com

v2: platform checks are now done in configure.ac
---
 src/util/strtod.cpp | 18 ++++++++----------
 1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/src/util/strtod.cpp b/src/util/strtod.cpp
index 2a3e8eb..2b4dd98 100644
--- a/src/util/strtod.cpp
+++ b/src/util/strtod.cpp
@@ -36,6 +36,12 @@
 #include "strtod.h"
 
 
+#if defined(_GNU_SOURCE) && defined(HAVE_XLOCALE_H)
+static struct locale_initializer {
+   locale_initializer() { loc = newlocale(LC_CTYPE_MASK, "C", NULL); }
+   locale_t loc;
+} loc_init;
+#endif
 
 /**
  * Wrapper around strtod which uses the C locale so the decimal
@@ -45,11 +51,7 @@ double
 _mesa_strtod(const char *s, char **end)
 {
 #if defined(_GNU_SOURCE) && defined(HAVE_XLOCALE_H)
-   static locale_t loc = NULL;
-   if (!loc) {
-  loc = newlocale(LC_CTYPE_MASK, "C", NULL);
-   }
-   return strtod_l(s, end, loc);
+   return strtod_l(s, end, loc_init.loc);
 #else
return strtod(s, end);
 #endif
@@ -64,11 +66,7 @@ float
 _mesa_strtof(const char *s, char **end)
 {
 #if defined(_GNU_SOURCE) && defined(HAVE_XLOCALE_H)
-   static locale_t loc = NULL;
-   if (!loc) {
-  loc = newlocale(LC_CTYPE_MASK, "C", NULL);
-   }
-   return strtof_l(s, end, loc);
+   return strtof_l(s, end, loc_init.loc);
 #elif defined(HAVE_STRTOF)
return strtof(s, end);
 #else
-- 
2.0.0.rc2


[Mesa-dev] [PATCHv3 11/16] mesa: add infrastructure for threaded shader compilation

Add _mesa_enable_glsl_threadpool to enable the thread pool for a context, and
add ctx->Const.DeferCompileShader and ctx->Const.DeferLinkProgram to
fine-control what gets threaded.

Setting DeferCompileShader to true will make _mesa_glsl_compile_shader be
executed in a worker thread.  The function is thread-safe so there is no
restriction on DeferCompileShader.

Setting DeferLinkProgram to true will make _mesa_glsl_link_shader be executed
in a worker thread.  The function is thread-safe only when certain driver
functions (as documented in struct gl_constants) are thread-safe.  It is
drivers' responsibility to fix those driver functions before setting
DeferLinkProgram.

When DeferLinkProgram is set, drivers are not supposed to inspect the context
in their LinkShader callbacks.  Instead, NotifyLinkShader is added.  Drivers
should inspect the context in NotifyLinkShader and save what they need for
LinkShader in gl_shader_program.

As a final note, most applications will not benefit from threaded shader
compilation because they check GL_COMPILE_STATUS/GL_LINK_STATUS immediately,
giving the worker threads no time to do their jobs.  A possible improvement is
to split LinkShader into two parts: the first part links and error checks
while the second part optimizes and generates the machine code.  With the
split, we can always defer the second part to the thread pool.

Signed-off-by: Chia-I Wu o...@lunarg.com
Reviewed-by: Brian Paul bri...@vmware.com
Reviewed-by: Ian Romanick ian.d.roman...@intel.com

v2:

 - replace void *TaskData by struct gl_context *TaskContext
 - use bool instead of GLboolean internally
 - add more comments to the newly added functions
---
 src/mesa/main/context.c |  29 +++
 src/mesa/main/context.h |   3 ++
 src/mesa/main/dd.h  |   8 +++
 src/mesa/main/mtypes.h  |  34 
 src/mesa/main/pipelineobj.c |  18 +++
 src/mesa/main/shaderapi.c   | 124 +++-
 src/mesa/main/shaderobj.c   |  84 +++---
 src/mesa/main/shaderobj.h   |  55 ++--
 8 files changed, 332 insertions(+), 23 deletions(-)

diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c
index 7a1b6f6..54d1248 100644
--- a/src/mesa/main/context.c
+++ b/src/mesa/main/context.c
@@ -112,6 +112,7 @@
 #include "points.h"
 #include "polygon.h"
 #include "queryobj.h"
+#include "shaderapi.h"
 #include "syncobj.h"
 #include "rastpos.h"
 #include "remap.h"
@@ -133,6 +134,7 @@
 #include "math/m_matrix.h"
 #include "main/dispatch.h" /* for _gloffset_COUNT */
 #include "util/simple_list.h"
+#include "util/threadpool.h"
 
 #ifdef USE_SPARC_ASM
 #include "sparc/sparc.h"
@@ -1187,6 +1189,27 @@ _mesa_create_context(gl_api api,
}
 }
 
+void
+_mesa_enable_glsl_threadpool(struct gl_context *ctx, int max_threads)
+{
+   if (!ctx->ThreadPool)
+  ctx->ThreadPool = _mesa_threadpool_get_singleton(max_threads);
+}
+
+static void
+wait_shader_object_cb(GLuint id, void *data, void *userData)
+{
+   struct gl_context *ctx = (struct gl_context *) userData;
+   struct gl_shader *sh = (struct gl_shader *) data;
+
+   if (_mesa_validate_shader_target(ctx, sh->Type)) {
+  _mesa_wait_shaders(ctx, sh, 1);
+   }
+   else {
+  struct gl_shader_program *shProg = (struct gl_shader_program *) data;
+  _mesa_wait_shader_program(ctx, shProg);
+   }
+}
 
 /**
  * Free the data associated with the given context.
@@ -1205,6 +1228,12 @@ _mesa_free_context_data( struct gl_context *ctx )
   _mesa_make_current(ctx, NULL, NULL);
}
 
+   if (ctx->ThreadPool) {
+  _mesa_HashWalk(ctx->Shared->ShaderObjects, wait_shader_object_cb, ctx);
+  _mesa_threadpool_unref(ctx->ThreadPool);
+  ctx->ThreadPool = NULL;
+   }
+
/* unreference WinSysDraw/Read buffers */
_mesa_reference_framebuffer(&ctx->WinSysDrawBuffer, NULL);
_mesa_reference_framebuffer(&ctx->WinSysReadBuffer, NULL);
diff --git a/src/mesa/main/context.h b/src/mesa/main/context.h
index d902ea7..e81d4f7 100644
--- a/src/mesa/main/context.h
+++ b/src/mesa/main/context.h
@@ -118,6 +118,9 @@ _mesa_create_context(gl_api api,
  const struct dd_function_table *driverFunctions);
 
 extern void
+_mesa_enable_glsl_threadpool(struct gl_context *ctx, int max_threads);
+
+extern void
 _mesa_free_context_data( struct gl_context *ctx );
 
 extern void
diff --git a/src/mesa/main/dd.h b/src/mesa/main/dd.h
index c130b14..9310002 100644
--- a/src/mesa/main/dd.h
+++ b/src/mesa/main/dd.h
@@ -477,6 +477,14 @@ struct dd_function_table {
 */
/*@{*/
/**
+* Called when a shader program is to be linked.
+*
+* This is optional and gives drivers an opportunity to inspect the context
+* and prepare for LinkShader, which may be deferred to another thread.
+*/
+   void (*NotifyLinkShader)(struct gl_context *ctx,
+struct gl_shader_program *shader);
+   /**
 * Called when a shader program is linked.
 *
 * This gives drivers an opportunity to clone

[Mesa-dev] [PATCHv3 05/16] util: add a generic thread pool data structure

It can be used to implement, for example, threaded glCompileShader and
glLinkProgram.  Two basic tests are included to verify the basic functions,
and to give us some confidence about its thread-safety.

v2: allow tasks to complete other tasks

Signed-off-by: Chia-I Wu o...@lunarg.com
Reviewed-by: Brian Paul bri...@vmware.com

v3: move to src/util/ and mention the tests
---
 configure.ac  |   3 +-
 src/util/Makefile.am  |   5 +-
 src/util/Makefile.sources |   3 +-
 src/util/tests/threadpool/Makefile.am |  36 ++
 src/util/tests/threadpool/threadpool_test.cpp | 137 
 src/util/threadpool.c | 476 ++
 src/util/threadpool.h |  67 
 7 files changed, 724 insertions(+), 3 deletions(-)
 create mode 100644 src/util/tests/threadpool/Makefile.am
 create mode 100644 src/util/tests/threadpool/threadpool_test.cpp
 create mode 100644 src/util/threadpool.c
 create mode 100644 src/util/threadpool.h

diff --git a/configure.ac b/configure.ac
index 57e9f7d..2f7268f 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2261,7 +2261,8 @@ AC_CONFIG_FILES([Makefile
src/mesa/drivers/x11/Makefile
src/mesa/main/tests/Makefile
src/util/Makefile
-   src/util/tests/hash_table/Makefile])
+   src/util/tests/hash_table/Makefile
+   src/util/tests/threadpool/Makefile])
 
 dnl Sort the dirs alphabetically
 GALLIUM_TARGET_DIRS=`echo $GALLIUM_TARGET_DIRS|tr " " "\n"|sort -u|tr "\n" " "`
diff --git a/src/util/Makefile.am b/src/util/Makefile.am
index 4733a1a..da6815e 100644
--- a/src/util/Makefile.am
+++ b/src/util/Makefile.am
@@ -19,7 +19,7 @@
 # FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
 # IN THE SOFTWARE.
 
-SUBDIRS = . tests/hash_table
+SUBDIRS = . tests/hash_table tests/threadpool
 
 include Makefile.sources
 
@@ -34,6 +34,9 @@ libmesautil_la_SOURCES = \
$(MESA_UTIL_FILES) \
$(MESA_UTIL_GENERATED_FILES)
 
+libmesautil_la_LIBADD = \
+   $(PTHREAD_LIBS)
+
 BUILT_SOURCES = $(MESA_UTIL_GENERATED_FILES)
 CLEANFILES = $(BUILT_SOURCES)
 
diff --git a/src/util/Makefile.sources b/src/util/Makefile.sources
index 86466dc..65f98f7 100644
--- a/src/util/Makefile.sources
+++ b/src/util/Makefile.sources
@@ -1,7 +1,8 @@
 MESA_UTIL_FILES := \
hash_table.c\
ralloc.c \
-   strtod.cpp
+   strtod.cpp \
+   threadpool.c
 
 MESA_UTIL_GENERATED_FILES = \
format_srgb.c
diff --git a/src/util/tests/threadpool/Makefile.am 
b/src/util/tests/threadpool/Makefile.am
new file mode 100644
index 0000000..2aa010c
--- /dev/null
+++ b/src/util/tests/threadpool/Makefile.am
@@ -0,0 +1,36 @@
+# Copyright © 2009 Intel Corporation
+#
+#  Permission is hereby granted, free of charge, to any person obtaining a
+#  copy of this software and associated documentation files (the "Software"),
+#  to deal in the Software without restriction, including without limitation
+#  on the rights to use, copy, modify, merge, publish, distribute, sub
+#  license, and/or sell copies of the Software, and to permit persons to whom
+#  the Software is furnished to do so, subject to the following conditions:
+#
+#  The above copyright notice and this permission notice (including the next
+#  paragraph) shall be included in all copies or substantial portions of the
+#  Software.
+#
+#  THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+#  IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+#  FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.  IN NO EVENT SHALL
+#  ADAM JACKSON BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
+#  IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+#  CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+
+AM_CPPFLAGS = \
+   -I$(top_srcdir)/include \
+   -I$(top_srcdir)/src/gtest/include \
+   -I$(top_srcdir)/src/util \
+   $(DEFINES)
+
+TESTS = threadpool-test
+
+check_PROGRAMS = threadpool-test
+
+threadpool_test_SOURCES = threadpool_test.cpp
+threadpool_test_CFLAGS = $(PTHREAD_CFLAGS)
+threadpool_test_LDADD =\
+   $(top_builddir)/src/util/libmesautil.la \
+   $(top_builddir)/src/gtest/libgtest.la   \
+   $(PTHREAD_LIBS)
diff --git a/src/util/tests/threadpool/threadpool_test.cpp 
b/src/util/tests/threadpool/threadpool_test.cpp
new file mode 100644
index 0000000..63f55c5
--- /dev/null
+++ b/src/util/tests/threadpool/threadpool_test.cpp
@@ -0,0 +1,137 @@
+/*
+ * Copyright © 2014 LunarG, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish,

[Mesa-dev] [PATCHv3 02/16] configure: check for xlocale.h and strtof

With the assumptions that xlocale.h implies newlocale and strtof_l.  SCons is
updated to define HAVE_XLOCALE_H on linux and darwin.

Signed-off-by: Chia-I Wu o...@lunarg.com
---
 configure.ac|  3 +++
 scons/gallium.py|  4 
 src/util/strtod.cpp | 12 
 3 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/configure.ac b/configure.ac
index be6898f..57e9f7d 100644
--- a/configure.ac
+++ b/configure.ac
@@ -494,6 +494,9 @@ if test x$enable_asm = xyes; then
 esac
 fi
 
+AC_CHECK_HEADER([xlocale.h], [DEFINES="$DEFINES -DHAVE_XLOCALE_H"])
+AC_CHECK_FUNC([strtof], [DEFINES="$DEFINES -DHAVE_STRTOF"])
+
 dnl Check to see if dlopen is in default libraries (like Solaris, which
 dnl has it in libc), or if libdl is needed to get it.
 AC_CHECK_FUNC([dlopen], [DEFINES="$DEFINES -DHAVE_DLOPEN"],
diff --git a/scons/gallium.py b/scons/gallium.py
index e915319..70b40f6 100755
--- a/scons/gallium.py
+++ b/scons/gallium.py
@@ -301,6 +301,10 @@ def generate(env):
 cppdefines += ['HAVE_ALIAS']
 else:
 cppdefines += ['GLX_ALIAS_UNSUPPORTED']
+
+if env['platform'] in ('linux', 'darwin'):
+cppdefines += ['HAVE_XLOCALE_H']
+
 if env['platform'] == 'haiku':
 cppdefines += [
 'HAVE_PTHREAD',
diff --git a/src/util/strtod.cpp b/src/util/strtod.cpp
index 2f1d229..2a3e8eb 100644
--- a/src/util/strtod.cpp
+++ b/src/util/strtod.cpp
@@ -28,7 +28,7 @@
 
 #ifdef _GNU_SOURCE
 #include <locale.h>
-#ifdef __APPLE__
+#ifdef HAVE_XLOCALE_H
 #include <xlocale.h>
 #endif
 #endif
@@ -44,9 +44,7 @@
 double
 _mesa_strtod(const char *s, char **end)
 {
-#if defined(_GNU_SOURCE) && !defined(__CYGWIN__) && !defined(__FreeBSD__) && \
-   !defined(ANDROID) && !defined(__HAIKU__) && !defined(__UCLIBC__) && \
-   !defined(__NetBSD__)
+#if defined(_GNU_SOURCE) && defined(HAVE_XLOCALE_H)
static locale_t loc = NULL;
if (!loc) {
   loc = newlocale(LC_CTYPE_MASK, "C", NULL);
@@ -65,15 +63,13 @@ _mesa_strtod(const char *s, char **end)
 float
 _mesa_strtof(const char *s, char **end)
 {
-#if defined(_GNU_SOURCE) && !defined(__CYGWIN__) && !defined(__FreeBSD__) && \
-   !defined(ANDROID) && !defined(__HAIKU__) && !defined(__UCLIBC__) && \
-   !defined(__NetBSD__)
+#if defined(_GNU_SOURCE) && defined(HAVE_XLOCALE_H)
static locale_t loc = NULL;
if (!loc) {
   loc = newlocale(LC_CTYPE_MASK, "C", NULL);
}
return strtof_l(s, end, loc);
-#elif defined(_ISOC99_SOURCE) || (defined(_XOPEN_SOURCE) && _XOPEN_SOURCE >= 600)
+#elif defined(HAVE_STRTOF)
return strtof(s, end);
 #else
return (float) strtod(s, end);
-- 
2.0.0.rc2


[Mesa-dev] [PATCHv3 10/16] mesa: protect the debug state with a mutex

We are about to change mesa to spawn threads for deferred glCompileShader and
glLinkProgram, and we need to make sure those threads can send compiler
warnings/errors to the debug output safely.
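
The locking discipline this relies on (touch the shared debug state only
under the mutex, but invoke the client's callback after releasing it) can be
sketched as below. The struct and function names are simplified,
hypothetical stand-ins rather than the real gl_debug_state, and a pthread
mutex stands in for the mtx_t wrapper.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

typedef void (*debug_callback)(const char *msg, void *user_data);

/* Hypothetical, reduced debug state. */
struct debug_state {
   pthread_mutex_t mutex;
   debug_callback callback; /* NULL: store messages in the log instead */
   void *callback_data;
   unsigned logged;         /* number of messages stored in the log */
};

/* Emit one message. Shared fields are read or updated under the lock;
 * the callback runs unlocked so that it cannot deadlock by re-entering
 * the debug code. */
static void
debug_emit(struct debug_state *debug, const char *msg)
{
   debug_callback cb;
   void *data;

   assert(debug != NULL && msg != NULL);

   pthread_mutex_lock(&debug->mutex);
   cb = debug->callback;
   data = debug->callback_data;
   if (cb == NULL)
      debug->logged++; /* stands in for appending to the message log */
   pthread_mutex_unlock(&debug->mutex);

   if (cb != NULL)
      cb(msg, data);
}

/* Example callback: counts invocations through user_data. */
static void
count_messages(const char *msg, void *user_data)
{
   (void) msg;
   ++*(unsigned *) user_data;
}
```

Copying the callback pointer and its data while the lock is held gives the
unlocked call a consistent pair even if another thread changes them later.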

Signed-off-by: Chia-I Wu o...@lunarg.com
Reviewed-by: Brian Paul bri...@vmware.com
Reviewed-by: Ian Romanick ian.d.roman...@intel.com
---
 src/mesa/main/errors.c | 172 +++--
 src/mesa/main/mtypes.h |   1 +
 2 files changed, 126 insertions(+), 47 deletions(-)

diff --git a/src/mesa/main/errors.c b/src/mesa/main/errors.c
index 6b55a1d..218b4ee 100644
--- a/src/mesa/main/errors.c
+++ b/src/mesa/main/errors.c
@@ -677,22 +677,41 @@ debug_pop_group(struct gl_debug_state *debug)
 
 
 /**
- * Return debug state for the context.  The debug state will be allocated
- * and initialized upon the first call.
+ * Lock and return debug state for the context.  The debug state will be
+ * allocated and initialized upon the first call.  When NULL is returned, the
+ * debug state is not locked.
  */
 static struct gl_debug_state *
-_mesa_get_debug_state(struct gl_context *ctx)
+_mesa_lock_debug_state(struct gl_context *ctx)
 {
+   mtx_lock(&ctx->DebugMutex);
+
if (!ctx->Debug) {
   ctx->Debug = debug_create();
   if (!ctx->Debug) {
- _mesa_error(ctx, GL_OUT_OF_MEMORY, "allocating debug state");
+ GET_CURRENT_CONTEXT(cur);
+ mtx_unlock(&ctx->DebugMutex);
+
+ /*
+  * This function may be called from other threads.  When that is the
+  * case, we cannot record this OOM error.
+  */
+ if (ctx == cur)
+_mesa_error(ctx, GL_OUT_OF_MEMORY, "allocating debug state");
+
+ return NULL;
   }
}
 
return ctx->Debug;
 }
 
+static void
+_mesa_unlock_debug_state(struct gl_context *ctx)
+{
+   mtx_unlock(&ctx->DebugMutex);
+}
+
 /**
  * Set the integer debug state specified by \p pname.  This can be called from
  * _mesa_set_enable for example.
@@ -700,7 +719,7 @@ _mesa_get_debug_state(struct gl_context *ctx)
 bool
 _mesa_set_debug_state_int(struct gl_context *ctx, GLenum pname, GLint val)
 {
-   struct gl_debug_state *debug = _mesa_get_debug_state(ctx);
+   struct gl_debug_state *debug = _mesa_lock_debug_state(ctx);
 
if (!debug)
   return false;
@@ -717,6 +736,8 @@ _mesa_set_debug_state_int(struct gl_context *ctx, GLenum 
pname, GLint val)
   break;
}
 
+   _mesa_unlock_debug_state(ctx);
+
return true;
 }
 
@@ -730,9 +751,12 @@ _mesa_get_debug_state_int(struct gl_context *ctx, GLenum 
pname)
struct gl_debug_state *debug;
GLint val;
 
+   mtx_lock(&ctx->DebugMutex);
debug = ctx->Debug;
-   if (!debug)
+   if (!debug) {
+  mtx_unlock(&ctx->DebugMutex);
   return 0;
+   }
 
switch (pname) {
case GL_DEBUG_OUTPUT:
@@ -757,6 +781,8 @@ _mesa_get_debug_state_int(struct gl_context *ctx, GLenum 
pname)
   break;
}
 
+   mtx_unlock(&ctx->DebugMutex);
+
return val;
 }
 
@@ -770,9 +796,12 @@ _mesa_get_debug_state_ptr(struct gl_context *ctx, GLenum 
pname)
struct gl_debug_state *debug;
void *val;
 
+   mtx_lock(&ctx->DebugMutex);
debug = ctx->Debug;
-   if (!debug)
+   if (!debug) {
+  mtx_unlock(&ctx->DebugMutex);
   return NULL;
+   }
 
switch (pname) {
case GL_DEBUG_CALLBACK_FUNCTION_ARB:
@@ -787,9 +816,49 @@ _mesa_get_debug_state_ptr(struct gl_context *ctx, GLenum 
pname)
   break;
}
 
+   mtx_unlock(&ctx->DebugMutex);
+
return val;
 }
 
+/**
+ * Insert a debug message.  The mutex is assumed to be locked, and will be
+ * unlocked by this call.
+ */
+static void
+log_msg_locked_and_unlock(struct gl_context *ctx,
+  enum mesa_debug_source source,
+  enum mesa_debug_type type, GLuint id,
+  enum mesa_debug_severity severity,
+  GLint len, const char *buf)
+{
+   struct gl_debug_state *debug = ctx->Debug;
+
+   if (!debug_is_message_enabled(debug, source, type, id, severity)) {
+  _mesa_unlock_debug_state(ctx);
+  return;
+   }
+
+   if (ctx->Debug->Callback) {
+  GLenum gl_source = debug_source_enums[source];
+  GLenum gl_type = debug_type_enums[type];
+  GLenum gl_severity = debug_severity_enums[severity];
+  GLDEBUGPROC callback = ctx->Debug->Callback;
+  const void *data = ctx->Debug->CallbackData;
+
+  /*
+   * When ctx->Debug->SyncOutput is GL_FALSE, the client is prepared for
+   * unsynchronous calls.  When it is GL_TRUE, we will not spawn threads.
+   * In either case, we can call the callback unlocked.
+   */
+  _mesa_unlock_debug_state(ctx);
+  callback(gl_source, gl_type, id, gl_severity, len, buf, data);
+   }
+   else {
+  debug_log_message(ctx->Debug, source, type, id, severity, len, buf);
+  _mesa_unlock_debug_state(ctx);
+   }
+}
 
 /**
  * Log a client or driver debug message.
@@ -799,24 +868,12 @@ log_msg(struct gl_context *ctx, enum mesa_debug_source 
source,
 enum

[Mesa-dev] [PATCHv3 16/16] i965: enable threaded precompile

Inherit gl_shader_program and add save/restore functions to save precompile
results in the shader programs.  When DeferLinkProgram is set, we will save
the precompile results instead of uploading them immediately because we may be
on a different thread.

A few other modifications are also needed.  brw_shader_program_precompile_key
is introduced and initialized in NotifyLinkShader because we cannot inspect
the context during precompiling.
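
A stripped-down sketch of that snapshot pattern follows. All types here are
hypothetical reductions made up for illustration; only the field names
fbo_height and is_user_fbo are taken from the diff below.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-ins for the Mesa/i965 types. */
struct draw_buffer {
   unsigned height;
   bool is_user_fbo;
};

struct context {
   struct draw_buffer *draw_buffer;
};

struct precompile_key {
   unsigned fbo_height;
   bool is_user_fbo;
};

struct shader_program {
   struct precompile_key pre_key;
};

/* Runs on the thread that owns the context: copy everything the
 * precompile step will need, so that the worker never reads ctx. */
static void
notify_link_shader(const struct context *ctx, struct shader_program *prog)
{
   assert(ctx != NULL && ctx->draw_buffer != NULL && prog != NULL);
   prog->pre_key.fbo_height = ctx->draw_buffer->height;
   prog->pre_key.is_user_fbo = ctx->draw_buffer->is_user_fbo;
}

/* May run on a worker thread: reads only the snapshot in prog. */
static bool
precompile_render_to_fbo(const struct shader_program *prog,
                         unsigned nr_color_regions)
{
   assert(prog != NULL);
   return prog->pre_key.is_user_fbo || nr_color_regions > 1;
}
```

Once the key is filled in, later changes to the context's draw buffer do not
affect a precompile that is already queued.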

Signed-off-by: Chia-I Wu o...@lunarg.com
Acked-by: Ian Romanick ian.d.roman...@intel.com
---
 src/mesa/drivers/dri/i965/brw_context.c  |   4 +-
 src/mesa/drivers/dri/i965/brw_fs.cpp |  33 --
 src/mesa/drivers/dri/i965/brw_program.c  |   1 +
 src/mesa/drivers/dri/i965/brw_shader.cpp | 177 ++-
 src/mesa/drivers/dri/i965/brw_shader.h   |  44 
 src/mesa/drivers/dri/i965/brw_vec4_gs.c  |  37 +--
 src/mesa/drivers/dri/i965/brw_vs.c   |  36 +--
 src/mesa/drivers/dri/i965/brw_wm.c   |  23 ++--
 src/mesa/drivers/dri/i965/brw_wm.h   |   5 +-
 9 files changed, 310 insertions(+), 50 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
b/src/mesa/drivers/dri/i965/brw_context.c
index b02128c..70e61f7 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -839,8 +839,8 @@ brwCreateContext(gl_api api,
if (INTEL_DEBUG & DEBUG_SHADER_TIME)
   brw_init_shader_time(brw);
 
-   /* brw_shader_precompile is not thread-safe */
-   if (brw->precompile)
+   /* brw_shader_precompile is not thread-safe when debug flags are set */
+   if (brw->precompile && (INTEL_DEBUG || brw->perf_debug))
   ctx->Const.DeferLinkProgram = GL_FALSE;
 
_mesa_compute_version(ctx);
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 5c70f50..393a262 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -3452,6 +3452,8 @@ bool
 brw_fs_precompile(struct gl_context *ctx, struct gl_shader_program *prog)
 {
struct brw_context *brw = brw_context(ctx);
+   const struct brw_shader_program_precompile_key *pre_key =
+  brw_shader_program_get_precompile_key(prog);
struct brw_wm_prog_key key;
 
if (!prog->_LinkedShaders[MESA_SHADER_FRAGMENT])
@@ -3493,7 +3495,7 @@ brw_fs_precompile(struct gl_context *ctx, struct 
gl_shader_program *prog)
}
 
if (fp->Base.InputsRead & VARYING_BIT_POS) {
-  key.drawable_height = ctx->DrawBuffer->Height;
+  key.drawable_height = pre_key->fbo_height;
}
 
key.nr_color_regions = _mesa_bitcount_64(fp->Base.OutputsWritten &
@@ -3501,7 +3503,7 @@ brw_fs_precompile(struct gl_context *ctx, struct 
gl_shader_program *prog)
  BITFIELD64_BIT(FRAG_RESULT_SAMPLE_MASK)));
 
if ((fp->Base.InputsRead & VARYING_BIT_POS) || program_uses_dfdy) {
-  key.render_to_fbo = _mesa_is_user_fbo(ctx->DrawBuffer) ||
+  key.render_to_fbo = pre_key->is_user_fbo ||
   key.nr_color_regions > 1;
}
 
@@ -3513,13 +3515,28 @@ brw_fs_precompile(struct gl_context *ctx, struct 
gl_shader_program *prog)
 
key.program_string_id = bfp->id;
 
-   uint32_t old_prog_offset = brw->wm.base.prog_offset;
-   struct brw_wm_prog_data *old_prog_data = brw->wm.prog_data;
+   struct brw_wm_compile c;
 
-   bool success = do_wm_prog(brw, prog, bfp, &key);
+   brw_wm_init_compile(brw, prog, bfp, key, c);
+   if (!brw_wm_do_compile(brw, c)) {
+  brw_wm_clear_compile(brw, c);
+  return false;
+   }
+
+   if (brw->ctx.Const.DeferLinkProgram) {
+  brw_shader_program_save_wm_compile(prog, c);
+   }
+   else {
+  uint32_t old_prog_offset = brw->wm.base.prog_offset;
+  struct brw_wm_prog_data *old_prog_data = brw->wm.prog_data;
 
-   brw->wm.base.prog_offset = old_prog_offset;
-   brw->wm.prog_data = old_prog_data;
+  brw_wm_upload_compile(brw, c);
 
-   return success;
+  brw->wm.base.prog_offset = old_prog_offset;
+  brw->wm.prog_data = old_prog_data;
+   }
+
+   brw_wm_clear_compile(brw, c);
+
+   return true;
 }
diff --git a/src/mesa/drivers/dri/i965/brw_program.c b/src/mesa/drivers/dri/i965/brw_program.c
index d782b4f..35fd69a 100644
--- a/src/mesa/drivers/dri/i965/brw_program.c
+++ b/src/mesa/drivers/dri/i965/brw_program.c
@@ -259,6 +259,7 @@ void brwInitFragProgFuncs( struct dd_function_table *functions )
    functions->NewShader = brw_new_shader;
    functions->NewShaderProgram = brw_new_shader_program;
    functions->LinkShader = brw_link_shader;
+   functions->NotifyLinkShader = brw_notify_link_shader;
 }
 
 void
diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp b/src/mesa/drivers/dri/i965/brw_shader.cpp
index 28db29a..29f4a19 100644
--- a/src/mesa/drivers/dri/i965/brw_shader.cpp
+++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
@@ -25,14 +25,52 @@ extern "C" {
 #include "main/macros.h"
 #include "brw_context.h"
 }
+#include "brw_shader.h"
 #include "brw_vs.h"
 #include "brw_vec4_gs.h"
+#include "brw_vec4_gs_visitor.h"
 #include "brw_fs.h"
 #include "brw_cfg.h"

[Mesa-dev] [PATCHv3 04/16] util: move simple_list.h from core to util

It belongs in util, and the thread pool added later in this series needs it
from within util.

Signed-off-by: Chia-I Wu o...@lunarg.com
---
 src/mesa/drivers/dri/i915/i830_texblend.c  |   2 +-
 src/mesa/drivers/dri/i915/intel_syncobj.c  |   2 +-
 src/mesa/drivers/dri/r200/r200_cmdbuf.c|   2 +-
 src/mesa/drivers/dri/r200/r200_context.c   |   3 +-
 src/mesa/drivers/dri/r200/r200_ioctl.h |   2 +-
 src/mesa/drivers/dri/r200/r200_swtcl.c |   2 +-
 src/mesa/drivers/dri/r200/r200_tex.c   |   2 +-
 .../drivers/dri/radeon/radeon_common_context.c |   2 +-
 src/mesa/drivers/dri/radeon/radeon_context.c   |   3 +-
 src/mesa/drivers/dri/radeon/radeon_dma.c   |   2 +-
 src/mesa/drivers/dri/radeon/radeon_ioctl.c |   2 +-
 src/mesa/drivers/dri/radeon/radeon_ioctl.h |   2 +-
 src/mesa/drivers/dri/radeon/radeon_mipmap_tree.c   |   2 +-
 src/mesa/drivers/dri/radeon/radeon_queryobj.c  |   2 +-
 src/mesa/drivers/dri/radeon/radeon_queryobj.h  |   2 +-
 src/mesa/drivers/dri/radeon/radeon_state.c |   2 +-
 src/mesa/drivers/dri/radeon/radeon_swtcl.c |   3 +-
 src/mesa/drivers/dri/radeon/radeon_tex.c   |   2 +-
 src/mesa/main/context.c|   2 +-
 src/mesa/main/enable.c |   2 +-
 src/mesa/main/errors.c |   1 +
 src/mesa/main/light.c  |   2 +-
 src/mesa/main/mtypes.h |   1 -
 src/mesa/main/simple_list.h| 210 -
 src/mesa/program/prog_hash_table.c |   2 +-
 src/mesa/tnl/t_context.c   |   1 +
 src/mesa/tnl/t_rasterpos.c |   2 +-
 src/mesa/tnl/t_vb_light.c  |   2 +-
 src/mesa/tnl/t_vertex_generic.c|   2 +-
 src/mesa/tnl/t_vertex_sse.c|   2 +-
 src/util/simple_list.h | 210 +
 31 files changed, 241 insertions(+), 237 deletions(-)
 delete mode 100644 src/mesa/main/simple_list.h
 create mode 100644 src/util/simple_list.h

diff --git a/src/mesa/drivers/dri/i915/i830_texblend.c 
b/src/mesa/drivers/dri/i915/i830_texblend.c
index 236be59..f55d941 100644
--- a/src/mesa/drivers/dri/i915/i830_texblend.c
+++ b/src/mesa/drivers/dri/i915/i830_texblend.c
@@ -28,9 +28,9 @@
 #include main/glheader.h
 #include main/macros.h
 #include main/mtypes.h
-#include main/simple_list.h
 #include main/enums.h
 #include main/mm.h
+#include util/simple_list.h
 
 #include intel_screen.h
 #include intel_tex.h
diff --git a/src/mesa/drivers/dri/i915/intel_syncobj.c 
b/src/mesa/drivers/dri/i915/intel_syncobj.c
index 9657d9a..95d0b16 100644
--- a/src/mesa/drivers/dri/i915/intel_syncobj.c
+++ b/src/mesa/drivers/dri/i915/intel_syncobj.c
@@ -38,8 +38,8 @@
  * performance bottleneck, though.
  */
 
-#include main/simple_list.h
 #include main/imports.h
+#include util/simple_list.h
 
 #include intel_context.h
 #include intel_batchbuffer.h
diff --git a/src/mesa/drivers/dri/r200/r200_cmdbuf.c 
b/src/mesa/drivers/dri/r200/r200_cmdbuf.c
index 1e6c0d8..13ac5af 100644
--- a/src/mesa/drivers/dri/r200/r200_cmdbuf.c
+++ b/src/mesa/drivers/dri/r200/r200_cmdbuf.c
@@ -35,7 +35,7 @@ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 
SOFTWARE.
 #include main/imports.h
 #include main/macros.h
 #include main/context.h
-#include main/simple_list.h
+#include util/simple_list.h
 
 #include radeon_common.h
 #include r200_context.h
diff --git a/src/mesa/drivers/dri/r200/r200_context.c 
b/src/mesa/drivers/dri/r200/r200_context.c
index 7815c4e..41040a6 100644
--- a/src/mesa/drivers/dri/r200/r200_context.c
+++ b/src/mesa/drivers/dri/r200/r200_context.c
@@ -37,7 +37,6 @@ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 
SOFTWARE.
 #include main/api_arrayelt.h
 #include main/api_exec.h
 #include main/context.h
-#include main/simple_list.h
 #include main/imports.h
 #include main/extensions.h
 #include main/version.h
@@ -50,6 +49,8 @@ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 
SOFTWARE.
 #include tnl/tnl.h
 #include tnl/t_pipeline.h
 
+#include util/simple_list.h
+
 #include drivers/common/driverfuncs.h
 
 #include r200_context.h
diff --git a/src/mesa/drivers/dri/r200/r200_ioctl.h 
b/src/mesa/drivers/dri/r200/r200_ioctl.h
index ab5f822..384787c 100644
--- a/src/mesa/drivers/dri/r200/r200_ioctl.h
+++ b/src/mesa/drivers/dri/r200/r200_ioctl.h
@@ -35,7 +35,7 @@ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 
SOFTWARE.
 #ifndef __R200_IOCTL_H__
 #define __R200_IOCTL_H__
 
-#include main/simple_list.h
+#include util/simple_list.h
 #include radeon_dri.h
 
 #include radeon_bo_gem.h
diff --git a/src/mesa/drivers/dri/r200/r200_swtcl.c 
b/src/mesa/drivers/dri/r200/r200_swtcl.c
index 07c64f8..c324d53 100644
--- a/src/mesa/drivers/dri/r200/r200_swtcl.c
+++ b/src/mesa/drivers/dri/r200/r200_swtcl.c
@@ -39,7 +39,6 @@ WITH THE SOFTWARE OR THE

[Mesa-dev] [PATCHv3 08/16] glsl: protect glsl_type with a mutex

glsl_type has several static hash tables and a static ralloc context.  They
need to be protected by a mutex as they are not thread-safe.
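
For readers who want the locking pattern in isolation, here is a minimal
sketch.  It is not the glsl_type code: the table, the names intern_type and
release_types, and the use of a POSIX pthread mutex (the patch itself uses
mtx_t) are all assumptions made for illustration.  The point it shows is that
the lookup, the lazy insertion, and the teardown of the shared table all run
under the same lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for glsl_type's static hash tables: a tiny
 * interning table shared by every thread in the process. */
#define MAX_TYPES 16

static pthread_mutex_t type_mutex = PTHREAD_MUTEX_INITIALIZER;
static char *type_names[MAX_TYPES];   /* shared state, guarded by type_mutex */
static int num_types;

/* Return the index of `name`, adding it on first use.  Lookup and insertion
 * happen under one lock so two threads asking for the same name cannot both
 * insert it.  Returns -1 if the table is full or the copy cannot be
 * allocated. */
int
intern_type(const char *name)
{
   int idx = -1;

   pthread_mutex_lock(&type_mutex);

   for (int i = 0; i < num_types; i++) {
      if (strcmp(type_names[i], name) == 0) {
         idx = i;
         break;
      }
   }

   if (idx < 0 && num_types < MAX_TYPES) {
      char *copy = malloc(strlen(name) + 1);
      if (copy != NULL) {
         strcpy(copy, name);
         type_names[num_types] = copy;
         idx = num_types++;
      }
   }

   pthread_mutex_unlock(&type_mutex);
   return idx;
}

/* Rough counterpart of _mesa_glsl_release_types(): the teardown takes the
 * same lock as the lookups. */
void
release_types(void)
{
   pthread_mutex_lock(&type_mutex);
   for (int i = 0; i < num_types; i++) {
      free(type_names[i]);
      type_names[i] = NULL;
   }
   num_types = 0;
   pthread_mutex_unlock(&type_mutex);
}
```

Calling intern_type("vec4") twice yields the same index both times, and after
release_types() the table starts again from index 0.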

Signed-off-by: Chia-I Wu o...@lunarg.com
Reviewed-by: Brian Paul bri...@vmware.com
Reviewed-by: Ian Romanick ian.d.roman...@intel.com
---
 src/glsl/glsl_types.cpp | 57 +++--
 src/glsl/glsl_types.h   | 15 +
 2 files changed, 62 insertions(+), 10 deletions(-)

diff --git a/src/glsl/glsl_types.cpp b/src/glsl/glsl_types.cpp
index 66e9b13..74ec40f 100644
--- a/src/glsl/glsl_types.cpp
+++ b/src/glsl/glsl_types.cpp
@@ -29,6 +29,7 @@ extern C {
 #include program/hash_table.h
 }
 
+mtx_t glsl_type::mutex = _MTX_INITIALIZER_NP;
 hash_table *glsl_type::array_types = NULL;
 hash_table *glsl_type::record_types = NULL;
 hash_table *glsl_type::interface_types = NULL;
@@ -53,9 +54,14 @@ glsl_type::glsl_type(GLenum gl_type,
vector_elements(vector_elements), matrix_columns(matrix_columns),
length(0)
 {
+   mtx_lock(glsl_type::mutex);
+
init_ralloc_type_ctx();
assert(name != NULL);
this-name = ralloc_strdup(this-mem_ctx, name);
+
+   mtx_unlock(glsl_type::mutex);
+
/* Neither dimension is zero or both dimensions are zero.
 */
assert((vector_elements == 0) == (matrix_columns == 0));
@@ -71,9 +77,14 @@ glsl_type::glsl_type(GLenum gl_type, glsl_base_type 
base_type,
sampler_array(array), sampler_type(type), interface_packing(0),
length(0)
 {
+   mtx_lock(glsl_type::mutex);
+
init_ralloc_type_ctx();
assert(name != NULL);
this-name = ralloc_strdup(this-mem_ctx, name);
+
+   mtx_unlock(glsl_type::mutex);
+
memset( fields, 0, sizeof(fields));
 
if (base_type == GLSL_TYPE_SAMPLER) {
@@ -95,11 +106,14 @@ glsl_type::glsl_type(const glsl_struct_field *fields, 
unsigned num_fields,
 {
unsigned int i;
 
+   mtx_lock(glsl_type::mutex);
+
init_ralloc_type_ctx();
assert(name != NULL);
this-name = ralloc_strdup(this-mem_ctx, name);
this-fields.structure = ralloc_array(this-mem_ctx,
 glsl_struct_field, length);
+
for (i = 0; i  length; i++) {
   this-fields.structure[i].type = fields[i].type;
   this-fields.structure[i].name = ralloc_strdup(this-fields.structure,
@@ -110,6 +124,8 @@ glsl_type::glsl_type(const glsl_struct_field *fields, 
unsigned num_fields,
   this-fields.structure[i].sample = fields[i].sample;
   this-fields.structure[i].matrix_layout = fields[i].matrix_layout;
}
+
+   mtx_unlock(glsl_type::mutex);
 }
 
 glsl_type::glsl_type(const glsl_struct_field *fields, unsigned num_fields,
@@ -123,6 +139,8 @@ glsl_type::glsl_type(const glsl_struct_field *fields, 
unsigned num_fields,
 {
unsigned int i;
 
+   mtx_lock(glsl_type::mutex);
+
init_ralloc_type_ctx();
assert(name != NULL);
this-name = ralloc_strdup(this-mem_ctx, name);
@@ -138,6 +156,8 @@ glsl_type::glsl_type(const glsl_struct_field *fields, 
unsigned num_fields,
   this-fields.structure[i].sample = fields[i].sample;
   this-fields.structure[i].matrix_layout = fields[i].matrix_layout;
}
+
+   mtx_unlock(glsl_type::mutex);
 }
 
 
@@ -285,6 +305,8 @@ const glsl_type *glsl_type::get_scalar_type() const
 void
 _mesa_glsl_release_types(void)
 {
+   mtx_lock(glsl_type::mutex);
+
if (glsl_type::array_types != NULL) {
   hash_table_dtor(glsl_type::array_types);
   glsl_type::array_types = NULL;
@@ -294,6 +316,8 @@ _mesa_glsl_release_types(void)
   hash_table_dtor(glsl_type::record_types);
   glsl_type::record_types = NULL;
}
+
+   mtx_unlock(glsl_type::mutex);
 }
 
 
@@ -316,7 +340,10 @@ glsl_type::glsl_type(const glsl_type *array, unsigned 
length) :
 * NUL.
 */
const unsigned name_length = strlen(array-name) + 10 + 3;
+
+   mtx_lock(glsl_type::mutex);
char *const n = (char *) ralloc_size(this-mem_ctx, name_length);
+   mtx_unlock(glsl_type::mutex);
 
if (length == 0)
   snprintf(n, name_length, %s[], array-name);
@@ -452,12 +479,6 @@ glsl_type::get_instance(unsigned base_type, unsigned rows, 
unsigned columns)
 const glsl_type *
 glsl_type::get_array_instance(const glsl_type *base, unsigned array_size)
 {
-
-   if (array_types == NULL) {
-  array_types = hash_table_ctor(64, hash_table_string_hash,
-   hash_table_string_compare);
-   }
-
/* Generate a name using the base type pointer in the key.  This is
 * done because the name of the base type may not be unique across
 * shaders.  For example, two shaders may have different record types
@@ -466,9 +487,19 @@ glsl_type::get_array_instance(const glsl_type *base, 
unsigned array_size)
char key[128];
snprintf(key, sizeof(key), %p[%u], (void *) base, array_size);
 
+   mtx_lock(glsl_type::mutex);
+
+   if (array_types == NULL) {
+  array_types = hash_table_ctor(64, hash_table_string_hash,
+   hash_table_string_compare);
+   }
+
const

[Mesa-dev] [PATCHv3 12/16] i965: add drirc option multithread_glsl_compiler

Setting it to a positive value N causes shader compilation to be deferred
to a thread pool.  When N is greater than 1, it is the maximum number of
threads in the pool.  When N is 1, the number of threads is up to the
driver (two for i965).
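
The mapping from the option value to a pool size can be sketched as a small
pure function.  The function name resolve_glsl_compiler_threads is
hypothetical and exists only for this illustration; the logic follows the
brw_process_driconf_options hunk in this patch, where values of zero or less
leave deferred compilation disabled.

```c
#include <assert.h>

/* Map the multithread_glsl_compiler option to a maximum thread count.
 * Returns 0 when deferred compilation stays disabled. */
int
resolve_glsl_compiler_threads(int option_value)
{
   /* Driver default used when the option is exactly 1 (two for i965,
    * per the commit message). */
   const int driver_default_threads = 2;

   if (option_value <= 0)
      return 0;

   return option_value > 1 ? option_value : driver_default_threads;
}
```

Note that N = 1 and N = 2 both end up with a limit of two threads on i965.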

Signed-off-by: Chia-I Wu o...@lunarg.com
Reviewed-by: Ian Romanick ian.d.roman...@intel.com
---
 src/mesa/drivers/dri/common/xmlpool/t_options.h |  4 
 src/mesa/drivers/dri/i965/brw_context.c | 15 +++
 src/mesa/drivers/dri/i965/intel_screen.c|  2 ++
 3 files changed, 21 insertions(+)

diff --git a/src/mesa/drivers/dri/common/xmlpool/t_options.h b/src/mesa/drivers/dri/common/xmlpool/t_options.h
index b73a662..7ac0298 100644
--- a/src/mesa/drivers/dri/common/xmlpool/t_options.h
+++ b/src/mesa/drivers/dri/common/xmlpool/t_options.h
@@ -298,6 +298,10 @@ DRI_CONF_OPT_BEGIN_V(texture_heaps,enum,def,"0:2") \
         DRI_CONF_DESC_END \
 DRI_CONF_OPT_END
 
+#define DRI_CONF_MULTITHREAD_GLSL_COMPILER(def) \
+DRI_CONF_OPT_BEGIN(multithread_glsl_compiler, int, def) \
+        DRI_CONF_DESC(en,gettext("Enable multithreading in the GLSL compiler")) \
+DRI_CONF_OPT_END
 
 
 /**
diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c
index 216b788..b02128c 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -624,6 +624,17 @@ brw_process_driconf_options(struct brw_context *brw)
 
    ctx->Const.AllowGLSLExtensionDirectiveMidShader =
       driQueryOptionb(options, "allow_glsl_extension_directive_midshader");
+
+   const int multithread_glsl_compiler =
+      driQueryOptioni(options, "multithread_glsl_compiler");
+   if (multithread_glsl_compiler > 0) {
+      const int max_threads = (multithread_glsl_compiler > 1) ?
+         multithread_glsl_compiler : 2;
+
+      _mesa_enable_glsl_threadpool(ctx, max_threads);
+      ctx->Const.DeferCompileShader = GL_TRUE;
+      ctx->Const.DeferLinkProgram = GL_TRUE;
+   }
 }
 
 GLboolean
@@ -828,6 +839,10 @@ brwCreateContext(gl_api api,
    if (INTEL_DEBUG & DEBUG_SHADER_TIME)
       brw_init_shader_time(brw);
 
+   /* brw_shader_precompile is not thread-safe */
+   if (brw->precompile)
+      ctx->Const.DeferLinkProgram = GL_FALSE;
+
    _mesa_compute_version(ctx);
 
    _mesa_initialize_dispatch_tables(ctx);
diff --git a/src/mesa/drivers/dri/i965/intel_screen.c b/src/mesa/drivers/dri/i965/intel_screen.c
index 9e743ee..95850c1 100644
--- a/src/mesa/drivers/dri/i965/intel_screen.c
+++ b/src/mesa/drivers/dri/i965/intel_screen.c
@@ -48,6 +48,8 @@ static const __DRIconfigOptionsExtension brw_config_options = {
 DRI_CONF_BEGIN
    DRI_CONF_SECTION_PERFORMANCE
       DRI_CONF_VBLANK_MODE(DRI_CONF_VBLANK_ALWAYS_SYNC)
+      DRI_CONF_MULTITHREAD_GLSL_COMPILER(0)
+
       /* Options correspond to DRI_CONF_BO_REUSE_DISABLED,
        * DRI_CONF_BO_REUSE_ALL
        */
-- 
2.0.0.rc2


[Mesa-dev] [PATCHv3 14/16] i965: refactor do_gs_prog

Split do_gs_prog into

  brw_gs_init_compile
  brw_gs_do_compile
  brw_gs_upload_compile
  brw_gs_clear_compile

Signed-off-by: Chia-I Wu o...@lunarg.com
Acked-by: Ian Romanick ian.d.roman...@intel.com
---
 src/mesa/drivers/dri/i965/brw_vec4_gs.c | 161 
 1 file changed, 102 insertions(+), 59 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs.c 
b/src/mesa/drivers/dri/i965/brw_vec4_gs.c
index 5b2ed51..04407b8 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_gs.c
+++ b/src/mesa/drivers/dri/i965/brw_vec4_gs.c
@@ -33,22 +33,29 @@
 #include brw_state.h
 
 
-static bool
-do_gs_prog(struct brw_context *brw,
-   struct gl_shader_program *prog,
-   struct brw_geometry_program *gp,
-   struct brw_gs_prog_key *key)
+static void
+brw_gs_init_compile(struct brw_context *brw,
+struct gl_shader_program *prog,
+struct brw_geometry_program *gp,
+const struct brw_gs_prog_key *key,
+struct brw_gs_compile *c)
 {
-   struct brw_stage_state *stage_state = brw-gs.base;
-   struct brw_gs_compile c;
-   memset(c, 0, sizeof(c));
-   c.key = *key;
-   c.gp = gp;
+   memset(c, 0, sizeof(*c));
 
-   c.prog_data.include_primitive_id =
-  (gp-program.Base.InputsRead  VARYING_BIT_PRIMITIVE_ID) != 0;
+   c-key = *key;
+   c-gp = gp;
+   c-base.shader_prog = prog;
+   c-base.mem_ctx = ralloc_context(NULL);
+}
 
-   c.prog_data.invocations = gp-program.Invocations;
+static bool
+brw_gs_do_compile(struct brw_context *brw,
+  struct brw_gs_compile *c)
+{
+   c-prog_data.include_primitive_id =
+  (c-gp-program.Base.InputsRead  VARYING_BIT_PRIMITIVE_ID) != 0;
+
+   c-prog_data.invocations = c-gp-program.Invocations;
 
/* Allocate the references to the uniforms that will end up in the
 * prog_data associated with the compiled program, and which will be freed
@@ -58,34 +65,37 @@ do_gs_prog(struct brw_context *brw,
 * padding around uniform values below vec4 size, so the worst case is that
 * every uniform is a float which gets padded to the size of a vec4.
 */
-   struct gl_shader *gs = prog-_LinkedShaders[MESA_SHADER_GEOMETRY];
+   struct gl_shader *gs =
+  c-base.shader_prog-_LinkedShaders[MESA_SHADER_GEOMETRY];
int param_count = gs-num_uniform_components * 4;
 
/* We also upload clip plane data as uniforms */
param_count += MAX_CLIP_PLANES * 4;
 
-   c.prog_data.base.base.param =
+   c-prog_data.base.base.param =
   rzalloc_array(NULL, const gl_constant_value *, param_count);
-   c.prog_data.base.base.pull_param =
+   c-prog_data.base.base.pull_param =
   rzalloc_array(NULL, const gl_constant_value *, param_count);
/* Setting nr_params here NOT to the size of the param and pull_param
 * arrays, but to the number of uniform components vec4_visitor
 * needs. vec4_visitor::setup_uniforms() will set it back to a proper value.
 */
-   c.prog_data.base.base.nr_params = ALIGN(param_count, 4) / 4 + 
gs-num_samplers;
+   c-prog_data.base.base.nr_params =
+  ALIGN(param_count, 4) / 4 + gs-num_samplers;
 
-   if (gp-program.OutputType == GL_POINTS) {
+   if (c-gp-program.OutputType == GL_POINTS) {
   /* When the output type is points, the geometry shader may output data
* to multiple streams, and EndPrimitive() has no effect.  So we
* configure the hardware to interpret the control data as stream ID.
*/
-  c.prog_data.control_data_format = GEN7_GS_CONTROL_DATA_FORMAT_GSCTL_SID;
+  c-prog_data.control_data_format =
+ GEN7_GS_CONTROL_DATA_FORMAT_GSCTL_SID;
 
   /* We only have to emit control bits if we are using streams */
-  if (prog-Geom.UsesStreams)
- c.control_data_bits_per_vertex = 2;
+  if (c-base.shader_prog-Geom.UsesStreams)
+ c-control_data_bits_per_vertex = 2;
   else
- c.control_data_bits_per_vertex = 0;
+ c-control_data_bits_per_vertex = 0;
} else {
   /* When the output type is triangle_strip or line_strip, EndPrimitive()
* may be used to terminate the current strip and start a new one
@@ -93,32 +103,34 @@ do_gs_prog(struct brw_context *brw,
* streams is not supported.  So we configure the hardware to interpret
* the control data as EndPrimitive information (a.k.a. cut bits).
*/
-  c.prog_data.control_data_format = GEN7_GS_CONTROL_DATA_FORMAT_GSCTL_CUT;
+  c-prog_data.control_data_format =
+ GEN7_GS_CONTROL_DATA_FORMAT_GSCTL_CUT;
 
   /* We only need to output control data if the shader actually calls
* EndPrimitive().
*/
-  c.control_data_bits_per_vertex = gp-program.UsesEndPrimitive ? 1 : 0;
+  c-control_data_bits_per_vertex =
+ c-gp-program.UsesEndPrimitive ? 1 : 0;
}
-   c.control_data_header_size_bits =
-  gp-program.VerticesOut * c.control_data_bits_per_vertex;
+   c-control_data_header_size_bits =
+

[Mesa-dev] [PATCHv3 15/16] i965: refactor do_wm_prog

Split do_wm_prog into

  brw_wm_init_compile
  brw_wm_do_compile
  brw_wm_upload_compile
  brw_wm_clear_compile

Add struct brw_wm_compile to be passed around them.
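
As a rough illustration of why the split helps, the sketch below separates a
compile into the same four stages.  Everything in it is hypothetical (struct
compile_job, struct program_cache, and the job_* functions are invented for
this example and are not i965 code); it only assumes that the compile stage
touches nothing but the job, so it could run on a worker thread, while the
upload stage touches shared driver state and stays with the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical compile state, loosely modelled on struct brw_wm_compile. */
struct compile_job {
   const char *source;    /* borrowed input */
   char *program;         /* owned output, NULL until compiled */
   size_t program_size;
};

/* Hypothetical stand-in for the driver's shared program cache. */
struct program_cache {
   size_t uploaded_bytes;
};

/* Stage 1: reset the job and record its inputs. */
void
job_init(struct compile_job *job, const char *source)
{
   memset(job, 0, sizeof(*job));
   job->source = source;
}

/* Stage 2: "compile".  Reads and writes only the job itself. */
bool
job_do_compile(struct compile_job *job)
{
   size_t len;

   if (job->source == NULL || job->source[0] == '\0')
      return false;

   len = strlen(job->source);
   job->program = malloc(len + 1);
   if (job->program == NULL)
      return false;

   memcpy(job->program, job->source, len + 1);
   job->program_size = len;
   return true;
}

/* Stage 3: publish the result to shared state. */
void
job_upload(struct program_cache *cache, const struct compile_job *job)
{
   cache->uploaded_bytes += job->program_size;
}

/* Stage 4: release everything the job owns.  Safe after a failed compile. */
void
job_clear(struct compile_job *job)
{
   free(job->program);
   job->program = NULL;
   job->program_size = 0;
}

/* Synchronous caller: init, compile, upload, clear, with the clear stage
 * also run on the failure path. */
bool
compile_and_upload(struct program_cache *cache, const char *source)
{
   struct compile_job job;

   job_init(&job, source);
   if (!job_do_compile(&job)) {
      job_clear(&job);
      return false;
   }

   job_upload(cache, &job);
   job_clear(&job);
   return true;
}
```

A deferred caller would instead keep the job after the compile stage and run
the upload and clear stages later from the thread that owns the context.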

Signed-off-by: Chia-I Wu o...@lunarg.com
Acked-by: Ian Romanick ian.d.roman...@intel.com
---
 src/mesa/drivers/dri/i965/brw_wm.c | 116 -
 src/mesa/drivers/dri/i965/brw_wm.h |  30 ++
 2 files changed, 106 insertions(+), 40 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_wm.c 
b/src/mesa/drivers/dri/i965/brw_wm.c
index 2e3cd4b..329e82c 100644
--- a/src/mesa/drivers/dri/i965/brw_wm.c
+++ b/src/mesa/drivers/dri/i965/brw_wm.c
@@ -135,27 +135,30 @@ brw_wm_prog_data_compare(const void *in_a, const void 
*in_b)
return true;
 }
 
-/**
- * All Mesa program - GPU code generation goes through this function.
- * Depending on the instructions used (i.e. flow control instructions)
- * we'll use one of two code generators.
- */
-bool do_wm_prog(struct brw_context *brw,
-   struct gl_shader_program *prog,
-   struct brw_fragment_program *fp,
-   struct brw_wm_prog_key *key)
+void
+brw_wm_init_compile(struct brw_context *brw,
+struct gl_shader_program *prog,
+struct brw_fragment_program *fp,
+const struct brw_wm_prog_key *key,
+struct brw_wm_compile *c)
+{
+   memset(c, 0, sizeof(*c));
+
+   c-shader_prog = prog;
+   c-fp = fp;
+   c-key = key;
+   c-mem_ctx = ralloc_context(NULL);
+}
+
+bool
+brw_wm_do_compile(struct brw_context *brw,
+  struct brw_wm_compile *c)
 {
struct gl_context *ctx = brw-ctx;
-   void *mem_ctx = ralloc_context(NULL);
-   struct brw_wm_prog_data prog_data;
-   const GLuint *program;
struct gl_shader *fs = NULL;
-   GLuint program_size;
 
-   if (prog)
-  fs = prog-_LinkedShaders[MESA_SHADER_FRAGMENT];
-
-   memset(prog_data, 0, sizeof(prog_data));
+   if (c-shader_prog)
+  fs = c-shader_prog-_LinkedShaders[MESA_SHADER_FRAGMENT];
 
/* Allocate the references to the uniforms that will end up in the
 * prog_data associated with the compiled program, and which will be freed
@@ -165,43 +168,76 @@ bool do_wm_prog(struct brw_context *brw,
if (fs) {
   param_count = fs-num_uniform_components;
} else {
-  param_count = fp-program.Base.Parameters-NumParameters * 4;
+  param_count = c-fp-program.Base.Parameters-NumParameters * 4;
}
/* The backend also sometimes adds params for texture size. */
param_count += 2 * 
ctx-Const.Program[MESA_SHADER_FRAGMENT].MaxTextureImageUnits;
-   prog_data.base.param =
+   c-prog_data.base.param =
   rzalloc_array(NULL, const gl_constant_value *, param_count);
-   prog_data.base.pull_param =
+   c-prog_data.base.pull_param =
   rzalloc_array(NULL, const gl_constant_value *, param_count);
-   prog_data.base.nr_params = param_count;
+   c-prog_data.base.nr_params = param_count;
 
-   prog_data.barycentric_interp_modes =
-  brw_compute_barycentric_interp_modes(brw, key-flat_shade,
-   key-persample_shading,
-   fp-program);
+   c-prog_data.barycentric_interp_modes =
+  brw_compute_barycentric_interp_modes(brw, c-key-flat_shade,
+   c-key-persample_shading,
+   c-fp-program);
 
-   program = brw_wm_fs_emit(brw, mem_ctx, key, prog_data,
-fp-program, prog, program_size);
-   if (program == NULL) {
-  ralloc_free(mem_ctx);
+   c-program = brw_wm_fs_emit(brw, c-mem_ctx, c-key, c-prog_data,
+   c-fp-program, c-shader_prog,
+   c-program_size);
+   if (c-program == NULL)
   return false;
-   }
-
-   if (prog_data.total_scratch) {
-  brw_get_scratch_bo(brw, brw-wm.base.scratch_bo,
-prog_data.total_scratch * brw-max_wm_threads);
-   }
 
if (unlikely(INTEL_DEBUG  DEBUG_WM))
   fprintf(stderr, \n);
 
+   return true;
+}
+
+void
+brw_wm_upload_compile(struct brw_context *brw,
+  const struct brw_wm_compile *c)
+{
+   if (c-prog_data.total_scratch) {
+  brw_get_scratch_bo(brw, brw-wm.base.scratch_bo,
+ c-prog_data.total_scratch * brw-max_wm_threads);
+   }
+
brw_upload_cache(brw-cache, BRW_WM_PROG,
-   key, sizeof(struct brw_wm_prog_key),
-   program, program_size,
-   prog_data, sizeof(prog_data),
-   brw-wm.base.prog_offset, brw-wm.prog_data);
+c-key, sizeof(struct brw_wm_prog_key),
+c-program, c-program_size,
+c-prog_data, sizeof(c-prog_data),
+brw-wm.base.prog_offset, brw-wm.prog_data);
+}
+
+void
+brw_wm_clear_compile(struct brw_context *brw,
+ struct brw_wm_compile *c)
+{
+

[Mesa-dev] [PATCHv3 09/16] glsl: integrate with the singleton thread pool

The singleton thread pool will be used by contexts to queue compilation tasks.
We need to control its lifetime from the compiler.

Signed-off-by: Chia-I Wu o...@lunarg.com
---
 src/glsl/glsl_parser_extras.cpp | 4 
 1 file changed, 4 insertions(+)

diff --git a/src/glsl/glsl_parser_extras.cpp b/src/glsl/glsl_parser_extras.cpp
index b17cdb1..9342908 100644
--- a/src/glsl/glsl_parser_extras.cpp
+++ b/src/glsl/glsl_parser_extras.cpp
@@ -32,6 +32,7 @@ extern "C" {
 }
 
 #include "util/ralloc.h"
+#include "util/threadpool.h"
 #include "ast.h"
 #include "glsl_parser_extras.h"
 #include "glsl_parser.h"
@@ -1626,6 +1627,8 @@ extern "C" {
 void
 _mesa_destroy_shader_compiler(void)
 {
+   _mesa_threadpool_destroy_singleton();
+
    _mesa_destroy_shader_compiler_caches();
 
    _mesa_glsl_release_types();
@@ -1639,6 +1642,7 @@ _mesa_destroy_shader_compiler(void)
 void
 _mesa_destroy_shader_compiler_caches(void)
 {
+   _mesa_threadpool_wait_singleton();
    _mesa_glsl_release_builtin_functions();
 }
 
 
-- 
2.0.0.rc2


[Mesa-dev] [PATCHv3 01/16] util: add _mesa_strtod and _mesa_strtof

Both core mesa and glsl have their own wrappers for strtof_l.  Merge and move
them to util/.  They are compiled with a C++ compiler so that we can make them
thread-safe in a following commit.

Signed-off-by: Chia-I Wu o...@lunarg.com
---
 src/glsl/Makefile.sources|  3 +-
 src/glsl/glsl_lexer.ll   | 12 +++---
 src/glsl/s_expression.cpp|  2 +-
 src/glsl/s_expression.h  |  2 +-
 src/glsl/strtod.c| 79 ---
 src/glsl/strtod.h| 46 ---
 src/mesa/main/imports.c  | 19 --
 src/mesa/main/imports.h  |  3 --
 src/mesa/program/program_lexer.l |  1 +
 src/util/Makefile.sources|  3 +-
 src/util/strtod.cpp  | 81 
 src/util/strtod.h| 46 +++
 12 files changed, 139 insertions(+), 158 deletions(-)
 delete mode 100644 src/glsl/strtod.c
 delete mode 100644 src/glsl/strtod.h
 create mode 100644 src/util/strtod.cpp
 create mode 100644 src/util/strtod.h

diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources
index 2131dda..472ad89 100644
--- a/src/glsl/Makefile.sources
+++ b/src/glsl/Makefile.sources
@@ -101,8 +101,7 @@ LIBGLSL_FILES = \
$(GLSL_SRCDIR)/opt_swizzle_swizzle.cpp \
$(GLSL_SRCDIR)/opt_tree_grafting.cpp \
$(GLSL_SRCDIR)/opt_vectorize.cpp \
-   $(GLSL_SRCDIR)/s_expression.cpp \
-   $(GLSL_SRCDIR)/strtod.c
+   $(GLSL_SRCDIR)/s_expression.cpp
 
 # glsl_compiler
 
diff --git a/src/glsl/glsl_lexer.ll b/src/glsl/glsl_lexer.ll
index b7c4aad..ed2f26d 100644
--- a/src/glsl/glsl_lexer.ll
+++ b/src/glsl/glsl_lexer.ll
@@ -23,7 +23,7 @@
  */
 #include ctype.h
 #include limits.h
-#include strtod.h
+#include util/strtod.h
 #include ast.h
 #include glsl_parser_extras.h
 #include glsl_parser.h
@@ -448,23 +448,23 @@ layout{
}
 
 [0-9]+\.[0-9]+([eE][+-]?[0-9]+)?[fF]?  {
-   yylval-real = glsl_strtof(yytext, NULL);
+   yylval-real = _mesa_strtof(yytext, NULL);
return FLOATCONSTANT;
}
 \.[0-9]+([eE][+-]?[0-9]+)?[fF]?{
-   yylval-real = glsl_strtof(yytext, NULL);
+   yylval-real = _mesa_strtof(yytext, NULL);
return FLOATCONSTANT;
}
 [0-9]+\.([eE][+-]?[0-9]+)?[fF]?{
-   yylval-real = glsl_strtof(yytext, NULL);
+   yylval-real = _mesa_strtof(yytext, NULL);
return FLOATCONSTANT;
}
 [0-9]+[eE][+-]?[0-9]+[fF]? {
-   yylval-real = glsl_strtof(yytext, NULL);
+   yylval-real = _mesa_strtof(yytext, NULL);
return FLOATCONSTANT;
}
 [0-9]+[fF] {
-   yylval-real = glsl_strtof(yytext, NULL);
+   yylval-real = _mesa_strtof(yytext, NULL);
return FLOATCONSTANT;
}
 
diff --git a/src/glsl/s_expression.cpp b/src/glsl/s_expression.cpp
index 1a28e1d..2928a4d 100644
--- a/src/glsl/s_expression.cpp
+++ b/src/glsl/s_expression.cpp
@@ -73,7 +73,7 @@ read_atom(void *ctx, const char *src, char *symbol_buffer)
} else {
   // Check if the atom is a number.
   char *float_end = NULL;
-  float f = glsl_strtof(src, float_end);
+  float f = _mesa_strtof(src, float_end);
   if (float_end != src) {
  char *int_end = NULL;
  int i = strtol(src, int_end, 10);
diff --git a/src/glsl/s_expression.h b/src/glsl/s_expression.h
index 642af19..1d47535 100644
--- a/src/glsl/s_expression.h
+++ b/src/glsl/s_expression.h
@@ -27,7 +27,7 @@
 #define S_EXPRESSION_H
 
 #include main/core.h /* for Elements */
-#include strtod.h
+#include util/strtod.h
 #include list.h
 
 /* Type-safe downcasting macros (also safe to pass NULL) */
diff --git a/src/glsl/strtod.c b/src/glsl/strtod.c
deleted file mode 100644
index 5d4346b..000
--- a/src/glsl/strtod.c
+++ /dev/null
@@ -1,79 +0,0 @@
-/*
- * Copyright 2010 VMware, Inc.
- * All Rights Reserved.
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the
- * Software), to deal in the Software without restriction, including
- * without limitation the rights to use, copy, modify, merge, publish,
- * distribute, sub license, and/or sell copies of the Software, and to
- * permit persons to whom the Software is furnished to do so, subject to
- * the following conditions:
- *
- * The above copyright notice and this permission notice (including the
- * next paragraph) shall be included in all copies or substantial portions
- * of the Software.
- *
- * THE SOFTWARE IS

[Mesa-dev] [PATCHv3 00/16] multithread shader compiler

Hi,

This is v3 of the series.  It should have all the changes I promised to make.
Some patches are new or have been split out because _mesa_strtof, simple_list,
and the thread pool have now moved to src/util/.  To summarize,

Patches 1-3 merge the mesa and glsl strtof wrappers and move them to src/util/.
They go on to clean up the #ifdef hell and make the wrappers thread-safe.

Patches 4-6 add a generic thread pool to src/util/.  Patch 4 moves
simple_list.h from core to util because the thread pool needs it.

Patches 7-11 fix thread-safety issues in the frontend compiler and add the
infrastructure for multithreaded compilation to the core.

Patches 12-16 fix i965 and add the drirc option to enable the multithreaded
compiler.

[Mesa-dev] [PATCHv3 06/16] util: allow the thread pool to be used as a singleton

To have real control over the number of driver threads, we almost never want
more than a single thread pool.
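
The singleton behaviour can be sketched without any worker threads.  The
code below is an illustration only: struct pool and the pool_* functions are
hypothetical, a POSIX pthread mutex stands in for mtx_t, and a single lock
guards everything (the real pool also has its own per-pool mutex).  It shows
three properties the patch relies on: max_threads is honored only by the call
that creates the pool, every caller gets its own reference, and once the
singleton is destroyed no new work is accepted even though outstanding
references remain valid.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, heavily reduced pool: only the bookkeeping that a
 * singleton wrapper needs. */
struct pool {
   int refcnt;
   int max_threads;
   bool shutdown;
};

static pthread_mutex_t singleton_lock = PTHREAD_MUTEX_INITIALIZER;
static struct pool *singleton;

/* Create the pool on first use and hand out one reference per call.
 * max_threads is ignored once the pool exists. */
struct pool *
pool_get_singleton(int max_threads)
{
   struct pool *p;

   pthread_mutex_lock(&singleton_lock);
   if (singleton == NULL) {
      singleton = calloc(1, sizeof(*singleton));
      if (singleton != NULL) {
         singleton->refcnt = 1;   /* reference held by the global pointer */
         singleton->max_threads = max_threads;
      }
   }
   if (singleton != NULL)
      singleton->refcnt++;        /* reference returned to the caller */
   p = singleton;                 /* read the global while still locked */
   pthread_mutex_unlock(&singleton_lock);

   return p;
}

/* New work is refused once shutdown has been requested. */
bool
pool_can_queue(struct pool *p)
{
   bool ok;

   pthread_mutex_lock(&singleton_lock);
   ok = !p->shutdown;
   pthread_mutex_unlock(&singleton_lock);
   return ok;
}

/* Drop one reference and free the pool on the last one. */
void
pool_unref(struct pool *p)
{
   pthread_mutex_lock(&singleton_lock);
   if (--p->refcnt == 0)
      free(p);
   pthread_mutex_unlock(&singleton_lock);
}

/* Mark the pool as shut down and drop the global reference.  References
 * held by callers stay valid until they call pool_unref(). */
void
pool_destroy_singleton(void)
{
   pthread_mutex_lock(&singleton_lock);
   if (singleton != NULL) {
      singleton->shutdown = true;
      if (--singleton->refcnt == 0)
         free(singleton);
      singleton = NULL;
   }
   pthread_mutex_unlock(&singleton_lock);
}
```

Two callers asking for 4 and then 8 threads share one pool limited to 4.  A
later call made after the pool has been destroyed and fully released creates
a fresh pool with its own limit.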

Signed-off-by: Chia-I Wu o...@lunarg.com
Reviewed-by: Brian Paul bri...@vmware.com
Reviewed-by: Ian Romanick ian.d.roman...@intel.com

v2: split glsl changes to another commit
---
 src/util/threadpool.c | 72 +++
 src/util/threadpool.h |  9 +++
 2 files changed, 81 insertions(+)

diff --git a/src/util/threadpool.c b/src/util/threadpool.c
index 224f411..9dc91f9 100644
--- a/src/util/threadpool.c
+++ b/src/util/threadpool.c
@@ -55,6 +55,7 @@ struct _mesa_threadpool_task {
 struct _mesa_threadpool {
    mtx_t mutex;
    int refcnt;
+   bool shutdown;
 
    enum _mesa_threadpool_control thread_control;
    thrd_t *threads;
@@ -168,6 +169,12 @@ _mesa_threadpool_queue_task(struct _mesa_threadpool *pool,
 
    mtx_lock(&pool->mutex);
 
+   if (unlikely(pool->shutdown)) {
+      mtx_unlock(&pool->mutex);
+      free(task);
+      return NULL;
+   }
+
    /* someone is joining with the threads */
    while (unlikely(pool->thread_control != MESA_THREADPOOL_NORMAL))
       cnd_wait(&pool->thread_joined, &pool->mutex);
@@ -379,6 +386,17 @@ _mesa_threadpool_join(struct _mesa_threadpool *pool, bool graceful)
 }
 
 /**
+ * After this call, no task can be queued.
+ */
+static void
+_mesa_threadpool_set_shutdown(struct _mesa_threadpool *pool)
+{
+   mtx_lock(&pool->mutex);
+   pool->shutdown = true;
+   mtx_unlock(&pool->mutex);
+}
+
+/**
  * Decrease the reference count.  Destroy \p pool when the reference count
  * reaches zero.
  */
@@ -474,3 +492,57 @@ _mesa_threadpool_create(int max_threads)
 
    return pool;
 }
+
+static mtx_t threadpool_lock = _MTX_INITIALIZER_NP;
+static struct _mesa_threadpool *threadpool;
+
+/**
+ * Get the singleton thread pool.  \p max_threads is honored only by the first
+ * call to this function.
+ */
+struct _mesa_threadpool *
+_mesa_threadpool_get_singleton(int max_threads)
+{
+   mtx_lock(&threadpool_lock);
+   if (!threadpool)
+      threadpool = _mesa_threadpool_create(max_threads);
+   if (threadpool)
+      _mesa_threadpool_ref(threadpool);
+   mtx_unlock(&threadpool_lock);
+
+   return threadpool;
+}
+
+/**
+ * Wait until all tasks are completed and threads are joined.
+ */
+void
+_mesa_threadpool_wait_singleton(void)
+{
+   mtx_lock(&threadpool_lock);
+   if (threadpool)
+      _mesa_threadpool_join(threadpool, true);
+   mtx_unlock(&threadpool_lock);
+}
+
+/**
+ * Destroy the singleton thread pool.
+ */
+void
+_mesa_threadpool_destroy_singleton(void)
+{
+   mtx_lock(&threadpool_lock);
+   if (threadpool) {
+      /*
+       * No new task is allowed since this point.  But whoever owns references
+       * to the pool can still complete tasks that have been queued (which
+       * will simply destroy the tasks as all tasks are marked cancelled).
+       */
+      _mesa_threadpool_set_shutdown(threadpool);
+
+      _mesa_threadpool_join(threadpool, false);
+      _mesa_threadpool_unref(threadpool);
+      threadpool = NULL;
+   }
+   mtx_unlock(&threadpool_lock);
+}
diff --git a/src/util/threadpool.h b/src/util/threadpool.h
index 48e4a47..aeda9d3 100644
--- a/src/util/threadpool.h
+++ b/src/util/threadpool.h
@@ -60,6 +60,15 @@ bool
 _mesa_threadpool_complete_task(struct _mesa_threadpool *pool,
                                struct _mesa_threadpool_task *task);
 
+struct _mesa_threadpool *
+_mesa_threadpool_get_singleton(int max_threads);
+
+void
+_mesa_threadpool_wait_singleton(void);
+
+void
+_mesa_threadpool_destroy_singleton(void);
+
 #ifdef __cplusplus
 }
 #endif
-- 
2.0.0.rc2


[Mesa-dev] [PATCHv3 13/16] i965: refactor do_vs_prog

Split do_vs_prog into

  brw_vs_init_compile
  brw_vs_do_compile
  brw_vs_upload_compile
  brw_vs_clear_compile

Signed-off-by: Chia-I Wu o...@lunarg.com
Acked-by: Ian Romanick ian.d.roman...@intel.com
---
 src/mesa/drivers/dri/i965/brw_vec4.h |   6 ++
 src/mesa/drivers/dri/i965/brw_vs.c   | 121 ++-
 src/mesa/drivers/dri/i965/brw_vs.h   |   1 +
 3 files changed, 83 insertions(+), 45 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h 
b/src/mesa/drivers/dri/i965/brw_vec4.h
index f0239cb..f0e9f10 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4.h
@@ -47,6 +47,12 @@ extern "C" {
 
 struct brw_vec4_compile {
GLuint last_scratch; /** measured in 32-byte (register size) units */
+
+   struct gl_shader_program *shader_prog;
+
+   void *mem_ctx;
+   const unsigned *program;
+   unsigned program_size;
 };
 
 
diff --git a/src/mesa/drivers/dri/i965/brw_vs.c 
b/src/mesa/drivers/dri/i965/brw_vs.c
index 4574c3e..8e3dcf4 100644
--- a/src/mesa/drivers/dri/i965/brw_vs.c
+++ b/src/mesa/drivers/dri/i965/brw_vs.c
@@ -187,31 +187,31 @@ brw_vs_prog_data_compare(const void *in_a, const void 
*in_b)
return true;
 }
 
-static bool
-do_vs_prog(struct brw_context *brw,
-  struct gl_shader_program *prog,
-  struct brw_vertex_program *vp,
-  struct brw_vs_prog_key *key)
+static void
+brw_vs_init_compile(struct brw_context *brw,
+struct gl_shader_program *prog,
+struct brw_vertex_program *vp,
+const struct brw_vs_prog_key *key,
+struct brw_vs_compile *c)
 {
-   GLuint program_size;
-   const GLuint *program;
-   struct brw_vs_compile c;
-   struct brw_vs_prog_data prog_data;
-   struct brw_stage_prog_data *stage_prog_data = &prog_data.base.base;
-   void *mem_ctx;
-   int i;
-   struct gl_shader *vs = NULL;
-
-   if (prog)
-  vs = prog->_LinkedShaders[MESA_SHADER_VERTEX];
+   memset(c, 0, sizeof(*c));
 
-   memset(&c, 0, sizeof(c));
-   memcpy(&c.key, key, sizeof(*key));
-   memset(&prog_data, 0, sizeof(prog_data));
+   memcpy(&c->key, key, sizeof(*key));
+   c->vp = vp;
+   c->base.shader_prog = prog;
+   c->base.mem_ctx = ralloc_context(NULL);
+}
 
-   mem_ctx = ralloc_context(NULL);
+static bool
+brw_vs_do_compile(struct brw_context *brw,
+  struct brw_vs_compile *c)
+{
+   struct brw_stage_prog_data *stage_prog_data = &c->prog_data.base.base;
+   struct gl_shader *vs = NULL;
+   int i;
 
-   c.vp = vp;
+   if (c->base.shader_prog)
+  vs = c->base.shader_prog->_LinkedShaders[MESA_SHADER_VERTEX];
 
/* Allocate the references to the uniforms that will end up in the
 * prog_data associated with the compiled program, and which will be freed
@@ -226,12 +226,12 @@ do_vs_prog(struct brw_context *brw,
   param_count = vs->num_uniform_components * 4;
 
} else {
-  param_count = vp->program.Base.Parameters->NumParameters * 4;
+  param_count = c->vp->program.Base.Parameters->NumParameters * 4;
}
/* vec4_visitor::setup_uniform_clipplane_values() also uploads user clip
 * planes as uniforms.
 */
-   param_count += c.key.base.nr_userclip_plane_consts * 4;
+   param_count += c->key.base.nr_userclip_plane_consts * 4;
 
stage_prog_data->param =
   rzalloc_array(NULL, const gl_constant_value *, param_count);
@@ -247,12 +247,12 @@ do_vs_prog(struct brw_context *brw,
   stage_prog_data->nr_params += vs->num_samplers;
}
 
-   GLbitfield64 outputs_written = vp->program.Base.OutputsWritten;
-   prog_data.inputs_read = vp->program.Base.InputsRead;
+   GLbitfield64 outputs_written = c->vp->program.Base.OutputsWritten;
+   c->prog_data.inputs_read = c->vp->program.Base.InputsRead;
 
-   if (c.key.copy_edgeflag) {
+   if (c->key.copy_edgeflag) {
   outputs_written |= BITFIELD64_BIT(VARYING_SLOT_EDGE);
-  prog_data.inputs_read |= VERT_BIT_EDGEFLAG;
+  c->prog_data.inputs_read |= VERT_BIT_EDGEFLAG;
}
 
if (brw->gen < 6) {
@@ -263,7 +263,7 @@ do_vs_prog(struct brw_context *brw,
* coords, which would be a pain to handle.
*/
   for (i = 0; i < 8; i++) {
- if (c.key.point_coord_replace & (1 << i))
+ if (c->key.point_coord_replace & (1 << i))
 outputs_written |= BITFIELD64_BIT(VARYING_SLOT_TEX0 + i);
   }
 
@@ -278,45 +278,76 @@ do_vs_prog(struct brw_context *brw,
 * distance varying slots whenever clipping is enabled, even if the vertex
 * shader doesn't write to gl_ClipDistance.
 */
-   if (c.key.base.userclip_active) {
+   if (c->key.base.userclip_active) {
   outputs_written |= BITFIELD64_BIT(VARYING_SLOT_CLIP_DIST0);
   outputs_written |= BITFIELD64_BIT(VARYING_SLOT_CLIP_DIST1);
}
 
-   brw_compute_vue_map(brw, &prog_data.base.vue_map, outputs_written);
+   brw_compute_vue_map(brw, &c->prog_data.base.vue_map, outputs_written);
 
if (0) {
-  _mesa_fprint_program_opt(stderr, &c.vp->program.Base, PROG_PRINT_DEBUG,
-
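The patch text is cut off above, so only the init and compile steps are visible. As a rough C++ sketch of the split named in the commit message (init, compile, upload, clear), assuming nothing about the real i965 code: Compile, init_compile(), do_compile(), upload_compile(), clear_compile() and compile_and_upload() are all hypothetical, and the "code generation" is a toy.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical compile state, standing in for struct brw_vs_compile.
struct Compile {
    std::string source;              // input, set by init
    std::vector<unsigned> program;   // output, set by do_compile
    bool uploaded = false;
};

static void init_compile(Compile &c, const std::string &source)
{
    c = Compile{};        // same role as memset(c, 0, sizeof(*c))
    c.source = source;
}

static bool do_compile(Compile &c)
{
    if (c.source.empty())
        return false;
    // Toy "code generation": one word per input character.
    for (unsigned char ch : c.source)
        c.program.push_back(ch);
    return true;
}

static void upload_compile(Compile &c, std::vector<unsigned> &cache)
{
    cache.insert(cache.end(), c.program.begin(), c.program.end());
    c.uploaded = true;
}

static void clear_compile(Compile &c)
{
    c.program.clear();
    c.source.clear();
}

// The former monolithic entry point becomes a thin driver over the phases.
static bool compile_and_upload(const std::string &source,
                               std::vector<unsigned> &cache)
{
    Compile c;
    init_compile(c, source);
    bool ok = do_compile(c);
    if (ok)
        upload_compile(c, cache);
    clear_compile(c);
    return ok;
}
```

Keeping all per-compile state in one struct means the middle step no longer depends on locals of the caller, so it could in principle run somewhere else (for example on a worker thread) between init and upload.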

[Mesa-dev] [PATCH 2/2] mesa/st: add ARB_texture_view support

Signed-off-by: Ilia Mirkin imir...@alum.mit.edu
---

No piglit regressions on nvc0 except for gl-3.0-render-integer, which appears
to now fail even without this commit, despite the fact that I'm fairly sure it
used to work fine. Same failure with llvmpipe...

It's most likely that I've missed some details. It's unclear whether
e.g. glGenerateMipmap should work on a view. However the piglits that exist do
all pass on nvc0 and llvmpipe.

 docs/GL3.txt |  2 +-
 docs/relnotes/10.3.html  |  1 +
 src/mesa/state_tracker/st_atom_texture.c | 28 +++
 src/mesa/state_tracker/st_cb_fbo.c   | 10 ++
 src/mesa/state_tracker/st_cb_texture.c   | 62 +++-
 src/mesa/state_tracker/st_extensions.c   |  1 +
 src/mesa/state_tracker/st_format.c   |  5 +--
 src/mesa/state_tracker/st_texture.c  | 15 ++--
 8 files changed, 105 insertions(+), 19 deletions(-)

diff --git a/docs/GL3.txt b/docs/GL3.txt
index 76412c3..5b25865 100644
--- a/docs/GL3.txt
+++ b/docs/GL3.txt
@@ -166,7 +166,7 @@ GL 4.3, GLSL 4.30:
   GL_ARB_texture_buffer_range  DONE (nv50, nvc0, i965, 
r600, radeonsi)
   GL_ARB_texture_query_levels  DONE (all drivers that 
support GLSL 1.30)
   GL_ARB_texture_storage_multisample   DONE (all drivers that 
support GL_ARB_texture_multisample)
-  GL_ARB_texture_view  DONE (i965)
+  GL_ARB_texture_view  DONE (i965, nv30, nv50, 
nvc0, r300, r600, radeonsi, llvmpipe, softpipe)
   GL_ARB_vertex_attrib_binding DONE (all drivers)
 
 
diff --git a/docs/relnotes/10.3.html b/docs/relnotes/10.3.html
index fa4ea23..852aec9 100644
--- a/docs/relnotes/10.3.html
+++ b/docs/relnotes/10.3.html
@@ -63,6 +63,7 @@ Note: some of the new features are only available with 
certain drivers.
 <li>GL_ARB_texture_gather on r600, radeonsi</li>
 <li>GL_ARB_texture_query_levels on nv50, nvc0, llvmpipe, r600, radeonsi, 
softpipe</li>
 <li>GL_ARB_texture_query_lod on r600, radeonsi</li>
+<li>GL_ARB_texture_view on nv30, nv50, nvc0, r300, r600, radeonsi, llvmpipe, 
softpipe</li>
 <li>GL_ARB_viewport_array on nvc0</li>
 <li>GL_AMD_vertex_shader_viewport_index on i965/gen7+, r600</li>
 <li>GL_OES_compressed_ETC1_RGB8_texture on nv30, nv50, nvc0, r300, r600, 
radeonsi, softpipe, llvmpipe</li>
diff --git a/src/mesa/state_tracker/st_atom_texture.c 
b/src/mesa/state_tracker/st_atom_texture.c
index 03d0593..8f62494 100644
--- a/src/mesa/state_tracker/st_atom_texture.c
+++ b/src/mesa/state_tracker/st_atom_texture.c
@@ -192,9 +192,9 @@ get_texture_format_swizzle(const struct st_texture_object 
*stObj)
return swizzle_swizzle(stObj->base._Swizzle, tex_swizzle);
 }
 
-
+
 /**
- * Return TRUE if the texture's sampler view swizzle is equal to
+ * Return TRUE if the texture's sampler view swizzle is not equal to
  * the texture's swizzle.
  *
  * \param stObj  the st texture object,
@@ -214,9 +214,20 @@ check_sampler_swizzle(const struct st_texture_object 
*stObj,
 
 static unsigned last_level(struct st_texture_object *stObj)
 {
-   return MIN2(stObj->base._MaxLevel, stObj->pt->last_level);
+   unsigned ret = MIN2(stObj->base.MinLevel + stObj->base._MaxLevel,
+   stObj->pt->last_level);
+   if (stObj->base.Immutable)
+  ret = MIN2(ret, stObj->base.MinLevel + stObj->base.NumLevels - 1);
+   return ret;
 }
 
+static unsigned last_layer(struct st_texture_object *stObj)
+{
+   if (stObj->base.Immutable)
+  return MIN2(stObj->base.MinLayer + stObj->base.NumLayers - 1,
+  stObj->pt->array_size - 1);
+   return stObj->pt->array_size - 1;
+}
 
 static struct pipe_sampler_view *
 st_create_texture_sampler_view_from_stobj(struct pipe_context *pipe,
@@ -249,9 +260,12 @@ st_create_texture_sampler_view_from_stobj(struct 
pipe_context *pipe,
   templ.u.buf.first_element = f;
   templ.u.buf.last_element  = f + (n - 1);
} else {
-  templ.u.tex.first_level = stObj->base.BaseLevel;
+  templ.u.tex.first_level = stObj->base.MinLevel + stObj->base.BaseLevel;
   templ.u.tex.last_level = last_level(stObj);
   assert(templ.u.tex.first_level <= templ.u.tex.last_level);
+  templ.u.tex.first_layer = stObj->base.MinLayer;
+  templ.u.tex.last_layer = last_layer(stObj);
+  assert(templ.u.tex.first_layer <= templ.u.tex.last_layer);
}
 
if (swizzle != SWIZZLE_NOOP) {
@@ -287,8 +301,10 @@ st_get_texture_sampler_view_from_stobj(struct st_context 
*st,
if (*sv) {
   if (check_sampler_swizzle(stObj, *sv) ||
  (format != (*sv)->format) ||
-  stObj->base.BaseLevel != (*sv)->u.tex.first_level ||
-  last_level(stObj) != (*sv)->u.tex.last_level) {
+  stObj->base.MinLevel + stObj->base.BaseLevel != 
(*sv)->u.tex.first_level ||
+  last_level(stObj) != (*sv)->u.tex.last_level ||
+  stObj->base.MinLayer != (*sv)->u.tex.first_layer ||
+
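The level and layer arithmetic in the (truncated) patch above is easier to check with concrete numbers. The following C++ sketch mirrors the clamping done by last_level() and last_layer(); TexObj is a hypothetical, flattened stand-in for the texture object and resource fields, and first_level() is a helper added here for symmetry.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical, flattened stand-ins for the fields read by the patch.
struct TexObj {
    unsigned MinLevel, NumLevels;   // view range within the parent texture
    unsigned MinLayer, NumLayers;
    unsigned BaseLevel, MaxLevel;   // MaxLevel stands in for _MaxLevel
    bool Immutable;
    unsigned res_last_level;        // stands in for pt->last_level
    unsigned res_array_size;        // stands in for pt->array_size
};

static unsigned first_level(const TexObj &t)
{
    return t.MinLevel + t.BaseLevel;
}

static unsigned last_level(const TexObj &t)
{
    unsigned ret = std::min(t.MinLevel + t.MaxLevel, t.res_last_level);
    if (t.Immutable)
        ret = std::min(ret, t.MinLevel + t.NumLevels - 1);
    return ret;
}

static unsigned last_layer(const TexObj &t)
{
    if (t.Immutable)
        return std::min(t.MinLayer + t.NumLayers - 1, t.res_array_size - 1);
    return t.res_array_size - 1;
}
```

For a view that starts at level 2 with 3 levels and at layer 4 with 2 layers, on a resource with last level 7 and 8 array layers, the sampler view would cover levels 2 to 4 and layers 4 to 5.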

[Mesa-dev] [PATCH 1/2] mesa: force height of 1D textures to be 1 in texture views

Signed-off-by: Ilia Mirkin imir...@alum.mit.edu
---
 src/mesa/main/textureview.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/mesa/main/textureview.c b/src/mesa/main/textureview.c
index b3521e2..6e86a9a 100644
--- a/src/mesa/main/textureview.c
+++ b/src/mesa/main/textureview.c
@@ -536,6 +536,9 @@ _mesa_TextureView(GLuint texture, GLenum target, GLuint 
origtexture,
/* Adjust width, height, depth to be appropriate for new target */
switch (target) {
case GL_TEXTURE_1D:
+  height = 1;
+  break;
+
case GL_TEXTURE_3D:
   break;
 
-- 
1.8.5.5


Re: [Mesa-dev] [PATCH] st/clover: Fix build against LLVM SVN >= r215967

Michel Dänzer mic...@daenzer.net writes:

 From: Michel Dänzer michel.daen...@amd.com

 Signed-off-by: Michel Dänzer michel.daen...@amd.com
 ---
  src/gallium/state_trackers/clover/llvm/invocation.cpp | 4 
  1 file changed, 4 insertions(+)

 diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp 
 b/src/gallium/state_trackers/clover/llvm/invocation.cpp
 index 5d2efc4..2643cc3 100644
 --- a/src/gallium/state_trackers/clover/llvm/invocation.cpp
 +++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp
 @@ -234,7 +234,11 @@ namespace {
memcpy(address_spaces, c.getTarget().getAddressSpaceMap(),
  
 sizeof(address_spaces));
  
 +#if HAVE_LLVM >= 0x0306
 +  return act.takeModule().get();

You probably want to call .release() instead and deallocate it manually
later on, otherwise the module will be destroyed here before the end of
the function.

Thanks.

 +#else
return act.takeModule();
 +#endif
 }
  
 void
 -- 
 2.1.0


Re: [Mesa-dev] [PATCH] st/clover: Fix build against LLVM SVN >= r215967

On 20.08.2014 15:48, Francisco Jerez wrote:
 Michel Dänzer mic...@daenzer.net writes:
 
 From: Michel Dänzer michel.daen...@amd.com

 Signed-off-by: Michel Dänzer michel.daen...@amd.com
 ---
  src/gallium/state_trackers/clover/llvm/invocation.cpp | 4 
  1 file changed, 4 insertions(+)

 diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp 
 b/src/gallium/state_trackers/clover/llvm/invocation.cpp
 index 5d2efc4..2643cc3 100644
 --- a/src/gallium/state_trackers/clover/llvm/invocation.cpp
 +++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp
 @@ -234,7 +234,11 @@ namespace {
memcpy(address_spaces, c.getTarget().getAddressSpaceMap(),
  
 sizeof(address_spaces));
  
 +#if HAVE_LLVM >= 0x0306
 +  return act.takeModule().get();
 
 You probably want to call .release() instead

Right, that works better, i.e. doesn't crash. :)

 and deallocate it manually later on, otherwise the module will be
 destroyed here before the end of the function.

Are you sure anything else needs to be done for destruction? valgrind
doesn't seem to show any leaks obviously related to this. If something
else does need to be done, I'll have to defer to you or someone else for
the proper fix.


-- 
Earthling Michel Dänzer|  http://www.amd.com
Libre software enthusiast  |Mesa and X developer




Re: [Mesa-dev] [RFC PATCH 00/16] A new IR for Mesa

2014-08-20 Thread Christian König


I think we can fix this by introducing new structured variants of the
branch instruction in a way that doesn't alter the fundamental structure
of the IR.  E.g. an if branch could look like:

ifbr i1 cond, label iftrue, label iffalse, label join

Where both branches are guaranteed to converge at join.  Sure, this
will require fixing many assumptions, but on the one hand it's not
immediately required (as we can address this problem for the time being
using the same solution AMD uses) and on the other hand it's still less
work than starting from scratch.
Well, I wrote the structurizer pass in LLVM that you are talking about
here, and from my experience you really don't want any structured form of
control flow in the IR.


Structured control flow is just a specialized form of unstructured 
control flow and even if it looks rather awkward at first glance it is 
indeed simpler to destructurize the compiler generated control flow for 
optimization and structurize again for instruction selection.


The only reason I've annotated the LLVM IR with specialized intrinsics 
for the SI backend was laziness and I wouldn't do that again given the 
chance.



And it's very likely that these backends, which probably aren't using
SSA due to the aforementioned difficulties, will also benefit from
having modifiers already folded for them - this is something that's
already a problem for i965 vec4 backend and that NIR will help a lot.


Well, I have the impression that much of the reason why the i965 vec4
backend has lagged behind so much in comparison with the fs backend is
precisely because it's so annoying to optimize vec4 code.  It seems
painful to me that you have this built into the core instruction set so
generic optimization passes will have to be explicitly aware of it.  I
wouldn't be surprised if the i965 vec4 benefited at least as much from
scalarizing the code, performing optimizations there, and re-vectorizing
afterwards.


Completely agree.

Being able to do vectorization in an IR is important, but you shouldn't
try to handle backend-specific swizzle operations and vectorization
restrictions in the IR. Just look at the swizzle restrictions of R600,
for example: I really can't imagine that you want to represent them
in an IR shared between all the different drivers.


Regards,
Christian.

On 20.08.2014 at 08:33, Francisco Jerez wrote:

Connor Abbott cwabbo...@gmail.com writes:


On Tue, Aug 19, 2014 at 11:40 AM, Francisco Jerez curroje...@riseup.net wrote:

Tom Stellard t...@stellard.net writes:


On Tue, Aug 19, 2014 at 11:04:59AM -0400, Connor Abbott wrote:

On Mon, Aug 18, 2014 at 8:52 PM, Michel Dänzer mic...@daenzer.net wrote:

On 19.08.2014 01:28, Connor Abbott wrote:

On Mon, Aug 18, 2014 at 4:32 AM, Michel Dänzer mic...@daenzer.net wrote:

On 16.08.2014 09:12, Connor Abbott wrote:

I know what you might be thinking right now. Wait, *another* IR? Don't
we already have like 5 of those, not counting all the driver-specific
ones? Isn't this stuff complicated enough already? Well, there are some
pretty good reasons to start afresh (again...). In the years we've been
using GLSL IR, we've come to realize that, in fact, it's not what we
want *at all* to do optimizations on.

Did you evaluate using LLVM IR instead of inventing yet another one?


--
Earthling Michel Dänzer|  http://www.amd.com
Libre software enthusiast  |Mesa and X developer

Yes. See

http://lists.freedesktop.org/archives/mesa-dev/2014-February/053502.html

and

http://lists.freedesktop.org/archives/mesa-dev/2014-February/053522.html

I know Ian can't deal with LLVM for some reason. I was wondering if
*you* evaluated it, and if so, why you rejected it.


--
Earthling Michel Dänzer|  http://www.amd.com
Libre software enthusiast  |Mesa and X developer


Well, first of all, the fact that Ian and Ken don't want to use it
means that any plan to use LLVM for the Intel driver is dead in the
water anyways - you can translate NIR into LLVM if you want, but for
i965 we want to share optimizations between our 2 backends (FS and
vec4) that we can't do today in GLSL IR so this is what we want to use
for that, and since nobody else does anything with the core GLSL
compiler except when they have to, when we start moving things out of
GLSL IR this will probably replace GLSL IR as the infrastructure that
all Mesa drivers use. But with that in mind, here are a few reasons
why we wouldn't want to use LLVM:

* LLVM wasn't built to understand structured CFG's, meaning that you
need to re-structurize it using a pass that's fragile and prone to
break if some other pass optimizes the shader in a way that makes it
non-structured (i.e. not expressible in terms of loops and if
statements). This loss of information also means that passes that need
to know things like, for example, the loop nesting depth need to do an
analysis pass whereas with NIR you can just walk up the
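
The quoted message is cut off at this point in the archive. The remark about loop nesting depth can still be illustrated: when control flow is stored as a tree of structured nodes with parent links, the depth falls out of a walk towards the root. The C++ sketch below is an assumption-laden toy and not NIR's actual data structure; Kind, CfNode and loop_depth() are hypothetical names.

```cpp
#include <cassert>

// Hypothetical structured control-flow node; NOT NIR's real representation.
enum class Kind { Function, If, Loop, Block };

struct CfNode {
    Kind kind;
    const CfNode *parent;   // nullptr for the function root
};

// Count the enclosing loops by walking up the parent chain.
static unsigned loop_depth(const CfNode *n)
{
    unsigned depth = 0;
    for (const CfNode *p = n->parent; p; p = p->parent)
        if (p->kind == Kind::Loop)
            ++depth;
    return depth;
}
```

With an unstructured CFG of basic blocks and branches, the same number has to be recovered by a separate loop analysis, which is the cost the message refers to.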

Re: [Mesa-dev] Clamp/saturate optimizations v3

2014-08-20 Thread Abdiel Janulgue



On 20.08.2014 05:40, Matt Turner wrote:
 
 Patches 2-4, (5-9 already reviewed), 10, 13-16, (17 already reviewed) are
 
 Reviewed-by: Matt Turner matts...@gmail.com
 
 I've requested a change come before patch 1, and then rebased patch 1
 should be an easy R-b.
 
 I'll need to take a closer look at 11 and 12.
 

Thanks for the review! I'll send the updated changes

Re: [Mesa-dev] [PATCH] st/clover: Fix build against LLVM SVN >= r215967

Michel Dänzer mic...@daenzer.net writes:

 On 20.08.2014 15:48, Francisco Jerez wrote:
 Michel Dänzer mic...@daenzer.net writes:
 
 From: Michel Dänzer michel.daen...@amd.com

 Signed-off-by: Michel Dänzer michel.daen...@amd.com
 ---
  src/gallium/state_trackers/clover/llvm/invocation.cpp | 4 
  1 file changed, 4 insertions(+)

 diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp 
 b/src/gallium/state_trackers/clover/llvm/invocation.cpp
 index 5d2efc4..2643cc3 100644
 --- a/src/gallium/state_trackers/clover/llvm/invocation.cpp
 +++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp
 @@ -234,7 +234,11 @@ namespace {
memcpy(address_spaces, c.getTarget().getAddressSpaceMap(),
  
 sizeof(address_spaces));
  
 +#if HAVE_LLVM >= 0x0306
 +  return act.takeModule().get();
 
 You probably want to call .release() instead

 Right, that works better, i.e. doesn't crash. :)

 and deallocate it manually later on, otherwise the module will be
 destroyed here before the end of the function.

 Are you sure anything else needs to be done for destruction? valgrind
 doesn't seem to show any leaks obviously related to this. If something
 else does need to be done, I'll have to defer to you or someone else for
 the proper fix.

Yeah, I'm afraid.  Apparently since clang r215979
CodeGenAction::takeModule() gives up ownership on the module object.
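
The difference between .get() and .release() discussed in this thread can be reproduced without LLVM. In the C++ sketch below, Module, take_module(), compile_with_get(), compile_with_release() and live_modules are hypothetical stand-ins; take_module() only imitates an API that returns std::unique_ptr and thereby hands ownership to the caller, as the message above describes for takeModule().

```cpp
#include <cassert>
#include <memory>

static int live_modules = 0;   // number of Module objects currently alive

// Hypothetical stand-in for llvm::Module.
struct Module {
    Module()  { ++live_modules; }
    ~Module() { --live_modules; }
};

// Imitates an API that transfers ownership to its caller.
static std::unique_ptr<Module> take_module()
{
    return std::unique_ptr<Module>(new Module);
}

// First version of the patch: .get() on a temporary. The temporary
// unique_ptr is destroyed at the end of the full expression and deletes
// the Module, so the returned pointer dangles.
static Module *compile_with_get()
{
    return take_module().get();
}

// Suggested fix: .release() detaches the object from the unique_ptr. The
// caller now owns it and has to delete it at some point.
static Module *compile_with_release()
{
    return take_module().release();
}
```

This matches both observations in the thread: the .get() variant crashes because the object is already gone, and the .release() variant works but leaves deallocation to whoever receives the raw pointer.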



 -- 
 Earthling Michel Dänzer|  http://www.amd.com
 Libre software enthusiast  |Mesa and X developer



[Mesa-dev] [PATCH V4 1/3] mesa: implement GL_MAX_VERTEX_ATTRIB_STRIDE

2014-08-20 Thread Timothy Arceri

V2: moved test for the VertexAttrib*Pointer() functions
 to update_array(), and made constant available for drivers to set

Signed-off-by: Timothy Arceri t_arc...@yahoo.com.au
---
 Although 4.4 is a while away, GL_MAX_VERTEX_ATTRIB_STRIDE is used in
 the ARB_direct_state_access spec, so it seemed worthwhile adding this now.

 I've added MAX_VERTEX_ATTRIB_STRIDE to ARB_vertex_attrib_binding.xml
 as it didn't seem worth putting it somewhere on its own,
 since it's really just a bug fix. Let me know if this should be moved.

 Finally I've assumed that 2048 is an ok value for i965.

 V4: add cap for all gallium drivers set to default (except r600g)

 V3: adds values for r600g and radeonsi (I'm unable to test either of these
 patches)

 src/mapi/glapi/gen/ARB_vertex_attrib_binding.xml |  1 +
 src/mesa/main/context.c  |  3 +++
 src/mesa/main/get_hash_params.py |  3 +++
 src/mesa/main/mtypes.h   |  3 +++
 src/mesa/main/varray.c   | 22 ++
 5 files changed, 32 insertions(+)

diff --git a/src/mapi/glapi/gen/ARB_vertex_attrib_binding.xml 
b/src/mapi/glapi/gen/ARB_vertex_attrib_binding.xml
index 0ee6a3c..7e62688 100644
--- a/src/mapi/glapi/gen/ARB_vertex_attrib_binding.xml
+++ b/src/mapi/glapi/gen/ARB_vertex_attrib_binding.xml
@@ -53,6 +53,7 @@
 <enum name="VERTEX_BINDING_STRIDE" value="0x82D8"/>
 <enum name="MAX_VERTEX_ATTRIB_RELATIVE_OFFSET" value="0x82D9"/>
 <enum name="MAX_VERTEX_ATTRIB_BINDINGS" value="0x82DA"/>
+<enum name="MAX_VERTEX_ATTRIB_STRIDE" value="0x82E5"/>
 
 </category>
 </OpenGLAPI>
diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c
index 2320842..fbdbd68 100644
--- a/src/mesa/main/context.c
+++ b/src/mesa/main/context.c
@@ -670,6 +670,9 @@ _mesa_init_constants(struct gl_constants *consts, gl_api 
api)
   ? GL_CONTEXT_CORE_PROFILE_BIT
   : GL_CONTEXT_COMPATIBILITY_PROFILE_BIT;
 
+   /* GL 4.4 */
+   consts->MaxVertexAttribStride = 2048;
+
/** GL_EXT_gpu_shader4 */
consts->MinProgramTexelOffset = -8;
consts->MaxProgramTexelOffset = 7;
diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/get_hash_params.py
index ff85820..aace8a5 100644
--- a/src/mesa/main/get_hash_params.py
+++ b/src/mesa/main/get_hash_params.py
@@ -712,6 +712,9 @@ descriptor=[
  [ "MAX_GEOMETRY_INPUT_COMPONENTS", 
"CONTEXT_INT(Const.Program[MESA_SHADER_GEOMETRY].MaxInputComponents), 
extra_version_32" ],
  [ "MAX_GEOMETRY_OUTPUT_COMPONENTS", 
"CONTEXT_INT(Const.Program[MESA_SHADER_GEOMETRY].MaxOutputComponents), 
extra_version_32" ],
 
+# GL 4.4
+  [ "MAX_VERTEX_ATTRIB_STRIDE", "CONTEXT_ENUM(Const.MaxVertexAttribStride), 
NO_EXTRA" ],
+
 # GL_ARB_robustness
  [ "RESET_NOTIFICATION_STRATEGY_ARB", "CONTEXT_ENUM(Const.ResetStrategy), 
NO_EXTRA" ],
 
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index cb2a4df..adb6788 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -3414,6 +3414,9 @@ struct gl_constants
/** OpenGL version 3.2 */
GLbitfield ProfileMask;   /** Mask of CONTEXT_x_PROFILE_BIT */
 
+   /** OpenGL version 4.4 */
+   GLuint MaxVertexAttribStride;
+
/** GL_EXT_transform_feedback */
GLuint MaxTransformFeedbackBuffers;
GLuint MaxTransformFeedbackSeparateComponents;
diff --git a/src/mesa/main/varray.c b/src/mesa/main/varray.c
index 5d3cc2a..7d169f9 100644
--- a/src/mesa/main/varray.c
+++ b/src/mesa/main/varray.c
@@ -424,6 +424,13 @@ update_array(struct gl_context *ctx,
   return;
}
 
+   if (ctx->API == API_OPENGL_CORE && ctx->Version >= 44 &&
+   stride > ctx->Const.MaxVertexAttribStride) {
+  _mesa_error(ctx, GL_INVALID_VALUE, "%s(stride=%d > "
+  "GL_MAX_VERTEX_ATTRIB_STRIDE)", func, stride);
+  return;
+   }
+
/* Page 29 (page 44 of the PDF) of the OpenGL 3.3 spec says:
 *
 * An INVALID_OPERATION error is generated under any of the following
@@ -1437,6 +1444,13 @@ _mesa_BindVertexBuffer(GLuint bindingIndex, GLuint 
buffer, GLintptr offset,
   return;
}
 
+   if (ctx->API == API_OPENGL_CORE && ctx->Version >= 44 &&
+   stride > ctx->Const.MaxVertexAttribStride) {
+  _mesa_error(ctx, GL_INVALID_VALUE, "glBindVertexBuffer(stride=%d > "
+  "GL_MAX_VERTEX_ATTRIB_STRIDE)", stride);
+  return;
+   }
+
if (buffer == 
vao->VertexBinding[VERT_ATTRIB_GENERIC(bindingIndex)].BufferObj->Name) {
   vbo = vao->VertexBinding[VERT_ATTRIB_GENERIC(bindingIndex)].BufferObj;
} else if (buffer != 0) {
@@ -1565,6 +1579,14 @@ _mesa_BindVertexBuffers(GLuint first, GLsizei count, 
const GLuint *buffers,
  continue;
   }
 
+  if (ctx->API == API_OPENGL_CORE && ctx->Version >= 44 &&
+  strides[i] > ctx->Const.MaxVertexAttribStride) {
+ _mesa_error(ctx, GL_INVALID_VALUE,
+ "glBindVertexBuffers(strides[%u]=%d > "
+ "GL_MAX_VERTEX_ATTRIB_STRIDE)", i, strides[i]);
+ continue;
+  }
+
   if (buffers[i])
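
The patch is truncated here. The stride check it adds in three places has the same shape each time, which the C++ sketch below isolates as one predicate. Api, Context and stride_exceeds_limit() are hypothetical; the real code reads the fields from struct gl_context and reports GL_INVALID_VALUE through _mesa_error(). The explicit sign handling is an addition of this sketch, since it compares a signed stride against an unsigned limit.

```cpp
#include <cassert>

// Hypothetical, minimal stand-ins for the context fields the check reads.
enum Api { API_OPENGL_COMPAT, API_OPENGL_CORE };

struct Context {
    Api api;
    unsigned version;                    // 44 means OpenGL 4.4
    unsigned max_vertex_attrib_stride;   // Const.MaxVertexAttribStride
};

// Returns true when the stride would be rejected with GL_INVALID_VALUE.
// The limit only applies to core-profile contexts of version 4.4 or later.
static bool stride_exceeds_limit(const Context &ctx, int stride)
{
    return ctx.api == API_OPENGL_CORE && ctx.version >= 44 &&
           stride > 0 &&
           static_cast<unsigned>(stride) > ctx.max_vertex_attrib_stride;
}
```

A stride equal to the limit is accepted; only larger values are rejected, and older or compatibility contexts are left alone.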

[Mesa-dev] [PATCH V4 3/3] docs: mark GL_MAX_VERTEX_ATTRIB_STRIDE as done

2014-08-20 Thread Timothy Arceri

Signed-off-by: Timothy Arceri t_arc...@yahoo.com.au
---
 docs/GL3.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/GL3.txt b/docs/GL3.txt
index 76412c3..af26214 100644
--- a/docs/GL3.txt
+++ b/docs/GL3.txt
@@ -172,7 +172,7 @@ GL 4.3, GLSL 4.30:
 
 GL 4.4, GLSL 4.40:
 
-  GL_MAX_VERTEX_ATTRIB_STRIDE  not started
+  GL_MAX_VERTEX_ATTRIB_STRIDE  DONE (all drivers)
   GL_ARB_buffer_storageDONE (i965, nv30, nv50, 
nvc0, r300, r600, radeonsi)
   GL_ARB_clear_texture DONE (i965)
   GL_ARB_enhanced_layouts  not started
-- 
1.9.3


[Mesa-dev] [PATCH V4 2/3] gallium: add cap for MAX_VERTEX_ATTRIB_STRIDE

2014-08-20 Thread Timothy Arceri

Signed-off-by: Timothy Arceri t_arc...@yahoo.com.au
---
 Note: I have only compile tested this patch with ilo.

 src/gallium/docs/source/screen.rst   | 1 +
 src/gallium/drivers/freedreno/freedreno_screen.c | 3 +++
 src/gallium/drivers/i915/i915_screen.c   | 3 +++
 src/gallium/drivers/ilo/ilo_screen.c | 2 ++
 src/gallium/drivers/llvmpipe/lp_screen.c | 2 ++
 src/gallium/drivers/nouveau/nv30/nv30_screen.c   | 2 ++
 src/gallium/drivers/nouveau/nv50/nv50_screen.c   | 2 ++
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.c   | 2 ++
 src/gallium/drivers/r300/r300_screen.c   | 3 +++
 src/gallium/drivers/r600/r600_pipe.c | 3 +++
 src/gallium/drivers/radeonsi/si_pipe.c   | 3 +++
 src/gallium/drivers/softpipe/sp_screen.c | 2 ++
 src/gallium/drivers/svga/svga_screen.c   | 2 ++
 src/gallium/drivers/vc4/vc4_screen.c | 3 +++
 src/gallium/include/pipe/p_defines.h | 1 +
 src/mesa/state_tracker/st_extensions.c   | 3 +++
 16 files changed, 37 insertions(+)

diff --git a/src/gallium/docs/source/screen.rst 
b/src/gallium/docs/source/screen.rst
index eee254e..13bf705 100644
--- a/src/gallium/docs/source/screen.rst
+++ b/src/gallium/docs/source/screen.rst
@@ -225,6 +225,7 @@ The integer capabilities:
   memory and GART.
 * ``PIPE_CAP_CONDITIONAL_RENDER_INVERTED``: Whether the driver supports 
inverted
   condition for conditional rendering.
+* ``PIPE_CAP_MAX_VERTEX_ATTRIB_STRIDE``: The maximum supported vertex stride.
 
 
 .. _pipe_capf:
diff --git a/src/gallium/drivers/freedreno/freedreno_screen.c 
b/src/gallium/drivers/freedreno/freedreno_screen.c
index ab1a740..81a6c84 100644
--- a/src/gallium/drivers/freedreno/freedreno_screen.c
+++ b/src/gallium/drivers/freedreno/freedreno_screen.c
@@ -233,6 +233,9 @@ fd_screen_get_param(struct pipe_screen *pscreen, enum 
pipe_cap param)
case PIPE_CAP_MAX_VERTEX_STREAMS:
return 0;
 
+   case PIPE_CAP_MAX_VERTEX_ATTRIB_STRIDE:
+   return 2048;
+
/* Texturing. */
case PIPE_CAP_MAX_TEXTURE_2D_LEVELS:
case PIPE_CAP_MAX_TEXTURE_3D_LEVELS:
diff --git a/src/gallium/drivers/i915/i915_screen.c 
b/src/gallium/drivers/i915/i915_screen.c
index 40976b3..55f8e71 100644
--- a/src/gallium/drivers/i915/i915_screen.c
+++ b/src/gallium/drivers/i915/i915_screen.c
@@ -275,6 +275,9 @@ i915_get_param(struct pipe_screen *screen, enum pipe_cap 
cap)
case PIPE_CAP_MAX_VERTEX_STREAMS:
   return 0;
 
+   case PIPE_CAP_MAX_VERTEX_ATTRIB_STRIDE:
+  return 2048;
+
/* Fragment coordinate conventions. */
case PIPE_CAP_TGSI_FS_COORD_ORIGIN_UPPER_LEFT:
case PIPE_CAP_TGSI_FS_COORD_PIXEL_CENTER_HALF_INTEGER:
diff --git a/src/gallium/drivers/ilo/ilo_screen.c 
b/src/gallium/drivers/ilo/ilo_screen.c
index 15658da..1e034f8 100644
--- a/src/gallium/drivers/ilo/ilo_screen.c
+++ b/src/gallium/drivers/ilo/ilo_screen.c
@@ -382,6 +382,8 @@ ilo_get_param(struct pipe_screen *screen, enum pipe_cap 
param)
case PIPE_CAP_VERTEX_BUFFER_STRIDE_4BYTE_ALIGNED_ONLY:
case PIPE_CAP_VERTEX_ELEMENT_SRC_OFFSET_4BYTE_ALIGNED_ONLY:
   return false;
+   case PIPE_CAP_MAX_VERTEX_ATTRIB_STRIDE:
+  return 2048;
case PIPE_CAP_COMPUTE:
   return false; /* TODO */
case PIPE_CAP_USER_INDEX_BUFFERS:
diff --git a/src/gallium/drivers/llvmpipe/lp_screen.c 
b/src/gallium/drivers/llvmpipe/lp_screen.c
index 2a6e673..8625f0c 100644
--- a/src/gallium/drivers/llvmpipe/lp_screen.c
+++ b/src/gallium/drivers/llvmpipe/lp_screen.c
@@ -195,6 +195,8 @@ llvmpipe_get_param(struct pipe_screen *screen, enum 
pipe_cap param)
   return 1024;
case PIPE_CAP_MAX_VERTEX_STREAMS:
   return 1;
+   case PIPE_CAP_MAX_VERTEX_ATTRIB_STRIDE:
+  return 2048;
case PIPE_CAP_STREAM_OUTPUT_PAUSE_RESUME:
   return 1;
case PIPE_CAP_TGSI_CAN_COMPACT_CONSTANTS:
diff --git a/src/gallium/drivers/nouveau/nv30/nv30_screen.c 
b/src/gallium/drivers/nouveau/nv30/nv30_screen.c
index 80d6943..15aba8a 100644
--- a/src/gallium/drivers/nouveau/nv30/nv30_screen.c
+++ b/src/gallium/drivers/nouveau/nv30/nv30_screen.c
@@ -71,6 +71,8 @@ nv30_screen_get_param(struct pipe_screen *pscreen, enum 
pipe_cap param)
   return 16;
case PIPE_CAP_MAX_VIEWPORTS:
   return 1;
+   case PIPE_CAP_MAX_VERTEX_ATTRIB_STRIDE:
+  return 2048;
/* supported capabilities */
case PIPE_CAP_TWO_SIDED_STENCIL:
case PIPE_CAP_ANISOTROPIC_FILTER:
diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.c 
b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
index 99dcdc5..1c08c3b 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_screen.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
@@ -117,6 +117,8 @@ nv50_screen_get_param(struct pipe_screen *pscreen, enum 
pipe_cap param)
   return 1024;
case PIPE_CAP_MAX_VERTEX_STREAMS:
   return 1;
+   case PIPE_CAP_MAX_VERTEX_ATTRIB_STRIDE:
+  return 2048;
case PIPE_CAP_CONSTANT_BUFFER_OFFSET_ALIGNMENT:
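
The patch is truncated here, so the state tracker side is not visible. The general pattern, a per-driver switch that reports a limit and a consumer that copies it into the context constants, might look like the C++ sketch below. PipeCap, Screen, toy_get_param(), Constants and init_limits() are hypothetical; the sketch assumes the state tracker simply queries the cap once at initialization.

```cpp
#include <cassert>

// Hypothetical subset of the capability enum.
enum PipeCap {
    CAP_MAX_VERTEX_STREAMS,
    CAP_MAX_VERTEX_ATTRIB_STRIDE,
    CAP_UNKNOWN
};

struct Screen {
    int (*get_param)(const Screen *, PipeCap);
};

// One driver's answer to capability queries.
static int toy_get_param(const Screen *, PipeCap cap)
{
    switch (cap) {
    case CAP_MAX_VERTEX_STREAMS:
        return 1;
    case CAP_MAX_VERTEX_ATTRIB_STRIDE:
        return 2048;
    default:
        return 0;
    }
}

struct Constants {
    unsigned MaxVertexAttribStride;
};

// Consumer side: copy the driver's limit into the context constants.
static void init_limits(const Screen &screen, Constants &consts)
{
    consts.MaxVertexAttribStride = static_cast<unsigned>(
        screen.get_param(&screen, CAP_MAX_VERTEX_ATTRIB_STRIDE));
}
```

Because every driver has its own switch, adding a cap means touching each driver, which is why the diffstat above lists one small hunk per driver.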

Re: [Mesa-dev] [PATCH 00/37] Geometry shader support in Sandy Bridge

2014-08-20 Thread Iago Toral


On 2014-08-16 09:11, Jordan Justen wrote:
On Thu, Aug 14, 2014 at 4:11 AM, Iago Toral Quiroga ito...@igalia.com 
wrote:

Hi,

this series brings support for geometry shaders in Sandy Bridge (gen6) and
is combined work from Samuel and myself. A few notes:

1.- Some patches have been based on original work by Ilia Mirkin,
specifically the idea of using arrays to buffer the output of the GS,
subclassing the vec4_gs_visitor for gen6 and generalizing emit_urb_slot().

2.- Geometry shaders were already being used in gen6 to implement transform
feedback support for vertex shaders. We have not changed this. These patches
focus on adding support for user-provided geometry shaders and transform
feedback support for the geometry shader stage. In the future it probably
makes sense to merge transform feedback support for the vertex shader stage
in our implementation so there is only one code path for geometry shaders
in gen6, but it is probably better to tackle that at a later moment, once we
have merged this work.

2.- On Ivy Bridge there are no piglit regressions.

3.- On Sandy Bridge we get these results after enabling OpenGL 3.2 and
GLSL 1.50 (*1):

  crash:+0
  fail:+15 (*2)
  pass:  +3265
  skip:  -3280


Maybe a list of the failures? Or posting the piglit comparison results
might be helpful.

For example:
http://people.freedesktop.org/~kwg/stuff/bdw-2014-05-13/summary/regressions.html

This is not really a big deal, but it would just be nice to quickly
see what tests are failing.


(*1) Including Jordan's patches from the series
"Gen6 render surface state changes", since these are required to enable
layered rendering in geometry shaders. The numbers were obtained by
comparing master with Jordan's patches on top (OpenGL 3.1, GLSL 1.40)
against master with these and Jordan's patches on top (OpenGL 3.2, GLSL 1.50).


I finally pushed my gen6-layered series to master. (a1dca70)

I wonder if you might push these patches to a publicly available branch?


Thanks!

-Jordan


Sure. Samuel, can you do this?

Iago



(*2) These are mostly tests that either fail in Ivy Bridge too, are GS
variants of tests that also fail for the VS/FS stages, or relate to other
aspects of OpenGL 3.2 that are unrelated to geometry shaders.

4.- With these patches, the following piglit test hangs:
bin/glsl-1.50-geometry-primitive-id-restart GL_TRIANGLE_STRIP_ADJACENCY

This problem seems to be unrelated to our implementation, since the hang
happens only for that primitive type, only when using glDrawElements()
(so glDrawArrays works fine), and only in specific cases where the list
of indices provided includes repeated indices with a certain pattern. Actually,
this test hangs even if we have a geometry shader that does nothing (i.e. an
empty main function), where the code we generate is trivial and works with
any other primitive type. Based on this, I conclude that this is a problem
originating somewhere else, I think probably a hardware bug. Because of this,
piglit runs with these patches should exclude this test by including
-x primitive-id-restart. The offending piglit test can be trivially reworked
to avoid repeating indices in the call to glDrawElements() too. I'll
develop this issue further in another thread so we can decide what to do about
this problem.

I'll be on holidays for the next two weeks, starting tomorrow, but Samuel will
be around from Tuesday next week so he can start acting on the review feedback
we get.

A quick summary of the patches:

- Patch 1: is actually about gen7, but since gen6's dispatch mode for geometry
  shaders is equivalent to gen7's SINGLE mode it makes sense to do this first.
- Patches 2-4 refactor 3DSTATE_GS to accommodate the code path for user-provided
  geometry shaders while keeping the original code that handles TF support
  in vertex shaders.
- Patches 5-13 implement generator opcodes, configure state packets and
  handle required URB space.
- Patches 14-15 generalize emit_urb_slot() so we can reuse that code.
- Patches 16-19 are the gen6 geometry shader visitor implementation.
- Patches 20-21 implement gl_PrimitiveIDIn.
- Patch 22 makes sure we compute the right VUE map for user-provided GS.
- Patch 23 enables texture related functions in the GS stage.
- Patches 24-33 mostly implement transform feedback
- Patch 34 handles uploading of ubo and pull constant surfaces
- Patch 35 makes gen6 use this implementation of geometry shaders
- Patches 36-37 enable GLSL 1.5 and OpenGL 3.2 in gen6

Iago Toral Quiroga (23):
  i965/gs: Use single dispatch mode as fallback to dual object mode when
    possible.
  i965/gen6/gs: Setup constant push buffers for gen6 geometry shaders.
  i965/gen6/gs: Implement GS_OPCODE_FF_SYNC.
  i965/gen6/gs: Implement GS_OPCODE_URB_WRITE_ALLOCATE.
  i965/gen6/gs: Add instruction URB flags to geometry shaders EOT
message.
  i965/gen6/gs: Compute URB entry size for user-provided geometry
shaders.
  i965/gen6/gs: Enable

Re: [Mesa-dev] [RFC PATCH 00/16] A new IR for Mesa

On 20.08.2014 00:04, Connor Abbott wrote:
 On Mon, Aug 18, 2014 at 8:52 PM, Michel Dänzer mic...@daenzer.net wrote:
 On 19.08.2014 01:28, Connor Abbott wrote:
 On Mon, Aug 18, 2014 at 4:32 AM, Michel Dänzer mic...@daenzer.net wrote:
 On 16.08.2014 09:12, Connor Abbott wrote:
 I know what you might be thinking right now. Wait, *another* IR? Don't
 we already have like 5 of those, not counting all the driver-specific
 ones? Isn't this stuff complicated enough already? Well, there are some
 pretty good reasons to start afresh (again...). In the years we've been
 using GLSL IR, we've come to realize that, in fact, it's not what we
 want *at all* to do optimizations on.

 Did you evaluate using LLVM IR instead of inventing yet another one?

 Yes. See

 http://lists.freedesktop.org/archives/mesa-dev/2014-February/053502.html

 and

 http://lists.freedesktop.org/archives/mesa-dev/2014-February/053522.html

 I know Ian can't deal with LLVM for some reason. I was wondering if
 *you* evaluated it, and if so, why you rejected it.

First of all, thank you for sharing more specific information than
'table-flipping rage'.


 * LLVM is on a different release schedule (6 months vs. 3 months), has
 a different review process, etc., which means that to add support for
 new functionality that involves shaders, we now have to submit patches
 to two separate projects, and then 2 months later when we ship Mesa it
 turns out that nobody can actually use the new feature because it
 depends upon an unreleased version of LLVM that won't be released for
 another 3 months and then packaged by distros even later...

This has indeed been frustrating at times, but it's better now for
backend changes since Tom has been making LLVM point releases.

As for the GLSL frontend, I agree with Tom that it shouldn't require
that much direct interaction with the LLVM project.


 we've already had problems where distros refused to ship newer Mesa
 releases because radeon depended on a version of LLVM newer than the
 one they were shipping, [...]

That's news to me, can you be more specific?

That sounds like basically a distro issue though, since different LLVM
versions can be installed in parallel (and the one used by default
doesn't have to be the newest one). And it even works if another part of
the same process uses a different version of LLVM.


-- 
Earthling Michel Dänzer|  http://www.amd.com
Libre software enthusiast  |Mesa and X developer
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] pipe-loader: Fix memory leak v2

On 20 August 2014 00:49, Tom Stellard thomas.stell...@amd.com wrote:
 CC: 10.2 mesa-sta...@lists.freedesktop.org

 v2:
   - Change driver_name to char*
I knew there was a reason as to why I put a comment in there.
Thanks for tracking it down Tom.

Reviewed-by: Emil Velikov emil.l.veli...@gmail.com

 ---
  src/gallium/auxiliary/pipe-loader/pipe_loader.h | 2 +-
  src/gallium/auxiliary/pipe-loader/pipe_loader_drm.c | 2 +-
  2 files changed, 2 insertions(+), 2 deletions(-)

 diff --git a/src/gallium/auxiliary/pipe-loader/pipe_loader.h 
 b/src/gallium/auxiliary/pipe-loader/pipe_loader.h
 index 8ff00b1..6127a6a 100644
 --- a/src/gallium/auxiliary/pipe-loader/pipe_loader.h
 +++ b/src/gallium/auxiliary/pipe-loader/pipe_loader.h
 @@ -67,7 +67,7 @@ struct pipe_loader_device {
} pci;
 } u; /** Discriminated by \a type */

 -   const char *driver_name;
 +   char *driver_name;
 const struct pipe_loader_ops *ops;
  };

 diff --git a/src/gallium/auxiliary/pipe-loader/pipe_loader_drm.c 
 b/src/gallium/auxiliary/pipe-loader/pipe_loader_drm.c
 index 1bbaf19..88056f5 100644
 --- a/src/gallium/auxiliary/pipe-loader/pipe_loader_drm.c
 +++ b/src/gallium/auxiliary/pipe-loader/pipe_loader_drm.c
 @@ -256,7 +256,7 @@ pipe_loader_drm_release(struct pipe_loader_device **dev)
util_dl_close(ddev->lib);

 close(ddev->fd);
 -   /* XXX: Free ddev->base.driver_name - strdup at loader_get_driver_for_fd */
 +   FREE(ddev->base.driver_name);
 FREE(ddev);
 *dev = NULL;
  }
 --
 1.8.3.1
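
To illustrate the ownership point behind the v2 change, here is a minimal C++
sketch. It is not the pipe-loader code: the `device` struct and the
`dup_string`, `device_create` and `device_release` helpers are hypothetical
names made up for this example. The only detail taken from the patch is that
the driver name is a heap copy which the release path has to free, which is
why the field is a plain `char *` rather than `const char *`.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for the loader's device struct; only the field
// discussed in the patch is modelled.
struct device {
   char *driver_name;   // owned heap copy, so not const-qualified
};

// Hypothetical helper: heap-duplicates a C string (what strdup() does).
static char *dup_string(const char *s)
{
   const std::size_t n = std::strlen(s) + 1;
   char *copy = static_cast<char *>(std::malloc(n));
   if (copy)
      std::memcpy(copy, s, n);
   return copy;
}

static device *device_create(const char *name)
{
   device *dev = static_cast<device *>(std::malloc(sizeof(device)));
   if (!dev)
      return nullptr;
   dev->driver_name = dup_string(name);
   if (!dev->driver_name) {
      std::free(dev);
      return nullptr;
   }
   return dev;
}

// Same shape as the release path in the patch: free the owned name, then
// the device, then clear the caller's pointer.
static void device_release(device **dev)
{
   if (!dev || !*dev)
      return;
   // With a `const char *` field this call would not compile in C++
   // without a cast, because free() takes a non-const void *.
   std::free((*dev)->driver_name);
   std::free(*dev);
   *dev = nullptr;
}
```

Keeping the field non-const documents that the struct owns the string, so no
cast is needed at the point where it is freed.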


Re: [Mesa-dev] [PATCH 00/37] Geometry shader support in Sandy Bridge

2014-08-20 Thread Samuel Iglesias Gonsálvez

On Wed, 2014-08-20 at 11:16 +0200, Iago Toral wrote:
 On 2014-08-16 09:11, Jordan Justen wrote:
  On Thu, Aug 14, 2014 at 4:11 AM, Iago Toral Quiroga ito...@igalia.com 
  wrote:
  Hi,
  
  this series brings support for geometry shaders in Sandy Bridge (gen6) 
  and is
  combined work from Samuel and myself. A few notes:
  
  1.- Some patches have been based on original work by Ilia Mirkin, 
  specifically
  the idea of using arrays to buffer the output of the GS, subclassing 
  the
  vec4_gs_visitor for gen6 and generalizing emit_urb_slot().
  
  2.- Geometry shaders were already being used in gen6 to implement 
  transform
  feedback support for vertex shaders. We have not changed this. These 
  patches
  focus on adding support for user-provided geometry shaders and 
  transform
  feedback support for the geometry shader stage. In the future it 
  probably
  makes sense to merge transform feedback support for the vertex shader 
  stage
  in our implementation so there is only one code path for geometry 
  shaders
  in gen6, but it is probably better to tackle that at a later moment, 
  once we
  have merged this work.
  
  2.- On Ivy Bridge there are no piglit regressions.
  
  3.- On Sandy Bridge we get these results after enabling OpenGL 3.2 and
  GLSL 1.50 (*1):
  
crash:+0
fail:+15 (*2)
pass:  +3265
skip:  -3280
  
  Maybe a list of the failures? Or posting the piglit comparison results
  might be helpful.
  
  For example:
  http://people.freedesktop.org/~kwg/stuff/bdw-2014-05-13/summary/regressions.html
  
  This is not really a big deal, but it would just be nice to quickly
  see what tests are failing.
  
  (*1) Including Jordan's patches from the series
  Gen6 render surface state changes since these are required to enable
  layered rendering in geometry shaders. The numbers were obtained by 
  comparing
  master with Jordan's patches on top (OpenGL 3.1, GLSL 1.40) against 
  master
  with these and Jordan's patches on top (OpenGL 3.2, GLSL 1.50)
  
  I finally pushed my gen6-layered series to master. (a1dca70)
  
  I wonder if you might push these patches to a publicly available 
  branch?
  
  Thanks!
  
  -Jordan
 
 Sure. Samuel, can you do this?

Sure!

The public branch with the submitted patches rebased on top
of yesterday's master is here:

https://github.com/samuelig/mesa/tree/gs-support-snb-for-submission

And the piglit comparison between yesterday's master, which already has
Jordan's patches in SNB (OpenGL 3.1, GLSL 1.40), and our patches
(OpenGL 3.2, GLSL 1.50) is here:

http://samuelig.es/mesa-dev/all-submitted-patches-19-aug/

Sorry for the delay, uploading the whole piglit's HTML output is taking
a lot of time with my Internet connection :-S If you find that some
files are missing just try again later (FTP transfer is still uploading
files).

Best regards,

Samuel



Re: [Mesa-dev] [PATCH 37/37] i965/gen6: enable OpenGL 3.2

2014-08-20 Thread Samuel Iglesias Gonsálvez

On Thu, 2014-08-14 at 08:46 -0700, Matt Turner wrote:
 On Thu, Aug 14, 2014 at 4:12 AM, Iago Toral Quiroga ito...@igalia.com wrote:
  From: Samuel Iglesias Gonsalvez sigles...@igalia.com
 
  Signed-off-by: Samuel Iglesias Gonsalvez sigles...@igalia.com
  ---
 
 I'd squash the last two patches together. I think it's likely we can
 go to GL 3.3 on Sandybridge, but we'd probably like to take a look at
 the piglit results first, so this patch that increases it to 3.2 seems
 fine.
 
 Both of these (squashed together) are
 
 Reviewed-by: Matt Turner matts...@gmail.com

Thanks Matt. I am going to do the squash for the second version of the
patches.

I will wait some days in order to receive more feedback before sending
the second version :-)

Sam



[Mesa-dev] EGL_KHR_vg_parent_image extension

2014-08-20 Thread Peter Hanzel

Hello.

I want to use EGL_KHR_vg_parent_image extension. I did grep on Mesa
source code but found nothing. Then I looked on google and found this:

http://lists.freedesktop.org/archives/mesa-dev/2010-September/002900.html

Then I looked at the Mesa cgit and it looks like this was not merged.
So Mesa does not support the EGL_KHR_vg_parent_image extension?

Thanks.

Re: [Mesa-dev] [RFC PATCH 00/16] A new IR for Mesa

On Tue, Aug 19, 2014 at 11:33 PM, Francisco Jerez curroje...@riseup.net wrote:
 Connor Abbott cwabbo...@gmail.com writes:

 On Tue, Aug 19, 2014 at 11:40 AM, Francisco Jerez curroje...@riseup.net 
 wrote:
 Tom Stellard t...@stellard.net writes:

 On Tue, Aug 19, 2014 at 11:04:59AM -0400, Connor Abbott wrote:
 On Mon, Aug 18, 2014 at 8:52 PM, Michel Dänzer mic...@daenzer.net wrote:
  On 19.08.2014 01:28, Connor Abbott wrote:
  On Mon, Aug 18, 2014 at 4:32 AM, Michel Dänzer mic...@daenzer.net 
  wrote:
  On 16.08.2014 09:12, Connor Abbott wrote:
  I know what you might be thinking right now. Wait, *another* IR? 
  Don't
  we already have like 5 of those, not counting all the driver-specific
  ones? Isn't this stuff complicated enough already? Well, there are 
  some
  pretty good reasons to start afresh (again...). In the years we've 
  been
  using GLSL IR, we've come to realize that, in fact, it's not what we
  want *at all* to do optimizations on.
 
  Did you evaluate using LLVM IR instead of inventing yet another one?
 
 
  --
  Earthling Michel Dänzer|  http://www.amd.com
  Libre software enthusiast  |Mesa and X developer
 
  Yes. See
 
  http://lists.freedesktop.org/archives/mesa-dev/2014-February/053502.html
 
  and
 
  http://lists.freedesktop.org/archives/mesa-dev/2014-February/053522.html
 
  I know Ian can't deal with LLVM for some reason. I was wondering if
  *you* evaluated it, and if so, why you rejected it.
 
 
  --
  Earthling Michel Dänzer|  http://www.amd.com
  Libre software enthusiast  |Mesa and X developer


 Well, first of all, the fact that Ian and Ken don't want to use it
 means that any plan to use LLVM for the Intel driver is dead in the
 water anyways - you can translate NIR into LLVM if you want, but for
 i965 we want to share optimizations between our 2 backends (FS and
 vec4) that we can't do today in GLSL IR so this is what we want to use
 for that, and since nobody else does anything with the core GLSL
 compiler except when they have to, when we start moving things out of
 GLSL IR this will probably replace GLSL IR as the infrastructure that
 all Mesa drivers use. But with that in mind, here are a few reasons
 why we wouldn't want to use LLVM:

 * LLVM wasn't built to understand structured CFG's, meaning that you
 need to re-structurize it using a pass that's fragile and prone to
 break if some other pass optimizes the shader in a way that makes it
 non-structured (i.e. not expressible in terms of loops and if
 statements). This loss of information also means that passes that need
 to know things like, for example, the loop nesting depth need to do an
 analysis pass whereas with NIR you can just walk up the control flow
 tree and count the number of loops we hit.
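
The "walk up the control flow tree and count the loops" idea from the
paragraph above can be sketched as follows. This is an illustration only:
`cf_kind`, `cf_node` and `loop_depth` are hypothetical names and are not
taken from the NIR patches. The sketch assumes that every node of a
structured control-flow tree keeps a pointer to its parent.

```cpp
#include <cassert>

// Hypothetical node kinds for a structured control-flow tree.
enum class cf_kind { function, block, if_stmt, loop };

struct cf_node {
   cf_kind kind;
   const cf_node *parent;   // nullptr at the root of the tree
};

// Loop nesting depth of a node: follow parent links up to the root and
// count every loop node on the way, including the starting node itself
// if it is a loop. No separate analysis pass is needed.
static unsigned loop_depth(const cf_node *node)
{
   unsigned depth = 0;
   for (const cf_node *n = node; n != nullptr; n = n->parent) {
      if (n->kind == cf_kind::loop)
         ++depth;
   }
   return depth;
}
```

The cost is proportional to the nesting depth of the node, which is usually
small for shaders.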


 LLVM has a pass to structurize the CFG.  We use it in the radeon
 drivers, and it is run after all of the other LLVM optimizations which have
 no concept of structured CFG.  It's not bug free, but it works really
 well even with all of the complex OpenCL kernels we throw at it.

 Your point about losing information when the CFG is de-structurized is
 valid, but for things like loop depth, I'm not sure why we couldn't write 
 an
 LLVM analysis pass for this (if one doesn't already exist).


 I don't think this is such a big deal either.  At least the
 structurization pass used on newer AMD hardware isn't fragile in the
 way you seem to imply -- AFAIK (unlike the old AMDIL heuristic
 algorithm) it's guaranteed to give you a valid structurized output no
 matter what the previous optimization passes have done to the CFG,
 modulo bugs.  I admit that the situation is nevertheless suboptimal.
 Ideally this information wouldn't get lost along the way.  For the long
 term we may want to represent structured control flow directly in the IR
 as you say, I just don't see how reinventing the IR saves us any work if
 we could just fix the existing one.

 It seems to me that something like how we represent control flow is a
 pretty fundamental part of the IR - it affects any optimization pass
 that needs to do anything beyond adding and removing instructions. How
 would you fix that, especially given that LLVM is primarily designed
 for CPU's where you don't want to be restricted to structured control
 flow at all? It seems like our goals (preserve the structure) conflict
 with the way LLVM has been designed.

 I think we can fix this by introducing new structured variants of the
 branch instruction in a way that doesn't alter the fundamental structure
 of the IR.  E.g. an if branch could look like:

 ifbr i1 cond, label iftrue, label iffalse, label join

 Where both branches are guaranteed to converge at join.  Sure, this
 will require fixing many assumptions, but on the one hand it's not
 immediately required (as we can address this problem for the time being
 using the same solution AMD uses) and on the other hand

Re: [Mesa-dev] [RFC PATCH 00/16] A new IR for Mesa

On Wed, Aug 20, 2014 at 2:41 AM, Michel Dänzer mic...@daenzer.net wrote:
 On 20.08.2014 00:04, Connor Abbott wrote:
 On Mon, Aug 18, 2014 at 8:52 PM, Michel Dänzer mic...@daenzer.net wrote:
 On 19.08.2014 01:28, Connor Abbott wrote:
 On Mon, Aug 18, 2014 at 4:32 AM, Michel Dänzer mic...@daenzer.net wrote:
 On 16.08.2014 09:12, Connor Abbott wrote:
 I know what you might be thinking right now. Wait, *another* IR? Don't
 we already have like 5 of those, not counting all the driver-specific
 ones? Isn't this stuff complicated enough already? Well, there are some
 pretty good reasons to start afresh (again...). In the years we've been
 using GLSL IR, we've come to realize that, in fact, it's not what we
 want *at all* to do optimizations on.

 Did you evaluate using LLVM IR instead of inventing yet another one?

 Yes. See

 http://lists.freedesktop.org/archives/mesa-dev/2014-February/053502.html

 and

 http://lists.freedesktop.org/archives/mesa-dev/2014-February/053522.html

 I know Ian can't deal with LLVM for some reason. I was wondering if
 *you* evaluated it, and if so, why you rejected it.

 First of all, thank you for sharing more specific information than
 'table-flipping rage'.


 * LLVM is on a different release schedule (6 months vs. 3 months), has
 a different review process, etc., which means that to add support for
 new functionality that involves shaders, we now have to submit patches
 to two separate projects, and then 2 months later when we ship Mesa it
 turns out that nobody can actually use the new feature because it
 depends upon an unreleased version of LLVM that won't be released for
 another 3 months and then packaged by distros even later...

 This has indeed been frustrating at times, but it's better now for
 backend changes since Tom has been making LLVM point releases.

 As for the GLSL frontend, I agree with Tom that it shouldn't require
 that much direct interaction with the LLVM project.


 we've already had problems where distros refused to ship newer Mesa
 releases because radeon depended on a version of LLVM newer than the
 one they were shipping, [...]

 That's news to me, can you be more specific?

 That sounds like basically a distro issue though, since different LLVM
 versions can be installed in parallel (and the one used by default
 doesn't have to be the newest one). And it even works if another part of
 the same process uses a different version of LLVM.

Sorry, I heard about this from one of the other Intel folks (I believe
Ian) so they'll have to comment more on it.



 --
 Earthling Michel Dänzer|  http://www.amd.com
 Libre software enthusiast  |Mesa and X developer

Re: [Mesa-dev] [RFC PATCH 00/16] A new IR for Mesa

2014-08-20 Thread Christian König


On 20.08.2014 at 14:33, Connor Abbott wrote:

On Tue, Aug 19, 2014 at 11:57 PM, Christian König
deathsim...@vodafone.de wrote:

I think we can fix this by introducing new structured variants of the
branch instruction in a way that doesn't alter the fundamental structure
of the IR.  E.g. an if branch could look like:

ifbr i1 cond, label iftrue, label iffalse, label join

Where both branches are guaranteed to converge at join.  Sure, this
will require fixing many assumptions, but on the one hand it's not
immediately required (as we can address this problem for the time being
using the same solution AMD uses) and on the other hand it's still less
work than starting from scratch.
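
As a rough illustration of what such a structured branch would have to
promise, the sketch below models the proposed `ifbr` as a plain record and
checks a necessary condition: the join block must be reachable from both
arms. Everything here is hypothetical (`block`, `ifbr`, `reaches`,
`join_is_plausible`); LLVM has no such instruction, and reachability is
weaker than the convergence guarantee described above, so a real verifier
would need more than this.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical basic block: only its successor edges are modelled.
struct block {
   std::vector<const block *> succs;
};

// Hypothetical record for the proposed structured branch:
//   ifbr i1 cond, label iftrue, label iffalse, label join
struct ifbr {
   const block *iftrue;
   const block *iffalse;
   const block *join;
};

// Depth-first search: is `target` reachable from `from`?
static bool reaches(const block *from, const block *target)
{
   std::set<const block *> seen;
   std::vector<const block *> work{from};
   while (!work.empty()) {
      const block *b = work.back();
      work.pop_back();
      if (b == target)
         return true;
      if (!seen.insert(b).second)
         continue;   // already visited
      for (const block *s : b->succs)
         work.push_back(s);
   }
   return false;
}

// Necessary (not sufficient) condition for a well-formed ifbr.
static bool join_is_plausible(const ifbr &br)
{
   return reaches(br.iftrue, br.join) && reaches(br.iffalse, br.join);
}
```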

Well, I wrote the structurizer pass in LLVM you are talking about here,
and from my experience you really don't want any structured form of control
flow in the IR.

Structured control flow is just a specialized form of unstructured control
flow, and even if it looks rather awkward at first glance, it is indeed
simpler to destructurize the compiler-generated control flow for
optimization and structurize again for instruction selection.

That's interesting. I still think that with the right infrastructure,
having structured control flow really isn't that bad, and it prevents
optimizations from doing work like optimizing if (foo) { break; }
into a single conditional branch when clearly that's not very
productive. I would suspect that LLVM just isn't very good at
structured control flow since it wasn't designed that way, and that's
why it seems hard to work with.


Well, maybe I should note that a lot of closed source drivers are using
LLVM for their internal IR representation, and as far as I know they
more or less all have a rather structured form of control flow.

The problem with LLVM really isn't its IR, because it's not designed to
be CPU centric like you obviously think, but rather that LLVM doesn't
have a stable interface and is a rather fast moving project.

Actually, for R600 for example, you do want to optimize a pattern like
"if (foo) { break; }" into a conditional branch, because if you look at the
ISA you see that the LOOP_BREAK pattern is able to take an additional
condition to apply to the current execution mask.

When you design a hardware independent IR, looking at the backend
hardware level like you do right now is actually the completely wrong
approach. What you need to do is make the IR as simple as possible and
then allow specialized operations on it to translate it into the
desired machine code.

In other words, the logic necessary for code generation shouldn't be
inside the IR, because then the IR is specialized to this specific
problem. Instead the logic needs to be in the tools that surround the IR.


Regards,
Christian.




The only reason I've annotated the LLVM IR with specialized intrinsics for
the SI backend was laziness and I wouldn't do that again given the chance.

And it's very likely that these backends, which probably aren't using
SSA due to the aforementioned difficulties, will also benefit from
having modifiers already folded for them - this is something that's
already a problem for i965 vec4 backend and that NIR will help a lot.

Well, I have the impression that much of the reason why the i965 vec4
backend has lagged behind so much in comparison with the fs backend is
precisely because it's so annoying to optimize vec4 code.  It seems
painful to me that you have this built into the core instruction set so
generic optimization passes will have to be explicitly aware of it.  I
wouldn't be surprised if the i965 vec4 benefited at least as much from
scalarizing the code, performing optimizations there, and re-vectorizing
afterwards.

We thought about doing something like that, but I don't think it's
really that much of a burden when it comes to the rest of the IR. Most
of the difficulty of working with a vec4 representation comes from the
fact that instructions can partially update their outputs, and once we
convert to SSA that problem goes away since there are no partial
updates in SSA. Coming out of SSA is where the difficulty lies, but I
still think that's a solvable problem, just a difficult one. Plus,
there's the problem of how to do the vectorization - you could do it
in SSA, but then you still have the hard bit of coming out of SSA and
so you're back to square one, or you could do it once you're out of
SSA but then it's a lot harder to reason about since you're back to
having partial updates.



Completely agree.

Being able to do vectorization in an IR is important, but you shouldn't try
to handle backend specific swizzle operations and vectorizing restrictions
in the IR. Just look at the swizzle restrictions of R600, for example: I
really can't imagine that you want to represent this in a common IR
shared between all the different drivers.

Regards,
Christian.

On 20.08.2014 at 08:33, Francisco Jerez wrote:

Connor Abbott cwabbo...@gmail.com writes:

On Tue, Aug 19, 2014 at 11:40

Re: [Mesa-dev] EGL_KHR_vg_parent_image extension

On 20/08/14 12:42, Peter Hanzel wrote:
 Hello.
 
 I want to use EGL_KHR_vg_parent_image extension. I did grep on Mesa
 source code but found nothing. Then I looked on google and found this:
 
 http://lists.freedesktop.org/archives/mesa-dev/2010-September/002900.html
 
 Then I looked at mesa cgit and it looks like this was not merged.
 So Mesa does not support the EGL_KHR_vg_parent_image extension?
 

Hi Peter,

Afaics Chia-I requested some trivial changes to the patch, but the original
author never replied back.

Imho everyone is welcome to address the comments and resubmit the patch, even
yourself. Feel free to give it a try ;)

-Emil

 Thanks.

Re: [Mesa-dev] [RFC PATCH 00/16] A new IR for Mesa

Connor Abbott cwabbo...@gmail.com writes:

 On Tue, Aug 19, 2014 at 11:33 PM, Francisco Jerez curroje...@riseup.net 
 wrote:
 Connor Abbott cwabbo...@gmail.com writes:

 On Tue, Aug 19, 2014 at 11:40 AM, Francisco Jerez curroje...@riseup.net 
 wrote:
 Tom Stellard t...@stellard.net writes:

 On Tue, Aug 19, 2014 at 11:04:59AM -0400, Connor Abbott wrote:
 On Mon, Aug 18, 2014 at 8:52 PM, Michel Dänzer mic...@daenzer.net 
 wrote:
  On 19.08.2014 01:28, Connor Abbott wrote:
  On Mon, Aug 18, 2014 at 4:32 AM, Michel Dänzer mic...@daenzer.net 
  wrote:
  On 16.08.2014 09:12, Connor Abbott wrote:
  I know what you might be thinking right now. Wait, *another* IR? 
  Don't
  we already have like 5 of those, not counting all the 
  driver-specific
  ones? Isn't this stuff complicated enough already? Well, there are 
  some
  pretty good reasons to start afresh (again...). In the years we've 
  been
  using GLSL IR, we've come to realize that, in fact, it's not what we
  want *at all* to do optimizations on.
 
  Did you evaluate using LLVM IR instead of inventing yet another one?
 
 
  --
  Earthling Michel Dänzer|  http://www.amd.com
  Libre software enthusiast  |Mesa and X developer
 
  Yes. See
 
  http://lists.freedesktop.org/archives/mesa-dev/2014-February/053502.html
 
  and
 
  http://lists.freedesktop.org/archives/mesa-dev/2014-February/053522.html
 
  I know Ian can't deal with LLVM for some reason. I was wondering if
  *you* evaluated it, and if so, why you rejected it.
 
 
  --
  Earthling Michel Dänzer|  http://www.amd.com
  Libre software enthusiast  |Mesa and X developer


 Well, first of all, the fact that Ian and Ken don't want to use it
 means that any plan to use LLVM for the Intel driver is dead in the
 water anyways - you can translate NIR into LLVM if you want, but for
 i965 we want to share optimizations between our 2 backends (FS and
 vec4) that we can't do today in GLSL IR so this is what we want to use
 for that, and since nobody else does anything with the core GLSL
 compiler except when they have to, when we start moving things out of
 GLSL IR this will probably replace GLSL IR as the infrastructure that
 all Mesa drivers use. But with that in mind, here are a few reasons
 why we wouldn't want to use LLVM:

 * LLVM wasn't built to understand structured CFG's, meaning that you
 need to re-structurize it using a pass that's fragile and prone to
 break if some other pass optimizes the shader in a way that makes it
 non-structured (i.e. not expressible in terms of loops and if
 statements). This loss of information also means that passes that need
 to know things like, for example, the loop nesting depth need to do an
 analysis pass whereas with NIR you can just walk up the control flow
 tree and count the number of loops we hit.


 LLVM has a pass to structurize the CFG.  We use it in the radeon
 drivers, and it is run after all of the other LLVM optimizations which 
 have
 no concept of structured CFG.  It's not bug free, but it works really
 well even with all of the complex OpenCL kernels we throw at it.

 Your point about losing information when the CFG is de-structurized is
 valid, but for things like loop depth, I'm not sure why we couldn't write 
 an
 LLVM analysis pass for this (if one doesn't already exist).


 I don't think this is such a big deal either.  At least the
 structurization pass used on newer AMD hardware isn't fragile in the
 way you seem to imply -- AFAIK (unlike the old AMDIL heuristic
 algorithm) it's guaranteed to give you a valid structurized output no
 matter what the previous optimization passes have done to the CFG,
 modulo bugs.  I admit that the situation is nevertheless suboptimal.
 Ideally this information wouldn't get lost along the way.  For the long
 term we may want to represent structured control flow directly in the IR
 as you say, I just don't see how reinventing the IR saves us any work if
 we could just fix the existing one.

 It seems to me that something like how we represent control flow is a
 pretty fundamental part of the IR - it affects any optimization pass
 that needs to do anything beyond adding and removing instructions. How
 would you fix that, especially given that LLVM is primarily designed
 for CPU's where you don't want to be restricted to structured control
 flow at all? It seems like our goals (preserve the structure) conflict
 with the way LLVM has been designed.

 I think we can fix this by introducing new structured variants of the
 branch instruction in a way that doesn't alter the fundamental structure
 of the IR.  E.g. an if branch could look like:

 ifbr i1 cond, label iftrue, label iffalse, label join

 Where both branches are guaranteed to converge at join.  Sure, this
 will require fixing many assumptions, but on the one hand it's not
 immediately required (as we can address this problem for the time being

Re: [Mesa-dev] [PATCH 3/3] clover: ensure compat::string is \0 terminated

EdB e...@sigluy.net writes:

 Each time you call c_str() the array will grow; maybe you could check if 
 the string is already \0 terminated before adding the terminator.

Nope, that's not how it works.  Every time c_str() is called the size of
the underlying array is forced to at least size-of-the-actual-string +
1, so nothing will happen if the array is already big enough.
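
A minimal sketch of that behaviour, under stated assumptions: this is not
the clover patch. `simple_string` is a hypothetical class, `std::vector`
stands in for clover's own `compat::vector`, and the logical length is
tracked in a separate `len` member so that the terminator slot is always a
valid element of the buffer. The storage is declared `mutable` so that the
const `c_str()` can make room for the terminator when it is called.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical string built by aggregation rather than inheritance.
class simple_string {
public:
   simple_string() : len(0) {}
   simple_string(const char *p) : buf(p, p + std::strlen(p)), len(buf.size()) {}

   std::size_t size() const { return len; }

   void push_back(char c) {
      if (buf.size() <= len)
         buf.resize(len + 1);
      buf[len++] = c;   // may overwrite a terminator left by c_str()
   }

   // Force the storage to hold at least size() + 1 elements, then write
   // the terminator. If an earlier call already grew the buffer, the
   // resize is skipped and only the terminator byte is rewritten.
   const char *c_str() const {
      if (buf.size() < len + 1)
         buf.resize(len + 1);
      buf[len] = '\0';
      return buf.data();
   }

private:
   mutable std::vector<char> buf;   // storage; may hold one extra byte
   std::size_t len;                 // number of characters in the string
};
```

Calling `c_str()` repeatedly leaves `size()` unchanged, which matches the
point made above that nothing happens once the array is big enough.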

 The way we do it, we use twice the memory every time the vector capacity 
 increases (before freeing the old array), as we don't use realloc.
 I understand c_str() should be used for debugging purposes only, but it 
 could be a problem when debugging huge strings.

 Or we can keep compat::string the same and remove c_str(). If someone 
 needs it, they can use the std::string conversion operator and call
 c_str() on the result. In the end, the memory used is the same.


 On 2014-08-18 14:35, Francisco Jerez wrote:
 EdB edb+m...@sigluy.net writes:
 
 otherwise c_str() is not safe
 ---
  src/gallium/state_trackers/clover/util/compat.hpp | 54 
 ---
  1 file changed, 48 insertions(+), 6 deletions(-)
 
 diff --git a/src/gallium/state_trackers/clover/util/compat.hpp 
 b/src/gallium/state_trackers/clover/util/compat.hpp
 index 6f0f7cc..7ca1f85 100644
 --- a/src/gallium/state_trackers/clover/util/compat.hpp
 +++ b/src/gallium/state_trackers/clover/util/compat.hpp
 @@ -197,7 +197,7 @@ namespace clover {
  return _p[i];
   }
 
 -  private:
 +  protected:
   iterator _p;  //memory array
   size_type _s; //size
   size_type _c; //capacity
 @@ -306,18 +306,56 @@ namespace clover {
 
class string : public vector<char> {
public:
 - string() : vector() {
 + string() : vector(0, 1) {
 +_p[_s - 1] = '\0';
   }
 
 - string(const char *p) : vector(p, std::strlen(p)) {
 + string(const char *p) : vector(p, std::strlen(p) + 1) {
 +_p[_s - 1] = '\0';
   }
 
   template<typename C>
 - string(const C &v) : vector(v) {
 + string(const C &v) : vector(&*v.begin(), v.size() + 1) {
 +_p[_s - 1] = '\0';
   }
 
 - operator std::string() const {
 -return std::string(begin(), end());
 + void
 + reserve(size_type m) {
 +vector::reserve(m + 1);
 + }
 +
 + void
 + resize(size_type m, char x = '\0') {
 +vector::resize(m + 1, x);
 +_p[_s - 1] = '\0';
 + }
 +
 + void
 + push_back(char x) {
 +reserve(_s + 1);
 +_p[_s - 1] = x;
 +_p[_s] = '\0';
 +++_s;
 + }
 +
 + size_type
 + size() const {
 +return _s - 1;
 + }
 +
 + size_type
 + capacity() const {
 +return _c - 1;
 + }
 +
 + iterator
 + end() {
 +return _p + size();
 + }
 +
 + const_iterator
 + end() const {
 +return _p + size();
   }
 
 
 At this point where all methods from the base class need to be redefined
 it probably stops making sense to use inheritance instead of
 aggregation.  Once we've done that fixing c_str() gets a lot easier (two
 lines of code) because we can just declare the container as mutable and
 fix up the NULL terminator when c_str() is called.  Both changes
 attached.
 
   const char *
 @@ -325,6 +363,10 @@ namespace clover {
  return begin();
   }
 
 + operator std::string() const {
 +return std::string(begin(), end());
 + }
 +
   const char *
   find(const string &s) const {
  for (size_t i = 0; i + s.size() < size(); ++i) {
 --
 2.0.4
 



Re: [Mesa-dev] [RFC PATCH 00/16] A new IR for Mesa

2014-08-20 Thread Olivier Galibert

And don't forget that explicit vec4 becomes immensely amusing once you
add fp64/double to the problem.

  OG.


On Wed, Aug 20, 2014 at 4:01 PM, Francisco Jerez curroje...@riseup.net wrote:
 Connor Abbott cwabbo...@gmail.com writes:

 On Tue, Aug 19, 2014 at 11:33 PM, Francisco Jerez curroje...@riseup.net 
 wrote:
 Connor Abbott cwabbo...@gmail.com writes:

 On Tue, Aug 19, 2014 at 11:40 AM, Francisco Jerez curroje...@riseup.net 
 wrote:
 Tom Stellard t...@stellard.net writes:

 On Tue, Aug 19, 2014 at 11:04:59AM -0400, Connor Abbott wrote:
 On Mon, Aug 18, 2014 at 8:52 PM, Michel Dänzer mic...@daenzer.net 
 wrote:
  On 19.08.2014 01:28, Connor Abbott wrote:
  On Mon, Aug 18, 2014 at 4:32 AM, Michel Dänzer mic...@daenzer.net 
  wrote:
  On 16.08.2014 09:12, Connor Abbott wrote:
  I know what you might be thinking right now. Wait, *another* IR? 
  Don't
  we already have like 5 of those, not counting all the 
  driver-specific
  ones? Isn't this stuff complicated enough already? Well, there 
  are some
  pretty good reasons to start afresh (again...). In the years we've 
  been
  using GLSL IR, we've come to realize that, in fact, it's not what 
  we
  want *at all* to do optimizations on.
 
  Did you evaluate using LLVM IR instead of inventing yet another one?
 
 
  --
  Earthling Michel Dänzer            |                  http://www.amd.com
  Libre software enthusiast          |                Mesa and X developer
 
  Yes. See
 
  http://lists.freedesktop.org/archives/mesa-dev/2014-February/053502.html
 
  and
 
  http://lists.freedesktop.org/archives/mesa-dev/2014-February/053522.html
 
  I know Ian can't deal with LLVM for some reason. I was wondering if
  *you* evaluated it, and if so, why you rejected it.
 
 
  --
  Earthling Michel Dänzer            |                  http://www.amd.com
  Libre software enthusiast          |                Mesa and X developer


 Well, first of all, the fact that Ian and Ken don't want to use it
 means that any plan to use LLVM for the Intel driver is dead in the
 water anyways - you can translate NIR into LLVM if you want, but for
 i965 we want to share optimizations between our 2 backends (FS and
 vec4) that we can't do today in GLSL IR so this is what we want to use
 for that, and since nobody else does anything with the core GLSL
 compiler except when they have to, when we start moving things out of
 GLSL IR this will probably replace GLSL IR as the infrastructure that
 all Mesa drivers use. But with that in mind, here are a few reasons
 why we wouldn't want to use LLVM:

 * LLVM wasn't built to understand structured CFG's, meaning that you
 need to re-structurize it using a pass that's fragile and prone to
 break if some other pass optimizes the shader in a way that makes it
 non-structured (i.e. not expressible in terms of loops and if
 statements). This loss of information also means that passes that need
 to know things like, for example, the loop nesting depth need to do an
 analysis pass whereas with NIR you can just walk up the control flow
 tree and count the number of loops we hit.
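
To illustrate the last point, here is a minimal sketch of counting the loop
nesting depth by walking up a structured control flow tree.  The cf_node type
and loop_depth() function are hypothetical and are not NIR's actual data
structures; the sketch only assumes that every node keeps a pointer to its
parent:

```cpp
// Hypothetical structured control flow node; not NIR's actual types.
enum class cf_kind { function, block, if_stmt, loop };

struct cf_node {
   cf_kind kind;
   const cf_node *parent;  // nullptr at the root of the tree
};

// Count the loops enclosing a node by following the parent chain.
// No separate analysis pass is needed because the nesting is explicit.
unsigned loop_depth(const cf_node *n) {
   unsigned depth = 0;
   for (const cf_node *p = n->parent; p; p = p->parent) {
      if (p->kind == cf_kind::loop)
         ++depth;
   }
   return depth;
}
```

With an unstructured CFG the same number has to be recovered by a loop
analysis over the basic blocks instead.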


 LLVM has a pass to structurize the CFG.  We use it in the radeon
 drivers, and it is run after all of the other LLVM optimizations which 
 have
 no concept of structured CFG.  It's not bug free, but it works really
 well even with all of the complex OpenCL kernels we throw at it.

 Your point about losing information when the CFG is de-structurized is
 valid, but for things like loop depth, I'm not sure why we couldn't 
 write an
 LLVM analysis pass for this (if one doesn't already exist).


 I don't think this is such a big deal either.  At least the
 structurization pass used on newer AMD hardware isn't fragile in the
 way you seem to imply -- AFAIK (unlike the old AMDIL heuristic
 algorithm) it's guaranteed to give you a valid structurized output no
 matter what the previous optimization passes have done to the CFG,
 modulo bugs.  I admit that the situation is nevertheless suboptimal.
 Ideally this information wouldn't get lost along the way.  For the long
 term we may want to represent structured control flow directly in the IR
 as you say, I just don't see how reinventing the IR saves us any work if
 we could just fix the existing one.

 It seems to me that something like how we represent control flow is a
 pretty fundamental part of the IR - it affects any optimization pass
 that needs to do anything beyond adding and removing instructions. How
 would you fix that, especially given that LLVM is primarily designed
 for CPU's where you don't want to be restricted to structured control
 flow at all? It seems like our goals (preserve the structure) conflict
 with the way LLVM has been designed.

 I think we can fix this by introducing new structured variants of the
 branch instruction in a way that doesn't alter the fundamental structure
 of the IR.  E.g. an if branch could look like:

 ifbr i1 cond, label iftrue, label iffalse, label join

 Where both

Re: [Mesa-dev] [RFC PATCH 00/16] A new IR for Mesa

On Wed, Aug 20, 2014 at 5:57 AM, Christian König
deathsim...@vodafone.de wrote:
 Am 20.08.2014 um 14:33 schrieb Connor Abbott:

 On Tue, Aug 19, 2014 at 11:57 PM, Christian König
 deathsim...@vodafone.de wrote:

 I think we can fix this by introducing new structured variants of the
 branch instruction in a way that doesn't alter the fundamental structure
 of the IR.  E.g. an if branch could look like:

 ifbr i1 cond, label iftrue, label iffalse, label join

 Where both branches are guaranteed to converge at join.  Sure, this
 will require fixing many assumptions, but on the one hand it's not
 immediately required (as we can address this problem for the time being
 using the same solution AMD uses) and on the other hand it's still less
 work than starting from scratch.

 Well, I wrote the structurizer pass in LLVM that you are talking about here,
 and in my experience you really don't want any structured form of control
 flow in the IR.

 Structured control flow is just a specialized form of unstructured control
 flow, and even if it looks rather awkward at first glance, it is indeed
 simpler to destructurize the compiler-generated control flow for
 optimization and structurize it again for instruction selection.

 That's interesting. I still think that with the right infrastructure,
 having structured control flow really isn't that bad, and it prevents
 optimizations from doing work like optimizing if (foo) { break; }
 into a single conditional branch when clearly that's not very
 productive. I would suspect that LLVM just isn't very good at
 structured control flow since it wasn't designed that way, and that's
 why it seems hard to work with.


 Well, maybe I should note that a lot of closed-source drivers use LLVM
 for their internal IR representation, and as far as I know more or less
 all of them have a rather structured form of control flow.

 The problem with LLVM really isn't its IR, which is not designed to be
 CPU-centric as you obviously think, but rather that LLVM doesn't have a
 stable interface and is a rather fast-moving project.

 Actually, for R600, for example, you do want to optimize a pattern like if
 (foo) { break; } into a conditional branch, because if you look at the ISA
 you see that the LOOP_BREAK pattern is able to take an additional condition
 to apply to the current execution mask.

 When you design a hardware-independent IR, looking at the backend hardware
 level like you do right now is actually the completely wrong approach. What
 you need to do is make the IR as simple as possible and then allow
 specialized operations on it to translate it into the desired machine code.

I'm not looking at the backend hardware level here, but at other
languages (in this case D3D bytecode) that support the same thing, and
therefore it's something that the HW probably has/can do efficiently
and something that app developers (especially those translating D3D
bytecode into GLSL, of which there are quite a lot) expect. NIR
obviously doesn't support every HW's strange restrictions on swizzling
and modifiers, backends can do the lowering for that themselves.


 In other words, the logic necessary for code generation shouldn't be inside
 the IR, because then the IR is specialized to this specific problem. Instead,
 the logic needs to be in the tools that surround the IR.

 Regards,
 Christian.

These are all good points, and frankly I don't think it would be too
bad if we switched to LLVM. Unfortunately, though, I think that the
Intel driver won't be using LLVM in the near future, if nothing else
for various non-technical reasons I'm not at liberty to discuss, but
certainly making the switch to a flat SSA-based IR, in addition to
being an improvement over the current state of things, will help us
move closer to LLVM and see if it's something we would want to pursue.

Connor




 The only reason I've annotated the LLVM IR with specialized intrinsics for
 the SI backend was laziness and I wouldn't do that again given the chance.

 And it's very likely that these backends, which probably aren't using
 SSA due to the aforementioned difficulties, will also benefit from
 having modifiers already folded for them - this is something that's
 already a problem for i965 vec4 backend and that NIR will help a lot.

 Well, I have the impression that much of the reason why the i965 vec4
 backend has lagged behind so much in comparison with the fs backend is
 precisely because it's so annoying to optimize vec4 code.  It seems
 painful to me that you have this built into the core instruction set so
 generic optimization passes will have to be explicitly aware of it.  I
 wouldn't be surprised if the i965 vec4 benefited at least as much from
 scalarizing the code, performing optimizations there, and re-vectorizing
 afterwards.

 We thought about doing something like that, but I don't think it's
 really that much of a burden when it comes to the rest of the IR. Most
 of the difficulty of working with a vec4

Re: [Mesa-dev] [PATCH 0/2] kms-swrast: PRIME and missing defines

On 15/08/14 22:32, Andreas Pokorny wrote:
 Hi, 
 
 This adds support for dma_buf fds to kms_swrast. This is especially 
 interesting for DRM-capable drivers like qxl or udl. The former recently 
 gained PRIME support. The second part adds a few defines that weren't set 
 anywhere else, but are necessary for dri_kms_init_screen to be non-empty.
 
Hi Andreas,

When I added dri_kms_init_screen() I deliberately opted out of dma_buf, as it
did not make sense considering the lack of winsys handling. Now it's coming
back to haunt me :)

Do you have any rough numbers on the benefit this brings us?

-Emil

 regards
 Andreas
 
 Andreas Pokorny (2):
   kms-swrast: Support Prime fd handling
   kms-swrast: defines missing to build kms-swrast
 
  src/gallium/state_trackers/dri/Makefile.am|  5 ++
  src/gallium/state_trackers/dri/dri2.c |  8 ++
  src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c | 91 
 +++
  3 files changed, 91 insertions(+), 13 deletions(-)
 


Re: [Mesa-dev] [PATCH 2/2] kms-swrast: defines missing to build kms-swrast

I pushed a similar patch (commit 16873a6e62e) a couple of days before
your post. AFAICS it should already cover this case?

-Emil

On 15/08/14 22:32, Andreas Pokorny wrote:
 ---
  src/gallium/state_trackers/dri/Makefile.am | 5 +
  1 file changed, 5 insertions(+)
 
 diff --git a/src/gallium/state_trackers/dri/Makefile.am 
 b/src/gallium/state_trackers/dri/Makefile.am
 index bda75c3..bcbd081 100644
 --- a/src/gallium/state_trackers/dri/Makefile.am
 +++ b/src/gallium/state_trackers/dri/Makefile.am
 @@ -26,6 +26,7 @@ include $(top_srcdir)/src/gallium/Automake.inc
  
  AM_CPPFLAGS = \
   $(GALLIUM_PIPE_LOADER_DEFINES) \
 + -DDRI_TARGET \
   -DPIPE_SEARCH_DIR=\"$(libdir)/gallium-pipe\" \
   -I$(top_srcdir)/include \
   -I$(top_srcdir)/src/mapi \
 @@ -37,6 +38,10 @@ AM_CPPFLAGS = \
   $(LIBDRM_CFLAGS) \
   $(VISIBILITY_CFLAGS)
  
 +if HAVE_GALLIUM_SOFTPIPE
 +AM_CPPFLAGS += \
 + -DGALLIUM_SOFTPIPE
 +endif
  if HAVE_GALLIUM_STATIC_TARGETS
  AM_CPPFLAGS += \
   -DGALLIUM_STATIC_TARGETS=1
 


Re: [Mesa-dev] [PATCH 1/2] kms-swrast: Support Prime fd handling

On 15/08/14 22:32, Andreas Pokorny wrote:
 Allows using prime fds as display target and from display target.
 Test for PRIME capability after initializing kms_swrast screen.
 
 Signed-off-by: Andreas Pokorny andreas.poko...@canonical.com
 ---
  src/gallium/state_trackers/dri/dri2.c |  8 ++
  src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c | 91 
 +++
  2 files changed, 86 insertions(+), 13 deletions(-)
 
 diff --git a/src/gallium/state_trackers/dri/dri2.c 
 b/src/gallium/state_trackers/dri/dri2.c
 index c466de7..e52bd71 100644
 --- a/src/gallium/state_trackers/dri/dri2.c
 +++ b/src/gallium/state_trackers/dri/dri2.c
 @@ -1327,6 +1327,7 @@ dri_kms_init_screen(__DRIscreen * sPriv)
 const __DRIconfig **configs;
 struct dri_screen *screen;
 struct pipe_screen *pscreen = NULL;
 +   uint64_t cap;
  
 screen = CALLOC_STRUCT(dri_screen);
 if (!screen)
 @@ -1338,6 +1339,13 @@ dri_kms_init_screen(__DRIscreen * sPriv)
    sPriv->driverPrivate = (void *)screen;
  
    pscreen = kms_swrast_create_screen(screen->fd);
 +
 +   if (drmGetCap(sPriv->fd, DRM_CAP_PRIME, &cap) == 0 &&
 +       (cap & DRM_PRIME_CAP_IMPORT)) {
 +      dri2ImageExtension.createImageFromFds = dri2_from_fds;
 +      dri2ImageExtension.createImageFromDmaBufs = dri2_from_dma_bufs;
 +   }
 +
    sPriv->extensions = dri_screen_extensions;
  
 /* dri_init_screen_helper checks pscreen for us */
 diff --git a/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c 
 b/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c
 index c9934bb..7246ffc 100644
 --- a/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c
 +++ b/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c
 @@ -38,6 +38,7 @@
  #include <sys/mman.h>
  #include <unistd.h>
  #include <dlfcn.h>
 +#include <fcntl.h>
  #include <xf86drm.h>
  
  #include "pipe/p_compiler.h"
 @@ -210,6 +211,38 @@ kms_sw_displaytarget_map(struct sw_winsys *ws,
     return kms_sw_dt->mapped;
  }
  
 +static struct kms_sw_displaytarget *
 +kms_sw_displaytarget_add_from_prime(struct kms_sw_winsys *kms_sw, int fd)
 +{
 +   uint32_t handle;
 +   struct kms_sw_displaytarget * kms_sw_dt;
 +   int ret;
 +
 +   ret = drmPrimeFDToHandle(kms_sw->fd, fd, &handle);
 +
 +   if (ret)
 +      return NULL;
 +
 +   kms_sw_dt = CALLOC_STRUCT(kms_sw_displaytarget);
 +   if (!kms_sw_dt)
 +      return NULL;
 +
 +   kms_sw_dt->ref_count = 1;
 +   kms_sw_dt->handle = handle;
 +   kms_sw_dt->size = lseek(fd, 0, SEEK_END);
 +
 +   if (kms_sw_dt->size == (off_t)-1) {
 +      FREE(kms_sw_dt);
 +      return NULL;
 +   }
 +
 +   lseek(fd, 0, SEEK_SET);
 +
 +   list_add(&kms_sw_dt->link, &kms_sw->bo_list);
 +
 +   return kms_sw_dt;
 +}
 +
  static void
  kms_sw_displaytarget_unmap(struct sw_winsys *ws,
 struct sw_displaytarget *dt)
 @@ -231,17 +264,38 @@ kms_sw_displaytarget_from_handle(struct sw_winsys *ws,
 struct kms_sw_winsys *kms_sw = kms_sw_winsys(ws);
 struct kms_sw_displaytarget *kms_sw_dt;
  
 -   assert(whandle->type == DRM_API_HANDLE_TYPE_KMS);
 -
 -   LIST_FOR_EACH_ENTRY(kms_sw_dt, &kms_sw->bo_list, link) {
 -      if (kms_sw_dt->handle == whandle->handle) {
 -         kms_sw_dt->ref_count++;
 -
 -         DEBUG("KMS-DEBUG: imported buffer %u (size %u)\n", kms_sw_dt->handle, kms_sw_dt->size);
 -
 -         *stride = kms_sw_dt->stride;
 +   assert(whandle->type == DRM_API_HANDLE_TYPE_KMS ||
 +          whandle->type == DRM_API_HANDLE_TYPE_FD);
 +
 +   switch(whandle->type) {
 +   case DRM_API_HANDLE_TYPE_FD:
 +      {
 +         kms_sw_dt = kms_sw_displaytarget_add_from_prime(kms_sw, whandle->handle);
 +         if (kms_sw_dt) {
 +            kms_sw_dt->ref_count++;
 +            kms_sw_dt->width = templ->width0;
 +            kms_sw_dt->height = templ->height0;
 +            if (kms_sw_dt->height)
 +               kms_sw_dt->stride = kms_sw_dt->size/kms_sw_dt->height;
 +            *stride = kms_sw_dt->stride;
 +         }
           return (struct sw_displaytarget *)kms_sw_dt;
        }
 +   case DRM_API_HANDLE_TYPE_KMS:
 +      {
 +         LIST_FOR_EACH_ENTRY(kms_sw_dt, &kms_sw->bo_list, link) {
 +            if (kms_sw_dt->handle == whandle->handle) {
 +               kms_sw_dt->ref_count++;
 +
 +               DEBUG("KMS-DEBUG: imported buffer %u (size %u)\n", kms_sw_dt->handle, kms_sw_dt->size);
 +
 +               *stride = kms_sw_dt->stride;
 +               return (struct sw_displaytarget *)kms_sw_dt;
 +            }
 +         }
 +      }
 +   default:
 +   break;
Please format the switch so that it matches the one below.

 }
  
 assert(0);
 @@ -253,16 +307,27 @@ kms_sw_displaytarget_get_handle(struct sw_winsys 
 *winsys,
  struct sw_displaytarget *dt,
  struct winsys_handle *whandle)
  {
 +   struct kms_sw_winsys *kms_sw = kms_sw_winsys(winsys);
 struct kms_sw_displaytarget *kms_sw_dt = kms_sw_displaytarget(dt);
  
 -   if (whandle->type == DRM_API_HANDLE_TYPE_KMS) {
 +   switch(whandle->type) {
 +   case

Re: [Mesa-dev] [PATCH 2/2] mesa/st: add ARB_texture_view support

I didn't look at it that closely, but I'm pretty surprised this really
works. One thing ARB_texture_view can do is cast cube maps (and cube
map arrays) to 2d arrays and vice versa (also 1d/2d to the respective
array type), and we cannot express that in sampler views (yet) (we can't
express it in surfaces either, but there it should not matter). Which
means the type used in the shader for sampling will not match the
sampler view, which sounds quite broken to me.

Roland

Am 20.08.2014 08:45, schrieb Ilia Mirkin:
 Signed-off-by: Ilia Mirkin imir...@alum.mit.edu
 ---
 
 No piglit regressions on nvc0 except for gl-3.0-render-integer, which appears
 to now fail even without this commit, despite the fact that I'm fairly sure it
 used to work fine. Same failure with llvmpipe...
 
 It's most likely that I've missed some details. It's unclear whether
 e.g. glGenerateMipmap should work on a view. However the piglits that exist do
 all pass on nvc0 and llvmpipe.
 
  docs/GL3.txt |  2 +-
  docs/relnotes/10.3.html  |  1 +
  src/mesa/state_tracker/st_atom_texture.c | 28 +++
  src/mesa/state_tracker/st_cb_fbo.c   | 10 ++
  src/mesa/state_tracker/st_cb_texture.c   | 62 
 +++-
  src/mesa/state_tracker/st_extensions.c   |  1 +
  src/mesa/state_tracker/st_format.c   |  5 +--
  src/mesa/state_tracker/st_texture.c  | 15 ++--
  8 files changed, 105 insertions(+), 19 deletions(-)
 
 diff --git a/docs/GL3.txt b/docs/GL3.txt
 index 76412c3..5b25865 100644
 --- a/docs/GL3.txt
 +++ b/docs/GL3.txt
 @@ -166,7 +166,7 @@ GL 4.3, GLSL 4.30:
GL_ARB_texture_buffer_range  DONE (nv50, nvc0, 
 i965, r600, radeonsi)
GL_ARB_texture_query_levels  DONE (all drivers 
 that support GLSL 1.30)
GL_ARB_texture_storage_multisample   DONE (all drivers 
 that support GL_ARB_texture_multisample)
 -  GL_ARB_texture_view  DONE (i965)
 +  GL_ARB_texture_view  DONE (i965, nv30, 
 nv50, nvc0, r300, r600, radeonsi, llvmpipe, softpipe)
GL_ARB_vertex_attrib_binding DONE (all drivers)
  
  
 diff --git a/docs/relnotes/10.3.html b/docs/relnotes/10.3.html
 index fa4ea23..852aec9 100644
 --- a/docs/relnotes/10.3.html
 +++ b/docs/relnotes/10.3.html
 @@ -63,6 +63,7 @@ Note: some of the new features are only available with 
 certain drivers.
  <li>GL_ARB_texture_gather on r600, radeonsi</li>
  <li>GL_ARB_texture_query_levels on nv50, nvc0, llvmpipe, r600, radeonsi, softpipe</li>
  <li>GL_ARB_texture_query_lod on r600, radeonsi</li>
 +<li>GL_ARB_texture_view on nv30, nv50, nvc0, r300, r600, radeonsi, llvmpipe, softpipe</li>
  <li>GL_ARB_viewport_array on nvc0</li>
  <li>GL_AMD_vertex_shader_viewport_index on i965/gen7+, r600</li>
  <li>GL_OES_compressed_ETC1_RGB8_texture on nv30, nv50, nvc0, r300, r600, radeonsi, softpipe, llvmpipe</li>
 diff --git a/src/mesa/state_tracker/st_atom_texture.c 
 b/src/mesa/state_tracker/st_atom_texture.c
 index 03d0593..8f62494 100644
 --- a/src/mesa/state_tracker/st_atom_texture.c
 +++ b/src/mesa/state_tracker/st_atom_texture.c
 @@ -192,9 +192,9 @@ get_texture_format_swizzle(const struct st_texture_object 
 *stObj)
     return swizzle_swizzle(stObj->base._Swizzle, tex_swizzle);
  }
  
 -
 +
  /**
 - * Return TRUE if the texture's sampler view swizzle is equal to
 + * Return TRUE if the texture's sampler view swizzle is not equal to
   * the texture's swizzle.
   *
   * \param stObj  the st texture object,
 @@ -214,9 +214,20 @@ check_sampler_swizzle(const struct st_texture_object 
 *stObj,
  
  static unsigned last_level(struct st_texture_object *stObj)
  {
 -   return MIN2(stObj->base._MaxLevel, stObj->pt->last_level);
 +   unsigned ret = MIN2(stObj->base.MinLevel + stObj->base._MaxLevel,
 +                       stObj->pt->last_level);
 +   if (stObj->base.Immutable)
 +      ret = MIN2(ret, stObj->base.MinLevel + stObj->base.NumLevels - 1);
 +   return ret;
  }
  
 +static unsigned last_layer(struct st_texture_object *stObj)
 +{
 +   if (stObj->base.Immutable)
 +      return MIN2(stObj->base.MinLayer + stObj->base.NumLayers - 1,
 +                  stObj->pt->array_size - 1);
 +   return stObj->pt->array_size - 1;
 +}
  
  static struct pipe_sampler_view *
  st_create_texture_sampler_view_from_stobj(struct pipe_context *pipe,
 @@ -249,9 +260,12 @@ st_create_texture_sampler_view_from_stobj(struct 
 pipe_context *pipe,
templ.u.buf.first_element = f;
templ.u.buf.last_element  = f + (n - 1);
 } else {
 -      templ.u.tex.first_level = stObj->base.BaseLevel;
 +      templ.u.tex.first_level = stObj->base.MinLevel + stObj->base.BaseLevel;
        templ.u.tex.last_level = last_level(stObj);
        assert(templ.u.tex.first_level <= templ.u.tex.last_level);
 +      templ.u.tex.first_layer = stObj->base.MinLayer;
 +      templ.u.tex.last_layer = last_layer(stObj);
 +

Re: [Mesa-dev] [RFC PATCH 00/16] A new IR for Mesa

On Wed, Aug 20, 2014 at 7:01 AM, Francisco Jerez curroje...@riseup.net wrote:
 Connor Abbott cwabbo...@gmail.com writes:

 On Tue, Aug 19, 2014 at 11:33 PM, Francisco Jerez curroje...@riseup.net 
 wrote:
 Connor Abbott cwabbo...@gmail.com writes:

 On Tue, Aug 19, 2014 at 11:40 AM, Francisco Jerez curroje...@riseup.net 
 wrote:
 Tom Stellard t...@stellard.net writes:

 On Tue, Aug 19, 2014 at 11:04:59AM -0400, Connor Abbott wrote:
 On Mon, Aug 18, 2014 at 8:52 PM, Michel Dänzer mic...@daenzer.net 
 wrote:
  On 19.08.2014 01:28, Connor Abbott wrote:
  On Mon, Aug 18, 2014 at 4:32 AM, Michel Dänzer mic...@daenzer.net 
  wrote:
  On 16.08.2014 09:12, Connor Abbott wrote:
  I know what you might be thinking right now. Wait, *another* IR? 
  Don't
  we already have like 5 of those, not counting all the 
  driver-specific
  ones? Isn't this stuff complicated enough already? Well, there 
  are some
  pretty good reasons to start afresh (again...). In the years we've 
  been
  using GLSL IR, we've come to realize that, in fact, it's not what 
  we
  want *at all* to do optimizations on.
 
  Did you evaluate using LLVM IR instead of inventing yet another one?
 
 
  --
  Earthling Michel Dänzer            |                  http://www.amd.com
  Libre software enthusiast          |                Mesa and X developer
 
  Yes. See
 
  http://lists.freedesktop.org/archives/mesa-dev/2014-February/053502.html
 
  and
 
  http://lists.freedesktop.org/archives/mesa-dev/2014-February/053522.html
 
  I know Ian can't deal with LLVM for some reason. I was wondering if
  *you* evaluated it, and if so, why you rejected it.
 
 
  --
  Earthling Michel Dänzer            |                  http://www.amd.com
  Libre software enthusiast          |                Mesa and X developer


 Well, first of all, the fact that Ian and Ken don't want to use it
 means that any plan to use LLVM for the Intel driver is dead in the
 water anyways - you can translate NIR into LLVM if you want, but for
 i965 we want to share optimizations between our 2 backends (FS and
 vec4) that we can't do today in GLSL IR so this is what we want to use
 for that, and since nobody else does anything with the core GLSL
 compiler except when they have to, when we start moving things out of
 GLSL IR this will probably replace GLSL IR as the infrastructure that
 all Mesa drivers use. But with that in mind, here are a few reasons
 why we wouldn't want to use LLVM:

 * LLVM wasn't built to understand structured CFG's, meaning that you
 need to re-structurize it using a pass that's fragile and prone to
 break if some other pass optimizes the shader in a way that makes it
 non-structured (i.e. not expressible in terms of loops and if
 statements). This loss of information also means that passes that need
 to know things like, for example, the loop nesting depth need to do an
 analysis pass whereas with NIR you can just walk up the control flow
 tree and count the number of loops we hit.


 LLVM has a pass to structurize the CFG.  We use it in the radeon
 drivers, and it is run after all of the other LLVM optimizations which 
 have
 no concept of structured CFG.  It's not bug free, but it works really
 well even with all of the complex OpenCL kernels we throw at it.

 Your point about losing information when the CFG is de-structurized is
 valid, but for things like loop depth, I'm not sure why we couldn't 
 write an
 LLVM analysis pass for this (if one doesn't already exist).


 I don't think this is such a big deal either.  At least the
 structurization pass used on newer AMD hardware isn't fragile in the
 way you seem to imply -- AFAIK (unlike the old AMDIL heuristic
 algorithm) it's guaranteed to give you a valid structurized output no
 matter what the previous optimization passes have done to the CFG,
 modulo bugs.  I admit that the situation is nevertheless suboptimal.
 Ideally this information wouldn't get lost along the way.  For the long
 term we may want to represent structured control flow directly in the IR
 as you say, I just don't see how reinventing the IR saves us any work if
 we could just fix the existing one.

 It seems to me that something like how we represent control flow is a
 pretty fundamental part of the IR - it affects any optimization pass
 that needs to do anything beyond adding and removing instructions. How
 would you fix that, especially given that LLVM is primarily designed
 for CPU's where you don't want to be restricted to structured control
 flow at all? It seems like our goals (preserve the structure) conflict
 with the way LLVM has been designed.

 I think we can fix this by introducing new structured variants of the
 branch instruction in a way that doesn't alter the fundamental structure
 of the IR.  E.g. an if branch could look like:

 ifbr i1 cond, label iftrue, label iffalse, label join

 Where both branches are guaranteed to converge at join.  Sure, this
 will require fixing many assumptions, but on the one hand

Re: [Mesa-dev] [PATCH 2/2] mesa/st: add ARB_texture_view support

Hm, it's not tested. And you're right, that would (most likely) mess
up, since it would only have the pipe_resource's target. Any
suggestions on how to fix it? Should the target be added to
pipe_sampler_view?

On Wed, Aug 20, 2014 at 11:25 AM, Roland Scheidegger srol...@vmware.com wrote:
 I didn't look at it that closely, but I'm pretty surprised this really
 works. One thing ARB_texture_view can do is cast cube maps (and cube
 map arrays) to 2d arrays and vice versa (also 1d/2d to the respective
 array type), and we cannot express that in sampler views (yet) (we can't
 express it in surfaces either, but there it should not matter). Which
 means the type used in the shader for sampling will not match the
 sampler view, which sounds quite broken to me.

 Roland

 Am 20.08.2014 08:45, schrieb Ilia Mirkin:
 Signed-off-by: Ilia Mirkin imir...@alum.mit.edu
 ---

 No piglit regressions on nvc0 except for gl-3.0-render-integer, which appears
 to now fail even without this commit, despite the fact that I'm fairly sure 
 it
 used to work fine. Same failure with llvmpipe...

 It's most likely that I've missed some details. It's unclear whether
 e.g. glGenerateMipmap should work on a view. However the piglits that exist 
 do
 all pass on nvc0 and llvmpipe.

  docs/GL3.txt |  2 +-
  docs/relnotes/10.3.html  |  1 +
  src/mesa/state_tracker/st_atom_texture.c | 28 +++
  src/mesa/state_tracker/st_cb_fbo.c   | 10 ++
  src/mesa/state_tracker/st_cb_texture.c   | 62 
 +++-
  src/mesa/state_tracker/st_extensions.c   |  1 +
  src/mesa/state_tracker/st_format.c   |  5 +--
  src/mesa/state_tracker/st_texture.c  | 15 ++--
  8 files changed, 105 insertions(+), 19 deletions(-)

 diff --git a/docs/GL3.txt b/docs/GL3.txt
 index 76412c3..5b25865 100644
 --- a/docs/GL3.txt
 +++ b/docs/GL3.txt
 @@ -166,7 +166,7 @@ GL 4.3, GLSL 4.30:
GL_ARB_texture_buffer_range  DONE (nv50, nvc0, 
 i965, r600, radeonsi)
GL_ARB_texture_query_levels  DONE (all drivers 
 that support GLSL 1.30)
GL_ARB_texture_storage_multisample   DONE (all drivers 
 that support GL_ARB_texture_multisample)
 -  GL_ARB_texture_view  DONE (i965)
 +  GL_ARB_texture_view  DONE (i965, nv30, 
 nv50, nvc0, r300, r600, radeonsi, llvmpipe, softpipe)
GL_ARB_vertex_attrib_binding DONE (all drivers)


 diff --git a/docs/relnotes/10.3.html b/docs/relnotes/10.3.html
 index fa4ea23..852aec9 100644
 --- a/docs/relnotes/10.3.html
 +++ b/docs/relnotes/10.3.html
 @@ -63,6 +63,7 @@ Note: some of the new features are only available with 
 certain drivers.
  <li>GL_ARB_texture_gather on r600, radeonsi</li>
  <li>GL_ARB_texture_query_levels on nv50, nvc0, llvmpipe, r600, radeonsi, softpipe</li>
  <li>GL_ARB_texture_query_lod on r600, radeonsi</li>
 +<li>GL_ARB_texture_view on nv30, nv50, nvc0, r300, r600, radeonsi, llvmpipe, softpipe</li>
  <li>GL_ARB_viewport_array on nvc0</li>
  <li>GL_AMD_vertex_shader_viewport_index on i965/gen7+, r600</li>
  <li>GL_OES_compressed_ETC1_RGB8_texture on nv30, nv50, nvc0, r300, r600, radeonsi, softpipe, llvmpipe</li>
 diff --git a/src/mesa/state_tracker/st_atom_texture.c b/src/mesa/state_tracker/st_atom_texture.c
 index 03d0593..8f62494 100644
 --- a/src/mesa/state_tracker/st_atom_texture.c
 +++ b/src/mesa/state_tracker/st_atom_texture.c
 @@ -192,9 +192,9 @@ get_texture_format_swizzle(const struct st_texture_object *stObj)
     return swizzle_swizzle(stObj->base._Swizzle, tex_swizzle);
  }

 -
 +
  /**
 - * Return TRUE if the texture's sampler view swizzle is equal to
 + * Return TRUE if the texture's sampler view swizzle is not equal to
   * the texture's swizzle.
   *
   * \param stObj  the st texture object,
 @@ -214,9 +214,20 @@ check_sampler_swizzle(const struct st_texture_object *stObj,

  static unsigned last_level(struct st_texture_object *stObj)
  {
 -   return MIN2(stObj->base._MaxLevel, stObj->pt->last_level);
 +   unsigned ret = MIN2(stObj->base.MinLevel + stObj->base._MaxLevel,
 +                       stObj->pt->last_level);
 +   if (stObj->base.Immutable)
 +      ret = MIN2(ret, stObj->base.MinLevel + stObj->base.NumLevels - 1);
 +   return ret;
  }

 +static unsigned last_layer(struct st_texture_object *stObj)
 +{
 +   if (stObj->base.Immutable)
 +      return MIN2(stObj->base.MinLayer + stObj->base.NumLayers - 1,
 +                  stObj->pt->array_size - 1);
 +   return stObj->pt->array_size - 1;
 +}

  static struct pipe_sampler_view *
  st_create_texture_sampler_view_from_stobj(struct pipe_context *pipe,
 @@ -249,9 +260,12 @@ st_create_texture_sampler_view_from_stobj(struct pipe_context *pipe,
        templ.u.buf.first_element = f;
        templ.u.buf.last_element  = f + (n - 1);
     } else {
 -      templ.u.tex.first_level = stObj->base.BaseLevel;
 +      templ.u.tex.first_level = stObj->base.MinLevel +

[Mesa-dev] [PATCH] r600g: Fix flat/smooth shade state toggle

If only the flat/smooth shade state changed between
two calls the prior code would miss updating the
hardware state.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=81967
Signed-off-by: Glenn Kennard glenn.kenn...@gmail.com
---
Tested on radeon 6670, no piglit regressions

 src/gallium/drivers/r600/evergreen_state.c   | 2 --
 src/gallium/drivers/r600/r600_shader.h   | 2 +-
 src/gallium/drivers/r600/r600_state.c| 2 --
 src/gallium/drivers/r600/r600_state_common.c | 6 +++---
 4 files changed, 4 insertions(+), 8 deletions(-)

diff --git a/src/gallium/drivers/r600/evergreen_state.c b/src/gallium/drivers/r600/evergreen_state.c
index 841ad0c..b490145 100644
--- a/src/gallium/drivers/r600/evergreen_state.c
+++ b/src/gallium/drivers/r600/evergreen_state.c
@@ -2927,8 +2927,6 @@ void evergreen_update_ps_state(struct pipe_context *ctx, struct r600_pipe_shader
 	shader->ps_depth_export = z_export | stencil_export;
 
 	shader->sprite_coord_enable = sprite_coord_enable;
-	if (rctx->rasterizer)
-		shader->flatshade = rctx->rasterizer->flatshade;
 }
 
 void evergreen_update_es_state(struct pipe_context *ctx, struct r600_pipe_shader *shader)
diff --git a/src/gallium/drivers/r600/r600_shader.h b/src/gallium/drivers/r600/r600_shader.h
index d6db8f0..8b32966 100644
--- a/src/gallium/drivers/r600/r600_shader.h
+++ b/src/gallium/drivers/r600/r600_shader.h
@@ -89,6 +89,7 @@ struct r600_shader_key {
 	unsigned alpha_to_one:1;
 	unsigned nr_cbufs:4;
 	unsigned vs_as_es:1;
+	unsigned flatshade:1;
 };
 
 struct r600_shader_array {
@@ -106,7 +107,6 @@ struct r600_pipe_shader {
 	struct r600_command_buffer command_buffer; /* register writes */
 	struct r600_resource	*bo;
 	unsigned		sprite_coord_enable;
-	unsigned		flatshade;
 	unsigned		pa_cl_vs_out_cntl;
 	unsigned		nr_ps_color_outputs;
 	struct r600_shader_key	key;
diff --git a/src/gallium/drivers/r600/r600_state.c b/src/gallium/drivers/r600/r600_state.c
index 607b199..3f5cb2b 100644
--- a/src/gallium/drivers/r600/r600_state.c
+++ b/src/gallium/drivers/r600/r600_state.c
@@ -2532,8 +2532,6 @@ void r600_update_ps_state(struct pipe_context *ctx, struct r600_pipe_shader *sha
 	shader->ps_depth_export = z_export | stencil_export;
 
 	shader->sprite_coord_enable = sprite_coord_enable;
-	if (rctx->rasterizer)
-		shader->flatshade = rctx->rasterizer->flatshade;
 }
 
 void r600_update_vs_state(struct pipe_context *ctx, struct r600_pipe_shader *shader)
diff --git a/src/gallium/drivers/r600/r600_state_common.c b/src/gallium/drivers/r600/r600_state_common.c
index 7594d0e..d8243d1 100644
--- a/src/gallium/drivers/r600/r600_state_common.c
+++ b/src/gallium/drivers/r600/r600_state_common.c
@@ -699,6 +699,8 @@ static INLINE struct r600_shader_key r600_shader_selector_key(struct pipe_contex
 		/* Dual-source blending only makes sense with nr_cbufs == 1. */
 		if (key.nr_cbufs == 1 && rctx->dual_src_blend)
 			key.nr_cbufs = 2;
+		if (rctx->rasterizer->flatshade)
+			key.flatshade = 1;
 	} else if (sel->type == PIPE_SHADER_VERTEX) {
 		key.vs_as_es = (rctx->gs_shader != NULL);
 	}
@@ -1250,9 +1252,7 @@ static bool r600_update_derived_state(struct r600_context *rctx)
 	}
 
 	if (unlikely(!ps_dirty && rctx->ps_shader && rctx->rasterizer &&
-		((rctx->rasterizer->sprite_coord_enable != rctx->ps_shader->current->sprite_coord_enable) ||
-		(rctx->rasterizer->flatshade != rctx->ps_shader->current->flatshade)))) {
-
+		((rctx->rasterizer->sprite_coord_enable != rctx->ps_shader->current->sprite_coord_enable)))) {
 		if (rctx->b.chip_class >= EVERGREEN)
 			evergreen_update_ps_state(ctx, rctx->ps_shader->current);
 		else
-- 
1.9.1
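
A minimal sketch of why moving flatshade into the shader key closes the gap this patch describes: once the bit is part of the key, toggling flat shading produces a different key, so a different shader variant is looked up (or has to be compiled). The struct and function names below are hypothetical simplifications for illustration, not the r600g code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-in for struct r600_shader_key. */
struct shader_key {
   unsigned nr_cbufs:4;
   unsigned flatshade:1;   /* the bit this patch adds to the key */
};

/* Hypothetical compiled variant, tagged with the key it was built for. */
struct shader_variant {
   struct shader_key key;
   int id;
};

/* Compare field by field so struct padding never affects the result. */
static bool key_equal(const struct shader_key *a, const struct shader_key *b)
{
   return a->nr_cbufs == b->nr_cbufs && a->flatshade == b->flatshade;
}

/* Return the variant built for this key, or NULL if one must be compiled. */
static struct shader_variant *
find_variant(struct shader_variant *variants, size_t count,
             const struct shader_key *key)
{
   assert(key != NULL);
   for (size_t i = 0; i < count; i++) {
      if (key_equal(&variants[i].key, key))
         return &variants[i];
   }
   return NULL;
}
```

With two variants that differ only in flatshade, a key with flatshade set selects the second one; no separate "did flatshade change" comparison is needed in the derived-state update.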


Re: [Mesa-dev] [PATCH 2/2] mesa/st: add ARB_texture_view support

On Wed, Aug 20, 2014 at 11:47 AM, Jose Fonseca jfons...@vmware.com wrote:
 On 20/08/14 16:31, Ilia Mirkin wrote:

 Hm, it's not tested. And you're right, that would (most likely) mess
 up, since it would only have the pipe_resource's target. Any
 suggestions on how to fix it? Should the target be added to
 pipe_sampler_view?

 On Wed, Aug 20, 2014 at 11:25 AM, Roland Scheidegger srol...@vmware.com
 wrote:

 Didn't look at it that closely, but I'm pretty surprised this really
 works. One thing ARB_texture_view can do is cast cube maps (and cube
 map arrays) to 2d arrays and vice versa (also 1d/2d to the respective
 array type), and we cannot express that in sampler views (yet) (we can't
 express it in surfaces either but there it should not matter). Which
 means the type used in the shader for sampling will not match the
 sampler view, which sounds quite broken to me.

 Roland


 Probably the only sane thing to do is eliminate the distinction between
 PIPE_TEXTURE_FOO and PIPE_TEXTURE_FOO_ARRAY like in
 http://msdn.microsoft.com/en-us/library/windows/desktop/ff476202.aspx ,
 e.g.,:

 enum pipe_texture_target {
PIPE_BUFFER   = 0,
PIPE_TEXTURE_1D   = 1,
PIPE_TEXTURE_2D   = 2,
PIPE_TEXTURE_3D   = 3,
PIPE_TEXTURE_CUBE = 4, // Must have same layout as PIPE_TEXTURE_2D
PIPE_TEXTURE_RECT = 5,
PIPE_TEXTURE_1D_ARRAY = PIPE_TEXTURE_1D,
PIPE_TEXTURE_2D_ARRAY = PIPE_TEXTURE_2D,
PIPE_TEXTURE_CUBE_ARRAY = PIPE_TEXTURE_CUBE,
PIPE_MAX_TEXTURE_TYPES
 };


 We could also remove PIPE_TEXTURE_CUBE and have cube-maps be PIPE_TEXTURE_2D
 with a flag, but that's probably a lot of work. Instead, drivers that want
 to be able to support ARB_texture_view will need to ensure
 PIPE_TEXTURE_CUBE/PIPE_TEXTURE_2D layout match.

Another quick + cheap alternative (at least looking at nv50/nvc0 code)
would be to pass a separate target parameter to
->create_sampler_view(). That would be enough for nouveau, but perhaps
not more generally? Take a look at nv50_tex.c:nv50_create_texture_view
-- it also needs to work out the depth of the texture (presumably to
deal with out-of-bounds accesses) and that is written to the texture
info structure.

Anyways, I guess I'll have to add a PIPE_CAP_TEXTURE_VIEW if the
layouts might not be compatible for some drivers? Or is there
something that exists that I should restrict it to?

  -ilia

Re: [Mesa-dev] [PATCH 2/2] mesa/st: add ARB_texture_view support

Am 20.08.2014 17:47, schrieb Jose Fonseca:
 On 20/08/14 16:31, Ilia Mirkin wrote:
 Hm, it's not tested. And you're right, that would (most likely) mess
 up, since it would only have the pipe_resource's target. Any
 suggestions on how to fix it? Should the target be added to
 pipe_sampler_view?

 On Wed, Aug 20, 2014 at 11:25 AM, Roland Scheidegger
 srol...@vmware.com wrote:
 Didn't look at it that closely, but I'm pretty surprised this really
 works. One thing ARB_texture_view can do is cast cube maps (and cube
 map arrays) to 2d arrays and vice versa (also 1d/2d to the respective
 array type), and we cannot express that in sampler views (yet) (we can't
 express it in surfaces either but there it should not matter). Which
 means the type used in the shader for sampling will not match the
 sampler view, which sounds quite broken to me.

 Roland

 
 Probably the only sane thing to do is eliminate the distinction between
 PIPE_TEXTURE_FOO and PIPE_TEXTURE_FOO_ARRAY like in
 http://msdn.microsoft.com/en-us/library/windows/desktop/ff476202.aspx ,
 e.g.,:
 
 enum pipe_texture_target {
PIPE_BUFFER   = 0,
PIPE_TEXTURE_1D   = 1,
PIPE_TEXTURE_2D   = 2,
PIPE_TEXTURE_3D   = 3,
PIPE_TEXTURE_CUBE = 4, // Must have same layout as PIPE_TEXTURE_2D
PIPE_TEXTURE_RECT = 5,
PIPE_TEXTURE_1D_ARRAY = PIPE_TEXTURE_1D,
PIPE_TEXTURE_2D_ARRAY = PIPE_TEXTURE_2D,
PIPE_TEXTURE_CUBE_ARRAY = PIPE_TEXTURE_CUBE,
I think you meant PIPE_TEXTURE_2D there?

PIPE_MAX_TEXTURE_TYPES
 };
 
 
 We could also remove PIPE_TEXTURE_CUBE and have cube-maps be
 PIPE_TEXTURE_2D with a flag, but that's probably a lot of work. Instead,
 drivers that want to be able to support ARB_texture_view will need to
 ensure PIPE_TEXTURE_CUBE/PIPE_TEXTURE_2D layout match.

Yes, but that's not what I'm talking about.
Even d3d10, which does not have any distinct cube/array texture layouts
(actually 10 still had a separate one for cubes, because there was hw
which really required a different layout afaik, but it got abandoned in
10.1), still requires shader resource views to have them (and they must
match what's declared in the shader):
http://msdn.microsoft.com/en-us/library/windows/desktop/ff476211%28v=vs.85%29.aspx
So, my guess is we should do the same - just have that type in the
sampler view (and drivers wishing to support the extension must take the
type from the view, and not the underlying resource - or they could get
it from the shader itself, presumably, if they really wanted, this is
actually what we do for texture size queries in llvmpipe, but it's more
of a necessary hack).

You are right, though, that we would not really require distinct types at
the resource level, but they don't really get in the way either.

Roland
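
A minimal sketch of the idea in this message, assuming the sampler view gains its own target field and drivers read the type from the view rather than from the underlying resource. All type and function names here are hypothetical, cut-down stand-ins, not the gallium structs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical subset of texture targets, for illustration only. */
enum tex_target { TEX_2D, TEX_CUBE, TEX_2D_ARRAY, TEX_CUBE_ARRAY };

/* Hypothetical resource: records how the texture was allocated. */
struct resource {
   enum tex_target target;
   unsigned array_size;
};

/* Hypothetical sampler view carrying the proposed per-view target. */
struct sampler_view {
   const struct resource *texture;
   enum tex_target target;      /* the type the shader samples it as */
   unsigned first_layer;
   unsigned last_layer;
};

/* Fill in a view over a layer range of the resource. */
static void init_view(struct sampler_view *view, const struct resource *res,
                      enum tex_target target,
                      unsigned first_layer, unsigned last_layer)
{
   assert(view != NULL && res != NULL);
   assert(first_layer <= last_layer && last_layer < res->array_size);
   view->texture = res;
   view->target = target;
   view->first_layer = first_layer;
   view->last_layer = last_layer;
}

/* A driver would build its texture descriptor from the view's target,
 * not from view->texture->target. */
static enum tex_target sampling_target(const struct sampler_view *view)
{
   return view->target;
}
```

For a cube map resource viewed as a 2d array over layers 0..5, sampling_target() reports the 2d array type while the resource itself still reports a cube, which is the mismatch the thread is trying to make expressible.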


Re: [Mesa-dev] Mesa: tag mesa-10.2.6: Mesa 10.2.6 release

2014-08-20 Thread Brian Paul


On 08/19/2014 04:18 PM, Carl Worth wrote:

Module: Mesa
Branch: refs/tags/mesa-10.2.6
Tag:1d329590143b4236e8c706b80b6551502f5cb780
URL:
http://cgit.freedesktop.org/mesa/mesa/tag/?id=1d329590143b4236e8c706b80b6551502f5cb780

Tagger: Carl Worth cwo...@cworth.org
Date:   Tue Aug 19 15:17:13 2014 -0700

Mesa 10.2.6 release
___
mesa-commit mailing list
mesa-com...@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-commit



Unfortunately, it looks like 31ce84a81f7166ded07e9cb41e5dfe212dd8fed1 
was not included.  If people complain, we may need a 10.2.7 release 
before two weeks.


-Brian


Re: [Mesa-dev] [PATCH 2/2] mesa/st: add ARB_texture_view support

2014-08-20 Thread Jose Fonseca


On 20/08/14 17:02, Roland Scheidegger wrote:

Am 20.08.2014 17:47, schrieb Jose Fonseca:

On 20/08/14 16:31, Ilia Mirkin wrote:

Hm, it's not tested. And you're right, that would (most likely) mess
up, since it would only have the pipe_resource's target. Any
suggestions on how to fix it? Should the target be added to
pipe_sampler_view?

On Wed, Aug 20, 2014 at 11:25 AM, Roland Scheidegger
srol...@vmware.com wrote:

Didn't look at it that closely, but I'm pretty surprised this really
works. One thing ARB_texture_view can do is cast cube maps (and cube
map arrays) to 2d arrays and vice versa (also 1d/2d to the respective
array type), and we cannot express that in sampler views (yet) (we can't
express it in surfaces either but there it should not matter). Which
means the type used in the shader for sampling will not match the
sampler view, which sounds quite broken to me.

Roland



Probably the only sane thing to do is eliminate the distinction between
PIPE_TEXTURE_FOO and PIPE_TEXTURE_FOO_ARRAY like in
http://msdn.microsoft.com/en-us/library/windows/desktop/ff476202.aspx ,
e.g.,:

enum pipe_texture_target {
PIPE_BUFFER   = 0,
PIPE_TEXTURE_1D   = 1,
PIPE_TEXTURE_2D   = 2,
PIPE_TEXTURE_3D   = 3,
PIPE_TEXTURE_CUBE = 4, // Must have same layout as PIPE_TEXTURE_2D
PIPE_TEXTURE_RECT = 5,
PIPE_TEXTURE_1D_ARRAY = PIPE_TEXTURE_1D,
PIPE_TEXTURE_2D_ARRAY = PIPE_TEXTURE_2D,
PIPE_TEXTURE_CUBE_ARRAY = PIPE_TEXTURE_CUBE,

I think you meant PIPE_TEXTURE_2D there?


No, I explicitly left PIPE_TEXTURE_CUBE for the reasons I explained
below.





PIPE_MAX_TEXTURE_TYPES
};


We could also remove PIPE_TEXTURE_CUBE and have cube-maps be
PIPE_TEXTURE_2D with a flag, but that's probably a lot of work. Instead,
drivers that want to be able to support ARB_texture_view will need to
ensure PIPE_TEXTURE_CUBE/PIPE_TEXTURE_2D layout match.


Yes, but that's not what I'm talking about.
Even d3d10, which does not have any distinct cube/array texture layouts


Precisely.  D3D10 uses D3D10_RESOURCE_DIMENSION_TEXTURE2D for cubes plus
the D3D10_RESOURCE_MISC_TEXTURECUBE misc flag.  Which is precisely what
I was talking about when I said "We could also remove PIPE_TEXTURE_CUBE
and have cube-maps be PIPE_TEXTURE_2D with a flag".



(actually 10 still had a separate one for cubes, because there was hw
which really required a different layout afaik, but it got abandoned in
10.1),


No, D3D10 doesn't have D3D10_RESOURCE_DIMENSION_TEXTURECUBE.  D3D 10,
10.1, and 11 all use RESOURCE_DIMENSION_TEXTURE2D +
RESOURCE_MISC_TEXTURECUBE for cubemaps or cubemap arrays.


 still requires shader resource views to have them (and they must

match what's declared in the shader):
http://msdn.microsoft.com/en-us/library/windows/desktop/ff476211%28v=vs.85%29.aspx


Right, there are different enums for resource types and view types.


So, my guess is we should do the same - just have that type in the
sampler view (and drivers wishing to support the extension must take the
type from the view, and not the underlying resource - or they could get
it from the shader itself, presumably, if they really wanted, this is
actually what we do for texture size queries in llvmpipe, but it's more
of a necessary hack).

You are right, though, that we would not really require distinct types at
the resource level, but they don't really get in the way either.


Yes, we could do the same. But I do think that in that case we should
have a separate enum for views, different from pipe_texture_target.  And
pipe_texture_target could be slimmed down.



But if you remove PIPE_TEXTURE_CUBE from pipe_texture_target you'll need 
to pass that info in a new flag (like D3D10_RESOURCE_MISC_TEXTURECUBE). 
 I don't feel strongly, but I'm not sure this is much more elegant than 
keeping PIPE_TEXTURE_CUBE around.



Jose
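
A minimal sketch of the "separate enum for views" idea discussed in this message, assuming a slimmed-down resource enum without _ARRAY variants and a fuller view enum. The enum names, the helper functions, and the simplified rule that cube and 2d families may alias each other are assumptions for illustration; buffers, multisample targets, and layer-count checks are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical slimmed-down resource targets (no _ARRAY variants). */
enum res_target { RES_1D, RES_2D, RES_3D, RES_CUBE, RES_RECT };

/* Hypothetical view targets: what the shader declares and samples. */
enum view_target {
   VIEW_1D, VIEW_1D_ARRAY,
   VIEW_2D, VIEW_2D_ARRAY,
   VIEW_3D,
   VIEW_CUBE, VIEW_CUBE_ARRAY,
   VIEW_RECT
};

/* Map a view target to the resource family it naturally belongs to. */
static enum res_target view_base(enum view_target view)
{
   switch (view) {
   case VIEW_1D:
   case VIEW_1D_ARRAY:
      return RES_1D;
   case VIEW_2D:
   case VIEW_2D_ARRAY:
      return RES_2D;
   case VIEW_3D:
      return RES_3D;
   case VIEW_CUBE:
   case VIEW_CUBE_ARRAY:
      return RES_CUBE;
   case VIEW_RECT:
      return RES_RECT;
   }
   return RES_2D; /* not reached for valid enum values */
}

/* A view is allowed if it stays within the resource's family, or if it
 * casts between the cube and 2d families (which requires the driver to
 * give both the same memory layout). */
static bool view_allowed(enum res_target res, enum view_target view)
{
   enum res_target base = view_base(view);

   if (base == res)
      return true;
   return (res == RES_2D && base == RES_CUBE) ||
          (res == RES_CUBE && base == RES_2D);
}
```

Under these assumptions a cube resource can be viewed as a 2d array and a 2d resource as a cube array, while a 3d or rectangle resource cannot be viewed as plain 2d.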


Re: [Mesa-dev] [PATCH 2/2] mesa/st: add ARB_texture_view support

Am 20.08.2014 17:55, schrieb Ilia Mirkin:
 On Wed, Aug 20, 2014 at 11:47 AM, Jose Fonseca jfons...@vmware.com wrote:
 On 20/08/14 16:31, Ilia Mirkin wrote:

 Hm, it's not tested. And you're right, that would (most likely) mess
 up, since it would only have the pipe_resource's target. Any
 suggestions on how to fix it? Should the target be added to
 pipe_sampler_view?

 On Wed, Aug 20, 2014 at 11:25 AM, Roland Scheidegger srol...@vmware.com
 wrote:

 Didn't look at it that closely, but I'm pretty surprised this really
 works. One thing ARB_texture_view can do is cast cube maps (and cube
 map arrays) to 2d arrays and vice versa (also 1d/2d to the respective
 array type), and we cannot express that in sampler views (yet) (we can't
 express it in surfaces either but there it should not matter). Which
 means the type used in the shader for sampling will not match the
 sampler view, which sounds quite broken to me.

 Roland


 Probably the only sane thing to do is eliminate the distinction between
 PIPE_TEXTURE_FOO and PIPE_TEXTURE_FOO_ARRAY like in
 http://msdn.microsoft.com/en-us/library/windows/desktop/ff476202.aspx ,
 e.g.,:

 enum pipe_texture_target {
PIPE_BUFFER   = 0,
PIPE_TEXTURE_1D   = 1,
PIPE_TEXTURE_2D   = 2,
PIPE_TEXTURE_3D   = 3,
PIPE_TEXTURE_CUBE = 4, // Must have same layout as PIPE_TEXTURE_2D
PIPE_TEXTURE_RECT = 5,
PIPE_TEXTURE_1D_ARRAY = PIPE_TEXTURE_1D,
PIPE_TEXTURE_2D_ARRAY = PIPE_TEXTURE_2D,
PIPE_TEXTURE_CUBE_ARRAY = PIPE_TEXTURE_CUBE,
PIPE_MAX_TEXTURE_TYPES
 };


 We could also remove PIPE_TEXTURE_CUBE and have cube-maps be PIPE_TEXTURE_2D
 with a flag, but that's probably a lot of work. Instead, drivers that want
 to be able to support ARB_texture_view will need to ensure
 PIPE_TEXTURE_CUBE/PIPE_TEXTURE_2D layout match.
 
 Another quick + cheap alternative (at least looking at nv50/nvc0 code)
 would be to pass a separate target parameter to
 ->create_sampler_view(). That would be enough for nouveau, but perhaps
 not more generally? Take a look at nv50_tex.c:nv50_create_texture_view
 -- it also needs to work out the depth of the texture (presumably to
 deal with out-of-bounds accesses) and that is written to the texture
 info structure.
Well that should be enough, but I don't think it fits our design. We've
encapsulated other override information like the format in the view
already, and I see no reason why the target cast should be treated any
differently.


 
 Anyways, I guess I'll have to add a PIPE_CAP_TEXTURE_VIEW if the
 layouts might not be compatible for some drivers? Or is there
 something that exists that I should restrict it to?
I suspect d3d9-class hw couldn't do it (can r300 access a cube map as a
regular 2d texture when sampling?). Usually it's probably the same hw
which also does not support array textures, but it can be different (IIRC
i965 was one such chipset which really had a different layout for cube
maps and arrays in particular, though it would not apply to anything
that's supported by ilo).

Roland



Re: [Mesa-dev] [PATCH 2/2] mesa/st: add ARB_texture_view support

2014-08-20 Thread Jose Fonseca


On 20/08/14 17:14, Roland Scheidegger wrote:

Am 20.08.2014 17:55, schrieb Ilia Mirkin:

On Wed, Aug 20, 2014 at 11:47 AM, Jose Fonseca jfons...@vmware.com wrote:

On 20/08/14 16:31, Ilia Mirkin wrote:


Hm, it's not tested. And you're right, that would (most likely) mess
up, since it would only have the pipe_resource's target. Any
suggestions on how to fix it? Should the target be added to
pipe_sampler_view?

On Wed, Aug 20, 2014 at 11:25 AM, Roland Scheidegger srol...@vmware.com
wrote:


Didn't look at it that closely, but I'm pretty surprised this really
works. One thing ARB_texture_view can do is cast cube maps (and cube
map arrays) to 2d arrays and vice versa (also 1d/2d to the respective
array type), and we cannot express that in sampler views (yet) (we can't
express it in surfaces either but there it should not matter). Which
means the type used in the shader for sampling will not match the
sampler view, which sounds quite broken to me.

Roland



Probably the only sane thing to do is eliminate the distinction between
PIPE_TEXTURE_FOO and PIPE_TEXTURE_FOO_ARRAY like in
http://msdn.microsoft.com/en-us/library/windows/desktop/ff476202.aspx ,
e.g.,:

enum pipe_texture_target {
PIPE_BUFFER   = 0,
PIPE_TEXTURE_1D   = 1,
PIPE_TEXTURE_2D   = 2,
PIPE_TEXTURE_3D   = 3,
PIPE_TEXTURE_CUBE = 4, // Must have same layout as PIPE_TEXTURE_2D
PIPE_TEXTURE_RECT = 5,
PIPE_TEXTURE_1D_ARRAY = PIPE_TEXTURE_1D,
PIPE_TEXTURE_2D_ARRAY = PIPE_TEXTURE_2D,
PIPE_TEXTURE_CUBE_ARRAY = PIPE_TEXTURE_CUBE,
PIPE_MAX_TEXTURE_TYPES
};


We could also remove PIPE_TEXTURE_CUBE and have cube-maps be PIPE_TEXTURE_2D
with a flag, but that's probably a lot of work. Instead, drivers that want
to be able to support ARB_texture_view will need to ensure
PIPE_TEXTURE_CUBE/PIPE_TEXTURE_2D layout match.


Another quick + cheap alternative (at least looking at nv50/nvc0 code)
would be to pass a separate target parameter to
->create_sampler_view(). That would be enough for nouveau, but perhaps
not more generally? Take a look at nv50_tex.c:nv50_create_texture_view
-- it also needs to work out the depth of the texture (presumably to
deal with out-of-bounds accesses) and that is written to the texture
info structure.

Well that should be enough, but I don't think it fits our design. We've
encapsulated other override information like the format in the view
already, and I see no reason why the target cast should be treated any
differently.


In other words, you're arguing for:

diff --git a/src/gallium/include/pipe/p_state.h b/src/gallium/include/pipe/p_state.h
index a82686b..c87ac4e 100644
--- a/src/gallium/include/pipe/p_state.h
+++ b/src/gallium/include/pipe/p_state.h
@@ -333,6 +333,7 @@ struct pipe_surface
    struct pipe_reference reference;
    struct pipe_resource *texture; /**< resource into which this is a view  */
    struct pipe_context *context; /**< context this surface belongs to */
+   enum pipe_texture_target target;
    enum pipe_format format;
 
    /* XXX width/height should be removed */


It's a fair point.  And I don't object to that solution.

Of course, for this to work, drivers will need to treat the _ARRAY and
non _ARRAY targets the same when determining the texture layout.



I just felt this would be a good opportunity to slim down
pipe_texture_target too.  I'm not sure the _ARRAY distinction still
matters at this level, but I suppose it doesn't hurt.



Jose

Re: [Mesa-dev] [PATCH 2/2] mesa/st: add ARB_texture_view support

On Wed, Aug 20, 2014 at 12:22 PM, Jose Fonseca jfons...@vmware.com wrote:
 On 20/08/14 17:14, Roland Scheidegger wrote:

 Am 20.08.2014 17:55, schrieb Ilia Mirkin:

 On Wed, Aug 20, 2014 at 11:47 AM, Jose Fonseca jfons...@vmware.com
 wrote:

 On 20/08/14 16:31, Ilia Mirkin wrote:


 Hm, it's not tested. And you're right, that would (most likely) mess
 up, since it would only have the pipe_resource's target. Any
 suggestions on how to fix it? Should the target be added to
 pipe_sampler_view?

 On Wed, Aug 20, 2014 at 11:25 AM, Roland Scheidegger
 srol...@vmware.com
 wrote:


 Didn't look at it that closely, but I'm pretty surprised this really
 works. One thing ARB_texture_view can do is cast cube maps (and cube
 map arrays) to 2d arrays and vice versa (also 1d/2d to the respective
 array type), and we cannot express that in sampler views (yet) (we can't
 express it in surfaces either but there it should not matter). Which
 means the type used in the shader for sampling will not match the
 sampler view, which sounds quite broken to me.

 Roland


 Probably the only sane thing to do is eliminate the distinction between
 PIPE_TEXTURE_FOO and PIPE_TEXTURE_FOO_ARRAY like in
 http://msdn.microsoft.com/en-us/library/windows/desktop/ff476202.aspx ,
 e.g.,:

 enum pipe_texture_target {
 PIPE_BUFFER   = 0,
 PIPE_TEXTURE_1D   = 1,
 PIPE_TEXTURE_2D   = 2,
 PIPE_TEXTURE_3D   = 3,
 PIPE_TEXTURE_CUBE = 4, // Must have same layout as
 PIPE_TEXTURE_2D
 PIPE_TEXTURE_RECT = 5,
 PIPE_TEXTURE_1D_ARRAY = PIPE_TEXTURE_1D,
 PIPE_TEXTURE_2D_ARRAY = PIPE_TEXTURE_2D,
 PIPE_TEXTURE_CUBE_ARRAY = PIPE_TEXTURE_CUBE,
 PIPE_MAX_TEXTURE_TYPES
 };


 We could also remove PIPE_TEXTURE_CUBE and have cube-maps be
 PIPE_TEXTURE_2D
 with a flag, but that's probably a lot of work. Instead, drivers that
 want
 to be able to support ARB_texture_view will need to ensure
 PIPE_TEXTURE_CUBE/PIPE_TEXTURE_2D layout match.


 Another quick + cheap alternative (at least looking at nv50/nvc0 code)
 would be to pass a separate target parameter to
 ->create_sampler_view(). That would be enough for nouveau, but perhaps
 not more generally? Take a look at nv50_tex.c:nv50_create_texture_view
 -- it also needs to work out the depth of the texture (presumably to
 deal with out-of-bounds accesses) and that is written to the texture
 info structure.

 Well that should be enough, but I don't think it fits our design. We've
 encapsulated other override information like the format in the view
 already, and I see no reason why the target cast should be treated any
 differently.


 In other words, you're arguing for:

 diff --git a/src/gallium/include/pipe/p_state.h
 b/src/gallium/include/pipe/p_state.h
 index a82686b..c87ac4e 100644
 --- a/src/gallium/include/pipe/p_state.h
 +++ b/src/gallium/include/pipe/p_state.h
 @@ -333,6 +333,7 @@ struct pipe_surface

On struct pipe_sampler_view, I thought... unless I'm misunderstanding.
This was also my first thought about fixing this after Roland pointed
out the issue.

 struct pipe_reference reference;
 struct pipe_resource *texture; /**< resource into which this is a view */
 struct pipe_context *context; /**< context this surface belongs to */
 +   enum pipe_texture_target target;
 enum pipe_format format;

 /* XXX width/height should be removed */


 It's a fair point.  And I don't object to that solution.

 Of course, for this to work, drivers will need to treat the _ARRAY and non
 _ARRAY targets the same when determining the texture layout.


 I just felt this would be a good opportunity to slim down pipe_texture_target
 too.  I'm not sure the _ARRAY distinction still matters at this level, but I
 suppose it doesn't hurt.

Such a cleanup would probably have to be done by someone with a better
understanding of gallium than me. OTOH if you guys feel like doing it
the sampler_view way will accrue too much technical debt, that's fine
too. Unless I hear otherwise, I'm going to try to do it the
pipe_sampler_view way tonight.

  -ilia

Re: [Mesa-dev] [PATCH 2/2] mesa/st: add ARB_texture_view support

Am 20.08.2014 18:12, schrieb Jose Fonseca:
 On 20/08/14 17:02, Roland Scheidegger wrote:
 Am 20.08.2014 17:47, schrieb Jose Fonseca:
 On 20/08/14 16:31, Ilia Mirkin wrote:
 Hm, it's not tested. And you're right, that would (most likely) mess
 up, since it would only have the pipe_resource's target. Any
 suggestions on how to fix it? Should the target be added to
 pipe_sampler_view?

 On Wed, Aug 20, 2014 at 11:25 AM, Roland Scheidegger
 srol...@vmware.com wrote:
 Didn't look at it that closely, but I'm pretty surprised this really
 works. One thing ARB_texture_view can do is cast cube maps (and cube
 map arrays) to 2d arrays and vice versa (also 1d/2d to the respective
 array type), and we cannot express that in sampler views (yet) (we can't
 express it in surfaces either but there it should not matter). Which
 means the type used in the shader for sampling will not match the
 sampler view, which sounds quite broken to me.

 Roland


 Probably the only sane thing to do is eliminate the distinction between
 PIPE_TEXTURE_FOO and PIPE_TEXTURE_FOO_ARRAY like in
 http://msdn.microsoft.com/en-us/library/windows/desktop/ff476202.aspx ,
 e.g.,:

 enum pipe_texture_target {
 PIPE_BUFFER   = 0,
 PIPE_TEXTURE_1D   = 1,
 PIPE_TEXTURE_2D   = 2,
 PIPE_TEXTURE_3D   = 3,
 PIPE_TEXTURE_CUBE = 4, // Must have same layout as
 PIPE_TEXTURE_2D
 PIPE_TEXTURE_RECT = 5,
 PIPE_TEXTURE_1D_ARRAY = PIPE_TEXTURE_1D,
 PIPE_TEXTURE_2D_ARRAY = PIPE_TEXTURE_2D,
 PIPE_TEXTURE_CUBE_ARRAY = PIPE_TEXTURE_CUBE,
 I think you meant PIPE_TEXTURE_2D there?
 
 No, I explicitly left PIPE_TEXTURE_CUBE for the reasons I explained
 below.
 

 PIPE_MAX_TEXTURE_TYPES
 };


 We could also remove PIPE_TEXTURE_CUBE and have cube-maps be
 PIPE_TEXTURE_2D with a flag, but that's probably a lot of work. Instead,
 drivers that want to be able to support ARB_texture_view will need to
 ensure PIPE_TEXTURE_CUBE/PIPE_TEXTURE_2D layout match.

 Yes, but that's not what I'm talking about.
 Even d3d10, which does not have any distinct cube/array texture layouts
 
 Precisely.  D3D10 uses D3D10_RESOURCE_DIMENSION_TEXTURE2D for cubes plus
 the D3D10_RESOURCE_MISC_TEXTURECUBE misc flag.  Which is precisely what
 I was talking about when I said "We could also remove PIPE_TEXTURE_CUBE
 and have cube-maps be PIPE_TEXTURE_2D with a flag".
 
 (actually 10 still had a separate one for cubes, because there was hw
 which really required a different layout afaik, but it got abandoned in
 10.1),
 
 No, D3D10 doesn't have D3D10_RESOURCE_DIMENSION_TEXTURECUBE.  D3D 10,
 10.1, and 11 all use RESOURCE_DIMENSION_TEXTURE2D +
 RESOURCE_MISC_TEXTURECUBE for cubemaps or cubemap arrays.
You are right that d3d10 did not really have a cube layout - they
required the misc_texturecube flag as you said (to me it is really the same
thing whether you have 2d + flag or an explicit cube).
But this flag is more or less dead and buried with newer d3d versions;
it is there for compatibility reasons (with api level 10_0 you still
need it for cube maps, since it didn't allow the target casting from
cube to 2d array or vice versa later in the resource view). It was never
used for cube map arrays.


 
 still requires shader resource views to have them (and they must
 match what's declared in the shader):
 http://msdn.microsoft.com/en-us/library/windows/desktop/ff476211%28v=vs.85%29.aspx

 
 Right, there are different enums for resource types and view types.
 
 So, my guess is we should do the same - just have that type in the
 sampler view (and drivers wishing to support the extension must take the
 type from the view, and not the underlying resource - or they could get
 it from the shader itself, presumably, if they really wanted, this is
 actually what we do for texture size queries in llvmpipe, but it's more
 of a necessary hack).

 You are right though we would not really require distinct types at the
 resource level, but they don't really get in the way neither.
 
 Yes, we could do the same. But I do think that in that case we should
 have a separate enum for views different from pipe_texture_target.  And
 pipe_texture_target would be slimmed down.
Yes that would make sense. I'm not sure it's worth the trouble of
changing the code though.

 
 
 But if you remove PIPE_TEXTURE_CUBE from pipe_texture_target you'll need
 to pass that info in a new flag (like D3D10_RESOURCE_MISC_TEXTURECUBE).
  I don't feel strongly, but I'm not sure this is much more elegant than
 keeping PIPE_TEXTURE_CUBE around.
OK, understood: PIPE_TEXTURE_CUBE would have to stay (as there's hw which
needs to distinguish a cube from an array).

So I guess we could do:
1) slim the pipe_texture_target enum down and use a different
pipe_view_target (or whatever) in the sampler view which has all values,
and allow all required casts (those involving cubes are probably the
only ones which really need to be restricted to drivers supporting
ARB_texture_view). It requires

Re: [Mesa-dev] [PATCHv3 11/16] mesa: add infrastructure for threaded shader compilation

2014-08-20 Thread Fredrik Höglund

On Wednesday 20 August 2014, Chia-I Wu wrote:
 Add _mesa_enable_glsl_threadpool to enable the thread pool for a context, and
 add ctx->Const.DeferCompileShader and ctx->Const.DeferLinkProgram to
 fine-control what gets threaded.
 
 Setting DeferCompileShader to true will make _mesa_glsl_compile_shader be
 executed in a worker thread.  The function is thread-safe so there is no
 restriction on DeferCompileShader.
 
 Setting DeferLinkProgram to true will make _mesa_glsl_link_shader be executed
 in a worker thread.  The function is thread-safe only when certain driver
 functions (as documented in struct gl_constants) are thread-safe.  It is
 drivers' responsibility to fix those driver functions before setting
 DeferLinkProgram.
 
 When DeferLinkProgram is set, drivers are not supposed to inspect the context
 in their LinkShader callbacks.  Instead, NotifyLinkShader is added.  Drivers
 should inspect the context in NotifyLinkShader and save what they need for
 LinkShader in gl_shader_program.
 
 As a final note, most applications will not benefit from threaded shader
 compilation because they check GL_COMPILE_STATUS/GL_LINK_STATUS immediately,
 giving the worker threads no time to do their jobs.  A possible improvement is
 to split LinkShader into two parts: the first part links and error checks
 while the second part optimizes and generates the machine code.  With the
 split, we can always defer the second part to the thread pool.

It looks like _mesa_create_shader_program() needs a bit of work since
it also checks the compile status immediately after compiling the shader.
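
To illustrate why an immediate status check defeats deferral, here is a minimal single-threaded model in C. It is only a sketch: `deferred_job`, `fake_shader` and the functions below are hypothetical names and are not Mesa's thread pool API, and the "worker" is simulated by running the job at the point where a real implementation would have to block.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical deferred job: work that would normally run on a worker. */
struct deferred_job {
   void (*run)(void *data);
   void *data;
   bool pending;            /* true until the job has actually executed */
};

/* Hypothetical shader object, not Mesa's gl_shader. */
struct fake_shader {
   struct deferred_job job;
   bool compile_status;     /* only meaningful once the job has finished */
   int forced_waits;        /* status queries that had to wait for the job */
};

static void do_compile(void *data)
{
   struct fake_shader *sh = data;
   sh->compile_status = true;   /* pretend the compile succeeded */
}

/* "Compile" only queues the work; nothing has run when this returns. */
static void fake_compile_deferred(struct fake_shader *sh)
{
   sh->compile_status = false;
   sh->forced_waits = 0;
   sh->job.run = do_compile;
   sh->job.data = sh;
   sh->job.pending = true;
}

/* A status query is a synchronization point: it cannot answer until the
 * job is done, so an immediate query serializes the whole thing again. */
static bool fake_get_compile_status(struct fake_shader *sh)
{
   if (sh->job.pending) {
      sh->forced_waits++;
      sh->job.run(sh->job.data);
      sh->job.pending = false;
   }
   return sh->compile_status;
}
```

In this model any caller that queries the status right after `fake_compile_deferred()` pays for the full compile at the query, which is the pattern described for `_mesa_create_shader_program()` above.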

Fredrik

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 2/2] mesa/st: add ARB_texture_view support

2014-08-20 Thread Jose Fonseca


On 20/08/14 17:33, Ilia Mirkin wrote:

On Wed, Aug 20, 2014 at 12:22 PM, Jose Fonseca jfons...@vmware.com wrote:

On 20/08/14 17:14, Roland Scheidegger wrote:


Am 20.08.2014 17:55, schrieb Ilia Mirkin:


On Wed, Aug 20, 2014 at 11:47 AM, Jose Fonseca jfons...@vmware.com
wrote:


On 20/08/14 16:31, Ilia Mirkin wrote:



Hm, it's not tested. And you're right, that would (most likely) mess
up, since it would only have the pipe_resource's target. Any
suggestions on how to fix it? Should the target be added to
pipe_sampler_view?

On Wed, Aug 20, 2014 at 11:25 AM, Roland Scheidegger
srol...@vmware.com
wrote:



Didn't look at it that closely, but I'm pretty surprised this really
works. One thing ARB_texture_view can do is cast cube maps (and cube
map arrays) to 2d arrays and vice versa (also 1d/2d to the respective
array type), and we cannot express that in sampler views (yet) (we can't
express it in surfaces either, but there it should not matter). Which
means the type used in the shader for sampling will not match the
sampler view, which sounds quite broken to me.

Roland



Probably the only sane thing to do is to eliminate the distinction between
PIPE_TEXTURE_FOO and PIPE_TEXTURE_FOO_ARRAY, like in

http://msdn.microsoft.com/en-us/library/windows/desktop/ff476202.aspx

e.g.:

enum pipe_texture_target {
 PIPE_BUFFER   = 0,
 PIPE_TEXTURE_1D   = 1,
 PIPE_TEXTURE_2D   = 2,
 PIPE_TEXTURE_3D   = 3,
 PIPE_TEXTURE_CUBE = 4, // Must have same layout as PIPE_TEXTURE_2D
 PIPE_TEXTURE_RECT = 5,
 PIPE_TEXTURE_1D_ARRAY = PIPE_TEXTURE_1D,
 PIPE_TEXTURE_2D_ARRAY = PIPE_TEXTURE_2D,
 PIPE_TEXTURE_CUBE_ARRAY = PIPE_TEXTURE_CUBE,
 PIPE_MAX_TEXTURE_TYPES
};


We could also remove PIPE_TEXTURE_CUBE and have cube-maps be PIPE_TEXTURE_2D
with a flag, but that's probably a lot of work. Instead, drivers that want
to be able to support ARB_texture_view will need to ensure that the
PIPE_TEXTURE_CUBE and PIPE_TEXTURE_2D layouts match.



Another quick + cheap alternative (at least looking at nv50/nvc0 code)
would be to pass a separate target parameter to
->create_sampler_view(). That would be enough for nouveau, but perhaps
not more generally? Take a look at nv50_tex.c:nv50_create_texture_view
-- it also needs to work out the depth of the texture (presumably to
deal with out-of-bounds accesses) and that is written to the texture
info structure.


Well that should be enough, but I don't think it fits our design. We've
encapsulated other override information like the format in the view
already, and I see no reason why the target cast should be treated any
differently.



In other words, you're arguing for:

diff --git a/src/gallium/include/pipe/p_state.h
b/src/gallium/include/pipe/p_state.h
index a82686b..c87ac4e 100644
--- a/src/gallium/include/pipe/p_state.h
+++ b/src/gallium/include/pipe/p_state.h
@@ -333,6 +333,7 @@ struct pipe_surface


On struct pipe_sampler_view, I thought... unless I'm misunderstanding.


Yep. My mistake.


This was also my first thought about fixing this after Roland pointed
out the issue.


 struct pipe_reference reference;
 struct pipe_resource *texture; /** resource into which this is a view */
 struct pipe_context *context; /** context this surface belongs to */
+   enum pipe_texture target;
 enum pipe_format format;

 /* XXX width/height should be removed */


It's a fair point, and I don't object to that solution.

Of course, for this to work, drivers will need to treat the _ARRAY and non
_ARRAY targets the same when determining the texture layout.
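
As a rough illustration of carrying the target in the view rather than taking it from the resource, the following C sketch checks whether a view target may alias a resource target. Everything here is an assumption made for illustration: the `sketch_*` names are hypothetical and are not Gallium's, and the rules follow my reading of the ARB_texture_view target compatibility table while ignoring layer-count constraints (for example, a cube view needing six layers).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical view-level target enum; not the real pipe_texture_target. */
enum sketch_target {
   SKETCH_BUFFER, SKETCH_1D, SKETCH_2D, SKETCH_3D, SKETCH_CUBE, SKETCH_RECT,
   SKETCH_1D_ARRAY, SKETCH_2D_ARRAY, SKETCH_CUBE_ARRAY
};

struct sketch_resource {
   enum sketch_target target;       /* target the storage was created with */
};

struct sketch_sampler_view {
   const struct sketch_resource *texture;
   enum sketch_target target;       /* target the shader samples it as */
};

/* Targets in the same family share a layout and may alias each other. */
static int layout_family(enum sketch_target t)
{
   switch (t) {
   case SKETCH_1D: case SKETCH_1D_ARRAY:
      return 1;
   case SKETCH_2D: case SKETCH_2D_ARRAY:
   case SKETCH_CUBE: case SKETCH_CUBE_ARRAY:
      return 2;
   case SKETCH_3D:
      return 3;
   case SKETCH_RECT:
      return 4;
   default:
      return 0;                     /* buffers cannot be re-targeted */
   }
}

static bool sketch_cast_allowed(enum sketch_target res, enum sketch_target view)
{
   int family = layout_family(res);

   if (family == 0 || family != layout_family(view))
      return false;
   /* A plain 2D texture may only be viewed as 2D or 2D array. */
   if (res == SKETCH_2D && (view == SKETCH_CUBE || view == SKETCH_CUBE_ARRAY))
      return false;
   return true;
}

/* The driver would read v->target, never v->texture->target, when sampling. */
static bool sketch_view_init(struct sketch_sampler_view *v,
                             const struct sketch_resource *res,
                             enum sketch_target target)
{
   if (!sketch_cast_allowed(res->target, target))
      return false;
   v->texture = res;
   v->target = target;
   return true;
}
```

With this shape, a cube resource can be sampled through a 2D array view while the resource itself keeps its original target, which is the cast the sampler view currently cannot express.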


I just felt this would be a good opportunity to slim down pipe_texture_target
too.  I'm not sure the _ARRAY distinction still matters at this level, but I
suppose it doesn't hurt.


Such a cleanup would probably have to be done by someone with a better
understanding of gallium than me. OTOH if you guys feel like doing it
the sampler_view way will accrue too much technical debt, that's fine
too. Unless I hear otherwise, I'm going to try to do it the
pipe_sampler_view way tonight.

   -ilia




Re: [Mesa-dev] [PATCH 2/2] mesa/st: add ARB_texture_view support

Am 20.08.2014 18:33, schrieb Ilia Mirkin:
 On Wed, Aug 20, 2014 at 12:22 PM, Jose Fonseca jfons...@vmware.com wrote:
 On 20/08/14 17:14, Roland Scheidegger wrote:

 Am 20.08.2014 17:55, schrieb Ilia Mirkin:

 On Wed, Aug 20, 2014 at 11:47 AM, Jose Fonseca jfons...@vmware.com
 wrote:

 On 20/08/14 16:31, Ilia Mirkin wrote:


 Hm, it's not tested. And you're right, that would (most likely) mess
 up, since it would only have the pipe_resource's target. Any
 suggestions on how to fix it? Should the target be added to
 pipe_sampler_view?

 On Wed, Aug 20, 2014 at 11:25 AM, Roland Scheidegger
 srol...@vmware.com
 wrote:


 Didn't look at it that closely, but I'm pretty surprised this really
 works. One thing ARB_texture_view can do is cast cube maps (and cube
 map arrays) to 2d arrays and vice versa (also 1d/2d to the respective
 array type), and we cannot express that in sampler views (yet) (we can't
 express it in surfaces either, but there it should not matter). Which
 means the type used in the shader for sampling will not match the
 sampler view, which sounds quite broken to me.

 Roland


 Probably the only sane thing to do is to eliminate the distinction between
 PIPE_TEXTURE_FOO and PIPE_TEXTURE_FOO_ARRAY, like in

 http://msdn.microsoft.com/en-us/library/windows/desktop/ff476202.aspx

 e.g.:

 enum pipe_texture_target {
 PIPE_BUFFER   = 0,
 PIPE_TEXTURE_1D   = 1,
 PIPE_TEXTURE_2D   = 2,
 PIPE_TEXTURE_3D   = 3,
 PIPE_TEXTURE_CUBE = 4, // Must have same layout as PIPE_TEXTURE_2D
 PIPE_TEXTURE_RECT = 5,
 PIPE_TEXTURE_1D_ARRAY = PIPE_TEXTURE_1D,
 PIPE_TEXTURE_2D_ARRAY = PIPE_TEXTURE_2D,
 PIPE_TEXTURE_CUBE_ARRAY = PIPE_TEXTURE_CUBE,
 PIPE_MAX_TEXTURE_TYPES
 };


 We could also remove PIPE_TEXTURE_CUBE and have cube-maps be PIPE_TEXTURE_2D
 with a flag, but that's probably a lot of work. Instead, drivers that want
 to be able to support ARB_texture_view will need to ensure that the
 PIPE_TEXTURE_CUBE and PIPE_TEXTURE_2D layouts match.


 Another quick + cheap alternative (at least looking at nv50/nvc0 code)
 would be to pass a separate target parameter to
 ->create_sampler_view(). That would be enough for nouveau, but perhaps
 not more generally? Take a look at nv50_tex.c:nv50_create_texture_view
 -- it also needs to work out the depth of the texture (presumably to
 deal with out-of-bounds accesses) and that is written to the texture
 info structure.

 Well that should be enough, but I don't think it fits our design. We've
 encapsulated other override information like the format in the view
 already, and I see no reason why the target cast should be treated any
 differently.


 In other words, you're arguing for:

 diff --git a/src/gallium/include/pipe/p_state.h
 b/src/gallium/include/pipe/p_state.h
 index a82686b..c87ac4e 100644
 --- a/src/gallium/include/pipe/p_state.h
 +++ b/src/gallium/include/pipe/p_state.h
 @@ -333,6 +333,7 @@ struct pipe_surface
 
 On struct pipe_sampler_view, I thought... unless I'm misunderstanding.
 This was also my first thought about fixing this after Roland pointed
 out the issue.
Yes definitely for pipe_sampler_view - d3d10 also has it on the render
target / depth stencil views, though so far I'm not convinced there's
any value in that (the addressing of cube maps / arrays, 1d / 1d arrays
is entirely the same in all cases, what matters is really the first and
last layer only).

 
 struct pipe_reference reference;
 struct pipe_resource *texture; /** resource into which this is a view */
 struct pipe_context *context; /** context this surface belongs to */
 +   enum pipe_texture target;
Make it pipe_texture_target target ;-)


 enum pipe_format format;

 /* XXX width/height should be removed */


 It's a fair point, and I don't object to that solution.

 Of course, for this to work, drivers will need to treat the _ARRAY and non
 _ARRAY targets the same when determining the texture layout.


 I just felt this would be a good opportunity to slim down pipe_texture_target
 too.  I'm not sure the _ARRAY distinction still matters at this level, but I
 suppose it doesn't hurt.
 
 Such a cleanup would probably have to be done by someone with a better
 understanding of gallium than me. OTOH if you guys feel like doing it
 the sampler_view way will accrue too much technical debt, that's fine
 too. Unless I hear otherwise, I'm going to try to do it the
 pipe_sampler_view way tonight.
 

Yes I think it would be a nice cleanup to split it up into two enums. I
was mostly proposing just reusing the same enum and keeping
pipe_texture_target the same because it would require less code change.
But maybe that could come back haunting us later, I agree it would

Re: [Mesa-dev] [RFC PATCH 00/16] A new IR for Mesa

On Wed, Aug 20, 2014 at 12:11 PM, Francisco Jerez curroje...@riseup.net wrote:
 Connor Abbott cwabbo...@gmail.com writes:

 On Wed, Aug 20, 2014 at 7:01 AM, Francisco Jerez curroje...@riseup.net 
 wrote:
 Connor Abbott cwabbo...@gmail.com writes:

 On Tue, Aug 19, 2014 at 11:33 PM, Francisco Jerez curroje...@riseup.net 
 wrote:
 Connor Abbott cwabbo...@gmail.com writes:

 On Tue, Aug 19, 2014 at 11:40 AM, Francisco Jerez 
 curroje...@riseup.net wrote:
 Tom Stellard t...@stellard.net writes:

 On Tue, Aug 19, 2014 at 11:04:59AM -0400, Connor Abbott wrote:
 On Mon, Aug 18, 2014 at 8:52 PM, Michel Dänzer mic...@daenzer.net 
 wrote:
  On 19.08.2014 01:28, Connor Abbott wrote:
  On Mon, Aug 18, 2014 at 4:32 AM, Michel Dänzer 
  mic...@daenzer.net wrote:
  On 16.08.2014 09:12, Connor Abbott wrote:
  I know what you might be thinking right now. Wait, *another* 
  IR? Don't
  we already have like 5 of those, not counting all the 
  driver-specific
  ones? Isn't this stuff complicated enough already? Well, there 
  are some
  pretty good reasons to start afresh (again...). In the years 
  we've been
  using GLSL IR, we've come to realize that, in fact, it's not 
  what we
  want *at all* to do optimizations on.
 
  Did you evaluate using LLVM IR instead of inventing yet another 
  one?
 
 
  --
  Earthling Michel Dänzer|  
  http://www.amd.com
  Libre software enthusiast  |Mesa and X 
  developer
 
  Yes. See
 
  http://lists.freedesktop.org/archives/mesa-dev/2014-February/053502.html
 
  and
 
  http://lists.freedesktop.org/archives/mesa-dev/2014-February/053522.html
 
  I know Ian can't deal with LLVM for some reason. I was wondering if
  *you* evaluated it, and if so, why you rejected it.
 
 
  --
  Earthling Michel Dänzer|  
  http://www.amd.com
  Libre software enthusiast  |Mesa and X 
  developer


 Well, first of all, the fact that Ian and Ken don't want to use it
 means that any plan to use LLVM for the Intel driver is dead in the
 water anyways - you can translate NIR into LLVM if you want, but for
 i965 we want to share optimizations between our 2 backends (FS and
 vec4) that we can't do today in GLSL IR so this is what we want to use
 for that, and since nobody else does anything with the core GLSL
 compiler except when they have to, when we start moving things out of
 GLSL IR this will probably replace GLSL IR as the infrastructure that
 all Mesa drivers use. But with that in mind, here are a few reasons
 why we wouldn't want to use LLVM:

 * LLVM wasn't built to understand structured CFG's, meaning that you
 need to re-structurize it using a pass that's fragile and prone to
 break if some other pass optimizes the shader in a way that makes it
 non-structured (i.e. not expressible in terms of loops and if
 statements). This loss of information also means that passes that need
 to know things like, for example, the loop nesting depth need to do an
 analysis pass whereas with NIR you can just walk up the control flow
 tree and count the number of loops we hit.
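
The "walk up the control flow tree" point can be sketched in a few lines of C. This is a hypothetical structure made up for illustration, not NIR's actual data types: each control-flow node simply records its kind and its parent.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical structured control-flow node, for illustration only. */
enum cf_type { CF_FUNCTION, CF_IF, CF_LOOP, CF_BLOCK };

struct cf_node {
   enum cf_type type;
   const struct cf_node *parent;   /* NULL for the outermost node */
};

/* Count the loops that enclose a node by following parent links.
 * The node itself is not counted, only its ancestors. */
static unsigned loop_depth(const struct cf_node *node)
{
   unsigned depth = 0;

   for (const struct cf_node *n = node->parent; n != NULL; n = n->parent) {
      if (n->type == CF_LOOP)
         depth++;
   }
   return depth;
}
```

No separate analysis pass is needed in this model because the nesting is stored in the tree itself; with an unstructured CFG the same number would have to be recovered by a loop analysis.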


 LLVM has a pass to structurize the CFG.  We use it in the radeon
 drivers, and it is run after all of the other LLVM optimizations which 
 have
 no concept of structured CFG.  It's not bug free, but it works really
 well even with all of the complex OpenCL kernels we throw at it.

 Your point about losing information when the CFG is de-structurized is
 valid, but for things like loop depth, I'm not sure why we couldn't 
 write an
 LLVM analysis pass for this (if one doesn't already exist).


 I don't think this is such a big deal either.  At least the
 structurization pass used on newer AMD hardware isn't fragile in the
 way you seem to imply -- AFAIK (unlike the old AMDIL heuristic
 algorithm) it's guaranteed to give you a valid structurized output no
 matter what the previous optimization passes have done to the CFG,
 modulo bugs.  I admit that the situation is nevertheless suboptimal.
 Ideally this information wouldn't get lost along the way.  For the long
 term we may want to represent structured control flow directly in the IR
 as you say, I just don't see how reinventing the IR saves us any work if
 we could just fix the existing one.

 It seems to me that something like how we represent control flow is a
 pretty fundamental part of the IR - it affects any optimization pass
 that needs to do anything beyond adding and removing instructions. How
 would you fix that, especially given that LLVM is primarily designed
 for CPU's where you don't want to be restricted to structured control
 flow at all? It seems like our goals (preserve the structure) conflict
 with the way LLVM has been designed.

 I think we can fix this by introducing new structured variants of the
 branch instruction in a way that doesn't alter the fundamental structure
 of the IR.  E.g. an if branch could look like:

 ifbr i1 cond, label iftrue, label iffalse, label join

Re: [Mesa-dev] [RFC PATCH 00/16] A new IR for Mesa

2014-08-20 Thread Kenneth Graunke

On Wednesday, August 20, 2014 06:41:08 PM Michel Dänzer wrote:
 On 20.08.2014 00:04, Connor Abbott wrote:
  On Mon, Aug 18, 2014 at 8:52 PM, Michel Dänzer mic...@daenzer.net wrote:
  On 19.08.2014 01:28, Connor Abbott wrote:
  On Mon, Aug 18, 2014 at 4:32 AM, Michel Dänzer mic...@daenzer.net wrote:
  On 16.08.2014 09:12, Connor Abbott wrote:
  I know what you might be thinking right now. Wait, *another* IR? Don't
  we already have like 5 of those, not counting all the driver-specific
  ones? Isn't this stuff complicated enough already? Well, there are some
  pretty good reasons to start afresh (again...). In the years we've been
  using GLSL IR, we've come to realize that, in fact, it's not what we
  want *at all* to do optimizations on.
 
  Did you evaluate using LLVM IR instead of inventing yet another one?
 
  Yes. See
 
  http://lists.freedesktop.org/archives/mesa-dev/2014-February/053502.html
 
  and
 
  http://lists.freedesktop.org/archives/mesa-dev/2014-February/053522.html
 
  I know Ian can't deal with LLVM for some reason. I was wondering if
  *you* evaluated it, and if so, why you rejected it.
 
 First of all, thank you for sharing more specific information than
 'table-flipping rage'.
 
 
  * LLVM is on a different release schedule (6 months vs. 3 months), has
  a different review process, etc., which means that to add support for
  new functionality that involves shaders, we now have to submit patches
  to two separate projects, and then 2 months later when we ship Mesa it
  turns out that nobody can actually use the new feature because it
  depends upon an unreleased version of LLVM that won't be released for
  another 3 months and then packaged by distros even later...
 
 This has indeed been frustrating at times, but it's better now for
 backend changes since Tom has been making LLVM point releases.

Yeah - absolutely.

 As for the GLSL frontend, I agree with Tom that it shouldn't require
 that much direct interaction with the LLVM project.
 
 
  we've already had problems where distros refused to ship newer Mesa
  releases because radeon depended on a version of LLVM newer than the
  one they were shipping, [...]
 
 That's news to me, can you be more specific?
 
 That sounds like basically a distro issue though, since different LLVM
 versions can be installed in parallel (and the one used by default
 doesn't have to be the newest one). And it even works if another part of
 the same process uses a different version of LLVM.

Yes, one can argue that it's a distribution issue - but it's an extremely 
painful problem for distributions.

For example, Debian was stuck on Mesa 9.2.2 for 4 months (2013-12-08 to 
2014-03-22), and I was told this was because of LLVM versioning changes in the 
other drivers (primarily radeon, I believe, but probably also llvmpipe).

Mesa 9.2.2 hung the GPU every 5-10 minutes on Sandybridge, and we fixed that in 
Mesa 9.2.3.  But we couldn't get people to actually ship it, and had to field 
tons of bug reports from upset users for several months.

Gentoo has also had trouble updating for similar reasons; Matt (the Gentoo Mesa 
package maintainer) can probably comment more.

I've also heard stories from friends of mine who use radeonsi that they 
couldn't get new GL features or compiler fixes unless they upgrade both Mesa 
/and/ LLVM, and that LLVM was usually either not released or not available in 
their distribution for a few months.

Those are the sorts of things I'd like to avoid.  The compiler is easily the 
most crucial part of a modern graphics stack; splitting it out into a separate 
repository and project seems like a nightmare for people who care about getting 
new drivers released and shipped in distributions in a timely fashion.

Or, looking at it the other way: today, everything you need as an Intel or 
(AFAIK) Nouveau 3D user is nicely contained within Mesa.  Our community has 
complete control over when we do those releases.  New important bug fixes, 
performance improvements, or features?  Ship a new Mesa, and you're done.  
That's a really nice feature I'd hate to lose.

--Ken


Re: [Mesa-dev] [PATCH] r600g: Fix flat/smooth shade state toggle

2014-08-20 Thread Marek Olšák

Generally, only states which need a full shader compilation must be in
the shader key. Flatshade is not one of them, because it only causes
register updates, so this is not a proper solution. Or I am missing
something?

Marek



On Wed, Aug 20, 2014 at 5:34 PM, Glenn Kennard glenn.kenn...@gmail.com wrote:
 If only the flat/smooth shade state changed between
 two calls the prior code would miss updating the
 hardware state.

 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=81967
 Signed-off-by: Glenn Kennard glenn.kenn...@gmail.com
 ---
 Tested on radeon 6670, no piglit regressions

  src/gallium/drivers/r600/evergreen_state.c   | 2 --
  src/gallium/drivers/r600/r600_shader.h   | 2 +-
  src/gallium/drivers/r600/r600_state.c| 2 --
  src/gallium/drivers/r600/r600_state_common.c | 6 +++---
  4 files changed, 4 insertions(+), 8 deletions(-)

 diff --git a/src/gallium/drivers/r600/evergreen_state.c 
 b/src/gallium/drivers/r600/evergreen_state.c
 index 841ad0c..b490145 100644
 --- a/src/gallium/drivers/r600/evergreen_state.c
 +++ b/src/gallium/drivers/r600/evergreen_state.c
 @@ -2927,8 +2927,6 @@ void evergreen_update_ps_state(struct pipe_context *ctx, struct r600_pipe_shader
         shader->ps_depth_export = z_export | stencil_export;

         shader->sprite_coord_enable = sprite_coord_enable;
 -       if (rctx->rasterizer)
 -               shader->flatshade = rctx->rasterizer->flatshade;
  }

  void evergreen_update_es_state(struct pipe_context *ctx, struct 
 r600_pipe_shader *shader)
 diff --git a/src/gallium/drivers/r600/r600_shader.h 
 b/src/gallium/drivers/r600/r600_shader.h
 index d6db8f0..8b32966 100644
 --- a/src/gallium/drivers/r600/r600_shader.h
 +++ b/src/gallium/drivers/r600/r600_shader.h
 @@ -89,6 +89,7 @@ struct r600_shader_key {
 unsigned alpha_to_one:1;
 unsigned nr_cbufs:4;
 unsigned vs_as_es:1;
 +   unsigned flatshade:1;
  };

  struct r600_shader_array {
 @@ -106,7 +107,6 @@ struct r600_pipe_shader {
         struct r600_command_buffer command_buffer; /* register writes */
         struct r600_resource    *bo;
         unsigned                sprite_coord_enable;
 -       unsigned                flatshade;
         unsigned                pa_cl_vs_out_cntl;
         unsigned                nr_ps_color_outputs;
         struct r600_shader_key  key;
 diff --git a/src/gallium/drivers/r600/r600_state.c 
 b/src/gallium/drivers/r600/r600_state.c
 index 607b199..3f5cb2b 100644
 --- a/src/gallium/drivers/r600/r600_state.c
 +++ b/src/gallium/drivers/r600/r600_state.c
 @@ -2532,8 +2532,6 @@ void r600_update_ps_state(struct pipe_context *ctx, struct r600_pipe_shader *sha
         shader->ps_depth_export = z_export | stencil_export;

         shader->sprite_coord_enable = sprite_coord_enable;
 -       if (rctx->rasterizer)
 -               shader->flatshade = rctx->rasterizer->flatshade;
  }

  void r600_update_vs_state(struct pipe_context *ctx, struct r600_pipe_shader 
 *shader)
 diff --git a/src/gallium/drivers/r600/r600_state_common.c 
 b/src/gallium/drivers/r600/r600_state_common.c
 index 7594d0e..d8243d1 100644
 --- a/src/gallium/drivers/r600/r600_state_common.c
 +++ b/src/gallium/drivers/r600/r600_state_common.c
 @@ -699,6 +699,8 @@ static INLINE struct r600_shader_key r600_shader_selector_key(struct pipe_contex
                 /* Dual-source blending only makes sense with nr_cbufs == 1. */
                 if (key.nr_cbufs == 1 && rctx->dual_src_blend)
                         key.nr_cbufs = 2;
 +               if (rctx->rasterizer->flatshade)
 +                       key.flatshade = 1;
         } else if (sel->type == PIPE_SHADER_VERTEX) {
                 key.vs_as_es = (rctx->gs_shader != NULL);
         }
 @@ -1250,9 +1252,7 @@ static bool r600_update_derived_state(struct r600_context *rctx)
         }

         if (unlikely(!ps_dirty && rctx->ps_shader && rctx->rasterizer &&
 -               ((rctx->rasterizer->sprite_coord_enable != rctx->ps_shader->current->sprite_coord_enable) ||
 -               (rctx->rasterizer->flatshade != rctx->ps_shader->current->flatshade)))) {
 -
 +               ((rctx->rasterizer->sprite_coord_enable != rctx->ps_shader->current->sprite_coord_enable)))) {
                 if (rctx->b.chip_class >= EVERGREEN)
                         evergreen_update_ps_state(ctx, rctx->ps_shader->current);
                 else
 --
 1.9.1


Re: [Mesa-dev] [RFC PATCH 00/16] A new IR for Mesa

Am 20.08.2014 20:13, schrieb Kenneth Graunke:
On Wednesday, August 20, 2014 06:41:08 PM Michel Dänzer wrote:
On 20.08.2014 00:04, Connor Abbott wrote:
On Mon, Aug 18, 2014 at 8:52 PM, Michel Dänzer
mic...@daenzer.net wrote:
On 19.08.2014 01:28, Connor Abbott wrote:
On Mon, Aug 18, 2014 at 4:32 AM, Michel Dänzer
mic...@daenzer.net wrote:
On 16.08.2014 09:12, Connor Abbott wrote:
I know what you might be thinking right now. Wait,
*another* IR? Don't we already have like 5 of those, not
counting all the driver-specific ones? Isn't this stuff
complicated enough already? Well, there are some pretty
good reasons to start afresh (again...). In the years
we've been using GLSL IR, we've come to realize that, in
fact, it's not what we want *at all* to do optimizations
on.

Did you evaluate using LLVM IR instead of inventing yet
another one?

Yes. See

http://lists.freedesktop.org/archives/mesa-dev/2014-February/053502.html

and

http://lists.freedesktop.org/archives/mesa-dev/2014-February/053522.html

I know Ian can't deal with LLVM for some reason. I was wondering if
*you* evaluated it, and if so, why you rejected it.

First of all, thank you for sharing more specific information than
'table-flipping rage'.

* LLVM is on a different release schedule (6 months vs. 3
months), has a different review process, etc., which means that
to add support for new functionality that involves shaders, we
now have to submit patches to two separate projects, and then 2
months later when we ship Mesa it turns out that nobody can
actually use the new feature because it depends upon an
unreleased version of LLVM that won't be released for another 3
months and then packaged by distros even later...

This has indeed been frustrating at times, but it's better now for
backend changes since Tom has been making LLVM point releases.

Yeah - absolutely.

As for the GLSL frontend, I agree with Tom that it shouldn't
require that much direct interaction with the LLVM project.

we've already had problems where distros refused to ship newer
Mesa releases because radeon depended on a version of LLVM newer
than the one they were shipping, [...]

That's news to me, can you be more specific?

That sounds like basically a distro issue though, since different
LLVM versions can be installed in parallel (and the one used by
default doesn't have to be the newest one). And it even works if
another part of the same process uses a different version of LLVM.

Yes, one can argue that it's a distribution issue - but it's an
extremely painful problem for distributions.

For example, Debian was stuck on Mesa 9.2.2 for 4 months (2013-12-08
to 2014-03-22), and I was told this was because of LLVM versioning
changes in the other drivers (primarily radeon, I believe, but
probably also llvmpipe).
llvmpipe generally runs on pretty old llvm versions, though I didn't
check the specifics here...

Mesa 9.2.2 hung the GPU every 5-10 minutes on Sandybridge, and we
fixed that in Mesa 9.2.3. But we couldn't get people to actually
ship it, and had to field tons of bug reports from upset users for
several months.
I think this also raises the question of whether changes requiring new
external libraries to compile really should be in a point release.

Gentoo has also had trouble updating for similar reasons; Matt (the
Gentoo Mesa package maintainer) can probably comment more.

I've also heard stories from friends of mine who use radeonsi that
they couldn't get new GL features or compiler fixes unless they
upgrade both Mesa /and/ LLVM, and that LLVM was usually either not
released or not available in their distribution for a few months.

Those are the sorts of things I'd like to avoid. The compiler is
easily the most crucial part of a modern graphics stack; splitting it
out into a separate repository and project seems like a nightmare for
people who care about getting new drivers released and shipped in
distributions in a timely fashion.

Or, looking at it the other way: today, everything you need as an
Intel or (AFAIK) Nouveau 3D user is nicely contained within Mesa.
Our community has complete control over when we do those releases.
New important bug fixes, performance improvements, or features? Ship
a new Mesa, and you're done. That's a really nice feature I'd hate
to lose.

--Ken

Couldn't build scripts download and use an appropriate llvm version
automatically if the one installed isn't

Re: [Mesa-dev] [RFC PATCH 00/16] A new IR for Mesa

On Wed, Aug 20, 2014 at 11:28 AM, Roland Scheidegger srol...@vmware.com wrote:
 Am 20.08.2014 20:13, schrieb Kenneth Graunke:
 For example, Debian was stuck on Mesa 9.2.2 for 4 months (2013-12-08
 to 2014-03-22), and I was told this was because of LLVM versioning
 changes in the other drivers (primarily radeon, I believe, but
 probably also llvmpipe).
 llvmpipe generally runs on pretty old llvm versions, though I didn't
 check the specifics here...

There are also 49 instances of 'HAVE_LLVM [=]' to manage that :)
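
For readers unfamiliar with those guards, here is a small C sketch of the pattern. It assumes the encoding visible in the patches at the top of this digest, where `0x0306` stands for LLVM 3.6; the `SKETCH_LLVM_VERSION` macro and the function name are hypothetical, and `HAVE_LLVM` would normally come from the build system rather than a local `#define`.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: pack major/minor the way 0x0306 encodes 3.6. */
#define SKETCH_LLVM_VERSION(major, minor) (((major) << 8) | (minor))

/* Assumed default for this sketch only; the build normally defines it. */
#ifndef HAVE_LLVM
#define HAVE_LLVM 0x0306
#endif

/* Report which code path a version guard would select at compile time. */
static const char *sketch_engine_builder_path(void)
{
#if HAVE_LLVM >= 0x0306
   return "new API";   /* e.g. constructor taking ownership of the module */
#else
   return "old API";   /* e.g. constructor taking a raw module pointer */
#endif
}
```

Each such guard is a place where one source tree has to track API changes across several LLVM releases at once.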

 Couldn't build scripts download and use an appropriate llvm version
 automatically if the one installed isn't sufficient? Though maybe the
 idea is crazy; I usually try to avoid dealing with such problems ;-).

I don't know the specifics of what you're suggesting, but I don't
think I need to say that that's disgusting.

Re: [Mesa-dev] [PATCH] r600g: Fix flat/smooth shade state toggle


On Wed, 20 Aug 2014 20:16:50 +0200, Marek Olšák mar...@gmail.com wrote:


Generally, only states which need a full shader compilation must be in
the shader key. Flatshade is not one of them, because it only causes
register updates, so this is not a proper solution. Or I am missing
something?

Marek



Evergreen/Cayman need to recompile the shader since the interpolation is  
done using either INTERP_XY instruction for smooth or INTERP_LOAD_P0 for  
flat. R600-R700 technically don't need to, but the prior code already does  
anyway since flat/smooth register setup is done from output values  
computed when compiling the shader.



/Glenn

Re: [Mesa-dev] [RFC PATCH 00/16] A new IR for Mesa

On Wed, Aug 20, 2014 at 11:13 AM, Kenneth Graunke kenn...@whitecape.org wrote:
 Gentoo has also had trouble updating for similar reasons; Matt (the Gentoo
 Mesa package maintainer) can probably comment more.

Yes, at one point we were stuck two releases behind current Mesa (and
this is Gentoo!) because we couldn't get the appropriate version of
LLVM stabilized because a number of reverse dependencies didn't work
with the new LLVM version.

Having multiple versions installed in parallel breaks down pretty
easily. Where do the headers go? Where do all the executables go? Do
you version all of them and install one for each version? Do other
distros allow multiple versions of LLVM to be installed in parallel?
How do they manage?

 I've also heard stories from friends of mine who use radeonsi that they 
 couldn't get new GL features or compiler fixes unless they upgrade both Mesa 
 /and/ LLVM, and that LLVM was usually either not released or not available in 
 their distribution for a few months.

I get the sense that this is a problem that a backend in LLVM would
cause, but maybe not so if we just used LLVM IR for the GLSL compiler.
I think the C API is suitable for this kind of thing as well. Tom?

Re: [Mesa-dev] [RFC PATCH 00/16] A new IR for Mesa

On Wed, Aug 20, 2014 at 11:13:13AM -0700, Kenneth Graunke wrote:
 On Wednesday, August 20, 2014 06:41:08 PM Michel Dänzer wrote:
  On 20.08.2014 00:04, Connor Abbott wrote:
   On Mon, Aug 18, 2014 at 8:52 PM, Michel Dänzer mic...@daenzer.net wrote:
   On 19.08.2014 01:28, Connor Abbott wrote:
   On Mon, Aug 18, 2014 at 4:32 AM, Michel Dänzer mic...@daenzer.net wrote:
   On 16.08.2014 09:12, Connor Abbott wrote:
   I know what you might be thinking right now. Wait, *another* IR? Don't
   we already have like 5 of those, not counting all the driver-specific
   ones? Isn't this stuff complicated enough already? Well, there are some
   pretty good reasons to start afresh (again...). In the years we've been
   using GLSL IR, we've come to realize that, in fact, it's not what we
   want *at all* to do optimizations on.
  
   Did you evaluate using LLVM IR instead of inventing yet another one?
  
   Yes. See
  
   http://lists.freedesktop.org/archives/mesa-dev/2014-February/053502.html
  
   and
  
   http://lists.freedesktop.org/archives/mesa-dev/2014-February/053522.html
  
   I know Ian can't deal with LLVM for some reason. I was wondering if
   *you* evaluated it, and if so, why you rejected it.
  
  First of all, thank you for sharing more specific information than
  'table-flipping rage'.
  
  
   * LLVM is on a different release schedule (6 months vs. 3 months), has
   a different review process, etc., which means that to add support for
   new functionality that involves shaders, we now have to submit patches
   to two separate projects, and then 2 months later when we ship Mesa it
   turns out that nobody can actually use the new feature because it
   depends upon an unreleased version of LLVM that won't be released for
   another 3 months and then packaged by distros even later...
  
  This has indeed been frustrating at times, but it's better now for
  backend changes since Tom has been making LLVM point releases.
 
 Yeah - absolutely.
 
  As for the GLSL frontend, I agree with Tom that it shouldn't require
  that much direct interaction with the LLVM project.
  
  
   we've already had problems where distros refused to ship newer Mesa
   releases because radeon depended on a version of LLVM newer than the
   one they were shipping, [...]
  
  That's news to me, can you be more specific?
  
  That sounds like basically a distro issue though, since different LLVM
  versions can be installed in parallel (and the one used by default
  doesn't have to be the newest one). And it even works if another part of
  the same process uses a different version of LLVM.
 
 Yes, one can argue that it's a distribution issue - but it's an extremely 
 painful problem for distributions.
 
 For example, Debian was stuck on Mesa 9.2.2 for 4 months (2013-12-08 to 
 2014-03-22), and I was told this was because of LLVM versioning changes in 
 the other drivers (primarily radeon, I believe, but probably also llvmpipe).
 
 Mesa 9.2.2 hung the GPU every 5-10 minutes on Sandybridge, and we fixed that 
 in Mesa 9.2.3.  But we couldn't get people to actually ship it, and had to 
 field tons of bug reports from upset users for several months.
 
 Gentoo has also had trouble updating for similar reasons; Matt (the Gentoo
 Mesa package maintainer) can probably comment more.
 
 I've also heard stories from friends of mine who use radeonsi that they 
 couldn't get new GL features or compiler fixes unless they upgrade both Mesa 
 /and/ LLVM, and that LLVM was usually either not released or not available in 
 their distribution for a few months.
 
 Those are the sorts of things I'd like to avoid.  The compiler is easily the 
 most crucial part of a modern graphics stack; splitting it out into a 
 separate repository and project seems like a nightmare for people who care 
 about getting new drivers released and shipped in distributions in a timely 
 fashion.
 
 Or, looking at it the other way: today, everything you need as an Intel or 
 (AFAIK) Nouveau 3D user is nicely contained within Mesa.  Our community has 
 complete control over when we do those releases.  New important bug fixes, 
 performance improvements, or features?  Ship a new Mesa, and you're done.  
 That's a really nice feature I'd hate to lose.
 

It has been a challenge to match versions of LLVM and Mesa for radeonsi,
but as Michel mentioned, this has been made easier now that we are doing
LLVM point releases.

However, as I mentioned before if we were using LLVM IR as a common IR
it is unlikely that there would be any new features in Mesa that would
depend on changes in LLVM.  The only thing we would need to modify LLVM
for would be:
- Extending the C API
- Bug fixes for optimization passes
- Optimization pass improvements

And remember all these changes would be for improving common code that
is shared between drivers.  All of the important compiler features would
still go into the driver specific backends, which for most drivers are a
part of Mesa.

Even for

Re: [Mesa-dev] [PATCHv3 01/16] util: add _mesa_strtod and _mesa_strtof

2014-08-20 Thread Kenneth Graunke

On Wednesday, August 20, 2014 02:40:22 PM Chia-I Wu wrote:
 Both core mesa and glsl have their own wrappers for strtof_l.  Merge and move
 them to util/.  They are compiled with a C++ compiler so that we can make them
 thread-safe in a following commit.
 
 Signed-off-by: Chia-I Wu o...@lunarg.com
 ---
  src/glsl/Makefile.sources        |  3 +-
  src/glsl/glsl_lexer.ll           | 12 +++---
  src/glsl/s_expression.cpp        |  2 +-
  src/glsl/s_expression.h          |  2 +-
  src/glsl/strtod.c                | 79 ---
  src/glsl/strtod.h                | 46 ---
  src/mesa/main/imports.c          | 19 --
  src/mesa/main/imports.h          |  3 --
  src/mesa/program/program_lexer.l |  1 +
  src/util/Makefile.sources        |  3 +-
  src/util/strtod.cpp              | 81 +++
  src/util/strtod.h                | 46 +++
  12 files changed, 139 insertions(+), 158 deletions(-)
  delete mode 100644 src/glsl/strtod.c
  delete mode 100644 src/glsl/strtod.h
  create mode 100644 src/util/strtod.cpp
  create mode 100644 src/util/strtod.h

Patches 1-4 are:
Reviewed-by: Kenneth Graunke kenn...@whitecape.org


Re: [Mesa-dev] [PATCH] r600g: Fix flat/smooth shade state toggle

2014-08-20 Thread Marek Olšák

The flag is only used to set S_028644_FLAT_SHADE on all r600g chips. I
don't see it being used by the shader code generation.
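
To make the distinction concrete (state that changes the generated code belongs in the shader key, register-only state does not), here is a minimal sketch. The struct layout and the function names are hypothetical and are not taken from r600g. Whether flat shading is really register-only on every chip is exactly what this thread is discussing, so the sketch simply assumes it is.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical shader key: only state that changes the generated code. */
struct shader_key {
   unsigned color_two_side : 1;
   unsigned alpha_to_one   : 1;
};

/* Hypothetical pipeline state. In this sketch flatshade is assumed to
 * affect only a register write, so it stays out of the key. */
struct pipeline_state {
   struct shader_key key;
   bool flatshade;
};

/* A new shader variant is needed only when the key differs. */
bool needs_recompile(const struct pipeline_state *prev,
                     const struct pipeline_state *next)
{
   assert(prev != NULL && next != NULL);
   return prev->key.color_two_side != next->key.color_two_side ||
          prev->key.alpha_to_one != next->key.alpha_to_one;
}

/* A register update is enough when only register-level state differs. */
bool needs_register_update(const struct pipeline_state *prev,
                           const struct pipeline_state *next)
{
   assert(prev != NULL && next != NULL);
   return prev->flatshade != next->flatshade;
}
```

Keeping the key small matters because every distinct key value can produce another compiled variant of the same shader.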

Marek

On Wed, Aug 20, 2014 at 8:50 PM, Glenn Kennard glenn.kenn...@gmail.com wrote:
 On Wed, 20 Aug 2014 20:16:50 +0200, Marek Olšák mar...@gmail.com wrote:

 Generally, only states which need a full shader compilation must be in
 the shader key. Flatshade is not one of them, because it only causes
 register updates, so this is not a proper solution. Or I am missing
 something?

 Marek


 Evergreen/Cayman need to recompile the shader since the interpolation is
 done using either INTERP_XY instruction for smooth or INTERP_LOAD_P0 for
 flat. R600-R700 technically don't need to, but the prior code already does
 anyway since flat/smooth register setup is done from output values computed
 when compiling the shader.


 /Glenn

Re: [Mesa-dev] [RFC PATCH 00/16] A new IR for Mesa

On Tue, Aug 19, 2014 at 05:19:15PM -0700, Connor Abbott wrote:
 On Tue, Aug 19, 2014 at 3:57 PM, Tom Stellard t...@stellard.net wrote:
  On Tue, Aug 19, 2014 at 01:37:56PM -0700, Connor Abbott wrote:
  On Tue, Aug 19, 2014 at 11:40 AM, Francisco Jerez curroje...@riseup.net 
  wrote:
   Tom Stellard t...@stellard.net writes:
  
   On Tue, Aug 19, 2014 at 11:04:59AM -0400, Connor Abbott wrote:
   On Mon, Aug 18, 2014 at 8:52 PM, Michel Dänzer mic...@daenzer.net 
   wrote:
On 19.08.2014 01:28, Connor Abbott wrote:
On Mon, Aug 18, 2014 at 4:32 AM, Michel Dänzer mic...@daenzer.net wrote:
On 16.08.2014 09:12, Connor Abbott wrote:
I know what you might be thinking right now. Wait, *another* IR? Don't
we already have like 5 of those, not counting all the driver-specific
ones? Isn't this stuff complicated enough already? Well, there are some
pretty good reasons to start afresh (again...). In the years we've been
using GLSL IR, we've come to realize that, in fact, it's not what we
want *at all* to do optimizations on.
   
Did you evaluate using LLVM IR instead of inventing yet another one?
   
   
--
Earthling Michel Dänzer            |                  http://www.amd.com
Libre software enthusiast          |                Mesa and X developer
   
Yes. See
   
http://lists.freedesktop.org/archives/mesa-dev/2014-February/053502.html
   
and
   
http://lists.freedesktop.org/archives/mesa-dev/2014-February/053522.html
   
I know Ian can't deal with LLVM for some reason. I was wondering if
*you* evaluated it, and if so, why you rejected it.
   
   
  
  
   Well, first of all, the fact that Ian and Ken don't want to use it
   means that any plan to use LLVM for the Intel driver is dead in the
   water anyways - you can translate NIR into LLVM if you want, but for
   i965 we want to share optimizations between our 2 backends (FS and
   vec4) that we can't do today in GLSL IR so this is what we want to use
   for that, and since nobody else does anything with the core GLSL
   compiler except when they have to, when we start moving things out of
   GLSL IR this will probably replace GLSL IR as the infrastructure that
   all Mesa drivers use. But with that in mind, here are a few reasons
   why we wouldn't want to use LLVM:
  
   * LLVM wasn't built to understand structured CFG's, meaning that you
   need to re-structurize it using a pass that's fragile and prone to
   break if some other pass optimizes the shader in a way that makes it
   non-structured (i.e. not expressible in terms of loops and if
   statements). This loss of information also means that passes that need
   to know things like, for example, the loop nesting depth need to do an
   analysis pass whereas with NIR you can just walk up the control flow
   tree and count the number of loops we hit.
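
The tree walk described in that last point can be sketched as follows. This is only an illustration under assumed data structures: `cf_node`, its `parent` link and the node kinds are hypothetical names, not the actual NIR definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node kinds for a structured control-flow tree. */
enum cf_node_type {
   CF_NODE_BLOCK,
   CF_NODE_IF,
   CF_NODE_LOOP,
   CF_NODE_FUNCTION
};

struct cf_node {
   enum cf_node_type type;
   struct cf_node *parent; /* NULL at the root */
};

/* Count the loops that strictly enclose a node by following parent
 * links up to the root. No separate analysis pass is involved. */
unsigned loop_depth(const struct cf_node *node)
{
   unsigned depth = 0;

   assert(node != NULL);
   for (const struct cf_node *n = node->parent; n != NULL; n = n->parent) {
      if (n->type == CF_NODE_LOOP)
         depth++;
   }
   return depth;
}
```

With an unstructured CFG the same question needs a loop analysis over the whole graph, which is the extra work the point above refers to.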
  
  
   LLVM has a pass to structurize the CFG.  We use it in the radeon
   drivers, and it is run after all of the other LLVM optimizations which
   have no concept of structured CFG.  It's not bug free, but it works
   really well even with all of the complex OpenCL kernels we throw at it.
  
   Your point about losing information when the CFG is de-structurized is
   valid, but for things like loop depth, I'm not sure why we couldn't
   write an LLVM analysis pass for this (if one doesn't already exist).
  
  
   I don't think this is such a big deal either.  At least the
   structurization pass used on newer AMD hardware isn't fragile in the
   way you seem to imply -- AFAIK (unlike the old AMDIL heuristic
   algorithm) it's guaranteed to give you a valid structurized output no
   matter what the previous optimization passes have done to the CFG,
   modulo bugs.  I admit that the situation is nevertheless suboptimal.
   Ideally this information wouldn't get lost along the way.  For the long
   term we may want to represent structured control flow directly in the IR
   as you say, I just don't see how reinventing the IR saves us any work if
   we could just fix the existing one.
 
  It seems to me that something like how we represent control flow is a
  pretty fundamental part of the IR - it affects any optimization pass
  that needs to do anything beyond adding and removing instructions. How
  would you fix that, especially given that LLVM is primarily designed
  for CPU's where you don't want to be restricted to structured control
  flow at all? It seems like our goals (preserve the structure) conflict
  with the way LLVM has been designed.
 
 
  I think it's important to distinguish between LLVM IR and the tools
  available to manipulate it.  LLVM IR is meant to be a platform
  independent program representation.  There is nothing about the IR that
  would prevent

Re: [Mesa-dev] [RFC PATCH 00/16] A new IR for Mesa

2014-08-20 Thread Stéphane Marchesin

On Wed, Aug 20, 2014 at 11:56 AM, Matt Turner matts...@gmail.com wrote:
 On Wed, Aug 20, 2014 at 11:13 AM, Kenneth Graunke kenn...@whitecape.org 
 wrote:
 Gentoo has also had trouble updating for similar reasons; Matt (the Gentoo
 Mesa package maintainer) can probably comment more.

 Yes, at one point we were stuck two releases behind current Mesa (and
 this is Gentoo!) because we couldn't get the appropriate version of
 LLVM stabilized because a number of reverse dependencies didn't work
 with the new LLVM version.

 Having multiple versions installed in parallel breaks down pretty
 easily. Where do the headers go? Where do all the executables go? Do
 you version all of them and install one for each version? Do other
 distros allow multiple versions of LLVM to be installed in parallel?
 How do they manage?

For Chrome OS we have multiple versions of LLVM, basically one for
each consumer, and each consumer (except for the clang family) links
to its version statically. It is tedious but less painful than having
to change all the consumers at once (I certainly don't want to update
our ASan tools because I upgraded mesa). It's wasteful and by no means
ideal, but it's a pragmatic solution to a problem over which I have no
control :)

Stéphane


 I've also heard stories from friends of mine who use radeonsi that they 
 couldn't get new GL features or compiler fixes unless they upgrade both Mesa 
 /and/ LLVM, and that LLVM was usually either not released or not available 
 in their distribution for a few months.

 I get the sense that this is a problem that a backend in LLVM would
 cause, but maybe not so if we just used LLVM IR for the GLSL compiler.
 I think the C API is suitable for this kind of thing as well. Tom?

Re: [Mesa-dev] [RFC PATCH 00/16] A new IR for Mesa

On Wed, Aug 20, 2014 at 12:17 PM, Tom Stellard t...@stellard.net wrote:
 On Tue, Aug 19, 2014 at 05:19:15PM -0700, Connor Abbott wrote:
 On Tue, Aug 19, 2014 at 3:57 PM, Tom Stellard t...@stellard.net wrote:
  On Tue, Aug 19, 2014 at 01:37:56PM -0700, Connor Abbott wrote:
  On Tue, Aug 19, 2014 at 11:40 AM, Francisco Jerez curroje...@riseup.net 
  wrote:
   Tom Stellard t...@stellard.net writes:
  
   On Tue, Aug 19, 2014 at 11:04:59AM -0400, Connor Abbott wrote:
   On Mon, Aug 18, 2014 at 8:52 PM, Michel Dänzer mic...@daenzer.net 
   wrote:
On 19.08.2014 01:28, Connor Abbott wrote:
On Mon, Aug 18, 2014 at 4:32 AM, Michel Dänzer mic...@daenzer.net wrote:
On 16.08.2014 09:12, Connor Abbott wrote:
I know what you might be thinking right now. Wait, *another* IR? Don't
we already have like 5 of those, not counting all the driver-specific
ones? Isn't this stuff complicated enough already? Well, there are some
pretty good reasons to start afresh (again...). In the years we've been
using GLSL IR, we've come to realize that, in fact, it's not what we
want *at all* to do optimizations on.
   
Did you evaluate using LLVM IR instead of inventing yet another one?
   
   
   
Yes. See
   
http://lists.freedesktop.org/archives/mesa-dev/2014-February/053502.html
   
and
   
http://lists.freedesktop.org/archives/mesa-dev/2014-February/053522.html
   
I know Ian can't deal with LLVM for some reason. I was wondering if
*you* evaluated it, and if so, why you rejected it.
   
   
  
  
   Well, first of all, the fact that Ian and Ken don't want to use it
   means that any plan to use LLVM for the Intel driver is dead in the
   water anyways - you can translate NIR into LLVM if you want, but for
   i965 we want to share optimizations between our 2 backends (FS and
   vec4) that we can't do today in GLSL IR so this is what we want to use
   for that, and since nobody else does anything with the core GLSL
   compiler except when they have to, when we start moving things out of
   GLSL IR this will probably replace GLSL IR as the infrastructure that
   all Mesa drivers use. But with that in mind, here are a few reasons
   why we wouldn't want to use LLVM:
  
   * LLVM wasn't built to understand structured CFG's, meaning that you
   need to re-structurize it using a pass that's fragile and prone to
   break if some other pass optimizes the shader in a way that makes it
   non-structured (i.e. not expressible in terms of loops and if
   statements). This loss of information also means that passes that need
   to know things like, for example, the loop nesting depth need to do an
   analysis pass whereas with NIR you can just walk up the control flow
   tree and count the number of loops we hit.
  
  
   LLVM has a pass to structurize the CFG.  We use it in the radeon
   drivers, and it is run after all of the other LLVM optimizations which
   have no concept of structured CFG.  It's not bug free, but it works
   really well even with all of the complex OpenCL kernels we throw at it.
  
   Your point about losing information when the CFG is de-structurized is
   valid, but for things like loop depth, I'm not sure why we couldn't
   write an LLVM analysis pass for this (if one doesn't already exist).
  
  
   I don't think this is such a big deal either.  At least the
   structurization pass used on newer AMD hardware isn't fragile in the
   way you seem to imply -- AFAIK (unlike the old AMDIL heuristic
   algorithm) it's guaranteed to give you a valid structurized output no
   matter what the previous optimization passes have done to the CFG,
   modulo bugs.  I admit that the situation is nevertheless suboptimal.
   Ideally this information wouldn't get lost along the way.  For the long
   term we may want to represent structured control flow directly in the IR
   as you say, I just don't see how reinventing the IR saves us any work if
   we could just fix the existing one.
 
  It seems to me that something like how we represent control flow is a
  pretty fundamental part of the IR - it affects any optimization pass
  that needs to do anything beyond adding and removing instructions. How
  would you fix that, especially given that LLVM is primarily designed
  for CPU's where you don't want to be restricted to structured control
  flow at all? It seems like our goals (preserve the structure) conflict
  with the way LLVM has been designed.
 
 
  I think it's important to distinguish between LLVM IR and the tools
  available to manipulate it.  LLVM IR is meant to be a platform
  independent

Re: [Mesa-dev] [PATCH 03/19] glx/drisw: add support for DRI2rendererQueryExtension

On 20/08/14 19:32, Jon TURNEY wrote:
 On 18/08/2014 13:08, Emil Velikov wrote:
 On 18/08/14 12:47, Jon TURNEY wrote:
 On 14/08/2014 23:18, Emil Velikov wrote:
 The extension is used by GLX_MESA_query_renderer, which
 can be provided for by hardware and software drivers.

 v2: Use designated initializers.
 v3: Move drisw_query_renderer_*() to dri2_query_renderer.c

 This breaks my build (see [1])

 Ouch, I've completely forgotten about your recent-ish changes in here. Sorry for
 the breakage.

 I guess something like the attached is needed.

 Possibly dri2_query_renderer.c needs to be renamed, since its contents are now
 used for more than dri[23].

 My initial plan was to move the functions to dri_common.c, although that
 caused 'make check' to explode so I've kept them here as per Ian's 
 suggestion.
 Renaming the file makes sense imho.
 
 With a couple of small changes, I believe that you should be safe with
 dropping the above header and the HAVE_LIBDRM guards below.

 The small changes:
   - dri*_query_renderer_* into their respective dri*_priv.h
 
 I had a go at writing the patch like that, which seems to work.
 
 Revised patch attached.
 
Can you just add a "glx:" or similar prefix to the subject line before
committing? Other than that it looks good imho.
Reviewed-by: Emil Velikov emil.l.veli...@gmail.com

   - Perhaps move a struct from dri2.h to dri2_priv.h
 
 I don't know which struct you mean here.  I didn't find one I needed to move
 to make things build.
 
I know that the dri2 waters are quite deep but wasn't sure how murky they are,
thus the "Perhaps".

 The dri2_convert_glx_query_renderer_attribs() helper function could possibly
 stand to be given a more generic name.
 
IMHO one could do a few cleanups in glx, and I highly doubt that anyone would
object. I would be quite happy if anyone bothered :)

Thanks
Emil


 
 0001-Fix-build-since-679c2ef-glx-drisw-add-support-for-DR.patch
 
 
 From 1f06833a856b98b6c5248f0f001bf5b3a74ae010 Mon Sep 17 00:00:00 2001
 From: Jon TURNEY jon.tur...@dronecode.org.uk
 Date: Sun, 17 Aug 2014 17:22:22 +0100
 Subject: [PATCH] Fix build since 679c2ef glx/drisw: add support for
  DRI2rendererQueryExtension, when only building drisw renderer.
 
 v2:
 - Move dri*_query_renderer_* into their respective dri*_priv.h headers
 - Drop the now-unneeded include of dri2.h from dri2_query_renderer.c
 - Rename dri2_query_renderer.c as dri_common_query_renderer.c, as its
   contents are now used for more than dri[23]
 
 Signed-off-by: Jon TURNEY jon.tur...@dronecode.org.uk
 ---
  src/glx/Makefile.am                                      |  6 +++---
  src/glx/dri2.h                                           | 16 ----------------
  src/glx/dri2_priv.h                                      |  8 ++++++++
  src/glx/dri3_priv.h                                      |  9 +++++++++
  ...dri2_query_renderer.c => dri_common_query_renderer.c} |  1 -
  5 files changed, 20 insertions(+), 20 deletions(-)
  rename src/glx/{dri2_query_renderer.c => dri_common_query_renderer.c} (99%)
 
 diff --git a/src/glx/Makefile.am b/src/glx/Makefile.am
 index cdd898e..4515312 100644
 --- a/src/glx/Makefile.am
 +++ b/src/glx/Makefile.am
 @@ -96,7 +96,8 @@ endif
  if HAVE_DRICOMMON
  libglx_la_SOURCES += \
 xfont.c \
 -   dri_common.c
 +   dri_common.c \
 +   dri_common_query_renderer.c
  endif
  
  if HAVE_DRI2
 @@ -104,8 +105,7 @@ libglx_la_SOURCES += \
 dri_glx.c \
 XF86dri.c \
 dri2_glx.c \
 -   dri2.c \
 -   dri2_query_renderer.c
 +   dri2.c
  endif
  
  if HAVE_DRI3
 diff --git a/src/glx/dri2.h b/src/glx/dri2.h
 index d07b296..4be5bf8 100644
 --- a/src/glx/dri2.h
 +++ b/src/glx/dri2.h
 @@ -88,20 +88,4 @@ DRI2CopyRegion(Display * dpy, XID drawable,
 XserverRegion region,
 CARD32 dest, CARD32 src);
  
 -_X_HIDDEN int
 -dri2_query_renderer_integer(struct glx_screen *base, int attribute,
 -unsigned int *value);
 -
 -_X_HIDDEN int
 -dri2_query_renderer_string(struct glx_screen *base, int attribute,
 -   const char **value);
 -
 -_X_HIDDEN int
 -dri3_query_renderer_integer(struct glx_screen *base, int attribute,
 -unsigned int *value);
 -
 -_X_HIDDEN int
 -dri3_query_renderer_string(struct glx_screen *base, int attribute,
 -   const char **value);
 -
  #endif
 diff --git a/src/glx/dri2_priv.h b/src/glx/dri2_priv.h
 index c21eee5..b93d158 100644
 --- a/src/glx/dri2_priv.h
 +++ b/src/glx/dri2_priv.h
 @@ -50,3 +50,11 @@ struct dri2_screen {
  
 int show_fps_interval;
  };
 +
 +_X_HIDDEN int
 +dri2_query_renderer_integer(struct glx_screen *base, int attribute,
 +unsigned int *value);
 +
 +_X_HIDDEN int
 +dri2_query_renderer_string(struct glx_screen *base, int attribute,
 +   const char **value);
 diff --git a/src/glx/dri3_priv.h b/src/glx/dri3_priv.h
 index c0e35ee..248fa28

Re: [Mesa-dev] [RFC PATCH 00/16] A new IR for Mesa

2014-08-20 Thread Henri Verbeet

On 20 August 2014 20:13, Kenneth Graunke kenn...@whitecape.org wrote:
 I've also heard stories from friends of mine who use radeonsi that they 
 couldn't get new GL features or compiler fixes unless they upgrade both Mesa 
 /and/ LLVM, and that LLVM was usually either not released or not available in 
 their distribution for a few months.

For whatever it's worth, I have been avoiding radeonsi in part because
of the LLVM dependency. Some of the other issues already mentioned
aside, I also think it makes it just painful to do bisects over
moderate/longer periods of time. I'm sure AMD carefully considered the
tradeoff, and that it's worth it for them, but purely as a
user/downstream I'd say using LLVM for the radeonsi compiler was a
mistake.

Re: [Mesa-dev] [RFC PATCH 00/16] A new IR for Mesa

On Wed, Aug 20, 2014 at 12:16 PM, Stéphane Marchesin
stephane.marche...@gmail.com wrote:
 On Wed, Aug 20, 2014 at 11:56 AM, Matt Turner matts...@gmail.com wrote:
 Having multiple versions installed in parallel breaks down pretty
 easily. Where do the headers go? Where do all the executables go? Do
 you version all of them and install one for each version? Do other
 distros allow multiple versions of LLVM to be installed in parallel?
 How do they manage?

 For Chrome OS we have multiple versions of LLVM, basically one for
 each consumer, and each consumer (except for the clang family) links
 to its version statically. It is tedious but less painful than having
 to change all the consumers at once (I certainly don't want to update
 our ASan tools because I upgraded mesa). It's wasteful and by no means
 ideal, but it's a pragmatic solution to a problem over which I have no
 control :)

Right. That solution would never be acceptable for Gentoo.

The LLVM maintainer in Gentoo also confirmed to me that we used to
allow multiple versions of LLVM to be installed side-by-side, but it
required a lot of patching and was a large pain.

The fundamental problem here seems to be that the intended usage model
for LLVM is that you just statically link it into your project. Seems
fine for proprietary software, not so fine for free software.

Re: [Mesa-dev] [RFC PATCH 00/16] A new IR for Mesa

On Wed, Aug 20, 2014 at 11:56:32AM -0700, Matt Turner wrote:
 On Wed, Aug 20, 2014 at 11:13 AM, Kenneth Graunke kenn...@whitecape.org 
 wrote:
  Gentoo has also had trouble updating for similar reasons; Matt (the Gentoo
  Mesa package maintainer) can probably comment more.
 
 Yes, at one point we were stuck two releases behind current Mesa (and
 this is Gentoo!) because we couldn't get the appropriate version of
 LLVM stabilized because a number of reverse dependencies didn't work
 with the new LLVM version.
 
 Having multiple versions installed in parallel breaks down pretty
 easily. Where do the headers go? Where do all the executables go? Do
 you version all of them and install one for each version? Do other
 distros allow multiple versions of LLVM to be installed in parallel?
 How do they manage?
 

On one of my (Gentoo) dev systems, I have 8 different versions of LLVM
installed. All I do is install each version to a different prefix. For
example:

/usr/local/llvm/3.6/
/usr/local/llvm/3.5/

When I build a project like mesa which depends on LLVM, I just point it
to the prefix of the LLVM version that I want to use.  Would distros be
able to do something like this?
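
As a small sketch of how a build helper might pick one of those side-by-side installs, the function below assembles the path to a given version's llvm-config. It assumes the one-prefix-per-version layout shown above, with llvm-config under each prefix's bin/ directory; `llvm_config_path` is a hypothetical helper, not part of Mesa or LLVM.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Write "<root>/<major>.<minor>/bin/llvm-config" into buf.
 * Returns 0 on success, -1 if the result does not fit in buf. */
int llvm_config_path(char *buf, size_t size, const char *root,
                     int major, int minor)
{
   assert(buf != NULL && root != NULL);
   int n = snprintf(buf, size, "%s/%d.%d/bin/llvm-config", root, major, minor);
   return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

Calling it with root "/usr/local/llvm" and version 3.5 would yield /usr/local/llvm/3.5/bin/llvm-config, which the build can then query for that version's include and library directories.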

  I've also heard stories from friends of mine who use radeonsi that they 
  couldn't get new GL features or compiler fixes unless they upgrade both 
  Mesa /and/ LLVM, and that LLVM was usually either not released or not 
  available in their distribution for a few months.
 
 I get the sense that this is a problem that a backend in LLVM would
 cause, but maybe not so if we just used LLVM IR for the GLSL compiler.
 I think the C API is suitable for this kind of thing as well. Tom?

Yes, see my reply to Ken, but basically using LLVM IR for just GLSL
would require a much smaller subset of LLVM features, most of which
are pretty stable.  It would even use fewer features than llvmpipe
does, and llvmpipe still works with older LLVM versions.

-Tom

Re: [Mesa-dev] [PATCH] r600g: Fix flat/smooth shade state toggle


On Wed, 20 Aug 2014 21:04:34 +0200, Marek Olšák mar...@gmail.com wrote:


The flag is only used to set S_028644_FLAT_SHADE on all r600g chips. I
don't see it being used by the shader code generation.

Marek



Ah, I see. Will respin the patch with an alternate solution that won't require
shader recompilation. Consider v1 dropped.



/Glenn

Re: [Mesa-dev] [RFC PATCH 00/16] A new IR for Mesa

On Wed, Aug 20, 2014 at 12:26:15PM -0700, Connor Abbott wrote:
 On Wed, Aug 20, 2014 at 12:17 PM, Tom Stellard t...@stellard.net wrote:
  On Tue, Aug 19, 2014 at 05:19:15PM -0700, Connor Abbott wrote:
  On Tue, Aug 19, 2014 at 3:57 PM, Tom Stellard t...@stellard.net wrote:
   On Tue, Aug 19, 2014 at 01:37:56PM -0700, Connor Abbott wrote:
   On Tue, Aug 19, 2014 at 11:40 AM, Francisco Jerez 
   curroje...@riseup.net wrote:
Tom Stellard t...@stellard.net writes:
   
On Tue, Aug 19, 2014 at 11:04:59AM -0400, Connor Abbott wrote:
On Mon, Aug 18, 2014 at 8:52 PM, Michel Dänzer mic...@daenzer.net 
wrote:
 On 19.08.2014 01:28, Connor Abbott wrote:
 On Mon, Aug 18, 2014 at 4:32 AM, Michel Dänzer mic...@daenzer.net wrote:
 On 16.08.2014 09:12, Connor Abbott wrote:
 I know what you might be thinking right now. Wait, *another* IR? Don't
 we already have like 5 of those, not counting all the driver-specific
 ones? Isn't this stuff complicated enough already? Well, there are some
 pretty good reasons to start afresh (again...). In the years we've been
 using GLSL IR, we've come to realize that, in fact, it's not what we
 want *at all* to do optimizations on.

 Did you evaluate using LLVM IR instead of inventing yet another one?



 Yes. See

 http://lists.freedesktop.org/archives/mesa-dev/2014-February/053502.html

 and

 http://lists.freedesktop.org/archives/mesa-dev/2014-February/053522.html

 I know Ian can't deal with LLVM for some reason. I was wondering 
 if
 *you* evaluated it, and if so, why you rejected it.


 --
 Earthling Michel Dänzer|  
 http://www.amd.com
 Libre software enthusiast  |Mesa and X 
 developer
   
   
Well, first of all, the fact that Ian and Ken don't want to use it
means that any plan to use LLVM for the Intel driver is dead in the
water anyways - you can translate NIR into LLVM if you want, but for
i965 we want to share optimizations between our 2 backends (FS and
vec4) that we can't do today in GLSL IR so this is what we want to 
use
for that, and since nobody else does anything with the core GLSL
compiler except when they have to, when we start moving things out 
of
GLSL IR this will probably replace GLSL IR as the infrastructure 
that
all Mesa drivers use. But with that in mind, here are a few reasons
why we wouldn't want to use LLVM:
   
* LLVM wasn't built to understand structured CFG's, meaning that you
need to re-structurize it using a pass that's fragile and prone to
break if some other pass optimizes the shader in a way that makes 
it
non-structured (i.e. not expressible in terms of loops and if
statements). This loss of information also means that passes that 
need
to know things like, for example, the loop nesting depth need to do 
an
analysis pass whereas with NIR you can just walk up the control flow
tree and count the number of loops we hit.
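
The "walk up the control flow tree" idea from the bullet above can be sketched as follows. This is only an illustration under assumed names: `cf_node`, `cf_kind` and `loop_depth` are hypothetical and are not taken from the NIR patches.

```c
#include <stddef.h>

/* Hypothetical node kinds for a structured control-flow tree. */
enum cf_kind { CF_FUNCTION, CF_IF, CF_LOOP, CF_BLOCK };

struct cf_node {
    enum cf_kind kind;
    struct cf_node *parent; /* NULL at the root of the tree */
};

/* Count the loops enclosing `node` by following parent links to the root.
 * No separate analysis pass is needed because the nesting is explicit. */
static unsigned loop_depth(const struct cf_node *node)
{
    unsigned depth = 0;
    for (const struct cf_node *n = node->parent; n != NULL; n = n->parent) {
        if (n->kind == CF_LOOP)
            depth++;
    }
    return depth;
}
```

With an unstructured CFG the same number would have to be recovered by a loop analysis over the basic-block graph, which is the extra work the bullet refers to.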
   
   
LLVM has a pass to structurize the CFG.  We use it in the radeon
drivers, and it is run after all of the other LLVM optimizations 
which have
no concept of structured CFG.  It's not bug free, but it works really
well even with all of the complex OpenCL kernels we throw at it.
   
Your point about losing information when the CFG is de-structurized 
is
valid, but for things like loop depth, I'm not sure why we couldn't 
write an
LLVM analysis pass for this (if one doesn't already exist).
   
   
I don't think this is such a big deal either.  At least the
structurization pass used on newer AMD hardware isn't fragile in the
way you seem to imply -- AFAIK (unlike the old AMDIL heuristic
algorithm) it's guaranteed to give you a valid structurized output no
matter what the previous optimization passes have done to the CFG,
modulo bugs.  I admit that the situation is nevertheless suboptimal.
Ideally this information wouldn't get lost along the way.  For the 
long
term we may want to represent structured control flow directly in the 
IR
as you say, I just don't see how reinventing the IR saves us any work 
if
we could just fix the existing one.
  
   It seems to me that something like how we represent control flow is a
   pretty fundamental part of the IR - it affects any optimization pass
   that needs to do anything beyond adding and removing instructions. How
   would you fix that, especially given that LLVM is primarily designed
   for CPU's where you don't want to be restricted to structured control
   flow at all? It seems like our goals

[Mesa-dev] [PATCH 3/4] dri/radeon: cleanup the radeon_context vtbl

Remove the set-but-unused, and set-but-empty vtable entries.
Most likely a leftover from the dri1 days.

Cc: Marek Olšák marek.ol...@amd.com
Cc: Michel Dänzer michel.daen...@amd.com
Signed-off-by: Emil Velikov emil.l.veli...@gmail.com
---
 src/mesa/drivers/dri/r200/r200_context.c   | 24 --
 src/mesa/drivers/dri/r200/r200_state.c | 52 --
 src/mesa/drivers/dri/r200/r200_state.h |  1 -
 src/mesa/drivers/dri/radeon/radeon_common.c|  3 --
 .../drivers/dri/radeon/radeon_common_context.h |  4 --
 src/mesa/drivers/dri/radeon/radeon_context.c   | 26 ---
 src/mesa/drivers/dri/radeon/radeon_state.c | 52 --
 src/mesa/drivers/dri/radeon/radeon_state.h |  1 -
 8 files changed, 163 deletions(-)

diff --git a/src/mesa/drivers/dri/r200/r200_context.c 
b/src/mesa/drivers/dri/r200/r200_context.c
index 7815c4e..931f437 100644
--- a/src/mesa/drivers/dri/r200/r200_context.c
+++ b/src/mesa/drivers/dri/r200/r200_context.c
@@ -143,27 +143,6 @@ static void r200InitDriverFuncs( struct dd_function_table 
*functions )
 }
 
 
-static void r200_get_lock(radeonContextPtr radeon)
-{
-   r200ContextPtr rmesa = (r200ContextPtr)radeon;
-   drm_radeon_sarea_t *sarea = radeon->sarea;
-
-   R200_STATECHANGE( rmesa, ctx );
-   if (rmesa->radeon.sarea->tiling_enabled) {
-      rmesa->hw.ctx.cmd[CTX_RB3D_COLORPITCH] |= R200_COLOR_TILE_ENABLE;
-   }
-   else rmesa->hw.ctx.cmd[CTX_RB3D_COLORPITCH] &= ~R200_COLOR_TILE_ENABLE;
-
-   if ( sarea->ctx_owner != rmesa->radeon.dri.hwContext ) {
-      sarea->ctx_owner = rmesa->radeon.dri.hwContext;
-   }
-
-}
-
-static void r200_vtbl_emit_cs_header(struct radeon_cs *cs, radeonContextPtr 
rmesa)
-{
-}
-
 static void r200_emit_query_finish(radeonContextPtr radeon)
 {
BATCH_LOCALS(radeon);
@@ -180,9 +159,6 @@ static void r200_emit_query_finish(radeonContextPtr radeon)
 
 static void r200_init_vtbl(radeonContextPtr radeon)
 {
-   radeon->vtbl.get_lock = r200_get_lock;
-   radeon->vtbl.update_viewport_offset = r200UpdateViewportOffset;
-   radeon->vtbl.emit_cs_header = r200_vtbl_emit_cs_header;
radeon->vtbl.swtcl_flush = r200_swtcl_flush;
radeon->vtbl.fallback = r200Fallback;
radeon->vtbl.update_scissor = r200_vtbl_update_scissor;
diff --git a/src/mesa/drivers/dri/r200/r200_state.c 
b/src/mesa/drivers/dri/r200/r200_state.c
index 983430f..2ad8439 100644
--- a/src/mesa/drivers/dri/r200/r200_state.c
+++ b/src/mesa/drivers/dri/r200/r200_state.c
@@ -1616,58 +1616,6 @@ static void r200DepthRange(struct gl_context *ctx)
r200UpdateWindow( ctx );
 }
 
-void r200UpdateViewportOffset( struct gl_context *ctx )
-{
-   r200ContextPtr rmesa = R200_CONTEXT(ctx);
-   __DRIdrawable *dPriv = radeon_get_drawable(&rmesa->radeon);
-   GLfloat xoffset = (GLfloat)0;
-   GLfloat yoffset = (GLfloat)dPriv->h;
-   const GLfloat *v = ctx->ViewportArray[0]._WindowMap.m;
-
-   float_ui32_type tx;
-   float_ui32_type ty;
-
-   tx.f = v[MAT_TX] + xoffset;
-   ty.f = (- v[MAT_TY]) + yoffset;
-
-   if ( rmesa->hw.vpt.cmd[VPT_SE_VPORT_XOFFSET] != tx.ui32 ||
-   rmesa->hw.vpt.cmd[VPT_SE_VPORT_YOFFSET] != ty.ui32 )
-   {
-  /* Note: this should also modify whatever data the context reset
-   * code uses...
-   */
-  R200_STATECHANGE( rmesa, vpt );
-  rmesa->hw.vpt.cmd[VPT_SE_VPORT_XOFFSET] = tx.ui32;
-  rmesa->hw.vpt.cmd[VPT_SE_VPORT_YOFFSET] = ty.ui32;
-
-  /* update polygon stipple x/y screen offset */
-  {
- GLuint stx, sty;
- GLuint m = rmesa->hw.msc.cmd[MSC_RE_MISC];
-
- m &= ~(R200_STIPPLE_X_OFFSET_MASK |
-R200_STIPPLE_Y_OFFSET_MASK);
-
- /* add magic offsets, then invert */
- stx = 31 - ((-1) & R200_STIPPLE_COORD_MASK);
- sty = 31 - ((dPriv->h - 1)
- & R200_STIPPLE_COORD_MASK);
-
- m |= ((stx << R200_STIPPLE_X_OFFSET_SHIFT) |
-   (sty << R200_STIPPLE_Y_OFFSET_SHIFT));
-
- if ( rmesa->hw.msc.cmd[MSC_RE_MISC] != m ) {
-R200_STATECHANGE( rmesa, msc );
-   rmesa->hw.msc.cmd[MSC_RE_MISC] = m;
- }
-  }
-   }
-
-   radeonUpdateScissor( ctx );
-}
-
-
-
 /* =
  * Miscellaneous
  */
diff --git a/src/mesa/drivers/dri/r200/r200_state.h 
b/src/mesa/drivers/dri/r200/r200_state.h
index a396b06..9111981 100644
--- a/src/mesa/drivers/dri/r200/r200_state.h
+++ b/src/mesa/drivers/dri/r200/r200_state.h
@@ -43,7 +43,6 @@ extern void r200InitTnlFuncs( struct gl_context *ctx );
 
 extern void r200UpdateMaterial( struct gl_context *ctx );
 
-extern void r200UpdateViewportOffset( struct gl_context *ctx );
 extern void r200UpdateWindow( struct gl_context *ctx );
 extern void r200UpdateDrawBuffer(struct gl_context *ctx);
 
diff --git a/src/mesa/drivers/dri/radeon/radeon_common.c 
b/src/mesa/drivers/dri/radeon/radeon_common.c
index 515e55a..966e10a 100644
--- a/src/mesa/drivers/dri/radeon/radeon_common.c
+++

[Mesa-dev] [PATCH v2] r600g: Fix flat/smooth shade state toggle

If only the flat/smooth shade state changed between
two render calls the prior code would miss updating the
hardware state.

Also add check for sprite coord, potentially same type
of issue otherwise for it.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=81967
Signed-off-by: Glenn Kennard glenn.kenn...@gmail.com
---
V2:
 - No new shader variant created
 - Also check for sprite coord enable since its state is updated
   in similar fashion to flatshade.

 src/gallium/drivers/r600/r600_state_common.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/r600/r600_state_common.c 
b/src/gallium/drivers/r600/r600_state_common.c
index 7594d0e..028d800 100644
--- a/src/gallium/drivers/r600/r600_state_common.c
+++ b/src/gallium/drivers/r600/r600_state_common.c
@@ -1227,7 +1227,9 @@ static bool r600_update_derived_state(struct r600_context 
*rctx)
if (unlikely(!rctx->ps_shader->current))
return false;
 
-   if (unlikely(ps_dirty || rctx->pixel_shader.shader != 
rctx->ps_shader->current)) {
+   if (unlikely(ps_dirty || rctx->pixel_shader.shader != 
rctx->ps_shader->current ||
+   rctx->rasterizer->sprite_coord_enable != 
rctx->ps_shader->current->sprite_coord_enable ||
+   rctx->rasterizer->flatshade != 
rctx->ps_shader->current->flatshade)) {
 
if (rctx->cb_misc_state.nr_ps_color_outputs != 
rctx->ps_shader->current->nr_ps_color_outputs) {
rctx->cb_misc_state.nr_ps_color_outputs = 
rctx->ps_shader->current->nr_ps_color_outputs;
-- 
1.9.1
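
For readers following the v2 patch above, the widened condition can be sketched in isolation. This is a simplified illustration, not the driver code: `raster_state`, `shader_variant`, `context` and `needs_ps_update` are hypothetical stand-ins, and only the two fields named in the commit message are modelled.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the driver structures. */
struct raster_state   { bool flatshade; unsigned sprite_coord_enable; };
struct shader_variant { bool flatshade; unsigned sprite_coord_enable; };

struct context {
    const struct raster_state   *rasterizer;   /* currently bound rasterizer state */
    const struct shader_variant *bound_shader; /* variant whose state was last emitted */
    const struct shader_variant *current;      /* variant selected for this draw */
};

/* Returns true when the pixel-shader hardware state must be re-emitted.
 * Comparing only the shader pointers misses the case where the same
 * variant is reused but the rasterizer state changed between draws. */
static bool needs_ps_update(const struct context *ctx, bool ps_dirty)
{
    return ps_dirty ||
           ctx->bound_shader != ctx->current ||
           ctx->rasterizer->sprite_coord_enable != ctx->current->sprite_coord_enable ||
           ctx->rasterizer->flatshade != ctx->current->flatshade;
}
```

The sketch assumes the update path refreshes the values cached on the variant afterwards, so the comparison settles once the hardware state has been emitted.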


[Mesa-dev] [PATCH 4/4] dri/radeon: nuke the remaining references to sarea

It was an 'interesting' feature which I'm glad we no longer
use as of dri2.

Cc: Marek Olšák marek.ol...@amd.com
Cc: Michel Dänzer michel.daen...@amd.com
Signed-off-by: Emil Velikov emil.l.veli...@gmail.com
---
 src/mesa/drivers/dri/r200/r200_ioctl.c  | 7 ---
 src/mesa/drivers/dri/radeon/radeon_common_context.h | 1 -
 src/mesa/drivers/dri/radeon/radeon_screen.h | 3 ---
 3 files changed, 11 deletions(-)

diff --git a/src/mesa/drivers/dri/r200/r200_ioctl.c 
b/src/mesa/drivers/dri/r200/r200_ioctl.c
index ef0d637..515be92 100644
--- a/src/mesa/drivers/dri/r200/r200_ioctl.c
+++ b/src/mesa/drivers/dri/r200/r200_ioctl.c
@@ -62,13 +62,6 @@ static void r200Clear( struct gl_context *ctx, GLbitfield 
mask )
BUFFER_BIT_DEPTH | BUFFER_BIT_STENCIL |
BUFFER_BIT_COLOR0;
 
-   if ( R200_DEBUG & RADEON_IOCTL ) {
-  if (rmesa->radeon.sarea)
-  fprintf( stderr, "r200Clear %x %d\n", mask, 
rmesa->radeon.sarea->pfCurrentPage);
-  else
-  fprintf( stderr, "r200Clear %x radeon->sarea is NULL\n", mask);
-   }
-
radeonFlush( ctx );
 
hwmask = mask & hwbits;
diff --git a/src/mesa/drivers/dri/radeon/radeon_common_context.h 
b/src/mesa/drivers/dri/radeon/radeon_common_context.h
index 8330b17..cfed408 100644
--- a/src/mesa/drivers/dri/radeon/radeon_common_context.h
+++ b/src/mesa/drivers/dri/radeon/radeon_common_context.h
@@ -406,7 +406,6 @@ struct radeon_context {
 
/* Drawable information */
unsigned int lastStamp;
-   drm_radeon_sarea_t *sarea;  /* Private SAREA data */
 
/* Mirrors of some DRI state */
struct radeon_dri_mirror dri;
diff --git a/src/mesa/drivers/dri/radeon/radeon_screen.h 
b/src/mesa/drivers/dri/radeon/radeon_screen.h
index b5cc075..b3e9267 100644
--- a/src/mesa/drivers/dri/radeon/radeon_screen.h
+++ b/src/mesa/drivers/dri/radeon/radeon_screen.h
@@ -45,7 +45,6 @@ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 
SOFTWARE.
 #include "dri_util.h"
 #include "radeon_chipset.h"
 #include "radeon_reg.h"
-#include "drm_sarea.h"
 #include "xmlconfig.h"
 
 
@@ -88,7 +87,6 @@ typedef struct radeon_screen {
__volatile__ uint32_t *scratch;
 
__DRIscreen *driScreen;
-   unsigned int sarea_priv_offset;
unsigned int gart_buffer_offset;/* offset in card memory space */
unsigned int gart_texture_offset;   /* offset in card memory space */
unsigned int gart_base;
@@ -100,7 +98,6 @@ typedef struct radeon_screen {
 
int num_gb_pipes;
int num_z_pipes;
-   drm_radeon_sarea_t *sarea;  /* Private SAREA data */
struct radeon_bo_manager *bom;
 
 } radeonScreenRec, *radeonScreenPtr;
-- 
2.0.2


[Mesa-dev] [PATCH 1/4] dri/radeon: drop obsolete radeon_{dri, macros}.h headers

Both have been unused for at least a couple of years.
For example the last user of radeon_macros.h was removed with

commit 8c11f0a88300f7bc3f05a12789c781ba0f4b3cc6
Author: Eric Anholt e...@anholt.net
Date:   Fri Oct 14 13:27:02 2011 -0700

radeon: Drop the legacy BO manager code.

Cc: Marek Olšák marek.ol...@amd.com
Cc: Michel Dänzer michel.daen...@amd.com
Signed-off-by: Emil Velikov emil.l.veli...@gmail.com
---
 src/mesa/drivers/dri/r200/r200_ioctl.h |   1 -
 src/mesa/drivers/dri/r200/server/radeon_dri.h  |   1 -
 src/mesa/drivers/dri/r200/server/radeon_macros.h   |   1 -
 src/mesa/drivers/dri/radeon/radeon_screen.c|   1 -
 src/mesa/drivers/dri/radeon/radeon_screen.h|   3 +-
 src/mesa/drivers/dri/radeon/server/radeon_dri.h| 115 --
 src/mesa/drivers/dri/radeon/server/radeon_macros.h | 128 -
 7 files changed, 2 insertions(+), 248 deletions(-)
 delete mode 120000 src/mesa/drivers/dri/r200/server/radeon_dri.h
 delete mode 120000 src/mesa/drivers/dri/r200/server/radeon_macros.h
 delete mode 100644 src/mesa/drivers/dri/radeon/server/radeon_dri.h
 delete mode 100644 src/mesa/drivers/dri/radeon/server/radeon_macros.h

diff --git a/src/mesa/drivers/dri/r200/r200_ioctl.h 
b/src/mesa/drivers/dri/r200/r200_ioctl.h
index ab5f822..9133a22 100644
--- a/src/mesa/drivers/dri/r200/r200_ioctl.h
+++ b/src/mesa/drivers/dri/r200/r200_ioctl.h
@@ -36,7 +36,6 @@ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 
SOFTWARE.
 #define __R200_IOCTL_H__
 
 #include "main/simple_list.h"
-#include "radeon_dri.h"
 
 #include "radeon_bo_gem.h"
 #include "radeon_cs_gem.h"
diff --git a/src/mesa/drivers/dri/r200/server/radeon_dri.h 
b/src/mesa/drivers/dri/r200/server/radeon_dri.h
deleted file mode 120000
index 27c591d..0000000
--- a/src/mesa/drivers/dri/r200/server/radeon_dri.h
+++ /dev/null
@@ -1 +0,0 @@
-../../radeon/server/radeon_dri.h
\ No newline at end of file
diff --git a/src/mesa/drivers/dri/r200/server/radeon_macros.h 
b/src/mesa/drivers/dri/r200/server/radeon_macros.h
deleted file mode 120000
index c56cd73..0000000
--- a/src/mesa/drivers/dri/r200/server/radeon_macros.h
+++ /dev/null
@@ -1 +0,0 @@
-../../radeon/server/radeon_macros.h
\ No newline at end of file
diff --git a/src/mesa/drivers/dri/radeon/radeon_screen.c 
b/src/mesa/drivers/dri/radeon/radeon_screen.c
index 9a6fbbd..044e212 100644
--- a/src/mesa/drivers/dri/radeon/radeon_screen.c
+++ b/src/mesa/drivers/dri/radeon/radeon_screen.c
@@ -45,7 +45,6 @@ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 
SOFTWARE.
 #include swrast/s_renderbuffer.h
 
 #include radeon_chipset.h
-#include radeon_macros.h
 #include radeon_screen.h
 #include radeon_common.h
 #include radeon_common_context.h
diff --git a/src/mesa/drivers/dri/radeon/radeon_screen.h 
b/src/mesa/drivers/dri/radeon/radeon_screen.h
index 9b77627..b5cc075 100644
--- a/src/mesa/drivers/dri/radeon/radeon_screen.h
+++ b/src/mesa/drivers/dri/radeon/radeon_screen.h
@@ -40,8 +40,9 @@ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 
SOFTWARE.
  * IMPORTS: these headers contain all the DRI, X and kernel-related
  * definitions that we need.
  */
+#include xf86drm.h
+#include radeon_drm.h
 #include dri_util.h
-#include radeon_dri.h
 #include radeon_chipset.h
 #include radeon_reg.h
 #include drm_sarea.h
diff --git a/src/mesa/drivers/dri/radeon/server/radeon_dri.h 
b/src/mesa/drivers/dri/radeon/server/radeon_dri.h
deleted file mode 100644
index dc51372..0000000
--- a/src/mesa/drivers/dri/radeon/server/radeon_dri.h
+++ /dev/null
@@ -1,115 +0,0 @@
-/**
- * \file server/radeon_dri.h
- * \brief Radeon server-side structures.
- * 
- * \author Kevin E. Martin mar...@xfree86.org
- * \author Rickard E. Faith fa...@valinux.com
- */
-
-/*
- * Copyright 2000 ATI Technologies Inc., Markham, Ontario,
- *VA Linux Systems Inc., Fremont, California.
- *
- * All Rights Reserved.
- *
- * Permission is hereby granted, free of charge, to any person obtaining
- * a copy of this software and associated documentation files (the
- * "Software"), to deal in the Software without restriction, including
- * without limitation on the rights to use, copy, modify, merge,
- * publish, distribute, sublicense, and/or sell copies of the Software,
- * and to permit persons to whom the Software is furnished to do so,
- * subject to the following conditions:
- *
- * The above copyright notice and this permission notice (including the
- * next paragraph) shall be included in all copies or substantial
- * portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
- * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
- * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
- * NON-INFRINGEMENT.  IN NO EVENT SHALL ATI, VA LINUX SYSTEMS AND/OR
- * THEIR SUPPLIERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
- * WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
- * OUT OF OR IN CONNECTION WITH THE

[Mesa-dev] [PATCH 2/4] include: move sarea.h next to its only user

The header is used by DRI1 drivers, which we've removed a while
back. Now only the dri1 loader in libGL is using it, so let's
move it in src/glx, and prefix it accordingly.

Signed-off-by: Emil Velikov emil.l.veli...@gmail.com
---
 include/GL/internal/sarea.h | 92 -
 src/glx/dri_glx.c   |  2 +-
 src/glx/dri_sarea.h | 92 +
 3 files changed, 93 insertions(+), 93 deletions(-)
 delete mode 100644 include/GL/internal/sarea.h
 create mode 100644 src/glx/dri_sarea.h

diff --git a/include/GL/internal/sarea.h b/include/GL/internal/sarea.h
deleted file mode 100644
index c3b8bca..0000000
--- a/include/GL/internal/sarea.h
+++ /dev/null
@@ -1,92 +0,0 @@
-/**
- * \file sarea.h 
- * SAREA definitions.
- * 
- * \author Kevin E. Martin ke...@precisioninsight.com
- * \author Jens Owen jo...@vmware.com
- * \author Rickard E. (Rik) Faith fa...@valinux.com
- */
-
-/*
- * Copyright 1998-1999 Precision Insight, Inc., Cedar Park, Texas.
- * Copyright 2000 VA Linux Systems, Inc.
- * All Rights Reserved.
- * 
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the
- * "Software"), to deal in the Software without restriction, including
- * without limitation the rights to use, copy, modify, merge, publish,
- * distribute, sub license, and/or sell copies of the Software, and to
- * permit persons to whom the Software is furnished to do so, subject to
- * the following conditions:
- * 
- * The above copyright notice and this permission notice (including the
- * next paragraph) shall be included in all copies or substantial portions
- * of the Software.
- * 
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
- * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
- * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
- * IN NO EVENT SHALL PRECISION INSIGHT AND/OR ITS SUPPLIERS BE LIABLE FOR
- * ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
- * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
- * SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
- */
-
-
-#ifndef _SAREA_H_
-#define _SAREA_H_
-
-#include xf86drm.h
-
-/* SAREA area needs to be at least a page */
-#if defined(__alpha__)
-#define SAREA_MAX  0x2000
-#elif defined(__ia64__)
-#define SAREA_MAX  0x10000 /* 64kB */
-#else
-/* Intel 830M driver needs at least 8k SAREA */
-#define SAREA_MAX  0x2000
-#endif
-
-#define SAREA_MAX_DRAWABLES 256
-
-#define SAREA_DRAWABLE_CLAIMED_ENTRY   0x80000000
-
-/**
- * SAREA per drawable information.
- *
- * \sa _XF86DRISAREA.
- */
-typedef struct _XF86DRISAREADrawable {
-unsigned int   stamp;
-unsigned int   flags;
-} XF86DRISAREADrawableRec, *XF86DRISAREADrawablePtr;
-
-/**
- * SAREA frame information.
- *
- * \sa  _XF86DRISAREA.
- */
-typedef struct _XF86DRISAREAFrame {
-    unsigned int x;
-    unsigned int y;
-    unsigned int width;
-    unsigned int height;
-    unsigned int fullscreen;
-} XF86DRISAREAFrameRec, *XF86DRISAREAFramePtr;
-
-/**
- * SAREA definition.
- */
-typedef struct _XF86DRISAREA {
-    /** first thing is always the DRM locking structure */
-    drmLock lock;
-    /** \todo Use readers/writer lock for drawable_lock */
-    drmLock drawable_lock;
-    XF86DRISAREADrawableRec drawableTable[SAREA_MAX_DRAWABLES];
-    XF86DRISAREAFrameRec frame;
-    drm_context_t dummy_context;
-} XF86DRISAREARec, *XF86DRISAREAPtr;
-
-#endif
diff --git a/src/glx/dri_glx.c b/src/glx/dri_glx.c
index 5295331..d087751 100644
--- a/src/glx/dri_glx.c
+++ b/src/glx/dri_glx.c
@@ -40,7 +40,7 @@ SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
 #include "glxclient.h"
 #include "xf86dri.h"
 #include "dri2.h"
-#include "sarea.h"
+#include "dri_sarea.h"
 #include <dlfcn.h>
 #include <sys/types.h>
 #include <sys/mman.h>
diff --git a/src/glx/dri_sarea.h b/src/glx/dri_sarea.h
new file mode 100644
index 0000000..fe4529b
--- /dev/null
+++ b/src/glx/dri_sarea.h
@@ -0,0 +1,92 @@
+/**
+ * \file dri_sarea.h
+ * SAREA definitions.
+ * 
+ * \author Kevin E. Martin ke...@precisioninsight.com
+ * \author Jens Owen jo...@vmware.com
+ * \author Rickard E. (Rik) Faith fa...@valinux.com
+ */
+
+/*
+ * Copyright 1998-1999 Precision Insight, Inc., Cedar Park, Texas.
+ * Copyright 2000 VA Linux Systems, Inc.
+ * All Rights Reserved.
+ * 
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the
+ * "Software"), to deal in the Software without restriction, including
+ * without limitation the rights to use, copy, modify, merge, publish,
+ * distribute, sub license, and/or sell copies of the Software, and to
+ * permit persons

Re: [Mesa-dev] [RFC PATCH 00/16] A new IR for Mesa

Am 20.08.2014 20:45, schrieb Matt Turner:
 On Wed, Aug 20, 2014 at 11:28 AM, Roland Scheidegger srol...@vmware.com 
 wrote:
 Am 20.08.2014 20:13, schrieb Kenneth Graunke:
 For example, Debian was stuck on Mesa 9.2.2 for 4 months (2013-12-08
 to 2014-03-22), and I was told this was because of LLVM versioning
 changes in the other drivers (primarily radeon, I believe, but
 probably also llvmpipe).
 llvmpipe generally runs on pretty old llvm versions, though I didn't
 check the specifics here...
 
 There are also 49 instances of 'HAVE_LLVM [<>=]' to manage that :)
That is true, but note none of them really have anything to do with
building the IR or the like, it is all for jit, disassembler, etc.
because these things aren't doable with the stable c api.
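
As a rough illustration of what such a `HAVE_LLVM` guard looks like, here is a minimal sketch. The encoding (one byte per version component, so 3.6 becomes `0x0306`) is inferred from the `0x0306` constant in the patches earlier in this digest; `LLVM_VERSION_CODE` and `uses_unique_ptr_ctor` are hypothetical names, and the fallback definition of `HAVE_LLVM` exists only so the snippet builds on its own.

```c
/* Assumed encoding: one byte per component, so LLVM 3.6 becomes 0x0306. */
#define LLVM_VERSION_CODE(major, minor) (((major) << 8) | (minor))

/* Normally supplied by the build system; this placeholder is for illustration. */
#ifndef HAVE_LLVM
#define HAVE_LLVM LLVM_VERSION_CODE(3, 5)
#endif

/* Reports which code path a version-gated call site would take:
 * 1 for the 3.6+ path, 0 for the older one. */
static int uses_unique_ptr_ctor(void)
{
#if HAVE_LLVM >= 0x0306
    return 1;
#else
    return 0;
#endif
}
```

Each such guard marks a place where the C++ API changed between releases, which is why they cluster around the JIT and disassembler rather than around IR construction.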

Roland


 
 Couldn't build scripts download and use an appropriate llvm version
 automatically if the one installed isn't sufficient? Though maybe the
 idea is crazy; I usually try to avoid dealing with such problems ;-).
 
 I don't know the specifics of what you're suggesting, but I don't
 think I need to say that that's disgusting.
 


Re: [Mesa-dev] [PATCH 2/2] mesa/st: add ARB_texture_view support

Am 20.08.2014 18:48, schrieb Roland Scheidegger:
 Am 20.08.2014 18:33, schrieb Ilia Mirkin:
 On Wed, Aug 20, 2014 at 12:22 PM, Jose Fonseca jfons...@vmware.com wrote:
 On 20/08/14 17:14, Roland Scheidegger wrote:

 Am 20.08.2014 17:55, schrieb Ilia Mirkin:

 On Wed, Aug 20, 2014 at 11:47 AM, Jose Fonseca jfons...@vmware.com
 wrote:

 On 20/08/14 16:31, Ilia Mirkin wrote:


 Hm, it's not tested. And you're right, that would (most likely) mess
 up, since it would only have the pipe_resource's target. Any
 suggestions on how to fix it? Should the target be added to
 pipe_sampler_view?

 On Wed, Aug 20, 2014 at 11:25 AM, Roland Scheidegger
 srol...@vmware.com
 wrote:


 Didn't look at it that closely, but I'm pretty surprised this really
 works. One thing ARB_texture_view can do is cast cube maps (and cube
 map arrays) to 2d arrays and vice versa (also 1d/2d to the respective
 array type), and we cannot express that in sampler views (yet) (we
 can't
 express it in surfaces either, but there it should not matter). Which
 means the type used in the shader for sampling will not match the
 sampler view, which sounds quite broken to me.

 Roland


 Probably the only sane thing to do is eliminate the distinction between
 PIPE_TEXTURE_FOO and PIPE_TEXTURE_FOO_ARRAY, like in

 http://msdn.microsoft.com/en-us/library/windows/desktop/ff476202.aspx,
 e.g.:

 enum pipe_texture_target {
 PIPE_BUFFER   = 0,
 PIPE_TEXTURE_1D   = 1,
 PIPE_TEXTURE_2D   = 2,
 PIPE_TEXTURE_3D   = 3,
 PIPE_TEXTURE_CUBE = 4, // Must have same layout as
 PIPE_TEXTURE_2D
 PIPE_TEXTURE_RECT = 5,
 PIPE_TEXTURE_1D_ARRAY = PIPE_TEXTURE_1D,
 PIPE_TEXTURE_2D_ARRAY = PIPE_TEXTURE_2D,
 PIPE_TEXTURE_CUBE_ARRAY = PIPE_TEXTURE_CUBE,
 PIPE_MAX_TEXTURE_TYPES
 };


 We could also remove PIPE_TEXTURE_CUBE and have cube-maps be
 PIPE_TEXTURE_2D
 with a flag, but that's probably a lot of work. Instead, drivers that
 want
 to be able to support ARB_texture_view will need to ensure
 PIPE_TEXTURE_CUBE/PIPE_TEXTURE_2D layout match.


 Another quick + cheap alternative (at least looking at nv50/nvc0 code)
 would be to pass a separate target parameter to
 -create_sampler_view(). That would be enough for nouveau, but perhaps
 not more generally? Take a look at nv50_tex.c:nv50_create_texture_view
 -- it also needs to work out the depth of the texture (presumably to
 deal with out-of-bounds accesses) and that is written to the texture
 info structure.

 Well that should be enough, but I don't think it fits our design.



 We've

 encapsulated other override information like the format in the view
 already, and I see no reason why the target cast should be treated any
 different.


 In other words, you're arguing for:

 diff --git a/src/gallium/include/pipe/p_state.h
 b/src/gallium/include/pipe/p_state.h
 index a82686b..c87ac4e 100644
 --- a/src/gallium/include/pipe/p_state.h
 +++ b/src/gallium/include/pipe/p_state.h
 @@ -333,6 +333,7 @@ struct pipe_surface

 On struct pipe_sampler_view, I thought... unless I'm misunderstanding.
 This was also my first thought about fixing this after Roland pointed
 out the issue.
 Yes definitely for pipe_sampler_view - d3d10 also has it on the render
 target / depth stencil views, though so far I'm not convinced there's
 any value in that (the addressing of cube maps / arrays, 1d / 1d arrays
 is entirely the same in all cases, what matters is really the first and
 last layer only).
 

 struct pipe_reference reference;
 struct pipe_resource *texture; /** resource into which this is a view
 */
 struct pipe_context *context; /** context this surface belongs to */
 +   enum pipe_texture target;
 Make it pipe_texture_target target ;-)
 
 
 enum pipe_format format;

 /* XXX width/height should be removed */


 It's a fair point.  And I don't object to that solution.

 Of course, for this to work, drivers will need to treat the _ARRAY and non
 _ARRAY targets the same when determining the texture layout.
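
The view-level target override discussed above can be sketched roughly as follows. This is an illustration under assumptions, not the Gallium interface: `tex_target`, `resource`, `sampler_view`, `layout_class` and `view_target_compatible` are hypothetical names, and the check covers only the casts mentioned in this thread (cube and cube array to and from 2D arrays, 1D and 2D to their array types). It ignores layer-count rules such as a cube view needing six layers.

```c
#include <stdbool.h>

/* Hypothetical texture targets, loosely modelled on the enum quoted above. */
enum tex_target {
    TEX_1D, TEX_2D, TEX_3D, TEX_CUBE, TEX_RECT,
    TEX_1D_ARRAY, TEX_2D_ARRAY, TEX_CUBE_ARRAY
};

/* Targets in the same class are assumed to share one memory layout. */
static int layout_class(enum tex_target t)
{
    switch (t) {
    case TEX_1D: case TEX_1D_ARRAY:
        return 1;
    case TEX_2D: case TEX_2D_ARRAY: case TEX_CUBE: case TEX_CUBE_ARRAY:
        return 2;
    case TEX_3D:
        return 3;
    case TEX_RECT:
        return 4;
    }
    return 0;
}

struct resource { enum tex_target target; };

/* The view carries its own target, which may differ from the resource's. */
struct sampler_view {
    const struct resource *texture;
    enum tex_target target;
};

/* A cast is accepted only if both targets map to the same layout class. */
static bool view_target_compatible(const struct sampler_view *view)
{
    return layout_class(view->texture->target) == layout_class(view->target);
}
```

Keeping the target on the view mirrors how the format override is already stored there, which is the argument made earlier in the thread.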


 I just felt this would be a good opportunity to slim down pipe_texture_target
 too.  I'm not sure the _ARRAY distinction still matters at this level, but I
 suppose it doesn't hurt.

 Such a cleanup would probably have to be done by someone with a better
 understanding of gallium than me. OTOH if you guys feel like doing it
 the sampler_view way will accrue too much technical debt, that's fine
 too. Unless I hear otherwise, I'm going to try to do it the
 pipe_sampler_view way tonight.

 
 Yes I think it would be a nice cleanup to split it up into two enums. I
 was mostly proposing just reusing the same enum and keeping
 pipe_texture_target the same because it would require less code change.
 But

Re: [Mesa-dev] [PATCH 2/2] mesa/st: add ARB_texture_view support

On Wed, Aug 20, 2014 at 4:12 PM, Roland Scheidegger srol...@vmware.com wrote:
 Am 20.08.2014 18:48, schrieb Roland Scheidegger:
 Am 20.08.2014 18:33, schrieb Ilia Mirkin:
 On Wed, Aug 20, 2014 at 12:22 PM, Jose Fonseca jfons...@vmware.com wrote:
 On 20/08/14 17:14, Roland Scheidegger wrote:

 Am 20.08.2014 17:55, schrieb Ilia Mirkin:

 On Wed, Aug 20, 2014 at 11:47 AM, Jose Fonseca jfons...@vmware.com
 wrote:

 On 20/08/14 16:31, Ilia Mirkin wrote:


 Hm, it's not tested. And you're right, that would (most likely) mess
 up, since it would only have the pipe_resource's target. Any
 suggestions on how to fix it? Should the target be added to
 pipe_sampler_view?

 On Wed, Aug 20, 2014 at 11:25 AM, Roland Scheidegger
 srol...@vmware.com
 wrote:


 Didn't look at it that closely, but I'm pretty surprised this really
 works. One thing ARB_texture_view can do is cast cube maps (and cube
 map arrays) to 2d arrays and vice versa (also 1d/2d to the respective
 array type), and we cannot express that in sampler views (yet) (we
 can't
 express it in surfaces either, but there it should not matter). Which
 means the type used in the shader for sampling will not match the
 sampler view, which sounds quite broken to me.

 Roland


 Probably the only sane thing to do is eliminate the distinction between
 PIPE_TEXTURE_FOO and PIPE_TEXTURE_FOO_ARRAY, like in

 http://msdn.microsoft.com/en-us/library/windows/desktop/ff476202.aspx,
 e.g.:

 enum pipe_texture_target {
 PIPE_BUFFER   = 0,
 PIPE_TEXTURE_1D   = 1,
 PIPE_TEXTURE_2D   = 2,
 PIPE_TEXTURE_3D   = 3,
 PIPE_TEXTURE_CUBE = 4, // Must have same layout as
 PIPE_TEXTURE_2D
 PIPE_TEXTURE_RECT = 5,
 PIPE_TEXTURE_1D_ARRAY = PIPE_TEXTURE_1D,
 PIPE_TEXTURE_2D_ARRAY = PIPE_TEXTURE_2D,
 PIPE_TEXTURE_CUBE_ARRAY = PIPE_TEXTURE_CUBE,
 PIPE_MAX_TEXTURE_TYPES
 };


 We could also remove PIPE_TEXTURE_CUBE and have cube-maps be
 PIPE_TEXTURE_2D
 with a flag, but that's probably a lot of work. Instead, drivers that
 want
 to be able to support ARB_texture_view will need to ensure
 PIPE_TEXTURE_CUBE/PIPE_TEXTURE_2D layout match.


 Another quick + cheap alternative (at least looking at nv50/nvc0 code)
 would be to pass a separate target parameter to
 -create_sampler_view(). That would be enough for nouveau, but perhaps
 not more generally? Take a look at nv50_tex.c:nv50_create_texture_view
 -- it also needs to work out the depth of the texture (presumably to
 deal with out-of-bounds accesses) and that is written to the texture
 info structure.

 Well that should be enough, but I don't think it fits our design.



 We've

 encapsulated other override information like the format in the view
 already, and I see no reason why the target cast should be treated any
 different.


 In other words, you're arguing for:

 diff --git a/src/gallium/include/pipe/p_state.h
 b/src/gallium/include/pipe/p_state.h
 index a82686b..c87ac4e 100644
 --- a/src/gallium/include/pipe/p_state.h
 +++ b/src/gallium/include/pipe/p_state.h
 @@ -333,6 +333,7 @@ struct pipe_surface

 On struct pipe_sampler_view, I thought... unless I'm misunderstanding.
 This was also my first thought about fixing this after Roland pointed
 out the issue.
 Yes definitely for pipe_sampler_view - d3d10 also has it on the render
 target / depth stencil views, though so far I'm not convinced there's
 any value in that (the addressing of cube maps / arrays, 1d / 1d arrays
 is entirely the same in all cases, what matters is really the first and
 last layer only).


 struct pipe_reference reference;
 struct pipe_resource *texture; /** resource into which this is a view
 */
 struct pipe_context *context; /** context this surface belongs to */
 +   enum pipe_texture target;
 Make it pipe_texture_target target ;-)


 enum pipe_format format;

 /* XXX width/height should be removed */


 It's a fair point.  And I don't object to that solution.

 Of course, for this to work, drivers will need to treat the _ARRAY and
 non-_ARRAY targets the same when determining the texture layout.


 I just felt this would be a good opportunity to slim down
 pipe_texture_target too.  I'm not sure the _ARRAY distinction still
 matters at this level, but I suppose it doesn't hurt.

 Such a cleanup would probably have to be done by someone with a better
 understanding of gallium than me. OTOH if you guys feel like doing it
 the sampler_view way will accrue too much technical debt, that's fine
 too. Unless I hear otherwise, I'm going to try to do it the
 pipe_sampler_view way tonight.


 Yes I think it would be a nice cleanup to split it up into two enums. I
 was mostly proposing just reusing the same enum and keeping

[Mesa-dev] [Bug 79629] [Dri3 bisected] piglit glx_GLX_ARB_create_context_current_with_no_framebuffer fail

https://bugs.freedesktop.org/show_bug.cgi?id=79629

--- Comment #7 from dog paul.a.parent...@intel.com ---
DRI3 is still being developed/stabilized and its interaction with fences is
poorly specified.  Mesa does not yet utilize the explicit fences implied in the
spec.

Chris will disable DRI3 by default in the ddx because there are a number of
trivial-to-hit bugs that cause X or the compositor to stop updating.

You should cross-reference
https://bugs.freedesktop.org/show_bug.cgi?id=81551 for the patch that will
identify the issue as being the explicit-vs-implicit fencing issue or something
new.

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Mesa-dev] [Bug 79629] [Dri3 bisected] piglit glx_GLX_ARB_create_context_current_with_no_framebuffer fail

https://bugs.freedesktop.org/show_bug.cgi?id=79629

dog paul.a.parent...@intel.com changed:

   What|Removed |Added

 Depends on||81551


Re: [Mesa-dev] [PATCH 17/20] i965: Preserve CFG when deleting dead control flow.

On Tue, Aug 19, 2014 at 12:36 PM, Pohjolainen, Topi
topi.pohjolai...@intel.com wrote:
 On Tue, Aug 19, 2014 at 12:03:01PM -0700, Matt Turner wrote:
 By the way, I committed the first 6 patches of the series (the one
 touching the generators had started to rot). I think other than 16 and
 17, the only ones missing review are the patches that add the
 insertion and removal methods. I sent new versions of them based on
 your feedback a few days ago.

 Oh, so sorry Matt, I somehow forgot to send my r-b, they are just fine.

To make sure I didn't misunderstand: patches 10 and 11 are R-b, or 10,
11, 16, and 17? I didn't want to slap your R-b on something that
wasn't reviewed yet. :)

Re: [Mesa-dev] [PATCH 2/2] mesa/st: add ARB_texture_view support

Am 20.08.2014 22:27, schrieb Ilia Mirkin:
 On Wed, Aug 20, 2014 at 4:12 PM, Roland Scheidegger srol...@vmware.com 
 wrote:
 Am 20.08.2014 18:48, schrieb Roland Scheidegger:
 Am 20.08.2014 18:33, schrieb Ilia Mirkin:
 On Wed, Aug 20, 2014 at 12:22 PM, Jose Fonseca jfons...@vmware.com wrote:
 On 20/08/14 17:14, Roland Scheidegger wrote:

 Am 20.08.2014 17:55, schrieb Ilia Mirkin:

 On Wed, Aug 20, 2014 at 11:47 AM, Jose Fonseca jfons...@vmware.com
 wrote:

 On 20/08/14 16:31, Ilia Mirkin wrote:


 Hm, it's not tested. And you're right, that would (most likely) mess
 up, since it would only have the pipe_resource's target. Any
 suggestions on how to fix it? Should the target be added to
 pipe_sampler_view?

 On Wed, Aug 20, 2014 at 11:25 AM, Roland Scheidegger
 srol...@vmware.com
 wrote:


 Didn't look at it that closely, but I'm pretty surprised this really
 works. One thing ARB_texture_view can do is cast cube maps (and cube
 map arrays) to 2d arrays and vice versa (also 1d/2d to the respective
 array type), and we cannot express that in sampler views (yet) (we
 can't express it in surfaces either, but there it should not matter).
 Which means the type used in the shader for sampling will not match
 the sampler view, which sounds quite broken to me.

 Roland
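
The casting rule described above can be sketched in C. This is only an illustration under assumptions: the sk_ names are hypothetical and not Gallium identifiers, and the grouping reflects one reading of the ARB_texture_view compatibility rules (1D with 1D array; 2D, 2D array, cube and cube array together; 3D and rect only with themselves). It shows why a view needs its own target: the view's target may differ from the resource's while still being a legal cast.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical target enum for illustration; not the Gallium definition. */
enum sk_target {
   SK_1D,
   SK_2D,
   SK_3D,
   SK_CUBE,
   SK_RECT,
   SK_1D_ARRAY,
   SK_2D_ARRAY,
   SK_CUBE_ARRAY
};

/* Map each target to a cast class; targets in one class may view each other. */
static int sk_cast_class(enum sk_target t)
{
   switch (t) {
   case SK_1D:
   case SK_1D_ARRAY:
      return 0;
   case SK_2D:
   case SK_2D_ARRAY:
   case SK_CUBE:
   case SK_CUBE_ARRAY:
      return 1;
   case SK_3D:
      return 2;
   case SK_RECT:
      return 3;
   }
   return -1;
}

/* True if a view with target view_target may be created on a resource
 * whose own target is resource_target (assumed rule, see lead-in). */
bool sk_view_target_compatible(enum sk_target resource_target,
                               enum sk_target view_target)
{
   return sk_cast_class(resource_target) == sk_cast_class(view_target);
}
```

With such a check, a state tracker could reject an illegal cast before storing the view's target, and a driver would read the target from the view rather than from the underlying resource.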


 Probably the only sane thing to do is eliminate the distinction between
 PIPE_TEXTURE_FOO and PIPE_TEXTURE_FOO_ARRAY, like in

 http://msdn.microsoft.com/en-us/library/windows/desktop/ff476202.aspx ,
 e.g.:

 enum pipe_texture_target {
 PIPE_BUFFER   = 0,
 PIPE_TEXTURE_1D   = 1,
 PIPE_TEXTURE_2D   = 2,
 PIPE_TEXTURE_3D   = 3,
 PIPE_TEXTURE_CUBE = 4, // Must have same layout as PIPE_TEXTURE_2D
 PIPE_TEXTURE_RECT = 5,
 PIPE_TEXTURE_1D_ARRAY = PIPE_TEXTURE_1D,
 PIPE_TEXTURE_2D_ARRAY = PIPE_TEXTURE_2D,
 PIPE_TEXTURE_CUBE_ARRAY = PIPE_TEXTURE_CUBE,
 PIPE_MAX_TEXTURE_TYPES
 };
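
A small C sketch of the aliased enum proposed above, using hypothetical ALIAS_/FIX_ names so that it is not mistaken for the real Gallium header. One detail it makes visible: an enumerator without an initializer takes the previous value plus one, so a trailing count placed after the aliases evaluates to 5 (the same value as the rect target) rather than 6. Placing the count before the aliases, or giving it an explicit value, avoids that.

```c
#include <assert.h>

/* Same ordering as the proposal: aliases first, count last. */
enum alias_target_as_written {
   ALIAS_BUFFER = 0,
   ALIAS_1D = 1,
   ALIAS_2D = 2,
   ALIAS_3D = 3,
   ALIAS_CUBE = 4,
   ALIAS_RECT = 5,
   ALIAS_1D_ARRAY = ALIAS_1D,
   ALIAS_2D_ARRAY = ALIAS_2D,
   ALIAS_CUBE_ARRAY = ALIAS_CUBE,
   ALIAS_MAX_AS_WRITTEN /* implicit: ALIAS_CUBE_ARRAY + 1 */
};

/* Variant with the count placed before the aliases. */
enum alias_target_reordered {
   FIX_BUFFER = 0,
   FIX_1D = 1,
   FIX_2D = 2,
   FIX_3D = 3,
   FIX_CUBE = 4,
   FIX_RECT = 5,
   FIX_MAX,             /* implicit: FIX_RECT + 1 */
   FIX_1D_ARRAY = FIX_1D,
   FIX_2D_ARRAY = FIX_2D,
   FIX_CUBE_ARRAY = FIX_CUBE
};

/* Helpers so the values can be inspected from calling code. */
int alias_max_as_written(void)
{
   return ALIAS_MAX_AS_WRITTEN;
}

int alias_max_reordered(void)
{
   return FIX_MAX;
}
```

Any table sized by the count (for example a per-target array) would be one entry short with the first ordering, which is why the position of the count matters if the aliasing approach is pursued.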


 We could also remove PIPE_TEXTURE_CUBE and have cube-maps be
 PIPE_TEXTURE_2D
 with a flag, but that's probably a lot of work. Instead, drivers that
 want
 to be able to support ARB_texture_view will need to ensure
 PIPE_TEXTURE_CUBE/PIPE_TEXTURE_2D layout match.


 Another quick + cheap alternative (at least looking at nv50/nvc0 code)
 would be to pass a separate target parameter to
 -create_sampler_view(). That would be enough for nouveau, but perhaps
 not more generally? Take a look at nv50_tex.c:nv50_create_texture_view
 -- it also needs to work out the depth of the texture (presumably to
 deal with out-of-bounds accesses) and that is written to the texture
 info structure.

 Well that should be enough, but I don't think it fits our design.



 We've

 encapsulated other override information like the format in the view
 already, and I see no reason why the target cast should be treated any
 different.


 In other words, you're arguing for:

 diff --git a/src/gallium/include/pipe/p_state.h
 b/src/gallium/include/pipe/p_state.h
 index a82686b..c87ac4e 100644
 --- a/src/gallium/include/pipe/p_state.h
 +++ b/src/gallium/include/pipe/p_state.h
 @@ -333,6 +333,7 @@ struct pipe_surface

 On struct pipe_sampler_view, I thought... unless I'm misunderstanding.
 This was also my first thought about fixing this after Roland pointed
 out the issue.
 Yes definitely for pipe_sampler_view - d3d10 also has it on the render
 target / depth stencil views, though so far I'm not convinced there's
 any value in that (the addressing of cube maps / arrays, 1d / 1d arrays
 is entirely the same in all cases, what matters is really the first and
 last layer only).


 struct pipe_reference reference;
 struct pipe_resource *texture; /** resource into which this is a view
 */
 struct pipe_context *context; /** context this surface belongs to */
 +   enum pipe_texture target;
 Make it pipe_texture_target target ;-)


 enum pipe_format format;

 /* XXX width/height should be removed */


 It's a fair point.  And I don't object to that solution.

 Of course, for this to work, drivers will need to treat the _ARRAY and
 non-_ARRAY targets the same when determining the texture layout.


 I just felt this would be a good opportunity to slim down
 pipe_texture_target too.  I'm not sure the _ARRAY distinction still
 matters at this level, but I suppose it doesn't hurt.

 Such a cleanup would probably have to be done by someone with a better
 understanding of gallium than me. OTOH if you guys feel like doing it
 the sampler_view way will accrue too much technical debt, that's fine
 too. Unless I hear otherwise, I'm going to try to do it the
 pipe_sampler_view way tonight.


 Yes I think it would be a nice cleanup to split it up into two enums. I
 was mostly

[Mesa-dev] [Bug 79629] [Dri3 bisected] piglit glx_GLX_ARB_create_context_current_with_no_framebuffer fail

https://bugs.freedesktop.org/show_bug.cgi?id=79629

Bastien Nocera bugzi...@hadess.net changed:

   What|Removed |Added

 CC||bugzi...@hadess.net


[Mesa-dev] [Bug 76188] EGL_EXT_image_dma_buf_import fd ownership is incorrect

https://bugs.freedesktop.org/show_bug.cgi?id=76188

--- Comment #8 from Matt Turner matts...@gmail.com ---
(In reply to comment #7)
 I see little risk in cherry-picking the fix to stable branches. The fix is
 isolated and only *removes* code.
 
 I do see risk in not cherry-picking the fix. If an app uses this extension
 with unfixed Mesa 10.2, then that app will leak file descriptors.

Sounds good to me. I just wanted you or Ian to take a look at the patch before
we shipped it in a release. :)


Re: [Mesa-dev] [PATCH 1/4] dri/radeon: drop obsolete radeon_{dri, macros}.h headers

2014-08-20 Thread Marek Olšák

Sorry, I don't know much about these drivers to be able to review this.

Marek

On Wed, Aug 20, 2014 at 9:54 PM, Emil Velikov emil.l.veli...@gmail.com wrote:
 Both have been unused for at least a couple of years.
 For example the last user of radeon_macros.h was removed with

 commit 8c11f0a88300f7bc3f05a12789c781ba0f4b3cc6
 Author: Eric Anholt e...@anholt.net
 Date:   Fri Oct 14 13:27:02 2011 -0700

 radeon: Drop the legacy BO manager code.

 Cc: Marek Olšák marek.ol...@amd.com
 Cc: Michel Dänzer michel.daen...@amd.com
 Signed-off-by: Emil Velikov emil.l.veli...@gmail.com
 ---
  src/mesa/drivers/dri/r200/r200_ioctl.h |   1 -
  src/mesa/drivers/dri/r200/server/radeon_dri.h  |   1 -
  src/mesa/drivers/dri/r200/server/radeon_macros.h   |   1 -
  src/mesa/drivers/dri/radeon/radeon_screen.c|   1 -
  src/mesa/drivers/dri/radeon/radeon_screen.h|   3 +-
  src/mesa/drivers/dri/radeon/server/radeon_dri.h| 115 --
  src/mesa/drivers/dri/radeon/server/radeon_macros.h | 128 
 -
  7 files changed, 2 insertions(+), 248 deletions(-)
  delete mode 120000 src/mesa/drivers/dri/r200/server/radeon_dri.h
  delete mode 120000 src/mesa/drivers/dri/r200/server/radeon_macros.h
  delete mode 100644 src/mesa/drivers/dri/radeon/server/radeon_dri.h
  delete mode 100644 src/mesa/drivers/dri/radeon/server/radeon_macros.h

 diff --git a/src/mesa/drivers/dri/r200/r200_ioctl.h 
 b/src/mesa/drivers/dri/r200/r200_ioctl.h
 index ab5f822..9133a22 100644
 --- a/src/mesa/drivers/dri/r200/r200_ioctl.h
 +++ b/src/mesa/drivers/dri/r200/r200_ioctl.h
 @@ -36,7 +36,6 @@ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 
 SOFTWARE.
  #define __R200_IOCTL_H__

  #include main/simple_list.h
 -#include radeon_dri.h

  #include radeon_bo_gem.h
  #include radeon_cs_gem.h
 diff --git a/src/mesa/drivers/dri/r200/server/radeon_dri.h 
 b/src/mesa/drivers/dri/r200/server/radeon_dri.h
 deleted file mode 120000
 index 27c591d..0000000
 --- a/src/mesa/drivers/dri/r200/server/radeon_dri.h
 +++ /dev/null
 @@ -1 +0,0 @@
 -../../radeon/server/radeon_dri.h
 \ No newline at end of file
 diff --git a/src/mesa/drivers/dri/r200/server/radeon_macros.h 
 b/src/mesa/drivers/dri/r200/server/radeon_macros.h
 deleted file mode 120000
 index c56cd73..0000000
 --- a/src/mesa/drivers/dri/r200/server/radeon_macros.h
 +++ /dev/null
 @@ -1 +0,0 @@
 -../../radeon/server/radeon_macros.h
 \ No newline at end of file
 diff --git a/src/mesa/drivers/dri/radeon/radeon_screen.c 
 b/src/mesa/drivers/dri/radeon/radeon_screen.c
 index 9a6fbbd..044e212 100644
 --- a/src/mesa/drivers/dri/radeon/radeon_screen.c
 +++ b/src/mesa/drivers/dri/radeon/radeon_screen.c
 @@ -45,7 +45,6 @@ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 
 SOFTWARE.
  #include swrast/s_renderbuffer.h

  #include radeon_chipset.h
 -#include radeon_macros.h
  #include radeon_screen.h
  #include radeon_common.h
  #include radeon_common_context.h
 diff --git a/src/mesa/drivers/dri/radeon/radeon_screen.h 
 b/src/mesa/drivers/dri/radeon/radeon_screen.h
 index 9b77627..b5cc075 100644
 --- a/src/mesa/drivers/dri/radeon/radeon_screen.h
 +++ b/src/mesa/drivers/dri/radeon/radeon_screen.h
 @@ -40,8 +40,9 @@ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 
 SOFTWARE.
   * IMPORTS: these headers contain all the DRI, X and kernel-related
   * definitions that we need.
   */
 +#include xf86drm.h
 +#include radeon_drm.h
  #include dri_util.h
 -#include radeon_dri.h
  #include radeon_chipset.h
  #include radeon_reg.h
  #include drm_sarea.h
 diff --git a/src/mesa/drivers/dri/radeon/server/radeon_dri.h 
 b/src/mesa/drivers/dri/radeon/server/radeon_dri.h
 deleted file mode 100644
 index dc51372..0000000
 --- a/src/mesa/drivers/dri/radeon/server/radeon_dri.h
 +++ /dev/null
 @@ -1,115 +0,0 @@
 -/**
 - * \file server/radeon_dri.h
 - * \brief Radeon server-side structures.
 - *
 - * \author Kevin E. Martin mar...@xfree86.org
 - * \author Rickard E. Faith fa...@valinux.com
 - */
 -
 -/*
 - * Copyright 2000 ATI Technologies Inc., Markham, Ontario,
 - *VA Linux Systems Inc., Fremont, California.
 - *
 - * All Rights Reserved.
 - *
 - * Permission is hereby granted, free of charge, to any person obtaining
 - * a copy of this software and associated documentation files (the
 - * Software), to deal in the Software without restriction, including
 - * without limitation on the rights to use, copy, modify, merge,
 - * publish, distribute, sublicense, and/or sell copies of the Software,
 - * and to permit persons to whom the Software is furnished to do so,
 - * subject to the following conditions:
 - *
 - * The above copyright notice and this permission notice (including the
 - * next paragraph) shall be included in all copies or substantial
 - * portions of the Software.
 - *
 - * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND,
 - * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 - * MERCHANTABILITY, FITNESS FOR A

Re: [Mesa-dev] [PATCH 1/4] dri/radeon: drop obsolete radeon_{dri, macros}.h headers

No problems Marek. Your name popped up at the top of the list based on your
recent bugfixes in the area.

I believe that Michel and/or Alex will have some (unfortunate) recollection
about these drivers :)

-Emil

On 20/08/14 23:05, Marek Olšák wrote:
 Sorry, I don't know much about these drivers to be able to review this.
 
 Marek
 
 On Wed, Aug 20, 2014 at 9:54 PM, Emil Velikov emil.l.veli...@gmail.com 
 wrote:
 Both have been unused for at least a couple of years.
 For example the last user of radeon_macros.h was removed with

 commit 8c11f0a88300f7bc3f05a12789c781ba0f4b3cc6
 Author: Eric Anholt e...@anholt.net
 Date:   Fri Oct 14 13:27:02 2011 -0700

 radeon: Drop the legacy BO manager code.

 Cc: Marek Olšák marek.ol...@amd.com
 Cc: Michel Dänzer michel.daen...@amd.com
 Signed-off-by: Emil Velikov emil.l.veli...@gmail.com
 ---
  src/mesa/drivers/dri/r200/r200_ioctl.h |   1 -
  src/mesa/drivers/dri/r200/server/radeon_dri.h  |   1 -
  src/mesa/drivers/dri/r200/server/radeon_macros.h   |   1 -
  src/mesa/drivers/dri/radeon/radeon_screen.c|   1 -
  src/mesa/drivers/dri/radeon/radeon_screen.h|   3 +-
  src/mesa/drivers/dri/radeon/server/radeon_dri.h| 115 --
  src/mesa/drivers/dri/radeon/server/radeon_macros.h | 128 
 -
  7 files changed, 2 insertions(+), 248 deletions(-)
  delete mode 120000 src/mesa/drivers/dri/r200/server/radeon_dri.h
  delete mode 120000 src/mesa/drivers/dri/r200/server/radeon_macros.h
  delete mode 100644 src/mesa/drivers/dri/radeon/server/radeon_dri.h
  delete mode 100644 src/mesa/drivers/dri/radeon/server/radeon_macros.h

 diff --git a/src/mesa/drivers/dri/r200/r200_ioctl.h 
 b/src/mesa/drivers/dri/r200/r200_ioctl.h
 index ab5f822..9133a22 100644
 --- a/src/mesa/drivers/dri/r200/r200_ioctl.h
 +++ b/src/mesa/drivers/dri/r200/r200_ioctl.h
 @@ -36,7 +36,6 @@ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 
 SOFTWARE.
  #define __R200_IOCTL_H__

  #include main/simple_list.h
 -#include radeon_dri.h

  #include radeon_bo_gem.h
  #include radeon_cs_gem.h
 diff --git a/src/mesa/drivers/dri/r200/server/radeon_dri.h 
 b/src/mesa/drivers/dri/r200/server/radeon_dri.h
 deleted file mode 120000
 index 27c591d..0000000
 --- a/src/mesa/drivers/dri/r200/server/radeon_dri.h
 +++ /dev/null
 @@ -1 +0,0 @@
 -../../radeon/server/radeon_dri.h
 \ No newline at end of file
 diff --git a/src/mesa/drivers/dri/r200/server/radeon_macros.h 
 b/src/mesa/drivers/dri/r200/server/radeon_macros.h
 deleted file mode 120000
 index c56cd73..0000000
 --- a/src/mesa/drivers/dri/r200/server/radeon_macros.h
 +++ /dev/null
 @@ -1 +0,0 @@
 -../../radeon/server/radeon_macros.h
 \ No newline at end of file
 diff --git a/src/mesa/drivers/dri/radeon/radeon_screen.c 
 b/src/mesa/drivers/dri/radeon/radeon_screen.c
 index 9a6fbbd..044e212 100644
 --- a/src/mesa/drivers/dri/radeon/radeon_screen.c
 +++ b/src/mesa/drivers/dri/radeon/radeon_screen.c
 @@ -45,7 +45,6 @@ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 
 SOFTWARE.
  #include swrast/s_renderbuffer.h

  #include radeon_chipset.h
 -#include radeon_macros.h
  #include radeon_screen.h
  #include radeon_common.h
  #include radeon_common_context.h
 diff --git a/src/mesa/drivers/dri/radeon/radeon_screen.h 
 b/src/mesa/drivers/dri/radeon/radeon_screen.h
 index 9b77627..b5cc075 100644
 --- a/src/mesa/drivers/dri/radeon/radeon_screen.h
 +++ b/src/mesa/drivers/dri/radeon/radeon_screen.h
 @@ -40,8 +40,9 @@ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 
 SOFTWARE.
   * IMPORTS: these headers contain all the DRI, X and kernel-related
   * definitions that we need.
   */
 +#include xf86drm.h
 +#include radeon_drm.h
  #include dri_util.h
 -#include radeon_dri.h
  #include radeon_chipset.h
  #include radeon_reg.h
  #include drm_sarea.h
 diff --git a/src/mesa/drivers/dri/radeon/server/radeon_dri.h 
 b/src/mesa/drivers/dri/radeon/server/radeon_dri.h
 deleted file mode 100644
 index dc51372..0000000
 --- a/src/mesa/drivers/dri/radeon/server/radeon_dri.h
 +++ /dev/null
 @@ -1,115 +0,0 @@
 -/**
 - * \file server/radeon_dri.h
 - * \brief Radeon server-side structures.
 - *
 - * \author Kevin E. Martin mar...@xfree86.org
 - * \author Rickard E. Faith fa...@valinux.com
 - */
 -
 -/*
 - * Copyright 2000 ATI Technologies Inc., Markham, Ontario,
 - *VA Linux Systems Inc., Fremont, California.
 - *
 - * All Rights Reserved.
 - *
 - * Permission is hereby granted, free of charge, to any person obtaining
 - * a copy of this software and associated documentation files (the
 - * Software), to deal in the Software without restriction, including
 - * without limitation on the rights to use, copy, modify, merge,
 - * publish, distribute, sublicense, and/or sell copies of the Software,
 - * and to permit persons to whom the Software is furnished to do so,
 - * subject to the following conditions:
 - *
 - * The above copyright notice and this permission notice (including the
 - * next paragraph) shall

[Mesa-dev] [Bug 82881] New: test_vec4_register_coalesce regression