date:20140818

[Mesa-dev] [Bug 82538] Super Maryo Chronicles fails with st/mesa assertion failure

2014-08-18 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=82538

--- Comment #2 from Michel Dänzer mic...@daenzer.net ---
(In reply to comment #1)
  It works fine for me on Kabini :). Mesa git
 d7d8260f70326cd294715203dae8a8f0150680c1, llvm 3.5-rc2,

I can still reproduce it with current Mesa Git. Does your Mesa build have
assertions enabled?


 smc as Debian package in Sid.

Same here, currently version 1.9+git20121121-1.1.

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/3] clover: fix _logs string creation

2014-08-18 Thread Francisco Jerez

EdB edb+m...@sigluy.net writes:

 compat::string is not \0 terminated.
 size() needs to be used for std::string creation
 ---
  src/gallium/state_trackers/clover/core/program.cpp | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

 diff --git a/src/gallium/state_trackers/clover/core/program.cpp 
 b/src/gallium/state_trackers/clover/core/program.cpp
 index e09c3aa..3f504d5 100644
 --- a/src/gallium/state_trackers/clover/core/program.cpp
 +++ b/src/gallium/state_trackers/clover/core/program.cpp
 @@ -61,9 +61,9 @@ program::build(const ref_vector<device> &devs, const char *opts) {
  dev.ir_target(), build_opts(dev),
  log));
  _binaries.insert({ &dev, module });
 -_logs.insert({ &dev, std::string(log.c_str()) });
 +_logs.insert({ &dev, std::string(log.c_str(), log.size()) });
   } catch (const build_error &) {
 -_logs.insert({ &dev, std::string(log.c_str()) });
 +_logs.insert({ &dev, std::string(log.c_str(), log.size()) });

Both of these should just be using the conversion operator.  See
attachment.

  throw;
   }
}
 -- 
 2.0.4


From 3c2bec6d790e6aa38fb6d71cd495f281205ddf6c Mon Sep 17 00:00:00 2001
From: Francisco Jerez curroje...@riseup.net
Date: Mon, 18 Aug 2014 09:05:25 +0300
Subject: [PATCH] clover: Use conversion operator to initialize build log from
 compat::string.

Fixes binary garbage in the compilation logs caused by
compat::string::c_str() not being null-terminated (which is a bug on
its own that will be fixed in another commit).

Reported-by: EdB edb+m...@sigluy.net
---
 src/gallium/state_trackers/clover/core/program.cpp | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/state_trackers/clover/core/program.cpp b/src/gallium/state_trackers/clover/core/program.cpp
index 30a1f0e..6c224db 100644
--- a/src/gallium/state_trackers/clover/core/program.cpp
+++ b/src/gallium/state_trackers/clover/core/program.cpp
@@ -61,9 +61,9 @@ program::build(const ref_vector<device> &devs, const char *opts) {
 dev.ir_target(), build_opts(dev),
 log));
 _binaries.insert({ &dev, module });
-_logs.insert({ &dev, std::string(log.c_str()) });
+_logs.insert({ &dev, log });
  } catch (const build_error &) {
-_logs.insert({ &dev, std::string(log.c_str()) });
+_logs.insert({ &dev, log });
 throw;
  }
   }
-- 
2.0.4
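
For readers following the thread, the two approaches discussed above can be sketched in isolation. This is only an illustration under stated assumptions: `fixed_string` is a hypothetical stand-in for a string class whose buffer is not null-terminated, not clover's compat::string, and the helper names are invented for the example. It shows that copying with an explicit size, or going through a conversion operator that uses the stored size, never relies on a terminating \0.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical string type whose buffer is NOT null-terminated.
// It only illustrates the problem; it is not clover's compat::string.
class fixed_string {
public:
   fixed_string(const char *s, std::size_t n) : n_(n) {
      assert(n <= sizeof(buf_));
      std::memcpy(buf_, s, n);   // copies exactly n bytes, no terminator
   }

   const char *data() const { return buf_; }
   std::size_t size() const { return n_; }

   // Conversion operator: always copies exactly size() bytes.
   operator std::string() const { return std::string(buf_, n_); }

private:
   char buf_[16];
   std::size_t n_;
};

// First approach from the thread: pass the size explicitly.
std::string copy_with_size(const fixed_string &s) {
   return std::string(s.data(), s.size());
}

// Second approach: let the conversion operator do the copy.
std::string copy_with_conversion(const fixed_string &s) {
   std::string out = s;
   return out;
}
```

Both helpers return the same result; the conversion-operator form keeps the size handling in one place, so call sites cannot forget it.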




Re: [Mesa-dev] [PATCH 2/3] clover: stdify compat::vector a little more

2014-08-18 Thread Francisco Jerez

EdB edb+m...@sigluy.net writes:

 make resize work like std::vector
 reserve take advantage of capacity
 rename members to be uniform with other class
 ---
  src/gallium/state_trackers/clover/core/module.cpp |   2 +-
  src/gallium/state_trackers/clover/util/compat.hpp | 113 
 +++---
  2 files changed, 78 insertions(+), 37 deletions(-)


This could be a *lot* simpler, see attachment.

From abd573bffb674a0a7565b18b38be116472fa5f24 Mon Sep 17 00:00:00 2001
From: Francisco Jerez curroje...@riseup.net
Date: Mon, 18 Aug 2014 08:30:46 +0300
Subject: [PATCH] clover/util: Have compat::vector track separate size and
 capacity.

In order to make the behaviour of resize() and reserve() closer to the
standard.
---
 src/gallium/state_trackers/clover/core/module.cpp |  4 +-
 src/gallium/state_trackers/clover/util/compat.hpp | 67 ++-
 2 files changed, 44 insertions(+), 27 deletions(-)

diff --git a/src/gallium/state_trackers/clover/core/module.cpp b/src/gallium/state_trackers/clover/core/module.cpp
index 55ed91a..9ef584b 100644
--- a/src/gallium/state_trackers/clover/core/module.cpp
+++ b/src/gallium/state_trackers/clover/core/module.cpp
@@ -94,7 +94,7 @@ namespace {
 
   static void
   proc(compat::istream &is, compat::vector<T> &v) {
- v.reserve(_proc<uint32_t>(is));
+ v.resize(_proc<uint32_t>(is));
 
  for (size_t i = 0; i < v.size(); i++)
 new(&v[i]) T(_proc<T>(is));
@@ -122,7 +122,7 @@ namespace {
 
   static void
   proc(compat::istream &is, compat::vector<T> &v) {
- v.reserve(_proc<uint32_t>(is));
+ v.resize(_proc<uint32_t>(is));
  is.read(reinterpret_cast<char *>(v.begin()),
  v.size() * sizeof(T));
   }
diff --git a/src/gallium/state_trackers/clover/util/compat.hpp b/src/gallium/state_trackers/clover/util/compat.hpp
index 50e1c7d..a4e3938 100644
--- a/src/gallium/state_trackers/clover/util/compat.hpp
+++ b/src/gallium/state_trackers/clover/util/compat.hpp
@@ -66,65 +66,81 @@ namespace clover {
  typedef std::ptrdiff_t difference_type;
  typedef std::size_t size_type;
 
- vector() : p(NULL), n(0) {
+ vector() : p(NULL), _size(0), _capacity(0) {
  }
 
- vector(const vector &v) : p(alloc(v.n, v.p, v.n)), n(v.n) {
+ vector(const vector &v) :
+p(alloc(v._size, v.p, v._size)),
+_size(v._size), _capacity(v._size) {
  }
 
- vector(const_iterator p, size_type n) : p(alloc(n, p, n)), n(n) {
+ vector(const_iterator p, size_type n) :
+p(alloc(n, p, n)), _size(n), _capacity(n) {
  }
 
  template<typename C>
  vector(const C &v) :
-p(alloc(v.size(), &*v.begin(), v.size())), n(v.size()) {
+p(alloc(v.size(), &*v.begin(), v.size())),
+_size(v.size()) , _capacity(v.size()) {
  }
 
  ~vector() {
-free(n, p);
+free(_size, p);
  }
 
  vector &
  operator=(const vector &v) {
-free(n, p);
+free(_size, p);
 
-p = alloc(v.n, v.p, v.n);
-n = v.n;
+p = alloc(v._size, v.p, v._size);
+_size = v._size;
+_capacity = v._size;
 
 return *this;
  }
 
  void
- reserve(size_type m) {
-if (n < m) {
-   T *q = alloc(m, p, n);
-   free(n, p);
+ reserve(size_type n) {
+if (_capacity < n) {
+   T *q = alloc(n, p, _size);
+   free(_size, p);
 
 p = q;
-   n = m;
+   _capacity = n;
 }
  }
 
  void
- resize(size_type m, T x = T()) {
-size_type n = size();
+ resize(size_type n, T x = T()) {
+if (n <= _size) {
+   for (size_type i = n; i < _size; ++i)
+  p[i].~T();
 
-reserve(m);
+} else {
+   reserve(n);
 
-for (size_type i = n; i < m; ++i)
-   new(&p[i]) T(x);
+   for (size_type i = _size; i < n; ++i)
+  new(&p[i]) T(x);
+}
+
+_size = n;
  }
 
  void
  push_back(const T &x) {
-size_type n = size();
-reserve(n + 1);
-new(&p[n]) T(x);
+reserve(_size + 1);
+new(&p[_size]) T(x);
+++_size;
  }
+++_size;
  }
 
  size_type
  size() const {
-return n;
+return _size;
+ }
+
+ size_type
+ capacity() const {
+return _capacity;
  }
 
  iterator
@@ -139,12 +155,12 @@ namespace clover {
 
  iterator
  end() {
-return p + n;
+return p + _size;
  }
 
  const_iterator
  end() const {
-return p + n;
+return p + _size;
  }
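
The effect of tracking size and capacity separately can be seen in a stripped-down container. The sketch below is an assumption-laden illustration, not the compat::vector code from the patch: `tiny_vector` and its `release()` helper are hypothetical names, copying is disabled, and exception safety is ignored. It shows why the module.cpp callers switch from reserve() to resize(): once reserve() only grows the allocation, a loop bounded by size() sees no elements unless resize() was called.

```cpp
#include <cstddef>
#include <new>
#include <utility>

// Hypothetical minimal container; size and capacity are separate fields.
template<typename T>
class tiny_vector {
public:
   tiny_vector() : p(nullptr), _size(0), _capacity(0) {}
   tiny_vector(const tiny_vector &) = delete;
   tiny_vector &operator=(const tiny_vector &) = delete;
   ~tiny_vector() { release(); }

   // Grows the allocation only; the number of live elements is unchanged.
   void reserve(std::size_t n) {
      if (_capacity < n) {
         T *q = static_cast<T *>(::operator new(n * sizeof(T)));
         for (std::size_t i = 0; i < _size; ++i)
            new(&q[i]) T(std::move(p[i]));
         release();               // destroys the old elements, frees old storage
         p = q;
         _capacity = n;
      }
   }

   // Changes the number of live elements, as std::vector::resize does.
   void resize(std::size_t n, T x = T()) {
      if (n <= _size) {
         for (std::size_t i = n; i < _size; ++i)
            p[i].~T();
      } else {
         reserve(n);
         for (std::size_t i = _size; i < n; ++i)
            new(&p[i]) T(x);
      }
      _size = n;
   }

   std::size_t size() const { return _size; }
   std::size_t capacity() const { return _capacity; }
   T &operator[](std::size_t i) { return p[i]; }

private:
   // Destroys the live elements and frees the storage; leaves _size as is.
   void release() {
      for (std::size_t i = 0; i < _size; ++i)
         p[i].~T();
      ::operator delete(p);
   }

   T *p;
   std::size_t _size, _capacity;
};
```

After `reserve(4)` on an empty `tiny_vector<int>`, size() is still 0 while capacity() is 4; only `resize()` makes elements visible to code that iterates up to size().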

Re: [Mesa-dev] [PATCH] clover: fix piglit cl-api-build-program test

2014-08-18 Thread Francisco Jerez

EdB edb+m...@sigluy.net writes:

 On Sunday, August 17, 2014 11:50:12 PM Francisco Jerez wrote:
 EdB edb+m...@sigluy.net writes:
  Hello
  
  There is a crash with your version.
  This one works
 
 Oops, sorry for that.  It seems like a hack to me to force the kernel
 reference count to one to keep it from being destroyed...  Can you try
 the attached patch instead on top of my clover-next branch [1]?
 8010325eaf and 47e8adea3a are the ones it depends on.
 
 [1] http://cgit.freedesktop.org/~currojerez/mesa/log/?h=clover-next

 It works

 Thanks

Cool, pushed.



[Mesa-dev] [Bug 82538] Super Maryo Chronicles fails with st/mesa assertion failure

2014-08-18 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=82538

Michel Dänzer mic...@daenzer.net changed:

   What|Removed |Added

 CC||mar...@gmail.com

--- Comment #3 from Michel Dänzer mic...@daenzer.net ---
Bisected it to:

commit 734e4946f50c1b83dafdb18ced652abc88e6a246
Author: Marek Olšák marek.ol...@amd.com
Date:   Fri Jul 11 00:05:44 2014 +0200

mesa: fix crash in st/mesa after deleting a VAO

This happens when glGetMultisamplefv (or any other non-draw function) is
called, which doesn't invoke the VBO module to update _DrawArrays and
the pointer is invalid at that point.

However st/mesa still dereferences it to setup vertex buffers == crash.


Re: [Mesa-dev] [RFC PATCH 00/16] A new IR for Mesa

2014-08-18 Thread Michel Dänzer

On 16.08.2014 09:12, Connor Abbott wrote:
 I know what you might be thinking right now. Wait, *another* IR? Don't
 we already have like 5 of those, not counting all the driver-specific
 ones? Isn't this stuff complicated enough already? Well, there are some
 pretty good reasons to start afresh (again...). In the years we've been
 using GLSL IR, we've come to realize that, in fact, it's not what we
 want *at all* to do optimizations on.

Did you evaluate using LLVM IR instead of inventing yet another one?


-- 
Earthling Michel Dänzer|  http://www.amd.com
Libre software enthusiast  |Mesa and X developer

Re: [Mesa-dev] [PATCH 1/3] clover: fix _logs string creation

2014-08-18 Thread EdB

On Monday, August 18, 2014 09:20:03 AM Francisco Jerez wrote:
 EdB edb+m...@sigluy.net writes:
  compat::string is not \0 terminated.
  size() needs to be used for std::string creation
  ---
  
   src/gallium/state_trackers/clover/core/program.cpp | 4 ++--
   1 file changed, 2 insertions(+), 2 deletions(-)
  
  diff --git a/src/gallium/state_trackers/clover/core/program.cpp
  b/src/gallium/state_trackers/clover/core/program.cpp index
  e09c3aa..3f504d5 100644
  --- a/src/gallium/state_trackers/clover/core/program.cpp
  +++ b/src/gallium/state_trackers/clover/core/program.cpp
  @@ -61,9 +61,9 @@ program::build(const ref_vector<device> &devs, const
  char *opts) { 
   dev.ir_target(),
   build_opts(dev),
   log));
   
   _binaries.insert({ &dev, module });
  
  -_logs.insert({ &dev, std::string(log.c_str()) });
  +_logs.insert({ &dev, std::string(log.c_str(), log.size()) });
  
} catch (const build_error &) {
  
  -_logs.insert({ &dev, std::string(log.c_str()) });
  +_logs.insert({ &dev, std::string(log.c_str(), log.size()) });
 
 Both of these should just be using the conversion operator.  See
 attachment.

Agreed, I was highlighting the problem.
Yours is better.

Thanks

 
   throw;

}
 
 }
  

Re: [Mesa-dev] [PATCH 2/3] clover: stdify compat::vector a little more

2014-08-18 Thread EdB

On Monday, August 18, 2014 09:29:02 AM Francisco Jerez wrote:
 EdB edb+m...@sigluy.net writes:
  make resize work like std::vector
  reserve take advantage of capacity
  rename members to be uniform with other class
  ---
  
   src/gallium/state_trackers/clover/core/module.cpp |   2 +-
   src/gallium/state_trackers/clover/util/compat.hpp | 113
   +++--- 2 files changed, 78 insertions(+), 37
   deletions(-)
 
 This could be a *lot* simpler, see attachment.

Looks good to me.

Thanks

Re: [Mesa-dev] [PATCH 2/7] mapi: Inline shared-glapi/Makefile.

2014-08-18 Thread Emil Velikov

On 18/08/14 05:14, Matt Turner wrote:
 On Sun, Aug 17, 2014 at 1:06 PM, Kristian Høgsberg hoegsb...@gmail.com 
 wrote:
 On Fri, Aug 15, 2014 at 10:47:06AM -0700, Matt Turner wrote:
 ---
  configure.ac  |  1 -
  src/mapi/Makefile.am  | 44 
 ---
  src/mapi/shared-glapi/Makefile.am | 34 --
  src/mesa/Makefile.sources |  3 ---
  4 files changed, 41 insertions(+), 41 deletions(-)
  delete mode 100644 src/mapi/shared-glapi/Makefile.am

 diff --git a/configure.ac b/configure.ac
 index dc81c80..97d5394 100644
 --- a/configure.ac
 +++ b/configure.ac
 @@ -2243,7 +2243,6 @@ AC_CONFIG_FILES([Makefile
   src/mapi/glapi/Makefile
   src/mapi/glapi/gen/Makefile
   src/mapi/glapi/tests/Makefile
 - src/mapi/shared-glapi/Makefile
   src/mapi/shared-glapi/tests/Makefile
   src/mapi/vgapi/Makefile
   src/mapi/vgapi/vg.pc
 diff --git a/src/mapi/Makefile.am b/src/mapi/Makefile.am
 index ef53803..6b9444a 100644
 --- a/src/mapi/Makefile.am
 +++ b/src/mapi/Makefile.am
 @@ -1,4 +1,4 @@
 -# Copyright © 2013 Intel Corporation
 +# Copyright © 2013, 2014 Intel Corporation
  #
  # Permission is hereby granted, free of charge, to any person obtaining a
  # copy of this software and associated documentation files (the 
 Software),
 @@ -19,10 +19,46 @@
  # FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 
 DEALINGS
  # IN THE SOFTWARE.

 -SUBDIRS = glapi/gen
 +SUBDIRS = glapi/gen .
 +
 +TOP = $(top_srcdir)
 +
 +BUILT_SOURCES =
 +CLEANFILES = $(BUILT_SOURCES)
 +
 +lib_LTLIBRARIES =
 +
 +AM_CFLAGS = $(PTHREAD_CFLAGS)
 +AM_CPPFLAGS =\
 + $(DEFINES)  \
 + $(SELINUX_CFLAGS)   \
 + -I$(top_srcdir)/include \
 + -I$(top_srcdir)/src/mapi\
 + -I$(top_builddir)/src/mapi
 +
 +GLAPI = $(top_srcdir)/src/mapi/glapi
 +include Makefile.sources
 +include glapi/gen/glapi_gen.mk

  if HAVE_SHARED_GLAPI
 -SUBDIRS += shared-glapi
 +SUBDIRS += shared-glapi/tests
 +
 +BUILT_SOURCES += shared-glapi/glapi_mapi_tmp.h
 +
 +lib_LTLIBRARIES += shared-glapi/libglapi.la
 +shared_glapi_libglapi_la_SOURCES = $(MAPI_GLAPI_FILES)
 +shared_glapi_libglapi_la_CPPFLAGS = \
 + $(AM_CPPFLAGS) \
 + -DMAPI_MODE_GLAPI \
 + -DMAPI_ABI_HEADER=\"shared-glapi/glapi_mapi_tmp.h\"
 +shared_glapi_libglapi_la_LIBADD = $(SELINUX_LIBS)
 +shared_glapi_libglapi_la_LDFLAGS = \
 + -no-undefined \
 + $(GC_SECTIONS) \
 + $(LD_NO_UNDEFINED)
 +
 +shared-glapi/glapi_mapi_tmp.h : $(GLAPI)/gen/gl_and_es_API.xml $(glapi_gen_mapi_deps)
 + $(call glapi_gen_mapi,$<,shared-glapi)
  endif

  if HAVE_OPENGL
 @@ -40,3 +76,5 @@ endif
  if HAVE_OPENVG
  SUBDIRS += vgapi
  endif
 +
 +include $(top_srcdir)/install-lib-links.mk
 diff --git a/src/mapi/shared-glapi/Makefile.am 
 b/src/mapi/shared-glapi/Makefile.am
 deleted file mode 100644
 index 330719c..000
 --- a/src/mapi/shared-glapi/Makefile.am
 +++ /dev/null
 @@ -1,34 +0,0 @@
 -# Used by OpenGL ES or when --enable-shared-glapi is specified
 -
 -SUBDIRS = . tests
 -
 -TOP = $(top_srcdir)
 -GLAPI = $(top_srcdir)/src/mapi/glapi
 -include $(top_srcdir)/src/mapi/Makefile.sources
 -
 -lib_LTLIBRARIES = libglapi.la
 -libglapi_la_SOURCES = $(MAPI_GLAPI_FILES)
 -libglapi_la_LIBADD = $(PTHREAD_LIBS) $(SELINUX_LIBS)

 You didn't move $(PTHREAD_LIBS) up to shared_glapi_libglapi_la_LIBADD?
 
 Right... Emil, do you remember whether PTHREAD_LIBS is needed?
 PTHREAD_CFLAGS seems sufficient for me, but I have a vague memory that
 FreeBSD or something needs PTHREAD_LIBS.
 
This seems to be an interesting topic:

ldd states that our current pthreads linking is not needed. On the other hand,
libglapi.so.0.0 has at least one function (pthread_once) coming from the
pthreads library. At the same time, the function is _unused_ by the whole of
Mesa. Not to mention that the *BSD people need the pthreads linking, as their
libc does not provide any pthread* symbols.

So in summary, let's keep PTHREAD_LIBS in for now :)

-Emil

 -libglapi_la_LDFLAGS = \
 - -no-undefined \
 - $(GC_SECTIONS) \
 - $(LD_NO_UNDEFINED)
 -
 -include $(GLAPI)/gen/glapi_gen.mk
 -glapi_mapi_tmp.h : $(GLAPI)/gen/gl_and_es_API.xml $(glapi_gen_mapi_deps)
 - $(call glapi_gen_mapi,$<,shared-glapi)
 -
 -BUILT_SOURCES = glapi_mapi_tmp.h
 -CLEANFILES = $(BUILT_SOURCES)
 -
 -AM_CFLAGS = $(PTHREAD_CFLAGS)
 -AM_CPPFLAGS =\
 - $(DEFINES)  \
 - $(SELINUX_CFLAGS)   \
 - -I$(top_srcdir)/include \
 - -I$(top_srcdir)/src/mapi\
 - -I$(top_builddir)/src/mapi

Re: [Mesa-dev] [PATCH 03/19] glx/drisw: add support for DRI2rendererQueryExtension

2014-08-18 Thread Jon TURNEY


On 14/08/2014 23:18, Emil Velikov wrote:

The extension is used by GLX_MESA_query_renderer, which
can be provided for by hardware and software drivers.

v2: Use designated initializers.
v3: Move drisw_query_renderer_*() to dri2_query_renderer.c


This breaks my build (see [1])

I guess something like the attached is needed.

Possibly dri2_query_renderer.c needs to be renamed, since its contents 
now are used for more than dri[23].


[1] http://tinderbox.x.org/builds/2014-08-16-0006/logs/mesa-mesa/#build

From ee9b2d044ebb089bc3daf93fc6b71e167c47841f Mon Sep 17 00:00:00 2001
From: Jon TURNEY jon.tur...@dronecode.org.uk
Date: Sun, 17 Aug 2014 17:22:22 +0100
Subject: [PATCH] Fix build since 679c2ef glx/drisw: add support for
 DRI2rendererQueryExtension, when only building drisw renderer.

Signed-off-by: Jon TURNEY jon.tur...@dronecode.org.uk
---
 src/glx/Makefile.am   | 6 +++---
 src/glx/dri2_query_renderer.c | 4 
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/src/glx/Makefile.am b/src/glx/Makefile.am
index cdd898e..23cb794 100644
--- a/src/glx/Makefile.am
+++ b/src/glx/Makefile.am
@@ -96,7 +96,8 @@ endif
 if HAVE_DRICOMMON
 libglx_la_SOURCES += \
  xfont.c \
- dri_common.c
+ dri_common.c \
+ dri2_query_renderer.c
 endif
 
 if HAVE_DRI2
@@ -104,8 +105,7 @@ libglx_la_SOURCES += \
  dri_glx.c \
  XF86dri.c \
  dri2_glx.c \
- dri2.c \
- dri2_query_renderer.c
+ dri2.c
 endif
 
 if HAVE_DRI3
diff --git a/src/glx/dri2_query_renderer.c b/src/glx/dri2_query_renderer.c
index 247ec1c..6ccd710 100644
--- a/src/glx/dri2_query_renderer.c
+++ b/src/glx/dri2_query_renderer.c
@@ -25,7 +25,9 @@
 
 #include "glxclient.h"
 #include "glx_error.h"
+#ifdef HAVE_LIBDRM
 #include "dri2.h"
+#endif
 #include "dri_interface.h"
 #include "dri2_priv.h"
 #if defined(HAVE_DRI3)
@@ -66,6 +68,7 @@ dri2_convert_glx_query_renderer_attribs(int attribute)
return -1;
 }
 
+#ifdef HAVE_LIBDRM
 _X_HIDDEN int
 dri2_query_renderer_integer(struct glx_screen *base, int attribute,
 unsigned int *value)
@@ -103,6 +106,7 @@ dri2_query_renderer_string(struct glx_screen *base, int attribute,
 
    return psc->rendererQuery->queryString(psc->driScreen, dri_attribute, value);
 }
+#endif
 
 #if defined(HAVE_DRI3)
 _X_HIDDEN int
-- 
1.8.5.5


Re: [Mesa-dev] [PATCH 1/7] build: Let install-lib-links.mk handle .la files in subdirectories.

2014-08-18 Thread Emil Velikov

On 18/08/14 05:19, Matt Turner wrote:
 On Sun, Aug 17, 2014 at 2:39 PM, Emil Velikov emil.l.veli...@gmail.com 
 wrote:
 On 15/08/14 18:47, Matt Turner wrote:
 The next patches are going to combine some of the mapi subdirectories'
 Makefiles into a single Makefile, giving better build parallelism.

 Hi Matt,

 I must admit that while I like this patch, I'm not at all a fan of the rest 
 of
 the series. But I won't object too strongly against the idea.
 
 Oh, really? I mean, there's some complexity just in all of the
 combinations, but I think this is a clean up.
 
 It's certainly an improvement in that we don't have Makefiles that
 build a single source file. After this series if you build GL, ES1,
 and ES2 all of it happens in parallel including the tests.
 
I shall not be going into mapi anytime soon so it's up-to you to have fun in
there. I prefer to get gallium's 'make dist' close to working and clean-up
some of the pipe-loader/targets mess that I've created :P

Not sure if the extra parallelism will help here as I very rarely build ES*
anyway so ;)

-Emil

Re: [Mesa-dev] [PATCH 03/19] glx/drisw: add support for DRI2rendererQueryExtension

2014-08-18 Thread Emil Velikov

On 18/08/14 12:47, Jon TURNEY wrote:
 On 14/08/2014 23:18, Emil Velikov wrote:
 The extension is used by GLX_MESA_query_renderer, which
 can be provided for by hardware and software drivers.

 v2: Use designated initializers.
 v3: Move drisw_query_renderer_*() to dri2_query_renderer.c
 
 This breaks my build (see [1])
 
Ouch, I'd completely forgotten about your recent-ish changes in here. Sorry for
the breakage.

 I guess something like the attached is needed.
 
 Possibly dri2_query_renderer.c needs to be renamed, since its contents now
 are used for more than dri[23].
 
My initial plan was to move the functions to dri_common.c, although that
caused 'make check' to explode so I've kept them here as per Ian's suggestion.
Renaming the file makes sense imho.

 [1] http://tinderbox.x.org/builds/2014-08-16-0006/logs/mesa-mesa/#build
 
 
 0001-Fix-build-since-679c2ef-glx-drisw-add-support-for-DR.patch
 
 
 From ee9b2d044ebb089bc3daf93fc6b71e167c47841f Mon Sep 17 00:00:00 2001
 From: Jon TURNEY jon.tur...@dronecode.org.uk
 Date: Sun, 17 Aug 2014 17:22:22 +0100
 Subject: [PATCH] Fix build since 679c2ef glx/drisw: add support for
  DRI2rendererQueryExtension, when only building drisw renderer.
 
 Signed-off-by: Jon TURNEY jon.tur...@dronecode.org.uk
 ---
  src/glx/Makefile.am   | 6 +++---
  src/glx/dri2_query_renderer.c | 4 
  2 files changed, 7 insertions(+), 3 deletions(-)
 
 diff --git a/src/glx/Makefile.am b/src/glx/Makefile.am
 index cdd898e..23cb794 100644
 --- a/src/glx/Makefile.am
 +++ b/src/glx/Makefile.am
 @@ -96,7 +96,8 @@ endif
  if HAVE_DRICOMMON
  libglx_la_SOURCES += \
 xfont.c \
 -   dri_common.c
 +   dri_common.c \
 +   dri2_query_renderer.c
  endif
  
  if HAVE_DRI2
 @@ -104,8 +105,7 @@ libglx_la_SOURCES += \
 dri_glx.c \
 XF86dri.c \
 dri2_glx.c \
 -   dri2.c \
 -   dri2_query_renderer.c
 +   dri2.c
  endif
  
  if HAVE_DRI3
 diff --git a/src/glx/dri2_query_renderer.c b/src/glx/dri2_query_renderer.c
 index 247ec1c..6ccd710 100644
 --- a/src/glx/dri2_query_renderer.c
 +++ b/src/glx/dri2_query_renderer.c
 @@ -25,7 +25,9 @@
  
  #include "glxclient.h"
  #include "glx_error.h"
 +#ifdef HAVE_LIBDRM
  #include "dri2.h"
 +#endif
With a couple of small changes, I believe that you should be safe with
dropping the above header and the HAVE_LIBDRM guards below.

The small changes:
 - dri*_query_renderer_* into their respective dri*_priv.h
 - Perhaps move a struct from dri2.h to dri2_priv.h

-Emil

  #include "dri_interface.h"
  #include "dri2_priv.h"
  #if defined(HAVE_DRI3)
 @@ -66,6 +68,7 @@ dri2_convert_glx_query_renderer_attribs(int attribute)
 return -1;
  }
  
 +#ifdef HAVE_LIBDRM
  _X_HIDDEN int
  dri2_query_renderer_integer(struct glx_screen *base, int attribute,
  unsigned int *value)
 @@ -103,6 +106,7 @@ dri2_query_renderer_string(struct glx_screen *base, int attribute,
  
     return psc->rendererQuery->queryString(psc->driScreen, dri_attribute, value);
  }
 +#endif
  
  #if defined(HAVE_DRI3)
  _X_HIDDEN int
 -- 1.8.5.5
 
 
 

[Mesa-dev] Clamp/saturate optimizations v3

2014-08-18 Thread Abdiel Janulgue

v3 of clamp and saturate optimizations

Changes since v1:
 - Only remove the old try_emit_saturate operations after the new optimizations
   are in place. (Matt, Ian)
 - Output [min/max](saturate(x),b) instead of saturate([min/max](x,b)) as
   suggested by Ilia Mirkin.
 - The change above required some refactoring in the fs/vec4 backend to allow
   propagation of certain instructions with saturate flag to SEL. For other
   instructions, we don't propagate saturate instructions, similar to the
   previous behaviour.
Since v2:
 - Fix comments to reflect we are doing a commutative operation, add missing
   conditions when optimizing clamp in opt_algebraic pass.
 - Refactor try_emit_saturate() in i965/fs instead of completely removing it.
   This fixed a regression where the changes emitted an (extra) unnecessary
   saturated mov when the expression generating src can do saturate directly
   instead.
 - Fix regression in the i965/vec4 copy-propagate optimization caused by
   ignoring channels in the propagated instruction.
 - Count generated loops from the fs/vec4 generator.

Results from our shader-db:

total instructions in shared programs: 4538627 -> 4560104 (0.47%)
instructions in affected programs: 45144 -> 66621 (47.57%)
total loops in shared programs: 887 -> 711 (-19.84%)
GAINED: 0
LOST: 36

I modified shader-db a bit to catch loop unrolls. The shaders that show an
increase in instruction count are all due to the loop unroll pass triggered by
this optimization on games that contain looped clamp/saturate operations. The
unroll pass also resulted in a few shaders with looped clamp/sat skipping
SIMD16 generation.

** No piglit regressions observed **

Abdiel Janulgue (17):
  i965/vec4/fs: Count loops in shader debug
  glsl: Add ir_unop_saturate
  glsl: Add constant evaluation of ir_unop_saturate
  glsl: Add a pass to lower ir_unop_saturate to clamp(x, 0, 1)
  ir_to_mesa, glsl_to_tgsi: lower ir_unop_saturate
  ir_to_mesa, glsl_to_tgsi: Add support for ir_unop_saturate
  i965/fs: Add support for ir_unop_saturate
  i965/vec4: Add support for ir_unop_saturate
  glsl: Implement saturate as ir_unop_saturate
  glsl: Optimize clamp(x, 0, 1) as saturate(x)
  glsl: Optimize clamp(x, 0.0, b), where b < 1.0 as min(saturate(x),b)
  glsl: Optimize clamp(x, b, 1.0), where b > 0.0 as max(saturate(x),b)
  i965/fs: Allow propagation of instructions with saturate flag to sel
  i965/vec4: Allow propagation of instructions with saturate flag to sel
  ir_to_mesa, glsl_to_tgsi: Remove try_emit_saturate
  i965/fs: Refactor try_emit_saturate
  i965/vec4: Remove try_emit_saturate

 src/glsl/ir.cpp  |  2 +
 src/glsl/ir.h|  1 +
 src/glsl/ir_builder.cpp  |  6 +-
 src/glsl/ir_constant_expression.cpp  |  6 ++
 src/glsl/ir_optimization.h   |  1 +
 src/glsl/ir_validate.cpp |  1 +
 src/glsl/lower_instructions.cpp  | 29 
 src/glsl/opt_algebraic.cpp   | 98 
++
 src/mesa/drivers/dri/i965/brw_fs_channel_expressions.cpp |  1 +
 src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp| 18 -
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp   |  6 +-
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 27 ---
 src/mesa/drivers/dri/i965/brw_vec4.h |  2 +-
 src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp  | 85 
+++---
 src/mesa/drivers/dri/i965/brw_vec4_generator.cpp |  6 +-
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp   | 25 ++-
 src/mesa/program/ir_to_mesa.cpp  | 59 +++-
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp   | 63 +++--
 18 files changed, 261 insertions(+), 175 deletions(-)
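
The algebraic rewrites named in the patch titles above can be checked numerically. The following is a small sketch, assuming the usual definition clamp(x, lo, hi) = min(max(x, lo), hi); the function names are made up for the illustration and are not the ones used in opt_algebraic.cpp. It only covers ordinary (non-NaN) inputs.

```cpp
#include <algorithm>

// Reference definition: clamp(x, lo, hi) = min(max(x, lo), hi).
inline float clamp_ref(float x, float lo, float hi) {
   return std::min(std::max(x, lo), hi);
}

// saturate(x) is clamp(x, 0, 1).
inline float saturate(float x) {
   return clamp_ref(x, 0.0f, 1.0f);
}

// clamp(x, 0, b) rewritten as min(saturate(x), b); valid when b <= 1.
inline float clamp_upper(float x, float b) {
   return std::min(saturate(x), b);
}

// clamp(x, b, 1) rewritten as max(saturate(x), b); valid when 0 <= b <= 1.
inline float clamp_lower(float x, float b) {
   return std::max(saturate(x), b);
}
```

The point of the rewritten forms is that the saturate part can map to a saturate modifier on an instruction where the hardware has one, leaving a single min or max.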


[Mesa-dev] [PATCH 02/17] glsl: Add ir_unop_saturate

2014-08-18 Thread Abdiel Janulgue

Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
---
 src/glsl/ir.cpp  | 2 ++
 src/glsl/ir.h| 1 +
 src/glsl/ir_validate.cpp | 1 +
 3 files changed, 4 insertions(+)

diff --git a/src/glsl/ir.cpp b/src/glsl/ir.cpp
index 4a4d304..ef04ed0 100644
--- a/src/glsl/ir.cpp
+++ b/src/glsl/ir.cpp
@@ -255,6 +255,7 @@ ir_expression::ir_expression(int op, ir_rvalue *op0)
case ir_unop_dFdy_fine:
case ir_unop_bitfield_reverse:
case ir_unop_interpolate_at_centroid:
+   case ir_unop_saturate:
   this->type = op0->type;
   break;
 
@@ -534,6 +535,7 @@ static const char *const operator_strs[] = {
    "bit_count",
    "find_msb",
    "find_lsb",
+   "sat",
    "noise",
    "interpolate_at_centroid",
    "+",
diff --git a/src/glsl/ir.h b/src/glsl/ir.h
index 18623b9..96c8b0e 100644
--- a/src/glsl/ir.h
+++ b/src/glsl/ir.h
@@ -1248,6 +1248,7 @@ enum ir_expression_operation {
ir_unop_find_lsb,
/*@}*/
 
+   ir_unop_saturate,
ir_unop_noise,
 
/**
diff --git a/src/glsl/ir_validate.cpp b/src/glsl/ir_validate.cpp
index 5b20677..97a581d 100644
--- a/src/glsl/ir_validate.cpp
+++ b/src/glsl/ir_validate.cpp
@@ -241,6 +241,7 @@ ir_validate::visit_leave(ir_expression *ir)
case ir_unop_log:
case ir_unop_exp2:
case ir_unop_log2:
+   case ir_unop_saturate:
   assert(ir->operands[0]->type->base_type == GLSL_TYPE_FLOAT);
   assert(ir->type == ir->operands[0]->type);
   break;
-- 
1.9.1


[Mesa-dev] [PATCH 05/17] ir_to_mesa, glsl_to_tgsi: lower ir_unop_saturate

2014-08-18 Thread Abdiel Janulgue

Needed when vertex programs don't allow saturate

Reviewed-by: Matt Turner matts...@gmail.com
Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
---
 src/mesa/program/ir_to_mesa.cpp| 5 -
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 6 +-
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/src/mesa/program/ir_to_mesa.cpp b/src/mesa/program/ir_to_mesa.cpp
index 011ffed..e8126b3 100644
--- a/src/mesa/program/ir_to_mesa.cpp
+++ b/src/mesa/program/ir_to_mesa.cpp
@@ -2991,9 +2991,12 @@ _mesa_ir_link_shader(struct gl_context *ctx, struct gl_shader_program *prog)
 
 /* Lowering */
 do_mat_op_to_vec(ir);
+GLenum target = _mesa_shader_stage_to_program(prog->_LinkedShaders[i]->Stage);
 lower_instructions(ir, (MOD_TO_FRACT | DIV_TO_MUL_RCP | EXP_TO_EXP2
 | LOG_TO_LOG2 | INT_DIV_TO_MUL_RCP
-| ((options->EmitNoPow) ? POW_TO_EXP2 : 0)));
+| ((options->EmitNoPow) ? POW_TO_EXP2 : 0)
+| ((target == GL_VERTEX_PROGRAM_ARB) ? SAT_TO_CLAMP
+: 0)));
 
 progress = do_lower_jumps(ir, true, true, options->EmitNoMainReturn, options->EmitNoCont, options->EmitNoLoops) || progress;
 
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 84bdc4f..575da1e 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -5429,6 +5429,9 @@ st_link_shader(struct gl_context *ctx, struct gl_shader_program *prog)
   if (!pscreen->get_param(pscreen, PIPE_CAP_TEXTURE_GATHER_OFFSETS))
  lower_offset_arrays(ir);
   do_mat_op_to_vec(ir);
+  /* Emit saturates in the vertex shader only if SM 3.0 is supported. */
+  bool vs_sm3 = (_mesa_shader_stage_to_program(prog->_LinkedShaders[i]->Stage) ==
+ GL_VERTEX_PROGRAM_ARB) && st_context(ctx)->has_shader_model3;
   lower_instructions(ir,
  MOD_TO_FRACT |
  DIV_TO_MUL_RCP |
@@ -5438,7 +5441,8 @@ st_link_shader(struct gl_context *ctx, struct gl_shader_program *prog)
  CARRY_TO_ARITH |
  BORROW_TO_ARITH |
  (options->EmitNoPow ? POW_TO_EXP2 : 0) |
- (!ctx->Const.NativeIntegers ? INT_DIV_TO_MUL_RCP : 0));
+ (!ctx->Const.NativeIntegers ? INT_DIV_TO_MUL_RCP : 0) |
+ (vs_sm3 ? SAT_TO_CLAMP : 0));
 
   lower_ubo_reference(prog->_LinkedShaders[i], ir);
   do_vec_index_to_cond_assign(ir);
-- 
1.9.1


[Mesa-dev] [PATCH 07/17] i965/fs: Add support for ir_unop_saturate

2014-08-18 Thread Abdiel Janulgue

Reviewed-by: Matt Turner matts...@gmail.com
Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
---
 src/mesa/drivers/dri/i965/brw_fs_channel_expressions.cpp | 1 +
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 4 
 2 files changed, 5 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_channel_expressions.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_channel_expressions.cpp
index d98b7eb..cb0a079 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_channel_expressions.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_channel_expressions.cpp
@@ -246,6 +246,7 @@ ir_channel_expressions_visitor::visit_leave(ir_assignment *ir)
    case ir_unop_bit_count:
    case ir_unop_find_msb:
    case ir_unop_find_lsb:
+   case ir_unop_saturate:
       for (i = 0; i < vector_elements; i++) {
          ir_rvalue *op0 = get_element(op_var[0], i);
 
diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index 05082ee..c33c46b 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -854,6 +854,10 @@ fs_visitor::visit(ir_expression *ir)
case ir_unop_find_lsb:
       emit(FBL(this->result, op[0]));
       break;
+   case ir_unop_saturate:
+      inst = emit(MOV(this->result, op[0]));
+      inst->saturate = true;
+  break;
case ir_triop_bitfield_extract:
   /* Note that the instruction's argument order is reversed from GLSL
* and the IR.
-- 
1.9.1


[Mesa-dev] [PATCH 04/17] glsl: Add a pass to lower ir_unop_saturate to clamp(x, 0, 1)

2014-08-18 Thread Abdiel Janulgue

Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
---
 src/glsl/ir_optimization.h  |  1 +
 src/glsl/lower_instructions.cpp | 29 +
 2 files changed, 30 insertions(+)

diff --git a/src/glsl/ir_optimization.h b/src/glsl/ir_optimization.h
index b83c225..1c6f72b 100644
--- a/src/glsl/ir_optimization.h
+++ b/src/glsl/ir_optimization.h
@@ -40,6 +40,7 @@
 #define LDEXP_TO_ARITH 0x100
 #define CARRY_TO_ARITH 0x200
 #define BORROW_TO_ARITH0x400
+#define SAT_TO_CLAMP   0x800
 
 /**
  * \see class lower_packing_builtins_visitor
diff --git a/src/glsl/lower_instructions.cpp b/src/glsl/lower_instructions.cpp
index 176070c..6842853 100644
--- a/src/glsl/lower_instructions.cpp
+++ b/src/glsl/lower_instructions.cpp
@@ -41,6 +41,7 @@
  * - BITFIELD_INSERT_TO_BFM_BFI
  * - CARRY_TO_ARITH
  * - BORROW_TO_ARITH
+ * - SAT_TO_CLAMP
  *
  * SUB_TO_ADD_NEG:
  * ---
@@ -104,6 +105,10 @@
  * 
  * Converts ir_borrow into (x < y).
  *
+ * SAT_TO_CLAMP:
+ * -
+ * Converts ir_unop_saturate into min(max(x, 0.0), 1.0)
+ *
  */
 
 #include "main/core.h" /* for M_LOG2E */
@@ -139,6 +144,7 @@ private:
void ldexp_to_arith(ir_expression *);
void carry_to_arith(ir_expression *);
void borrow_to_arith(ir_expression *);
+   void sat_to_clamp(ir_expression *);
 };
 
 } /* anonymous namespace */
@@ -484,6 +490,24 @@ lower_instructions_visitor::borrow_to_arith(ir_expression *ir)
    this->progress = true;
 }
 
+void
+lower_instructions_visitor::sat_to_clamp(ir_expression *ir)
+{
+   /* Translates
+*   ir_unop_saturate x
+* into
+*   ir_binop_min (ir_binop_max(x, 0.0), 1.0)
+*/
+
+   ir->operation = ir_binop_min;
+   ir->operands[0] = new(ir) ir_expression(ir_binop_max, ir->operands[0]->type,
+                                           ir->operands[0],
+                                           new(ir) ir_constant(0.0f));
+   ir->operands[1] = new(ir) ir_constant(1.0f);
+
+   this->progress = true;
+}
+
 ir_visitor_status
 lower_instructions_visitor::visit_leave(ir_expression *ir)
 {
@@ -540,6 +564,11 @@ lower_instructions_visitor::visit_leave(ir_expression *ir)
  borrow_to_arith(ir);
   break;
 
+   case ir_unop_saturate:
+  if (lowering(SAT_TO_CLAMP))
+ sat_to_clamp(ir);
+  break;
+
default:
   return visit_continue;
}
-- 
1.9.1
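
Not part of the patch: a minimal standalone sketch of the SAT_TO_CLAMP rewrite described above. The Node, Op, make, eval and lower_sat_to_clamp names are hypothetical stand-ins for ir_expression and the lowering visitor. The only thing taken from the patch is the in-place rewrite of saturate(x) into min(max(x, 0.0), 1.0).

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical toy IR; this is NOT Mesa's ir_expression.
enum class Op { Const, Var, Saturate, Min, Max };

struct Node {
   Op op;
   float value = 0.0f;           // only meaningful for Op::Const
   std::unique_ptr<Node> a, b;   // operands (b is unused for unary ops)
};

static std::unique_ptr<Node> make(Op op, float v = 0.0f,
                                  std::unique_ptr<Node> a = nullptr,
                                  std::unique_ptr<Node> b = nullptr)
{
   auto n = std::make_unique<Node>();
   n->op = op;
   n->value = v;
   n->a = std::move(a);
   n->b = std::move(b);
   return n;
}

// Evaluate the tree for a given value of the single variable x.
static float eval(const Node &n, float x)
{
   switch (n.op) {
   case Op::Const:    return n.value;
   case Op::Var:      return x;
   case Op::Saturate: return std::min(std::max(eval(*n.a, x), 0.0f), 1.0f);
   case Op::Min:      return std::min(eval(*n.a, x), eval(*n.b, x));
   case Op::Max:      return std::max(eval(*n.a, x), eval(*n.b, x));
   }
   return 0.0f;
}

// Rewrite every saturate(x) in place as min(max(x, 0.0), 1.0).
// Returns true if anything was rewritten, like the visitor's progress flag.
static bool lower_sat_to_clamp(Node &n)
{
   bool progress = false;
   if (n.a) progress |= lower_sat_to_clamp(*n.a);
   if (n.b) progress |= lower_sat_to_clamp(*n.b);

   if (n.op == Op::Saturate) {
      n.a = make(Op::Max, 0.0f, std::move(n.a), make(Op::Const, 0.0f));
      n.b = make(Op::Const, 1.0f);
      n.op = Op::Min;
      progress = true;
   }
   return progress;
}
```

As in the patch, the node is reused rather than replaced, so anything holding a pointer to the saturate expression sees the lowered form afterwards.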


[Mesa-dev] [PATCH 14/17] i965/vec4: Allow propagation of instructions with saturate flag to sel

2014-08-18 Thread Abdiel Janulgue

Applies when the sel condition is bounded within 0 and 1.0. This allows code such as:
mov.sat a b
sel.ge  dst a 0.25F

To be propagated as:
sel.ge.sat dst b 0.25F

v3: - Syntax clarifications in inst-saturate assignment
- Remove extra parenthesis when assigning src_reg value
  from copy_entry (Matt Turner)
v4: - Take channels into consideration when propagating saturated instructions.

Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
---
 .../drivers/dri/i965/brw_vec4_copy_propagation.cpp | 85 +++---
 1 file changed, 58 insertions(+), 27 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
index 37ca661..fe47b0f 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
@@ -36,13 +36,17 @@ extern C {
 
 namespace brw {
 
+struct copy_entry {
+   src_reg *value[4];
+   int saturatemask;
+};
+
 static bool
 is_direct_copy(vec4_instruction *inst)
 {
    return (inst->opcode == BRW_OPCODE_MOV &&
            !inst->predicate &&
            inst->dst.file == GRF &&
-           !inst->saturate &&
            !inst->dst.reladdr &&
            !inst->src[0].reladdr &&
            inst->dst.type == inst->src[0].type);
@@ -74,16 +78,16 @@ is_channel_updated(vec4_instruction *inst, src_reg 
*values[4], int ch)
 
 static bool
 try_constant_propagate(struct brw_context *brw, vec4_instruction *inst,
-   int arg, src_reg *values[4])
+   int arg, struct copy_entry *entry)
 {
/* For constant propagation, we only handle the same constant
 * across all 4 channels.  Some day, we should handle the 8-bit
 * float vector format, which would let us constant propagate
 * vectors better.
 */
-   src_reg value = *values[0];
+   src_reg value = *entry->value[0];
    for (int i = 1; i < 4; i++) {
-      if (!value.equals(*values[i]))
+      if (!value.equals(*entry->value[i]))
 return false;
}
 
@@ -213,22 +217,22 @@ is_logic_op(enum opcode opcode)
 
 static bool
 try_copy_propagate(struct brw_context *brw, vec4_instruction *inst,
-   int arg, src_reg *values[4])
+   int arg, struct copy_entry *entry, int reg)
 {
/* For constant propagation, we only handle the same constant
 * across all 4 channels.  Some day, we should handle the 8-bit
 * float vector format, which would let us constant propagate
 * vectors better.
 */
-   src_reg value = *values[0];
+   src_reg value = *entry->value[0];
    for (int i = 1; i < 4; i++) {
       /* This is equals() except we don't care about the swizzle. */
-      if (value.file != values[i]->file ||
-          value.reg != values[i]->reg ||
-          value.reg_offset != values[i]->reg_offset ||
-          value.type != values[i]->type ||
-          value.negate != values[i]->negate ||
-          value.abs != values[i]->abs) {
+      if (value.file != entry->value[i]->file ||
+          value.reg != entry->value[i]->reg ||
+          value.reg_offset != entry->value[i]->reg_offset ||
+          value.type != entry->value[i]->type ||
+          value.negate != entry->value[i]->negate ||
+          value.abs != entry->value[i]->abs) {
 return false;
   }
}
@@ -239,7 +243,7 @@ try_copy_propagate(struct brw_context *brw, vec4_instruction *inst,
     */
    int s[4];
    for (int i = 0; i < 4; i++) {
-      s[i] = BRW_GET_SWZ(values[i]->swizzle,
+      s[i] = BRW_GET_SWZ(entry->value[i]->swizzle,
                          BRW_GET_SWZ(inst->src[arg].swizzle, i));
}
value.swizzle = BRW_SWIZZLE4(s[0], s[1], s[2], s[3]);
@@ -300,6 +304,25 @@ try_copy_propagate(struct brw_context *brw, vec4_instruction *inst,
    if (value.equals(inst->src[arg]))
       return false;
 
+   /* Limit saturate propagation only to SEL with src1 bounded within 0.0 and 1.0;
+    * otherwise, skip copy propagation altogether.
+    */
+   if (entry->saturatemask & (1 << arg)) {
+      switch(inst->opcode) {
+      case BRW_OPCODE_SEL:
+         if (inst->src[1].file != IMM ||
+             inst->src[1].fixed_hw_reg.dw1.f < 0.0 ||
+             inst->src[1].fixed_hw_reg.dw1.f > 1.0) {
+            return false;
+         }
+         if (!inst->saturate)
+            inst->saturate = true;
+         break;
+      default:
+         return false;
+      }
+   }
+
    value.type = inst->src[arg].type;
    inst->src[arg] = value;
return true;
@@ -309,9 +332,9 @@ bool
 vec4_visitor::opt_copy_propagation()
 {
bool progress = false;
-   src_reg *cur_value[virtual_grf_reg_count][4];
+   struct copy_entry entries[virtual_grf_reg_count];
 
-   memset(cur_value, 0, sizeof(cur_value));
+   memset(entries, 0, sizeof(entries));
 
foreach_in_list(vec4_instruction, inst, instructions) {
   /* This pass only works on basic blocks.  If there's flow
@@ -322,7 +345,7 @@ vec4_visitor::opt_copy_propagation()
* src/glsl/opt_copy_propagation.cpp to track available

[Mesa-dev] [PATCH 17/17] i965/vec4: Remove try_emit_saturate

2014-08-18 Thread Abdiel Janulgue

Now that saturate is implemented natively as an instruction,
we can cut down on unneeded functionality.

Reviewed-by: Matt Turner matts...@gmail.com
Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
---
 src/mesa/drivers/dri/i965/brw_vec4.h   |  1 -
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 21 -
 2 files changed, 22 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h 
b/src/mesa/drivers/dri/i965/brw_vec4.h
index c333baa..e5ad7af 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4.h
@@ -563,7 +563,6 @@ public:
src_reg orig_src,
int base_offset);
 
-   bool try_emit_sat(ir_expression *ir);
bool try_emit_mad(ir_expression *ir);
bool try_emit_b2f_of_compare(ir_expression *ir);
void resolve_ud_negate(src_reg *reg);
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
index 2e7a85d..95d46c2 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
@@ -1078,24 +1078,6 @@ vec4_visitor::visit(ir_function *ir)
 }
 
 bool
-vec4_visitor::try_emit_sat(ir_expression *ir)
-{
-   ir_rvalue *sat_src = ir->as_rvalue_to_saturate();
-   if (!sat_src)
-      return false;
-
-   sat_src->accept(this);
-   src_reg src = this->result;
-
-   this->result = src_reg(this, ir->type);
-   vec4_instruction *inst;
-   inst = emit(MOV(dst_reg(this->result), src));
-   inst->saturate = true;
-
-   return true;
-}
-
-bool
 vec4_visitor::try_emit_mad(ir_expression *ir)
 {
/* 3-src instructions were introduced in gen6. */
@@ -1228,9 +1210,6 @@ vec4_visitor::visit(ir_expression *ir)
dst_reg result_dst;
vec4_instruction *inst;
 
-   if (try_emit_sat(ir))
-  return;
-
    if (ir->operation == ir_binop_add) {
   if (try_emit_mad(ir))
 return;
-- 
1.9.1


[Mesa-dev] [PATCH 13/17] i965/fs: Allow propagation of instructions with saturate flag to sel

2014-08-18 Thread Abdiel Janulgue

Applies when the sel condition is bounded within 0 and 1.0. This allows code such as:
mov.sat a b
sel.ge  dst a 0.25F

To be propagated as:
sel.ge.sat dst b 0.25F

v3: Syntax clarifications in inst-saturate assignment (Matt Turner)

Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
---
 src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp | 18 +-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
index 09f51bc..7e4eab7 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
@@ -43,6 +43,7 @@ struct acp_entry : public exec_node {
fs_reg dst;
fs_reg src;
enum opcode opcode;
+   bool saturate;
 };
 
 struct block_data {
@@ -347,11 +348,26 @@ fs_visitor::try_copy_propagate(fs_inst *inst, int arg, acp_entry *entry)
       return false;
    }
 
+   if (entry->saturate) {
+      switch(inst->opcode) {
+      case BRW_OPCODE_SEL:
+         if (inst->src[1].file != IMM ||
+             inst->src[1].fixed_hw_reg.dw1.f < 0.0 ||
+             inst->src[1].fixed_hw_reg.dw1.f > 1.0) {
+            return false;
+         }
+         break;
+      default:
+         return false;
+      }
+   }
+
    inst->src[arg].file = entry->src.file;
    inst->src[arg].reg = entry->src.reg;
    inst->src[arg].reg_offset = entry->src.reg_offset;
    inst->src[arg].subreg_offset = entry->src.subreg_offset;
    inst->src[arg].stride *= entry->src.stride;
+   inst->saturate = inst->saturate || entry->saturate;
 
    if (!inst->src[arg].abs) {
       inst->src[arg].abs = entry->src.abs;
@@ -514,7 +530,6 @@ can_propagate_from(fs_inst *inst)
             inst->src[0].file == UNIFORM ||
             inst->src[0].file == IMM) &&
            inst->src[0].type == inst->dst.type &&
-           !inst->saturate &&
            !inst->is_partial_write());
 }
 
@@ -569,6 +584,7 @@ fs_visitor::opt_copy_propagate_local(void *copy_prop_ctx, bblock_t *block,
          entry->dst = inst->dst;
          entry->src = inst->src[0];
          entry->opcode = inst->opcode;
+         entry->saturate = inst->saturate;
          acp[entry->dst.reg % ACP_HASH_SIZE].push_tail(entry);
       } else if (inst->opcode == SHADER_OPCODE_LOAD_PAYLOAD &&
                  inst->dst.file == GRF) {
-- 
1.9.1
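
Not part of the patch: a small standalone sketch of why the immediate has to lie within [0.0, 1.0] before the saturate can move from the mov onto the sel. It assumes that .sat on sel clamps the selected result and that sel.ge picks the larger operand. All function names here are hypothetical.

```cpp
#include <algorithm>

// Clamp to [0, 1], i.e. what the .sat modifier is assumed to do to a result.
static float saturate(float x) { return std::min(std::max(x, 0.0f), 1.0f); }

// sel.ge dst a b: assumed to pick a when a >= b, otherwise b.
static float sel_ge(float a, float b) { return a >= b ? a : b; }

// Mirrors the range check on the immediate operand in the patch.
static bool imm_allows_sat_propagation(float imm)
{
   return imm >= 0.0f && imm <= 1.0f;
}

// mov.sat a b ; sel.ge dst a imm
static float before(float b, float imm) { return sel_ge(saturate(b), imm); }

// sel.ge.sat dst b imm
static float after(float b, float imm) { return saturate(sel_ge(b, imm)); }
```

With an immediate of 2.0 the two forms differ (before(0.5, 2.0) selects 2.0, after(0.5, 2.0) clamps it to 1.0), which is the case the range check rejects.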


[Mesa-dev] [PATCH 01/17] i965/vec4/fs: Count loops in shader debug

2014-08-18 Thread Abdiel Janulgue

Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
---
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp   | 6 --
 src/mesa/drivers/dri/i965/brw_vec4.h | 1 +
 src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 6 --
 3 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
index a243003..c4e6c6d 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
@@ -1483,6 +1483,7 @@ void
 fs_generator::generate_code(exec_list *instructions)
 {
    int start_offset = p->next_insn_offset;
+   int loop_count = 0;
 
    struct annotation_info annotation;
    memset(&annotation, 0, sizeof(annotation));
@@ -1743,6 +1744,7 @@ fs_generator::generate_code(exec_list *instructions)
 
   case BRW_OPCODE_WHILE:
 brw_WHILE(p);
+ loop_count++;
 break;
 
   case SHADER_OPCODE_RCP:
@@ -1970,9 +1972,9 @@ fs_generator::generate_code(exec_list *instructions)
          fprintf(stderr, "Native code for blorp program (SIMD%d dispatch):\n",
                  dispatch_width);
       }
-      fprintf(stderr, "SIMD%d shader: %d instructions. Compacted %d to %d"
+      fprintf(stderr, "SIMD%d shader: %d instructions. %d loops. Compacted %d to %d"
                       " bytes (%.0f%%)\n",
-              dispatch_width, before_size / 16, before_size, after_size,
+              dispatch_width, before_size / 16, loop_count, before_size, after_size,
               100.0f * (before_size - after_size) / before_size);
 
       const struct gl_program *prog = fp ? &fp->Base : NULL;
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h 
b/src/mesa/drivers/dri/i965/brw_vec4.h
index c59d24f..c333baa 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4.h
@@ -699,6 +699,7 @@ private:
 
void *mem_ctx;
const bool debug_flag;
+   int loop_count;
 };
 
 } /* namespace brw */
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
index 1b1e647..b8948c3 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
@@ -1188,6 +1188,7 @@ vec4_generator::generate_vec4_instruction(vec4_instruction *instruction,
 
case BRW_OPCODE_WHILE:
   brw_WHILE(p);
+  loop_count++;
   break;
 
case SHADER_OPCODE_RCP:
@@ -1318,6 +1319,7 @@ vec4_generator::generate_code(exec_list *instructions)
 {
struct annotation_info annotation;
    memset(&annotation, 0, sizeof(annotation));
+   loop_count = 0;
 
cfg_t *cfg = NULL;
if (unlikely(debug_flag))
@@ -1372,9 +1374,9 @@ vec4_generator::generate_code(exec_list *instructions)
   } else {
          fprintf(stderr, "Native code for vertex program %d:\n", prog->Id);
       }
-      fprintf(stderr, "vec4 shader: %d instructions. Compacted %d to %d"
+      fprintf(stderr, "vec4 shader: %d instructions. %d loops. Compacted %d to %d"
                       " bytes (%.0f%%)\n",
-              before_size / 16, before_size, after_size,
+              before_size / 16, loop_count, before_size, after_size,
               100.0f * (before_size - after_size) / before_size);
 
       dump_assembly(p->store, annotation.ann_count, annotation.ann, brw, prog);
-- 
1.9.1
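
Not part of the patch: the counting scheme above, one increment per WHILE emitted, can be sketched in isolation as below. The Opcode enum and count_loops are hypothetical. The real generators increment loop_count inside their opcode switch rather than in a separate pass.

```cpp
#include <vector>

// Hypothetical opcode set; only WHILE matters for the count.
enum class Opcode { DO, WHILE, ADD, MOV };

// Each loop is closed by exactly one WHILE, so counting WHILE opcodes
// counts loops, including nested ones.
static int count_loops(const std::vector<Opcode> &insts)
{
   int loop_count = 0;
   for (Opcode op : insts) {
      if (op == Opcode::WHILE)
         loop_count++;
   }
   return loop_count;
}
```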


[Mesa-dev] [PATCH 09/17] glsl: Implement saturate as ir_unop_saturate

2014-08-18 Thread Abdiel Janulgue

Now that we have the ir_unop_saturate implemented as a single
instruction, generate the correct simplified expression.

Reviewed-by: Matt Turner matts...@gmail.com
Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
---
 src/glsl/ir_builder.cpp | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/src/glsl/ir_builder.cpp b/src/glsl/ir_builder.cpp
index f039414..a2f6f29 100644
--- a/src/glsl/ir_builder.cpp
+++ b/src/glsl/ir_builder.cpp
@@ -271,11 +271,7 @@ clamp(operand a, operand b, operand c)
 ir_expression *
 saturate(operand a)
 {
-   void *mem_ctx = ralloc_parent(a.val);
-
-   return expr(ir_binop_max,
-  expr(ir_binop_min, a, new(mem_ctx) ir_constant(1.0f)),
-  new(mem_ctx) ir_constant(0.0f));
+   return expr(ir_unop_saturate, a);
 }
 
 ir_expression *
-- 
1.9.1


[Mesa-dev] [PATCH 11/17] glsl: Optimize clamp(x, 0.0, b), where b < 1.0 as min(saturate(x), b)

2014-08-18 Thread Abdiel Janulgue

v2: - Output min(saturate(x),b) instead of saturate(min(x,b)), as suggested by
      Ilia Mirkin
    - Make sure we do component-wise comparison for vectors (Ian Romanick)
v3: - Add missing condition where the outer constant value is zero and
      inner constant is < 1
    - Fix comments to reflect we are doing a commutative operation (Matt Turner)

Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
---
 src/glsl/opt_algebraic.cpp | 39 +++
 1 file changed, 39 insertions(+)

diff --git a/src/glsl/opt_algebraic.cpp b/src/glsl/opt_algebraic.cpp
index 4b052933..6dfb681 100644
--- a/src/glsl/opt_algebraic.cpp
+++ b/src/glsl/opt_algebraic.cpp
@@ -110,6 +110,33 @@ is_vec_basis(ir_constant *ir)
    return (ir == NULL) ? false : ir->is_basis();
 }
 
+static inline bool
+is_valid_vec_const(ir_constant *ir)
+{
+   if (ir == NULL)
+  return false;
+
+   if (!ir->type->is_scalar() && !ir->type->is_vector())
+  return false;
+
+   return true;
+}
+
+static inline bool
+is_less_than_one(ir_constant *ir)
+{
+   if (!is_valid_vec_const(ir))
+  return false;
+
+   unsigned component = 0;
+   for (int c = 0; c < ir->type->vector_elements; c++) {
+      if (ir->get_float_component(c) < 1.0f)
+         component++;
+   }
+
+   return (component == ir->type->vector_elements);
+}
+
 static void
 update_type(ir_expression *ir)
 {
@@ -645,6 +672,18 @@ ir_algebraic_visitor::handle_expression(ir_expression *ir)
             if ((outer_const->is_one() && inner_val_a->is_zero()) ||
                 (inner_val_a->is_one() && outer_const->is_zero()))
                return saturate(inner_val_b);
+
+            /* Found a {min|max} ({max|min} (x, 0.0), b) where b < 1.0
+             * and its variations
+             */
+            if (is_less_than_one(outer_const) && inner_val_b->is_zero())
+               return expr(ir_binop_min, saturate(inner_val_a), outer_const);
+
+            if (!inner_val_b->as_constant())
+               continue;
+
+            if (is_less_than_one(inner_val_b->as_constant()) && outer_const->is_zero())
+               return expr(ir_binop_min, saturate(inner_val_a), inner_val_b);
  }
   }
 
-- 
1.9.1
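
Not part of the patch: a standalone check of the identity the subject line relies on, clamp(x, 0.0, b) == min(saturate(x), b), together with a component-wise "all less than one" test in the spirit of is_less_than_one(). The helper names and the use of std::vector for the constant's components are assumptions made for this sketch.

```cpp
#include <algorithm>
#include <vector>

static float saturate(float x) { return std::min(std::max(x, 0.0f), 1.0f); }

// Reference form: clamp(x, lo, hi) written as min(max(x, lo), hi).
static float clamp_ref(float x, float lo, float hi)
{
   return std::min(std::max(x, lo), hi);
}

// Rewritten form produced by the optimization: min(saturate(x), b).
static float clamp_via_saturate(float x, float b)
{
   return std::min(saturate(x), b);
}

// True only if there is at least one component and every one is < 1.0.
static bool all_less_than_one(const std::vector<float> &components)
{
   if (components.empty())
      return false;
   return std::all_of(components.begin(), components.end(),
                      [](float v) { return v < 1.0f; });
}
```

The upper bound matters: for b = 2.0 and x = 1.5 the reference clamp yields 1.5 while min(saturate(x), b) yields 1.0, so the rewrite is only safe when every component of b is below 1.0.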


[Mesa-dev] [PATCH 06/17] ir_to_mesa, glsl_to_tgsi: Add support for ir_unop_saturate

2014-08-18 Thread Abdiel Janulgue

Reviewed-by: Matt Turner matts...@gmail.com
Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
---
 src/mesa/program/ir_to_mesa.cpp| 6 ++
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 6 ++
 2 files changed, 12 insertions(+)

diff --git a/src/mesa/program/ir_to_mesa.cpp b/src/mesa/program/ir_to_mesa.cpp
index e8126b3..f212aed 100644
--- a/src/mesa/program/ir_to_mesa.cpp
+++ b/src/mesa/program/ir_to_mesa.cpp
@@ -1171,6 +1171,12 @@ ir_to_mesa_visitor::visit(ir_expression *ir)
   emit(ir, OPCODE_DDY, result_dst, op[0]);
   break;
 
+   case ir_unop_saturate: {
+  ir_to_mesa_instruction *inst = emit(ir, OPCODE_MOV,
+  result_dst, op[0]);
+      inst->saturate = true;
+  break;
+   }
case ir_unop_noise: {
   const enum prog_opcode opcode =
 prog_opcode(OPCODE_NOISE1
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 575da1e..55b9940 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -1460,6 +1460,12 @@ glsl_to_tgsi_visitor::visit(ir_expression *ir)
case ir_unop_cos_reduced:
   emit_scs(ir, TGSI_OPCODE_COS, result_dst, op[0]);
   break;
+   case ir_unop_saturate: {
+  glsl_to_tgsi_instruction *inst;
+  inst = emit(ir, TGSI_OPCODE_MOV, result_dst, op[0]);
+      inst->saturate = true;
+  break;
+   }
 
case ir_unop_dFdx:
case ir_unop_dFdx_coarse:
-- 
1.9.1


[Mesa-dev] [PATCH 10/17] glsl: Optimize clamp(x, 0, 1) as saturate(x)

2014-08-18 Thread Abdiel Janulgue

v2: - Check that the base type is float (Ian Romanick)
v3: - Make sure comments reflect that we are doing a commutative operation
    - Add missing condition where the inner constant is 1.0 and outer constant
      is 0.0
    - Make indexing of operands easier to read (Matt Turner)

Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
---
 src/glsl/opt_algebraic.cpp | 36 
 1 file changed, 36 insertions(+)

diff --git a/src/glsl/opt_algebraic.cpp b/src/glsl/opt_algebraic.cpp
index ac7514a..4b052933 100644
--- a/src/glsl/opt_algebraic.cpp
+++ b/src/glsl/opt_algebraic.cpp
@@ -614,6 +614,42 @@ ir_algebraic_visitor::handle_expression(ir_expression *ir)
 
   break;
 
+   case ir_binop_min:
+   case ir_binop_max:
+      if (ir->type->base_type != GLSL_TYPE_FLOAT)
+         break;
+
+      /* Replace min(max) operations and its commutative combinations with
+       * a saturate operation
+       */
+      for (int op = 0; op < 2; op++) {
+         ir_expression *minmax = op_expr[op];
+         ir_constant *outer_const = op_const[1 - op];
+         ir_expression_operation op_cond = (ir->operation == ir_binop_max) ?
+            ir_binop_min : ir_binop_max;
+
+         if (!minmax || !outer_const || (minmax->operation != op_cond))
+            continue;
+
+         /* Found a min(max) combination. Now try to see if its operands
+          * meet our conditions that we can do just a single saturate operation
+          */
+         for (int minmax_op = 0; minmax_op < 2; minmax_op++) {
+            ir_rvalue *inner_val_a = minmax->operands[minmax_op];
+            ir_rvalue *inner_val_b = minmax->operands[1 - minmax_op];
+
+            if (!inner_val_a || !inner_val_b)
+               continue;
+
+            /* Found a {min|max} ({max|min} (x, 0.0), 1.0) operation and its variations */
+            if ((outer_const->is_one() && inner_val_a->is_zero()) ||
+                (inner_val_a->is_one() && outer_const->is_zero()))
+               return saturate(inner_val_b);
+ }
+  }
+
+  break;
+
case ir_unop_rcp:
       if (op_expr[0] && op_expr[0]->operation == ir_unop_rcp)
          return op_expr[0]->operands[0];
-- 
1.9.1
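
Not part of the patch: a toy matcher for the {min|max}({max|min}(x, 0.0), 1.0) shape and its operand-order variations. Expr, Kind and match_saturate are hypothetical and far simpler than the ir_algebraic_visitor code. Note one deliberate difference, which is an assumption of this sketch: each bound is tied to the operation that uses it (1.0 with min, 0.0 with max), which is stricter than the constant check as posted.

```cpp
// Hypothetical toy expression node; NOT Mesa's ir_expression.
enum class Kind { Var, Const, Min, Max };

struct Expr {
   Kind kind;
   float value;          // only meaningful for Kind::Const
   const Expr *ops[2];   // operands of Min/Max, otherwise null
};

static bool is_const(const Expr *e, float v)
{
   return e && e->kind == Kind::Const && e->value == v;
}

// Returns x if e is min(max(x, 0), 1) or max(min(x, 1), 0) with the operands
// in either order at both levels; returns nullptr otherwise.
static const Expr *match_saturate(const Expr *e)
{
   if (!e || (e->kind != Kind::Min && e->kind != Kind::Max))
      return nullptr;

   const Kind inner_kind = (e->kind == Kind::Max) ? Kind::Min : Kind::Max;
   const float outer_bound = (e->kind == Kind::Min) ? 1.0f : 0.0f;
   const float inner_bound = (e->kind == Kind::Min) ? 0.0f : 1.0f;

   for (int op = 0; op < 2; op++) {
      const Expr *inner = e->ops[op];
      const Expr *outer_const = e->ops[1 - op];

      if (!inner || inner->kind != inner_kind ||
          !is_const(outer_const, outer_bound))
         continue;

      for (int i = 0; i < 2; i++) {
         if (is_const(inner->ops[i], inner_bound))
            return inner->ops[1 - i];
      }
   }
   return nullptr;
}
```

The two nested loops play the same role as the op and minmax_op loops in the patch: the outer one tries both operand positions for the nested expression, the inner one tries both positions for the inner constant.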


[Mesa-dev] [PATCH 03/17] glsl: Add constant evaluation of ir_unop_saturate

2014-08-18 Thread Abdiel Janulgue

v2: Use CLAMP macro (Ian Romanick)

Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
---
 src/glsl/ir_constant_expression.cpp | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/src/glsl/ir_constant_expression.cpp 
b/src/glsl/ir_constant_expression.cpp
index 9606021..1e8b3a3 100644
--- a/src/glsl/ir_constant_expression.cpp
+++ b/src/glsl/ir_constant_expression.cpp
@@ -1469,6 +1469,12 @@ ir_expression::constant_expression_value(struct hash_table *variable_context)
       }
       break;
 
+   case ir_unop_saturate:
+      for (unsigned c = 0; c < components; c++) {
+         data.f[c] = CLAMP(op[0]->value.f[c], 0.0f, 1.0f);
+  }
+  break;
+
case ir_triop_bitfield_extract: {
       int offset = op[1]->value.i[0];
       int bits = op[2]->value.i[0];
-- 
1.9.1
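
Not part of the patch: a standalone sketch of the component-wise constant folding above. fold_saturate is a hypothetical helper, std::array stands in for the constant's data union, and std::min/std::max replace Mesa's CLAMP macro so the block compiles on its own.

```cpp
#include <algorithm>
#include <array>

// Fold saturate() over the first `components` elements of a constant.
// Components beyond that count are left at 0.0 in the result.
static std::array<float, 4> fold_saturate(const std::array<float, 4> &in,
                                          unsigned components)
{
   std::array<float, 4> out{};
   for (unsigned c = 0; c < components && c < in.size(); c++)
      out[c] = std::min(std::max(in[c], 0.0f), 1.0f);
   return out;
}
```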


[Mesa-dev] [PATCH 12/17] glsl: Optimize clamp(x, b, 1.0), where b > 0.0 as max(saturate(x), b)

2014-08-18 Thread Abdiel Janulgue

v2: - Output max(saturate(x),b) instead of saturate(max(x,b))
    - Make sure we do component-wise comparison for vectors (Ian Romanick)
v3: - Add missing condition where the outer constant value is > 0.0 and
      inner constant is 1.0.
    - Fix comments to show that the optimization is a commutative operation
      (Matt Turner)

Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
---
 src/glsl/opt_algebraic.cpp | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/src/glsl/opt_algebraic.cpp b/src/glsl/opt_algebraic.cpp
index 6dfb681..447618f 100644
--- a/src/glsl/opt_algebraic.cpp
+++ b/src/glsl/opt_algebraic.cpp
@@ -137,6 +137,21 @@ is_less_than_one(ir_constant *ir)
    return (component == ir->type->vector_elements);
 }
 
+static inline bool
+is_greater_than_zero(ir_constant *ir)
+{
+   if (!is_valid_vec_const(ir))
+  return false;
+
+   unsigned component = 0;
+   for (int c = 0; c < ir->type->vector_elements; c++) {
+      if (ir->get_float_component(c) > 0.0f)
+         component++;
+   }
+
+   return (component == ir->type->vector_elements);
+}
+
 static void
 update_type(ir_expression *ir)
 {
@@ -684,6 +699,14 @@ ir_algebraic_visitor::handle_expression(ir_expression *ir)
 
             if (is_less_than_one(inner_val_b->as_constant()) && outer_const->is_zero())
                return expr(ir_binop_min, saturate(inner_val_a), inner_val_b);
+
+            /* Found a {min|max} ({max|min} (x, b), 1.0), where b > 0.0
+             * and its variations
+             */
+            if (outer_const->is_one() && is_greater_than_zero(inner_val_b->as_constant()))
+               return expr(ir_binop_max, saturate(inner_val_a), inner_val_b);
+            if (inner_val_b->as_constant()->is_one() && is_greater_than_zero(outer_const))
+               return expr(ir_binop_max, saturate(inner_val_a), outer_const);
  }
   }
 
-- 
1.9.1


[Mesa-dev] [PATCH 16/17] i965/fs: Refactor try_emit_saturate

2014-08-18 Thread Abdiel Janulgue

v3: Since the fs backend can emit saturate as a separate instruction, there is
    no need to detect min/max instructions and rewrite the instruction tree
    accordingly. On the other hand, we don't need to emit a separate saturated
    mov either when the expression generating src can do saturate directly.

Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
---
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 23 ---
 1 file changed, 8 insertions(+), 15 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index c33c46b..aeb076a 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -267,17 +267,14 @@ fs_visitor::emit_minmax(enum brw_conditional_mod 
conditionalmod, const fs_reg d
}
 }
 
-/* Instruction selection: Produce a MOV.sat instead of
- * MIN(MAX(val, 0), 1) when possible.
- */
 bool
 fs_visitor::try_emit_saturate(ir_expression *ir)
 {
-   ir_rvalue *sat_val = ir->as_rvalue_to_saturate();
-
-   if (!sat_val)
+   if (ir->operation != ir_unop_saturate)
       return false;
 
+   ir_rvalue *sat_val = ir->operands[0];
+
    fs_inst *pre_inst = (fs_inst *) this->instructions.get_tail();
 
    sat_val->accept(this);
@@ -285,21 +282,17 @@ fs_visitor::try_emit_saturate(ir_expression *ir)
 
    fs_inst *last_inst = (fs_inst *) this->instructions.get_tail();
 
-   /* If the last instruction from our accept() didn't generate our
-    * src, generate a saturated MOV
+   /* If the last instruction from our accept() generated our
+    * src, just set the saturate flag instead of emitting a separate mov.
     */
    fs_inst *modify = get_instruction_generating_reg(pre_inst, last_inst, src);
-   if (!modify || modify->regs_written != 1) {
-      this->result = fs_reg(this, ir->type);
-      fs_inst *inst = emit(MOV(this->result, src));
-      inst->saturate = true;
-   } else {
+   if (modify && modify->regs_written == 1) {
       modify->saturate = true;
       this->result = src;
+      return true;
}
 
-
-   return true;
+   return false;
 }
 
 bool
-- 
1.9.1
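
Not part of the patch: a standalone sketch of the control flow after this refactor. If the last instruction emitted for the operand wrote the source register, the saturate flag is folded into it. Otherwise the caller falls back to an explicit saturated MOV, which in the series comes from the ir_unop_saturate case added in patch 07. Inst, try_fold_saturate and emit_saturate are hypothetical names, and registers are reduced to plain integers.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical instruction record: destination register and saturate flag.
struct Inst {
   int dst;
   bool saturate;
};

// first_new is the index of the first instruction emitted while visiting the
// operand. Returns true when the flag was folded into the producing
// instruction, false when the caller must emit its own saturated MOV.
static bool try_fold_saturate(std::vector<Inst> &insts, std::size_t first_new,
                              int src_reg)
{
   if (insts.size() <= first_new)   // visiting the operand emitted nothing
      return false;

   Inst &last = insts.back();
   if (last.dst != src_reg)         // last instruction did not produce src
      return false;

   last.saturate = true;
   return true;
}

// Caller side: fold when possible, otherwise emit MOV.sat into new_reg.
// Returns the register that holds the saturated value.
static int emit_saturate(std::vector<Inst> &insts, std::size_t first_new,
                         int src_reg, int new_reg)
{
   if (try_fold_saturate(insts, first_new, src_reg))
      return src_reg;

   insts.push_back({new_reg, true});
   return new_reg;
}
```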


[Mesa-dev] [PATCH 08/17] i965/vec4: Add support for ir_unop_saturate

2014-08-18 Thread Abdiel Janulgue

Reviewed-by: Matt Turner matts...@gmail.com
Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
---
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 4 
 1 file changed, 4 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
index f22d38d..2e7a85d 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
@@ -1389,6 +1389,10 @@ vec4_visitor::visit(ir_expression *ir)
case ir_unop_find_lsb:
   emit(FBL(result_dst, op[0]));
   break;
+   case ir_unop_saturate:
+  inst = emit(MOV(result_dst, op[0]));
+      inst->saturate = true;
+  break;
 
case ir_unop_noise:
       unreachable("not reached: should be handled by lower_noise");
-- 
1.9.1


[Mesa-dev] [PATCH 15/17] ir_to_mesa, glsl_to_tgsi: Remove try_emit_saturate

2014-08-18 Thread Abdiel Janulgue

Now that saturate is implemented natively as an instruction,
we can cut down on unneeded functionality.

Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
---
 src/mesa/program/ir_to_mesa.cpp| 48 
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 51 --
 2 files changed, 99 deletions(-)

diff --git a/src/mesa/program/ir_to_mesa.cpp b/src/mesa/program/ir_to_mesa.cpp
index f212aed..325946f 100644
--- a/src/mesa/program/ir_to_mesa.cpp
+++ b/src/mesa/program/ir_to_mesa.cpp
@@ -311,7 +311,6 @@ public:
  int mul_operand);
bool try_emit_mad_for_and_not(ir_expression *ir,
 int mul_operand);
-   bool try_emit_sat(ir_expression *ir);
 
void emit_swz(ir_expression *ir);
 
@@ -866,50 +865,6 @@ ir_to_mesa_visitor::try_emit_mad_for_and_not(ir_expression 
*ir, int try_operand)
return true;
 }
 
-bool
-ir_to_mesa_visitor::try_emit_sat(ir_expression *ir)
-{
-   /* Saturates were only introduced to vertex programs in
-* NV_vertex_program3, so don't give them to drivers in the VP.
-*/
-   if (this->prog->Target == GL_VERTEX_PROGRAM_ARB)
-      return false;
-
-   ir_rvalue *sat_src = ir->as_rvalue_to_saturate();
-   if (!sat_src)
-      return false;
-
-   sat_src->accept(this);
-   src_reg src = this->result;
-
-   /* If we generated an expression instruction into a temporary in
-    * processing the saturate's operand, apply the saturate to that
-    * instruction.  Otherwise, generate a MOV to do the saturate.
-    *
-    * Note that we have to be careful to only do this optimization if
-    * the instruction in question was what generated src->result.  For
-    * example, ir_dereference_array might generate a MUL instruction
-    * to create the reladdr, and return us a src reg using that
-    * reladdr.  That MUL result is not the value we're trying to
-    * saturate.
-    */
-   ir_expression *sat_src_expr = sat_src->as_expression();
-   ir_to_mesa_instruction *new_inst;
-   new_inst = (ir_to_mesa_instruction *)this->instructions.get_tail();
-   if (sat_src_expr && (sat_src_expr->operation == ir_binop_mul ||
-                        sat_src_expr->operation == ir_binop_add ||
-                        sat_src_expr->operation == ir_binop_dot)) {
-      new_inst->saturate = true;
-   } else {
-      this->result = get_temp(ir->type);
-      ir_to_mesa_instruction *inst;
-      inst = emit(ir, OPCODE_MOV, dst_reg(this->result), src);
-      inst->saturate = true;
-   }
-
-   return true;
-}
-
 void
 ir_to_mesa_visitor::reladdr_to_temp(ir_instruction *ir,
src_reg *reg, int *num_reladdr)
@@ -1072,9 +1027,6 @@ ir_to_mesa_visitor::visit(ir_expression *ir)
 return;
}
 
-   if (try_emit_sat(ir))
-  return;
-
    if (ir->operation == ir_quadop_vector) {
       this->emit_swz(ir);
   return;
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 55b9940..2946286 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -446,7 +446,6 @@ public:
   int mul_operand);
bool try_emit_mad_for_and_not(ir_expression *ir,
   int mul_operand);
-   bool try_emit_sat(ir_expression *ir);
 
void emit_swz(ir_expression *ir);
 
@@ -1270,53 +1269,6 @@ 
glsl_to_tgsi_visitor::try_emit_mad_for_and_not(ir_expression *ir, int try_operan
return true;
 }
 
-bool
-glsl_to_tgsi_visitor::try_emit_sat(ir_expression *ir)
-{
-   /* Emit saturates in the vertex shader only if SM 3.0 is supported.
-*/
-   if (this->prog->Target == GL_VERTEX_PROGRAM_ARB &&
-       !st_context(this->ctx)->has_shader_model3) {
-      return false;
-   }
-
-   ir_rvalue *sat_src = ir->as_rvalue_to_saturate();
-   if (!sat_src)
-      return false;
-
-   sat_src->accept(this);
-   st_src_reg src = this->result;
-
-   /* If we generated an expression instruction into a temporary in
-    * processing the saturate's operand, apply the saturate to that
-    * instruction.  Otherwise, generate a MOV to do the saturate.
-    *
-    * Note that we have to be careful to only do this optimization if
-    * the instruction in question was what generated src->result.  For
-    * example, ir_dereference_array might generate a MUL instruction
-    * to create the reladdr, and return us a src reg using that
-    * reladdr.  That MUL result is not the value we're trying to
-    * saturate.
-    */
-   ir_expression *sat_src_expr = sat_src->as_expression();
-   if (sat_src_expr && (sat_src_expr->operation == ir_binop_mul ||
-                        sat_src_expr->operation == ir_binop_add ||
-                        sat_src_expr->operation == ir_binop_dot)) {
-      glsl_to_tgsi_instruction *new_inst;
-      new_inst = (glsl_to_tgsi_instruction *)this->instructions.get_tail();
-      new_inst->saturate = true;
-   } else {
-      this->result = get_temp(ir->type);
-  st_dst_reg result_dst =

Re: [Mesa-dev] [PATCH 10/19] auxiliary/os: introduce os_get_total_physical_memory helper function

2014-08-18 Thread Jon TURNEY


On 14/08/2014 23:18, Emil Velikov wrote:

Cc: Alexander von Gluck IV kallis...@unixzen.com
Signed-off-by: Emil Velikov emil.l.veli...@gmail.com
---
  src/gallium/auxiliary/os/os_misc.c | 64 ++
  src/gallium/auxiliary/os/os_misc.h |  7 +
  2 files changed, 71 insertions(+)


Since this #errors on unknown platforms, teach it about the existence of 
Cygwin.


From 03e0df4455810e255c22a0532b9e66dcc3d60a1d Mon Sep 17 00:00:00 2001
From: Jon TURNEY jon.tur...@dronecode.org.uk
Date: Sun, 17 Aug 2014 17:21:27 +0100
Subject: [PATCH] Teach os_get_physical_memory about Cygwin

Signed-off-by: Jon TURNEY jon.tur...@dronecode.org.uk
---
 src/gallium/auxiliary/os/os_misc.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/auxiliary/os/os_misc.c 
b/src/gallium/auxiliary/os/os_misc.c
index 3846a9a..ef84c79 100644
--- a/src/gallium/auxiliary/os/os_misc.c
+++ b/src/gallium/auxiliary/os/os_misc.c
@@ -47,7 +47,7 @@
 #endif
 
 
-#if defined(PIPE_OS_LINUX)
+#if defined(PIPE_OS_LINUX) || defined(PIPE_OS_CYGWIN)
 #  include <unistd.h>
 #elif defined(PIPE_OS_APPLE) || defined(PIPE_OS_BSD)
 #  include <sys/sysctl.h>
@@ -111,7 +111,7 @@ os_get_option(const char *name)
 bool
 os_get_total_physical_memory(uint64_t *size)
 {
-#if defined(PIPE_OS_LINUX)
+#if defined(PIPE_OS_LINUX) || defined(PIPE_OS_CYGWIN)
const long phys_pages = sysconf(_SC_PHYS_PAGES);
const long page_size = sysconf(_SC_PAGE_SIZE);
 
-- 
1.8.5.5
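
For illustration only, the Linux/Cygwin branch that this patch enables can be sketched as a standalone function. This is a minimal sketch, not the Mesa implementation: the name total_physical_memory and the error handling are assumptions, and _SC_PHYS_PAGES is an extension that is available on Linux and Cygwin but not required by POSIX.

```cpp
#include <cstdint>
#include <unistd.h>

// Hypothetical stand-in for os_get_total_physical_memory(); only the
// sysconf()-based branch shown in the patch is sketched here.
bool total_physical_memory(uint64_t *size)
{
   const long phys_pages = sysconf(_SC_PHYS_PAGES);
   const long page_size = sysconf(_SC_PAGE_SIZE);

   // sysconf() returns -1 when a value is unavailable (assumed error path).
   if (phys_pages <= 0 || page_size <= 0)
      return false;

   // Widen before multiplying so the product cannot overflow a 32-bit long.
   *size = static_cast<uint64_t>(phys_pages) * static_cast<uint64_t>(page_size);
   return true;
}
```

On platforms where neither this nor the sysctl branch applies, the original code hits #error, which is why Cygwin has to be named explicitly.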


Re: [Mesa-dev] [PATCH 10/19] auxiliary/os: introduce os_get_total_physical_memory helper function

2014-08-18 Thread Emil Velikov

On 18/08/14 13:20, Jon TURNEY wrote:
 On 14/08/2014 23:18, Emil Velikov wrote:
 Cc: Alexander von Gluck IV kallis...@unixzen.com
 Signed-off-by: Emil Velikov emil.l.veli...@gmail.com
 ---
   src/gallium/auxiliary/os/os_misc.c | 64
 ++
   src/gallium/auxiliary/os/os_misc.h |  7 +
   2 files changed, 71 insertions(+)
 
 Since this #errors on unknown platforms, teach it about the existence of 
 Cygwin.
 
 
 0001-Teach-os_get_physical_memory-about-Cygwin.patch
 
 
 From 03e0df4455810e255c22a0532b9e66dcc3d60a1d Mon Sep 17 00:00:00 2001
 From: Jon TURNEY jon.tur...@dronecode.org.uk
 Date: Sun, 17 Aug 2014 17:21:27 +0100
 Subject: [PATCH] Teach os_get_physical_memory about Cygwin
 
 Signed-off-by: Jon TURNEY jon.tur...@dronecode.org.uk
I was under the strange impression that p_config.h would set PIPE_OS_LINUX
for Cygwin. It seems I got confused with PIPE_OS_UNIX.

Reviewed-by: Emil Velikov emil.l.veli...@gmail.com

 ---
  src/gallium/auxiliary/os/os_misc.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)
 
 diff --git a/src/gallium/auxiliary/os/os_misc.c 
 b/src/gallium/auxiliary/os/os_misc.c
 index 3846a9a..ef84c79 100644
 --- a/src/gallium/auxiliary/os/os_misc.c
 +++ b/src/gallium/auxiliary/os/os_misc.c
 @@ -47,7 +47,7 @@
  #endif
  
  
 -#if defined(PIPE_OS_LINUX)
 +#if defined(PIPE_OS_LINUX) || defined(PIPE_OS_CYGWIN)
  #  include <unistd.h>
  #elif defined(PIPE_OS_APPLE) || defined(PIPE_OS_BSD)
  #  include <sys/sysctl.h>
 @@ -111,7 +111,7 @@ os_get_option(const char *name)
  bool
  os_get_total_physical_memory(uint64_t *size)
  {
 -#if defined(PIPE_OS_LINUX)
 +#if defined(PIPE_OS_LINUX) || defined(PIPE_OS_CYGWIN)
 const long phys_pages = sysconf(_SC_PHYS_PAGES);
 const long page_size = sysconf(_SC_PAGE_SIZE);
  
 -- 1.8.5.5
 
 
 
 


Re: [Mesa-dev] [PATCH 3/3] clover: unsure compat::string is \0 terminated

2014-08-18 Thread Francisco Jerez

EdB edb+m...@sigluy.net writes:

 otherwise c_str() is not safe
 ---
  src/gallium/state_trackers/clover/util/compat.hpp | 54 
 ---
  1 file changed, 48 insertions(+), 6 deletions(-)

 diff --git a/src/gallium/state_trackers/clover/util/compat.hpp 
 b/src/gallium/state_trackers/clover/util/compat.hpp
 index 6f0f7cc..7ca1f85 100644
 --- a/src/gallium/state_trackers/clover/util/compat.hpp
 +++ b/src/gallium/state_trackers/clover/util/compat.hpp
 @@ -197,7 +197,7 @@ namespace clover {
  return _p[i];
   }
  
 -  private:
 +  protected:
   iterator _p;  //memory array
   size_type _s; //size
   size_type _c; //capacity
 @@ -306,18 +306,56 @@ namespace clover {
  
       class string : public vector<char> {
public:
 - string() : vector() {
 + string() : vector(0, 1) {
 +_p[_s - 1] = '\0';
   }
  
 - string(const char *p) : vector(p, std::strlen(p)) {
 + string(const char *p) : vector(p, std::strlen(p) + 1) {
 +_p[_s - 1] = '\0';
   }
  
          template<typename C>
 - string(const C &v) : vector(v) {
 + string(const C &v) : vector(&*v.begin(), v.size() + 1) {
 +_p[_s - 1] = '\0';
   }
  
 - operator std::string() const {
 -return std::string(begin(), end());
 + void
 + reserve(size_type m) {
 +vector::reserve(m + 1);
 + }
 +
 + void
 + resize(size_type m, char x = '\0') {
 +vector::resize(m + 1, x);
 +_p[_s - 1] = '\0';
 + }
 +
 + void
 + push_back(char x) {
 +reserve(_s + 1);
 +_p[_s - 1] = x;
 +_p[_s] = '\0';
 +++_s;
 + }
 +
 + size_type
 + size() const {
 +return _s - 1;
 + }
 +
 + size_type
 + capacity() const {
 +return _c - 1;
 + }
 +
 + iterator
 + end() {
 +return _p + size();
 + }
 +
 + const_iterator
 + end() const {
 +return _p + size();
   }
  

At this point where all methods from the base class need to be redefined
it probably stops making sense to use inheritance instead of
aggregation.  Once we've done that fixing c_str() gets a lot easier (two
lines of code) because we can just declare the container as mutable and
fix up the NULL terminator when c_str() is called.  Both changes
attached.

   const char *
 @@ -325,6 +363,10 @@ namespace clover {
  return begin();
   }
  
 + operator std::string() const {
 +return std::string(begin(), end());
 + }
 +
   const char *
          find(const string &s) const {
             for (size_t i = 0; i + s.size() < size(); ++i) {
 -- 
 2.0.4


From e1e97e017f25f4ed1c75bae71095ffa116374654 Mon Sep 17 00:00:00 2001
From: Francisco Jerez curroje...@riseup.net
Date: Mon, 18 Aug 2014 15:21:52 +0300
Subject: [PATCH 1/2] clover/util: Implement compat::string using aggregation
 instead of inheritance.

---
 src/gallium/state_trackers/clover/util/compat.hpp | 76 +--
 1 file changed, 71 insertions(+), 5 deletions(-)

diff --git a/src/gallium/state_trackers/clover/util/compat.hpp b/src/gallium/state_trackers/clover/util/compat.hpp
index a4e3938..e0ab965 100644
--- a/src/gallium/state_trackers/clover/util/compat.hpp
+++ b/src/gallium/state_trackers/clover/util/compat.hpp
@@ -280,20 +280,83 @@ namespace clover {
  size_t offset;
   };
 
-  class string : public vector<char> {
+  class string {
   public:
- string() : vector() {
+ typedef char *iterator;
+ typedef const char *const_iterator;
+ typedef char value_type;
+ typedef char &reference;
+ typedef const char &const_reference;
+ typedef std::ptrdiff_t difference_type;
+ typedef std::size_t size_type;
+
+ string() : v() {
  }
 
- string(const char *p) : vector(p, std::strlen(p)) {
+ string(const char *p) : v(p, std::strlen(p)) {
  }
 
  template<typename C>
- string(const C &v) : vector(v) {
+ string(const C &v) : v(v) {
  }
 
  operator std::string() const {
-return std::string(begin(), end());
+return std::string(v.begin(), v.end());
+ }
+
+ void
+ reserve(size_type n) {
+v.reserve(n);
+ }
+
+ void
+ resize(size_type n, char x = char()) {
+v.resize(n, x);
+ }
+
+ void
+ push_back(char x) {
+v.push_back(x);
+ }
+
+ size_type
+ size() const {
+return v.size();
+

Re: [Mesa-dev] [RFC PATCH 00/16] A new IR for Mesa

2014-08-18 Thread Roland Scheidegger

On 16.08.2014 02:12, Connor Abbott wrote:
I know what you might be thinking right now. Wait, *another* IR? Don't
we already have like 5 of those, not counting all the driver-specific
ones? Isn't this stuff complicated enough already? Well, there are some
pretty good reasons to start afresh (again...). In the years we've been
using GLSL IR, we've come to realize that, in fact, it's not what we
want *at all* to do optimizations on. Ian has done a talk at FOSDEM that
highlights some of the problems they've run into:

https://video.fosdem.org/2014/H1301_Cornil/Saturday/Three_Years_Experience_with_a_Treelike_Shader_IR.webm

But here's the summary:

* GLSL IR is way too much of a memory hog, since it has to make a new
variable for each temporary the compiler creates and then each time you
want to dereference that temporary you need to create an
ir_dereference_variable that points to it which is also very
cache-unfriendly (downright cache-mean!).

* The expression trees were originally added so that we could do
pattern matching to automatically optimize things, but this turned out
to be both very difficult to do and not very helpful. Instead, all it
does is add more complexity to the IR without much benefit - with SSA or
having proper use-def chains, we could get back what the trees give us
while also being able to do lots more optimizations.

* We don't have the concept of basic blocks in GLSL IR, which makes a
lot of optimizations harder because they were originally designed with
basic blocks in mind - take, for example, my SSA series. I had to map a
whole lot of concepts that were based on the control flow graph to this
tree of statements that GLSL IR uses, and the end result wound up
looking nothing at all like the original paper. This problem gets even
worse for things like e.g. Global Code Motion that depend upon having
the dominance tree.

I originally wanted to modify GLSL IR to fix these problems by adding
new instruction types that would address these issues and then
converting back and forth between the old and the new form, but I
realized that fixing all the problems would basically mean a complete
rewrite - and if that's the case, then why don't we start from scratch?
So I took Ken's suggestions and started designing, and then at Intel
over the summer started implementing, a completely new IR which I call
NIR that's at a lower level than GLSL IR, but still high-level enough to
be mostly device-independent (different drivers may have different
passes and different ways of lowering e.g. matrix multiplies) so that
we can do generic optimizations on it. Having support for SSA from the
beginning was also a must, because lots of optimisations that we really
want for cleaning up DX9-translated games are either a lot easier in or
made possible by SSA. I also made the decision for it to be typeless,
because that's what the cool kids are all doing :) and for a
lower-level, flat IR it seemed like the thing to do (it could have gone
either way, though). So the key design points of NIR (pronounced either
like "near" as in "NIR is near!" or to rhyme with "burr") are:

* It's flat (no expression trees)

* It's typeless

* Modifiers (abs, negate, saturate), swizzles, and write masks are part
of ALU instructions

* It includes enough GLSL-like things (variables that you can load from
or store to, function calls) to stay hardware-agnostic and to allow
optimizations at a high level. We don't have a way to represent matrix
multiplies right now, but that could easily be added. Lowering passes
then convert variables to registers and input/output/uniform
loads/stores, which opens up more opportunities for optimization and
saves memory at the cost of being more hardware-specific.

* Control flow consists of a tree of if statements and loops, like in
GLSL IR, except the leaves of the tree are now basic blocks instead of
instructions. Also, each basic block keeps track of its successors and
predecessors, so the control flow graph is explicit in the IR.

* SSA is natively supported, and SSA uses point directly to the SSA
definition, which means that the use-def chains are always there, and
def-use chains are kept by tracking the set of all uses for each
definition.

* It's written in C.

(see the README in patch 3 and nir.h in patch 4 for more details)
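
As a rough illustration of the flat, SSA-based design points listed above, here is a minimal sketch. The real NIR is written in C and its structures are defined in nir.h; every name below (Instr, Block, add_src, add_edge) is hypothetical and chosen only for this example.

```cpp
#include <string>
#include <vector>

// Hypothetical names throughout; this is not the layout of nir.h.
struct Instr {
   std::string op;             // e.g. "load" or "fmul"
   std::vector<Instr *> srcs;  // use-def: each source points at its SSA definition
   std::vector<Instr *> uses;  // def-use: every instruction reading this value
};

struct Block {
   std::vector<Instr *> instrs;        // flat list, no expression trees
   std::vector<Block *> successors;    // explicit control flow graph edges
   std::vector<Block *> predecessors;
};

// Record that `user` reads the value defined by `def`, keeping both the
// use-def and the def-use chain up to date.
inline void add_src(Instr &user, Instr &def)
{
   user.srcs.push_back(&def);
   def.uses.push_back(&user);
}

// Add a control flow edge and update both ends of it.
inline void add_edge(Block &from, Block &to)
{
   from.successors.push_back(&to);
   to.predecessors.push_back(&from);
}
```

Because a source is a direct pointer to its definition, a pass can walk from a use to its definition or from a definition to all of its uses without searching the instruction list.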

Some things that are missing or could be improved:

* There's currently no alias tracking for inputs, outputs, and uniforms.
This is especially important for uniforms because we don't pack them
like we pack inputs and outputs.

* We need a way to represent matrix multiplies so that we can do
matrix-flipping optimizations

Re: [Mesa-dev] [PATCH 1/9] glsl: Optimize min/max expression trees

2014-08-18 Thread Petri Latvala


On 08/14/2014 04:33 AM, Ian Romanick wrote:

On 07/29/2014 02:36 AM, Petri Latvala wrote:

Add an optimization pass that drops min/max expression operands that
can be proven to not contribute to the final result. The algorithm is
similar to alpha-beta pruning on a minmax search, from the field of
AI.

This optimization pass can optimize min/max expressions where operands
are min/max expressions. Such code can appear in shaders by itself, or
as the result of clamp() or AMD_shader_trinary_minmax functions.
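
To make the idea concrete, here is a much simplified sketch on scalar doubles. It is not the pass from this patch: it only compares the value ranges of the two operands of each min/max and does not pass a base range down the tree the way the alpha-beta style algorithm does. All names (Node, Range, get_range, prune) are hypothetical.

```cpp
#include <algorithm>
#include <limits>
#include <memory>

struct Node {
   enum Kind { CONST, VAR, MIN, MAX } kind;
   double value;                // only meaningful for CONST
   std::shared_ptr<Node> a, b;  // only meaningful for MIN and MAX
};
using NodePtr = std::shared_ptr<Node>;

struct Range { double low, high; };

NodePtr constant(double v) { return std::make_shared<Node>(Node{Node::CONST, v, nullptr, nullptr}); }
NodePtr variable() { return std::make_shared<Node>(Node{Node::VAR, 0.0, nullptr, nullptr}); }
NodePtr op(Node::Kind k, NodePtr a, NodePtr b) { return std::make_shared<Node>(Node{k, 0.0, a, b}); }

// Conservative bounds on the value an expression can take.
Range get_range(const NodePtr &n)
{
   const double inf = std::numeric_limits<double>::infinity();
   if (n->kind == Node::CONST)
      return {n->value, n->value};
   if (n->kind == Node::VAR)
      return {-inf, inf};
   const Range ra = get_range(n->a), rb = get_range(n->b);
   if (n->kind == Node::MIN)
      return {std::min(ra.low, rb.low), std::min(ra.high, rb.high)};
   return {std::max(ra.low, rb.low), std::max(ra.high, rb.high)};
}

// Drop an operand when the other one is provably always selected.
NodePtr prune(const NodePtr &n)
{
   if (n->kind != Node::MIN && n->kind != Node::MAX)
      return n;
   const NodePtr a = prune(n->a), b = prune(n->b);
   const Range ra = get_range(a), rb = get_range(b);
   if (n->kind == Node::MIN) {
      if (ra.high <= rb.low) return a;  // a can never exceed b
      if (rb.high <= ra.low) return b;  // b can never exceed a
   } else {
      if (ra.low >= rb.high) return a;  // a can never be below b
      if (rb.low >= ra.high) return b;  // b can never be below a
   }
   return op(n->kind, a, b);
}
```

For example, in min(max(x, 3.0), 1.0) the inner max is always at least 3.0, so the whole expression reduces to the constant 1.0.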

This optimization pass improves the generated code for piglit's
AMD_shader_trinary_minmax tests as follows:

total instructions in shared programs: 75 -> 67 (-10.67%)
instructions in affected programs: 60 -> 52 (-13.33%)
GAINED:0
LOST:  0

All tests (max3, min3, mid3) improved.

And I assume no piglit regressions?


Indeed no regressions, or new successes. I wrote that in the cover 
letter, I should have written it also in this patch's commit message...




Also... have you tried this in combination with Abdiel's related work on
saturates?


Tested the combination now, after some fighting with shader-db. The 
results are the same, except:

One shader from
Dungeon Defenders is hurt by shader-db metrics (26 -> 28), because of
dropping of a (constant float (0.0)) operand, which was
compiled to a saturate modifier.


This shader compiled into the same code with or without my patches.

Talked with Abdiel about the combination, recapping here: Our changes 
are orthogonal and not conflicting, so we can both proceed at our own paces.




Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76861
Signed-off-by: Petri Latvala petri.latv...@intel.com
---
  src/glsl/Makefile.sources   |   1 +
  src/glsl/glsl_parser_extras.cpp |   1 +
  src/glsl/ir_optimization.h  |   1 +
  src/glsl/opt_minmax.cpp | 395 
  4 files changed, 398 insertions(+)
  create mode 100644 src/glsl/opt_minmax.cpp

diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources
index b54eae7..1ee80a3 100644
--- a/src/glsl/Makefile.sources
+++ b/src/glsl/Makefile.sources
@@ -95,6 +95,7 @@ LIBGLSL_FILES = \
$(GLSL_SRCDIR)/opt_flip_matrices.cpp \
$(GLSL_SRCDIR)/opt_function_inlining.cpp \
$(GLSL_SRCDIR)/opt_if_simplification.cpp \
+   $(GLSL_SRCDIR)/opt_minmax.cpp \
$(GLSL_SRCDIR)/opt_noop_swizzle.cpp \
$(GLSL_SRCDIR)/opt_rebalance_tree.cpp \
$(GLSL_SRCDIR)/opt_redundant_jumps.cpp \
diff --git a/src/glsl/glsl_parser_extras.cpp b/src/glsl/glsl_parser_extras.cpp
index 890123a..9f57ef3 100644
--- a/src/glsl/glsl_parser_extras.cpp
+++ b/src/glsl/glsl_parser_extras.cpp
@@ -1561,6 +1561,7 @@ do_common_optimization(exec_list *ir, bool linked,
 else
progress = do_constant_variable_unlinked(ir) || progress;
 progress = do_constant_folding(ir) || progress;
+   progress = do_minmax_prune(ir) || progress;
 progress = do_cse(ir) || progress;
 progress = do_rebalance_tree(ir) || progress;
 progress = do_algebraic(ir, native_integers, options) || progress;
diff --git a/src/glsl/ir_optimization.h b/src/glsl/ir_optimization.h
index b83c225..9d22585 100644
--- a/src/glsl/ir_optimization.h
+++ b/src/glsl/ir_optimization.h
@@ -98,6 +98,7 @@ bool opt_flatten_nested_if_blocks(exec_list *instructions);
  bool do_discard_simplification(exec_list *instructions);
  bool lower_if_to_cond_assign(exec_list *instructions, unsigned max_depth = 0);
  bool do_mat_op_to_vec(exec_list *instructions);
+bool do_minmax_prune(exec_list *instructions);
  bool do_noop_swizzle(exec_list *instructions);
  bool do_structure_splitting(exec_list *instructions);
  bool do_swizzle_swizzle(exec_list *instructions);
diff --git a/src/glsl/opt_minmax.cpp b/src/glsl/opt_minmax.cpp
new file mode 100644
index 000..5656059
--- /dev/null
+++ b/src/glsl/opt_minmax.cpp
@@ -0,0 +1,395 @@
+/*
+ * Copyright © 2014 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF

Re: [Mesa-dev] [PATCH] squash! glsl: Optimize min/max expression trees

2014-08-18 Thread Petri Latvala


On 08/14/2014 07:04 AM, Matt Turner wrote:

---
I'd squash this in at minimum. The changes are

  - Whitespace
  - Removal of unnecessary destructor
  - Renaming one and two to a and b (one->value.u[c0] < two->value.u[c0]...)
  - continue -> break
  - assert(!...) -> unreachable
  - Not doing assignments in if conditionals
  - Marking swizzle_if_required as static


Thanks, I'll squash this in.


I also think less_all_components should just return an enum like
{ MIXED, EQUAL, LESS, GREATER }, rather than setting a variable in
the class. It, as well as smaller/larger_constant, can then be
static functions outside of the visitor.

Yes, I'll try what it looks like with that.
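
One possible shape for that suggestion, sketched on plain float vectors rather than ir_constant. The names compare_result and compare_components are hypothetical, and so is the choice to report LESS or GREATER when the remaining components are equal:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

enum compare_result { LESS, GREATER, EQUAL, MIXED };

// Component-wise comparison that reports its outcome through the return
// value instead of a flag in the visitor. Assumes both inputs are
// non-empty and either have the same size or one of them has size 1.
static compare_result
compare_components(const std::vector<float> &a, const std::vector<float> &b)
{
   // A one-element operand acts as a scalar and is compared against
   // every component of the other operand.
   const std::size_t a_inc = a.size() == 1 ? 0 : 1;
   const std::size_t b_inc = b.size() == 1 ? 0 : 1;
   const std::size_t components = std::max(a.size(), b.size());

   bool found_less = false;
   bool found_greater = false;

   // No early escape: every component must be seen to detect MIXED.
   for (std::size_t i = 0, c0 = 0, c1 = 0; i < components;
        c0 += a_inc, c1 += b_inc, ++i) {
      if (a[c0] < b[c1])
         found_less = true;
      else if (a[c0] > b[c1])
         found_greater = true;
   }

   if (found_less && found_greater)
      return MIXED;
   if (found_less)
      return LESS;
   if (found_greater)
      return GREATER;
   return EQUAL;
}
```

With the result carried in the return value, the function no longer needs access to the visitor's valid flag and can live outside the class as a static helper.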


I think the algorithm itself looks correct.

  src/glsl/opt_minmax.cpp | 145 +---
  1 file changed, 63 insertions(+), 82 deletions(-)

diff --git a/src/glsl/opt_minmax.cpp b/src/glsl/opt_minmax.cpp
index 5656059..b987386 100644
--- a/src/glsl/opt_minmax.cpp
+++ b/src/glsl/opt_minmax.cpp
@@ -37,12 +37,10 @@
  #include "glsl_types.h"
  #include "main/macros.h"
  
-namespace

-{
-class minmax_range
-{
-public:
+namespace {
  
+class minmax_range {

+public:
 minmax_range(ir_constant *low = NULL, ir_constant *high = NULL)
 {
range[0] = low;
@@ -60,60 +58,45 @@ public:
  class ir_minmax_visitor : public ir_rvalue_enter_visitor {
  public:
 ir_minmax_visitor()
-  : progress(false)
-  , valid(true)
-   {
-   }
-
-   virtual ~ir_minmax_visitor()
+  : progress(false), valid(true)
 {
 }
  
-   bool

-   less_all_components(ir_constant *one, ir_constant *two);
-
-   ir_constant *
-   smaller_constant(ir_constant *one, ir_constant *two);
-
-   ir_constant *
-   larger_constant(ir_constant *one, ir_constant *two);
+   bool less_all_components(ir_constant *a, ir_constant *b);
+   ir_constant *smaller_constant(ir_constant *a, ir_constant *b);
+   ir_constant *larger_constant(ir_constant *a, ir_constant *b);
  
-   minmax_range

-   combine_range(minmax_range r0, minmax_range r1, bool ismin);
+   minmax_range combine_range(minmax_range r0, minmax_range r1, bool ismin);
  
-   minmax_range

-   range_intersection(minmax_range r0, minmax_range r1);
+   minmax_range range_intersection(minmax_range r0, minmax_range r1);
  
-   minmax_range

-   get_range(ir_rvalue *rval);
+   minmax_range get_range(ir_rvalue *rval);
  
-   ir_rvalue *

-   prune_expression(ir_expression *expr, minmax_range baserange);
+   ir_rvalue *prune_expression(ir_expression *expr, minmax_range baserange);
  
-   void

-   handle_rvalue(ir_rvalue **rvalue);
+   void handle_rvalue(ir_rvalue **rvalue);
  
 bool progress;

 bool valid;
  };
  
  /*

- * Returns true if all vector components of `one' are less than of `two'.
+ * Returns true if all vector components of `a' are less than of `b'.
   *
   * If there are vector components that are less while others are greater, the
   * visitor is marked invalid and no further changes will be made to the IR.
   */
  bool
-ir_minmax_visitor::less_all_components(ir_constant *one, ir_constant *two)
+ir_minmax_visitor::less_all_components(ir_constant *a, ir_constant *b)
  {
-   assert(one != NULL);
-   assert(two != NULL);
+   assert(a != NULL);
+   assert(b != NULL);
  
-   assert(one->type->base_type == two->type->base_type);
+   assert(a->type->base_type == b->type->base_type);
  
-   unsigned oneinc = one->type->is_scalar() ? 0 : 1;
-   unsigned twoinc = two->type->is_scalar() ? 0 : 1;
-   unsigned components = MAX2(one->type->components(), two->type->components());
+   unsigned a_inc = a->type->is_scalar() ? 0 : 1;
+   unsigned b_inc = b->type->is_scalar() ? 0 : 1;
+   unsigned components = MAX2(a->type->components(), b->type->components());
  
 /* No early escape. We need to go through all components and mark the

  * visitor as invalid if comparison yields less for some components and
@@ -127,34 +110,34 @@ ir_minmax_visitor::less_all_components(ir_constant *one, 
ir_constant *two)
  
     for (unsigned i = 0, c0 = 0, c1 = 0;
          i < components;
-         c0 += oneinc, c1 += twoinc, ++i) {
-      switch (one->type->base_type) {
+         c0 += a_inc, c1 += b_inc, ++i) {
+      switch (a->type->base_type) {
       case GLSL_TYPE_UINT:
-         if (one->value.u[c0] < two->value.u[c1])
+         if (a->value.u[c0] < b->value.u[c1])
             foundless = true;
-         else if (one->value.u[c0] > two->value.u[c1])
+         else if (a->value.u[c0] > b->value.u[c1])
             foundgreater = true;
          else
             foundequal = true;
-         continue;
+         break;
       case GLSL_TYPE_INT:
-         if (one->value.i[c0] < two->value.i[c1])
+         if (a->value.i[c0] < b->value.i[c1])
             foundless = true;
-         else if (one->value.i[c0] > two->value.i[c1])
+         else if (a->value.i[c0] > b->value.i[c1])
             foundgreater = true;
          else
             foundequal = true;
-         continue;
+         break;
case

[Mesa-dev] [Bug 82538] Super Maryo Chronicles fails with st/mesa assertion failure

2014-08-18 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=82538

--- Comment #4 from smoki smoki00...@gmail.com ---
(In reply to comment #2)
 (In reply to comment #1)
 I can still reproduce it with current Mesa Git. Does your Mesa build have
 assertions enabled?
 

 Ah sorry did not have it that time, so yeah bug is there.
