date:20161220

Re: [Mesa-dev] New GBM backend for dEQP

2016-12-20 Thread Tapani Pälli


On 12/20/2016 09:10 PM, Chad Versace wrote:

On Mon 19 Dec 2016, Tapani Pälli wrote:


On 12/17/2016 03:58 AM, Chad Versace wrote:

Happy Christmas to everyone who's busy squashing dEQP bugs.

I wrote a new GBM backend for dEQP. I even submitted it to dEQP's
upstream Gerrit.  Pyry, dEQP's maintainer, told me over beer earlier
this year that he would accept it if I submitted it, and if it wasn't
too crazy. So, maybe it'll be upstream soon.

If you want to try it out, you can either fetch the patch from Gerrit:
 $ git fetch https://android.googlesource.com/platform/external/deqp refs/changes/43/315743/1

View it on Gerrit:
 https://android-review.googlesource.com/#/c/315743/

Fetch from my personal work-in-progress branch:
 $ git fetch git://git.kiwitree.net/~chadv/deqp refs/heads/wip/gbm

View it on my cgit:
 http://git.kiwitree.net/cgit/~chadv/deqp/log/?h=wip/gbm

GBM today does not support pixmaps nor pbuffers (eglCreatePixmapSurface
and eglCreatePbufferSurface), so the dEQP test coverage with GBM does
not have parity with X11. But, on the other hand, you get to run dEQP
without the headache of X11.

There's probably bugs. No surprises there.

Branch did not work 'out of the box' for me:

"No rule to make target 'framework/qphelper/.git/index', needed by
'framework/qphelper/qpReleaseInfo.inl'.  Stop."

(attached patch makes it work for me)

Strange. This may be related to a separate fix I submitted to dEQP
upstream:

 Subject: Fix build when '.git' is a gitfile
 https://android-review.googlesource.com/#/c/315234/


Yes, that is related. In my case the ${DE_GIT_DIR} variable contains just
'git', so I needed to concatenate ${CMAKE_SOURCE_DIR} with it so that the
dependency is found correctly; otherwise it thinks that 'git' is under the
qphelper path.



One issue is that real users will use X11, Wayland or Android. Would be cool
to have a 'switch' to toggle CI to use a particular backend so that most of
the time we would run against gbm but then sometimes check that X11 still
works etc.

Yes. I expect full test runs to be faster with GBM than with X11. If
that's true, then CI should default to running dEQP with GBM. And CI
should occasionally do a run with X11 to ensure there's no regressions,
and to also run any pbuffer and pixmap tests that get skipped on the
GBM run.



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] egl/dri2: implement query surface hook

2016-12-20 Thread Tapani Pälli


On 12/20/2016 08:58 PM, Chad Versace wrote:

On Tue 20 Dec 2016, Tapani Pälli wrote:

This gives a better guarantee that the values we return are
in sync with what the underlying drawable currently has.

Together with the dEQP change in bug #98327, this fixes the following test:

dEQP-EGL.functional.resize.surface_size.grow

Signed-off-by: Tapani Pälli 
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98327
---
  src/egl/drivers/dri2/platform_x11.c | 30 ++
  1 file changed, 30 insertions(+)



diff --git a/src/egl/drivers/dri2/platform_x11.c b/src/egl/drivers/dri2/platform_x11.c
index db7d3b9..0c5d577 100644
--- a/src/egl/drivers/dri2/platform_x11.c
+++ b/src/egl/drivers/dri2/platform_x11.c
@@ -395,6 +395,34 @@ dri2_x11_destroy_surface(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface *surf)
  }
  
  /**

+ * Function utilizes swrastGetDrawableInfo to get surface
+ * geometry from x server and calls default query surface
+ * implementation that returns the updated values.
+ *
+ * In case of errors we still return values that we currently
+ * have.
+ */
+static EGLBoolean
+dri2_query_surface(_EGLDriver *drv, _EGLDisplay *dpy,
+   _EGLSurface *surf, EGLint attribute,
+   EGLint *value)
+{
+   struct dri2_egl_display *dri2_dpy = dri2_egl_display(dpy);
+   struct dri2_egl_surface *dri2_surf = dri2_egl_surface(surf);
+   int x, y, w = -1, h = -1;
+
+   __DRIdrawable *drawable = dri2_dpy->vtbl->get_dri_drawable(surf);
+   swrastGetDrawableInfo(drawable, &x, &y, &w, &h, dri2_surf);
+
+   if (w != -1 && h != -1) {
+  surf->Width = w;
+  surf->Height = h;
+   }
+
+   return _eglQuerySurface(drv, dpy, surf, attribute, value);
+}

The patch looks correct to me, but it incurs an X11 roundtrip even when
one is not needed. A small change would ensure the roundtrip happens only
when needed. This is the same technique that platform_android.c uses to
avoid a SurfaceFlinger roundtrip.

switch (attribute) {
case EGL_WIDTH:
case EGL_HEIGHT:
...  /* Do what the patch does. Update width, height with swrastGetDrawableInfo. */
break;
default:
/* Do nothing */
break;
}


Right, makes sense. I'll make this modification and send v2.


return _eglQuerySurface(drv, dpy, surf, attribute, value);
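The suggestion above can be sketched outside of Mesa as follows. This is only a minimal model under stated assumptions: struct surface, fetch_geometry() and the ATTR_* values are hypothetical stand-ins for _EGLSurface, swrastGetDrawableInfo() and the EGL attribute enums, and the roundtrips counter exists purely to show when the simulated server query happens.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical attribute IDs; the real ones come from EGL/egl.h. */
enum attr { ATTR_WIDTH, ATTR_HEIGHT, ATTR_SWAP_BEHAVIOR };

/* Hypothetical surface model, not Mesa's _EGLSurface. */
struct surface {
   int width, height;      /* cached geometry */
   int swap_behavior;      /* an attribute that never needs the server */
   int server_w, server_h; /* stand-in for the server's current geometry */
   int roundtrips;         /* counts simulated server queries */
};

/* Stand-in for swrastGetDrawableInfo(): pretend to ask the server. */
static void fetch_geometry(struct surface *s, int *w, int *h)
{
   s->roundtrips++;
   *w = s->server_w;
   *h = s->server_h;
}

static bool query_surface(struct surface *s, enum attr attribute, int *value)
{
   assert(s != NULL && value != NULL);

   /* Refresh the cached size only for attributes that depend on it. */
   switch (attribute) {
   case ATTR_WIDTH:
   case ATTR_HEIGHT: {
      int w = -1, h = -1;
      fetch_geometry(s, &w, &h);
      if (w != -1 && h != -1) {
         s->width = w;
         s->height = h;
      }
      break;
   }
   default:
      /* No roundtrip needed. */
      break;
   }

   /* Stand-in for the default query implementation. */
   switch (attribute) {
   case ATTR_WIDTH:         *value = s->width;         return true;
   case ATTR_HEIGHT:        *value = s->height;        return true;
   case ATTR_SWAP_BEHAVIOR: *value = s->swap_behavior; return true;
   }
   return false;
}
```

Querying ATTR_SWAP_BEHAVIOR on such a surface leaves roundtrips untouched, while each ATTR_WIDTH or ATTR_HEIGHT query increments it once and returns the refreshed value.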

By the way, I also can't reproduce the bug 98327. I'm using Archlinux
with Openbox, a non-compositing window manager. The only apps on my
screen are xterm and dEQP test windows.



The only setup where I'm able to reproduce this myself is Ubuntu running 
Unity desktop with DRI2. Are you running on DRI2 or DRI3?


// Tapani


[Mesa-dev] [Bug 99055] Games hang / freeze completely

2016-12-20 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=99055

--- Comment #20 from Kenneth Graunke  ---
(In reply to Eero Tamminen from comment #17)
> (In reply to Kenneth Graunke from comment #15)
> > Renaming the binaries or editing scripts installed with the game is liable
> > to break when new updates for the game come out, because Steam will
> > overwrite those changes.
> 
> Only very few (if any) games include their own libstdc++, so the main issue
> is one included with Steam's Ubuntu 12.04 snapshot.  Does that get (ever)
> updated?

Yes.  Steam client updates can replace those.  Also, if Steam crashes, it
thinks that it needs to verify itself and detects missing/corrupt (replaced)
files and puts them back.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.

Re: [Mesa-dev] [PATCH 04/12] nir: Add a LCSAA-pass

2016-12-20 Thread Timothy Arceri

On Tue, 2016-12-20 at 15:14 -0800, Jason Ekstrand wrote:
> I did have a couple of "real" comments on this one that I'd like to
> at least see a reply to.  Does look pretty good though.
> 
> On Sun, Dec 18, 2016 at 9:47 PM, Timothy Arceri wrote:
> > From: Thomas Helland 
> > 
> > V2: Do a "depth first search" to convert to LCSSA
> > 
> > V3: Small comment fixup
> > 
> > V4: Rebase, adapt to removal of function overloads
> > 
> > V5: Rebase, adapt to relocation of nir to compiler/nir
> >     Still need to adapt to potential if-uses
> >     Work around nir_validate issue
> > 
> > V6 (Timothy):
> >  - tidy lcssa and stop leaking memory
> > >  - don't rewrite the src for the lcssa phi node
> >  - validate lcssa phi srcs to avoid postvalidate assert
> >  - don't add new phi if one already exists
> >  - more lcssa phi validation fixes
> >  - Rather than marking ssa defs inside a loop just mark blocks
> > inside
> >    a loop. This is simpler and fixes lcssa for intrinsics which do
> >    not have a destination.
> >  - don't create LCSSA phis for loops we won't unroll
> >  - require loop metadata for lcssa pass
> > >  - handle case where the ssa defs use outside the loop is already a
> > phi
> > 
> > V7: (Timothy)
> > - pass indirect mask to metadata call
> > 
> > v8: (Timothy)
> > - make convert to lcssa a helper function rather than a nir pass
> > - replace inside loop bitset with on the fly block index logic.
> > - remove lcssa phi validation special cases
> > - inline code from useless helpers, suggested by Jason.
> > - always do lcssa on loops, suggested by Jason.
> > - stop making lcssa phis special. Add as many source as the block
> >   has predecessors, suggested by Jason.
> > 
> > V9: (Timothy)
> > - fix regression with the is_lcssa_phi field not being initialised
> >   to false now that ralloc() doesn't zero out memory.
> > 
> > V10: (Timothy)
> > - remove extra braces in SSA example, pointed out by Topi
> > 
> > V11: (Timothy)
> > - add missing support for LCSSA phis in if conditions.
> > ---
> >  src/compiler/Makefile.sources   |   1 +
> >  src/compiler/nir/nir.c          |   1 +
> >  src/compiler/nir/nir.h          |   4 +
> >  src/compiler/nir/nir_to_lcssa.c | 215
> > 
> >  4 files changed, 221 insertions(+)
> >  create mode 100644 src/compiler/nir/nir_to_lcssa.c
> > 
> > diff --git a/src/compiler/Makefile.sources
> > b/src/compiler/Makefile.sources
> > index ca8a056..e8f7b02 100644
> > --- a/src/compiler/Makefile.sources
> > +++ b/src/compiler/Makefile.sources
> > @@ -254,6 +254,7 @@ NIR_FILES = \
> >         nir/nir_split_var_copies.c \
> >         nir/nir_sweep.c \
> >         nir/nir_to_ssa.c \
> > +       nir/nir_to_lcssa.c \
> >         nir/nir_validate.c \
> >         nir/nir_vla.h \
> >         nir/nir_worklist.c \
> > diff --git a/src/compiler/nir/nir.c b/src/compiler/nir/nir.c
> > index 2c3531c..e522a67 100644
> > --- a/src/compiler/nir/nir.c
> > +++ b/src/compiler/nir/nir.c
> > @@ -561,6 +561,7 @@ nir_phi_instr_create(nir_shader *shader)
> >  {
> >     nir_phi_instr *instr = ralloc(shader, nir_phi_instr);
> >     instr_init(&instr->instr, nir_instr_type_phi);
> > +   instr->is_lcssa_phi = false;
> > 
> >     dest_init(&instr->dest);
> >     exec_list_make_empty(&instr->srcs);
> > diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
> > index 28010aa..75a91ea 100644
> > --- a/src/compiler/nir/nir.h
> > +++ b/src/compiler/nir/nir.h
> > @@ -1360,6 +1360,8 @@ typedef struct {
> >     struct exec_list srcs; /** < list of nir_phi_src */
> > 
> >     nir_dest dest;
> > +
> > +   bool is_lcssa_phi;
> >  } nir_phi_instr;
> > 
> >  typedef struct {
> > @@ -2526,6 +2528,8 @@ void nir_convert_to_ssa(nir_shader *shader);
> >  bool nir_repair_ssa_impl(nir_function_impl *impl);
> >  bool nir_repair_ssa(nir_shader *shader);
> > 
> > +void nir_convert_loop_to_lcssa(nir_loop *loop);
> > +
> >  /* If phi_webs_only is true, only convert SSA values involved in
> > phi nodes to
> >   * registers.  If false, convert all values (even those not
> > involved in a phi
> >   * node) to registers.
> > diff --git a/src/compiler/nir/nir_to_lcssa.c
> > b/src/compiler/nir/nir_to_lcssa.c
> > new file mode 100644
> > index 000..8afdc54
> > --- /dev/null
> > +++ b/src/compiler/nir/nir_to_lcssa.c
> > @@ -0,0 +1,215 @@
> > +/*
> > + * Copyright © 2015 Thomas Helland
> > + *
> > + * Permission is hereby granted, free of charge, to any person
> > obtaining a
> > + * copy of this software and associated documentation files (the
> > "Software"),
> > + * to deal in the Software without restriction, including without
> > limitation
> > + * the rights to use, copy, modify, merge, publish, distribute,
> > sublicense,
> > + * and/or sell copies of the Software, and to permit persons to
> > whom the
> > + * Software is furnished to do so, subject to the following
> > conditions:
> > + *
> > + * The above copyright notice and this permission notice
> >

[Mesa-dev] [PATCH 3/4] gallivm: optimize lp_build_unpack_arith_rgba_aos slightly

2016-12-20 Thread sroland

From: Roland Scheidegger 

This code uses a vector shift which has to be emulated on x86 unless
there's AVX2. Luckily in some cases we can actually avoid the shift
altogether, so do that.
Also make sure we hit the fast lp_build_conv() path when applicable,
albeit that's quite the hack...
That said, this path is taken for AoS sampling for small unorm (smaller
than rgba8) formats, and it is completely hopeless even with those
changes, with or without AVX.
(Probably should have some code similar to the one in the llvmpipe fs
backend code, using bit replication to extend to rgba - rounding
is not quite 100% accurate but if it's good enough there it should be
here as well.)
---
 src/gallium/auxiliary/gallivm/lp_bld_format_aos.c | 116 ++
 1 file changed, 97 insertions(+), 19 deletions(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_format_aos.c b/src/gallium/auxiliary/gallivm/lp_bld_format_aos.c
index 322e7b8..574bb64 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_format_aos.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_format_aos.c
@@ -38,6 +38,7 @@
 #include "util/u_math.h"
 #include "util/u_pointer.h"
 #include "util/u_string.h"
+#include "util/u_cpu_detect.h"
 
 #include "lp_bld_arit.h"
 #include "lp_bld_init.h"
@@ -49,6 +50,7 @@
 #include "lp_bld_gather.h"
 #include "lp_bld_debug.h"
 #include "lp_bld_format.h"
+#include "lp_bld_pack.h"
 #include "lp_bld_intr.h"
 
 
@@ -156,6 +158,7 @@ lp_build_unpack_arith_rgba_aos(struct gallivm_state *gallivm,
LLVMValueRef shifts[4];
LLVMValueRef masks[4];
LLVMValueRef scales[4];
+   LLVMTypeRef vec32_type;
 
boolean normalized;
boolean needs_uitofp;
@@ -171,19 +174,17 @@ lp_build_unpack_arith_rgba_aos(struct gallivm_state *gallivm,
 * matches floating point size */
assert (LLVMTypeOf(packed) == LLVMInt32TypeInContext(gallivm->context));
 
+   vec32_type = LLVMVectorType(LLVMInt32TypeInContext(gallivm->context), 4);
+
/* Broadcast the packed value to all four channels
 * before: packed = BGRA
 * after: packed = {BGRA, BGRA, BGRA, BGRA}
 */
-   packed = LLVMBuildInsertElement(builder,
-   LLVMGetUndef(LLVMVectorType(LLVMInt32TypeInContext(gallivm->context), 4)),
-   packed,
+   packed = LLVMBuildInsertElement(builder, LLVMGetUndef(vec32_type), packed,
LLVMConstNull(LLVMInt32TypeInContext(gallivm->context)),
"");
-   packed = LLVMBuildShuffleVector(builder,
-   packed,
-   LLVMGetUndef(LLVMVectorType(LLVMInt32TypeInContext(gallivm->context), 4)),
-   LLVMConstNull(LLVMVectorType(LLVMInt32TypeInContext(gallivm->context), 4)),
+   packed = LLVMBuildShuffleVector(builder, packed, LLVMGetUndef(vec32_type),
+   LLVMConstNull(vec32_type),
"");
 
/* Initialize vector constants */
@@ -224,9 +225,40 @@ lp_build_unpack_arith_rgba_aos(struct gallivm_state *gallivm,
/* Ex: convert packed = {XYZW, XYZW, XYZW, XYZW}
 * into masked = {X, Y, Z, W}
 */
-   /* Note: we cannot do this shift on x86 natively until AVX2. */
-   shifted = LLVMBuildLShr(builder, packed, LLVMConstVector(shifts, 4), "");
-   masked = LLVMBuildAnd(builder, shifted, LLVMConstVector(masks, 4), "");
+   if (desc->block.bits < 32 && normalized) {
+  /*
+   * Note: we cannot do the shift below on x86 natively until AVX2.
+   *
+   * Old llvm versions will resort to scalar extract/shift insert,
+   * which is definitely terrible, new versions will just do
+   * several vector shifts and shuffle/blend results together.
+   * We could turn this into a variable left shift plus a constant
+   * right shift, and llvm would then turn the variable left shift
+   * into a mul for us (albeit without sse41 the mul needs emulation
+   * too...). However, since we're going to do a float mul
+   * anyway, we just adjust that mul instead (plus the mask), skipping
+   * the shift completely.
+   * We could also use a extra mul when the format isn't normalized and
+   * we don't have AVX2 support, but don't bother for now. Unfortunately,
+   * this strategy doesn't work for 32bit formats (such as rgb10a2 or even
+   * rgba8 if it ends up here), as that would require UIToFP, albeit that
+   * would be fixable with easy 16bit shuffle (unless there's channels
+   * crossing 16bit boundaries).
+   */
+  for (i = 0; i < 4; ++i) {
+ if (desc->channel[i].type != UTIL_FORMAT_TYPE_VOID) {
+unsigned bits = desc->channel[i].size;
+unsigned shift = desc->channel[i].shift;
+unsigned long long mask = ((1ULL << bits) - 1) << shift;
+scales[i] = lp_build_const_float(gallivm, 1.0 / mask);
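The shift-free trick described in the comment above can be illustrated with plain scalar C. This is only a sketch of the arithmetic, not the gallivm code: the two function names are hypothetical, and it assumes an unsigned normalized channel of `bits` bits located at bit offset `shift` in a packed value narrower than 32 bits. Masking the channel in place leaves the value `c << shift`, and dividing by `max << shift` gives the same ratio as `c / max`, so the shift can be folded into the float scale.

```c
#include <assert.h>
#include <stdint.h>

/* Reference path: shift, mask, then scale by 1 / (2^bits - 1). */
static float unpack_unorm_shift(uint32_t packed, unsigned shift, unsigned bits)
{
   uint32_t max = (1u << bits) - 1;

   assert(bits >= 1 && bits + shift < 32);
   return (float)((packed >> shift) & max) * (1.0f / (float)max);
}

/* Shift-free path: mask in place, scale by 1 / ((2^bits - 1) << shift). */
static float unpack_unorm_noshift(uint32_t packed, unsigned shift, unsigned bits)
{
   uint32_t mask = ((1u << bits) - 1) << shift;

   assert(bits >= 1 && bits + shift < 32);
   return (float)(packed & mask) * (float)(1.0 / (double)mask);
}
```

For a 16-bit 5-6-5 layout (channels at shifts 0, 5 and 11), both functions should agree to within float rounding for every packed value; the second one simply never shifts.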
+

[Mesa-dev] [PATCH 4/4] gallivm: implement aos unpack (to unorm8) for small unorm formats

2016-12-20 Thread sroland

From: Roland Scheidegger 

Using bit replication. This path now resembles something which might make
sense. (The logic was mostly copied from llvmpipe fs backend.)
I am not convinced though it is actually faster than SoA sampling (actually
I'm quite certain it's always a loss with AVX).
With SoA it's just shift/mask/cvt/mul for getting the colors, whereas
there's still roughly 3 shifts, 3 or/and per channel for AoS
(i.e. for SoA it's exactly the same as it would be for a rgba8 format,
whereas the extra effort for AoS is significant). The filtering
might still be faster (albeit with FMA the instruction count gets down
quite a bit there on the SoA float filtering path on new cpus). And those
small unorm formats often don't have an alpha channel (which makes things
worse relatively for AoS path).
(This also fixes a trivial bug in the llvmpipe fs code this was derived
from, albeit it was only relevant for 4-bit channels.)
---
 src/gallium/auxiliary/gallivm/lp_bld_format_aos.c | 164 --
 src/gallium/drivers/llvmpipe/lp_state_fs.c|   8 +-
 2 files changed, 155 insertions(+), 17 deletions(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_format_aos.c b/src/gallium/auxiliary/gallivm/lp_bld_format_aos.c
index 574bb64..11d1118 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_format_aos.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_format_aos.c
@@ -52,6 +52,8 @@
 #include "lp_bld_format.h"
 #include "lp_bld_pack.h"
 #include "lp_bld_intr.h"
+#include "lp_bld_logic.h"
+#include "lp_bld_bitarit.h"
 
 
 /**
@@ -139,6 +141,73 @@ format_matches_type(const struct util_format_description *desc,
return TRUE;
 }
 
+/*
+ * Do rounding when converting small unorm values to larger ones.
+ * Not quite 100% accurate, as it's done by appending MSBs, but
+ * should be good enough.
+ */
+
+static inline LLVMValueRef
+scale_bits_up(struct gallivm_state *gallivm,
+  int src_bits,
+  int dst_bits,
+  LLVMValueRef src,
+  struct lp_type src_type)
+{
+   LLVMBuilderRef builder = gallivm->builder;
+   LLVMValueRef result = src;
+
+   if (src_bits == 1 && dst_bits > 1) {
+  /*
+   * Useful for a1 - we'd need quite some repeated copies otherwise.
+   */
+  struct lp_build_context bld;
+  LLVMValueRef dst_mask;
+  lp_build_context_init(&bld, gallivm, src_type);
+  dst_mask = lp_build_const_int_vec(gallivm, src_type,
+(1 << dst_bits) - 1),
+  result = lp_build_cmp(&bld, PIPE_FUNC_EQUAL, src,
+lp_build_const_int_vec(gallivm, src_type, 0));
+  result = lp_build_andnot(&bld, dst_mask, result);
+   }
+   else if (dst_bits > src_bits) {
+  /* Scale up bits */
+  int db = dst_bits - src_bits;
+
+  /* Shift left by difference in bits */
+  result = LLVMBuildShl(builder,
+src,
+lp_build_const_int_vec(gallivm, src_type, db),
+"");
+
+  if (db <= src_bits) {
+ /* Enough bits in src to fill the remainder */
+ LLVMValueRef lower = LLVMBuildLShr(builder,
+src,
+lp_build_const_int_vec(gallivm, src_type,
+   src_bits - db),
+"");
+
+ result = LLVMBuildOr(builder, result, lower, "");
+  } else if (db > src_bits) {
+ /* Need to repeatedly copy src bits to fill remainder in dst */
+ unsigned n;
+
+ for (n = src_bits; n < dst_bits; n *= 2) {
+LLVMValueRef shuv = lp_build_const_int_vec(gallivm, src_type, n);
+
+result = LLVMBuildOr(builder,
+ result,
+ LLVMBuildLShr(builder, result, shuv, ""),
+ "");
+ }
+  }
+   } else {
+  assert (dst_bits == src_bits);
+   }
+
+   return result;
+}
 
 /**
  * Unpack a single pixel into its XYZW components.
@@ -451,6 +520,86 @@ lp_build_fetch_rgba_aos(struct gallivm_state *gallivm,
}
 
/*
+* Bit arithmetic for converting small_unorm to unorm8.
+*
+* This misses some opportunities for optimizations (like skipping mask
+* for the highest channel for instance, or doing bit scaling in parallel
+* for channels with the same bit width) but it should be passable for
+* all arithmetic formats.
+*/
+   if (format_desc->layout == UTIL_FORMAT_LAYOUT_PLAIN &&
+   format_desc->colorspace == UTIL_FORMAT_COLORSPACE_RGB &&
+   util_format_fits_8unorm(format_desc) &&
+   type.width == 8 && type.norm == 1 && type.sign == 0 &&
+   type.fixed == 0 && type.floating == 0) {
+  LLVMValueRef packed, res, chans[4], rgba[4];
+  LLVMTypeRef dst_vec_type, conv_vec_type;
+  struct lp_type fetch_type, conv_type;
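The bit replication performed by scale_bits_up() in the patch above can be modelled on a single scalar value. The following is a sketch that mirrors the three branches of the patch (1-bit source, one partial copy, repeated copies); it operates on a plain uint32_t rather than on LLVM vector values, assumes dst_bits is below 32, and is not the gallivm implementation itself.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of widening a src_bits unorm value to dst_bits. */
static uint32_t scale_bits_up(uint32_t src, int src_bits, int dst_bits)
{
   uint32_t result = src;

   assert(src_bits >= 1 && dst_bits >= src_bits && dst_bits < 32);

   if (src_bits == 1 && dst_bits > 1) {
      /* A set bit becomes all ones, a clear bit stays zero. */
      result = src ? (1u << dst_bits) - 1 : 0;
   } else if (dst_bits > src_bits) {
      int db = dst_bits - src_bits;

      /* Move the source into the most significant bits. */
      result = src << db;

      if (db <= src_bits) {
         /* The top db bits of src fill the remaining low bits. */
         result |= src >> (src_bits - db);
      } else {
         /* Keep copying the pattern downwards until dst is filled. */
         for (int n = src_bits; n < dst_bits; n *= 2)
            result |= result >> n;
      }
   }

   return result;
}
```

For example, a 5-bit value of 16 becomes (16 << 3) | (16 >> 2) = 132 in 8 bits, and a 2-bit value of 1 becomes 0x55 = 85 after two copy steps. The maximum source value always maps to the maximum destination value.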
+

[Mesa-dev] [PATCH 1/4] llvmpipe: (trivial) minimally simplify mask construction

2016-12-20 Thread sroland

From: Roland Scheidegger 

SIMD instruction sets usually have comparisons for equal, not unequal.
So use a different comparison against the mask itself - which also means
we don't need an all-zero as well as an all-one (for the pxor) reg.

Also add code to avoid scalar expansion of i1 values which we definitely
shouldn't do. There's problems with this though with llvm select
interaction, so it's disabled (basically using llvm select instead of
intrinsics may still produce atrocious code, even in cases where we
figured it should not, albeit I think this could probably be fixed
with some better selection of optimization passes, but I have zero
idea there really).
---
 src/gallium/auxiliary/gallivm/lp_bld_logic.c |  2 ++
 src/gallium/drivers/llvmpipe/lp_bld_depth.c  | 52 ++--
 src/gallium/drivers/llvmpipe/lp_state_fs.c   | 16 +
 3 files changed, 53 insertions(+), 17 deletions(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_logic.c b/src/gallium/auxiliary/gallivm/lp_bld_logic.c
index 1a50e82..524917a 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_logic.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_logic.c
@@ -327,6 +327,8 @@ lp_build_select(struct lp_build_context *bld,
* supported yet for a long time, and LLVM will generate poor code when
* the mask is not the result of a comparison.
* Also, llvm 3.7 may miscompile them (bug 94972).
+   * XXX: Even if the instruction was an SExt, this may still produce
+   * terrible code. Try piglit stencil-twoside.
*/
 
   /* Convert the mask to a vector of booleans.
diff --git a/src/gallium/drivers/llvmpipe/lp_bld_depth.c b/src/gallium/drivers/llvmpipe/lp_bld_depth.c
index 0c27c2f..d5d5c5a 100644
--- a/src/gallium/drivers/llvmpipe/lp_bld_depth.c
+++ b/src/gallium/drivers/llvmpipe/lp_bld_depth.c
@@ -963,16 +963,48 @@ lp_build_depth_stencil_test(struct gallivm_state *gallivm,
if (stencil[0].enabled) {
 
   if (face) {
- LLVMValueRef zero = lp_build_const_int32(gallivm, 0);
-
- /* front_facing = face != 0 ? ~0 : 0 */
- front_facing = LLVMBuildICmp(builder, LLVMIntNE, face, zero, "");
- front_facing = LLVMBuildSExt(builder, front_facing,
-  LLVMIntTypeInContext(gallivm->context,
- s_bld.type.length*s_bld.type.width),
-  "");
- front_facing = LLVMBuildBitCast(builder, front_facing,
- s_bld.int_vec_type, "");
+ if (0) {
+/*
+ * XXX: the scalar expansion below produces atrocious code
+ * (basically producing a 64bit scalar value, then moving the 2
+ * 32bit pieces separately to simd, plus 4 shuffles, which is
+ * seriously lame). But the scalar-simd transitions are always
+ * tricky, so no big surprise there.
+ * This here would be way better, however llvm has some serious
+ * trouble later using it in the select, probably because it will
+ * recognize the expression as constant and move the simd value
+ * away (out of the loop) - and then it will suddenly try
+ * constructing i1 high-bit masks out of it later...
+ * (Try piglit stencil-twoside.)
+ * Note this is NOT due to using SExt/Trunc, it fails exactly the
+ * same even when using native compare/select.
+ * I cannot reproduce this problem when using stand-alone compiler
+ * though, suggesting some problem with optimization passes...
+ * (With stand-alone compilation, the construction of this mask
+ * value, no matter if the easy 3 instruction here or the complex
+ * 16+ one below, never gets separated from where it's used.)
+ * The scalar code still has the same problem, but the generated
+ * code looks a bit better at least for some reason, even if
+ * mostly by luck (the fundamental issue clearly is the same).
+ */
+front_facing = lp_build_broadcast(gallivm, s_bld.vec_type, face);
+/* front_facing = face != 0 ? ~0 : 0 */
+front_facing = lp_build_compare(gallivm, s_bld.type,
+PIPE_FUNC_NOTEQUAL,
+front_facing, s_bld.zero);
+ } else {
+LLVMValueRef zero = lp_build_const_int32(gallivm, 0);
+
+/* front_facing = face != 0 ? ~0 : 0 */
+front_facing = LLVMBuildICmp(builder, LLVMIntNE, face, zero, "");
+front_facing = LLVMBuildSExt(builder, front_facing,
+ LLVMIntTypeInContext(gallivm->context,
+s_bld.type.length*s_bld.type.width),
+ "");
+

[Mesa-dev] [PATCH 2/4] gallivm: use 2 srcs for 32->16bit conversions in lp_bld_conv_auto

2016-12-20 Thread sroland

From: Roland Scheidegger 

If we only feed one source vector at a time, we cannot use pack intrinsics
(as we only have a 64bit destination vector). lp_bld_conv_auto is
specifically designed to alter the length and number of destination vectors,
so this works just fine (if we use single source vectors at a time, afterwards
we immediately reassemble the vectors).
For AVX though this isn't really possible, since we expect 128bit output
already for a single 256bit input. (One day we should handle AVX2 which again
would need multiple inputs, however there's the problem that we get different
ordered output there and we don't want to reorder, so would need to be able
to tell build_conv to handle upper and lower halves independently.)
A similar strategy would probably work for 32->8bit too (if it doesn't hit
the special case) but I'm going to try something different for that...
---
 src/gallium/auxiliary/gallivm/lp_bld_conv.c | 21 +++--
 1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_conv.c b/src/gallium/auxiliary/gallivm/lp_bld_conv.c
index 69d24a5..c8f9c28 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_conv.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_conv.c
@@ -497,8 +497,25 @@ int lp_build_conv_auto(struct gallivm_state *gallivm,
if (src_type.width == dst_type->width) {
   lp_build_conv(gallivm, src_type, *dst_type, src, num_srcs, dst, num_dsts);
} else {
-  for (i = 0; i < num_srcs; ++i) {
- lp_build_conv(gallivm, src_type, *dst_type, &src[i], 1, &dst[i], 1);
+  /*
+   * If dst_width is 16 bits and src_width 32 and the dst vector size
+   * 64bit, try feeding 2 vectors at once so pack intrinsics can be used.
+   * (For AVX, this isn't needed, since we usually get 256bit src and
+   * 128bit dst vectors which works ok. If we do AVX2 pack this should
+   * be extended but need to be able to tell conversion code about pack
+   * ordering first.)
+   */
+  unsigned ratio = 1;
+  if (src_type.width == 2 * dst_type->width &&
+  src_type.length == dst_type->length &&
+  dst_type->floating == 0 && (num_srcs % 2 == 0) &&
+  dst_type->width * dst_type->length == 64) {
+ ratio = 2;
+ num_dsts /= 2;
+ dst_type->length *= 2;
+  }
+  for (i = 0; i < num_dsts; i++) {
+ lp_build_conv(gallivm, src_type, *dst_type, &src[i*ratio], ratio, &dst[i], 1);
   }
}
 
-- 
2.7.4
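
The effect of feeding two source vectors per destination can be modelled with plain arrays. This is a sketch under stated assumptions, not the gallivm code: SRC_LEN, sat_u16(), pack2() and conv_pairs() are hypothetical names, the sources are 4 x 32-bit vectors, and the pack is modelled with unsigned saturation in the spirit of an x86 pack instruction such as packusdw. Which saturation the real conversion applies depends on the types involved.

```c
#include <assert.h>
#include <stdint.h>

#define SRC_LEN 4 /* 4 x 32-bit lanes per source vector (128 bits) */

/* Saturate a signed 32-bit lane to the unsigned 16-bit range. */
static uint16_t sat_u16(int32_t v)
{
   if (v < 0)
      return 0;
   if (v > 65535)
      return 65535;
   return (uint16_t)v;
}

/* Model of a pack: two 4 x 32-bit sources fill one 8 x 16-bit destination. */
static void pack2(const int32_t *lo, const int32_t *hi, uint16_t *dst)
{
   for (int i = 0; i < SRC_LEN; ++i) {
      dst[i] = sat_u16(lo[i]);
      dst[i + SRC_LEN] = sat_u16(hi[i]);
   }
}

/*
 * Convert num_srcs source vectors (must be even) two at a time, as the
 * ratio == 2 case in the patch does. Returns the number of destination
 * vectors written, which is half the number of sources.
 */
static unsigned conv_pairs(int32_t (*src)[SRC_LEN], unsigned num_srcs,
                           uint16_t (*dst)[2 * SRC_LEN])
{
   const unsigned ratio = 2;
   unsigned num_dsts = num_srcs / ratio;

   assert(num_srcs % ratio == 0);
   for (unsigned i = 0; i < num_dsts; ++i)
      pack2(src[i * ratio], src[i * ratio + 1], dst[i]);

   return num_dsts;
}
```

With one source per call, each 4 x 16-bit result would only fill 64 bits and the pieces would have to be reassembled afterwards; pairing the sources produces full 128-bit destinations directly.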


Re: [Mesa-dev] [PATCH 1/2] radeonsi: add Polaris12 support (v3)

2016-12-20 Thread Zhang, Jerry

Hi Alex and Marek,

> >
> > Reviewed-by: Marek Olšák 
> >
> > Has it ever been tested with Mesa?
> 
> It was tested when the code was originally written and the hybrid stack
> (including mesa MM) has been tested recently.  I don't have a
> polaris12 card at the moment.
Just now I verified it again.
I could play h264, h265 and mp4 content, following our test procedure.

It's Tested-by: Junwei Zhang 

Regards,
Jerry (Junwei Zhang)

SRDC SW Development
AMD Shanghai
_


> -Original Message-
> From: Alex Deucher [mailto:alexdeuc...@gmail.com]
> Sent: Wednesday, December 21, 2016 3:07
> To: Marek Olšák
> Cc: mesa-dev@lists.freedesktop.org; Zhang, Jerry
> Subject: Re: [Mesa-dev] [PATCH 1/2] radeonsi: add Polaris12 support (v3)
> 
> On Tue, Dec 20, 2016 at 1:34 PM, Marek Olšák  wrote:
> > For the series:
> >
> > Reviewed-by: Marek Olšák 
> >
> > Has it ever been tested with Mesa?
> 
> > It was tested when the code was originally written and the hybrid stack
> > (including mesa MM) has been tested recently.  I don't have a
> > polaris12 card at the moment.
> 
> Alex
> 
> >
> > Marek
> >
> > On Mon, Dec 19, 2016 at 11:45 PM, Alex Deucher 
> wrote:
> >> From: Junwei Zhang 
> >>
> >> v2: use gfxip names for llvm 4.0+
> >> v3: use tonga for llvm <= 3.8
> >>
> >> Signed-off-by: Junwei Zhang 
> >> Reviewed-by: Nicolai Hähnle 
> >> Acked-by: Christian König 
> >> ---
> >>  src/amd/addrlib/r800/ciaddrlib.cpp| 3 ++-
> >>  src/amd/addrlib/r800/ciaddrlib.h  | 1 +
> >>  src/amd/common/amd_family.h   | 1 +
> >>  src/amd/common/amdgpu_id.h| 4 
> >>  src/gallium/drivers/radeon/r600_pipe_common.c | 3 +++
> >>  src/gallium/drivers/radeon/radeon_vce.c   | 3 ++-
> >>  src/gallium/drivers/radeonsi/si_pipe.c| 1 +
> >>  src/gallium/drivers/radeonsi/si_state.c   | 1 +
> >>  src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c | 4 
> >>  9 files changed, 19 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/src/amd/addrlib/r800/ciaddrlib.cpp
> >> b/src/amd/addrlib/r800/ciaddrlib.cpp
> >> index 7c5d29a..c726c4d 100644
> >> --- a/src/amd/addrlib/r800/ciaddrlib.cpp
> >> +++ b/src/amd/addrlib/r800/ciaddrlib.cpp
> >> @@ -353,6 +353,7 @@ AddrChipFamily CIAddrLib::HwlConvertChipFamily(
> >>  m_settings.isFiji        = ASICREV_IS_FIJI_P(uChipRevision);
> >>  m_settings.isPolaris10   = ASICREV_IS_POLARIS10_P(uChipRevision);
> >>  m_settings.isPolaris11   = ASICREV_IS_POLARIS11_M(uChipRevision);
> >> +m_settings.isPolaris12   = ASICREV_IS_POLARIS12_V(uChipRevision);
> >>  break;
> >>  case FAMILY_CZ:
> >>  m_settings.isCarrizo = 1;
> >> @@ -417,7 +418,7 @@ BOOL_32 CIAddrLib::HwlInitGlobalParams(
> >>  {
> >>  m_pipes = 16;
> >>  }
> >> -else if (m_settings.isPolaris11)
> >> +else if (m_settings.isPolaris11 || m_settings.isPolaris12)
> >>  {
> >>  m_pipes = 4;
> >>  }
> >> diff --git a/src/amd/addrlib/r800/ciaddrlib.h
> >> b/src/amd/addrlib/r800/ciaddrlib.h
> >> index de995fa..2c9a4cc 100644
> >> --- a/src/amd/addrlib/r800/ciaddrlib.h
> >> +++ b/src/amd/addrlib/r800/ciaddrlib.h
> >> @@ -62,6 +62,7 @@ struct CIChipSettings
> >>  UINT_32 isFiji: 1;
> >>  UINT_32 isPolaris10   : 1;
> >>  UINT_32 isPolaris11   : 1;
> >> +UINT_32 isPolaris12   : 1;
> >>  // VI fusion (Carrizo)
> >>  UINT_32 isCarrizo : 1;
> >>  };
> >> diff --git a/src/amd/common/amd_family.h
> >> b/src/amd/common/amd_family.h
> >> index 6a713ad..b09bbb8 100644
> >> --- a/src/amd/common/amd_family.h
> >> +++ b/src/amd/common/amd_family.h
> >> @@ -91,6 +91,7 @@ enum radeon_family {
> >>  CHIP_STONEY,
> >>  CHIP_POLARIS10,
> >>  CHIP_POLARIS11,
> >> +CHIP_POLARIS12,
> >>  CHIP_LAST,
> >>  };
> >>
> >> diff --git a/src/amd/common/amdgpu_id.h b/src/amd/common/amdgpu_id.h
> >> index f91df55..1683a5a 100644
> >> --- a/src/amd/common/amdgpu_id.h
> >> +++ b/src/amd/common/amdgpu_id.h
> >> @@ -142,6 +142,8 @@ enum {
> >>
> >> VI_POLARIS11_M_A0 = 90,
> >>
> >> +   VI_POLARIS12_V_A0 = 100,
> >> +
> >> VI_UNKNOWN= 0xFF
> >>  };
> >>
> >> @@ -156,6 +158,8 @@ enum {
> >> ((eChipRev >= VI_POLARIS10_P_A0) && (eChipRev < VI_POLARIS11_M_A0))
> >>  #define ASICREV_IS_POLARIS11_M(eChipRev)   \
> >> (eChipRev >= VI_POLARIS11_M_A0)
> >> +#define ASICREV_IS_POLARIS12_V(eChipRev)\
> >> +   (eChipRev >= VI_POLARIS12_V_A0)
> >>
> >>  /* CZ specific rev IDs */
> >>  enum {
> >> diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c
> >> b/src/gallium/drivers/radeon/r600_pipe_common.c
> >> index

[Mesa-dev] [Bug 99076] dEQP-GLES3.functional.negative_api.texture#teximage3d fails due to wrong Error code

2016-12-20 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=99076

Jordan Justen  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|WONTFIX                     |NOTOURBUG


[Mesa-dev] [Bug 99076] dEQP-GLES3.functional.negative_api.texture#teximage3d fails due to wrong Error code

2016-12-20 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=99076

Randy  changed:

   What|Removed |Added

 Resolution|--- |WONTFIX
 Status|NEW |RESOLVED

--- Comment #1 from Randy  ---
The patch https://android-review.googlesource.com/#/c/291429/ for GLES3.1 has
been merged into the dEQP master branch, so I expect the GLES3 change to be
merged as well.

In this case, we don't need the Mesa patch, so I am closing the bug.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.

Re: [Mesa-dev] [PATCH] Mesa: Fix error code for glTexImage3D in GLES

2016-12-20 Thread Xu, Randy

Hi, Kenneth

The patch https://android-review.googlesource.com/#/c/291429/ for GLES3.1 has
been merged into the dEQP master branch, so I expect the GLES3 change to be
merged as well.
In this case, we don't need this change in Mesa.

Thanks,
Randy

-Original Message-
From: Kenneth Graunke [mailto:kenn...@whitecape.org] 
Sent: Wednesday, December 21, 2016 10:00 AM
To: mesa-dev@lists.freedesktop.org
Cc: Xu, Randy ; mesa-sta...@lists.freedesktop.org
Subject: Re: [Mesa-dev] [PATCH] Mesa: Fix error code for glTexImage3D in GLES

On Wednesday, December 21, 2016 9:05:27 AM PST Randy Xu wrote:
> From the OGLES 3.2 spec, Section 8.5 Texture Image Specification, page 158:
>  "An INVALID_OPERATION error is generated if a combination of
>   values for format, type, and internalformat is specified that is
>   not listed as a valid combination in tables 8.2 or 8.3."
> It means that TexImage3D should return GL_INVALID_OPERATION if the 
> internal format is DEPTH_COMPONENT, DEPTH_STENCIL or STENCIL_INDEX.
> 
> The current code returns INVALID_ENUM as 
> _mesa_error_check_format_and_type is also used by glReadPixels and the 
> GL specification defines  "INVALID_ENUM is generated if format is 
> DEPTH_STENCIL and type is not
>   UNSIGNED_INT_24_8 or FLOAT_32_UNSIGNED_INT_24_8_REV".
> 
> This patch only impacts GLES, which can generate GL_INVALID_OPERATION 
> because glReadPixels cannot be used to read depth or stencil buffer.
> Fixes dEQP-GLES3.functional.negative_api.texture.teximage3d.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99076
> Signed-off-by: Randy Xu 
> ---
>  src/mesa/main/glformats.c | 7 +++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/src/mesa/main/glformats.c b/src/mesa/main/glformats.c 
> index a95909c..3070db9 100644
> --- a/src/mesa/main/glformats.c
> +++ b/src/mesa/main/glformats.c
> @@ -2087,6 +2087,13 @@ _mesa_error_check_format_and_type(const struct 
> gl_context *ctx,
>   else if (ctx->Extensions.ARB_depth_buffer_float &&
>   type == GL_FLOAT_32_UNSIGNED_INT_24_8_REV)
>  return GL_NO_ERROR;
> + //From the OpenGL ES 3.2 spec, Section 8.5 Texture Image
> + // Specification, page 158:
> + // An INVALID_OPERATION error is generated if a combination of
> + // values for format, type, and internalformat is specified that
> + // is not listed as a valid combination in tables 8.2 or 8.3.
> + else if (!_mesa_is_desktop_gl(ctx))
> +return GL_INVALID_OPERATION;
>   else
>  return GL_INVALID_ENUM;
>  
> 

There is a bug in dEQP related to this test:

https://android-review.googlesource.com/#/c/316475/

We already landed the equivalent fix for dEQP-GLES31 version, but apparently 
the test exists in dEQP-GLES3 as well.

After applying that patch, the test passes, with no change to Mesa.

--Ken

[Mesa-dev] [PATCH V2] mesa: use gl_program for CurrentProgram rather than gl_shader_program

2016-12-20 Thread Timothy Arceri

This makes much more sense and should be more performant in some
critical paths such as SSO validation which is called at draw time.

Previously the CurrentProgram array could have contained multiple
pointers to the same struct which was confusing and we would often
need to fish out the information we were really after from the
gl_program anyway.

Also it was error prone to depend on the _LinkedShader array for
programs in current use because a failed linking attempt will lose
the information about the current program in use which is still
valid.

V2: fix validate_io() to compare linked_stages rather than the
consumer and producer to decide if we are looking at inward
facing shader interfaces which don't need validation.
---
 src/mesa/drivers/common/meta.c| 11 ++--
 src/mesa/drivers/common/meta.h|  2 +-
 src/mesa/drivers/dri/i965/brw_context.c   | 10 ++--
 src/mesa/drivers/dri/i965/brw_ff_gs.c |  4 +-
 src/mesa/drivers/dri/i965/brw_gs_surface_state.c  |  8 +--
 src/mesa/drivers/dri/i965/brw_tcs_surface_state.c |  8 +--
 src/mesa/drivers/dri/i965/brw_tes_surface_state.c |  8 +--
 src/mesa/drivers/dri/i965/brw_vs_surface_state.c  |  9 +---
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c  | 10 ++--
 src/mesa/drivers/dri/i965/gen6_sol.c  | 24 -
 src/mesa/drivers/dri/i965/gen7_l3_state.c |  6 +--
 src/mesa/main/api_validate.c  | 57 
 src/mesa/main/ff_fragment_shader.cpp  |  6 +--
 src/mesa/main/mtypes.h|  2 +-
 src/mesa/main/pipelineobj.c   | 52 +--
 src/mesa/main/shader_query.cpp| 38 ++
 src/mesa/main/shaderapi.c | 63 ++-
 src/mesa/main/state.c | 50 +++---
 src/mesa/main/texstate.c  |  5 +-
 src/mesa/main/transformfeedback.c |  2 +-
 src/mesa/main/uniform_query.cpp   | 21 +++-
 src/mesa/state_tracker/st_atom_atomicbuf.c| 20 +++
 src/mesa/state_tracker/st_atom_constbuf.c | 43 +---
 src/mesa/state_tracker/st_atom_image.c| 42 +--
 src/mesa/state_tracker/st_atom_storagebuf.c   | 48 +
 src/mesa/state_tracker/st_cb_compute.c|  4 +-
 26 files changed, 198 insertions(+), 355 deletions(-)

diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c
index 0d5661b..15d28b2 100644
--- a/src/mesa/drivers/common/meta.c
+++ b/src/mesa/drivers/common/meta.c
@@ -594,8 +594,8 @@ _mesa_meta_begin(struct gl_context *ctx, GLbitfield state)
* that we don't have to worry about the current pipeline state.
*/
   for (i = 0; i < MESA_SHADER_STAGES; i++) {
- _mesa_reference_shader_program(ctx, &save->Shader[i],
-ctx->Shader.CurrentProgram[i]);
+ _mesa_reference_program(ctx, &save->Program[i],
+ ctx->Shader.CurrentProgram[i]);
   }
   _mesa_reference_shader_program(ctx, &save->ActiveShader,
  ctx->Shader.ActiveProgram);
@@ -972,16 +972,15 @@ _mesa_meta_end(struct gl_context *ctx)
   * program object must be NULL.  _mesa_use_shader_program is a no-op
   * in that case.
   */
- _mesa_use_shader_program(ctx, targets[i],
-  save->Shader[i],
+ _mesa_use_shader_program(ctx, targets[i], save->Program[i],
   &ctx->Shader);
 
  /* Do this *before* killing the reference. :)
   */
- if (save->Shader[i] != NULL)
+ if (save->Program[i] != NULL)
 any_shader = true;
 
- _mesa_reference_shader_program(ctx, &save->Shader[i], NULL);
+ _mesa_reference_program(ctx, &save->Program[i], NULL);
   }
 
   _mesa_reference_shader_program(ctx, &ctx->Shader.ActiveProgram,
diff --git a/src/mesa/drivers/common/meta.h b/src/mesa/drivers/common/meta.h
index 0a913e9..1b5cf42 100644
--- a/src/mesa/drivers/common/meta.h
+++ b/src/mesa/drivers/common/meta.h
@@ -125,7 +125,7 @@ struct save_state
GLboolean FragmentProgramEnabled;
struct gl_program *FragmentProgram;
GLboolean ATIFragmentShaderEnabled;
-   struct gl_shader_program *Shader[MESA_SHADER_STAGES];
+   struct gl_program *Program[MESA_SHADER_STAGES];
struct gl_shader_program *ActiveShader;
struct gl_pipeline_object   *Pipeline;
 
diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
b/src/mesa/drivers/dri/i965/brw_context.c
index e53aefd..63cb12c 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -274,14 +274,12 @@ intel_update_state(struct gl_context * ctx, GLuint 
new_state)
 
/* Resolve color for each active shader image. */
for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) {
-

Re: [Mesa-dev] [PATCH] Mesa: Fix error code for glTexImage3D in GLES

2016-12-20 Thread Kenneth Graunke

On Wednesday, December 21, 2016 9:05:27 AM PST Randy Xu wrote:
> From the OGLES 3.2 spec, Section 8.5 Texture Image Specification, page 158:
>  "An INVALID_OPERATION error is generated if a combination of
>   values for format, type, and internalformat is specified that is
>   not listed as a valid combination in tables 8.2 or 8.3."
> It means that TexImage3D should return GL_INVALID_OPERATION if the internal
> format is DEPTH_COMPONENT, DEPTH_STENCIL or STENCIL_INDEX.
> 
> The current code returns INVALID_ENUM as _mesa_error_check_format_and_type is
> also used by glReadPixels and the GL specification defines
>  "INVALID_ENUM is generated if format is DEPTH_STENCIL and type is not
>   UNSIGNED_INT_24_8 or FLOAT_32_UNSIGNED_INT_24_8_REV".
> 
> This patch only impacts GLES, which can generate GL_INVALID_OPERATION because
> glReadPixels cannot be used to read depth or stencil buffer.
> Fixes dEQP-GLES3.functional.negative_api.texture.teximage3d.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99076
> Signed-off-by: Randy Xu 
> ---
>  src/mesa/main/glformats.c | 7 +++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/src/mesa/main/glformats.c b/src/mesa/main/glformats.c
> index a95909c..3070db9 100644
> --- a/src/mesa/main/glformats.c
> +++ b/src/mesa/main/glformats.c
> @@ -2087,6 +2087,13 @@ _mesa_error_check_format_and_type(const struct 
> gl_context *ctx,
>   else if (ctx->Extensions.ARB_depth_buffer_float &&
>   type == GL_FLOAT_32_UNSIGNED_INT_24_8_REV)
>  return GL_NO_ERROR;
> + //From the OpenGL ES 3.2 spec, Section 8.5 Texture Image
> + // Specification, page 158:
> + // An INVALID_OPERATION error is generated if a combination of
> + // values for format, type, and internalformat is specified that
> + // is not listed as a valid combination in tables 8.2 or 8.3.
> + else if (!_mesa_is_desktop_gl(ctx))
> +return GL_INVALID_OPERATION;
>   else
>  return GL_INVALID_ENUM;
>  
> 

There is a bug in dEQP related to this test:

https://android-review.googlesource.com/#/c/316475/

We already landed the equivalent fix for dEQP-GLES31 version, but
apparently the test exists in dEQP-GLES3 as well.

After applying that patch, the test passes, with no change to Mesa.

--Ken



Re: [Mesa-dev] [PATCH 09/12] nir: add a loop unrolling pass

2016-12-20 Thread Jason Ekstrand

On Tue, Dec 20, 2016 at 5:06 PM, Timothy Arceri <
timothy.arc...@collabora.com> wrote:

> On Tue, 2016-12-20 at 16:31 -0800, Jason Ekstrand wrote:
> > On Sun, Dec 18, 2016 at 9:47 PM, Timothy Arceri wrote:
> > > V2:
> > > - tidy ups suggested by Connor.
> > > - tidy up cloning logic and handle copy propagation
> > >   based on suggestion by Connor.
> > > - use nir_ssa_def_rewrite_uses to fix up lcssa phis
> > >   suggested by Connor.
> > > - add support for complex loop unrolling (two terminators)
> > > - handle case where the ssa def's use outside the loop is already a
> > >   phi
> > > - support unrolling loops with multiple terminators when trip count
> > >   is known for each terminator
> > >
> > > V3:
> > > - set correct num_components when creating phi in complex unroll
> > > - rewrite update remap table based on Jason's suggestions.
> > > - remove unneeded extract_loop_body() helper as suggested by
> > >   Jason.
> > > - simplify the lcssa phi fix up code for simple loops as per Jason's
> > >   suggestions.
> > > - use mem context to keep track of hash table memory as suggested
> > > by Jason.
> > > - move is_{complex,simple}_loop helpers to the unroll code
> > > - require nir_metadata_block_index
> > > - partially rewrote complex unroll to be simpler and easier to
> > > follow.
> > >
> > > V4:
> > > - use rzalloc() when creating nir_phi_src but not setting pred
> > > right away
> > >   fixes regression caused by ralloc() no longer zeroing memory.
> > >
> > > V5:
> > > - simplify calling of complex_unroll()
> > > - use new loop terminator fields to get the break/continue from
> > > blocks
> > >   and simplify loop unrolling code
> > > - handle slightly less trivial loop terminators. if branches can
> > >   now have instructions but can only contain a single block.
> > > - use nir print type IR snippets in unroll function descriptions
> > > - add better explanation and variable for why we need to clone
> > >   additional times when the second terminator is the limiting
> > >   terminator.
> > > - partially convert out of ssa before unrolling loops (suggested by
> > > Jason)
> > > ---
> > >  src/compiler/Makefile.sources  |   1 +
> > >  src/compiler/nir/nir.h |   2 +
> > >  src/compiler/nir/nir_opt_loop_unroll.c | 559
> > > +
> > >  3 files changed, 562 insertions(+)
> > >  create mode 100644 src/compiler/nir/nir_opt_loop_unroll.c
> > >
> > > diff --git a/src/compiler/Makefile.sources
> > > b/src/compiler/Makefile.sources
> > > index e8f7b02..ae3e5f0 100644
> > > --- a/src/compiler/Makefile.sources
> > > +++ b/src/compiler/Makefile.sources
> > > @@ -239,6 +239,7 @@ NIR_FILES = \
> > > nir/nir_opt_dead_cf.c \
> > > nir/nir_opt_gcm.c \
> > > nir/nir_opt_global_to_local.c \
> > > +   nir/nir_opt_loop_unroll.c \
> > > nir/nir_opt_peephole_select.c \
> > > nir/nir_opt_remove_phis.c \
> > > nir/nir_opt_undef.c \
> > > diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
> > > index 75a91ea..51bc6b2 100644
> > > --- a/src/compiler/nir/nir.h
> > > +++ b/src/compiler/nir/nir.h
> > > @@ -2552,6 +2552,8 @@ bool nir_opt_dead_cf(nir_shader *shader);
> > >
> > >  bool nir_opt_gcm(nir_shader *shader, bool value_number);
> > >
> > > +bool nir_opt_loop_unroll(nir_shader *shader, nir_variable_mode
> > > indirect_mask);
> > > +
> > >  bool nir_opt_peephole_select(nir_shader *shader, unsigned limit);
> > >
> > >  bool nir_opt_remove_phis(nir_shader *shader);
> > > diff --git a/src/compiler/nir/nir_opt_loop_unroll.c
> > > b/src/compiler/nir/nir_opt_loop_unroll.c
> > > new file mode 100644
> > > index 000..7eb44cb
> > > --- /dev/null
> > > +++ b/src/compiler/nir/nir_opt_loop_unroll.c
> > > @@ -0,0 +1,559 @@
> > > +/*
> > > + * Copyright © 2016 Intel Corporation
> > > + *
> > > + * Permission is hereby granted, free of charge, to any person
> > > obtaining a
> > > + * copy of this software and associated documentation files (the
> > > "Software"),
> > > + * to deal in the Software without restriction, including without
> > > limitation
> > > + * the rights to use, copy, modify, merge, publish, distribute,
> > > sublicense,
> > > + * and/or sell copies of the Software, and to permit persons to
> > > whom the
> > > + * Software is furnished to do so, subject to the following
> > > conditions:
> > > + *
> > > + * The above copyright notice and this permission notice
> > > (including the next
> > > + * paragraph) shall be included in all copies or substantial
> > > portions of the
> > > + * Software.
> > > + *
> > > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
> > > EXPRESS OR
> > > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> > > MERCHANTABILITY,
> > > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO
> > > EVENT SHALL
> > > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
> > > DAMAGES OR OTHER
> > > + * LIABILITY,

Re: [Mesa-dev] [PATCH 09/12] nir: add a loop unrolling pass

2016-12-20 Thread Timothy Arceri

On Tue, 2016-12-20 at 16:31 -0800, Jason Ekstrand wrote:
> On Sun, Dec 18, 2016 at 9:47 PM, Timothy Arceri wrote:
> > V2:
> > - tidy ups suggested by Connor.
> > - tidy up cloning logic and handle copy propagation
> >   based on suggestion by Connor.
> > - use nir_ssa_def_rewrite_uses to fix up lcssa phis
> >   suggested by Connor.
> > - add support for complex loop unrolling (two terminators)
> > - handle case where the ssa def's use outside the loop is already a
> >   phi
> > - support unrolling loops with multiple terminators when trip count
> >   is known for each terminator
> > 
> > V3:
> > - set correct num_components when creating phi in complex unroll
> > - rewrite update remap table based on Jason's suggestions.
> > - remove unneeded extract_loop_body() helper as suggested by
> >   Jason.
> > - simplify the lcssa phi fix up code for simple loops as per Jason's
> >   suggestions.
> > - use mem context to keep track of hash table memory as suggested
> > by Jason.
> > - move is_{complex,simple}_loop helpers to the unroll code
> > - require nir_metadata_block_index
> > - partially rewrote complex unroll to be simpler and easier to
> > follow.
> > 
> > V4:
> > - use rzalloc() when creating nir_phi_src but not setting pred
> > right away
> >   fixes regression caused by ralloc() no longer zeroing memory.
> > 
> > V5:
> > - simplify calling of complex_unroll()
> > - use new loop terminator fields to get the break/continue from
> > blocks
> >   and simplify loop unrolling code
> > - handle slightly less trivial loop terminators. if branches can
> >   now have instructions but can only contain a single block.
> > - use nir print type IR snippets in unroll function descriptions
> > - add better explanation and variable for why we need to clone
> >   additional times when the second terminator is the limiting
> >   terminator.
> > - partially convert out of ssa before unrolling loops (suggested by
> > Jason)
> > ---
> >  src/compiler/Makefile.sources          |   1 +
> >  src/compiler/nir/nir.h                 |   2 +
> >  src/compiler/nir/nir_opt_loop_unroll.c | 559
> > +
> >  3 files changed, 562 insertions(+)
> >  create mode 100644 src/compiler/nir/nir_opt_loop_unroll.c
> > 
> > diff --git a/src/compiler/Makefile.sources
> > b/src/compiler/Makefile.sources
> > index e8f7b02..ae3e5f0 100644
> > --- a/src/compiler/Makefile.sources
> > +++ b/src/compiler/Makefile.sources
> > @@ -239,6 +239,7 @@ NIR_FILES = \
> >         nir/nir_opt_dead_cf.c \
> >         nir/nir_opt_gcm.c \
> >         nir/nir_opt_global_to_local.c \
> > +       nir/nir_opt_loop_unroll.c \
> >         nir/nir_opt_peephole_select.c \
> >         nir/nir_opt_remove_phis.c \
> >         nir/nir_opt_undef.c \
> > diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
> > index 75a91ea..51bc6b2 100644
> > --- a/src/compiler/nir/nir.h
> > +++ b/src/compiler/nir/nir.h
> > @@ -2552,6 +2552,8 @@ bool nir_opt_dead_cf(nir_shader *shader);
> > 
> >  bool nir_opt_gcm(nir_shader *shader, bool value_number);
> > 
> > +bool nir_opt_loop_unroll(nir_shader *shader, nir_variable_mode
> > indirect_mask);
> > +
> >  bool nir_opt_peephole_select(nir_shader *shader, unsigned limit);
> > 
> >  bool nir_opt_remove_phis(nir_shader *shader);
> > diff --git a/src/compiler/nir/nir_opt_loop_unroll.c
> > b/src/compiler/nir/nir_opt_loop_unroll.c
> > new file mode 100644
> > index 000..7eb44cb
> > --- /dev/null
> > +++ b/src/compiler/nir/nir_opt_loop_unroll.c
> > @@ -0,0 +1,559 @@
> > +/*
> > + * Copyright © 2016 Intel Corporation
> > + *
> > + * Permission is hereby granted, free of charge, to any person
> > obtaining a
> > + * copy of this software and associated documentation files (the
> > "Software"),
> > + * to deal in the Software without restriction, including without
> > limitation
> > + * the rights to use, copy, modify, merge, publish, distribute,
> > sublicense,
> > + * and/or sell copies of the Software, and to permit persons to
> > whom the
> > + * Software is furnished to do so, subject to the following
> > conditions:
> > + *
> > + * The above copyright notice and this permission notice
> > (including the next
> > + * paragraph) shall be included in all copies or substantial
> > portions of the
> > + * Software.
> > + *
> > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
> > EXPRESS OR
> > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> > MERCHANTABILITY,
> > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO
> > EVENT SHALL
> > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
> > DAMAGES OR OTHER
> > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
> > ARISING
> > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
> > OTHER
> > + * DEALINGS IN THE SOFTWARE.
> > + */
> > +
> > +#include "nir.h"
> > +#include "nir_builder.h"
> > +#include "nir_control_flow.h"
> > +#include "nir_loop_analyze.h"
> > +

[Mesa-dev] [PATCH] Mesa: Fix error code for glTexImage3D in GLES

2016-12-20 Thread Randy Xu

From the OGLES 3.2 spec, Section 8.5 Texture Image Specification, page 158:
 "An INVALID_OPERATION error is generated if a combination of
  values for format, type, and internalformat is specified that is
  not listed as a valid combination in tables 8.2 or 8.3."
It means that TexImage3D should return GL_INVALID_OPERATION if the internal
format is DEPTH_COMPONENT, DEPTH_STENCIL or STENCIL_INDEX.

The current code returns INVALID_ENUM as _mesa_error_check_format_and_type is
also used by glReadPixels and the GL specification defines
 "INVALID_ENUM is generated if format is DEPTH_STENCIL and type is not
  UNSIGNED_INT_24_8 or FLOAT_32_UNSIGNED_INT_24_8_REV".

This patch only impacts GLES, which can generate GL_INVALID_OPERATION because
glReadPixels cannot be used to read depth or stencil buffer.
Fixes dEQP-GLES3.functional.negative_api.texture.teximage3d.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99076
Signed-off-by: Randy Xu 
---
 src/mesa/main/glformats.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/src/mesa/main/glformats.c b/src/mesa/main/glformats.c
index a95909c..3070db9 100644
--- a/src/mesa/main/glformats.c
+++ b/src/mesa/main/glformats.c
@@ -2087,6 +2087,13 @@ _mesa_error_check_format_and_type(const struct 
gl_context *ctx,
  else if (ctx->Extensions.ARB_depth_buffer_float &&
  type == GL_FLOAT_32_UNSIGNED_INT_24_8_REV)
 return GL_NO_ERROR;
+ //From the OpenGL ES 3.2 spec, Section 8.5 Texture Image
+ // Specification, page 158:
+ // An INVALID_OPERATION error is generated if a combination of
+ // values for format, type, and internalformat is specified that
+ // is not listed as a valid combination in tables 8.2 or 8.3.
+ else if (!_mesa_is_desktop_gl(ctx))
+return GL_INVALID_OPERATION;
  else
 return GL_INVALID_ENUM;
 
-- 
2.7.4
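
The branch order this patch proposes for the DEPTH_STENCIL case can be
sketched in isolation. This is only a minimal sketch, not Mesa code: the
sketch_* enums and the function name are hypothetical, and the two bool
parameters stand in for _mesa_is_desktop_gl(ctx) and
ctx->Extensions.ARB_depth_buffer_float.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the GL error codes involved. */
enum sketch_error {
    SKETCH_NO_ERROR,
    SKETCH_INVALID_ENUM,
    SKETCH_INVALID_OPERATION
};

/* Hypothetical stand-ins for the pixel types involved. */
enum sketch_type {
    SKETCH_UNSIGNED_INT_24_8,
    SKETCH_FLOAT_32_UNSIGNED_INT_24_8_REV,
    SKETCH_UNSIGNED_BYTE
};

/* Mirrors the if/else chain in the diff for format == DEPTH_STENCIL:
 * valid types pass, and an invalid type yields INVALID_OPERATION on ES
 * but INVALID_ENUM on desktop GL. */
static enum sketch_error
sketch_check_depth_stencil_type(bool is_desktop_gl,
                                bool has_depth_buffer_float,
                                enum sketch_type type)
{
    if (type == SKETCH_UNSIGNED_INT_24_8)
        return SKETCH_NO_ERROR;
    else if (has_depth_buffer_float &&
             type == SKETCH_FLOAT_32_UNSIGNED_INT_24_8_REV)
        return SKETCH_NO_ERROR;
    else if (!is_desktop_gl)
        return SKETCH_INVALID_OPERATION;   /* branch added by the patch */
    else
        return SKETCH_INVALID_ENUM;        /* previous behaviour */
}
```

With an unsupported type such as SKETCH_UNSIGNED_BYTE, an ES context would
get SKETCH_INVALID_OPERATION while a desktop context would still get
SKETCH_INVALID_ENUM.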


Re: [Mesa-dev] [PATCH v3] compiler/glsl: fix precision problem of tanh

2016-12-20 Thread Roland Scheidegger

On 20.12.2016 at 22:12, Giuseppe Bilotta wrote:
> On Tue, Dec 20, 2016 at 2:17 AM, Matt Turner  wrote:
>> On Mon, Dec 19, 2016 at 5:12 PM, Giuseppe Bilotta
>>  wrote:
>>> Just one question though —not knowing much of the shader language, can
>>> I expect expm1 to be available?
>>
>> No, expm1 doesn't exist in GLSL.
> 
> This is extremely bothersome. Both the (exp(2x)-1)/(exp(2x)+1) and the
> 1-2/(exp(2x)+1) formulas give pretty good results when written
> in terms of expm1.
> 
> On Tue, Dec 20, 2016 at 3:48 AM, Roland Scheidegger  
> wrote:
>> Not sure it really matters though one way or another. If you wanted good
>> accuracy around 0, you'd have to use a different formula plus a select
>> (seems like libm implementations actually use 3 cases depending on input
>> value magnitude - not so hot with vectors, but thankfully glsl doesn't
>> require 1 ULP accuracy).
> 
> Brute-forcing over all floating points on CPU by switching between the
> two formulas above at appropriate thresholds gives a maximum relative
> error of the order of machine epsilon when using expm1, and the switch
> between the two formulas can be implemented with a select on two
> terms. However, this does require expm1.
> 
> Nelson Beebe has a very detailed description of how to achieve very
> accurate results for tanh here
> https://www.math.utah.edu/~beebe/software/ieee/tanh.pdf and the
> results are a bit depressing, in that multiple thresholds are
> necessary. I'm not sure if these are the same used by libm, but in any
> case neither lends itself well to vectorization (in contrast to the
> switch between the two formulas above).
> 
> An alternative approach could be to actually provide a software
> implementation of expm1 and use it to compute tanh. I wouldn't be
> surprised if this would turn out to not be slower than using exp
> itself, in fact.
> 

I'd venture a guess that you cannot beat the exp of the GPUs (exp2
actually, but it doesn't matter). Those are built to be fast (and not
necessarily 100% exact). OK, for some Intel chips which use the famous
mathbox you could perhaps be competitive...
Now for something like llvmpipe, you could be right. I have no idea if
exp or expm1 is more difficult to evaluate. But no one is going to bother
for that case, for an opcode we don't even have any evidence is actually
used somewhere (outside conformance tests). Well, it probably is
somewhere, but it's probably rare enough that it's not exactly an
interesting target for optimization.
So, I guess unless more accuracy around 0 is really needed, there's
really not much point investing time in this.

Roland


Re: [Mesa-dev] [PATCH 04/12] nir: Add a LCSAA-pass

2016-12-20 Thread Timothy Arceri

On Tue, 2016-12-20 at 16:34 -0800, Jason Ekstrand wrote:
> On Tue, Dec 20, 2016 at 3:36 PM, Timothy Arceri wrote:
> > On Tue, 2016-12-20 at 15:14 -0800, Jason Ekstrand wrote:
> > > I did have a couple of "real" comments on this one that I'd like
> > to
> > > at least see a reply to.  Does look pretty good though.
> > >
> > > On Sun, Dec 18, 2016 at 9:47 PM, Timothy Arceri wrote:
> > > > From: Thomas Helland 
> > > >
> > > > V2: Do a "depth first search" to convert to LCSSA
> > > >
> > > > V3: Small comment fixup
> > > >
> > > > V4: Rebase, adapt to removal of function overloads
> > > >
> > > > V5: Rebase, adapt to relocation of nir to compiler/nir
> > > >     Still need to adapt to potential if-uses
> > > >     Work around nir_validate issue
> > > >
> > > > V6 (Timothy):
> > > >  - tidy lcssa and stop leaking memory
> > > >  - don't rewrite the src for the lcssa phi node
> > > >  - validate lcssa phi srcs to avoid postvalidate assert
> > > >  - don't add new phi if one already exists
> > > >  - more lcssa phi validation fixes
> > > >  - Rather than marking ssa defs inside a loop just mark blocks
> > > > inside
> > > >    a loop. This is simpler and fixes lcssa for intrinsics which
> > do
> > > >    not have a destination.
> > > >  - don't create LCSSA phis for loops we won't unroll
> > > >  - require loop metadata for lcssa pass
> > > >  - handle case where the ssa def's use outside the loop is
> > > >    already a phi
> > > >
> > > > V7: (Timothy)
> > > > - pass indirect mask to metadata call
> > > >
> > > > v8: (Timothy)
> > > > - make convert to lcssa a helper function rather than a nir
> > pass
> > > > - replace inside loop bitset with on the fly block index logic.
> > > > - remove lcssa phi validation special cases
> > > > - inline code from useless helpers, suggested by Jason.
> > > > - always do lcssa on loops, suggested by Jason.
> > > > - stop making lcssa phis special. Add as many source as the
> > block
> > > >   has predecessors, suggested by Jason.
> > > >
> > > > V9: (Timothy)
> > > > - fix regression with the is_lcssa_phi field not being
> > initialised
> > > >   to false now that ralloc() doesn't zero out memory.
> > > >
> > > > V10: (Timothy)
> > > > - remove extra braces in SSA example, pointed out by Topi
> > > >
> > > > V11: (Timothy)
> > > > - add missing support for LCSSA phis in if conditions.
> > > > ---
> > > >  src/compiler/Makefile.sources   |   1 +
> > > >  src/compiler/nir/nir.c          |   1 +
> > > >  src/compiler/nir/nir.h          |   4 +
> > > >  src/compiler/nir/nir_to_lcssa.c | 215
> > > > 
> > > >  4 files changed, 221 insertions(+)
> > > >  create mode 100644 src/compiler/nir/nir_to_lcssa.c
> > > >
> > > > diff --git a/src/compiler/Makefile.sources
> > > > b/src/compiler/Makefile.sources
> > > > index ca8a056..e8f7b02 100644
> > > > --- a/src/compiler/Makefile.sources
> > > > +++ b/src/compiler/Makefile.sources
> > > > @@ -254,6 +254,7 @@ NIR_FILES = \
> > > >         nir/nir_split_var_copies.c \
> > > >         nir/nir_sweep.c \
> > > >         nir/nir_to_ssa.c \
> > > > +       nir/nir_to_lcssa.c \
> > > >         nir/nir_validate.c \
> > > >         nir/nir_vla.h \
> > > >         nir/nir_worklist.c \
> > > > diff --git a/src/compiler/nir/nir.c b/src/compiler/nir/nir.c
> > > > index 2c3531c..e522a67 100644
> > > > --- a/src/compiler/nir/nir.c
> > > > +++ b/src/compiler/nir/nir.c
> > > > @@ -561,6 +561,7 @@ nir_phi_instr_create(nir_shader *shader)
> > > >  {
> > > >     nir_phi_instr *instr = ralloc(shader, nir_phi_instr);
> > > >     instr_init(&instr->instr, nir_instr_type_phi);
> > > > +   instr->is_lcssa_phi = false;
> > > >
> > > >     dest_init(&instr->dest);
> > > >     exec_list_make_empty(&instr->srcs);
> > > > diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
> > > > index 28010aa..75a91ea 100644
> > > > --- a/src/compiler/nir/nir.h
> > > > +++ b/src/compiler/nir/nir.h
> > > > @@ -1360,6 +1360,8 @@ typedef struct {
> > > >     struct exec_list srcs; /** < list of nir_phi_src */
> > > >
> > > >     nir_dest dest;
> > > > +
> > > > +   bool is_lcssa_phi;
> > > >  } nir_phi_instr;
> > > >
> > > >  typedef struct {
> > > > @@ -2526,6 +2528,8 @@ void nir_convert_to_ssa(nir_shader
> > *shader);
> > > >  bool nir_repair_ssa_impl(nir_function_impl *impl);
> > > >  bool nir_repair_ssa(nir_shader *shader);
> > > >
> > > > +void nir_convert_loop_to_lcssa(nir_loop *loop);
> > > > +
> > > >  /* If phi_webs_only is true, only convert SSA values involved
> > in
> > > > phi nodes to
> > > >   * registers.  If false, convert all values (even those not
> > > > involved in a phi
> > > >   * node) to registers.
> > > > diff --git a/src/compiler/nir/nir_to_lcssa.c
> > > > b/src/compiler/nir/nir_to_lcssa.c
> > > > new file mode 100644
> > > > index 000..8afdc54
> > > > --- /dev/null
> > > > +++ b/src/compiler/nir/nir_to_lcssa.c
> > > > @@ -0,0

Re: [Mesa-dev] [Mesa-stable] [PATCH] Mesa: Fix error code for glTexImage3D in GLES

2016-12-20 Thread Xu, Randy

Thanks, Chad

I will update the patch per your suggestion. 

-Original Message-
From: Chad Versace [mailto:chadvers...@chromium.org] 
Sent: Wednesday, December 21, 2016 5:05 AM
To: Xu, Randy 
Cc: Ian Romanick ; mesa-dev@lists.freedesktop.org; 
mesa-sta...@lists.freedesktop.org
Subject: Re: [Mesa-stable] [Mesa-dev] [PATCH] Mesa: Fix error code for 
glTexImage3D in GLES

On Mon 19 Dec 2016, Xu, Randy wrote:
> Hi, Chad & Ian
> 
> Thanks for your suggestion. I understand and agree with your point.
> However, texsubimage_error_check (in teximage.c) calls
> _mesa_error_check_format_and_type first, and if an error occurs, it
> returns immediately (at line 2175) without calling
> texture_format_error_check_gles (at line 2184). That is why I wrote the
> patch this way.
>
> Following your suggestion, we had better move
> texture_format_error_check_gles ahead of
> _mesa_error_check_format_and_type, i.e. handle the GLES checks first. Do
> you agree with that?

I'm afraid to move texture_format_error_check_gles() ahead of 
_mesa_error_check_format_and_type(). That may be the correct thing to do, but 
my understanding of this error-checking code is weak and I'm afraid of the 
unintended consequences of moving it. If someone more familiar with this code 
claims that it's safe to move the check, then moving it is ok with me.

So... please ignore my complaint. The check should remain in
_mesa_error_check_format_and_type(), unless someone has a better suggestion.
That function already contains some GLES-specific checks, so there is no harm
in adding yet another.

I have more comments below.

> -----Original Message-----
> From: Ian Romanick [mailto:i...@freedesktop.org]
> Sent: Saturday, December 17, 2016 6:02 AM
> To: Chad Versace ; Xu, Randy 
> ; mesa-dev@lists.freedesktop.org; 
> mesa-sta...@lists.freedesktop.org; x...@freedesktop.org
> Subject: Re: [Mesa-dev] [PATCH] Mesa: Fix error code for glTexImage3D 
> in GLES
> 
> On 12/16/2016 12:49 PM, Chad Versace wrote:
> > On Fri 16 Dec 2016, Chad Versace wrote:
> >> On Fri 16 Dec 2016, Randy Xu wrote:
> >>> From: "Xu,Randy" 
> >>>
> >>> The ES specification says that TexImage3D should return 
> >>> GL_INVALID_OPERATION if the internal format is DEPTH_COMPONENT, 
> >>> DEPTH_STENCIL or STENCIL_INDEX.

The above is true. The ES spec says that. However, the GL spec says the same 
thing. See page 158 in the GLES 3.2 spec and page 194 in the GL 4.5 spec. So, I 
believe referring to that text in the spec in the commit message is misleading 
because it is unrelated to the test failure.

It's also misleading because the patch doesn't update any 
glTexImage3D code. The patch updates dimension-independent code that affects 
glTexImage1D, glTexImage2D, and glTexImage3D.

I inspected the test results more closely, with and without the patch.
The debug output differs on a single line, marked with '***'. From the debug 
info, it seems that one of the several causes for the test failure is that 
_mesa_error_check_format_and_type() gets called, and emits the wrong error, 
before Mesa rejects GL_DEPTH_STENCIL as an invalid target for glTexImage3D with 
GL_INVALID_OPERATION.
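
The ordering problem described above can be illustrated with a minimal 
sketch. Every name below is a hypothetical stand-in, not Mesa's actual code: 
a generic format/type check that runs first and returns early hides the 
error that a later, API-specific check would have reported.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; not Mesa's types or functions. */
enum sketch_error { SK_NO_ERROR, SK_INVALID_ENUM, SK_INVALID_OPERATION };
enum sketch_format { SK_RGBA, SK_DEPTH_STENCIL };
enum sketch_type { SK_UNSIGNED_BYTE, SK_UNSIGNED_INT_24_8 };

/* Generic check: the format/type pair must be compatible. */
enum sketch_error
sketch_check_format_and_type(enum sketch_format format, enum sketch_type type)
{
   if (format == SK_DEPTH_STENCIL && type != SK_UNSIGNED_INT_24_8)
      return SK_INVALID_ENUM;
   return SK_NO_ERROR;
}

/* API-specific check: depth/stencil formats are rejected for 3D uploads. */
enum sketch_error
sketch_check_3d_format(enum sketch_format format)
{
   if (format == SK_DEPTH_STENCIL)
      return SK_INVALID_OPERATION;
   return SK_NO_ERROR;
}

/* Whichever check runs first decides the error the caller sees. */
enum sketch_error
sketch_teximage3d_error(enum sketch_format format, enum sketch_type type,
                        bool api_check_first)
{
   enum sketch_error err;

   /* Sanity check: the sketch only knows these two formats. */
   assert(format == SK_RGBA || format == SK_DEPTH_STENCIL);

   if (api_check_first) {
      err = sketch_check_3d_format(format);
      if (err != SK_NO_ERROR)
         return err;
      return sketch_check_format_and_type(format, type);
   }

   err = sketch_check_format_and_type(format, type);
   if (err != SK_NO_ERROR)
      return err;   /* early return: the API-specific check never runs */
   return sketch_check_3d_format(format);
}
```

With the generic check first, the (SK_DEPTH_STENCIL, SK_UNSIGNED_BYTE) pair 
yields SK_INVALID_ENUM; with the API-specific check first, the same call 
yields SK_INVALID_OPERATION, which is the error the test expects.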

Without patch:

Test case 'dEQP-GLES3.functional.negative_api.texture.teximage3d'..
Mesa: User error: GL_INVALID_ENUM in glTexImage3D(target=GL_NO_ERROR)
Mesa: User error: GL_INVALID_ENUM in glTexImage3D(target=GL_TEXTURE_2D)
Mesa: User error: GL_INVALID_ENUM in glTexImage3D(incompatible format = 
GL_RGBA, type = GL_NO_ERROR)
Mesa: User error: GL_INVALID_ENUM in glTexImage3D(incompatible format = 
GL_NO_ERROR, type = GL_UNSIGNED_BYTE)
Mesa: User error: GL_INVALID_VALUE in 
glTexImage3D(internalFormat=GL_NO_ERROR)
 ***Mesa: User error: GL_INVALID_ENUM in glTexImage3D(incompatible format = 
GL_DEPTH_STENCIL, type = GL_UNSIGNED_BYTE)
Mesa: User error: GL_INVALID_OPERATION in glTexImage%dD(format = 
GL_DEPTH_COMPONENT, type = GL_UNSIGNED_BYTE, internalformat = GL_RGBA)
Mesa: User error: GL_INVALID_OPERATION in glTexImage%dD(format = GL_RGBA, 
type = GL_UNSIGNED_BYTE, internalformat = GL_RGB)
Mesa: User error: GL_INVALID_OPERATION in glTexImage3D(incompatible format 
= GL_RGB, type = GL_UNSIGNED_SHORT_4_4_4_4)
Mesa: User error: GL_INVALID_OPERATION in glTexImage3D(incompatible format 
= GL_RGB, type = GL_UNSIGNED_SHORT_5_5_5_1)
Mesa: User error: GL_INVALID_OPERATION in glTexImage%dD(format = GL_RGB, 
type = GL_UNSIGNED_INT_2_10_10_10_REV, internalformat = GL_RGB10_A2)
Mesa: User error: GL_INVALID_OPERATION in glTexImage%dD(format = 
GL_RGBA_INTEGER, type = GL_INT, internalformat = GL_RGBA32UI)
Test case duration in microseconds = 691466 us
  Fail (Got invalid error)

DONE!

Test run totals:
  Passed:0/1 (0.0%)
  Failed:1/1 (100.0%)
  Not supported: 0/1 (0.0%)
  Warnings:  0/1 (0.0%)

With patch:

Test case

Re: [Mesa-dev] [PATCH 04/12] nir: Add a LCSAA-pass

2016-12-20 Thread Jason Ekstrand

On Tue, Dec 20, 2016 at 3:36 PM, Timothy Arceri <
timothy.arc...@collabora.com> wrote:

> On Tue, 2016-12-20 at 15:14 -0800, Jason Ekstrand wrote:
> > I did have a couple of "real" comments on this one that I'd like to
> > at least see a reply to.  Does look pretty good though.
> >
> > On Sun, Dec 18, 2016 at 9:47 PM, Timothy Arceri <
> > timothy.arc...@collabora.com> wrote:
> > > From: Thomas Helland 
> > >
> > > V2: Do a "depth first search" to convert to LCSSA
> > >
> > > V3: Small comment fixup
> > >
> > > V4: Rebase, adapt to removal of function overloads
> > >
> > > V5: Rebase, adapt to relocation of nir to compiler/nir
> > > Still need to adapt to potential if-uses
> > > Work around nir_validate issue
> > >
> > > V6 (Timothy):
> > >  - tidy lcssa and stop leaking memory
> > > >  - don't rewrite the src for the lcssa phi node
> > >  - validate lcssa phi srcs to avoid postvalidate assert
> > >  - don't add new phi if one already exists
> > >  - more lcssa phi validation fixes
> > >  - Rather than marking ssa defs inside a loop just mark blocks
> > > inside
> > >a loop. This is simpler and fixes lcssa for intrinsics which do
> > >not have a destination.
> > >  - don't create LCSSA phis for loops we won't unroll
> > >  - require loop metadata for lcssa pass
> > > >  - handle case where the ssa defs use outside the loop is already a
> > > > phi
> > >
> > > V7: (Timothy)
> > > - pass indirect mask to metadata call
> > >
> > > v8: (Timothy)
> > > - make convert to lcssa a helper function rather than a nir pass
> > > - replace inside loop bitset with on the fly block index logic.
> > > - remove lcssa phi validation special cases
> > > - inline code from useless helpers, suggested by Jason.
> > > - always do lcssa on loops, suggested by Jason.
> > > > - stop making lcssa phis special. Add as many sources as the block
> > >   has predecessors, suggested by Jason.
> > >
> > > V9: (Timothy)
> > > - fix regression with the is_lcssa_phi field not being initialised
> > >   to false now that ralloc() doesn't zero out memory.
> > >
> > > V10: (Timothy)
> > > - remove extra braces in SSA example, pointed out by Topi
> > >
> > > V11: (Timothy)
> > > - add missing support for LCSSA phis in if conditions.
> > > ---
> > >  src/compiler/Makefile.sources   |   1 +
> > >  src/compiler/nir/nir.c  |   1 +
> > >  src/compiler/nir/nir.h  |   4 +
> > >  src/compiler/nir/nir_to_lcssa.c | 215
> > > 
> > >  4 files changed, 221 insertions(+)
> > >  create mode 100644 src/compiler/nir/nir_to_lcssa.c
> > >
> > > diff --git a/src/compiler/Makefile.sources
> > > b/src/compiler/Makefile.sources
> > > index ca8a056..e8f7b02 100644
> > > --- a/src/compiler/Makefile.sources
> > > +++ b/src/compiler/Makefile.sources
> > > @@ -254,6 +254,7 @@ NIR_FILES = \
> > > nir/nir_split_var_copies.c \
> > > nir/nir_sweep.c \
> > > nir/nir_to_ssa.c \
> > > +   nir/nir_to_lcssa.c \
> > > nir/nir_validate.c \
> > > nir/nir_vla.h \
> > > nir/nir_worklist.c \
> > > diff --git a/src/compiler/nir/nir.c b/src/compiler/nir/nir.c
> > > index 2c3531c..e522a67 100644
> > > --- a/src/compiler/nir/nir.c
> > > +++ b/src/compiler/nir/nir.c
> > > @@ -561,6 +561,7 @@ nir_phi_instr_create(nir_shader *shader)
> > >  {
> > > nir_phi_instr *instr = ralloc(shader, nir_phi_instr);
> > > > instr_init(&instr->instr, nir_instr_type_phi);
> > > > +   instr->is_lcssa_phi = false;
> > > >
> > > > dest_init(&instr->dest);
> > > > exec_list_make_empty(&instr->srcs);
> > > diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
> > > index 28010aa..75a91ea 100644
> > > --- a/src/compiler/nir/nir.h
> > > +++ b/src/compiler/nir/nir.h
> > > @@ -1360,6 +1360,8 @@ typedef struct {
> > > struct exec_list srcs; /** < list of nir_phi_src */
> > >
> > > nir_dest dest;
> > > +
> > > +   bool is_lcssa_phi;
> > >  } nir_phi_instr;
> > >
> > >  typedef struct {
> > > @@ -2526,6 +2528,8 @@ void nir_convert_to_ssa(nir_shader *shader);
> > >  bool nir_repair_ssa_impl(nir_function_impl *impl);
> > >  bool nir_repair_ssa(nir_shader *shader);
> > >
> > > +void nir_convert_loop_to_lcssa(nir_loop *loop);
> > > +
> > >  /* If phi_webs_only is true, only convert SSA values involved in
> > > phi nodes to
> > >   * registers.  If false, convert all values (even those not
> > > involved in a phi
> > >   * node) to registers.
> > > diff --git a/src/compiler/nir/nir_to_lcssa.c
> > > b/src/compiler/nir/nir_to_lcssa.c
> > > new file mode 100644
> > > index 000..8afdc54
> > > --- /dev/null
> > > +++ b/src/compiler/nir/nir_to_lcssa.c
> > > @@ -0,0 +1,215 @@
> > > +/*
> > > + * Copyright © 2015 Thomas Helland
> > > + *
> > > + * Permission is hereby granted, free of charge, to any person
> > > obtaining a
> > > + * copy of this software and associated documentation files (the
> > > "Software"),
> > > + * to deal in the Software without restriction, including without
> > >

Re: [Mesa-dev] [PATCH 09/12] nir: add a loop unrolling pass

2016-12-20 Thread Jason Ekstrand

On Sun, Dec 18, 2016 at 9:47 PM, Timothy Arceri <
timothy.arc...@collabora.com> wrote:

> V2:
> - tidy ups suggested by Connor.
> - tidy up cloning logic and handle copy propagation
>   based on suggestion by Connor.
> - use nir_ssa_def_rewrite_uses to fix up lcssa phis
>   suggested by Connor.
> - add support for complex loop unrolling (two terminators)
> - handle case where the ssa defs use outside the loop is already a phi
> - support unrolling loops with multiple terminators when trip count
>   is known for each terminator
>
> V3:
> - set correct num_components when creating phi in complex unroll
> - rewrite update remap table based on Jason's suggestions.
> - remove unrequired extract_loop_body() helper as suggested by Jason.
> - simplify the lcssa phi fix up code for simple loops as per Jason's
>   suggestions.
> - use mem context to keep track of hash table memory as suggested by Jason.
> - move is_{complex,simple}_loop helpers to the unroll code
> - require nir_metadata_block_index
> - partially rewrote complex unroll to be simpler and easier to follow.
>
> V4:
> - use rzalloc() when creating nir_phi_src but not setting pred right away;
>   fixes regression caused by ralloc() no longer zeroing memory.
>
> V5:
> - simplify calling of complex_unroll()
> - use new loop terminator fields to get the break/continue from blocks
>   and simplify loop unrolling code
> - handle slightly less trivial loop terminators. if branches can
>   now have instructions but can only contain a single block.
> - use nir print type IR snippets in unroll function descriptions
> - add better explanation and variable for why we need to clone
>   additional times when the second terminator is the limiting
>   terminator.
> - partially convert out of ssa before unrolling loops (suggested by Jason)
> ---
>  src/compiler/Makefile.sources  |   1 +
>  src/compiler/nir/nir.h |   2 +
>  src/compiler/nir/nir_opt_loop_unroll.c | 559
> +
>  3 files changed, 562 insertions(+)
>  create mode 100644 src/compiler/nir/nir_opt_loop_unroll.c
>
> diff --git a/src/compiler/Makefile.sources b/src/compiler/Makefile.sources
> index e8f7b02..ae3e5f0 100644
> --- a/src/compiler/Makefile.sources
> +++ b/src/compiler/Makefile.sources
> @@ -239,6 +239,7 @@ NIR_FILES = \
> nir/nir_opt_dead_cf.c \
> nir/nir_opt_gcm.c \
> nir/nir_opt_global_to_local.c \
> +   nir/nir_opt_loop_unroll.c \
> nir/nir_opt_peephole_select.c \
> nir/nir_opt_remove_phis.c \
> nir/nir_opt_undef.c \
> diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
> index 75a91ea..51bc6b2 100644
> --- a/src/compiler/nir/nir.h
> +++ b/src/compiler/nir/nir.h
> @@ -2552,6 +2552,8 @@ bool nir_opt_dead_cf(nir_shader *shader);
>
>  bool nir_opt_gcm(nir_shader *shader, bool value_number);
>
> +bool nir_opt_loop_unroll(nir_shader *shader, nir_variable_mode
> indirect_mask);
> +
>  bool nir_opt_peephole_select(nir_shader *shader, unsigned limit);
>
>  bool nir_opt_remove_phis(nir_shader *shader);
> diff --git a/src/compiler/nir/nir_opt_loop_unroll.c
> b/src/compiler/nir/nir_opt_loop_unroll.c
> new file mode 100644
> index 000..7eb44cb
> --- /dev/null
> +++ b/src/compiler/nir/nir_opt_loop_unroll.c
> @@ -0,0 +1,559 @@
> +/*
> + * Copyright © 2016 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the
> "Software"),
> + * to deal in the Software without restriction, including without
> limitation
> + * the rights to use, copy, modify, merge, publish, distribute,
> sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the
> next
> + * paragraph) shall be included in all copies or substantial portions of
> the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
> EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT
> SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
> OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> + * DEALINGS IN THE SOFTWARE.
> + */
> +
> +#include "nir.h"
> +#include "nir_builder.h"
> +#include "nir_control_flow.h"
> +#include "nir_loop_analyze.h"
> +
> +/* Convert all phis in the give block to regs, here we insert a mov in the
> + * pred block of the phi source to copy the src to the reg, then we
> rewrite
> + * all uses of the phi to the new reg.
> + */
> +static void
> +convert_phis_to_regs(nir_builder *b, nir_block *block)
> +{
>

In patch 1/6 of the series I sent yesterday I added a

Re: [Mesa-dev] [PATCH 04/12] nir: Add a LCSAA-pass

2016-12-20 Thread Timothy Arceri

On Tue, 2016-12-20 at 15:14 -0800, Jason Ekstrand wrote:
> I did have a couple of "real" comments on this one that I'd like to
> at least see a reply to.  Does look pretty good though.
> 
> > On Sun, Dec 18, 2016 at 9:47 PM, Timothy Arceri <
> > timothy.arc...@collabora.com> wrote:
> > From: Thomas Helland 
> > 
> > V2: Do a "depth first search" to convert to LCSSA
> > 
> > V3: Small comment fixup
> > 
> > V4: Rebase, adapt to removal of function overloads
> > 
> > V5: Rebase, adapt to relocation of nir to compiler/nir
> >     Still need to adapt to potential if-uses
> >     Work around nir_validate issue
> > 
> > V6 (Timothy):
> >  - tidy lcssa and stop leaking memory
> >  - don't rewrite the src for the lcssa phi node
> >  - validate lcssa phi srcs to avoid postvalidate assert
> >  - don't add new phi if one already exists
> >  - more lcssa phi validation fixes
> >  - Rather than marking ssa defs inside a loop just mark blocks
> > inside
> >    a loop. This is simpler and fixes lcssa for intrinsics which do
> >    not have a destination.
> >  - don't create LCSSA phis for loops we won't unroll
> >  - require loop metadata for lcssa pass
> >  - handle case where the ssa defs use outside the loop is already a
> > phi
> > 
> > V7: (Timothy)
> > - pass indirect mask to metadata call
> > 
> > v8: (Timothy)
> > - make convert to lcssa a helper function rather than a nir pass
> > - replace inside loop bitset with on the fly block index logic.
> > - remove lcssa phi validation special cases
> > - inline code from useless helpers, suggested by Jason.
> > - always do lcssa on loops, suggested by Jason.
> > - stop making lcssa phis special. Add as many sources as the block
> >   has predecessors, suggested by Jason.
> > 
> > V9: (Timothy)
> > - fix regression with the is_lcssa_phi field not being initialised
> >   to false now that ralloc() doesn't zero out memory.
> > 
> > V10: (Timothy)
> > - remove extra braces in SSA example, pointed out by Topi
> > 
> > V11: (Timothy)
> > - add missing support for LCSSA phis in if conditions.
> > ---
> >  src/compiler/Makefile.sources   |   1 +
> >  src/compiler/nir/nir.c          |   1 +
> >  src/compiler/nir/nir.h          |   4 +
> >  src/compiler/nir/nir_to_lcssa.c | 215
> > 
> >  4 files changed, 221 insertions(+)
> >  create mode 100644 src/compiler/nir/nir_to_lcssa.c
> > 
> > diff --git a/src/compiler/Makefile.sources
> > b/src/compiler/Makefile.sources
> > index ca8a056..e8f7b02 100644
> > --- a/src/compiler/Makefile.sources
> > +++ b/src/compiler/Makefile.sources
> > @@ -254,6 +254,7 @@ NIR_FILES = \
> >         nir/nir_split_var_copies.c \
> >         nir/nir_sweep.c \
> >         nir/nir_to_ssa.c \
> > +       nir/nir_to_lcssa.c \
> >         nir/nir_validate.c \
> >         nir/nir_vla.h \
> >         nir/nir_worklist.c \
> > diff --git a/src/compiler/nir/nir.c b/src/compiler/nir/nir.c
> > index 2c3531c..e522a67 100644
> > --- a/src/compiler/nir/nir.c
> > +++ b/src/compiler/nir/nir.c
> > @@ -561,6 +561,7 @@ nir_phi_instr_create(nir_shader *shader)
> >  {
> >     nir_phi_instr *instr = ralloc(shader, nir_phi_instr);
> >     instr_init(&instr->instr, nir_instr_type_phi);
> > +   instr->is_lcssa_phi = false;
> > 
> >     dest_init(&instr->dest);
> >     exec_list_make_empty(&instr->srcs);
> > diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
> > index 28010aa..75a91ea 100644
> > --- a/src/compiler/nir/nir.h
> > +++ b/src/compiler/nir/nir.h
> > @@ -1360,6 +1360,8 @@ typedef struct {
> >     struct exec_list srcs; /** < list of nir_phi_src */
> > 
> >     nir_dest dest;
> > +
> > +   bool is_lcssa_phi;
> >  } nir_phi_instr;
> > 
> >  typedef struct {
> > @@ -2526,6 +2528,8 @@ void nir_convert_to_ssa(nir_shader *shader);
> >  bool nir_repair_ssa_impl(nir_function_impl *impl);
> >  bool nir_repair_ssa(nir_shader *shader);
> > 
> > +void nir_convert_loop_to_lcssa(nir_loop *loop);
> > +
> >  /* If phi_webs_only is true, only convert SSA values involved in
> > phi nodes to
> >   * registers.  If false, convert all values (even those not
> > involved in a phi
> >   * node) to registers.
> > diff --git a/src/compiler/nir/nir_to_lcssa.c
> > b/src/compiler/nir/nir_to_lcssa.c
> > new file mode 100644
> > index 000..8afdc54
> > --- /dev/null
> > +++ b/src/compiler/nir/nir_to_lcssa.c
> > @@ -0,0 +1,215 @@
> > +/*
> > + * Copyright © 2015 Thomas Helland
> > + *
> > + * Permission is hereby granted, free of charge, to any person
> > obtaining a
> > + * copy of this software and associated documentation files (the
> > "Software"),
> > + * to deal in the Software without restriction, including without
> > limitation
> > + * the rights to use, copy, modify, merge, publish, distribute,
> > sublicense,
> > + * and/or sell copies of the Software, and to permit persons to
> > whom the
> > + * Software is furnished to do so, subject to the following
> > conditions:
> > + *
> > + * The above copyright notice and this permission notice
> >

Re: [Mesa-dev] [PATCH 04/12] nir: Add a LCSAA-pass

2016-12-20 Thread Jason Ekstrand

I did have a couple of "real" comments on this one that I'd like to at
least see a reply to.  Does look pretty good though.

On Sun, Dec 18, 2016 at 9:47 PM, Timothy Arceri <
timothy.arc...@collabora.com> wrote:

> From: Thomas Helland 
>
> V2: Do a "depth first search" to convert to LCSSA
>
> V3: Small comment fixup
>
> V4: Rebase, adapt to removal of function overloads
>
> V5: Rebase, adapt to relocation of nir to compiler/nir
> Still need to adapt to potential if-uses
> Work around nir_validate issue
>
> V6 (Timothy):
>  - tidy lcssa and stop leaking memory
>  - don't rewrite the src for the lcssa phi node
>  - validate lcssa phi srcs to avoid postvalidate assert
>  - don't add new phi if one already exists
>  - more lcssa phi validation fixes
>  - Rather than marking ssa defs inside a loop just mark blocks inside
>a loop. This is simpler and fixes lcssa for intrinsics which do
>not have a destination.
>  - don't create LCSSA phis for loops we won't unroll
>  - require loop metadata for lcssa pass
>  - handle case where the ssa defs use outside the loop is already a phi
>
> V7: (Timothy)
> - pass indirect mask to metadata call
>
> v8: (Timothy)
> - make convert to lcssa a helper function rather than a nir pass
> - replace inside loop bitset with on the fly block index logic.
> - remove lcssa phi validation special cases
> - inline code from useless helpers, suggested by Jason.
> - always do lcssa on loops, suggested by Jason.
> - stop making lcssa phis special. Add as many sources as the block
>   has predecessors, suggested by Jason.
>
> V9: (Timothy)
> - fix regression with the is_lcssa_phi field not being initialised
>   to false now that ralloc() doesn't zero out memory.
>
> V10: (Timothy)
> - remove extra braces in SSA example, pointed out by Topi
>
> V11: (Timothy)
> - add missing support for LCSSA phis in if conditions.
> ---
>  src/compiler/Makefile.sources   |   1 +
>  src/compiler/nir/nir.c  |   1 +
>  src/compiler/nir/nir.h  |   4 +
>  src/compiler/nir/nir_to_lcssa.c | 215 ++
> ++
>  4 files changed, 221 insertions(+)
>  create mode 100644 src/compiler/nir/nir_to_lcssa.c
>
> diff --git a/src/compiler/Makefile.sources b/src/compiler/Makefile.sources
> index ca8a056..e8f7b02 100644
> --- a/src/compiler/Makefile.sources
> +++ b/src/compiler/Makefile.sources
> @@ -254,6 +254,7 @@ NIR_FILES = \
> nir/nir_split_var_copies.c \
> nir/nir_sweep.c \
> nir/nir_to_ssa.c \
> +   nir/nir_to_lcssa.c \
> nir/nir_validate.c \
> nir/nir_vla.h \
> nir/nir_worklist.c \
> diff --git a/src/compiler/nir/nir.c b/src/compiler/nir/nir.c
> index 2c3531c..e522a67 100644
> --- a/src/compiler/nir/nir.c
> +++ b/src/compiler/nir/nir.c
> @@ -561,6 +561,7 @@ nir_phi_instr_create(nir_shader *shader)
>  {
> nir_phi_instr *instr = ralloc(shader, nir_phi_instr);
> instr_init(&instr->instr, nir_instr_type_phi);
> +   instr->is_lcssa_phi = false;
>
> dest_init(&instr->dest);
> exec_list_make_empty(&instr->srcs);
> diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
> index 28010aa..75a91ea 100644
> --- a/src/compiler/nir/nir.h
> +++ b/src/compiler/nir/nir.h
> @@ -1360,6 +1360,8 @@ typedef struct {
> struct exec_list srcs; /** < list of nir_phi_src */
>
> nir_dest dest;
> +
> +   bool is_lcssa_phi;
>  } nir_phi_instr;
>
>  typedef struct {
> @@ -2526,6 +2528,8 @@ void nir_convert_to_ssa(nir_shader *shader);
>  bool nir_repair_ssa_impl(nir_function_impl *impl);
>  bool nir_repair_ssa(nir_shader *shader);
>
> +void nir_convert_loop_to_lcssa(nir_loop *loop);
> +
>  /* If phi_webs_only is true, only convert SSA values involved in phi
> nodes to
>   * registers.  If false, convert all values (even those not involved in a
> phi
>   * node) to registers.
> diff --git a/src/compiler/nir/nir_to_lcssa.c b/src/compiler/nir/nir_to_
> lcssa.c
> new file mode 100644
> index 000..8afdc54
> --- /dev/null
> +++ b/src/compiler/nir/nir_to_lcssa.c
> @@ -0,0 +1,215 @@
> +/*
> + * Copyright © 2015 Thomas Helland
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the
> "Software"),
> + * to deal in the Software without restriction, including without
> limitation
> + * the rights to use, copy, modify, merge, publish, distribute,
> sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the
> next
> + * paragraph) shall be included in all copies or substantial portions of
> the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
> EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN

Re: [Mesa-dev] [PATCH 03/12] nir: Add a loop analysis pass

2016-12-20 Thread Jason Ekstrand

I made a bunch more comments but they're all cosmetic.  The one
non-cosmetic thing I'd like to see changed before we merge is that we fix
the case where the break is in the else.  Feel free to grab the tip of my
jenkins_vulkan branch and squash it in if you like the approach I took.  Or
you can do something slightly different so long as it has the same effect.

Reviewed-by: Jason Ekstrand 

--Jason

On Sun, Dec 18, 2016 at 9:47 PM, Timothy Arceri <
timothy.arc...@collabora.com> wrote:

> From: Thomas Helland 
>
> This pass detects induction variables and calculates the
> trip count of loops to be used for loop unrolling.
>
> I've removed support for float induction values for now, for the
> simple reason that they don't appear in my shader-db collection,
> and so I don't see it as common enough that we want to pollute the
> pass with this in the initial version.
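
As a rough illustration of what "trip count" means for the simplest kind of 
loop, here is a minimal sketch. It is not the NIR pass: the function name is 
hypothetical, it covers only a loop of the form 
for (i = init; i < limit; i += step) with integer values, and it uses a 
closed-form ceiling division rather than whatever the pass does internally.

```c
#include <assert.h>

/*
 * Hypothetical helper, not part of NIR. Computes how many times the body of
 *    for (i = init; i < limit; i += step)
 * executes. Returns -1 when this simple rule cannot give an answer.
 * Assumes the values are small enough that the arithmetic does not overflow.
 */
long long
sketch_trip_count(long long init, long long limit, long long step)
{
   long long n;

   if (step <= 0)
      return -1;   /* not a simple upward-counting loop */
   if (init >= limit)
      return 0;    /* the condition fails before the first iteration */

   /* ceil((limit - init) / step) in integer arithmetic */
   n = (limit - init + step - 1) / step;

   /* The last executed iteration is still below the limit, and the next
    * value of the induction variable is not. */
   assert(init + (n - 1) * step < limit);
   assert(init + n * step >= limit);
   return n;
}
```

For example, init = 0, limit = 10, step = 3 visits i = 0, 3, 6, 9, so the 
sketch returns 4. A known trip count of this kind is what allows a loop to be 
fully unrolled.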
>
> V2: Rebase, adapt to removal of function overloads
>
> V3: (Timothy Arceri)
>  - don't try to find trip count if loop terminator conditional is a phi
>  - fix trip count for do-while loops
>  - replace conditional type != alu assert with return
>  - disable unrolling of loops with continues
>  - multiple fixes to memory allocation, stop leaking and don't destroy
>structs we want to use for unrolling.
>  - fix iteration count bugs when induction var not on RHS of condition
>  - add FIXME for && conditions
>  - calculate trip count for unsigned induction/limit vars
>
> V4: (Timothy Arceri)
> - count instructions in a loop
> - set the limiting_terminator even if we can't find the trip count for
>  all terminators. This is needed for complex unrolling where we handle
>  2 terminators and the trip count is unknown for one of them.
> - restructure structs so we don't keep information not required after
>  analysis and remove dead fields.
> - force unrolling in some cases as per the rules in the GLSL IR pass
>
> V5: (Timothy Arceri)
> - fix metadata mask value 0x10 vs 0x16
>
> V6: (Timothy Arceri)
> - merge loop_variable and nir_loop_variable structs and lists, suggested by
>   Jason.
> - remove induction var hash table and store pointer to induction
>   information in the loop_variable, suggested by Jason.
> - use lowercase list_addtail(), suggested by Jason.
> - tidy up init_loop_block() as per Jason's suggestions.
> - replace switch with nir_op_infos[alu->op].num_inputs == 2 in
>   is_var_basic_induction_var() as suggested by Jason.
> - use nir_block_last_instr() in and rename foreach_cf_node_ex_loop() as
>   suggested by Jason.
> - fix else check for is_trivial_loop_terminator() as per Connor's
>   suggestions.
> - simplify offset for induction variables incremented before the exit
>   condition is checked.
> - replace nir_op_isub check with assert() as it should have been lowered
>   away.
>
> V7: (Timothy Arceri)
> - use rzalloc() on nir_loop struct creation. Worked previously because
>   ralloc() was broken and always zeroed the struct.
> - fix cf_node_find_loop_jumps() to find jumps when loops contain
>   nested if statements. Code is tidier as a result.
>
> V8: (Timothy Arceri)
> - move is_trivial_loop_terminator() to nir.h so we can use it to assert in
>   the loop unroll pass
> - fix analysis to not bail when looking for terminator when the break is
>   in the else rather than the if
> - added new loop terminator fields: break_block, continue_from_block and
>   continue_from_then so we don't have to gather these when doing unrolling.
> - get correct array length when forcing unrolling of variable-indexed
>   arrays that are the same size as the iteration count
> - add support for induction variables of type float
> - update trivial loop terminator check to allow an if containing
>   instructions as long as both branches contain only a single
>   block.
>
> V9:
>  - bunch of tidy ups and simplifications suggested by Jason.
>  - rewrote trivial terminator detection, now the only restriction is there
>must be no nested jumps, anything else goes.
>  - rewrote the iteration test to use nir_eval_const_opcode().
>  - count instructions properly even when forcing an unroll.
>  - bunch of other tidy ups and simplifications.
> ---
>  src/compiler/Makefile.sources   |   2 +
>  src/compiler/nir/nir.c  |   2 +-
>  src/compiler/nir/nir.h  |  41 +-
>  src/compiler/nir/nir_loop_analyze.c | 852 ++
> ++
>  src/compiler/nir/nir_loop_analyze.h |  92 
>  src/compiler/nir/nir_metadata.c |   8 +-
>  6 files changed, 994 insertions(+), 3 deletions(-)
>  create mode 100644 src/compiler/nir/nir_loop_analyze.c
>  create mode 100644 src/compiler/nir/nir_loop_analyze.h
>
> diff --git a/src/compiler/Makefile.sources b/src/compiler/Makefile.sources
> index 17b15de..ca8a056 100644
> --- a/src/compiler/Makefile.sources
> +++ b/src/compiler/Makefile.sources
> @@ -193,6 +193,8 @@ NIR_FILES = \
> nir/nir_intrinsics.c \
> nir/nir_intrinsics.h

Re: [Mesa-dev] [PATCH 5/6] radv: do not open random render node(s)

2016-12-20 Thread Bas Nieuwenhuizen

On Fri, Dec 2, 2016 at 5:31 PM, Emil Velikov  wrote:
> From: Emil Velikov 
>
> drmGetDevices2() provides us with enough flexibility to build heuristics
> upon. Opening a random node on the other hand will wake up the device,
> regardless of whether it's the one we're interested in.
>
> Cc: Michel Dänzer 
> Cc: Dave Airlie 
> Signed-off-by: Emil Velikov 
> ---
> AFAICT there is no system with more than one Intel GPU, but on the other
> hand one can easily have a setup with many AMD cards.
>
> Dave, any reason why we are capped at 1 device ?

Leftover from anv.

Patches 4 and 5 are
Reviewed-by: Bas Nieuwenhuizen 
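
The selection heuristic in the quoted patch (render node present, PCI bus, 
matching vendor ID) can be sketched in isolation as below. The struct, the 
constants and the function name are hypothetical stand-ins rather than 
libdrm's definitions, and the sketch simply returns the first match instead 
of trying to initialise each candidate the way the patch does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the fields the patch inspects; these are not
 * libdrm's definitions. */
enum { SKETCH_NODE_RENDER = 2 };
enum { SKETCH_BUS_PCI = 0, SKETCH_BUS_OTHER = 1 };

struct sketch_device {
   unsigned available_nodes;   /* bitmask, one bit per node type */
   int      bustype;
   uint16_t vendor_id;
};

/* Return the index of the first device that exposes a render node, sits on
 * the PCI bus and has the wanted vendor ID, or -1 if there is none. */
int
sketch_pick_device(const struct sketch_device *devs, size_t count,
                   uint16_t vendor_id)
{
   assert(devs != NULL || count == 0);

   for (size_t i = 0; i < count; i++) {
      if ((devs[i].available_nodes & (1u << SKETCH_NODE_RENDER)) &&
          devs[i].bustype == SKETCH_BUS_PCI &&
          devs[i].vendor_id == vendor_id)
         return (int)i;
   }
   return -1;
}
```

The count argument here is a number of array elements (for a local array, 
sizeof(devs) / sizeof(devs[0])), not a size in bytes. Filtering on metadata 
like this is what avoids opening, and thereby waking, unrelated devices.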
> ---
>  src/amd/vulkan/radv_device.c | 51 
> +++-
>  1 file changed, 36 insertions(+), 15 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
> index 0defc0f..3eea0cd 100644
> --- a/src/amd/vulkan/radv_device.c
> +++ b/src/amd/vulkan/radv_device.c
> @@ -300,6 +300,39 @@ void radv_DestroyInstance(
> vk_free(&instance->alloc, instance);
>  }
>
> +static VkResult
> +radv_enumerate_devices(struct radv_instance *instance)
> +{
> +   /* TODO: Check for more devices ? */
> +   drmDevicePtr devices[8];
> +   VkResult result = VK_SUCCESS;
> +   int max_devices;
> +
> +   max_devices = drmGetDevices2(0, devices, sizeof(devices));
> +   if (max_devices < 1)
> +   return VK_ERROR_INCOMPATIBLE_DRIVER;
> +
> +   for (unsigned i = 0; i < (unsigned)max_devices; i++) {
> +   if (devices[i]->available_nodes & 1 << DRM_NODE_RENDER &&
> +   devices[i]->bustype == DRM_BUS_PCI &&
> +   devices[i]->deviceinfo.pci->vendor_id == 0x1002) {
> +
> +   result = radv_physical_device_init(&instance->physicalDevice,
> +  instance,
> +  devices[i]->nodes[DRM_NODE_RENDER]);
> +   if (result != VK_ERROR_INCOMPATIBLE_DRIVER)
> +   break;
> +   }
> +   }
> +
> +   if (result == VK_ERROR_INCOMPATIBLE_DRIVER)
> +   instance->physicalDeviceCount = 0;
> +   else if (result == VK_SUCCESS)
> +   instance->physicalDeviceCount = 1;
> +
> +   return result;
> +}
> +
>  VkResult radv_EnumeratePhysicalDevices(
> VkInstance  _instance,
> uint32_t*   pPhysicalDeviceCount,
> @@ -309,22 +342,10 @@ VkResult radv_EnumeratePhysicalDevices(
> VkResult result;
>
> if (instance->physicalDeviceCount < 0) {
> -   char path[20];
> -   for (unsigned i = 0; i < 8; i++) {
> -   snprintf(path, sizeof(path), "/dev/dri/renderD%d", 128 + i);
> -   result = radv_physical_device_init(&instance->physicalDevice,
> -  instance, path);
> -   if (result != VK_ERROR_INCOMPATIBLE_DRIVER)
> -   break;
> -   }
> -
> -   if (result == VK_ERROR_INCOMPATIBLE_DRIVER) {
> -   instance->physicalDeviceCount = 0;
> -   } else if (result == VK_SUCCESS) {
> -   instance->physicalDeviceCount = 1;
> -   } else {
> +   result = radv_enumerate_devices(instance);
> +   if (result != VK_SUCCESS &&
> +   result != VK_ERROR_INCOMPATIBLE_DRIVER)
> return result;
> -   }
> }
>
> /* pPhysicalDeviceCount is an out parameter if pPhysicalDevices is NULL;
> --
> 2.10.2
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] clover: Use Clang's diagnostics

2016-12-20 Thread Francisco Jerez

Vedran Miletić  writes:

> Presently errors from frontend are handled only if they occur in
> clang::CompilerInvocation::CreateFromArgs(). This patch uses
> clang::DiagnosticsEngine to detect errors such as invalid values for
> Clang frontend arguments.
>
> Fixes Piglit's cl/program/build/fail/invalid-version-declaration.cl
> test.
>
> Signed-off-by: Vedran Miletić 
> ---
>  src/gallium/state_trackers/clover/llvm/invocation.cpp | 7 ++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp 
> b/src/gallium/state_trackers/clover/llvm/invocation.cpp
> index 675cf19..29dec44 100644
> --- a/src/gallium/state_trackers/clover/llvm/invocation.cpp
> +++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp
> @@ -98,8 +98,9 @@ namespace {
>  const std::vector ,
>  std::string _log) {
>std::unique_ptr c { new 
> clang::CompilerInstance };
> +  clang::TextDiagnosticBuffer* diag_buffer = new 
> clang::TextDiagnosticBuffer;

Inconsistent pointer formatting.  With that fixed:

Reviewed-by: Francisco Jerez 

>clang::DiagnosticsEngine diag { new clang::DiagnosticIDs,
> -new clang::DiagnosticOptions, new clang::TextDiagnosticBuffer };
> +new clang::DiagnosticOptions, diag_buffer };
>  
>// Parse the compiler options.  A file name should be present at the 
> end
>// and must have the .cl extension in order for the CompilerInvocation
> @@ -111,6 +112,10 @@ namespace {
>   c->getInvocation(), copts.data(), copts.data() + copts.size(), 
> diag))
>   throw invalid_build_options_error();
>  
> +  diag_buffer->FlushDiagnostics(diag);
> +  if (diag.hasErrorOccurred())
> +  throw invalid_build_options_error();
> +
>c->getTargetOpts().CPU = target.cpu;
>c->getTargetOpts().Triple = target.triple;
>c->getLangOpts().NoBuiltin = true;
> -- 
> 2.7.4



[Mesa-dev] [PATCH 3/4] st/vdpau: add h264 constrained baseline profile

2016-12-20 Thread boyuan.zhang

From: Boyuan Zhang 

Signed-off-by: Boyuan Zhang 
---
 src/gallium/state_trackers/vdpau/vdpau_private.h | 4 
 1 file changed, 4 insertions(+)

diff --git a/src/gallium/state_trackers/vdpau/vdpau_private.h 
b/src/gallium/state_trackers/vdpau/vdpau_private.h
index bcd4bb1..2780265 100644
--- a/src/gallium/state_trackers/vdpau/vdpau_private.h
+++ b/src/gallium/state_trackers/vdpau/vdpau_private.h
@@ -229,6 +229,8 @@ ProfileToPipe(VdpDecoderProfile vdpau_profile)
  return PIPE_VIDEO_PROFILE_MPEG2_MAIN;
   case VDP_DECODER_PROFILE_H264_BASELINE:
  return PIPE_VIDEO_PROFILE_MPEG4_AVC_BASELINE;
+  case VDP_DECODER_PROFILE_H264_CONSTRAINED_BASELINE:
+ return PIPE_VIDEO_PROFILE_MPEG4_AVC_CONSTRAINED_BASELINE;
   case VDP_DECODER_PROFILE_H264_MAIN:
  return PIPE_VIDEO_PROFILE_MPEG4_AVC_MAIN;
   case VDP_DECODER_PROFILE_H264_HIGH:
@@ -270,6 +272,8 @@ PipeToProfile(enum pipe_video_profile p_profile)
  return VDP_DECODER_PROFILE_MPEG2_MAIN;
   case PIPE_VIDEO_PROFILE_MPEG4_AVC_BASELINE:
  return VDP_DECODER_PROFILE_H264_BASELINE;
+  case PIPE_VIDEO_PROFILE_MPEG4_AVC_CONSTRAINED_BASELINE:
+ return VDP_DECODER_PROFILE_H264_CONSTRAINED_BASELINE;
   case PIPE_VIDEO_PROFILE_MPEG4_AVC_MAIN:
  return VDP_DECODER_PROFILE_H264_MAIN;
   case PIPE_VIDEO_PROFILE_MPEG4_AVC_HIGH:
-- 
2.7.4


[Mesa-dev] [PATCH 1/4] vl: add h264 constrained baseline profile

2016-12-20 Thread boyuan.zhang

From: Boyuan Zhang 

Signed-off-by: Boyuan Zhang 
---
 src/gallium/auxiliary/util/u_video.h | 1 +
 src/gallium/include/pipe/p_video_enums.h | 1 +
 2 files changed, 2 insertions(+)

diff --git a/src/gallium/auxiliary/util/u_video.h 
b/src/gallium/auxiliary/util/u_video.h
index 7e743de..88af8f6 100644
--- a/src/gallium/auxiliary/util/u_video.h
+++ b/src/gallium/auxiliary/util/u_video.h
@@ -60,6 +60,7 @@ u_reduce_video_profile(enum pipe_video_profile profile)
  return PIPE_VIDEO_FORMAT_VC1;
 
   case PIPE_VIDEO_PROFILE_MPEG4_AVC_BASELINE:
+  case PIPE_VIDEO_PROFILE_MPEG4_AVC_CONSTRAINED_BASELINE:
   case PIPE_VIDEO_PROFILE_MPEG4_AVC_MAIN:
   case PIPE_VIDEO_PROFILE_MPEG4_AVC_EXTENDED:
   case PIPE_VIDEO_PROFILE_MPEG4_AVC_HIGH:
diff --git a/src/gallium/include/pipe/p_video_enums.h 
b/src/gallium/include/pipe/p_video_enums.h
index aff7842..1e05075 100644
--- a/src/gallium/include/pipe/p_video_enums.h
+++ b/src/gallium/include/pipe/p_video_enums.h
@@ -54,6 +54,7 @@ enum pipe_video_profile
PIPE_VIDEO_PROFILE_VC1_MAIN,
PIPE_VIDEO_PROFILE_VC1_ADVANCED,
PIPE_VIDEO_PROFILE_MPEG4_AVC_BASELINE,
+   PIPE_VIDEO_PROFILE_MPEG4_AVC_CONSTRAINED_BASELINE,
PIPE_VIDEO_PROFILE_MPEG4_AVC_MAIN,
PIPE_VIDEO_PROFILE_MPEG4_AVC_EXTENDED,
PIPE_VIDEO_PROFILE_MPEG4_AVC_HIGH,
-- 
2.7.4


[Mesa-dev] [PATCH 2/4] radeon/uvd: add h264 constrained baseline support

2016-12-20 Thread boyuan.zhang

From: Boyuan Zhang 

Signed-off-by: Boyuan Zhang 
---
 src/gallium/drivers/radeon/radeon_uvd.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/gallium/drivers/radeon/radeon_uvd.c 
b/src/gallium/drivers/radeon/radeon_uvd.c
index d5d654a..228c20d 100644
--- a/src/gallium/drivers/radeon/radeon_uvd.c
+++ b/src/gallium/drivers/radeon/radeon_uvd.c
@@ -476,6 +476,7 @@ static struct ruvd_h264 get_h264_msg(struct ruvd_decoder 
*dec, struct pipe_h264_
memset(&result, 0, sizeof(result));
switch (pic->base.profile) {
case PIPE_VIDEO_PROFILE_MPEG4_AVC_BASELINE:
+   case PIPE_VIDEO_PROFILE_MPEG4_AVC_CONSTRAINED_BASELINE:
result.profile = RUVD_H264_PROFILE_BASELINE;
break;
 
-- 
2.7.4


[Mesa-dev] [PATCH 4/4] st/va: add h264 constrained baseline profile

2016-12-20 Thread boyuan.zhang

From: Boyuan Zhang 

Signed-off-by: Boyuan Zhang 
---
 src/gallium/state_trackers/va/va_private.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/gallium/state_trackers/va/va_private.h 
b/src/gallium/state_trackers/va/va_private.h
index e9ccdbf..08e52fd 100644
--- a/src/gallium/state_trackers/va/va_private.h
+++ b/src/gallium/state_trackers/va/va_private.h
@@ -154,6 +154,7 @@ PipeToProfile(enum pipe_video_profile profile)
case PIPE_VIDEO_PROFILE_MPEG4_AVC_HIGH10:
case PIPE_VIDEO_PROFILE_MPEG4_AVC_HIGH422:
case PIPE_VIDEO_PROFILE_MPEG4_AVC_HIGH444:
+   case PIPE_VIDEO_PROFILE_MPEG4_AVC_CONSTRAINED_BASELINE:
case PIPE_VIDEO_PROFILE_HEVC_MAIN_12:
case PIPE_VIDEO_PROFILE_HEVC_MAIN_STILL:
case PIPE_VIDEO_PROFILE_HEVC_MAIN_444:
-- 
2.7.4


Re: [Mesa-dev] [PATCH 1/2] nir: update nir_lower_returns to only predicate instructions when needed

2016-12-20 Thread Timothy Arceri

On Tue, 2016-12-20 at 09:44 -0800, Jason Ekstrand wrote:
> On Mon, Dec 19, 2016 at 8:18 PM, Timothy Arceri wrote:
> > Unless an if statement contains nested returns we can simply add
> > any following instructions to the branch without the return.
> > 
> > V2: fix handling if_nested_return value when there is a sibling
> > if/loop
> > that doesn't contain a return. (Spotted by Ken)
> > ---
> >  src/compiler/nir/nir_lower_returns.c | 37
> > ++--
> >  1 file changed, 31 insertions(+), 6 deletions(-)
> > 
> > diff --git a/src/compiler/nir/nir_lower_returns.c
> > b/src/compiler/nir/nir_lower_returns.c
> > index cf49d5b..5eec984 100644
> > --- a/src/compiler/nir/nir_lower_returns.c
> > +++ b/src/compiler/nir/nir_lower_returns.c
> > @@ -30,6 +30,8 @@ struct lower_returns_state {
> >     struct exec_list *cf_list;
> >     nir_loop *loop;
> >     nir_variable *return_flag;
> > +   /* Are there other return statements nested in the current if */
> > +   bool if_nested_return;
> >  };
> > 
> >  static bool lower_returns_in_cf_list(struct exec_list *cf_list,
> > @@ -82,8 +84,10 @@ lower_returns_in_loop(nir_loop *loop, struct
> > lower_returns_state *state)
> >      * flag set to true.  We need to predicate everything following
> > the loop
> >      * on the return flag.
> >      */
> > -   if (progress)
> > +   if (progress) {
> >        predicate_following(&loop->cf_node, state);
> > +      state->if_nested_return = true;
> > +   }
> > 
> >     return progress;
> >  }
> > @@ -91,10 +95,13 @@ lower_returns_in_loop(nir_loop *loop, struct
> > lower_returns_state *state)
> >  static bool
> >  lower_returns_in_if(nir_if *if_stmt, struct lower_returns_state
> > *state)
> >  {
> > -   bool progress;
> > +   bool progress, then_progress;
> > 
> > -   progress = lower_returns_in_cf_list(&if_stmt->then_list,
> > state);
> > -   progress = lower_returns_in_cf_list(&if_stmt->else_list, state)
> > || progress;
> > +   bool if_nested_return = state->if_nested_return;
> > +   state->if_nested_return = false;
> > +
> > +   then_progress = lower_returns_in_cf_list(&if_stmt->then_list,
> > state);
> > +   progress = lower_returns_in_cf_list(&if_stmt->else_list, state)
> > || then_progress;
> 
> I don't really get why we need this if_nested_return thing.  Why
> can't we just have two progress booleans called then_progress and
> else_progress and just do
> 
> if (then_progress && else_progress) {
>    predicate_following
> } else if (!then_progress && !else_progress) {
>    return false;
> } else {
>    /* Put it in one side or the other based on progress */
> }
> 
> That seems way simpler.

Way simpler yes but it doesn't do what we need it to :) Ken had the
same suggestion yesterday. The problem is it won't handle a case like
this:

if () {
   if () {
      return;
   } else {
      // If we exit from here we need to predicate the code following
      // the outer if; we can't just stick it in the else block.
   }
} else {

}

... code following outer if ...
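
To make the case above concrete, here is a rough standalone sketch in plain C (not NIR). All function names are made up for illustration, and the arithmetic just stands in for "some code". It compares the original control flow, a lowering that predicates the trailing code on a return flag, and the too-simple variant that only moves the trailing code into the outer else block:

```c
#include <assert.h>
#include <stdbool.h>

/* Original control flow: a return nested inside the inner if. */
static int with_nested_return(bool outer, bool inner)
{
   int x = 0;
   if (outer) {
      if (inner) {
         return x;            /* nested return */
      } else {
         x += 1;              /* falls through to the trailing code */
      }
   } else {
      x += 10;
   }
   x += 100;                  /* code following the outer if */
   return x;
}

/* Hand-lowered version: the return becomes a flag and the trailing
 * code is predicated on that flag. */
static int lowered_with_flag(bool outer, bool inner)
{
   int x = 0;
   bool return_flag = false;
   if (outer) {
      if (inner) {
         return_flag = true;
      } else {
         x += 1;
      }
   } else {
      x += 10;
   }
   if (!return_flag)
      x += 100;
   return x;
}

/* Too-simple version: trailing code moved into the outer else only.
 * The inner else path never reaches it. */
static int moved_into_outer_else(bool outer, bool inner)
{
   int x = 0;
   if (outer) {
      if (!inner)
         x += 1;
   } else {
      x += 10;
      x += 100;
   }
   return x;
}

/* Checks the flag-based lowering against the original on all inputs. */
static void self_check(void)
{
   for (int o = 0; o < 2; o++)
      for (int i = 0; i < 2; i++)
         assert(with_nested_return(o, i) == lowered_with_flag(o, i));
}
```

With outer true and inner false the original yields 101, while the "moved into the outer else" variant yields 1, which is the mismatch described above.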


>  
> >     /* If either of the recursive calls made progress, then there
> > were
> >      * returns inside of the body of the if.  If we're in a loop,
> > then these
> > @@ -106,8 +113,25 @@ lower_returns_in_if(nir_if *if_stmt, struct
> > lower_returns_state *state)
> >      * after a return, we need to predicate everything following on
> > the
> >      * return flag.
> >      */
> > -   if (progress && !state->loop)
> > -      predicate_following(&if_stmt->cf_node, state);
> > +   if (progress && !state->loop) {
> > +      if (state->if_nested_return) {
> > +         predicate_following(&if_stmt->cf_node, state);
> > +      } else {
> > +         /* If there are no nested returns we can just add the
> > instructions to
> > +          * the end of the branch that doesn't have the return.
> > +          */
> > +         nir_cf_list list;
> > +         nir_cf_extract(&list, nir_after_cf_node(&if_stmt-
> > >cf_node),
> > +                        nir_after_cf_list(state->cf_list));
> > +
> > +         if (then_progress)
> > +            nir_cf_reinsert(&list, nir_after_cf_list(&if_stmt-
> > >else_list));
> > +         else
> > +            nir_cf_reinsert(&list, nir_after_cf_list(&if_stmt-
> > >then_list));
> > +      }
> > +   }
> > +
> > +   state->if_nested_return = progress || if_nested_return;
> > 
> >     return progress;
> >  }
> > @@ -221,6 +245,7 @@ nir_lower_returns_impl(nir_function_impl *impl)
> >     state.cf_list = &impl->body;
> >     state.loop = NULL;
> >     state.return_flag = NULL;
> > +   state.if_nested_return = false;
> >     nir_builder_init(&state.builder, impl);
> > 
> >     bool progress = lower_returns_in_cf_list(&impl->body, &state);
> > --
> > 2.9.3
> > 

Re: [Mesa-dev] [PATCH v2 043/103] i965/vec4: handle 32 and 64 bit channels in liveness analysis

2016-12-20 Thread Francisco Jerez

"Juan A. Suarez Romero"  writes:

> On Mon, 2016-12-19 at 13:58 -0800, Francisco Jerez wrote:
>> Iago Toral Quiroga  writes:
>> 
>> > From: "Juan A. Suarez Romero" 
>> > 
>> > Our current data flow analysis does not take into account that channels
>> > on 64-bit operands are 64-bit. This is a problem when the same register
>> > is accessed using both 64-bit and 32-bit channels. This is very common
>> > in operations where we need to access 64-bit data in 32-bit chunks,
>> > such as the double packing and unpacking operations.
>> > 
>> > This patch changes the analysis by checking the bits that each source
>> > or destination datatype needs. Actually, rather than bits, we use
>> > blocks of 32 bits, which is the minimum channel size.
>> > 
>> > Because a vgrf can contain a dvec4 (256 bits), we reserve 8
>> > 32-bit blocks to map the channels.
>> > 
>> > v2 (Curro):
>> >   - Simplify code by making the var_from_reg helpers take an extra
>> > argument with the register component we want.
>> >   - Fix a couple of cases where we had to update the code to the
>> > new
>> > way of representing live variables.
>> > ---
>> >  src/mesa/drivers/dri/i965/brw_vec4.cpp |  2 +-
>> >  src/mesa/drivers/dri/i965/brw_vec4_cse.cpp |  2 +-
>> >  .../dri/i965/brw_vec4_dead_code_eliminate.cpp  | 25 +-
>> > ---
>> >  .../drivers/dri/i965/brw_vec4_live_variables.cpp   | 32
>> > +++---
>> >  .../drivers/dri/i965/brw_vec4_live_variables.h | 15 ++
>> >  5 files changed, 42 insertions(+), 34 deletions(-)
>> > 
>> > diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp
>> > b/src/mesa/drivers/dri/i965/brw_vec4.cpp
>> > index 3191eab..34cab04 100644
>> > --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
>> > +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
>> > @@ -1140,7 +1140,7 @@ vec4_visitor::opt_register_coalesce()
>> >    /* Can't coalesce this GRF if someone else was going to
>> > * read it later.
>> > */
>> > -  if (var_range_end(var_from_reg(alloc, dst_reg(inst-
>> > >src[0])), 4) > ip)
>> > +  if (var_range_end(var_from_reg(alloc, dst_reg(inst-
>> > >src[0])), 8) > ip)
>> >     continue;
>> >  
>> >    /* We need to check interference with the final destination
>> > between this
>> > diff --git a/src/mesa/drivers/dri/i965/brw_vec4_cse.cpp
>> > b/src/mesa/drivers/dri/i965/brw_vec4_cse.cpp
>> > index 1b91db9..bef897a 100644
>> > --- a/src/mesa/drivers/dri/i965/brw_vec4_cse.cpp
>> > +++ b/src/mesa/drivers/dri/i965/brw_vec4_cse.cpp
>> > @@ -246,7 +246,7 @@ vec4_visitor::opt_cse_local(bblock_t *block)
>> >   * more -- a sure sign they'll fail operands_match().
>> >   */
>> >  if (src->file == VGRF) {
>> > -   if (var_range_end(var_from_reg(alloc,
>> > dst_reg(*src)), 4) < ip) {
>> > +   if (var_range_end(var_from_reg(alloc,
>> > dst_reg(*src)), 8) < ip) {
>> >    entry->remove();
>> >    ralloc_free(entry);
>> >    break;
>> > diff --git
>> > a/src/mesa/drivers/dri/i965/brw_vec4_dead_code_eliminate.cpp
>> > b/src/mesa/drivers/dri/i965/brw_vec4_dead_code_eliminate.cpp
>> > index 950c6c8..6a80810 100644
>> > --- a/src/mesa/drivers/dri/i965/brw_vec4_dead_code_eliminate.cpp
>> > +++ b/src/mesa/drivers/dri/i965/brw_vec4_dead_code_eliminate.cpp
>> > @@ -57,12 +57,13 @@ vec4_visitor::dead_code_eliminate()
>> >   if ((inst->dst.file == VGRF && !inst->has_side_effects()) 
>> > ||
>> >   (inst->dst.is_null() && inst->writes_flag())){
>> >  bool result_live[4] = { false };
>> > -
>> >  if (inst->dst.file == VGRF) {
>> > -   for (unsigned i = 0; i < regs_written(inst); i++) {
>> > -  for (int c = 0; c < 4; c++)
>> > - result_live[c] |= BITSET_TEST(
>> > -live, var_from_reg(alloc, offset(inst-
>> > >dst, i), c));
>> > +   for (unsigned i = 0; i < 2 * regs_written(inst);
>> > i++) {
>> 
>> One of the issues we discussed in the past about this approach is that
>> it would overestimate the number of register OWORDs accessed by
>> instructions with size_written < REG_SIZE (or size_read(i) < REG_SIZE),
>> which will be emitted by the SIMD lowering pass.  Now that the amount of
>> data read and written by instructions is represented in byte units you
>> can avoid this problem by using DIV_ROUND_UP(inst->size_written, 16)
>> instead of 2 * regs_written(inst) above.
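
A small standalone sketch of the arithmetic behind that suggestion. It assumes a 32-byte register (REG_SIZE) and 16-byte OWORDs; the helper functions are stand-ins written for this example, not the driver's actual code:

```c
#include <assert.h>

#define REG_SIZE 32u                      /* assumed: bytes per register */
#define OWORD_SIZE 16u                    /* bytes per OWORD */
#define DIV_ROUND_UP(a, b) (((a) + (b) - 1) / (b))

/* Stand-in for regs_written(inst): whole registers touched by a write
 * of size_written bytes. */
static unsigned regs_written_sketch(unsigned size_written)
{
   return DIV_ROUND_UP(size_written, REG_SIZE);
}

/* Loop bound from the patch: two OWORDs per written register. */
static unsigned owords_by_register(unsigned size_written)
{
   return 2 * regs_written_sketch(size_written);
}

/* Suggested loop bound: round the byte count up to OWORDs directly. */
static unsigned owords_by_bytes(unsigned size_written)
{
   return DIV_ROUND_UP(size_written, OWORD_SIZE);
}

/* The two bounds agree for whole-register writes and differ only when
 * the last register is less than half used. */
static void self_check(void)
{
   assert(owords_by_register(32) == owords_by_bytes(32));
   assert(owords_by_register(64) == owords_by_bytes(64));
   assert(owords_by_register(16) > owords_by_bytes(16));
}
```

For a 16-byte write the register-based bound gives 2 OWORDs while the byte-based bound gives 1, which is the overestimate being discussed.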
>> 
>> > +  for (int c = 0; c < 4; c++) {
>> > + const unsigned v =
>> > +var_from_reg(alloc, inst->dst, c, i);
>> > + result_live[c] |= BITSET_TEST(live, v);
>> > +  }
>> > }
>> >  } else {
>> > for (unsigned c = 0; c < 4; c++)
>> > @@

Re: [Mesa-dev] [PATCH 2/6] gallivm: optimize SoA AoS fallback fetch path a little

2016-12-20 Thread Roland Scheidegger

Am 20.12.2016 um 15:13 schrieb Jose Fonseca:
> On 12/12/16 00:11, srol...@vmware.com wrote:
>> From: Roland Scheidegger 
>>
>> We should do transpose, not extract/insert, at least with "sufficient"
>> amount of channels (for 4 channels, extract/insert shuffles generated
>> otherwise look truly terrifying). Albeit we shouldn't fall back to that
>> so often in any case.
>> ---
>>  src/gallium/auxiliary/gallivm/lp_bld_format_soa.c | 83
>> +++
>>  1 file changed, 70 insertions(+), 13 deletions(-)
>>
>> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_format_soa.c
>> b/src/gallium/auxiliary/gallivm/lp_bld_format_soa.c
>> index 389bfa0..902c763 100644
>> --- a/src/gallium/auxiliary/gallivm/lp_bld_format_soa.c
>> +++ b/src/gallium/auxiliary/gallivm/lp_bld_format_soa.c
>> @@ -40,6 +40,39 @@
>>  #include "lp_bld_debug.h"
>>  #include "lp_bld_format.h"
>>  #include "lp_bld_arit.h"
>> +#include "lp_bld_pack.h"
>> +
>> +
>> +static void
>> +convert_to_soa(struct gallivm_state *gallivm,
>> +   LLVMValueRef src_aos[LP_MAX_VECTOR_WIDTH / 32],
>> +   LLVMValueRef dst_soa[4],
>> +   const struct lp_type soa_type)
>> +{
>> +   unsigned j, k;
>> +   struct lp_type aos_channel_type = soa_type;
>> +
>> +   LLVMValueRef aos_channels[4];
>> +   unsigned pixels_per_channel = soa_type.length / 4;
>> +
>> +   debug_assert((soa_type.length % 4) == 0);
>> +
>> +   aos_channel_type.length >>= 1;
>> +
>> +   for (j = 0; j < 4; ++j) {
>> +  LLVMValueRef channel[LP_MAX_VECTOR_LENGTH] = { 0 };
>> +
>> +  assert(pixels_per_channel <= LP_MAX_VECTOR_LENGTH);
>> +
>> +  for (k = 0; k < pixels_per_channel; ++k) {
>> + channel[k] = src_aos[j + 4 * k];
>> +  }
>> +
>> +  aos_channels[j] = lp_build_concat(gallivm, channel,
>> aos_channel_type, pixels_per_channel);
>> +   }
>> +
>> +   lp_build_transpose_aos(gallivm, soa_type, aos_channels, dst_soa);
>> +}
>>
>>
>>  void
>> @@ -48,9 +81,6 @@ lp_build_format_swizzle_soa(const struct
>> util_format_description *format_desc,
>>  const LLVMValueRef *unswizzled,
>>  LLVMValueRef swizzled_out[4])
>>  {
>> -   assert(PIPE_SWIZZLE_0 == (int)PIPE_SWIZZLE_0);
>> -   assert(PIPE_SWIZZLE_1 == (int)PIPE_SWIZZLE_1);
>> -
>> if (format_desc->colorspace == UTIL_FORMAT_COLORSPACE_ZS) {
>>enum pipe_swizzle swizzle;
>>LLVMValueRef depth_or_stencil;
>> @@ -547,9 +577,11 @@ lp_build_fetch_rgba_soa(struct gallivm_state
>> *gallivm,
>> {
>>unsigned k, chan;
>>struct lp_type tmp_type;
>> +  LLVMValueRef aos_fetch[LP_MAX_VECTOR_WIDTH / 32];
>> +  boolean vec_transpose = FALSE;
>>
>>if (gallivm_debug & GALLIVM_DEBUG_PERF) {
>> - debug_printf("%s: scalar unpacking of %s\n",
>> + debug_printf("%s: AoS fetch fallback for %s\n",
>>__FUNCTION__, format_desc->short_name);
>>}
>>
>> @@ -560,12 +592,31 @@ lp_build_fetch_rgba_soa(struct gallivm_state
>> *gallivm,
>>   rgba_out[chan] = lp_build_undef(gallivm, type);
>>}
>>
>> +  if (format_desc->nr_channels > 2 ||
>> +  format_desc->layout != UTIL_FORMAT_LAYOUT_PLAIN) {
>> + /*
>> +  * Note that vector transpose can be worse. This is because
>> +  * llvm will ensure the missing channels have the correct
>> +  * values, in particular typically 1.0 for the last channel
>> +  * (if they are used or not doesn't matter, usually llvm can't
>> +  * figure this out here probably due to the transpose).
>> +  * But with the extract/insert path, since those missing
>> elements
>> +  * were just directly inserted/extracted llvm can optimize this
>> +  * somewhat (though it still doesn't look great - and not for
>> +  * the compressed formats due to their external fetch funcs).
>> +  * So restrict to cases where we are sure it helps (albeit
>> +  * with 2 channels it MIGHT be worth it at least with AVX).
>> +  * In any case, this is just a bandaid, it does NOT replace
>> proper
>> +  * SoA format unpack.
>> +  */
>> + vec_transpose = TRUE;
>> +  }
>> +
> 
> There's a burden in maintaining so many code paths -- it raises the
> difficulty bar next time we want to do an optimization --, so if this is
> just a little worse, or only affects the draw, I'd say it's better to
> always use vec_transpose.

It is quite a bit worse. Though actually that analysis was done when
even some single channel formats were using that path (definitely not
just draw, e.g. r16f texture sampling), for them it's worse than for 2
channel formats. That said, the further changes in the series make sure
we don't really hit this path anymore (for any plain formats at least)
so you're right it can go away. I'll nuke it.

Roland
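
For readers not used to the AoS/SoA terminology in this thread, a scalar sketch of what the conversion computes. This is plain C on float arrays, not gallivm code; the function name and the fixed channel count of 4 are assumptions for illustration. The patch builds the same result from vector transposes instead of per-element moves:

```c
#include <assert.h>
#include <stddef.h>

#define NUM_CHANNELS 4

/* Convert n pixels from interleaved layout (r0 g0 b0 a0 r1 g1 ...) to
 * planar layout (one array per channel). The per-element copy below is
 * the scalar equivalent of the extract/insert path. */
static void aos_to_soa(const float *aos, float *soa[NUM_CHANNELS], size_t n)
{
   assert(aos != NULL && soa != NULL);
   for (size_t pixel = 0; pixel < n; pixel++) {
      for (size_t chan = 0; chan < NUM_CHANNELS; chan++)
         soa[chan][pixel] = aos[pixel * NUM_CHANNELS + chan];
   }
}
```

Called with two RGBA pixels, the red plane ends up holding the first and fifth input floats, the green plane the second and sixth, and so on.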


> 
>>/* loop over number of pixels */
>>for(k = 0; k <

Re: [Mesa-dev] [PATCH v3] compiler/glsl: fix precision problem of tanh

2016-12-20 Thread Giuseppe Bilotta

On Tue, Dec 20, 2016 at 2:17 AM, Matt Turner  wrote:
> On Mon, Dec 19, 2016 at 5:12 PM, Giuseppe Bilotta
>  wrote:
>> Just one question though —not knowing much of the shader language, can
>> I expect expm1 to be available?
>
> No, expm1 doesn't exist in GLSL.

This is extremely bothersome. Both the (exp(2x)-1)/(exp(2x)+1) and the
1-2/(exp(2x)+1) formulas give pretty good results when written
in terms of expm1.

On Tue, Dec 20, 2016 at 3:48 AM, Roland Scheidegger  wrote:
> Not sure it really matters though one way or another. If you wanted good
> accuracy around 0, you'd have to use a different formula plus a select
> (seems like libm implementations actually use 3 cases depending on input
> value magnitude - not so hot with vectors, but thankfully glsl doesn't
> require 1 ULP accuracy).

Brute-forcing over all floating points on CPU by switching between the
two formulas above at appropriate thresholds gives a maximum relative
error of the order of machine epsilon when using expm1, and the switch
between the two formulas can be implemented with a select on two
terms. However, this does require expm1.

Nelson Beebe has a very detailed description of how to achieve very
accurate results for tanh here
https://www.math.utah.edu/~beebe/software/ieee/tanh.pdf and the
results are a bit depressing, in that multiple thresholds are
necessary. I'm not sure if these are the same used by libm, but in any
case neither lends itself well to vectorization (in contrast to the
switch between the two formulas above).

An alternative approach could be to actually provide a software
implementation of expm1 and use it to compute tanh. I wouldn't be
surprised if this would turn out to not be slower than using exp
itself, in fact.

-- 
Giuseppe "Oblomov" Bilotta

Re: [Mesa-dev] [Mesa-stable] [PATCH] Mesa: Fix error code for glTexImage3D in GLES

2016-12-20 Thread Chad Versace

On Mon 19 Dec 2016, Xu, Randy wrote:
> Hi, Chad & Ian
> 
> Thanks for your suggestion. I understand and agree with your point,
> but texsubimage_error_check (in teximage.c) calls
> _mesa_error_check_format_and_type first, and if an error occurs, it
> returns immediately (at line 2175) without calling
> texture_format_error_check_gles (at line 2184). So I did the patch this way.
> 
> Following your suggestion, we'd better move
> texture_format_error_check_gles ahead of
> _mesa_error_check_format_and_type, i.e. handle the GLES API first. Do
> you agree with that?

I'm afraid to move texture_format_error_check_gles() ahead of
_mesa_error_check_format_and_type(). That may be the correct thing to
do, but my understanding of this error-checking code is weak and I'm
afraid of the unintended consequences of moving it. If someone more
familiar with this code claims that it's safe to move the check, then
moving it is ok with me.

So... please ignore my complaint. The check should remain in
_mesa_error_check_format_and_type(), unless someone has a better
suggestion. That function already contains some gles-specific checks,
so there is no harm in adding yet another.

I have more comments below.

> -Original Message-
> From: Ian Romanick [mailto:i...@freedesktop.org] 
> Sent: Saturday, December 17, 2016 6:02 AM
> To: Chad Versace ; Xu, Randy ; 
> mesa-dev@lists.freedesktop.org; mesa-sta...@lists.freedesktop.org; 
> x...@freedesktop.org
> Subject: Re: [Mesa-dev] [PATCH] Mesa: Fix error code for glTexImage3D in GLES
> 
> On 12/16/2016 12:49 PM, Chad Versace wrote:
> > On Fri 16 Dec 2016, Chad Versace wrote:
> >> On Fri 16 Dec 2016, Randy Xu wrote:
> >>> From: "Xu,Randy" 
> >>>
> >>> The ES specification says that TexImage3D should return 
> >>> GL_INVALID_OPERATION if the internal format is DEPTH_COMPONENT, 
> >>> DEPTH_-STENCIL or STENCIL_INDEX.

The above is true. The ES spec says that. However, the GL spec says the
same thing. See page 158 in the GLES 3.2 spec and page 194 in the GL 4.5
spec. So, I believe referring to that text in the spec in the commit
message is misleading because it is unrelated to the test failure.

It's also misleading because the patch doesn't update any
glTexImage3D code. The patch updates dimension-independent code that
affects glTexImage1D, glTexImage2D, and glTexImage3D.

I inspected the test results more closely, with and without the patch.
The debug output differs on a single line, marked with '***'. From the
debug info, it seems that one of the several causes for the test failure
is that _mesa_error_check_format_and_type() gets called, and emits the
wrong error, before Mesa rejects GL_DEPTH_STENCIL as an invalid target
for glTexImage3D with GL_INVALID_OPERATION.
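
To illustrate why the order of the checks decides which error the application sees, a deliberately simplified sketch. The enums and both check functions are hypothetical stand-ins written for this example; they are not Mesa's functions and only model the one format/type combination marked in the log below:

```c
#include <assert.h>
#include <stdbool.h>

enum sketch_error { SKETCH_NO_ERROR, SKETCH_INVALID_ENUM,
                    SKETCH_INVALID_OPERATION };
enum sketch_format { SKETCH_RGBA, SKETCH_DEPTH_STENCIL };
enum sketch_type { SKETCH_UNSIGNED_BYTE, SKETCH_UNSIGNED_INT_24_8 };

/* Stand-in for a generic format/type compatibility check. */
static enum sketch_error
generic_format_type_check(enum sketch_format format, enum sketch_type type)
{
   if (format == SKETCH_DEPTH_STENCIL && type != SKETCH_UNSIGNED_INT_24_8)
      return SKETCH_INVALID_ENUM;
   return SKETCH_NO_ERROR;
}

/* Stand-in for an ES-specific rule rejecting depth/stencil 3D uploads. */
static enum sketch_error es_3d_format_check(enum sketch_format format)
{
   if (format == SKETCH_DEPTH_STENCIL)
      return SKETCH_INVALID_OPERATION;
   return SKETCH_NO_ERROR;
}

/* Whichever check runs first and fails decides the reported error. */
static enum sketch_error
validate(enum sketch_format format, enum sketch_type type, bool es_first)
{
   enum sketch_error err;

   if (es_first) {
      err = es_3d_format_check(format);
      if (err != SKETCH_NO_ERROR)
         return err;
      return generic_format_type_check(format, type);
   }

   err = generic_format_type_check(format, type);
   if (err != SKETCH_NO_ERROR)
      return err;
   return es_3d_format_check(format);
}
```

With the generic check first, a depth-stencil format with an unsigned-byte type reports the enum error; with the ES check first, the same call reports the invalid-operation error.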

Without patch:

Test case 'dEQP-GLES3.functional.negative_api.texture.teximage3d'..
Mesa: User error: GL_INVALID_ENUM in glTexImage3D(target=GL_NO_ERROR)
Mesa: User error: GL_INVALID_ENUM in glTexImage3D(target=GL_TEXTURE_2D)
Mesa: User error: GL_INVALID_ENUM in glTexImage3D(incompatible format = GL_RGBA, type = GL_NO_ERROR)
Mesa: User error: GL_INVALID_ENUM in glTexImage3D(incompatible format = GL_NO_ERROR, type = GL_UNSIGNED_BYTE)
Mesa: User error: GL_INVALID_VALUE in glTexImage3D(internalFormat=GL_NO_ERROR)
 ***Mesa: User error: GL_INVALID_ENUM in glTexImage3D(incompatible format = GL_DEPTH_STENCIL, type = GL_UNSIGNED_BYTE)
Mesa: User error: GL_INVALID_OPERATION in glTexImage%dD(format = GL_DEPTH_COMPONENT, type = GL_UNSIGNED_BYTE, internalformat = GL_RGBA)
Mesa: User error: GL_INVALID_OPERATION in glTexImage%dD(format = GL_RGBA, type = GL_UNSIGNED_BYTE, internalformat = GL_RGB)
Mesa: User error: GL_INVALID_OPERATION in glTexImage3D(incompatible format = GL_RGB, type = GL_UNSIGNED_SHORT_4_4_4_4)
Mesa: User error: GL_INVALID_OPERATION in glTexImage3D(incompatible format = GL_RGB, type = GL_UNSIGNED_SHORT_5_5_5_1)
Mesa: User error: GL_INVALID_OPERATION in glTexImage%dD(format = GL_RGB, type = GL_UNSIGNED_INT_2_10_10_10_REV, internalformat = GL_RGB10_A2)
Mesa: User error: GL_INVALID_OPERATION in glTexImage%dD(format = GL_RGBA_INTEGER, type = GL_INT, internalformat = GL_RGBA32UI)
Test case duration in microseconds = 691466 us
  Fail (Got invalid error)

DONE!

Test run totals:
  Passed:0/1 (0.0%)
  Failed:1/1 (100.0%)
  Not supported: 0/1 (0.0%)
  Warnings:  0/1 (0.0%)

With patch:

Test case 'dEQP-GLES3.functional.negative_api.texture.teximage3d'..
Mesa: User error: GL_INVALID_ENUM in glTexImage3D(target=GL_NO_ERROR)
Mesa: User error: GL_INVALID_ENUM in glTexImage3D(target=GL_TEXTURE_2D)
Mesa: User error: GL_INVALID_ENUM in glTexImage3D(incompatible format = GL_RGBA, type = GL_NO_ERROR)
Mesa: User error: GL_INVALID_ENUM in glTexImage3D(incompatible format = GL_NO_ERROR, type = GL_UNSIGNED_BYTE)
Mesa: User

Re: [Mesa-dev] [PATCH] mesa: don't attempt to unlock an unlocked debug state mutex

2016-12-20 Thread Kenneth Graunke

On Tuesday, December 20, 2016 12:08:06 PM PST Jonathan Gray wrote:
> Can someone push this to master?

Pushed:

To ssh://git.freedesktop.org/git/mesa/mesa
   ab8ea1b..62b8bcd  master -> master

Have you thought about applying for commit access?



Re: [Mesa-dev] [PATCH 2/9] i965: Consider surface resolves just before draw

2016-12-20 Thread Kenneth Graunke

On Tuesday, December 20, 2016 5:20:43 PM PST Pohjolainen, Topi wrote:
> On Tue, Dec 20, 2016 at 03:05:14PM +, Ben Widawsky wrote:
> > On 16-12-20 16:45:30, Topi Pohjolainen wrote:
> > > If gl-state remains intact api_validate.c::_mesa_valid_to_render()
> > > and brw_try_draw_prims() skip checking if textures and shader
> > > images need resolves.
> > > This can lead to a case where a surface is left unresolved due to
> > > driver writing it internally using blorp. Blorp doesn't trash
> > > global gl state but only the internal driver state.
> > > 
> > > Signed-off-by: Topi Pohjolainen 
> > > CC: Kenneth Graunke 
> > > CC: Jason Ekstrand 
> > > CC: Ben Widawsky 
> > > ---
> > > src/mesa/drivers/dri/i965/brw_compute.c | 1 +
> > > src/mesa/drivers/dri/i965/brw_context.c | 4 
> > > src/mesa/drivers/dri/i965/brw_draw.c| 2 ++
> > > 3 files changed, 3 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/src/mesa/drivers/dri/i965/brw_compute.c 
> > > b/src/mesa/drivers/dri/i965/brw_compute.c
> > > index 16b5df7..77c056c 100644
> > > --- a/src/mesa/drivers/dri/i965/brw_compute.c
> > > +++ b/src/mesa/drivers/dri/i965/brw_compute.c
> > > @@ -186,6 +186,7 @@ brw_dispatch_compute_common(struct gl_context *ctx)
> > >if (ctx->NewState)
> > >   _mesa_update_state(ctx);
> > > 
> > > +   brw_resolve_surfaces(ctx);
> > >brw_validate_textures(brw);
> > > 
> > >const int sampler_state_size = 16; /* 16 bytes */
> > > diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
> > > b/src/mesa/drivers/dri/i965/brw_context.c
> > > index 367cd9d..0d339ff 100644
> > > --- a/src/mesa/drivers/dri/i965/brw_context.c
> > > +++ b/src/mesa/drivers/dri/i965/brw_context.c
> > > @@ -180,10 +180,6 @@ intel_update_state(struct gl_context * ctx, GLuint 
> > > new_state)
> > > 
> > >brw->NewGLState |= new_state;
> > > 
> > > -   _mesa_unlock_context_textures(ctx);
> > > -   brw_resolve_surfaces(ctx);
> > > -   _mesa_lock_context_textures(ctx);
> > > -
> > 
> > I'm surprised this patch doesn't fix a bug. The lock/unlock being removed
> > in this patch seems fishy, but I think that might be an issue from the
> > previous patch.
> 
> Good question, I'll amend the commit message:
> 
> It should be noted that the new callers brw_dispatch_compute_common() and
> brw_try_draw_prims() are deep in the driver draw logic and shouldn't need
> _mesa_unlock_context_textures()/_mesa_lock_context_textures(). Current
> caller intel_update_state() in turn implements dd_function_table::UpdateState
> and also gets called by _mesa_update_state_locked() which apparently needs the
> unlock/lock sequence.
> 
> > 
> > Reviewed-by: Ben Widawsky 
> 
> Thanks!

Also, krh had me pretty convinced at one point that the current texture
locking is a joke :(



Re: [Mesa-dev] [PATCH] glsl: Elminate the open-coded version of process_block_array_leaf

2016-12-20 Thread Alejandro Piñeiro

typo on the subject: "Eliminate"

On 20/12/16 00:45, Timothy Arceri wrote:
> Reviewed-by: Timothy Arceri 

Re: [Mesa-dev] [PATCH] i965: Don't bail on vertex element processing if we need draw params.

2016-12-20 Thread Jason Ekstrand

Bah... Forgot this:

Reviewed-by: Jason Ekstrand 

On Dec 20, 2016 13:12, "Jason Ekstrand"  wrote:

> On Dec 19, 2016 13:27, "Kenneth Graunke"  wrote:
>
> BaseVertex, BaseInstance, DrawID, and some edge flag conditions need
> vertex buffer and elements structs.  We can't bail early in this case.
>
> Gen4-7 already do this properly.  Gen8+ did not.
>
> Thanks to Ilia Mirkin for helping track this down.
>
> Cc: mesa-sta...@lists.freedesktop.org
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99144
> Reported-by: Pierre-Eric Pelloux-Prayer
> ---
>  src/mesa/drivers/dri/i965/gen8_draw_upload.c | 34 ++--
>  1 file changed, 17 insertions(+), 17 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/gen8_draw_upload.c
> b/src/mesa/drivers/dri/i965/gen8_draw_upload.c
> index 69ba8e9..3177f9a 100644
> --- a/src/mesa/drivers/dri/i965/gen8_draw_upload.c
> +++ b/src/mesa/drivers/dri/i965/gen8_draw_upload.c
> @@ -110,6 +110,22 @@ gen8_emit_vertices(struct brw_context *brw)
>ADVANCE_BATCH();
> }
>
> +   /* Normally we don't need an element for the SGVS attribute because the
> +* 3DSTATE_VF_SGVS instruction lets you store the generated attribute
> in an
> +* element that is past the list in 3DSTATE_VERTEX_ELEMENTS. However if
> +* we're using draw parameters then we need an element for the those
> +* values.  Additionally if there is an edge flag element then the SGVS
> +* can't be inserted past that so we need a dummy element to ensure
> that
> +* the edge flag is the last one.
> +*/
> +   const bool needs_sgvs_element = (vs_prog_data->uses_basevertex ||
> +vs_prog_data->uses_baseinstance ||
> +((vs_prog_data->uses_instanceid ||
> +  vs_prog_data->uses_vertexid) &&
> + uses_edge_flag));
>
>
> Out of curiosity, why are we trying so hard to avoid an extra element?
>
> +   const unsigned nr_elements =
> +  brw->vb.nr_enabled + needs_sgvs_element + vs_prog_data->uses_drawid;
> +
> /* If the VS doesn't read any inputs (calculating vertex position from
>  * a state variable for some reason, for example), emit a single pad
>  * VERTEX_ELEMENT struct and bail.
> @@ -117,7 +133,7 @@ gen8_emit_vertices(struct brw_context *brw)
>  * The stale VB state stays in place, but they don't do anything unless
>  * a VE loads from them.
>  */
> -   if (brw->vb.nr_enabled == 0) {
> +   if (nr_elements == 0) {
>
>
> Seems reasonable.
>
>BEGIN_BATCH(3);
>OUT_BATCH((_3DSTATE_VERTEX_ELEMENTS << 16) | (3 - 2));
>OUT_BATCH((0 << GEN6_VE0_INDEX_SHIFT) |
> @@ -172,22 +188,6 @@ gen8_emit_vertices(struct brw_context *brw)
>ADVANCE_BATCH();
> }
>
> -   /* Normally we don't need an element for the SGVS attribute because the
> -* 3DSTATE_VF_SGVS instruction lets you store the generated attribute
> in an
> -* element that is past the list in 3DSTATE_VERTEX_ELEMENTS. However if
> -* we're using draw parameters then we need an element for the those
> -* values.  Additionally if there is an edge flag element then the SGVS
> -* can't be inserted past that so we need a dummy element to ensure
> that
> -* the edge flag is the last one.
> -*/
> -   const bool needs_sgvs_element = (vs_prog_data->uses_basevertex ||
> -vs_prog_data->uses_baseinstance ||
> -((vs_prog_data->uses_instanceid ||
> -  vs_prog_data->uses_vertexid) &&
> - uses_edge_flag));
> -   const unsigned nr_elements =
> -  brw->vb.nr_enabled + needs_sgvs_element + vs_prog_data->uses_drawid;
> -
> /* The hardware allows one more VERTEX_ELEMENTS than VERTEX_BUFFERS,
>  * presumably for VertexID/InstanceID.
>  */
> --
> 2.10.2
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
>
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] i965: Don't bail on vertex element processing if we need draw params.

2016-12-20 Thread Jason Ekstrand

On Dec 19, 2016 13:27, "Kenneth Graunke"  wrote:

BaseVertex, BaseInstance, DrawID, and some edge flag conditions need
vertex buffer and elements structs.  We can't bail early in this case.

Gen4-7 already do this properly.  Gen8+ did not.

Thanks to Ilia Mirkin for helping track this down.

Cc: mesa-sta...@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99144
Reported-by: Pierre-Eric Pelloux-Prayer 
Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/gen8_draw_upload.c | 34 ++--
 1 file changed, 17 insertions(+), 17 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen8_draw_upload.c
b/src/mesa/drivers/dri/i965/gen8_draw_upload.c
index 69ba8e9..3177f9a 100644
--- a/src/mesa/drivers/dri/i965/gen8_draw_upload.c
+++ b/src/mesa/drivers/dri/i965/gen8_draw_upload.c
@@ -110,6 +110,22 @@ gen8_emit_vertices(struct brw_context *brw)
   ADVANCE_BATCH();
}

+   /* Normally we don't need an element for the SGVS attribute because the
+* 3DSTATE_VF_SGVS instruction lets you store the generated attribute
in an
+* element that is past the list in 3DSTATE_VERTEX_ELEMENTS. However if
+* we're using draw parameters then we need an element for the those
+* values.  Additionally if there is an edge flag element then the SGVS
+* can't be inserted past that so we need a dummy element to ensure that
+* the edge flag is the last one.
+*/
+   const bool needs_sgvs_element = (vs_prog_data->uses_basevertex ||
+vs_prog_data->uses_baseinstance ||
+((vs_prog_data->uses_instanceid ||
+  vs_prog_data->uses_vertexid) &&
+ uses_edge_flag));


Out of curiosity, why are we trying so hard to avoid an extra element?

+   const unsigned nr_elements =
+  brw->vb.nr_enabled + needs_sgvs_element + vs_prog_data->uses_drawid;
+
/* If the VS doesn't read any inputs (calculating vertex position from
 * a state variable for some reason, for example), emit a single pad
 * VERTEX_ELEMENT struct and bail.
@@ -117,7 +133,7 @@ gen8_emit_vertices(struct brw_context *brw)
 * The stale VB state stays in place, but they don't do anything unless
 * a VE loads from them.
 */
-   if (brw->vb.nr_enabled == 0) {
+   if (nr_elements == 0) {


Seems reasonable.

   BEGIN_BATCH(3);
   OUT_BATCH((_3DSTATE_VERTEX_ELEMENTS << 16) | (3 - 2));
   OUT_BATCH((0 << GEN6_VE0_INDEX_SHIFT) |
@@ -172,22 +188,6 @@ gen8_emit_vertices(struct brw_context *brw)
   ADVANCE_BATCH();
}

-   /* Normally we don't need an element for the SGVS attribute because the
-* 3DSTATE_VF_SGVS instruction lets you store the generated attribute
in an
-* element that is past the list in 3DSTATE_VERTEX_ELEMENTS. However if
-* we're using draw parameters then we need an element for the those
-* values.  Additionally if there is an edge flag element then the SGVS
-* can't be inserted past that so we need a dummy element to ensure that
-* the edge flag is the last one.
-*/
-   const bool needs_sgvs_element = (vs_prog_data->uses_basevertex ||
-vs_prog_data->uses_baseinstance ||
-((vs_prog_data->uses_instanceid ||
-  vs_prog_data->uses_vertexid) &&
- uses_edge_flag));
-   const unsigned nr_elements =
-  brw->vb.nr_enabled + needs_sgvs_element + vs_prog_data->uses_drawid;
-
/* The hardware allows one more VERTEX_ELEMENTS than VERTEX_BUFFERS,
 * presumably for VertexID/InstanceID.
 */
--
2.10.2
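
The reason the early-bail test moves from `brw->vb.nr_enabled == 0` to `nr_elements == 0` can be seen in isolation. The following is only a minimal sketch: `vs_inputs_sketch` and `count_vertex_elements` are hypothetical stand-ins for the driver state, and only the boolean expression itself is taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the few vs_prog_data fields the patch reads. */
struct vs_inputs_sketch {
   bool uses_basevertex;
   bool uses_baseinstance;
   bool uses_instanceid;
   bool uses_vertexid;
   bool uses_drawid;
};

/* Same boolean expression as needs_sgvs_element in the patch. */
static bool
needs_sgvs_element(const struct vs_inputs_sketch *vs, bool uses_edge_flag)
{
   assert(vs != NULL);
   return vs->uses_basevertex ||
          vs->uses_baseinstance ||
          ((vs->uses_instanceid || vs->uses_vertexid) && uses_edge_flag);
}

/* Element count that the patched early-bail test compares against zero. */
static unsigned
count_vertex_elements(unsigned nr_enabled,
                      const struct vs_inputs_sketch *vs,
                      bool uses_edge_flag)
{
   return nr_enabled + needs_sgvs_element(vs, uses_edge_flag) +
          vs->uses_drawid;
}
```

With no enabled vertex buffers but `uses_drawid` set, the count is 1 rather than 0, so the early bail is not taken and the draw parameter element still gets emitted.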

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] New GBM backend for dEQP

2016-12-20 Thread Chad Versace

On Mon 19 Dec 2016, Tapani Pälli wrote:
> 
> 
> On 12/17/2016 03:58 AM, Chad Versace wrote:
> > Happy Christmas to everyone who's busy squashing dEQP bugs.
> > 
> > I wrote a new GBM backend for dEQP. I even submitted it to dEQP's
> > upstream Gerrit.  Pyry, dEQP's maintainer, told me over beer earlier
> > this year that he would accept it if I submitted it, and if it wasn't
> > too crazy. So, maybe it'll be upstream soon.
> > 
> > If you want to try it out, you can either fetch the patch from Gerrit:
> > $ git fetch https://android.googlesource.com/platform/external/deqp 
> > refs/changes/43/315743/1
> > 
> > View it on Gerrit:
> > https://android-review.googlesource.com/#/c/315743/
> > 
> > Fetch from my personal work-in-progress branch:
> > $ git fetch git://git.kiwitree.net/~chadv/deqp refs/heads/wip/gbm
> > 
> > View it on my cgit:
> > http://git.kiwitree.net/cgit/~chadv/deqp/log/?h=wip/gbm
> > 
> > GBM today does not support pixmaps nor pbuffers (eglCreatePixmapSurface
> > and eglCreatePbufferSurface), so the dEQP test coverage with GBM does
> > not have parity with X11. But, on the other hand, you get to run dEQP
> > without the headache of X11.
> > 
> > There's probably bugs. No surprises there.
> 
> Branch did not work 'out of the box' for me:
> 
> "No rule to make target 'framework/qphelper/.git/index', needed by
> 'framework/qphelper/qpReleaseInfo.inl'.  Stop."
> 
> (attached patch makes it work for me)

Strange. This may be related to a separate fix I submitted to dEQP
upstream:

Subject: Fix build when '.git' is a gitfile
https://android-review.googlesource.com/#/c/315234/

> One issue is that real users will use X11, Wayland or Android. Would be cool
> to have a 'switch' to toggle CI to use a particular backend so that most of
> the time we would run against gbm but then sometimes check that X11 still
> works etc.

Yes. I expect full test runs to be faster with GBM than with X11. If
that's true, then CI should default to running dEQP with GBM. And CI
should occasionally do a run with X11 to ensure there's no regressions,
and to also run any pbuffer and pixmap tests that get skipped on the
GBM run.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/2] radeonsi: add Polaris12 support (v3)

2016-12-20 Thread Alex Deucher

On Tue, Dec 20, 2016 at 1:34 PM, Marek Olšák  wrote:
> For the series:
>
> Reviewed-by: Marek Olšák 
>
> Has it ever been tested with Mesa?

It was tested when the code was originally written and the hybrid
stack (including mesa MM) has been tested recently.  I don't have a
polaris12 card at the moment.

Alex

>
> Marek
>
> On Mon, Dec 19, 2016 at 11:45 PM, Alex Deucher  wrote:
>> From: Junwei Zhang 
>>
>> v2: use gfxip names for llvm 4.0+
>> v3: use tonga for llvm <= 3.8
>>
>> Signed-off-by: Junwei Zhang 
>> Reviewed-by: Nicolai Hähnle 
>> Acked-by: Christian König 
>> ---
>>  src/amd/addrlib/r800/ciaddrlib.cpp| 3 ++-
>>  src/amd/addrlib/r800/ciaddrlib.h  | 1 +
>>  src/amd/common/amd_family.h   | 1 +
>>  src/amd/common/amdgpu_id.h| 4 
>>  src/gallium/drivers/radeon/r600_pipe_common.c | 3 +++
>>  src/gallium/drivers/radeon/radeon_vce.c   | 3 ++-
>>  src/gallium/drivers/radeonsi/si_pipe.c| 1 +
>>  src/gallium/drivers/radeonsi/si_state.c   | 1 +
>>  src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c | 4 
>>  9 files changed, 19 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/amd/addrlib/r800/ciaddrlib.cpp 
>> b/src/amd/addrlib/r800/ciaddrlib.cpp
>> index 7c5d29a..c726c4d 100644
>> --- a/src/amd/addrlib/r800/ciaddrlib.cpp
>> +++ b/src/amd/addrlib/r800/ciaddrlib.cpp
>> @@ -353,6 +353,7 @@ AddrChipFamily CIAddrLib::HwlConvertChipFamily(
>>  m_settings.isFiji= ASICREV_IS_FIJI_P(uChipRevision);
>>  m_settings.isPolaris10   = 
>> ASICREV_IS_POLARIS10_P(uChipRevision);
>>  m_settings.isPolaris11   = 
>> ASICREV_IS_POLARIS11_M(uChipRevision);
>> +m_settings.isPolaris12   = 
>> ASICREV_IS_POLARIS12_V(uChipRevision);
>>  break;
>>  case FAMILY_CZ:
>>  m_settings.isCarrizo = 1;
>> @@ -417,7 +418,7 @@ BOOL_32 CIAddrLib::HwlInitGlobalParams(
>>  {
>>  m_pipes = 16;
>>  }
>> -else if (m_settings.isPolaris11)
>> +else if (m_settings.isPolaris11 || m_settings.isPolaris12)
>>  {
>>  m_pipes = 4;
>>  }
>> diff --git a/src/amd/addrlib/r800/ciaddrlib.h 
>> b/src/amd/addrlib/r800/ciaddrlib.h
>> index de995fa..2c9a4cc 100644
>> --- a/src/amd/addrlib/r800/ciaddrlib.h
>> +++ b/src/amd/addrlib/r800/ciaddrlib.h
>> @@ -62,6 +62,7 @@ struct CIChipSettings
>>  UINT_32 isFiji: 1;
>>  UINT_32 isPolaris10   : 1;
>>  UINT_32 isPolaris11   : 1;
>> +UINT_32 isPolaris12   : 1;
>>  // VI fusion (Carrizo)
>>  UINT_32 isCarrizo : 1;
>>  };
>> diff --git a/src/amd/common/amd_family.h b/src/amd/common/amd_family.h
>> index 6a713ad..b09bbb8 100644
>> --- a/src/amd/common/amd_family.h
>> +++ b/src/amd/common/amd_family.h
>> @@ -91,6 +91,7 @@ enum radeon_family {
>>  CHIP_STONEY,
>>  CHIP_POLARIS10,
>>  CHIP_POLARIS11,
>> +CHIP_POLARIS12,
>>  CHIP_LAST,
>>  };
>>
>> diff --git a/src/amd/common/amdgpu_id.h b/src/amd/common/amdgpu_id.h
>> index f91df55..1683a5a 100644
>> --- a/src/amd/common/amdgpu_id.h
>> +++ b/src/amd/common/amdgpu_id.h
>> @@ -142,6 +142,8 @@ enum {
>>
>> VI_POLARIS11_M_A0 = 90,
>>
>> +   VI_POLARIS12_V_A0 = 100,
>> +
>> VI_UNKNOWN= 0xFF
>>  };
>>
>> @@ -156,6 +158,8 @@ enum {
>> ((eChipRev >= VI_POLARIS10_P_A0) && (eChipRev < VI_POLARIS11_M_A0))
>>  #define ASICREV_IS_POLARIS11_M(eChipRev)   \
>> (eChipRev >= VI_POLARIS11_M_A0)
>> +#define ASICREV_IS_POLARIS12_V(eChipRev)\
>> +   (eChipRev >= VI_POLARIS12_V_A0)
>>
>>  /* CZ specific rev IDs */
>>  enum {
>> diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c 
>> b/src/gallium/drivers/radeon/r600_pipe_common.c
>> index 0b5c6dc..e0b914c 100644
>> --- a/src/gallium/drivers/radeon/r600_pipe_common.c
>> +++ b/src/gallium/drivers/radeon/r600_pipe_common.c
>> @@ -755,6 +755,7 @@ static const char* r600_get_chip_name(struct 
>> r600_common_screen *rscreen)
>> case CHIP_FIJI: return "AMD FIJI";
>> case CHIP_POLARIS10: return "AMD POLARIS10";
>> case CHIP_POLARIS11: return "AMD POLARIS11";
>> +   case CHIP_POLARIS12: return "AMD POLARIS12";
>> case CHIP_STONEY: return "AMD STONEY";
>> default: return "AMD unknown";
>> }
>> @@ -889,9 +890,11 @@ const char *r600_get_llvm_processor_name(enum 
>> radeon_family family)
>>  #if HAVE_LLVM <= 0x0308
>> case CHIP_POLARIS10: return "tonga";
>> case CHIP_POLARIS11: return "tonga";
>> +   case CHIP_POLARIS12: return "tonga";
>>  #else
>> case CHIP_POLARIS10: return "polaris10";
>> case CHIP_POLARIS11: return "polaris11";
>> +   case CHIP_POLARIS12: return "polaris11";
>>  #endif
>> default: return "";
>> }
>>

Re: [Mesa-dev] [PATCH] egl/dri2: implement query surface hook

2016-12-20 Thread Chad Versace

On Tue 20 Dec 2016, Tapani Pälli wrote:
> This better guarantees that the values we return are
> in sync with what the underlying drawable currently has.
> 
> Together with the dEQP change in bug #98327, this fixes the following test:
> 
>dEQP-EGL.functional.resize.surface_size.grow
> 
> Signed-off-by: Tapani Pälli 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98327
> ---
>  src/egl/drivers/dri2/platform_x11.c | 30 ++
>  1 file changed, 30 insertions(+)


> diff --git a/src/egl/drivers/dri2/platform_x11.c 
> b/src/egl/drivers/dri2/platform_x11.c
> index db7d3b9..0c5d577 100644
> --- a/src/egl/drivers/dri2/platform_x11.c
> +++ b/src/egl/drivers/dri2/platform_x11.c
> @@ -395,6 +395,34 @@ dri2_x11_destroy_surface(_EGLDriver *drv, _EGLDisplay 
> *disp, _EGLSurface *surf)
>  }
>  
>  /**
> + * Function utilizes swrastGetDrawableInfo to get surface
> + * geometry from x server and calls default query surface
> + * implementation that returns the updated values.
> + *
> + * In case of errors we still return values that we currently
> + * have.
> + */
> +static EGLBoolean
> +dri2_query_surface(_EGLDriver *drv, _EGLDisplay *dpy,
> +   _EGLSurface *surf, EGLint attribute,
> +   EGLint *value)
> +{
> +   struct dri2_egl_display *dri2_dpy = dri2_egl_display(dpy);
> +   struct dri2_egl_surface *dri2_surf = dri2_egl_surface(surf);
> +   int x, y, w = -1, h = -1;
> +
> +   __DRIdrawable *drawable = dri2_dpy->vtbl->get_dri_drawable(surf);
> +   swrastGetDrawableInfo(drawable, &x, &y, &w, &h, dri2_surf);
> +
> +   if (w != -1 && h != -1) {
> +  surf->Width = w;
> +  surf->Height = h;
> +   }
> +
> +   return _eglQuerySurface(drv, dpy, surf, attribute, value);
> +}

The patch looks correct to me, but it incurs an X11 roundtrip even when
unneeded. A little change would ensure the roundtrip happens only when
needed. This is the same technique that platform_android.c uses to avoid
a SurfaceFlinger roundtrip.

   switch (attribute) {
   case EGL_WIDTH:
   case EGL_HEIGHT:
   ...  /* Do what the patch does. Update width, height with swrastGetDrawableInfo. */
   break;
   default:
   /* Do nothing */
   break;
   }

   return _eglQuerySurface(drv, dpy, surf, attribute, value);

By the way, I also can't reproduce bug 98327. I'm using Archlinux
with Openbox, a non-compositing window manager. The only apps on my
screen are xterm and dEQP test windows.
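
To make the suggestion above concrete, here is a minimal, self-contained sketch of the "only refresh geometry when width or height is queried" idea. Everything in it is a hypothetical stand-in: `surface_sketch`, `attr_sketch` and `query_server_geometry` replace `_EGLSurface`, the `EGL_WIDTH`/`EGL_HEIGHT` tokens and `swrastGetDrawableInfo`, and a counter takes the place of the real X11 roundtrip.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical attribute tokens standing in for EGL_WIDTH, EGL_HEIGHT, etc. */
enum attr_sketch { ATTR_WIDTH, ATTR_HEIGHT, ATTR_OTHER };

/* Hypothetical surface: cached values plus a simulated server-side size. */
struct surface_sketch {
   int width;
   int height;
   int other;
   int server_width;
   int server_height;
   unsigned roundtrips;   /* how many simulated server queries were made */
};

/* Simulated server query; stands in for the real drawable info call. */
static void
query_server_geometry(struct surface_sketch *surf, int *w, int *h)
{
   surf->roundtrips++;
   *w = surf->server_width;
   *h = surf->server_height;
}

static bool
query_surface_sketch(struct surface_sketch *surf, enum attr_sketch attribute,
                     int *value)
{
   int w = -1, h = -1;

   assert(surf != NULL && value != NULL);

   /* Only pay for the roundtrip when the answer depends on it. */
   switch (attribute) {
   case ATTR_WIDTH:
   case ATTR_HEIGHT:
      query_server_geometry(surf, &w, &h);
      if (w != -1 && h != -1) {
         surf->width = w;
         surf->height = h;
      }
      break;
   default:
      /* Do nothing */
      break;
   }

   /* Stand-in for the default query implementation. */
   switch (attribute) {
   case ATTR_WIDTH:  *value = surf->width;  return true;
   case ATTR_HEIGHT: *value = surf->height; return true;
   case ATTR_OTHER:  *value = surf->other;  return true;
   }
   return false;
}
```

Querying `ATTR_OTHER` leaves `roundtrips` untouched, while each width or height query bumps it by one and returns the refreshed size.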
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/2] radeonsi: add Polaris12 support (v3)

2016-12-20 Thread Marek Olšák

For the series:

Reviewed-by: Marek Olšák 

Has it ever been tested with Mesa?

Marek

On Mon, Dec 19, 2016 at 11:45 PM, Alex Deucher  wrote:
> From: Junwei Zhang 
>
> v2: use gfxip names for llvm 4.0+
> v3: use tonga for llvm <= 3.8
>
> Signed-off-by: Junwei Zhang 
> Reviewed-by: Nicolai Hähnle 
> Acked-by: Christian König 
> ---
>  src/amd/addrlib/r800/ciaddrlib.cpp| 3 ++-
>  src/amd/addrlib/r800/ciaddrlib.h  | 1 +
>  src/amd/common/amd_family.h   | 1 +
>  src/amd/common/amdgpu_id.h| 4 
>  src/gallium/drivers/radeon/r600_pipe_common.c | 3 +++
>  src/gallium/drivers/radeon/radeon_vce.c   | 3 ++-
>  src/gallium/drivers/radeonsi/si_pipe.c| 1 +
>  src/gallium/drivers/radeonsi/si_state.c   | 1 +
>  src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c | 4 
>  9 files changed, 19 insertions(+), 2 deletions(-)
>
> diff --git a/src/amd/addrlib/r800/ciaddrlib.cpp 
> b/src/amd/addrlib/r800/ciaddrlib.cpp
> index 7c5d29a..c726c4d 100644
> --- a/src/amd/addrlib/r800/ciaddrlib.cpp
> +++ b/src/amd/addrlib/r800/ciaddrlib.cpp
> @@ -353,6 +353,7 @@ AddrChipFamily CIAddrLib::HwlConvertChipFamily(
>  m_settings.isFiji= ASICREV_IS_FIJI_P(uChipRevision);
>  m_settings.isPolaris10   = 
> ASICREV_IS_POLARIS10_P(uChipRevision);
>  m_settings.isPolaris11   = 
> ASICREV_IS_POLARIS11_M(uChipRevision);
> +m_settings.isPolaris12   = 
> ASICREV_IS_POLARIS12_V(uChipRevision);
>  break;
>  case FAMILY_CZ:
>  m_settings.isCarrizo = 1;
> @@ -417,7 +418,7 @@ BOOL_32 CIAddrLib::HwlInitGlobalParams(
>  {
>  m_pipes = 16;
>  }
> -else if (m_settings.isPolaris11)
> +else if (m_settings.isPolaris11 || m_settings.isPolaris12)
>  {
>  m_pipes = 4;
>  }
> diff --git a/src/amd/addrlib/r800/ciaddrlib.h 
> b/src/amd/addrlib/r800/ciaddrlib.h
> index de995fa..2c9a4cc 100644
> --- a/src/amd/addrlib/r800/ciaddrlib.h
> +++ b/src/amd/addrlib/r800/ciaddrlib.h
> @@ -62,6 +62,7 @@ struct CIChipSettings
>  UINT_32 isFiji: 1;
>  UINT_32 isPolaris10   : 1;
>  UINT_32 isPolaris11   : 1;
> +UINT_32 isPolaris12   : 1;
>  // VI fusion (Carrizo)
>  UINT_32 isCarrizo : 1;
>  };
> diff --git a/src/amd/common/amd_family.h b/src/amd/common/amd_family.h
> index 6a713ad..b09bbb8 100644
> --- a/src/amd/common/amd_family.h
> +++ b/src/amd/common/amd_family.h
> @@ -91,6 +91,7 @@ enum radeon_family {
>  CHIP_STONEY,
>  CHIP_POLARIS10,
>  CHIP_POLARIS11,
> +CHIP_POLARIS12,
>  CHIP_LAST,
>  };
>
> diff --git a/src/amd/common/amdgpu_id.h b/src/amd/common/amdgpu_id.h
> index f91df55..1683a5a 100644
> --- a/src/amd/common/amdgpu_id.h
> +++ b/src/amd/common/amdgpu_id.h
> @@ -142,6 +142,8 @@ enum {
>
> VI_POLARIS11_M_A0 = 90,
>
> +   VI_POLARIS12_V_A0 = 100,
> +
> VI_UNKNOWN= 0xFF
>  };
>
> @@ -156,6 +158,8 @@ enum {
> ((eChipRev >= VI_POLARIS10_P_A0) && (eChipRev < VI_POLARIS11_M_A0))
>  #define ASICREV_IS_POLARIS11_M(eChipRev)   \
> (eChipRev >= VI_POLARIS11_M_A0)
> +#define ASICREV_IS_POLARIS12_V(eChipRev)\
> +   (eChipRev >= VI_POLARIS12_V_A0)
>
>  /* CZ specific rev IDs */
>  enum {
> diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c 
> b/src/gallium/drivers/radeon/r600_pipe_common.c
> index 0b5c6dc..e0b914c 100644
> --- a/src/gallium/drivers/radeon/r600_pipe_common.c
> +++ b/src/gallium/drivers/radeon/r600_pipe_common.c
> @@ -755,6 +755,7 @@ static const char* r600_get_chip_name(struct 
> r600_common_screen *rscreen)
> case CHIP_FIJI: return "AMD FIJI";
> case CHIP_POLARIS10: return "AMD POLARIS10";
> case CHIP_POLARIS11: return "AMD POLARIS11";
> +   case CHIP_POLARIS12: return "AMD POLARIS12";
> case CHIP_STONEY: return "AMD STONEY";
> default: return "AMD unknown";
> }
> @@ -889,9 +890,11 @@ const char *r600_get_llvm_processor_name(enum 
> radeon_family family)
>  #if HAVE_LLVM <= 0x0308
> case CHIP_POLARIS10: return "tonga";
> case CHIP_POLARIS11: return "tonga";
> +   case CHIP_POLARIS12: return "tonga";
>  #else
> case CHIP_POLARIS10: return "polaris10";
> case CHIP_POLARIS11: return "polaris11";
> +   case CHIP_POLARIS12: return "polaris11";
>  #endif
> default: return "";
> }
> diff --git a/src/gallium/drivers/radeon/radeon_vce.c 
> b/src/gallium/drivers/radeon/radeon_vce.c
> index aad2ec1..dcd56ea 100644
> --- a/src/gallium/drivers/radeon/radeon_vce.c
> +++ b/src/gallium/drivers/radeon/radeon_vce.c
> @@ -413,7 +413,8 @@ struct pipe_video_codec *rvce_create_encoder(struct 
> pipe_context *context,
> enc->use_vui = true;
>

Re: [Mesa-dev] [PATCH 2/2] gallium-docs: Add documentation for when using several contexts

2016-12-20 Thread Marek Olšák

For the series:

Reviewed-by: Marek Olšák 

Marek

On Mon, Dec 19, 2016 at 8:09 PM, Axel Davy  wrote:
> Add documentation to make explicit what can be expected and what is allowed
> when using several contexts.
>
> Signed-off-by: Axel Davy 
> ---
>  src/gallium/docs/source/context.rst | 23 +++
>  1 file changed, 23 insertions(+)
>
> diff --git a/src/gallium/docs/source/context.rst 
> b/src/gallium/docs/source/context.rst
> index e190cefc85..35f51a0941 100644
> --- a/src/gallium/docs/source/context.rst
> +++ b/src/gallium/docs/source/context.rst
> @@ -707,3 +707,26 @@ notifications are single-shot, i.e. subsequent calls to
>since the last call or since the last notification by callback.
>  * ``set_device_reset_callback`` sets a callback which will be called when
>a device reset is detected. The callback is only called synchronously.
> +
> +Using several contexts
> +----------------------
> +
> +Several contexts from the same screen can be used at the same time. Objects
> +created on one context cannot be used in another context, but the objects
> +created by the screen methods can be used by all contexts.
> +
> +Transfers
> +^^^^^^^^^
> +A transfer on one context is not expected to synchronize properly with
> +rendering on other contexts, thus only areas not yet used for rendering should
> +be locked.
> +
> +A flush is required after transfer_unmap to expect other contexts to see the
> +uploaded data, unless:
> +
> +* Using persistent mapping. Associated with coherent mapping, unmapping the
> +  resource is also not required to use it in other contexts. Without coherent
> +  mapping, memory_barrier(PIPE_BARRIER_MAPPED_BUFFER) should be called on the
> +  context that has mapped the resource. No flush is required.
> +
> +* Mapping the resource with PIPE_TRANSFER_MAP_DIRECTLY.
> --
> 2.11.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/2] nir: update nir_lower_returns to only predicate instructions when needed

2016-12-20 Thread Jason Ekstrand

On Mon, Dec 19, 2016 at 8:18 PM, Timothy Arceri <
timothy.arc...@collabora.com> wrote:

> Unless an if statement contains nested returns we can simply add
> any following instructions to the branch without the return.
>
> V2: fix handling if_nested_return value when there is a sibling if/loop
> that doesn't contain a return. (Spotted by Ken)
> ---
>  src/compiler/nir/nir_lower_returns.c | 37 ++
> --
>  1 file changed, 31 insertions(+), 6 deletions(-)
>
> diff --git a/src/compiler/nir/nir_lower_returns.c
> b/src/compiler/nir/nir_lower_returns.c
> index cf49d5b..5eec984 100644
> --- a/src/compiler/nir/nir_lower_returns.c
> +++ b/src/compiler/nir/nir_lower_returns.c
> @@ -30,6 +30,8 @@ struct lower_returns_state {
> struct exec_list *cf_list;
> nir_loop *loop;
> nir_variable *return_flag;
> +   /* Are there other return statements nested in the current if */
> +   bool if_nested_return;
>  };
>
>  static bool lower_returns_in_cf_list(struct exec_list *cf_list,
> @@ -82,8 +84,10 @@ lower_returns_in_loop(nir_loop *loop, struct
> lower_returns_state *state)
>  * flag set to true.  We need to predicate everything following the
> loop
>  * on the return flag.
>  */
> -   if (progress)
> +   if (progress) {
>    predicate_following(&loop->cf_node, state);
> +  state->if_nested_return = true;
> +   }
>
> return progress;
>  }
> @@ -91,10 +95,13 @@ lower_returns_in_loop(nir_loop *loop, struct
> lower_returns_state *state)
>  static bool
>  lower_returns_in_if(nir_if *if_stmt, struct lower_returns_state *state)
>  {
> -   bool progress;
> +   bool progress, then_progress;
>
> -   progress = lower_returns_in_cf_list(&if_stmt->then_list, state);
> -   progress = lower_returns_in_cf_list(&if_stmt->else_list, state) || progress;
> +   bool if_nested_return = state->if_nested_return;
> +   state->if_nested_return = false;
> +
> +   then_progress = lower_returns_in_cf_list(&if_stmt->then_list, state);
> +   progress = lower_returns_in_cf_list(&if_stmt->else_list, state) || then_progress;
>

I don't really get why we need this if_nested_return thing.  Why can't we
just have two progress booleans called then_progress and else_progress and
just do

if (then_progress && else_progress) {
   predicate_following
} else if (!then_progress && !else_progress) {
   return false;
} else {
   /* Put it in one side or the other based on progress */
}

That seems way simpler.
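
Spelled out as plain C, the decision suggested above might look like the sketch below. It only models the three-way choice; the enum and function names are hypothetical, and nothing here touches real NIR control flow or settles whether this covers the nested-return case the patch is concerned with.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for what happens to the code following the if. */
enum follow_action_sketch {
   FOLLOW_LEAVE_ALONE,     /* neither branch contains a return */
   FOLLOW_PREDICATE,       /* both branches contain a return */
   FOLLOW_MOVE_INTO_ELSE,  /* only the then-branch returns */
   FOLLOW_MOVE_INTO_THEN   /* only the else-branch returns */
};

static enum follow_action_sketch
choose_follow_action(bool then_progress, bool else_progress)
{
   if (then_progress && else_progress)
      return FOLLOW_PREDICATE;
   if (!then_progress && !else_progress)
      return FOLLOW_LEAVE_ALONE;

   /* Put the following code in the side that did not return. */
   return then_progress ? FOLLOW_MOVE_INTO_ELSE : FOLLOW_MOVE_INTO_THEN;
}

/* Human-readable name, handy when tracing the pass by hand. */
static const char *
follow_action_name(enum follow_action_sketch action)
{
   switch (action) {
   case FOLLOW_LEAVE_ALONE:    return "leave alone";
   case FOLLOW_PREDICATE:      return "predicate following";
   case FOLLOW_MOVE_INTO_ELSE: return "move into else";
   case FOLLOW_MOVE_INTO_THEN: return "move into then";
   }
   assert(!"unknown follow action");
   return "";
}
```

The "move into" direction matches the patch, which reinserts the trailing code at the end of the branch that has no return.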


>
> /* If either of the recursive calls made progress, then there were
>  * returns inside of the body of the if.  If we're in a loop, then
> these
> @@ -106,8 +113,25 @@ lower_returns_in_if(nir_if *if_stmt, struct
> lower_returns_state *state)
>  * after a return, we need to predicate everything following on the
>  * return flag.
>  */
> -   if (progress && !state->loop)
> -  predicate_following(&if_stmt->cf_node, state);
> +   if (progress && !state->loop) {
> +  if (state->if_nested_return) {
> + predicate_following(&if_stmt->cf_node, state);
> +  } else {
> + /* If there are no nested returns we can just add the instructions to
> +  * the end of the branch that doesn't have the return.
> +  */
> + nir_cf_list list;
> + nir_cf_extract(&list, nir_after_cf_node(&if_stmt->cf_node),
> +nir_after_cf_list(state->cf_list));
> +
> + if (then_progress)
> +nir_cf_reinsert(&list, nir_after_cf_list(&if_stmt->else_list));
> + else
> +nir_cf_reinsert(&list, nir_after_cf_list(&if_stmt->then_list));
> +  }
> +   }
> +
> +   state->if_nested_return = progress || if_nested_return;
>
> return progress;
>  }
> @@ -221,6 +245,7 @@ nir_lower_returns_impl(nir_function_impl *impl)
> state.cf_list = &impl->body;
> state.loop = NULL;
> state.return_flag = NULL;
> +   state.if_nested_return = false;
> nir_builder_init(&state.builder, impl);
>
> bool progress = lower_returns_in_cf_list(&impl->body, &state);
> --
> 2.9.3
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] nir/algebraic: Add optimizations for "a == a && a CMP b"

2016-12-20 Thread Jason Ekstrand

On Tue, Dec 20, 2016 at 12:49 AM, Gustaw Smolarczyk 
wrote:

> 2016-12-20 6:32 GMT+01:00 Jason Ekstrand :
> > This sequence shows up in The Talos Principle, at least under Vulkan,
> > and prevents loop analysis from properly computing trip counts in a
> > few loops.
> > ---
> >  src/compiler/nir/nir_opt_algebraic.py | 8 
> >  1 file changed, 8 insertions(+)
> >
> > diff --git a/src/compiler/nir/nir_opt_algebraic.py
> b/src/compiler/nir/nir_opt_algebraic.py
> > index 698ac67..cc70ad5 100644
> > --- a/src/compiler/nir/nir_opt_algebraic.py
> > +++ b/src/compiler/nir/nir_opt_algebraic.py
> > @@ -464,6 +464,14 @@ def bitfield_reverse(u):
> >
> >  optimizations += [(bitfield_reverse('x@32'), ('bitfield_reverse',
> 'x'))]
> >
> > +# For any comparison operation, "cmp", if you have "a != a && a cmp b"
> then
> > +# the "a != a" is redundant because it's equivalent to "a is not NaN"
> and, if
>
> Shouldn't the comment have a == a ?
>

Yes I did.  Thanks for catching that.  Fixed locally.
--Jason


> Regards,
> Gustaw
>
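
The redundancy being discussed is easy to check outside the compiler. The sketch below is plain C rather than NIR, with hypothetical helper names. It covers `<`, `>=` and `==`, which are all false when either operand is NaN, so a leading `a == a` test cannot change the result. C's `!=` is true on NaN and is deliberately left out.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>
#include <stddef.h>

/* The unoptimized pattern: an explicit "a is not NaN" test, then a compare. */
static bool lt_guarded(float a, float b) { return a == a && a < b; }
static bool ge_guarded(float a, float b) { return a == a && a >= b; }
static bool eq_guarded(float a, float b) { return a == a && a == b; }

/* True when dropping "a == a" leaves all three results unchanged. */
static bool
guard_is_redundant(float a, float b)
{
   return lt_guarded(a, b) == (a < b) &&
          ge_guarded(a, b) == (a >= b) &&
          eq_guarded(a, b) == (a == b);
}

/* Checks every pair from a small table that includes NaN and infinities. */
static void
check_guard_is_redundant(void)
{
   const float samples[] = { NAN, -INFINITY, -1.0f, 0.0f, 1.0f, INFINITY };
   const size_t n = sizeof(samples) / sizeof(samples[0]);

   for (size_t i = 0; i < n; i++)
      for (size_t j = 0; j < n; j++)
         assert(guard_is_redundant(samples[i], samples[j]));
}
```

This assumes IEEE 754 comparison semantics, so it would not be meaningful when built with options such as `-ffast-math`.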
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [RFC PATCH] clover: Return correct CL_EVENT_REFERENCE_COUNT

2016-12-20 Thread Vedran Miletić

On 12/20/2016 11:59 AM, Jan Vesely wrote:
> On Fri, 2016-12-16 at 13:43 -0800, Francisco Jerez wrote:
>> Vedran Miletić  writes:
>>
>>> Current implementation of event handling keeps an extra reference to
>>> the hardware event, in addition to the reference returned via the OpenCL
>>> API. This additional reference is internal and should not be counted
>>> when queried via the clGetEventInfo() function.
>>>
>>> Fixes Piglit's cl/api/retain_release-event test.
>>>
>>> Signed-off-by: Vedran Miletić 
>>> ---
>>>  src/gallium/state_trackers/clover/api/event.cpp | 4 +++-
>>>  1 file changed, 3 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/src/gallium/state_trackers/clover/api/event.cpp 
>>> b/src/gallium/state_trackers/clover/api/event.cpp
>>> index 5d1a0e5..74bc4d9 100644
>>> --- a/src/gallium/state_trackers/clover/api/event.cpp
>>> +++ b/src/gallium/state_trackers/clover/api/event.cpp
>>> @@ -107,7 +107,9 @@ clGetEventInfo(cl_event d_ev, cl_event_info param,
>>>break;
>>>  
>>> case CL_EVENT_REFERENCE_COUNT:
>>> -  buf.as_scalar() = ev.ref_count();
>>> +  // Current implementation of event handling keeps an extra reference 
>>> to
>>> +  // the hardware event, which is internal and should not be counted.
>>> +  buf.as_scalar() = ev.ref_count() - 1;
>>
>> I don't think this is correct.  There is an internal event reference
>> held by the command queue object, but only for as long as the event
>> remains in the queue until the next flush.  In other cases the above
>> would give you a reference count which is off by one.  That said:
>>
>>> The reference count returned should be considered immediately
>>> stale. It is unsuitable for general use in applications. This feature
>>> is provided for identifying memory leaks.
> 
> 
> I found only a generic description that mentions reference count == 1
> wrt. events (in Glossary). Even there it says that it's an internal
> counter.
> The only object that seems to require reference count == 1 is root
> device. Contexts, queues, mem, samplers, programs, kernels, events, all
>  include the above footnote.
> I think the piglit test should be changed to check for non-zero value
> instead of 1.
> 
> Jan
> 

Thank you both for your inputs. Let's try changing the Piglit test.

Vedran

-- 
Vedran Miletić
vedran.miletic.net
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] clover: Use Clang's diagnostics

2016-12-20 Thread Vedran Miletić

Presently errors from frontend are handled only if they occur in
clang::CompilerInvocation::CreateFromArgs(). This patch uses
clang::DiagnosticsEngine to detect errors such as invalid values for
Clang frontend arguments.

Fixes Piglit's cl/program/build/fail/invalid-version-declaration.cl
test.

Signed-off-by: Vedran Miletić 
---
 src/gallium/state_trackers/clover/llvm/invocation.cpp | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp 
b/src/gallium/state_trackers/clover/llvm/invocation.cpp
index 675cf19..29dec44 100644
--- a/src/gallium/state_trackers/clover/llvm/invocation.cpp
+++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp
@@ -98,8 +98,9 @@ namespace {
 const std::vector<std::string> &opts,
 std::string &r_log) {
   std::unique_ptr<clang::CompilerInstance> c { new clang::CompilerInstance };
+  clang::TextDiagnosticBuffer* diag_buffer = new clang::TextDiagnosticBuffer;
   clang::DiagnosticsEngine diag { new clang::DiagnosticIDs,
-new clang::DiagnosticOptions, new clang::TextDiagnosticBuffer };
+new clang::DiagnosticOptions, diag_buffer };
 
   // Parse the compiler options.  A file name should be present at the end
   // and must have the .cl extension in order for the CompilerInvocation
@@ -111,6 +112,10 @@ namespace {
  c->getInvocation(), copts.data(), copts.data() + copts.size(), 
diag))
  throw invalid_build_options_error();
 
+  diag_buffer->FlushDiagnostics(diag);
+  if (diag.hasErrorOccurred())
+  throw invalid_build_options_error();
+
   c->getTargetOpts().CPU = target.cpu;
   c->getTargetOpts().Triple = target.triple;
   c->getLangOpts().NoBuiltin = true;
-- 
2.7.4


Re: [Mesa-dev] [PATCH 1/2] radeonsi: add Polaris12 support (v3)

2016-12-20 Thread Alex Deucher

On Tue, Dec 20, 2016 at 6:49 AM, Andreas Boll
 wrote:
> 2016-12-19 23:45 GMT+01:00 Alex Deucher :
>> From: Junwei Zhang 
>>
>> v2: use gfxip names for llvm 4.0+
>> v3: use tonga for llvm <= 3.8
>>
>> Signed-off-by: Junwei Zhang 
>> Reviewed-by: Nicolai Hähnle 
>> Acked-by: Christian König 
>> ---
>
> snip
>
>> diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c 
>> b/src/gallium/drivers/radeon/r600_pipe_common.c
>> index 0b5c6dc..e0b914c 100644
>> --- a/src/gallium/drivers/radeon/r600_pipe_common.c
>> +++ b/src/gallium/drivers/radeon/r600_pipe_common.c
>> @@ -755,6 +755,7 @@ static const char* r600_get_chip_name(struct 
>> r600_common_screen *rscreen)
>> case CHIP_FIJI: return "AMD FIJI";
>> case CHIP_POLARIS10: return "AMD POLARIS10";
>> case CHIP_POLARIS11: return "AMD POLARIS11";
>> +   case CHIP_POLARIS12: return "AMD POLARIS12";
>> case CHIP_STONEY: return "AMD STONEY";
>> default: return "AMD unknown";
>> }
>> @@ -889,9 +890,11 @@ const char *r600_get_llvm_processor_name(enum 
>> radeon_family family)
>>  #if HAVE_LLVM <= 0x0308
>> case CHIP_POLARIS10: return "tonga";
>> case CHIP_POLARIS11: return "tonga";
>> +   case CHIP_POLARIS12: return "tonga";
>>  #else
>> case CHIP_POLARIS10: return "polaris10";
>> case CHIP_POLARIS11: return "polaris11";
>> +   case CHIP_POLARIS12: return "polaris11";
>>  #endif
>
> You've dropped the processor name for LLVM 4.0+.
> I guess that wasn't intended.

That was intended.  It didn't seem worth adding all of the additional
special cases.  If/when we convert the other asics to use gfxip names,
we can convert polaris12 as well.

Alex

> Something like this should work:
>
> #if HAVE_LLVM <= 0x0308
> // return processor names for LLVM <= 3.8
> #elif HAVE_LLVM == 0x0309
> // return processor names for LLVM 3.9
> #else
> // return processor names for LLVM > 3.9
> #endif
>
> Andreas
>
>> default: return "";
>> }
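
A minimal sketch of the three-way split Andreas suggests above, for illustration only. The enum, the function name llvm_processor_name() and the default HAVE_LLVM value are hypothetical stand-ins for the real r600_get_llvm_processor_name() and its build-time macro. The thread does not spell out the gfxip names for LLVM > 3.9, so that branch simply repeats the 3.9 strings, which matches what the v3 patch does for Polaris12.

```c
/* Hypothetical stand-in for Mesa's build-time macro: 0x0309 means LLVM 3.9. */
#ifndef HAVE_LLVM
#define HAVE_LLVM 0x0309
#endif

/* Hypothetical subset of enum radeon_family. */
enum chip_family {
   CHIP_POLARIS10,
   CHIP_POLARIS11,
   CHIP_POLARIS12,
   CHIP_UNKNOWN
};

static const char *
llvm_processor_name(enum chip_family family)
{
   switch (family) {
#if HAVE_LLVM <= 0x0308
   /* LLVM <= 3.8: the patch maps every Polaris part to "tonga". */
   case CHIP_POLARIS10: return "tonga";
   case CHIP_POLARIS11: return "tonga";
   case CHIP_POLARIS12: return "tonga";
#elif HAVE_LLVM == 0x0309
   /* LLVM 3.9: Polaris12 reuses the "polaris11" processor name. */
   case CHIP_POLARIS10: return "polaris10";
   case CHIP_POLARIS11: return "polaris11";
   case CHIP_POLARIS12: return "polaris11";
#else
   /* LLVM > 3.9: gfxip names would go here once the other asics are
    * converted; until then this sketch repeats the 3.9 names. */
   case CHIP_POLARIS10: return "polaris10";
   case CHIP_POLARIS11: return "polaris11";
   case CHIP_POLARIS12: return "polaris11";
#endif
   default: return "";
   }
}
```

With only two distinct name sets today, the extra #elif buys nothing yet, which is the point Alex makes in his reply.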

Re: [Mesa-dev] [PATCH 2/9] i965: Consider surface resolves just before draw

2016-12-20 Thread Pohjolainen, Topi

On Tue, Dec 20, 2016 at 03:05:14PM +, Ben Widawsky wrote:
> On 16-12-20 16:45:30, Topi Pohjolainen wrote:
> > If gl-state remains intact api_validate.c::_mesa_valid_to_render()
> > and brw_try_draw_prims() skip checking if textures and shader
> > images need resolves.
> > This can lead to a case where a surface is left unresolved due to
> > driver writing it internally using blorp. Blorp doesn't trash
> > global gl state but only the internal driver state.
> > 
> > Signed-off-by: Topi Pohjolainen 
> > CC: Kenneth Graunke 
> > CC: Jason Ekstrand 
> > CC: Ben Widawsky 
> > ---
> > src/mesa/drivers/dri/i965/brw_compute.c | 1 +
> > src/mesa/drivers/dri/i965/brw_context.c | 4 
> > src/mesa/drivers/dri/i965/brw_draw.c| 2 ++
> > 3 files changed, 3 insertions(+), 4 deletions(-)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/brw_compute.c 
> > b/src/mesa/drivers/dri/i965/brw_compute.c
> > index 16b5df7..77c056c 100644
> > --- a/src/mesa/drivers/dri/i965/brw_compute.c
> > +++ b/src/mesa/drivers/dri/i965/brw_compute.c
> > @@ -186,6 +186,7 @@ brw_dispatch_compute_common(struct gl_context *ctx)
> >if (ctx->NewState)
> >   _mesa_update_state(ctx);
> > 
> > +   brw_resolve_surfaces(ctx);
> >brw_validate_textures(brw);
> > 
> >const int sampler_state_size = 16; /* 16 bytes */
> > diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
> > b/src/mesa/drivers/dri/i965/brw_context.c
> > index 367cd9d..0d339ff 100644
> > --- a/src/mesa/drivers/dri/i965/brw_context.c
> > +++ b/src/mesa/drivers/dri/i965/brw_context.c
> > @@ -180,10 +180,6 @@ intel_update_state(struct gl_context * ctx, GLuint 
> > new_state)
> > 
> >brw->NewGLState |= new_state;
> > 
> > -   _mesa_unlock_context_textures(ctx);
> > -   brw_resolve_surfaces(ctx);
> > -   _mesa_lock_context_textures(ctx);
> > -
> 
> I'm surprised this patch doesn't fix a bug. From this patch this lock/unlock
> being removed seems fishy, but I think that might be an issue from the
> previous patch.

Good question, I'll amend the commit message:

It should be noted that the new callers brw_dispatch_compute_common() and
brw_try_draw_prims() are deep in the driver draw logic and shouldn't need
_mesa_unlock_context_textures()/_mesa_lock_context_textures(). Current
caller intel_update_state() in turn implements dd_function_table::UpdateState
and also gets called by _mesa_update_state_locked() which apparently needs the
unlock/lock sequence.

> 
> Reviewed-by: Ben Widawsky 

Thanks!

> 
> >if (new_state & _NEW_BUFFERS) {
> >   intel_update_framebuffer(ctx, ctx->DrawBuffer);
> >   if (ctx->DrawBuffer != ctx->ReadBuffer)
> > diff --git a/src/mesa/drivers/dri/i965/brw_draw.c 
> > b/src/mesa/drivers/dri/i965/brw_draw.c
> > index 2ce782d..5e58f96 100644
> > --- a/src/mesa/drivers/dri/i965/brw_draw.c
> > +++ b/src/mesa/drivers/dri/i965/brw_draw.c
> > @@ -628,6 +628,8 @@ brw_try_draw_prims(struct gl_context *ctx,
> >if (ctx->NewState)
> >   _mesa_update_state(ctx);
> > 
> > +   brw_resolve_surfaces(ctx);
> > +
> >/* We have to validate the textures *before* checking for fallbacks;
> > * otherwise, the software fallback won't be able to rely on the
> > * texture state, the firstLevel and lastLevel fields won't be
> > -- 
> > 2.5.5
> > 
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> 
> -- 
> Ben Widawsky, Intel Open Source Technology Center

Re: [Mesa-dev] [PATCH 2/9] i965: Consider surface resolves just before draw

2016-12-20 Thread Ben Widawsky


On 16-12-20 16:45:30, Topi Pohjolainen wrote:

> If gl-state remains intact api_validate.c::_mesa_valid_to_render()
> and brw_try_draw_prims() skip checking if textures and shader
> images need resolves.
> This can lead to a case where a surface is left unresolved due to
> driver writing it internally using blorp. Blorp doesn't trash
> global gl state but only the internal driver state.
>
> Signed-off-by: Topi Pohjolainen
> CC: Kenneth Graunke
> CC: Jason Ekstrand
> CC: Ben Widawsky
> ---
> src/mesa/drivers/dri/i965/brw_compute.c | 1 +
> src/mesa/drivers/dri/i965/brw_context.c | 4 
> src/mesa/drivers/dri/i965/brw_draw.c| 2 ++
> 3 files changed, 3 insertions(+), 4 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_compute.c 
> b/src/mesa/drivers/dri/i965/brw_compute.c
> index 16b5df7..77c056c 100644
> --- a/src/mesa/drivers/dri/i965/brw_compute.c
> +++ b/src/mesa/drivers/dri/i965/brw_compute.c
> @@ -186,6 +186,7 @@ brw_dispatch_compute_common(struct gl_context *ctx)
>    if (ctx->NewState)
>   _mesa_update_state(ctx);
>
> +   brw_resolve_surfaces(ctx);
>    brw_validate_textures(brw);
>
>    const int sampler_state_size = 16; /* 16 bytes */
> diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
> b/src/mesa/drivers/dri/i965/brw_context.c
> index 367cd9d..0d339ff 100644
> --- a/src/mesa/drivers/dri/i965/brw_context.c
> +++ b/src/mesa/drivers/dri/i965/brw_context.c
> @@ -180,10 +180,6 @@ intel_update_state(struct gl_context * ctx, GLuint 
> new_state)
>
>    brw->NewGLState |= new_state;
>
> -   _mesa_unlock_context_textures(ctx);
> -   brw_resolve_surfaces(ctx);
> -   _mesa_lock_context_textures(ctx);
> -

I'm surprised this patch doesn't fix a bug. From this patch this lock/unlock
being removed seems fishy, but I think that might be an issue from the previous
patch.

Reviewed-by: Ben Widawsky

>    if (new_state & _NEW_BUFFERS) {
>   intel_update_framebuffer(ctx, ctx->DrawBuffer);
>   if (ctx->DrawBuffer != ctx->ReadBuffer)
> diff --git a/src/mesa/drivers/dri/i965/brw_draw.c 
> b/src/mesa/drivers/dri/i965/brw_draw.c
> index 2ce782d..5e58f96 100644
> --- a/src/mesa/drivers/dri/i965/brw_draw.c
> +++ b/src/mesa/drivers/dri/i965/brw_draw.c
> @@ -628,6 +628,8 @@ brw_try_draw_prims(struct gl_context *ctx,
>    if (ctx->NewState)
>   _mesa_update_state(ctx);
>
> +   brw_resolve_surfaces(ctx);
> +
>    /* We have to validate the textures *before* checking for fallbacks;
> * otherwise, the software fallback won't be able to rely on the
> * texture state, the firstLevel and lastLevel fields won't be
> --
> 2.5.5



--
Ben Widawsky, Intel Open Source Technology Center

[Mesa-dev] i965/gen6+: Yet another blorp path - tex_(sub)image2d

2016-12-20 Thread Topi Pohjolainen

This series introduces a new use of brw_blorp_blit_miptrees()/
brw_blorp_copy_miptrees(). The initial intention was to enable compression
on SKL already at the time of upload. That, however, didn't help
benchmarks; on the contrary, it regressed performance in some of
them (Synmark OglDrvRes for one).
Therefore the blorp-based upload only replaces the current gpu-based
upload path, _mesa_meta_pbo_TexSubImage(). This is limited to cases
where the source is already gpu accessible (buffer object) or where the
target is busy (currently being written by the gpu).

The implementation comes with a user-space pixel source option which can
be used to handle all the cases the y-tiled memcpy handles as well. It is
also capable of handling some of the cases the y-tiled memcpy leaves to
the slow _mesa_store_texsubimage(). This isn't, however, enabled due to
performance regressions. Uploading with the gpu means the incoming pixel
data needs to be gpu accessible, which first requires a copy to a
buffer object. This copy hurts if it isn't followed by a sufficient
amount of sampling.

Finally there is a more RFC type of patch that simply drops the meta
path for gen < 6. I don't know which real-world cases get hurt
without the meta path, but at least there aren't any jenkins
regressions.

Topi Pohjolainen (9):
  i965: Refactor surface resolves prior to draw call
  i965: Consider surface resolves just before draw
  intel/blorp/dbg: Name blit shaders for easy recognition in dumps
  i965: Estimate batch space per shader stage
  meta: Refactor texture format translation
  i965: Add support for tex upload using gpu
  i965/gen6+: Use blorp for tex_image_2d
  i965/gen6+: Use for tex_subimage_2d
  i965: Drop _mesa_meta_pbo_TexSubImage() even for gen < 6

 src/intel/blorp/blorp_blit.c   |   2 +
 src/mesa/drivers/common/meta_tex_subimage.c|   9 +-
 src/mesa/drivers/dri/i965/brw_compute.c|   1 +
 src/mesa/drivers/dri/i965/brw_context.c| 176 ---
 src/mesa/drivers/dri/i965/brw_draw.c   | 232 -
 src/mesa/drivers/dri/i965/brw_draw.h   |   2 +
 src/mesa/drivers/dri/i965/intel_tex.h  |   8 +
 src/mesa/drivers/dri/i965/intel_tex_image.c|  29 ++--
 src/mesa/drivers/dri/i965/intel_tex_subimage.c | 218 +--
 src/mesa/main/glformats.c  |  15 ++
 src/mesa/main/glformats.h  |   4 +
 11 files changed, 484 insertions(+), 212 deletions(-)
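
The path selection described in the cover letter can be sketched roughly as follows. This is a simplified model of the ordering patch 7/9 sets up for intelTexImage(), not the driver code itself: the enum and choose_upload_path() are hypothetical names, and the sketch ignores that each real path can fail and fall through to the next one.

```c
#include <stdbool.h>

/* Hypothetical labels for the upload paths discussed in the series. */
enum upload_path {
   UPLOAD_GPU_COPY,   /* blorp based, intel_texsubimage_gpu_copy() */
   UPLOAD_META_PBO,   /* _mesa_meta_pbo_TexSubImage(), kept for gen < 6 */
   UPLOAD_CPU         /* tiled memcpy, then _mesa_store_texsubimage() */
};

/* Returns the first path that would be attempted for an upload. */
static enum upload_path
choose_upload_path(int gen, bool tex_busy, bool is_src_bo)
{
   /* The GPU copy only pays off when the source is already GPU
    * accessible (buffer object) or a CPU write would stall on a
    * busy target. */
   if (gen >= 6 && (tex_busy || is_src_bo))
      return UPLOAD_GPU_COPY;

   /* Older generations keep the meta path until patch 9/9 drops it. */
   if (gen < 6)
      return UPLOAD_META_PBO;

   return UPLOAD_CPU;
}
```

For example, an idle texture uploaded from client memory on gen9 goes straight to the CPU path, which avoids the extra copy into a temporary buffer object that the cover letter identifies as the source of the regressions.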

-- 
2.5.5


[Mesa-dev] [PATCH 6/9] i965: Add support for tex upload using gpu

2016-12-20 Thread Topi Pohjolainen

Signed-off-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/intel_tex.h  |   8 +
 src/mesa/drivers/dri/i965/intel_tex_subimage.c | 194 +
 2 files changed, 202 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/intel_tex.h 
b/src/mesa/drivers/dri/i965/intel_tex.h
index 376f075..c7d0937 100644
--- a/src/mesa/drivers/dri/i965/intel_tex.h
+++ b/src/mesa/drivers/dri/i965/intel_tex.h
@@ -65,6 +65,14 @@ intel_texsubimage_tiled_memcpy(struct gl_context *ctx,
bool for_glTexImage);
 
 bool
+intel_texsubimage_gpu_copy(struct brw_context *brw, GLuint dims,
+   struct gl_texture_image *tex_image,
+   unsigned x, unsigned y, unsigned z,
+   unsigned w, unsigned h, unsigned d,
+   GLenum format, GLenum type, const void *pixels,
+   const struct gl_pixelstore_attrib *packing);
+
+bool
 intel_gettexsubimage_tiled_memcpy(struct gl_context *ctx,
   struct gl_texture_image *texImage,
   GLint xoffset, GLint yofset,
diff --git a/src/mesa/drivers/dri/i965/intel_tex_subimage.c 
b/src/mesa/drivers/dri/i965/intel_tex_subimage.c
index b7e52bc..f999a93 100644
--- a/src/mesa/drivers/dri/i965/intel_tex_subimage.c
+++ b/src/mesa/drivers/dri/i965/intel_tex_subimage.c
@@ -24,6 +24,7 @@
  */
 
 #include "main/bufferobj.h"
+#include "main/glformats.h"
 #include "main/image.h"
 #include "main/macros.h"
 #include "main/mtypes.h"
@@ -34,8 +35,10 @@
 #include "main/enums.h"
 #include "drivers/common/meta.h"
 
+#include "brw_blorp.h"
 #include "brw_context.h"
 #include "intel_batchbuffer.h"
+#include "intel_buffer_objects.h"
 #include "intel_tex.h"
 #include "intel_mipmap_tree.h"
 #include "intel_blit.h"
@@ -43,6 +46,197 @@
 
 #define FILE_DEBUG_FLAG DEBUG_TEXTURE
 
+static drm_intel_bo *
+intel_texsubimage_get_src_as_bo(struct brw_context *brw, unsigned dims,
+struct gl_texture_image *tex_image,
+unsigned w, unsigned h, unsigned d,
+GLenum format, GLenum type, const void *pixels,
+const struct gl_pixelstore_attrib *packing)
+{
+   /* Account for SKIP_PIXELS, SKIP_ROWS, ALIGNMENT, and SKIP_IMAGES */
+   const uint32_t first_pixel = _mesa_image_offset(dims, packing, w, h,
+   format, type, 0, 0, 0);
+   const uint32_t last_pixel =  _mesa_image_offset(dims, packing, w, h,
+   format, type,
+   d - 1, h - 1, w);
+   const uint32_t size = last_pixel - first_pixel;
+
+   drm_intel_bo * const bo =
+  drm_intel_bo_alloc(brw->bufmgr, "tmp_tex_subimage_src", size, 64);
+
+   if (bo == NULL) {
+  perf_debug("intel_texsubimage: temp bo creation failed: size = %u\n",
+ size);
+  return NULL;
+   }
+
+   if (drm_intel_bo_subdata(bo, 0, size, pixels + first_pixel)) {
+  perf_debug("intel_texsubimage: temp bo upload failed\n");
+  drm_intel_bo_unreference(bo);
+  return NULL;
+   }
+
+   return bo;
+}
+
+static uint32_t
+intel_texsubimage_get_src_offset(unsigned dims, unsigned w, unsigned h,
+ GLenum format, GLenum type,
+ const void *pixels,
+ const struct gl_pixelstore_attrib *packing)
+{
+   /* Account for SKIP_PIXELS, SKIP_ROWS, ALIGNMENT, and SKIP_IMAGES */
+   const uint32_t first_pixel = _mesa_image_offset(dims, packing, w, h,
+   format, type, 0, 0, 0);
+
+   /* In case of buffer object source 'pixels' represents offset in bytes. */
+   return first_pixel + (intptr_t)pixels;
+}
+
+/* Consider all the restrictions and determine the format of the source. */
+static mesa_format
+intel_texsubimage_check_upload(struct brw_context *brw,
+   const struct gl_texture_image *tex_image,
+   unsigned h, GLenum format, GLenum type,
+   const struct gl_pixelstore_attrib *packing)
+{
+   /* TODO: Add support for buffer object upload 1D alignment or perhaps use
+* flat 2D source.
+*/
+   if (tex_image->TexObject->Target == GL_TEXTURE_1D_ARRAY) {
+  perf_debug("intel_texsubimage: 1D_ARRAY not supported\n");
+  return MESA_FORMAT_NONE;
+   }
+
+   if (brw->ctx._ImageTransferState)
+  return MESA_FORMAT_NONE;
+
+   if (packing->SwapBytes || packing->LsbFirst || packing->Invert) {
+  perf_debug("intel_texsubimage: unsupported gl_pixelstore_attrib\n");
+  return MESA_FORMAT_NONE;
+   }
+
+   /* TODO: This can be easily supported as blit manually offsets miptree
+*   for each slice.
+*/
+   if (packing->ImageHeight != 0) {
+
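
The size computation in intel_texsubimage_get_src_as_bo() above relies on _mesa_image_offset() to account for the unpack parameters. As a rough illustration of the arithmetic involved, here is a simplified 2D-only sketch. The struct and the helper names are hypothetical, and the real function additionally handles 3D images, SKIP_IMAGES, bitmaps and inverted rows.

```c
#include <stdint.h>

/* Hypothetical, simplified version of gl_pixelstore_attrib (2D only). */
struct unpack_params {
   uint32_t alignment;    /* GL_UNPACK_ALIGNMENT: 1, 2, 4 or 8 */
   uint32_t row_length;   /* GL_UNPACK_ROW_LENGTH, 0 means "use width" */
   uint32_t skip_pixels;  /* GL_UNPACK_SKIP_PIXELS */
   uint32_t skip_rows;    /* GL_UNPACK_SKIP_ROWS */
};

/* Bytes from the start of one row to the start of the next. */
static uint32_t
row_stride(const struct unpack_params *p, uint32_t width, uint32_t bpp)
{
   const uint32_t pixels = p->row_length ? p->row_length : width;
   const uint32_t bytes = pixels * bpp;

   /* Round up to the unpack alignment. */
   return (bytes + p->alignment - 1) / p->alignment * p->alignment;
}

/* Byte offset of pixel (column, row) relative to the client pointer. */
static uint32_t
pixel_offset(const struct unpack_params *p, uint32_t width, uint32_t bpp,
             uint32_t row, uint32_t column)
{
   return (p->skip_rows + row) * row_stride(p, width, bpp) +
          (p->skip_pixels + column) * bpp;
}

/* Bytes spanned from the first pixel to one past the last pixel. */
static uint32_t
source_size(const struct unpack_params *p, uint32_t width, uint32_t height,
            uint32_t bpp)
{
   const uint32_t first = pixel_offset(p, width, bpp, 0, 0);
   const uint32_t end = pixel_offset(p, width, bpp, height - 1, width);

   return end - first;
}
```

Note that the last row contributes only its used pixels rather than a full stride, so a 3x2 RGB8 image with 4-byte alignment spans 12 + 9 = 21 bytes.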

[Mesa-dev] [PATCH 2/9] i965: Consider surface resolves just before draw

2016-12-20 Thread Topi Pohjolainen

If gl-state remains intact api_validate.c::_mesa_valid_to_render()
and brw_try_draw_prims() skip checking if textures and shader
images need resolves.
This can lead to a case where a surface is left unresolved due to
driver writing it internally using blorp. Blorp doesn't trash
global gl state but only the internal driver state.

Signed-off-by: Topi Pohjolainen 
CC: Kenneth Graunke 
CC: Jason Ekstrand 
CC: Ben Widawsky 
---
 src/mesa/drivers/dri/i965/brw_compute.c | 1 +
 src/mesa/drivers/dri/i965/brw_context.c | 4 
 src/mesa/drivers/dri/i965/brw_draw.c| 2 ++
 3 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_compute.c 
b/src/mesa/drivers/dri/i965/brw_compute.c
index 16b5df7..77c056c 100644
--- a/src/mesa/drivers/dri/i965/brw_compute.c
+++ b/src/mesa/drivers/dri/i965/brw_compute.c
@@ -186,6 +186,7 @@ brw_dispatch_compute_common(struct gl_context *ctx)
if (ctx->NewState)
   _mesa_update_state(ctx);
 
+   brw_resolve_surfaces(ctx);
brw_validate_textures(brw);
 
const int sampler_state_size = 16; /* 16 bytes */
diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
b/src/mesa/drivers/dri/i965/brw_context.c
index 367cd9d..0d339ff 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -180,10 +180,6 @@ intel_update_state(struct gl_context * ctx, GLuint 
new_state)
 
brw->NewGLState |= new_state;
 
-   _mesa_unlock_context_textures(ctx);
-   brw_resolve_surfaces(ctx);
-   _mesa_lock_context_textures(ctx);
-
if (new_state & _NEW_BUFFERS) {
   intel_update_framebuffer(ctx, ctx->DrawBuffer);
   if (ctx->DrawBuffer != ctx->ReadBuffer)
diff --git a/src/mesa/drivers/dri/i965/brw_draw.c 
b/src/mesa/drivers/dri/i965/brw_draw.c
index 2ce782d..5e58f96 100644
--- a/src/mesa/drivers/dri/i965/brw_draw.c
+++ b/src/mesa/drivers/dri/i965/brw_draw.c
@@ -628,6 +628,8 @@ brw_try_draw_prims(struct gl_context *ctx,
if (ctx->NewState)
   _mesa_update_state(ctx);
 
+   brw_resolve_surfaces(ctx);
+
/* We have to validate the textures *before* checking for fallbacks;
 * otherwise, the software fallback won't be able to rely on the
 * texture state, the firstLevel and lastLevel fields won't be
-- 
2.5.5
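
The problem this patch addresses can be modelled with a toy example. Everything below is hypothetical (struct fake_ctx and the helpers are not Mesa code); it only illustrates why a resolve tied to the GL dirty bits can be skipped after a driver-internal write, while a resolve issued on every draw cannot.

```c
#include <stdbool.h>

/* Toy model: GL-level dirty bits plus driver-internal surface state. */
struct fake_ctx {
   unsigned new_state;        /* stands in for ctx->NewState */
   bool surface_unresolved;   /* internal state, invisible to GL */
};

static void
resolve_surfaces(struct fake_ctx *c)
{
   c->surface_unresolved = false;
}

/* A blorp-like internal write: touches the surface without dirtying
 * any GL state. */
static void
internal_write(struct fake_ctx *c)
{
   c->surface_unresolved = true;
}

/* Before the patch: the resolve lives in the state-update hook, which
 * only runs when GL state is dirty. */
static void
draw_before_patch(struct fake_ctx *c)
{
   if (c->new_state) {
      resolve_surfaces(c);
      c->new_state = 0;
   }
}

/* After the patch: state update and resolve are decoupled, so the
 * resolve is considered on every draw. */
static void
draw_after_patch(struct fake_ctx *c)
{
   if (c->new_state)
      c->new_state = 0;

   resolve_surfaces(c);
}
```

Calling internal_write() followed by draw_before_patch() on a context with no dirty GL state leaves the surface unresolved, whereas draw_after_patch() resolves it.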


[Mesa-dev] [PATCH 4/9] i965: Estimate batch space per shader stage

2016-12-20 Thread Topi Pohjolainen

The current estimate doesn't consider the space needed for surface states
and it only calculates for one shader stage. Each stage can have
its own sampler and surface state configuration.

While this is only a matter of runtime dynamics, we don't seem to hit
it currently. However, this becomes visible with blorp tex uploads
(HSW with piglit test max-samplers): one runs out of space while
batch wrapping isn't allowed.

Signed-off-by: Topi Pohjolainen 
CC: Kenneth Graunke 
CC: Jason Ekstrand 
---
 src/mesa/drivers/dri/i965/brw_draw.c | 52 +---
 1 file changed, 49 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_draw.c 
b/src/mesa/drivers/dri/i965/brw_draw.c
index 5e58f96..864398e 100644
--- a/src/mesa/drivers/dri/i965/brw_draw.c
+++ b/src/mesa/drivers/dri/i965/brw_draw.c
@@ -427,6 +427,54 @@ brw_predraw_set_aux_buffers(struct brw_context *brw)
}
 }
 
+static unsigned
+brw_get_num_active_samplers(const struct gl_context *ctx,
+const struct gl_program *prog)
+{
+   const unsigned last = util_last_bit(prog->SamplersUsed);
+   unsigned count = 0;
+
+   for (unsigned s = 0; s < last; s++) {
+  if (prog->SamplersUsed & (1 << s)) {
+ const unsigned unit = prog->SamplerUnits[s];
+ if (ctx->Texture.Unit[unit]._Current)
+++count;
+  }
+   }
+
+   return count;
+}
+
+static unsigned
+brw_estimate_batch_space_for_textures(const struct brw_context *brw)
+{
+   const struct gl_context *ctx = &brw->ctx;
+   unsigned total = 0;
+
+   for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) {
+  const struct gl_linked_shader *shader =
+ ctx->_Shader->CurrentProgram[i] ?
+ctx->_Shader->CurrentProgram[i]->_LinkedShaders[i] : NULL;
+
+  if (shader == NULL)
+ continue;
+
+  const struct gl_program *prog = shader->Program;
+  const unsigned num_samplers = brw_get_num_active_samplers(ctx, prog);
+  const unsigned sampler_needs_per_tex_unit =
+ 16 /* sampler_state_size */ +
+ sizeof(struct gen5_sampler_default_color);
+  const unsigned surface_state_needs_per_tex_unit =
+ ALIGN(brw->isl_dev.ss.size, brw->isl_dev.ss.align) +
+ 4 /* binding table pointer */;
+  const unsigned total_per_tex_unit = sampler_needs_per_tex_unit +
+  surface_state_needs_per_tex_unit;
+  total += (num_samplers * total_per_tex_unit);
+   }
+
+   return total;
+}
+
 static bool
 intel_disable_rb_aux_buffer(struct brw_context *brw, const drm_intel_bo *bo)
 {
@@ -677,11 +725,9 @@ brw_try_draw_prims(struct gl_context *ctx,
 
for (i = 0; i < nr_prims; i++) {
   int estimated_max_prim_size;
-  const int sampler_state_size = 16;
 
   estimated_max_prim_size = 512; /* batchbuffer commands */
-  estimated_max_prim_size += BRW_MAX_TEX_UNIT *
- (sampler_state_size + sizeof(struct gen5_sampler_default_color));
+  estimated_max_prim_size += brw_estimate_batch_space_for_textures(brw);
   estimated_max_prim_size += 1024; /* gen6 VS push constants */
   estimated_max_prim_size += 1024; /* gen6 WM push constants */
   estimated_max_prim_size += 512; /* misc. pad */
-- 
2.5.5
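
The per-stage estimate in this patch boils down to the arithmetic sketched below. The sketch is standalone and simplified: struct stage_info and the function names are hypothetical, sampler s is assumed to map to texture unit s (the patch goes through prog->SamplerUnits[]), and the 16-byte default color size is an assumed value standing in for sizeof(struct gen5_sampler_default_color).

```c
#include <stdint.h>

#define SAMPLER_STATE_SIZE 16   /* as in the patch */
#define DEFAULT_COLOR_SIZE 16   /* assumed value for this sketch */
#define BINDING_TABLE_ENTRY 4   /* binding table pointer, as in the patch */

/* Hypothetical per-stage summary. */
struct stage_info {
   uint32_t samplers_used;   /* bitmask of samplers the shader uses */
   uint32_t bound_units;     /* bitmask of units with a texture bound */
};

/* Count samplers that are both used and backed by a bound texture. */
static unsigned
count_active_samplers(const struct stage_info *stage)
{
   uint32_t active = stage->samplers_used & stage->bound_units;
   unsigned count = 0;

   while (active) {
      active &= active - 1;   /* clear the lowest set bit */
      count++;
   }

   return count;
}

static unsigned
align_up(unsigned value, unsigned alignment)
{
   return (value + alignment - 1) / alignment * alignment;
}

/* Sum sampler and surface state needs over all shader stages. */
static unsigned
estimate_texture_batch_space(const struct stage_info *stages,
                             unsigned num_stages,
                             unsigned surface_state_size,
                             unsigned surface_state_align)
{
   const unsigned per_unit =
      SAMPLER_STATE_SIZE + DEFAULT_COLOR_SIZE +
      align_up(surface_state_size, surface_state_align) +
      BINDING_TABLE_ENTRY;
   unsigned total = 0;

   for (unsigned i = 0; i < num_stages; i++)
      total += count_active_samplers(&stages[i]) * per_unit;

   return total;
}
```

Unlike the old BRW_MAX_TEX_UNIT based figure, the result scales with the number of stages that actually sample textures, and it includes the surface state that the old estimate left out.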


[Mesa-dev] [PATCH 9/9] i965: Drop _mesa_meta_pbo_TexSubImage() even for gen < 6

2016-12-20 Thread Topi Pohjolainen

Signed-off-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/intel_tex_image.c| 24 +++-
 src/mesa/drivers/dri/i965/intel_tex_subimage.c | 19 +--
 2 files changed, 12 insertions(+), 31 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_tex_image.c 
b/src/mesa/drivers/dri/i965/intel_tex_image.c
index 67f83db..e503043 100644
--- a/src/mesa/drivers/dri/i965/intel_tex_image.c
+++ b/src/mesa/drivers/dri/i965/intel_tex_image.c
@@ -127,7 +127,6 @@ intelTexImage(struct gl_context * ctx,
 {
struct brw_context *brw = brw_context(ctx);
struct intel_texture_image *intelImage = intel_texture_image(texImage);
-   bool ok;
 
bool tex_busy = intelImage->mt && drm_intel_bo_busy(intelImage->mt->bo);
 
@@ -156,22 +155,13 @@ intelTexImage(struct gl_context * ctx,
   format, type, pixels, unpack))
   return;
 
-   if (brw->gen < 6 &&
-   _mesa_meta_pbo_TexSubImage(ctx, dims, texImage, 0, 0, 0,
-  texImage->Width, texImage->Height,
-  texImage->Depth,
-  format, type, pixels,
-  tex_busy, unpack))
-  return;
-
-   ok = intel_texsubimage_tiled_memcpy(ctx, dims, texImage,
-   0, 0, 0, /*x,y,z offsets*/
-   texImage->Width,
-   texImage->Height,
-   texImage->Depth,
-   format, type, pixels, unpack,
-   false /*allocate_storage*/);
-   if (ok)
+   if (intel_texsubimage_tiled_memcpy(ctx, dims, texImage,
+  0, 0, 0, /*x,y,z offsets*/
+  texImage->Width,
+  texImage->Height,
+  texImage->Depth,
+  format, type, pixels, unpack,
+  false /*allocate_storage*/))
   return;
 
DBG("%s: upload image %dx%dx%d pixels %p\n",
diff --git a/src/mesa/drivers/dri/i965/intel_tex_subimage.c 
b/src/mesa/drivers/dri/i965/intel_tex_subimage.c
index 741637a..60dc862 100644
--- a/src/mesa/drivers/dri/i965/intel_tex_subimage.c
+++ b/src/mesa/drivers/dri/i965/intel_tex_subimage.c
@@ -395,7 +395,6 @@ intelTexSubImage(struct gl_context * ctx,
 {
struct brw_context *brw = brw_context(ctx);
struct intel_mipmap_tree *mt = intel_texture_image(texImage)->mt;
-   bool ok;
 
bool tex_busy = mt && drm_intel_bo_busy(mt->bo);
 
@@ -416,19 +415,11 @@ intelTexSubImage(struct gl_context * ctx,
   format, type, pixels, packing))
   return;
 
-   ok = _mesa_meta_pbo_TexSubImage(ctx, dims, texImage,
-   xoffset, yoffset, zoffset,
-   width, height, depth, format, type,
-   pixels, tex_busy, packing);
-   if (ok)
-  return;
-
-   ok = intel_texsubimage_tiled_memcpy(ctx, dims, texImage,
-   xoffset, yoffset, zoffset,
-   width, height, depth,
-   format, type, pixels, packing,
-   false /*for_glTexImage*/);
-   if (ok)
+   if (intel_texsubimage_tiled_memcpy(ctx, dims, texImage,
+  xoffset, yoffset, zoffset,
+  width, height, depth,
+  format, type, pixels, packing,
+  false /*for_glTexImage*/))
  return;
 
_mesa_store_texsubimage(ctx, dims, texImage,
-- 
2.5.5


[Mesa-dev] [PATCH 5/9] meta: Refactor texture format translation

2016-12-20 Thread Topi Pohjolainen

Signed-off-by: Topi Pohjolainen 
---
 src/mesa/drivers/common/meta_tex_subimage.c |  9 +++--
 src/mesa/main/glformats.c   | 15 +++
 src/mesa/main/glformats.h   |  4 
 3 files changed, 22 insertions(+), 6 deletions(-)

diff --git a/src/mesa/drivers/common/meta_tex_subimage.c 
b/src/mesa/drivers/common/meta_tex_subimage.c
index 703efcd..b8c422b 100644
--- a/src/mesa/drivers/common/meta_tex_subimage.c
+++ b/src/mesa/drivers/common/meta_tex_subimage.c
@@ -72,7 +72,8 @@ create_texture_for_pbo(struct gl_context *ctx,
const struct gl_pixelstore_attrib *packing,
struct gl_buffer_object **tmp_pbo, GLuint *tmp_tex)
 {
-   uint32_t pbo_format;
+   const mesa_format pbo_format =
+  _mesa_tex_format_from_format_and_type(ctx, format, type);
GLenum internal_format;
unsigned row_stride;
struct gl_buffer_object *buffer_obj;
@@ -85,11 +86,7 @@ create_texture_for_pbo(struct gl_context *ctx,
packing->Invert)
   return NULL;
 
-   pbo_format = _mesa_format_from_format_and_type(format, type);
-   if (_mesa_format_is_mesa_array_format(pbo_format))
-  pbo_format = _mesa_format_from_array_format(pbo_format);
-
-   if (!pbo_format || !ctx->TextureFormatSupported[pbo_format])
+   if (pbo_format == MESA_FORMAT_NONE)
   return NULL;
 
/* Account for SKIP_PIXELS, SKIP_ROWS, ALIGNMENT, and SKIP_IMAGES */
diff --git a/src/mesa/main/glformats.c b/src/mesa/main/glformats.c
index a95909c..4f24020 100644
--- a/src/mesa/main/glformats.c
+++ b/src/mesa/main/glformats.c
@@ -3632,6 +3632,21 @@ _mesa_format_from_format_and_type(GLenum format, GLenum 
type)
unreachable("Unsupported format");
 }
 
+uint32_t
+_mesa_tex_format_from_format_and_type(const struct gl_context *ctx,
+  GLenum gl_format, GLenum type)
+{
+   mesa_format format = _mesa_format_from_format_and_type(gl_format, type);
+
+   if (_mesa_format_is_mesa_array_format(format))
+  format = _mesa_format_from_array_format(format);
+  
+   if (format == MESA_FORMAT_NONE || !ctx->TextureFormatSupported[format])
+  return MESA_FORMAT_NONE;
+
+   return format;
+}
+
 /**
  * Returns true if \p internal_format is a sized internal format that
  * is marked "Color Renderable" in Table 8.10 of the ES 3.2 specification.
diff --git a/src/mesa/main/glformats.h b/src/mesa/main/glformats.h
index 763307f..5c9d826 100644
--- a/src/mesa/main/glformats.h
+++ b/src/mesa/main/glformats.h
@@ -148,6 +148,10 @@ _mesa_base_tex_format(const struct gl_context *ctx, GLint 
internalFormat );
 extern uint32_t
 _mesa_format_from_format_and_type(GLenum format, GLenum type);
 
+extern uint32_t
+_mesa_tex_format_from_format_and_type(const struct gl_context *ctx,
+  GLenum gl_format, GLenum type);
+
 extern bool
 _mesa_is_es3_color_renderable(GLenum internal_format);
 
-- 
2.5.5


[Mesa-dev] [PATCH 3/9] intel/blorp/dbg: Name blit shaders for easy recognition in dumps

2016-12-20 Thread Topi Pohjolainen

Blorp clears already have an equivalent.

Signed-off-by: Topi Pohjolainen 
---
 src/intel/blorp/blorp_blit.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/intel/blorp/blorp_blit.c b/src/intel/blorp/blorp_blit.c
index 8abe3a8..9dcd33f 100644
--- a/src/intel/blorp/blorp_blit.c
+++ b/src/intel/blorp/blorp_blit.c
@@ -1299,6 +1299,8 @@ brw_blorp_get_blit_kernel(struct blorp_context *blorp,
struct brw_wm_prog_data prog_data;
 
nir_shader *nir = brw_blorp_build_nir_shader(blorp, mem_ctx, prog_key);
+   nir->info->name = ralloc_strdup(nir, "BLORP-blit");
+
struct brw_wm_prog_key wm_key;
brw_blorp_init_wm_prog_key(&wm_key);
wm_key.tex.compressed_multisample_layout_mask =
-- 
2.5.5


[Mesa-dev] [PATCH 7/9] i965/gen6+: Use blorp for tex_image_2d

2016-12-20 Thread Topi Pohjolainen

instead of _mesa_meta_pbo_TexSubImage().

While the newly introduced intel_texsubimage_gpu_copy() is capable
of handling all the cases _mesa_meta_pbo_TexSubImage() can, it
is also capable of handling everything intel_texsubimage_tiled_memcpy()
does, and in addition part of the cases left to _mesa_store_teximage().
This, however, leads to performance regressions in a few benchmarks,
especially with Synmark OglDrvRes. Therefore intel_texsubimage_gpu_copy()
is only used to replace the meta path for now.

Signed-off-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/intel_tex_image.c | 21 +++--
 1 file changed, 15 insertions(+), 6 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_tex_image.c 
b/src/mesa/drivers/dri/i965/intel_tex_image.c
index 141996f..67f83db 100644
--- a/src/mesa/drivers/dri/i965/intel_tex_image.c
+++ b/src/mesa/drivers/dri/i965/intel_tex_image.c
@@ -125,6 +125,7 @@ intelTexImage(struct gl_context * ctx,
   GLenum format, GLenum type, const void *pixels,
   const struct gl_pixelstore_attrib *unpack)
 {
+   struct brw_context *brw = brw_context(ctx);
struct intel_texture_image *intelImage = intel_texture_image(texImage);
bool ok;
 
@@ -147,12 +148,20 @@ intelTexImage(struct gl_context * ctx,
if (intelImage->mt->format == MESA_FORMAT_S_UINT8)
   intelImage->mt->r8stencil_needs_update = true;
 
-   ok = _mesa_meta_pbo_TexSubImage(ctx, dims, texImage, 0, 0, 0,
-   texImage->Width, texImage->Height,
-   texImage->Depth,
-   format, type, pixels,
-   tex_busy, unpack);
-   if (ok)
+   const bool is_src_bo = _mesa_is_bufferobj(unpack->BufferObj);
+   if (brw->gen >= 6 && (tex_busy || is_src_bo) &&
+   intel_texsubimage_gpu_copy(brw, dims, texImage, 0, 0, 0,
+  texImage->Width, texImage->Height,
+  texImage->Depth,
+  format, type, pixels, unpack))
+  return;
+
+   if (brw->gen < 6 &&
+   _mesa_meta_pbo_TexSubImage(ctx, dims, texImage, 0, 0, 0,
+  texImage->Width, texImage->Height,
+  texImage->Depth,
+  format, type, pixels,
+  tex_busy, unpack))
   return;
 
ok = intel_texsubimage_tiled_memcpy(ctx, dims, texImage,
-- 
2.5.5


[Mesa-dev] [PATCH 8/9] i965/gen6+: Use for tex_subimage_2d

2016-12-20 Thread Topi Pohjolainen

Signed-off-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/intel_tex_subimage.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/intel_tex_subimage.c 
b/src/mesa/drivers/dri/i965/intel_tex_subimage.c
index f999a93..741637a 100644
--- a/src/mesa/drivers/dri/i965/intel_tex_subimage.c
+++ b/src/mesa/drivers/dri/i965/intel_tex_subimage.c
@@ -393,6 +393,7 @@ intelTexSubImage(struct gl_context * ctx,
  const GLvoid * pixels,
  const struct gl_pixelstore_attrib *packing)
 {
+   struct brw_context *brw = brw_context(ctx);
struct intel_mipmap_tree *mt = intel_texture_image(texImage)->mt;
bool ok;
 
@@ -407,6 +408,14 @@ intelTexSubImage(struct gl_context * ctx,
_mesa_enum_to_string(format), _mesa_enum_to_string(type),
texImage->Level, texImage->Width, texImage->Height, texImage->Depth);
 
+   const bool is_src_bo = _mesa_is_bufferobj(packing->BufferObj);
+   if (brw->gen >= 6 && (tex_busy || is_src_bo) &&
+   intel_texsubimage_gpu_copy(brw, dims, texImage,
+  xoffset, yoffset, zoffset,
+  width, height, depth,
+  format, type, pixels, packing))
+  return;
+
ok = _mesa_meta_pbo_TexSubImage(ctx, dims, texImage,
xoffset, yoffset, zoffset,
width, height, depth, format, type,
-- 
2.5.5


[Mesa-dev] [PATCH 1/9] i965: Refactor surface resolves prior to draw call

2016-12-20 Thread Topi Pohjolainen

and move it to brw_draw.c where it will be eventually used.

Signed-off-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_context.c | 174 +--
 src/mesa/drivers/dri/i965/brw_draw.c| 178 
 src/mesa/drivers/dri/i965/brw_draw.h|   2 +
 3 files changed, 181 insertions(+), 173 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c
index 4ca77c7..367cd9d 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -169,66 +169,10 @@ intel_update_framebuffer(struct gl_context *ctx,
  fb->DefaultGeometry.NumSamples);
 }
 
-static bool
-intel_disable_rb_aux_buffer(struct brw_context *brw, const drm_intel_bo *bo)
-{
-   const struct gl_framebuffer *fb = brw->ctx.DrawBuffer;
-   bool found = false;
-
-   for (unsigned i = 0; i < fb->_NumColorDrawBuffers; i++) {
-  const struct intel_renderbuffer *irb =
- intel_renderbuffer(fb->_ColorDrawBuffers[i]);
-
-  if (irb && irb->mt->bo == bo) {
- found = brw->draw_aux_buffer_disabled[i] = true;
-  }
-   }
-
-   return found;
-}
-
-/* On Gen9 color buffers may be compressed by the hardware (lossless
- * compression). There are, however, format restrictions and care needs to be
- * taken that the sampler engine is capable for re-interpreting a buffer with
- * format different the buffer was originally written with.
- *
- * For example, SRGB formats are not compressible and the sampler engine isn't
- * capable of treating RGBA_UNORM as SRGB_ALPHA. In such a case the underlying
- * color buffer needs to be resolved so that the sampling surface can be
- * sampled as non-compressed (i.e., without the auxiliary MCS buffer being
- * set).
- */
-static bool
-intel_texture_view_requires_resolve(struct brw_context *brw,
-struct intel_texture_object *intel_tex)
-{
-   if (brw->gen < 9 ||
-   !intel_miptree_is_lossless_compressed(brw, intel_tex->mt))
- return false;
-
-   const uint32_t brw_format = brw_format_for_mesa_format(intel_tex->_Format);
-
-   if (isl_format_supports_lossless_compression(&brw->screen->devinfo,
-brw_format))
-  return false;
-
-   perf_debug("Incompatible sampling format (%s) for rbc (%s)\n",
-  _mesa_get_format_name(intel_tex->_Format),
-  _mesa_get_format_name(intel_tex->mt->format));
-
-   if (intel_disable_rb_aux_buffer(brw, intel_tex->mt->bo))
-  perf_debug("Sampling renderbuffer with non-compressible format - "
- "turning off compression");
-
-   return true;
-}
-
 static void
 intel_update_state(struct gl_context * ctx, GLuint new_state)
 {
struct brw_context *brw = brw_context(ctx);
-   struct intel_texture_object *tex_obj;
-   struct intel_renderbuffer *depth_irb;
 
if (ctx->swrast_context)
   _swrast_InvalidateState(ctx, new_state);
@@ -237,123 +181,7 @@ intel_update_state(struct gl_context * ctx, GLuint new_state)
brw->NewGLState |= new_state;
 
_mesa_unlock_context_textures(ctx);
-
-   /* Resolve the depth buffer's HiZ buffer. */
-   depth_irb = intel_get_renderbuffer(ctx->DrawBuffer, BUFFER_DEPTH);
-   if (depth_irb)
-  intel_renderbuffer_resolve_hiz(brw, depth_irb);
-
-   memset(brw->draw_aux_buffer_disabled, 0,
-  sizeof(brw->draw_aux_buffer_disabled));
-
-   /* Resolve depth buffer and render cache of each enabled texture. */
-   int maxEnabledUnit = ctx->Texture._MaxEnabledTexImageUnit;
-   for (int i = 0; i <= maxEnabledUnit; i++) {
-  if (!ctx->Texture.Unit[i]._Current)
-continue;
-  tex_obj = intel_texture_object(ctx->Texture.Unit[i]._Current);
-  if (!tex_obj || !tex_obj->mt)
-continue;
-  if (intel_miptree_sample_with_hiz(brw, tex_obj->mt))
- intel_miptree_all_slices_resolve_hiz(brw, tex_obj->mt);
-  else
- intel_miptree_all_slices_resolve_depth(brw, tex_obj->mt);
-  /* Sampling engine understands lossless compression and resolving
-   * those surfaces should be skipped for performance reasons.
-   */
-  const int flags = intel_texture_view_requires_resolve(brw, tex_obj) ?
-   0 : INTEL_MIPTREE_IGNORE_CCS_E;
-  intel_miptree_all_slices_resolve_color(brw, tex_obj->mt, flags);
-  brw_render_cache_set_check_flush(brw, tex_obj->mt->bo);
-
-  if (tex_obj->base.StencilSampling ||
-  tex_obj->mt->format == MESA_FORMAT_S_UINT8) {
- intel_update_r8stencil(brw, tex_obj->mt);
-  }
-   }
-
-   /* Resolve color for each active shader image. */
-   for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) {
-  const struct gl_linked_shader *shader =
- ctx->_Shader->CurrentProgram[i] ?
-ctx->_Shader->CurrentProgram[i]->_LinkedShaders[i] : NULL;
-
-  if (unlikely(shader &&

Re: [Mesa-dev] [PATCH 2/6] gallivm: optimize SoA AoS fallback fetch path a little

2016-12-20 Thread Jose Fonseca


On 12/12/16 00:11, srol...@vmware.com wrote:

From: Roland Scheidegger 

We should do transpose, not extract/insert, at least with a "sufficient" number
of channels (for 4 channels, extract/insert shuffles generated otherwise look
truly terrifying). Albeit we shouldn't fall back to that so often in any case.
---
 src/gallium/auxiliary/gallivm/lp_bld_format_soa.c | 83 +++
 1 file changed, 70 insertions(+), 13 deletions(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_format_soa.c b/src/gallium/auxiliary/gallivm/lp_bld_format_soa.c
index 389bfa0..902c763 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_format_soa.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_format_soa.c
@@ -40,6 +40,39 @@
 #include "lp_bld_debug.h"
 #include "lp_bld_format.h"
 #include "lp_bld_arit.h"
+#include "lp_bld_pack.h"
+
+
+static void
+convert_to_soa(struct gallivm_state *gallivm,
+   LLVMValueRef src_aos[LP_MAX_VECTOR_WIDTH / 32],
+   LLVMValueRef dst_soa[4],
+   const struct lp_type soa_type)
+{
+   unsigned j, k;
+   struct lp_type aos_channel_type = soa_type;
+
+   LLVMValueRef aos_channels[4];
+   unsigned pixels_per_channel = soa_type.length / 4;
+
+   debug_assert((soa_type.length % 4) == 0);
+
+   aos_channel_type.length >>= 1;
+
+   for (j = 0; j < 4; ++j) {
+  LLVMValueRef channel[LP_MAX_VECTOR_LENGTH] = { 0 };
+
+  assert(pixels_per_channel <= LP_MAX_VECTOR_LENGTH);
+
+  for (k = 0; k < pixels_per_channel; ++k) {
+ channel[k] = src_aos[j + 4 * k];
+  }
+
+  aos_channels[j] = lp_build_concat(gallivm, channel, aos_channel_type, pixels_per_channel);
+   }
+
+   lp_build_transpose_aos(gallivm, soa_type, aos_channels, dst_soa);
+}


 void
@@ -48,9 +81,6 @@ lp_build_format_swizzle_soa(const struct util_format_description *format_desc,
 const LLVMValueRef *unswizzled,
 LLVMValueRef swizzled_out[4])
 {
-   assert(PIPE_SWIZZLE_0 == (int)PIPE_SWIZZLE_0);
-   assert(PIPE_SWIZZLE_1 == (int)PIPE_SWIZZLE_1);
-
if (format_desc->colorspace == UTIL_FORMAT_COLORSPACE_ZS) {
   enum pipe_swizzle swizzle;
   LLVMValueRef depth_or_stencil;
@@ -547,9 +577,11 @@ lp_build_fetch_rgba_soa(struct gallivm_state *gallivm,
{
   unsigned k, chan;
   struct lp_type tmp_type;
+  LLVMValueRef aos_fetch[LP_MAX_VECTOR_WIDTH / 32];
+  boolean vec_transpose = FALSE;

   if (gallivm_debug & GALLIVM_DEBUG_PERF) {
- debug_printf("%s: scalar unpacking of %s\n",
+ debug_printf("%s: AoS fetch fallback for %s\n",
   __FUNCTION__, format_desc->short_name);
   }

@@ -560,12 +592,31 @@ lp_build_fetch_rgba_soa(struct gallivm_state *gallivm,
  rgba_out[chan] = lp_build_undef(gallivm, type);
   }

+  if (format_desc->nr_channels > 2 ||
+  format_desc->layout != UTIL_FORMAT_LAYOUT_PLAIN) {
+ /*
+  * Note that vector transpose can be worse. This is because
+  * llvm will ensure the missing channels have the correct
+  * values, in particular typically 1.0 for the last channel
+  * (if they are used or not doesn't matter, usually llvm can't
+  * figure this out here probably due to the transpose).
+  * But with the extract/insert path, since those missing elements
+  * were just directly inserted/extracted llvm can optimize this
+  * somewhat (though it still doesn't look great - and not for
+  * the compressed formats due to their external fetch funcs).
+  * So restrict to cases where we are sure it helps (albeit
+  * with 2 channels it MIGHT be worth it at least with AVX).
+  * In any case, this is just a bandaid, it does NOT replace proper
+  * SoA format unpack.
+  */
+ vec_transpose = TRUE;
+  }
+


There's a burden in maintaining so many code paths -- it raises the 
difficulty bar next time we want to do an optimization --, so if this is 
just a little worse, or only affects the draw, I'd say it's better to 
always use vec_transpose.
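
For readers less familiar with the terms, here is a minimal scalar sketch of the AoS to SoA transpose that both code paths ultimately compute. The function name and the plain float arrays are assumptions made for illustration only; gallivm emits LLVM IR for this (via lp_build_transpose_aos() in the patch) rather than running a C loop.

```c
#include <stddef.h>

/* Hypothetical helper, not part of gallivm: take n RGBA pixels stored
 * pixel-by-pixel (AoS: R0 G0 B0 A0 R1 G1 B1 A1 ...) and write them out as
 * four per-channel arrays (SoA: R0 R1 ..., G0 G1 ..., and so on). */
static void
aos_to_soa_rgba(const float *aos, size_t n,
                float *r, float *g, float *b, float *a)
{
   for (size_t k = 0; k < n; ++k) {
      r[k] = aos[4 * k + 0];
      g[k] = aos[4 * k + 1];
      b[k] = aos[4 * k + 2];
      a[k] = aos[4 * k + 3];
   }
}
```

The extract/insert path corresponds to doing these element moves one at a time, while the transpose path expresses the same data movement as a handful of vector shuffles.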



   /* loop over number of pixels */
   for(k = 0; k < type.length; ++k) {
  LLVMValueRef index = lp_build_const_int32(gallivm, k);
  LLVMValueRef offset_elem;
  LLVMValueRef i_elem, j_elem;
- LLVMValueRef tmp;

  offset_elem = LLVMBuildExtractElement(builder, offset,
index, "");
@@ -574,20 +625,26 @@ lp_build_fetch_rgba_soa(struct gallivm_state *gallivm,
  j_elem = LLVMBuildExtractElement(builder, j, index, "");

  /* Get a single float[4]={R,G,B,A} pixel */
- tmp = lp_build_fetch_rgba_aos(gallivm, format_desc, tmp_type,
-   aligned, base_ptr, offset_elem,
-   i_elem, j_elem, cache);
+ aos_fetch[k] = lp_build_fetch_rgba_aos(gallivm, format_desc,

Re: [Mesa-dev] [PATCH 1/6] gallivm: (trivial) handle non-aligned fetch for lp_build_fetch_rgba_soa

2016-12-20 Thread Jose Fonseca


On 12/12/16 00:11, srol...@vmware.com wrote:

From: Roland Scheidegger 

soa fetch so far always assumed that data was aligned. However, we want to
use this for vertex fetch, and data might not be aligned there, so handle
it in this path too (basically just pass the alignment through to the other
functions). (It looks like it wouldn't work for cached s3tc, but this is
no different than with AoS fetch.)
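
As a rough illustration of what such an "aligned" flag means to a fetch routine, here is a small C sketch. The function name load_u32 is hypothetical and this is not how gallivm implements lp_build_gather(); it only shows that the flag is a promise made by the caller, and that a fetch which cannot rely on it has to use an access that is valid at any byte offset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: load 32 bits from base + offset.
 * 'aligned' is the caller's promise that the address is 4-byte aligned. */
static uint32_t
load_u32(const unsigned char *base, size_t offset, bool aligned)
{
   uint32_t v;

   if (aligned) {
      /* Check the promise in debug builds; a code generator could use it
       * to emit a plain aligned load. */
      assert(((uintptr_t)(base + offset) % sizeof v) == 0);
   }

   /* memcpy is well-defined for any alignment, so it is the safe choice
    * for vertex data, where offsets and strides are application supplied. */
   memcpy(&v, base + offset, sizeof v);
   return v;
}
```

With vertex buffers the caller would pass false, which is what the draw patch later in the series does when it calls lp_build_fetch_rgba_soa().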
---
 src/gallium/auxiliary/gallivm/lp_bld_format.h |  1 +
 src/gallium/auxiliary/gallivm/lp_bld_format_soa.c | 15 +--
 src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c |  4 ++--
 3 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_format.h b/src/gallium/auxiliary/gallivm/lp_bld_format.h
index 5c866f4..6540caa 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_format.h
+++ b/src/gallium/auxiliary/gallivm/lp_bld_format.h
@@ -143,6 +143,7 @@ void
 lp_build_fetch_rgba_soa(struct gallivm_state *gallivm,
 const struct util_format_description *format_desc,
 struct lp_type type,
+boolean aligned,
 LLVMValueRef base_ptr,
 LLVMValueRef offsets,
 LLVMValueRef i,
diff --git a/src/gallium/auxiliary/gallivm/lp_bld_format_soa.c b/src/gallium/auxiliary/gallivm/lp_bld_format_soa.c
index 7444c51..389bfa0 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_format_soa.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_format_soa.c
@@ -349,6 +349,7 @@ lp_build_rgba8_to_fi32_soa(struct gallivm_state *gallivm,
  *
  * \param type  the desired return type for 'rgba'.  The vector length
  *  is the number of texels to fetch
+ * \param aligned if the offset is guaranteed to be aligned to element width
  *
  * \param base_ptr  points to the base of the texture mip tree.
  * \param offsetoffset to start of the texture image block.  For non-
@@ -365,6 +366,7 @@ void
 lp_build_fetch_rgba_soa(struct gallivm_state *gallivm,
 const struct util_format_description *format_desc,
 struct lp_type type,
+boolean aligned,
 LLVMValueRef base_ptr,
 LLVMValueRef offset,
 LLVMValueRef i,
@@ -402,7 +404,7 @@ lp_build_fetch_rgba_soa(struct gallivm_state *gallivm,
type.length,
format_desc->block.bits,
type.width,
-   TRUE,
+   aligned,
base_ptr, offset, FALSE);

   /*
@@ -428,7 +430,7 @@ lp_build_fetch_rgba_soa(struct gallivm_state *gallivm,

   packed = lp_build_gather(gallivm, type.length,
format_desc->block.bits,
-   type.width, TRUE,
+   type.width, aligned,
base_ptr, offset, FALSE);
   if (format_desc->format == PIPE_FORMAT_R11G11B10_FLOAT) {
  lp_build_r11g11b10_to_float(gallivm, packed, rgba_out);
@@ -456,14 +458,14 @@ lp_build_fetch_rgba_soa(struct gallivm_state *gallivm,
  LLVMValueRef s_offset = lp_build_const_int_vec(gallivm, type, 4);
  offset = LLVMBuildAdd(builder, offset, s_offset, "");
  packed = lp_build_gather(gallivm, type.length, 32, type.width,
-  TRUE, base_ptr, offset, FALSE);
+  aligned, base_ptr, offset, FALSE);
  packed = LLVMBuildAnd(builder, packed,
lp_build_const_int_vec(gallivm, type, mask), "");
   }
   else {
  assert (format_desc->format == PIPE_FORMAT_Z32_FLOAT_S8X24_UINT);
  packed = lp_build_gather(gallivm, type.length, 32, type.width,
-  TRUE, base_ptr, offset, TRUE);
+  aligned, base_ptr, offset, TRUE);
  packed = LLVMBuildBitCast(builder, packed,
lp_build_vec_type(gallivm, type), "");
   }
@@ -489,7 +491,7 @@ lp_build_fetch_rgba_soa(struct gallivm_state *gallivm,
   tmp_type.norm = TRUE;

   tmp = lp_build_fetch_rgba_aos(gallivm, format_desc, tmp_type,
-TRUE, base_ptr, offset, i, j, cache);
+aligned, base_ptr, offset, i, j, cache);

   lp_build_rgba8_to_fi32_soa(gallivm,
 type,
@@ -509,6 +511,7 @@ lp_build_fetch_rgba_soa(struct gallivm_state *gallivm,
   const struct util_format_description *flinear_desc;
   LLVMValueRef packed;
   flinear_desc = util_format_description(util_format_linear(format_desc->format));
+  /* This probably only works with aligned data */
   packed = lp_build_fetch_cached_texels(gallivm,

Re: [Mesa-dev] [PATCH 6/6] draw: use SoA fetch, not AoS one

2016-12-20 Thread Jose Fonseca


On 12/12/16 00:12, srol...@vmware.com wrote:

From: Roland Scheidegger 

Now that there's some SoA fetch which never falls back, we should usually get
results which are better or at least not worse (something like rgba32f will
stay the same). I suppose though it might be worse in some cases where the
format doesn't require conversion (e.g. rg32f) and goes straight to output -
if llvm was able to see through all shuffles then it might have been able
to do away with the aos->soa->aos transpose entirely which can no longer work
possibly except for 4-channel formats (due to replacing the undef channels
with 0/1 before the second transpose and not the first - llvm will
definitely not be able to figure that out). That might actually be quite
common, but I'm not sure llvm really could optimize it in the first place,
and if it's a problem we should just special case such inputs (though note
that if conversion is needed, it isn't obvious if it's better to skip
the transpose or do the conversion AoS-style).

For cases which get way better, think something like R16_UNORM with 8-wide
vectors: this was 8 sign-extend fetches, 8 cvt, 8 muls, followed by
a couple of shuffles to stitch things together (if it is smart enough,
6 unpacks) and then a (8-wide) transpose (not sure if llvm could even
optimize the shuffles + transpose, since the 16bit values were actually
sign-extended to 128bit before being cast to a float vec, so that would be
another 8 unpacks). Now that is just 8 fetches (directly inserted into
vector, albeit there's one 128bit insert needed), 1 cvt, 1 mul.
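
To make the R16_UNORM case concrete, here is a scalar C model of the SoA strategy described above: fetch the raw 16-bit values into one vector first, then convert and scale the whole vector once. The function name, the fixed vector length and the use of plain arrays are assumptions for the sketch; the real code builds LLVM IR and is not shown here.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_VEC_LEN 8   /* hypothetical: one 8-wide vector of vertices */

/* Hypothetical model of an SoA fetch for a single-channel 16-bit unorm
 * format. 'offsets' holds one byte offset per vertex. */
static void
fetch_r16_unorm_soa(const unsigned char *base,
                    const uint32_t offsets[SKETCH_VEC_LEN],
                    float out[SKETCH_VEC_LEN])
{
   uint32_t raw[SKETCH_VEC_LEN];

   /* 8 scalar fetches, each placed directly into its vector lane.
    * memcpy keeps the loads valid for unaligned offsets. */
   for (size_t i = 0; i < SKETCH_VEC_LEN; ++i) {
      uint16_t v;
      memcpy(&v, base + offsets[i], sizeof v);
      raw[i] = v;
   }

   /* One int-to-float conversion and one multiply over the whole vector;
    * with vector instructions each loop body is a single operation. */
   const float scale = 1.0f / 65535.0f;
   for (size_t i = 0; i < SKETCH_VEC_LEN; ++i)
      out[i] = (float)raw[i] * scale;
}
```

The AoS path, by contrast, would run the convert and multiply once per vertex on a mostly empty 4-wide vector and then transpose the results.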
---
 src/gallium/auxiliary/draw/draw_llvm.c | 54 +-
 1 file changed, 40 insertions(+), 14 deletions(-)

diff --git a/src/gallium/auxiliary/draw/draw_llvm.c b/src/gallium/auxiliary/draw/draw_llvm.c
index 19b75a5..f895b76 100644
--- a/src/gallium/auxiliary/draw/draw_llvm.c
+++ b/src/gallium/auxiliary/draw/draw_llvm.c
@@ -755,11 +755,9 @@ fetch_vector(struct gallivm_state *gallivm,
  LLVMValueRef *inputs,
  LLVMValueRef indices)
 {
-   LLVMValueRef zero = LLVMConstNull(LLVMInt32TypeInContext(gallivm->context));
LLVMBuilderRef builder = gallivm->builder;
struct lp_build_context blduivec;
LLVMValueRef offset, valid_mask;
-   LLVMValueRef aos_fetch[LP_MAX_VECTOR_WIDTH / 32];
unsigned i;

lp_build_context_init(&blduivec, gallivm, lp_uint_type(vs_type));
@@ -783,21 +781,49 @@ fetch_vector(struct gallivm_state *gallivm,
}

/*
-* Note: we probably really want to use SoA fetch, not AoS one (albeit
-* for most formats it will amount to the same as this isn't very
-* optimized). But looks dangerous since it assumes alignment.
+* Use SoA fetch. This should produce better code usually.
+* Albeit it's possible there's exceptions (in particular if the fetched
+* value is going directly to output if it's something like RG32F).
 */
-   for (i = 0; i < vs_type.length; i++) {
-  LLVMValueRef offset1, elem;
-  elem = lp_build_const_int32(gallivm, i);
-  offset1 = LLVMBuildExtractElement(builder, offset, elem, "");
+   if (1) {
+  struct lp_type res_type = vs_type;
+  /* The type handling is annoying here... */
+  if (format_desc->colorspace == UTIL_FORMAT_COLORSPACE_RGB &&
+  format_desc->channel[0].pure_integer) {
+ if (format_desc->channel[0].type == UTIL_FORMAT_TYPE_SIGNED) {
+res_type = lp_type_int_vec(vs_type.width, vs_type.width * vs_type.length);
+ }
+ else if (format_desc->channel[0].type == UTIL_FORMAT_TYPE_UNSIGNED) {
+res_type = lp_type_uint_vec(vs_type.width, vs_type.width * vs_type.length);
+ }
+  }

-  aos_fetch[i] = lp_build_fetch_rgba_aos(gallivm, format_desc,
- lp_float32_vec4_type(),
- FALSE, map_ptr, offset1,
- zero, zero, NULL);
+  lp_build_fetch_rgba_soa(gallivm, format_desc,
+  res_type, FALSE, map_ptr, offset,
+  blduivec.zero, blduivec.zero,
+  NULL, inputs);
+
+  for (i = 0; i < TGSI_NUM_CHANNELS; i++) {
+ inputs[i] = LLVMBuildBitCast(builder, inputs[i],
+  lp_build_vec_type(gallivm, vs_type), "");
+  }
+
+   }



+   else {


Let's kill the old code path.  The multitude of live code paths is more 
than enough.  No point in keeping additional dead code paths around.



+  LLVMValueRef zero = LLVMConstNull(LLVMInt32TypeInContext(gallivm->context));
+  LLVMValueRef aos_fetch[LP_MAX_VECTOR_WIDTH / 32];
+  for (i = 0; i < vs_type.length; i++) {
+ LLVMValueRef offset1, elem;
+ elem = lp_build_const_int32(gallivm, i);
+ offset1 = LLVMBuildExtractElement(builder, offset, elem, "");
+
+ aos_fetch[i] =

Re: [Mesa-dev] [PATCH 4/6] gallivm: provide soa fetch path handling formats with more than 32bit

2016-12-20 Thread Jose Fonseca


On 12/12/16 00:12, srol...@vmware.com wrote:

From: Roland Scheidegger 

This previously always fell back to AoS conversion. Even for 4-float formats
(which is the optimal case by far for that fallback case) this was suboptimal,
since it meant the conversion couldn't be done with 256bit vectors. While this
may still only be partly possible for some formats, (unless there's AVX2
support) at least the transpose can be done with half the unpacks
(and before using the transpose for AoS fallbacks, it was worse still).
With less than 4 channels, things got way worse with the AoS fallback
quickly even with 128bit vectors.
The strategy is pretty much the same as the existing one for formats
which fit into 32 bits, except there's now multiple vectors to be
fetched (2 or 4 to be exact), which need to be shuffled first (if it's 4
vectors, this amounts to a transpose, for 2 it's a bit different),
then the unpack is done the same (with the exception that the shift
of the channels is now modulo 32, and we need to select the right
vector).
In fact the most complex part about it is to get the shuffles right
for separating into lo/hi parts for AVX/AVX2...



This also makes use of the new ability of gather to use provided type
information, which we abuse to outsmart llvm so we get decent shuffles,
and to fetch 3x32bit vectors without having to ZExt the scalar.
And just because we can, we handle double formats too, albeit they are
a bit different (draw sometimes needs to handle that).
---
 src/gallium/auxiliary/gallivm/lp_bld_format_soa.c | 529 +++---
 1 file changed, 375 insertions(+), 154 deletions(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_format_soa.c b/src/gallium/auxiliary/gallivm/lp_bld_format_soa.c
index b3ea709..9550f26 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_format_soa.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_format_soa.c
@@ -31,6 +31,7 @@
 #include "util/u_format.h"
 #include "util/u_memory.h"
 #include "util/u_string.h"
+#include "util/u_math.h"

 #include "lp_bld_type.h"
 #include "lp_bld_const.h"
@@ -113,6 +114,166 @@ lp_build_format_swizzle_soa(const struct util_format_description *format_desc,
 }


+
+static LLVMValueRef
+lp_build_extract_soa_chan(struct lp_build_context *bld,
+  unsigned blockbits,
+  boolean srgb_chan,
+  struct util_format_channel_description chan_desc,
+  LLVMValueRef packed)
+{
+   struct gallivm_state *gallivm = bld->gallivm;
+   LLVMBuilderRef builder = gallivm->builder;
+   struct lp_type type = bld->type;
+   LLVMValueRef input = packed;
+   const unsigned width = chan_desc.size;
+   const unsigned start = chan_desc.shift;
+   const unsigned stop = start + width;
+
+   /* Decode the input vector component */
+
+   switch(chan_desc.type) {
+   case UTIL_FORMAT_TYPE_VOID:
+  input = bld->undef;
+  break;
+
+   case UTIL_FORMAT_TYPE_UNSIGNED:
+  /*
+   * Align the LSB
+   */
+  if (start) {
+ input = LLVMBuildLShr(builder, input,
+   lp_build_const_int_vec(gallivm, type, start), "");
+  }
+
+  /*
+   * Zero the MSBs
+   */
+  if (stop < blockbits) {
+ unsigned mask = ((unsigned long long)1 << width) - 1;
+ input = LLVMBuildAnd(builder, input,
+  lp_build_const_int_vec(gallivm, type, mask), "");
+  }
+
+  /*
+   * Type conversion
+   */
+  if (type.floating) {
+ if (srgb_chan) {
+struct lp_type conv_type = lp_uint_type(type);
+input = lp_build_srgb_to_linear(gallivm, conv_type, width, input);
+ }
+ else {
+if(chan_desc.normalized)
+   input = lp_build_unsigned_norm_to_float(gallivm, width, type, input);
+else
+   input = LLVMBuildSIToFP(builder, input, bld->vec_type, "");
+ }
+  }
+  else if (chan_desc.pure_integer) {
+ /* Nothing to do */
+  } else {
+  /* FIXME */
+  assert(0);
+  }
+  break;
+
+   case UTIL_FORMAT_TYPE_SIGNED:
+  /*
+   * Align the sign bit first.
+   */
+  if (stop < type.width) {
+ unsigned bits = type.width - stop;
+ LLVMValueRef bits_val = lp_build_const_int_vec(gallivm, type, bits);
+ input = LLVMBuildShl(builder, input, bits_val, "");
+  }
+
+  /*
+   * Align the LSB (with an arithmetic shift to preserve the sign)
+   */
+  if (chan_desc.size < type.width) {
+ unsigned bits = type.width - chan_desc.size;
+ LLVMValueRef bits_val = lp_build_const_int_vec(gallivm, type, bits);
+ input = LLVMBuildAShr(builder, input, bits_val, "");
+  }
+
+  /*
+   * Type conversion
+   */
+  if (type.floating) {
+ input = LLVMBuildSIToFP(builder, input, bld->vec_type, "");
+ if (chan_desc.normalized) {
+

Re: [Mesa-dev] [PATCH 5/6] gallivm: generalize the compressed format soa fetch a bit

2016-12-20 Thread Jose Fonseca


On 12/12/16 00:12, srol...@vmware.com wrote:

From: Roland Scheidegger 

This can now handle rgtc (unorm) too - this path no longer handles plain
formats, but that's unnecessary as they now all have their proper SoA unpack
(this will still be dog-slow though due to the actual fetch being per-pixel
util fallbacks).
---
 src/gallium/auxiliary/gallivm/lp_bld_format_soa.c | 86 +--
 1 file changed, 49 insertions(+), 37 deletions(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_format_soa.c b/src/gallium/auxiliary/gallivm/lp_bld_format_soa.c
index 9550f26..68cbb10 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_format_soa.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_format_soa.c
@@ -733,64 +733,69 @@ lp_build_fetch_rgba_soa(struct gallivm_state *gallivm,

/*
 * Try calling lp_build_fetch_rgba_aos for all pixels.
+* Should only really hit subsampled, compressed
+* (for s3tc srgb too, for rgtc the unorm ones only) by now.
+* (This is invalid for plain 8unorm formats because we're lazy with
+* the swizzle since some results would arrive swizzled, some not.)
 */

-   if (util_format_fits_8unorm(format_desc) &&
+   if ((format_desc->layout != UTIL_FORMAT_LAYOUT_PLAIN) &&
+   (util_format_fits_8unorm(format_desc) ||
+format_desc->layout == UTIL_FORMAT_LAYOUT_S3TC) &&
type.floating && type.width == 32 &&
(type.length == 1 || (type.length % 4 == 0))) {
   struct lp_type tmp_type;
-  LLVMValueRef tmp;
+  struct lp_build_context bld;
+  LLVMValueRef packed, rgba[4];
+  const struct util_format_description *flinear_desc;
+  const struct util_format_description *frgba8_desc;
+  unsigned chan;

+  lp_build_context_init(&bld, gallivm, type);
+
+  /*
+   * Make sure the conversion in aos really only does convert to rgba8
+   * and not anything more (so use linear format, adjust type).
+   */
+  flinear_desc = util_format_description(util_format_linear(format));
   memset(&tmp_type, 0, sizeof tmp_type);
   tmp_type.width = 8;
   tmp_type.length = type.length * 4;
   tmp_type.norm = TRUE;

-  tmp = lp_build_fetch_rgba_aos(gallivm, format_desc, tmp_type,
-aligned, base_ptr, offset, i, j, cache);
+  packed = lp_build_fetch_rgba_aos(gallivm, flinear_desc, tmp_type,
+   aligned, base_ptr, offset, i, j, cache);
+  packed = LLVMBuildBitCast(builder, packed, bld.int_vec_type, "");

-  lp_build_rgba8_to_fi32_soa(gallivm,
-type,
-tmp,
-rgba_out);
-
-  return;
-   }
-
-   if (format_desc->layout == UTIL_FORMAT_LAYOUT_S3TC &&
-   /* non-srgb case is already handled above */
-   format_desc->colorspace == UTIL_FORMAT_COLORSPACE_SRGB &&
-   type.floating && type.width == 32 &&
-   (type.length == 1 || (type.length % 4 == 0)) &&
-   cache) {
-  const struct util_format_description *format_decompressed;
-  const struct util_format_description *flinear_desc;
-  LLVMValueRef packed;
-  flinear_desc = util_format_description(util_format_linear(format_desc->format));
-  /* This probably only works with aligned data */
-  packed = lp_build_fetch_cached_texels(gallivm,
-flinear_desc,
-type.length,
-base_ptr,
-offset,
-i, j,
-cache);
-  packed = LLVMBuildBitCast(builder, packed,
-lp_build_int_vec_type(gallivm, type), "");
   /*
-   * The values are now packed so they match ordinary srgb RGBA8 format,
+   * The values are now packed so they match ordinary (srgb) RGBA8 format,
* hence need to use matching format for unpack.
*/
-  format_decompressed = util_format_description(PIPE_FORMAT_R8G8B8A8_SRGB);
-
+  frgba8_desc = util_format_description(PIPE_FORMAT_R8G8B8A8_UNORM);
+  if (format_desc->colorspace == UTIL_FORMAT_COLORSPACE_SRGB) {
+ assert(format_desc->layout == UTIL_FORMAT_LAYOUT_S3TC);
+ frgba8_desc = util_format_description(PIPE_FORMAT_R8G8B8A8_SRGB);
+  }
   lp_build_unpack_rgba_soa(gallivm,
-   format_decompressed,
+   frgba8_desc,
type,
-   packed, rgba_out);
+   packed, rgba);

+  /*
+   * We converted 4 channels. Make sure llvm can drop unneeded ones
+   * (luckily the rgba order is fixed, only la needs special case).


"la" is confusing.  It's better to use upper-case, like LA, RGTC,


+   */
+  for (chan = 0; chan < 4; chan++) {
+

Re: [Mesa-dev] [PATCH 1/2] radeonsi: add Polaris12 support (v3)

2016-12-20 Thread Andreas Boll

2016-12-19 23:45 GMT+01:00 Alex Deucher :
> From: Junwei Zhang 
>
> v2: use gfxip names for llvm 4.0+
> v3: use tonga for llvm <= 3.8
>
> Signed-off-by: Junwei Zhang 
> Reviewed-by: Nicolai Hähnle 
> Acked-by: Christian König 
> ---

snip

> diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c b/src/gallium/drivers/radeon/r600_pipe_common.c
> index 0b5c6dc..e0b914c 100644
> --- a/src/gallium/drivers/radeon/r600_pipe_common.c
> +++ b/src/gallium/drivers/radeon/r600_pipe_common.c
> @@ -755,6 +755,7 @@ static const char* r600_get_chip_name(struct r600_common_screen *rscreen)
> case CHIP_FIJI: return "AMD FIJI";
> case CHIP_POLARIS10: return "AMD POLARIS10";
> case CHIP_POLARIS11: return "AMD POLARIS11";
> +   case CHIP_POLARIS12: return "AMD POLARIS12";
> case CHIP_STONEY: return "AMD STONEY";
> default: return "AMD unknown";
> }
> @@ -889,9 +890,11 @@ const char *r600_get_llvm_processor_name(enum radeon_family family)
>  #if HAVE_LLVM <= 0x0308
> case CHIP_POLARIS10: return "tonga";
> case CHIP_POLARIS11: return "tonga";
> +   case CHIP_POLARIS12: return "tonga";
>  #else
> case CHIP_POLARIS10: return "polaris10";
> case CHIP_POLARIS11: return "polaris11";
> +   case CHIP_POLARIS12: return "polaris11";
>  #endif

You've dropped the processor name for LLVM 4.0+.
I guess that wasn't intended.
Something like this should work:

#if HAVE_LLVM <= 0x0308
// return processor names for LLVM <= 3.8
#elif HAVE_LLVM == 0x0309
// return processor names for LLVM 3.9
#else
// return processor names for LLVM > 3.9
#endif
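
Spelled out as a compilable sketch, the three-way split could look roughly like the following. The enum, the function name and the default HAVE_LLVM value are stand-ins so the fragment builds on its own. It assumes the names in the quoted #else branch are the ones meant for LLVM 3.9, and the string for the newest branch is a deliberate placeholder, since the gfxip names are not given in this thread.

```c
/* Assumed for the sketch only; in Mesa the build system defines this. */
#ifndef HAVE_LLVM
#define HAVE_LLVM 0x0309
#endif

/* Hypothetical stand-in for the relevant radeon_family values. */
enum sketch_chip {
   SKETCH_CHIP_POLARIS10,
   SKETCH_CHIP_POLARIS11,
   SKETCH_CHIP_POLARIS12,
   SKETCH_CHIP_OTHER
};

static const char *
sketch_llvm_processor_name(enum sketch_chip family)
{
   switch (family) {
#if HAVE_LLVM <= 0x0308
   /* LLVM <= 3.8: no Polaris targets, fall back to tonga */
   case SKETCH_CHIP_POLARIS10: return "tonga";
   case SKETCH_CHIP_POLARIS11: return "tonga";
   case SKETCH_CHIP_POLARIS12: return "tonga";
#elif HAVE_LLVM == 0x0309
   /* LLVM 3.9: names taken from the quoted patch */
   case SKETCH_CHIP_POLARIS10: return "polaris10";
   case SKETCH_CHIP_POLARIS11: return "polaris11";
   case SKETCH_CHIP_POLARIS12: return "polaris11";
#else
   /* LLVM > 3.9: placeholder, the real gfxip names go here */
   case SKETCH_CHIP_POLARIS10:
   case SKETCH_CHIP_POLARIS11:
   case SKETCH_CHIP_POLARIS12: return "gfxip-name-placeholder";
#endif
   default: return "";
   }
}
```

Keeping each LLVM version range in its own branch means a later change for one range cannot silently alter what another range returns.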

Andreas

> default: return "";
> }

Re: [Mesa-dev] [RFC PATCH] clover: Return correct CL_EVENT_REFERENCE_COUNT

2016-12-20 Thread Jan Vesely

On Fri, 2016-12-16 at 13:43 -0800, Francisco Jerez wrote:
> Vedran Miletić  writes:
> 
> > Current implementation of event handling keeps an extra reference to
> > the hardware event, in addition to the reference returned via the OpenCL
> > API. This additional reference is internal and should not be counted
> > when queried via the clGetEventInfo() function.
> > 
> > Fixes Piglit's cl/api/retain_release-event test.
> > 
> > Signed-off-by: Vedran Miletić 
> > ---
> >  src/gallium/state_trackers/clover/api/event.cpp | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> > 
> > diff --git a/src/gallium/state_trackers/clover/api/event.cpp b/src/gallium/state_trackers/clover/api/event.cpp
> > index 5d1a0e5..74bc4d9 100644
> > --- a/src/gallium/state_trackers/clover/api/event.cpp
> > +++ b/src/gallium/state_trackers/clover/api/event.cpp
> > @@ -107,7 +107,9 @@ clGetEventInfo(cl_event d_ev, cl_event_info param,
> >break;
> >  
> > case CL_EVENT_REFERENCE_COUNT:
> > -  buf.as_scalar() = ev.ref_count();
> > +  // Current implementation of event handling keeps an extra reference to
> > +  // the hardware event, which is internal and should not be counted.
> > +  buf.as_scalar() = ev.ref_count() - 1;
> 
> I don't think this is correct.  There is an internal event reference
> held by the command queue object, but only for as long as the event
> remains in the queue until the next flush.  In other cases the above
> would give you a reference count which is off by one.  That said:
> 
> > The reference count returned should be considered immediately
> > stale. It is unsuitable for general use in applications. This feature
> > is provided for identifying memory leaks.


I found only a generic description that mentions reference count == 1
wrt. events (in the Glossary). Even there it says that it's an internal
counter.
The only object that seems to require reference count == 1 is the root
device. Contexts, queues, mem, samplers, programs, kernels and events all
include the above footnote.
I think the piglit test should be changed to check for a non-zero value
instead of 1.
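
A toy model of the situation, in case it helps to see why "ref_count() - 1" is only right some of the time. Everything here is hypothetical (struct, function names, the single-event queue); it is not clover code, it just encodes Francisco's description that the queue holds one internal reference from enqueue until the next flush.

```c
#include <stdbool.h>

/* Hypothetical stand-in for an event object. */
struct toy_event {
   unsigned ref_count;
   bool in_queue;
};

/* Enqueue returns an event to the application (one reference) and the
 * queue keeps an internal reference until it is flushed. */
static void
toy_enqueue(struct toy_event *ev)
{
   ev->ref_count = 2;
   ev->in_queue = true;
}

/* Flushing drops the queue's internal reference. */
static void
toy_flush(struct toy_event *ev)
{
   if (ev->in_queue) {
      ev->in_queue = false;
      ev->ref_count--;
   }
}

/* What the proposed patch would report. */
static unsigned
toy_reported_minus_one(const struct toy_event *ev)
{
   return ev->ref_count - 1;
}

/* The relaxed test condition suggested above. */
static bool
toy_test_nonzero(const struct toy_event *ev)
{
   return ev->ref_count != 0;
}
```

In this model the subtraction gives 1 while the event is still queued but 0 after a flush, even though the application still holds its reference, whereas the non-zero check passes in both states.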

Jan

> >break;
> >  
> > default:
> > -- 
> > 2.7.4
> 

-- 
Jan Vesely 


Re: [Mesa-dev] [PATCH] vulkan/wsi/x11: don't crash on null wsi x11 connection

2016-12-20 Thread Eero Tamminen


Hi,

I think it fixed one of the segfaults with closed vkreplay window, but 
it still segfaults. For details, see backtrace here:

https://github.com/LunarG/VulkanTools/issues/124

- Eero

On 20.12.2016 05:59, Arda Coskunses wrote:

Without this check the driver crashes when the application window
is closed unexpectedly.
---
 src/vulkan/wsi/wsi_common_x11.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/src/vulkan/wsi/wsi_common_x11.c b/src/vulkan/wsi/wsi_common_x11.c
index 25ba0c1..afb7809 100644
--- a/src/vulkan/wsi/wsi_common_x11.c
+++ b/src/vulkan/wsi/wsi_common_x11.c
@@ -261,6 +261,11 @@ VkBool32 wsi_get_physical_device_xcb_presentation_support(
struct wsi_x11_connection *wsi_conn =
   wsi_x11_get_connection(wsi_device, alloc, connection);

+   if (!wsi_conn) {
+  fprintf(stderr, "vulkan: wsi connection lost\n");
+  return false;
+   }
+
if (!wsi_conn->has_dri3) {
   fprintf(stderr, "vulkan: No DRI3 support\n");
   return false;




[Mesa-dev] [PATCH 52/70] st/mesa/glsl: change xfb_program field to last_vert_prog

2016-12-20 Thread Timothy Arceri

Now that the i965 backend doesn't depend on this field we can
make it more generic and short circuit a bunch of code paths.

The new field will be used in a following patch for another
clean-up.
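
As a rough illustration of how the new field is found, here is a
self-contained sketch of the backwards search over linked stages. The toy_*
types and names are hypothetical stand-ins, not Mesa's real structs, and the
compute stage is left out for brevity:

```c
#include <stddef.h>

/* Toy stand-ins for Mesa's stage enum and gl_program (hypothetical names). */
enum toy_stage {
   TOY_VERTEX,
   TOY_TESS_CTRL,
   TOY_TESS_EVAL,
   TOY_GEOMETRY,
   TOY_FRAGMENT,
   TOY_STAGES
};

struct toy_program {
   int id;
};

/* Return the program of the last linked stage that precedes the fragment
 * stage, or NULL when no such stage is linked (for example a fragment-only
 * program). 'last' is the index of the last linked stage. */
struct toy_program *
toy_find_last_vert_prog(struct toy_program *linked[TOY_STAGES], int last)
{
   /* Skip the fragment stage itself, then walk backwards. */
   int next = last == TOY_FRAGMENT ? last - 1 : last;

   for (int i = next; i >= 0; i--) {
      if (linked[i] != NULL)
         return linked[i];
   }
   return NULL;
}
```

A NULL result is what lets callers return early instead of touching
transform feedback state.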
---
 src/compiler/glsl/link_varyings.cpp|  5 +++-
 src/compiler/glsl/linker.cpp   | 47 +++---
 src/mesa/main/mtypes.h |  2 +-
 src/mesa/main/shader_query.cpp |  2 +-
 src/mesa/main/transformfeedback.c  |  5 +++-
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp |  5 +++-
 src/mesa/state_tracker/st_program.c| 10 +--
 7 files changed, 44 insertions(+), 32 deletions(-)

diff --git a/src/compiler/glsl/link_varyings.cpp 
b/src/compiler/glsl/link_varyings.cpp
index e1a29b0..147a7c3 100644
--- a/src/compiler/glsl/link_varyings.cpp
+++ b/src/compiler/glsl/link_varyings.cpp
@@ -1074,6 +1074,9 @@ store_tfeedback_info(struct gl_context *ctx, struct 
gl_shader_program *prog,
  unsigned num_tfeedback_decls,
  tfeedback_decl *tfeedback_decls, bool has_xfb_qualifiers)
 {
+   if (!prog->last_vert_prog)
+  return true;
+
/* Make sure MaxTransformFeedbackBuffers is less than 32 so the bitmask for
 * tracking the number of buffers doesn't overflow.
 */
@@ -1082,7 +1085,7 @@ store_tfeedback_info(struct gl_context *ctx, struct 
gl_shader_program *prog,
bool separate_attribs_mode =
   prog->TransformFeedback.BufferMode == GL_SEPARATE_ATTRIBS;
 
-   struct gl_program *xfb_prog = prog->xfb_program;
+   struct gl_program *xfb_prog = prog->last_vert_prog;
xfb_prog->sh.LinkedTransformFeedback =
   rzalloc(xfb_prog, struct gl_transform_feedback_info);
 
diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
index 571b53d..22b18fb 100644
--- a/src/compiler/glsl/linker.cpp
+++ b/src/compiler/glsl/linker.cpp
@@ -4249,27 +4249,29 @@ build_program_resource_list(struct gl_context *ctx,
 output_stage, GL_PROGRAM_OUTPUT))
   return;
 
-   struct gl_transform_feedback_info *linked_xfb =
-  shProg->xfb_program->sh.LinkedTransformFeedback;
-
-   /* Add transform feedback varyings. */
-   if (linked_xfb->NumVarying > 0) {
-  for (int i = 0; i < linked_xfb->NumVarying; i++) {
- if (!add_program_resource(shProg, resource_set,
-   GL_TRANSFORM_FEEDBACK_VARYING,
-   &linked_xfb->Varyings[i], 0))
- return;
+   if (shProg->last_vert_prog) {
+  struct gl_transform_feedback_info *linked_xfb =
+ shProg->last_vert_prog->sh.LinkedTransformFeedback;
+
+  /* Add transform feedback varyings. */
+  if (linked_xfb->NumVarying > 0) {
+ for (int i = 0; i < linked_xfb->NumVarying; i++) {
+if (!add_program_resource(shProg, resource_set,
+  GL_TRANSFORM_FEEDBACK_VARYING,
+  &linked_xfb->Varyings[i], 0))
+return;
+ }
   }
-   }
 
-   /* Add transform feedback buffers. */
-   for (unsigned i = 0; i < ctx->Const.MaxTransformFeedbackBuffers; i++) {
-  if ((linked_xfb->ActiveBuffers >> i) & 1) {
- linked_xfb->Buffers[i].Binding = i;
- if (!add_program_resource(shProg, resource_set,
-   GL_TRANSFORM_FEEDBACK_BUFFER,
-   &linked_xfb->Buffers[i], 0))
- return;
+  /* Add transform feedback buffers. */
+  for (unsigned i = 0; i < ctx->Const.MaxTransformFeedbackBuffers; i++) {
+ if ((linked_xfb->ActiveBuffers >> i) & 1) {
+linked_xfb->Buffers[i].Binding = i;
+if (!add_program_resource(shProg, resource_set,
+  GL_TRANSFORM_FEEDBACK_BUFFER,
+  &linked_xfb->Buffers[i], 0))
+return;
+ }
   }
}
 
@@ -4595,16 +4597,13 @@ link_varyings_and_uniforms(unsigned first, unsigned 
last,
   varying_names = prog->TransformFeedback.VaryingNames;
}
 
-   /* Find the program used for xfb. Even if we don't use xfb we still want to
-* set this so we can fill the default values for program interface query.
-*/
-   prog->xfb_program = prog->_LinkedShaders[last]->Program;
+   prog->last_vert_prog = NULL;
int next = last == MESA_SHADER_FRAGMENT ? last - 1 : last;
for (int i = next; i >= 0; i--) {
   if (prog->_LinkedShaders[i] == NULL)
  continue;
 
-  prog->xfb_program = prog->_LinkedShaders[i]->Program;
+  prog->last_vert_prog = prog->_LinkedShaders[i]->Program;
   break;
}
 
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index d503a53..9cb9001 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -2740,7 +2740,7 @@ struct gl_shader_program
   GLchar **VaryingNames;  /**< Array [NumVarying] of char * */
} TransformFeedback;
 
-   struct gl_program *xfb_program;
+   struct gl_program

[Mesa-dev] [PATCH 67/70] mesa/glsl/i965: set and get tes layouts directly to and from shader_info

2016-12-20 Thread Timothy Arceri

---
 src/compiler/glsl/linker.cpp| 60 ++---
 src/mesa/drivers/dri/i965/brw_tcs.c |  6 ++--
 src/mesa/main/shaderapi.c   | 15 +++---
 3 files changed, 37 insertions(+), 44 deletions(-)

diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
index 55a71d3..7dc0c64 100644
--- a/src/compiler/glsl/linker.cpp
+++ b/src/compiler/glsl/linker.cpp
@@ -1712,18 +1712,20 @@ link_tcs_out_layout_qualifiers(struct gl_shader_program 
*prog,
  */
 static void
 link_tes_in_layout_qualifiers(struct gl_shader_program *prog,
-  struct gl_linked_shader *linked_shader,
+  struct gl_program *gl_prog,
   struct gl_shader **shader_list,
   unsigned num_shaders)
 {
-   linked_shader->info.TessEval.PrimitiveMode = PRIM_UNKNOWN;
-   linked_shader->info.TessEval.Spacing = 0;
-   linked_shader->info.TessEval.VertexOrder = 0;
-   linked_shader->info.TessEval.PointMode = -1;
-
-   if (linked_shader->Stage != MESA_SHADER_TESS_EVAL)
+   if (gl_prog->info.stage != MESA_SHADER_TESS_EVAL)
   return;
 
+   int point_mode = -1;
+
+   gl_prog->info.tes.primitive_mode = PRIM_UNKNOWN;
+   gl_prog->info.tes.spacing = 0;
+   gl_prog->info.tes.vertex_order = 0;
+   gl_prog->info.tes.point_mode = false;
+
/* From the GLSL 4.0 spec (chapter 4.3.8.1):
 *
 * "At least one tessellation evaluation shader (compilation unit) in
@@ -1742,49 +1744,45 @@ link_tes_in_layout_qualifiers(struct gl_shader_program 
*prog,
   struct gl_shader *shader = shader_list[i];
 
   if (shader->info.TessEval.PrimitiveMode != PRIM_UNKNOWN) {
- if (linked_shader->info.TessEval.PrimitiveMode != PRIM_UNKNOWN &&
- linked_shader->info.TessEval.PrimitiveMode !=
+ if (gl_prog->info.tes.primitive_mode != PRIM_UNKNOWN &&
+ gl_prog->info.tes.primitive_mode !=
  shader->info.TessEval.PrimitiveMode) {
 linker_error(prog, "tessellation evaluation shader defined with "
  "conflicting input primitive modes.\n");
 return;
  }
- linked_shader->info.TessEval.PrimitiveMode = 
shader->info.TessEval.PrimitiveMode;
+ gl_prog->info.tes.primitive_mode = 
shader->info.TessEval.PrimitiveMode;
   }
 
   if (shader->info.TessEval.Spacing != 0) {
- if (linked_shader->info.TessEval.Spacing != 0 &&
- linked_shader->info.TessEval.Spacing !=
+ if (gl_prog->info.tes.spacing != 0 && gl_prog->info.tes.spacing !=
  shader->info.TessEval.Spacing) {
 linker_error(prog, "tessellation evaluation shader defined with "
  "conflicting vertex spacing.\n");
 return;
  }
- linked_shader->info.TessEval.Spacing = shader->info.TessEval.Spacing;
+ gl_prog->info.tes.spacing = shader->info.TessEval.Spacing;
   }
 
   if (shader->info.TessEval.VertexOrder != 0) {
- if (linked_shader->info.TessEval.VertexOrder != 0 &&
- linked_shader->info.TessEval.VertexOrder !=
+ if (gl_prog->info.tes.vertex_order != 0 &&
+ gl_prog->info.tes.vertex_order !=
  shader->info.TessEval.VertexOrder) {
 linker_error(prog, "tessellation evaluation shader defined with "
  "conflicting ordering.\n");
 return;
  }
- linked_shader->info.TessEval.VertexOrder =
-shader->info.TessEval.VertexOrder;
+ gl_prog->info.tes.vertex_order = shader->info.TessEval.VertexOrder;
   }
 
   if (shader->info.TessEval.PointMode != -1) {
- if (linked_shader->info.TessEval.PointMode != -1 &&
- linked_shader->info.TessEval.PointMode !=
- shader->info.TessEval.PointMode) {
+ if (point_mode != -1 &&
+ point_mode != shader->info.TessEval.PointMode) {
 linker_error(prog, "tessellation evaluation shader defined with "
  "conflicting point modes.\n");
 return;
  }
- linked_shader->info.TessEval.PointMode =
-shader->info.TessEval.PointMode;
+ point_mode = shader->info.TessEval.PointMode;
   }
 
}
@@ -1793,21 +1791,23 @@ link_tes_in_layout_qualifiers(struct gl_shader_program 
*prog,
 * since we already know we're in the right type of shader program
 * for doing it.
 */
-   if (linked_shader->info.TessEval.PrimitiveMode == PRIM_UNKNOWN) {
+   if (gl_prog->info.tes.primitive_mode == PRIM_UNKNOWN) {
   linker_error(prog,
"tessellation evaluation shader didn't declare input "
"primitive modes.\n");
   return;
}
 
-   if (linked_shader->info.TessEval.Spacing == 0)
-  linked_shader->info.TessEval.Spacing = GL_EQUAL;
+   if (gl_prog->info.tes.spacing == 0)
+  gl_prog->info.tes.spacing =

[Mesa-dev] [PATCH 57/70] st/mesa/glsl: set early_fragment_tests directly in shader_info

2016-12-20 Thread Timothy Arceri

We also move EarlyFragmentTests out of the gl_shader_info struct
as it is now only used by gl_shader.
---
 src/compiler/glsl/glsl_parser_extras.cpp   |  2 +-
 src/compiler/glsl/linker.cpp   |  4 ++--
 src/mesa/main/mtypes.h | 12 ++--
 src/mesa/main/shaderapi.c  |  1 -
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp |  2 +-
 5 files changed, 10 insertions(+), 11 deletions(-)

diff --git a/src/compiler/glsl/glsl_parser_extras.cpp 
b/src/compiler/glsl/glsl_parser_extras.cpp
index 4566aa9..9138133 100644
--- a/src/compiler/glsl/glsl_parser_extras.cpp
+++ b/src/compiler/glsl/glsl_parser_extras.cpp
@@ -1815,7 +1815,7 @@ set_shader_inout_layout(struct gl_shader *shader,
   shader->info.origin_upper_left = state->fs_origin_upper_left;
   shader->info.ARB_fragment_coord_conventions_enable =
  state->ARB_fragment_coord_conventions_enable;
-  shader->info.EarlyFragmentTests = state->fs_early_fragment_tests;
+  shader->EarlyFragmentTests = state->fs_early_fragment_tests;
   shader->info.InnerCoverage = state->fs_inner_coverage;
   shader->info.PostDepthCoverage = state->fs_post_depth_coverage;
   shader->BlendSupport = state->fs_blend_support;
diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
index 9fe7278..337fa17 100644
--- a/src/compiler/glsl/linker.cpp
+++ b/src/compiler/glsl/linker.cpp
@@ -1886,8 +1886,8 @@ link_fs_inout_layout_qualifiers(struct gl_shader_program 
*prog,
 shader->info.pixel_center_integer;
   }
 
-  linked_shader->info.EarlyFragmentTests |=
- shader->info.EarlyFragmentTests;
+  linked_shader->Program->info.fs.early_fragment_tests |=
+ shader->EarlyFragmentTests;
   linked_shader->info.InnerCoverage |=
  shader->info.InnerCoverage;
   linked_shader->Program->info.fs.post_depth_coverage |=
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 407e11a..a05ea60 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -2310,12 +2310,6 @@ struct gl_shader_info
} Geom;
 
/**
-* Whether early fragment tests are enabled as defined by
-* ARB_shader_image_load_store.
-*/
-   bool EarlyFragmentTests;
-
-   /**
 * Compute shader state from ARB_compute_shader and
 * ARB_compute_variable_group_size layout qualifiers.
 */
@@ -2428,6 +2422,12 @@ struct gl_shader
 */
GLbitfield BlendSupport;
 
+   /**
+* Whether early fragment tests are enabled as defined by
+* ARB_shader_image_load_store.
+*/
+   bool EarlyFragmentTests;
+
struct gl_shader_info info;
 };
 
diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
index d97c594..b03503e 100644
--- a/src/mesa/main/shaderapi.c
+++ b/src/mesa/main/shaderapi.c
@@ -2207,7 +2207,6 @@ _mesa_copy_linked_program_data(const struct 
gl_shader_program *src,
}
case MESA_SHADER_FRAGMENT: {
   dst->info.fs.depth_layout = src->FragDepthLayout;
-  dst->info.fs.early_fragment_tests = dst_sh->info.EarlyFragmentTests;
   dst->info.fs.inner_coverage = dst_sh->info.InnerCoverage;
   dst->info.fs.post_depth_coverage = dst_sh->info.PostDepthCoverage;
   break;
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 193e450..0ba83f4 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -6118,7 +6118,7 @@ st_translate_program(
}
 
if (procType == PIPE_SHADER_FRAGMENT) {
-  if (program->shader->info.EarlyFragmentTests)
+  if (program->shader->Program->info.fs.early_fragment_tests)
  ureg_property(ureg, TGSI_PROPERTY_FS_EARLY_DEPTH_STENCIL, 1);
 
   if (proginfo->info.inputs_read & VARYING_BIT_POS) {
-- 
2.9.3


[Mesa-dev] [PATCH 47/70] mesa/glsl: move ProgramResourceList to gl_shader_program_data

2016-12-20 Thread Timothy Arceri

We also move NumProgramResourceList at the same time.

GLES does interface validation on SSO at runtime so we need to move
this to be able to switch to storing gl_program pointers in
CurrentProgram.
---
 src/compiler/glsl/linker.cpp | 20 +--
 src/mesa/main/mtypes.h   |  8 
 src/mesa/main/program_resource.c | 40 ++---
 src/mesa/main/shader_query.cpp   | 43 +++-
 src/mesa/main/shaderobj.c|  8 
 5 files changed, 63 insertions(+), 56 deletions(-)

diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
index 2c90141..571b53d 100644
--- a/src/compiler/glsl/linker.cpp
+++ b/src/compiler/glsl/linker.cpp
@@ -3536,25 +3536,25 @@ add_program_resource(struct gl_shader_program *prog,
if (_mesa_set_search(resource_set, data))
   return true;
 
-   prog->ProgramResourceList =
+   prog->data->ProgramResourceList =
   reralloc(prog,
-   prog->ProgramResourceList,
+   prog->data->ProgramResourceList,
gl_program_resource,
-   prog->NumProgramResourceList + 1);
+   prog->data->NumProgramResourceList + 1);
 
-   if (!prog->ProgramResourceList) {
+   if (!prog->data->ProgramResourceList) {
   linker_error(prog, "Out of memory during linking.\n");
   return false;
}
 
struct gl_program_resource *res =
-  &prog->ProgramResourceList[prog->NumProgramResourceList];
+  &prog->data->ProgramResourceList[prog->data->NumProgramResourceList];
 
res->Type = type;
res->Data = data;
res->StageReferences = stages;
 
-   prog->NumProgramResourceList++;
+   prog->data->NumProgramResourceList++;
 
_mesa_set_add(resource_set, data);
 
@@ -4198,10 +4198,10 @@ build_program_resource_list(struct gl_context *ctx,
 struct gl_shader_program *shProg)
 {
/* Rebuild resource list. */
-   if (shProg->ProgramResourceList) {
-  ralloc_free(shProg->ProgramResourceList);
-  shProg->ProgramResourceList = NULL;
-  shProg->NumProgramResourceList = 0;
+   if (shProg->data->ProgramResourceList) {
+  ralloc_free(shProg->data->ProgramResourceList);
+  shProg->data->ProgramResourceList = NULL;
+  shProg->data->NumProgramResourceList = 0;
}
 
int input_stage = MESA_SHADER_STAGES, output_stage = 0;
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 0c86f21..176e0b8 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -2666,6 +2666,10 @@ struct gl_shader_program_data
struct gl_active_atomic_buffer *AtomicBuffers;
unsigned NumAtomicBuffers;
 
+   /** List of all active resources after linking. */
+   struct gl_program_resource *ProgramResourceList;
+   unsigned NumProgramResourceList;
+
GLboolean LinkStatus;   /**< GL_LINK_STATUS */
GLboolean Validated;
GLchar *InfoLog;
@@ -2855,10 +2859,6 @@ struct gl_shader_program
 */
struct gl_linked_shader *_LinkedShaders[MESA_SHADER_STAGES];
 
-   /** List of all active resources after linking. */
-   struct gl_program_resource *ProgramResourceList;
-   unsigned NumProgramResourceList;
-
/* True if any of the fragment shaders attached to this program use:
 * #extension ARB_fragment_coord_conventions: enable
 */
diff --git a/src/mesa/main/program_resource.c b/src/mesa/main/program_resource.c
index 5461c4e..4b5be6f 100644
--- a/src/mesa/main/program_resource.c
+++ b/src/mesa/main/program_resource.c
@@ -119,8 +119,8 @@ _mesa_GetProgramInterfaceiv(GLuint program, GLenum 
programInterface,
/* Validate pname against interface. */
switch(pname) {
case GL_ACTIVE_RESOURCES:
-  for (i = 0, *params = 0; i < shProg->NumProgramResourceList; i++)
- if (shProg->ProgramResourceList[i].Type == programInterface)
+  for (i = 0, *params = 0; i < shProg->data->NumProgramResourceList; i++)
+ if (shProg->data->ProgramResourceList[i].Type == programInterface)
 (*params)++;
   break;
case GL_MAX_NAME_LENGTH:
@@ -135,32 +135,32 @@ _mesa_GetProgramInterfaceiv(GLuint program, GLenum 
programInterface,
   /* Name length consists of base name, 3 additional chars '[0]' if
* resource is an array and finally 1 char for string terminator.
*/
-  for (i = 0, *params = 0; i < shProg->NumProgramResourceList; i++) {
- if (shProg->ProgramResourceList[i].Type != programInterface)
+  for (i = 0, *params = 0; i < shProg->data->NumProgramResourceList; i++) {
+ if (shProg->data->ProgramResourceList[i].Type != programInterface)
 continue;
  unsigned len =
-_mesa_program_resource_name_len(&shProg->ProgramResourceList[i]);
+
_mesa_program_resource_name_len(&shProg->data->ProgramResourceList[i]);
  *params = MAX2(*params, len + 1);
   }
   break;
case GL_MAX_NUM_ACTIVE_VARIABLES:
   switch (programInterface) {
   case GL_UNIFORM_BLOCK:
- for (i = 0,

[Mesa-dev] [PATCH 49/70] mesa: use gl_program for CurrentProgram rather than gl_shader_program

2016-12-20 Thread Timothy Arceri

This makes much more sense and should be more performant in some
critical paths such as SSO validation which is called at draw time.

Previously the CurrentProgram array could have contained multiple
pointers to the same struct which was confusing and we would often
need to fish out the information we were really after from the
gl_program anyway.

Also it was error prone to depend on the _LinkedShaders array for
programs in current use because a failed linking attempt will lose
the information about the current program in use which is still
valid.
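
To make the difference concrete, here is a small sketch contrasting the two
layouts. All sk_* types and functions are hypothetical simplifications and
only mirror the shape of the real structs:

```c
#include <stddef.h>

#define SK_NUM_STAGES 6

/* Hypothetical, heavily reduced stand-ins for gl_program,
 * gl_linked_shader and gl_shader_program. */
struct sk_program {
   unsigned num_ssbos;
};

struct sk_linked_shader {
   struct sk_program *program;
};

struct sk_shader_program {
   struct sk_linked_shader *linked[SK_NUM_STAGES];
};

/* Old layout: each per-stage slot holds the whole shader program, so the
 * stage index is needed twice and the linked-shader slot may be NULL. */
unsigned
sk_num_ssbos_old(struct sk_shader_program *current[SK_NUM_STAGES],
                 unsigned stage)
{
   struct sk_shader_program *sp = current[stage];

   if (sp == NULL || sp->linked[stage] == NULL)
      return 0;
   return sp->linked[stage]->program->num_ssbos;
}

/* New layout: each per-stage slot points straight at that stage's program. */
unsigned
sk_num_ssbos_new(struct sk_program *current[SK_NUM_STAGES], unsigned stage)
{
   return current[stage] != NULL ? current[stage]->num_ssbos : 0;
}
```

In the old layout the same sk_shader_program pointer can sit in several
slots, which is the duplication described above.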
---
 src/mesa/drivers/common/meta.c| 11 ++--
 src/mesa/drivers/common/meta.h|  2 +-
 src/mesa/drivers/dri/i965/brw_context.c   | 10 ++--
 src/mesa/drivers/dri/i965/brw_ff_gs.c |  4 +-
 src/mesa/drivers/dri/i965/brw_gs_surface_state.c  |  8 +--
 src/mesa/drivers/dri/i965/brw_tcs_surface_state.c |  8 +--
 src/mesa/drivers/dri/i965/brw_tes_surface_state.c |  8 +--
 src/mesa/drivers/dri/i965/brw_vs_surface_state.c  |  9 +---
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c  | 10 ++--
 src/mesa/drivers/dri/i965/gen6_sol.c  | 24 -
 src/mesa/drivers/dri/i965/gen7_l3_state.c |  6 +--
 src/mesa/main/api_validate.c  | 57 
 src/mesa/main/ff_fragment_shader.cpp  |  6 +--
 src/mesa/main/mtypes.h|  2 +-
 src/mesa/main/pipelineobj.c   | 52 +--
 src/mesa/main/shader_query.cpp| 36 ++---
 src/mesa/main/shaderapi.c | 63 ++-
 src/mesa/main/state.c | 50 +++---
 src/mesa/main/texstate.c  |  5 +-
 src/mesa/main/transformfeedback.c |  2 +-
 src/mesa/main/uniform_query.cpp   | 21 +++-
 src/mesa/state_tracker/st_atom_atomicbuf.c| 20 +++
 src/mesa/state_tracker/st_atom_constbuf.c | 43 +---
 src/mesa/state_tracker/st_atom_image.c| 42 +--
 src/mesa/state_tracker/st_atom_storagebuf.c   | 48 +
 src/mesa/state_tracker/st_cb_compute.c|  4 +-
 26 files changed, 197 insertions(+), 354 deletions(-)

diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c
index 0d5661b..15d28b2 100644
--- a/src/mesa/drivers/common/meta.c
+++ b/src/mesa/drivers/common/meta.c
@@ -594,8 +594,8 @@ _mesa_meta_begin(struct gl_context *ctx, GLbitfield state)
* that we don't have to worry about the current pipeline state.
*/
   for (i = 0; i < MESA_SHADER_STAGES; i++) {
- _mesa_reference_shader_program(ctx, &save->Shader[i],
-ctx->Shader.CurrentProgram[i]);
+ _mesa_reference_program(ctx, &save->Program[i],
+ ctx->Shader.CurrentProgram[i]);
   }
   _mesa_reference_shader_program(ctx, &save->ActiveShader,
  ctx->Shader.ActiveProgram);
@@ -972,16 +972,15 @@ _mesa_meta_end(struct gl_context *ctx)
   * program object must be NULL.  _mesa_use_shader_program is a no-op
   * in that case.
   */
- _mesa_use_shader_program(ctx, targets[i],
-  save->Shader[i],
+ _mesa_use_shader_program(ctx, targets[i], save->Program[i],
   &ctx->Shader);
 
  /* Do this *before* killing the reference. :)
   */
- if (save->Shader[i] != NULL)
+ if (save->Program[i] != NULL)
 any_shader = true;
 
- _mesa_reference_shader_program(ctx, &save->Shader[i], NULL);
+ _mesa_reference_program(ctx, &save->Program[i], NULL);
   }
 
   _mesa_reference_shader_program(ctx, &ctx->Shader.ActiveProgram,
diff --git a/src/mesa/drivers/common/meta.h b/src/mesa/drivers/common/meta.h
index 0a913e9..1b5cf42 100644
--- a/src/mesa/drivers/common/meta.h
+++ b/src/mesa/drivers/common/meta.h
@@ -125,7 +125,7 @@ struct save_state
GLboolean FragmentProgramEnabled;
struct gl_program *FragmentProgram;
GLboolean ATIFragmentShaderEnabled;
-   struct gl_shader_program *Shader[MESA_SHADER_STAGES];
+   struct gl_program *Program[MESA_SHADER_STAGES];
struct gl_shader_program *ActiveShader;
struct gl_pipeline_object   *Pipeline;
 
diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
b/src/mesa/drivers/dri/i965/brw_context.c
index e53aefd..63cb12c 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -274,14 +274,12 @@ intel_update_state(struct gl_context * ctx, GLuint 
new_state)
 
/* Resolve color for each active shader image. */
for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) {
-  const struct gl_linked_shader *shader =
- ctx->_Shader->CurrentProgram[i] ?
-ctx->_Shader->CurrentProgram[i]->_LinkedShaders[i] : NULL;
+  const struct

[Mesa-dev] [PATCH 41/70] st/mesa: pass gl_program to st_bind_ssbos()

2016-12-20 Thread Timothy Arceri

We no longer need to pass gl_shader_program.

Reviewed-by: Nicolai Hähnle 
---
 src/mesa/state_tracker/st_atom_storagebuf.c | 30 ++---
 1 file changed, 15 insertions(+), 15 deletions(-)

diff --git a/src/mesa/state_tracker/st_atom_storagebuf.c 
b/src/mesa/state_tracker/st_atom_storagebuf.c
index e1efd62..bf87037 100644
--- a/src/mesa/state_tracker/st_atom_storagebuf.c
+++ b/src/mesa/state_tracker/st_atom_storagebuf.c
@@ -41,25 +41,25 @@
 #include "st_program.h"
 
 static void
-st_bind_ssbos(struct st_context *st, struct gl_linked_shader *shader,
+st_bind_ssbos(struct st_context *st, struct gl_program *prog,
   enum pipe_shader_type shader_type)
 {
unsigned i;
struct pipe_shader_buffer buffers[MAX_SHADER_STORAGE_BUFFERS];
struct gl_program_constants *c;
 
-   if (!shader || !st->pipe->set_shader_buffers)
+   if (!prog || !st->pipe->set_shader_buffers)
   return;
 
-   c = &st->ctx->Const.Program[shader->Stage];
+   c = &st->ctx->Const.Program[prog->info.stage];
 
-   for (i = 0; i < shader->Program->info.num_ssbos; i++) {
+   for (i = 0; i < prog->info.num_ssbos; i++) {
   struct gl_shader_storage_buffer_binding *binding;
   struct st_buffer_object *st_obj;
   struct pipe_shader_buffer *sb = &buffers[i];
 
   binding = &st->ctx->ShaderStorageBufferBindings[
-shader->Program->sh.ShaderStorageBlocks[i]->Binding];
+prog->sh.ShaderStorageBlocks[i]->Binding];
   st_obj = st_buffer_object(binding->BufferObject);
 
   sb->buffer = st_obj->buffer;
@@ -80,13 +80,13 @@ st_bind_ssbos(struct st_context *st, struct 
gl_linked_shader *shader,
   }
}
st->pipe->set_shader_buffers(st->pipe, shader_type, c->MaxAtomicBuffers,
-shader->Program->info.num_ssbos, buffers);
+prog->info.num_ssbos, buffers);
/* clear out any stale shader buffers */
-   if (shader->Program->info.num_ssbos < c->MaxShaderStorageBlocks)
+   if (prog->info.num_ssbos < c->MaxShaderStorageBlocks)
   st->pipe->set_shader_buffers(
 st->pipe, shader_type,
-c->MaxAtomicBuffers + shader->Program->info.num_ssbos,
-c->MaxShaderStorageBlocks - shader->Program->info.num_ssbos,
+c->MaxAtomicBuffers + prog->info.num_ssbos,
+c->MaxShaderStorageBlocks - prog->info.num_ssbos,
 NULL);
 }
 
@@ -98,7 +98,7 @@ static void bind_vs_ssbos(struct st_context *st)
if (!prog)
   return;
 
-   st_bind_ssbos(st, prog->_LinkedShaders[MESA_SHADER_VERTEX],
+   st_bind_ssbos(st, prog->_LinkedShaders[MESA_SHADER_VERTEX]->Program,
  PIPE_SHADER_VERTEX);
 }
 
@@ -114,7 +114,7 @@ static void bind_fs_ssbos(struct st_context *st)
if (!prog)
   return;
 
-   st_bind_ssbos(st, prog->_LinkedShaders[MESA_SHADER_FRAGMENT],
+   st_bind_ssbos(st, prog->_LinkedShaders[MESA_SHADER_FRAGMENT]->Program,
  PIPE_SHADER_FRAGMENT);
 }
 
@@ -130,7 +130,7 @@ static void bind_gs_ssbos(struct st_context *st)
if (!prog)
   return;
 
-   st_bind_ssbos(st, prog->_LinkedShaders[MESA_SHADER_GEOMETRY],
+   st_bind_ssbos(st, prog->_LinkedShaders[MESA_SHADER_GEOMETRY]->Program,
  PIPE_SHADER_GEOMETRY);
 }
 
@@ -146,7 +146,7 @@ static void bind_tcs_ssbos(struct st_context *st)
if (!prog)
   return;
 
-   st_bind_ssbos(st, prog->_LinkedShaders[MESA_SHADER_TESS_CTRL],
+   st_bind_ssbos(st, prog->_LinkedShaders[MESA_SHADER_TESS_CTRL]->Program,
  PIPE_SHADER_TESS_CTRL);
 }
 
@@ -162,7 +162,7 @@ static void bind_tes_ssbos(struct st_context *st)
if (!prog)
   return;
 
-   st_bind_ssbos(st, prog->_LinkedShaders[MESA_SHADER_TESS_EVAL],
+   st_bind_ssbos(st, prog->_LinkedShaders[MESA_SHADER_TESS_EVAL]->Program,
  PIPE_SHADER_TESS_EVAL);
 }
 
@@ -178,7 +178,7 @@ static void bind_cs_ssbos(struct st_context *st)
if (!prog)
   return;
 
-   st_bind_ssbos(st, prog->_LinkedShaders[MESA_SHADER_COMPUTE],
+   st_bind_ssbos(st, prog->_LinkedShaders[MESA_SHADER_COMPUTE]->Program,
  PIPE_SHADER_COMPUTE);
 }
 
-- 
2.9.3


[Mesa-dev] [PATCH 60/70] mesa/glsl: move uses_gl_fragcoord to gl_shader

2016-12-20 Thread Timothy Arceri

This is only used by gl_linked_shader as a temp during linking
so use a temp there instead.
---
 src/compiler/glsl/glsl_parser_extras.cpp |  2 +-
 src/compiler/glsl/linker.cpp | 12 +---
 src/mesa/main/mtypes.h   |  2 +-
 3 files changed, 7 insertions(+), 9 deletions(-)

diff --git a/src/compiler/glsl/glsl_parser_extras.cpp 
b/src/compiler/glsl/glsl_parser_extras.cpp
index bd13e00..2271403 100644
--- a/src/compiler/glsl/glsl_parser_extras.cpp
+++ b/src/compiler/glsl/glsl_parser_extras.cpp
@@ -1809,7 +1809,7 @@ set_shader_inout_layout(struct gl_shader *shader,
 
case MESA_SHADER_FRAGMENT:
   shader->redeclares_gl_fragcoord = state->fs_redeclares_gl_fragcoord;
-  shader->info.uses_gl_fragcoord = state->fs_uses_gl_fragcoord;
+  shader->uses_gl_fragcoord = state->fs_uses_gl_fragcoord;
   shader->info.pixel_center_integer = state->fs_pixel_center_integer;
   shader->info.origin_upper_left = state->fs_origin_upper_left;
   shader->ARB_fragment_coord_conventions_enable =
diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
index e7c36e8..548c59a 100644
--- a/src/compiler/glsl/linker.cpp
+++ b/src/compiler/glsl/linker.cpp
@@ -1826,7 +1826,7 @@ link_fs_inout_layout_qualifiers(struct gl_shader_program 
*prog,
 unsigned num_shaders)
 {
bool redeclares_gl_fragcoord = false;
-   linked_shader->info.uses_gl_fragcoord = false;
+   bool uses_gl_fragcoord = false;
linked_shader->info.origin_upper_left = false;
linked_shader->info.pixel_center_integer = false;
 
@@ -1844,9 +1844,9 @@ link_fs_inout_layout_qualifiers(struct gl_shader_program 
*prog,
*that have a static use gl_FragCoord."
*/
   if ((redeclares_gl_fragcoord && !shader->redeclares_gl_fragcoord &&
-   shader->info.uses_gl_fragcoord)
+   shader->uses_gl_fragcoord)
   || (shader->redeclares_gl_fragcoord && !redeclares_gl_fragcoord &&
-  linked_shader->info.uses_gl_fragcoord)) {
+  uses_gl_fragcoord)) {
  linker_error(prog, "fragment shader defined with conflicting "
  "layout qualifiers for gl_FragCoord\n");
   }
@@ -1870,11 +1870,9 @@ link_fs_inout_layout_qualifiers(struct gl_shader_program 
*prog,
* are multiple redeclarations, all the fields except uses_gl_fragcoord
* are already known to be the same.
*/
-  if (shader->redeclares_gl_fragcoord || shader->info.uses_gl_fragcoord) {
+  if (shader->redeclares_gl_fragcoord || shader->uses_gl_fragcoord) {
  redeclares_gl_fragcoord = shader->redeclares_gl_fragcoord;
- linked_shader->info.uses_gl_fragcoord =
-linked_shader->info.uses_gl_fragcoord ||
-shader->info.uses_gl_fragcoord;
+ uses_gl_fragcoord |= shader->uses_gl_fragcoord;
  linked_shader->info.origin_upper_left =
 shader->info.origin_upper_left;
  linked_shader->info.pixel_center_integer =
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index e4c7fdf..5a5fa6df 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -2234,7 +2234,6 @@ struct gl_subroutine_function
  */
 struct gl_shader_info
 {
-   bool uses_gl_fragcoord;
bool PostDepthCoverage;
bool InnerCoverage;
 
@@ -2429,6 +2428,7 @@ struct gl_shader
bool ARB_fragment_coord_conventions_enable;
 
bool redeclares_gl_fragcoord;
+   bool uses_gl_fragcoord;
 
struct gl_shader_info info;
 };
-- 
2.9.3


[Mesa-dev] [PATCH 48/70] mesa: don't always set _NEW_PROGRAM when linking

2016-12-20 Thread Timothy Arceri

We only need to set it when linking was successful and the program
being linked is currently active.

The programs_in_use mask is just used as a flag for now but in
a following patch we will use it to update the CurrentProgram
array.
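
The idea can be sketched in isolation as follows. The demo_* names are
hypothetical and the structs are reduced to the one field the check needs;
this is only an illustration of the mask logic, not the patch itself:

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_NUM_STAGES 6

/* Hypothetical stand-in for gl_shader_program, keeping only link status. */
struct demo_program {
   bool link_ok;
};

/* Build a bitmask with one bit per stage where 'prog' is the program
 * currently bound for that stage. */
unsigned
demo_programs_in_use(struct demo_program *current[DEMO_NUM_STAGES],
                     const struct demo_program *prog)
{
   unsigned mask = 0;

   for (unsigned stage = 0; stage < DEMO_NUM_STAGES; stage++) {
      if (current[stage] == prog)
         mask |= 1u << stage;
   }
   return mask;
}

/* State only needs to be flagged as changed when the link succeeded and
 * the program is active for at least one stage. */
bool
demo_needs_state_flush(const struct demo_program *prog, unsigned in_use_mask)
{
   return prog->link_ok && in_use_mask != 0;
}
```

A program bound to stages 0 and 4 yields the mask 0x11; a program that is
not bound anywhere yields 0 and never triggers the flush.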
---
 src/mesa/main/shaderapi.c | 22 +-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
index afe7060..fd6aae3 100644
--- a/src/mesa/main/shaderapi.c
+++ b/src/mesa/main/shaderapi.c
@@ -1083,10 +1083,30 @@ _mesa_link_program(struct gl_context *ctx, struct 
gl_shader_program *shProg)
   return;
}
 
-   FLUSH_VERTICES(ctx, _NEW_PROGRAM);
+   unsigned programs_in_use = 0;
+   if (ctx->_Shader)
+  for (unsigned stage = 0; stage < MESA_SHADER_STAGES; stage++) {
+ if (ctx->_Shader->CurrentProgram[stage] == shProg) {
+programs_in_use |= 1 << stage;
+ }
+   }
 
_mesa_glsl_link_shader(ctx, shProg);
 
+   /* From section 7.3 (Program Objects) of the OpenGL 4.5 spec:
+*
+*"If LinkProgram or ProgramBinary successfully re-links a program
+* object that is active for any shader stage, then the newly generated
+* executable code will be installed as part of the current rendering
+* state for all shader stages where the program is active.
+* Additionally, the newly generated executable code is made part of
+* the state of any program pipeline for all stages where the program
+* is attached."
+*/
+   if (shProg->data->LinkStatus && programs_in_use) {
+  FLUSH_VERTICES(ctx, _NEW_PROGRAM);
+   }
+
/* Capture .shader_test files. */
const char *capture_path = _mesa_get_shader_capture_path();
if (shProg->Name != 0 && shProg->Name != ~0 && capture_path != NULL) {
-- 
2.9.3


[Mesa-dev] [PATCH 42/70] st/mesa/glsl: set num_images directly in shader_info

2016-12-20 Thread Timothy Arceri

This change also removes the now duplicate NumImages field.

Reviewed-by: Nicolai Hähnle 
---
 src/compiler/glsl/link_uniforms.cpp|  2 +-
 src/compiler/glsl/linker.cpp   |  7 ---
 src/mesa/main/mtypes.h |  7 ---
 src/mesa/main/shaderapi.c  |  1 -
 src/mesa/state_tracker/st_atom_image.c | 12 ++--
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp |  4 ++--
 6 files changed, 13 insertions(+), 20 deletions(-)

diff --git a/src/compiler/glsl/link_uniforms.cpp 
b/src/compiler/glsl/link_uniforms.cpp
index 57a7db4..86711e2 100644
--- a/src/compiler/glsl/link_uniforms.cpp
+++ b/src/compiler/glsl/link_uniforms.cpp
@@ -1318,7 +1318,7 @@ link_assign_uniform_locations(struct gl_shader_program 
*prog,
   }
 
   sh->Program->info.num_textures = uniform_size.num_shader_samplers;
-  sh->NumImages = uniform_size.num_shader_images;
+  sh->Program->info.num_images = uniform_size.num_shader_images;
   sh->num_uniform_components = uniform_size.num_shader_uniform_components;
   sh->num_combined_uniform_components = sh->num_uniform_components;
 
diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
index 3b325e5..2c90141 100644
--- a/src/compiler/glsl/linker.cpp
+++ b/src/compiler/glsl/linker.cpp
@@ -3222,12 +3222,13 @@ check_image_resources(struct gl_context *ctx, struct 
gl_shader_program *prog)
   struct gl_linked_shader *sh = prog->_LinkedShaders[i];
 
   if (sh) {
- if (sh->NumImages > ctx->Const.Program[i].MaxImageUniforms)
+ if (sh->Program->info.num_images > ctx->Const.Program[i].MaxImageUniforms)
 linker_error(prog, "Too many %s shader image uniforms (%u > %u)\n",
- _mesa_shader_stage_to_string(i), sh->NumImages,
+ _mesa_shader_stage_to_string(i),
+ sh->Program->info.num_images,
  ctx->Const.Program[i].MaxImageUniforms);
 
- total_image_units += sh->NumImages;
+ total_image_units += sh->Program->info.num_images;
  total_shader_storage_blocks += sh->Program->info.num_ssbos;
 
  if (i == MESA_SHADER_FRAGMENT) {
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 797e1e9..0c86f21 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -2380,13 +2380,6 @@ struct gl_linked_shader
struct exec_list *fragdata_arrays;
struct glsl_symbol_table *symbols;
 
-   /**
-* Number of image uniforms defined in the shader.  It specifies
-* the number of valid elements in the \c ImageUnits and \c
-* ImageAccess arrays.
-*/
-   GLuint NumImages;
-
struct gl_shader_info info;
 };
 
diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
index 6d0f0e0..afe7060 100644
--- a/src/mesa/main/shaderapi.c
+++ b/src/mesa/main/shaderapi.c
@@ -2168,7 +2168,6 @@ _mesa_copy_linked_program_data(const struct gl_shader_program *src,
 
struct gl_program *dst = dst_sh->Program;
 
-   dst->info.num_images = dst_sh->NumImages;
dst->info.separate_shader = src->SeparateShader;
 
switch (dst_sh->Stage) {
diff --git a/src/mesa/state_tracker/st_atom_image.c b/src/mesa/state_tracker/st_atom_image.c
index b30006a..2fb37f5 100644
--- a/src/mesa/state_tracker/st_atom_image.c
+++ b/src/mesa/state_tracker/st_atom_image.c
@@ -57,7 +57,7 @@ st_bind_images(struct st_context *st, struct gl_linked_shader *shader,
 
c = &st->ctx->Const.Program[shader->Stage];
 
-   for (i = 0; i < shader->NumImages; i++) {
+   for (i = 0; i < shader->Program->info.num_images; i++) {
   struct gl_image_unit *u =
  &st->ctx->ImageUnits[shader->Program->sh.ImageUnits[i]];
   struct st_texture_object *stObj = st_texture_object(u->TexObj);
@@ -118,14 +118,14 @@ st_bind_images(struct st_context *st, struct gl_linked_shader *shader,
  }
   }
}
-   cso_set_shader_images(st->cso_context, shader_type, 0, shader->NumImages,
- images);
+   cso_set_shader_images(st->cso_context, shader_type, 0,
+ shader->Program->info.num_images, images);
/* clear out any stale shader images */
-   if (shader->NumImages < c->MaxImageUniforms)
+   if (shader->Program->info.num_images < c->MaxImageUniforms)
   cso_set_shader_images(
 st->cso_context, shader_type,
-shader->NumImages,
-c->MaxImageUniforms - shader->NumImages,
+shader->Program->info.num_images,
+c->MaxImageUniforms - shader->Program->info.num_images,
 NULL);
 }
 
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 5c4c13d..543256c 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -6346,7 +6346,7 @@ st_translate_program(
if (program->use_shared_memory)
   t->shared_memory = ureg_DECL_memory(ureg,

[Mesa-dev] [PATCH 50/70] mesa/meta: rewrite _mesa_shader_program_use() and _mesa_program_use()

2016-12-20 Thread Timothy Arceri

These are rewritten to do what the function names suggest, that is,
_mesa_shader_program_use() sets the program in use for all stages and
_mesa_program_use() sets the program in use for a single stage.

This patch is split out to make review easier, but it will be squashed into
"mesa: use gl_program for CurrentProgram rather than gl_shader_program"
before pushing.
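
The split between the two entry points can be pictured roughly as follows. This is a simplified sketch under assumed, hypothetical `sketch_` types (the real functions take a gl_context and do reference counting and state flagging that is left out here):

```c
#include <stddef.h>

enum { SKETCH_STAGES = 6 };

/* Hypothetical stand-ins for gl_program, gl_shader_program and
 * gl_pipeline_object. */
struct sketch_program { int id; };
struct sketch_shader_program { struct sketch_program *linked[SKETCH_STAGES]; };
struct sketch_pipeline { struct sketch_program *current[SKETCH_STAGES]; };

/* Single-stage variant: bind one program (or NULL) to one stage. */
void
sketch_use_program(struct sketch_pipeline *pipe, int stage,
                   struct sketch_program *prog)
{
   pipe->current[stage] = prog;
}

/* All-stages variant: walk every stage and bind whatever the linked
 * program provides for it, or NULL when the stage is absent. */
void
sketch_use_shader_program(struct sketch_pipeline *pipe,
                          const struct sketch_shader_program *shProg)
{
   for (int stage = 0; stage < SKETCH_STAGES; stage++) {
      struct sketch_program *prog = NULL;

      if (shProg != NULL)
         prog = shProg->linked[stage];
      sketch_use_program(pipe, stage, prog);
   }
}
```

Callers that only touch one stage, such as a per-stage helper for glUseProgramStages, would go through the single-stage variant; whole-program binds would go through the loop.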
---
 src/mesa/drivers/common/meta.c | 19 ---
 src/mesa/main/pipelineobj.c| 24 ++--
 src/mesa/main/shaderapi.c  | 34 --
 src/mesa/main/shaderapi.h  |  9 +
 4 files changed, 43 insertions(+), 43 deletions(-)

diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c
index 15d28b2..5b99c6b 100644
--- a/src/mesa/drivers/common/meta.c
+++ b/src/mesa/drivers/common/meta.c
@@ -167,7 +167,7 @@ _mesa_meta_use_program(struct gl_context *ctx,
_mesa_reference_pipeline_object(ctx, &ctx->_Shader, &ctx->Shader);
 
/* Update the program */
-   _mesa_use_program(ctx, sh_prog);
+   _mesa_use_shader_program(ctx, sh_prog);
 }
 
 void
@@ -931,16 +931,6 @@ _mesa_meta_end(struct gl_context *ctx)
}
 
if (state & MESA_META_SHADER) {
-  static const GLenum targets[] = {
- GL_VERTEX_SHADER,
- GL_TESS_CONTROL_SHADER,
- GL_TESS_EVALUATION_SHADER,
- GL_GEOMETRY_SHADER,
- GL_FRAGMENT_SHADER,
- GL_COMPUTE_SHADER,
-  };
-  STATIC_ASSERT(MESA_SHADER_STAGES == ARRAY_SIZE(targets));
-
   bool any_shader;
 
   if (ctx->Extensions.ARB_vertex_program) {
@@ -966,14 +956,13 @@ _mesa_meta_end(struct gl_context *ctx)
 
   any_shader = false;
   for (i = 0; i < MESA_SHADER_STAGES; i++) {
- /* It is safe to call _mesa_use_shader_program even if the extension
+ /* It is safe to call _mesa_use_program even if the extension
   * necessary for that program state is not supported.  In that case,
   * the saved program object must be NULL and the currently bound
-  * program object must be NULL.  _mesa_use_shader_program is a no-op
+  * program object must be NULL.  _mesa_use_program is a no-op
   * in that case.
   */
- _mesa_use_shader_program(ctx, targets[i], save->Program[i],
-  &ctx->Shader);
+ _mesa_use_program(ctx, i, save->Program[i], &ctx->Shader);
 
  /* Do this *before* killing the reference. :)
   */
diff --git a/src/mesa/main/pipelineobj.c b/src/mesa/main/pipelineobj.c
index f4b3a50..ca68068 100644
--- a/src/mesa/main/pipelineobj.c
+++ b/src/mesa/main/pipelineobj.c
@@ -218,6 +218,18 @@ _mesa_reference_pipeline_object_(struct gl_context *ctx,
}
 }
 
+static void
+use_program_stage(struct gl_context *ctx, GLenum type,
+  struct gl_shader_program *shProg,
+  struct gl_pipeline_object *pipe) {
+   gl_shader_stage stage = _mesa_shader_enum_to_shader_stage(type);
+   struct gl_program *prog = NULL;
+   if (shProg && shProg->_LinkedShaders[stage])
+  prog = shProg->_LinkedShaders[stage]->Program;
+
+   _mesa_use_program(ctx, stage, prog, pipe);
+}
+
 /**
  * Bound program to severals stages of the pipeline
  */
@@ -325,22 +337,22 @@ _mesa_UseProgramStages(GLuint pipeline, GLbitfield stages, GLuint program)
 * configured for the indicated shader stages."
 */
if ((stages & GL_VERTEX_SHADER_BIT) != 0)
-  _mesa_use_shader_program(ctx, GL_VERTEX_SHADER, shProg, pipe);
+  use_program_stage(ctx, GL_VERTEX_SHADER, shProg, pipe);
 
if ((stages & GL_FRAGMENT_SHADER_BIT) != 0)
-  _mesa_use_shader_program(ctx, GL_FRAGMENT_SHADER, shProg, pipe);
+  use_program_stage(ctx, GL_FRAGMENT_SHADER, shProg, pipe);
 
if ((stages & GL_GEOMETRY_SHADER_BIT) != 0)
-  _mesa_use_shader_program(ctx, GL_GEOMETRY_SHADER, shProg, pipe);
+  use_program_stage(ctx, GL_GEOMETRY_SHADER, shProg, pipe);
 
if ((stages & GL_TESS_CONTROL_SHADER_BIT) != 0)
-  _mesa_use_shader_program(ctx, GL_TESS_CONTROL_SHADER, shProg, pipe);
+  use_program_stage(ctx, GL_TESS_CONTROL_SHADER, shProg, pipe);
 
if ((stages & GL_TESS_EVALUATION_SHADER_BIT) != 0)
-  _mesa_use_shader_program(ctx, GL_TESS_EVALUATION_SHADER, shProg, pipe);
+  use_program_stage(ctx, GL_TESS_EVALUATION_SHADER, shProg, pipe);
 
if ((stages & GL_COMPUTE_SHADER_BIT) != 0)
-  _mesa_use_shader_program(ctx, GL_COMPUTE_SHADER, shProg, pipe);
+  use_program_stage(ctx, GL_COMPUTE_SHADER, shProg, pipe);
 
pipe->Validated = false;
 }
diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
index dde4fcc..0525168 100644
--- a/src/mesa/main/shaderapi.c
+++ b/src/mesa/main/shaderapi.c
@@ -1215,17 +1215,12 @@ _mesa_active_program(struct gl_context *ctx, struct gl_shader_program *shProg,
 
 
 static void
-use_shader_program(struct gl_context *ctx, gl_shader_stage stage,
-   struct gl_shader_program *shProg,
-   struct gl_pipeline_object *shTarget)

[Mesa-dev] [PATCH 69/70] mesa/glsl: set and get cs layouts to and from shader_info

2016-12-20 Thread Timothy Arceri

---
 src/compiler/glsl/linker.cpp | 35 +++
 src/mesa/main/mtypes.h   | 10 --
 src/mesa/main/shaderapi.c|  6 ++
 src/mesa/main/shaderobj.c|  2 --
 4 files changed, 17 insertions(+), 36 deletions(-)

diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
index 172103f..aeaa5a5 100644
--- a/src/compiler/glsl/linker.cpp
+++ b/src/compiler/glsl/linker.cpp
@@ -2004,21 +2004,21 @@ link_gs_inout_layout_qualifiers(struct gl_shader_program *prog,
  */
 static void
 link_cs_input_layout_qualifiers(struct gl_shader_program *prog,
-struct gl_linked_shader *linked_shader,
+struct gl_program *gl_prog,
 struct gl_shader **shader_list,
 unsigned num_shaders)
 {
-   for (int i = 0; i < 3; i++)
-  linked_shader->info.Comp.LocalSize[i] = 0;
-
-   linked_shader->info.Comp.LocalSizeVariable = false;
-
/* This function is called for all shader stages, but it only has an effect
 * for compute shaders.
 */
-   if (linked_shader->Stage != MESA_SHADER_COMPUTE)
+   if (gl_prog->info.stage != MESA_SHADER_COMPUTE)
   return;
 
+   for (int i = 0; i < 3; i++)
+  gl_prog->info.cs.local_size[i] = 0;
+
+   gl_prog->info.cs.local_size_variable = false;
+
/* From the ARB_compute_shader spec, in the section describing local size
 * declarations:
 *
@@ -2033,9 +2033,9 @@ link_cs_input_layout_qualifiers(struct gl_shader_program *prog,
   struct gl_shader *shader = shader_list[sh];
 
   if (shader->info.Comp.LocalSize[0] != 0) {
- if (linked_shader->info.Comp.LocalSize[0] != 0) {
+ if (gl_prog->info.cs.local_size[0] != 0) {
 for (int i = 0; i < 3; i++) {
-   if (linked_shader->info.Comp.LocalSize[i] !=
+   if (gl_prog->info.cs.local_size[i] !=
shader->info.Comp.LocalSize[i]) {
   linker_error(prog, "compute shader defined with conflicting "
"local sizes\n");
@@ -2044,11 +2044,11 @@ link_cs_input_layout_qualifiers(struct gl_shader_program *prog,
 }
  }
  for (int i = 0; i < 3; i++) {
-linked_shader->info.Comp.LocalSize[i] =
+gl_prog->info.cs.local_size[i] =
shader->info.Comp.LocalSize[i];
  }
   } else if (shader->info.Comp.LocalSizeVariable) {
- if (linked_shader->info.Comp.LocalSize[0] != 0) {
+ if (gl_prog->info.cs.local_size[0] != 0) {
 /* The ARB_compute_variable_group_size spec says:
  *
  * If one compute shader attached to a program declares a
@@ -2060,7 +2060,7 @@ link_cs_input_layout_qualifiers(struct gl_shader_program *prog,
  "variable local group size\n");
 return;
  }
- linked_shader->info.Comp.LocalSizeVariable = true;
+ gl_prog->info.cs.local_size_variable = true;
   }
}
 
@@ -2068,17 +2068,12 @@ link_cs_input_layout_qualifiers(struct gl_shader_program *prog,
 * since we already know we're in the right type of shader program
 * for doing it.
 */
-   if (linked_shader->info.Comp.LocalSize[0] == 0 &&
-   !linked_shader->info.Comp.LocalSizeVariable) {
+   if (gl_prog->info.cs.local_size[0] == 0 &&
+   !gl_prog->info.cs.local_size_variable) {
   linker_error(prog, "compute shader must contain a fixed or a variable "
  "local group size\n");
   return;
}
-   for (int i = 0; i < 3; i++)
-  prog->Comp.LocalSize[i] = linked_shader->info.Comp.LocalSize[i];
-
-   prog->Comp.LocalSizeVariable =
-  linked_shader->info.Comp.LocalSizeVariable;
 }
 
 
@@ -2210,7 +2205,7 @@ link_intrastage_shaders(void *mem_ctx,
link_tcs_out_layout_qualifiers(prog, gl_prog, shader_list, num_shaders);
link_tes_in_layout_qualifiers(prog, gl_prog, shader_list, num_shaders);
link_gs_inout_layout_qualifiers(prog, gl_prog, shader_list, num_shaders);
-   link_cs_input_layout_qualifiers(prog, linked, shader_list, num_shaders);
+   link_cs_input_layout_qualifiers(prog, gl_prog, shader_list, num_shaders);
link_xfb_stride_layout_qualifiers(ctx, prog, linked, shader_list,
  num_shaders);
 
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 4c47d3f..9f84735 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -2758,19 +2758,9 @@ struct gl_shader_program
 */
struct {
   /**
-   * If this shader contains a compute stage, size specified using
-   * local_size_{x,y,z}.  Otherwise undefined.
-   */
-  unsigned LocalSize[3];
-  /**
* Size of shared variables accessed by the compute shader.
*/
   unsigned SharedSize;
-
-  /**
-   * Whether a variable work group size has been specified.
-   */
-  bool

[Mesa-dev] [PATCH 40/70] st/mesa/glsl: move SamplerTargets to gl_program

2016-12-20 Thread Timothy Arceri

This will help us simplify the handling of samplers by
storing them in a single location rather than duplicating them in
both gl_linked_shader and gl_program.
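
For readers unfamiliar with how the sampler target array is consumed, here is a small sketch of the bitmask walk that gl_external_samplers() performs once the targets live on the program. All `sketch_` names are hypothetical, and sketch_bit_scan() is only a plain-C stand-in for a "find and clear the lowest set bit" helper:

```c
#include <assert.h>
#include <stdint.h>

enum sketch_texture_index {
   SKETCH_TEXTURE_2D_INDEX,
   SKETCH_TEXTURE_EXTERNAL_INDEX
};

#define SKETCH_MAX_SAMPLERS 32

/* Hypothetical stand-in for gl_program: the used-sampler mask and the
 * per-sampler targets sit side by side in one struct. */
struct sketch_program {
   uint32_t SamplersUsed;
   enum sketch_texture_index SamplerTargets[SKETCH_MAX_SAMPLERS];
};

/* Return the index of the lowest set bit and clear it in *mask.
 * The caller must pass a non-zero mask. */
int
sketch_bit_scan(uint32_t *mask)
{
   int idx = 0;

   assert(*mask != 0);
   while (((*mask >> idx) & 1u) == 0)
      idx++;
   *mask &= ~(1u << idx);
   return idx;
}

/* Build a bitmask of the used samplers that sample an external texture. */
uint32_t
sketch_external_samplers(const struct sketch_program *prog)
{
   uint32_t external = 0;
   uint32_t mask = prog->SamplersUsed;

   while (mask) {
      int idx = sketch_bit_scan(&mask);
      if (prog->SamplerTargets[idx] == SKETCH_TEXTURE_EXTERNAL_INDEX)
         external |= (1u << idx);
   }
   return external;
}
```

Only samplers whose bit is set in the used mask are inspected, so a stale target entry for an unused sampler does not leak into the result.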
---
 src/compiler/glsl/link_uniforms.cpp   |  7 ---
 src/mesa/main/mtypes.h| 14 --
 src/mesa/main/uniform_query.cpp   |  2 +-
 src/mesa/main/uniforms.c  |  2 +-
 src/mesa/program/ir_to_mesa.cpp   |  2 +-
 src/mesa/state_tracker/st_glsl_to_nir.cpp |  2 +-
 6 files changed, 16 insertions(+), 13 deletions(-)

diff --git a/src/compiler/glsl/link_uniforms.cpp b/src/compiler/glsl/link_uniforms.cpp
index 8604dba..57a7db4 100644
--- a/src/compiler/glsl/link_uniforms.cpp
+++ b/src/compiler/glsl/link_uniforms.cpp
@@ -1248,10 +1248,11 @@ link_assign_uniform_storage(struct gl_context *ctx,
  parcel.shader_samplers_used;
   prog->_LinkedShaders[i]->shadow_samplers = parcel.shader_shadow_samplers;
 
-  STATIC_ASSERT(sizeof(prog->_LinkedShaders[i]->SamplerTargets) ==
+  STATIC_ASSERT(sizeof(prog->_LinkedShaders[i]->Program->sh.SamplerTargets) ==
 sizeof(parcel.targets));
-  memcpy(prog->_LinkedShaders[i]->SamplerTargets, parcel.targets,
- sizeof(prog->_LinkedShaders[i]->SamplerTargets));
+  memcpy(prog->_LinkedShaders[i]->Program->sh.SamplerTargets,
+ parcel.targets,
+ sizeof(prog->_LinkedShaders[i]->Program->sh.SamplerTargets));
}
 
 #ifndef NDEBUG
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 8e13add..797e1e9 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -2000,6 +2000,11 @@ struct gl_program
  struct gl_uniform_block **UniformBlocks;
  struct gl_uniform_block **ShaderStorageBlocks;
 
+ /** Which texture target is being sampled
+  * (TEXTURE_1D/2D/3D/etc_INDEX)
+  */
+ gl_texture_index SamplerTargets[MAX_SAMPLERS];
+
  union {
 struct {
/**
@@ -2355,9 +2360,6 @@ struct gl_linked_shader
GLbitfield shadow_samplers; /**< Samplers used for shadow sampling. */
/*@}*/
 
-   /** Which texture target is being sampled (TEXTURE_1D/2D/3D/etc_INDEX) */
-   gl_texture_index SamplerTargets[MAX_SAMPLERS];
-
/**
 * Number of default uniform block components used by this shader.
 *
@@ -2388,14 +2390,14 @@ struct gl_linked_shader
struct gl_shader_info info;
 };
 
-static inline GLbitfield gl_external_samplers(struct gl_linked_shader *shader)
+static inline GLbitfield gl_external_samplers(struct gl_program *prog)
 {
GLbitfield external_samplers = 0;
-   GLbitfield mask = shader->Program->SamplersUsed;
+   GLbitfield mask = prog->SamplersUsed;
 
while (mask) {
   int idx = u_bit_scan(&mask);
-  if (shader->SamplerTargets[idx] == TEXTURE_EXTERNAL_INDEX)
+  if (prog->sh.SamplerTargets[idx] == TEXTURE_EXTERNAL_INDEX)
  external_samplers |= (1 << idx);
}
 
diff --git a/src/mesa/main/uniform_query.cpp b/src/mesa/main/uniform_query.cpp
index 047d21a..145eff0 100644
--- a/src/mesa/main/uniform_query.cpp
+++ b/src/mesa/main/uniform_query.cpp
@@ -1107,7 +1107,7 @@ _mesa_sampler_uniforms_pipeline_are_valid(struct gl_pipeline_object *pipeline)
   while (mask) {
  const int s = u_bit_scan(&mask);
  GLuint unit = shader->Program->SamplerUnits[s];
- GLuint tgt = shader->SamplerTargets[s];
+ GLuint tgt = shader->Program->sh.SamplerTargets[s];
 
  /* FIXME: Samplers are initialized to 0 and Mesa doesn't do a
   * great job of eliminating unused uniforms currently so for now
diff --git a/src/mesa/main/uniforms.c b/src/mesa/main/uniforms.c
index 5534fcf..51db39e 100644
--- a/src/mesa/main/uniforms.c
+++ b/src/mesa/main/uniforms.c
@@ -80,7 +80,7 @@ _mesa_update_shader_textures_used(struct gl_shader_program *shProg,
while (mask) {
   const int s = u_bit_scan(&mask);
   GLuint unit = prog->SamplerUnits[s];
-  GLuint tgt = shader->SamplerTargets[s];
+  GLuint tgt = prog->sh.SamplerTargets[s];
   assert(unit < ARRAY_SIZE(prog->TexturesUsed));
   assert(tgt < NUM_TEXTURE_TARGETS);
 
diff --git a/src/mesa/program/ir_to_mesa.cpp b/src/mesa/program/ir_to_mesa.cpp
index f360f8f..80516f4 100644
--- a/src/mesa/program/ir_to_mesa.cpp
+++ b/src/mesa/program/ir_to_mesa.cpp
@@ -2918,7 +2918,7 @@ get_mesa_program(struct gl_context *ctx,
do_set_program_inouts(shader->ir, prog, shader->Stage);
 
prog->ShadowSamplers = shader->shadow_samplers;
-   prog->ExternalSamplersUsed = gl_external_samplers(shader);
+   prog->ExternalSamplersUsed = gl_external_samplers(prog);
_mesa_update_shader_textures_used(shader_program, prog);
 
/* Set the gl_FragDepth layout. */
diff --git a/src/mesa/state_tracker/st_glsl_to_nir.cpp b/src/mesa/state_tracker/st_glsl_to_nir.cpp
index d5309e4..60d101c 100644
--- a/src/mesa/state_tracker/st_glsl_to_nir.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_nir.cpp
@@ -413,7

[Mesa-dev] [PATCH 62/70] mesa/glsl: move pixel_center_integer to gl_shader

2016-12-20 Thread Timothy Arceri

This is only used by gl_linked_shader as a temp during linking
so use a temp there instead.
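
The pattern is roughly the one below: a heavily simplified sketch of the cross-shader gl_FragCoord check, keeping the merged qualifier values in locals rather than in a struct that outlives linking. The `sketch_` names are hypothetical and several checks from the real linker are omitted:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-shader fragment layout state. */
struct sketch_shader {
   bool redeclares_gl_fragcoord;
   bool uses_gl_fragcoord;
   bool origin_upper_left;
   bool pixel_center_integer;
};

/* Returns false when two shaders redeclare gl_FragCoord with conflicting
 * layout qualifiers, true otherwise. */
bool
sketch_link_fragcoord_layout(const struct sketch_shader *shaders, size_t n)
{
   /* Temporaries that only matter while linking. */
   bool redeclares_gl_fragcoord = false;
   bool origin_upper_left = false;
   bool pixel_center_integer = false;

   for (size_t i = 0; i < n; i++) {
      const struct sketch_shader *s = &shaders[i];

      if (redeclares_gl_fragcoord && s->redeclares_gl_fragcoord &&
          (s->origin_upper_left != origin_upper_left ||
           s->pixel_center_integer != pixel_center_integer))
         return false;

      if (s->redeclares_gl_fragcoord || s->uses_gl_fragcoord) {
         redeclares_gl_fragcoord = s->redeclares_gl_fragcoord;
         origin_upper_left = s->origin_upper_left;
         pixel_center_integer = s->pixel_center_integer;
      }
   }
   return true;
}
```

Since nothing reads the merged pixel_center_integer value after the loop in this sketch, a local is all that is needed.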
---
 src/compiler/glsl/glsl_parser_extras.cpp | 2 +-
 src/compiler/glsl/linker.cpp | 8 +++-
 src/mesa/main/mtypes.h   | 3 +--
 3 files changed, 5 insertions(+), 8 deletions(-)

diff --git a/src/compiler/glsl/glsl_parser_extras.cpp b/src/compiler/glsl/glsl_parser_extras.cpp
index 18769e9..b3358f9 100644
--- a/src/compiler/glsl/glsl_parser_extras.cpp
+++ b/src/compiler/glsl/glsl_parser_extras.cpp
@@ -1810,7 +1810,7 @@ set_shader_inout_layout(struct gl_shader *shader,
case MESA_SHADER_FRAGMENT:
   shader->redeclares_gl_fragcoord = state->fs_redeclares_gl_fragcoord;
   shader->uses_gl_fragcoord = state->fs_uses_gl_fragcoord;
-  shader->info.pixel_center_integer = state->fs_pixel_center_integer;
+  shader->pixel_center_integer = state->fs_pixel_center_integer;
   shader->origin_upper_left = state->fs_origin_upper_left;
   shader->ARB_fragment_coord_conventions_enable =
  state->ARB_fragment_coord_conventions_enable;
diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
index ca0f844..6ca5aec 100644
--- a/src/compiler/glsl/linker.cpp
+++ b/src/compiler/glsl/linker.cpp
@@ -1828,7 +1828,7 @@ link_fs_inout_layout_qualifiers(struct gl_shader_program *prog,
bool redeclares_gl_fragcoord = false;
bool uses_gl_fragcoord = false;
bool origin_upper_left = false;
-   linked_shader->info.pixel_center_integer = false;
+   bool pixel_center_integer = false;
 
if (linked_shader->Stage != MESA_SHADER_FRAGMENT ||
(prog->data->Version < 150 &&
@@ -1858,8 +1858,7 @@ link_fs_inout_layout_qualifiers(struct gl_shader_program *prog,
*/
   if (redeclares_gl_fragcoord && shader->redeclares_gl_fragcoord &&
   (shader->origin_upper_left != origin_upper_left ||
-   shader->info.pixel_center_integer !=
-   linked_shader->info.pixel_center_integer)) {
+   shader->pixel_center_integer != pixel_center_integer)) {
  linker_error(prog, "fragment shader defined with conflicting "
   "layout qualifiers for gl_FragCoord\n");
   }
@@ -1873,8 +1872,7 @@ link_fs_inout_layout_qualifiers(struct gl_shader_program *prog,
  redeclares_gl_fragcoord = shader->redeclares_gl_fragcoord;
  uses_gl_fragcoord |= shader->uses_gl_fragcoord;
  origin_upper_left = shader->origin_upper_left;
- linked_shader->info.pixel_center_integer =
-shader->info.pixel_center_integer;
+ pixel_center_integer = shader->pixel_center_integer;
   }
 
   linked_shader->Program->info.fs.early_fragment_tests |=
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index f01109a..914fe62 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -2237,8 +2237,6 @@ struct gl_shader_info
bool PostDepthCoverage;
bool InnerCoverage;
 
-   bool pixel_center_integer;
-
struct {
   /** Global xfb_stride out qualifier if any */
   GLuint BufferStride[MAX_FEEDBACK_BUFFERS];
@@ -2430,6 +2428,7 @@ struct gl_shader
 * Fragment shader state from GLSL 1.50 layout qualifiers.
 */
bool origin_upper_left;
+   bool pixel_center_integer;
 
struct gl_shader_info info;
 };
-- 
2.9.3


[Mesa-dev] [PATCH 58/70] mesa/glsl: move ARB_fragment_coord_conventions_enable field

2016-12-20 Thread Timothy Arceri

This is only used by gl_shader, not gl_linked_shader, so move it
there.
---
 src/compiler/glsl/glsl_parser_extras.cpp | 2 +-
 src/compiler/glsl/linker.cpp | 2 +-
 src/mesa/main/mtypes.h   | 3 ++-
 3 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/src/compiler/glsl/glsl_parser_extras.cpp b/src/compiler/glsl/glsl_parser_extras.cpp
index 9138133..b70b1dc 100644
--- a/src/compiler/glsl/glsl_parser_extras.cpp
+++ b/src/compiler/glsl/glsl_parser_extras.cpp
@@ -1813,7 +1813,7 @@ set_shader_inout_layout(struct gl_shader *shader,
   shader->info.uses_gl_fragcoord = state->fs_uses_gl_fragcoord;
   shader->info.pixel_center_integer = state->fs_pixel_center_integer;
   shader->info.origin_upper_left = state->fs_origin_upper_left;
-  shader->info.ARB_fragment_coord_conventions_enable =
+  shader->ARB_fragment_coord_conventions_enable =
  state->ARB_fragment_coord_conventions_enable;
   shader->EarlyFragmentTests = state->fs_early_fragment_tests;
   shader->info.InnerCoverage = state->fs_inner_coverage;
diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
index 337fa17..2bbb112 100644
--- a/src/compiler/glsl/linker.cpp
+++ b/src/compiler/glsl/linker.cpp
@@ -4816,7 +4816,7 @@ link_shaders(struct gl_context *ctx, struct gl_shader_program *prog)
  goto done;
   }
 
-  if (prog->Shaders[i]->info.ARB_fragment_coord_conventions_enable) {
+  if (prog->Shaders[i]->ARB_fragment_coord_conventions_enable) {
  prog->ARB_fragment_coord_conventions_enable = true;
   }
 
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index a05ea60..3793580 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -2238,7 +2238,6 @@ struct gl_shader_info
bool redeclares_gl_fragcoord;
bool PostDepthCoverage;
bool InnerCoverage;
-   bool ARB_fragment_coord_conventions_enable;
 
/**
 * Fragment shader state from GLSL 1.50 layout qualifiers.
@@ -2428,6 +2427,8 @@ struct gl_shader
 */
bool EarlyFragmentTests;
 
+   bool ARB_fragment_coord_conventions_enable;
+
struct gl_shader_info info;
 };
 
-- 
2.9.3


[Mesa-dev] [PATCH 43/70] st/mesa: stop passing gl_linked_shader to set_affected_state_flags()

2016-12-20 Thread Timothy Arceri

We now get everything we need from the gl_program param.

Reviewed-by: Nicolai Hähnle 
---
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 13 ++---
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 543256c..11ec352 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -6600,7 +6600,6 @@ get_mesa_program_tgsi(struct gl_context *ctx,
 static void
 set_affected_state_flags(uint64_t *states,
  struct gl_program *prog,
- struct gl_linked_shader *shader,
  uint64_t new_constants,
  uint64_t new_sampler_views,
  uint64_t new_samplers,
@@ -6666,7 +6665,7 @@ get_mesa_program(struct gl_context *ctx,
ST_NEW_RASTERIZER |
ST_NEW_VERTEX_ARRAYS;
 
- set_affected_state_flags(states, prog, shader,
+ set_affected_state_flags(states, prog,
   ST_NEW_VS_CONSTANTS,
   ST_NEW_VS_SAMPLER_VIEWS,
   ST_NEW_RENDER_SAMPLERS,
@@ -6681,7 +6680,7 @@ get_mesa_program(struct gl_context *ctx,
 
  *states = ST_NEW_TCS_STATE;
 
- set_affected_state_flags(states, prog, shader,
+ set_affected_state_flags(states, prog,
   ST_NEW_TCS_CONSTANTS,
   ST_NEW_TCS_SAMPLER_VIEWS,
   ST_NEW_RENDER_SAMPLERS,
@@ -6697,7 +6696,7 @@ get_mesa_program(struct gl_context *ctx,
  *states = ST_NEW_TES_STATE |
ST_NEW_RASTERIZER;
 
- set_affected_state_flags(states, prog, shader,
+ set_affected_state_flags(states, prog,
   ST_NEW_TES_CONSTANTS,
   ST_NEW_TES_SAMPLER_VIEWS,
   ST_NEW_RENDER_SAMPLERS,
@@ -6713,7 +6712,7 @@ get_mesa_program(struct gl_context *ctx,
  *states = ST_NEW_GS_STATE |
ST_NEW_RASTERIZER;
 
- set_affected_state_flags(states, prog, shader,
+ set_affected_state_flags(states, prog,
   ST_NEW_GS_CONSTANTS,
   ST_NEW_GS_SAMPLER_VIEWS,
   ST_NEW_RENDER_SAMPLERS,
@@ -6731,7 +6730,7 @@ get_mesa_program(struct gl_context *ctx,
ST_NEW_SAMPLE_SHADING |
ST_NEW_FS_CONSTANTS;
 
- set_affected_state_flags(states, prog, shader,
+ set_affected_state_flags(states, prog,
   ST_NEW_FS_CONSTANTS,
   ST_NEW_FS_SAMPLER_VIEWS,
   ST_NEW_RENDER_SAMPLERS,
@@ -6746,7 +6745,7 @@ get_mesa_program(struct gl_context *ctx,
 
  *states = ST_NEW_CS_STATE;
 
- set_affected_state_flags(states, prog, shader,
+ set_affected_state_flags(states, prog,
   ST_NEW_CS_CONSTANTS,
   ST_NEW_CS_SAMPLER_VIEWS,
   ST_NEW_CS_SAMPLERS,
-- 
2.9.3


[Mesa-dev] [PATCH 63/70] glsl: tidy up PostDepthCoverage shader field

2016-12-20 Thread Timothy Arceri

There is no reason for this to be in the shared gl_shader_info or
to copy it to gl_program at the end of linking (it's already there).
---
 src/compiler/glsl/glsl_parser_extras.cpp | 2 +-
 src/compiler/glsl/linker.cpp | 2 +-
 src/mesa/main/mtypes.h   | 3 ++-
 src/mesa/main/shaderapi.c| 1 -
 4 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/compiler/glsl/glsl_parser_extras.cpp b/src/compiler/glsl/glsl_parser_extras.cpp
index b3358f9..75c0157 100644
--- a/src/compiler/glsl/glsl_parser_extras.cpp
+++ b/src/compiler/glsl/glsl_parser_extras.cpp
@@ -1816,7 +1816,7 @@ set_shader_inout_layout(struct gl_shader *shader,
  state->ARB_fragment_coord_conventions_enable;
   shader->EarlyFragmentTests = state->fs_early_fragment_tests;
   shader->info.InnerCoverage = state->fs_inner_coverage;
-  shader->info.PostDepthCoverage = state->fs_post_depth_coverage;
+  shader->PostDepthCoverage = state->fs_post_depth_coverage;
   shader->BlendSupport = state->fs_blend_support;
   break;
 
diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
index 6ca5aec..e0e08f1 100644
--- a/src/compiler/glsl/linker.cpp
+++ b/src/compiler/glsl/linker.cpp
@@ -1880,7 +1880,7 @@ link_fs_inout_layout_qualifiers(struct gl_shader_program *prog,
   linked_shader->info.InnerCoverage |=
  shader->info.InnerCoverage;
   linked_shader->Program->info.fs.post_depth_coverage |=
- shader->info.PostDepthCoverage;
+ shader->PostDepthCoverage;
 
   linked_shader->Program->sh.fs.BlendSupport |= shader->BlendSupport;
}
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 914fe62..c3050ee 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -2234,7 +2234,6 @@ struct gl_subroutine_function
  */
 struct gl_shader_info
 {
-   bool PostDepthCoverage;
bool InnerCoverage;
 
struct {
@@ -2424,6 +2423,8 @@ struct gl_shader
bool redeclares_gl_fragcoord;
bool uses_gl_fragcoord;
 
+   bool PostDepthCoverage;
+
/**
 * Fragment shader state from GLSL 1.50 layout qualifiers.
 */
diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
index b03503e..c0dbd28 100644
--- a/src/mesa/main/shaderapi.c
+++ b/src/mesa/main/shaderapi.c
@@ -2208,7 +2208,6 @@ _mesa_copy_linked_program_data(const struct gl_shader_program *src,
case MESA_SHADER_FRAGMENT: {
   dst->info.fs.depth_layout = src->FragDepthLayout;
   dst->info.fs.inner_coverage = dst_sh->info.InnerCoverage;
-  dst->info.fs.post_depth_coverage = dst_sh->info.PostDepthCoverage;
   break;
}
case MESA_SHADER_COMPUTE: {
-- 
2.9.3


[Mesa-dev] [PATCH 53/70] mesa/glsl: set {clip, cull}_distance_array_size directly in gl_program

2016-12-20 Thread Timothy Arceri

There are some line wrapping violations here but those lines will get
deleted in the following patch.
---
 src/compiler/glsl/glsl_to_nir.cpp   |  2 --
 src/compiler/glsl/linker.cpp| 32 +++
 src/mesa/drivers/dri/i965/brw_vs.c  |  2 +-
 src/mesa/main/mtypes.h  | 38 -
 src/mesa/main/shaderapi.c   |  8 
 src/mesa/state_tracker/st_program.c | 16 
 6 files changed, 25 insertions(+), 73 deletions(-)

diff --git a/src/compiler/glsl/glsl_to_nir.cpp b/src/compiler/glsl/glsl_to_nir.cpp
index 135aed2..a41f370 100644
--- a/src/compiler/glsl/glsl_to_nir.cpp
+++ b/src/compiler/glsl/glsl_to_nir.cpp
@@ -149,8 +149,6 @@ glsl_to_nir(const struct gl_shader_program *shader_prog,
shader->info->name = ralloc_asprintf(shader, "GLSL%d", shader_prog->Name);
if (shader_prog->Label)
   shader->info->label = ralloc_strdup(shader, shader_prog->Label);
-   shader->info->clip_distance_array_size = sh->Program->ClipDistanceArraySize;
-   shader->info->cull_distance_array_size = sh->Program->CullDistanceArraySize;
shader->info->has_transform_feedback_varyings =
   shader_prog->TransformFeedback.NumVarying > 0;
 
diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
index 22b18fb..9562d41 100644
--- a/src/compiler/glsl/linker.cpp
+++ b/src/compiler/glsl/linker.cpp
@@ -633,8 +633,8 @@ analyze_clip_cull_usage(struct gl_shader_program *prog,
 /**
  * Verify that a vertex shader executable meets all semantic requirements.
  *
- * Also sets prog->Vert.ClipDistanceArraySize and
- * prog->Vert.CullDistanceArraySize as a side effect.
+ * Also sets info.clip_distance_array_size and
+ * info.cull_distance_array_size as a side effect.
  *
  * \param shader  Vertex shader executable to be verified
  */
@@ -689,8 +689,8 @@ validate_vertex_shader_executable(struct gl_shader_program *prog,
}
 
analyze_clip_cull_usage(prog, shader, ctx,
-   &prog->Vert.ClipDistanceArraySize,
-   &prog->Vert.CullDistanceArraySize);
+   &shader->Program->info.clip_distance_array_size,
+   &shader->Program->info.cull_distance_array_size);
 }
 
 void
@@ -702,8 +702,8 @@ validate_tess_eval_shader_executable(struct gl_shader_program *prog,
   return;
 
analyze_clip_cull_usage(prog, shader, ctx,
-   &prog->TessEval.ClipDistanceArraySize,
-   &prog->TessEval.CullDistanceArraySize);
+   &shader->Program->info.clip_distance_array_size,
+   &shader->Program->info.cull_distance_array_size);
 }
 
 
@@ -734,8 +734,8 @@ validate_fragment_shader_executable(struct gl_shader_program *prog,
 /**
  * Verify that a geometry shader executable meets all semantic requirements
  *
- * Also sets prog->Geom.VerticesIn, and prog->Geom.ClipDistanceArraySize and
- * prog->Geom.CullDistanceArraySize as a side effect.
+ * Also sets prog->Geom.VerticesIn, and info.clip_distance_array_size and
+ * info.cull_distance_array_size as a side effect.
  *
  * \param shader Geometry shader executable to be verified
  */
@@ -751,8 +751,8 @@ validate_geometry_shader_executable(struct gl_shader_program *prog,
prog->Geom.VerticesIn = num_vertices;
 
analyze_clip_cull_usage(prog, shader, ctx,
-   &prog->Geom.ClipDistanceArraySize,
-   &prog->Geom.CullDistanceArraySize);
+   &shader->Program->info.clip_distance_array_size,
+   &shader->Program->info.cull_distance_array_size);
 }
 
 /**
@@ -4934,14 +4934,14 @@ link_shaders(struct gl_context *ctx, struct gl_shader_program *prog)
}
 
if (num_shaders[MESA_SHADER_GEOMETRY] > 0) {
-  prog->LastClipDistanceArraySize = prog->Geom.ClipDistanceArraySize;
-  prog->LastCullDistanceArraySize = prog->Geom.CullDistanceArraySize;
+  prog->LastClipDistanceArraySize = prog->_LinkedShaders[MESA_SHADER_GEOMETRY]->Program->info.clip_distance_array_size;
+  prog->LastCullDistanceArraySize = prog->_LinkedShaders[MESA_SHADER_GEOMETRY]->Program->info.cull_distance_array_size;
} else if (num_shaders[MESA_SHADER_TESS_EVAL] > 0) {
-  prog->LastClipDistanceArraySize = prog->TessEval.ClipDistanceArraySize;
-  prog->LastCullDistanceArraySize = prog->TessEval.CullDistanceArraySize;
+  prog->LastClipDistanceArraySize = prog->_LinkedShaders[MESA_SHADER_TESS_EVAL]->Program->info.clip_distance_array_size;
+  prog->LastCullDistanceArraySize = prog->_LinkedShaders[MESA_SHADER_TESS_EVAL]->Program->info.cull_distance_array_size;
} else if (num_shaders[MESA_SHADER_VERTEX] > 0) {
-  prog->LastClipDistanceArraySize = prog->Vert.ClipDistanceArraySize;
-  prog->LastCullDistanceArraySize = prog->Vert.CullDistanceArraySize;
+  prog->LastClipDistanceArraySize = prog->_LinkedShaders[MESA_SHADER_VERTEX]->Program->info.clip_distance_array_size;
+

[Mesa-dev] [PATCH 66/70] mesa/glsl: move TransformFeedbackBufferStride to gl_shader

2016-12-20 Thread Timothy Arceri

Here we remove the single use of this field in gl_linked_shader,
which allows us to move the field out of gl_shader_info.

While we are at it, we rewrite link_xfb_stride_layout_qualifiers()
to make it clearer.
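
For readers skimming the series, here is a minimal standalone sketch of the
merge-and-validate idea behind the rewritten function. It is not the Mesa
code: sketch_shader, SKETCH_MAX_BUFFERS and sketch_merge_xfb_strides() are
hypothetical names, and validating a stride at the moment it is first
recorded is an assumption made to keep the example short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for MAX_FEEDBACK_BUFFERS. */
#define SKETCH_MAX_BUFFERS 4

/* Hypothetical, simplified stand-in for the per-shader stride array. */
struct sketch_shader {
   unsigned stride[SKETCH_MAX_BUFFERS];
};

/*
 * Merge the per-shader strides into one program-wide array.
 * Returns false if a stride is not a multiple of 4, exceeds the
 * component limit, or conflicts with a stride recorded earlier.
 */
static bool
sketch_merge_xfb_strides(const struct sketch_shader *shaders,
                         size_t num_shaders, unsigned max_components,
                         unsigned out[SKETCH_MAX_BUFFERS])
{
   for (unsigned j = 0; j < SKETCH_MAX_BUFFERS; j++)
      out[j] = 0;

   for (size_t i = 0; i < num_shaders; i++) {
      for (unsigned j = 0; j < SKETCH_MAX_BUFFERS; j++) {
         const unsigned s = shaders[i].stride[j];

         if (s == 0)
            continue; /* this shader does not set a stride for buffer j */

         if (out[j] == 0) {
            /* First declaration seen: validate, then record it. */
            if (s % 4 != 0 || s / 4 > max_components)
               return false;
            out[j] = s;
         } else if (out[j] != s) {
            /* A later shader disagrees with the recorded stride. */
            return false;
         }
      }
   }
   return true;
}
```

Checking "is it unset?" first keeps the conflict test down to a single
comparison, instead of the four-way condition the old code needed.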
---
 src/compiler/glsl/glsl_parser_extras.cpp |  2 +-
 src/compiler/glsl/link_varyings.cpp  |  3 +-
 src/compiler/glsl/link_varyings.h|  1 +
 src/compiler/glsl/linker.cpp | 73 +++-
 src/mesa/main/mtypes.h   |  8 ++--
 5 files changed, 42 insertions(+), 45 deletions(-)

diff --git a/src/compiler/glsl/glsl_parser_extras.cpp 
b/src/compiler/glsl/glsl_parser_extras.cpp
index ab073a2..fcfb4ae 100644
--- a/src/compiler/glsl/glsl_parser_extras.cpp
+++ b/src/compiler/glsl/glsl_parser_extras.cpp
@@ -1704,7 +1704,7 @@ set_shader_inout_layout(struct gl_shader *shader,
  if (state->out_qualifier->out_xfb_stride[i]->
 process_qualifier_constant(state, "xfb_stride", &xfb_stride,
 true)) {
-shader->info.TransformFeedback.BufferStride[i] = xfb_stride;
+shader->TransformFeedbackBufferStride[i] = xfb_stride;
  }
   }
}
diff --git a/src/compiler/glsl/link_varyings.cpp 
b/src/compiler/glsl/link_varyings.cpp
index 398e1da..f032f2a 100644
--- a/src/compiler/glsl/link_varyings.cpp
+++ b/src/compiler/glsl/link_varyings.cpp
@@ -108,6 +108,7 @@ create_xfb_varying_names(void *mem_ctx, const glsl_type *t, 
char **name,
 
 bool
 process_xfb_layout_qualifiers(void *mem_ctx, const gl_linked_shader *sh,
+  struct gl_shader_program *prog,
   unsigned *num_tfeedback_decls,
   char ***varying_names)
 {
@@ -118,7 +119,7 @@ process_xfb_layout_qualifiers(void *mem_ctx, const 
gl_linked_shader *sh,
 * xfb_stride to interface block members so this will catch that case also.
 */
for (unsigned j = 0; j < MAX_FEEDBACK_BUFFERS; j++) {
-  if (sh->info.TransformFeedback.BufferStride[j]) {
+  if (prog->TransformFeedback.BufferStride[j]) {
  has_xfb_qualifiers = true;
  break;
   }
diff --git a/src/compiler/glsl/link_varyings.h 
b/src/compiler/glsl/link_varyings.h
index afce56e..2abe3ca 100644
--- a/src/compiler/glsl/link_varyings.h
+++ b/src/compiler/glsl/link_varyings.h
@@ -301,6 +301,7 @@ parse_tfeedback_decls(struct gl_context *ctx, struct 
gl_shader_program *prog,
 
 bool
 process_xfb_layout_qualifiers(void *mem_ctx, const gl_linked_shader *sh,
+  struct gl_shader_program *prog,
   unsigned *num_tfeedback_decls,
   char ***varying_names);
 
diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
index ca62fb5..55a71d3 100644
--- a/src/compiler/glsl/linker.cpp
+++ b/src/compiler/glsl/linker.cpp
@@ -1587,6 +1587,29 @@ private:
hash_table *unnamed_interfaces;
 };
 
+static bool
+validate_xfb_buffer_stride(struct gl_context *ctx, unsigned idx,
+   struct gl_shader_program *prog)
+{
+   /* We will validate doubles at a later stage */
+   if (prog->TransformFeedback.BufferStride[idx] % 4) {
+  linker_error(prog, "invalid qualifier xfb_stride=%d must be a "
+   "multiple of 4 or if its applied to a type that is "
+   "or contains a double a multiple of 8.",
+   prog->TransformFeedback.BufferStride[idx]);
+  return false;
+   }
+
+   if (prog->TransformFeedback.BufferStride[idx] / 4 >
+   ctx->Const.MaxTransformFeedbackInterleavedComponents) {
+  linker_error(prog, "The MAX_TRANSFORM_FEEDBACK_INTERLEAVED_COMPONENTS "
+   "limit has been exceeded.");
+  return false;
+   }
+
+   return true;
+}
+
 /**
  * Check for conflicting xfb_stride default qualifiers and store buffer stride
  * for later use.
@@ -1599,54 +1622,28 @@ link_xfb_stride_layout_qualifiers(struct gl_context 
*ctx,
   unsigned num_shaders)
 {
for (unsigned i = 0; i < MAX_FEEDBACK_BUFFERS; i++) {
-  linked_shader->info.TransformFeedback.BufferStride[i] = 0;
+  prog->TransformFeedback.BufferStride[i] = 0;
}
 
for (unsigned i = 0; i < num_shaders; i++) {
   struct gl_shader *shader = shader_list[i];
 
   for (unsigned j = 0; j < MAX_FEEDBACK_BUFFERS; j++) {
- if (shader->info.TransformFeedback.BufferStride[j]) {
-if (linked_shader->info.TransformFeedback.BufferStride[j] != 0 &&
-shader->info.TransformFeedback.BufferStride[j] != 0 &&
-linked_shader->info.TransformFeedback.BufferStride[j] !=
-   shader->info.TransformFeedback.BufferStride[j]) {
+ if (shader->TransformFeedbackBufferStride[j]) {
+if (prog->TransformFeedback.BufferStride[j] == 0) {
+   prog->TransformFeedback.BufferStride[j] =
+  shader->TransformFeedbackBufferStride[j];
+

[Mesa-dev] [PATCH 61/70] mesa/glsl: move origin_upper_left to gl_shader

2016-12-20 Thread Timothy Arceri

This is only used by gl_linked_shader as a temp during linking
so use a temp there instead.
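
As a rough illustration of the "use a temp" pattern, the sketch below checks
that every shader redeclaring gl_FragCoord agrees on origin_upper_left, and
keeps the running value in a local variable instead of a struct field. The
names sketch_fs and sketch_fragcoord_consistent() are hypothetical, and the
check is simplified compared to what link_fs_inout_layout_qualifiers() does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of the per-shader fragment state. */
struct sketch_fs {
   bool redeclares_gl_fragcoord;
   bool origin_upper_left;
};

/*
 * Returns true if all shaders that redeclare gl_FragCoord use the same
 * origin_upper_left qualifier. The running state lives in locals, so
 * nothing needs to be stored on the linked shader.
 */
static bool
sketch_fragcoord_consistent(const struct sketch_fs *shaders, size_t n)
{
   bool redeclared = false;        /* temp: has any shader redeclared it? */
   bool origin_upper_left = false; /* temp: qualifier seen so far */

   for (size_t i = 0; i < n; i++) {
      if (!shaders[i].redeclares_gl_fragcoord)
         continue;

      if (redeclared && shaders[i].origin_upper_left != origin_upper_left)
         return false; /* conflicting qualifiers */

      redeclared = true;
      origin_upper_left = shaders[i].origin_upper_left;
   }
   return true;
}
```

Because the value is only needed while the loop runs, a local is enough and
the struct no longer has to carry link-time scratch state.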
---
 src/compiler/glsl/glsl_parser_extras.cpp | 2 +-
 src/compiler/glsl/linker.cpp | 8 +++-
 src/mesa/main/mtypes.h   | 9 +
 3 files changed, 9 insertions(+), 10 deletions(-)

diff --git a/src/compiler/glsl/glsl_parser_extras.cpp 
b/src/compiler/glsl/glsl_parser_extras.cpp
index 2271403..18769e9 100644
--- a/src/compiler/glsl/glsl_parser_extras.cpp
+++ b/src/compiler/glsl/glsl_parser_extras.cpp
@@ -1811,7 +1811,7 @@ set_shader_inout_layout(struct gl_shader *shader,
   shader->redeclares_gl_fragcoord = state->fs_redeclares_gl_fragcoord;
   shader->uses_gl_fragcoord = state->fs_uses_gl_fragcoord;
   shader->info.pixel_center_integer = state->fs_pixel_center_integer;
-  shader->info.origin_upper_left = state->fs_origin_upper_left;
+  shader->origin_upper_left = state->fs_origin_upper_left;
   shader->ARB_fragment_coord_conventions_enable =
  state->ARB_fragment_coord_conventions_enable;
   shader->EarlyFragmentTests = state->fs_early_fragment_tests;
diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
index 548c59a..ca0f844 100644
--- a/src/compiler/glsl/linker.cpp
+++ b/src/compiler/glsl/linker.cpp
@@ -1827,7 +1827,7 @@ link_fs_inout_layout_qualifiers(struct gl_shader_program 
*prog,
 {
bool redeclares_gl_fragcoord = false;
bool uses_gl_fragcoord = false;
-   linked_shader->info.origin_upper_left = false;
+   bool origin_upper_left = false;
linked_shader->info.pixel_center_integer = false;
 
if (linked_shader->Stage != MESA_SHADER_FRAGMENT ||
@@ -1857,8 +1857,7 @@ link_fs_inout_layout_qualifiers(struct gl_shader_program 
*prog,
*single program must have the same set of qualifiers."
*/
   if (redeclares_gl_fragcoord && shader->redeclares_gl_fragcoord &&
-  (shader->info.origin_upper_left !=
-   linked_shader->info.origin_upper_left ||
+  (shader->origin_upper_left != origin_upper_left ||
shader->info.pixel_center_integer !=
linked_shader->info.pixel_center_integer)) {
  linker_error(prog, "fragment shader defined with conflicting "
@@ -1873,8 +1872,7 @@ link_fs_inout_layout_qualifiers(struct gl_shader_program 
*prog,
   if (shader->redeclares_gl_fragcoord || shader->uses_gl_fragcoord) {
  redeclares_gl_fragcoord = shader->redeclares_gl_fragcoord;
  uses_gl_fragcoord |= shader->uses_gl_fragcoord;
- linked_shader->info.origin_upper_left =
-shader->info.origin_upper_left;
+ origin_upper_left = shader->origin_upper_left;
  linked_shader->info.pixel_center_integer =
 shader->info.pixel_center_integer;
   }
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 5a5fa6df..f01109a 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -2237,10 +2237,6 @@ struct gl_shader_info
bool PostDepthCoverage;
bool InnerCoverage;
 
-   /**
-* Fragment shader state from GLSL 1.50 layout qualifiers.
-*/
-   bool origin_upper_left;
bool pixel_center_integer;
 
struct {
@@ -2430,6 +2426,11 @@ struct gl_shader
bool redeclares_gl_fragcoord;
bool uses_gl_fragcoord;
 
+   /**
+* Fragment shader state from GLSL 1.50 layout qualifiers.
+*/
+   bool origin_upper_left;
+
struct gl_shader_info info;
 };
 
-- 
2.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 65/70] glsl: exit loop early if we find xfb layout qualifiers

2016-12-20 Thread Timothy Arceri

---
 src/compiler/glsl/link_varyings.cpp | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/compiler/glsl/link_varyings.cpp 
b/src/compiler/glsl/link_varyings.cpp
index da51fd8..398e1da 100644
--- a/src/compiler/glsl/link_varyings.cpp
+++ b/src/compiler/glsl/link_varyings.cpp
@@ -120,6 +120,7 @@ process_xfb_layout_qualifiers(void *mem_ctx, const 
gl_linked_shader *sh,
for (unsigned j = 0; j < MAX_FEEDBACK_BUFFERS; j++) {
   if (sh->info.TransformFeedback.BufferStride[j]) {
  has_xfb_qualifiers = true;
+ break;
   }
}
 
-- 
2.9.3


[Mesa-dev] [PATCH 70/70] mesa: remove unused gl_shader_info field from gl_linked_shader

2016-12-20 Thread Timothy Arceri

---
 src/mesa/main/mtypes.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 9f84735..1a56382 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -2352,8 +2352,6 @@ struct gl_linked_shader
struct exec_list *packed_varyings;
struct exec_list *fragdata_arrays;
struct glsl_symbol_table *symbols;
-
-   struct gl_shader_info info;
 };
 
 static inline GLbitfield gl_external_samplers(struct gl_program *prog)
-- 
2.9.3


[Mesa-dev] [PATCH 54/70] glsl: use last_vert_prog to get last {clip, cull}_distance_array_size

2016-12-20 Thread Timothy Arceri

---
 src/compiler/glsl/link_varyings.cpp |  6 --
 src/compiler/glsl/linker.cpp| 14 --
 src/mesa/main/mtypes.h  |  7 ---
 3 files changed, 4 insertions(+), 23 deletions(-)

diff --git a/src/compiler/glsl/link_varyings.cpp 
b/src/compiler/glsl/link_varyings.cpp
index 147a7c3..da51fd8 100644
--- a/src/compiler/glsl/link_varyings.cpp
+++ b/src/compiler/glsl/link_varyings.cpp
@@ -743,10 +743,12 @@ tfeedback_decl::assign_location(struct gl_context *ctx,
   unsigned actual_array_size;
   switch (this->lowered_builtin_array_variable) {
   case clip_distance:
- actual_array_size = prog->LastClipDistanceArraySize;
+ actual_array_size = prog->last_vert_prog ?
+prog->last_vert_prog->info.clip_distance_array_size : 0;
  break;
   case cull_distance:
- actual_array_size = prog->LastCullDistanceArraySize;
+ actual_array_size = prog->last_vert_prog ?
+prog->last_vert_prog->info.cull_distance_array_size : 0;
  break;
   case tess_level_outer:
  actual_array_size = 4;
diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
index 9562d41..2d7f18f 100644
--- a/src/compiler/glsl/linker.cpp
+++ b/src/compiler/glsl/linker.cpp
@@ -4933,20 +4933,6 @@ link_shaders(struct gl_context *ctx, struct 
gl_shader_program *prog)
   }
}
 
-   if (num_shaders[MESA_SHADER_GEOMETRY] > 0) {
-  prog->LastClipDistanceArraySize = 
prog->_LinkedShaders[MESA_SHADER_GEOMETRY]->Program->info.clip_distance_array_size;
-  prog->LastCullDistanceArraySize = 
prog->_LinkedShaders[MESA_SHADER_GEOMETRY]->Program->info.cull_distance_array_size;
-   } else if (num_shaders[MESA_SHADER_TESS_EVAL] > 0) {
-  prog->LastClipDistanceArraySize = 
prog->_LinkedShaders[MESA_SHADER_TESS_EVAL]->Program->info.clip_distance_array_size;
-  prog->LastCullDistanceArraySize = 
prog->_LinkedShaders[MESA_SHADER_TESS_EVAL]->Program->info.cull_distance_array_size;
-   } else if (num_shaders[MESA_SHADER_VERTEX] > 0) {
-  prog->LastClipDistanceArraySize = 
prog->_LinkedShaders[MESA_SHADER_VERTEX]->Program->info.clip_distance_array_size;
-  prog->LastCullDistanceArraySize = 
prog->_LinkedShaders[MESA_SHADER_VERTEX]->Program->info.cull_distance_array_size;
-   } else {
-  prog->LastClipDistanceArraySize = 0; /* Not used */
-  prog->LastCullDistanceArraySize = 0; /* Not used */
-   }
-
/* Here begins the inter-stage linking phase.  Some initial validation is
 * performed, then locations are assigned for uniforms, attributes, and
 * varyings.
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 7ab0204..407e11a 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -2792,13 +2792,6 @@ struct gl_shader_program
struct exec_list EmptyUniformLocations;
 
/**
-* Size of the gl_ClipDistance array that is output from the last pipeline
-* stage before the fragment shader.
-*/
-   unsigned LastClipDistanceArraySize;
-   unsigned LastCullDistanceArraySize;
-
-   /**
 * Map of active uniform names to locations
 *
 * Maps any active uniform that is not an array element to a location.
-- 
2.9.3


[Mesa-dev] [PATCH 56/70] mesa/glsl/i965: set and use tcs vertices_out directly

2016-12-20 Thread Timothy Arceri

---
 src/compiler/glsl/linker.cpp| 25 -
 src/mesa/drivers/dri/i965/brw_tcs.c |  6 ++
 src/mesa/main/shaderapi.c   |  6 +-
 3 files changed, 15 insertions(+), 22 deletions(-)

diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
index 2d7f18f..9fe7278 100644
--- a/src/compiler/glsl/linker.cpp
+++ b/src/compiler/glsl/linker.cpp
@@ -1659,15 +1659,15 @@ link_xfb_stride_layout_qualifiers(struct gl_context 
*ctx,
  */
 static void
 link_tcs_out_layout_qualifiers(struct gl_shader_program *prog,
-   struct gl_linked_shader *linked_shader,
+   struct gl_program *gl_prog,
struct gl_shader **shader_list,
unsigned num_shaders)
 {
-   linked_shader->info.TessCtrl.VerticesOut = 0;
-
-   if (linked_shader->Stage != MESA_SHADER_TESS_CTRL)
+   if (gl_prog->info.stage != MESA_SHADER_TESS_CTRL)
   return;
 
+   gl_prog->info.tcs.vertices_out = 0;
+
/* From the GLSL 4.0 spec (chapter 4.3.8.2):
 *
 * "All tessellation control shader layout declarations in a program
@@ -1682,17 +1682,16 @@ link_tcs_out_layout_qualifiers(struct gl_shader_program 
*prog,
   struct gl_shader *shader = shader_list[i];
 
   if (shader->info.TessCtrl.VerticesOut != 0) {
- if (linked_shader->info.TessCtrl.VerticesOut != 0 &&
- linked_shader->info.TessCtrl.VerticesOut !=
- shader->info.TessCtrl.VerticesOut) {
+ if (gl_prog->info.tcs.vertices_out != 0 &&
+ gl_prog->info.tcs.vertices_out !=
+ (unsigned) shader->info.TessCtrl.VerticesOut) {
 linker_error(prog, "tessellation control shader defined with "
  "conflicting output vertex count (%d and %d)\n",
- linked_shader->info.TessCtrl.VerticesOut,
+ gl_prog->info.tcs.vertices_out,
  shader->info.TessCtrl.VerticesOut);
 return;
  }
- linked_shader->info.TessCtrl.VerticesOut =
-shader->info.TessCtrl.VerticesOut;
+ gl_prog->info.tcs.vertices_out = shader->info.TessCtrl.VerticesOut;
   }
}
 
@@ -1700,7 +1699,7 @@ link_tcs_out_layout_qualifiers(struct gl_shader_program 
*prog,
 * since we already know we're in the right type of shader program
 * for doing it.
 */
-   if (linked_shader->info.TessCtrl.VerticesOut == 0) {
+   if (gl_prog->info.tcs.vertices_out == 0) {
   linker_error(prog, "tessellation control shader didn't declare "
"vertices out layout qualifier\n");
   return;
@@ -2220,7 +2219,7 @@ link_intrastage_shaders(void *mem_ctx,
clone_ir_list(mem_ctx, linked->ir, main->ir);
 
link_fs_inout_layout_qualifiers(prog, linked, shader_list, num_shaders);
-   link_tcs_out_layout_qualifiers(prog, linked, shader_list, num_shaders);
+   link_tcs_out_layout_qualifiers(prog, gl_prog, shader_list, num_shaders);
link_tes_in_layout_qualifiers(prog, linked, shader_list, num_shaders);
link_gs_inout_layout_qualifiers(prog, linked, shader_list, num_shaders);
link_cs_input_layout_qualifiers(prog, linked, shader_list, num_shaders);
@@ -2433,7 +2432,7 @@ resize_tes_inputs(struct gl_context *ctx,
 * known until draw time.
 */
const int num_vertices = tcs
-  ? tcs->info.TessCtrl.VerticesOut
+  ? tcs->Program->info.tcs.vertices_out
   : ctx->Const.MaxPatchVertices;
 
array_resize_visitor input_resize_visitor(num_vertices, prog,
diff --git a/src/mesa/drivers/dri/i965/brw_tcs.c 
b/src/mesa/drivers/dri/i965/brw_tcs.c
index 0cb120c..65b5a18 100644
--- a/src/mesa/drivers/dri/i965/brw_tcs.c
+++ b/src/mesa/drivers/dri/i965/brw_tcs.c
@@ -392,10 +392,8 @@ brw_tcs_precompile(struct gl_context *ctx,
brw_setup_tex_for_precompile(brw, , prog);
 
/* Guess that the input and output patches have the same dimensionality. */
-   if (brw->gen < 8) {
-  key.input_vertices = shader_prog->
- _LinkedShaders[MESA_SHADER_TESS_CTRL]->info.TessCtrl.VerticesOut;
-   }
+   if (brw->gen < 8)
+  key.input_vertices = prog->info.tcs.vertices_out;
 
struct brw_program *btep;
if (tes) {
diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
index d487f95..d97c594 100644
--- a/src/mesa/main/shaderapi.c
+++ b/src/mesa/main/shaderapi.c
@@ -821,7 +821,7 @@ get_programiv(struct gl_context *ctx, GLuint program, 
GLenum pname,
  break;
   if (check_tcs_query(ctx, shProg)) {
  *params = shProg->_LinkedShaders[MESA_SHADER_TESS_CTRL]->
-info.TessCtrl.VerticesOut;
+Program->info.tcs.vertices_out;
   }
   return;
case GL_TESS_GEN_MODE:
@@ -2188,10 +2188,6 @@ _mesa_copy_linked_program_data(const struct 
gl_shader_program *src,
dst->info.separate_shader = src->SeparateShader;
 
switch (dst_sh->Stage) {
-   case MESA_SHADER_TESS_CTRL: {
-

[Mesa-dev] [PATCH 32/70] i965: make use of new is_arb_asm flag

2016-12-20 Thread Timothy Arceri

---
 src/mesa/drivers/dri/i965/brw_vs.c | 11 +--
 src/mesa/drivers/dri/i965/brw_wm.c | 13 ++---
 2 files changed, 11 insertions(+), 13 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vs.c 
b/src/mesa/drivers/dri/i965/brw_vs.c
index c08be1f..dfa40ac 100644
--- a/src/mesa/drivers/dri/i965/brw_vs.c
+++ b/src/mesa/drivers/dri/i965/brw_vs.c
@@ -156,7 +156,7 @@ brw_codegen_vs_prog(struct brw_context *brw,
memset(&prog_data, 0, sizeof(prog_data));
 
/* Use ALT floating point mode for ARB programs so that 0^0 == 1. */
-   if (!prog)
+   if (vp->program.is_arb_asm)
   stage_prog_data->use_alt_mode = true;
 
mem_ctx = ralloc_context(NULL);
@@ -187,7 +187,7 @@ brw_codegen_vs_prog(struct brw_context *brw,
 stage_prog_data->nr_image_params);
stage_prog_data->nr_params = param_count;
 
-   if (prog) {
+   if (!vp->program.is_arb_asm) {
   brw_nir_setup_glsl_uniforms(vp->program.nir, >program,
   _data.base.base,
   compiler->scalar_stage[MESA_SHADER_VERTEX]);
@@ -220,7 +220,7 @@ brw_codegen_vs_prog(struct brw_context *brw,
}
 
if (unlikely(INTEL_DEBUG & DEBUG_VS)) {
-  if (!prog)
+  if (vp->program.is_arb_asm)
  brw_dump_arb_asm("vertex", >program);
 
   fprintf(stderr, "VS Output ");
@@ -229,9 +229,8 @@ brw_codegen_vs_prog(struct brw_context *brw,
 
int st_index = -1;
if (INTEL_DEBUG & DEBUG_SHADER_TIME) {
-  bool is_glsl_sh = prog != NULL;
   st_index = brw_get_shader_time_index(brw, >program, ST_VS,
-   is_glsl_sh);
+   !vp->program.is_arb_asm);
}
 
/* Emit GEN4 code.
@@ -243,7 +242,7 @@ brw_codegen_vs_prog(struct brw_context *brw,
 !_mesa_is_gles3(>ctx),
 st_index, _size, _str);
if (program == NULL) {
-  if (prog) {
+  if (!vp->program.is_arb_asm) {
  vp->program.sh.data->LinkStatus = false;
  ralloc_strcat(>program.sh.data->InfoLog, error_str);
   }
diff --git a/src/mesa/drivers/dri/i965/brw_wm.c 
b/src/mesa/drivers/dri/i965/brw_wm.c
index e6576c2..de8671b 100644
--- a/src/mesa/drivers/dri/i965/brw_wm.c
+++ b/src/mesa/drivers/dri/i965/brw_wm.c
@@ -152,7 +152,7 @@ brw_codegen_wm_prog(struct brw_context *brw,
memset(&prog_data, 0, sizeof(prog_data));
 
/* Use ALT floating point mode for ARB programs so that 0^0 == 1. */
-   if (!prog)
+   if (fp->program.is_arb_asm)
   prog_data.base.use_alt_mode = true;
 
assign_fs_binding_table_offsets(devinfo, >program, key, _data);
@@ -174,7 +174,7 @@ brw_codegen_wm_prog(struct brw_context *brw,
 prog_data.base.nr_image_params);
prog_data.base.nr_params = param_count;
 
-   if (prog) {
+   if (!fp->program.is_arb_asm) {
   brw_nir_setup_glsl_uniforms(fp->program.nir, >program,
   _data.base, true);
} else {
@@ -193,11 +193,10 @@ brw_codegen_wm_prog(struct brw_context *brw,
 
int st_index8 = -1, st_index16 = -1;
if (INTEL_DEBUG & DEBUG_SHADER_TIME) {
-  bool is_glsl_sh = prog != NULL;
   st_index8 = brw_get_shader_time_index(brw, >program, ST_FS8,
-is_glsl_sh);
+!fp->program.is_arb_asm);
   st_index16 = brw_get_shader_time_index(brw, >program, ST_FS16,
- is_glsl_sh);
+ !fp->program.is_arb_asm);
}
 
char *error_str = NULL;
@@ -208,7 +207,7 @@ brw_codegen_wm_prog(struct brw_context *brw,
 _size, _str);
 
if (program == NULL) {
-  if (prog) {
+  if (!fp->program.is_arb_asm) {
  fp->program.sh.data->LinkStatus = false;
  ralloc_strcat(>program.sh.data->InfoLog, error_str);
   }
@@ -234,7 +233,7 @@ brw_codegen_wm_prog(struct brw_context *brw,
prog_data.base.total_scratch,
devinfo->max_wm_threads);
 
-   if (unlikely((INTEL_DEBUG & DEBUG_WM) && !prog))
+   if (unlikely((INTEL_DEBUG & DEBUG_WM) && fp->program.is_arb_asm))
   fprintf(stderr, "\n");
 
brw_upload_cache(>cache, BRW_CACHE_FS_PROG,
-- 
2.9.3


[Mesa-dev] [PATCH 30/70] i965: pass gl_program directly to brw_compile_tes()

2016-12-20 Thread Timothy Arceri

This is the only thing we use from gl_shader_program so pass it directly.
---
 src/mesa/drivers/dri/i965/brw_compiler.h | 2 +-
 src/mesa/drivers/dri/i965/brw_shader.cpp | 6 ++
 src/mesa/drivers/dri/i965/brw_tes.c  | 2 +-
 3 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_compiler.h 
b/src/mesa/drivers/dri/i965/brw_compiler.h
index 5e70601..db8f39c 100644
--- a/src/mesa/drivers/dri/i965/brw_compiler.h
+++ b/src/mesa/drivers/dri/i965/brw_compiler.h
@@ -803,7 +803,7 @@ brw_compile_tes(const struct brw_compiler *compiler, void 
*log_data,
 const struct brw_tes_prog_key *key,
 struct brw_tes_prog_data *prog_data,
 const struct nir_shader *shader,
-struct gl_shader_program *shader_prog,
+struct gl_program *prog,
 int shader_time_index,
 unsigned *final_assembly_size,
 char **error_str);
diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp 
b/src/mesa/drivers/dri/i965/brw_shader.cpp
index 4742e5d..2cd1c48 100644
--- a/src/mesa/drivers/dri/i965/brw_shader.cpp
+++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
@@ -1341,14 +1341,12 @@ brw_compile_tes(const struct brw_compiler *compiler,
 const struct brw_tes_prog_key *key,
 struct brw_tes_prog_data *prog_data,
 const nir_shader *src_shader,
-struct gl_shader_program *shader_prog,
+struct gl_program *prog,
 int shader_time_index,
 unsigned *final_assembly_size,
 char **error_str)
 {
const struct gen_device_info *devinfo = compiler->devinfo;
-   struct gl_linked_shader *shader =
-  shader_prog->_LinkedShaders[MESA_SHADER_TESS_EVAL];
const bool is_scalar = compiler->scalar_stage[MESA_SHADER_TESS_EVAL];
 
nir_shader *nir = nir_shader_clone(mem_ctx, src_shader);
@@ -1406,7 +1404,7 @@ brw_compile_tes(const struct brw_compiler *compiler,
 
if (is_scalar) {
   fs_visitor v(compiler, log_data, mem_ctx, (void *) key,
-   _data->base.base, shader->Program, nir, 8,
+   _data->base.base, prog, nir, 8,
shader_time_index, _vue_map);
   if (!v.run_tes()) {
  if (error_str)
diff --git a/src/mesa/drivers/dri/i965/brw_tes.c 
b/src/mesa/drivers/dri/i965/brw_tes.c
index 464dbde..f98f874 100644
--- a/src/mesa/drivers/dri/i965/brw_tes.c
+++ b/src/mesa/drivers/dri/i965/brw_tes.c
@@ -178,7 +178,7 @@ brw_codegen_tes_prog(struct brw_context *brw,
char *error_str;
const unsigned *program =
   brw_compile_tes(compiler, brw, mem_ctx, key, _data, nir,
-  shader_prog, st_index, _size, _str);
+  >program, st_index, _size, _str);
if (program == NULL) {
   tep->program.sh.data->LinkStatus = false;
   ralloc_strcat(>program.sh.data->InfoLog, error_str);
-- 
2.9.3


[Mesa-dev] [PATCH 64/70] glsl: set InnerCoverage directly in gl_program

2016-12-20 Thread Timothy Arceri

Also move out of the shared gl_shader_info.
---
 src/compiler/glsl/glsl_parser_extras.cpp | 2 +-
 src/compiler/glsl/linker.cpp | 3 +--
 src/mesa/main/mtypes.h   | 3 +--
 src/mesa/main/shaderapi.c| 1 -
 4 files changed, 3 insertions(+), 6 deletions(-)

diff --git a/src/compiler/glsl/glsl_parser_extras.cpp 
b/src/compiler/glsl/glsl_parser_extras.cpp
index 75c0157..ab073a2 100644
--- a/src/compiler/glsl/glsl_parser_extras.cpp
+++ b/src/compiler/glsl/glsl_parser_extras.cpp
@@ -1815,7 +1815,7 @@ set_shader_inout_layout(struct gl_shader *shader,
   shader->ARB_fragment_coord_conventions_enable =
  state->ARB_fragment_coord_conventions_enable;
   shader->EarlyFragmentTests = state->fs_early_fragment_tests;
-  shader->info.InnerCoverage = state->fs_inner_coverage;
+  shader->InnerCoverage = state->fs_inner_coverage;
   shader->PostDepthCoverage = state->fs_post_depth_coverage;
   shader->BlendSupport = state->fs_blend_support;
   break;
diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
index e0e08f1..ca62fb5 100644
--- a/src/compiler/glsl/linker.cpp
+++ b/src/compiler/glsl/linker.cpp
@@ -1877,8 +1877,7 @@ link_fs_inout_layout_qualifiers(struct gl_shader_program 
*prog,
 
   linked_shader->Program->info.fs.early_fragment_tests |=
  shader->EarlyFragmentTests;
-  linked_shader->info.InnerCoverage |=
- shader->info.InnerCoverage;
+  linked_shader->Program->info.fs.inner_coverage |= shader->InnerCoverage;
   linked_shader->Program->info.fs.post_depth_coverage |=
  shader->PostDepthCoverage;
 
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index c3050ee..bc69b6f 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -2234,8 +2234,6 @@ struct gl_subroutine_function
  */
 struct gl_shader_info
 {
-   bool InnerCoverage;
-
struct {
   /** Global xfb_stride out qualifier if any */
   GLuint BufferStride[MAX_FEEDBACK_BUFFERS];
@@ -2424,6 +2422,7 @@ struct gl_shader
bool uses_gl_fragcoord;
 
bool PostDepthCoverage;
+   bool InnerCoverage;
 
/**
 * Fragment shader state from GLSL 1.50 layout qualifiers.
diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
index c0dbd28..0f44c53 100644
--- a/src/mesa/main/shaderapi.c
+++ b/src/mesa/main/shaderapi.c
@@ -2207,7 +2207,6 @@ _mesa_copy_linked_program_data(const struct 
gl_shader_program *src,
}
case MESA_SHADER_FRAGMENT: {
   dst->info.fs.depth_layout = src->FragDepthLayout;
-  dst->info.fs.inner_coverage = dst_sh->info.InnerCoverage;
   break;
}
case MESA_SHADER_COMPUTE: {
-- 
2.9.3


[Mesa-dev] [PATCH 35/70] mesa: make _CurrentFragmentProgram a gl_program struct pointer

2016-12-20 Thread Timothy Arceri

Making this point to a gl_program struct rather than a gl_shader_program
struct will allow us to later also make the CurrentProgram array hold
gl_program structs, which in turn will allow for code simplification.
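
To make the simplification concrete, here is a small self-contained sketch
of the two access paths. All sketch_* types and functions are hypothetical
stand-ins, not the real Mesa structs; they only mirror the shape of the
pointer chain that callers have to walk.

```c
#include <assert.h>
#include <stddef.h>

enum { SKETCH_STAGE_FRAGMENT = 0, SKETCH_NUM_STAGES = 1 };

/* Hypothetical stand-ins for gl_program, gl_linked_shader and
 * gl_shader_program. */
struct sketch_program {
   unsigned blend_support;
};

struct sketch_linked_shader {
   struct sketch_program *program;
};

struct sketch_shader_program {
   struct sketch_linked_shader *linked[SKETCH_NUM_STAGES];
};

/* Old shape: go through the shader program and the per-stage linked
 * shader, guarding each hop. */
static unsigned
sketch_blend_support_old(const struct sketch_shader_program *sh_prog)
{
   if (!sh_prog || !sh_prog->linked[SKETCH_STAGE_FRAGMENT])
      return 0;
   return sh_prog->linked[SKETCH_STAGE_FRAGMENT]->program->blend_support;
}

/* New shape: the current fragment program is held directly, so a single
 * NULL check is enough. */
static unsigned
sketch_blend_support_new(const struct sketch_program *prog)
{
   return prog ? prog->blend_support : 0;
}
```

Both helpers return the same value for the same underlying program; the
second one just has fewer places where a NULL pointer can hide.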
---
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c |  8 ++--
 src/mesa/main/api_validate.c |  6 ++
 src/mesa/main/mtypes.h   |  2 +-
 src/mesa/main/pipelineobj.c  |  2 +-
 src/mesa/main/shaderapi.c| 14 --
 src/mesa/main/state.c| 20 
 6 files changed, 22 insertions(+), 30 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index eff19de..4566696 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -1438,14 +1438,10 @@ brw_upload_wm_ubo_surfaces(struct brw_context *brw)
 {
struct gl_context *ctx = >ctx;
/* _NEW_PROGRAM */
-   struct gl_shader_program *prog = ctx->_Shader->_CurrentFragmentProgram;
-
-   if (!prog || !prog->_LinkedShaders[MESA_SHADER_FRAGMENT])
-  return;
+   struct gl_program *prog = ctx->_Shader->_CurrentFragmentProgram;
 
/* BRW_NEW_FS_PROG_DATA */
-   brw_upload_ubo_surfaces(brw, 
prog->_LinkedShaders[MESA_SHADER_FRAGMENT]->Program,
-   >wm.base, brw->wm.base.prog_data);
+   brw_upload_ubo_surfaces(brw, prog, >wm.base, brw->wm.base.prog_data);
 }
 
 const struct brw_tracked_state brw_wm_ubo_surfaces = {
diff --git a/src/mesa/main/api_validate.c b/src/mesa/main/api_validate.c
index 42eeeba..5f051db 100644
--- a/src/mesa/main/api_validate.c
+++ b/src/mesa/main/api_validate.c
@@ -99,10 +99,8 @@ check_blend_func_error(struct gl_context *ctx)
* the blend equation or "blend_support_all_equations", the error
* INVALID_OPERATION is generated [...]"
*/
-  const struct gl_shader_program *sh_prog =
- ctx->_Shader->_CurrentFragmentProgram;
-  const GLbitfield blend_support = !sh_prog ? 0 :
- 
sh_prog->_LinkedShaders[MESA_SHADER_FRAGMENT]->Program->sh.fs.BlendSupport;
+  const struct gl_program *prog = ctx->_Shader->_CurrentFragmentProgram;
+  const GLbitfield blend_support = !prog ? 0 : prog->sh.fs.BlendSupport;
 
   if ((blend_support & ctx->Color._AdvancedBlendMode) == 0) {
  _mesa_error(ctx, GL_INVALID_OPERATION,
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index f7ce6f6..844173b6 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -2917,7 +2917,7 @@ struct gl_pipeline_object
 */
struct gl_shader_program *CurrentProgram[MESA_SHADER_STAGES];
 
-   struct gl_shader_program *_CurrentFragmentProgram;
+   struct gl_program *_CurrentFragmentProgram;
 
/**
 * Program used by glUniform calls.
diff --git a/src/mesa/main/pipelineobj.c b/src/mesa/main/pipelineobj.c
index 282071a..bce7a82 100644
--- a/src/mesa/main/pipelineobj.c
+++ b/src/mesa/main/pipelineobj.c
@@ -58,7 +58,7 @@ _mesa_delete_pipeline_object(struct gl_context *ctx,
 {
unsigned i;
 
-   _mesa_reference_shader_program(ctx, >_CurrentFragmentProgram, NULL);
+   _mesa_reference_program(ctx, >_CurrentFragmentProgram, NULL);
 
for (i = 0; i < MESA_SHADER_STAGES; i++)
   _mesa_reference_shader_program(ctx, >CurrentProgram[i], NULL);
diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
index 3f2507d..6d0f0e0 100644
--- a/src/mesa/main/shaderapi.c
+++ b/src/mesa/main/shaderapi.c
@@ -161,8 +161,7 @@ _mesa_free_shader_state(struct gl_context *ctx)
   _mesa_reference_shader_program(ctx, >Shader.CurrentProgram[i],
  NULL);
}
-   _mesa_reference_shader_program(ctx, >Shader._CurrentFragmentProgram,
- NULL);
+   _mesa_reference_program(ctx, >Shader._CurrentFragmentProgram, NULL);
_mesa_reference_shader_program(ctx, >Shader.ActiveProgram, NULL);
 
/* Extended for ARB_separate_shader_objects */
@@ -1237,10 +1236,13 @@ use_shader_program(struct gl_context *ctx, 
gl_shader_stage stage,
  /* Empty for now. */
  break;
   case MESA_SHADER_FRAGMENT:
- if (*target == ctx->_Shader->_CurrentFragmentProgram) {
-   _mesa_reference_shader_program(ctx,
-   
>_Shader->_CurrentFragmentProgram,
-  NULL);
+ if (*target != NULL &&
+ ((*target)->_LinkedShaders[MESA_SHADER_FRAGMENT] &&
+  (*target)->_LinkedShaders[MESA_SHADER_FRAGMENT]->Program ==
+  ctx->_Shader->_CurrentFragmentProgram)) {
+   _mesa_reference_program(ctx,
+>_Shader->_CurrentFragmentProgram,
+NULL);
 }
 break;
   }
diff --git a/src/mesa/main/state.c b/src/mesa/main/state.c
index

[Mesa-dev] [PATCH 37/70] mesa: simplify sampler setting code

2016-12-20 Thread Timothy Arceri

There is no need to loop over the active samplers: the code above this
would already have exited if the sampler was inactive, or errored
if the count was larger than the uniforms array size.
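
The core of the change is "detect the modification while writing" rather
than "write everything, then rescan". A minimal sketch of that pattern
follows; sketch_set_sampler_units() is a hypothetical helper, not a Mesa
function, and it assumes the caller has already bounds-checked the range.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Copy 'count' values into units[first .. first + count - 1] and report
 * whether any stored value actually changed. The caller is assumed to
 * have validated that the range fits inside 'units'.
 */
static bool
sketch_set_sampler_units(unsigned *units, unsigned first,
                         const unsigned *values, unsigned count)
{
   bool changed = false;

   for (unsigned j = 0; j < count; j++) {
      if (units[first + j] != values[j]) {
         units[first + j] = values[j];
         changed = true;
      }
   }
   return changed;
}
```

A caller can then skip the flush and the driver notification entirely when
the function returns false, which is the case the old code needed a second
loop over the active sampler mask to detect.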
---
 src/mesa/main/uniform_query.cpp | 33 +++--
 1 file changed, 11 insertions(+), 22 deletions(-)

diff --git a/src/mesa/main/uniform_query.cpp b/src/mesa/main/uniform_query.cpp
index 73e7b0b..ffb20ca 100644
--- a/src/mesa/main/uniform_query.cpp
+++ b/src/mesa/main/uniform_query.cpp
@@ -859,39 +859,28 @@ _mesa_uniform(struct gl_context *ctx, struct 
gl_shader_program *shProg,
   for (int i = 0; i < MESA_SHADER_STAGES; i++) {
 struct gl_linked_shader *const sh = shProg->_LinkedShaders[i];
 
-/* If the shader stage doesn't use the sampler uniform, skip this.
- */
-if (sh == NULL || !uni->opaque[i].active)
+/* If the shader stage doesn't use the sampler uniform, skip this. */
+if (!uni->opaque[i].active)
continue;
 
+ bool changed = false;
  for (int j = 0; j < count; j++) {
-sh->SamplerUnits[uni->opaque[i].index + offset + j] =
-   ((unsigned *) values)[j];
+unsigned unit = uni->opaque[i].index + offset + j;
+if (sh->SamplerUnits[unit] != ((unsigned *) values)[j]) {
+   sh->SamplerUnits[unit] = ((unsigned *) values)[j];
+   changed = true;
+}
  }
 
-struct gl_program *const prog = sh->Program;
-
-assert(sizeof(prog->SamplerUnits) == sizeof(sh->SamplerUnits));
-
-/* Determine if any of the samplers used by this shader stage have
- * been modified.
- */
-bool changed = false;
-GLbitfield mask = sh->active_samplers;
-while (mask) {
-   const int j = u_bit_scan(&mask);
-   if (prog->SamplerUnits[j] != sh->SamplerUnits[j]) {
-  changed = true;
-  break;
-   }
-}
-
 if (changed) {
if (!flushed) {
   FLUSH_VERTICES(ctx, _NEW_TEXTURE | _NEW_PROGRAM);
   flushed = true;
}
 
+struct gl_program *const prog = sh->Program;
+assert(sizeof(prog->SamplerUnits) == sizeof(sh->SamplerUnits));
+
_mesa_update_shader_textures_used(shProg, prog);
 if (ctx->Driver.SamplerUniformChange)
   ctx->Driver.SamplerUniformChange(ctx, prog->Target, prog);
-- 
2.9.3


[Mesa-dev] [PATCH 59/70] mesa/glsl: move redeclares_gl_fragcoord to gl_shader

2016-12-20 Thread Timothy Arceri

This is never used in gl_linked_shader other than as a temp
during linking, so just use a temp instead.
---
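As a rough illustration of why a local temporary is enough, the conflict
check can be sketched on its own. This is not code from the patch:
`struct fs_info` and `check_fragcoord_redeclaration()` are hypothetical
names, and the sketch keeps only the two flags involved. The running
"redeclares" state lives in a local variable for the duration of the loop
and is never stored on the linked result.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-shader flags; only the two relevant ones are kept. */
struct fs_info {
   bool redeclares_gl_fragcoord;
   bool uses_gl_fragcoord;
};

/* Return false if the fragment shaders disagree about redeclaring
 * gl_FragCoord, true otherwise.  The accumulated state is held in
 * locals because nothing outside this loop needs it.
 */
static bool
check_fragcoord_redeclaration(const struct fs_info *shaders, size_t n)
{
   bool redeclares = false;   /* temp that replaces the struct field */
   bool uses = false;

   assert(shaders != NULL || n == 0);

   for (size_t i = 0; i < n; i++) {
      const struct fs_info *sh = &shaders[i];

      /* A redeclaration in one shader must be matched by every other
       * shader that statically uses gl_FragCoord.
       */
      if ((redeclares && !sh->redeclares_gl_fragcoord &&
           sh->uses_gl_fragcoord) ||
          (sh->redeclares_gl_fragcoord && !redeclares && uses))
         return false;

      if (sh->redeclares_gl_fragcoord || sh->uses_gl_fragcoord) {
         redeclares = sh->redeclares_gl_fragcoord;
         uses = uses || sh->uses_gl_fragcoord;
      }
   }

   return true;
}
```

In the real linker the failing case reports a linker error rather than
returning a flag; the boolean return here is a simplification.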
 src/compiler/glsl/glsl_parser_extras.cpp |  3 +--
 src/compiler/glsl/linker.cpp | 21 -
 src/mesa/main/mtypes.h   |  3 ++-
 3 files changed, 11 insertions(+), 16 deletions(-)

diff --git a/src/compiler/glsl/glsl_parser_extras.cpp b/src/compiler/glsl/glsl_parser_extras.cpp
index b70b1dc..bd13e00 100644
--- a/src/compiler/glsl/glsl_parser_extras.cpp
+++ b/src/compiler/glsl/glsl_parser_extras.cpp
@@ -1808,8 +1808,7 @@ set_shader_inout_layout(struct gl_shader *shader,
   break;
 
case MESA_SHADER_FRAGMENT:
-  shader->info.redeclares_gl_fragcoord =
- state->fs_redeclares_gl_fragcoord;
+  shader->redeclares_gl_fragcoord = state->fs_redeclares_gl_fragcoord;
   shader->info.uses_gl_fragcoord = state->fs_uses_gl_fragcoord;
   shader->info.pixel_center_integer = state->fs_pixel_center_integer;
   shader->info.origin_upper_left = state->fs_origin_upper_left;
diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
index 2bbb112..e7c36e8 100644
--- a/src/compiler/glsl/linker.cpp
+++ b/src/compiler/glsl/linker.cpp
@@ -1825,7 +1825,7 @@ link_fs_inout_layout_qualifiers(struct gl_shader_program *prog,
 struct gl_shader **shader_list,
 unsigned num_shaders)
 {
-   linked_shader->info.redeclares_gl_fragcoord = false;
+   bool redeclares_gl_fragcoord = false;
linked_shader->info.uses_gl_fragcoord = false;
linked_shader->info.origin_upper_left = false;
linked_shader->info.pixel_center_integer = false;
@@ -1843,12 +1843,10 @@ link_fs_inout_layout_qualifiers(struct gl_shader_program *prog,
*it must be redeclared in all the fragment shaders in that program
*that have a static use gl_FragCoord."
*/
-  if ((linked_shader->info.redeclares_gl_fragcoord
-   && !shader->info.redeclares_gl_fragcoord
-   && shader->info.uses_gl_fragcoord)
-  || (shader->info.redeclares_gl_fragcoord
-  && !linked_shader->info.redeclares_gl_fragcoord
-  && linked_shader->info.uses_gl_fragcoord)) {
+  if ((redeclares_gl_fragcoord && !shader->redeclares_gl_fragcoord &&
+   shader->info.uses_gl_fragcoord)
+  || (shader->redeclares_gl_fragcoord && !redeclares_gl_fragcoord &&
+  linked_shader->info.uses_gl_fragcoord)) {
  linker_error(prog, "fragment shader defined with conflicting "
  "layout qualifiers for gl_FragCoord\n");
   }
@@ -1858,8 +1856,7 @@ link_fs_inout_layout_qualifiers(struct gl_shader_program *prog,
*   "All redeclarations of gl_FragCoord in all fragment shaders in a
*single program must have the same set of qualifiers."
*/
-  if (linked_shader->info.redeclares_gl_fragcoord &&
-  shader->info.redeclares_gl_fragcoord &&
+  if (redeclares_gl_fragcoord && shader->redeclares_gl_fragcoord &&
   (shader->info.origin_upper_left !=
linked_shader->info.origin_upper_left ||
shader->info.pixel_center_integer !=
@@ -1873,10 +1870,8 @@ link_fs_inout_layout_qualifiers(struct gl_shader_program *prog,
* are multiple redeclarations, all the fields except uses_gl_fragcoord
* are already known to be the same.
*/
-  if (shader->info.redeclares_gl_fragcoord ||
-  shader->info.uses_gl_fragcoord) {
- linked_shader->info.redeclares_gl_fragcoord =
-shader->info.redeclares_gl_fragcoord;
+  if (shader->redeclares_gl_fragcoord || shader->info.uses_gl_fragcoord) {
+ redeclares_gl_fragcoord = shader->redeclares_gl_fragcoord;
  linked_shader->info.uses_gl_fragcoord =
 linked_shader->info.uses_gl_fragcoord ||
 shader->info.uses_gl_fragcoord;
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 3793580..e4c7fdf 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -2235,7 +2235,6 @@ struct gl_subroutine_function
 struct gl_shader_info
 {
bool uses_gl_fragcoord;
-   bool redeclares_gl_fragcoord;
bool PostDepthCoverage;
bool InnerCoverage;
 
@@ -2429,6 +2428,8 @@ struct gl_shader
 
bool ARB_fragment_coord_conventions_enable;
 
+   bool redeclares_gl_fragcoord;
+
struct gl_shader_info info;
 };
 
-- 
2.9.3
