date:20161012

Mesa (master): radv: trivial case stmt style fixups

2016-10-12 Thread Edward O'Callaghan

Module: Mesa
Branch: master
Commit: cfbf956dfd0ea093135cd4864bf3b5b7508a54b6
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=cfbf956dfd0ea093135cd4864bf3b5b7508a54b6

Author: Edward O'Callaghan 
Date:   Tue Oct 11 11:43:09 2016 +1100

radv: trivial case stmt style fixups

Relocate a 'default:' to the end of a case stmt and fix an
indent issue.

Signed-off-by: Edward O'Callaghan 
Reviewed-by: Thomas Helland 

---

 src/amd/vulkan/radv_image.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/amd/vulkan/radv_image.c b/src/amd/vulkan/radv_image.c
index 2223c89..0dc364c 100644
--- a/src/amd/vulkan/radv_image.c
+++ b/src/amd/vulkan/radv_image.c
@@ -758,8 +758,6 @@ radv_image_view_init(struct radv_image_view *iview,
const VkImageSubresourceRange *range = &pCreateInfo->subresourceRange;
bool is_stencil = false;
switch (image->type) {
-   default:
-   unreachable("bad VkImageType");
case VK_IMAGE_TYPE_1D:
case VK_IMAGE_TYPE_2D:
assert(range->baseArrayLayer + radv_get_layerCount(image, range) - 1 <= image->array_size);
@@ -768,6 +766,8 @@ radv_image_view_init(struct radv_image_view *iview,
assert(range->baseArrayLayer + radv_get_layerCount(image, range) - 1
   <= radv_minify(image->extent.depth, range->baseMipLevel));
break;
+   default:
+   unreachable("bad VkImageType");
}
iview->image = image;
iview->bo = image->bo;
@@ -842,7 +842,7 @@ void radv_image_set_optimal_micro_tile_mode(struct radv_device *device,
case 16:
 image->surface.tiling_index[0] = 11;
 break;
-   default: /* 32, 64 */
+   default: /* 32, 64 */
 image->surface.tiling_index[0] = 12;
 break;
}

___
mesa-commit mailing list
mesa-commit@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-commit

Mesa (master): draw: initialize shader inputs

2016-10-12 Thread Roland Scheidegger

Module: Mesa
Branch: master
Commit: 7e86b2ddae32fc86fc684a0dcf32a87a0d795267
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=7e86b2ddae32fc86fc684a0dcf32a87a0d795267

Author: Roland Scheidegger 
Date:   Wed Oct 12 00:00:28 2016 +0200

draw: initialize shader inputs

This should make the code more robust if a shader tries to use inputs which
aren't defined by the vertex element layout (which usually shouldn't happen).
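
The idea can be sketched in plain C, outside of LLVM IR generation. The names MAX_INPUTS, NUM_CHANNELS and init_unused_inputs below are hypothetical stand-ins for PIPE_MAX_SHADER_INPUTS, TGSI_NUM_CHANNELS and the loop added by the diff; this is an illustration under those assumptions, not the draw module's code.

```c
#include <assert.h>

/* Hypothetical sizes standing in for PIPE_MAX_SHADER_INPUTS and
 * TGSI_NUM_CHANNELS; the real values live in Mesa's headers. */
#define MAX_INPUTS   8
#define NUM_CHANNELS 4

/* Zero every input slot that the vertex element layout does not define,
 * i.e. slots [nr_vertex_elements, MAX_INPUTS). Defined slots are left
 * untouched so that the regular fetch path can fill them in. */
static void
init_unused_inputs(float inputs[MAX_INPUTS][NUM_CHANNELS],
                   unsigned nr_vertex_elements)
{
   assert(nr_vertex_elements <= MAX_INPUTS);

   for (unsigned j = nr_vertex_elements; j < MAX_INPUTS; j++) {
      for (unsigned i = 0; i < NUM_CHANNELS; i++)
         inputs[j][i] = 0.0f;
   }
}
```

A shader that reads slot 5 while only two vertex elements are bound would then see zeros rather than whatever happened to be in that slot.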

No piglit change.

Reviewed-by: Brian Paul 

---

 src/gallium/auxiliary/draw/draw_llvm.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/src/gallium/auxiliary/draw/draw_llvm.c b/src/gallium/auxiliary/draw/draw_llvm.c
index 87951fa..4270a8f 100644
--- a/src/gallium/auxiliary/draw/draw_llvm.c
+++ b/src/gallium/auxiliary/draw/draw_llvm.c
@@ -1705,6 +1705,13 @@ draw_llvm_generate(struct draw_llvm *llvm, struct draw_llvm_variant *variant,
   lp_build_printf(gallivm, " --- io %d = %p, loop counter %d\n",
   io_itr, io, lp_loop.counter);
 #endif
+
+  for (j = draw->pt.nr_vertex_elements; j < PIPE_MAX_SHADER_INPUTS; j++) {
+ for (i = 0; i < TGSI_NUM_CHANNELS; i++) {
+inputs[j][i] = lp_build_zero(gallivm, vs_type);
+ }
+  }
+
   for (i = 0; i < vector_length; ++i) {
  LLVMValueRef vert_index =
 LLVMBuildAdd(builder,


Mesa (master): mapi: fix out-of-tree build dependencies

2016-10-12 Thread Nicolai Hähnle

Module: Mesa
Branch: master
Commit: 85ba409967bb0327b85460639080214b3997fc17
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=85ba409967bb0327b85460639080214b3997fc17

Author: Nicolai Hähnle 
Date:   Tue Oct 11 15:43:44 2016 +0200

mapi: fix out-of-tree build dependencies

We shouldn't be using wildcard here in the first place, but changing that
is some effort. As it stands, make -p confirms that glapi_gen_mapi_deps only
contains mapi_abi.py when building outside the Mesa tree.

As a result, only some of the tables were updated when XML files change, but
not the tables for shared glapi. This change ensures that we pick up the
XML files and scripts from the source tree as dependencies also for shared
glapi.

Reviewed-by: Emil Velikov 

---

 src/mapi/Makefile.am | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/mapi/Makefile.am b/src/mapi/Makefile.am
index d6bf5d8..46afe3b 100644
--- a/src/mapi/Makefile.am
+++ b/src/mapi/Makefile.am
@@ -56,8 +56,8 @@ PYTHON_GEN = $(AM_V_GEN)$(PYTHON2) $(PYTHON_FLAGS)
 
 glapi_gen_mapi_deps := \
mapi_abi.py \
-   $(wildcard glapi/gen/*.xml) \
-   $(wildcard glapi/gen/*.py)
+   $(wildcard $(top_srcdir)/src/mapi/glapi/gen/*.xml) \
+   $(wildcard $(top_srcdir)/src/mapi/glapi/gen/*.py)
 
 if HAVE_SHARED_GLAPI
 BUILT_SOURCES += shared-glapi/glapi_mapi_tmp.h


Mesa (master): nv50/ir: optimize ADD(SHL(a, b), c) to SHLADD(a, b, c)

2016-10-12 Thread Samuel Pitoiset

Module: Mesa
Branch: master
Commit: 87b06cab14c449e442be27650024f044e93c9a7c
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=87b06cab14c449e442be27650024f044e93c9a7c

Author: Samuel Pitoiset 
Date:   Fri Oct  7 01:16:24 2016 +0200

nv50/ir: optimize ADD(SHL(a, b), c) to SHLADD(a, b, c)

total instructions in shared programs : 2286901 -> 2284473 (-0.11%)
total gprs used in shared programs    : 335256 -> 335273 (0.01%)
total local used in shared programs   : 31968 -> 31968 (0.00%)

          local   gpr   inst   bytes
helped        0    41    852     852
  hurt        0    44     23      23
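
As a rough illustration of why the fold is valid for the cases the pass accepts (integer types narrower than 64 bits, no saturation, immediate shift amount), the two forms can be compared in plain C. The function names below are hypothetical, and the fused operation is only modelled in software here; nothing in this sketch is nouveau code.

```c
#include <assert.h>
#include <stdint.h>

/* Unfused form: a separate SHL feeding an ADD. */
static uint32_t
add_of_shl(uint32_t a, uint32_t b, uint32_t c)
{
   assert(b < 32);   /* shifting by >= 32 is undefined in C */

   uint32_t tmp = a << b;
   return tmp + c;
}

/* Software model of a fused SHLADD(a, b, c), written as a multiply-add:
 * a * 2^b + c, with the usual 32-bit wraparound. */
static uint32_t
shladd(uint32_t a, uint32_t b, uint32_t c)
{
   assert(b < 32);

   return a * (UINT32_C(1) << b) + c;
}
```

Both functions wrap modulo 2^32, so they agree for every in-range shift amount; the benefit of the fused form is one instruction instead of two, which matches the drop in the instruction count above.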

Signed-off-by: Samuel Pitoiset 
Reviewed-by: Ilia Mirkin 

---

 .../drivers/nouveau/codegen/nv50_ir_peephole.cpp   | 87 ++
 1 file changed, 87 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index 6efb29e..d88bb34 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
@@ -2132,6 +2132,92 @@ AlgebraicOpt::visit(BasicBlock *bb)
 
 // 
=
 
+// ADD(SHL(a, b), c) -> SHLADD(a, b, c)
+class LateAlgebraicOpt : public Pass
+{
+private:
+   virtual bool visit(Instruction *);
+
+   void handleADD(Instruction *);
+   bool tryADDToSHLADD(Instruction *);
+};
+
+void
+LateAlgebraicOpt::handleADD(Instruction *add)
+{
+   Value *src0 = add->getSrc(0);
+   Value *src1 = add->getSrc(1);
+
+   if (src0->reg.file != FILE_GPR || src1->reg.file != FILE_GPR)
+  return;
+
+   if (prog->getTarget()->isOpSupported(OP_SHLADD, add->dType))
+  tryADDToSHLADD(add);
+}
+
+// ADD(SHL(a, b), c) -> SHLADD(a, b, c)
+bool
+LateAlgebraicOpt::tryADDToSHLADD(Instruction *add)
+{
+   Value *src0 = add->getSrc(0);
+   Value *src1 = add->getSrc(1);
+   ImmediateValue imm;
+   Instruction *shl;
+   Modifier mod[2];
+   Value *src;
+   int s;
+
+   if (add->saturate || add->usesFlags() || typeSizeof(add->dType) == 8
+   || isFloatType(add->dType))
+  return false;
+
+   if (src0->getUniqueInsn() && src0->getUniqueInsn()->op == OP_SHL)
+  s = 0;
+   else
+   if (src1->getUniqueInsn() && src1->getUniqueInsn()->op == OP_SHL)
+  s = 1;
+   else
+  return false;
+
+   src = add->getSrc(s);
+   shl = src->getUniqueInsn();
+
+   if (shl->bb != add->bb || shl->usesFlags() || shl->subOp)
+  return false;
+
+   if (!shl->src(1).getImmediate(imm))
+  return false;
+
+   mod[0] = add->src(0).mod;
+   mod[1] = add->src(1).mod;
+
+   add->op = OP_SHLADD;
+   add->setSrc(2, add->src(!s));
+   add->src(2).mod = mod[s];
+
+   add->setSrc(0, shl->getSrc(0));
+   add->setSrc(1, new_ImmediateValue(shl->bb->getProgram(), imm.reg.data.u32));
+   add->src(1).mod = Modifier(0);
+
+   return true;
+}
+
+bool
+LateAlgebraicOpt::visit(Instruction *i)
+{
+   switch (i->op) {
+   case OP_ADD:
+  handleADD(i);
+  break;
+   default:
+  break;
+   }
+
+   return true;
+}
+
+// 
=
+
 static inline void
 updateLdStOffset(Instruction *ldst, int32_t offset, Function *fn)
 {
@@ -3436,6 +3522,7 @@ Program::optimizeSSA(int level)
RUN_PASS(2, AlgebraicOpt, run);
RUN_PASS(2, ModifierFolding, run); // before load propagation -> less checks
RUN_PASS(1, ConstantFolding, foldAll);
+   RUN_PASS(2, LateAlgebraicOpt, run);
RUN_PASS(1, LoadPropagation, run);
RUN_PASS(1, IndirectPropagation, run);
RUN_PASS(2, MemoryOpt, run);


Mesa (master): radeonsi: use TC write-back instead of full cache invalidation

2016-10-12 Thread Marek Olšák

Module: Mesa
Branch: master
Commit: 40e1f7e09bf1bc9b8ed6f847562bbb7154025420
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=40e1f7e09bf1bc9b8ed6f847562bbb7154025420

Author: Marek Olšák 
Date:   Mon Oct 10 18:51:24 2016 +0200

radeonsi: use TC write-back instead of full cache invalidation

Reviewed-by: Nicolai Hähnle 

---

 src/gallium/drivers/radeonsi/si_compute.c|  2 +-
 src/gallium/drivers/radeonsi/si_state.c  | 12 +++-
 src/gallium/drivers/radeonsi/si_state_draw.c |  6 +++---
 3 files changed, 7 insertions(+), 13 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_compute.c b/src/gallium/drivers/radeonsi/si_compute.c
index 632839f..e785106 100644
--- a/src/gallium/drivers/radeonsi/si_compute.c
+++ b/src/gallium/drivers/radeonsi/si_compute.c
@@ -701,7 +701,7 @@ static void si_launch_grid(
 
/* The hw doesn't read the indirect buffer via TC L2. */
if (r600_resource(info->indirect)->TC_L2_dirty) {
-   sctx->b.flags |= SI_CONTEXT_INV_GLOBAL_L2;
+   sctx->b.flags |= SI_CONTEXT_WRITEBACK_GLOBAL_L2;
r600_resource(info->indirect)->TC_L2_dirty = false;
}
}
diff --git a/src/gallium/drivers/radeonsi/si_state.c b/src/gallium/drivers/radeonsi/si_state.c
index 34f3ed7..ad65fc2 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -3397,21 +3397,15 @@ static void si_memory_barrier(struct pipe_context *ctx, unsigned flags)
 * L1 isn't used.
 */
if (sctx->screen->b.chip_class <= CIK)
-   sctx->b.flags |= SI_CONTEXT_INV_GLOBAL_L2;
+   sctx->b.flags |= SI_CONTEXT_WRITEBACK_GLOBAL_L2;
}
 
if (flags & PIPE_BARRIER_FRAMEBUFFER)
sctx->b.flags |= SI_CONTEXT_FLUSH_AND_INV_FRAMEBUFFER;
 
if (flags & (PIPE_BARRIER_FRAMEBUFFER |
-PIPE_BARRIER_INDIRECT_BUFFER)) {
-   /* Not sure if INV_GLOBAL_L2 is the best thing here.
-*
-* We need to make sure that TC L1 & L2 are written back to
-* memory, because CB fetches don't consider TC, but there's
-* no need to invalidate any TC cache lines. */
-   sctx->b.flags |= SI_CONTEXT_INV_GLOBAL_L2;
-   }
+PIPE_BARRIER_INDIRECT_BUFFER))
+   sctx->b.flags |= SI_CONTEXT_WRITEBACK_GLOBAL_L2;
 }
 
 static void *si_create_blend_custom(struct si_context *sctx, unsigned mode)
diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c b/src/gallium/drivers/radeonsi/si_state_draw.c
index 33b6b23..c14e852 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -1047,18 +1047,18 @@ void si_draw_vbo(struct pipe_context *ctx, const struct pipe_draw_info *info)
/* VI reads index buffers through TC L2. */
if (info->indexed && sctx->b.chip_class <= CIK &&
r600_resource(ib.buffer)->TC_L2_dirty) {
-   sctx->b.flags |= SI_CONTEXT_INV_GLOBAL_L2;
+   sctx->b.flags |= SI_CONTEXT_WRITEBACK_GLOBAL_L2;
r600_resource(ib.buffer)->TC_L2_dirty = false;
}
 
if (info->indirect && r600_resource(info->indirect)->TC_L2_dirty) {
-   sctx->b.flags |= SI_CONTEXT_INV_GLOBAL_L2;
+   sctx->b.flags |= SI_CONTEXT_WRITEBACK_GLOBAL_L2;
r600_resource(info->indirect)->TC_L2_dirty = false;
}
 
if (info->indirect_params &&
r600_resource(info->indirect_params)->TC_L2_dirty) {
-   sctx->b.flags |= SI_CONTEXT_INV_GLOBAL_L2;
+   sctx->b.flags |= SI_CONTEXT_WRITEBACK_GLOBAL_L2;
r600_resource(info->indirect_params)->TC_L2_dirty = false;
}
 


Mesa (master): radeonsi: fix R600_DEBUG=precompile for shader-db

2016-10-12 Thread Marek Olšák

Module: Mesa
Branch: master
Commit: e4bbab90224116b53a178612aafa83318d4eaf46
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=e4bbab90224116b53a178612aafa83318d4eaf46

Author: Marek Olšák 
Date:   Tue Oct 11 16:55:41 2016 +0200

radeonsi: fix R600_DEBUG=precompile for shader-db

radeonsi no longer supports pixel shaders without interpolation optimizations,
which led to assertion failures in si_shader_ps when running shader-db.

Reviewed-by: Nicolai Hähnle 

---

 src/gallium/drivers/radeonsi/si_state_shaders.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/src/gallium/drivers/radeonsi/si_state_shaders.c b/src/gallium/drivers/radeonsi/si_state_shaders.c
index f6bd129..c41c519 100644
--- a/src/gallium/drivers/radeonsi/si_state_shaders.c
+++ b/src/gallium/drivers/radeonsi/si_state_shaders.c
@@ -1233,6 +1233,12 @@ void si_init_shader_selector_async(void *job, int thread_index)
key.tcs.epilog.prim_mode = PIPE_PRIM_TRIANGLES;
break;
case PIPE_SHADER_FRAGMENT:
+   key.ps.prolog.bc_optimize_for_persp =
+   sel->info.uses_persp_center &&
+   sel->info.uses_persp_centroid;
+   key.ps.prolog.bc_optimize_for_linear =
+   sel->info.uses_linear_center &&
+   sel->info.uses_linear_centroid;
key.ps.epilog.alpha_func = PIPE_FUNC_ALWAYS;
for (i = 0; i < 8; i++)
if (sel->info.colors_written & (1 << i))


Mesa (master): radeonsi: implement TC L2 write-back (flush) without cache invalidation

2016-10-12 Thread Marek Olšák

Module: Mesa
Branch: master
Commit: 8cdce30cc20983dcb971dd906a9a9007e282081d
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=8cdce30cc20983dcb971dd906a9a9007e282081d

Author: Marek Olšák 
Date:   Mon Oct 10 18:49:22 2016 +0200

radeonsi: implement TC L2 write-back (flush) without cache invalidation
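
The comment added to si_pipe.h in this patch says that SI and CIK cannot write L2 back without invalidating it, so those chips do a complete invalidation instead. That fallback can be sketched as follows. The enum, the flag values and resolve_l2_flags are hypothetical names chosen for illustration, and the driver's actual handling in si_emit_cache_flush may be structured differently.

```c
#include <assert.h>

/* Hypothetical, simplified chip generations in ascending order. */
enum chip_class { CHIP_SI, CHIP_CIK, CHIP_VI };

/* Hypothetical flag bits; the real SI_CONTEXT_* values differ. */
#define FLAG_INV_GLOBAL_L2       (1u << 0)
#define FLAG_WRITEBACK_GLOBAL_L2 (1u << 1)

/* Turn a write-back request into what the hardware can actually do:
 * chips up to CIK fall back to a full L2 invalidation, newer chips
 * keep the cheaper write-back-only request. */
static unsigned
resolve_l2_flags(enum chip_class chip, unsigned flags)
{
   if ((flags & FLAG_WRITEBACK_GLOBAL_L2) && chip <= CHIP_CIK) {
      flags &= ~FLAG_WRITEBACK_GLOBAL_L2;
      flags |= FLAG_INV_GLOBAL_L2;
   }
   return flags;
}
```

Callers can then always request write-back where write-back is all they need, and only the older chips pay for the full invalidation.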

Reviewed-by: Nicolai Hähnle 

---

 src/gallium/drivers/radeonsi/si_pipe.h   | 21 
 src/gallium/drivers/radeonsi/si_state_draw.c | 81 +---
 2 files changed, 74 insertions(+), 28 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_pipe.h b/src/gallium/drivers/radeonsi/si_pipe.h
index 3cefee7..e10d3fb 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.h
+++ b/src/gallium/drivers/radeonsi/si_pipe.h
@@ -50,17 +50,20 @@
 #define SI_CONTEXT_INV_VMEM_L1 (R600_CONTEXT_PRIVATE_FLAG << 2)
 /* Used by everything except CB/DB, can be bypassed (SLC=1). Other names: TC L2 */
 #define SI_CONTEXT_INV_GLOBAL_L2   (R600_CONTEXT_PRIVATE_FLAG << 3)
+/* Write dirty L2 lines back to memory (shader and CP DMA stores), but don't
+ * invalidate L2. SI-CIK can't do it, so they will do complete invalidation. */
+#define SI_CONTEXT_WRITEBACK_GLOBAL_L2 (R600_CONTEXT_PRIVATE_FLAG << 4)
 /* Framebuffer caches. */
-#define SI_CONTEXT_FLUSH_AND_INV_CB_META (R600_CONTEXT_PRIVATE_FLAG << 4)
-#define SI_CONTEXT_FLUSH_AND_INV_DB_META (R600_CONTEXT_PRIVATE_FLAG << 5)
-#define SI_CONTEXT_FLUSH_AND_INV_DB(R600_CONTEXT_PRIVATE_FLAG << 6)
-#define SI_CONTEXT_FLUSH_AND_INV_CB(R600_CONTEXT_PRIVATE_FLAG << 7)
+#define SI_CONTEXT_FLUSH_AND_INV_CB_META (R600_CONTEXT_PRIVATE_FLAG << 5)
+#define SI_CONTEXT_FLUSH_AND_INV_DB_META (R600_CONTEXT_PRIVATE_FLAG << 6)
+#define SI_CONTEXT_FLUSH_AND_INV_DB(R600_CONTEXT_PRIVATE_FLAG << 7)
+#define SI_CONTEXT_FLUSH_AND_INV_CB(R600_CONTEXT_PRIVATE_FLAG << 8)
 /* Engine synchronization. */
-#define SI_CONTEXT_VS_PARTIAL_FLUSH(R600_CONTEXT_PRIVATE_FLAG << 8)
-#define SI_CONTEXT_PS_PARTIAL_FLUSH(R600_CONTEXT_PRIVATE_FLAG << 9)
-#define SI_CONTEXT_CS_PARTIAL_FLUSH(R600_CONTEXT_PRIVATE_FLAG << 10)
-#define SI_CONTEXT_VGT_FLUSH   (R600_CONTEXT_PRIVATE_FLAG << 11)
-#define SI_CONTEXT_VGT_STREAMOUT_SYNC  (R600_CONTEXT_PRIVATE_FLAG << 12)
+#define SI_CONTEXT_VS_PARTIAL_FLUSH(R600_CONTEXT_PRIVATE_FLAG << 9)
+#define SI_CONTEXT_PS_PARTIAL_FLUSH(R600_CONTEXT_PRIVATE_FLAG << 10)
+#define SI_CONTEXT_CS_PARTIAL_FLUSH(R600_CONTEXT_PRIVATE_FLAG << 11)
+#define SI_CONTEXT_VGT_FLUSH   (R600_CONTEXT_PRIVATE_FLAG << 12)
+#define SI_CONTEXT_VGT_STREAMOUT_SYNC  (R600_CONTEXT_PRIVATE_FLAG << 13)
 
 #define SI_CONTEXT_FLUSH_AND_INV_FRAMEBUFFER (SI_CONTEXT_FLUSH_AND_INV_CB | \
  SI_CONTEXT_FLUSH_AND_INV_CB_META | \
diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c b/src/gallium/drivers/radeonsi/si_state_draw.c
index 38e5cb4..33b6b23 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -696,6 +696,19 @@ static void si_emit_draw_packets(struct si_context *sctx,
}
 }
 
+static void si_emit_surface_sync(struct r600_common_context *rctx,
+unsigned cp_coher_cntl)
+{
+   struct radeon_winsys_cs *cs = rctx->gfx.cs;
+
+   /* ACQUIRE_MEM is only required on a compute ring. */
+   radeon_emit(cs, PKT3(PKT3_SURFACE_SYNC, 3, 0));
+   radeon_emit(cs, cp_coher_cntl);   /* CP_COHER_CNTL */
+   radeon_emit(cs, 0x);  /* CP_COHER_SIZE */
+   radeon_emit(cs, 0);   /* CP_COHER_BASE */
+   radeon_emit(cs, 0x000A);  /* POLL_INTERVAL */
+}
+
 void si_emit_cache_flush(struct si_context *sctx)
 {
struct r600_common_context *rctx = &sctx->b;
@@ -715,15 +728,6 @@ void si_emit_cache_flush(struct si_context *sctx)
if (rctx->flags & SI_CONTEXT_INV_SMEM_L1)
cp_coher_cntl |= S_0085F0_SH_KCACHE_ACTION_ENA(1);
 
-   if (rctx->flags & SI_CONTEXT_INV_VMEM_L1)
-   cp_coher_cntl |= S_0085F0_TCL1_ACTION_ENA(1);
-   if (rctx->flags & SI_CONTEXT_INV_GLOBAL_L2) {
-   cp_coher_cntl |= S_0085F0_TC_ACTION_ENA(1);
-
-   if (rctx->chip_class >= VI)
-   cp_coher_cntl |= S_0301F0_TC_WB_ACTION_ENA(1);
-   }
-
if (rctx->flags & SI_CONTEXT_FLUSH_AND_INV_CB) {
cp_coher_cntl |= S_0085F0_CB_ACTION_ENA(1) |
 S_0085F0_CB0_DEST_BASE_ENA(1) |
@@ -806,23 +810,62 @@ void si_emit_cache_flush(struct si_context *sctx)
/* Make sure ME is idle (it executes most packets) before continuing.
 * This prevents read-after-write hazards between PFP and ME.
 */
-   if (cp_coher_cntl || (rctx->flags & SI_CONTEXT_CS_PARTIAL_FLUSH)) {
+   if (cp_coher_cntl ||
+   (rctx->flags & (SI_CONTEXT_CS_PARTIAL_FLUSH |
+   SI_CONTEXT_INV_VMEM_L1 |
+   SI_CONTEXT_INV_

Mesa (master): radeonsi: don't invalidate VMEM L1 for memory barriers for index buffers

2016-10-12 Thread Marek Olšák

Module: Mesa
Branch: master
Commit: 65a4d55a9ff12b44655803da10112d3b1b42ce13
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=65a4d55a9ff12b44655803da10112d3b1b42ce13

Author: Marek Olšák 
Date:   Mon Oct 10 17:39:43 2016 +0200

radeonsi: don't invalidate VMEM L1 for memory barriers for index buffers

Reviewed-by: Nicolai Hähnle 

---

 src/gallium/drivers/radeonsi/si_state.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_state.c b/src/gallium/drivers/radeonsi/si_state.c
index ddf6cfe..34f3ed7 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -3366,6 +3366,7 @@ static void si_texture_barrier(struct pipe_context *ctx)
 SI_CONTEXT_CS_PARTIAL_FLUSH;
 }
 
+/* This only ensures coherency for shader image/buffer stores. */
 static void si_memory_barrier(struct pipe_context *ctx, unsigned flags)
 {
struct si_context *sctx = (struct si_context *)ctx;
@@ -3392,9 +3393,9 @@ static void si_memory_barrier(struct pipe_context *ctx, unsigned flags)
}
 
if (flags & PIPE_BARRIER_INDEX_BUFFER) {
-   sctx->b.flags |= SI_CONTEXT_INV_VMEM_L1;
-
-   /* Indices are read through TC L2 since VI. */
+   /* Indices are read through TC L2 since VI.
+* L1 isn't used.
+*/
if (sctx->screen->b.chip_class <= CIK)
sctx->b.flags |= SI_CONTEXT_INV_GLOBAL_L2;
}


Mesa (master): winsys/amdgpu: fix infinite loop w/ RADEON_NOOP=1 caused by unsubmitted fences

2016-10-12 Thread Marek Olšák

Module: Mesa
Branch: master
Commit: d7e74b52bbd41ce3699cb3f75320e0592c1d6a29
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=d7e74b52bbd41ce3699cb3f75320e0592c1d6a29

Author: Marek Olšák 
Date:   Mon Oct 10 22:24:27 2016 +0200

winsys/amdgpu: fix infinite loop w/ RADEON_NOOP=1 caused by unsubmitted fences
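
The patch moves the RADEON_NOOP option lookup to the top of the file and makes amdgpu_cs_get_next_fence return NULL when it is set. A minimal stand-alone sketch of that pattern follows. parse_bool_option, noop_enabled, struct fence and get_next_fence are hypothetical names, the boolean parsing is a simplification rather than Mesa's exact rules, and the reasoning in the comments (a fence that is never submitted can never signal) is an inference from the commit title.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Simplified boolean parsing (an assumption, not Mesa's exact rules):
 * an unset variable yields the default, "0" and "false" are false,
 * anything else is true. */
static bool
parse_bool_option(const char *str, bool dfault)
{
   if (str == NULL)
      return dfault;
   return strcmp(str, "0") != 0 && strcmp(str, "false") != 0;
}

/* Read RADEON_NOOP from the environment once and cache the result,
 * in the spirit of a "get once" debug option. */
bool
noop_enabled(void)
{
   static bool first = true;
   static bool value;

   if (first) {
      value = parse_bool_option(getenv("RADEON_NOOP"), false);
      first = false;
   }
   return value;
}

/* Hypothetical stand-in for a fence handle. */
struct fence { int id; };

/* In no-op mode nothing is submitted, so a fence handed out here would
 * never signal and a caller waiting on it would spin forever. Returning
 * NULL avoids creating such a fence in the first place. */
static struct fence *
get_next_fence(struct fence *pending, bool noop)
{
   if (noop)
      return NULL;
   return pending;
}
```

The option lookup has to be defined before its first user, which is presumably why the diff moves the DEBUG_GET_ONCE_BOOL_OPTION line ahead of the fence code.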

Reviewed-by: Nicolai Hähnle 

---

 src/gallium/winsys/amdgpu/drm/amdgpu_cs.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/src/gallium/winsys/amdgpu/drm/amdgpu_cs.c b/src/gallium/winsys/amdgpu/drm/amdgpu_cs.c
index c0e810c..2b86827 100644
--- a/src/gallium/winsys/amdgpu/drm/amdgpu_cs.c
+++ b/src/gallium/winsys/amdgpu/drm/amdgpu_cs.c
@@ -37,6 +37,8 @@
 
 #include "amd/common/sid.h"
 
+DEBUG_GET_ONCE_BOOL_OPTION(noop, "RADEON_NOOP", false)
+
 /* FENCES */
 
 static struct pipe_fence_handle *
@@ -143,6 +145,9 @@ amdgpu_cs_get_next_fence(struct radeon_winsys_cs *rcs)
struct amdgpu_cs *cs = amdgpu_cs(rcs);
struct pipe_fence_handle *fence = NULL;
 
+   if (debug_get_option_noop())
+  return NULL;
+
if (cs->next_fence) {
   amdgpu_fence_reference(&fence, cs->next_fence);
   return fence;
@@ -1069,8 +1074,6 @@ void amdgpu_cs_sync_flush(struct radeon_winsys_cs *rcs)
   util_queue_job_wait(&cs->flush_completed);
 }
 
-DEBUG_GET_ONCE_BOOL_OPTION(noop, "RADEON_NOOP", false)
-
 static int amdgpu_cs_flush(struct radeon_winsys_cs *rcs,
unsigned flags,
struct pipe_fence_handle **fence)


Mesa (master): radeonsi: Use the new image load/store intrinsic signatures

2016-10-12 Thread Tom Stellard

Module: Mesa
Branch: master
Commit: b33cb709fd06006d4c51824f850a4bb6d8d11f98
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=b33cb709fd06006d4c51824f850a4bb6d8d11f98

Author: Tom Stellard 
Date:   Tue Oct 11 21:06:54 2016 +

radeonsi: Use the new image load/store intrinsic signatures

This patch requires LLVM r284024 or newer.
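
The naming change can be illustrated with a small stand-alone helper. Per the diff, the old intrinsic name carries only the coordinate type, while the new one carries the data, coordinate and resource types in that order. image_intr_name and its new_signature parameter are hypothetical, and the type strings used in the usage note are illustrative values rather than names taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Build an image intrinsic name in either the old layout
 * ("<base>.<coords>") or the new one ("<base>.<data>.<coords>.<rsrc>").
 * Returns what snprintf returns: the length the full name needs,
 * not counting the terminating NUL. */
static int
image_intr_name(int new_signature, const char *base,
                const char *data_type, const char *coords_type,
                const char *rsrc_type, char *out, size_t out_len)
{
   if (!new_signature)
      return snprintf(out, out_len, "%s.%s", base, coords_type);

   return snprintf(out, out_len, "%s.%s.%s.%s",
                   base, data_type, coords_type, rsrc_type);
}
```

With base "llvm.amdgcn.image.load" and the illustrative types "v4f32", "v4i32" and "v8i32", the new-style name is 40 characters long. That no longer fits a 32-byte buffer, which is consistent with the diff growing intrinsic_name from 32 to 64 bytes.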

Reviewed-by: Nicolai Hähnle 

---

 src/gallium/drivers/radeonsi/si_shader.c | 59 
 1 file changed, 45 insertions(+), 14 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c
index 4e07317..8b77fd1 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -3575,16 +3575,29 @@ static void image_append_args(
const struct tgsi_full_instruction *inst = emit_data->inst;
LLVMValueRef i1false = LLVMConstInt(ctx->i1, 0, 0);
LLVMValueRef i1true = LLVMConstInt(ctx->i1, 1, 0);
-
-   emit_data->args[emit_data->arg_count++] = i1false; /* r128 */
-   emit_data->args[emit_data->arg_count++] =
-   tgsi_is_array_image(target) ? i1true : i1false; /* da */
-   if (!atomic) {
-   emit_data->args[emit_data->arg_count++] =
-   inst->Memory.Qualifier & (TGSI_MEMORY_COHERENT | TGSI_MEMORY_VOLATILE) ?
-   i1true : i1false; /* glc */
+   LLVMValueRef r128 = i1false;
+   LLVMValueRef da = tgsi_is_array_image(target) ? i1true : i1false;
+   LLVMValueRef glc =
+   inst->Memory.Qualifier & (TGSI_MEMORY_COHERENT | TGSI_MEMORY_VOLATILE) ?
+   i1true : i1false;
+   LLVMValueRef slc = i1false;
+   LLVMValueRef lwe = i1false;
+
+   if (atomic || (HAVE_LLVM <= 0x0309)) {
+   emit_data->args[emit_data->arg_count++] = r128;
+   emit_data->args[emit_data->arg_count++] = da;
+   if (!atomic) {
+   emit_data->args[emit_data->arg_count++] = glc;
+   }
+   emit_data->args[emit_data->arg_count++] = slc;
+   return;
}
-   emit_data->args[emit_data->arg_count++] = i1false; /* slc */
+
+   /* HAVE_LLVM >= 0x0400 */
+   emit_data->args[emit_data->arg_count++] = glc;
+   emit_data->args[emit_data->arg_count++] = slc;
+   emit_data->args[emit_data->arg_count++] = lwe;
+   emit_data->args[emit_data->arg_count++] = da;
 }
 
 /**
@@ -3761,7 +3774,9 @@ static void load_emit_memory(
 }
 
 static void get_image_intr_name(const char *base_name,
+   LLVMTypeRef data_type,
LLVMTypeRef coords_type,
+   LLVMTypeRef rsrc_type,
char *out_name, unsigned out_len)
 {
char coords_type_name[8];
@@ -3769,7 +3784,19 @@ static void get_image_intr_name(const char *base_name,
build_type_name_for_intr(coords_type, coords_type_name,
sizeof(coords_type_name));
 
-   snprintf(out_name, out_len, "%s.%s", base_name, coords_type_name);
+   if (HAVE_LLVM <= 0x0309) {
+   snprintf(out_name, out_len, "%s.%s", base_name, coords_type_name);
+   } else {
+   char data_type_name[8];
+   char rsrc_type_name[8];
+
+   build_type_name_for_intr(data_type, data_type_name,
+   sizeof(data_type_name));
+   build_type_name_for_intr(rsrc_type, rsrc_type_name,
+   sizeof(rsrc_type_name));
+   snprintf(out_name, out_len, "%s.%s.%s.%s", base_name,
+data_type_name, coords_type_name, rsrc_type_name);
+   }
 }
 
 static void load_emit(
@@ -3781,7 +3808,7 @@ static void load_emit(
struct gallivm_state *gallivm = bld_base->base.gallivm;
LLVMBuilderRef builder = gallivm->builder;
const struct tgsi_full_instruction * inst = emit_data->inst;
-   char intrinsic_name[32];
+   char intrinsic_name[64];
 
if (inst->Src[0].Register.File == TGSI_FILE_MEMORY) {
load_emit_memory(ctx, emit_data);
@@ -3804,7 +3831,9 @@ static void load_emit(
LLVMReadOnlyAttribute);
} else {
get_image_intr_name("llvm.amdgcn.image.load",
-   LLVMTypeOf(emit_data->args[0]),
+   emit_data->dst_type,/* vdata */
+   LLVMTypeOf(emit_data->args[0]), /* coords */
+   LLVMTypeOf(emit_data->args[1]), /* rsrc */
intrinsic_name, sizeof(intrinsic_name));
 
emit_data->output[emit_data->chan] =
@@ -3981,7 +4010,7 @@ static void store_emit(
LLVMBuilderRef builder = gallivm->builder;
const struct tgsi_full_instruction * inst = emit_data->inst;
unsigned target = inst->Memor

Mesa (master): radeonsi: Add function for converting LLVM type to intrinsic string

2016-10-12 Thread Tom Stellard

Module: Mesa
Branch: master
Commit: ff0df66e10476fdb5be90395eed300f4d32a83c3
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=ff0df66e10476fdb5be90395eed300f4d32a83c3

Author: Tom Stellard 
Date:   Tue Oct 11 20:23:52 2016 +

radeonsi: Add function for converting LLVM type to intrinsic string

The existing function only worked for integer types.
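
To show the shape of the generalized helper without depending on LLVM, here is a sketch that uses a small hypothetical type descriptor in place of LLVMTypeRef. struct simple_type, enum elem_kind and type_name_for_intr are assumptions made for illustration; the real function queries the type through the LLVM-C API as shown in the diff below.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the element kinds the patch handles. */
enum elem_kind { ELEM_INT, ELEM_FLOAT, ELEM_DOUBLE };

/* Hypothetical stand-in for an LLVM scalar or vector type. */
struct simple_type {
   enum elem_kind kind;
   unsigned int_width;   /* bit width, used for ELEM_INT only */
   unsigned vec_size;    /* 0 or 1 means scalar */
};

/* Write names such as "i32", "f32", "v4i32" or "v2f64" into buf:
 * an optional "v<N>" vector prefix followed by the element type. */
static void
type_name_for_intr(struct simple_type t, char *buf, size_t bufsize)
{
   int used = 0;

   assert(bufsize >= 8);
   buf[0] = '\0';

   if (t.vec_size > 1) {
      used = snprintf(buf, bufsize, "v%u", t.vec_size);
      if (used < 0 || (size_t)used >= bufsize) {
         buf[0] = '\0';   /* prefix did not fit; give up */
         return;
      }
   }

   switch (t.kind) {
   case ELEM_INT:
      snprintf(buf + used, bufsize - used, "i%u", t.int_width);
      break;
   case ELEM_FLOAT:
      snprintf(buf + used, bufsize - used, "f32");
      break;
   case ELEM_DOUBLE:
      snprintf(buf + used, bufsize - used, "f64");
      break;
   }
}
```

The integer-only predecessor could produce "i32" and "v4i32" but had no way to spell a float vector such as "v4f32", which the image load/store intrinsic names need once they include the data type.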

Reviewed-by: Nicolai Hähnle 

---

 src/gallium/drivers/radeonsi/si_shader.c | 42 
 1 file changed, 32 insertions(+), 10 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c
index 8254cb2..4e07317 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -3347,17 +3347,39 @@ static LLVMValueRef get_buffer_size(
  * Given the i32 or vNi32 \p type, generate the textual name (e.g. for use with
  * intrinsic names).
  */
-static void build_int_type_name(
+static void build_type_name_for_intr(
LLVMTypeRef type,
char *buf, unsigned bufsize)
 {
-   assert(bufsize >= 6);
+   LLVMTypeRef elem_type = type;
 
-   if (LLVMGetTypeKind(type) == LLVMVectorTypeKind)
-   snprintf(buf, bufsize, "v%ui32",
-LLVMGetVectorSize(type));
-   else
-   strcpy(buf, "i32");
+   assert(bufsize >= 8);
+
+   if (LLVMGetTypeKind(type) == LLVMVectorTypeKind) {
+   int ret = snprintf(buf, bufsize, "v%u",
+   LLVMGetVectorSize(type));
+   if (ret < 0) {
+   char *type_name = LLVMPrintTypeToString(type);
+   fprintf(stderr, "Error building type name for: %s\n",
+   type_name);
+   return;
+   }
+   elem_type = LLVMGetElementType(type);
+   buf += ret;
+   bufsize -= ret;
+   }
+   switch (LLVMGetTypeKind(elem_type)) {
+   default: break;
+   case LLVMIntegerTypeKind:
+   snprintf(buf, bufsize, "i%d", LLVMGetIntTypeWidth(elem_type));
+   break;
+   case LLVMFloatTypeKind:
+   snprintf(buf, bufsize, "f32");
+   break;
+   case LLVMDoubleTypeKind:
+   snprintf(buf, bufsize, "f64");
+   break;
+   }
 }
 
 static void build_tex_intrinsic(const struct lp_build_tgsi_action *action,
@@ -3744,7 +3766,7 @@ static void get_image_intr_name(const char *base_name,
 {
char coords_type_name[8];
 
-   build_int_type_name(coords_type, coords_type_name,
+   build_type_name_for_intr(coords_type, coords_type_name,
sizeof(coords_type_name));
 
snprintf(out_name, out_len, "%s.%s", base_name, coords_type_name);
@@ -4144,7 +4166,7 @@ static void atomic_emit(
} else {
char coords_type[8];
 
-   build_int_type_name(LLVMTypeOf(emit_data->args[1]),
+   build_type_name_for_intr(LLVMTypeOf(emit_data->args[1]),
coords_type, sizeof(coords_type));
snprintf(intrinsic_name, sizeof(intrinsic_name),
 "llvm.amdgcn.image.atomic.%s.%s",
@@ -4918,7 +4940,7 @@ static void build_tex_intrinsic(const struct lp_build_tgsi_action *action,
}
 
/* Add the type and suffixes .c, .o if needed. */
-   build_int_type_name(LLVMTypeOf(emit_data->args[0]), type, sizeof(type));
+   build_type_name_for_intr(LLVMTypeOf(emit_data->args[0]), type, sizeof(type));
sprintf(intr_name, "%s%s%s%s.%s",
name, is_shadow ? ".c" : "", infix,
has_offset ? ".o" : "", type);


Mesa (master): radeonsi: Refactor image store/load intrinsic name creation

2016-10-12 Thread Tom Stellard

Module: Mesa
Branch: master
Commit: a96a7eae04843e3c1c952d6aba62313116a6d368
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=a96a7eae04843e3c1c952d6aba62313116a6d368

Author: Tom Stellard 
Date:   Tue Oct 11 16:43:36 2016 +

radeonsi: Refactor image store/load intrinsic name creation

Reviewed-by: Nicolai Hähnle 

---

 src/gallium/drivers/radeonsi/si_shader.c | 29 ++---
 1 file changed, 18 insertions(+), 11 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c
index 49d4121..8254cb2 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -3738,6 +3738,18 @@ static void load_emit_memory(
emit_data->output[emit_data->chan] = lp_build_gather_values(gallivm, channels, 4);
 }
 
+static void get_image_intr_name(const char *base_name,
+   LLVMTypeRef coords_type,
+   char *out_name, unsigned out_len)
+{
+   char coords_type_name[8];
+
+   build_int_type_name(coords_type, coords_type_name,
+   sizeof(coords_type_name));
+
+   snprintf(out_name, out_len, "%s.%s", base_name, coords_type_name);
+}
+
 static void load_emit(
const struct lp_build_tgsi_action *action,
struct lp_build_tgsi_context *bld_base,
@@ -3748,7 +3760,6 @@ static void load_emit(
LLVMBuilderRef builder = gallivm->builder;
const struct tgsi_full_instruction * inst = emit_data->inst;
char intrinsic_name[32];
-   char coords_type[8];
 
if (inst->Src[0].Register.File == TGSI_FILE_MEMORY) {
load_emit_memory(ctx, emit_data);
@@ -3770,11 +3781,9 @@ static void load_emit(
emit_data->args, emit_data->arg_count,
LLVMReadOnlyAttribute);
} else {
-   build_int_type_name(LLVMTypeOf(emit_data->args[0]),
-   coords_type, sizeof(coords_type));
-
-   snprintf(intrinsic_name, sizeof(intrinsic_name),
-"llvm.amdgcn.image.load.%s", coords_type);
+   get_image_intr_name("llvm.amdgcn.image.load",
+   LLVMTypeOf(emit_data->args[0]),
+   intrinsic_name, sizeof(intrinsic_name));
 
emit_data->output[emit_data->chan] =
lp_build_intrinsic(
@@ -3951,7 +3960,6 @@ static void store_emit(
const struct tgsi_full_instruction * inst = emit_data->inst;
unsigned target = inst->Memory.Texture;
char intrinsic_name[32];
-   char coords_type[8];
 
if (inst->Dst[0].Register.File == TGSI_FILE_MEMORY) {
store_emit_memory(ctx, emit_data);
@@ -3972,10 +3980,9 @@ static void store_emit(
emit_data->dst_type, emit_data->args,
emit_data->arg_count, 0);
} else {
-   build_int_type_name(LLVMTypeOf(emit_data->args[1]),
-   coords_type, sizeof(coords_type));
-   snprintf(intrinsic_name, sizeof(intrinsic_name),
-"llvm.amdgcn.image.store.%s", coords_type);
+   get_image_intr_name("llvm.amdgcn.image.store",
+   LLVMTypeOf(emit_data->args[1]),
+   intrinsic_name, sizeof(intrinsic_name));
 
emit_data->output[emit_data->chan] =
lp_build_intrinsic(


Mesa (master): glsl: dump explicit location when printing IR

2016-10-12 Thread Nicolai Hähnle

Module: Mesa
Branch: master
Commit: 141b4b3dfe8bec897ec0ee289a51627af7bd
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=141b4b3dfe8bec897ec0ee289a51627af7bd

Author: Nicolai Hähnle 
Date:   Thu Oct  6 23:10:10 2016 +0200

glsl: dump explicit location when printing IR

Reviewed-by: Kenneth Graunke 
Reviewed-by: Edward O'Callaghan 
Reviewed-by: Dave Airlie 

---

 src/compiler/glsl/ir_print_visitor.cpp | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/src/compiler/glsl/ir_print_visitor.cpp b/src/compiler/glsl/ir_print_visitor.cpp
index fc01be9..c238c16 100644
--- a/src/compiler/glsl/ir_print_visitor.cpp
+++ b/src/compiler/glsl/ir_print_visitor.cpp
@@ -165,10 +165,14 @@ void ir_print_visitor::visit(ir_variable *ir)
 {
fprintf(f, "(declare ");
 
-   char loc[256] = {0};
+   char loc[32] = {0};
if (ir->data.location != -1)
   snprintf(loc, sizeof(loc), "location=%i ", ir->data.location);
 
+   char component[32] = {0};
+   if (ir->data.explicit_component)
+  snprintf(component, sizeof(component), "component=%i ", ir->data.location_frac);
+
const char *const cent = (ir->data.centroid) ? "centroid " : "";
const char *const samp = (ir->data.sample) ? "sample " : "";
const char *const patc = (ir->data.patch) ? "patch " : "";
@@ -183,8 +187,8 @@ void ir_print_visitor::visit(ir_variable *ir)
const char *const interp[] = { "", "smooth", "flat", "noperspective" };
STATIC_ASSERT(ARRAY_SIZE(interp) == INTERP_MODE_COUNT);
 
-   fprintf(f, "(%s%s%s%s%s%s%s%s%s) ",
-   loc, cent, samp, patc, inv, prec, mode[ir->data.mode],
+   fprintf(f, "(%s%s%s%s%s%s%s%s%s%s) ",
+   loc, component, cent, samp, patc, inv, prec, mode[ir->data.mode],
stream[ir->data.stream],
interp[ir->data.interpolation]);
 


Mesa (master): st/glsl_to_tgsi: adjust swizzles and writemasks for explicit components

2016-10-12 Thread Nicolai Hähnle

Module: Mesa
Branch: master
Commit: b5b4aa42ba189c8aa2339ead12784c4feb76bdbb
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=b5b4aa42ba189c8aa2339ead12784c4feb76bdbb

Author: Nicolai Hähnle 
Date:   Fri Oct  7 12:19:33 2016 +0200

st/glsl_to_tgsi: adjust swizzles and writemasks for explicit components

Reviewed-by: Edward O'Callaghan 
Reviewed-by: Dave Airlie 
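
As a rough illustration of the swizzle and writemask adjustment in this patch, the sketch below shifts every channel selector of a size-based swizzle by an explicit component offset and builds the matching usage mask. The macros and swizzle_for_size are assumed to mirror Mesa's 3-bits-per-channel swizzle encoding; swizzle_with_component and bit_consecutive are hypothetical stand-ins for swizzle_for_type and u_bit_consecutive, not the code from the patch.

```c
#include <assert.h>

/* Assumed to mirror Mesa's swizzle encoding: 3 bits per channel selector. */
enum { SWIZZLE_X = 0, SWIZZLE_Y = 1, SWIZZLE_Z = 2, SWIZZLE_W = 3 };
#define MAKE_SWIZZLE4(a, b, c, d) \
   (((a) << 0) | ((b) << 3) | ((c) << 6) | ((d) << 9))
#define GET_SWZ(swz, idx) (((swz) >> ((idx) * 3)) & 0x7)

/* Identity swizzle for a vector of `size` elements; the trailing channels
 * repeat the last valid element. */
static int swizzle_for_size(int size)
{
   static const int size_swizzles[4] = {
      MAKE_SWIZZLE4(SWIZZLE_X, SWIZZLE_X, SWIZZLE_X, SWIZZLE_X),
      MAKE_SWIZZLE4(SWIZZLE_X, SWIZZLE_Y, SWIZZLE_Y, SWIZZLE_Y),
      MAKE_SWIZZLE4(SWIZZLE_X, SWIZZLE_Y, SWIZZLE_Z, SWIZZLE_Z),
      MAKE_SWIZZLE4(SWIZZLE_X, SWIZZLE_Y, SWIZZLE_Z, SWIZZLE_W),
   };
   assert(size >= 1 && size <= 4);
   return size_swizzles[size - 1];
}

/* Hypothetical helper: add `component` to each of the four selectors.
 * No selector overflows its 3-bit field because
 * num_elements + component <= 4 keeps every selector <= 3. */
static int swizzle_with_component(int num_elements, int component)
{
   assert(component >= 0 && num_elements + component <= 4);
   return swizzle_for_size(num_elements) +
          component * MAKE_SWIZZLE4(1, 1, 1, 1);
}

/* Hypothetical helper: bits [start, start + count) set. */
static unsigned bit_consecutive(unsigned start, unsigned count)
{
   assert(start + count <= 32);
   if (count == 32)
      return ~0u;
   return ((1u << count) - 1) << start;
}
```

For a vec2 declared at component 1, swizzle_with_component(2, 1) turns the size-based swizzle XYYY into YZZZ, and bit_consecutive(1, 2) yields the usage mask 0x6 (yz).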

---

 src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 68 +-
 1 file changed, 49 insertions(+), 19 deletions(-)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 26e1153..f721506 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -67,19 +67,34 @@ class st_dst_reg;
 
 static int swizzle_for_size(int size);
 
+static int swizzle_for_type(const glsl_type *type, int component = 0)
+{
+   unsigned num_elements = 4;
+
+   if (type) {
+  type = type->without_array();
+  if (type->is_scalar() || type->is_vector() || type->is_matrix())
+ num_elements = type->vector_elements;
+   }
+
+   int swizzle = swizzle_for_size(num_elements);
+   assert(num_elements + component <= 4);
+
+   swizzle += component * MAKE_SWIZZLE4(1, 1, 1, 1);
+   return swizzle;
+}
+
 /**
  * This struct is a corresponding struct to TGSI ureg_src.
  */
 class st_src_reg {
 public:
-   st_src_reg(gl_register_file file, int index, const glsl_type *type)
+   st_src_reg(gl_register_file file, int index, const glsl_type *type,
+  int component = 0)
{
   this->file = file;
   this->index = index;
-  if (type && (type->is_scalar() || type->is_vector() || type->is_matrix()))
- this->swizzle = swizzle_for_size(type->vector_elements);
-  else
- this->swizzle = SWIZZLE_XYZW;
+  this->swizzle = swizzle_for_type(type, component);
   this->negate = 0;
   this->index2D = 0;
   this->type = type ? type->base_type : GLSL_TYPE_ERROR;
@@ -279,13 +294,19 @@ class variable_storage : public exec_node {
 public:
variable_storage(ir_variable *var, gl_register_file file, int index,
 unsigned array_id = 0)
-  : file(file), index(index), var(var), array_id(array_id)
+  : file(file), index(index), component(0), var(var), array_id(array_id)
{
   /* empty */
}
 
gl_register_file file;
int index;
+
+   /* Explicit component location. This is given in terms of the GLSL-style
+* swizzles where each double is a single component, i.e. for 64-bit types
+* it can only be 0 or 1.
+*/
+   int component;
ir_variable *var; /* variable that maps to this, if any */
unsigned array_id;
 };
@@ -2387,9 +2408,12 @@ glsl_to_tgsi_visitor::visit(ir_dereference_variable *ir)
 
  const glsl_type *type_without_array = var->type->without_array();
  struct inout_decl *decl = &inputs[num_inputs];
+ unsigned component = var->data.location_frac;
  unsigned num_components;
  num_inputs++;
 
+ if (type_without_array->is_64bit())
+component = component / 2;
  if (type_without_array->vector_elements)
 num_components = type_without_array->vector_elements;
  else
@@ -2397,7 +2421,7 @@ glsl_to_tgsi_visitor::visit(ir_dereference_variable *ir)
 
  decl->mesa_index = var->data.location;
  decl->base_type = type_without_array->base_type;
- decl->usage_mask = u_bit_consecutive(0, num_components);
+ decl->usage_mask = u_bit_consecutive(component, num_components);
 
  if (is_inout_array(shader->Stage, var, &remove_array)) {
 decl->array_id = num_input_arrays + 1;
@@ -2415,6 +2439,8 @@ glsl_to_tgsi_visitor::visit(ir_dereference_variable *ir)
PROGRAM_INPUT,
decl->mesa_index,
decl->array_id);
+ entry->component = component;
+
  this->variables.push_tail(entry);
  break;
   }
@@ -2423,9 +2449,12 @@ glsl_to_tgsi_visitor::visit(ir_dereference_variable *ir)
 
  const glsl_type *type_without_array = var->type->without_array();
  struct inout_decl *decl = &outputs[num_outputs];
+ unsigned component = var->data.location_frac;
  unsigned num_components;
  num_outputs++;
 
+ if (type_without_array->is_64bit())
+component = component / 2;
  if (type_without_array->vector_elements)
 num_components = type_without_array->vector_elements;
  else
@@ -2433,7 +2462,7 @@ glsl_to_tgsi_visitor::visit(ir_dereference_variable *ir)
 
  decl->mesa_index = var->data.location + FRAG_RESULT_MAX * var->data.index;
  decl->base_type = type_without_array->base_type;
- decl->usage_mask = u_bit_consecutive(0, num_components);
+ decl->usage_mask = u_bit_consecutive(componen

Mesa (master): gallium: add PIPE_CAP_TGSI_ARRAY_COMPONENTS

2016-10-12 Thread Nicolai Hähnle

Module: Mesa
Branch: master
Commit: 700a571f8963cff4bff230e8e9b25da0bdce4f54
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=700a571f8963cff4bff230e8e9b25da0bdce4f54

Author: Nicolai Hähnle 
Date:   Fri Oct  7 09:42:55 2016 +0200

gallium: add PIPE_CAP_TGSI_ARRAY_COMPONENTS

This is a screen cap because drivers are expected to support it either
for all shader types or for none of them.
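
The documentation added by this patch requires the UsageMask components of a declaration to form a contiguous range (xy or yzw, but not xz or yw). A minimal sketch of such a check follows, assuming the usual TGSI writemask bit order (x = bit 0 through w = bit 3). The helper name is hypothetical, and treating an empty mask as invalid is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: true if the set bits of a 4-bit usage mask form one
 * unbroken run. Bit 0 = x, bit 1 = y, bit 2 = z, bit 3 = w. */
static bool usage_mask_is_contiguous(unsigned mask)
{
   assert(mask <= 0xf);

   /* Assumption: an empty mask is not a valid declaration mask. */
   if (mask == 0)
      return false;

   /* Drop trailing zero bits so the lowest set bit sits at bit 0. */
   while ((mask & 1) == 0)
      mask >>= 1;

   /* A contiguous run now has the form 2^n - 1, so adding 1 gives a
    * power of two that shares no bits with the mask. */
   return (mask & (mask + 1)) == 0;
}
```

With this encoding, xy (0x3) and yzw (0xe) pass, while xz (0x5) and yw (0xa) are rejected.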

Reviewed-by: Edward O'Callaghan 
Reviewed-by: Dave Airlie 

---

 src/gallium/docs/source/screen.rst   | 8 
 src/gallium/drivers/freedreno/freedreno_screen.c | 1 +
 src/gallium/drivers/i915/i915_screen.c   | 1 +
 src/gallium/drivers/ilo/ilo_screen.c | 1 +
 src/gallium/drivers/llvmpipe/lp_screen.c | 1 +
 src/gallium/drivers/nouveau/nv30/nv30_screen.c   | 1 +
 src/gallium/drivers/nouveau/nv50/nv50_screen.c   | 1 +
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.c   | 1 +
 src/gallium/drivers/r300/r300_screen.c   | 1 +
 src/gallium/drivers/r600/r600_pipe.c | 1 +
 src/gallium/drivers/radeonsi/si_pipe.c   | 1 +
 src/gallium/drivers/softpipe/sp_screen.c | 1 +
 src/gallium/drivers/svga/svga_screen.c   | 1 +
 src/gallium/drivers/swr/swr_screen.cpp   | 1 +
 src/gallium/drivers/vc4/vc4_screen.c | 1 +
 src/gallium/drivers/virgl/virgl_screen.c | 1 +
 src/gallium/include/pipe/p_defines.h | 1 +
 17 files changed, 24 insertions(+)

diff --git a/src/gallium/docs/source/screen.rst b/src/gallium/docs/source/screen.rst
index cfc0a1b..d79e75e 100644
--- a/src/gallium/docs/source/screen.rst
+++ b/src/gallium/docs/source/screen.rst
@@ -353,6 +353,14 @@ The integer capabilities:
   32-bit. If set to off, that means that a B5G6R5 + Z24 or RGBA8 + Z16
   combination will require a driver fallback, and should not be
   advertised in the GLX/EGL config list.
+* ``PIPE_CAP_TGSI_ARRAY_COMPONENTS``: If true, the driver interprets the
+  UsageMask of input and output declarations and allows declaring arrays
+  in overlapping ranges. The components must be a contiguous range, e.g. a
+  UsageMask of  xy or yzw is allowed, but xz or yw isn't. Declarations with
+  overlapping locations must have matching semantic names and indices, and
+  equal interpolation qualifiers.
+  Components may overlap, notably when the gaps in an array of dvec3 are
+  filled in.
 
 
 .. _pipe_capf:
diff --git a/src/gallium/drivers/freedreno/freedreno_screen.c b/src/gallium/drivers/freedreno/freedreno_screen.c
index bc54539..1f7c2a5 100644
--- a/src/gallium/drivers/freedreno/freedreno_screen.c
+++ b/src/gallium/drivers/freedreno/freedreno_screen.c
@@ -287,6 +287,7 @@ fd_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
case PIPE_CAP_MAX_WINDOW_RECTANGLES:
case PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED:
case PIPE_CAP_VIEWPORT_SUBPIXEL_BITS:
+   case PIPE_CAP_TGSI_ARRAY_COMPONENTS:
return 0;
 
case PIPE_CAP_MAX_VIEWPORTS:
diff --git a/src/gallium/drivers/i915/i915_screen.c b/src/gallium/drivers/i915/i915_screen.c
index 9f801a9..003f855 100644
--- a/src/gallium/drivers/i915/i915_screen.c
+++ b/src/gallium/drivers/i915/i915_screen.c
@@ -277,6 +277,7 @@ i915_get_param(struct pipe_screen *screen, enum pipe_cap cap)
case PIPE_CAP_TGSI_VOTE:
case PIPE_CAP_MAX_WINDOW_RECTANGLES:
case PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED:
+   case PIPE_CAP_TGSI_ARRAY_COMPONENTS:
   return 0;
 
case PIPE_CAP_MAX_DUAL_SOURCE_RENDER_TARGETS:
diff --git a/src/gallium/drivers/ilo/ilo_screen.c b/src/gallium/drivers/ilo/ilo_screen.c
index 85357fa..1904cd6 100644
--- a/src/gallium/drivers/ilo/ilo_screen.c
+++ b/src/gallium/drivers/ilo/ilo_screen.c
@@ -515,6 +515,7 @@ ilo_get_param(struct pipe_screen *screen, enum pipe_cap param)
case PIPE_CAP_MAX_WINDOW_RECTANGLES:
case PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED:
case PIPE_CAP_VIEWPORT_SUBPIXEL_BITS:
+   case PIPE_CAP_TGSI_ARRAY_COMPONENTS:
   return 0;
 
case PIPE_CAP_VENDOR_ID:
diff --git a/src/gallium/drivers/llvmpipe/lp_screen.c b/src/gallium/drivers/llvmpipe/lp_screen.c
index 9a0a1a2..8da9428 100644
--- a/src/gallium/drivers/llvmpipe/lp_screen.c
+++ b/src/gallium/drivers/llvmpipe/lp_screen.c
@@ -336,6 +336,7 @@ llvmpipe_get_param(struct pipe_screen *screen, enum pipe_cap param)
case PIPE_CAP_MAX_WINDOW_RECTANGLES:
case PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED:
case PIPE_CAP_VIEWPORT_SUBPIXEL_BITS:
+   case PIPE_CAP_TGSI_ARRAY_COMPONENTS:
   return 0;
}
/* should only get here on unhandled cases */
diff --git a/src/gallium/drivers/nouveau/nv30/nv30_screen.c b/src/gallium/drivers/nouveau/nv30/nv30_screen.c
index 2ced8f1..961c22c 100644
--- a/src/gallium/drivers/nouveau/nv30/nv30_screen.c
+++ b/src/gallium/drivers/nouveau/nv30/nv30_screen.c
@@ -201,6 +201,7 @@ nv30_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
case PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED:
case PIPE

Mesa (master): st/glsl_to_tgsi: mark "gaps" in input/output arrays as used

2016-10-12 Thread Nicolai Hähnle

Module: Mesa
Branch: master
Commit: 2299a9940c5ac5fb42b0726afa9a67fc23ba3a48
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=2299a9940c5ac5fb42b0726afa9a67fc23ba3a48

Author: Nicolai Hähnle 
Date:   Fri Oct  7 21:30:05 2016 +0200

st/glsl_to_tgsi: mark "gaps" in input/output arrays as used

In some cases, a shader may have an input/output array but not use some
entries in the middle. This happens with eON games, for example.

We emit declarations that cover the entire array range even if there are
some unused gaps. This patch now reflects that in the InputsRead etc.
fields to ensure the various input/outputMapping arrays are actually
correct, which will be important when we re-jiggle the way declarations
are emitted.
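
The gap-filling step described above can be sketched roughly as follows. This is a simplified stand-in, not the patch itself: struct array_range and mark_array_slots_used are hypothetical names, and the loop starts at the first slot for simplicity, whereas the patch starts at the second because the first is already known to be used after shrinking.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for an (already shrunk) array declaration. */
struct array_range {
   unsigned first;  /* index of the first slot of the array */
   unsigned size;   /* number of consecutive slots */
};

/* Mark every slot covered by the array as used, including unused gaps in
 * the middle, so that a later slot-to-register mapping sees one solid
 * range. */
static void mark_array_slots_used(uint64_t *usage_mask,
                                  const struct array_range *range)
{
   assert(range->first + range->size <= 64);

   for (unsigned j = 0; j < range->size; j++)
      *usage_mask |= UINT64_C(1) << (range->first + j);
}
```

For an array covering slots 2 to 5 of which only slots 2 and 5 are referenced (mask 0x24), the mask becomes 0x3c after the call.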

v2: fix a typo (Edward O'Callaghan)

Reviewed-by: Edward O'Callaghan 
Reviewed-by: Dave Airlie 

---

 src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 32 ++
 1 file changed, 24 insertions(+), 8 deletions(-)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index dc20fe4..0d39fe8 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -2458,9 +2458,9 @@ glsl_to_tgsi_visitor::visit(ir_dereference_variable *ir)
 
 static void
 shrink_array_declarations(struct array_decl *arrays, unsigned count,
-  GLbitfield64 usage_mask,
+  GLbitfield64* usage_mask,
   GLbitfield64 double_usage_mask,
-  GLbitfield patch_usage_mask)
+  GLbitfield* patch_usage_mask)
 {
unsigned i;
int j;
@@ -2474,12 +2474,12 @@ shrink_array_declarations(struct array_decl *arrays, unsigned count,
   /* Shrink the beginning. */
   for (j = 0; j < (int)decl->array_size; j++) {
  if (decl->mesa_index >= VARYING_SLOT_PATCH0) {
-if (patch_usage_mask &
+if (*patch_usage_mask &
 BITFIELD64_BIT(decl->mesa_index - VARYING_SLOT_PATCH0 + j))
break;
  }
  else {
-if (usage_mask & BITFIELD64_BIT(decl->mesa_index+j))
+if (*usage_mask & BITFIELD64_BIT(decl->mesa_index+j))
break;
 if (double_usage_mask & BITFIELD64_BIT(decl->mesa_index+j-1))
break;
@@ -2493,12 +2493,12 @@ shrink_array_declarations(struct array_decl *arrays, unsigned count,
   /* Shrink the end. */
   for (j = decl->array_size-1; j >= 0; j--) {
  if (decl->mesa_index >= VARYING_SLOT_PATCH0) {
-if (patch_usage_mask &
+if (*patch_usage_mask &
 BITFIELD64_BIT(decl->mesa_index - VARYING_SLOT_PATCH0 + j))
break;
  }
  else {
-if (usage_mask & BITFIELD64_BIT(decl->mesa_index+j))
+if (*usage_mask & BITFIELD64_BIT(decl->mesa_index+j))
break;
 if (double_usage_mask & BITFIELD64_BIT(decl->mesa_index+j-1))
break;
@@ -2506,6 +2506,22 @@ shrink_array_declarations(struct array_decl *arrays, unsigned count,
 
  decl->array_size--;
   }
+
+  /* When not all entries of an array are accessed, we mark them as used
+   * here anyway, to ensure that the input/output mapping logic doesn't get
+   * confused.
+   *
+   * TODO This happens when an array isn't used via indirect access, which
+   * some game ports do (at least eON-based). There is an optimization
+   * opportunity here by replacing the array declaration with non-array
+   * declarations of those slots that are actually used.
+   */
+  for (j = 1; j < (int)decl->array_size; ++j) {
+ if (decl->mesa_index >= VARYING_SLOT_PATCH0)
+*patch_usage_mask |= BITFIELD64_BIT(decl->mesa_index - VARYING_SLOT_PATCH0 + j);
+ else
+*usage_mask |= BITFIELD64_BIT(decl->mesa_index + j);
+  }
}
 }
 
@@ -6633,9 +6649,9 @@ get_mesa_program_tgsi(struct gl_context *ctx,
 
do_set_program_inouts(shader->ir, prog, shader->Stage);
shrink_array_declarations(v->input_arrays, v->num_input_arrays,
- prog->InputsRead, prog->DoubleInputsRead, prog->PatchInputsRead);
+ &prog->InputsRead, prog->DoubleInputsRead, &prog->PatchInputsRead);
shrink_array_declarations(v->output_arrays, v->num_output_arrays,
- prog->OutputsWritten, 0ULL, prog->PatchOutputsWritten);
+ &prog->OutputsWritten, 0ULL, &prog->PatchOutputsWritten);
count_resources(v, prog);
 
/* The GLSL IR won't be needed anymore. */


Mesa (master): tgsi/ureg: add layout/component input declarations

2016-10-12 Thread Nicolai Hähnle

Module: Mesa
Branch: master
Commit: 047a7c7a0b419ac9e6deb4ff885a08c684495ce4
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=047a7c7a0b419ac9e6deb4ff885a08c684495ce4

Author: Nicolai Hähnle 
Date:   Fri Oct  7 12:07:21 2016 +0200

tgsi/ureg: add layout/component input declarations

v2: change the order of parameters (Dave)

Reviewed-by: Edward O'Callaghan  (v1)
Reviewed-by: Dave Airlie  (v1)

---

 src/gallium/auxiliary/tgsi/tgsi_ureg.c | 67 --
 src/gallium/auxiliary/tgsi/tgsi_ureg.h | 21 +++
 2 files changed, 76 insertions(+), 12 deletions(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_ureg.c b/src/gallium/auxiliary/tgsi/tgsi_ureg.c
index 6ad514d..348c371 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_ureg.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_ureg.c
@@ -77,9 +77,9 @@ struct ureg_tokens {
unsigned count;
 };
 
-#define UREG_MAX_INPUT PIPE_MAX_SHADER_INPUTS
+#define UREG_MAX_INPUT (4 * PIPE_MAX_SHADER_INPUTS)
 #define UREG_MAX_SYSTEM_VALUE PIPE_MAX_ATTRIBS
-#define UREG_MAX_OUTPUT PIPE_MAX_SHADER_OUTPUTS
+#define UREG_MAX_OUTPUT (4 * PIPE_MAX_SHADER_OUTPUTS)
 #define UREG_MAX_CONSTANT_RANGE 32
 #define UREG_MAX_IMMEDIATE 4096
 #define UREG_MAX_ADDR 3
@@ -108,6 +108,7 @@ struct ureg_program
   unsigned semantic_index;
   unsigned interp;
   unsigned char cylindrical_wrap;
+  unsigned char usage_mask;
   unsigned interp_location;
   unsigned first;
   unsigned last;
@@ -269,25 +270,33 @@ ureg_property(struct ureg_program *ureg, unsigned name, unsigned value)
 }
 
 struct ureg_src
-ureg_DECL_fs_input_cyl_centroid(struct ureg_program *ureg,
+ureg_DECL_fs_input_cyl_centroid_layout(struct ureg_program *ureg,
unsigned semantic_name,
unsigned semantic_index,
unsigned interp_mode,
unsigned cylindrical_wrap,
unsigned interp_location,
+   unsigned index,
+   unsigned usage_mask,
unsigned array_id,
unsigned array_size)
 {
unsigned i;
 
+   assert(usage_mask != 0);
+   assert(usage_mask <= TGSI_WRITEMASK_XYZW);
+
for (i = 0; i < ureg->nr_inputs; i++) {
   if (ureg->input[i].semantic_name == semantic_name &&
   ureg->input[i].semantic_index == semantic_index) {
  assert(ureg->input[i].interp == interp_mode);
  assert(ureg->input[i].cylindrical_wrap == cylindrical_wrap);
  assert(ureg->input[i].interp_location == interp_location);
- assert(ureg->input[i].array_id == array_id);
- goto out;
+ if (ureg->input[i].array_id == array_id) {
+ureg->input[i].usage_mask |= usage_mask;
+goto out;
+ }
+ assert((ureg->input[i].usage_mask & usage_mask) == 0);
   }
}
 
@@ -298,10 +307,11 @@ ureg_DECL_fs_input_cyl_centroid(struct ureg_program *ureg,
   ureg->input[i].interp = interp_mode;
   ureg->input[i].cylindrical_wrap = cylindrical_wrap;
   ureg->input[i].interp_location = interp_location;
-  ureg->input[i].first = ureg->nr_input_regs;
-  ureg->input[i].last = ureg->nr_input_regs + array_size - 1;
+  ureg->input[i].first = index;
+  ureg->input[i].last = index + array_size - 1;
   ureg->input[i].array_id = array_id;
-  ureg->nr_input_regs += array_size;
+  ureg->input[i].usage_mask = usage_mask;
+  ureg->nr_input_regs = MAX2(ureg->nr_input_regs, index + array_size);
   ureg->nr_inputs++;
} else {
   set_bad(ureg);
@@ -312,6 +322,21 @@ out:
   array_id);
 }
 
+struct ureg_src
+ureg_DECL_fs_input_cyl_centroid(struct ureg_program *ureg,
+   unsigned semantic_name,
+   unsigned semantic_index,
+   unsigned interp_mode,
+   unsigned cylindrical_wrap,
+   unsigned interp_location,
+   unsigned array_id,
+   unsigned array_size)
+{
+   return ureg_DECL_fs_input_cyl_centroid_layout(ureg,
+ semantic_name, semantic_index, interp_mode, cylindrical_wrap, interp_location,
+ ureg->nr_input_regs, TGSI_WRITEMASK_XYZW, array_id, array_size);
+}
+
 
 struct ureg_src 
 ureg_DECL_vs_input( struct ureg_program *ureg,
@@ -326,6 +351,21 @@ ureg_DECL_vs_input( struct ureg_program *ureg,
 
 
 struct ureg_src
+ureg_DECL_input_layout(struct ureg_program *ureg,
+unsigned semantic_name,
+unsigned semantic_index,
+unsigned index,
+unsigned usage_mask,
+unsigned array_id,
+unsigned array_size)
+{
+   return ureg_DECL_fs_input_cyl_centroid_layout(ureg,
+   semantic_name, semantic_index, 0, 0, 0,
+   index, usage_mask, array_id, array_size);
+}
+
+
+struct ureg_src
 ureg_DECL_input(struct ureg_program *ur

Mesa (master): st/glsl_to_tgsi: disable on-the-fly peephole for 64-bit operations

2016-10-12 Thread Nicolai Hähnle

Module: Mesa
Branch: master
Commit: 63193b9cdeca4f5d0e91f90c0926a1565f6b0415
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=63193b9cdeca4f5d0e91f90c0926a1565f6b0415

Author: Nicolai Hähnle 
Date:   Fri Oct  7 16:15:30 2016 +0200

st/glsl_to_tgsi: disable on-the-fly peephole for 64-bit operations

This optimization is incorrect with 64-bit operations, because the
channel-splitting logic in emit_asm ends up being applied twice to
the source operands.

A lucky coincidence of how the writemask test works resulted in this
optimization basically never being applied anyway. As far as I can tell,
the only case where it would (incorrectly) have been applied is something
like

dvec2 d;
float x = (float)d.y;

which nobody seems to have ever done. But the moral equivalent does occur
in one of the component layout piglit tests.

Cc: mesa-sta...@lists.freedesktop.org
Reviewed-by: Edward O'Callaghan 
Reviewed-by: Dave Airlie 

---

 src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 4 
 1 file changed, 4 insertions(+)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 6a63e5c..dc20fe4 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -255,6 +255,7 @@ public:
ir_instruction *ir;
GLboolean cond_update;
bool saturate;
+   bool is_64bit_expanded;
st_src_reg sampler; /**< sampler register */
int sampler_base;
int sampler_array_size; /**< 1-based size of sampler array, 1 if not array */
@@ -670,6 +671,7 @@ glsl_to_tgsi_visitor::emit_asm(ir_instruction *ir, unsigned op,
inst->src[1] = src1;
inst->src[2] = src2;
inst->src[3] = src3;
+   inst->is_64bit_expanded = false;
inst->ir = ir;
inst->dead_mask = 0;
/* default to float, for paths where this is not initialized
@@ -790,6 +792,7 @@ glsl_to_tgsi_visitor::emit_asm(ir_instruction *ir, unsigned op,
 dinst->prev = NULL;
  }
  this->instructions.push_tail(dinst);
+ dinst->is_64bit_expanded = true;
 
  /* modify the destination if we are splitting */
  for (j = 0; j < 2; j++) {
@@ -2908,6 +2911,7 @@ glsl_to_tgsi_visitor::visit(ir_assignment *ir)
} else if (ir->rhs->as_expression() &&
   this->instructions.get_tail() &&
   ir->rhs == ((glsl_to_tgsi_instruction *)this->instructions.get_tail())->ir &&
+  !((glsl_to_tgsi_instruction *)this->instructions.get_tail())->is_64bit_expanded &&
   type_size(ir->lhs->type) == 1 &&
   l.writemask == ((glsl_to_tgsi_instruction *)this->instructions.get_tail())->dst[0].writemask) {
   /* To avoid emitting an extra MOV when assigning an expression to a


Mesa (master): st/glsl_to_tgsi: explicitly track all input and output declaration

2016-10-12 Thread Nicolai Hähnle

Module: Mesa
Branch: master
Commit: 777dcf81b956158546616aae89507cafb83b9ac5
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=777dcf81b956158546616aae89507cafb83b9ac5

Author: Nicolai Hähnle 
Date:   Fri Oct  7 12:19:11 2016 +0200

st/glsl_to_tgsi: explicitly track all input and output declaration

In order to be able to emit overlapping input and output array
declarations, we flip the logic of emitting those declarations on its
head: rather than iterating over slots and emitting the corresponding
declarations, we iterate over the declarations from GLSL and emit those.
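
A rough sketch of why iterating over declarations (rather than over slots) matters: two declarations may cover the same slot with different components, and a per-slot loop has no way to tell them apart. The struct and function below are hypothetical simplifications of the tracking introduced by this patch, not its actual code.

```c
#include <assert.h>

/* Hypothetical, simplified view of one input/output declaration. */
struct inout_decl_sketch {
   unsigned first_slot;       /* first vec4 slot covered */
   unsigned size;             /* number of consecutive slots */
   unsigned char usage_mask;  /* components used, x = bit 0 .. w = bit 3 */
};

/* Walk the declarations, not the slots, and accumulate which components of
 * each slot are covered. Each declaration keeps its own identity here, so
 * overlapping ranges are handled naturally. The caller is expected to
 * zero-initialize slot_masks. */
static void collect_slot_masks(const struct inout_decl_sketch *decls,
                               unsigned count,
                               unsigned char *slot_masks,
                               unsigned num_slots)
{
   for (unsigned i = 0; i < count; i++) {
      const struct inout_decl_sketch *d = &decls[i];

      assert(d->size > 0 && d->first_slot + d->size <= num_slots);
      assert(d->usage_mask != 0 && d->usage_mask <= 0xf);

      for (unsigned j = 0; j < d->size; j++)
         slot_masks[d->first_slot + j] |= d->usage_mask;
   }
}
```

With one array in slots 0 to 1 using xy and another in slots 1 to 2 using zw, slot 1 ends up fully covered (0xf) by two distinct declarations.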

v2: fix some regressions related to structs
v3: fix a regression in geometry and tessellation shader array handling

Acked-by: Edward O'Callaghan  (v2)
Reviewed-by: Dave Airlie  (v2)

---

 src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 325 +++--
 1 file changed, 171 insertions(+), 154 deletions(-)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 0d39fe8..26e1153 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -340,25 +340,38 @@ public:
 static st_src_reg undef_src = st_src_reg(PROGRAM_UNDEFINED, 0, GLSL_TYPE_ERROR);
 static st_dst_reg undef_dst = st_dst_reg(PROGRAM_UNDEFINED, SWIZZLE_NOOP, GLSL_TYPE_ERROR);
 
-struct array_decl {
+struct inout_decl {
unsigned mesa_index;
-   unsigned array_id;
-   unsigned array_size;
-   enum glsl_base_type array_type;
+   unsigned array_id; /* TGSI ArrayID; 1-based: 0 means not an array */
+   unsigned size;
+   enum glsl_base_type base_type;
+   ubyte usage_mask; /* GLSL-style usage-mask,  i.e. single bit per double */
 };
 
-static enum glsl_base_type
-find_array_type(struct array_decl *arrays, unsigned count, unsigned array_id)
+static struct inout_decl *
+find_inout_array(struct inout_decl *decls, unsigned count, unsigned array_id)
 {
-   unsigned i;
+   assert(array_id != 0);
 
-   for (i = 0; i < count; i++) {
-  struct array_decl *decl = &arrays[i];
+   for (unsigned i = 0; i < count; i++) {
+  struct inout_decl *decl = &decls[i];
 
   if (array_id == decl->array_id) {
- return decl->array_type;
+ return decl;
   }
}
+
+   return NULL;
+}
+
+static enum glsl_base_type
+find_array_type(struct inout_decl *decls, unsigned count, unsigned array_id)
+{
+   if (!array_id)
+  return GLSL_TYPE_ERROR;
+   struct inout_decl *decl = find_inout_array(decls, count, array_id);
+   if (decl)
+  return decl->base_type;
return GLSL_TYPE_ERROR;
 }
 
@@ -386,9 +399,11 @@ public:
unsigned max_num_arrays;
unsigned next_array;
 
-   struct array_decl input_arrays[PIPE_MAX_SHADER_INPUTS];
+   struct inout_decl inputs[4 * PIPE_MAX_SHADER_INPUTS];
+   unsigned num_inputs;
unsigned num_input_arrays;
-   struct array_decl output_arrays[PIPE_MAX_SHADER_OUTPUTS];
+   struct inout_decl outputs[4 * PIPE_MAX_SHADER_OUTPUTS];
+   unsigned num_outputs;
unsigned num_output_arrays;
 
int num_address_regs;
@@ -736,7 +751,7 @@ glsl_to_tgsi_visitor::emit_asm(ir_instruction *ir, unsigned op,
for (j = 0; j < 2; j++) {
   dst_is_64bit[j] = glsl_base_type_is_64bit(inst->dst[j].type);
   if (!dst_is_64bit[j] && inst->dst[j].file == PROGRAM_OUTPUT && inst->dst[j].type == GLSL_TYPE_ARRAY) {
- enum glsl_base_type type = find_array_type(this->output_arrays, this->num_output_arrays, inst->dst[j].array_id);
+ enum glsl_base_type type = find_array_type(this->outputs, this->num_outputs, inst->dst[j].array_id);
  if (glsl_base_type_is_64bit(type))
 dst_is_64bit[j] = true;
   }
@@ -2324,16 +2339,16 @@ glsl_to_tgsi_visitor::visit(ir_swizzle *ir)
  * for patch inputs), so only the array element type is considered.
  */
 static bool
-is_inout_array(unsigned stage, ir_variable *var, bool *is_2d)
+is_inout_array(unsigned stage, ir_variable *var, bool *remove_array)
 {
const glsl_type *type = var->type;
 
+   *remove_array = false;
+
if ((stage == MESA_SHADER_VERTEX && var->data.mode == ir_var_shader_in) ||
(stage == MESA_SHADER_FRAGMENT && var->data.mode == ir_var_shader_out))
   return false;
 
-   *is_2d = false;
-
if (((stage == MESA_SHADER_GEOMETRY && var->data.mode == ir_var_shader_in) ||
 (stage == MESA_SHADER_TESS_EVAL && var->data.mode == ir_var_shader_in) ||
 stage == MESA_SHADER_TESS_CTRL) &&
@@ -2342,7 +2357,7 @@ is_inout_array(unsigned stage, ir_variable *var, bool *is_2d)
  return false; /* a system value probably */
 
   type = var->type->fields.array;
-  *is_2d = true;
+  *remove_array = true;
}
 
return type->is_array() || type->is_matrix();
@@ -2353,7 +2368,7 @@ glsl_to_tgsi_visitor::visit(ir_dereference_variable *ir)
 {
variable_storage *entry = find_variable_storage(ir->var);
ir_variable *var = ir->var;
-   bool is_2d;
+   bool remove_array;
 
if (!entry) {
   switch (var->da

Mesa (master): st/mesa: enable ARB_enhanced_layouts and turn the cap on

2016-10-12 Thread Nicolai Hähnle

Module: Mesa
Branch: master
Commit: 789119d21212da3891ed57018ec1af197a48977c
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=789119d21212da3891ed57018ec1af197a48977c

Author: Nicolai Hähnle 
Date:   Thu Oct  6 23:10:22 2016 +0200

st/mesa: enable ARB_enhanced_layouts and turn the cap on

v2: mark llvmpipe & softpipe properly as well (Jason Wood)

Reviewed-by: Edward O'Callaghan 
Reviewed-by: Dave Airlie 

---

 docs/features.txt| 18 +-
 docs/relnotes/12.1.0.html|  2 +-
 src/gallium/drivers/llvmpipe/lp_screen.c |  2 +-
 src/gallium/drivers/radeonsi/si_pipe.c   |  2 +-
 src/gallium/drivers/softpipe/sp_screen.c |  2 +-
 src/mesa/state_tracker/st_extensions.c   |  7 +++
 6 files changed, 20 insertions(+), 13 deletions(-)

diff --git a/docs/features.txt b/docs/features.txt
index 505b61c..ec2634f 100644
--- a/docs/features.txt
+++ b/docs/features.txt
@@ -188,23 +188,23 @@ GL 4.3, GLSL 4.30 -- all DONE: i965/gen8+, nvc0, radeonsi
   GL_ARB_vertex_attrib_binding  DONE (all drivers)
 
 
-GL 4.4, GLSL 4.40 -- all DONE: i965/gen8+
+GL 4.4, GLSL 4.40 -- all DONE: i965/gen8+, radeonsi
 
   GL_MAX_VERTEX_ATTRIB_STRIDE   DONE (all drivers)
-  GL_ARB_buffer_storage DONE (i965, nv50, nvc0, r600, radeonsi)
-  GL_ARB_clear_texture  DONE (i965, nv50, nvc0, r600, radeonsi)
-  GL_ARB_enhanced_layouts   DONE (i965)
+  GL_ARB_buffer_storage DONE (i965, nv50, nvc0, r600)
+  GL_ARB_clear_texture  DONE (i965, nv50, nvc0, r600)
+  GL_ARB_enhanced_layouts   DONE (i965, llvmpipe, softpipe)
   - compile-time constant expressions   DONE
   - explicit byte offsets for blocks DONE
   - forced alignment within blocks  DONE
-  - specified vec4-slot component numbers   DONE (i965)
+  - specified vec4-slot component numbers   DONE (i965, llvmpipe, softpipe)
   - specified transform/feedback layout DONE
   - input/output block locations DONE
   GL_ARB_multi_bind DONE (all drivers)
-  GL_ARB_query_buffer_object DONE (i965/hsw+, nvc0, radeonsi)
-  GL_ARB_texture_mirror_clamp_to_edge   DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
-  GL_ARB_texture_stencil8   DONE (i965/hsw+, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
-  GL_ARB_vertex_type_10f_11f_11f_rev DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
+  GL_ARB_query_buffer_object DONE (i965/hsw+, nvc0)
+  GL_ARB_texture_mirror_clamp_to_edge   DONE (i965, nv50, nvc0, r600, llvmpipe, softpipe, swr)
+  GL_ARB_texture_stencil8   DONE (i965/hsw+, nv50, nvc0, r600, llvmpipe, softpipe, swr)
+  GL_ARB_vertex_type_10f_11f_11f_rev DONE (i965, nv50, nvc0, r600, llvmpipe, softpipe, swr)
 
 GL 4.5, GLSL 4.50:
 
diff --git a/docs/relnotes/12.1.0.html b/docs/relnotes/12.1.0.html
index 2e4b669..20fd2cb 100644
--- a/docs/relnotes/12.1.0.html
+++ b/docs/relnotes/12.1.0.html
@@ -51,7 +51,7 @@ Note: some of the new features are only available with certain drivers.
 GL_ARB_clear_texture on r600, radeonsi
 GL_ARB_compute_variable_group_size on nvc0, radeonsi
 GL_ARB_cull_distance on radeonsi
-GL_ARB_enhanced_layouts on i965
+GL_ARB_enhanced_layouts on i965, radeonsi, llvmpipe, softpipe
 GL_ARB_indirect_parameters on radeonsi
 GL_ARB_query_buffer_object on radeonsi
 GL_ARB_shader_draw_parameters on radeonsi
diff --git a/src/gallium/drivers/llvmpipe/lp_screen.c b/src/gallium/drivers/llvmpipe/lp_screen.c
index 8da9428..3e4f1ef 100644
--- a/src/gallium/drivers/llvmpipe/lp_screen.c
+++ b/src/gallium/drivers/llvmpipe/lp_screen.c
@@ -265,6 +265,7 @@ llvmpipe_get_param(struct pipe_screen *screen, enum pipe_cap param)
case PIPE_CAP_FAKE_SW_MSAA:
   return 1;
case PIPE_CAP_CONDITIONAL_RENDER_INVERTED:
+   case PIPE_CAP_TGSI_ARRAY_COMPONENTS:
   return 1;
 
case PIPE_CAP_VENDOR_ID:
@@ -336,7 +337,6 @@ llvmpipe_get_param(struct pipe_screen *screen, enum pipe_cap param)
case PIPE_CAP_MAX_WINDOW_RECTANGLES:
case PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED:
case PIPE_CAP_VIEWPORT_SUBPIXEL_BITS:
-   case PIPE_CAP_TGSI_ARRAY_COMPONENTS:
   return 0;
}
/* should only get here on unhandled cases */
diff --git a/src/gallium/drivers/radeonsi/si_pipe.c b/src/gallium/drivers/radeonsi/si_pipe.c
index dc0c72e..5a3f101 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.c
+++ b/src/gallium/drivers/radeonsi/si_pipe.c
@@ -409,6 +409,7 @@ static int si_get_param(struct pipe_screen* pscreen, enum pipe_cap param)
ca

Mesa (master): tgsi/scan: fix num_inputs/num_outputs for shaders with overlapping arrays

2016-10-12 Thread Nicolai Hähnle

Module: Mesa
Branch: master
Commit: f9a01f3872fa854c316b36351b166e2e4ebb5570
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=f9a01f3872fa854c316b36351b166e2e4ebb5570

Author: Nicolai Hähnle 
Date:   Fri Oct  7 12:53:55 2016 +0200

tgsi/scan: fix num_inputs/num_outputs for shaders with overlapping arrays

v2: remove a tautological left-over assert (Marek)

Reviewed-by: Edward O'Callaghan  (v1)
Reviewed-by: Dave Airlie  (v1)
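
A minimal sketch of why counting by increment overstates the register count when declarations overlap, while taking the maximum index does not. The `decl_range` struct and both function names are hypothetical stand-ins, not the tgsi_scan API; `MAX2` is redefined locally so the block compiles on its own.

```c
#include <assert.h>

#define MAX2(a, b) ((a) > (b) ? (a) : (b))

/* Hypothetical stand-in for one declaration covering registers first..last. */
struct decl_range {
   unsigned first;
   unsigned last;
};

/* Old scheme: one increment per declared register, so a register that is
 * covered by two overlapping declarations is counted twice. */
unsigned
count_by_increment(const struct decl_range *decls, unsigned n)
{
   unsigned num = 0;
   for (unsigned i = 0; i < n; i++) {
      assert(decls[i].first <= decls[i].last);
      for (unsigned reg = decls[i].first; reg <= decls[i].last; reg++)
         num++;
   }
   return num;
}

/* New scheme: the count is the highest register index seen, plus one. */
unsigned
count_by_max(const struct decl_range *decls, unsigned n)
{
   unsigned num = 0;
   for (unsigned i = 0; i < n; i++) {
      assert(decls[i].first <= decls[i].last);
      for (unsigned reg = decls[i].first; reg <= decls[i].last; reg++)
         num = MAX2(num, reg + 1);
   }
   return num;
}
```

With two declarations covering registers 0..3 and 2..5, the increment scheme reports 8 while the maximum scheme reports 6, which matches the number of register slots actually spanned.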

---

 src/gallium/auxiliary/tgsi/tgsi_scan.c | 10 ++
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_scan.c 
b/src/gallium/auxiliary/tgsi/tgsi_scan.c
index c7745ce..b862078 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_scan.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_scan.c
@@ -401,12 +401,7 @@ scan_declaration(struct tgsi_shader_info *info,
  info->input_cylindrical_wrap[reg] = 
(ubyte)fulldecl->Interp.CylindricalWrap;
 
  /* Vertex shaders can have inputs with holes between them. */
- if (info->processor == PIPE_SHADER_VERTEX)
-info->num_inputs = MAX2(info->num_inputs, reg + 1);
- else {
-info->num_inputs++;
-assert(reg < info->num_inputs);
- }
+ info->num_inputs = MAX2(info->num_inputs, reg + 1);
 
  if (semName == TGSI_SEMANTIC_PRIMID)
 info->uses_primid = TRUE;
@@ -456,8 +451,7 @@ scan_declaration(struct tgsi_shader_info *info,
   else if (file == TGSI_FILE_OUTPUT) {
  info->output_semantic_name[reg] = (ubyte) semName;
  info->output_semantic_index[reg] = (ubyte) semIndex;
- info->num_outputs++;
- assert(reg < info->num_outputs);
+ info->num_outputs = MAX2(info->num_outputs, reg + 1);
 
  if (semName == TGSI_SEMANTIC_COLOR)
 info->colors_written |= 1 << semIndex;


Mesa (master): tgsi/ureg: add ureg_DECL_output_layout

2016-10-12 Thread Nicolai Hähnle

Module: Mesa
Branch: master
Commit: 2b460c750a3cd4acbe05036e92e860800b6d5a96
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=2b460c750a3cd4acbe05036e92e860800b6d5a96

Author: Nicolai Hähnle 
Date:   Wed Oct 12 17:24:37 2016 +0200

tgsi/ureg: add ureg_DECL_output_layout

For specifying an exact location/component.

v2: change the order of parameters (Dave)

Reviewed-by: Edward O'Callaghan  (v1)
Reviewed-by: Dave Airlie  (v1)
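
The register bookkeeping in this change can be sketched as follows, assuming a much simplified allocator. `output_alloc`, `place_at` and `append` are hypothetical names used only for illustration; they mirror how the diff below lets ureg_DECL_output_masked forward to ureg_DECL_output_layout with the current register count as the index.

```c
#include <assert.h>

#define MAX2(a, b) ((a) > (b) ? (a) : (b))

/* Hypothetical, simplified model of the output register counter. */
struct output_alloc {
   unsigned nr_output_regs;
};

/* Place an output of array_size registers at an explicit index.
 * The counter only grows when the placement extends past its current value. */
unsigned
place_at(struct output_alloc *a, unsigned index, unsigned array_size)
{
   assert(array_size >= 1);
   a->nr_output_regs = MAX2(a->nr_output_regs, index + array_size);
   return index;
}

/* Append after everything allocated so far: the explicit-index path with
 * the current counter as the index. */
unsigned
append(struct output_alloc *a, unsigned array_size)
{
   return place_at(a, a->nr_output_regs, array_size);
}
```

Expressing the appending variant through the explicit one keeps a single code path for updating the counter.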

---

 src/gallium/auxiliary/tgsi/tgsi_ureg.c | 42 +++---
 src/gallium/auxiliary/tgsi/tgsi_ureg.h |  9 
 2 files changed, 38 insertions(+), 13 deletions(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_ureg.c 
b/src/gallium/auxiliary/tgsi/tgsi_ureg.c
index 348c371..ede648e 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_ureg.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_ureg.c
@@ -405,9 +405,10 @@ out:
 }
 
 
-struct ureg_dst 
-ureg_DECL_output_masked(struct ureg_program *ureg,
-unsigned name,
+struct ureg_dst
+ureg_DECL_output_layout(struct ureg_program *ureg,
+unsigned semantic_name,
+unsigned semantic_index,
 unsigned index,
 unsigned usage_mask,
 unsigned array_id,
@@ -418,22 +419,24 @@ ureg_DECL_output_masked(struct ureg_program *ureg,
assert(usage_mask != 0);
 
for (i = 0; i < ureg->nr_outputs; i++) {
-  if (ureg->output[i].semantic_name == name &&
-  ureg->output[i].semantic_index == index) {
- assert(ureg->output[i].array_id == array_id);
- ureg->output[i].usage_mask |= usage_mask;
- goto out;
+  if (ureg->output[i].semantic_name == semantic_name &&
+  ureg->output[i].semantic_index == semantic_index) {
+ if (ureg->output[i].array_id == array_id) {
+ureg->output[i].usage_mask |= usage_mask;
+goto out;
+ }
+ assert((ureg->output[i].usage_mask & usage_mask) == 0);
   }
}
 
if (ureg->nr_outputs < UREG_MAX_OUTPUT) {
-  ureg->output[i].semantic_name = name;
-  ureg->output[i].semantic_index = index;
+  ureg->output[i].semantic_name = semantic_name;
+  ureg->output[i].semantic_index = semantic_index;
   ureg->output[i].usage_mask = usage_mask;
-  ureg->output[i].first = ureg->nr_output_regs;
-  ureg->output[i].last = ureg->nr_output_regs + array_size - 1;
+  ureg->output[i].first = index;
+  ureg->output[i].last = index + array_size - 1;
   ureg->output[i].array_id = array_id;
-  ureg->nr_output_regs += array_size;
+  ureg->nr_output_regs = MAX2(ureg->nr_output_regs, index + array_size);
   ureg->nr_outputs++;
}
else {
@@ -446,6 +449,19 @@ out:
 }
 
 
+struct ureg_dst
+ureg_DECL_output_masked(struct ureg_program *ureg,
+unsigned name,
+unsigned index,
+unsigned usage_mask,
+unsigned array_id,
+unsigned array_size)
+{
+   return ureg_DECL_output_layout(ureg, name, index,
+  ureg->nr_output_regs, usage_mask, array_id, 
array_size);
+}
+
+
 struct ureg_dst 
 ureg_DECL_output(struct ureg_program *ureg,
  unsigned name,
diff --git a/src/gallium/auxiliary/tgsi/tgsi_ureg.h 
b/src/gallium/auxiliary/tgsi/tgsi_ureg.h
index 0fa35bf..d3c28b3 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_ureg.h
+++ b/src/gallium/auxiliary/tgsi/tgsi_ureg.h
@@ -248,6 +248,15 @@ ureg_DECL_system_value(struct ureg_program *,
unsigned semantic_index);
 
 struct ureg_dst
+ureg_DECL_output_layout(struct ureg_program *,
+unsigned semantic_name,
+unsigned semantic_index,
+unsigned index,
+unsigned usage_mask,
+unsigned array_id,
+unsigned array_size);
+
+struct ureg_dst
 ureg_DECL_output_masked(struct ureg_program *,
 unsigned semantic_name,
 unsigned semantic_index,


Mesa (master): st/glsl_to_tgsi: explicit handling of writemask for depth/stencil export

2016-10-12 Thread Nicolai Hähnle

Module: Mesa
Branch: master
Commit: 957d5410892aa7b12bb19fe081a7073861b424a6
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=957d5410892aa7b12bb19fe081a7073861b424a6

Author: Nicolai Hähnle 
Date:   Fri Oct  7 17:33:07 2016 +0200

st/glsl_to_tgsi: explicit handling of writemask for depth/stencil export

Reviewed-by: Edward O'Callaghan 
Reviewed-by: Dave Airlie 

---

 src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 25 +
 1 file changed, 17 insertions(+), 8 deletions(-)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index f869892..da7b83b 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -2868,19 +2868,28 @@ glsl_to_tgsi_visitor::visit(ir_assignment *ir)
  } else
 l.writemask = WRITEMASK_XYZW;
   }
-   } else if (ir->lhs->type->is_scalar() &&
-  !ir->lhs->type->is_64bit() &&
-  ir->lhs->variable_referenced()->data.mode == ir_var_shader_out) {
-  /* FINISHME: This hack makes writing to gl_FragDepth, which lives in the
-   * FINISHME: W component of fragment shader output zero, work correctly.
-   */
-  l.writemask = WRITEMASK_XYZW;
} else {
   int swizzles[4];
   int first_enabled_chan = 0;
   int rhs_chan = 0;
+  ir_variable *variable = ir->lhs->variable_referenced();
+
+  if (shader->Stage == MESA_SHADER_FRAGMENT &&
+  variable->data.mode == ir_var_shader_out &&
+  (variable->data.location == FRAG_RESULT_DEPTH ||
+   variable->data.location == FRAG_RESULT_STENCIL)) {
+ assert(ir->lhs->type->is_scalar());
+ assert(ir->write_mask == WRITEMASK_X);
 
-  l.writemask = ir->write_mask;
+ if (variable->data.location == FRAG_RESULT_DEPTH)
+l.writemask = WRITEMASK_Z;
+ else {
+assert(variable->data.location == FRAG_RESULT_STENCIL);
+l.writemask = WRITEMASK_Y;
+ }
+  } else {
+ l.writemask = ir->write_mask;
+  }
 
   for (int i = 0; i < 4; i++) {
  if (l.writemask & (1 << i)) {


Mesa (master): st/glsl_to_tgsi: simpler fixup of empty writemasks

2016-10-12 Thread Nicolai Hähnle

Module: Mesa
Branch: master
Commit: f5f3cadca3809952288e3726ed5fde22090dc61d
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=f5f3cadca3809952288e3726ed5fde22090dc61d

Author: Nicolai Hähnle 
Date:   Fri Oct  7 12:49:36 2016 +0200

st/glsl_to_tgsi: simpler fixup of empty writemasks

Empty writemasks mean "copy everything", so we can always just use the number
of vector elements (which uses the GLSL meaning here, i.e. each double is a
single element/writemask bit).
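
As an illustration, the mapping from element count to writemask could look like the sketch below. `bit_consecutive` is a local helper modeled on what u_bit_consecutive is used for in the diff, not a copy of Mesa's implementation, and `writemask_for_elements` is a hypothetical wrapper. The WRITEMASK_* values follow the usual one-bit-per-component encoding.

```c
#include <assert.h>

/* One bit per component: X = bit 0, Y = bit 1, Z = bit 2, W = bit 3. */
enum {
   WRITEMASK_X = 0x1,
   WRITEMASK_XY = 0x3,
   WRITEMASK_XYZ = 0x7,
   WRITEMASK_XYZW = 0xf
};

/* Return a mask with `count` consecutive bits set, starting at bit `start`. */
unsigned
bit_consecutive(unsigned start, unsigned count)
{
   assert(start + count <= 32);
   if (count == 32)
      return ~0u;   /* avoid shifting by the full width of the type */
   return ((1u << count) - 1) << start;
}

/* Writemask covering the first num_elements components of a vector. */
unsigned
writemask_for_elements(unsigned num_elements)
{
   assert(num_elements >= 1 && num_elements <= 4);
   return bit_consecutive(0, num_elements);
}
```

This replaces the four-way switch on vector_elements in the removed code with a single expression.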

Reviewed-by: Edward O'Callaghan 
Reviewed-by: Dave Airlie 

---

 src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 37 --
 1 file changed, 10 insertions(+), 27 deletions(-)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index da7b83b..6a63e5c 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -2842,33 +2842,7 @@ glsl_to_tgsi_visitor::visit(ir_assignment *ir)
 
l = get_assignment_lhs(ir->lhs, this);
 
-   /* FINISHME: This should really set to the correct maximal writemask for 
each
-* FINISHME: component written (in the loops below).  This case can only
-* FINISHME: occur for matrices, arrays, and structures.
-*/
-   if (ir->write_mask == 0) {
-  assert(!ir->lhs->type->is_scalar() && !ir->lhs->type->is_vector());
-
-  if (ir->lhs->type->is_array() || 
ir->lhs->type->without_array()->is_matrix()) {
- if (ir->lhs->type->without_array()->is_64bit()) {
-switch (ir->lhs->type->without_array()->vector_elements) {
-case 1:
-   l.writemask = WRITEMASK_X;
-   break;
-case 2:
-   l.writemask = WRITEMASK_XY;
-   break;
-case 3:
-   l.writemask = WRITEMASK_XYZ;
-   break;
-case 4:
-   l.writemask = WRITEMASK_XYZW;
-   break;
-}
- } else
-l.writemask = WRITEMASK_XYZW;
-  }
-   } else {
+   {
   int swizzles[4];
   int first_enabled_chan = 0;
   int rhs_chan = 0;
@@ -2887,6 +2861,15 @@ glsl_to_tgsi_visitor::visit(ir_assignment *ir)
 assert(variable->data.location == FRAG_RESULT_STENCIL);
 l.writemask = WRITEMASK_Y;
  }
+  } else if (ir->write_mask == 0) {
+ assert(!ir->lhs->type->is_scalar() && !ir->lhs->type->is_vector());
+
+ if (ir->lhs->type->is_array() || ir->lhs->type->is_matrix()) {
+unsigned num_elements = 
ir->lhs->type->without_array()->vector_elements;
+l.writemask = u_bit_consecutive(0, num_elements);
+ } else {
+l.writemask = WRITEMASK_XYZW;
+ }
   } else {
  l.writemask = ir->write_mask;
   }


Mesa (master): nv50/ir: copy over value's register id when resolving merge of a phi

2016-10-12 Thread Ilia Mirkin

Module: Mesa
Branch: master
Commit: 300b5ad023962ee95322e890a9ba57396392407e
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=300b5ad023962ee95322e890a9ba57396392407e

Author: Ilia Mirkin 
Date:   Mon Oct 10 16:57:50 2016 -0400

nv50/ir: copy over value's register id when resolving merge of a phi

The offset needs to be properly copied over to the phi value, otherwise
it will get assigned to the base of the merge instead of the proper
location.

Signed-off-by: Ilia Mirkin 
Reviewed-by: Samuel Pitoiset 
Cc: mesa-sta...@lists.freedesktop.org

---

 src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
index 7e64f7c..d36c853 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
@@ -1905,8 +1905,10 @@ GCRA::resolveSplitsAndMerges()
  // their registers should be identical.
  if (v->getInsn()->op == OP_PHI || v->getInsn()->op == OP_UNION) {
 Instruction *phi = v->getInsn();
-for (int phis = 0; phi->srcExists(phis); ++phis)
+for (int phis = 0; phi->srcExists(phis); ++phis) {
phi->getSrc(phis)->join = v;
+   phi->getSrc(phis)->reg.data.id = v->reg.data.id;
+}
  }
  reg += v->reg.size;
   }


Mesa (master): nvc0/ir: fix textureGather with a single offset

2016-10-12 Thread Ilia Mirkin

Module: Mesa
Branch: master
Commit: a48a343c299a6486a1540cdf7d083f38aa4ace55
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=a48a343c299a6486a1540cdf7d083f38aa4ace55

Author: Ilia Mirkin 
Date:   Wed Oct 12 10:24:59 2016 -0400

nvc0/ir: fix textureGather with a single offset

Recent fix for non-const offsets broke the case of a single offset (vs 4
offsets). The later code relies on the offs array to contain null values
to tell whether they should be added onto the srcs list.
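
A rough sketch of the lazy-allocation idea, written in C with hypothetical types rather than the nv50 IR classes. Slots start out NULL and are only filled in when an offset actually lands in them, so the later "is this slot non-NULL" test counts one source for a single offset and two sources for four offsets.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a scratch value holding packed offset fields. */
struct scratch {
   unsigned fields_written;
};

/* Fill offs[] for use_offsets offsets of two components each, and return
 * how many slots would later be appended to the source list. */
unsigned
pack_offsets(unsigned use_offsets, struct scratch storage[2],
             struct scratch *offs[2])
{
   assert(use_offsets <= 4);
   offs[0] = offs[1] = NULL;

   for (unsigned n = 0; n < use_offsets; n++) {
      for (unsigned c = 0; c < 2; c++) {
         if ((n % 2) == 0 && c == 0) {
            /* First field of a slot: allocate the slot only now. */
            offs[n / 2] = &storage[n / 2];
            offs[n / 2]->fields_written = 1;
         } else {
            /* Later fields are inserted into the existing slot. */
            offs[n / 2]->fields_written++;
         }
      }
   }

   unsigned used = 0;
   for (unsigned s = 0; s < 2; s++)
      if (offs[s] != NULL)
         used++;
   return used;
}
```

Allocating both slots up front, as the code before this fix did, would make the final loop count two sources even when only one offset is present.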

Fixes: 5239bd592 ("nvc0/ir: fix overwriting of value backing non-constant 
gather offset")
Signed-off-by: Ilia Mirkin 
Reviewed-by: Samuel Pitoiset 
Cc: mesa-sta...@lists.freedesktop.org

---

 src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
index 4c013c4..dab3e2d 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
@@ -820,11 +820,11 @@ NVC0LoweringPass::handleTEX(TexInstruction *i)
  // Either there is 1 offset, which goes into the 2 low bytes of the
  // first source, or there are 4 offsets, which go into 2 sources (8
  // values, 1 byte each).
- Value *offs[2] = {bld.getScratch(), bld.getScratch()};
+ Value *offs[2] = {NULL, NULL};
  for (n = 0; n < i->tex.useOffsets; n++) {
 for (c = 0; c < 2; ++c) {
if ((n % 2) == 0 && c == 0)
-  bld.mkMov(offs[n / 2], i->offset[n][c].get());
+  bld.mkMov(offs[n / 2] = bld.getScratch(), 
i->offset[n][c].get());
else
   bld.mkOp3(OP_INSBF, TYPE_U32,
 offs[n / 2],


Mesa (master): radv: add all headers to the sources list

2016-10-12 Thread Emil Velikov

Module: Mesa
Branch: master
Commit: 3c419a941a7a0cc0645e23bb2de2285d6c096f79
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=3c419a941a7a0cc0645e23bb2de2285d6c096f79

Author: Emil Velikov 
Date:   Wed Oct 12 01:03:25 2016 +0100

radv: add all headers to the sources list

Otherwise they'll be missing from the tarball and the build will fail.

Signed-off-by: Emil Velikov 

---

 src/amd/vulkan/Makefile.sources | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/src/amd/vulkan/Makefile.sources b/src/amd/vulkan/Makefile.sources
index 97fd0b6..a8857e9 100644
--- a/src/amd/vulkan/Makefile.sources
+++ b/src/amd/vulkan/Makefile.sources
@@ -21,15 +21,21 @@
 
 RADV_WS_AMDGPU_FILES := \
winsys/amdgpu/radv_amdgpu_bo.c \
+   winsys/amdgpu/radv_amdgpu_bo.h \
winsys/amdgpu/radv_amdgpu_cs.c \
+   winsys/amdgpu/radv_amdgpu_cs.h \
winsys/amdgpu/radv_amdgpu_surface.c \
+   winsys/amdgpu/radv_amdgpu_surface.h \
winsys/amdgpu/radv_amdgpu_winsys.c \
-   winsys/amdgpu/radv_amdgpu_winsys.h
+   winsys/amdgpu/radv_amdgpu_winsys.h \
+   winsys/amdgpu/radv_amdgpu_winsys_public.h
 
 VULKAN_FILES := \
radv_cmd_buffer.c \
+   radv_cs.h \
radv_device.c \
radv_descriptor_set.c \
+   radv_descriptor_set.h \
radv_formats.c \
radv_image.c \
radv_meta.c \
@@ -47,11 +53,16 @@ VULKAN_FILES := \
radv_pass.c \
radv_pipeline.c \
radv_pipeline_cache.c \
+   radv_private.h \
+   radv_radeon_winsys.h \
radv_query.c \
radv_util.c \
+   radv_util.h \
radv_wsi.c \
+   radv_wsi.h \
si_cmd_buffer.c \
vk_format_table.c \
+   vk_format.h \
$(RADV_WS_AMDGPU_FILES)
 
 VULKAN_WSI_WAYLAND_FILES := \


Mesa (master): swr: automake: add ar_eventhandlerfile_h.template to the tarball

2016-10-12 Thread Emil Velikov

Module: Mesa
Branch: master
Commit: a4622305e67dbb3ed224fa966160616688e43ee8
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=a4622305e67dbb3ed224fa966160616688e43ee8

Author: Emil Velikov 
Date:   Wed Oct 12 16:06:47 2016 +0100

swr: automake: add ar_eventhandlerfile_h.template to the tarball

Signed-off-by: Emil Velikov 

---

 src/gallium/drivers/swr/Makefile.am | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/swr/Makefile.am 
b/src/gallium/drivers/swr/Makefile.am
index 4299489..dd1c2e6 100644
--- a/src/gallium/drivers/swr/Makefile.am
+++ b/src/gallium/drivers/swr/Makefile.am
@@ -261,4 +261,5 @@ EXTRA_DIST = \
rasterizer/scripts/templates/knobs.template \
rasterizer/scripts/templates/ar_event_h.template \
rasterizer/scripts/templates/ar_event_cpp.template \
-   rasterizer/scripts/templates/ar_eventhandler_h.template
+   rasterizer/scripts/templates/ar_eventhandler_h.template \
+   rasterizer/scripts/templates/ar_eventhandlerfile_h.template


Mesa (master): st/mesa: only flip stipple pattern for winsys fbo's

2016-10-12 Thread Ilia Mirkin

Module: Mesa
Branch: master
Commit: e6a693c447213b578f562edad8f15ccc43056acd
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=e6a693c447213b578f562edad8f15ccc43056acd

Author: Ilia Mirkin 
Date:   Wed Oct 12 14:01:34 2016 -0400

st/mesa: only flip stipple pattern for winsys fbo's

Gallium is completely oblivious to whether the fbo is flipped or not.
Only flip the stipple pattern when the fbo is flipped as well. Otherwise
the driver has no idea when to unflip the pattern.
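
The decision can be sketched as below, assuming a 32-row stipple pattern with one 32-bit word per row. `flip_stipple` is modeled on what invert_stipple is used for in the diff and may differ from the real helper; `prepare_stipple` and the `is_user_fbo` flag are hypothetical stand-ins for the _mesa_is_user_fbo check.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Flip the pattern vertically relative to a window of the given height.
 * The pattern repeats every 32 rows, hence the mask with 0x1f. */
void
flip_stipple(uint32_t dest[32], const uint32_t src[32], unsigned height)
{
   for (unsigned i = 0; i < 32; i++)
      dest[i] = src[(height - 1 - i) & 0x1f];
}

/* User FBOs are not flipped, so the pattern is copied unchanged;
 * window-system framebuffers get the flipped pattern. */
void
prepare_stipple(uint32_t dest[32], const uint32_t src[32],
                unsigned height, bool is_user_fbo)
{
   if (is_user_fbo)
      memcpy(dest, src, 32 * sizeof(uint32_t));
   else
      flip_stipple(dest, src, height);
}
```

Keeping the choice in the state tracker means the driver always receives a pattern that is already oriented for the current draw buffer.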

Fixes bin/gl-2.1-polygon-stipple-fs -fbo

Signed-off-by: Ilia Mirkin 
Reviewed-by: Brian Paul 
Tested-by: Brian Paul 
Reviewed-by: Marek Olšák 

---

 src/mesa/state_tracker/st_atom_stipple.c | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/src/mesa/state_tracker/st_atom_stipple.c 
b/src/mesa/state_tracker/st_atom_stipple.c
index a30215f..5f7bf82 100644
--- a/src/mesa/state_tracker/st_atom_stipple.c
+++ b/src/mesa/state_tracker/st_atom_stipple.c
@@ -61,7 +61,7 @@ invert_stipple(GLuint dest[32], const GLuint src[32], GLuint 
winHeight)
 
 
 
-static void 
+static void
 update_stipple( struct st_context *st )
 {
const struct gl_context *ctx = st->ctx;
@@ -74,8 +74,12 @@ update_stipple( struct st_context *st )
 
   memcpy(st->state.poly_stipple, ctx->PolygonStipple, sz);
 
-  invert_stipple(newStipple.stipple, ctx->PolygonStipple,
- ctx->DrawBuffer->Height);
+  if (_mesa_is_user_fbo(ctx->DrawBuffer)) {
+ memcpy(newStipple.stipple, ctx->PolygonStipple, 
sizeof(newStipple.stipple));
+  } else {
+ invert_stipple(newStipple.stipple, ctx->PolygonStipple,
+ctx->DrawBuffer->Height);
+  }
 
   st->pipe->set_polygon_stipple(st->pipe, &newStipple);
}


Mesa (master): radv: Return correct result in EnumeratePhysicalDevices

2016-10-12 Thread Dave Airlie

Module: Mesa
Branch: master
Commit: 35e2bfa6d912ad3ef57195b0e8f31f21eb64678e
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=35e2bfa6d912ad3ef57195b0e8f31f21eb64678e

Author: Nicolas Koch 
Date:   Wed Oct 12 13:55:46 2016 +0200

radv: Return correct result in EnumeratePhysicalDevices

If pPhysicalDevices is too small for all physical devices,
the driver must return VK_INCOMPLETE. Since only a single
physical device is supported, this is only the case when
pPhysicalDeviceCount == 0 && pPhysicalDevices != NULL.
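
A minimal sketch of the two-call enumeration pattern the message refers to, using hypothetical types instead of the Vulkan headers. The numeric values of the result codes match VK_SUCCESS and VK_INCOMPLETE; everything else, including the function name and the use of plain ints as device handles, is an assumption for illustration and is more general than the single-device case handled in the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result codes; the values mirror VK_SUCCESS and VK_INCOMPLETE. */
typedef enum {
   RESULT_SUCCESS = 0,
   RESULT_INCOMPLETE = 5
} result_t;

/* If out is NULL, report how many devices exist. Otherwise copy at most
 * *count devices, store the number written in *count, and report
 * RESULT_INCOMPLETE when the caller's array was too small for all of them. */
result_t
enumerate_devices(const int *available, uint32_t available_count,
                  uint32_t *count, int *out)
{
   assert(count != NULL);

   if (out == NULL) {
      *count = available_count;
      return RESULT_SUCCESS;
   }

   uint32_t n = *count < available_count ? *count : available_count;
   for (uint32_t i = 0; i < n; i++)
      out[i] = available[i];
   *count = n;

   return n < available_count ? RESULT_INCOMPLETE : RESULT_SUCCESS;
}
```

With exactly one device available, the only way to hit the incomplete case is to pass a non-NULL array together with a count of zero, which is the situation the commit message describes.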

Signed-off-by: Dave Airlie 

---

 src/amd/vulkan/radv_device.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index 6e06863..71b1481 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -295,6 +295,8 @@ VkResult radv_EnumeratePhysicalDevices(
} else if (*pPhysicalDeviceCount >= 1) {
pPhysicalDevices[0] = 
radv_physical_device_to_handle(&instance->physicalDevice);
*pPhysicalDeviceCount = 1;
+   } else if (*pPhysicalDeviceCount < instance->physicalDeviceCount) {
+   return VK_INCOMPLETE;
} else {
*pPhysicalDeviceCount = 0;
}

