Re: [Mesa-dev] [PATCH] i965: Make TCS precompile use the TES primitive mode when available.

2016-01-02 Thread Jordan Justen
Reviewed-by: Jordan Justen 

On 2016-01-01 23:23:51, Kenneth Graunke wrote:
> If there's a linked TES program, we should just use the actual
> primitive mode.  If not, just guess triangles (as we did before).
> 
> Signed-off-by: Kenneth Graunke 
> ---
>  src/mesa/drivers/dri/i965/brw_tcs.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_tcs.c 
> b/src/mesa/drivers/dri/i965/brw_tcs.c
> index 2c925e7..7e41426 100644
> --- a/src/mesa/drivers/dri/i965/brw_tcs.c
> +++ b/src/mesa/drivers/dri/i965/brw_tcs.c
> @@ -307,7 +307,9 @@ brw_tcs_precompile(struct gl_context *ctx,
> /* Guess that the input and output patches have the same dimensionality. 
> */
> key.input_vertices = shader_prog->TessCtrl.VerticesOut;
>  
> -   key.tes_primitive_mode = GL_TRIANGLES;
> +   key.tes_primitive_mode =
> +  shader_prog->_LinkedShaders[MESA_SHADER_TESS_EVAL] ?
> +  shader_prog->TessEval.PrimitiveMode : GL_TRIANGLES;
>  
> key.outputs_written = prog->OutputsWritten;
> key.patch_outputs_written = prog->PatchOutputsWritten;
> -- 
> 2.6.4
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
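
For readers skimming the thread, the selection logic in the patch can be
sketched on its own. This is only an illustration: the sketch_* structures
below are hypothetical, cut-down stand-ins for the Mesa types, and the two
GL constants are redefined locally so the snippet builds without GL headers.

```c
#include <assert.h>
#include <stddef.h>

/* Values match the OpenGL headers; defined here so the sketch builds alone. */
#define GL_TRIANGLES 0x0004
#define GL_QUADS     0x0007

/* Hypothetical stand-in for a linked shader stage. */
struct sketch_linked_shader { int unused; };

/* Hypothetical stand-in for the shader program fields the patch reads. */
struct sketch_shader_program {
   struct sketch_linked_shader *linked_tes; /* NULL when no TES is linked */
   unsigned tes_primitive_mode;             /* only meaningful with a TES */
};

/* Pick the primitive mode for the precompile key: the real one if a TES
 * is linked, otherwise fall back to the old guess of triangles. */
static unsigned
sketch_guess_tes_primitive_mode(const struct sketch_shader_program *prog)
{
   assert(prog != NULL);
   return prog->linked_tes ? prog->tes_primitive_mode : GL_TRIANGLES;
}
```

The fallback only matters for programs without a linked TES, where the
precompile key is a guess anyway.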


[Mesa-dev] [Bug 93561] ninja: error: '$(PRIVATE_SCRIPT)', needed by 'out/target/product/rpi2/gen/STATIC_LIBRARIES/libmesa_dri_common_intermediates/xmlpool/options.h', missing and no known rule to make

2016-01-02 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=93561

Bug ID: 93561
   Summary: ninja: error: '$(PRIVATE_SCRIPT)', needed by
'out/target/product/rpi2/gen/STATIC_LIBRARIES/libmesa_
dri_common_intermediates/xmlpool/options.h', missing
and no known rule to make it
   Product: Mesa
   Version: git
  Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: NEW
  Severity: normal
  Priority: medium
 Component: Mesa core
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: changuan...@hotmail.my
QA Contact: mesa-dev@lists.freedesktop.org

external/mesa3d/src/mesa/drivers/dri/common/Android.mk:85: kati doesn't support
.SECONDEXPANSION
Starting build with ninja
ninja: Entering directory `.'
ninja: error: '$(PRIVATE_SCRIPT)', needed by
'out/target/product/rpi2/gen/STATIC_LIBRARIES/libmesa_dri_common_intermediates/xmlpool/options.h',
missing and no known rule to make it
make: *** [ninja_wrapper] Error 1

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH 1/2] glsl: Fix undefined shifts.

2016-01-02 Thread Jordan Justen
Reviewed-by: Jordan Justen 

On 2015-12-30 12:26:25, Matt Turner wrote:
> Shifting into the sign bit is undefined, as is shifting by 32.
> ---
>  src/glsl/ir_constant_expression.cpp | 10 +-
>  src/glsl/nir/nir_opcodes.py |  6 +++---
>  2 files changed, 8 insertions(+), 8 deletions(-)
> 
> diff --git a/src/glsl/ir_constant_expression.cpp 
> b/src/glsl/ir_constant_expression.cpp
> index 5bf5ce5..cf62f96 100644
> --- a/src/glsl/ir_constant_expression.cpp
> +++ b/src/glsl/ir_constant_expression.cpp
> @@ -1539,10 +1539,10 @@ ir_expression::constant_expression_value(struct 
> hash_table *variable_context)
>  data.i[c] = -1;
>   else {
>  int count = 0;
> -int top_bit = op[0]->type->base_type == GLSL_TYPE_UINT
> -  ? 0 : v & (1 << 31);
> +unsigned top_bit = op[0]->type->base_type == GLSL_TYPE_UINT
> +   ? 0 : v & (1u << 31);
>  
> -while (((v & (1 << 31)) == top_bit) && count != 32) {
> +while (((v & (1u << 31)) == top_bit) && count != 32) {
> count++;
> v <<= 1;
>  }
> @@ -1628,7 +1628,7 @@ ir_expression::constant_expression_value(struct 
> hash_table *variable_context)
>   else if (offset + bits > 32)
>  data.u[c] = 0; /* Undefined for bitfieldInsert, per spec. */
>   else
> -data.u[c] = ((1 << bits) - 1) << offset;
> +data.u[c] = ((1ul << bits) - 1) << offset;
>}
>break;
> }
> @@ -1738,7 +1738,7 @@ ir_expression::constant_expression_value(struct 
> hash_table *variable_context)
>   else if (offset + bits > 32)
>  data.u[c] = 0; /* Undefined, per spec. */
>   else {
> -unsigned insert_mask = ((1 << bits) - 1) << offset;
> +unsigned insert_mask = ((1ul << bits) - 1) << offset;
>  
>  unsigned insert = op[1]->value.u[c];
>  insert <<= offset;
> diff --git a/src/glsl/nir/nir_opcodes.py b/src/glsl/nir/nir_opcodes.py
> index 1cd01a4..e8b5123 100644
> --- a/src/glsl/nir/nir_opcodes.py
> +++ b/src/glsl/nir/nir_opcodes.py
> @@ -516,7 +516,7 @@ int offset = src0, bits = src1;
>  if (offset < 0 || bits < 0 || offset + bits > 32)
> dst = 0; /* undefined per the spec */
>  else
> -   dst = ((1 << bits)- 1) << offset;
> +   dst = ((1ul << bits) - 1) << offset;
>  """)
>  
>  opcode("ldexp", 0, tfloat, [0, 0], [tfloat, tint], "", """
> @@ -578,7 +578,7 @@ if (bits == 0) {
>  } else if (bits < 0 || offset < 0 || offset + bits > 32) {
> dst = 0; /* undefined per the spec */
>  } else {
> -   dst = (base >> offset) & ((1 << bits) - 1);
> +   dst = (base >> offset) & ((1ul << bits) - 1);
>  }
>  """)
>  opcode("ibitfield_extract", 0, tint,
> @@ -618,7 +618,7 @@ if (bits == 0) {
>  } else if (offset < 0 || bits < 0 || bits + offset > 32) {
> dst = 0;
>  } else {
> -   unsigned mask = ((1 << bits) - 1) << offset;
> +   unsigned mask = ((1ul << bits) - 1) << offset;
> dst = (base & ~mask) | ((insert << bits) & mask);
>  }
>  """)
> -- 
> 2.4.9
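
The two hazards named in the commit message can be shown in a standalone
sketch. The helper names are hypothetical and not part of Mesa. One
assumption to flag: the sketch widens to a fixed 64-bit type, whereas the
patch uses 1ul, whose width depends on the platform's unsigned long.

```c
#include <assert.h>
#include <stdint.h>

/* Mask with `bits` low bits set, shifted up by `offset`.
 * Widening to 64 bits before shifting keeps bits == 32 well defined;
 * a plain (1 << 32) on a 32-bit int would be undefined behaviour. */
static uint32_t
sketch_bitfield_mask(int offset, int bits)
{
   assert(offset >= 0 && bits >= 0 && offset + bits <= 32);
   return (uint32_t)(((UINT64_C(1) << bits) - 1) << offset);
}

/* Test the top bit with an unsigned constant, so no signed 1 is ever
 * shifted into the sign bit. */
static int
sketch_top_bit_set(uint32_t v)
{
   return (v & (UINT32_C(1) << 31)) != 0;
}
```

With these helpers, sketch_bitfield_mask(0, 32) yields 0xffffffff instead
of relying on an out-of-range shift.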
> 


[Mesa-dev] [Bug 90264] [Regression, bisected] Tooltip corruption in Chrome

2016-01-02 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=90264

--- Comment #59 from pavi...@yahoo.fr ---
I hope the bug you filed about this will get some attention.
https://code.google.com/p/chromium/issues/detail?id=505969

But you didn't say whether something still needs to be fixed in nouveau ;)



[Mesa-dev] [PATCH v3] gallium/tests: fix build with clang compiler

2016-01-02 Thread Samuel Pitoiset
Nested functions are supported as an extension in GNU C, but Clang
doesn't support them.

This fixes compilation errors when building compute.c manually,
or when passing --enable-gallium-tests to the configure script.
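
As a rough illustration of the refactoring pattern (not the actual test
code), a callback that used to be a nested function can be hoisted to file
scope and passed by pointer. The sketch_* names are hypothetical, and the
meaning of the s argument (element size) is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback type, modelled on the init/expect callbacks
 * that compute.c hands to init_tex() and check_tex(). */
typedef void (*sketch_fill_fn)(void *p, int s, int x, int y);

/* Previously this would have been a GNU C nested function; as a static
 * function at file scope it is plain C and builds with Clang too. */
static void sketch_default_init(void *p, int s, int x, int y)
{
   (void)s; (void)x; (void)y;
   *(uint32_t *)p = 0xdeadbeef;
}

/* Hypothetical helper: run a callback over a 1D buffer of 32-bit
 * elements. `s` is assumed to be the element size. */
static void sketch_fill(uint32_t *buf, int count, sketch_fill_fn fn)
{
   assert(buf != NULL && fn != NULL);
   for (int x = 0; x < count; x++)
      fn(&buf[x], (int)sizeof(buf[0]), x, 0);
}
```

This works here because the callbacks do not capture any local variables
of the enclosing test, so nothing is lost by moving them out.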

Changes from v3:
 - refactor by introducing test_default_init()

Changes from v2:
 - fix typo

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75165

Signed-off-by: Samuel Pitoiset 
---
 src/gallium/tests/trivial/compute.c | 603 
 1 file changed, 330 insertions(+), 273 deletions(-)

diff --git a/src/gallium/tests/trivial/compute.c 
b/src/gallium/tests/trivial/compute.c
index bcdfb11..5ce12ab 100644
--- a/src/gallium/tests/trivial/compute.c
+++ b/src/gallium/tests/trivial/compute.c
@@ -428,6 +428,35 @@ static void launch_grid(struct context *ctx, const uint 
*block_layout,
 pipe->launch_grid(pipe, block_layout, grid_layout, pc, input);
 }
 
+static void test_default_init(void *p, int s, int x, int y)
+{
+*(uint32_t *)p = 0xdeadbeef;
+}
+
+/* test_system_values */
+static void test_system_values_expect(void *p, int s, int x, int y)
+{
+int id = x / 16, sv = (x % 16) / 4, c = x % 4;
+int tid[] = { id % 20, (id % 240) / 20, id / 240, 0 };
+int bsz[] = { 4, 3, 5, 1};
+int gsz[] = { 5, 4, 1, 1};
+
+switch (sv) {
+case 0:
+*(uint32_t *)p = tid[c] / bsz[c];
+break;
+case 1:
+*(uint32_t *)p = bsz[c];
+break;
+case 2:
+*(uint32_t *)p = gsz[c];
+break;
+case 3:
+*(uint32_t *)p = tid[c] % bsz[c];
+break;
+}
+}
+
 static void test_system_values(struct context *ctx)
 {
 const char *src = "COMP\n"
@@ -461,44 +490,31 @@ static void test_system_values(struct context *ctx)
 "  STORE RES[0].xyzw, TEMP[0], SV[3]\n"
 "  RET\n"
 "ENDSUB\n";
-void init(void *p, int s, int x, int y) {
-*(uint32_t *)p = 0xdeadbeef;
-}
-void expect(void *p, int s, int x, int y) {
-int id = x / 16, sv = (x % 16) / 4, c = x % 4;
-int tid[] = { id % 20, (id % 240) / 20, id / 240, 0 };
-int bsz[] = { 4, 3, 5, 1};
-int gsz[] = { 5, 4, 1, 1};
-
-switch (sv) {
-case 0:
-*(uint32_t *)p = tid[c] / bsz[c];
-break;
-case 1:
-*(uint32_t *)p = bsz[c];
-break;
-case 2:
-*(uint32_t *)p = gsz[c];
-break;
-case 3:
-*(uint32_t *)p = tid[c] % bsz[c];
-break;
-}
-}
 
 printf("- %s\n", __func__);
 
 init_prog(ctx, 0, 0, 0, src, NULL);
 init_tex(ctx, 0, PIPE_BUFFER, true, PIPE_FORMAT_R32_FLOAT,
- 76800, 0, init);
+ 76800, 0, test_default_init);
 init_compute_resources(ctx, (int []) { 0, -1 });
 launch_grid(ctx, (uint []){4, 3, 5}, (uint []){5, 4, 1}, 0, NULL);
-check_tex(ctx, 0, expect, NULL);
+check_tex(ctx, 0, test_system_values_expect, NULL);
 destroy_compute_resources(ctx);
 destroy_tex(ctx);
 destroy_prog(ctx);
 }
 
+/* test_resource_access */
+static void test_resource_access_init0(void *p, int s, int x, int y)
+{
+*(float *)p = 8.0 - (float)x;
+}
+
+static void test_resource_access_expect(void *p, int s, int x, int y)
+{
+*(float *)p = 8.0 - (float)((x + 4 * y) & 0x3f);
+}
+
 static void test_resource_access(struct context *ctx)
 {
 const char *src = "COMP\n"
@@ -519,31 +535,33 @@ static void test_resource_access(struct context *ctx)
 "   STORE RES[1].xyzw, TEMP[1], TEMP[0]\n"
 "   RET\n"
 "ENDSUB\n";
-void init0(void *p, int s, int x, int y) {
-*(float *)p = 8.0 - (float)x;
-}
-void init1(void *p, int s, int x, int y) {
-*(uint32_t *)p = 0xdeadbeef;
-}
-void expect(void *p, int s, int x, int y) {
-*(float *)p = 8.0 - (float)((x + 4*y) & 0x3f);
-}
 
 printf("- %s\n", __func__);
 
 init_prog(ctx, 0, 0, 0, src, NULL);
 init_tex(ctx, 0, PIPE_BUFFER, true, PIPE_FORMAT_R32_FLOAT,
- 256, 0, init0);
+ 256, 0, test_resource_access_init0);
 init_tex(ctx, 1, PIPE_TEXTURE_2D, true, PIPE_FORMAT_R32_FLOAT,
- 60, 12, init1);
+ 60, 12, test_default_init);
 init_compute_resources(ctx, (int []) { 0, 1, -1 });
 launch_grid(ctx, (uint []){1, 1, 1}, (uint []){15, 12, 1}, 0, NULL);
-check_tex(ctx, 1, expect, NULL);
+

Re: [Mesa-dev] [PATCH v4] nv50, nvc0: optimize coherent buffer checking at draw time

2016-01-02 Thread Ilia Mirkin
Reviewed-by: Ilia Mirkin 

On Sat, Jan 2, 2016 at 12:09 PM, Samuel Pitoiset
 wrote:
> Instead of iterating over all the buffer resources looking for coherent
> buffers, we keep track of a context-wide count. This will save some
> iterations (and CPU cycles) in 99.99% of cases, because coherent
> buffers are rarely used.
>
> Changes from v4:
>  - fix flag for textures
>
> Changes from v3:
>  - check if views[i] and views[i]->texture are not NULL
>  - fix use of nv50->textures_coherent
>  - check if vb[i].buffer is not NULL
>  - clear out the flag for UBO
>
> Changes from v2:
>  - forgot to apply some changes for nv50 (texture/vertex bufs)
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/gallium/drivers/nouveau/nv50/nv50_context.h |  3 ++
>  src/gallium/drivers/nouveau/nv50/nv50_state.c   | 25 +++
>  src/gallium/drivers/nouveau/nv50/nv50_vbo.c | 42 
> +
>  src/gallium/drivers/nouveau/nvc0/nvc0_context.h |  3 ++
>  src/gallium/drivers/nouveau/nvc0/nvc0_state.c   | 36 +
>  src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c | 41 +---
>  6 files changed, 82 insertions(+), 68 deletions(-)
>
> diff --git a/src/gallium/drivers/nouveau/nv50/nv50_context.h 
> b/src/gallium/drivers/nouveau/nv50/nv50_context.h
> index 2cebcd9..712d00e 100644
> --- a/src/gallium/drivers/nouveau/nv50/nv50_context.h
> +++ b/src/gallium/drivers/nouveau/nv50/nv50_context.h
> @@ -134,9 +134,11 @@ struct nv50_context {
> struct nv50_constbuf constbuf[3][NV50_MAX_PIPE_CONSTBUFS];
> uint16_t constbuf_dirty[3];
> uint16_t constbuf_valid[3];
> +   uint16_t constbuf_coherent[3];
>
> struct pipe_vertex_buffer vtxbuf[PIPE_MAX_ATTRIBS];
> unsigned num_vtxbufs;
> +   uint32_t vtxbufs_coherent;
> struct pipe_index_buffer idxbuf;
> uint32_t vbo_fifo; /* bitmask of vertex elements to be pushed to FIFO */
> uint32_t vbo_user; /* bitmask of vertex buffers pointing to user memory */
> @@ -148,6 +150,7 @@ struct nv50_context {
>
> struct pipe_sampler_view *textures[3][PIPE_MAX_SAMPLERS];
> unsigned num_textures[3];
> +   uint32_t textures_coherent[3];
> struct nv50_tsc_entry *samplers[3][PIPE_MAX_SAMPLERS];
> unsigned num_samplers[3];
>
> diff --git a/src/gallium/drivers/nouveau/nv50/nv50_state.c 
> b/src/gallium/drivers/nouveau/nv50/nv50_state.c
> index de65597..cb04043 100644
> --- a/src/gallium/drivers/nouveau/nv50/nv50_state.c
> +++ b/src/gallium/drivers/nouveau/nv50/nv50_state.c
> @@ -664,6 +664,17 @@ nv50_stage_set_sampler_views(struct nv50_context *nv50, 
> int s,
>if (old)
>   nv50_screen_tic_unlock(nv50->screen, old);
>
> +  if (views[i] && views[i]->texture) {
> + struct pipe_resource *res = views[i]->texture;
> + if (res->target == PIPE_BUFFER &&
> + (res->flags & PIPE_RESOURCE_FLAG_MAP_COHERENT))
> +nv50->textures_coherent[s] |= 1 << i;
> + else
> +nv50->textures_coherent[s] &= ~(1 << i);
> +  } else {
> + nv50->textures_coherent[s] &= ~(1 << i);
> +  }
> +
>       pipe_sampler_view_reference(&nv50->textures[s][i], views[i]);
> }
>
> @@ -847,13 +858,19 @@ nv50_set_constant_buffer(struct pipe_context *pipe, 
> uint shader, uint index,
>nv50->constbuf[s][i].u.data = cb->user_buffer;
>nv50->constbuf[s][i].size = MIN2(cb->buffer_size, 0x1);
>nv50->constbuf_valid[s] |= 1 << i;
> +  nv50->constbuf_coherent[s] &= ~(1 << i);
> } else
> if (res) {
>nv50->constbuf[s][i].offset = cb->buffer_offset;
>nv50->constbuf[s][i].size = MIN2(align(cb->buffer_size, 0x100), 
> 0x1);
>nv50->constbuf_valid[s] |= 1 << i;
> +  if (res->flags & PIPE_RESOURCE_FLAG_MAP_COHERENT)
> + nv50->constbuf_coherent[s] |= 1 << i;
> +  else
> + nv50->constbuf_coherent[s] &= ~(1 << i);
> } else {
>nv50->constbuf_valid[s] &= ~(1 << i);
> +  nv50->constbuf_coherent[s] &= ~(1 << i);
> }
> nv50->constbuf_dirty[s] |= 1 << i;
>
> @@ -1003,6 +1020,7 @@ nv50_set_vertex_buffers(struct pipe_context *pipe,
> if (!vb) {
>nv50->vbo_user &= ~(((1ull << count) - 1) << start_slot);
>nv50->vbo_constant &= ~(((1ull << count) - 1) << start_slot);
> +  nv50->vtxbufs_coherent &= ~(((1ull << count) - 1) << start_slot);
>return;
> }
>
> @@ -1015,9 +1033,16 @@ nv50_set_vertex_buffers(struct pipe_context *pipe,
>  nv50->vbo_constant |= 1 << dst_index;
>   else
>  nv50->vbo_constant &= ~(1 << dst_index);
> + nv50->vtxbufs_coherent &= ~(1 << dst_index);
>} else {
>   nv50->vbo_user &= ~(1 << dst_index);
>   nv50->vbo_constant &= ~(1 << dst_index);
> +
> + if (vb[i].buffer &&
> + vb[i].buffer->flags & PIPE_RESOURCE_FLAG_MAP_COHERENT)
> +nv50->vtxbufs_coherent |= (1 << 

[Mesa-dev] [PATCH] arb_indirect_parameters: add basic rendering tests

2016-01-02 Thread Ilia Mirkin
Creates an array with 3 draws, the last of which is "bad", and makes
sure that the "bad" one is never drawn. Parameter count is supplied from
an earlier XFB draw to ensure that proper fencing occurs.
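
To make the expected behaviour concrete, here is a CPU-only model of the
draw selection; it does not touch GL at all. The record layout follows
DrawArraysIndirectCommand. The clamping rule (executed draws = min of the
buffer-supplied count and the maximum draw count) is my reading of the
extension, and the sketch_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* One indirect draw record, laid out like DrawArraysIndirectCommand. */
struct sketch_draw_arrays_cmd {
   unsigned count, instance_count, first, base_instance;
};

/* Number of records actually drawn when the draw count comes from a
 * buffer and is clamped to max_draw_count (assumed rule, see above). */
static unsigned
sketch_executed_draws(unsigned buffer_count, unsigned max_draw_count)
{
   return buffer_count < max_draw_count ? buffer_count : max_draw_count;
}

/* Sum the vertices that the executed records would emit. */
static unsigned
sketch_total_vertices(const struct sketch_draw_arrays_cmd *cmds,
                      unsigned buffer_count, unsigned max_draw_count)
{
   unsigned n = sketch_executed_draws(buffer_count, max_draw_count);
   unsigned total = 0;

   assert(cmds != NULL);
   for (unsigned i = 0; i < n; i++)
      total += cmds[i].count * cmds[i].instance_count;
   return total;
}
```

With three records of four vertices each and a buffer-supplied count of 2,
only the first two records run, so the third ("bad") one is skipped.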

Signed-off-by: Ilia Mirkin 
---
 tests/spec/CMakeLists.txt  |   1 +
 .../spec/arb_indirect_parameters/CMakeLists.gl.txt |  13 ++
 tests/spec/arb_indirect_parameters/CMakeLists.txt  |   1 +
 .../spec/arb_indirect_parameters/tf-count-arrays.c | 220 
 .../arb_indirect_parameters/tf-count-elements.c| 229 +
 5 files changed, 464 insertions(+)
 create mode 100644 tests/spec/arb_indirect_parameters/CMakeLists.gl.txt
 create mode 100644 tests/spec/arb_indirect_parameters/CMakeLists.txt
 create mode 100644 tests/spec/arb_indirect_parameters/tf-count-arrays.c
 create mode 100644 tests/spec/arb_indirect_parameters/tf-count-elements.c

diff --git a/tests/spec/CMakeLists.txt b/tests/spec/CMakeLists.txt
index 3c4bcfb..a984734 100644
--- a/tests/spec/CMakeLists.txt
+++ b/tests/spec/CMakeLists.txt
@@ -142,3 +142,4 @@ add_subdirectory (mesa_pack_invert)
 add_subdirectory (ext_texture_format_bgra)
 add_subdirectory (oes_draw_elements_base_vertex)
 add_subdirectory (arb_shader_draw_parameters)
+add_subdirectory (arb_indirect_parameters)
diff --git a/tests/spec/arb_indirect_parameters/CMakeLists.gl.txt 
b/tests/spec/arb_indirect_parameters/CMakeLists.gl.txt
new file mode 100644
index 000..88f533d
--- /dev/null
+++ b/tests/spec/arb_indirect_parameters/CMakeLists.gl.txt
@@ -0,0 +1,13 @@
+include_directories(
+   ${GLEXT_INCLUDE_DIR}
+   ${OPENGL_INCLUDE_PATH}
+   ${piglit_SOURCE_DIR}/tests/mesa/util
+)
+
+link_libraries (
+   piglitutil_${piglit_target_api}
+   ${OPENGL_gl_LIBRARY}
+)
+
+piglit_add_executable (arb_indirect_parameters-tf-count-elements 
tf-count-elements.c)
+piglit_add_executable (arb_indirect_parameters-tf-count-arrays 
tf-count-arrays.c)
diff --git a/tests/spec/arb_indirect_parameters/CMakeLists.txt 
b/tests/spec/arb_indirect_parameters/CMakeLists.txt
new file mode 100644
index 000..144a306
--- /dev/null
+++ b/tests/spec/arb_indirect_parameters/CMakeLists.txt
@@ -0,0 +1 @@
+piglit_include_target_api()
diff --git a/tests/spec/arb_indirect_parameters/tf-count-arrays.c 
b/tests/spec/arb_indirect_parameters/tf-count-arrays.c
new file mode 100644
index 000..e88a7ba
--- /dev/null
+++ b/tests/spec/arb_indirect_parameters/tf-count-arrays.c
@@ -0,0 +1,220 @@
+/*
+ * Copyright (C) 2016 Ilia Mirkin
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#include "piglit-util-gl.h"
+
+PIGLIT_GL_TEST_CONFIG_BEGIN
+
+   config.supports_gl_core_version = 31;
+   config.window_visual = PIGLIT_GL_VISUAL_RGBA | PIGLIT_GL_VISUAL_DOUBLE;
+
+PIGLIT_GL_TEST_CONFIG_END
+
+static const char *vs_tf =
+   "#version 140\n"
+   "out int tf;\n"
+   "uniform int tf_val;\n"
+   "void main() { gl_Position = vec4(0); tf = tf_val; }\n";
+
+static const char *vs_draw =
+   "#version 140\n"
+   "out vec4 color;\n"
+   "in vec4 vtx, in_color;\n"
+   "void main() { gl_Position = vtx; color = in_color; }\n";
+
+static const char *fs_draw =
+   "#version 140\n"
+   "out vec4 c;\n"
+   "in vec4 color;\n"
+   "void main() { c = color; }\n";
+
+static GLint tf_prog, draw_prog;
+static GLint tf_val;
+static GLuint tf_vao, draw_vao;
+
+void
+piglit_init(int argc, char **argv)
+{
+   static const char *varying = "tf";
+   static const unsigned cmds[] = {
+   4, 1, 0, 0,
+   4, 1, 4, 0,
+   4, 1, 8, 0,
+   };
+   static const struct {
+   float vertex_array[12 * 2];
+   float colors[12 * 4];
+   } geometry = {
+   {
+   -1, -1,
+   0, -1,
+   0, 1,
+  

Re: [Mesa-dev] [PATCH 2/2] glsl: Handle bits=32 case in bitfieldInsert/bitfieldExtract.

2016-01-02 Thread Jordan Justen
On 2015-12-30 13:26:48, Ilia Mirkin wrote:
> On Wed, Dec 30, 2015 at 3:26 PM, Matt Turner  wrote:
> > The OpenGL specifications for these functions say:
> >
> >    The result will be undefined if <offset> or <bits> is negative, or if
> >    the sum of <offset> and <bits> is greater than the number of bits
> >    used to store the operand.
> >
> > Therefore passing bits=32, offset=0 is legal and defined in GLSL.
> >
> > But the earlier DX11/SM5 bfi/ibfe/ubfe opcodes are specified to accept a
> > bitfield width ranging from 0-31. As such, Intel and AMD instructions
> > read only the low 5 bits of the width operand, making them not compliant
> > with the GLSL spec, so we have to special case the bits=32 case.
> >
> > Checking that offset=0 is not necessary, since for any other value,
> > <offset> + <bits> will be greater than 32, which is specified as
> > generating an undefined result.
> >
> > Fixes:
> >ES31-CTS.shader_bitfield_operation.bitfieldInsert.uint_2
> >ES31-CTS.shader_bitfield_operation.bitfieldInsert.uvec4_3
> >ES31-CTS.shader_bitfield_operation.bitfieldExtract.uvec3_0
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92595
> > ---
> > Yuck. Suggestions welcome.
> 
> Can you make a piglit test? Want to see if nvidia has the same
> problem. According to
> http://docs.nvidia.com/cuda/parallel-thread-execution/#integer-arithmetic-instructions-bfe,
> offset/bits can actually be up to 255 (although I can't fully imagine
> why one might want that). However perhaps the HW differs.
> 

Matt,

Should we move this into the driver then?

-Jordan
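
For context, the special case under discussion can be modelled in plain C.
This is a sketch under assumptions: sketch_hw_ubfe is a hypothetical model
of an instruction that reads only the low five bits of its operands, and
its result for a width of 0 is chosen for illustration, not taken from any
hardware documentation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of an extract instruction that only reads the low
 * five bits of its width and offset operands. */
static uint32_t
sketch_hw_ubfe(uint32_t value, int offset, int bits)
{
   unsigned w = (unsigned)bits & 31;   /* bits == 32 is seen as 0 */
   unsigned o = (unsigned)offset & 31;
   return w == 0 ? 0 : (value >> o) & ((UINT32_C(1) << w) - 1);
}

/* GLSL-style unsigned bitfieldExtract layered on top, with the
 * bits == 32 case peeled off in the same way as the patch does. */
static uint32_t
sketch_bitfield_extract(uint32_t value, int offset, int bits)
{
   assert(offset >= 0 && bits >= 0 && offset + bits <= 32);
   if (bits > 31)
      return value >> offset; /* offset can only be 0 here */
   return sketch_hw_ubfe(value, offset, bits);
}
```

Whether this branch lives in the GLSL built-in or in a driver lowering
pass, the arithmetic is the same; only the place where it is emitted
changes.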

> 
> >
> >  src/glsl/builtin_functions.cpp  | 6 +-
> >  src/glsl/lower_instructions.cpp | 7 +++
> >  2 files changed, 8 insertions(+), 5 deletions(-)
> >
> > diff --git a/src/glsl/builtin_functions.cpp b/src/glsl/builtin_functions.cpp
> > index 602852a..3d5de83 100644
> > --- a/src/glsl/builtin_functions.cpp
> > +++ b/src/glsl/builtin_functions.cpp
> > @@ -4894,7 +4894,11 @@ builtin_builder::_bitfieldExtract(const glsl_type 
> > *type)
> > ir_variable *bits   = in_var(glsl_type::int_type, "bits");
> > MAKE_SIG(type, gpu_shader5_or_es31, 3, value, offset, bits);
> >
> > -   body.emit(ret(expr(ir_triop_bitfield_extract, value, offset, bits)));
> > +   ir_if *if_32 = new(mem_ctx) ir_if(greater(bits, imm(31)));
> > +   if_32->then_instructions.push_tail(ret(rshift(value, offset)));
> > +   if_32->else_instructions.push_tail(
> > +  ret(expr(ir_triop_bitfield_extract, value, offset, bits)));
> > +   body.emit(if_32);
> >
> > return sig;
> >  }
> > diff --git a/src/glsl/lower_instructions.cpp 
> > b/src/glsl/lower_instructions.cpp
> > index 845cfff..8a425a8 100644
> > --- a/src/glsl/lower_instructions.cpp
> > +++ b/src/glsl/lower_instructions.cpp
> > @@ -359,10 +359,9 @@ 
> > lower_instructions_visitor::bitfield_insert_to_bfm_bfi(ir_expression *ir)
> > ir_rvalue *base_expr = ir->operands[0];
> >
> > ir->operation = ir_triop_bfi;
> > -   ir->operands[0] = new(ir) ir_expression(ir_binop_bfm,
> > -   ir->type->get_base_type(),
> > -   ir->operands[3],
> > -   ir->operands[2]);
> > +   ir->operands[0] = lshift(rshift(new(ir) ir_constant(~0u),
> > +   sub(new(ir) ir_constant(32), 
> > ir->operands[3])),
> > +ir->operands[2]);
> > /* ir->operands[1] is still the value to insert. */
> > ir->operands[2] = base_expr;
> > ir->operands[3] = NULL;
> > --
> > 2.4.9
> >


Re: [Mesa-dev] [PATCH v3] gallium/tests: fix build with clang compiler

2016-01-02 Thread eocallaghan
omg I don't know why folks insist on using GNU C nested functions, they
are insane.


Thanks for working through this one!

Reviewed-by: Edward O'Callaghan 

On 2016-01-03 04:20, Samuel Pitoiset wrote:

Nested functions are supported as an extension in GNU C, but Clang
doesn't support them.

This fixes compilation errors when building compute.c manually,
or when passing --enable-gallium-tests to the configure script.

Changes from v3:
 - refactor by introducing test_default_init()

Changes from v2:
 - fix typo

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75165

Signed-off-by: Samuel Pitoiset 
---
 src/gallium/tests/trivial/compute.c | 603 


 1 file changed, 330 insertions(+), 273 deletions(-)

diff --git a/src/gallium/tests/trivial/compute.c
b/src/gallium/tests/trivial/compute.c
index bcdfb11..5ce12ab 100644
--- a/src/gallium/tests/trivial/compute.c
+++ b/src/gallium/tests/trivial/compute.c
@@ -428,6 +428,35 @@ static void launch_grid(struct context *ctx,
const uint *block_layout,
 pipe->launch_grid(pipe, block_layout, grid_layout, pc, input);
 }

+static void test_default_init(void *p, int s, int x, int y)
+{
+*(uint32_t *)p = 0xdeadbeef;
+}
+
+/* test_system_values */
+static void test_system_values_expect(void *p, int s, int x, int y)
+{
+int id = x / 16, sv = (x % 16) / 4, c = x % 4;
+int tid[] = { id % 20, (id % 240) / 20, id / 240, 0 };
+int bsz[] = { 4, 3, 5, 1};
+int gsz[] = { 5, 4, 1, 1};
+
+switch (sv) {
+case 0:
+*(uint32_t *)p = tid[c] / bsz[c];
+break;
+case 1:
+*(uint32_t *)p = bsz[c];
+break;
+case 2:
+*(uint32_t *)p = gsz[c];
+break;
+case 3:
+*(uint32_t *)p = tid[c] % bsz[c];
+break;
+}
+}
+
 static void test_system_values(struct context *ctx)
 {
 const char *src = "COMP\n"
@@ -461,44 +490,31 @@ static void test_system_values(struct context 
*ctx)

 "  STORE RES[0].xyzw, TEMP[0], SV[3]\n"
 "  RET\n"
 "ENDSUB\n";
-void init(void *p, int s, int x, int y) {
-*(uint32_t *)p = 0xdeadbeef;
-}
-void expect(void *p, int s, int x, int y) {
-int id = x / 16, sv = (x % 16) / 4, c = x % 4;
-int tid[] = { id % 20, (id % 240) / 20, id / 240, 0 };
-int bsz[] = { 4, 3, 5, 1};
-int gsz[] = { 5, 4, 1, 1};
-
-switch (sv) {
-case 0:
-*(uint32_t *)p = tid[c] / bsz[c];
-break;
-case 1:
-*(uint32_t *)p = bsz[c];
-break;
-case 2:
-*(uint32_t *)p = gsz[c];
-break;
-case 3:
-*(uint32_t *)p = tid[c] % bsz[c];
-break;
-}
-}

 printf("- %s\n", __func__);

 init_prog(ctx, 0, 0, 0, src, NULL);
 init_tex(ctx, 0, PIPE_BUFFER, true, PIPE_FORMAT_R32_FLOAT,
- 76800, 0, init);
+ 76800, 0, test_default_init);
 init_compute_resources(ctx, (int []) { 0, -1 });
 launch_grid(ctx, (uint []){4, 3, 5}, (uint []){5, 4, 1}, 0, 
NULL);

-check_tex(ctx, 0, expect, NULL);
+check_tex(ctx, 0, test_system_values_expect, NULL);
 destroy_compute_resources(ctx);
 destroy_tex(ctx);
 destroy_prog(ctx);
 }

+/* test_resource_access */
+static void test_resource_access_init0(void *p, int s, int x, int y)
+{
+*(float *)p = 8.0 - (float)x;
+}
+
+static void test_resource_access_expect(void *p, int s, int x, int y)
+{
+*(float *)p = 8.0 - (float)((x + 4 * y) & 0x3f);
+}
+
 static void test_resource_access(struct context *ctx)
 {
 const char *src = "COMP\n"
@@ -519,31 +535,33 @@ static void test_resource_access(struct context 
*ctx)

 "   STORE RES[1].xyzw, TEMP[1], TEMP[0]\n"
 "   RET\n"
 "ENDSUB\n";
-void init0(void *p, int s, int x, int y) {
-*(float *)p = 8.0 - (float)x;
-}
-void init1(void *p, int s, int x, int y) {
-*(uint32_t *)p = 0xdeadbeef;
-}
-void expect(void *p, int s, int x, int y) {
-*(float *)p = 8.0 - (float)((x + 4*y) & 0x3f);
-}

 printf("- %s\n", __func__);

 init_prog(ctx, 0, 0, 0, src, NULL);
 init_tex(ctx, 0, PIPE_BUFFER, true, PIPE_FORMAT_R32_FLOAT,
- 256, 0, init0);
+ 256, 0, test_resource_access_init0);
 init_tex(ctx, 1, PIPE_TEXTURE_2D, true, PIPE_FORMAT_R32_FLOAT,
- 60, 12, init1);
+   

[Mesa-dev] [PATCH v4] nv50, nvc0: optimize coherent buffer checking at draw time

2016-01-02 Thread Samuel Pitoiset
Instead of iterating over all the buffer resources looking for coherent
buffers, we keep track of a context-wide count. This will save some
iterations (and CPU cycles) in 99.99% of cases, because coherent
buffers are rarely used.
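
The bookkeeping idea can be sketched outside the driver as follows. The
sketch_* types and SKETCH_FLAG_MAP_COHERENT are hypothetical stand-ins for
the gallium structures and PIPE_RESOURCE_FLAG_MAP_COHERENT. What the
driver does once it knows a coherent buffer is bound is not modelled here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for PIPE_RESOURCE_FLAG_MAP_COHERENT. */
#define SKETCH_FLAG_MAP_COHERENT (1u << 0)

struct sketch_resource { unsigned flags; };

struct sketch_context {
   uint32_t vtxbufs_coherent; /* bit i set: slot i holds a coherent buffer */
};

/* Bind-time bookkeeping: set or clear the bit for this slot. */
static void
sketch_bind_vertex_buffer(struct sketch_context *ctx, unsigned slot,
                          const struct sketch_resource *res)
{
   assert(ctx != NULL && slot < 32);
   if (res && (res->flags & SKETCH_FLAG_MAP_COHERENT))
      ctx->vtxbufs_coherent |= 1u << slot;
   else
      ctx->vtxbufs_coherent &= ~(1u << slot);
}

/* Draw-time check: one comparison instead of a loop over every buffer. */
static int
sketch_any_coherent(const struct sketch_context *ctx)
{
   return ctx->vtxbufs_coherent != 0;
}
```

The cost moves from every draw call to the much rarer bind calls, which is
where the saving described above comes from.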

Changes from v4:
 - fix flag for textures

Changes from v3:
 - check if views[i] and views[i]->texture are not NULL
 - fix use of nv50->textures_coherent
 - check if vb[i].buffer is not NULL
 - clear out the flag for UBO

Changes from v2:
 - forgot to apply some changes for nv50 (texture/vertex bufs)

Signed-off-by: Samuel Pitoiset 
---
 src/gallium/drivers/nouveau/nv50/nv50_context.h |  3 ++
 src/gallium/drivers/nouveau/nv50/nv50_state.c   | 25 +++
 src/gallium/drivers/nouveau/nv50/nv50_vbo.c | 42 +
 src/gallium/drivers/nouveau/nvc0/nvc0_context.h |  3 ++
 src/gallium/drivers/nouveau/nvc0/nvc0_state.c   | 36 +
 src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c | 41 +---
 6 files changed, 82 insertions(+), 68 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nv50/nv50_context.h 
b/src/gallium/drivers/nouveau/nv50/nv50_context.h
index 2cebcd9..712d00e 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_context.h
+++ b/src/gallium/drivers/nouveau/nv50/nv50_context.h
@@ -134,9 +134,11 @@ struct nv50_context {
struct nv50_constbuf constbuf[3][NV50_MAX_PIPE_CONSTBUFS];
uint16_t constbuf_dirty[3];
uint16_t constbuf_valid[3];
+   uint16_t constbuf_coherent[3];
 
struct pipe_vertex_buffer vtxbuf[PIPE_MAX_ATTRIBS];
unsigned num_vtxbufs;
+   uint32_t vtxbufs_coherent;
struct pipe_index_buffer idxbuf;
uint32_t vbo_fifo; /* bitmask of vertex elements to be pushed to FIFO */
uint32_t vbo_user; /* bitmask of vertex buffers pointing to user memory */
@@ -148,6 +150,7 @@ struct nv50_context {
 
struct pipe_sampler_view *textures[3][PIPE_MAX_SAMPLERS];
unsigned num_textures[3];
+   uint32_t textures_coherent[3];
struct nv50_tsc_entry *samplers[3][PIPE_MAX_SAMPLERS];
unsigned num_samplers[3];
 
diff --git a/src/gallium/drivers/nouveau/nv50/nv50_state.c 
b/src/gallium/drivers/nouveau/nv50/nv50_state.c
index de65597..cb04043 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_state.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_state.c
@@ -664,6 +664,17 @@ nv50_stage_set_sampler_views(struct nv50_context *nv50, 
int s,
   if (old)
  nv50_screen_tic_unlock(nv50->screen, old);
 
+  if (views[i] && views[i]->texture) {
+ struct pipe_resource *res = views[i]->texture;
+ if (res->target == PIPE_BUFFER &&
+ (res->flags & PIPE_RESOURCE_FLAG_MAP_COHERENT))
+nv50->textures_coherent[s] |= 1 << i;
+ else
+nv50->textures_coherent[s] &= ~(1 << i);
+  } else {
+ nv50->textures_coherent[s] &= ~(1 << i);
+  }
+
       pipe_sampler_view_reference(&nv50->textures[s][i], views[i]);
}
 
@@ -847,13 +858,19 @@ nv50_set_constant_buffer(struct pipe_context *pipe, uint 
shader, uint index,
   nv50->constbuf[s][i].u.data = cb->user_buffer;
   nv50->constbuf[s][i].size = MIN2(cb->buffer_size, 0x1);
   nv50->constbuf_valid[s] |= 1 << i;
+  nv50->constbuf_coherent[s] &= ~(1 << i);
} else
if (res) {
   nv50->constbuf[s][i].offset = cb->buffer_offset;
   nv50->constbuf[s][i].size = MIN2(align(cb->buffer_size, 0x100), 0x1);
   nv50->constbuf_valid[s] |= 1 << i;
+  if (res->flags & PIPE_RESOURCE_FLAG_MAP_COHERENT)
+ nv50->constbuf_coherent[s] |= 1 << i;
+  else
+ nv50->constbuf_coherent[s] &= ~(1 << i);
} else {
   nv50->constbuf_valid[s] &= ~(1 << i);
+  nv50->constbuf_coherent[s] &= ~(1 << i);
}
nv50->constbuf_dirty[s] |= 1 << i;
 
@@ -1003,6 +1020,7 @@ nv50_set_vertex_buffers(struct pipe_context *pipe,
if (!vb) {
   nv50->vbo_user &= ~(((1ull << count) - 1) << start_slot);
   nv50->vbo_constant &= ~(((1ull << count) - 1) << start_slot);
+  nv50->vtxbufs_coherent &= ~(((1ull << count) - 1) << start_slot);
   return;
}
 
@@ -1015,9 +1033,16 @@ nv50_set_vertex_buffers(struct pipe_context *pipe,
 nv50->vbo_constant |= 1 << dst_index;
  else
 nv50->vbo_constant &= ~(1 << dst_index);
+ nv50->vtxbufs_coherent &= ~(1 << dst_index);
   } else {
  nv50->vbo_user &= ~(1 << dst_index);
  nv50->vbo_constant &= ~(1 << dst_index);
+
+ if (vb[i].buffer &&
+ vb[i].buffer->flags & PIPE_RESOURCE_FLAG_MAP_COHERENT)
+nv50->vtxbufs_coherent |= (1 << dst_index);
+ else
+nv50->vtxbufs_coherent &= ~(1 << dst_index);
   }
}
 }
diff --git a/src/gallium/drivers/nouveau/nv50/nv50_vbo.c 
b/src/gallium/drivers/nouveau/nv50/nv50_vbo.c
index 2d1aa6a..60fa2bc 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_vbo.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_vbo.c
@@ -765,7 +765,7 @@ 

Re: [Mesa-dev] [PATCH] arb_indirect_parameters: add basic rendering tests

2016-01-02 Thread Ilia Mirkin
Errr... wrong list. And forgot to add to all.py. Please disregard,
will send a fixed version to the right list shortly.

On Sat, Jan 2, 2016 at 3:02 PM, Ilia Mirkin  wrote:
> Creates an array with 3 draws, the last of which is "bad", and makes
> sure that the "bad" one is never drawn. Parameter count is supplied from
> an earlier XFB draw to ensure that proper fencing occurs.
>
> Signed-off-by: Ilia Mirkin 
> ---
>  tests/spec/CMakeLists.txt  |   1 +
>  .../spec/arb_indirect_parameters/CMakeLists.gl.txt |  13 ++
>  tests/spec/arb_indirect_parameters/CMakeLists.txt  |   1 +
>  .../spec/arb_indirect_parameters/tf-count-arrays.c | 220 
>  .../arb_indirect_parameters/tf-count-elements.c| 229 
> +
>  5 files changed, 464 insertions(+)
>  create mode 100644 tests/spec/arb_indirect_parameters/CMakeLists.gl.txt
>  create mode 100644 tests/spec/arb_indirect_parameters/CMakeLists.txt
>  create mode 100644 tests/spec/arb_indirect_parameters/tf-count-arrays.c
>  create mode 100644 tests/spec/arb_indirect_parameters/tf-count-elements.c
>
> diff --git a/tests/spec/CMakeLists.txt b/tests/spec/CMakeLists.txt
> index 3c4bcfb..a984734 100644
> --- a/tests/spec/CMakeLists.txt
> +++ b/tests/spec/CMakeLists.txt
> @@ -142,3 +142,4 @@ add_subdirectory (mesa_pack_invert)
>  add_subdirectory (ext_texture_format_bgra)
>  add_subdirectory (oes_draw_elements_base_vertex)
>  add_subdirectory (arb_shader_draw_parameters)
> +add_subdirectory (arb_indirect_parameters)
> diff --git a/tests/spec/arb_indirect_parameters/CMakeLists.gl.txt 
> b/tests/spec/arb_indirect_parameters/CMakeLists.gl.txt
> new file mode 100644
> index 0000000..88f533d
> --- /dev/null
> +++ b/tests/spec/arb_indirect_parameters/CMakeLists.gl.txt
> @@ -0,0 +1,13 @@
> +include_directories(
> +   ${GLEXT_INCLUDE_DIR}
> +   ${OPENGL_INCLUDE_PATH}
> +   ${piglit_SOURCE_DIR}/tests/mesa/util
> +)
> +
> +link_libraries (
> +   piglitutil_${piglit_target_api}
> +   ${OPENGL_gl_LIBRARY}
> +)
> +
> +piglit_add_executable (arb_indirect_parameters-tf-count-elements 
> tf-count-elements.c)
> +piglit_add_executable (arb_indirect_parameters-tf-count-arrays 
> tf-count-arrays.c)
> diff --git a/tests/spec/arb_indirect_parameters/CMakeLists.txt 
> b/tests/spec/arb_indirect_parameters/CMakeLists.txt
> new file mode 100644
> index 0000000..144a306
> --- /dev/null
> +++ b/tests/spec/arb_indirect_parameters/CMakeLists.txt
> @@ -0,0 +1 @@
> +piglit_include_target_api()
> diff --git a/tests/spec/arb_indirect_parameters/tf-count-arrays.c 
> b/tests/spec/arb_indirect_parameters/tf-count-arrays.c
> new file mode 100644
> index 0000000..e88a7ba
> --- /dev/null
> +++ b/tests/spec/arb_indirect_parameters/tf-count-arrays.c
> @@ -0,0 +1,220 @@
> +/*
> + * Copyright (C) 2016 Ilia Mirkin
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the next
> + * paragraph) shall be included in all copies or substantial portions of the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> + * DEALINGS IN THE SOFTWARE.
> + */
> +
> +#include "piglit-util-gl.h"
> +
> +PIGLIT_GL_TEST_CONFIG_BEGIN
> +
> +   config.supports_gl_core_version = 31;
> +   config.window_visual = PIGLIT_GL_VISUAL_RGBA | 
> PIGLIT_GL_VISUAL_DOUBLE;
> +
> +PIGLIT_GL_TEST_CONFIG_END
> +
> +static const char *vs_tf =
> +   "#version 140\n"
> +   "out int tf;\n"
> +   "uniform int tf_val;\n"
> +   "void main() { gl_Position = vec4(0); tf = tf_val; }\n";
> +
> +static const char *vs_draw =
> +   "#version 140\n"
> +   "out vec4 color;\n"
> +   "in vec4 vtx, in_color;\n"
> +   "void main() { gl_Position = vtx; color = in_color; }\n";
> +
> +static const char *fs_draw =
> +   "#version 140\n"
> +   "out vec4 c;\n"
> +   "in vec4 color;\n"
> +   "void main() { c = color; }\n";
> +
> +static GLint tf_prog, draw_prog;
> +static GLint tf_val;
> +static GLuint tf_vao, draw_vao;
> +
> +void
> +piglit_init(int argc, char **argv)
> +{
> +  

Re: [Mesa-dev] [PATCH 5/9] gallium/radeon: always add +DumpCode to the LLVM target machine for LLVM <= 3.5

2016-01-02 Thread Nicolai Hähnle
What's the reason for always having +DumpCode? Generating the assembly 
is some overhead that's usually unnecessary. Even if it's a small part 
of the profiles I've seen, it still seems like a natural thing to just 
skip. From what I can tell it should be dependent on any of the shader 
dumping flags + DBG_CHECK_VM being set. In any case, I suppose that 
would be for a separate commit.


Cheers,
Nicolai

On 01.01.2016 09:13, Marek Olšák wrote:

From: Marek Olšák 

It's the same behavior that we use for later LLVM.
---
  src/gallium/drivers/r600/r600_llvm.c  | 2 +-
  src/gallium/drivers/radeon/radeon_llvm_emit.c | 5 ++---
  src/gallium/drivers/radeon/radeon_llvm_emit.h | 2 +-
  src/gallium/drivers/radeonsi/si_shader.c  | 2 +-
  4 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/src/gallium/drivers/r600/r600_llvm.c 
b/src/gallium/drivers/r600/r600_llvm.c
index 1cc3031..7d93658 100644
--- a/src/gallium/drivers/r600/r600_llvm.c
+++ b/src/gallium/drivers/r600/r600_llvm.c
@@ -922,7 +922,7 @@ unsigned r600_llvm_compile(
const char * gpu_family = r600_get_llvm_processor_name(family);

memset(&binary, 0, sizeof(struct radeon_shader_binary));
-   r = radeon_llvm_compile(mod, &binary, gpu_family, dump, dump, NULL);
+   r = radeon_llvm_compile(mod, &binary, gpu_family, dump, NULL);

r = r600_create_shader(bc, &binary, use_kill);

diff --git a/src/gallium/drivers/radeon/radeon_llvm_emit.c 
b/src/gallium/drivers/radeon/radeon_llvm_emit.c
index 61ed940..f8c7f54 100644
--- a/src/gallium/drivers/radeon/radeon_llvm_emit.c
+++ b/src/gallium/drivers/radeon/radeon_llvm_emit.c
@@ -141,7 +141,7 @@ static void radeonDiagnosticHandler(LLVMDiagnosticInfoRef 
di, void *context)
   * @returns 0 for success, 1 for failure
   */
  unsigned radeon_llvm_compile(LLVMModuleRef M, struct radeon_shader_binary 
*binary,
-const char *gpu_family, bool dump_ir, bool 
dump_asm,
+const char *gpu_family, bool dump_ir,
 LLVMTargetMachineRef tm)
  {

@@ -165,8 +165,7 @@ unsigned radeon_llvm_compile(LLVMModuleRef M, struct 
radeon_shader_binary *binar
}
strncpy(cpu, gpu_family, CPU_STRING_LEN);
memset(fs, 0, sizeof(fs));
-   if (dump_asm)
-   strncpy(fs, "+DumpCode", FS_STRING_LEN);
+   strncpy(fs, "+DumpCode", FS_STRING_LEN);
tm = LLVMCreateTargetMachine(target, triple, cpu, fs,
  LLVMCodeGenLevelDefault, LLVMRelocDefault,
  LLVMCodeModelDefault);
diff --git a/src/gallium/drivers/radeon/radeon_llvm_emit.h 
b/src/gallium/drivers/radeon/radeon_llvm_emit.h
index e20aed9..5f956dd 100644
--- a/src/gallium/drivers/radeon/radeon_llvm_emit.h
+++ b/src/gallium/drivers/radeon/radeon_llvm_emit.h
@@ -38,7 +38,7 @@ void radeon_llvm_shader_type(LLVMValueRef F, unsigned type);
  LLVMTargetRef radeon_llvm_get_r600_target(const char *triple);

  unsigned radeon_llvm_compile(LLVMModuleRef M, struct radeon_shader_binary 
*binary,
-const char *gpu_family, bool dump_ir, bool 
dump_asm,
+const char *gpu_family, bool dump_ir,
 LLVMTargetMachineRef tm);

  #endif /* RADEON_LLVM_EMIT_H */
diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index a9297a5..4044961 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -3884,7 +3884,7 @@ int si_compile_llvm(struct si_screen *sscreen, struct 
si_shader *shader,
bool dump_ir = dump_asm && !(sscreen->b.debug_flags & DBG_NO_IR);

r = radeon_llvm_compile(mod, &shader->binary,
-   r600_get_llvm_processor_name(sscreen->b.family), dump_ir, 
dump_asm, tm);
+   r600_get_llvm_processor_name(sscreen->b.family), dump_ir, tm);
if (r)
return r;



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 0/9] RadeonSI: Some shaders cleanups

2016-01-02 Thread Nicolai Hähnle

This looks much better now :)

For the series: Reviewed-by: Nicolai Hähnle 

On 01.01.2016 09:13, Marek Olšák wrote:

Hi,

These are shader cleanups mostly around si_compile_llvm.

You may wonder why the "move si_shader_binary_upload out of xxx" patches. They 
are part of my one-variant-per-shader rework, which needs a lot of restructuring.

Besides this, I have 2 more series of cleanup patches, which I will send when 
this lands.

Please review.

Marek


[Mesa-dev] [PATCH 5/6] nvc0/ir: add support for PK2H/UP2H

2016-01-02 Thread Ilia Mirkin
Signed-off-by: Ilia Mirkin 
---
 .../drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp |  1 +
 .../drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp  |  5 -
 .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp  | 23 ++
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.c |  2 +-
 4 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
index e9ddd36..ec74e7a 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
@@ -740,6 +740,7 @@ CodeEmitterGM107::emitF2F()
emitCC   (0x2f);
emitField(0x2d, 1, (insn->op == OP_NEG) || insn->src(0).mod.neg());
emitFMZ  (0x2c, 1);
+   emitField(0x29, 1, insn->subOp);
emitRND  (0x27, rnd, 0x2a);
emitField(0x0a, 2, util_logbase2(typeSizeof(insn->sType)));
emitField(0x08, 2, util_logbase2(typeSizeof(insn->dType)));
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp
index 1d4f0d9..0b28047 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp
@@ -1030,7 +1030,10 @@ CodeEmitterNVC0::emitCVT(Instruction *i)
 
   // for 8/16 source types, the byte/word is in subOp. word 1 is
   // represented as 2.
-  code[1] |= i->subOp << 0x17;
+  if (!isFloatType(i->sType))
+ code[1] |= i->subOp << 0x17;
+  else
+ code[1] |= i->subOp << 0x18;
 
   if (sat)
  code[0] |= 0x20;
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
index beb67fe..e0b9435 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
@@ -319,6 +319,10 @@ unsigned int Instruction::srcMask(unsigned int s) const
  x |= 2;
   return x;
}
+   case TGSI_OPCODE_PK2H:
+  return 0x3;
+   case TGSI_OPCODE_UP2H:
+  return 0x1;
default:
   break;
}
@@ -452,6 +456,7 @@ nv50_ir::DataType Instruction::inferSrcType() const
case TGSI_OPCODE_ATOMUMAX:
case TGSI_OPCODE_UBFE:
case TGSI_OPCODE_UMSB:
+   case TGSI_OPCODE_UP2H:
   return nv50_ir::TYPE_U32;
case TGSI_OPCODE_I2F:
case TGSI_OPCODE_I2D:
@@ -516,10 +521,12 @@ nv50_ir::DataType Instruction::inferDstType() const
case TGSI_OPCODE_DSGE:
case TGSI_OPCODE_DSLT:
case TGSI_OPCODE_DSNE:
+   case TGSI_OPCODE_PK2H:
   return nv50_ir::TYPE_U32;
case TGSI_OPCODE_I2F:
case TGSI_OPCODE_U2F:
case TGSI_OPCODE_D2F:
+   case TGSI_OPCODE_UP2H:
   return nv50_ir::TYPE_F32;
case TGSI_OPCODE_I2D:
case TGSI_OPCODE_U2D:
@@ -2807,6 +2814,22 @@ Converter::handleInstruction(const struct 
tgsi_full_instruction *insn)
   FOR_EACH_DST_ENABLED_CHANNEL(0, c, tgsi)
  mkCvt(OP_CVT, dstTy, dst0[c], srcTy, fetchSrc(0, c));
   break;
+   case TGSI_OPCODE_PK2H:
+  val0 = getScratch();
+  val1 = getScratch();
+  mkCvt(OP_CVT, TYPE_F16, val0, TYPE_F32, fetchSrc(0, 0));
+  mkCvt(OP_CVT, TYPE_F16, val1, TYPE_F32, fetchSrc(0, 1));
+  mkOp3(OP_INSBF, TYPE_U32, dst0[0], val1, mkImm(0x1010), val0);
+  break;
+   case TGSI_OPCODE_UP2H:
+  src0 = fetchSrc(0, 0);
+  if (dst0[0])
+ mkCvt(OP_CVT, TYPE_F32, dst0[0], TYPE_F16, src0);
+  if (dst0[1]) {
+ geni = mkCvt(OP_CVT, TYPE_F32, dst0[1], TYPE_F16, src0);
+ geni->subOp = 1;
+  }
+  break;
case TGSI_OPCODE_EMIT:
   /* export the saved viewport index */
   if (viewport != NULL) {
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c 
b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
index 58b712e..43f6164 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
@@ -197,6 +197,7 @@ nvc0_screen_get_param(struct pipe_screen *pscreen, enum 
pipe_cap param)
case PIPE_CAP_DRAW_PARAMETERS:
case PIPE_CAP_MULTI_DRAW_INDIRECT:
case PIPE_CAP_MULTI_DRAW_INDIRECT_PARAMS:
+   case PIPE_CAP_TGSI_PACK_HALF_FLOAT:
   return 1;
case PIPE_CAP_SEAMLESS_CUBE_MAP_PER_TEXTURE:
   return (class_3d >= NVE4_3D_CLASS) ? 1 : 0;
@@ -219,7 +220,6 @@ nvc0_screen_get_param(struct pipe_screen *pscreen, enum 
pipe_cap param)
case PIPE_CAP_VERTEXID_NOBASE:
case PIPE_CAP_RESOURCE_FROM_USER_MEMORY:
case PIPE_CAP_DEVICE_RESET_STATUS_QUERY:
-   case PIPE_CAP_TGSI_PACK_HALF_FLOAT:
   return 0;
 
case PIPE_CAP_VENDOR_ID:
-- 
2.4.10
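
The PK2H lowering in the patch above converts each source channel to F16 and then merges the two halves with OP_INSBF and the immediate 0x1010. A minimal C sketch of what such a bitfield insert computes follows. The function names are hypothetical, and the control-word layout (low byte = bit offset, next byte = width) is an assumption; with 0x1010 both fields are 16, so the result is the same under either reading.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a bitfield insert: replace `width` bits of `base`,
 * starting at bit `offset`, with the low `width` bits of `insert`. */
static uint32_t bitfield_insert(uint32_t base, uint32_t insert,
                                unsigned offset, unsigned width)
{
   assert(width >= 1 && offset + width <= 32);
   uint32_t mask = (width == 32) ? 0xffffffffu : ((1u << width) - 1);
   return (base & ~(mask << offset)) | ((insert & mask) << offset);
}

/* Decode a packed control word and apply the insert.
 * Assumption: low byte = offset, next byte = width. */
static uint32_t insbf(uint32_t insert, uint32_t ctrl, uint32_t base)
{
   return bitfield_insert(base, insert, ctrl & 0xff, (ctrl >> 8) & 0xff);
}
```

With half-float bit patterns 0x3c00 (1.0) as the base and 0xc000 (-2.0) as the inserted value, insbf(0xc000, 0x1010, 0x3c00) places the second half in bits 16..31, which is the layout PK2H is documented to produce.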



[Mesa-dev] [PATCH 3/6] gallium: add PIPE_CAP_TGSI_PACK_HALF_FLOAT to indicate UP2H/PK2H support

2016-01-02 Thread Ilia Mirkin
Signed-off-by: Ilia Mirkin 
---
 src/gallium/docs/source/screen.rst   | 2 ++
 src/gallium/drivers/freedreno/freedreno_screen.c | 1 +
 src/gallium/drivers/i915/i915_screen.c   | 1 +
 src/gallium/drivers/ilo/ilo_screen.c | 1 +
 src/gallium/drivers/llvmpipe/lp_screen.c | 1 +
 src/gallium/drivers/nouveau/nv30/nv30_screen.c   | 1 +
 src/gallium/drivers/nouveau/nv50/nv50_screen.c   | 1 +
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.c   | 1 +
 src/gallium/drivers/r300/r300_screen.c   | 1 +
 src/gallium/drivers/r600/r600_pipe.c | 1 +
 src/gallium/drivers/radeonsi/si_pipe.c   | 1 +
 src/gallium/drivers/softpipe/sp_screen.c | 1 +
 src/gallium/drivers/svga/svga_screen.c   | 1 +
 src/gallium/drivers/vc4/vc4_screen.c | 1 +
 src/gallium/drivers/virgl/virgl_screen.c | 1 +
 src/gallium/include/pipe/p_defines.h | 1 +
 16 files changed, 17 insertions(+)

diff --git a/src/gallium/docs/source/screen.rst 
b/src/gallium/docs/source/screen.rst
index db70cc8..39ecc63 100644
--- a/src/gallium/docs/source/screen.rst
+++ b/src/gallium/docs/source/screen.rst
@@ -290,6 +290,8 @@ The integer capabilities:
 * ``PIPE_CAP_DRAW_PARAMETERS``: Whether ``TGSI_SEMANTIC_BASEVERTEX``,
   ``TGSI_SEMANTIC_BASEINSTANCE``, and ``TGSI_SEMANTIC_DRAWID`` are
   supported in vertex shaders.
+* ``PIPE_CAP_TGSI_PACK_HALF_FLOAT``: Whether the ``UP2H`` and ``PK2H``
+  TGSI opcodes are supported.
 
 
 .. _pipe_capf:
diff --git a/src/gallium/drivers/freedreno/freedreno_screen.c 
b/src/gallium/drivers/freedreno/freedreno_screen.c
index c684019..a8030f2 100644
--- a/src/gallium/drivers/freedreno/freedreno_screen.c
+++ b/src/gallium/drivers/freedreno/freedreno_screen.c
@@ -241,6 +241,7 @@ fd_screen_get_param(struct pipe_screen *pscreen, enum 
pipe_cap param)
case PIPE_CAP_COPY_BETWEEN_COMPRESSED_AND_PLAIN_FORMATS:
case PIPE_CAP_CLEAR_TEXTURE:
case PIPE_CAP_DRAW_PARAMETERS:
+   case PIPE_CAP_TGSI_PACK_HALF_FLOAT:
return 0;
 
case PIPE_CAP_MAX_VIEWPORTS:
diff --git a/src/gallium/drivers/i915/i915_screen.c 
b/src/gallium/drivers/i915/i915_screen.c
index d8cfcf0..f42fc37 100644
--- a/src/gallium/drivers/i915/i915_screen.c
+++ b/src/gallium/drivers/i915/i915_screen.c
@@ -255,6 +255,7 @@ i915_get_param(struct pipe_screen *screen, enum pipe_cap 
cap)
case PIPE_CAP_COPY_BETWEEN_COMPRESSED_AND_PLAIN_FORMATS:
case PIPE_CAP_CLEAR_TEXTURE:
case PIPE_CAP_DRAW_PARAMETERS:
+   case PIPE_CAP_TGSI_PACK_HALF_FLOAT:
   return 0;
 
case PIPE_CAP_MAX_DUAL_SOURCE_RENDER_TARGETS:
diff --git a/src/gallium/drivers/ilo/ilo_screen.c 
b/src/gallium/drivers/ilo/ilo_screen.c
index 4ca62a6..3a18e74 100644
--- a/src/gallium/drivers/ilo/ilo_screen.c
+++ b/src/gallium/drivers/ilo/ilo_screen.c
@@ -479,6 +479,7 @@ ilo_get_param(struct pipe_screen *screen, enum pipe_cap 
param)
case PIPE_CAP_COPY_BETWEEN_COMPRESSED_AND_PLAIN_FORMATS:
case PIPE_CAP_CLEAR_TEXTURE:
case PIPE_CAP_DRAW_PARAMETERS:
+   case PIPE_CAP_TGSI_PACK_HALF_FLOAT:
   return 0;
 
case PIPE_CAP_VENDOR_ID:
diff --git a/src/gallium/drivers/llvmpipe/lp_screen.c 
b/src/gallium/drivers/llvmpipe/lp_screen.c
index fcef3b6..ef91c1a 100644
--- a/src/gallium/drivers/llvmpipe/lp_screen.c
+++ b/src/gallium/drivers/llvmpipe/lp_screen.c
@@ -304,6 +304,7 @@ llvmpipe_get_param(struct pipe_screen *screen, enum 
pipe_cap param)
case PIPE_CAP_DRAW_PARAMETERS:
case PIPE_CAP_MULTI_DRAW_INDIRECT:
case PIPE_CAP_MULTI_DRAW_INDIRECT_PARAMS:
+   case PIPE_CAP_TGSI_PACK_HALF_FLOAT:
   return 0;
}
/* should only get here on unhandled cases */
diff --git a/src/gallium/drivers/nouveau/nv30/nv30_screen.c 
b/src/gallium/drivers/nouveau/nv30/nv30_screen.c
index dbe5b3c..6c4a0f3 100644
--- a/src/gallium/drivers/nouveau/nv30/nv30_screen.c
+++ b/src/gallium/drivers/nouveau/nv30/nv30_screen.c
@@ -177,6 +177,7 @@ nv30_screen_get_param(struct pipe_screen *pscreen, enum 
pipe_cap param)
case PIPE_CAP_COPY_BETWEEN_COMPRESSED_AND_PLAIN_FORMATS:
case PIPE_CAP_CLEAR_TEXTURE:
case PIPE_CAP_DRAW_PARAMETERS:
+   case PIPE_CAP_TGSI_PACK_HALF_FLOAT:
   return 0;
 
case PIPE_CAP_VENDOR_ID:
diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.c 
b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
index bcb8577..d6131c2 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_screen.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
@@ -220,6 +220,7 @@ nv50_screen_get_param(struct pipe_screen *pscreen, enum 
pipe_cap param)
case PIPE_CAP_DEVICE_RESET_STATUS_QUERY:
case PIPE_CAP_MAX_SHADER_PATCH_VARYINGS:
case PIPE_CAP_DRAW_PARAMETERS:
+   case PIPE_CAP_TGSI_PACK_HALF_FLOAT:
   return 0;
 
case PIPE_CAP_VENDOR_ID:
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c 
b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
index 22f7885..58b712e 100644
--- 

[Mesa-dev] [PATCH 6/6] r600: add support for PK2H/UP2H

2016-01-02 Thread Ilia Mirkin
Signed-off-by: Ilia Mirkin 
---
 src/gallium/drivers/r600/r600_pipe.c   |   2 +-
 src/gallium/drivers/r600/r600_shader.c | 102 +++--
 2 files changed, 99 insertions(+), 5 deletions(-)

diff --git a/src/gallium/drivers/r600/r600_pipe.c 
b/src/gallium/drivers/r600/r600_pipe.c
index 70c1ec1..359fe41 100644
--- a/src/gallium/drivers/r600/r600_pipe.c
+++ b/src/gallium/drivers/r600/r600_pipe.c
@@ -328,6 +328,7 @@ static int r600_get_param(struct pipe_screen* pscreen, enum 
pipe_cap param)
case PIPE_CAP_TEXTURE_QUERY_LOD:
case PIPE_CAP_TGSI_FS_FINE_DERIVATIVE:
case PIPE_CAP_SAMPLER_VIEW_TARGET:
+   case PIPE_CAP_TGSI_PACK_HALF_FLOAT:
return family >= CHIP_CEDAR ? 1 : 0;
case PIPE_CAP_MAX_TEXTURE_GATHER_COMPONENTS:
return family >= CHIP_CEDAR ? 4 : 0;
@@ -351,7 +352,6 @@ static int r600_get_param(struct pipe_screen* pscreen, enum 
pipe_cap param)
case PIPE_CAP_DRAW_PARAMETERS:
case PIPE_CAP_MULTI_DRAW_INDIRECT:
case PIPE_CAP_MULTI_DRAW_INDIRECT_PARAMS:
-   case PIPE_CAP_TGSI_PACK_HALF_FLOAT:
return 0;
 
case PIPE_CAP_MAX_SHADER_PATCH_VARYINGS:
diff --git a/src/gallium/drivers/r600/r600_shader.c 
b/src/gallium/drivers/r600/r600_shader.c
index d411b0b..23ea34e 100644
--- a/src/gallium/drivers/r600/r600_shader.c
+++ b/src/gallium/drivers/r600/r600_shader.c
@@ -8959,6 +8959,100 @@ static int tgsi_umad(struct r600_shader_ctx *ctx)
return 0;
 }
 
+static int tgsi_pk2h(struct r600_shader_ctx *ctx)
+{
+   struct tgsi_full_instruction *inst = &ctx->parse.FullToken.FullInstruction;
+   struct r600_bytecode_alu alu;
+   int r;
+
+   /* temp.xy = f32_to_f16(src) */
+   memset(&alu, 0, sizeof(struct r600_bytecode_alu));
+   alu.op = ALU_OP1_FLT32_TO_FLT16;
+   alu.dst.chan = 0;
+   alu.dst.sel = ctx->temp_reg;
+   alu.dst.write = 1;
+   r600_bytecode_src(&alu.src[0], &ctx->src[0], 0);
+   r = r600_bytecode_add_alu(ctx->bc, &alu);
+   if (r)
+   return r;
+   alu.dst.chan = 1;
+   r600_bytecode_src(&alu.src[0], &ctx->src[0], 1);
+   alu.last = 1;
+   r = r600_bytecode_add_alu(ctx->bc, &alu);
+   if (r)
+   return r;
+
+   /* dst.x = temp.y * 0x10000 + temp.x */
+   memset(&alu, 0, sizeof(struct r600_bytecode_alu));
+   alu.op = ALU_OP3_MULADD_UINT24;
+   alu.is_op3 = 1;
+   tgsi_dst(ctx, &inst->Dst[0], 0, &alu.dst);
+   alu.last = 1;
+   alu.src[0].sel = ctx->temp_reg;
+   alu.src[0].chan = 1;
+   alu.src[1].sel = V_SQ_ALU_SRC_LITERAL;
+   alu.src[1].value = 0x10000;
+   alu.src[2].sel = ctx->temp_reg;
+   alu.src[2].chan = 0;
+   r = r600_bytecode_add_alu(ctx->bc, &alu);
+   if (r)
+   return r;
+
+   return 0;
+}
+
+static int tgsi_up2h(struct r600_shader_ctx *ctx)
+{
+   struct tgsi_full_instruction *inst = &ctx->parse.FullToken.FullInstruction;
+   struct r600_bytecode_alu alu;
+   int r;
+   int lasti = tgsi_last_instruction(inst->Dst[0].Register.WriteMask);
+
+   /* temp.x = src.x */
+   /* note: no need to mask out the high bits */
+   memset(&alu, 0, sizeof(struct r600_bytecode_alu));
+   alu.op = ALU_OP1_MOV;
+   alu.dst.chan = 0;
+   alu.dst.sel = ctx->temp_reg;
+   alu.dst.write = 1;
+   r600_bytecode_src(&alu.src[0], &ctx->src[0], 0);
+   r = r600_bytecode_add_alu(ctx->bc, &alu);
+   if (r)
+   return r;
+
+   /* temp.y = src.x >> 16 */
+   memset(&alu, 0, sizeof(struct r600_bytecode_alu));
+   alu.op = ALU_OP2_LSHR_INT;
+   alu.dst.chan = 1;
+   alu.dst.sel = ctx->temp_reg;
+   alu.dst.write = 1;
+   r600_bytecode_src(&alu.src[0], &ctx->src[0], 0);
+   alu.src[1].sel = V_SQ_ALU_SRC_LITERAL;
+   alu.src[1].value = 16;
+   alu.last = 1;
+   r = r600_bytecode_add_alu(ctx->bc, &alu);
+   if (r)
+   return r;
+
+   /* dst.xy = f16_to_f32(temp.xy) */
+   for (int i = 0; i < lasti + 1; i++) {
+   if (!(inst->Dst[0].Register.WriteMask & (1 << i)))
+   continue;
+   memset(&alu, 0, sizeof(struct r600_bytecode_alu));
+   tgsi_dst(ctx, &inst->Dst[0], i, &alu.dst);
+   alu.op = ALU_OP1_FLT16_TO_FLT32;
+   alu.src[0].sel = ctx->temp_reg;
+   alu.src[0].chan = i;
+   if (i == lasti)
+   alu.last = 1;
+   r = r600_bytecode_add_alu(ctx->bc, &alu);
+   if (r)
+   return r;
+   }
+
+   return 0;
+}
+
 static const struct r600_shader_tgsi_instruction 
r600_shader_tgsi_instruction[] = {
[TGSI_OPCODE_ARL]   = { ALU_OP0_NOP, tgsi_r600_arl},
[TGSI_OPCODE_MOV]   = { ALU_OP1_MOV, tgsi_op2},
@@ -9205,7 +9299,7 @@ static const struct r600_shader_tgsi_instruction 
eg_shader_tgsi_instruction[] =
[TGSI_OPCODE_DDX]   = { FETCH_OP_GET_GRADIENTS_H, tgsi_tex},
[TGSI_OPCODE_DDY]   = { 

[Mesa-dev] [PATCH 4/6] st/mesa: use PK2H/UP2H when supported

2016-01-02 Thread Ilia Mirkin
Signed-off-by: Ilia Mirkin 
---
 src/mesa/state_tracker/st_context.c|  2 ++
 src/mesa/state_tracker/st_context.h|  1 +
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 16 +++-
 3 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/src/mesa/state_tracker/st_context.c 
b/src/mesa/state_tracker/st_context.c
index e532c6b..d53de1e 100644
--- a/src/mesa/state_tracker/st_context.c
+++ b/src/mesa/state_tracker/st_context.c
@@ -250,6 +250,8 @@ st_create_context_priv( struct gl_context *ctx, struct 
pipe_context *pipe,
   screen->get_param(screen, PIPE_CAP_QUERY_TIME_ELAPSED);
st->has_multi_draw_indirect =
   screen->get_param(screen, PIPE_CAP_MULTI_DRAW_INDIRECT);
+   st->has_half_float_packing =
+  screen->get_param(screen, PIPE_CAP_TGSI_PACK_HALF_FLOAT);
 
/* GL limits and extensions */
st_init_limits(st->pipe->screen, &ctx->Const, &ctx->Extensions);
diff --git a/src/mesa/state_tracker/st_context.h 
b/src/mesa/state_tracker/st_context.h
index ccebdd9..ae0114c 100644
--- a/src/mesa/state_tracker/st_context.h
+++ b/src/mesa/state_tracker/st_context.h
@@ -102,6 +102,7 @@ struct st_context
boolean force_persample_in_shader;
boolean has_shareable_shaders;
boolean has_multi_draw_indirect;
+   boolean has_half_float_packing;
 
/**
 * If a shader can be created when we get its source.
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index cdbe2f4..2adb57d 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -2163,15 +2163,20 @@ glsl_to_tgsi_visitor::visit(ir_expression *ir)
   }
   break;
 
+   case ir_unop_pack_half_2x16:
+  emit_asm(ir, TGSI_OPCODE_PK2H, result_dst, op[0]);
+  break;
+   case ir_unop_unpack_half_2x16:
+  emit_asm(ir, TGSI_OPCODE_UP2H, result_dst, op[0]);
+  break;
+
case ir_unop_pack_snorm_2x16:
case ir_unop_pack_unorm_2x16:
-   case ir_unop_pack_half_2x16:
case ir_unop_pack_snorm_4x8:
case ir_unop_pack_unorm_4x8:
 
case ir_unop_unpack_snorm_2x16:
case ir_unop_unpack_unorm_2x16:
-   case ir_unop_unpack_half_2x16:
case ir_unop_unpack_half_2x16_split_x:
case ir_unop_unpack_half_2x16_split_y:
case ir_unop_unpack_snorm_4x8:
@@ -5853,13 +5858,14 @@ st_link_shader(struct gl_context *ctx, struct 
gl_shader_program *prog)
LOWER_PACK_SNORM_4x8 |
LOWER_UNPACK_SNORM_4x8 |
LOWER_UNPACK_UNORM_4x8 |
-   LOWER_PACK_UNORM_4x8 |
-   LOWER_PACK_HALF_2x16 |
-   LOWER_UNPACK_HALF_2x16;
+   LOWER_PACK_UNORM_4x8;
 
  if (ctx->Extensions.ARB_gpu_shader5)
 lower_inst |= LOWER_PACK_USE_BFI |
   LOWER_PACK_USE_BFE;
+ if (!ctx->st->has_half_float_packing)
+lower_inst |= LOWER_PACK_HALF_2x16 |
+  LOWER_UNPACK_HALF_2x16;
 
  lower_packing_builtins(ir, lower_inst);
   }
-- 
2.4.10
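
The st_link_shader hunk above only requests the generic half-float pack/unpack lowering when the driver does not report PIPE_CAP_TGSI_PACK_HALF_FLOAT. As a rough illustration of that gating, here is a small C sketch. The SKETCH_* flag values and build_lowering_mask() are made up for this example; the real LOWER_* bits are defined in Mesa's GLSL compiler headers.

```c
#include <stdbool.h>

/* Illustrative flag values only, not Mesa's actual LOWER_* bits. */
enum {
   SKETCH_LOWER_PACK_UNORM_4x8   = 1 << 0,
   SKETCH_LOWER_PACK_HALF_2x16   = 1 << 1,
   SKETCH_LOWER_UNPACK_HALF_2x16 = 1 << 2,
};

/* Build the lowering mask: the unorm lowering is always requested, while
 * the half-float lowerings are added only if the driver lacks native
 * PK2H/UP2H support. */
static unsigned build_lowering_mask(bool has_half_float_packing)
{
   unsigned lower_inst = SKETCH_LOWER_PACK_UNORM_4x8;

   if (!has_half_float_packing)
      lower_inst |= SKETCH_LOWER_PACK_HALF_2x16 |
                    SKETCH_LOWER_UNPACK_HALF_2x16;

   return lower_inst;
}
```

A driver that sets the cap therefore sees ir_unop_pack_half_2x16 and ir_unop_unpack_half_2x16 reach the visitor unlowered, where the patch emits TGSI_OPCODE_PK2H and TGSI_OPCODE_UP2H directly.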



[Mesa-dev] [PATCH 2/6] tgsi: update PK2H/UP2H channel behavior info

2016-01-02 Thread Ilia Mirkin
---
 src/gallium/auxiliary/tgsi/tgsi_info.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_info.c 
b/src/gallium/auxiliary/tgsi/tgsi_info.c
index 3b40c3d..c078b6f 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_info.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_info.c
@@ -77,10 +77,10 @@ static const struct tgsi_opcode_info 
opcode_info[TGSI_OPCODE_LAST] =
{ 1, 1, 0, 0, 0, 0, COMP, "DDX", TGSI_OPCODE_DDX },
{ 1, 1, 0, 0, 0, 0, COMP, "DDY", TGSI_OPCODE_DDY },
{ 0, 0, 0, 0, 0, 0, NONE, "KILL", TGSI_OPCODE_KILL },
-   { 1, 1, 0, 0, 0, 0, COMP, "PK2H", TGSI_OPCODE_PK2H },
-   { 1, 1, 0, 0, 0, 0, COMP, "PK2US", TGSI_OPCODE_PK2US },
-   { 1, 1, 0, 0, 0, 0, COMP, "PK4B", TGSI_OPCODE_PK4B },
-   { 1, 1, 0, 0, 0, 0, COMP, "PK4UB", TGSI_OPCODE_PK4UB },
+   { 1, 1, 0, 0, 0, 0, REPL, "PK2H", TGSI_OPCODE_PK2H },
+   { 1, 1, 0, 0, 0, 0, REPL, "PK2US", TGSI_OPCODE_PK2US },
+   { 1, 1, 0, 0, 0, 0, REPL, "PK4B", TGSI_OPCODE_PK4B },
+   { 1, 1, 0, 0, 0, 0, REPL, "PK4UB", TGSI_OPCODE_PK4UB },
{ 0, 1, 0, 0, 0, 1, NONE, "", 44 },  /* removed */
{ 1, 2, 0, 0, 0, 0, COMP, "SEQ", TGSI_OPCODE_SEQ },
{ 0, 1, 0, 0, 0, 1, NONE, "", 46 },  /* removed */
@@ -92,10 +92,10 @@ static const struct tgsi_opcode_info 
opcode_info[TGSI_OPCODE_LAST] =
{ 1, 2, 1, 0, 0, 0, OTHR, "TEX", TGSI_OPCODE_TEX },
{ 1, 4, 1, 0, 0, 0, OTHR, "TXD", TGSI_OPCODE_TXD },
{ 1, 2, 1, 0, 0, 0, OTHR, "TXP", TGSI_OPCODE_TXP },
-   { 1, 1, 0, 0, 0, 0, COMP, "UP2H", TGSI_OPCODE_UP2H },
-   { 1, 1, 0, 0, 0, 0, COMP, "UP2US", TGSI_OPCODE_UP2US },
-   { 1, 1, 0, 0, 0, 0, COMP, "UP4B", TGSI_OPCODE_UP4B },
-   { 1, 1, 0, 0, 0, 0, COMP, "UP4UB", TGSI_OPCODE_UP4UB },
+   { 1, 1, 0, 0, 0, 0, CHAN, "UP2H", TGSI_OPCODE_UP2H },
+   { 1, 1, 0, 0, 0, 0, CHAN, "UP2US", TGSI_OPCODE_UP2US },
+   { 1, 1, 0, 0, 0, 0, CHAN, "UP4B", TGSI_OPCODE_UP4B },
+   { 1, 1, 0, 0, 0, 0, CHAN, "UP4UB", TGSI_OPCODE_UP4UB },
{ 0, 1, 0, 0, 0, 1, NONE, "", 59 },  /* removed */
{ 0, 1, 0, 0, 0, 1, NONE, "", 60 },  /* removed */
{ 1, 1, 0, 0, 0, 0, COMP, "ARR", TGSI_OPCODE_ARR },
-- 
2.4.10



[Mesa-dev] [PATCH 1/6] gallium: document PK2H/UP2H

2016-01-02 Thread Ilia Mirkin
Signed-off-by: Ilia Mirkin 
---
 src/gallium/docs/source/tgsi.rst | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/src/gallium/docs/source/tgsi.rst b/src/gallium/docs/source/tgsi.rst
index 955ece8..f69998f 100644
--- a/src/gallium/docs/source/tgsi.rst
+++ b/src/gallium/docs/source/tgsi.rst
@@ -458,7 +458,9 @@ while DDY is allowed to be the same for the entire 2x2 quad.
 
 .. opcode:: PK2H - Pack Two 16-bit Floats
 
-  TBD
+.. math::
+
+  dst.x = f32\_to\_f16(src.x) | f32\_to\_f16(src.y) << 16
 
 
 .. opcode:: PK2US - Pack Two Unsigned 16-bit Scalars
@@ -615,7 +617,11 @@ This instruction replicates its result.
 
 .. opcode:: UP2H - Unpack Two 16-Bit Floats
 
-  TBD
+.. math::
+
+  dst.x = f16\_to\_f32(src0.x \& 0xffff)
+
+  dst.y = f16\_to\_f32(src0.x >> 16)
 
 .. note::
 
-- 
2.4.10
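
To check the PK2H/UP2H formulas documented above by hand, the following C sketch models them on the CPU. The helper names (f32_to_f16, f16_to_f32, pk2h, up2h) are illustrative and are not Mesa functions. The float-to-half conversion is deliberately simplified: it rounds to nearest even, but results too small for a normal half are flushed to zero rather than converted to denormals.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: 32-bit float to IEEE half-precision bit pattern.
 * Simplified: values below the smallest normal half become signed zero. */
static uint16_t f32_to_f16(float f)
{
   uint32_t bits;
   assert(sizeof(f) == sizeof(bits));
   memcpy(&bits, &f, sizeof(bits));

   uint16_t sign = (uint16_t)((bits >> 16) & 0x8000);
   uint32_t exp32 = (bits >> 23) & 0xff;
   uint32_t man = bits & 0x7fffff;

   if (exp32 == 0xff)                     /* Inf or NaN */
      return (uint16_t)(sign | 0x7c00 | (man ? 0x200 : 0));

   int32_t exp16 = (int32_t)exp32 - 127 + 15;
   if (exp16 >= 0x1f)                     /* too large: becomes Inf */
      return (uint16_t)(sign | 0x7c00);
   if (exp16 <= 0)                        /* too small: flushed to zero */
      return sign;

   uint32_t half = ((uint32_t)exp16 << 10) | (man >> 13);
   uint32_t rem = man & 0x1fff;           /* the 13 mantissa bits dropped */
   if (rem > 0x1000 || (rem == 0x1000 && (half & 1)))
      half++;                             /* round to nearest even */
   return (uint16_t)(sign | half);
}

/* Hypothetical helper: half-precision bit pattern to 32-bit float (exact). */
static float f16_to_f32(uint16_t h)
{
   uint32_t sign = (uint32_t)(h & 0x8000) << 16;
   uint32_t exp = (h >> 10) & 0x1f;
   uint32_t man = h & 0x3ff;
   uint32_t bits;

   if (exp == 0x1f) {                     /* Inf or NaN */
      bits = sign | 0x7f800000 | (man << 13);
   } else if (exp != 0) {                 /* normal number */
      bits = sign | ((exp + 127 - 15) << 23) | (man << 13);
   } else if (man == 0) {                 /* signed zero */
      bits = sign;
   } else {                               /* denormal: renormalize */
      int e = -1;
      do {
         man <<= 1;
         e++;
      } while (!(man & 0x400));
      man &= 0x3ff;
      bits = sign | ((uint32_t)(127 - 15 - e) << 23) | (man << 13);
   }

   float f;
   memcpy(&f, &bits, sizeof(f));
   return f;
}

/* dst.x = f32_to_f16(src.x) | f32_to_f16(src.y) << 16 */
static uint32_t pk2h(float x, float y)
{
   return (uint32_t)f32_to_f16(x) | ((uint32_t)f32_to_f16(y) << 16);
}

/* dst.x = f16_to_f32(src0.x & 0xffff); dst.y = f16_to_f32(src0.x >> 16) */
static void up2h(uint32_t v, float *x, float *y)
{
   *x = f16_to_f32((uint16_t)(v & 0xffff));
   *y = f16_to_f32((uint16_t)(v >> 16));
}
```

For values that are exactly representable as halves, up2h applied to the result of pk2h returns the original pair, which is a convenient sanity check when comparing against a driver's output.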



[Mesa-dev] [PATCH 1/5] glapi: add ARB_indirect_parameters definitions

2016-01-02 Thread Ilia Mirkin
Signed-off-by: Ilia Mirkin 
---
 src/mapi/glapi/gen/ARB_indirect_parameters.xml | 30 ++
 src/mapi/glapi/gen/Makefile.am |  1 +
 src/mapi/glapi/gen/gl_API.xml  |  6 +-
 src/mesa/main/extensions_table.h   |  1 +
 src/mesa/main/mtypes.h |  1 +
 src/mesa/main/tests/dispatch_sanity.cpp|  4 
 src/mesa/vbo/vbo_exec_array.c  | 21 ++
 7 files changed, 63 insertions(+), 1 deletion(-)
 create mode 100644 src/mapi/glapi/gen/ARB_indirect_parameters.xml

diff --git a/src/mapi/glapi/gen/ARB_indirect_parameters.xml 
b/src/mapi/glapi/gen/ARB_indirect_parameters.xml
new file mode 100644
index 0000000..20de905
--- /dev/null
+++ b/src/mapi/glapi/gen/ARB_indirect_parameters.xml
@@ -0,0 +1,30 @@
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
diff --git a/src/mapi/glapi/gen/Makefile.am b/src/mapi/glapi/gen/Makefile.am
index 2da8f7d..900b61a 100644
--- a/src/mapi/glapi/gen/Makefile.am
+++ b/src/mapi/glapi/gen/Makefile.am
@@ -137,6 +137,7 @@ API_XML = \
ARB_get_texture_sub_image.xml \
ARB_gpu_shader_fp64.xml \
ARB_gpu_shader5.xml \
+   ARB_indirect_parameters.xml \
ARB_instanced_arrays.xml \
ARB_internalformat_query.xml \
ARB_invalidate_subdata.xml \
diff --git a/src/mapi/glapi/gen/gl_API.xml b/src/mapi/glapi/gen/gl_API.xml
index 21f6293..593ace4 100644
--- a/src/mapi/glapi/gen/gl_API.xml
+++ b/src/mapi/glapi/gen/gl_API.xml
@@ -8247,7 +8247,11 @@
 
 http://www.w3.org/2001/XInclude"/>
 
-
+
+
+http://www.w3.org/2001/XInclude"/>
+
+
 
 http://www.w3.org/2001/XInclude"/>
 
diff --git a/src/mesa/main/extensions_table.h b/src/mesa/main/extensions_table.h
index 789b55a..aeccb01 100644
--- a/src/mesa/main/extensions_table.h
+++ b/src/mesa/main/extensions_table.h
@@ -70,6 +70,7 @@ EXT(ARB_gpu_shader5 , ARB_gpu_shader5
 EXT(ARB_gpu_shader_fp64 , ARB_gpu_shader_fp64  
  ,  x , GLC,  x ,  x , 2010)
 EXT(ARB_half_float_pixel, dummy_true   
  , GLL, GLC,  x ,  x , 2003)
 EXT(ARB_half_float_vertex   , ARB_half_float_vertex
  , GLL, GLC,  x ,  x , 2008)
+EXT(ARB_indirect_parameters , ARB_indirect_parameters  
  ,  x , GLC,  x ,  x , 2013)
 EXT(ARB_instanced_arrays, ARB_instanced_arrays 
  , GLL, GLC,  x ,  x , 2008)
 EXT(ARB_internalformat_query, ARB_internalformat_query 
  , GLL, GLC,  x ,  x , 2011)
 EXT(ARB_invalidate_subdata  , dummy_true   
  , GLL, GLC,  x ,  x , 2012)
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 5b9fce8..5cd2e8e 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -3700,6 +3700,7 @@ struct gl_extensions
GLboolean ARB_gpu_shader5;
GLboolean ARB_gpu_shader_fp64;
GLboolean ARB_half_float_vertex;
+   GLboolean ARB_indirect_parameters;
GLboolean ARB_instanced_arrays;
GLboolean ARB_internalformat_query;
GLboolean ARB_map_buffer_range;
diff --git a/src/mesa/main/tests/dispatch_sanity.cpp 
b/src/mesa/main/tests/dispatch_sanity.cpp
index d288b1d..7610bcb 100644
--- a/src/mesa/main/tests/dispatch_sanity.cpp
+++ b/src/mesa/main/tests/dispatch_sanity.cpp
@@ -1844,6 +1844,10 @@ const struct function gl_core_functions_possible[] = {
{ "glGetQueryBufferObjecti64v", 45, -1 },
{ "glGetQueryBufferObjectui64v", 45, -1 },
 
+   /* GL_ARB_indirect_parameters */
+   { "glMultiDrawArraysIndirectCountARB", 31, -1 },
+   { "glMultiDrawElementsIndirectCountARB", 31, -1 },
+
{ NULL, 0, -1 }
 };
 
diff --git a/src/mesa/vbo/vbo_exec_array.c b/src/mesa/vbo/vbo_exec_array.c
index fd29837..0c26bad 100644
--- a/src/mesa/vbo/vbo_exec_array.c
+++ b/src/mesa/vbo/vbo_exec_array.c
@@ -1825,6 +1825,25 @@ vbo_exec_MultiDrawElementsIndirect(GLenum mode, GLenum 
type,
primcount, stride);
 }
 
+static void GLAPIENTRY
+vbo_exec_MultiDrawArraysIndirectCount(GLenum mode,
+  GLintptr indirect,
+  GLintptr drawcount,
+  GLsizei maxdrawcount, GLsizei stride)
+{
+
+}
+
+static void GLAPIENTRY
+vbo_exec_MultiDrawElementsIndirectCount(GLenum mode, GLenum type,
+GLintptr indirect,
+GLintptr drawcount,
+GLsizei maxdrawcount, GLsizei stride)
+{
+
+}
+
+
 /**
  * Initialize the dispatch table with the VBO functions for drawing.
  */
@@ -1872,6 +1891,8 @@ vbo_initialize_exec_dispatch(const struct gl_context *ctx,
if (ctx->API == 

[Mesa-dev] [PATCH 0/5] Add ARB_indirect_parameters support

2016-01-02 Thread Ilia Mirkin
The nvc0 patch applies on top of some unpublished patches, see

https://github.com/imirkin/mesa/commits/tmp4

for the full thing. The whole series applies on top of the
ARB_multi_draw_indirect patches I sent earlier (with potential minor
modifications). There is some type confusion between the
ARB_indirect_parameters spec and the Khronos gl.xml/glcorearb.h files,
I went with the latter's definitions.

This passes the relatively simple piglit test I sent.

Ilia Mirkin (5):
  glapi: add ARB_indirect_parameters definitions
  mesa: add parameter buffer, used for ARB_indirect_parameters
  mesa: add support for ARB_indirect_parameters draw functions
  st/mesa: expose ARB_indirect_parameters when the backend driver allows
  nvc0: add ARB_indirect_parameters support

 docs/relnotes/11.2.0.html  |   1 +
 src/gallium/drivers/nouveau/nvc0/mme/com9097.mme   | 157 +
 src/gallium/drivers/nouveau/nvc0/mme/com9097.mme.h | 125 
 src/gallium/drivers/nouveau/nvc0/nvc0_macros.h |   4 +
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.c |   4 +-
 src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c|  29 +++-
 src/mapi/glapi/gen/ARB_indirect_parameters.xml |  30 
 src/mapi/glapi/gen/Makefile.am |   1 +
 src/mapi/glapi/gen/gl_API.xml  |   6 +-
 src/mesa/main/api_validate.c   | 115 +++
 src/mesa/main/api_validate.h   |  16 +++
 src/mesa/main/bufferobj.c  |  15 ++
 src/mesa/main/extensions_table.h   |   1 +
 src/mesa/main/get.c|   5 +
 src/mesa/main/get_hash_params.py   |   4 +
 src/mesa/main/mtypes.h |   2 +
 src/mesa/main/tests/dispatch_sanity.cpp|   4 +
 src/mesa/state_tracker/st_cb_bufferobjects.c   |   1 +
 src/mesa/state_tracker/st_extensions.c |   1 +
 src/mesa/vbo/vbo_exec_array.c  | 124 
 20 files changed, 638 insertions(+), 7 deletions(-)
 create mode 100644 src/mapi/glapi/gen/ARB_indirect_parameters.xml

-- 
2.4.10


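As a reading aid for the series above: the two new draw entry points take their draw count from the buffer bound to GL_PARAMETER_BUFFER_ARB rather than from a call argument, and maxdrawcount caps how many draws are processed. Below is a minimal CPU-side sketch of that selection. It assumes the count is stored as a 32-bit integer at a byte offset into the buffer. effective_draw_count() is a hypothetical helper for illustration only, not code from this series, and treating a negative stored count as zero is an assumption as well.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of Mesa): pick the number of draws the
 * way the *IndirectCountARB entry points are described to, i.e. the
 * count stored in the parameter buffer, capped at maxdrawcount. */
static int32_t
effective_draw_count(const uint8_t *param_buf, size_t drawcount_offset,
                     int32_t maxdrawcount)
{
   int32_t count;

   assert(maxdrawcount >= 0);

   /* memcpy avoids unaligned or type-punned reads from the byte buffer. */
   memcpy(&count, param_buf + drawcount_offset, sizeof(count));

   /* Assumption: a negative stored count means "draw nothing". */
   if (count < 0)
      count = 0;

   return count < maxdrawcount ? count : maxdrawcount;
}
```

With a stored count of 5, a maxdrawcount of 16 yields 5 draws and a maxdrawcount of 3 yields 3.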

[Mesa-dev] [PATCH 4/5] st/mesa: expose ARB_indirect_parameters when the backend driver allows

2016-01-02 Thread Ilia Mirkin
Signed-off-by: Ilia Mirkin 
---
 src/mesa/state_tracker/st_cb_bufferobjects.c | 1 +
 src/mesa/state_tracker/st_extensions.c   | 1 +
 2 files changed, 2 insertions(+)

diff --git a/src/mesa/state_tracker/st_cb_bufferobjects.c 
b/src/mesa/state_tracker/st_cb_bufferobjects.c
index 5d20b26..e775453 100644
--- a/src/mesa/state_tracker/st_cb_bufferobjects.c
+++ b/src/mesa/state_tracker/st_cb_bufferobjects.c
@@ -230,6 +230,7 @@ st_bufferobj_data(struct gl_context *ctx,
   bind = PIPE_BIND_CONSTANT_BUFFER;
   break;
case GL_DRAW_INDIRECT_BUFFER:
+   case GL_PARAMETER_BUFFER_ARB:
   bind = PIPE_BIND_COMMAND_ARGS_BUFFER;
   break;
default:
diff --git a/src/mesa/state_tracker/st_extensions.c 
b/src/mesa/state_tracker/st_extensions.c
index 90eb677..3c198ec 100644
--- a/src/mesa/state_tracker/st_extensions.c
+++ b/src/mesa/state_tracker/st_extensions.c
@@ -452,6 +452,7 @@ void st_init_extensions(struct pipe_screen *screen,
   { o(ARB_draw_instanced),   PIPE_CAP_TGSI_INSTANCEID  
},
   { o(ARB_fragment_program_shadow),  PIPE_CAP_TEXTURE_SHADOW_MAP   
},
   { o(ARB_framebuffer_object),   PIPE_CAP_MIXED_FRAMEBUFFER_SIZES  
},
+  { o(ARB_indirect_parameters),  
PIPE_CAP_MULTI_DRAW_INDIRECT_PARAMS   },
   { o(ARB_instanced_arrays), 
PIPE_CAP_VERTEX_ELEMENT_INSTANCE_DIVISOR  },
   { o(ARB_occlusion_query),  PIPE_CAP_OCCLUSION_QUERY  
},
   { o(ARB_occlusion_query2), PIPE_CAP_OCCLUSION_QUERY  
},
-- 
2.4.10



[Mesa-dev] [Bug 72877] Wrong colors with Mesa 9.2 and Mesa 10.0 on PPC Linux systems

2016-01-02 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=72877

--- Comment #15 from Ilia Mirkin  ---
(In reply to Alex Perez from comment #14)
> Ping. I am still experiencing problems with incorrect colors with the very
> latest Mesa, compiled from a fresh git checkout today.

Mesa 11.0.3+ works fine on a PPC G5 with a NV34 GPU. Haven't tested much else. I
don't think this is a "core" issue anymore.

-- 
You are receiving this mail because:
You are the assignee for the bug.


[Mesa-dev] [PATCH 2/5] mesa: add parameter buffer, used for ARB_indirect_parameters

2016-01-02 Thread Ilia Mirkin
Signed-off-by: Ilia Mirkin 
---
 src/mesa/main/bufferobj.c| 15 +++
 src/mesa/main/get.c  |  5 +
 src/mesa/main/get_hash_params.py |  4 
 src/mesa/main/mtypes.h   |  1 +
 4 files changed, 25 insertions(+)

diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c
index 181eb49..342f319 100644
--- a/src/mesa/main/bufferobj.c
+++ b/src/mesa/main/bufferobj.c
@@ -127,6 +127,11 @@ get_buffer_target(struct gl_context *ctx, GLenum target)
  return &ctx->DrawIndirectBuffer;
   }
   break;
+   case GL_PARAMETER_BUFFER_ARB:
+  if (_mesa_has_ARB_indirect_parameters(ctx)) {
+ return &ctx->ParameterBuffer;
+  }
+  break;
case GL_DISPATCH_INDIRECT_BUFFER:
   if (_mesa_has_compute_shaders(ctx)) {
  return &ctx->DispatchIndirectBuffer;
@@ -866,6 +871,9 @@ _mesa_init_buffer_objects( struct gl_context *ctx )
_mesa_reference_buffer_object(ctx, &ctx->DrawIndirectBuffer,
 ctx->Shared->NullBufferObj);
 
+   _mesa_reference_buffer_object(ctx, &ctx->ParameterBuffer,
+ctx->Shared->NullBufferObj);
+
_mesa_reference_buffer_object(ctx, &ctx->DispatchIndirectBuffer,
 ctx->Shared->NullBufferObj);
 
@@ -913,6 +921,8 @@ _mesa_free_buffer_objects( struct gl_context *ctx )
 
_mesa_reference_buffer_object(ctx, &ctx->DrawIndirectBuffer, NULL);
 
+   _mesa_reference_buffer_object(ctx, &ctx->ParameterBuffer, NULL);
+
_mesa_reference_buffer_object(ctx, &ctx->DispatchIndirectBuffer, NULL);
 
for (i = 0; i < MAX_COMBINED_UNIFORM_BUFFERS; i++) {
@@ -1261,6 +1271,11 @@ _mesa_DeleteBuffers(GLsizei n, const GLuint *ids)
 _mesa_BindBuffer( GL_DRAW_INDIRECT_BUFFER, 0 );
  }
 
+ /* unbind ARB_indirect_parameters binding point */
+ if (ctx->ParameterBuffer == bufObj) {
+_mesa_BindBuffer(GL_PARAMETER_BUFFER_ARB, 0);
+ }
+
  /* unbind ARB_compute_shader binding point */
  if (ctx->DispatchIndirectBuffer == bufObj) {
 _mesa_BindBuffer(GL_DISPATCH_INDIRECT_BUFFER, 0);
diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
index c6a2e5b..95cb18c 100644
--- a/src/mesa/main/get.c
+++ b/src/mesa/main/get.c
@@ -423,6 +423,7 @@ EXTRA_EXT(ARB_framebuffer_no_attachments);
 EXTRA_EXT(ARB_tessellation_shader);
 EXTRA_EXT(ARB_shader_subroutine);
 EXTRA_EXT(ARB_shader_storage_buffer_object);
+EXTRA_EXT(ARB_indirect_parameters);
 
 static const int
 extra_ARB_color_buffer_float_or_glcore[] = {
@@ -1032,6 +1033,10 @@ find_custom_value(struct gl_context *ctx, const struct 
value_desc *d, union valu
case GL_DRAW_INDIRECT_BUFFER_BINDING:
   v->value_int = ctx->DrawIndirectBuffer->Name;
   break;
+   /* GL_ARB_indirect_parameters */
+   case GL_PARAMETER_BUFFER_BINDING_ARB:
+  v->value_int = ctx->ParameterBuffer->Name;
+  break;
/* GL_ARB_separate_shader_objects */
case GL_PROGRAM_PIPELINE_BINDING:
   if (ctx->Pipeline.Current) {
diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/get_hash_params.py
index 7a48ed2..af7a8f4 100644
--- a/src/mesa/main/get_hash_params.py
+++ b/src/mesa/main/get_hash_params.py
@@ -887,6 +887,10 @@ descriptor=[
 # GL_ARB_shader_subroutine
   [ "MAX_SUBROUTINES", "CONST(MAX_SUBROUTINES), extra_ARB_shader_subroutine" ],
   [ "MAX_SUBROUTINE_UNIFORM_LOCATIONS", 
"CONST(MAX_SUBROUTINE_UNIFORM_LOCATIONS), extra_ARB_shader_subroutine" ],
+
+# GL_ARB_indirect_parameters
+  [ "PARAMETER_BUFFER_BINDING_ARB", "LOC_CUSTOM, TYPE_INT, 0, 
extra_ARB_indirect_parameters" ],
+
 ]}
 
 ]
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 5cd2e8e..dd52368 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -4349,6 +4349,7 @@ struct gl_context
struct gl_perf_monitor_state PerfMonitor;
 
struct gl_buffer_object *DrawIndirectBuffer; /** < GL_ARB_draw_indirect */
+   struct gl_buffer_object *ParameterBuffer; /** < GL_ARB_indirect_parameters 
*/
struct gl_buffer_object *DispatchIndirectBuffer; /** < 
GL_ARB_compute_shader */
 
struct gl_buffer_object *CopyReadBuffer; /**< GL_ARB_copy_buffer */
-- 
2.4.10



[Mesa-dev] [PATCH 3/3] llvmpipe: add sse code for fixed position calculation

2016-01-02 Thread sroland
From: Roland Scheidegger 

This takes quite a few fewer instructions, although the two 64-bit muls are
still done with scalar C code (they'd need way more shuffles, plus a fixup for
the signed mul, so it doesn't seem worth it; x86 can do 32x32->64-bit signed
scalar muls natively just fine after all, even on 32-bit).

(This still doesn't have a measurable performance impact in reality, although
the profiler seems to say time spent in setup has indeed gone down by 10% or so
overall.)
---
 src/gallium/drivers/llvmpipe/lp_setup_tri.c | 58 +
 1 file changed, 50 insertions(+), 8 deletions(-)

diff --git a/src/gallium/drivers/llvmpipe/lp_setup_tri.c 
b/src/gallium/drivers/llvmpipe/lp_setup_tri.c
index cb1d715..fefd1c1 100644
--- a/src/gallium/drivers/llvmpipe/lp_setup_tri.c
+++ b/src/gallium/drivers/llvmpipe/lp_setup_tri.c
@@ -65,11 +65,11 @@ fixed_to_float(int a)
 struct fixed_position {
int32_t x[4];
int32_t y[4];
-   int64_t area;
int32_t dx01;
int32_t dy01;
int32_t dx20;
int32_t dy20;
+   int64_t area;
 };
 
 
@@ -866,29 +866,71 @@ static void retry_triangle_ccw( struct lp_setup_context 
*setup,
 
 /**
  * Calculate fixed position data for a triangle
+ * It is unfortunate we need to do that here (as we need area
+ * calculated in fixed point), as there's quite some code duplication
+ * to what is done in the jit setup prog.
  */
 static inline void
-calc_fixed_position( struct lp_setup_context *setup,
- struct fixed_position* position,
- const float (*v0)[4],
- const float (*v1)[4],
- const float (*v2)[4])
+calc_fixed_position(struct lp_setup_context *setup,
+struct fixed_position* position,
+const float (*v0)[4],
+const float (*v1)[4],
+const float (*v2)[4])
 {
+   /*
+* The rounding may not be quite the same with PIPE_ARCH_SSE
+* (util_iround right now only does nearest/even on x87,
+* otherwise nearest/away-from-zero).
+* Both should be acceptable, I think.
+*/
+#if defined(PIPE_ARCH_SSE)
+   __m128d v0r, v1r, v2r;
+   __m128 vxy0xy2, vxy1xy0;
+   __m128i vxy0xy2i, vxy1xy0i;
+   __m128i dxdy0120, x0x2y0y2, x1x0y1y0, x0120, y0120;
+   __m128 pix_offset = _mm_set1_ps(setup->pixel_offset);
+   __m128 fixed_one = _mm_set1_ps((float)FIXED_ONE);
+   v0r = _mm_load_sd((const double *)v0[0]);
+   v1r = _mm_load_sd((const double *)v1[0]);
+   v2r = _mm_load_sd((const double *)v2[0]);
+   vxy0xy2 = (__m128)_mm_unpacklo_pd(v0r, v2r);
+   vxy1xy0 = (__m128)_mm_unpacklo_pd(v1r, v0r);
+   vxy0xy2 = _mm_sub_ps(vxy0xy2, pix_offset);
+   vxy1xy0 = _mm_sub_ps(vxy1xy0, pix_offset);
+   vxy0xy2 = _mm_mul_ps(vxy0xy2, fixed_one);
+   vxy1xy0 = _mm_mul_ps(vxy1xy0, fixed_one);
+   vxy0xy2i = _mm_cvtps_epi32(vxy0xy2);
+   vxy1xy0i = _mm_cvtps_epi32(vxy1xy0);
+   dxdy0120 = _mm_sub_epi32(vxy0xy2i, vxy1xy0i);
+   _mm_store_si128((__m128i *)&position->dx01, dxdy0120);
+   /*
+* For the mul, would need some more shuffles, plus emulation
+* for the signed mul (without sse41), so don't bother.
+*/
+   x0x2y0y2 = _mm_shuffle_epi32(vxy0xy2i, _MM_SHUFFLE(3,1,2,0));
+   x1x0y1y0 = _mm_shuffle_epi32(vxy1xy0i, _MM_SHUFFLE(3,1,2,0));
+   x0120 = _mm_unpacklo_epi32(x0x2y0y2, x1x0y1y0);
+   y0120 = _mm_unpackhi_epi32(x0x2y0y2, x1x0y1y0);
+   _mm_store_si128((__m128i *)&position->x[0], x0120);
+   _mm_store_si128((__m128i *)&position->y[0], y0120);
+
+#else
position->x[0] = subpixel_snap(v0[0][0] - setup->pixel_offset);
position->x[1] = subpixel_snap(v1[0][0] - setup->pixel_offset);
position->x[2] = subpixel_snap(v2[0][0] - setup->pixel_offset);
-   position->x[3] = 0;
+   position->x[3] = 0; // should be unused
 
position->y[0] = subpixel_snap(v0[0][1] - setup->pixel_offset);
position->y[1] = subpixel_snap(v1[0][1] - setup->pixel_offset);
position->y[2] = subpixel_snap(v2[0][1] - setup->pixel_offset);
-   position->y[3] = 0;
+   position->y[3] = 0; // should be unused
 
position->dx01 = position->x[0] - position->x[1];
position->dy01 = position->y[0] - position->y[1];
 
position->dx20 = position->x[2] - position->x[0];
position->dy20 = position->y[2] - position->y[0];
+#endif
 
position->area = IMUL64(position->dx01, position->dy20) -
  IMUL64(position->dx20, position->dy01);
-- 
2.1.4


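For reference, the scalar (#else) path of the patch above boils down to snapping the three vertex positions to fixed point, taking two edge deltas, and computing the signed area with 64-bit multiplies. Below is a standalone sketch of that calculation. The sketch_ names are hypothetical, the 8 fractional bits in SKETCH_FIXED_ONE are an assumption about the fixed-point format, and sketch_snap() rounds to nearest, away from zero, which is one of the two rounding behaviours the patch comment mentions. It is not the llvmpipe code itself.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 8 fractional bits of subpixel precision. */
#define SKETCH_FIXED_ONE 256

struct sketch_fixed_position {
   int32_t x[3];
   int32_t y[3];
   int32_t dx01, dy01, dx20, dy20;
   int64_t area;
};

/* Round to nearest, away from zero, without needing libm. */
static int32_t
sketch_snap(float a)
{
   float scaled = a * (float)SKETCH_FIXED_ONE;
   return (int32_t)(scaled >= 0.0f ? scaled + 0.5f : scaled - 0.5f);
}

static void
sketch_calc_fixed_position(struct sketch_fixed_position *p,
                           const float v0[2], const float v1[2],
                           const float v2[2], float pixel_offset)
{
   p->x[0] = sketch_snap(v0[0] - pixel_offset);
   p->x[1] = sketch_snap(v1[0] - pixel_offset);
   p->x[2] = sketch_snap(v2[0] - pixel_offset);

   p->y[0] = sketch_snap(v0[1] - pixel_offset);
   p->y[1] = sketch_snap(v1[1] - pixel_offset);
   p->y[2] = sketch_snap(v2[1] - pixel_offset);

   p->dx01 = p->x[0] - p->x[1];
   p->dy01 = p->y[0] - p->y[1];
   p->dx20 = p->x[2] - p->x[0];
   p->dy20 = p->y[2] - p->y[0];

   /* Widen before multiplying so the 32x32 products cannot overflow. */
   p->area = (int64_t)p->dx01 * p->dy20 - (int64_t)p->dx20 * p->dy01;
}
```

The sign of the area carries the winding of the triangle, which is why it has to be computed exactly in fixed point rather than in floating point.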

[Mesa-dev] [Bug 72877] Wrong colors with Mesa 9.2 and Mesa 10.0 on PPC Linux systems

2016-01-02 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=72877

--- Comment #14 from Alex Perez  ---
Ping. I am still experiencing problems with incorrect colors with the very
latest Mesa, compiled from a fresh git checkout today.



Re: [Mesa-dev] [PATCH 0/5] Add ARB_indirect_parameters support

2016-01-02 Thread eocallaghan

In this series patches 1-4 are:

Reviewed-by: Edward O'Callaghan 

No idea what is happening in patch 5 to say anything either way.

On 2016-01-03 07:38, Ilia Mirkin wrote:

The nvc0 patch applies on top of some unpublished patches, see

https://github.com/imirkin/mesa/commits/tmp4

for the full thing. The whole series applies on top of the
ARB_multi_draw_indirect patches I sent earlier (with potential minor
modifications). There is some type confusion between the
ARB_indirect_parameters spec and the Khronos gl.xml/glcorearb.h files,
I went with the latter's definitions.

This passes the relatively simple piglit test I sent.

Ilia Mirkin (5):
  glapi: add ARB_indirect_parameters definitions
  mesa: add parameter buffer, used for ARB_indirect_parameters
  mesa: add support for ARB_indirect_parameters draw functions
  st/mesa: expose ARB_indirect_parameters when the backend driver allows
  nvc0: add ARB_indirect_parameters support

 docs/relnotes/11.2.0.html  |   1 +
 src/gallium/drivers/nouveau/nvc0/mme/com9097.mme   | 157 +
 src/gallium/drivers/nouveau/nvc0/mme/com9097.mme.h | 125 
 src/gallium/drivers/nouveau/nvc0/nvc0_macros.h |   4 +
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.c |   4 +-
 src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c|  29 +++-
 src/mapi/glapi/gen/ARB_indirect_parameters.xml |  30 
 src/mapi/glapi/gen/Makefile.am |   1 +
 src/mapi/glapi/gen/gl_API.xml  |   6 +-
 src/mesa/main/api_validate.c   | 115 +++
 src/mesa/main/api_validate.h   |  16 +++
 src/mesa/main/bufferobj.c  |  15 ++
 src/mesa/main/extensions_table.h   |   1 +
 src/mesa/main/get.c|   5 +
 src/mesa/main/get_hash_params.py   |   4 +
 src/mesa/main/mtypes.h |   2 +
 src/mesa/main/tests/dispatch_sanity.cpp|   4 +
 src/mesa/state_tracker/st_cb_bufferobjects.c   |   1 +
 src/mesa/state_tracker/st_extensions.c |   1 +
 src/mesa/vbo/vbo_exec_array.c  | 124 
 20 files changed, 638 insertions(+), 7 deletions(-)
 create mode 100644 src/mapi/glapi/gen/ARB_indirect_parameters.xml




Re: [Mesa-dev] [PATCH 1/8] tgsi: add ureg support for image decls

2016-01-02 Thread eocallaghan
There is quite a bit of rename churn happening here at the same time as 
the bring up of ureg support for image declarations.
Would it be possible to split the rename churn out from the actual 
behavioral changes please?


On 2016-01-03 15:37, Ilia Mirkin wrote:

Signed-off-by: Ilia Mirkin 
---
 src/gallium/auxiliary/tgsi/tgsi_build.c| 62 +
 src/gallium/auxiliary/tgsi/tgsi_dump.c | 10 +--
 src/gallium/auxiliary/tgsi/tgsi_parse.c|  4 +-
 src/gallium/auxiliary/tgsi/tgsi_parse.h|  2 +-
 src/gallium/auxiliary/tgsi/tgsi_strings.c  |  4 +-
 src/gallium/auxiliary/tgsi/tgsi_text.c | 10 +--
 src/gallium/auxiliary/tgsi/tgsi_ureg.c | 77 ++
 src/gallium/auxiliary/tgsi/tgsi_ureg.h |  7 ++
 src/gallium/drivers/ilo/shader/toy_tgsi.c  |  8 +--
 .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp  | 12 +++-
 src/gallium/drivers/svga/svga_tgsi_vgpu10.c|  2 +
 src/gallium/include/pipe/p_shader_tokens.h |  7 +-
 12 files changed, 153 insertions(+), 52 deletions(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_build.c
b/src/gallium/auxiliary/tgsi/tgsi_build.c
index fdb7feb..bb9d0cb 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_build.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_build.c
@@ -259,36 +259,39 @@ tgsi_build_declaration_semantic(
return ds;
 }

-static struct tgsi_declaration_resource
-tgsi_default_declaration_resource(void)
+static struct tgsi_declaration_image
+tgsi_default_declaration_image(void)
 {
-   struct tgsi_declaration_resource dr;
+   struct tgsi_declaration_image di;

-   dr.Resource = TGSI_TEXTURE_BUFFER;
-   dr.Raw = 0;
-   dr.Writable = 0;
-   dr.Padding = 0;
+   di.Resource = TGSI_TEXTURE_BUFFER;
+   di.Raw = 0;
+   di.Writable = 0;
+   di.Format = 0;
+   di.Padding = 0;

-   return dr;
+   return di;
 }

-static struct tgsi_declaration_resource
-tgsi_build_declaration_resource(unsigned texture,
-unsigned raw,
-unsigned writable,
-struct tgsi_declaration *declaration,
-struct tgsi_header *header)
+static struct tgsi_declaration_image
+tgsi_build_declaration_image(unsigned texture,
+ unsigned format,
+ unsigned raw,
+ unsigned writable,
+ struct tgsi_declaration *declaration,
+ struct tgsi_header *header)
 {
-   struct tgsi_declaration_resource dr;
+   struct tgsi_declaration_image di;

-   dr = tgsi_default_declaration_resource();
-   dr.Resource = texture;
-   dr.Raw = raw;
-   dr.Writable = writable;
+   di = tgsi_default_declaration_image();
+   di.Resource = texture;
+   di.Format = format;
+   di.Raw = raw;
+   di.Writable = writable;

declaration_grow(declaration, header);

-   return dr;
+   return di;
 }

 static struct tgsi_declaration_sampler_view
@@ -364,7 +367,7 @@ tgsi_default_full_declaration( void )
full_declaration.Range = tgsi_default_declaration_range();
full_declaration.Semantic = tgsi_default_declaration_semantic();
full_declaration.Interp = tgsi_default_declaration_interp();
-   full_declaration.Resource = tgsi_default_declaration_resource();
+   full_declaration.Image = tgsi_default_declaration_image();
full_declaration.SamplerView = tgsi_default_declaration_sampler_view();
full_declaration.Array = tgsi_default_declaration_array();

@@ -454,20 +457,21 @@ tgsi_build_full_declaration(
  header );
}

-   if (full_decl->Declaration.File == TGSI_FILE_RESOURCE) {
-  struct tgsi_declaration_resource *dr;
+   if (full_decl->Declaration.File == TGSI_FILE_IMAGE) {
+  struct tgsi_declaration_image *di;

   if (maxsize <= size) {
  return  0;
   }
-  dr = (struct tgsi_declaration_resource *)&tokens[size];
+  di = (struct tgsi_declaration_image *)&tokens[size];
   size++;

-  *dr = tgsi_build_declaration_resource(full_decl->Resource.Resource,
-full_decl->Resource.Raw,
-full_decl->Resource.Writable,
-declaration,
-header);
+  *di = tgsi_build_declaration_image(full_decl->Image.Resource,
+ full_decl->Image.Format,
+ full_decl->Image.Raw,
+ full_decl->Image.Writable,
+ declaration,
+ header);
}

if (full_decl->Declaration.File == TGSI_FILE_SAMPLER_VIEW) {
diff --git a/src/gallium/auxiliary/tgsi/tgsi_dump.c
b/src/gallium/auxiliary/tgsi/tgsi_dump.c
index e29ffb3..dad3839 100644
--- 

Re: [Mesa-dev] [PATCH 0/8] gallium: add shader buffer support

2016-01-02 Thread eocallaghan

In this series patches 2-8 are:

Reviewed-by: Edward O'Callaghan 

with some commentary on patch 1.

Kind Regards,

On 2016-01-03 15:37, Ilia Mirkin wrote:

This provides enough support in TGSI to support shader buffers. I do
away with the defunct TGSI_FILE_RESOURCE (renaming it into
TGSI_FILE_IMAGE to work with pipe_image_view), and add a brand new
TGSI_FILE_BUFFER. At the declaration level, this can have an ATOMIC
qualifier (and later a SHARED qualifier for compute shaders).

I also add memory qualifiers to LOAD/STORE opcodes, which can convey
the coherent/volatile/restrict flags as specified in the GLSL. I also
modified all of the formerly resource opcodes to work on both buffers
and images. For images they will derive the format from the IMAGE
declaration, while buffers are format-less by definition.

This is still missing a way to implement memory barriers, that will
come soon, and is not going to affect anything else I do in this
series.

For the full series I'm working on, you can look at

https://github.com/imirkin/mesa/commits/atomic3

which exposes ARB_shader_atomic_counters and
ARB_shader_storage_buffer_objects on nvc0+ (but it won't work on
maxwell -- need to add emission of atomic ops and cache control).

However this is a nice self-contained chunk to start with.

Ilia Mirkin (8):
  tgsi: add ureg support for image decls
  ureg: add buffer support to ureg
  tgsi: provide a way to encode memory qualifiers for SSBO
  tgsi: add a is_store property
  tgsi: update atomic op docs
  gallium: add PIPE_SHADER_CAP_MAX_SHADER_BUFFERS
  gallium: add PIPE_CAP_SHADER_BUFFER_OFFSET_ALIGNMENT
  gallium: add a RESQ opcode to query info about a resource

 src/gallium/auxiliary/gallivm/lp_bld_limits.h  |   1 +
 src/gallium/auxiliary/tgsi/tgsi_build.c| 112 --
 src/gallium/auxiliary/tgsi/tgsi_dump.c |  25 +-
 src/gallium/auxiliary/tgsi/tgsi_exec.h |   1 +
 src/gallium/auxiliary/tgsi/tgsi_info.c | 446 ++---
 src/gallium/auxiliary/tgsi/tgsi_info.h |   1 +
 src/gallium/auxiliary/tgsi/tgsi_parse.c|   8 +-
 src/gallium/auxiliary/tgsi/tgsi_parse.h|   3 +-
 src/gallium/auxiliary/tgsi/tgsi_strings.c  |  12 +-
 src/gallium/auxiliary/tgsi/tgsi_strings.h  |   2 +
 src/gallium/auxiliary/tgsi/tgsi_text.c |  42 +-
 src/gallium/auxiliary/tgsi/tgsi_ureg.c | 182 +
 src/gallium/auxiliary/tgsi/tgsi_ureg.h |  23 ++
 src/gallium/docs/source/screen.rst |   8 +
 src/gallium/docs/source/tgsi.rst   | 105 ++---
 src/gallium/drivers/freedreno/freedreno_screen.c   |   3 +
 src/gallium/drivers/i915/i915_screen.c |   1 +
 src/gallium/drivers/ilo/ilo_screen.c   |   1 +
 src/gallium/drivers/ilo/shader/toy_tgsi.c  |   8 +-
 src/gallium/drivers/llvmpipe/lp_screen.c   |   1 +
 .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp  |  12 +-
 src/gallium/drivers/nouveau/nv30/nv30_screen.c |   3 +
 src/gallium/drivers/nouveau/nv50/nv50_screen.c |   2 +
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.c |   2 +
 src/gallium/drivers/r300/r300_screen.c |   3 +
 src/gallium/drivers/r600/r600_pipe.c   |   2 +
 src/gallium/drivers/radeonsi/si_pipe.c |   3 +
 src/gallium/drivers/softpipe/sp_screen.c   |   1 +
 src/gallium/drivers/svga/svga_screen.c |   4 +
 src/gallium/drivers/svga/svga_tgsi_vgpu10.c|   2 +
 src/gallium/drivers/vc4/vc4_screen.c   |   3 +
 src/gallium/drivers/virgl/virgl_screen.c   |   1 +
 src/gallium/include/pipe/p_defines.h   |   2 +
 src/gallium/include/pipe/p_shader_tokens.h |  28 +-
 34 files changed, 729 insertions(+), 324 deletions(-)




Re: [Mesa-dev] [PATCH 1/6] gallium: document PK2H/UP2H

2016-01-02 Thread eocallaghan

This series is:

Reviewed-by: Edward O'Callaghan 

On 2016-01-03 11:37, Ilia Mirkin wrote:

Signed-off-by: Ilia Mirkin 
---
 src/gallium/docs/source/tgsi.rst | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/src/gallium/docs/source/tgsi.rst 
b/src/gallium/docs/source/tgsi.rst

index 955ece8..f69998f 100644
--- a/src/gallium/docs/source/tgsi.rst
+++ b/src/gallium/docs/source/tgsi.rst
@@ -458,7 +458,9 @@ while DDY is allowed to be the same for the entire 
2x2 quad.


 .. opcode:: PK2H - Pack Two 16-bit Floats

-  TBD
+.. math::
+
+  dst.x = f32\_to\_f16(src.x) | f32\_to\_f16(src.y) << 16


 .. opcode:: PK2US - Pack Two Unsigned 16-bit Scalars
@@ -615,7 +617,11 @@ This instruction replicates its result.

 .. opcode:: UP2H - Unpack Two 16-Bit Floats

-  TBD
+.. math::
+
+  dst.x = f16\_to\_f32(src0.x \& 0xffff)
+
+  dst.y = f16\_to\_f32(src0.x >> 16)

 .. note::



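The bit layout documented for PK2H/UP2H above can be sketched in plain C as follows. This only illustrates the packing, with the first half in the low 16 bits and the second in the high 16 bits. The float/half conversions themselves are left out, and the sketch_ helper names are hypothetical rather than anything in gallium.

```c
#include <assert.h>
#include <stdint.h>

/* Pack two already-converted 16-bit half values into one 32-bit word:
 * first value in bits 0..15, second value in bits 16..31. */
static uint32_t
sketch_pack_2x16(uint16_t lo, uint16_t hi)
{
   return (uint32_t)lo | ((uint32_t)hi << 16);
}

/* Inverse of the above: split the word back into its two halves. */
static void
sketch_unpack_2x16(uint32_t packed, uint16_t *lo, uint16_t *hi)
{
   *lo = (uint16_t)(packed & 0xffff);
   *hi = (uint16_t)(packed >> 16);
}
```

For instance, packing the half-float bit patterns 0x3c00 (1.0) and 0xc000 (-2.0) gives 0xc0003c00, and unpacking that word returns the same two patterns.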

Re: [Mesa-dev] [PATCH 1/8] tgsi: add ureg support for image decls

2016-01-02 Thread Ilia Mirkin
On Sun, Jan 3, 2016 at 2:33 AM,   wrote:
> There is quite a bit of rename churn happening here at the same time as the
> bring up of ureg support for image declarations.
> Would it be possible to split the rename churn out from the actual
> behavioral changes please?

This is almost exclusively a rename. The only other thing is adding
the format to the tgsi_declaration_image (formerly
tgsi_declaration_resource) and a couple of ureg helpers. I don't think
it's really worth splitting apart, although if others feel similarly I
can go back and do it.

>
>
> On 2016-01-03 15:37, Ilia Mirkin wrote:
>>
>> Signed-off-by: Ilia Mirkin 
>> ---
>>  src/gallium/auxiliary/tgsi/tgsi_build.c| 62 +
>>  src/gallium/auxiliary/tgsi/tgsi_dump.c | 10 +--
>>  src/gallium/auxiliary/tgsi/tgsi_parse.c|  4 +-
>>  src/gallium/auxiliary/tgsi/tgsi_parse.h|  2 +-
>>  src/gallium/auxiliary/tgsi/tgsi_strings.c  |  4 +-
>>  src/gallium/auxiliary/tgsi/tgsi_text.c | 10 +--
>>  src/gallium/auxiliary/tgsi/tgsi_ureg.c | 77
>> ++
>>  src/gallium/auxiliary/tgsi/tgsi_ureg.h |  7 ++
>>  src/gallium/drivers/ilo/shader/toy_tgsi.c  |  8 +--
>>  .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp  | 12 +++-
>>  src/gallium/drivers/svga/svga_tgsi_vgpu10.c|  2 +
>>  src/gallium/include/pipe/p_shader_tokens.h |  7 +-
>>  12 files changed, 153 insertions(+), 52 deletions(-)
>>
>> diff --git a/src/gallium/auxiliary/tgsi/tgsi_build.c
>> b/src/gallium/auxiliary/tgsi/tgsi_build.c
>> index fdb7feb..bb9d0cb 100644
>> --- a/src/gallium/auxiliary/tgsi/tgsi_build.c
>> +++ b/src/gallium/auxiliary/tgsi/tgsi_build.c
>> @@ -259,36 +259,39 @@ tgsi_build_declaration_semantic(
>> return ds;
>>  }
>>
>> -static struct tgsi_declaration_resource
>> -tgsi_default_declaration_resource(void)
>> +static struct tgsi_declaration_image
>> +tgsi_default_declaration_image(void)
>>  {
>> -   struct tgsi_declaration_resource dr;
>> +   struct tgsi_declaration_image di;
>>
>> -   dr.Resource = TGSI_TEXTURE_BUFFER;
>> -   dr.Raw = 0;
>> -   dr.Writable = 0;
>> -   dr.Padding = 0;
>> +   di.Resource = TGSI_TEXTURE_BUFFER;
>> +   di.Raw = 0;
>> +   di.Writable = 0;
>> +   di.Format = 0;
>> +   di.Padding = 0;
>>
>> -   return dr;
>> +   return di;
>>  }
>>
>> -static struct tgsi_declaration_resource
>> -tgsi_build_declaration_resource(unsigned texture,
>> -unsigned raw,
>> -unsigned writable,
>> -struct tgsi_declaration *declaration,
>> -struct tgsi_header *header)
>> +static struct tgsi_declaration_image
>> +tgsi_build_declaration_image(unsigned texture,
>> + unsigned format,
>> + unsigned raw,
>> + unsigned writable,
>> + struct tgsi_declaration *declaration,
>> + struct tgsi_header *header)
>>  {
>> -   struct tgsi_declaration_resource dr;
>> +   struct tgsi_declaration_image di;
>>
>> -   dr = tgsi_default_declaration_resource();
>> -   dr.Resource = texture;
>> -   dr.Raw = raw;
>> -   dr.Writable = writable;
>> +   di = tgsi_default_declaration_image();
>> +   di.Resource = texture;
>> +   di.Format = format;
>> +   di.Raw = raw;
>> +   di.Writable = writable;
>>
>> declaration_grow(declaration, header);
>>
>> -   return dr;
>> +   return di;
>>  }
>>
>>  static struct tgsi_declaration_sampler_view
>> @@ -364,7 +367,7 @@ tgsi_default_full_declaration( void )
>> full_declaration.Range = tgsi_default_declaration_range();
>> full_declaration.Semantic = tgsi_default_declaration_semantic();
>> full_declaration.Interp = tgsi_default_declaration_interp();
>> -   full_declaration.Resource = tgsi_default_declaration_resource();
>> +   full_declaration.Image = tgsi_default_declaration_image();
>> full_declaration.SamplerView =
>> tgsi_default_declaration_sampler_view();
>> full_declaration.Array = tgsi_default_declaration_array();
>>
>> @@ -454,20 +457,21 @@ tgsi_build_full_declaration(
>>   header );
>> }
>>
>> -   if (full_decl->Declaration.File == TGSI_FILE_RESOURCE) {
>> -  struct tgsi_declaration_resource *dr;
>> +   if (full_decl->Declaration.File == TGSI_FILE_IMAGE) {
>> +  struct tgsi_declaration_image *di;
>>
>>if (maxsize <= size) {
>>   return  0;
>>}
>> -  dr = (struct tgsi_declaration_resource *)&tokens[size];
>> +  di = (struct tgsi_declaration_image *)&tokens[size];
>>size++;
>>
>> -  *dr = tgsi_build_declaration_resource(full_decl->Resource.Resource,
>> -full_decl->Resource.Raw,
>> -full_decl->Resource.Writable,
>> -  

[Mesa-dev] [PATCH 6/8] gallium: add PIPE_SHADER_CAP_MAX_SHADER_BUFFERS

2016-01-02 Thread Ilia Mirkin
Signed-off-by: Ilia Mirkin 
---
 src/gallium/auxiliary/gallivm/lp_bld_limits.h| 1 +
 src/gallium/auxiliary/tgsi/tgsi_exec.h   | 1 +
 src/gallium/docs/source/screen.rst   | 4 
 src/gallium/drivers/freedreno/freedreno_screen.c | 2 ++
 src/gallium/drivers/nouveau/nv30/nv30_screen.c   | 2 ++
 src/gallium/drivers/nouveau/nv50/nv50_screen.c   | 1 +
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.c   | 1 +
 src/gallium/drivers/r300/r300_screen.c   | 2 ++
 src/gallium/drivers/r600/r600_pipe.c | 1 +
 src/gallium/drivers/radeonsi/si_pipe.c   | 2 ++
 src/gallium/drivers/svga/svga_screen.c   | 3 +++
 src/gallium/drivers/vc4/vc4_screen.c | 2 ++
 src/gallium/include/pipe/p_defines.h | 1 +
 13 files changed, 23 insertions(+)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_limits.h 
b/src/gallium/auxiliary/gallivm/lp_bld_limits.h
index ad64ae0..4598db8 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_limits.h
+++ b/src/gallium/auxiliary/gallivm/lp_bld_limits.h
@@ -136,6 +136,7 @@ gallivm_get_shader_param(enum pipe_shader_cap param)
case PIPE_SHADER_CAP_TGSI_DROUND_SUPPORTED:
case PIPE_SHADER_CAP_TGSI_DFRACEXP_DLDEXP_SUPPORTED:
case PIPE_SHADER_CAP_TGSI_FMA_SUPPORTED:
+   case PIPE_SHADER_CAP_MAX_SHADER_BUFFERS:
   return 0;
case PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINT:
   return 32;
diff --git a/src/gallium/auxiliary/tgsi/tgsi_exec.h 
b/src/gallium/auxiliary/tgsi/tgsi_exec.h
index f86adce..26fec8e 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_exec.h
+++ b/src/gallium/auxiliary/tgsi/tgsi_exec.h
@@ -473,6 +473,7 @@ tgsi_exec_get_shader_param(enum pipe_shader_cap param)
   return 1;
case PIPE_SHADER_CAP_TGSI_DROUND_SUPPORTED:
case PIPE_SHADER_CAP_TGSI_FMA_SUPPORTED:
+   case PIPE_SHADER_CAP_MAX_SHADER_BUFFERS:
   return 0;
case PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINT:
   return 32;
diff --git a/src/gallium/docs/source/screen.rst b/src/gallium/docs/source/screen.rst
index 41bd0f8..4402809 100644
--- a/src/gallium/docs/source/screen.rst
+++ b/src/gallium/docs/source/screen.rst
@@ -377,6 +377,10 @@ to be 0.
   of iterations that loops are allowed to have to be unrolled. It is only
   a hint to state trackers. Whether any loops will be unrolled is not
   guaranteed.
+* ``PIPE_SHADER_CAP_MAX_SHADER_BUFFERS``: Maximum number of memory buffers
+  (also used to implement atomic counters). Having this be non-0 also
+  implies support for the ``LOAD``, ``STORE``, and ``ATOM*`` TGSI
+  opcodes.
 
 
 .. _pipe_compute_cap:
diff --git a/src/gallium/drivers/freedreno/freedreno_screen.c b/src/gallium/drivers/freedreno/freedreno_screen.c
index 4b6d6af..bf356c4 100644
--- a/src/gallium/drivers/freedreno/freedreno_screen.c
+++ b/src/gallium/drivers/freedreno/freedreno_screen.c
@@ -415,6 +415,8 @@ fd_screen_get_shader_param(struct pipe_screen *pscreen, unsigned shader,
return PIPE_SHADER_IR_TGSI;
case PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINT:
return 32;
+   case PIPE_SHADER_CAP_MAX_SHADER_BUFFERS:
+   return 0;
}
debug_printf("unknown shader param %d\n", param);
return 0;
diff --git a/src/gallium/drivers/nouveau/nv30/nv30_screen.c b/src/gallium/drivers/nouveau/nv30/nv30_screen.c
index 02303bb..3d77f81 100644
--- a/src/gallium/drivers/nouveau/nv30/nv30_screen.c
+++ b/src/gallium/drivers/nouveau/nv30/nv30_screen.c
@@ -266,6 +266,7 @@ nv30_screen_get_shader_param(struct pipe_screen *pscreen, unsigned shader,
   case PIPE_SHADER_CAP_TGSI_DFRACEXP_DLDEXP_SUPPORTED:
   case PIPE_SHADER_CAP_TGSI_FMA_SUPPORTED:
   case PIPE_SHADER_CAP_TGSI_ANY_INOUT_DECL_RANGE:
+  case PIPE_SHADER_CAP_MAX_SHADER_BUFFERS:
  return 0;
   case PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINT:
  return 32;
@@ -309,6 +310,7 @@ nv30_screen_get_shader_param(struct pipe_screen *pscreen, unsigned shader,
   case PIPE_SHADER_CAP_TGSI_DFRACEXP_DLDEXP_SUPPORTED:
   case PIPE_SHADER_CAP_TGSI_FMA_SUPPORTED:
   case PIPE_SHADER_CAP_TGSI_ANY_INOUT_DECL_RANGE:
+  case PIPE_SHADER_CAP_MAX_SHADER_BUFFERS:
  return 0;
   case PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINT:
  return 32;
diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.c b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
index b3f2492..aafca71 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_screen.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
@@ -301,6 +301,7 @@ nv50_screen_get_shader_param(struct pipe_screen *pscreen, unsigned shader,
case PIPE_SHADER_CAP_TGSI_DFRACEXP_DLDEXP_SUPPORTED:
case PIPE_SHADER_CAP_TGSI_FMA_SUPPORTED:
case PIPE_SHADER_CAP_TGSI_ANY_INOUT_DECL_RANGE:
+   case PIPE_SHADER_CAP_MAX_SHADER_BUFFERS:
   return 0;
case PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINT:
   return 32;
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c 

[Mesa-dev] [PATCH 0/8] gallium: add shader buffer support

2016-01-02 Thread Ilia Mirkin
This provides enough support in TGSI to implement shader buffers. I do
away with the defunct TGSI_FILE_RESOURCE (renaming it to
TGSI_FILE_IMAGE to work with pipe_image_view), and add a brand new
TGSI_FILE_BUFFER. At the declaration level, this can have an ATOMIC
qualifier (and later a SHARED qualifier for compute shaders).

I also add memory qualifiers to LOAD/STORE opcodes, which can convey
the coherent/volatile/restrict flags as specified in GLSL. I also
modified all of the former resource opcodes to work on both buffers
and images. For images they will derive the format from the IMAGE
declaration, while buffers are format-less by definition.

This is still missing a way to implement memory barriers. That will
come soon, and is not going to affect anything else I do in this
series.

For the full series I'm working on, you can look at

https://github.com/imirkin/mesa/commits/atomic3

which exposes ARB_shader_atomic_counters and
ARB_shader_storage_buffer_objects on nvc0+ (but it won't work on
maxwell -- need to add emission of atomic ops and cache control).

However this is a nice self-contained chunk to start with.

Ilia Mirkin (8):
  tgsi: add ureg support for image decls
  ureg: add buffer support to ureg
  tgsi: provide a way to encode memory qualifiers for SSBO
  tgsi: add a is_store property
  tgsi: update atomic op docs
  gallium: add PIPE_SHADER_CAP_MAX_SHADER_BUFFERS
  gallium: add PIPE_CAP_SHADER_BUFFER_OFFSET_ALIGNMENT
  gallium: add a RESQ opcode to query info about a resource

 src/gallium/auxiliary/gallivm/lp_bld_limits.h  |   1 +
 src/gallium/auxiliary/tgsi/tgsi_build.c| 112 --
 src/gallium/auxiliary/tgsi/tgsi_dump.c |  25 +-
 src/gallium/auxiliary/tgsi/tgsi_exec.h |   1 +
 src/gallium/auxiliary/tgsi/tgsi_info.c | 446 ++---
 src/gallium/auxiliary/tgsi/tgsi_info.h |   1 +
 src/gallium/auxiliary/tgsi/tgsi_parse.c|   8 +-
 src/gallium/auxiliary/tgsi/tgsi_parse.h|   3 +-
 src/gallium/auxiliary/tgsi/tgsi_strings.c  |  12 +-
 src/gallium/auxiliary/tgsi/tgsi_strings.h  |   2 +
 src/gallium/auxiliary/tgsi/tgsi_text.c |  42 +-
 src/gallium/auxiliary/tgsi/tgsi_ureg.c | 182 +
 src/gallium/auxiliary/tgsi/tgsi_ureg.h |  23 ++
 src/gallium/docs/source/screen.rst |   8 +
 src/gallium/docs/source/tgsi.rst   | 105 ++---
 src/gallium/drivers/freedreno/freedreno_screen.c   |   3 +
 src/gallium/drivers/i915/i915_screen.c |   1 +
 src/gallium/drivers/ilo/ilo_screen.c   |   1 +
 src/gallium/drivers/ilo/shader/toy_tgsi.c  |   8 +-
 src/gallium/drivers/llvmpipe/lp_screen.c   |   1 +
 .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp  |  12 +-
 src/gallium/drivers/nouveau/nv30/nv30_screen.c |   3 +
 src/gallium/drivers/nouveau/nv50/nv50_screen.c |   2 +
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.c |   2 +
 src/gallium/drivers/r300/r300_screen.c |   3 +
 src/gallium/drivers/r600/r600_pipe.c   |   2 +
 src/gallium/drivers/radeonsi/si_pipe.c |   3 +
 src/gallium/drivers/softpipe/sp_screen.c   |   1 +
 src/gallium/drivers/svga/svga_screen.c |   4 +
 src/gallium/drivers/svga/svga_tgsi_vgpu10.c|   2 +
 src/gallium/drivers/vc4/vc4_screen.c   |   3 +
 src/gallium/drivers/virgl/virgl_screen.c   |   1 +
 src/gallium/include/pipe/p_defines.h   |   2 +
 src/gallium/include/pipe/p_shader_tokens.h |  28 +-
 34 files changed, 729 insertions(+), 324 deletions(-)

-- 
2.4.10



[Mesa-dev] [PATCH 5/8] tgsi: update atomic op docs

2016-01-02 Thread Ilia Mirkin
Specify that the operation only applies to the x component, not
per-component as previously specified. Per-component behavior is
unnecessary for GL and creates additional complications for images,
which need to support these operations as well.
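
To make the x-only semantics concrete, here is a minimal C11 sketch
that is not part of the patch. It models a buffer as an array of 32-bit
words on top of <stdatomic.h>. The names word_t and model_atom* are
made up for illustration, and indexing the array by word (rather than
by whatever address unit a driver uses) is an assumption of the model.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for a shader buffer: an array of 32-bit words. */
typedef _Atomic uint32_t word_t;

/* ATOMUADD model:
 *   dst_x = resource[offset]
 *   resource[offset] = dst_x + src_x
 * Only one scalar (the x component) is read, added, and returned. */
static uint32_t
model_atomuadd(word_t *resource, uint32_t offset, uint32_t src_x)
{
   return atomic_fetch_add(&resource[offset], src_x);
}

/* ATOMXCHG model:
 *   dst_x = resource[offset]
 *   resource[offset] = src_x */
static uint32_t
model_atomxchg(word_t *resource, uint32_t offset, uint32_t src_x)
{
   return atomic_exchange(&resource[offset], src_x);
}

/* ATOMCAS model:
 *   dst_x = resource[offset]
 *   resource[offset] = (dst_x == cmp_x ? src_x : dst_x) */
static uint32_t
model_atomcas(word_t *resource, uint32_t offset,
              uint32_t cmp_x, uint32_t src_x)
{
   uint32_t expected = cmp_x;

   /* On success 'expected' still equals the old value; on failure the
    * call overwrites it with the old value. Either way it holds dst_x. */
   atomic_compare_exchange_strong(&resource[offset], &expected, src_x);
   return expected;
}
```

With a word initialised to 5, model_atomuadd(buf, 0, 3) returns 5 and
leaves 8 in the buffer. Each call touches a single 32-bit value, which
is the point of dropping the per-component wording.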

Signed-off-by: Ilia Mirkin 
---
 src/gallium/docs/source/tgsi.rst | 93 
 1 file changed, 47 insertions(+), 46 deletions(-)

diff --git a/src/gallium/docs/source/tgsi.rst b/src/gallium/docs/source/tgsi.rst
index 955ece8..a3151e3 100644
--- a/src/gallium/docs/source/tgsi.rst
+++ b/src/gallium/docs/source/tgsi.rst
@@ -2252,11 +2252,11 @@ after lookup.
 Resource Access Opcodes
 ^^^^^^^^^^^^^^^^^^^^^^^
 
-.. opcode:: LOAD - Fetch data from a shader resource
+.. opcode:: LOAD - Fetch data from a shader buffer or image
 
Syntax: ``LOAD dst, resource, address``
 
-   Example: ``LOAD TEMP[0], RES[0], TEMP[1]``
+   Example: ``LOAD TEMP[0], BUFFER[0], TEMP[1]``
 
Using the provided integer address, LOAD fetches data
from the specified buffer or texture without any
@@ -2280,7 +2280,7 @@ Resource Access Opcodes
 
Syntax: ``STORE resource, address, src``
 
-   Example: ``STORE RES[0], TEMP[0], TEMP[1]``
+   Example: ``STORE BUFFER[0], TEMP[0], TEMP[1]``
 
Using the provided integer address, STORE writes data
to the specified buffer or texture.
@@ -2358,158 +2358,159 @@ These opcodes provide atomic variants of some common arithmetic and
 logical operations.  In this context atomicity means that another
 concurrent memory access operation that affects the same memory
 location is guaranteed to be performed strictly before or after the
-entire execution of the atomic operation.
-
-For the moment they're only valid in compute programs.
+entire execution of the atomic operation. The resource may be a buffer
+or an image. In the case of an image, the offset works the same as for
+``LOAD`` and ``STORE``, specified above. These atomic operations may
+only be used with 32-bit integer image formats.
 
 .. opcode:: ATOMUADD - Atomic integer addition
 
   Syntax: ``ATOMUADD dst, resource, offset, src``
 
-  Example: ``ATOMUADD TEMP[0], RES[0], TEMP[1], TEMP[2]``
+  Example: ``ATOMUADD TEMP[0], BUFFER[0], TEMP[1], TEMP[2]``
 
-  The following operation is performed atomically on each component:
+  The following operation is performed atomically:
 
 .. math::
 
-  dst_i = resource[offset]_i
+  dst_x = resource[offset]
 
-  resource[offset]_i = dst_i + src_i
+  resource[offset] = dst_x + src_x
 
 
 .. opcode:: ATOMXCHG - Atomic exchange
 
   Syntax: ``ATOMXCHG dst, resource, offset, src``
 
-  Example: ``ATOMXCHG TEMP[0], RES[0], TEMP[1], TEMP[2]``
+  Example: ``ATOMXCHG TEMP[0], BUFFER[0], TEMP[1], TEMP[2]``
 
-  The following operation is performed atomically on each component:
+  The following operation is performed atomically:
 
 .. math::
 
-  dst_i = resource[offset]_i
+  dst_x = resource[offset]
 
-  resource[offset]_i = src_i
+  resource[offset] = src_x
 
 
 .. opcode:: ATOMCAS - Atomic compare-and-exchange
 
   Syntax: ``ATOMCAS dst, resource, offset, cmp, src``
 
-  Example: ``ATOMCAS TEMP[0], RES[0], TEMP[1], TEMP[2], TEMP[3]``
+  Example: ``ATOMCAS TEMP[0], BUFFER[0], TEMP[1], TEMP[2], TEMP[3]``
 
-  The following operation is performed atomically on each component:
+  The following operation is performed atomically:
 
 .. math::
 
-  dst_i = resource[offset]_i
+  dst_x = resource[offset]
 
-  resource[offset]_i = (dst_i == cmp_i ? src_i : dst_i)
+  resource[offset] = (dst_x == cmp_x ? src_x : dst_x)
 
 
 .. opcode:: ATOMAND - Atomic bitwise And
 
   Syntax: ``ATOMAND dst, resource, offset, src``
 
-  Example: ``ATOMAND TEMP[0], RES[0], TEMP[1], TEMP[2]``
+  Example: ``ATOMAND TEMP[0], BUFFER[0], TEMP[1], TEMP[2]``
 
-  The following operation is performed atomically on each component:
+  The following operation is performed atomically:
 
 .. math::
 
-  dst_i = resource[offset]_i
+  dst_x = resource[offset]
 
-  resource[offset]_i = dst_i \& src_i
+  resource[offset] = dst_x \& src_x
 
 
 .. opcode:: ATOMOR - Atomic bitwise Or
 
   Syntax: ``ATOMOR dst, resource, offset, src``
 
-  Example: ``ATOMOR TEMP[0], RES[0], TEMP[1], TEMP[2]``
+  Example: ``ATOMOR TEMP[0], BUFFER[0], TEMP[1], TEMP[2]``
 
-  The following operation is performed atomically on each component:
+  The following operation is performed atomically:
 
 .. math::
 
-  dst_i = resource[offset]_i
+  dst_x = resource[offset]
 
-  resource[offset]_i = dst_i | src_i
+  resource[offset] = dst_x | src_x
 
 
 .. opcode:: ATOMXOR - Atomic bitwise Xor
 
   Syntax: ``ATOMXOR dst, resource, offset, src``
 
-  Example: ``ATOMXOR TEMP[0], RES[0], TEMP[1], TEMP[2]``
+  Example: ``ATOMXOR TEMP[0], BUFFER[0], TEMP[1], TEMP[2]``
 
-  The following operation is performed atomically on each component:
+  The following operation is performed 

[Mesa-dev] [PATCH 4/8] tgsi: add a is_store property

2016-01-02 Thread Ilia Mirkin
Signed-off-by: Ilia Mirkin 
---
 src/gallium/auxiliary/tgsi/tgsi_info.c | 446 -
 src/gallium/auxiliary/tgsi/tgsi_info.h |   1 +
 2 files changed, 224 insertions(+), 223 deletions(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_info.c b/src/gallium/auxiliary/tgsi/tgsi_info.c
index 3b40c3d..8a0e9c4 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_info.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_info.c
@@ -37,231 +37,231 @@
 
 static const struct tgsi_opcode_info opcode_info[TGSI_OPCODE_LAST] =
 {
-   { 1, 1, 0, 0, 0, 0, COMP, "ARL", TGSI_OPCODE_ARL },
-   { 1, 1, 0, 0, 0, 0, COMP, "MOV", TGSI_OPCODE_MOV },
-   { 1, 1, 0, 0, 0, 0, CHAN, "LIT", TGSI_OPCODE_LIT },
-   { 1, 1, 0, 0, 0, 0, REPL, "RCP", TGSI_OPCODE_RCP },
-   { 1, 1, 0, 0, 0, 0, REPL, "RSQ", TGSI_OPCODE_RSQ },
-   { 1, 1, 0, 0, 0, 0, CHAN, "EXP", TGSI_OPCODE_EXP },
-   { 1, 1, 0, 0, 0, 0, CHAN, "LOG", TGSI_OPCODE_LOG },
-   { 1, 2, 0, 0, 0, 0, COMP, "MUL", TGSI_OPCODE_MUL },
-   { 1, 2, 0, 0, 0, 0, COMP, "ADD", TGSI_OPCODE_ADD },
-   { 1, 2, 0, 0, 0, 0, REPL, "DP3", TGSI_OPCODE_DP3 },
-   { 1, 2, 0, 0, 0, 0, REPL, "DP4", TGSI_OPCODE_DP4 },
-   { 1, 2, 0, 0, 0, 0, CHAN, "DST", TGSI_OPCODE_DST },
-   { 1, 2, 0, 0, 0, 0, COMP, "MIN", TGSI_OPCODE_MIN },
-   { 1, 2, 0, 0, 0, 0, COMP, "MAX", TGSI_OPCODE_MAX },
-   { 1, 2, 0, 0, 0, 0, COMP, "SLT", TGSI_OPCODE_SLT },
-   { 1, 2, 0, 0, 0, 0, COMP, "SGE", TGSI_OPCODE_SGE },
-   { 1, 3, 0, 0, 0, 0, COMP, "MAD", TGSI_OPCODE_MAD },
-   { 1, 2, 0, 0, 0, 0, COMP, "SUB", TGSI_OPCODE_SUB },
-   { 1, 3, 0, 0, 0, 0, COMP, "LRP", TGSI_OPCODE_LRP },
-   { 1, 3, 0, 0, 0, 0, COMP, "FMA", TGSI_OPCODE_FMA },
-   { 1, 1, 0, 0, 0, 0, REPL, "SQRT", TGSI_OPCODE_SQRT },
-   { 1, 3, 0, 0, 0, 0, REPL, "DP2A", TGSI_OPCODE_DP2A },
-   { 0, 0, 0, 0, 0, 0, NONE, "", 22 },  /* removed */
-   { 0, 0, 0, 0, 0, 0, NONE, "", 23 },  /* removed */
-   { 1, 1, 0, 0, 0, 0, COMP, "FRC", TGSI_OPCODE_FRC },
-   { 1, 3, 0, 0, 0, 0, COMP, "CLAMP", TGSI_OPCODE_CLAMP },
-   { 1, 1, 0, 0, 0, 0, COMP, "FLR", TGSI_OPCODE_FLR },
-   { 1, 1, 0, 0, 0, 0, COMP, "ROUND", TGSI_OPCODE_ROUND },
-   { 1, 1, 0, 0, 0, 0, REPL, "EX2", TGSI_OPCODE_EX2 },
-   { 1, 1, 0, 0, 0, 0, REPL, "LG2", TGSI_OPCODE_LG2 },
-   { 1, 2, 0, 0, 0, 0, REPL, "POW", TGSI_OPCODE_POW },
-   { 1, 2, 0, 0, 0, 0, COMP, "XPD", TGSI_OPCODE_XPD },
-   { 0, 0, 0, 0, 0, 0, NONE, "", 32 },  /* removed */
-   { 1, 1, 0, 0, 0, 0, COMP, "ABS", TGSI_OPCODE_ABS },
-   { 0, 0, 0, 0, 0, 0, NONE, "", 34 },  /* removed */
-   { 1, 2, 0, 0, 0, 0, REPL, "DPH", TGSI_OPCODE_DPH },
-   { 1, 1, 0, 0, 0, 0, REPL, "COS", TGSI_OPCODE_COS },
-   { 1, 1, 0, 0, 0, 0, COMP, "DDX", TGSI_OPCODE_DDX },
-   { 1, 1, 0, 0, 0, 0, COMP, "DDY", TGSI_OPCODE_DDY },
-   { 0, 0, 0, 0, 0, 0, NONE, "KILL", TGSI_OPCODE_KILL },
-   { 1, 1, 0, 0, 0, 0, COMP, "PK2H", TGSI_OPCODE_PK2H },
-   { 1, 1, 0, 0, 0, 0, COMP, "PK2US", TGSI_OPCODE_PK2US },
-   { 1, 1, 0, 0, 0, 0, COMP, "PK4B", TGSI_OPCODE_PK4B },
-   { 1, 1, 0, 0, 0, 0, COMP, "PK4UB", TGSI_OPCODE_PK4UB },
-   { 0, 1, 0, 0, 0, 1, NONE, "", 44 },  /* removed */
-   { 1, 2, 0, 0, 0, 0, COMP, "SEQ", TGSI_OPCODE_SEQ },
-   { 0, 1, 0, 0, 0, 1, NONE, "", 46 },  /* removed */
-   { 1, 2, 0, 0, 0, 0, COMP, "SGT", TGSI_OPCODE_SGT },
-   { 1, 1, 0, 0, 0, 0, REPL, "SIN", TGSI_OPCODE_SIN },
-   { 1, 2, 0, 0, 0, 0, COMP, "SLE", TGSI_OPCODE_SLE },
-   { 1, 2, 0, 0, 0, 0, COMP, "SNE", TGSI_OPCODE_SNE },
-   { 0, 1, 0, 0, 0, 1, NONE, "", 51 },  /* removed */
-   { 1, 2, 1, 0, 0, 0, OTHR, "TEX", TGSI_OPCODE_TEX },
-   { 1, 4, 1, 0, 0, 0, OTHR, "TXD", TGSI_OPCODE_TXD },
-   { 1, 2, 1, 0, 0, 0, OTHR, "TXP", TGSI_OPCODE_TXP },
-   { 1, 1, 0, 0, 0, 0, COMP, "UP2H", TGSI_OPCODE_UP2H },
-   { 1, 1, 0, 0, 0, 0, COMP, "UP2US", TGSI_OPCODE_UP2US },
-   { 1, 1, 0, 0, 0, 0, COMP, "UP4B", TGSI_OPCODE_UP4B },
-   { 1, 1, 0, 0, 0, 0, COMP, "UP4UB", TGSI_OPCODE_UP4UB },
-   { 0, 1, 0, 0, 0, 1, NONE, "", 59 },  /* removed */
-   { 0, 1, 0, 0, 0, 1, NONE, "", 60 },  /* removed */
-   { 1, 1, 0, 0, 0, 0, COMP, "ARR", TGSI_OPCODE_ARR },
-   { 0, 1, 0, 0, 0, 1, NONE, "", 62 },  /* removed */
-   { 0, 0, 0, 1, 0, 0, NONE, "CAL", TGSI_OPCODE_CAL },
-   { 0, 0, 0, 0, 0, 0, NONE, "RET", TGSI_OPCODE_RET },
-   { 1, 1, 0, 0, 0, 0, COMP, "SSG", TGSI_OPCODE_SSG },
-   { 1, 3, 0, 0, 0, 0, COMP, "CMP", TGSI_OPCODE_CMP },
-   { 1, 1, 0, 0, 0, 0, CHAN, "SCS", TGSI_OPCODE_SCS },
-   { 1, 2, 1, 0, 0, 0, OTHR, "TXB", TGSI_OPCODE_TXB },
-   { 0, 1, 0, 0, 0, 1, NONE, "", 69 },  /* removed */
-   { 1, 2, 0, 0, 0, 0, COMP, "DIV", TGSI_OPCODE_DIV },
-   { 1, 2, 0, 0, 0, 0, REPL, "DP2", TGSI_OPCODE_DP2 },
-   { 1, 2, 1, 0, 0, 0, OTHR, "TXL", TGSI_OPCODE_TXL },
-   { 0, 0, 0, 0, 0, 0, NONE, "BRK", TGSI_OPCODE_BRK },
-   { 0, 1, 0, 1, 0, 1, NONE, "IF", TGSI_OPCODE_IF },
-   { 0, 1, 0, 1, 0, 1, NONE, "UIF", TGSI_OPCODE_UIF },
-   { 0, 1, 0, 0, 0, 1, NONE, "", 76 },  /* removed */
-   { 0, 0, 0, 1, 1, 1, NONE, "ELSE", TGSI_OPCODE_ELSE },
-   { 0, 

[Mesa-dev] [PATCH 1/8] tgsi: add ureg support for image decls

2016-01-02 Thread Ilia Mirkin
Signed-off-by: Ilia Mirkin 
---
 src/gallium/auxiliary/tgsi/tgsi_build.c| 62 +
 src/gallium/auxiliary/tgsi/tgsi_dump.c | 10 +--
 src/gallium/auxiliary/tgsi/tgsi_parse.c|  4 +-
 src/gallium/auxiliary/tgsi/tgsi_parse.h|  2 +-
 src/gallium/auxiliary/tgsi/tgsi_strings.c  |  4 +-
 src/gallium/auxiliary/tgsi/tgsi_text.c | 10 +--
 src/gallium/auxiliary/tgsi/tgsi_ureg.c | 77 ++
 src/gallium/auxiliary/tgsi/tgsi_ureg.h |  7 ++
 src/gallium/drivers/ilo/shader/toy_tgsi.c  |  8 +--
 .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp  | 12 +++-
 src/gallium/drivers/svga/svga_tgsi_vgpu10.c|  2 +
 src/gallium/include/pipe/p_shader_tokens.h |  7 +-
 12 files changed, 153 insertions(+), 52 deletions(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_build.c b/src/gallium/auxiliary/tgsi/tgsi_build.c
index fdb7feb..bb9d0cb 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_build.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_build.c
@@ -259,36 +259,39 @@ tgsi_build_declaration_semantic(
return ds;
 }
 
-static struct tgsi_declaration_resource
-tgsi_default_declaration_resource(void)
+static struct tgsi_declaration_image
+tgsi_default_declaration_image(void)
 {
-   struct tgsi_declaration_resource dr;
+   struct tgsi_declaration_image di;
 
-   dr.Resource = TGSI_TEXTURE_BUFFER;
-   dr.Raw = 0;
-   dr.Writable = 0;
-   dr.Padding = 0;
+   di.Resource = TGSI_TEXTURE_BUFFER;
+   di.Raw = 0;
+   di.Writable = 0;
+   di.Format = 0;
+   di.Padding = 0;
 
-   return dr;
+   return di;
 }
 
-static struct tgsi_declaration_resource
-tgsi_build_declaration_resource(unsigned texture,
-unsigned raw,
-unsigned writable,
-struct tgsi_declaration *declaration,
-struct tgsi_header *header)
+static struct tgsi_declaration_image
+tgsi_build_declaration_image(unsigned texture,
+ unsigned format,
+ unsigned raw,
+ unsigned writable,
+ struct tgsi_declaration *declaration,
+ struct tgsi_header *header)
 {
-   struct tgsi_declaration_resource dr;
+   struct tgsi_declaration_image di;
 
-   dr = tgsi_default_declaration_resource();
-   dr.Resource = texture;
-   dr.Raw = raw;
-   dr.Writable = writable;
+   di = tgsi_default_declaration_image();
+   di.Resource = texture;
+   di.Format = format;
+   di.Raw = raw;
+   di.Writable = writable;
 
declaration_grow(declaration, header);
 
-   return dr;
+   return di;
 }
 
 static struct tgsi_declaration_sampler_view
@@ -364,7 +367,7 @@ tgsi_default_full_declaration( void )
full_declaration.Range = tgsi_default_declaration_range();
full_declaration.Semantic = tgsi_default_declaration_semantic();
full_declaration.Interp = tgsi_default_declaration_interp();
-   full_declaration.Resource = tgsi_default_declaration_resource();
+   full_declaration.Image = tgsi_default_declaration_image();
full_declaration.SamplerView = tgsi_default_declaration_sampler_view();
full_declaration.Array = tgsi_default_declaration_array();
 
@@ -454,20 +457,21 @@ tgsi_build_full_declaration(
  header );
}
 
-   if (full_decl->Declaration.File == TGSI_FILE_RESOURCE) {
-  struct tgsi_declaration_resource *dr;
+   if (full_decl->Declaration.File == TGSI_FILE_IMAGE) {
+  struct tgsi_declaration_image *di;
 
   if (maxsize <= size) {
  return  0;
   }
-  dr = (struct tgsi_declaration_resource *)&tokens[size];
+  di = (struct tgsi_declaration_image *)&tokens[size];
   size++;
 
-  *dr = tgsi_build_declaration_resource(full_decl->Resource.Resource,
-full_decl->Resource.Raw,
-full_decl->Resource.Writable,
-declaration,
-header);
+  *di = tgsi_build_declaration_image(full_decl->Image.Resource,
+ full_decl->Image.Format,
+ full_decl->Image.Raw,
+ full_decl->Image.Writable,
+ declaration,
+ header);
}
 
if (full_decl->Declaration.File == TGSI_FILE_SAMPLER_VIEW) {
diff --git a/src/gallium/auxiliary/tgsi/tgsi_dump.c b/src/gallium/auxiliary/tgsi/tgsi_dump.c
index e29ffb3..dad3839 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_dump.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_dump.c
@@ -348,12 +348,14 @@ iter_declaration(
   }
}
 
-   if (decl->Declaration.File == TGSI_FILE_RESOURCE) {
+   if (decl->Declaration.File == TGSI_FILE_IMAGE) {
   TXT(", ");
-  

[Mesa-dev] [PATCH 3/8] tgsi: provide a way to encode memory qualifiers for SSBO

2016-01-02 Thread Ilia Mirkin
Each load/store on most hardware can specify what caching to do. Since
SSBO allows individual variables to also have separate caching modes,
allow loads/stores to have the qualifiers instead of attempting to
encode them in declarations.
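
As a rough illustration of carrying the qualifiers as a per-instruction
bitmask, the sketch below formats such a mask the way a dump routine
might, appending one name per set bit. It is not taken from the patch:
the QUAL_* values, the qual_names table, and format_qualifiers() are
hypothetical names chosen for this example, and the real flag
definitions are not shown in this message.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical flag values, for illustration only. */
enum {
   QUAL_COHERENT = 1u << 0,
   QUAL_RESTRICT = 1u << 1,
   QUAL_VOLATILE = 1u << 2
};

/* Name for each bit position, lowest bit first (also hypothetical). */
static const char *const qual_names[] = {
   "COHERENT", "RESTRICT", "VOLATILE"
};

/* Append ", NAME" to 'out' for every set bit in 'qualifier', lowest bit
 * first. Stops at the first name that does not fit. Returns the number
 * of characters written, not counting the terminating NUL. */
static size_t
format_qualifiers(unsigned qualifier, char *out, size_t out_size)
{
   size_t used = 0;
   unsigned bit;

   if (out_size)
      out[0] = '\0';

   for (bit = 0; bit < sizeof(qual_names) / sizeof(qual_names[0]); bit++) {
      int n;

      if (!(qualifier & (1u << bit)))
         continue;

      n = snprintf(out + used, out_size - used, ", %s", qual_names[bit]);
      if (n < 0 || (size_t)n >= out_size - used) {
         /* Did not fit: drop the partial name and stop. */
         if (used < out_size)
            out[used] = '\0';
         break;
      }
      used += (size_t)n;
   }
   return used;
}
```

Because the mask travels with each load/store, two accesses to the same
buffer can carry different qualifiers, which a declaration-level
encoding could not express.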

Signed-off-by: Ilia Mirkin 
---
 src/gallium/auxiliary/tgsi/tgsi_build.c| 50 +++-
 src/gallium/auxiliary/tgsi/tgsi_dump.c | 10 ++
 src/gallium/auxiliary/tgsi/tgsi_parse.c|  4 +++
 src/gallium/auxiliary/tgsi/tgsi_parse.h|  1 +
 src/gallium/auxiliary/tgsi/tgsi_strings.c  |  7 
 src/gallium/auxiliary/tgsi/tgsi_strings.h  |  2 ++
 src/gallium/auxiliary/tgsi/tgsi_text.c | 27 +++
 src/gallium/auxiliary/tgsi/tgsi_ureg.c | 53 ++
 src/gallium/auxiliary/tgsi/tgsi_ureg.h | 13 
 src/gallium/include/pipe/p_shader_tokens.h | 16 -
 10 files changed, 181 insertions(+), 2 deletions(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_build.c b/src/gallium/auxiliary/tgsi/tgsi_build.c
index bb9d0cb..ea20746 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_build.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_build.c
@@ -620,7 +620,8 @@ tgsi_default_instruction( void )
instruction.NumSrcRegs = 1;
instruction.Label = 0;
instruction.Texture = 0;
-   instruction.Padding  = 0;
+   instruction.Memory = 0;
+   instruction.Padding = 0;
 
return instruction;
 }
@@ -766,6 +767,34 @@ tgsi_build_instruction_texture(
return instruction_texture;
 }
 
+static struct tgsi_instruction_memory
+tgsi_default_instruction_memory( void )
+{
+   struct tgsi_instruction_memory instruction_memory;
+
+   instruction_memory.Qualifier = 0;
+   instruction_memory.Padding = 0;
+
+   return instruction_memory;
+}
+
+static struct tgsi_instruction_memory
+tgsi_build_instruction_memory(
+   unsigned qualifier,
+   struct tgsi_token *prev_token,
+   struct tgsi_instruction *instruction,
+   struct tgsi_header *header )
+{
+   struct tgsi_instruction_memory instruction_memory;
+
+   instruction_memory.Qualifier = qualifier;
+   instruction_memory.Padding = 0;
+   instruction->Memory = 1;
+
+   instruction_grow( instruction, header );
+
+   return instruction_memory;
+}
 
 static struct tgsi_texture_offset
 tgsi_default_texture_offset( void )
@@ -1012,6 +1041,7 @@ tgsi_default_full_instruction( void )
full_instruction.Predicate = tgsi_default_instruction_predicate();
full_instruction.Label = tgsi_default_instruction_label();
full_instruction.Texture = tgsi_default_instruction_texture();
+   full_instruction.Memory = tgsi_default_instruction_memory();
for( i = 0;  i < TGSI_FULL_MAX_TEX_OFFSETS; i++ ) {
   full_instruction.TexOffsets[i] = tgsi_default_texture_offset();
}
@@ -1123,6 +1153,24 @@ tgsi_build_full_instruction(
  prev_token = (struct tgsi_token *) texture_offset;
   }
}
+
+   if (full_inst->Instruction.Memory) {
+  struct tgsi_instruction_memory *instruction_memory;
+
+  if( maxsize <= size )
+ return 0;
+  instruction_memory =
+ (struct  tgsi_instruction_memory *) &tokens[size];
+  size++;
+
+  *instruction_memory = tgsi_build_instruction_memory(
+ full_inst->Memory.Qualifier,
+ prev_token,
+ instruction,
+ header );
+  prev_token = (struct tgsi_token  *) instruction_memory;
+   }
+
for( i = 0;  i <   full_inst->Instruction.NumDstRegs; i++ ) {
   const struct tgsi_full_dst_register *reg = &full_inst->Dst[i];
   struct tgsi_dst_register *dst_register;
diff --git a/src/gallium/auxiliary/tgsi/tgsi_dump.c b/src/gallium/auxiliary/tgsi/tgsi_dump.c
index de3aae5..2ad29b9 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_dump.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_dump.c
@@ -624,6 +624,16 @@ iter_instruction(
   }
}
 
+   if (inst->Instruction.Memory) {
+  uint32_t qualifier = inst->Memory.Qualifier;
+  while (qualifier) {
+ int bit = ffs(qualifier) - 1;
+ qualifier &= ~(1U << bit);
+ TXT(", ");
+ ENM(bit, tgsi_memory_names);
+  }
+   }
+
switch (inst->Instruction.Opcode) {
case TGSI_OPCODE_IF:
case TGSI_OPCODE_UIF:
diff --git a/src/gallium/auxiliary/tgsi/tgsi_parse.c b/src/gallium/auxiliary/tgsi/tgsi_parse.c
index 9a52bbb..ae95ebd 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_parse.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_parse.c
@@ -195,6 +195,10 @@ tgsi_parse_token(
  }
   }
 
+  if (inst->Instruction.Memory) {
+ next_token(ctx, &inst->Memory);
+  }
+
   assert( inst->Instruction.NumDstRegs <= TGSI_FULL_MAX_DST_REGISTERS );
 
   for (i = 0; i < inst->Instruction.NumDstRegs; i++) {
diff --git a/src/gallium/auxiliary/tgsi/tgsi_parse.h b/src/gallium/auxiliary/tgsi/tgsi_parse.h
index 5ed1a83..4689fb7 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_parse.h
+++ b/src/gallium/auxiliary/tgsi/tgsi_parse.h
@@ -91,6 +91,7 @@ struct tgsi_full_instruction
struct tgsi_instruction_predicate   

[Mesa-dev] [PATCH 7/8] gallium: add PIPE_CAP_SHADER_BUFFER_OFFSET_ALIGNMENT

2016-01-02 Thread Ilia Mirkin
Signed-off-by: Ilia Mirkin 
---
 src/gallium/docs/source/screen.rst   | 4 
 src/gallium/drivers/freedreno/freedreno_screen.c | 1 +
 src/gallium/drivers/i915/i915_screen.c   | 1 +
 src/gallium/drivers/ilo/ilo_screen.c | 1 +
 src/gallium/drivers/llvmpipe/lp_screen.c | 1 +
 src/gallium/drivers/nouveau/nv30/nv30_screen.c   | 1 +
 src/gallium/drivers/nouveau/nv50/nv50_screen.c   | 1 +
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.c   | 1 +
 src/gallium/drivers/r300/r300_screen.c   | 1 +
 src/gallium/drivers/r600/r600_pipe.c | 1 +
 src/gallium/drivers/radeonsi/si_pipe.c   | 1 +
 src/gallium/drivers/softpipe/sp_screen.c | 1 +
 src/gallium/drivers/svga/svga_screen.c   | 1 +
 src/gallium/drivers/vc4/vc4_screen.c | 1 +
 src/gallium/drivers/virgl/virgl_screen.c | 1 +
 src/gallium/include/pipe/p_defines.h | 1 +
 16 files changed, 19 insertions(+)

diff --git a/src/gallium/docs/source/screen.rst b/src/gallium/docs/source/screen.rst
index 4402809..cea6fc0 100644
--- a/src/gallium/docs/source/screen.rst
+++ b/src/gallium/docs/source/screen.rst
@@ -285,6 +285,10 @@ The integer capabilities:
 * ``PIPE_CAP_DRAW_PARAMETERS``: Whether ``TGSI_SEMANTIC_BASEVERTEX``,
   ``TGSI_SEMANTIC_BASEINSTANCE``, and ``TGSI_SEMANTIC_DRAWID`` are
   supported in vertex shaders.
+* ``PIPE_CAP_SHADER_BUFFER_OFFSET_ALIGNMENT``: Describes the required
+  alignment for pipe_shader_buffer::buffer_offset, in bytes. Maximum
+  value allowed is 256 (for GL conformance). 0 is only allowed if
+  shader buffers are not supported.
 
 
 .. _pipe_capf:
diff --git a/src/gallium/drivers/freedreno/freedreno_screen.c b/src/gallium/drivers/freedreno/freedreno_screen.c
index bf356c4..44db5e8 100644
--- a/src/gallium/drivers/freedreno/freedreno_screen.c
+++ b/src/gallium/drivers/freedreno/freedreno_screen.c
@@ -239,6 +239,7 @@ fd_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
case PIPE_CAP_COPY_BETWEEN_COMPRESSED_AND_PLAIN_FORMATS:
case PIPE_CAP_CLEAR_TEXTURE:
case PIPE_CAP_DRAW_PARAMETERS:
+   case PIPE_CAP_SHADER_BUFFER_OFFSET_ALIGNMENT:
return 0;
 
case PIPE_CAP_MAX_VIEWPORTS:
diff --git a/src/gallium/drivers/i915/i915_screen.c b/src/gallium/drivers/i915/i915_screen.c
index 14bd8d7..22d926c 100644
--- a/src/gallium/drivers/i915/i915_screen.c
+++ b/src/gallium/drivers/i915/i915_screen.c
@@ -255,6 +255,7 @@ i915_get_param(struct pipe_screen *screen, enum pipe_cap cap)
case PIPE_CAP_COPY_BETWEEN_COMPRESSED_AND_PLAIN_FORMATS:
case PIPE_CAP_CLEAR_TEXTURE:
case PIPE_CAP_DRAW_PARAMETERS:
+   case PIPE_CAP_SHADER_BUFFER_OFFSET_ALIGNMENT:
   return 0;
 
case PIPE_CAP_MAX_DUAL_SOURCE_RENDER_TARGETS:
diff --git a/src/gallium/drivers/ilo/ilo_screen.c b/src/gallium/drivers/ilo/ilo_screen.c
index ac29b56..02b6851 100644
--- a/src/gallium/drivers/ilo/ilo_screen.c
+++ b/src/gallium/drivers/ilo/ilo_screen.c
@@ -477,6 +477,7 @@ ilo_get_param(struct pipe_screen *screen, enum pipe_cap param)
case PIPE_CAP_COPY_BETWEEN_COMPRESSED_AND_PLAIN_FORMATS:
case PIPE_CAP_CLEAR_TEXTURE:
case PIPE_CAP_DRAW_PARAMETERS:
+   case PIPE_CAP_SHADER_BUFFER_OFFSET_ALIGNMENT:
   return 0;
 
case PIPE_CAP_VENDOR_ID:
diff --git a/src/gallium/drivers/llvmpipe/lp_screen.c b/src/gallium/drivers/llvmpipe/lp_screen.c
index 5352963..6f0041a 100644
--- a/src/gallium/drivers/llvmpipe/lp_screen.c
+++ b/src/gallium/drivers/llvmpipe/lp_screen.c
@@ -302,6 +302,7 @@ llvmpipe_get_param(struct pipe_screen *screen, enum pipe_cap param)
case PIPE_CAP_COPY_BETWEEN_COMPRESSED_AND_PLAIN_FORMATS:
case PIPE_CAP_CLEAR_TEXTURE:
case PIPE_CAP_DRAW_PARAMETERS:
+   case PIPE_CAP_SHADER_BUFFER_OFFSET_ALIGNMENT:
   return 0;
}
/* should only get here on unhandled cases */
diff --git a/src/gallium/drivers/nouveau/nv30/nv30_screen.c b/src/gallium/drivers/nouveau/nv30/nv30_screen.c
index 3d77f81..818ee17 100644
--- a/src/gallium/drivers/nouveau/nv30/nv30_screen.c
+++ b/src/gallium/drivers/nouveau/nv30/nv30_screen.c
@@ -175,6 +175,7 @@ nv30_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
case PIPE_CAP_COPY_BETWEEN_COMPRESSED_AND_PLAIN_FORMATS:
case PIPE_CAP_CLEAR_TEXTURE:
case PIPE_CAP_DRAW_PARAMETERS:
+   case PIPE_CAP_SHADER_BUFFER_OFFSET_ALIGNMENT:
   return 0;
 
case PIPE_CAP_VENDOR_ID:
diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.c b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
index aafca71..a1dcfda 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_screen.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
@@ -218,6 +218,7 @@ nv50_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
case PIPE_CAP_DEVICE_RESET_STATUS_QUERY:
case PIPE_CAP_MAX_SHADER_PATCH_VARYINGS:
case PIPE_CAP_DRAW_PARAMETERS:
+   case PIPE_CAP_SHADER_BUFFER_OFFSET_ALIGNMENT:
   return 0;
 

[Mesa-dev] [PATCH 2/8] ureg: add buffer support to ureg

2016-01-02 Thread Ilia Mirkin
Signed-off-by: Ilia Mirkin 
---
 src/gallium/auxiliary/tgsi/tgsi_dump.c |  5 +++
 src/gallium/auxiliary/tgsi/tgsi_strings.c  |  1 +
 src/gallium/auxiliary/tgsi/tgsi_text.c |  5 +++
 src/gallium/auxiliary/tgsi/tgsi_ureg.c | 52 ++
 src/gallium/auxiliary/tgsi/tgsi_ureg.h |  3 ++
 src/gallium/include/pipe/p_shader_tokens.h |  4 ++-
 6 files changed, 69 insertions(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_dump.c b/src/gallium/auxiliary/tgsi/tgsi_dump.c
index dad3839..de3aae5 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_dump.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_dump.c
@@ -359,6 +359,11 @@ iter_declaration(
  TXT(", RAW");
}
 
+   if (decl->Declaration.File == TGSI_FILE_BUFFER) {
+  if (decl->Declaration.Atomic)
+ TXT(", ATOMIC");
+   }
+
if (decl->Declaration.File == TGSI_FILE_SAMPLER_VIEW) {
   TXT(", ");
   ENM(decl->SamplerView.Resource, tgsi_texture_names);
diff --git a/src/gallium/auxiliary/tgsi/tgsi_strings.c b/src/gallium/auxiliary/tgsi/tgsi_strings.c
index ae30399..c0dd044 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_strings.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_strings.c
@@ -56,6 +56,7 @@ static const char *tgsi_file_names[] =
"SV",
"IMAGE",
"SVIEW",
+   "BUFFER",
 };
 
 const char *tgsi_semantic_names[TGSI_SEMANTIC_COUNT] =
diff --git a/src/gallium/auxiliary/tgsi/tgsi_text.c b/src/gallium/auxiliary/tgsi/tgsi_text.c
index a45ab90..d72d843 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_text.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_text.c
@@ -1350,6 +1350,11 @@ static boolean parse_declaration( struct translate_ctx *ctx )
decl.SamplerView.ReturnTypeX;
  }
  ctx->cur = cur;
+  } else if (file == TGSI_FILE_BUFFER) {
+ if (str_match_nocase_whole(&cur, "ATOMIC")) {
+decl.Declaration.Atomic = 1;
+ctx->cur = cur;
+ }
   } else {
  if (str_match_nocase_whole(&cur, "LOCAL")) {
 decl.Declaration.Local = 1;
diff --git a/src/gallium/auxiliary/tgsi/tgsi_ureg.c 
b/src/gallium/auxiliary/tgsi/tgsi_ureg.c
index ee23df9..6d5092b 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_ureg.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_ureg.c
@@ -165,6 +165,12 @@ struct ureg_program
} image[PIPE_MAX_SHADER_IMAGES];
unsigned nr_images;
 
+   struct {
+  unsigned index;
+  bool atomic;
+   } buffer[PIPE_MAX_SHADER_BUFFERS];
+   unsigned nr_buffers;
+
struct util_bitmask *free_temps;
struct util_bitmask *local_temps;
struct util_bitmask *decl_temps;
@@ -689,6 +695,29 @@ ureg_DECL_image(struct ureg_program *ureg,
return reg;
 }
 
+/* Allocate a new buffer.
+ */
+struct ureg_src ureg_DECL_buffer(struct ureg_program *ureg, unsigned nr,
+ bool atomic)
+{
+   struct ureg_src reg = ureg_src_register(TGSI_FILE_BUFFER, nr);
+   unsigned i;
+
+   for (i = 0; i < ureg->nr_buffers; i++)
+  if (ureg->buffer[i].index == nr)
+ return reg;
+
+   if (i < PIPE_MAX_SHADER_BUFFERS) {
+  ureg->buffer[i].index = nr;
+  ureg->buffer[i].atomic = atomic;
+  ureg->nr_buffers++;
+  return reg;
+   }
+
+   assert(0);
+   return reg;
+}
+
 static int
 match_or_expand_immediate64( const unsigned *v,
  int type,
@@ -1546,6 +1575,25 @@ emit_decl_image(struct ureg_program *ureg,
 }
 
 static void
+emit_decl_buffer(struct ureg_program *ureg,
+ unsigned index,
+ bool atomic)
+{
+   union tgsi_any_token *out = get_tokens(ureg, DOMAIN_DECL, 2);
+
+   out[0].value = 0;
+   out[0].decl.Type = TGSI_TOKEN_TYPE_DECLARATION;
+   out[0].decl.NrTokens = 2;
+   out[0].decl.File = TGSI_FILE_BUFFER;
+   out[0].decl.UsageMask = 0xf;
+   out[0].decl.Atomic = atomic;
+
+   out[1].value = 0;
+   out[1].decl_range.First = index;
+   out[1].decl_range.Last = index;
+}
+
+static void
 emit_immediate( struct ureg_program *ureg,
 const unsigned *v,
 unsigned type )
@@ -1713,6 +1761,10 @@ static void emit_decls( struct ureg_program *ureg )
   ureg->image[i].raw);
}
 
+   for (i = 0; i < ureg->nr_buffers; i++) {
+  emit_decl_buffer(ureg, ureg->buffer[i].index, ureg->buffer[i].atomic);
+   }
+
if (ureg->const_decls.nr_constant_ranges) {
   for (i = 0; i < ureg->const_decls.nr_constant_ranges; i++) {
  emit_decl_range(ureg,
diff --git a/src/gallium/auxiliary/tgsi/tgsi_ureg.h 
b/src/gallium/auxiliary/tgsi/tgsi_ureg.h
index bba2afb..e25c961 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_ureg.h
+++ b/src/gallium/auxiliary/tgsi/tgsi_ureg.h
@@ -335,6 +335,9 @@ ureg_DECL_image(struct ureg_program *ureg,
 boolean wr,
 boolean raw);
 
+struct ureg_src
+ureg_DECL_buffer(struct ureg_program *ureg, unsigned nr, bool atomic);
+
 static inline struct ureg_src
 ureg_imm4f( struct ureg_program *ureg,
float a, 
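
The bookkeeping in ureg_DECL_buffer() above amounts to "declare each buffer index at most once". A minimal standalone sketch of that pattern follows. The names buffer_table, declare_buffer and MAX_BUFFERS are hypothetical stand-ins rather than the Mesa types, and the sketch returns -1 when the table is full, where the patch asserts instead.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical limit; stands in for PIPE_MAX_SHADER_BUFFERS. */
#define MAX_BUFFERS 4

/* Hypothetical miniature of the buffer[] / nr_buffers fields added to
 * struct ureg_program by the patch. */
struct buffer_table {
   struct {
      unsigned index;
      bool atomic;
   } buffer[MAX_BUFFERS];
   unsigned nr_buffers;
};

/* Return the slot that holds buffer index 'nr', recording it on first use.
 * Returns -1 when the table is full (the patch hits assert(0) here). */
static int
declare_buffer(struct buffer_table *t, unsigned nr, bool atomic)
{
   unsigned i;

   /* An index that was already declared is reused as-is. */
   for (i = 0; i < t->nr_buffers; i++)
      if (t->buffer[i].index == nr)
         return (int)i;

   /* Otherwise append a new entry if there is room. */
   if (i < MAX_BUFFERS) {
      t->buffer[i].index = nr;
      t->buffer[i].atomic = atomic;
      t->nr_buffers++;
      return (int)i;
   }

   return -1;
}
```

As in the patch, declaring the same index a second time keeps the atomic flag from the first declaration.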

[Mesa-dev] [PATCH 8/8] gallium: add a RESQ opcode to query info about a resource

2016-01-02 Thread Ilia Mirkin
Signed-off-by: Ilia Mirkin 
---
 src/gallium/auxiliary/tgsi/tgsi_info.c |  2 +-
 src/gallium/docs/source/tgsi.rst   | 12 
 src/gallium/include/pipe/p_shader_tokens.h |  1 +
 3 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_info.c 
b/src/gallium/auxiliary/tgsi/tgsi_info.c
index 8a0e9c4..bdd4688 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_info.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_info.c
@@ -142,7 +142,7 @@ static const struct tgsi_opcode_info 
opcode_info[TGSI_OPCODE_LAST] =
{ 0, 0, 0, 0, 0, 1, 0, NONE, "ENDSUB", TGSI_OPCODE_ENDSUB },
{ 1, 1, 1, 0, 0, 0, 0, OTHR, "TXQ_LZ", TGSI_OPCODE_TXQ_LZ },
{ 1, 1, 1, 0, 0, 0, 0, OTHR, "TXQS", TGSI_OPCODE_TXQS },
-   { 0, 0, 0, 0, 0, 0, 0, NONE, "", 105 }, /* removed */
+   { 1, 1, 0, 0, 0, 0, 0, NONE, "RESQ", TGSI_OPCODE_RESQ },
{ 0, 0, 0, 0, 0, 0, 0, NONE, "", 106 }, /* removed */
{ 0, 0, 0, 0, 0, 0, 0, NONE, "NOP", TGSI_OPCODE_NOP },
{ 1, 2, 0, 0, 0, 0, 0, COMP, "FSEQ", TGSI_OPCODE_FSEQ },
diff --git a/src/gallium/docs/source/tgsi.rst b/src/gallium/docs/source/tgsi.rst
index a3151e3..f4b8c78 100644
--- a/src/gallium/docs/source/tgsi.rst
+++ b/src/gallium/docs/source/tgsi.rst
@@ -2299,6 +2299,18 @@ Resource Access Opcodes
texture arrays and 2D textures.  address.w is always
ignored.
 
+.. opcode:: RESQ - Query information about a resource
+
+  Syntax: ``RESQ dst, resource``
+
+  Example: ``RESQ TEMP[0], BUFFER[0]``
+
+  Returns information about the buffer or image resource. For buffer
+  resources, the size (in bytes) is returned in the x component. For
+  image resources, .xyz will contain the width/height/layers of the
+  image, while .w will contain the number of samples for multi-sampled
+  images.
+
 
 .. _threadsyncopcodes:
 
diff --git a/src/gallium/include/pipe/p_shader_tokens.h 
b/src/gallium/include/pipe/p_shader_tokens.h
index 43a5561..f300207 100644
--- a/src/gallium/include/pipe/p_shader_tokens.h
+++ b/src/gallium/include/pipe/p_shader_tokens.h
@@ -411,6 +411,7 @@ struct tgsi_property_data {
 #define TGSI_OPCODE_ENDSUB  102
 #define TGSI_OPCODE_TXQ_LZ  103 /* TXQ for mipmap level 0 */
 #define TGSI_OPCODE_TXQS104
+#define TGSI_OPCODE_RESQ105
 /* gap */
 #define TGSI_OPCODE_NOP 107
 
-- 
2.4.10
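
To make the documented RESQ result layout concrete, here is a small C model of it. This is only a sketch of the tgsi.rst text in this patch, not driver code: the resource and uvec4 types and the resq() helper are hypothetical, and zeroing the components that the documentation does not mention is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical resource description, for illustration only. */
enum resource_kind { RES_BUFFER, RES_IMAGE };

struct resource {
   enum resource_kind kind;
   uint32_t size_bytes;                       /* buffers */
   uint32_t width, height, layers, samples;   /* images */
};

struct uvec4 { uint32_t x, y, z, w; };

/* Model of "RESQ dst, resource" as described in the documentation hunk. */
static struct uvec4
resq(const struct resource *res)
{
   /* Assumption: components the docs leave unspecified are set to 0. */
   struct uvec4 dst = { 0, 0, 0, 0 };

   if (res->kind == RES_BUFFER) {
      dst.x = res->size_bytes;   /* size in bytes in .x */
   } else {
      dst.x = res->width;        /* .xyz = width/height/layers */
      dst.y = res->height;
      dst.z = res->layers;
      dst.w = res->samples;      /* .w = sample count */
   }
   return dst;
}
```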

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/6] i965: Add state bit to trigger re-emission of color calculator state.

2016-01-02 Thread Francisco Jerez
This will be used on Gen8+ to make sure that the color calculator
state pointers are re-emitted when switching back to the 3D pipeline
after a GPGPU workload, as required by a hardware workaround.  Other
existing state bits could achieve the same effect, but they all cause
a ton of unrelated state to be re-emitted (e.g.
BRW_NEW_STATE_BASE_ADDRESS), so just define a new one: state bits are
cheap.
---
 src/mesa/drivers/dri/i965/brw_context.h  | 2 ++
 src/mesa/drivers/dri/i965/brw_state_upload.c | 1 +
 src/mesa/drivers/dri/i965/gen6_cc.c  | 1 +
 3 files changed, 4 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index 7b0340f..b80db00 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -221,6 +221,7 @@ enum brw_state_id {
BRW_STATE_COMPUTE_PROGRAM,
BRW_STATE_CS_WORK_GROUPS,
BRW_STATE_URB_SIZE,
+   BRW_STATE_CC_STATE,
BRW_NUM_STATE_BITS
 };
 
@@ -309,6 +310,7 @@ enum brw_state_id {
 #define BRW_NEW_COMPUTE_PROGRAM (1ull << BRW_STATE_COMPUTE_PROGRAM)
 #define BRW_NEW_CS_WORK_GROUPS  (1ull << BRW_STATE_CS_WORK_GROUPS)
 #define BRW_NEW_URB_SIZE(1ull << BRW_STATE_URB_SIZE)
+#define BRW_NEW_CC_STATE(1ull << BRW_STATE_CC_STATE)
 
 struct brw_state_flags {
/** State update flags signalled by mesa internals */
diff --git a/src/mesa/drivers/dri/i965/brw_state_upload.c 
b/src/mesa/drivers/dri/i965/brw_state_upload.c
index 2a671a58d..876e130 100644
--- a/src/mesa/drivers/dri/i965/brw_state_upload.c
+++ b/src/mesa/drivers/dri/i965/brw_state_upload.c
@@ -664,6 +664,7 @@ static struct dirty_bit_map brw_bits[] = {
DEFINE_BIT(BRW_NEW_COMPUTE_PROGRAM),
DEFINE_BIT(BRW_NEW_CS_WORK_GROUPS),
DEFINE_BIT(BRW_NEW_URB_SIZE),
+   DEFINE_BIT(BRW_NEW_CC_STATE),
{0, 0, 0}
 };
 
diff --git a/src/mesa/drivers/dri/i965/gen6_cc.c 
b/src/mesa/drivers/dri/i965/gen6_cc.c
index 3bab8f4..cee139b 100644
--- a/src/mesa/drivers/dri/i965/gen6_cc.c
+++ b/src/mesa/drivers/dri/i965/gen6_cc.c
@@ -298,6 +298,7 @@ const struct brw_tracked_state gen6_color_calc_state = {
   .mesa = _NEW_COLOR |
   _NEW_STENCIL,
   .brw = BRW_NEW_BATCH |
+ BRW_NEW_CC_STATE |
  BRW_NEW_STATE_BASE_ADDRESS,
},
.emit = gen6_upload_color_calc_state,
-- 
2.6.4
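
For readers unfamiliar with the dirty-bit scheme this patch extends, the following standalone sketch shows the idea: each state ID maps to one bit of a 64-bit mask, and an atom is re-emitted when any bit it listens to is set. All names here (state_id, NEW_CC_STATE, tracked_state, upload_if_dirty) are hypothetical simplifications, not the i965 definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical miniature of the brw_state_id enum. */
enum state_id {
   STATE_URB_SIZE,
   STATE_CC_STATE,
   NUM_STATE_BITS
};

/* One bit per state ID, in the style of the BRW_NEW_* defines. */
#define NEW_URB_SIZE (1ull << STATE_URB_SIZE)
#define NEW_CC_STATE (1ull << STATE_CC_STATE)

/* Hypothetical stand-in for a tracked state atom. */
struct tracked_state {
   uint64_t dirty;        /* bits that should trigger re-emission */
   unsigned emit_count;   /* how many times the atom was emitted */
};

/* Re-emit the atom if any bit it listens to is set in new_state.
 * Returns 1 if the atom was emitted, 0 otherwise. */
static int
upload_if_dirty(struct tracked_state *atom, uint64_t new_state)
{
   if ((atom->dirty & new_state) == 0)
      return 0;

   atom->emit_count++;
   return 1;
}
```

Setting only the narrow bit re-emits just the atoms that list it, which is the reason the commit message gives for preferring a new bit over BRW_NEW_STATE_BASE_ADDRESS.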



[Mesa-dev] [PATCH 4/6] i965/gen4-5: Emit MI_FLUSH as required prior to switching pipelines.

2016-01-02 Thread Francisco Jerez
AFAIK brw_emit_select_pipeline() is only called once during context
init on Gen4-5, at which point the pipeline is likely to be already
idle so it may just happen to work by luck regardless of the MI_FLUSH.
---
 src/mesa/drivers/dri/i965/brw_misc_state.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_misc_state.c 
b/src/mesa/drivers/dri/i965/brw_misc_state.c
index 75540c1..e5af1da 100644
--- a/src/mesa/drivers/dri/i965/brw_misc_state.c
+++ b/src/mesa/drivers/dri/i965/brw_misc_state.c
@@ -914,6 +914,19 @@ brw_emit_select_pipeline(struct brw_context *brw, enum 
brw_pipeline pipeline)
   PIPE_CONTROL_STATE_CACHE_INVALIDATE |
   PIPE_CONTROL_INSTRUCTION_INVALIDATE |
   PIPE_CONTROL_NO_WRITE);
+
+   } else {
+  /* From "BXML » GT » MI » vol1a GPU Overview » [Instruction]
+   * PIPELINE_SELECT [DevBWR+]":
+   *
+   *   Project: PRE-DEVSNB
+   *
+   *   Software must ensure the current pipeline is flushed via an
+   *   MI_FLUSH or PIPE_CONTROL prior to the execution of PIPELINE_SELECT.
+   */
+  BEGIN_BATCH(1);
+  OUT_BATCH(MI_FLUSH);
+  ADVANCE_BATCH();
}
 
/* Select the pipeline */
-- 
2.6.4



[Mesa-dev] [PATCH 3/6] i965/gen6-7: Implement stall and flushes required prior to switching pipelines.

2016-01-02 Thread Francisco Jerez
Switching the current pipeline while it's not completely idle or the
read and write caches aren't flushed can lead to corruption.  Fixes
misrendering of at least the following Khronos CTS test:

 ES31-CTS.shader_image_load_store.basic-allTargets-store-fs

The stall and flushes are no longer required on Gen8+.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93323
---
 src/mesa/drivers/dri/i965/brw_misc_state.c | 28 
 1 file changed, 28 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_misc_state.c 
b/src/mesa/drivers/dri/i965/brw_misc_state.c
index 7d53d18..75540c1 100644
--- a/src/mesa/drivers/dri/i965/brw_misc_state.c
+++ b/src/mesa/drivers/dri/i965/brw_misc_state.c
@@ -886,6 +886,34 @@ brw_emit_select_pipeline(struct brw_context *brw, enum 
brw_pipeline pipeline)
 
  brw->ctx.NewDriverState |= BRW_NEW_CC_STATE;
   }
+
+   } else if (brw->gen >= 6) {
+  /* From "BXML » GT » MI » vol1a GPU Overview » [Instruction]
+   * PIPELINE_SELECT [DevBWR+]":
+   *
+   *   Project: DEVSNB+
+   *
+   *   Software must ensure all the write caches are flushed through a
+   *   stalling PIPE_CONTROL command followed by another PIPE_CONTROL
+   *   command to invalidate read only caches prior to programming
+   *   MI_PIPELINE_SELECT command to change the Pipeline Select Mode.
+   */
+  const unsigned dc_flush =
+ brw->gen >= 7 ? PIPE_CONTROL_DATA_CACHE_INVALIDATE : 0;
+
+  brw_emit_pipe_control_flush(brw,
+  PIPE_CONTROL_RENDER_TARGET_FLUSH |
+  PIPE_CONTROL_DEPTH_CACHE_FLUSH |
+  dc_flush |
+  PIPE_CONTROL_NO_WRITE |
+  PIPE_CONTROL_CS_STALL);
+
+  brw_emit_pipe_control_flush(brw,
+  PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE |
+  PIPE_CONTROL_CONST_CACHE_INVALIDATE |
+  PIPE_CONTROL_STATE_CACHE_INVALIDATE |
+  PIPE_CONTROL_INSTRUCTION_INVALIDATE |
+  PIPE_CONTROL_NO_WRITE);
}
 
/* Select the pipeline */
-- 
2.6.4



[Mesa-dev] [PATCH 5/6] i965/gen7: Emit stall and dummy primitive draw after switching to the 3D pipeline.

2016-01-02 Thread Francisco Jerez
This hardware bug can supposedly lead to a hang on IVB and VLV.
---
 src/mesa/drivers/dri/i965/brw_misc_state.c | 24 
 1 file changed, 24 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_misc_state.c 
b/src/mesa/drivers/dri/i965/brw_misc_state.c
index e5af1da..2263604 100644
--- a/src/mesa/drivers/dri/i965/brw_misc_state.c
+++ b/src/mesa/drivers/dri/i965/brw_misc_state.c
@@ -935,6 +935,30 @@ brw_emit_select_pipeline(struct brw_context *brw, enum 
brw_pipeline pipeline)
  (brw->gen >= 9 ? (3 << 8) : 0) |
  (pipeline == BRW_COMPUTE_PIPELINE ? 2 : 0));
ADVANCE_BATCH();
+
+   if (brw->gen == 7 && !brw->is_haswell &&
+   pipeline == BRW_RENDER_PIPELINE) {
+  /* From "BXML » GT » MI » vol1a GPU Overview » [Instruction]
+   * PIPELINE_SELECT [DevBWR+]":
+   *
+   *   Project: DEVIVB, DEVHSW:GT3:A0
+   *
+   *   Software must send a pipe_control with a CS stall and a post sync
+   *   operation and then a dummy DRAW after every MI_SET_CONTEXT and
+   *   after any PIPELINE_SELECT that is enabling 3D mode.
+   */
+  gen7_emit_cs_stall_flush(brw);
+
+  BEGIN_BATCH(7);
+  OUT_BATCH(CMD_3D_PRIM << 16 | (7 - 2));
+  OUT_BATCH(_3DPRIM_POINTLIST);
+  OUT_BATCH(0);
+  OUT_BATCH(0);
+  OUT_BATCH(0);
+  OUT_BATCH(0);
+  OUT_BATCH(0);
+  ADVANCE_BATCH();
+   }
 }
 
 /**
-- 
2.6.4



[Mesa-dev] [PATCH 2/6] i965/gen8+: Invalidate color calc state when switching to the GPGPU pipeline.

2016-01-02 Thread Francisco Jerez
This hardware bug can cause a hang on context restore while the
current pipeline is set to GPGPU (BDWGFX HSD 1909593).  In addition to
clearing the valid bit, mark the CC state as dirty to make sure that
the CC indirect state pointer is re-emitted when we switch back to the
3D pipeline.
---
 src/mesa/drivers/dri/i965/brw_misc_state.c | 20 
 1 file changed, 20 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_misc_state.c 
b/src/mesa/drivers/dri/i965/brw_misc_state.c
index cf6ba5b..7d53d18 100644
--- a/src/mesa/drivers/dri/i965/brw_misc_state.c
+++ b/src/mesa/drivers/dri/i965/brw_misc_state.c
@@ -868,6 +868,26 @@ brw_emit_select_pipeline(struct brw_context *brw, enum 
brw_pipeline pipeline)
const uint32_t _3DSTATE_PIPELINE_SELECT =
   is_965 ? CMD_PIPELINE_SELECT_965 : CMD_PIPELINE_SELECT_GM45;
 
+   if (brw->gen >= 8) {
+  /* From "BXML » GT » MI » vol1a GPU Overview » [Instruction]
+   * PIPELINE_SELECT [DevBWR+]":
+   *
+   *   Project: BDW, SKL
+   *
+   *   Software must clear the COLOR_CALC_STATE Valid field in
+   *   3DSTATE_CC_STATE_POINTERS command prior to send a PIPELINE_SELECT
+   *   with Pipeline Select set to GPGPU.
+   */
+  if (pipeline == BRW_COMPUTE_PIPELINE) {
+ BEGIN_BATCH(2);
+ OUT_BATCH(_3DSTATE_CC_STATE_POINTERS << 16 | (2 - 2));
+ OUT_BATCH(0);
+ ADVANCE_BATCH();
+
+ brw->ctx.NewDriverState |= BRW_NEW_CC_STATE;
+  }
+   }
+
/* Select the pipeline */
BEGIN_BATCH(1);
OUT_BATCH(_3DSTATE_PIPELINE_SELECT << 16 |
-- 
2.6.4



[Mesa-dev] [PATCH 6/6] i965/gen7.5+: Disable resource streamer during GPGPU workloads.

2016-01-02 Thread Francisco Jerez
The RS and hardware binding tables are only supported on the 3D
pipeline and can lead to corruption if left enabled during a GPGPU
workload.  Disable both when switching to the GPGPU (or media) pipeline
and re-enable them when switching back to the 3D pipeline.
---
 src/mesa/drivers/dri/i965/brw_binding_tables.c |  2 +-
 src/mesa/drivers/dri/i965/brw_misc_state.c | 38 ++
 src/mesa/drivers/dri/i965/brw_state.h  |  1 +
 3 files changed, 40 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_binding_tables.c 
b/src/mesa/drivers/dri/i965/brw_binding_tables.c
index 80935cf..5c5aa0e 100644
--- a/src/mesa/drivers/dri/i965/brw_binding_tables.c
+++ b/src/mesa/drivers/dri/i965/brw_binding_tables.c
@@ -364,7 +364,7 @@ gen7_disable_hw_binding_tables(struct brw_context *brw)
 /**
  * Enable hardware binding tables and set up the binding table pool.
  */
-static void
+void
 gen7_enable_hw_binding_tables(struct brw_context *brw)
 {
if (!brw->use_resource_streamer)
diff --git a/src/mesa/drivers/dri/i965/brw_misc_state.c 
b/src/mesa/drivers/dri/i965/brw_misc_state.c
index 2263604..7e68838 100644
--- a/src/mesa/drivers/dri/i965/brw_misc_state.c
+++ b/src/mesa/drivers/dri/i965/brw_misc_state.c
@@ -868,6 +868,25 @@ brw_emit_select_pipeline(struct brw_context *brw, enum 
brw_pipeline pipeline)
const uint32_t _3DSTATE_PIPELINE_SELECT =
   is_965 ? CMD_PIPELINE_SELECT_965 : CMD_PIPELINE_SELECT_GM45;
 
+   if (brw->use_resource_streamer && pipeline != BRW_RENDER_PIPELINE) {
+  /* From "BXML » GT » MI » vol1a GPU Overview » [Instruction]
+   * PIPELINE_SELECT [DevBWR+]":
+   *
+   * Project: HSW, BDW, CHV, SKL, BXT
+   *
+   * Hardware Binding Tables are only supported for 3D workloads. Resource
+   * streamer must be enabled only for 3D workloads. Resource streamer
+   * must be disabled for Media and GPGPU workloads.
+   */
+  BEGIN_BATCH(1);
+  OUT_BATCH(MI_RS_CONTROL | 0);
+  ADVANCE_BATCH();
+
+  gen7_disable_hw_binding_tables(brw);
+
+  /* XXX - Disable gather constant pool too when we start using it. */
+   }
+
if (brw->gen >= 8) {
   /* From "BXML » GT » MI » vol1a GPU Overview » [Instruction]
* PIPELINE_SELECT [DevBWR+]":
@@ -959,6 +978,25 @@ brw_emit_select_pipeline(struct brw_context *brw, enum 
brw_pipeline pipeline)
   OUT_BATCH(0);
   ADVANCE_BATCH();
}
+
+   if (brw->use_resource_streamer && pipeline == BRW_RENDER_PIPELINE) {
+  /* From "BXML » GT » MI » vol1a GPU Overview » [Instruction]
+   * PIPELINE_SELECT [DevBWR+]":
+   *
+   * Project: HSW, BDW, CHV, SKL, BXT
+   *
+   * Hardware Binding Tables are only supported for 3D workloads. Resource
+   * streamer must be enabled only for 3D workloads. Resource streamer
+   * must be disabled for Media and GPGPU workloads.
+   */
+  BEGIN_BATCH(1);
+  OUT_BATCH(MI_RS_CONTROL | 1);
+  ADVANCE_BATCH();
+
+  gen7_enable_hw_binding_tables(brw);
+
+  /* XXX - Re-enable gather constant pool here. */
+   }
 }
 
 /**
diff --git a/src/mesa/drivers/dri/i965/brw_state.h 
b/src/mesa/drivers/dri/i965/brw_state.h
index d29b997..7d61b7c 100644
--- a/src/mesa/drivers/dri/i965/brw_state.h
+++ b/src/mesa/drivers/dri/i965/brw_state.h
@@ -396,6 +396,7 @@ void gen7_update_binding_table_from_array(struct 
brw_context *brw,
   gl_shader_stage stage,
   const uint32_t* binding_table,
   int num_surfaces);
+void gen7_enable_hw_binding_tables(struct brw_context *brw);
 void gen7_disable_hw_binding_tables(struct brw_context *brw);
 void gen7_reset_hw_bt_pool_offsets(struct brw_context *brw);
 
-- 
2.6.4



[Mesa-dev] [PATCH 0/6] i965: GPGPU/3D pipeline switching fixes.

2016-01-02 Thread Francisco Jerez
The PIPELINE_SELECT command has a number of awkward restrictions we
don't currently take into account while switching between the GPGPU
and 3D pipelines, which in some cases can lead to corruption or hangs.
This series should implement all workarounds mentioned in the hardware
spec ("BXML » GT » MI » vol1a GPU Overview » [Instruction]
PIPELINE_SELECT [DevBWR+]") that seem to be relevant to us.

 [PATCH 1/6] i965: Add state bit to trigger re-emission of color calculator state.
 [PATCH 2/6] i965/gen8+: Invalidate color calc state when switching to the GPGPU pipeline.
 [PATCH 3/6] i965/gen6-7: Implement stall and flushes required prior to switching pipelines.
 [PATCH 4/6] i965/gen4-5: Emit MI_FLUSH as required prior to switching pipelines.
 [PATCH 5/6] i965/gen7: Emit stall and dummy primitive draw after switching to the 3D pipeline.
 [PATCH 6/6] i965/gen7.5+: Disable resource streamer during GPGPU workloads.
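
Read together, the six patches give brw_emit_select_pipeline() the ordering sketched below. This is a standalone model pieced together from the diffs in this series, not the driver code: the step enum, plan_pipeline_switch() and its parameters are hypothetical, the media pipeline is not modeled, and each step stands for the batch commands the corresponding patch emits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum pipeline { RENDER_PIPELINE, COMPUTE_PIPELINE };

/* Hypothetical labels for the workarounds added by the series. */
enum step {
   STEP_DISABLE_RS,              /* 6/6: MI_RS_CONTROL off, HW binding tables off */
   STEP_CLEAR_CC_VALID,          /* 2/6: clear COLOR_CALC_STATE valid, mark CC dirty */
   STEP_STALLING_FLUSH,          /* 3/6: stalling PIPE_CONTROL, write caches */
   STEP_INVALIDATE_READ_CACHES,  /* 3/6: PIPE_CONTROL invalidating read caches */
   STEP_MI_FLUSH,                /* 4/6: MI_FLUSH on Gen4-5 */
   STEP_PIPELINE_SELECT,
   STEP_CS_STALL,                /* 5/6: CS stall on IVB/VLV */
   STEP_DUMMY_DRAW,              /* 5/6: dummy primitive on IVB/VLV */
   STEP_ENABLE_RS                /* 6/6: MI_RS_CONTROL on, HW binding tables on */
};

#define MAX_STEPS 8

/* Fill 'out' with the steps for one pipeline switch; returns the count. */
static size_t
plan_pipeline_switch(int gen, bool is_haswell, bool use_rs,
                     enum pipeline target, enum step out[MAX_STEPS])
{
   size_t n = 0;

   if (use_rs && target != RENDER_PIPELINE)
      out[n++] = STEP_DISABLE_RS;

   if (gen >= 8) {
      if (target == COMPUTE_PIPELINE)
         out[n++] = STEP_CLEAR_CC_VALID;
   } else if (gen >= 6) {
      out[n++] = STEP_STALLING_FLUSH;
      out[n++] = STEP_INVALIDATE_READ_CACHES;
   } else {
      out[n++] = STEP_MI_FLUSH;
   }

   out[n++] = STEP_PIPELINE_SELECT;

   if (gen == 7 && !is_haswell && target == RENDER_PIPELINE) {
      out[n++] = STEP_CS_STALL;
      out[n++] = STEP_DUMMY_DRAW;
   }

   if (use_rs && target == RENDER_PIPELINE)
      out[n++] = STEP_ENABLE_RS;

   assert(n <= MAX_STEPS);
   return n;
}
```

For example, switching an Ivy Bridge context (gen 7, not Haswell, no resource streamer) back to the 3D pipeline yields the two PIPE_CONTROL steps, the PIPELINE_SELECT, then the CS stall and the dummy draw.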