Re: [Mesa-dev] [PATCH 1/3] gallium/hud: fix a problem where objects are free'd while in use.

2016-11-07 Thread Nicolai Hähnle

Hi Emil,

On 07.11.2016 18:37, Laurent Carlier wrote:
> On Monday, 7 November 2016, 18:32:07 CET, Nicolai Hähnle wrote:
>> Looks good to me as well, and pushed! Thanks for the respin and sorry it
>> took so long.
>>
>> [...]
>
> Maybe cc 13.0? It's buggy with 13.0 and it will be a nice fix

I didn't think of this, but I agree. It only touches gallium/hud, which 
we know to be broken in some cases, and even if there are regressions in 
those patches, their impact isn't critical. Pulling these three patches 
into 13.0 makes sense. The commit hashes on master are:


6ffed086795aaa84ab35668bb59d712cdde34da3
5a58323064b32442e2de23c95642bc421be696f8
381edca826ee27b1a49f19b0731c777bdf241b20

Cheers,
Nicolai





___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] radv: emit correct last export when Z/stencil export is enabled

2016-11-07 Thread Bas Nieuwenhuizen
Reviewed-by: Bas Nieuwenhuizen 

On Tue, Nov 8, 2016 at 7:24 AM, Dave Airlie  wrote:
> From: Dave Airlie 
>
> I was getting a random GPU hang in the renderpass simple tests,
> it turns out sometimes radv emitted the wrong thing "last".
>
> This fixes the logic to emit Z/stencil last if they occur,
> and not mark a color output as last. Also this relies on the
> Z/STENCIL being the first two fragment outputs, which they are
> so yay.
>
> Fixes: dEQP-VK.renderpass.simple.color_depth (random hangs)
> Cc: "13.0" 
> Signed-off-by: Dave Airlie 
> ---
>  src/amd/common/ac_nir_to_llvm.c | 8 +---
>  1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
> index 9b2663e..c8ee784 100644
> --- a/src/amd/common/ac_nir_to_llvm.c
> +++ b/src/amd/common/ac_nir_to_llvm.c
> @@ -4378,12 +4378,10 @@ handle_fs_outputs_post(struct nir_to_llvm_context 
> *ctx,
>
> for (unsigned i = 0; i < RADEON_LLVM_MAX_OUTPUTS; ++i) {
> LLVMValueRef values[4];
> -   bool last;
> +
> if (!(ctx->output_mask & (1ull << i)))
> continue;
>
> -   last = ctx->output_mask <= ((1ull << (i + 1)) - 1);
> -
> if (i == FRAG_RESULT_DEPTH) {
> ctx->shader_info->fs.writes_z = true;
> depth = to_float(ctx, LLVMBuildLoad(ctx->builder,
> @@ -4393,10 +4391,14 @@ handle_fs_outputs_post(struct nir_to_llvm_context 
> *ctx,
> stencil = to_float(ctx, LLVMBuildLoad(ctx->builder,
>   
> ctx->outputs[radeon_llvm_reg_index_soa(i, 0)], ""));
> } else {
> +   bool last = false;
> for (unsigned j = 0; j < 4; j++)
> values[j] = to_float(ctx, 
> LLVMBuildLoad(ctx->builder,
> 
> ctx->outputs[radeon_llvm_reg_index_soa(i, j)], ""));
>
> +   if (!ctx->shader_info->fs.writes_z && 
> !ctx->shader_info->fs.writes_stencil)
> +   last = ctx->output_mask <= ((1ull << (i + 1)) 
> - 1);
> +
> si_export_mrt_color(ctx, values, V_008DFC_SQ_EXP_MRT 
> + index, last);
> index++;
> }
> --
> 2.5.5
>


[Mesa-dev] [PATCH 1/3] ac/nir: add support for discard_if intrinsic (v2)

2016-11-07 Thread Dave Airlie
From: Dave Airlie 

We are going to start lowering to this in NIR code,
so prepare radv for it.

v2: handle conversion to kill properly (nha)

Signed-off-by: Dave Airlie 
---
 src/amd/common/ac_nir_to_llvm.c | 21 +
 1 file changed, 21 insertions(+)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index ae0ede6..c8ee784 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -2614,6 +2614,24 @@ static void emit_barrier(struct nir_to_llvm_context *ctx)
ctx->voidt, NULL, 0, 0);
 }
 
+static void emit_discard_if(struct nir_to_llvm_context *ctx,
+   nir_intrinsic_instr *instr)
+{
+   LLVMValueRef cond;
+   ctx->shader_info->fs.can_discard = true;
+
+   cond = LLVMBuildICmp(ctx->builder, LLVMIntNE,
+get_src(ctx, instr->src[0]),
+ctx->i32zero, "");
+
+   cond = LLVMBuildSelect(ctx->builder, cond,
+  LLVMConstReal(ctx->f32, -1.0f),
+  ctx->f32zero, "");
+   emit_llvm_intrinsic(ctx, "llvm.AMDGPU.kill",
+   LLVMVoidTypeInContext(ctx->context),
+   &cond, 1, 0);
+}
+
 static LLVMValueRef
 visit_load_local_invocation_index(struct nir_to_llvm_context *ctx)
 {
@@ -2926,6 +2944,9 @@ static void visit_intrinsic(struct nir_to_llvm_context 
*ctx,
LLVMVoidTypeInContext(ctx->context),
NULL, 0, 0);
break;
+   case nir_intrinsic_discard_if:
+   emit_discard_if(ctx, instr);
+   break;
case nir_intrinsic_memory_barrier:
emit_waitcnt(ctx);
break;
-- 
2.5.5
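
The conversion that emit_discard_if() performs in this patch can be sketched in plain C. This is only an illustration: the function names are hypothetical, and it assumes the kill intrinsic discards a fragment when its float operand is negative, which the patch itself does not state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mirrors the ICmp-NE plus Select pair in emit_discard_if(): any nonzero
 * integer condition becomes -1.0f, zero becomes 0.0f. */
static float kill_operand_from_cond(int32_t cond)
{
    return cond != 0 ? -1.0f : 0.0f;
}

/* Assumption for this sketch: the kill intrinsic discards the fragment
 * when its operand is negative. */
static bool fragment_is_killed(float operand)
{
    return operand < 0.0f;
}
```

Under that assumption, a true condition always discards and a false one never does, which is the behaviour discard_if needs.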



[Mesa-dev] [PATCH 3/3] radv: enable conditional discard optimisation on radv.

2016-11-07 Thread Dave Airlie
From: Dave Airlie 

This fixes a bunch of GPU hangs introduced in some CTS
tests like
dEQP-VK.memory.pipeline_barrier.host_write_uniform_buffer.65536

It works around an issue seen in the LLVM backend, but
also makes the radv code work more like the radeonsi stack.

Signed-off-by: Dave Airlie 
---
 src/amd/vulkan/radv_pipeline.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
index 54f59a8..f94436b 100644
--- a/src/amd/vulkan/radv_pipeline.c
+++ b/src/amd/vulkan/radv_pipeline.c
@@ -144,6 +144,7 @@ radv_optimize_nir(struct nir_shader *shader)
 NIR_PASS(progress, shader, nir_opt_algebraic);
 NIR_PASS(progress, shader, nir_opt_constant_folding);
 NIR_PASS(progress, shader, nir_opt_undef);
+NIR_PASS(progress, shader, nir_opt_conditional_discard);
 } while (progress);
 }
 
-- 
2.5.5



[Mesa-dev] [PATCH 2/3] nir: add conditional discard optimisation (v3)

2016-11-07 Thread Dave Airlie
From: Dave Airlie 

This is ported from GLSL and converts

if (cond)
discard;

into
discard_if(cond);

This removes a block, and radv also needs it
to work around a bug in the LLVM backend.

v2: handle if (a) discard_if(b) (nha)
cleanup and drop pointless loop (Matt)
make sure there are no dependent phis (Eric)
v3: make sure only one instruction in the then block.
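
The rewrite described above can be checked with a small truth-table model in C. This is a sketch, not the pass itself: the function names are hypothetical, and it assumes the nested case "if (a) discard_if(b)" is folded by AND-ing the two conditions, which is the combination that preserves behaviour.

```c
#include <assert.h>
#include <stdbool.h>

/* Shape the pass looks for: "if (a) discard;" when has_inner_cond is
 * false, or "if (a) discard_if(b);" when it is true. */
static bool discarded_before(bool a, bool has_inner_cond, bool b)
{
    if (a) {
        if (!has_inner_cond)
            return true;   /* plain discard inside the if */
        if (b)
            return true;   /* discard_if(b) inside the if */
    }
    return false;
}

/* Shape it produces: one discard_if with a single condition and no
 * surrounding if block. */
static bool discarded_after(bool a, bool has_inner_cond, bool b)
{
    bool cond = has_inner_cond ? (a && b) : a;
    return cond;
}
```

Both forms agree for every input, so removing the if block does not change which fragments are discarded.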

Signed-off-by: Dave Airlie 
---
 src/compiler/Makefile.sources  |   1 +
 src/compiler/nir/nir.h |   2 +
 src/compiler/nir/nir_opt_conditional_discard.c | 125 +
 3 files changed, 128 insertions(+)
 create mode 100644 src/compiler/nir/nir_opt_conditional_discard.c

diff --git a/src/compiler/Makefile.sources b/src/compiler/Makefile.sources
index 669c499..b710cbd 100644
--- a/src/compiler/Makefile.sources
+++ b/src/compiler/Makefile.sources
@@ -228,6 +228,7 @@ NIR_FILES = \
nir/nir_metadata.c \
nir/nir_move_vec_src_uses_to_dest.c \
nir/nir_normalize_cubemap_coords.c \
+   nir/nir_opt_conditional_discard.c \
nir/nir_opt_constant_folding.c \
nir/nir_opt_copy_propagate.c \
nir/nir_opt_cse.c \
diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
index 9264763..2a77139 100644
--- a/src/compiler/nir/nir.h
+++ b/src/compiler/nir/nir.h
@@ -2531,6 +2531,8 @@ bool nir_opt_remove_phis(nir_shader *shader);
 
 bool nir_opt_undef(nir_shader *shader);
 
+bool nir_opt_conditional_discard(nir_shader *shader);
+
 void nir_sweep(nir_shader *shader);
 
 nir_intrinsic_op nir_intrinsic_from_system_value(gl_system_value val);
diff --git a/src/compiler/nir/nir_opt_conditional_discard.c 
b/src/compiler/nir/nir_opt_conditional_discard.c
new file mode 100644
index 000..6e90983
--- /dev/null
+++ b/src/compiler/nir/nir_opt_conditional_discard.c
@@ -0,0 +1,125 @@
+/*
+ * Copyright © 2016 Red Hat
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include "nir.h"
+#include "nir_builder.h"
+
+/** @file nir_opt_conditional_discard.c
+ *
+ * Handles optimization of lowering if (cond) discard to discard_if(cond).
+ */
+
+static bool
+nir_opt_conditional_discard_block(nir_block *block, void *mem_ctx)
+{
+   nir_builder bld;
+
+   if (nir_cf_node_is_first(&block->cf_node))
+  return false;
+
+   nir_cf_node *prev_node = nir_cf_node_prev(&block->cf_node);
+   if (prev_node->type != nir_cf_node_if)
+  return false;
+
+   nir_if *if_stmt = nir_cf_node_as_if(prev_node);
+   nir_block *then_block = nir_if_first_then_block(if_stmt);
+   nir_block *else_block = nir_if_first_else_block(if_stmt);
+
+   /* check there is only one else block and it is empty */
+   if (nir_if_last_else_block(if_stmt) != else_block)
+  return false;
+   if (!exec_list_is_empty(&else_block->instr_list))
+  return false;
+
+   /* check there is only one then block and it has only one instruction in it */
+   if (nir_if_last_then_block(if_stmt) != then_block)
+  return false;
+   if (exec_list_is_empty(&then_block->instr_list))
+  return false;
+   if (exec_list_length(&then_block->instr_list) > 1)
+  return false;
+   /*
+* make sure no subsequent phi nodes point at this if.
+*/
+   nir_block *after = nir_cf_node_as_block(nir_cf_node_next(&if_stmt->cf_node));
+   nir_foreach_instr_safe(instr, after) {
+  if (instr->type != nir_instr_type_phi)
+ break;
+  nir_phi_instr *phi = nir_instr_as_phi(instr);
+
+  nir_foreach_phi_src(phi_src, phi) {
+ if (phi_src->pred == then_block ||
+ phi_src->pred == else_block)
+return false;
+  }
+   }
+
+   /* Get the first instruction in the then block and confirm it is
+* a discard or a discard_if
+*/
+   nir_instr *instr = nir_block_first_instr(then_block);
+   if (instr->type != nir_instr_type_intrinsic)
+  return false;
+
+   nir_intrinsic_instr *intrin = 

[Mesa-dev] [PATCH] radv: emit correct last export when Z/stencil export is enabled

2016-11-07 Thread Dave Airlie
From: Dave Airlie 

I was getting a random GPU hang in the renderpass simple tests,
it turns out sometimes radv emitted the wrong thing "last".

This fixes the logic to emit Z/stencil last if they occur,
and not mark a color output as last. Also this relies on the
Z/STENCIL being the first two fragment outputs, which they are
so yay.

Fixes: dEQP-VK.renderpass.simple.color_depth (random hangs)
Cc: "13.0" 
Signed-off-by: Dave Airlie 
---
 src/amd/common/ac_nir_to_llvm.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index 9b2663e..c8ee784 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -4378,12 +4378,10 @@ handle_fs_outputs_post(struct nir_to_llvm_context *ctx,
 
for (unsigned i = 0; i < RADEON_LLVM_MAX_OUTPUTS; ++i) {
LLVMValueRef values[4];
-   bool last;
+
if (!(ctx->output_mask & (1ull << i)))
continue;
 
-   last = ctx->output_mask <= ((1ull << (i + 1)) - 1);
-
if (i == FRAG_RESULT_DEPTH) {
ctx->shader_info->fs.writes_z = true;
depth = to_float(ctx, LLVMBuildLoad(ctx->builder,
@@ -4393,10 +4391,14 @@ handle_fs_outputs_post(struct nir_to_llvm_context *ctx,
stencil = to_float(ctx, LLVMBuildLoad(ctx->builder,
  
ctx->outputs[radeon_llvm_reg_index_soa(i, 0)], ""));
} else {
+   bool last = false;
for (unsigned j = 0; j < 4; j++)
values[j] = to_float(ctx, 
LLVMBuildLoad(ctx->builder,

ctx->outputs[radeon_llvm_reg_index_soa(i, j)], ""));
 
+   if (!ctx->shader_info->fs.writes_z && 
!ctx->shader_info->fs.writes_stencil)
+   last = ctx->output_mask <= ((1ull << (i + 1)) - 
1);
+
si_export_mrt_color(ctx, values, V_008DFC_SQ_EXP_MRT + 
index, last);
index++;
}
-- 
2.5.5
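
The "last" decision in this patch can be sketched as a standalone C function. The names and slot numbers below are hypothetical (in Mesa the slots come from the FRAG_RESULT_* enum); the sketch only relies on what the commit message states, namely that depth and stencil occupy the two lowest fragment output slots.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical slot numbers for this sketch only. */
enum { SKETCH_DEPTH = 0, SKETCH_STENCIL = 1 };

/* True when bit i is the highest bit set in mask, i.e. output i is the
 * final enabled output. Same test as the patch:
 * mask <= ((1ull << (i + 1)) - 1). Requires i < 63. */
static bool is_highest_output(uint64_t mask, unsigned i)
{
    assert(i < 63);
    return mask <= ((1ull << (i + 1)) - 1);
}

/* Decide whether color output i should carry the "last" flag. */
static bool color_export_is_last(uint64_t mask, unsigned i)
{
    bool writes_z = (mask & (1ull << SKETCH_DEPTH)) != 0;
    bool writes_stencil = (mask & (1ull << SKETCH_STENCIL)) != 0;

    if (writes_z || writes_stencil)
        return false; /* the Z/stencil export is emitted last instead */
    return is_highest_output(mask, i);
}
```

With a mask of 0x14 (outputs 2 and 4), only output 4 is marked last; once the depth bit is added, no color output is.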



[Mesa-dev] [PATCH 1/4] linker: Trivial coding standards fixes

2016-11-07 Thread Ian Romanick
From: Ian Romanick 

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/link_uniforms.cpp | 30 ++
 1 file changed, 14 insertions(+), 16 deletions(-)

diff --git a/src/compiler/glsl/link_uniforms.cpp 
b/src/compiler/glsl/link_uniforms.cpp
index b3c3c5a..fdcbd36 100644
--- a/src/compiler/glsl/link_uniforms.cpp
+++ b/src/compiler/glsl/link_uniforms.cpp
@@ -234,10 +234,8 @@ program_resource_visitor::visit_field(const glsl_type 
*type, const char *name,
 }
 
 void
-program_resource_visitor::visit_field(const glsl_struct_field *field)
+program_resource_visitor::visit_field(const glsl_struct_field *)
 {
-   (void) field;
-   /* empty */
 }
 
 void
@@ -346,14 +344,12 @@ public:
 
 private:
virtual void visit_field(const glsl_type *type, const char *name,
-bool row_major)
+bool /* row_major */)
{
   assert(!type->without_array()->is_record());
   assert(!type->without_array()->is_interface());
   assert(!(type->is_array() && type->fields.array->is_array()));
 
-  (void) row_major;
-
   /* Count the number of samplers regardless of whether the uniform is
* already in the hash table.  The hash table prevents adding the same
* uniform for multiple shader targets, but in this case we want to
@@ -372,7 +368,7 @@ private:
   * components in the default block.  The spec allows image
   * uniforms to use up no more than one scalar slot.
   */
- if(!is_shader_storage)
+ if (!is_shader_storage)
 this->num_shader_uniform_components += values;
   } else {
  /* Accumulate the total number of uniform slots used by this shader.
@@ -651,17 +647,16 @@ private:
   this->record_array_count = record_array_count;
}
 
-   virtual void visit_field(const glsl_type *type, const char *name,
-bool row_major)
+   virtual void visit_field(const glsl_type *, const char *,
+bool /* row_major */)
{
-  (void) type;
-  (void) name;
-  (void) row_major;
-  assert(!"Should not get here.");
+  unreachable("Should not get here.");
}
 
virtual void enter_record(const glsl_type *type, const char *,
- bool row_major, const enum glsl_interface_packing 
packing) {
+ bool row_major,
+ const enum glsl_interface_packing packing)
+   {
   assert(type->is_record());
   if (this->buffer_block_index == -1)
  return;
@@ -674,7 +669,9 @@ private:
}
 
virtual void leave_record(const glsl_type *type, const char *,
- bool row_major, const enum glsl_interface_packing 
packing) {
+ bool row_major,
+ const enum glsl_interface_packing packing)
+   {
   assert(type->is_record());
   if (this->buffer_block_index == -1)
  return;
@@ -892,7 +889,7 @@ link_update_uniform_buffer_variables(struct 
gl_linked_shader *shader)
foreach_in_list(ir_instruction, node, shader->ir) {
   ir_variable *const var = node->as_variable();
 
-  if ((var == NULL) || !var->is_in_buffer_block())
+  if (var == NULL || !var->is_in_buffer_block())
  continue;
 
   assert(var->data.mode == ir_var_uniform ||
@@ -942,6 +939,7 @@ link_update_uniform_buffer_variables(struct 
gl_linked_shader *shader)
break;
 }
  }
+
  if (found)
 break;
   }
-- 
2.5.5



[Mesa-dev] [PATCH 2/4] glsl: Add some comments to methods of ir_variable_refcount_visitor

2016-11-07 Thread Ian Romanick
From: Ian Romanick 

It was not obvious from just the .h file what the hash table
contained.  It was also not obvious that get_variable_entry would create
a new entry in the hash table.

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/ir_variable_refcount.h | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/src/compiler/glsl/ir_variable_refcount.h 
b/src/compiler/glsl/ir_variable_refcount.h
index 08a11c0..0a8eec7 100644
--- a/src/compiler/glsl/ir_variable_refcount.h
+++ b/src/compiler/glsl/ir_variable_refcount.h
@@ -72,8 +72,14 @@ public:
virtual ir_visitor_status visit_enter(ir_function_signature *);
virtual ir_visitor_status visit_leave(ir_assignment *);
 
+   /**
+* Find variable in the hash table, and insert it if not present
+*/
ir_variable_refcount_entry *get_variable_entry(ir_variable *var);
 
+   /**
+* Hash table mapping ir_variable to ir_variable_refcount_entry.
+*/
struct hash_table *ht;
 
void *mem_ctx;
-- 
2.5.5



[Mesa-dev] [PATCH 3/4] linker: Slight code rearrange to prevent duplication in the next commit

2016-11-07 Thread Ian Romanick
From: Ian Romanick 

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/link_uniforms.cpp | 13 ++---
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/src/compiler/glsl/link_uniforms.cpp 
b/src/compiler/glsl/link_uniforms.cpp
index fdcbd36..d70614f 100644
--- a/src/compiler/glsl/link_uniforms.cpp
+++ b/src/compiler/glsl/link_uniforms.cpp
@@ -928,13 +928,12 @@ link_update_uniform_buffer_variables(struct 
gl_linked_shader *shader)
if ((ptrdiff_t) l != (end - begin))
   continue;
 
-   if (strncmp(var->name, begin, l) == 0) {
-  found = true;
-  var->data.location = j;
-  break;
-   }
-} else if (!strcmp(var->name, blks[i]->Uniforms[j].Name)) {
-   found = true;
+   found = strncmp(var->name, begin, l) == 0;
+} else {
+   found = strcmp(var->name, blks[i]->Uniforms[j].Name) == 0;
+}
+
+if (found) {
var->data.location = j;
break;
 }
-- 
2.5.5



[Mesa-dev] [PATCH 4/4] linker: Accurately track gl_uniform_block::stageref

2016-11-07 Thread Ian Romanick
From: Ian Romanick 

As the linked per-stage shaders are processed, mark any block that has a
field that is accessed as referenced.  When combining all the linked
shaders, combine the per-stage stageref masks.

This fixes a number of GLES CTS tests including
ESEXT-CTS.geometry_shader.program_resource.program_resource.  However,
it makes quite a few more fail.  I have diagnosed the failures, but I'm
not sure whether we or the tests are wrong.  After optimizations are
applied, all of the tests are of the form:

buffer X {
float f;
} x;

void main()
{
x.f = x.f;
}

The test then queries that x is referenced by that shader stage.  We
eliminate the assignment of x.f to itself, and that removes the last
reference to x.  We report that x is not referenced, and the test fails.
I do not know whether or not we are allowed to eliminate that assignment
of x.f to itself.
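
The bookkeeping described above can be sketched in C. Everything here is a simplified stand-in: struct sketch_block, the function names and the stage numbers are hypothetical, and combining the per-stage masks with a bitwise OR is an assumption about what "combine" means.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for gl_uniform_block; only the fields used here. */
struct sketch_block {
    const char *name;
    unsigned stageref;
};

/* Does block_name belong to interface type iface_name? For an array of
 * instances the block names look like "X[0]", so compare up to '['.
 * Otherwise compare up to the string terminator. */
static bool block_name_matches(const char *block_name,
                               const char *iface_name, bool is_array)
{
    const char sentinel = is_array ? '[' : '\0';
    /* strchr() with '\0' returns a pointer to the terminator. */
    const char *end = strchr(block_name, sentinel);
    size_t len = strlen(iface_name);

    if (end == NULL || (size_t)(end - block_name) != len)
        return false;
    return strncmp(block_name, iface_name, len) == 0;
}

/* Mark every matching block as referenced by the given stage. There is
 * no early exit: all elements of an instance array must be marked. */
static void mark_referenced(struct sketch_block *blks, size_t n,
                            const char *iface_name, bool is_array,
                            unsigned stage)
{
    for (size_t i = 0; i < n; i++) {
        if (block_name_matches(blks[i].name, iface_name, is_array))
            blks[i].stageref |= 1u << stage;
    }
}

/* Assumed combination step: OR the per-stage masks of one block. */
static unsigned combine_stagerefs(const unsigned *masks, size_t n)
{
    unsigned combined = 0;
    for (size_t i = 0; i < n; i++)
        combined |= masks[i];
    return combined;
}
```

In this model, eliminating the last access to x in a stage means mark_referenced() is never called for it, so its bit stays clear. That is the situation the failing tests run into.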

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/link_uniforms.cpp | 65 +
 src/compiler/glsl/linker.cpp|  3 +-
 2 files changed, 59 insertions(+), 9 deletions(-)

diff --git a/src/compiler/glsl/link_uniforms.cpp 
b/src/compiler/glsl/link_uniforms.cpp
index d70614f..29cd24c 100644
--- a/src/compiler/glsl/link_uniforms.cpp
+++ b/src/compiler/glsl/link_uniforms.cpp
@@ -28,6 +28,7 @@
 #include "glsl_symbol_table.h"
 #include "program.h"
 #include "util/string_to_uint_map.h"
+#include "ir_variable_refcount.h"
 
 /**
  * \file link_uniforms.cpp
@@ -877,6 +878,15 @@ public:
unsigned shader_shadow_samplers;
 };
 
+static bool
+variable_is_referenced(ir_variable_refcount_visitor &v, ir_variable *var)
+{
+   ir_variable_refcount_entry *const entry = v.get_variable_entry(var);
+
+   return entry->referenced_count > 0;
+
+}
+
 /**
  * Walks the IR and update the references to uniform blocks in the
  * ir_variables to point at linked shader's list (previously, they
@@ -884,8 +894,13 @@ public:
  * shaders).
  */
 static void
-link_update_uniform_buffer_variables(struct gl_linked_shader *shader)
+link_update_uniform_buffer_variables(struct gl_linked_shader *shader,
+ unsigned stage)
 {
+   ir_variable_refcount_visitor v;
+
+   v.run(shader->ir);
+
foreach_in_list(ir_instruction, node, shader->ir) {
   ir_variable *const var = node->as_variable();
 
@@ -895,7 +910,44 @@ link_update_uniform_buffer_variables(struct 
gl_linked_shader *shader)
   assert(var->data.mode == ir_var_uniform ||
  var->data.mode == ir_var_shader_storage);
 
+  unsigned num_blocks = var->data.mode == ir_var_uniform ?
+ shader->NumUniformBlocks : shader->NumShaderStorageBlocks;
+  struct gl_uniform_block **blks = var->data.mode == ir_var_uniform ?
+ shader->UniformBlocks : shader->ShaderStorageBlocks;
+
   if (var->is_interface_instance()) {
+ if (variable_is_referenced(v, var)) {
+/* Since this is an interface instance, the instance type will be
+ * same as the array-stripped variable type.  If the variable type
+ * is an array, then the block names will be suffixed with [0]
+ * through [n-1].  Unlike for non-interface instances, there will
+ * not be structure types here, so the only name sentinel that we
+ * have to worry about is [.
+ */
+assert(var->type->without_array() == var->get_interface_type());
+const char sentinel = var->type->is_array() ? '[' : '\0';
+
+const ptrdiff_t len = strlen(var->get_interface_type()->name);
+for (unsigned i = 0; i < num_blocks; i++) {
+   const char *const begin = blks[i]->Name;
+   const char *const end = strchr(begin, sentinel);
+
+   if (end == NULL)
+  continue;
+
+   if (len != (end - begin))
+  continue;
+
+   /* Even when a match is found, do not "break" here.  This could
+* be an array of instances, and all elements of the array need
+* to be marked as referenced.
+*/
+   if (strncmp(begin, var->get_interface_type()->name, len) == 0) {
+  blks[i]->stageref |= 1U << stage;
+   }
+}
+ }
+
  var->data.location = 0;
  continue;
   }
@@ -910,11 +962,6 @@ link_update_uniform_buffer_variables(struct 
gl_linked_shader *shader)
  sentinel = '[';
   }
 
-  unsigned num_blocks = var->data.mode == ir_var_uniform ?
- shader->NumUniformBlocks : shader->NumShaderStorageBlocks;
-  struct gl_uniform_block **blks = var->data.mode == ir_var_uniform ?
- shader->UniformBlocks : shader->ShaderStorageBlocks;
-
   const unsigned l = strlen(var->name);
   for (unsigned i = 0; i < num_blocks; i++) {
  for (unsigned j = 0; j < blks[i]->NumUniforms; 

Re: [Mesa-dev] [PATCH 07/13] mesa: Replace program locks with atomic inc/dec.

2016-11-07 Thread Timothy Arceri
There are still some issues with the other patches but is there any
reason this one didn't land?

On Thu, 2015-08-06 at 17:10 -0700, Matt Turner wrote:
> ---
>  src/mesa/main/mtypes.h |  1 -
>  src/mesa/program/program.c | 15 +++
>  2 files changed, 3 insertions(+), 13 deletions(-)
> 
> diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
> index fcc527f..c597ccc 100644
> --- a/src/mesa/main/mtypes.h
> +++ b/src/mesa/main/mtypes.h
> @@ -2095,7 +2095,6 @@ enum gl_frag_depth_layout
>   */
>  struct gl_program
>  {
> -   mtx_t Mutex;
> GLuint Id;
> GLint RefCount;
> GLubyte *String;  /**< Null-terminated program text */
> diff --git a/src/mesa/program/program.c b/src/mesa/program/program.c
> index e94c102..54e3498 100644
> --- a/src/mesa/program/program.c
> +++ b/src/mesa/program/program.c
> @@ -38,6 +38,7 @@
>  #include "prog_parameter.h"
>  #include "prog_instruction.h"
>  #include "util/ralloc.h"
> +#include "util/u_atomic.h"
>  
>  
>  /**
> @@ -226,7 +227,6 @@ init_program_struct(struct gl_program *prog,
> GLenum target, GLuint id)
> assert(prog);
>  
> memset(prog, 0, sizeof(*prog));
> -   mtx_init(&prog->Mutex, mtx_plain);
> prog->Id = id;
> prog->Target = target;
> prog->RefCount = 1;
> @@ -419,7 +419,6 @@ _mesa_delete_program(struct gl_context *ctx,
> struct gl_program *prog)
>    ralloc_free(prog->nir);
> }
>  
> -   mtx_destroy(&prog->Mutex);
> free(prog);
>  }
>  
> @@ -464,17 +463,11 @@ _mesa_reference_program_(struct gl_context
> *ctx,
>  #endif
>  
> if (*ptr) {
> -  GLboolean deleteFlag;
>    struct gl_program *oldProg = *ptr;
>  
> -  mtx_lock(&oldProg->Mutex);
>    assert(oldProg->RefCount > 0);
> -  oldProg->RefCount--;
>  
> -  deleteFlag = (oldProg->RefCount == 0);
> -  mtx_unlock(&oldProg->Mutex);
> -
> -  if (deleteFlag) {
> +  if (p_atomic_dec_zero(&oldProg->RefCount)) {
>   assert(ctx);
>   ctx->Driver.DeleteProgram(ctx, oldProg);
>    }
> @@ -484,9 +477,7 @@ _mesa_reference_program_(struct gl_context *ctx,
>  
> assert(!*ptr);
> if (prog) {
> -  mtx_lock(&prog->Mutex);
> -  prog->RefCount++;
> -  mtx_unlock(&prog->Mutex);
> +  p_atomic_inc(&prog->RefCount);
> }
>  
> *ptr = prog;


Re: [Mesa-dev] [PATCH 2/2] Make shader refcounting atomic.

2016-11-07 Thread Timothy Arceri
On Sat, 2016-11-05 at 15:37 +0100, Steinar H. Gunderson wrote:
> These were racy when using the same shaders (seemingly even from
> different program objects) on multiple threads sharing the same
> objects, leading to issues such as (excerpts from an apitrace dump
> from a real application):
>   1097 @0 glCreateProgram() = 9
>   1099 @0 glAttachShader(program = 9, shader = 7)
>   1101 @0 glAttachShader(program = 9, shader = 8)
>   [...]
>   18122 @2 glCreateProgram() = 137
>   18128 @2 glAttachShader(program = 137, shader = 7)
>   18130 @2 glAttachShader(program = 137, shader = 8)
>   [...]
>   437559 @0 glUseProgram(program = 9)
>   437582 @2 glUseProgram(program = 137)
>   437613 @2 glUseProgram(program = 9)
>   437614 @2 glGetError() = GL_INVALID_VALUE
> 
> with nothing deleting the shaders or programs in-between; just racy
> refcounting, as confirmed by Helgrind:
> 
>   ==13727== Possible data race during read of size 4 at 0x2B3B2648 by
> thread #1
>   ==13727== Locks held: none
>   ==13727==at 0x1EEBF613: _mesa_reference_shader_program_
> (shaderobj.c:247)
>   ==13727==by 0x1EEBDFB2: _mesa_use_program (shaderapi.c:1259)
>   ==13727==by 0x60FA618:
> movit::EffectChain::execute_phase(movit::Phase*, bool, std::set std::less, std::allocator >*, std::map unsigned int, std::less,
> std::allocator > >*,
> std::set,
> std::allocator >*) (effect_chain.cpp:1885)
>   [...]
>   ==13727== This conflicts with a previous write of size 4 by thread
> #20
>   ==13727== Locks held: none
>   ==13727==at 0x1EEBF600: _mesa_reference_shader_program_
> (shaderobj.c:236)
>   ==13727==by 0x1EEBDFB2: _mesa_use_program (shaderapi.c:1259)
>   ==13727==by 0x60FA618:
> movit::EffectChain::execute_phase(movit::Phase*, bool, std::set std::less, std::allocator >*, std::map unsigned int, std::less,
> std::allocator > >*,
> std::set,
> std::allocator >*) (effect_chain.cpp:1885)
> 
> Cc: 11.2 12.0 13.0 
> Signed-off-by: Steinar H. Gunderson 
> ---
>  src/mesa/main/shaderobj.c | 23 +--
>  1 file changed, 9 insertions(+), 14 deletions(-)
> 
> diff --git a/src/mesa/main/shaderobj.c b/src/mesa/main/shaderobj.c
> index 8fd574e..08e4379 100644
> --- a/src/mesa/main/shaderobj.c
> +++ b/src/mesa/main/shaderobj.c
> @@ -41,6 +41,7 @@
>  #include "program/prog_parameter.h"
>  #include "util/ralloc.h"
>  #include "util/string_to_uint_map.h"
> +#include "util/u_atomic.h"
>  
>  /***
> ***/
>  /*** Shader object
> functions***/
> @@ -64,14 +65,11 @@ _mesa_reference_shader(struct gl_context *ctx,
> struct gl_shader **ptr,
> }
> if (*ptr) {
>    /* Unreference the old shader */
> -  GLboolean deleteFlag = GL_FALSE;
>    struct gl_shader *old = *ptr;
> +  int old_refcount = p_atomic_dec_return(&old->RefCount);
>  
> -  assert(old->RefCount > 0);
> -  old->RefCount--;
> -  deleteFlag = (old->RefCount == 0);
> -
> -  if (deleteFlag) {
> +  assert(old_refcount >= 0);
> +  if (old_refcount == 0) {

I think you could just use p_atomic_dec_zero(&old->RefCount) here, as
Matt did in this patch:

https://lists.freedesktop.org/archives/mesa-dev/2015-August/090979.html

>    if (old->Name != 0)
>   _mesa_HashRemove(ctx->Shared->ShaderObjects, old->Name);
>   _mesa_delete_shader(ctx, old);
> @@ -83,7 +81,7 @@ _mesa_reference_shader(struct gl_context *ctx,
> struct gl_shader **ptr,
>  
> if (sh) {
>    /* reference new */
> -  sh->RefCount++;
> +  p_atomic_inc(&sh->RefCount);
>    *ptr = sh;
> }
>  }
> @@ -226,14 +224,11 @@ _mesa_reference_shader_program_(struct
> gl_context *ctx,
> }
> if (*ptr) {
>    /* Unreference the old shader program */
> -  GLboolean deleteFlag = GL_FALSE;
>    struct gl_shader_program *old = *ptr;
> +  int old_refcount = p_atomic_dec_return(&old->RefCount);
>  
> -  assert(old->RefCount > 0);
> -  old->RefCount--;
> -  deleteFlag = (old->RefCount == 0);
> -
> -  if (deleteFlag) {
> +  assert(old_refcount >= 0);
> +  if (old_refcount == 0) {
>    if (old->Name != 0)
>   _mesa_HashRemove(ctx->Shared->ShaderObjects, old->Name);
>   _mesa_delete_shader_program(ctx, old);
> @@ -244,7 +239,7 @@ _mesa_reference_shader_program_(struct gl_context
> *ctx,
> assert(!*ptr);
>  
> if (shProg) {
> -  shProg->RefCount++;
> +  p_atomic_inc(&shProg->RefCount);
>    *ptr = shProg;
> }
>  }
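
The dec-and-test pattern discussed in this thread can be sketched with C11 atomics. This deliberately uses <stdatomic.h> rather than Mesa's util/u_atomic.h, and struct sketch_object with its two helpers is hypothetical. The point is that a plain "RefCount--" followed by a separate "RefCount == 0" check leaves a window where two threads can interleave, while an atomic decrement that reports its result does not.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical reference-counted object for this sketch. */
struct sketch_object {
    atomic_int ref_count;
};

/* Take one more reference. */
static void sketch_ref(struct sketch_object *obj)
{
    atomic_fetch_add(&obj->ref_count, 1);
}

/* Drop one reference. Returns true when the caller released the last
 * one and is therefore responsible for destroying the object. The
 * decrement and the zero test are a single atomic step. */
static bool sketch_unref(struct sketch_object *obj)
{
    int previous = atomic_fetch_sub(&obj->ref_count, 1);

    assert(previous > 0);
    return previous == 1;
}
```

Exactly one caller sees true, so the object is destroyed once even when several threads drop their references at the same time.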

[Mesa-dev] [PATCH] llvmpipe: Fix build after removal of deprecated attribute API

2016-11-07 Thread Aaron Watry
Applies on top of v2 of Tom's gallivm change.

Signed-off-by: Aaron Watry 
CC: Tom Stellard 
CC: Jan Vesely 
---
This fixes the build for me. I haven't done more than compile test this and run 
make check.

 src/gallium/drivers/llvmpipe/lp_state_fs.c| 2 +-
 src/gallium/drivers/llvmpipe/lp_state_setup.c | 3 +--
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/llvmpipe/lp_state_fs.c 
b/src/gallium/drivers/llvmpipe/lp_state_fs.c
index 3428eed..9a288fe 100644
--- a/src/gallium/drivers/llvmpipe/lp_state_fs.c
+++ b/src/gallium/drivers/llvmpipe/lp_state_fs.c
@@ -2296,7 +2296,7 @@ generate_fragment(struct llvmpipe_context *lp,
 */
for(i = 0; i < ARRAY_SIZE(arg_types); ++i)
   if(LLVMGetTypeKind(arg_types[i]) == LLVMPointerTypeKind)
- LLVMAddAttribute(LLVMGetParam(function, i), LLVMNoAliasAttribute);
+ lp_add_function_attr(function, i + 1, "noalias", 7);
 
context_ptr  = LLVMGetParam(function, 0);
x= LLVMGetParam(function, 1);
diff --git a/src/gallium/drivers/llvmpipe/lp_state_setup.c 
b/src/gallium/drivers/llvmpipe/lp_state_setup.c
index a57e2f0..dd1a9a0 100644
--- a/src/gallium/drivers/llvmpipe/lp_state_setup.c
+++ b/src/gallium/drivers/llvmpipe/lp_state_setup.c
@@ -624,8 +624,7 @@ set_noalias(LLVMBuilderRef builder,
int i;
for(i = 0; i < nr_args; ++i)
   if(LLVMGetTypeKind(arg_types[i]) == LLVMPointerTypeKind)
- LLVMAddAttribute(LLVMGetParam(function, i),
-LLVMNoAliasAttribute);
+ lp_add_function_attr(function, i + 1, "noalias", 7);
 }
 
 static void
-- 
2.7.4



[Mesa-dev] [PATCH v4 08/10] anv/batch: Move last_ss_pool_bo_offset to the command buffer

2016-11-07 Thread Jason Ekstrand
The original reason for putting it in the batch_bo was to allow primaries
to share it across secondaries or something like that.  However, the
relocation lists in secondary command buffers are always left alone and
copied into the primary command buffer's relocation list.  This means that
the offset really applies at the command buffer level and putting it in the
batch_bo doesn't make sense.  This fixes a couple of potential bugs around
re-submission of command buffers that are not likely to be hit but are bugs
nonetheless.
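
Why the offset belongs to the command buffer can be sketched in C. The structure and function below are hypothetical simplifications, and two details are assumptions rather than something the patch shows in full: that the delta is added to each relocation's offset, and that the remembered center is updated right after the adjustment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a command buffer. */
struct sketch_cmd_buffer {
    uint32_t reloc_offsets[4];    /* offsets that get rebased */
    size_t num_relocs;
    uint32_t last_ss_pool_center; /* center already applied to the offsets */
};

/* Rebase the relocations on the pool's current center. Only the delta
 * since the previous adjustment is applied, so submitting the same
 * command buffer again does not shift the offsets a second time. */
static void sketch_adjust_relocs(struct sketch_cmd_buffer *cmd,
                                 uint32_t pool_center_bo_offset)
{
    assert(cmd->last_ss_pool_center <= pool_center_bo_offset);
    uint32_t delta = pool_center_bo_offset - cmd->last_ss_pool_center;

    for (size_t i = 0; i < cmd->num_relocs; i++)
        cmd->reloc_offsets[i] += delta;

    cmd->last_ss_pool_center = pool_center_bo_offset;
}
```

Keeping one remembered center per command buffer means every relocation copied in from secondaries is adjusted by the same amount.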
---
 src/intel/vulkan/anv_batch_chain.c | 33 +
 src/intel/vulkan/anv_private.h |  6 +++---
 2 files changed, 24 insertions(+), 15 deletions(-)

diff --git a/src/intel/vulkan/anv_batch_chain.c 
b/src/intel/vulkan/anv_batch_chain.c
index a21ae78..45cdb95 100644
--- a/src/intel/vulkan/anv_batch_chain.c
+++ b/src/intel/vulkan/anv_batch_chain.c
@@ -297,8 +297,6 @@ anv_batch_bo_clone(struct anv_cmd_buffer *cmd_buffer,
bbo->length = other_bbo->length;
memcpy(bbo->bo.map, other_bbo->bo.map, other_bbo->length);
 
-   bbo->last_ss_pool_bo_offset = other_bbo->last_ss_pool_bo_offset;
-
*bbo_out = bbo;
 
return VK_SUCCESS;
@@ -318,7 +316,6 @@ anv_batch_bo_start(struct anv_batch_bo *bbo, struct 
anv_batch *batch,
batch->next = batch->start = bbo->bo.map;
batch->end = bbo->bo.map + bbo->bo.size - batch_padding;
batch->relocs = &bbo->relocs;
-   bbo->last_ss_pool_bo_offset = 0;
bbo->relocs.num_relocs = 0;
 }
 
@@ -634,6 +631,7 @@ anv_cmd_buffer_init_batch_bo_chain(struct anv_cmd_buffer 
*cmd_buffer)
 &cmd_buffer->pool->alloc);
if (result != VK_SUCCESS)
   goto fail_bt_blocks;
+   cmd_buffer->last_ss_pool_center = 0;
 
anv_cmd_buffer_new_binding_table_block(cmd_buffer);
 
@@ -699,6 +697,7 @@ anv_cmd_buffer_reset_batch_bo_chain(struct anv_cmd_buffer 
*cmd_buffer)
cmd_buffer->bt_next = 0;
 
cmd_buffer->surface_relocs.num_relocs = 0;
+   cmd_buffer->last_ss_pool_center = 0;
 
/* Reset the list of seen buffers */
cmd_buffer->seen_bbos.head = 0;
@@ -985,15 +984,19 @@ write_reloc(const struct anv_device *device, void *p, 
uint64_t v, bool flush)
 
 static void
 adjust_relocations_from_state_pool(struct anv_block_pool *pool,
-   struct anv_reloc_list *relocs)
+   struct anv_reloc_list *relocs,
+   uint32_t last_pool_center_bo_offset)
 {
+   assert(last_pool_center_bo_offset <= pool->center_bo_offset);
+   uint32_t delta = pool->center_bo_offset - last_pool_center_bo_offset;
+
for (size_t i = 0; i < relocs->num_relocs; i++) {
   /* All of the relocations from this block pool to other BO's should
* have been emitted relative to the surface block pool center.  We
* need to add the center offset to make them relative to the
* beginning of the actual GEM bo.
*/
-  relocs->relocs[i].offset += pool->center_bo_offset;
+  relocs->relocs[i].offset += delta;
}
 }
 
@@ -1001,10 +1004,10 @@ static void
 adjust_relocations_to_state_pool(struct anv_block_pool *pool,
  struct anv_bo *from_bo,
  struct anv_reloc_list *relocs,
- uint32_t *last_pool_center_bo_offset)
+ uint32_t last_pool_center_bo_offset)
 {
-   assert(*last_pool_center_bo_offset <= pool->center_bo_offset);
-   uint32_t delta = pool->center_bo_offset - *last_pool_center_bo_offset;
+   assert(last_pool_center_bo_offset <= pool->center_bo_offset);
+   uint32_t delta = pool->center_bo_offset - last_pool_center_bo_offset;
 
/* When we initially emit relocations into a block pool, we don't
 * actually know what the final center_bo_offset will be so we just emit
@@ -1032,8 +1035,6 @@ adjust_relocations_to_state_pool(struct anv_block_pool 
*pool,
  relocs->relocs[i].delta, false);
   }
}
-
-   *last_pool_center_bo_offset = pool->center_bo_offset;
 }
 
 void
@@ -1045,7 +1046,9 @@ anv_cmd_buffer_prepare_execbuf(struct anv_cmd_buffer 
*cmd_buffer)
 
cmd_buffer->execbuf2.bo_count = 0;
 
-   adjust_relocations_from_state_pool(ss_pool, &cmd_buffer->surface_relocs);
+   adjust_relocations_from_state_pool(ss_pool, &cmd_buffer->surface_relocs,
+  cmd_buffer->last_ss_pool_center);
+
anv_execbuf_add_bo(&cmd_buffer->execbuf2, &ss_pool->bo,
   &cmd_buffer->surface_relocs,
   &cmd_buffer->pool->alloc);
@@ -1056,12 +1059,18 @@ anv_cmd_buffer_prepare_execbuf(struct anv_cmd_buffer 
*cmd_buffer)
struct anv_batch_bo **bbo;
u_vector_foreach(bbo, &cmd_buffer->seen_bbos) {
   adjust_relocations_to_state_pool(ss_pool, &(*bbo)->bo, &(*bbo)->relocs,
-   &(*bbo)->last_ss_pool_bo_offset);
+   cmd_buffer->last_ss_pool_center);
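A minimal sketch of the delta-based adjustment this patch describes, for
readers following along. The structures and the prepare_submit() helper are
simplified, hypothetical stand-ins (not the anv types), and storing the
center after each adjustment is an assumption about the part of the diff
that is cut off above:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the anv relocation structures. */
struct reloc {
   uint32_t offset;
};

struct reloc_list {
   struct reloc *relocs;
   size_t num_relocs;
};

/* Shift every offset by how far the pool center has moved since the
 * last time this list was adjusted.
 */
static void
adjust_relocs(struct reloc_list *list, uint32_t center_bo_offset,
              uint32_t last_center_bo_offset)
{
   assert(last_center_bo_offset <= center_bo_offset);
   uint32_t delta = center_bo_offset - last_center_bo_offset;

   for (size_t i = 0; i < list->num_relocs; i++)
      list->relocs[i].offset += delta;
}

/* Adjust, then remember the center (assumed step), so that submitting
 * the same command buffer again with an unchanged pool is a no-op.
 */
static void
prepare_submit(struct reloc_list *list, uint32_t center_bo_offset,
               uint32_t *last_center)
{
   adjust_relocs(list, center_bo_offset, *last_center);
   *last_center = center_bo_offset;
}
```

Keeping a single last-center value per command buffer means a second
submission applies a delta of zero instead of adding the full center offset
again.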
 

[Mesa-dev] [PATCH 2/3] anv/device: Return the right error for failed maps

2016-11-07 Thread Jason Ekstrand
Signed-off-by: Jason Ekstrand 
Cc: "12.0 13.0" 
---
 src/intel/vulkan/anv_device.c | 9 +++--
 src/intel/vulkan/anv_gem.c| 6 ++
 2 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index 8055893..54efb47 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -24,6 +24,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -1281,8 +1282,12 @@ VkResult anv_MapMemory(
/* Let's map whole pages */
map_size = align_u64(map_size, 4096);
 
-   mem->map = anv_gem_mmap(device, mem->bo.gem_handle,
-   map_offset, map_size, gem_flags);
+   void *map = anv_gem_mmap(device, mem->bo.gem_handle,
+map_offset, map_size, gem_flags);
+   if (map == MAP_FAILED)
+  return vk_error(VK_ERROR_MEMORY_MAP_FAILED);
+
+   mem->map = map;
mem->map_size = map_size;
 
*ppData = mem->map + (offset - map_offset);
diff --git a/src/intel/vulkan/anv_gem.c b/src/intel/vulkan/anv_gem.c
index e654689..0dde6d9 100644
--- a/src/intel/vulkan/anv_gem.c
+++ b/src/intel/vulkan/anv_gem.c
@@ -88,10 +88,8 @@ anv_gem_mmap(struct anv_device *device, uint32_t gem_handle,
};
 
int ret = anv_ioctl(device->fd, DRM_IOCTL_I915_GEM_MMAP, &gem_mmap);
-   if (ret != 0) {
-  /* FIXME: Is NULL the right error return? Cf MAP_INVALID */
-  return NULL;
-   }
+   if (ret != 0)
+  return MAP_FAILED;
 
VG(VALGRIND_MALLOCLIKE_BLOCK(gem_mmap.addr_ptr, gem_mmap.size, 0, 1));
return (void *)(uintptr_t) gem_mmap.addr_ptr;
-- 
2.5.0.400.gff86faf
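
A small standalone sketch of the convention this patch adopts, assuming
plain POSIX mmap() rather than the i915 GEM ioctl. map_fd() and the result
enum are hypothetical names used only for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical result codes standing in for VkResult values. */
enum result { RESULT_OK = 0, RESULT_MAP_FAILED = -1 };

/* Map `size` bytes of `fd` read-only and report failure via the return
 * code instead of a sentinel pointer in *out.
 */
static enum result
map_fd(int fd, size_t size, void **out)
{
   assert(out != NULL);

   void *map = mmap(NULL, size, PROT_READ, MAP_SHARED, fd, 0);
   /* mmap() signals failure with MAP_FAILED, i.e. (void *)-1, not NULL. */
   if (map == MAP_FAILED)
      return RESULT_MAP_FAILED;

   *out = map;
   return RESULT_OK;
}
```

Assigning to the caller's pointer only on success mirrors the patch, which
keeps mem->map untouched when the mapping fails.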



[Mesa-dev] [PATCH 3/3] anv/device: Implicitly unmap memory objects in FreeMemory

2016-11-07 Thread Jason Ekstrand
From the Vulkan spec version 1.0.32 docs for vkFreeMemory:

   "If a memory object is mapped at the time it is freed, it is implicitly
   unmapped."

Signed-off-by: Jason Ekstrand 
Cc: "12.0 13.0" 
---
 src/intel/vulkan/anv_device.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index 54efb47..bc8397e 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -1210,6 +1210,9 @@ VkResult anv_AllocateMemory(
 
mem->type_index = pAllocateInfo->memoryTypeIndex;
 
+   mem->map = NULL;
+   mem->map_size = 0;
+
*pMem = anv_device_memory_to_handle(mem);
 
return VK_SUCCESS;
@@ -1231,6 +1234,9 @@ void anv_FreeMemory(
if (mem == NULL)
   return;
 
+   if (mem->map)
+  anv_UnmapMemory(_device, _mem);
+
if (mem->bo.map)
   anv_gem_munmap(mem->bo.map, mem->bo.size);
 
@@ -1305,6 +1311,9 @@ void anv_UnmapMemory(
   return;
 
anv_gem_munmap(mem->map, mem->map_size);
+
+   mem->map = NULL;
+   mem->map_size = 0;
 }
 
 static void
-- 
2.5.0.400.gff86faf
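
A sketch of the lifetime rule quoted from the spec, assuming an anonymous
POSIX mapping in place of a GEM mapping. struct mem_object and the mem_*()
helpers are hypothetical and only mirror the shape of the change:

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS on strict C99 toolchains */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>

/* Hypothetical, simplified memory object. */
struct mem_object {
   void *map;
   size_t map_size;
};

static struct mem_object *
mem_alloc(void)
{
   struct mem_object *mem = malloc(sizeof(*mem));
   if (mem == NULL)
      return NULL;

   /* Start out unmapped so mem_free() can test mem->map reliably. */
   mem->map = NULL;
   mem->map_size = 0;
   return mem;
}

static int
mem_map(struct mem_object *mem, size_t size)
{
   void *map = mmap(NULL, size, PROT_READ | PROT_WRITE,
                    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
   if (map == MAP_FAILED)
      return -1;

   mem->map = map;
   mem->map_size = size;
   return 0;
}

static void
mem_unmap(struct mem_object *mem)
{
   if (mem->map == NULL)
      return;

   munmap(mem->map, mem->map_size);
   /* Reset so a later unmap or free does not touch a stale pointer. */
   mem->map = NULL;
   mem->map_size = 0;
}

static void
mem_free(struct mem_object *mem)
{
   if (mem == NULL)
      return;

   /* Implicitly unmap an object that is still mapped when freed. */
   if (mem->map)
      mem_unmap(mem);

   free(mem);
}
```

All three pieces are needed together: initializing at allocation, clearing
at unmap, and checking at free.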



[Mesa-dev] [PATCH 1/3] anv/device: Don't even try to map memory with a size of 0

2016-11-07 Thread Jason Ekstrand
Signed-off-by: Jason Ekstrand 
Cc: "12.0 13.0" 
---
 src/intel/vulkan/anv_device.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index 5393144..8055893 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -1258,6 +1258,11 @@ VkResult anv_MapMemory(
if (size == VK_WHOLE_SIZE)
   size = mem->bo.size - offset;
 
+   if (size == 0) {
+  *ppData = NULL;
+  return VK_SUCCESS;
+   }
+
/* FIXME: Is this supposed to be thread safe? Since vkUnmapMemory() only
 * takes a VkDeviceMemory pointer, it seems like only one map of the memory
 * at a time is valid. We could just mmap up front and return an offset
-- 
2.5.0.400.gff86faf
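
A sketch of the early-out, assuming a hypothetical WHOLE_SIZE constant in
place of VK_WHOLE_SIZE and a helper that only decides whether a real mapping
is needed (no actual mmap happens here):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WHOLE_SIZE UINT64_MAX /* hypothetical stand-in for VK_WHOLE_SIZE */

/* Resolve the requested size against the size of the buffer object. */
static uint64_t
resolve_map_size(uint64_t bo_size, uint64_t offset, uint64_t size)
{
   assert(offset <= bo_size);
   if (size == WHOLE_SIZE)
      size = bo_size - offset;
   return size;
}

/* Returns 1 if a real mapping would be needed, 0 if the request is empty.
 * For an empty request the caller gets a NULL pointer and success.
 */
static int
needs_mapping(uint64_t bo_size, uint64_t offset, uint64_t size, void **data)
{
   if (resolve_map_size(bo_size, offset, size) == 0) {
      *data = NULL;
      return 0;
   }
   return 1;
}
```

A whole-size request at an offset equal to the buffer size is one way to end
up with a size of 0 after the VK_WHOLE_SIZE resolution.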



[Mesa-dev] [Bug 98002] Mud rendering bug in Portal 2

2016-11-07 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=98002

Michel Dänzer  changed:

   What|Removed |Added

 Resolution|--- |NOTOURBUG
 Status|NEW |RESOLVED

--- Comment #14 from Michel Dänzer  ---
Looks like a game bug.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


Re: [Mesa-dev] [RFC 03/12] egl: add EGL_ANDROID_native_fence_sync

2016-11-07 Thread Rob Clark
On Mon, Nov 7, 2016 at 6:29 PM, Rafael Antognolli
 wrote:
> On Mon, Oct 31, 2016 at 08:58:26AM -0700, Rafael Antognolli wrote:
>> On Sat, Oct 29, 2016 at 01:15:44PM -0400, Rob Clark wrote:
>> > On Fri, Oct 28, 2016 at 7:44 PM, Rafael Antognolli
>> >  wrote:
>
> ...
>
>> > Hey, thanks for this.  I don't suppose you have a branch somewhere w/
>> > the piglit tests?
>>
>> Ouch, I mentioned it on another email but should have mentioned it here
>> too. It's here:
>>
>> https://github.com/rantogno/piglit/tree/fences
>>
>> > I've rebased and pulled in Chad's squash patches (and also a squash
>> > patch based on the issues you pointed out), but not yet the i965
>> > patches:
>> >
>> > https://github.com/freedreno/mesa/commits/wip-fence
>>
>> Awesome, I will check that one.
>
> Just an update: I did test that branch, and there was just one change
> needed for the piglit tests to work:
>
> https://github.com/rantogno/mesa/commit/c637f1ce404acaccaa920d37c52724c9d8093597

oh, good catch.. I'll squash that in and push an updated branch soon

> You can also check my last version of these tests (also submitted to the
> piglit list) here:
>
> https://github.com/rantogno/piglit/tree/review/fences-v02
>
> The only test that I don't know how to do yet is to make sure that Mesa
> and the kernel are respecting an eglSyncWait for a native sync fence.
> eglClientWaitSyncKHR is already covered.

yeah, I can't think of a particularly easy way to test that..  but I
think the API level tests have already caught quite a few issues..

> Also I did test your series with kmscube and some other stuff too, and
> so far it's all behaving really well. I'm looking forward to see your
> patches get merged.

I guess we should pull together a unified branch.. since we have this
working for intel + virgl + freedreno.  AFAIU the current status is
intel and freedreno kernel bits are upstream.  The libdrm bits for
freedreno are upstream, not sure about intel (and virgl doesn't have
any libdrm component).  Not sure about the kernel bit for virgl, but I
assume that will be 4.10?

I have one small update for the gallium patch, to add the pipe-cap to
all the other drivers.  I usually try to wait until the patch is ready
to push since otherwise it ends up being a huge rebase headache.

I would defn like to get this merged, esp. since I'm starting to get
busy on the next thing ;-)

BR,
-R

> Thanks,
> Rafael


[Mesa-dev] [PATCH 1/3] swr: fix AND_INVERTED logic op conversion

2016-11-07 Thread Ilia Mirkin
Signed-off-by: Ilia Mirkin 
---
 src/gallium/drivers/swr/swr_state.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/swr/swr_state.h 
b/src/gallium/drivers/swr/swr_state.h
index 0e3b49d..8409114 100644
--- a/src/gallium/drivers/swr/swr_state.h
+++ b/src/gallium/drivers/swr/swr_state.h
@@ -106,7 +106,7 @@ swr_convert_logic_op(const UINT op)
case PIPE_LOGICOP_NOR:
   return LOGICOP_NOR;
case PIPE_LOGICOP_AND_INVERTED:
-  return LOGICOP_CLEAR;
+  return LOGICOP_AND_INVERTED;
case PIPE_LOGICOP_COPY_INVERTED:
   return LOGICOP_COPY_INVERTED;
case PIPE_LOGICOP_AND_REVERSE:
-- 
2.7.3
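
To make the effect of the wrong mapping concrete, here is a sketch of the
two logic ops involved, using the usual OpenGL definitions (CLEAR yields 0,
AND_INVERTED yields ~src & dst). The enum and apply_logic_op() are
hypothetical and not swr code:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical enum; only the two ops relevant to the fix. */
enum logic_op { OP_CLEAR, OP_AND_INVERTED };

/* Apply a logic op to one 32-bit source/destination pair. */
static uint32_t
apply_logic_op(enum logic_op op, uint32_t src, uint32_t dst)
{
   switch (op) {
   case OP_CLEAR:
      return 0;
   case OP_AND_INVERTED:
      return ~src & dst;
   }

   assert(!"unknown logic op");
   return 0;
}
```

With src = 0x0F and dst = 0xFF, AND_INVERTED keeps the destination bits that
are clear in the source, while the old CLEAR mapping would have zeroed the
destination.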



[Mesa-dev] [PATCH 2/3] swr: disable logic op when the rt format is float

2016-11-07 Thread Ilia Mirkin
Signed-off-by: Ilia Mirkin 
---
 src/gallium/drivers/swr/swr_state.cpp | 5 +
 1 file changed, 5 insertions(+)

diff --git a/src/gallium/drivers/swr/swr_state.cpp 
b/src/gallium/drivers/swr/swr_state.cpp
index d8a8ee1..acb0452 100644
--- a/src/gallium/drivers/swr/swr_state.cpp
+++ b/src/gallium/drivers/swr/swr_state.cpp
@@ -1305,6 +1305,11 @@ swr_update_derived(struct pipe_context *pipe,
&ctx->blend->compileState[target],
sizeof(compileState.blendState));
 
+if (compileState.blendState.logicOpEnable &&
+GetFormatInfo(compileState.format).type[0] == SWR_TYPE_FLOAT) {
+   compileState.blendState.logicOpEnable = false;
+}
+
 if (compileState.blendState.blendEnable == false &&
 compileState.blendState.logicOpEnable == false) {
SwrSetBlendFunc(ctx->swrContext, target, NULL);
-- 
2.7.3



[Mesa-dev] [PATCH 3/3] swr: [rasterizer jitter] fix logic op to work with unorm/snorm

2016-11-07 Thread Ilia Mirkin
Most logic op usage is probably going to end up with normalized
textures. Scale the floating point values and convert to integer before
performing the logic operations.

Signed-off-by: Ilia Mirkin 
---

The gl-1.1-xor-copypixels test still fails. The image stays the same. I
suspect it's for reasons outside of this patch.

I'm not too familiar with the whole swr infrastructure, perhaps there was
an easier way to do all this. I looked for conversion helper functions but
couldn't find anything that would fit nicely here. Feel free to point me
in the right direction.
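
For reference, a scalar sketch of the scale-then-operate idea from the
commit message, covering only the unorm case. The helper names are
hypothetical, and the clamping and round-to-nearest used here are
assumptions of the sketch; the JIT code in the diff below is vectorized and
does its own clamping and conversion:

```c
#include <assert.h>
#include <stdint.h>

/* Convert a [0, 1] float to a `bits`-wide unsigned integer. */
static uint32_t
unorm_to_uint(float f, unsigned bits)
{
   assert(bits > 0 && bits < 32);
   float scale = (float)((1u << bits) - 1);

   if (f < 0.0f)
      f = 0.0f;
   if (f > 1.0f)
      f = 1.0f;

   return (uint32_t)(f * scale + 0.5f);
}

/* Convert a `bits`-wide unsigned integer back to a [0, 1] float. */
static float
uint_to_unorm(uint32_t v, unsigned bits)
{
   assert(bits > 0 && bits < 32);
   return (float)v / (float)((1u << bits) - 1);
}

/* XOR two normalized values on their integer representation. */
static float
xor_unorm(float src, float dst, unsigned bits)
{
   uint32_t mask = (1u << bits) - 1;
   uint32_t result =
      (unorm_to_uint(src, bits) ^ unorm_to_uint(dst, bits)) & mask;

   return uint_to_unorm(result, bits);
}
```

Operating directly on the float bit patterns would XOR exponent and mantissa
bits, which is why the values are scaled to integers first.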

 .../drivers/swr/rasterizer/jitter/blend_jit.cpp| 81 +-
 1 file changed, 64 insertions(+), 17 deletions(-)

diff --git a/src/gallium/drivers/swr/rasterizer/jitter/blend_jit.cpp 
b/src/gallium/drivers/swr/rasterizer/jitter/blend_jit.cpp
index 1452d27..d69d503 100644
--- a/src/gallium/drivers/swr/rasterizer/jitter/blend_jit.cpp
+++ b/src/gallium/drivers/swr/rasterizer/jitter/blend_jit.cpp
@@ -649,29 +649,54 @@ struct BlendJit : public Builder
 if(state.blendState.logicOpEnable)
 {
 const SWR_FORMAT_INFO& info = GetFormatInfo(state.format);
-SWR_ASSERT(info.type[0] == SWR_TYPE_UINT);
 Value* vMask[4];
+float scale[4];
+
+if (!state.blendState.blendEnable) {
+Clamp(state.format, src);
+Clamp(state.format, dst);
+}
+
 for(uint32_t i = 0; i < 4; i++)
 {
-switch(info.bpc[i])
+if (info.type[i] == SWR_TYPE_UNUSED)
 {
-case 0: vMask[i] = VIMMED1(0x00000000); break;
-case 2: vMask[i] = VIMMED1(0x00000003); break;
-case 5: vMask[i] = VIMMED1(0x0000001F); break;
-case 6: vMask[i] = VIMMED1(0x0000003F); break;
-case 8: vMask[i] = VIMMED1(0x000000FF); break;
-case 10: vMask[i] = VIMMED1(0x000003FF); break;
-case 11: vMask[i] = VIMMED1(0x000007FF); break;
-case 16: vMask[i] = VIMMED1(0x0000FFFF); break;
-case 24: vMask[i] = VIMMED1(0x00FFFFFF); break;
-case 32: vMask[i] = VIMMED1(0xFFFFFFFF); break;
+continue;
+}
+
+if (info.bpc[i] >= 32) {
+vMask[i] = VIMMED1(0xFFFFFFFF);
+scale[i] = 0xFFFFFFFF;
+} else {
+vMask[i] = VIMMED1((1 << info.bpc[i]) - 1);
+if (info.type[i] == SWR_TYPE_SNORM)
+scale[i] = (1 << (info.bpc[i] - 1)) - 1;
+else
+scale[i] = (1 << info.bpc[i]) - 1;
+}
+
+switch (info.type[i]) {
 default:
-vMask[i] = VIMMED1(0x0);
-SWR_ASSERT(0, "Unsupported bpc for logic op\n");
+SWR_ASSERT(0, "Unsupported type for logic op\n");
+/* fallthrough */
+case SWR_TYPE_UINT:
+case SWR_TYPE_SINT:
+src[i] = BITCAST(src[i], mSimdInt32Ty);
+dst[i] = BITCAST(dst[i], mSimdInt32Ty);
+break;
+case SWR_TYPE_SNORM:
+src[i] = FADD(src[i], VIMMED1(0.5f));
+dst[i] = FADD(dst[i], VIMMED1(0.5f));
+/* fallthrough */
+case SWR_TYPE_UNORM:
+src[i] = FP_TO_UI(
+FMUL(src[i], VIMMED1(scale[i])),
+mSimdInt32Ty);
+dst[i] = FP_TO_UI(
+FMUL(dst[i], VIMMED1(scale[i])),
+mSimdInt32Ty);
 break;
 }
-src[i] = BITCAST(src[i], mSimdInt32Ty);//, vMask[i]);
-dst[i] = BITCAST(dst[i], mSimdInt32Ty);
 }
 
 LogicOpFunc(state.blendState.logicOpFunc, src, dst, result);
@@ -679,10 +704,32 @@ struct BlendJit : public Builder
 // store results out
 for(uint32_t i = 0; i < 4; ++i)
 {
+if (info.type[i] == SWR_TYPE_UNUSED)
+{
+continue;
+}
+
 // clear upper bits from PS output not in RT format after 
doing logic op
 result[i] = AND(result[i], vMask[i]);
 
-STORE(BITCAST(result[i], mSimdFP32Ty), pResult, {i});
+switch (info.type[i]) {
+default:
+SWR_ASSERT(0, "Unsupported type for logic op\n");
+/* fallthrough */
+case SWR_TYPE_UINT:
+case SWR_TYPE_SINT:
+result[i] = BITCAST(result[i], mSimdFP32Ty);
+break;
+case SWR_TYPE_SNORM:
+case SWR_TYPE_UNORM:
+ 

Re: [Mesa-dev] [RFC 03/12] egl: add EGL_ANDROID_native_fence_sync

2016-11-07 Thread Rafael Antognolli
On Mon, Oct 31, 2016 at 08:58:26AM -0700, Rafael Antognolli wrote:
> On Sat, Oct 29, 2016 at 01:15:44PM -0400, Rob Clark wrote:
> > On Fri, Oct 28, 2016 at 7:44 PM, Rafael Antognolli
> >  wrote:

...

> > Hey, thanks for this.  I don't suppose you have a branch somewhere w/
> > the piglit tests?
> 
> Ouch, I mentioned it on another email but should have mentioned it here
> too. It's here:
> 
> https://github.com/rantogno/piglit/tree/fences
> 
> > I've rebased and pulled in Chad's squash patches (and also a squash
> > patch based on the issues you pointed out), but not yet the i965
> > patches:
> > 
> > https://github.com/freedreno/mesa/commits/wip-fence
> 
> Awesome, I will check that one.

Just an update: I did test that branch, and there was just one change
needed for the piglit tests to work:

https://github.com/rantogno/mesa/commit/c637f1ce404acaccaa920d37c52724c9d8093597

You can also check my last version of these tests (also submitted to the
piglit list) here:

https://github.com/rantogno/piglit/tree/review/fences-v02

The only test that I don't know how to do yet is to make sure that Mesa
and the kernel are respecting an eglSyncWait for a native sync fence.
eglClientWaitSyncKHR is already covered.

Also I did test your series with kmscube and some other stuff too, and
so far it's all behaving really well. I'm looking forward to see your
patches get merged.

Thanks,
Rafael


[Mesa-dev] [PATCH v4 02/10] anv: Don't presume to know what address is in a surface relocation

2016-11-07 Thread Jason Ekstrand
Because our relocation processing happens at EndCommandBuffer time and
because RENDER_SURFACE_STATE objects may be shared by batches, we really
have no clue whatsoever what address is actually written to the relocation
offset in the BO.  We need to stop making such claims to the kernel and
just let it relocate for us.

Signed-off-by: Jason Ekstrand 
---
 src/intel/vulkan/anv_batch_chain.c | 66 +-
 src/intel/vulkan/anv_private.h |  2 --
 2 files changed, 15 insertions(+), 53 deletions(-)

diff --git a/src/intel/vulkan/anv_batch_chain.c 
b/src/intel/vulkan/anv_batch_chain.c
index 529fe7e..2c0f803 100644
--- a/src/intel/vulkan/anv_batch_chain.c
+++ b/src/intel/vulkan/anv_batch_chain.c
@@ -940,32 +940,8 @@ static void
 anv_cmd_buffer_process_relocs(struct anv_cmd_buffer *cmd_buffer,
   struct anv_reloc_list *list)
 {
-   struct anv_bo *bo;
-
-   /* If the kernel supports I915_EXEC_NO_RELOC, it will compare offset in
-* struct drm_i915_gem_exec_object2 against the bos current offset and if
-* all bos haven't moved it will skip relocation processing alltogether.
-* If I915_EXEC_NO_RELOC is not supported, the kernel ignores the incoming
-* value of offset so we can set it either way.  For that to work we need
-* to make sure all relocs use the same presumed offset.
-*/
-
-   for (size_t i = 0; i < list->num_relocs; i++) {
-  bo = list->reloc_bos[i];
-  if (bo->offset != list->relocs[i].presumed_offset)
- cmd_buffer->execbuf2.need_reloc = true;
-
-  list->relocs[i].target_handle = bo->index;
-   }
-}
-
-static uint64_t
-read_reloc(const struct anv_device *device, const void *p)
-{
-   if (device->info.gen >= 8)
-  return *(uint64_t *)p;
-   else
-  return *(uint32_t *)p;
+   for (size_t i = 0; i < list->num_relocs; i++)
+  list->relocs[i].target_handle = list->reloc_bos[i]->index;
 }
 
 static void
@@ -978,27 +954,10 @@ write_reloc(const struct anv_device *device, void *p, 
uint64_t v)
 }
 
 static void
-adjust_relocations_from_block_pool(struct anv_block_pool *pool,
+adjust_relocations_from_state_pool(struct anv_block_pool *pool,
struct anv_reloc_list *relocs)
 {
for (size_t i = 0; i < relocs->num_relocs; i++) {
-  /* In general, we don't know how stale the relocated value is.  It
-   * may have been used last time or it may not.  Since we don't want
-   * to stomp it while the GPU may be accessing it, we haven't updated
-   * it anywhere else in the code.  Instead, we just set the presumed
-   * offset to what it is now based on the delta and the data in the
-   * block pool.  Then the kernel will update it for us if needed.
-   */
-  assert(relocs->relocs[i].offset < pool->state.end);
-  const void *p = pool->map + relocs->relocs[i].offset;
-
-  /* We're reading back the relocated value from potentially incoherent
-   * memory here. However, any change to the value will be from the kernel
-   * writing out relocations, which will keep the CPU cache up to date.
-   */
-  relocs->relocs[i].presumed_offset =
- read_reloc(pool->device, p) - relocs->relocs[i].delta;
-
   /* All of the relocations from this block pool to other BO's should
* have been emitted relative to the surface block pool center.  We
* need to add the center offset to make them relative to the
@@ -1009,7 +968,7 @@ adjust_relocations_from_block_pool(struct anv_block_pool 
*pool,
 }
 
 static void
-adjust_relocations_to_block_pool(struct anv_block_pool *pool,
+adjust_relocations_to_state_pool(struct anv_block_pool *pool,
  struct anv_bo *from_bo,
  struct anv_reloc_list *relocs,
  uint32_t *last_pool_center_bo_offset)
@@ -1055,9 +1014,8 @@ anv_cmd_buffer_prepare_execbuf(struct anv_cmd_buffer 
*cmd_buffer)
   &cmd_buffer->device->surface_state_block_pool;
 
cmd_buffer->execbuf2.bo_count = 0;
-   cmd_buffer->execbuf2.need_reloc = false;
 
-   adjust_relocations_from_block_pool(ss_pool, &cmd_buffer->surface_relocs);
+   adjust_relocations_from_state_pool(ss_pool, &cmd_buffer->surface_relocs);
anv_cmd_buffer_add_bo(cmd_buffer, &ss_pool->bo, 
&cmd_buffer->surface_relocs);
 
/* First, we walk over all of the bos we've seen and add them and their
@@ -1065,7 +1023,7 @@ anv_cmd_buffer_prepare_execbuf(struct anv_cmd_buffer 
*cmd_buffer)
 */
struct anv_batch_bo **bbo;
u_vector_foreach(bbo, &cmd_buffer->seen_bbos) {
-  adjust_relocations_to_block_pool(ss_pool, &(*bbo)->bo, &(*bbo)->relocs,
+  adjust_relocations_to_state_pool(ss_pool, &(*bbo)->bo, &(*bbo)->relocs,
&(*bbo)->last_ss_pool_bo_offset);
 
   anv_cmd_buffer_add_bo(cmd_buffer, &(*bbo)->bo, &(*bbo)->relocs);
@@ -1127,15 +1085,21 @@ anv_cmd_buffer_prepare_execbuf(struct anv_cmd_buffer 

[Mesa-dev] [PATCH 1/3] gallium/scons: OpenSWR Windows support

2016-11-07 Thread George Kyriazis
- Added code to create screen and handle swaps in libgl_gdi.c
- Added call to swr SConscript
- Included llvm 3.9 support for scons (SWR on Windows supports only 3.9
  and later)
- Added -DHAVE_SWR to subdirs that need it

To build SWR on Windows, use "scons swr libgl-gdi"
---
 scons/llvm.py | 21 +++--
 src/gallium/SConscript|  1 +
 src/gallium/targets/libgl-gdi/SConscript  |  4 
 src/gallium/targets/libgl-gdi/libgl_gdi.c | 28 +++-
 src/gallium/targets/libgl-xlib/SConscript |  4 
 src/gallium/targets/osmesa/SConscript |  4 
 6 files changed, 55 insertions(+), 7 deletions(-)

diff --git a/scons/llvm.py b/scons/llvm.py
index 1fc8a3f..977e47a 100644
--- a/scons/llvm.py
+++ b/scons/llvm.py
@@ -106,7 +106,24 @@ def generate(env):
 ])
 env.Prepend(LIBPATH = [os.path.join(llvm_dir, 'lib')])
 # LIBS should match the output of `llvm-config --libs engine mcjit 
bitwriter x86asmprinter`
-if llvm_version >= distutils.version.LooseVersion('3.7'):
+if llvm_version >= distutils.version.LooseVersion('3.9'):
+env.Prepend(LIBS = [
+'LLVMX86Disassembler', 'LLVMX86AsmParser',
+'LLVMX86CodeGen', 'LLVMSelectionDAG', 'LLVMAsmPrinter',
+'LLVMDebugInfoCodeView', 'LLVMCodeGen',
+'LLVMScalarOpts', 'LLVMInstCombine',
+'LLVMInstrumentation', 'LLVMTransformUtils',
+'LLVMBitWriter', 'LLVMX86Desc',
+'LLVMMCDisassembler', 'LLVMX86Info',
+'LLVMX86AsmPrinter', 'LLVMX86Utils',
+'LLVMMCJIT', 'LLVMExecutionEngine', 'LLVMTarget',
+'LLVMAnalysis', 'LLVMProfileData',
+'LLVMRuntimeDyld', 'LLVMObject', 'LLVMMCParser',
+'LLVMBitReader', 'LLVMMC', 'LLVMCore',
+'LLVMSupport',
+'LLVMIRReader', 'LLVMASMParser'
+])
+elif llvm_version >= distutils.version.LooseVersion('3.7'):
 env.Prepend(LIBS = [
 'LLVMBitWriter', 'LLVMX86Disassembler', 'LLVMX86AsmParser',
 'LLVMX86CodeGen', 'LLVMSelectionDAG', 'LLVMAsmPrinter',
@@ -203,7 +220,7 @@ def generate(env):
 if '-fno-rtti' in cxxflags:
 env.Append(CXXFLAGS = ['-fno-rtti'])
 
-components = ['engine', 'mcjit', 'bitwriter', 'x86asmprinter', 
'mcdisassembler']
+components = ['engine', 'mcjit', 'bitwriter', 'x86asmprinter', 
'mcdisassembler', 'irreader']
 
 env.ParseConfig('llvm-config --libs ' + ' '.join(components))
 env.ParseConfig('llvm-config --ldflags')
diff --git a/src/gallium/SConscript b/src/gallium/SConscript
index f98268f..9273db7 100644
--- a/src/gallium/SConscript
+++ b/src/gallium/SConscript
@@ -18,6 +18,7 @@ SConscript([
 'drivers/softpipe/SConscript',
 'drivers/svga/SConscript',
 'drivers/trace/SConscript',
+'drivers/swr/SConscript',
 ])
 
 #
diff --git a/src/gallium/targets/libgl-gdi/SConscript 
b/src/gallium/targets/libgl-gdi/SConscript
index 2a52363..ef8050b 100644
--- a/src/gallium/targets/libgl-gdi/SConscript
+++ b/src/gallium/targets/libgl-gdi/SConscript
@@ -30,6 +30,10 @@ if env['llvm']:
 env.Append(CPPDEFINES = 'HAVE_LLVMPIPE')
 drivers += [llvmpipe]
 
+if 'swr' in COMMAND_LINE_TARGETS :
+env.Append(CPPDEFINES = 'HAVE_SWR')
+drivers += [swr]
+
 if env['gcc'] and env['machine'] != 'x86_64':
 # DEF parser in certain versions of MinGW is busted, as does not behave as
 # MSVC.  mingw-w64 works fine.
diff --git a/src/gallium/targets/libgl-gdi/libgl_gdi.c 
b/src/gallium/targets/libgl-gdi/libgl_gdi.c
index 922c186..12576db 100644
--- a/src/gallium/targets/libgl-gdi/libgl_gdi.c
+++ b/src/gallium/targets/libgl-gdi/libgl_gdi.c
@@ -51,9 +51,12 @@
 #include "llvmpipe/lp_public.h"
 #endif
 
+#ifdef HAVE_SWR
+#include "swr/swr_public.h"
+#endif
 
 static boolean use_llvmpipe = FALSE;
-
+static boolean use_swr = FALSE;
 
 static struct pipe_screen *
 gdi_screen_create(void)
@@ -69,6 +72,8 @@ gdi_screen_create(void)
 
 #ifdef HAVE_LLVMPIPE
default_driver = "llvmpipe";
+#elif HAVE_SWR
+   default_driver = "swr";
 #else
default_driver = "softpipe";
 #endif
@@ -78,15 +83,21 @@ gdi_screen_create(void)
 #ifdef HAVE_LLVMPIPE
if (strcmp(driver, "llvmpipe") == 0) {
   screen = llvmpipe_create_screen( winsys );
+  if (screen)
+ use_llvmpipe = TRUE;
+   }
+#endif
+#ifdef HAVE_SWR
+   if (strcmp(driver, "swr") == 0) {
+  screen = swr_create_screen( winsys );
+  if (screen)
+ use_swr = TRUE;
}
-#else
-   (void) driver;
 #endif
+   (void) driver;
 
if (screen == NULL) {
   screen = softpipe_create_screen( winsys );
-   } else {
-  use_llvmpipe = TRUE;
}
 
if(!screen)
@@ -128,6 +139,13 @@ gdi_present(struct pipe_screen *screen,
}
 #endif
 
+#ifdef HAVE_SWR
+   if (use_swr) {
+  

[Mesa-dev] [Bug 98599] xterm menus corrupt since tgsi/scan: handle indirect image indexing correctly

2016-11-07 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=98599

--- Comment #6 from Andy Furniss  ---
The patches on mesa-dev fix for me, thanks.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


[Mesa-dev] [PATCH 0/3] swr: Support Windows builds

2016-11-07 Thread George Kyriazis
Changes to support Windows builds for the OpenSWR driver.

These are divided into 3 patches:
- scons and core mesa-related changes
- a fix in macros.h to implement HAS_TRIVIAL_DESTRUCTOR
- swr-specific changes

SWR on Windows is built with scons, using the following command line:
"scons swr libgl-gdi".  This produces 3 DLLs: the main opengl32.dll and
2 swr-specific DLLs (swrAVX.dll and swrAVX2.dll), which are loaded
dynamically at runtime depending on the underlying architecture.

The default software renderer is llvmpipe and, as on Linux, you enable
SWR by setting the GALLIUM_DRIVER variable to "swr".
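
As a rough illustration of the selection logic, here is a sketch that maps a
requested driver name to the one actually used. select_driver() and its
have_* flags are hypothetical; the real code in libgl_gdi.c works on
compile-time HAVE_LLVMPIPE/HAVE_SWR defines and also falls back to softpipe
when screen creation fails:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Pick the driver name: the override if given, else the build default.
 * have_llvmpipe/have_swr model which drivers were compiled in.
 */
static const char *
select_driver(const char *env_value, int have_llvmpipe, int have_swr)
{
   const char *default_driver =
      have_llvmpipe ? "llvmpipe" : have_swr ? "swr" : "softpipe";
   const char *driver = env_value ? env_value : default_driver;

   if (have_llvmpipe && strcmp(driver, "llvmpipe") == 0)
      return "llvmpipe";
   if (have_swr && strcmp(driver, "swr") == 0)
      return "swr";

   /* Anything unknown or not compiled in falls back to softpipe. */
   return "softpipe";
}
```

A caller would pass the environment value, for example
select_driver(getenv("GALLIUM_DRIVER"), 1, 1).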


George Kyriazis (3):
  gallium/scons: OpenSWR Windows support
  mesa: added msvc HAS_TRIVIAL_DESTRUCTOR implementation
  swr: Support windows builds

 scons/llvm.py  |  21 ++-
 src/gallium/SConscript |   1 +
 src/gallium/drivers/swr/Makefile.am|   8 ++
 src/gallium/drivers/swr/SConscript |  46 +++
 src/gallium/drivers/swr/SConscript-arch| 175 +
 src/gallium/drivers/swr/rasterizer/common/os.h |   5 +-
 src/gallium/drivers/swr/swr_context.cpp|  16 +--
 src/gallium/drivers/swr/swr_context.h  |   2 +
 src/gallium/drivers/swr/swr_loader.cpp |  37 +-
 src/gallium/drivers/swr/swr_public.h   |  11 +-
 src/gallium/drivers/swr/swr_screen.cpp |  25 +---
 src/gallium/targets/libgl-gdi/SConscript   |   4 +
 src/gallium/targets/libgl-gdi/libgl_gdi.c  |  28 +++-
 src/gallium/targets/libgl-xlib/SConscript  |   4 +
 src/gallium/targets/osmesa/SConscript  |   4 +
 src/util/macros.h  |   5 +
 16 files changed, 351 insertions(+), 41 deletions(-)
 create mode 100644 src/gallium/drivers/swr/SConscript
 create mode 100644 src/gallium/drivers/swr/SConscript-arch

-- 
2.10.0.windows.1



[Mesa-dev] [PATCH 2/3] mesa: added msvc HAS_TRIVIAL_DESTRUCTOR implementation

2016-11-07 Thread George Kyriazis
Not having it on Windows causes a CANARY assertion in
src/util/ralloc.c:get_header()

Tested only on MSVC 19.00 (DevStudio 14.0), so the #ifdef guards reflect that.
---
 src/util/macros.h | 5 +
 1 file changed, 5 insertions(+)

diff --git a/src/util/macros.h b/src/util/macros.h
index 27d1b62..12b26d3 100644
--- a/src/util/macros.h
+++ b/src/util/macros.h
@@ -175,6 +175,11 @@ do {   \
 #  if __has_feature(has_trivial_destructor)
 # define HAS_TRIVIAL_DESTRUCTOR(T) __has_trivial_destructor(T)
 #  endif
+#   elif defined(_MSC_VER) && !defined(__INTEL_COMPILER)
+#  if _MSC_VER >= 1900
+# define HAS_TRIVIAL_DESTRUCTOR(T) __has_trivial_destructor(T)
+#  else
+#  endif
 #   endif
 #   ifndef HAS_TRIVIAL_DESTRUCTOR
/* It's always safe (if inefficient) to assume that a
-- 
2.10.0.windows.1



[Mesa-dev] [PATCH 3/3] swr: Support windows builds

2016-11-07 Thread George Kyriazis
- Added SConscript files
- Better handling of NOMINMAX for <windows.h> inclusion
- Reordered header files in swr_context.cpp to handle NOMINMAX better, since
  mesa header files include windows.h before we get a chance to #define
  NOMINMAX
- Cleaner support for .dll and .so prefix/suffix across OSes
- Added PUBLIC for some protos
- Added swr_gdi_swap(), which is called from libgl_gdi.c
---
 src/gallium/drivers/swr/Makefile.am|   8 ++
 src/gallium/drivers/swr/SConscript |  46 +++
 src/gallium/drivers/swr/SConscript-arch| 175 +
 src/gallium/drivers/swr/rasterizer/common/os.h |   5 +-
 src/gallium/drivers/swr/swr_context.cpp|  16 +--
 src/gallium/drivers/swr/swr_context.h  |   2 +
 src/gallium/drivers/swr/swr_loader.cpp |  37 +-
 src/gallium/drivers/swr/swr_public.h   |  11 +-
 src/gallium/drivers/swr/swr_screen.cpp |  25 +---
 9 files changed, 291 insertions(+), 34 deletions(-)
 create mode 100644 src/gallium/drivers/swr/SConscript
 create mode 100644 src/gallium/drivers/swr/SConscript-arch

diff --git a/src/gallium/drivers/swr/Makefile.am 
b/src/gallium/drivers/swr/Makefile.am
index dd1c2e6..0ec4af2
--- a/src/gallium/drivers/swr/Makefile.am
+++ b/src/gallium/drivers/swr/Makefile.am
@@ -217,6 +217,12 @@ libswrAVX2_la_CXXFLAGS = \
 libswrAVX2_la_SOURCES = \
$(COMMON_SOURCES)
 
+# XXX: $(SWR_AVX_CXXFLAGS) should not be included, but we end up including
+# simdintrin.h, which throws a warning if AVX is not enabled
+libmesaswr_la_CXXFLAGS = \
+   $(COMMON_CXXFLAGS) \
+   $(SWR_AVX_CXXFLAGS)
+
 # XXX: Don't ship these generated sources for now, since they are specific
 # to the LLVM version they are generated from. Thus a release tarball
 # containing the said files, generated against eg. LLVM 3.8 will fail to build
@@ -235,6 +241,8 @@ libswrAVX2_la_LDFLAGS = \
 include $(top_srcdir)/install-gallium-links.mk
 
 EXTRA_DIST = \
+   SConscript \
+   SConscript-arch \
rasterizer/archrast/events.proto \
rasterizer/jitter/scripts/gen_llvm_ir_macros.py \
rasterizer/jitter/scripts/gen_llvm_types.py \
diff --git a/src/gallium/drivers/swr/SConscript 
b/src/gallium/drivers/swr/SConscript
new file mode 100644
index 000..c470bbd
--- /dev/null
+++ b/src/gallium/drivers/swr/SConscript
@@ -0,0 +1,46 @@
+Import('*')
+
+from sys import executable as python_cmd
+import distutils.version
+import os.path
+
+if not 'swr' in COMMAND_LINE_TARGETS:
+    Return()
+
+if not env['llvm']:
+    print 'warning: LLVM disabled: not building swr'
+    Return()
+
+env.MSVC2013Compat()
+
+swr_arch = 'avx'
+VariantDir('avx', '.', duplicate=0)
+SConscript('avx/SConscript-arch', exports='swr_arch')
+
+swr_arch = 'avx2'
+VariantDir('avx2', '.', duplicate=0)
+SConscript('avx2/SConscript-arch', exports='swr_arch')
+
+env = env.Clone()
+
+source = env.ParseSourceList('Makefile.sources', [
+'LOADER_SOURCES'
+])
+
+env.Prepend(CPPPATH = [
+'rasterizer/scripts'
+])
+
+swr = env.ConvenienceLibrary(
+   target = 'swr',
+   source = source,
+   )
+# treat arch libs as dependencies, even though they are not linked
+# into swr, so we don't have to build them separately
+Depends(swr, ['swrAVX', 'swrAVX2'])
+
+env.Alias('swr', swr)
+
+env.Prepend(LIBS = [swr])
+
+Export('swr')
diff --git a/src/gallium/drivers/swr/SConscript-arch 
b/src/gallium/drivers/swr/SConscript-arch
new file mode 100644
index 000..f7d5b5a
--- /dev/null
+++ b/src/gallium/drivers/swr/SConscript-arch
@@ -0,0 +1,175 @@
+Import('*')
+
+from sys import executable as python_cmd
+import distutils.version
+import os.path
+
+if not env['llvm']:
+    print 'warning: LLVM disabled: not building swr'
+    Return()
+
+Import('swr_arch')
+
+# construct llvm include dir
+llvm_includedir = os.path.join(os.environ['LLVM'], 'include')
+
+# get path for arch-specific build-path.
+# That's where generated files reside.
+build_path = Dir('.').abspath
+
+env.Prepend(CPPPATH = [
+build_path + '/.',
+build_path + '/rasterizer',
+build_path + '/rasterizer/core',
+build_path + '/rasterizer/jitter',
+build_path + '/rasterizer/scripts',
+build_path + '/rasterizer/archrast'
+])
+
+env = env.Clone()
+
+env.MSVC2013Compat()
+
+env.Append(CPPDEFINES = [
+   '__STDC_CONSTANT_MACROS',
+   '__STDC_LIMIT_MACROS'
+   ])
+
+if not env['msvc'] :
+    env.Append(CCFLAGS = [
+        '-std=c++11',
+    ])
+
+swrroot = '#src/gallium/drivers/swr/'
+
+env.CodeGenerate(
+target = 'rasterizer/scripts/gen_knobs.cpp',
+script = swrroot + 'rasterizer/scripts/gen_knobs.py',
+source = [],
+command = python_cmd + ' $SCRIPT ' + Dir('rasterizer/scripts').abspath
+#command = python_cmd + ' $SCRIPT ' + 'rasterizer/scripts'
+)
+
+env.CodeGenerate(
+target = 'rasterizer/scripts/gen_knobs.h',
+script = swrroot + 'rasterizer/scripts/gen_knobs.py',
+source = [],
+command = python_cmd + ' $SCRIPT ' + 

[Mesa-dev] [PATCH v4 06/10] anv/batch_chain: Improve write_reloc

2016-11-07 Thread Jason Ekstrand
The old version wasn't properly handling large addresses where we have to
sign-extend to get it into the "canonical form" expected by the hardware.
Also, the new version is capable of doing a clflush of the newly written
reloc if requested.

Signed-off-by: Jason Ekstrand 
---
 src/intel/vulkan/anv_batch_chain.c | 27 ++-
 1 file changed, 22 insertions(+), 5 deletions(-)

diff --git a/src/intel/vulkan/anv_batch_chain.c 
b/src/intel/vulkan/anv_batch_chain.c
index 2c0f803..681bbb6 100644
--- a/src/intel/vulkan/anv_batch_chain.c
+++ b/src/intel/vulkan/anv_batch_chain.c
@@ -945,12 +945,29 @@ anv_cmd_buffer_process_relocs(struct anv_cmd_buffer 
*cmd_buffer,
 }
 
 static void
-write_reloc(const struct anv_device *device, void *p, uint64_t v)
+write_reloc(const struct anv_device *device, void *p, uint64_t v, bool flush)
 {
-   if (device->info.gen >= 8)
-  *(uint64_t *)p = v;
-   else
+   unsigned reloc_size = 0;
+   if (device->info.gen >= 8) {
+  /* From the Broadwell PRM Vol. 2a, MI_LOAD_REGISTER_MEM::MemoryAddress:
+   *
+   *"This field specifies the address of the memory location where the
+   *register value specified in the DWord above will read from. The
+   *address specifies the DWord location of the data. Range =
+   *GraphicsVirtualAddress[63:2] for a DWord register GraphicsAddress
+   *[63:48] are ignored by the HW and assumed to be in correct
+   *canonical form [63:48] == [47]."
+   */
+  const int shift = 63 - 47;
+  reloc_size = sizeof(uint64_t);
+  *(uint64_t *)p = (((int64_t)v) << shift) >> shift;
+   } else {
+  reloc_size = sizeof(uint32_t);
   *(uint32_t *)p = v;
+   }
+
+   if (flush && !device->info.has_llc)
+  anv_clflush_range(p, reloc_size);
 }
 
 static void
@@ -999,7 +1016,7 @@ adjust_relocations_to_state_pool(struct anv_block_pool 
*pool,
  assert(relocs->relocs[i].offset < from_bo->size);
  write_reloc(pool->device, from_bo->map + relocs->relocs[i].offset,
  relocs->relocs[i].presumed_offset +
- relocs->relocs[i].delta);
+ relocs->relocs[i].delta, false);
   }
}
 
-- 
2.5.0.400.gff86faf
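
The canonical-form requirement quoted from the PRM above (bits [63:48] must equal bit 47) can be illustrated in isolation. The following is a minimal sketch, not the driver's code: the name canonical_address is hypothetical, and it uses unsigned arithmetic rather than the signed shifts in the patch so that the result does not depend on implementation-defined shift behaviour.

```c
#include <stdint.h>

/* Hypothetical helper: sign-extend bit 47 of a 48-bit GPU virtual address
 * into bits [63:48], giving the "canonical form" described in the PRM text. */
static uint64_t
canonical_address(uint64_t v)
{
   const uint64_t low48 = v & ((UINT64_C(1) << 48) - 1);
   const uint64_t sign  = UINT64_C(1) << 47;

   /* (x ^ s) - s sign-extends x from the bit selected by s. */
   return (low48 ^ sign) - sign;
}
```

Addresses below 2^47 pass through unchanged, while an address with bit 47 set gets all of bits [63:48] set as well.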



[Mesa-dev] [PATCH v4 10/10] anv: Do relocations in userspace before execbuf ioctl

2016-11-07 Thread Jason Ekstrand
From: Kristian Høgsberg Kristensen 

Since our surface state buffer is shared by all batches, the kernel does a
full stall and sync with the CPU between batches every time we call
execbuf2 because it refuses to do relocations on an active buffer.  Doing
them in userspace and passing the NO_RELOC flag to the kernel allows us to
perform the relocations without stalling.

This improves the performance of Dota 2 by around 30% on a Sky Lake GT2.
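
The core idea, rewriting only those relocations whose presumed address no longer matches the target's current address, can be sketched on its own. This is an illustration under assumptions: the example_* types and the function below are hypothetical stand-ins, not the anv structures, and the cache flush and canonical-form handling are left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for a target buffer and one relocation entry. */
struct example_target {
   uint64_t offset;            /* address last reported for the target */
};

struct example_reloc {
   uint32_t offset;            /* where in the mapped buffer to write */
   uint32_t delta;             /* added to the target address */
   uint64_t presumed_offset;   /* address assumed when the buffer was built */
   struct example_target *target;
};

/* Write the relocations that are stale (or all of them if 'always' is set)
 * and return how many were actually written. */
static size_t
example_apply_relocs(uint8_t *map, struct example_reloc *relocs,
                     size_t count, bool always)
{
   size_t written = 0;

   for (size_t i = 0; i < count; i++) {
      struct example_reloc *r = &relocs[i];

      /* The value already in the buffer is correct, so skip the write. */
      if (r->presumed_offset == r->target->offset && !always)
         continue;

      uint64_t address = r->target->offset + r->delta;
      memcpy(map + r->offset, &address, sizeof(address));
      r->presumed_offset = r->target->offset;
      written++;
   }

   return written;
}
```

A second pass over an unchanged list writes nothing, which is what makes it reasonable to run this on every submission.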

v2 (Jason Ekstrand):
 - Better comments (Chris Wilson)
 - Fixed write_reloc for correct canonical form (Chris Wilson)

v3 (Jason Ekstrand):
 - Skip relocations which aren't needed
 - Provide an environment variable to always use the kernel
 - More comments about correctness (Chris Wilson)

v4 (Jason Ekstrand):
 - More comments (Chris Wilson)

v5 (Jason Ekstrand):
 - Rebase on top of moving execbuf2 setup to QueueSubmit

Signed-off-by: Jason Ekstrand 
---
 src/intel/vulkan/anv_batch_chain.c | 154 +++--
 src/intel/vulkan/anv_device.c  |   8 +-
 2 files changed, 153 insertions(+), 9 deletions(-)

diff --git a/src/intel/vulkan/anv_batch_chain.c 
b/src/intel/vulkan/anv_batch_chain.c
index 3493eeb..7659f27 100644
--- a/src/intel/vulkan/anv_batch_chain.c
+++ b/src/intel/vulkan/anv_batch_chain.c
@@ -32,6 +32,8 @@
 #include "genxml/gen7_pack.h"
 #include "genxml/gen8_pack.h"
 
+#include "util/debug.h"
+
 /** \file anv_batch_chain.c
  *
  * This file contains functions related to anv_cmd_buffer as a data
@@ -1044,6 +1046,108 @@ adjust_relocations_to_state_pool(struct anv_block_pool 
*pool,
}
 }
 
+static void
+anv_reloc_list_apply(struct anv_device *device,
+ struct anv_reloc_list *list,
+ struct anv_bo *bo,
+ bool always_relocate)
+{
+   for (size_t i = 0; i < list->num_relocs; i++) {
+  struct anv_bo *target_bo = list->reloc_bos[i];
+  if (list->relocs[i].presumed_offset == target_bo->offset &&
+  !always_relocate)
+ continue;
+
+  void *p = bo->map + list->relocs[i].offset;
+  write_reloc(device, p, target_bo->offset + list->relocs[i].delta, true);
+  list->relocs[i].presumed_offset = target_bo->offset;
+   }
+}
+
+/**
+ * This function applies the relocation for a command buffer and writes the
+ * actual addresses into the buffers as per what we were told by the kernel on
+ * the previous execbuf2 call.  This should be safe to do because, for each
+ * relocated address, we have two cases:
+ *
+ *  1) The target BO is inactive (as seen by the kernel).  In this case, it is
+ * not in use by the GPU so updating the address is 100% ok.  It won't be
+ * in-use by the GPU (from our context) again until the next execbuf2
+ * happens.  If the kernel decides to move it in the next execbuf2, it
+ * will have to do the relocations itself, but that's ok because it should
+ * have all of the information needed to do so.
+ *
+ *  2) The target BO is active (as seen by the kernel).  In this case, it
+ * hasn't moved since the last execbuffer2 call because GTT shuffling
+ * *only* happens when the BO is idle. (From our perspective, it only
+ * happens inside the execbuffer2 ioctl, but the shuffling may be
+ * triggered by another ioctl, with full-ppgtt this is limited to only
+ * execbuffer2 ioctls on the same context, or memory pressure.)  Since the
+ * target BO hasn't moved, our anv_bo::offset exactly matches the BO's GTT
+ * address and the relocated value we are writing into the BO will be the
+ * same as the value that is already there.
+ *
+ * There is also a possibility that the target BO is active but the exact
+ * RENDER_SURFACE_STATE object we are writing the relocation into isn't in
+ * use.  In this case, the address currently in the RENDER_SURFACE_STATE
+ * may be stale but it's still safe to write the relocation because that
+ * particular RENDER_SURFACE_STATE object isn't in-use by the GPU and
+ * won't be until the next execbuf2 call.
+ *
+ * By doing relocations on the CPU, we can tell the kernel that it doesn't
+ * need to bother.  We want to do this because the surface state buffer is
+ * used by every command buffer so, if the kernel does the relocations, it
+ * will always be busy and the kernel will always stall.  This is also
+ * probably the fastest mechanism for doing relocations since the kernel would
+ * have to make a full copy of all the relocations lists.
+ */
+static bool
+relocate_cmd_buffer(struct anv_cmd_buffer *cmd_buffer,
+struct anv_execbuf *exec)
+{
+   static int userspace_relocs = -1;
+   if (userspace_relocs < 0)
+  userspace_relocs = env_var_as_boolean("ANV_USERSPACE_RELOCS", true);
+   if (!userspace_relocs)
+  return false;
+
+   /* First, we have to check to see whether or not we can even do the
+* relocation.  New buffers which have never been 

[Mesa-dev] [PATCH v4 09/10] anv: Move relocation handling from EndCommandBuffer to QueueSubmit

2016-11-07 Thread Jason Ekstrand
Ever since the early days of the Vulkan driver, we've been setting up the
lists of relocations at EndCommandBuffer time.  The idea behind this was to
move some of the CPU load out of QueueSubmit which the client is required
to lock around and into command buffer building which could be done in
parallel.  Then QueueSubmit basically just becomes a bunch of execbuf2
calls.

Technically, this works.  However, when you start to do more in QueueSubmit
than just execbuf2, you start to run into problems.  In particular, if a
block pool is resized between EndCommandBuffer and QueueSubmit, the list of
anv_bo's and the execbuf2 object list can get out of sync.  This can cause
problems if, for instance, you wanted to do relocations in userspace.
---
 src/intel/vulkan/anv_batch_chain.c | 94 --
 src/intel/vulkan/anv_device.c  | 30 ++--
 src/intel/vulkan/anv_private.h | 13 --
 src/intel/vulkan/genX_cmd_buffer.c | 11 -
 4 files changed, 76 insertions(+), 72 deletions(-)

diff --git a/src/intel/vulkan/anv_batch_chain.c 
b/src/intel/vulkan/anv_batch_chain.c
index 45cdb95..3493eeb 100644
--- a/src/intel/vulkan/anv_batch_chain.c
+++ b/src/intel/vulkan/anv_batch_chain.c
@@ -574,20 +574,6 @@ anv_cmd_buffer_new_binding_table_block(struct 
anv_cmd_buffer *cmd_buffer)
return VK_SUCCESS;
 }
 
-static void
-anv_execbuf_init(struct anv_execbuf *exec)
-{
-   memset(exec, 0, sizeof(*exec));
-}
-
-static void
-anv_execbuf_finish(struct anv_execbuf *exec,
-   const VkAllocationCallbacks *alloc)
-{
-   vk_free(alloc, exec->objects);
-   vk_free(alloc, exec->bos);
-}
-
 VkResult
 anv_cmd_buffer_init_batch_bo_chain(struct anv_cmd_buffer *cmd_buffer)
 {
@@ -635,8 +621,6 @@ anv_cmd_buffer_init_batch_bo_chain(struct anv_cmd_buffer 
*cmd_buffer)
 
anv_cmd_buffer_new_binding_table_block(cmd_buffer);
 
-   anv_execbuf_init(&cmd_buffer->execbuf2);
-
return VK_SUCCESS;
 
  fail_bt_blocks:
@@ -668,8 +652,6 @@ anv_cmd_buffer_fini_batch_bo_chain(struct anv_cmd_buffer 
*cmd_buffer)
 &cmd_buffer->batch_bos, link) {
   anv_batch_bo_destroy(bbo, cmd_buffer);
}
-
-   anv_execbuf_finish(&cmd_buffer->execbuf2, &cmd_buffer->pool->alloc);
 }
 
 void
@@ -867,6 +849,31 @@ anv_cmd_buffer_add_secondary(struct anv_cmd_buffer 
*primary,
  &secondary->surface_relocs, 0);
 }
 
+struct anv_execbuf {
+   struct drm_i915_gem_execbuffer2   execbuf;
+
+   struct drm_i915_gem_exec_object2 *objects;
+   uint32_t  bo_count;
+   struct anv_bo **  bos;
+
+   /* Allocated length of the 'objects' and 'bos' arrays */
+   uint32_t  array_length;
+};
+
+static void
+anv_execbuf_init(struct anv_execbuf *exec)
+{
+   memset(exec, 0, sizeof(*exec));
+}
+
+static void
+anv_execbuf_finish(struct anv_execbuf *exec,
+   const VkAllocationCallbacks *alloc)
+{
+   vk_free(alloc, exec->objects);
+   vk_free(alloc, exec->bos);
+}
+
 static VkResult
 anv_execbuf_add_bo(struct anv_execbuf *exec,
struct anv_bo *bo,
@@ -1037,20 +1044,20 @@ adjust_relocations_to_state_pool(struct anv_block_pool 
*pool,
}
 }
 
-void
-anv_cmd_buffer_prepare_execbuf(struct anv_cmd_buffer *cmd_buffer)
+VkResult
+anv_cmd_buffer_execbuf(struct anv_device *device,
+   struct anv_cmd_buffer *cmd_buffer)
 {
struct anv_batch *batch = &cmd_buffer->batch;
struct anv_block_pool *ss_pool =
   &cmd_buffer->device->surface_state_block_pool;
 
-   cmd_buffer->execbuf2.bo_count = 0;
+   struct anv_execbuf execbuf;
+   anv_execbuf_init(&execbuf);
 
adjust_relocations_from_state_pool(ss_pool, &cmd_buffer->surface_relocs,
   cmd_buffer->last_ss_pool_center);
-
-   anv_execbuf_add_bo(&cmd_buffer->execbuf2, &ss_pool->bo,
-  &cmd_buffer->surface_relocs,
+   anv_execbuf_add_bo(&execbuf, &ss_pool->bo, &cmd_buffer->surface_relocs,
   &cmd_buffer->pool->alloc);
 
/* First, we walk over all of the bos we've seen and add them and their
@@ -1061,7 +1068,7 @@ anv_cmd_buffer_prepare_execbuf(struct anv_cmd_buffer 
*cmd_buffer)
   adjust_relocations_to_state_pool(ss_pool, &(*bbo)->bo, &(*bbo)->relocs,
cmd_buffer->last_ss_pool_center);
 
-  anv_execbuf_add_bo(&cmd_buffer->execbuf2, &(*bbo)->bo, &(*bbo)->relocs,
+  anv_execbuf_add_bo(&execbuf, &(*bbo)->bo, &(*bbo)->relocs,
  &cmd_buffer->pool->alloc);
}
 
@@ -1079,20 +1086,19 @@ anv_cmd_buffer_prepare_execbuf(struct anv_cmd_buffer 
*cmd_buffer)
 * corresponding to the first batch_bo in the chain with the last
 * element in the list.
 */
-   if (first_batch_bo->bo.index != cmd_buffer->execbuf2.bo_count - 1) {
+   if (first_batch_bo->bo.index != execbuf.bo_count - 1) {
   uint32_t idx = first_batch_bo->bo.index;
-  uint32_t last_idx = cmd_buffer->execbuf2.bo_count - 1;
+  uint32_t last_idx = 

[Mesa-dev] [PATCH v4 07/10] anv: Add an anv_execbuf helper struct

2016-11-07 Thread Jason Ekstrand
This commit adds a little helper struct for storing everything we use to
build an execbuf2 call.  Since the add_bo function really has nothing to do
with a command buffer, it makes sense to break it out a bit.  This also
reduces some of the churn in the next commit.
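
The growth strategy that the add_bo function in this patch uses (start at 64 entries, double when full, copy the old contents, free the old array) can be sketched independently of the driver. The example_list type below is hypothetical and uses malloc and free where the patch goes through the Vulkan allocation callbacks.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable array mirroring the objects/bos bookkeeping. */
struct example_list {
   int     *items;
   uint32_t count;
   uint32_t capacity;   /* allocated length of 'items' */
};

/* Append a value, doubling the allocation when it is full.
 * Returns 0 on success and -1 if the allocation fails. */
static int
example_list_add(struct example_list *list, int value)
{
   if (list->count >= list->capacity) {
      uint32_t new_cap = list->items ? list->capacity * 2 : 64;

      int *new_items = malloc(new_cap * sizeof(*new_items));
      if (new_items == NULL)
         return -1;

      if (list->items)
         memcpy(new_items, list->items, list->count * sizeof(*new_items));

      /* Release the old storage only after the copy has been made. */
      free(list->items);

      list->items = new_items;
      list->capacity = new_cap;
   }

   list->items[list->count++] = value;
   return 0;
}
```

A zero-initialized struct is a valid empty list, which matches the memset-based init helper in the patch.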
---
 src/intel/vulkan/anv_batch_chain.c | 84 +++---
 src/intel/vulkan/anv_private.h | 26 ++--
 2 files changed, 62 insertions(+), 48 deletions(-)

diff --git a/src/intel/vulkan/anv_batch_chain.c 
b/src/intel/vulkan/anv_batch_chain.c
index 681bbb6..a21ae78 100644
--- a/src/intel/vulkan/anv_batch_chain.c
+++ b/src/intel/vulkan/anv_batch_chain.c
@@ -577,6 +577,20 @@ anv_cmd_buffer_new_binding_table_block(struct 
anv_cmd_buffer *cmd_buffer)
return VK_SUCCESS;
 }
 
+static void
+anv_execbuf_init(struct anv_execbuf *exec)
+{
+   memset(exec, 0, sizeof(*exec));
+}
+
+static void
+anv_execbuf_finish(struct anv_execbuf *exec,
+   const VkAllocationCallbacks *alloc)
+{
+   vk_free(alloc, exec->objects);
+   vk_free(alloc, exec->bos);
+}
+
 VkResult
 anv_cmd_buffer_init_batch_bo_chain(struct anv_cmd_buffer *cmd_buffer)
 {
@@ -623,9 +637,7 @@ anv_cmd_buffer_init_batch_bo_chain(struct anv_cmd_buffer 
*cmd_buffer)
 
anv_cmd_buffer_new_binding_table_block(cmd_buffer);
 
-   cmd_buffer->execbuf2.objects = NULL;
-   cmd_buffer->execbuf2.bos = NULL;
-   cmd_buffer->execbuf2.array_length = 0;
+   anv_execbuf_init(&cmd_buffer->execbuf2);
 
return VK_SUCCESS;
 
@@ -659,8 +671,7 @@ anv_cmd_buffer_fini_batch_bo_chain(struct anv_cmd_buffer 
*cmd_buffer)
   anv_batch_bo_destroy(bbo, cmd_buffer);
}
 
-   vk_free(&cmd_buffer->pool->alloc, cmd_buffer->execbuf2.objects);
-   vk_free(&cmd_buffer->pool->alloc, cmd_buffer->execbuf2.bos);
+   anv_execbuf_finish(&cmd_buffer->execbuf2, &cmd_buffer->pool->alloc);
 }
 
 void
@@ -858,55 +869,57 @@ anv_cmd_buffer_add_secondary(struct anv_cmd_buffer 
*primary,
 }
 
 static VkResult
-anv_cmd_buffer_add_bo(struct anv_cmd_buffer *cmd_buffer,
-  struct anv_bo *bo,
-  struct anv_reloc_list *relocs)
+anv_execbuf_add_bo(struct anv_execbuf *exec,
+   struct anv_bo *bo,
+   struct anv_reloc_list *relocs,
+   const VkAllocationCallbacks *alloc)
 {
struct drm_i915_gem_exec_object2 *obj = NULL;
 
-   if (bo->index < cmd_buffer->execbuf2.bo_count &&
-   cmd_buffer->execbuf2.bos[bo->index] == bo)
-  obj = &cmd_buffer->execbuf2.objects[bo->index];
+   if (bo->index < exec->bo_count && exec->bos[bo->index] == bo)
+  obj = &exec->objects[bo->index];
 
if (obj == NULL) {
   /* We've never seen this one before.  Add it to the list and assign
* an id that we can use later.
*/
-  if (cmd_buffer->execbuf2.bo_count >= cmd_buffer->execbuf2.array_length) {
- uint32_t new_len = cmd_buffer->execbuf2.objects ?
-cmd_buffer->execbuf2.array_length * 2 : 64;
+  if (exec->bo_count >= exec->array_length) {
+ uint32_t new_len = exec->objects ? exec->array_length * 2 : 64;
 
  struct drm_i915_gem_exec_object2 *new_objects =
-vk_alloc(&cmd_buffer->pool->alloc, new_len * sizeof(*new_objects),
-  8, VK_SYSTEM_ALLOCATION_SCOPE_OBJECT);
+vk_alloc(alloc, new_len * sizeof(*new_objects),
+ 8, VK_SYSTEM_ALLOCATION_SCOPE_COMMAND);
  if (new_objects == NULL)
 return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
 
  struct anv_bo **new_bos =
-vk_alloc(&cmd_buffer->pool->alloc, new_len * sizeof(*new_bos),
-  8, VK_SYSTEM_ALLOCATION_SCOPE_OBJECT);
+vk_alloc(alloc, new_len * sizeof(*new_bos),
+  8, VK_SYSTEM_ALLOCATION_SCOPE_COMMAND);
  if (new_bos == NULL) {
-vk_free(&cmd_buffer->pool->alloc, new_objects);
+vk_free(alloc, new_objects);
 return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
  }
 
- if (cmd_buffer->execbuf2.objects) {
-memcpy(new_objects, cmd_buffer->execbuf2.objects,
-   cmd_buffer->execbuf2.bo_count * sizeof(*new_objects));
-memcpy(new_bos, cmd_buffer->execbuf2.bos,
-   cmd_buffer->execbuf2.bo_count * sizeof(*new_bos));
+ if (exec->objects) {
+memcpy(new_objects, exec->objects,
+   exec->bo_count * sizeof(*new_objects));
+memcpy(new_bos, exec->bos,
+   exec->bo_count * sizeof(*new_bos));
  }
 
- cmd_buffer->execbuf2.objects = new_objects;
- cmd_buffer->execbuf2.bos = new_bos;
- cmd_buffer->execbuf2.array_length = new_len;
+ vk_free(alloc, exec->objects);
+ vk_free(alloc, exec->bos);
+
+ exec->objects = new_objects;
+ exec->bos = new_bos;
+ exec->array_length = new_len;
   }
 
-  assert(cmd_buffer->execbuf2.bo_count < 

[Mesa-dev] [PATCH v4 00/10] anv: Rework relocation handling

2016-11-07 Thread Jason Ekstrand
This is the fourth iteration of my attempt to rework relocation handling
and do relocations in userspace.  I'm finally getting pretty happy with
this and I think I'll probably merge this version if there are no further
objections.

Jason Ekstrand (9):
  anv: Add a cmd_buffer_execbuf helper
  anv: Don't presume to know what address is in a surface relocation
  anv: Add a new bo_pool_init helper
  anv/allocator: Simplify anv_scratch_pool
  anv: Initialize anv_bo::offset to -1
  anv/batch_chain: Improve write_reloc
  anv: Add an anv_execbuf helper struct
  anv/batch: Move last_ss_pool_bo_offset to the command buffer
  anv: Move relocation handling from EndCommandBuffer to QueueSubmit

Kristian Høgsberg (1):
  anv: Do relocations in userspace before execbuf ioctl

 src/intel/vulkan/anv_allocator.c   | 118 +---
 src/intel/vulkan/anv_batch_chain.c | 386 ++---
 src/intel/vulkan/anv_device.c  |  49 +++--
 src/intel/vulkan/anv_intel.c   |  11 +-
 src/intel/vulkan/anv_private.h |  43 +++--
 src/intel/vulkan/genX_cmd_buffer.c |  11 --
 6 files changed, 384 insertions(+), 234 deletions(-)

-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH v4 04/10] anv/allocator: Simplify anv_scratch_pool

2016-11-07 Thread Jason Ekstrand
The previous implementation was being overly clever and using the
anv_bo::size field as its mutex.  Scratch pool allocations don't happen
often, will happen at most a fixed number of times, and never happen in the
critical path (they only happen in shader compilation).  We can make this
much simpler by just using the device mutex.  This also means that we can
start using anv_bo_init_new directly on the bo and avoid setting fields
one-at-a-time.
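
The simplified scheme described above, an unlocked "does it already exist" check followed by a re-check under a mutex, can be sketched as follows. This is an assumption-laden illustration rather than the anv code: the scratch_slot type, the global pool_mutex, and the use of C11 atomics for the flag are all choices made for the example. Every path that takes the lock also releases it.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a lazily created scratch buffer. */
struct scratch_slot {
   atomic_bool exists;
   size_t      size;
};

static pthread_mutex_t pool_mutex = PTHREAD_MUTEX_INITIALIZER;

static struct scratch_slot *
scratch_slot_get(struct scratch_slot *slot, size_t size)
{
   /* Fast path: the slot was created earlier, no lock needed. */
   if (atomic_load_explicit(&slot->exists, memory_order_acquire))
      return slot;

   pthread_mutex_lock(&pool_mutex);

   /* Re-check under the lock; another thread may have won the race. */
   if (!atomic_load_explicit(&slot->exists, memory_order_relaxed)) {
      slot->size = size;   /* the real allocation would happen here */
      atomic_store_explicit(&slot->exists, true, memory_order_release);
   }

   pthread_mutex_unlock(&pool_mutex);
   return slot;
}
```

Because creation happens at most once per slot, the mutex is contended only during the first few calls.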

Signed-off-by: Jason Ekstrand 
---
 src/intel/vulkan/anv_allocator.c | 109 ++-
 src/intel/vulkan/anv_private.h   |   7 ++-
 2 files changed, 55 insertions(+), 61 deletions(-)

diff --git a/src/intel/vulkan/anv_allocator.c b/src/intel/vulkan/anv_allocator.c
index b846a62..e7cad45 100644
--- a/src/intel/vulkan/anv_allocator.c
+++ b/src/intel/vulkan/anv_allocator.c
@@ -888,9 +888,9 @@ anv_scratch_pool_finish(struct anv_device *device, struct 
anv_scratch_pool *pool
 {
for (unsigned s = 0; s < MESA_SHADER_STAGES; s++) {
   for (unsigned i = 0; i < 16; i++) {
- struct anv_bo *bo = &pool->bos[i][s];
- if (bo->size > 0)
-anv_gem_close(device, bo->gem_handle);
+ struct anv_scratch_bo *bo = &pool->bos[i][s];
+ if (bo->exists > 0)
+anv_gem_close(device, bo->bo.gem_handle);
   }
}
 }
@@ -905,70 +905,59 @@ anv_scratch_pool_alloc(struct anv_device *device, struct 
anv_scratch_pool *pool,
unsigned scratch_size_log2 = ffs(per_thread_scratch / 2048);
assert(scratch_size_log2 < 16);
 
-   struct anv_bo *bo = &pool->bos[scratch_size_log2][stage];
+   struct anv_scratch_bo *bo = &pool->bos[scratch_size_log2][stage];
 
-   /* From now on, we go into a critical section.  In order to remain
-* thread-safe, we use the bo size as a lock.  A value of 0 means we don't
-* have a valid BO yet.  A value of 1 means locked.  A value greater than 1
-* means we have a bo of the given size.
-*/
+   /* We can use "exists" to shortcut and ignore the critical section */
+   if (bo->exists)
+  return &bo->bo;
 
-   if (bo->size > 1)
-  return bo;
-
-   uint64_t size = __sync_val_compare_and_swap(&bo->size, 0, 1);
-   if (size == 0) {
-  /* We own the lock.  Allocate a buffer */
-
-  const struct anv_physical_device *physical_device =
- &device->instance->physicalDevice;
-  const struct gen_device_info *devinfo = &physical_device->info;
-
-  /* WaCSScratchSize:hsw
-   *
-   * Haswell's scratch space address calculation appears to be sparse
-   * rather than tightly packed. The Thread ID has bits indicating which
-   * subslice, EU within a subslice, and thread within an EU it is.
-   * There's a maximum of two slices and two subslices, so these can be
-   * stored with a single bit. Even though there are only 10 EUs per
-   * subslice, this is stored in 4 bits, so there's an effective maximum
-   * value of 16 EUs. Similarly, although there are only 7 threads per EU,
-   * this is stored in a 3 bit number, giving an effective maximum value
-   * of 8 threads per EU.
-   *
-   * This means that we need to use 16 * 8 instead of 10 * 7 for the
-   * number of threads per subslice.
-   */
-  const unsigned subslices = MAX2(physical_device->subslice_total, 1);
-  const unsigned scratch_ids_per_subslice =
- device->info.is_haswell ? 16 * 8 : devinfo->max_cs_threads;
+   pthread_mutex_lock(&device->mutex);
+
+   __sync_synchronize();
+   if (bo->exists)
+  return &bo->bo;
 
-  uint32_t max_threads[] = {
- [MESA_SHADER_VERTEX]   = devinfo->max_vs_threads,
- [MESA_SHADER_TESS_CTRL]= devinfo->max_tcs_threads,
- [MESA_SHADER_TESS_EVAL]= devinfo->max_tes_threads,
- [MESA_SHADER_GEOMETRY] = devinfo->max_gs_threads,
- [MESA_SHADER_FRAGMENT] = devinfo->max_wm_threads,
- [MESA_SHADER_COMPUTE]  = scratch_ids_per_subslice * subslices,
-  };
+   const struct anv_physical_device *physical_device =
+  &device->instance->physicalDevice;
+   const struct gen_device_info *devinfo = &physical_device->info;
+
+   /* WaCSScratchSize:hsw
+*
+* Haswell's scratch space address calculation appears to be sparse
+* rather than tightly packed. The Thread ID has bits indicating which
+* subslice, EU within a subslice, and thread within an EU it is.
+* There's a maximum of two slices and two subslices, so these can be
+* stored with a single bit. Even though there are only 10 EUs per
+* subslice, this is stored in 4 bits, so there's an effective maximum
+* value of 16 EUs. Similarly, although there are only 7 threads per EU,
+* this is stored in a 3 bit number, giving an effective maximum value
+* of 8 threads per EU.
+*
+* This means that we need to use 16 * 8 instead of 10 * 7 for the
+* number of threads per subslice.
+*/
+   const unsigned subslices = MAX2(physical_device->subslice_total, 1);

[Mesa-dev] [PATCH v4 05/10] anv: Initialize anv_bo::offset to -1

2016-11-07 Thread Jason Ekstrand
Since -1 is an invalid GPU address, this lets us know whether or not we
have a valid address for a buffer.  We don't get a valid address until the
first time that buffer is used in an execbuf2 ioctl.

Signed-off-by: Jason Ekstrand 
---
 src/intel/vulkan/anv_private.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index 5d78d16..fab956b 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -270,7 +270,7 @@ anv_bo_init(struct anv_bo *bo, uint32_t gem_handle, 
uint64_t size)
 {
bo->gem_handle = gem_handle;
bo->index = 0;
-   bo->offset = 0;
+   bo->offset = -1;
bo->size = size;
bo->map = NULL;
bo->is_winsys_bo = false;
-- 
2.5.0.400.gff86faf
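
How a sentinel of -1 lets later code distinguish "never placed by the kernel" from a real address, including address 0, can be sketched like this. The example_bo type and helper names are hypothetical and carry only the fields needed for the illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* -1 stored in a uint64_t is the all-ones value, which is not a valid
 * GPU address. */
#define EXAMPLE_INVALID_OFFSET UINT64_MAX

/* Hypothetical, reduced version of a buffer object. */
struct example_bo {
   uint32_t gem_handle;
   uint64_t offset;
   uint64_t size;
};

static void
example_bo_init(struct example_bo *bo, uint32_t gem_handle, uint64_t size)
{
   bo->gem_handle = gem_handle;
   bo->offset = (uint64_t)-1;   /* no address known yet */
   bo->size = size;
}

/* True once some address, possibly 0, has been recorded for the buffer. */
static bool
example_bo_has_address(const struct example_bo *bo)
{
   return bo->offset != EXAMPLE_INVALID_OFFSET;
}
```

With the previous initial value of 0, a buffer that had never been submitted was indistinguishable from one placed at address 0.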



[Mesa-dev] [PATCH v4 03/10] anv: Add a new bo_pool_init helper

2016-11-07 Thread Jason Ekstrand
This ensures that we're always setting all of the fields in anv_bo

Signed-off-by: Jason Ekstrand 
---
 src/intel/vulkan/anv_allocator.c |  9 ++---
 src/intel/vulkan/anv_device.c| 10 +++---
 src/intel/vulkan/anv_intel.c | 11 +--
 src/intel/vulkan/anv_private.h   | 11 +++
 4 files changed, 21 insertions(+), 20 deletions(-)

diff --git a/src/intel/vulkan/anv_allocator.c b/src/intel/vulkan/anv_allocator.c
index 36cabd7..b846a62 100644
--- a/src/intel/vulkan/anv_allocator.c
+++ b/src/intel/vulkan/anv_allocator.c
@@ -253,10 +253,7 @@ anv_block_pool_init(struct anv_block_pool *pool,
assert(util_is_power_of_two(block_size));
 
pool->device = device;
-   pool->bo.gem_handle = 0;
-   pool->bo.offset = 0;
-   pool->bo.size = 0;
-   pool->bo.is_winsys_bo = false;
+   anv_bo_init(&pool->bo, 0, 0);
pool->block_size = block_size;
pool->free_list = ANV_FREE_LIST_EMPTY;
pool->back_free_list = ANV_FREE_LIST_EMPTY;
@@ -463,10 +460,8 @@ anv_block_pool_grow(struct anv_block_pool *pool, struct 
anv_block_state *state)
 * values back into pool. */
pool->map = map + center_bo_offset;
pool->center_bo_offset = center_bo_offset;
-   pool->bo.gem_handle = gem_handle;
-   pool->bo.size = size;
+   anv_bo_init(&pool->bo, gem_handle, size);
pool->bo.map = map;
-   pool->bo.index = 0;
 
 done:
pthread_mutex_unlock(&pool->device->mutex);
diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index 27402ce..c40598c 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -1146,15 +1146,11 @@ VkResult anv_DeviceWaitIdle(
 VkResult
 anv_bo_init_new(struct anv_bo *bo, struct anv_device *device, uint64_t size)
 {
-   bo->gem_handle = anv_gem_create(device, size);
-   if (!bo->gem_handle)
+   uint32_t gem_handle = anv_gem_create(device, size);
+   if (!gem_handle)
   return vk_error(VK_ERROR_OUT_OF_DEVICE_MEMORY);
 
-   bo->map = NULL;
-   bo->index = 0;
-   bo->offset = 0;
-   bo->size = size;
-   bo->is_winsys_bo = false;
+   anv_bo_init(bo, gem_handle, size);
 
return VK_SUCCESS;
 }
diff --git a/src/intel/vulkan/anv_intel.c b/src/intel/vulkan/anv_intel.c
index 3e1cc3f..1c50e2b 100644
--- a/src/intel/vulkan/anv_intel.c
+++ b/src/intel/vulkan/anv_intel.c
@@ -49,16 +49,15 @@ VkResult anv_CreateDmaBufImageINTEL(
if (mem == NULL)
   return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
 
-   mem->bo.gem_handle = anv_gem_fd_to_handle(device, pCreateInfo->fd);
-   if (!mem->bo.gem_handle) {
+   uint32_t gem_handle = anv_gem_fd_to_handle(device, pCreateInfo->fd);
+   if (!gem_handle) {
   result = vk_error(VK_ERROR_OUT_OF_DEVICE_MEMORY);
   goto fail;
}
 
-   mem->bo.map = NULL;
-   mem->bo.index = 0;
-   mem->bo.offset = 0;
-   mem->bo.size = pCreateInfo->strideInBytes * pCreateInfo->extent.height;
+   uint64_t size = pCreateInfo->strideInBytes * pCreateInfo->extent.height;
+
+   anv_bo_init(&mem->bo, gem_handle, size);
 
anv_image_create(_device,
   &(struct anv_image_create_info) {
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index dbe23bd..7eedf06 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -265,6 +265,17 @@ struct anv_bo {
bool is_winsys_bo;
 };
 
+static inline void
+anv_bo_init(struct anv_bo *bo, uint32_t gem_handle, uint64_t size)
+{
+   bo->gem_handle = gem_handle;
+   bo->index = 0;
+   bo->offset = 0;
+   bo->size = size;
+   bo->map = NULL;
+   bo->is_winsys_bo = false;
+}
+
 /* Represents a lock-free linked list of "free" things.  This is used by
  * both the block pool and the state pools.  Unfortunately, in order to
  * solve the ABA problem, we can't use a single uint32_t head.
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH v4 01/10] anv: Add a cmd_buffer_execbuf helper

2016-11-07 Thread Jason Ekstrand
This puts the actual execbuf2 call in anv_batch_chain.c along with the
other relocation stuff.
---
 src/intel/vulkan/anv_batch_chain.c | 8 
 src/intel/vulkan/anv_device.c  | 3 +--
 src/intel/vulkan/anv_private.h | 2 ++
 3 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/src/intel/vulkan/anv_batch_chain.c 
b/src/intel/vulkan/anv_batch_chain.c
index dfa9abf..529fe7e 100644
--- a/src/intel/vulkan/anv_batch_chain.c
+++ b/src/intel/vulkan/anv_batch_chain.c
@@ -1131,3 +1131,11 @@ anv_cmd_buffer_prepare_execbuf(struct anv_cmd_buffer 
*cmd_buffer)
if (!cmd_buffer->execbuf2.need_reloc)
   cmd_buffer->execbuf2.execbuf.flags |= I915_EXEC_NO_RELOC;
 }
+
+VkResult
+anv_cmd_buffer_execbuf(struct anv_device *device,
+   struct anv_cmd_buffer *cmd_buffer)
+{
+   return anv_device_execbuf(device, &cmd_buffer->execbuf2.execbuf,
+ cmd_buffer->execbuf2.bos);
+}
diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index 6d8de90..27402ce 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -1103,8 +1103,7 @@ VkResult anv_QueueSubmit(
  pSubmits[i].pCommandBuffers[j]);
  assert(cmd_buffer->level == VK_COMMAND_BUFFER_LEVEL_PRIMARY);
 
- result = anv_device_execbuf(device, &cmd_buffer->execbuf2.execbuf,
- cmd_buffer->execbuf2.bos);
+ result = anv_cmd_buffer_execbuf(device, cmd_buffer);
  if (result != VK_SUCCESS)
 return result;
   }
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index 83b9328..2265346 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -1195,6 +1195,8 @@ void anv_cmd_buffer_end_batch_buffer(struct 
anv_cmd_buffer *cmd_buffer);
 void anv_cmd_buffer_add_secondary(struct anv_cmd_buffer *primary,
   struct anv_cmd_buffer *secondary);
 void anv_cmd_buffer_prepare_execbuf(struct anv_cmd_buffer *cmd_buffer);
+VkResult anv_cmd_buffer_execbuf(struct anv_device *device,
+struct anv_cmd_buffer *cmd_buffer);
 
 VkResult anv_cmd_buffer_reset(struct anv_cmd_buffer *cmd_buffer);
 
-- 
2.5.0.400.gff86faf



Re: [Mesa-dev] [PATCH 2/2] i965: Advertise 8 subpixel bits always.

2016-11-07 Thread Francisco Jerez
Chris Forbes  writes:

> Hi Curro,
>
> Thanks for being thorough about this -- I think there is still one area
> where things might be a bit wobbly; if we end up taking a sw fallback,
> swrast only does 4 bits. I'm not sure that matters though.
>
Hmm, I don't think we fall back to swrast for anything that would be
sensitive to the subpixel precision, but it's definitely worth
checking. :)

> - Chris
>
> On Tue, Nov 8, 2016 at 11:01 AM, Francisco Jerez 
> wrote:
>
>> Chris Forbes  writes:
>>
>> > The mesa default is 4, but we program the hardware for 8 on all
>> > generations.
>> >
>>
>> I happened to come across this inconsistency a couple of weeks ago -- I
>> just double-checked that it doesn't cause any conformance regressions
>> because some of the rasterization tests use the GL_SUBPIXEL_BITS value
>> to determine the error tolerance so increasing the value could
>> potentially uncover additional approximation errors.  Doesn't seem to
>> cause any regressions though in our CI system, series is:
>>
>> Reviewed-by: Francisco Jerez 
>>
>> > Signed-off-by: Chris Forbes 
>> > ---
>> >  src/mesa/drivers/dri/i965/brw_context.c | 1 +
>> >  1 file changed, 1 insertion(+)
>> >
>> > diff --git a/src/mesa/drivers/dri/i965/brw_context.c
>> b/src/mesa/drivers/dri/i965/brw_context.c
>> > index 3085a98..d8174c6 100644
>> > --- a/src/mesa/drivers/dri/i965/brw_context.c
>> > +++ b/src/mesa/drivers/dri/i965/brw_context.c
>> > @@ -538,6 +538,7 @@ brw_initialize_context_constants(struct brw_context
>> *brw)
>> >ctx->Const.MaxProgramTextureGatherComponents = 1;
>> >
>> > ctx->Const.MaxUniformBlockSize = 65536;
>> > +   ctx->Const.SubPixelBits = 8;
>> >
>> > for (int i = 0; i < MESA_SHADER_STAGES; i++) {
>> >struct gl_program_constants *prog = &ctx->Const.Program[i];
>> > --
>> > 2.10.2
>> >
>>




Re: [Mesa-dev] [PATCH v2] nvc0: only invalidate currently bound tic/tsc

2016-11-07 Thread Samuel Pitoiset
This could still be improved by adding textures/samplers_valid[6] to
the context.


On 11/07/2016 11:13 PM, Samuel Pitoiset wrote:

This is especially useful when switching from compute to 3D.

v2: - get rid of one loop with 'x |= (1ULL << y) - 1' instead

Signed-off-by: Samuel Pitoiset 
---

Tested with Elemental on GK208, works fine.

 src/gallium/drivers/nouveau/nvc0/nvc0_compute.c |  6 +++---
 src/gallium/drivers/nouveau/nvc0/nvc0_tex.c | 11 +++
 src/gallium/drivers/nouveau/nvc0/nve4_compute.c |  6 +++---
 3 files changed, 13 insertions(+), 10 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_compute.c b/src/gallium/drivers/nouveau/nvc0/nvc0_compute.c
index 11635c9..08fa23f 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_compute.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_compute.c
@@ -151,7 +151,7 @@ nvc0_compute_validate_samplers(struct nvc0_context *nvc0)

/* Invalidate all 3D samplers because they are aliased. */
for (int s = 0; s < 5; s++)
-  nvc0->samplers_dirty[s] = ~0;
+  nvc0->samplers_dirty[s] |= (1ULL << nvc0->num_samplers[5]) - 1;
nvc0->dirty_3d |= NVC0_NEW_3D_SAMPLERS;
 }

@@ -166,9 +166,9 @@ nvc0_compute_validate_textures(struct nvc0_context *nvc0)

/* Invalidate all 3D textures because they are aliased. */
for (int s = 0; s < 5; s++) {
-  for (int i = 0; i < nvc0->num_textures[s]; i++)
+  for (int i = 0; i < nvc0->num_textures[5]; i++)
  nouveau_bufctx_reset(nvc0->bufctx_3d, NVC0_BIND_3D_TEX(s, i));
-  nvc0->textures_dirty[s] = ~0;
+  nvc0->textures_dirty[s] |= (1ULL << nvc0->num_textures[5]) - 1;
}
nvc0->dirty_3d |= NVC0_NEW_3D_TEXTURES;
 }
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_tex.c b/src/gallium/drivers/nouveau/nvc0/nvc0_tex.c
index e57391e..8d620a9 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_tex.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_tex.c
@@ -606,9 +606,11 @@ void nvc0_validate_textures(struct nvc0_context *nvc0)
}

/* Invalidate all CP textures because they are aliased. */
-   for (int i = 0; i < nvc0->num_textures[5]; i++)
-  nouveau_bufctx_reset(nvc0->bufctx_cp, NVC0_BIND_CP_TEX(i));
-   nvc0->textures_dirty[5] = ~0;
+   for (int s = 0; s < 5; s++) {
+  for (int i = 0; i < nvc0->num_textures[s]; i++)
+ nouveau_bufctx_reset(nvc0->bufctx_cp, NVC0_BIND_CP_TEX(i));
+  nvc0->textures_dirty[5] |= (1ULL << nvc0->num_textures[s]) - 1;
+   }
nvc0->dirty_cp |= NVC0_NEW_CP_TEXTURES;
 }

@@ -715,7 +717,8 @@ void nvc0_validate_samplers(struct nvc0_context *nvc0)
}

/* Invalidate all CP samplers because they are aliased. */
-   nvc0->samplers_dirty[5] = ~0;
+   for (int s = 0; s < 5; s++)
+  nvc0->samplers_dirty[5] |= (1ULL << nvc0->num_samplers[s]) - 1;
nvc0->dirty_cp |= NVC0_NEW_CP_SAMPLERS;
 }

diff --git a/src/gallium/drivers/nouveau/nvc0/nve4_compute.c b/src/gallium/drivers/nouveau/nvc0/nve4_compute.c
index d661c00..33e0ab0 100644
--- a/src/gallium/drivers/nouveau/nvc0/nve4_compute.c
+++ b/src/gallium/drivers/nouveau/nvc0/nve4_compute.c
@@ -307,7 +307,7 @@ nve4_compute_validate_samplers(struct nvc0_context *nvc0)

/* Invalidate all 3D samplers because they are aliased. */
for (int s = 0; s < 5; s++)
-  nvc0->samplers_dirty[s] = ~0;
+  nvc0->samplers_dirty[s] |= (1ULL << nvc0->num_samplers[5]) - 1;
nvc0->dirty_3d |= NVC0_NEW_3D_SAMPLERS;
 }

@@ -764,9 +764,9 @@ nve4_compute_validate_textures(struct nvc0_context *nvc0)

/* Invalidate all 3D textures because they are aliased. */
for (int s = 0; s < 5; s++) {
-  for (int i = 0; i < nvc0->num_textures[s]; i++)
+  for (int i = 0; i < nvc0->num_textures[5]; i++)
  nouveau_bufctx_reset(nvc0->bufctx_3d, NVC0_BIND_3D_TEX(s, i));
-  nvc0->textures_dirty[s] = ~0;
+  nvc0->textures_dirty[s] |= (1ULL << nvc0->num_textures[5]) - 1;
}
nvc0->dirty_3d |= NVC0_NEW_3D_TEXTURES;
 }




[Mesa-dev] [PATCH v2] nvc0: only invalidate currently bound tic/tsc

2016-11-07 Thread Samuel Pitoiset
This is especially useful when switching from compute to 3D.

v2: - get rid of one loop with 'x |= (1ULL << y) - 1' instead

Signed-off-by: Samuel Pitoiset 
---

Tested with Elemental on GK208, works fine.

 src/gallium/drivers/nouveau/nvc0/nvc0_compute.c |  6 +++---
 src/gallium/drivers/nouveau/nvc0/nvc0_tex.c | 11 +++
 src/gallium/drivers/nouveau/nvc0/nve4_compute.c |  6 +++---
 3 files changed, 13 insertions(+), 10 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_compute.c b/src/gallium/drivers/nouveau/nvc0/nvc0_compute.c
index 11635c9..08fa23f 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_compute.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_compute.c
@@ -151,7 +151,7 @@ nvc0_compute_validate_samplers(struct nvc0_context *nvc0)
 
/* Invalidate all 3D samplers because they are aliased. */
for (int s = 0; s < 5; s++)
-  nvc0->samplers_dirty[s] = ~0;
+  nvc0->samplers_dirty[s] |= (1ULL << nvc0->num_samplers[5]) - 1;
nvc0->dirty_3d |= NVC0_NEW_3D_SAMPLERS;
 }
 
@@ -166,9 +166,9 @@ nvc0_compute_validate_textures(struct nvc0_context *nvc0)
 
/* Invalidate all 3D textures because they are aliased. */
for (int s = 0; s < 5; s++) {
-  for (int i = 0; i < nvc0->num_textures[s]; i++)
+  for (int i = 0; i < nvc0->num_textures[5]; i++)
  nouveau_bufctx_reset(nvc0->bufctx_3d, NVC0_BIND_3D_TEX(s, i));
-  nvc0->textures_dirty[s] = ~0;
+  nvc0->textures_dirty[s] |= (1ULL << nvc0->num_textures[5]) - 1;
}
nvc0->dirty_3d |= NVC0_NEW_3D_TEXTURES;
 }
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_tex.c b/src/gallium/drivers/nouveau/nvc0/nvc0_tex.c
index e57391e..8d620a9 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_tex.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_tex.c
@@ -606,9 +606,11 @@ void nvc0_validate_textures(struct nvc0_context *nvc0)
}
 
/* Invalidate all CP textures because they are aliased. */
-   for (int i = 0; i < nvc0->num_textures[5]; i++)
-  nouveau_bufctx_reset(nvc0->bufctx_cp, NVC0_BIND_CP_TEX(i));
-   nvc0->textures_dirty[5] = ~0;
+   for (int s = 0; s < 5; s++) {
+  for (int i = 0; i < nvc0->num_textures[s]; i++)
+ nouveau_bufctx_reset(nvc0->bufctx_cp, NVC0_BIND_CP_TEX(i));
+  nvc0->textures_dirty[5] |= (1ULL << nvc0->num_textures[s]) - 1;
+   }
nvc0->dirty_cp |= NVC0_NEW_CP_TEXTURES;
 }
 
@@ -715,7 +717,8 @@ void nvc0_validate_samplers(struct nvc0_context *nvc0)
}
 
/* Invalidate all CP samplers because they are aliased. */
-   nvc0->samplers_dirty[5] = ~0;
+   for (int s = 0; s < 5; s++)
+  nvc0->samplers_dirty[5] |= (1ULL << nvc0->num_samplers[s]) - 1;
nvc0->dirty_cp |= NVC0_NEW_CP_SAMPLERS;
 }
 
diff --git a/src/gallium/drivers/nouveau/nvc0/nve4_compute.c b/src/gallium/drivers/nouveau/nvc0/nve4_compute.c
index d661c00..33e0ab0 100644
--- a/src/gallium/drivers/nouveau/nvc0/nve4_compute.c
+++ b/src/gallium/drivers/nouveau/nvc0/nve4_compute.c
@@ -307,7 +307,7 @@ nve4_compute_validate_samplers(struct nvc0_context *nvc0)
 
/* Invalidate all 3D samplers because they are aliased. */
for (int s = 0; s < 5; s++)
-  nvc0->samplers_dirty[s] = ~0;
+  nvc0->samplers_dirty[s] |= (1ULL << nvc0->num_samplers[5]) - 1;
nvc0->dirty_3d |= NVC0_NEW_3D_SAMPLERS;
 }
 
@@ -764,9 +764,9 @@ nve4_compute_validate_textures(struct nvc0_context *nvc0)
 
/* Invalidate all 3D textures because they are aliased. */
for (int s = 0; s < 5; s++) {
-  for (int i = 0; i < nvc0->num_textures[s]; i++)
+  for (int i = 0; i < nvc0->num_textures[5]; i++)
  nouveau_bufctx_reset(nvc0->bufctx_3d, NVC0_BIND_3D_TEX(s, i));
-  nvc0->textures_dirty[s] = ~0;
+  nvc0->textures_dirty[s] |= (1ULL << nvc0->num_textures[5]) - 1;
}
nvc0->dirty_3d |= NVC0_NEW_3D_TEXTURES;
 }
-- 
2.10.2
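
The v2 note above replaces `= ~0` with a mask that covers only the low bits, via `x |= (1ULL << y) - 1`. A minimal sketch of that idiom in isolation follows. The helper name `low_bits_mask` is hypothetical and not part of the nouveau driver, and the special case for a count of 64 is an addition here, because shifting a 64-bit value by 64 is undefined in C; whether the driver's counts can ever reach the mask width is not shown in this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from the nouveau driver): return a mask with the
 * low `count` bits set.  `1ULL << 64` is undefined behaviour in C, so a
 * full-width request is handled separately. */
static uint64_t
low_bits_mask(unsigned count)
{
   return count >= 64 ? ~0ULL : (1ULL << count) - 1;
}

/* Usage: OR the mask into an existing dirty word so that bits which were
 * already dirty stay dirty. */
static void
low_bits_mask_example(void)
{
   uint64_t dirty = 0x100;          /* bit 8 was already dirty */

   dirty |= low_bits_mask(3);       /* additionally mark slots 0..2 */
   assert(dirty == 0x107);
}
```

Using `|=` rather than `=` matters for the same reason as in the patch: an assignment would drop dirty bits that were set before the invalidation.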



Re: [Mesa-dev] [PATCH 2/2] i965: Advertise 8 subpixel bits always.

2016-11-07 Thread Francisco Jerez
Chris Forbes  writes:

> The mesa default is 4, but we program the hardware for 8 on all
> generations.
>

I happened to come across this inconsistency a couple of weeks ago -- I
just double-checked that it doesn't cause any conformance regressions,
because some of the rasterization tests use the GL_SUBPIXEL_BITS value
to determine the error tolerance, so increasing the value could
potentially uncover additional approximation errors.  It doesn't seem to
cause any regressions in our CI system though, so the series is:

Reviewed-by: Francisco Jerez 
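
To make the tolerance concern concrete, here is a rough sketch of what the subpixel bit count means numerically: with N fractional bits, positions snap to a grid of step 1/2^N, so going from 4 to 8 bits shrinks the step from 1/16 to 1/256. The function names are hypothetical, the snapping assumes non-negative coordinates and fewer than 32 bits, and the idea that a test derives its tolerance directly from this step is an assumption, not something taken from the conformance suite.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: size of one subpixel step for `bits` fractional bits.
 * Assumes bits < 32. */
static float
subpixel_step(unsigned bits)
{
   return 1.0f / (float)(1u << bits);
}

/* Hypothetical helper: snap a non-negative coordinate to the nearest point
 * on a fixed-point grid with `bits` fractional bits (round half up). */
static int32_t
snap_to_subpixel(float coord, unsigned bits)
{
   return (int32_t)(coord * (float)(1u << bits) + 0.5f);
}

/* Usage: the same coordinate lands on a finer grid with 8 bits than with 4. */
static void
subpixel_example(void)
{
   assert(subpixel_step(4) == 0.0625f);       /* 1/16  */
   assert(subpixel_step(8) == 0.00390625f);   /* 1/256 */
   assert(snap_to_subpixel(10.3f, 4) == 165);
   assert(snap_to_subpixel(10.3f, 8) == 2637);
}
```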

> Signed-off-by: Chris Forbes 
> ---
>  src/mesa/drivers/dri/i965/brw_context.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
> b/src/mesa/drivers/dri/i965/brw_context.c
> index 3085a98..d8174c6 100644
> --- a/src/mesa/drivers/dri/i965/brw_context.c
> +++ b/src/mesa/drivers/dri/i965/brw_context.c
> @@ -538,6 +538,7 @@ brw_initialize_context_constants(struct brw_context *brw)
>ctx->Const.MaxProgramTextureGatherComponents = 1;
>  
> ctx->Const.MaxUniformBlockSize = 65536;
> +   ctx->Const.SubPixelBits = 8;
>  
> for (int i = 0; i < MESA_SHADER_STAGES; i++) {
>struct gl_program_constants *prog = &ctx->Const.Program[i];
> -- 
> 2.10.2
>




Re: [Mesa-dev] [PATCH] gallivm: Fix build after removal of deprecated attribute API v2

2016-11-07 Thread Jan Vesely
On Mon, 2016-11-07 at 21:06 +, Tom Stellard wrote:
> v2:
>   Fix adding parameter attributes with LLVM < 4.0.
> ---
>  src/gallium/auxiliary/draw/draw_llvm.c|  6 +-
>  src/gallium/auxiliary/gallivm/lp_bld_intr.c   | 52 -
>  src/gallium/auxiliary/gallivm/lp_bld_intr.h   | 13 -
>  src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c |  4 +-
>  src/gallium/drivers/radeonsi/si_shader.c  | 69 
> ---
>  src/gallium/drivers/radeonsi/si_shader_tgsi_alu.c | 24 
>  6 files changed, 116 insertions(+), 52 deletions(-)
> 
> diff --git a/src/gallium/auxiliary/draw/draw_llvm.c 
> b/src/gallium/auxiliary/draw/draw_llvm.c
> index 5b4e2a1..5d87318 100644
> --- a/src/gallium/auxiliary/draw/draw_llvm.c
> +++ b/src/gallium/auxiliary/draw/draw_llvm.c
> @@ -1568,8 +1568,7 @@ draw_llvm_generate(struct draw_llvm *llvm, struct 
> draw_llvm_variant *variant,
> LLVMSetFunctionCallConv(variant_func, LLVMCCallConv);
> for (i = 0; i < num_arg_types; ++i)
>if (LLVMGetTypeKind(arg_types[i]) == LLVMPointerTypeKind)
> - LLVMAddAttribute(LLVMGetParam(variant_func, i),
> -  LLVMNoAliasAttribute);
> + lp_add_function_attr(variant_func, i + 1, "noalias", 7);
>  
> context_ptr   = LLVMGetParam(variant_func, 0);
> io_ptr= LLVMGetParam(variant_func, 1);
> @@ -2193,8 +2192,7 @@ draw_gs_llvm_generate(struct draw_llvm *llvm,
>  
> for (i = 0; i < ARRAY_SIZE(arg_types); ++i)
>if (LLVMGetTypeKind(arg_types[i]) == LLVMPointerTypeKind)
> - LLVMAddAttribute(LLVMGetParam(variant_func, i),
> -  LLVMNoAliasAttribute);
> + lp_add_function_attr(variant_func, i + 1, "noalias", 7);
>  
> context_ptr   = LLVMGetParam(variant_func, 0);
> input_array   = LLVMGetParam(variant_func, 1);
> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_intr.c 
> b/src/gallium/auxiliary/gallivm/lp_bld_intr.c
> index f12e735..401e9a2 100644
> --- a/src/gallium/auxiliary/gallivm/lp_bld_intr.c
> +++ b/src/gallium/auxiliary/gallivm/lp_bld_intr.c
> @@ -120,13 +120,57 @@ lp_declare_intrinsic(LLVMModuleRef module,
>  }
>  
>  
> +#if HAVE_LLVM < 0x0400
> +static LLVMAttribute str_to_attr(const char *attr_name, unsigned attr_len)
> +{
> +   if (!strncmp("alwaysinline", attr_name, attr_len)) {
> +  return LLVMAlwaysInlineAttribute;
> +   } else if (!strncmp("byval", attr_name, attr_len)) {
> +  return LLVMByValAttribute;
> +   } else if (!strncmp("inreg", attr_name, attr_len)) {
> +  return LLVMInRegAttribute;
> +   } else if (!strncmp("noalias", attr_name, attr_len)) {
> +  return LLVMNoAliasAttribute;
> +   } else if (!strncmp("readnone", attr_name, attr_len)) {
> +  return LLVMReadNoneAttribute;
> +   } else if (!strncmp("readonly", attr_name, attr_len)) {
> +  return LLVMReadOnlyAttribute;
> +   } else {
> +  _debug_printf("Unhandled function attribute: %s\n", attr_name);
> +  return 0;
> +   }
> +}
> +#endif
> +
> +void
> +lp_add_function_attr(LLVMValueRef function,
> + int attr_idx,
> + const char *attr_name,
> + unsigned attr_len)

Any reason to pass string length by hand rather than local strlen?

> +{
> +
> +#if HAVE_LLVM < 0x0400
> +   LLVMAttribute attr = str_to_attr(attr_name, attr_len);
> +   if (attr_idx == -1) {
> +  LLVMAddFunctionAttr(function, attr);
> +   } else {
> +  LLVMAddAttribute(LLVMGetParam(function, attr_idx), attr);

I think this needs to be attr_idx - 1. LLVM 4.0 counts parameter
attributes from 1 (0 is the return value), as in the changes below:

-LLVMAddAttribute(LLVMGetParam(function, i), LLVMNoAliasAttribute);
+
+lp_add_function_attr(function, i + 1, "noalias", 7);

Jan
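
Both review comments can be illustrated without LLVM. The sketch below uses hypothetical helper names (`attr_idx_to_param`, `attr_name_equals`) that are not part of gallivm. It shows the 1-based to 0-based translation the pre-4.0 path would need when callers pass `i + 1`, and why a hand-passed length is fragile: `sizeof` applied to a `const char *`, as in the `sizeof(attr_str)` call later in this patch, yields the pointer size rather than the string length.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: translate an attribute index that follows the
 * "0 = return value, 1..N = parameters" convention into the 0-based
 * parameter number a LLVMGetParam()-style call expects.  Returns -1 when
 * the index does not name a parameter (e.g. the function index -1). */
static int
attr_idx_to_param(int attr_idx)
{
   return attr_idx >= 1 ? attr_idx - 1 : -1;
}

/* Hypothetical helper: compare whole names, so that a short or wrong length
 * can never turn into a prefix match. */
static int
attr_name_equals(const char *name, const char *expected)
{
   return strcmp(name, expected) == 0;
}

static void
attr_examples(void)
{
   const char *attr_str = "alwaysinline";

   /* Call sites pass i + 1, so parameter 0 arrives as index 1. */
   assert(attr_idx_to_param(0 + 1) == 0);
   assert(attr_idx_to_param(-1) == -1);

   /* sizeof on a pointer is the pointer size, not the string length. */
   assert(sizeof(attr_str) == sizeof(char *));
   assert(strlen(attr_str) == 12);

   /* With a length of 0, strncmp() reports equality for any two strings. */
   assert(strncmp("alwaysinline", "byval", 0) == 0);
   assert(!attr_name_equals("byval", "alwaysinline"));
}
```

Computing the length inside the helper with strlen(), as the review suggests, removes the opportunity to pass a wrong value at each call site.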


> +   }
> +#else
> +   LLVMContextRef context = 
> LLVMGetModuleContext(LLVMGetGlobalParent(function));
> +   unsigned kind_id = LLVMGetEnumAttributeKindForName(attr_name, attr_len);
> +   LLVMAttributeRef attr = LLVMCreateEnumAttribute(context, kind_id, 0);
> +   LLVMAddAttributeAtIndex(function, attr_idx, attr);
> +#endif
> +}
> +
>  LLVMValueRef
>  lp_build_intrinsic(LLVMBuilderRef builder,
> const char *name,
> LLVMTypeRef ret_type,
> LLVMValueRef *args,
> unsigned num_args,
> -   LLVMAttribute attr)
> +   const char *attr_str)
>  {
> LLVMModuleRef module = 
> LLVMGetGlobalParent(LLVMGetBasicBlockParent(LLVMGetInsertBlock(builder)));
> LLVMValueRef function;
> @@ -145,10 +189,14 @@ lp_build_intrinsic(LLVMBuilderRef builder,
>  
>function = lp_declare_intrinsic(module, name, ret_type, arg_types, 
> num_args);
>  
> +  if (attr_str) {
> + lp_add_function_attr(function, -1, attr_str, sizeof(attr_str));
> +  }
> +
>/* NoUnwind indicates that the intrinsic never raises a 

Re: [Mesa-dev] [PATCH 1/2] gallivm: introduce 32x32->64bit lp_build_mul_32_lohi function

2016-11-07 Thread Roland Scheidegger
Am 07.11.2016 um 22:34 schrieb Jose Fonseca:
> On 07/11/16 19:09, Roland Scheidegger wrote:
>> Am 06.11.2016 um 16:50 schrieb Jose Fonseca:
>>> On 04/11/16 04:14, srol...@vmware.com wrote:
 From: Roland Scheidegger 

 This is used by shader umul_hi/imul_hi functions (and soon by draw).
 It's actually useful separating this out on its own, however the real
 reason for doing it is because we're using an optimized sse2 version,
 since the code llvm generates is atrocious (since there's no widening
 mul in llvm, and it does not recognize the widening mul pattern, so
 it generates code for real 64x64->64bit mul, which the cpu can't do
 natively, in contrast to 32x32->64bit mul which it could do).
 ---
  src/gallium/auxiliary/gallivm/lp_bld_arit.c| 150
 +
  src/gallium/auxiliary/gallivm/lp_bld_arit.h|   6 +
  src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c |  54 +++-
  3 files changed, 172 insertions(+), 38 deletions(-)

 diff --git a/src/gallium/auxiliary/gallivm/lp_bld_arit.c
 b/src/gallium/auxiliary/gallivm/lp_bld_arit.c
 index 3ea0734..3de4628 100644
 --- a/src/gallium/auxiliary/gallivm/lp_bld_arit.c
 +++ b/src/gallium/auxiliary/gallivm/lp_bld_arit.c
 @@ -1091,6 +1091,156 @@ lp_build_mul(struct lp_build_context *bld,
 return res;
  }

 +/*
 + * Widening mul, valid for 32x32 bit -> 64bit only.
 + * Result is low 32bits, high bits returned in res_hi.
 + */
 +LLVMValueRef
 +lp_build_mul_32_lohi(struct lp_build_context *bld,
 + LLVMValueRef a,
 + LLVMValueRef b,
 + LLVMValueRef *res_hi)
 +{
 +   struct gallivm_state *gallivm = bld->gallivm;
 +   LLVMBuilderRef builder = gallivm->builder;
 +
 +   assert(bld->type.width == 32);
 +   assert(bld->type.floating == 0);
 +   assert(bld->type.fixed == 0);
 +   assert(bld->type.norm == 0);
 +
 +   /*
 +* XXX: for some reason, with zext/zext/mul/trunc the code llvm
 produces
 +* for x86 simd is atrocious (even if the high bits weren't
 required),
 +* trying to handle real 64bit inputs (which of course can't
 happen due
 +* to using 64bit umul with 32bit numbers zero-extended to
 64bit, but
 +* apparently llvm does not recognize this widening mul). This
 includes 6
 +* (instead of 2) pmuludq plus extra adds and shifts
 +* The same story applies to signed mul, albeit fixing this
 requires sse41.
 +* https://llvm.org/bugs/show_bug.cgi?id=30845
 +* So, whip up our own code, albeit only for length 4 and 8 (which
 +* should be good enough)...
 +*/
 +   if ((bld->type.length == 4 || bld->type.length == 8) &&
 +   ((util_cpu_caps.has_sse2 && (bld->type.sign == 0)) ||
 +util_cpu_caps.has_sse4_1)) {
 +  const char *intrinsic = NULL;
 +  LLVMValueRef aeven, aodd, beven, bodd, muleven, mulodd;
 +  LLVMValueRef shuf[LP_MAX_VECTOR_WIDTH / 32], shuf_vec;
 +  struct lp_type type_wide = lp_wider_type(bld->type);
 +  LLVMTypeRef wider_type = lp_build_vec_type(gallivm, type_wide);
 +  unsigned i;
 +  for (i = 0; i < bld->type.length; i += 2) {
 + shuf[i] = lp_build_const_int32(gallivm, i+1);
 + shuf[i+1] =
 LLVMGetUndef(LLVMInt32TypeInContext(gallivm->context));
 +  }
 +  shuf_vec = LLVMConstVector(shuf, bld->type.length);
 +  aeven = a;
 +  beven = b;
 +  aodd = LLVMBuildShuffleVector(builder, aeven, bld->undef,
 shuf_vec, "");
 +  bodd = LLVMBuildShuffleVector(builder, beven, bld->undef,
 shuf_vec, "");
 +
 +  if (util_cpu_caps.has_avx2 && bld->type.length == 8) {
 + if (bld->type.sign) {
 +intrinsic = "llvm.x86.avx2.pmul.dq";
 + } else {
 +intrinsic = "llvm.x86.avx2.pmulu.dq";
 + }
 + muleven = lp_build_intrinsic_binary(builder, intrinsic,
 + wider_type, aeven,
 beven);
 + mulodd = lp_build_intrinsic_binary(builder, intrinsic,
 +wider_type, aodd, bodd);
 +  }
 +  else {
 + /* for consistent naming look elsewhere... */
 + if (bld->type.sign) {
 +intrinsic = "llvm.x86.sse41.pmuldq";
 + } else {
 +intrinsic = "llvm.x86.sse2.pmulu.dq";
 + }
 + /*
 +  * XXX If we only have AVX but not AVX2 this is a pain.
 +  * lp_build_intrinsic_binary_anylength() can't handle it
 +  * (due to src and dst type not being identical).
 +  */
 + if (bld->type.length == 8) {
 

Re: [Mesa-dev] [PATCH 1/2] gallivm: introduce 32x32->64bit lp_build_mul_32_lohi function

2016-11-07 Thread Jose Fonseca

On 07/11/16 19:09, Roland Scheidegger wrote:

Am 06.11.2016 um 16:50 schrieb Jose Fonseca:

On 04/11/16 04:14, srol...@vmware.com wrote:

From: Roland Scheidegger 

This is used by shader umul_hi/imul_hi functions (and soon by draw).
It's actually useful separating this out on its own, however the real
reason for doing it is because we're using an optimized sse2 version,
since the code llvm generates is atrocious (since there's no widening
mul in llvm, and it does not recognize the widening mul pattern, so
it generates code for real 64x64->64bit mul, which the cpu can't do
natively, in contrast to 32x32->64bit mul which it could do).
---
 src/gallium/auxiliary/gallivm/lp_bld_arit.c| 150
+
 src/gallium/auxiliary/gallivm/lp_bld_arit.h|   6 +
 src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c |  54 +++-
 3 files changed, 172 insertions(+), 38 deletions(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_arit.c
b/src/gallium/auxiliary/gallivm/lp_bld_arit.c
index 3ea0734..3de4628 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_arit.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_arit.c
@@ -1091,6 +1091,156 @@ lp_build_mul(struct lp_build_context *bld,
return res;
 }

+/*
+ * Widening mul, valid for 32x32 bit -> 64bit only.
+ * Result is low 32bits, high bits returned in res_hi.
+ */
+LLVMValueRef
+lp_build_mul_32_lohi(struct lp_build_context *bld,
+ LLVMValueRef a,
+ LLVMValueRef b,
+ LLVMValueRef *res_hi)
+{
+   struct gallivm_state *gallivm = bld->gallivm;
+   LLVMBuilderRef builder = gallivm->builder;
+
+   assert(bld->type.width == 32);
+   assert(bld->type.floating == 0);
+   assert(bld->type.fixed == 0);
+   assert(bld->type.norm == 0);
+
+   /*
+* XXX: for some reason, with zext/zext/mul/trunc the code llvm
produces
+* for x86 simd is atrocious (even if the high bits weren't
required),
+* trying to handle real 64bit inputs (which of course can't
happen due
+* to using 64bit umul with 32bit numbers zero-extended to 64bit, but
+* apparently llvm does not recognize this widening mul). This
includes 6
+* (instead of 2) pmuludq plus extra adds and shifts
+* The same story applies to signed mul, albeit fixing this
requires sse41.
+* https://llvm.org/bugs/show_bug.cgi?id=30845
+* So, whip up our own code, albeit only for length 4 and 8 (which
+* should be good enough)...
+*/
+   if ((bld->type.length == 4 || bld->type.length == 8) &&
+   ((util_cpu_caps.has_sse2 && (bld->type.sign == 0)) ||
+util_cpu_caps.has_sse4_1)) {
+  const char *intrinsic = NULL;
+  LLVMValueRef aeven, aodd, beven, bodd, muleven, mulodd;
+  LLVMValueRef shuf[LP_MAX_VECTOR_WIDTH / 32], shuf_vec;
+  struct lp_type type_wide = lp_wider_type(bld->type);
+  LLVMTypeRef wider_type = lp_build_vec_type(gallivm, type_wide);
+  unsigned i;
+  for (i = 0; i < bld->type.length; i += 2) {
+ shuf[i] = lp_build_const_int32(gallivm, i+1);
+ shuf[i+1] =
LLVMGetUndef(LLVMInt32TypeInContext(gallivm->context));
+  }
+  shuf_vec = LLVMConstVector(shuf, bld->type.length);
+  aeven = a;
+  beven = b;
+  aodd = LLVMBuildShuffleVector(builder, aeven, bld->undef,
shuf_vec, "");
+  bodd = LLVMBuildShuffleVector(builder, beven, bld->undef,
shuf_vec, "");
+
+  if (util_cpu_caps.has_avx2 && bld->type.length == 8) {
+ if (bld->type.sign) {
+intrinsic = "llvm.x86.avx2.pmul.dq";
+ } else {
+intrinsic = "llvm.x86.avx2.pmulu.dq";
+ }
+ muleven = lp_build_intrinsic_binary(builder, intrinsic,
+ wider_type, aeven, beven);
+ mulodd = lp_build_intrinsic_binary(builder, intrinsic,
+wider_type, aodd, bodd);
+  }
+  else {
+ /* for consistent naming look elsewhere... */
+ if (bld->type.sign) {
+intrinsic = "llvm.x86.sse41.pmuldq";
+ } else {
+intrinsic = "llvm.x86.sse2.pmulu.dq";
+ }
+ /*
+  * XXX If we only have AVX but not AVX2 this is a pain.
+  * lp_build_intrinsic_binary_anylength() can't handle it
+  * (due to src and dst type not being identical).
+  */
+ if (bld->type.length == 8) {
+LLVMValueRef aevenlo, aevenhi, bevenlo, bevenhi;
+LLVMValueRef aoddlo, aoddhi, boddlo, boddhi;
+LLVMValueRef muleven2[2], mulodd2[2];
+struct lp_type type_wide_half = type_wide;
+LLVMTypeRef wtype_half;
+type_wide_half.length = 2;
+wtype_half = lp_build_vec_type(gallivm, type_wide_half);
+aevenlo = lp_build_extract_range(gallivm, aeven, 0, 4);
+aevenhi = lp_build_extract_range(gallivm, aeven, 4, 4);
+bevenlo = lp_build_extract_range(gallivm, beven, 0, 4);
+  
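
The even/odd scheme in the quoted patch can be modelled in scalar C. This is only a sketch under stated assumptions: it covers the unsigned, length-4 case, the function names are hypothetical, and the final step that recombines the two product vectors into per-lane low and high halves is an assumption, since that part of the patch is cut off in the quote above.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of a pmuludq-style multiply on a 4 x 32-bit vector: only
 * lanes 0 and 2 are read, and each product is a full 64-bit value. */
static void
mul_even_lanes(const uint32_t a[4], const uint32_t b[4], uint64_t out[2])
{
   out[0] = (uint64_t)a[0] * b[0];
   out[1] = (uint64_t)a[2] * b[2];
}

/* Hypothetical scalar sketch of the even/odd widening multiply: lo[] gets
 * the low 32 bits of each lane's product, hi[] the high 32 bits. */
static void
mul_32_lohi_model(const uint32_t a[4], const uint32_t b[4],
                  uint32_t lo[4], uint32_t hi[4])
{
   uint32_t a_odd[4], b_odd[4];
   uint64_t even[2], odd[2];

   /* Shuffle lane i+1 into lane i for even i.  The remaining lanes are
    * "undef" in the patch; they are zeroed here because they are never read. */
   for (int i = 0; i < 4; i += 2) {
      a_odd[i] = a[i + 1];
      b_odd[i] = b[i + 1];
      a_odd[i + 1] = 0;
      b_odd[i + 1] = 0;
   }

   mul_even_lanes(a, b, even);          /* products of lanes 0 and 2 */
   mul_even_lanes(a_odd, b_odd, odd);   /* products of lanes 1 and 3 */

   /* Interleave the 64-bit products back into per-lane halves. */
   for (int i = 0; i < 2; i++) {
      lo[2 * i]     = (uint32_t)even[i];
      hi[2 * i]     = (uint32_t)(even[i] >> 32);
      lo[2 * i + 1] = (uint32_t)odd[i];
      hi[2 * i + 1] = (uint32_t)(odd[i] >> 32);
   }
}

/* Usage: check every lane against a plain 64-bit multiply. */
static void
mul_32_lohi_example(void)
{
   const uint32_t a[4] = { 0xffffffffu, 2, 0x80000000u, 7 };
   const uint32_t b[4] = { 0xffffffffu, 3, 2, 0x10000000u };
   uint32_t lo[4], hi[4];

   mul_32_lohi_model(a, b, lo, hi);
   for (int i = 0; i < 4; i++) {
      uint64_t ref = (uint64_t)a[i] * b[i];
      assert(lo[i] == (uint32_t)ref);
      assert(hi[i] == (uint32_t)(ref >> 32));
   }
}
```

Two multiplies per vector is the cost the commit message is aiming for, in contrast to the six pmuludq plus adds and shifts it reports for the generic 64x64 path.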

Re: [Mesa-dev] [PATCH] swr: allow alphatest without blend or logicop

2016-11-07 Thread Cherniak, Bruce
Reviewed-by: Bruce Cherniak  

> On Nov 7, 2016, at 1:23 PM, Tim Rowley  wrote:
> 
> We need to compile a blend function when alphatest is enabled.
> ---
> src/gallium/drivers/swr/swr_state.cpp | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/src/gallium/drivers/swr/swr_state.cpp 
> b/src/gallium/drivers/swr/swr_state.cpp
> index 3e02322..424bff2 100644
> --- a/src/gallium/drivers/swr/swr_state.cpp
> +++ b/src/gallium/drivers/swr/swr_state.cpp
> @@ -1300,7 +1300,8 @@ swr_update_derived(struct pipe_context *pipe,
>sizeof(compileState.blendState));
> 
> if (compileState.blendState.blendEnable == false &&
> -compileState.blendState.logicOpEnable == false) {
> +compileState.blendState.logicOpEnable == false &&
> +ctx->depth_stencil->alpha.enabled == 0) {
>SwrSetBlendFunc(ctx->swrContext, target, NULL);
>continue;
> }
> -- 
> 2.7.4
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev



[Mesa-dev] [Bug 98632] Fix build on Hurd without PATH_MAX

2016-11-07 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=98632

Bug ID: 98632
   Summary: Fix build on Hurd without PATH_MAX
   Product: Mesa
   Version: 13.0
  Hardware: Other
OS: other
Status: NEW
  Severity: normal
  Priority: medium
 Component: Mesa core
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: samuel.thiba...@ens-lyon.org
QA Contact: mesa-dev@lists.freedesktop.org

Created attachment 127822
  --> https://bugs.freedesktop.org/attachment.cgi?id=127822&action=edit
proposed fix

Hello,

Version 13.0 of mesa doesn't build on GNU/Hurd any more because of new
occurrences of PATH_MAX, which hurd-i386 doesn't define since it doesn't have
such an arbitrary limitation.

Here is a patch to fix this.

Samuel
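
For readers unfamiliar with the problem, one common way to avoid a fixed PATH_MAX-sized buffer is to grow a heap buffer until the call succeeds. The sketch below does this for POSIX getcwd(), which reports ERANGE when the buffer is too small. It is a generic illustration with a hypothetical helper name, not the content of the attached patch, which is not shown in this message.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: return the current working directory in a malloc'd
 * buffer, doubling the buffer until getcwd() stops failing with ERANGE.
 * The caller frees the result.  Returns NULL on any other failure. */
static char *
get_cwd_dynamic(void)
{
   size_t size = 64;
   char *buf = NULL;

   for (;;) {
      char *tmp = realloc(buf, size);
      if (!tmp) {
         free(buf);
         return NULL;
      }
      buf = tmp;

      if (getcwd(buf, size))
         return buf;

      if (errno != ERANGE) {
         free(buf);
         return NULL;
      }
      size *= 2;   /* buffer was too small, try again with more room */
   }
}

/* Usage: no compile-time path limit is involved anywhere. */
static void
get_cwd_dynamic_example(void)
{
   char *cwd = get_cwd_dynamic();

   assert(cwd != NULL);
   free(cwd);
}
```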

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


[Mesa-dev] [PATCH] gallivm: Fix build after removal of deprecated attribute API v2

2016-11-07 Thread Tom Stellard
v2:
  Fix adding parameter attributes with LLVM < 4.0.
---
 src/gallium/auxiliary/draw/draw_llvm.c|  6 +-
 src/gallium/auxiliary/gallivm/lp_bld_intr.c   | 52 -
 src/gallium/auxiliary/gallivm/lp_bld_intr.h   | 13 -
 src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c |  4 +-
 src/gallium/drivers/radeonsi/si_shader.c  | 69 ---
 src/gallium/drivers/radeonsi/si_shader_tgsi_alu.c | 24 
 6 files changed, 116 insertions(+), 52 deletions(-)

diff --git a/src/gallium/auxiliary/draw/draw_llvm.c b/src/gallium/auxiliary/draw/draw_llvm.c
index 5b4e2a1..5d87318 100644
--- a/src/gallium/auxiliary/draw/draw_llvm.c
+++ b/src/gallium/auxiliary/draw/draw_llvm.c
@@ -1568,8 +1568,7 @@ draw_llvm_generate(struct draw_llvm *llvm, struct draw_llvm_variant *variant,
LLVMSetFunctionCallConv(variant_func, LLVMCCallConv);
for (i = 0; i < num_arg_types; ++i)
   if (LLVMGetTypeKind(arg_types[i]) == LLVMPointerTypeKind)
- LLVMAddAttribute(LLVMGetParam(variant_func, i),
-  LLVMNoAliasAttribute);
+ lp_add_function_attr(variant_func, i + 1, "noalias", 7);
 
context_ptr   = LLVMGetParam(variant_func, 0);
io_ptr= LLVMGetParam(variant_func, 1);
@@ -2193,8 +2192,7 @@ draw_gs_llvm_generate(struct draw_llvm *llvm,
 
for (i = 0; i < ARRAY_SIZE(arg_types); ++i)
   if (LLVMGetTypeKind(arg_types[i]) == LLVMPointerTypeKind)
- LLVMAddAttribute(LLVMGetParam(variant_func, i),
-  LLVMNoAliasAttribute);
+ lp_add_function_attr(variant_func, i + 1, "noalias", 7);
 
context_ptr   = LLVMGetParam(variant_func, 0);
input_array   = LLVMGetParam(variant_func, 1);
diff --git a/src/gallium/auxiliary/gallivm/lp_bld_intr.c b/src/gallium/auxiliary/gallivm/lp_bld_intr.c
index f12e735..401e9a2 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_intr.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_intr.c
@@ -120,13 +120,57 @@ lp_declare_intrinsic(LLVMModuleRef module,
 }
 
 
+#if HAVE_LLVM < 0x0400
+static LLVMAttribute str_to_attr(const char *attr_name, unsigned attr_len)
+{
+   if (!strncmp("alwaysinline", attr_name, attr_len)) {
+  return LLVMAlwaysInlineAttribute;
+   } else if (!strncmp("byval", attr_name, attr_len)) {
+  return LLVMByValAttribute;
+   } else if (!strncmp("inreg", attr_name, attr_len)) {
+  return LLVMInRegAttribute;
+   } else if (!strncmp("noalias", attr_name, attr_len)) {
+  return LLVMNoAliasAttribute;
+   } else if (!strncmp("readnone", attr_name, attr_len)) {
+  return LLVMReadNoneAttribute;
+   } else if (!strncmp("readonly", attr_name, attr_len)) {
+  return LLVMReadOnlyAttribute;
+   } else {
+  _debug_printf("Unhandled function attribute: %s\n", attr_name);
+  return 0;
+   }
+}
+#endif
+
+void
+lp_add_function_attr(LLVMValueRef function,
+ int attr_idx,
+ const char *attr_name,
+ unsigned attr_len)
+{
+
+#if HAVE_LLVM < 0x0400
+   LLVMAttribute attr = str_to_attr(attr_name, attr_len);
+   if (attr_idx == -1) {
+  LLVMAddFunctionAttr(function, attr);
+   } else {
+  LLVMAddAttribute(LLVMGetParam(function, attr_idx), attr);
+   }
+#else
+   LLVMContextRef context = LLVMGetModuleContext(LLVMGetGlobalParent(function));
+   unsigned kind_id = LLVMGetEnumAttributeKindForName(attr_name, attr_len);
+   LLVMAttributeRef attr = LLVMCreateEnumAttribute(context, kind_id, 0);
+   LLVMAddAttributeAtIndex(function, attr_idx, attr);
+#endif
+}
+
 LLVMValueRef
 lp_build_intrinsic(LLVMBuilderRef builder,
const char *name,
LLVMTypeRef ret_type,
LLVMValueRef *args,
unsigned num_args,
-   LLVMAttribute attr)
+   const char *attr_str)
 {
LLVMModuleRef module = 
LLVMGetGlobalParent(LLVMGetBasicBlockParent(LLVMGetInsertBlock(builder)));
LLVMValueRef function;
@@ -145,10 +189,14 @@ lp_build_intrinsic(LLVMBuilderRef builder,
 
   function = lp_declare_intrinsic(module, name, ret_type, arg_types, 
num_args);
 
+  if (attr_str) {
+ lp_add_function_attr(function, -1, attr_str, sizeof(attr_str));
+  }
+
   /* NoUnwind indicates that the intrinsic never raises a C++ exception.
* Set it for all intrinsics.
*/
-  LLVMAddFunctionAttr(function, attr | LLVMNoUnwindAttribute);
+  lp_add_function_attr(function, -1, "nounwind", 8);
 
   if (gallivm_debug & GALLIVM_DEBUG_IR) {
  lp_debug_dump_value(function);
diff --git a/src/gallium/auxiliary/gallivm/lp_bld_intr.h b/src/gallium/auxiliary/gallivm/lp_bld_intr.h
index 7d80ac2..a058de4 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_intr.h
+++ b/src/gallium/auxiliary/gallivm/lp_bld_intr.h
@@ -60,13 +60,24 @@ lp_declare_intrinsic(LLVMModuleRef module,
  LLVMTypeRef *arg_types,
 

[Mesa-dev] [PATCH] anv: Document cmd_buffer_alloc_binding_table

2016-11-07 Thread Jason Ekstrand
Some of the details of this function are very confusing and have a long
history.  We should document that history and this seems like the best
place to do it.

Signed-off-by: Jason Ekstrand 
---
 src/intel/vulkan/anv_batch_chain.c | 71 ++
 1 file changed, 71 insertions(+)

diff --git a/src/intel/vulkan/anv_batch_chain.c 
b/src/intel/vulkan/anv_batch_chain.c
index dfa9abf..1e348cf 100644
--- a/src/intel/vulkan/anv_batch_chain.c
+++ b/src/intel/vulkan/anv_batch_chain.c
@@ -522,6 +522,77 @@ anv_cmd_buffer_grow_batch(struct anv_batch *batch, void 
*_data)
return VK_SUCCESS;
 }
 
+/** Allocate a binding table
+ *
+ * This function allocates a binding table.  This is a bit more complicated
+ * than one would think due to a combination of Vulkan driver design and some
+ * unfortunate hardware restrictions.
+ *
+ * The 3DSTATE_BINDING_TABLE_POINTERS_* packets only have a 16-bit field for
+ * the binding table pointer which means that all binding tables need to live
+ * in the bottom 64k of surface state base address.  The way the GL driver has
+ * classically dealt with this restriction is to emit all surface states
+ * on-the-fly into the batch and have a batch buffer smaller than 64k.  This
+ * isn't really an option in Vulkan for a couple of reasons:
+ *
+ *  1) In Vulkan, we have growing (or chaining) batches so surface states have
+ * to live in their own buffer and we have to be able to re-emit
+ * STATE_BASE_ADDRESS as needed which requires a full pipeline stall.  In
+ * order to avoid emitting STATE_BASE_ADDRESS any more often than needed
+ * (it's not that hard to hit 64k of just binding tables), we allocate
+ * surface state objects up-front when VkImageView is created.  In order
+ * for this to work, surface state objects need to be allocated from a
+ * global buffer.
+ *
+ *  2) We tried to design the surface state system in such a way that it's
+ * already ready for bindless texturing.  The way bindless texturing works
+ * on our hardware is that you have a big pool of surface state objects
+ * (with its own state base address) and the bindless handles are simply
+ * offsets into that pool.  With the architecture we chose, we already
+ * have that pool and it's exactly the same pool that we use for regular
+ * surface states so we should already be ready for bindless.
+ *
+ *  3) For render targets, we need to be able to fill out the surface states
+ * later in vkBeginRenderPass so that we can assign clear colors
+ * correctly.  One way to do this would be to just create the surface
+ * state data and then repeatedly copy it into the surface state BO every
+ * time we have to re-emit STATE_BASE_ADDRESS.  While this works, it's
+ * rather annoying, and just being able to allocate them up-front and
+ * re-use them for the entire render pass is simpler.
+ *
+ * While none of these are technically blockers for emitting state on the fly
+ * like we do in GL, the ability to have a single surface state pool
+ * simplifies things greatly.  Unfortunately, it comes at a cost...
+ *
+ * Because of the 64k limitation of 3DSTATE_BINDING_TABLE_POINTERS_*, we can't
+ * place the binding tables just anywhere in surface state base address.
+ * Because 64k isn't a whole lot of space, we can't simply restrict the
+ * surface state buffer to 64k, we have to be more clever.  The solution we've
+ * chosen is to have a block pool with a maximum size of 2G that starts at
+ * zero and grows in both directions.  All surface states are allocated from
+ * the top of the pool (positive offsets) and we allocate blocks (< 64k) of
+ * binding tables from the bottom of the pool (negative offsets).  Every time
+ * we allocate a new binding table block, we set surface state base address to
+ * point to the bottom of the binding table block.  This way all of the
+ * binding tables in the block are in the bottom 64k of surface state base
+ * address.  When we fill out the binding table, we add the distance between
+ * the bottom of our binding table block and zero of the block pool to the
+ * surface state offsets so that they are correct relative to our new surface
+ * state base address at the bottom of the binding table block.
+ *
+ * \see adjust_relocations_from_block_pool()
+ * \see adjust_relocations_to_block_pool()
+ *
+ * \param[in]  entries        The number of surface state entries the binding
+ *                            table should be able to hold.
+ *
+ * \param[out] state_offset   The offset from surface state base address
+ *                            where the surface states live.  This must be
+ *                            added to the surface state offset when it is
+ *                            written into the binding table entry.
+ *
+ * \return                    An anv_state representing the binding table
+ */
 struct anv_state
 anv_cmd_buffer_alloc_binding_table(struct 
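
The offset arithmetic described in the comment above can be sketched in a few lines. This is only an illustration under the assumptions stated in that comment (binding table blocks at negative pool offsets, surface states at positive ones); struct bt_block and both helpers are made-up names, not anv code:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant part of the block pool: offsets
 * are relative to the pool's zero point, binding table blocks live at
 * negative offsets and surface states at positive ones. */
struct bt_block {
   int32_t block_offset;   /* bottom of the binding table block, <= 0 */
};

/* Distance from the bottom of the binding table block to pool zero.  With
 * surface state base address pointing at the block bottom, this is what
 * has to be added to every surface state offset. */
static uint32_t
bt_state_offset(const struct bt_block *block)
{
   assert(block->block_offset <= 0);
   return (uint32_t)(-(int64_t)block->block_offset);
}

/* Value written into a binding table entry for a surface state allocated
 * at surface_state_offset (relative to pool zero). */
static uint32_t
bt_entry(const struct bt_block *block, uint32_t surface_state_offset)
{
   return surface_state_offset + bt_state_offset(block);
}
```

With a block whose bottom sits at -4096 and a surface state at +256, the entry comes out as 4352, which is the surface state's position as seen from the new base address.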

[Mesa-dev] [PATCH] nir: Avoid an extra NIR op in integer divide lowering.

2016-11-07 Thread Eric Anholt
NIR bools are ~0 for true, so ((unsigned)a >> 31) != 0 -> ((int)a >> 31).
---
 src/compiler/nir/nir_lower_idiv.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/src/compiler/nir/nir_lower_idiv.c 
b/src/compiler/nir/nir_lower_idiv.c
index b1e7aeb03c8a..6726b718aaa5 100644
--- a/src/compiler/nir/nir_lower_idiv.c
+++ b/src/compiler/nir/nir_lower_idiv.c
@@ -101,8 +101,7 @@ convert_instr(nir_builder *bld, nir_alu_instr *alu)
if (is_signed)  {
   /* fix the sign: */
   r = nir_ixor(bld, numer, denom);
-  r = nir_ushr(bld, r, nir_imm_int(bld, 31));
-  r = nir_i2b(bld, r);
+  r = nir_ishr(bld, r, nir_imm_int(bld, 31));
   b = nir_ineg(bld, q);
   q = nir_bcsel(bld, r, b, q);
}
-- 
2.10.2
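
For readers who want to convince themselves of the identity in the commit message, here is a small scalar sketch in plain C. The function names are made up for the illustration, and the second function assumes that >> on a negative int32_t is an arithmetic shift, which ISO C leaves implementation-defined but which the usual compilers provide:

```c
#include <assert.h>
#include <stdint.h>

/* Old lowering: a logical shift leaves 0 or 1, then i2b turns that into a
 * NIR-style boolean (0 or ~0). */
static uint32_t
sign_bool_ushr_i2b(int32_t x)
{
   uint32_t bit = (uint32_t)x >> 31;
   return bit != 0 ? ~0u : 0u;
}

/* New lowering: one arithmetic shift replicates the sign bit into all 32
 * bits.  Assumes >> on a negative signed value shifts arithmetically. */
static uint32_t
sign_bool_ishr(int32_t x)
{
   assert(sizeof(x) == 4);
   return (uint32_t)(x >> 31);
}
```

Both functions agree on every input because only the sign bit of x matters in either version.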



Re: [Mesa-dev] [PATCH 3/7] nvc0: simplify draw parameters upload for vertex shaders

2016-11-07 Thread Samuel Pitoiset



On 11/07/2016 04:36 AM, Ilia Mirkin wrote:

On Tue, Oct 25, 2016 at 3:41 PM, Samuel Pitoiset
 wrote:

Signed-off-by: Samuel Pitoiset 
---
 src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c | 14 ++
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c 
b/src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c
index 138e24d..11fd7eb 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c
@@ -974,19 +974,17 @@ nvc0_draw_vbo(struct pipe_context *pipe, const struct 
pipe_draw_info *info)

nvc0_state_validate_3d(nvc0, ~0);

-   if (nvc0->vertprog->vp.need_draw_parameters) {
+   if (nvc0->vertprog->vp.need_draw_parameters && !info->indirect) {
   PUSH_SPACE(push, 9);
   BEGIN_NVC0(push, NVC0_3D(CB_SIZE), 3);
   PUSH_DATA (push, NVC0_CB_AUX_SIZE);
   PUSH_DATAh(push, screen->uniform_bo->offset + NVC0_CB_AUX_INFO(0));
   PUSH_DATA (push, screen->uniform_bo->offset + NVC0_CB_AUX_INFO(0));
-  if (!info->indirect) {
- BEGIN_1IC0(push, NVC0_3D(CB_POS), 1 + 3);
- PUSH_DATA (push, NVC0_CB_AUX_DRAW_INFO);
- PUSH_DATA (push, info->index_bias);
- PUSH_DATA (push, info->start_instance);
- PUSH_DATA (push, info->drawid);
-  }
+  BEGIN_1IC0(push, NVC0_3D(CB_POS), 4);


I'd very much prefer this to stay as "1 + 3". Otherwise, this is

Reviewed-by: Ilia Mirkin 


Okay.




+  PUSH_DATA (push, NVC0_CB_AUX_DRAW_INFO);
+  PUSH_DATA (push, info->index_bias);
+  PUSH_DATA (push, info->start_instance);
+  PUSH_DATA (push, info->drawid);
}

if (nvc0->screen->base.class_3d < NVE4_3D_CLASS &&
--
2.10.1



Re: [Mesa-dev] [PATCH 2/7] nvc0: only update primitive restart for indexed draws

2016-11-07 Thread Samuel Pitoiset



On 11/07/2016 04:34 AM, Ilia Mirkin wrote:

Primitive restart is a thing for non-indexed draws too. There's a
method that controls it - NVC0_3D_PRIM_RESTART_WITH_DRAW_ARRAYS. The
more recent GL 4.5 behaviour is to *not* do primitive restart for
draw arrays, however the older behavior is to do it. I don't think
there's clear direction on this, and I believe at least some piglits
expect the restart behavior. So for now, I think you should leave this
out, but keep it around for when this confusion is resolved.


Yes, I know that primitive restart behaviour has changed since GL 4.5



On Tue, Oct 25, 2016 at 3:41 PM, Samuel Pitoiset
 wrote:

There is no need to update it on every draw call, especially for
non-indexed draws. This is similar to what nv50 already does.

Signed-off-by: Samuel Pitoiset 
---
 src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c 
b/src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c
index bc4ab9e..138e24d 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c
@@ -1050,8 +1050,6 @@ nvc0_draw_vbo(struct pipe_context *pipe, const struct 
pipe_draw_info *info)
nvc0->idxbuf.buffer->flags & PIPE_RESOURCE_FLAG_MAP_COHERENT)
   nvc0->base.vbo_dirty = true;

-   nvc0_update_prim_restart(nvc0, info->primitive_restart, 
info->restart_index);
-
if (nvc0->base.vbo_dirty) {
   if (nvc0->screen->eng3d->oclass < GM107_3D_CLASS)
  IMMED_NVC0(push, NVC0_3D(VERTEX_ARRAY_FLUSH), 0);
@@ -1067,6 +1065,9 @@ nvc0_draw_vbo(struct pipe_context *pipe, const struct 
pipe_draw_info *info)
if (info->indexed) {
   bool shorten = info->max_index <= 65535;

+  nvc0_update_prim_restart(nvc0, info->primitive_restart,
+   info->restart_index);
+
   if (info->primitive_restart && info->restart_index > 65535)
  shorten = false;

--
2.10.1



Re: [Mesa-dev] [PATCH 1/7] nvc0: reduce the number of PUSH_SPACE in draw path

2016-11-07 Thread Samuel Pitoiset



On 11/07/2016 04:32 AM, Ilia Mirkin wrote:

On Wed, Oct 26, 2016 at 4:14 AM, Samuel Pitoiset
 wrote:



On 10/25/2016 09:49 PM, Ilia Mirkin wrote:


What if instance_count = 1M? (It can happen.)



We allocate a giant space in the pushbuf in one shot. Well, anyways this is
not the optimization of the year, so I can drop it. :-)


There are limits to pushbuf sizes. Either drop it, or batch the
instance draws. There's really very limited advantage to doing it this
way though, since PUSH_SPACE is a no-op unless the pushbuf is full.


Yes, it's really minor, I will discard it.
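
For the record, the "batch the instance draws" alternative mentioned above could look roughly like the sketch below: reserve space in bounded chunks rather than once per instance or once for the whole draw. Everything here is hypothetical (struct fake_push, fake_push_space(), the 1024-dword cap); it only models the counting, not the real nouveau pushbuf API:

```c
#include <assert.h>
#include <stdint.h>

#define WORDS_PER_INSTANCE 6u
/* Arbitrary illustrative cap on a single reservation, in dwords. */
#define MAX_RESERVE_WORDS  1024u

/* Hypothetical stand-in for a push buffer; it only counts what happens. */
struct fake_push {
   uint32_t reserve_calls;
   uint32_t max_reserved;       /* largest single reservation seen */
   uint32_t instances_emitted;
};

static void
fake_push_space(struct fake_push *push, uint32_t words)
{
   push->reserve_calls++;
   if (words > push->max_reserved)
      push->max_reserved = words;
}

static void
draw_instances_batched(struct fake_push *push, uint32_t instance_count)
{
   const uint32_t per_batch = MAX_RESERVE_WORDS / WORDS_PER_INSTANCE;

   assert(per_batch > 0);
   while (instance_count > 0) {
      uint32_t n = instance_count < per_batch ? instance_count : per_batch;

      /* One bounded reservation per batch. */
      fake_push_space(push, n * WORDS_PER_INSTANCE);
      for (uint32_t i = 0; i < n; i++)
         push->instances_emitted++;   /* per-instance methods would go here */
      instance_count -= n;
   }
}
```

With instance_count = 1M this never asks for more than the cap in one go, at the cost of a few thousand reservation calls instead of one.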








On Tue, Oct 25, 2016 at 3:41 PM, Samuel Pitoiset
 wrote:


This might help CPU-bound applications but should not have
any real effect on GPU-bound ones.

Signed-off-by: Samuel Pitoiset 
---
 src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c | 8 +++-
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c
b/src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c
index 69ca091..bc4ab9e 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c
@@ -598,8 +598,8 @@ nvc0_draw_arrays(struct nvc0_context *nvc0,

prim = nvc0_prim_gl(mode);

+   PUSH_SPACE(push, 6 * instance_count);
while (instance_count--) {
-  PUSH_SPACE(push, 6);
   BEGIN_NVC0(push, NVC0_3D(VERTEX_BEGIN_GL), 1);
   PUSH_DATA (push, prim);
   BEGIN_NVC0(push, NVC0_3D(VERTEX_BUFFER_FIRST), 2);
@@ -730,10 +730,9 @@ nvc0_draw_elements(struct nvc0_context *nvc0, bool
shorten,
}

if (nvc0->idxbuf.buffer) {
-  PUSH_SPACE(push, 1);
+  PUSH_SPACE(push, 1 + 7 * instance_count);
   IMMED_NVC0(push, NVC0_3D(VERTEX_BEGIN_GL), prim);
   do {
- PUSH_SPACE(push, 7);
  BEGIN_NVC0(push, NVC0_3D(INDEX_BATCH_FIRST), 2);
  PUSH_DATA (push, start);
  PUSH_DATA (push, count);
@@ -747,8 +746,8 @@ nvc0_draw_elements(struct nvc0_context *nvc0, bool
shorten,
} else {
   const void *data = nvc0->idxbuf.user_buffer;

+  PUSH_SPACE(push, 3 * instance_count);
   while (instance_count--) {
- PUSH_SPACE(push, 2);
  BEGIN_NVC0(push, NVC0_3D(VERTEX_BEGIN_GL), 1);
  PUSH_DATA (push, prim);
  switch (index_size) {
@@ -768,7 +767,6 @@ nvc0_draw_elements(struct nvc0_context *nvc0, bool
shorten,
 assert(0);
 return;
  }
- PUSH_SPACE(push, 1);
  IMMED_NVC0(push, NVC0_3D(VERTEX_END_GL), 0);

  prim |= NVC0_3D_VERTEX_BEGIN_GL_INSTANCE_NEXT;
--
2.10.1




--
-Samuel



Re: [Mesa-dev] [PATCH 00/25] anv: A major rework of color attachment surface states

2016-11-07 Thread Pohjolainen, Topi
On Thu, Oct 27, 2016 at 10:08:33AM +0300, Pohjolainen, Topi wrote:
> On Wed, Oct 26, 2016 at 07:11:20PM +0300, Pohjolainen, Topi wrote:
> > 
> > I had a few questions in 1 and 3, but regardless patches 1-7 and 9-13 are:
> > 
> > Reviewed-by: Topi Pohjolainen 
> 
> 14-17 are also:
> 
> Reviewed-by: Topi Pohjolainen 

I had some questions/suggestions in 19 and 21 but

18-22 and 24 are also

Reviewed-by: Topi Pohjolainen 

For 23 and 25 I don't know the context that well and I'd rather have
another opinion there.


Re: [Mesa-dev] [PATCH 2/2] i965: Advertise 8 subpixel bits always.

2016-11-07 Thread Anuj Phogat
On Sun, Nov 6, 2016 at 10:45 PM, Chris Forbes  wrote:

> The mesa default is 4, but we program the hardware for 8 on all
> generations.
>
> Signed-off-by: Chris Forbes 
> ---
>  src/mesa/drivers/dri/i965/brw_context.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_context.c
> b/src/mesa/drivers/dri/i965/brw_context.c
> index 3085a98..d8174c6 100644
> --- a/src/mesa/drivers/dri/i965/brw_context.c
> +++ b/src/mesa/drivers/dri/i965/brw_context.c
> @@ -538,6 +538,7 @@ brw_initialize_context_constants(struct brw_context
> *brw)
>ctx->Const.MaxProgramTextureGatherComponents = 1;
>
> ctx->Const.MaxUniformBlockSize = 65536;
> +   ctx->Const.SubPixelBits = 8;
>
> for (int i = 0; i < MESA_SHADER_STAGES; i++) {
>struct gl_program_constants *prog = &ctx->Const.Program[i];
> --
> 2.10.2
>


Both patches are:

Reviewed-by: Anuj Phogat 


Re: [Mesa-dev] [PATCH] gallivm: Fix build after removal of deprecated attribute API

2016-11-07 Thread Jan Vesely
On Mon, 2016-11-07 at 18:44 +, Tom Stellard wrote:
> ---
> 
> Build tested only so far.
> 
>  src/gallium/auxiliary/draw/draw_llvm.c|  6 +-
>  src/gallium/auxiliary/gallivm/lp_bld_intr.c   | 48 +++-
>  src/gallium/auxiliary/gallivm/lp_bld_intr.h   | 13 -
>  src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c |  4 +-
>  src/gallium/drivers/radeonsi/si_shader.c  | 69 
> ---
>  src/gallium/drivers/radeonsi/si_shader_tgsi_alu.c | 24 
>  6 files changed, 112 insertions(+), 52 deletions(-)
> 
> diff --git a/src/gallium/auxiliary/draw/draw_llvm.c 
> b/src/gallium/auxiliary/draw/draw_llvm.c
> index 5b4e2a1..5d87318 100644
> --- a/src/gallium/auxiliary/draw/draw_llvm.c
> +++ b/src/gallium/auxiliary/draw/draw_llvm.c
> @@ -1568,8 +1568,7 @@ draw_llvm_generate(struct draw_llvm *llvm, struct 
> draw_llvm_variant *variant,
> LLVMSetFunctionCallConv(variant_func, LLVMCCallConv);
> for (i = 0; i < num_arg_types; ++i)
>if (LLVMGetTypeKind(arg_types[i]) == LLVMPointerTypeKind)
> - LLVMAddAttribute(LLVMGetParam(variant_func, i),
> -  LLVMNoAliasAttribute);
> + lp_add_function_attr(variant_func, i + 1, "noalias", 7);
>  
> context_ptr   = LLVMGetParam(variant_func, 0);
> io_ptr= LLVMGetParam(variant_func, 1);
> @@ -2193,8 +2192,7 @@ draw_gs_llvm_generate(struct draw_llvm *llvm,
>  
> for (i = 0; i < ARRAY_SIZE(arg_types); ++i)
>if (LLVMGetTypeKind(arg_types[i]) == LLVMPointerTypeKind)
> - LLVMAddAttribute(LLVMGetParam(variant_func, i),
> -  LLVMNoAliasAttribute);
> + lp_add_function_attr(variant_func, i + 1, "noalias", 7);
>  
> context_ptr   = LLVMGetParam(variant_func, 0);
> input_array   = LLVMGetParam(variant_func, 1);
> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_intr.c 
> b/src/gallium/auxiliary/gallivm/lp_bld_intr.c
> index f12e735..55afe6d 100644
> --- a/src/gallium/auxiliary/gallivm/lp_bld_intr.c
> +++ b/src/gallium/auxiliary/gallivm/lp_bld_intr.c
> @@ -120,13 +120,53 @@ lp_declare_intrinsic(LLVMModuleRef module,
>  }
>  
>  
> +#if HAVE_LLVM < 0x0400
> +static LLVMAttribute str_to_attr(const char *attr_name, unsigned attr_len)
> +{
> +   if (!strncmp("alwaysinline", attr_name, attr_len)) {
> +  return LLVMAlwaysInlineAttribute;
> +   } else if (!strncmp("byval", attr_name, attr_len)) {
> +  return LLVMByValAttribute;
> +   } else if (!strncmp("inreg", attr_name, attr_len)) {
> +  return LLVMInRegAttribute;
> +   } else if (!strncmp("noalias", attr_name, attr_len)) {
> +  return LLVMNoAliasAttribute;
> +   } else if (!strncmp("readnone", attr_name, attr_len)) {
> +  return LLVMReadNoneAttribute;
> +   } else if (!strncmp("readonly", attr_name, attr_len)) {
> +  return LLVMReadOnlyAttribute;
> +   } else {
> +  _debug_printf("Unhandled function attribute: %s\n", attr_name);
> +  return 0;
> +   }
> +}
> +#endif
> +
> +void
> +lp_add_function_attr(LLVMValueRef function,
> + unsigned attr_idx,
> + const char *attr_name,
> + unsigned attr_len)
> +{
> +
> +#if HAVE_LLVM < 0x0400
> +   LLVMAttribute attr = str_to_attr(attr_name, attr_len);
> +   LLVMAddFunctionAttr(function, attr);

I think this will fail with argument attributes, since it ignores
attr_idx. I think you need something like:

if (attr_idx == -1)
  LLVMAddFunctionAttr(function, attr);
else
  LLVMAddAttribute(LLVMGetParam(function, attr_idx - 1), attr);

Jan
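
To spell out the index convention behind that suggestion: LLVMAddAttributeAtIndex() takes -1 (all ones) for the function itself, 0 for the return value and 1..N for the parameters, while LLVMGetParam() is zero-based. A self-contained sketch of the mapping, assuming the 1-based parameter indices that the callers in this patch pass (i + 1); the ATTR_* constants and helper names are local to the sketch and only mirror the LLVM values:

```c
#include <assert.h>
#include <stdbool.h>

/* Local mirror of the index convention used by LLVMAddAttributeAtIndex():
 * -1 names the function itself, 0 the return value, 1..N the parameters. */
enum {
   ATTR_FUNCTION_INDEX = -1,
   ATTR_RETURN_INDEX = 0,
};

static bool
attr_idx_is_param(int attr_idx)
{
   return attr_idx > ATTR_RETURN_INDEX;
}

/* Zero-based parameter number, as LLVMGetParam() expects, for an attribute
 * index that names a parameter. */
static unsigned
attr_idx_to_param(int attr_idx)
{
   assert(attr_idx_is_param(attr_idx));
   return (unsigned)(attr_idx - 1);
}
```

Note that the comparison against -1 only reads naturally if attr_idx is a signed int in the prototype.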

> +#else
> +   LLVMContextRef context = 
> LLVMGetModuleContext(LLVMGetGlobalParent(function));
> +   unsigned kind_id = LLVMGetEnumAttributeKindForName(attr_name, attr_len);
> +   LLVMAttributeRef attr = LLVMCreateEnumAttribute(context, kind_id, 0);
> +   LLVMAddAttributeAtIndex(function, attr_idx, attr);
> +#endif
> +}
> +
>  LLVMValueRef
>  lp_build_intrinsic(LLVMBuilderRef builder,
> const char *name,
> LLVMTypeRef ret_type,
> LLVMValueRef *args,
> unsigned num_args,
> -   LLVMAttribute attr)
> +   const char *attr_str)
>  {
> LLVMModuleRef module = 
> LLVMGetGlobalParent(LLVMGetBasicBlockParent(LLVMGetInsertBlock(builder)));
> LLVMValueRef function;
> @@ -145,10 +185,14 @@ lp_build_intrinsic(LLVMBuilderRef builder,
>  
>function = lp_declare_intrinsic(module, name, ret_type, arg_types, 
> num_args);
>  
> +  if (attr_str) {
> + lp_add_function_attr(function, -1, attr_str, strlen(attr_str));
> +  }
> +
>/* NoUnwind indicates that the intrinsic never raises a C++ exception.
> * Set it for all intrinsics.
> */
> -  LLVMAddFunctionAttr(function, attr | LLVMNoUnwindAttribute);
> +  lp_add_function_attr(function, -1, "nounwind", 8);
>  
>if (gallivm_debug & GALLIVM_DEBUG_IR) {
>  

Re: [Mesa-dev] [PATCH 24/25] Allocate a null state whenever there is depth/stencil

2016-11-07 Thread Pohjolainen, Topi
On Sat, Oct 22, 2016 at 10:50:55AM -0700, Jason Ekstrand wrote:
> ---
>  src/intel/vulkan/genX_cmd_buffer.c | 19 ++-
>  1 file changed, 10 insertions(+), 9 deletions(-)

Reviewed-by: Topi Pohjolainen 

> 
> diff --git a/src/intel/vulkan/genX_cmd_buffer.c 
> b/src/intel/vulkan/genX_cmd_buffer.c
> index f43c643..06a0686 100644
> --- a/src/intel/vulkan/genX_cmd_buffer.c
> +++ b/src/intel/vulkan/genX_cmd_buffer.c
> @@ -180,18 +180,19 @@ genX(cmd_buffer_setup_attachments)(struct 
> anv_cmd_buffer *cmd_buffer,
> }
>  
> bool need_null_state = false;
> -   for (uint32_t s = 0; s < pass->subpass_count; ++s) {
> -  if (pass->subpasses[s].color_count == 0) {
> - need_null_state = true;
> - break;
> -  }
> -   }
> -
> -   unsigned num_states = need_null_state;
> +   unsigned num_states = 0;
> for (uint32_t i = 0; i < pass->attachment_count; ++i) {
> -  if (vk_format_is_color(pass->attachments[i].format))
> +  if (vk_format_is_color(pass->attachments[i].format)) {
>   num_states++;
> +  } else {
> + /* We need a null state for any depth-stencil-only subpasses.
> +  * Importantly, this includes depth/stencil clears so we create one
> +  * whenever we have depth or stencil
> +  */
> + need_null_state = true;
> +  }
> }
> +   num_states += need_null_state;
>  
> const uint32_t ss_stride = align_u32(isl_dev->ss.size, isl_dev->ss.align);
> state->render_pass_states =
> -- 
> 2.5.0.400.gff86faf
> 


Re: [Mesa-dev] [PATCH 22/25] anv/blorp: Use the new clear_attachments entrypoint for attachment clears

2016-11-07 Thread Pohjolainen, Topi
On Sat, Oct 22, 2016 at 10:50:53AM -0700, Jason Ekstrand wrote:
> This allows us to re-use the surface states emitted from the Vulkan driver
> instead of blorp creating its own.
> ---
>  src/intel/vulkan/anv_blorp.c | 93 
> +---
>  1 file changed, 52 insertions(+), 41 deletions(-)
> 
> diff --git a/src/intel/vulkan/anv_blorp.c b/src/intel/vulkan/anv_blorp.c
> index f495815..b62ea0b 100644
> --- a/src/intel/vulkan/anv_blorp.c
> +++ b/src/intel/vulkan/anv_blorp.c
> @@ -890,6 +890,22 @@ anv_cmd_buffer_alloc_blorp_binding_table(struct 
> anv_cmd_buffer *cmd_buffer,
>  state_offset);
>assert(bt_state.map != NULL);
> }
> +
> +   return bt_state;

As I commented on one of the earlier patches, this return belongs in that patch already.

> +}
> +
> +static uint32_t
> +binding_table_for_surface_state(struct anv_cmd_buffer *cmd_buffer,
> +struct anv_state surface_state)
> +{
> +   uint32_t state_offset;
> +   struct anv_state bt_state =
> +  anv_cmd_buffer_alloc_blorp_binding_table(cmd_buffer, 1, &state_offset);
> +
> +   uint32_t *bt_map = bt_state.map;
> +   bt_map[0] = surface_state.offset + state_offset;

Okay, I need to in general study how the binding table allocation works in
detail (and how surface state offsets are relative to that). Even though I
don't understand this fully yet, this does match the existing logic we have in
blorp_emit_surface_states() and therefore this patch is:

Reviewed-by: Topi Pohjolainen 

> +
> +   return bt_state.offset;
>  }
>  
>  static void
> @@ -898,32 +914,31 @@ clear_color_attachment(struct anv_cmd_buffer 
> *cmd_buffer,
> const VkClearAttachment *attachment,
> uint32_t rectCount, const VkClearRect *pRects)
>  {
> -   const struct anv_framebuffer *fb = cmd_buffer->state.framebuffer;
> const struct anv_subpass *subpass = cmd_buffer->state.subpass;
> -   const uint32_t att = attachment->colorAttachment;
> -   const struct anv_image_view *iview =
> -  fb->attachments[subpass->color_attachments[att]];
> -   const struct anv_image *image = iview->image;
> +   const uint32_t color_att = attachment->colorAttachment;
> +   const uint32_t att_idx = subpass->color_attachments[color_att];
> +   struct anv_render_pass_attachment *pass_att =
> +  &cmd_buffer->state.pass->attachments[att_idx];
> +   struct anv_attachment_state *att_state =
> +  &cmd_buffer->state.attachments[att_idx];
>  
> -   struct blorp_surf surf;
> -   get_blorp_surf_for_anv_image(image, VK_IMAGE_ASPECT_COLOR_BIT, &surf);
> +   uint32_t binding_table =
> +  binding_table_for_surface_state(cmd_buffer, att_state->color_rt_state);
>  
> union isl_color_value clear_color;
> memcpy(clear_color.u32, attachment->clearValue.color.uint32,
>sizeof(clear_color.u32));
>  
> -   static const bool color_write_disable[4] = { false, false, false, false };
> -
> for (uint32_t r = 0; r < rectCount; ++r) {
>const VkOffset2D offset = pRects[r].rect.offset;
>const VkExtent2D extent = pRects[r].rect.extent;
> -  blorp_clear(batch, &surf, iview->isl.format, iview->isl.swizzle,
> -  iview->isl.base_level,
> -  iview->isl.base_array_layer + pRects[r].baseArrayLayer,
> -  pRects[r].layerCount,
> -  offset.x, offset.y,
> -  offset.x + extent.width, offset.y + extent.height,
> -  clear_color, color_write_disable);
> +  blorp_clear_attachments(batch, binding_table,
> +  ISL_FORMAT_UNSUPPORTED, pass_att->samples,
> +  pRects[r].baseArrayLayer,
> +  pRects[r].layerCount,
> +  offset.x, offset.y,
> +  offset.x + extent.width, offset.y + 
> extent.height,
> +  true, clear_color, false, 0.0f, 0, 0);
> }
>  }
>  
> @@ -933,44 +948,40 @@ clear_depth_stencil_attachment(struct anv_cmd_buffer 
> *cmd_buffer,
> const VkClearAttachment *attachment,
> uint32_t rectCount, const VkClearRect *pRects)
>  {
> -   const struct anv_framebuffer *fb = cmd_buffer->state.framebuffer;
> +   static const union isl_color_value color_value = { .u32 = { 0, } };
> const struct anv_subpass *subpass = cmd_buffer->state.subpass;
> -   const struct anv_image_view *iview =
> -  fb->attachments[subpass->depth_stencil_attachment];
> -   const struct anv_image *image = iview->image;
> +   const uint32_t att_idx = subpass->depth_stencil_attachment;
> +   struct anv_render_pass_attachment *pass_att =
> +  _buffer->state.pass->attachments[att_idx];
>  
> bool clear_depth = attachment->aspectMask & VK_IMAGE_ASPECT_DEPTH_BIT;
> bool clear_stencil = attachment->aspectMask & VK_IMAGE_ASPECT_STENCIL_BIT;
> 

[Mesa-dev] [PATCH] swr: allow alphatest without blend or logicop

2016-11-07 Thread Tim Rowley
We need to compile a blend function when alphatest is enabled.
---
 src/gallium/drivers/swr/swr_state.cpp | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/swr/swr_state.cpp 
b/src/gallium/drivers/swr/swr_state.cpp
index 3e02322..424bff2 100644
--- a/src/gallium/drivers/swr/swr_state.cpp
+++ b/src/gallium/drivers/swr/swr_state.cpp
@@ -1300,7 +1300,8 @@ swr_update_derived(struct pipe_context *pipe,
sizeof(compileState.blendState));
 
 if (compileState.blendState.blendEnable == false &&
-compileState.blendState.logicOpEnable == false) {
+compileState.blendState.logicOpEnable == false &&
+ctx->depth_stencil->alpha.enabled == 0) {
SwrSetBlendFunc(ctx->swrContext, target, NULL);
continue;
 }
-- 
2.7.4



Re: [Mesa-dev] [PATCH 1/2] gallivm: introduce 32x32->64bit lp_build_mul_32_lohi function

2016-11-07 Thread Roland Scheidegger
On 06.11.2016 at 16:50, Jose Fonseca wrote:
> On 04/11/16 04:14, srol...@vmware.com wrote:
>> From: Roland Scheidegger 
>>
>> This is used by shader umul_hi/imul_hi functions (and soon by draw).
>> It's actually useful separating this out on its own, however the real
>> reason for doing it is because we're using an optimized sse2 version,
>> since the code llvm generates is atrocious (since there's no widening
>> mul in llvm, and it does not recognize the widening mul pattern, so
>> it generates code for real 64x64->64bit mul, which the cpu can't do
>> natively, in contrast to 32x32->64bit mul which it could do).
>> ---
>>  src/gallium/auxiliary/gallivm/lp_bld_arit.c| 150
>> +
>>  src/gallium/auxiliary/gallivm/lp_bld_arit.h|   6 +
>>  src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c |  54 +++-
>>  3 files changed, 172 insertions(+), 38 deletions(-)
>>
>> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_arit.c
>> b/src/gallium/auxiliary/gallivm/lp_bld_arit.c
>> index 3ea0734..3de4628 100644
>> --- a/src/gallium/auxiliary/gallivm/lp_bld_arit.c
>> +++ b/src/gallium/auxiliary/gallivm/lp_bld_arit.c
>> @@ -1091,6 +1091,156 @@ lp_build_mul(struct lp_build_context *bld,
>> return res;
>>  }
>>
>> +/*
>> + * Widening mul, valid for 32x32 bit -> 64bit only.
>> + * Result is low 32bits, high bits returned in res_hi.
>> + */
>> +LLVMValueRef
>> +lp_build_mul_32_lohi(struct lp_build_context *bld,
>> + LLVMValueRef a,
>> + LLVMValueRef b,
>> + LLVMValueRef *res_hi)
>> +{
>> +   struct gallivm_state *gallivm = bld->gallivm;
>> +   LLVMBuilderRef builder = gallivm->builder;
>> +
>> +   assert(bld->type.width == 32);
>> +   assert(bld->type.floating == 0);
>> +   assert(bld->type.fixed == 0);
>> +   assert(bld->type.norm == 0);
>> +
>> +   /*
>> +* XXX: for some reason, with zext/zext/mul/trunc the code llvm
>> produces
>> +* for x86 simd is atrocious (even if the high bits weren't
>> required),
>> +* trying to handle real 64bit inputs (which of course can't
>> happen due
>> +* to using 64bit umul with 32bit numbers zero-extended to 64bit, but
>> +* apparently llvm does not recognize this widening mul). This
>> includes 6
>> +* (instead of 2) pmuludq plus extra adds and shifts
>> +* The same story applies to signed mul, albeit fixing this
>> requires sse41.
>> +* https://llvm.org/bugs/show_bug.cgi?id=30845
>> +* So, whip up our own code, albeit only for length 4 and 8 (which
>> +* should be good enough)...
>> +*/
>> +   if ((bld->type.length == 4 || bld->type.length == 8) &&
>> +   ((util_cpu_caps.has_sse2 && (bld->type.sign == 0)) ||
>> +util_cpu_caps.has_sse4_1)) {
>> +  const char *intrinsic = NULL;
>> +  LLVMValueRef aeven, aodd, beven, bodd, muleven, mulodd;
>> +  LLVMValueRef shuf[LP_MAX_VECTOR_WIDTH / 32], shuf_vec;
>> +  struct lp_type type_wide = lp_wider_type(bld->type);
>> +  LLVMTypeRef wider_type = lp_build_vec_type(gallivm, type_wide);
>> +  unsigned i;
>> +  for (i = 0; i < bld->type.length; i += 2) {
>> + shuf[i] = lp_build_const_int32(gallivm, i+1);
>> + shuf[i+1] =
>> LLVMGetUndef(LLVMInt32TypeInContext(gallivm->context));
>> +  }
>> +  shuf_vec = LLVMConstVector(shuf, bld->type.length);
>> +  aeven = a;
>> +  beven = b;
>> +  aodd = LLVMBuildShuffleVector(builder, aeven, bld->undef,
>> shuf_vec, "");
>> +  bodd = LLVMBuildShuffleVector(builder, beven, bld->undef,
>> shuf_vec, "");
>> +
>> +  if (util_cpu_caps.has_avx2 && bld->type.length == 8) {
>> + if (bld->type.sign) {
>> +intrinsic = "llvm.x86.avx2.pmul.dq";
>> + } else {
>> +intrinsic = "llvm.x86.avx2.pmulu.dq";
>> + }
>> + muleven = lp_build_intrinsic_binary(builder, intrinsic,
>> + wider_type, aeven, beven);
>> + mulodd = lp_build_intrinsic_binary(builder, intrinsic,
>> +wider_type, aodd, bodd);
>> +  }
>> +  else {
>> + /* for consistent naming look elsewhere... */
>> + if (bld->type.sign) {
>> +intrinsic = "llvm.x86.sse41.pmuldq";
>> + } else {
>> +intrinsic = "llvm.x86.sse2.pmulu.dq";
>> + }
>> + /*
>> +  * XXX If we only have AVX but not AVX2 this is a pain.
>> +  * lp_build_intrinsic_binary_anylength() can't handle it
>> +  * (due to src and dst type not being identical).
>> +  */
>> + if (bld->type.length == 8) {
>> +LLVMValueRef aevenlo, aevenhi, bevenlo, bevenhi;
>> +LLVMValueRef aoddlo, aoddhi, boddlo, boddhi;
>> +LLVMValueRef muleven2[2], mulodd2[2];
>> +struct lp_type type_wide_half = type_wide;
>> +LLVMTypeRef wtype_half;
>> +type_wide_half.length = 2;

[Mesa-dev] [PATCH] gallivm: Fix build after removal of deprecated attribute API

2016-11-07 Thread Tom Stellard
---

Build tested only so far.

 src/gallium/auxiliary/draw/draw_llvm.c|  6 +-
 src/gallium/auxiliary/gallivm/lp_bld_intr.c   | 48 +++-
 src/gallium/auxiliary/gallivm/lp_bld_intr.h   | 13 -
 src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c |  4 +-
 src/gallium/drivers/radeonsi/si_shader.c  | 69 ---
 src/gallium/drivers/radeonsi/si_shader_tgsi_alu.c | 24 
 6 files changed, 112 insertions(+), 52 deletions(-)

diff --git a/src/gallium/auxiliary/draw/draw_llvm.c 
b/src/gallium/auxiliary/draw/draw_llvm.c
index 5b4e2a1..5d87318 100644
--- a/src/gallium/auxiliary/draw/draw_llvm.c
+++ b/src/gallium/auxiliary/draw/draw_llvm.c
@@ -1568,8 +1568,7 @@ draw_llvm_generate(struct draw_llvm *llvm, struct 
draw_llvm_variant *variant,
LLVMSetFunctionCallConv(variant_func, LLVMCCallConv);
for (i = 0; i < num_arg_types; ++i)
   if (LLVMGetTypeKind(arg_types[i]) == LLVMPointerTypeKind)
- LLVMAddAttribute(LLVMGetParam(variant_func, i),
-  LLVMNoAliasAttribute);
+ lp_add_function_attr(variant_func, i + 1, "noalias", 7);
 
context_ptr   = LLVMGetParam(variant_func, 0);
io_ptr= LLVMGetParam(variant_func, 1);
@@ -2193,8 +2192,7 @@ draw_gs_llvm_generate(struct draw_llvm *llvm,
 
for (i = 0; i < ARRAY_SIZE(arg_types); ++i)
   if (LLVMGetTypeKind(arg_types[i]) == LLVMPointerTypeKind)
- LLVMAddAttribute(LLVMGetParam(variant_func, i),
-  LLVMNoAliasAttribute);
+ lp_add_function_attr(variant_func, i + 1, "noalias", 7);
 
context_ptr   = LLVMGetParam(variant_func, 0);
input_array   = LLVMGetParam(variant_func, 1);
diff --git a/src/gallium/auxiliary/gallivm/lp_bld_intr.c 
b/src/gallium/auxiliary/gallivm/lp_bld_intr.c
index f12e735..55afe6d 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_intr.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_intr.c
@@ -120,13 +120,53 @@ lp_declare_intrinsic(LLVMModuleRef module,
 }
 
 
+#if HAVE_LLVM < 0x0400
+static LLVMAttribute str_to_attr(const char *attr_name, unsigned attr_len)
+{
+   if (!strncmp("alwaysinline", attr_name, attr_len)) {
+  return LLVMAlwaysInlineAttribute;
+   } else if (!strncmp("byval", attr_name, attr_len)) {
+  return LLVMByValAttribute;
+   } else if (!strncmp("inreg", attr_name, attr_len)) {
+  return LLVMInRegAttribute;
+   } else if (!strncmp("noalias", attr_name, attr_len)) {
+  return LLVMNoAliasAttribute;
+   } else if (!strncmp("readnone", attr_name, attr_len)) {
+  return LLVMReadNoneAttribute;
+   } else if (!strncmp("readonly", attr_name, attr_len)) {
+  return LLVMReadOnlyAttribute;
+   } else {
+  _debug_printf("Unhandled function attribute: %s\n", attr_name);
+  return 0;
+   }
+}
+#endif
+
+void
+lp_add_function_attr(LLVMValueRef function,
+ unsigned attr_idx,
+ const char *attr_name,
+ unsigned attr_len)
+{
+
+#if HAVE_LLVM < 0x0400
+   LLVMAttribute attr = str_to_attr(attr_name, attr_len);
+   LLVMAddFunctionAttr(function, attr);
+#else
+   LLVMContextRef context = 
LLVMGetModuleContext(LLVMGetGlobalParent(function));
+   unsigned kind_id = LLVMGetEnumAttributeKindForName(attr_name, attr_len);
+   LLVMAttributeRef attr = LLVMCreateEnumAttribute(context, kind_id, 0);
+   LLVMAddAttributeAtIndex(function, attr_idx, attr);
+#endif
+}
+
 LLVMValueRef
 lp_build_intrinsic(LLVMBuilderRef builder,
const char *name,
LLVMTypeRef ret_type,
LLVMValueRef *args,
unsigned num_args,
-   LLVMAttribute attr)
+   const char *attr_str)
 {
LLVMModuleRef module = 
LLVMGetGlobalParent(LLVMGetBasicBlockParent(LLVMGetInsertBlock(builder)));
LLVMValueRef function;
@@ -145,10 +185,14 @@ lp_build_intrinsic(LLVMBuilderRef builder,
 
   function = lp_declare_intrinsic(module, name, ret_type, arg_types, 
num_args);
 
+  if (attr_str) {
+ lp_add_function_attr(function, -1, attr_str, strlen(attr_str));
+  }
+
   /* NoUnwind indicates that the intrinsic never raises a C++ exception.
* Set it for all intrinsics.
*/
-  LLVMAddFunctionAttr(function, attr | LLVMNoUnwindAttribute);
+  lp_add_function_attr(function, -1, "nounwind", 8);
 
   if (gallivm_debug & GALLIVM_DEBUG_IR) {
  lp_debug_dump_value(function);
diff --git a/src/gallium/auxiliary/gallivm/lp_bld_intr.h 
b/src/gallium/auxiliary/gallivm/lp_bld_intr.h
index 7d80ac2..b4558dc 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_intr.h
+++ b/src/gallium/auxiliary/gallivm/lp_bld_intr.h
@@ -60,13 +60,24 @@ lp_declare_intrinsic(LLVMModuleRef module,
  LLVMTypeRef *arg_types,
  unsigned num_args);
 
+void
+lp_remove_attr(LLVMValueRef value,
+   const char *attr_name,
+   unsigned 

Re: [Mesa-dev] [PATCH 3/3] nvc0: refactor textures/samplers validation

2016-11-07 Thread Samuel Pitoiset



On 11/07/2016 04:30 AM, Ilia Mirkin wrote:

Patches 1-2 seem OK. I'm a little concerned that this one is changing
functionality, since it's removing the "need_flush" thing. It'd be
nice if you could get this patch some heavier testing before pushing
it out...


Yes, it now does more fine-grained texture flushes. I will run piglit on a
few cards and check Elemental on Fermi/Kepler to be sure the validation is
still correct.




On Wed, Oct 26, 2016 at 4:00 PM, Samuel Pitoiset
 wrote:

The first goal is to reduce code duplication between 3d and
compute and increase readability of that area.

This refactoring also tries to reduce the number of commands
sent through the pushbuffer and to avoid invalidating all caches
when binding new textures/samplers. I don't see any improvement
with Elemental, but this might help in some cases.

Signed-off-by: Samuel Pitoiset 
---
 src/gallium/drivers/nouveau/nvc0/nvc0_compute.c |  12 +-
 src/gallium/drivers/nouveau/nvc0/nvc0_context.h |   7 +-
 src/gallium/drivers/nouveau/nvc0/nvc0_tex.c | 159 ++--
 src/gallium/drivers/nouveau/nvc0/nve4_compute.c |  98 ++-
 4 files changed, 113 insertions(+), 163 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_compute.c 
b/src/gallium/drivers/nouveau/nvc0/nvc0_compute.c
index 11635c9..041cf1c 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_compute.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_compute.c
@@ -143,11 +143,7 @@ nvc0_screen_compute_setup(struct nvc0_screen *screen,
 static void
 nvc0_compute_validate_samplers(struct nvc0_context *nvc0)
 {
-   bool need_flush = nvc0_validate_tsc(nvc0, 5);
-   if (need_flush) {
-  BEGIN_NVC0(nvc0->base.pushbuf, NVC0_CP(TSC_FLUSH), 1);
-  PUSH_DATA (nvc0->base.pushbuf, 0);
-   }
+   nvc0_validate_tsc(nvc0, 5);

/* Invalidate all 3D samplers because they are aliased. */
for (int s = 0; s < 5; s++)
@@ -158,11 +154,7 @@ nvc0_compute_validate_samplers(struct nvc0_context *nvc0)
 static void
 nvc0_compute_validate_textures(struct nvc0_context *nvc0)
 {
-   bool need_flush = nvc0_validate_tic(nvc0, 5);
-   if (need_flush) {
-  BEGIN_NVC0(nvc0->base.pushbuf, NVC0_CP(TIC_FLUSH), 1);
-  PUSH_DATA (nvc0->base.pushbuf, 0);
-   }
+   nvc0_validate_tic(nvc0, 5);

/* Invalidate all 3D textures because they are aliased. */
for (int s = 0; s < 5; s++) {
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_context.h 
b/src/gallium/drivers/nouveau/nvc0/nvc0_context.h
index 37aecae..8750edc 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_context.h
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_context.h
@@ -330,9 +330,10 @@ extern void nvc0_clear(struct pipe_context *, unsigned 
buffers,
 extern void nvc0_init_surface_functions(struct nvc0_context *);

 /* nvc0_tex.c */
-bool nvc0_validate_tic(struct nvc0_context *nvc0, int s);
-bool nvc0_validate_tsc(struct nvc0_context *nvc0, int s);
-bool nve4_validate_tsc(struct nvc0_context *nvc0, int s);
+void nvc0_validate_tic(struct nvc0_context *nvc0, int s);
+void nvc0_validate_tsc(struct nvc0_context *nvc0, int s);
+void nve4_validate_tic(struct nvc0_context *nvc0, int s);
+void nve4_validate_tsc(struct nvc0_context *nvc0, int s);
 void nvc0_validate_suf(struct nvc0_context *nvc0, int s);
 void nvc0_validate_textures(struct nvc0_context *);
 void nvc0_validate_samplers(struct nvc0_context *);
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_tex.c 
b/src/gallium/drivers/nouveau/nvc0/nvc0_tex.c
index 23c9daa..4f6788c 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_tex.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_tex.c
@@ -24,6 +24,7 @@
 #include "nvc0/nvc0_resource.h"
 #include "nvc0/gm107_texture.xml.h"
 #include "nvc0/nvc0_compute.xml.h"
+#include "nvc0/nve4_compute.xml.h"
 #include "nv50/g80_texture.xml.h"
 #include "nv50/g80_defs.xml.h"

@@ -468,14 +469,13 @@ nvc0_update_tic(struct nvc0_context *nvc0, struct 
nv50_tic_entry *tic,
tic->tic[2] |= address >> 32;
 }

-bool
+void
 nvc0_validate_tic(struct nvc0_context *nvc0, int s)
 {
-   uint32_t commands[32];
struct nouveau_pushbuf *push = nvc0->base.pushbuf;
+   uint32_t commands[3][16];
+   unsigned n[3] = { 0, 0, 0 };
unsigned i;
-   unsigned n = 0;
-   bool need_flush = false;

for (i = 0; i < nvc0->num_textures[s]; ++i) {
   struct nv50_tic_entry *tic = nv50_tic_entry(nvc0->textures[s][i]);
@@ -484,7 +484,7 @@ nvc0_validate_tic(struct nvc0_context *nvc0, int s)

   if (!tic) {
  if (dirty)
-commands[n++] = (i << 1) | 0;
+commands[0][n[0]++] = (i << 1) | 0;
  continue;
   }
   res = nv04_resource(tic->pipe.texture);
@@ -496,15 +496,11 @@ nvc0_validate_tic(struct nvc0_context *nvc0, int s)
  nvc0_m2mf_push_linear(&nvc0->base, nvc0->screen->txc, tic->id * 32,
NV_VRAM_DOMAIN(&nvc0->screen->base), 32,
tic->tic);
- need_flush = true;
+
+   

Re: [Mesa-dev] [PATCH 1/3] gallium/hud: fix a problem where objects are free'd while in use.

2016-11-07 Thread Steven Toth
On Mon, Nov 7, 2016 at 12:32 PM, Nicolai Hähnle  wrote:
> Looks good to me as well, and pushed! Thanks for the respin and sorry it
> took so long.

You are very welcome. No apology necessary.
Thank you for your due diligence and prior feedback.

-- 
Steven Toth - Kernel Labs
http://www.kernellabs.com


Re: [Mesa-dev] [PATCH 1/3] gallium/hud: fix a problem where objects are free'd while in use.

2016-11-07 Thread Steven Toth
On Mon, Nov 7, 2016 at 12:37 PM, Laurent Carlier  wrote:
> Le lundi 7 novembre 2016, 18:32:07 CET Nicolai Hähnle a écrit :
>> Looks good to me as well, and pushed! Thanks for the respin and sorry it
>> took so long.
>>
>> Cheers,
>> Nicolai
>>
>
> Maybe cc 13.0 ? It's buggy with 13.0 and it will be a nice fix

I'm the new guy, so I don't get to make that call, but given the
nature of the fix (for some applications with multiple surfaces) -
I'd certainly recommend it.

-- 
Steven Toth - Kernel Labs
http://www.kernellabs.com


Re: [Mesa-dev] [PATCH 1/3] gallium/hud: fix a problem where objects are free'd while in use.

2016-11-07 Thread Laurent Carlier
Le lundi 7 novembre 2016, 18:32:07 CET Nicolai Hähnle a écrit :
> Looks good to me as well, and pushed! Thanks for the respin and sorry it
> took so long.
> 
> Cheers,
> Nicolai
> 

Maybe cc 13.0 ? It's buggy with 13.0 and it will be a nice fix

-- 
Laurent Carlier
http://www.archlinux.org



Re: [Mesa-dev] [PATCH 1/3] gallium/hud: fix a problem where objects are free'd while in use.

2016-11-07 Thread Nicolai Hähnle
Looks good to me as well, and pushed! Thanks for the respin and sorry it 
took so long.


Cheers,
Nicolai

On 24.10.2016 16:10, Steven Toth wrote:

Instead of trying to maintain a reference counted list of valid HUD
objects, and freeing them accordingly, creating race conditions
between unanticipated multiple threads, simply accept they're
allocated once and never released until the process terminates.

They're a shared resource between multiple threads, so accept
they're always available for use.

Signed-off-by: Steven Toth 
---
 src/gallium/auxiliary/hud/hud_cpufreq.c  | 13 -
 src/gallium/auxiliary/hud/hud_diskstat.c | 13 -
 src/gallium/auxiliary/hud/hud_nic.c  | 13 -
 src/gallium/auxiliary/hud/hud_sensors_temp.c | 16 
 4 files changed, 55 deletions(-)

diff --git a/src/gallium/auxiliary/hud/hud_cpufreq.c 
b/src/gallium/auxiliary/hud/hud_cpufreq.c
index 4501bbb..bfc748b 100644
--- a/src/gallium/auxiliary/hud/hud_cpufreq.c
+++ b/src/gallium/auxiliary/hud/hud_cpufreq.c
@@ -112,14 +112,6 @@ query_cfi_load(struct hud_graph *gr)
}
 }

-static void
-free_query_data(void *p)
-{
-   struct cpufreq_info *cfi = (struct cpufreq_info *)p;
-   list_del(&cfi->list);
-   FREE(cfi);
-}
-
 /**
   * Create and initialize a new object for a specific CPU.
   * \param  pane  parent context.
@@ -162,11 +154,6 @@ hud_cpufreq_graph_install(struct hud_pane *pane, int 
cpu_index,
gr->query_data = cfi;
gr->query_new_value = query_cfi_load;

-   /* Don't use free() as our callback as that messes up Gallium's
-* memory debugger.  Use simple free_query_data() wrapper.
-*/
-   gr->free_query_data = free_query_data;
-
hud_pane_add_graph(pane, gr);
hud_pane_set_max_value(pane, 300 /* 3 GHz */);
 }
diff --git a/src/gallium/auxiliary/hud/hud_diskstat.c 
b/src/gallium/auxiliary/hud/hud_diskstat.c
index b248baf..7d4f500 100644
--- a/src/gallium/auxiliary/hud/hud_diskstat.c
+++ b/src/gallium/auxiliary/hud/hud_diskstat.c
@@ -162,14 +162,6 @@ query_dsi_load(struct hud_graph *gr)
}
 }

-static void
-free_query_data(void *p)
-{
-   struct diskstat_info *nic = (struct diskstat_info *) p;
-   list_del(&nic->list);
-   FREE(nic);
-}
-
 /**
   * Create and initialize a new object for a specific block I/O device.
   * \param  pane  parent context.
@@ -208,11 +200,6 @@ hud_diskstat_graph_install(struct hud_pane *pane, const 
char *dev_name,
gr->query_data = dsi;
gr->query_new_value = query_dsi_load;

-   /* Don't use free() as our callback as that messes up Gallium's
-* memory debugger.  Use simple free_query_data() wrapper.
-*/
-   gr->free_query_data = free_query_data;
-
hud_pane_add_graph(pane, gr);
hud_pane_set_max_value(pane, 100);
 }
diff --git a/src/gallium/auxiliary/hud/hud_nic.c 
b/src/gallium/auxiliary/hud/hud_nic.c
index fb6b8c0..719dd04 100644
--- a/src/gallium/auxiliary/hud/hud_nic.c
+++ b/src/gallium/auxiliary/hud/hud_nic.c
@@ -234,14 +234,6 @@ query_nic_load(struct hud_graph *gr)
}
 }

-static void
-free_query_data(void *p)
-{
-   struct nic_info *nic = (struct nic_info *) p;
-   list_del(&nic->list);
-   FREE(nic);
-}
-
 /**
   * Create and initialize a new object for a specific network interface dev.
   * \param  pane  parent context.
@@ -284,11 +276,6 @@ hud_nic_graph_install(struct hud_pane *pane, const char 
*nic_name,
gr->query_data = nic;
gr->query_new_value = query_nic_load;

-   /* Don't use free() as our callback as that messes up Gallium's
-* memory debugger.  Use simple free_query_data() wrapper.
-*/
-   gr->free_query_data = free_query_data;
-
hud_pane_add_graph(pane, gr);
hud_pane_set_max_value(pane, 100);
 }
diff --git a/src/gallium/auxiliary/hud/hud_sensors_temp.c 
b/src/gallium/auxiliary/hud/hud_sensors_temp.c
index e41b847..4a8a4fc 100644
--- a/src/gallium/auxiliary/hud/hud_sensors_temp.c
+++ b/src/gallium/auxiliary/hud/hud_sensors_temp.c
@@ -189,17 +189,6 @@ query_sti_load(struct hud_graph *gr)
}
 }

-static void
-free_query_data(void *p)
-{
-   struct sensors_temp_info *sti = (struct sensors_temp_info *) p;
-   list_del(&sti->list);
-   if (sti->chip)
-  sensors_free_chip_name(sti->chip);
-   FREE(sti);
-   sensors_cleanup();
-}
-
 /**
   * Create and initialize a new object for a specific sensor interface dev.
   * \param  pane  parent context.
@@ -237,11 +226,6 @@ hud_sensors_temp_graph_install(struct hud_pane *pane, 
const char *dev_name,
gr->query_data = sti;
gr->query_new_value = query_sti_load;

-   /* Don't use free() as our callback as that messes up Gallium's
-* memory debugger.  Use simple free_query_data() wrapper.
-*/
-   gr->free_query_data = free_query_data;
-
hud_pane_add_graph(pane, gr);
switch (sti->mode) {
case SENSORS_TEMP_CURRENT:
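
The "allocate once, never release" idea from the commit message above can be sketched roughly as follows. This is only an illustration, not the Mesa code: `struct hud_info`, `hud_info_get()` and the mutex are hypothetical names, and the lock around the shared list is an assumption borrowed from the intent of patch 3/3 in this series.

```c
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a per-device HUD info object (not Mesa's struct). */
struct hud_info {
   char name[32];
   struct hud_info *next;
};

/* Shared between all threads; entries are added but never removed. */
static struct hud_info *info_list = NULL;
static pthread_mutex_t info_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Return the shared object for `name`, creating it on first use.
 * Objects are never freed, so a caller may keep the pointer for the
 * lifetime of the process.  Names are assumed to fit in 31 characters. */
static struct hud_info *
hud_info_get(const char *name)
{
   struct hud_info *it;

   pthread_mutex_lock(&info_mutex);
   for (it = info_list; it; it = it->next) {
      if (strcmp(it->name, name) == 0)
         break;
   }
   if (!it) {
      it = calloc(1, sizeof(*it));
      if (it) {
         snprintf(it->name, sizeof(it->name), "%s", name);
         it->next = info_list;
         info_list = it;
      }
   }
   pthread_mutex_unlock(&info_mutex);
   return it;
}
```

Because nothing is ever unlinked or freed, a graph that still holds a pointer can never see it become dangling; the cost is a small, bounded amount of memory that lives until the process exits.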




Re: [Mesa-dev] [PATCH 4/8] i965: Store mcs buffer size

2016-11-07 Thread Ben Widawsky

On 16-11-07 10:22:46, Lionel Landwerlin wrote:

On 07/11/16 10:07, Pohjolainen, Topi wrote:

On Thu, Nov 03, 2016 at 10:39:39AM +, Lionel Landwerlin wrote:

From: Ben Widawsky 

libdrm may round up the allocation requested by mesa. As a result, accesses
through the gtt may end up accessing memory which does not belong to mesa. The
problem is described in the following commit:
commit 7ae870211ddc40ef6ed209a322c3a721214bb737
Author: Eric Anholt 
Date:   Mon Apr 14 16:52:43 2014 -0700

i965: Fix buffer overruns in MSAA MCS buf This size field is an alternate

In that patch this was solved by making sure we only 1'd the logical size of the
buffer. This patch becomes necessary because the miptree data structure is going
to go away in the upcoming patch and we won't have access to the total_height
field anymore.

v2: drop setting the size in intel_hiz_miptree_buf_create() (Lionel)

Signed-off-by: Ben Widawsky  (v1)
Signed-off-by: Lionel Landwerlin  (v2)
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 4 ++--
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h | 9 +
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index c2bff17..3d1bdb1 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -1503,8 +1503,7 @@ intel_miptree_init_mcs(struct brw_context *brw,
   return;
}
void *data = mt->mcs_buf->bo->virtual;
-   memset(data, init_value,
-  mt->mcs_buf->mt->total_height * mt->mcs_buf->mt->pitch);

If I read the previous patch right, this is already needed there as
mcs_buf->mt is left NULL?


Thanks, that's wrong indeed.
I think it makes sense to squash patch 3 & 4.



Sounds good to me.




+   memset(data, init_value, mt->mcs_buf->size);
drm_intel_bo_unmap(mt->mcs_buf->bo);
mt->fast_clear_state = INTEL_FAST_CLEAR_STATE_CLEAR;
 }
@@ -1545,6 +1544,7 @@ intel_mcs_miptree_buf_create(struct brw_context *brw,
buf->bo = temp_mt->bo;
buf->offset = temp_mt->offset;
+   buf->size = temp_mt->total_height * temp_mt->pitch;
buf->pitch = temp_mt->pitch;
buf->qpitch = temp_mt->qpitch;
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
index 0b49dc2..0b4b353 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
@@ -350,6 +350,15 @@ struct intel_miptree_aux_buffer
 */
uint32_t offset;
+   /*
+* Size of the MCS surface.
+*
+* This is needed when doing any gtt mapped operations on the buffer (which
+* will be Y-tiled). It is possible that it will not be the same as bo->size
+* when the drm allocator rounds up the requested size.
+*/
+   size_t size;
+
/**
 * Pitch in bytes.
 *
--
2.10.2
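
The reason for storing a separate logical size, as described in the commit message above, can be illustrated with a small standalone sketch. It is only an analogy under stated assumptions: `struct aux_buffer`, `round_up()` and the 4096-byte granularity are hypothetical, and plain heap memory stands in for a GTT mapping.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical buffer wrapper; not the i965 structs. */
struct aux_buffer {
   unsigned char *map;   /* CPU mapping of the allocation */
   size_t alloc_size;    /* what the allocator actually handed back */
   size_t size;          /* logical size = rows * pitch */
};

/* Round `n` up to a multiple of `align` (align must be a power of two). */
static size_t
round_up(size_t n, size_t align)
{
   return (n + align - 1) & ~(align - 1);
}

/* Create the buffer.  The allocator may round the request up, so the
 * logical size is recorded separately.  Returns nonzero on success. */
static int
aux_buffer_create(struct aux_buffer *buf, size_t rows, size_t pitch)
{
   buf->size = rows * pitch;
   buf->alloc_size = round_up(buf->size, 4096);
   buf->map = calloc(1, buf->alloc_size);
   return buf->map != NULL;
}

/* Initialize only the logical part of the buffer, never alloc_size. */
static void
aux_buffer_init(struct aux_buffer *buf, int value)
{
   memset(buf->map, value, buf->size);
}
```

With 3 rows of 128 bytes, the logical size is 384 while the rounded allocation is 4096; initializing by `size` leaves the padding untouched.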



[Mesa-dev] [Bug 98555] NWN won't start on 13.0

2016-11-07 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=98555

Emil Velikov  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |NOTOURBUG

--- Comment #8 from Emil Velikov  ---
Skimming through __nwu_possible - I might have gone overboard calling it "brain
dead". Sorry about that :-\

That said, please check if the issue persists without the nwuser wrapper. If
so, feel free to reopen. Thanks!

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH 3/3] gallium/hud: protect against and initialization race

2016-11-07 Thread Steven Toth
>> A humble ping on this and the two others.
>>
>> - Steve
>>
>
> Series looks OK to me.
>
> Reviewed-by: Brian Paul 
>
> Need me to push these for you?
>
> -Brian
>

That would be helpful, yes please.

I think they fell through the cracks after the last round of reviews.

Thanks.

-- 
Steven Toth - Kernel Labs
http://www.kernellabs.com


Re: [Mesa-dev] [PATCH 1/2] swr: [rasterizer core]: set depth hottile when depth bounds test enabled

2016-11-07 Thread Rowley, Timothy O
Reviewed-by: Tim Rowley

On Nov 1, 2016, at 3:45 PM, Ilia Mirkin wrote:

Signed-off-by: Ilia Mirkin
---
src/gallium/drivers/swr/rasterizer/core/api.cpp | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/swr/rasterizer/core/api.cpp 
b/src/gallium/drivers/swr/rasterizer/core/api.cpp
index 5f941e8..b1a426d 100644
--- a/src/gallium/drivers/swr/rasterizer/core/api.cpp
+++ b/src/gallium/drivers/swr/rasterizer/core/api.cpp
@@ -950,9 +950,11 @@ void SetupPipeline(DRAW_CONTEXT *pDC)
// have to check for the special case where depth/stencil test is enabled 
but depthwrite is disabled.
pState->state.depthHottileEnable = 
((!(pState->state.depthStencilState.depthTestEnable &&
   
!pState->state.depthStencilState.depthWriteEnable &&
+   
!pState->state.depthBoundsState.depthBoundsTestEnable &&
   
pState->state.depthStencilState.depthTestFunc == ZFUNC_ALWAYS)) &&

(pState->state.depthStencilState.depthTestEnable ||
- 
pState->state.depthStencilState.depthWriteEnable)) ? true : false;
+ 
pState->state.depthStencilState.depthWriteEnable ||
+ 
pState->state.depthBoundsState.depthBoundsTestEnable)) ? true : false;

pState->state.stencilHottileEnable = 
(((!(pState->state.depthStencilState.stencilTestEnable &&
 
!pState->state.depthStencilState.stencilWriteEnable &&
--
2.7.3




Re: [Mesa-dev] [PATCH 2/2] swr: add support for EXT_depth_bounds_test

2016-11-07 Thread Rowley, Timothy O
We suspect the remaining failure might be due to not quantizing the depth 
bounds min/max values.  That can be addressed in a future patch.

Reviewed-by: Tim Rowley

On Nov 1, 2016, at 3:45 PM, Ilia Mirkin wrote:

Signed-off-by: Ilia Mirkin
---

This fails one sub-case of the piglit depth_bounds test:

Test 10, bounds=(0.00, 0.50), z=(0.50, 0.50, 0.50, 0.50)
Probe color at (0,20)
 Expected: 255 255 255
 Observed: 26 26 26

I'm blaming it on the floating point boogey man.

src/gallium/drivers/swr/swr_screen.cpp | 2 +-
src/gallium/drivers/swr/swr_state.cpp  | 6 ++
2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/swr/swr_screen.cpp 
b/src/gallium/drivers/swr/swr_screen.cpp
index 704a684..fa16edd 100644
--- a/src/gallium/drivers/swr/swr_screen.cpp
+++ b/src/gallium/drivers/swr/swr_screen.cpp
@@ -332,7 +332,7 @@ swr_get_param(struct pipe_screen *screen, enum pipe_cap 
param)
   case PIPE_CAP_MAX_SHADER_PATCH_VARYINGS:
  return 0;
   case PIPE_CAP_DEPTH_BOUNDS_TEST:
-  return 0; // xxx
+  return 1;
   case PIPE_CAP_TEXTURE_FLOAT_LINEAR:
   case PIPE_CAP_TEXTURE_HALF_FLOAT_LINEAR:
  return 1;
diff --git a/src/gallium/drivers/swr/swr_state.cpp 
b/src/gallium/drivers/swr/swr_state.cpp
index 3e02322..d8a8ee1 100644
--- a/src/gallium/drivers/swr/swr_state.cpp
+++ b/src/gallium/drivers/swr/swr_state.cpp
@@ -1205,6 +1205,7 @@ swr_update_derived(struct pipe_context *pipe,
  struct pipe_depth_state *depth = &(ctx->depth_stencil->depth);
  struct pipe_stencil_state *stencil = ctx->depth_stencil->stencil;
  SWR_DEPTH_STENCIL_STATE depthStencilState = {{0}};
+  SWR_DEPTH_BOUNDS_STATE depthBoundsState = {0};

  /* XXX, incomplete.  Need to flesh out stencil & alpha test state
  struct pipe_stencil_state *front_stencil =
@@ -1251,6 +1252,11 @@ swr_update_derived(struct pipe_context *pipe,
  depthStencilState.depthTestFunc = swr_convert_depth_func(depth->func);
  depthStencilState.depthWriteEnable = depth->writemask;
  SwrSetDepthStencilState(ctx->swrContext, &depthStencilState);
+
+  depthBoundsState.depthBoundsTestEnable = depth->bounds_test;
+  depthBoundsState.depthBoundsTestMinValue = depth->bounds_min;
+  depthBoundsState.depthBoundsTestMaxValue = depth->bounds_max;
+  SwrSetDepthBoundsState(ctx->swrContext, &depthBoundsState);
   }

   /* Blend State */
--
2.7.3




Re: [Mesa-dev] [PATCH v3] vl/dri3: use external texture as back buffers(v3)

2016-11-07 Thread Leo Liu



On 11/07/2016 11:31 AM, Nayan Deshmukh wrote:



On Mon, Nov 7, 2016 at 8:31 PM, Leo Liu wrote:




On 11/05/2016 02:44 AM, Nayan Deshmukh wrote:

Hi Leo,

Thanks for the reference patch.

  There are only a number of output surfaces taking turns
as the
mixer render targets, so we probably can use the same pixmap
  corresponding to each of output surface texture.

The mixer renders to a VdpOutputSurface which is provided to
it by the
application, so we can't make any assumptions on the surface
that will
be provided it may or may not be the same. Instead we could have
additional fields in vlVdpOutputSurface which stores the
handle and
pixmap of the texture.


What I meant is in vl dri3 to store certain numbers of pixmaps,
and update them when texture, handle, size
changed by calling "pixmap_from_buffer", if the same buffer got
reused, and then we just can use the same pixmap
for present.

I just tried the mpv, if no resizing, there are only 3 textures in
turn.

I think we should avoid this "creating new pixmap frame by frame",
what do you think?

I agree this needs to be avoided.

   /* In case of a single gpu we need to get the
* handle and pixmap for the texture that is set
*/
if (buffer && scrn->output_texture &&
!scrn->is_different_gpu)
   allocate_new_buffer = true;

For this case we can simply check if the texture is present among the
available buffers and allocate a new buffer in case it is not found,
but avoid deleting the current buffer if the no. of buffers is less
than a fixed value like 3.


Exactly, and we could have a fixed number as 3 or more.



However, I will be busy for 2 weeks, so it is going to take some time to
complete the patch.


I think that's okay, as an optimization work, it should be no rush, and 
we like it better.


Thank you for your effort. Appreciated!

Leo



Regards,
Nayan.

Regards,
Leo


Regards,
Nayan







Re: [Mesa-dev] [PATCH v3] vl/dri3: use external texture as back buffers(v3)

2016-11-07 Thread Nayan Deshmukh
On Mon, Nov 7, 2016 at 8:31 PM, Leo Liu  wrote:

>
>
> On 11/05/2016 02:44 AM, Nayan Deshmukh wrote:
>
>> Hi Leo,
>>
>> Thanks for the reference patch.
>>
>>   There are only a number of output surfaces taking turns as the
>> mixer render targets, so we probably can use the same pixmap
>>   corresponding to each of output surface texture.
>>
>> The mixer renders to a VdpOutputSurface which is provided to it by the
>> application, so we can't make any assumptions on the surface that will
>> be provided it may or may not be the same. Instead we could have
>> additional fields in vlVdpOutputSurface which stores the handle and
>> pixmap of the texture.
>>
>
> What I meant is in vl dri3 to store certain numbers of pixmaps, and update
> them when texture, handle, size
> changed by calling "pixmap_from_buffer", if the same buffer got reused,
> and then we just can use the same pixmap
> for present.
>
> I just tried the mpv, if no resizing, there are only 3 textures in turn.
>
> I think we should avoid this "creating new pixmap frame by frame", what do
> you think?
>
> I agree this needs to be avoided.

   /* In case of a single gpu we need to get the
* handle and pixmap for the texture that is set
*/
if (buffer && scrn->output_texture &&
!scrn->is_different_gpu)
   allocate_new_buffer = true;

For this case we can simply check if the texture is present among the
available buffers and allocate a new
buffer in case it is not found, but avoid deleting the current buffer if
the no. of buffers is less than a fixed
value like 3.
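
As a rough sketch of the lookup described above (not the vl/dri3 code): keep a small table mapping textures to pixmaps, reuse an entry when the same texture comes back, and only start replacing entries once the table holds the fixed number of buffers. All names here (`struct buffer_cache`, `cache_get_pixmap()`, `NUM_CACHED`) are hypothetical, and the pixmap is just a counter standing in for a real X pixmap id.

```c
#include <stdbool.h>
#include <stddef.h>

#define NUM_CACHED 3   /* fixed number of buffers kept around (assumption) */

struct cached_buffer {
   const void *texture;  /* identity of the output texture */
   unsigned pixmap;      /* stand-in for the pixmap created for it */
};

struct buffer_cache {
   struct cached_buffer slots[NUM_CACHED];
   unsigned count;        /* slots currently in use */
   unsigned evict;        /* next slot to replace once the cache is full */
   unsigned next_pixmap;  /* fake pixmap id generator */
};

/* Return the pixmap for `texture`.  *created tells the caller whether a
 * new pixmap had to be made (true) or an existing one was reused (false). */
static unsigned
cache_get_pixmap(struct buffer_cache *c, const void *texture, bool *created)
{
   unsigned i;

   for (i = 0; i < c->count; i++) {
      if (c->slots[i].texture == texture) {
         *created = false;
         return c->slots[i].pixmap;
      }
   }

   *created = true;
   if (c->count < NUM_CACHED) {
      /* Below the limit: add a buffer without deleting any other. */
      i = c->count++;
   } else {
      /* Full: replace the oldest slot; a real implementation would
       * release the old pixmap here. */
      i = c->evict;
      c->evict = (c->evict + 1) % NUM_CACHED;
   }
   c->slots[i].texture = texture;
   c->slots[i].pixmap = ++c->next_pixmap;
   return c->slots[i].pixmap;
}
```

With a player that cycles through three output surfaces, every frame after the first three hits the cache, so no pixmap is created per frame.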

However, I will be busy for 2 weeks, so it is going to take some time to
complete the patch.

Regards,
Nayan.

> Regards,
> Leo
>
>
>> Regards,
>> Nayan
>>
>
>


[Mesa-dev] [Bug 98555] NWN won't start on 13.0

2016-11-07 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=98555

--- Comment #7 from Alan Swanson  ---
Might be that __nwu_possible return value isn't checked for the opendir
override. Will have a look in a day or two but unlikely to be a mesa bug.

Unfortunately NWN was written/ported without any consideration for local saves
and expected to write into the game directory so overrides were the only
option. There are no old or game included libraries on my system and amdgpu-pro
has never been installed rather that's just what libgl/libdrm barfs.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


[Mesa-dev] [Bug 98555] NWN won't start on 13.0

2016-11-07 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=98555

--- Comment #6 from Laurent carlier  ---
Just tried with my own copy of nwn (resurrected from my hard drive) and it
starts perfectly (archlinux/mesa-git/llvm-svn) with radeonsi

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


Re: [Mesa-dev] [PATCH 3/3] gallium/hud: protect against and initialization race

2016-11-07 Thread Brian Paul

On 11/07/2016 06:00 AM, Steven Toth wrote:

On Mon, Oct 24, 2016 at 10:10 AM, Steven Toth  wrote:

In the event that multiple threads attempt to install a graph
concurrently, protect the shared list.

Signed-off-by: Steven Toth 



A humble ping on this and the two others.

- Steve



Series looks OK to me.

Reviewed-by: Brian Paul 

Need me to push these for you?

-Brian



[Mesa-dev] [Bug 98563] Xorg segfaults with displaylink attached and mesa version >= 13.0

2016-11-07 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=98563

Lorenzo Bona  changed:

   What|Removed |Added

 CC|lorenz.b...@gmail.com   |

-- 
You are receiving this mail because:
You are the assignee for the bug.


[Mesa-dev] [Bug 98563] Xorg segfaults with displaylink attached and mesa version >= 13.0

2016-11-07 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=98563

--- Comment #6 from Emil Velikov  ---
Andrew, the affected/new codepaths shouldn't do anything that causes such
behaviour. Thus I'm inclined to think that this is a separate bug.

Can you please check/bisect the offending mesa commit and (in parallel/at
first) try mesa built without libdrm/HW drivers*. The latter will isolate any
of the (affected here) libdrm/loader rework.

Please keep all the information in a separate bug and add me in the cc-list.
Thanks

* Check that libdrm isn't installed/accessible.
* Use ./configure --with-dri-drivers=swrast --with-gallium-drivers=swrast ...

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH v3] vl/dri3: use external texture as back buffers(v3)

2016-11-07 Thread Leo Liu



On 11/05/2016 02:44 AM, Nayan Deshmukh wrote:

Hi Leo,

Thanks for the reference patch.

  There are only a number of output surfaces taking turns as the
mixer render targets, so we probably can use the same pixmap
  corresponding to each of output surface texture.

The mixer renders to a VdpOutputSurface which is provided to it by the
application, so we can't make any assumptions on the surface that will
be provided it may or may not be the same. Instead we could have
additional fields in vlVdpOutputSurface which stores the handle and
pixmap of the texture.


What I meant is for vl dri3 to store a certain number of pixmaps and
update them by calling "pixmap_from_buffer" when the texture, handle or
size changes; if the same buffer gets reused, we can just use the same
pixmap for present.

I just tried the mpv, if no resizing, there are only 3 textures in turn.

I think we should avoid this "creating new pixmap frame by frame", what 
do you think?
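
To make the idea concrete, here is a minimal sketch of such a cache in C. It is
only an illustration under stated assumptions: the struct names, the cache size
of 4, the round-robin replacement and the create_pixmap() stand-in are all
hypothetical and are not taken from the vl/dri3 code; a real version would
issue the actual "pixmap_from_buffer" request and free the replaced pixmap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: slightly more slots than the 3 textures observed with mpv. */
#define PIXMAP_CACHE_SIZE 4

/* Hypothetical cache entry: a back buffer identified by handle and size. */
struct pixmap_entry {
   int valid;          /* non-zero once the slot holds a pixmap */
   uint32_t handle;    /* buffer handle of the texture */
   uint32_t width;
   uint32_t height;
   uint32_t pixmap;    /* pixmap id created for this buffer */
};

struct pixmap_cache {
   struct pixmap_entry entries[PIXMAP_CACHE_SIZE];
   unsigned next_victim;   /* round-robin replacement index */
   unsigned creations;     /* number of pixmaps created so far */
   uint32_t next_id;       /* stand-in for real pixmap id allocation */
};

/* Stand-in for the real "pixmap_from_buffer" request (hypothetical). */
static uint32_t
create_pixmap(struct pixmap_cache *c)
{
   c->creations++;
   return ++c->next_id;
}

/* Return the pixmap for a buffer, creating one only on a cache miss. */
uint32_t
pixmap_cache_get(struct pixmap_cache *c, uint32_t handle,
                 uint32_t width, uint32_t height)
{
   struct pixmap_entry *e;
   size_t i;

   assert(c != NULL);

   for (i = 0; i < PIXMAP_CACHE_SIZE; i++) {
      e = &c->entries[i];
      if (e->valid && e->handle == handle &&
          e->width == width && e->height == height)
         return e->pixmap;   /* same buffer reused: no new pixmap */
   }

   /* Miss: overwrite the next slot. A real implementation would free
    * the old pixmap of a valid slot before replacing it. */
   e = &c->entries[c->next_victim];
   c->next_victim = (c->next_victim + 1) % PIXMAP_CACHE_SIZE;

   e->valid = 1;
   e->handle = handle;
   e->width = width;
   e->height = height;
   e->pixmap = create_pixmap(c);
   return e->pixmap;
}
```

With three buffers taking turns and no resize, every frame after the first
three would hit the cache, so no pixmap is created per frame.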


Regards,
Leo



Regards,
Nayan




Re: [Mesa-dev] [PATCH] android: amd/common: add support for libmesa_amd_common

2016-11-07 Thread Emil Velikov
On 5 November 2016 at 21:57, Marek Olšák  wrote:
> On Nov 5, 2016 9:03 PM, "Mauro Rossi"  wrote:
>>
>> 2016-11-05 18:58 GMT+01:00 Marek Olšák :
>> > Hi,
>> >
>> > I pushed the patch from the bug report. Hopefully it's the same as this
>> > one.
>> >
>> > Marek
>>
>> Hi Marek, thanks,
>> it is identical.
>>
>> As a side note, for any people trying to build mesa-dev for android,
>> a problem affecting our nougat-x86 x86_64 build is that we
>> (android-x86 volunteers) haven't managed to properly fix llvm to
>> export prototypes of LLVMInitializeAMDGPU* functions, even if AMDGPU
>> backend was built.
>>
>> The problem is the same mentioned here:
>> https://lists.freedesktop.org/archives/mesa-dev/2016-March/109602.html
>> where issue seems related to missing porting of
>> ./include/llvm/Config/Target.def processing in Android makefiles in
>> llvm.
>>
>> At the moment in my personal branch I'm using the
>> -Wno-error=implicit-function-declaration LOCAL_CFLAG,
>> but it can't be upstreamed to mesa-dev
>>
>> I will need to knock at the door of llvm/aosp experts, in the
>> appropriate forums or just continue my trial and error progression.
>
> If you want to add some hacks to Mesa for Android, that's fine with me. I
> don't think we have many Android users, so some build system hacks shouldn't
> bother anyone, is that right Emil?
>
Yes and no. Yes, having the odd hack isn't an issue. On the other hand
getting it merged and 'never' fixing it, is. One piece that comes to
mind - commit c3b5afbd4e6 dated ~May 2015.

In general we want to clean up the asymmetric/outdated Android build
that led to stuff such as 7cb197c3a8c or the [multiple] issues due to
duplicated/out of sync sources lists/includes/generation rules.

We want to get stuff sorted, before slapping even more on top thus
making it even harder to manage. Right?
Emil


Re: [Mesa-dev] [PATCH] android: amd/common: add support for libmesa_amd_common

2016-11-07 Thread Emil Velikov
On 5 November 2016 at 12:21, Mauro Rossi  wrote:
> Hi Nicolai,
>
> please review the attached patch which is necessary to fix the android build,
> as per https://bugs.freedesktop.org/show_bug.cgi?id=98573
>
> Tested with nougat-x86 build
> Kind regards
>
> Mauro
>
>
> From 36777861d42ec5ae0c0ed6a60835c76d13e38555 Mon Sep 17 00:00:00 2001
> From: Mauro Rossi 
> Date: Sat, 5 Nov 2016 00:00:29 +0100
> Subject: [PATCH] android: amd/common: add support for libmesa_amd_common
>
> Fixes the following building error introduced with commit 7115e56
> and related amd/common dependencies:
>
> external/mesa/src/gallium/drivers/radeonsi/si_shader.c:6861: error:
> undefined reference to 'ac_is_sgpr_param'
> external/mesa/src/gallium/drivers/radeonsi/si_shader.c:6951: error:
> undefined reference to 'ac_is_sgpr_param'
> clang++: error: linker command failed with exit code 1 (use -v to see
> invocation)
>
> ninja: build stopped: subcommand failed.
> build/core/ninja.mk:148: recipe for target 'ninja_wrapper' failed
> make: *** [ninja_wrapper] Error 1
> ---
>  src/amd/Android.common.mk  | 47 
> ++
>  src/amd/Android.mk |  1 +
>  src/amd/Makefile.sources   | 11 +
We'd really want to do:

git rm src/amd/common/Makefile.sources
git mv src/amd/common/Makefile.am src/amd/Makefile.common.am
sed -i 's|common/||g' src/amd/Makefile.common.am
+ fix the odd piece throughout.

As-is we have the sources lists duplicated and the makefiles out of sync with
one another. And if history has taught us anything, this will break
every time.

Mauro, can you give it a try ?

Thanks
Emil


Re: [Mesa-dev] [PATCH 1/3] glsl: use the prefixed name of the lexer generated functions

2016-11-07 Thread Emil Velikov
On 3 November 2016 at 22:42, Timothy Arceri
 wrote:
> On Fri, 2016-11-04 at 08:56 +1100, Timothy Arceri wrote:
>> On Thu, 2016-10-27 at 12:41 +0100, Emil Velikov wrote:
>> >
>> > From: Emil Velikov 
>> >
>> > Flex version 2.6.2 does not expand (define) the yy version of some
>> > function, thus we fail to compile.
>>
>> functions
>>
>> >
>> >
>> > Strictly speaking this might be a flex bug, although expanding the
>> > few
>> > instances is perfectly trivial and works with 2.6.2 and earlier
>> > versions
>> > of flex.
>>
>> This seems a bit fragile to me. As far as I can tell (although it
>> might
>> be unlikely) there is no guarantee that the expanded functions won't
>> be changed on the flex end and require renaming again in future.
>>
>> It would be nice if we could discover the real problem rather than
>> papering over it.
>
> This looks like it:
>
> https://github.com/westes/flex/issues/113
>
> I've added it to the bug report. Looking at the gentoo tracker bug it
> looks like this breaks a bunch of software not just Mesa. I think we
> should probably just wait and see what happens there before changing
> anything.
>
Fine with me. Btw, I did double check that older flex does work like a charm.
One might get the odd (depending on GCC used) warning, but that won't
cause any issues... unless flex does some fundamental design breakage
;-)

Thanks
Emil


[Mesa-dev] [Bug 98555] NWN won't start on 13.0

2016-11-07 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=98555

--- Comment #5 from Emil Velikov  ---
s/good _not_/_not_ good/ of course.

Or to put it otherwise - __nwu_possible() [1] should act only on
files/directories which are known to be used by NWN. Everything else must
remain as-is.

[1] https://github.com/nwnlinux/nwuser/blob/master/nwuser/util.c#L94
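
As a rough illustration of "only touch what the game is known to use", a path
filter could look like the sketch below. This is not the nwuser code: the
function name path_is_redirectable() and the prefix list are hypothetical, and
the real set of paths would have to come from the game itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical list of path prefixes the game is known to use. */
static const char *const known_prefixes[] = {
   "./saves/",
   "./override/",
   "./nwn.ini",
};

/* Return 1 if the path starts with a known prefix and may be rewritten,
 * 0 if it must be passed through untouched (e.g. /dev/dri/...). */
int
path_is_redirectable(const char *path)
{
   size_t i;

   if (!path)
      return 0;

   for (i = 0; i < sizeof(known_prefixes) / sizeof(known_prefixes[0]); i++) {
      size_t n = strlen(known_prefixes[i]);

      assert(n > 0);   /* an empty prefix would match every path */
      if (strncmp(path, known_prefixes[i], n) == 0)
         return 1;
   }

   return 0;
}
```

An open()/opendir()/stat() override would consult such a check first and call
the real libc function with the original path whenever it returns 0.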

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH 3/3] gallium/hud: protect against and initialization race

2016-11-07 Thread Steven Toth
On Mon, Oct 24, 2016 at 10:10 AM, Steven Toth  wrote:
> In the event that multiple threads attempt to install a graph
> concurrently, protect the shared list.
>
> Signed-off-by: Steven Toth 


A humble ping on this and the two others.
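
For anyone skimming the thread, the pattern in this patch boils down to the
sketch below: take the lock before looking at the shared count, and release it
on every return path. The sketch uses plain pthreads rather than the
pipe_mutex wrappers, and get_num_devices() / scan_devices() are hypothetical
stand-ins for the hud_get_num_*() functions, not the actual code.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t g_list_mutex = PTHREAD_MUTEX_INITIALIZER;
static int g_count = 0;   /* 0 means "not scanned yet" */
static int g_scans = 0;   /* number of scans performed, for illustration */

/* Hypothetical stand-in for the /sys scan: returns the number of
 * objects found, or -1 if the directory could not be opened. */
static int
scan_devices(int simulate_failure)
{
   g_scans++;
   return simulate_failure ? -1 : 4;
}

int
get_num_devices(int simulate_failure)
{
   int n;

   pthread_mutex_lock(&g_list_mutex);

   /* Already initialised by this or another thread. */
   if (g_count) {
      n = g_count;
      pthread_mutex_unlock(&g_list_mutex);
      return n;
   }

   n = scan_devices(simulate_failure);
   if (n < 0) {
      /* Error path: the lock must be dropped here too. */
      pthread_mutex_unlock(&g_list_mutex);
      return 0;
   }

   assert(n >= 0);
   g_count = n;
   pthread_mutex_unlock(&g_list_mutex);
   return n;
}
```

Two threads calling get_num_devices() concurrently then serialise on the
mutex, so only the first one performs the scan and populates the list.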

- Steve

-- 
Steven Toth - Kernel Labs
http://www.kernellabs.com

> ---
>  src/gallium/auxiliary/hud/hud_cpufreq.c  | 12 ++--
>  src/gallium/auxiliary/hud/hud_diskstat.c | 13 +++--
>  src/gallium/auxiliary/hud/hud_nic.c  | 12 ++--
>  src/gallium/auxiliary/hud/hud_sensors_temp.c | 12 ++--
>  4 files changed, 41 insertions(+), 8 deletions(-)
>
> diff --git a/src/gallium/auxiliary/hud/hud_cpufreq.c 
> b/src/gallium/auxiliary/hud/hud_cpufreq.c
> index e66c3e4..19a6f08 100644
> --- a/src/gallium/auxiliary/hud/hud_cpufreq.c
> +++ b/src/gallium/auxiliary/hud/hud_cpufreq.c
> @@ -36,6 +36,7 @@
>  #include "hud/hud_private.h"
>  #include "util/list.h"
>  #include "os/os_time.h"
> +#include "os/os_thread.h"
>  #include "util/u_memory.h"
>  #include 
>  #include 
> @@ -61,6 +62,7 @@ struct cpufreq_info
>
>  static int gcpufreq_count = 0;
>  static struct list_head gcpufreq_list;
> +pipe_static_mutex(gcpufreq_mutex);
>
>  static struct cpufreq_info *
>  find_cfi_by_index(int cpu_index, int mode)
> @@ -186,16 +188,21 @@ hud_get_num_cpufreq(bool displayhelp)
> int cpu_index;
>
> /* Return the number of CPU metrics we support. */
> -   if (gcpufreq_count)
> +   pipe_mutex_lock(gcpufreq_mutex);
> +   if (gcpufreq_count) {
> +  pipe_mutex_unlock(gcpufreq_mutex);
>return gcpufreq_count;
> +   }
>
> /* Scan /sys/devices.../cpu, for every object type we support, create
>  * and persist an object to represent its different metrics.
>  */
> list_inithead(&gcpufreq_list);
> DIR *dir = opendir("/sys/devices/system/cpu");
> -   if (!dir)
> +   if (!dir) {
> +  pipe_mutex_unlock(gcpufreq_mutex);
>return 0;
> +   }
>
> while ((dp = readdir(dir)) != NULL) {
>
> @@ -239,6 +246,7 @@ hud_get_num_cpufreq(bool displayhelp)
>}
> }
>
> +   pipe_mutex_unlock(gcpufreq_mutex);
> return gcpufreq_count;
>  }
>
> diff --git a/src/gallium/auxiliary/hud/hud_diskstat.c 
> b/src/gallium/auxiliary/hud/hud_diskstat.c
> index d4306cd..af6e62d 100644
> --- a/src/gallium/auxiliary/hud/hud_diskstat.c
> +++ b/src/gallium/auxiliary/hud/hud_diskstat.c
> @@ -35,6 +35,7 @@
>  #include "hud/hud_private.h"
>  #include "util/list.h"
>  #include "os/os_time.h"
> +#include "os/os_thread.h"
>  #include "util/u_memory.h"
>  #include 
>  #include 
> @@ -81,6 +82,7 @@ struct diskstat_info
>   */
>  static int gdiskstat_count = 0;
>  static struct list_head gdiskstat_list;
> +pipe_static_mutex(gdiskstat_mutex);
>
>  static struct diskstat_info *
>  find_dsi_by_name(const char *n, int mode)
> @@ -244,16 +246,21 @@ hud_get_num_disks(bool displayhelp)
> char name[64];
>
> /* Return the number of block devices and partitions. */
> -   if (gdiskstat_count)
> +   pipe_mutex_lock(gdiskstat_mutex);
> +   if (gdiskstat_count) {
> +  pipe_mutex_unlock(gdiskstat_mutex);
>return gdiskstat_count;
> +   }
>
> /* Scan /sys/block, for every object type we support, create and
>  * persist an object to represent its different statistics.
>  */
> list_inithead(&gdiskstat_list);
> DIR *dir = opendir("/sys/block/");
> -   if (!dir)
> +   if (!dir) {
> +  pipe_mutex_unlock(gdiskstat_mutex);
>return 0;
> +   }
>
> while ((dp = readdir(dir)) != NULL) {
>
> @@ -278,6 +285,7 @@ hud_get_num_disks(bool displayhelp)
>struct dirent *dpart;
>DIR *pdir = opendir(basename);
>if (!pdir) {
> + pipe_mutex_unlock(gdiskstat_mutex);
>   closedir(dir);
>   return 0;
>}
> @@ -312,6 +320,7 @@ hud_get_num_disks(bool displayhelp)
>   puts(line);
>}
> }
> +   pipe_mutex_unlock(gdiskstat_mutex);
>
> return gdiskstat_count;
>  }
> diff --git a/src/gallium/auxiliary/hud/hud_nic.c 
> b/src/gallium/auxiliary/hud/hud_nic.c
> index 2795c93..f9935de 100644
> --- a/src/gallium/auxiliary/hud/hud_nic.c
> +++ b/src/gallium/auxiliary/hud/hud_nic.c
> @@ -35,6 +35,7 @@
>  #include "hud/hud_private.h"
>  #include "util/list.h"
>  #include "os/os_time.h"
> +#include "os/os_thread.h"
>  #include "util/u_memory.h"
>  #include 
>  #include 
> @@ -66,6 +67,7 @@ struct nic_info
>   */
>  static int gnic_count = 0;
>  static struct list_head gnic_list;
> +pipe_static_mutex(gnic_mutex);
>
>  static struct nic_info *
>  find_nic_by_name(const char *n, int mode)
> @@ -329,16 +331,21 @@ hud_get_num_nics(bool displayhelp)
> char name[64];
>
> /* Return the number if network interfaces. */
> -   if (gnic_count)
> +   pipe_mutex_lock(gnic_mutex);
> +   if (gnic_count) {
> +  pipe_mutex_unlock(gnic_mutex);
>return gnic_count;
> +   }
>
> /* Scan /sys/block, for every object type we support, create and
>  * 

Re: [Mesa-dev] [PATCH 2/3] gallium/hud: close a previously opened handle

2016-11-07 Thread Steven Toth
On Mon, Oct 24, 2016 at 10:10 AM, Steven Toth  wrote:
> We're missing the closedir() to the matching opendir().
>
> Signed-off-by: Steven Toth 

A humble ping on this and the two others.
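
The rule the patch applies is simply that every successful opendir() needs a
closedir() on each path out of the function. A minimal standalone sketch,
where count_dir_entries() is a hypothetical helper and not part of the HUD
code:

```c
#include <assert.h>
#include <dirent.h>
#include <string.h>

/* Count the entries of a directory, skipping "." and "..".
 * Returns -1 if the directory cannot be opened. */
int
count_dir_entries(const char *path)
{
   DIR *dir;
   struct dirent *dp;
   int count = 0;

   assert(path != NULL);

   dir = opendir(path);
   if (!dir)
      return -1;   /* nothing was opened, so nothing to close */

   while ((dp = readdir(dir)) != NULL) {
      if (strcmp(dp->d_name, ".") == 0 || strcmp(dp->d_name, "..") == 0)
         continue;
      count++;
   }

   /* Matches the opendir() above; without it the handle leaks. */
   closedir(dir);
   return count;
}
```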

- Steve

-- 
Steven Toth - Kernel Labs
http://www.kernellabs.com

> ---
>  src/gallium/auxiliary/hud/hud_cpufreq.c  | 1 +
>  src/gallium/auxiliary/hud/hud_diskstat.c | 5 -
>  src/gallium/auxiliary/hud/hud_nic.c  | 1 +
>  3 files changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/src/gallium/auxiliary/hud/hud_cpufreq.c 
> b/src/gallium/auxiliary/hud/hud_cpufreq.c
> index bfc748b..e66c3e4 100644
> --- a/src/gallium/auxiliary/hud/hud_cpufreq.c
> +++ b/src/gallium/auxiliary/hud/hud_cpufreq.c
> @@ -225,6 +225,7 @@ hud_get_num_cpufreq(bool displayhelp)
>snprintf(fn, sizeof(fn), "%s/cpufreq/scaling_max_freq", basename);
>add_object(dp->d_name, fn, CPUFREQ_MAXIMUM, cpu_index);
> }
> +   closedir(dir);
>
> if (displayhelp) {
>list_for_each_entry(struct cpufreq_info, cfi, &gcpufreq_list, list) {
> diff --git a/src/gallium/auxiliary/hud/hud_diskstat.c 
> b/src/gallium/auxiliary/hud/hud_diskstat.c
> index 7d4f500..d4306cd 100644
> --- a/src/gallium/auxiliary/hud/hud_diskstat.c
> +++ b/src/gallium/auxiliary/hud/hud_diskstat.c
> @@ -277,8 +277,10 @@ hud_get_num_disks(bool displayhelp)
>/* Add any partitions */
>struct dirent *dpart;
>DIR *pdir = opendir(basename);
> -  if (!pdir)
> +  if (!pdir) {
> + closedir(dir);
>   return 0;
> +  }
>
>while ((dpart = readdir(pdir)) != NULL) {
>   /* Avoid 'lo' and '..' and '.' */
> @@ -298,6 +300,7 @@ hud_get_num_disks(bool displayhelp)
>   add_object_part(basename, dpart->d_name, DISKSTAT_WR);
>}
> }
> +   closedir(dir);
>
> if (displayhelp) {
>list_for_each_entry(struct diskstat_info, dsi, &gdiskstat_list, list) {
> diff --git a/src/gallium/auxiliary/hud/hud_nic.c 
> b/src/gallium/auxiliary/hud/hud_nic.c
> index 719dd04..2795c93 100644
> --- a/src/gallium/auxiliary/hud/hud_nic.c
> +++ b/src/gallium/auxiliary/hud/hud_nic.c
> @@ -399,6 +399,7 @@ hud_get_num_nics(bool displayhelp)
>}
>
> }
> +   closedir(dir);
>
> list_for_each_entry(struct nic_info, nic, &gnic_list, list) {
>char line[64];
> --
> 2.7.4
>



-- 
Steven Toth - Kernel Labs
http://www.kernellabs.com
+1.646.355.8490


Re: [Mesa-dev] [PATCH 1/3] gallium/hud: fix a problem where objects are free'd while in use.

2016-11-07 Thread Steven Toth
On Mon, Oct 24, 2016 at 10:10 AM, Steven Toth  wrote:
> Instead of trying to maintain a reference counted list of valid HUD
> objects, and freeing them accordingly, creating race conditions
> between unanticipated multiple threads, simply accept they're
> allocated once and never released until the process terminates.
>
> They're a shared resource between multiple threads, so accept
> they're always available for use.
>
> Signed-off-by: Steven Toth 

A humble ping on this and the two others.

- Steve

-- 
Steven Toth - Kernel Labs
http://www.kernellabs.com

> ---
>  src/gallium/auxiliary/hud/hud_cpufreq.c  | 13 -
>  src/gallium/auxiliary/hud/hud_diskstat.c | 13 -
>  src/gallium/auxiliary/hud/hud_nic.c  | 13 -
>  src/gallium/auxiliary/hud/hud_sensors_temp.c | 16 
>  4 files changed, 55 deletions(-)
>
> diff --git a/src/gallium/auxiliary/hud/hud_cpufreq.c 
> b/src/gallium/auxiliary/hud/hud_cpufreq.c
> index 4501bbb..bfc748b 100644
> --- a/src/gallium/auxiliary/hud/hud_cpufreq.c
> +++ b/src/gallium/auxiliary/hud/hud_cpufreq.c
> @@ -112,14 +112,6 @@ query_cfi_load(struct hud_graph *gr)
> }
>  }
>
> -static void
> -free_query_data(void *p)
> -{
> -   struct cpufreq_info *cfi = (struct cpufreq_info *)p;
> -   list_del(&cfi->list);
> -   FREE(cfi);
> -}
> -
>  /**
>* Create and initialize a new object for a specific CPU.
>* \param  pane  parent context.
> @@ -162,11 +154,6 @@ hud_cpufreq_graph_install(struct hud_pane *pane, int 
> cpu_index,
> gr->query_data = cfi;
> gr->query_new_value = query_cfi_load;
>
> -   /* Don't use free() as our callback as that messes up Gallium's
> -* memory debugger.  Use simple free_query_data() wrapper.
> -*/
> -   gr->free_query_data = free_query_data;
> -
> hud_pane_add_graph(pane, gr);
> hud_pane_set_max_value(pane, 300 /* 3 GHz */);
>  }
> diff --git a/src/gallium/auxiliary/hud/hud_diskstat.c 
> b/src/gallium/auxiliary/hud/hud_diskstat.c
> index b248baf..7d4f500 100644
> --- a/src/gallium/auxiliary/hud/hud_diskstat.c
> +++ b/src/gallium/auxiliary/hud/hud_diskstat.c
> @@ -162,14 +162,6 @@ query_dsi_load(struct hud_graph *gr)
> }
>  }
>
> -static void
> -free_query_data(void *p)
> -{
> -   struct diskstat_info *nic = (struct diskstat_info *) p;
> -   list_del(&nic->list);
> -   FREE(nic);
> -}
> -
>  /**
>* Create and initialize a new object for a specific block I/O device.
>* \param  pane  parent context.
> @@ -208,11 +200,6 @@ hud_diskstat_graph_install(struct hud_pane *pane, const 
> char *dev_name,
> gr->query_data = dsi;
> gr->query_new_value = query_dsi_load;
>
> -   /* Don't use free() as our callback as that messes up Gallium's
> -* memory debugger.  Use simple free_query_data() wrapper.
> -*/
> -   gr->free_query_data = free_query_data;
> -
> hud_pane_add_graph(pane, gr);
> hud_pane_set_max_value(pane, 100);
>  }
> diff --git a/src/gallium/auxiliary/hud/hud_nic.c 
> b/src/gallium/auxiliary/hud/hud_nic.c
> index fb6b8c0..719dd04 100644
> --- a/src/gallium/auxiliary/hud/hud_nic.c
> +++ b/src/gallium/auxiliary/hud/hud_nic.c
> @@ -234,14 +234,6 @@ query_nic_load(struct hud_graph *gr)
> }
>  }
>
> -static void
> -free_query_data(void *p)
> -{
> -   struct nic_info *nic = (struct nic_info *) p;
> -   list_del(&nic->list);
> -   FREE(nic);
> -}
> -
>  /**
>* Create and initialize a new object for a specific network interface dev.
>* \param  pane  parent context.
> @@ -284,11 +276,6 @@ hud_nic_graph_install(struct hud_pane *pane, const char 
> *nic_name,
> gr->query_data = nic;
> gr->query_new_value = query_nic_load;
>
> -   /* Don't use free() as our callback as that messes up Gallium's
> -* memory debugger.  Use simple free_query_data() wrapper.
> -*/
> -   gr->free_query_data = free_query_data;
> -
> hud_pane_add_graph(pane, gr);
> hud_pane_set_max_value(pane, 100);
>  }
> diff --git a/src/gallium/auxiliary/hud/hud_sensors_temp.c 
> b/src/gallium/auxiliary/hud/hud_sensors_temp.c
> index e41b847..4a8a4fc 100644
> --- a/src/gallium/auxiliary/hud/hud_sensors_temp.c
> +++ b/src/gallium/auxiliary/hud/hud_sensors_temp.c
> @@ -189,17 +189,6 @@ query_sti_load(struct hud_graph *gr)
> }
>  }
>
> -static void
> -free_query_data(void *p)
> -{
> -   struct sensors_temp_info *sti = (struct sensors_temp_info *) p;
> -   list_del(&sti->list);
> -   if (sti->chip)
> -  sensors_free_chip_name(sti->chip);
> -   FREE(sti);
> -   sensors_cleanup();
> -}
> -
>  /**
>* Create and initialize a new object for a specific sensor interface dev.
>* \param  pane  parent context.
> @@ -237,11 +226,6 @@ hud_sensors_temp_graph_install(struct hud_pane *pane, 
> const char *dev_name,
> gr->query_data = sti;
> gr->query_new_value = query_sti_load;
>
> -   /* Don't use free() as our callback as that messes up Gallium's
> -* memory debugger.  Use simple free_query_data() wrapper.
> -  

Re: [Mesa-dev] [PATCH] glsl: Do not allow scalar types in vector relational functions

2016-11-07 Thread Boyan Ding
2016-11-05 3:23 GMT+08:00 Matt Turner :
> On Sun, Oct 30, 2016 at 11:45 PM, Boyan Ding  wrote:
>> According to OpenGL Shading Language 4.50 spec, Section 8.7 "Vector
>> Relational Functions", functions of this type do not operate on scalar
>> types, so remove scalar types from signature definitions to make the
>> behavior consistent with glslangValidator and other drivers.
>
> Yep. Looks like it's always been this way.
>
> The patch is
>
> Reviewed-by: Matt Turner 
>
> Since this seems to be untested by any suite, could you provide some
> piglit parser tests that confirm that lessThanEqual(scalar, scalar),
> et al doesn't work?
>

Thanks for the review, patch has been sent to piglit and tests pass
with this patch. Please help me push this one if you think it is
appropriate.

> Rant: what a stupid mess to require <= for scalars but lessThanEqual
> for vectors.

Yeah, I happened to find this when doing my homework. I was quite
shocked that one of my shaders, in which I somehow used
greaterThanEqual on floats to get a boolean, didn't compile on nvidia
driver until I saw the spec.

Regards,
Boyan Ding

>
> Somewhere on my todo list is a GLSL extension that fixes things like this...
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 8/8] i965/gen9: Allow sampling with hiz when supported

2016-11-07 Thread Pohjolainen, Topi
On Thu, Nov 03, 2016 at 10:39:43AM +, Lionel Landwerlin wrote:
> From: Jordan Justen 
> 
> For gen9+ this will indicate when we should allow hiz based sampling
> during rendering.
> 
> Improves performance in :
>   - Synmark's OglDeferred by 2.2% (n=20)
>   - Synmark's OglShMapPcf by 0.44% (n=20)
> 
> v2 by Ben: Add spec reference, and make it fit with some of the changes made
> on the previous patches.
> Change the check from mt->aux_buf to mt->num_samples. The presence of an
> aux_buf isn't enough to determine there isn't a HiZ buffer to use.
> 
> v3: It seems all depth surfaces end up with num_samples = 0 by default,
> so allow sampling from depth HiZ if num_samples <= 1. (Lionel)
> Allow sampling from HiZ only if all LODs are available from the HiZ
> buffer. (Lionel)

Patches 6-8 are also:

Reviewed-by: Topi Pohjolainen 

> 
> Signed-off-by: Jordan Justen  (v1)
> Signed-off-by: Ben Widawsky  (v2)
> Signed-off-by: Lionel Landwerlin  (v3)
> ---
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 29 
> ++-
>  1 file changed, 28 insertions(+), 1 deletion(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
> b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> index bf8e314..4511738 100644
> --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> @@ -2020,7 +2020,34 @@ intel_miptree_sample_with_hiz(struct brw_context *brw,
>return false;
> }
>  
> -   return false;
> +   if (!mt->hiz_buf) {
> +  return false;
> +   }
> +
> +   /* It seems the hardware won't fallback to the depth buffer if some of the
> +* mipmap levels aren't available in the HiZ buffer. So we need all levels
> +* of the texture to be HiZ enabled.
> +*/
> +   for (unsigned level = mt->first_level; level <= mt->last_level; ++level) {
> +  if (!intel_miptree_level_has_hiz(mt, level))
> + return false;
> +   }
> +
> +   /* If compressed multisampling is enabled, then we use it for the 
> auxiliary
> +* buffer instead.
> +*
> +* From the BDW PRM (Volume 2d: Command Reference: Structures
> +*   RENDER_SURFACE_STATE.AuxiliarySurfaceMode):
> +*
> +*  "If this field is set to AUX_HIZ, Number of Multisamples must be
> +*   MULTISAMPLECOUNT_1, and Surface Type cannot be SURFTYPE_3D.
> +*
> +* There is no such blurb for 1D textures, but there is sufficient 
> evidence
> +* that this is broken on SKL+.
> +*/
> +   return (mt->num_samples <= 1 &&
> +   mt->target != GL_TEXTURE_3D &&
> +   mt->target != GL_TEXTURE_1D /* gen9+ restriction */);
>  }
>  
>  /**
> -- 
> 2.10.2
> 


[Mesa-dev] [Bug 98555] NWN won't start on 13.0

2016-11-07 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=98555

--- Comment #4 from Emil Velikov  ---
The most likely thing that comes to mind is that one LD_PRELOADs a bunch of
libraries, amongst which an older version of libdrm.

What's the output of
$ LD_DEBUG=libs nwn
The above can be a bit large so please attach as plain text.

Or maybe it's this nasty LD_PRELOAD [1] which overrides open/opendir/stat/etc.
in a slightly brain-dead manner. I'm sure that mangling _every_ path is a good
_not_ a idea.

[1] https://github.com/nwnlinux/nwuser

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH 2/2] tgsi/scan: turn a huge if-else-if.. chain into a switch statement

2016-11-07 Thread Nicolai Hähnle

On 05.11.2016 18:38, Marek Olšák wrote:

From: Marek Olšák 


For the series:

Reviewed-by: Nicolai Hähnle 



---
 src/gallium/auxiliary/tgsi/tgsi_scan.c | 44 +++---
 1 file changed, 30 insertions(+), 14 deletions(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_scan.c 
b/src/gallium/auxiliary/tgsi/tgsi_scan.c
index 26cb2be..40a1340 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_scan.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_scan.c
@@ -448,62 +448,72 @@ scan_declaration(struct tgsi_shader_info *info,
  info->output_array_last[array_id] = fulldecl->Range.Last;
  break;
   }
   info->array_max[file] = MAX2(info->array_max[file], array_id);
}

for (reg = fulldecl->Range.First; reg <= fulldecl->Range.Last; reg++) {
   unsigned semName = fulldecl->Semantic.Name;
   unsigned semIndex = fulldecl->Semantic.Index +
  (reg - fulldecl->Range.First);
+  int buffer;
+  unsigned index, target, type;

   /* only first 32 regs will appear in this bitfield */
   info->file_mask[file] |= (1 << reg);
   info->file_count[file]++;
   info->file_max[file] = MAX2(info->file_max[file], (int)reg);

-  if (file == TGSI_FILE_CONSTANT) {
- int buffer = 0;
+  switch (file) {
+  case TGSI_FILE_CONSTANT:
+ buffer = 0;

  if (fulldecl->Declaration.Dimension)
 buffer = fulldecl->Dim.Index2D;

  info->const_file_max[buffer] =
 MAX2(info->const_file_max[buffer], (int)reg);
  info->const_buffers_declared |= 1u << buffer;
-  } else if (file == TGSI_FILE_IMAGE) {
+ break;
+
+  case TGSI_FILE_IMAGE:
  info->images_declared |= 1u << reg;
  if (fulldecl->Image.Resource == TGSI_TEXTURE_BUFFER)
 info->images_buffers |= 1 << reg;
-  } else if (file == TGSI_FILE_BUFFER) {
+ break;
+
+  case TGSI_FILE_BUFFER:
  info->shader_buffers_declared |= 1u << reg;
-  } else if (file == TGSI_FILE_INPUT) {
+ break;
+
+  case TGSI_FILE_INPUT:
  info->input_semantic_name[reg] = (ubyte) semName;
  info->input_semantic_index[reg] = (ubyte) semIndex;
  info->input_interpolate[reg] = (ubyte)fulldecl->Interp.Interpolate;
  info->input_interpolate_loc[reg] = (ubyte)fulldecl->Interp.Location;
  info->input_cylindrical_wrap[reg] = 
(ubyte)fulldecl->Interp.CylindricalWrap;

  /* Vertex shaders can have inputs with holes between them. */
  info->num_inputs = MAX2(info->num_inputs, reg + 1);

  if (semName == TGSI_SEMANTIC_PRIMID)
 info->uses_primid = TRUE;
  else if (procType == PIPE_SHADER_FRAGMENT) {
 if (semName == TGSI_SEMANTIC_POSITION)
info->reads_position = TRUE;
 else if (semName == TGSI_SEMANTIC_FACE)
info->uses_frontface = TRUE;
  }
-  }
-  else if (file == TGSI_FILE_SYSTEM_VALUE) {
- unsigned index = fulldecl->Range.First;
+ break;
+
+  case TGSI_FILE_SYSTEM_VALUE:
+ index = fulldecl->Range.First;

  info->system_value_semantic_name[index] = semName;
  info->num_system_values = MAX2(info->num_system_values, index + 1);

  switch (semName) {
  case TGSI_SEMANTIC_INSTANCEID:
 info->uses_instanceid = TRUE;
 break;
  case TGSI_SEMANTIC_VERTEXID:
 info->uses_vertexid = TRUE;
@@ -523,22 +533,23 @@ scan_declaration(struct tgsi_shader_info *info,
  case TGSI_SEMANTIC_POSITION:
 info->reads_position = TRUE;
 break;
  case TGSI_SEMANTIC_FACE:
 info->uses_frontface = TRUE;
 break;
  case TGSI_SEMANTIC_SAMPLEMASK:
 info->reads_samplemask = TRUE;
 break;
  }
-  }
-  else if (file == TGSI_FILE_OUTPUT) {
+ break;
+
+  case TGSI_FILE_OUTPUT:
  info->output_semantic_name[reg] = (ubyte) semName;
  info->output_semantic_index[reg] = (ubyte) semIndex;
  info->num_outputs = MAX2(info->num_outputs, reg + 1);

  if (semName == TGSI_SEMANTIC_COLOR)
 info->colors_written |= 1 << semIndex;

  if (procType == PIPE_SHADER_VERTEX ||
  procType == PIPE_SHADER_GEOMETRY ||
  procType == PIPE_SHADER_TESS_CTRL ||
@@ -571,37 +582,42 @@ scan_declaration(struct tgsi_shader_info *info,
info->writes_samplemask = TRUE;
break;
 }
  }

  if (procType == PIPE_SHADER_VERTEX) {
 if (semName == TGSI_SEMANTIC_EDGEFLAG) {
info->writes_edgeflag = TRUE;
 }
  }
-  } else if (file == TGSI_FILE_SAMPLER) {
+ break;
+
+  case TGSI_FILE_SAMPLER:
  STATIC_ASSERT(sizeof(info->samplers_declared) * 8 >= 
PIPE_MAX_SAMPLERS);
  

[Mesa-dev] [Bug 98555] NWN won't start on 13.0

2016-11-07 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=98555

--- Comment #3 from Nicolai Hähnle  ---
amdgpu_dri.so is the name of the amdgpu-pro (closed-source) OpenGL driver.
Since you seem to be trying to use Mesa, please make sure to remove the
amdgpu-pro installation.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH 5/8] i965/miptree: Create a hiz mcs type

2016-11-07 Thread Pohjolainen, Topi
On Mon, Nov 07, 2016 at 12:30:14PM +0200, Pohjolainen, Topi wrote:
> On Mon, Nov 07, 2016 at 10:24:35AM +, Lionel Landwerlin wrote:
> > On 07/11/16 10:18, Pohjolainen, Topi wrote:
> > > On Thu, Nov 03, 2016 at 10:39:40AM +, Lionel Landwerlin wrote:
> > > > From: Ben Widawsky 
> > > > 
> > > > This seems counter to the goal of consolidating hiz, mcs, and later ccs 
> > > > buffers.
> > > > Unfortunately, hiz on gen6 is a thing the code supports, and this wart 
> > > > will be
> > > > helpful to achieve that. Overall, I believe it does help unify AUX 
> > > > buffers on
> > > > gen7+.
> > > > 
> > > > I updated the size field which I introduced in the previous patch, even 
> > > > though
> > > > we have no use for it.
> > > > 
> > > > XXX: As I mentioned in the last patch, the height given to the MCS 
> > > > buffer
> > > > allocation in intel_miptree_alloc_mcs() looks wrong, but I don't claim 
> > > > to fully
> > > > understand how the MCS buffer is laid out.
> > > > 
> > > > v2: rebase on master (Lionel)
> > > > 
> > > > Signed-off-by: Ben Widawsky  (v1)
> > > > Signed-off-by: Lionel Landwerlin  (v2)
> > > > ---
> > > >   src/mesa/drivers/dri/i965/brw_blorp.c |  4 +-
> > > >   src/mesa/drivers/dri/i965/gen7_misc_state.c   |  6 +--
> > > >   src/mesa/drivers/dri/i965/gen8_depth_state.c  |  6 +--
> > > >   src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 61 
> > > > ++-
> > > >   src/mesa/drivers/dri/i965/intel_mipmap_tree.h | 14 --
> > > >   5 files changed, 49 insertions(+), 42 deletions(-)
> > > > 
> > > > diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c 
> > > > b/src/mesa/drivers/dri/i965/brw_blorp.c
> > > > index 5adb4c6..f0ad074 100644
> > > > --- a/src/mesa/drivers/dri/i965/brw_blorp.c
> > > > +++ b/src/mesa/drivers/dri/i965/brw_blorp.c
> > > > @@ -214,8 +214,8 @@ blorp_surf_for_miptree(struct brw_context *brw,
> > > >   }
> > > >   assert(hiz_mt->pitch == aux_surf->row_pitch);
> > > >} else {
> > > > -surf->aux_addr.buffer = mt->hiz_buf->bo;
> > > > -surf->aux_addr.offset = 0;
> > > > +surf->aux_addr.buffer = mt->hiz_buf->aux_base.bo;
> > > > +surf->aux_addr.offset = mt->hiz_buf->aux_base.offset;
> > > >}
> > > > }
> > > >  } else {
> > > > diff --git a/src/mesa/drivers/dri/i965/gen7_misc_state.c 
> > > > b/src/mesa/drivers/dri/i965/gen7_misc_state.c
> > > > index 271d962..7bd5cd5 100644
> > > > --- a/src/mesa/drivers/dri/i965/gen7_misc_state.c
> > > > +++ b/src/mesa/drivers/dri/i965/gen7_misc_state.c
> > > > @@ -146,13 +146,13 @@ gen7_emit_depth_stencil_hiz(struct brw_context 
> > > > *brw,
> > > > ADVANCE_BATCH();
> > > >  } else {
> > > > assert(depth_mt);
> > > > -  struct intel_miptree_aux_buffer *hiz_buf = depth_mt->hiz_buf;
> > > > +  struct intel_miptree_hiz_buffer *hiz_buf = depth_mt->hiz_buf;
> > > > BEGIN_BATCH(3);
> > > > OUT_BATCH(GEN7_3DSTATE_HIER_DEPTH_BUFFER << 16 | (3 - 2));
> > > > OUT_BATCH((mocs << 25) |
> > > > -(hiz_buf->pitch - 1));
> > > > -  OUT_RELOC(hiz_buf->bo,
> > > > +(hiz_buf->aux_base.pitch - 1));
> > > > +  OUT_RELOC(hiz_buf->aux_base.bo,
> > > >   I915_GEM_DOMAIN_RENDER,
> > > >   I915_GEM_DOMAIN_RENDER,
> > > >   0);
> > > > diff --git a/src/mesa/drivers/dri/i965/gen8_depth_state.c 
> > > > b/src/mesa/drivers/dri/i965/gen8_depth_state.c
> > > > index 73b2186..8920910 100644
> > > > --- a/src/mesa/drivers/dri/i965/gen8_depth_state.c
> > > > +++ b/src/mesa/drivers/dri/i965/gen8_depth_state.c
> > > > @@ -93,10 +93,10 @@ emit_depth_packets(struct brw_context *brw,
> > > > assert(depth_mt);
> > > > BEGIN_BATCH(5);
> > > > OUT_BATCH(GEN7_3DSTATE_HIER_DEPTH_BUFFER << 16 | (5 - 2));
> > > > -  OUT_BATCH((depth_mt->hiz_buf->pitch - 1) | mocs_wb << 25);
> > > > -  OUT_RELOC64(depth_mt->hiz_buf->bo,
> > > > +  OUT_BATCH((depth_mt->hiz_buf->aux_base.pitch - 1) | mocs_wb << 
> > > > 25);
> > > > +  OUT_RELOC64(depth_mt->hiz_buf->aux_base.bo,
> > > > I915_GEM_DOMAIN_RENDER, I915_GEM_DOMAIN_RENDER, 0);
> > > > -  OUT_BATCH(depth_mt->hiz_buf->qpitch >> 2);
> > > > +  OUT_BATCH(depth_mt->hiz_buf->aux_base.qpitch >> 2);
> > > > ADVANCE_BATCH();
> > > >  }
> > > > diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
> > > > b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > > > index 3d1bdb1..af0e1a4 100644
> > > > --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > > > +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > > > @@ -1016,11 +1016,10 @@ intel_miptree_release(struct intel_mipmap_tree 
> > > > **mt)
> > > >if ((*mt)->hiz_buf->mt)
> > > >   

[Mesa-dev] [Bug 98002] Mud rendering bug in Portal 2

2016-11-07 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=98002

--- Comment #13 from almos  ---
Ok, I managed to download the trace with Konqueror. I haven't tried it with
fglrx yet, but I compared the attached screenshot with the YouTube video, and
here are my thoughts.

Actually, the mud is rendered correctly; the problem is with the rocky walls of
the cavern. It looks like the lighting is fullbright, and the fog is missing. I
also read the shader in qapitrace, and I have no idea how it was supposed to
add fog (I'm not an expert on GLSL though).

If you look at the middle of the attached screenshot, you can see some little
black objects in the background. Those are also missing fog.

It seems this bug has been reported to Valve as
https://github.com/ValveSoftware/steam-for-linux/issues/4414 , and it's not
specific to Mesa.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] main: return error if asking for GL_TEXTURE_BORDER_COLOR in TEXTURE_2D_MULTISAMPLE{_ARRAY} through TexParameterI{i, ui}v()

2016-11-07 Thread Samuel Iglesias Gonsálvez
OpenGL ES 3.2 says in section 8.10. "TEXTURE PARAMETERS", at the end of
the section:

"An INVALID_ENUM error is generated if target is TEXTURE_2D_MULTISAMPLE
or TEXTURE_2D_MULTISAMPLE_ARRAY, and pname is any sampler state from
table 21.12."

GL_TEXTURE_BORDER_COLOR is present in that table.

Signed-off-by: Samuel Iglesias Gonsálvez 
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98250
---
 src/mesa/main/texparam.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/src/mesa/main/texparam.c b/src/mesa/main/texparam.c
index 29eed07..ae96bd8 100644
--- a/src/mesa/main/texparam.c
+++ b/src/mesa/main/texparam.c
@@ -974,6 +974,10 @@ _mesa_texture_parameterIiv(struct gl_context *ctx,
 {
switch (pname) {
case GL_TEXTURE_BORDER_COLOR:
+  if (!_mesa_target_allows_setting_sampler_parameters(texObj->Target)) {
+ _mesa_error(ctx, GL_INVALID_ENUM, "glTextureParameterIiv(texture)");
+ return;
+  }
   FLUSH_VERTICES(ctx, _NEW_TEXTURE);
   /* set the integer-valued border color */
   COPY_4V(texObj->Sampler.BorderColor.i, params);
@@ -992,6 +996,10 @@ _mesa_texture_parameterIuiv(struct gl_context *ctx,
 {
switch (pname) {
case GL_TEXTURE_BORDER_COLOR:
+  if (!_mesa_target_allows_setting_sampler_parameters(texObj->Target)) {
+ _mesa_error(ctx, GL_INVALID_ENUM, "glTextureParameterIuiv(texture)");
+ return;
+  }
   FLUSH_VERTICES(ctx, _NEW_TEXTURE);
   /* set the unsigned integer-valued border color */
   COPY_4V(texObj->Sampler.BorderColor.ui, params);
-- 
2.7.4
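
To illustrate the rule quoted from the spec, here is a minimal standalone
sketch of the check. It is not the Mesa code: struct tex_object and the
helpers target_allows_sampler_params() and set_border_color_i() are
hypothetical names made up for this example, and only the GL enum values
are taken from the OpenGL headers.

```c
#include <stdbool.h>

/* Enum values as defined by the OpenGL registry headers. */
#define GL_NO_ERROR                      0
#define GL_INVALID_ENUM                  0x0500
#define GL_TEXTURE_2D                    0x0DE1
#define GL_TEXTURE_2D_MULTISAMPLE        0x9100
#define GL_TEXTURE_2D_MULTISAMPLE_ARRAY  0x9102

/* Hypothetical stand-in for a texture object: only what the check needs. */
struct tex_object {
   unsigned target;
   int border_color[4];
};

/* Multisample targets carry no sampler state, so they reject sampler
 * parameters such as the border color. */
static bool
target_allows_sampler_params(unsigned target)
{
   return target != GL_TEXTURE_2D_MULTISAMPLE &&
          target != GL_TEXTURE_2D_MULTISAMPLE_ARRAY;
}

/* Returns the GL error the call would raise. On error the texture state
 * is left untouched. */
static unsigned
set_border_color_i(struct tex_object *tex, const int params[4])
{
   if (!target_allows_sampler_params(tex->target))
      return GL_INVALID_ENUM;

   for (int i = 0; i < 4; i++)
      tex->border_color[i] = params[i];

   return GL_NO_ERROR;
}
```

Calling set_border_color_i() on an object whose target is
GL_TEXTURE_2D_MULTISAMPLE returns GL_INVALID_ENUM and leaves border_color
unchanged, while a GL_TEXTURE_2D object accepts the new color.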

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/3] dri: fixup driver names if needed

2016-11-07 Thread Christian Gmeiner
2016-11-04 12:59 GMT+01:00 Eric Engestrom :
> On Thursday, 2016-11-03 15:25:22 +0100, Christian Gmeiner wrote:
>> This makes it possible to 'use' the imx-drm driver. Remember that it
>> is not possible to have symbol names in C/C++ with a '-' in them.
>>
>> Signed-off-by: Christian Gmeiner 
>> ---
>>  include/GL/internal/dri_interface.h | 7 +++
>>  1 file changed, 7 insertions(+)
>>
>> diff --git a/include/GL/internal/dri_interface.h 
>> b/include/GL/internal/dri_interface.h
>> index 36ba65e..4ec3211 100644
>> --- a/include/GL/internal/dri_interface.h
>> +++ b/include/GL/internal/dri_interface.h
>> @@ -42,6 +42,7 @@
>>
>>  #include 
>>  #include 
>> +#include 
>>  #include 
>>  #ifdef HAVE_LIBDRM
>>  #include 
>> @@ -617,6 +618,12 @@ dri_get_extensions_name(const char *driver_name)
>>   if (asprintf(&name, "%s_%s", __DRI_DRIVER_GET_EXTENSIONS, driver_name) 
>> < 0)
>>   return NULL;
>>
>> + const size_t len = strlen(name);
>> + for (size_t i = 0; i < len; i++) {
>> + if (name[i] == '-')
>> + name[i] = '_';
>
> Why not replace all non-alnum chars?
>

I am not sure which character I should replace them with.

> Either way, the series is:
> Reviewed-by: Eric Engestrom 
>
> Also, for the asprintf->malloc change, I think that should be a separate
> patch, as the code being deduplicated in the first patch was already
> using asprintf :)
>

good point - v2 will hit ml soon.

>> + }
>> +
>>   return name;
>>  }
>>
>> --
>> 2.7.4
>>
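
One possible answer to the non-alnum question, as a rough standalone
sketch rather than the patch itself: map every character that cannot
appear in a C identifier to '_'. The function name get_extensions_symbol()
and the macro GET_EXTENSIONS_PREFIX are hypothetical, the prefix value is
an assumption about what __DRI_DRIVER_GET_EXTENSIONS expands to, and
malloc/snprintf are used instead of asprintf only to keep the example
portable.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Assumed value of the prefix; the real code uses the macro
 * __DRI_DRIVER_GET_EXTENSIONS from dri_interface.h. */
#define GET_EXTENSIONS_PREFIX "__driDriverGetExtensions"

/*
 * Build "<prefix>_<driver_name>" and replace every character that is not
 * valid in a C identifier with '_'. Returns a malloc'ed string that the
 * caller must free, or NULL on allocation failure.
 */
static char *
get_extensions_symbol(const char *driver_name)
{
   const size_t len = strlen(GET_EXTENSIONS_PREFIX) + 1 + strlen(driver_name);
   char *name = malloc(len + 1);

   if (!name)
      return NULL;

   snprintf(name, len + 1, "%s_%s", GET_EXTENSIONS_PREFIX, driver_name);

   for (size_t i = 0; i < len; i++) {
      /* The cast avoids undefined behaviour for negative char values. */
      if (!isalnum((unsigned char)name[i]) && name[i] != '_')
         name[i] = '_';
   }

   return name;
}
```

With this sketch, a driver name of "imx-drm" yields a symbol name ending
in "_imx_drm". Using '_' for every invalid character keeps the mapping
simple, at the cost that two different driver names could in principle
collapse to the same symbol.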

greets
--
Christian Gmeiner, MSc

https://soundcloud.com/christian-gmeiner
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 5/8] i965/miptree: Create a hiz mcs type

2016-11-07 Thread Pohjolainen, Topi
On Mon, Nov 07, 2016 at 10:24:35AM +, Lionel Landwerlin wrote:
> On 07/11/16 10:18, Pohjolainen, Topi wrote:
> > On Thu, Nov 03, 2016 at 10:39:40AM +, Lionel Landwerlin wrote:
> > > From: Ben Widawsky 
> > > 
> > > This seems counter to the goal of consolidating hiz, mcs, and later ccs 
> > > buffers.
> > > Unfortunately, hiz on gen6 is a thing the code supports, and this wart 
> > > will be
> > > helpful to achieve that. Overall, I believe it does help unify AUX 
> > > buffers on
> > > gen7+.
> > > 
> > > I updated the size field which I introduced in the previous patch, even 
> > > though
> > > we have no use for it.
> > > 
> > > XXX: As I mentioned in the last patch, the height given to the MCS buffer
> > > allocation in intel_miptree_alloc_mcs() looks wrong, but I don't claim to 
> > > fully
> > > understand how the MCS buffer is laid out.
> > > 
> > > v2: rebase on master (Lionel)
> > > 
> > > Signed-off-by: Ben Widawsky  (v1)
> > > Signed-off-by: Lionel Landwerlin  (v2)
> > > ---
> > >   src/mesa/drivers/dri/i965/brw_blorp.c |  4 +-
> > >   src/mesa/drivers/dri/i965/gen7_misc_state.c   |  6 +--
> > >   src/mesa/drivers/dri/i965/gen8_depth_state.c  |  6 +--
> > >   src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 61 
> > > ++-
> > >   src/mesa/drivers/dri/i965/intel_mipmap_tree.h | 14 --
> > >   5 files changed, 49 insertions(+), 42 deletions(-)
> > > 
> > > diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c 
> > > b/src/mesa/drivers/dri/i965/brw_blorp.c
> > > index 5adb4c6..f0ad074 100644
> > > --- a/src/mesa/drivers/dri/i965/brw_blorp.c
> > > +++ b/src/mesa/drivers/dri/i965/brw_blorp.c
> > > @@ -214,8 +214,8 @@ blorp_surf_for_miptree(struct brw_context *brw,
> > >   }
> > >   assert(hiz_mt->pitch == aux_surf->row_pitch);
> > >} else {
> > > -surf->aux_addr.buffer = mt->hiz_buf->bo;
> > > -surf->aux_addr.offset = 0;
> > > +surf->aux_addr.buffer = mt->hiz_buf->aux_base.bo;
> > > +surf->aux_addr.offset = mt->hiz_buf->aux_base.offset;
> > >}
> > > }
> > >  } else {
> > > diff --git a/src/mesa/drivers/dri/i965/gen7_misc_state.c 
> > > b/src/mesa/drivers/dri/i965/gen7_misc_state.c
> > > index 271d962..7bd5cd5 100644
> > > --- a/src/mesa/drivers/dri/i965/gen7_misc_state.c
> > > +++ b/src/mesa/drivers/dri/i965/gen7_misc_state.c
> > > @@ -146,13 +146,13 @@ gen7_emit_depth_stencil_hiz(struct brw_context *brw,
> > > ADVANCE_BATCH();
> > >  } else {
> > > assert(depth_mt);
> > > -  struct intel_miptree_aux_buffer *hiz_buf = depth_mt->hiz_buf;
> > > +  struct intel_miptree_hiz_buffer *hiz_buf = depth_mt->hiz_buf;
> > > BEGIN_BATCH(3);
> > > OUT_BATCH(GEN7_3DSTATE_HIER_DEPTH_BUFFER << 16 | (3 - 2));
> > > OUT_BATCH((mocs << 25) |
> > > -(hiz_buf->pitch - 1));
> > > -  OUT_RELOC(hiz_buf->bo,
> > > +(hiz_buf->aux_base.pitch - 1));
> > > +  OUT_RELOC(hiz_buf->aux_base.bo,
> > >   I915_GEM_DOMAIN_RENDER,
> > >   I915_GEM_DOMAIN_RENDER,
> > >   0);
> > > diff --git a/src/mesa/drivers/dri/i965/gen8_depth_state.c 
> > > b/src/mesa/drivers/dri/i965/gen8_depth_state.c
> > > index 73b2186..8920910 100644
> > > --- a/src/mesa/drivers/dri/i965/gen8_depth_state.c
> > > +++ b/src/mesa/drivers/dri/i965/gen8_depth_state.c
> > > @@ -93,10 +93,10 @@ emit_depth_packets(struct brw_context *brw,
> > > assert(depth_mt);
> > > BEGIN_BATCH(5);
> > > OUT_BATCH(GEN7_3DSTATE_HIER_DEPTH_BUFFER << 16 | (5 - 2));
> > > -  OUT_BATCH((depth_mt->hiz_buf->pitch - 1) | mocs_wb << 25);
> > > -  OUT_RELOC64(depth_mt->hiz_buf->bo,
> > > +  OUT_BATCH((depth_mt->hiz_buf->aux_base.pitch - 1) | mocs_wb << 25);
> > > +  OUT_RELOC64(depth_mt->hiz_buf->aux_base.bo,
> > > I915_GEM_DOMAIN_RENDER, I915_GEM_DOMAIN_RENDER, 0);
> > > -  OUT_BATCH(depth_mt->hiz_buf->qpitch >> 2);
> > > +  OUT_BATCH(depth_mt->hiz_buf->aux_base.qpitch >> 2);
> > > ADVANCE_BATCH();
> > >  }
> > > diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
> > > b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > > index 3d1bdb1..af0e1a4 100644
> > > --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > > +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > > @@ -1016,11 +1016,10 @@ intel_miptree_release(struct intel_mipmap_tree 
> > > **mt)
> > >if ((*mt)->hiz_buf->mt)
> > >   intel_miptree_release(&(*mt)->hiz_buf->mt);
> > >else
> > > -drm_intel_bo_unreference((*mt)->hiz_buf->bo);
> > > +drm_intel_bo_unreference((*mt)->hiz_buf->aux_base.bo);
> > >free((*mt)->hiz_buf);
> > > }
> > > if ((*mt)->mcs_buf) {
> > > - 

Re: [Mesa-dev] [PATCH 5/8] i965/miptree: Create a hiz mcs type

2016-11-07 Thread Lionel Landwerlin

On 07/11/16 10:18, Pohjolainen, Topi wrote:

On Thu, Nov 03, 2016 at 10:39:40AM +, Lionel Landwerlin wrote:

From: Ben Widawsky 

This seems counter to the goal of consolidating hiz, mcs, and later ccs buffers.
Unfortunately, hiz on gen6 is a thing the code supports, and this wart will be
helpful to achieve that. Overall, I believe it does help unify AUX buffers on
gen7+.

I updated the size field which I introduced in the previous patch, even though
we have no use for it.

XXX: As I mentioned in the last patch, the height given to the MCS buffer
allocation in intel_miptree_alloc_mcs() looks wrong, but I don't claim to fully
understand how the MCS buffer is laid out.

v2: rebase on master (Lionel)

Signed-off-by: Ben Widawsky  (v1)
Signed-off-by: Lionel Landwerlin  (v2)
---
  src/mesa/drivers/dri/i965/brw_blorp.c |  4 +-
  src/mesa/drivers/dri/i965/gen7_misc_state.c   |  6 +--
  src/mesa/drivers/dri/i965/gen8_depth_state.c  |  6 +--
  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 61 ++-
  src/mesa/drivers/dri/i965/intel_mipmap_tree.h | 14 --
  5 files changed, 49 insertions(+), 42 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c 
b/src/mesa/drivers/dri/i965/brw_blorp.c
index 5adb4c6..f0ad074 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.c
+++ b/src/mesa/drivers/dri/i965/brw_blorp.c
@@ -214,8 +214,8 @@ blorp_surf_for_miptree(struct brw_context *brw,
  }
  assert(hiz_mt->pitch == aux_surf->row_pitch);
   } else {
-surf->aux_addr.buffer = mt->hiz_buf->bo;
-surf->aux_addr.offset = 0;
+surf->aux_addr.buffer = mt->hiz_buf->aux_base.bo;
+surf->aux_addr.offset = mt->hiz_buf->aux_base.offset;
   }
}
 } else {
diff --git a/src/mesa/drivers/dri/i965/gen7_misc_state.c 
b/src/mesa/drivers/dri/i965/gen7_misc_state.c
index 271d962..7bd5cd5 100644
--- a/src/mesa/drivers/dri/i965/gen7_misc_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_misc_state.c
@@ -146,13 +146,13 @@ gen7_emit_depth_stencil_hiz(struct brw_context *brw,
ADVANCE_BATCH();
 } else {
assert(depth_mt);
-  struct intel_miptree_aux_buffer *hiz_buf = depth_mt->hiz_buf;
+  struct intel_miptree_hiz_buffer *hiz_buf = depth_mt->hiz_buf;
  
BEGIN_BATCH(3);

OUT_BATCH(GEN7_3DSTATE_HIER_DEPTH_BUFFER << 16 | (3 - 2));
OUT_BATCH((mocs << 25) |
-(hiz_buf->pitch - 1));
-  OUT_RELOC(hiz_buf->bo,
+(hiz_buf->aux_base.pitch - 1));
+  OUT_RELOC(hiz_buf->aux_base.bo,
  I915_GEM_DOMAIN_RENDER,
  I915_GEM_DOMAIN_RENDER,
  0);
diff --git a/src/mesa/drivers/dri/i965/gen8_depth_state.c 
b/src/mesa/drivers/dri/i965/gen8_depth_state.c
index 73b2186..8920910 100644
--- a/src/mesa/drivers/dri/i965/gen8_depth_state.c
+++ b/src/mesa/drivers/dri/i965/gen8_depth_state.c
@@ -93,10 +93,10 @@ emit_depth_packets(struct brw_context *brw,
assert(depth_mt);
BEGIN_BATCH(5);
OUT_BATCH(GEN7_3DSTATE_HIER_DEPTH_BUFFER << 16 | (5 - 2));
-  OUT_BATCH((depth_mt->hiz_buf->pitch - 1) | mocs_wb << 25);
-  OUT_RELOC64(depth_mt->hiz_buf->bo,
+  OUT_BATCH((depth_mt->hiz_buf->aux_base.pitch - 1) | mocs_wb << 25);
+  OUT_RELOC64(depth_mt->hiz_buf->aux_base.bo,
I915_GEM_DOMAIN_RENDER, I915_GEM_DOMAIN_RENDER, 0);
-  OUT_BATCH(depth_mt->hiz_buf->qpitch >> 2);
+  OUT_BATCH(depth_mt->hiz_buf->aux_base.qpitch >> 2);
ADVANCE_BATCH();
 }
  
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c

index 3d1bdb1..af0e1a4 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -1016,11 +1016,10 @@ intel_miptree_release(struct intel_mipmap_tree **mt)
   if ((*mt)->hiz_buf->mt)
  intel_miptree_release(&(*mt)->hiz_buf->mt);
   else
-drm_intel_bo_unreference((*mt)->hiz_buf->bo);
+drm_intel_bo_unreference((*mt)->hiz_buf->aux_base.bo);
   free((*mt)->hiz_buf);
}
if ((*mt)->mcs_buf) {
- intel_miptree_release(&(*mt)->mcs_buf->mt);

Doesn't this belong in one of the earlier patches?


   free((*mt)->mcs_buf);
}
intel_resolve_map_clear(&(*mt)->hiz_map);
@@ -1726,7 +1725,7 @@ intel_miptree_level_enable_hiz(struct brw_context *brw,
   * Helper for intel_miptree_alloc_hiz() that determines the required hiz
   * buffer dimensions and allocates a bo for the hiz buffer.
   */
-static struct intel_miptree_aux_buffer *
+static struct intel_miptree_hiz_buffer *
  intel_gen7_hiz_buf_create(struct brw_context *brw,
struct intel_mipmap_tree *mt)
  {
@@ -1734,7 +1733,7 @@ intel_gen7_hiz_buf_create(struct brw_context *brw,
 unsigned 

Re: [Mesa-dev] [PATCH 4/8] i965: Store mcs buffer size

2016-11-07 Thread Lionel Landwerlin

On 07/11/16 10:07, Pohjolainen, Topi wrote:

On Thu, Nov 03, 2016 at 10:39:39AM +, Lionel Landwerlin wrote:

From: Ben Widawsky 

libdrm may round up the allocation requested by mesa. As a result, accesses
through the gtt may end up accessing memory which does not belong to mesa. The
problem is described in the following commit:
commit 7ae870211ddc40ef6ed209a322c3a721214bb737
Author: Eric Anholt 
Date:   Mon Apr 14 16:52:43 2014 -0700

 i965: Fix buffer overruns in MSAA MCS buf This size field is an alternate

In that patch this was solved by making sure we only 1'd the logical size of the
buffer. This patch becomes necessary because the miptree data structure is going
to go away in the upcoming patch and we won't have access to the total_height
field anymore.

v2: drop setting the size in intel_hiz_miptree_buf_create() (Lionel)

Signed-off-by: Ben Widawsky  (v1)
Signed-off-by: Lionel Landwerlin  (v2)
---
  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 4 ++--
  src/mesa/drivers/dri/i965/intel_mipmap_tree.h | 9 +
  2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index c2bff17..3d1bdb1 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -1503,8 +1503,7 @@ intel_miptree_init_mcs(struct brw_context *brw,
return;
 }
 void *data = mt->mcs_buf->bo->virtual;
-   memset(data, init_value,
-  mt->mcs_buf->mt->total_height * mt->mcs_buf->mt->pitch);

If I read the previous patch right, this is already needed there as
mcs_buf->mt is left NULL?


Thanks, that's wrong indeed.
I think it makes sense to squash patch 3 & 4.




+   memset(data, init_value, mt->mcs_buf->size);
 drm_intel_bo_unmap(mt->mcs_buf->bo);
 mt->fast_clear_state = INTEL_FAST_CLEAR_STATE_CLEAR;
  }
@@ -1545,6 +1544,7 @@ intel_mcs_miptree_buf_create(struct brw_context *brw,
  
 buf->bo = temp_mt->bo;

 buf->offset = temp_mt->offset;
+   buf->size = temp_mt->total_height * temp_mt->pitch;
 buf->pitch = temp_mt->pitch;
 buf->qpitch = temp_mt->qpitch;
  
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h

index 0b49dc2..0b4b353 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
@@ -350,6 +350,15 @@ struct intel_miptree_aux_buffer
  */
 uint32_t offset;
  
+   /*

+* Size of the MCS surface.
+*
+* This is needed when doing any gtt mapped operations on the buffer (which
+* will be Y-tiled). It is possible that it will not be the same as bo->size
+* when the drm allocator rounds up the requested size.
+*/
+   size_t size;
+
 /**
  * Pitch in bytes.
  *
--
2.10.2
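
The point of the new size field can be shown with a small standalone
sketch. Everything here is hypothetical (struct fake_bo, struct
aux_buffer, ALLOC_GRANULARITY and the helper names are made up for the
example, and plain calloc stands in for the drm allocator). It only
demonstrates the idea from the commit message: the allocator may hand
back more than was requested, so initialisation should cover the logical
size, not the size of the allocation.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define ALLOC_GRANULARITY 4096u   /* assumed rounding of the allocator */

/* Hypothetical stand-in for a buffer object. */
struct fake_bo {
   size_t size;   /* what the allocator actually handed out */
   void *map;     /* CPU mapping of the buffer */
};

/* Hypothetical stand-in for the aux buffer with the new size field. */
struct aux_buffer {
   struct fake_bo *bo;
   size_t size;      /* logical size: total_height * pitch */
   uint32_t pitch;
};

static size_t
round_up(size_t value, size_t align)
{
   return (value + align - 1) / align * align;
}

static struct aux_buffer *
aux_buffer_create(uint32_t pitch, uint32_t total_height)
{
   struct aux_buffer *buf = calloc(1, sizeof(*buf));
   if (!buf)
      return NULL;

   buf->bo = calloc(1, sizeof(*buf->bo));
   if (!buf->bo) {
      free(buf);
      return NULL;
   }

   buf->pitch = pitch;
   buf->size = (size_t)total_height * pitch;
   /* The allocation is rounded up, so bo->size may exceed buf->size. */
   buf->bo->size = round_up(buf->size, ALLOC_GRANULARITY);
   buf->bo->map = calloc(1, buf->bo->size);
   if (!buf->bo->map) {
      free(buf->bo);
      free(buf);
      return NULL;
   }
   return buf;
}

/* Initialise only the logical size; the padding is never touched. */
static void
aux_buffer_init(struct aux_buffer *buf, int init_value)
{
   memset(buf->bo->map, init_value, buf->size);
}

static void
aux_buffer_destroy(struct aux_buffer *buf)
{
   if (!buf)
      return;
   free(buf->bo->map);
   free(buf->bo);
   free(buf);
}
```

For a pitch of 128 and a total height of 40 the logical size is 5120
bytes while the rounded allocation is 8192 bytes; aux_buffer_init()
writes the first 5120 bytes and leaves the remainder alone.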

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev



Re: [Mesa-dev] [PATCH 5/8] i965/miptree: Create a hiz mcs type

2016-11-07 Thread Pohjolainen, Topi
On Thu, Nov 03, 2016 at 10:39:40AM +, Lionel Landwerlin wrote:
> From: Ben Widawsky 
> 
> This seems counter to the goal of consolidating hiz, mcs, and later ccs 
> buffers.
> Unfortunately, hiz on gen6 is a thing the code supports, and this wart will be
> helpful to achieve that. Overall, I believe it does help unify AUX buffers on
> gen7+.
> 
> I updated the size field which I introduced in the previous patch, even though
> we have no use for it.
> 
> XXX: As I mentioned in the last patch, the height given to the MCS buffer
> allocation in intel_miptree_alloc_mcs() looks wrong, but I don't claim to 
> fully
> understand how the MCS buffer is laid out.
> 
> v2: rebase on master (Lionel)
> 
> Signed-off-by: Ben Widawsky  (v1)
> Signed-off-by: Lionel Landwerlin  (v2)
> ---
>  src/mesa/drivers/dri/i965/brw_blorp.c |  4 +-
>  src/mesa/drivers/dri/i965/gen7_misc_state.c   |  6 +--
>  src/mesa/drivers/dri/i965/gen8_depth_state.c  |  6 +--
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 61 
> ++-
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.h | 14 --
>  5 files changed, 49 insertions(+), 42 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c 
> b/src/mesa/drivers/dri/i965/brw_blorp.c
> index 5adb4c6..f0ad074 100644
> --- a/src/mesa/drivers/dri/i965/brw_blorp.c
> +++ b/src/mesa/drivers/dri/i965/brw_blorp.c
> @@ -214,8 +214,8 @@ blorp_surf_for_miptree(struct brw_context *brw,
>  }
>  assert(hiz_mt->pitch == aux_surf->row_pitch);
>   } else {
> -surf->aux_addr.buffer = mt->hiz_buf->bo;
> -surf->aux_addr.offset = 0;
> +surf->aux_addr.buffer = mt->hiz_buf->aux_base.bo;
> +surf->aux_addr.offset = mt->hiz_buf->aux_base.offset;
>   }
>}
> } else {
> diff --git a/src/mesa/drivers/dri/i965/gen7_misc_state.c 
> b/src/mesa/drivers/dri/i965/gen7_misc_state.c
> index 271d962..7bd5cd5 100644
> --- a/src/mesa/drivers/dri/i965/gen7_misc_state.c
> +++ b/src/mesa/drivers/dri/i965/gen7_misc_state.c
> @@ -146,13 +146,13 @@ gen7_emit_depth_stencil_hiz(struct brw_context *brw,
>ADVANCE_BATCH();
> } else {
>assert(depth_mt);
> -  struct intel_miptree_aux_buffer *hiz_buf = depth_mt->hiz_buf;
> +  struct intel_miptree_hiz_buffer *hiz_buf = depth_mt->hiz_buf;
>  
>BEGIN_BATCH(3);
>OUT_BATCH(GEN7_3DSTATE_HIER_DEPTH_BUFFER << 16 | (3 - 2));
>OUT_BATCH((mocs << 25) |
> -(hiz_buf->pitch - 1));
> -  OUT_RELOC(hiz_buf->bo,
> +(hiz_buf->aux_base.pitch - 1));
> +  OUT_RELOC(hiz_buf->aux_base.bo,
>  I915_GEM_DOMAIN_RENDER,
>  I915_GEM_DOMAIN_RENDER,
>  0);
> diff --git a/src/mesa/drivers/dri/i965/gen8_depth_state.c 
> b/src/mesa/drivers/dri/i965/gen8_depth_state.c
> index 73b2186..8920910 100644
> --- a/src/mesa/drivers/dri/i965/gen8_depth_state.c
> +++ b/src/mesa/drivers/dri/i965/gen8_depth_state.c
> @@ -93,10 +93,10 @@ emit_depth_packets(struct brw_context *brw,
>assert(depth_mt);
>BEGIN_BATCH(5);
>OUT_BATCH(GEN7_3DSTATE_HIER_DEPTH_BUFFER << 16 | (5 - 2));
> -  OUT_BATCH((depth_mt->hiz_buf->pitch - 1) | mocs_wb << 25);
> -  OUT_RELOC64(depth_mt->hiz_buf->bo,
> +  OUT_BATCH((depth_mt->hiz_buf->aux_base.pitch - 1) | mocs_wb << 25);
> +  OUT_RELOC64(depth_mt->hiz_buf->aux_base.bo,
>I915_GEM_DOMAIN_RENDER, I915_GEM_DOMAIN_RENDER, 0);
> -  OUT_BATCH(depth_mt->hiz_buf->qpitch >> 2);
> +  OUT_BATCH(depth_mt->hiz_buf->aux_base.qpitch >> 2);
>ADVANCE_BATCH();
> }
>  
> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
> b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> index 3d1bdb1..af0e1a4 100644
> --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> @@ -1016,11 +1016,10 @@ intel_miptree_release(struct intel_mipmap_tree **mt)
>   if ((*mt)->hiz_buf->mt)
>  intel_miptree_release(&(*mt)->hiz_buf->mt);
>   else
> -drm_intel_bo_unreference((*mt)->hiz_buf->bo);
> +drm_intel_bo_unreference((*mt)->hiz_buf->aux_base.bo);
>   free((*mt)->hiz_buf);
>}
>if ((*mt)->mcs_buf) {
> - intel_miptree_release(&(*mt)->mcs_buf->mt);

Doesn't this belong in one of the earlier patches?

>   free((*mt)->mcs_buf);
>}
>intel_resolve_map_clear(&(*mt)->hiz_map);
> @@ -1726,7 +1725,7 @@ intel_miptree_level_enable_hiz(struct brw_context *brw,
>   * Helper for intel_miptree_alloc_hiz() that determines the required hiz
>   * buffer dimensions and allocates a bo for the hiz buffer.
>   */
> -static struct intel_miptree_aux_buffer *
> +static struct intel_miptree_hiz_buffer *
>  intel_gen7_hiz_buf_create(struct brw_context *brw,
> 
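
For readers following the aux_base renames, the layout the diff appears
to introduce can be sketched as below. This is a guess reconstructed from
the member accesses visible in the hunks (hiz_buf->aux_base.bo, .offset,
.pitch, .qpitch and hiz_buf->mt), not a copy of intel_mipmap_tree.h. All
type names are hypothetical, and a plain counter stands in for the real
buffer-object reference counting.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for drm_intel_bo and intel_mipmap_tree. */
struct fake_bo {
   int refcount;
};

struct fake_miptree {
   struct fake_bo *bo;
};

/* Fields shared by every auxiliary buffer (names taken from the diff). */
struct aux_buffer {
   struct fake_bo *bo;
   uint32_t offset;
   uint32_t pitch;
   uint32_t qpitch;
};

/* The HiZ type embeds the common part and adds an optional miptree. */
struct hiz_buffer {
   struct aux_buffer aux_base;
   struct fake_miptree *mt;
};

static void
bo_unreference(struct fake_bo *bo)
{
   if (bo)
      bo->refcount--;
}

static void
miptree_release(struct fake_miptree **mt)
{
   if (!*mt)
      return;
   bo_unreference((*mt)->bo);
   free(*mt);
   *mt = NULL;
}

/* Mirrors the release logic in the hunk: when a miptree backs the HiZ
 * buffer, releasing it drops the BO; otherwise the BO is dropped
 * directly through aux_base. */
static void
hiz_buffer_release(struct hiz_buffer **hiz)
{
   if (!*hiz)
      return;

   if ((*hiz)->mt)
      miptree_release(&(*hiz)->mt);
   else
      bo_unreference((*hiz)->aux_base.bo);

   free(*hiz);
   *hiz = NULL;
}
```

Either way the buffer object is unreferenced exactly once, which is the
property the if/else in intel_miptree_release() has to preserve after
the rename.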

Re: [Mesa-dev] [PATCH 3/8] i965: Drop the aux mt when not used

2016-11-07 Thread Pohjolainen, Topi
On Mon, Nov 07, 2016 at 11:38:09AM +0200, Pohjolainen, Topi wrote:
> On Thu, Nov 03, 2016 at 10:39:38AM +, Lionel Landwerlin wrote:
> > From: Ben Widawsky 
> > 
> > This patch will preserve the BO & offset, and not the miptree for the
> > > aux_mcs buffer. Eventually it might make sense to pull out the sizing
> > function in miptree creation, but for now this should be sufficient
> > and not too hideous.
> > 
> > v2: Save BO's offset too (Lionel)
> > 
> > Signed-off-by: Ben Widawsky  (v1)
> > Signed-off-by: Lionel Landwerlin  (v2)
> > ---
> >  src/mesa/drivers/dri/i965/brw_blorp.c|  4 ++--
> >  src/mesa/drivers/dri/i965/brw_wm_surface_state.c |  6 +++---
> >  src/mesa/drivers/dri/i965/intel_mipmap_tree.c| 25 
> > 
> >  src/mesa/drivers/dri/i965/intel_mipmap_tree.h| 12 
> >  4 files changed, 34 insertions(+), 13 deletions(-)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c 
> > b/src/mesa/drivers/dri/i965/brw_blorp.c
> > index d733b35..5adb4c6 100644
> > --- a/src/mesa/drivers/dri/i965/brw_blorp.c
> > +++ b/src/mesa/drivers/dri/i965/brw_blorp.c
> > @@ -194,8 +194,8 @@ blorp_surf_for_miptree(struct brw_context *brw,
> >};
> >  
> >if (mt->mcs_buf) {
> > - surf->aux_addr.buffer = mt->mcs_buf->mt->bo;
> > - surf->aux_addr.offset = mt->mcs_buf->mt->offset;
> > + surf->aux_addr.buffer = mt->mcs_buf->bo;
> > + surf->aux_addr.offset = mt->mcs_buf->offset;
> >} else {
> >   assert(surf->aux_usage == ISL_AUX_USAGE_HIZ);
> >   struct intel_mipmap_tree *hiz_mt = mt->hiz_buf->mt;
> > diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
> > b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> > index d6b799c..bff423e 100644
> > --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> > +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> > @@ -146,8 +146,8 @@ brw_emit_surface_state(struct brw_context *brw,
> > if (mt->mcs_buf && !(flags & INTEL_AUX_BUFFER_DISABLED)) {
> > >intel_miptree_get_aux_isl_surf(brw, mt, &aux_surf_s, &aux_usage);
> > >aux_surf = &aux_surf_s;
> > -  assert(mt->mcs_buf->mt->offset == 0);
> > -  aux_offset = mt->mcs_buf->mt->bo->offset64;
> > +  assert(mt->mcs_buf->offset == 0);
> > +  aux_offset = mt->mcs_buf->bo->offset64;
> >  
> >/* We only really need a clear color if we also have an auxiliary
> > * surfacae.  Without one, it does nothing.
> > @@ -181,7 +181,7 @@ brw_emit_surface_state(struct brw_context *brw,
> >assert((aux_offset & 0xfff) == 0);
> >drm_intel_bo_emit_reloc(brw->batch.bo,
> >*surf_offset + 4 * ss_info.aux_reloc_dw,
> > -  mt->mcs_buf->mt->bo,
> > +  mt->mcs_buf->bo,
> >dw[ss_info.aux_reloc_dw] & 0xfff,
> >read_domains, write_domains);
> > }
> > diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
> > b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > index 0001511..c2bff17 100644
> > --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > @@ -1498,7 +1498,7 @@ intel_miptree_init_mcs(struct brw_context *brw,
> >  */
> > const int ret = brw_bo_map_gtt(brw, mt->mcs_buf->bo, "miptree");
> > if (unlikely(ret)) {
> > > -  intel_miptree_release(&mt->mcs_buf->mt);
> > +  drm_intel_bo_unreference(mt->mcs_buf->bo);
> >free(mt->mcs_buf);
> >return;
> > }
> > @@ -1518,6 +1518,7 @@ intel_mcs_miptree_buf_create(struct brw_context *brw,
> >   uint32_t layout_flags)
> >  {
> > struct intel_miptree_aux_buffer *buf = calloc(sizeof(*buf), 1);
> > +   struct intel_mipmap_tree *temp_mt;
> >  
> > if (!buf)
> >return NULL;
> > @@ -1527,7 +1528,7 @@ intel_mcs_miptree_buf_create(struct brw_context *brw,
> >  * "The MCS surface must be stored as Tile Y."
> >  */
> > layout_flags |= MIPTREE_LAYOUT_TILING_Y;
> > -   buf->mt = miptree_create(brw,
> > +   temp_mt = miptree_create(brw,
> >  mt->target,
> >  format,
> >  mt->first_level,
> > @@ -1537,14 +1538,22 @@ intel_mcs_miptree_buf_create(struct brw_context 
> > *brw,
> >  mt->logical_depth0,
> >  0 /* num_samples */,
> >  layout_flags);
> > -   if (!buf->mt) {
> > +   if (!temp_mt) {
> >free(buf);
> >return NULL;
> > }
> >  
> > -   buf->bo = buf->mt->bo;
> > -   buf->pitch = buf->mt->pitch;
> > -   buf->qpitch = buf->mt->qpitch;
> > +   buf->bo = temp_mt->bo;
> > +   buf->offset = temp_mt->offset;
> > +   buf->pitch = temp_mt->pitch;
> > +   buf->qpitch = 
