[Mesa-dev] [PATCH] [rfc] ac/surface: always increase dcc size alignment.

2017-08-13 Thread Dave Airlie
From: Dave Airlie 

With tile swizzle and DCC enabled, the vrdashboard GL app
generates a bunch of VM faults. This fixes them, but the app
now sometimes renders garbage; I'm sending this out just
to have a place to start.

(It could still be a tile swizzle import/export issue.)

Signed-off-by: Dave Airlie 
---
 src/amd/common/ac_surface.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/amd/common/ac_surface.c b/src/amd/common/ac_surface.c
index 823a65d..1203c2f 100644
--- a/src/amd/common/ac_surface.c
+++ b/src/amd/common/ac_surface.c
@@ -733,7 +733,7 @@ static int gfx6_compute_surface(ADDR_HANDLE addrlib,
 * This is what addrlib does, but calling addrlib would be a lot more
 * complicated.
 */
-   if (surf->dcc_size && config->info.levels > 1) {
+   if (surf->dcc_size) {
/* The smallest miplevels that are never compressed by DCC
 * still read the DCC buffer via TC if the base level uses DCC,
 * and for some reason the DCC buffer needs to be larger if
-- 
2.9.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] mesa/st: add support for hw atomics (v2)

2017-08-13 Thread Dave Airlie
On 9 August 2017 at 21:44, Nicolai Hähnle  wrote:
> Hi Dave,
>
> Thanks for the update, I prefer this.
>
> Have you considered Marek's query about pipeline-wide atomic buffers?
>
> The issue I'm thinking about is what happens when multiple shaders access
> the same atomic counter. In a GDS/GWS-based implementation, those accesses
> must map to the same hardware counter slot, as far as I understand it. But
> if you follow a straight-forward mapping based on today's TGSI, they won't
> be.
>
> There are two types of situations that can happen:
>
> 1. A GL program object has one counter that is accessed by multiple stages.
>
> 2. A GL program object (or even a single shader stage) has multiple counters
> with different bindings, and the same buffer range happens to be bound to
> both bindings.
>
> I'm pretty sure the second case is intended to be undefined behavior (see
> Issue 19 of ARB_shader_atomic_counters), but I believe the first case is
> intended to be supported.
>
> So *something* has to make sure that different stages end up accessing the
> same hardware counter. One way of doing this is to have a global (as opposed
> to per-shader-stage) list of atomic buffers at the Gallium pipe_context

So there is a reason I haven't considered it yet: r600 only supports
one atomic buffer for non-compute shaders. For compute shaders it
supports 8.

I'm not sure I'll get much out of changing that interface for a device
I can't really test it with.

I guess we should do the interface Marek suggested, but I think I'd rather wait
until radeonsi tackles it.

Dave.
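
To make Nicolai's first case concrete, here is a minimal sketch of a
pipeline-wide (rather than per-stage) counter table. Everything in it is
hypothetical: the struct names, the slot limit and the lookup function are
illustrations only and are not part of Gallium or of either driver. The idea
is just that every stage resolves a counter through the same table, so the
same (binding, offset) pair always lands in the same hardware slot.

```c
#include <assert.h>

#define DEMO_MAX_COUNTER_SLOTS 8 /* arbitrary limit chosen for the sketch */

/* Hypothetical key for one GL atomic counter: buffer binding + byte offset. */
struct demo_counter_key {
   unsigned binding;
   unsigned offset;
};

/* Hypothetical table owned by the context and shared by all shader stages. */
struct demo_counter_table {
   struct demo_counter_key keys[DEMO_MAX_COUNTER_SLOTS];
   unsigned num_slots;
};

/* Return the slot for (binding, offset), allocating one on first use.
 * Returns -1 when the table is full.
 */
static int
demo_counter_get_slot(struct demo_counter_table *t,
                      unsigned binding, unsigned offset)
{
   assert(t);

   for (unsigned i = 0; i < t->num_slots; i++) {
      if (t->keys[i].binding == binding && t->keys[i].offset == offset)
         return (int)i;
   }

   if (t->num_slots == DEMO_MAX_COUNTER_SLOTS)
      return -1;

   t->keys[t->num_slots].binding = binding;
   t->keys[t->num_slots].offset = offset;
   return (int)t->num_slots++;
}
```

With a per-stage table, a vertex shader and a fragment shader looking up the
same counter could each be handed a different slot; with one shared table
they cannot.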


[Mesa-dev] [PATCH 8/8] glsl: stop adding pointers from bindless structs to the cache

2017-08-13 Thread Timothy Arceri
This is so we always create reproducible cache entries. Consistency
is required for verification of any third party distributed shaders.
---
 src/compiler/glsl/shader_cache.cpp | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/src/compiler/glsl/shader_cache.cpp b/src/compiler/glsl/shader_cache.cpp
index aa63bdcf01..aa6c067d04 100644
--- a/src/compiler/glsl/shader_cache.cpp
+++ b/src/compiler/glsl/shader_cache.cpp
@@ -1212,32 +1212,34 @@ write_shader_metadata(struct blob *metadata, gl_linked_shader *shader)
 sizeof(glprog->SamplerUnits));
blob_write_bytes(metadata, glprog->sh.SamplerTargets,
 sizeof(glprog->sh.SamplerTargets));
blob_write_uint32(metadata, glprog->ShadowSamplers);
 
blob_write_bytes(metadata, glprog->sh.ImageAccess,
 sizeof(glprog->sh.ImageAccess));
blob_write_bytes(metadata, glprog->sh.ImageUnits,
 sizeof(glprog->sh.ImageUnits));
 
+   size_t ptr_size = sizeof(GLvoid *);
+
blob_write_uint32(metadata, glprog->sh.NumBindlessSamplers);
blob_write_uint32(metadata, glprog->sh.HasBoundBindlessSampler);
for (i = 0; i < glprog->sh.NumBindlessSamplers; i++) {
   blob_write_bytes(metadata, &glprog->sh.BindlessSamplers[i],
-   sizeof(struct gl_bindless_sampler));
+   sizeof(struct gl_bindless_sampler) - ptr_size);
}
 
blob_write_uint32(metadata, glprog->sh.NumBindlessImages);
blob_write_uint32(metadata, glprog->sh.HasBoundBindlessImage);
for (i = 0; i < glprog->sh.NumBindlessImages; i++) {
   blob_write_bytes(metadata, &glprog->sh.BindlessImages[i],
-   sizeof(struct gl_bindless_image));
+   sizeof(struct gl_bindless_image) - ptr_size);
}
 
write_shader_parameters(metadata, glprog->Parameters);
 }
 
 static void
 read_shader_metadata(struct blob_reader *metadata,
  struct gl_program *glprog,
  gl_linked_shader *linked)
 {
@@ -1251,43 +1253,45 @@ read_shader_metadata(struct blob_reader *metadata,
sizeof(glprog->SamplerUnits));
blob_copy_bytes(metadata, (uint8_t *) glprog->sh.SamplerTargets,
sizeof(glprog->sh.SamplerTargets));
glprog->ShadowSamplers = blob_read_uint32(metadata);
 
blob_copy_bytes(metadata, (uint8_t *) glprog->sh.ImageAccess,
sizeof(glprog->sh.ImageAccess));
blob_copy_bytes(metadata, (uint8_t *) glprog->sh.ImageUnits,
sizeof(glprog->sh.ImageUnits));
 
+   size_t ptr_size = sizeof(GLvoid *);
+
glprog->sh.NumBindlessSamplers = blob_read_uint32(metadata);
glprog->sh.HasBoundBindlessSampler = blob_read_uint32(metadata);
if (glprog->sh.NumBindlessSamplers > 0) {
   glprog->sh.BindlessSamplers =
  rzalloc_array(glprog, gl_bindless_sampler,
glprog->sh.NumBindlessSamplers);
 
   for (i = 0; i < glprog->sh.NumBindlessSamplers; i++) {
  blob_copy_bytes(metadata, (uint8_t *) &glprog->sh.BindlessSamplers[i],
- sizeof(struct gl_bindless_sampler));
+ sizeof(struct gl_bindless_sampler) - ptr_size);
   }
}
 
glprog->sh.NumBindlessImages = blob_read_uint32(metadata);
glprog->sh.HasBoundBindlessImage = blob_read_uint32(metadata);
if (glprog->sh.NumBindlessImages > 0) {
   glprog->sh.BindlessImages =
  rzalloc_array(glprog, gl_bindless_image,
glprog->sh.NumBindlessImages);
 
   for (i = 0; i < glprog->sh.NumBindlessImages; i++) {
  blob_copy_bytes(metadata, (uint8_t *) &glprog->sh.BindlessImages[i],
-sizeof(struct gl_bindless_image));
+sizeof(struct gl_bindless_image) - ptr_size);
   }
}
 
glprog->Parameters = _mesa_new_parameter_list();
read_shader_parameters(metadata, glprog->Parameters);
 }
 
 static void
 create_binding_str(const char *key, unsigned value, void *closure)
 {
-- 
2.13.4
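
For readers following along, a minimal sketch of the idea behind this patch.
It does not use Mesa's real structs or blob API; `demo_handle` and both
helpers are made up for illustration. The patch writes
`sizeof(struct) - ptr_size` bytes, which relies on the pointer being the last
member with no tail padding; the sketch expresses the same prefix with
`offsetof` and asserts that the two agree.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a bindless handle record (not Mesa's struct).
 * The pointer is deliberately the last member.
 */
struct demo_handle {
   unsigned unit;
   unsigned bound;
   void *data; /* process-specific, must not reach the cache */
};

/* Number of leading bytes that are safe to serialize. */
#define DEMO_HANDLE_CACHED_SIZE offsetof(struct demo_handle, data)

/* Copy only the pointer-free prefix of `h` into `out`; returns bytes written. */
static size_t
demo_handle_serialize(unsigned char *out, const struct demo_handle *h)
{
   /* Same quantity as "sizeof(struct) - ptr_size" when the layout has no
    * tail padding after the pointer.
    */
   assert(DEMO_HANDLE_CACHED_SIZE + sizeof(void *) ==
          sizeof(struct demo_handle));

   memcpy(out, h, DEMO_HANDLE_CACHED_SIZE);
   return DEMO_HANDLE_CACHED_SIZE;
}

/* Restore the prefix; the pointer is reset for the caller to fill in. */
static void
demo_handle_deserialize(struct demo_handle *h, const unsigned char *in)
{
   memcpy(h, in, DEMO_HANDLE_CACHED_SIZE);
   h->data = NULL;
}
```

Two records that differ only in their pointer now serialize to identical
bytes, which is the reproducibility property the commit message asks for.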



[Mesa-dev] [PATCH 2/8] glsl: stop adding pointers from gl_shader_variable to the cache

2017-08-13 Thread Timothy Arceri
This is so we always create reproducible cache entries. Consistency
is required for verification of any third party distributed shaders.
---
 src/compiler/glsl/shader_cache.cpp | 40 ++
 1 file changed, 28 insertions(+), 12 deletions(-)

diff --git a/src/compiler/glsl/shader_cache.cpp b/src/compiler/glsl/shader_cache.cpp
index 1fd49b82e9..e004ed4f64 100644
--- a/src/compiler/glsl/shader_cache.cpp
+++ b/src/compiler/glsl/shader_cache.cpp
@@ -871,40 +871,55 @@ write_shader_subroutine_index(struct blob *metadata,
for (unsigned j = 0; j < sh->Program->sh.NumSubroutineFunctions; j++) {
   if (strcmp(((gl_subroutine_function *)res->Data)->name,
  sh->Program->sh.SubroutineFunctions[j].name) == 0) {
  blob_write_uint32(metadata, j);
  break;
   }
}
 }
 
 static void
+get_shader_var_and_pointer_sizes(size_t *s_var_size, size_t *s_var_ptrs,
+ const gl_shader_variable *var)
+{
+   *s_var_size = sizeof(gl_shader_variable);
+   *s_var_ptrs =
+  sizeof(var->type) +
+  sizeof(var->interface_type) +
+  sizeof(var->outermost_struct_type) +
+  sizeof(var->name);
+}
+
+static void
 write_program_resource_data(struct blob *metadata,
 struct gl_shader_program *prog,
 struct gl_program_resource *res)
 {
struct gl_linked_shader *sh;
 
switch(res->Type) {
case GL_PROGRAM_INPUT:
case GL_PROGRAM_OUTPUT: {
   const gl_shader_variable *var = (gl_shader_variable *)res->Data;
-  blob_write_bytes(metadata, var, sizeof(gl_shader_variable));
+
   encode_type_to_blob(metadata, var->type);
+  encode_type_to_blob(metadata, var->interface_type);
+  encode_type_to_blob(metadata, var->outermost_struct_type);
 
-  if (var->interface_type)
- encode_type_to_blob(metadata, var->interface_type);
+  blob_write_string(metadata, var->name);
 
-  if (var->outermost_struct_type)
- encode_type_to_blob(metadata, var->outermost_struct_type);
+  size_t s_var_size, s_var_ptrs;
+  get_shader_var_and_pointer_sizes(&s_var_size, &s_var_ptrs, var);
 
-  blob_write_string(metadata, var->name);
+  /* Write gl_shader_variable skipping over the pointers */
+  blob_write_bytes(metadata, ((char *)var) + s_var_ptrs,
+   s_var_size - s_var_ptrs);
   break;
}
case GL_UNIFORM_BLOCK:
   for (unsigned i = 0; i < prog->data->NumUniformBlocks; i++) {
  if (strcmp(((gl_uniform_block *)res->Data)->Name,
 prog->data->UniformBlocks[i].Name) == 0) {
 blob_write_uint32(metadata, i);
 break;
  }
   }
@@ -981,30 +996,31 @@ read_program_resource_data(struct blob_reader *metadata,
struct gl_shader_program *prog,
struct gl_program_resource *res)
 {
struct gl_linked_shader *sh;
 
switch(res->Type) {
case GL_PROGRAM_INPUT:
case GL_PROGRAM_OUTPUT: {
   gl_shader_variable *var = ralloc(prog, struct gl_shader_variable);
 
-  blob_copy_bytes(metadata, (uint8_t *) var, sizeof(gl_shader_variable));
   var->type = decode_type_from_blob(metadata);
+  var->interface_type = decode_type_from_blob(metadata);
+  var->outermost_struct_type = decode_type_from_blob(metadata);
 
-  if (var->interface_type)
- var->interface_type = decode_type_from_blob(metadata);
+  var->name = ralloc_strdup(prog, blob_read_string(metadata));
 
-  if (var->outermost_struct_type)
- var->outermost_struct_type = decode_type_from_blob(metadata);
+  size_t s_var_size, s_var_ptrs;
+  get_shader_var_and_pointer_sizes(&s_var_size, &s_var_ptrs, var);
 
-  var->name = ralloc_strdup(prog, blob_read_string(metadata));
+  blob_copy_bytes(metadata, ((uint8_t *) var) + s_var_ptrs,
+  s_var_size - s_var_ptrs);
 
   res->Data = var;
   break;
}
case GL_UNIFORM_BLOCK:
   res->Data = &prog->data->UniformBlocks[blob_read_uint32(metadata)];
   break;
case GL_SHADER_STORAGE_BLOCK:
   res->Data = &prog->data->ShaderStorageBlocks[blob_read_uint32(metadata)];
   break;
-- 
2.13.4
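
A small sketch of the layout assumption this patch depends on, using a
made-up `demo_var` struct rather than `gl_shader_variable`. The pointers are
assumed to sit at the start of the struct with no padding between them, so
the sum of their sizes equals the offset of the first plain field. The writer
emits only the bytes after that offset, and the reader copies them back
without touching pointers it has already rebuilt.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical variable record: pointers first, plain data afterwards. */
struct demo_var {
   const void *type;
   const char *name;
   unsigned location;
   unsigned component;
};

/* Bytes occupied by the leading pointers. */
#define DEMO_VAR_PTR_BYTES offsetof(struct demo_var, location)

/* Write everything after the leading pointers; returns bytes written. */
static size_t
demo_var_write_tail(unsigned char *out, const struct demo_var *v)
{
   /* Summing the pointer sizes only works if nothing pads them apart. */
   assert(sizeof(v->type) + sizeof(v->name) == DEMO_VAR_PTR_BYTES);

   size_t n = sizeof(*v) - DEMO_VAR_PTR_BYTES;
   memcpy(out, (const unsigned char *)v + DEMO_VAR_PTR_BYTES, n);
   return n;
}

/* Restore the tail in place, leaving the already-set pointers untouched. */
static void
demo_var_read_tail(struct demo_var *v, const unsigned char *in)
{
   memcpy((unsigned char *)v + DEMO_VAR_PTR_BYTES, in,
          sizeof(*v) - DEMO_VAR_PTR_BYTES);
}
```

If a non-pointer member were ever added ahead of the pointers, the assert
would still pass while the skipped region silently became wrong, so the
member order is part of the contract.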



[Mesa-dev] [PATCH 6/8] glsl: always write a name/label string to the cache

2017-08-13 Thread Timothy Arceri
In the following patch we will stop writing the pointers to the cache,
so the reader can no longer test them to decide whether a string follows.

Unfortunately, always writing a string (empty if there is none) seems to
be the only thing we can do here once we no longer have the pointers.
---
 src/compiler/glsl/shader_cache.cpp | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/src/compiler/glsl/shader_cache.cpp b/src/compiler/glsl/shader_cache.cpp
index 26be9e1f88..0e7744bb0b 100644
--- a/src/compiler/glsl/shader_cache.cpp
+++ b/src/compiler/glsl/shader_cache.cpp
@@ -1308,24 +1308,23 @@ create_linked_shader_and_program(struct gl_context *ctx,
 
glprog = ctx->Driver.NewProgram(ctx, _mesa_shader_stage_to_program(stage),
prog->Name, false);
glprog->info.stage = stage;
linked->Program = glprog;
 
read_shader_metadata(metadata, glprog, linked);
 
/* Restore shader info */
blob_copy_bytes(metadata, (uint8_t *) &glprog->info, sizeof(shader_info));
-   if (glprog->info.name)
-  glprog->info.name = ralloc_strdup(glprog, blob_read_string(metadata));
-   if (glprog->info.label)
-  glprog->info.label = ralloc_strdup(glprog, blob_read_string(metadata));
+
+   glprog->info.name = ralloc_strdup(glprog, blob_read_string(metadata));
+   glprog->info.label = ralloc_strdup(glprog, blob_read_string(metadata));
 
_mesa_reference_shader_program_data(ctx, &glprog->sh.data, prog->data);
_mesa_reference_program(ctx, &linked->Program, glprog);
prog->_LinkedShaders[stage] = linked;
 }
 
 void
 shader_cache_write_program_metadata(struct gl_context *ctx,
 struct gl_shader_program *prog)
 {
@@ -1355,23 +1354,27 @@ shader_cache_write_program_metadata(struct gl_context *ctx,
for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) {
   struct gl_linked_shader *sh = prog->_LinkedShaders[i];
   if (sh) {
  write_shader_metadata(metadata, sh);
 
  /* Store nir shader info */
  blob_write_bytes(metadata, &sh->Program->info, sizeof(shader_info));
 
  if (sh->Program->info.name)
 blob_write_string(metadata, sh->Program->info.name);
+ else
+blob_write_string(metadata, "");
 
  if (sh->Program->info.label)
 blob_write_string(metadata, sh->Program->info.label);
+ else
+blob_write_string(metadata, "");
   }
}
 
write_xfb(metadata, prog);
 
write_uniform_remap_tables(metadata, prog);
 
write_atomic_buffers(metadata, prog);
 
write_buffer_blocks(metadata, prog);
-- 
2.13.4
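
The write/read symmetry this patch sets up can be sketched as follows. The
`demo_stream` type and both helpers are hypothetical and much simpler than
Mesa's blob API; they only show that once the writer always emits a string,
the reader can consume one unconditionally.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical minimal byte stream (Mesa's blob API is not used here). */
struct demo_stream {
   char buf[64];
   size_t pos;
};

/* Always emit a NUL-terminated string, substituting "" for NULL, so the
 * reader never needs a pointer to know whether a string follows.
 */
static void
demo_write_opt_string(struct demo_stream *s, const char *str)
{
   if (!str)
      str = "";

   size_t len = strlen(str) + 1;
   assert(s->pos + len <= sizeof(s->buf));
   memcpy(s->buf + s->pos, str, len);
   s->pos += len;
}

/* Unconditionally read the next string and advance past its terminator. */
static const char *
demo_read_string(struct demo_stream *s)
{
   const char *str = s->buf + s->pos;
   s->pos += strlen(str) + 1;
   return str;
}
```

One visible consequence of this scheme is that a name or label that was NULL
before caching comes back as an empty string rather than NULL.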



[Mesa-dev] [PATCH 3/8] glsl: stop adding pointers from glsl_struct_field to the cache

2017-08-13 Thread Timothy Arceri
This is so we always create reproducible cache entries. Consistency
is required for verification of any third party distributed shaders.
---
 src/compiler/glsl/shader_cache.cpp | 45 --
 1 file changed, 38 insertions(+), 7 deletions(-)

diff --git a/src/compiler/glsl/shader_cache.cpp b/src/compiler/glsl/shader_cache.cpp
index e004ed4f64..6c878dae37 100644
--- a/src/compiler/glsl/shader_cache.cpp
+++ b/src/compiler/glsl/shader_cache.cpp
@@ -68,20 +68,31 @@ extern "C" {
 }
 
 static void
 compile_shaders(struct gl_context *ctx, struct gl_shader_program *prog) {
for (unsigned i = 0; i < prog->NumShaders; i++) {
   _mesa_glsl_compile_shader(ctx, prog->Shaders[i], false, false, true);
}
 }
 
 static void
+get_struct_type_field_and_pointer_sizes(size_t *s_field_size,
+size_t *s_field_ptrs,
+unsigned num_fields)
+{
+   *s_field_size = sizeof(glsl_struct_field) * num_fields;
+   *s_field_ptrs =
+ sizeof(((glsl_struct_field *)0)->type) +
+ sizeof(((glsl_struct_field *)0)->name);
+}
+
+static void
 encode_type_to_blob(struct blob *blob, const glsl_type *type)
 {
uint32_t encoding;
 
if (!type) {
   blob_write_uint32(blob, 0);
   return;
}
 
switch (type->base_type) {
@@ -120,25 +131,33 @@ encode_type_to_blob(struct blob *blob, const glsl_type *type)
case GLSL_TYPE_ARRAY:
   blob_write_uint32(blob, (type->base_type) << 24);
   blob_write_uint32(blob, type->length);
   encode_type_to_blob(blob, type->fields.array);
   return;
case GLSL_TYPE_STRUCT:
case GLSL_TYPE_INTERFACE:
   blob_write_uint32(blob, (type->base_type) << 24);
   blob_write_string(blob, type->name);
   blob_write_uint32(blob, type->length);
-  blob_write_bytes(blob, type->fields.structure,
-   sizeof(glsl_struct_field) * type->length);
+
+  size_t s_field_size, s_field_ptrs;
+  get_struct_type_field_and_pointer_sizes(&s_field_size, &s_field_ptrs,
+  type->length);
+
   for (unsigned i = 0; i < type->length; i++) {
  encode_type_to_blob(blob, type->fields.structure[i].type);
  blob_write_string(blob, type->fields.structure[i].name);
+
+ /* Write the struct field skipping the pointers */
+ blob_write_bytes(blob,
+  ((char *)&type->fields.structure[i]) + s_field_ptrs,
+  s_field_size - s_field_ptrs);
   }
 
   if (type->is_interface()) {
  blob_write_uint32(blob, type->interface_packing);
  blob_write_uint32(blob, type->interface_row_major);
   }
   return;
case GLSL_TYPE_VOID:
case GLSL_TYPE_ERROR:
default:
@@ -185,36 +204,48 @@ decode_type_from_blob(struct blob_reader *blob)
   return glsl_type::atomic_uint_type;
case GLSL_TYPE_ARRAY: {
   unsigned length = blob_read_uint32(blob);
   return glsl_type::get_array_instance(decode_type_from_blob(blob),
length);
}
case GLSL_TYPE_STRUCT:
case GLSL_TYPE_INTERFACE: {
   char *name = blob_read_string(blob);
   unsigned num_fields = blob_read_uint32(blob);
-  glsl_struct_field *fields = (glsl_struct_field *)
- blob_read_bytes(blob, sizeof(glsl_struct_field) * num_fields);
+
+  size_t s_field_size, s_field_ptrs;
+  get_struct_type_field_and_pointer_sizes(&s_field_size, &s_field_ptrs,
+  num_fields);
+
+  glsl_struct_field *fields =
+ (glsl_struct_field *) malloc(s_field_size * num_fields);
   for (unsigned i = 0; i < num_fields; i++) {
  fields[i].type = decode_type_from_blob(blob);
  fields[i].name = blob_read_string(blob);
+
+ blob_copy_bytes(blob, ((uint8_t *) &fields[i]) + s_field_ptrs,
+ s_field_size - s_field_ptrs);
   }
 
+  const glsl_type *t;
   if (base_type == GLSL_TYPE_INTERFACE) {
  enum glsl_interface_packing packing =
 (glsl_interface_packing) blob_read_uint32(blob);
  bool row_major = blob_read_uint32(blob);
- return glsl_type::get_interface_instance(fields, num_fields,
-  packing, row_major, name);
+ t = glsl_type::get_interface_instance(fields, num_fields, packing,
+   row_major, name);
   } else {
- return glsl_type::get_record_instance(fields, num_fields, name);
+ t = glsl_type::get_record_instance(fields, num_fields, name);
   }
+
+  free(fields);
+  return t;
}
case GLSL_TYPE_VOID:
case GLSL_TYPE_ERROR:
default:
   assert(!"Cannot decode type!");
   return NULL;
}
 }
 
 static void
-- 
2.13.4


[Mesa-dev] [PATCH 4/8] glsl: add has_uniform_storage() helper to shader cache

2017-08-13 Thread Timothy Arceri
---
 src/compiler/glsl/shader_cache.cpp | 19 +--
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/src/compiler/glsl/shader_cache.cpp b/src/compiler/glsl/shader_cache.cpp
index 6c878dae37..2fbab86d30 100644
--- a/src/compiler/glsl/shader_cache.cpp
+++ b/src/compiler/glsl/shader_cache.cpp
@@ -589,20 +589,31 @@ read_xfb(struct blob_reader *metadata, struct gl_shader_program *shProg)
   ltf->Varyings[i].BufferIndex = blob_read_uint32(metadata);
   ltf->Varyings[i].Size = blob_read_uint32(metadata);
   ltf->Varyings[i].Offset = blob_read_uint32(metadata);
}
 
blob_copy_bytes(metadata, (uint8_t *) ltf->Buffers,
sizeof(struct gl_transform_feedback_buffer) *
   MAX_FEEDBACK_BUFFERS);
 }
 
+static bool
+has_uniform_storage(struct gl_shader_program *prog, unsigned idx)
+{
+   if (!prog->data->UniformStorage[idx].builtin &&
+   !prog->data->UniformStorage[idx].is_shader_storage &&
+   prog->data->UniformStorage[idx].block_index == -1)
+  return true;
+
+   return false;
+}
+
 static void
 write_uniforms(struct blob *metadata, struct gl_shader_program *prog)
 {
blob_write_uint32(metadata, prog->SamplersValidated);
blob_write_uint32(metadata, prog->data->NumUniformStorage);
blob_write_uint32(metadata, prog->data->NumUniformDataSlots);
 
for (unsigned i = 0; i < prog->data->NumUniformStorage; i++) {
   encode_type_to_blob(metadata, prog->data->UniformStorage[i].type);
   blob_write_uint32(metadata, prog->data->UniformStorage[i].array_elements);
@@ -631,23 +642,21 @@ write_uniforms(struct blob *metadata, struct gl_shader_program *prog)
sizeof(prog->data->UniformStorage[i].opaque));
}
 
/* Here we cache all uniform values. We do this to retain values for
 * uniforms with initialisers and also hidden uniforms that may be lowered
 * constant arrays. We could possibly just store the values we need but for
 * now we just store everything.
 */
blob_write_uint32(metadata, prog->data->NumHiddenUniforms);
for (unsigned i = 0; i < prog->data->NumUniformStorage; i++) {
-  if (!prog->data->UniformStorage[i].builtin &&
-  !prog->data->UniformStorage[i].is_shader_storage &&
-  prog->data->UniformStorage[i].block_index == -1) {
+  if (has_uniform_storage(prog, i)) {
  unsigned vec_size =
 prog->data->UniformStorage[i].type->component_slots() *
 MAX2(prog->data->UniformStorage[i].array_elements, 1);
  blob_write_bytes(metadata, prog->data->UniformStorage[i].storage,
   sizeof(union gl_constant_value) * vec_size);
   }
}
 }
 
 static void
@@ -693,23 +702,21 @@ read_uniforms(struct blob_reader *metadata, struct gl_shader_program *prog)
   prog->UniformHash->put(i, uniforms[i].name);
 
   memcpy(uniforms[i].opaque,
  blob_read_bytes(metadata, sizeof(uniforms[i].opaque)),
  sizeof(uniforms[i].opaque));
}
 
/* Restore uniform values. */
prog->data->NumHiddenUniforms = blob_read_uint32(metadata);
for (unsigned i = 0; i < prog->data->NumUniformStorage; i++) {
-  if (!prog->data->UniformStorage[i].builtin &&
-  !prog->data->UniformStorage[i].is_shader_storage &&
-  prog->data->UniformStorage[i].block_index == -1) {
+  if (has_uniform_storage(prog, i)) {
  unsigned vec_size =
 prog->data->UniformStorage[i].type->component_slots() *
 MAX2(prog->data->UniformStorage[i].array_elements, 1);
  blob_copy_bytes(metadata,
  (uint8_t *) prog->data->UniformStorage[i].storage,
  sizeof(union gl_constant_value) * vec_size);
 
 assert(vec_size + prog->data->UniformStorage[i].storage <=
data +  prog->data->NumUniformDataSlots);
   }
-- 
2.13.4



Re: [Mesa-dev] [PATCH 2/2] st/mesa: fix CTS regression caused by fcbb93e86024

2017-08-13 Thread Timothy Arceri



On 12/08/17 06:32, Emil Velikov wrote:
> On 1 August 2017 at 08:35, Timothy Arceri  wrote:
>> When generating the storage offset for struct members we need
>> to skip opaque types as they no longer have backing storage.
>>
>> Fixes: fcbb93e86024 ("mesa: stop assigning unused storage for non-bindless opaque types")
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101983
>
> Tim, did you go with another solution, or are the patches waiting for review?
> I was about to pick the revert for stable, but would need a reference
> for the fix in master.

Still waiting on review. I'll resend today with your previous comments
about wording addressed.

>> ---
>>   src/mesa/Makefile.sources  |   2 +
>>   src/mesa/state_tracker/st_glsl_to_tgsi.cpp |  16 -
>>   src/mesa/state_tracker/st_glsl_types.cpp   | 105 +
>>   src/mesa/state_tracker/st_glsl_types.h |  44
>
> I think the st_glsl_types.* are revert conflicts and should not be needed.
>
> If that's correct, this series is overall smaller and preferable for
> stable, since it does not diverge from master.
> It's your call though.

I believe the functions I add back with st_glsl_types.* are slightly
different from the ones that were there before. However, Samuel would
rather his patch go into stable anyway, and I'm fine with that too.

> Thanks
> Emil




Re: [Mesa-dev] [PATCH] radeonsi: disable CE by default

2017-08-13 Thread Christian König

Acked-by: Christian König 

On 13.08.2017 at 19:50, Bas Nieuwenhuizen wrote:

Reviewed-by: Bas Nieuwenhuizen 

On Sun, Aug 13, 2017, at 19:27, Marek Olšák wrote:

From: Marek Olšák 

It makes performance worse by a very small (hard to measure) amount.
We've done extensive profiling of this feature internally.

Cc: 17.1 17.2 
---
  src/gallium/drivers/radeon/r600_pipe_common.c |  1 +
  src/gallium/drivers/radeon/r600_pipe_common.h |  4 ++--
  src/gallium/drivers/radeonsi/si_pipe.c| 24
  ++--
  3 files changed, 21 insertions(+), 8 deletions(-)

diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c
b/src/gallium/drivers/radeon/r600_pipe_common.c
index 0038c9a..cb4b7a4 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.c
+++ b/src/gallium/drivers/radeon/r600_pipe_common.c
@@ -768,20 +768,21 @@ static const struct debug_named_value
common_debug_options[] = {
{ "switch_on_eop", DBG_SWITCH_ON_EOP, "Program WD/IA to switch on 
end-of-packet." },
{ "forcedma", DBG_FORCE_DMA, "Use asynchronous DMA for all operations when 
possible." },
{ "precompile", DBG_PRECOMPILE, "Compile one shader variant at shader 
creation." },
{ "nowc", DBG_NO_WC, "Disable GTT write combining" },
{ "check_vm", DBG_CHECK_VM, "Check VM faults and dump debug info." },
{ "nodcc", DBG_NO_DCC, "Disable DCC." },
{ "nodccclear", DBG_NO_DCC_CLEAR, "Disable DCC fast clear." },
{ "norbplus", DBG_NO_RB_PLUS, "Disable RB+." },
{ "sisched", DBG_SI_SCHED, "Enable LLVM SI Machine Instruction 
Scheduler." },
{ "mono", DBG_MONOLITHIC_SHADERS, "Use old-style monolithic shaders compiled 
on demand" },
+   { "ce", DBG_CE, "Force enable the constant engine" },
{ "noce", DBG_NO_CE, "Disable the constant engine"},
{ "unsafemath", DBG_UNSAFE_MATH, "Enable unsafe math shader 
optimizations" },
{ "nodccfb", DBG_NO_DCC_FB, "Disable separate DCC on the main 
framebuffer" },
  
  	DEBUG_NAMED_VALUE_END /* must be last */

  };
  
  static const char* r600_get_vendor(struct pipe_screen* pscreen)

  {
return "X.Org";
diff --git a/src/gallium/drivers/radeon/r600_pipe_common.h
b/src/gallium/drivers/radeon/r600_pipe_common.h
index 67b3c87..14bc63e 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.h
+++ b/src/gallium/drivers/radeon/r600_pipe_common.h
@@ -58,26 +58,26 @@
  #define R600_CONTEXT_STREAMOUT_FLUSH(1u << 0)
  /* Pipeline & streamout query controls. */
  #define R600_CONTEXT_START_PIPELINE_STATS   (1u << 1)
  #define R600_CONTEXT_STOP_PIPELINE_STATS(1u << 2)
  #define R600_CONTEXT_PRIVATE_FLAG   (1u << 3)
  
  /* special primitive types */

  #define R600_PRIM_RECTANGLE_LISTPIPE_PRIM_MAX
  
  /* Debug flags. */

-/* logging */
+/* logging and features */
  #define DBG_TEX (1 << 0)
  #define DBG_NIR (1 << 1)
  #define DBG_COMPUTE (1 << 2)
  #define DBG_VM  (1 << 3)
-/* gap - reuse */
+#define DBG_CE (1 << 4)
  /* shader logging */
  #define DBG_FS  (1 << 5)
  #define DBG_VS  (1 << 6)
  #define DBG_GS  (1 << 7)
  #define DBG_PS  (1 << 8)
  #define DBG_CS  (1 << 9)
  #define DBG_TCS (1 << 10)
  #define DBG_TES (1 << 11)
  #define DBG_NO_IR   (1 << 12)
  #define DBG_NO_TGSI (1 << 13)
diff --git a/src/gallium/drivers/radeonsi/si_pipe.c
b/src/gallium/drivers/radeonsi/si_pipe.c
index 2c65cc8..cac1d01 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.c
+++ b/src/gallium/drivers/radeonsi/si_pipe.c
@@ -194,26 +194,38 @@ static struct pipe_context
*si_create_context(struct pipe_screen *screen,
sctx->b.b.create_video_codec = si_uvd_create_decoder;
sctx->b.b.create_video_buffer = si_video_buffer_create;
} else {
sctx->b.b.create_video_codec = vl_create_decoder;
sctx->b.b.create_video_buffer = vl_video_buffer_create;
}
  
  	sctx->b.gfx.cs = ws->cs_create(sctx->b.ctx, RING_GFX,

   si_context_gfx_flush, sctx);
  
-   /* SI + AMDGPU + CE = GPU hang */

-   if (!(sscreen->b.debug_flags & DBG_NO_CE) && ws->cs_add_const_ib
&&
-   sscreen->b.chip_class != SI &&
-   /* These can't use CE due to a power gating bug in the
kernel. */
-   sscreen->b.family != CHIP_CARRIZO &&
-   sscreen->b.family != CHIP_STONEY) {
+   bool enable_ce = sscreen->b.chip_class != SI && /* SI hangs */
+/* These can't use CE due to a power gating bug
in the kernel. */
+sscreen->b.family != CHIP_CARRIZO &&
+sscreen->b.family != CHIP_STONEY;
+
+   /* CE is 

[Mesa-dev] [PATCH] radeonsi: disable CE by default

2017-08-13 Thread Marek Olšák
From: Marek Olšák 

It makes performance worse by a very small (hard to measure) amount.
We've done extensive profiling of this feature internally.

Cc: 17.1 17.2 
---
 src/gallium/drivers/radeon/r600_pipe_common.c |  1 +
 src/gallium/drivers/radeon/r600_pipe_common.h |  4 ++--
 src/gallium/drivers/radeonsi/si_pipe.c| 24 ++--
 3 files changed, 21 insertions(+), 8 deletions(-)

diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c b/src/gallium/drivers/radeon/r600_pipe_common.c
index 0038c9a..cb4b7a4 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.c
+++ b/src/gallium/drivers/radeon/r600_pipe_common.c
@@ -768,20 +768,21 @@ static const struct debug_named_value common_debug_options[] = {
{ "switch_on_eop", DBG_SWITCH_ON_EOP, "Program WD/IA to switch on end-of-packet." },
{ "forcedma", DBG_FORCE_DMA, "Use asynchronous DMA for all operations when possible." },
{ "precompile", DBG_PRECOMPILE, "Compile one shader variant at shader creation." },
{ "nowc", DBG_NO_WC, "Disable GTT write combining" },
{ "check_vm", DBG_CHECK_VM, "Check VM faults and dump debug info." },
{ "nodcc", DBG_NO_DCC, "Disable DCC." },
{ "nodccclear", DBG_NO_DCC_CLEAR, "Disable DCC fast clear." },
{ "norbplus", DBG_NO_RB_PLUS, "Disable RB+." },
{ "sisched", DBG_SI_SCHED, "Enable LLVM SI Machine Instruction Scheduler." },
{ "mono", DBG_MONOLITHIC_SHADERS, "Use old-style monolithic shaders compiled on demand" },
+   { "ce", DBG_CE, "Force enable the constant engine" },
{ "noce", DBG_NO_CE, "Disable the constant engine"},
{ "unsafemath", DBG_UNSAFE_MATH, "Enable unsafe math shader optimizations" },
{ "nodccfb", DBG_NO_DCC_FB, "Disable separate DCC on the main framebuffer" },
 
DEBUG_NAMED_VALUE_END /* must be last */
 };
 
 static const char* r600_get_vendor(struct pipe_screen* pscreen)
 {
return "X.Org";
diff --git a/src/gallium/drivers/radeon/r600_pipe_common.h b/src/gallium/drivers/radeon/r600_pipe_common.h
index 67b3c87..14bc63e 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.h
+++ b/src/gallium/drivers/radeon/r600_pipe_common.h
@@ -58,26 +58,26 @@
 #define R600_CONTEXT_STREAMOUT_FLUSH   (1u << 0)
 /* Pipeline & streamout query controls. */
 #define R600_CONTEXT_START_PIPELINE_STATS  (1u << 1)
 #define R600_CONTEXT_STOP_PIPELINE_STATS   (1u << 2)
 #define R600_CONTEXT_PRIVATE_FLAG  (1u << 3)
 
 /* special primitive types */
 #define R600_PRIM_RECTANGLE_LIST   PIPE_PRIM_MAX
 
 /* Debug flags. */
-/* logging */
+/* logging and features */
 #define DBG_TEX(1 << 0)
 #define DBG_NIR(1 << 1)
 #define DBG_COMPUTE(1 << 2)
 #define DBG_VM (1 << 3)
-/* gap - reuse */
+#define DBG_CE (1 << 4)
 /* shader logging */
 #define DBG_FS (1 << 5)
 #define DBG_VS (1 << 6)
 #define DBG_GS (1 << 7)
 #define DBG_PS (1 << 8)
 #define DBG_CS (1 << 9)
 #define DBG_TCS(1 << 10)
 #define DBG_TES(1 << 11)
 #define DBG_NO_IR  (1 << 12)
 #define DBG_NO_TGSI(1 << 13)
diff --git a/src/gallium/drivers/radeonsi/si_pipe.c b/src/gallium/drivers/radeonsi/si_pipe.c
index 2c65cc8..cac1d01 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.c
+++ b/src/gallium/drivers/radeonsi/si_pipe.c
@@ -194,26 +194,38 @@ static struct pipe_context *si_create_context(struct pipe_screen *screen,
sctx->b.b.create_video_codec = si_uvd_create_decoder;
sctx->b.b.create_video_buffer = si_video_buffer_create;
} else {
sctx->b.b.create_video_codec = vl_create_decoder;
sctx->b.b.create_video_buffer = vl_video_buffer_create;
}
 
sctx->b.gfx.cs = ws->cs_create(sctx->b.ctx, RING_GFX,
   si_context_gfx_flush, sctx);
 
-   /* SI + AMDGPU + CE = GPU hang */
-   if (!(sscreen->b.debug_flags & DBG_NO_CE) && ws->cs_add_const_ib &&
-   sscreen->b.chip_class != SI &&
-   /* These can't use CE due to a power gating bug in the kernel. */
-   sscreen->b.family != CHIP_CARRIZO &&
-   sscreen->b.family != CHIP_STONEY) {
+   bool enable_ce = sscreen->b.chip_class != SI && /* SI hangs */
+/* These can't use CE due to a power gating bug in the kernel. */
+sscreen->b.family != CHIP_CARRIZO &&
+sscreen->b.family != CHIP_STONEY;
+
+   /* CE is currently disabled by default, because it makes s_load latency
+* worse, because CE IB doesn't run in lockstep with DE.
+* Remove this line after that performance issue has been resolved.
+*/
+   enable_ce = 

Re: [Mesa-dev] [PATCH v2 2/4] st/omx_tizonia: Add --enable-omx-tizonia flag and build files

2017-08-13 Thread Gurkirpal Singh
On Sun, Aug 13, 2017 at 8:47 AM, Leo Liu  wrote:

> Where is the patch 1?

Sorry, the patches got messed up somehow while sending: I could only see
two patches on mail-archive but three on patchwork.
I tried twice with the same result.
For the first one, I got a mail saying that it was too large and had been
put aside for moderator approval.
The only changes in it were renaming the st/omx directory to
st/omx_bellagio (which is why it became large)
and renaming bits in configure.ac and the Makefiles.

>
>
>
> On 08/12/2017 12:07 PM, Gurkirpal Singh wrote:
>
>> Coexist with --enable-omx so they can be built independently
>> Detect tizonia package config file
>> Generate libomxtiz_mesa.so and install it to libtizcore.pc::pluginsdir
>> Only compile empty source (target.c) for now.
>>
>> v2: Show error message when --enable-omx is used (Christian)
>>  Use single PKG_CHECK_MODULES for omx_tizonia checks (Emil)
>>  Use spaces instead of tabs
>>  Add checks around omx-tizonia
>>
>> GSoC Project link: https://summerofcode.withgoogle.com/projects/#4737166321123328
>>
>> Signed-off-by: Gurkirpal Singh 
>> Reviewed-and-Tested-by: Julien Isorce 
>> ---
>>   configure.ac| 40 ++-
>>   src/gallium/Makefile.am |  4 ++
>>   src/gallium/targets/omx-tizonia/Makefile.am | 77
>> +
>>   src/gallium/targets/omx-tizonia/omx.sym | 11 +
>>   src/gallium/targets/omx-tizonia/target.c|  2 +
>>   5 files changed, 132 insertions(+), 2 deletions(-)
>>   create mode 100644 src/gallium/targets/omx-tizonia/Makefile.am
>>   create mode 100644 src/gallium/targets/omx-tizonia/omx.sym
>>   create mode 100644 src/gallium/targets/omx-tizonia/target.c
>>
>> diff --git a/configure.ac b/configure.ac
>> index 38af96a..5669695 100644
>> --- a/configure.ac
>> +++ b/configure.ac
>> @@ -85,6 +85,7 @@ dnl Versions for external dependencies
>>   DRI2PROTO_REQUIRED=2.8
>>   GLPROTO_REQUIRED=1.4.14
>>   LIBOMXIL_BELLAGIO_REQUIRED=0.0
>> +LIBOMXIL_TIZONIA_REQUIRED=0.9.0
>>   LIBVA_REQUIRED=0.38.0
>>   VDPAU_REQUIRED=1.1
>>   WAYLAND_REQUIRED=1.11
>> @@ -1216,14 +1217,19 @@ AC_ARG_ENABLE([vdpau],
>>  [enable_vdpau=auto])
>>   AC_ARG_ENABLE([omx],
>>  [AS_HELP_STRING([--enable-omx],
>> - [DEPRECATED: Use --enable-omx-bellagio instead @<:@default=auto@
>> :>@])],
>> -   [AC_MSG_ERROR([--enable-omx is deprecated. Use --enable-omx-bellagio
>> instead.])],
>>
>
> Is this in patch 1?

Yes, it is so.

>
>
> + [DEPRECATED: Use --enable-omx-bellagio or --enable-omx-tizonia
>> instead @<:@default=auto@:>@])],
>> +   [AC_MSG_ERROR([--enable-omx is deprecated. Use --enable-omx-bellagio
>> or --enable-omx-tizonia instead.])],
>>  [])
>>   AC_ARG_ENABLE([omx-bellagio],
>>  [AS_HELP_STRING([--enable-omx-bellagio],
>>[enable OpenMAX Bellagio library @<:@default=disabled@:>@])],
>>  [enable_omx_bellagio="$enableval"],
>>  [enable_omx_bellagio=no])
>> +AC_ARG_ENABLE([omx-tizonia],
>> +   [AS_HELP_STRING([--enable-omx-tizonia],
>> + [enable OpenMAX Tizonia library @<:@default=disabled@:>@])],
>> +   [enable_omx_tizonia="$enableval"],
>> +   [enable_omx_tizonia=no])
>>   AC_ARG_ENABLE([va],
>>  [AS_HELP_STRING([--enable-va],
>>[enable va library @<:@default=auto@:>@])],
>> @@ -1275,6 +1281,7 @@ if test "x$enable_opengl" = xno -a \
>>   "x$enable_xvmc" = xno -a \
>>   "x$enable_vdpau" = xno -a \
>>   "x$enable_omx_bellagio" = xno -a \
>> +"x$enable_omx_tizonia" = xno -a \
>>   "x$enable_va" = xno -a \
>>   "x$enable_opencl" = xno; then
>>   AC_MSG_ERROR([at least one API should be enabled])
>> @@ -2121,6 +2128,10 @@ if test -n "$with_gallium_drivers" -a
>> "x$with_gallium_drivers" != xswrast; then
>>   PKG_CHECK_EXISTS([libomxil-bellagio >=
>> $LIBOMXIL_BELLAGIO_REQUIRED], [enable_omx_bellagio=yes],
>> [enable_omx_bellagio=no])
>>   fi
>>   +if test "x$enable_omx_tizonia" = xauto -a "x$have_omx_platform" =
>> xyes; then
>> +   PKG_CHECK_EXISTS([libtizonia >= $LIBOMXIL_TIZONIA_REQUIRED],
>> [enable_omx_tizonia=yes], [enable_omx_tizonia=no])
>> +fi
>> +
>>   if test "x$enable_va" = xauto -a "x$have_va_platform" = xyes; then
>>   PKG_CHECK_EXISTS([libva >= $LIBVA_REQUIRED], [enable_va=yes],
>> [enable_va=no])
>>   fi
>> @@ -2130,6 +2141,7 @@ if test "x$enable_dri" = xyes -o \
>>   "x$enable_xvmc" = xyes -o \
>>   "x$enable_vdpau" = xyes -o \
>>   "x$enable_omx_bellagio" = xyes -o \
>> +"x$enable_omx_tizonia" = xyes -o \
>>   "x$enable_va" = xyes; then
>>   need_gallium_vl=yes
>>   fi
>> @@ -2138,6 +2150,7 @@ AM_CONDITIONAL(NEED_GALLIUM_VL, test
>> "x$need_gallium_vl" = xyes)
>>   if test "x$enable_xvmc" = xyes -o \
>>   "x$enable_vdpau" = xyes -o \
>>   "x$enable_omx_bellagio" = xyes -o 

Re: [Mesa-dev] [PATCH v3 4/8] anv: Implement support for exporting semaphores as FENCE_FD

2017-08-13 Thread Lionel Landwerlin

On 04/08/17 18:24, Jason Ekstrand wrote:

---
  src/intel/vulkan/anv_batch_chain.c | 57 +--
  src/intel/vulkan/anv_device.c  |  1 +
  src/intel/vulkan/anv_gem.c | 36 
  src/intel/vulkan/anv_private.h | 23 +
  src/intel/vulkan/anv_queue.c   | 69 --
  5 files changed, 175 insertions(+), 11 deletions(-)

diff --git a/src/intel/vulkan/anv_batch_chain.c b/src/intel/vulkan/anv_batch_chain.c
index 65fe366..7a84bbd 100644
--- a/src/intel/vulkan/anv_batch_chain.c
+++ b/src/intel/vulkan/anv_batch_chain.c
@@ -1416,11 +1416,13 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
 struct anv_execbuf execbuf;
 anv_execbuf_init(&execbuf);
  
+   int in_fence = -1;

 VkResult result = VK_SUCCESS;
 for (uint32_t i = 0; i < num_in_semaphores; i++) {
ANV_FROM_HANDLE(anv_semaphore, semaphore, in_semaphores[i]);
-  assert(semaphore->temporary.type == ANV_SEMAPHORE_TYPE_NONE);
-  struct anv_semaphore_impl *impl = &semaphore->permanent;
+  struct anv_semaphore_impl *impl =
+ semaphore->temporary.type != ANV_SEMAPHORE_TYPE_NONE ?
+ &semaphore->temporary : &semaphore->permanent;


I know you're not enabling this until patch 8, but for consistency, 
shouldn't this be part of patch 1?
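
To make the question concrete, here is a minimal standalone sketch of the
selection rule in the hunk above: a temporary payload, when present, takes
precedence over the permanent one. The type and function names (sem_type,
sem_impl, sem, sem_active_impl) are simplified stand-ins invented for
illustration, not the real anv definitions.

```c
/* Simplified stand-ins for the anv types; NOT the real definitions. */
enum sem_type { SEM_TYPE_NONE = 0, SEM_TYPE_BO, SEM_TYPE_SYNC_FILE };

struct sem_impl {
   enum sem_type type;
   int fd;
};

struct sem {
   struct sem_impl permanent;
   struct sem_impl temporary;
};

/* A temporary payload, if one has been imported, shadows the permanent
 * one; otherwise the permanent payload is used.
 */
static struct sem_impl *
sem_active_impl(struct sem *s)
{
   return s->temporary.type != SEM_TYPE_NONE ? &s->temporary
                                             : &s->permanent;
}
```

With only a permanent payload set, sem_active_impl() returns &s->permanent;
once temporary.type is anything other than SEM_TYPE_NONE, it returns
&s->temporary instead.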


  
switch (impl->type) {

case ANV_SEMAPHORE_TYPE_BO:
@@ -1429,11 +1431,29 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
   if (result != VK_SUCCESS)
  return result;
   break;
+
+  case ANV_SEMAPHORE_TYPE_SYNC_FILE:
+ if (in_fence == -1) {
+in_fence = impl->fd;
+ } else {
+int merge = anv_gem_sync_file_merge(device, in_fence, impl->fd);
+if (merge == -1)
+   return vk_error(VK_ERROR_INVALID_EXTERNAL_HANDLE_KHR);
+
+close(impl->fd);
+close(in_fence);
+in_fence = merge;
+ }
+
+ impl->fd = -1;
+ break;
+
default:
   break;
}
 }
  
+   bool need_out_fence = false;

 for (uint32_t i = 0; i < num_out_semaphores; i++) {
ANV_FROM_HANDLE(anv_semaphore, semaphore, out_semaphores[i]);
  
@@ -1459,6 +1479,11 @@ anv_cmd_buffer_execbuf(struct anv_device *device,

   if (result != VK_SUCCESS)
  return result;
   break;
+
+  case ANV_SEMAPHORE_TYPE_SYNC_FILE:
+ need_out_fence = true;
+ break;
+
default:
   break;
}
@@ -1472,9 +1497,19 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
setup_empty_execbuf(&execbuf, device);
 }
  
+   if (in_fence != -1) {

+  execbuf.execbuf.flags |= I915_EXEC_FENCE_IN;
+  execbuf.execbuf.rsvd2 |= (uint32_t)in_fence;
+   }
+
+   if (need_out_fence)
+  execbuf.execbuf.flags |= I915_EXEC_FENCE_OUT;
  
 result = anv_device_execbuf(device, &execbuf.execbuf, execbuf.bos);
  
+   /* Execbuf does not consume the in_fence.  It's our job to close it. */

+   close(in_fence);
+
 for (uint32_t i = 0; i < num_in_semaphores; i++) {
ANV_FROM_HANDLE(anv_semaphore, semaphore, in_semaphores[i]);
/* From the Vulkan 1.0.53 spec:
@@ -1489,6 +1524,24 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
anv_semaphore_reset_temporary(device, semaphore);
 }
  
+   if (result == VK_SUCCESS && need_out_fence) {

+  int out_fence = execbuf.execbuf.rsvd2 >> 32;
+  for (uint32_t i = 0; i < num_out_semaphores; i++) {
+ ANV_FROM_HANDLE(anv_semaphore, semaphore, out_semaphores[i]);
+ /* Out fences can't have temporary state because that would imply
+  * that we imported a sync file and are trying to signal it.
+  */
+ assert(semaphore->temporary.type == ANV_SEMAPHORE_TYPE_NONE);
+ struct anv_semaphore_impl *impl = &semaphore->permanent;
+
+ if (impl->type == ANV_SEMAPHORE_TYPE_SYNC_FILE) {
+assert(impl->fd == -1);
+impl->fd = dup(out_fence);
+ }
+  }
+  close(out_fence);
+   }
+
 anv_execbuf_finish(&execbuf, &device->alloc);
  
 return result;

diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index e82e1e9..3c5f78c 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -337,6 +337,7 @@ anv_physical_device_init(struct anv_physical_device *device,
goto fail;
  
 device->has_exec_async = anv_gem_get_param(fd, I915_PARAM_HAS_EXEC_ASYNC);

+   device->has_exec_fence = anv_gem_get_param(fd, I915_PARAM_HAS_EXEC_FENCE);
  
 bool swizzled = anv_gem_get_bit6_swizzle(fd, I915_TILING_X);
  
diff --git a/src/intel/vulkan/anv_gem.c b/src/intel/vulkan/anv_gem.c

index 36692f5..5b68e9b 100644
--- a/src/intel/vulkan/anv_gem.c
+++ b/src/intel/vulkan/anv_gem.c
@@ -22,6 +22,7 @@
   */
  
  #include 

+#include 
  #include 
  #include 
  #include 
@@ -400,3 +401,38 @@ anv_gem_fd_to_handle(struct anv_device *device, int fd)
  
 return args.handle;

  }
+
+#ifndef SYNC_IOC_MAGIC

Re: [Mesa-dev] [PATCH v3 2/8] anv: Submit a dummy batch when only semaphores are provided.

2017-08-13 Thread Lionel Landwerlin

On 04/08/17 18:24, Jason Ekstrand wrote:

Vulkan allows you to do a submit whose only job is to wait on and
trigger semaphores.  The easiest way for us to support that right
now is to insert a dummy execbuf.
---
  src/intel/vulkan/anv_batch_chain.c | 28 +---
  src/intel/vulkan/anv_device.c  | 30 ++
  src/intel/vulkan/anv_private.h |  1 +
  src/intel/vulkan/anv_queue.c   | 17 +
  4 files changed, 73 insertions(+), 3 deletions(-)

diff --git a/src/intel/vulkan/anv_batch_chain.c b/src/intel/vulkan/anv_batch_chain.c
index 94e7a7d..65fe366 100644
--- a/src/intel/vulkan/anv_batch_chain.c
+++ b/src/intel/vulkan/anv_batch_chain.c
@@ -1388,6 +1388,23 @@ setup_execbuf_for_cmd_buffer(struct anv_execbuf *execbuf,
 return VK_SUCCESS;
  }
  
+static void

+setup_empty_execbuf(struct anv_execbuf *execbuf, struct anv_device *device)
+{
+   anv_execbuf_add_bo(execbuf, &device->trivial_batch_bo, NULL, 0,
+  &device->alloc);
+
+   execbuf->execbuf = (struct drm_i915_gem_execbuffer2) {
+  .buffers_ptr = (uintptr_t) execbuf->objects,
+  .buffer_count = execbuf->bo_count,
+  .batch_start_offset = 0,
+  .batch_len = 8, /* GEN8_MI_BATCH_BUFFER_END and NOOP */


nit: since you're using GEN7_MI_BATCH_BUFFER_END below, you could use it 
here too.
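
For reference, the 8 in batch_len is just two dwords. A minimal sketch of
what the trivial batch contains, using the raw MI encodings (opcode in bits
28:23) rather than the genxml packing helpers; emit_trivial_batch is a
hypothetical helper for illustration, not something in the patch. The
trailing NOOP is presumably there so the length stays a multiple of 8 bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Raw MI command encodings: the opcode sits in bits 28:23 of the dword. */
#define MI_NOOP              0u
#define MI_BATCH_BUFFER_END  (0x0Au << 23)

/* Hypothetical helper: writes the trivial batch into dst and returns its
 * length in bytes, or 0 if dst cannot hold two dwords.
 */
static size_t
emit_trivial_batch(uint32_t *dst, size_t max_dwords)
{
   if (max_dwords < 2)
      return 0;

   dst[0] = MI_BATCH_BUFFER_END;
   dst[1] = MI_NOOP;

   return 2 * sizeof(uint32_t);
}
```

Given a buffer of at least two dwords, the helper returns 8, which matches
the hard-coded .batch_len above. Neither encoding differs between the GEN7
and GEN8 names, so the comment could use either.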



+  .flags = I915_EXEC_HANDLE_LUT | I915_EXEC_RENDER,
+  .rsvd1 = device->context_id,
+  .rsvd2 = 0,
+   };
+}
+
  VkResult
  anv_cmd_buffer_execbuf(struct anv_device *device,
 struct anv_cmd_buffer *cmd_buffer,
@@ -1447,9 +1464,14 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
}
 }
  
-   result = setup_execbuf_for_cmd_buffer(&execbuf, cmd_buffer);

-   if (result != VK_SUCCESS)
-  return result;
+   if (cmd_buffer) {
+  result = setup_execbuf_for_cmd_buffer(&execbuf, cmd_buffer);
+  if (result != VK_SUCCESS)
+ return result;
+   } else {
+  setup_empty_execbuf(&execbuf, device);
+   }
+
  
 result = anv_device_execbuf(device, &execbuf.execbuf, execbuf.bos);
  
diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c

index 793e519..e82e1e9 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -1014,6 +1014,32 @@ anv_device_init_border_colors(struct anv_device *device)
  border_colors);
  }
  
+static void

+anv_device_init_trivial_batch(struct anv_device *device)
+{
+   anv_bo_init_new(&device->trivial_batch_bo, device, 4096);
+
+   if (device->instance->physicalDevice.has_exec_async)
+  device->trivial_batch_bo.flags |= EXEC_OBJECT_ASYNC;
+
+   void *map = anv_gem_mmap(device, device->trivial_batch_bo.gem_handle,
+0, 4096, 0);
+
+   struct anv_batch batch = {
+  .start = map,
+  .next = map,
+  .end = map + 4096,
+   };
+
+   anv_batch_emit(&batch, GEN7_MI_BATCH_BUFFER_END, bbe);
+   anv_batch_emit(&batch, GEN7_MI_NOOP, noop);
+
+   if (!device->info.has_llc)
+  gen_clflush_range(map, batch.next - map);
+
+   anv_gem_munmap(map, device->trivial_batch_bo.size);
+}
+
  VkResult anv_CreateDevice(
  VkPhysicalDevicephysicalDevice,
  const VkDeviceCreateInfo*   pCreateInfo,
@@ -1131,6 +1157,8 @@ VkResult anv_CreateDevice(
 if (result != VK_SUCCESS)
goto fail_surface_state_pool;
  
+   anv_device_init_trivial_batch(device);

+
 anv_scratch_pool_init(device, &device->scratch_pool);
  
 anv_queue_init(device, &device->queue);

@@ -1220,6 +1248,8 @@ void anv_DestroyDevice(
 anv_gem_munmap(device->workaround_bo.map, device->workaround_bo.size);
 anv_gem_close(device, device->workaround_bo.gem_handle);
  
+   anv_gem_close(device, device->trivial_batch_bo.gem_handle);

+
 anv_state_pool_finish(&device->surface_state_pool);
 anv_state_pool_finish(&device->instruction_state_pool);
 anv_state_pool_finish(&device->dynamic_state_pool);
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index b599db3..bc67bb6 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -745,6 +745,7 @@ struct anv_device {
  struct anv_state_pool   surface_state_pool;
  
  struct anv_bo   workaround_bo;

+struct anv_bo   trivial_batch_bo;
  
  struct anv_pipeline_cache   blorp_shader_cache;

  struct blorp_contextblorp;
diff --git a/src/intel/vulkan/anv_queue.c b/src/intel/vulkan/anv_queue.c
index 9a0789c..039dfd7 100644
--- a/src/intel/vulkan/anv_queue.c
+++ b/src/intel/vulkan/anv_queue.c
@@ -159,6 +159,23 @@ VkResult anv_QueueSubmit(
 pthread_mutex_lock(&device->mutex);
  
 for (uint32_t i = 0; i < submitCount; i++) {

+  if (pSubmits[i].commandBufferCount == 0) {
+ /* If we don't have any command buffers, we need to submit a dummy
+  * batch to give GEM something to wait on.  We could, 

Re: [Mesa-dev] [PATCH v3 0/8] anv: Implement VK_KHR_external_semaphore

2017-08-13 Thread Lionel Landwerlin

This series is:

Reviewed-by: Lionel Landwerlin 

I have a couple of nits, feel free to ignore.

Thanks!

On 04/08/17 18:24, Jason Ekstrand wrote:

This series is a quick re-spin of the v2 sent yesterday to address review
feedback from Chris.  In particular, we now set EXEC_ASYNC on the trivial
batch and I deleted the syncobj cache.  Somehow, when I was working on this
yesterday, I got it into my head that the kernel deduplicates syncobj
handles and that we needed a cache to handle them correctly.  This is not
true.  Every call to SYNCOBJ_FD_TO_HANDLE produces a new handle and the
kernel does the reference counting for us.
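
The handle behaviour described above can be pictured with a small toy model.
This is only a sketch of the semantics as stated in this mail (each import
yields a fresh handle, each handle holds one reference on the underlying
object); the toy_* names are invented for illustration and are not the
kernel's or libdrm's data structures or API.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy model only: NOT the kernel's or libdrm's data structures. */
#define TOY_MAX_HANDLES 8

struct toy_syncobj {
   int refcount;
};

struct toy_file {
   /* Per-file handle table; index 0 is reserved as "invalid handle". */
   struct toy_syncobj *handles[TOY_MAX_HANDLES];
};

/* Models an import: always hands out a fresh handle, even if the same
 * object is already in the table, and takes one reference for it.
 * Returns 0 if the table is full.
 */
static uint32_t
toy_fd_to_handle(struct toy_file *f, struct toy_syncobj *obj)
{
   for (uint32_t h = 1; h < TOY_MAX_HANDLES; h++) {
      if (f->handles[h] == NULL) {
         f->handles[h] = obj;
         obj->refcount++;
         return h;
      }
   }
   return 0;
}

/* Models destroying a handle: drops exactly the reference it held. */
static void
toy_handle_destroy(struct toy_file *f, uint32_t h)
{
   if (h == 0 || h >= TOY_MAX_HANDLES || f->handles[h] == NULL)
      return;

   f->handles[h]->refcount--;
   f->handles[h] = NULL;
}
```

In this model, importing the same object twice gives two distinct handles,
and destroying one of them leaves the other valid. That is why a
driver-side cache keyed on the handle has nothing to deduplicate.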

Cc: Chad Versace 
Cc: Kristian H. Kristensen 
Cc: Chris Wilson 

Jason Ekstrand (8):
   anv: Add a basic implementation of VK_KHX_external_semaphore
   anv: Submit a dummy batch when only semaphores are provided.
   anv/gem: Use EXECBUFFER2_WR when the FENCE_OUT flag is set
   anv: Implement support for exporting semaphores as FENCE_FD
   intel/drm: Pull in the i915 fence array API
   anv/gem: Add a drm syncobj support
   anv: Use DRM sync objects for external semaphores when available
   anv: Advertise VK_KHR_external_semaphore

  include/drm-uapi/i915_drm.h|  30 +++-
  src/intel/vulkan/anv_batch_chain.c | 175 +++-
  src/intel/vulkan/anv_device.c  |  32 +
  src/intel/vulkan/anv_extensions.py |   3 +
  src/intel/vulkan/anv_gem.c |  93 -
  src/intel/vulkan/anv_gem_stubs.c   |  24 
  src/intel/vulkan/anv_private.h |  39 +-
  src/intel/vulkan/anv_queue.c   | 271 -
  8 files changed, 646 insertions(+), 21 deletions(-)


