[Mesa-dev] [Bug 105807] [Regression, bisected]: 3D Rendering not working correctly in Warhammer 40k: Dawn of War II

2018-04-11 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=105807

--- Comment #18 from b...@besd.de  ---
Just confirmed that it works now.

Thanks!

Maybe this should be in stable too?

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 105105] Suffixless KHR_robustness functions aren't exposed in ES 3.2

2018-04-11 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=105105

Tapani Pälli  changed:

       What|Removed  |Added
-----------+---------+----------
     Status|ASSIGNED |RESOLVED
 Resolution|---      |NOTOURBUG

--- Comment #8 from Tapani Pälli  ---
This is now fixed in opengl-es-cts-3.2.4 by the following commit, so I am
resolving this one as NOTOURBUG.

--- 8< ---
commit c693c85a5f9983caab94c3973e6dc0efaae57b0c
Author: Tapani Pälli 
Date:   Thu Apr 5 08:13:46 2018 +0300

Prefer KHR entrypoints instead of EXT for robustness tests

When resolving function entrypoints, framework resolves EXT
entrypoints after KHR to the same pointers. There are drivers that
implement only KHR entrypoints, prefer KHR over EXT so that KHR
entrypoints will be the ones used if both extensions are supported
by the driver.
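
The preference described above can be sketched in C roughly as follows. This
is only an illustration and not the CTS framework code: lookup_proc(),
resolve_prefer_khr(), struct proc_entry and the fake_* functions are
hypothetical names; only the glReadnPixelsKHR / glReadnPixelsEXT entrypoint
names come from the robustness extensions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef void (*proc_fn)(void);

/* Hypothetical stand-in for a driver's table of exported entrypoints. */
struct proc_entry {
   const char *name;
   proc_fn fn;
};

/* Dummy entrypoints, used only so the sketch has something to resolve. */
static void fake_khr(void) {}
static void fake_ext(void) {}

/* Return the function registered under `name`, or NULL if the driver
 * does not export it. */
static proc_fn
lookup_proc(const struct proc_entry *table, size_t count, const char *name)
{
   assert(table != NULL && name != NULL);
   for (size_t i = 0; i < count; i++) {
      if (strcmp(table[i].name, name) == 0)
         return table[i].fn;
   }
   return NULL;
}

/* Prefer the KHR entrypoint; fall back to EXT only when KHR is missing. */
static proc_fn
resolve_prefer_khr(const struct proc_entry *table, size_t count,
                   const char *khr_name, const char *ext_name)
{
   proc_fn fn = lookup_proc(table, count, khr_name);
   if (fn == NULL)
      fn = lookup_proc(table, count, ext_name);
   return fn;
}
```

With this ordering, a driver that exports only the KHR name still gets a
valid pointer, and a driver that exports both ends up using the KHR one.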

Components: OpenGL ES
VK-GL-CTS issue: 1107

Affects:
KHR-NoContext.es32.robustness.getnuniform
KHR-NoContext.es32.robustness.readnpixels

Change-Id: Iec5f7cbdd53061e105b3445f7613ee41fccc4553
Signed-off-by: Tapani Pälli 



[Mesa-dev] [PATCH 2/2] i965/fs: Register allocator shouldn't use grf127 for sends dest

2018-04-11 Thread Jose Maria Casanova Crespo
Since Gen8, the Intel PRM states that "r127 must not be used for
return address when there is a src and dest overlap in send
instruction."

This patch implements this restriction by creating new register allocator
classes that are copies of the normal classes. The new classes exclude
from their set of registers the last one of the original classes
(the only one that includes grf127).

Vgrfs that are used as the destination of send messages sent from a grf are
re-assigned to one of these new classes based on their size, so the register
allocator never assigns to these vgrfs a register that involves grf127.

If dispatch_width > 8 we don't re-assign to the new classes, because all
instructions already have a node interference between source and
destination, and that is enough to avoid the r127 restriction.
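
As a rough illustration of the class selection described above (not the code
in this patch), the index computation could look like the sketch below.
pick_class_index() is a hypothetical helper, and MAX_VGRF_SIZE is assumed to
be 16, matching the classes[16] -> classes[32] change in brw_compiler.h.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed value: 16 size classes, so 0-15 are normal and 16-31 are copies. */
#define MAX_VGRF_SIZE 16

/* Return the register class index for a vgrf of `size` registers.
 * Classes 0..15 are the normal ones indexed by size; classes 16..31 are
 * the duplicates that leave out the register containing grf127. */
static int
pick_class_index(int gen, int dispatch_width, int size,
                 bool is_send_dest_from_grf)
{
   assert(size >= 1 && size <= MAX_VGRF_SIZE);

   int idx = size - 1;

   /* Only Gen8+ has the duplicated classes, and wider dispatch widths are
    * already protected by the source/destination interference. */
   if (gen >= 8 && !(dispatch_width > 8) && is_send_dest_from_grf)
      idx += MAX_VGRF_SIZE;

   return idx;
}
```

For example, a 1-register SIMD8 send destination on Gen9 would land in class
16, while the same vgrf in a SIMD16 program stays in class 0.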

This fixes CTS tests that raised this issue as they were executed as SIMD8:

  dEQP-VK.spirv_assembly.instruction.graphics.16bit_storage.uniform_16struct_to_32struct.uniform_buffer_block_vert
  dEQP-VK.spirv_assembly.instruction.graphics.16bit_storage.uniform_16struct_to_32struct.uniform_buffer_block_tessc

Shader-db results on Skylake:

total instructions in shared programs: 7686798 -> 7686790 (<.01%)
instructions in affected programs: 1476 -> 1468 (-0.54%)
helped: 4
HURT: 0

total cycles in shared programs: 337092322 -> 337095944 (<.01%)
cycles in affected programs: 765861 -> 769483 (0.47%)
helped: 167
HURT: 161

Shader-db results on Broadwell:

total instructions in shared programs: 7658574 -> 7658561 (<.01%)
instructions in affected programs: 2355 -> 2342 (-0.55%)
helped: 5
HURT: 0

total cycles in shared programs: 340694553 -> 340689774 (<.01%)
cycles in affected programs: 1200517 -> 1195738 (-0.40%)
helped: 204
HURT: 267

total spills in shared programs: 4300 -> 4299 (-0.02%)
spills in affected programs: 72 -> 71 (-1.39%)
helped: 1
HURT: 0

total fills in shared programs: 5370 -> 5369 (-0.02%)
fills in affected programs: 58 -> 57 (-1.72%)
helped: 1
HURT: 0

As expected, shader-db reports no changes on previous generations.

Cc: Jason Ekstrand 
Cc: Francisco Jerez 
---
 src/intel/compiler/brw_compiler.h  |  7 ++-
 src/intel/compiler/brw_fs_reg_allocate.cpp | 70 +++---
 2 files changed, 59 insertions(+), 18 deletions(-)

diff --git a/src/intel/compiler/brw_compiler.h 
b/src/intel/compiler/brw_compiler.h
index d3ae6499b91..572d373ff0c 100644
--- a/src/intel/compiler/brw_compiler.h
+++ b/src/intel/compiler/brw_compiler.h
@@ -61,9 +61,12 @@ struct brw_compiler {
 
   /**
* Array of the ra classes for the unaligned contiguous register
-   * block sizes used, indexed by register size.
+   * block sizes used, indexed by register size. Classes starting at
+   * index 16 are classes for registers that should be used for
+   * send destination registers. They are equivalent to 0-15 classes
+   * but not including grf127.
*/
-  int classes[16];
+  int classes[32];
 
   /**
* Mapping from classes to ra_reg ranges.  Each of the per-size
diff --git a/src/intel/compiler/brw_fs_reg_allocate.cpp 
b/src/intel/compiler/brw_fs_reg_allocate.cpp
index ec8e116cb38..66e4a342d0d 100644
--- a/src/intel/compiler/brw_fs_reg_allocate.cpp
+++ b/src/intel/compiler/brw_fs_reg_allocate.cpp
@@ -102,19 +102,31 @@ brw_alloc_reg_set(struct brw_compiler *compiler, int 
dispatch_width)
 * Additionally, on gen5 we need aligned pairs of registers for the PLN
 * instruction, and on gen4 we need 8 contiguous regs for workaround simd16
 * texturing.
+*
+* For Gen8+ we duplicate classes with a new type of classes for vgrfs when
+* they are used as destination of SEND messages. The only difference is
+* that these classes don't include grf127 to implement the restriction of
+* not overlaping source and destination when register grf127 is the
+* destination of a SEND message. So 0-15 are the normal classes indexed by
+* size and 16-31 classes reuse the same registers but don't include
+* grf127.
 */
-   const int class_count = MAX_VGRF_SIZE;
-   int class_sizes[MAX_VGRF_SIZE];
-   for (unsigned i = 0; i < MAX_VGRF_SIZE; i++)
-  class_sizes[i] = i + 1;
+   const int class_types = devinfo->gen >= 8 ? 2 : 1;
+   const int class_count = class_types * MAX_VGRF_SIZE;
+   int class_sizes[class_count];
+   for (int i = 0; i < class_count; i++)
+  class_sizes[i] = (i % MAX_VGRF_SIZE) + 1;
 
memset(compiler->fs_reg_sets[index].class_to_ra_reg_range, 0,
   sizeof(compiler->fs_reg_sets[index].class_to_ra_reg_range));
int *class_to_ra_reg_range = 
compiler->fs_reg_sets[index].class_to_ra_reg_range;
 
-   /* Compute the total number of registers across all classes. */
+   /* Compute the total number of registers across all classes. The duplicated
+* classes to 

[Mesa-dev] [PATCH 1/2] intel/compiler: grf127 can not be dest when src and dest overlap in send

2018-04-11 Thread Jose Maria Casanova Crespo
Implement in brw_eu_validate the restriction from the Intel Broadwell PRM,
vol 07, section "Instruction Set Reference", subsection "EUISA Instructions",
Send Message (page 990):

"r127 must not be used for return address when there is a src and dest overlap
in send instruction."
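
The condition checked by the validator can be read as a small predicate over
register numbers. The sketch below is a simplified, standalone restatement:
send_violates_r127_rule() is a hypothetical name, and the sketch leaves out
the null-destination and Gen8+ checks that the real ERROR_IF performs.

```c
#include <assert.h>
#include <stdbool.h>

/* Return true when a send would break the r127 rule:
 *  - the destination range [dst_reg, dst_reg + rlen) reaches r127, and
 *  - the source range [src_reg, src_reg + mlen) extends past the start of
 *    the destination, which is how the patch detects an overlap. */
static bool
send_violates_r127_rule(int dst_reg, int rlen, int src_reg, int mlen)
{
   assert(dst_reg >= 0 && src_reg >= 0 && rlen >= 0 && mlen >= 0);

   bool dst_reaches_r127 = dst_reg + rlen > 127;
   bool src_reaches_dst = src_reg + mlen > dst_reg;

   return dst_reaches_r127 && src_reaches_dst;
}
```

A send with dst = g127, rlen = 1, src = g127, mlen = 1 is flagged, while the
same send with dst = g126 is not, because its destination never reaches r127.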

Cc: Jason Ekstrand 
Cc: Matt Turner 
---
 src/intel/compiler/brw_eu_validate.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/src/intel/compiler/brw_eu_validate.c 
b/src/intel/compiler/brw_eu_validate.c
index d3189d1ef5e..0d711501303 100644
--- a/src/intel/compiler/brw_eu_validate.c
+++ b/src/intel/compiler/brw_eu_validate.c
@@ -261,6 +261,15 @@ send_restrictions(const struct gen_device_info *devinfo,
   brw_inst_src0_da_reg_nr(devinfo, inst) < 112,
   "send with EOT must use g112-g127");
   }
+  if (devinfo->gen >= 8) {
+ ERROR_IF(!dst_is_null(devinfo, inst) &&
+  (brw_inst_dst_da_reg_nr(devinfo, inst) +
+   brw_inst_rlen(devinfo, inst) > 127 ) &&
+  (brw_inst_src0_da_reg_nr(devinfo, inst) +
+   brw_inst_mlen(devinfo, inst) >
+   brw_inst_dst_da_reg_nr(devinfo, inst)),
+  "r127 can not be dest when src and dest overlap in send");
+  }
}
 
return error_msg;
-- 
2.16.3



[Mesa-dev] [Bug 105807] [Regression, bisected]: 3D Rendering not working correctly in Warhammer 40k: Dawn of War II

2018-04-11 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=105807

Timothy Arceri  changed:

       What|Removed  |Added
-----------+---------+----------
     Status|NEW      |RESOLVED
 Resolution|---      |FIXED

--- Comment #17 from Timothy Arceri  ---
Thanks again for the report and for bisecting.

Fixed by:

commit c7e3d31b0b5f22299a6bd72655502ce8427b40bf
Author: Timothy Arceri 
Date:   Thu Apr 12 09:23:02 2018 +1000

glsl: fix compat shaders in GLSL 1.40

The compatibility and core tokens were not added until GLSL 1.50,
for GLSL 1.40 just assume all shaders built with a compat profile
are compat shaders.

Fixes rendering issues in Dawn of War II on radeonsi which has
enabled OpenGL 3.1 compat support.

Fixes: a0c8b49284ef "mesa: enable OpenGL 3.1 with ARB_compatibility"

Reviewed-by: Marek Olšák 
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105807
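
The rule from the commit message can be written as a standalone predicate.
This is only a sketch mirroring the expression in the patch:
is_compat_shader() and the two-value enum are hypothetical stand-ins for the
parser state and the context API enum in Mesa.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the context API type. */
enum sketch_api { SKETCH_API_CORE, SKETCH_API_COMPAT };

/* Decide whether a shader is treated as a compatibility shader.
 * GLSL 1.40 has no "compatibility" token, so a 1.40 shader built in a
 * compat context is assumed to be a compat shader. */
static bool
is_compat_shader(bool compat_token_present, enum sketch_api api,
                 bool es_shader, int language_version)
{
   return compat_token_present ||
          (api == SKETCH_API_COMPAT && language_version == 140) ||
          (!es_shader && language_version < 140);
}
```

Under this rule a "#version 140" shader is compat only in a compat context,
whereas 1.50 and later still need the explicit compatibility token.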



Re: [Mesa-dev] [PATCH] glsl: fix compat shaders in GLSL 1.40

2018-04-11 Thread Marek Olšák
Reviewed-by: Marek Olšák 

Marek

On Wed, Apr 11, 2018 at 7:54 PM, Timothy Arceri 
wrote:

> On 12/04/18 09:29, Timothy Arceri wrote:
>
>> The compatibility and core tokens were not added until GLSL 1.50,
>> for GLSL 1.40 just assume all shaders built with a compat profile
>> are compat shaders.
>>
>> Fixes rendering issues in Dawn of War II on radeonsi which has
>> enabled OpenGL 3.1 compat support.
>>
>
> oh and I've added this locally:
>
> Fixes: a0c8b49284ef "mesa: enable OpenGL 3.1 with ARB_compatibility"
>
>
>
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105807
>> ---
>>   src/compiler/glsl/glsl_parser_extras.cpp | 2 ++
>>   1 file changed, 2 insertions(+)
>>
>> diff --git a/src/compiler/glsl/glsl_parser_extras.cpp
>> b/src/compiler/glsl/glsl_parser_extras.cpp
>> index 0cc57f5a887..5dd362b3e38 100644
>> --- a/src/compiler/glsl/glsl_parser_extras.cpp
>> +++ b/src/compiler/glsl/glsl_parser_extras.cpp
>> @@ -429,6 +429,8 @@ _mesa_glsl_parse_state::process_version_directive(YYLTYPE
>> *locp, int version,
>> this->language_version = version;
>>this->compat_shader = compat_token_present ||
>> + (this->ctx->API == API_OPENGL_COMPAT &&
>> +  this->language_version == 140) ||
>>(!this->es_shader && this->language_version <
>> 140);
>>bool supported = false;
>>


Re: [Mesa-dev] [PATCH] ac/surface: Allow S swizzle for displayable surfaces.

2018-04-11 Thread Marek Olšák
Reviewed-by: Marek Olšák 

Marek

On Tue, Apr 10, 2018 at 8:10 PM, Bas Nieuwenhuizen 
wrote:

> For dcn1 && < 64 bpp displayable surfaces, addrlib only accepts
> S swizzles.
>
> At the same time addrlib prefers D swizzles if allowed, so we can
> just allow S swizzles as fallback.
>
> Fixes: b64b712558 "ac/surface/gfx9: request desired micro tile mode
> explicitly"
> ---
>  src/amd/common/ac_surface.c | 7 +--
>  1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/src/amd/common/ac_surface.c b/src/amd/common/ac_surface.c
> index 1b4d72e31b..7558dd91e3 100644
> --- a/src/amd/common/ac_surface.c
> +++ b/src/amd/common/ac_surface.c
> @@ -865,9 +865,12 @@ gfx9_get_preferred_swizzle_mode(ADDR_HANDLE addrlib,
> sin.numSamples = in->numSamples;
> sin.numFrags = in->numFrags;
>
> -   if (flags & RADEON_SURF_SCANOUT)
> +   if (flags & RADEON_SURF_SCANOUT) {
> sin.preferredSwSet.sw_D = 1;
> -   else if (in->flags.depth || in->flags.stencil || is_fmask)
> +   /* Raven only allows S for displayable surfaces with < 64
> bpp, so
> +* allow it as fallback */
> +   sin.preferredSwSet.sw_S = 1;
> +   } else if (in->flags.depth || in->flags.stencil || is_fmask)
> sin.preferredSwSet.sw_Z = 1;
> else
> sin.preferredSwSet.sw_S = 1;
> --
> 2.17.0
>


Re: [Mesa-dev] [PATCH 5/5] radeonsi/nir: fix crash in test involving the sample mask

2018-04-11 Thread Timothy Arceri

On 11/04/18 20:56, Nicolai Hähnle wrote:

From: Nicolai Hähnle 


Please add to the commit message which test was fixed by this. Otherwise 
the change seems reasonable:


Reviewed-by: Timothy Arceri 



---
  src/gallium/drivers/radeonsi/si_shader.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index 8c62d53e2ad..3e224b083e6 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -2009,21 +2009,22 @@ static LLVMValueRef load_sample_position(struct 
ac_shader_abi *abi, LLVMValueRef
buffer_load_const(ctx, resource, offset1),
LLVMConstReal(ctx->f32, 0),
LLVMConstReal(ctx->f32, 0)
};
  
  	return lp_build_gather_values(&ctx->gallivm, pos, 4);

  }
  
  static LLVMValueRef load_sample_mask_in(struct ac_shader_abi *abi)

  {
-   return abi->sample_coverage;
+   struct si_shader_context *ctx = si_shader_context_from_abi(abi);
+   return ac_to_integer(&ctx->ac, abi->sample_coverage);
  }
  
  static LLVMValueRef si_load_tess_coord(struct ac_shader_abi *abi)

  {
struct si_shader_context *ctx = si_shader_context_from_abi(abi);
struct lp_build_context *bld = &ctx->bld_base.base;
  
  	LLVMValueRef coord[4] = {

LLVMGetParam(ctx->main_fn, ctx->param_tes_u),
LLVMGetParam(ctx->main_fn, ctx->param_tes_v),




[Mesa-dev] [PATCH 6/7] i965: Move unmap_depthstencil before map_depthstencil

2018-04-11 Thread Chris Wilson
Reorder code to avoid a forward declaration in the next patch.

Signed-off-by: Chris Wilson 
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 114 +-
 1 file changed, 57 insertions(+), 57 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index f05b7549a79..29db3873c3a 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -3408,7 +3408,7 @@ intel_miptree_map_etc(struct brw_context *brw,
 }
 
 /**
- * Mapping function for packed depth/stencil miptrees backed by real separate
+ * Mapping functions for packed depth/stencil miptrees backed by real separate
  * miptrees for depth and stencil.
  *
  * On gen7, and to support HiZ pre-gen7, we have to have the stencil buffer
@@ -3419,30 +3419,20 @@ intel_miptree_map_etc(struct brw_context *brw,
  * copying the data between the actual backing store and the temporary.
  */
 static void
-intel_miptree_map_depthstencil(struct brw_context *brw,
-  struct intel_mipmap_tree *mt,
-  struct intel_miptree_map *map,
-  unsigned int level, unsigned int slice)
+intel_miptree_unmap_depthstencil(struct brw_context *brw,
+struct intel_mipmap_tree *mt,
+struct intel_miptree_map *map,
+unsigned int level,
+unsigned int slice)
 {
struct intel_mipmap_tree *z_mt = mt;
struct intel_mipmap_tree *s_mt = mt->stencil_mt;
bool map_z32f_x24s8 = mt->format == MESA_FORMAT_Z_FLOAT32;
-   int packed_bpp = map_z32f_x24s8 ? 8 : 4;
-
-   map->stride = map->w * packed_bpp;
-   map->buffer = map->ptr = malloc(map->stride * map->h);
-   if (!map->buffer)
-  return;
 
-   /* One of either READ_BIT or WRITE_BIT or both is set.  READ_BIT implies no
-* INVALIDATE_RANGE_BIT.  WRITE_BIT needs the original values read in unless
-* invalidate is set, since we'll be writing the whole rectangle from our
-* temporary buffer back out.
-*/
-   if (!(map->mode & GL_MAP_INVALIDATE_RANGE_BIT)) {
+   if (map->mode & GL_MAP_WRITE_BIT) {
   uint32_t *packed_map = map->ptr;
-  uint8_t *s_map = intel_miptree_map_raw(brw, s_mt, GL_MAP_READ_BIT);
-  uint32_t *z_map = intel_miptree_map_raw(brw, z_mt, GL_MAP_READ_BIT);
+  uint8_t *s_map = intel_miptree_map_raw(brw, s_mt, GL_MAP_WRITE_BIT);
+  uint32_t *z_map = intel_miptree_map_raw(brw, z_mt, GL_MAP_WRITE_BIT);
   unsigned int s_image_x, s_image_y;
   unsigned int z_image_x, z_image_y;
 
@@ -3453,22 +3443,21 @@ intel_miptree_map_depthstencil(struct brw_context *brw,
 
   for (uint32_t y = 0; y < map->h; y++) {
 for (uint32_t x = 0; x < map->w; x++) {
-   int map_x = map->x + x, map_y = map->y + y;
ptrdiff_t s_offset = intel_offset_S8(s_mt->surf.row_pitch,
-map_x + s_image_x,
-map_y + s_image_y,
+x + s_image_x + map->x,
+y + s_image_y + map->y,
 brw->has_swizzling);
-   ptrdiff_t z_offset = ((map_y + z_image_y) *
+   ptrdiff_t z_offset = ((y + z_image_y + map->y) *
   (z_mt->surf.row_pitch / 4) +
- (map_x + z_image_x));
-   uint8_t s = s_map[s_offset];
-   uint32_t z = z_map[z_offset];
+ (x + z_image_x + map->x));
 
if (map_z32f_x24s8) {
-  packed_map[(y * map->w + x) * 2 + 0] = z;
-  packed_map[(y * map->w + x) * 2 + 1] = s;
+  z_map[z_offset] = packed_map[(y * map->w + x) * 2 + 0];
+  s_map[s_offset] = packed_map[(y * map->w + x) * 2 + 1];
} else {
-  packed_map[y * map->w + x] = (s << 24) | (z & 0x00ffffff);
+  uint32_t packed = packed_map[y * map->w + x];
+  s_map[s_offset] = packed >> 24;
+  z_map[z_offset] = packed;
}
 }
   }
@@ -3476,34 +3465,43 @@ intel_miptree_map_depthstencil(struct brw_context *brw,
   intel_miptree_unmap_raw(s_mt);
   intel_miptree_unmap_raw(z_mt);
 
-  DBG("%s: %d,%d %dx%d from z mt %p %d,%d, s mt %p %d,%d = %p/%d\n",
+  DBG("%s: %d,%d %dx%d from z mt %p (%s) %d,%d, s mt %p %d,%d = %p/%d\n",
  __func__,
  map->x, map->y, map->w, map->h,
- z_mt, map->x + z_image_x, map->y + z_image_y,
+ z_mt, _mesa_get_format_name(z_mt->format),
+ map->x + z_image_x, map->y + z_image_y,
  s_mt, map->x + s_image_x, map->y + s_image_y,
  map->ptr, map->stride);
-   } else {
-  DBG("%s: %d,%d %dx%d from mt %p = 

Re: [Mesa-dev] [PATCH 1/5] glsl: prevent spurious Valgrind errors when serializing NIR

2018-04-11 Thread Timothy Arceri

Reviewed-by: Timothy Arceri 

On 11/04/18 20:56, Nicolai Hähnle wrote:

From: Nicolai Hähnle 

It looks as if the structure fields array is fully initialized below,
but in fact at least gcc in debug builds will not actually overwrite
the unused bits of bit fields.
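
A minimal C sketch of the idea follows. It uses a hypothetical struct
field_info in place of glsl_struct_field and calloc() in place of
rzalloc_array(), so it only illustrates the zero-fill approach and is not
the Mesa code. The point is that bit-field stores leave the unused bits of
the storage unit alone, so those bits are only well defined if the memory
was zeroed first.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a struct with bit fields; only 3 of the 32 bits
 * in the bit-field storage unit are actually used. */
struct field_info {
   int location;
   unsigned interpolation:2;
   unsigned centroid:1;
};

/* Allocate the array zero-filled so that the unused bits start out as 0
 * instead of whatever the allocator happened to return. */
static struct field_info *
alloc_fields_zeroed(size_t count)
{
   return calloc(count, sizeof(struct field_info));
}

/* Member-wise stores; in practice these do not touch the unused bits. */
static void
set_field(struct field_info *f, int location, unsigned interp,
          unsigned centroid)
{
   assert(f != NULL);
   f->location = location;
   f->interpolation = interp;
   f->centroid = centroid;
}
```

With the zero-fill, two arrays filled with the same values compare equal
byte for byte with common compilers, which is what a serializer that writes
out the raw bytes needs in order to stay quiet under Valgrind.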
---
  src/compiler/glsl_types.cpp | 6 --
  1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/compiler/glsl_types.cpp b/src/compiler/glsl_types.cpp
index 9d853caf721..9ebf6a433bd 100644
--- a/src/compiler/glsl_types.cpp
+++ b/src/compiler/glsl_types.cpp
@@ -98,22 +98,24 @@ glsl_type::glsl_type(const glsl_struct_field *fields, 
unsigned num_fields,
 vector_elements(0), matrix_columns(0),
 length(num_fields)
  {
 unsigned int i;
  
 this->mem_ctx = ralloc_context(NULL);

 assert(this->mem_ctx != NULL);
  
 assert(name != NULL);

 this->name = ralloc_strdup(this->mem_ctx, name);
-   this->fields.structure = ralloc_array(this->mem_ctx,
- glsl_struct_field, length);
+   /* Zero-fill to prevent spurious Valgrind errors when serializing NIR
+* due to uninitialized unused bits in bit fields. */
+   this->fields.structure = rzalloc_array(this->mem_ctx,
+  glsl_struct_field, length);
  
 for (i = 0; i < length; i++) {

this->fields.structure[i] = fields[i];
this->fields.structure[i].name = ralloc_strdup(this->fields.structure,
   fields[i].name);
 }
  }
  
  glsl_type::glsl_type(const glsl_struct_field *fields, unsigned num_fields,

   enum glsl_interface_packing packing,




Re: [Mesa-dev] [PATCH 4/5] radeonsi/nir: set FS properties only when scanning a fragment shader

2018-04-11 Thread Timothy Arceri

Reviewed-by: Timothy Arceri 

On 11/04/18 20:56, Nicolai Hähnle wrote:

From: Nicolai Hähnle 

---
  src/gallium/drivers/radeonsi/si_shader_nir.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader_nir.c 
b/src/gallium/drivers/radeonsi/si_shader_nir.c
index c0e08c79a56..b4fba8b8812 100644
--- a/src/gallium/drivers/radeonsi/si_shader_nir.c
+++ b/src/gallium/drivers/radeonsi/si_shader_nir.c
@@ -600,21 +600,22 @@ void si_nir_scan_shader(const struct nir_shader *nir,
case TGSI_SEMANTIC_TESSOUTER:
info->reads_tessfactor_outputs = true;
break;
default:
info->reads_pervertex_outputs = true;
}
}
}
  
  		unsigned loc = variable->data.location;

-   if (loc == FRAG_RESULT_COLOR &&
+   if (nir->info.stage == MESA_SHADER_FRAGMENT &&
+   loc == FRAG_RESULT_COLOR &&
nir->info.outputs_written & (1ull << loc)) {
assert(attrib_count == 1);

info->properties[TGSI_PROPERTY_FS_COLOR0_WRITES_ALL_CBUFS] = true;
}
}
  
  	info->num_outputs = num_outputs;
  
  	struct set *ubo_set = _mesa_set_create(NULL, _mesa_hash_pointer,

   _mesa_key_pointer_equal);




[Mesa-dev] [PATCH 5/7] i965: Move unmap_etc before map_etc

2018-04-11 Thread Chris Wilson
Reorder code to avoid a forward declaration in the next patch.

Signed-off-by: Chris Wilson 
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 42 +--
 1 file changed, 21 insertions(+), 21 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index bc44338c55b..f05b7549a79 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -3355,27 +3355,6 @@ intel_miptree_map_s8(struct brw_context *brw,
}
 }
 
-static void
-intel_miptree_map_etc(struct brw_context *brw,
-  struct intel_mipmap_tree *mt,
-  struct intel_miptree_map *map,
-  unsigned int level,
-  unsigned int slice)
-{
-   assert(mt->etc_format != MESA_FORMAT_NONE);
-   if (mt->etc_format == MESA_FORMAT_ETC1_RGB8) {
-  assert(mt->format == MESA_FORMAT_R8G8B8X8_UNORM);
-   }
-
-   assert(map->mode & GL_MAP_WRITE_BIT);
-   assert(map->mode & GL_MAP_INVALIDATE_RANGE_BIT);
-
-   map->stride = _mesa_format_row_stride(mt->etc_format, map->w);
-   map->buffer = malloc(_mesa_format_image_size(mt->etc_format,
-map->w, map->h, 1));
-   map->ptr = map->buffer;
-}
-
 static void
 intel_miptree_unmap_etc(struct brw_context *brw,
 struct intel_mipmap_tree *mt,
@@ -3407,6 +3386,27 @@ intel_miptree_unmap_etc(struct brw_context *brw,
free(map->buffer);
 }
 
+static void
+intel_miptree_map_etc(struct brw_context *brw,
+  struct intel_mipmap_tree *mt,
+  struct intel_miptree_map *map,
+  unsigned int level,
+  unsigned int slice)
+{
+   assert(mt->etc_format != MESA_FORMAT_NONE);
+   if (mt->etc_format == MESA_FORMAT_ETC1_RGB8) {
+  assert(mt->format == MESA_FORMAT_R8G8B8X8_UNORM);
+   }
+
+   assert(map->mode & GL_MAP_WRITE_BIT);
+   assert(map->mode & GL_MAP_INVALIDATE_RANGE_BIT);
+
+   map->stride = _mesa_format_row_stride(mt->etc_format, map->w);
+   map->buffer = malloc(_mesa_format_image_size(mt->etc_format,
+map->w, map->h, 1));
+   map->ptr = map->buffer;
+}
+
 /**
  * Mapping function for packed depth/stencil miptrees backed by real separate
  * miptrees for depth and stencil.
-- 
2.17.0



[Mesa-dev] [PATCH 4/7] i965: Move unmap_s8 before map_s8

2018-04-11 Thread Chris Wilson
Reorder code to avoid a forward declaration in the next patch.

Signed-off-by: Chris Wilson 
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 60 +--
 1 file changed, 30 insertions(+), 30 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 76239e60527..bc44338c55b 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -3280,6 +3280,36 @@ intel_miptree_map_movntdqa(struct brw_context *brw,
 }
 #endif
 
+static void
+intel_miptree_unmap_s8(struct brw_context *brw,
+  struct intel_mipmap_tree *mt,
+  struct intel_miptree_map *map,
+  unsigned int level,
+  unsigned int slice)
+{
+   if (map->mode & GL_MAP_WRITE_BIT) {
+  unsigned int image_x, image_y;
+  uint8_t *untiled_s8_map = map->ptr;
+  uint8_t *tiled_s8_map = intel_miptree_map_raw(brw, mt, GL_MAP_WRITE_BIT);
+
+  intel_miptree_get_image_offset(mt, level, slice, &image_x, &image_y);
+
+  for (uint32_t y = 0; y < map->h; y++) {
+for (uint32_t x = 0; x < map->w; x++) {
+   ptrdiff_t offset = intel_offset_S8(mt->surf.row_pitch,
+  image_x + x + map->x,
+  image_y + y + map->y,
+  brw->has_swizzling);
+   tiled_s8_map[offset] = untiled_s8_map[y * map->w + x];
+}
+  }
+
+  intel_miptree_unmap_raw(mt);
+   }
+
+   free(map->buffer);
+}
+
 static void
 intel_miptree_map_s8(struct brw_context *brw,
 struct intel_mipmap_tree *mt,
@@ -3325,36 +3355,6 @@ intel_miptree_map_s8(struct brw_context *brw,
}
 }
 
-static void
-intel_miptree_unmap_s8(struct brw_context *brw,
-  struct intel_mipmap_tree *mt,
-  struct intel_miptree_map *map,
-  unsigned int level,
-  unsigned int slice)
-{
-   if (map->mode & GL_MAP_WRITE_BIT) {
-  unsigned int image_x, image_y;
-  uint8_t *untiled_s8_map = map->ptr;
-  uint8_t *tiled_s8_map = intel_miptree_map_raw(brw, mt, GL_MAP_WRITE_BIT);
-
-  intel_miptree_get_image_offset(mt, level, slice, &image_x, &image_y);
-
-  for (uint32_t y = 0; y < map->h; y++) {
-for (uint32_t x = 0; x < map->w; x++) {
-   ptrdiff_t offset = intel_offset_S8(mt->surf.row_pitch,
-  image_x + x + map->x,
-  image_y + y + map->y,
-  brw->has_swizzling);
-   tiled_s8_map[offset] = untiled_s8_map[y * map->w + x];
-}
-  }
-
-  intel_miptree_unmap_raw(mt);
-   }
-
-   free(map->buffer);
-}
-
 static void
 intel_miptree_map_etc(struct brw_context *brw,
   struct intel_mipmap_tree *mt,
-- 
2.17.0



[Mesa-dev] [PATCH 1/7] i965: Move unmap_gtt before map_gtt

2018-04-11 Thread Chris Wilson
Reorder code to avoid a forward declaration in the next patch.

Signed-off-by: Chris Wilson 
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 8d3ddd56544..f90d462b925 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -3080,6 +3080,12 @@ intel_miptree_unmap_raw(struct intel_mipmap_tree *mt)
brw_bo_unmap(mt->bo);
 }
 
+static void
+intel_miptree_unmap_gtt(struct intel_mipmap_tree *mt)
+{
+   intel_miptree_unmap_raw(mt);
+}
+
 static void
 intel_miptree_map_gtt(struct brw_context *brw,
  struct intel_mipmap_tree *mt,
@@ -3127,12 +3133,6 @@ intel_miptree_map_gtt(struct brw_context *brw,
x, y, map->ptr, map->stride);
 }
 
-static void
-intel_miptree_unmap_gtt(struct intel_mipmap_tree *mt)
-{
-   intel_miptree_unmap_raw(mt);
-}
-
 static void
 intel_miptree_map_blit(struct brw_context *brw,
   struct intel_mipmap_tree *mt,
-- 
2.17.0



[Mesa-dev] [PATCH 7/7] i965: Record mipmap resolver for unmapping

2018-04-11 Thread Chris Wilson
When mapping a region of the mipmap_tree, record which complementary
method to use to unmap it afterwards. By doing so we can avoid
duplicating the decision tree used when mapping and thereby eliminate
trivial errors that can be introduced if the two if-chains become out of
sync.
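
The pattern can be sketched in isolation as follows. The types and function
names here (struct mapping, map_gtt(), map_s8(), generic_unmap()) are
simplified, hypothetical stand-ins rather than the i965 ones: each map
function records its matching unmap callback, and the generic unmap path
just calls whatever was recorded.

```c
#include <assert.h>
#include <stddef.h>

struct mapping;
typedef void (*unmap_fn)(struct mapping *map);

enum { UNMAP_NONE, UNMAP_GTT, UNMAP_S8 };

/* Hypothetical, trimmed-down mapping record. */
struct mapping {
   int unmapped_by;   /* records which unmap ran, for illustration only */
   unmap_fn unmap;    /* complementary unmap chosen at map time */
};

static void unmap_gtt(struct mapping *map) { map->unmapped_by = UNMAP_GTT; }
static void unmap_s8(struct mapping *map)  { map->unmapped_by = UNMAP_S8; }

/* Each map path stores the unmap function that undoes it. */
static void
map_gtt(struct mapping *map)
{
   map->unmapped_by = UNMAP_NONE;
   map->unmap = unmap_gtt;
}

static void
map_s8(struct mapping *map)
{
   map->unmapped_by = UNMAP_NONE;
   map->unmap = unmap_s8;
}

/* The generic unmap no longer repeats the decision tree used for mapping. */
static void
generic_unmap(struct mapping *map)
{
   assert(map != NULL);
   if (map->unmap)
      map->unmap(map);
}
```

Because the choice is made once, at map time, adding a new mapping method
only requires setting the callback in its map function.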

Signed-off-by: Chris Wilson 
Cc: Kenneth Graunke 
Cc: Scott D Phillips 
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 33 +--
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  6 
 2 files changed, 22 insertions(+), 17 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 29db3873c3a..1b4e9f8f412 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -3081,7 +3081,10 @@ intel_miptree_unmap_raw(struct intel_mipmap_tree *mt)
 }
 
 static void
-intel_miptree_unmap_gtt(struct intel_mipmap_tree *mt)
+intel_miptree_unmap_gtt(struct brw_context *brw,
+struct intel_mipmap_tree *mt,
+struct intel_miptree_map *map,
+unsigned int level, unsigned int slice)
 {
intel_miptree_unmap_raw(mt);
 }
@@ -3131,6 +3134,8 @@ intel_miptree_map_gtt(struct brw_context *brw,
map->x, map->y, map->w, map->h,
mt, _mesa_get_format_name(mt->format),
x, y, map->ptr, map->stride);
+
+   map->unmap = intel_miptree_unmap_gtt;
 }
 
 static void
@@ -3196,6 +3201,7 @@ intel_miptree_map_blit(struct brw_context *brw,
mt, _mesa_get_format_name(mt->format),
level, slice, map->ptr, map->stride);
 
+   map->unmap = intel_miptree_unmap_blit;
return;
 
 fail:
@@ -3277,6 +3283,8 @@ intel_miptree_map_movntdqa(struct brw_context *brw,
}
 
intel_miptree_unmap_raw(mt);
+
+   map->unmap = intel_miptree_unmap_movntdqa;
 }
 #endif
 
@@ -3353,6 +3361,8 @@ intel_miptree_map_s8(struct brw_context *brw,
  map->x, map->y, map->w, map->h,
  mt, map->ptr, map->stride);
}
+
+   map->unmap = intel_miptree_unmap_s8;
 }
 
 static void
@@ -3405,6 +3415,7 @@ intel_miptree_map_etc(struct brw_context *brw,
map->buffer = malloc(_mesa_format_image_size(mt->etc_format,
 map->w, map->h, 1));
map->ptr = map->buffer;
+   map->unmap = intel_miptree_unmap_etc;
 }
 
 /**
@@ -3546,6 +3557,8 @@ intel_miptree_map_depthstencil(struct brw_context *brw,
  map->x, map->y, map->w, map->h,
  mt, map->ptr, map->stride);
}
+
+   map->unmap = intel_miptree_unmap_depthstencil;
 }
 
 /**
@@ -3717,22 +3730,8 @@ intel_miptree_unmap(struct brw_context *brw,
DBG("%s: mt %p (%s) level %d slice %d\n", __func__,
mt, _mesa_get_format_name(mt->format), level, slice);
 
-   if (mt->format == MESA_FORMAT_S_UINT8) {
-  intel_miptree_unmap_s8(brw, mt, map, level, slice);
-   } else if (mt->etc_format != MESA_FORMAT_NONE &&
-  !(map->mode & BRW_MAP_DIRECT_BIT)) {
-  intel_miptree_unmap_etc(brw, mt, map, level, slice);
-   } else if (mt->stencil_mt && !(map->mode & BRW_MAP_DIRECT_BIT)) {
-  intel_miptree_unmap_depthstencil(brw, mt, map, level, slice);
-   } else if (map->linear_mt) {
-  intel_miptree_unmap_blit(brw, mt, map, level, slice);
-#if defined(USE_SSE41)
-   } else if (map->buffer && cpu_has_sse4_1) {
-  intel_miptree_unmap_movntdqa(brw, mt, map, level, slice);
-#endif
-   } else {
-  intel_miptree_unmap_gtt(mt);
-   }
+   if (map->unmap)
+  map->unmap(brw, mt, map, level, slice);
 
intel_miptree_release_map(mt, level, slice);
 }
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
index 4136c6586b6..0f750a6d8b2 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
@@ -88,6 +88,12 @@ struct intel_miptree_map {
void *ptr;
/** Stride of the mapping. */
int stride;
+
+   void (*unmap)(struct brw_context *brw,
+ struct intel_mipmap_tree *mt,
+ struct intel_miptree_map *map,
+ unsigned int level,
+ unsigned int slice);
 };
 
 /**
-- 
2.17.0



[Mesa-dev] [PATCH 3/7] i965: Move unmap_movntdqa before map_movntdqa

2018-04-11 Thread Chris Wilson
Reorder code to avoid a forward declaration in the next patch.

Signed-off-by: Chris Wilson 
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 24 +--
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 85a5262a414..76239e60527 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -3208,6 +3208,18 @@ fail:
  * "Map" a buffer by copying it to an untiled temporary using MOVNTDQA.
  */
 #if defined(USE_SSE41)
+static void
+intel_miptree_unmap_movntdqa(struct brw_context *brw,
+ struct intel_mipmap_tree *mt,
+ struct intel_miptree_map *map,
+ unsigned int level,
+ unsigned int slice)
+{
+   _mesa_align_free(map->buffer);
+   map->buffer = NULL;
+   map->ptr = NULL;
+}
+
 static void
 intel_miptree_map_movntdqa(struct brw_context *brw,
struct intel_mipmap_tree *mt,
@@ -3266,18 +3278,6 @@ intel_miptree_map_movntdqa(struct brw_context *brw,
 
intel_miptree_unmap_raw(mt);
 }
-
-static void
-intel_miptree_unmap_movntdqa(struct brw_context *brw,
- struct intel_mipmap_tree *mt,
- struct intel_miptree_map *map,
- unsigned int level,
- unsigned int slice)
-{
-   _mesa_align_free(map->buffer);
-   map->buffer = NULL;
-   map->ptr = NULL;
-}
 #endif
 
 static void
-- 
2.17.0



[Mesa-dev] [PATCH 2/7] i965: Move unmap_blit before map_blit

2018-04-11 Thread Chris Wilson
Reorder code to avoid a forward declaration in the next patch.

Signed-off-by: Chris Wilson 
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 44 +--
 1 file changed, 22 insertions(+), 22 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index f90d462b925..85a5262a414 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -3133,6 +3133,28 @@ intel_miptree_map_gtt(struct brw_context *brw,
x, y, map->ptr, map->stride);
 }
 
+static void
+intel_miptree_unmap_blit(struct brw_context *brw,
+struct intel_mipmap_tree *mt,
+struct intel_miptree_map *map,
+unsigned int level,
+unsigned int slice)
+{
+   struct gl_context *ctx = &brw->ctx;
+
+   intel_miptree_unmap_raw(map->linear_mt);
+
+   if (map->mode & GL_MAP_WRITE_BIT) {
+  bool ok = intel_miptree_copy(brw,
+   map->linear_mt, 0, 0, 0, 0,
+   mt, level, slice, map->x, map->y,
+   map->w, map->h);
+  WARN_ONCE(!ok, "Failed to blit from linear temporary mapping");
+   }
+
+   intel_miptree_release(&map->linear_mt);
+}
+
 static void
 intel_miptree_map_blit(struct brw_context *brw,
   struct intel_mipmap_tree *mt,
@@ -3182,28 +3204,6 @@ fail:
map->stride = 0;
 }
 
-static void
-intel_miptree_unmap_blit(struct brw_context *brw,
-struct intel_mipmap_tree *mt,
-struct intel_miptree_map *map,
-unsigned int level,
-unsigned int slice)
-{
-   struct gl_context *ctx = &brw->ctx;
-
-   intel_miptree_unmap_raw(map->linear_mt);
-
-   if (map->mode & GL_MAP_WRITE_BIT) {
-  bool ok = intel_miptree_copy(brw,
-   map->linear_mt, 0, 0, 0, 0,
-   mt, level, slice, map->x, map->y,
-   map->w, map->h);
-  WARN_ONCE(!ok, "Failed to blit from linear temporary mapping");
-   }
-
-   intel_miptree_release(&map->linear_mt);
-}
-
 /**
  * "Map" a buffer by copying it to an untiled temporary using MOVNTDQA.
  */
-- 
2.17.0



Re: [Mesa-dev] [PATCH 2/5] radeonsi: fix error paths of si_texture_transfer_map

2018-04-11 Thread Timothy Arceri



On 11/04/18 20:56, Nicolai Hähnle wrote:

From: Nicolai Hähnle 

trans is zero-initialized, but trans->resource is set up immediately, so it
needs to be unreferenced on the error paths.
---
  src/gallium/drivers/radeonsi/si_texture.c | 25 -
  1 file changed, 12 insertions(+), 13 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_texture.c 
b/src/gallium/drivers/radeonsi/si_texture.c
index 0bab2a6c45b..907e8c8cec9 100644
--- a/src/gallium/drivers/radeonsi/si_texture.c
+++ b/src/gallium/drivers/radeonsi/si_texture.c
@@ -1728,49 +1728,46 @@ static void *si_texture_transfer_map(struct pipe_context *ctx,
 			 * then decompress the temporary one to staging.
 			 *
 			 * Only the region being mapped is transfered.
 			 */
 			struct pipe_resource resource;
 
 			si_init_temp_resource_from_box(&resource, texture, box, level, 0);
 
 			if (!si_init_flushed_depth_texture(ctx, &resource, &staging_depth)) {
 				PRINT_ERR("failed to create temporary texture to hold untiled copy\n");
-				FREE(trans);
-				return NULL;
+				goto fail_trans;
 			}
 
 			if (usage & PIPE_TRANSFER_READ) {
 				struct pipe_resource *temp = ctx->screen->resource_create(ctx->screen, &resource);
 				if (!temp) {
 					PRINT_ERR("failed to create a temporary depth texture\n");
-					FREE(trans);
-					return NULL;
+					goto fail_trans;
 				}
 
 				si_copy_region_with_blit(ctx, temp, 0, 0, 0, 0, texture, level, box);
 				si_blit_decompress_depth(ctx, (struct r600_texture*)temp, staging_depth,
 							 0, 0, 0, box->depth, 0, 0);
 				pipe_resource_reference(&temp, NULL);
 			}
 
 			/* Just get the strides. */
 			si_texture_get_offset(sctx->screen, staging_depth, level, NULL,
 						&trans->b.b.stride,
 						&trans->b.b.layer_stride);
 		} else {
 			/* XXX: only readback the rectangle which is being mapped? */
 			/* XXX: when discard is true, no need to read back from depth texture */
 			if (!si_init_flushed_depth_texture(ctx, texture, &staging_depth)) {
 				PRINT_ERR("failed to create temporary texture to hold untiled copy\n");
-				FREE(trans);
-				return NULL;
+				goto fail_trans;
 			}
 
 			si_blit_decompress_depth(ctx, rtex, staging_depth,
 						 level, level,
 						 box->z, box->z + box->depth - 1,
 						 0, 0);
 
 			offset = si_texture_get_offset(sctx->screen, staging_depth,
 							 level, box,
 							 &trans->b.b.stride,
@@ -1785,22 +1782,21 @@ static void *si_texture_transfer_map(struct pipe_context *ctx,
 
 		si_init_temp_resource_from_box(&resource, texture, box, level,
 					       SI_RESOURCE_FLAG_TRANSFER);
 		resource.usage = (usage & PIPE_TRANSFER_READ) ?
 			PIPE_USAGE_STAGING : PIPE_USAGE_STREAM;
 
 		/* Create the temporary texture. */
 		staging = (struct r600_texture*)ctx->screen->resource_create(ctx->screen, &resource);
 		if (!staging) {
 			PRINT_ERR("failed to create temporary texture to hold untiled copy\n");
-			FREE(trans);
-			return NULL;
+			goto fail_trans;
 		}
 		trans->staging = &staging->resource;
 
 		/* Just get the strides. */
 		si_texture_get_offset(sctx->screen, staging, 0, NULL,
 					&trans->b.b.stride,
 					&trans->b.b.layer_stride);
 
 		if (usage & PIPE_TRANSFER_READ)
 			si_copy_to_staging_texture(ctx, trans);
@@ -1809,28 +1805,31 @@ static void *si_texture_transfer_map(struct pipe_context *ctx,
 
 		buf = trans->staging;
 	} else {
 		/* the resource is mapped directly */
 		offset = si_texture_get_offset(sctx->screen, rtex, level, box,
 						 &trans->b.b.stride,
 						 &trans->b.b.layer_stride);
 		buf = &rtex->resource;

[Mesa-dev] [PATCH] nir: Look into uniform structs for samplers when counting num_textures.

2018-04-11 Thread Eric Anholt
mesa/st decides whether to update samplers after a program change based on
whether num_textures is nonzero.  By not counting samplers in a uniform
struct, we would segfault in
KHR-GLES3.shaders.struct.uniform.sampler_vertex if it was run in the same
context after a non-vertex-shader-uniform testcase (as is the case during
a full conformance run).

v2: Implement using two separate pure functions instead of updating
pointers.
---
 src/compiler/nir/nir_gather_info.c | 56 +++---
 1 file changed, 44 insertions(+), 12 deletions(-)

diff --git a/src/compiler/nir/nir_gather_info.c 
b/src/compiler/nir/nir_gather_info.c
index 5530009255d7..95b4456e5b19 100644
--- a/src/compiler/nir/nir_gather_info.c
+++ b/src/compiler/nir/nir_gather_info.c
@@ -350,24 +350,56 @@ gather_info_block(nir_block *block, nir_shader *shader)
}
 }
 
+static unsigned
+glsl_type_get_sampler_count(const struct glsl_type *type)
+{
+   if (glsl_type_is_array(type)) {
+  return (glsl_get_aoa_size(type) *
+  glsl_type_get_sampler_count(glsl_without_array(type)));
+   }
+
+   if (glsl_type_is_struct(type)) {
+  unsigned count = 0;
+  for (int i = 0; i < glsl_get_length(type); i++)
+ count += glsl_type_get_sampler_count(glsl_get_struct_field(type, i));
+  return count;
+   }
+
+   if (glsl_type_is_sampler(type))
+  return 1;
+
+   return 0;
+}
+
+static unsigned
+glsl_type_get_image_count(const struct glsl_type *type)
+{
+   if (glsl_type_is_array(type)) {
+  return (glsl_get_aoa_size(type) *
+  glsl_type_get_image_count(glsl_without_array(type)));
+   }
+
+   if (glsl_type_is_struct(type)) {
+  unsigned count = 0;
+  for (int i = 0; i < glsl_get_length(type); i++)
+ count += glsl_type_get_image_count(glsl_get_struct_field(type, i));
+  return count;
+   }
+
+   if (glsl_type_is_image(type))
+  return 1;
+
+   return 0;
+}
+
 void
 nir_shader_gather_info(nir_shader *shader, nir_function_impl *entrypoint)
 {
shader->info.num_textures = 0;
shader->info.num_images = 0;
nir_foreach_variable(var, &shader->uniforms) {
-  const struct glsl_type *type = var->type;
-  unsigned count = 1;
-  if (glsl_type_is_array(type)) {
- count = glsl_get_aoa_size(type);
- type = glsl_without_array(type);
-  }
-
-  if (glsl_type_is_image(type)) {
- shader->info.num_images += count;
-  } else if (glsl_type_is_sampler(type)) {
- shader->info.num_textures += count;
-  }
+  shader->info.num_textures += glsl_type_get_sampler_count(var->type);
+  shader->info.num_images += glsl_type_get_image_count(var->type);
}
 
shader->info.inputs_read = 0;
-- 
2.17.0
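
The counting in this patch is a recursion over the type: arrays multiply, structs sum their fields, and a sampler counts as one. A small self-contained sketch of the same idea follows. The toy_type representation is an assumption made up for the example and is unrelated to the real glsl_type API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified type description (not glsl_type). */
enum toy_kind { TOY_SCALAR, TOY_SAMPLER, TOY_ARRAY, TOY_STRUCT };

struct toy_type {
   enum toy_kind kind;
   unsigned length;                       /* array length or field count */
   const struct toy_type *element;        /* TOY_ARRAY: element type */
   const struct toy_type *const *fields;  /* TOY_STRUCT: field types */
};

/* Pure function: returns the number of samplers reachable from a type. */
static unsigned toy_sampler_count(const struct toy_type *type)
{
   switch (type->kind) {
   case TOY_ARRAY:
      /* Every element contributes the same number of samplers. */
      return type->length * toy_sampler_count(type->element);
   case TOY_STRUCT: {
      unsigned count = 0;
      for (unsigned i = 0; i < type->length; i++)
         count += toy_sampler_count(type->fields[i]);
      return count;
   }
   case TOY_SAMPLER:
      return 1;
   default:
      return 0;
   }
}
```

A struct holding one scalar, an array of three samplers and one plain sampler counts as four; an array of two such structs counts as eight. A check that only looks at the top-level type would report zero for both.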



Re: [Mesa-dev] [PATCH] glsl: fix compat shaders in GLSL 1.40

2018-04-11 Thread Timothy Arceri

On 12/04/18 09:29, Timothy Arceri wrote:

The compatibility and core tokens were not added until GLSL 1.50,
so for GLSL 1.40 just assume all shaders built with a compat profile
are compat shaders.

Fixes rendering issues in Dawn of War II on radeonsi which has
enabled OpenGL 3.1 compat support.


Oh, and I've added this locally:

Fixes: a0c8b49284ef "mesa: enable OpenGL 3.1 with ARB_compatibility"




Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105807
---
  src/compiler/glsl/glsl_parser_extras.cpp | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/src/compiler/glsl/glsl_parser_extras.cpp 
b/src/compiler/glsl/glsl_parser_extras.cpp
index 0cc57f5a887..5dd362b3e38 100644
--- a/src/compiler/glsl/glsl_parser_extras.cpp
+++ b/src/compiler/glsl/glsl_parser_extras.cpp
@@ -429,6 +429,8 @@ _mesa_glsl_parse_state::process_version_directive(YYLTYPE 
*locp, int version,
this->language_version = version;
  
 this->compat_shader = compat_token_present ||

+ (this->ctx->API == API_OPENGL_COMPAT &&
+  this->language_version == 140) ||
   (!this->es_shader && this->language_version < 140);
  
 bool supported = false;





Re: [Mesa-dev] [PATCH 2/2] mesa: include mtypes.h less

2018-04-11 Thread Dylan Baker
This patch breaks compiling radv.

Dylan




[Mesa-dev] [Bug 105807] [Regression, bisected]: 3D Rendering not working correctly in Warhammer 40k: Dawn of War II

2018-04-11 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=105807

--- Comment #16 from Timothy Arceri  ---
OK, so I found the problem. We didn't support compat shaders on GLSL 1.40, only
on higher and lower GLSL versions.

I think the Version == 0 might be a separate issue as per comment 15 and it
would be great if you could create a piglit test for that. I'll need to write a
piglit test for this GLSL 1.40 compat shaders bug too.

Fix:
https://patchwork.freedesktop.org/patch/216621/
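
For reference, the condition in that fix can be read as a small predicate. The sketch below restates it as a standalone C function; the enum and function names are hypothetical and only mirror the three cases in the patch (explicit compatibility token, GLSL 1.40 in a compat context, desktop GLSL older than 1.40).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical API enum for the sketch; not Mesa's gl_api. */
enum toy_api { TOY_API_CORE, TOY_API_COMPAT, TOY_API_ES };

static bool toy_is_compat_shader(enum toy_api api, bool es_shader,
                                 int language_version,
                                 bool compat_token_present)
{
   return compat_token_present ||
          /* GLSL 1.40 has no profile token, so trust the context. */
          (api == TOY_API_COMPAT && language_version == 140) ||
          /* Desktop GLSL before 1.40 is always treated as compat. */
          (!es_shader && language_version < 140);
}
```

With this predicate a "#version 140" shader is compat in a compatibility context and not in a core context, while 1.50 and later still depend on the token.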



[Mesa-dev] [PATCH 10/10] radv: Enable VK_EXT_descriptor_indexing.

2018-04-11 Thread Bas Nieuwenhuizen
This adds everything except non-uniform indexing, which needs a bit
more work and testing.
---
 src/amd/vulkan/radv_device.c  | 39 +++
 src/amd/vulkan/radv_extensions.py |  1 +
 src/amd/vulkan/radv_shader.c  |  2 ++
 3 files changed, 42 insertions(+)

diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index c81b69fef5c..bdbbfc162a2 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -735,6 +735,31 @@ void radv_GetPhysicalDeviceFeatures2(
features->samplerYcbcrConversion = false;
break;
}
+   case VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_DESCRIPTOR_INDEXING_FEATURES_EXT: {
+   VkPhysicalDeviceDescriptorIndexingFeaturesEXT *features =
+   (VkPhysicalDeviceDescriptorIndexingFeaturesEXT*)ext;
+   features->shaderInputAttachmentArrayDynamicIndexing = true;
+   features->shaderUniformTexelBufferArrayDynamicIndexing = true;
+   features->shaderStorageTexelBufferArrayDynamicIndexing = true;
+   features->shaderUniformBufferArrayNonUniformIndexing = false;
+   features->shaderSampledImageArrayNonUniformIndexing = false;
+   features->shaderStorageBufferArrayNonUniformIndexing = false;
+   features->shaderStorageImageArrayNonUniformIndexing = false;
+   features->shaderInputAttachmentArrayNonUniformIndexing = false;
+   features->shaderUniformTexelBufferArrayNonUniformIndexing = false;
+   features->shaderStorageTexelBufferArrayNonUniformIndexing = false;
+   features->descriptorBindingUniformBufferUpdateAfterBind = true;
+   features->descriptorBindingSampledImageUpdateAfterBind = true;
+   features->descriptorBindingStorageImageUpdateAfterBind = true;
+   features->descriptorBindingStorageBufferUpdateAfterBind = true;
+   features->descriptorBindingUniformTexelBufferUpdateAfterBind = true;
+   features->descriptorBindingStorageTexelBufferUpdateAfterBind = true;
+   features->descriptorBindingUpdateUnusedWhilePending = true;
+   features->descriptorBindingPartiallyBound = true;
+   features->descriptorBindingVariableDescriptorCount = true;
+   features->runtimeDescriptorArray = true;
+   break;
+   }
default:
break;
}
@@ -1002,6 +1027,20 @@ void radv_GetPhysicalDeviceProperties2(
properties->vgprAllocationGranularity = 4;
break;
}
+   case VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_DESCRIPTOR_INDEXING_PROPERTIES_EXT: {
+   VkPhysicalDeviceDescriptorIndexingPropertiesEXT *properties =
+   (VkPhysicalDeviceDescriptorIndexingPropertiesEXT*)ext;
+   properties->maxUpdateAfterBindDescriptorsInAllPools = UINT32_MAX;
+   properties->shaderUniformBufferArrayNonUniformIndexingNative = false;
+   properties->shaderSampledImageArrayNonUniformIndexingNative = false;
+   properties->shaderStorageBufferArrayNonUniformIndexingNative = false;
+   properties->shaderStorageImageArrayNonUniformIndexingNative = false;
+   properties->shaderInputAttachmentArrayNonUniformIndexingNative = false;
+   properties->robustBufferAccessUpdateAfterBind = false;
+   properties->quadDivergentImplicitLod = false;
+   /* TODO rest */
+   break;
+   }
default:
break;
}
diff --git a/src/amd/vulkan/radv_extensions.py 
b/src/amd/vulkan/radv_extensions.py
index a680f42dec7..3131a0ad417 100644
--- a/src/amd/vulkan/radv_extensions.py
+++ b/src/amd/vulkan/radv_extensions.py
@@ -87,6 +87,7 @@ EXTENSIONS = [
 Extension('VK_KHR_multiview', 1, True),
 Extension('VK_EXT_debug_report',  9, True),
 Extension('VK_EXT_depth_range_unrestricted',  1, True),
+Extension('VK_EXT_descriptor_indexing',   2, True),
 Extension('VK_EXT_discard_rectangles',1, True),
 Extension('VK_EXT_external_memory_dma_buf',   1, True),
 Extension('VK_EXT_external_memory_host',  1, 'device->rad_info.has_userptr'),
diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c
index eaf24dcdee8..32d5f649e10 100644
--- a/src/amd/vulkan/radv_shader.c
+++ 

Re: [Mesa-dev] NIR function inlining for faster compile times

2018-04-11 Thread Timothy Arceri

On 12/04/18 09:36, Ian Romanick wrote:

Are we still calculating limits (that affect whether or not a shader can
successfully link) after only doing GLSL optimizations?  I'm worried
that making a pretty big change to the optimization path is going to
break some app on (most likely) an older piece of hardware because the
linker will now determine that it exceeds some limit.



All tests pass for all i965 hardware on Jenkins (at least that's what 
the result emails say :P I've been fooled before). I also have no 
shader-db link errors on Ivy Bridge. Ideally we should move to a NIR 
linker so we can avoid these issues and better eliminate unused uniforms 
etc. I believe some form of NIR linker is going to come with 
the SPIR-V support.


Anyway I'll let you guys decide if you want to turn on 
GLSLOptimizeConservatively for i965 but I still want to land the rest of 
the series so gallium drivers such as radeonsi/vc4 can make use of the 
faster NIR passes.




On 04/09/2018 09:34 PM, Timothy Arceri wrote:

This series is part of an effort to reduce the regression in compile
times when switching radeonsi from TGSI -> NIR. But it also turns
out to be quite handy for i965 too.

The idea is to make better use of GLSLOptimizeConservatively.
Currently TGSI must ignore the flag until all functions have been
inlined by the GLSL IR opts. Since NIR can do function inlining we
can drop the post linking opts calls for Gallium drivers that use
NIR and just use the faster NIR opts instead. The patches to do
this will come in a follow-up series since it requires some
refactoring and testing and I wanted to get this out for review.

For i965 this series enables GLSLOptimizeConservatively for a nice
boost in compile times and very little change in shader-db.




[Mesa-dev] [PATCH] radv: Implement VK_EXT_vertex_attribute_divisor.

2018-04-11 Thread Bas Nieuwenhuizen
Pretty straightforward: just pass the divisors through the shader
key and then do an LLVM divide.
---
 src/amd/vulkan/radv_device.c  |  6 ++
 src/amd/vulkan/radv_extensions.py |  1 +
 src/amd/vulkan/radv_nir_to_llvm.c | 26 +++---
 src/amd/vulkan/radv_pipeline.c| 26 ++
 src/amd/vulkan/radv_private.h |  1 +
 src/amd/vulkan/radv_shader.h  |  1 +
 6 files changed, 50 insertions(+), 11 deletions(-)

diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index 41f8242754c..4cd24eb2e96 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -961,6 +961,12 @@ void radv_GetPhysicalDeviceProperties2(
properties->filterMinmaxSingleComponentFormats = true;
break;
}
+   case VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_VERTEX_ATTRIBUTE_DIVISOR_PROPERTIES_EXT: {
+   VkPhysicalDeviceVertexAttributeDivisorPropertiesEXT *properties =
+   (VkPhysicalDeviceVertexAttributeDivisorPropertiesEXT *)ext;
+   properties->maxVertexAttribDivisor = UINT32_MAX;
+   break;
+   }
default:
break;
}
diff --git a/src/amd/vulkan/radv_extensions.py 
b/src/amd/vulkan/radv_extensions.py
index bc63a34896a..48cf3ccd992 100644
--- a/src/amd/vulkan/radv_extensions.py
+++ b/src/amd/vulkan/radv_extensions.py
@@ -93,6 +93,7 @@ EXTENSIONS = [
 Extension('VK_EXT_global_priority',   1, 'device->rad_info.has_ctx_priority'),
 Extension('VK_EXT_sampler_filter_minmax', 1, 'device->rad_info.chip_class >= CIK'),
 Extension('VK_EXT_shader_viewport_index_layer',   1, True),
+Extension('VK_EXT_vertex_attribute_divisor',  1, True),
 Extension('VK_AMD_draw_indirect_count',   1, True),
 Extension('VK_AMD_gcn_shader',1, True),
 Extension('VK_AMD_rasterization_order',   1, 'device->has_out_of_order_rast'),
diff --git a/src/amd/vulkan/radv_nir_to_llvm.c 
b/src/amd/vulkan/radv_nir_to_llvm.c
index 2f0864da462..a6b48e297da 100644
--- a/src/amd/vulkan/radv_nir_to_llvm.c
+++ b/src/amd/vulkan/radv_nir_to_llvm.c
@@ -1794,14 +1794,26 @@ handle_vs_input_decl(struct radv_shader_context *ctx,
 
for (unsigned i = 0; i < attrib_count; ++i, ++idx) {
if (ctx->options->key.vs.instance_rate_inputs & (1u << (index + i))) {
-   buffer_index = LLVMBuildAdd(ctx->ac.builder, ctx->abi.instance_id,
-   ctx->abi.start_instance, "");
-   if (ctx->options->key.vs.as_ls) {
-   ctx->shader_info->vs.vgpr_comp_cnt =
-   MAX2(2, ctx->shader_info->vs.vgpr_comp_cnt);
+   uint32_t divisor = ctx->options->key.vs.instance_rate_divisors[index + i];
+
+   if (divisor) {
+   buffer_index = LLVMBuildAdd(ctx->ac.builder, ctx->abi.instance_id,
+   ctx->abi.start_instance, "");
+
+   if (divisor != 1) {
+   buffer_index = LLVMBuildUDiv(ctx->ac.builder, buffer_index,
+   LLVMConstInt(ctx->ac.i32, divisor, 0), "");
+   }
+
+   if (ctx->options->key.vs.as_ls) {
+   ctx->shader_info->vs.vgpr_comp_cnt =
+   MAX2(2, ctx->shader_info->vs.vgpr_comp_cnt);
+   } else {
+   ctx->shader_info->vs.vgpr_comp_cnt =
+   MAX2(1, ctx->shader_info->vs.vgpr_comp_cnt);
+   }
} else {
-   ctx->shader_info->vs.vgpr_comp_cnt =
-   MAX2(1, ctx->shader_info->vs.vgpr_comp_cnt);
+   buffer_index = ctx->ac.i32_0;
}
} else
buffer_index = LLVMBuildAdd(ctx->ac.builder, ctx->abi.vertex_id,
diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
index 08abb9dbc47..91baac431b8 100644
--- a/src/amd/vulkan/radv_pipeline.c
+++ b/src/amd/vulkan/radv_pipeline.c
@@ -1743,22 +1743,38 @@ radv_generate_graphics_pipeline_key(struct 
radv_pipeline *pipeline,
 {
const VkPipelineVertexInputStateCreateInfo *input_state =
 pCreateInfo->pVertexInputState;
+   const VkPipelineVertexInputDivisorStateCreateInfoEXT *divisor_state =
+   
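
The index arithmetic that the radv_nir_to_llvm.c hunk emits as LLVM IR can be written out on the CPU as follows. This is only a sketch of the logic as it appears in the posted patch (divisor 0 selects element 0, divisor 1 skips the divide); the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Buffer index for an instance-rate attribute, mirroring the patch:
 * (instance_id + start_instance) / divisor, with 0 and 1 special-cased. */
static uint32_t toy_instance_buffer_index(uint32_t instance_id,
                                          uint32_t start_instance,
                                          uint32_t divisor)
{
   if (divisor == 0)
      return 0;                 /* every instance reads the same element */

   uint32_t index = instance_id + start_instance;
   if (divisor != 1)
      index /= divisor;         /* unsigned divide, like LLVMBuildUDiv */
   return index;
}
```

Because the divisor is a compile-time constant taken from the shader key, the generated shader only pays for the divide when the divisor is neither 0 nor 1.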

[Mesa-dev] [PATCH 08/10] spirv: Add support for VK_EXT_descriptor_indexing uniform indexing caps.

2018-04-11 Thread Bas Nieuwenhuizen
---
 src/compiler/shader_info.h| 1 +
 src/compiler/spirv/spirv_to_nir.c | 6 ++
 2 files changed, 7 insertions(+)

diff --git a/src/compiler/shader_info.h b/src/compiler/shader_info.h
index ababe520b2d..c8128fea01b 100644
--- a/src/compiler/shader_info.h
+++ b/src/compiler/shader_info.h
@@ -53,6 +53,7 @@ struct spirv_supported_capabilities {
bool subgroup_vote;
bool gcn_shader;
bool trinary_minmax;
+   bool full_uniform_desciptor_indexing;
 };
 
 typedef struct shader_info {
diff --git a/src/compiler/spirv/spirv_to_nir.c 
b/src/compiler/spirv/spirv_to_nir.c
index 78c1e9ff597..04d26841188 100644
--- a/src/compiler/spirv/spirv_to_nir.c
+++ b/src/compiler/spirv/spirv_to_nir.c
@@ -3382,6 +3382,12 @@ vtn_handle_preamble_instruction(struct vtn_builder *b, 
SpvOp opcode,
  spv_check_supported(shader_viewport_index_layer, cap);
  break;
 
+  case SpvCapabilityInputAttachmentArrayDynamicIndexingEXT:
+  case SpvCapabilityUniformTexelBufferArrayDynamicIndexingEXT:
+  case SpvCapabilityStorageTexelBufferArrayDynamicIndexingEXT:
+ spv_check_supported(full_uniform_desciptor_indexing, cap);
+ break;
+
   default:
  vtn_fail("Unhandled capability");
   }
-- 
2.17.0



[Mesa-dev] [PATCH 05/10] radv: Fix GetDescriptorSetLayoutSupport.

2018-04-11 Thread Bas Nieuwenhuizen
The continue means we do alignment differently than during creation,
making the buffer smaller than expected.
---
 src/amd/vulkan/radv_descriptor_set.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/src/amd/vulkan/radv_descriptor_set.c 
b/src/amd/vulkan/radv_descriptor_set.c
index 1100ca182b1..7a3a611dd68 100644
--- a/src/amd/vulkan/radv_descriptor_set.c
+++ b/src/amd/vulkan/radv_descriptor_set.c
@@ -230,9 +230,6 @@ void radv_GetDescriptorSetLayoutSupport(VkDevice device,
for (uint32_t i = 0; i < pCreateInfo->bindingCount; i++) {
const VkDescriptorSetLayoutBinding *binding = bindings + i;
 
-   if (binding->descriptorCount == 0)
-   continue;
-
uint64_t descriptor_size = 0;
uint64_t descriptor_alignment = 1;
switch (binding->descriptorType) {
-- 
2.17.0
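
To see why skipping empty bindings changes the result, consider a size computation that aligns before every binding. The sketch below uses made-up descriptor sizes and alignments and hypothetical toy_* names; it only illustrates that a zero-count binding can still move the running offset through its alignment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical binding description for the sketch. */
struct toy_binding {
   uint64_t size;       /* bytes per descriptor */
   uint64_t alignment;  /* required alignment of the binding */
   uint32_t count;      /* number of descriptors, may be 0 */
};

static uint64_t toy_align(uint64_t value, uint64_t alignment)
{
   return (value + alignment - 1) / alignment * alignment;
}

/* skip_empty = true models the removed "continue". */
static uint64_t toy_layout_size(const struct toy_binding *bindings, size_t n,
                                bool skip_empty)
{
   uint64_t size = 0;
   for (size_t i = 0; i < n; i++) {
      if (skip_empty && bindings[i].count == 0)
         continue;
      size = toy_align(size, bindings[i].alignment);
      size += bindings[i].count * bindings[i].size;
   }
   return size;
}
```

With bindings of (16 bytes, align 16, count 1), (32 bytes, align 32, count 0) and (16 bytes, align 16, count 1), the loop without the skip yields 48 bytes, while the loop with the skip yields 32, so the two code paths disagree about the required size.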



[Mesa-dev] [PATCH 09/10] spirv: Add support for runtime descriptor array cap.

2018-04-11 Thread Bas Nieuwenhuizen
---
 src/compiler/shader_info.h| 1 +
 src/compiler/spirv/spirv_to_nir.c | 4 
 2 files changed, 5 insertions(+)

diff --git a/src/compiler/shader_info.h b/src/compiler/shader_info.h
index c8128fea01b..4a0a843c796 100644
--- a/src/compiler/shader_info.h
+++ b/src/compiler/shader_info.h
@@ -54,6 +54,7 @@ struct spirv_supported_capabilities {
bool gcn_shader;
bool trinary_minmax;
bool full_uniform_desciptor_indexing;
+   bool runtime_descriptor_array;
 };
 
 typedef struct shader_info {
diff --git a/src/compiler/spirv/spirv_to_nir.c 
b/src/compiler/spirv/spirv_to_nir.c
index 04d26841188..52457554125 100644
--- a/src/compiler/spirv/spirv_to_nir.c
+++ b/src/compiler/spirv/spirv_to_nir.c
@@ -3388,6 +3388,10 @@ vtn_handle_preamble_instruction(struct vtn_builder *b, 
SpvOp opcode,
  spv_check_supported(full_uniform_desciptor_indexing, cap);
  break;
 
+  case SpvCapabilityRuntimeDescriptorArrayEXT:
+ spv_check_supported(runtime_descriptor_array, cap);
+ break;
+
   default:
  vtn_fail("Unhandled capability");
   }
-- 
2.17.0



[Mesa-dev] [PATCH 06/10] radv: Add support for variable descriptor set layouts.

2018-04-11 Thread Bas Nieuwenhuizen
---
 src/amd/vulkan/radv_descriptor_set.c | 30 +++-
 src/amd/vulkan/radv_descriptor_set.h |  1 +
 2 files changed, 30 insertions(+), 1 deletion(-)

diff --git a/src/amd/vulkan/radv_descriptor_set.c 
b/src/amd/vulkan/radv_descriptor_set.c
index 7a3a611dd68..9b35451c497 100644
--- a/src/amd/vulkan/radv_descriptor_set.c
+++ b/src/amd/vulkan/radv_descriptor_set.c
@@ -30,6 +30,7 @@
 #include "util/mesa-sha1.h"
 #include "radv_private.h"
 #include "sid.h"
+#include "vk_util.h"
 
 
 static bool has_equal_immutable_samplers(const VkSampler *samplers, uint32_t 
count)
@@ -76,6 +77,8 @@ VkResult radv_CreateDescriptorSetLayout(
struct radv_descriptor_set_layout *set_layout;
 
assert(pCreateInfo->sType == 
VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO);
+   const VkDescriptorSetLayoutBindingFlagsCreateInfoEXT *variable_flags =
+   vk_find_struct_const(pCreateInfo->pNext, 
DESCRIPTOR_SET_LAYOUT_BINDING_FLAGS_CREATE_INFO_EXT);
 
uint32_t max_binding = 0;
uint32_t immutable_sampler_count = 0;
@@ -164,6 +167,14 @@ VkResult radv_CreateDescriptorSetLayout(
set_layout->binding[b].offset = set_layout->size;
set_layout->binding[b].dynamic_offset_offset = 
dynamic_offset_count;
 
+   if (variable_flags && binding->binding < 
variable_flags->bindingCount &&
+   (variable_flags->pBindingFlags[binding->binding] & 
VK_DESCRIPTOR_BINDING_VARIABLE_DESCRIPTOR_COUNT_BIT_EXT)) {
+   assert(!binding->pImmutableSamplers); /* Terribly ill 
defined  how many samplers are valid */
+   assert(binding->binding == max_binding);
+
+   set_layout->has_variable_descriptors = true;
+   }
+
if (binding->pImmutableSamplers) {
set_layout->binding[b].immutable_samplers_offset = 
samplers_offset;
set_layout->binding[b].immutable_samplers_equal =
@@ -225,6 +236,14 @@ void radv_GetDescriptorSetLayoutSupport(VkDevice device,
return;
}
 
+   const VkDescriptorSetLayoutBindingFlagsCreateInfoEXT *variable_flags =
+   vk_find_struct_const(pCreateInfo->pNext, 
DESCRIPTOR_SET_LAYOUT_BINDING_FLAGS_CREATE_INFO_EXT);
+   VkDescriptorSetVariableDescriptorCountLayoutSupportEXT *variable_count =
+   vk_find_struct((void*)pCreateInfo->pNext, 
DESCRIPTOR_SET_VARIABLE_DESCRIPTOR_COUNT_LAYOUT_SUPPORT_EXT);
+   if (variable_count) {
+   variable_count->maxVariableDescriptorCount = 0;
+   }
+
bool supported = true;
uint64_t size = 0;
for (uint32_t i = 0; i < pCreateInfo->bindingCount; i++) {
@@ -272,9 +291,18 @@ void radv_GetDescriptorSetLayoutSupport(VkDevice device,
supported = false;
}
size = align_u64(size, descriptor_alignment);
-   if (descriptor_size && (UINT64_MAX - size) / descriptor_size < 
binding->descriptorCount) {
+
+   uint64_t max_count = UINT64_MAX;
+   if (descriptor_size)
+   max_count = (UINT64_MAX - size) / descriptor_size;
+
+   if (max_count < binding->descriptorCount) {
supported = false;
}
+   if (variable_flags && binding->binding < variable_flags->bindingCount && variable_count &&
+   (variable_flags->pBindingFlags[binding->binding] & VK_DESCRIPTOR_BINDING_VARIABLE_DESCRIPTOR_COUNT_BIT_EXT)) {
+   variable_count->maxVariableDescriptorCount = MIN2(UINT32_MAX, max_count);
+   }
size += binding->descriptorCount * descriptor_size;
}
 
diff --git a/src/amd/vulkan/radv_descriptor_set.h 
b/src/amd/vulkan/radv_descriptor_set.h
index e6749311e2a..d1cba953f7e 100644
--- a/src/amd/vulkan/radv_descriptor_set.h
+++ b/src/amd/vulkan/radv_descriptor_set.h
@@ -65,6 +65,7 @@ struct radv_descriptor_set_layout {
uint16_t dynamic_offset_count;
 
bool has_immutable_samplers;
+   bool has_variable_descriptors;
 
/* Bindings in this descriptor set */
struct radv_descriptor_set_binding_layout binding[0];
-- 
2.17.0
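
The max_count computation in radv_GetDescriptorSetLayoutSupport above is an overflow-safe way of asking how many descriptors still fit. A standalone sketch of that check follows. The helper names are hypothetical, and unlike the patch this version leaves the running size untouched when the binding does not fit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Adds one binding to the running size without overflowing uint64_t.
 * Reports through max_count_out how many descriptors would have fit. */
static bool toy_add_binding(uint64_t *size, uint64_t descriptor_size,
                            uint64_t descriptor_count,
                            uint64_t *max_count_out)
{
   uint64_t max_count = UINT64_MAX;
   if (descriptor_size)
      max_count = (UINT64_MAX - *size) / descriptor_size;

   if (max_count_out)
      *max_count_out = max_count;

   if (max_count < descriptor_count)
      return false;             /* would overflow: not supported */

   *size += descriptor_count * descriptor_size;
   return true;
}

/* The reported limit is a 32-bit value, so clamp it. */
static uint32_t toy_reported_max(uint64_t max_count)
{
   return max_count > UINT32_MAX ? UINT32_MAX : (uint32_t)max_count;
}
```

Dividing the remaining space by the descriptor size avoids ever forming the product count * size before it is known to fit, which is the point of the original check.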



[Mesa-dev] [PATCH 07/10] radv: Support allocating variable size descriptor sets.

2018-04-11 Thread Bas Nieuwenhuizen
---
 src/amd/vulkan/radv_descriptor_set.c | 21 +
 1 file changed, 17 insertions(+), 4 deletions(-)

diff --git a/src/amd/vulkan/radv_descriptor_set.c 
b/src/amd/vulkan/radv_descriptor_set.c
index 9b35451c497..55b4aaa388c 100644
--- a/src/amd/vulkan/radv_descriptor_set.c
+++ b/src/amd/vulkan/radv_descriptor_set.c
@@ -392,6 +392,7 @@ static VkResult
 radv_descriptor_set_create(struct radv_device *device,
   struct radv_descriptor_pool *pool,
   const struct radv_descriptor_set_layout *layout,
+  const uint32_t *variable_count,
   struct radv_descriptor_set **out_set)
 {
struct radv_descriptor_set *set;
@@ -420,9 +421,9 @@ radv_descriptor_set_create(struct radv_device *device,
}
 
set->layout = layout;
-   if (layout->size) {
-   uint32_t layout_size = align_u32(layout->size, 32);
-   set->size = layout->size;
+   uint32_t layout_size = align_u32(layout->size, 32);
+   if (layout_size) {
+   set->size = layout_size;
 
if (!pool->host_memory_base && pool->entry_count == pool->max_entry_count) {
vk_free2(&device->alloc, NULL, set);
@@ -648,14 +649,26 @@ VkResult radv_AllocateDescriptorSets(
uint32_t i;
struct radv_descriptor_set *set = NULL;
 
+   const VkDescriptorSetVariableDescriptorCountAllocateInfoEXT *variable_counts =
+   vk_find_struct_const(pAllocateInfo->pNext, DESCRIPTOR_SET_VARIABLE_DESCRIPTOR_COUNT_ALLOCATE_INFO_EXT);
+   const uint32_t zero = 0;
+
/* allocate a set of buffers for each shader to contain descriptors */
for (i = 0; i < pAllocateInfo->descriptorSetCount; i++) {
RADV_FROM_HANDLE(radv_descriptor_set_layout, layout,
 pAllocateInfo->pSetLayouts[i]);
 
+   const uint32_t *variable_count = NULL;
+   if (variable_counts) {
+   if (i < variable_counts->descriptorSetCount)
+   variable_count = variable_counts->pDescriptorCounts + i;
+   else
+   variable_count = &zero;
+   }
+
assert(!(layout->flags & VK_DESCRIPTOR_SET_LAYOUT_CREATE_PUSH_DESCRIPTOR_BIT_KHR));
 
-   result = radv_descriptor_set_create(device, pool, layout, &set);
+   result = radv_descriptor_set_create(device, pool, layout, variable_count, &set);
if (result != VK_SUCCESS)
break;
 
-- 
2.17.0
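
The per-set selection in radv_AllocateDescriptorSets above distinguishes three cases: no variable-count struct was chained, the struct has an entry for this set, or the struct is shorter than the number of sets. A sketch of just that selection, with hypothetical names and a plain array in place of the Vulkan struct:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static const uint32_t toy_zero = 0;

/* Returns NULL when no counts were supplied, a pointer to the entry for
 * set i when there is one, and a pointer to a shared zero otherwise. */
static const uint32_t *toy_pick_variable_count(const uint32_t *counts,
                                               uint32_t counts_len,
                                               uint32_t i)
{
   if (!counts)
      return NULL;
   if (i < counts_len)
      return counts + i;
   return &toy_zero;
}
```

Returning a pointer instead of a value lets the callee tell "no variable count given" (NULL) apart from "variable count of zero".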



[Mesa-dev] [PATCH 01/10] spirv: Update spirv.h to 12f8de9f04327336b699b1b80aa390ae7f9ddbf4

2018-04-11 Thread Bas Nieuwenhuizen
---
 src/compiler/spirv/spirv.core.grammar.json | 169 -
 src/compiler/spirv/spirv.h |  18 +++
 2 files changed, 183 insertions(+), 4 deletions(-)

diff --git a/src/compiler/spirv/spirv.core.grammar.json 
b/src/compiler/spirv/spirv.core.grammar.json
index f3994a60358..a03c024335c 100644
--- a/src/compiler/spirv/spirv.core.grammar.json
+++ b/src/compiler/spirv/spirv.core.grammar.json
@@ -3144,6 +3144,7 @@
 { "kind" : "IdRef", "name" : "'Target'" },
 { "kind" : "Decoration" }
   ],
+  "extensions" : [ "SPV_GOOGLE_hlsl_functionality1" ],
   "version" : "1.2"
 },
 {
@@ -3602,7 +3603,9 @@
 { "kind" : "IdResult" },
 { "kind" : "IdRef", "name" : "'Predicate'" }
   ],
-  "capabilities" : [ "SubgroupBallotKHR" ]
+  "capabilities" : [ "SubgroupBallotKHR" ],
+  "extensions" : [ "SPV_KHR_shader_ballot" ],
+  "version" : "None"
 },
 {
   "opname" : "OpSubgroupFirstInvocationKHR",
@@ -3612,7 +3615,9 @@
 { "kind" : "IdResult" },
 { "kind" : "IdRef", "name" : "'Value'" }
   ],
-  "capabilities" : [ "SubgroupBallotKHR" ]
+  "capabilities" : [ "SubgroupBallotKHR" ],
+  "extensions" : [ "SPV_KHR_shader_ballot" ],
+  "version" : "None"
 },
 {
   "opname" : "OpSubgroupAllKHR",
@@ -3666,6 +3671,7 @@
 { "kind" : "IdRef", "name" : "'Index'" }
   ],
   "capabilities" : [ "SubgroupBallotKHR" ],
+  "extensions" : [ "SPV_KHR_shader_ballot" ],
   "version" : "None"
 },
 {
@@ -3679,6 +3685,7 @@
 { "kind" : "IdRef",  "name" : "'X'" }
   ],
   "capabilities" : [ "Groups" ],
+  "extensions" : [ "SPV_AMD_shader_ballot" ],
   "version" : "None"
 },
 {
@@ -3692,6 +3699,7 @@
 { "kind" : "IdRef",  "name" : "'X'" }
   ],
   "capabilities" : [ "Groups" ],
+  "extensions" : [ "SPV_AMD_shader_ballot" ],
   "version" : "None"
 },
 {
@@ -3705,6 +3713,7 @@
 { "kind" : "IdRef",  "name" : "'X'" }
   ],
   "capabilities" : [ "Groups" ],
+  "extensions" : [ "SPV_AMD_shader_ballot" ],
   "version" : "None"
 },
 {
@@ -3718,6 +3727,7 @@
 { "kind" : "IdRef",  "name" : "'X'" }
   ],
   "capabilities" : [ "Groups" ],
+  "extensions" : [ "SPV_AMD_shader_ballot" ],
   "version" : "None"
 },
 {
@@ -3731,6 +3741,7 @@
 { "kind" : "IdRef",  "name" : "'X'" }
   ],
   "capabilities" : [ "Groups" ],
+  "extensions" : [ "SPV_AMD_shader_ballot" ],
   "version" : "None"
 },
 {
@@ -3744,6 +3755,7 @@
 { "kind" : "IdRef",  "name" : "'X'" }
   ],
   "capabilities" : [ "Groups" ],
+  "extensions" : [ "SPV_AMD_shader_ballot" ],
   "version" : "None"
 },
 {
@@ -3757,6 +3769,7 @@
 { "kind" : "IdRef",  "name" : "'X'" }
   ],
   "capabilities" : [ "Groups" ],
+  "extensions" : [ "SPV_AMD_shader_ballot" ],
   "version" : "None"
 },
 {
@@ -3770,6 +3783,7 @@
 { "kind" : "IdRef",  "name" : "'X'" }
   ],
   "capabilities" : [ "Groups" ],
+  "extensions" : [ "SPV_AMD_shader_ballot" ],
   "version" : "None"
 },
 {
@@ -3782,6 +3796,7 @@
 { "kind" : "IdRef", "name" : "'Coordinate'" }
   ],
   "capabilities" : [ "FragmentMaskAMD" ],
+  "extensions" : [ "SPV_AMD_shader_fragment_mask" ],
   "version" : "None"
 },
 {
@@ -3795,6 +3810,7 @@
 { "kind" : "IdRef", "name" : "'Fragment Index'" }
   ],
   "capabilities" : [ "FragmentMaskAMD" ],
+  "extensions" : [ "SPV_AMD_shader_fragment_mask" ],
   "version" : "None"
 },
 {
@@ -3911,6 +3927,18 @@
   ],
   "extensions" : [ "SPV_GOOGLE_decorate_string" ],
   "version" : "None"
+},
+{
+  "opname" : "OpGroupNonUniformPartitionNV",
+  "opcode" : 5296,
+  "operands" : [
+{ "kind" : "IdResultType" },
+{ "kind" : "IdResult" },
+{ "kind" : "IdRef", "name" : "'Value'" }
+  ],
+  "capabilities" : [ "GroupNonUniformPartitionedNV" ],
+  "extensions" : [ "SPV_NV_shader_subgroup_partitioned" ],
+  "version" : "None"
 }
   ],
   "operand_kinds" : [
@@ -4541,12 +4569,14 @@
   "enumerant" : "PostDepthCoverage",
   "value" : 4446,
   "capabilities" : [ "SampleMaskPostDepthCoverage" ],
+  "extensions" : [ "SPV_KHR_post_depth_coverage" ],
   "version" : "None"
 },
 {
   "enumerant" : "StencilRefReplacingEXT",
   "value" : 5027,
   "capabilities" : [ "StencilExportEXT" ],
+  "extensions" : [ "SPV_EXT_shader_stencil_export" ],
   "version" : "None"
 }
   ]
@@ -5513,6 +5543,7 @@
 {
   "enumerant" : "ExplicitInterpAMD",
   "value" : 4999,
+  "extensions" : [ 

[Mesa-dev] [PATCH 03/10] radv: Don't store buffer references in the descriptor set.

2018-04-11 Thread Bas Nieuwenhuizen
---
 src/amd/vulkan/radv_cmd_buffer.c |  4 --
 src/amd/vulkan/radv_debug.c  |  3 -
 src/amd/vulkan/radv_descriptor_set.c | 82 +---
 src/amd/vulkan/radv_descriptor_set.h |  4 --
 src/amd/vulkan/radv_private.h|  2 -
 5 files changed, 13 insertions(+), 82 deletions(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index f73526b5fc8..93493c59eb4 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -2206,10 +2206,6 @@ radv_bind_descriptor_set(struct radv_cmd_buffer 
*cmd_buffer,
 
assert(!(set->layout->flags & 
VK_DESCRIPTOR_SET_LAYOUT_CREATE_PUSH_DESCRIPTOR_BIT_KHR));
 
-   for (unsigned j = 0; j < set->layout->buffer_count; ++j)
-   if (set->descriptors[j])
-   radv_cs_add_buffer(ws, cmd_buffer->cs, 
set->descriptors[j], 7);
-
if(set->bo)
radv_cs_add_buffer(ws, cmd_buffer->cs, set->bo, 8);
 }
diff --git a/src/amd/vulkan/radv_debug.c b/src/amd/vulkan/radv_debug.c
index a0d01b24897..17782ab744b 100644
--- a/src/amd/vulkan/radv_debug.c
+++ b/src/amd/vulkan/radv_debug.c
@@ -249,7 +249,6 @@ radv_dump_descriptor_set(enum chip_class chip_class,
fprintf(f, "\tshader_stages: %x\n", layout->shader_stages);
fprintf(f, "\tdynamic_shader_stages: %x\n",
layout->dynamic_shader_stages);
-   fprintf(f, "\tbuffer_count: %d\n", layout->buffer_count);
fprintf(f, "\tdynamic_offset_count: %d\n",
layout->dynamic_offset_count);
fprintf(f, "\n");
@@ -265,8 +264,6 @@ radv_dump_descriptor_set(enum chip_class chip_class,
layout->binding[i].array_size);
fprintf(f, "\t\toffset: %d\n",
layout->binding[i].offset);
-   fprintf(f, "\t\tbuffer_offset: %d\n",
-   layout->binding[i].buffer_offset);
fprintf(f, "\t\tdynamic_offset_offset: %d\n",
layout->binding[i].dynamic_offset_offset);
fprintf(f, "\t\tdynamic_offset_count: %d\n",
diff --git a/src/amd/vulkan/radv_descriptor_set.c 
b/src/amd/vulkan/radv_descriptor_set.c
index 3d56f8c2176..0915f5c3552 100644
--- a/src/amd/vulkan/radv_descriptor_set.c
+++ b/src/amd/vulkan/radv_descriptor_set.c
@@ -86,14 +86,12 @@ VkResult radv_CreateDescriptorSetLayout(
 
memset(set_layout->binding, 0, size - sizeof(struct 
radv_descriptor_set_layout));
 
-   uint32_t buffer_count = 0;
uint32_t dynamic_offset_count = 0;
 
for (uint32_t j = 0; j < pCreateInfo->bindingCount; j++) {
const VkDescriptorSetLayoutBinding *binding = 
&pCreateInfo->pBindings[j];
uint32_t b = binding->binding;
uint32_t alignment;
-   unsigned binding_buffer_count = 0;
 
switch (binding->descriptorType) {
case VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER_DYNAMIC:
@@ -102,7 +100,6 @@ VkResult radv_CreateDescriptorSetLayout(
set_layout->binding[b].dynamic_offset_count = 1;
set_layout->dynamic_shader_stages |= 
binding->stageFlags;
set_layout->binding[b].size = 0;
-   binding_buffer_count = 1;
alignment = 1;
break;
case VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER:
@@ -110,7 +107,6 @@ VkResult radv_CreateDescriptorSetLayout(
case VK_DESCRIPTOR_TYPE_UNIFORM_TEXEL_BUFFER:
case VK_DESCRIPTOR_TYPE_STORAGE_TEXEL_BUFFER:
set_layout->binding[b].size = 16;
-   binding_buffer_count = 1;
alignment = 16;
break;
case VK_DESCRIPTOR_TYPE_STORAGE_IMAGE:
@@ -118,13 +114,11 @@ VkResult radv_CreateDescriptorSetLayout(
case VK_DESCRIPTOR_TYPE_INPUT_ATTACHMENT:
/* main descriptor + fmask descriptor */
set_layout->binding[b].size = 64;
-   binding_buffer_count = 1;
alignment = 32;
break;
case VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER:
/* main descriptor + fmask descriptor + sampler */
set_layout->binding[b].size = 96;
-   binding_buffer_count = 1;
alignment = 32;
break;
case VK_DESCRIPTOR_TYPE_SAMPLER:
@@ -140,7 +134,6 @@ VkResult radv_CreateDescriptorSetLayout(
set_layout->binding[b].type = binding->descriptorType;
set_layout->binding[b].array_size = binding->descriptorCount;
set_layout->binding[b].offset = set_layout->size;
-   set_layout->binding[b].buffer_offset = buffer_count;
set_layout->binding[b].dynamic_offset_offset = 

[Mesa-dev] [PATCH 04/10] radv: Use sorted bindings for set layout creation.

2018-04-11 Thread Bas Nieuwenhuizen
Previously we did not care about having the set storage in order,
but for variable descriptor count we want the highest binding
at the end of the storage.
---
 src/amd/vulkan/radv_descriptor_set.c | 43 ++--
 1 file changed, 41 insertions(+), 2 deletions(-)

diff --git a/src/amd/vulkan/radv_descriptor_set.c 
b/src/amd/vulkan/radv_descriptor_set.c
index 0915f5c3552..1100ca182b1 100644
--- a/src/amd/vulkan/radv_descriptor_set.c
+++ b/src/amd/vulkan/radv_descriptor_set.c
@@ -45,6 +45,27 @@ static bool has_equal_immutable_samplers(const VkSampler 
*samplers, uint32_t cou
return true;
 }
 
+static int binding_compare(const void* av, const void *bv)
+{
+   const VkDescriptorSetLayoutBinding *a = (const 
VkDescriptorSetLayoutBinding*)av;
+   const VkDescriptorSetLayoutBinding *b = (const 
VkDescriptorSetLayoutBinding*)bv;
+
+   return (a->binding < b->binding) ? -1 : (a->binding > b->binding) ? 1 : 
0;
+}
+
+static VkDescriptorSetLayoutBinding *
+create_sorted_bindings(const VkDescriptorSetLayoutBinding *bindings, unsigned 
count) {
+   VkDescriptorSetLayoutBinding *sorted_bindings = malloc(count * 
sizeof(VkDescriptorSetLayoutBinding));
+   if (!sorted_bindings)
+   return NULL;
+
+   memcpy(sorted_bindings, bindings, count * 
sizeof(VkDescriptorSetLayoutBinding));
+
+   qsort(sorted_bindings, count, sizeof(VkDescriptorSetLayoutBinding), 
binding_compare);
+
+   return sorted_bindings;
+}
+
 VkResult radv_CreateDescriptorSetLayout(
VkDevice_device,
const VkDescriptorSetLayoutCreateInfo*  pCreateInfo,
@@ -78,6 +99,13 @@ VkResult radv_CreateDescriptorSetLayout(
/* We just allocate all the samplers at the end of the struct */
uint32_t *samplers = (uint32_t*)&set_layout->binding[max_binding + 1];
 
+   VkDescriptorSetLayoutBinding *bindings = 
create_sorted_bindings(pCreateInfo->pBindings,
+   
pCreateInfo->bindingCount);
+   if (!bindings) {
+   vk_free2(&device->alloc, pAllocator, set_layout);
+   return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
+   }
+
set_layout->binding_count = max_binding + 1;
set_layout->shader_stages = 0;
set_layout->dynamic_shader_stages = 0;
@@ -89,7 +117,7 @@ VkResult radv_CreateDescriptorSetLayout(
uint32_t dynamic_offset_count = 0;
 
for (uint32_t j = 0; j < pCreateInfo->bindingCount; j++) {
-   const VkDescriptorSetLayoutBinding *binding = 
&pCreateInfo->pBindings[j];
+   const VkDescriptorSetLayoutBinding *binding = bindings + j;
uint32_t b = binding->binding;
uint32_t alignment;
 
@@ -163,6 +191,8 @@ VkResult radv_CreateDescriptorSetLayout(
set_layout->shader_stages |= binding->stageFlags;
}
 
+   free(bindings);
+
set_layout->dynamic_offset_count = dynamic_offset_count;
 
*pSetLayout = radv_descriptor_set_layout_to_handle(set_layout);
@@ -188,10 +218,17 @@ void radv_GetDescriptorSetLayoutSupport(VkDevice device,
 const VkDescriptorSetLayoutCreateInfo* 
pCreateInfo,
 VkDescriptorSetLayoutSupport* pSupport)
 {
+   VkDescriptorSetLayoutBinding *bindings = 
create_sorted_bindings(pCreateInfo->pBindings,
+   
pCreateInfo->bindingCount);
+   if (!bindings) {
+   pSupport->supported = false;
+   return;
+   }
+
bool supported = true;
uint64_t size = 0;
for (uint32_t i = 0; i < pCreateInfo->bindingCount; i++) {
-   const VkDescriptorSetLayoutBinding *binding = 
&pCreateInfo->pBindings[i];
+   const VkDescriptorSetLayoutBinding *binding = bindings + i;
 
if (binding->descriptorCount == 0)
continue;
@@ -244,6 +281,8 @@ void radv_GetDescriptorSetLayoutSupport(VkDevice device,
size += binding->descriptorCount * descriptor_size;
}
 
+   free(bindings);
+
pSupport->supported = supported;
 }
 
-- 
2.17.0
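
Note on the patch above: create_sorted_bindings() copies the caller's array and sorts the copy by binding index, so the highest binding ends up last in the set storage. Below is a minimal standalone sketch of the same copy-then-qsort pattern. The struct is a hypothetical stand-in for VkDescriptorSetLayoutBinding, and the guard for a zero count is an addition of this sketch, not something the patch does (malloc(0) is allowed to return NULL, which would otherwise look like an allocation failure).

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for VkDescriptorSetLayoutBinding; only the
 * binding index matters for the ordering. */
struct example_binding {
	uint32_t binding;
	uint32_t descriptor_count;
};

static int example_binding_compare(const void *av, const void *bv)
{
	const struct example_binding *a = av;
	const struct example_binding *b = bv;

	/* Compare rather than subtract: the fields are unsigned, so a
	 * subtraction could wrap around. */
	return (a->binding < b->binding) ? -1 : (a->binding > b->binding) ? 1 : 0;
}

/* Returns a malloc'ed, sorted copy of in[0..count), or NULL on allocation
 * failure. The input is left untouched and the caller frees the result. */
static struct example_binding *
example_sorted_copy(const struct example_binding *in, unsigned count)
{
	/* Allocate at least one element so count == 0 is not mistaken
	 * for an out-of-memory condition. */
	struct example_binding *out = malloc((count ? count : 1) * sizeof(*out));
	if (!out)
		return NULL;

	if (count)
		memcpy(out, in, count * sizeof(*out));
	qsort(out, count, sizeof(*out), example_binding_compare);
	return out;
}
```

Sorting a copy keeps the application's array const, which matters because the API hands it over as a pointer to const data.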



[Mesa-dev] [PATCH 02/10] radv: Keep a global BO list for VkMemory.

2018-04-11 Thread Bas Nieuwenhuizen
With update-after-bind we can no longer attach BOs to the command
buffer from the descriptor set, so we need a global BO
list.

I am somewhat surprised this works really well even though we have
implicit synchronization in the WSI based on the bo list associations
and with the new behavior every command buffer is associated with
every swapchain image. But I could not find slowdowns in games because
of it.
---
 src/amd/vulkan/radv_device.c  | 125 +-
 src/amd/vulkan/radv_private.h |   8 ++
 src/amd/vulkan/radv_radeon_winsys.h   |   6 +
 src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c |  46 ++-
 4 files changed, 146 insertions(+), 39 deletions(-)

diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index 22e8f1e7a78..c81b69fef5c 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -1208,6 +1208,55 @@ radv_queue_finish(struct radv_queue *queue)
queue->device->ws->buffer_destroy(queue->compute_scratch_bo);
 }
 
+static void
+radv_bo_list_init(struct radv_bo_list *bo_list)
+{
+   pthread_mutex_init(&bo_list->mutex, NULL);
+   bo_list->list.count = bo_list->capacity = 0;
+   bo_list->list.bos = NULL;
+}
+
+static void
+radv_bo_list_finish(struct radv_bo_list *bo_list)
+{
+   free(bo_list->list.bos);
+   pthread_mutex_destroy(&bo_list->mutex);
+}
+
+static VkResult radv_bo_list_add(struct radv_bo_list *bo_list, struct 
radeon_winsys_bo *bo)
+{
+   pthread_mutex_lock(&bo_list->mutex);
+   if (bo_list->list.count == bo_list->capacity) {
+   unsigned capacity = MAX2(4, bo_list->capacity * 2);
+   void *data = realloc(bo_list->list.bos, capacity * 
sizeof(struct radeon_winsys_bo*));
+
+   if (!data) {
+   pthread_mutex_unlock(&bo_list->mutex);
+   return VK_ERROR_OUT_OF_HOST_MEMORY;
+   }
+
+   bo_list->list.bos = (struct radeon_winsys_bo**)data;
+   bo_list->capacity = capacity;
+   }
+
+   bo_list->list.bos[bo_list->list.count++] = bo;
+   pthread_mutex_unlock(&bo_list->mutex);
+   return VK_SUCCESS;
+}
+
+static void radv_bo_list_remove(struct radv_bo_list *bo_list, struct 
radeon_winsys_bo *bo)
+{
+   pthread_mutex_lock(&bo_list->mutex);
+   for(unsigned i = 0; i < bo_list->list.count; ++i) {
+   if (bo_list->list.bos[i] == bo) {
+   bo_list->list.bos[i] = 
bo_list->list.bos[bo_list->list.count - 1];
+   --bo_list->list.count;
+   break;
+   }
+   }
+   pthread_mutex_unlock(&bo_list->mutex);
+}
+
 static void
 radv_device_init_gs_info(struct radv_device *device)
 {
@@ -1308,6 +1357,8 @@ VkResult radv_CreateDevice(
mtx_init(&device->shader_slab_mutex, mtx_plain);
list_inithead(&device->shader_slabs);
 
+   radv_bo_list_init(&device->bo_list);
+
for (unsigned i = 0; i < pCreateInfo->queueCreateInfoCount; i++) {
const VkDeviceQueueCreateInfo *queue_create = 
&pCreateInfo->pQueueCreateInfos[i];
uint32_t qfi = queue_create->queueFamilyIndex;
@@ -1440,6 +1491,8 @@ VkResult radv_CreateDevice(
 fail_meta:
radv_device_finish_meta(device);
 fail:
+   radv_bo_list_finish(&device->bo_list);
+
if (device->trace_bo)
device->ws->buffer_destroy(device->trace_bo);
 
@@ -1487,6 +1540,7 @@ void radv_DestroyDevice(
 
radv_destroy_shader_slabs(device);
 
+   radv_bo_list_finish(&device->bo_list);
vk_free(&device->alloc, device);
 }
 
@@ -2257,7 +2311,7 @@ static VkResult radv_signal_fence(struct radv_queue 
*queue,
 
ret = queue->device->ws->cs_submit(queue->hw_ctx, queue->queue_idx,
   
&queue->device->empty_cs[queue->queue_family_index],
-  1, NULL, NULL, &sem_info,
+  1, NULL, NULL, &sem_info, NULL,
   false, fence->fence);
radv_free_sem_info(&sem_info);
 
@@ -2334,7 +2388,7 @@ VkResult radv_QueueSubmit(
ret = queue->device->ws->cs_submit(ctx, 
queue->queue_idx,
   
&queue->device->empty_cs[queue->queue_family_index],
   1, NULL, 
NULL,
-  &sem_info,
+  &sem_info, 
NULL,
   false, 
base_fence);
if (ret) {
radv_loge("failed to submit CS %d\n", 
i);
@@ -2372,11 +2426,15 @@ VkResult radv_QueueSubmit(
sem_info.cs_emit_wait = j == 0;
sem_info.cs_emit_signal = j + advance == 
pSubmits[i].commandBufferCount;
 
+
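
Note on the radv_device.c hunk above: the global BO list is a mutex-protected growable array. It doubles its capacity when full and removes entries by overwriting them with the last element, so removal does not preserve order. Below is a minimal standalone sketch of that structure. The names are hypothetical, entries are plain void pointers instead of radeon_winsys_bo, and the return type is a plain int rather than VkResult.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct radv_bo_list. */
struct example_bo_list {
	pthread_mutex_t mutex;
	void **items;
	unsigned count;
	unsigned capacity;
};

static void example_bo_list_init(struct example_bo_list *list)
{
	pthread_mutex_init(&list->mutex, NULL);
	list->items = NULL;
	list->count = list->capacity = 0;
}

static void example_bo_list_finish(struct example_bo_list *list)
{
	free(list->items);
	pthread_mutex_destroy(&list->mutex);
}

/* Returns 0 on success, -1 if the array could not grow. */
static int example_bo_list_add(struct example_bo_list *list, void *item)
{
	pthread_mutex_lock(&list->mutex);
	if (list->count == list->capacity) {
		/* Start at 4 entries, then double. */
		unsigned capacity = list->capacity ? list->capacity * 2 : 4;
		void **data = realloc(list->items, capacity * sizeof(*data));

		if (!data) {
			/* The old array is still valid; just report failure. */
			pthread_mutex_unlock(&list->mutex);
			return -1;
		}
		list->items = data;
		list->capacity = capacity;
	}
	list->items[list->count++] = item;
	pthread_mutex_unlock(&list->mutex);
	return 0;
}

/* Removes the first matching entry by moving the last entry into its
 * slot. Does nothing if the item is not in the list. */
static void example_bo_list_remove(struct example_bo_list *list, void *item)
{
	pthread_mutex_lock(&list->mutex);
	for (unsigned i = 0; i < list->count; ++i) {
		if (list->items[i] == item) {
			list->items[i] = list->items[list->count - 1];
			--list->count;
			break;
		}
	}
	pthread_mutex_unlock(&list->mutex);
}
```

Swap-with-last removal is O(1) after the linear search, which is acceptable here because a BO list has no meaningful ordering.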

[Mesa-dev] [PATCH 00/10] radv VK_EXT_descriptor_indexing, part 1

2018-04-11 Thread Bas Nieuwenhuizen
This adds support for VK_EXT_descriptor_indexing except for the
non-uniform indexing, which should be sent shortly.

Please review!

Bas Nieuwenhuizen (10):
  spirv: Update spirv.h to 12f8de9f04327336b699b1b80aa390ae7f9ddbf4
  radv: Keep a global BO list for VkMemory.
  radv: Don't store buffer references in the descriptor set.
  radv: Use sorted bindings for set layout creation.
  radv: Fix GetDescriptorSetLayoutSupport.
  radv: Add support for variable descriptor set layouts.
  radv: Support allocating variable size descriptor sets.
  spirv: Add support for VK_EXT_descriptor_indexing uniform indexing
caps.
  spirv: Add support for runtime descriptor array cap.
  radv: Enable VK_EXT_descriptor_indexing.

 src/amd/vulkan/radv_cmd_buffer.c  |   4 -
 src/amd/vulkan/radv_debug.c   |   3 -
 src/amd/vulkan/radv_descriptor_set.c  | 179 ++
 src/amd/vulkan/radv_descriptor_set.h  |   5 +-
 src/amd/vulkan/radv_device.c  | 164 
 src/amd/vulkan/radv_extensions.py |   1 +
 src/amd/vulkan/radv_private.h |  10 +-
 src/amd/vulkan/radv_radeon_winsys.h   |   6 +
 src/amd/vulkan/radv_shader.c  |   2 +
 src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c |  46 -
 src/compiler/shader_info.h|   2 +
 src/compiler/spirv/spirv.core.grammar.json| 169 -
 src/compiler/spirv/spirv.h|  18 ++
 src/compiler/spirv/spirv_to_nir.c |  10 +
 14 files changed, 484 insertions(+), 135 deletions(-)

-- 
2.17.0



[Mesa-dev] [PATCH] miptree-map

2018-04-11 Thread Chris Wilson
Splitting intel_miptree_map() like so should help with the yuck factor.
Though don't we also need to give the stencil_mt similar treatment
to avoid slow reads?

Note the map should really record which method intel_miptree_map() used
so that it can be unwound correctly without chasing the same decision
tree (too easy for mistakes to occur).

---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 150 +++---
 1 file changed, 95 insertions(+), 55 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 89074a64930..1314547cc6c 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -49,6 +49,17 @@
 
 #define FILE_DEBUG_FLAG DEBUG_MIPTREE
 
+static void __intel_miptree_map(struct brw_context *brw,
+struct intel_mipmap_tree *mt,
+unsigned int level,
+unsigned int slice,
+struct intel_miptree_map *map);
+static void __intel_miptree_unmap(struct brw_context *brw,
+  struct intel_mipmap_tree *mt,
+  unsigned int level,
+  unsigned int slice,
+  struct intel_miptree_map *map);
+
 static void *intel_miptree_map_raw(struct brw_context *brw,
struct intel_mipmap_tree *mt,
GLbitfield mode);
@@ -3441,27 +3452,31 @@ intel_miptree_map_depthstencil(struct brw_context *brw,
if (!(map->mode & GL_MAP_INVALIDATE_RANGE_BIT)) {
   uint32_t *packed_map = map->ptr;
   uint8_t *s_map = intel_miptree_map_raw(brw, s_mt, GL_MAP_READ_BIT);
-  uint32_t *z_map = intel_miptree_map_raw(brw, z_mt, GL_MAP_READ_BIT);
   unsigned int s_image_x, s_image_y;
-  unsigned int z_image_x, z_image_y;
+
+  struct intel_miptree_map z_map = {
+ .mode = GL_MAP_READ_BIT | BRW_MAP_DIRECT_BIT,
+ .x = map->x,
+ .y = map->y,
+ .w = map->w,
+ .h = map->h,
+  };
+  __intel_miptree_map(brw, z_mt, level, slice, &z_map);
 
   intel_miptree_get_image_offset(s_mt, level, slice,
 &s_image_x, &s_image_y);
-  intel_miptree_get_image_offset(z_mt, level, slice,
-&z_image_x, &z_image_y);
 
   for (uint32_t y = 0; y < map->h; y++) {
+ uint32_t *z_line =
+(uint32_t *)((uint8_t *)z_map.ptr + z_map.stride * y);
 for (uint32_t x = 0; x < map->w; x++) {
int map_x = map->x + x, map_y = map->y + y;
ptrdiff_t s_offset = intel_offset_S8(s_mt->surf.row_pitch,
 map_x + s_image_x,
 map_y + s_image_y,
 brw->has_swizzling);
-   ptrdiff_t z_offset = ((map_y + z_image_y) *
-  (z_mt->surf.row_pitch / 4) +
- (map_x + z_image_x));
uint8_t s = s_map[s_offset];
-   uint32_t z = z_map[z_offset];
+   uint32_t z = z_line[x];
 
if (map_z32f_x24s8) {
   packed_map[(y * map->w + x) * 2 + 0] = z;
@@ -3472,13 +3487,13 @@ intel_miptree_map_depthstencil(struct brw_context *brw,
 }
   }
 
+  __intel_miptree_unmap(brw, z_mt, level, slice, &z_map);
   intel_miptree_unmap_raw(s_mt);
-  intel_miptree_unmap_raw(z_mt);
 
   DBG("%s: %d,%d %dx%d from z mt %p %d,%d, s mt %p %d,%d = %p/%d\n",
  __func__,
  map->x, map->y, map->w, map->h,
- z_mt, map->x + z_image_x, map->y + z_image_y,
+ z_mt, map->x, map->y,
  s_mt, map->x + s_image_x, map->y + s_image_y,
  map->ptr, map->stride);
} else {
@@ -3502,44 +3517,47 @@ intel_miptree_unmap_depthstencil(struct brw_context 
*brw,
if (map->mode & GL_MAP_WRITE_BIT) {
   uint32_t *packed_map = map->ptr;
   uint8_t *s_map = intel_miptree_map_raw(brw, s_mt, GL_MAP_WRITE_BIT);
-  uint32_t *z_map = intel_miptree_map_raw(brw, z_mt, GL_MAP_WRITE_BIT);
   unsigned int s_image_x, s_image_y;
-  unsigned int z_image_x, z_image_y;
 
   intel_miptree_get_image_offset(s_mt, level, slice,
 &s_image_x, &s_image_y);
-  intel_miptree_get_image_offset(z_mt, level, slice,
-&z_image_x, &z_image_y);
+
+  struct intel_miptree_map z_map = {
+ .mode = GL_MAP_WRITE_BIT | BRW_MAP_DIRECT_BIT | 
GL_MAP_INVALIDATE_RANGE_BIT,
+ .x = map->x,
+ .y = map->y,
+ .w = map->w,
+ .h = map->h,
+  };
+  __intel_miptree_map(brw, z_mt, level, slice, &z_map);
 
   for (uint32_t y = 0; y < map->h; y++) {
+ uint32_t *z_line =
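
Note on the z_line computation in the hunks above: the row pointer is formed by stepping through the mapping in bytes (the stride is a byte count) and only then reinterpreting the result as 32-bit texels. Below is a minimal standalone sketch of that addressing. The helper name is hypothetical and the buffer is a plain array rather than a mapped miptree.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns a pointer to the first 32-bit texel of row y in a mapping whose
 * rows are stride_bytes apart. The cast to uint8_t * makes the stride
 * arithmetic count bytes rather than uint32_t elements. */
static uint32_t *
example_row(void *base, size_t stride_bytes, unsigned y)
{
	return (uint32_t *)((uint8_t *)base + stride_bytes * y);
}
```

With rows padded to 16 bytes and 3 texels per row, texel (x = 1, y = 2) is example_row(buf, 16, 2)[1], which skips the padding at the end of each row.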

[Mesa-dev] [PATCH] glsl: fix compat shaders in GLSL 1.40

2018-04-11 Thread Timothy Arceri
The compatibility and core tokens were not added until GLSL 1.50,
so for GLSL 1.40 just assume all shaders built with a compat profile
are compat shaders.

Fixes rendering issues in Dawn of War II on radeonsi which has
enabled OpenGL 3.1 compat support.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105807
---
 src/compiler/glsl/glsl_parser_extras.cpp | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/compiler/glsl/glsl_parser_extras.cpp 
b/src/compiler/glsl/glsl_parser_extras.cpp
index 0cc57f5a887..5dd362b3e38 100644
--- a/src/compiler/glsl/glsl_parser_extras.cpp
+++ b/src/compiler/glsl/glsl_parser_extras.cpp
@@ -429,6 +429,8 @@ _mesa_glsl_parse_state::process_version_directive(YYLTYPE 
*locp, int version,
   this->language_version = version;
 
this->compat_shader = compat_token_present ||
+ (this->ctx->API == API_OPENGL_COMPAT &&
+  this->language_version == 140) ||
  (!this->es_shader && this->language_version < 140);
 
bool supported = false;
-- 
2.17.0
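
Note on the patch above: after the change, a shader counts as a compat shader in three cases: the compatibility token is present, the context is a compat context and the shader declares version 140, or the shader is a non-ES shader below version 140. Below is a minimal standalone sketch of that predicate. The enum and function names are hypothetical stand-ins for the parser state, not Mesa API.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the context API enum. */
enum example_api {
	EXAMPLE_API_CORE,
	EXAMPLE_API_COMPAT,
	EXAMPLE_API_ES
};

static bool
example_is_compat_shader(bool compat_token_present, enum example_api api,
                         bool es_shader, int language_version)
{
	return compat_token_present ||
	       /* GLSL 1.40 has no profile tokens, so fall back to the
	        * profile of the context. */
	       (api == EXAMPLE_API_COMPAT && language_version == 140) ||
	       (!es_shader && language_version < 140);
}
```

A version 150 shader without the token stays a core shader even in a compat context, since from 1.50 onwards the shader can state its profile itself.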



Re: [Mesa-dev] NIR function inlining for faster compile times

2018-04-11 Thread Ian Romanick
Are we still calculating limits (that affect whether or not a shader can
successfully link) after only doing GLSL optimizations?  I'm worried
that making a pretty big change to the optimization path is going to
break some app on (most likely) an older piece of hardware because the
linker will now determine that it exceeds some limit.

On 04/09/2018 09:34 PM, Timothy Arceri wrote:
> This series is part of an effort to reduce the regression in compile
> times when switching radeonsi from TGSI -> NIR. But it also turns
> out to be quite handy for i965 too.
> 
> The idea is to make better use of GLSLOptimizeConservatively.
> Currently TGSI must ignore the flag until all functions have been
> inlined by the GLSL IR opts. Since NIR can do function inlining we
> can drop the post linking opts calls for Gallium drivers that use
> NIR and just use the faster NIR opts instead. The patches to do
> this will come in a follow-up series since it requires some
> refactoring and testing and I wanted to get this out for review.
> 
> For i965 this series enables GLSLOptimizeConservatively for a nice
> boost in compile times and very little change in shader-db.
> 
> 


[Mesa-dev] [Bug 105567] meson/ninja: 1. mesa/vdpau incorrect symlinks in DESTDIR and 2. Ddri-drivers-path Dvdpau-libs-path overrides DESTDIR

2018-04-11 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=105567

charlie  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #15 from charlie  ---
On the first compile attempt with
"1-2-bin-install_megadrivers-fix-DESTDIR-and--D--path.patch" applied, the ninja
file was regenerated, because the git mesa directory is deleted from /dev/shm
and a clean copy of origin/master (git reset --hard) is then copied from disk
to /dev/shm. That copy in /dev/shm is then used to compile. I
also logged out before the first attempt to make sure all environment variables
were wiped. So I found it odd that, with no change to my build config
except adding "-Dprefix" on my second attempt,
"-Ddri-drivers-path=${XBUILD}/lib${LIBDIRSUFFIX}/xorg/modules/dri" finally
worked. I suspect some meson cache unrelated to the regeneration of the
"build.ninja" file needed to be 'refreshed'.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.


[Mesa-dev] [PATCH 1/2] i965: Add and use a helper to update the indirect miptree color

2018-04-11 Thread Nanley Chery
Split out this functionality to enable a fast-clear optimization for
color miptrees in the next commit.
---
 src/mesa/drivers/dri/i965/brw_clear.c | 54 ---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 22 +++
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  7 
 3 files changed, 36 insertions(+), 47 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_clear.c 
b/src/mesa/drivers/dri/i965/brw_clear.c
index 3d540d6d905..1cdc2241eac 100644
--- a/src/mesa/drivers/dri/i965/brw_clear.c
+++ b/src/mesa/drivers/dri/i965/brw_clear.c
@@ -108,7 +108,6 @@ brw_fast_clear_depth(struct gl_context *ctx)
struct intel_mipmap_tree *mt = depth_irb->mt;
struct gl_renderbuffer_attachment *depth_att = 
>Attachment[BUFFER_DEPTH];
const struct gen_device_info *devinfo = >screen->devinfo;
-   bool same_clear_value = true;
 
if (devinfo->gen < 6)
   return false;
@@ -176,7 +175,8 @@ brw_fast_clear_depth(struct gl_context *ctx)
/* If we're clearing to a new clear value, then we need to resolve any clear
 * flags out of the HiZ buffer into the real depth buffer.
 */
-   if (mt->fast_clear_color.f32[0] != clear_value) {
+   const bool same_clear_value = mt->fast_clear_color.f32[0] == clear_value;
+   if (!same_clear_value) {
   for (uint32_t level = mt->first_level; level <= mt->last_level; level++) 
{
  if (!intel_miptree_level_has_hiz(mt, level))
 continue;
@@ -214,7 +214,6 @@ brw_fast_clear_depth(struct gl_context *ctx)
   }
 
   intel_miptree_set_depth_clear_value(brw, mt, clear_value);
-  same_clear_value = false;
}
 
bool need_clear = false;
@@ -225,56 +224,17 @@ brw_fast_clear_depth(struct gl_context *ctx)
 
   if (aux_state != ISL_AUX_STATE_CLEAR) {
  need_clear = true;
- break;
-  }
-   }
-
-   if (!need_clear) {
-  /* If all of the layers we intend to clear are already in the clear
-   * state then simply updating the miptree fast clear value is sufficient
-   * to change their clear value.
-   */
-  if (devinfo->gen >= 10 && !same_clear_value) {
- /* Before gen10, it was enough to just update the clear value in the
-  * miptree. But on gen10+, we let blorp update the clear value state
-  * buffer when doing a fast clear. Since we are skipping the fast
-  * clear here, we need to update the clear color ourselves.
-  */
- uint32_t clear_offset = mt->aux_buf->clear_color_offset;
- union isl_color_value clear_color = { .f32 = { clear_value, } };
-
- /* We can't update the clear color while the hardware is still using
-  * the previous one for a resolve or sampling from it. So make sure
-  * that there's no pending commands at this point.
-  */
- brw_emit_pipe_control_flush(brw, PIPE_CONTROL_CS_STALL);
- for (int i = 0; i < 4; i++) {
-brw_store_data_imm32(brw, mt->aux_buf->clear_color_bo,
- clear_offset + i * 4, clear_color.u32[i]);
- }
- brw_emit_pipe_control_flush(brw, PIPE_CONTROL_STATE_CACHE_INVALIDATE);
-  }
-  return true;
-   }
-
-   for (unsigned a = 0; a < num_layers; a++) {
-  enum isl_aux_state aux_state =
- intel_miptree_get_aux_state(mt, depth_irb->mt_level,
- depth_irb->mt_layer + a);
-
-  if (aux_state != ISL_AUX_STATE_CLEAR) {
  intel_hiz_exec(brw, mt, depth_irb->mt_level,
 depth_irb->mt_layer + a, 1,
 ISL_AUX_OP_FAST_CLEAR);
+ intel_miptree_set_aux_state(brw, mt, depth_irb->mt_level,
+ depth_irb->mt_layer + a, 1,
+ ISL_AUX_STATE_CLEAR);
   }
}
 
-   /* Now, the HiZ buffer contains data that needs to be resolved to the depth
-* buffer.
-*/
-   intel_miptree_set_aux_state(brw, mt, depth_irb->mt_level,
-   depth_irb->mt_layer, num_layers,
-   ISL_AUX_STATE_CLEAR);
+   if (!need_clear && !same_clear_value)
+  intel_miptree_update_indirect_color(brw, mt);
 
return true;
 }
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 0b6a821d08c..23e73c5419c 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -3831,3 +3831,25 @@ intel_miptree_get_clear_color(const struct 
gen_device_info *devinfo,
   return mt->fast_clear_color;
}
 }
+
+void
+intel_miptree_update_indirect_color(struct brw_context *brw,
+struct intel_mipmap_tree *mt)
+{
+   assert(mt->aux_buf);
+
+   if (mt->aux_buf->clear_color_bo == NULL)
+  return;
+
+   /* We can't update the clear color while the hardware is still using the
+* previous one for a resolve or sampling from it. Make sure 

[Mesa-dev] [PATCH 0/2] i965: Also skip the fast clear if the clear color differs

2018-04-11 Thread Nanley Chery
This series lives on top of this mailing-list series:
[PATCH v3 0/9] Enable sRGB-encoded fast-clears on CNL

Nanley Chery (2):
  i965: Add and use a helper to update the indirect miptree color
  i965/blorp: Also skip the fast clear if the clear color differs

 src/mesa/drivers/dri/i965/brw_blorp.c |  7 +++-
 src/mesa/drivers/dri/i965/brw_clear.c | 54 ---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 22 +++
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  7 
 4 files changed, 41 insertions(+), 49 deletions(-)

-- 
2.16.2



[Mesa-dev] [PATCH 2/2] i965/blorp: Also skip the fast clear if the clear color differs

2018-04-11 Thread Nanley Chery
If the aux state is CLEAR and the clear color value has changed, only the
surface state must be updated. The bit-pattern in the aux buffer is
exactly the same.

v2: Handle the indirect color on gen10+.
---
 src/mesa/drivers/dri/i965/brw_blorp.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c 
b/src/mesa/drivers/dri/i965/brw_blorp.c
index 48022bb1c4f..52fec02174d 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.c
+++ b/src/mesa/drivers/dri/i965/brw_blorp.c
@@ -1232,11 +1232,14 @@ do_single_blorp_clear(struct brw_context *brw, struct 
gl_framebuffer *fb,
   bool same_clear_color =
  !intel_miptree_set_clear_color(brw, irb->mt, >Color.ClearColor);
 
-  /* If the buffer is already in INTEL_FAST_CLEAR_STATE_CLEAR, the clear
+  /* If the buffer is already in ISL_AUX_STATE_CLEAR, the clear
* is redundant and can be skipped.
*/
-  if (aux_state == ISL_AUX_STATE_CLEAR && same_clear_color)
+  if (aux_state == ISL_AUX_STATE_CLEAR) {
+ if (!same_clear_color)
+intel_miptree_update_indirect_color(brw, irb->mt);
  return;
+  }
 
   DBG("%s (fast) to mt %p level %d layers %d+%d\n", __FUNCTION__,
   irb->mt, irb->mt_level, irb->mt_layer, num_layers);
-- 
2.16.2
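
Note on the patch above: once the buffer is already in the CLEAR aux state, the fast clear is skipped whether or not the color changed, and a changed color only requires refreshing the stored clear color. Below is a minimal standalone sketch of that decision. The enum and function names are hypothetical, and the real code performs the actions inline rather than returning a value.

```c
#include <stdbool.h>

/* Hypothetical summary of what the clear path ends up doing. */
enum example_clear_action {
	EXAMPLE_FAST_CLEAR,        /* aux data must be rewritten */
	EXAMPLE_SKIP,              /* already clear with the same color */
	EXAMPLE_SKIP_UPDATE_COLOR  /* already clear, only the color changed */
};

static enum example_clear_action
example_pick_clear_action(bool aux_is_clear, bool same_clear_color)
{
	if (!aux_is_clear)
		return EXAMPLE_FAST_CLEAR;

	/* The aux bit-pattern for CLEAR does not depend on the color, so
	 * a color change needs no new fast clear. */
	return same_clear_color ? EXAMPLE_SKIP : EXAMPLE_SKIP_UPDATE_COLOR;
}
```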



Re: [Mesa-dev] [PATCH v2 3/3] nir: simplify node matching code when lowering to SSA

2018-04-11 Thread Caio Marcelo de Oliveira Filho
On Wed, Apr 11, 2018 at 11:45:57AM -0700, Jason Ekstrand wrote:
> I tweaked your commit messages a bit, added my R-B to this one, and pushed.

Thanks. Your title reads better.


Thanks,
Caio



[Mesa-dev] [PATCH v3 9/9] i965/meta_util: Re-enable sRGB-encoded fast-clears on CNL

2018-04-11 Thread Nanley Chery
The paths which sample with the clear color are now using a getter which
performs the sRGB decode needed to enable this fast clear.

This path can be exercised by fast-clearing a texture, then performing
an operation which requires sRGB decoding. Test coverage for this
feature is provided with the following tests:

* Shader texture calls:
  - spec@ext_texture_srgb@tex-srgb

* Shader texelfetch calls:
  - spec@arb_framebuffer_srgb@fbo-fast-clear
  - spec@arb_framebuffer_srgb@msaa-fast-clear

* Blending:
  - spec@arb_framebuffer_srgb@arb_framebuffer_srgb-fast-clear-blend

* Blitting:
  - spec@arb_framebuffer_srgb@blit texture srgb msaa enabled clear

Reviewed-by: Jason Ekstrand 
---
 src/mesa/drivers/dri/i965/brw_meta_util.c | 11 ---
 1 file changed, 11 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_meta_util.c 
b/src/mesa/drivers/dri/i965/brw_meta_util.c
index b31181521c7..d292f5a8e24 100644
--- a/src/mesa/drivers/dri/i965/brw_meta_util.c
+++ b/src/mesa/drivers/dri/i965/brw_meta_util.c
@@ -293,18 +293,7 @@ brw_is_color_fast_clear_compatible(struct brw_context *brw,
brw->mesa_to_isl_render_format[mt->format])
   return false;
 
-   const bool srgb_rb = _mesa_get_srgb_format_linear(mt->format) != mt->format;
-  /* Gen10 doesn't automatically decode the clear color of sRGB buffers. Since
-   * we currently don't perform this decode in software, avoid a fast-clear
-   * altogether. TODO: Do this in software.
-   */
const mesa_format format = _mesa_get_render_format(ctx, mt->format);
-   if (devinfo->gen >= 10 && srgb_rb) {
-  perf_debug("sRGB fast clear not enabled for (%s)",
- _mesa_get_format_name(format));
-  return false;
-   }
-
if (_mesa_is_format_integer_color(format)) {
   if (devinfo->gen >= 8) {
  perf_debug("Integer fast clear not enabled for (%s)",
-- 
2.16.2



[Mesa-dev] [PATCH v3 5/9] i965/wm_surface_state: Use the clear address if it's non-zero

2018-04-11 Thread Nanley Chery
We want to add and use a getter that turns off the indirect path by
returning zero for the clear address.
---
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 17 ++---
 1 file changed, 6 insertions(+), 11 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index 06f739faf61..3c70dbcc110 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -155,6 +155,8 @@ brw_emit_surface_state(struct brw_context *brw,
struct brw_bo *aux_bo = NULL;
struct isl_surf *aux_surf = NULL;
uint64_t aux_offset = 0;
+   struct brw_bo *clear_bo = NULL;
+   uint32_t clear_offset = 0;
struct intel_miptree_aux_buffer *aux_buf = intel_miptree_get_aux_buffer(mt);
 
if (aux_usage != ISL_AUX_USAGE_NONE) {
@@ -165,6 +167,8 @@ brw_emit_surface_state(struct brw_context *brw,
   /* We only really need a clear color if we also have an auxiliary
* surface.  Without one, it does nothing.
*/
+  clear_bo = aux_buf->clear_color_bo;
+  clear_offset = aux_buf->clear_color_offset;
   clear_color = mt->fast_clear_color;
}
 
@@ -173,15 +177,6 @@ brw_emit_surface_state(struct brw_context *brw,
  brw->isl_dev.ss.align,
  surf_offset);
 
-   bool use_clear_address = devinfo->gen >= 10 && aux_surf;
-
-   struct brw_bo *clear_bo = NULL;
-   uint32_t clear_offset = 0;
-   if (use_clear_address) {
-  clear_bo = aux_buf->clear_color_bo;
-  clear_offset = aux_buf->clear_color_offset;
-   }
-
isl_surf_fill_state(&brw->isl_dev, state, .surf = &surf, .view = &view,
.address = brw_state_reloc(&brw->batch,
   *surf_offset + 
brw->isl_dev.ss.addr_offset,
@@ -190,7 +185,7 @@ brw_emit_surface_state(struct brw_context *brw,
.aux_address = aux_offset,
.mocs = brw_get_bo_mocs(devinfo, mt->bo),
.clear_color = clear_color,
-   .use_clear_address = use_clear_address,
+   .use_clear_address = clear_bo != NULL,
.clear_address = clear_offset,
.x_offset_sa = tile_x, .y_offset_sa = tile_y);
if (aux_surf) {
@@ -222,7 +217,7 @@ brw_emit_surface_state(struct brw_context *brw,
   }
}
 
-   if (use_clear_address) {
+   if (clear_bo != NULL) {
   /* Make sure the offset is aligned with a cacheline. */
   assert((clear_offset & 0x3f) == 0);
   uint32_t *clear_address =
-- 
2.16.2
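
The two checks this patch relies on can be sketched on their own: the indirect clear address is used exactly when a clear-color buffer is present, and its offset has to sit on a 64-byte cacheline boundary. `struct clear_ref` and both helper names are hypothetical and exist only for this illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pairing of a clear-color buffer with an offset into it. */
struct clear_ref {
   const void *bo;
   uint32_t offset;
};

/* After the patch, the presence of a buffer is the only switch. */
static bool
use_clear_address(const struct clear_ref *ref)
{
   return ref->bo != NULL;
}

/* Same test as the assert in the patch: the low six bits must be zero,
 * i.e. the offset is a multiple of 64 bytes. */
static bool
is_cacheline_aligned(uint32_t offset)
{
   return (offset & 0x3f) == 0;
}
```

Keying on the buffer pointer rather than on the hardware generation lets a later getter switch the indirect path off simply by returning no buffer.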



[Mesa-dev] [PATCH v3 7/9] i965: Add and use a getter for the clear color

2018-04-11 Thread Nanley Chery
This getter allows CNL to sample from fast-cleared sRGB textures correctly.
---
 src/mesa/drivers/dri/i965/brw_blorp.c| 13 ---
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c |  7 +++---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c| 29 
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h|  8 +++
 4 files changed, 46 insertions(+), 11 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c 
b/src/mesa/drivers/dri/i965/brw_blorp.c
index a1882abb7cb..48022bb1c4f 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.c
+++ b/src/mesa/drivers/dri/i965/brw_blorp.c
@@ -166,7 +166,11 @@ blorp_surf_for_miptree(struct brw_context *brw,
   /* We only really need a clear color if we also have an auxiliary
* surface.  Without one, it does nothing.
*/
-  surf->clear_color = mt->fast_clear_color;
+  surf->clear_color =
+ intel_miptree_get_clear_color(devinfo, mt, mt->surf.format,
+   !is_render_target, (struct brw_bo **)
+   &surf->clear_color_addr.buffer,
+   &surf->clear_color_addr.offset);
 
   struct intel_miptree_aux_buffer *aux_buf =
  intel_miptree_get_aux_buffer(mt);
@@ -178,13 +182,6 @@ blorp_surf_for_miptree(struct brw_context *brw,
 
   surf->aux_addr.buffer = aux_buf->bo;
   surf->aux_addr.offset = aux_buf->offset;
-
-  if (devinfo->gen >= 10) {
- surf->clear_color_addr = (struct blorp_address) {
-.buffer = aux_buf->clear_color_bo,
-.offset = aux_buf->clear_color_offset,
- };
-  }
} else {
   surf->aux_addr = (struct blorp_address) {
  .buffer = NULL,
diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index 3c70dbcc110..fb8e5942a11 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -167,9 +167,10 @@ brw_emit_surface_state(struct brw_context *brw,
   /* We only really need a clear color if we also have an auxiliary
* surface.  Without one, it does nothing.
*/
-  clear_bo = aux_buf->clear_color_bo;
-  clear_offset = aux_buf->clear_color_offset;
-  clear_color = mt->fast_clear_color;
+  clear_color =
+ intel_miptree_get_clear_color(devinfo, mt, view.format,
+   view.usage & ISL_SURF_USAGE_TEXTURE_BIT,
+   &clear_bo, &clear_offset);
}
 
void *state = brw_state_batch(brw,
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index c5791835409..88468399e1b 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -46,6 +46,9 @@
 #include "main/texcompress_etc.h"
 #include "main/teximage.h"
 #include "main/streaming-load-memcpy.h"
+
+#include "util/format_srgb.h"
+
 #include "x86/common_x86_asm.h"
 
 #define FILE_DEBUG_FLAG DEBUG_MIPTREE
@@ -3802,3 +3805,29 @@ intel_miptree_set_depth_clear_value(struct brw_context 
*brw,
}
return false;
 }
+
+union isl_color_value
+intel_miptree_get_clear_color(const struct gen_device_info *devinfo,
+  const struct intel_mipmap_tree *mt,
+  enum isl_format view_format, bool sampling,
+  struct brw_bo **clear_color_bo,
+  uint32_t *clear_color_offset)
+{
+   assert(mt->aux_buf);
+
+   /* The gen10 sampler doesn't gamma-correct the clear color. */
+   if (devinfo->gen == 10 && isl_format_is_srgb(view_format) && sampling) {
+  union isl_color_value srgb_decoded_value = mt->fast_clear_color;
+  for (unsigned i = 0; i < 3; i++) {
+ srgb_decoded_value.f32[i] =
+util_format_srgb_to_linear_float(mt->fast_clear_color.f32[i]);
+  }
+  *clear_color_bo = 0;
+  *clear_color_offset = 0;
+  return srgb_decoded_value;
+   } else {
+  *clear_color_bo = mt->aux_buf->clear_color_bo;
+  *clear_color_offset = mt->aux_buf->clear_color_offset;
+  return mt->fast_clear_color;
+   }
+}
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
index 643de962d31..bb059cf4e8f 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
@@ -735,6 +735,14 @@ intel_miptree_set_clear_color(struct brw_context *brw,
   struct intel_mipmap_tree *mt,
   const union gl_color_union *color);
 
+/* Get a clear color suitable for filling out an ISL surface state. */
+union isl_color_value
+intel_miptree_get_clear_color(const struct gen_device_info *devinfo,
+  const struct intel_mipmap_tree *mt,
+  enum isl_format 
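
The shape of the getter added in this patch can be sketched without the driver types. Everything below is an assumption made for illustration: `struct clear_color`, `struct clear_source`, `get_clear_color` and the `needs_sw_decode` flag (which collapses the patch's "gen10, sRGB view, sampling" condition into one boolean) are hypothetical, and `decode_gamma2_placeholder` is a crude gamma-2.0 stand-in for the real sRGB curve, used only to keep the sketch free of math-library calls.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the driver types. */
struct clear_color { float f32[4]; };

struct clear_source {
   struct clear_color color; /* CPU-side copy of the clear color */
   const void *color_bo;     /* buffer the hardware could read it from */
   uint32_t color_offset;
};

/* Placeholder decode (gamma 2.0); NOT the real sRGB transfer function. */
static float
decode_gamma2_placeholder(float cs)
{
   return cs * cs;
}

/* Returns the color to program into the surface state. When the color has
 * to be decoded in software, the indirect buffer is switched off by handing
 * back a NULL buffer, so the caller falls back to the inline value. */
static struct clear_color
get_clear_color(const struct clear_source *src, bool needs_sw_decode,
                float (*decode)(float),
                const void **out_bo, uint32_t *out_offset)
{
   struct clear_color result = src->color;

   if (needs_sw_decode) {
      /* Only R, G and B are sRGB-encoded; alpha stays linear. */
      for (unsigned i = 0; i < 3; i++)
         result.f32[i] = decode(src->color.f32[i]);
      *out_bo = NULL;
      *out_offset = 0;
   } else {
      *out_bo = src->color_bo;
      *out_offset = src->color_offset;
   }

   return result;
}
```

Returning the buffer and offset through out-parameters keeps the decision about the indirect path in one place, which is what allows callers to test only whether the buffer is non-NULL.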

[Mesa-dev] [PATCH v3 4/9] i965: Add and use a single miptree aux_buf field

2018-04-11 Thread Nanley Chery
We want to add and use a function that accesses the auxiliary buffer's
clear_color_bo and doesn't care if it has an MCS or HiZ buffer
specifically.
---
 src/mesa/drivers/dri/i965/brw_blorp.c |   4 +-
 src/mesa/drivers/dri/i965/brw_clear.c |   4 +-
 src/mesa/drivers/dri/i965/brw_wm.c|   2 +-
 src/mesa/drivers/dri/i965/gen6_depth_state.c  |   6 +-
 src/mesa/drivers/dri/i965/gen7_misc_state.c   |   4 +-
 src/mesa/drivers/dri/i965/gen8_depth_state.c  |   6 +-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 106 +-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  42 --
 src/mesa/drivers/dri/i965/intel_tex_image.c   |   2 +-
 9 files changed, 80 insertions(+), 96 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c 
b/src/mesa/drivers/dri/i965/brw_blorp.c
index 962a316c5cf..a1882abb7cb 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.c
+++ b/src/mesa/drivers/dri/i965/brw_blorp.c
@@ -1212,7 +1212,7 @@ do_single_blorp_clear(struct brw_context *brw, struct 
gl_framebuffer *fb,
 
/* If the MCS buffer hasn't been allocated yet, we need to allocate it now.
 */
-   if (can_fast_clear && !irb->mt->mcs_buf) {
+   if (can_fast_clear && !irb->mt->aux_buf) {
   assert(irb->mt->aux_usage == ISL_AUX_USAGE_CCS_D);
   if (!intel_miptree_alloc_ccs(brw, irb->mt)) {
  /* There are a few reasons in addition to out-of-memory, that can
@@ -1611,7 +1611,7 @@ intel_hiz_exec(struct brw_context *brw, struct 
intel_mipmap_tree *mt,
brw_emit_pipe_control_flush(brw, PIPE_CONTROL_DEPTH_STALL);
}
 
-   assert(mt->aux_usage == ISL_AUX_USAGE_HIZ && mt->hiz_buf);
+   assert(mt->aux_usage == ISL_AUX_USAGE_HIZ && mt->aux_buf);
 
struct isl_surf isl_tmp[2];
struct blorp_surf surf;
diff --git a/src/mesa/drivers/dri/i965/brw_clear.c 
b/src/mesa/drivers/dri/i965/brw_clear.c
index 487de9b8997..3d540d6d905 100644
--- a/src/mesa/drivers/dri/i965/brw_clear.c
+++ b/src/mesa/drivers/dri/i965/brw_clear.c
@@ -240,7 +240,7 @@ brw_fast_clear_depth(struct gl_context *ctx)
   * buffer when doing a fast clear. Since we are skipping the fast
   * clear here, we need to update the clear color ourselves.
   */
- uint32_t clear_offset = mt->hiz_buf->clear_color_offset;
+ uint32_t clear_offset = mt->aux_buf->clear_color_offset;
  union isl_color_value clear_color = { .f32 = { clear_value, } };
 
  /* We can't update the clear color while the hardware is still using
@@ -249,7 +249,7 @@ brw_fast_clear_depth(struct gl_context *ctx)
   */
  brw_emit_pipe_control_flush(brw, PIPE_CONTROL_CS_STALL);
  for (int i = 0; i < 4; i++) {
-brw_store_data_imm32(brw, mt->hiz_buf->clear_color_bo,
+brw_store_data_imm32(brw, mt->aux_buf->clear_color_bo,
  clear_offset + i * 4, clear_color.u32[i]);
  }
  brw_emit_pipe_control_flush(brw, PIPE_CONTROL_STATE_CACHE_INVALIDATE);
diff --git a/src/mesa/drivers/dri/i965/brw_wm.c 
b/src/mesa/drivers/dri/i965/brw_wm.c
index 68d4ab88d77..94048cd758f 100644
--- a/src/mesa/drivers/dri/i965/brw_wm.c
+++ b/src/mesa/drivers/dri/i965/brw_wm.c
@@ -384,7 +384,7 @@ brw_populate_sampler_prog_key_data(struct gl_context *ctx,
  if (intel_tex->mt->aux_usage == ISL_AUX_USAGE_MCS) {
 assert(devinfo->gen >= 7);
 assert(intel_tex->mt->surf.samples > 1);
-assert(intel_tex->mt->mcs_buf);
+assert(intel_tex->mt->aux_buf);
 assert(intel_tex->mt->surf.msaa_layout == ISL_MSAA_LAYOUT_ARRAY);
 key->compressed_multisample_layout_mask |= 1 << s;
 
diff --git a/src/mesa/drivers/dri/i965/gen6_depth_state.c 
b/src/mesa/drivers/dri/i965/gen6_depth_state.c
index 3a66b42fec1..8a1d5808051 100644
--- a/src/mesa/drivers/dri/i965/gen6_depth_state.c
+++ b/src/mesa/drivers/dri/i965/gen6_depth_state.c
@@ -160,13 +160,13 @@ gen6_emit_depth_stencil_hiz(struct brw_context *brw,
  assert(depth_mt);
 
  uint32_t offset;
- isl_surf_get_image_offset_B_tile_sa(&depth_mt->hiz_buf->surf,
+ isl_surf_get_image_offset_B_tile_sa(&depth_mt->aux_buf->surf,
  lod, 0, 0, &offset, NULL, NULL);
 
 BEGIN_BATCH(3);
 OUT_BATCH((_3DSTATE_HIER_DEPTH_BUFFER << 16) | (3 - 2));
-OUT_BATCH(depth_mt->hiz_buf->surf.row_pitch - 1);
-OUT_RELOC(depth_mt->hiz_buf->bo, RELOC_WRITE, offset);
+OUT_BATCH(depth_mt->aux_buf->surf.row_pitch - 1);
+OUT_RELOC(depth_mt->aux_buf->bo, RELOC_WRITE, offset);
 ADVANCE_BATCH();
   } else {
 BEGIN_BATCH(3);
diff --git a/src/mesa/drivers/dri/i965/gen7_misc_state.c 
b/src/mesa/drivers/dri/i965/gen7_misc_state.c
index 58f0a1bdbfd..1ce76585f2b 100644
--- a/src/mesa/drivers/dri/i965/gen7_misc_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_misc_state.c
@@ -149,8 +149,8 @@ gen7_emit_depth_stencil_hiz(struct brw_context *brw,
   

[Mesa-dev] [PATCH v3 8/9] i965/miptree: Extend the sRGB-blending WA to future platforms

2018-04-11 Thread Nanley Chery
The blending issue seems to be present on CNL as well.

Reviewed-by: Jason Ekstrand 
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 88468399e1b..0b6a821d08c 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -2737,11 +2737,11 @@ intel_miptree_render_aux_usage(struct brw_context *brw,
  return ISL_AUX_USAGE_NONE;
   }
 
-  /* gen9 hardware technically supports non-0/1 clear colors with sRGB
+  /* gen9+ hardware technically supports non-0/1 clear colors with sRGB
* formats.  However, there are issues with blending where it doesn't
* properly apply the sRGB curve to the clear color when blending.
*/
-  if (devinfo->gen == 9 && blend_enabled &&
+  if (devinfo->gen >= 9 && blend_enabled &&
   isl_format_is_srgb(render_format) &&
   !isl_color_value_is_zero_one(mt->fast_clear_color, render_format))
  return ISL_AUX_USAGE_NONE;
-- 
2.16.2
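
The condition extended by this patch can be sketched as a standalone predicate. The helper names are hypothetical, and `color_is_zero_one` is a simplification: it inspects all four float channels, which is only assumed to approximate what isl_color_value_is_zero_one() does for a given format. Colors made purely of 0 and 1 are exempt because the sRGB curve maps 0 to 0 and 1 to 1, so the encoded and decoded values coincide.

```c
#include <stdbool.h>

/* True when every channel is exactly 0.0 or 1.0. */
static bool
color_is_zero_one(const float rgba[4])
{
   for (int i = 0; i < 4; i++) {
      if (rgba[i] != 0.0f && rgba[i] != 1.0f)
         return false;
   }
   return true;
}

/* True when rendering should fall back to no auxiliary surface, mirroring
 * the check in the patch with "gen >= 9" instead of "gen == 9". */
static bool
blend_needs_aux_disabled(int gen, bool blend_enabled, bool srgb_format,
                         const float clear_color[4])
{
   return gen >= 9 && blend_enabled && srgb_format &&
          !color_is_zero_one(clear_color);
}
```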



[Mesa-dev] [PATCH v3 2/9] i965/miptree: Delete an unused function

2018-04-11 Thread Nanley Chery
We're going to combine ::mcs_buf and ::hiz_buf in later commits. Once
that happens, this function no longer makes sense.
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 13 -
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  4 
 2 files changed, 17 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 0580cc05346..d95128de119 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -3792,19 +3792,6 @@ get_isl_dim_layout(const struct gen_device_info *devinfo,
unreachable("Invalid texture target");
 }
 
-enum isl_aux_usage
-intel_miptree_get_aux_isl_usage(const struct brw_context *brw,
-const struct intel_mipmap_tree *mt)
-{
-   if (mt->hiz_buf)
-  return ISL_AUX_USAGE_HIZ;
-
-   if (!mt->mcs_buf)
-  return ISL_AUX_USAGE_NONE;
-
-   return mt->aux_usage;
-}
-
 bool
 intel_miptree_set_clear_color(struct brw_context *brw,
   struct intel_mipmap_tree *mt,
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
index 4136c6586b6..2f754427fc5 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
@@ -485,10 +485,6 @@ enum isl_dim_layout
 get_isl_dim_layout(const struct gen_device_info *devinfo,
enum isl_tiling tiling, GLenum target);
 
-enum isl_aux_usage
-intel_miptree_get_aux_isl_usage(const struct brw_context *brw,
-const struct intel_mipmap_tree *mt);
-
 void
 intel_get_image_dims(struct gl_texture_image *image,
  int *width, int *height, int *depth);
-- 
2.16.2



[Mesa-dev] [PATCH v3 3/9] i965: Add and use a getter for the miptree aux buffer

2018-04-11 Thread Nanley Chery
Make the next patch easier to read by eliminating most of the would-be
duplicate field accesses now.
---
 src/mesa/drivers/dri/i965/brw_blorp.c|  8 ++--
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 16 +---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c| 24 
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h| 17 +
 4 files changed, 24 insertions(+), 41 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c 
b/src/mesa/drivers/dri/i965/brw_blorp.c
index 5dcd95e9f44..962a316c5cf 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.c
+++ b/src/mesa/drivers/dri/i965/brw_blorp.c
@@ -154,12 +154,6 @@ blorp_surf_for_miptree(struct brw_context *brw,
   .aux_usage = aux_usage,
};
 
-   struct intel_miptree_aux_buffer *aux_buf = NULL;
-   if (mt->mcs_buf)
-  aux_buf = mt->mcs_buf;
-   else if (mt->hiz_buf)
-  aux_buf = mt->hiz_buf;
-
if (mt->format == MESA_FORMAT_S_UINT8 && is_render_target &&
devinfo->gen <= 7)
   mt->r8stencil_needs_update = true;
@@ -174,6 +168,8 @@ blorp_surf_for_miptree(struct brw_context *brw,
*/
   surf->clear_color = mt->fast_clear_color;
 
+  struct intel_miptree_aux_buffer *aux_buf =
+ intel_miptree_get_aux_buffer(mt);
   surf->aux_surf = &aux_buf->surf;
   surf->aux_addr = (struct blorp_address) {
  .reloc_flags = is_render_target ? EXEC_OBJECT_WRITE : 0,
diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index 3fb101bf68b..06f739faf61 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -155,21 +155,7 @@ brw_emit_surface_state(struct brw_context *brw,
struct brw_bo *aux_bo = NULL;
struct isl_surf *aux_surf = NULL;
uint64_t aux_offset = 0;
-   struct intel_miptree_aux_buffer *aux_buf = NULL;
-   switch (aux_usage) {
-   case ISL_AUX_USAGE_MCS:
-   case ISL_AUX_USAGE_CCS_D:
-   case ISL_AUX_USAGE_CCS_E:
-  aux_buf = mt->mcs_buf;
-  break;
-
-   case ISL_AUX_USAGE_HIZ:
-  aux_buf = mt->hiz_buf;
-  break;
-
-   case ISL_AUX_USAGE_NONE:
-  break;
-   }
+   struct intel_miptree_aux_buffer *aux_buf = intel_miptree_get_aux_buffer(mt);
 
if (aux_usage != ISL_AUX_USAGE_NONE) {
   aux_surf = &aux_buf->surf;
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index d95128de119..ba5b02bc0aa 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -1249,8 +1249,7 @@ intel_miptree_release(struct intel_mipmap_tree **mt)
   brw_bo_unreference((*mt)->bo);
   intel_miptree_release(&(*mt)->stencil_mt);
   intel_miptree_release(&(*mt)->r8stencil_mt);
-  intel_miptree_aux_buffer_free((*mt)->hiz_buf);
-  intel_miptree_aux_buffer_free((*mt)->mcs_buf);
+  intel_miptree_aux_buffer_free(intel_miptree_get_aux_buffer(*mt));
   free_aux_state_map((*mt)->aux_state);
 
   intel_miptree_release(&(*mt)->plane[0]);
@@ -2876,31 +2875,16 @@ intel_miptree_make_shareable(struct brw_context *brw,
 0, INTEL_REMAINING_LAYERS,
 ISL_AUX_USAGE_NONE, false);
 
-   if (mt->mcs_buf) {
-  intel_miptree_aux_buffer_free(mt->mcs_buf);
+   struct intel_miptree_aux_buffer *aux_buf = intel_miptree_get_aux_buffer(mt);
+   if (aux_buf) {
+  intel_miptree_aux_buffer_free(aux_buf);
   mt->mcs_buf = NULL;
-
-  /* Any pending MCS/CCS operations are no longer needed. Trying to
-   * execute any will likely crash due to the missing aux buffer. So let's
-   * delete all pending ops.
-   */
-  free(mt->aux_state);
-  mt->aux_state = NULL;
-  brw->ctx.NewDriverState |= BRW_NEW_AUX_STATE;
-   }
-
-   if (mt->hiz_buf) {
-  intel_miptree_aux_buffer_free(mt->hiz_buf);
   mt->hiz_buf = NULL;
 
   for (uint32_t l = mt->first_level; l <= mt->last_level; ++l) {
  mt->level[l].has_hiz = false;
   }
 
-  /* Any pending HiZ operations are no longer needed. Trying to execute
-   * any will likely crash due to the missing aux buffer. So let's delete
-   * all pending ops.
-   */
   free(mt->aux_state);
   mt->aux_state = NULL;
   brw->ctx.NewDriverState |= BRW_NEW_AUX_STATE;
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
index 2f754427fc5..8fe5c4add67 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
@@ -485,6 +485,23 @@ enum isl_dim_layout
 get_isl_dim_layout(const struct gen_device_info *devinfo,
enum isl_tiling tiling, GLenum target);
 
+static inline struct intel_miptree_aux_buffer *
+intel_miptree_get_aux_buffer(const struct intel_mipmap_tree *mt)
+{
+   switch (mt->aux_usage) {
+   case ISL_AUX_USAGE_MCS:
+   

[Mesa-dev] [PATCH v3 0/9] Enable sRGB-encoded fast-clears on CNL

2018-04-11 Thread Nanley Chery
The most noteworthy differences between v2 and v3 are:
* A fixed memory leak.
* The extra helpers for intel_miptree::fast_clear_color are dropped.
* The indirect color buffer on gen10 is accounted for.

Jason Ekstrand (1):
  util/srgb: Add a float sRGB -> linear helper

Nanley Chery (8):
  i965/miptree: Don't leak the clear_color_bo
  i965/miptree: Delete an unused function
  i965: Add and use a getter for the miptree aux buffer
  i965: Add and use a single miptree aux_buf field
  i965/wm_surface_state: Use the clear address if it's non-zero
  i965: Add and use a getter for the clear color
  i965/miptree: Extend the sRGB-blending WA to future platforms
  i965/meta_util: Re-enable sRGB-encoded fast-clears on CNL

 src/mesa/drivers/dri/i965/brw_blorp.c|  27 ++--
 src/mesa/drivers/dri/i965/brw_clear.c|   4 +-
 src/mesa/drivers/dri/i965/brw_meta_util.c|  11 --
 src/mesa/drivers/dri/i965/brw_wm.c   |   2 +-
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c |  36 ++---
 src/mesa/drivers/dri/i965/gen6_depth_state.c |   6 +-
 src/mesa/drivers/dri/i965/gen7_misc_state.c  |   4 +-
 src/mesa/drivers/dri/i965/gen8_depth_state.c |   6 +-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c| 169 +++
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h|  45 +++---
 src/mesa/drivers/dri/i965/intel_tex_image.c  |   2 +-
 src/util/format_srgb.h   |  14 ++
 12 files changed, 154 insertions(+), 172 deletions(-)

-- 
2.16.2



[Mesa-dev] [PATCH v3 6/9] util/srgb: Add a float sRGB -> linear helper

2018-04-11 Thread Nanley Chery
From: Jason Ekstrand 

Reviewed-by: Nanley Chery 
Reviewed-by: Jason Ekstrand 
---
 src/util/format_srgb.h | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/src/util/format_srgb.h b/src/util/format_srgb.h
index 34b50afe3d1..596af56f4cd 100644
--- a/src/util/format_srgb.h
+++ b/src/util/format_srgb.h
@@ -54,6 +54,20 @@ extern const unsigned
 util_format_linear_to_srgb_helper_table[104];
 
 
+static inline float
+util_format_srgb_to_linear_float(float cs)
+{
+   if (cs <= 0.0f)
+  return 0.0f;
+   else if (cs <= 0.04045f)
+  return cs / 12.92f;
+   else if (cs < 1.0f)
+  return powf((cs + 0.055) / 1.055f, 2.4f);
+   else
+  return 1.0f;
+}
+
+
 static inline float
 util_format_linear_to_srgb_float(float cl)
 {
-- 
2.16.2
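
For reference, the helper in the patch above corresponds to the following piecewise function, written here with c_s for the sRGB-encoded input and c_l for the linear result (the symbol names are chosen for this note only):

```latex
c_l =
\begin{cases}
  0 & c_s \le 0 \\
  \dfrac{c_s}{12.92} & 0 < c_s \le 0.04045 \\
  \left(\dfrac{c_s + 0.055}{1.055}\right)^{2.4} & 0.04045 < c_s < 1 \\
  1 & c_s \ge 1
\end{cases}
```

As a worked example, c_s = 0.5 falls in the third branch: (0.5 + 0.055) / 1.055 is about 0.5261, and 0.5261 raised to the power 2.4 is about 0.214. The two middle branches meet at c_s = 0.04045, where both give roughly 0.00313.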



[Mesa-dev] [PATCH v3 1/9] i965/miptree: Don't leak the clear_color_bo

2018-04-11 Thread Nanley Chery
Free the clear_color_bo in addition to freeing the
intel_miptree_aux_buffer which holds the reference to it.
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 8d3ddd56544..0580cc05346 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -2877,8 +2877,7 @@ intel_miptree_make_shareable(struct brw_context *brw,
 ISL_AUX_USAGE_NONE, false);
 
if (mt->mcs_buf) {
-  brw_bo_unreference(mt->mcs_buf->bo);
-  free(mt->mcs_buf);
+  intel_miptree_aux_buffer_free(mt->mcs_buf);
   mt->mcs_buf = NULL;
 
   /* Any pending MCS/CCS operations are no longer needed. Trying to
-- 
2.16.2
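
Why routing the release through one helper avoids the leak can be sketched with a toy reference-counted buffer. All names here (`struct bo`, `struct aux_buffer`, `bo_unreference`, `aux_buffer_free`) are hypothetical and do not describe how the driver's own helper is written; the sketch only assumes that the helper drops every reference the aux buffer holds.

```c
#include <stdlib.h>

/* Toy reference-counted buffer object. */
struct bo {
   int refcount;
};

/* Toy aux buffer holding two references, as in the commit message. */
struct aux_buffer {
   struct bo *bo;
   struct bo *clear_color_bo;
};

static void
bo_unreference(struct bo *bo)
{
   if (bo != NULL && --bo->refcount == 0)
      free(bo);
}

/* Drops both references before freeing the container. The removed lines in
 * the patch unreferenced only the main bo and then freed the struct, which
 * left the clear color bo with a reference nobody could release. */
static void
aux_buffer_free(struct aux_buffer *buf)
{
   if (buf == NULL)
      return;

   bo_unreference(buf->bo);
   bo_unreference(buf->clear_color_bo);
   free(buf);
}
```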



[Mesa-dev] [PATCH 1/2] mesa: include dispatch.h less

2018-04-11 Thread Marek Olšák
From: Marek Olšák 

---
 src/mesa/main/accum.c | 1 -
 src/mesa/main/arrayobj.c  | 1 -
 src/mesa/main/atifragshader.c | 1 -
 src/mesa/main/attrib.c| 1 -
 src/mesa/main/colortab.c  | 1 -
 src/mesa/main/convolve.c  | 1 -
 src/mesa/main/debug_output.c  | 1 -
 src/mesa/main/drawpix.c   | 1 -
 src/mesa/main/feedback.c  | 1 -
 src/mesa/main/histogram.c | 1 -
 src/mesa/main/pipelineobj.c   | 1 -
 src/mesa/main/pixel.c | 1 -
 src/mesa/main/queryobj.c  | 1 -
 src/mesa/main/rastpos.c   | 1 -
 src/mesa/main/samplerobj.c| 1 -
 src/mesa/main/shaderapi.c | 1 -
 src/mesa/main/syncobj.c   | 1 -
 src/mesa/main/texgen.c| 1 -
 src/mesa/main/transformfeedback.c | 1 -
 src/mesa/main/uniforms.c  | 1 -
 20 files changed, 20 deletions(-)

diff --git a/src/mesa/main/accum.c b/src/mesa/main/accum.c
index 5fbee8fbdbd..f5ac8a10270 100644
--- a/src/mesa/main/accum.c
+++ b/src/mesa/main/accum.c
@@ -26,21 +26,20 @@
 #include "accum.h"
 #include "condrender.h"
 #include "context.h"
 #include "format_unpack.h"
 #include "format_pack.h"
 #include "framebuffer.h"
 #include "imports.h"
 #include "macros.h"
 #include "state.h"
 #include "mtypes.h"
-#include "main/dispatch.h"
 
 
 void GLAPIENTRY
 _mesa_ClearAccum( GLfloat red, GLfloat green, GLfloat blue, GLfloat alpha )
 {
GLfloat tmp[4];
GET_CURRENT_CONTEXT(ctx);
 
tmp[0] = CLAMP( red,   -1.0F, 1.0F );
tmp[1] = CLAMP( green, -1.0F, 1.0F );
diff --git a/src/mesa/main/arrayobj.c b/src/mesa/main/arrayobj.c
index 0d2f7a918ac..899d4dec01c 100644
--- a/src/mesa/main/arrayobj.c
+++ b/src/mesa/main/arrayobj.c
@@ -44,21 +44,20 @@
 #include "hash.h"
 #include "image.h"
 #include "imports.h"
 #include "context.h"
 #include "bufferobj.h"
 #include "arrayobj.h"
 #include "macros.h"
 #include "mtypes.h"
 #include "state.h"
 #include "varray.h"
-#include "main/dispatch.h"
 #include "util/bitscan.h"
 #include "util/u_atomic.h"
 
 
 const GLubyte
 _mesa_vao_attribute_map[ATTRIBUTE_MAP_MODE_MAX][VERT_ATTRIB_MAX] =
 {
/* ATTRIBUTE_MAP_MODE_IDENTITY
 *
 * Grab vertex processing attribute VERT_ATTRIB_POS from
diff --git a/src/mesa/main/atifragshader.c b/src/mesa/main/atifragshader.c
index 6b636f1dc74..a9356ae95b1 100644
--- a/src/mesa/main/atifragshader.c
+++ b/src/mesa/main/atifragshader.c
@@ -21,21 +21,20 @@
  * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
  */
 
 #include "main/glheader.h"
 #include "main/context.h"
 #include "main/hash.h"
 #include "main/imports.h"
 #include "main/macros.h"
 #include "main/enums.h"
 #include "main/mtypes.h"
-#include "main/dispatch.h"
 #include "main/atifragshader.h"
 #include "program/program.h"
 
 #define MESA_DEBUG_ATI_FS 0
 
 static struct ati_fragment_shader DummyShader;
 
 
 /**
  * Allocate and initialize a new ATI fragment shader object.
diff --git a/src/mesa/main/attrib.c b/src/mesa/main/attrib.c
index 9c632ffb51d..9f0e7161f3e 100644
--- a/src/mesa/main/attrib.c
+++ b/src/mesa/main/attrib.c
@@ -49,21 +49,20 @@
 #include "scissor.h"
 #include "stencil.h"
 #include "texenv.h"
 #include "texgen.h"
 #include "texobj.h"
 #include "texparam.h"
 #include "texstate.h"
 #include "varray.h"
 #include "viewport.h"
 #include "mtypes.h"
-#include "main/dispatch.h"
 #include "state.h"
 #include "hash.h"
 #include 
 
 
 /**
  * glEnable()/glDisable() attribute group (GL_ENABLE_BIT).
  */
 struct gl_enable_attrib
 {
diff --git a/src/mesa/main/colortab.c b/src/mesa/main/colortab.c
index a8edb03dd0e..e8df73a0b83 100644
--- a/src/mesa/main/colortab.c
+++ b/src/mesa/main/colortab.c
@@ -28,21 +28,20 @@
 #include "colortab.h"
 #include "context.h"
 #include "image.h"
 #include "macros.h"
 #include "mtypes.h"
 #include "pack.h"
 #include "pbo.h"
 #include "state.h"
 #include "teximage.h"
 #include "texstate.h"
-#include "main/dispatch.h"
 
 
 void GLAPIENTRY
 _mesa_ColorTable( GLenum target, GLenum internalFormat,
   GLsizei width, GLenum format, GLenum type,
   const GLvoid *data )
 {
GET_CURRENT_CONTEXT(ctx);
_mesa_error(ctx, GL_INVALID_OPERATION, "glColorTable");
 }
diff --git a/src/mesa/main/convolve.c b/src/mesa/main/convolve.c
index 83d590f4a48..e2c355c4f41 100644
--- a/src/mesa/main/convolve.c
+++ b/src/mesa/main/convolve.c
@@ -27,21 +27,20 @@
  * Image convolution functions.
  *
  * Notes: filter kernel elements are indexed by <n> and <m> as in
  * the GL spec.
  */
 
 
 #include "glheader.h"
 #include "context.h"
 #include "convolve.h"
-#include "main/dispatch.h"
 
 
 void GLAPIENTRY
 _mesa_ConvolutionFilter1D(GLenum target, GLenum internalFormat, GLsizei width, 
GLenum format, GLenum type, const GLvoid *image)
 {
GET_CURRENT_CONTEXT(ctx);
 
_mesa_error(ctx, GL_INVALID_OPERATION, "glConvolutionFilter1D");
 }
 
diff --git a/src/mesa/main/debug_output.c b/src/mesa/main/debug_output.c
index 

[Mesa-dev] [Bug 105807] [Regression, bisected]: 3D Rendering not working correctly in Warhammer 40k: Dawn of War II

2018-04-11 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=105807

--- Comment #15 from b...@besd.de  ---
I'm probably too tired, but I think this should work, except it doesn't.


// Little test program to dump supported shader versions

// compiled like this
// gcc test.c -I/usr/include/GL/ -L/usr/lib/x86_64-linux-gnu -lglut -lGLU -lGL
// -lGLEW -o test

#include "glew.h"
#include "glut.h"
#include <stdio.h>
#include 

int main(int argc, char **argv)
{
glutInit(&argc, argv);
glutCreateWindow("GLUT");
glewInit();

printf("OpenGL version supported by this platform (%s): \n",
   glGetString(GL_VERSION));

int num_glsls = 0;
glGetIntegerv(GL_NUM_SHADING_LANGUAGE_VERSIONS, &num_glsls);
// supposed to be at least 3 according to page 617 of the OpenGL 4.5 core spec
// currently seems to be 0
printf("GLSL versions supported by this platform: %d", num_glsls);

// this doesnt do anything even though it should
for (int i = 0; i++; i < num_glsls) {
   printf("OpenGLSL versions supported by this platform (%s): \n",
  glGetStringi(GL_SHADING_LANGUAGE_VERSION, 0));
}
}
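
One likely reason the loop above prints nothing, independent of the driver: a C for statement is written for (init; condition; step), and in the posted loop `i++` sits in the condition slot. A post-increment yields the old value, which is 0 on the first test, so the body never runs whatever num_glsls is. The posted call also passes index 0 to glGetStringi rather than i. The two clause orders can be compared in isolation; the function names below are made up for this sketch.

```c
/* Clause order as posted: the increment is used as the loop condition. */
static int
count_iterations_as_posted(int n)
{
   int count = 0;

   /* i++ evaluates to 0 (false) on the first test, so the loop ends at once.
    * The cast to void only silences the unused-value warning. */
   for (int i = 0; i++; (void)(i < n))
      count++;

   return count;
}

/* Conventional order: test first, then increment. */
static int
count_iterations_fixed(int n)
{
   int count = 0;

   for (int i = 0; i < n; i++)
      count++;

   return count;
}
```

This does not explain why num_glsls itself reads back as 0; it only shows that the loop would stay silent even with a non-zero count.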



Re: [Mesa-dev] [PATCH] blorp: Silence unused function warnings

2018-04-11 Thread Nanley Chery
On Wed, Apr 11, 2018 at 11:24:00AM -0700, Lionel Landwerlin wrote:
> Reviewed-by: Lionel Landwerlin 
> 

Thanks! And pushed.

> On 11/04/18 11:00, Nanley Chery wrote:
> > vulkan/genX_blorp_exec.c:69:1: warning: ‘blorp_get_surface_base_address’ 
> > defined but not used [-Wunused-function]
> >   blorp_get_surface_base_address(struct blorp_batch *batch)
> >   ^~
> > In file included from vulkan/genX_blorp_exec.c:35:0:
> > ./blorp/blorp_genX_exec.h:1249:1: warning: ‘blorp_emit_memcpy’ defined but 
> > not used [-Wunused-function]
> >   blorp_emit_memcpy(struct blorp_batch *batch,
> >   ^
> > genX_blorp_exec.c:99:1: warning: ‘blorp_get_surface_base_address’ defined 
> > but not used [-Wunused-function]
> >   blorp_get_surface_base_address(struct blorp_batch *batch)
> >   ^~
> > In file included from genX_blorp_exec.c:33:0:
> > ../../../../../src/intel/blorp/blorp_genX_exec.h:1249:1: warning: 
> > ‘blorp_emit_memcpy’ defined but not used [-Wunused-function]
> >   blorp_emit_memcpy(struct blorp_batch *batch,
> >   ^
> > ---
> >   src/intel/blorp/blorp_genX_exec.h   | 4 ++--
> >   src/intel/vulkan/genX_blorp_exec.c  | 2 +-
> >   src/mesa/drivers/dri/i965/genX_blorp_exec.c | 2 +-
> >   3 files changed, 4 insertions(+), 4 deletions(-)
> > 
> > diff --git a/src/intel/blorp/blorp_genX_exec.h 
> > b/src/intel/blorp/blorp_genX_exec.h
> > index 7851228d8dc..593521b95cc 100644
> > --- a/src/intel/blorp/blorp_genX_exec.h
> > +++ b/src/intel/blorp/blorp_genX_exec.h
> > @@ -78,7 +78,7 @@ static void
> >   blorp_surface_reloc(struct blorp_batch *batch, uint32_t ss_offset,
> >   struct blorp_address address, uint32_t delta);
> > -#if GEN_GEN >= 7 && GEN_GEN <= 10
> > +#if GEN_GEN >= 7 && GEN_GEN < 10
> >   static struct blorp_address
> >   blorp_get_surface_base_address(struct blorp_batch *batch);
> >   #endif
> > @@ -1244,7 +1244,7 @@ blorp_emit_pipeline(struct blorp_batch *batch,
> >   #endif /* GEN_GEN >= 6 */
> > -#if GEN_GEN >= 7 && GEN_GEN <= 10
> > +#if GEN_GEN >= 7 && GEN_GEN < 10
> >   static void
> >   blorp_emit_memcpy(struct blorp_batch *batch,
> > struct blorp_address dst,
> > diff --git a/src/intel/vulkan/genX_blorp_exec.c 
> > b/src/intel/vulkan/genX_blorp_exec.c
> > index 1ecec199846..b423046d616 100644
> > --- a/src/intel/vulkan/genX_blorp_exec.c
> > +++ b/src/intel/vulkan/genX_blorp_exec.c
> > @@ -64,7 +64,7 @@ blorp_surface_reloc(struct blorp_batch *batch, uint32_t 
> > ss_offset,
> > anv_batch_set_error(&cmd_buffer->batch, result);
> >   }
> > -#if GEN_GEN >= 7 && GEN_GEN <= 10
> > +#if GEN_GEN >= 7 && GEN_GEN < 10
> >   static struct blorp_address
> >   blorp_get_surface_base_address(struct blorp_batch *batch)
> >   {
> > diff --git a/src/mesa/drivers/dri/i965/genX_blorp_exec.c 
> > b/src/mesa/drivers/dri/i965/genX_blorp_exec.c
> > index 3406a6fdec6..b72ca9c515b 100644
> > --- a/src/mesa/drivers/dri/i965/genX_blorp_exec.c
> > +++ b/src/mesa/drivers/dri/i965/genX_blorp_exec.c
> > @@ -94,7 +94,7 @@ blorp_surface_reloc(struct blorp_batch *batch, uint32_t 
> > ss_offset,
> >   #endif
> >   }
> > -#if GEN_GEN >= 7 && GEN_GEN <= 10
> > +#if GEN_GEN >= 7 && GEN_GEN < 10
> >   static struct blorp_address
> >   blorp_get_surface_base_address(struct blorp_batch *batch)
> >   {
> 
> 


[Mesa-dev] [PATCH] radv: fix radv_layout_dcc_compressed() when image doesn't have DCC

2018-04-11 Thread Samuel Pitoiset
num_dcc_levels means that DCC is supported, but this doesn't
mean that it's enabled by the driver. Instead, we should rely
on radv_image_has_dcc().

This fixes some multisample regressions since 0babc8e5d66
("radv: fix picking the method for resolve subpass") on Vega.
This is because the resolve method changed from HW to FS, but
those failures are totally unexpected, so there might be some
differences between Polaris and Vega here.

Fixes: 44fcf587445 ("radv: Disable DCC for GENERAL layout and compute transfer 
dest.")
Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_image.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/amd/vulkan/radv_image.c b/src/amd/vulkan/radv_image.c
index acb569203d4..a14e7c18b29 100644
--- a/src/amd/vulkan/radv_image.c
+++ b/src/amd/vulkan/radv_image.c
@@ -1241,7 +1241,7 @@ bool radv_layout_dcc_compressed(const struct radv_image 
*image,
(queue_mask & (1u << RADV_QUEUE_COMPUTE)))
return false;
 
-   return image->surface.num_dcc_levels > 0 && layout != 
VK_IMAGE_LAYOUT_GENERAL;
+   return radv_image_has_dcc(image) && layout != VK_IMAGE_LAYOUT_GENERAL;
 }
 
 
-- 
2.17.0



Re: [Mesa-dev] [PATCH 04/11] gallium: Use Array._DrawVAO in st_atom_array.c.

2018-04-11 Thread Marek Olšák
On Wed, Apr 11, 2018 at 10:23 AM, Brian Paul  wrote:

> Hmm, in my experience, interleaved arrays are fairly common.
>

In what kinds of apps are they common?

Certainly not in Steam games.

Marek


Re: [Mesa-dev] openCL support on SI

2018-04-11 Thread Jan Vesely
Hi,

if you're interested in mesa/clover, it should mostly work (ocl1.1),
with the notable exception of image support.
If you have an app that doesn't work, file a bug. Progress is slow
because few people care.
Some ocl1.2 features are implemented as well, but it'll take some time
before ocl1.2 can be advertised.

If you're interested in ROCm, talk to AMD.

regards,
Jan

On Wed, 2018-04-11 at 19:18 +0200, zz zz wrote:
> Hi,
> 
> I've been instructed to ask this here:
> 
> Looking at the table from xorg.freedesktop, it says OpenCL for S.Islands
> is WIP.(work in progress)
> 
> Does that mean I can hope it's just a matter of time, or could it still
>  *never* arrive? (say, if the cards 'get old' and the devs drop support?)
> 
> Thanks,
> AZ




[Mesa-dev] [Bug 105807] [Regression, bisected]: 3D Rendering not working correctly in Warhammer 40k: Dawn of War II

2018-04-11 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=105807

--- Comment #14 from b...@besd.de  ---
That would explain why a program gets a zero in return when requesting a shader
language lower than the one hardcoded in comment 11, which is exactly what
happens in comment 8.

Or at least that seems reasonable, doesn't it?



[Mesa-dev] [Bug 105807] [Regression, bisected]: 3D Rendering not working correctly in Warhammer 40k: Dawn of War II

2018-04-11 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=105807

--- Comment #13 from b...@besd.de  ---
from mesa/main/version.c: (the one above is getstring.c)


int
_mesa_get_shading_language_version(const struct gl_context *ctx,
   int index,
   char **versionOut)
{
   int n = 0;

#define GLSL_VERSION(S) \
   if (n++ == index) \
  *versionOut = S

   /* GLSL core */
   if (ctx->Const.GLSLVersion >= 460)
  GLSL_VERSION("460");
   if (ctx->Const.GLSLVersion >= 450)
  GLSL_VERSION("450");
   if (ctx->Const.GLSLVersion >= 440)
  GLSL_VERSION("440");
   if (ctx->Const.GLSLVersion >= 430)
  GLSL_VERSION("430");
   if (ctx->Const.GLSLVersion >= 420)
  GLSL_VERSION("420");
   if (ctx->Const.GLSLVersion >= 410)
  GLSL_VERSION("410");
   if (ctx->Const.GLSLVersion >= 400)
  GLSL_VERSION("400");
   if (ctx->Const.GLSLVersion >= 330)
  GLSL_VERSION("330");
   if (ctx->Const.GLSLVersion >= 150)
  GLSL_VERSION("150");
   if (ctx->Const.GLSLVersion >= 140)
  GLSL_VERSION("140");
   if (ctx->Const.GLSLVersion >= 130)
  GLSL_VERSION("130");
   if (ctx->Const.GLSLVersion >= 120)
  GLSL_VERSION("120");
   /* The GL spec says to return the empty string for GLSL 1.10 */
   if (ctx->Const.GLSLVersion >= 110)
  GLSL_VERSION("");

   /* GLSL es */
   if ((ctx->API == API_OPENGLES2 && ctx->Version >= 32) ||
ctx->Extensions.ARB_ES3_2_compatibility)
  GLSL_VERSION("320 es");
   if (_mesa_is_gles31(ctx) || ctx->Extensions.ARB_ES3_1_compatibility)
  GLSL_VERSION("310 es");
   if (_mesa_is_gles3(ctx) || ctx->Extensions.ARB_ES3_compatibility)
  GLSL_VERSION("300 es");
   if (ctx->API == API_OPENGLES2 || ctx->Extensions.ARB_ES2_compatibility)
  GLSL_VERSION("100");

#undef GLSL_VERSION

   return n;
}
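
To make the counting trick in this function easier to follow, here is a
cut-down sketch. It assumes a hypothetical mock_ctx struct in place of
gl_context and only covers the GLSL 1.10 to 1.50 entries;
mock_get_shading_language_version and dump_versions are illustrative names,
not Mesa functions. Each GLSL_VERSION() use bumps n only when the version
check around it passes, so the return value is the number of supported
versions and index selects one of them, newest first.

```c
#include <stdio.h>

/* Hypothetical stand-in for the one gl_context field this sketch needs. */
struct mock_ctx {
   int glsl_version;   /* 150 means GLSL 1.50 */
};

/*
 * Returns how many version strings the mock context supports. If index is
 * in range, *version_out is set to that entry (newest first); otherwise
 * version_out is never touched, so it may be NULL when only counting.
 */
static int
mock_get_shading_language_version(const struct mock_ctx *ctx, int index,
                                  const char **version_out)
{
   int n = 0;

#define GLSL_VERSION(S) \
   if (n++ == index) \
      *version_out = S

   if (ctx->glsl_version >= 150)
      GLSL_VERSION("150");
   if (ctx->glsl_version >= 140)
      GLSL_VERSION("140");
   if (ctx->glsl_version >= 130)
      GLSL_VERSION("130");
   if (ctx->glsl_version >= 120)
      GLSL_VERSION("120");
   /* GLSL 1.10 is reported as the empty string */
   if (ctx->glsl_version >= 110)
      GLSL_VERSION("");

#undef GLSL_VERSION

   return n;
}

/* Prints every supported version string and returns the count. */
static int
dump_versions(const struct mock_ctx *ctx)
{
   /* index -1 never matches, so this call only counts entries */
   int count = mock_get_shading_language_version(ctx, -1, NULL);

   for (int i = 0; i < count; i++) {
      const char *v = NULL;
      mock_get_shading_language_version(ctx, i, &v);
      printf("%d: \"%s\"\n", i, v);
   }
   return count;
}
```

With glsl_version set to 150 the sketch reports five entries, the last one
being the empty string that stands for GLSL 1.10.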



[Mesa-dev] [Bug 105807] [Regression, bisected]: 3D Rendering not working correctly in Warhammer 40k: Dawn of War II

2018-04-11 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=105807

--- Comment #12 from b...@besd.de  ---
OpenGL core spec 4.5 page 4 specifies:

"The core profile of OpenGL 4.5 is also guaranteed to support all previous
versions of the OpenGL Shading Language back to version 1.40. In some
implementations the core profile may also support earlier versions of the
OpenGL Shading Language, and may support compatibility profile versions of the
OpenGL Shading Language for versions 1.40 and earlier. In this case, errors
will be generated when using language features such as compatibility profile
built-ins not supported by the core profile API. The #version strings for all
supported versions of the OpenGL Shading Language may be queried as described
in section 22.2."

But I think the above version returns 0 when you request a GLSL version that
is lower, rather than a list of supported GLSL versions like the function
below:



[Mesa-dev] [Bug 105807] [Regression, bisected]: 3D Rendering not working correctly in Warhammer 40k: Dawn of War II

2018-04-11 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=105807

--- Comment #11 from b...@besd.de  ---
I think this function is the problem:

/**
 * Return the string for a glGetString(GL_SHADING_LANGUAGE_VERSION) query.
 */
static const GLubyte *
shading_language_version(struct gl_context *ctx)
{
   switch (ctx->API) {
   case API_OPENGL_COMPAT:
   case API_OPENGL_CORE:
  switch (ctx->Const.GLSLVersion) {
  case 120:
 return (const GLubyte *) "1.20";
  case 130:
 return (const GLubyte *) "1.30";
  case 140:
 return (const GLubyte *) "1.40";
  case 150:
 return (const GLubyte *) "1.50";
  case 330:
 return (const GLubyte *) "3.30";
  case 400:
 return (const GLubyte *) "4.00";
  case 410:
 return (const GLubyte *) "4.10";
  case 420:
 return (const GLubyte *) "4.20";
  case 430:
 return (const GLubyte *) "4.30";
  case 440:
 return (const GLubyte *) "4.40";
  case 450:
 return (const GLubyte *) "4.50";
  case 460:
 return (const GLubyte *) "4.60";
  default:
 _mesa_problem(ctx,
   "Invalid GLSL version in shading_language_version()");
 return (const GLubyte *) 0;
  }
  break;

   case API_OPENGLES2:
  switch (ctx->Version) {
  case 20:
 return (const GLubyte *) "OpenGL ES GLSL ES 1.0.16";
  case 30:
 return (const GLubyte *) "OpenGL ES GLSL ES 3.00";
  case 31:
 return (const GLubyte *) "OpenGL ES GLSL ES 3.10";
  case 32:
 return (const GLubyte *) "OpenGL ES GLSL ES 3.20";
  default:
 _mesa_problem(ctx,
   "Invalid OpenGL ES version in shading_language_version()");
 return (const GLubyte *) 0;
  }
   case API_OPENGLES:
  /* fall-through */

   default:
  _mesa_problem(ctx, "Unexpected API value in shading_language_version()");
  return (const GLubyte *) 0;
   }
}
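
The switch above matches exact values, so any GLSLVersion without its own
case (110, for instance) falls into default and comes back as a null pointer.
A cut-down sketch of that behaviour follows. The names
mock_shading_language_version and version_or_placeholder are hypothetical,
only a few of the cases are reproduced, and this is not Mesa code.

```c
#include <stddef.h>

/* Exact-match lookup: values without a case yield NULL, as in the
 * default branch of the function quoted above. */
static const char *
mock_shading_language_version(int glsl_version)
{
   switch (glsl_version) {
   case 120: return "1.20";
   case 130: return "1.30";
   case 140: return "1.40";
   case 150: return "1.50";
   case 330: return "3.30";
   default:  return NULL;
   }
}

/* Caller-side guard so a NULL result is never handed to printf("%s"). */
static const char *
version_or_placeholder(int glsl_version)
{
   const char *s = mock_shading_language_version(glsl_version);
   return s ? s : "(none)";
}
```

A caller that prints the result without such a guard would be passing a null
pointer to a %s conversion, which is worth keeping in mind when reading the
test program earlier in this bug.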



[Mesa-dev] [PATCH 18/18] util: Just cut the hash in the pointer table

2018-04-11 Thread Thomas Helland
Meant for testing. Defeats some of the benefits of the implementation,
however it still seems to be better than the current hash table,
and the complexity is undeniably very low.
---
 src/util/pointer_map.c | 99 +-
 src/util/pointer_map.h |  1 -
 2 files changed, 33 insertions(+), 67 deletions(-)

diff --git a/src/util/pointer_map.c b/src/util/pointer_map.c
index 463fa19282..7632218b91 100644
--- a/src/util/pointer_map.c
+++ b/src/util/pointer_map.c
@@ -39,28 +39,25 @@
 #include "ralloc.h"
 #include "macros.h"
 
-static inline uint8_t
-get_hash(uint8_t *metadata)
-{
-   return *metadata & 0x7F;
-}
+static const uint32_t deleted_key_value;
+static const void *deleted_key = &deleted_key_value;
 
-static inline void
-set_hash(uint8_t *metadata, uint32_t hash)
+static bool
+entry_is_free(const struct map_entry *entry)
 {
-   *metadata = (*metadata & ~0x7F) | (((uint8_t) hash) & 0x7F);
+   return entry->key == NULL;
 }
 
-static inline bool
-entry_is_free(uint8_t *metadata)
+static bool
+entry_is_deleted(const struct pointer_map *pm, struct map_entry *entry)
 {
-   return !(*metadata >> 7);
+   return entry->key == pm->deleted_key;
 }
 
-static inline void
-set_occupied(uint8_t *metadata, bool occupied)
+static bool
+entry_is_present(const struct pointer_map *pm, struct map_entry *entry)
 {
-   *metadata = occupied ? *metadata | 0x80 : *metadata & 0x7F;
+   return entry->key != NULL && entry->key != pm->deleted_key;
 }
 
 static inline uint32_t
@@ -70,15 +67,6 @@ hash_pointer(const void *pointer)
return (uint32_t) ((num >> 2) ^ (num >> 6) ^ (num >> 10) ^ (num >> 14));
 }
 
-static bool
-entry_is_deleted(struct pointer_map *map, uint8_t *metadata)
-{
-   if (get_hash(metadata) != 0)
-  return false;
-
-   return map->map[metadata - map->metadata].key == NULL;
-}
-
 struct pointer_map *
 _mesa_pointer_map_create(void *mem_ctx)
 {
@@ -91,9 +79,9 @@ _mesa_pointer_map_create(void *mem_ctx)
map->size = 1 << 4;
map->max_entries = map->size * 0.6;
map->map = rzalloc_array(map, struct map_entry, map->size);
-   map->metadata = rzalloc_array(map, uint8_t, map->size);
map->entries = 0;
map->deleted_entries = 0;
+   map->deleted_key = deleted_key;
 
if (map->map == NULL) {
   ralloc_free(map);
@@ -113,15 +101,13 @@ _mesa_pointer_map_clone(struct pointer_map *src, void 
*dst_mem_ctx)
memcpy(pm, src, sizeof(struct pointer_map));
 
pm->map = ralloc_array(pm, struct map_entry, pm->size);
-   pm->metadata = ralloc_array(pm, uint8_t, pm->size);
 
-   if (pm->map == NULL || pm->metadata == NULL) {
+   if (pm->map == NULL) {
   ralloc_free(pm);
   return NULL;
}
 
memcpy(pm->map, src->map, pm->size * sizeof(struct map_entry));
-   memcpy(pm->metadata, src->metadata, pm->size * sizeof(uint8_t));
 
return pm;
 }
@@ -154,7 +140,6 @@ _mesa_pointer_map_destroy(struct pointer_map *map,
 void
 _mesa_pointer_map_clear(struct pointer_map *map)
 {
-   memset(map->metadata, 0, map->size * sizeof(uint8_t));
memset(map->map, 0, sizeof(struct map_entry) * map->size);
map->entries = 0;
map->deleted_entries = 0;
@@ -173,15 +158,14 @@ _mesa_pointer_map_search(struct pointer_map *map, const 
void *key)
uint32_t start_hash_address = hash & (map->size - 1);
uint32_t hash_address = start_hash_address;
 
+   struct map_entry *entry = NULL;
do {
-  uint8_t *metadata = map->metadata + hash_address;
+  entry = map->map + hash_address;
 
-  if (entry_is_free(metadata)) {
+  if (entry_is_free(entry)) {
  return NULL;
-  } else if (get_hash(metadata) == (hash & 0x7F)) {
- if (map->map[hash_address].key == key) {
-return &map->map[hash_address];
- }
+  } else if (entry->key == key) {
+ return entry;
   }
 
   hash_address = (hash_address + 1) & (map->size - 1);
@@ -195,7 +179,6 @@ _mesa_pointer_map_rehash(struct pointer_map *map, unsigned 
new_size)
 {
struct pointer_map old_map;
struct map_entry *map_entries, *entry;
-   uint8_t *metadatas;
 
old_map = *map;
 
@@ -206,12 +189,7 @@ _mesa_pointer_map_rehash(struct pointer_map *map, unsigned 
new_size)
if (map_entries == NULL)
   return;
 
-   metadatas = rzalloc_array(map, uint8_t, map->size);
-   if (metadatas == NULL)
-  return;
-
map->map = map_entries;
-   map->metadata = metadatas;
map->entries = 0;
map->deleted_entries = 0;
 
@@ -220,7 +198,6 @@ _mesa_pointer_map_rehash(struct pointer_map *map, unsigned 
new_size)
}
 
ralloc_free(old_map.map);
-   ralloc_free(old_map.metadata);
 }
 
 /**
@@ -232,7 +209,7 @@ struct map_entry *
 _mesa_pointer_map_insert(struct pointer_map *map, const void *key, void *data)
 {
uint32_t start_hash_address, hash_address, hash;
-   uint8_t *available_entry = NULL;
+   struct map_entry *available_entry = NULL;
assert(key != NULL);
 
if (map->entries >= map->max_entries) {
@@ -245,16 +222,17 @@ _mesa_pointer_map_insert(struct 
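
The lookup scheme left after this patch is plain open addressing with linear
probing: a NULL key marks a free slot and the address of a private static
marks a deleted one. A minimal sketch of that idea follows, assuming a fixed
16-slot table with no rehashing. tiny_map and its functions are hypothetical
names for illustration; this is not the code from the series, and keys must be
neither NULL nor the tombstone address.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAP_SIZE 16u   /* power of two; fixed in this sketch, no rehash */

struct entry {
   const void *key;    /* NULL = free, tombstone = deleted */
   void *data;
};

struct tiny_map {
   struct entry slots[MAP_SIZE];
};

/* Only the address matters; it can never equal a caller's key. */
static const uint32_t tombstone_value;
static const void *const tombstone = &tombstone_value;

static uint32_t
hash_pointer(const void *pointer)
{
   uintptr_t num = (uintptr_t) pointer;
   return (uint32_t) ((num >> 2) ^ (num >> 6) ^ (num >> 10) ^ (num >> 14));
}

static struct entry *
tiny_map_search(struct tiny_map *m, const void *key)
{
   uint32_t start = hash_pointer(key) & (MAP_SIZE - 1);
   uint32_t i = start;

   do {
      struct entry *e = &m->slots[i];
      if (e->key == NULL)
         return NULL;            /* a free slot ends the probe sequence */
      if (e->key == key)
         return e;               /* tombstones never match a real key */
      i = (i + 1) & (MAP_SIZE - 1);
   } while (i != start);

   return NULL;
}

static struct entry *
tiny_map_insert(struct tiny_map *m, const void *key, void *data)
{
   uint32_t start = hash_pointer(key) & (MAP_SIZE - 1);
   uint32_t i = start;
   struct entry *available = NULL;

   do {
      struct entry *e = &m->slots[i];
      if (e->key == key) {
         e->data = data;         /* key already present: update in place */
         return e;
      }
      if (e->key == tombstone) {
         if (available == NULL)
            available = e;       /* remember the first reusable slot */
      } else if (e->key == NULL) {
         if (available == NULL)
            available = e;
         break;                  /* key cannot live past a free slot */
      }
      i = (i + 1) & (MAP_SIZE - 1);
   } while (i != start);

   if (available == NULL)
      return NULL;               /* table full; a real map would rehash */

   available->key = key;
   available->data = data;
   return available;
}

static void
tiny_map_remove(struct entry *e)
{
   e->key = tombstone;           /* keep probe chains through this slot */
   e->data = NULL;
}
```

Comparing keys directly is what replaces the 7-bit hash check that the patch
removes: for pointer keys the comparison is a single load and compare, so the
separate metadata array buys little.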

[Mesa-dev] [PATCH 14/18] nir: Migrate lower_vars_to_ssa to use pointer set

2018-04-11 Thread Thomas Helland
---
 src/compiler/nir/nir_lower_vars_to_ssa.c | 35 
 1 file changed, 17 insertions(+), 18 deletions(-)

diff --git a/src/compiler/nir/nir_lower_vars_to_ssa.c 
b/src/compiler/nir/nir_lower_vars_to_ssa.c
index 3dfe48d6d3..988936ece8 100644
--- a/src/compiler/nir/nir_lower_vars_to_ssa.c
+++ b/src/compiler/nir/nir_lower_vars_to_ssa.c
@@ -30,6 +30,7 @@
 #include "nir_phi_builder.h"
 #include "nir_vla.h"
 #include "util/pointer_map.h"
+#include "util/pointer_set.h"
 
 
 struct deref_node {
@@ -45,9 +46,9 @@ struct deref_node {
nir_deref_var *deref;
struct exec_node direct_derefs_link;
 
-   struct set *loads;
-   struct set *stores;
-   struct set *copies;
+   struct pointer_set *loads;
+   struct pointer_set *stores;
+   struct pointer_set *copies;
 
struct nir_phi_builder_value *pb_value;
 
@@ -367,10 +368,9 @@ register_load_instr(nir_intrinsic_instr *load_instr,
   return;
 
if (node->loads == NULL)
-  node->loads = _mesa_set_create(state->dead_ctx, _mesa_hash_pointer,
- _mesa_key_pointer_equal);
+  node->loads = _mesa_pointer_set_create(state->dead_ctx);
 
-   _mesa_set_add(node->loads, load_instr);
+   _mesa_pointer_set_insert(node->loads, load_instr);
 }
 
 static void
@@ -382,10 +382,9 @@ register_store_instr(nir_intrinsic_instr *store_instr,
   return;
 
if (node->stores == NULL)
-  node->stores = _mesa_set_create(state->dead_ctx, _mesa_hash_pointer,
- _mesa_key_pointer_equal);
+  node->stores = _mesa_pointer_set_create(state->dead_ctx);
 
-   _mesa_set_add(node->stores, store_instr);
+   _mesa_pointer_set_insert(node->stores, store_instr);
 }
 
 static void
@@ -400,10 +399,9 @@ register_copy_instr(nir_intrinsic_instr *copy_instr,
  continue;
 
   if (node->copies == NULL)
- node->copies = _mesa_set_create(state->dead_ctx, _mesa_hash_pointer,
- _mesa_key_pointer_equal);
+ node->copies = _mesa_pointer_set_create(state->dead_ctx);
 
-  _mesa_set_add(node->copies, copy_instr);
+  _mesa_pointer_set_insert(node->copies, copy_instr);
}
 }
 
@@ -449,8 +447,8 @@ lower_copies_to_load_store(struct deref_node *node,
if (!node->copies)
   return true;
 
-   struct set_entry *copy_entry;
-   set_foreach(node->copies, copy_entry) {
+   struct pointer_set_entry *copy_entry;
+   _mesa_pointer_set_foreach(node->copies, copy_entry) {
   nir_intrinsic_instr *copy = (void *)copy_entry->key;
 
   nir_lower_var_copy_instr(copy, state->shader);
@@ -463,9 +461,10 @@ lower_copies_to_load_store(struct deref_node *node,
  if (arg_node == NULL || arg_node == node)
 continue;
 
- struct set_entry *arg_entry = _mesa_set_search(arg_node->copies, 
copy);
+ struct pointer_set_entry *arg_entry =
+   _mesa_pointer_set_search(arg_node->copies, copy);
  assert(arg_entry);
- _mesa_set_remove(node->copies, arg_entry);
+ _mesa_pointer_set_remove(node->copies, arg_entry);
   }
 
   nir_instr_remove(&copy->instr);
@@ -713,8 +712,8 @@ nir_lower_vars_to_ssa_impl(nir_function_impl *impl)
   assert(node->deref->var->constant_initializer == NULL);
 
   if (node->stores) {
- struct set_entry *store_entry;
- set_foreach(node->stores, store_entry) {
+ struct pointer_set_entry *store_entry;
+ _mesa_pointer_set_foreach(node->stores, store_entry) {
 nir_intrinsic_instr *store =
(nir_intrinsic_instr *)store_entry->key;
 BITSET_SET(store_blocks, store->instr.block->index);
-- 
2.16.2



[Mesa-dev] [PATCH 11/18] glsl: Use the pointer map in the glsl linker

2018-04-11 Thread Thomas Helland
---
 src/compiler/glsl/linker.cpp | 40 +++-
 1 file changed, 19 insertions(+), 21 deletions(-)

diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
index af09b7d03e..c549cac4b5 100644
--- a/src/compiler/glsl/linker.cpp
+++ b/src/compiler/glsl/linker.cpp
@@ -75,6 +75,7 @@
 #include "program/program.h"
 #include "util/mesa-sha1.h"
 #include "util/set.h"
+#include "util/pointer_map.h"
 #include "string_to_uint_map.h"
 #include "linker.h"
 #include "link_varyings.h"
@@ -1315,11 +1316,11 @@ populate_symbol_table(gl_linked_shader *sh, 
glsl_symbol_table *symbols)
  */
 static void
 remap_variables(ir_instruction *inst, struct gl_linked_shader *target,
-hash_table *temps)
+pointer_map *temps)
 {
class remap_visitor : public ir_hierarchical_visitor {
public:
- remap_visitor(struct gl_linked_shader *target, hash_table *temps)
+ remap_visitor(struct gl_linked_shader *target, pointer_map *temps)
   {
  this->target = target;
  this->symbols = target->symbols;
@@ -1330,7 +1331,7 @@ remap_variables(ir_instruction *inst, struct 
gl_linked_shader *target,
   virtual ir_visitor_status visit(ir_dereference_variable *ir)
   {
  if (ir->var->data.mode == ir_var_temporary) {
-hash_entry *entry = _mesa_hash_table_search(temps, ir->var);
+map_entry *entry = _mesa_pointer_map_search(temps, ir->var);
 ir_variable *var = entry ? (ir_variable *) entry->data : NULL;
 
 assert(var != NULL);
@@ -1357,7 +1358,7 @@ remap_variables(ir_instruction *inst, struct 
gl_linked_shader *target,
   struct gl_linked_shader *target;
   glsl_symbol_table *symbols;
   exec_list *instructions;
-  hash_table *temps;
+  pointer_map *temps;
};
 
remap_visitor v(target, temps);
@@ -1391,11 +1392,10 @@ static exec_node *
 move_non_declarations(exec_list *instructions, exec_node *last,
   bool make_copies, gl_linked_shader *target)
 {
-   hash_table *temps = NULL;
+   pointer_map *temps = NULL;
 
if (make_copies)
-  temps = _mesa_hash_table_create(NULL, _mesa_hash_pointer,
-  _mesa_key_pointer_equal);
+  temps = _mesa_pointer_map_create(NULL);
 
foreach_in_list_safe(ir_instruction, inst, instructions) {
   if (inst->as_function())
@@ -1414,7 +1414,7 @@ move_non_declarations(exec_list *instructions, exec_node 
*last,
  inst = inst->clone(target, NULL);
 
  if (var != NULL)
-_mesa_hash_table_insert(temps, var, inst);
+_mesa_pointer_map_insert(temps, var, inst);
  else
 remap_variables(inst, target, temps);
   } else {
@@ -1426,7 +1426,7 @@ move_non_declarations(exec_list *instructions, exec_node 
*last,
}
 
if (make_copies)
-  _mesa_hash_table_destroy(temps, NULL);
+  _mesa_pointer_map_destroy(temps, NULL);
 
return last;
 }
@@ -1441,14 +1441,13 @@ class array_sizing_visitor : public deref_type_updater {
 public:
array_sizing_visitor()
   : mem_ctx(ralloc_context(NULL)),
-unnamed_interfaces(_mesa_hash_table_create(NULL, _mesa_hash_pointer,
-   _mesa_key_pointer_equal))
+unnamed_interfaces(_mesa_pointer_map_create(NULL))
{
}
 
~array_sizing_visitor()
{
-  _mesa_hash_table_destroy(this->unnamed_interfaces, NULL);
+  _mesa_pointer_map_destroy(this->unnamed_interfaces, NULL);
   ralloc_free(this->mem_ctx);
}
 
@@ -1483,17 +1482,17 @@ public:
  /* Store a pointer to the variable in the unnamed_interfaces
   * hashtable.
   */
- hash_entry *entry =
-   _mesa_hash_table_search(this->unnamed_interfaces,
-   ifc_type);
+ map_entry *entry =
+   _mesa_pointer_map_search(this->unnamed_interfaces,
+ifc_type);
 
  ir_variable **interface_vars = entry ? (ir_variable **) entry->data : 
NULL;
 
  if (interface_vars == NULL) {
 interface_vars = rzalloc_array(mem_ctx, ir_variable *,
ifc_type->length);
-_mesa_hash_table_insert(this->unnamed_interfaces, ifc_type,
-interface_vars);
+_mesa_pointer_map_insert(this->unnamed_interfaces, ifc_type,
+ interface_vars);
  }
  unsigned index = ifc_type->field_index(var->name);
  assert(index < ifc_type->length);
@@ -1511,8 +1510,8 @@ public:
 */
void fixup_unnamed_interface_types()
{
-  hash_table_call_foreach(this->unnamed_interfaces,
-  fixup_unnamed_interface_type, NULL);
+  _mesa_pointer_map_call_foreach(this->unnamed_interfaces,
+ 

[Mesa-dev] [PATCH 12/18] nir: Use pointer map in nir_from_ssa

2018-04-11 Thread Thomas Helland
---
 src/compiler/nir/nir_from_ssa.c | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/src/compiler/nir/nir_from_ssa.c b/src/compiler/nir/nir_from_ssa.c
index 1aa35509b1..e38c4fafd6 100644
--- a/src/compiler/nir/nir_from_ssa.c
+++ b/src/compiler/nir/nir_from_ssa.c
@@ -28,6 +28,7 @@
 #include "nir.h"
 #include "nir_builder.h"
 #include "nir_vla.h"
+#include "util/pointer_map.h"
 
 /*
  * This file implements an out-of-SSA pass as described in "Revisiting
@@ -39,7 +40,7 @@ struct from_ssa_state {
nir_builder builder;
void *dead_ctx;
bool phi_webs_only;
-   struct hash_table *merge_node_table;
+   struct pointer_map *merge_node_map;
nir_instr *instr;
bool progress;
 };
@@ -120,8 +121,8 @@ merge_set_dump(merge_set *set, FILE *fp)
 static merge_node *
 get_merge_node(nir_ssa_def *def, struct from_ssa_state *state)
 {
-   struct hash_entry *entry =
-  _mesa_hash_table_search(state->merge_node_table, def);
+   struct map_entry *entry =
+  _mesa_pointer_map_search(state->merge_node_map, def);
if (entry)
   return entry->data;
 
@@ -135,7 +136,7 @@ get_merge_node(nir_ssa_def *def, struct from_ssa_state 
*state)
node->def = def;
 exec_list_push_head(&set->nodes, &node->node);
 
-   _mesa_hash_table_insert(state->merge_node_table, def, node);
+   _mesa_pointer_map_insert(state->merge_node_map, def, node);
 
return node;
 }
@@ -467,8 +468,8 @@ rewrite_ssa_def(nir_ssa_def *def, void *void_state)
struct from_ssa_state *state = void_state;
nir_register *reg;
 
-   struct hash_entry *entry =
-  _mesa_hash_table_search(state->merge_node_table, def);
+   struct map_entry *entry =
+  _mesa_pointer_map_search(state->merge_node_map, def);
if (entry) {
   /* In this case, we're part of a phi web.  Use the web's register. */
   merge_node *node = (merge_node *)entry->data;
@@ -765,8 +766,7 @@ nir_convert_from_ssa_impl(nir_function_impl *impl, bool 
phi_webs_only)
 nir_builder_init(&state.builder, impl);
state.dead_ctx = ralloc_context(NULL);
state.phi_webs_only = phi_webs_only;
-   state.merge_node_table = _mesa_hash_table_create(NULL, _mesa_hash_pointer,
-_mesa_key_pointer_equal);
+   state.merge_node_map = _mesa_pointer_map_create(NULL);
state.progress = false;
 
nir_foreach_block(block, impl) {
@@ -804,7 +804,7 @@ nir_convert_from_ssa_impl(nir_function_impl *impl, bool 
phi_webs_only)
nir_metadata_dominance);
 
/* Clean up dead instructions and the hash tables */
-   _mesa_hash_table_destroy(state.merge_node_table, NULL);
+   _mesa_pointer_map_destroy(state.merge_node_map, NULL);
ralloc_free(state.dead_ctx);
return state.progress;
 }
-- 
2.16.2



[Mesa-dev] [PATCH 15/18] glsl: Use pointer set in opt_copy_propagation

2018-04-11 Thread Thomas Helland
---
 src/compiler/glsl/opt_copy_propagation.cpp | 47 +-
 1 file changed, 21 insertions(+), 26 deletions(-)

diff --git a/src/compiler/glsl/opt_copy_propagation.cpp 
b/src/compiler/glsl/opt_copy_propagation.cpp
index 7bcd8a090b..0195dc4e40 100644
--- a/src/compiler/glsl/opt_copy_propagation.cpp
+++ b/src/compiler/glsl/opt_copy_propagation.cpp
@@ -38,8 +38,7 @@
 #include "ir_optimization.h"
 #include "compiler/glsl_types.h"
 #include "util/pointer_map.h"
-#include "util/hash_table.h"
-#include "util/set.h"
+#include "util/pointer_set.h"
 
 namespace {
 
@@ -51,8 +50,7 @@ public:
   mem_ctx = ralloc_context(0);
   lin_ctx = linear_alloc_parent(mem_ctx, 0);
   acp = _mesa_pointer_map_create(mem_ctx);
-  kills = _mesa_set_create(mem_ctx, _mesa_hash_pointer,
-   _mesa_key_pointer_equal);
+  kills = _mesa_pointer_set_create(mem_ctx);
   killed_all = false;
}
~ir_copy_propagation_visitor()
@@ -79,7 +77,7 @@ public:
/**
 * Set of ir_variables: Whose values were killed in this block.
 */
-   set *kills;
+   pointer_set *kills;
 
bool progress;
 
@@ -99,18 +97,17 @@ 
ir_copy_propagation_visitor::visit_enter(ir_function_signature *ir)
 * main() at link time, so they're irrelevant to us.
 */
pointer_map *orig_acp = this->acp;
-   set *orig_kills = this->kills;
+   pointer_set *orig_kills = this->kills;
bool orig_killed_all = this->killed_all;
 
acp = _mesa_pointer_map_create(NULL);
-   kills = _mesa_set_create(NULL, _mesa_hash_pointer,
-_mesa_key_pointer_equal);
+   kills = _mesa_pointer_set_create(NULL);
this->killed_all = false;
 
 visit_list_elements(this, &ir->body);
 
_mesa_pointer_map_destroy(acp, NULL);
-   _mesa_set_destroy(kills, NULL);
+   _mesa_pointer_set_destroy(kills, NULL);
 
this->kills = orig_kills;
this->acp = orig_acp;
@@ -209,11 +206,10 @@ void
 ir_copy_propagation_visitor::handle_if_block(exec_list *instructions)
 {
pointer_map *orig_acp = this->acp;
-   set *orig_kills = this->kills;
+   pointer_set *orig_kills = this->kills;
bool orig_killed_all = this->killed_all;
 
-   kills = _mesa_set_create(NULL, _mesa_hash_pointer,
-_mesa_key_pointer_equal);
+   kills = _mesa_pointer_set_create(NULL);
this->killed_all = false;
 
/* Populate the initial acp with a copy of the original */
@@ -225,18 +221,18 @@ ir_copy_propagation_visitor::handle_if_block(exec_list 
*instructions)
   _mesa_pointer_map_clear(orig_acp);
}
 
-   set *new_kills = this->kills;
+   pointer_set *new_kills = this->kills;
this->kills = orig_kills;
_mesa_pointer_map_destroy(acp, NULL);
this->acp = orig_acp;
this->killed_all = this->killed_all || orig_killed_all;
 
-   struct set_entry *s_entry;
-   set_foreach(new_kills, s_entry) {
-  kill((ir_variable *) s_entry->key);
+   struct pointer_set_entry *pse;
+   _mesa_pointer_set_foreach(new_kills, pse) {
+  kill((ir_variable *) pse->key);
}
 
-   _mesa_set_destroy(new_kills, NULL);
+   _mesa_pointer_set_destroy(new_kills, NULL);
 }
 
 ir_visitor_status
@@ -255,11 +251,10 @@ void
 ir_copy_propagation_visitor::handle_loop(ir_loop *ir, bool keep_acp)
 {
pointer_map *orig_acp = this->acp;
-   set *orig_kills = this->kills;
+   pointer_set *orig_kills = this->kills;
bool orig_killed_all = this->killed_all;
 
-   kills = _mesa_set_create(NULL, _mesa_hash_pointer,
-_mesa_key_pointer_equal);
+   kills = _mesa_pointer_set_create(NULL);
this->killed_all = false;
 
if (keep_acp) {
@@ -274,18 +269,18 @@ ir_copy_propagation_visitor::handle_loop(ir_loop *ir, 
bool keep_acp)
   _mesa_pointer_map_clear(orig_acp);
}
 
-   set *new_kills = this->kills;
+   pointer_set *new_kills = this->kills;
this->kills = orig_kills;
_mesa_pointer_map_destroy(acp, NULL);
this->acp = orig_acp;
this->killed_all = this->killed_all || orig_killed_all;
 
-   struct set_entry *entry;
-   set_foreach(new_kills, entry) {
-  kill((ir_variable *) entry->key);
+   struct pointer_set_entry *pse;
+   _mesa_pointer_set_foreach(new_kills, pse) {
+  kill((ir_variable *) pse->key);
}
 
-   _mesa_set_destroy(new_kills, NULL);
+   _mesa_pointer_set_destroy(new_kills, NULL);
 }
 
 ir_visitor_status
@@ -323,7 +318,7 @@ ir_copy_propagation_visitor::kill(ir_variable *var)
}
 
/* Add the LHS variable to the set of killed variables in this block. */
-   _mesa_set_add(kills, var);
+   _mesa_pointer_set_insert(kills, var);
 }
 
 /**
-- 
2.16.2



[Mesa-dev] [PATCH 17/18] nir: Use pointer_set in nir_propagate_invariant

2018-04-11 Thread Thomas Helland
Should cut memory consumption approximately in half, while giving
us better cache locality and a simpler implementation.
---
 src/compiler/nir/nir_propagate_invariant.c | 33 +++---
 1 file changed, 17 insertions(+), 16 deletions(-)

diff --git a/src/compiler/nir/nir_propagate_invariant.c 
b/src/compiler/nir/nir_propagate_invariant.c
index 7b5bd6cce6..bc4c9f2465 100644
--- a/src/compiler/nir/nir_propagate_invariant.c
+++ b/src/compiler/nir/nir_propagate_invariant.c
@@ -22,14 +22,15 @@
  */
 
 #include "nir.h"
+#include "util/pointer_set.h"
 
 static void
-add_src(nir_src *src, struct set *invariants)
+add_src(nir_src *src, struct pointer_set *invariants)
 {
if (src->is_ssa) {
-  _mesa_set_add(invariants, src->ssa);
+  _mesa_pointer_set_insert(invariants, src->ssa);
} else {
-  _mesa_set_add(invariants, src->reg.reg);
+  _mesa_pointer_set_insert(invariants, src->reg.reg);
}
 }
 
@@ -41,17 +42,17 @@ add_src_cb(nir_src *src, void *state)
 }
 
 static bool
-dest_is_invariant(nir_dest *dest, struct set *invariants)
+dest_is_invariant(nir_dest *dest, struct pointer_set *invariants)
 {
if (dest->is_ssa) {
-  return _mesa_set_search(invariants, &dest->ssa);
+  return _mesa_pointer_set_search(invariants, &dest->ssa);
} else {
-  return _mesa_set_search(invariants, dest->reg.reg);
+  return _mesa_pointer_set_search(invariants, dest->reg.reg);
}
 }
 
 static void
-add_cf_node(nir_cf_node *cf, struct set *invariants)
+add_cf_node(nir_cf_node *cf, struct pointer_set *invariants)
 {
if (cf->type == nir_cf_node_if) {
   nir_if *if_stmt = nir_cf_node_as_if(cf);
@@ -63,19 +64,19 @@ add_cf_node(nir_cf_node *cf, struct set *invariants)
 }
 
 static void
-add_var(nir_variable *var, struct set *invariants)
+add_var(nir_variable *var, struct pointer_set *invariants)
 {
-   _mesa_set_add(invariants, var);
+   _mesa_pointer_set_insert(invariants, var);
 }
 
 static bool
-var_is_invariant(nir_variable *var, struct set * invariants)
+var_is_invariant(nir_variable *var, struct pointer_set *invariants)
 {
-   return var->data.invariant || _mesa_set_search(invariants, var);
+   return var->data.invariant || _mesa_pointer_set_search(invariants, var);
 }
 
 static void
-propagate_invariant_instr(nir_instr *instr, struct set *invariants)
+propagate_invariant_instr(nir_instr *instr, struct pointer_set *invariants)
 {
switch (instr->type) {
case nir_instr_type_alu: {
@@ -147,7 +148,8 @@ propagate_invariant_instr(nir_instr *instr, struct set *invariants)
 }
 
 static bool
-propagate_invariant_impl(nir_function_impl *impl, struct set *invariants)
+propagate_invariant_impl(nir_function_impl *impl,
+ struct pointer_set *invariants)
 {
bool progress = false;
 
@@ -181,8 +183,7 @@ bool
 nir_propagate_invariant(nir_shader *shader)
 {
/* Hash set of invariant things */
-   struct set *invariants = _mesa_set_create(NULL, _mesa_hash_pointer,
- _mesa_key_pointer_equal);
+   struct pointer_set *invariants = _mesa_pointer_set_create(NULL);
 
bool progress = false;
nir_foreach_function(function, shader) {
@@ -190,7 +191,7 @@ nir_propagate_invariant(nir_shader *shader)
  progress = true;
}
 
-   _mesa_set_destroy(invariants, NULL);
+   _mesa_pointer_set_destroy(invariants, NULL);
 
return progress;
 }
-- 
2.16.2



[Mesa-dev] [PATCH 13/18] util: Add a pointer set implementation

2018-04-11 Thread Thomas Helland
This is a rework of our set for the common use case of storing pointers.
We currently store the hash, and on lookup compare the hash of the key
to the hash stored for the entry, in addition to comparing the key itself.
Since comparing a pointer is cheap, this means we are doubling the
size of our set in order to do more work, which seems unnecessary. This
patch therefore implements a special case for a pointer set. It uses
power-of-two sized tables, so we can simply use a bitmask instead of a
modulo when fitting the hash to our table, and it uses linear probing
to further improve cache locality.
The goal is to improve cache locality and memory footprint while
reducing both the amount of work done and the complexity.

V2: Use bitmask in pointer set as size is always 2^n
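The design described above can be sketched as follows. This is a simplified illustration, not the code from this patch: the sketch_* names and the fixed SKETCH_SIZE are hypothetical, and resizing and deletion are left out. Only the pointer mixing function is taken from hash_pointer() in the diff below.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_SIZE 16u               /* must be a power of two */
#define SKETCH_MASK (SKETCH_SIZE - 1) /* size - 1 keeps the low bits */

struct sketch_set {
   const void *keys[SKETCH_SIZE];     /* NULL marks a free slot */
   uint32_t entries;
};

/* Same mixing as hash_pointer() in the patch. */
static uint32_t
sketch_hash(const void *pointer)
{
   uintptr_t num = (uintptr_t) pointer;
   return (uint32_t) ((num >> 2) ^ (num >> 6) ^ (num >> 10) ^ (num >> 14));
}

/* Linear probing: walk consecutive slots until the key or a free slot. */
static bool
sketch_search(const struct sketch_set *set, const void *key)
{
   uint32_t start = sketch_hash(key) & SKETCH_MASK; /* mask, not modulo */

   for (uint32_t i = 0; i < SKETCH_SIZE; i++) {
      const void *k = set->keys[(start + i) & SKETCH_MASK];
      if (k == NULL)
         return false;
      if (k == key)
         return true;
   }
   return false;
}

/* Returns false for NULL keys or when the 50% load limit is reached. */
static bool
sketch_insert(struct sketch_set *set, const void *key)
{
   if (key == NULL)
      return false;

   uint32_t start = sketch_hash(key) & SKETCH_MASK;

   for (uint32_t i = 0; i < SKETCH_SIZE; i++) {
      uint32_t idx = (start + i) & SKETCH_MASK;
      if (set->keys[idx] == key)
         return true;                 /* already present */
      if (set->keys[idx] == NULL) {
         if (set->entries >= SKETCH_SIZE / 2)
            return false;             /* a real table would grow here */
         set->keys[idx] = key;
         set->entries++;
         return true;
      }
   }
   return false;
}
```

Because the table never fills beyond half of its slots, every probe sequence is guaranteed to reach a free slot, so a search ends without scanning the whole table.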
---
 src/util/meson.build   |   2 +
 src/util/pointer_set.c | 266 +
 src/util/pointer_set.h |  90 +
 3 files changed, 358 insertions(+)
 create mode 100644 src/util/pointer_set.c
 create mode 100644 src/util/pointer_set.h

diff --git a/src/util/meson.build b/src/util/meson.build
index 9b50647f34..b6f9db5484 100644
--- a/src/util/meson.build
+++ b/src/util/meson.build
@@ -50,6 +50,8 @@ files_mesa_util = files(
   'os_time.h',
   'pointer_map.c',
   'pointer_map.h',
+  'pointer_set.c',
+  'pointer_set.h',
   'sha1/sha1.c',
   'sha1/sha1.h',
   'ralloc.c',
diff --git a/src/util/pointer_set.c b/src/util/pointer_set.c
new file mode 100644
index 00..8d8eff4541
--- /dev/null
+++ b/src/util/pointer_set.c
@@ -0,0 +1,266 @@
+/*
+ * Copyright © 2017 Thomas Helland
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ */
+
+/**
+ * Implements a linear probing set specifically for pointer keys.
+ * It does not store the hash, effectively cutting the size of the set in two.
+ * Some of the spared space is used to reduce load factor to 50%. It uses
+ * linear probing for good cache locality.
+ */
+
+#include 
+#include 
+#include 
+#include 
+
+#include "pointer_set.h"
+#include "ralloc.h"
+#include "macros.h"
+
+static const uint32_t deleted_key_value;
+static const void *deleted_key = &deleted_key_value;
+
+static inline bool
+entry_is_free(struct pointer_set_entry *entry)
+{
+   return entry->key == NULL;
+}
+
+static inline uint32_t
+hash_pointer(const void *pointer)
+{
+   uintptr_t num = (uintptr_t) pointer;
+   return (uint32_t) ((num >> 2) ^ (num >> 6) ^ (num >> 10) ^ (num >> 14));
+}
+
+static inline bool
+entry_is_deleted(struct pointer_set_entry *entry)
+{
+   return entry->key == deleted_key;
+}
+
+static inline bool
+entry_is_present(struct pointer_set_entry *entry)
+{
+   return entry->key != NULL && entry->key != deleted_key;
+}
+
+struct pointer_set *
+_mesa_pointer_set_create(void *mem_ctx)
+{
+   struct pointer_set *set;
+
+   set = ralloc(mem_ctx, struct pointer_set);
+   if (set == NULL)
+  return NULL;
+
+   set->size = 1 << 4;
+   set->max_entries = set->size / 2;
+   set->keys = rzalloc_array(set, struct pointer_set_entry, set->size);
+   set->entries = 0;
+   set->deleted_entries = 0;
+
+   if (set->keys == NULL) {
+  ralloc_free(set);
+  return NULL;
+   }
+
+   return set;
+}
+
+/**
+ * Frees the pointer set.
+ */
+void
+_mesa_pointer_set_destroy(struct pointer_set* set,
+  void (*delete_function)(struct pointer_set_entry *entry))
+{
+   if (!set)
+  return;
+
+   if (delete_function) {
+  struct pointer_set_entry *entry;
+
+  _mesa_pointer_set_foreach(set, entry) {
+ delete_function(entry);
+  }
+   }
+
+   ralloc_free(set);
+}
+
+/**
+ * Finds a set entry with the given key.
+ *
+ * Returns NULL if no entry is found.  Note that the data pointer may be
+ * modified by the user.
+ */
+struct pointer_set_entry *
+_mesa_pointer_set_search(struct pointer_set *set, const void *key)
+{
+   uint32_t hash = hash_pointer(key);
+   uint32_t 

[Mesa-dev] [PATCH 06/18] nir: Change lower_vars_to_ssa to use pointer map

2018-04-11 Thread Thomas Helland
---
 src/compiler/nir/nir_lower_vars_to_ssa.c | 13 ++---
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/src/compiler/nir/nir_lower_vars_to_ssa.c b/src/compiler/nir/nir_lower_vars_to_ssa.c
index e8cfe308d2..3dfe48d6d3 100644
--- a/src/compiler/nir/nir_lower_vars_to_ssa.c
+++ b/src/compiler/nir/nir_lower_vars_to_ssa.c
@@ -29,6 +29,7 @@
 #include "nir_builder.h"
 #include "nir_phi_builder.h"
 #include "nir_vla.h"
+#include "util/pointer_map.h"
 
 
 struct deref_node {
@@ -61,7 +62,7 @@ struct lower_variables_state {
nir_function_impl *impl;
 
/* A hash table mapping variables to deref_node data */
-   struct hash_table *deref_var_nodes;
+   struct pointer_map *deref_var_nodes;
 
/* A hash table mapping fully-qualified direct dereferences, i.e.
 * dereferences with no indirect or wildcard array dereferences, to
@@ -114,14 +115,14 @@ get_deref_node_for_var(nir_variable *var, struct lower_variables_state *state)
 {
struct deref_node *node;
 
-   struct hash_entry *var_entry =
-  _mesa_hash_table_search(state->deref_var_nodes, var);
+   struct map_entry *var_entry =
+  _mesa_pointer_map_search(state->deref_var_nodes, var);
 
if (var_entry) {
   return var_entry->data;
} else {
   node = deref_node_create(NULL, var->type, state->dead_ctx);
-  _mesa_hash_table_insert(state->deref_var_nodes, var, node);
+  _mesa_pointer_map_insert(state->deref_var_nodes, var, node);
   return node;
}
 }
@@ -646,9 +647,7 @@ nir_lower_vars_to_ssa_impl(nir_function_impl *impl)
state.dead_ctx = ralloc_context(state.shader);
state.impl = impl;
 
-   state.deref_var_nodes = _mesa_hash_table_create(state.dead_ctx,
-   _mesa_hash_pointer,
-   _mesa_key_pointer_equal);
+   state.deref_var_nodes = _mesa_pointer_map_create(state.dead_ctx);
exec_list_make_empty(&state.direct_deref_nodes);
 
/* Build the initial deref structures and direct_deref_nodes table */
-- 
2.16.2



[Mesa-dev] [PATCH 09/18] glsl: Change glsl_to_nir to use pointer map

2018-04-11 Thread Thomas Helland
---
 src/compiler/glsl/glsl_to_nir.cpp | 31 +++
 1 file changed, 15 insertions(+), 16 deletions(-)

diff --git a/src/compiler/glsl/glsl_to_nir.cpp b/src/compiler/glsl/glsl_to_nir.cpp
index 80eb15f1ab..310b678680 100644
--- a/src/compiler/glsl/glsl_to_nir.cpp
+++ b/src/compiler/glsl/glsl_to_nir.cpp
@@ -32,6 +32,7 @@
 #include "compiler/nir/nir_control_flow.h"
 #include "compiler/nir/nir_builder.h"
 #include "main/imports.h"
+#include "util/pointer_map.h"
 
 /*
  * pass to lower GLSL IR to NIR
@@ -103,10 +104,10 @@ private:
bool is_global;
 
/* map of ir_variable -> nir_variable */
-   struct hash_table *var_table;
+   struct pointer_map *var_map;
 
/* map of ir_function_signature -> nir_function_overload */
-   struct hash_table *overload_table;
+   struct pointer_map *overload_map;
 };
 
 /*
@@ -191,10 +192,8 @@ nir_visitor::nir_visitor(nir_shader *shader)
this->supports_ints = shader->options->native_integers;
this->shader = shader;
this->is_global = true;
-   this->var_table = _mesa_hash_table_create(NULL, _mesa_hash_pointer,
- _mesa_key_pointer_equal);
-   this->overload_table = _mesa_hash_table_create(NULL, _mesa_hash_pointer,
-  _mesa_key_pointer_equal);
+   this->var_map = _mesa_pointer_map_create(NULL);
+   this->overload_map = _mesa_pointer_map_create(NULL);
this->result = NULL;
this->impl = NULL;
this->var = NULL;
@@ -205,8 +204,8 @@ nir_visitor::nir_visitor(nir_shader *shader)
 
 nir_visitor::~nir_visitor()
 {
-   _mesa_hash_table_destroy(this->var_table, NULL);
-   _mesa_hash_table_destroy(this->overload_table, NULL);
+   _mesa_pointer_map_destroy(this->var_map, NULL);
+   _mesa_pointer_map_destroy(this->overload_map, NULL);
 }
 
 nir_deref_var *
@@ -467,7 +466,7 @@ nir_visitor::visit(ir_variable *ir)
else
   nir_shader_add_variable(shader, var);
 
-   _mesa_hash_table_insert(var_table, ir, var);
+   _mesa_pointer_map_insert(var_map, ir, var);
this->var = var;
 }
 
@@ -491,7 +490,7 @@ nir_visitor::create_function(ir_function_signature *ir)
assert(ir->parameters.is_empty());
assert(ir->return_type == glsl_type::void_type);
 
-   _mesa_hash_table_insert(this->overload_table, ir, func);
+   _mesa_pointer_map_insert(this->overload_map, ir, func);
 }
 
 void
@@ -507,8 +506,8 @@ nir_visitor::visit(ir_function_signature *ir)
if (ir->is_intrinsic())
   return;
 
-   struct hash_entry *entry =
-  _mesa_hash_table_search(this->overload_table, ir);
+   struct map_entry *entry =
+  _mesa_pointer_map_search(this->overload_map, ir);
 
assert(entry);
nir_function *func = (nir_function *) entry->data;
@@ -1231,8 +1230,8 @@ nir_visitor::visit(ir_call *ir)
   return;
}
 
-   struct hash_entry *entry =
-  _mesa_hash_table_search(this->overload_table, ir->callee);
+   struct map_entry *entry =
+  _mesa_pointer_map_search(this->overload_map, ir->callee);
assert(entry);
nir_function *callee = (nir_function *) entry->data;
 
@@ -2174,8 +2173,8 @@ nir_visitor::visit(ir_constant *ir)
 void
 nir_visitor::visit(ir_dereference_variable *ir)
 {
-   struct hash_entry *entry =
-  _mesa_hash_table_search(this->var_table, ir->var);
+   struct map_entry *entry =
+  _mesa_pointer_map_search(this->var_map, ir->var);
assert(entry);
nir_variable *var = (nir_variable *) entry->data;
 
-- 
2.16.2



[Mesa-dev] [PATCH 16/18] nir: Use pointer set in remove_dead_variable

2018-04-11 Thread Thomas Helland
This should simplify things, and cut the memory consumption of the
set effectively in half. Cache locality should also be better.
---
 src/compiler/nir/nir_remove_dead_variables.c | 37 ++--
 1 file changed, 19 insertions(+), 18 deletions(-)

diff --git a/src/compiler/nir/nir_remove_dead_variables.c b/src/compiler/nir/nir_remove_dead_variables.c
index eff66f92d4..ff78fc6c90 100644
--- a/src/compiler/nir/nir_remove_dead_variables.c
+++ b/src/compiler/nir/nir_remove_dead_variables.c
@@ -26,16 +26,17 @@
  */
 
 #include "nir.h"
+#include "util/pointer_set.h"
 
 static void
-add_var_use_intrinsic(nir_intrinsic_instr *instr, struct set *live,
+add_var_use_intrinsic(nir_intrinsic_instr *instr, struct pointer_set *live,
   nir_variable_mode modes)
 {
unsigned num_vars = nir_intrinsic_infos[instr->intrinsic].num_variables;
 
switch (instr->intrinsic) {
case nir_intrinsic_copy_var:
-  _mesa_set_add(live, instr->variables[1]->var);
+  _mesa_pointer_set_insert(live, instr->variables[1]->var);
   /* Fall through */
case nir_intrinsic_store_var: {
   /* The first source in both copy_var and store_var is the destination.
@@ -44,7 +45,7 @@ add_var_use_intrinsic(nir_intrinsic_instr *instr, struct set *live,
*/
   nir_variable_mode mode = instr->variables[0]->var->data.mode;
   if (!(mode & (nir_var_local | nir_var_global | nir_var_shared)))
- _mesa_set_add(live, instr->variables[0]->var);
+ _mesa_pointer_set_insert(live, instr->variables[0]->var);
   break;
}
 
@@ -58,42 +59,42 @@ add_var_use_intrinsic(nir_intrinsic_instr *instr, struct set *live,
 
default:
   for (unsigned i = 0; i < num_vars; i++) {
- _mesa_set_add(live, instr->variables[i]->var);
+ _mesa_pointer_set_insert(live, instr->variables[i]->var);
   }
   break;
}
 }
 
 static void
-add_var_use_call(nir_call_instr *instr, struct set *live)
+add_var_use_call(nir_call_instr *instr, struct pointer_set *live)
 {
if (instr->return_deref != NULL) {
   nir_variable *var = instr->return_deref->var;
-  _mesa_set_add(live, var);
+  _mesa_pointer_set_insert(live, var);
}
 
for (unsigned i = 0; i < instr->num_params; i++) {
   nir_variable *var = instr->params[i]->var;
-  _mesa_set_add(live, var);
+  _mesa_pointer_set_insert(live, var);
}
 }
 
 static void
-add_var_use_tex(nir_tex_instr *instr, struct set *live)
+add_var_use_tex(nir_tex_instr *instr, struct pointer_set *live)
 {
if (instr->texture != NULL) {
   nir_variable *var = instr->texture->var;
-  _mesa_set_add(live, var);
+  _mesa_pointer_set_insert(live, var);
}
 
if (instr->sampler != NULL) {
   nir_variable *var = instr->sampler->var;
-  _mesa_set_add(live, var);
+  _mesa_pointer_set_insert(live, var);
}
 }
 
 static void
-add_var_use_shader(nir_shader *shader, struct set *live, nir_variable_mode modes)
+add_var_use_shader(nir_shader *shader, struct pointer_set *live, nir_variable_mode modes)
 {
nir_foreach_function(function, shader) {
   if (function->impl) {
@@ -123,7 +124,7 @@ add_var_use_shader(nir_shader *shader, struct set *live, nir_variable_mode modes
 }
 
 static void
-remove_dead_var_writes(nir_shader *shader, struct set *live)
+remove_dead_var_writes(nir_shader *shader)
 {
nir_foreach_function(function, shader) {
   if (!function->impl)
@@ -148,12 +149,12 @@ remove_dead_var_writes(nir_shader *shader, struct set *live)
 }
 
 static bool
-remove_dead_vars(struct exec_list *var_list, struct set *live)
+remove_dead_vars(struct exec_list *var_list, struct pointer_set *live)
 {
bool progress = false;
 
foreach_list_typed_safe(nir_variable, var, node, var_list) {
-  struct set_entry *entry = _mesa_set_search(live, var);
+  struct pointer_set_entry *entry = _mesa_pointer_set_search(live, var);
   if (entry == NULL) {
  /* Mark this variable as used by setting the mode to 0 */
  var->data.mode = 0;
@@ -169,8 +170,8 @@ bool
 nir_remove_dead_variables(nir_shader *shader, nir_variable_mode modes)
 {
bool progress = false;
-   struct set *live =
-  _mesa_set_create(NULL, _mesa_hash_pointer, _mesa_key_pointer_equal);
+   struct pointer_set *live =
+  _mesa_pointer_set_create(NULL);
 
add_var_use_shader(shader, live, modes);
 
@@ -202,7 +203,7 @@ nir_remove_dead_variables(nir_shader *shader, nir_variable_mode modes)
}
 
if (progress) {
-  remove_dead_var_writes(shader, live);
+  remove_dead_var_writes(shader);
 
   nir_foreach_function(function, shader) {
  if (function->impl) {
@@ -212,6 +213,6 @@ nir_remove_dead_variables(nir_shader *shader, nir_variable_mode modes)
   }
}
 
-   _mesa_set_destroy(live, NULL);
+   _mesa_pointer_set_destroy(live, NULL);
return progress;
 }
-- 
2.16.2


[Mesa-dev] [PATCH 10/18] util: Add a call_foreach function to the pointer map

2018-04-11 Thread Thomas Helland
---
 src/util/pointer_map.h | 13 +
 1 file changed, 13 insertions(+)

diff --git a/src/util/pointer_map.h b/src/util/pointer_map.h
index 4bfc306a5f..f92e67d40d 100644
--- a/src/util/pointer_map.h
+++ b/src/util/pointer_map.h
@@ -91,6 +91,19 @@ _mesa_pointer_map_next_entry(struct pointer_map *map,
 entry != NULL;  \
 entry = _mesa_pointer_map_next_entry(map, entry))
 
+static inline void
+_mesa_pointer_map_call_foreach(struct pointer_map *pm,
+   void (*callback)(const void *key,
+void *data,
+void *closure),
+   void *closure)
+{
+   struct map_entry *entry;
+
+   _mesa_pointer_map_foreach(pm, entry)
+  callback(entry->key, entry->data, closure);
+}
+
 #ifdef __cplusplus
 } /* extern C */
 #endif
-- 
2.16.2



[Mesa-dev] [PATCH 07/18] glsl: Use pointer map in copy propagation

2018-04-11 Thread Thomas Helland
---
 src/compiler/glsl/opt_copy_propagation.cpp | 48 ++
 1 file changed, 23 insertions(+), 25 deletions(-)

diff --git a/src/compiler/glsl/opt_copy_propagation.cpp b/src/compiler/glsl/opt_copy_propagation.cpp
index 6220aa86da..7bcd8a090b 100644
--- a/src/compiler/glsl/opt_copy_propagation.cpp
+++ b/src/compiler/glsl/opt_copy_propagation.cpp
@@ -37,6 +37,7 @@
 #include "ir_basic_block.h"
 #include "ir_optimization.h"
 #include "compiler/glsl_types.h"
+#include "util/pointer_map.h"
 #include "util/hash_table.h"
 #include "util/set.h"
 
@@ -49,8 +50,7 @@ public:
   progress = false;
   mem_ctx = ralloc_context(0);
   lin_ctx = linear_alloc_parent(mem_ctx, 0);
-  acp = _mesa_hash_table_create(mem_ctx, _mesa_hash_pointer,
-_mesa_key_pointer_equal);
+  acp = _mesa_pointer_map_create(mem_ctx);
   kills = _mesa_set_create(mem_ctx, _mesa_hash_pointer,
_mesa_key_pointer_equal);
   killed_all = false;
@@ -73,8 +73,8 @@ public:
void kill(ir_variable *ir);
void handle_if_block(exec_list *instructions);
 
-   /** Hash of lhs->rhs: The available copies to propagate */
-   hash_table *acp;
+   /** Map of lhs->rhs: The available copies to propagate */
+   pointer_map *acp;
 
/**
 * Set of ir_variables: Whose values were killed in this block.
@@ -98,19 +98,18 @@ ir_copy_propagation_visitor::visit_enter(ir_function_signature *ir)
 * block.  Any instructions at global scope will be shuffled into
 * main() at link time, so they're irrelevant to us.
 */
-   hash_table *orig_acp = this->acp;
+   pointer_map *orig_acp = this->acp;
set *orig_kills = this->kills;
bool orig_killed_all = this->killed_all;
 
-   acp = _mesa_hash_table_create(NULL, _mesa_hash_pointer,
- _mesa_key_pointer_equal);
+   acp = _mesa_pointer_map_create(NULL);
kills = _mesa_set_create(NULL, _mesa_hash_pointer,
 _mesa_key_pointer_equal);
this->killed_all = false;
 
visit_list_elements(this, &ir->body);
 
-   _mesa_hash_table_destroy(acp, NULL);
+   _mesa_pointer_map_destroy(acp, NULL);
_mesa_set_destroy(kills, NULL);
 
this->kills = orig_kills;
@@ -150,7 +149,7 @@ ir_copy_propagation_visitor::visit(ir_dereference_variable *ir)
if (this->in_assignee)
   return visit_continue;
 
-   struct hash_entry *entry = _mesa_hash_table_search(acp, ir->var);
+   struct map_entry *entry = _mesa_pointer_map_search(acp, ir->var);
if (entry) {
   ir->var = (ir_variable *) entry->data;
   progress = true;
@@ -185,7 +184,7 @@ ir_copy_propagation_visitor::visit_enter(ir_call *ir)
 * and out parameters).
 */
if (!ir->callee->is_intrinsic()) {
-  _mesa_hash_table_clear(acp, NULL);
+  _mesa_pointer_map_clear(acp);
   this->killed_all = true;
} else {
   if (ir->return_deref)
@@ -209,7 +208,7 @@ ir_copy_propagation_visitor::visit_enter(ir_call *ir)
 void
 ir_copy_propagation_visitor::handle_if_block(exec_list *instructions)
 {
-   hash_table *orig_acp = this->acp;
+   pointer_map *orig_acp = this->acp;
set *orig_kills = this->kills;
bool orig_killed_all = this->killed_all;
 
@@ -218,17 +217,17 @@ ir_copy_propagation_visitor::handle_if_block(exec_list *instructions)
this->killed_all = false;
 
/* Populate the initial acp with a copy of the original */
-   acp = _mesa_hash_table_clone(orig_acp, NULL);
+   acp = _mesa_pointer_map_clone(orig_acp, NULL);
 
visit_list_elements(this, instructions);
 
if (this->killed_all) {
-  _mesa_hash_table_clear(orig_acp, NULL);
+  _mesa_pointer_map_clear(orig_acp);
}
 
set *new_kills = this->kills;
this->kills = orig_kills;
-   _mesa_hash_table_destroy(acp, NULL);
+   _mesa_pointer_map_destroy(acp, NULL);
this->acp = orig_acp;
this->killed_all = this->killed_all || orig_killed_all;
 
@@ -255,7 +254,7 @@ ir_copy_propagation_visitor::visit_enter(ir_if *ir)
 void
 ir_copy_propagation_visitor::handle_loop(ir_loop *ir, bool keep_acp)
 {
-   hash_table *orig_acp = this->acp;
+   pointer_map *orig_acp = this->acp;
set *orig_kills = this->kills;
bool orig_killed_all = this->killed_all;
 
@@ -264,21 +263,20 @@ ir_copy_propagation_visitor::handle_loop(ir_loop *ir, bool keep_acp)
this->killed_all = false;
 
if (keep_acp) {
-  acp = _mesa_hash_table_clone(orig_acp, NULL);
+  acp = _mesa_pointer_map_clone(orig_acp, NULL);
} else {
-  acp = _mesa_hash_table_create(NULL, _mesa_hash_pointer,
-_mesa_key_pointer_equal);
+  acp = _mesa_pointer_map_create(NULL);
}
 
visit_list_elements(this, &ir->body_instructions);
 
if (this->killed_all) {
-  _mesa_hash_table_clear(orig_acp, NULL);
+  _mesa_pointer_map_clear(orig_acp);
}
 
set *new_kills = this->kills;
this->kills = orig_kills;
-   _mesa_hash_table_destroy(acp, NULL);
+   

[Mesa-dev] [PATCH 08/18] glsl: Use pointer map in opt_constant_variable

2018-04-11 Thread Thomas Helland
---
 src/compiler/glsl/opt_constant_variable.cpp | 34 ++---
 1 file changed, 17 insertions(+), 17 deletions(-)

diff --git a/src/compiler/glsl/opt_constant_variable.cpp b/src/compiler/glsl/opt_constant_variable.cpp
index 914b46004c..d1d315af7a 100644
--- a/src/compiler/glsl/opt_constant_variable.cpp
+++ b/src/compiler/glsl/opt_constant_variable.cpp
@@ -37,6 +37,7 @@
 #include "ir_optimization.h"
 #include "compiler/glsl_types.h"
 #include "util/hash_table.h"
+#include "util/pointer_map.h"
 
 namespace {
 
@@ -54,23 +55,23 @@ public:
virtual ir_visitor_status visit_enter(ir_assignment *);
virtual ir_visitor_status visit_enter(ir_call *);
 
-   struct hash_table *ht;
+   struct pointer_map *map;
 };
 
 } /* unnamed namespace */
 
 static struct assignment_entry *
-get_assignment_entry(ir_variable *var, struct hash_table *ht)
+get_assignment_entry(ir_variable *var, struct pointer_map *map)
 {
-   struct hash_entry *hte = _mesa_hash_table_search(ht, var);
+   struct map_entry *me = _mesa_pointer_map_search(map, var);
struct assignment_entry *entry;
 
-   if (hte) {
-  entry = (struct assignment_entry *) hte->data;
+   if (me) {
+  entry = (struct assignment_entry *) me->data;
} else {
   entry = (struct assignment_entry *) calloc(1, sizeof(*entry));
   entry->var = var;
-  _mesa_hash_table_insert(ht, var, entry);
+  _mesa_pointer_map_insert(map, var, entry);
}
 
return entry;
@@ -79,7 +80,7 @@ get_assignment_entry(ir_variable *var, struct hash_table *ht)
 ir_visitor_status
 ir_constant_variable_visitor::visit(ir_variable *ir)
 {
-   struct assignment_entry *entry = get_assignment_entry(ir, this->ht);
+   struct assignment_entry *entry = get_assignment_entry(ir, this->map);
entry->our_scope = true;
return visit_continue;
 }
@@ -98,7 +99,7 @@ ir_constant_variable_visitor::visit_enter(ir_assignment *ir)
ir_constant *constval;
struct assignment_entry *entry;
 
-   entry = get_assignment_entry(ir->lhs->variable_referenced(), this->ht);
+   entry = get_assignment_entry(ir->lhs->variable_referenced(), this->map);
assert(entry);
entry->assignment_count++;
 
@@ -159,7 +160,7 @@ ir_constant_variable_visitor::visit_enter(ir_call *ir)
 struct assignment_entry *entry;
 
 assert(var);
-entry = get_assignment_entry(var, this->ht);
+entry = get_assignment_entry(var, this->map);
 entry->assignment_count++;
   }
}
@@ -170,7 +171,7 @@ ir_constant_variable_visitor::visit_enter(ir_call *ir)
   struct assignment_entry *entry;
 
   assert(var);
-  entry = get_assignment_entry(var, this->ht);
+  entry = get_assignment_entry(var, this->map);
   entry->assignment_count++;
}
 
@@ -186,22 +187,21 @@ do_constant_variable(exec_list *instructions)
bool progress = false;
ir_constant_variable_visitor v;
 
-   v.ht = _mesa_hash_table_create(NULL, _mesa_hash_pointer,
-  _mesa_key_pointer_equal);
+   v.map = _mesa_pointer_map_create(NULL);
v.run(instructions);
 
-   struct hash_entry *hte;
-   hash_table_foreach(v.ht, hte) {
-  struct assignment_entry *entry = (struct assignment_entry *) hte->data;
+   struct map_entry *me;
+   _mesa_pointer_map_foreach(v.map, me) {
+  struct assignment_entry *entry = (struct assignment_entry *) me->data;
 
   if (entry->assignment_count == 1 && entry->constval && entry->our_scope) {
 entry->var->constant_value = entry->constval;
 progress = true;
   }
-  hte->data = NULL;
+  me->data = NULL;
   free(entry);
}
-   _mesa_hash_table_destroy(v.ht, NULL);
+   _mesa_pointer_map_destroy(v.map, NULL);
 
return progress;
 }
-- 
2.16.2



[Mesa-dev] [PATCH 03/18] util: Add a pointer map clone function

2018-04-11 Thread Thomas Helland
---
 src/util/pointer_map.c | 23 +++
 src/util/pointer_map.h |  3 +++
 2 files changed, 26 insertions(+)

diff --git a/src/util/pointer_map.c b/src/util/pointer_map.c
index 8076bd827f..463fa19282 100644
--- a/src/util/pointer_map.c
+++ b/src/util/pointer_map.c
@@ -102,6 +102,29 @@ _mesa_pointer_map_create(void *mem_ctx)
 
return map;
 }
+struct pointer_map *
+_mesa_pointer_map_clone(struct pointer_map *src, void *dst_mem_ctx)
+{
+   struct pointer_map *pm = ralloc(dst_mem_ctx, struct pointer_map);
+
+   if (pm == NULL)
+  return NULL;
+
+   memcpy(pm, src, sizeof(struct pointer_map));
+
+   pm->map = ralloc_array(pm, struct map_entry, pm->size);
+   pm->metadata = ralloc_array(pm, uint8_t, pm->size);
+
+   if (pm->map == NULL || pm->metadata == NULL) {
+  ralloc_free(pm);
+  return NULL;
+   }
+
+   memcpy(pm->map, src->map, pm->size * sizeof(struct map_entry));
+   memcpy(pm->metadata, src->metadata, pm->size * sizeof(uint8_t));
+
+   return pm;
+}
 
 /**
  * Frees the pointer map.
diff --git a/src/util/pointer_map.h b/src/util/pointer_map.h
index e1cef418d8..4bfc306a5f 100644
--- a/src/util/pointer_map.h
+++ b/src/util/pointer_map.h
@@ -55,6 +55,9 @@ struct pointer_map {
 struct pointer_map *
 _mesa_pointer_map_create(void *mem_ctx);
 
+struct pointer_map *
+_mesa_pointer_map_clone(struct pointer_map *, void *dst_mem_ctx);
+
 void _mesa_pointer_map_destroy(struct pointer_map *map,
void (*delete_function)(struct map_entry *entry));
 
-- 
2.16.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 01/18] util: Add initial pointer map implementation

2018-04-11 Thread Thomas Helland
The motivation is that, for the common case of pointers as keys, the
current hash table implementation has several disadvantages.
It stores the hash, which uses more memory than is strictly
necessary. When searching, it compares both the hash and the
pointer against the key, when simply comparing the pointer is
enough and just as cheap. It also has a very cache-unfriendly
reprobing algorithm.

This implementation addresses all of these issues, and more.
It uses a table of size 2^n, so we can simply mask off bits
instead of computing an expensive modulo when inserting into or
searching the table. It also uses linear probing for cache locality,
which has the nice side effect that the CPU should be more likely to
be able to do speculative execution. To further improve cache locality
it borrows a trick from the talk "Designing a Fast, Efficient,
Cache-friendly Hash Table, Step by Step", given by Matt Kulukundis
at CppCon 2017: it stores the metadata separately from the
stored data. It allocates one byte per entry, uses 7 bits to store
the lower bits of the hash, and uses the last bit to indicate whether
the slot is empty. The net result is a space saving of 7/24ths, along
with much improved cache friendliness. This could be improved further
by using SSE instructions to process many entries at a time,
but I found that to be too platform specific, so I left it out.
One can argue about whether the cache penalty of storing the hash in a
separate array, and having to touch another cache line to fetch the
key, outweighs the gain from reduced memory usage.
I should probably swap this implementation for one that just
removes the storage of the hash, and see how that fares. That
would be similar to what is done for the set later in this series.
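The one-byte-per-entry metadata scheme can be sketched like this. It is a hedged illustration rather than the code in pointer_map.c: the meta_* helpers and macro names are hypothetical, and the polarity of the top bit (set means free) is an assumption, since the message only says that one bit records whether the slot is empty.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define META_FREE_BIT  0x80u /* assumption: a set top bit marks a free slot */
#define META_HASH_MASK 0x7Fu /* low seven bits cache part of the hash */

/* Metadata byte for an occupied slot: free bit clear, 7 hash bits kept. */
static inline uint8_t
meta_make_occupied(uint32_t hash)
{
   return (uint8_t) (hash & META_HASH_MASK);
}

static inline bool
meta_is_free(uint8_t meta)
{
   return (meta & META_FREE_BIT) != 0;
}

static inline uint8_t
meta_hash(uint8_t meta)
{
   return meta & META_HASH_MASK;
}

/* True when the slot may hold a key with this hash. Only in that case
 * does a lookup need to touch the separate key/data array at all. */
static inline bool
meta_may_match(uint8_t meta, uint32_t hash)
{
   return !meta_is_free(meta) && meta_hash(meta) == (hash & META_HASH_MASK);
}
```

A lookup walks the dense metadata array first and only compares the full pointer key when meta_may_match() reports a candidate. On a 64-bit target, an entry holding a hash, a key and a data pointer takes 24 bytes including padding, while key plus data plus one metadata byte takes 17, which is where the 7/24 figure comes from.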

V2: Use bitmask instead of modulo as map is always size 2^n

V3: Use some of the saved space to lower the load factor in map

This will reduce the length of clusters, giving us shorter
probe sequences on both insertion and search. It should not hurt
cache locality, as the only change is that we find a free slot
sooner, at which point we are done with the loop. The only
case where this hurts us is when iterating over the hash
table, as there are more slots to walk.
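To put rough numbers on the effect of a lower load factor, the sketch below evaluates Knuth's classic estimates for linear probing. It assumes a well-distributed hash. The function names are hypothetical, and the 0.7 value used for comparison is only illustrative, not the load factor of any existing Mesa table.

```c
#include <assert.h>

/* Expected number of slots examined by a successful search at load
 * factor alpha (0 <= alpha < 1): (1 + 1 / (1 - alpha)) / 2. */
static double
probes_hit(double alpha)
{
   return 0.5 * (1.0 + 1.0 / (1.0 - alpha));
}

/* Expected number of slots examined by an unsuccessful search or an
 * insertion: (1 + 1 / (1 - alpha)^2) / 2. */
static double
probes_miss(double alpha)
{
   double d = 1.0 - alpha;
   return 0.5 * (1.0 + 1.0 / (d * d));
}
```

At a load factor of 0.5 these estimates give 1.5 probes for a hit and 2.5 for a miss. At 0.7 a miss already costs about 6 probes, which illustrates why spending some of the saved space on extra slots shortens probe sequences.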
---
 src/util/meson.build   |   2 +
 src/util/pointer_map.c | 323 +
 src/util/pointer_map.h |  95 +++
 3 files changed, 420 insertions(+)
 create mode 100644 src/util/pointer_map.c
 create mode 100644 src/util/pointer_map.h

diff --git a/src/util/meson.build b/src/util/meson.build
index eece1cefef..9b50647f34 100644
--- a/src/util/meson.build
+++ b/src/util/meson.build
@@ -48,6 +48,8 @@ files_mesa_util = files(
   'mesa-sha1.h',
   'os_time.c',
   'os_time.h',
+  'pointer_map.c',
+  'pointer_map.h',
   'sha1/sha1.c',
   'sha1/sha1.h',
   'ralloc.c',
diff --git a/src/util/pointer_map.c b/src/util/pointer_map.c
new file mode 100644
index 00..8076bd827f
--- /dev/null
+++ b/src/util/pointer_map.c
@@ -0,0 +1,323 @@
+/*
+ * Copyright © 2017 Thomas Helland
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ */
+
+/**
+ * Implements a linear probing hash table specifically for pointer keys.
+ * It uses a separate metadata array for good cache locality when searching.
+ * The metadata array is an array of bytes, where the seven LSB stores a hash,
+ * and the first bit stores whether the entry is free. An important detail is
+ * that the bit being
+ */
+
+#include 
+#include 
+#include 
+#include 
+
+#include "pointer_map.h"
+#include "ralloc.h"
+#include "macros.h"
+
+static inline uint8_t
+get_hash(uint8_t *metadata)
+{
+   return *metadata & 0x7F;
+}
+
+static inline void
+set_hash(uint8_t *metadata, uint32_t hash)
+{
+   *metadata = (*metadata & ~0x7F) | (((uint8_t) hash) & 

[Mesa-dev] [PATCH 05/18] glsl: Move ir_variable_refcount to using the pointer map

2018-04-11 Thread Thomas Helland
---
 src/compiler/glsl/ir_variable_refcount.cpp | 13 ++---
 src/compiler/glsl/ir_variable_refcount.h   |  4 ++--
 src/compiler/glsl/opt_dead_code.cpp|  6 +++---
 3 files changed, 11 insertions(+), 12 deletions(-)

diff --git a/src/compiler/glsl/ir_variable_refcount.cpp 
b/src/compiler/glsl/ir_variable_refcount.cpp
index 8306be10b9..c5bef9efbf 100644
--- a/src/compiler/glsl/ir_variable_refcount.cpp
+++ b/src/compiler/glsl/ir_variable_refcount.cpp
@@ -33,17 +33,16 @@
 #include "ir_visitor.h"
 #include "ir_variable_refcount.h"
 #include "compiler/glsl_types.h"
-#include "util/hash_table.h"
+#include "util/pointer_map.h"
 
 ir_variable_refcount_visitor::ir_variable_refcount_visitor()
 {
this->mem_ctx = ralloc_context(NULL);
-   this->ht = _mesa_hash_table_create(NULL, _mesa_hash_pointer,
-  _mesa_key_pointer_equal);
+   this->pm = _mesa_pointer_map_create(NULL);
 }
 
 static void
-free_entry(struct hash_entry *entry)
+free_entry(struct map_entry *entry)
 {
ir_variable_refcount_entry *ivre = (ir_variable_refcount_entry *) 
entry->data;
 
@@ -61,7 +60,7 @@ free_entry(struct hash_entry *entry)
 ir_variable_refcount_visitor::~ir_variable_refcount_visitor()
 {
ralloc_free(this->mem_ctx);
-   _mesa_hash_table_destroy(this->ht, free_entry);
+   _mesa_pointer_map_destroy(this->pm, free_entry);
 }
 
 // constructor
@@ -79,13 +78,13 @@ 
ir_variable_refcount_visitor::get_variable_entry(ir_variable *var)
 {
assert(var);
 
-   struct hash_entry *e = _mesa_hash_table_search(this->ht, var);
+   struct map_entry *e = _mesa_pointer_map_search(this->pm, var);
if (e)
   return (ir_variable_refcount_entry *)e->data;
 
ir_variable_refcount_entry *entry = new ir_variable_refcount_entry(var);
assert(entry->referenced_count == 0);
-   _mesa_hash_table_insert(this->ht, var, entry);
+   _mesa_pointer_map_insert(this->pm, var, entry);
 
return entry;
 }
diff --git a/src/compiler/glsl/ir_variable_refcount.h 
b/src/compiler/glsl/ir_variable_refcount.h
index 4a90f08c91..270bef7ecd 100644
--- a/src/compiler/glsl/ir_variable_refcount.h
+++ b/src/compiler/glsl/ir_variable_refcount.h
@@ -81,9 +81,9 @@ public:
ir_variable_refcount_entry *get_variable_entry(ir_variable *var);
 
/**
-* Hash table mapping ir_variable to ir_variable_refcount_entry.
+* Pointer map mapping ir_variable to ir_variable_refcount_entry.
 */
-   struct hash_table *ht;
+   struct pointer_map *pm;
 
void *mem_ctx;
 };
diff --git a/src/compiler/glsl/opt_dead_code.cpp 
b/src/compiler/glsl/opt_dead_code.cpp
index 75e668ae46..78247d7f4c 100644
--- a/src/compiler/glsl/opt_dead_code.cpp
+++ b/src/compiler/glsl/opt_dead_code.cpp
@@ -31,7 +31,7 @@
 #include "ir_visitor.h"
 #include "ir_variable_refcount.h"
 #include "compiler/glsl_types.h"
-#include "util/hash_table.h"
+#include "util/pointer_map.h"
 
 static bool debug = false;
 
@@ -50,8 +50,8 @@ do_dead_code(exec_list *instructions, bool 
uniform_locations_assigned)
 
v.run(instructions);
 
-   struct hash_entry *e;
-   hash_table_foreach(v.ht, e) {
+   struct map_entry *e;
+   _mesa_pointer_map_foreach(v.pm, e) {
   ir_variable_refcount_entry *entry = (ir_variable_refcount_entry 
*)e->data;
 
   /* Since each assignment is a reference, the refereneced count must be
-- 
2.16.2



[Mesa-dev] [PATCH 02/18] glsl: Use pointer map in constant propagation

2018-04-11 Thread Thomas Helland
---
 src/compiler/glsl/opt_constant_propagation.cpp | 47 --
 1 file changed, 22 insertions(+), 25 deletions(-)

diff --git a/src/compiler/glsl/opt_constant_propagation.cpp 
b/src/compiler/glsl/opt_constant_propagation.cpp
index 05dc71efb7..8072bf4811 100644
--- a/src/compiler/glsl/opt_constant_propagation.cpp
+++ b/src/compiler/glsl/opt_constant_propagation.cpp
@@ -41,6 +41,7 @@
 #include "ir_optimization.h"
 #include "compiler/glsl_types.h"
 #include "util/hash_table.h"
+#include "util/pointer_map.h"
 
 namespace {
 
@@ -103,8 +104,7 @@ public:
   mem_ctx = ralloc_context(0);
   this->lin_ctx = linear_alloc_parent(this->mem_ctx, 0);
   this->acp = new(mem_ctx) exec_list;
-  this->kills = _mesa_hash_table_create(mem_ctx, _mesa_hash_pointer,
-_mesa_key_pointer_equal);
+  this->kills = _mesa_pointer_map_create(mem_ctx);
}
~ir_constant_propagation_visitor()
{
@@ -129,10 +129,10 @@ public:
exec_list *acp;
 
/**
-* Hash table of kill_entry: The masks of variables whose values were
+* Pointer map of kill_entry: The masks of variables whose values were
 * killed in this block.
 */
-   hash_table *kills;
+   pointer_map *kills;
 
bool progress;
 
@@ -269,12 +269,11 @@ 
ir_constant_propagation_visitor::visit_enter(ir_function_signature *ir)
 * main() at link time, so they're irrelevant to us.
 */
exec_list *orig_acp = this->acp;
-   hash_table *orig_kills = this->kills;
+   pointer_map *orig_kills = this->kills;
bool orig_killed_all = this->killed_all;
 
this->acp = new(mem_ctx) exec_list;
-   this->kills = _mesa_hash_table_create(mem_ctx, _mesa_hash_pointer,
- _mesa_key_pointer_equal);
+   this->kills = _mesa_pointer_map_create(mem_ctx);
this->killed_all = false;
 
visit_list_elements(this, &ir->body);
@@ -359,12 +358,11 @@ void
 ir_constant_propagation_visitor::handle_if_block(exec_list *instructions)
 {
exec_list *orig_acp = this->acp;
-   hash_table *orig_kills = this->kills;
+   pointer_map *orig_kills = this->kills;
bool orig_killed_all = this->killed_all;
 
this->acp = new(mem_ctx) exec_list;
-   this->kills = _mesa_hash_table_create(mem_ctx, _mesa_hash_pointer,
- _mesa_key_pointer_equal);
+   this->kills = _mesa_pointer_map_create(mem_ctx);
this->killed_all = false;
 
/* Populate the initial acp with a constant of the original */
@@ -378,14 +376,14 @@ 
ir_constant_propagation_visitor::handle_if_block(exec_list *instructions)
   orig_acp->make_empty();
}
 
-   hash_table *new_kills = this->kills;
+   pointer_map *new_kills = this->kills;
this->kills = orig_kills;
this->acp = orig_acp;
this->killed_all = this->killed_all || orig_killed_all;
 
-   hash_entry *htk;
-   hash_table_foreach(new_kills, htk) {
-  kill_entry *k = (kill_entry *) htk->data;
+   map_entry *me;
+   _mesa_pointer_map_foreach(new_kills, me) {
+  kill_entry *k = (kill_entry *) me->data;
   kill(k->var, k->write_mask);
}
 }
@@ -407,7 +405,7 @@ ir_visitor_status
 ir_constant_propagation_visitor::visit_enter(ir_loop *ir)
 {
exec_list *orig_acp = this->acp;
-   hash_table *orig_kills = this->kills;
+   pointer_map *orig_kills = this->kills;
bool orig_killed_all = this->killed_all;
 
/* FINISHME: For now, the initial acp for loops is totally empty.
@@ -415,8 +413,7 @@ ir_constant_propagation_visitor::visit_enter(ir_loop *ir)
 * cloned minus the killed entries after the first run through.
 */
this->acp = new(mem_ctx) exec_list;
-   this->kills = _mesa_hash_table_create(mem_ctx, _mesa_hash_pointer,
- _mesa_key_pointer_equal);
+   this->kills = _mesa_pointer_map_create(mem_ctx);
this->killed_all = false;
 
visit_list_elements(this, &ir->body_instructions);
@@ -425,14 +422,14 @@ ir_constant_propagation_visitor::visit_enter(ir_loop *ir)
   orig_acp->make_empty();
}
 
-   hash_table *new_kills = this->kills;
+   pointer_map *new_kills = this->kills;
this->kills = orig_kills;
this->acp = orig_acp;
this->killed_all = this->killed_all || orig_killed_all;
 
-   hash_entry *htk;
-   hash_table_foreach(new_kills, htk) {
-  kill_entry *k = (kill_entry *) htk->data;
+   map_entry *me;
+   _mesa_pointer_map_foreach(new_kills, me) {
+  kill_entry *k = (kill_entry *) me->data;
   kill(k->var, k->write_mask);
}
 
@@ -461,14 +458,14 @@ ir_constant_propagation_visitor::kill(ir_variable *var, 
unsigned write_mask)
/* Add this writemask of the variable to the hash table of killed
 * variables in this block.
 */
-   hash_entry *kill_hash_entry = _mesa_hash_table_search(this->kills, var);
-   if (kill_hash_entry) {
-  kill_entry *entry = (kill_entry *) kill_hash_entry->data;
+   map_entry *kill_map_entry = _mesa_pointer_map_search(this->kills, var);
+   if 

[Mesa-dev] [PATCH 00/18] [RFC] Pointer specific data structures

2018-04-11 Thread Thomas Helland
This series came about when I saw a talk online, while simultaneously
being annoyed about the needless waste of memory in our set as reported
by pahole. I have previously made some patches that changed our hash
table from a reprobing one to a quadratic probing one, in the name of
lower overhead and better cache locality, but I was not quite satisfied.

I'm sending this series out now, as it seems like an ideal time since
Timothy is working on reducing our compile times. Further details about 
the implementation and its advantages are described in the patches.
I've found this to give a reduction in shader-db runtime of about 2%,
but I have to do some more testing on my main computer, as my laptop
is showing its age with some terrible thermal issues.

This special-cases pointers, as that is a very common use case.
This allows us to drop some comparisons, and reduce the total size
of our hash table to 70% of our current one and the set to 50%. It uses 
linear probing and power-of-two table sizes to get good cache locality. 
In the pointer_map case it moves the stored hashes out into its own 
array for even better cache locality.
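
The size figures can be checked with a little arithmetic. The sketch
below is not taken from the series; it assumes a 64-bit target, that the
existing generic entries store a 32-bit hash next to the pointers, that
the pointer map keeps key and data per slot plus one metadata byte in a
separate array, and that the pointer set keeps only the key. All
sketch_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout of the existing generic entries. */
struct sketch_hash_entry { uint32_t hash; const void *key; void *data; };
struct sketch_set_entry  { uint32_t hash; const void *key; };

/* Assumed pointer map slot; the metadata byte lives in its own array. */
struct sketch_map_entry  { const void *key; void *data; };

/* Bytes used per slot by the pointer map, metadata byte included. */
static inline size_t
sketch_map_bytes_per_slot(void)
{
   return sizeof(struct sketch_map_entry) + sizeof(uint8_t);
}

/* Bytes used per slot by a pointer set that stores only the key. */
static inline size_t
sketch_set_bytes_per_slot(void)
{
   return sizeof(const void *);
}
```

On a typical 64-bit ABI this gives 17 bytes against 24 for the map
(about 70%, a saving of 7/24) and 8 bytes against 16 for the set (50%),
since padding after the 32-bit hash rounds the old entries up.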

I'm not sure if we want another set and map amongst our utils,
but the patch series is simple enough, and complete enough,
that I thought I could share it for some initial comments.

CC: Timothy Arceri 

Thomas Helland (18):
  util: Add initial pointer map implementation
  glsl: Use pointer map in constant propagation
  util: Add a pointer map clone function
  glsl: Port copy propagation elements to pointer map
  glsl: Move ir_variable_refcount to using the pointer map
  nir: Change lower_vars_to_ssa to use pointer map
  glsl: Use pointer map in copy propagation
  glsl: Use pointer map in opt_constant_variable
  glsl: Change glsl_to_nir to use pointer map
  util: Add a call_foreach function to the pointer map
  glsl: Use the pointer map in the glsl linker
  nir: Use pointer map in nir_from_ssa
  util: Add a pointer set implementation
  nir: Migrate lower_vars_to_ssa to use pointer set
  glsl: Use pointer set in opt_copy_propagation
  nir: Use pointer set in remove_dead_variable
  nir: Use pointer_set in nir_propagate_invariant
  util: Just cut the hash in the pointer table

 src/compiler/glsl/glsl_to_nir.cpp  |  31 +-
 src/compiler/glsl/ir_variable_refcount.cpp |  13 +-
 src/compiler/glsl/ir_variable_refcount.h   |   4 +-
 src/compiler/glsl/linker.cpp   |  40 ++-
 src/compiler/glsl/opt_constant_propagation.cpp |  47 ++--
 src/compiler/glsl/opt_constant_variable.cpp|  34 +--
 src/compiler/glsl/opt_copy_propagation.cpp |  95 +++
 .../glsl/opt_copy_propagation_elements.cpp |  96 ---
 src/compiler/glsl/opt_dead_code.cpp|   6 +-
 src/compiler/nir/nir_from_ssa.c|  18 +-
 src/compiler/nir/nir_lower_vars_to_ssa.c   |  48 ++--
 src/compiler/nir/nir_propagate_invariant.c |  33 +--
 src/compiler/nir/nir_remove_dead_variables.c   |  37 +--
 src/util/meson.build   |   4 +
 src/util/pointer_map.c | 313 +
 src/util/pointer_map.h | 110 
 src/util/pointer_set.c | 266 +
 src/util/pointer_set.h |  90 ++
 18 files changed, 1026 insertions(+), 259 deletions(-)
 create mode 100644 src/util/pointer_map.c
 create mode 100644 src/util/pointer_map.h
 create mode 100644 src/util/pointer_set.c
 create mode 100644 src/util/pointer_set.h

-- 
2.16.2



[Mesa-dev] [PATCH 04/18] glsl: Port copy propagation elements to pointer map

2018-04-11 Thread Thomas Helland
---
 .../glsl/opt_copy_propagation_elements.cpp | 96 +++---
 1 file changed, 47 insertions(+), 49 deletions(-)

diff --git a/src/compiler/glsl/opt_copy_propagation_elements.cpp 
b/src/compiler/glsl/opt_copy_propagation_elements.cpp
index 8bae424a1d..8737fe27a5 100644
--- a/src/compiler/glsl/opt_copy_propagation_elements.cpp
+++ b/src/compiler/glsl/opt_copy_propagation_elements.cpp
@@ -46,7 +46,7 @@
 #include "ir_basic_block.h"
 #include "ir_optimization.h"
 #include "compiler/glsl_types.h"
-#include "util/hash_table.h"
+#include "util/pointer_map.h"
 
 static bool debug = false;
 
@@ -124,24 +124,22 @@ public:
   ralloc_free(mem_ctx);
}
 
-   void clone_acp(hash_table *lhs, hash_table *rhs)
+   void clone_acp(pointer_map *lhs, pointer_map *rhs)
{
-  lhs_ht = _mesa_hash_table_clone(lhs, mem_ctx);
-  rhs_ht = _mesa_hash_table_clone(rhs, mem_ctx);
+  lhs_pm = _mesa_pointer_map_clone(lhs, mem_ctx);
+  rhs_pm = _mesa_pointer_map_clone(rhs, mem_ctx);
}
 
void create_acp()
{
-  lhs_ht = _mesa_hash_table_create(mem_ctx, _mesa_hash_pointer,
-   _mesa_key_pointer_equal);
-  rhs_ht = _mesa_hash_table_create(mem_ctx, _mesa_hash_pointer,
-   _mesa_key_pointer_equal);
+  lhs_pm = _mesa_pointer_map_create(mem_ctx);
+  rhs_pm = _mesa_pointer_map_create(mem_ctx);
}
 
void destroy_acp()
{
-  _mesa_hash_table_destroy(lhs_ht, NULL);
-  _mesa_hash_table_destroy(rhs_ht, NULL);
+  _mesa_pointer_map_destroy(lhs_pm, NULL);
+  _mesa_pointer_map_destroy(rhs_pm, NULL);
}
 
void handle_loop(ir_loop *, bool keep_acp);
@@ -159,8 +157,8 @@ public:
void handle_if_block(exec_list *instructions);
 
/** Hash of acp_entry: The available copies to propagate */
-   hash_table *lhs_ht;
-   hash_table *rhs_ht;
+   pointer_map *lhs_pm;
+   pointer_map *rhs_pm;
 
/**
 * List of kill_entry: The variables whose values were killed in this
@@ -191,8 +189,8 @@ 
ir_copy_propagation_elements_visitor::visit_enter(ir_function_signature *ir)
exec_list *orig_kills = this->kills;
bool orig_killed_all = this->killed_all;
 
-   hash_table *orig_lhs_ht = lhs_ht;
-   hash_table *orig_rhs_ht = rhs_ht;
+   pointer_map *orig_lhs_pm = lhs_pm;
+   pointer_map *orig_rhs_pm = rhs_pm;
 
this->kills = new(mem_ctx) exec_list;
this->killed_all = false;
@@ -208,8 +206,8 @@ 
ir_copy_propagation_elements_visitor::visit_enter(ir_function_signature *ir)
this->kills = orig_kills;
this->killed_all = orig_killed_all;
 
-   lhs_ht = orig_lhs_ht;
-   rhs_ht = orig_rhs_ht;
+   lhs_pm = orig_lhs_pm;
+   rhs_pm = orig_rhs_pm;
 
return visit_continue_with_parent;
 }
@@ -296,9 +294,9 @@ 
ir_copy_propagation_elements_visitor::handle_rvalue(ir_rvalue **ir)
/* Try to find ACP entries covering swizzle_chan[], hoping they're
 * the same source variable.
 */
-   hash_entry *ht_entry = _mesa_hash_table_search(lhs_ht, var);
-   if (ht_entry) {
-  exec_list *ht_list = (exec_list *) ht_entry->data;
+   map_entry *pm_entry = _mesa_pointer_map_search(lhs_pm, var);
+   if (pm_entry) {
+  exec_list *ht_list = (exec_list *) pm_entry->data;
   foreach_in_list(acp_entry, entry, ht_list) {
  for (int c = 0; c < chans; c++) {
 if (entry->write_mask & (1 << swizzle_chan[c])) {
@@ -368,8 +366,8 @@ ir_copy_propagation_elements_visitor::visit_enter(ir_call 
*ir)
/* Since we're unlinked, we don't (necessarily) know the side effects of
 * this call.  So kill all copies.
 */
-   _mesa_hash_table_clear(lhs_ht, NULL);
-   _mesa_hash_table_clear(rhs_ht, NULL);
+   _mesa_pointer_map_clear(lhs_pm);
+   _mesa_pointer_map_clear(rhs_pm);
 
this->killed_all = true;
 
@@ -382,20 +380,20 @@ 
ir_copy_propagation_elements_visitor::handle_if_block(exec_list *instructions)
exec_list *orig_kills = this->kills;
bool orig_killed_all = this->killed_all;
 
-   hash_table *orig_lhs_ht = lhs_ht;
-   hash_table *orig_rhs_ht = rhs_ht;
+   pointer_map *orig_lhs_pm = lhs_pm;
+   pointer_map *orig_rhs_pm = rhs_pm;
 
this->kills = new(mem_ctx) exec_list;
this->killed_all = false;
 
/* Populate the initial acp with a copy of the original */
-   clone_acp(orig_lhs_ht, orig_rhs_ht);
+   clone_acp(orig_lhs_pm, orig_rhs_pm);
 
visit_list_elements(this, instructions);
 
if (this->killed_all) {
-  _mesa_hash_table_clear(orig_lhs_ht, NULL);
-  _mesa_hash_table_clear(orig_rhs_ht, NULL);
+  _mesa_pointer_map_clear(orig_lhs_pm);
+  _mesa_pointer_map_clear(orig_rhs_pm);
}
 
exec_list *new_kills = this->kills;
@@ -404,8 +402,8 @@ 
ir_copy_propagation_elements_visitor::handle_if_block(exec_list *instructions)
 
destroy_acp();
 
-   lhs_ht = orig_lhs_ht;
-   rhs_ht = orig_rhs_ht;
+   lhs_pm = orig_lhs_pm;
+   rhs_pm = orig_rhs_pm;
 
/* Move the new kills into the parent block's list, removing them
 * from 

Re: [Mesa-dev] [PATCH 1/4] ac/surface: don't set the display flag for obviously unsupported cases (v2)

2018-04-11 Thread Marek Olšák
On Tue, Apr 10, 2018 at 7:46 PM, Bas Nieuwenhuizen 
wrote:

> What is the addrlib assertion we are hitting?
>

128bpp formats can't set "display = true" even though the tiling is always
_D for 128bpp.

Marek


>
> On Tue, Apr 10, 2018 at 11:44 AM, Michel Dänzer 
> wrote:
> > On 2018-04-06 07:12 PM, Marek Olšák wrote:
> >> From: Marek Olšák 
> >>
> >> This enables the tile swizzle for some cases of the displayable micro
> mode,
> >> and it also fixes an addrlib assertion failure on Vega.
> >
> > [...]
> >
> >> diff --git a/src/amd/vulkan/radv_image.c b/src/amd/vulkan/radv_image.c
> >> index dd3189c67d0..ef6f1072abd 100644
> >> --- a/src/amd/vulkan/radv_image.c
> >> +++ b/src/amd/vulkan/radv_image.c
> >> @@ -919,20 +919,21 @@ radv_image_create(VkDevice _device,
> >>   if (!image)
> >>   return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
> >>
> >>   image->type = pCreateInfo->imageType;
> >>   image->info.width = pCreateInfo->extent.width;
> >>   image->info.height = pCreateInfo->extent.height;
> >>   image->info.depth = pCreateInfo->extent.depth;
> >>   image->info.samples = pCreateInfo->samples;
> >>   image->info.array_size = pCreateInfo->arrayLayers;
> >>   image->info.levels = pCreateInfo->mipLevels;
> >> + image->info.num_channels = 4; /* TODO: set this correctly */
> >
> > Maybe a radv developer can suggest something here? Anyway,
> >
> > Reviewed-by: Michel Dänzer 
> >
> >
> > --
> > Earthling Michel Dänzer   |   http://www.amd.com
> > Libre software enthusiast | Mesa and X developer
>


Re: [Mesa-dev] [PATCH v2 3/3] nir: simplify node matching code when lowering to SSA

2018-04-11 Thread Jason Ekstrand
I tweaked your commit messages a bit, added my R-B to this one, and pushed.

And... Now I get to rebase my deref patches again. :-P

--Jason

On Tue, Apr 10, 2018 at 11:13 PM, Caio Marcelo de Oliveira Filho <
caio.olive...@intel.com> wrote:

> The matching code doesn't make real use of the return value. The main
> function return value is ignored, and while the worker function
> propagate its return value, the actual callback never returns false.
>
> v2: Style fixes. (Jason)
> ---
>  src/compiler/nir/nir_lower_vars_to_ssa.c | 67 +++-
>  1 file changed, 31 insertions(+), 36 deletions(-)
>
> diff --git a/src/compiler/nir/nir_lower_vars_to_ssa.c
> b/src/compiler/nir/nir_lower_vars_to_ssa.c
> index 970eb05307..8bc847fd41 100644
> --- a/src/compiler/nir/nir_lower_vars_to_ssa.c
> +++ b/src/compiler/nir/nir_lower_vars_to_ssa.c
> @@ -217,45 +217,42 @@ get_deref_node(nir_deref_var *deref, struct
> lower_variables_state *state)
>  }
>
>  /* \sa foreach_deref_node_match */
> -static bool
> +static void
>  foreach_deref_node_worker(struct deref_node *node, nir_deref *deref,
> -  bool (* cb)(struct deref_node *node,
> +  void (* cb)(struct deref_node *node,
>struct lower_variables_state
> *state),
>struct lower_variables_state *state)
>  {
> if (deref->child == NULL) {
> -  return cb(node, state);
> -   } else {
> -  switch (deref->child->deref_type) {
> -  case nir_deref_type_array: {
> - nir_deref_array *arr = nir_deref_as_array(deref->child);
> - assert(arr->deref_array_type == nir_deref_array_type_direct);
> - if (node->children[arr->base_offset] &&
> - !foreach_deref_node_worker(node->children[arr->base_offset],
> -deref->child, cb, state))
> -return false;
> +  cb(node, state);
> +  return;
> +   }
>
> - if (node->wildcard &&
> - !foreach_deref_node_worker(node->wildcard,
> -deref->child, cb, state))
> -return false;
> +   switch (deref->child->deref_type) {
> +   case nir_deref_type_array: {
> +  nir_deref_array *arr = nir_deref_as_array(deref->child);
> +  assert(arr->deref_array_type == nir_deref_array_type_direct);
>
> - return true;
> +  if (node->children[arr->base_offset]) {
> + foreach_deref_node_worker(node->children[arr->base_offset],
> +   deref->child, cb, state);
>}
> +  if (node->wildcard)
> + foreach_deref_node_worker(node->wildcard, deref->child, cb,
> state);
> +  break;
> +   }
>
> -  case nir_deref_type_struct: {
> - nir_deref_struct *str = nir_deref_as_struct(deref->child);
> - if (node->children[str->index] &&
> - !foreach_deref_node_worker(node->children[str->index],
> -deref->child, cb, state))
> -return false;
> -
> - return true;
> +   case nir_deref_type_struct: {
> +  nir_deref_struct *str = nir_deref_as_struct(deref->child);
> +  if (node->children[str->index]) {
> + foreach_deref_node_worker(node->children[str->index],
> +   deref->child, cb, state);
>}
> +  break;
> +   }
>
> -  default:
> - unreachable("Invalid deref child type");
> -  }
> +   default:
> +  unreachable("Invalid deref child type");
> }
>  }
>
> @@ -271,9 +268,9 @@ foreach_deref_node_worker(struct deref_node *node,
> nir_deref *deref,
>   * The given deref must be a full-length and fully qualified (no wildcards
>   * or indirects) deref chain.
>   */
> -static bool
> +static void
>  foreach_deref_node_match(nir_deref_var *deref,
> - bool (* cb)(struct deref_node *node,
> + void (* cb)(struct deref_node *node,
>   struct lower_variables_state *state),
>   struct lower_variables_state *state)
>  {
> @@ -282,9 +279,9 @@ foreach_deref_node_match(nir_deref_var *deref,
> struct deref_node *node = get_deref_node(_deref, state);
>
> if (node == NULL)
> -  return false;
> +  return;
>
> -   return foreach_deref_node_worker(node, &deref->deref, cb, state);
> +   foreach_deref_node_worker(node, &deref->deref, cb, state);
>  }
>
>  /* \sa deref_may_be_aliased */
> @@ -441,12 +438,12 @@ register_variable_uses(nir_function_impl *impl,
>  /* Walks over all of the copy instructions to or from the given deref_node
>   * and lowers them to load/store intrinsics.
>   */
> -static bool
> +static void
>  lower_copies_to_load_store(struct deref_node *node,
> struct lower_variables_state *state)
>  {
> if (!node->copies)
> -  return true;
> +  return;
>
> struct set_entry *copy_entry;
> set_foreach(node->copies, 

Re: [Mesa-dev] [PATCH] blorp: Silence unused function warnings

2018-04-11 Thread Lionel Landwerlin

Reviewed-by: Lionel Landwerlin 

On 11/04/18 11:00, Nanley Chery wrote:

vulkan/genX_blorp_exec.c:69:1: warning: ‘blorp_get_surface_base_address’ 
defined but not used [-Wunused-function]
  blorp_get_surface_base_address(struct blorp_batch *batch)
  ^~
In file included from vulkan/genX_blorp_exec.c:35:0:
./blorp/blorp_genX_exec.h:1249:1: warning: ‘blorp_emit_memcpy’ defined but not 
used [-Wunused-function]
  blorp_emit_memcpy(struct blorp_batch *batch,
  ^
genX_blorp_exec.c:99:1: warning: ‘blorp_get_surface_base_address’ defined but 
not used [-Wunused-function]
  blorp_get_surface_base_address(struct blorp_batch *batch)
  ^~
In file included from genX_blorp_exec.c:33:0:
../../../../../src/intel/blorp/blorp_genX_exec.h:1249:1: warning: 
‘blorp_emit_memcpy’ defined but not used [-Wunused-function]
  blorp_emit_memcpy(struct blorp_batch *batch,
  ^
---
  src/intel/blorp/blorp_genX_exec.h   | 4 ++--
  src/intel/vulkan/genX_blorp_exec.c  | 2 +-
  src/mesa/drivers/dri/i965/genX_blorp_exec.c | 2 +-
  3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/intel/blorp/blorp_genX_exec.h 
b/src/intel/blorp/blorp_genX_exec.h
index 7851228d8dc..593521b95cc 100644
--- a/src/intel/blorp/blorp_genX_exec.h
+++ b/src/intel/blorp/blorp_genX_exec.h
@@ -78,7 +78,7 @@ static void
  blorp_surface_reloc(struct blorp_batch *batch, uint32_t ss_offset,
  struct blorp_address address, uint32_t delta);
  
-#if GEN_GEN >= 7 && GEN_GEN <= 10
+#if GEN_GEN >= 7 && GEN_GEN < 10
  static struct blorp_address
  blorp_get_surface_base_address(struct blorp_batch *batch);
  #endif
@@ -1244,7 +1244,7 @@ blorp_emit_pipeline(struct blorp_batch *batch,
  
  #endif /* GEN_GEN >= 6 */
  
-#if GEN_GEN >= 7 && GEN_GEN <= 10
+#if GEN_GEN >= 7 && GEN_GEN < 10
  static void
  blorp_emit_memcpy(struct blorp_batch *batch,
struct blorp_address dst,
diff --git a/src/intel/vulkan/genX_blorp_exec.c 
b/src/intel/vulkan/genX_blorp_exec.c
index 1ecec199846..b423046d616 100644
--- a/src/intel/vulkan/genX_blorp_exec.c
+++ b/src/intel/vulkan/genX_blorp_exec.c
@@ -64,7 +64,7 @@ blorp_surface_reloc(struct blorp_batch *batch, uint32_t 
ss_offset,
anv_batch_set_error(_buffer->batch, result);
  }
  
-#if GEN_GEN >= 7 && GEN_GEN <= 10
+#if GEN_GEN >= 7 && GEN_GEN < 10
  static struct blorp_address
  blorp_get_surface_base_address(struct blorp_batch *batch)
  {
diff --git a/src/mesa/drivers/dri/i965/genX_blorp_exec.c 
b/src/mesa/drivers/dri/i965/genX_blorp_exec.c
index 3406a6fdec6..b72ca9c515b 100644
--- a/src/mesa/drivers/dri/i965/genX_blorp_exec.c
+++ b/src/mesa/drivers/dri/i965/genX_blorp_exec.c
@@ -94,7 +94,7 @@ blorp_surface_reloc(struct blorp_batch *batch, uint32_t 
ss_offset,
  #endif
  }
  
-#if GEN_GEN >= 7 && GEN_GEN <= 10
+#if GEN_GEN >= 7 && GEN_GEN < 10
  static struct blorp_address
  blorp_get_surface_base_address(struct blorp_batch *batch)
  {





[Mesa-dev] [PATCH] blorp: Silence unused function warnings

2018-04-11 Thread Nanley Chery
vulkan/genX_blorp_exec.c:69:1: warning: ‘blorp_get_surface_base_address’ 
defined but not used [-Wunused-function]
 blorp_get_surface_base_address(struct blorp_batch *batch)
 ^~
In file included from vulkan/genX_blorp_exec.c:35:0:
./blorp/blorp_genX_exec.h:1249:1: warning: ‘blorp_emit_memcpy’ defined but not 
used [-Wunused-function]
 blorp_emit_memcpy(struct blorp_batch *batch,
 ^
genX_blorp_exec.c:99:1: warning: ‘blorp_get_surface_base_address’ defined but 
not used [-Wunused-function]
 blorp_get_surface_base_address(struct blorp_batch *batch)
 ^~
In file included from genX_blorp_exec.c:33:0:
../../../../../src/intel/blorp/blorp_genX_exec.h:1249:1: warning: 
‘blorp_emit_memcpy’ defined but not used [-Wunused-function]
 blorp_emit_memcpy(struct blorp_batch *batch,
 ^
---
 src/intel/blorp/blorp_genX_exec.h   | 4 ++--
 src/intel/vulkan/genX_blorp_exec.c  | 2 +-
 src/mesa/drivers/dri/i965/genX_blorp_exec.c | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/intel/blorp/blorp_genX_exec.h 
b/src/intel/blorp/blorp_genX_exec.h
index 7851228d8dc..593521b95cc 100644
--- a/src/intel/blorp/blorp_genX_exec.h
+++ b/src/intel/blorp/blorp_genX_exec.h
@@ -78,7 +78,7 @@ static void
 blorp_surface_reloc(struct blorp_batch *batch, uint32_t ss_offset,
 struct blorp_address address, uint32_t delta);
 
-#if GEN_GEN >= 7 && GEN_GEN <= 10
+#if GEN_GEN >= 7 && GEN_GEN < 10
 static struct blorp_address
 blorp_get_surface_base_address(struct blorp_batch *batch);
 #endif
@@ -1244,7 +1244,7 @@ blorp_emit_pipeline(struct blorp_batch *batch,
 
 #endif /* GEN_GEN >= 6 */
 
-#if GEN_GEN >= 7 && GEN_GEN <= 10
+#if GEN_GEN >= 7 && GEN_GEN < 10
 static void
 blorp_emit_memcpy(struct blorp_batch *batch,
   struct blorp_address dst,
diff --git a/src/intel/vulkan/genX_blorp_exec.c 
b/src/intel/vulkan/genX_blorp_exec.c
index 1ecec199846..b423046d616 100644
--- a/src/intel/vulkan/genX_blorp_exec.c
+++ b/src/intel/vulkan/genX_blorp_exec.c
@@ -64,7 +64,7 @@ blorp_surface_reloc(struct blorp_batch *batch, uint32_t 
ss_offset,
   anv_batch_set_error(_buffer->batch, result);
 }
 
-#if GEN_GEN >= 7 && GEN_GEN <= 10
+#if GEN_GEN >= 7 && GEN_GEN < 10
 static struct blorp_address
 blorp_get_surface_base_address(struct blorp_batch *batch)
 {
diff --git a/src/mesa/drivers/dri/i965/genX_blorp_exec.c 
b/src/mesa/drivers/dri/i965/genX_blorp_exec.c
index 3406a6fdec6..b72ca9c515b 100644
--- a/src/mesa/drivers/dri/i965/genX_blorp_exec.c
+++ b/src/mesa/drivers/dri/i965/genX_blorp_exec.c
@@ -94,7 +94,7 @@ blorp_surface_reloc(struct blorp_batch *batch, uint32_t 
ss_offset,
 #endif
 }
 
-#if GEN_GEN >= 7 && GEN_GEN <= 10
+#if GEN_GEN >= 7 && GEN_GEN < 10
 static struct blorp_address
 blorp_get_surface_base_address(struct blorp_batch *batch)
 {
-- 
2.16.2



Re: [Mesa-dev] [PATCH 08/10] glsl: use NIR function inlining for drivers that use glsl_to_nir

2018-04-11 Thread Jason Ekstrand
On Tue, Apr 10, 2018 at 10:26 PM, Timothy Arceri 
wrote:

>
> On 11/04/18 15:05, Jason Ekstrand wrote:
>
>> If I understand correctly, this is because when running with minimal GLSL
>> IR, opt_function_inlining doesn't actually inline them all.  Is that
>> correct?  If so, would it make sense to just repeatedly call
>> do_function_inlining until it stops making progress when we're running with
>> minimal optimizations?  Not that I'm opposed to using NIR for more things,
>> but it bothers me a bit that reducing GLSL IR optimizations is causing us
>> to break previous assumptions about what is, effectively, a lowering pass.
>>
>
> The st currently calls do_common_optimization() in a loop to lower these
> away. Even a single call adds something like 8% to compile times for
> the Deus Ex shaders.
>

Ouch.


> What are the assumptions you're worried about breaking? There are currently
> bugs which are fixed by this series (see patch 5 which I think also needs
> to copy struct members and expression operands now that I think about it),
> I'm not confident at all that calling do_function_inlining() on its own
> would even work due to what is described in commit 6, I'm pretty sure there
> is some sketchy reliance on opt passes for things to work correctly during
> unrolling.
>

Ugh... Yeah, that's sketchy.  I'm mostly concerned about other hidden
assumptions.  For instance, we assume that all arrays have sizes.  Then
again, as you pointed out, we may not actually be guaranteed that those
assumptions hold today. :-)  If Jenkins is happy, I think I can handle my
own nervousness. :-)


> I'd really like to avoid the unnecessary loops.
>

Yeah, I totally get that.

--Jason


>
>> On Mon, Apr 9, 2018 at 9:34 PM, Timothy Arceri > > wrote:
>>
>> ---
>>   src/compiler/glsl/glsl_to_nir.cpp | 20 
>>   1 file changed, 20 insertions(+)
>>
>> diff --git a/src/compiler/glsl/glsl_to_nir.cpp
>> b/src/compiler/glsl/glsl_to_nir.cpp
>> index 5a36963607e..55c01024669 100644
>> --- a/src/compiler/glsl/glsl_to_nir.cpp
>> +++ b/src/compiler/glsl/glsl_to_nir.cpp
>> @@ -26,6 +26,7 @@
>>*/
>>
>>   #include "glsl_to_nir.h"
>> +#include "ir_optimization.h"
>>   #include "ir_visitor.h"
>>   #include "ir_hierarchical_visitor.h"
>>   #include "ir.h"
>> @@ -161,6 +162,25 @@ glsl_to_nir(const struct gl_shader_program
>> *shader_prog,
>>  v2.run(sh->ir);
>>  visit_exec_list(sh->ir, );
>>
>> +   nir_validate_shader(shader);
>> +
>> +   /* We have to lower away local constant initializers right before
>> we
>> +* inline functions.  That way they get properly initialized at
>> the top
>> +* of the function and not at the top of its caller.
>> +*/
>> +   nir_lower_constant_initializers(shader, nir_var_local);
>> +   nir_lower_returns(shader);
>> +   nir_inline_functions(shader);
>> +
>> +   /* Now that we have inlined everything remove all of the
>> functions except
>> +* main().
>> +*/
>> +   foreach_list_typed_safe(nir_function, function, node,
>> &(shader)->functions){
>> +  if (strcmp("main", function->name) != 0) {
>> + exec_node_remove(&function->node);
>> +  }
>> +   }
>> +
>>  nir_lower_constant_initializers(shader, (nir_variable_mode)~0);
>>
>>  /* Remap the locations to slots so those requiring two slots
>> will occupy
>> --
>> 2.17.0
>>
>> ___
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org > >
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>> 
>>
>>
>>
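
For illustration only, the "call it until it stops making progress" idea
discussed above can be sketched as a small fixed-point driver in C. This is
not Mesa code: sketch_run_to_fixed_point, sketch_inline_one_level and the
max_iters cap are hypothetical names made up for the example, and the toy
pass merely stands in for something like do_function_inlining.

```c
#include <stdbool.h>

/* A pass returns true when it changed the IR (made progress). */
typedef bool (*sketch_pass_fn)(void *ir);

/* Run `pass` at least once and keep going until it reports no progress
 * or until max_iters runs have been made. Returns the number of runs. */
static unsigned
sketch_run_to_fixed_point(sketch_pass_fn pass, void *ir, unsigned max_iters)
{
   unsigned runs = 0;
   bool progress;

   do {
      progress = pass(ir);
      runs++;
   } while (progress && runs < max_iters);

   return runs;
}

/* Toy pass: `ir` points at an int holding the number of nested call
 * levels still to be inlined. Each run removes one level. */
static bool
sketch_inline_one_level(void *ir)
{
   int *depth = ir;

   if (*depth == 0)
      return false;

   (*depth)--;
   return true;
}
```

Unless the iteration cap is hit, such a driver always pays for one final
run that reports no progress, which is part of the compile-time cost being
weighed in this thread.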


[Mesa-dev] [PATCH 2/3] gettextureimage: verify cube map is complete

2018-04-11 Thread Juan A. Suarez Romero
According to OpenGL 4.6 spec, section 8.11.4 ("Texture Image Queries"),
relative to errors for GetTexImage, GetTextureImage, and GetnTexImage:

  "An INVALID_OPERATION error is generated by GetTextureImage if the
   effective target is TEXTURE_CUBE_MAP or TEXTURE_CUBE_MAP_ARRAY, and
   the texture object is not cube complete or cube array complete,
   respectively."

This fixes arb_get_texture_sub_image piglit tests.

Signed-off-by: Juan A. Suarez Romero 
---
 src/mesa/main/texgetimage.c | 23 ++-
 1 file changed, 14 insertions(+), 9 deletions(-)

diff --git a/src/mesa/main/texgetimage.c b/src/mesa/main/texgetimage.c
index fbdbcd90a7d..85d0ffd4770 100644
--- a/src/mesa/main/texgetimage.c
+++ b/src/mesa/main/texgetimage.c
@@ -982,15 +982,20 @@ dimensions_error_check(struct gl_context *ctx,
  "%s(zoffset + depth = %d)", caller, zoffset + depth);
  return true;
   }
-  /* check that the range of faces exist */
-  for (i = 0; i < depth; i++) {
- GLenum face = GL_TEXTURE_CUBE_MAP_POSITIVE_X + zoffset + i;
- if (!_mesa_select_tex_image(texObj, face, level)) {
-/* non-existant face */
-_mesa_error(ctx, GL_INVALID_OPERATION,
-"%s(missing cube face)", caller);
-return true;
- }
+  /* According to OpenGL 4.6 spec, section 8.11.4 ("Texture Image Queries"):
+   *
+   *   "An INVALID_OPERATION error is generated by GetTextureImage if the
+   *   effective target is TEXTURE_CUBE_MAP or TEXTURE_CUBE_MAP_ARRAY ,
+   *   and the texture object is not cube complete or cube array complete,
+   *   respectively."
+   *
+   * This applies also to GetTextureSubImage, GetCompressedTexImage,
+   * GetCompressedTextureImage, and GetnCompressedTexImage.
+   */
+  if (!_mesa_cube_complete(texObj)) {
+ _mesa_error(ctx, GL_INVALID_OPERATION,
+ "%s(cube incomplete)", caller);
+ return true;
   }
   break;
default:
-- 
2.14.3
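
As a rough illustration of what "cube complete" means for the check above,
here is a minimal C sketch. It assumes the usual definition (all six base
level faces present, with identical, positive, square dimensions and the
same format). The sketch_image and sketch_cube types and the function name
are hypothetical and much simpler than Mesa's gl_texture_object; this is not
the body of _mesa_cube_complete().

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_NUM_FACES 6

/* Hypothetical, simplified description of one face image. */
struct sketch_image {
   unsigned width;
   unsigned height;
   int format;
};

/* Hypothetical cube map: one base-level image pointer per face,
 * NULL when the face was never defined. */
struct sketch_cube {
   const struct sketch_image *face[SKETCH_NUM_FACES];
};

static bool
sketch_cube_complete(const struct sketch_cube *cube)
{
   const struct sketch_image *first = cube->face[0];

   /* The first face must exist and be square with a positive size. */
   if (!first || first->width == 0 || first->width != first->height)
      return false;

   /* Every other face must exist and match the first one. */
   for (int i = 1; i < SKETCH_NUM_FACES; i++) {
      const struct sketch_image *img = cube->face[i];

      if (!img ||
          img->width != first->width ||
          img->height != first->height ||
          img->format != first->format)
         return false;
   }

   return true;
}
```

A single completeness test like this covers the missing-face case that the
removed per-face loop handled, and additionally rejects faces whose sizes or
formats disagree.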



[Mesa-dev] [PATCH 3/3] getteximage: assume texture image is empty for non defined levels

2018-04-11 Thread Juan A. Suarez Romero
Current code returns INVALID_OPERATION when trying to use
GetTextureImage() on a level that has not been explicitly defined.

That is, if we define a mipmapped Texture2D with 3 levels and try to use
GetTextureImage() on the 4th level, INVALID_OPERATION is returned.

Nevertheless, such a case is not listed as an error in OpenGL 4.6 spec,
section 8.11.4 ("Texture Image Queries"), where all the error cases for
this function are defined. So it seems this is a valid operation.

On the other hand, in section 8.22 ("Texture State and Proxy State") it
states:

  "Each initial texture image is null. It has zero width, height, and
   depth, internal format RGBA, or R8 for buffer textures, component
   sizes set to zero and component types set to NONE, the compressed
   flag set to FALSE, a zero compressed size, and the bound buffer
   object name is zero."

We can assume that we are reading this initialized empty image when
calling GetTextureImage() with an undefined level.

With this assumption, we will reach one of the other error cases defined
for the function. In the end, this means that we return a different
error to the caller.

This fixes arb_get_texture_sub_image piglit tests.

Signed-off-by: Juan A. Suarez Romero 
---
 src/mesa/main/texgetimage.c | 23 +--
 1 file changed, 13 insertions(+), 10 deletions(-)

diff --git a/src/mesa/main/texgetimage.c b/src/mesa/main/texgetimage.c
index 85d0ffd4770..0e4f030cb08 100644
--- a/src/mesa/main/texgetimage.c
+++ b/src/mesa/main/texgetimage.c
@@ -913,6 +913,7 @@ dimensions_error_check(struct gl_context *ctx,
const char *caller)
 {
const struct gl_texture_image *texImage;
+   GLuint image_width, image_height, image_depth;
int i;
 
if (xoffset < 0) {
@@ -1004,37 +1005,39 @@ dimensions_error_check(struct gl_context *ctx,
 
texImage = select_tex_image(texObj, target, level, zoffset);
if (!texImage) {
-  /* missing texture image */
-  _mesa_error(ctx, GL_INVALID_OPERATION, "%s(missing image)", caller);
-  return true;
+  /* missing texture image; continue as initialized as empty */
+  _mesa_warning(ctx, "%s(missing image)", caller);
}
 
-   if (xoffset + width > texImage->Width) {
+   image_width = texImage ? texImage->Width : 0;
+   if (xoffset + width > image_width) {
   _mesa_error(ctx, GL_INVALID_VALUE,
   "%s(xoffset %d + width %d > %u)",
-  caller, xoffset, width, texImage->Width);
+  caller, xoffset, width, image_width);
   return true;
}
 
-   if (yoffset + height > texImage->Height) {
+   image_height = texImage ? texImage->Height : 0;
+   if (yoffset + height > image_height) {
   _mesa_error(ctx, GL_INVALID_VALUE,
   "%s(yoffset %d + height %d > %u)",
-  caller, yoffset, height, texImage->Height);
+  caller, yoffset, height, image_height);
   return true;
}
 
if (target != GL_TEXTURE_CUBE_MAP) {
   /* Cube map error checking was done above */
-  if (zoffset + depth > texImage->Depth) {
+  image_depth = texImage ? texImage->Depth : 0;
+  if (zoffset + depth > image_depth) {
  _mesa_error(ctx, GL_INVALID_VALUE,
  "%s(zoffset %d + depth %d > %u)",
- caller, zoffset, depth, texImage->Depth);
+ caller, zoffset, depth, image_depth);
  return true;
   }
}
 
/* Extra checks for compressed textures */
-   {
+   if (texImage) {
   GLuint bw, bh, bd;
   _mesa_get_format_block_size_3d(texImage->TexFormat, &bw, &bh, &bd);
   if (bw > 1 || bh > 1 || bd > 1) {
-- 
2.14.3
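
To make the effect of this patch concrete, here is a minimal C sketch of
the bounds check with a missing image treated as zero-sized. The types and
names (sketch_image, sketch_error, sketch_check_bounds) are hypothetical
simplifications, and the offsets and sizes are assumed to have been
validated as non-negative earlier, as dimensions_error_check() does before
reaching this point.

```c
#include <stddef.h>

/* Hypothetical stand-in for the relevant gl_texture_image fields. */
struct sketch_image {
   unsigned width;
   unsigned height;
};

enum sketch_error {
   SKETCH_NO_ERROR,
   SKETCH_INVALID_VALUE
};

/* `img` may be NULL for a level that was never defined; it is then
 * treated like the spec's initial image with zero width and height. */
static enum sketch_error
sketch_check_bounds(const struct sketch_image *img,
                    int xoffset, int width,
                    int yoffset, int height)
{
   unsigned image_width = img ? img->width : 0;
   unsigned image_height = img ? img->height : 0;

   /* Offsets and sizes are assumed non-negative, so the casts are safe. */
   if ((unsigned)(xoffset + width) > image_width)
      return SKETCH_INVALID_VALUE;

   if ((unsigned)(yoffset + height) > image_height)
      return SKETCH_INVALID_VALUE;

   return SKETCH_NO_ERROR;
}
```

With this shape, a non-empty request on an undefined level ends in
INVALID_VALUE rather than INVALID_OPERATION, while a zero-sized request
passes the bounds checks.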



[Mesa-dev] [PATCH 1/3] gettextsubimage: verify zoffset and depth are correct

2018-04-11 Thread Juan A. Suarez Romero
According to OpenGL 4.6 spec, section 8.11.4 ("Texture Image Queries"),
relative to errors for GetTextureSubImage() function:

  "An INVALID_VALUE error is generated if the effective target is
   TEXTURE_1D and either yoffset is not zero, or height is not one.

   An INVALID_VALUE error is generated if the effective target is
   TEXTURE_1D, TEXTURE_1D_ARRAY, TEXTURE_2D or TEXTURE_RECTANGLE, and
   either zoffset is not zero, or depth is not one."

The existing checks only rejected height > 1 and depth > 1, so a value of
zero was accepted. This commit requires both to be exactly one.

This fixes arb_get_texture_sub_image piglit tests.

Signed-off-by: Juan A. Suarez Romero 
---
 src/mesa/main/texgetimage.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/mesa/main/texgetimage.c b/src/mesa/main/texgetimage.c
index c61842e39ad..fbdbcd90a7d 100644
--- a/src/mesa/main/texgetimage.c
+++ b/src/mesa/main/texgetimage.c
@@ -953,7 +953,7 @@ dimensions_error_check(struct gl_context *ctx,
  "%s(1D, yoffset = %d)", caller, yoffset);
  return true;
   }
-  if (height > 1) {
+  if (height != 1) {
  _mesa_error(ctx, GL_INVALID_VALUE,
  "%s(1D, height = %d)", caller, height);
  return true;
@@ -967,7 +967,7 @@ dimensions_error_check(struct gl_context *ctx,
  "%s(zoffset = %d)", caller, zoffset);
  return true;
   }
-  if (depth > 1) {
+  if (depth != 1) {
  _mesa_error(ctx, GL_INVALID_VALUE,
  "%s(depth = %d)", caller, depth);
  return true;
-- 
2.14.3
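
The two quoted spec rules can be sketched as a small standalone C helper.
This is only an illustration under simplifying assumptions: the
sketch_target enum and sketch_subimage_dims_invalid() are hypothetical and
cover just the targets named in the quote, not the full switch in
dimensions_error_check().

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the GL texture targets involved. */
enum sketch_target {
   SKETCH_TEXTURE_1D,
   SKETCH_TEXTURE_1D_ARRAY,
   SKETCH_TEXTURE_2D,
   SKETCH_TEXTURE_RECTANGLE,
   SKETCH_TEXTURE_3D
};

/* Returns true when the arguments would raise INVALID_VALUE under the
 * two rules quoted from section 8.11.4. */
static bool
sketch_subimage_dims_invalid(enum sketch_target target,
                             int yoffset, int height,
                             int zoffset, int depth)
{
   bool is_1d = (target == SKETCH_TEXTURE_1D);
   bool no_z = is_1d ||
               target == SKETCH_TEXTURE_1D_ARRAY ||
               target == SKETCH_TEXTURE_2D ||
               target == SKETCH_TEXTURE_RECTANGLE;

   /* 1D: yoffset must be zero and height must be exactly one. */
   if (is_1d && (yoffset != 0 || height != 1))
      return true;

   /* Targets without a third dimension: zoffset 0, depth exactly one. */
   if (no_z && (zoffset != 0 || depth != 1))
      return true;

   return false;
}
```

The comparison against exactly one is the point of the patch: a height or
depth of zero passes a "> 1" test but fails "!= 1".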



[Mesa-dev] openCL support on SI

2018-04-11 Thread zz zz
Hi,

I've been instructed to ask this here:

Looking at the table from xorg.freedesktop, it says OpenCL for S.Islands
is WIP (work in progress).

Does that mean I can hope it's just a matter of time, or could it still
 *never* arrive? (say, if the cards 'get old' and the devs drop support?)

Thanks,
AZ


[Mesa-dev] [RFC PATCH 5/5] i965/miptree: recurse to miptree_map for depth in map_depthstencil

2018-04-11 Thread Scott D Phillips
Call back to intel_miptree_map when mapping the separate depth
miptree in map_depthstencil. This brings us back to the mapping
method decision tree in miptree_map where we hopefully find the
most performant mapping method for depth.
---
Here's an idea for a replacement of patch 5. Squirreling the
depthstencil map away from the miptree when doing the recursive
map is kinda yuck, but it lets us go back to the main miptree_map
function for the depth map where we can get cpu detiling or
whatever else.

 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 54 +++
 1 file changed, 31 insertions(+), 23 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 98f71471bda..652074eb9fc 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -3525,27 +3525,30 @@ intel_miptree_map_depthstencil(struct brw_context *brw,
if (!(map->mode & GL_MAP_INVALIDATE_RANGE_BIT)) {
   uint32_t *packed_map = map->ptr;
   uint8_t *s_map = intel_miptree_map_raw(brw, s_mt, GL_MAP_READ_BIT);
-  uint32_t *z_map = intel_miptree_map_raw(brw, z_mt, GL_MAP_READ_BIT);
   unsigned int s_image_x, s_image_y;
-  unsigned int z_image_x, z_image_y;
+
+  uint32_t *z_map = NULL;
+  ptrdiff_t z_stride = 0;
+
+  z_mt->level[level].slice[slice].map = NULL;
+  intel_miptree_map(brw, z_mt, level, slice, map->x, map->y, map->w, map->h,
+map->mode | BRW_MAP_DIRECT_BIT, (void **)&z_map,
+&z_stride);
+  assert(z_map && z_stride);
 
   intel_miptree_get_image_offset(s_mt, level, slice,
 &s_image_x, &s_image_y);
-  intel_miptree_get_image_offset(z_mt, level, slice,
-&z_image_x, &z_image_y);
 
   for (uint32_t y = 0; y < map->h; y++) {
+ uint32_t *z_line = (uint32_t *)((uint8_t *)z_map + z_stride * y);
 for (uint32_t x = 0; x < map->w; x++) {
int map_x = map->x + x, map_y = map->y + y;
ptrdiff_t s_offset = intel_offset_S8(s_mt->surf.row_pitch,
 map_x + s_image_x,
 map_y + s_image_y,
 brw->has_swizzling);
-   ptrdiff_t z_offset = ((map_y + z_image_y) *
-  (z_mt->surf.row_pitch / 4) +
- (map_x + z_image_x));
uint8_t s = s_map[s_offset];
-   uint32_t z = z_map[z_offset];
+   uint32_t z = z_line[x];
 
if (map_z32f_x24s8) {
   packed_map[(y * map->w + x) * 2 + 0] = z;
@@ -3557,12 +3560,13 @@ intel_miptree_map_depthstencil(struct brw_context *brw,
   }
 
   intel_miptree_unmap_raw(s_mt);
-  intel_miptree_unmap_raw(z_mt);
+  intel_miptree_unmap(brw, z_mt, level, slice);
+  z_mt->level[level].slice[slice].map = map;
 
-  DBG("%s: %d,%d %dx%d from z mt %p %d,%d, s mt %p %d,%d = %p/%d\n",
+  DBG("%s: %d,%d %dx%d from z mt %p (%d,%d) @ (level:%d, slice:%d), s mt %p %d,%d = %p/%d\n",
  __func__,
  map->x, map->y, map->w, map->h,
- z_mt, map->x + z_image_x, map->y + z_image_y,
+ z_mt, map->x, map->y, level, slice,
  s_mt, map->x + s_image_x, map->y + s_image_y,
  map->ptr, map->stride);
} else {
@@ -3586,44 +3590,48 @@ intel_miptree_unmap_depthstencil(struct brw_context *brw,
if (map->mode & GL_MAP_WRITE_BIT) {
   uint32_t *packed_map = map->ptr;
   uint8_t *s_map = intel_miptree_map_raw(brw, s_mt, GL_MAP_WRITE_BIT);
-  uint32_t *z_map = intel_miptree_map_raw(brw, z_mt, GL_MAP_WRITE_BIT);
   unsigned int s_image_x, s_image_y;
-  unsigned int z_image_x, z_image_y;
+
+  uint32_t *z_map = NULL;
+  ptrdiff_t z_stride = 0;
+
+  z_mt->level[level].slice[slice].map = NULL;
+  intel_miptree_map(brw, z_mt, level, slice, map->x, map->y, map->w, map->h,
+map->mode | GL_MAP_INVALIDATE_RANGE_BIT | BRW_MAP_DIRECT_BIT,
+(void **)&z_map, &z_stride);
+  assert(z_map && z_stride);
 
   intel_miptree_get_image_offset(s_mt, level, slice,
 &s_image_x, &s_image_y);
-  intel_miptree_get_image_offset(z_mt, level, slice,
-&z_image_x, &z_image_y);
 
   for (uint32_t y = 0; y < map->h; y++) {
+ uint32_t *z_line = (uint32_t *)((uint8_t *)z_map + z_stride * y);
 for (uint32_t x = 0; x < map->w; x++) {
ptrdiff_t s_offset = intel_offset_S8(s_mt->surf.row_pitch,
 x + s_image_x + map->x,
 y + s_image_y + map->y,
 brw->has_swizzling);
-   ptrdiff_t z_offset = ((y + z_image_y + 
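
The row addressing this patch switches to (a base pointer plus a byte
stride per row, instead of a hand-computed z_offset) can be sketched in
isolation as follows. This is a simplified illustration, not the driver
code: sketch_pack_z32f_x24s8 is a hypothetical name, the stencil source is
assumed to be linear with its own byte stride (the real code goes through
intel_offset_S8), and writing the stencil byte into the second dword of each
pair is an assumption, since only the depth store is visible in the hunk
above.

```c
#include <stddef.h>
#include <stdint.h>

/* Interleave a depth plane and a stencil plane into a packed buffer of
 * two dwords per pixel: depth first, then stencil. Strides are in bytes. */
static void
sketch_pack_z32f_x24s8(uint32_t *packed,
                       const void *z_map, ptrdiff_t z_stride,
                       const uint8_t *s_map, ptrdiff_t s_stride,
                       uint32_t w, uint32_t h)
{
   for (uint32_t y = 0; y < h; y++) {
      /* Step to row y using the byte stride, then index by pixel. */
      const uint32_t *z_line =
         (const uint32_t *)((const uint8_t *)z_map + z_stride * (ptrdiff_t)y);
      const uint8_t *s_line = s_map + s_stride * (ptrdiff_t)y;

      for (uint32_t x = 0; x < w; x++) {
         packed[(y * w + x) * 2 + 0] = z_line[x];
         packed[(y * w + x) * 2 + 1] = s_line[x];
      }
   }
}
```

Because the stride is carried separately from the width, the same loop
works whether the depth mapping is direct or goes through some staging
copy with a different row pitch.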

[Mesa-dev] [Bug 105567] meson/ninja: 1. mesa/vdpau incorrect symlinks in DESTDIR and 2. Ddri-drivers-path Dvdpau-libs-path overrides DESTDIR

2018-04-11 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=105567

--- Comment #14 from Dylan Baker  ---
.la files are generated by libtool, meson doesn't use libtool so there will be
no .la files. It's not a bug, it's actually a feature. As Emil said, projects
should not be dependent on .la files, that's bad behavior.

It doesn't surprise me that you needed to regenerate the ninja files to make
the install work correctly.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.


Re: [Mesa-dev] [Mesa-stable] [PATCH] meson: don't use compiler.has_header

2018-04-11 Thread Dylan Baker
Awesome, thanks Juan!

Quoting Juan A. Suarez Romero (2018-04-11 06:47:24)
> On Thu, 2018-03-29 at 11:20 -0700, Dylan Baker wrote:
> > This should be nominated for stable
> > 
> 
> Queued for next 18.0 stable release.
> 
> J.A.
> 
> 
> 
> > Quoting Dylan Baker (2018-03-12 11:23:23)
> > > Meson's compiler.has_header is completely useless, it only checks that a
> > > header exists, not whether it's usable. This creates problems if a
> > > header contains a conditional #error declaration, like so:
> > > 
> > > > #if __x86_64__
> > > > # error "Doesn't work with x86_64!"
> > > > #endif
> > > 
> > > Compiler.has_header will return true in this case, even when compiling
> > > for x86_64. This is useless.
> > > 
> > > Instead, we'll do a compile check so that any #error declarations will
> > > be treated as errors, and compilation will work.
> > > 
> > > Fixes compilation on x32 architecture.
> > > 
> > > Gentoo Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=649746
> > > meson bug: https://github.com/mesonbuild/meson/issues/2246
> > > CC: Matt Turner 
> > > Signed-off-by: Dylan Baker 
> > > ---
> > >  meson.build | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/meson.build b/meson.build
> > > index 3c63f384381..51b470253f5 100644
> > > --- a/meson.build
> > > +++ b/meson.build
> > > @@ -912,7 +912,7 @@ elif cc.has_header_symbol('sys/mkdev.h', 'major')
> > >  endif
> > >  
> > >  foreach h : ['xlocale.h', 'sys/sysctl.h', 'linux/futex.h']
> > > -  if cc.has_header(h)
> > > +  if cc.compiles('#include <@0@>'.format(h), name : '@0@ works'.format(h))
> > >  pre_args += '-DHAVE_@0@'.format(h.to_upper().underscorify())
> > >endif
> > >  endforeach
> > > -- 
> > > 2.16.2
> > > 
> > 
> > ___
> > mesa-stable mailing list
> > mesa-sta...@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-stable




Re: [Mesa-dev] [Mesa-stable] [PATCH] ac/nir: Add workaround for GFX9 buffer views.

2018-04-11 Thread Bas Nieuwenhuizen
On Wed, Apr 11, 2018 at 4:25 PM, Juan A. Suarez Romero
 wrote:
> Hi, Bas.
>
> Unfortunately, I can't apply this patch to either the next 17.3 stable
> release or 18.0, as it does not apply cleanly and requires other commits
> that didn't land in the stable branches.

This backport (https://patchwork.freedesktop.org/patch/214949/)
applies cleanly on 17.3 for me.

>
>
> For 17.3 it is probably not worth trying to provide a specific version
> for stable, as I'm already cooking the pre-release, and this is the last
> release of the 17.3 series.

Can we please get this in 17.3? This has already been upstream for 2 weeks,
the backport has been available for a week, and it fixes one of the major
games from Feral on Vega:

commit 4503ff760c794c3bb15b978a47c530037d56498e (dow3)
Author: Bas Nieuwenhuizen 
Date:   Wed Mar 28 23:54:40 2018 +0200

Why did I not get a message that that commit could not be applied?

>
>
> It may be worth providing a version for the 18.0 branch, though. It probably
> wouldn't make this release, but surely the next one, which should happen
> in 1 or 2 weeks.
>
>
> For more information, for 18.0 this patch seems to require 1251f08ef ("ac: add
> ac_build_buffer_load_common() helper"), bac9fa9f17f ("ac: add glc parameter to
> ac_build_buffer_load_format"), and probably others.
>
>
> J.A.
>
>
> On Wed, 2018-04-11 at 11:26 +0200, Bas Nieuwenhuizen wrote:
>> On Wed, Apr 4, 2018 at 10:19 PM, Bas Nieuwenhuizen
>>  wrote:
>> > On GFX9 whether the buffer size is interpreted as elements or bytes
>> > depends on whether IDXEN is enabled in the instruction. If the index
>> > is a constant zero, LLVM optimizes IDXEN to 0.
>> >
>> > Now the size in elements is interpreted in bytes which of course
>> > results in out of bounds accesses.
>> >
>> > The correct fix is most likely to disable the LLVM optimization,
>> > but we need something to work with LLVM <= 6.0.
>> >
>> > radeonsi does the max between stride and element count on the CPU
>> > but that results in the size intrinsics returning the wrong size
>> > for the buffer. This would cause CTS errors for radv.
>> >
>> > v2: Also include the store changes.
>> >
>> > Fixes: e38685cc62e 'Revert "radv: disable support for VEGA for now."'
>> > (backported from 4503ff760c794c3bb15b978a47c530037d56498e for 17.3)
>> > ---
>> >  src/amd/common/ac_llvm_build.c  | 20 
>> >  src/amd/common/ac_llvm_build.h  |  8 
>> >  src/amd/common/ac_nir_to_llvm.c | 36 ++--
>> >  src/amd/common/ac_shader_abi.h  |  4 
>> >  4 files changed, 62 insertions(+), 6 deletions(-)
>> >
>> > diff --git a/src/amd/common/ac_llvm_build.c 
>> > b/src/amd/common/ac_llvm_build.c
>> > index e5cd23e025..f193f71c3e 100644
>> > --- a/src/amd/common/ac_llvm_build.c
>> > +++ b/src/amd/common/ac_llvm_build.c
>> > @@ -960,6 +960,26 @@ LLVMValueRef ac_build_buffer_load_format(struct 
>> > ac_llvm_context *ctx,
>> >   AC_FUNC_ATTR_READONLY);
>> >  }
>> >
>> > +LLVMValueRef ac_build_buffer_load_format_gfx9_safe(struct ac_llvm_context 
>> > *ctx,
>> > +  LLVMValueRef rsrc,
>> > +  LLVMValueRef vindex,
>> > +  LLVMValueRef voffset,
>> > +  bool can_speculate)
>> > +{
>> > +   LLVMValueRef elem_count = LLVMBuildExtractElement(ctx->builder, 
>> > rsrc, LLVMConstInt(ctx->i32, 2, 0), "");
>> > +   LLVMValueRef stride = LLVMBuildExtractElement(ctx->builder, rsrc, 
>> > LLVMConstInt(ctx->i32, 1, 0), "");
>> > +   stride = LLVMBuildLShr(ctx->builder, stride, 
>> > LLVMConstInt(ctx->i32, 16, 0), "");
>> > +
>> > +   LLVMValueRef new_elem_count = LLVMBuildSelect(ctx->builder,
>> > + 
>> > LLVMBuildICmp(ctx->builder, LLVMIntUGT, elem_count, stride, ""),
>> > + elem_count, stride, 
>> > "");
>> > +
>> > +   LLVMValueRef new_rsrc = LLVMBuildInsertElement(ctx->builder, rsrc, 
>> > new_elem_count,
>> > +  
>> > LLVMConstInt(ctx->i32, 2, 0), "");
>> > +
>> > +   return ac_build_buffer_load_format(ctx, new_rsrc, vindex, voffset, 
>> > can_speculate);
>> > +}
>> > +
>> >  /**
>> >   * Set range metadata on an instruction.  This can only be used on load 
>> > and
>> >   * call instructions.  If you know an instruction can only produce the 
>> > values
>> > diff --git a/src/amd/common/ac_llvm_build.h 
>> > b/src/amd/common/ac_llvm_build.h
>> > index aa2a2899ab..d4264f2879 100644
>> > --- a/src/amd/common/ac_llvm_build.h
>> > +++ b/src/amd/common/ac_llvm_build.h
>> > @@ -188,6 +188,14 @@ LLVMValueRef 

Re: [Mesa-dev] [Mesa-stable] [PATCH] ac/nir: Add workaround for GFX9 buffer views.

2018-04-11 Thread Juan A. Suarez Romero
Hi, Bas.

Unfortunately, I can't apply this patch to either the next 17.3 stable
release or 18.0, as it does not apply cleanly and requires other commits
that didn't land in the stable branches.


For 17.3 it is probably not worth trying to provide a specific version
for stable, as I'm already cooking the pre-release, and this is the last
release of the 17.3 series.


It may be worth providing a version for the 18.0 branch, though. It probably
wouldn't make this release, but surely the next one, which should happen
in 1 or 2 weeks.


For more information, for 18.0 this patch seems to require 1251f08ef ("ac: add
ac_build_buffer_load_common() helper"), bac9fa9f17f ("ac: add glc parameter to
ac_build_buffer_load_format"), and probably others.


J.A.


On Wed, 2018-04-11 at 11:26 +0200, Bas Nieuwenhuizen wrote:
> On Wed, Apr 4, 2018 at 10:19 PM, Bas Nieuwenhuizen
>  wrote:
> > On GFX9 whether the buffer size is interpreted as elements or bytes
> > depends on whether IDXEN is enabled in the instruction. If the index
> > is a constant zero, LLVM optimizes IDXEN to 0.
> > 
> > Now the size in elements is interpreted in bytes which of course
> > results in out of bounds accesses.
> > 
> > The correct fix is most likely to disable the LLVM optimization,
> > but we need something to work with LLVM <= 6.0.
> > 
> > radeonsi does the max between stride and element count on the CPU
> > but that results in the size intrinsics returning the wrong size
> > for the buffer. This would cause CTS errors for radv.
> > 
> > v2: Also include the store changes.
> > 
> > Fixes: e38685cc62e 'Revert "radv: disable support for VEGA for now."'
> > (backported from 4503ff760c794c3bb15b978a47c530037d56498e for 17.3)
> > ---
> >  src/amd/common/ac_llvm_build.c  | 20 
> >  src/amd/common/ac_llvm_build.h  |  8 
> >  src/amd/common/ac_nir_to_llvm.c | 36 ++--
> >  src/amd/common/ac_shader_abi.h  |  4 
> >  4 files changed, 62 insertions(+), 6 deletions(-)
> > 
> > diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c
> > index e5cd23e025..f193f71c3e 100644
> > --- a/src/amd/common/ac_llvm_build.c
> > +++ b/src/amd/common/ac_llvm_build.c
> > @@ -960,6 +960,26 @@ LLVMValueRef ac_build_buffer_load_format(struct 
> > ac_llvm_context *ctx,
> >   AC_FUNC_ATTR_READONLY);
> >  }
> > 
> > +LLVMValueRef ac_build_buffer_load_format_gfx9_safe(struct ac_llvm_context 
> > *ctx,
> > +  LLVMValueRef rsrc,
> > +  LLVMValueRef vindex,
> > +  LLVMValueRef voffset,
> > +  bool can_speculate)
> > +{
> > +   LLVMValueRef elem_count = LLVMBuildExtractElement(ctx->builder, 
> > rsrc, LLVMConstInt(ctx->i32, 2, 0), "");
> > +   LLVMValueRef stride = LLVMBuildExtractElement(ctx->builder, rsrc, 
> > LLVMConstInt(ctx->i32, 1, 0), "");
> > +   stride = LLVMBuildLShr(ctx->builder, stride, LLVMConstInt(ctx->i32, 
> > 16, 0), "");
> > +
> > +   LLVMValueRef new_elem_count = LLVMBuildSelect(ctx->builder,
> > + 
> > LLVMBuildICmp(ctx->builder, LLVMIntUGT, elem_count, stride, ""),
> > + elem_count, stride, 
> > "");
> > +
> > +   LLVMValueRef new_rsrc = LLVMBuildInsertElement(ctx->builder, rsrc, 
> > new_elem_count,
> > +  
> > LLVMConstInt(ctx->i32, 2, 0), "");
> > +
> > +   return ac_build_buffer_load_format(ctx, new_rsrc, vindex, voffset, 
> > can_speculate);
> > +}
> > +
> >  /**
> >   * Set range metadata on an instruction.  This can only be used on load and
> >   * call instructions.  If you know an instruction can only produce the 
> > values
> > diff --git a/src/amd/common/ac_llvm_build.h b/src/amd/common/ac_llvm_build.h
> > index aa2a2899ab..d4264f2879 100644
> > --- a/src/amd/common/ac_llvm_build.h
> > +++ b/src/amd/common/ac_llvm_build.h
> > @@ -188,6 +188,14 @@ LLVMValueRef ac_build_buffer_load_format(struct 
> > ac_llvm_context *ctx,
> >  LLVMValueRef voffset,
> >  bool can_speculate);
> > 
> > +/* load_format that handles the stride & element count better if idxen is
> > + * disabled by LLVM. */
> > +LLVMValueRef ac_build_buffer_load_format_gfx9_safe(struct ac_llvm_context 
> > *ctx,
> > +  LLVMValueRef rsrc,
> > +  LLVMValueRef vindex,
> > +  LLVMValueRef voffset,
> > +  bool can_speculate);
> > +
> >  LLVMValueRef
> >  
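
For readers following the quoted patch, the arithmetic that
ac_build_buffer_load_format_gfx9_safe() emits as LLVM IR can be sketched on
the CPU in plain C. This is only an illustration: the function name
sketch_gfx9_clamp_elem_count is hypothetical, the descriptor is modelled as
a bare array of four dwords, and the stride is taken as dword 1 shifted
right by 16 exactly as in the patch, with no further masking assumed.

```c
#include <stdint.h>

/* Replace the element count in dword 2 with max(element count, stride),
 * where the stride is read from the upper half of dword 1. */
static void
sketch_gfx9_clamp_elem_count(uint32_t rsrc[4])
{
   uint32_t elem_count = rsrc[2];
   uint32_t stride = rsrc[1] >> 16;

   /* Unsigned compare and select, mirroring LLVMIntUGT + LLVMBuildSelect. */
   rsrc[2] = elem_count > stride ? elem_count : stride;
}
```

Doing this per load in the shader, rather than rewriting the descriptor on
the CPU, leaves the stored descriptor untouched, which appears to be why
the size queries keep returning the original element count.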

Re: [Mesa-dev] [PATCH 04/11] gallium: Use Array._DrawVAO in st_atom_array.c.

2018-04-11 Thread Brian Paul

Hmm, in my experience, interleaved arrays are fairly common.

I still haven't had much time to look at Mathias's latest patches.

And I haven't looked at this code in the state tracker recently, but I seem 
to recall there was some difference between interleaved arrays (in one 
VBO) vs. separate arrays in separate VBOs that needed special handling.


As for determining whether arrays are interleaved, if that's something 
we still need to do, I think it could be lifted into core Mesa.  We 
could add a new gl_vertex_array_object::_IsInterleaved field which is 
only updated when the VAO state is modified.


-Brian


On 04/10/2018 12:09 PM, Marek Olšák wrote:
Generally, if you have to loop over all arrays to find common vertex 
buffers, it's better not to do it. The default separate path is going to 
perform best, because it's straightforward and interleaved arrays are 
super rare.


Marek

On Mon, Apr 9, 2018 at 7:15 PM, Mathias Fröhlich 
> wrote:


Hi Marek,

On Saturday, 7 April 2018 01:53:58 CEST Marek Olšák wrote:
> So interleaved attribs are unsupported, right?
>
> is_interleaved_arrays was probably slowing things down, so I'm OK with 
that.

I am currently away from all the source code and will be back at about
the 22.4.

But off the top of my head: the main purpose of is_interleaved_arrays
that I could spot is to minimize the VBOs that are sent down the pipeline.
In the non-VBO case, the is_interleaved_arrays check did nothing that I
could spot. The buffer itself is marked as a user buffer, and we need a
new vbuffer because of the pointer value anyway. Correct?

So, the VAO now contains all the redundancy information. And thanks to
these bitmask sieves we can easily collect the arrays belonging to a
specific precollapsed binding point.
So, the is_interleaved handling is fully there in the VBO case, even
better than before: it even sees 4 attributes distributed across two
pairwise interleaved VBO arrays.

So even if you are fine with this, if you tell me that the user buffer
code can make use of the same sharing, I can take a look at that and
establish the same sort of sharing here.
best

Mathias


 >
 > Marek
 >
 > On Sun, Apr 1, 2018 at 2:13 PM, > wrote:
 > > From: Mathias Fröhlich >
 > >
 > > Finally make use of the binding information in the VAO when
 > > setting up arrays for draw.
 > >
 > > Signed-off-by: Mathias Fröhlich >
 > > ---
 > >
 > >  src/mesa/state_tracker/st_atom_array.c | 448
 > >
 > > +
 > >
 > >  1 file changed, 124 insertions(+), 324 deletions(-)
 > >
 > > diff --git a/src/mesa/state_tracker/st_atom_array.c
 > > b/src/mesa/state_tracker/st_atom_array.c
 > > index 2fd67e8d84..46934a718a 100644
 > > --- a/src/mesa/state_tracker/st_atom_array.c
 > > +++ b/src/mesa/state_tracker/st_atom_array.c
 > > @@ -48,6 +48,7 @@
 > >
 > >  #include "main/bufferobj.h"
 > >  #include "main/glformats.h"
 > >  #include "main/varray.h"
 > >
 > > +#include "main/arrayobj.h"
 > >
 > >  /* vertex_formats[gltype - GL_BYTE][integer*2 +
normalized][size - 1] */
 > >  static const uint16_t vertex_formats[][4][4] = {
 > >
 > > @@ -306,79 +307,6 @@ st_pipe_vertex_format(const struct
 > > gl_array_attributes *attrib)
 > >
 > >     return vertex_formats[type - GL_BYTE][index][size-1];
 > >
 > >  }
 > >
 > > -static const struct gl_vertex_array *
 > > -get_client_array(const struct gl_vertex_array *arrays,
 > > -                 unsigned mesaAttr)
 > > -{
 > > -   /* st_program uses 0x to denote a double
placeholder attribute
 > > */
 > > -   if (mesaAttr == ST_DOUBLE_ATTRIB_PLACEHOLDER)
 > > -      return NULL;
 >  > > -   return &arrays[mesaAttr];
 > > -}
 > > -
 > > -/**
 > > - * Examine the active arrays to determine if we have interleaved
 > > - * vertex arrays all living in one VBO, or all living in user
space.
 > > - */
 > > -static GLboolean
 > > -is_interleaved_arrays(const struct st_vertex_program *vp,
 > > -                      const struct gl_vertex_array *arrays,
 > > -                      unsigned num_inputs)
 > > -{
 > > -   GLuint attr;
 > > -   const struct gl_buffer_object *firstBufObj = NULL;
 > > -   GLint firstStride = -1;
 > > -   const GLubyte *firstPtr = NULL;
 > > -   GLboolean userSpaceBuffer = GL_FALSE;
 > > -
 > > -   for (attr = 0; attr < num_inputs; attr++) {
 > > -      const struct gl_vertex_array *array;
 > > -      

Re: [Mesa-dev] [Mesa-stable] [PATCH 3/3] ac: make use of if/loop build helpers

2018-04-11 Thread Juan A. Suarez Romero
On Tue, 2018-04-10 at 16:48 +0100, Alex Smith wrote:
> On 10 April 2018 at 15:49, Juan A. Suarez Romero  wrote:
> > On Tue, 2018-04-03 at 10:58 +0100, Alex Smith wrote:
> > > I don't know exactly what's causing it, no. I noticed the issue was fixed 
> > > on master so just bisected to this.
> > >
> > > CC'ing stable to nominate:
> > > 42627dabb4db3011825a022325be7ae9b51103d6 - (1/3) ac: add if/loop build 
> > > helpers
> > > 6e1a142863b368a032e333f09feb107241446053 - (2/3) radeonsi: make use of 
> > > if/loop build helpers in ac
> > > 99cdc019bf6fe11c135b7544ef6daf4ac964fa24 - (3/3) ac: make use of if/loop 
> > > build helpers
> > >
> > 
> > Hi, Alex.
> > 
> > Are these 3 commits nominated for a specific stable branch? From the CC not 
> > sure
> > if you want to nominate them for 17.3, 18.0 or both.
> 
> They work for me on both 18.0 and 17.3, so I think they can be nominated for 
> both.
> 

Thanks. Enqueued them for next 17.3 (last) and 18.0 stable releases.


J.A.

> Thanks,
> Alex
>  
> > 
> > J.A.
> > 
> > >
> > >
> > > On 3 April 2018 at 10:45, Timothy Arceri  wrote:
> > > > I have no issue with these going in stable if they fix bugs. Ideally we 
> > > > should create a piglit test to catch this also but presumably you guys 
> > > > don't actually know the exact shader combination that's tripping things 
> > > > up?
> > > >
> > > >
> > > > On 03/04/18 19:36, Samuel Pitoiset wrote:
> > > > > This fixes a rendering issue with Wolfenstein 2 as well. A backport 
> > > > > sounds reasonable to me.
> > > > >
> > > > > On 04/03/2018 11:33 AM, Alex Smith wrote:
> > > > > > Hi Timothy,
> > > > > >
> > > > > > This patch fixes some rendering issues I see with RADV on SI.
> > > > > >
> > > > > > It doesn't sound like it was really intended to fix anything, so 
> > > > > > possibly it's masking some other issue, but would you object to 
> > > > > > nominating the series for stable? Applying it on the 18.0 branch 
> > > > > > fixes the issue there as well.
> > > > > >
> > > > > > Thanks,
> > > > > > Alex
> > > > > >
> > > > > > > On 7 March 2018 at 20:43, Marek Olšák wrote:
> > > > > >
> > > > > > For the series:
> > > > > >
> > > > > > > Reviewed-by: Marek Olšák
> > > > > >
> > > > > > Marek
> > > > > >
> > > > > > > On Tue, Mar 6, 2018 at 8:40 PM, Timothy Arceri wrote:
> > > > > > >  > These helpers insert the basic blocks in the same order as they
> > > > > > >  > appear in NIR, making it easier to follow LLVM IR dumps. The helpers
> > > > > > >  > also insert more useful labels onto the blocks.
> > > > > > >  >
> > > > > > >  > TGSI uses the line number of the corresponding opcode in the TGSI
> > > > > > >  > dump as the label id; here we use the corresponding block index
> > > > > > >  > from NIR.
> > > > > >  > ---
> > > > > >  >  src/amd/common/ac_nir_to_llvm.c | 60
> > > > > > +
> > > > > >  >  1 file changed, 18 insertions(+), 42 deletions(-)
> > > > > >  >
> > > > > >  > diff --git a/src/amd/common/ac_nir_to_llvm.c
> > > > > > b/src/amd/common/ac_nir_to_llvm.c
> > > > > >  > index cda91fe8bf..dc463ed253 100644
> > > > > >  > --- a/src/amd/common/ac_nir_to_llvm.c
> > > > > >  > +++ b/src/amd/common/ac_nir_to_llvm.c
> > > > > >  > @@ -5237,17 +5237,15 @@ static void visit_ssa_undef(struct
> > > > > > ac_nir_context *ctx,
> > > > > >  > _mesa_hash_table_insert(ctx->defs, >def, 
> > > > > > undef);
> > > > > >  >  }
> > > > > >  >
> > > > > >  > -static void visit_jump(struct ac_nir_context *ctx,
> > > > > >  > +static void visit_jump(struct ac_llvm_context *ctx,
> > > > > >  >const nir_jump_instr *instr)
> > > > > >  >  {
> > > > > >  > switch (instr->type) {
> > > > > >  > case nir_jump_break:
> > > > > >  > -   LLVMBuildBr(ctx->ac.builder, 
> > > > > > ctx->break_block);
> > > > > >  > -   LLVMClearInsertionPosition(ctx->ac.builder);
> > > > > >  > +   ac_build_break(ctx);
> > > > > >  > break;
> > > > > >  > case nir_jump_continue:
> > > > > >  > -   LLVMBuildBr(ctx->ac.builder, 
> > > > > > ctx->continue_block);
> > > > > >  > -   LLVMClearInsertionPosition(ctx->ac.builder);
> > > > > >  > +   ac_build_continue(ctx);
> > > > > >  > break;
> > > > > >  > default:
> > > > > >  > fprintf(stderr, "Unknown NIR jump instr: ");
> > > > > >  > @@ -5285,7 +5283,7 @@ static void visit_block(struct
> > > > > > ac_nir_context *ctx, 

Re: [Mesa-dev] [PATCH v4 6/6] i965: gl_BaseVertex must be zero for non-indexed draw calls

2018-04-11 Thread Antia Puentes


On 10/04/18 18:26, Jason Ekstrand wrote:
On Tue, Apr 10, 2018 at 1:28 AM, Antia Puentes wrote:


On 07/04/18 08:21, Jason Ekstrand wrote:


On Fri, Apr 6, 2018 at 2:53 PM, Ian Romanick wrote:

From: Antia Puentes

We keep 'firstvertex' as it is and move gl_BaseVertex to the drawID
vertex element. The previous Vertex Elements order was:

  * VE 1: 
  * VE 2: 

and now it is:

  * VE 1: 
  * VE 2: 

Moving BaseVertex while keeping VE 1 as it is lets the vertex buffer
associated with VE 1 keep pointing to the indirect buffer for indirect
draw calls.

From the OpenGL 4.6 (11.1.3.9 Shader Inputs) specification:

  "gl_BaseVertex holds the integer value passed to the baseVertex
  parameter to the command that resulted in the current shader
  invocation. In the case where the command has no baseVertex
  parameter, the value of gl_BaseVertex is zero."
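
As an illustration of the layout described above, here is a minimal sketch in C of where each system value is read from after this change. It models only the three intrinsics visible in the brw_nir_lower_vs_inputs hunk of this patch. The names `enum sysval`, `struct ve_slot` and `sysval_slot()` are hypothetical and exist only for this sketch; they are not Mesa API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the NIR intrinsics handled in the hunk. */
enum sysval {
   SYSVAL_FIRST_VERTEX,
   SYSVAL_DRAW_ID,
   SYSVAL_BASE_VERTEX,
};

/* Where a system value lives: which vertex element, which component. */
struct ve_slot {
   unsigned base;
   unsigned component;
};

/* Mirrors the switch in the patch: firstvertex stays in component 0 of
 * the first extra vertex element, while gl_DrawID and gl_BaseVertex
 * share the element right after it (components 0 and 1).
 */
struct ve_slot
sysval_slot(enum sysval sv, unsigned num_inputs, bool has_sgvs)
{
   struct ve_slot slot = { num_inputs, 0 };

   switch (sv) {
   case SYSVAL_FIRST_VERTEX:
      break;
   case SYSVAL_DRAW_ID:
   case SYSVAL_BASE_VERTEX:
      /* Skip over the gl_VertexID-and-friends element if it exists. */
      slot.base = num_inputs + (has_sgvs ? 1u : 0u);
      slot.component = (sv == SYSVAL_DRAW_ID) ? 0 : 1;
      break;
   default:
      assert(!"Invalid system value");
   }

   return slot;
}
```

With four regular inputs and the first extra element present, `sysval_slot(SYSVAL_BASE_VERTEX, 4, true)` gives element 5, component 1, next to gl_DrawID in component 0 of the same element.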

Fixes CTS tests:

  * KHR-GL45.shader_draw_parameters_tests.ShaderDrawArraysParameters
  * KHR-GL45.shader_draw_parameters_tests.ShaderDrawArraysInstancedParameters
  * KHR-GL45.shader_draw_parameters_tests.ShaderMultiDrawArraysParameters
  * KHR-GL45.shader_draw_parameters_tests.ShaderMultiDrawArraysIndirectParameters
  * KHR-GL45.shader_draw_parameters_tests.MultiDrawArraysIndirectCountParameters

v2 (idr): Make changes to brw_prepare_shader_draw_parameters matching
those in genX(emit_vertices).  Reformat commit message to 72 columns.

Signed-off-by: Ian Romanick
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102678

---
 src/intel/compiler/brw_nir.c                  | 14 +
 src/intel/compiler/brw_vec4.cpp               | 14 +
 src/mesa/drivers/dri/i965/brw_context.h       | 32 ++-
 src/mesa/drivers/dri/i965/brw_draw.c          | 45 ++-
 src/mesa/drivers/dri/i965/brw_draw_upload.c   | 14 -
 src/mesa/drivers/dri/i965/genX_state_upload.c | 38 +++---
 6 files changed, 97 insertions(+), 60 deletions(-)

diff --git a/src/intel/compiler/brw_nir.c b/src/intel/compiler/brw_nir.c
index 16b0d86814f..16ab529737b 100644
--- a/src/intel/compiler/brw_nir.c
+++ b/src/intel/compiler/brw_nir.c
@@ -238,8 +238,7 @@ brw_nir_lower_vs_inputs(nir_shader *nir,
     */
    const bool has_sgvs =
       nir->info.system_values_read &
-      (BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX) |
-       BITFIELD64_BIT(SYSTEM_VALUE_FIRST_VERTEX) |
+      (BITFIELD64_BIT(SYSTEM_VALUE_FIRST_VERTEX) |
        BITFIELD64_BIT(SYSTEM_VALUE_BASE_INSTANCE) |
        BITFIELD64_BIT(SYSTEM_VALUE_VERTEX_ID_ZERO_BASE) |
        BITFIELD64_BIT(SYSTEM_VALUE_INSTANCE_ID));
@@ -279,7 +278,6 @@ brw_nir_lower_vs_inputs(nir_shader *nir,
                nir_intrinsic_set_base(load, num_inputs);
                switch (intrin->intrinsic) {
-               case nir_intrinsic_load_base_vertex:
                case nir_intrinsic_load_first_vertex:
                   nir_intrinsic_set_component(load, 0);
                   break;
@@ -293,11 +291,15 @@ brw_nir_lower_vs_inputs(nir_shader *nir,
                   nir_intrinsic_set_component(load, 3);
                   break;
                case nir_intrinsic_load_draw_id:
-                  /* gl_DrawID is stored right after gl_VertexID and friends
-                   * if any of them exist.
+               case nir_intrinsic_load_base_vertex:
+                  /* gl_DrawID and gl_BaseVertex are stored right after
+                     gl_VertexID and friends if any of them exist.
                    */
                   nir_intrinsic_set_base(load, num_inputs + has_sgvs);
-                  nir_intrinsic_set_component(load, 0);
+                  if (intrin->intrinsic == nir_intrinsic_load_draw_id)
+                     nir_intrinsic_set_component(load, 0);
+                  else
+                     nir_intrinsic_set_component(load, 1);
                   break;
                default:
                   unreachable("Invalid system value intrinsic");
diff --git a/src/intel/compiler/brw_vec4.cpp

Re: [Mesa-dev] [PATCH] egl/x11: Handle both depth 30 formats for eglCreateImage(). (v2)

2018-04-11 Thread Emil Velikov
Hi Mario,

Just a small suggestion: be that for now or later - your call.

On 10 April 2018 at 08:43, Mario Kleiner  wrote:

> -dri3_format_for_depth(uint32_t depth)
> +dri3_format_for_depth(struct dri2_egl_display *dri2_dpy, uint32_t depth)
There is nothing DRI3-specific here. To avoid duplication, I'd move
it to platform_x11.c and reuse it in both places.

HTH
Emil
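
To make the suggestion concrete, here is a rough sketch in C of what a shared depth-to-format helper could look like. Everything in it is an assumption for illustration: the `img_format` enum stands in for the driver's image format constants, `format_for_depth()` is a hypothetical name, and choosing between the two depth 30 layouts by looking at the visual's red mask is one plausible approach, not necessarily what the patch does with `dri2_dpy`.

```c
#include <stdint.h>

/* Hypothetical format tags standing in for the driver's constants. */
enum img_format {
   IMG_FORMAT_NONE,
   IMG_FORMAT_RGB565,
   IMG_FORMAT_XRGB8888,
   IMG_FORMAT_XRGB2101010,
   IMG_FORMAT_XBGR2101010,
   IMG_FORMAT_ARGB8888,
};

/* Map an X11 drawable depth to an image format. For depth 30 there are
 * two possible channel orders, so the red channel mask of the visual is
 * used to tell them apart (assumption: red in the low 10 bits means BGR
 * ordering).
 */
enum img_format
format_for_depth(uint32_t depth, uint32_t red_mask)
{
   switch (depth) {
   case 16:
      return IMG_FORMAT_RGB565;
   case 24:
      return IMG_FORMAT_XRGB8888;
   case 30:
      return red_mask == 0x3ff ? IMG_FORMAT_XBGR2101010
                               : IMG_FORMAT_XRGB2101010;
   case 32:
      return IMG_FORMAT_ARGB8888;
   default:
      return IMG_FORMAT_NONE;
   }
}
```

Keeping a single helper like this in one file would let both the DRI2 and DRI3 paths share the depth 30 handling instead of carrying two copies.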