date:20171206

Re: [Mesa-dev] [PATCH] RFC: Workaround for gen9 hw astc5x5 sampler bug

2017-12-06 Thread Rogovin, Kevin



>I'd have to think about it harder but even those are not likely to actually 
>get ASTC decode.  Maybe for some sort of decompression thing but if you're 
>doing
> GetCompressedTexImage, it'll probably turn into a blorp_copy which will use 
> R32G32B32A32_UINT.

I am thinking for the case where an application calls glGetTexImage() or 
glGetTextureImage(), which I think is legal in GL and the GL implementation is 
expected to do the decompress.

-Kevin
 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 101560] SPIR-V OpSwitch with int64 not supported even though shaderInt64 is true

2017-12-06 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=101560

--- Comment #4 from Jason Ekstrand  ---
I have a branch for this now:

https://cgit.freedesktop.org/~jekstrand/mesa/log/?h=wip/spirv-type-tracking

-- 
You are receiving this mail because:
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 2/2] r600/sb: start adding GDS support

2017-12-06 Thread Dave Airlie

From: Dave Airlie 

This adds support for GDS ops to sb backend.

It seems to work for atomic counters on cayman, probably
needs a lot more testing.
---
 src/gallium/drivers/r600/r600_isa.h|  2 +-
 src/gallium/drivers/r600/sb/sb_bc.h|  7 +
 src/gallium/drivers/r600/sb/sb_bc_builder.cpp  | 39 +-
 src/gallium/drivers/r600/sb/sb_bc_decoder.cpp  |  9 +-
 src/gallium/drivers/r600/sb/sb_bc_dump.cpp | 13 +++--
 src/gallium/drivers/r600/sb/sb_bc_finalize.cpp |  7 +
 src/gallium/drivers/r600/sb/sb_bc_parser.cpp   | 11 ++--
 src/gallium/drivers/r600/sb/sb_dump.cpp|  1 +
 src/gallium/drivers/r600/sb/sb_gcm.cpp | 20 ++---
 src/gallium/drivers/r600/sb/sb_ir.h|  3 +-
 src/gallium/drivers/r600/sb/sb_peephole.cpp| 14 -
 src/gallium/drivers/r600/sb/sb_ra_init.cpp |  2 ++
 src/gallium/drivers/r600/sb/sb_shader.cpp  |  3 ++
 13 files changed, 118 insertions(+), 13 deletions(-)

diff --git a/src/gallium/drivers/r600/r600_isa.h 
b/src/gallium/drivers/r600/r600_isa.h
index b5a36b4e80e..f6e26976c5f 100644
--- a/src/gallium/drivers/r600/r600_isa.h
+++ b/src/gallium/drivers/r600/r600_isa.h
@@ -115,7 +115,7 @@ enum alu_op_flags
AF_CC_LE= (5U << AF_CC_SHIFT),
 };
 
-/* flags for FETCH instructions (TEX/VTX) */
+/* flags for FETCH instructions (TEX/VTX/GDS) */
 enum fetch_op_flags
 {
FF_GDS  = (1<<0),
diff --git a/src/gallium/drivers/r600/sb/sb_bc.h 
b/src/gallium/drivers/r600/sb/sb_bc.h
index fed041cf506..fc3fa5082d0 100644
--- a/src/gallium/drivers/r600/sb/sb_bc.h
+++ b/src/gallium/drivers/r600/sb/sb_bc.h
@@ -401,6 +401,7 @@ enum sched_queue_id {
SQ_ALU,
SQ_TEX,
SQ_VTX,
+   SQ_GDS,
 
SQ_NUM
 };
@@ -580,6 +581,11 @@ struct bc_fetch {
unsigned mega_fetch:1;
 
unsigned src2_gpr:7; /* for GDS */
+   unsigned alloc_consume:1;
+   unsigned uav_id:4;
+   unsigned uav_index_mode:2;
+   unsigned bcast_first_req:1;
+
void set_op(unsigned op) { this->op = op; op_ptr = r600_isa_fetch(op); }
 };
 
@@ -966,6 +972,7 @@ private:
int build_fetch_clause(cf_node *n);
int build_fetch_tex(fetch_node *n);
int build_fetch_vtx(fetch_node *n);
+   int build_fetch_gds(fetch_node *n);
 };
 
 } // namespace r600_sb
diff --git a/src/gallium/drivers/r600/sb/sb_bc_builder.cpp 
b/src/gallium/drivers/r600/sb/sb_bc_builder.cpp
index b0df3d9a544..b45e81729df 100644
--- a/src/gallium/drivers/r600/sb/sb_bc_builder.cpp
+++ b/src/gallium/drivers/r600/sb/sb_bc_builder.cpp
@@ -129,7 +129,9 @@ int bc_builder::build_fetch_clause(cf_node* n) {
I != E; ++I) {
fetch_node *f = static_cast(*I);
 
-   if (f->bc.op_ptr->flags & FF_VTX)
+   if (f->bc.op_ptr->flags & FF_GDS)
+   build_fetch_gds(f);
+   else if (f->bc.op_ptr->flags & FF_VTX)
build_fetch_vtx(f);
else
build_fetch_tex(f);
@@ -558,6 +560,41 @@ int bc_builder::build_fetch_tex(fetch_node* n) {
return 0;
 }
 
+int bc_builder::build_fetch_gds(fetch_node *n) {
+   const bc_fetch  = n->bc;
+   const fetch_op_info *fop = bc.op_ptr;
+   unsigned gds_op = (ctx.fetch_opcode(bc.op) >> 8) & 0x3f;
+   assert(fop->flags && FF_GDS);
+
+   fprintf(stderr, "%08x\n", ctx.fetch_opcode(bc.op));
+   bb << MEM_GDS_WORD0_EGCM()
+   .MEM_INST(2)
+   .MEM_OP(4)
+   .SRC_GPR(bc.src_gpr)
+   .SRC_SEL_X(bc.src_sel[0])
+   .SRC_SEL_Y(bc.src_sel[1])
+   .SRC_SEL_Z(bc.src_sel[2]);
+   
+   bb << MEM_GDS_WORD1_EGCM()
+   .DST_GPR(bc.dst_gpr)
+   .DST_REL_MODE(bc.dst_rel)
+   .GDS_OP(gds_op)
+   .SRC_GPR(bc.src2_gpr)
+   .UAV_INDEX_MODE(bc.uav_index_mode)
+   .UAV_ID(bc.uav_id)
+   .ALLOC_CONSUME(bc.alloc_consume)
+   .BCAST_FIRST_REQ(bc.bcast_first_req);
+
+   bb << MEM_GDS_WORD2_EGCM()
+   .DST_SEL_X(bc.dst_sel[0])
+   .DST_SEL_Y(bc.dst_sel[1])
+   .DST_SEL_Z(bc.dst_sel[2])
+   .DST_SEL_W(bc.dst_sel[3]);
+
+   bb << 0;
+   return 0;
+}
+
 int bc_builder::build_fetch_vtx(fetch_node* n) {
const bc_fetch  = n->bc;
const fetch_op_info *fop = bc.op_ptr;
diff --git a/src/gallium/drivers/r600/sb/sb_bc_decoder.cpp 
b/src/gallium/drivers/r600/sb/sb_bc_decoder.cpp
index 8712abe5f78..1fa580e66d6 100644
--- a/src/gallium/drivers/r600/sb/sb_bc_decoder.cpp
+++ b/src/gallium/drivers/r600/sb/sb_bc_decoder.cpp
@@ -415,7 +415,10 @@ int bc_decoder::decode_fetch(unsigned & i, bc_fetch& bc) {
unsigned gds_op;
if (mem_op == 4) {
gds_op = (dw1 >> 9) & 0x1f;
-

[Mesa-dev] [PATCH v2 0/9] anv: Push constant 16bit storage and push UBOs

2017-12-06 Thread Jason Ekstrand

This series is a combination of my push UBOs series, a patch I sent out
this last week and the patch from Jose to enable 16bit storage for push
constants.

I haven't landed UBO pushing yet because I wanted to let Jose land 16bit
storage.  In order for UBO pushing to work now that 16bit storage is
enabled, we need 16bit push constants.  The first patch is a rewrite of
assign_constant_locations to support other bit-widths than 32 and 64.  This
is followed by the patch to enable 16bit storage which is trivial once
assign_constant_locations works.  The last 7 patches are mostly the ones
Jordan reviewed already but with one more patch on top.

Chema Casanova (1):
  anv: Enable VK_KHR_16bit_storage for push_constant

Jason Ekstrand (8):
  i965/fs: Rewrite assign_constant_locations
  anv/pipeline: Translate vulkan_resource_index to a constant when
possible
  anv/cmd_buffer: Add some helpers for working with descriptor sets
  anv/cmd_buffer: Add some stage asserts
  anv/cmd_buffer: Add support for pushing UBO ranges
  anv/device: Increase the UBO alignment requirement to 32
  i965/fs: Handle !supports_pull_constants and push UBOs properly
  anv: Enable UBO pushing

 src/intel/compiler/brw_fs.cpp| 307 +--
 src/intel/vulkan/anv_device.c|  15 +-
 src/intel/vulkan/anv_nir_apply_pipeline_layout.c |  17 +-
 src/intel/vulkan/anv_pipeline.c  |   6 +
 src/intel/vulkan/genX_cmd_buffer.c   | 191 ++
 src/intel/vulkan/genX_pipeline.c |   3 +-
 6 files changed, 356 insertions(+), 183 deletions(-)

-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 6/9] anv/cmd_buffer: Add support for pushing UBO ranges

2017-12-06 Thread Jason Ekstrand

In order to do this we have to modify push constant set up to handle
ranges.  We also have to tweak the way we handle dirty bits a bit so
that we re-push whenever a descriptor set changes.

Reviewed-by: Jordan Justen 
---
 src/intel/vulkan/genX_cmd_buffer.c | 142 -
 src/intel/vulkan/genX_pipeline.c   |   3 +-
 2 files changed, 112 insertions(+), 33 deletions(-)

diff --git a/src/intel/vulkan/genX_cmd_buffer.c 
b/src/intel/vulkan/genX_cmd_buffer.c
index 16b4ca6..0bd3874 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -1843,9 +1843,12 @@ cmd_buffer_emit_descriptor_pointers(struct 
anv_cmd_buffer *cmd_buffer,
}
 }
 
-static uint32_t
-cmd_buffer_flush_push_constants(struct anv_cmd_buffer *cmd_buffer)
+static void
+cmd_buffer_flush_push_constants(struct anv_cmd_buffer *cmd_buffer,
+VkShaderStageFlags dirty_stages)
 {
+   UNUSED const struct anv_pipeline *pipeline = cmd_buffer->state.pipeline;
+
static const uint32_t push_constant_opcodes[] = {
   [MESA_SHADER_VERTEX]  = 21,
   [MESA_SHADER_TESS_CTRL]   = 25, /* HS */
@@ -1857,39 +1860,117 @@ cmd_buffer_flush_push_constants(struct anv_cmd_buffer 
*cmd_buffer)
 
VkShaderStageFlags flushed = 0;
 
-   anv_foreach_stage(stage, cmd_buffer->state.push_constants_dirty) {
-  if (stage == MESA_SHADER_COMPUTE)
- continue;
-
+   anv_foreach_stage(stage, dirty_stages) {
   assert(stage < ARRAY_SIZE(push_constant_opcodes));
   assert(push_constant_opcodes[stage] > 0);
 
-  struct anv_state state = anv_cmd_buffer_push_constants(cmd_buffer, 
stage);
+  anv_batch_emit(_buffer->batch, GENX(3DSTATE_CONSTANT_VS), c) {
+ c._3DCommandSubOpcode = push_constant_opcodes[stage];
 
-  if (state.offset == 0) {
- anv_batch_emit(_buffer->batch, GENX(3DSTATE_CONSTANT_VS), c)
-c._3DCommandSubOpcode = push_constant_opcodes[stage];
-  } else {
- anv_batch_emit(_buffer->batch, GENX(3DSTATE_CONSTANT_VS), c) {
-c._3DCommandSubOpcode = push_constant_opcodes[stage],
-c.ConstantBody = (struct GENX(3DSTATE_CONSTANT_BODY)) {
-#if GEN_GEN >= 9
-   .Buffer[2] = { 
_buffer->device->dynamic_state_pool.block_pool.bo, state.offset },
-   .ReadLength[2] = DIV_ROUND_UP(state.alloc_size, 32),
+ if (anv_pipeline_has_stage(cmd_buffer->state.pipeline, stage)) {
+#if GEN_GEN >= 8 || GEN_IS_HASWELL
+const struct brw_stage_prog_data *prog_data =
+   pipeline->shaders[stage]->prog_data;
+const struct anv_pipeline_bind_map *bind_map =
+   >shaders[stage]->bind_map;
+
+/* The Skylake PRM contains the following restriction:
+ *
+ *"The driver must ensure The following case does not occur
+ * without a flush to the 3D engine: 3DSTATE_CONSTANT_* with
+ * buffer 3 read length equal to zero committed followed by a
+ * 3DSTATE_CONSTANT_* with buffer 0 read length not equal to
+ * zero committed."
+ *
+ * To avoid this, we program the buffers in the highest slots.
+ * This way, slot 0 is only used if slot 3 is also used.
+ */
+int n = 3;
+
+for (int i = 3; i >= 0; i--) {
+   const struct brw_ubo_range *range = _data->ubo_ranges[i];
+   if (range->length == 0)
+  continue;
+
+   const unsigned surface =
+  prog_data->binding_table.ubo_start + range->block;
+
+   assert(surface <= bind_map->surface_count);
+   const struct anv_pipeline_binding *binding =
+  _map->surface_to_descriptor[surface];
+
+   const struct anv_descriptor *desc =
+  anv_descriptor_for_binding(cmd_buffer, binding);
+
+   struct anv_address read_addr;
+   uint32_t read_len;
+   if (desc->type == VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER) {
+  read_len = MIN2(range->length,
+ DIV_ROUND_UP(desc->buffer_view->range, 32) - 
range->start);
+  read_addr = (struct anv_address) {
+ .bo = desc->buffer_view->bo,
+ .offset = desc->buffer_view->offset +
+   range->start * 32,
+  };
+   } else {
+  assert(desc->type == 
VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER_DYNAMIC);
+
+  uint32_t dynamic_offset =
+ dynamic_offset_for_binding(cmd_buffer, pipeline, binding);
+  uint32_t buf_offset =
+ MIN2(desc->offset + dynamic_offset, desc->buffer->size);
+  uint32_t buf_range =
+ MIN2(desc->range, desc->buffer->size -

[Mesa-dev] [PATCH 8/9] i965/fs: Handle !supports_pull_constants and push UBOs properly

2017-12-06 Thread Jason Ekstrand

In Vulkan, we don't support classic pull constants and everything the
client asks us to push, we push.  However, for pushed UBOs, we still
want to fall back to conventional pulls if we run out of space.
---
 src/intel/compiler/brw_fs.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp
index 41260b4..200b77b 100644
--- a/src/intel/compiler/brw_fs.cpp
+++ b/src/intel/compiler/brw_fs.cpp
@@ -2144,7 +2144,7 @@ fs_visitor::assign_constant_locations()
 
   unsigned push_start_align = cplx_align_apply(align, num_push_constants);
   unsigned chunk_size = u - chunk_start + 1;
-  if (!compiler->supports_pull_constants ||
+  if ((!compiler->supports_pull_constants && u < UBO_START) ||
   (chunk_size < max_chunk_size &&
push_start_align + chunk_size <= max_push_components)) {
  /* Align up the number of push constants */
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 1/2] r600/sb: add compute initial state registers.

2017-12-06 Thread Dave Airlie

From: Dave Airlie 

This stops them being optimised out.
---
 src/gallium/drivers/r600/sb/sb_bc_parser.cpp | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/gallium/drivers/r600/sb/sb_bc_parser.cpp 
b/src/gallium/drivers/r600/sb/sb_bc_parser.cpp
index ae92a767b4c..8a4abd48306 100644
--- a/src/gallium/drivers/r600/sb/sb_bc_parser.cpp
+++ b/src/gallium/drivers/r600/sb/sb_bc_parser.cpp
@@ -154,6 +154,9 @@ int bc_parser::parse_decls() {
else if (sh->target == TARGET_GS) {
sh->add_input(0, 1, 0x0F);
sh->add_input(1, 1, 0x0F);
+   } else if (sh->target == TARGET_COMPUTE) {
+   sh->add_input(0, 1, 0x0F);
+   sh->add_input(1, 1, 0x0F);
}
 
bool ps_interp = ctx.hw_class >= HW_CLASS_EVERGREEN
-- 
2.14.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 4/9] anv/cmd_buffer: Add some helpers for working with descriptor sets

2017-12-06 Thread Jason Ekstrand

Reviewed-by: Jordan Justen 
---
 src/intel/vulkan/genX_cmd_buffer.c | 45 --
 1 file changed, 34 insertions(+), 11 deletions(-)

diff --git a/src/intel/vulkan/genX_cmd_buffer.c 
b/src/intel/vulkan/genX_cmd_buffer.c
index ab5590d..e4362d1 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -1432,6 +1432,35 @@ cmd_buffer_alloc_push_constants(struct anv_cmd_buffer 
*cmd_buffer)
cmd_buffer->state.push_constants_dirty |= VK_SHADER_STAGE_ALL_GRAPHICS;
 }
 
+static const struct anv_descriptor *
+anv_descriptor_for_binding(const struct anv_cmd_buffer *cmd_buffer,
+   const struct anv_pipeline_binding *binding)
+{
+   assert(binding->set < MAX_SETS);
+   const struct anv_descriptor_set *set =
+  cmd_buffer->state.descriptors[binding->set];
+   const uint32_t offset =
+  set->layout->binding[binding->binding].descriptor_index;
+   return >descriptors[offset + binding->index];
+}
+
+static uint32_t
+dynamic_offset_for_binding(const struct anv_cmd_buffer *cmd_buffer,
+   const struct anv_pipeline *pipeline,
+   const struct anv_pipeline_binding *binding)
+{
+   assert(binding->set < MAX_SETS);
+   const struct anv_descriptor_set *set =
+  cmd_buffer->state.descriptors[binding->set];
+
+   uint32_t dynamic_offset_idx =
+  pipeline->layout->set[binding->set].dynamic_offset_start +
+  set->layout->binding[binding->binding].dynamic_offset_index +
+  binding->index;
+
+   return cmd_buffer->state.dynamic_offsets[dynamic_offset_idx];
+}
+
 static VkResult
 emit_binding_table(struct anv_cmd_buffer *cmd_buffer,
gl_shader_stage stage,
@@ -1534,10 +1563,8 @@ emit_binding_table(struct anv_cmd_buffer *cmd_buffer,
  continue;
   }
 
-  struct anv_descriptor_set *set =
- cmd_buffer->state.descriptors[binding->set];
-  uint32_t offset = 
set->layout->binding[binding->binding].descriptor_index;
-  struct anv_descriptor *desc = >descriptors[offset + binding->index];
+  const struct anv_descriptor *desc =
+ anv_descriptor_for_binding(cmd_buffer, binding);
 
   switch (desc->type) {
   case VK_DESCRIPTOR_TYPE_SAMPLER:
@@ -1611,14 +1638,10 @@ emit_binding_table(struct anv_cmd_buffer *cmd_buffer,
 
   case VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER_DYNAMIC:
   case VK_DESCRIPTOR_TYPE_STORAGE_BUFFER_DYNAMIC: {
- uint32_t dynamic_offset_idx =
-pipeline->layout->set[binding->set].dynamic_offset_start +
-set->layout->binding[binding->binding].dynamic_offset_index +
-binding->index;
-
  /* Compute the offset within the buffer */
- uint64_t offset = desc->offset +
-cmd_buffer->state.dynamic_offsets[dynamic_offset_idx];
+ uint32_t dynamic_offset =
+dynamic_offset_for_binding(cmd_buffer, pipeline, binding);
+ uint64_t offset = desc->offset + dynamic_offset;
  /* Clamp to the buffer size */
  offset = MIN2(offset, desc->buffer->size);
  /* Clamp the range to the buffer size */
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 1/9] i965/fs: Rewrite assign_constant_locations

2017-12-06 Thread Jason Ekstrand

This rewires the logic for assigning uniform locations to work in terms
of "complex alignments".  The basic idea is that, as we walk the list of
instructions, we keep track of the alignment and continuity requirements
of each slot and assert that the alignments all match up.  We then use
those alignments in the compaction stage to ensure that everything gets
placed at a properly aligned register.  The old mechanism handled
alignments by special-casing each of the bit sizes and placing 64-bit
values first followed by 32-bit values.

The old scheme had the advantage of never leaving a hole since all the
64-bit values could be tightly packed and so could the 32-bit values.
However, the new scheme has no type size special cases so it handles not
only 32 and 64-bit types but should gracefully extend to 16 and 8-bit
types as the need arises.

Tested-by: Jose Maria Casanova Crespo 
---
 src/intel/compiler/brw_fs.cpp | 307 --
 1 file changed, 174 insertions(+), 133 deletions(-)

diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp
index 93bb6b4..41260b4 100644
--- a/src/intel/compiler/brw_fs.cpp
+++ b/src/intel/compiler/brw_fs.cpp
@@ -1906,62 +1906,6 @@ fs_visitor::compact_virtual_grfs()
return progress;
 }
 
-static void
-set_push_pull_constant_loc(unsigned uniform, int *chunk_start,
-   unsigned *max_chunk_bitsize,
-   bool contiguous, unsigned bitsize,
-   const unsigned target_bitsize,
-   int *push_constant_loc, int *pull_constant_loc,
-   unsigned *num_push_constants,
-   unsigned *num_pull_constants,
-   const unsigned max_push_components,
-   const unsigned max_chunk_size,
-   bool allow_pull_constants,
-   struct brw_stage_prog_data *stage_prog_data)
-{
-   /* This is the first live uniform in the chunk */
-   if (*chunk_start < 0)
-  *chunk_start = uniform;
-
-   /* Keep track of the maximum bit size access in contiguous uniforms */
-   *max_chunk_bitsize = MAX2(*max_chunk_bitsize, bitsize);
-
-   /* If this element does not need to be contiguous with the next, we
-* split at this point and everything between chunk_start and u forms a
-* single chunk.
-*/
-   if (!contiguous) {
-  /* If bitsize doesn't match the target one, skip it */
-  if (*max_chunk_bitsize != target_bitsize) {
- /* FIXME: right now we only support 32 and 64-bit accesses */
- assert(*max_chunk_bitsize == 4 || *max_chunk_bitsize == 8);
- *max_chunk_bitsize = 0;
- *chunk_start = -1;
- return;
-  }
-
-  unsigned chunk_size = uniform - *chunk_start + 1;
-
-  /* Decide whether we should push or pull this parameter.  In the
-   * Vulkan driver, push constants are explicitly exposed via the API
-   * so we push everything.  In GL, we only push small arrays.
-   */
-  if (!allow_pull_constants ||
-  (*num_push_constants + chunk_size <= max_push_components &&
-   chunk_size <= max_chunk_size)) {
- assert(*num_push_constants + chunk_size <= max_push_components);
- for (unsigned j = *chunk_start; j <= uniform; j++)
-push_constant_loc[j] = (*num_push_constants)++;
-  } else {
- for (unsigned j = *chunk_start; j <= uniform; j++)
-pull_constant_loc[j] = (*num_pull_constants)++;
-  }
-
-  *max_chunk_bitsize = 0;
-  *chunk_start = -1;
-   }
-}
-
 static int
 get_subgroup_id_param_index(const brw_stage_prog_data *prog_data)
 {
@@ -1977,6 +1921,98 @@ get_subgroup_id_param_index(const brw_stage_prog_data 
*prog_data)
 }
 
 /**
+ * Struct for handling complex alignments.
+ *
+ * A complex alignment is stored as multiplier and an offset.  A value is
+ * considered to be aligned if it is congruent to the offset modulo the
+ * multiplier.
+ */
+struct cplx_align {
+   unsigned mul:4;
+   unsigned offset:4;
+};
+
+#define CPLX_ALIGN_MAX_MUL 8
+
+static void
+cplx_align_assert_sane(struct cplx_align a)
+{
+   assert(a.mul > 0 && util_is_power_of_two(a.mul));
+   assert(a.offset < a.mul);
+}
+
+/**
+ * Combines two alignments to produce a least multiple of sorts.
+ *
+ * The returned alignment is the smallest (in terms of multiplier) such that
+ * anything aligned to both a and b will be aligned to the new alignment.
+ * This function will assert-fail if a and b are not compatible, i.e. if the
+ * offset parameters are such that no common alignment is possible.
+ */
+static struct cplx_align
+cplx_align_combine(struct cplx_align a, struct cplx_align b)
+{
+   cplx_align_assert_sane(a);
+   cplx_align_assert_sane(b);
+
+   /* Assert that the alignments agree. */
+   assert((a.offset & (b.mul - 1)) == (b.offset & (a.mul - 1)));
+
+   return a.mul > b.mul ? a : b;
+}
+

[Mesa-dev] [PATCH 7/9] anv/device: Increase the UBO alignment requirement to 32

2017-12-06 Thread Jason Ekstrand

Push constants work in terms of 32-byte chunks so if we want to be able
to push UBOs, every thing needs to be 32-byte aligned.  Currently, we
only require 16-byte which is too small.

Reviewed-by: Jordan Justen 
---
 src/intel/vulkan/anv_device.c | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index fd5326b..013823e 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -849,7 +849,8 @@ void anv_GetPhysicalDeviceProperties(
   .viewportSubPixelBits = 13, /* We take a float? */
   .minMemoryMapAlignment= 4096, /* A page */
   .minTexelBufferOffsetAlignment= 1,
-  .minUniformBufferOffsetAlignment  = 16,
+  /* We need 16 for UBO block reads to work and 32 for push UBOs */
+  .minUniformBufferOffsetAlignment  = 32,
   .minStorageBufferOffsetAlignment  = 4,
   .minTexelOffset   = -8,
   .maxTexelOffset   = 7,
@@ -1915,8 +1916,15 @@ void anv_GetBufferMemoryRequirements(
  memory_types |= (1u << i);
}
 
+   /* Base alignment requirement of a cache line */
+   uint32_t alignment = 16;
+
+   /* We need an alignment of 32 for pushing UBOs */
+   if (buffer->usage & VK_BUFFER_USAGE_UNIFORM_BUFFER_BIT)
+  alignment = MAX2(alignment, 32);
+
pMemoryRequirements->size = buffer->size;
-   pMemoryRequirements->alignment = 16;
+   pMemoryRequirements->alignment = alignment;
pMemoryRequirements->memoryTypeBits = memory_types;
 }
 
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 9/9] anv: Enable UBO pushing

2017-12-06 Thread Jason Ekstrand

Push constants on Intel hardware are significantly more performant than
pull constants.  Since most Vulkan applications don't actively use push
constants on Vulkan or at least don't use it heavily, we're pulling way
more than we should be.  By enabling pushing chunks of UBOs we can get
rid of a lot of those pulls.

On my SKL GT4e, this improves the performance of Dota 2 and Talos by
around 2.5% and improves Aztec Ruins by around 2%.

Reviewed-by: Jordan Justen 
---
 src/intel/vulkan/anv_device.c   | 1 +
 src/intel/vulkan/anv_pipeline.c | 6 ++
 2 files changed, 7 insertions(+)

diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index 013823e..a9364b5 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -419,6 +419,7 @@ anv_physical_device_init(struct anv_physical_device *device,
device->compiler->shader_debug_log = compiler_debug_log;
device->compiler->shader_perf_log = compiler_perf_log;
device->compiler->supports_pull_constants = false;
+   device->compiler->constant_buffer_0_is_relative = true;
 
isl_device_init(>isl_dev, >info, swizzled);
 
diff --git a/src/intel/vulkan/anv_pipeline.c b/src/intel/vulkan/anv_pipeline.c
index cf2079d..f2d4d113 100644
--- a/src/intel/vulkan/anv_pipeline.c
+++ b/src/intel/vulkan/anv_pipeline.c
@@ -389,6 +389,9 @@ anv_pipeline_compile(struct anv_pipeline *pipeline,
  struct brw_stage_prog_data *prog_data,
  struct anv_pipeline_bind_map *map)
 {
+   const struct brw_compiler *compiler =
+  pipeline->device->instance->physicalDevice.compiler;
+
nir_shader *nir = anv_shader_compile_to_nir(pipeline, mem_ctx,
module, entrypoint, stage,
spec_info);
@@ -438,6 +441,9 @@ anv_pipeline_compile(struct anv_pipeline *pipeline,
if (pipeline->layout)
   anv_nir_apply_pipeline_layout(pipeline, nir, prog_data, map);
 
+   if (stage != MESA_SHADER_COMPUTE)
+  brw_nir_analyze_ubo_ranges(compiler, nir, prog_data->ubo_ranges);
+
assert(nir->num_uniforms == prog_data->nr_params * 4);
 
return nir;
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 5/9] anv/cmd_buffer: Add some stage asserts

2017-12-06 Thread Jason Ekstrand

There are several places where we look up opcodes in an array of stages.
Assert that the we don't end up going out-of-bounds.

Reviewed-by: Jordan Justen 
---
 src/intel/vulkan/genX_cmd_buffer.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/src/intel/vulkan/genX_cmd_buffer.c 
b/src/intel/vulkan/genX_cmd_buffer.c
index e4362d1..16b4ca6 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -1822,6 +1822,9 @@ cmd_buffer_emit_descriptor_pointers(struct anv_cmd_buffer 
*cmd_buffer,
};
 
anv_foreach_stage(s, stages) {
+  assert(s < ARRAY_SIZE(binding_table_opcodes));
+  assert(binding_table_opcodes[s] > 0);
+
   if (cmd_buffer->state.samplers[s].alloc_size > 0) {
  anv_batch_emit(_buffer->batch,
 GENX(3DSTATE_SAMPLER_STATE_POINTERS_VS), ssp) {
@@ -1858,6 +1861,9 @@ cmd_buffer_flush_push_constants(struct anv_cmd_buffer 
*cmd_buffer)
   if (stage == MESA_SHADER_COMPUTE)
  continue;
 
+  assert(stage < ARRAY_SIZE(push_constant_opcodes));
+  assert(push_constant_opcodes[stage] > 0);
+
   struct anv_state state = anv_cmd_buffer_push_constants(cmd_buffer, 
stage);
 
   if (state.offset == 0) {
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] mesa: remove set_entry from forward type declarations

2017-12-06 Thread Timothy Arceri


Reviewed-by: Timothy Arceri 

On 06/12/17 21:38, Alejandro Piñeiro wrote:

This type was used at gl_sync_object, but it is not used anymore.
---
  src/mesa/main/mtypes.h | 1 -
  1 file changed, 1 deletion(-)

diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index b478f6158e2..4c90f58ef5a 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -99,7 +99,6 @@ struct gl_uniform_storage;
  struct prog_instruction;
  struct gl_program_parameter_list;
  struct set;
-struct set_entry;
  struct vbo_context;
  /*@}*/
  


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2 21/25] mesa: Add a reference to gl_shader_spirv_data to gl_linked_shader

2017-12-06 Thread Timothy Arceri


Reviewed-by: Timothy Arceri 

On 01/12/17 04:28, Eduardo Lima Mitev wrote:

This is a reference to the spirv_data object stored in gl_shader, which
stores shader SPIR-V data that is needed during linking too.
---
  src/mesa/main/mtypes.h| 8 
  src/mesa/main/shaderobj.c | 1 +
  2 files changed, 9 insertions(+)

diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index d74bf10daa0..1c8de9542e8 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -2542,6 +2542,14 @@ struct gl_linked_shader
 struct exec_list *packed_varyings;
 struct exec_list *fragdata_arrays;
 struct glsl_symbol_table *symbols;
+
+   /**
+* ARB_gl_spirv related data.
+*
+* This is actually a reference to the gl_shader::spirv_data, which
+* stores information that is also needed during linking.
+*/
+   struct gl_shader_spirv_data *spirv_data;
  };
  
  
diff --git a/src/mesa/main/shaderobj.c b/src/mesa/main/shaderobj.c

index 5c1cdd6b27a..834e2a92ec4 100644
--- a/src/mesa/main/shaderobj.c
+++ b/src/mesa/main/shaderobj.c
@@ -137,6 +137,7 @@ void
  _mesa_delete_linked_shader(struct gl_context *ctx,
 struct gl_linked_shader *sh)
  {
+   _mesa_shader_spirv_data_reference(>spirv_data, NULL);
 _mesa_reference_program(ctx, >Program, NULL);
 ralloc_free(sh);
  }


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] mesa: define nir_spirv_supported_capabilities

2017-12-06 Thread Jason Ekstrand

On Wed, Dec 6, 2017 at 8:48 PM, Timothy Arceri 
wrote:

> On 07/12/17 00:23, Alejandro Piñeiro wrote:
>
>> On 06/12/17 13:29, Pierre Moreau wrote:
>>
>>> Hello Alejandro,
>>>
>>> As far as I understand, nir_spirv_supported_capabilities is being
>>> filled in by
>>> the driver and then fetched by the API entrypoint to check the
>>> capabilities
>>> required by the SPIR-V binary given as input. And this is done
>>> regardless of
>>> the input IR used by the driver, be it NIR, LLVM IR, TGSI or others. So
>>> couldn’t it be just named spirv_supported_capabilities? Unless it also
>>> reflects
>>> the capabilities supported by the IR being used.
>>>
>>
>> Good point. spirv_supported_capabilities is probably a better name,
>> although right now, it would be only used on the spirv to nir pass. I
>> will not send a new version of the patch with just the renaming, but for
>> anyone interested on review the commit, I will use that name.
>>
>
> I would be much happier with this being in mtypes.h with that name. So if
> renamed:
>

Ugh... I just now got around to looking at this and saw that it went in
mtypes.h.  Can we please move it?  We've worked very hard to keep the
Vulkan driver from having to pull in any GL headers and data structures and
now a fairly core piece lives in mtypes.h.

--Jason


> Reviewed-by: Timothy Arceri 
>
>
>
>> I guess nir_spirv_supported_capabilities could be extended later on to
>>> also add
>>> capabilities specific to OpenCL when clover reaches OpenCL 1.2 support
>>> (and can
>>> start accepting SPIR-V binaries as input through the cl_khr_il_program
>>> extension), or would it be better to have a separate one for OpenCL?
>>>
>>
>> Probably it would be re-used, but I don't know the specifics of OpenCL
>> to ensure 100% that.
>>
>>
>>> I haven’t had time to look at the whole gl_spirv series yet, so I am
>>> sorry if
>>> this is something that has already been brought and answered in that
>>> thread.
>>>
>>
>> No sorries, your feedback was good and welcomed.
>>
>>>
>>> Regards,
>>> Pierre
>>>
>>> On 2017-12-06 — 09:57, Alejandro Piñeiro wrote:
>>>
 Until now it was part of spirv_to_nir_options. But it will be used on
 the implementation of ARB_gl_spirv and ARB_spirv_extensions, and added
 to the OpenGL context, as a way to save what SPIR-V capabilities the
 current OpenGL implementation supports.
 ---

 We are sending this commit in advance of a v3 of the initial gl_spirv
 and spirv_extensions support series. The issue is that lately there
 were a lot of activity on the spirv/spir_to_nir code base, and we are
 being fixing rebase conflicts constantly. Getting this commit on
 master would make things easier.

 FWIW, this patch is similar to one that Ian Romanick already granted
 Rb, but that was dropped after all the mentioned changes:
 https://lists.freedesktop.org/archives/mesa-dev/2017-Novembe
 r/178261.html

   src/compiler/spirv/nir_spirv.h | 16 +++-
   src/mesa/main/mtypes.h | 12 
   2 files changed, 15 insertions(+), 13 deletions(-)

 diff --git a/src/compiler/spirv/nir_spirv.h
 b/src/compiler/spirv/nir_spirv.h
 index 43ec19d5a50..113bd710a00 100644
 --- a/src/compiler/spirv/nir_spirv.h
 +++ b/src/compiler/spirv/nir_spirv.h
 @@ -28,7 +28,8 @@
   #ifndef _NIR_SPIRV_H_
   #define _NIR_SPIRV_H_
   -#include "nir/nir.h"
 +#include "compiler/nir/nir.h"
 +#include "main/mtypes.h"
 #ifdef __cplusplus
   extern "C" {
 @@ -57,18 +58,7 @@ struct spirv_to_nir_options {
   */
  bool lower_workgroup_access_to_offsets;
   -   struct {
 -  bool float64;
 -  bool image_ms_array;
 -  bool tessellation;
 -  bool draw_parameters;
 -  bool image_read_without_format;
 -  bool image_write_without_format;
 -  bool int64;
 -  bool multiview;
 -  bool variable_pointers;
 -  bool storage_16bit;
 -   } caps;
 +   struct nir_spirv_supported_capabilities caps;
struct {
 void (*func)(void *private_data,
 diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
 index b478f6158e2..7da05aa3ee9 100644
 --- a/src/mesa/main/mtypes.h
 +++ b/src/mesa/main/mtypes.h
 @@ -3579,6 +3579,18 @@ struct gl_program_constants
  GLuint MaxShaderStorageBlocks;
   };
   +struct nir_spirv_supported_capabilities {
 +   bool float64;
 +   bool image_ms_array;
 +   bool tessellation;
 +   bool draw_parameters;
 +   bool image_read_without_format;
 +   bool image_write_without_format;
 +   bool int64;
 +   bool multiview;
 +   bool variable_pointers;
 +   bool storage_16bit;
 +};
 /**
* Constants which may be overridden by device driver during context
 creation
 --
 2.11.0

Re: [Mesa-dev] [PATCH v2 19/25] mesa/glspirv: Add a _mesa_spirv_link_shaders() placeholder

2017-12-06 Thread Timothy Arceri


Please squash this with patch 22 tis is just code churn.

On 01/12/17 04:28, Eduardo Lima Mitev wrote:

This will be the equivalent to link_shaders() from
src/compiler/glsl/linker.cpp, but for SPIR-V programs.
---
  src/mesa/main/glspirv.c | 10 ++
  src/mesa/main/glspirv.h |  4 
  2 files changed, 14 insertions(+)

diff --git a/src/mesa/main/glspirv.c b/src/mesa/main/glspirv.c
index 18710c0d8fc..e533853f7fa 100644
--- a/src/mesa/main/glspirv.c
+++ b/src/mesa/main/glspirv.c
@@ -104,6 +104,16 @@ _mesa_spirv_shader_binary(struct gl_context *ctx,
 }
  }
  
+void

+_mesa_spirv_link_shaders(struct gl_context *ctx, struct gl_shader_program 
*prog)
+{
+   /* @TODO: This is a placeholder for the equivalent of
+* compiler/glsl/linker.cpp::link_shaders() but for SPIR-V.
+*/
+   prog->data->LinkStatus = linking_success;
+   prog->data->Validated = false;
+}
+
  void GLAPIENTRY
  _mesa_SpecializeShaderARB(GLuint shader,
const GLchar *pEntryPoint,
diff --git a/src/mesa/main/glspirv.h b/src/mesa/main/glspirv.h
index ba281f68bef..0f03b75c111 100644
--- a/src/mesa/main/glspirv.h
+++ b/src/mesa/main/glspirv.h
@@ -76,6 +76,10 @@ _mesa_spirv_shader_binary(struct gl_context *ctx,
unsigned n, struct gl_shader **shaders,
const void* binary, size_t length);
  
+void

+_mesa_spirv_link_shaders(struct gl_context *ctx,
+ struct gl_shader_program *prog);
+
  /**
   * \name API functions
   */


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/3] nir: Fix interaction of GL_CLAMP lowering with texture offsets.

2017-12-06 Thread Eric Anholt

Eric Anholt  writes:

> We want the clamping of the coordinate to apply after the offset, so we
> need to do math to lower the offset out of the instruction.  Fixes texwrap
> offset cases for GL_CLAMP with GL_NEAREST on vc5.

Still looking for a review on this series.


signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] mesa: define nir_spirv_supported_capabilities

2017-12-06 Thread Timothy Arceri


On 07/12/17 00:23, Alejandro Piñeiro wrote:

On 06/12/17 13:29, Pierre Moreau wrote:

Hello Alejandro,

As far as I understand, nir_spirv_supported_capabilities is being filled in by
the driver and then fetched by the API entrypoint to check the capabilities
required by the SPIR-V binary given as input. And this is done regardless of
the input IR used by the driver, be it NIR, LLVM IR, TGSI or others. So
couldn’t it be just named spirv_supported_capabilities? Unless it also reflects
the capabilities supported by the IR being used.


Good point. spirv_supported_capabilities is probably a better name,
although right now, it would be only used on the spirv to nir pass. I
will not send a new version of the patch with just the renaming, but for
anyone interested on review the commit, I will use that name.


I would be much happier with this being in mtypes.h with that name. So 
if renamed:


Reviewed-by: Timothy Arceri 




I guess nir_spirv_supported_capabilities could be extended later on to also add
capabilities specific to OpenCL when clover reaches OpenCL 1.2 support (and can
start accepting SPIR-V binaries as input through the cl_khr_il_program
extension), or would it be better to have a separate one for OpenCL?


Probably it would be re-used, but I don't know the specifics of OpenCL
to ensure 100% that.



I haven’t had time to look at the whole gl_spirv series yet, so I am sorry if
this is something that has already been brought and answered in that thread.


No sorries, your feedback was good and welcomed.


Regards,
Pierre

On 2017-12-06 — 09:57, Alejandro Piñeiro wrote:

Until now it was part of spirv_to_nir_options. But it will be used on
the implementation of ARB_gl_spirv and ARB_spirv_extensions, and added
to the OpenGL context, as a way to save what SPIR-V capabilities the
current OpenGL implementation supports.
---

We are sending this commit in advance of a v3 of the initial gl_spirv
and spirv_extensions support series. The issue is that lately there
were a lot of activity on the spirv/spir_to_nir code base, and we are
being fixing rebase conflicts constantly. Getting this commit on
master would make things easier.

FWIW, this patch is similar to one that Ian Romanick already granted
Rb, but that was dropped after all the mentioned changes:
https://lists.freedesktop.org/archives/mesa-dev/2017-November/178261.html

  src/compiler/spirv/nir_spirv.h | 16 +++-
  src/mesa/main/mtypes.h | 12 
  2 files changed, 15 insertions(+), 13 deletions(-)

diff --git a/src/compiler/spirv/nir_spirv.h b/src/compiler/spirv/nir_spirv.h
index 43ec19d5a50..113bd710a00 100644
--- a/src/compiler/spirv/nir_spirv.h
+++ b/src/compiler/spirv/nir_spirv.h
@@ -28,7 +28,8 @@
  #ifndef _NIR_SPIRV_H_
  #define _NIR_SPIRV_H_
  
-#include "nir/nir.h"

+#include "compiler/nir/nir.h"
+#include "main/mtypes.h"
  
  #ifdef __cplusplus

  extern "C" {
@@ -57,18 +58,7 @@ struct spirv_to_nir_options {
  */
 bool lower_workgroup_access_to_offsets;
  
-   struct {

-  bool float64;
-  bool image_ms_array;
-  bool tessellation;
-  bool draw_parameters;
-  bool image_read_without_format;
-  bool image_write_without_format;
-  bool int64;
-  bool multiview;
-  bool variable_pointers;
-  bool storage_16bit;
-   } caps;
+   struct nir_spirv_supported_capabilities caps;
  
 struct {

void (*func)(void *private_data,
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index b478f6158e2..7da05aa3ee9 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -3579,6 +3579,18 @@ struct gl_program_constants
 GLuint MaxShaderStorageBlocks;
  };
  
+struct nir_spirv_supported_capabilities {

+   bool float64;
+   bool image_ms_array;
+   bool tessellation;
+   bool draw_parameters;
+   bool image_read_without_format;
+   bool image_write_without_format;
+   bool int64;
+   bool multiview;
+   bool variable_pointers;
+   bool storage_16bit;
+};
  
  /**

   * Constants which may be overridden by device driver during context creation
--
2.11.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev





___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2] glsl: get correct member type when processing xfb ifc arrays

2017-12-06 Thread Kenneth Graunke

On Wednesday, December 6, 2017 4:57:05 PM PST Timothy Arceri wrote:
> This fixes a crash in:
> 
> KHR-GL45.enhanced_layouts.xfb_block_stride
> 
> Fixes: 0822517936d4 "glsl: add helper to process xfb qualifiers during 
> linking"
> Cc: Kenneth Graunke 

Thanks a ton for fixing this, Tim!

Reviewed-by: Kenneth Graunke 

signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [rfc] r600 sb improvements (compute/gds)

2017-12-06 Thread Dave Airlie

This is a first pass at adding GDS support to the r600/sb backend,
along with a small compute shader fix.

For compute shaders, LDS needs supporting as well.

Dave.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 3/9] anv/pipeline: Translate vulkan_resource_index to a constant when possible

2017-12-06 Thread Jason Ekstrand

We want to call brw_nir_analyze_ubo_ranges immedately after
anv_nir_apply_pipeline_layout and it badly wants constants.  We could
run an optimization step and let constant folding do it but that's way
more expensive than needed.  It's really easy to just handle constants
in apply_pipeline_layout.

Reviewed-by: Jordan Justen 
---
 src/intel/vulkan/anv_nir_apply_pipeline_layout.c | 17 +
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/src/intel/vulkan/anv_nir_apply_pipeline_layout.c 
b/src/intel/vulkan/anv_nir_apply_pipeline_layout.c
index 612b3f7..978a8a5 100644
--- a/src/intel/vulkan/anv_nir_apply_pipeline_layout.c
+++ b/src/intel/vulkan/anv_nir_apply_pipeline_layout.c
@@ -116,12 +116,21 @@ lower_res_index_intrinsic(nir_intrinsic_instr *intrin,
uint32_t array_size =
   state->layout->set[set].layout->binding[binding].array_size;
 
-   nir_ssa_def *block_index = nir_ssa_for_src(b, intrin->src[0], 1);
+   nir_const_value *const_array_index = nir_src_as_const_value(intrin->src[0]);
 
-   if (state->add_bounds_checks)
-  block_index = nir_umin(b, block_index, nir_imm_int(b, array_size - 1));
+   nir_ssa_def *block_index;
+   if (const_array_index) {
+  unsigned array_index = const_array_index->u32[0];
+  array_index = MIN2(array_index, array_size - 1);
+  block_index = nir_imm_int(b, surface_index + array_index);
+   } else {
+  block_index = nir_ssa_for_src(b, intrin->src[0], 1);
 
-   block_index = nir_iadd(b, nir_imm_int(b, surface_index), block_index);
+  if (state->add_bounds_checks)
+ block_index = nir_umin(b, block_index, nir_imm_int(b, array_size - 
1));
+
+  block_index = nir_iadd(b, nir_imm_int(b, surface_index), block_index);
+   }
 
assert(intrin->dest.is_ssa);
nir_ssa_def_rewrite_uses(>dest.ssa, nir_src_for_ssa(block_index));
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 2/9] anv: Enable VK_KHR_16bit_storage for push_constant

2017-12-06 Thread Jason Ekstrand

From: Chema Casanova 

Enables storagePushConstant16 feature of VK_KHR_16bit_storage
for Gen8+.

Reviewed-by: Jason Ekstrand 
---
 src/intel/vulkan/anv_device.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index 81a2ed6..fd5326b 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -747,7 +747,7 @@ void anv_GetPhysicalDeviceFeatures2KHR(
 
  features->storageBuffer16BitAccess = pdevice->info.gen >= 8;
  features->uniformAndStorageBuffer16BitAccess = pdevice->info.gen >= 8;
- features->storagePushConstant16 = false;
+ features->storagePushConstant16 = pdevice->info.gen >= 8;
  features->storageInputOutput16 = false;
  break;
   }
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 23/23] docs: Update GL_ARB_get_program_binary docs to support 1 format

2017-12-06 Thread Tapani Pälli


Acked-by: Tapani Pälli 

On 11/29/2017 03:24 AM, Timothy Arceri wrote:

From: Jordan Justen 

Signed-off-by: Jordan Justen 
---
  docs/features.txt | 2 +-
  docs/relnotes/17.4.0.html | 1 +
  2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/docs/features.txt b/docs/features.txt
index 01cd133ef01..39aaa4f603b 100644
--- a/docs/features.txt
+++ b/docs/features.txt
@@ -132,21 +132,21 @@ GL 4.0, GLSL 4.00 --- all DONE: i965/gen7+, nvc0, r600, 
radeonsi
GL_ARB_texture_cube_map_array DONE (i965/gen6+, 
nv50, llvmpipe, softpipe)
GL_ARB_texture_gather DONE (freedreno, 
i965/gen6+, nv50, llvmpipe, softpipe, swr)
GL_ARB_texture_query_lod  DONE (freedreno, 
i965, nv50, llvmpipe, softpipe)
GL_ARB_transform_feedback2DONE (i965/gen6+, 
nv50, llvmpipe, softpipe, swr)
GL_ARB_transform_feedback3DONE (i965/gen7+, 
llvmpipe, softpipe, swr)
  
  
  GL 4.1, GLSL 4.10 --- all DONE: i965/gen7+, nvc0, r600, radeonsi
  
GL_ARB_ES2_compatibility  DONE (freedreno, i965, nv50, llvmpipe, softpipe, swr)

-  GL_ARB_get_program_binary DONE (0 binary formats)
+  GL_ARB_get_program_binary DONE (0 or 1 binary 
formats)
GL_ARB_separate_shader_objectsDONE (all drivers)
GL_ARB_shader_precision   DONE (i965/gen7+, all 
drivers that support GLSL 4.10)
GL_ARB_vertex_attrib_64bitDONE (i965/gen7+, 
llvmpipe, softpipe)
GL_ARB_viewport_array DONE (i965, nv50, 
llvmpipe, softpipe)
  
  
  GL 4.2, GLSL 4.20 -- all DONE: i965/gen7+, nvc0, r600, radeonsi
  
GL_ARB_texture_compression_bptc   DONE (freedreno, i965)

GL_ARB_compressed_texture_pixel_storage   DONE (all drivers)
diff --git a/docs/relnotes/17.4.0.html b/docs/relnotes/17.4.0.html
index ec2386b3305..2e884fa764e 100644
--- a/docs/relnotes/17.4.0.html
+++ b/docs/relnotes/17.4.0.html
@@ -42,20 +42,21 @@ TBD.
  
  Note: some of the new features are only available with certain drivers.
  
  
  

  Disk shader cache support for i965 when MESA_GLSL_CACHE_DISABLE environment variable is set to 
"0" or "false"
  GL_ARB_shader_atomic_counters and GL_ARB_shader_atomic_counter_ops on 
r600/evergreen+
  GL_ARB_shader_image_load_store and GL_ARB_shader_image_size on 
r600/evergreen+
  GL_ARB_cull_distance on r600/evergreen+
  OpenGL 4.2 on r600/evergreen with hw fp64 support
+Support 1 binary format for GL_ARB_get_program_binary on i965
  
  
  Bug fixes
  
  

  TBD
  
  
  Changes
  


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 22/23] i965: Add ARB_get_program_binary support using nir_serialization

2017-12-06 Thread Tapani Pälli


LGTM

Reviewed-by: Tapani Pälli 

On 11/29/2017 03:24 AM, Timothy Arceri wrote:

From: Jordan Justen 

This resolves an apparent game bug described in 85564. The game
doesn't properly handle ARB_get_program_binary with 0 supported
formats.

V2 (Timothy Arceri):
  - less driver code as more has been moved into the common helpers.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85564
Signed-off-by: Timothy Arceri 
Signed-off-by: Jordan Justen  (v1)
---
  src/mesa/drivers/dri/i965/Makefile.sources |  1 +
  src/mesa/drivers/dri/i965/brw_context.c| 10 
  src/mesa/drivers/dri/i965/brw_context.h| 16 ++
  src/mesa/drivers/dri/i965/brw_program.h|  7 ---
  src/mesa/drivers/dri/i965/brw_program_binary.c | 72 ++
  src/mesa/drivers/dri/i965/meson.build  |  1 +
  6 files changed, 100 insertions(+), 7 deletions(-)
  create mode 100644 src/mesa/drivers/dri/i965/brw_program_binary.c

diff --git a/src/mesa/drivers/dri/i965/Makefile.sources 
b/src/mesa/drivers/dri/i965/Makefile.sources
index 2980cdb3c54..3fba8dc17ef 100644
--- a/src/mesa/drivers/dri/i965/Makefile.sources
+++ b/src/mesa/drivers/dri/i965/Makefile.sources
@@ -30,20 +30,21 @@ i965_FILES = \
brw_meta_util.h \
brw_misc_state.c \
brw_multisample_state.h \
brw_nir_uniforms.cpp \
brw_object_purgeable.c \
brw_pipe_control.c \
brw_performance_query.h \
brw_performance_query.c \
brw_program.c \
brw_program.h \
+   brw_program_binary.c \
brw_program_cache.c \
brw_primitive_restart.c \
brw_queryobj.c \
brw_reset.c \
brw_sf.c \
brw_state.h \
brw_state_upload.c \
brw_structs.h \
brw_surface_formats.c \
brw_sync.c \
diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
b/src/mesa/drivers/dri/i965/brw_context.c
index dd55b436698..5cd759f7356 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -322,20 +322,27 @@ brw_init_driver_functions(struct brw_context *brw,
functions->BeginTransformFeedback = brw_begin_transform_feedback;
functions->EndTransformFeedback = brw_end_transform_feedback;
functions->PauseTransformFeedback = brw_pause_transform_feedback;
functions->ResumeTransformFeedback = brw_resume_transform_feedback;
functions->GetTransformFeedbackVertexCount =
   brw_get_transform_feedback_vertex_count;
 }
  
 if (devinfo->gen >= 6)

functions->GetSamplePosition = gen6_get_sample_position;
+
+   /* GL_ARB_get_program_binary */
+   brw_program_binary_init(brw->screen->deviceID);
+   functions->GetProgramBinaryDriverSHA1 = brw_get_program_binary_driver_sha1;
+   functions->ProgramBinarySerializeDriverBlob = brw_program_serialize_nir;
+   functions->ProgramBinaryDeserializeDriverBlob =
+  brw_deserialize_program_binary;
  }
  
  static void

  brw_initialize_context_constants(struct brw_context *brw)
  {
 const struct gen_device_info *devinfo = >screen->devinfo;
 struct gl_context *ctx = >ctx;
 const struct brw_compiler *compiler = brw->screen->compiler;
  
 const bool stage_exists[MESA_SHADER_STAGES] = {

@@ -689,20 +696,23 @@ brw_initialize_context_constants(struct brw_context *brw)
  * pull an entire cache line at a time for constant offset loads both of
  * which support almost any alignment.
  *
  * [1] glsl-1.40/uniform_buffer/vs-float-array-variable-index.shader_test
  */
 if (devinfo->gen >= 7)
ctx->Const.UseSTD430AsDefaultPacking = true;
  
 if (!(ctx->Const.ContextFlags & GL_CONTEXT_FLAG_DEBUG_BIT))

ctx->Const.AllowMappedBuffersDuringExecution = true;
+
+   /* GL_ARB_get_program_binary */
+   ctx->Const.NumProgramBinaryFormats = 1;
  }
  
  static void

  brw_initialize_cs_context_constants(struct brw_context *brw)
  {
 struct gl_context *ctx = >ctx;
 const struct intel_screen *screen = brw->screen;
 struct gen_device_info *devinfo = >screen->devinfo;
  
 /* FINISHME: Do this for all platforms that the kernel supports */

diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index b3d7c6baf8a..da88bea739e 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -1539,20 +1539,36 @@ gen7_upload_urb(struct brw_context *brw, unsigned 
vs_size,
  /* brw_reset.c */
  extern GLenum
  brw_get_graphics_reset_status(struct gl_context *ctx);
  void
  brw_check_for_reset(struct brw_context *brw);
  
  /* brw_compute.c */

  extern void
  brw_init_compute_functions(struct dd_function_table *functions);
  
+/* brw_program_binary.c */

+extern void
+brw_program_binary_init(unsigned device_id);
+extern void
+brw_get_program_binary_driver_sha1(struct gl_context *ctx,

Re: [Mesa-dev] [PATCH 1/6] radeonsi: allow DMABUF exports for local buffers

2017-12-06 Thread Marek Olšák

On Wed, Dec 6, 2017 at 5:08 PM, Emil Velikov  wrote:
> On 6 December 2017 at 15:52, Nicolai Hähnle  wrote:
>> On 06.12.2017 13:43, Marek Olšák wrote:
>>>
>>>
>>>
>>> On Dec 6, 2017 12:34 PM, "Nicolai Hähnle" >> > wrote:
>>>
>>> On 05.12.2017 20:05, Marek Olšák wrote:
>>>
>>> From: Marek Olšák >> >
>>>
>>> Cc: 17.3 >> >
>>>
>>> ---
>>>src/gallium/drivers/radeon/r600_texture.c | 6 +-
>>>1 file changed, 5 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/src/gallium/drivers/radeon/r600_texture.c
>>> b/src/gallium/drivers/radeon/r600_texture.c
>>> index 2aa47b5..07f7c33 100644
>>> --- a/src/gallium/drivers/radeon/r600_texture.c
>>> +++ b/src/gallium/drivers/radeon/r600_texture.c
>>> @@ -739,22 +739,26 @@ static boolean
>>> r600_texture_get_handle(struct pipe_screen* screen,
>>>  stride = rtex->surface.u.gfx9.surf_pitch
>>> *
>>>   rtex->surface.bpe;
>>>  slice_size =
>>> rtex->surface.u.gfx9.surf_slice_size;
>>>  } else {
>>>  offset =
>>> rtex->surface.u.legacy.level[0].offset;
>>>  stride =
>>> rtex->surface.u.legacy.level[0].nblk_x *
>>>   rtex->surface.bpe;
>>>  slice_size =
>>> (uint64_t)rtex->surface.u.legacy.level[0].slice_size_dw * 4;
>>>  }
>>>  } else {
>>> +   /* Buffer exports are for the OpenCL interop. */
>>>  /* Move a suballocated buffer into a
>>> non-suballocated allocation. */
>>> -   if (sscreen->ws->buffer_is_suballocated(res->buf))
>>> {
>>> +   if (sscreen->ws->buffer_is_suballocated(res->buf)
>>> ||
>>> +   /* A DMABUF export always fails if the BO is
>>> local. */
>>> +   (rtex->resource.flags &
>>> RADEON_FLAG_NO_INTERPROCESS_SHARING &&
>>> +whandle->type != DRM_API_HANDLE_TYPE_KMS)) {
>>>
>>>
>>> I still don't think this is right. Or at least, it's bound to blow
>>> up in our faces at some point. Though I think we may have talked
>>> past each other, apologies that I haven't made myself clear.
>>>
>>> The issues I have in mind are scenarios like this:
>>>
>>> 1. Buffer allocated in OpenGL.
>>> 2. Buffer exported as KMS handle for importing to OpenCL in the same
>>> process.
>>> 3. Buffer exported as an FD <-- at this point, the OpenGL and OpenCL
>>> buffers go out of sync because OpenGL re-allocates the buffer but
>>> OpenCL isn't informed.
>>>
>>> Or:
>>>
>>> 1. Buffer allocated in OpenGL.
>>> 2. Buffer exported as KMS handle for importing to OpenCL in the same
>>> process.
>>> 3. Buffer attempted to be exported as an FD from OpenCL <-- fails
>>> because the buffer is local (has NO_INTERPROCESS_SHARING), and
>>> people will be utterly clueless as to what's going on.
>>>
>>> FWIW, I think the patch is good if you drop the whandle->type check
>>> so that we re-allocate unconditionally.
>>>
>>>
>>> I can remove the check, but buffers are only exported as DMABUF. This
>>> patch isn't just random - it does fix OpenCL interop.
>>
>>
>> Well, I doubt it's critical for performance, OpenCL doesn't know about the
>> NO_INTERPROCESS_SHARING status anyway. So I'd prefer to remove the check, in
>> case we change the exporting at some point.
>>
> Humble request: please include the highlights of the discussion in the
> commit message ;-)

Sorry, I just pushed the patches.

Marek
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] swr/scons: Fix another intermittent build failure

2017-12-06 Thread Cherniak, Bruce

Reviewed-by: Bruce Cherniak  

> On Dec 5, 2017, at 10:50 AM, George Kyriazis  
> wrote:
> 
> gen_BackendPixelRate*.cpp depends on gen_ar_eventhandler.hpp.
> Fix missing dependency.
> ---
> src/gallium/drivers/swr/SConscript | 1 +
> 1 file changed, 1 insertion(+)
> 
> diff --git a/src/gallium/drivers/swr/SConscript 
> b/src/gallium/drivers/swr/SConscript
> index 9204ecb..eca4830 100644
> --- a/src/gallium/drivers/swr/SConscript
> +++ b/src/gallium/drivers/swr/SConscript
> @@ -146,6 +146,7 @@ env.CodeGenerate(
> Depends(backendPixelRateFiles,
> ['rasterizer/core/backends/gen_BackendPixelRate.hpp',
>  'rasterizer/archrast/gen_ar_event.hpp',
> + 'rasterizer/archrast/gen_ar_eventhandler.hpp',
>  'rasterizer/codegen/gen_knobs.h']
> )
> 
> -- 
> 2.7.4
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 1/2] radv: track different status of a command buffer

2017-12-06 Thread Samuel Pitoiset

RADV_CMD_BUFFER_STATUS_INVALID is not used for now, but I think
it makes sense to declare it. Could be used later with better
command buffer error handling.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_cmd_buffer.c | 6 ++
 src/amd/vulkan/radv_device.c | 2 ++
 src/amd/vulkan/radv_private.h| 9 +
 3 files changed, 17 insertions(+)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index fe4f989dd1..63a5eebab9 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -292,6 +292,8 @@ radv_reset_cmd_buffer(struct radv_cmd_buffer *cmd_buffer)
cmd_buffer->gfx9_fence_bo = cmd_buffer->upload.upload_bo;
}
 
+   cmd_buffer->status = RADV_CMD_BUFFER_STATUS_INITIAL;
+
return cmd_buffer->record_result;
 }
 
@@ -2271,6 +2273,8 @@ VkResult radv_BeginCommandBuffer(
if (unlikely(cmd_buffer->device->trace_bo))
radv_cmd_buffer_trace_emit(cmd_buffer);
 
+   cmd_buffer->status = RADV_CMD_BUFFER_STATUS_RECORDING;
+
return result;
 }
 
@@ -2539,6 +2543,8 @@ VkResult radv_EndCommandBuffer(
if (!cmd_buffer->device->ws->cs_finalize(cmd_buffer->cs))
return vk_error(VK_ERROR_OUT_OF_DEVICE_MEMORY);
 
+   cmd_buffer->status = RADV_CMD_BUFFER_STATUS_EXECUTABLE;
+
return cmd_buffer->record_result;
 }
 
diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index 1b7cd35593..1ff01c24b3 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -2008,6 +2008,8 @@ VkResult radv_QueueSubmit(
cs_array[j] = cmd_buffer->cs;
if ((cmd_buffer->usage_flags & 
VK_COMMAND_BUFFER_USAGE_SIMULTANEOUS_USE_BIT))
can_patch = false;
+
+   cmd_buffer->status = RADV_CMD_BUFFER_STATUS_PENDING;
}
 
for (uint32_t j = 0; j < pSubmits[i].commandBufferCount; j += 
advance) {
diff --git a/src/amd/vulkan/radv_private.h b/src/amd/vulkan/radv_private.h
index 67c2011107..1e7b28a1d5 100644
--- a/src/amd/vulkan/radv_private.h
+++ b/src/amd/vulkan/radv_private.h
@@ -857,6 +857,14 @@ struct radv_cmd_buffer_upload {
struct list_head list;
 };
 
+enum radv_cmd_buffer_status {
+   RADV_CMD_BUFFER_STATUS_INVALID,
+   RADV_CMD_BUFFER_STATUS_INITIAL,
+   RADV_CMD_BUFFER_STATUS_RECORDING,
+   RADV_CMD_BUFFER_STATUS_EXECUTABLE,
+   RADV_CMD_BUFFER_STATUS_PENDING,
+};
+
 struct radv_cmd_buffer {
VK_LOADER_DATA   _loader_data;
 
@@ -867,6 +875,7 @@ struct radv_cmd_buffer {
 
VkCommandBufferUsageFlagsusage_flags;
VkCommandBufferLevel level;
+   enum radv_cmd_buffer_status status;
struct radeon_winsys_cs *cs;
struct radv_cmd_state state;
struct radv_vertex_binding   vertex_bindings[MAX_VBS];
-- 
2.15.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 2/2] radv: only reset command buffers that are not in the initial state

2017-12-06 Thread Samuel Pitoiset

dota2 always calls vkResetCommandBuffer() before
vkBeginCommandBuffer() which is quite useless.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_cmd_buffer.c | 13 +
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index 63a5eebab9..8821fcacef 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -2230,11 +2230,16 @@ VkResult radv_BeginCommandBuffer(
const VkCommandBufferBeginInfo *pBeginInfo)
 {
RADV_FROM_HANDLE(radv_cmd_buffer, cmd_buffer, commandBuffer);
-   VkResult result;
+   VkResult result = VK_SUCCESS;
 
-   result = radv_reset_cmd_buffer(cmd_buffer);
-   if (result != VK_SUCCESS)
-   return result;
+   if (cmd_buffer->status != RADV_CMD_BUFFER_STATUS_INITIAL) {
+   /* If the command buffer has already been resetted with
+* vkResetCommandBuffer, no need to do it again.
+*/
+   result = radv_reset_cmd_buffer(cmd_buffer);
+   if (result != VK_SUCCESS)
+   return result;
+   }
 
memset(_buffer->state, 0, sizeof(cmd_buffer->state));
cmd_buffer->state.last_primitive_reset_en = -1;
-- 
2.15.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 02/29] anv/blorp: Rework image clear/resolve helpers

2017-12-06 Thread Nanley Chery

On Wed, Dec 06, 2017 at 09:40:25AM -0800, Nanley Chery wrote:
> On Tue, Dec 05, 2017 at 03:48:45PM -0800, Nanley Chery wrote:
> > On Mon, Nov 27, 2017 at 07:05:52PM -0800, Jason Ekstrand wrote:
> > > This replaces image_fast_clear and ccs_resolve with two new helpers that
> > > simply perform an isl_aux_op whatever that may be on CCS or MCS.  This
> > > is a bit cleaner as it separates performing the aux operation from which
> > > blorp helper we have to call to do it.
> > > ---
> > >  src/intel/vulkan/anv_blorp.c   | 218 
> > > ++---
> > >  src/intel/vulkan/anv_private.h |  23 ++--
> > >  src/intel/vulkan/genX_cmd_buffer.c |  28 +++--
> > >  3 files changed, 165 insertions(+), 104 deletions(-)
> > > 
> > > diff --git a/src/intel/vulkan/anv_blorp.c b/src/intel/vulkan/anv_blorp.c
> > > index e244468..7c8a673 100644
> > > --- a/src/intel/vulkan/anv_blorp.c
> > > +++ b/src/intel/vulkan/anv_blorp.c
> > > @@ -1439,75 +1439,6 @@ fast_clear_aux_usage(const struct anv_image *image,
> > >  }
> > >  
> > >  void
> > > -anv_image_fast_clear(struct anv_cmd_buffer *cmd_buffer,
> > > - const struct anv_image *image,
> > > - VkImageAspectFlagBits aspect,
> > > - const uint32_t base_level, const uint32_t 
> > > level_count,
> > > - const uint32_t base_layer, uint32_t layer_count)
> > > -{
> > > -   assert(image->type == VK_IMAGE_TYPE_3D || image->extent.depth == 1);
> > > -
> > > -   if (image->type == VK_IMAGE_TYPE_3D) {
> > > -  assert(base_layer == 0);
> > > -  assert(layer_count == anv_minify(image->extent.depth, base_level));
> > > -   }
> > > -
> > > -   struct blorp_batch batch;
> > > -   blorp_batch_init(_buffer->device->blorp, , cmd_buffer, 0);
> > > -
> > > -   struct blorp_surf surf;
> > > -   get_blorp_surf_for_anv_image(cmd_buffer->device, image, aspect,
> > > -fast_clear_aux_usage(image, aspect),
> > > -);
> > > -
> > > -   /* From the Sky Lake PRM Vol. 7, "Render Target Fast Clear":
> > > -*
> > > -*"After Render target fast clear, pipe-control with color cache
> > > -*write-flush must be issued before sending any DRAW commands on
> > > -*that render target."
> > > -*
> > > -* This comment is a bit cryptic and doesn't really tell you what's 
> > > going
> > > -* or what's really needed.  It appears that fast clear ops are not
> > > -* properly synchronized with other drawing.  This means that we 
> > > cannot
> > > -* have a fast clear operation in the pipe at the same time as other
> > > -* regular drawing operations.  We need to use a PIPE_CONTROL to 
> > > ensure
> > > -* that the contents of the previous draw hit the render target 
> > > before we
> > > -* resolve and then use a second PIPE_CONTROL after the resolve to 
> > > ensure
> > > -* that it is completed before any additional drawing occurs.
> > > -*/
> > > -   cmd_buffer->state.pending_pipe_bits |=
> > > -  ANV_PIPE_RENDER_TARGET_CACHE_FLUSH_BIT | ANV_PIPE_CS_STALL_BIT;
> > > -
> > > -   uint32_t plane = anv_image_aspect_to_plane(image->aspects, aspect);
> > > -   uint32_t width_div = 
> > > image->format->planes[plane].denominator_scales[0];
> > > -   uint32_t height_div = 
> > > image->format->planes[plane].denominator_scales[1];
> > > -
> > > -   for (uint32_t l = 0; l < level_count; l++) {
> > > -  const uint32_t level = base_level + l;
> > > -
> > > -  const VkExtent3D extent = {
> > > - .width = anv_minify(image->extent.width, level),
> > > - .height = anv_minify(image->extent.height, level),
> > > - .depth = anv_minify(image->extent.depth, level),
> > > -  };
> > > -
> > > -  if (image->type == VK_IMAGE_TYPE_3D)
> > > - layer_count = extent.depth;
> > > -
> > > -  assert(level < anv_image_aux_levels(image, aspect));
> > > -  assert(base_layer + layer_count <= anv_image_aux_layers(image, 
> > > aspect, level));
> > > -  blorp_fast_clear(, , surf.surf->format,
> > > -   level, base_layer, layer_count,
> > > -   0, 0,
> > > -   extent.width / width_div,
> > > -   extent.height / height_div);
> > > -   }
> > > -
> > > -   cmd_buffer->state.pending_pipe_bits |=
> > > -  ANV_PIPE_RENDER_TARGET_CACHE_FLUSH_BIT | ANV_PIPE_CS_STALL_BIT;
> > > -}
> > > -
> > > -void
> > >  anv_cmd_buffer_resolve_subpass(struct anv_cmd_buffer *cmd_buffer)
> > >  {
> > > struct anv_framebuffer *fb = cmd_buffer->state.framebuffer;
> > > @@ -1681,36 +1612,153 @@ anv_gen8_hiz_op_resolve(struct anv_cmd_buffer 
> > > *cmd_buffer,
> > >  }
> > >  
> > >  void
> > > -anv_ccs_resolve(struct anv_cmd_buffer * const cmd_buffer,
> > > -const struct anv_image * const image,
> > > -VkImageAspectFlagBits aspect,
> > > -const

Re: [Mesa-dev] [PATCH 2/3] glx: Lift sending the MakeCurrent request to top-level code (v2)

2017-12-06 Thread Adam Jackson

On Wed, 2017-12-06 at 15:14 +, Emil Velikov wrote:

> > -  if (gc->vtable->bind(gc, oldGC, draw, read) != Success) {
> > +  if (gc->vtable->bind(gc, gc, draw, read) != Success) {
> >   __glXSetCurrentContextNull();
> 
> This line seems inconsistent/wrong.
> 
> The glXMakeCurrent manpage says "If False is returned, the previously
> current rendering context and drawable (if any) remain unchanged."

Ugh. That's not really possible to get perfectly right, there are
unrecoverable states (think MakeCurrent away from a context that's been
deleted, or whose current drawable is a destroyed window). Still, I
suppose we should try at least a little.

- ajax
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/3] glx: Move vertex array protocol state into the indirect backend (v2)

2017-12-06 Thread Adam Jackson

On Wed, 2017-12-06 at 14:50 +, Emil Velikov wrote:

> > +   * have setup the context, as it needs to query server attributes.
> > +   *
> > +   * At the point this is called gc->currentDpy is not initialized
> > +   * nor is the thread's current context actually set. Hence the
> > +   * cleverness before the GetString calls.
> > +   */
> > +  __GLXattribute *state = gc->client_state_private;
> > +  if (state && state->array_state == NULL) {
> > + gc->currentDpy = gc->psc->dpy;
> > + __glXSetCurrentContext(gc);
> 
> Unless I'm misreading the SendMakeCurrentRequest rework (patch 2/3)
> __glXSetCurrentContext() will be called, hence these two lines +
> respective comment could be omitted.

Pretty sure you're misreading something. This is the ->bind hook, if it
succeeds then MakeContextCurrent will call __glXSetCurrentContext.
Since we have not yet returned, we have not yet succeeded, and
__glXSetCurrentContext has not yet been called, so we must do it
ourselves.

- ajax
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/2] radv: fix a case statement in GetMemoryFdPropertiesKHR

2017-12-06 Thread Fredrik Höglund

On Wednesday 06 December 2017, Emil Velikov wrote:
> On 5 December 2017 at 20:51, Fredrik Höglund  wrote:
> > The handle type in the case statement is supposed to be VK_EXTERNAL_-
> > MEMORY_HANDLE_TYPE_DMA_BUF_BIT_EXT.
> >
> > Signed-off-by: Fredrik Höglund 
> For the future please include a fixes tag if the commit is known.
> 
> Here
> Fixes: 546e747867c ("radv: Implement VK_EXT_external_memory_dma_buf")
> 
> And for 2/2
> Fixes: ab18e8e59b6 ("anv: Implement VK_EXT_external_memory_dma_buf")

I thought those commits were only in the master branch, but I've pushed
the patches now with those tags.

Thanks,
Fredrik

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 6/6] radeonsi: make const and stream uploaders allocate read-only memory

2017-12-06 Thread Dieter Nützel


For the series:

Tested-by: Dieter Nützel 

Dieter

Am 05.12.2017 20:05, schrieb Marek Olšák:

From: Marek Olšák 

and anything that clones these uploaders, like u_threaded_context.
---
 src/gallium/drivers/radeon/r600_pipe_common.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c
b/src/gallium/drivers/radeon/r600_pipe_common.c
index 9090e65..9e45a9f 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.c
+++ b/src/gallium/drivers/radeon/r600_pipe_common.c
@@ -438,26 +438,29 @@ bool si_common_context_init(struct
r600_common_context *rctx,
return false;
}

rctx->allocator_zeroed_memory =
u_suballocator_create(>b, sscreen->info.gart_page_size,
  0, PIPE_USAGE_DEFAULT, 0, true);
if (!rctx->allocator_zeroed_memory)
return false;

rctx->b.stream_uploader = u_upload_create(>b, 1024 * 1024,
- 0, PIPE_USAGE_STREAM, 0);
+ 0, PIPE_USAGE_STREAM,
+ R600_RESOURCE_FLAG_READ_ONLY);
if (!rctx->b.stream_uploader)
return false;

rctx->b.const_uploader = u_upload_create(>b, 128 * 1024,
-0, PIPE_USAGE_DEFAULT, 0);
+0, PIPE_USAGE_DEFAULT,
+
sscreen->cpdma_prefetch_writes_memory ?
+   0 : 
R600_RESOURCE_FLAG_READ_ONLY);
if (!rctx->b.const_uploader)
return false;

rctx->cached_gtt_allocator = u_upload_create(>b, 16 * 1024,
 0, PIPE_USAGE_STAGING, 0);
if (!rctx->cached_gtt_allocator)
return false;

rctx->ctx = rctx->ws->ctx_create(rctx->ws);
if (!rctx->ctx)

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2 09/25] mesa: move nir_spirv_supported_capabilities definition

2017-12-06 Thread Timothy Arceri


Can we get away with forward declaring this?

There is a section at the top of mtypes you can add it to:

 * \name Some forward type declarations


On 01/12/17 04:28, Eduardo Lima Mitev wrote:

From: Alejandro Piñeiro 

Due gl_spirv we will use it on more places, specifically on
gl_constants, where we would like to use it without a pointer.
---
  src/compiler/spirv/nir_spirv.h | 15 ++-
  src/mesa/main/mtypes.h | 11 +++
  2 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/src/compiler/spirv/nir_spirv.h b/src/compiler/spirv/nir_spirv.h
index 0204e81d091..a14b55cdd4b 100644
--- a/src/compiler/spirv/nir_spirv.h
+++ b/src/compiler/spirv/nir_spirv.h
@@ -28,7 +28,8 @@
  #ifndef _NIR_SPIRV_H_
  #define _NIR_SPIRV_H_
  
-#include "nir/nir.h"

+#include "compiler/nir/nir.h"
+#include "main/mtypes.h"
  
  #ifdef __cplusplus

  extern "C" {
@@ -42,18 +43,6 @@ struct nir_spirv_specialization {
 };
  };
  
-struct nir_spirv_supported_capabilities {

-   bool float64;
-   bool image_ms_array;
-   bool tessellation;
-   bool draw_parameters;
-   bool image_read_without_format;
-   bool image_write_without_format;
-   bool int64;
-   bool multiview;
-   bool variable_pointers;
-};
-
  nir_function *spirv_to_nir(const uint32_t *words, size_t word_count,
 struct nir_spirv_specialization *specializations,
 unsigned num_specializations,
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 50a47e0a65d..c8177c9a99a 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -3583,6 +3583,17 @@ struct gl_program_constants
 GLuint MaxShaderStorageBlocks;
  };
  
+struct nir_spirv_supported_capabilities {

+   bool float64;
+   bool image_ms_array;
+   bool tessellation;
+   bool draw_parameters;
+   bool image_read_without_format;
+   bool image_write_without_format;
+   bool int64;
+   bool multiview;
+   bool variable_pointers;
+};
  
  /**

   * Constants which may be overridden by device driver during context creation


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2 09/25] mesa: move nir_spirv_supported_capabilities definition

2017-12-06 Thread Alejandro Piñeiro

On 06/12/17 10:23, Timothy Arceri wrote:
> Can we get away with forward declaring this?
>
> There is a section at the top of mtypes you can add it to:
>
>  * \name Some forward type declarations

Yes, I realized that, and tried, but I still got several build errors.
So that would not be enough.

In any case, after all the recent changes on spirv/spirv_to_nir
codebase, this commit and the following one are obsolete. We are
preparing a v3 series, but meanwhile we send this path alone to mesa-dev:
https://lists.freedesktop.org/archives/mesa-dev/2017-December/179438.html
>
>
> On 01/12/17 04:28, Eduardo Lima Mitev wrote:
>> From: Alejandro Piñeiro 
>>
>> Due gl_spirv we will use it on more places, specifically on
>> gl_constants, where we would like to use it without a pointer.
>> ---
>>   src/compiler/spirv/nir_spirv.h | 15 ++-
>>   src/mesa/main/mtypes.h | 11 +++
>>   2 files changed, 13 insertions(+), 13 deletions(-)
>>
>> diff --git a/src/compiler/spirv/nir_spirv.h
>> b/src/compiler/spirv/nir_spirv.h
>> index 0204e81d091..a14b55cdd4b 100644
>> --- a/src/compiler/spirv/nir_spirv.h
>> +++ b/src/compiler/spirv/nir_spirv.h
>> @@ -28,7 +28,8 @@
>>   #ifndef _NIR_SPIRV_H_
>>   #define _NIR_SPIRV_H_
>>   -#include "nir/nir.h"
>> +#include "compiler/nir/nir.h"
>> +#include "main/mtypes.h"
>>     #ifdef __cplusplus
>>   extern "C" {
>> @@ -42,18 +43,6 @@ struct nir_spirv_specialization {
>>  };
>>   };
>>   -struct nir_spirv_supported_capabilities {
>> -   bool float64;
>> -   bool image_ms_array;
>> -   bool tessellation;
>> -   bool draw_parameters;
>> -   bool image_read_without_format;
>> -   bool image_write_without_format;
>> -   bool int64;
>> -   bool multiview;
>> -   bool variable_pointers;
>> -};
>> -
>>   nir_function *spirv_to_nir(const uint32_t *words, size_t word_count,
>>  struct nir_spirv_specialization
>> *specializations,
>>  unsigned num_specializations,
>> diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
>> index 50a47e0a65d..c8177c9a99a 100644
>> --- a/src/mesa/main/mtypes.h
>> +++ b/src/mesa/main/mtypes.h
>> @@ -3583,6 +3583,17 @@ struct gl_program_constants
>>  GLuint MaxShaderStorageBlocks;
>>   };
>>   +struct nir_spirv_supported_capabilities {
>> +   bool float64;
>> +   bool image_ms_array;
>> +   bool tessellation;
>> +   bool draw_parameters;
>> +   bool image_read_without_format;
>> +   bool image_write_without_format;
>> +   bool int64;
>> +   bool multiview;
>> +   bool variable_pointers;
>> +};
>>     /**
>>    * Constants which may be overridden by device driver during
>> context creation
>>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/2] radv: fix a case statement in GetMemoryFdPropertiesKHR

2017-12-06 Thread Samuel Pitoiset


Reviewed-by: Samuel Pitoiset 

On 12/05/2017 09:51 PM, Fredrik Höglund wrote:

The handle type in the case statement is supposed to be VK_EXTERNAL_-
MEMORY_HANDLE_TYPE_DMA_BUF_BIT_EXT.

Signed-off-by: Fredrik Höglund 
---
  src/amd/vulkan/radv_device.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index 1b7cd355938..2538472bea6 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -3557,7 +3557,7 @@ VkResult radv_GetMemoryFdPropertiesKHR(VkDevice _device,
   VkMemoryFdPropertiesKHR 
*pMemoryFdProperties)
  {
 switch (handleType) {
-   case VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_FD_BIT_KHR:
+   case VK_EXTERNAL_MEMORY_HANDLE_TYPE_DMA_BUF_BIT_EXT:
pMemoryFdProperties->memoryTypeBits = (1 << RADV_MEM_TYPE_COUNT) - 1;
return VK_SUCCESS;
  


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2 09/25] mesa: move nir_spirv_supported_capabilities definition

2017-12-06 Thread Timothy Arceri




On 06/12/17 20:33, Alejandro Piñeiro wrote:

On 06/12/17 10:23, Timothy Arceri wrote:

Can we get away with forward declaring this?

There is a section at the top of mtypes you can add it to:

  * \name Some forward type declarations


Yes, I realized that, and tried, but I still got several build errors.
So that would not be enough.


Doesn't that just mean you need to include compiler/spirv/nir_spirv.h in 
more places?




In any case, after all the recent changes on spirv/spirv_to_nir
codebase, this commit and the following one are obsolete. We are
preparing a v3 series, but meanwhile we send this path alone to mesa-dev:
https://lists.freedesktop.org/archives/mesa-dev/2017-December/179438.html


I'm confused. If it's obsolete why are you trying to get it committed?





On 01/12/17 04:28, Eduardo Lima Mitev wrote:

From: Alejandro Piñeiro 

Due gl_spirv we will use it on more places, specifically on
gl_constants, where we would like to use it without a pointer.
---
   src/compiler/spirv/nir_spirv.h | 15 ++-
   src/mesa/main/mtypes.h | 11 +++
   2 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/src/compiler/spirv/nir_spirv.h
b/src/compiler/spirv/nir_spirv.h
index 0204e81d091..a14b55cdd4b 100644
--- a/src/compiler/spirv/nir_spirv.h
+++ b/src/compiler/spirv/nir_spirv.h
@@ -28,7 +28,8 @@
   #ifndef _NIR_SPIRV_H_
   #define _NIR_SPIRV_H_
   -#include "nir/nir.h"
+#include "compiler/nir/nir.h"
+#include "main/mtypes.h"
     #ifdef __cplusplus
   extern "C" {
@@ -42,18 +43,6 @@ struct nir_spirv_specialization {
  };
   };
   -struct nir_spirv_supported_capabilities {
-   bool float64;
-   bool image_ms_array;
-   bool tessellation;
-   bool draw_parameters;
-   bool image_read_without_format;
-   bool image_write_without_format;
-   bool int64;
-   bool multiview;
-   bool variable_pointers;
-};
-
   nir_function *spirv_to_nir(const uint32_t *words, size_t word_count,
  struct nir_spirv_specialization
*specializations,
  unsigned num_specializations,
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 50a47e0a65d..c8177c9a99a 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -3583,6 +3583,17 @@ struct gl_program_constants
  GLuint MaxShaderStorageBlocks;
   };
   +struct nir_spirv_supported_capabilities {
+   bool float64;
+   bool image_ms_array;
+   bool tessellation;
+   bool draw_parameters;
+   bool image_read_without_format;
+   bool image_write_without_format;
+   bool int64;
+   bool multiview;
+   bool variable_pointers;
+};
     /**
    * Constants which may be overridden by device driver during
context creation


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] egl/x11: Remove unneeded free() on always null string

2017-12-06 Thread Vadym Shovkoplias

Hi Eric,

I used smatch (http://smatch.sourceforge.net/). It is mainly used for Linux
kernel.

On Mon, Dec 4, 2017 at 6:52 PM, Eric Engestrom 
wrote:

> On Monday, 2017-12-04 12:48:55 +0200, Vadim Shovkoplias wrote:
> > Hi Eric,
>
> Hey, sorry, I forgot to hit "send" on the reply I wrote on friday :]
>
> >
> > Mostly by a static analysis tool. It found at least 7 issues with useless
> > free() calls and other problems that probably should be fixed.
>
> What tool? It would be interesting for others to know what tools exist,
> especially when they find issues other tools didn't :)
>
> > Suggest please should I create one cumulative commit for this or it
> should
> > be a separate commits ?
>
> Same kind of issues in the same module should be grouped, whereas
> different kind of issues or different modules should be separate.
>
> Don't worry too much about it though, if people ask you to merge or
> split commits, it's not that complicated to do for a v2 :)
>
> >
> > 2017-12-01 17:41 GMT+02:00 Eric Engestrom :
> >
> > > On Friday, 2017-12-01 17:08:53 +0200, vadim.shovkopl...@gmail.com
> wrote:
> > > > From: Vadym Shovkoplias 
> > > >
> > > > In this condition dri2_dpy->driver_name string always equals
> > > > NULL, so call to free() is useless
> > > >
> > > > Signed-off-by: Vadym Shovkoplias 
> > >
> > > Reviewed and pushed :)
> > >
> > > Are you finding all of these by inspection, or are you using a tool?
> > >
> > > > ---
> > > >  src/egl/drivers/dri2/platform_x11.c | 1 -
> > > >  1 file changed, 1 deletion(-)
> > > >
> > > > diff --git a/src/egl/drivers/dri2/platform_x11.c
> b/src/egl/drivers/dri2/
> > > platform_x11.c
> > > > index c49cb1f..8ede590b 100644
> > > > --- a/src/egl/drivers/dri2/platform_x11.c
> > > > +++ b/src/egl/drivers/dri2/platform_x11.c
> > > > @@ -704,7 +704,6 @@ dri2_x11_connect(struct dri2_egl_display
> *dri2_dpy)
> > > >
> > > > if (dri2_dpy->driver_name == NULL) {
> > > >close(dri2_dpy->fd);
> > > > -  free(dri2_dpy->driver_name);
> > > >free(connect);
> > > >return EGL_FALSE;
> > > > }
> > > > --
> > > > 2.7.4
> > > >
> > >
>



-- 

Vadym Shovkoplias | Software engineer
GlobalLogic
P +x.xxx.xxx.  M +3.8050.931.7304  S vadym.shovkoplias
www.globallogic.com

http://www.globallogic.com/email_disclaimer.txt
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/1] nir: Use a freelist in nir_opt_dce to avoid spamming ralloc

2017-12-06 Thread Dieter Nützel


Tested-by: Dieter Nützel 

Dieter

Am 02.12.2017 15:49, schrieb Thomas Helland:

Also, allocate worklist_elem in groups of 20, to reduce the burden of
allocation. Do not use rzalloc, as there is no need. This lets us drop
the number of calls to ralloc from aproximately 10% of all calls to
ralloc(130 000 calls), down to a mere 2000 calls to ralloc_array_size.
This cuts the runtime of shader-db by 1%, while at the same time
reducing the number of stalled cycles, executed cycles, and executed
instructions by about 1 % as reported by perf. I did a five-run
benchmark pre and post and got a statistical variance less than 0.1% 
pre
and post. This was with i965's ir validation polluting the benchmark, 
so

the numbers are even better in release builds.

Performance change as found with perf-diff:
4.74% -0.23%  libc-2.26.so[.] _int_malloc
1.88% -0.21%  libc-2.26.so[.] malloc
2.27% +0.16%  libmesa_dri_drivers.so  [.] match_value.part.7
2.95% -0.12%  libc-2.26.so[.] _int_free
  +0.11%  libmesa_dri_drivers.so  [.] worklist_push
1.22% -0.08%  libc-2.26.so[.] malloc_consolidate
0.16% -0.06%  libmesa_dri_drivers.so  [.] mark_live_cb
1.21% +0.06%  libmesa_dri_drivers.so  [.] match_expression.part.6
0.75% -0.05%  libc-2.26.so[.] cfree@GLIBC_2.2.5
0.50% -0.05%  libmesa_dri_drivers.so  [.] ralloc_size
0.57% +0.04%  libmesa_dri_drivers.so  [.] nir_replace_instr
1.29% -0.04%  libmesa_dri_drivers.so  [.] unsafe_free
---
 src/compiler/nir/nir_opt_dce.c | 47 
--

 1 file changed, 32 insertions(+), 15 deletions(-)

diff --git a/src/compiler/nir/nir_opt_dce.c 
b/src/compiler/nir/nir_opt_dce.c

index 5cefba3a72..f9285fe4ac 100644
--- a/src/compiler/nir/nir_opt_dce.c
+++ b/src/compiler/nir/nir_opt_dce.c
@@ -29,32 +29,46 @@

 /* SSA-based mark-and-sweep dead code elimination */

+typedef struct {
+   struct exec_list *worklist;
+   struct exec_list *free_nodes;
+} worklist;
+
 typedef struct {
struct exec_node node;
nir_instr *instr;
 } worklist_elem;

 static void
-worklist_push(struct exec_list *worklist, nir_instr *instr)
+worklist_push(worklist *worklist, nir_instr *instr)
 {
-   worklist_elem *elem = ralloc(worklist, worklist_elem);
+   if (exec_list_is_empty(worklist->free_nodes)) {
+  worklist_elem *elements = ralloc_array(worklist, worklist_elem, 
20);

+  for (int i = 0; i < 20; i++)
+ exec_list_push_tail(worklist->free_nodes, [i].node);
+   }
+
+   struct exec_node *node = exec_list_pop_head(worklist->free_nodes);
+   worklist_elem *elem = exec_node_data(worklist_elem, node, node);
elem->instr = instr;
instr->pass_flags = 1;
-   exec_list_push_tail(worklist, >node);
+   exec_list_push_tail(worklist->worklist, >node);
 }

 static nir_instr *
-worklist_pop(struct exec_list *worklist)
+worklist_pop(worklist *worklist)
 {
-   struct exec_node *node = exec_list_pop_head(worklist);
+
+   struct exec_node *node = exec_list_pop_head(worklist->worklist);
worklist_elem *elem = exec_node_data(worklist_elem, node, node);
+   exec_list_push_head(worklist->free_nodes, node);
return elem->instr;
 }

 static bool
 mark_live_cb(nir_src *src, void *_state)
 {
-   struct exec_list *worklist = (struct exec_list *) _state;
+   worklist *worklist = _state;

if (src->is_ssa && !src->ssa->parent_instr->pass_flags) {
   worklist_push(worklist, src->ssa->parent_instr);
@@ -64,7 +78,7 @@ mark_live_cb(nir_src *src, void *_state)
 }

 static void
-init_instr(nir_instr *instr, struct exec_list *worklist)
+init_instr(nir_instr *instr, worklist *worklist)
 {
nir_alu_instr *alu_instr;
nir_intrinsic_instr *intrin_instr;
@@ -113,7 +127,7 @@ init_instr(nir_instr *instr, struct exec_list 
*worklist)

 }

 static bool
-init_block(nir_block *block, struct exec_list *worklist)
+init_block(nir_block *block, worklist *worklist)
 {
nir_foreach_instr(instr, block)
   init_instr(instr, worklist);
@@ -131,19 +145,22 @@ init_block(nir_block *block, struct exec_list 
*worklist)

 static bool
 nir_opt_dce_impl(nir_function_impl *impl)
 {
-   struct exec_list *worklist = rzalloc(NULL, struct exec_list);
-   exec_list_make_empty(worklist);
+   worklist *wl = ralloc(NULL, worklist);
+   wl->free_nodes = ralloc(wl, struct exec_list);
+   wl->worklist = ralloc(wl, struct exec_list);
+   exec_list_make_empty(wl->free_nodes);
+   exec_list_make_empty(wl->worklist);

nir_foreach_block(block, impl) {
-  init_block(block, worklist);
+  init_block(block, wl);
}

-   while (!exec_list_is_empty(worklist)) {
-  nir_instr *instr = worklist_pop(worklist);
-  nir_foreach_src(instr, mark_live_cb, worklist);
+   while (!exec_list_is_empty(wl->worklist)) {
+  nir_instr *instr = worklist_pop(wl);
+  nir_foreach_src(instr, mark_live_cb, wl);
}

-   ralloc_free(worklist);
+   ralloc_free(wl);

bool progress = false;

[Mesa-dev] [PATCH] mesa: define nir_spirv_supported_capabilities

2017-12-06 Thread Alejandro Piñeiro

Until now it was part of spirv_to_nir_options. But it will be used on
the implementation of ARB_gl_spirv and ARB_spirv_extensions, and added
to the OpenGL context, as a way to save what SPIR-V capabilities the
current OpenGL implementation supports.
---

We are sending this commit in advance of a v3 of the initial gl_spirv
and spirv_extensions support series. The issue is that lately there
were a lot of activity on the spirv/spir_to_nir code base, and we are
being fixing rebase conflicts constantly. Getting this commit on
master would make things easier.

FWIW, this patch is similar to one that Ian Romanick already granted
Rb, but that was dropped after all the mentioned changes:
https://lists.freedesktop.org/archives/mesa-dev/2017-November/178261.html

 src/compiler/spirv/nir_spirv.h | 16 +++-
 src/mesa/main/mtypes.h | 12 
 2 files changed, 15 insertions(+), 13 deletions(-)

diff --git a/src/compiler/spirv/nir_spirv.h b/src/compiler/spirv/nir_spirv.h
index 43ec19d5a50..113bd710a00 100644
--- a/src/compiler/spirv/nir_spirv.h
+++ b/src/compiler/spirv/nir_spirv.h
@@ -28,7 +28,8 @@
 #ifndef _NIR_SPIRV_H_
 #define _NIR_SPIRV_H_
 
-#include "nir/nir.h"
+#include "compiler/nir/nir.h"
+#include "main/mtypes.h"
 
 #ifdef __cplusplus
 extern "C" {
@@ -57,18 +58,7 @@ struct spirv_to_nir_options {
 */
bool lower_workgroup_access_to_offsets;
 
-   struct {
-  bool float64;
-  bool image_ms_array;
-  bool tessellation;
-  bool draw_parameters;
-  bool image_read_without_format;
-  bool image_write_without_format;
-  bool int64;
-  bool multiview;
-  bool variable_pointers;
-  bool storage_16bit;
-   } caps;
+   struct nir_spirv_supported_capabilities caps;
 
struct {
   void (*func)(void *private_data,
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index b478f6158e2..7da05aa3ee9 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -3579,6 +3579,18 @@ struct gl_program_constants
GLuint MaxShaderStorageBlocks;
 };
 
+struct nir_spirv_supported_capabilities {
+   bool float64;
+   bool image_ms_array;
+   bool tessellation;
+   bool draw_parameters;
+   bool image_read_without_format;
+   bool image_write_without_format;
+   bool int64;
+   bool multiview;
+   bool variable_pointers;
+   bool storage_16bit;
+};
 
 /**
  * Constants which may be overridden by device driver during context creation
-- 
2.11.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2 23/25] mesa/glspirv: Add a _mesa_spirv_to_nir() function

2017-12-06 Thread Timothy Arceri




On 01/12/17 04:28, Eduardo Lima Mitev wrote:

This is basically a wrapper around spirv_to_nir() that includes
arguments setup and post-conversion validation.

v2: Rebase update (SpirVCapabilities not a pointer anymore)
---
  src/mesa/main/glspirv.c | 60 +
  src/mesa/main/glspirv.h |  7 ++
  2 files changed, 67 insertions(+)

diff --git a/src/mesa/main/glspirv.c b/src/mesa/main/glspirv.c
index e5dc8b26ea9..2a20e4b5cc7 100644
--- a/src/mesa/main/glspirv.c
+++ b/src/mesa/main/glspirv.c
@@ -159,6 +159,66 @@ _mesa_spirv_link_shaders(struct gl_context *ctx, struct 
gl_shader_program *prog)
 }
  }
  
+nir_shader *

+_mesa_spirv_to_nir(struct gl_context *ctx,
+   const struct gl_shader_program *prog,
+   gl_shader_stage stage,
+   const nir_shader_compiler_options *options)
+{
+   nir_shader *nir = NULL;
+
+   struct gl_linked_shader *linked_shader = prog->_LinkedShaders[stage];
+   assert (linked_shader);
+
+   struct gl_shader_spirv_data *spirv_data = linked_shader->spirv_data;
+   assert(spirv_data);
+
+   struct gl_spirv_module *spirv_module = spirv_data->SpirVModule;
+   assert (spirv_module != NULL);
+
+   const char *entry_point_name = spirv_data->SpirVEntryPoint;
+   assert(entry_point_name);
+
+   struct nir_spirv_specialization *spec_entries = NULL;
+   spec_entries = calloc(sizeof(*spec_entries),
+ spirv_data->NumSpecializationConstants);


Can we just make this:

   struct nir_spirv_specialization *spec_entries =
  calloc(sizeof(*spec_entries),
 spirv_data->NumSpecializationConstants);



+
+   for (unsigned i = 0; i < spirv_data->NumSpecializationConstants; ++i) {
+  spec_entries[i].id = spirv_data->SpecializationConstantsIndex[i];
+  spec_entries[i].data32 = spirv_data->SpecializationConstantsValue[i];
+  spec_entries[i].defined_on_module = false;
+   }
+
+   nir_function *entry_point =
+  spirv_to_nir((const uint32_t *) _module->Binary[0],
+   spirv_module->Length / 4,
+   spec_entries, spirv_data->NumSpecializationConstants,
+   stage, entry_point_name,
+   >Const.SpirVCapabilities,
+   options);
+   free(spec_entries);
+
+   assert (entry_point);
+   nir = entry_point->shader;
+   assert(nir->info.stage == stage);
+
+   nir->options = options;
+
+   nir->info.name =
+  ralloc_asprintf(nir, "SPIRV:%s:%d",
+  _mesa_shader_stage_to_abbrev(nir->info.stage),
+  prog->Name);
+   nir_validate_shader(nir);
+
+   if (false) {
+  /* @FIXME: Only for debugging purposes */
+  nir_print_shader(nir, stdout);
+  fflush(stdout);
+   }


I'd rather we not commit this debug code, if you want this for 
development please carry a patch around in your dev branch.


With those two things addressed:

Reviewed-by: Timothy Arceri 


+
+   return nir;
+}
+
  void GLAPIENTRY
  _mesa_SpecializeShaderARB(GLuint shader,
const GLchar *pEntryPoint,
diff --git a/src/mesa/main/glspirv.h b/src/mesa/main/glspirv.h
index 0f03b75c111..81626ce75b5 100644
--- a/src/mesa/main/glspirv.h
+++ b/src/mesa/main/glspirv.h
@@ -24,6 +24,7 @@
  #ifndef GLSPIRV_H
  #define GLSPIRV_H
  
+#include "compiler/nir/nir.h"

  #include "mtypes.h"
  
  #ifdef __cplusplus

@@ -80,6 +81,12 @@ void
  _mesa_spirv_link_shaders(struct gl_context *ctx,
   struct gl_shader_program *prog);
  
+nir_shader *

+_mesa_spirv_to_nir(struct gl_context *ctx,
+   const struct gl_shader_program *prog,
+   gl_shader_stage stage,
+   const nir_shader_compiler_options *options);
+
  /**
   * \name API functions
   */


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 104141] include/c11/threads_posix.h:96: undefined reference to `pthread_once'

2017-12-06 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=104141

Bug ID: 104141
   Summary: include/c11/threads_posix.h:96: undefined reference to
`pthread_once'
   Product: Mesa
   Version: git
  Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: NEW
  Keywords: regression
  Severity: normal
  Priority: medium
 Component: Mesa core
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: v...@freedesktop.org
QA Contact: mesa-dev@lists.freedesktop.org
CC: baker.dyla...@gmail.com

mesa: 8761a04d0d9332d9c0c99164faf855fc3c741f7c (master 17.4.0-devel)

meson build error

FAILED: src/gallium/targets/xvmc/libXvMCgallium.so 
c++  -o src/gallium/targets/xvmc/libXvMCgallium.so
'src/gallium/targets/xvmc/XvMCgallium@sha/target.c.o' -Wl,--no-undefined
-Wl,--as-needed -shared -fPIC -Wl,--start-group -Wl,-soname,libXvMCgallium.so
src/gallium/state_trackers/xvmc/libxvmc_st.a
src/gallium/auxiliary/libgalliumvlwinsys.a src/gallium/auxiliary/libgalliumvl.a
src/gallium/auxiliary/libgallium.a src/util/libmesa_util.a
src/gallium/auxiliary/pipe-loader/libpipe_loader_static.a
src/loader/libloader.a src/util/libxmlconfig.a
src/gallium/winsys/sw/null/libws_null.a src/gallium/winsys/sw/wrapper/libwsw.a
src/gallium/drivers/r600/libr600.a
src/gallium/winsys/radeon/drm/libradeonwinsys.a
src/gallium/winsys/nouveau/drm/libnouveauwinsys.a
src/gallium/drivers/nouveau/libnouveau.a -Wl,--version-script
mesa/src/gallium/targets/xvmc/xvmc.sym -Wl,--gc-sections -lxcb -lX11-xcb -lX11
-lxcb -lxcb-dri2 -lxcb-dri3 -ldrm -lX11-xcb -lX11 -lxcb -lxcb -lxcb-dri2
-lxcb-dri3 -ldrm -ldrm -L/usr/lib64 -lLLVM-5.0 -lunwind -ldl -lm -lz -ldrm
-ldrm -lexpat -lm -ldrm_radeon -lelf -L/usr/lib64 -lLLVM-5.0 -ldrm_radeon
-ldrm_nouveau -Wl,--end-group -ldrm -ldrm_nouveau '-Wl,-rpath,$ORIGIN/'
-Wl,-rpath-link,mesa/builddir/src/gallium/targets/xvmc  
src/gallium/auxiliary/libgallium.a(gallivm_lp_bld_misc.cpp.o): In function
`lp_set_target_options':
include/c11/threads_posix.h:96: undefined reference to `pthread_once'

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 103909] anv_allocator.c:113:1: error: static declaration of ‘memfd_create’ follows non-static declaration

2017-12-06 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=103909

Vinson Lee  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|NEW |RESOLVED

--- Comment #1 from Vinson Lee  ---
commit 8c1e4b1afc8d396ccf99c725c59b29a9aa305557
Author: Vinson Lee 
Date:   Tue Nov 28 23:16:58 2017 -0800

anv: Check if memfd_create is already defined.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103909
Signed-off-by: Vinson Lee 
Reviewed-by: Eric Engestrom 

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 2/2] radeonsi: always place sparse buffers in VRAM

2017-12-06 Thread Dieter Nützel


Hello Nicolai,

any updates?

Dieter

Am 29.11.2017 02:57, schrieb Dieter Nützel:

For the series:

Tested-by: Dieter Nützel 

on RX580

with UH, UV, DiRT Rally, some Wine apps - LS2015/LS2017

Dieter

Am 28.11.2017 14:44, schrieb Nicolai Hähnle:

From: Nicolai Hähnle 

Together with "radeonsi: fix the R600_RESOURCE_FLAG_UNMAPPABLE check",
this ensures that sparse buffers are placed in VRAM.

Noticed by an assertion that started triggering with commit d4fac1e1d7
("gallium/radeon: enable suballocations for VRAM with no CPU access")

Fixes KHR-GL45.sparse_buffer_tests.BufferStorageTest in debug builds.
---
 src/gallium/drivers/radeon/r600_buffer_common.c | 3 +++
 src/gallium/winsys/amdgpu/drm/amdgpu_bo.c   | 5 +++--
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/radeon/r600_buffer_common.c
b/src/gallium/drivers/radeon/r600_buffer_common.c
index 158cabcfaeb..36a29181edc 100644
--- a/src/gallium/drivers/radeon/r600_buffer_common.c
+++ b/src/gallium/drivers/radeon/r600_buffer_common.c
@@ -616,20 +616,23 @@ r600_alloc_buffer_struct(struct pipe_screen 
*screen,

return rbuffer;
 }

 struct pipe_resource *si_buffer_create(struct pipe_screen *screen,
   const struct pipe_resource *templ,
   unsigned alignment)
 {
 	struct r600_common_screen *rscreen = (struct 
r600_common_screen*)screen;
 	struct r600_resource *rbuffer = r600_alloc_buffer_struct(screen, 
templ);


+   if (templ->flags & PIPE_RESOURCE_FLAG_SPARSE)
+   rbuffer->b.b.flags |= R600_RESOURCE_FLAG_UNMAPPABLE;
+
si_init_resource_fields(rscreen, rbuffer, templ->width0, alignment);

if (templ->flags & PIPE_RESOURCE_FLAG_SPARSE)
rbuffer->flags |= RADEON_FLAG_SPARSE;

if (!si_alloc_resource(rscreen, rbuffer)) {
FREE(rbuffer);
return NULL;
}
return >b.b;
diff --git a/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
b/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
index c3e97c22861..d6d5805deed 100644
--- a/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
+++ b/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
@@ -1143,20 +1143,23 @@ amdgpu_bo_create(struct radeon_winsys *rws,
struct amdgpu_winsys *ws = amdgpu_winsys(rws);
struct amdgpu_winsys_bo *bo;
unsigned usage = 0, pb_cache_bucket = 0;

/* VRAM implies WC. This is not optional. */
assert(!(domain & RADEON_DOMAIN_VRAM) || flags & 
RADEON_FLAG_GTT_WC);


/* NO_CPU_ACCESS is valid with VRAM only. */
assert(domain == RADEON_DOMAIN_VRAM || !(flags &
RADEON_FLAG_NO_CPU_ACCESS));

+   /* Sparse buffers must have NO_CPU_ACCESS set. */
+   assert(!(flags & RADEON_FLAG_SPARSE) || flags & 
RADEON_FLAG_NO_CPU_ACCESS);

+
/* Sub-allocate small buffers from slabs. */
if (!(flags & (RADEON_FLAG_NO_SUBALLOC | RADEON_FLAG_SPARSE)) &&
size <= (1 << AMDGPU_SLAB_MAX_SIZE_LOG2) &&
alignment <= MAX2(1 << AMDGPU_SLAB_MIN_SIZE_LOG2,
util_next_power_of_two(size))) {
   struct pb_slab_entry *entry;
   int heap = radeon_get_heap_index(domain, flags);

   if (heap < 0 || heap >= RADEON_MAX_SLAB_HEAPS)
  goto no_slab;

@@ -1175,22 +1178,20 @@ amdgpu_bo_create(struct radeon_winsys *rws,

   pipe_reference_init(>base.reference, 1);

   return >base;
}
 no_slab:

if (flags & RADEON_FLAG_SPARSE) {
   assert(RADEON_SPARSE_PAGE_SIZE % alignment == 0);

-  flags |= RADEON_FLAG_NO_CPU_ACCESS;
-
   return amdgpu_bo_sparse_create(ws, size, domain, flags);
}

/* This flag is irrelevant for the cache. */
flags &= ~RADEON_FLAG_NO_SUBALLOC;

/* Align size to page size. This is the minimum alignment for 
normal
 * BOs. Aligning this here helps the cached bufmgr. Especially 
small BOs,
 * like constant/uniform buffers, can benefit from better and more 
reuse.

 */

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2 23/25] mesa/glspirv: Add a _mesa_spirv_to_nir() function

2017-12-06 Thread Eduardo Lima Mitev

On 12/06/2017 10:13 AM, Timothy Arceri wrote:
>
>
> On 01/12/17 04:28, Eduardo Lima Mitev wrote:
>> This is basically a wrapper around spirv_to_nir() that includes
>> arguments setup and post-conversion validation.
>>
>> v2: Rebase update (SpirVCapabilities not a pointer anymore)
>> ---
>>   src/mesa/main/glspirv.c | 60
>> +
>>   src/mesa/main/glspirv.h |  7 ++
>>   2 files changed, 67 insertions(+)
>>
>> diff --git a/src/mesa/main/glspirv.c b/src/mesa/main/glspirv.c
>> index e5dc8b26ea9..2a20e4b5cc7 100644
>> --- a/src/mesa/main/glspirv.c
>> +++ b/src/mesa/main/glspirv.c
>> @@ -159,6 +159,66 @@ _mesa_spirv_link_shaders(struct gl_context *ctx,
>> struct gl_shader_program *prog)
>>  }
>>   }
>>   +nir_shader *
>> +_mesa_spirv_to_nir(struct gl_context *ctx,
>> +   const struct gl_shader_program *prog,
>> +   gl_shader_stage stage,
>> +   const nir_shader_compiler_options *options)
>> +{
>> +   nir_shader *nir = NULL;
>> +
>> +   struct gl_linked_shader *linked_shader =
>> prog->_LinkedShaders[stage];
>> +   assert (linked_shader);
>> +
>> +   struct gl_shader_spirv_data *spirv_data = linked_shader->spirv_data;
>> +   assert(spirv_data);
>> +
>> +   struct gl_spirv_module *spirv_module = spirv_data->SpirVModule;
>> +   assert (spirv_module != NULL);
>> +
>> +   const char *entry_point_name = spirv_data->SpirVEntryPoint;
>> +   assert(entry_point_name);
>> +
>> +   struct nir_spirv_specialization *spec_entries = NULL;
>> +   spec_entries = calloc(sizeof(*spec_entries),
>> + spirv_data->NumSpecializationConstants);
>
> Can we just make this:
>
>    struct nir_spirv_specialization *spec_entries =
>   calloc(sizeof(*spec_entries),
>  spirv_data->NumSpecializationConstants);
>

Sure, will fix it locally.

>
>> +
>> +   for (unsigned i = 0; i < spirv_data->NumSpecializationConstants;
>> ++i) {
>> +  spec_entries[i].id = spirv_data->SpecializationConstantsIndex[i];
>> +  spec_entries[i].data32 =
>> spirv_data->SpecializationConstantsValue[i];
>> +  spec_entries[i].defined_on_module = false;
>> +   }
>> +
>> +   nir_function *entry_point =
>> +  spirv_to_nir((const uint32_t *) _module->Binary[0],
>> +   spirv_module->Length / 4,
>> +   spec_entries,
>> spirv_data->NumSpecializationConstants,
>> +   stage, entry_point_name,
>> +   >Const.SpirVCapabilities,
>> +   options);
>> +   free(spec_entries);
>> +
>> +   assert (entry_point);
>> +   nir = entry_point->shader;
>> +   assert(nir->info.stage == stage);
>> +
>> +   nir->options = options;
>> +
>> +   nir->info.name =
>> +  ralloc_asprintf(nir, "SPIRV:%s:%d",
>> +  _mesa_shader_stage_to_abbrev(nir->info.stage),
>> +  prog->Name);
>> +   nir_validate_shader(nir);
>> +
>> +   if (false) {
>> +  /* @FIXME: Only for debugging purposes */
>> +  nir_print_shader(nir, stdout);
>> +  fflush(stdout);
>> +   }
>
> I'd rather we not commit this debug code, if you want this for
> development please carry a patch around in your dev branch.
>

Agree. Will remove it locally.


> With those two things addressed:
>
> Reviewed-by: Timothy Arceri 
>

Thanks!

>> +
>> +   return nir;
>> +}
>> +
>>   void GLAPIENTRY
>>   _mesa_SpecializeShaderARB(GLuint shader,
>>     const GLchar *pEntryPoint,
>> diff --git a/src/mesa/main/glspirv.h b/src/mesa/main/glspirv.h
>> index 0f03b75c111..81626ce75b5 100644
>> --- a/src/mesa/main/glspirv.h
>> +++ b/src/mesa/main/glspirv.h
>> @@ -24,6 +24,7 @@
>>   #ifndef GLSPIRV_H
>>   #define GLSPIRV_H
>>   +#include "compiler/nir/nir.h"
>>   #include "mtypes.h"
>>     #ifdef __cplusplus
>> @@ -80,6 +81,12 @@ void
>>   _mesa_spirv_link_shaders(struct gl_context *ctx,
>>    struct gl_shader_program *prog);
>>   +nir_shader *
>> +_mesa_spirv_to_nir(struct gl_context *ctx,
>> +   const struct gl_shader_program *prog,
>> +   gl_shader_stage stage,
>> +   const nir_shader_compiler_options *options);
>> +
>>   /**
>>    * \name API functions
>>    */
>>
>

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2 09/25] mesa: move nir_spirv_supported_capabilities definition

2017-12-06 Thread Alejandro Piñeiro

On 06/12/17 10:47, Timothy Arceri wrote:
>
>
> On 06/12/17 20:33, Alejandro Piñeiro wrote:
>> On 06/12/17 10:23, Timothy Arceri wrote:
>>> Can we get away with forward declaring this?
>>>
>>> There is a section at the top of mtypes you can add it to:
>>>
>>>   * \name Some forward type declarations
>>
>> Yes, I realized that, and tried, but I still got several build errors.
>> So that would not be enough.
>
> Doesn't that just mean you need to include compiler/spirv/nir_spirv.h
> in more places?
>
>>
>> In any case, after all the recent changes on spirv/spirv_to_nir
>> codebase, this commit and the following one are obsolete. We are
>> preparing a v3 series, but meanwhile we send this path alone to
>> mesa-dev:
>> https://lists.freedesktop.org/archives/mesa-dev/2017-December/179438.html
>>
>
> I'm confused. If it's obsolete why are you trying to get it committed?

Sorry for the confusion. We sent this v2 series last week. They became
obsolete this week (on Monday). This is the reason I sent a new patch
today. As I was sending the patch, I should have send a warning for
those two patches.

>
>
>>>
>>>
>>> On 01/12/17 04:28, Eduardo Lima Mitev wrote:
 From: Alejandro Piñeiro 

 Due gl_spirv we will use it on more places, specifically on
 gl_constants, where we would like to use it without a pointer.
 ---
    src/compiler/spirv/nir_spirv.h | 15 ++-
    src/mesa/main/mtypes.h | 11 +++
    2 files changed, 13 insertions(+), 13 deletions(-)

 diff --git a/src/compiler/spirv/nir_spirv.h
 b/src/compiler/spirv/nir_spirv.h
 index 0204e81d091..a14b55cdd4b 100644
 --- a/src/compiler/spirv/nir_spirv.h
 +++ b/src/compiler/spirv/nir_spirv.h
 @@ -28,7 +28,8 @@
    #ifndef _NIR_SPIRV_H_
    #define _NIR_SPIRV_H_
    -#include "nir/nir.h"
 +#include "compiler/nir/nir.h"
 +#include "main/mtypes.h"
      #ifdef __cplusplus
    extern "C" {
 @@ -42,18 +43,6 @@ struct nir_spirv_specialization {
   };
    };
    -struct nir_spirv_supported_capabilities {
 -   bool float64;
 -   bool image_ms_array;
 -   bool tessellation;
 -   bool draw_parameters;
 -   bool image_read_without_format;
 -   bool image_write_without_format;
 -   bool int64;
 -   bool multiview;
 -   bool variable_pointers;
 -};
 -
    nir_function *spirv_to_nir(const uint32_t *words, size_t
 word_count,
   struct nir_spirv_specialization
 *specializations,
   unsigned num_specializations,
 diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
 index 50a47e0a65d..c8177c9a99a 100644
 --- a/src/mesa/main/mtypes.h
 +++ b/src/mesa/main/mtypes.h
 @@ -3583,6 +3583,17 @@ struct gl_program_constants
   GLuint MaxShaderStorageBlocks;
    };
    +struct nir_spirv_supported_capabilities {
 +   bool float64;
 +   bool image_ms_array;
 +   bool tessellation;
 +   bool draw_parameters;
 +   bool image_read_without_format;
 +   bool image_write_without_format;
 +   bool int64;
 +   bool multiview;
 +   bool variable_pointers;
 +};
      /**
     * Constants which may be overridden by device driver during
 context creation

>>> ___
>>> mesa-dev mailing list
>>> mesa-dev@lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>
>> ___
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>
>

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v4 00/44] anv: SPV_KHR_16bit_storage/VK_KHR_16bit_storage for gen8+

2017-12-06 Thread Alejandro Piñeiro

On 06/12/17 01:19, Chema Casanova wrote:
> On 05/12/17 18:31, Chema Casanova wrote:
>> El 05/12/17 a las 06:16, Jason Ekstrand escribió:
>>> A couple of notes:
>>>
>>>  1) I *think* I gave you enough reviews to land the UBO/SSBO part and
>>> the optimizations in 26-28.  If reviews are still missing anywhere,
>>> please let me know.  If not, let's try and get that part landed.
>> The series is almost ready to land, I have only pending to address your
>> feedback about use untyped_read for reading vec3 ssbos.
>>
>> The only missing explicit R-b is that " [PATCH v4 28/44] i965/fs: Use
>> untyped_surface_read for 16-bit load_ssbo" and "[PATCH v4 23/44]
>> i965/fs: Enables 16-bit load_ubo with sampler" i've just answered your
>> review to confirm the R-b.
>>
>> I expect to finish today vec3 ssbo and send the series to Jenkins before
>> landing, confirm your "pending" R-b, do a last rebase to master and ask
>> for a push.
> I've just prepared a rebased branch with the reviewed commits ready to
> land to enable VK_KHR_16bit_storage support for SSBO/UBO.
>
> https://github.com/Igalia/mesa/tree/wip/VK_KHR_16bit_storage-v4-ubo-ssbo-to-land
>
> As I don't have still commit access to mesa, maybe Eduardo or Alejandro
> can land it for me tomorrow. But, Jason feel free to push it if you want.

I have just pushed it to master.

>
> Chema
>
>>>  2) I send out a patch to rewrite assign_constant_locations which I
>>> think should make it automatically handle 8 and 16-bit values as
>>> well.  I'd rather do that than more special casing if everything works
>>> out ok.
>> I'm testing this patch with 16-bits and make sure whatever is needed to
>> have 16-bit working.
>>
>>>  3) I sent out a series of patches to enable pushing of UBOs in
>>> Vulkan.  If we're not careful, these will clash with 16bit storage as
>>> UBO support suddenly has to imply push constant support.  That said,
>>> I"m willing to wait at least a little while before landing them to let
>>> us get 16bit push constant support sorted out.  The UBO pushing
>>> patches give us a nice little performance boost but we're nowhere near
>>> a release and I don't want it blocking you.
>> That would be my next priority, so we would only have pending to land
>> the 16-bit input/output support to finish this extension.
>>
>> Chema
>>
>>> On Wed, Nov 29, 2017 at 6:07 PM, Jose Maria Casanova Crespo
>>> > wrote:
>>>
>>> Hello,
>>>
>>> this is the V4 series for the implementation of the
>>> SPV_KHR_16bit_storage
>>> and VK_KHR_16bit_storage extensions on the anv vulkan driver, in
>>> addition
>>> to the GLSL and NIR support needed.
>>>
>>> The original series can be found here [1], the following v2 [2]
>>> and v3 [3].
>>>
>>> In short V4 includes the following:
>>>
>>>  * Reorder the series to enable features as they are implemented,
>>> the series
>>>    now enables first UBO and SSBO support, and then inputs/outputs and
>>>    finally push constants.
>>>  * Support the byte scattered read/write messages with different
>>> bit sizes
>>>    byte/word/dword.
>>>  * Refactor of the store_ssbo code and also fix stores when
>>> writemask was .yz
>>>  * Uses the sampler for load_ubo avoiding the initial
>>> implementation of
>>>    the series using byte_scattered_read.
>>>  * Addressed all the feedback provided by Jason and Topi on v3 review.
>>>
>>> This series is also available at:
>>>
>>> https://github.com/Igalia/mesa/tree/wip/VK_KHR_16bit_storage-rc4
>>> 
>>>
>>> The objective is to start landing part of this series, all
>>> feedback has been
>>> addressed for SSBO and UBO. But for input/outputs features it will
>>> probably
>>> need another iteration as was not completely reviewed. It is also
>>> needed
>>> to define the approach for push constants issues before of after
>>> landing
>>> the support with this implementation.
>>>
>>> Patches 1-5 and 8-17 have already been reviewed. Patch 7 was already
>>> reviewed but as it has changed too much i would appreciate another
>>> review. When patches until 25 or 28 are reviewed we could land
>>> UBOs and
>>> SSBOs support.
>>>
>>> Finally an updated overview of the patches:
>>>
>>> Patches 1-2 add 16-bit float, int and uint types to GLSL. This is
>>> needed because NIR uses GLSL types internally. We use the enums
>>> already defined at AMD_gpu_shader_half_float and NV_gpu_shader
>>> extensions. Patch 2 updates mesa/st, in order to avoid warnings for
>>> types not handled on a switch.
>>>
>>> Patches 3-6 add NIR support for those new GLSL 16-bit types,
>>> conversion opcodes, and rounding modes for float to half-float
>>> conversions.
>>>
>>> Patches 7-9 add the SPIR-V (SPV_KHR_16bit_storage) to NIR support.
>>>
>>>

Re: [Mesa-dev] [PATCH v2 25/25] i965: Don't call process_glsl_ir() for SPIR-V shaders

2017-12-06 Thread Timothy Arceri


Reviewed-by: Timothy Arceri 

On 01/12/17 04:28, Eduardo Lima Mitev wrote:

v2: Use 'spirv_data' from gl_linked_shader instead, to check if shader
is SPIR-V. (Timothy Arceri)
---
  src/mesa/drivers/dri/i965/brw_link.cpp | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_link.cpp 
b/src/mesa/drivers/dri/i965/brw_link.cpp
index d18521e792d..6bf4c413db4 100644
--- a/src/mesa/drivers/dri/i965/brw_link.cpp
+++ b/src/mesa/drivers/dri/i965/brw_link.cpp
@@ -236,7 +236,8 @@ brw_link_shader(struct gl_context *ctx, struct 
gl_shader_program *shProg)
struct gl_program *prog = shader->Program;
prog->Parameters = _mesa_new_parameter_list();
  
-  process_glsl_ir(brw, shProg, shader);

+  if (!shader->spirv_data)
+ process_glsl_ir(brw, shProg, shader);
  
_mesa_copy_linked_program_data(shProg, shader);
  


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2 09/25] mesa: move nir_spirv_supported_capabilities definition

2017-12-06 Thread Alejandro Piñeiro



On 06/12/17 10:33, Alejandro Piñeiro wrote:
> On 06/12/17 10:23, Timothy Arceri wrote:
>> Can we get away with forward declaring this?
>>
>> There is a section at the top of mtypes you can add it to:
>>
>>  * \name Some forward type declarations
> Yes, I realized that, and tried, but I still got several build errors.
> So that would not be enough.
>
> In any case, after all the recent changes on spirv/spirv_to_nir
> codebase, this commit and the following one are obsolete. 

sorry, and error: this commit and the *previous* one

> We are
> preparing a v3 series, but meanwhile we send this path alone to mesa-dev:
> https://lists.freedesktop.org/archives/mesa-dev/2017-December/179438.html
>>
>> On 01/12/17 04:28, Eduardo Lima Mitev wrote:
>>> From: Alejandro Piñeiro 
>>>
>>> Due gl_spirv we will use it on more places, specifically on
>>> gl_constants, where we would like to use it without a pointer.
>>> ---
>>>   src/compiler/spirv/nir_spirv.h | 15 ++-
>>>   src/mesa/main/mtypes.h | 11 +++
>>>   2 files changed, 13 insertions(+), 13 deletions(-)
>>>
>>> diff --git a/src/compiler/spirv/nir_spirv.h
>>> b/src/compiler/spirv/nir_spirv.h
>>> index 0204e81d091..a14b55cdd4b 100644
>>> --- a/src/compiler/spirv/nir_spirv.h
>>> +++ b/src/compiler/spirv/nir_spirv.h
>>> @@ -28,7 +28,8 @@
>>>   #ifndef _NIR_SPIRV_H_
>>>   #define _NIR_SPIRV_H_
>>>   -#include "nir/nir.h"
>>> +#include "compiler/nir/nir.h"
>>> +#include "main/mtypes.h"
>>>     #ifdef __cplusplus
>>>   extern "C" {
>>> @@ -42,18 +43,6 @@ struct nir_spirv_specialization {
>>>  };
>>>   };
>>>   -struct nir_spirv_supported_capabilities {
>>> -   bool float64;
>>> -   bool image_ms_array;
>>> -   bool tessellation;
>>> -   bool draw_parameters;
>>> -   bool image_read_without_format;
>>> -   bool image_write_without_format;
>>> -   bool int64;
>>> -   bool multiview;
>>> -   bool variable_pointers;
>>> -};
>>> -
>>>   nir_function *spirv_to_nir(const uint32_t *words, size_t word_count,
>>>  struct nir_spirv_specialization
>>> *specializations,
>>>  unsigned num_specializations,
>>> diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
>>> index 50a47e0a65d..c8177c9a99a 100644
>>> --- a/src/mesa/main/mtypes.h
>>> +++ b/src/mesa/main/mtypes.h
>>> @@ -3583,6 +3583,17 @@ struct gl_program_constants
>>>  GLuint MaxShaderStorageBlocks;
>>>   };
>>>   +struct nir_spirv_supported_capabilities {
>>> +   bool float64;
>>> +   bool image_ms_array;
>>> +   bool tessellation;
>>> +   bool draw_parameters;
>>> +   bool image_read_without_format;
>>> +   bool image_write_without_format;
>>> +   bool int64;
>>> +   bool multiview;
>>> +   bool variable_pointers;
>>> +};
>>>     /**
>>>    * Constants which may be overridden by device driver during
>>> context creation
>>>
>> ___
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/2] i965/fs: Rewrite assign_constant_locations

2017-12-06 Thread Chema Casanova

I've tested this patch against the VK-CTS push constant 16-bit tests,
and enabled storagePushConstant16 at VK_KHR_16bit_storage. All test pass
without any extra modification.

dEQP-VK.spirv_assembly.instruction.compute.16bit_storage.push_constant.*
dEQP-VK.spirv_assembly.instruction.graphics.16bit_storage.push_constant.*

All test pass. I have pending in my TODO to build a test mixing push
constant values with different bitsizes.

Tested-by: Jose Maria Casanova Crespo 


On 04/12/17 02:50, Jason Ekstrand wrote:
> This rewires the logic for assigning uniform locations to work in terms
> of "complex alignments".  The basic idea is that, as we walk the list of
> instructions, we keep track of the alignment and continuity requirements
> of each slot and assert that the alignments all match up.  We then use
> those alignments in the compaction stage to ensure that everything gets
> placed at a properly aligned register.  The old mechanism handled
> alignments by special-casing each of the bit sizes and placing 64-bit
> values first followed by 32-bit values.
> 
> The old scheme had the advantage of never leaving a hole since all the
> 64-bit values could be tightly packed and so could the 32-bit values.
> However, the new scheme has no type size special cases so it handles not
> only 32 and 64-bit types but should gracefully extend to 16 and 8-bit
> types as the need arises.
> 
> Cc: Kenneth Graunke 
> Cc: Jose Maria Casanova Crespo 
> ---
>  src/intel/compiler/brw_fs.cpp | 307 
> --
>  1 file changed, 174 insertions(+), 133 deletions(-)
> 
> diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp
> index 6772c0d..ffd8e12 100644
> --- a/src/intel/compiler/brw_fs.cpp
> +++ b/src/intel/compiler/brw_fs.cpp
> @@ -1874,62 +1874,6 @@ fs_visitor::compact_virtual_grfs()
> return progress;
>  }
>  
> -static void
> -set_push_pull_constant_loc(unsigned uniform, int *chunk_start,
> -   unsigned *max_chunk_bitsize,
> -   bool contiguous, unsigned bitsize,
> -   const unsigned target_bitsize,
> -   int *push_constant_loc, int *pull_constant_loc,
> -   unsigned *num_push_constants,
> -   unsigned *num_pull_constants,
> -   const unsigned max_push_components,
> -   const unsigned max_chunk_size,
> -   bool allow_pull_constants,
> -   struct brw_stage_prog_data *stage_prog_data)
> -{
> -   /* This is the first live uniform in the chunk */
> -   if (*chunk_start < 0)
> -  *chunk_start = uniform;
> -
> -   /* Keep track of the maximum bit size access in contiguous uniforms */
> -   *max_chunk_bitsize = MAX2(*max_chunk_bitsize, bitsize);
> -
> -   /* If this element does not need to be contiguous with the next, we
> -* split at this point and everything between chunk_start and u forms a
> -* single chunk.
> -*/
> -   if (!contiguous) {
> -  /* If bitsize doesn't match the target one, skip it */
> -  if (*max_chunk_bitsize != target_bitsize) {
> - /* FIXME: right now we only support 32 and 64-bit accesses */
> - assert(*max_chunk_bitsize == 4 || *max_chunk_bitsize == 8);
> - *max_chunk_bitsize = 0;
> - *chunk_start = -1;
> - return;
> -  }
> -
> -  unsigned chunk_size = uniform - *chunk_start + 1;
> -
> -  /* Decide whether we should push or pull this parameter.  In the
> -   * Vulkan driver, push constants are explicitly exposed via the API
> -   * so we push everything.  In GL, we only push small arrays.
> -   */
> -  if (!allow_pull_constants ||
> -  (*num_push_constants + chunk_size <= max_push_components &&
> -   chunk_size <= max_chunk_size)) {
> - assert(*num_push_constants + chunk_size <= max_push_components);
> - for (unsigned j = *chunk_start; j <= uniform; j++)
> -push_constant_loc[j] = (*num_push_constants)++;
> -  } else {
> - for (unsigned j = *chunk_start; j <= uniform; j++)
> -pull_constant_loc[j] = (*num_pull_constants)++;
> -  }
> -
> -  *max_chunk_bitsize = 0;
> -  *chunk_start = -1;
> -   }
> -}
> -
>  static int
>  get_subgroup_id_param_index(const brw_stage_prog_data *prog_data)
>  {
> @@ -1945,6 +1889,98 @@ get_subgroup_id_param_index(const brw_stage_prog_data 
> *prog_data)
>  }
>  
>  /**
> + * Struct for handling complex alignments.
> + *
> + * A complex alignment is stored as multiplier and an offset.  A value is
> + * considered to be aligned if it is congruent to the offset modulo the
> + * multiplier.
> + */
> +struct cplx_align {
> +   unsigned mul:4;
> +   unsigned offset:4;
> +};
> +
> +#define CPLX_ALIGN_MAX_MUL 8
> +
> +static void
>

Re: [Mesa-dev] [PATCH v2 10/25] mesa: add gl_constants::SpirVCapabilities

2017-12-06 Thread Timothy Arceri

If you can forward declare nir_spirv_supported_capabilities as per my 
comment on the last patch, and you add the forward declaration to this 
patch then this patch is:


Reviewed-by: Timothy Arceri 

On 01/12/17 04:28, Eduardo Lima Mitev wrote:

From: Nicolai Hähnle 

For drivers to declare which SPIR-V features they support.

v2: Don't use a pointer (Ian Romanick)
---
  src/mesa/main/mtypes.h | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index c8177c9a99a..7fed85a2ae6 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -4028,6 +4028,9 @@ struct gl_constants
  
 /** When drivers are OK with mapped buffers during draw and other calls. */

 bool AllowMappedBuffersDuringExecution;
+
+   /** GL_ARB_gl_spirv */
+   struct nir_spirv_supported_capabilities SpirVCapabilities;
  };
  
  


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/2] radv: fix a case statement in GetMemoryFdPropertiesKHR

2017-12-06 Thread Emil Velikov

On 5 December 2017 at 20:51, Fredrik Höglund  wrote:
> The handle type in the case statement is supposed to be VK_EXTERNAL_-
> MEMORY_HANDLE_TYPE_DMA_BUF_BIT_EXT.
>
> Signed-off-by: Fredrik Höglund 
For the future please include a fixes tag if the commit is known.

Here
Fixes: 546e747867c ("radv: Implement VK_EXT_external_memory_dma_buf")

And for 2/2
Fixes: ab18e8e59b6 ("anv: Implement VK_EXT_external_memory_dma_buf")

Thanks
Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] nir/opcodes: Fix constant-folding of bitfield_insert

2017-12-06 Thread Matt Turner

On Wed, Dec 6, 2017 at 3:55 AM, James Legg  wrote:
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104119
> CC: 
> CC: Samuel Pitoiset 
> ---
>  src/compiler/nir/nir_opcodes.py | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/src/compiler/nir/nir_opcodes.py b/src/compiler/nir/nir_opcodes.py
> index ac7333fe78..278562b2bd 100644
> --- a/src/compiler/nir/nir_opcodes.py
> +++ b/src/compiler/nir/nir_opcodes.py
> @@ -724,12 +724,12 @@ opcode("bitfield_insert", 0, tuint32, [0, 0, 0, 0],
>  unsigned base = src0, insert = src1;
>  int offset = src2, bits = src3;
>  if (bits == 0) {
> -   dst = 0;
> +   dst = base;
>  } else if (offset < 0 || bits < 0 || bits + offset > 32) {
> dst = 0;
>  } else {
> unsigned mask = ((1ull << bits) - 1) << offset;
> -   dst = (base & ~mask) | ((insert << bits) & mask);
> +   dst = (base & ~mask) | ((insert << offset) & mask);
>  }
>  """)

Reviewed-by: Matt Turner 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/6] radeonsi: allow DMABUF exports for local buffers

2017-12-06 Thread Nicolai Hähnle


On 06.12.2017 13:43, Marek Olšák wrote:



On Dec 6, 2017 12:34 PM, "Nicolai Hähnle" > wrote:


On 05.12.2017 20:05, Marek Olšák wrote:

From: Marek Olšák >

Cc: 17.3 >
---
   src/gallium/drivers/radeon/r600_texture.c | 6 +-
   1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeon/r600_texture.c
b/src/gallium/drivers/radeon/r600_texture.c
index 2aa47b5..07f7c33 100644
--- a/src/gallium/drivers/radeon/r600_texture.c
+++ b/src/gallium/drivers/radeon/r600_texture.c
@@ -739,22 +739,26 @@ static boolean
r600_texture_get_handle(struct pipe_screen* screen,
                         stride = rtex->surface.u.gfx9.surf_pitch *
                                  rtex->surface.bpe;
                         slice_size =
rtex->surface.u.gfx9.surf_slice_size;
                 } else {
                         offset =
rtex->surface.u.legacy.level[0].offset;
                         stride =
rtex->surface.u.legacy.level[0].nblk_x *
                                  rtex->surface.bpe;
                         slice_size =
(uint64_t)rtex->surface.u.legacy.level[0].slice_size_dw * 4;
                 }
         } else {
+               /* Buffer exports are for the OpenCL interop. */
                 /* Move a suballocated buffer into a
non-suballocated allocation. */
-               if (sscreen->ws->buffer_is_suballocated(res->buf)) {
+               if (sscreen->ws->buffer_is_suballocated(res->buf) ||
+                   /* A DMABUF export always fails if the BO is
local. */
+                   (rtex->resource.flags &
RADEON_FLAG_NO_INTERPROCESS_SHARING &&
+                    whandle->type != DRM_API_HANDLE_TYPE_KMS)) {


I still don't think this is right. Or at least, it's bound to blow
up in our faces at some point. Though I think we may have talked
past each other, apologies that I haven't made myself clear.

The issues I have in mind are scenarios like this:

1. Buffer allocated in OpenGL.
2. Buffer exported as KMS handle for importing to OpenCL in the same
process.
3. Buffer exported as an FD <-- at this point, the OpenGL and OpenCL
buffers go out of sync because OpenGL re-allocates the buffer but
OpenCL isn't informed.

Or:

1. Buffer allocated in OpenGL.
2. Buffer exported as KMS handle for importing to OpenCL in the same
process.
3. Buffer attempted to be exported as an FD from OpenCL <-- fails
because the buffer is local (has NO_INTERPROCESS_SHARING), and
people will be utterly clueless as to what's going on.

FWIW, I think the patch is good if you drop the whandle->type check
so that we re-allocate unconditionally.


I can remove the check, but buffers are only exported as DMABUF. This 
patch isn't just random - it does fix OpenCL interop.


Well, I doubt it's critical for performance, OpenCL doesn't know about 
the NO_INTERPROCESS_SHARING status anyway. So I'd prefer to remove the 
check, in case we change the exporting at some point.


Cheers,
Nicolai




Marek



Cheers,
Nicolai



                         assert(!res->b.is_shared);
                         /* Allocate a new buffer with
PIPE_BIND_SHARED. */
                         struct pipe_resource templ = res->b.b;
                         templ.bind |= PIPE_BIND_SHARED;
                         struct pipe_resource *newb =
                                 screen->resource_create(screen,
);
                         if (!newb)
                                 return false;



-- 
Lerne, wie die Welt wirklich ist,

Aber vergiss niemals, wie sie sein sollte.





--
Lerne, wie die Welt wirklich ist,
Aber vergiss niemals, wie sie sein sollte.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/6] radeonsi: allow DMABUF exports for local buffers

2017-12-06 Thread Emil Velikov

On 6 December 2017 at 15:52, Nicolai Hähnle  wrote:
> On 06.12.2017 13:43, Marek Olšák wrote:
>>
>>
>>
>> On Dec 6, 2017 12:34 PM, "Nicolai Hähnle" > > wrote:
>>
>> On 05.12.2017 20:05, Marek Olšák wrote:
>>
>> From: Marek Olšák > >
>>
>> Cc: 17.3 > >
>>
>> ---
>>src/gallium/drivers/radeon/r600_texture.c | 6 +-
>>1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/gallium/drivers/radeon/r600_texture.c
>> b/src/gallium/drivers/radeon/r600_texture.c
>> index 2aa47b5..07f7c33 100644
>> --- a/src/gallium/drivers/radeon/r600_texture.c
>> +++ b/src/gallium/drivers/radeon/r600_texture.c
>> @@ -739,22 +739,26 @@ static boolean
>> r600_texture_get_handle(struct pipe_screen* screen,
>>  stride = rtex->surface.u.gfx9.surf_pitch
>> *
>>   rtex->surface.bpe;
>>  slice_size =
>> rtex->surface.u.gfx9.surf_slice_size;
>>  } else {
>>  offset =
>> rtex->surface.u.legacy.level[0].offset;
>>  stride =
>> rtex->surface.u.legacy.level[0].nblk_x *
>>   rtex->surface.bpe;
>>  slice_size =
>> (uint64_t)rtex->surface.u.legacy.level[0].slice_size_dw * 4;
>>  }
>>  } else {
>> +   /* Buffer exports are for the OpenCL interop. */
>>  /* Move a suballocated buffer into a
>> non-suballocated allocation. */
>> -   if (sscreen->ws->buffer_is_suballocated(res->buf))
>> {
>> +   if (sscreen->ws->buffer_is_suballocated(res->buf)
>> ||
>> +   /* A DMABUF export always fails if the BO is
>> local. */
>> +   (rtex->resource.flags &
>> RADEON_FLAG_NO_INTERPROCESS_SHARING &&
>> +whandle->type != DRM_API_HANDLE_TYPE_KMS)) {
>>
>>
>> I still don't think this is right. Or at least, it's bound to blow
>> up in our faces at some point. Though I think we may have talked
>> past each other, apologies that I haven't made myself clear.
>>
>> The issues I have in mind are scenarios like this:
>>
>> 1. Buffer allocated in OpenGL.
>> 2. Buffer exported as KMS handle for importing to OpenCL in the same
>> process.
>> 3. Buffer exported as an FD <-- at this point, the OpenGL and OpenCL
>> buffers go out of sync because OpenGL re-allocates the buffer but
>> OpenCL isn't informed.
>>
>> Or:
>>
>> 1. Buffer allocated in OpenGL.
>> 2. Buffer exported as KMS handle for importing to OpenCL in the same
>> process.
>> 3. Buffer attempted to be exported as an FD from OpenCL <-- fails
>> because the buffer is local (has NO_INTERPROCESS_SHARING), and
>> people will be utterly clueless as to what's going on.
>>
>> FWIW, I think the patch is good if you drop the whandle->type check
>> so that we re-allocate unconditionally.
>>
>>
>> I can remove the check, but buffers are only exported as DMABUF. This
>> patch isn't just random - it does fix OpenCL interop.
>
>
> Well, I doubt it's critical for performance, OpenCL doesn't know about the
> NO_INTERPROCESS_SHARING status anyway. So I'd prefer to remove the check, in
> case we change the exporting at some point.
>
Humble request: please include the highlights of the discussion in the
commit message ;-)

Thanks
Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] egl: remove unneeded _eglGetNativePlatform check

2017-12-06 Thread Emil Velikov

On 13 November 2017 at 14:04, Emil Velikov  wrote:
> From: Emil Velikov 
>
> There's little point in calling _eglGetNativePlatform() in
> eglCopyBuffers. The platform return is identical to the one already
> stored in our _EGLDisplay.
>
> Modulo subtle memory corruption of course. But in these cases returning
> EGL_BAD_NATIVE_PIXMAP doesn't sound right.
>
> Signed-off-by: Emil Velikov 
> ---
>  src/egl/main/eglapi.c | 2 --
>  1 file changed, 2 deletions(-)
>
> diff --git a/src/egl/main/eglapi.c b/src/egl/main/eglapi.c
> index c1bf5bbfe19..4aa93db829b 100644
> --- a/src/egl/main/eglapi.c
> +++ b/src/egl/main/eglapi.c
> @@ -1396,8 +1396,6 @@ eglCopyBuffers(EGLDisplay dpy, EGLSurface surface, 
> EGLNativePixmapType target)
> native_pixmap_ptr = (void*) target;
>
> _EGL_CHECK_SURFACE(disp, surf, EGL_FALSE, drv);
> -   if (disp->Platform != _eglGetNativePlatform(disp->PlatformDisplay))
> -  RETURN_EGL_ERROR(disp, EGL_BAD_NATIVE_PIXMAP, EGL_FALSE);
> ret = drv->API.CopyBuffers(drv, disp, surf, native_pixmap_ptr);
>
> RETURN_EGL_EVAL(disp, ret);
> --
Humble ping anyone?

-Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] r600/sb: do not convert if-blocks that contain indirect array access

2017-12-06 Thread Gert Wollny

If an array is accessed within an if block, then currently it is not known
whether the value in the address register is involved in the evaluation of the
if condition, and converting the if condition may actually result in
out-of-bounds array access. Consequently, if blocks that contain indirect array
access should not be converted.

Fixes piglits on r600/BARTS:
spec/glsl-1.10/execution/variable-indexing/
  vs-output-array-float-index-wr
  vs-output-array-vec3-index-wr
  vs-output-array-vec4-index-wr

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104143

Signed-off-by: Gert Wollny 
---
* A better fix would probably contain some tracking to see whether the address 
register value derives from a value used in the if condition, but without 
doing some major refactoring and bringing the r600/sb code under test I'm kind 
of afraid to touch it.
* I don't have mesa-git write access. 

Best,
Gert

 src/gallium/drivers/r600/sb/sb_if_conversion.cpp | 2 +-
 src/gallium/drivers/r600/sb/sb_ir.cpp| 2 ++
 src/gallium/drivers/r600/sb/sb_ir.h  | 3 ++-
 3 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/r600/sb/sb_if_conversion.cpp 
b/src/gallium/drivers/r600/sb/sb_if_conversion.cpp
index 3f2b1b1b92..3f6431b80f 100644
--- a/src/gallium/drivers/r600/sb/sb_if_conversion.cpp
+++ b/src/gallium/drivers/r600/sb/sb_if_conversion.cpp
@@ -136,7 +136,7 @@ bool if_conversion::check_and_convert(region_node *r) {
);
 
if (s.region_count || s.fetch_count || s.alu_kill_count ||
-   s.if_count != 1 || s.repeat_count)
+   s.if_count != 1 || s.repeat_count || s.uses_ar)
return false;
 
unsigned real_alu_count = s.alu_count - s.alu_copy_mov_count;
diff --git a/src/gallium/drivers/r600/sb/sb_ir.cpp 
b/src/gallium/drivers/r600/sb/sb_ir.cpp
index d989dce62c..6e44193c1e 100644
--- a/src/gallium/drivers/r600/sb/sb_ir.cpp
+++ b/src/gallium/drivers/r600/sb/sb_ir.cpp
@@ -461,6 +461,8 @@ void container_node::collect_stats(node_stats& s) {
++s.alu_kill_count;
else if (a->is_copy_mov())
++s.alu_copy_mov_count;
+   if (a->uses_ar())
+  s.uses_ar = true;
} else if (n->is_fetch_inst())
++s.fetch_count;
else if (n->is_cf_inst())
diff --git a/src/gallium/drivers/r600/sb/sb_ir.h 
b/src/gallium/drivers/r600/sb/sb_ir.h
index ec973e7bfc..67c7cd8aa4 100644
--- a/src/gallium/drivers/r600/sb/sb_ir.h
+++ b/src/gallium/drivers/r600/sb/sb_ir.h
@@ -726,11 +726,12 @@ struct node_stats {
unsigned depart_count;
unsigned repeat_count;
unsigned if_count;
+   bool uses_ar;
 
node_stats() : alu_count(), alu_kill_count(), alu_copy_mov_count(),
cf_count(), fetch_count(), region_count(),
loop_count(), phi_count(), loop_phi_count(), 
depart_count(),
-   repeat_count(), if_count() {}
+   repeat_count(), if_count(), uses_ar(false) {}
 
void dump();
 };
-- 
2.13.6

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 1/2] radv: remove useless checks in radv_set_{color, depth}_clear_regs()

2017-12-06 Thread Samuel Pitoiset

Already checked by the respective callers.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_cmd_buffer.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index 95c2915c97..621f0bad0b 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -1300,8 +1300,7 @@ radv_set_depth_clear_regs(struct radv_cmd_buffer 
*cmd_buffer,
va += image->offset + image->clear_value_offset;
unsigned reg_offset = 0, reg_count = 0;
 
-   if (!image->surface.htile_size)
-   return;
+   assert(image->surface.htile_size);
 
if (aspects & VK_IMAGE_ASPECT_STENCIL_BIT) {
++reg_count;
@@ -1400,8 +1399,7 @@ radv_set_color_clear_regs(struct radv_cmd_buffer 
*cmd_buffer,
uint64_t va = radv_buffer_get_va(image->bo);
va += image->offset + image->clear_value_offset;
 
-   if (!image->cmask.size && !image->surface.dcc_size)
-   return;
+   assert(image->cmask.size || image->surface.dcc_size);
 
radeon_emit(cmd_buffer->cs, PKT3(PKT3_WRITE_DATA, 4, 0));
radeon_emit(cmd_buffer->cs, S_370_DST_SEL(V_370_MEM_ASYNC) |
-- 
2.15.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] radv: only re-mit the index type when it changes

2017-12-06 Thread Samuel Pitoiset

dota2 binds a ton of index buffers but the type is always 16-bit.
Note that we have to invalidate the type when switching from
indexed draws to normal draws.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_cmd_buffer.c | 33 +++--
 src/amd/vulkan/radv_private.h|  1 +
 2 files changed, 24 insertions(+), 10 deletions(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index 8821fcacef..95c2915c97 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -1517,21 +1517,26 @@ static void
 radv_emit_index_buffer(struct radv_cmd_buffer *cmd_buffer)
 {
struct radeon_winsys_cs *cs = cmd_buffer->cs;
+   struct radv_cmd_state *state = _buffer->state;
 
-   if (cmd_buffer->device->physical_device->rad_info.chip_class >= GFX9) {
-   radeon_set_uconfig_reg_idx(cs, R_03090C_VGT_INDEX_TYPE,
-  2, cmd_buffer->state.index_type);
-   } else {
-   radeon_emit(cs, PKT3(PKT3_INDEX_TYPE, 0, 0));
-   radeon_emit(cs, cmd_buffer->state.index_type);
+   if (state->index_type != state->last_index_type) {
+   if (cmd_buffer->device->physical_device->rad_info.chip_class >= 
GFX9) {
+   radeon_set_uconfig_reg_idx(cs, R_03090C_VGT_INDEX_TYPE,
+  2, state->index_type);
+   } else {
+   radeon_emit(cs, PKT3(PKT3_INDEX_TYPE, 0, 0));
+   radeon_emit(cs, state->index_type);
+   }
+
+   state->last_index_type = state->index_type;
}
 
radeon_emit(cs, PKT3(PKT3_INDEX_BASE, 1, 0));
-   radeon_emit(cs, cmd_buffer->state.index_va);
-   radeon_emit(cs, cmd_buffer->state.index_va >> 32);
+   radeon_emit(cs, state->index_va);
+   radeon_emit(cs, state->index_va >> 32);
 
radeon_emit(cs, PKT3(PKT3_INDEX_BUFFER_SIZE, 0, 0));
-   radeon_emit(cs, cmd_buffer->state.max_index_count);
+   radeon_emit(cs, state->max_index_count);
 
cmd_buffer->state.dirty &= ~RADV_CMD_DIRTY_INDEX_BUFFER;
 }
@@ -2243,6 +2248,7 @@ VkResult radv_BeginCommandBuffer(
 
memset(_buffer->state, 0, sizeof(cmd_buffer->state));
cmd_buffer->state.last_primitive_reset_en = -1;
+   cmd_buffer->state.last_index_type = -1;
cmd_buffer->usage_flags = pBeginInfo->flags;
 
/* setup initial configuration into command buffer */
@@ -2860,6 +2866,11 @@ void radv_CmdExecuteCommands(
primary->state.last_ia_multi_vgt_param =
secondary->state.last_ia_multi_vgt_param;
}
+
+   if (secondary->state.last_index_type != -1) {
+   primary->state.last_index_type =
+   secondary->state.last_index_type;
+   }
}
 
/* After executing commands from secondary buffers we have to dirty
@@ -3244,8 +3255,10 @@ radv_emit_all_graphics_states(struct radv_cmd_buffer 
*cmd_buffer,
 * so the state must be re-emitted before the next indexed
 * draw.
 */
-   if (cmd_buffer->device->physical_device->rad_info.chip_class >= 
CIK)
+   if (cmd_buffer->device->physical_device->rad_info.chip_class >= 
CIK) {
+   cmd_buffer->state.last_index_type = -1;
cmd_buffer->state.dirty |= RADV_CMD_DIRTY_INDEX_BUFFER;
+   }
}
 
radv_cmd_buffer_flush_dynamic_state(cmd_buffer);
diff --git a/src/amd/vulkan/radv_private.h b/src/amd/vulkan/radv_private.h
index 1e7b28a1d5..4551208766 100644
--- a/src/amd/vulkan/radv_private.h
+++ b/src/amd/vulkan/radv_private.h
@@ -830,6 +830,7 @@ struct radv_cmd_state {
uint32_t index_type;
uint32_t max_index_count;
uint64_t index_va;
+   int32_t  last_index_type;
 
int32_t  last_primitive_reset_en;
uint32_t last_primitive_reset_index;
-- 
2.15.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 2/2] radv: remove useless check radv_set_dcc_need_cmask_elim_pred()

2017-12-06 Thread Samuel Pitoiset

emit_fast_color_clear() already checks that.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_cmd_buffer.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index 621f0bad0b..68371dbbe7 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -1377,8 +1377,7 @@ radv_set_dcc_need_cmask_elim_pred(struct radv_cmd_buffer 
*cmd_buffer,
uint64_t va = radv_buffer_get_va(image->bo);
va += image->offset + image->dcc_pred_offset;
 
-   if (!image->surface.dcc_size)
-   return;
+   assert(image->surface.dcc_size);
 
radeon_emit(cmd_buffer->cs, PKT3(PKT3_WRITE_DATA, 4, 0));
radeon_emit(cmd_buffer->cs, S_370_DST_SEL(V_370_MEM_ASYNC) |
-- 
2.15.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH mesa] meson: fix keyword argument in declare_dependency()

2017-12-06 Thread Dylan Baker

oops! Okay, I'm thinking I'm going to write a linter for meson to help catch
these kind of mistakes because this isn't the first time that I've failed to
catch these kind of errors in code that I don't build locally.

Reviewed-by: Dylan Baker 

Quoting Eric Engestrom (2017-12-06 05:31:00)
> `declare_dependency()` takes `compile_args`, not `c_args`.
> It was correct in all the other `declare_dependency()` from that commit.
> 
> Fixes: 0bbecc5a8548883f76a71 "meson: define driver dependencies"
> Cc: Dylan Baker 
> Signed-off-by: Eric Engestrom 
> ---
>  src/gallium/winsys/imx/drm/meson.build   | 2 +-
>  src/gallium/winsys/pl111/drm/meson.build | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/src/gallium/winsys/imx/drm/meson.build 
> b/src/gallium/winsys/imx/drm/meson.build
> index a4af4688694d6ed9c300..4efd7bb0660ca70079a5 100644
> --- a/src/gallium/winsys/imx/drm/meson.build
> +++ b/src/gallium/winsys/imx/drm/meson.build
> @@ -28,6 +28,6 @@ libimxdrm = static_library(
>  )
>  
>  driver_imx = declare_dependency(
> -  c_args : '-DGALLIUM_IMX',
> +  compile_args : '-DGALLIUM_IMX',
>link_with : libimxdrm,
>  )
> diff --git a/src/gallium/winsys/pl111/drm/meson.build 
> b/src/gallium/winsys/pl111/drm/meson.build
> index 84c26f57e15f97a21ca8..9cb6faf31e21e57bd6ab 100644
> --- a/src/gallium/winsys/pl111/drm/meson.build
> +++ b/src/gallium/winsys/pl111/drm/meson.build
> @@ -31,6 +31,6 @@ libpl111winsys = static_library(
>  )
>  
>  driver_pl111 = declare_dependency(
> -  c_args : '-DGALLIUM_PL111',
> +  compile_args : '-DGALLIUM_PL111',
>link_with : libpl111winsys,
>  )
> -- 
> Cheers,
>   Eric
> 


signature.asc
Description: signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 02/29] anv/blorp: Rework image clear/resolve helpers

2017-12-06 Thread Nanley Chery

On Tue, Dec 05, 2017 at 03:48:45PM -0800, Nanley Chery wrote:
> On Mon, Nov 27, 2017 at 07:05:52PM -0800, Jason Ekstrand wrote:
> > This replaces image_fast_clear and ccs_resolve with two new helpers that
> > simply perform an isl_aux_op whatever that may be on CCS or MCS.  This
> > is a bit cleaner as it separates performing the aux operation from which
> > blorp helper we have to call to do it.
> > ---
> >  src/intel/vulkan/anv_blorp.c   | 218 
> > ++---
> >  src/intel/vulkan/anv_private.h |  23 ++--
> >  src/intel/vulkan/genX_cmd_buffer.c |  28 +++--
> >  3 files changed, 165 insertions(+), 104 deletions(-)
> > 
> > diff --git a/src/intel/vulkan/anv_blorp.c b/src/intel/vulkan/anv_blorp.c
> > index e244468..7c8a673 100644
> > --- a/src/intel/vulkan/anv_blorp.c
> > +++ b/src/intel/vulkan/anv_blorp.c
> > @@ -1439,75 +1439,6 @@ fast_clear_aux_usage(const struct anv_image *image,
> >  }
> >  
> >  void
> > -anv_image_fast_clear(struct anv_cmd_buffer *cmd_buffer,
> > - const struct anv_image *image,
> > - VkImageAspectFlagBits aspect,
> > - const uint32_t base_level, const uint32_t level_count,
> > - const uint32_t base_layer, uint32_t layer_count)
> > -{
> > -   assert(image->type == VK_IMAGE_TYPE_3D || image->extent.depth == 1);
> > -
> > -   if (image->type == VK_IMAGE_TYPE_3D) {
> > -  assert(base_layer == 0);
> > -  assert(layer_count == anv_minify(image->extent.depth, base_level));
> > -   }
> > -
> > -   struct blorp_batch batch;
> > -   blorp_batch_init(_buffer->device->blorp, , cmd_buffer, 0);
> > -
> > -   struct blorp_surf surf;
> > -   get_blorp_surf_for_anv_image(cmd_buffer->device, image, aspect,
> > -fast_clear_aux_usage(image, aspect),
> > -);
> > -
> > -   /* From the Sky Lake PRM Vol. 7, "Render Target Fast Clear":
> > -*
> > -*"After Render target fast clear, pipe-control with color cache
> > -*write-flush must be issued before sending any DRAW commands on
> > -*that render target."
> > -*
> > -* This comment is a bit cryptic and doesn't really tell you what's 
> > going
> > -* or what's really needed.  It appears that fast clear ops are not
> > -* properly synchronized with other drawing.  This means that we cannot
> > -* have a fast clear operation in the pipe at the same time as other
> > -* regular drawing operations.  We need to use a PIPE_CONTROL to ensure
> > -* that the contents of the previous draw hit the render target before 
> > we
> > -* resolve and then use a second PIPE_CONTROL after the resolve to 
> > ensure
> > -* that it is completed before any additional drawing occurs.
> > -*/
> > -   cmd_buffer->state.pending_pipe_bits |=
> > -  ANV_PIPE_RENDER_TARGET_CACHE_FLUSH_BIT | ANV_PIPE_CS_STALL_BIT;
> > -
> > -   uint32_t plane = anv_image_aspect_to_plane(image->aspects, aspect);
> > -   uint32_t width_div = image->format->planes[plane].denominator_scales[0];
> > -   uint32_t height_div = 
> > image->format->planes[plane].denominator_scales[1];
> > -
> > -   for (uint32_t l = 0; l < level_count; l++) {
> > -  const uint32_t level = base_level + l;
> > -
> > -  const VkExtent3D extent = {
> > - .width = anv_minify(image->extent.width, level),
> > - .height = anv_minify(image->extent.height, level),
> > - .depth = anv_minify(image->extent.depth, level),
> > -  };
> > -
> > -  if (image->type == VK_IMAGE_TYPE_3D)
> > - layer_count = extent.depth;
> > -
> > -  assert(level < anv_image_aux_levels(image, aspect));
> > -  assert(base_layer + layer_count <= anv_image_aux_layers(image, 
> > aspect, level));
> > -  blorp_fast_clear(, , surf.surf->format,
> > -   level, base_layer, layer_count,
> > -   0, 0,
> > -   extent.width / width_div,
> > -   extent.height / height_div);
> > -   }
> > -
> > -   cmd_buffer->state.pending_pipe_bits |=
> > -  ANV_PIPE_RENDER_TARGET_CACHE_FLUSH_BIT | ANV_PIPE_CS_STALL_BIT;
> > -}
> > -
> > -void
> >  anv_cmd_buffer_resolve_subpass(struct anv_cmd_buffer *cmd_buffer)
> >  {
> > struct anv_framebuffer *fb = cmd_buffer->state.framebuffer;
> > @@ -1681,36 +1612,153 @@ anv_gen8_hiz_op_resolve(struct anv_cmd_buffer 
> > *cmd_buffer,
> >  }
> >  
> >  void
> > -anv_ccs_resolve(struct anv_cmd_buffer * const cmd_buffer,
> > -const struct anv_image * const image,
> > -VkImageAspectFlagBits aspect,
> > -const uint8_t level,
> > -const uint32_t start_layer, const uint32_t layer_count,
> > -const enum blorp_fast_clear_op op)
> > +anv_image_mcs_op(struct anv_cmd_buffer *cmd_buffer,
> > + const struct anv_image *image,
> > + VkImageAspectFlagBits

Re: [Mesa-dev] [Mesa-stable] [PATCH] glx: GLX_MESA_multithread_makecurrent is direct-only

2017-12-06 Thread Ian Romanick

On 12/06/2017 10:32 AM, Emil Velikov wrote:
> On 5 December 2017 at 16:10, Adam Jackson  wrote:
>> This extension is not defined for indirect contexts. Marking it as
>> "client only", as the old code did here, would make the extension
>> available in indirect contexts, even though the server would certainly
>> not have it in its extension list.
>>
>> Cc: 
>> Signed-off-by: Adam Jackson 
> Reviewed-by: Emil Velikov 
> 
> Unrelated: reportedly only cairo is using the extension, so could we
> consider the extension obsolete?

It's not too surprising that only Cairo is using it.  IIRC, Eric
specifically made this extension for Cairo, and it was a pretty big perf
win at the time.

I had wanted to test this patch, but... does LIBGL_ALWAYS_INDIRECT=y
just not work any more?  With the distro Mesa I get:

name of display: :0
X Error of failed request:  GLXBadContext
  Major opcode of failed request:  151 (GLX)
  Minor opcode of failed request:  6 (X_GLXIsDirect)
  Serial number of failed request:  37
  Current serial number in output stream:  36

And with fairly recent master I get direct rendering.

> Alternatively how about making it an official Khronos extension ;-)
> 
> -Emil
> ___
> mesa-stable mailing list
> mesa-sta...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-stable
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] testing for certain compiler options does not work

2017-12-06 Thread Ian Romanick

On 11/29/2017 08:16 AM, Marc Dietrich wrote:
> Hi,
> 
> just found that my gcc 'gcc (SUSE Linux) 7.2.1 20171020 [gcc-7-branch 
> revision 
> 253932]' does not warn when using the negative form of unsupported warning 
> options. So all the configure tests for such options erroneously succeed, e.g:
> 
> gcc -Wno-bob test.c # works fine
> 
> gcc -Walice test.c # warns for unsupported compiler option
> 
> Maybe gcc bug or intended behaviour, maybe only unique to my version?

At least 6.4.1 does not have this problem, but our checks must be broken
in a different way.  I see piles of

cc1: warning: unrecognized command line option ‘-Wno-initializer-overrides’

This probably explains why I see piles of

cc1: warning: unrecognized command line option ‘-Wno-initializer-overrides’

in my builds.

> Marc
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev



signature.asc
Description: OpenPGP digital signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] egl: remove unneeded _eglGetNativePlatform check

2017-12-06 Thread Ian Romanick

Do we have any tests at all that exercise this path?  I don't know this
code well enough to feel comfortable reviewing this (says everyone). :(

On 11/13/2017 09:04 AM, Emil Velikov wrote:
> From: Emil Velikov 
> 
> There's little point in calling _eglGetNativePlatform() in
> eglCopyBuffers. The platform return is identical to the one already
> stored in our _EGLDisplay.
> 
> Modulo subtle memory corruption of course. But in these cases returning
> EGL_BAD_NATIVE_PIXMAP doesn't sound right.
> 
> Signed-off-by: Emil Velikov 
> ---
>  src/egl/main/eglapi.c | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/src/egl/main/eglapi.c b/src/egl/main/eglapi.c
> index c1bf5bbfe19..4aa93db829b 100644
> --- a/src/egl/main/eglapi.c
> +++ b/src/egl/main/eglapi.c
> @@ -1396,8 +1396,6 @@ eglCopyBuffers(EGLDisplay dpy, EGLSurface surface, 
> EGLNativePixmapType target)
> native_pixmap_ptr = (void*) target;
>  
> _EGL_CHECK_SURFACE(disp, surf, EGL_FALSE, drv);
> -   if (disp->Platform != _eglGetNativePlatform(disp->PlatformDisplay))
> -  RETURN_EGL_ERROR(disp, EGL_BAD_NATIVE_PIXMAP, EGL_FALSE);
> ret = drv->API.CopyBuffers(drv, disp, surf, native_pixmap_ptr);
>  
> RETURN_EGL_EVAL(disp, ret);
> 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] mesa: define nir_spirv_supported_capabilities

2017-12-06 Thread Ian Romanick

On 12/06/2017 07:29 AM, Pierre Moreau wrote:
> Hello Alejandro,
> 
> As far as I understand, nir_spirv_supported_capabilities is being filled in by
> the driver and then fetched by the API entrypoint to check the capabilities
> required by the SPIR-V binary given as input. And this is done regardless of
> the input IR used by the driver, be it NIR, LLVM IR, TGSI or others. So
> couldn’t it be just named spirv_supported_capabilities? Unless it also 
> reflects
> the capabilities supported by the IR being used.
> 
> I guess nir_spirv_supported_capabilities could be extended later on to also 
> add
> capabilities specific to OpenCL when clover reaches OpenCL 1.2 support (and 
> can
> start accepting SPIR-V binaries as input through the cl_khr_il_program
> extension), or would it be better to have a separate one for OpenCL?

I expect that over time there will be overlap between SPIR-V
functionality exposed in OpenCL, Vulkan, and OpenGL extensions.  There
already are some cases of this.  Given that, I think having a single
master list of supported capabilities makes sense.

> I haven’t had time to look at the whole gl_spirv series yet, so I am sorry if
> this is something that has already been brought and answered in that thread.
> 
> Regards,
> Pierre
> 
> On 2017-12-06 — 09:57, Alejandro Piñeiro wrote:
>> Until now it was part of spirv_to_nir_options. But it will be used on
>> the implementation of ARB_gl_spirv and ARB_spirv_extensions, and added
>> to the OpenGL context, as a way to save what SPIR-V capabilities the
>> current OpenGL implementation supports.
>> ---
>>
>> We are sending this commit in advance of a v3 of the initial gl_spirv
>> and spirv_extensions support series. The issue is that lately there
>> were a lot of activity on the spirv/spir_to_nir code base, and we are
>> being fixing rebase conflicts constantly. Getting this commit on
>> master would make things easier.
>>
>> FWIW, this patch is similar to one that Ian Romanick already granted
>> Rb, but that was dropped after all the mentioned changes:
>> https://lists.freedesktop.org/archives/mesa-dev/2017-November/178261.html
>>
>>  src/compiler/spirv/nir_spirv.h | 16 +++-
>>  src/mesa/main/mtypes.h | 12 
>>  2 files changed, 15 insertions(+), 13 deletions(-)
>>
>> diff --git a/src/compiler/spirv/nir_spirv.h b/src/compiler/spirv/nir_spirv.h
>> index 43ec19d5a50..113bd710a00 100644
>> --- a/src/compiler/spirv/nir_spirv.h
>> +++ b/src/compiler/spirv/nir_spirv.h
>> @@ -28,7 +28,8 @@
>>  #ifndef _NIR_SPIRV_H_
>>  #define _NIR_SPIRV_H_
>>  
>> -#include "nir/nir.h"
>> +#include "compiler/nir/nir.h"
>> +#include "main/mtypes.h"
>>  
>>  #ifdef __cplusplus
>>  extern "C" {
>> @@ -57,18 +58,7 @@ struct spirv_to_nir_options {
>>  */
>> bool lower_workgroup_access_to_offsets;
>>  
>> -   struct {
>> -  bool float64;
>> -  bool image_ms_array;
>> -  bool tessellation;
>> -  bool draw_parameters;
>> -  bool image_read_without_format;
>> -  bool image_write_without_format;
>> -  bool int64;
>> -  bool multiview;
>> -  bool variable_pointers;
>> -  bool storage_16bit;
>> -   } caps;
>> +   struct nir_spirv_supported_capabilities caps;
>>  
>> struct {
>>void (*func)(void *private_data,
>> diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
>> index b478f6158e2..7da05aa3ee9 100644
>> --- a/src/mesa/main/mtypes.h
>> +++ b/src/mesa/main/mtypes.h
>> @@ -3579,6 +3579,18 @@ struct gl_program_constants
>> GLuint MaxShaderStorageBlocks;
>>  };
>>  
>> +struct nir_spirv_supported_capabilities {
>> +   bool float64;
>> +   bool image_ms_array;
>> +   bool tessellation;
>> +   bool draw_parameters;
>> +   bool image_read_without_format;
>> +   bool image_write_without_format;
>> +   bool int64;
>> +   bool multiview;
>> +   bool variable_pointers;
>> +   bool storage_16bit;
>> +};
>>  
>>  /**
>>   * Constants which may be overridden by device driver during context 
>> creation
>> -- 
>> 2.11.0
>>
>> ___
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>
>>
>> ___
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev




signature.asc
Description: OpenPGP digital signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] mesa: define nir_spirv_supported_capabilities

2017-12-06 Thread Ian Romanick

Reviewed-by: Ian Romanick 

On 12/06/2017 03:57 AM, Alejandro Piñeiro wrote:
> Until now it was part of spirv_to_nir_options. But it will be used on
> the implementation of ARB_gl_spirv and ARB_spirv_extensions, and added
> to the OpenGL context, as a way to save what SPIR-V capabilities the
> current OpenGL implementation supports.
> ---
> 
> We are sending this commit in advance of a v3 of the initial gl_spirv
> and spirv_extensions support series. The issue is that lately there
> were a lot of activity on the spirv/spir_to_nir code base, and we are
> being fixing rebase conflicts constantly. Getting this commit on
> master would make things easier.
> 
> FWIW, this patch is similar to one that Ian Romanick already granted
> Rb, but that was dropped after all the mentioned changes:
> https://lists.freedesktop.org/archives/mesa-dev/2017-November/178261.html
> 
>  src/compiler/spirv/nir_spirv.h | 16 +++-
>  src/mesa/main/mtypes.h | 12 
>  2 files changed, 15 insertions(+), 13 deletions(-)
> 
> diff --git a/src/compiler/spirv/nir_spirv.h b/src/compiler/spirv/nir_spirv.h
> index 43ec19d5a50..113bd710a00 100644
> --- a/src/compiler/spirv/nir_spirv.h
> +++ b/src/compiler/spirv/nir_spirv.h
> @@ -28,7 +28,8 @@
>  #ifndef _NIR_SPIRV_H_
>  #define _NIR_SPIRV_H_
>  
> -#include "nir/nir.h"
> +#include "compiler/nir/nir.h"
> +#include "main/mtypes.h"
>  
>  #ifdef __cplusplus
>  extern "C" {
> @@ -57,18 +58,7 @@ struct spirv_to_nir_options {
>  */
> bool lower_workgroup_access_to_offsets;
>  
> -   struct {
> -  bool float64;
> -  bool image_ms_array;
> -  bool tessellation;
> -  bool draw_parameters;
> -  bool image_read_without_format;
> -  bool image_write_without_format;
> -  bool int64;
> -  bool multiview;
> -  bool variable_pointers;
> -  bool storage_16bit;
> -   } caps;
> +   struct nir_spirv_supported_capabilities caps;
>  
> struct {
>void (*func)(void *private_data,
> diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
> index b478f6158e2..7da05aa3ee9 100644
> --- a/src/mesa/main/mtypes.h
> +++ b/src/mesa/main/mtypes.h
> @@ -3579,6 +3579,18 @@ struct gl_program_constants
> GLuint MaxShaderStorageBlocks;
>  };
>  
> +struct nir_spirv_supported_capabilities {
> +   bool float64;
> +   bool image_ms_array;
> +   bool tessellation;
> +   bool draw_parameters;
> +   bool image_read_without_format;
> +   bool image_write_without_format;
> +   bool int64;
> +   bool multiview;
> +   bool variable_pointers;
> +   bool storage_16bit;
> +};
>  
>  /**
>   * Constants which may be overridden by device driver during context creation
> 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] drirc: add option to disable ARB_draw_indirect

2017-12-06 Thread Ian Romanick

On 12/05/2017 08:25 AM, Ilia Mirkin wrote:
> On Tue, Dec 5, 2017 at 8:18 AM, Emil Velikov  wrote:
>> Hi Rob,
>>
>> On 5 December 2017 at 12:54, Rob Clark  wrote:
>>> This is a bit sad/annoying.  But with current GPU firmware (at least on
>>> a5xx) we can support both draw-indirect and base-instance.  But we can't
>>> support draw-indirect with a non-zero base-instance specified.  So add a
>>> driconf option to hide the extension from games that are known to use
>>> both.
>>>
>>> Signed-off-by: Rob Clark 
>>> ---
>>> Tbh, I'm also not really sure what to do when/if we got updated firmware
>>> which handled draw-indirect with base-instance, since we'd need to make
>>> this option conditional on fw version.  For STK that probably isn't a
>>> big deal since it doesn't use draw-indirect in a particularly useful way
>>> (the indirect buffer is generated on CPU).
>>>
>> Couldn't freedreno just return 0 for PIPE_CAP_DRAW_INDIRECT (aka
>> disable the extension) as it detects buggy FW?
>> This is what radeons have been doing as they encounter iffy firmware or LLVM.
>>
>> AFAICT freedreno doesn't do GL 4.0 or GLES 3.1 so one should be safe.
> 
> Rob is this -><- close to ES 3.1, so that's not a great option.

And I don't suppose there's a way to get updated firmware?  i965 has
similar sorts of cases where higher versions are disabled due to missing
kernel features.

> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 3/2] intel/fs: Teach instruction scheduler about GRF bank conflict cycles.

2017-12-06 Thread Francisco Jerez

This should allow the post-RA scheduler to do a slightly better job at
hiding latency in presence of instructions incurring bank conflicts.
The main purpuse of this patch is not to improve performance though,
but to get conflict cycles to show up in shader-db statistics in order
to make sure that regressions in the bank conflict mitigation pass
don't go unnoticed.
---
 src/intel/compiler/brw_fs.h  |  1 +
 src/intel/compiler/brw_fs_bank_conflicts.cpp | 19 +++
 src/intel/compiler/brw_schedule_instructions.cpp |  5 +++--
 3 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/src/intel/compiler/brw_fs.h b/src/intel/compiler/brw_fs.h
index 6f7351ee506..62b7631a4ac 100644
--- a/src/intel/compiler/brw_fs.h
+++ b/src/intel/compiler/brw_fs.h
@@ -146,6 +146,7 @@ public:
bool opt_drop_redundant_mov_to_flags();
bool opt_register_renaming();
bool opt_bank_conflicts();
+   unsigned bank_conflict_cycles(const fs_inst *inst);
bool register_coalesce();
bool compute_to_mrf();
bool eliminate_find_live_channel();
diff --git a/src/intel/compiler/brw_fs_bank_conflicts.cpp 
b/src/intel/compiler/brw_fs_bank_conflicts.cpp
index dc88cac744b..6912331f5b0 100644
--- a/src/intel/compiler/brw_fs_bank_conflicts.cpp
+++ b/src/intel/compiler/brw_fs_bank_conflicts.cpp
@@ -891,3 +891,22 @@ fs_visitor::opt_bank_conflicts()
delete[] constrained;
return true;
 }
+
+/**
+ * Estimate the number of GRF bank conflict cycles incurred by an instruction.
+ *
+ * Note that this neglects conflict cycles prior to register allocation
+ * because we don't know which bank each VGRF is going to end up aligned to.
+ */
+unsigned
+fs_visitor::bank_conflict_cycles(const fs_inst *inst)
+{
+   if (grf_used && inst->is_3src(devinfo) &&
+   is_grf(inst->src[1]) && is_grf(inst->src[2]) &&
+   bank_of(reg_of(inst->src[1])) == bank_of(reg_of(inst->src[2])) &&
+   !is_conflict_optimized_out(devinfo, inst)) {
+  return DIV_ROUND_UP(inst->dst.component_size(inst->exec_size), REG_SIZE);
+   } else {
+  return 0;
+   }
+}
diff --git a/src/intel/compiler/brw_schedule_instructions.cpp 
b/src/intel/compiler/brw_schedule_instructions.cpp
index a1e825c661c..692f7125323 100644
--- a/src/intel/compiler/brw_schedule_instructions.cpp
+++ b/src/intel/compiler/brw_schedule_instructions.cpp
@@ -1543,10 +1543,11 @@ 
vec4_instruction_scheduler::choose_instruction_to_schedule()
 int
 fs_instruction_scheduler::issue_time(backend_instruction *inst)
 {
+   const unsigned overhead = v->bank_conflict_cycles((fs_inst *)inst);
if (is_compressed((fs_inst *)inst))
-  return 4;
+  return 4 + overhead;
else
-  return 2;
+  return 2 + overhead;
 }
 
 int
-- 
2.14.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/2] intel/fs: Implement GRF bank conflict mitigation pass.

2017-12-06 Thread Francisco Jerez

This series (which is ready for production and improves the cycle count
of over 46k shaders) has been sitting here for nearly half a year.  I'm
planning to self-review it and land it (along with PATCH 3/2 I just sent
to make sure we keep regressions under control) if nobody else does in
the next two weeks.

Francisco Jerez  writes:

> Unnecessary GRF bank conflicts increase the issue time of ternary
> instructions (the overwhelmingly most common of which is MAD) by
> roughly 50%, leading to reduced ALU throughput.  This pass attempts to
> minimize the number of bank conflicts by rearranging the layout of the
> GRF space post-register allocation.  It's in general not possible to
> eliminate all of them without introducing extra copies, which are
> typically more expensive than the bank conflict itself.
>
> In a shader-db run on SKL this helps roughly 46k shaders:
>
>total conflicts in shared programs: 1008981 -> 600461 (-40.49%)
>conflicts in affected programs: 816222 -> 407702 (-50.05%)
>helped: 46234
>HURT: 72
>
> The running time of shader-db itself on SKL seems to be increased by
> roughly 2.52%±1.13% with n=20 due to the additional work done by the
> compiler back-end.
>
> On earlier generations the pass is somewhat less effective in relative
> terms because the hardware incurs a bank conflict anytime the last two
> sources of the instruction are duplicate (e.g. while trying to square
> a value using MAD), which is impossible to avoid without introducing
> copies.  E.g. for a shader-db run on SNB:
>
>total conflicts in shared programs: 944636 -> 623185 (-34.03%)
>conflicts in affected programs: 853258 -> 531807 (-37.67%)
>helped: 31052
>HURT: 19
>
> And on BDW:
>
>total conflicts in shared programs: 1418393 -> 987539 (-30.38%)
>conflicts in affected programs: 1179787 -> 748933 (-36.52%)
>helped: 47592
>HURT: 70
>
> On SKL GT4e this improves performance of GpuTest Volplosion by 3.64%
> ±0.33% with n=16.
>
> NOTE: This patch intentionally disregards some i965 coding conventions
>   for the sake of reviewability.  This is addressed by the next
>   squash patch which introduces an amount of (for the most part
>   boring) boilerplate that might distract reviewers from the
>   non-trivial algorithmic details of the pass.
> ---
>  src/intel/Makefile.sources   |   1 +
>  src/intel/compiler/brw_fs.cpp|   2 +
>  src/intel/compiler/brw_fs.h  |   1 +
>  src/intel/compiler/brw_fs_bank_conflicts.cpp | 791 
> +++
>  4 files changed, 795 insertions(+)
>  create mode 100644 src/intel/compiler/brw_fs_bank_conflicts.cpp
>
> diff --git a/src/intel/Makefile.sources b/src/intel/Makefile.sources
> index a877ff2..1b9799a 100644
> --- a/src/intel/Makefile.sources
> +++ b/src/intel/Makefile.sources
> @@ -44,6 +44,7 @@ COMPILER_FILES = \
>   compiler/brw_eu_util.c \
>   compiler/brw_eu_validate.c \
>   compiler/brw_fs_builder.h \
> +compiler/brw_fs_bank_conflicts.cpp \
>   compiler/brw_fs_cmod_propagation.cpp \
>   compiler/brw_fs_combine_constants.cpp \
>   compiler/brw_fs_copy_propagation.cpp \
> diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp
> index 43b6e34..0a85c0c 100644
> --- a/src/intel/compiler/brw_fs.cpp
> +++ b/src/intel/compiler/brw_fs.cpp
> @@ -5858,6 +5858,8 @@ fs_visitor::allocate_registers(bool allow_spilling)
> if (failed)
>return;
>  
> +   opt_bank_conflicts();
> +
> schedule_instructions(SCHEDULE_POST);
>  
> if (last_scratch > 0) {
> diff --git a/src/intel/compiler/brw_fs.h b/src/intel/compiler/brw_fs.h
> index 6c8c027..b1fc7b3 100644
> --- a/src/intel/compiler/brw_fs.h
> +++ b/src/intel/compiler/brw_fs.h
> @@ -141,6 +141,7 @@ public:
> exec_list *acp);
> bool opt_drop_redundant_mov_to_flags();
> bool opt_register_renaming();
> +   bool opt_bank_conflicts();
> bool register_coalesce();
> bool compute_to_mrf();
> bool eliminate_find_live_channel();
> diff --git a/src/intel/compiler/brw_fs_bank_conflicts.cpp 
> b/src/intel/compiler/brw_fs_bank_conflicts.cpp
> new file mode 100644
> index 000..0225c70
> --- /dev/null
> +++ b/src/intel/compiler/brw_fs_bank_conflicts.cpp
> @@ -0,0 +1,791 @@
> +/*
> + * Copyright © 2017 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the next
> + * paragraph) shall be

Re: [Mesa-dev] [PATCH 3/3] glx: Implement GLX_EXT_no_config_context (v4)

2017-12-06 Thread Kyle Brenneman


On 12/05/2017 02:22 PM, Adam Jackson wrote:

This more or less ports EGL_KHR_no_config_context to GLX.

v2: Enable the extension only for those backends that support it.
v3: Fix glvnd path and dri2_convert_glx_attribs()
v4: Screeching signedness correctness, and disable a now
 inappropriate test.

diff --git a/src/glx/g_glxglvnddispatchfuncs.c 
b/src/glx/g_glxglvnddispatchfuncs.c
index 56d894eda7..04f6d8263a 100644
--- a/src/glx/g_glxglvnddispatchfuncs.c
+++ b/src/glx/g_glxglvnddispatchfuncs.c
@@ -164,7 +164,19 @@ static GLXContext dispatch_CreateContextAttribsARB(Display 
*dpy,
  __GLXvendorInfo *dd;
  GLXContext ret;
  
-dd = GetDispatchFromFBConfig(dpy, config);

+if (config) {
+   dd = GetDispatchFromFBConfig(dpy, config);
+} else if (attrib_list) {
+   int i, screen;
+
+   for (i = 0; attrib_list[i * 2] != None; i++) {
+  if (attrib_list[i * 2] == GLX_SCREEN) {
+ screen = attrib_list[i * 2 + 1];
+ dd = GetDispatchFromDrawable(dpy, RootWindow(dpy, screen));
+ break;
+  }
+   }
+}
  if (dd == NULL)
  return None;
  



That should just look up the vendor by screen number, not based on the 
root window. Calling GetDispatchFromDrawable instead of 
(__VND->getDynDispatch(dpy, screen)) requires an extra round trip to the 
server.

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 2/2] spirv: fix bug when OpSpecConstantOp calls a conversion

2017-12-06 Thread Ian Romanick

On 11/29/2017 02:17 AM, Samuel Iglesias Gonsálvez wrote:
> On Tue, 2017-11-28 at 13:13 -0800, Ian Romanick wrote:
>> On 11/20/2017 10:25 PM, Samuel Iglesias Gonsálvez wrote:
>>> In that case, nir_eval_const_opcode() will evaluate the conversion
>>> but as it was using destination's bit_size, the resulting
>>> value was just a cast of the source constant value. By passing the
>>> source's bit size, it does the conversion properly.
>>>
>>> Fixes:
>>>
>>> dEQP-VK.spirv_assembly.instruction.*.opspecconstantop.*convert*
>>>
>>> Signed-off-by: Samuel Iglesias Gonsálvez 
>>> ---
>>>  src/compiler/spirv/spirv_to_nir.c | 31 ++-
>>> 
>>>  1 file changed, 26 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/src/compiler/spirv/spirv_to_nir.c
>>> b/src/compiler/spirv/spirv_to_nir.c
>>> index 2cc3c275ea9..ffbda4f1946 100644
>>> --- a/src/compiler/spirv/spirv_to_nir.c
>>> +++ b/src/compiler/spirv/spirv_to_nir.c
>>> @@ -1348,14 +1348,35 @@ vtn_handle_constant(struct vtn_builder *b,
>>> SpvOp opcode,
>>>   bool swap;
>>>   nir_alu_type dst_alu_type =
>>> nir_get_nir_type_for_glsl_type(val->const_type);
>>>   nir_alu_type src_alu_type = dst_alu_type;
>>> - nir_op op = vtn_nir_alu_op_for_spirv_opcode(opcode,
>>> , src_alu_type, dst_alu_type);
>>> -
>>>   unsigned num_components = glsl_get_vector_elements(val-
 const_type);
>>> - unsigned bit_size =
>>> -glsl_get_bit_size(val->const_type);
>>> + unsigned bit_size;
>>>  
>>> - nir_const_value src[4];
>>>   assert(count <= 7);
>>> +
>>> + switch (opcode) {
>>> + case SpvOpUConvert:
>>> + case SpvOpConvertFToU:
>>> + case SpvOpConvertFToS:
>>> + case SpvOpConvertSToF:
>>> + case SpvOpConvertUToF:
>>> + case SpvOpSConvert:
>>> + case SpvOpFConvert:
>>
>> I'm not sure what we should do here.  Most of these opcodes are not
>> valid in SpvOpSpecConstantOp in a Shader.  Only SpvOpSConvert and
>> SpvOpFConvert are allowed.  I guess it's trivial enough to support
>> the
>> others anyway...
> 
> Right.
> 
>>  do those dEQP-VK.spirv_assembly.instruction.* tests
>> exercise the other opcodes?
>>
> 
> No, they only exercise SpvOpFConvert and SpvOpSConvert. Do you prefer
> to just support these two opcodes for the moment?

Okay... good.  I wanted to make sure they weren't expecting out-of-spec
behavior. :)

> With or without that change, does it get your R-b?

Yes.

Reviewed-by: Ian Romanick 

> 
> Sam
> 
>>> +/* We have a source in a conversion */
>>> +src_alu_type =
>>> +   nir_get_nir_type_for_glsl_type(
>>> +  vtn_value(b, w[4], vtn_value_type_constant)-
 const_type);
>>> +/* We use the bitsize of the conversion source to
>>> evaluate the opcode later */
>>> +bit_size = glsl_get_bit_size(
>>> +   vtn_value(b, w[4], vtn_value_type_constant)-
 const_type);
>>> +break;
>>> + default:
>>> +bit_size = glsl_get_bit_size(val->const_type);
>>> + };
>>> +
>>> +
>>> + nir_op op = vtn_nir_alu_op_for_spirv_opcode(opcode,
>>> , src_alu_type, dst_alu_type);
>>> + nir_const_value src[4];
>>> +
>>>   for (unsigned i = 0; i < count - 4; i++) {
>>>  nir_constant *c =
>>> vtn_value(b, w[4 + i], vtn_value_type_constant)-
 constant;
>>>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [Mesa-stable] [PATCH] glx: GLX_MESA_multithread_makecurrent is direct-only

2017-12-06 Thread Adam Jackson

On Wed, 2017-12-06 at 15:01 -0500, Ian Romanick wrote:
> On 12/06/2017 10:32 AM, Emil Velikov wrote:
> > On 5 December 2017 at 16:10, Adam Jackson  wrote:
> > > This extension is not defined for indirect contexts. Marking it as
> > > "client only", as the old code did here, would make the extension
> > > available in indirect contexts, even though the server would certainly
> > > not have it in its extension list.
> > > 
> > > Cc: 
> > > Signed-off-by: Adam Jackson 
> > 
> > Reviewed-by: Emil Velikov 
> > 
> > Unrelated: reportedly only cairo is using the extension, so could we
> > consider the extension obsolete?
> 
> It's not too surprising that only Cairo is using it.  IIRC, Eric
> specifically made this extension for Cairo, and it was a pretty big perf
> win at the time.

I think at this point most of the effect could be achieved with no-
flush contexts, but yeah.

> I had wanted to test this patch, but... does LIBGL_ALWAYS_INDIRECT=y
> just not work any more?  With the distro Mesa I get:

It works, that's the server telling you it doesn't support indirect
rendering. We turned that off by default a few releases ago as being
slow and underfeatured and CVE-prone. Start your server with +iglx or
with this in xorg.conf:

Section "ServerFlags"
Option "IndirectGLX" "true"
EndSection

- ajax
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] glx: GLX_MESA_multithread_makecurrent is direct-only

2017-12-06 Thread Nicolai Hähnle


On 05.12.2017 17:10, Adam Jackson wrote:

This extension is not defined for indirect contexts. Marking it as
"client only", as the old code did here, would make the extension
available in indirect contexts, even though the server would certainly
not have it in its extension list.

Cc: 
Signed-off-by: Adam Jackson 


Makes sense.

Reviewed-by: Nicolai Hähnle 



---
  src/glx/glxextensions.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/glx/glxextensions.c b/src/glx/glxextensions.c
index af6ffbf660..638d8bcbbe 100644
--- a/src/glx/glxextensions.c
+++ b/src/glx/glxextensions.c
@@ -152,7 +152,7 @@ static const struct extension_info known_glx_extensions[] = 
{
 { GLX(ATI_pixel_format_float),  VER(0,0), N, N, N, N },
 { GLX(INTEL_swap_event),VER(0,0), Y, N, N, N },
 { GLX(MESA_copy_sub_buffer),VER(0,0), Y, N, N, N },
-   { GLX(MESA_multithread_makecurrent),VER(0,0), Y, N, Y, N },
+   { GLX(MESA_multithread_makecurrent),VER(0,0), Y, N, N, Y },
 { GLX(MESA_query_renderer), VER(0,0), Y, N, N, Y },
 { GLX(MESA_swap_control),   VER(0,0), Y, N, N, Y },
 { GLX(NV_float_buffer), VER(0,0), N, N, N, N },




--
Lerne, wie die Welt wirklich ist,
Aber vergiss niemals, wie sie sein sollte.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] radv: Add LLVM version to the device name string

2017-12-06 Thread Alex Smith

Allows apps to determine the LLVM version so that they can decide
whether or not to enable workarounds for LLVM issues.

Signed-off-by: Alex Smith 
Cc: "17.2 17.3" 
---
 src/amd/vulkan/radv_device.c  | 61 +--
 src/amd/vulkan/radv_private.h |  2 +-
 2 files changed, 37 insertions(+), 26 deletions(-)

diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index 1b7cd35593..2c3c84ee19 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -75,32 +75,43 @@ radv_get_device_uuid(struct radeon_info *info, void *uuid)
ac_compute_device_uuid(info, uuid, VK_UUID_SIZE);
 }
 
-static const char *
-get_chip_name(enum radeon_family family)
+static void
+radv_get_device_name(enum radeon_family family, char *name, size_t name_len)
 {
+   const char *chip_string;
+   char llvm_string[32] = {};
+
switch (family) {
-   case CHIP_TAHITI: return "AMD RADV TAHITI";
-   case CHIP_PITCAIRN: return "AMD RADV PITCAIRN";
-   case CHIP_VERDE: return "AMD RADV CAPE VERDE";
-   case CHIP_OLAND: return "AMD RADV OLAND";
-   case CHIP_HAINAN: return "AMD RADV HAINAN";
-   case CHIP_BONAIRE: return "AMD RADV BONAIRE";
-   case CHIP_KAVERI: return "AMD RADV KAVERI";
-   case CHIP_KABINI: return "AMD RADV KABINI";
-   case CHIP_HAWAII: return "AMD RADV HAWAII";
-   case CHIP_MULLINS: return "AMD RADV MULLINS";
-   case CHIP_TONGA: return "AMD RADV TONGA";
-   case CHIP_ICELAND: return "AMD RADV ICELAND";
-   case CHIP_CARRIZO: return "AMD RADV CARRIZO";
-   case CHIP_FIJI: return "AMD RADV FIJI";
-   case CHIP_POLARIS10: return "AMD RADV POLARIS10";
-   case CHIP_POLARIS11: return "AMD RADV POLARIS11";
-   case CHIP_POLARIS12: return "AMD RADV POLARIS12";
-   case CHIP_STONEY: return "AMD RADV STONEY";
-   case CHIP_VEGA10: return "AMD RADV VEGA";
-   case CHIP_RAVEN: return "AMD RADV RAVEN";
-   default: return "AMD RADV unknown";
-   }
+   case CHIP_TAHITI: chip_string = "AMD RADV TAHITI"; break;
+   case CHIP_PITCAIRN: chip_string = "AMD RADV PITCAIRN"; break;
+   case CHIP_VERDE: chip_string = "AMD RADV CAPE VERDE"; break;
+   case CHIP_OLAND: chip_string = "AMD RADV OLAND"; break;
+   case CHIP_HAINAN: chip_string = "AMD RADV HAINAN"; break;
+   case CHIP_BONAIRE: chip_string = "AMD RADV BONAIRE"; break;
+   case CHIP_KAVERI: chip_string = "AMD RADV KAVERI"; break;
+   case CHIP_KABINI: chip_string = "AMD RADV KABINI"; break;
+   case CHIP_HAWAII: chip_string = "AMD RADV HAWAII"; break;
+   case CHIP_MULLINS: chip_string = "AMD RADV MULLINS"; break;
+   case CHIP_TONGA: chip_string = "AMD RADV TONGA"; break;
+   case CHIP_ICELAND: chip_string = "AMD RADV ICELAND"; break;
+   case CHIP_CARRIZO: chip_string = "AMD RADV CARRIZO"; break;
+   case CHIP_FIJI: chip_string = "AMD RADV FIJI"; break;
+   case CHIP_POLARIS10: chip_string = "AMD RADV POLARIS10"; break;
+   case CHIP_POLARIS11: chip_string = "AMD RADV POLARIS11"; break;
+   case CHIP_POLARIS12: chip_string = "AMD RADV POLARIS12"; break;
+   case CHIP_STONEY: chip_string = "AMD RADV STONEY"; break;
+   case CHIP_VEGA10: chip_string = "AMD RADV VEGA"; break;
+   case CHIP_RAVEN: chip_string = "AMD RADV RAVEN"; break;
+   default: chip_string = "AMD RADV unknown"; break;
+   }
+
+   if (HAVE_LLVM > 0) {
+   snprintf(llvm_string, sizeof(llvm_string),
+" (LLVM %i.%i.%i)", (HAVE_LLVM >> 8) & 0xff,
+HAVE_LLVM & 0xff, MESA_LLVM_VERSION_PATCH);
+   }
+
+   snprintf(name, name_len, "%s%s", chip_string, llvm_string);
 }
 
 static void
@@ -215,7 +226,7 @@ radv_physical_device_init(struct radv_physical_device 
*device,
device->local_fd = fd;
device->ws->query_info(device->ws, >rad_info);
 
-   device->name = get_chip_name(device->rad_info.family);
+   radv_get_device_name(device->rad_info.family, device->name, 
sizeof(device->name));
 
if (radv_device_get_cache_uuid(device->rad_info.family, 
device->cache_uuid)) {
device->ws->destroy(device->ws);
diff --git a/src/amd/vulkan/radv_private.h b/src/amd/vulkan/radv_private.h
index 67c2011107..3edfda6b12 100644
--- a/src/amd/vulkan/radv_private.h
+++ b/src/amd/vulkan/radv_private.h
@@ -261,7 +261,7 @@ struct radv_physical_device {
struct radeon_winsys *ws;
struct radeon_info rad_info;
charpath[20];
-   const char *name;
+   char
name[VK_MAX_PHYSICAL_DEVICE_NAME_SIZE];
uint8_t driver_uuid[VK_UUID_SIZE];
uint8_t device_uuid[VK_UUID_SIZE];
uint8_t

[Mesa-dev] [PATCH] radv/winsys: implement query_value()

2017-12-06 Thread Samuel Pitoiset

Might be useful to know the VRAM/GTT usage, the number of VRAM
CPU page faults, etc. Nothing is currently using that new
interface, but it's a first step.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_radeon_winsys.h   | 16 +++
 src/amd/vulkan/winsys/amdgpu/radv_amdgpu_winsys.c | 56 +++
 2 files changed, 72 insertions(+)

diff --git a/src/amd/vulkan/radv_radeon_winsys.h 
b/src/amd/vulkan/radv_radeon_winsys.h
index 66a2bcccb4..2b815d9c5a 100644
--- a/src/amd/vulkan/radv_radeon_winsys.h
+++ b/src/amd/vulkan/radv_radeon_winsys.h
@@ -80,6 +80,19 @@ enum radeon_ctx_priority {
RADEON_CTX_PRIORITY_REALTIME,
 };
 
+enum radeon_value_id {
+   RADEON_TIMESTAMP,
+   RADEON_NUM_BYTES_MOVED,
+   RADEON_NUM_EVICTIONS,
+   RADEON_NUM_VRAM_CPU_PAGE_FAULTS,
+   RADEON_VRAM_USAGE,
+   RADEON_VRAM_VIS_USAGE,
+   RADEON_GTT_USAGE,
+   RADEON_GPU_TEMPERATURE,
+   RADEON_CURRENT_SCLK,
+   RADEON_CURRENT_MCLK,
+};
+
 struct radeon_winsys_cs {
unsigned cdw;  /* Number of used dwords. */
unsigned max_dw; /* Maximum number of dwords. */
@@ -169,6 +182,9 @@ struct radeon_winsys {
void (*query_info)(struct radeon_winsys *ws,
   struct radeon_info *info);
 
+   uint64_t (*query_value)(struct radeon_winsys *ws,
+   enum radeon_value_id value);
+
bool (*read_registers)(struct radeon_winsys *ws, unsigned reg_offset,
   unsigned num_registers, uint32_t *out);
 
diff --git a/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_winsys.c 
b/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_winsys.c
index fec1c5ad65..0c6374e71c 100644
--- a/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_winsys.c
+++ b/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_winsys.c
@@ -71,6 +71,61 @@ static void radv_amdgpu_winsys_query_info(struct 
radeon_winsys *rws,
*info = ((struct radv_amdgpu_winsys *)rws)->info;
 }
 
+static uint64_t radv_amdgpu_winsys_query_value(struct radeon_winsys *rws,
+  enum radeon_value_id value)
+{
+   struct radv_amdgpu_winsys *ws = (struct radv_amdgpu_winsys *)rws;
+   struct amdgpu_heap_info heap;
+   uint64_t retval = 0;
+
+   switch (value) {
+   case RADEON_TIMESTAMP:
+   amdgpu_query_info(ws->dev, AMDGPU_INFO_TIMESTAMP, 8, );
+   return retval;
+   case RADEON_NUM_BYTES_MOVED:
+   amdgpu_query_info(ws->dev, AMDGPU_INFO_NUM_BYTES_MOVED,
+ 8, );
+   return retval;
+   case RADEON_NUM_EVICTIONS:
+   amdgpu_query_info(ws->dev, AMDGPU_INFO_NUM_EVICTIONS,
+ 8, );
+   return retval;
+   case RADEON_NUM_VRAM_CPU_PAGE_FAULTS:
+   amdgpu_query_info(ws->dev, AMDGPU_INFO_NUM_VRAM_CPU_PAGE_FAULTS,
+ 8, );
+   return retval;
+   case RADEON_VRAM_USAGE:
+   amdgpu_query_heap_info(ws->dev, AMDGPU_GEM_DOMAIN_VRAM,
+  0, );
+   return heap.heap_usage;
+   case RADEON_VRAM_VIS_USAGE:
+   amdgpu_query_heap_info(ws->dev, AMDGPU_GEM_DOMAIN_VRAM,
+  AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED,
+  );
+   return heap.heap_usage;
+   case RADEON_GTT_USAGE:
+   amdgpu_query_heap_info(ws->dev, AMDGPU_GEM_DOMAIN_GTT,
+  0, );
+   return heap.heap_usage;
+   case RADEON_GPU_TEMPERATURE:
+   amdgpu_query_sensor_info(ws->dev, AMDGPU_INFO_SENSOR_GPU_TEMP,
+4, );
+   return retval;
+   case RADEON_CURRENT_SCLK:
+   amdgpu_query_sensor_info(ws->dev, AMDGPU_INFO_SENSOR_GFX_SCLK,
+4, );
+   return retval;
+   case RADEON_CURRENT_MCLK:
+   amdgpu_query_sensor_info(ws->dev, AMDGPU_INFO_SENSOR_GFX_MCLK,
+4, );
+   return retval;
+   default:
+   unreachable("invalid query value");
+   }
+
+   return 0;
+}
+
 static bool radv_amdgpu_winsys_read_registers(struct radeon_winsys *rws,
  unsigned reg_offset,
  unsigned num_registers, uint32_t 
*out)
@@ -127,6 +182,7 @@ radv_amdgpu_winsys_create(int fd, uint64_t debug_flags, 
uint64_t perftest_flags)
LIST_INITHEAD(>global_bo_list);
pthread_mutex_init(>global_bo_list_lock, NULL);
ws->base.query_info = radv_amdgpu_winsys_query_info;
+   ws->base.query_value = radv_amdgpu_winsys_query_value;
ws->base.read_registers = radv_amdgpu_winsys_read_registers;
ws->base.get_chip_name =

Re: [Mesa-dev] [PATCH 1/6] radeonsi: allow DMABUF exports for local buffers

2017-12-06 Thread Nicolai Hähnle

On 05.12.2017 20:05, Marek Olšák wrote:

From: Marek Olšák 

Cc: 17.3 
---
  src/gallium/drivers/radeon/r600_texture.c | 6 +-
  1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeon/r600_texture.c 
b/src/gallium/drivers/radeon/r600_texture.c
index 2aa47b5..07f7c33 100644
--- a/src/gallium/drivers/radeon/r600_texture.c
+++ b/src/gallium/drivers/radeon/r600_texture.c
@@ -739,22 +739,26 @@ static boolean r600_texture_get_handle(struct 
pipe_screen* screen,
stride = rtex->surface.u.gfx9.surf_pitch *
 rtex->surface.bpe;
slice_size = rtex->surface.u.gfx9.surf_slice_size;
} else {
offset = rtex->surface.u.legacy.level[0].offset;
stride = rtex->surface.u.legacy.level[0].nblk_x *
 rtex->surface.bpe;
slice_size = 
(uint64_t)rtex->surface.u.legacy.level[0].slice_size_dw * 4;
}
} else {
+   /* Buffer exports are for the OpenCL interop. */
/* Move a suballocated buffer into a non-suballocated 
allocation. */
-   if (sscreen->ws->buffer_is_suballocated(res->buf)) {
+   if (sscreen->ws->buffer_is_suballocated(res->buf) ||
+   /* A DMABUF export always fails if the BO is local. */
+   (rtex->resource.flags & RADEON_FLAG_NO_INTERPROCESS_SHARING 
&&
+whandle->type != DRM_API_HANDLE_TYPE_KMS)) {

I still don't think this is right. Or at least, it's bound to blow up in 
our faces at some point. Though I think we may have talked past each 
other, apologies that I haven't made myself clear.

The issues I have in mind are scenarios like this:

1. Buffer allocated in OpenGL.
2. Buffer exported as KMS handle for importing to OpenCL in the same 
process.
3. Buffer exported as an FD <-- at this point, the OpenGL and OpenCL 
buffers go out of sync because OpenGL re-allocates the buffer but OpenCL 
isn't informed.

Or:

1. Buffer allocated in OpenGL.
2. Buffer exported as KMS handle for importing to OpenCL in the same 
process.
3. Buffer attempted to be exported as an FD from OpenCL <-- fails 
because the buffer is local (has NO_INTERPROCESS_SHARING), and people 
will be utterly clueless as to what's going on.

FWIW, I think the patch is good if you drop the whandle->type check so 
that we re-allocate unconditionally.

Cheers,
Nicolai

assert(!res->b.is_shared);

  			/* Allocate a new buffer with PIPE_BIND_SHARED. */

struct pipe_resource templ = res->b.b;
templ.bind |= PIPE_BIND_SHARED;

  			struct pipe_resource *newb =

screen->resource_create(screen, );
if (!newb)
return false;

--
Lerne, wie die Welt wirklich ist,
Aber vergiss niemals, wie sie sein sollte.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] nir/opcodes: Fix constant-folding of bitfield_insert

2017-12-06 Thread James Legg

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104119
CC: 
CC: Samuel Pitoiset 
---
 src/compiler/nir/nir_opcodes.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/compiler/nir/nir_opcodes.py b/src/compiler/nir/nir_opcodes.py
index ac7333fe78..278562b2bd 100644
--- a/src/compiler/nir/nir_opcodes.py
+++ b/src/compiler/nir/nir_opcodes.py
@@ -724,12 +724,12 @@ opcode("bitfield_insert", 0, tuint32, [0, 0, 0, 0],
 unsigned base = src0, insert = src1;
 int offset = src2, bits = src3;
 if (bits == 0) {
-   dst = 0;
+   dst = base;
 } else if (offset < 0 || bits < 0 || bits + offset > 32) {
dst = 0;
 } else {
unsigned mask = ((1ull << bits) - 1) << offset;
-   dst = (base & ~mask) | ((insert << bits) & mask);
+   dst = (base & ~mask) | ((insert << offset) & mask);
 }
 """)
 
-- 
2.14.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] GBM and the Device Memory Allocator Proposals

2017-12-06 Thread Nicolai Hähnle


On 06.12.2017 08:07, James Jones wrote:
[snip]

So lets say you have a setup where both display and GPU supported
FOO/tiled, but only GPU supported compressed (FOO/CC) and cached
(FOO/cached).  But the GPU supported the following transitions:

    trans_a: FOO/CC -> null
    trans_b: FOO/cached -> null

Then the sets for each device (in order of preference):

GPU:
    1: caps(FOO/tiled, FOO/CC, FOO/cached); constraints(alignment=32k)
    2: caps(FOO/tiled, FOO/CC); constraints(alignment=32k)
    3: caps(FOO/tiled); constraints(alignment=32k)

Display:
    1: caps(FOO/tiled); constraints(alignment=64k)

Merged Result:
    1: caps(FOO/tiled, FOO/CC, FOO/cached); 
constraints(alignment=64k);

   transition(GPU->display: trans_a, trans_b; display->GPU: none)
    2: caps(FOO/tiled, FOO/CC); constraints(alignment=64k);
   transition(GPU->display: trans_a; display->GPU: none)
    3: caps(FOO/tiled); constraints(alignment=64k);
   transition(GPU->display: none; display->GPU: none)



We definitely don't want to expose a way of getting uncached rendering
surfaces for radeonsi. I mean, I think we are supposed to be able to 
program

our hardware so that the backend bypasses all caches, but (a) nobody
validates that and (b) it's basically suicide in terms of 
performance. Let's

build fewer footguns :)


sure, this was just a hypothetical example.  But to take this case as
another example, if you didn't want to expose uncached rendering (or
cached w/ cache flushes after each draw), you would exclude the entry
from the GPU set which didn't have FOO/cached (I'm adding back a
cached but not CC config just to make it interesting), and end up
with:

    trans_a: FOO/CC -> null
    trans_b: FOO/cached -> null

GPU:
   1: caps(FOO/tiled, FOO/CC, FOO/cached); constraints(alignment=32k)
   2: caps(FOO/tiled, FOO/cached); constraints(alignment=32k)

Display:
   1: caps(FOO/tiled); constraints(alignment=64k)

Merged Result:
   1: caps(FOO/tiled, FOO/CC, FOO/cached); constraints(alignment=64k);
  transition(GPU->display: trans_a, trans_b; display->GPU: none)
   2: caps(FOO/tiled, FOO/cached); constraints(alignment=64k);
  transition(GPU->display: trans_b; display->GPU: none)

So there isn't anything in the result set that doesn't have GPU cache,
and the cache-flush transition is always in the set of required
transitions going from GPU -> display

Hmm, I guess this does require the concept of a required cap..


Which we already introduced to the allocator API when we realized we
would need them as we were prototyping.


Note I also posed the question of whether things like cached (and 
similarly compression, since I view compression as roughly an equivalent 
mechanism to a cache) in one of the open issues on my XDC 2017 slides 
because of this very problem of over-pruning it causes.  It's on slide 
15, as "No device-local capabilities".  You'll have to listen to my 
coverage of it in the recorded presentation for that slide to make any 
sense, but it's the same thing Nicolai has laid out here.


As I continued working through our prototype driver support, I found I 
didn't actually need to include cached or compressed as capabilities: 
The GPU just applies them as needed and the usage transitions make it 
transparent to the non-GPU engines.  That does mean the GPU driver 
currently needs to be the one to realize the allocation from the 
capability set to get optimal behavior.  We could fix that by reworking 
our driver though.  At this point, not including device-local properties 
like on-device caching in capabilities seems like the right solution to 
me.  I'm curious whether this applies universally though, or if other 
hardware doesn't fit the "compression and stuff all behaves like a 
cache" idiom.


Compression is a part of the memory layout for us: framebuffer 
compression uses an additional "meta surface". At the most basic level, 
an allocation with loss-less compression support is by necessity bigger 
than an allocation without.


We can allocate this meta surface separately, but then we're forced to 
decompress when passing the surface around (e.g. to a compositor.)


Consider also the example I gave elsewhere, where a cross-vendor tiling 
layout is combined with vendor-specific compression:


Device 1, rendering: caps(BASE/foo-tiling, VND1/compression)
Device 2, sampling/scanout: caps(BASE/foo-tiling, VND2/compression)

Some more thoughts on caching or "device-local" properties below.


[snip]

I think I like the idea of having transitions being part of the
per-device/engine cap sets, so that such information can be used upon
merging to know which capabilities may remain or have to be dropped.

I think James's proposal for usage transitions was intended to work
with flows like:

   1. App gets GPU caps for RENDER usage
   2. App allocates GPU memory using a layout from (1)
   3. App now decides it wants use the buffer for SCANOUT
   4. App queries usage transition metadata from RENDER to SCANOUT,

[Mesa-dev] [Bug 104119] radv: OpBitFieldInsert produces 0 with a loop counter for Insert

2017-12-06 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=104119

--- Comment #6 from James Legg  ---
I've submitted another patch that fixes the constant folding that was going
wrong:

https://patchwork.freedesktop.org/patch/191977/

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2 09/25] mesa: move nir_spirv_supported_capabilities definition

2017-12-06 Thread Alejandro Piñeiro

On 06/12/17 10:47, Timothy Arceri wrote:
>
>
> On 06/12/17 20:33, Alejandro Piñeiro wrote:
>> On 06/12/17 10:23, Timothy Arceri wrote:
>>> Can we get away with forward declaring this?
>>>
>>> There is a section at the top of mtypes you can add it to:
>>>
>>>   * \name Some forward type declarations
>>
>> Yes, I realized that, and tried, but I still got several build errors.
>> So that would not be enough.
>
> Doesn't that just mean you need to include compiler/spirv/nir_spirv.h
> in more places?

Sorry, didn't realize this question.

No. The problem is that with the next patch, we are adding a variable
with that type, not a pointer. If we were just adding a pointer to
nir_spirv_supported_capabilities, then the forward definition would be
enough. If not we are getting "incomplete type" errors. Also adding an
include to compiler/spirv/nir_spirv.h on mtypes.h creates tons of
problems, as that one includes nir.h and so on.
>
>>
>> In any case, after all the recent changes on spirv/spirv_to_nir
>> codebase, this commit and the following one are obsolete. We are
>> preparing a v3 series, but meanwhile we send this path alone to
>> mesa-dev:
>> https://lists.freedesktop.org/archives/mesa-dev/2017-December/179438.html
>>
>
> I'm confused. If it's obsolete why are you trying to get it committed?
>
>
>>>
>>>
>>> On 01/12/17 04:28, Eduardo Lima Mitev wrote:
 From: Alejandro Piñeiro 

 Due gl_spirv we will use it on more places, specifically on
 gl_constants, where we would like to use it without a pointer.
 ---
    src/compiler/spirv/nir_spirv.h | 15 ++-
    src/mesa/main/mtypes.h | 11 +++
    2 files changed, 13 insertions(+), 13 deletions(-)

 diff --git a/src/compiler/spirv/nir_spirv.h
 b/src/compiler/spirv/nir_spirv.h
 index 0204e81d091..a14b55cdd4b 100644
 --- a/src/compiler/spirv/nir_spirv.h
 +++ b/src/compiler/spirv/nir_spirv.h
 @@ -28,7 +28,8 @@
    #ifndef _NIR_SPIRV_H_
    #define _NIR_SPIRV_H_
    -#include "nir/nir.h"
 +#include "compiler/nir/nir.h"
 +#include "main/mtypes.h"
      #ifdef __cplusplus
    extern "C" {
 @@ -42,18 +43,6 @@ struct nir_spirv_specialization {
   };
    };
    -struct nir_spirv_supported_capabilities {
 -   bool float64;
 -   bool image_ms_array;
 -   bool tessellation;
 -   bool draw_parameters;
 -   bool image_read_without_format;
 -   bool image_write_without_format;
 -   bool int64;
 -   bool multiview;
 -   bool variable_pointers;
 -};
 -
    nir_function *spirv_to_nir(const uint32_t *words, size_t
 word_count,
   struct nir_spirv_specialization
 *specializations,
   unsigned num_specializations,
 diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
 index 50a47e0a65d..c8177c9a99a 100644
 --- a/src/mesa/main/mtypes.h
 +++ b/src/mesa/main/mtypes.h
 @@ -3583,6 +3583,17 @@ struct gl_program_constants
   GLuint MaxShaderStorageBlocks;
    };
    +struct nir_spirv_supported_capabilities {
 +   bool float64;
 +   bool image_ms_array;
 +   bool tessellation;
 +   bool draw_parameters;
 +   bool image_read_without_format;
 +   bool image_write_without_format;
 +   bool int64;
 +   bool multiview;
 +   bool variable_pointers;
 +};
      /**
     * Constants which may be overridden by device driver during
 context creation

>>> ___
>>> mesa-dev mailing list
>>> mesa-dev@lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>
>> ___
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>
>

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] GBM and the Device Memory Allocator Proposals

2017-12-06 Thread Nicolai Hähnle


On 06.12.2017 08:01, James Jones wrote:

On 12/01/2017 10:34 AM, Nicolai Hähnle wrote:

On 01.12.2017 18:09, Nicolai Hähnle wrote:
[snip]

As for the actual transition API, I accept that some metadata may be
required, and the metadata probably needs to depend on the memory 
layout,

which is often vendor-specific. But even linear layouts need some
transitions for caches. We probably need at least some generic 
"off-device

usage" bit.


I've started thinking of cached as a capability with a transition.. I
think that helps.  Maybe it needs to somehow be more specific (ie. if
you have two devices both with there own cache with no coherency
between the two)


As I wrote above, I'd prefer not to think of "cached" as a capability 
at least for radeonsi.


 From the desktop perspective, I would say let's ignore caches, the 
drivers know which caches they need to flush to make data visible to 
other devices on the system.


On the other hand, there are probably SoC cases where non-coherent 
caches are shared between some but not all devices, and in that case 
perhaps we do need to communicate this.


So perhaps we should have two kinds of "capabilities".

The first, like framebuffer compression, is a capability of the 
allocated memory layout (because the compression requires a meta 
surface), and devices that expose it may opportunistically use it.


The second, like caches, is a capability that the device/driver will 
use and you don't get a say in it, but other devices/drivers also 
don't need to be aware of them.


So then you could theoretically have a system that gives you:

GPU: FOO/tiled(layout-caps=FOO/cc, dev-caps=FOO/gpu-cache)
Display: FOO/tiled(layout-caps=FOO/cc)
Video:   FOO/tiled(dev-caps=FOO/vid-cache)
Camera:  FOO/tiled(dev-caps=FOO/vid-cache)

[snip]

FWIW, I think all that stuff about different caches quite likely 
over-complicates things. At the end of each "command submission" of 
whichever type of engine, the buffer must be in a state where the 
kernel is free to move it around for memory management purposes. This 
already puts a big constraint on the kind of (non-coherent) caches 
that can be supported anyway, so I wouldn't be surprised if we could 
get away with a *much* simpler approach.


I'd rather not depend on this type of cleverness if possible.  Other 
kernels/OS's may not behave this way, and I'd like the allocator 
mechanism to be something we can use across all or at least most of the 
POSIX and POSIX-like OS's we support.  Also, this particular example is 
not true of our proprietary Linux driver, and I suspect it won't always 
be the case for other drivers.  If a particular driver or OS fits this 
assumption, the driver is always free to return no-op transitions in 
that case.


Agreed.

(What I wrote about memory management should be true for all systems, 
but the kernel could use an engine that goes through the relevant caches 
for memory management-related buffer moves. It just so happens that it 
doesn't do that on our hardware, but that's by no means universal.)


Cheers,
Nicolai
--
Lerne, wie die Welt wirklich ist,
Aber vergiss niemals, wie sie sein sollte.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] mesa: remove set_entry from forward type declarations

2017-12-06 Thread Alejandro Piñeiro

This type was used at gl_sync_object, but it is not used anymore.
---
 src/mesa/main/mtypes.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index b478f6158e2..4c90f58ef5a 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -99,7 +99,6 @@ struct gl_uniform_storage;
 struct prog_instruction;
 struct gl_program_parameter_list;
 struct set;
-struct set_entry;
 struct vbo_context;
 /*@}*/
 
-- 
2.11.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 6/6] radeonsi: make const and stream uploaders allocate read-only memory

2017-12-06 Thread Nicolai Hähnle


Except for the comment on patch 1, this series is:

Reviewed-by: Nicolai Hähnle 


On 05.12.2017 20:05, Marek Olšák wrote:

From: Marek Olšák 

and anything that clones these uploaders, like u_threaded_context.
---
  src/gallium/drivers/radeon/r600_pipe_common.c | 7 +--
  1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c 
b/src/gallium/drivers/radeon/r600_pipe_common.c
index 9090e65..9e45a9f 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.c
+++ b/src/gallium/drivers/radeon/r600_pipe_common.c
@@ -438,26 +438,29 @@ bool si_common_context_init(struct r600_common_context 
*rctx,
return false;
}
  
  	rctx->allocator_zeroed_memory =

u_suballocator_create(>b, sscreen->info.gart_page_size,
  0, PIPE_USAGE_DEFAULT, 0, true);
if (!rctx->allocator_zeroed_memory)
return false;
  
  	rctx->b.stream_uploader = u_upload_create(>b, 1024 * 1024,

- 0, PIPE_USAGE_STREAM, 0);
+ 0, PIPE_USAGE_STREAM,
+ R600_RESOURCE_FLAG_READ_ONLY);
if (!rctx->b.stream_uploader)
return false;
  
  	rctx->b.const_uploader = u_upload_create(>b, 128 * 1024,

-0, PIPE_USAGE_DEFAULT, 0);
+0, PIPE_USAGE_DEFAULT,
+
sscreen->cpdma_prefetch_writes_memory ?
+   0 : 
R600_RESOURCE_FLAG_READ_ONLY);
if (!rctx->b.const_uploader)
return false;
  
  	rctx->cached_gtt_allocator = u_upload_create(>b, 16 * 1024,

 0, PIPE_USAGE_STAGING, 0);
if (!rctx->cached_gtt_allocator)
return false;
  
  	rctx->ctx = rctx->ws->ctx_create(rctx->ws);

if (!rctx->ctx)




--
Lerne, wie die Welt wirklich ist,
Aber vergiss niemals, wie sie sein sollte.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] radv: Add LLVM version to the device name string

2017-12-06 Thread Samuel Pitoiset


Reviewed-by: Samuel Pitoiset 

On 12/06/2017 11:32 AM, Alex Smith wrote:

Allows apps to determine the LLVM version so that they can decide
whether or not to enable workarounds for LLVM issues.

Signed-off-by: Alex Smith 
Cc: "17.2 17.3" 
---
  src/amd/vulkan/radv_device.c  | 61 +--
  src/amd/vulkan/radv_private.h |  2 +-
  2 files changed, 37 insertions(+), 26 deletions(-)

diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index 1b7cd35593..2c3c84ee19 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -75,32 +75,43 @@ radv_get_device_uuid(struct radeon_info *info, void *uuid)
ac_compute_device_uuid(info, uuid, VK_UUID_SIZE);
  }
  
-static const char *

-get_chip_name(enum radeon_family family)
+static void
+radv_get_device_name(enum radeon_family family, char *name, size_t name_len)
  {
+   const char *chip_string;
+   char llvm_string[32] = {};
+
switch (family) {
-   case CHIP_TAHITI: return "AMD RADV TAHITI";
-   case CHIP_PITCAIRN: return "AMD RADV PITCAIRN";
-   case CHIP_VERDE: return "AMD RADV CAPE VERDE";
-   case CHIP_OLAND: return "AMD RADV OLAND";
-   case CHIP_HAINAN: return "AMD RADV HAINAN";
-   case CHIP_BONAIRE: return "AMD RADV BONAIRE";
-   case CHIP_KAVERI: return "AMD RADV KAVERI";
-   case CHIP_KABINI: return "AMD RADV KABINI";
-   case CHIP_HAWAII: return "AMD RADV HAWAII";
-   case CHIP_MULLINS: return "AMD RADV MULLINS";
-   case CHIP_TONGA: return "AMD RADV TONGA";
-   case CHIP_ICELAND: return "AMD RADV ICELAND";
-   case CHIP_CARRIZO: return "AMD RADV CARRIZO";
-   case CHIP_FIJI: return "AMD RADV FIJI";
-   case CHIP_POLARIS10: return "AMD RADV POLARIS10";
-   case CHIP_POLARIS11: return "AMD RADV POLARIS11";
-   case CHIP_POLARIS12: return "AMD RADV POLARIS12";
-   case CHIP_STONEY: return "AMD RADV STONEY";
-   case CHIP_VEGA10: return "AMD RADV VEGA";
-   case CHIP_RAVEN: return "AMD RADV RAVEN";
-   default: return "AMD RADV unknown";
-   }
+   case CHIP_TAHITI: chip_string = "AMD RADV TAHITI"; break;
+   case CHIP_PITCAIRN: chip_string = "AMD RADV PITCAIRN"; break;
+   case CHIP_VERDE: chip_string = "AMD RADV CAPE VERDE"; break;
+   case CHIP_OLAND: chip_string = "AMD RADV OLAND"; break;
+   case CHIP_HAINAN: chip_string = "AMD RADV HAINAN"; break;
+   case CHIP_BONAIRE: chip_string = "AMD RADV BONAIRE"; break;
+   case CHIP_KAVERI: chip_string = "AMD RADV KAVERI"; break;
+   case CHIP_KABINI: chip_string = "AMD RADV KABINI"; break;
+   case CHIP_HAWAII: chip_string = "AMD RADV HAWAII"; break;
+   case CHIP_MULLINS: chip_string = "AMD RADV MULLINS"; break;
+   case CHIP_TONGA: chip_string = "AMD RADV TONGA"; break;
+   case CHIP_ICELAND: chip_string = "AMD RADV ICELAND"; break;
+   case CHIP_CARRIZO: chip_string = "AMD RADV CARRIZO"; break;
+   case CHIP_FIJI: chip_string = "AMD RADV FIJI"; break;
+   case CHIP_POLARIS10: chip_string = "AMD RADV POLARIS10"; break;
+   case CHIP_POLARIS11: chip_string = "AMD RADV POLARIS11"; break;
+   case CHIP_POLARIS12: chip_string = "AMD RADV POLARIS12"; break;
+   case CHIP_STONEY: chip_string = "AMD RADV STONEY"; break;
+   case CHIP_VEGA10: chip_string = "AMD RADV VEGA"; break;
+   case CHIP_RAVEN: chip_string = "AMD RADV RAVEN"; break;
+   default: chip_string = "AMD RADV unknown"; break;
+   }
+
+   if (HAVE_LLVM > 0) {
+   snprintf(llvm_string, sizeof(llvm_string),
+" (LLVM %i.%i.%i)", (HAVE_LLVM >> 8) & 0xff,
+HAVE_LLVM & 0xff, MESA_LLVM_VERSION_PATCH);
+   }
+
+   snprintf(name, name_len, "%s%s", chip_string, llvm_string);
  }
  
  static void

@@ -215,7 +226,7 @@ radv_physical_device_init(struct radv_physical_device 
*device,
device->local_fd = fd;
device->ws->query_info(device->ws, >rad_info);
  
-	device->name = get_chip_name(device->rad_info.family);

+   radv_get_device_name(device->rad_info.family, device->name, 
sizeof(device->name));
  
  	if (radv_device_get_cache_uuid(device->rad_info.family, device->cache_uuid)) {

device->ws->destroy(device->ws);
diff --git a/src/amd/vulkan/radv_private.h b/src/amd/vulkan/radv_private.h
index 67c2011107..3edfda6b12 100644
--- a/src/amd/vulkan/radv_private.h
+++ b/src/amd/vulkan/radv_private.h
@@ -261,7 +261,7 @@ struct radv_physical_device {
struct radeon_winsys *ws;
struct radeon_info rad_info;
charpath[20];
-   const char *name;
+   char
name[VK_MAX_PHYSICAL_DEVICE_NAME_SIZE];
uint8_t driver_uuid[VK_UUID_SIZE];

Re: [Mesa-dev] [PATCH 1/2] gallium/util: add u_transfer_helper

2017-12-06 Thread Eric Anholt

Rob Clark  writes:

> Add a new helper that drivers can use to emulate various things that
> need special handling in particular in transfer_map:
>
>  1) z32_s8x24.. gl/gallium treats this as a single buffer with depth
> and stencil interleaved but hardware frequently treats this as
> separate z32 and s8 buffers.  Special pack/unpack handling is
> needed in transfer_map/unmap to pack/unpack the exposed buffer
>
>  2) fake RGTC.. GPUs designed with GLES in mind, but which can other-
> wise do GL3, if native RGTC is not supported it can be emulated
> by converting to uncompressed internally, but needs pack/unpack
> in transfer_map/unmap
>
>  3) MSAA resolves in the transfer_map() case
>
> v2: add MSAA resolve based on Eric's "gallium: Add helpers for MSAA
> resolves in pipe_transfer_map()/unmap()." patch; avoid wrapping
> pipe_resource, to make it possible for drivers to use both this
> and threaded_context.

The driver side is clean enough with this layer that I'm pretty happy
now.  Just one significant review comment, then I think we'll be
ready...

> +static void
> +flush_region(struct pipe_context *pctx, struct pipe_transfer *ptrans,
> + const struct pipe_box *box)
> +{
> +   struct u_transfer *trans = u_transfer(ptrans);
> +   enum pipe_format format = ptrans->resource->format;
> +   unsigned width = ptrans->box.width;
> +   unsigned height = ptrans->box.height;

It seems silly to be implementing flush_region and ignoring the box
argument to flush the entire mapped region on every call.  We should
either drop this implementation in favor of the no-op and flush at
unmap, or actually use the box in the flushes.

That said, I don't think you can reach flush_region with explicit flush
for non-buffer resources?

> +
> +   if (!(ptrans->usage & PIPE_TRANSFER_WRITE))
> +  return;
> +
> +   if (trans->ss) {
> +  struct pipe_blit_info blit;
> +  memset(, 0, sizeof(blit));
> +
> +  blit.src.resource = trans->ss;
> +  blit.src.format = trans->ss->format;
> +  blit.src.box.width = ptrans->box.width;
> +  blit.src.box.height = ptrans->box.height;
> +  blit.src.box.depth = 1;
> +
> +  blit.dst.resource = ptrans->resource;
> +  blit.dst.format = ptrans->resource->format;
> +  blit.dst.level = ptrans->level;
> +  blit.dst.box = ptrans->box;
> +
> +  blit.mask = util_format_get_mask(ptrans->resource->format);
> +  blit.filter = PIPE_TEX_FILTER_NEAREST;
> +
> +  pctx->blit(pctx, );
> +
> +  return;
> +   }
> +
> +   switch (format) {
> +   case PIPE_FORMAT_Z32_FLOAT_S8X24_UINT:
> +  util_format_z32_float_s8x24_uint_unpack_z_float(trans->ptr,
> +  trans->trans->stride,
> +  trans->staging,
> +  ptrans->stride,
> +  width, height);
> +  /* fallthru */
> +   case PIPE_FORMAT_X32_S8X24_UINT:
> +  util_format_z32_float_s8x24_uint_unpack_s_8uint(trans->ptr2,
> +  trans->trans2->stride,
> +  trans->staging,
> +  ptrans->stride,
> +  width, height);
> +  break;
> +   case PIPE_FORMAT_RGTC1_UNORM:
> +   case PIPE_FORMAT_RGTC1_SNORM:
> +   case PIPE_FORMAT_LATC1_UNORM:
> +   case PIPE_FORMAT_LATC1_SNORM:
> +  util_format_rgtc1_unorm_unpack_rgba_8unorm(trans->ptr,
> + trans->trans->stride,
> + trans->staging,
> + ptrans->stride,
> + width, height);
> +  break;
> +   case PIPE_FORMAT_RGTC2_UNORM:
> +   case PIPE_FORMAT_RGTC2_SNORM:
> +   case PIPE_FORMAT_LATC2_UNORM:
> +   case PIPE_FORMAT_LATC2_SNORM:
> +  util_format_rgtc2_unorm_unpack_rgba_8unorm(trans->ptr,
> + trans->trans->stride,
> + trans->staging,
> + ptrans->stride,
> + width, height);
> +  break;
> +   default:
> +  assert(!"Unexpected staging transfer type");
> +  break;
> +   }
> +}

> diff --git a/src/gallium/auxiliary/util/u_transfer_helper.h 
> b/src/gallium/auxiliary/util/u_transfer_helper.h
> new file mode 100644
> index 000..392b34f0697
> --- /dev/null
> +++ b/src/gallium/auxiliary/util/u_transfer_helper.h
> @@ -0,0 +1,132 @@
> +/*
> + * Copyright © 2017 Red Hat
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation

Re: [Mesa-dev] [PATCH 1/2] gallium/util: add u_transfer_helper

2017-12-06 Thread Rob Clark

On Wed, Dec 6, 2017 at 4:48 PM, Eric Anholt  wrote:
> Rob Clark  writes:
>
>> Add a new helper that drivers can use to emulate various things that
>> need special handling in particular in transfer_map:
>>
>>  1) z32_s8x24.. gl/gallium treats this as a single buffer with depth
>> and stencil interleaved but hardware frequently treats this as
>> separate z32 and s8 buffers.  Special pack/unpack handling is
>> needed in transfer_map/unmap to pack/unpack the exposed buffer
>>
>>  2) fake RGTC.. GPUs designed with GLES in mind, but which can other-
>> wise do GL3, if native RGTC is not supported it can be emulated
>> by converting to uncompressed internally, but needs pack/unpack
>> in transfer_map/unmap
>>
>>  3) MSAA resolves in the transfer_map() case
>>
>> v2: add MSAA resolve based on Eric's "gallium: Add helpers for MSAA
>> resolves in pipe_transfer_map()/unmap()." patch; avoid wrapping
>> pipe_resource, to make it possible for drivers to use both this
>> and threaded_context.
>
> The driver side is clean enough with this layer that I'm pretty happy
> now.  Just one significant review comment, then I think we'll be
> ready...
>
>> +static void
>> +flush_region(struct pipe_context *pctx, struct pipe_transfer *ptrans,
>> + const struct pipe_box *box)
>> +{
>> +   struct u_transfer *trans = u_transfer(ptrans);
>> +   enum pipe_format format = ptrans->resource->format;
>> +   unsigned width = ptrans->box.width;
>> +   unsigned height = ptrans->box.height;
>
> It seems silly to be implementing flush_region and ignoring the box
> argument to flush the entire mapped region on every call.  We should
> either drop this implementation in favor of the no-op and flush at
> unmap, or actually use the box in the flushes.

oh, whoops.. and I guess in theory I should use those dimensions for
the MSAA blit too..

> That said, I don't think you can reach flush_region with explicit flush
> for non-buffer resources?

hmm, not 100% sure about the APIs on the GL side of things, but I
think if that were the case mesa/st would dtrt.  (I guess it could be
different w/ gallium9, not that I have a big collection of windows arm
games to play :-P)

>> +
>> +   if (!(ptrans->usage & PIPE_TRANSFER_WRITE))
>> +  return;
>> +
>> +   if (trans->ss) {
>> +  struct pipe_blit_info blit;
>> +  memset(, 0, sizeof(blit));
>> +
>> +  blit.src.resource = trans->ss;
>> +  blit.src.format = trans->ss->format;
>> +  blit.src.box.width = ptrans->box.width;
>> +  blit.src.box.height = ptrans->box.height;
>> +  blit.src.box.depth = 1;
>> +
>> +  blit.dst.resource = ptrans->resource;
>> +  blit.dst.format = ptrans->resource->format;
>> +  blit.dst.level = ptrans->level;
>> +  blit.dst.box = ptrans->box;
>> +
>> +  blit.mask = util_format_get_mask(ptrans->resource->format);
>> +  blit.filter = PIPE_TEX_FILTER_NEAREST;
>> +
>> +  pctx->blit(pctx, );
>> +
>> +  return;
>> +   }
>> +
>> +   switch (format) {
>> +   case PIPE_FORMAT_Z32_FLOAT_S8X24_UINT:
>> +  util_format_z32_float_s8x24_uint_unpack_z_float(trans->ptr,
>> +  trans->trans->stride,
>> +  trans->staging,
>> +  ptrans->stride,
>> +  width, height);
>> +  /* fallthru */
>> +   case PIPE_FORMAT_X32_S8X24_UINT:
>> +  util_format_z32_float_s8x24_uint_unpack_s_8uint(trans->ptr2,
>> +  trans->trans2->stride,
>> +  trans->staging,
>> +  ptrans->stride,
>> +  width, height);
>> +  break;
>> +   case PIPE_FORMAT_RGTC1_UNORM:
>> +   case PIPE_FORMAT_RGTC1_SNORM:
>> +   case PIPE_FORMAT_LATC1_UNORM:
>> +   case PIPE_FORMAT_LATC1_SNORM:
>> +  util_format_rgtc1_unorm_unpack_rgba_8unorm(trans->ptr,
>> + trans->trans->stride,
>> + trans->staging,
>> + ptrans->stride,
>> + width, height);
>> +  break;
>> +   case PIPE_FORMAT_RGTC2_UNORM:
>> +   case PIPE_FORMAT_RGTC2_SNORM:
>> +   case PIPE_FORMAT_LATC2_UNORM:
>> +   case PIPE_FORMAT_LATC2_SNORM:
>> +  util_format_rgtc2_unorm_unpack_rgba_8unorm(trans->ptr,
>> + trans->trans->stride,
>> + trans->staging,
>> + ptrans->stride,
>> + width, height);
>> +  break;
>> +   default:
>> +

Re: [Mesa-dev] [PATCH] drirc: add option to disable ARB_draw_indirect

2017-12-06 Thread Rob Clark

On Wed, Dec 6, 2017 at 3:31 PM, Ian Romanick  wrote:
> On 12/05/2017 08:25 AM, Ilia Mirkin wrote:
>> On Tue, Dec 5, 2017 at 8:18 AM, Emil Velikov  
>> wrote:
>>> Hi Rob,
>>>
>>> On 5 December 2017 at 12:54, Rob Clark  wrote:
 This is a bit sad/annoying.  But with current GPU firmware (at least on
 a5xx) we can support both draw-indirect and base-instance.  But we can't
 support draw-indirect with a non-zero base-instance specified.  So add a
 driconf option to hide the extension from games that are known to use
 both.

 Signed-off-by: Rob Clark 
 ---
 Tbh, I'm also not really sure what to do when/if we got updated firmware
 which handled draw-indirect with base-instance, since we'd need to make
 this option conditional on fw version.  For STK that probably isn't a
 big deal since it doesn't use draw-indirect in a particularly useful way
 (the indirect buffer is generated on CPU).

>>> Couldn't freedreno just return 0 for PIPE_CAP_DRAW_INDIRECT (aka
>>> disable the extension) as it detects buggy FW?
>>> This is what radeons have been doing as they encounter iffy firmware or 
>>> LLVM.
>>>
>>> AFAICT freedreno doesn't do GL 4.0 or GLES 3.1 so one should be safe.
>>
>> Rob is this -><- close to ES 3.1, so that's not a great option.
>
> And I don't suppose there's a way to get updated firmware?  i965 has
> similar sorts of cases where higher versions are disabled due to missing
> kernel features.
>

I can ask.. I guess the odds are lower compared to a open src driver
effort supported by the hw vendor..

For now I've disabled exposing base-instance since that isn't needed
until later than draw-indirect..

BR,
-R
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] egl/x11: Remove unneeded free() on always null string

2017-12-06 Thread Vadim Shovkoplias

Hi Eric,

I used smatch (http://smatch.sourceforge.net/). It is mainly used for Linux
kernel.

2017-12-04 18:52 GMT+02:00 Eric Engestrom :

> On Monday, 2017-12-04 12:48:55 +0200, Vadim Shovkoplias wrote:
> > Hi Eric,
>
> Hey, sorry, I forgot to hit "send" on the reply I wrote on friday :]
>
> >
> > Mostly by a static analysis tool. It found at least 7 issues with useless
> > free() calls and other problems that probably should be fixed.
>
> What tool? It would be interesting for others to know what tools exist,
> especially when they find issues other tools didn't :)
>
> > Suggest please should I create one cumulative commit for this or it
> should
> > be a separate commits ?
>
> Same kind of issues in the same module should be grouped, whereas
> different kind of issues or different modules should be separate.
>
> Don't worry too much about it though, if people ask you to merge or
> split commits, it's not that complicated to do for a v2 :)
>
> >
> > 2017-12-01 17:41 GMT+02:00 Eric Engestrom :
> >
> > > On Friday, 2017-12-01 17:08:53 +0200, vadim.shovkopl...@gmail.com
> wrote:
> > > > From: Vadym Shovkoplias 
> > > >
> > > > In this condition dri2_dpy->driver_name string always equals
> > > > NULL, so call to free() is useless
> > > >
> > > > Signed-off-by: Vadym Shovkoplias 
> > >
> > > Reviewed and pushed :)
> > >
> > > Are you finding all of these by inspection, or are you using a tool?
> > >
> > > > ---
> > > >  src/egl/drivers/dri2/platform_x11.c | 1 -
> > > >  1 file changed, 1 deletion(-)
> > > >
> > > > diff --git a/src/egl/drivers/dri2/platform_x11.c
> b/src/egl/drivers/dri2/
> > > platform_x11.c
> > > > index c49cb1f..8ede590b 100644
> > > > --- a/src/egl/drivers/dri2/platform_x11.c
> > > > +++ b/src/egl/drivers/dri2/platform_x11.c
> > > > @@ -704,7 +704,6 @@ dri2_x11_connect(struct dri2_egl_display
> *dri2_dpy)
> > > >
> > > > if (dri2_dpy->driver_name == NULL) {
> > > >close(dri2_dpy->fd);
> > > > -  free(dri2_dpy->driver_name);
> > > >free(connect);
> > > >return EGL_FALSE;
> > > > }
> > > > --
> > > > 2.7.4
> > > >
> > >
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/2] radv: enable lowering of nir_op_bitfield_insert

2017-12-06 Thread James Legg

On Tue, 2017-12-05 at 14:24 -0500, Connor Abbott wrote:
> lower_bitfield_insert lowers nir_op_bitfield_insert to DX10-style
> nir_op_bfi and nir_op_bfm, both of which aren't handled by
> ac_nir_to_llvm, so unless I'm missing something this will just break
> them even harder. We probably should use this lowering after adding
> support for bfi and bfm, since AMD does have native instructions for
> bfi and bfm, but first I'd like to see the actual bug fixed. Have you
> tried running it with NIR_PRINT=true to pin down which optimization
> pass is going wrong?

I've identified the constant folding pass as the culprit and fixed the
incorrect logic for bitfield_insert with this patch:
nir/opcodes: Fix constant-folding of bitfield_insert
https://patchwork.freedesktop.org/patch/191977/

James
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/6] radeonsi: allow DMABUF exports for local buffers

2017-12-06 Thread Marek Olšák

On Dec 6, 2017 12:34 PM, "Nicolai Hähnle"  wrote:

On 05.12.2017 20:05, Marek Olšák wrote:

> From: Marek Olšák 
>
> Cc: 17.3 
> ---
>   src/gallium/drivers/radeon/r600_texture.c | 6 +-
>   1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/src/gallium/drivers/radeon/r600_texture.c
> b/src/gallium/drivers/radeon/r600_texture.c
> index 2aa47b5..07f7c33 100644
> --- a/src/gallium/drivers/radeon/r600_texture.c
> +++ b/src/gallium/drivers/radeon/r600_texture.c
> @@ -739,22 +739,26 @@ static boolean r600_texture_get_handle(struct
> pipe_screen* screen,
> stride = rtex->surface.u.gfx9.surf_pitch *
>  rtex->surface.bpe;
> slice_size = rtex->surface.u.gfx9.surf_slice_size;
> } else {
> offset = rtex->surface.u.legacy.level[0].offset;
> stride = rtex->surface.u.legacy.level[0].nblk_x *
>  rtex->surface.bpe;
> slice_size = 
> (uint64_t)rtex->surface.u.legacy.level[0].slice_size_dw
> * 4;
> }
> } else {
> +   /* Buffer exports are for the OpenCL interop. */
> /* Move a suballocated buffer into a non-suballocated
> allocation. */
> -   if (sscreen->ws->buffer_is_suballocated(res->buf)) {
> +   if (sscreen->ws->buffer_is_suballocated(res->buf) ||
> +   /* A DMABUF export always fails if the BO is local. */
> +   (rtex->resource.flags & 
> RADEON_FLAG_NO_INTERPROCESS_SHARING
> &&
> +whandle->type != DRM_API_HANDLE_TYPE_KMS)) {
>

I still don't think this is right. Or at least, it's bound to blow up in
our faces at some point. Though I think we may have talked past each other,
apologies that I haven't made myself clear.

The issues I have in mind are scenarios like this:

1. Buffer allocated in OpenGL.
2. Buffer exported as KMS handle for importing to OpenCL in the same
process.
3. Buffer exported as an FD <-- at this point, the OpenGL and OpenCL
buffers go out of sync because OpenGL re-allocates the buffer but OpenCL
isn't informed.

Or:

1. Buffer allocated in OpenGL.
2. Buffer exported as KMS handle for importing to OpenCL in the same
process.
3. Buffer attempted to be exported as an FD from OpenCL <-- fails because
the buffer is local (has NO_INTERPROCESS_SHARING), and people will be
utterly clueless as to what's going on.

FWIW, I think the patch is good if you drop the whandle->type check so that
we re-allocate unconditionally.

I can remove the check, but buffers are only exported as DMABUF. This patch
isn't just random - it does fix OpenCL interop.

Marek

Cheers,
Nicolai

assert(!res->b.is_shared);
> /* Allocate a new buffer with PIPE_BIND_SHARED. */
> struct pipe_resource templ = res->b.b;
> templ.bind |= PIPE_BIND_SHARED;
> struct pipe_resource *newb =
> screen->resource_create(screen, );
> if (!newb)
> return false;
>
>

-- 
Lerne, wie die Welt wirklich ist,
Aber vergiss niemals, wie sie sein sollte.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] GBM and the Device Memory Allocator Proposals

2017-12-06 Thread Rob Clark

On Wed, Dec 6, 2017 at 12:52 AM, James Jones  wrote:
> On 11/30/2017 10:48 AM, Rob Clark wrote:
>>
>> On Thu, Nov 30, 2017 at 1:28 AM, James Jones  wrote:
>>>
>>> On 11/29/2017 01:10 PM, Rob Clark wrote:


 On Wed, Nov 29, 2017 at 12:33 PM, Jason Ekstrand 
 wrote:
>
>
> On Sat, Nov 25, 2017 at 1:20 PM, Rob Clark  wrote:
>>
>>
>>
>> On Sat, Nov 25, 2017 at 12:46 PM, Jason Ekstrand
>> 
>> wrote:
>>>
>>>
>>> I'm not quite some sure what I think about this.  I think I would
>>> like
>>> to
>>> see $new_thing at least replace the guts of GBM. Whether GBM becomes
>>> a
>>> wrapper around $new_thing or $new_thing implements the GBM API, I'm
>>> not
>>> sure.  What I don't think I want is to see GBM development continuing
>>> on
>>> it's own so we have two competing solutions.
>>
>>
>>
>> I don't really view them as competing.. there is *some* overlap, ie.
>> allocating a buffer.. but even if you are using GBM w/out $new_thing
>> you could allocate a buffer externally and import it.  I don't see
>> $new_thing as that much different from GBM PoV.
>>
>> But things like surfaces (aka swap chains) seem a bit out of place
>> when you are thinking about implementing $new_thing for non-gpu
>> devices.  Plus EGL<->GBM tie-ins that seem out of place when talking
>> about a (for ex.) camera.  I kinda don't want to throw out the baby
>> with the bathwater here.
>
>
>
>
> Agreed.  GBM is very EGLish and we don't want the new allocator to be
> that.
>
>>
>> *maybe* GBM could be partially implemented on top of $new_thing.  I
>> don't quite see how that would work.  Possibly we could deprecate
>> parts of GBM that are no longer needed?  idk..  Either way, I fully
>> expect that GBM and mesa's implementation of $new_thing could perhaps
>> sit on to of some of the same set of internal APIs.  The public
>> interface can be decoupled from the internal implementation.
>
>
>
>
> Maybe I should restate things a bit.  My real point was that modifiers
> +
> $new_thing + Kernel blob should be a complete and more powerful
> replacement
> for GBM.  I don't know that we really can implement GBM on top of it
> because
> GBM has lots of wishy-washy concepts such as "cursor plane" which may
> not
> map well at least not without querying the kernel about specifc display
> planes.  In particular, I don't want someone to feel like they need to
> use
> $new_thing and GBM at the same time or together.  Ideally, I'd like
> them
> to
> never do that unless we decide gbm_bo is a useful abstraction for
> $new_thing.
>

 (just to repeat what I mentioned on irc)

 I think main thing is how do you create a swapchain/surface and know
 which is current front buffer after SwapBuffers()..  that is the only
 bits of GBM that seem like there would still be useful.  idk, maybe
 there is some other idea.
>>>
>>>
>>>
>>> I don't view this as terribly useful except for legacy apps that need an
>>> EGL
>>> window surface and can't be updated to use new methods.  Wayland
>>> compositors
>>> certainly don't fall in that category.  I don't know that any GBM apps
>>> do.
>>
>>
>> kmscube doesn't count?  :-P
>>
>> Hmm, I assumed weston and the other wayland compositors where still
>> using gbm to create EGL surfaces, but I confess to have not actually
>> looked at weston src code for quite a few years now.
>>
>> Anyways, I think it is perfectly fine for GBM to stay as-is in it's
>> current form.  It can already import dma-buf fd's, and those can
>> certainly come from $new_thing.
>>
>> So I guess we want an EGL extension to return the allocator device
>> instance for the GPU.  That also takes care of the non-bare-metal
>> case.
>>
>>> Rather, I think the way forward for the classes of apps that need
>>> something
>>> like GBM or the generic allocator is more or less the path ChromeOS took
>>> with their graphics architecture: Render to individual buffers (using
>>> FBOs
>>> bound to imported buffers in GL) and manage buffer exchanges/blits
>>> manually.
>>>
>>> The useful abstraction surfaces provide isn't so much deciding which
>>> buffer
>>> is currently "front" and "back", but rather handling the
>>> transition/hand-off
>>> to the window system/display device/etc. in SwapBuffers(), and the whole
>>> idea of the allocator proposals is to make that something the application
>>> or
>>> at least some non-driver utility library handles explicitly based on
>>> where
>>> exactly the buffer is being handed off to.
>>
>>
>> Hmm, ok..  I guess the transition will need some hook into the driver.
>> For freedreno and vc4 (and I suspect this is not uncommon for

Re: [Mesa-dev] [PATCH] mesa: define nir_spirv_supported_capabilities

2017-12-06 Thread Alejandro Piñeiro

On 06/12/17 13:29, Pierre Moreau wrote:
> Hello Alejandro,
>
> As far as I understand, nir_spirv_supported_capabilities is being filled in by
> the driver and then fetched by the API entrypoint to check the capabilities
> required by the SPIR-V binary given as input. And this is done regardless of
> the input IR used by the driver, be it NIR, LLVM IR, TGSI or others. So
> couldn’t it be just named spirv_supported_capabilities? Unless it also 
> reflects
> the capabilities supported by the IR being used.

Good point. spirv_supported_capabilities is probably a better name,
although right now, it would be only used on the spirv to nir pass. I
will not send a new version of the patch with just the renaming, but for
anyone interested on review the commit, I will use that name.

> I guess nir_spirv_supported_capabilities could be extended later on to also 
> add
> capabilities specific to OpenCL when clover reaches OpenCL 1.2 support (and 
> can
> start accepting SPIR-V binaries as input through the cl_khr_il_program
> extension), or would it be better to have a separate one for OpenCL?

Probably it would be re-used, but I don't know the specifics of OpenCL
to ensure 100% that.

>
> I haven’t had time to look at the whole gl_spirv series yet, so I am sorry if
> this is something that has already been brought and answered in that thread.

No sorries, your feedback was good and welcomed.
>
> Regards,
> Pierre
>
> On 2017-12-06 — 09:57, Alejandro Piñeiro wrote:
>> Until now it was part of spirv_to_nir_options. But it will be used on
>> the implementation of ARB_gl_spirv and ARB_spirv_extensions, and added
>> to the OpenGL context, as a way to save what SPIR-V capabilities the
>> current OpenGL implementation supports.
>> ---
>>
>> We are sending this commit in advance of a v3 of the initial gl_spirv
>> and spirv_extensions support series. The issue is that lately there
>> were a lot of activity on the spirv/spir_to_nir code base, and we are
>> being fixing rebase conflicts constantly. Getting this commit on
>> master would make things easier.
>>
>> FWIW, this patch is similar to one that Ian Romanick already granted
>> Rb, but that was dropped after all the mentioned changes:
>> https://lists.freedesktop.org/archives/mesa-dev/2017-November/178261.html
>>
>>  src/compiler/spirv/nir_spirv.h | 16 +++-
>>  src/mesa/main/mtypes.h | 12 
>>  2 files changed, 15 insertions(+), 13 deletions(-)
>>
>> diff --git a/src/compiler/spirv/nir_spirv.h b/src/compiler/spirv/nir_spirv.h
>> index 43ec19d5a50..113bd710a00 100644
>> --- a/src/compiler/spirv/nir_spirv.h
>> +++ b/src/compiler/spirv/nir_spirv.h
>> @@ -28,7 +28,8 @@
>>  #ifndef _NIR_SPIRV_H_
>>  #define _NIR_SPIRV_H_
>>  
>> -#include "nir/nir.h"
>> +#include "compiler/nir/nir.h"
>> +#include "main/mtypes.h"
>>  
>>  #ifdef __cplusplus
>>  extern "C" {
>> @@ -57,18 +58,7 @@ struct spirv_to_nir_options {
>>  */
>> bool lower_workgroup_access_to_offsets;
>>  
>> -   struct {
>> -  bool float64;
>> -  bool image_ms_array;
>> -  bool tessellation;
>> -  bool draw_parameters;
>> -  bool image_read_without_format;
>> -  bool image_write_without_format;
>> -  bool int64;
>> -  bool multiview;
>> -  bool variable_pointers;
>> -  bool storage_16bit;
>> -   } caps;
>> +   struct nir_spirv_supported_capabilities caps;
>>  
>> struct {
>>void (*func)(void *private_data,
>> diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
>> index b478f6158e2..7da05aa3ee9 100644
>> --- a/src/mesa/main/mtypes.h
>> +++ b/src/mesa/main/mtypes.h
>> @@ -3579,6 +3579,18 @@ struct gl_program_constants
>> GLuint MaxShaderStorageBlocks;
>>  };
>>  
>> +struct nir_spirv_supported_capabilities {
>> +   bool float64;
>> +   bool image_ms_array;
>> +   bool tessellation;
>> +   bool draw_parameters;
>> +   bool image_read_without_format;
>> +   bool image_write_without_format;
>> +   bool int64;
>> +   bool multiview;
>> +   bool variable_pointers;
>> +   bool storage_16bit;
>> +};
>>  
>>  /**
>>   * Constants which may be overridden by device driver during context 
>> creation
>> -- 
>> 2.11.0
>>
>> ___
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev




signature.asc
Description: OpenPGP digital signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] GBM and the Device Memory Allocator Proposals

2017-12-06 Thread Rob Clark

On Wed, Dec 6, 2017 at 2:07 AM, James Jones  wrote:
> On 12/01/2017 01:52 PM, Miguel Angel Vico wrote:
>>
>>
>>
>> On Fri, 1 Dec 2017 13:38:41 -0500
>> Rob Clark  wrote:
>>
>>>
>>> sure, this was just a hypothetical example.  But to take this case as
>>> another example, if you didn't want to expose uncached rendering (or
>>> cached w/ cache flushes after each draw), you would exclude the entry
>>> from the GPU set which didn't have FOO/cached (I'm adding back a
>>> cached but not CC config just to make it interesting), and end up
>>> with:
>>>
>>> trans_a: FOO/CC -> null
>>> trans_b: FOO/cached -> null
>>>
>>> GPU:
>>>1: caps(FOO/tiled, FOO/CC, FOO/cached); constraints(alignment=32k)
>>>2: caps(FOO/tiled, FOO/cached); constraints(alignment=32k)
>>>
>>> Display:
>>>1: caps(FOO/tiled); constraints(alignment=64k)
>>>
>>> Merged Result:
>>>1: caps(FOO/tiled, FOO/CC, FOO/cached); constraints(alignment=64k);
>>>   transition(GPU->display: trans_a, trans_b; display->GPU: none)
>>>2: caps(FOO/tiled, FOO/cached); constraints(alignment=64k);
>>>   transition(GPU->display: trans_b; display->GPU: none)
>>>
>>> So there isn't anything in the result set that doesn't have GPU cache,
>>> and the cache-flush transition is always in the set of required
>>> transitions going from GPU -> display
>>>
>>> Hmm, I guess this does require the concept of a required cap..
>>
>>
>> Which we already introduced to the allocator API when we realized we
>> would need them as we were prototyping.
>
>
> Note I also posed the question of whether things like cached (and similarly
> compression, since I view compression as roughly an equivalent mechanism to
> a cache) in one of the open issues on my XDC 2017 slides because of this
> very problem of over-pruning it causes.  It's on slide 15, as "No
> device-local capabilities".  You'll have to listen to my coverage of it in
> the recorded presentation for that slide to make any sense, but it's the
> same thing Nicolai has laid out here.
>
> As I continued working through our prototype driver support, I found I
> didn't actually need to include cached or compressed as capabilities: The
> GPU just applies them as needed and the usage transitions make it
> transparent to the non-GPU engines.  That does mean the GPU driver currently
> needs to be the one to realize the allocation from the capability set to get
> optimal behavior.  We could fix that by reworking our driver though.  At
> this point, not including device-local properties like on-device caching in
> capabilities seems like the right solution to me.  I'm curious whether this
> applies universally though, or if other hardware doesn't fit the
> "compression and stuff all behaves like a cache" idiom.
>

Possibly a SoC(ish) type device which has a "system" cache that some
but not all devices fall into.  I *think* the intel chips w/ EDRAM
might fall into this category.  I know the idea has come up elsewhere,
although not sure if anything like that ended up in production.  It
seems like something we'd at least want to have an idea how to deal
with, even if it isn't used for device internal caches.

Not sure if similar situation could come up w/ discrete GPU and video
decode/encode engines on the same die?

[snip]

>> I think I like the idea of having transitions being part of the
>> per-device/engine cap sets, so that such information can be used upon
>> merging to know which capabilities may remain or have to be dropped.
>>
>> I think James's proposal for usage transitions was intended to work
>> with flows like:
>>
>>1. App gets GPU caps for RENDER usage
>>2. App allocates GPU memory using a layout from (1)
>>3. App now decides it wants use the buffer for SCANOUT
>>4. App queries usage transition metadata from RENDER to SCANOUT,
>>   given the current memory layout.
>>5. Do the transition and hand the buffer off to display
>
>
> No, all usages the app intends to transition to must be specified up front
> when initially querying caps in the model I assumed.  The app then specifies
> some subset (up to the full set) of the specified usages as a src and dst
> when querying transition metadata.
>
>> The problem I see with this is that it isn't guaranteed that there will
>> be a chain of transitions for the buffer to be usable by display.
>

hmm, I guess if a buffer *can* be shared across all uses, there by
definition has to be a chain of transitions to go from any
usage+device to any other usage+device.

Possibly a separate step to query transitions avoids solving for every
possible transition when merging the caps set.. although until you do
that query I don't think you know the resulting merged caps set is
valid.

Maybe in practice for every cap FOO there exists a FOO->null (or
FOO->generic if you prefer) transition, ie. compressed->uncompressed,
cached->clean, etc.  I suppose that makes the problem easier to solve.

>
> I hadn't thought

[Mesa-dev] [PATCH mesa] meson: fix keyword argument in declare_dependency()

2017-12-06 Thread Eric Engestrom

`declare_dependency()` takes `compile_args`, not `c_args`.
It was correct in all the other `declare_dependency()` from that commit.

Fixes: 0bbecc5a8548883f76a71 "meson: define driver dependencies"
Cc: Dylan Baker 
Signed-off-by: Eric Engestrom 
---
 src/gallium/winsys/imx/drm/meson.build   | 2 +-
 src/gallium/winsys/pl111/drm/meson.build | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/winsys/imx/drm/meson.build 
b/src/gallium/winsys/imx/drm/meson.build
index a4af4688694d6ed9c300..4efd7bb0660ca70079a5 100644
--- a/src/gallium/winsys/imx/drm/meson.build
+++ b/src/gallium/winsys/imx/drm/meson.build
@@ -28,6 +28,6 @@ libimxdrm = static_library(
 )
 
 driver_imx = declare_dependency(
-  c_args : '-DGALLIUM_IMX',
+  compile_args : '-DGALLIUM_IMX',
   link_with : libimxdrm,
 )
diff --git a/src/gallium/winsys/pl111/drm/meson.build 
b/src/gallium/winsys/pl111/drm/meson.build
index 84c26f57e15f97a21ca8..9cb6faf31e21e57bd6ab 100644
--- a/src/gallium/winsys/pl111/drm/meson.build
+++ b/src/gallium/winsys/pl111/drm/meson.build
@@ -31,6 +31,6 @@ libpl111winsys = static_library(
 )
 
 driver_pl111 = declare_dependency(
-  c_args : '-DGALLIUM_PL111',
+  compile_args : '-DGALLIUM_PL111',
   link_with : libpl111winsys,
 )
-- 
Cheers,
  Eric

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 6/6] glx: Implement GLX_EXT_no_config_context (v3)

2017-12-06 Thread Emil Velikov

On 5 December 2017 at 20:03, Adam Jackson  wrote:
> On Thu, 2017-11-30 at 16:10 +, Emil Velikov wrote:
>> On 14 November 2017 at 20:13, Adam Jackson  wrote:
>>
>> > @@ -562,6 +562,10 @@ dri2_convert_glx_attribs(unsigned num_attribs,
>> > const uint32_t *attribs,
>> >  return false;
>> >   }
>> >   break;
>> > +  case GLX_SCREEN:
>> > + /* Implies GLX_EXT_no_config_context */
>> > + *render_type = GLX_DONT_CARE;
>> > + break;
>>
>> We should fall-through (and fail) when GLX_SCREEN is set but the
>> extension is missing.
>
> Nah. This function does not make reference to the screen we're trying
> to create a context for. And, for direct contexts (that call this
> conversion helper) this extension is now always present anyway.
>
Hmm you're right - having adding check is pain in the rear and an overkill.

-Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] GBM and the Device Memory Allocator Proposals

2017-12-06 Thread Rob Clark

On Wed, Dec 6, 2017 at 6:25 AM, Nicolai Hähnle  wrote:
> On 06.12.2017 08:07, James Jones wrote:
> [snip]
>
>> So lets say you have a setup where both display and GPU supported
>> FOO/tiled, but only GPU supported compressed (FOO/CC) and cached
>> (FOO/cached).  But the GPU supported the following transitions:
>>
>> trans_a: FOO/CC -> null
>> trans_b: FOO/cached -> null
>>
>> Then the sets for each device (in order of preference):
>>
>> GPU:
>> 1: caps(FOO/tiled, FOO/CC, FOO/cached); constraints(alignment=32k)
>> 2: caps(FOO/tiled, FOO/CC); constraints(alignment=32k)
>> 3: caps(FOO/tiled); constraints(alignment=32k)
>>
>> Display:
>> 1: caps(FOO/tiled); constraints(alignment=64k)
>>
>> Merged Result:
>> 1: caps(FOO/tiled, FOO/CC, FOO/cached);
>> constraints(alignment=64k);
>>transition(GPU->display: trans_a, trans_b; display->GPU: none)
>> 2: caps(FOO/tiled, FOO/CC); constraints(alignment=64k);
>>transition(GPU->display: trans_a; display->GPU: none)
>> 3: caps(FOO/tiled); constraints(alignment=64k);
>>transition(GPU->display: none; display->GPU: none)
>
>
>
> We definitely don't want to expose a way of getting uncached rendering
> surfaces for radeonsi. I mean, I think we are supposed to be able to
> program
> our hardware so that the backend bypasses all caches, but (a) nobody
> validates that and (b) it's basically suicide in terms of performance.
> Let's
> build fewer footguns :)


 sure, this was just a hypothetical example.  But to take this case as
 another example, if you didn't want to expose uncached rendering (or
 cached w/ cache flushes after each draw), you would exclude the entry
 from the GPU set which didn't have FOO/cached (I'm adding back a
 cached but not CC config just to make it interesting), and end up
 with:

 trans_a: FOO/CC -> null
 trans_b: FOO/cached -> null

 GPU:
1: caps(FOO/tiled, FOO/CC, FOO/cached); constraints(alignment=32k)
2: caps(FOO/tiled, FOO/cached); constraints(alignment=32k)

 Display:
1: caps(FOO/tiled); constraints(alignment=64k)

 Merged Result:
1: caps(FOO/tiled, FOO/CC, FOO/cached); constraints(alignment=64k);
   transition(GPU->display: trans_a, trans_b; display->GPU: none)
2: caps(FOO/tiled, FOO/cached); constraints(alignment=64k);
   transition(GPU->display: trans_b; display->GPU: none)

 So there isn't anything in the result set that doesn't have GPU cache,
 and the cache-flush transition is always in the set of required
 transitions going from GPU -> display

 Hmm, I guess this does require the concept of a required cap..
>>>
>>>
>>> Which we already introduced to the allocator API when we realized we
>>> would need them as we were prototyping.
>>
>>
>> Note I also posed the question of whether things like cached (and
>> similarly compression, since I view compression as roughly an equivalent
>> mechanism to a cache) in one of the open issues on my XDC 2017 slides
>> because of this very problem of over-pruning it causes.  It's on slide 15,
>> as "No device-local capabilities".  You'll have to listen to my coverage of
>> it in the recorded presentation for that slide to make any sense, but it's
>> the same thing Nicolai has laid out here.
>>
>> As I continued working through our prototype driver support, I found I
>> didn't actually need to include cached or compressed as capabilities: The
>> GPU just applies them as needed and the usage transitions make it
>> transparent to the non-GPU engines.  That does mean the GPU driver currently
>> needs to be the one to realize the allocation from the capability set to get
>> optimal behavior.  We could fix that by reworking our driver though.  At
>> this point, not including device-local properties like on-device caching in
>> capabilities seems like the right solution to me.  I'm curious whether this
>> applies universally though, or if other hardware doesn't fit the
>> "compression and stuff all behaves like a cache" idiom.
>
>
> Compression is a part of the memory layout for us: framebuffer compression
> uses an additional "meta surface". At the most basic level, an allocation
> with loss-less compression support is by necessity bigger than an allocation
> without.
>
> We can allocate this meta surface separately, but then we're forced to
> decompress when passing the surface around (e.g. to a compositor.)
>

side note:  I think this is pretty typical.. although afaict for
adreno at least, when you start getting into sampling from things with
multiple layers/levels, the meta surface needs to be interleaved with
the "main" surface, so it can't really be allocated after the fact.

Also for depth buffer, there is potentially an additional meta

Re: [Mesa-dev] [PATCH] mesa: define nir_spirv_supported_capabilities

2017-12-06 Thread Pierre Moreau

Hello Alejandro,

As far as I understand, nir_spirv_supported_capabilities is being filled in by
the driver and then fetched by the API entrypoint to check the capabilities
required by the SPIR-V binary given as input. And this is done regardless of
the input IR used by the driver, be it NIR, LLVM IR, TGSI or others. So
couldn’t it be just named spirv_supported_capabilities? Unless it also reflects
the capabilities supported by the IR being used.

I guess nir_spirv_supported_capabilities could be extended later on to also add
capabilities specific to OpenCL when clover reaches OpenCL 1.2 support (and can
start accepting SPIR-V binaries as input through the cl_khr_il_program
extension), or would it be better to have a separate one for OpenCL?

I haven’t had time to look at the whole gl_spirv series yet, so I am sorry if
this is something that has already been brought and answered in that thread.

Regards,
Pierre

On 2017-12-06 — 09:57, Alejandro Piñeiro wrote:
> Until now it was part of spirv_to_nir_options. But it will be used on
> the implementation of ARB_gl_spirv and ARB_spirv_extensions, and added
> to the OpenGL context, as a way to save what SPIR-V capabilities the
> current OpenGL implementation supports.
> ---
> 
> We are sending this commit in advance of a v3 of the initial gl_spirv
> and spirv_extensions support series. The issue is that lately there
> were a lot of activity on the spirv/spir_to_nir code base, and we are
> being fixing rebase conflicts constantly. Getting this commit on
> master would make things easier.
> 
> FWIW, this patch is similar to one that Ian Romanick already granted
> Rb, but that was dropped after all the mentioned changes:
> https://lists.freedesktop.org/archives/mesa-dev/2017-November/178261.html
> 
>  src/compiler/spirv/nir_spirv.h | 16 +++-
>  src/mesa/main/mtypes.h | 12 
>  2 files changed, 15 insertions(+), 13 deletions(-)
> 
> diff --git a/src/compiler/spirv/nir_spirv.h b/src/compiler/spirv/nir_spirv.h
> index 43ec19d5a50..113bd710a00 100644
> --- a/src/compiler/spirv/nir_spirv.h
> +++ b/src/compiler/spirv/nir_spirv.h
> @@ -28,7 +28,8 @@
>  #ifndef _NIR_SPIRV_H_
>  #define _NIR_SPIRV_H_
>  
> -#include "nir/nir.h"
> +#include "compiler/nir/nir.h"
> +#include "main/mtypes.h"
>  
>  #ifdef __cplusplus
>  extern "C" {
> @@ -57,18 +58,7 @@ struct spirv_to_nir_options {
>  */
> bool lower_workgroup_access_to_offsets;
>  
> -   struct {
> -  bool float64;
> -  bool image_ms_array;
> -  bool tessellation;
> -  bool draw_parameters;
> -  bool image_read_without_format;
> -  bool image_write_without_format;
> -  bool int64;
> -  bool multiview;
> -  bool variable_pointers;
> -  bool storage_16bit;
> -   } caps;
> +   struct nir_spirv_supported_capabilities caps;
>  
> struct {
>void (*func)(void *private_data,
> diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
> index b478f6158e2..7da05aa3ee9 100644
> --- a/src/mesa/main/mtypes.h
> +++ b/src/mesa/main/mtypes.h
> @@ -3579,6 +3579,18 @@ struct gl_program_constants
> GLuint MaxShaderStorageBlocks;
>  };
>  
> +struct nir_spirv_supported_capabilities {
> +   bool float64;
> +   bool image_ms_array;
> +   bool tessellation;
> +   bool draw_parameters;
> +   bool image_read_without_format;
> +   bool image_write_without_format;
> +   bool int64;
> +   bool multiview;
> +   bool variable_pointers;
> +   bool storage_16bit;
> +};
>  
>  /**
>   * Constants which may be overridden by device driver during context creation
> -- 
> 2.11.0
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev


signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/2] radv: enable lowering of nir_op_bitfield_insert

2017-12-06 Thread Samuel Pitoiset




On 12/06/2017 01:04 PM, James Legg wrote:

On Tue, 2017-12-05 at 14:24 -0500, Connor Abbott wrote:

lower_bitfield_insert lowers nir_op_bitfield_insert to DX10-style
nir_op_bfi and nir_op_bfm, both of which aren't handled by
ac_nir_to_llvm, so unless I'm missing something this will just break
them even harder. We probably should use this lowering after adding
support for bfi and bfm, since AMD does have native instructions for
bfi and bfm, but first I'd like to see the actual bug fixed. Have you
tried running it with NIR_PRINT=true to pin down which optimization
pass is going wrong?


I've identified the constant folding pass as the culprit and fixed the
incorrect logic for bitfield_insert with this patch:
 nir/opcodes: Fix constant-folding of bitfield_insert
 https://patchwork.freedesktop.org/patch/191977/


Your fix looks much better, thanks!



James


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [RFC] r600/evergreen compute shader + glsl 4.30 support

2017-12-06 Thread Dave Airlie

On 30 November 2017 at 22:06, Gert Wollny  wrote:
> Am Donnerstag, den 30.11.2017, 17:56 +1000 schrieb Dave Airlie:
>> On 30 November 2017 at 17:20, Gert Wollny 
>> wrote:
>> > Am Donnerstag, den 30.11.2017, 09:30 +1000 schrieb Dave Airlie:
>> > > On 29 November 2017 at 22:46, Gert Wollny 
>> > > wrote:
>> > > >
>> > > >
>> > > > I run the arb_compute_shader piglits on BARTS, the piglits
>> > > >
>> > > >basic-texelfetch
>> > > >border-color
>> > > >multiple-workgroups
>> > > >basic-uniform-access
>> > > >multiple-texture-reading
>> > > >simple-barrier
>> > > >
>> > > > result in GPU lockups and, consequently, fail. The other 20
>> > > > tests
>> > > > pass.
>> > >
>> > > Does the attached patch help with the lockups at all?
>> >
>> > no, no changes with the arb_compute_shader tests,
>>
>> Could you give:
>> https://cgit.freedesktop.org/~airlied/mesa/log/?h=r600-wip-cs a spin?
>>
>> I'm guessing WIP hacks might fix it, but I really want to avoid
>> flushing. it might be necessary to reemit a bunch of graphics state.
>
> To compile this I had to switch to LLVM 3.9, because the tree doesn't
> compile with LLVM >= 4.0.
>
> Anyway, it fixes all the arb_compute_shader piglits and I by using
> MESA_GL_VERSION_OVERRIDE=4.3 I am even able to walk around in a
> properly rendered "Alien Isolation" (25-45 FPS, GPU: Radeon 6850 HD,
> CPU: FX 6300).
>

I've pushed a bunch of the code, except for the final enables,

Care to give my r600-gl-4.3 branch a spin?

It has the workaround I think I need to stabilise stuff for now.

Dave.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

1 2 >

1 - 100 of 109 matches

Mail list logo