Re: [Mesa-dev] [PATCH] panfrost: Refactor blend descriptors

2019-05-04 Thread Alyssa Rosenzweig
> The blend shader enable bit is already described in the comments in
> the header; the blend shader is enabled when unk2 == 0.

I'm pretty sure that comment was from you, but thank you ;)

> (the blend shader has
> to be allocated within the same 2^24 byte range as the main shader for
> it to work properly anyways, even on Midgard, which is probably not
> implemented properly on mainline).

Indeed. Mainline Midgard blend shaders work (well, stubbed out so they
just do passthrough without any real blending, but the hardware is
correct). That said, we cap shader memory at 16MB upfront, which
"resolves" this problem.
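
To make the constraint concrete, here is a minimal sketch of the check that the 16MB cap makes trivially true. It assumes the pool itself is aligned to 2^24 bytes (as the header comment says the kernel guarantees); the helper name same_shader_window is hypothetical and is not in the tree.

```c
#include <stdbool.h>
#include <stdint.h>

/* 2^24 bytes = 16 MB, the range named in the thread. */
#define SHADER_WINDOW_BITS 24

/* Hypothetical helper: true when both GPU addresses fall inside the same
 * 2^24-byte aligned window, i.e. they differ only in their low 24 bits. */
static bool
same_shader_window(uint64_t main_shader, uint64_t blend_shader)
{
   return (main_shader >> SHADER_WINDOW_BITS) ==
          (blend_shader >> SHADER_WINDOW_BITS);
}
```

With a single 16MB pool that starts on a 2^24 boundary, every pair of shader addresses passes this check, which is why the cap sidesteps the issue rather than solving it.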

> Maybe it would be better if these functions got passed the
> mali_shader_descriptor itself?

Possibly. I don't have access to any Bifrost machines right now, so I
can't test that.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 2/3] u_dynarray: return 0 on realloc failure

2019-05-04 Thread Caio Marcelo de Oliveira Filho
Hi,

> > diff --git a/src/util/u_dynarray.h b/src/util/u_dynarray.h
> > index b30fd7b1154..f6a81609dbe 100644
> > --- a/src/util/u_dynarray.h
> > +++ b/src/util/u_dynarray.h
> > @@ -85,20 +85,22 @@ util_dynarray_ensure_cap(struct util_dynarray *buf, 
> > unsigned newcap)
> >   buf->capacity = DYN_ARRAY_INITIAL_SIZE;
> >
> >while (newcap > buf->capacity)
> >   buf->capacity *= 2;
> >
> >if (buf->mem_ctx) {
> >   buf->data = reralloc_size(buf->mem_ctx, buf->data, buf->capacity);
> >} else {
> >   buf->data = realloc(buf->data, buf->capacity);
> >}
> > +  if (!buf->data)
> > + return 0;
> 
> To keep buf->data valid, put the new value in a temporary variable and
> copy it into buf->data on success. If realloc and reralloc_size fail,
> the original pointer is still valid, while if we overwrite buf->data
> we are guaranteed to leak the data on failure.

You also want to use a temporary variable for capacity.  If realloc
fails and we keep the old data, we also want to keep the old capacity.
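
Roughly what I have in mind, as a standalone sketch rather than the actual patch: struct dynarray is a simplified stand-in for util_dynarray (no mem_ctx / reralloc_size path), the initial size of 64 is an assumption, and the guard against the doubling loop overflowing is an extra that is not in the current code.

```c
#include <limits.h>
#include <stdlib.h>

#define DYN_ARRAY_INITIAL_SIZE 64 /* assumed value, for illustration */

/* Simplified stand-in for struct util_dynarray. */
struct dynarray {
   void *data;
   unsigned size;
   unsigned capacity;
};

/* Returns a pointer just past the current contents, or NULL on failure.
 * On failure, buf->data and buf->capacity are left untouched. */
static void *
dynarray_ensure_cap(struct dynarray *buf, unsigned newcap)
{
   if (newcap > buf->capacity) {
      /* Work on temporaries so a failed realloc changes nothing. */
      unsigned capacity = buf->capacity ? buf->capacity
                                        : DYN_ARRAY_INITIAL_SIZE;

      while (newcap > capacity) {
         if (capacity > UINT_MAX / 2)
            return NULL; /* doubling would overflow */
         capacity *= 2;
      }

      void *data = realloc(buf->data, capacity);
      if (!data)
         return NULL; /* old block is still valid and still owned by buf */

      buf->data = data;
      buf->capacity = capacity;
   }

   return (char *)buf->data + buf->size;
}
```

The caller can then treat NULL as "nothing changed" and either retry or bail out, without leaking the old allocation.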


Caio
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 3/3] u_dynarray: turn util_dynarray_{grow, resize} into element-oriented macros

2019-05-04 Thread Bas Nieuwenhuizen
On Sat, May 4, 2019 at 3:25 PM Nicolai Hähnle  wrote:
>
> From: Nicolai Hähnle 
>
> The main motivation for this change is API ergonomics: most operations
> on dynarrays are really on elements, not on bytes, so it's weird to have
> grow and resize as the odd operations out.
>
> The secondary motivation is memory safety. Users of the old byte-oriented
> functions would often multiply a number of elements with the element size,
> which could overflow, and checking for overflow is tedious.
>
> With this change, we only need to implement the overflow checks once.
> The checks are cheap: since eltsize is a compile-time constant and the
> functions should be inlined, they only add a single comparison and an
> unlikely branch.
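
For readers following along, the check described above can be sketched as follows. This is an illustration under assumptions, not the code from the patch: struct dynarray, dynarray_grow_bytes, dynarray_grow_checked and the dynarray_grow macro are hypothetical simplified stand-ins for the u_dynarray helpers.

```c
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-in for struct util_dynarray. */
struct dynarray {
   void *data;
   unsigned size;
   unsigned capacity;
};

/* Byte-oriented grow; callers are assumed to pass diff > 0.
 * Returns a pointer to the newly added region, or NULL on failure. */
static void *
dynarray_grow_bytes(struct dynarray *buf, unsigned diff)
{
   if (diff > UINT_MAX - buf->size)
      return NULL; /* size + diff would overflow */

   unsigned newsize = buf->size + diff;
   if (newsize > buf->capacity) {
      void *data = realloc(buf->data, newsize);
      if (!data)
         return NULL;
      buf->data = data;
      buf->capacity = newsize;
   }

   unsigned old_size = buf->size;
   buf->size = newsize;
   return (char *)buf->data + old_size;
}

/* Element-oriented wrapper: eltsize is sizeof(type) at every call site, so
 * after inlining the division folds into a constant and only one comparison
 * remains at run time. */
static inline void *
dynarray_grow_checked(struct dynarray *buf, unsigned nelts, size_t eltsize)
{
   if (nelts > UINT_MAX / eltsize)
      return NULL; /* nelts * eltsize would overflow */
   return dynarray_grow_bytes(buf, nelts * (unsigned)eltsize);
}

#define dynarray_grow(buf, type, nelts) \
   ((type *)dynarray_grow_checked(buf, nelts, sizeof(type)))
```

A call such as dynarray_grow(&arr, float, 4) then replaces the hand-written sizeof(float) * 4 multiplication at the call site.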
> ---
>  .../drivers/nouveau/nv30/nvfx_fragprog.c  |  2 +-
>  src/gallium/drivers/nouveau/nv50/nv50_state.c |  5 +--
>  src/gallium/drivers/nouveau/nvc0/nvc0_state.c |  5 +--
>  .../compiler/brw_nir_analyze_ubo_ranges.c |  2 +-
>  src/mesa/drivers/dri/i965/brw_bufmgr.c|  4 +-
>  src/util/u_dynarray.h | 38 +--
>  6 files changed, 35 insertions(+), 21 deletions(-)
>
> diff --git a/src/gallium/drivers/nouveau/nv30/nvfx_fragprog.c 
> b/src/gallium/drivers/nouveau/nv30/nvfx_fragprog.c
> index 86e3599325e..2bcb62b97d8 100644
> --- a/src/gallium/drivers/nouveau/nv30/nvfx_fragprog.c
> +++ b/src/gallium/drivers/nouveau/nv30/nvfx_fragprog.c
> @@ -66,21 +66,21 @@ release_temps(struct nvfx_fpc *fpc)
> fpc->r_temps &= ~fpc->r_temps_discard;
> fpc->r_temps_discard = 0ULL;
>  }
>
>  static inline struct nvfx_reg
>  nvfx_fp_imm(struct nvfx_fpc *fpc, float a, float b, float c, float d)
>  {
> float v[4] = {a, b, c, d};
> int idx = fpc->imm_data.size >> 4;
>
> -   memcpy(util_dynarray_grow(&fpc->imm_data, sizeof(float) * 4), v, 4 * sizeof(float));
> +   memcpy(util_dynarray_grow(&fpc->imm_data, float, 4), v, 4 * sizeof(float));
> return nvfx_reg(NVFXSR_IMM, idx);
>  }
>
>  static void
>  grow_insns(struct nvfx_fpc *fpc, int size)
>  {
> struct nv30_fragprog *fp = fpc->fp;
>
> fp->insn_len += size;
> fp->insn = realloc(fp->insn, sizeof(uint32_t) * fp->insn_len);
> diff --git a/src/gallium/drivers/nouveau/nv50/nv50_state.c 
> b/src/gallium/drivers/nouveau/nv50/nv50_state.c
> index 55167a27c09..228feced5d1 100644
> --- a/src/gallium/drivers/nouveau/nv50/nv50_state.c
> +++ b/src/gallium/drivers/nouveau/nv50/nv50_state.c
> @@ -1256,24 +1256,23 @@ nv50_set_global_bindings(struct pipe_context *pipe,
>   struct pipe_resource **resources,
>   uint32_t **handles)
>  {
> struct nv50_context *nv50 = nv50_context(pipe);
> struct pipe_resource **ptr;
> unsigned i;
> const unsigned end = start + nr;
>
> if (nv50->global_residents.size <= (end * sizeof(struct pipe_resource 
> *))) {
>const unsigned old_size = nv50->global_residents.size;
> -  const unsigned req_size = end * sizeof(struct pipe_resource *);
> -  util_dynarray_resize(&nv50->global_residents, req_size);
> +  util_dynarray_resize(&nv50->global_residents, struct pipe_resource *, end);
>memset((uint8_t *)nv50->global_residents.data + old_size, 0,
> - req_size - old_size);
> + nv50->global_residents.size - old_size);
> }
>
> if (resources) {
>ptr = util_dynarray_element(
>   &nv50->global_residents, struct pipe_resource *, start);
>for (i = 0; i < nr; ++i) {
>   pipe_resource_reference(&ptr[i], resources[i]);
>   nv50_set_global_handle(handles[i], resources[i]);
>}
> } else {
> diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c 
> b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
> index 12e21862ee0..2ab51c8529e 100644
> --- a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
> +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
> @@ -1363,24 +1363,23 @@ nvc0_set_global_bindings(struct pipe_context *pipe,
>   struct pipe_resource **resources,
>   uint32_t **handles)
>  {
> struct nvc0_context *nvc0 = nvc0_context(pipe);
> struct pipe_resource **ptr;
> unsigned i;
> const unsigned end = start + nr;
>
> if (nvc0->global_residents.size <= (end * sizeof(struct pipe_resource 
> *))) {
>const unsigned old_size = nvc0->global_residents.size;
> -  const unsigned req_size = end * sizeof(struct pipe_resource *);
> -  util_dynarray_resize(&nvc0->global_residents, req_size);
> +  util_dynarray_resize(&nvc0->global_residents, struct pipe_resource *, end);
>memset((uint8_t *)nvc0->global_residents.data + old_size, 0,
> - req_size - old_size);
> + nvc0->global_residents.size - old_size);
> }
>
> if (resources) {
>ptr = util_dynarray_element(
>   &nvc0->global_residents, struct pipe_resource *, start);
>for (i = 0; i < nr; ++i) {
>   pipe_resource_reference(&ptr[i], resources[i]);
>   

Re: [Mesa-dev] [PATCH] panfrost: Refactor blend descriptors

2019-05-04 Thread Connor Abbott
On Sun, May 5, 2019 at 12:14 AM Alyssa Rosenzweig  wrote:
>
> This commit does a fairly large cleanup of blend descriptors, although
> there should not be any functional changes. In particular, we split
> apart the Midgard and Bifrost blend descriptors, since they are
> radically different. From there, we can identify that the Midgard
> descriptor as previously written was really two render targets'
> descriptors stuck together. From this observation, we split the Midgard
> descriptor into what a single RT actually needs. This enables us to
> correctly dump blending configuration for MRT samples on Midgard. It
> also allows the Midgard and Bifrost blend code to peacefully coexist,
> with runtime selection rather than a #ifdef. So, as a bonus, this will
> help the future Bifrost effort, eliminating one major source of
> compile-time architectural divergence.
>
> Signed-off-by: Alyssa Rosenzweig 
> ---
>  .../drivers/panfrost/include/panfrost-job.h   |  56 ---
>  src/gallium/drivers/panfrost/pan_context.c|  31 ++--
>  .../drivers/panfrost/pandecode/decode.c   | 155 +-
>  3 files changed, 122 insertions(+), 120 deletions(-)
>
> diff --git a/src/gallium/drivers/panfrost/include/panfrost-job.h 
> b/src/gallium/drivers/panfrost/include/panfrost-job.h
> index c2d922678b8..71ac054f7c3 100644
> --- a/src/gallium/drivers/panfrost/include/panfrost-job.h
> +++ b/src/gallium/drivers/panfrost/include/panfrost-job.h
> @@ -415,25 +415,37 @@ enum mali_format {
>  #define MALI_READS_ZS (1 << 12)
>  #define MALI_READS_TILEBUFFER (1 << 16)
>
> -struct mali_blend_meta {
> -#ifndef BIFROST
> -/* Base value of 0x200.
> +/* The raw Midgard blend payload can either be an equation or a shader
> + * address, depending on the context */
> +
> +union midgard_blend {
> +mali_ptr shader;
> +struct mali_blend_equation equation;
> +};
> +
> +/* On MRT Midgard systems (using an MFBD), each render target gets its own
> + * blend descriptor */
> +
> +struct midgard_blend_rt {
> +/* Flags base value of 0x200 to enable the render target.
>   * OR with 0x1 for blending (anything other than REPLACE).
> - * OR with 0x2 for programmable blending
> + * OR with 0x2 for programmable blending with 0-2 registers
> + * OR with 0x3 for programmable blending with 2+ registers
>   */
>
> -u64 unk1;
> +u64 flags;
> +union midgard_blend blend;
> +} __attribute__((packed));
>
> -union {
> -struct mali_blend_equation blend_equation_1;
> -mali_ptr blend_shader;
> -};
> +/* On Bifrost systems (all MRT), each render target gets one of these
> + * descriptors */
> +
> +struct bifrost_blend_rt {
> +/* This is likely an analogue of the flags on
> + * midgard_blend_rt */
>
> -u64 zero2;
> -struct mali_blend_equation blend_equation_2;
> -#else
>  u32 unk1; // = 0x200
> -struct mali_blend_equation blend_equation;
> +struct mali_blend_equation equation;
>  /*
>   * - 0x19 normally
>   * - 0x3 when this slot is unused (everything else is 0 except the 
> index)
> @@ -479,11 +491,13 @@ struct mali_blend_meta {
>  * in the same pool as the original shader. The kernel will
>  * make sure this allocation is aligned to 2^24 bytes.
>  */
> -   u32 blend_shader;
> +   u32 shader;
> };
> -#endif
>  } __attribute__((packed));
>
> +/* Descriptor for the shader. Following this is at least one, up to four 
> blend
> + * descriptors for each active render target */
> +
>  struct mali_shader_meta {
>  mali_ptr shader;
>  u16 texture_count;
> @@ -584,17 +598,7 @@ struct mali_shader_meta {
>   * MALI_HAS_BLEND_SHADER to decide how to interpret.
>   */
>
> -union {
> -mali_ptr blend_shader;
> -struct mali_blend_equation blend_equation;
> -};
> -
> -/* There can be up to 4 blend_meta's. None of them are required for
> - * vertex shaders or the non-MRT case for Midgard (so the blob 
> doesn't
> - * allocate any space).
> - */
> -struct mali_blend_meta blend_meta[];
> -
> +union midgard_blend blend;
>  } __attribute__((packed));
>
>  /* This only concerns hardware jobs */
> diff --git a/src/gallium/drivers/panfrost/pan_context.c 
> b/src/gallium/drivers/panfrost/pan_context.c
> index 17b5b75db92..cab7c89ac8b 100644
> --- a/src/gallium/drivers/panfrost/pan_context.c
> +++ b/src/gallium/drivers/panfrost/pan_context.c
> @@ -1000,7 +1000,7 @@ panfrost_emit_for_draw(struct panfrost_context *ctx, 
> bool with_vertex_data)
>   * maybe both are read...?) */
>
>  if (ctx->blend->has_blend_shader) {
> -ctx->fragment_shader_core.blend_shader = 
> ctx->blend->blend_shader;
> + 

Re: [Mesa-dev] [PATCH 2/3] u_dynarray: return 0 on realloc failure

2019-05-04 Thread Bas Nieuwenhuizen
On Sat, May 4, 2019 at 3:25 PM Nicolai Hähnle  wrote:
>
> From: Nicolai Hähnle 
>
> We're not very good at handling out-of-memory conditions in general, but
> this change at least gives the caller the option of handling it.
>
> This happens to fix an error in out-of-memory handling in i965, which has
> the following code in brw_bufmgr.c:
>
>   node = util_dynarray_grow(vma_list, sizeof(struct vma_bucket_node));
>   if (unlikely(!node))
>  return 0ull;
>
> Previously, allocation failure for util_dynarray_grow wouldn't actually
> return NULL when the dynarray was previously non-empty.
> ---
>  src/util/u_dynarray.h | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/src/util/u_dynarray.h b/src/util/u_dynarray.h
> index b30fd7b1154..f6a81609dbe 100644
> --- a/src/util/u_dynarray.h
> +++ b/src/util/u_dynarray.h
> @@ -85,20 +85,22 @@ util_dynarray_ensure_cap(struct util_dynarray *buf, 
> unsigned newcap)
>   buf->capacity = DYN_ARRAY_INITIAL_SIZE;
>
>while (newcap > buf->capacity)
>   buf->capacity *= 2;
>
>if (buf->mem_ctx) {
>   buf->data = reralloc_size(buf->mem_ctx, buf->data, buf->capacity);
>} else {
>   buf->data = realloc(buf->data, buf->capacity);
>}
> +  if (!buf->data)
> + return 0;

To keep buf->data valid, put the new value in a temporary variable and
copy it into buf->data on success. If realloc and reralloc_size fail,
the original pointer is still valid, while if we overwrite buf->data
we are guaranteed to leak the data on failure.
> }
>
> return (void *)((char *)buf->data + buf->size);
>  }
>
>  static inline void *
>  util_dynarray_grow_cap(struct util_dynarray *buf, int diff)
>  {
> return util_dynarray_ensure_cap(buf, buf->size + diff);
>  }
> --
> 2.20.1
>

[Mesa-dev] [PATCH] panfrost: Refactor blend descriptors

2019-05-04 Thread Alyssa Rosenzweig
This commit does a fairly large cleanup of blend descriptors, although
there should not be any functional changes. In particular, we split
apart the Midgard and Bifrost blend descriptors, since they are
radically different. From there, we can identify that the Midgard
descriptor as previously written was really two render targets'
descriptors stuck together. From this observation, we split the Midgard
descriptor into what a single RT actually needs. This enables us to
correctly dump blending configuration for MRT samples on Midgard. It
also allows the Midgard and Bifrost blend code to peacefully coexist,
with runtime selection rather than a #ifdef. So, as a bonus, this will
help the future Bifrost effort, eliminating one major source of
compile-time architectural divergence.
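
As a rough illustration of the per-render-target flags word documented in the header comment in the diff below, here is a small sketch. The enum and function names are hypothetical and not part of this patch; the constants are taken directly from that comment.

```c
#include <stdint.h>

/* Hypothetical classification of how a render target is blended. */
enum blend_kind {
   BLEND_KIND_REPLACE,          /* no blending at all */
   BLEND_KIND_FIXED_FUNCTION,   /* anything other than REPLACE */
   BLEND_KIND_SHADER_FEW_REGS,  /* programmable, 0-2 registers */
   BLEND_KIND_SHADER_MANY_REGS  /* programmable, 2+ registers */
};

/* Build the flags field of midgard_blend_rt as described in the comment:
 * base value 0x200 enables the render target, low bits select the mode. */
static uint64_t
midgard_blend_rt_flags(enum blend_kind kind)
{
   uint64_t flags = 0x200;

   switch (kind) {
   case BLEND_KIND_REPLACE:
      break;
   case BLEND_KIND_FIXED_FUNCTION:
      flags |= 0x1;
      break;
   case BLEND_KIND_SHADER_FEW_REGS:
      flags |= 0x2;
      break;
   case BLEND_KIND_SHADER_MANY_REGS:
      flags |= 0x3;
      break;
   }

   return flags;
}
```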

Signed-off-by: Alyssa Rosenzweig 
---
 .../drivers/panfrost/include/panfrost-job.h   |  56 ---
 src/gallium/drivers/panfrost/pan_context.c|  31 ++--
 .../drivers/panfrost/pandecode/decode.c   | 155 +-
 3 files changed, 122 insertions(+), 120 deletions(-)

diff --git a/src/gallium/drivers/panfrost/include/panfrost-job.h 
b/src/gallium/drivers/panfrost/include/panfrost-job.h
index c2d922678b8..71ac054f7c3 100644
--- a/src/gallium/drivers/panfrost/include/panfrost-job.h
+++ b/src/gallium/drivers/panfrost/include/panfrost-job.h
@@ -415,25 +415,37 @@ enum mali_format {
 #define MALI_READS_ZS (1 << 12)
 #define MALI_READS_TILEBUFFER (1 << 16)
 
-struct mali_blend_meta {
-#ifndef BIFROST
-/* Base value of 0x200.
+/* The raw Midgard blend payload can either be an equation or a shader
+ * address, depending on the context */
+
+union midgard_blend {
+mali_ptr shader;
+struct mali_blend_equation equation;
+};
+
+/* On MRT Midgard systems (using an MFBD), each render target gets its own
+ * blend descriptor */
+
+struct midgard_blend_rt {
+/* Flags base value of 0x200 to enable the render target.
  * OR with 0x1 for blending (anything other than REPLACE).
- * OR with 0x2 for programmable blending
+ * OR with 0x2 for programmable blending with 0-2 registers
+ * OR with 0x3 for programmable blending with 2+ registers
  */
 
-u64 unk1;
+u64 flags;
+union midgard_blend blend;
+} __attribute__((packed));
 
-union {
-struct mali_blend_equation blend_equation_1;
-mali_ptr blend_shader;
-};
+/* On Bifrost systems (all MRT), each render target gets one of these
+ * descriptors */
+
+struct bifrost_blend_rt {
+/* This is likely an analogue of the flags on
+ * midgard_blend_rt */
 
-u64 zero2;
-struct mali_blend_equation blend_equation_2;
-#else
 u32 unk1; // = 0x200
-struct mali_blend_equation blend_equation;
+struct mali_blend_equation equation;
 /*
  * - 0x19 normally
  * - 0x3 when this slot is unused (everything else is 0 except the 
index)
@@ -479,11 +491,13 @@ struct mali_blend_meta {
 * in the same pool as the original shader. The kernel will
 * make sure this allocation is aligned to 2^24 bytes.
 */
-   u32 blend_shader;
+   u32 shader;
};
-#endif
 } __attribute__((packed));
 
+/* Descriptor for the shader. Following this is at least one, up to four blend
+ * descriptors for each active render target */
+
 struct mali_shader_meta {
 mali_ptr shader;
 u16 texture_count;
@@ -584,17 +598,7 @@ struct mali_shader_meta {
  * MALI_HAS_BLEND_SHADER to decide how to interpret.
  */
 
-union {
-mali_ptr blend_shader;
-struct mali_blend_equation blend_equation;
-};
-
-/* There can be up to 4 blend_meta's. None of them are required for
- * vertex shaders or the non-MRT case for Midgard (so the blob doesn't
- * allocate any space).
- */
-struct mali_blend_meta blend_meta[];
-
+union midgard_blend blend;
 } __attribute__((packed));
 
 /* This only concerns hardware jobs */
diff --git a/src/gallium/drivers/panfrost/pan_context.c 
b/src/gallium/drivers/panfrost/pan_context.c
index 17b5b75db92..cab7c89ac8b 100644
--- a/src/gallium/drivers/panfrost/pan_context.c
+++ b/src/gallium/drivers/panfrost/pan_context.c
@@ -1000,7 +1000,7 @@ panfrost_emit_for_draw(struct panfrost_context *ctx, bool 
with_vertex_data)
  * maybe both are read...?) */
 
 if (ctx->blend->has_blend_shader) {
-ctx->fragment_shader_core.blend_shader = 
ctx->blend->blend_shader;
+ctx->fragment_shader_core.blend.shader = 
ctx->blend->blend_shader;
 }
 
 if (ctx->require_sfbd) {
@@ -1010,7 +1010,7 @@ panfrost_emit_for_draw(struct panfrost_context *ctx, bool 
with_vertex_data)
  * modes (so we're able to read back 

[Mesa-dev] [Bug 110611] src/compiler/nir/nir.h : warning: type of ‘nir_get_io_offset_src’ does not match original declaration [-Wlto-type-mismatch]

2019-05-04 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=110611

Bug ID: 110611
   Summary: src/compiler/nir/nir.h : warning: type of
‘nir_get_io_offset_src’ does not match original
declaration [-Wlto-type-mismatch]
   Product: Mesa
   Version: git
  Hardware: All
OS: Linux (All)
Status: NEW
  Severity: normal
  Priority: medium
 Component: Mesa core
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: pedretti.fa...@gmail.com
QA Contact: mesa-dev@lists.freedesktop.org

When building with LTO enabled there are 38 -Wlto-type-mismatch warnings, all
in src/compiler/nir/nir.h , examples:

../src/compiler/nir/nir.h:3150:10: warning: type of ‘nir_get_io_offset_src’
does not match original declaration [-Wlto-type-mismatch]
../src/compiler/nir/nir.h:2490:22: warning: type of
‘nir_intrinsic_instr_create’ does not match original declaration
[-Wlto-type-mismatch]

Full log here:
https://launchpadlibrarian.net/422180941/buildlog_ubuntu-eoan-amd64.mesa_19.1~git1905040730.918994~oibaf~e_BUILDING.txt.gz

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.

[Mesa-dev] [Bug 110608] [bisected][18.3.3 regression] Nouveau on Wayland fails

2019-05-04 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=110608

Marius Bakke  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |INVALID

--- Comment #2 from Marius Bakke  ---
Derp.  I can confirm that wiping $HOME/.cache/mesa_shader_cache fixes the
issue.

Closing this bug, thanks for the quick response!

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Mesa-dev] [PATCH 2/5] ddebug: fix a few MSVC compiler warnings

2019-05-04 Thread Brian Paul
Don't return an expression in void functions.
Replace an unsigned int with proper enum.
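
For context, a minimal sketch of the first pattern, with hypothetical names (inner, wrapper) that are not from the tree. ISO C does not allow a return statement with an expression in a function whose return type is void, even when that expression is itself a void call; C++ permits it and GCC accepts it in C by default, which is presumably how these went unnoticed.

```c
/* Hypothetical stand-in for a driver hook that returns nothing. */
static int calls;

static void
inner(int x)
{
   calls += x;
}

/* Before (rejected or warned about by stricter compilers):
 *
 *    static void wrapper(int x) { return inner(x); }
 *
 * After: simply call the function; the behavior is identical. */
static void
wrapper(int x)
{
   inner(x);
}
```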
---
 src/gallium/auxiliary/driver_ddebug/dd_context.c | 15 ---
 src/gallium/auxiliary/driver_ddebug/dd_screen.c  |  2 +-
 2 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/src/gallium/auxiliary/driver_ddebug/dd_context.c 
b/src/gallium/auxiliary/driver_ddebug/dd_context.c
index 8a3b75a..001a69f 100644
--- a/src/gallium/auxiliary/driver_ddebug/dd_context.c
+++ b/src/gallium/auxiliary/driver_ddebug/dd_context.c
@@ -534,7 +534,8 @@ dd_context_set_shader_images(struct pipe_context *_pipe,
 }
 
 static void
-dd_context_set_shader_buffers(struct pipe_context *_pipe, unsigned shader,
+dd_context_set_shader_buffers(struct pipe_context *_pipe,
+  enum pipe_shader_type shader,
   unsigned start, unsigned num_buffers,
   const struct pipe_shader_buffer *buffers,
   unsigned writable_bitmask)
@@ -680,7 +681,7 @@ dd_context_set_compute_resources(struct pipe_context *_pipe,
 struct pipe_surface **resources)
 {
struct pipe_context *pipe = dd_context(_pipe)->pipe;
-   return pipe->set_compute_resources(pipe, start, count, resources);
+   pipe->set_compute_resources(pipe, start, count, resources);
 }
 
 static void
@@ -690,7 +691,7 @@ dd_context_set_global_binding(struct pipe_context *_pipe,
  uint32_t **handles)
 {
struct pipe_context *pipe = dd_context(_pipe)->pipe;
-   return pipe->set_global_binding(pipe, first, count, resources, handles);
+   pipe->set_global_binding(pipe, first, count, resources, handles);
 }
 
 static void
@@ -700,8 +701,8 @@ dd_context_get_sample_position(struct pipe_context *_pipe,
 {
struct pipe_context *pipe = dd_context(_pipe)->pipe;
 
-   return pipe->get_sample_position(pipe, sample_count, sample_index,
-out_value);
+   pipe->get_sample_position(pipe, sample_count, sample_index,
+ out_value);
 }
 
 static void
@@ -727,7 +728,7 @@ dd_context_set_device_reset_callback(struct pipe_context 
*_pipe,
 {
struct pipe_context *pipe = dd_context(_pipe)->pipe;
 
-   return pipe->set_device_reset_callback(pipe, cb);
+   pipe->set_device_reset_callback(pipe, cb);
 }
 
 static void
@@ -747,7 +748,7 @@ dd_context_dump_debug_state(struct pipe_context *_pipe, 
FILE *stream,
 {
struct pipe_context *pipe = dd_context(_pipe)->pipe;
 
-   return pipe->dump_debug_state(pipe, stream, flags);
+   pipe->dump_debug_state(pipe, stream, flags);
 }
 
 static uint64_t
diff --git a/src/gallium/auxiliary/driver_ddebug/dd_screen.c 
b/src/gallium/auxiliary/driver_ddebug/dd_screen.c
index ce9f697..f3bd079 100644
--- a/src/gallium/auxiliary/driver_ddebug/dd_screen.c
+++ b/src/gallium/auxiliary/driver_ddebug/dd_screen.c
@@ -126,7 +126,7 @@ static void dd_screen_query_memory_info(struct pipe_screen 
*_screen,
 {
struct pipe_screen *screen = dd_screen(_screen)->screen;
 
-   return screen->query_memory_info(screen, info);
+   screen->query_memory_info(screen, info);
 }
 
 static struct pipe_context *
-- 
1.8.5.6


[Mesa-dev] [PATCH 1/5] glsl: s/GLboolean/bool/ to silence MSVC compiler warning

2019-05-04 Thread Brian Paul
It complains about mixing GLboolean and bool in the |= expression.
---
 src/compiler/glsl/glsl_parser_extras.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/compiler/glsl/glsl_parser_extras.cpp 
b/src/compiler/glsl/glsl_parser_extras.cpp
index d99ab3d..41f2a97 100644
--- a/src/compiler/glsl/glsl_parser_extras.cpp
+++ b/src/compiler/glsl/glsl_parser_extras.cpp
@@ -2226,7 +2226,7 @@ do_common_optimization(exec_list *ir, bool linked,
bool native_integers)
 {
const bool debug = false;
-   GLboolean progress = GL_FALSE;
+   bool progress = false;
 
 #define OPT(PASS, ...) do { \
   if (debug) {  \
-- 
1.8.5.6


[Mesa-dev] [PATCH 4/5] gallium/pp: s/uint/enum tgsi_semantic/ to fix MSVC warning

2019-05-04 Thread Brian Paul
---
 src/gallium/auxiliary/postprocess/pp_program.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/postprocess/pp_program.c 
b/src/gallium/auxiliary/postprocess/pp_program.c
index 52786de..4cd3990 100644
--- a/src/gallium/auxiliary/postprocess/pp_program.c
+++ b/src/gallium/auxiliary/postprocess/pp_program.c
@@ -126,7 +126,7 @@ pp_init_prog(struct pp_queue_t *ppq, struct pipe_context 
*pipe,
 
 
{
-  const uint semantic_names[] = { TGSI_SEMANTIC_POSITION,
+  const enum tgsi_semantic semantic_names[] = { TGSI_SEMANTIC_POSITION,
  TGSI_SEMANTIC_GENERIC
   };
   const uint semantic_indexes[] = { 0, 0 };
-- 
1.8.5.6


[Mesa-dev] [Bug 110608] [bisected][18.3.3 regression] Nouveau on Wayland fails

2019-05-04 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=110608

--- Comment #1 from Ilia Mirkin  ---
This feels like a mesa cache format issue. For whatever reason, mesa's internal
mechanisms don't detect that it should invalidate the shader cache.

Try to wipe your $HOME/.cache/mesa_shader_cache directory, and see if the
situation improves.

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Mesa-dev] [PATCH 3/5] noop: s/enum pipe_transfer_usage/unsigned/ to fix MSVC warning

2019-05-04 Thread Brian Paul
The function pointer declaration in pipe_context uses unsigned
for the bitmask.
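
A minimal sketch of the mismatch being fixed, using hypothetical names (demo_context, demo_usage, demo_transfer_map) rather than the real gallium types: the hook in the context struct is declared with an unsigned bitmask, so an implementation declared with an enum parameter has a different function type, and MSVC warns about the assignment.

```c
#include <stddef.h>

/* Hypothetical flag enum; values are OR-ed together into a bitmask. */
enum demo_usage {
   DEMO_USAGE_READ = 1 << 0,
   DEMO_USAGE_WRITE = 1 << 1
};

/* Hypothetical context with a function-pointer hook that takes the
 * bitmask as plain unsigned, mirroring the declaration style described. */
struct demo_context {
   void *(*transfer_map)(struct demo_context *ctx, unsigned usage);
};

static char demo_scratch[16];

/* The implementation uses the same parameter type as the hook, so the
 * assignment below involves no function-pointer type mismatch. */
static void *
demo_transfer_map(struct demo_context *ctx, unsigned usage)
{
   (void)ctx;
   return (usage & DEMO_USAGE_WRITE) ? demo_scratch : NULL;
}
```

Keeping the parameter as unsigned also reflects that a combination of flags is not necessarily a valid enumerator on its own.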
---
 src/gallium/auxiliary/driver_noop/noop_pipe.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/driver_noop/noop_pipe.c 
b/src/gallium/auxiliary/driver_noop/noop_pipe.c
index a6497f0..2a4d3eb 100644
--- a/src/gallium/auxiliary/driver_noop/noop_pipe.c
+++ b/src/gallium/auxiliary/driver_noop/noop_pipe.c
@@ -172,7 +172,7 @@ static void noop_resource_destroy(struct pipe_screen 
*screen,
 static void *noop_transfer_map(struct pipe_context *pipe,
struct pipe_resource *resource,
unsigned level,
-   enum pipe_transfer_usage usage,
+   unsigned usage,
const struct pipe_box *box,
struct pipe_transfer **ptransfer)
 {
-- 
1.8.5.6


[Mesa-dev] [PATCH 5/5] gallium/util: fix two MSVC compiler warnings

2019-05-04 Thread Brian Paul
Remove stray const qualifier.
s/unsigned/enum tgsi_semantic/
---
 src/gallium/auxiliary/util/u_format_zs.h  | 2 +-
 src/gallium/auxiliary/util/u_simple_shaders.c | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/gallium/auxiliary/util/u_format_zs.h 
b/src/gallium/auxiliary/util/u_format_zs.h
index 160919d..bed3c51 100644
--- a/src/gallium/auxiliary/util/u_format_zs.h
+++ b/src/gallium/auxiliary/util/u_format_zs.h
@@ -113,7 +113,7 @@ void
 util_format_z24_unorm_s8_uint_pack_s_8uint(uint8_t *dst_row, unsigned 
dst_stride, const uint8_t *src_row, unsigned src_stride, unsigned width, 
unsigned height);
 
 void
-util_format_z24_unorm_s8_uint_pack_separate(uint8_t *dst_row, unsigned 
dst_stride, const uint32_t *z_src_row, unsigned z_src_stride, const uint8_t 
*s_src_row, unsigned s_src_stride, const unsigned width, unsigned height);
+util_format_z24_unorm_s8_uint_pack_separate(uint8_t *dst_row, unsigned 
dst_stride, const uint32_t *z_src_row, unsigned z_src_stride, const uint8_t 
*s_src_row, unsigned s_src_stride, unsigned width, unsigned height);
 
 void
 util_format_s8_uint_z24_unorm_unpack_z_float(float *dst_row, unsigned 
dst_stride, const uint8_t *src_row, unsigned src_stride, unsigned width, 
unsigned height);
diff --git a/src/gallium/auxiliary/util/u_simple_shaders.c 
b/src/gallium/auxiliary/util/u_simple_shaders.c
index 4046ab1..d62a655 100644
--- a/src/gallium/auxiliary/util/u_simple_shaders.c
+++ b/src/gallium/auxiliary/util/u_simple_shaders.c
@@ -117,8 +117,8 @@ util_make_vertex_passthrough_shader_with_so(struct 
pipe_context *pipe,
 
 void *util_make_layered_clear_vertex_shader(struct pipe_context *pipe)
 {
-   const unsigned semantic_names[] = {TGSI_SEMANTIC_POSITION,
-  TGSI_SEMANTIC_GENERIC};
+   const enum tgsi_semantic semantic_names[] = {TGSI_SEMANTIC_POSITION,
+TGSI_SEMANTIC_GENERIC};
const unsigned semantic_indices[] = {0, 0};
 
return util_make_vertex_passthrough_shader_with_so(pipe, 2, semantic_names,
-- 
1.8.5.6


Re: [Mesa-dev] [PATCH 3/3] u_dynarray: turn util_dynarray_{grow, resize} into element-oriented macros

2019-05-04 Thread Gustaw Smolarczyk
On Sat, 4 May 2019 at 15:25, Nicolai Hähnle wrote:
>
> From: Nicolai Hähnle 
>
> The main motivation for this change is API ergonomics: most operations
> on dynarrays are really on elements, not on bytes, so it's weird to have
> grow and resize as the odd operations out.
>
> The secondary motivation is memory safety. Users of the old byte-oriented
> functions would often multiply a number of elements with the element size,
> which could overflow, and checking for overflow is tedious.
>
> With this change, we only need to implement the overflow checks once.
> The checks are cheap: since eltsize is a compile-time constant and the
> functions should be inlined, they only add a single comparison and an
> unlikely branch.
> ---
>  .../drivers/nouveau/nv30/nvfx_fragprog.c  |  2 +-
>  src/gallium/drivers/nouveau/nv50/nv50_state.c |  5 +--
>  src/gallium/drivers/nouveau/nvc0/nvc0_state.c |  5 +--
>  .../compiler/brw_nir_analyze_ubo_ranges.c |  2 +-
>  src/mesa/drivers/dri/i965/brw_bufmgr.c|  4 +-
>  src/util/u_dynarray.h | 38 +--
>  6 files changed, 35 insertions(+), 21 deletions(-)
>
> diff --git a/src/gallium/drivers/nouveau/nv30/nvfx_fragprog.c 
> b/src/gallium/drivers/nouveau/nv30/nvfx_fragprog.c
> index 86e3599325e..2bcb62b97d8 100644
> --- a/src/gallium/drivers/nouveau/nv30/nvfx_fragprog.c
> +++ b/src/gallium/drivers/nouveau/nv30/nvfx_fragprog.c
> @@ -66,21 +66,21 @@ release_temps(struct nvfx_fpc *fpc)
> fpc->r_temps &= ~fpc->r_temps_discard;
> fpc->r_temps_discard = 0ULL;
>  }
>
>  static inline struct nvfx_reg
>  nvfx_fp_imm(struct nvfx_fpc *fpc, float a, float b, float c, float d)
>  {
> float v[4] = {a, b, c, d};
> int idx = fpc->imm_data.size >> 4;
>
> -   memcpy(util_dynarray_grow(&fpc->imm_data, sizeof(float) * 4), v, 4 * sizeof(float));
> +   memcpy(util_dynarray_grow(&fpc->imm_data, float, 4), v, 4 * sizeof(float));
> return nvfx_reg(NVFXSR_IMM, idx);
>  }
>
>  static void
>  grow_insns(struct nvfx_fpc *fpc, int size)
>  {
> struct nv30_fragprog *fp = fpc->fp;
>
> fp->insn_len += size;
> fp->insn = realloc(fp->insn, sizeof(uint32_t) * fp->insn_len);
> diff --git a/src/gallium/drivers/nouveau/nv50/nv50_state.c 
> b/src/gallium/drivers/nouveau/nv50/nv50_state.c
> index 55167a27c09..228feced5d1 100644
> --- a/src/gallium/drivers/nouveau/nv50/nv50_state.c
> +++ b/src/gallium/drivers/nouveau/nv50/nv50_state.c
> @@ -1256,24 +1256,23 @@ nv50_set_global_bindings(struct pipe_context *pipe,
>   struct pipe_resource **resources,
>   uint32_t **handles)
>  {
> struct nv50_context *nv50 = nv50_context(pipe);
> struct pipe_resource **ptr;
> unsigned i;
> const unsigned end = start + nr;
>
> if (nv50->global_residents.size <= (end * sizeof(struct pipe_resource 
> *))) {
>const unsigned old_size = nv50->global_residents.size;
> -  const unsigned req_size = end * sizeof(struct pipe_resource *);
> -  util_dynarray_resize(&nv50->global_residents, req_size);
> +  util_dynarray_resize(&nv50->global_residents, struct pipe_resource *, end);
>memset((uint8_t *)nv50->global_residents.data + old_size, 0,
> - req_size - old_size);
> + nv50->global_residents.size - old_size);
> }
>
> if (resources) {
>ptr = util_dynarray_element(
>   &nv50->global_residents, struct pipe_resource *, start);
>for (i = 0; i < nr; ++i) {
>   pipe_resource_reference(&ptr[i], resources[i]);
>   nv50_set_global_handle(handles[i], resources[i]);
>}
> } else {
> diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c 
> b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
> index 12e21862ee0..2ab51c8529e 100644
> --- a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
> +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
> @@ -1363,24 +1363,23 @@ nvc0_set_global_bindings(struct pipe_context *pipe,
>   struct pipe_resource **resources,
>   uint32_t **handles)
>  {
> struct nvc0_context *nvc0 = nvc0_context(pipe);
> struct pipe_resource **ptr;
> unsigned i;
> const unsigned end = start + nr;
>
> if (nvc0->global_residents.size <= (end * sizeof(struct pipe_resource 
> *))) {
>const unsigned old_size = nvc0->global_residents.size;
> -  const unsigned req_size = end * sizeof(struct pipe_resource *);
> -  util_dynarray_resize(&nvc0->global_residents, req_size);
> +  util_dynarray_resize(&nvc0->global_residents, struct pipe_resource *, 
> end);
>memset((uint8_t *)nvc0->global_residents.data + old_size, 0,
> - req_size - old_size);
> + nvc0->global_residents.size - old_size);
> }
>
> if (resources) {
>ptr = util_dynarray_element(
>   &nvc0->global_residents, struct pipe_resource *, start);
>for (i = 0; i < nr; ++i) {
>   pipe_resource_reference(&ptr[i], resources[i]);
>   

[Mesa-dev] [Bug 110608] [bisected][18.3.3 regression] Nouveau on Wayland fails

2019-05-04 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=110608

Bug ID: 110608
   Summary: [bisected][18.3.3 regression] Nouveau on Wayland fails
   Product: Mesa
   Version: 18.3
  Hardware: Other
OS: All
Status: NEW
  Severity: normal
  Priority: medium
 Component: glsl-compiler
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: mba...@fastmail.com
QA Contact: intel-3d-b...@lists.freedesktop.org

Greetings,

After updating from Mesa 18.3.1 to 18.3.5, swaywm no longer starts on my
nouveau Nvidia GTX 770.

Sway prints the following error:

sway: ../mesa-18.3.5/src/mesa/program/prog_parameter.c:247:
_mesa_add_parameter: Assertion `0 < size && size <=4' failed.

I bisected it down to this commit:
https://gitlab.freedesktop.org/mesa/mesa/commit/fb78a6cb72270de271f75d6f6c9b5ebadba7a898

However, reverting that on top of my distro-provided Mesa 18.3.5 still yields
the same error.

Mesa 19.0.3 fails with a different message:

sway: ../mesa-19.0.3/src/compiler/glsl/serialize.cpp:555: void
read_uniforms(blob_reader*, gl_shader_program*): Assertion `vec_size +
prog->data->UniformStorage[i].storage <= data +
prog->data->NumUniformDataSlots' failed.

Please let me know what other details I can provide.  Thanks in advance!

This is possibly a duplicate of #109812.

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Mesa-dev] [Bug 110606] [lib32] [vulkan-overlay-layer] build failure

2019-05-04 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=110606

Bug ID: 110606
   Summary: [lib32] [vulkan-overlay-layer] build failure
   Product: Mesa
   Version: git
  Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: NEW
  Severity: major
  Priority: medium
 Component: Drivers/Vulkan/Common
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: lonew...@xs4all.nl
CC: airl...@freedesktop.org, chadvers...@chromium.org,
ja...@jlekstrand.net

Created attachment 144162
  --> https://bugs.freedesktop.org/attachment.cgi?id=144162&action=edit
build log

When building multilib mesa trunk with 
-D vulkan-overlay-layer=true
build fails.
When building x86_64 everything works.
build log attached

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Mesa-dev] [Bug 110607] Vulkan overlay build broken on IA-32

2019-05-04 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=110607

Bug ID: 110607
   Summary: Vulkan overlay build broken on IA-32
   Product: Mesa
   Version: git
  Hardware: x86 (IA32)
OS: Linux (All)
Status: NEW
  Severity: normal
  Priority: medium
 Component: Other
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: tehfr...@gmail.com
QA Contact: mesa-dev@lists.freedesktop.org

Created attachment 144163
  --> https://bugs.freedesktop.org/attachment.cgi?id=144163&action=edit
Partial Mesa IA-32 build log

A recent commit to Mesa broke the IA-32 build of the Vulkan overlay layer,
relevant portion of the log attached.

The fix required would seem to be similar to the one implemented in 3090c6b9.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.

[Mesa-dev] [PATCH 1/6] radeonsi: inline si_shader_binary_read_config into its only caller

2019-05-04 Thread Nicolai Hähnle
From: Nicolai Hähnle 

Since it can only be used for reading the config of an individual,
non-combined shader, it is not very reusable anyway.
---
 src/gallium/drivers/radeonsi/si_shader.c | 21 +++--
 src/gallium/drivers/radeonsi/si_shader.h |  2 --
 2 files changed, 7 insertions(+), 16 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index 757624c52f7..528c34aecba 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -5302,33 +5302,20 @@ void si_shader_dump(struct si_screen *sscreen, const 
struct si_shader *shader,
if (shader->epilog)
si_shader_dump_disassembly(&shader->epilog->binary,
   debug, "epilog", file);
fprintf(file, "\n");
}
 
si_shader_dump_stats(sscreen, shader, processor, file,
 check_debug_option);
 }
 
-bool si_shader_binary_read_config(struct si_shader_binary *binary,
- struct ac_shader_config *conf)
-{
-   struct ac_rtld_binary rtld;
-   if (!ac_rtld_open(&rtld, 1, &binary->elf_buffer, &binary->elf_size))
-   return false;
-
-   bool ok = ac_rtld_read_config(&rtld, conf);
-
-   ac_rtld_close(&rtld);
-   return ok;
-}
-
 static int si_compile_llvm(struct si_screen *sscreen,
   struct si_shader_binary *binary,
   struct ac_shader_config *conf,
   struct ac_llvm_compiler *compiler,
   LLVMModuleRef mod,
   struct pipe_debug_callback *debug,
   unsigned processor,
   const char *name,
   bool less_optimized)
 {
@@ -5350,21 +5337,27 @@ static int si_compile_llvm(struct si_screen *sscreen,
LLVMDisposeMessage(ir);
}
 
if (!si_replace_shader(count, binary)) {
unsigned r = si_llvm_compile(mod, binary, compiler, debug,
 less_optimized);
if (r)
return r;
}
 
-   if (!si_shader_binary_read_config(binary, conf))
+   struct ac_rtld_binary rtld;
+   if (!ac_rtld_open(&rtld, 1, &binary->elf_buffer, &binary->elf_size))
+   return -1;
+
+   bool ok = ac_rtld_read_config(&rtld, conf);
+   ac_rtld_close(&rtld);
+   if (!ok)
return -1;
 
/* Enable 64-bit and 16-bit denormals, because there is no performance
 * cost.
 *
 * If denormals are enabled, all floating-point output modifiers are
 * ignored.
 *
 * Don't enable denormals for 32-bit floats, because:
 * - Floating-point output modifiers would be ignored by the hw.
diff --git a/src/gallium/drivers/radeonsi/si_shader.h 
b/src/gallium/drivers/radeonsi/si_shader.h
index 302de427c04..ef9f5c379d3 100644
--- a/src/gallium/drivers/radeonsi/si_shader.h
+++ b/src/gallium/drivers/radeonsi/si_shader.h
@@ -685,22 +685,20 @@ unsigned si_shader_io_get_unique_index(unsigned 
semantic_name, unsigned index,
 bool si_shader_binary_upload(struct si_screen *sscreen, struct si_shader 
*shader,
 uint64_t scratch_va);
 void si_shader_dump(struct si_screen *sscreen, const struct si_shader *shader,
struct pipe_debug_callback *debug, unsigned processor,
FILE *f, bool check_debug_option);
 void si_shader_dump_stats_for_shader_db(const struct si_shader *shader,
struct pipe_debug_callback *debug);
 void si_multiwave_lds_size_workaround(struct si_screen *sscreen,
  unsigned *lds_size);
 const char *si_get_shader_name(const struct si_shader *shader, unsigned 
processor);
-bool si_shader_binary_read_config(struct si_shader_binary *binary,
- struct ac_shader_config *conf);
 void si_shader_binary_clean(struct si_shader_binary *binary);
 
 /* si_shader_nir.c */
 void si_nir_scan_shader(const struct nir_shader *nir,
struct tgsi_shader_info *info);
 void si_nir_scan_tess_ctrl(const struct nir_shader *nir,
   struct tgsi_tessctrl_info *out);
 void si_lower_nir(struct si_shader_selector *sel);
 
 /* Inline helpers. */
-- 
2.20.1


[Mesa-dev] [PATCH 0/6] amd,radeonsi: link explicit LDS symbols

2019-05-04 Thread Nicolai Hähnle
This series builds on my recent series adding a runtime linker, and now
adds support for layout and relocation of explicit LDS symbols.

Currently, all our uses of LDS have a single LDS base pointer which
is defined either by an inttoptr cast from 0 or as a single global LDS
symbol. This is fine for our current use cases, but it gets tedious
when we want to do more with LDS, such as keeping multiple logically
separate variables in LDS.

(LS/HS shaders are already affected by this issue, because they use LDS
for two conceptually separate things: vertex shader outputs to be read
by the TCS, and TCS outputs in case they are read back for cross-thread
communication. Ironically, since we don't know the LS/HS LDS data sizes
until draw time, this series won't help there.)

This series works in tandem with related changes in LLVM, see the changes
leading up to and including https://reviews.llvm.org/D61494:
- globals in the LDS address space are written out as specially marked
  symbols to the ELF object by LLVM
- the Mesa rtld combines those symbols with driver-specified "shared"
  LDS symbols, where "shared" means shared between multiple shader
  parts
- rtld calculates a layout for the objects in LDS: shared symbols
  first, followed by private, per-shader-part symbols that can alias,
  followed by the special __lds_end symbol marking the end of LDS
  memory
- rtld resolves any relocations
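
The layout step in the list above can be sketched roughly as follows. This
is only an illustration under stated assumptions, not the code from the
series: the names lds_sym, lds_part, place and layout_lds are hypothetical,
alignments are assumed to be nonzero powers of two, and the sorting by
alignment and the overflow checks that a real implementation needs are
left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical symbol record; not the struct used by the series. */
struct lds_sym {
   uint64_t size;
   uint32_t align;   /* assumed to be a nonzero power of two */
   uint64_t offset;  /* filled in by the layout */
};

/* Hypothetical per-shader-part list of private symbols. */
struct lds_part {
   struct lds_sym *syms;
   unsigned num_syms;
};

static uint64_t align_up(uint64_t v, uint32_t a)
{
   return (v + a - 1) & ~(uint64_t)(a - 1);
}

/* Place symbols one after another starting at `start`; return the end. */
static uint64_t place(struct lds_sym *syms, unsigned n, uint64_t start)
{
   uint64_t end = start;
   for (unsigned i = 0; i < n; i++) {
      end = align_up(end, syms[i].align);
      syms[i].offset = end;
      end += syms[i].size;
   }
   return end;
}

/* Shared symbols first; the private symbols of every part start right
 * after the shared block, so different parts may alias each other.
 * The returned value plays the role of __lds_end. */
static uint64_t layout_lds(struct lds_sym *shared, unsigned num_shared,
                           struct lds_part *parts, unsigned num_parts,
                           uint32_t end_align)
{
   uint64_t shared_end = place(shared, num_shared, 0);
   uint64_t lds_end = shared_end;

   for (unsigned p = 0; p < num_parts; p++) {
      uint64_t part_end = place(parts[p].syms, parts[p].num_syms, shared_end);
      if (part_end > lds_end)
         lds_end = part_end;
   }
   return align_up(lds_end, end_align);
}
```

With one 100-byte shared symbol and two parts that each have one private
symbol, both private symbols are placed just past offset 100 and overlap,
and the end marker is rounded up to the requested alignment.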

For a smooth upgrade with Mesa master and LLVM trunk, the plan to upstream
these changes is:

1. Land at least the first two patches of this series, which add rtld
   support for the new LDS symbols.
2. Land the LLVM changes for generating the symbols in the ELF
3. Land the remainder of this series (this should mostly be possible
   earlier, actually).

Please review!

Thanks,
Nicolai
--
 src/amd/common/ac_rtld.c | 210 +++--
 src/amd/common/ac_rtld.h |  39 ++-
 src/gallium/drivers/radeonsi/si_compute.c|   9 +-
 src/gallium/drivers/radeonsi/si_debug.c  |  22 +-
 src/gallium/drivers/radeonsi/si_shader.c | 210 +++--
 src/gallium/drivers/radeonsi/si_shader.h |  26 +-
 src/gallium/drivers/radeonsi/si_state_draw.c |   5 +
 .../drivers/radeonsi/si_state_shaders.c  |  31 +--
 8 files changed, 431 insertions(+), 121 deletions(-)



[Mesa-dev] [PATCH 2/6] amd/rtld: layout and relocate LDS symbols

2019-05-04 Thread Nicolai Hähnle
From: Nicolai Hähnle 

Upcoming changes to LLVM will emit LDS objects as symbols in the ELF
symbol table, with relocations that will be resolved with this change.

Callers will also be able to define LDS symbols that are shared between
shader parts. This will be used by radeonsi for the ESGS ring in gfx9+
merged shaders.
---
 src/amd/common/ac_rtld.c  | 210 --
 src/amd/common/ac_rtld.h  |  39 +++-
 src/gallium/drivers/radeonsi/si_compute.c |   9 +-
 src/gallium/drivers/radeonsi/si_debug.c   |  22 +-
 src/gallium/drivers/radeonsi/si_shader.c  |  61 +++--
 src/gallium/drivers/radeonsi/si_shader.h  |   5 +-
 .../drivers/radeonsi/si_state_shaders.c   |   2 +-
 7 files changed, 296 insertions(+), 52 deletions(-)

diff --git a/src/amd/common/ac_rtld.c b/src/amd/common/ac_rtld.c
index 4e0468d2062..3df7b3ba51f 100644
--- a/src/amd/common/ac_rtld.c
+++ b/src/amd/common/ac_rtld.c
@@ -24,25 +24,31 @@
 #include "ac_rtld.h"
 
 #include 
 #include 
 #include 
 #include 
 #include 
 #include 
 
 #include "ac_binary.h"
+#include "ac_gpu_info.h"
+#include "util/u_dynarray.h"
 #include "util/u_math.h"
 
 // Old distributions may not have this enum constant
 #define MY_EM_AMDGPU 224
 
+#ifndef STT_AMDGPU_LDS
+#define STT_AMDGPU_LDS 13
+#endif
+
 #ifndef R_AMDGPU_NONE
 #define R_AMDGPU_NONE 0
 #define R_AMDGPU_ABS32_LO 1
 #define R_AMDGPU_ABS32_HI 2
 #define R_AMDGPU_ABS64 3
 #define R_AMDGPU_REL32 4
 #define R_AMDGPU_REL64 5
 #define R_AMDGPU_ABS32 6
 #define R_AMDGPU_GOTPCREL 7
 #define R_AMDGPU_GOTPCREL32_LO 8
@@ -97,41 +103,155 @@ static void report_elf_errorf(const char *fmt, ...) 
PRINTFLIKE(1, 2);
 static void report_elf_errorf(const char *fmt, ...)
 {
va_list va;
va_start(va, fmt);
report_erroraf(fmt, va);
va_end(va);
 
fprintf(stderr, "ELF error: %s\n", elf_errmsg(elf_errno()));
 }
 
+/**
+ * Find a symbol in a dynarray of struct ac_rtld_symbol by \p name and shader
+ * \p part_idx.
+ */
+static const struct ac_rtld_symbol *find_symbol(const struct util_dynarray 
*symbols,
+   const char *name, unsigned 
part_idx)
+{
+   util_dynarray_foreach(symbols, struct ac_rtld_symbol, symbol) {
+   if ((symbol->part_idx == ~0u || symbol->part_idx == part_idx) &&
+   !strcmp(name, symbol->name))
+   return symbol;
+   }
+   return 0;
+}
+
+static int compare_symbol_by_align(const void *lhsp, const void *rhsp)
+{
+   const struct ac_rtld_symbol *lhs = lhsp;
+   const struct ac_rtld_symbol *rhs = rhsp;
+   if (rhs->align > lhs->align)
+   return -1;
+   if (rhs->align < lhs->align)
+   return 1;
+   return 0;
+}
+
+/**
+ * Sort the given symbol list by decreasing alignment and assign offsets.
+ */
+static bool layout_symbols(struct ac_rtld_symbol *symbols, unsigned 
num_symbols,
+  uint64_t *ptotal_size)
+{
+   qsort(symbols, num_symbols, sizeof(*symbols), compare_symbol_by_align);
+
+   uint64_t total_size = *ptotal_size;
+
+   for (unsigned i = 0; i < num_symbols; ++i) {
+   struct ac_rtld_symbol *s = &symbols[i];
+   assert(util_is_power_of_two_nonzero(s->align));
+
+   total_size = align64(total_size, s->align);
+   s->offset = total_size;
+
+   if (total_size + s->size < total_size) {
+   report_errorf("%s: size overflow", __FUNCTION__);
+   return false;
+   }
+
+   total_size += s->size;
+   }
+
+   *ptotal_size = total_size;
+   return true;
+}
+
+/**
+ * Read LDS symbols from the given \p section of the ELF of \p part and append
+ * them to the LDS symbols list.
+ *
+ * Shared LDS symbols are filtered out.
+ */
+static bool read_private_lds_symbols(struct ac_rtld_binary *binary,
+unsigned part_idx,
+Elf_Scn *section,
+uint32_t *lds_end_align)
+{
+#define report_elf_if(cond) \
+   do { \
+   if ((cond)) { \
+   report_errorf(#cond); \
+   return false; \
+   } \
+   } while (false)
+
+   struct ac_rtld_part *part = &binary->parts[part_idx];
+   Elf64_Shdr *shdr = elf64_getshdr(section);
+   uint32_t strtabidx = shdr->sh_link;
+   Elf_Data *symbols_data = elf_getdata(section, NULL);
+   report_elf_if(!symbols_data);
+
+   const Elf64_Sym *symbol = symbols_data->d_buf;
+   size_t num_symbols = symbols_data->d_size / sizeof(Elf64_Sym);
+
+   for (size_t j = 0; j < num_symbols; ++j, ++symbol) {
+   if (ELF64_ST_TYPE(symbol->st_info) != STT_AMDGPU_LDS)
+   continue;
+
+   report_elf_if(symbol->st_size > 1u << 29);
+
+   struct ac_rtld_symbol s = {};
+  

[Mesa-dev] [PATCH 5/6] radeonsi: use an explicit symbol for the LSHS LDS memory

2019-05-04 Thread Nicolai Hähnle
From: Nicolai Hähnle 

---
 src/gallium/drivers/radeonsi/si_shader.c | 17 +++--
 src/gallium/drivers/radeonsi/si_state_draw.c |  5 +
 2 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index d127b525963..0cf4d01a36f 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -4842,22 +4842,35 @@ static void create_function(struct si_shader_context 
*ctx)
 
for (i = 0; i < fninfo.num_sgpr_params; ++i)
shader->info.num_input_sgprs += 
ac_get_type_size(fninfo.types[i]) / 4;
 
for (; i < fninfo.num_params; ++i)
shader->info.num_input_vgprs += 
ac_get_type_size(fninfo.types[i]) / 4;
 
assert(shader->info.num_input_vgprs >= num_prolog_vgprs);
shader->info.num_input_vgprs -= num_prolog_vgprs;
 
-   if (shader->key.as_ls || ctx->type == PIPE_SHADER_TESS_CTRL)
-   ac_declare_lds_as_pointer(&ctx->ac);
+   if (shader->key.as_ls || ctx->type == PIPE_SHADER_TESS_CTRL) {
+   if (USE_LDS_SYMBOLS && HAVE_LLVM >= 0x0900) {
+   /* The LSHS size is not known until draw time, so we 
append it
+* at the end of whatever LDS use there may be in the 
rest of
+* the shader (currently none, unless LLVM decides to 
do its
+* own LDS-based lowering).
+*/
+   ctx->ac.lds = LLVMAddGlobalInAddressSpace(
+   ctx->ac.module, LLVMArrayType(ctx->i32, 0),
+   "__lds_end", AC_ADDR_SPACE_LDS);
+   LLVMSetAlignment(ctx->ac.lds, 256);
+   } else {
+   ac_declare_lds_as_pointer(&ctx->ac);
+   }
+   }
 }
 
 /**
  * Load ESGS and GSVS ring buffer resource descriptors and save the variables
  * for later use.
  */
 static void preload_ring_buffers(struct si_shader_context *ctx)
 {
LLVMBuilderRef builder = ctx->ac.builder;
 
diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c 
b/src/gallium/drivers/radeonsi/si_state_draw.c
index 8e01e1b35e1..011aaf18ab1 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -244,20 +244,25 @@ static void si_emit_derived_tess_state(struct si_context 
*sctx,
} else {
assert(lds_size <= 32768);
lds_size = align(lds_size, 256) / 256;
}
 
/* Set SI_SGPR_VS_STATE_BITS. */
sctx->current_vs_state &= C_VS_STATE_LS_OUT_PATCH_SIZE &
  C_VS_STATE_LS_OUT_VERTEX_SIZE;
sctx->current_vs_state |= tcs_in_layout;
 
+   /* We should be able to support in-shader LDS use with LLVM >= 9
+* by just adding the lds_sizes together, but it has never
+* been tested. */
+   assert(ls_current->config.lds_size == 0);
+
if (sctx->chip_class >= GFX9) {
unsigned hs_rsrc2 = ls_current->config.rsrc2 |
S_00B42C_LDS_SIZE(lds_size);
 
radeon_set_sh_reg(cs, R_00B42C_SPI_SHADER_PGM_RSRC2_HS, 
hs_rsrc2);
 
/* Set userdata SGPRs for merged LS-HS. */
radeon_set_sh_reg_seq(cs,
  R_00B430_SPI_SHADER_USER_DATA_LS_0 +
  GFX9_SGPR_TCS_OFFCHIP_LAYOUT * 4, 3);
-- 
2.20.1


[Mesa-dev] [PATCH 4/6] radeonsi: rename lds_{load, store} to lshs_lds_{load, store}

2019-05-04 Thread Nicolai Hähnle
From: Nicolai Hähnle 

These functions are now only used in LS/HS shaders (both separate and
merged).
---
 src/gallium/drivers/radeonsi/si_shader.c | 33 
 1 file changed, 16 insertions(+), 17 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index f95a96f2458..d127b525963 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -999,68 +999,68 @@ static LLVMValueRef buffer_load(struct 
lp_build_tgsi_context *bld_base,
value = ac_build_buffer_load(&ctx->ac, buffer, 1, NULL, base, offset,
  swizzle * 4, 1, 0, can_speculate, false);
 
value2 = ac_build_buffer_load(&ctx->ac, buffer, 1, NULL, base, offset,
   swizzle * 4 + 4, 1, 0, can_speculate, false);
 
return si_llvm_emit_fetch_64bit(bld_base, type, value, value2);
 }
 
 /**
- * Load from LDS.
+ * Load from LSHS LDS storage.
  *
  * \param type output value type
  * \param swizzle  offset (typically 0..3); it can be ~0, which loads a 
vec4
  * \param dw_addr  address in dwords
  */
-static LLVMValueRef lds_load(struct lp_build_tgsi_context *bld_base,
+static LLVMValueRef lshs_lds_load(struct lp_build_tgsi_context *bld_base,
 LLVMTypeRef type, unsigned swizzle,
 LLVMValueRef dw_addr)
 {
struct si_shader_context *ctx = si_shader_context(bld_base);
LLVMValueRef value;
 
if (swizzle == ~0) {
LLVMValueRef values[TGSI_NUM_CHANNELS];
 
for (unsigned chan = 0; chan < TGSI_NUM_CHANNELS; chan++)
-   values[chan] = lds_load(bld_base, type, chan, dw_addr);
+   values[chan] = lshs_lds_load(bld_base, type, chan, 
dw_addr);
 
return ac_build_gather_values(&ctx->ac, values,
  TGSI_NUM_CHANNELS);
}
 
/* Split 64-bit loads. */
if (llvm_type_is_64bit(ctx, type)) {
LLVMValueRef lo, hi;
 
-   lo = lds_load(bld_base, ctx->i32, swizzle, dw_addr);
-   hi = lds_load(bld_base, ctx->i32, swizzle + 1, dw_addr);
+   lo = lshs_lds_load(bld_base, ctx->i32, swizzle, dw_addr);
+   hi = lshs_lds_load(bld_base, ctx->i32, swizzle + 1, dw_addr);
return si_llvm_emit_fetch_64bit(bld_base, type, lo, hi);
}
 
dw_addr = LLVMBuildAdd(ctx->ac.builder, dw_addr,
   LLVMConstInt(ctx->i32, swizzle, 0), "");
 
value = ac_lds_load(&ctx->ac, dw_addr);
 
return LLVMBuildBitCast(ctx->ac.builder, value, type, "");
 }
 
 /**
- * Store to LDS.
+ * Store to LSHS LDS storage.
  *
  * \param swizzle  offset (typically 0..3)
  * \param dw_addr  address in dwords
  * \param value    value to store
  */
-static void lds_store(struct si_shader_context *ctx,
+static void lshs_lds_store(struct si_shader_context *ctx,
  unsigned dw_offset_imm, LLVMValueRef dw_addr,
  LLVMValueRef value)
 {
dw_addr = LLVMBuildAdd(ctx->ac.builder, dw_addr,
   LLVMConstInt(ctx->i32, dw_offset_imm, 0), "");
 
ac_lds_store(&ctx->ac, dw_addr, value);
 }
 
 enum si_tess_ring {
@@ -1110,21 +1110,21 @@ static LLVMValueRef fetch_input_tcs(
const struct tgsi_full_src_register *reg,
enum tgsi_opcode_type type, unsigned swizzle_in)
 {
struct si_shader_context *ctx = si_shader_context(bld_base);
LLVMValueRef dw_addr, stride;
unsigned swizzle = swizzle_in & 0x;
stride = get_tcs_in_vertex_dw_stride(ctx);
dw_addr = get_tcs_in_current_patch_offset(ctx);
dw_addr = get_dw_address(ctx, NULL, reg, stride, dw_addr);
 
-   return lds_load(bld_base, tgsi2llvmtype(bld_base, type), swizzle, 
dw_addr);
+   return lshs_lds_load(bld_base, tgsi2llvmtype(bld_base, type), swizzle, 
dw_addr);
 }
 
 static LLVMValueRef si_nir_load_tcs_varyings(struct ac_shader_abi *abi,
 LLVMTypeRef type,
 LLVMValueRef vertex_index,
 LLVMValueRef param_index,
 unsigned const_index,
 unsigned location,
 unsigned driver_location,
 unsigned component,
@@ -1177,21 +1177,21 @@ static LLVMValueRef si_nir_load_tcs_varyings(struct 
ac_shader_abi *abi,
  names, indices,
  is_patch);
 
LLVMValueRef value[4];
for (unsigned i = 0; i < num_components; i++) {
unsigned offset = i;
if 

[Mesa-dev] [PATCH 3/6] radeonsi/gfx9: declare LDS ESGS ring as an explicit symbol on LLVM >= 9

2019-05-04 Thread Nicolai Hähnle
From: Nicolai Hähnle 

This will make it easier to use LDS for other purposes in geometry
shaders in the future.

The lifetime of the esgs_ring variable is as follows:
- declared as [0 x i32] while compiling shader parts or monolithic shaders
- just before uploading, gfx9_get_gs_info computes (among other things)
  the final ESGS ring size (this depends on both the ES and the GS shader)
- during upload, the "esgs_ring" symbol is given to ac_rtld as a shared
  LDS symbol, which will lead to correctly laying out the LDS including
  other LDS objects that may be defined in the future
- si_shader_gs uses shader->config.lds_size as the LDS size

This change depends on the LLVM changes for emitting LDS symbols into
the ELF file.
---
 src/gallium/drivers/radeonsi/si_shader.c  | 82 +++
 src/gallium/drivers/radeonsi/si_shader.h  | 19 +
 .../drivers/radeonsi/si_state_shaders.c   | 29 ++-
 3 files changed, 94 insertions(+), 36 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index 6968038d4d0..f95a96f2458 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -1527,23 +1527,36 @@ LLVMValueRef si_llvm_load_input_gs(struct ac_shader_abi 
*abi,
break;
case 2:
vtx_offset = si_unpack_param(ctx, 
ctx->param_gs_vtx45_offset,
  index % 2 ? 16 : 0, 16);
break;
default:
assert(0);
return NULL;
}
 
+   unsigned offset = param * 4 + swizzle;
vtx_offset = LLVMBuildAdd(ctx->ac.builder, vtx_offset,
- LLVMConstInt(ctx->i32, param * 4, 0), 
"");
-   return lds_load(bld_base, type, swizzle, vtx_offset);
+ LLVMConstInt(ctx->i32, offset, 
false), "");
+
+   LLVMValueRef ptr = ac_build_gep0(&ctx->ac, ctx->esgs_ring, 
vtx_offset);
+   LLVMValueRef value = LLVMBuildLoad(ctx->ac.builder, ptr, "");
+   if (llvm_type_is_64bit(ctx, type)) {
+   ptr = LLVMBuildGEP(ctx->ac.builder, ptr,
+  &ctx->ac.i32_1, 1, "");
+   LLVMValueRef values[2] = {
+   value,
+   LLVMBuildLoad(ctx->ac.builder, ptr, "")
+   };
+   value = ac_build_gather_values(&ctx->ac, values, 2);
+   }
+   return LLVMBuildBitCast(ctx->ac.builder, value, type, "");
}
 
/* GFX6: input load from the ESGS ring in memory. */
if (swizzle == ~0) {
LLVMValueRef values[TGSI_NUM_CHANNELS];
unsigned chan;
for (chan = 0; chan < TGSI_NUM_CHANNELS; chan++) {
values[chan] = si_llvm_load_input_gs(abi, input_index, 
vtx_offset_param,
 type, chan);
}
@@ -3424,21 +3437,23 @@ static void si_llvm_emit_es_epilogue(struct 
ac_shader_abi *abi,
 
for (chan = 0; chan < 4; chan++) {
if (!(info->output_usagemask[i] & (1 << chan)))
continue;
 
LLVMValueRef out_val = LLVMBuildLoad(ctx->ac.builder, 
addrs[4 * i + chan], "");
out_val = ac_to_integer(&ctx->ac, out_val);
 
/* GFX9 has the ESGS ring in LDS. */
if (ctx->screen->info.chip_class >= GFX9) {
-   lds_store(ctx, param * 4 + chan, lds_base, 
out_val);
+   LLVMValueRef idx = LLVMConstInt(ctx->i32, param 
* 4 + chan, false);
+   idx = LLVMBuildAdd(ctx->ac.builder, lds_base, 
idx, "");
+   ac_build_indexed_store(&ctx->ac, 
ctx->esgs_ring, idx, out_val);
continue;
}
 
ac_build_buffer_store_dword(&ctx->ac,
ctx->esgs_ring,
out_val, 1, NULL, soffset,
(4 * param + chan) * 4,
1, 1, true, true);
}
}
@@ -4828,47 +4843,62 @@ static void create_function(struct si_shader_context 
*ctx)
 
for (i = 0; i < fninfo.num_sgpr_params; ++i)
shader->info.num_input_sgprs += 
ac_get_type_size(fninfo.types[i]) / 4;
 
for (; i < fninfo.num_params; ++i)
shader->info.num_input_vgprs += 
ac_get_type_size(fninfo.types[i]) / 4;
 
assert(shader->info.num_input_vgprs >= num_prolog_vgprs);

[Mesa-dev] [PATCH 6/6] radeonsi: raise the alignment of LDS memory for compute shaders

2019-05-04 Thread Nicolai Hähnle
From: Nicolai Hähnle 

This implies that the memory will always be at address 0, which allows
LLVM to generate slightly better code.
---
 src/gallium/drivers/radeonsi/si_shader.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index 0cf4d01a36f..91f4c177bd0 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -2201,21 +2201,21 @@ void si_declare_compute_memory(struct si_shader_context 
*ctx)
 
LLVMTypeRef i8p = LLVMPointerType(ctx->i8, AC_ADDR_SPACE_LDS);
LLVMValueRef var;
 
assert(!ctx->ac.lds);
 
var = LLVMAddGlobalInAddressSpace(ctx->ac.module,
  LLVMArrayType(ctx->i8, lds_size),
  "compute_lds",
  AC_ADDR_SPACE_LDS);
-   LLVMSetAlignment(var, 4);
+   LLVMSetAlignment(var, 64 * 1024);
 
ctx->ac.lds = LLVMBuildBitCast(ctx->ac.builder, var, i8p, "");
 }
 
 void si_tgsi_declare_compute_memory(struct si_shader_context *ctx,
const struct tgsi_full_declaration *decl)
 {
assert(decl->Declaration.MemType == TGSI_MEMORY_TYPE_SHARED);
assert(decl->Range.First == decl->Range.Last);
 
-- 
2.20.1


[Mesa-dev] [PATCH 2/3] u_dynarray: return 0 on realloc failure

2019-05-04 Thread Nicolai Hähnle
From: Nicolai Hähnle 

We're not very good at handling out-of-memory conditions in general, but
this change at least gives the caller the option of handling it.

This happens to fix an error in out-of-memory handling in i965, which has
the following code in brw_bufmgr.c:

  node = util_dynarray_grow(vma_list, sizeof(struct vma_bucket_node));
  if (unlikely(!node))
 return 0ull;

Previously, allocation failure for util_dynarray_grow wouldn't actually
return NULL when the dynarray was previously non-empty.
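
To illustrate: the function returns data + size, so with a NULL data
pointer and a non-zero size the result is a small non-NULL pointer unless
the failure is checked explicitly. The sketch below is a variation, not
this patch: it also keeps the old buffer and capacity intact on failure by
going through temporaries. byte_array and its helpers are hypothetical
stand-ins for the real dynarray, and the initial capacity of 64 is an
arbitrary choice.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical stand-in for the dynarray; sizes are in bytes. */
struct byte_array {
   void *data;
   unsigned size;      /* bytes in use */
   unsigned capacity;  /* bytes allocated */
};

/* Make room for `newcap` bytes. Returns a pointer to the current end of
 * the data, or NULL on failure; on failure `buf` is left unchanged. */
static void *byte_array_ensure_cap(struct byte_array *buf, unsigned newcap)
{
   if (newcap > buf->capacity) {
      unsigned capacity = buf->capacity ? buf->capacity : 64;

      while (newcap > capacity) {
         if (capacity > UINT_MAX / 2)
            return NULL;   /* doubling would overflow */
         capacity *= 2;
      }

      /* Use a temporary so the old buffer is not lost if realloc fails. */
      void *data = realloc(buf->data, capacity);
      if (!data)
         return NULL;

      buf->data = data;
      buf->capacity = capacity;
   }
   return (char *)buf->data + buf->size;
}

/* Append `diff` bytes; returns a pointer to the new region or NULL. */
static void *byte_array_grow(struct byte_array *buf, unsigned diff)
{
   if (diff > UINT_MAX - buf->size)
      return NULL;

   void *p = byte_array_ensure_cap(buf, buf->size + diff);
   if (!p)
      return NULL;

   buf->size += diff;
   return p;
}
```

A caller can then test the result of byte_array_grow against NULL, as the
i965 code quoted above already tries to do.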
---
 src/util/u_dynarray.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/util/u_dynarray.h b/src/util/u_dynarray.h
index b30fd7b1154..f6a81609dbe 100644
--- a/src/util/u_dynarray.h
+++ b/src/util/u_dynarray.h
@@ -85,20 +85,22 @@ util_dynarray_ensure_cap(struct util_dynarray *buf, 
unsigned newcap)
  buf->capacity = DYN_ARRAY_INITIAL_SIZE;
 
   while (newcap > buf->capacity)
  buf->capacity *= 2;
 
   if (buf->mem_ctx) {
  buf->data = reralloc_size(buf->mem_ctx, buf->data, buf->capacity);
   } else {
  buf->data = realloc(buf->data, buf->capacity);
   }
+  if (!buf->data)
+ return 0;
}
 
return (void *)((char *)buf->data + buf->size);
 }
 
 static inline void *
 util_dynarray_grow_cap(struct util_dynarray *buf, int diff)
 {
return util_dynarray_ensure_cap(buf, buf->size + diff);
 }
-- 
2.20.1


[Mesa-dev] [PATCH 3/3] u_dynarray: turn util_dynarray_{grow, resize} into element-oriented macros

2019-05-04 Thread Nicolai Hähnle
From: Nicolai Hähnle 

The main motivation for this change is API ergonomics: most operations
on dynarrays are really on elements, not on bytes, so it's weird to have
grow and resize as the odd operations out.

The secondary motivation is memory safety. Users of the old byte-oriented
functions would often multiply a number of elements with the element size,
which could overflow, and checking for overflow is tedious.

With this change, we only need to implement the overflow checks once.
The checks are cheap: since eltsize is a compile-time constant and the
functions should be inlined, they only add a single comparison and an
unlikely branch.
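
As a rough illustration of such a check (a sketch only; the helper
checked_elt_bytes and the macro ELT_BYTES are hypothetical names, not the
ones introduced by this patch):

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Compute nr * eltsize as a byte count that must fit in `unsigned`.
 * Returns 1 and writes *out_bytes on success, 0 on overflow. With a
 * constant eltsize the division folds to a constant, leaving a single
 * comparison. */
static inline int checked_elt_bytes(unsigned nr, size_t eltsize,
                                    unsigned *out_bytes)
{
   if (eltsize != 0 && nr > UINT_MAX / eltsize)
      return 0;

   *out_bytes = (unsigned)(nr * eltsize);
   return 1;
}

/* Element-oriented wrapper: the caller passes a type and a count. */
#define ELT_BYTES(type, nr, out_bytes) \
   checked_elt_bytes((nr), sizeof(type), (out_bytes))
```

Callers pass an element type and a count and never write the
multiplication themselves, so the check lives in one place.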
---
 .../drivers/nouveau/nv30/nvfx_fragprog.c  |  2 +-
 src/gallium/drivers/nouveau/nv50/nv50_state.c |  5 +--
 src/gallium/drivers/nouveau/nvc0/nvc0_state.c |  5 +--
 .../compiler/brw_nir_analyze_ubo_ranges.c |  2 +-
 src/mesa/drivers/dri/i965/brw_bufmgr.c|  4 +-
 src/util/u_dynarray.h | 38 +--
 6 files changed, 35 insertions(+), 21 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nv30/nvfx_fragprog.c 
b/src/gallium/drivers/nouveau/nv30/nvfx_fragprog.c
index 86e3599325e..2bcb62b97d8 100644
--- a/src/gallium/drivers/nouveau/nv30/nvfx_fragprog.c
+++ b/src/gallium/drivers/nouveau/nv30/nvfx_fragprog.c
@@ -66,21 +66,21 @@ release_temps(struct nvfx_fpc *fpc)
fpc->r_temps &= ~fpc->r_temps_discard;
fpc->r_temps_discard = 0ULL;
 }
 
 static inline struct nvfx_reg
 nvfx_fp_imm(struct nvfx_fpc *fpc, float a, float b, float c, float d)
 {
float v[4] = {a, b, c, d};
int idx = fpc->imm_data.size >> 4;
 
-   memcpy(util_dynarray_grow(&fpc->imm_data, sizeof(float) * 4), v, 4 * 
sizeof(float));
+   memcpy(util_dynarray_grow(&fpc->imm_data, float, 4), v, 4 * sizeof(float));
return nvfx_reg(NVFXSR_IMM, idx);
 }
 
 static void
 grow_insns(struct nvfx_fpc *fpc, int size)
 {
struct nv30_fragprog *fp = fpc->fp;
 
fp->insn_len += size;
fp->insn = realloc(fp->insn, sizeof(uint32_t) * fp->insn_len);
diff --git a/src/gallium/drivers/nouveau/nv50/nv50_state.c 
b/src/gallium/drivers/nouveau/nv50/nv50_state.c
index 55167a27c09..228feced5d1 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_state.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_state.c
@@ -1256,24 +1256,23 @@ nv50_set_global_bindings(struct pipe_context *pipe,
  struct pipe_resource **resources,
  uint32_t **handles)
 {
struct nv50_context *nv50 = nv50_context(pipe);
struct pipe_resource **ptr;
unsigned i;
const unsigned end = start + nr;
 
if (nv50->global_residents.size <= (end * sizeof(struct pipe_resource *))) {
   const unsigned old_size = nv50->global_residents.size;
-  const unsigned req_size = end * sizeof(struct pipe_resource *);
-  util_dynarray_resize(&nv50->global_residents, req_size);
+  util_dynarray_resize(&nv50->global_residents, struct pipe_resource *, end);
   memset((uint8_t *)nv50->global_residents.data + old_size, 0,
- req_size - old_size);
+ nv50->global_residents.size - old_size);
}
 
if (resources) {
   ptr = util_dynarray_element(
  &nv50->global_residents, struct pipe_resource *, start);
   for (i = 0; i < nr; ++i) {
  pipe_resource_reference(&ptr[i], resources[i]);
  nv50_set_global_handle(handles[i], resources[i]);
   }
} else {
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
index 12e21862ee0..2ab51c8529e 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
@@ -1363,24 +1363,23 @@ nvc0_set_global_bindings(struct pipe_context *pipe,
  struct pipe_resource **resources,
  uint32_t **handles)
 {
struct nvc0_context *nvc0 = nvc0_context(pipe);
struct pipe_resource **ptr;
unsigned i;
const unsigned end = start + nr;
 
if (nvc0->global_residents.size <= (end * sizeof(struct pipe_resource *))) {
   const unsigned old_size = nvc0->global_residents.size;
-  const unsigned req_size = end * sizeof(struct pipe_resource *);
-  util_dynarray_resize(&nvc0->global_residents, req_size);
+  util_dynarray_resize(&nvc0->global_residents, struct pipe_resource *, end);
   memset((uint8_t *)nvc0->global_residents.data + old_size, 0,
- req_size - old_size);
+ nvc0->global_residents.size - old_size);
}
 
if (resources) {
   ptr = util_dynarray_element(
  &nvc0->global_residents, struct pipe_resource *, start);
   for (i = 0; i < nr; ++i) {
  pipe_resource_reference(&ptr[i], resources[i]);
  nvc0_set_global_handle(handles[i], resources[i]);
   }
} else {
diff --git a/src/intel/compiler/brw_nir_analyze_ubo_ranges.c b/src/intel/compiler/brw_nir_analyze_ubo_ranges.c
index ab7a2705c9a..4c5e03380e1 100644
--- a/src/intel/compiler/brw_nir_analyze_ubo_ranges.c
+++ 

[Mesa-dev] [PATCH 1/3] freedreno: use util_dynarray_clear instead of util_dynarray_resize(_, 0)

2019-05-04 Thread Nicolai Hähnle
From: Nicolai Hähnle 

This is more expressive and simplifies a subsequent change.
---
 src/gallium/drivers/freedreno/a2xx/fd2_gmem.c | 12 ++--
 src/gallium/drivers/freedreno/a3xx/fd3_gmem.c |  4 ++--
 src/gallium/drivers/freedreno/a4xx/fd4_gmem.c |  2 +-
 src/gallium/drivers/freedreno/a5xx/fd5_gmem.c |  2 +-
 src/gallium/drivers/freedreno/a6xx/fd6_gmem.c |  2 +-
 5 files changed, 11 insertions(+), 11 deletions(-)
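
To illustrate why a dedicated clear call reads better than a resize to
zero, here is a small sketch. It assumes clearing only resets the byte
count and keeps the allocation for reuse; sketch_dynarray and
sketch_clear are hypothetical names, not the helpers touched by this
patch.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct util_dynarray; sizes are in bytes. */
struct sketch_dynarray {
   void *data;
   unsigned size;
   unsigned capacity;
};

/* Drop all elements but keep the backing storage, so the next batch
 * of patches can be appended without a fresh allocation. */
static inline void
sketch_clear(struct sketch_dynarray *buf)
{
   assert(buf != NULL);
   buf->size = 0;
}
```

A call such as sketch_clear(&arr) needs no size or element type, so it
stays valid even if a resize helper later changes its parameters.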

diff --git a/src/gallium/drivers/freedreno/a2xx/fd2_gmem.c b/src/gallium/drivers/freedreno/a2xx/fd2_gmem.c
index 0c7ea844fa4..0edc5e940c1 100644
--- a/src/gallium/drivers/freedreno/a2xx/fd2_gmem.c
+++ b/src/gallium/drivers/freedreno/a2xx/fd2_gmem.c
@@ -397,21 +397,21 @@ static void
 patch_draws(struct fd_batch *batch, enum pc_di_vis_cull_mode vismode)
 {
unsigned i;
 
if (!is_a20x(batch->ctx->screen)) {
/* identical to a3xx */
for (i = 0; i < fd_patch_num_elements(&batch->draw_patches); i++) {
struct fd_cs_patch *patch = fd_patch_element(&batch->draw_patches, i);
*patch->cs = patch->val | DRAW(0, 0, 0, vismode, 0);
}
-   util_dynarray_resize(&batch->draw_patches, 0);
+   util_dynarray_clear(&batch->draw_patches);
return;
}
 
if (vismode == USE_VISIBILITY)
return;
 
for (i = 0; i < batch->draw_patches.size / sizeof(uint32_t*); i++) {
uint32_t *ptr = *util_dynarray_element(&batch->draw_patches, uint32_t*, i);
unsigned cnt = ptr[0] >> 16 & 0xfff; /* 5 with idx buffer, 3 without */
 
@@ -465,22 +465,22 @@ fd2_emit_sysmem_prep(struct fd_batch *batch)
OUT_RING(ring, A2XX_PA_SC_SCREEN_SCISSOR_TL_WINDOW_OFFSET_DISABLE);
OUT_RING(ring, A2XX_PA_SC_SCREEN_SCISSOR_BR_X(pfb->width) |
A2XX_PA_SC_SCREEN_SCISSOR_BR_Y(pfb->height));
 
OUT_PKT3(ring, CP_SET_CONSTANT, 2);
OUT_RING(ring, CP_REG(REG_A2XX_PA_SC_WINDOW_OFFSET));
OUT_RING(ring, A2XX_PA_SC_WINDOW_OFFSET_X(0) |
A2XX_PA_SC_WINDOW_OFFSET_Y(0));
 
patch_draws(batch, IGNORE_VISIBILITY);
-   util_dynarray_resize(&batch->draw_patches, 0);
-   util_dynarray_resize(&batch->shader_patches, 0);
+   util_dynarray_clear(&batch->draw_patches);
+   util_dynarray_clear(&batch->shader_patches);
 }
 
 /* before first tile */
 static void
 fd2_emit_tile_init(struct fd_batch *batch)
 {
struct fd_context *ctx = batch->ctx;
struct fd_ringbuffer *ring = batch->gmem;
struct pipe_framebuffer_state *pfb = &batch->framebuffer;
struct fd_gmem_stateobj *gmem = &ctx->gmem;
@@ -544,21 +544,21 @@ fd2_emit_tile_init(struct fd_batch *batch)
continue;
}
 
patch->cs[0] = A2XX_PA_SC_SCREEN_SCISSOR_BR_X(32) |
A2XX_PA_SC_SCREEN_SCISSOR_BR_Y(lines);
patch->cs[4] = A2XX_RB_COLOR_INFO_BASE(color_base) |
A2XX_RB_COLOR_INFO_FORMAT(COLORX_8_8_8_8);
patch->cs[5] = A2XX_RB_DEPTH_INFO_DEPTH_BASE(depth_base) |
A2XX_RB_DEPTH_INFO_DEPTH_FORMAT(1);
}
-   util_dynarray_resize(&batch->gmem_patches, 0);
+   util_dynarray_clear(&batch->gmem_patches);
 
/* set to zero, for some reason hardware doesn't like certain values */
OUT_PKT3(ring, CP_SET_CONSTANT, 2);
OUT_RING(ring, CP_REG(REG_A2XX_VGT_CURRENT_BIN_ID_MIN));
OUT_RING(ring, 0);
 
OUT_PKT3(ring, CP_SET_CONSTANT, 2);
OUT_RING(ring, CP_REG(REG_A2XX_VGT_CURRENT_BIN_ID_MAX));
OUT_RING(ring, 0);
 
@@ -649,22 +649,22 @@ fd2_emit_tile_init(struct fd_batch *batch)
 
ctx->emit_ib(ring, batch->binning);
 
OUT_PKT3(ring, CP_SET_CONSTANT, 2);
OUT_RING(ring, CP_REG(REG_A2XX_VGT_VERTEX_REUSE_BLOCK_CNTL));
OUT_RING(ring, 0x0002);
} else {
patch_draws(batch, IGNORE_VISIBILITY);
}
 
-   util_dynarray_resize(&batch->draw_patches, 0);
-   util_dynarray_resize(&batch->shader_patches, 0);
+   util_dynarray_clear(&batch->draw_patches);
+   util_dynarray_clear(&batch->shader_patches);
 }
 
 /* before mem2gmem */
 static void
 fd2_emit_tile_prep(struct fd_batch *batch, struct fd_tile *tile)
 {
struct fd_ringbuffer *ring = batch->gmem;
struct pipe_framebuffer_state *pfb = &batch->framebuffer;
enum pipe_format format = pipe_surface_format(pfb->cbufs[0]);
 
diff --git a/src/gallium/drivers/freedreno/a3xx/fd3_gmem.c b/src/gallium/drivers/freedreno/a3xx/fd3_gmem.c
index 7de0a92cdc1..e4455b3fa63 100644
--- a/src/gallium/drivers/freedreno/a3xx/fd3_gmem.c
+++ b/src/gallium/drivers/freedreno/a3xx/fd3_gmem.c
@@ -704,32 +704,32 @@ fd3_emit_tile_mem2gmem(struct fd_batch *batch, struct fd_tile *tile)
 }
 
 static void
 patch_draws(struct fd_batch *batch, enum pc_di_vis_cull_mode vismode)
 {
unsigned i;
for (i = 0; i < fd_patch_num_elements(&batch->draw_patches); i++) {
 

[Mesa-dev] [PATCH 0/3] u_dynarray: minor API cleanups

2019-05-04 Thread Nicolai Hähnle
Just some small changes that should make util_dynarray more convenient
and safer to use.

Please review!

Thanks,
Nicolai
--
 .../drivers/freedreno/a2xx/fd2_gmem.c| 12 +++---
 .../drivers/freedreno/a3xx/fd3_gmem.c|  4 +-
 .../drivers/freedreno/a4xx/fd4_gmem.c|  2 +-
 .../drivers/freedreno/a5xx/fd5_gmem.c|  2 +-
 .../drivers/freedreno/a6xx/fd6_gmem.c|  2 +-
 .../drivers/nouveau/nv30/nvfx_fragprog.c |  2 +-
 .../drivers/nouveau/nv50/nv50_state.c|  5 +--
 .../drivers/nouveau/nvc0/nvc0_state.c|  5 +--
 .../compiler/brw_nir_analyze_ubo_ranges.c|  2 +-
 src/mesa/drivers/dri/i965/brw_bufmgr.c   |  4 +-
 src/util/u_dynarray.h| 40 +-
 11 files changed, 48 insertions(+), 32 deletions(-)



Re: [Mesa-dev] [PATCH 03/10] mesa: Implement _mesa_array_element by walking enabled arrays.

2019-05-04 Thread Mathias Fröhlich
Hi Brian,

On Friday, 3 May 2019 14:40:26 CEST Brian Paul wrote:
> All your suggested changes look good.
>
> Reviewed-by: Brian Paul 
>
> Thanks.

Pushed. Thanks!

best
Mathias

