Re: [Mesa-dev] [PATCH 15/30] i965/miptree: Add new entrypoints for resolve management

2017-06-06 Thread Chad Versace
On Fri 26 May 2017, Jason Ekstrand wrote:
> This commit adds a new unified interface for doing resolves.  The basic
> format is that, prior to any surface access such as texturing or
> rendering, you call intel_miptree_prepare_access.  If the surface was
> written, you call intel_miptree_finish_write.  These two functions take
> parameters which tell them whether or not auxiliary compression and fast
> clears are supported on the surface.  Later commits will add wrappers
> around these two functions for texturing, rendering, etc.
> ---
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 156 +-
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  80 +
>  2 files changed, 232 insertions(+), 4 deletions(-)


> +void
> +intel_miptree_prepare_access(struct brw_context *brw,
> + struct intel_mipmap_tree *mt,
> + uint32_t start_level, uint32_t num_levels,
> + uint32_t start_layer, uint32_t num_layers,
> + bool aux_supported, bool fast_clear_supported)

This parameter list seems a good place to pass in a bool that indicates
the miptree data can be invalidated before access. For example, before
a full, non-scissored clear or before mapping with
GL_MAP_INVALIDATE_RANGE_BIT. That info would let us avoid unneeded aux
ops.

But, of course, such a change doesn't belong in this patch, or perhaps
even in this patch series. I just wanted to suggest a future
improvement.
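
To make the suggestion above concrete, here is a minimal sketch of how
such a hint could short-circuit a resolve. It is not the i965 code: the
toy_ names, the three-state enum, and the will_invalidate parameter are
all hypothetical, and the real state tracking is considerably richer.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified aux states; the real driver tracks more. */
enum toy_aux_state {
   TOY_AUX_PASS_THROUGH, /* main surface is valid on its own */
   TOY_AUX_COMPRESSED,   /* aux data is needed to interpret the surface */
   TOY_AUX_CLEAR,        /* fast-cleared; aux data encodes the clear */
};

struct toy_slice {
   enum toy_aux_state state;
   unsigned resolves_done; /* counts the aux ops we had to issue */
};

/* Toy analogue of a prepare-access call that takes the extra hint. */
static void
toy_prepare_access(struct toy_slice *s, bool aux_supported,
                   bool fast_clear_supported, bool will_invalidate)
{
   if (will_invalidate) {
      /* The old contents are dead, so skip the resolve.  The full
       * overwrite that follows is expected to establish a new state.
       */
      return;
   }

   bool needs_resolve =
      (s->state == TOY_AUX_COMPRESSED && !aux_supported) ||
      (s->state == TOY_AUX_CLEAR && !fast_clear_supported);

   if (needs_resolve) {
      s->resolves_done++;
      s->state = TOY_AUX_PASS_THROUGH;
   }
}
```

In this toy model, a compressed slice accessed without aux support costs
one resolve, while the same access with will_invalidate set costs none.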
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 16/30] i965: Use the new resolve function for several simple cases

2017-06-06 Thread Chad Versace
On Fri 26 May 2017, Jason Ekstrand wrote:
> ---
>  src/mesa/drivers/dri/i965/brw_context.c|  2 +-
>  src/mesa/drivers/dri/i965/intel_blit.c | 14 --
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c  | 12 +---
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.h  | 18 ++
>  src/mesa/drivers/dri/i965/intel_pixel_bitmap.c |  2 +-
>  src/mesa/drivers/dri/i965/intel_pixel_read.c   |  2 +-
>  src/mesa/drivers/dri/i965/intel_tex_image.c|  7 ---
>  src/mesa/drivers/dri/i965/intel_tex_subimage.c |  7 ---
>  8 files changed, 38 insertions(+), 26 deletions(-)

Reviewed-by: Chad Versace 



Re: [Mesa-dev] [PATCH 15/30] i965/miptree: Add new entrypoints for resolve management

2017-06-06 Thread Chad Versace
On Fri 26 May 2017, Jason Ekstrand wrote:
> This commit adds a new unified interface for doing resolves.  The basic
> format is that, prior to any surface access such as texturing or
> rendering, you call intel_miptree_prepare_access.  If the surface was
> written, you call intel_miptree_finish_write.  These two functions take
> parameters which tell them whether or not auxiliary compression and fast
> clears are supported on the surface.  Later commits will add wrappers
> around these two functions for texturing, rendering, etc.
> ---
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 156 +-
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  80 +
>  2 files changed, 232 insertions(+), 4 deletions(-)

Modulo the errant whitespace changes Topi pointed out, which should be
squashed into an earlier patch, this patch is
Reviewed-by: Chad Versace 


Re: [Mesa-dev] [PATCH 14/30] intel/isl: Add an enum for describing auxiliary compression state

2017-06-06 Thread Jason Ekstrand
On Tue, Jun 6, 2017 at 10:11 PM, Chad Versace  wrote:

> On Tue 06 Jun 2017, Jason Ekstrand wrote:
> > On Tue, Jun 6, 2017 at 6:00 PM, Chad Versace  wrote:
> >
> > > On Tue 06 Jun 2017, Jason Ekstrand wrote:
> > > > On Tue, Jun 6, 2017 at 1:32 PM, Jason Ekstrand  >
> > > wrote:
> > > >
> > > > > On Tue, Jun 6, 2017 at 1:22 PM, Chad Versace <
> chadvers...@chromium.org
> > > >
> > > > > wrote:
> > > > >
> > > > >> On Fri 26 May 2017, Jason Ekstrand wrote:
> > >
> > > > > How about a section after the auxiliary compression ops section
> which
> > > goes
> > > > > into detail on each of the compression types and discusses which
> > > states are
> > > > > valid etc.
> > > > >
> > > >
> > > > How does this look:
> > > >
> > > > https://cgit.freedesktop.org/~jekstrand/mesa/commit/?h=wip/i965-resolve-rework-v3&id=8478b102c99e3ec43ec687b3f4e52acb9acbd5ba
> > > >
> > > > I'll squash it in if you like it.
> > >
> > > Please squash that in, with fixes :)
> > >
> > > I don't believe the pass-through state is impossible with MCS, because
> > > there is no single surface for write to "pass through" to. The aux
> > > surface can never be ignored with MCS. Another bit of evidence for this
> > > is that there exists no MCS ambiguate op, and therefore no arrow can
> > > exist in the diagram that carries the "resolved" box to the
> > > "pass-through" box.
> > >
> >
> > Too many negatives going on...  I think you meant "I believe the
> > pass-through state is impossible" :-)
>
> Right... sloppy me.
>
> > I do not agree.  While no resolve has been written, one could easily do
> > so.  It would be implemented most likely as a blit from the surface to
> > itself with MCS enabled on the source and disabled for the destination
> > followed by blasting the appropriate value into the MCS.  Modulo issues
> > with the order in which pixels are dispatched (which I don't think is an
> > actual issue so long as SIMD > num_samples), the result should be a
> surface
> > in the pass-through state which can safely be rendered to with MCS
> > disabled.  Now, why anyone would ever want to do that is beyond me.  The
> > state which doesn't exist is the regular resolved state because, as with
> > CCS and MCS, the aux surface stores so little data, that there isn't
> really
> > any room for compression when the main surface contains valid data.
>
> Thanks for taking the time to explain that corner case to stubborn me.
> Such a state is so far outside of realistic usage that I failed to see
> it earlier.
>

I'll freely admit that it's highly exotic and will never be seen in the
wild.


> This patch, with extras squashed in, is
>

Done.


> Reviewed-by: Chad Versace 
>

Thanks!  And thank you for being careful and making me document even more
things.


Re: [Mesa-dev] [PATCH 14/30] intel/isl: Add an enum for describing auxiliary compression state

2017-06-06 Thread Chad Versace
On Tue 06 Jun 2017, Jason Ekstrand wrote:
> On Tue, Jun 6, 2017 at 6:00 PM, Chad Versace  wrote:
> 
> > On Tue 06 Jun 2017, Jason Ekstrand wrote:
> > > On Tue, Jun 6, 2017 at 1:32 PM, Jason Ekstrand 
> > wrote:
> > >
> > > > On Tue, Jun 6, 2017 at 1:22 PM, Chad Versace  > >
> > > > wrote:
> > > >
> > > >> On Fri 26 May 2017, Jason Ekstrand wrote:
> >
> > > > How about a section after the auxiliary compression ops section which
> > goes
> > > > into detail on each of the compression types and discusses which
> > states are
> > > > valid etc.
> > > >
> > >
> > > How does this look:
> > >
> > > https://cgit.freedesktop.org/~jekstrand/mesa/commit/?h=wip/i965-resolve-rework-v3&id=8478b102c99e3ec43ec687b3f4e52acb9acbd5ba
> > >
> > > I'll squash it in if you like it.
> >
> > Please squash that in, with fixes :)
> >
> > I don't believe the pass-through state is impossible with MCS, because
> > there is no single surface for write to "pass through" to. The aux
> > surface can never be ignored with MCS. Another bit of evidence for this
> > is that there exists no MCS ambiguate op, and therefore no arrow can
> > exist in the diagram that carries the "resolved" box to the
> > "pass-through" box.
> >
> 
> Too many negatives going on...  I think you meant "I believe the
> pass-through state is impossible" :-)

Right... sloppy me.

> I do not agree.  While no resolve has been written, one could easily do
> so.  It would be implemented most likely as a blit from the surface to
> itself with MCS enabled on the source and disabled for the destination
> followed by blasting the appropriate value into the MCS.  Modulo issues
> with the order in which pixels are dispatched (which I don't think is an
> actual issue so long as SIMD > num_samples), the result should be a surface
> in the pass-through state which can safely be rendered to with MCS
> disabled.  Now, why anyone would ever want to do that is beyond me.  The
> state which doesn't exist is the regular resolved state because, as with
> CCS and MCS, the aux surface stores so little data, that there isn't really
> any room for compression when the main surface contains valid data.

Thanks for taking the time to explain that corner case to stubborn me.
Such a state is so far outside of realistic usage that I failed to see
it earlier.

This patch, with extras squashed in, is
Reviewed-by: Chad Versace 



Re: [Mesa-dev] [PATCH 04/10] i965/blorp: Inline gen6_blorp_exec

2017-06-06 Thread Jason Ekstrand
On Tue, Jun 6, 2017 at 10:54 AM, Pohjolainen, Topi <
topi.pohjolai...@gmail.com> wrote:

> On Tue, Jun 06, 2017 at 08:35:06PM +0300, Pohjolainen, Topi wrote:
> > On Mon, Jun 05, 2017 at 05:55:39PM -0700, Jason Ekstrand wrote:
> > > ---
> > >  src/mesa/drivers/dri/i965/brw_blorp.c | 29 +++--
> > >  1 file changed, 11 insertions(+), 18 deletions(-)
> >
> > Patches 1-4:
> >
> > Reviewed-by: Topi Pohjolainen 
>
> In fact patches 5 and 7-10 also:
>
> Reviewed-by: Topi Pohjolainen 
>

Thanks!


[Mesa-dev] [PATCH 08/11] i965/blorp: Do a depth flush/stall prior to HiZ operations

2017-06-06 Thread Jason Ekstrand
Without this stall, the test group ES3-CTS.functional.fbo.msaa.\* hangs
about 1 out of every 2 or 3 times on my Sky Lake GT3 laptop.  With the
flush and stall, I can run it 6 times in a row without a hang.

Cc: "17.1" 
---
 src/mesa/drivers/dri/i965/brw_blorp.c | 17 +
 1 file changed, 17 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c 
b/src/mesa/drivers/dri/i965/brw_blorp.c
index 763ce05..38925d9 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.c
+++ b/src/mesa/drivers/dri/i965/brw_blorp.c
@@ -1021,6 +1021,23 @@ intel_hiz_exec(struct brw_context *brw, struct 
intel_mipmap_tree *mt,
 {
const char *opname = NULL;
 
+   /* From the Ivy Bridge PRM, Vol. 2, pt. 1, section 11.5.3.1 "Depth
+* Buffer Clear":
+*
+*"If other rendering operations have preceded this clear, a
+*PIPE_CONTROL with depth cache flush enabled, Depth Stall bit enabled
+*must be issued before the rectangle primitive used for the depth
+*buffer clear operation."
+*
+* In the Sky Lake and Broadwell docs, this text only appears in the
+* section on legacy HiZ ops.  However, adding it seems to solve some hangs
+* on Sky Lake so it appears it's needed regardless of which kind of HiZ
+* operation is performed.
+*/
+   brw_emit_pipe_control_flush(brw,
+   PIPE_CONTROL_DEPTH_CACHE_FLUSH |
+   PIPE_CONTROL_DEPTH_STALL);
+
switch (op) {
case BLORP_HIZ_OP_DEPTH_RESOLVE:
   opname = "depth resolve";
-- 
2.5.0.400.gff86faf
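
As a rough sanity check on the numbers in the commit message, the sketch
below computes how likely six clean runs in a row would be without the
fix. It assumes each run hangs independently with a fixed probability,
which is a simplification made here and not something the commit message
states; the function name is hypothetical.

```c
#include <assert.h>

/* Chance of `runs` consecutive clean runs if every run independently
 * hangs with probability `hang_rate` (assumed model, see above).
 */
static double
chance_of_clean_streak(double hang_rate, unsigned runs)
{
   assert(hang_rate >= 0.0 && hang_rate <= 1.0);

   double p = 1.0;
   for (unsigned i = 0; i < runs; i++)
      p *= 1.0 - hang_rate;
   return p;
}
```

Under that assumption, a 1-in-2 hang rate leaves about a 1.6% chance of
six clean runs and a 1-in-3 rate about 8.8%, so the observed streak would
be fairly unlikely if the flush and stall had no effect.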



[Mesa-dev] [PATCH 07/11] i965/blorp: Set no_depth_or_stencil correctly

2017-06-06 Thread Jason Ekstrand
---
 src/mesa/drivers/dri/i965/genX_blorp_exec.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/genX_blorp_exec.c 
b/src/mesa/drivers/dri/i965/genX_blorp_exec.c
index 3451d71..0de3038 100644
--- a/src/mesa/drivers/dri/i965/genX_blorp_exec.c
+++ b/src/mesa/drivers/dri/i965/genX_blorp_exec.c
@@ -273,7 +273,8 @@ retry:
 * rendering tracks for GL.
 */
brw->ctx.NewDriverState |= BRW_NEW_BLORP;
-   brw->no_depth_or_stencil = false;
+   brw->no_depth_or_stencil = !params->depth.enabled &&
+  !params->stencil.enabled;
brw->ib.index_size = -1;
 
if (params->dst.enabled)
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH 10/11] i965: Set step_rate == 0 for interleaved vertex buffers

2017-06-06 Thread Jason Ekstrand
Before, we weren't setting step rate so we got whatever old value
happened to be lying around.  This can lead to some interesting
rendering errors.  In particular, if you run the OpenGL ES CTS with
dEQP-GLES3.functional.instanced.types.mat2x4 immediately followed by one
of the dEQP-GLES3.functional.transform_feedback.* tests, the transform
feedback test gets stale instancing data from the other test and fails.
The only thing that is causing this to not be a problem today is that we
use meta for clears and meta is setting up vertex buffers via the VBO or
non-interleaved path and setting step_rate to 0 for us.  When blorp
depth/stencil clears are enabled, meta is no longer sitting between the
two tests and the stale data starts causing noticeable problems.

Cc: "17.1" 
---
 src/mesa/drivers/dri/i965/brw_draw_upload.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/drivers/dri/i965/brw_draw_upload.c 
b/src/mesa/drivers/dri/i965/brw_draw_upload.c
index cf66770..05b6b1a 100644
--- a/src/mesa/drivers/dri/i965/brw_draw_upload.c
+++ b/src/mesa/drivers/dri/i965/brw_draw_upload.c
@@ -648,6 +648,7 @@ brw_prepare_vertices(struct brw_context *brw)
 buffer, interleaved);
 buffer->offset -= delta * interleaved;
  buffer->size += delta * interleaved;
+ buffer->step_rate = 0;
 
 for (i = 0; i < nr_uploads; i++) {
/* Then, just point upload[i] at upload[0]'s buffer. */
-- 
2.5.0.400.gff86faf
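
The stale-state problem described in the commit message can be
illustrated with a small sketch. Everything here is hypothetical (the
toy_ struct and both setup functions); it only models a buffer slot that
is reused between draws, where one path forgets to write a field.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a vertex buffer slot that is reused from
 * one draw to the next.
 */
struct toy_vertex_buffer {
   unsigned offset;
   unsigned size;
   unsigned step_rate; /* 0 = per-vertex, N = advance every N instances */
};

/* An instanced draw writes a non-zero step rate into the slot. */
static void
toy_setup_instanced(struct toy_vertex_buffer *buf, unsigned divisor)
{
   buf->offset = 0;
   buf->size = 64;
   buf->step_rate = divisor;
}

/* The interleaved path.  With reset_step_rate == false it models the
 * behaviour before the patch: step_rate keeps whatever was there.
 */
static void
toy_setup_interleaved(struct toy_vertex_buffer *buf, bool reset_step_rate)
{
   buf->offset = 0;
   buf->size = 128;
   if (reset_step_rate)
      buf->step_rate = 0;
}
```

Running the instanced setup followed by the interleaved setup leaves
step_rate at the old divisor unless the reset is enabled, which is the
kind of leak between back-to-back tests the commit message describes.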



[Mesa-dev] [PATCH 11/11] i965: Use blorp for depth/stencil clears on gen6+

2017-06-06 Thread Jason Ekstrand
---
 src/mesa/drivers/dri/i965/brw_blorp.c | 106 ++
 src/mesa/drivers/dri/i965/brw_blorp.h |   4 ++
 src/mesa/drivers/dri/i965/brw_clear.c |   6 ++
 3 files changed, 116 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c 
b/src/mesa/drivers/dri/i965/brw_blorp.c
index 38925d9..a46b624 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.c
+++ b/src/mesa/drivers/dri/i965/brw_blorp.c
@@ -930,6 +930,112 @@ brw_blorp_clear_color(struct brw_context *brw, struct 
gl_framebuffer *fb,
 }
 
 void
+brw_blorp_clear_depth_stencil(struct brw_context *brw,
+  struct gl_framebuffer *fb,
+  GLbitfield mask, bool partial_clear)
+{
+   const struct gl_context *ctx = &brw->ctx;
+   struct gl_renderbuffer *depth_rb =
+  fb->Attachment[BUFFER_DEPTH].Renderbuffer;
+   struct gl_renderbuffer *stencil_rb =
+  fb->Attachment[BUFFER_STENCIL].Renderbuffer;
+
+   if (!depth_rb || ctx->Depth.Mask == GL_FALSE)
+  mask &= ~BUFFER_BIT_DEPTH;
+
+   if (!stencil_rb || (ctx->Stencil.WriteMask[0] & 0xff) == 0)
+  mask &= ~BUFFER_BIT_STENCIL;
+
+   if (!(mask & (BUFFER_BITS_DEPTH_STENCIL)))
+  return;
+
+   uint32_t x0, x1, y0, y1, rb_name, rb_height;
+   if (depth_rb) {
+  rb_name = depth_rb->Name;
+  rb_height = depth_rb->Height;
+  if (stencil_rb) {
+ assert(depth_rb->Width == stencil_rb->Width);
+ assert(depth_rb->Height == stencil_rb->Height);
+  }
+   } else {
+  assert(stencil_rb);
+  rb_name = stencil_rb->Name;
+  rb_height = stencil_rb->Height;
+   }
+
+   x0 = fb->_Xmin;
+   x1 = fb->_Xmax;
+   if (rb_name != 0) {
+  y0 = fb->_Ymin;
+  y1 = fb->_Ymax;
+   } else {
+  y0 = rb_height - fb->_Ymax;
+  y1 = rb_height - fb->_Ymin;
+   }
+
+   /* If the clear region is empty, just return. */
+   if (x0 == x1 || y0 == y1)
+  return;
+
+   unsigned level, layer, num_layers;
+   struct isl_surf isl_tmp[4];
+   struct blorp_surf depth_surf, stencil_surf;
+
+   if (mask & BUFFER_BIT_DEPTH) {
+  struct intel_renderbuffer *irb = intel_renderbuffer(depth_rb);
+  struct intel_mipmap_tree *depth_mt =
+ find_miptree(GL_DEPTH_BUFFER_BIT, irb);
+
+  level = irb->mt_level;
+  layer = irb_logical_mt_layer(irb);
+  num_layers = fb->MaxNumLayers ? irb->layer_count : 1;
+
+  intel_miptree_set_all_slices_need_depth_resolve(depth_mt, level);
+
+  unsigned depth_level = level;
+  blorp_surf_for_miptree(brw, &depth_surf, depth_mt, true,
+ (1 << ISL_AUX_USAGE_HIZ),
+ &depth_level, layer, num_layers, &isl_tmp[0]);
+  assert(depth_level == level);
+   }
+
+   uint8_t stencil_mask = 0;
+   if (mask & BUFFER_BIT_STENCIL) {
+  struct intel_renderbuffer *irb = intel_renderbuffer(stencil_rb);
+  struct intel_mipmap_tree *stencil_mt =
+ find_miptree(GL_STENCIL_BUFFER_BIT, irb);
+
+  if (mask & BUFFER_BIT_DEPTH) {
+ assert(level == irb->mt_level);
+ assert(layer == irb_logical_mt_layer(irb));
+ assert(num_layers == (fb->MaxNumLayers ? irb->layer_count : 1));
+  } else {
+ level = irb->mt_level;
+ layer = irb_logical_mt_layer(irb);
+ num_layers = fb->MaxNumLayers ? irb->layer_count : 1;
+  }
+
+  stencil_mask = ctx->Stencil.WriteMask[0] & 0xff;
+
+  unsigned stencil_level = level;
+  blorp_surf_for_miptree(brw, &stencil_surf, stencil_mt, true,
+ (1 << ISL_AUX_USAGE_HIZ),
+ &stencil_level, layer, num_layers, &isl_tmp[2]);
+   }
+
+   assert((mask & BUFFER_BIT_DEPTH) || stencil_mask);
+
+   struct blorp_batch batch;
+   blorp_batch_init(&brw->blorp, &batch, brw, 0);
+   blorp_clear_depth_stencil(&batch, &depth_surf, &stencil_surf,
+ level, layer, num_layers,
+ x0, y0, x1, y1,
+ (mask & BUFFER_BIT_DEPTH), ctx->Depth.Clear,
+ stencil_mask, ctx->Stencil.Clear);
+   blorp_batch_finish(&batch);
+}
+
+void
 brw_blorp_resolve_color(struct brw_context *brw, struct intel_mipmap_tree *mt,
 unsigned level, unsigned layer)
 {
diff --git a/src/mesa/drivers/dri/i965/brw_blorp.h 
b/src/mesa/drivers/dri/i965/brw_blorp.h
index 8743d96..868301f 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.h
+++ b/src/mesa/drivers/dri/i965/brw_blorp.h
@@ -62,6 +62,10 @@ brw_blorp_copy_miptrees(struct brw_context *brw,
 bool
 brw_blorp_clear_color(struct brw_context *brw, struct gl_framebuffer *fb,
   GLbitfield mask, bool partial_clear, bool encode_srgb);
+void
+brw_blorp_clear_depth_stencil(struct brw_context *brw,
+  struct gl_framebuffer *fb,
+  GLbitfield mask, bool partial_clear);
 
 void
 brw_blorp_resolve_color(struct brw_context *brw,
diff --git a/src/mesa/drivers/dri/i965/brw_clear.c 
b/src/mesa/drivers/dri/i965/brw_clear.c
index 664342d..57e5f16 
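
The clear-rectangle computation in brw_blorp_clear_depth_stencil above
can be sketched in isolation. This is a simplified illustration with
hypothetical names (toy_rect, toy_clear_rect); it only mirrors the
branch on the renderbuffer name, where a name of 0 is treated as a
window-system buffer whose Y axis must be flipped.

```c
#include <assert.h>
#include <stdint.h>

struct toy_rect {
   uint32_t x0, y0, x1, y1;
};

/* Build the clear rectangle from the framebuffer bounds.  rb_name == 0
 * selects the flipped (window-system) case, as in the patch.
 */
static struct toy_rect
toy_clear_rect(uint32_t xmin, uint32_t xmax, uint32_t ymin, uint32_t ymax,
               uint32_t rb_name, uint32_t rb_height)
{
   assert(ymin <= ymax && ymax <= rb_height);

   struct toy_rect r = { .x0 = xmin, .x1 = xmax };
   if (rb_name != 0) {
      /* User-created renderbuffer: coordinates are used as-is. */
      r.y0 = ymin;
      r.y1 = ymax;
   } else {
      /* Window-system buffer: flip vertically within rb_height. */
      r.y0 = rb_height - ymax;
      r.y1 = rb_height - ymin;
   }
   return r;
}
```

For a 100-pixel-tall window-system buffer with bounds 10..30 in Y, the
sketch yields y0 = 70 and y1 = 90; with a non-zero name it yields 10 and
30 unchanged.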

[Mesa-dev] [PATCH 05/11] i965: Remove some of the remnants of meta

2017-06-06 Thread Jason Ekstrand
---
 src/mesa/drivers/dri/i965/brw_context.h   | 1 -
 src/mesa/drivers/dri/i965/brw_wm.c| 2 +-
 src/mesa/drivers/dri/i965/genX_state_upload.c | 2 +-
 3 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index 4c5bc3b..3f4b86a 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -750,7 +750,6 @@ struct brw_context
bool has_negative_rhw_bug;
bool has_pln;
bool no_simd8;
-   bool use_rep_send;
 
/**
 * Some versions of Gen hardware don't do centroid interpolation correctly
diff --git a/src/mesa/drivers/dri/i965/brw_wm.c 
b/src/mesa/drivers/dri/i965/brw_wm.c
index 6fac3c4..7f688e2 100644
--- a/src/mesa/drivers/dri/i965/brw_wm.c
+++ b/src/mesa/drivers/dri/i965/brw_wm.c
@@ -188,7 +188,7 @@ brw_codegen_wm_prog(struct brw_context *brw,
program = brw_compile_fs(brw->screen->compiler, brw, mem_ctx,
 key, &prog_data, fp->program.nir,
 &fp->program, st_index8, st_index16,
-true, brw->use_rep_send, vue_map,
+true, false, vue_map,
 &program_size, &error_str);
 
if (program == NULL) {
diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c 
b/src/mesa/drivers/dri/i965/genX_state_upload.c
index 23358c4..f6b2f17 100644
--- a/src/mesa/drivers/dri/i965/genX_state_upload.c
+++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
@@ -1316,7 +1316,7 @@ genX(upload_clip_state)(struct brw_context *brw)
  clip.ClipMode = CLIPMODE_NORMAL;
   }
 
-  clip.ClipEnable = brw->primitive != _3DPRIM_RECTLIST;
+  clip.ClipEnable = true;
 
   /* _NEW_POLYGON,
* BRW_NEW_GEOMETRY_PROGRAM | BRW_NEW_TES_PROG_DATA | BRW_NEW_PRIMITIVE
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH 04/11] intel/isl: Properly set SeparateStencilBufferEnable on gen5-6

2017-06-06 Thread Jason Ekstrand
On gen5-6, SeparateStencilBufferEnable and HierarchicalDepthBufferEnable
come hand in hand and we have to set either both or neither.
---
 src/intel/isl/isl_emit_depth_stencil.c | 13 ++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/src/intel/isl/isl_emit_depth_stencil.c 
b/src/intel/isl/isl_emit_depth_stencil.c
index 41a01be..385c83d 100644
--- a/src/intel/isl/isl_emit_depth_stencil.c
+++ b/src/intel/isl/isl_emit_depth_stencil.c
@@ -113,6 +113,16 @@ isl_genX(emit_depth_stencil_hiz_s)(const struct isl_device 
*dev, void *batch,
 #endif
}
 
+#if GEN_GEN == 5 || GEN_GEN == 6
+   const bool separate_stencil =
+  info->stencil_surf && info->stencil_surf->format == ISL_FORMAT_R8_UINT;
+   if (separate_stencil || info->hiz_usage == ISL_AUX_USAGE_HIZ) {
+  assert(ISL_DEV_USE_SEPARATE_STENCIL(dev));
+  db.SeparateStencilBufferEnable = true;
+  db.HierarchicalDepthBufferEnable = true;
+   }
+#endif
+
 #if GEN_GEN >= 6
struct GENX(3DSTATE_STENCIL_BUFFER) sb = {
   GENX(3DSTATE_STENCIL_BUFFER_header),
@@ -151,9 +161,6 @@ isl_genX(emit_depth_stencil_hiz_s)(const struct isl_device 
*dev, void *batch,
   info->hiz_usage == ISL_AUX_USAGE_HIZ);
if (info->hiz_usage == ISL_AUX_USAGE_HIZ) {
   db.HierarchicalDepthBufferEnable = true;
-#if GEN_GEN == 5 || GEN_GEN == 6
-  db.SeparateStencilBufferEnable = true;
-#endif
 
   hiz.SurfaceBaseAddress = info->hiz_address;
   hiz.HierarchicalDepthBufferMOCS = info->mocs;
-- 
2.5.0.400.gff86faf
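
The "both or neither" rule from the commit message can be written down
as a tiny sketch. The struct and function names are hypothetical and the
two inputs stand in for the separate-stencil and HiZ checks in the hunk
above; this is an illustration, not the isl code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical subset of the depth buffer packet on gen5-6. */
struct toy_depth_buffer_bits {
   bool separate_stencil_enable;
   bool hiz_enable;
};

/* If either feature is in use, both enables are set together. */
static struct toy_depth_buffer_bits
toy_gen5_6_enables(bool has_separate_stencil, bool uses_hiz)
{
   struct toy_depth_buffer_bits db = { false, false };

   if (has_separate_stencil || uses_hiz) {
      db.separate_stencil_enable = true;
      db.hiz_enable = true;
   }
   return db;
}
```

For every combination of inputs the two bits come out equal, which is
the invariant the patch is trying to maintain.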



[Mesa-dev] [PATCH 06/11] i965: Remove some unneeded fields from brw_context

2017-06-06 Thread Jason Ekstrand
---
 src/mesa/drivers/dri/i965/brw_context.h | 12 
 1 file changed, 12 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index 3f4b86a..965c7b9 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -965,22 +965,10 @@ struct brw_context
 
struct {
   struct brw_stage_state base;
-
-  /**
-   * True if the 3DSTATE_HS command most recently emitted to the 3D
-   * pipeline enabled the HS; false otherwise.
-   */
-  bool enabled;
} tcs;
 
struct {
   struct brw_stage_state base;
-
-  /**
-   * True if the 3DSTATE_DS command most recently emitted to the 3D
-   * pipeline enabled the DS; false otherwise.
-   */
-  bool enabled;
} tes;
 
struct {
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH 09/11] i965: Disable the interleaved vertex optimization when instancing

2017-06-06 Thread Jason Ekstrand
Instance divisor is a property of the vertex buffer and not the vertex
element so if we ever see anything other than 0, bail.

Cc: "17.1" 
---
 src/mesa/drivers/dri/i965/brw_draw_upload.c | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_draw_upload.c 
b/src/mesa/drivers/dri/i965/brw_draw_upload.c
index 2ec9a01..cf66770 100644
--- a/src/mesa/drivers/dri/i965/brw_draw_upload.c
+++ b/src/mesa/drivers/dri/i965/brw_draw_upload.c
@@ -584,15 +584,16 @@ brw_prepare_vertices(struct brw_context *brw)
ptr = glarray->Ptr;
 }
 else if (interleaved != glarray->StrideB ||
+  glarray->InstanceDivisor != 0 ||
   glarray->Ptr < ptr ||
   (uintptr_t)(glarray->Ptr - ptr) + glarray->_ElementSize > 
interleaved)
 {
 /* If our stride is different from the first attribute's stride,
- * or if the first attribute's stride didn't cover our element,
- * disable the interleaved upload optimization.  The second case
- * can most commonly occur in cases where there is a single vertex
- * and, for example, the data is stored on the application's
- * stack.
+ * or if we are using an instance divisor or if the first
+ * attribute's stride didn't cover our element, disable the
+ * interleaved upload optimization.  The second case can most
+ * commonly occur in cases where there is a single vertex and, for
+ * example, the data is stored on the application's stack.
  *
  * NOTE: This will also disable the optimization in cases where
  * the data is in a different order than the array indices.
-- 
2.5.0.400.gff86faf
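
The condition in the hunk above can be read as a standalone predicate.
The following sketch uses hypothetical names (toy_array,
toy_can_interleave) and plain C types instead of the GL array structs;
it mirrors the checks in the patch but is not the driver code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced view of a vertex array binding. */
struct toy_array {
   const uint8_t *ptr;       /* start of the first element */
   size_t stride;            /* bytes between consecutive elements */
   size_t element_size;      /* bytes occupied by one element */
   unsigned instance_divisor;
};

/* True if `a` can share one interleaved upload with `first`.  Both
 * pointers are assumed to point into the same allocation.
 */
static bool
toy_can_interleave(const struct toy_array *first, const struct toy_array *a)
{
   const size_t interleaved = first->stride;
   const uint8_t *base = first->ptr;

   if (a->stride != interleaved)
      return false;  /* different stride */
   if (a->instance_divisor != 0)
      return false;  /* instanced attribute: bail */
   if (a->ptr < base)
      return false;  /* starts before the first attribute */
   if ((uintptr_t)(a->ptr - base) + a->element_size > interleaved)
      return false;  /* not covered by the first attribute's stride */
   return true;
}
```

A 4-byte colour packed 12 bytes into a 16-byte vertex passes the check;
giving that same attribute a non-zero instance divisor makes it fail.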



[Mesa-dev] [PATCH 01/11] i965/blorp: Set aux_usage to NONE for miplevels without HiZ

2017-06-06 Thread Jason Ekstrand
---
 src/mesa/drivers/dri/i965/brw_blorp.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c 
b/src/mesa/drivers/dri/i965/brw_blorp.c
index 28be620..763ce05 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.c
+++ b/src/mesa/drivers/dri/i965/brw_blorp.c
@@ -189,6 +189,10 @@ blorp_surf_for_miptree(struct brw_context *brw,
   intel_miptree_used_for_rendering(brw, mt, *level,
start_layer, num_layers);
 
+   if (surf->aux_usage == ISL_AUX_USAGE_HIZ &&
+   !intel_miptree_level_has_hiz(mt, *level))
+  surf->aux_usage = ISL_AUX_USAGE_NONE;
+
if (surf->aux_usage != ISL_AUX_USAGE_NONE) {
   /* We only really need a clear color if we also have an auxiliary
* surface.  Without one, it does nothing.
@@ -994,6 +998,8 @@ gen6_blorp_hiz_exec(struct brw_context *brw, struct 
intel_mipmap_tree *mt,
 blorp_surf_for_miptree(brw, &surf, mt, true, (1 << ISL_AUX_USAGE_HIZ),
   &level, layer, 1, isl_tmp);
 
+   assert(surf.aux_usage == ISL_AUX_USAGE_HIZ);
+
struct blorp_batch batch;
 blorp_batch_init(&brw->blorp, &batch, brw, 0);
 blorp_gen6_hiz_op(&batch, &surf, level, layer, op);
-- 
2.5.0.400.gff86faf
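
The fallback added in the first hunk above amounts to a per-level
downgrade of the requested aux usage. A minimal sketch, with
hypothetical toy_ types standing in for the miptree and the ISL enum:

```c
#include <assert.h>
#include <stdbool.h>

enum toy_aux_usage {
   TOY_AUX_NONE,
   TOY_AUX_HIZ,
};

/* Hypothetical miptree that records which levels have HiZ. */
struct toy_miptree {
   unsigned num_levels;
   bool level_has_hiz[8];
};

/* Downgrade HiZ to NONE on levels that have no HiZ buffer. */
static enum toy_aux_usage
toy_aux_usage_for_level(const struct toy_miptree *mt, unsigned level,
                        enum toy_aux_usage requested)
{
   assert(level < mt->num_levels);

   if (requested == TOY_AUX_HIZ && !mt->level_has_hiz[level])
      return TOY_AUX_NONE;
   return requested;
}
```

Callers that require HiZ, such as the HiZ op path in the second hunk,
can then assert on the result instead of silently using a level without
HiZ.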



[Mesa-dev] [PATCH 00/11] i965: Use BLORP for depth/stencil clears

2017-06-06 Thread Jason Ekstrand
This little series switches the GL driver to use BLORP for depth and
stencil clears.  BLORP has had depth/stencil clear support ever since we
started using it in the Vulkan driver but we didn't hook it up in GL
because of a few very hard-to-debug CTS fails.  Patch 10 takes care of
those and we now pass except for some weird behavior around occlusion
queries on Sandy Bridge.  I'll look into those later.  For now, I think the
series is worth reviewing.

Jason Ekstrand (11):
  i965/blorp: Set aux_usage to NONE for miplevels without HiZ
  mesa: Add a BUFFER_BITS mask for depth+stencil
  i965/miptree: Choose the stencil layout in miptree_create_layout
  intel/isl: Properly set SeparateStencilBufferEnable on gen5-6
  i965: Remove some of the remnants of meta
  i965: Remove some unneeded fields from brw_context
  i965/blorp: Set no_depth_or_stencil correctly
  i965/blorp: Do a depth flush/stall prior to HiZ operations
  i965: Disable the interleaved vertex optimization when instancing
  i965: Set step_rate == 0 for interleaved vertex buffers
  i965: Use blorp for depth/stencil clears on gen6+

 src/intel/isl/isl_emit_depth_stencil.c|  13 ++-
 src/mesa/drivers/dri/i965/brw_blorp.c | 129 ++
 src/mesa/drivers/dri/i965/brw_blorp.h |   4 +
 src/mesa/drivers/dri/i965/brw_clear.c |   6 ++
 src/mesa/drivers/dri/i965/brw_context.h   |  13 ---
 src/mesa/drivers/dri/i965/brw_draw_upload.c   |  12 ++-
 src/mesa/drivers/dri/i965/brw_wm.c|   2 +-
 src/mesa/drivers/dri/i965/genX_blorp_exec.c   |   3 +-
 src/mesa/drivers/dri/i965/genX_state_upload.c |   2 +-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c |   6 +-
 src/mesa/main/mtypes.h|   3 +
 11 files changed, 167 insertions(+), 26 deletions(-)

-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH 02/11] mesa: Add a BUFFER_BITS mask for depth+stencil

2017-06-06 Thread Jason Ekstrand
---
 src/mesa/main/mtypes.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 7ec0123..d77c26a 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -187,6 +187,9 @@ typedef enum
 BUFFER_BIT_COLOR6 | \
 BUFFER_BIT_COLOR7)
 
+/* Mask of bits for depth+stencil buffers */
+#define BUFFER_BITS_DEPTH_STENCIL (BUFFER_BIT_DEPTH | BUFFER_BIT_STENCIL)
+
 /**
  * Framebuffer configuration (aka visual / pixelformat)
  * Note: some of these fields should be boolean, but it appears that
-- 
2.5.0.400.gff86faf
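
A combined mask like this is mostly useful for filtering a clear mask
down to the depth/stencil bits. The sketch below shows that pattern with
hypothetical TOY_ bit values (the real BUFFER_BIT_* positions are
defined in mtypes.h and are not reproduced here).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit assignments, for illustration only. */
#define TOY_BIT_COLOR0   (1u << 0)
#define TOY_BIT_DEPTH    (1u << 1)
#define TOY_BIT_STENCIL  (1u << 2)
#define TOY_BITS_DEPTH_STENCIL (TOY_BIT_DEPTH | TOY_BIT_STENCIL)

/* Drop bits for buffers that cannot be written, then keep only the
 * depth/stencil bits.  A result of 0 means there is nothing to clear.
 */
static unsigned
toy_filter_clear_mask(unsigned mask, bool depth_writable,
                      bool stencil_writable)
{
   if (!depth_writable)
      mask &= ~TOY_BIT_DEPTH;
   if (!stencil_writable)
      mask &= ~TOY_BIT_STENCIL;
   return mask & TOY_BITS_DEPTH_STENCIL;
}
```

With all three bits requested and only depth writable, the result is
just the depth bit; with neither writable the result is 0 and the
caller can return early.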



[Mesa-dev] [PATCH 03/11] i965/miptree: Choose the stencil layout in miptree_create_layout

2017-06-06 Thread Jason Ekstrand
This ensures that we get the correct layout for all stencil buffers, not
just those which are created as separate stencil for a depth buffer.
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 07e9ecf..519e67b 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -335,6 +335,9 @@ intel_miptree_create_layout(struct brw_context *brw,
mt->msaa_layout = INTEL_MSAA_LAYOUT_NONE;
mt->refcount = 1;
 
+   if (brw->gen == 6 && format == MESA_FORMAT_S_UINT8)
+  layout_flags |= MIPTREE_LAYOUT_GEN6_HIZ_STENCIL;
+
int depth_multiply = 1;
if (num_samples > 1) {
   /* Adjust width/height/depth for MSAA */
@@ -465,8 +468,7 @@ intel_miptree_create_layout(struct brw_context *brw,
  intel_miptree_wants_hiz_buffer(brw, mt)))) {
   uint32_t stencil_flags = MIPTREE_LAYOUT_ACCELERATED_UPLOAD;
   if (brw->gen == 6) {
- stencil_flags |= MIPTREE_LAYOUT_GEN6_HIZ_STENCIL |
-  MIPTREE_LAYOUT_TILING_ANY;
+ stencil_flags |= MIPTREE_LAYOUT_TILING_ANY;
   }
 
   mt->stencil_mt = intel_miptree_create(brw,
-- 
2.5.0.400.gff86faf



Re: [Mesa-dev] [PATCH] intel: Fix broxton 2x6 way size computation

2017-06-06 Thread Mark Janes
Tested-by: Mark Janes 

Anuj Phogat  writes:

> This patch is undoing the changes to way size computation
> in broxton 2x6, made by below commit:
>
> Commit: 0d576fbfbe912cf3fb9ab594bb31eb58bccf2138
> Author: Anuj Phogat 
> i965: Simplify l3 way size computations
>
> By making use of l3_banks field in gen_device_info struct
> l3_way_size for gen7+ = 2 * l3_banks.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101306
> Signed-off-by: Anuj Phogat 
> Cc: Jason Ekstrand 
> Cc: Mark Janes 
> Cc: Francisco Jerez 
> ---
> Note: Above bugzilla exposed a bug in our l3 allocation for
> broxton 2x6. We need more changes to fix l3 config. I'll send
> them later to the list. For now this patch brings things back
> to where they were for bxt and unblocks the CI system to be
> utilized for the performance work going on at present.
> ---
>  src/intel/common/gen_l3_config.c | 4 
>  1 file changed, 4 insertions(+)
>
> diff --git a/src/intel/common/gen_l3_config.c 
> b/src/intel/common/gen_l3_config.c
> index e0825e9..2520838 100644
> --- a/src/intel/common/gen_l3_config.c
> +++ b/src/intel/common/gen_l3_config.c
> @@ -255,6 +255,10 @@ static unsigned
>  get_l3_way_size(const struct gen_device_info *devinfo)
>  {
> assert(devinfo->l3_banks);
> +
> +   if (devinfo->is_broxton)
> +  return 4;
> +
> return 2 * devinfo->l3_banks;
>  }
>  
> -- 
> 2.9.3
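
For readers skimming the thread, the patched function reduces to the
following sketch. The toy_ struct is a hypothetical stand-in for
gen_device_info with only the two fields the function reads, and the
bank counts used with it are made-up example values.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical subset of the device info used by the computation. */
struct toy_device_info {
   bool is_broxton;
   unsigned l3_banks;
};

/* Way size as in the patch: a fixed value on Broxton, otherwise
 * twice the number of L3 banks.
 */
static unsigned
toy_l3_way_size(const struct toy_device_info *devinfo)
{
   assert(devinfo->l3_banks);

   if (devinfo->is_broxton)
      return 4;

   return 2 * devinfo->l3_banks;
}
```

With an example non-Broxton device reporting 4 banks the sketch returns
8, while any Broxton device returns 4 regardless of its bank count.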


[Mesa-dev] [PATCH] radv: optimise compute dispatch to avoid looking up the sgpr repeatedly.

2017-06-06 Thread Dave Airlie
From: Dave Airlie 

Same as we did for draw dispatch and vertex sgprs.
---
 src/amd/vulkan/radv_cmd_buffer.c | 23 +--
 src/amd/vulkan/radv_pipeline.c   |  6 ++
 src/amd/vulkan/radv_private.h|  4 
 3 files changed, 19 insertions(+), 14 deletions(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index a069945..a4ddd7e 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -2872,13 +2872,10 @@ void radv_CmdDispatch(
 
MAYBE_UNUSED unsigned cdw_max = 
radeon_check_space(cmd_buffer->device->ws, cmd_buffer->cs, 10);
 
-   struct ac_userdata_info *loc = 
radv_lookup_user_sgpr(cmd_buffer->state.compute_pipeline,
-
MESA_SHADER_COMPUTE, AC_UD_CS_GRID_SIZE);
-   if (loc->sgpr_idx != -1) {
-   assert(!loc->indirect);
+   if (cmd_buffer->state.compute_pipeline->compute.cs_grid_size_sgpr) {
uint8_t grid_used = 
cmd_buffer->state.compute_pipeline->shaders[MESA_SHADER_COMPUTE]->info.info.cs.grid_components_used;
-   assert(loc->num_sgprs == grid_used);
-   radeon_set_sh_reg_seq(cmd_buffer->cs, 
R_00B900_COMPUTE_USER_DATA_0 + loc->sgpr_idx * 4, grid_used);
+   radeon_set_sh_reg_seq(cmd_buffer->cs, 
cmd_buffer->state.compute_pipeline->compute.cs_grid_size_sgpr,
+ grid_used);
radeon_emit(cmd_buffer->cs, x);
if (grid_used > 1)
radeon_emit(cmd_buffer->cs, y);
@@ -2912,9 +2909,9 @@ void radv_CmdDispatchIndirect(
radv_flush_compute_state(cmd_buffer);
 
MAYBE_UNUSED unsigned cdw_max = 
radeon_check_space(cmd_buffer->device->ws, cmd_buffer->cs, 25);
-   struct ac_userdata_info *loc = 
radv_lookup_user_sgpr(cmd_buffer->state.compute_pipeline,
-
MESA_SHADER_COMPUTE, AC_UD_CS_GRID_SIZE);
-   if (loc->sgpr_idx != -1) {
+
+
+   if (cmd_buffer->state.compute_pipeline->compute.cs_grid_size_sgpr) {
uint8_t grid_used = 
cmd_buffer->state.compute_pipeline->shaders[MESA_SHADER_COMPUTE]->info.info.cs.grid_components_used;
for (unsigned i = 0; i < grid_used; ++i) {
radeon_emit(cmd_buffer->cs, PKT3(PKT3_COPY_DATA, 4, 0));
@@ -2922,7 +2919,7 @@ void radv_CmdDispatchIndirect(
COPY_DATA_DST_SEL(COPY_DATA_REG));
radeon_emit(cmd_buffer->cs, (va +  4 * i));
radeon_emit(cmd_buffer->cs, (va + 4 * i) >> 32);
-   radeon_emit(cmd_buffer->cs, 
((R_00B900_COMPUTE_USER_DATA_0 + loc->sgpr_idx * 4) >> 2) + i);
+   radeon_emit(cmd_buffer->cs, 
(cmd_buffer->state.compute_pipeline->compute.cs_grid_size_sgpr >> 2) + i);
radeon_emit(cmd_buffer->cs, 0);
}
}
@@ -2984,11 +2981,9 @@ void radv_unaligned_dispatch(

S_00B81C_NUM_THREAD_FULL(compute_shader->info.cs.block_size[2]) |
S_00B81C_NUM_THREAD_PARTIAL(remainder[2]));
 
-   struct ac_userdata_info *loc = 
radv_lookup_user_sgpr(cmd_buffer->state.compute_pipeline,
-
MESA_SHADER_COMPUTE, AC_UD_CS_GRID_SIZE);
-   if (loc->sgpr_idx != -1) {
+   if (cmd_buffer->state.compute_pipeline->compute.cs_grid_size_sgpr) {
uint8_t grid_used = 
cmd_buffer->state.compute_pipeline->shaders[MESA_SHADER_COMPUTE]->info.info.cs.grid_components_used;
-   radeon_set_sh_reg_seq(cmd_buffer->cs, 
R_00B900_COMPUTE_USER_DATA_0 + loc->sgpr_idx * 4, grid_used);
+   radeon_set_sh_reg_seq(cmd_buffer->cs, 
cmd_buffer->state.compute_pipeline->compute.cs_grid_size_sgpr, grid_used);
radeon_emit(cmd_buffer->cs, blocks[0]);
if (grid_used > 1)
radeon_emit(cmd_buffer->cs, blocks[1]);
diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
index ccbe20d..bda4c74 100644
--- a/src/amd/vulkan/radv_pipeline.c
+++ b/src/amd/vulkan/radv_pipeline.c
@@ -2375,6 +2375,12 @@ static VkResult radv_compute_pipeline_create(
 
 
pipeline->need_indirect_descriptor_sets |= 
pipeline->shaders[MESA_SHADER_COMPUTE]->info.need_indirect_descriptor_sets;
+
+   struct ac_userdata_info *loc = radv_lookup_user_sgpr(pipeline,
+        MESA_SHADER_COMPUTE, AC_UD_CS_GRID_SIZE);
+   if (loc->sgpr_idx != -1) {
+   pipeline->compute.cs_grid_size_sgpr = R_00B900_COMPUTE_USER_DATA_0 + loc->sgpr_idx * 4;
+   }
result = radv_pipeline_scratch_init(device, pipeline);
if (result != VK_SUCCESS) {
radv_pipeline_destroy(device, pipeline, pAllocator);
diff --git a/src/amd/vulkan/radv_private.h 
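
The idea in this patch (resolve the user SGPR once at pipeline creation, then test a cached register offset at dispatch time) can be sketched in isolation. All names below are hypothetical, and the base offset 0xB900 is an assumption inferred from the register name `R_00B900_COMPUTE_USER_DATA_0`; this is not the radv code itself.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed from the register name R_00B900_COMPUTE_USER_DATA_0. */
#define SKETCH_COMPUTE_USER_DATA_0 0xB900u

/* Hypothetical stand-in for ac_userdata_info: -1 means "not used". */
struct sketch_userdata_loc {
   int sgpr_idx;
};

struct sketch_compute_pipeline {
   /* 0 means the shader does not read the grid size; any real
    * register offset is non-zero because the base is non-zero. */
   uint32_t cs_grid_size_sgpr;
};

/* Done once, at pipeline creation. */
static void
sketch_pipeline_create(struct sketch_compute_pipeline *p,
                       const struct sketch_userdata_loc *loc)
{
   p->cs_grid_size_sgpr = 0;
   if (loc->sgpr_idx != -1)
      p->cs_grid_size_sgpr = SKETCH_COMPUTE_USER_DATA_0 + loc->sgpr_idx * 4;
}

/* Done on every dispatch: a single load and compare, no lookup. */
static int
sketch_dispatch_needs_grid_size(const struct sketch_compute_pipeline *p)
{
   return p->cs_grid_size_sgpr != 0;
}
```

Using 0 as the "unused" sentinel only works because the cached value is a full register offset rather than an SGPR index, for which 0 would be a valid slot.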

[Mesa-dev] [PATCH 1/2] radv: move the pipeline static pieces of ia multi vgt calcs to pipeline.

2017-06-06 Thread Dave Airlie
From: Dave Airlie 

This shifts a bunch of the pipeline specific calcs into pipeline
creation.

This should allow better optimising of the multi vgt calcs
---
 src/amd/vulkan/radv_pipeline.c | 62 ++
 src/amd/vulkan/radv_private.h  |  6 
 src/amd/vulkan/si_cmd_buffer.c | 60 
 3 files changed, 73 insertions(+), 55 deletions(-)

diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
index e77f959..ccbe20d 100644
--- a/src/amd/vulkan/radv_pipeline.c
+++ b/src/amd/vulkan/radv_pipeline.c
@@ -1960,6 +1960,8 @@ static void calculate_ps_inputs(struct radv_pipeline 
*pipeline)
pipeline->graphics.ps_input_cntl_num = ps_offset;
 }
 
+#define SI_GS_PER_ES 128
+
 VkResult
 radv_pipeline_init(struct radv_pipeline *pipeline,
   struct radv_device *device,
@@ -2170,8 +2172,68 @@ radv_pipeline_init(struct radv_pipeline *pipeline,
pipeline->graphics.prim_vertex_count.incr = 1;
}
calculate_tess_state(pipeline, pCreateInfo);
+
+   }
+
+   pipeline->graphics.primgroup_size = 128;
+   if (radv_pipeline_has_tess(pipeline))
+   pipeline->graphics.primgroup_size = 
pipeline->graphics.tess.num_patches;
+   else if (radv_pipeline_has_gs(pipeline))
+   pipeline->graphics.primgroup_size = 64;
+
+   /* WD_SWITCH_ON_EOP has no effect on GPUs with less than
+* 4 shader engines. Set 1 to pass the assertion below.
+* The other cases are hardware requirements. */
+   if (pipeline->device->physical_device->rad_info.chip_class >= CIK) {
+   if (pipeline->device->physical_device->rad_info.max_se < 4 ||
+   pipeline->graphics.prim == V_008958_DI_PT_POLYGON ||
+   pipeline->graphics.prim == V_008958_DI_PT_LINELOOP ||
+   pipeline->graphics.prim == V_008958_DI_PT_TRIFAN ||
+   pipeline->graphics.prim == V_008958_DI_PT_TRISTRIP_ADJ ||
+   (pipeline->graphics.prim_restart_enable &&
+(pipeline->device->physical_device->rad_info.family < CHIP_POLARIS10 ||
+ (pipeline->graphics.prim != V_008958_DI_PT_POINTLIST &&
+  pipeline->graphics.prim != V_008958_DI_PT_LINESTRIP &&
+  pipeline->graphics.prim != V_008958_DI_PT_TRISTRIP))))
+   pipeline->graphics.cik_wd_switch_on_eop = true;
+   }
+
+   if (radv_pipeline_has_tess(pipeline)) {
+   /* SWITCH_ON_EOI must be set if PrimID is used. */
+   if 
(pipeline->shaders[MESA_SHADER_TESS_CTRL]->info.tcs.uses_prim_id ||
+   
pipeline->shaders[MESA_SHADER_TESS_EVAL]->info.tes.uses_prim_id)
+   pipeline->graphics.tess_ia_switch_on_eoi = true;
+
+   /* Bug with tessellation and GS on Bonaire and older 2 SE 
chips. */
+   if ((pipeline->device->physical_device->rad_info.family == 
CHIP_TAHITI ||
+pipeline->device->physical_device->rad_info.family == 
CHIP_PITCAIRN ||
+pipeline->device->physical_device->rad_info.family == 
CHIP_BONAIRE) &&
+   radv_pipeline_has_gs(pipeline))
+   pipeline->graphics.tess_partial_vs_wave = true;
+
+   /* Needed for 028B6C_DISTRIBUTION_MODE != 0 */
+   if (pipeline->device->has_distributed_tess) {
+   if (radv_pipeline_has_gs(pipeline)) {
+   if 
(pipeline->device->physical_device->rad_info.chip_class <= VI)
+   pipeline->graphics.partial_es_wave = 
true;
+
+   if 
(pipeline->device->physical_device->rad_info.family == CHIP_TONGA ||
+   
pipeline->device->physical_device->rad_info.family == CHIP_FIJI ||
+   
pipeline->device->physical_device->rad_info.family == CHIP_POLARIS10 ||
+   
pipeline->device->physical_device->rad_info.family == CHIP_POLARIS11 ||
+   
pipeline->device->physical_device->rad_info.family == CHIP_POLARIS12)
+   pipeline->graphics.tess_partial_vs_wave 
= true;
+   } else {
+   pipeline->graphics.tess_partial_vs_wave = true;
+   }
+   }
}
 
+   if (radv_pipeline_has_gs(pipeline))
+   if (SI_GS_PER_ES / pipeline->graphics.primgroup_size >= 
pipeline->device->gs_table_depth - 3)
+   pipeline->graphics.partial_es_wave = true;
+
+
const VkPipelineVertexInputStateCreateInfo *vi_info =
pCreateInfo->pVertexInputState;
for (uint32_t i = 0; i < vi_info->vertexAttributeDescriptionCount; i++) 
{
diff --git 
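
The primgroup-size selection and the GS table check that this patch moves into pipeline creation can be sketched as a small standalone function. The struct and field names are hypothetical; the constants 128 and 64, the `num_patches` case and the `gs_table_depth - 3` comparison are taken from the diff above.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_GS_PER_ES 128

/* Hypothetical flattening of the pipeline and device fields the patch reads. */
struct sketch_pipeline {
   bool has_tess;
   bool has_gs;
   unsigned tess_num_patches;
   unsigned gs_table_depth;   /* assumed to be at least 3 */
   unsigned primgroup_size;   /* output */
   bool partial_es_wave;      /* output, only ever set to true here */
};

static void
sketch_init_primgroup(struct sketch_pipeline *p)
{
   /* Default, then the tessellation and GS overrides from the patch. */
   p->primgroup_size = 128;
   if (p->has_tess)
      p->primgroup_size = p->tess_num_patches;
   else if (p->has_gs)
      p->primgroup_size = 64;

   /* Guard the division below; the patch itself has no such check. */
   assert(p->primgroup_size > 0);

   if (p->has_gs &&
       SKETCH_GS_PER_ES / p->primgroup_size >= p->gs_table_depth - 3)
      p->partial_es_wave = true;
}
```

Everything here depends only on pipeline and device state, which is why it can run once at creation time instead of once per draw.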

[Mesa-dev] [PATCH 2/2] radv: only update ia_multi_vgt_param if something changes.

2017-06-06 Thread Dave Airlie
From: Dave Airlie 

This in theory should reduce the number of calculations for this register
per draw.
---
 src/amd/vulkan/radv_private.h  |  2 ++
 src/amd/vulkan/si_cmd_buffer.c | 11 +++
 2 files changed, 13 insertions(+)

diff --git a/src/amd/vulkan/radv_private.h b/src/amd/vulkan/radv_private.h
index c88f1ff..8f60d9b 100644
--- a/src/amd/vulkan/radv_private.h
+++ b/src/amd/vulkan/radv_private.h
@@ -775,6 +775,8 @@ struct radv_cmd_state {
uint32_t  descriptors_dirty;
uint32_t  trace_id;
uint32_t  last_ia_multi_vgt_param;
+   bool last_draw_instanced, last_draw_indirect, 
last_multi_instances_smaller_than_primgroup, ia_set_once;
+   
 };
 
 struct radv_cmd_pool {
diff --git a/src/amd/vulkan/si_cmd_buffer.c b/src/amd/vulkan/si_cmd_buffer.c
index 3464def..9b65b45 100644
--- a/src/amd/vulkan/si_cmd_buffer.c
+++ b/src/amd/vulkan/si_cmd_buffer.c
@@ -700,6 +700,17 @@ si_get_ia_multi_vgt_param(struct radv_cmd_buffer 
*cmd_buffer,
 
multi_instances_smaller_than_primgroup = indirect_draw || 
instance_less_than_primgroup_size;
 
+   if (cmd_buffer->state.ia_set_once &&
+   cmd_buffer->state.last_draw_instanced == instanced_draw &&
+   cmd_buffer->state.last_draw_indirect == indirect_draw &&
+   cmd_buffer->state.last_multi_instances_smaller_than_primgroup == 
multi_instances_smaller_than_primgroup)
+   return cmd_buffer->state.last_ia_multi_vgt_param;
+
+   cmd_buffer->state.ia_set_once = true;
+   cmd_buffer->state.last_draw_instanced = instanced_draw;
+   cmd_buffer->state.last_draw_indirect = indirect_draw;
+   cmd_buffer->state.last_multi_instances_smaller_than_primgroup = 
multi_instances_smaller_than_primgroup;
+
ia_switch_on_eoi = 
cmd_buffer->state.pipeline->graphics.tess_ia_switch_on_eoi;
partial_vs_wave = 
cmd_buffer->state.pipeline->graphics.tess_partial_vs_wave;
partial_es_wave = cmd_buffer->state.pipeline->graphics.partial_es_wave;
-- 
2.9.4
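
The caching scheme in this patch amounts to memoizing the register value on the three per-draw booleans. A minimal sketch follows, with hypothetical names and a placeholder encoding in place of the real register layout; the `recompute_count` field exists only to make the behaviour observable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical subset of the command-buffer state added by the patch. */
struct sketch_state {
   bool ia_set_once;
   bool last_instanced;
   bool last_indirect;
   bool last_smaller;
   uint32_t last_value;
   unsigned recompute_count;   /* demonstration only */
};

/* Placeholder for the expensive computation; not the real encoding. */
static uint32_t
sketch_compute(bool instanced, bool indirect, bool smaller)
{
   return (uint32_t)instanced | (uint32_t)indirect << 1 | (uint32_t)smaller << 2;
}

static uint32_t
sketch_get_param(struct sketch_state *s,
                 bool instanced, bool indirect, bool smaller)
{
   /* Fast path: same inputs as the previous draw. */
   if (s->ia_set_once &&
       s->last_instanced == instanced &&
       s->last_indirect == indirect &&
       s->last_smaller == smaller)
      return s->last_value;

   s->ia_set_once = true;
   s->last_instanced = instanced;
   s->last_indirect = indirect;
   s->last_smaller = smaller;

   s->recompute_count++;
   s->last_value = sketch_compute(instanced, indirect, smaller);
   return s->last_value;
}
```

In the patch the value also depends on per-pipeline fields such as `tess_ia_switch_on_eoi`, so a real version would presumably need to clear `ia_set_once` whenever the bound pipeline changes; the sketch leaves that out.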



Re: [Mesa-dev] [PATCH 14/30] intel/isl: Add an enum for describing auxiliary compression state

2017-06-06 Thread Jason Ekstrand
On Tue, Jun 6, 2017 at 6:00 PM, Chad Versace  wrote:

> On Tue 06 Jun 2017, Jason Ekstrand wrote:
> > On Tue, Jun 6, 2017 at 1:32 PM, Jason Ekstrand wrote:
> >
> > > On Tue, Jun 6, 2017 at 1:22 PM, Chad Versace wrote:
> > >
> > >> On Fri 26 May 2017, Jason Ekstrand wrote:
>
> > > How about a section after the auxiliary compression ops section which goes
> > > into detail on each of the compression types and discusses which states are
> > > valid etc.
> > >
> >
> > How does this look:
> >
> > https://cgit.freedesktop.org/~jekstrand/mesa/commit/?h=wip/i965-resolve-rework-v3=8478b102c99e3ec43ec687b3f4e52acb9acbd5ba
> >
> > I'll squash it in if you like it.
>
> Please squash that in, with fixes :)
>
> I don't believe the pass-through state is impossible with MCS, because
> there is no single surface for a write to "pass through" to. The aux
> surface can never be ignored with MCS. Another bit of evidence for this
> is that there exists no MCS ambiguate op, and therefore no arrow can
> exist in the diagram that carries the "resolved" box to the
> "pass-through" box.
>

Too many negatives going on...  I think you meant "I believe the
pass-through state is impossible" :-)

I do not agree.  While no resolve has been written, one could easily do
so.  It would be implemented most likely as a blit from the surface to
itself with MCS enabled on the source and disabled for the destination
followed by blasting the appropriate value into the MCS.  Modulo issues
with the order in which pixels are dispatched (which I don't think is an
actual issue so long as SIMD > num_samples), the result should be a surface
in the pass-through state which can safely be rendered to with MCS
disabled.  Now, why anyone would ever want to do that is beyond me.  The
state which doesn't exist is the regular resolved state because, as with
CCS and MCS, the aux surface stores so little data, that there isn't really
any room for compression when the main surface contains valid data.

--Jason


Re: [Mesa-dev] [PATCH 14/30] intel/isl: Add an enum for describing auxiliary compression state

2017-06-06 Thread Jason Ekstrand
On Tue, Jun 6, 2017 at 5:41 PM, Chad Versace  wrote:

> On Tue 06 Jun 2017, Jason Ekstrand wrote:
> > On Tue, Jun 6, 2017 at 1:22 PM, Chad Versace 
> > wrote:
> >
> > > On Fri 26 May 2017, Jason Ekstrand wrote:
> > > > This enum describes all of the states that an auxiliary compressed
> > > > surface can have.  All of the states as well as normative language for
> > > > referring to each of the compression operations is provided in the
> > > > truly colossal comment for the new isl_aux_state enum.  There is also
> > > > a diagram showing how surfaces move between the different states.
> > > > ---
> > > >  src/intel/isl/isl.h | 142 ++
> > > ++
> > > >  1 file changed, 142 insertions(+)
> > > >
> > > > diff --git a/src/intel/isl/isl.h b/src/intel/isl/isl.h
> > > > index b9d8fa8..df6d3e3 100644
> > > > --- a/src/intel/isl/isl.h
> > > > +++ b/src/intel/isl/isl.h
> > > > @@ -560,6 +560,148 @@ enum isl_aux_usage {
> > > > ISL_AUX_USAGE_CCS_E,
> > > >  };
> > > >
> > > > +/**
> > > > + * Enum for keeping track of the state an auxiliary compressed surface.
> > >
> > > This is really nice and helpful for everyone.
> > >
> > > I also learned something new from it: that a resolve on CCS_E also
> > > ambiguates the aux surface. Do you have any insight on why the hardware
> > > does that?
> > >
> > > > + *
> > > > + * For any given auxiliary surface compression format (HiZ, CCS, or MCS), any
> > > > + * given slice (lod + array layer) can be in one of the six states described
> > > > + * by this enum.  Draw and resolve operations may cause the slice to change
> > > > + * from one state to another.  The six valid states are:
> > >
> > > I have one suggestion: please carefully distinguish between CCS_D and
> > > CCS_E in the documentation. In my experience, muddy thinking where the
> > > two are not cleanly distinguished leads to confused minds and confusing
> > > code.
> > >
> > > For someone who already has a firm grasp on aux state, the ambiguous
> > > term "CCS" poses no problem. That wise person automatically infers from
> > > context if "CCS" applies to CCS_D, to CCS_E, or to both. But for someone
> > > whose understanding of aux isn't as solid, the term "CCS" can lead to
> > > incorrect inferences.
> > >
> > > For example, below you say that the partial resolve "operation is only
> > > available for CCS". That's misleading. It should say "only available for
> > > CCS_E".
> > >
> > > Another benefit: It becomes possible to document that
> > > ISL_AUX_STATE_COMPRESSED_NO_CLEAR is valid only for CCS_E and HIZ, but
> > > not valid for CCS_D and MCS.
> > >
> >
> > It is valid for MCS.  If you don't fast-clear but only render, then you're
> > in that state.  It's only invalid for CCS_D.
>
> Oops. You're right. compressed-no-clear is the "normal" state for MCS
> compression blocks.
>
> >
> >
> > > Other than the CCS_D/CCS_E distinction, the patch looks good to me. This
> > > is a really nice addition to the driver.
> > >
> >
> > How about a section after the auxiliary compression ops section which goes
> > into detail on each of the compression types and discusses which states are
> > valid etc.
>
> That sounds good, as long as there's not too much duplication between
> the two sections.
>

How about this:

https://cgit.freedesktop.org/~jekstrand/mesa/commit/?h=wip/i965-resolve-rework-v3=8478b102c99e3ec43ec687b3f4e52acb9acbd5ba


Re: [Mesa-dev] [PATCH 14/30] intel/isl: Add an enum for describing auxiliary compression state

2017-06-06 Thread Chad Versace
On Tue 06 Jun 2017, Jason Ekstrand wrote:
> On Tue, Jun 6, 2017 at 1:32 PM, Jason Ekstrand  wrote:
> 
> > On Tue, Jun 6, 2017 at 1:22 PM, Chad Versace 
> > wrote:
> >
> >> On Fri 26 May 2017, Jason Ekstrand wrote:

> > How about a section after the auxiliary compression ops section which goes
> > into detail on each of the compression types and discusses which states are
> > valid etc.
> >
> 
> How does this look:
> 
> https://cgit.freedesktop.org/~jekstrand/mesa/commit/?h=wip/i965-resolve-rework-v3=8478b102c99e3ec43ec687b3f4e52acb9acbd5ba
> 
> I'll squash it in if you like it.

Please squash that in, with fixes :)

I don't believe the pass-through state is impossible with MCS, because
there is no single surface for a write to "pass through" to. The aux
surface can never be ignored with MCS. Another bit of evidence for this
is that there exists no MCS ambiguate op, and therefore no arrow can
exist in the diagram that carries the "resolved" box to the
"pass-through" box.


Re: [Mesa-dev] [PATCH 14/30] intel/isl: Add an enum for describing auxiliary compression state

2017-06-06 Thread Chad Versace
On Tue 06 Jun 2017, Jason Ekstrand wrote:
> On Tue, Jun 6, 2017 at 1:22 PM, Chad Versace 
> wrote:
> 
> > On Fri 26 May 2017, Jason Ekstrand wrote:
> > > This enum describes all of the states that an auxiliary compressed
> > > surface can have.  All of the states as well as normative language for
> > > referring to each of the compression operations is provided in the
> > > truly colossal comment for the new isl_aux_state enum.  There is also
> > > a diagram showing how surfaces move between the different states.
> > > ---
> > >  src/intel/isl/isl.h | 142 ++
> > ++
> > >  1 file changed, 142 insertions(+)
> > >
> > > diff --git a/src/intel/isl/isl.h b/src/intel/isl/isl.h
> > > index b9d8fa8..df6d3e3 100644
> > > --- a/src/intel/isl/isl.h
> > > +++ b/src/intel/isl/isl.h
> > > @@ -560,6 +560,148 @@ enum isl_aux_usage {
> > > ISL_AUX_USAGE_CCS_E,
> > >  };
> > >
> > > +/**
> > > + * Enum for keeping track of the state an auxiliary compressed surface.
> >
> > This is really nice and helpful for everyone.
> >
> > I also learned something new from it: that a resolve on CCS_E also
> > ambiguates the aux surface. Do you have any insight on why the hardware
> > does that?
> >
> > > + *
> > > + * For any given auxiliary surface compression format (HiZ, CCS, or MCS), any
> > > + * given slice (lod + array layer) can be in one of the six states described
> > > + * by this enum.  Draw and resolve operations may cause the slice to change
> > > + * from one state to another.  The six valid states are:
> >
> > I have one suggestion: please carefully distinguish between CCS_D and
> > CCS_E in the documentation. In my experience, muddy thinking where the
> > two are not cleanly distinguished leads to confused minds and confusing
> > code.
> >
> > For someone who already has a firm grasp on aux state, the ambiguous
> > term "CCS" poses no problem. That wise person automatically infers from
> > context if "CCS" applies to CCS_D, to CCS_E, or to both. But for someone
> > whose understanding of aux isn't as solid, the term "CCS" can lead to
> > incorrect inferences.
> >
> > For example, below you say that the partial resolve "operation is only
> > available for CCS". That's misleading. It should say "only available for
> > CCS_E".
> >
> > Another benefit: It becomes possible to document that
> > ISL_AUX_STATE_COMPRESSED_NO_CLEAR is valid only for CCS_E and HIZ, but
> > not valid for CCS_D and MCS.
> >
> 
> It is valid for MCS.  If you don't fast-clear but only render, then you're
> in that state.  It's only invalid for CCS_D.

Oops. You're right. compressed-no-clear is the "normal" state for MCS
compression blocks.

> 
> 
> > Other than the CCS_D/CCS_E distinction, the patch looks good to me. This
> > is a really nice addition to the driver.
> >
> 
> How about a section after the auxiliary compression ops section which goes
> into detail on each of the compression types and discusses which states are
> valid etc.

That sounds good, as long as there's not too much duplication between
the two sections.
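
The one validity rule this exchange settles (compressed-no-clear is valid for everything except CCS_D) is the kind of thing the proposed per-compression-type section could express as a predicate. A minimal sketch follows, using hypothetical enum names rather than the real `isl_aux_usage` values; the MCS pass-through question is deliberately left out, since the thread does not resolve it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names; the real enum lives in isl.h. */
enum sketch_aux_usage {
   SKETCH_AUX_HIZ,
   SKETCH_AUX_MCS,
   SKETCH_AUX_CCS_D,
   SKETCH_AUX_CCS_E,
};

/* Per the discussion above: rendering without a fast clear leaves
 * HiZ, MCS and CCS_E surfaces in compressed-no-clear; only CCS_D
 * cannot be in that state. */
static bool
sketch_compressed_no_clear_valid(enum sketch_aux_usage usage)
{
   return usage != SKETCH_AUX_CCS_D;
}
```

Keeping CCS_D and CCS_E as separate enumerators, rather than a single "CCS" value, is what makes a rule like this expressible at all.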



Re: [Mesa-dev] [PATCH 3/4] configure.ac: remove unused Android specifics

2017-06-06 Thread Chad Versace
On Mon 05 Jun 2017, Emil Velikov wrote:
> From: Emil Velikov 
> 
> The HAVE_ANDROID conditional has been unused as of commit 51accecce77
> ("mesa/dri: always link against shared glapi") and with that one gone we
> no longer need the host detection.
> 
> Cc: Chad Versace 
> Cc: Nicolas Boichat 
> Signed-off-by: Emil Velikov 
> ---
>  configure.ac | 6 --
>  1 file changed, 6 deletions(-)

Since it's not used, then kill it. We can always revive it later if we
need it. (And I do suspect we may need the CHOST detection again).

Reviewed-by: Chad Versace 



[Mesa-dev] [PATCH] radv: don't do divide in si_get_ia_multi_vgt_param unless needed.

2017-06-06 Thread Dave Airlie
From: Dave Airlie 

We only need the num_prims for instanced and geometry cases.

Bas, I know you've got some more ideas for this area, so feel free
to include this or do it better.
---
 src/amd/vulkan/si_cmd_buffer.c | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/src/amd/vulkan/si_cmd_buffer.c b/src/amd/vulkan/si_cmd_buffer.c
index 33414c1..47bf553 100644
--- a/src/amd/vulkan/si_cmd_buffer.c
+++ b/src/amd/vulkan/si_cmd_buffer.c
@@ -690,16 +690,20 @@ si_get_ia_multi_vgt_param(struct radv_cmd_buffer 
*cmd_buffer,
bool ia_switch_on_eoi = false;
bool partial_vs_wave = false;
bool partial_es_wave = false;
-   uint32_t num_prims = radv_prims_for_vertices(&cmd_buffer->state.pipeline->graphics.prim_vertex_count, draw_vertex_count);
+   uint32_t num_prims = 0;
bool multi_instances_smaller_than_primgroup;
-
+   bool instance_less_than_primgroup_size = false;
if (radv_pipeline_has_tess(cmd_buffer->state.pipeline))
primgroup_size = 
cmd_buffer->state.pipeline->graphics.tess.num_patches;
else if (radv_pipeline_has_gs(cmd_buffer->state.pipeline))
primgroup_size = 64;  /* recommended with a GS */
 
-   multi_instances_smaller_than_primgroup = indirect_draw || 
(instanced_draw &&
-  num_prims < 
primgroup_size);
+   if (instanced_draw || radv_pipeline_has_gs(cmd_buffer->state.pipeline)) {
+   num_prims = radv_prims_for_vertices(&cmd_buffer->state.pipeline->graphics.prim_vertex_count, draw_vertex_count);
+   instance_less_than_primgroup_size = num_prims < primgroup_size;
+   }
+
+   multi_instances_smaller_than_primgroup = indirect_draw || 
instance_less_than_primgroup_size;
if (radv_pipeline_has_tess(cmd_buffer->state.pipeline)) {
/* SWITCH_ON_EOI must be set if PrimID is used. */
if 
(cmd_buffer->state.pipeline->shaders[MESA_SHADER_TESS_CTRL]->info.tcs.uses_prim_id
 ||
-- 
2.9.4
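
The change above defers the primitive-count division until a code path needs it. A standalone sketch of that control flow follows; the helper and struct are hypothetical stand-ins (the real `radv_prims_for_vertices` may differ in detail), and only the branch structure mirrors the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the pipeline's prim_vertex_count. */
struct sketch_prim_vertex_count {
   unsigned min;    /* vertices needed for the first primitive */
   unsigned incr;   /* vertices per additional primitive */
};

/* Assumed shape of the helper: this is where the divide happens. */
static unsigned
sketch_prims_for_vertices(const struct sketch_prim_vertex_count *info,
                          unsigned num)
{
   if (info->incr == 0 || num < info->min)
      return 0;
   return 1 + (num - info->min) / info->incr;
}

static bool
sketch_multi_instances_smaller(bool indirect_draw, bool instanced_draw,
                               bool has_gs,
                               const struct sketch_prim_vertex_count *info,
                               unsigned draw_vertex_count,
                               unsigned primgroup_size)
{
   bool instance_less_than_primgroup_size = false;

   /* Only pay for the division on the paths that consume num_prims. */
   if (instanced_draw || has_gs) {
      unsigned num_prims = sketch_prims_for_vertices(info, draw_vertex_count);
      instance_less_than_primgroup_size = num_prims < primgroup_size;
   }

   return indirect_draw || instance_less_than_primgroup_size;
}
```

For a plain, non-instanced draw without a geometry shader the function reduces to returning `indirect_draw`, with no division at all.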



Re: [Mesa-dev] [PATCH v2 1/1] radeonsi: Use libdrm to get chipset name

2017-06-06 Thread Marek Olšák
Reviewed-by: Marek Olšák 

Marek

On Wed, Jun 7, 2017 at 12:21 AM, Samuel Li  wrote:
> v2: Add a func pointer to radeon_winsys to support radeon later.
>
> Change-Id: I614ea71424f9e5c97e4ae68654315d28c89eaa5f
> Signed-off-by: Samuel Li 
> ---
>  src/gallium/drivers/radeon/r600_pipe_common.c | 11 ++-
>  src/gallium/drivers/radeon/radeon_winsys.h|  2 ++
>  src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c |  8 
>  3 files changed, 20 insertions(+), 1 deletion(-)
>
> diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c 
> b/src/gallium/drivers/radeon/r600_pipe_common.c
> index 2c0cadb..48d136a 100644
> --- a/src/gallium/drivers/radeon/r600_pipe_common.c
> +++ b/src/gallium/drivers/radeon/r600_pipe_common.c
> @@ -790,6 +790,15 @@ static const char* r600_get_device_vendor(struct 
> pipe_screen* pscreen)
>
>  static const char* r600_get_chip_name(struct r600_common_screen *rscreen)
>  {
> +   const char *mname;
> +
> +   if (rscreen->ws->get_chip_name) {
> +   mname = rscreen->ws->get_chip_name(rscreen->ws);
> +   if (mname != NULL)
> +   return mname;
> +   }
> +
> +   /* fall back to family names*/
> switch (rscreen->info.family) {
> case CHIP_R600: return "AMD R600";
> case CHIP_RV610: return "AMD RV610";
> @@ -1321,6 +1330,7 @@ bool r600_common_screen_init(struct r600_common_screen 
> *rscreen,
> struct utsname uname_data;
>
> ws->query_info(ws, &rscreen->info);
> +   rscreen->ws = ws;
>
> if (uname(&uname_data) == 0)
> snprintf(kernel_version, sizeof(kernel_version),
> @@ -1362,7 +1372,6 @@ bool r600_common_screen_init(struct r600_common_screen 
> *rscreen,
> r600_init_screen_texture_functions(rscreen);
> r600_init_screen_query_functions(rscreen);
>
> -   rscreen->ws = ws;
> rscreen->family = rscreen->info.family;
> rscreen->chip_class = rscreen->info.chip_class;
> rscreen->debug_flags = debug_get_flags_option("R600_DEBUG", 
> common_debug_options, 0);
> diff --git a/src/gallium/drivers/radeon/radeon_winsys.h 
> b/src/gallium/drivers/radeon/radeon_winsys.h
> index 524bb46..e19fde6 100644
> --- a/src/gallium/drivers/radeon/radeon_winsys.h
> +++ b/src/gallium/drivers/radeon/radeon_winsys.h
> @@ -637,6 +637,8 @@ struct radeon_winsys {
>
>  bool (*read_registers)(struct radeon_winsys *ws, unsigned reg_offset,
> unsigned num_registers, uint32_t *out);
> +
> +const char* (*get_chip_name)(struct radeon_winsys *ws);
>  };
>
>  static inline bool radeon_emitted(struct radeon_winsys_cs *cs, unsigned 
> num_dw)
> diff --git a/src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c 
> b/src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c
> index c8bd60e..b2307fe 100644
> --- a/src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c
> +++ b/src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c
> @@ -221,6 +221,13 @@ static bool amdgpu_winsys_unref(struct radeon_winsys 
> *rws)
> return destroy;
>  }
>
> +static const char* amdgpu_get_chip_name(struct radeon_winsys *ws)
> +{
> +   amdgpu_device_handle dev = ((struct amdgpu_winsys *)ws)->dev;
> +   return amdgpu_get_marketing_name(dev);
> +}
> +
> +
>  PUBLIC struct radeon_winsys *
>  amdgpu_winsys_create(int fd, radeon_screen_create_t screen_create)
>  {
> @@ -296,6 +303,7 @@ amdgpu_winsys_create(int fd, radeon_screen_create_t 
> screen_create)
> ws->base.cs_request_feature = amdgpu_cs_request_feature;
> ws->base.query_value = amdgpu_query_value;
> ws->base.read_registers = amdgpu_read_registers;
> +   ws->base.get_chip_name = amdgpu_get_chip_name;
>
> amdgpu_bo_init_functions(ws);
> amdgpu_cs_init_functions(ws);
> --
> 2.7.4
>
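
The v2 design (an optional winsys callback, with the family table as a fallback) can be sketched on its own. Everything below is a hypothetical stand-in: the two stub callbacks take the place of a winsys that returns a marketing name and one that returns NULL, and only the "AMD R600" string is taken from the quoted diff.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum sketch_family { SKETCH_CHIP_R600, SKETCH_CHIP_OTHER };

struct sketch_winsys {
   /* Optional: a winsys that cannot provide a name leaves this NULL. */
   const char *(*get_chip_name)(struct sketch_winsys *ws);
};

/* Stub for a winsys that knows a marketing name (placeholder string). */
static const char *
sketch_marketing_name(struct sketch_winsys *ws)
{
   (void)ws;
   return "Example Marketing Name";
}

/* Stub for a winsys whose lookup finds nothing. */
static const char *
sketch_null_name(struct sketch_winsys *ws)
{
   (void)ws;
   return NULL;
}

static const char *
sketch_get_chip_name(struct sketch_winsys *ws, enum sketch_family family)
{
   if (ws->get_chip_name) {
      const char *mname = ws->get_chip_name(ws);
      if (mname != NULL)
         return mname;
   }

   /* Fall back to family names. */
   switch (family) {
   case SKETCH_CHIP_R600: return "AMD R600";
   default:               return "AMD unknown";
   }
}
```

Checking both the pointer and its return value lets a winsys opt out entirely or decline for a specific device, which appears to be what the "support radeon later" remark is aiming at.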


Re: [Mesa-dev] [PATCH 1/3] ac/nir: mark some arguments const

2017-06-06 Thread Bas Nieuwenhuizen
This series is

Reviewed-by: Bas Nieuwenhuizen 

On Wed, Jun 7, 2017 at 1:31 AM, Grazvydas Ignotas  wrote:
> Most functions are only inspecting nir, so nir related arguments can be
> marked const. Some more can be done if/when some nir changes are
> accepted.
>
> Signed-off-by: Grazvydas Ignotas 
> ---
> does *not* depend on the nir patch
>
>  src/amd/common/ac_nir_to_llvm.c | 61 
> +
>  1 file changed, 31 insertions(+), 30 deletions(-)
>
> diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
> index 4e5d19a..5f62769 100644
> --- a/src/amd/common/ac_nir_to_llvm.c
> +++ b/src/amd/common/ac_nir_to_llvm.c
> @@ -174,11 +174,11 @@ struct nir_to_llvm_context {
> uint64_t tess_outputs_written;
> uint64_t tess_patch_outputs_written;
>  };
>
>  static LLVMValueRef get_sampler_desc(struct nir_to_llvm_context *ctx,
> -nir_deref_var *deref,
> +const nir_deref_var *deref,
>  enum desc_type desc_type);
>  static unsigned radeon_llvm_reg_index_soa(unsigned index, unsigned chan)
>  {
> return (index * 4) + chan;
>  }
> @@ -1077,11 +1077,11 @@ build_store_values_extended(struct 
> nir_to_llvm_context *ctx,
> LLVMBuildStore(builder, value, ptr);
> }
>  }
>
>  static LLVMTypeRef get_def_type(struct nir_to_llvm_context *ctx,
> -nir_ssa_def *def)
> +const nir_ssa_def *def)
>  {
> LLVMTypeRef type = LLVMIntTypeInContext(ctx->context, def->bit_size);
> if (def->num_components > 1) {
> type = LLVMVectorType(type, def->num_components);
> }
> @@ -1095,11 +1095,11 @@ static LLVMValueRef get_src(struct 
> nir_to_llvm_context *ctx, nir_src src)
> return (LLVMValueRef)entry->data;
>  }
>
>
>  static LLVMBasicBlockRef get_block(struct nir_to_llvm_context *ctx,
> -   struct nir_block *b)
> +   const struct nir_block *b)
>  {
> struct hash_entry *entry = _mesa_hash_table_search(ctx->defs, b);
> return (LLVMBasicBlockRef)entry->data;
>  }
>
> @@ -1385,11 +1385,11 @@ static LLVMValueRef emit_imul_high(struct 
> nir_to_llvm_context *ctx,
> return result;
>  }
>
>  static LLVMValueRef emit_bitfield_extract(struct nir_to_llvm_context *ctx,
>   bool is_signed,
> - LLVMValueRef srcs[3])
> + const LLVMValueRef srcs[3])
>  {
> LLVMValueRef result;
> LLVMValueRef icond = LLVMBuildICmp(ctx->builder, LLVMIntEQ, srcs[2], 
> LLVMConstInt(ctx->i32, 32, false), "");
>
> result = ac_build_bfe(&ctx->ac, srcs[0], srcs[1], srcs[2], is_signed);
> @@ -1524,11 +1524,11 @@ static LLVMValueRef emit_ddxy_interp(
> result[2+i] = emit_ddxy(ctx, nir_op_fddy, a);
> }
> return ac_build_gather_values(&ctx->ac, result, 4);
>  }
>
> -static void visit_alu(struct nir_to_llvm_context *ctx, nir_alu_instr *instr)
> +static void visit_alu(struct nir_to_llvm_context *ctx, const nir_alu_instr 
> *instr)
>  {
> LLVMValueRef src[4], result = NULL;
> unsigned num_components = instr->dest.dest.ssa.num_components;
> unsigned src_components;
> LLVMTypeRef def_type = get_def_type(ctx, &instr->dest.dest.ssa);
> @@ -1890,11 +1890,11 @@ static void visit_alu(struct nir_to_llvm_context 
> *ctx, nir_alu_instr *instr)
> result);
> }
>  }
>
>  static void visit_load_const(struct nir_to_llvm_context *ctx,
> - nir_load_const_instr *instr)
> + const nir_load_const_instr *instr)
>  {
> LLVMValueRef values[4], value = NULL;
> LLVMTypeRef element_type =
> LLVMIntTypeInContext(ctx->context, instr->def.bit_size);
>
> @@ -1974,11 +1974,11 @@ static void build_int_type_name(
> strcpy(buf, "i32");
>  }
>
>  static LLVMValueRef radv_lower_gather4_integer(struct nir_to_llvm_context 
> *ctx,
>struct ac_image_args *args,
> -  nir_tex_instr *instr)
> +  const nir_tex_instr *instr)
>  {
> enum glsl_base_type stype = 
> glsl_get_sampler_result_type(instr->texture->var->type);
> LLVMValueRef coord = args->addr;
> LLVMValueRef half_texel[2];
> LLVMValueRef compare_cube_wa;
> @@ -2087,11 +2087,11 @@ static LLVMValueRef radv_lower_gather4_integer(struct 
> nir_to_llvm_context *ctx,
> }
> return result;
>  }
>
>  static LLVMValueRef build_tex_intrinsic(struct nir_to_llvm_context *ctx,
> -  

Re: [Mesa-dev] [PATCH 1/3] radv: rename and make global some functions.

2017-06-06 Thread Bas Nieuwenhuizen
This series is

Reviewed-by: Bas Nieuwenhuizen 

On Wed, Jun 7, 2017 at 1:18 AM, Dave Airlie  wrote:
> From: Dave Airlie 
>
> I want to use these in the pipeline setup stage.
>
> Signed-off-by: Dave Airlie 
> ---
>  src/amd/vulkan/radv_cmd_buffer.c | 24 
>  src/amd/vulkan/radv_private.h|  5 +
>  2 files changed, 17 insertions(+), 12 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_cmd_buffer.c 
> b/src/amd/vulkan/radv_cmd_buffer.c
> index f3187e8..851b2ca 100644
> --- a/src/amd/vulkan/radv_cmd_buffer.c
> +++ b/src/amd/vulkan/radv_cmd_buffer.c
> @@ -394,8 +394,8 @@ static unsigned radv_pack_float_12p4(float x)
>x >= 4096 ? 0x : x * 16;
>  }
>
> -static uint32_t
> -shader_stage_to_user_data_0(gl_shader_stage stage, bool has_gs, bool 
> has_tess)
> +uint32_t
> +radv_shader_stage_to_user_data_0(gl_shader_stage stage, bool has_gs, bool 
> has_tess)
>  {
> switch (stage) {
> case MESA_SHADER_FRAGMENT:
> @@ -421,7 +421,7 @@ shader_stage_to_user_data_0(gl_shader_stage stage, bool 
> has_gs, bool has_tess)
> }
>  }
>
> -static struct ac_userdata_info *
> +struct ac_userdata_info *
>  radv_lookup_user_sgpr(struct radv_pipeline *pipeline,
>   gl_shader_stage stage,
>   int idx)
> @@ -436,7 +436,7 @@ radv_emit_userdata_address(struct radv_cmd_buffer 
> *cmd_buffer,
>int idx, uint64_t va)
>  {
> struct ac_userdata_info *loc = radv_lookup_user_sgpr(pipeline, stage, 
> idx);
> -   uint32_t base_reg = shader_stage_to_user_data_0(stage, 
> radv_pipeline_has_gs(pipeline), radv_pipeline_has_tess(pipeline));
> +   uint32_t base_reg = radv_shader_stage_to_user_data_0(stage, 
> radv_pipeline_has_gs(pipeline), radv_pipeline_has_tess(pipeline));
> if (loc->sgpr_idx == -1)
> return;
> assert(loc->num_sgprs == 2);
> @@ -478,7 +478,7 @@ radv_update_multisample_state(struct radv_cmd_buffer 
> *cmd_buffer,
> if 
> (pipeline->shaders[MESA_SHADER_FRAGMENT]->info.info.ps.needs_sample_positions)
>  {
> uint32_t offset;
> struct ac_userdata_info *loc = 
> radv_lookup_user_sgpr(pipeline, MESA_SHADER_FRAGMENT, 
> AC_UD_PS_SAMPLE_POS_OFFSET);
> -   uint32_t base_reg = 
> shader_stage_to_user_data_0(MESA_SHADER_FRAGMENT, 
> radv_pipeline_has_gs(pipeline), radv_pipeline_has_tess(pipeline));
> +   uint32_t base_reg = 
> radv_shader_stage_to_user_data_0(MESA_SHADER_FRAGMENT, 
> radv_pipeline_has_gs(pipeline), radv_pipeline_has_tess(pipeline));
> if (loc->sgpr_idx == -1)
> return;
> assert(loc->num_sgprs == 1);
> @@ -698,7 +698,7 @@ radv_emit_tess_shaders(struct radv_cmd_buffer *cmd_buffer,
>
> loc = radv_lookup_user_sgpr(pipeline, MESA_SHADER_TESS_CTRL, 
> AC_UD_TCS_OFFCHIP_LAYOUT);
> if (loc->sgpr_idx != -1) {
> -   uint32_t base_reg = 
> shader_stage_to_user_data_0(MESA_SHADER_TESS_CTRL, 
> radv_pipeline_has_gs(pipeline), radv_pipeline_has_tess(pipeline));
> +   uint32_t base_reg = 
> radv_shader_stage_to_user_data_0(MESA_SHADER_TESS_CTRL, 
> radv_pipeline_has_gs(pipeline), radv_pipeline_has_tess(pipeline));
> assert(loc->num_sgprs == 4);
> assert(!loc->indirect);
> radeon_set_sh_reg_seq(cmd_buffer->cs, base_reg + 
> loc->sgpr_idx * 4, 4);
> @@ -711,7 +711,7 @@ radv_emit_tess_shaders(struct radv_cmd_buffer *cmd_buffer,
>
> loc = radv_lookup_user_sgpr(pipeline, MESA_SHADER_TESS_EVAL, 
> AC_UD_TES_OFFCHIP_LAYOUT);
> if (loc->sgpr_idx != -1) {
> -   uint32_t base_reg = 
> shader_stage_to_user_data_0(MESA_SHADER_TESS_EVAL, 
> radv_pipeline_has_gs(pipeline), radv_pipeline_has_tess(pipeline));
> +   uint32_t base_reg = 
> radv_shader_stage_to_user_data_0(MESA_SHADER_TESS_EVAL, 
> radv_pipeline_has_gs(pipeline), radv_pipeline_has_tess(pipeline));
> assert(loc->num_sgprs == 1);
> assert(!loc->indirect);
>
> @@ -721,7 +721,7 @@ radv_emit_tess_shaders(struct radv_cmd_buffer *cmd_buffer,
>
> loc = radv_lookup_user_sgpr(pipeline, MESA_SHADER_VERTEX, 
> AC_UD_VS_LS_TCS_IN_LAYOUT);
> if (loc->sgpr_idx != -1) {
> -   uint32_t base_reg = 
> shader_stage_to_user_data_0(MESA_SHADER_VERTEX, 
> radv_pipeline_has_gs(pipeline), radv_pipeline_has_tess(pipeline));
> +   uint32_t base_reg = 
> radv_shader_stage_to_user_data_0(MESA_SHADER_VERTEX, 
> radv_pipeline_has_gs(pipeline), radv_pipeline_has_tess(pipeline));
> assert(loc->num_sgprs == 1);
> assert(!loc->indirect);
>
> @@ -1319,7 +1319,7 @@ emit_stage_descriptor_set_userdata(struct 
> radv_cmd_buffer *cmd_buffer,
>gl_shader_stage stage)
>  {
> struct 

Re: [Mesa-dev] [PATCH] radv: move chip_class extraction down further.

2017-06-06 Thread Bas Nieuwenhuizen
On Wed, Jun 7, 2017 at 1:35 AM, Dave Airlie  wrote:
> From: Dave Airlie 
>
> This seems to matter in a profile: without this change we spend a lot
> more time exiting this function when there are no flush bits.
>
> Signed-off-by: Dave Airlie 
> ---
>  src/amd/vulkan/si_cmd_buffer.c | 15 ++-
>  1 file changed, 10 insertions(+), 5 deletions(-)
>
> diff --git a/src/amd/vulkan/si_cmd_buffer.c b/src/amd/vulkan/si_cmd_buffer.c
> index a251a1a..47bf553 100644
> --- a/src/amd/vulkan/si_cmd_buffer.c
> +++ b/src/amd/vulkan/si_cmd_buffer.c
> @@ -690,16 +690,20 @@ si_get_ia_multi_vgt_param(struct radv_cmd_buffer 
> *cmd_buffer,
> bool ia_switch_on_eoi = false;
> bool partial_vs_wave = false;
> bool partial_es_wave = false;
> -   uint32_t num_prims = 
> radv_prims_for_vertices(&cmd_buffer->state.pipeline->graphics.prim_vertex_count,
>  draw_vertex_count);
> +   uint32_t num_prims = 0;
> bool multi_instances_smaller_than_primgroup;
> -
> +   bool instance_less_than_primgroup_size = false;
> if (radv_pipeline_has_tess(cmd_buffer->state.pipeline))
> primgroup_size = 
> cmd_buffer->state.pipeline->graphics.tess.num_patches;
> else if (radv_pipeline_has_gs(cmd_buffer->state.pipeline))
> primgroup_size = 64;  /* recommended with a GS */
>
> -   multi_instances_smaller_than_primgroup = indirect_draw || 
> (instanced_draw &&
> -  num_prims 
> < primgroup_size);
> +   if (instanced_draw || 
> radv_pipeline_has_gs(cmd_buffer->state.pipeline)) {
> +   num_prims = 
> radv_prims_for_vertices(&cmd_buffer->state.pipeline->graphics.prim_vertex_count,
>  draw_vertex_count);
> +   instance_less_than_primgroup_size = num_prims < 
> primgroup_size;
> +   }
> +
> +   multi_instances_smaller_than_primgroup = indirect_draw || 
> instance_less_than_primgroup_size;

Seems like you got some unintended chunks in there? The part in
si_emit_cache_flush is

Reviewed-by: Bas Nieuwenhuizen 

I'd guess that it was too much pointer chasing?

> if (radv_pipeline_has_tess(cmd_buffer->state.pipeline)) {
> /* SWITCH_ON_EOI must be set if PrimID is used. */
> if 
> (cmd_buffer->state.pipeline->shaders[MESA_SHADER_TESS_CTRL]->info.tcs.uses_prim_id
>  ||
> @@ -1079,7 +1083,7 @@ void
>  si_emit_cache_flush(struct radv_cmd_buffer *cmd_buffer)
>  {
> bool is_compute = cmd_buffer->queue_family_index == 
> RADV_QUEUE_COMPUTE;
> -   enum chip_class chip_class = 
> cmd_buffer->device->physical_device->rad_info.chip_class;
> +
> if (is_compute)
> cmd_buffer->state.flush_bits &= 
> ~(RADV_CMD_FLAG_FLUSH_AND_INV_CB |
>   
> RADV_CMD_FLAG_FLUSH_AND_INV_CB_META |
> @@ -1092,6 +1096,7 @@ si_emit_cache_flush(struct radv_cmd_buffer *cmd_buffer)
> if (!cmd_buffer->state.flush_bits)
> return;
>
> +   enum chip_class chip_class = 
> cmd_buffer->device->physical_device->rad_info.chip_class;
> radeon_check_space(cmd_buffer->device->ws, cmd_buffer->cs, 128);
>
> uint32_t *ptr = NULL;
> --
> 2.9.4
>

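The effect discussed in this thread (loading chip_class only after the early return, so the idle path avoids the device -> physical_device -> rad_info pointer chain) can be sketched as below. This is only an illustration under stated assumptions: the struct and function names are hypothetical, trimmed-down stand-ins, not the radv definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-ins for the radv structures. */
struct phys_dev_sketch { int chip_class; };
struct device_sketch { struct phys_dev_sketch *physical_device; };
struct cmd_buffer_sketch {
    struct device_sketch *device;
    uint32_t flush_bits;
};

/* Returns the chip class used for the flush, or -1 when nothing was pending. */
static int emit_cache_flush_sketch(struct cmd_buffer_sketch *cmd)
{
    /* Early out first: this path needs no dependent loads at all. */
    if (!cmd->flush_bits)
        return -1;

    /* Only now follow device -> physical_device -> chip_class. */
    assert(cmd->device && cmd->device->physical_device);
    int chip_class = cmd->device->physical_device->chip_class;

    cmd->flush_bits = 0; /* pretend the flush was emitted */
    return chip_class;
}
```

In the sketch, the idle path never touches the device pointer, which is the property the patch is after. Whether the compiler would have sunk the load on its own is not something the thread settles.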

Re: [Mesa-dev] [PATCH] intel: Fix broxton 2x6 way size computation

2017-06-06 Thread Kenneth Graunke
On Tuesday, June 6, 2017 4:34:36 PM PDT Anuj Phogat wrote:
> This patch undoes the changes to the way size computation
> on Broxton 2x6 made by the commit below:
> 
> Commit: 0d576fbfbe912cf3fb9ab594bb31eb58bccf2138
> Author: Anuj Phogat 
> i965: Simplify l3 way size computations
> 
> By making use of l3_banks field in gen_device_info struct
> l3_way_size for gen7+ = 2 * l3_banks.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101306
> Signed-off-by: Anuj Phogat 
> Cc: Jason Ekstrand 
> Cc: Mark Janes 
> Cc: Francisco Jerez 
> ---
> Note: Above bugzilla exposed a bug in our l3 allocation for
> broxton 2x6. We need more changes to fix l3 config. I'll send
> them later to the list. For now this patch brings things back
> to where they were for bxt and unblocks the CI system to be
> utilized for the performance work going on at present.
> ---
>  src/intel/common/gen_l3_config.c | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/src/intel/common/gen_l3_config.c 
> b/src/intel/common/gen_l3_config.c
> index e0825e9..2520838 100644
> --- a/src/intel/common/gen_l3_config.c
> +++ b/src/intel/common/gen_l3_config.c
> @@ -255,6 +255,10 @@ static unsigned
>  get_l3_way_size(const struct gen_device_info *devinfo)
>  {
> assert(devinfo->l3_banks);
> +
> +   if (devinfo->is_broxton)
> +  return 4;
> +
> return 2 * devinfo->l3_banks;
>  }
>  
> 

Yeah...it's strange, the docs indicate that there's only 1 bank of L3
on Broxton 2x6, but it seems to have been working with 2...

Acked-by: Kenneth Graunke 

Your patches also changed the number of L3 banks in Kabylake GT 1.5.
It now has more of them, matching GT2 instead of GT1.  I think that's
correct by the documentation.
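
For reference, the behaviour the patch restores can be sketched in isolation as follows. The struct here is a hypothetical two-field subset of gen_device_info, and the function name is made up for the example; only the logic mirrors the quoted hunk.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical subset of gen_device_info: only the fields the hunk reads. */
struct devinfo_sketch {
    bool is_broxton;
    unsigned l3_banks;
};

static unsigned l3_way_size_sketch(const struct devinfo_sketch *devinfo)
{
    assert(devinfo->l3_banks);

    /* Broxton override from the patch: keep the previous value of 4. */
    if (devinfo->is_broxton)
        return 4;

    return 2 * devinfo->l3_banks;
}
```

With a single documented bank, the generic formula would give 2 on Broxton 2x6, while the override returns 4, i.e. the value that corresponds to two banks.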




Re: [Mesa-dev] [PATCH v2 58/64] radeonsi: track use of bindless samplers/images from tgsi_shader_info

2017-06-06 Thread Marek Olšák
On Tue, May 30, 2017 at 10:36 PM, Samuel Pitoiset
 wrote:
> This adds some new helper functions to know if the current draw
> call (or dispatch compute) is using bindless samplers/images,
> based on TGSI analysis.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/gallium/drivers/radeonsi/si_compute.c |  2 ++
>  src/gallium/drivers/radeonsi/si_compute.h | 14 ++
>  src/gallium/drivers/radeonsi/si_pipe.h| 20 
>  src/gallium/drivers/radeonsi/si_shader.h  | 12 
>  4 files changed, 48 insertions(+)
>
> diff --git a/src/gallium/drivers/radeonsi/si_compute.c 
> b/src/gallium/drivers/radeonsi/si_compute.c
> index 4c980668d3..61fab7ddb0 100644
> --- a/src/gallium/drivers/radeonsi/si_compute.c
> +++ b/src/gallium/drivers/radeonsi/si_compute.c
> @@ -108,6 +108,8 @@ static void si_create_compute_state_async(void *job, int 
> thread_index)
> program->shader.is_monolithic = true;
> program->uses_grid_size = sel.info.uses_grid_size;
> program->uses_block_size = sel.info.uses_block_size;
> +   program->uses_bindless_samplers = sel.info.uses_bindless_samplers;
> +   program->uses_bindless_images = sel.info.uses_bindless_images;
>
> if (si_shader_create(program->screen, tm, &program->shader, debug)) {
> program->shader.compilation_failed = true;
> diff --git a/src/gallium/drivers/radeonsi/si_compute.h 
> b/src/gallium/drivers/radeonsi/si_compute.h
> index 764d708c4f..3cf1538267 100644
> --- a/src/gallium/drivers/radeonsi/si_compute.h
> +++ b/src/gallium/drivers/radeonsi/si_compute.h
> @@ -49,6 +49,20 @@ struct si_compute {
> unsigned variable_group_size : 1;
> unsigned uses_grid_size:1;
> unsigned uses_block_size:1;
> +   unsigned uses_bindless_samplers:1;
> +   unsigned uses_bindless_images:1;
>  };
>
> +static inline bool
> +si_compute_uses_bindless_samplers(struct si_context *sctx)
> +{
> +   return sctx->cs_shader_state.program->uses_bindless_samplers;
> +}
> +
> +static inline bool
> +si_compute_uses_bindless_images(struct si_context *sctx)
> +{
> +   return sctx->cs_shader_state.program->uses_bindless_images;
> +}
> +
>  #endif /* SI_COMPUTE_H */
> diff --git a/src/gallium/drivers/radeonsi/si_pipe.h 
> b/src/gallium/drivers/radeonsi/si_pipe.h
> index 434bc0aa67..fe7cf20ec9 100644
> --- a/src/gallium/drivers/radeonsi/si_pipe.h
> +++ b/src/gallium/drivers/radeonsi/si_pipe.h
> @@ -538,6 +538,26 @@ static inline struct tgsi_shader_info 
> *si_get_vs_info(struct si_context *sctx)
> return NULL;
>  }
>
> +static inline bool
> +si_graphics_uses_bindless_samplers(struct si_context *sctx)
> +{
> +   return si_shader_uses_bindless_samplers(sctx->vs_shader.cso)  ||
> +  si_shader_uses_bindless_samplers(sctx->gs_shader.cso)  ||
> +  si_shader_uses_bindless_samplers(sctx->ps_shader.cso)  ||
> +  si_shader_uses_bindless_samplers(sctx->tcs_shader.cso) ||
> +  si_shader_uses_bindless_samplers(sctx->tes_shader.cso);
> +}
> +
> +static inline bool
> +si_graphics_uses_bindless_images(struct si_context *sctx)
> +{
> +   return si_shader_uses_bindless_images(sctx->vs_shader.cso)  ||
> +  si_shader_uses_bindless_images(sctx->gs_shader.cso)  ||
> +  si_shader_uses_bindless_images(sctx->ps_shader.cso)  ||
> +  si_shader_uses_bindless_images(sctx->tcs_shader.cso) ||
> +  si_shader_uses_bindless_images(sctx->tes_shader.cso);
> +}

I'd like shader bind calls to set the result of these functions in a
bool flag in si_context. Then, patch 59 can use the bool flags instead
of calling the functions.

Marek
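
A minimal sketch of the suggestion above, assuming hypothetical types: the per-stage query is evaluated once when a shader is bound and cached in the context, so the draw path only reads a bool. None of the names below are radeonsi identifiers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { STAGE_VS, STAGE_GS, STAGE_PS, STAGE_TCS, STAGE_TES, NUM_STAGES_SKETCH };

/* Hypothetical stand-in for a shader selector. */
struct shader_sketch { bool uses_bindless_samplers; };

/* Hypothetical stand-in for si_context. */
struct context_sketch {
    const struct shader_sketch *stage[NUM_STAGES_SKETCH];
    bool graphics_uses_bindless_samplers; /* cached at bind time */
};

static void bind_shader_sketch(struct context_sketch *ctx, int stage,
                               const struct shader_sketch *sh)
{
    assert(stage >= 0 && stage < NUM_STAGES_SKETCH);
    ctx->stage[stage] = sh;

    /* Recompute the flag here, once per bind, not once per draw. */
    bool any = false;
    for (int i = 0; i < NUM_STAGES_SKETCH; i++)
        any = any || (ctx->stage[i] && ctx->stage[i]->uses_bindless_samplers);
    ctx->graphics_uses_bindless_samplers = any;
}
```

The draw path would then test ctx->graphics_uses_bindless_samplers directly. The same pattern would apply to a second flag for images.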


Re: [Mesa-dev] [PATCH v2 56/64] radeonsi: decompress resident textures/images before graphics/compute

2017-06-06 Thread Marek Olšák
On Wed, Jun 7, 2017 at 1:27 AM, Marek Olšák  wrote:
> On Tue, May 30, 2017 at 10:36 PM, Samuel Pitoiset
>  wrote:
>> Similar to the existing decompression code path except that it
>> loops over the list of resident textures/images.
>>
>> v2: - store pipe_sampler_view instead of si_sampler_view
>>
>> Signed-off-by: Samuel Pitoiset 
>> ---
>>  src/gallium/drivers/radeonsi/si_blit.c| 77 
>> +--
>>  src/gallium/drivers/radeonsi/si_descriptors.c | 52 ++
>>  src/gallium/drivers/radeonsi/si_pipe.h|  3 ++
>>  3 files changed, 129 insertions(+), 3 deletions(-)
>>
>> diff --git a/src/gallium/drivers/radeonsi/si_blit.c 
>> b/src/gallium/drivers/radeonsi/si_blit.c
>> index 343ca35736..a47f43958c 100644
>> --- a/src/gallium/drivers/radeonsi/si_blit.c
>> +++ b/src/gallium/drivers/radeonsi/si_blit.c
>> @@ -22,6 +22,7 @@
>>   */
>>
>>  #include "si_pipe.h"
>> +#include "si_compute.h"
>>  #include "util/u_format.h"
>>  #include "util/u_surface.h"
>>
>> @@ -690,9 +691,6 @@ static void si_decompress_textures(struct si_context 
>> *sctx, unsigned shader_mask
>>  {
>> unsigned compressed_colortex_counter, mask;
>>
>> -   if (sctx->blitter->running)
>> -   return;
>> -
>
> You can keep this.
>
>> /* Update the compressed_colortex_mask if necessary. */
>> compressed_colortex_counter = 
>> p_atomic_read(&sctx->screen->b.compressed_colortex_counter);
>> if (compressed_colortex_counter != 
>> sctx->b.last_compressed_colortex_counter) {
>> @@ -719,14 +717,87 @@ static void si_decompress_textures(struct si_context 
>> *sctx, unsigned shader_mask
>> si_check_render_feedback(sctx);
>>  }
>>
>> +static void si_decompress_resident_textures(struct si_context *sctx)
>> +{
>> +   unsigned num_resident_tex_handles;
>> +   unsigned i;
>> +
>> +   num_resident_tex_handles = sctx->resident_tex_handles.size /
>> +  sizeof(struct si_texture_handle *);
>> +
>> +   for (i = 0; i < num_resident_tex_handles; i++) {
>> +   struct si_texture_handle *tex_handle =
>> *util_dynarray_element(&sctx->resident_tex_handles,
>> +  struct si_texture_handle *, 
>> i);
>> +   struct pipe_sampler_view *view = tex_handle->view;
>> +   struct si_sampler_view *sview = (struct si_sampler_view 
>> *)view;
>> +   struct r600_texture *tex;
>> +
>> +   assert(view);
>> +   tex = (struct r600_texture *)view->texture;
>> +
>> +   if (view->texture->target == PIPE_BUFFER)
>> +   continue;
>> +
>> +   if (tex_handle->compressed_colortex)
>> +   si_decompress_color_texture(sctx, tex, 
>> view->u.tex.first_level,
>> +   view->u.tex.last_level);
>> +
>> +   if (tex_handle->depth_texture)
>> +   si_flush_depth_texture(sctx, tex,
>> +   sview->is_stencil_sampler ? PIPE_MASK_S : 
>> PIPE_MASK_Z,
>> +   view->u.tex.first_level, 
>> view->u.tex.last_level,
>> 0, util_max_layer(&tex->resource.b.b, 
>> view->u.tex.first_level));
>> +   }
>> +}
>> +
>> +static void si_decompress_resident_images(struct si_context *sctx)
>> +{
>> +   unsigned num_resident_img_handles;
>> +   unsigned i;
>> +
>> +   num_resident_img_handles = sctx->resident_img_handles.size /
>> +  sizeof(struct si_image_handle *);
>> +
>> +   for (i = 0; i < num_resident_img_handles; i++) {
>> +   struct si_image_handle *img_handle =
>> *util_dynarray_element(&sctx->resident_img_handles,
>> +  struct si_image_handle *, i);
>> struct pipe_image_view *view = &img_handle->view;
>> +   struct r600_texture *tex;
>> +
>> +   assert(view);
>> +   tex = (struct r600_texture *)view->resource;
>> +
>> +   if (view->resource->target == PIPE_BUFFER)
>> +   continue;
>> +
>> +   if (img_handle->compressed_colortex)
>> +   si_decompress_color_texture(sctx, tex, 
>> view->u.tex.level,
>> +   view->u.tex.level);
>> +   }
>> +}
>
> The loops in the two functions above will destroy CPU performance,
> because both functions are called for every draw call. We need to find
> a better way.
>
> I suggest that si_resident_handles_update_compressed_colortex should
> be rewritten to build 2 separate lists (for samplers and images)
> containing only references to bindless slots that need decompression,
> and make_resident functions should update the separate lists too. Then
> the two 

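The idea in the review above (keep a separate list holding only the resident handles that need decompression, maintained when a handle is made resident) might look roughly like the sketch below. All types and names are hypothetical, a fixed-size array stands in for util_dynarray, and removal on make-non-resident is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_HANDLES_SKETCH 64

/* Hypothetical stand-in for si_texture_handle. */
struct tex_handle_sketch {
    bool compressed_colortex;
    int decompress_count; /* counts simulated decompress passes */
};

struct resident_lists_sketch {
    struct tex_handle_sketch *resident[MAX_HANDLES_SKETCH];
    size_t num_resident;
    /* Only the handles that actually need work before a draw. */
    struct tex_handle_sketch *needs_decompress[MAX_HANDLES_SKETCH];
    size_t num_needs_decompress;
};

static void make_resident_sketch(struct resident_lists_sketch *l,
                                 struct tex_handle_sketch *h)
{
    assert(l->num_resident < MAX_HANDLES_SKETCH);
    l->resident[l->num_resident++] = h;
    if (h->compressed_colortex)
        l->needs_decompress[l->num_needs_decompress++] = h;
}

/* Per-draw work is proportional to the short list, not to all handles. */
static size_t decompress_before_draw_sketch(struct resident_lists_sketch *l)
{
    for (size_t i = 0; i < l->num_needs_decompress; i++)
        l->needs_decompress[i]->decompress_count++;
    return l->num_needs_decompress;
}
```

With many resident handles and few compressed ones, the per-draw loop visits only the compressed ones, and an empty short list makes the draw-time cost close to a single comparison.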
Re: [Mesa-dev] [PATCH 4/5] mesa: add scissor() and scissor_array() helpers

2017-06-06 Thread Timothy Arceri

On 07/06/17 05:58, Samuel Pitoiset wrote:

Signed-off-by: Samuel Pitoiset 
---
  src/mesa/main/scissor.c | 57 -
  1 file changed, 37 insertions(+), 20 deletions(-)

diff --git a/src/mesa/main/scissor.c b/src/mesa/main/scissor.c
index 5cf02168bd..d94663c6e4 100644
--- a/src/mesa/main/scissor.c
+++ b/src/mesa/main/scissor.c
@@ -55,22 +55,10 @@ set_scissor_no_notify(struct gl_context *ctx, unsigned idx,
 ctx->Scissor.ScissorArray[idx].Height = height;
  }
  
-/**
- * Called via glScissor
- */
-void GLAPIENTRY
-_mesa_Scissor( GLint x, GLint y, GLsizei width, GLsizei height )
+static void
+scissor(struct gl_context *ctx, GLint x, GLint y, GLsizei width, GLsizei 
height)
  {
 unsigned i;
-   GET_CURRENT_CONTEXT(ctx);
-
-   if (MESA_VERBOSE & VERBOSE_API)
-  _mesa_debug(ctx, "glScissor %d %d %d %d\n", x, y, width, height);
-
-   if (width < 0 || height < 0) {
-  _mesa_error( ctx, GL_INVALID_VALUE, "glScissor" );
-  return;
-   }
  
 /* The GL_ARB_viewport_array spec says:

  *
@@ -91,6 +79,25 @@ _mesa_Scissor( GLint x, GLint y, GLsizei width, GLsizei 
height )
ctx->Driver.Scissor(ctx);
  }
  
+/**
+ * Called via glScissor
+ */
+void GLAPIENTRY
+_mesa_Scissor(GLint x, GLint y, GLsizei width, GLsizei height)
+{
+   GET_CURRENT_CONTEXT(ctx);
+
+   if (MESA_VERBOSE & VERBOSE_API)
+  _mesa_debug(ctx, "glScissor %d %d %d %d\n", x, y, width, height);
+
+   if (width < 0 || height < 0) {
+  _mesa_error( ctx, GL_INVALID_VALUE, "glScissor" );
+  return;
+   }
+
+   scissor(ctx, x, y, width, height);
+}
+
  
  /**

   * Define the scissor box.
@@ -115,6 +122,21 @@ _mesa_set_scissor(struct gl_context *ctx, unsigned idx,
ctx->Driver.Scissor(ctx);
  }
  
+static void
+scissor_array(struct gl_context *ctx, GLuint first, GLsizei count,
+  struct gl_scissor_rect *rect)
+{
+   GLsizei i;
+
+   for (i = 0; i < count; i++) {


Please make this:

   for (GLsizei i = 0; i < count; i++) {

IMO it just looks much cleaner :)

Otherwise series:

Reviewed-by: Timothy Arceri 



+  set_scissor_no_notify(ctx, i + first, rect[i].X, rect[i].Y,
+rect[i].Width, rect[i].Height);
+   }
+
+   if (ctx->Driver.Scissor)
+  ctx->Driver.Scissor(ctx);
+}
+
  /**
   * Define count scissor boxes starting at index.
   *
@@ -150,12 +172,7 @@ _mesa_ScissorArrayv(GLuint first, GLsizei count, const 
GLint *v)
}
 }
  
-   for (i = 0; i < count; i++)
-  set_scissor_no_notify(ctx, i + first,
-p[i].X, p[i].Y, p[i].Width, p[i].Height);
-
-   if (ctx->Driver.Scissor)
-  ctx->Driver.Scissor(ctx);
+   scissor_array(ctx, first, count, p);
  }
  
  /**




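The split in this patch (an API entry point that validates, plus an unchecked helper that updates state and notifies the driver once) can be sketched standalone as follows, using the loop-scoped counter requested in the review. Types, limits, and names are hypothetical, and the validation is only loosely modelled on the glScissorArrayv checks.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_RECTS_SKETCH 16

struct rect_sketch { int x, y, width, height; };

/* Hypothetical stand-in for the scissor state in gl_context. */
struct scissor_ctx_sketch {
    struct rect_sketch scissor[MAX_RECTS_SKETCH];
    int driver_notifications;
};

/* Unchecked helper: trusts its arguments, updates state, notifies once. */
static void scissor_array_sketch(struct scissor_ctx_sketch *ctx, unsigned first,
                                 int count, const struct rect_sketch *rect)
{
    assert(count >= 0 && first + (unsigned)count <= MAX_RECTS_SKETCH);

    for (int i = 0; i < count; i++)
        ctx->scissor[first + (unsigned)i] = rect[i];

    ctx->driver_notifications++;
}

/* Checked entry point: validates, then defers to the helper. */
static bool scissor_array_checked_sketch(struct scissor_ctx_sketch *ctx,
                                         unsigned first, int count,
                                         const struct rect_sketch *rect)
{
    if (count < 0 || first > MAX_RECTS_SKETCH ||
        (unsigned)count > MAX_RECTS_SKETCH - first)
        return false;

    for (int i = 0; i < count; i++) {
        if (rect[i].width < 0 || rect[i].height < 0)
            return false;
    }

    scissor_array_sketch(ctx, first, count, rect);
    return true;
}
```

Keeping the helper free of validation is what would let a later no-error entry point call it directly.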

[Mesa-dev] [PATCH] intel: Fix broxton 2x6 way size computation

2017-06-06 Thread Anuj Phogat
This patch undoes the changes to the way size computation
on Broxton 2x6 made by the commit below:

Commit: 0d576fbfbe912cf3fb9ab594bb31eb58bccf2138
Author: Anuj Phogat 
i965: Simplify l3 way size computations

By making use of l3_banks field in gen_device_info struct
l3_way_size for gen7+ = 2 * l3_banks.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101306
Signed-off-by: Anuj Phogat 
Cc: Jason Ekstrand 
Cc: Mark Janes 
Cc: Francisco Jerez 
---
Note: Above bugzilla exposed a bug in our l3 allocation for
broxton 2x6. We need more changes to fix l3 config. I'll send
them later to the list. For now this patch brings things back
to where they were for bxt and unblocks the CI system to be
utilized for the performance work going on at present.
---
 src/intel/common/gen_l3_config.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/src/intel/common/gen_l3_config.c b/src/intel/common/gen_l3_config.c
index e0825e9..2520838 100644
--- a/src/intel/common/gen_l3_config.c
+++ b/src/intel/common/gen_l3_config.c
@@ -255,6 +255,10 @@ static unsigned
 get_l3_way_size(const struct gen_device_info *devinfo)
 {
assert(devinfo->l3_banks);
+
+   if (devinfo->is_broxton)
+  return 4;
+
return 2 * devinfo->l3_banks;
 }
 
-- 
2.9.3



[Mesa-dev] [PATCH] radv: move chip_class extraction down further.

2017-06-06 Thread Dave Airlie
From: Dave Airlie 

This seems to matter in a profile: without this change we spend a lot
more time exiting this function when there are no flush bits.

Signed-off-by: Dave Airlie 
---
 src/amd/vulkan/si_cmd_buffer.c | 15 ++-
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/src/amd/vulkan/si_cmd_buffer.c b/src/amd/vulkan/si_cmd_buffer.c
index a251a1a..47bf553 100644
--- a/src/amd/vulkan/si_cmd_buffer.c
+++ b/src/amd/vulkan/si_cmd_buffer.c
@@ -690,16 +690,20 @@ si_get_ia_multi_vgt_param(struct radv_cmd_buffer 
*cmd_buffer,
bool ia_switch_on_eoi = false;
bool partial_vs_wave = false;
bool partial_es_wave = false;
-   uint32_t num_prims = 
radv_prims_for_vertices(&cmd_buffer->state.pipeline->graphics.prim_vertex_count,
 draw_vertex_count);
+   uint32_t num_prims = 0;
bool multi_instances_smaller_than_primgroup;
-
+   bool instance_less_than_primgroup_size = false;
if (radv_pipeline_has_tess(cmd_buffer->state.pipeline))
primgroup_size = 
cmd_buffer->state.pipeline->graphics.tess.num_patches;
else if (radv_pipeline_has_gs(cmd_buffer->state.pipeline))
primgroup_size = 64;  /* recommended with a GS */
 
-   multi_instances_smaller_than_primgroup = indirect_draw || 
(instanced_draw &&
-  num_prims < 
primgroup_size);
+   if (instanced_draw || radv_pipeline_has_gs(cmd_buffer->state.pipeline)) 
{
+   num_prims = 
radv_prims_for_vertices(&cmd_buffer->state.pipeline->graphics.prim_vertex_count,
 draw_vertex_count);
+   instance_less_than_primgroup_size = num_prims < primgroup_size;
+   }
+
+   multi_instances_smaller_than_primgroup = indirect_draw || 
instance_less_than_primgroup_size;
if (radv_pipeline_has_tess(cmd_buffer->state.pipeline)) {
/* SWITCH_ON_EOI must be set if PrimID is used. */
if 
(cmd_buffer->state.pipeline->shaders[MESA_SHADER_TESS_CTRL]->info.tcs.uses_prim_id
 ||
@@ -1079,7 +1083,7 @@ void
 si_emit_cache_flush(struct radv_cmd_buffer *cmd_buffer)
 {
bool is_compute = cmd_buffer->queue_family_index == RADV_QUEUE_COMPUTE;
-   enum chip_class chip_class = 
cmd_buffer->device->physical_device->rad_info.chip_class;
+
if (is_compute)
cmd_buffer->state.flush_bits &= 
~(RADV_CMD_FLAG_FLUSH_AND_INV_CB |
  
RADV_CMD_FLAG_FLUSH_AND_INV_CB_META |
@@ -1092,6 +1096,7 @@ si_emit_cache_flush(struct radv_cmd_buffer *cmd_buffer)
if (!cmd_buffer->state.flush_bits)
return;
 
+   enum chip_class chip_class = 
cmd_buffer->device->physical_device->rad_info.chip_class;
radeon_check_space(cmd_buffer->device->ws, cmd_buffer->cs, 128);
 
uint32_t *ptr = NULL;
-- 
2.9.4



[Mesa-dev] [PATCH 1/3] ac/nir: mark some arguments const

2017-06-06 Thread Grazvydas Ignotas
Most functions are only inspecting nir, so nir related arguments can be
marked const. Some more can be done if/when some nir changes are
accepted.

Signed-off-by: Grazvydas Ignotas 
---
does *not* depend on the nir patch

 src/amd/common/ac_nir_to_llvm.c | 61 +
 1 file changed, 31 insertions(+), 30 deletions(-)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index 4e5d19a..5f62769 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -174,11 +174,11 @@ struct nir_to_llvm_context {
uint64_t tess_outputs_written;
uint64_t tess_patch_outputs_written;
 };
 
 static LLVMValueRef get_sampler_desc(struct nir_to_llvm_context *ctx,
-nir_deref_var *deref,
+const nir_deref_var *deref,
 enum desc_type desc_type);
 static unsigned radeon_llvm_reg_index_soa(unsigned index, unsigned chan)
 {
return (index * 4) + chan;
 }
@@ -1077,11 +1077,11 @@ build_store_values_extended(struct nir_to_llvm_context 
*ctx,
LLVMBuildStore(builder, value, ptr);
}
 }
 
 static LLVMTypeRef get_def_type(struct nir_to_llvm_context *ctx,
-nir_ssa_def *def)
+const nir_ssa_def *def)
 {
LLVMTypeRef type = LLVMIntTypeInContext(ctx->context, def->bit_size);
if (def->num_components > 1) {
type = LLVMVectorType(type, def->num_components);
}
@@ -1095,11 +1095,11 @@ static LLVMValueRef get_src(struct nir_to_llvm_context 
*ctx, nir_src src)
return (LLVMValueRef)entry->data;
 }
 
 
 static LLVMBasicBlockRef get_block(struct nir_to_llvm_context *ctx,
-   struct nir_block *b)
+   const struct nir_block *b)
 {
struct hash_entry *entry = _mesa_hash_table_search(ctx->defs, b);
return (LLVMBasicBlockRef)entry->data;
 }
 
@@ -1385,11 +1385,11 @@ static LLVMValueRef emit_imul_high(struct 
nir_to_llvm_context *ctx,
return result;
 }
 
 static LLVMValueRef emit_bitfield_extract(struct nir_to_llvm_context *ctx,
  bool is_signed,
- LLVMValueRef srcs[3])
+ const LLVMValueRef srcs[3])
 {
LLVMValueRef result;
LLVMValueRef icond = LLVMBuildICmp(ctx->builder, LLVMIntEQ, srcs[2], 
LLVMConstInt(ctx->i32, 32, false), "");
 
result = ac_build_bfe(&ctx->ac, srcs[0], srcs[1], srcs[2], is_signed);
@@ -1524,11 +1524,11 @@ static LLVMValueRef emit_ddxy_interp(
result[2+i] = emit_ddxy(ctx, nir_op_fddy, a);
}
return ac_build_gather_values(&ctx->ac, result, 4);
 }
 
-static void visit_alu(struct nir_to_llvm_context *ctx, nir_alu_instr *instr)
+static void visit_alu(struct nir_to_llvm_context *ctx, const nir_alu_instr 
*instr)
 {
LLVMValueRef src[4], result = NULL;
unsigned num_components = instr->dest.dest.ssa.num_components;
unsigned src_components;
LLVMTypeRef def_type = get_def_type(ctx, &instr->dest.dest.ssa);
@@ -1890,11 +1890,11 @@ static void visit_alu(struct nir_to_llvm_context *ctx, 
nir_alu_instr *instr)
result);
}
 }
 
 static void visit_load_const(struct nir_to_llvm_context *ctx,
- nir_load_const_instr *instr)
+ const nir_load_const_instr *instr)
 {
LLVMValueRef values[4], value = NULL;
LLVMTypeRef element_type =
LLVMIntTypeInContext(ctx->context, instr->def.bit_size);
 
@@ -1974,11 +1974,11 @@ static void build_int_type_name(
strcpy(buf, "i32");
 }
 
 static LLVMValueRef radv_lower_gather4_integer(struct nir_to_llvm_context *ctx,
   struct ac_image_args *args,
-  nir_tex_instr *instr)
+  const nir_tex_instr *instr)
 {
enum glsl_base_type stype = 
glsl_get_sampler_result_type(instr->texture->var->type);
LLVMValueRef coord = args->addr;
LLVMValueRef half_texel[2];
LLVMValueRef compare_cube_wa;
@@ -2087,11 +2087,11 @@ static LLVMValueRef radv_lower_gather4_integer(struct 
nir_to_llvm_context *ctx,
}
return result;
 }
 
 static LLVMValueRef build_tex_intrinsic(struct nir_to_llvm_context *ctx,
-   nir_tex_instr *instr,
+   const nir_tex_instr *instr,
bool lod_is_zero,
struct ac_image_args *args)
 {
if (instr->sampler_dim == GLSL_SAMPLER_DIM_BUF) {
return ac_build_buffer_load_format(&ctx->ac,
@@ -2200,11 +2200,11 @@ static 

[Mesa-dev] [PATCH 2/3] ac/nir: convert several ifs to a switch

2017-06-06 Thread Grazvydas Ignotas
Also fix the "outinfo may be used uninitialized" warning by adding an
unreachable() in the default case.

Signed-off-by: Grazvydas Ignotas 
---
 src/amd/common/ac_nir_to_llvm.c | 20 +++-
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index 5f62769..005e2be 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -5802,27 +5802,29 @@ static void ac_llvm_finalize_module(struct 
nir_to_llvm_context * ctx)
 static void
 ac_nir_eliminate_const_vs_outputs(struct nir_to_llvm_context *ctx)
 {
struct ac_vs_output_info *outinfo;
 
-   if (ctx->stage == MESA_SHADER_FRAGMENT ||
-   ctx->stage == MESA_SHADER_COMPUTE ||
-   ctx->stage == MESA_SHADER_TESS_CTRL ||
-   ctx->stage == MESA_SHADER_GEOMETRY)
+   switch (ctx->stage) {
+   case MESA_SHADER_FRAGMENT:
+   case MESA_SHADER_COMPUTE:
+   case MESA_SHADER_TESS_CTRL:
+   case MESA_SHADER_GEOMETRY:
return;
-
-   if (ctx->stage == MESA_SHADER_VERTEX) {
+   case MESA_SHADER_VERTEX:
if (ctx->options->key.vs.as_ls ||
ctx->options->key.vs.as_es)
return;
outinfo = &ctx->shader_info->vs.outinfo;
-   }
-
-   if (ctx->stage == MESA_SHADER_TESS_EVAL) {
+   break;
+   case MESA_SHADER_TESS_EVAL:
if (ctx->options->key.vs.as_es)
return;
outinfo = &ctx->shader_info->tes.outinfo;
+   break;
+   default:
+   unreachable("Unhandled shader type");
}
 
ac_optimize_vs_outputs(&ctx->ac,
   ctx->main_function,
   outinfo->vs_output_param_offset,
-- 
2.7.4


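Why the switch form silences the "may be used uninitialized" warning can be seen in a small standalone sketch: every path either returns, assigns the variable, or ends in a call that never returns. Here abort() stands in for Mesa's unreachable() macro, and the enum and function are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

enum stage_sketch {
    STAGE_SKETCH_VERTEX,
    STAGE_SKETCH_TESS_EVAL,
    STAGE_SKETCH_FRAGMENT,
    STAGE_SKETCH_COMPUTE
};

/* Returns the selected output-info id, or -1 for stages with nothing to do. */
static int pick_outinfo_sketch(enum stage_sketch stage, int vs_outinfo,
                               int tes_outinfo)
{
    assert(vs_outinfo >= 0 && tes_outinfo >= 0);

    int outinfo; /* deliberately left without an initializer */

    switch (stage) {
    case STAGE_SKETCH_FRAGMENT:
    case STAGE_SKETCH_COMPUTE:
        return -1;
    case STAGE_SKETCH_VERTEX:
        outinfo = vs_outinfo;
        break;
    case STAGE_SKETCH_TESS_EVAL:
        outinfo = tes_outinfo;
        break;
    default:
        /* Never returns, so no path reaches the use below unassigned. */
        abort();
    }

    return outinfo;
}
```

With the original chain of independent if statements, the compiler cannot prove that one of the assigning branches was taken, which is where the warning came from.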

[Mesa-dev] [PATCH 3/3] radv/meta: remove an unused variable

2017-06-06 Thread Grazvydas Ignotas
Trivial.

Signed-off-by: Grazvydas Ignotas 
---
 src/amd/vulkan/radv_meta_clear.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/src/amd/vulkan/radv_meta_clear.c b/src/amd/vulkan/radv_meta_clear.c
index f4cb787..d7e7c5b 100644
--- a/src/amd/vulkan/radv_meta_clear.c
+++ b/src/amd/vulkan/radv_meta_clear.c
@@ -670,11 +670,10 @@ emit_fast_htile_clear(struct radv_cmd_buffer *cmd_buffer,
const struct radv_framebuffer *fb = cmd_buffer->state.framebuffer;
const struct radv_image_view *iview = 
fb->attachments[pass_att].attachment;
VkClearDepthStencilValue clear_value = 
clear_att->clearValue.depthStencil;
VkImageAspectFlags aspects = clear_att->aspectMask;
uint32_t clear_word;
-   bool ret;
 
if (!iview->image->surface.htile_size)
return false;
 
if (cmd_buffer->device->debug_flags & RADV_DEBUG_NO_FAST_CLEARS)
-- 
2.7.4



Re: [Mesa-dev] [PATCH v2 13/23] mesa: add KHR_no_error support for glBlitFramebuffer()

2017-06-06 Thread Timothy Arceri

1-13:

Reviewed-by: Timothy Arceri 

On 06/06/17 07:44, Samuel Pitoiset wrote:

Signed-off-by: Samuel Pitoiset 
---
  src/mapi/glapi/gen/ARB_framebuffer_object.xml |  2 +-
  src/mesa/main/blit.c  | 15 +++
  src/mesa/main/blit.h  |  6 ++
  3 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/src/mapi/glapi/gen/ARB_framebuffer_object.xml 
b/src/mapi/glapi/gen/ARB_framebuffer_object.xml
index 76114eb32a..26f495f8bb 100644
--- a/src/mapi/glapi/gen/ARB_framebuffer_object.xml
+++ b/src/mapi/glapi/gen/ARB_framebuffer_object.xml
@@ -271,7 +271,7 @@

  
  
-

+
  
  
  
diff --git a/src/mesa/main/blit.c b/src/mesa/main/blit.c
index e04f5d79b2..609df63d20 100644
--- a/src/mesa/main/blit.c
+++ b/src/mesa/main/blit.c
@@ -587,6 +587,21 @@ blit_framebuffer(struct gl_context *ctx,
   * when the samples must be resolved to a single color.
   */
  void GLAPIENTRY
+_mesa_BlitFramebuffer_no_error(GLint srcX0, GLint srcY0, GLint srcX1,
+   GLint srcY1, GLint dstX0, GLint dstY0,
+   GLint dstX1, GLint dstY1,
+   GLbitfield mask, GLenum filter)
+{
+   GET_CURRENT_CONTEXT(ctx);
+
+   blit_framebuffer(ctx, ctx->ReadBuffer, ctx->DrawBuffer,
+srcX0, srcY0, srcX1, srcY1,
+dstX0, dstY0, dstX1, dstY1,
+mask, filter, true, "glBlitFramebuffer");
+}
+
+
+void GLAPIENTRY
  _mesa_BlitFramebuffer(GLint srcX0, GLint srcY0, GLint srcX1, GLint srcY1,
GLint dstX0, GLint dstY0, GLint dstX1, GLint dstY1,
GLbitfield mask, GLenum filter)
diff --git a/src/mesa/main/blit.h b/src/mesa/main/blit.h
index 1ca4f83028..6397518dbd 100644
--- a/src/mesa/main/blit.h
+++ b/src/mesa/main/blit.h
@@ -34,6 +34,12 @@ _mesa_regions_overlap(int srcX0, int srcY0,
int dstX0, int dstY0,
int dstX1, int dstY1);
  
+void GLAPIENTRY
+_mesa_BlitFramebuffer_no_error(GLint srcX0, GLint srcY0, GLint srcX1,
+   GLint srcY1, GLint dstX0, GLint dstY0,
+   GLint dstX1, GLint dstY1,
+   GLbitfield mask, GLenum filter);
+
  extern void GLAPIENTRY
  _mesa_BlitFramebuffer(GLint srcX0, GLint srcY0, GLint srcX1, GLint srcY1,
   GLint dstX0, GLint dstY0, GLint dstX1, GLint dstY1,




Re: [Mesa-dev] [PATCH v2] mesa: wrap blit_framebuffer() into blit_framebuffer_err()

2017-06-06 Thread Timothy Arceri

Thanks.

Reviewed-by: Timothy Arceri 

On 07/06/17 05:57, Samuel Pitoiset wrote:

Also add ALWAYS_INLINE to blit_framebuffer().

v2: - use correct parameters
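
The pattern behind this change can be sketched in isolation. The following is a minimal illustration only, not the Mesa code: every name with a sketch_ prefix is hypothetical, and SKETCH_ALWAYS_INLINE merely stands in for Mesa's ALWAYS_INLINE using the GCC/Clang attribute. The worker takes a constant bool so the compiler can drop the validation branch in the no_error copy, while the validating variant is instantiated once and shared by its callers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for Mesa's ALWAYS_INLINE macro (GCC/Clang form). */
#if defined(__GNUC__)
#define SKETCH_ALWAYS_INLINE inline __attribute__((always_inline))
#else
#define SKETCH_ALWAYS_INLINE inline
#endif

/* Shared worker. no_error is a compile-time constant at every call site,
 * so the validation branch can be removed from the no_error copy. */
static SKETCH_ALWAYS_INLINE int
sketch_blit(int width, int height, bool no_error)
{
   if (!no_error) {
      if (width <= 0 || height <= 0)
         return -1;           /* validation failure */
   }
   return width * height;     /* stand-in for the real work */
}

/* Single out-of-line copy of the validating variant, so the large inlined
 * body is not duplicated in every entry point that needs error checks. */
static int
sketch_blit_err(int width, int height)
{
   return sketch_blit(width, height, false);
}

/* Two validating entry points share sketch_blit_err(). */
int sketch_entry_a(int w, int h) { return sketch_blit_err(w, h); }
int sketch_entry_b(int w, int h) { return sketch_blit_err(w, h); }

/* The no_error entry point inlines the worker with validation removed. */
int sketch_entry_no_error(int w, int h) { return sketch_blit(w, h, true); }
```

With this layout the validating path costs one extra call, which seems an acceptable trade for not growing every error-checking caller.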

Signed-off-by: Samuel Pitoiset 
---
  src/mesa/main/blit.c | 34 +-
  1 file changed, 25 insertions(+), 9 deletions(-)

diff --git a/src/mesa/main/blit.c b/src/mesa/main/blit.c
index 970c357335..be5e4f109a 100644
--- a/src/mesa/main/blit.c
+++ b/src/mesa/main/blit.c
@@ -177,7 +177,7 @@ is_valid_blit_filter(const struct gl_context *ctx, GLenum 
filter)
  }
  
  
-static void
+static ALWAYS_INLINE void
  blit_framebuffer(struct gl_context *ctx,
   struct gl_framebuffer *readFb, struct gl_framebuffer *drawFb,
   GLint srcX0, GLint srcY0, GLint srcX1, GLint srcY1,
@@ -537,6 +537,22 @@ blit_framebuffer(struct gl_context *ctx,
  }
  
  
+static void
+blit_framebuffer_err(struct gl_context *ctx,
+ struct gl_framebuffer *readFb,
+ struct gl_framebuffer *drawFb,
+ GLint srcX0, GLint srcY0, GLint srcX1, GLint srcY1,
+ GLint dstX0, GLint dstY0, GLint dstX1, GLint dstY1,
+ GLbitfield mask, GLenum filter, const char *func)
+{
+   /* We are wrapping the err variant of the always inlined
+* blit_framebuffer() to avoid inlining it in every caller.
+*/
+   blit_framebuffer(ctx, readFb, drawFb, srcX0, srcY0, srcX1, srcY1,
+dstX0, dstY0, dstX1, dstY1, mask, filter, false, func);
+}
+
+
  /**
   * Blit rectangular region, optionally from one framebuffer to another.
   *
@@ -558,10 +574,10 @@ _mesa_BlitFramebuffer(GLint srcX0, GLint srcY0, GLint 
srcX1, GLint srcY1,
dstX0, dstY0, dstX1, dstY1,
mask, _mesa_enum_to_string(filter));
  
-   blit_framebuffer(ctx, ctx->ReadBuffer, ctx->DrawBuffer,
-srcX0, srcY0, srcX1, srcY1,
-dstX0, dstY0, dstX1, dstY1,
-mask, filter, false, "glBlitFramebuffer");
+   blit_framebuffer_err(ctx, ctx->ReadBuffer, ctx->DrawBuffer,
+srcX0, srcY0, srcX1, srcY1,
+dstX0, dstY0, dstX1, dstY1,
+mask, filter, "glBlitFramebuffer");
  }
  
  
@@ -609,8 +625,8 @@ _mesa_BlitNamedFramebuffer(GLuint readFramebuffer, GLuint drawFramebuffer,
 else
drawFb = ctx->WinSysDrawBuffer;
  
-   blit_framebuffer(ctx, readFb, drawFb,
-srcX0, srcY0, srcX1, srcY1,
-dstX0, dstY0, dstX1, dstY1,
-mask, filter, false, "glBlitNamedFramebuffer");
+   blit_framebuffer_err(ctx, readFb, drawFb,
+srcX0, srcY0, srcX1, srcY1,
+dstX0, dstY0, dstX1, dstY1,
+mask, filter, "glBlitNamedFramebuffer");
  }




Re: [Mesa-dev] [PATCH v2 56/64] radeonsi: decompress resident textures/images before graphics/compute

2017-06-06 Thread Marek Olšák
On Tue, May 30, 2017 at 10:36 PM, Samuel Pitoiset
 wrote:
> Similar to the existing decompression code path except that it
> loops over the list of resident textures/images.
>
> v2: - store pipe_sampler_view instead of si_sampler_view
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/gallium/drivers/radeonsi/si_blit.c| 77 
> +--
>  src/gallium/drivers/radeonsi/si_descriptors.c | 52 ++
>  src/gallium/drivers/radeonsi/si_pipe.h|  3 ++
>  3 files changed, 129 insertions(+), 3 deletions(-)
>
> diff --git a/src/gallium/drivers/radeonsi/si_blit.c 
> b/src/gallium/drivers/radeonsi/si_blit.c
> index 343ca35736..a47f43958c 100644
> --- a/src/gallium/drivers/radeonsi/si_blit.c
> +++ b/src/gallium/drivers/radeonsi/si_blit.c
> @@ -22,6 +22,7 @@
>   */
>
>  #include "si_pipe.h"
> +#include "si_compute.h"
>  #include "util/u_format.h"
>  #include "util/u_surface.h"
>
> @@ -690,9 +691,6 @@ static void si_decompress_textures(struct si_context 
> *sctx, unsigned shader_mask
>  {
> unsigned compressed_colortex_counter, mask;
>
> -   if (sctx->blitter->running)
> -   return;
> -

You can keep this.

> /* Update the compressed_colortex_mask if necessary. */
> compressed_colortex_counter = 
> p_atomic_read(&sctx->screen->b.compressed_colortex_counter);
> if (compressed_colortex_counter != 
> sctx->b.last_compressed_colortex_counter) {
> @@ -719,14 +717,87 @@ static void si_decompress_textures(struct si_context 
> *sctx, unsigned shader_mask
> si_check_render_feedback(sctx);
>  }
>
> +static void si_decompress_resident_textures(struct si_context *sctx)
> +{
> +   unsigned num_resident_tex_handles;
> +   unsigned i;
> +
> +   num_resident_tex_handles = sctx->resident_tex_handles.size /
> +  sizeof(struct si_texture_handle *);
> +
> +   for (i = 0; i < num_resident_tex_handles; i++) {
> +   struct si_texture_handle *tex_handle =
> +   *util_dynarray_element(&sctx->resident_tex_handles,
> +  struct si_texture_handle *, i);
> +   struct pipe_sampler_view *view = tex_handle->view;
> +   struct si_sampler_view *sview = (struct si_sampler_view 
> *)view;
> +   struct r600_texture *tex;
> +
> +   assert(view);
> +   tex = (struct r600_texture *)view->texture;
> +
> +   if (view->texture->target == PIPE_BUFFER)
> +   continue;
> +
> +   if (tex_handle->compressed_colortex)
> +   si_decompress_color_texture(sctx, tex, 
> view->u.tex.first_level,
> +   view->u.tex.last_level);
> +
> +   if (tex_handle->depth_texture)
> +   si_flush_depth_texture(sctx, tex,
> +   sview->is_stencil_sampler ? PIPE_MASK_S : 
> PIPE_MASK_Z,
> +   view->u.tex.first_level, 
> view->u.tex.last_level,
> +   0, util_max_layer(&tex->resource.b.b, 
> view->u.tex.first_level));
> +   }
> +}
> +
> +static void si_decompress_resident_images(struct si_context *sctx)
> +{
> +   unsigned num_resident_img_handles;
> +   unsigned i;
> +
> +   num_resident_img_handles = sctx->resident_img_handles.size /
> +  sizeof(struct si_image_handle *);
> +
> +   for (i = 0; i < num_resident_img_handles; i++) {
> +   struct si_image_handle *img_handle =
> +   *util_dynarray_element(&sctx->resident_img_handles,
> +  struct si_image_handle *, i);
> +   struct pipe_image_view *view = &img_handle->view;
> +   struct r600_texture *tex;
> +
> +   assert(view);
> +   tex = (struct r600_texture *)view->resource;
> +
> +   if (view->resource->target == PIPE_BUFFER)
> +   continue;
> +
> +   if (img_handle->compressed_colortex)
> +   si_decompress_color_texture(sctx, tex, 
> view->u.tex.level,
> +   view->u.tex.level);
> +   }
> +}

The loops in the two functions above will destroy CPU performance,
because both functions are called for every draw call. We need to find
a better way.

I suggest that si_resident_handles_update_compressed_colortex should
be rewritten to build 2 separate lists (for samples and images)
containing only references to bindless slots that need decompression,
and make_resident functions should update the separate lists too. Then
the two functions above can walk the separate lists instead of all
resident handles. The effect will be that only those slots that need
decompression will be checked. (ideally 0 or very few)
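
A rough sketch of the idea, under stated assumptions: the types and function names below (all prefixed sketch_) are hypothetical and not the radeonsi API, fixed-size arrays stand in for util_dynarray, and a single list is shown where the driver would keep one for textures and one for images. The point is only that the per-draw walk touches the short "needs decompression" list, while the full resident list is scanned only when compression state changes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_HANDLES 64

struct sketch_handle {
   bool needs_decompress;
};

struct sketch_ctx {
   /* Every resident handle. */
   struct sketch_handle *resident[SKETCH_MAX_HANDLES];
   size_t num_resident;
   /* Subset of resident handles that currently need decompression. */
   struct sketch_handle *to_decompress[SKETCH_MAX_HANDLES];
   size_t num_to_decompress;
};

/* Called when a handle becomes resident: keep both lists in sync. */
static bool
sketch_make_resident(struct sketch_ctx *ctx, struct sketch_handle *h)
{
   if (ctx->num_resident == SKETCH_MAX_HANDLES)
      return false;

   ctx->resident[ctx->num_resident++] = h;
   if (h->needs_decompress)
      ctx->to_decompress[ctx->num_to_decompress++] = h;
   return true;
}

/* Called only when compression state changes (expected to be rare):
 * rebuild the short list from the full one. */
static void
sketch_rebuild_decompress_list(struct sketch_ctx *ctx)
{
   ctx->num_to_decompress = 0;
   for (size_t i = 0; i < ctx->num_resident; i++) {
      if (ctx->resident[i]->needs_decompress)
         ctx->to_decompress[ctx->num_to_decompress++] = ctx->resident[i];
   }
}

/* Called for every draw: walks only the short list and returns how many
 * handles were visited. A real driver would decompress each one here. */
static size_t
sketch_decompress_resident(struct sketch_ctx *ctx)
{
   size_t visited = 0;
   for (size_t i = 0; i < ctx->num_to_decompress; i++)
      visited++;
   return visited;
}
```

In the common case the short list is empty, so the per-draw cost reduces to one loop-condition check.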

> +
>  void 

[Mesa-dev] [PATCH] nir: make various getters take const pointers

2017-06-06 Thread Grazvydas Ignotas
This will allow other things to be constified.
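
As a small illustration of why this matters (a sketch with hypothetical sketch_ names, not the NIR types): once a getter accepts a const pointer, read-only code built on top of it can take const pointers as well, which is not possible while the getter demands a mutable one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_node {
   struct sketch_node *prev, *next;
};

struct sketch_instr {
   struct sketch_node node;
   int op;
};

/* Getter that only reads, so it takes a const pointer. */
static inline bool
sketch_instr_is_first(const struct sketch_instr *instr)
{
   return instr->node.prev == NULL;
}

/* Because the getter is const-correct, this read-only helper can accept
 * const data too, without casting the qualifier away. */
static inline unsigned
sketch_count_first(const struct sketch_instr *instrs, size_t n)
{
   unsigned count = 0;
   for (size_t i = 0; i < n; i++)
      count += sketch_instr_is_first(&instrs[i]);
   return count;
}
```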

Signed-off-by: Grazvydas Ignotas 
---
 src/compiler/nir/nir.h  | 25 +
 src/compiler/nir/nir_lower_io.c |  2 +-
 2 files changed, 14 insertions(+), 13 deletions(-)

diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
index 3b827bf..ab7ba14 100644
--- a/src/compiler/nir/nir.h
+++ b/src/compiler/nir/nir.h
@@ -436,19 +436,19 @@ nir_instr_prev(nir_instr *instr)
else
   return exec_node_data(nir_instr, prev, node);
 }
 
 static inline bool
-nir_instr_is_first(nir_instr *instr)
+nir_instr_is_first(const nir_instr *instr)
 {
-   return exec_node_is_head_sentinel(exec_node_get_prev(&instr->node));
+   return exec_node_is_head_sentinel(exec_node_get_prev_const(&instr->node));
 }
 
 static inline bool
-nir_instr_is_last(nir_instr *instr)
+nir_instr_is_last(const nir_instr *instr)
 {
-   return exec_node_is_tail_sentinel(exec_node_get_next(&instr->node));
+   return exec_node_is_tail_sentinel(exec_node_get_next_const(&instr->node));
 }
 
 typedef struct nir_ssa_def {
/** for debugging only, can be NULL */
const char* name;
@@ -802,11 +802,12 @@ void nir_alu_src_copy(nir_alu_src *dest, const 
nir_alu_src *src,
 void nir_alu_dest_copy(nir_alu_dest *dest, const nir_alu_dest *src,
nir_alu_instr *instr);
 
 /* is this source channel used? */
 static inline bool
-nir_alu_instr_channel_used(nir_alu_instr *instr, unsigned src, unsigned 
channel)
+nir_alu_instr_channel_used(const nir_alu_instr *instr, unsigned src,
+   unsigned channel)
 {
if (nir_op_infos[instr->op].input_sizes[src] > 0)
   return channel < nir_op_infos[instr->op].input_sizes[src];
 
return (instr->dest.write_mask >> channel) & 1;
@@ -1085,11 +1086,11 @@ typedef struct {
 extern const nir_intrinsic_info nir_intrinsic_infos[nir_num_intrinsics];
 
 
 #define INTRINSIC_IDX_ACCESSORS(name, flag, type) \
 static inline type\
-nir_intrinsic_##name(nir_intrinsic_instr *instr)  \
+nir_intrinsic_##name(const nir_intrinsic_instr *instr)\
 { \
 const nir_intrinsic_info *info = &nir_intrinsic_infos[instr->intrinsic];   \
assert(info->index_map[NIR_INTRINSIC_##flag] > 0); \
return instr->const_index[info->index_map[NIR_INTRINSIC_##flag] - 1];  \
 } \
@@ -1219,11 +1220,11 @@ typedef struct {
 */
nir_deref_var *sampler;
 } nir_tex_instr;
 
 static inline unsigned
-nir_tex_instr_dest_size(nir_tex_instr *instr)
+nir_tex_instr_dest_size(const nir_tex_instr *instr)
 {
switch (instr->op) {
case nir_texop_txs: {
   unsigned ret;
   switch (instr->sampler_dim) {
@@ -1268,11 +1269,11 @@ nir_tex_instr_dest_size(nir_tex_instr *instr)
 
 /* Returns true if this texture operation queries something about the texture
  * rather than actually sampling it.
  */
 static inline bool
-nir_tex_instr_is_query(nir_tex_instr *instr)
+nir_tex_instr_is_query(const nir_tex_instr *instr)
 {
switch (instr->op) {
case nir_texop_txs:
case nir_texop_lod:
case nir_texop_texture_samples:
@@ -1291,11 +1292,11 @@ nir_tex_instr_is_query(nir_tex_instr *instr)
   unreachable("Invalid texture opcode");
}
 }
 
 static inline nir_alu_type
-nir_tex_instr_src_type(nir_tex_instr *instr, unsigned src)
+nir_tex_instr_src_type(const nir_tex_instr *instr, unsigned src)
 {
switch (instr->src[src].src_type) {
case nir_tex_src_coord:
   switch (instr->op) {
   case nir_texop_txf:
@@ -1335,11 +1336,11 @@ nir_tex_instr_src_type(nir_tex_instr *instr, unsigned 
src)
   unreachable("Invalid texture source type");
}
 }
 
 static inline unsigned
-nir_tex_instr_src_size(nir_tex_instr *instr, unsigned src)
+nir_tex_instr_src_size(const nir_tex_instr *instr, unsigned src)
 {
if (instr->src[src].src_type == nir_tex_src_coord)
   return instr->coord_components;
 
/* The MCS value is expected to be a vec4 returned by a txf_ms_mcs */
@@ -1357,11 +1358,11 @@ nir_tex_instr_src_size(nir_tex_instr *instr, unsigned 
src)
 
return 1;
 }
 
 static inline int
-nir_tex_instr_src_index(nir_tex_instr *instr, nir_tex_src_type type)
+nir_tex_instr_src_index(const nir_tex_instr *instr, nir_tex_src_type type)
 {
for (unsigned i = 0; i < instr->num_srcs; i++)
   if (instr->src[i].src_type == type)
  return (int) i;
 
@@ -2392,11 +2393,11 @@ bool nir_lower_io(nir_shader *shader,
   int (*type_size)(const struct glsl_type *),
   nir_lower_io_options);
 nir_src *nir_get_io_offset_src(nir_intrinsic_instr *instr);
 nir_src *nir_get_io_vertex_index_src(nir_intrinsic_instr *instr);
 
-bool nir_is_per_vertex_io(nir_variable *var, gl_shader_stage stage);

[Mesa-dev] XDC 2017 : Call for paper

2017-06-06 Thread Martin Peres
Hello,

I am pleased to announce that the X.org Developer Conference 2017
will be held in Mountain View, California from September 20th to
September 22nd. The venue is located at the Googleplex.

The official page for the event is http://www.x.org/wiki/Events/XDC2017
while the call for papers is at http://www.x.org/wiki/Other/Press/CFP2017/

As usual, we are open to talks across the layers of the graphics stack,
from the kernel to desktop environments / graphical applications and
about how to make things better for the developers who build them.
Given that the conference is located at Google, we would welcome topics
related to Android and Chromebooks. We would also like to hear about
Virtual Reality and end-to-end buffer format negotiation. If you're not
sure if something might fit, mail me or add it to the ideas list found
in the program page.

The conference is free of charge and open to the general public. If
you plan on coming, please add yourself to the attendees list. We'll
use this list to make badges and plan for the catering, so if you are
attending please add your name as early as possible.

I am looking forward to seeing you there. If you have any
inquiries/questions, please send them to Stéphane Marchesin (please also
CC: board at foundation.x.org).

Martin Peres


[Mesa-dev] [PATCH 3/3] radv: move lots of index related things into the bind.

2017-06-06 Thread Dave Airlie
From: Dave Airlie 

This just moves lots of stuff to the bind stage rather than
dealing with it in the draw stage.
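
A simplified sketch of the resulting split, for illustration only: the sketch_ names are hypothetical, buffer_va is assumed to already include the buffer's own offset within its BO, and index_type follows the patch's convention (0 for 16-bit indices, non-zero for 32-bit). Everything that depends only on the bound buffer is computed once at bind time, and the draw adds just the per-draw firstIndex term.

```c
#include <assert.h>
#include <stdint.h>

struct sketch_index_state {
   uint64_t index_va;         /* address of the first index in the binding */
   uint32_t max_index_count;  /* number of indices that fit in the binding */
   int index_type;            /* 0 = 16-bit indices, non-zero = 32-bit */
};

/* Bind time: cache everything that depends only on the bound buffer. */
static void
sketch_bind_index_buffer(struct sketch_index_state *st,
                         uint64_t buffer_va, uint64_t buffer_size,
                         uint64_t offset, int index_type)
{
   assert(offset <= buffer_size);

   int index_size_shift = index_type ? 2 : 1;

   st->index_type = index_type;
   st->index_va = buffer_va + offset;
   st->max_index_count =
      (uint32_t)((buffer_size - offset) >> index_size_shift);
}

/* Draw time: only the per-draw first index still has to be applied. */
static uint64_t
sketch_draw_index_va(const struct sketch_index_state *st,
                     uint32_t first_index)
{
   uint64_t index_size = st->index_type ? 4 : 2;

   return st->index_va + (uint64_t)first_index * index_size;
}
```

For example, a 64-byte buffer bound at offset 16 with 16-bit indices leaves 48 bytes, which is 24 indices.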

Signed-off-by: Dave Airlie 
---
 src/amd/vulkan/radv_cmd_buffer.c | 29 -
 src/amd/vulkan/radv_private.h|  4 ++--
 2 files changed, 14 insertions(+), 19 deletions(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index 0e2ae31..1ac9de1 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -1987,12 +1987,16 @@ void radv_CmdBindIndexBuffer(
VkIndexType indexType)
 {
RADV_FROM_HANDLE(radv_cmd_buffer, cmd_buffer, commandBuffer);
+   RADV_FROM_HANDLE(radv_buffer, index_buffer, buffer);
 
-   cmd_buffer->state.index_buffer = radv_buffer_from_handle(buffer);
-   cmd_buffer->state.index_offset = offset;
cmd_buffer->state.index_type = indexType; /* vk matches hw */
+   cmd_buffer->state.index_va = 
cmd_buffer->device->ws->buffer_get_va(index_buffer->bo);
+   cmd_buffer->state.index_va += index_buffer->offset + offset;
+
+   int index_size_shift = cmd_buffer->state.index_type ? 2 : 1;
+   cmd_buffer->state.max_index_count = (index_buffer->size - offset) >> 
index_size_shift;
cmd_buffer->state.dirty |= RADV_CMD_DIRTY_INDEX_BUFFER;
-   cmd_buffer->device->ws->cs_add_buffer(cmd_buffer->cs, 
cmd_buffer->state.index_buffer->bo, 8);
+   cmd_buffer->device->ws->cs_add_buffer(cmd_buffer->cs, index_buffer->bo, 
8);
 }
 
 
@@ -2639,12 +2643,6 @@ void radv_CmdDraw(
radv_cmd_buffer_trace_emit(cmd_buffer);
 }
 
-static
-uint32_t radv_get_max_index_count(struct radv_cmd_buffer *cmd_buffer) {
-   int index_size_shift = cmd_buffer->state.index_type ? 2 : 1;
-   return (cmd_buffer->state.index_buffer->size - 
cmd_buffer->state.index_offset) >> index_size_shift;
-}
-
 void radv_CmdDrawIndexed(
VkCommandBuffer commandBuffer,
uint32_tindexCount,
@@ -2655,7 +2653,6 @@ void radv_CmdDrawIndexed(
 {
RADV_FROM_HANDLE(radv_cmd_buffer, cmd_buffer, commandBuffer);
int index_size = cmd_buffer->state.index_type ? 4 : 2;
-   uint32_t index_max_size = radv_get_max_index_count(cmd_buffer);
uint64_t index_va;
 
radv_cmd_buffer_flush_state(cmd_buffer, true, (instanceCount > 1), 
false, indexCount);
@@ -2681,10 +2678,10 @@ void radv_CmdDrawIndexed(
radeon_emit(cmd_buffer->cs, PKT3(PKT3_NUM_INSTANCES, 0, 0));
radeon_emit(cmd_buffer->cs, instanceCount);
 
-   index_va = 
cmd_buffer->device->ws->buffer_get_va(cmd_buffer->state.index_buffer->bo);
-   index_va += firstIndex * index_size + 
cmd_buffer->state.index_buffer->offset + cmd_buffer->state.index_offset;
+   index_va = cmd_buffer->state.index_va;
+   index_va += firstIndex * index_size;
radeon_emit(cmd_buffer->cs, PKT3(PKT3_DRAW_INDEX_2, 4, false));
-   radeon_emit(cmd_buffer->cs, index_max_size);
+   radeon_emit(cmd_buffer->cs, cmd_buffer->state.max_index_count);
radeon_emit(cmd_buffer->cs, index_va);
radeon_emit(cmd_buffer->cs, (index_va >> 32UL) & 0xFF);
radeon_emit(cmd_buffer->cs, indexCount);
@@ -2780,12 +2777,10 @@ radv_cmd_draw_indexed_indirect_count(
uint32_tstride)
 {
RADV_FROM_HANDLE(radv_cmd_buffer, cmd_buffer, commandBuffer);
-   uint32_t index_max_size = radv_get_max_index_count(cmd_buffer);
uint64_t index_va;
radv_cmd_buffer_flush_state(cmd_buffer, true, false, true, 0);
 
-   index_va = 
cmd_buffer->device->ws->buffer_get_va(cmd_buffer->state.index_buffer->bo);
-   index_va += cmd_buffer->state.index_buffer->offset + 
cmd_buffer->state.index_offset;
+   index_va = cmd_buffer->state.index_va;
 
MAYBE_UNUSED unsigned cdw_max = 
radeon_check_space(cmd_buffer->device->ws, cmd_buffer->cs, 21);
 
@@ -2797,7 +2792,7 @@ radv_cmd_draw_indexed_indirect_count(
radeon_emit(cmd_buffer->cs, index_va >> 32);
 
radeon_emit(cmd_buffer->cs, PKT3(PKT3_INDEX_BUFFER_SIZE, 0, 0));
-   radeon_emit(cmd_buffer->cs, index_max_size);
+   radeon_emit(cmd_buffer->cs, cmd_buffer->state.max_index_count);
 
radv_emit_indirect_draw(cmd_buffer, buffer, offset,
countBuffer, countBufferOffset, maxDrawCount, 
stride, true);
diff --git a/src/amd/vulkan/radv_private.h b/src/amd/vulkan/radv_private.h
index 13f298e..c21b17e 100644
--- a/src/amd/vulkan/radv_private.h
+++ b/src/amd/vulkan/radv_private.h
@@ -764,9 +764,9 @@ struct radv_cmd_state {
struct radv_descriptor_set *  descriptors[MAX_SETS];
struct radv_attachment_state *attachments;
VkRect2D render_area;
-   struct radv_buffer * index_buffer;
uint32_t  

[Mesa-dev] [PATCH 1/3] radv: rename and make global some functions.

2017-06-06 Thread Dave Airlie
From: Dave Airlie 

I want to use these in the pipeline setup stage.

Signed-off-by: Dave Airlie 
---
 src/amd/vulkan/radv_cmd_buffer.c | 24 
 src/amd/vulkan/radv_private.h|  5 +
 2 files changed, 17 insertions(+), 12 deletions(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index f3187e8..851b2ca 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -394,8 +394,8 @@ static unsigned radv_pack_float_12p4(float x)
   x >= 4096 ? 0xffff : x * 16;
 }
 
-static uint32_t
-shader_stage_to_user_data_0(gl_shader_stage stage, bool has_gs, bool has_tess)
+uint32_t
+radv_shader_stage_to_user_data_0(gl_shader_stage stage, bool has_gs, bool 
has_tess)
 {
switch (stage) {
case MESA_SHADER_FRAGMENT:
@@ -421,7 +421,7 @@ shader_stage_to_user_data_0(gl_shader_stage stage, bool 
has_gs, bool has_tess)
}
 }
 
-static struct ac_userdata_info *
+struct ac_userdata_info *
 radv_lookup_user_sgpr(struct radv_pipeline *pipeline,
  gl_shader_stage stage,
  int idx)
@@ -436,7 +436,7 @@ radv_emit_userdata_address(struct radv_cmd_buffer 
*cmd_buffer,
   int idx, uint64_t va)
 {
struct ac_userdata_info *loc = radv_lookup_user_sgpr(pipeline, stage, 
idx);
-   uint32_t base_reg = shader_stage_to_user_data_0(stage, 
radv_pipeline_has_gs(pipeline), radv_pipeline_has_tess(pipeline));
+   uint32_t base_reg = radv_shader_stage_to_user_data_0(stage, 
radv_pipeline_has_gs(pipeline), radv_pipeline_has_tess(pipeline));
if (loc->sgpr_idx == -1)
return;
assert(loc->num_sgprs == 2);
@@ -478,7 +478,7 @@ radv_update_multisample_state(struct radv_cmd_buffer 
*cmd_buffer,
if 
(pipeline->shaders[MESA_SHADER_FRAGMENT]->info.info.ps.needs_sample_positions) {
uint32_t offset;
struct ac_userdata_info *loc = radv_lookup_user_sgpr(pipeline, 
MESA_SHADER_FRAGMENT, AC_UD_PS_SAMPLE_POS_OFFSET);
-   uint32_t base_reg = 
shader_stage_to_user_data_0(MESA_SHADER_FRAGMENT, 
radv_pipeline_has_gs(pipeline), radv_pipeline_has_tess(pipeline));
+   uint32_t base_reg = 
radv_shader_stage_to_user_data_0(MESA_SHADER_FRAGMENT, 
radv_pipeline_has_gs(pipeline), radv_pipeline_has_tess(pipeline));
if (loc->sgpr_idx == -1)
return;
assert(loc->num_sgprs == 1);
@@ -698,7 +698,7 @@ radv_emit_tess_shaders(struct radv_cmd_buffer *cmd_buffer,
 
loc = radv_lookup_user_sgpr(pipeline, MESA_SHADER_TESS_CTRL, 
AC_UD_TCS_OFFCHIP_LAYOUT);
if (loc->sgpr_idx != -1) {
-   uint32_t base_reg = 
shader_stage_to_user_data_0(MESA_SHADER_TESS_CTRL, 
radv_pipeline_has_gs(pipeline), radv_pipeline_has_tess(pipeline));
+   uint32_t base_reg = 
radv_shader_stage_to_user_data_0(MESA_SHADER_TESS_CTRL, 
radv_pipeline_has_gs(pipeline), radv_pipeline_has_tess(pipeline));
assert(loc->num_sgprs == 4);
assert(!loc->indirect);
radeon_set_sh_reg_seq(cmd_buffer->cs, base_reg + loc->sgpr_idx 
* 4, 4);
@@ -711,7 +711,7 @@ radv_emit_tess_shaders(struct radv_cmd_buffer *cmd_buffer,
 
loc = radv_lookup_user_sgpr(pipeline, MESA_SHADER_TESS_EVAL, 
AC_UD_TES_OFFCHIP_LAYOUT);
if (loc->sgpr_idx != -1) {
-   uint32_t base_reg = 
shader_stage_to_user_data_0(MESA_SHADER_TESS_EVAL, 
radv_pipeline_has_gs(pipeline), radv_pipeline_has_tess(pipeline));
+   uint32_t base_reg = 
radv_shader_stage_to_user_data_0(MESA_SHADER_TESS_EVAL, 
radv_pipeline_has_gs(pipeline), radv_pipeline_has_tess(pipeline));
assert(loc->num_sgprs == 1);
assert(!loc->indirect);
 
@@ -721,7 +721,7 @@ radv_emit_tess_shaders(struct radv_cmd_buffer *cmd_buffer,
 
loc = radv_lookup_user_sgpr(pipeline, MESA_SHADER_VERTEX, 
AC_UD_VS_LS_TCS_IN_LAYOUT);
if (loc->sgpr_idx != -1) {
-   uint32_t base_reg = 
shader_stage_to_user_data_0(MESA_SHADER_VERTEX, radv_pipeline_has_gs(pipeline), 
radv_pipeline_has_tess(pipeline));
+   uint32_t base_reg = 
radv_shader_stage_to_user_data_0(MESA_SHADER_VERTEX, 
radv_pipeline_has_gs(pipeline), radv_pipeline_has_tess(pipeline));
assert(loc->num_sgprs == 1);
assert(!loc->indirect);
 
@@ -1319,7 +1319,7 @@ emit_stage_descriptor_set_userdata(struct radv_cmd_buffer 
*cmd_buffer,
   gl_shader_stage stage)
 {
struct ac_userdata_info *desc_set_loc = 
&pipeline->shaders[stage]->info.user_sgprs_locs.descriptor_sets[idx];
-   uint32_t base_reg = shader_stage_to_user_data_0(stage, 
radv_pipeline_has_gs(pipeline), radv_pipeline_has_tess(pipeline));
+   uint32_t base_reg = radv_shader_stage_to_user_data_0(stage, 
radv_pipeline_has_gs(pipeline), radv_pipeline_has_tess(pipeline));
 
  

[Mesa-dev] [PATCH 2/3] radv: move calculating the vertex sgpr to the pipeline.

2017-06-06 Thread Dave Airlie
From: Dave Airlie 

There is no need to calculate this at draw time.
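
Roughly, the idea looks like the sketch below. This is an illustration under assumptions, not the radv_pipeline.c hunk: the sketch_ names are hypothetical, the register base is an arbitrary number, and the draw-time helper writes into a small array where the driver would call radeon_emit(). Only the formula (base register plus four bytes per SGPR index) and the 2-or-3 dword count are taken from the draw-time code being removed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct sketch_pipeline {
   uint32_t vtx_base_sgpr;   /* register offset, 0 when the SGPRs are unused */
   uint32_t vtx_emit_num;    /* 2, or 3 when the draw id is needed */
};

/* Pipeline creation: resolve the register offset and dword count once. */
static void
sketch_pipeline_init_vtx(struct sketch_pipeline *p, uint32_t base_reg,
                         int sgpr_idx, bool needs_draw_id)
{
   p->vtx_base_sgpr = 0;
   p->vtx_emit_num = 0;
   if (sgpr_idx == -1)
      return;

   p->vtx_base_sgpr = base_reg + (uint32_t)sgpr_idx * 4;
   p->vtx_emit_num = needs_draw_id ? 3 : 2;
}

/* Draw time: no lookups left, just write the cached number of dwords.
 * Returns how many dwords were written to out. */
static unsigned
sketch_emit_vtx(const struct sketch_pipeline *p, uint32_t first_vertex,
                uint32_t first_instance, uint32_t out[3])
{
   unsigned n = 0;

   assert(p->vtx_base_sgpr);
   out[n++] = first_vertex;
   out[n++] = first_instance;
   if (p->vtx_emit_num == 3)
      out[n++] = 0;   /* draw id */
   return n;
}
```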

Signed-off-by: Dave Airlie 
---
 src/amd/vulkan/radv_cmd_buffer.c | 63 ++--
 src/amd/vulkan/radv_pipeline.c   | 10 +++
 src/amd/vulkan/radv_private.h|  2 ++
 3 files changed, 34 insertions(+), 41 deletions(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index 851b2ca..0e2ae31 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -2618,22 +2618,14 @@ void radv_CmdDraw(
 
MAYBE_UNUSED unsigned cdw_max = 
radeon_check_space(cmd_buffer->device->ws, cmd_buffer->cs, 10);
 
-   struct ac_userdata_info *loc = 
radv_lookup_user_sgpr(cmd_buffer->state.pipeline, MESA_SHADER_VERTEX,
-
AC_UD_VS_BASE_VERTEX_START_INSTANCE);
-   if (loc->sgpr_idx != -1) {
-   uint32_t base_reg = 
radv_shader_stage_to_user_data_0(MESA_SHADER_VERTEX, 
radv_pipeline_has_gs(cmd_buffer->state.pipeline),
-   
radv_pipeline_has_tess(cmd_buffer->state.pipeline));
-   int vs_num = 2;
-   if 
(cmd_buffer->state.pipeline->shaders[MESA_SHADER_VERTEX]->info.info.vs.needs_draw_id)
-   vs_num = 3;
-
-   assert (loc->num_sgprs == vs_num);
-   radeon_set_sh_reg_seq(cmd_buffer->cs, base_reg + loc->sgpr_idx 
* 4, vs_num);
-   radeon_emit(cmd_buffer->cs, firstVertex);
-   radeon_emit(cmd_buffer->cs, firstInstance);
-   if 
(cmd_buffer->state.pipeline->shaders[MESA_SHADER_VERTEX]->info.info.vs.needs_draw_id)
-   radeon_emit(cmd_buffer->cs, 0);
-   }
+   assert(cmd_buffer->state.pipeline->graphics.vtx_base_sgpr);
+   radeon_set_sh_reg_seq(cmd_buffer->cs, 
cmd_buffer->state.pipeline->graphics.vtx_base_sgpr,
+ 
cmd_buffer->state.pipeline->graphics.vtx_emit_num);
+   radeon_emit(cmd_buffer->cs, firstVertex);
+   radeon_emit(cmd_buffer->cs, firstInstance);
+   if (cmd_buffer->state.pipeline->graphics.vtx_emit_num == 3)
+   radeon_emit(cmd_buffer->cs, 0);
+
radeon_emit(cmd_buffer->cs, PKT3(PKT3_NUM_INSTANCES, 0, 0));
radeon_emit(cmd_buffer->cs, instanceCount);
 
@@ -2678,22 +2670,14 @@ void radv_CmdDrawIndexed(
radeon_emit(cmd_buffer->cs, cmd_buffer->state.index_type);
}
 
-   struct ac_userdata_info *loc = 
radv_lookup_user_sgpr(cmd_buffer->state.pipeline, MESA_SHADER_VERTEX,
-
AC_UD_VS_BASE_VERTEX_START_INSTANCE);
-   if (loc->sgpr_idx != -1) {
-   uint32_t base_reg = 
radv_shader_stage_to_user_data_0(MESA_SHADER_VERTEX, 
radv_pipeline_has_gs(cmd_buffer->state.pipeline),
-   
radv_pipeline_has_tess(cmd_buffer->state.pipeline));
-   int vs_num = 2;
-   if 
(cmd_buffer->state.pipeline->shaders[MESA_SHADER_VERTEX]->info.info.vs.needs_draw_id)
-   vs_num = 3;
-
-   assert (loc->num_sgprs == vs_num);
-   radeon_set_sh_reg_seq(cmd_buffer->cs, base_reg + loc->sgpr_idx 
* 4, vs_num);
-   radeon_emit(cmd_buffer->cs, vertexOffset);
-   radeon_emit(cmd_buffer->cs, firstInstance);
-   if 
(cmd_buffer->state.pipeline->shaders[MESA_SHADER_VERTEX]->info.info.vs.needs_draw_id)
-   radeon_emit(cmd_buffer->cs, 0);
-   }
+   assert(cmd_buffer->state.pipeline->graphics.vtx_base_sgpr);
+   radeon_set_sh_reg_seq(cmd_buffer->cs, 
cmd_buffer->state.pipeline->graphics.vtx_base_sgpr,
+ 
cmd_buffer->state.pipeline->graphics.vtx_emit_num);
+   radeon_emit(cmd_buffer->cs, vertexOffset);
+   radeon_emit(cmd_buffer->cs, firstInstance);
+   if (cmd_buffer->state.pipeline->graphics.vtx_emit_num == 3)
+   radeon_emit(cmd_buffer->cs, 0);
+
radeon_emit(cmd_buffer->cs, PKT3(PKT3_NUM_INSTANCES, 0, 0));
radeon_emit(cmd_buffer->cs, instanceCount);
 
@@ -2738,13 +2722,10 @@ radv_emit_indirect_draw(struct radv_cmd_buffer 
*cmd_buffer,
return;
 
cmd_buffer->device->ws->cs_add_buffer(cs, buffer->bo, 8);
-
-   struct ac_userdata_info *loc = 
radv_lookup_user_sgpr(cmd_buffer->state.pipeline, MESA_SHADER_VERTEX,
-
AC_UD_VS_BASE_VERTEX_START_INSTANCE);
-   uint32_t base_reg = 
radv_shader_stage_to_user_data_0(MESA_SHADER_VERTEX, 
radv_pipeline_has_gs(cmd_buffer->state.pipeline),
-   
radv_pipeline_has_tess(cmd_buffer->state.pipeline));
bool draw_id_enable = 

Re: [Mesa-dev] [PATCH 2/2] mesa: inline update_image_transfer_state() into _mesa_update_pixel()

2017-06-06 Thread Timothy Arceri

With Ian's suggestion, the series is:

Reviewed-by: Timothy Arceri 

On 07/06/17 06:58, Samuel Pitoiset wrote:

Signed-off-by: Samuel Pitoiset 
---
  src/mesa/main/pixel.c | 19 +--
  1 file changed, 5 insertions(+), 14 deletions(-)

diff --git a/src/mesa/main/pixel.c b/src/mesa/main/pixel.c
index 218e9fdd6b..a3f04d5688 100644
--- a/src/mesa/main/pixel.c
+++ b/src/mesa/main/pixel.c
@@ -598,12 +598,12 @@ _mesa_PixelTransferi( GLenum pname, GLint param )
  /*State Management*/
  /**/
  
-/*
- * Return a bitmask of IMAGE_*_BIT flags which to indicate which
- * pixel transfer operations are enabled.
+
+/**
+ * Update mesa pixel transfer derived state to indicate which operations are
+ * enabled.
   */
-static void
-update_image_transfer_state(struct gl_context *ctx)
+void _mesa_update_pixel( struct gl_context *ctx )
  {
 GLuint mask = 0;
  
@@ -623,15 +623,6 @@ update_image_transfer_state(struct gl_context *ctx)
  }
  
  
-/**
- * Update mesa pixel transfer derived state.
- */
-void _mesa_update_pixel( struct gl_context *ctx )
-{
-   update_image_transfer_state(ctx);
-}
-
-
  /**/
  /*  Initialization*/
  /**/




[Mesa-dev] [PATCH v2 1/1] radeonsi: Use libdrm to get chipset name

2017-06-06 Thread Samuel Li
v2: Add a func pointer to radeon_winsys to support radeon later.
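
The optional-callback pattern can be sketched as follows. This is an illustration with hypothetical sketch_ names and made-up chip strings, not the radeon_winsys interface: a winsys that cannot provide a name leaves the pointer NULL, a winsys that can may still return NULL for an unknown device, and in both cases the caller falls back to the per-family table.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum sketch_family { SKETCH_CHIP_A, SKETCH_CHIP_B };

struct sketch_winsys {
   /* Optional hook: NULL when the winsys has no way to query a name. */
   const char *(*get_chip_name)(struct sketch_winsys *ws);
};

/* Static fallback table keyed by chip family. */
static const char *
sketch_family_name(enum sketch_family family)
{
   switch (family) {
   case SKETCH_CHIP_A: return "Sketch Chip A";
   case SKETCH_CHIP_B: return "Sketch Chip B";
   }
   return "Sketch unknown chip";
}

/* Prefer the winsys-provided name, fall back to the family name. */
static const char *
sketch_get_chip_name(struct sketch_winsys *ws, enum sketch_family family)
{
   if (ws->get_chip_name) {
      const char *name = ws->get_chip_name(ws);
      if (name != NULL)
         return name;
   }
   return sketch_family_name(family);
}

/* Example hooks: one that knows the device, one that does not. */
static const char *
sketch_marketing_name(struct sketch_winsys *ws)
{
   (void)ws;
   return "Sketch Marketing Name";
}

static const char *
sketch_no_name(struct sketch_winsys *ws)
{
   (void)ws;
   return NULL;
}
```

Note that the hook can only be consulted once the winsys pointer has been stored, which is presumably why the patch moves the rscreen->ws assignment earlier in r600_common_screen_init().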

Change-Id: I614ea71424f9e5c97e4ae68654315d28c89eaa5f
Signed-off-by: Samuel Li 
---
 src/gallium/drivers/radeon/r600_pipe_common.c | 11 ++-
 src/gallium/drivers/radeon/radeon_winsys.h|  2 ++
 src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c |  8 
 3 files changed, 20 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c 
b/src/gallium/drivers/radeon/r600_pipe_common.c
index 2c0cadb..48d136a 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.c
+++ b/src/gallium/drivers/radeon/r600_pipe_common.c
@@ -790,6 +790,15 @@ static const char* r600_get_device_vendor(struct 
pipe_screen* pscreen)
 
 static const char* r600_get_chip_name(struct r600_common_screen *rscreen)
 {
+   const char *mname;
+
+   if (rscreen->ws->get_chip_name) {
+   mname = rscreen->ws->get_chip_name(rscreen->ws);
+   if (mname != NULL)
+   return mname;
+   }
+
+   /* fall back to family names*/
switch (rscreen->info.family) {
case CHIP_R600: return "AMD R600";
case CHIP_RV610: return "AMD RV610";
@@ -1321,6 +1330,7 @@ bool r600_common_screen_init(struct r600_common_screen 
*rscreen,
struct utsname uname_data;
 
 ws->query_info(ws, &rscreen->info);
+   rscreen->ws = ws;
 
 if (uname(&uname_data) == 0)
snprintf(kernel_version, sizeof(kernel_version),
@@ -1362,7 +1372,6 @@ bool r600_common_screen_init(struct r600_common_screen 
*rscreen,
r600_init_screen_texture_functions(rscreen);
r600_init_screen_query_functions(rscreen);
 
-   rscreen->ws = ws;
rscreen->family = rscreen->info.family;
rscreen->chip_class = rscreen->info.chip_class;
rscreen->debug_flags = debug_get_flags_option("R600_DEBUG", 
common_debug_options, 0);
diff --git a/src/gallium/drivers/radeon/radeon_winsys.h 
b/src/gallium/drivers/radeon/radeon_winsys.h
index 524bb46..e19fde6 100644
--- a/src/gallium/drivers/radeon/radeon_winsys.h
+++ b/src/gallium/drivers/radeon/radeon_winsys.h
@@ -637,6 +637,8 @@ struct radeon_winsys {
 
 bool (*read_registers)(struct radeon_winsys *ws, unsigned reg_offset,
unsigned num_registers, uint32_t *out);
+
+const char* (*get_chip_name)(struct radeon_winsys *ws);
 };
 
 static inline bool radeon_emitted(struct radeon_winsys_cs *cs, unsigned num_dw)
diff --git a/src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c 
b/src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c
index c8bd60e..b2307fe 100644
--- a/src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c
+++ b/src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c
@@ -221,6 +221,13 @@ static bool amdgpu_winsys_unref(struct radeon_winsys *rws)
return destroy;
 }
 
+static const char* amdgpu_get_chip_name(struct radeon_winsys *ws)
+{
+   amdgpu_device_handle dev = ((struct amdgpu_winsys *)ws)->dev;
+   return amdgpu_get_marketing_name(dev);
+}
+
+
 PUBLIC struct radeon_winsys *
 amdgpu_winsys_create(int fd, radeon_screen_create_t screen_create)
 {
@@ -296,6 +303,7 @@ amdgpu_winsys_create(int fd, radeon_screen_create_t 
screen_create)
ws->base.cs_request_feature = amdgpu_cs_request_feature;
ws->base.query_value = amdgpu_query_value;
ws->base.read_registers = amdgpu_read_registers;
+   ws->base.get_chip_name = amdgpu_get_chip_name;
 
amdgpu_bo_init_functions(ws);
amdgpu_cs_init_functions(ws);
-- 
2.7.4



Re: [Mesa-dev] [PATCH 1/6] mesa: Add _mesa_format_fallback_rgba_to_rgbx()

2017-06-06 Thread Dylan Baker
Quoting Chad Versace (2017-06-06 15:11:18)
> On Tue 06 Jun 2017, Dylan Baker wrote:
> > Quoting Chad Versace (2017-06-06 13:36:55)
> > > The new function takes a mesa_format and, if the format is an alpha
> > > format with a non-alpha variant, returns the non-alpha format.
> > > Otherwise, it returns the original format.
> > > 
> > > Example:
> > >   input -> output
> > > 
> > >   // Fallback exists
> > >   MESA_FORMAT_R8G8B8X8_UNORM -> MESA_FORMAT_R8G8B8A8_UNORM
> > >   MESA_FORMAT_RGBX_UNORM16 -> MESA_FORMAT_RGBA_UNORM16
> > > 
> > >   // No fallback
> > >   MESA_FORMAT_R8G8B8A8_UNORM -> MESA_FORMAT_R8G8B8A8_UNORM
> > >   MESA_FORMAT_Z_FLOAT32 -> MESA_FORMAT_Z_FLOAT32
> > > 
> > > i965 will use this for EGLImages and DRIimages.
> > > ---
> > >  src/mesa/Android.gen.mk  |  12 +++
> > >  src/mesa/Makefile.am |   7 ++
> > >  src/mesa/Makefile.sources|   2 +
> > >  src/mesa/main/.gitignore |   1 +
> > >  src/mesa/main/format_fallback.h  |  31 +++
> > >  src/mesa/main/format_fallback.py | 180 
> > > +++
> > >  6 files changed, 233 insertions(+)
> > >  create mode 100644 src/mesa/main/format_fallback.h
> > >  create mode 100644 src/mesa/main/format_fallback.py
> 
> [snip]
> 
> > > +def main():
> > > +pargs = parse_args()
> > > +
> > > +formats = {}
> > > +for fmt in format_parser.parse(pargs.csv):
> > > +formats[fmt.name] = fmt
> > 
> > You could simplify this as:
> > formats = {f.name: f for f in format_parser.parse(pargs.csv)}
> 
> Thanks. I'll do that.
> 
> > 
> > > +
> > > +write_preamble(stdout)
> > > +write_func_mesa_format_fallback_rgbx_to_rgba(stdout, formats)
> > 
> > We really shouldn't write to stdout like this, it can cause all kinds of
> > breakages if there's ever a UTF-8 character (say ©) and the terminal doesn't
> > have a unicode locale it'll fail,
> 
> Ugh. I wasn't aware that Python's stdout was broken. Is Python's
> sys.stdout opened in "text" mode, and is that the cause of the
> brokenness?
> 
> Does it still fail if stdout is redirected to a file? Because that's the
> only case that matters here.

It's not Python, it's the shell (I think). In this case it won't be a problem
since you don't have any non-ASCII characters, but we've run into cases where
someone (like me) adds "Copyright © 3001 Mystery Science Theatre" and then
breaks some (but not many) systems. I know this because I added such a copyright
and broke someone's system. Eventually we narrowed it down to the fact that
this person didn't have a UTF-8 locale but I did. We ended up just removing the
© character from the output to fix it.
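
A minimal sketch of the failure mode described above, assuming the output
stream ends up with an ASCII encoding (as it can when the locale is not
UTF-8). The helper names and the sample text are made up for illustration;
they are not part of format_fallback.py. The second helper shows the
suggested alternative of opening the output file with an explicit encoding.

```python
import io

# Sample generated text containing a non-ASCII character (hypothetical).
TEXT = u"/* Copyright \u00a9 Example */\n"


def write_like_ascii_stdout(text):
    """Emulate writing to a stdout whose encoding is ASCII.

    Returns True if the write succeeds, False if encoding fails.
    """
    stream = io.TextIOWrapper(io.BytesIO(), encoding="ascii")
    try:
        stream.write(text)
        stream.flush()
    except UnicodeEncodeError:
        return False
    return True


def write_to_file(path, text):
    """Open the output file explicitly so the locale does not matter."""
    with io.open(path, "w", encoding="utf-8") as f:
        f.write(text)
```

With an explicit encoding the result no longer depends on the environment
of whoever runs the build.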

> 
> > if you just open the file you want (say one
> > passed as an argument) then it doesn't matter what the console supports. We do
> > this all over the place so it's not a blocker for me, but I still think it's a
> > bad idea to write to stdout.
> 
> > If you decide not to change this you at the very least need to call
> > stdout.flush() after write_func_mesa_format_fallback_rgbx_to_rgba.




Re: [Mesa-dev] [PATCH 2/2] mesa: inline update_image_transfer_state() into _mesa_update_pixel()

2017-06-06 Thread Ian Romanick
On 06/06/2017 01:58 PM, Samuel Pitoiset wrote:
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/mesa/main/pixel.c | 19 +--
>  1 file changed, 5 insertions(+), 14 deletions(-)
> 
> diff --git a/src/mesa/main/pixel.c b/src/mesa/main/pixel.c
> index 218e9fdd6b..a3f04d5688 100644
> --- a/src/mesa/main/pixel.c
> +++ b/src/mesa/main/pixel.c
> @@ -598,12 +598,12 @@ _mesa_PixelTransferi( GLenum pname, GLint param )
>  /*State Management*/
>  /**/
>  
> -/*
> - * Return a bitmask of IMAGE_*_BIT flags which to indicate which
> - * pixel transfer operations are enabled.
> +
> +/**
> + * Update mesa pixel transfer derived state to indicate which operations are
> + * enabled.
>   */
> -static void
> -update_image_transfer_state(struct gl_context *ctx)
> +void _mesa_update_pixel( struct gl_context *ctx )

If you update this to

void
_mesa_update_pixel( struct gl_context *ctx )

then the series is

Reviewed-by: Ian Romanick 

This formatting may seem a little weird, but Mesa uses it because it has
a really useful feature.  You can "grep -nr ^function_name" to find
exactly where a function is defined.  For functions with a lot of
callers, this is much easier than wading through a couple pages of grep
output.
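
A small sketch of why that layout helps, written in Python rather than
grep so that it is self-contained. The embedded C snippet and the helper
name find_definitions() are only illustrative. With the function name at
the start of its own line, an anchored search finds the definition and
skips every caller.

```python
import re

# Illustrative C source using the "name at start of line" style.
SOURCE = """\
static void
update_image_transfer_state(struct gl_context *ctx)
{
}

void
_mesa_update_pixel( struct gl_context *ctx )
{
   update_image_transfer_state(ctx);
}

void caller(struct gl_context *ctx) { _mesa_update_pixel(ctx); }
"""


def find_definitions(source, name):
    """Return 1-based line numbers where `name` starts a line.

    This mirrors what "grep -n ^name" reports for a single file.
    """
    pattern = re.compile(r"^" + re.escape(name) + r"\b")
    return [number
            for number, line in enumerate(source.splitlines(), 1)
            if pattern.search(line)]
```

An unanchored search for the same name would also report the call site
on the last line of the snippet.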

>  {
> GLuint mask = 0;
>  
> @@ -623,15 +623,6 @@ update_image_transfer_state(struct gl_context *ctx)
>  }
>  
>  
> -/**
> - * Update mesa pixel transfer derived state.
> - */
> -void _mesa_update_pixel( struct gl_context *ctx )
> -{
> -   update_image_transfer_state(ctx);
> -}
> -
> -
>  /**/
>  /*  Initialization*/
>  /**/
> 



Re: [Mesa-dev] [PATCH mesa] tree-wide: remove trailing backslash

2017-06-06 Thread Ian Romanick
On 06/01/2017 06:48 AM, Eric Engestrom wrote:
> Simple search for a backslash followed by two newlines.
> If one of the newlines were to be removed, this would cause issues, so
> let's just remove these trailing backslashes.
> 
> Signed-off-by: Eric Engestrom 
> ---
> 
> I can split the patch by module if you want, but this seems trivial
> enough?
> A simple ack from a couple people would be enough IMO.

I'm fine with it as one big patch.  All of the changes look correct to me.

Reviewed-by: Ian Romanick 
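
For reference, a rough sketch of the kind of search the patch describes
(a backslash followed by two newlines). This is not the command that was
actually used; the function names are made up, and tolerating blanks
around the backslash is an extra assumption on my part.

```python
import re

# A backslash, optional blanks, a newline, then an empty line.
TRAILING = re.compile(r"\\[ \t]*\n[ \t]*\n")


def find_trailing_backslashes(text):
    """Return 1-based line numbers of backslashes followed by a blank line."""
    return [text.count("\n", 0, match.start()) + 1
            for match in TRAILING.finditer(text)]


def strip_trailing_backslashes(text):
    """Drop such backslashes (and blanks before them), keeping the blank line."""
    return re.sub(r"[ \t]*\\[ \t]*\n([ \t]*\n)", r"\n\1", text)
```

A backslash that continues onto a non-empty line is left alone, so
multi-line macros keep their continuation lines.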

> ---
>  src/amd/common/amd_kernel_code_t.h  | 2 +-
>  src/compiler/nir/nir_builder.h  | 2 +-
>  src/gallium/auxiliary/draw/draw_gs_tmp.h| 2 +-
>  src/gallium/auxiliary/draw/draw_prim_assembler_tmp.h| 2 +-
>  src/gallium/auxiliary/draw/draw_so_emit_tmp.h   | 2 +-
>  src/gallium/drivers/r600/sb/sb_bc.h | 2 +-
>  src/gallium/drivers/svga/svga_winsys.h  | 2 +-
>  src/gallium/drivers/swr/rasterizer/memory/ClearTile.cpp | 4 ++--
>  src/intel/compiler/brw_inst.h   | 2 +-
>  src/mesa/drivers/dri/r200/r200_vertprog.c   | 2 +-
>  src/mesa/drivers/x11/xmesaP.h   | 2 +-
>  src/mesa/math/m_debug_util.h| 2 +-
>  src/mesa/sparc/sparc_matrix.h   | 2 +-
>  src/mesa/x86/mmx_blend.S| 2 +-
>  src/mesa/x86/read_rgba_span_x86.S   | 2 +-
>  15 files changed, 16 insertions(+), 16 deletions(-)
> 
> diff --git a/src/amd/common/amd_kernel_code_t.h 
> b/src/amd/common/amd_kernel_code_t.h
> index d0d7809da1..f8e9508518 100644
> --- a/src/amd/common/amd_kernel_code_t.h
> +++ b/src/amd/common/amd_kernel_code_t.h
> @@ -36,7 +36,7 @@
>  
>  // Gets bits for specified mask from specified src packed instance.
>  #define AMD_HSA_BITS_GET(src, mask)  
>   \
> -  ((src & mask) >> mask ## _SHIFT)   
>   \
> +  ((src & mask) >> mask ## _SHIFT)
>  
>  /* Every amd_*_code_t has the following properties, which are composed of
>   * a number of bit fields. Every bit field has a mask (AMD_CODE_PROPERTY_*),
> diff --git a/src/compiler/nir/nir_builder.h b/src/compiler/nir/nir_builder.h
> index 7dbf8efbb3..7c65886356 100644
> --- a/src/compiler/nir/nir_builder.h
> +++ b/src/compiler/nir/nir_builder.h
> @@ -621,7 +621,7 @@ nir_load_system_value(nir_builder *build, 
> nir_intrinsic_op op, int index)
> nir_load_##name(nir_builder *build)   \
> { \
>return nir_load_system_value(build, nir_intrinsic_load_##name, 0); \
> -   } \
> +   }
>  
>  #include "nir_intrinsics.h"
>  
> diff --git a/src/gallium/auxiliary/draw/draw_gs_tmp.h 
> b/src/gallium/auxiliary/draw/draw_gs_tmp.h
> index b10bbc413d..bf276d3822 100644
> --- a/src/gallium/auxiliary/draw/draw_gs_tmp.h
> +++ b/src/gallium/auxiliary/draw/draw_gs_tmp.h
> @@ -22,7 +22,7 @@
>default:\
>   break;   \
>}   \
> -   } while (0)\
> +   } while (0)
>  
>  #define POINT(i0) gs_point(gs,i0)
>  #define LINE(flags,i0,i1) gs_line(gs,i0,i1)
> diff --git a/src/gallium/auxiliary/draw/draw_prim_assembler_tmp.h 
> b/src/gallium/auxiliary/draw/draw_prim_assembler_tmp.h
> index bff6d556ed..145a8ca74e 100644
> --- a/src/gallium/auxiliary/draw/draw_prim_assembler_tmp.h
> +++ b/src/gallium/auxiliary/draw/draw_prim_assembler_tmp.h
> @@ -19,7 +19,7 @@
>return;   \
> default: \
>break;\
> -   }\
> +   }
>  
>  
>  #define POINT(i0) prim_point(asmblr, i0)
> diff --git a/src/gallium/auxiliary/draw/draw_so_emit_tmp.h 
> b/src/gallium/auxiliary/draw/draw_so_emit_tmp.h
> index 282a52d1c0..c3a4695c1f 100644
> --- a/src/gallium/auxiliary/draw/draw_so_emit_tmp.h
> +++ b/src/gallium/auxiliary/draw/draw_so_emit_tmp.h
> @@ -22,7 +22,7 @@
>default:\
>   break;   \
>}   \
> -   } while (0)\
> +   } while (0)
>  
>  #define POINT(i0)so_point(so,i0)
>  #define 

Re: [Mesa-dev] [PATCH 5/5] r100: Silence numerous unused this or that warnings

2017-06-06 Thread Ian Romanick
On 06/01/2017 09:26 AM, Marek Olšák wrote:
> For the series:
> 
> Reviewed-by: Marek Olšák 
> 
> About your r100-r200 coding style question, I don't have a specific
> answer. It's up to you what you wanna do with it.

Okay.  I think I'm going to just start using the same style as core Mesa
and i965.  That will make things easier for me, I think.


Re: [Mesa-dev] [PATCH 1/6] mesa: Add _mesa_format_fallback_rgba_to_rgbx()

2017-06-06 Thread Chad Versace
On Tue 06 Jun 2017, Dylan Baker wrote:
> Quoting Chad Versace (2017-06-06 13:36:55)
> > The new function takes a mesa_format and, if the format is an alpha
> > format with a non-alpha variant, returns the non-alpha format.
> > Otherwise, it returns the original format.
> > 
> > Example:
> >   input -> output
> > 
> >   // Fallback exists
> >   MESA_FORMAT_R8G8B8X8_UNORM -> MESA_FORMAT_R8G8B8A8_UNORM
> >   MESA_FORMAT_RGBX_UNORM16 -> MESA_FORMAT_RGBA_UNORM16
> > 
> >   // No fallback
> >   MESA_FORMAT_R8G8B8A8_UNORM -> MESA_FORMAT_R8G8B8A8_UNORM
> >   MESA_FORMAT_Z_FLOAT32 -> MESA_FORMAT_Z_FLOAT32
> > 
> > i965 will use this for EGLImages and DRIimages.
> > ---
> >  src/mesa/Android.gen.mk  |  12 +++
> >  src/mesa/Makefile.am |   7 ++
> >  src/mesa/Makefile.sources|   2 +
> >  src/mesa/main/.gitignore |   1 +
> >  src/mesa/main/format_fallback.h  |  31 +++
> >  src/mesa/main/format_fallback.py | 180 
> > +++
> >  6 files changed, 233 insertions(+)
> >  create mode 100644 src/mesa/main/format_fallback.h
> >  create mode 100644 src/mesa/main/format_fallback.py

[snip]

> > +def main():
> > +pargs = parse_args()
> > +
> > +formats = {}
> > +for fmt in format_parser.parse(pargs.csv):
> > +formats[fmt.name] = fmt
> 
> You could simplify this as:
> formats = {f.name: f for f in format_parser.parse(pargs.csv)}

Thanks. I'll do that.
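
A quick sketch showing that the two forms build the same dictionary. The
Format tuple and parse_stub() are hypothetical stand-ins for whatever
format_parser.parse() yields; only the .name attribute matters here.

```python
from collections import namedtuple

# Hypothetical stand-in for the objects yielded by the real parser.
Format = namedtuple("Format", ["name", "layout"])


def parse_stub():
    """Yield a couple of fake format entries."""
    yield Format("MESA_FORMAT_R8G8B8A8_UNORM", "array")
    yield Format("MESA_FORMAT_R8G8B8X8_UNORM", "array")


def build_with_loop(parsed):
    """Original form: fill the dict one entry at a time."""
    formats = {}
    for fmt in parsed:
        formats[fmt.name] = fmt
    return formats


def build_with_comprehension(parsed):
    """Suggested form: a single dict comprehension."""
    return {f.name: f for f in parsed}
```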

> 
> > +
> > +write_preamble(stdout)
> > +write_func_mesa_format_fallback_rgbx_to_rgba(stdout, formats)
> 
> We really shouldn't write to stdout like this, it can cause all kinds of
> breakages if there's ever a UTF-8 character (say ©) and the terminal doesn't
> have a unicode locale it'll fail,

Ugh. I wasn't aware that Python's stdout was broken. Is Python's
sys.stdout opened in "text" mode, and is that the cause of the
brokenness?

Does it still fail if stdout is redirected to a file? Because that's the
only case that matters here.

> if you just open the file you want (say one
> passed as an argument) then it doesn't matter what the console supports. We do
> this all over the place so it's not a blocker for me, but I still think it's a
> bad idea to write to stdout.

> If you decide not to change this you at the very least need to call
> stdout.flush() after write_func_mesa_format_fallback_rgbx_to_rgba.


Re: [Mesa-dev] [PATCH] svga: fix git_sha1.h include path in Android.mk

2017-06-06 Thread Mauro Rossi
And here is the latest (v3) version, tested by building nougat-x86.

From 052df48ae71b82b04ed8f634101d0ec919b497e5 Mon Sep 17 00:00:00 2001
From: Mauro Rossi 
Date: Tue, 6 Jun 2017 23:15:05 +0200
Subject: [PATCH 1/5] svga: fix git_sha1.h include path in Android.mk (v3)

Adds libmesa_git_sha1 static (dummy) library to generate git_sha1.h
with some polishing to header dependency on .git/HEAD and scripted rules.

The now redundant generation rules are removed from Android.gen.mk.
A libmesa_git_sha1 whole static dependency is added to the libmesa_pipe_svga,
libmesa_dricore and libmesa_st_mesa modules.

Fixes the following build error:

external/mesa/src/gallium/drivers/svga/svga_screen.c:26:10:
fatal error: 'git_sha1.h' file not found
 ^
1 error generated.

Fixes: 1ce3a27 ("svga: Add the ability to log messages to
vmware.log on the host.")
---
 src/gallium/drivers/svga/Android.mk  |  2 ++
 src/mesa/Android.gen.mk  | 12 
 src/mesa/Android.libmesa_dricore.mk  |  3 +-
 src/mesa/Android.libmesa_git_sha1.mk | 59 
 src/mesa/Android.libmesa_st_mesa.mk  |  3 +-
 src/mesa/Android.mk  |  1 +
 6 files changed, 66 insertions(+), 14 deletions(-)
 create mode 100644 src/mesa/Android.libmesa_git_sha1.mk

diff --git a/src/gallium/drivers/svga/Android.mk
b/src/gallium/drivers/svga/Android.mk
index c50743d509..9ed837fb22 100644
--- a/src/gallium/drivers/svga/Android.mk
+++ b/src/gallium/drivers/svga/Android.mk
@@ -34,6 +34,8 @@ LOCAL_C_INCLUDES := $(LOCAL_PATH)/include

 LOCAL_MODULE := libmesa_pipe_svga

+LOCAL_STATIC_LIBRARIES += libmesa_git_sha1
+
 include $(GALLIUM_COMMON_MK)
 include $(BUILD_STATIC_LIBRARY)

diff --git a/src/mesa/Android.gen.mk b/src/mesa/Android.gen.mk
index 5f1c7ebaf9..366a6b1036 100644
--- a/src/mesa/Android.gen.mk
+++ b/src/mesa/Android.gen.mk
@@ -53,8 +53,6 @@ LOCAL_C_INCLUDES += $(intermediates)/x86
 endif
 endif

-sources += main/git_sha1.h
-
 sources := $(addprefix $(intermediates)/, $(sources))

 LOCAL_GENERATED_SOURCES += $(sources)
@@ -71,16 +69,6 @@ define es-gen
  $(hide) $(PRIVATE_SCRIPT) $(1) $(PRIVATE_XML) > $@
 endef

-$(intermediates)/main/git_sha1.h:
- @mkdir -p $(dir $@)
- @echo "GIT-SHA1: $(PRIVATE_MODULE) <= git"
- $(hide) touch $@
- $(hide) if which git > /dev/null; then \
- git --git-dir $(PRIVATE_PATH)/../../.git log -n 1 --oneline | \
- sed 's/^\([^ ]*\) .*/#define MESA_GIT_SHA1 "git-\1"/' \
- > $@; \
- fi
-
 matypes_deps := \
  $(BUILD_OUT_EXECUTABLES)/mesa_gen_matypes$(BUILD_EXECUTABLE_SUFFIX) \
  $(LOCAL_PATH)/main/mtypes.h \
diff --git a/src/mesa/Android.libmesa_dricore.mk
b/src/mesa/Android.libmesa_dricore.mk
index 599b9ccd71..c7715a50c9 100644
--- a/src/mesa/Android.libmesa_dricore.mk
+++ b/src/mesa/Android.libmesa_dricore.mk
@@ -65,7 +65,8 @@ LOCAL_GENERATED_SOURCES += \
  $(MESA_GEN_GLSL_H)

 LOCAL_WHOLE_STATIC_LIBRARIES += \
- libmesa_program
+ libmesa_program \
+ libmesa_git_sha1

 include $(LOCAL_PATH)/Android.gen.mk
 include $(MESA_COMMON_MK)
diff --git a/src/mesa/Android.libmesa_git_sha1.mk
b/src/mesa/Android.libmesa_git_sha1.mk
new file mode 100644
index 00..0fd176bf7d
--- /dev/null
+++ b/src/mesa/Android.libmesa_git_sha1.mk
@@ -0,0 +1,59 @@
+# Mesa 3-D graphics library
+#
+# Copyright (C) 2017 Mauro Rossi 
+#
+# Permission is hereby granted, free of charge, to any person obtaining a
+# copy of this software and associated documentation files (the "Software"),
+# to deal in the Software without restriction, including without limitation
+# the rights to use, copy, modify, merge, publish, distribute, sublicense,
+# and/or sell copies of the Software, and to permit persons to whom the
+# Software is furnished to do so, subject to the following conditions:
+#
+# The above copyright notice and this permission notice shall be included
+# in all copies or substantial portions of the Software.
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+# DEALINGS IN THE SOFTWARE.
+
+# --
+# libmesa_git_sha1
+# --
+
+LOCAL_PATH := $(call my-dir)
+
+include $(CLEAR_VARS)
+
+LOCAL_MODULE := libmesa_git_sha1
+
+LOCAL_MODULE_CLASS := STATIC_LIBRARIES
+intermediates := $(call local-generated-sources-dir)
+
+# dummy.c source file is generated to meet the build system's rules.
+LOCAL_GENERATED_SOURCES += $(intermediates)/dummy.c
+
+$(intermediates)/dummy.c:
+ @mkdir -p $(dir $@)
+ @echo "Gen Dummy: $(PRIVATE_MODULE) <= $(notdir $(@))"
+ 

Re: [Mesa-dev] [PATCH 14/30] intel/isl: Add an enum for describing auxiliary compression state

2017-06-06 Thread Jason Ekstrand
On Tue, Jun 6, 2017 at 1:32 PM, Jason Ekstrand  wrote:

> On Tue, Jun 6, 2017 at 1:22 PM, Chad Versace 
> wrote:
>
>> On Fri 26 May 2017, Jason Ekstrand wrote:
>> > This enum describes all of the states that an auxiliary compressed
>> > surface can have.  All of the states as well as normative language for
>> > referring to each of the compression operations is provided in the
>> > truly colossal comment for the new isl_aux_state enum.  There is also
>> > a diagram showing how surfaces move between the different states.
>> > ---
>> >  src/intel/isl/isl.h | 142 ++
>> ++
>> >  1 file changed, 142 insertions(+)
>> >
>> > diff --git a/src/intel/isl/isl.h b/src/intel/isl/isl.h
>> > index b9d8fa8..df6d3e3 100644
>> > --- a/src/intel/isl/isl.h
>> > +++ b/src/intel/isl/isl.h
>> > @@ -560,6 +560,148 @@ enum isl_aux_usage {
>> > ISL_AUX_USAGE_CCS_E,
>> >  };
>> >
>> > +/**
>> > + * Enum for keeping track of the state an auxiliary compressed surface.
>>
>> This is really nice and helpful for everyone.
>>
>> I also learned something new from it: that a resolve on CCS_E also
>> ambiguates the aux surface. Do you have any insight on why the hardware
>> does that?
>>
>> > + *
>> > + * For any given auxiliary surface compression format (HiZ, CCS, or
>> MCS), any
>> > + * given slice (lod + array layer) can be in one of the six states
>> described
>> > + * by this enum.  Draw and resolve operations may cause the slice to
>> change
>> > + * from one state to another.  The six valid states are:
>>
>> I have one suggestion: please carefully distinguish between CCS_D and
>> CCS_E in the documentation. In my experience, muddy thinking where the
>> two are not cleanly distinguished leads to confused minds and confusing
>> code.
>>
>> For someone who already has a firm grasp on aux state, the ambiguous
>> term "CCS" poses no problem. That wise person automatically infers from
>> context if "CCS" applies to CCS_D, to CCS_E, or to both. But for someone
>> whose understanding of aux isn't as solid, the term "CCS" can lead to
>> incorrect inferences.
>>
>> For example, below you say that the partial resolve "operation is only
>> available for CCS". That's misleading. It should say "only available for
>> CCS_E".
>>
>> Another benefit: It becomes possible to document that
>> ISL_AUX_STATE_COMPRESSED_NO_CLEAR is valid only for CCS_E and HIZ, but
>> not valid for CCS_D and MCS.
>>
>
> It is valid for MCS.  If you don't fast-clear but only render, then you're
> in that state.  It's only invalid for CCS_D.
>
>
>> Other than the CCS_D/CCS_E distinction, the patch looks good to me. This
>> is a really nice addition to the driver.
>>
>
> How about a section after the auxiliary compression ops section which goes
> into detail on each of the compression types and discusses which states are
> valid etc.
>

How does this look:

https://cgit.freedesktop.org/~jekstrand/mesa/commit/?h=wip/i965-resolve-rework-v3=8478b102c99e3ec43ec687b3f4e52acb9acbd5ba

I'll squash it in if you like it.


> One more comment at the end...
>>
>> > + *
>> > + *1) Clear:  In this state, each block in the auxiliary surface
>> contains a
>> > + *   magic value that indicates that the block is in the clear
>> state.  If
>> > + *   a block is in the clear state, it's values in the primary
>> surface are
>> > + *   ignored and the color of the samples in the block is taken
>> either the
>> > + *   RENDER_SURFACE_STATE packet for color or 3DSTATE_CLEAR_PARAMS
>> for
>> > + *   depth.  Since neither the primary surface nor the auxiliary
>> surface
>> > + *   contains the clear value, the surface can be cleared to a
>> different
>> > + *   color by simply changing the clear color without modifying
>> either
>> > + *   surface.
>> > + *
>> > + *2) Compressed w/ Clear:  In this state, neither the auxiliary
>> surface
>> > + *   nor the primary surface has a complete representation of the
>> data.
>> > + *   Instead, both surfaces must be used together or else rendering
>> > + *   corruption may occur.  Depending on the auxiliary compression
>> format
>> > + *   and the data, any given block in the primary surface may
>> contain all,
>> > + *   some, or none of the data required to reconstruct the actual
>> sample
>> > + *   values.  Blocks may also be in the clear state (see Clear)
>> and have
>> > + *   their value taken from outside the surface.
>> > + *
>> > + *3) Compressed w/o Clear:  This state is identical to the state
>> above
>> > + *   except that no blocks are in the clear state.  In this state,
>> all of
>> > + *   the data required to reconstruct the final sample values is
>> contained
>> > + *   in the auxiliary and primary surface and the clear value is
>> not
>> > + *   considered.
>> > + *
>> > + *4) Resolved:  In this state, the primary surface contains 100%
>> of the
>> > + 

Re: [Mesa-dev] [PATCH 4/4] nir: add ARB_shader_ballot and ARB_shader_group_vote instructions

2017-06-06 Thread Connor Abbott
On Tue, Jun 6, 2017 at 1:48 PM, Connor Abbott  wrote:
> On Tue, Jun 6, 2017 at 1:45 PM, Jason Ekstrand  wrote:
>>
>>
>> On Mon, Jun 5, 2017 at 9:52 PM, Jason Ekstrand  wrote:
>>>
>>> On Mon, Jun 5, 2017 at 6:37 PM, Connor Abbott  wrote:

 I pushed a v2 at
 https://cgit.freedesktop.org/~cwabbott0/mesa/log/?h=nir-divergence-v2.
 I'm not sure if I like this version better, though. I'll have to think
 about it. In the meantime, feel free to take a look.
>>>
>>>
>>> I've taken a skim through the branch and I agree that I'm not sure either.
>>> Here's a few thoughts in no particular order:
>>>
>>>  1) Other than the fact that it's a pile of churn, it doesn't seem to make
>>> too much difference whether dFdx and dFdy are ALU or intrinsics
>>>
>>>  2) Convergent instructions are, in a lot of ways, easier to deal with
>>> than plain cross-thread ones.  Convergent ops can always be moved up the
>>> dominance tree or down into uniform control-flow.  Regular cross-thread
>>> instructions can't be moved across any non-uniform control-flow.
>>>
>>>  3) dFdx and dFdy are weird because they're convergent so it's clear they
>>> are special but not clear they should be intrinsics instead of ALU
>>>
>>>  4) I like the nir_instr_is_convergent() and nir_instr_is_cross_thread()
>>> helpers
>>>
>>>  5) non-convergent cross-thread instructions should definitely be
>>> intrinsics.
>>>
>>>  6) I think the shader ballot stuff is all non-convergent cross-thread as
>>> are some of the more advanced subgroup operations (see HLSL shader model
>>> 6.0).
>>
>>
>> Having slept on things a bit, I think I've come to the conclusion that
>> leaving dFdx and dFdy as-is should be fine so long as we have the
>> nir_instr_is_convergent() and _is_cross_thread() helpers.  We need to do
>> special casing in those for texture instructions anyway so adding in a quick
>> switch for ALU derivatives isn't bad.  For shader_ballot type instructions,
>> I think they're probably best done as intrinsics for now.  That way the
>> compiler will leave them alone most of the time and only things that
>> actually know what they're doing will ever try to optimize them.
>>
>> --Jason
>
> Ok, that sounds good.

I pushed a nir-divergence-v3 branch which does just that. I'll start
using that as a base for my work on radv.

>
>>
>>>
>>> That's all for now,
>>>
>>> --Jason
>>>

 On Mon, Jun 5, 2017 at 2:43 PM, Jason Ekstrand 
 wrote:
 > On Mon, Jun 5, 2017 at 1:50 PM, Connor Abbott 
 > wrote:
 >>
 >> On Mon, Jun 5, 2017 at 1:37 PM, Jason Ekstrand 
 >> wrote:
 >> > I'm not sure how I feel about having these as ALU operations.  ALU
 >> > operations are generally pure functions (with the exception
 >> > derivative)
 >> > that
 >> > can be re-ordered at will.  I don't really like breaking that.  In
 >> > fact,
 >> > I'd
 >> > almost be inclined to make derivatives intrinsics and just
 >> > special-case
 >> > them
 >> > in constant folding.  Thoughts?
 >>
 >> I wasn't too sure about this either. It is a little weird to make
 >> these ALU instructions. I followed the rule here that if something can
 >> be constant-folded, it should be an ALU instruction, but I guess you
 >> can argue that it's just a coincidence that these can be
 >> constant-folded anyways.
 >
 >
 > Yeah.  As subgroup ops get more complicated, I think a lot of the subgroup
 > operations can be constant-folded after a fashion but the rules get weird
 > fast.
 >
 >>
 >> I guess the main downside is that it would be
 >> impossible to make nir_algebraic patterns with these, although I can't
 >> think of too many simple pattern-matching type things you'd want to do
 >> on these instructions anyways.
 >
 >
 > Yeah.  My gut also tells me that shaders which are "advanced" enough to
 > use
 > subgroup features probably don't need (or it can't be done) the massive
 > reductions we do for D3D9-generated shaders.
 >
 >>
 >> Maybe something like not(any(not(foo)))
 >> -> all(foo) and vice-versa?
 >>
 >> >
 >> > On Mon, Jun 5, 2017 at 12:22 PM, Connor Abbott 
 >> > wrote:
 >> >>
 >> >> Signed-off-by: Connor Abbott 
 >> >> ---
 >> >>  src/compiler/nir/nir_intrinsics.h | 14 ++
 >> >>  src/compiler/nir/nir_opcodes.py   | 18 --
 >> >>  2 files changed, 30 insertions(+), 2 deletions(-)
 >> >>
 >> >> diff --git a/src/compiler/nir/nir_intrinsics.h
 >> >> b/src/compiler/nir/nir_intrinsics.h
 >> >> index 21e7d90..157df7f 100644
 >> >> --- a/src/compiler/nir/nir_intrinsics.h
 >> >> +++ b/src/compiler/nir/nir_intrinsics.h
 >> >> @@ 

Re: [Mesa-dev] [PATCH 2/6] i965: Add a RGBX->RGBA fallback for glEGLImageTextureTarget2D()

2017-06-06 Thread Daniel Stone
Hi Chad,

On 6 June 2017 at 21:36, Chad Versace  wrote:
> @@ -254,8 +255,22 @@ create_mt_for_dri_image(struct brw_context *brw,
> struct gl_context *ctx = &brw->ctx;
> struct intel_mipmap_tree *mt;
> uint32_t draw_x, draw_y;
> +   mesa_format format = image->format;
> +
> +   if (!ctx->TextureFormatSupported[format]) {
> +  /* The texture storage paths in core Mesa detect if the driver does not
> +   * support the user-requested format, and then searches for a
> +   * fallback format. The DRIimage code bypasses core Mesa, though. So we
> +   * do the fallbacks here for important formats.
> +   *
> +   * We must support DRM_FOURCC_XBGR textures because the Android
> +   * framework produces HAL_PIXEL_FORMAT_RGBX winsys surfaces, which
> +   * the Chrome OS compositor consumes as dma_buf EGLImages.
> +   */
> +  format = _mesa_format_fallback_rgbx_to_rgba(format);
> +   }
>
> -   if (!ctx->TextureFormatSupported[image->format])
> +   if (!ctx->TextureFormatSupported[format])
>return NULL;
>
> /* Disable creation of the texture's aux buffers because the driver 
> exposes
> @@ -263,7 +278,7 @@ create_mt_for_dri_image(struct brw_context *brw,
>  * buffer's content to the main buffer nor for invalidating the aux 
> buffer's
>  * content.
>  */
> -   mt = intel_miptree_create_for_bo(brw, image->bo, image->format,
> +   mt = intel_miptree_create_for_bo(brw, image->bo, format,
>  0, image->width, image->height, 1,
>  image->pitch,
>  MIPTREE_LAYOUT_DISABLE_AUX);

I wonder if it wouldn't be better to do this in
intel_create_image_from_name. That way it would be more obvious
up-front what's happening, and also it'd be easy to add the analogue
to intel_create_image_from_fds for the explicit dmabuf (as opposed to
ANativeBuffer; your description of 'from dmabufs' threw me because the
dmabuf fd import path would fail immediately on FourCC lookup) import
path.

As an aside, it's safe to enable in Wayland (IMO anyway), because we
only use the DRM format there; there's no concept of a 'surface
format' or visuals inside the Wayland client EGL platform, just direct
sampling from whichever buffer was last attached. EGL_NATIVE_VISUAL_ID
is empty, because we don't have anything to expose to the client.
Probably famous last words tho.

Cheers,
Daniel


Re: [Mesa-dev] [PATCH v2 52/64] radeonsi: implement ARB_bindless_texture

2017-06-06 Thread Marek Olšák
On Tue, May 30, 2017 at 10:36 PM, Samuel Pitoiset
 wrote:
> This implements the Gallium interface. Decompression of resident
> textures/images will follow in the next patches.
>
> v2: - fix a memleak related to util_copy_image_view()
> - remove "texture" parameter from create_texture_handle()
> - store pipe_sampler_view instead of si_sampler_view
> - make use of pipe_sampler_view_reference() to fix a refcount issue
> - rename si_resident_descriptor to si_bindless_descriptor
> - use util_dynarray_*
> - add more comments
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/gallium/drivers/radeonsi/si_descriptors.c | 249 
> ++
>  src/gallium/drivers/radeonsi/si_pipe.c|  15 ++
>  src/gallium/drivers/radeonsi/si_pipe.h|  20 +++
>  3 files changed, 284 insertions(+)
>
> diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c 
> b/src/gallium/drivers/radeonsi/si_descriptors.c
> index b2e2e4b760..238cca4561 100644
> --- a/src/gallium/drivers/radeonsi/si_descriptors.c
> +++ b/src/gallium/drivers/radeonsi/si_descriptors.c
> @@ -60,6 +60,7 @@
>  #include "sid.h"
>  #include "gfx9d.h"
>
> +#include "util/hash_table.h"
>  #include "util/u_format.h"
>  #include "util/u_memory.h"
>  #include "util/u_upload_mgr.h"
> @@ -2130,6 +2131,248 @@ void si_bindless_descriptor_slab_free(void *priv, 
> struct pb_slab *pslab)
> FREE(slab);
>  }
>
> +static struct si_bindless_descriptor *
> +si_create_bindless_descriptor(struct si_context *sctx, uint32_t *desc_list,
> + unsigned size)
> +{
> +   struct si_screen *sscreen = sctx->screen;
> +   struct si_bindless_descriptor *desc;
> +   struct pb_slab_entry *entry;
> +   void *ptr;
> +
> +   /* Sub-allocate the bindless descriptor from a slab to avoid dealing
> +* with a ton of buffers and for reducing the winsys overhead.
> +*/
> +   entry = pb_slab_alloc(&sctx->bindless_descriptor_slabs, 64, 0);
> +   if (!entry)
> +   return NULL;
> +
> +   desc = NULL;
> +   desc = container_of(entry, desc, entry);
> +
> +   /* Upload the descriptor directly in VRAM. Because the slabs are
> +* currently never reclaimed, we don't need to synchronize the
> +* operation.
> +*/
> +   ptr = sscreen->b.ws->buffer_map(desc->buffer->buf, NULL,
> +   PIPE_TRANSFER_WRITE |
> +   PIPE_TRANSFER_UNSYNCHRONIZED);
> +   util_memcpy_cpu_to_le32(ptr + desc->offset, desc_list, size);
> +   sscreen->b.ws->buffer_unmap(desc->buffer->buf);

I recommend removing this unmap call. Unmapping is costly and creating
a lot of handles can be slow.

> +
> +   return desc;
> +}
> +
> +static uint64_t si_create_texture_handle(struct pipe_context *ctx,
> +struct pipe_sampler_view *view,
> +const struct pipe_sampler_state 
> *state)
> +{
> +   struct si_sampler_view *sview = (struct si_sampler_view *)view;
> +   struct si_context *sctx = (struct si_context *)ctx;
> +   struct si_texture_handle *tex_handle;
> +   struct si_sampler_state *sstate;
> +   uint32_t desc_list[16];
> +   uint64_t handle;
> +
> +   tex_handle = CALLOC_STRUCT(si_texture_handle);
> +   if (!tex_handle)
> +   return 0;
> +
> +   memset(desc_list, 0, sizeof(desc_list));
> +   si_init_descriptor_list(&desc_list[0], 16, 1, null_texture_descriptor);
> +
> +   sstate = ctx->create_sampler_state(ctx, state);
> +   if (!sstate) {
> +   FREE(tex_handle);
> +   return 0;
> +   }
> +
> +   si_set_sampler_view_desc(sctx, sview, sstate, &desc_list[0]);
> +   ctx->delete_sampler_state(ctx, sstate);
> +
> +   tex_handle->desc = si_create_bindless_descriptor(sctx, desc_list,
> +sizeof(desc_list));
> +   if (!tex_handle->desc) {
> +   FREE(tex_handle);
> +   return 0;
> +   }
> +
> +   handle = tex_handle->desc->buffer->gpu_address +
> +tex_handle->desc->offset;
> +
> +   if (!_mesa_hash_table_insert(sctx->tex_handles, (void *)handle,
> +tex_handle)) {
> +   pb_slab_free(&sctx->bindless_descriptor_slabs,
> +&tex_handle->desc->entry);
> +   FREE(tex_handle);
> +   return 0;
> +   }
> +
> +   pipe_sampler_view_reference(&tex_handle->view, view);
> +
> +   return handle;
> +}
> +
> +static void si_delete_texture_handle(struct pipe_context *ctx, uint64_t 
> handle)
> +{
> +   struct si_context *sctx = (struct si_context *)ctx;
> +   struct si_texture_handle *tex_handle;
> +   struct hash_entry *entry;
> +
> +   entry = _mesa_hash_table_search(sctx->tex_handles, (void 

Re: [Mesa-dev] [PATCH v2 51/64] radeonsi: add a slab allocator for bindless descriptors

2017-06-06 Thread Marek Olšák
On Tue, May 30, 2017 at 10:36 PM, Samuel Pitoiset
 wrote:
> For each texture/image handle, we need to allocate a new
> buffer for the bindless descriptor. But when the number of
> buffers added to the current CS becomes high, the overhead
> in the winsys (and in the kernel) becomes significant.
>
> To reduce this bottleneck, the idea is to suballocate the
> bindless descriptors using a slab similar to the one used
> in the winsys.
>
> Currently, a buffer can hold 1024 bindless descriptors but
> this limit is arbitrary and could be changed in the future
> for some reasons. Once a slab is allocated the "base" buffer
> is added to a per-context list.
>
> v2: - rename si_resident_descriptor to si_bindless_descriptor
> - make can_reclaim_slab() always return false
> - use util_dynarray_*
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/gallium/drivers/radeonsi/si_descriptors.c | 84 
> +++
>  src/gallium/drivers/radeonsi/si_pipe.c| 12 
>  src/gallium/drivers/radeonsi/si_pipe.h| 15 +
>  src/gallium/drivers/radeonsi/si_state.h   |  8 +++
>  4 files changed, 119 insertions(+)
>
> diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c 
> b/src/gallium/drivers/radeonsi/si_descriptors.c
> index 3e78dd205b..b2e2e4b760 100644
> --- a/src/gallium/drivers/radeonsi/si_descriptors.c
> +++ b/src/gallium/drivers/radeonsi/si_descriptors.c
> @@ -2046,6 +2046,90 @@ void si_emit_compute_shader_userdata(struct si_context 
> *sctx)
> sctx->shader_pointers_dirty &= ~compute_mask;
>  }
>
> +/* BINDLESS */
> +
> +struct si_bindless_descriptor_slab
> +{
> +   struct pb_slab base;
> +   struct r600_resource *buffer;
> +   struct si_bindless_descriptor *entries;
> +};
> +
> +bool si_bindless_descriptor_can_reclaim_slab(void *priv,
> +struct pb_slab_entry *entry)
> +{
> +   /* Do not allow to reclaim any bindless descriptors for now because 
> the
> +* GPU might be using them. This should be improved later on.
> +*/
> +   return false;
> +}
> +
> +struct pb_slab *si_bindless_descriptor_slab_alloc(void *priv, unsigned heap,
> + unsigned entry_size,
> + unsigned group_index)
> +{
> +   struct si_context *sctx = priv;
> +   struct si_screen *sscreen = sctx->screen;
> +   struct si_bindless_descriptor_slab *slab;
> +
> +   slab = CALLOC_STRUCT(si_bindless_descriptor_slab);
> +   if (!slab)
> +   return NULL;
> +
> +   /* Create a buffer in VRAM for 1024 bindless descriptors. */
> +   slab->buffer = (struct r600_resource *)
> +   pipe_buffer_create(&sscreen->b.b, 0,
> +  PIPE_USAGE_IMMUTABLE, 64 * 1024);

PIPE_USAGE_DEFAULT would be better here. (even though it's the same)
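
For reference, the sizing in the hunk above works out as follows: a descriptor is 16 dwords (64 bytes), so a 64 KiB buffer holds 1024 of them, and a handle is the buffer's GPU address plus the slot offset. A minimal sketch of that arithmetic, assuming a simple bump allocator with no reclaim; the names desc_slab and desc_slab_alloc are hypothetical and are not the pb_slab API:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DESC_SIZE_BYTES   (16 * sizeof(uint32_t))   /* 16 dwords per descriptor */
#define SLAB_SIZE_BYTES   (64 * 1024)               /* size of the backing buffer */
#define DESCS_PER_SLAB    (SLAB_SIZE_BYTES / DESC_SIZE_BYTES)

/* Hypothetical stand-in for one slab: a base GPU address plus a bump index. */
struct desc_slab {
   uint64_t gpu_address;   /* base address of the backing buffer */
   unsigned next_free;     /* index of the next unused descriptor slot */
};

/* Hand out the next slot. Entries are never reclaimed in this sketch. */
static bool
desc_slab_alloc(struct desc_slab *slab, uint64_t *handle)
{
   if (slab->next_free >= DESCS_PER_SLAB)
      return false;   /* slab is full, the caller would allocate a new one */

   *handle = slab->gpu_address + (uint64_t)slab->next_free * DESC_SIZE_BYTES;
   slab->next_free++;
   return true;
}
```

Because slots are only ever handed out and never returned, nothing here needs to synchronize with the GPU, which is the same reason the patch can map the buffer unsynchronized.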

Marek
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/2] mesa: inline update_image_transfer_state() into _mesa_update_pixel()

2017-06-06 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset 
---
 src/mesa/main/pixel.c | 19 +--
 1 file changed, 5 insertions(+), 14 deletions(-)

diff --git a/src/mesa/main/pixel.c b/src/mesa/main/pixel.c
index 218e9fdd6b..a3f04d5688 100644
--- a/src/mesa/main/pixel.c
+++ b/src/mesa/main/pixel.c
@@ -598,12 +598,12 @@ _mesa_PixelTransferi( GLenum pname, GLint param )
 /*State Management*/
 /**/
 
-/*
- * Return a bitmask of IMAGE_*_BIT flags which to indicate which
- * pixel transfer operations are enabled.
+
+/**
+ * Update mesa pixel transfer derived state to indicate which operations are
+ * enabled.
  */
-static void
-update_image_transfer_state(struct gl_context *ctx)
+void _mesa_update_pixel( struct gl_context *ctx )
 {
GLuint mask = 0;
 
@@ -623,15 +623,6 @@ update_image_transfer_state(struct gl_context *ctx)
 }
 
 
-/**
- * Update mesa pixel transfer derived state.
- */
-void _mesa_update_pixel( struct gl_context *ctx )
-{
-   update_image_transfer_state(ctx);
-}
-
-
 /**/
 /*  Initialization*/
 /**/
-- 
2.13.0



[Mesa-dev] [PATCH 1/2] mesa: remove useless check in _mesa_update_pixel()

2017-06-06 Thread Samuel Pitoiset
The only caller is _mesa_update_state_locked() which already
checks if _NEW_PIXEL is set before calling _mesa_update_pixel().

Signed-off-by: Samuel Pitoiset 
---
 src/mesa/main/pixel.c | 5 ++---
 src/mesa/main/pixel.h | 2 +-
 src/mesa/main/state.c | 2 +-
 3 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/src/mesa/main/pixel.c b/src/mesa/main/pixel.c
index 608a545470..218e9fdd6b 100644
--- a/src/mesa/main/pixel.c
+++ b/src/mesa/main/pixel.c
@@ -626,10 +626,9 @@ update_image_transfer_state(struct gl_context *ctx)
 /**
  * Update mesa pixel transfer derived state.
  */
-void _mesa_update_pixel( struct gl_context *ctx, GLuint new_state )
+void _mesa_update_pixel( struct gl_context *ctx )
 {
-   if (new_state & _NEW_PIXEL)
-  update_image_transfer_state(ctx);
+   update_image_transfer_state(ctx);
 }
 
 
diff --git a/src/mesa/main/pixel.h b/src/mesa/main/pixel.h
index fd1782e1bc..17e7376281 100644
--- a/src/mesa/main/pixel.h
+++ b/src/mesa/main/pixel.h
@@ -64,7 +64,7 @@ void GLAPIENTRY
 _mesa_PixelTransferi( GLenum pname, GLint param );
 
 extern void 
-_mesa_update_pixel( struct gl_context *ctx, GLuint newstate );
+_mesa_update_pixel( struct gl_context *ctx );
 
 extern void 
 _mesa_init_pixel( struct gl_context * ctx );
diff --git a/src/mesa/main/state.c b/src/mesa/main/state.c
index 73872b822a..d534f554ba 100644
--- a/src/mesa/main/state.c
+++ b/src/mesa/main/state.c
@@ -382,7 +382,7 @@ _mesa_update_state_locked( struct gl_context *ctx )
   _mesa_update_stencil( ctx );
 
if (new_state & _NEW_PIXEL)
-  _mesa_update_pixel( ctx, new_state );
+  _mesa_update_pixel( ctx );
 
/* ctx->_NeedEyeCoords is now up to date.
 *
-- 
2.13.0
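
A reduced sketch of why the check can go: when the sole caller already tests the dirty bit, the same test inside the callee is always true. The struct and the NEW_PIXEL bit below are simplified stand-ins for illustration, not Mesa's real definitions.

```c
#include <assert.h>

#define NEW_PIXEL (1u << 0)   /* stand-in for _NEW_PIXEL */

struct ctx {
   unsigned new_state;       /* dirty bits accumulated since the last update */
   unsigned pixel_updates;   /* counts how often derived state was rebuilt */
};

/* Callee: no dirty-bit test needed, the caller has already done it. */
static void
update_pixel(struct ctx *c)
{
   c->pixel_updates++;
}

/* Sole caller: guards the call, then clears the dirty bits. */
static void
update_state(struct ctx *c)
{
   if (c->new_state & NEW_PIXEL)
      update_pixel(c);
   c->new_state = 0;
}
```

The guard stays in exactly one place, so dropping the parameter does not change when the derived state is rebuilt.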



[Mesa-dev] [Bug 95346] Stellaris - Black/super dark planets

2017-06-06 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=95346

nikolai.va...@gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |nikolai.va...@gmail.com

-- 
You are receiving this mail because:
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH] svga: fix git_sha1.h include path in Android.mk

2017-06-06 Thread Mauro Rossi
2017-06-05 2:29 GMT+02:00 Emil Velikov :
> On 4 June 2017 at 22:47, Mauro Rossi  wrote:
>> 2017-05-29 14:30 GMT+02:00 Emil Velikov :
>>> On 26 May 2017 at 16:15, Mauro Rossi  wrote:
>>>> Fixes the following building error:
>>>>
>>>> external/mesa/src/gallium/drivers/svga/svga_screen.c:26:10:
>>>> fatal error: 'git_sha1.h' file not found
>>>>  ^
>>>> 1 error generated.
>>> Mauro please add
>>>
>>> Fixes: 1ce3a2723f9 ("svga: Add the ability to log messages to
>>> vmware.log on the host.")
>>>
>>
>> Done in v2
>>
>>>> ---
>>>>  src/gallium/drivers/svga/Android.mk | 4 +++-
>>>>  1 file changed, 3 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/src/gallium/drivers/svga/Android.mk
>>>> b/src/gallium/drivers/svga/Android.mk
>>>> index c50743d509..d19bd59bfe 100644
>>>> --- a/src/gallium/drivers/svga/Android.mk
>>>> +++ b/src/gallium/drivers/svga/Android.mk
>>>> @@ -30,7 +30,9 @@ include $(CLEAR_VARS)
>>>>
>>>>  LOCAL_SRC_FILES := $(C_SOURCES)
>>>>
>>>> -LOCAL_C_INCLUDES := $(LOCAL_PATH)/include
>>>> +LOCAL_C_INCLUDES := \
>>>> +   $(LOCAL_PATH)/include \
>>>> +   $(call generated-sources-dir-for,STATIC_LIBRARIES,libmesa_dricore,,)/main

>>> Haven't looked too closely on the discussion, so pardon if it's
>>> mentioned already.
>>>
>>> Have you considered doing a "dummy" library analogous to libmesa_genxml?
>>> That way one doesn't need to preemptively build libmesa_dricore.
>>>
>>> -Emil
>>
>> Here is v2, compliant to requirements and build tested
>> One line seemed more..short :-)
>> Mauro
>>
> Apologies if I'm getting a bit too nit-picky Mauro.

No problem, I trust your judgement

>
>
>> From 26ea92a07ca410ee9aebb9624399eca2dee49c29 Mon Sep 17 00:00:00 2001
>> From: Mauro Rossi 
>> Date: Sun, 4 Jun 2017 23:24:59 +0200
>> Subject: [PATCH] svga: fix git_sha1.h include path in Android.mk (v2)
>>
>> Adds libmesa_git_sha1 static (dummy) library to generate git_sha1.h
>> Fixes the following building error:
>>
>> external/mesa/src/gallium/drivers/svga/svga_screen.c:26:10:
>> fatal error: 'git_sha1.h' file not found
>>  ^
>> 1 error generated.
>>
>> Fixes: 1ce3a2723f9 ("svga: Add the ability to log messages to
>> vmware.log on the host.")
>> ---
>>  src/gallium/drivers/svga/Android.mk  |  6 +++-
>>  src/mesa/Android.libmesa_git_sha1.mk | 59 
>> 
>>  src/mesa/Android.mk  |  1 +
>>  3 files changed, 65 insertions(+), 1 deletion(-)
>>  create mode 100644 src/mesa/Android.libmesa_git_sha1.mk
>>
>> diff --git a/src/gallium/drivers/svga/Android.mk
>> b/src/gallium/drivers/svga/Android.mk
>> index c50743d509..17d37ed178 100644
>> --- a/src/gallium/drivers/svga/Android.mk
>> +++ b/src/gallium/drivers/svga/Android.mk
>> @@ -30,10 +30,14 @@ include $(CLEAR_VARS)
>>
>>  LOCAL_SRC_FILES := $(C_SOURCES)
>>
>> -LOCAL_C_INCLUDES := $(LOCAL_PATH)/include
>> +LOCAL_C_INCLUDES := \
>> + $(LOCAL_PATH)/include \
>> + $(dir $(MESA_GEN_GIT_SHA1_H))
>>
> If libmesa_git_sha1 exports the path this should not be needed, correct?

Affirmative

>
>>  LOCAL_MODULE := libmesa_pipe_svga
>>
>> +LOCAL_STATIC_LIBRARIES += libmesa_git_sha1
>> +
>>  include $(GALLIUM_COMMON_MK)
>>  include $(BUILD_STATIC_LIBRARY)
>>
>> diff --git a/src/mesa/Android.libmesa_git_sha1.mk
>> b/src/mesa/Android.libmesa_git_sha1.mk
>> new file mode 100644
>> index 00..ea6079e92e
>> --- /dev/null
>> +++ b/src/mesa/Android.libmesa_git_sha1.mk
> Can we move this a level up to be alongside the Automake/SCons equivalents?

I would recommend keeping it at the current top level for Android,
which is src/mesa; all other core libraries are built at that level.

>
>
>> +LOCAL_PATH := $(call my-dir)
>> +
>> +include $(CLEAR_VARS)
>> +
>> +LOCAL_MODULE := libmesa_git_sha1
>> +
>> +LOCAL_MODULE_CLASS := STATIC_LIBRARIES
>> +intermediates := $(call local-generated-sources-dir)
>> +
>> +# dummy.c source file is generated to meet the build system's rules.
>> +LOCAL_GENERATED_SOURCES += $(intermediates)/dummy.c
>> +
>> +$(intermediates)/dummy.c:
>> + @mkdir -p $(dir $@)
>> + @echo "Gen Dummy: $(PRIVATE_MODULE) <= $(notdir $(@))"
>> + $(hide) touch $@
>> +
>> +LOCAL_GENERATED_SOURCES += $(addprefix $(intermediates)/, git_sha1.h)
>> +
>> +$(intermediates)/git_sha1.h: $(wildcard $(MESA_TOP)/.git/logs/HEAD)
>> + @mkdir -p $(dir $@)
>> + @echo "GIT-SHA1: $(PRIVATE_MODULE) <= git"
>> + $(hide) touch $@
>> + $(hide) if which git > /dev/null; then \
>> + git --git-dir $(MESA_TOP)/.git log -n 1 --oneline | \
>> + sed 's/^\([^ ]*\) .*/#define MESA_GIT_SHA1 "git-\1"/' \
>> + > $@; \
>> + fi
>> +
> A [nearly] identical hunk in src/mesa/Android.gen.mk can do, now can't it?
> The following one-liner might be needed in
>> +LOCAL_STATIC_LIBRARIES += libmesa_git_sha1
>
>> +LOCAL_STATIC_LIBRARIES += libmesa_git_sha1
>
> LOCAL_STATIC_LIBRARIES += libmesa_git_sha1
>

Sending a v3 which following 

Re: [Mesa-dev] [PATCH 1/6] mesa: Add _mesa_format_fallback_rgba_to_rgbx()

2017-06-06 Thread Dylan Baker
Quoting Chad Versace (2017-06-06 13:36:55)
> The new function takes a mesa_format and, if the format is an RGBX
> format with a corresponding RGBA variant, returns the RGBA format.
> Otherwise, it returns the original format.
> 
> Example:
>   input -> output
> 
>   // Fallback exists
>   MESA_FORMAT_R8G8B8X8_UNORM -> MESA_FORMAT_R8G8B8A8_UNORM
>   MESA_FORMAT_RGBX_UNORM16 -> MESA_FORMAT_RGBA_UNORM16
> 
>   // No fallback
>   MESA_FORMAT_R8G8B8A8_UNORM -> MESA_FORMAT_R8G8B8A8_UNORM
>   MESA_FORMAT_Z_FLOAT32 -> MESA_FORMAT_Z_FLOAT32
> 
> i965 will use this for EGLImages and DRIimages.
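
Going by the examples above, the mapping goes from an X-channel format to its alpha counterpart, and every other format passes through unchanged. A sketch of that behaviour, assuming a small hypothetical enum standing in for mesa_format (in the patch the real table is generated by format_fallback.py from formats.csv, so nothing below is the actual generated code):

```c
#include <assert.h>

/* Hypothetical subset of formats; the names mirror the examples only. */
enum fmt {
   FMT_R8G8B8X8_UNORM,
   FMT_R8G8B8A8_UNORM,
   FMT_RGBX_UNORM16,
   FMT_RGBA_UNORM16,
   FMT_Z_FLOAT32,
};

/* Return the alpha variant of an X-channel format, else the input unchanged. */
static enum fmt
fmt_fallback(enum fmt format)
{
   switch (format) {
   case FMT_R8G8B8X8_UNORM:
      return FMT_R8G8B8A8_UNORM;
   case FMT_RGBX_UNORM16:
      return FMT_RGBA_UNORM16;
   default:
      return format;   /* no fallback exists */
   }
}
```

Returning the input for formats without a fallback means a caller can apply the function unconditionally and compare the result against the input to see whether a substitution happened.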
> ---
>  src/mesa/Android.gen.mk  |  12 +++
>  src/mesa/Makefile.am |   7 ++
>  src/mesa/Makefile.sources|   2 +
>  src/mesa/main/.gitignore |   1 +
>  src/mesa/main/format_fallback.h  |  31 +++
>  src/mesa/main/format_fallback.py | 180 
> +++
>  6 files changed, 233 insertions(+)
>  create mode 100644 src/mesa/main/format_fallback.h
>  create mode 100644 src/mesa/main/format_fallback.py
> 
> diff --git a/src/mesa/Android.gen.mk b/src/mesa/Android.gen.mk
> index 42d4ba1969..8dcfb460e9 100644
> --- a/src/mesa/Android.gen.mk
> +++ b/src/mesa/Android.gen.mk
> @@ -34,6 +34,7 @@ sources := \
> main/enums.c \
> main/api_exec.c \
> main/dispatch.h \
> +   main/format_fallback.c \
> main/format_pack.c \
> main/format_unpack.c \
> main/format_info.h \
> @@ -135,6 +136,17 @@ $(intermediates)/main/get_hash.h: 
> $(glapi)/gl_and_es_API.xml \
> $(LOCAL_PATH)/main/get_hash_params.py $(GET_HASH_GEN)
> $(call es-gen)
>  
> +FORMAT_FALLBACK := $(LOCAL_PATH)/main/format_fallback.py
> +format_fallback_deps := \
> +   $(LOCAL_PATH)/main/formats.csv \
> +   $(LOCAL_PATH)/main/format_parser.py \
> +   $(FORMAT_FALLBACK)
> +
> +$(intermediates)/main/format_fallback.c: PRIVATE_SCRIPT := $(MESA_PYTHON2) 
> $(FORMAT_FALLBACK)
> +$(intermediates)/main/format_fallback.c: PRIVATE_XML :=
> +$(intermediates)/main/format_fallback.c: $(format_fallback_deps)
> +   $(call es-gen, $<)
> +
>  FORMAT_INFO := $(LOCAL_PATH)/main/format_info.py
>  format_info_deps := \
> $(LOCAL_PATH)/main/formats.csv \
> diff --git a/src/mesa/Makefile.am b/src/mesa/Makefile.am
> index 53f311d2a9..2c39374aef 100644
> --- a/src/mesa/Makefile.am
> +++ b/src/mesa/Makefile.am
> @@ -37,6 +37,7 @@ include Makefile.sources
>  
>  EXTRA_DIST = \
> drivers/SConscript \
> +   main/format_fallback.py \
> main/format_info.py \
> main/format_pack.py \
> main/format_parser.py \
> @@ -54,6 +55,7 @@ EXTRA_DIST = \
>  
>  BUILT_SOURCES = \
> main/get_hash.h \
> +   main/format_fallback.c \
> main/format_info.h \
> main/format_pack.c \
> main/format_unpack.c \
> @@ -70,6 +72,11 @@ main/get_hash.h: ../mapi/glapi/gen/gl_and_es_API.xml 
> main/get_hash_params.py \
> $(PYTHON_GEN) $(srcdir)/main/get_hash_generator.py \
> -f $(srcdir)/../mapi/glapi/gen/gl_and_es_API.xml > $@
>  
> +main/format_fallback.c: main/format_fallback.py \
> +main/format_parser.py \
> +   main/formats.csv
> +   $(PYTHON_GEN) $(srcdir)/main/format_fallback.py 
> $(srcdir)/main/formats.csv > $@
> +
>  main/format_info.h: main/formats.csv \
>  main/format_parser.py main/format_info.py
> $(PYTHON_GEN) $(srcdir)/main/format_info.py 
> $(srcdir)/main/formats.csv > $@
> diff --git a/src/mesa/Makefile.sources b/src/mesa/Makefile.sources
> index 8a65fbe663..642100b956 100644
> --- a/src/mesa/Makefile.sources
> +++ b/src/mesa/Makefile.sources
> @@ -94,6 +94,8 @@ MAIN_FILES = \
> main/ffvertex_prog.h \
> main/fog.c \
> main/fog.h \
> +   main/format_fallback.h \
> +   main/format_fallback.c \
> main/format_info.h \
> main/format_pack.h \
> main/format_pack.c \
> diff --git a/src/mesa/main/.gitignore b/src/mesa/main/.gitignore
> index 836d8f104a..8cc33cfee6 100644
> --- a/src/mesa/main/.gitignore
> +++ b/src/mesa/main/.gitignore
> @@ -4,6 +4,7 @@ enums.c
>  remap_helper.h
>  get_hash.h
>  get_hash.h.tmp
> +format_fallback.c
>  format_info.h
>  format_info.c
>  format_pack.c
> diff --git a/src/mesa/main/format_fallback.h b/src/mesa/main/format_fallback.h
> new file mode 100644
> index 00..5ca8269b2f
> --- /dev/null
> +++ b/src/mesa/main/format_fallback.h
> @@ -0,0 +1,31 @@
> +/*
> + * Copyright 2017 Google
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, 

Re: [Mesa-dev] [PATCH 4/4] nir: add ARB_shader_ballot and ARB_shader_group_vote instructions

2017-06-06 Thread Connor Abbott
On Tue, Jun 6, 2017 at 1:45 PM, Jason Ekstrand  wrote:
>
>
> On Mon, Jun 5, 2017 at 9:52 PM, Jason Ekstrand  wrote:
>>
>> On Mon, Jun 5, 2017 at 6:37 PM, Connor Abbott  wrote:
>>>
>>> I pushed a v2 at
>>> https://cgit.freedesktop.org/~cwabbott0/mesa/log/?h=nir-divergence-v2.
>>> I'm not sure if I like this version better, though. I'll have to think
>>> about it. In the meantime, feel free to take a look.
>>
>>
>> I've taken a skim through the branch and I agree that I'm not sure either.
>> Here's a few thoughts in no particular order:
>>
>>  1) Other than the fact that it's a pile of churn, it doesn't seem to make
>> too much difference whether dFdx and dFdy are ALU or intrinsics
>>
>>  2) Convergent instructions are, in a lot of ways, easier to deal with
>> than plain cross-thread ones.  Convergent ops can always be moved up the
>> dominance tree or down into uniform control-flow.  Regular cross-thread
>> instructions can't be moved across any non-uniform control-flow.
>>
>>  3) dFdx and dFdy are weird because they're convergent so it's clear they
>> are special but not clear they should be intrinsics instead of ALU
>>
>>  4) I like the nir_instr_is_convergent() and nir_instr_is_cross_thread()
>> helpers
>>
>>  5) non-convergent cross-thread instructions should definitely be
>> intrinsics.
>>
>>  6) I think the shader ballot stuff is all non-convergent cross-thread as
>> are some of the more advanced subgroup operations (see HLSL shader model
>> 6.0).
>
>
> Having slept on things a bit, I think I've come to the conclusion that
> leaving dFdx and dFdy as-is should be fine so long as we have the
> nir_instr_is_convergent() and _is_cross_thread() helpers.  We need to do
> special casing in those for texture instructions anyway so adding in a quick
> switch for ALU derivatives isn't bad.  For shader_ballot type instructions,
> I think they're probably best done as intrinsics for now.  That way the
> compiler will leave them alone most of the time and only things that
> actually know what they're doing will ever try to optimize them.
>
> --Jason

Ok, that sounds good.

>
>>
>> That's all for now,
>>
>> --Jason
>>
>>>
>>> On Mon, Jun 5, 2017 at 2:43 PM, Jason Ekstrand 
>>> wrote:
>>> > On Mon, Jun 5, 2017 at 1:50 PM, Connor Abbott 
>>> > wrote:
>>> >>
>>> >> On Mon, Jun 5, 2017 at 1:37 PM, Jason Ekstrand 
>>> >> wrote:
>>> >> > I'm not sure how I feel about having these as ALU operations.  ALU
>>> >> > operations are generally pure functions (with the exception of
>>> >> > derivatives) that can be re-ordered at will.  I don't really like
>>> >> > breaking that.  In fact, I'd almost be inclined to make derivatives
>>> >> > intrinsics and just special-case them in constant folding.  Thoughts?
>>> >>
>>> >> I wasn't too sure about this either. It is a little weird to make
>>> >> these ALU instructions. I followed the rule here that if something can
>>> >> be constant-folded, it should be an ALU instruction, but I guess you
>>> >> can argue that it's just a coincidence that these can be
>>> >> constant-folded anyways.
>>> >
>>> >
>>> > Yeah.  As subgroup ops get more complicated, I think a lot of the
>>> > subgroup operations can be constant-folded after a fashion but the
>>> > rules get weird fast.
>>> >
>>> >>
>>> >> I guess the main downside is that it would be
>>> >> impossible to make nir_algebraic patterns with these, although I can't
>>> >> think of too many simple pattern-matching type things you'd want to do
>>> >> on these instructions anyways.
>>> >
>>> >
>>> > Yeah.  My gut also tells me that shaders which are "advanced" enough to
>>> > use subgroup features probably don't need (or it can't be done) the
>>> > massive reductions we do for D3D9-generated shaders.
>>> >
>>> >>
>>> >> Maybe something like not(any(not(foo)))
>>> >> -> all(foo) and vice-versa?
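
The rewrite quoted above is De Morgan's law applied across the lanes of a subgroup. A small sketch that models a subgroup as a bitmask and writes the identity out literally; vote_any, vote_all and the 'active' mask are illustrative stand-ins, not NIR opcodes:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Model a subgroup as a bitmask: bit i holds lane i's boolean value.
 * 'active' marks the lanes that take part in the vote. */
static bool
vote_any(uint32_t values, uint32_t active)
{
   return (values & active) != 0;
}

static bool
vote_all(uint32_t values, uint32_t active)
{
   return (values & active) == active;
}

/* not(any(not(foo))), written out literally. */
static bool
not_any_not(uint32_t values, uint32_t active)
{
   return !vote_any(~values, active);
}
```

In this model not_any_not() and vote_all() agree for every combination of values and active lanes, including the case where no lane is active.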
>>> >>
>>> >> >
>>> >> > On Mon, Jun 5, 2017 at 12:22 PM, Connor Abbott 
>>> >> > wrote:
>>> >> >>
>>> >> >> Signed-off-by: Connor Abbott 
>>> >> >> ---
>>> >> >>  src/compiler/nir/nir_intrinsics.h | 14 ++
>>> >> >>  src/compiler/nir/nir_opcodes.py   | 18 --
>>> >> >>  2 files changed, 30 insertions(+), 2 deletions(-)
>>> >> >>
>>> >> >> diff --git a/src/compiler/nir/nir_intrinsics.h
>>> >> >> b/src/compiler/nir/nir_intrinsics.h
>>> >> >> index 21e7d90..157df7f 100644
>>> >> >> --- a/src/compiler/nir/nir_intrinsics.h
>>> >> >> +++ b/src/compiler/nir/nir_intrinsics.h
>>> >> >> @@ -330,6 +330,20 @@ SYSTEM_VALUE(channel_num, 1, 0, xx, xx, xx)
>>> >> >>  SYSTEM_VALUE(alpha_ref_float, 1, 0, xx, xx, xx)
>>> >> >>  SYSTEM_VALUE(layer_id, 1, 0, xx, xx, xx)
>>> >> >>  SYSTEM_VALUE(view_index, 1, 0, xx, xx, xx)
>>> >> >> +SYSTEM_VALUE(subgroup_invocation, 1, 0, xx, xx, xx)
>>> >> >> +
>>> >> >> +

Re: [Mesa-dev] [PATCH 4/4] nir: add ARB_shader_ballot and ARB_shader_group_vote instructions

2017-06-06 Thread Jason Ekstrand
On Mon, Jun 5, 2017 at 9:52 PM, Jason Ekstrand  wrote:

> On Mon, Jun 5, 2017 at 6:37 PM, Connor Abbott  wrote:
>
>> I pushed a v2 at
>> https://cgit.freedesktop.org/~cwabbott0/mesa/log/?h=nir-divergence-v2.
>> I'm not sure if I like this version better, though. I'll have to think
>> about it. In the meantime, feel free to take a look.
>>
>
> I've taken a skim through the branch and I agree that I'm not sure
> either.  Here's a few thoughts in no particular order:
>
>  1) Other than the fact that it's a pile of churn, it doesn't seem to make
> too much difference whether dFdx and dFdy are ALU or intrinsics
>
>  2) Convergent instructions are, in a lot of ways, easier to deal with
> than plain cross-thread ones.  Convergent ops can always be moved up the
> dominance tree or down into uniform control-flow.  Regular cross-thread
> instructions can't be moved across any non-uniform control-flow.
>
>  3) dFdx and dFdy are weird because they're convergent so it's clear they
> are special but not clear they should be intrinsics instead of ALU
>
>  4) I like the nir_instr_is_convergent() and nir_instr_is_cross_thread()
> helpers
>
>  5) non-convergent cross-thread instructions should definitely be
> intrinsics.
>
>  6) I think the shader ballot stuff is all non-convergent cross-thread as
> are some of the more advanced subgroup operations (see HLSL shader model
> 6.0).
>

Having slept on things a bit, I think I've come to the conclusion that
leaving dFdx and dFdy as-is should be fine so long as we have the
nir_instr_is_convergent() and _is_cross_thread() helpers.  We need to do
special casing in those for texture instructions anyway so adding in a
quick switch for ALU derivatives isn't bad.  For shader_ballot type
instructions, I think they're probably best done as intrinsics for now.
That way the compiler will leave them alone most of the time and only
things that actually know what they're doing will ever try to optimize them.

--Jason
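
To make the special-casing concrete, here is a rough sketch of what such helpers could look like. The instruction kinds are hypothetical and heavily simplified, not NIR's real types, and the sketch assumes that convergent instructions are a subset of cross-thread ones:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, heavily simplified instruction kinds; not NIR's real types. */
enum instr_kind {
   INSTR_ALU_PLAIN,         /* e.g. fadd: pure, per-thread */
   INSTR_ALU_DERIVATIVE,    /* dFdx/dFdy kept as ALU ops */
   INSTR_TEX_IMPLICIT_LOD,  /* texture op with implicit derivatives */
   INSTR_INTRINSIC_BALLOT,  /* shader_ballot-style intrinsic */
};

/* Reads values from other threads, so it cannot be treated as a pure op. */
static bool
instr_is_cross_thread(enum instr_kind kind)
{
   switch (kind) {
   case INSTR_ALU_DERIVATIVE:
   case INSTR_TEX_IMPLICIT_LOD:
   case INSTR_INTRINSIC_BALLOT:
      return true;
   default:
      return false;
   }
}

/* Cross-thread, but may still be moved up the dominance tree or down
 * into uniform control flow. */
static bool
instr_is_convergent(enum instr_kind kind)
{
   switch (kind) {
   case INSTR_ALU_DERIVATIVE:
   case INSTR_TEX_IMPLICIT_LOD:
      return true;
   default:
      return false;
   }
}
```

A pass that wants to move or fold an instruction would ask these two questions first and leave anything cross-thread but non-convergent where it is.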


> That's all for now,
>
> --Jason
>
>
>> On Mon, Jun 5, 2017 at 2:43 PM, Jason Ekstrand 
>> wrote:
>> > On Mon, Jun 5, 2017 at 1:50 PM, Connor Abbott 
>> wrote:
>> >>
>> >> On Mon, Jun 5, 2017 at 1:37 PM, Jason Ekstrand 
>> >> wrote:
>> >> > I'm not sure how I feel about having these as ALU operations.  ALU
>> >> > operations are generally pure functions (with the exception of
>> >> > derivatives) that can be re-ordered at will.  I don't really like
>> >> > breaking that.  In fact, I'd almost be inclined to make derivatives
>> >> > intrinsics and just special-case them in constant folding.  Thoughts?
>> >>
>> >> I wasn't too sure about this either. It is a little weird to make
>> >> these ALU instructions. I followed the rule here that if something can
>> >> be constant-folded, it should be an ALU instruction, but I guess you
>> >> can argue that it's just a coincidence that these can be
>> >> constant-folded anyways.
>> >
>> >
>> > Yeah.  As subgroup ops get more complicated, I think a lot of the
>> > subgroup operations can be constant-folded after a fashion but the
>> > rules get weird fast.
>> >
>> >>
>> >> I guess the main downside is that it would be
>> >> impossible to make nir_algebraic patterns with these, although I can't
>> >> think of too many simple pattern-matching type things you'd want to do
>> >> on these instructions anyways.
>> >
>> >
>> > Yeah.  My gut also tells me that shaders which are "advanced" enough to
>> > use subgroup features probably don't need (or it can't be done) the
>> > massive reductions we do for D3D9-generated shaders.
>> >
>> >>
>> >> Maybe something like not(any(not(foo)))
>> >> -> all(foo) and vice-versa?
>> >>
>> >> >
>> >> > On Mon, Jun 5, 2017 at 12:22 PM, Connor Abbott 
>> >> > wrote:
>> >> >>
>> >> >> Signed-off-by: Connor Abbott 
>> >> >> ---
>> >> >>  src/compiler/nir/nir_intrinsics.h | 14 ++
>> >> >>  src/compiler/nir/nir_opcodes.py   | 18 --
>> >> >>  2 files changed, 30 insertions(+), 2 deletions(-)
>> >> >>
>> >> >> diff --git a/src/compiler/nir/nir_intrinsics.h
>> >> >> b/src/compiler/nir/nir_intrinsics.h
>> >> >> index 21e7d90..157df7f 100644
>> >> >> --- a/src/compiler/nir/nir_intrinsics.h
>> >> >> +++ b/src/compiler/nir/nir_intrinsics.h
>> >> >> @@ -330,6 +330,20 @@ SYSTEM_VALUE(channel_num, 1, 0, xx, xx, xx)
>> >> >>  SYSTEM_VALUE(alpha_ref_float, 1, 0, xx, xx, xx)
>> >> >>  SYSTEM_VALUE(layer_id, 1, 0, xx, xx, xx)
>> >> >>  SYSTEM_VALUE(view_index, 1, 0, xx, xx, xx)
>> >> >> +SYSTEM_VALUE(subgroup_invocation, 1, 0, xx, xx, xx)
>> >> >> +
>> >> >> +
>> >> >> +/* ARB_shader_ballot instructions */
>> >> >> +
>> >> >> +SYSTEM_VALUE(subgroup_eq_mask, 1, 0, xx, xx, xx)
>> >> >> +SYSTEM_VALUE(subgroup_ge_mask, 1, 0, xx, xx, xx)
>> >> >> +SYSTEM_VALUE(subgroup_gt_mask, 1, 0, xx, xx, xx)
>> >> >> 

[Mesa-dev] [PATCH 0/6] i965: Add RGBX, RGBA configs, even on gen9

2017-06-06 Thread Chad Versace
More patches to break your formats... again ;)

The Android framework requires support for EGLConfigs with
HAL_PIXEL_FORMAT_RGBX_ and HAL_PIXEL_FORMAT_RGBA_. This prevents
Chrome OS from updating its Android drivers, because earlier this year
Intel disabled all rgbx formats for gen >=9 in brw_surface_formats.c.
This patch series safely (hopefully?) fixes that problem.

If you want the meat, read patches 2 and 6.

Chad Versace (6):
  mesa: Add _mesa_format_fallback_rgba_to_rgbx()
  i965: Add a RGBX->RGBA fallback for glEGLImageTextureTarget2D()
  i965: Rename some vague format members of brw_context
  i965/dri: Add intel_screen param to intel_create_winsys_renderbuffer
  i965: Move brw_context format arrays to intel_screen
  i965/dri: Support R8G8B8A8 and R8G8B8X8 configs

 src/mesa/Android.gen.mk  |  12 ++
 src/mesa/Makefile.am |   7 +
 src/mesa/Makefile.sources|   2 +
 src/mesa/drivers/dri/i965/brw_blorp.c|  10 +-
 src/mesa/drivers/dri/i965/brw_context.h  |   5 +-
 src/mesa/drivers/dri/i965/brw_meta_util.c|   2 +-
 src/mesa/drivers/dri/i965/brw_surface_formats.c  |  94 +++-
 src/mesa/drivers/dri/i965/brw_tex_layout.c   |   2 +-
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c |  10 +-
 src/mesa/drivers/dri/i965/intel_fbo.c|  34 -
 src/mesa/drivers/dri/i965/intel_fbo.h|   6 +-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c|   4 +-
 src/mesa/drivers/dri/i965/intel_screen.c |  39 -
 src/mesa/drivers/dri/i965/intel_screen.h |   5 +
 src/mesa/drivers/dri/i965/intel_tex.c|   2 +-
 src/mesa/drivers/dri/i965/intel_tex_image.c  |  19 ++-
 src/mesa/main/.gitignore |   1 +
 src/mesa/main/format_fallback.h  |  31 
 src/mesa/main/format_fallback.py | 180 +++
 19 files changed, 392 insertions(+), 73 deletions(-)
 create mode 100644 src/mesa/main/format_fallback.h
 create mode 100644 src/mesa/main/format_fallback.py

-- 
2.13.0



[Mesa-dev] [PATCH 3/6] i965: Rename some vague format members of brw_context

2017-06-06 Thread Chad Versace
I'm swimming in a vortex of formats. Mesa formats, isl formats, DRI
formats, GL formats, etc.

It's easy to misinterpret the following brw_context members unless
you've recently read their definition.  In upcoming patches, I change
them from embedded arrays to simple pointers; after that, even their
definition doesn't help, because the MESA_FORMAT_COUNT hint will no
longer be present.

Rename them to prevent further confusion. While we're renaming, choose
shorter names too.

-format_supported_as_render_target
+mesa_format_supports_render

-render_target_format
+mesa_to_isl_render_format
---
 src/mesa/drivers/dri/i965/brw_blorp.c| 10 +-
 src/mesa/drivers/dri/i965/brw_context.h  |  4 ++--
 src/mesa/drivers/dri/i965/brw_meta_util.c|  2 +-
 src/mesa/drivers/dri/i965/brw_surface_formats.c  | 20 ++--
 src/mesa/drivers/dri/i965/brw_tex_layout.c   |  2 +-
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 10 +-
 src/mesa/drivers/dri/i965/intel_fbo.c|  2 +-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c|  4 ++--
 src/mesa/drivers/dri/i965/intel_tex.c|  2 +-
 9 files changed, 28 insertions(+), 28 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c 
b/src/mesa/drivers/dri/i965/brw_blorp.c
index 28be620429..34f6bc4c84 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.c
+++ b/src/mesa/drivers/dri/i965/brw_blorp.c
@@ -243,8 +243,8 @@ brw_blorp_to_isl_format(struct brw_context *brw, 
mesa_format format,
   return ISL_FORMAT_R16_UNORM;
default: {
   if (is_render_target) {
- assert(brw->format_supported_as_render_target[format]);
- return brw->render_target_format[format];
+ assert(brw->mesa_format_supports_render[format]);
+ return brw->mesa_to_isl_render_format[format];
   } else {
  return brw_isl_format_for_mesa_format(format);
   }
@@ -607,7 +607,7 @@ brw_blorp_copytexsubimage(struct brw_context *brw,
_mesa_get_format_base_format(dst_mt->format) == GL_DEPTH_STENCIL)
   return false;
 
-   if (!brw->format_supported_as_render_target[dst_image->TexFormat])
+   if (!brw->mesa_format_supports_render[dst_image->TexFormat])
   return false;
 
/* Source clipping shouldn't be necessary, since copytexsubimage (in
@@ -858,7 +858,7 @@ do_single_blorp_clear(struct brw_context *brw, struct 
gl_framebuffer *fb,
   struct blorp_batch batch;
   blorp_batch_init(&brw->blorp, &batch, brw, 0);
   blorp_fast_clear(&batch, &surf,
-   brw->render_target_format[format],
+   brw->mesa_to_isl_render_format[format],
level, logical_layer, num_layers,
x0, y0, x1, y1);
   blorp_batch_finish(&batch);
@@ -884,7 +884,7 @@ do_single_blorp_clear(struct brw_context *brw, struct 
gl_framebuffer *fb,
   struct blorp_batch batch;
   blorp_batch_init(&brw->blorp, &batch, brw, 0);
   blorp_clear(&batch, &surf,
-  brw->render_target_format[format],
+  brw->mesa_to_isl_render_format[format],
   ISL_SWIZZLE_IDENTITY,
   level, irb_logical_mt_layer(irb), num_layers,
   x0, y0, x1, y1,
diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index c15abe1d48..17a76f0808 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -1161,8 +1161,8 @@ struct brw_context
const struct brw_tracked_state render_atoms[76];
const struct brw_tracked_state compute_atoms[11];
 
-   enum isl_format render_target_format[MESA_FORMAT_COUNT];
-   bool format_supported_as_render_target[MESA_FORMAT_COUNT];
+   enum isl_format mesa_to_isl_render_format[MESA_FORMAT_COUNT];
+   bool mesa_format_supports_render[MESA_FORMAT_COUNT];
 
/* PrimitiveRestart */
struct {
diff --git a/src/mesa/drivers/dri/i965/brw_meta_util.c 
b/src/mesa/drivers/dri/i965/brw_meta_util.c
index cbc2dedde8..0342a527d7 100644
--- a/src/mesa/drivers/dri/i965/brw_meta_util.c
+++ b/src/mesa/drivers/dri/i965/brw_meta_util.c
@@ -289,7 +289,7 @@ brw_is_color_fast_clear_compatible(struct brw_context *brw,
 */
if (brw->gen >= 9 &&
brw_isl_format_for_mesa_format(mt->format) !=
-   brw->render_target_format[mt->format])
+   brw->mesa_to_isl_render_format[mt->format])
   return false;
 
/* Gen9 doesn't support fast clear on single-sampled SRGB buffers. When
diff --git a/src/mesa/drivers/dri/i965/brw_surface_formats.c 
b/src/mesa/drivers/dri/i965/brw_surface_formats.c
index f878317e92..c33cafa836 100644
--- a/src/mesa/drivers/dri/i965/brw_surface_formats.c
+++ b/src/mesa/drivers/dri/i965/brw_surface_formats.c
@@ -301,21 +301,21 @@ brw_init_surface_formats(struct brw_context *brw)
*/
   if (isl_format_supports_rendering(devinfo, render) &&
   (isl_format_supports_alpha_blending(devinfo, render) || is_integer)) 
{
-

[Mesa-dev] [PATCH 2/6] i965: Add a RGBX->RGBA fallback for glEGLImageTextureTarget2D()

2017-06-06 Thread Chad Versace
This enables support for importing RGBX EGLImage textures on
Skylake.

Chrome OS needs support for RGBX EGLImage textures because
the Android framework produces HAL_PIXEL_FORMAT_RGBX winsys
surfaces, which the Chrome OS compositor consumes as dma_bufs.  On
hardware for which RGBX is unsupported or disabled, normally core Mesa
provides the RGBX->RGBA fallback during glTexStorage.  But the DRIimage
code bypasses core Mesa, so we must do the fallback in i965.
---
 src/mesa/drivers/dri/i965/intel_tex_image.c | 19 +--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_tex_image.c 
b/src/mesa/drivers/dri/i965/intel_tex_image.c
index 649b3907d1..92c6c15c72 100644
--- a/src/mesa/drivers/dri/i965/intel_tex_image.c
+++ b/src/mesa/drivers/dri/i965/intel_tex_image.c
@@ -5,6 +5,7 @@
 #include "main/bufferobj.h"
 #include "main/context.h"
 #include "main/formats.h"
+#include "main/format_fallback.h"
 #include "main/glformats.h"
 #include "main/image.h"
 #include "main/pbo.h"
@@ -254,8 +255,22 @@ create_mt_for_dri_image(struct brw_context *brw,
struct gl_context *ctx = >ctx;
struct intel_mipmap_tree *mt;
uint32_t draw_x, draw_y;
+   mesa_format format = image->format;
+
+   if (!ctx->TextureFormatSupported[format]) {
+  /* The texture storage paths in core Mesa detect if the driver does not
+   * support the user-requested format, and then searches for a
+   * fallback format. The DRIimage code bypasses core Mesa, though. So we
+   * do the fallbacks here for important formats.
+   *
+   * We must support DRM_FOURCC_XBGR textures because the Android
+   * framework produces HAL_PIXEL_FORMAT_RGBX winsys surfaces, which
+   * the Chrome OS compositor consumes as dma_buf EGLImages.
+   */
+  format = _mesa_format_fallback_rgbx_to_rgba(format);
+   }
 
-   if (!ctx->TextureFormatSupported[image->format])
+   if (!ctx->TextureFormatSupported[format])
   return NULL;
 
/* Disable creation of the texture's aux buffers because the driver exposes
@@ -263,7 +278,7 @@ create_mt_for_dri_image(struct brw_context *brw,
 * buffer's content to the main buffer nor for invalidating the aux buffer's
 * content.
 */
-   mt = intel_miptree_create_for_bo(brw, image->bo, image->format,
+   mt = intel_miptree_create_for_bo(brw, image->bo, format,
 0, image->width, image->height, 1,
 image->pitch,
 MIPTREE_LAYOUT_DISABLE_AUX);
-- 
2.13.0
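
For readers skimming the thread, the control flow this patch adds to create_mt_for_dri_image() can be sketched roughly as below. This is only an illustration under stated assumptions: the sketch_* names and the two-entry format enum are hypothetical stand-ins, not Mesa's mesa_format or ctx->TextureFormatSupported.

```c
#include <stdbool.h>

/* Hypothetical two-format universe, standing in for mesa_format. */
enum sketch_format {
   SKETCH_FORMAT_R8G8B8X8_UNORM,
   SKETCH_FORMAT_R8G8B8A8_UNORM,
   SKETCH_FORMAT_COUNT,
};

/* Hypothetical stand-in for the RGBX->RGBA fallback helper. */
static enum sketch_format
sketch_fallback_rgbx_to_rgba(enum sketch_format format)
{
   return format == SKETCH_FORMAT_R8G8B8X8_UNORM ?
          SKETCH_FORMAT_R8G8B8A8_UNORM : format;
}

/* Try the requested format first, then its RGBA fallback.  Returns false
 * when neither is supported, mirroring the "return NULL" path in the patch.
 */
static bool
sketch_pick_import_format(const bool supported[SKETCH_FORMAT_COUNT],
                          enum sketch_format requested,
                          enum sketch_format *chosen)
{
   enum sketch_format format = requested;

   if (!supported[format])
      format = sketch_fallback_rgbx_to_rgba(format);

   if (!supported[format])
      return false;

   *chosen = format;
   return true;
}
```

With a support table in which only the RGBA entry is true, a request for the RGBX format resolves to RGBA; with both entries false the import fails.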



[Mesa-dev] [PATCH 5/6] i965: Move brw_context format arrays to intel_screen

2017-06-06 Thread Chad Versace
This allows us to query the driver's supported formats in i965's DRI code,
where often there is available a DRIscreen but no GL context.

To reduce diff noise, this patch does not completely remove
brw_context's format arrays. It just redeclares them as pointers which
point to the arrays in intel_screen.

Specifically, move these two arrays from brw_context to intel_screen:
mesa_to_isl_render_format[]
mesa_format_supports_render[]

And add a new array to intel_screen,
mesa_format_supports_texture[]
which brw_init_surface_formats() copies to ctx->TextureFormatSupported.
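
A minimal sketch of the ownership change described above, assuming simplified stand-in types (sketch_screen, sketch_context and the four-entry format count are hypothetical, not the real intel_screen or brw_context): the screen owns the table, and the context keeps only a const pointer into it.

```c
#include <stdbool.h>
#include <string.h>

enum { SKETCH_FORMAT_COUNT = 4 };   /* hypothetical, not MESA_FORMAT_COUNT */

/* The screen owns the format table and fills it once. */
struct sketch_screen {
   bool format_supports_render[SKETCH_FORMAT_COUNT];
};

/* The context no longer has its own copy, only a read-only view. */
struct sketch_context {
   struct sketch_screen *screen;
   const bool *format_supports_render;
};

static void
sketch_screen_init_formats(struct sketch_screen *screen)
{
   memset(screen->format_supports_render, 0,
          sizeof(screen->format_supports_render));
   screen->format_supports_render[0] = true;   /* arbitrary example entry */
}

static void
sketch_context_init(struct sketch_context *ctx, struct sketch_screen *screen)
{
   ctx->screen = screen;
   /* Existing ctx->format_supports_render[f] call sites keep working. */
   ctx->format_supports_render = screen->format_supports_render;
}
```

Code that only has a screen can then index the table directly, while context code keeps its existing array-style accesses.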
---
 src/mesa/drivers/dri/i965/brw_context.h |  5 +-
 src/mesa/drivers/dri/i965/brw_surface_formats.c | 92 +++--
 src/mesa/drivers/dri/i965/intel_screen.c|  2 +
 src/mesa/drivers/dri/i965/intel_screen.h|  5 ++
 4 files changed, 64 insertions(+), 40 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index 17a76f0808..476981bfad 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -1161,8 +1161,8 @@ struct brw_context
const struct brw_tracked_state render_atoms[76];
const struct brw_tracked_state compute_atoms[11];
 
-   enum isl_format mesa_to_isl_render_format[MESA_FORMAT_COUNT];
-   bool mesa_format_supports_render[MESA_FORMAT_COUNT];
+   const enum isl_format *mesa_to_isl_render_format;
+   const bool *mesa_format_supports_render;
 
/* PrimitiveRestart */
struct {
@@ -1419,6 +1419,7 @@ void brw_upload_image_surfaces(struct brw_context *brw,
struct brw_stage_prog_data *prog_data);
 
 /* brw_surface_formats.c */
+void intel_screen_init_surface_formats(struct intel_screen *screen);
 void brw_init_surface_formats(struct brw_context *brw);
 bool brw_render_target_supported(struct brw_context *brw,
  struct gl_renderbuffer *rb);
diff --git a/src/mesa/drivers/dri/i965/brw_surface_formats.c 
b/src/mesa/drivers/dri/i965/brw_surface_formats.c
index c33cafa836..a2bc1ded6d 100644
--- a/src/mesa/drivers/dri/i965/brw_surface_formats.c
+++ b/src/mesa/drivers/dri/i965/brw_surface_formats.c
@@ -203,17 +203,16 @@ brw_isl_format_for_mesa_format(mesa_format mesa_format)
 }
 
 void
-brw_init_surface_formats(struct brw_context *brw)
+intel_screen_init_surface_formats(struct intel_screen *screen)
 {
-   const struct gen_device_info *devinfo = >screen->devinfo;
-   struct gl_context *ctx = >ctx;
-   int gen;
+   const struct gen_device_info *devinfo = >devinfo;
mesa_format format;
 
-   memset(>TextureFormatSupported, 0, 
sizeof(ctx->TextureFormatSupported));
+   memset(>mesa_format_supports_texture, 0,
+  sizeof(screen->mesa_format_supports_texture));
 
-   gen = brw->gen * 10;
-   if (brw->is_g4x || brw->is_haswell)
+   int gen = devinfo->gen * 10;
+   if (devinfo->is_g4x || devinfo->is_haswell)
   gen += 5;
 
for (format = MESA_FORMAT_NONE + 1; format < MESA_FORMAT_COUNT; format++) {
@@ -237,7 +236,7 @@ brw_init_surface_formats(struct brw_context *brw)
 
   if (isl_format_supports_sampling(devinfo, texture) &&
   (isl_format_supports_filtering(devinfo, texture) || is_integer))
-ctx->TextureFormatSupported[format] = true;
+screen->mesa_format_supports_texture[format] = true;
 
   /* Re-map some render target formats to make them supported when they
* wouldn't be using their format for texturing.
@@ -301,30 +300,30 @@ brw_init_surface_formats(struct brw_context *brw)
*/
   if (isl_format_supports_rendering(devinfo, render) &&
   (isl_format_supports_alpha_blending(devinfo, render) || is_integer)) 
{
-brw->mesa_to_isl_render_format[format] = render;
-brw->mesa_format_supports_render[format] = true;
+screen->mesa_to_isl_render_format[format] = render;
+screen->mesa_format_supports_render[format] = true;
   }
}
 
/* We will check this table for FBO completeness, but the surface format
 * table above only covered color rendering.
 */
-   brw->mesa_format_supports_render[MESA_FORMAT_Z24_UNORM_S8_UINT] = true;
-   brw->mesa_format_supports_render[MESA_FORMAT_Z24_UNORM_X8_UINT] = true;
-   brw->mesa_format_supports_render[MESA_FORMAT_S_UINT8] = true;
-   brw->mesa_format_supports_render[MESA_FORMAT_Z_FLOAT32] = true;
-   brw->mesa_format_supports_render[MESA_FORMAT_Z32_FLOAT_S8X24_UINT] = true;
-   if (brw->gen >= 8)
-  brw->mesa_format_supports_render[MESA_FORMAT_Z_UNORM16] = true;
+   screen->mesa_format_supports_render[MESA_FORMAT_Z24_UNORM_S8_UINT] = true;
+   screen->mesa_format_supports_render[MESA_FORMAT_Z24_UNORM_X8_UINT] = true;
+   screen->mesa_format_supports_render[MESA_FORMAT_S_UINT8] = true;
+   screen->mesa_format_supports_render[MESA_FORMAT_Z_FLOAT32] = true;
+   screen->mesa_format_supports_render[MESA_FORMAT_Z32_FLOAT_S8X24_UINT] = 
true;
+   if (gen >= 80)
+  

[Mesa-dev] [PATCH 6/6] i965/dri: Support R8G8B8A8 and R8G8B8X8 configs

2017-06-06 Thread Chad Versace
The Android framework requires support for EGLConfigs with
HAL_PIXEL_FORMAT_RGBX_ and HAL_PIXEL_FORMAT_RGBA_.

Even though all RGBX formats are disabled on gen9 by
brw_surface_formats.c, the new configs work correctly on Broxton thanks
to _mesa_format_fallback_rgbx_to_rgba().

On GLX, this creates no new configs, and therefore breaks no existing
apps. See in-patch comments for explanation. I tested with glxinfo and
glxgears on Skylake.

On Wayland, this also creates no new configs, and therefore breaks no
existing apps. (I tested with mesa-demos' eglinfo and es2gears_wayland
on Skylake). The reason differs from GLX, though. In
dri2_wl_add_configs_for_visual(), the format table contains only
B8G8R8X8, B8G8R8A8, and B5G6R5; and dri2_add_config() correctly matches
EGLConfig to format by inspecting channel masks.

On Android, in Chrome OS, I tested this on a Broxton device. I confirmed
that the Google Play Store's EGLSurface used HAL_PIXEL_FORMAT_RGBA_,
and that an Asteroid game's EGLSurface used HAL_PIXEL_FORMAT_RGBX_.
Both apps worked well. (Disclaimer: I didn't test this patch on Android
with Mesa master. I backported this patch series to an older Android
branch).
---
 src/mesa/drivers/dri/i965/intel_fbo.c| 24 ++--
 src/mesa/drivers/dri/i965/intel_screen.c | 23 ++-
 2 files changed, 44 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_fbo.c 
b/src/mesa/drivers/dri/i965/intel_fbo.c
index 16d1325736..e56d30a2c0 100644
--- a/src/mesa/drivers/dri/i965/intel_fbo.c
+++ b/src/mesa/drivers/dri/i965/intel_fbo.c
@@ -34,6 +34,7 @@
 #include "main/teximage.h"
 #include "main/image.h"
 #include "main/condrender.h"
+#include "main/format_fallback.h"
 #include "util/hash_table.h"
 #include "util/set.h"
 
@@ -450,10 +451,29 @@ intel_create_winsys_renderbuffer(struct intel_screen 
*screen,
 
_mesa_init_renderbuffer(rb, 0);
rb->ClassID = INTEL_RB_CLASS;
+   rb->NumSamples = num_samples;
+
+   /* The base format and internal format must be derived from the user-visible
+* format (that is, the gl_config's format), even if we internally
+* choose a different format for the renderbuffer. Otherwise, rendering may
+* use incorrect channel write masks.
+*/
rb->_BaseFormat = _mesa_get_format_base_format(format);
-   rb->Format = format;
rb->InternalFormat = rb->_BaseFormat;
-   rb->NumSamples = num_samples;
+
+   rb->Format = format;
+   if (!screen->mesa_format_supports_render[rb->Format]) {
+  /* The glRenderbufferStorage paths in core Mesa detect if the driver
+   * does not support the user-requested format, and then searches for
+   * a fallback format. The DRI code bypasses core Mesa, though. So we do
+   * the fallbacks here.
+   *
+   * We must support MESA_FORMAT_R8G8B8X8 on Android because the Android
+   * framework requires HAL_PIXEL_FORMAT_RGBX winsys surfaces.
+   */
+  rb->Format = _mesa_format_fallback_rgbx_to_rgba(rb->Format);
+  assert(screen->mesa_format_supports_render[rb->Format]);
+   }
 
/* intel-specific methods */
rb->Delete = intel_delete_renderbuffer;
diff --git a/src/mesa/drivers/dri/i965/intel_screen.c 
b/src/mesa/drivers/dri/i965/intel_screen.c
index 563065b91f..a9d132f868 100644
--- a/src/mesa/drivers/dri/i965/intel_screen.c
+++ b/src/mesa/drivers/dri/i965/intel_screen.c
@@ -1547,7 +1547,28 @@ intel_screen_make_configs(__DRIscreen *dri_screen)
static const mesa_format formats[] = {
   MESA_FORMAT_B5G6R5_UNORM,
   MESA_FORMAT_B8G8R8A8_UNORM,
-  MESA_FORMAT_B8G8R8X8_UNORM
+  MESA_FORMAT_B8G8R8X8_UNORM,
+
+  /* The 32-bit RGBA format must not precede the 32-bit BGRA format.
+   * Likewise for RGBX and BGRX.  Otherwise, the GLX client and the GLX
+   * server may disagree on which format the GLXFBConfig represents,
+   * resulting in swapped color channels.
+   *
+   * The problem, as of 2017-05-30:
+   * When matching a GLXFBConfig to a __DRIconfig, GLX ignores the channel
+   * order and chooses the first __DRIconfig with the expected channel
+   * sizes. Specifically, GLX compares the GLXFBConfig's and __DRIconfig's
+   * __DRI_ATTRIB_{CHANNEL}_SIZE but ignores __DRI_ATTRIB_{CHANNEL}_MASK.
+   *
+   * EGL does not suffer from this problem. It correctly compares the
+   * channel masks when matching EGLConfig to __DRIconfig.
+   */
+
+  /* Required by Android, for HAL_PIXEL_FORMAT_RGBA_. */
+  MESA_FORMAT_R8G8B8A8_UNORM,
+
+  /* Required by Android, for HAL_PIXEL_FORMAT_RGBX_. */
+  MESA_FORMAT_R8G8B8X8_UNORM,
};
 
/* GLX_SWAP_COPY_OML is not supported due to page flipping. */
-- 
2.13.0
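
To illustrate the ordering constraint explained in the intel_screen_make_configs() comment, here is a rough sketch of the two matching strategies. Everything in it is an assumption made for illustration: the sketch_config struct, the two-entry table, and the mask values are hypothetical and do not reproduce the GLX or EGL loader code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reduction of a __DRIconfig to the fields that matter here. */
struct sketch_config {
   unsigned red_size, green_size, blue_size, alpha_size;
   uint32_t red_mask, blue_mask;
};

/* Driver order as in the patch: BGRA first, RGBA after it. */
static const struct sketch_config sketch_driver_configs[] = {
   { 8, 8, 8, 8, 0x00ff0000, 0x000000ff },   /* B8G8R8A8 (illustrative masks) */
   { 8, 8, 8, 8, 0x000000ff, 0x00ff0000 },   /* R8G8B8A8 (illustrative masks) */
};

static bool
sketch_same_sizes(const struct sketch_config *a, const struct sketch_config *b)
{
   return a->red_size == b->red_size && a->green_size == b->green_size &&
          a->blue_size == b->blue_size && a->alpha_size == b->alpha_size;
}

/* GLX-like matching: first config with the expected channel sizes. */
static int
sketch_match_sizes_only(const struct sketch_config *configs, size_t count,
                        const struct sketch_config *want)
{
   for (size_t i = 0; i < count; i++) {
      if (sketch_same_sizes(&configs[i], want))
         return (int)i;
   }
   return -1;
}

/* EGL-like matching: channel sizes and channel masks must both agree. */
static int
sketch_match_sizes_and_masks(const struct sketch_config *configs, size_t count,
                             const struct sketch_config *want)
{
   for (size_t i = 0; i < count; i++) {
      if (sketch_same_sizes(&configs[i], want) &&
          configs[i].red_mask == want->red_mask &&
          configs[i].blue_mask == want->blue_mask)
         return (int)i;
   }
   return -1;
}
```

In this sketch a size-only lookup for the RGBA entry lands on index 0 (the BGRA config), which is why appending the RGBA formats after the BGRA ones leaves the GLX-visible result unchanged, while the mask-aware lookup finds index 1.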



Re: [Mesa-dev] [PATCH 13/30] i965: Combine render target resolve code

2017-06-06 Thread Jason Ekstrand
On Tue, Jun 6, 2017 at 12:54 PM, Chad Versace 
wrote:

> There's a patch on your branch I didn't see on mesa-dev.
>Subject: i965: Be a bit more conservative about certain resolves
> It has my r-b.
>
> I have comments on this patch...
>
> On Fri 26 May 2017, Jason Ekstrand wrote:
> > We have two different bits of resolve code for render targets: one in
> > brw_draw where it's always been and one in brw_context to deal with sRGB
> > on gen9.  Let's pull them together.
> > ---
> >  src/mesa/drivers/dri/i965/brw_context.c | 47
> -
> >  src/mesa/drivers/dri/i965/brw_draw.c| 34 
> >  2 files changed, 29 insertions(+), 52 deletions(-)
>
>
>
> > +  /* For layered rendering non-compressed fast cleared buffers need
> to be
> > +   * resolved. Surface state can carry only one fast color clear
> value
> > +   * while each layer may have its own fast clear color value. For
> > +   * compressed buffers color value is available in the color
> buffer.
> > +   */
> > +  if (irb->layer_count > 1 &&
> > +  !(irb->mt->aux_disable & INTEL_AUX_DISABLE_CCS) &&
> > +  !intel_miptree_is_lossless_compressed(brw, mt)) {
>
> This condition smells bad. It smells like a shot in the dark. It smells
> like a haphazard guess. "We haven't permanently disabled CCS for this
> miptree. And it lacks CCS_E. So, well, it probably has CCS_D, I guess.".
>
> I would much rather see the condition with something more certain.
> Something like:
>
> if (irb->layer_count > 1 &&
> intel_miptree_has_css_d_in_layer_range(brw, mt, irb->mt_level,
> irb->mt_layer, irb->layer_count))
>
> Anyway, this patch is a good cleanup, and functional changes like I'm
> requesting don't belong in a refactoring patch like this one.
>

Yes, I'd like to get that cleaned up.  I think the right thing to do is
actually to check for whether or not mt->format is_ccs_e_compatible with
the actual format that will be used for rendering.  If it is, then we can
just fix up the clear color.  If not, we need a full resolve.


> Reviewed-by: Chad Versace 
>
> > + assert(brw->gen >= 8);
> > +
> > + intel_miptree_resolve_color(brw, mt, irb->mt_level, 1,
> > + irb->mt_layer, irb->layer_count,
> 0);
> > +  }
> > }
>


[Mesa-dev] [PATCH 4/6] i965/dri: Add intel_screen param to intel_create_winsys_renderbuffer

2017-06-06 Thread Chad Versace
The param is currently unused. It will later be used to support
R8G8B8X8 EGLConfigs on Skylake.
---
 src/mesa/drivers/dri/i965/intel_fbo.c|  8 +---
 src/mesa/drivers/dri/i965/intel_fbo.h|  6 --
 src/mesa/drivers/dri/i965/intel_screen.c | 14 --
 3 files changed, 17 insertions(+), 11 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_fbo.c 
b/src/mesa/drivers/dri/i965/intel_fbo.c
index 88e2fc7bf1..16d1325736 100644
--- a/src/mesa/drivers/dri/i965/intel_fbo.c
+++ b/src/mesa/drivers/dri/i965/intel_fbo.c
@@ -438,7 +438,8 @@ intel_nop_alloc_storage(struct gl_context * ctx, struct 
gl_renderbuffer *rb,
  * \param num_samples must be quantized.
  */
 struct intel_renderbuffer *
-intel_create_winsys_renderbuffer(mesa_format format, unsigned num_samples)
+intel_create_winsys_renderbuffer(struct intel_screen *screen,
+ mesa_format format, unsigned num_samples)
 {
struct intel_renderbuffer *irb = CALLOC_STRUCT(intel_renderbuffer);
if (!irb)
@@ -470,11 +471,12 @@ intel_create_winsys_renderbuffer(mesa_format format, 
unsigned num_samples)
  * \param num_samples must be quantized.
  */
 struct intel_renderbuffer *
-intel_create_private_renderbuffer(mesa_format format, unsigned num_samples)
+intel_create_private_renderbuffer(struct intel_screen *screen,
+  mesa_format format, unsigned num_samples)
 {
struct intel_renderbuffer *irb;
 
-   irb = intel_create_winsys_renderbuffer(format, num_samples);
+   irb = intel_create_winsys_renderbuffer(screen, format, num_samples);
irb->Base.Base.AllocStorage = intel_alloc_private_renderbuffer_storage;
 
return irb;
diff --git a/src/mesa/drivers/dri/i965/intel_fbo.h 
b/src/mesa/drivers/dri/i965/intel_fbo.h
index 2d2ef1ebc6..752e4f36a8 100644
--- a/src/mesa/drivers/dri/i965/intel_fbo.h
+++ b/src/mesa/drivers/dri/i965/intel_fbo.h
@@ -167,10 +167,12 @@ intel_rb_format(const struct intel_renderbuffer *rb)
 }
 
 extern struct intel_renderbuffer *
-intel_create_winsys_renderbuffer(mesa_format format, unsigned num_samples);
+intel_create_winsys_renderbuffer(struct intel_screen *screen,
+ mesa_format format, unsigned num_samples);
 
 struct intel_renderbuffer *
-intel_create_private_renderbuffer(mesa_format format, unsigned num_samples);
+intel_create_private_renderbuffer(struct intel_screen *screen,
+  mesa_format format, unsigned num_samples);
 
 struct gl_renderbuffer*
 intel_create_wrapped_renderbuffer(struct gl_context * ctx,
diff --git a/src/mesa/drivers/dri/i965/intel_screen.c 
b/src/mesa/drivers/dri/i965/intel_screen.c
index 22f6d9af03..7733d91fea 100644
--- a/src/mesa/drivers/dri/i965/intel_screen.c
+++ b/src/mesa/drivers/dri/i965/intel_screen.c
@@ -1200,11 +1200,11 @@ intelCreateBuffer(__DRIscreen *dri_screen,
}
 
/* setup the hardware-based renderbuffers */
-   rb = intel_create_winsys_renderbuffer(rgbFormat, num_samples);
+   rb = intel_create_winsys_renderbuffer(screen, rgbFormat, num_samples);
_mesa_attach_and_own_rb(fb, BUFFER_FRONT_LEFT, >Base.Base);
 
if (mesaVis->doubleBufferMode) {
-  rb = intel_create_winsys_renderbuffer(rgbFormat, num_samples);
+  rb = intel_create_winsys_renderbuffer(screen, rgbFormat, num_samples);
   _mesa_attach_and_own_rb(fb, BUFFER_BACK_LEFT, >Base.Base);
}
 
@@ -1217,10 +1217,11 @@ intelCreateBuffer(__DRIscreen *dri_screen,
   assert(mesaVis->stencilBits == 8);
 
   if (screen->devinfo.has_hiz_and_separate_stencil) {
- rb = intel_create_private_renderbuffer(MESA_FORMAT_Z24_UNORM_X8_UINT,
+ rb = intel_create_private_renderbuffer(screen,
+MESA_FORMAT_Z24_UNORM_X8_UINT,
 num_samples);
  _mesa_attach_and_own_rb(fb, BUFFER_DEPTH, >Base.Base);
- rb = intel_create_private_renderbuffer(MESA_FORMAT_S_UINT8,
+ rb = intel_create_private_renderbuffer(screen, MESA_FORMAT_S_UINT8,
 num_samples);
  _mesa_attach_and_own_rb(fb, BUFFER_STENCIL, >Base.Base);
   } else {
@@ -1228,7 +1229,8 @@ intelCreateBuffer(__DRIscreen *dri_screen,
   * Use combined depth/stencil. Note that the renderbuffer is
   * attached to two attachment points.
   */
- rb = intel_create_private_renderbuffer(MESA_FORMAT_Z24_UNORM_S8_UINT,
+ rb = intel_create_private_renderbuffer(screen,
+MESA_FORMAT_Z24_UNORM_S8_UINT,
 num_samples);
  _mesa_attach_and_own_rb(fb, BUFFER_DEPTH, >Base.Base);
  _mesa_attach_and_reference_rb(fb, BUFFER_STENCIL, >Base.Base);
@@ -1236,7 +1238,7 @@ intelCreateBuffer(__DRIscreen *dri_screen,
}
else if (mesaVis->depthBits == 16) {
   assert(mesaVis->stencilBits == 0);
-  rb = 

[Mesa-dev] [PATCH 1/6] mesa: Add _mesa_format_fallback_rgbx_to_rgba()

2017-06-06 Thread Chad Versace
The new function takes a mesa_format and, if the format is a non-alpha
(RGBX) format with an alpha variant, returns the alpha (RGBA) format.
Otherwise, it returns the original format.

Example:
  input -> output

  // Fallback exists
  MESA_FORMAT_R8G8B8X8_UNORM -> MESA_FORMAT_R8G8B8A8_UNORM
  MESA_FORMAT_RGBX_UNORM16 -> MESA_FORMAT_RGBA_UNORM16

  // No fallback
  MESA_FORMAT_R8G8B8A8_UNORM -> MESA_FORMAT_R8G8B8A8_UNORM
  MESA_FORMAT_Z_FLOAT32 -> MESA_FORMAT_Z_FLOAT32

i965 will use this for EGLImages and DRIimages.
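
The mapping in the example table could be written by hand roughly as follows. This is a sketch only: the sketch_* enum is a hypothetical stand-in covering just the formats named above, whereas the patch generates the real function from formats.csv via format_fallback.py.

```c
/* Hypothetical stand-in for mesa_format, limited to the formats that
 * appear in the example table.
 */
enum sketch_format {
   SKETCH_FORMAT_R8G8B8X8_UNORM,
   SKETCH_FORMAT_R8G8B8A8_UNORM,
   SKETCH_FORMAT_RGBX_UNORM16,
   SKETCH_FORMAT_RGBA_UNORM16,
   SKETCH_FORMAT_Z_FLOAT32,
};

/* Return the alpha variant of an RGBX format.  Any format without such a
 * variant, including formats that already have alpha, is returned as-is.
 */
static enum sketch_format
sketch_format_fallback_rgbx_to_rgba(enum sketch_format format)
{
   switch (format) {
   case SKETCH_FORMAT_R8G8B8X8_UNORM:
      return SKETCH_FORMAT_R8G8B8A8_UNORM;
   case SKETCH_FORMAT_RGBX_UNORM16:
      return SKETCH_FORMAT_RGBA_UNORM16;
   default:
      return format;
   }
}
```

Because unmatched inputs come back unchanged, a caller can apply the function unconditionally and then re-check format support on the result.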
---
 src/mesa/Android.gen.mk  |  12 +++
 src/mesa/Makefile.am |   7 ++
 src/mesa/Makefile.sources|   2 +
 src/mesa/main/.gitignore |   1 +
 src/mesa/main/format_fallback.h  |  31 +++
 src/mesa/main/format_fallback.py | 180 +++
 6 files changed, 233 insertions(+)
 create mode 100644 src/mesa/main/format_fallback.h
 create mode 100644 src/mesa/main/format_fallback.py

diff --git a/src/mesa/Android.gen.mk b/src/mesa/Android.gen.mk
index 42d4ba1969..8dcfb460e9 100644
--- a/src/mesa/Android.gen.mk
+++ b/src/mesa/Android.gen.mk
@@ -34,6 +34,7 @@ sources := \
main/enums.c \
main/api_exec.c \
main/dispatch.h \
+   main/format_fallback.c \
main/format_pack.c \
main/format_unpack.c \
main/format_info.h \
@@ -135,6 +136,17 @@ $(intermediates)/main/get_hash.h: 
$(glapi)/gl_and_es_API.xml \
$(LOCAL_PATH)/main/get_hash_params.py $(GET_HASH_GEN)
$(call es-gen)
 
+FORMAT_FALLBACK := $(LOCAL_PATH)/main/format_fallback.py
+format_fallback_deps := \
+   $(LOCAL_PATH)/main/formats.csv \
+   $(LOCAL_PATH)/main/format_parser.py \
+   $(FORMAT_FALLBACK)
+
+$(intermediates)/main/format_fallback.c: PRIVATE_SCRIPT := $(MESA_PYTHON2) 
$(FORMAT_FALLBACK)
+$(intermediates)/main/format_fallback.c: PRIVATE_XML :=
+$(intermediates)/main/format_fallback.c: $(format_fallback_deps)
+   $(call es-gen, $<)
+
 FORMAT_INFO := $(LOCAL_PATH)/main/format_info.py
 format_info_deps := \
$(LOCAL_PATH)/main/formats.csv \
diff --git a/src/mesa/Makefile.am b/src/mesa/Makefile.am
index 53f311d2a9..2c39374aef 100644
--- a/src/mesa/Makefile.am
+++ b/src/mesa/Makefile.am
@@ -37,6 +37,7 @@ include Makefile.sources
 
 EXTRA_DIST = \
drivers/SConscript \
+   main/format_fallback.py \
main/format_info.py \
main/format_pack.py \
main/format_parser.py \
@@ -54,6 +55,7 @@ EXTRA_DIST = \
 
 BUILT_SOURCES = \
main/get_hash.h \
+   main/format_fallback.c \
main/format_info.h \
main/format_pack.c \
main/format_unpack.c \
@@ -70,6 +72,11 @@ main/get_hash.h: ../mapi/glapi/gen/gl_and_es_API.xml 
main/get_hash_params.py \
$(PYTHON_GEN) $(srcdir)/main/get_hash_generator.py \
-f $(srcdir)/../mapi/glapi/gen/gl_and_es_API.xml > $@
 
+main/format_fallback.c: main/format_fallback.py \
+main/format_parser.py \
+   main/formats.csv
+   $(PYTHON_GEN) $(srcdir)/main/format_fallback.py 
$(srcdir)/main/formats.csv > $@
+
 main/format_info.h: main/formats.csv \
 main/format_parser.py main/format_info.py
$(PYTHON_GEN) $(srcdir)/main/format_info.py $(srcdir)/main/formats.csv 
> $@
diff --git a/src/mesa/Makefile.sources b/src/mesa/Makefile.sources
index 8a65fbe663..642100b956 100644
--- a/src/mesa/Makefile.sources
+++ b/src/mesa/Makefile.sources
@@ -94,6 +94,8 @@ MAIN_FILES = \
main/ffvertex_prog.h \
main/fog.c \
main/fog.h \
+   main/format_fallback.h \
+   main/format_fallback.c \
main/format_info.h \
main/format_pack.h \
main/format_pack.c \
diff --git a/src/mesa/main/.gitignore b/src/mesa/main/.gitignore
index 836d8f104a..8cc33cfee6 100644
--- a/src/mesa/main/.gitignore
+++ b/src/mesa/main/.gitignore
@@ -4,6 +4,7 @@ enums.c
 remap_helper.h
 get_hash.h
 get_hash.h.tmp
+format_fallback.c
 format_info.h
 format_info.c
 format_pack.c
diff --git a/src/mesa/main/format_fallback.h b/src/mesa/main/format_fallback.h
new file mode 100644
index 00..5ca8269b2f
--- /dev/null
+++ b/src/mesa/main/format_fallback.h
@@ -0,0 +1,31 @@
+/*
+ * Copyright 2017 Google
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included
+ * in all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
+ * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 

[Mesa-dev] [PATCH 3/6] radv: Add early exit for cache flushes.

2017-06-06 Thread Bas Nieuwenhuizen
No sense checking each bit separately in the common case of none
being set.

Signed-off-by: Bas Nieuwenhuizen 
---
 src/amd/vulkan/si_cmd_buffer.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/amd/vulkan/si_cmd_buffer.c b/src/amd/vulkan/si_cmd_buffer.c
index a10034e4f20..1011c2d3393 100644
--- a/src/amd/vulkan/si_cmd_buffer.c
+++ b/src/amd/vulkan/si_cmd_buffer.c
@@ -1089,6 +1089,9 @@ si_emit_cache_flush(struct radv_cmd_buffer *cmd_buffer)
  
RADV_CMD_FLAG_VS_PARTIAL_FLUSH |
  RADV_CMD_FLAG_VGT_FLUSH);
 
+   if (!cmd_buffer->state.flush_bits)
+   return;
+
radeon_check_space(cmd_buffer->device->ws, cmd_buffer->cs, 128);
 
uint32_t *ptr = NULL;
@@ -1104,8 +1107,7 @@ si_emit_cache_flush(struct radv_cmd_buffer *cmd_buffer)
   cmd_buffer->state.flush_bits);
 
 
-   if (cmd_buffer->state.flush_bits)
-   radv_cmd_buffer_trace_emit(cmd_buffer);
+   radv_cmd_buffer_trace_emit(cmd_buffer);
cmd_buffer->state.flush_bits = 0;
 }
 
-- 
2.13.0



[Mesa-dev] [PATCH 4/6] radv: Move pipeline stuff from flush_state to emit_graphics_pipeline.

2017-06-06 Thread Bas Nieuwenhuizen
No functional changes.

Signed-off-by: Bas Nieuwenhuizen 
---
 src/amd/vulkan/radv_cmd_buffer.c | 21 ++---
 1 file changed, 10 insertions(+), 11 deletions(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index ca9d606a7ca..6dfd52ea9d0 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -901,6 +901,16 @@ radv_emit_graphics_pipeline(struct radv_cmd_buffer 
*cmd_buffer,
cmd_buffer->state.emitted_pipeline->graphics.can_use_guardband !=
 pipeline->graphics.can_use_guardband)
cmd_buffer->state.dirty |= RADV_CMD_DIRTY_DYNAMIC_SCISSOR;
+
+   radeon_set_context_reg(cmd_buffer->cs, R_028B54_VGT_SHADER_STAGES_EN, 
pipeline->graphics.vgt_shader_stages_en);
+
+   if (cmd_buffer->device->physical_device->rad_info.chip_class >= CIK) {
+   radeon_set_uconfig_reg_idx(cmd_buffer->cs, 
R_030908_VGT_PRIMITIVE_TYPE, 1, pipeline->graphics.prim);
+   } else {
+   radeon_set_config_reg(cmd_buffer->cs, 
R_008958_VGT_PRIMITIVE_TYPE, pipeline->graphics.prim);
+   }
+   radeon_set_context_reg(cmd_buffer->cs, R_028A6C_VGT_GS_OUT_PRIM_TYPE, 
pipeline->graphics.gs_out);
+
cmd_buffer->state.emitted_pipeline = pipeline;
 }
 
@@ -1586,17 +1596,6 @@ radv_cmd_buffer_flush_state(struct radv_cmd_buffer 
*cmd_buffer,
cmd_buffer->state.last_ia_multi_vgt_param = ia_multi_vgt_param;
}
 
-   if (cmd_buffer->state.dirty & RADV_CMD_DIRTY_PIPELINE) {
-   radeon_set_context_reg(cmd_buffer->cs, 
R_028B54_VGT_SHADER_STAGES_EN, pipeline->graphics.vgt_shader_stages_en);
-
-   if (cmd_buffer->device->physical_device->rad_info.chip_class >= 
CIK) {
-   radeon_set_uconfig_reg_idx(cmd_buffer->cs, 
R_030908_VGT_PRIMITIVE_TYPE, 1, cmd_buffer->state.pipeline->graphics.prim);
-   } else {
-   radeon_set_config_reg(cmd_buffer->cs, 
R_008958_VGT_PRIMITIVE_TYPE, cmd_buffer->state.pipeline->graphics.prim);
-   }
-   radeon_set_context_reg(cmd_buffer->cs, 
R_028A6C_VGT_GS_OUT_PRIM_TYPE, cmd_buffer->state.pipeline->graphics.gs_out);
-   }
-
radv_cmd_buffer_flush_dynamic_state(cmd_buffer);
 
radv_emit_primitive_reset_state(cmd_buffer, indexed_draw);
-- 
2.13.0



[Mesa-dev] [PATCH 1/6] radv: Don't use a divide by index_size.

2017-06-06 Thread Bas Nieuwenhuizen
Divides are pretty slow, and this is in the hot path of a draw.

Signed-off-by: Bas Nieuwenhuizen 
---
 src/amd/vulkan/radv_cmd_buffer.c | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index c91c7b91880..ed0aa8020ce 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -2643,6 +2643,12 @@ void radv_CmdDraw(
radv_cmd_buffer_trace_emit(cmd_buffer);
 }
 
+static
+uint32_t radv_get_max_index_count(struct radv_cmd_buffer *cmd_buffer) {
+   int index_size_shift = cmd_buffer->state.index_type ? 2 : 1;
+   return (cmd_buffer->state.index_buffer->size - 
cmd_buffer->state.index_offset) >> index_size_shift;
+}
+
 void radv_CmdDrawIndexed(
VkCommandBuffer commandBuffer,
uint32_tindexCount,
@@ -2653,7 +2659,7 @@ void radv_CmdDrawIndexed(
 {
RADV_FROM_HANDLE(radv_cmd_buffer, cmd_buffer, commandBuffer);
int index_size = cmd_buffer->state.index_type ? 4 : 2;
-   uint32_t index_max_size = (cmd_buffer->state.index_buffer->size - 
cmd_buffer->state.index_offset) / index_size;
+   uint32_t index_max_size = radv_get_max_index_count(cmd_buffer);
uint64_t index_va;
 
radv_cmd_buffer_flush_state(cmd_buffer, true, (instanceCount > 1), 
false, indexCount);
@@ -2789,8 +2795,7 @@ radv_cmd_draw_indexed_indirect_count(
uint32_tstride)
 {
RADV_FROM_HANDLE(radv_cmd_buffer, cmd_buffer, commandBuffer);
-   int index_size = cmd_buffer->state.index_type ? 4 : 2;
-   uint32_t index_max_size = (cmd_buffer->state.index_buffer->size - 
cmd_buffer->state.index_offset) / index_size;
+   uint32_t index_max_size = radv_get_max_index_count(cmd_buffer);
uint64_t index_va;
radv_cmd_buffer_flush_state(cmd_buffer, true, false, true, 0);
 
-- 
2.13.0
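
As a standalone illustration of the helper added by this patch, the sketch below computes the same quantity both ways. The function names and plain integer parameters are hypothetical simplifications of the radv_cmd_buffer state; the only thing taken from the patch is the convention that a non-zero index_type means 4-byte indices and zero means 2-byte indices.

```c
#include <stdint.h>

/* Shift-based version, as in the patch: index sizes are 2 or 4 bytes,
 * so dividing by the index size is a right shift by 1 or 2.
 */
static uint32_t
sketch_max_index_count_shift(uint64_t buffer_size, uint64_t index_offset,
                             int index_type)
{
   int index_size_shift = index_type ? 2 : 1;
   return (uint32_t)((buffer_size - index_offset) >> index_size_shift);
}

/* Division-based version that the patch replaces, kept for comparison. */
static uint32_t
sketch_max_index_count_div(uint64_t buffer_size, uint64_t index_offset,
                           int index_type)
{
   int index_size = index_type ? 4 : 2;
   return (uint32_t)((buffer_size - index_offset) / (uint64_t)index_size);
}
```

For example, a 1024-byte buffer with a 24-byte offset holds 500 16-bit indices or 250 32-bit indices under either version; whether the shift is measurably faster depends on the compiler and CPU and is not shown here.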



[Mesa-dev] [PATCH 6/6] radv: Remove SI num RB override for occlusion queries.

2017-06-06 Thread Bas Nieuwenhuizen
radeonsi doesn't have it anymore either.

Signed-off-by: Bas Nieuwenhuizen 
Fixes: f4e499ec791 "radv: add initial non-conformant radv vulkan driver"
---
 src/amd/vulkan/radv_query.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/src/amd/vulkan/radv_query.c b/src/amd/vulkan/radv_query.c
index 6d05612579f..03b5af16a55 100644
--- a/src/amd/vulkan/radv_query.c
+++ b/src/amd/vulkan/radv_query.c
@@ -44,9 +44,6 @@ static unsigned get_max_db(struct radv_device *device)
unsigned num_db = device->physical_device->rad_info.num_render_backends;
MAYBE_UNUSED unsigned rb_mask = 
device->physical_device->rad_info.enabled_rb_mask;
 
-   if (device->physical_device->rad_info.chip_class == SI)
-   num_db = 8;
-
/* Otherwise we need to change the query reset procedure */
assert(rb_mask == ((1ull << num_db) - 1));
 
-- 
2.13.0



[Mesa-dev] [PATCH 5/6] radv: Split out updating the vertex descriptors.

2017-06-06 Thread Bas Nieuwenhuizen
Simple refactor.

Signed-off-by: Bas Nieuwenhuizen 
---
 src/amd/vulkan/radv_cmd_buffer.c | 29 ++---
 1 file changed, 18 insertions(+), 11 deletions(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index 6dfd52ea9d0..f3187e84d7f 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -1525,17 +1525,9 @@ static void radv_emit_primitive_reset_state(struct 
radv_cmd_buffer *cmd_buffer,
 }
 
 static void
-radv_cmd_buffer_flush_state(struct radv_cmd_buffer *cmd_buffer,
-   bool indexed_draw, bool instanced_draw,
-   bool indirect_draw,
-   uint32_t draw_vertex_count)
+radv_cmd_buffer_update_vertex_descriptors(struct radv_cmd_buffer *cmd_buffer)
 {
-   struct radv_pipeline *pipeline = cmd_buffer->state.pipeline;
struct radv_device *device = cmd_buffer->device;
-   uint32_t ia_multi_vgt_param;
-
-   MAYBE_UNUSED unsigned cdw_max = radeon_check_space(cmd_buffer->device->ws,
-  cmd_buffer->cs, 4096);
 
if ((cmd_buffer->state.pipeline != cmd_buffer->state.emitted_pipeline || cmd_buffer->state.vb_dirty) &&
cmd_buffer->state.pipeline->num_vertex_attribs &&
@@ -1574,11 +1566,26 @@ radv_cmd_buffer_flush_state(struct radv_cmd_buffer *cmd_buffer,
va = device->ws->buffer_get_va(cmd_buffer->upload.upload_bo);
va += vb_offset;
 
-   radv_emit_userdata_address(cmd_buffer, pipeline, MESA_SHADER_VERTEX,
+   radv_emit_userdata_address(cmd_buffer, cmd_buffer->state.pipeline, MESA_SHADER_VERTEX,
   AC_UD_VS_VERTEX_BUFFERS, va);
}
-
cmd_buffer->state.vb_dirty = 0;
+}
+
+static void
+radv_cmd_buffer_flush_state(struct radv_cmd_buffer *cmd_buffer,
+   bool indexed_draw, bool instanced_draw,
+   bool indirect_draw,
+   uint32_t draw_vertex_count)
+{
+   struct radv_pipeline *pipeline = cmd_buffer->state.pipeline;
+   uint32_t ia_multi_vgt_param;
+
+   MAYBE_UNUSED unsigned cdw_max = radeon_check_space(cmd_buffer->device->ws,
+  cmd_buffer->cs, 4096);
+
+   radv_cmd_buffer_update_vertex_descriptors(cmd_buffer);
+
if (cmd_buffer->state.dirty & RADV_CMD_DIRTY_PIPELINE)
radv_emit_graphics_pipeline(cmd_buffer, pipeline);
 
-- 
2.13.0



[Mesa-dev] [PATCH 2/6] radv: Remove vertex_descriptors_dirty.

2017-06-06 Thread Bas Nieuwenhuizen
Redundant.

Signed-off-by: Bas Nieuwenhuizen 
---
 src/amd/vulkan/radv_cmd_buffer.c | 4 +---
 src/amd/vulkan/radv_private.h| 1 -
 2 files changed, 1 insertion(+), 4 deletions(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index ed0aa8020ce..ca9d606a7ca 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -1527,7 +1527,7 @@ radv_cmd_buffer_flush_state(struct radv_cmd_buffer *cmd_buffer,
MAYBE_UNUSED unsigned cdw_max = 
radeon_check_space(cmd_buffer->device->ws,
   cmd_buffer->cs, 
4096);
 
-   if ((cmd_buffer->state.vertex_descriptors_dirty || cmd_buffer->state.vb_dirty) &&
+   if ((cmd_buffer->state.pipeline != cmd_buffer->state.emitted_pipeline || cmd_buffer->state.vb_dirty) &&
cmd_buffer->state.pipeline->num_vertex_attribs &&

cmd_buffer->state.pipeline->shaders[MESA_SHADER_VERTEX]->info.info.vs.has_vertex_buffers)
 {
unsigned vb_offset;
@@ -1568,7 +1568,6 @@ radv_cmd_buffer_flush_state(struct radv_cmd_buffer *cmd_buffer,
   AC_UD_VS_VERTEX_BUFFERS, va);
}
 
-   cmd_buffer->state.vertex_descriptors_dirty = false;
cmd_buffer->state.vb_dirty = 0;
if (cmd_buffer->state.dirty & RADV_CMD_DIRTY_PIPELINE)
radv_emit_graphics_pipeline(cmd_buffer, pipeline);
@@ -2268,7 +2267,6 @@ void radv_CmdBindPipeline(
if (!pipeline)
break;
 
-   cmd_buffer->state.vertex_descriptors_dirty = true;
cmd_buffer->state.dirty |= RADV_CMD_DIRTY_PIPELINE;
cmd_buffer->push_constant_stages |= pipeline->active_stages;
 
diff --git a/src/amd/vulkan/radv_private.h b/src/amd/vulkan/radv_private.h
index ed80ba79e7f..fbeac70e39e 100644
--- a/src/amd/vulkan/radv_private.h
+++ b/src/amd/vulkan/radv_private.h
@@ -750,7 +750,6 @@ struct radv_attachment_state {
 struct radv_cmd_state {
uint32_t  vb_dirty;
radv_cmd_dirty_mask_t dirty;
-   bool  vertex_descriptors_dirty;
bool  push_descriptors_dirty;
 
struct radv_pipeline *pipeline;
-- 
2.13.0
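
The patch above drops the separate vertex_descriptors_dirty flag and instead derives the same information by comparing the bound pipeline with the last emitted one. A minimal sketch of that idea follows, using hypothetical sketch_* types rather than the real radv structures; the has_vertex_buffers check from the real condition is left out, and the NULL guard is an addition of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for radv_pipeline and radv_cmd_state. */
struct sketch_pipeline {
   unsigned num_vertex_attribs;
};

struct sketch_cmd_state {
   uint32_t vb_dirty;
   const struct sketch_pipeline *pipeline;          /* currently bound */
   const struct sketch_pipeline *emitted_pipeline;  /* last one emitted */
};

/* No dedicated dirty bit: a pipeline change is detected by comparing the
 * bound pipeline pointer against the emitted one. */
static bool sketch_need_vertex_descriptors(const struct sketch_cmd_state *s)
{
   assert(s != NULL);
   return (s->pipeline != s->emitted_pipeline || s->vb_dirty) &&
          s->pipeline != NULL &&
          s->pipeline->num_vertex_attribs > 0;
}
```

The design point is that one less piece of state has to be kept in sync by hand: binding a new pipeline changes the comparison result without any extra bookkeeping in the bind path.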



Re: [Mesa-dev] [PATCH 14/30] intel/isl: Add an enum for describing auxiliary compression state

2017-06-06 Thread Jason Ekstrand
On Tue, Jun 6, 2017 at 1:22 PM, Chad Versace wrote:

> On Fri 26 May 2017, Jason Ekstrand wrote:
> > This enum describes all of the states that an auxiliary compressed
> > surface can have.  All of the states as well as normative language for
> > referring to each of the compression operations is provided in the
> > truly colossal comment for the new isl_aux_state enum.  There is also
> > a diagram showing how surfaces move between the different states.
> > ---
> >  src/intel/isl/isl.h | 142 ++++
> >  1 file changed, 142 insertions(+)
> >
> > diff --git a/src/intel/isl/isl.h b/src/intel/isl/isl.h
> > index b9d8fa8..df6d3e3 100644
> > --- a/src/intel/isl/isl.h
> > +++ b/src/intel/isl/isl.h
> > @@ -560,6 +560,148 @@ enum isl_aux_usage {
> > ISL_AUX_USAGE_CCS_E,
> >  };
> >
> > +/**
> > + * Enum for keeping track of the state of an auxiliary compressed surface.
>
> This is really nice and helpful for everyone.
>
> I also learned something new from it: that a resolve on CCS_E also
> ambiguates the aux surface. Do you have any insight on why the hardware
> does that?
>
> > + *
> > + * For any given auxiliary surface compression format (HiZ, CCS, or MCS), any
> > + * given slice (lod + array layer) can be in one of the six states described
> > + * by this enum.  Draw and resolve operations may cause the slice to change
> > + * from one state to another.  The six valid states are:
>
> I have one suggestion: please carefully distinguish between CCS_D and
> CCS_E in the documentation. In my experience, muddy thinking where the
> two are not cleanly distinguished leads to confused minds and confusing
> code.
>
> For someone who already has a firm grasp on aux state, the ambiguous
> term "CCS" poses no problem. That wise person automatically infers from
> context if "CCS" applies to CCS_D, to CCS_E, or to both. But for someone
> whose understanding of aux isn't as solid, the term "CCS" can lead to
> incorrect inferences.
>
> For example, below you say that the partial resolve "operation is only
> available for CCS". That's misleading. It should say "only available for
> CCS_E".
>
> Another benefit: It becomes possible to document that
> ISL_AUX_STATE_COMPRESSED_NO_CLEAR is valid only for CCS_E and HIZ, but
> not valid for CCS_D and MCS.
>

It is valid for MCS.  If you don't fast-clear but only render, then you're
in that state.  It's only invalid for CCS_D.
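
To make the per-format distinctions in this exchange concrete, here is a minimal sketch that encodes only the two statements made in the thread: "compressed without clear" is invalid only for CCS_D, and partial resolve is available only for CCS_E. The sketch_* names are hypothetical and are not the isl API; nothing beyond those two statements is assumed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical enum mirroring the four aux types named in the thread. */
enum sketch_aux_usage {
   SKETCH_AUX_HIZ,
   SKETCH_AUX_MCS,
   SKETCH_AUX_CCS_D,
   SKETCH_AUX_CCS_E,
};

/* Per the reply above: the compressed-without-clear state is valid for
 * HiZ, MCS and CCS_E; it is only invalid for CCS_D. */
static bool sketch_compressed_no_clear_valid(enum sketch_aux_usage usage)
{
   assert(usage >= SKETCH_AUX_HIZ && usage <= SKETCH_AUX_CCS_E);
   return usage != SKETCH_AUX_CCS_D;
}

/* Per the review: the partial resolve operation is only available for
 * CCS_E, not for "CCS" in general. */
static bool sketch_partial_resolve_available(enum sketch_aux_usage usage)
{
   assert(usage >= SKETCH_AUX_HIZ && usage <= SKETCH_AUX_CCS_E);
   return usage == SKETCH_AUX_CCS_E;
}
```

Keeping CCS_D and CCS_E as separate enumerators, rather than a single "CCS" value, is what lets each rule be written down without relying on context.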


> Other than the CCS_D/CCS_E distinction, the patch looks good to me. This
> is a really nice addition to the driver.
>

How about a section after the auxiliary compression ops section that goes
into detail on each of the compression types and discusses which states are
valid, etc.?


> One more comment at the end...
>
> > + *
> > + *1) Clear:  In this state, each block in the auxiliary surface contains a
> > + *   magic value that indicates that the block is in the clear state.  If
> > + *   a block is in the clear state, its values in the primary surface are
> > + *   ignored and the color of the samples in the block is taken from either
> > + *   the RENDER_SURFACE_STATE packet for color or 3DSTATE_CLEAR_PARAMS for
> > + *   depth.  Since neither the primary surface nor the auxiliary surface
> > + *   contains the clear value, the surface can be cleared to a different
> > + *   color by simply changing the clear color without modifying either
> > + *   surface.
> > + *
> > + *2) Compressed w/ Clear:  In this state, neither the auxiliary surface
> > + *   nor the primary surface has a complete representation of the data.
> > + *   Instead, both surfaces must be used together or else rendering
> > + *   corruption may occur.  Depending on the auxiliary compression format
> > + *   and the data, any given block in the primary surface may contain all,
> > + *   some, or none of the data required to reconstruct the actual sample
> > + *   values.  Blocks may also be in the clear state (see Clear) and have
> > + *   their value taken from outside the surface.
> > + *
> > + *3) Compressed w/o Clear:  This state is identical to the state above
> > + *   except that no blocks are in the clear state.  In this state, all of
> > + *   the data required to reconstruct the final sample values is contained
> > + *   in the auxiliary and primary surface and the clear value is not
> > + *   considered.
> > + *
> > + *4) Resolved:  In this state, the primary surface contains 100% of the
> > + *   data.  The auxiliary surface is also valid so the surface can be
> > + *   validly used with or without aux enabled.  The auxiliary surface may,
> > + *   however, contain non-trivial data and any update to the primary
> > + *   surface with aux disabled will cause the two to get out of sync.
> > + *
> > + *5) Pass-through:  In this state, the primary 

Re: [Mesa-dev] [PATCH 14/30] intel/isl: Add an enum for describing auxiliary compression state

2017-06-06 Thread Chad Versace
On Fri 26 May 2017, Jason Ekstrand wrote:
> This enum describes all of the states that an auxiliary compressed
> surface can have.  All of the states as well as normative language for
> referring to each of the compression operations is provided in the
> truly colossal comment for the new isl_aux_state enum.  There is also
> a diagram showing how surfaces move between the different states.
> ---
>  src/intel/isl/isl.h | 142 
> 
>  1 file changed, 142 insertions(+)
> 
> diff --git a/src/intel/isl/isl.h b/src/intel/isl/isl.h
> index b9d8fa8..df6d3e3 100644
> --- a/src/intel/isl/isl.h
> +++ b/src/intel/isl/isl.h
> @@ -560,6 +560,148 @@ enum isl_aux_usage {
> ISL_AUX_USAGE_CCS_E,
>  };
>  
> +/**
> + * Enum for keeping track of the state of an auxiliary compressed surface.

This is really nice and helpful for everyone.

I also learned something new from it: that a resolve on CCS_E also
ambiguates the aux surface. Do you have any insight on why the hardware
does that?

> + *
> + * For any given auxiliary surface compression format (HiZ, CCS, or MCS), any
> + * given slice (lod + array layer) can be in one of the six states described
> + * by this enum.  Draw and resolve operations may cause the slice to change
> + * from one state to another.  The six valid states are:

I have one suggestion: please carefully distinguish between CCS_D and
CCS_E in the documentation. In my experience, muddy thinking where the
two are not cleanly distinguished leads to confused minds and confusing
code.

For someone who already has a firm grasp on aux state, the ambiguous
term "CCS" poses no problem. That wise person automatically infers from
context if "CCS" applies to CCS_D, to CCS_E, or to both. But for someone
whose understanding of aux isn't as solid, the term "CCS" can lead to
incorrect inferences.

For example, below you say that the partial resolve "operation is only
available for CCS". That's misleading. It should say "only available for
CCS_E".

Another benefit: It becomes possible to document that
ISL_AUX_STATE_COMPRESSED_NO_CLEAR is valid only for CCS_E and HIZ, but
not valid for CCS_D and MCS.

Other than the CCS_D/CCS_E distinction, the patch looks good to me. This
is a really nice addition to the driver.

One more comment at the end...

> + *
> + *1) Clear:  In this state, each block in the auxiliary surface contains a
> + *   magic value that indicates that the block is in the clear state.  If
> + *   a block is in the clear state, its values in the primary surface are
> + *   ignored and the color of the samples in the block is taken from either
> + *   the RENDER_SURFACE_STATE packet for color or 3DSTATE_CLEAR_PARAMS for
> + *   depth.  Since neither the primary surface nor the auxiliary surface
> + *   contains the clear value, the surface can be cleared to a different
> + *   color by simply changing the clear color without modifying either
> + *   surface.
> + *
> + *2) Compressed w/ Clear:  In this state, neither the auxiliary surface
> + *   nor the primary surface has a complete representation of the data.
> + *   Instead, both surfaces must be used together or else rendering
> + *   corruption may occur.  Depending on the auxiliary compression format
> + *   and the data, any given block in the primary surface may contain all,
> + *   some, or none of the data required to reconstruct the actual sample
> + *   values.  Blocks may also be in the clear state (see Clear) and have
> + *   their value taken from outside the surface.
> + *
> + *3) Compressed w/o Clear:  This state is identical to the state above
> + *   except that no blocks are in the clear state.  In this state, all of
> + *   the data required to reconstruct the final sample values is contained
> + *   in the auxiliary and primary surface and the clear value is not
> + *   considered.
> + *
> + *4) Resolved:  In this state, the primary surface contains 100% of the
> + *   data.  The auxiliary surface is also valid so the surface can be
> + *   validly used with or without aux enabled.  The auxiliary surface may,
> + *   however, contain non-trivial data and any update to the primary
> + *   surface with aux disabled will cause the two to get out of sync.
> + *
> + *5) Pass-through:  In this state, the primary surface contains 100% of the
> + *   data and every block in the auxiliary surface contains a magic value
> + *   which indicates that the auxiliary surface should be ignored and
> + *   only the primary surface should be considered.  Updating the primary
> + *   surface without aux works fine and can be done repeatedly in this
> + *   mode.  Writing to a surface in pass-through mode with aux enabled may
> + *   cause the auxiliary buffer to contain non-trivial data and no longer
> + *   be in the pass-through state.
> + *
> 

[Mesa-dev] [PATCH 2/3] mesa: make use of NewScissorRect driver flags

2017-06-06 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset 
---
 src/mesa/main/scissor.c | 2 ++
 src/mesa/state_tracker/st_context.c | 4 ++--
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/mesa/main/scissor.c b/src/mesa/main/scissor.c
index 808cb4d5fe..fb7cf0ebb2 100644
--- a/src/mesa/main/scissor.c
+++ b/src/mesa/main/scissor.c
@@ -49,6 +49,8 @@ set_scissor_no_notify(struct gl_context *ctx, unsigned idx,
   return;
 
FLUSH_VERTICES(ctx, _NEW_SCISSOR);
+   ctx->NewDriverState |= ctx->DriverFlags.NewScissorRect;
+
ctx->Scissor.ScissorArray[idx].X = x;
ctx->Scissor.ScissorArray[idx].Y = y;
ctx->Scissor.ScissorArray[idx].Width = width;
diff --git a/src/mesa/state_tracker/st_context.c b/src/mesa/state_tracker/st_context.c
index 8a5dd2f725..d8ec45f118 100644
--- a/src/mesa/state_tracker/st_context.c
+++ b/src/mesa/state_tracker/st_context.c
@@ -192,8 +192,7 @@ void st_invalidate_state(struct gl_context * ctx, GLbitfield new_state)
  st->dirty |= ST_NEW_RASTERIZER;
 
   if (new_state & _NEW_SCISSOR)
- st->dirty |= ST_NEW_RASTERIZER |
-  ST_NEW_SCISSOR;
+ st->dirty |= ST_NEW_RASTERIZER;
 
   if (new_state & _NEW_FOG)
  st->dirty |= ST_NEW_FS_STATE;
@@ -514,6 +513,7 @@ static void st_init_driver_flags(struct gl_driver_flags *f)
f->NewShaderStorageBuffer = ST_NEW_STORAGE_BUFFER;
f->NewImageUnits = ST_NEW_IMAGE_UNITS;
f->NewWindowRectangles = ST_NEW_WINDOW_RECTANGLES;
+   f->NewScissorRect = ST_NEW_SCISSOR;
 }
 
 struct st_context *st_create_context(gl_api api, struct pipe_context *pipe,
-- 
2.13.0



[Mesa-dev] [PATCH 1/3] mesa: add gl_driver_flags::NewScissor{Rect, Test}

2017-06-06 Thread Samuel Pitoiset
The _NEW_SCISSOR mesa flag is set when a scissor test is enabled/disabled
or when a new rectangle is defined. However, it triggers too many
changes in the state tracker.

Actually, ST_NEW_RASTERIZER should only be set when a scissor
test is enabled/disabled, while ST_NEW_SCISSOR should be set
in both situations.

In other words, this avoids updating the rasterizer every
time a new rectangle is defined using glScissor*().

Signed-off-by: Samuel Pitoiset 
---
 src/mesa/main/mtypes.h | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 7ec012321f..5d327d0dc7 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -4402,6 +4402,12 @@ struct gl_driver_flags
 * gl_context::Scissor::WindowRects
 */
uint64_t NewWindowRectangles;
+
+   /** gl_context::Scissor::EnableFlags */
+   uint64_t NewScissorTest;
+
+   /** gl_context::Scissor::ScissorArray */
+   uint64_t NewScissorRect;
 };
 
 struct gl_uniform_buffer_binding
-- 
2.13.0
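
The flag split described in this series can be sketched as follows. The two driver-flag values mirror what patches 2/3 and 3/3 assign in st_init_driver_flags(); everything else (the sketch_* names, the simplified context struct) is hypothetical and stands in for the real gl_context machinery.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for ST_NEW_SCISSOR and ST_NEW_RASTERIZER. */
#define SKETCH_NEW_SCISSOR    (1ull << 0)
#define SKETCH_NEW_RASTERIZER (1ull << 1)

struct sketch_driver_flags {
   uint64_t new_scissor_rect;  /* raised when a rectangle changes */
   uint64_t new_scissor_test;  /* raised when the test is toggled */
};

struct sketch_ctx {
   struct sketch_driver_flags flags;
   uint64_t new_driver_state;
   int rect[4];
   unsigned enable_flags;
};

static void sketch_init_flags(struct sketch_driver_flags *f)
{
   f->new_scissor_rect = SKETCH_NEW_SCISSOR;
   f->new_scissor_test = SKETCH_NEW_SCISSOR | SKETCH_NEW_RASTERIZER;
}

/* A new rectangle only dirties the scissor state. */
static void sketch_set_rect(struct sketch_ctx *ctx, int x, int y, int w, int h)
{
   assert(w >= 0 && h >= 0);
   if (ctx->rect[0] == x && ctx->rect[1] == y &&
       ctx->rect[2] == w && ctx->rect[3] == h)
      return; /* unchanged: nothing to flag */
   ctx->new_driver_state |= ctx->flags.new_scissor_rect;
   ctx->rect[0] = x;
   ctx->rect[1] = y;
   ctx->rect[2] = w;
   ctx->rect[3] = h;
}

/* Toggling the test dirties both the scissor and the rasterizer state. */
static void sketch_set_test(struct sketch_ctx *ctx, unsigned enable_flags)
{
   if (enable_flags == ctx->enable_flags)
      return;
   ctx->new_driver_state |= ctx->flags.new_scissor_test;
   ctx->enable_flags = enable_flags;
}
```

With this routing, repeated rectangle updates never raise the rasterizer bit, which is the saving the commit message describes.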



Re: [Mesa-dev] [PATCH] i965: Don't try to resolve CCS with MESA_FORMAT_NONE.

2017-06-06 Thread Chad Versace
On Sun 04 Jun 2017, Jason Ekstrand wrote:
> On June 4, 2017 5:36:57 PM Kenneth Graunke  wrote:
> 
> > On Sunday, June 4, 2017 3:27:04 PM PDT Jason Ekstrand wrote:
> > > How does the texture even have a format of MESA_FORMAT_NONE?  That seems
> > > like the first question to ask.
> > 
> > It's the window system buffer, and it actually has a format of
> > B8G8R8A8_UNORM_SRGB...my guess is just that _Format isn't set yet.
> 
> That seems like the bigger problem :-)

I was afraid that my patch would uncover long-hidden bugs like this.
Thanks for fixing it quickly.


[Mesa-dev] [PATCH 3/3] mesa: make use of NewScissorTest driver flags

2017-06-06 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset 
---
 src/mesa/main/enable.c  | 2 ++
 src/mesa/state_tracker/st_context.c | 4 +---
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/mesa/main/enable.c b/src/mesa/main/enable.c
index ef278a318a..394c9e13c4 100644
--- a/src/mesa/main/enable.c
+++ b/src/mesa/main/enable.c
@@ -671,6 +671,7 @@ _mesa_set_enable(struct gl_context *ctx, GLenum cap, GLboolean state)
state * ((1 << ctx->Const.MaxViewports) - 1);
 if (newEnabled != ctx->Scissor.EnableFlags) {
FLUSH_VERTICES(ctx, _NEW_SCISSOR);
+   ctx->NewDriverState |= ctx->DriverFlags.NewScissorTest;
ctx->Scissor.EnableFlags = newEnabled;
 }
  }
@@ -1113,6 +1114,7 @@ _mesa_set_enablei(struct gl_context *ctx, GLenum cap,
   }
   if (((ctx->Scissor.EnableFlags >> index) & 1) != state) {
  FLUSH_VERTICES(ctx, _NEW_SCISSOR);
+ ctx->NewDriverState |= ctx->DriverFlags.NewScissorTest;
  if (state)
 ctx->Scissor.EnableFlags |= (1 << index);
  else
diff --git a/src/mesa/state_tracker/st_context.c b/src/mesa/state_tracker/st_context.c
index d8ec45f118..8d348e28b2 100644
--- a/src/mesa/state_tracker/st_context.c
+++ b/src/mesa/state_tracker/st_context.c
@@ -191,9 +191,6 @@ void st_invalidate_state(struct gl_context * ctx, GLbitfield new_state)
   if (new_state & _NEW_PROGRAM)
  st->dirty |= ST_NEW_RASTERIZER;
 
-  if (new_state & _NEW_SCISSOR)
- st->dirty |= ST_NEW_RASTERIZER;
-
   if (new_state & _NEW_FOG)
  st->dirty |= ST_NEW_FS_STATE;
 
@@ -514,6 +511,7 @@ static void st_init_driver_flags(struct gl_driver_flags *f)
f->NewImageUnits = ST_NEW_IMAGE_UNITS;
f->NewWindowRectangles = ST_NEW_WINDOW_RECTANGLES;
f->NewScissorRect = ST_NEW_SCISSOR;
+   f->NewScissorTest = ST_NEW_SCISSOR | ST_NEW_RASTERIZER;
 }
 
 struct st_context *st_create_context(gl_api api, struct pipe_context *pipe,
-- 
2.13.0



[Mesa-dev] [PATCH 5/5] mesa: add KHR_no_error support for glScissor*()

2017-06-06 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset 
---
 src/mapi/glapi/gen/ARB_viewport_array.xml |  6 +++---
 src/mapi/glapi/gen/gl_API.xml |  2 +-
 src/mesa/main/scissor.c   | 31 +++
 src/mesa/main/scissor.h   | 13 +
 4 files changed, 48 insertions(+), 4 deletions(-)

diff --git a/src/mapi/glapi/gen/ARB_viewport_array.xml b/src/mapi/glapi/gen/ARB_viewport_array.xml
index be67912884..3e9c65549e 100644
--- a/src/mapi/glapi/gen/ARB_viewport_array.xml
+++ b/src/mapi/glapi/gen/ARB_viewport_array.xml
@@ -45,19 +45,19 @@
 
 
 
-
+
 
 
 
 
-
+
 
 
 
 
 
 
-
+
 
 
 
diff --git a/src/mapi/glapi/gen/gl_API.xml b/src/mapi/glapi/gen/gl_API.xml
index 8f93318b95..df999248c8 100644
--- a/src/mapi/glapi/gen/gl_API.xml
+++ b/src/mapi/glapi/gen/gl_API.xml
@@ -2108,7 +2108,7 @@
 
 
 
-
+
 
 
 
diff --git a/src/mesa/main/scissor.c b/src/mesa/main/scissor.c
index d94663c6e4..808cb4d5fe 100644
--- a/src/mesa/main/scissor.c
+++ b/src/mesa/main/scissor.c
@@ -83,6 +83,13 @@ scissor(struct gl_context *ctx, GLint x, GLint y, GLsizei width, GLsizei height)
  * Called via glScissor
  */
 void GLAPIENTRY
+_mesa_Scissor_no_error(GLint x, GLint y, GLsizei width, GLsizei height)
+{
+   GET_CURRENT_CONTEXT(ctx);
+   scissor(ctx, x, y, width, height);
+}
+
+void GLAPIENTRY
 _mesa_Scissor(GLint x, GLint y, GLsizei width, GLsizei height)
 {
GET_CURRENT_CONTEXT(ctx);
@@ -149,6 +156,15 @@ scissor_array(struct gl_context *ctx, GLuint first, GLsizei count,
  * Verifies the parameters and call set_scissor_no_notify to do the work.
  */
 void GLAPIENTRY
+_mesa_ScissorArrayv_no_error(GLuint first, GLsizei count, const GLint *v)
+{
+   GET_CURRENT_CONTEXT(ctx);
+
+   struct gl_scissor_rect *p = (struct gl_scissor_rect *)v;
+   scissor_array(ctx, first, count, p);
+}
+
+void GLAPIENTRY
 _mesa_ScissorArrayv(GLuint first, GLsizei count, const GLint *v)
 {
int i;
@@ -212,6 +228,14 @@ scissor_indexed_err(struct gl_context *ctx, GLuint index, GLint left,
 }
 
 void GLAPIENTRY
+_mesa_ScissorIndexed_no_error(GLuint index, GLint left, GLint bottom,
+  GLsizei width, GLsizei height)
+{
+   GET_CURRENT_CONTEXT(ctx);
+   _mesa_set_scissor(ctx, index, left, bottom, width, height);
+}
+
+void GLAPIENTRY
 _mesa_ScissorIndexed(GLuint index, GLint left, GLint bottom,
  GLsizei width, GLsizei height)
 {
@@ -221,6 +245,13 @@ _mesa_ScissorIndexed(GLuint index, GLint left, GLint bottom,
 }
 
 void GLAPIENTRY
+_mesa_ScissorIndexedv_no_error(GLuint index, const GLint *v)
+{
+   GET_CURRENT_CONTEXT(ctx);
+   _mesa_set_scissor(ctx, index, v[0], v[1], v[2], v[3]);
+}
+
+void GLAPIENTRY
 _mesa_ScissorIndexedv(GLuint index, const GLint *v)
 {
GET_CURRENT_CONTEXT(ctx);
diff --git a/src/mesa/main/scissor.h b/src/mesa/main/scissor.h
index 1d0fac877b..264873eaf1 100644
--- a/src/mesa/main/scissor.h
+++ b/src/mesa/main/scissor.h
@@ -31,15 +31,28 @@
 
 struct gl_context;
 
+void GLAPIENTRY
+_mesa_Scissor_no_error(GLint x, GLint y, GLsizei width, GLsizei height);
+
 extern void GLAPIENTRY
 _mesa_Scissor( GLint x, GLint y, GLsizei width, GLsizei height );
 
+void GLAPIENTRY
+_mesa_ScissorArrayv_no_error(GLuint first, GLsizei count, const GLint * v);
+
 extern void GLAPIENTRY
 _mesa_ScissorArrayv(GLuint first, GLsizei count, const GLint * v);
 
+void GLAPIENTRY
+_mesa_ScissorIndexed_no_error(GLuint index, GLint left, GLint bottom,
+  GLsizei width, GLsizei height);
+
 extern void GLAPIENTRY
 _mesa_ScissorIndexed(GLuint index, GLint left, GLint bottom, GLsizei width, GLsizei height);
 
+void GLAPIENTRY
+_mesa_ScissorIndexedv_no_error(GLuint index, const GLint * v);
+
 extern void GLAPIENTRY
 _mesa_ScissorIndexedv(GLuint index, const GLint * v);
 
-- 
2.13.0
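
The pattern used by patches 4/5 and 5/5 (a shared helper, a validating entry point, and a _no_error entry point that skips validation) can be sketched as below. The sketch_* names and the explicit context argument are hypothetical; the real entry points fetch the context with GET_CURRENT_CONTEXT() and report errors through _mesa_error().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_rect {
   int x, y, width, height;
};

/* Hypothetical context: one scissor box plus a sticky error marker. */
struct sketch_ctx {
   struct sketch_rect scissor;
   bool error;
};

/* Shared helper: does the work, performs no parameter validation. */
static void sketch_scissor(struct sketch_ctx *ctx, int x, int y, int w, int h)
{
   assert(ctx != NULL);
   ctx->scissor.x = x;
   ctx->scissor.y = y;
   ctx->scissor.width = w;
   ctx->scissor.height = h;
}

/* Validating entry point: rejects negative sizes, then calls the helper. */
static void sketch_Scissor(struct sketch_ctx *ctx, int x, int y, int w, int h)
{
   if (w < 0 || h < 0) {
      ctx->error = true; /* stands in for GL_INVALID_VALUE */
      return;
   }
   sketch_scissor(ctx, x, y, w, h);
}

/* no_error entry point: the caller promises valid input, so the helper
 * is called directly. */
static void sketch_Scissor_no_error(struct sketch_ctx *ctx,
                                    int x, int y, int w, int h)
{
   sketch_scissor(ctx, x, y, w, h);
}
```

Splitting the helper out first, as patch 4/5 does, keeps the two entry points from duplicating the state update.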



[Mesa-dev] [PATCH 4/5] mesa: add scissor() and scissor_array() helpers

2017-06-06 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset 
---
 src/mesa/main/scissor.c | 57 -
 1 file changed, 37 insertions(+), 20 deletions(-)

diff --git a/src/mesa/main/scissor.c b/src/mesa/main/scissor.c
index 5cf02168bd..d94663c6e4 100644
--- a/src/mesa/main/scissor.c
+++ b/src/mesa/main/scissor.c
@@ -55,22 +55,10 @@ set_scissor_no_notify(struct gl_context *ctx, unsigned idx,
ctx->Scissor.ScissorArray[idx].Height = height;
 }
 
-/**
- * Called via glScissor
- */
-void GLAPIENTRY
-_mesa_Scissor( GLint x, GLint y, GLsizei width, GLsizei height )
+static void
+scissor(struct gl_context *ctx, GLint x, GLint y, GLsizei width, GLsizei height)
 {
unsigned i;
-   GET_CURRENT_CONTEXT(ctx);
-
-   if (MESA_VERBOSE & VERBOSE_API)
-  _mesa_debug(ctx, "glScissor %d %d %d %d\n", x, y, width, height);
-
-   if (width < 0 || height < 0) {
-  _mesa_error( ctx, GL_INVALID_VALUE, "glScissor" );
-  return;
-   }
 
/* The GL_ARB_viewport_array spec says:
 *
@@ -91,6 +79,25 @@ _mesa_Scissor( GLint x, GLint y, GLsizei width, GLsizei height )
   ctx->Driver.Scissor(ctx);
 }
 
+/**
+ * Called via glScissor
+ */
+void GLAPIENTRY
+_mesa_Scissor(GLint x, GLint y, GLsizei width, GLsizei height)
+{
+   GET_CURRENT_CONTEXT(ctx);
+
+   if (MESA_VERBOSE & VERBOSE_API)
+  _mesa_debug(ctx, "glScissor %d %d %d %d\n", x, y, width, height);
+
+   if (width < 0 || height < 0) {
+  _mesa_error( ctx, GL_INVALID_VALUE, "glScissor" );
+  return;
+   }
+
+   scissor(ctx, x, y, width, height);
+}
+
 
 /**
  * Define the scissor box.
@@ -115,6 +122,21 @@ _mesa_set_scissor(struct gl_context *ctx, unsigned idx,
   ctx->Driver.Scissor(ctx);
 }
 
+static void
+scissor_array(struct gl_context *ctx, GLuint first, GLsizei count,
+  struct gl_scissor_rect *rect)
+{
+   GLsizei i;
+
+   for (i = 0; i < count; i++) {
+  set_scissor_no_notify(ctx, i + first, rect[i].X, rect[i].Y,
+rect[i].Width, rect[i].Height);
+   }
+
+   if (ctx->Driver.Scissor)
+  ctx->Driver.Scissor(ctx);
+}
+
 /**
  * Define count scissor boxes starting at index.
  *
@@ -150,12 +172,7 @@ _mesa_ScissorArrayv(GLuint first, GLsizei count, const GLint *v)
   }
}
 
-   for (i = 0; i < count; i++)
-  set_scissor_no_notify(ctx, i + first,
-p[i].X, p[i].Y, p[i].Width, p[i].Height);
-
-   if (ctx->Driver.Scissor)
-  ctx->Driver.Scissor(ctx);
+   scissor_array(ctx, first, count, p);
 }
 
 /**
-- 
2.13.0



[Mesa-dev] [PATCH 3/5] mesa: rename ScissorIndexed() to scissor_indexed_err()

2017-06-06 Thread Samuel Pitoiset
And move GET_CURRENT_CONTEXT() into the APIENTRY calls
for consistency.

Signed-off-by: Samuel Pitoiset 
---
 src/mesa/main/scissor.c | 15 +--
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/src/mesa/main/scissor.c b/src/mesa/main/scissor.c
index 0dd956c9e3..5cf02168bd 100644
--- a/src/mesa/main/scissor.c
+++ b/src/mesa/main/scissor.c
@@ -169,11 +169,10 @@ _mesa_ScissorArrayv(GLuint first, GLsizei count, const GLint *v)
  * Verifies the parameters call set_scissor_no_notify to do the work.
  */
 static void
-ScissorIndexed(GLuint index, GLint left, GLint bottom,
-   GLsizei width, GLsizei height, const char *function)
+scissor_indexed_err(struct gl_context *ctx, GLuint index, GLint left,
+GLint bottom, GLsizei width, GLsizei height,
+const char *function)
 {
-   GET_CURRENT_CONTEXT(ctx);
-
if (MESA_VERBOSE & VERBOSE_API)
   _mesa_debug(ctx, "%s(%d, %d, %d, %d, %d)\n",
   function, index, left, bottom, width, height);
@@ -199,13 +198,17 @@ void GLAPIENTRY
 _mesa_ScissorIndexed(GLuint index, GLint left, GLint bottom,
  GLsizei width, GLsizei height)
 {
-   ScissorIndexed(index, left, bottom, width, height, "glScissorIndexed");
+   GET_CURRENT_CONTEXT(ctx);
+   scissor_indexed_err(ctx, index, left, bottom, width, height,
+   "glScissorIndexed");
 }
 
 void GLAPIENTRY
 _mesa_ScissorIndexedv(GLuint index, const GLint *v)
 {
-   ScissorIndexed(index, v[0], v[1], v[2], v[3], "glScissorIndexedv");
+   GET_CURRENT_CONTEXT(ctx);
+   scissor_indexed_err(ctx, index, v[0], v[1], v[2], v[3],
+   "glScissorIndexedv");
 }
 
 void GLAPIENTRY
-- 
2.13.0



[Mesa-dev] [PATCH 1/5] mesa: make _mesa_scissor_bounding_box() static

2017-06-06 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset 
---
 src/mesa/main/framebuffer.c | 10 +-
 src/mesa/main/framebuffer.h |  4 
 2 files changed, 5 insertions(+), 9 deletions(-)

diff --git a/src/mesa/main/framebuffer.c b/src/mesa/main/framebuffer.c
index 5069d37394..993cd37137 100644
--- a/src/mesa/main/framebuffer.c
+++ b/src/mesa/main/framebuffer.c
@@ -407,10 +407,10 @@ _mesa_intersect_scissor_bounding_box(const struct gl_context *ctx,
  *
  * \sa _mesa_clip_to_region
  */
-void
-_mesa_scissor_bounding_box(const struct gl_context *ctx,
-   const struct gl_framebuffer *buffer,
-   unsigned idx, int *bbox)
+static void
+scissor_bounding_box(const struct gl_context *ctx,
+ const struct gl_framebuffer *buffer,
+ unsigned idx, int *bbox)
 {
bbox[0] = 0;
bbox[2] = 0;
@@ -444,7 +444,7 @@ _mesa_update_draw_buffer_bounds(struct gl_context *ctx,
}
 
/* Default to the first scissor as that's always valid */
-   _mesa_scissor_bounding_box(ctx, buffer, 0, bbox);
+   scissor_bounding_box(ctx, buffer, 0, bbox);
buffer->_Xmin = bbox[0];
buffer->_Ymin = bbox[2];
buffer->_Xmax = bbox[1];
diff --git a/src/mesa/main/framebuffer.h b/src/mesa/main/framebuffer.h
index ee0690b068..bc6e7bc31a 100644
--- a/src/mesa/main/framebuffer.h
+++ b/src/mesa/main/framebuffer.h
@@ -72,10 +72,6 @@ extern void
 _mesa_resizebuffers( struct gl_context *ctx );
 
 extern void
-_mesa_scissor_bounding_box(const struct gl_context *ctx,
-   const struct gl_framebuffer *buffer,
-   unsigned idx, int *bbox);
-extern void
 _mesa_intersect_scissor_bounding_box(const struct gl_context *ctx,
  unsigned idx, int *bbox);
 
-- 
2.13.0



[Mesa-dev] [PATCH 2/5] mesa: use _mesa_set_scissor() in ScissorIndexed()

2017-06-06 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset 
---
 src/mesa/main/scissor.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/src/mesa/main/scissor.c b/src/mesa/main/scissor.c
index 13934f9ca2..0dd956c9e3 100644
--- a/src/mesa/main/scissor.c
+++ b/src/mesa/main/scissor.c
@@ -192,10 +192,7 @@ ScissorIndexed(GLuint index, GLint left, GLint bottom,
   return;
}
 
-   set_scissor_no_notify(ctx, index, left, bottom, width, height);
-
-   if (ctx->Driver.Scissor)
-  ctx->Driver.Scissor(ctx);
+   _mesa_set_scissor(ctx, index, left, bottom, width, height);
 }
 
 void GLAPIENTRY
-- 
2.13.0



[Mesa-dev] [PATCH v2] mesa: wrap blit_framebuffer() into blit_framebuffer_err()

2017-06-06 Thread Samuel Pitoiset
Also add ALWAYS_INLINE to blit_framebuffer().

v2: - use correct parameters

Signed-off-by: Samuel Pitoiset 
---
 src/mesa/main/blit.c | 34 +-
 1 file changed, 25 insertions(+), 9 deletions(-)

diff --git a/src/mesa/main/blit.c b/src/mesa/main/blit.c
index 970c357335..be5e4f109a 100644
--- a/src/mesa/main/blit.c
+++ b/src/mesa/main/blit.c
@@ -177,7 +177,7 @@ is_valid_blit_filter(const struct gl_context *ctx, GLenum filter)
 }
 
 
-static void
+static ALWAYS_INLINE void
 blit_framebuffer(struct gl_context *ctx,
  struct gl_framebuffer *readFb, struct gl_framebuffer *drawFb,
  GLint srcX0, GLint srcY0, GLint srcX1, GLint srcY1,
@@ -537,6 +537,22 @@ blit_framebuffer(struct gl_context *ctx,
 }
 
 
+static void
+blit_framebuffer_err(struct gl_context *ctx,
+ struct gl_framebuffer *readFb,
+ struct gl_framebuffer *drawFb,
+ GLint srcX0, GLint srcY0, GLint srcX1, GLint srcY1,
+ GLint dstX0, GLint dstY0, GLint dstX1, GLint dstY1,
+ GLbitfield mask, GLenum filter, const char *func)
+{
+   /* We are wrapping the err variant of the always inlined
+* blit_framebuffer() to avoid inlining it in every caller.
+*/
+   blit_framebuffer(ctx, readFb, drawFb, srcX0, srcY0, srcX1, srcY1,
+dstX0, dstY0, dstX1, dstY1, mask, filter, false, func);
+}
+
+
 /**
  * Blit rectangular region, optionally from one framebuffer to another.
  *
@@ -558,10 +574,10 @@ _mesa_BlitFramebuffer(GLint srcX0, GLint srcY0, GLint srcX1, GLint srcY1,
   dstX0, dstY0, dstX1, dstY1,
   mask, _mesa_enum_to_string(filter));
 
-   blit_framebuffer(ctx, ctx->ReadBuffer, ctx->DrawBuffer,
-srcX0, srcY0, srcX1, srcY1,
-dstX0, dstY0, dstX1, dstY1,
-mask, filter, false, "glBlitFramebuffer");
+   blit_framebuffer_err(ctx, ctx->ReadBuffer, ctx->DrawBuffer,
+srcX0, srcY0, srcX1, srcY1,
+dstX0, dstY0, dstX1, dstY1,
+mask, filter, "glBlitFramebuffer");
 }
 
 
@@ -609,8 +625,8 @@ _mesa_BlitNamedFramebuffer(GLuint readFramebuffer, GLuint drawFramebuffer,
else
   drawFb = ctx->WinSysDrawBuffer;
 
-   blit_framebuffer(ctx, readFb, drawFb,
-srcX0, srcY0, srcX1, srcY1,
-dstX0, dstY0, dstX1, dstY1,
-mask, filter, false, "glBlitNamedFramebuffer");
+   blit_framebuffer_err(ctx, readFb, drawFb,
+srcX0, srcY0, srcX1, srcY1,
+dstX0, dstY0, dstX1, dstY1,
+mask, filter, "glBlitNamedFramebuffer");
 }
-- 
2.13.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
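
For readers unfamiliar with the pattern in the patch above, here is a
minimal, self-contained sketch of an always-inlined worker hidden behind one
out-of-line wrapper. Every name in it (do_blit, do_blit_err, api_blit,
api_named_blit) is hypothetical, and the ALWAYS_INLINE definition is an
assumed GCC/Clang spelling, not copied from Mesa's headers.

```c
#include <stdbool.h>
#include <stdio.h>

/* Assumed stand-in for Mesa's ALWAYS_INLINE macro (GCC/Clang spelling). */
#if defined(__GNUC__)
#define ALWAYS_INLINE inline __attribute__((always_inline))
#else
#define ALWAYS_INLINE inline
#endif

/* Hypothetical worker. no_error is a constant at every call site, so the
 * validation branch can be folded away once the body is inlined.
 */
static ALWAYS_INLINE int
do_blit(int src, int dst, bool no_error, const char *func)
{
   if (!no_error && src == dst) {
      fprintf(stderr, "%s: empty region\n", func);
      return -1;
   }
   return dst - src;
}

/* Out-of-line wrapper: the error-checking copy of the worker is
 * instantiated once here instead of once per API entry point.
 */
static int
do_blit_err(int src, int dst, const char *func)
{
   return do_blit(src, dst, false, func);
}

/* Two hypothetical API entry points sharing the single wrapper. */
int
api_blit(int src, int dst)
{
   return do_blit_err(src, dst, "api_blit");
}

int
api_named_blit(int src, int dst)
{
   return do_blit_err(src, dst, "api_named_blit");
}
```

The design choice mirrors the comment in the patch: the worker stays
always-inline so a future no-error caller can specialize it, while the
checked path is not duplicated in every caller.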


Re: [Mesa-dev] [PATCH 13/30] i965: Combine render target resolve code

2017-06-06 Thread Chad Versace
There's a patch on your branch I didn't see on mesa-dev.
   Subject: i965: Be a bit more conservative about certain resolves 
It has my r-b.

I have comments on this patch...

On Fri 26 May 2017, Jason Ekstrand wrote:
> We have two different bits of resolve code for render targets: one in
> brw_draw where it's always been and one in brw_context to deal with sRGB
> on gen9.  Let's pull them together.
> ---
>  src/mesa/drivers/dri/i965/brw_context.c | 47 -
>  src/mesa/drivers/dri/i965/brw_draw.c    | 34
>  2 files changed, 29 insertions(+), 52 deletions(-)



> +  /* For layered rendering non-compressed fast cleared buffers need to be
> +   * resolved. Surface state can carry only one fast color clear value
> +   * while each layer may have its own fast clear color value. For
> +   * compressed buffers color value is available in the color buffer.
> +   */
> +  if (irb->layer_count > 1 &&
> +  !(irb->mt->aux_disable & INTEL_AUX_DISABLE_CCS) &&
> +  !intel_miptree_is_lossless_compressed(brw, mt)) {

This condition smells bad. It smells like a shot in the dark. It smells
like a haphazard guess. "We haven't permanently disabled CCS for this
miptree. And it lacks CCS_E. So, well, it probably has CCS_D, I guess.".

I would much rather see the condition with something more certain.
Something like:

if (irb->layer_count > 1 &&
    intel_miptree_has_ccs_d_in_layer_range(brw, mt, irb->mt_level,
                                           irb->mt_layer, irb->layer_count))

Anyway, this patch is a good cleanup, and functional changes like the one
I'm requesting don't belong in a refactoring patch like this one.
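
To make the suggestion concrete, here is a toy sketch of what such a range
query could look like. It is only an illustration under loud assumptions:
the enum, the struct, and the function name are all hypothetical, the model
has a single miplevel, and it tracks auxiliary state per layer, which may
not match how the driver actually stores it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-layer auxiliary state; not the driver's real enum. */
enum toy_aux_state {
   TOY_AUX_NONE,   /* no auxiliary surface in use */
   TOY_AUX_CCS_D,  /* fast clear only, no compression */
   TOY_AUX_CCS_E,  /* lossless compression */
};

#define TOY_MAX_LAYERS 8

/* Hypothetical, heavily simplified miptree: one level, state per layer. */
struct toy_miptree {
   uint32_t num_layers;
   enum toy_aux_state layer_aux[TOY_MAX_LAYERS];
};

/* Returns true if any layer in [start_layer, start_layer + num_layers)
 * is in the CCS_D state. This asks the question directly instead of
 * inferring it from "CCS not disabled and not CCS_E".
 */
static bool
toy_miptree_has_ccs_d_in_layer_range(const struct toy_miptree *mt,
                                     uint32_t start_layer,
                                     uint32_t num_layers)
{
   assert(start_layer + num_layers <= mt->num_layers);

   for (uint32_t i = 0; i < num_layers; i++) {
      if (mt->layer_aux[start_layer + i] == TOY_AUX_CCS_D)
         return true;
   }
   return false;
}
```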

Reviewed-by: Chad Versace 

> + assert(brw->gen >= 8);
> +
> + intel_miptree_resolve_color(brw, mt, irb->mt_level, 1,
> + irb->mt_layer, irb->layer_count, 0);
> +  }
> }


Re: [Mesa-dev] [PATCH 12/30] i965/blorp: Move MCS allocation earlier for clears

2017-06-06 Thread Chad Versace
Patches 10-12 are
Reviewed-by: Chad Versace 


Re: [Mesa-dev] [PATCH 08/30] i965: Inline renderbuffer_att_set_needs_depth_resolve

2017-06-06 Thread Chad Versace
On Fri 02 Jun 2017, Jason Ekstrand wrote:
> On Fri, Jun 2, 2017 at 2:41 PM, Chad Versace <chadvers...@chromium.org>
> wrote:
> 
> I believe you could simplify this by eliminating the 'else' branch. As
> long as depth_irb->layer_count == 1 for non-layered renderbuffers (and
> I really hope it is), then the 'for' loop does the right thing.
> 
> 
> Sure.  I was sort-of trying to avoid functional changes.  That said... I'm
> happy to make the change.  Lots of stuff would suddenly get cleaner.  Mind if
> that's a follow-on patch to the series?

It was just a suggestion, a nice-to-have.
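
For the record, the simplification discussed above can be sketched in
isolation. This is a toy model, not the code from the patch: struct toy_rb,
both function names, and the needs_resolve array are hypothetical. It only
shows that when layer_count is 1 for non-layered renderbuffers, the loop
alone covers what the 'else' branch did.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_MAX_LAYERS 8

/* Hypothetical renderbuffer: first layer, layer count, layered flag. */
struct toy_rb {
   uint32_t mt_layer;
   uint32_t layer_count;
   bool layered;
};

/* Shape with an explicit 'else' for the non-layered case. */
static void
toy_mark_with_else(const struct toy_rb *rb, bool needs_resolve[TOY_MAX_LAYERS])
{
   if (rb->layered) {
      for (uint32_t i = 0; i < rb->layer_count; i++)
         needs_resolve[rb->mt_layer + i] = true;
   } else {
      needs_resolve[rb->mt_layer] = true;
   }
}

/* Loop-only shape: identical result whenever a non-layered renderbuffer
 * has layer_count == 1, which is the property the simplification needs.
 */
static void
toy_mark_loop_only(const struct toy_rb *rb, bool needs_resolve[TOY_MAX_LAYERS])
{
   for (uint32_t i = 0; i < rb->layer_count; i++)
      needs_resolve[rb->mt_layer + i] = true;
}
```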

This patch is
Reviewed-by: Chad Versace 


Re: [Mesa-dev] [PATCH 8/8] radeonsi: don't update dependent states if it has no effect

2017-06-06 Thread Samuel Pitoiset

I really like the idea. :-)

Though, I have two general comments:

1) I think it would be better to introduce some sort of compare helper 
functions for the different state changes. Also, for correctness it 
might be safer to do the opposite checks (in case someone introduces a new 
field and forgets to update the relevant part).


2) I would suggest introducing uses_instance_divisors in a separate patch.

Btw, I'm not going to review patch 4 because I'm not familiar enough 
with this part.
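
As an illustration of the compare helpers mentioned in point 1, here is a
rough sketch. The struct is a hypothetical subset modelled on the field
names in the quoted diff, and toy_blend_needs_shader_update is an invented
name, so treat it as a shape rather than a proposal for the actual code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical subset of blend state; field names follow the quoted diff. */
struct toy_blend {
   unsigned cb_target_mask;
   bool alpha_to_coverage;
   bool alpha_to_one;
   bool dual_src_blend;
};

/* Returns true when binding 'blend' over 'old_blend' could change shader
 * selection. Keeping the comparison in one helper gives a single place to
 * update when a field is added.
 */
static bool
toy_blend_needs_shader_update(const struct toy_blend *old_blend,
                              const struct toy_blend *blend)
{
   if (old_blend == NULL)
      return true; /* nothing bound before: always update */

   return old_blend->cb_target_mask != blend->cb_target_mask ||
          old_blend->alpha_to_coverage != blend->alpha_to_coverage ||
          old_blend->alpha_to_one != blend->alpha_to_one ||
          old_blend->dual_src_blend != blend->dual_src_blend;
}
```

A bind function would then reduce to one call per dependent state, e.g.
"if (toy_blend_needs_shader_update(old, new)) ...", instead of an open-coded
chain of comparisons.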


On 06/05/2017 06:51 PM, Marek Olšák wrote:

From: Marek Olšák 

This and the previous commit decrease IB sizes and the number of
si_update_shaders invocations as follows:

                   IB size   si_update_shader calls
Borderlands 2         -10%   -27%
Deus Ex: MD            -5%   -11%
Talos Principle        -8%   -30%
---
  src/gallium/drivers/radeonsi/si_state.c | 68 ++---
  src/gallium/drivers/radeonsi/si_state.h |  1 +
  src/gallium/drivers/radeonsi/si_state_shaders.c | 24 +++--
  3 files changed, 80 insertions(+), 13 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_state.c b/src/gallium/drivers/radeonsi/si_state.c
index 5e5f564..323ec5e 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -596,23 +596,41 @@ static void *si_create_blend_state_mode(struct pipe_context *ctx,
  
  static void *si_create_blend_state(struct pipe_context *ctx,

   const struct pipe_blend_state *state)
  {
return si_create_blend_state_mode(ctx, state, V_028808_CB_NORMAL);
  }
  
  static void si_bind_blend_state(struct pipe_context *ctx, void *state)

  {
struct si_context *sctx = (struct si_context *)ctx;
-   si_pm4_bind_state(sctx, blend, (struct si_state_blend *)state);
-   si_mark_atom_dirty(sctx, &sctx->cb_render_state);
-   sctx->do_update_shaders = true;
+   struct si_state_blend *old_blend = sctx->queued.named.blend;
+   struct si_state_blend *blend = (struct si_state_blend *)state;
+
+   if (!state)
+   return;
+
+   if (!old_blend ||
+old_blend->cb_target_mask != blend->cb_target_mask ||
+old_blend->dual_src_blend != blend->dual_src_blend)
+   si_mark_atom_dirty(sctx, &sctx->cb_render_state);
+
+   si_pm4_bind_state(sctx, blend, state);
+
+   if (!old_blend ||
+   old_blend->cb_target_mask != blend->cb_target_mask ||
+   old_blend->alpha_to_coverage != blend->alpha_to_coverage ||
+   old_blend->alpha_to_one != blend->alpha_to_one ||
+   old_blend->dual_src_blend != blend->dual_src_blend ||
+   old_blend->blend_enable_4bit != blend->blend_enable_4bit ||
+   old_blend->need_src_alpha_4bit != blend->need_src_alpha_4bit)
+   sctx->do_update_shaders = true;
  }
  
  static void si_delete_blend_state(struct pipe_context *ctx, void *state)

  {
struct si_context *sctx = (struct si_context *)ctx;
si_pm4_delete_state(sctx, blend, (struct si_state_blend *)state);
  }
  
  static void si_set_blend_color(struct pipe_context *ctx,

   const struct pipe_blend_color *state)
@@ -914,24 +932,41 @@ static void si_bind_rs_state(struct pipe_context *ctx, void *state)
}
  
  	sctx->current_vs_state &= C_VS_STATE_CLAMP_VERTEX_COLOR;

sctx->current_vs_state |= 
S_VS_STATE_CLAMP_VERTEX_COLOR(rs->clamp_vertex_color);
  
  	r600_viewport_set_rast_deps(&sctx->b, rs->scissor_enable, rs->clip_halfz);
  
  	si_pm4_bind_state(sctx, rasterizer, rs);

si_update_poly_offset_state(sctx);
  
-	si_mark_atom_dirty(sctx, &sctx->clip_regs);

+   if (!old_rs ||
+   old_rs->clip_plane_enable != rs->clip_plane_enable ||
+   old_rs->pa_cl_clip_cntl != rs->pa_cl_clip_cntl)
+   si_mark_atom_dirty(sctx, &sctx->clip_regs);
+
sctx->ia_multi_vgt_param_key.u.line_stipple_enabled =
rs->line_stipple_enable;
-   sctx->do_update_shaders = true;
+
+   if (!old_rs ||
+   old_rs->clip_plane_enable != rs->clip_plane_enable ||
+   old_rs->rasterizer_discard != rs->rasterizer_discard ||
+   old_rs->sprite_coord_enable != rs->sprite_coord_enable ||
+   old_rs->flatshade != rs->flatshade ||
+   old_rs->two_side != rs->two_side ||
+   old_rs->multisample_enable != rs->multisample_enable ||
+   old_rs->poly_stipple_enable != rs->poly_stipple_enable ||
+   old_rs->poly_smooth != rs->poly_smooth ||
+   old_rs->line_smooth != rs->line_smooth ||
+   old_rs->clamp_fragment_color != rs->clamp_fragment_color ||
+   old_rs->force_persample_interp != rs->force_persample_interp)
+   sctx->do_update_shaders = true;
  }
  
  static void si_delete_rs_state(struct pipe_context *ctx, void *state)

  {
struct si_context *sctx = (struct si_context *)ctx;
  
  	if 

Re: [Mesa-dev] [PATCH 04/10] i965/blorp: Inline gen6_blorp_exec

2017-06-06 Thread Pohjolainen, Topi
On Tue, Jun 06, 2017 at 08:35:06PM +0300, Pohjolainen, Topi wrote:
> On Mon, Jun 05, 2017 at 05:55:39PM -0700, Jason Ekstrand wrote:
> > ---
> >  src/mesa/drivers/dri/i965/brw_blorp.c | 29 +++--
> >  1 file changed, 11 insertions(+), 18 deletions(-)
> 
> Patches 1-4:
> 
> Reviewed-by: Topi Pohjolainen 

In fact patches 5 and 7-10 also:

Reviewed-by: Topi Pohjolainen 

