Re: [Mesa-dev] [PATCH v3 026/104] nir: Support deref instructions in propagate_invariant

2018-04-06 Thread Jason Ekstrand
On Fri, Apr 6, 2018 at 4:22 PM, Caio Marcelo de Oliveira Filho <
caio.olive...@intel.com> wrote:

> > +static nir_variable *
> > +intrinsic_get_var(nir_intrinsic_instr *intrin, unsigned i)
> > +{
> > +   if (nir_intrinsic_infos[intrin->intrinsic].num_variables == 0)
> > +  return nir_deref_instr_get_variable(nir_src_as_deref(intrin->src[i]));
> > +   else
> > +  return intrin->variables[0]->var;
>
> Should this be intrin->variables[i]->var?
>

Yes, yes it should.
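
For readers following the thread, the fix agreed on above can be sketched in isolation. This is only an illustration of the legacy (non-deref) branch: the `mock_` types below are hypothetical stand-ins, not the real NIR structures, and the deref-instruction branch is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the NIR types; only the fields needed here. */
struct mock_var { int id; };
struct mock_deref_var { struct mock_var *var; };
struct mock_intrin {
   unsigned num_variables;
   struct mock_deref_var *variables[2];
};

/* Legacy path only: return the variable behind the i-th deref.
 * Indexing variables[0] here, as the reviewed hunk did, would return the
 * first variable for every value of i. */
static struct mock_var *
mock_intrinsic_get_var(const struct mock_intrin *intrin, unsigned i)
{
   assert(i < intrin->num_variables);
   return intrin->variables[i]->var;
}
```

With an intrinsic that carries two variables, index 1 now yields the second variable rather than a copy of the first.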
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v3 023/104] nir/deref: Add a deref cleanup function

2018-04-06 Thread Jason Ekstrand
On Fri, Apr 6, 2018 at 2:05 PM, Caio Marcelo de Oliveira Filho <
caio.olive...@intel.com> wrote:

> On Tue, Apr 03, 2018 at 11:32:50AM -0700, Jason Ekstrand wrote:
> > Sometimes it's useful for a pass to be able to clean up its own derefs
> > instead of waiting for DCE.  This little helper makes it very easy.
> > ---
> >  src/compiler/nir/nir.h   |  2 ++
> >  src/compiler/nir/nir_deref.c | 13 +
> >  2 files changed, 15 insertions(+)
>
>
> The helper is used in earlier patches, so maybe reorder. If I'm not
> mistaken it is used as early as patch 13 ("nir: Support deref
> instructions in remove_dead_variables").
>

Yeah, I noticed this while rebasing.  The earliest use appears to be in
lower_var_copies.  I've moved it to right before that patch.


Re: [Mesa-dev] [PATCH v3 022/104] nir: Support deref instructions in lower_indirect_derefs

2018-04-06 Thread Jason Ekstrand
On Fri, Apr 6, 2018 at 2:23 PM, Caio Marcelo de Oliveira Filho <
caio.olive...@intel.com> wrote:

> Hi,
>
> > +static void
> > +emit_load_store_deref(nir_builder *b, nir_intrinsic_instr *orig_instr,
> > +  nir_deref_instr *parent,
> > +  nir_deref_instr **deref_arr,
> > +  nir_ssa_def **dest, nir_ssa_def *src)
> > +{
> > +   for (; *deref_arr; deref_arr++) {
> > +  nir_deref_instr *deref = *deref_arr;
> > +  if (deref->deref_type == nir_deref_type_array &&
> > +  nir_src_as_const_value(deref->arr.index) == NULL) {
> > + int length = glsl_get_length(parent->type);
> > +
> > + emit_indirect_load_store_deref(b, orig_instr, parent, deref_arr,
> > +0, length, dest, src);
>
> Side note: after reading the existing code (that goes from
> -base_offset to length - base_offset, and later adds base_offset), I'm
> kind of glad this goes from 0 to length.
>
>
> > +static bool
> > +lower_indirect_derefs_block(nir_block *block, nir_builder *b,
> > +nir_variable_mode modes)
> > +{
>
> (...)
>
> > +  nir_deref_instr *deref =
> > + nir_instr_as_deref(intrin->src[0].ssa->parent_instr);
>
> Maybe use the helper 'nir_src_as_deref(intrin->src[0])'?
>
> > +
> > +  /* Walk the deref chain back to the base and look for indirects */
> > +  bool has_indirect = false;
> > +  nir_deref_instr *base = deref;
> > +  while (base->deref_type != nir_deref_type_var) {
> > + if (base->deref_type == nir_deref_type_array &&
> > + nir_src_as_const_value(base->arr.index) == NULL)
> > +has_indirect = true;
> > +
> > + base = nir_instr_as_deref(base->parent.ssa->parent_instr);
>
> Maybe use the helper 'base = nir_deref_instr_parent(base);'?
>

Yeah, due to rebasing, this patch predates those helpers.  Changed locally.
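
The chain walk discussed in this review can be sketched on its own. The following is a simplified model under stated assumptions: `mock_deref` and its fields are hypothetical stand-ins for `nir_deref_instr`, and following `parent` plays the role of the `nir_deref_instr_parent()` helper suggested above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model of a deref chain. */
enum mock_deref_type { MOCK_DEREF_VAR, MOCK_DEREF_ARRAY, MOCK_DEREF_STRUCT };

struct mock_deref {
   enum mock_deref_type type;
   struct mock_deref *parent;   /* NULL only for MOCK_DEREF_VAR */
   bool index_is_const;         /* meaningful only for MOCK_DEREF_ARRAY */
};

/* Walk from the leaf back to the variable deref and report whether any
 * array deref on the way uses a non-constant index. The base of the chain
 * is returned through base_out when it is non-NULL. */
static bool
mock_chain_has_indirect(const struct mock_deref *deref,
                        const struct mock_deref **base_out)
{
   bool has_indirect = false;
   const struct mock_deref *base = deref;

   while (base->type != MOCK_DEREF_VAR) {
      if (base->type == MOCK_DEREF_ARRAY && !base->index_is_const)
         has_indirect = true;

      assert(base->parent != NULL);
      base = base->parent;
   }

   if (base_out)
      *base_out = base;
   return has_indirect;
}
```

The walk keeps going after the first indirect is found because the caller in the patch also needs the base of the chain.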


Re: [Mesa-dev] [PATCH v3 024/104] nir: Support deref instructions in lower_system_values

2018-04-06 Thread Jason Ekstrand
On Fri, Apr 6, 2018 at 2:47 PM, Caio Marcelo de Oliveira Filho <
caio.olive...@intel.com> wrote:

> On Tue, Apr 03, 2018 at 11:32:51AM -0700, Jason Ekstrand wrote:
> > ---
> >  src/compiler/nir/nir_lower_system_values.c | 13 ++---
> >  1 file changed, 10 insertions(+), 3 deletions(-)
> >
> > diff --git a/src/compiler/nir/nir_lower_system_values.c b/src/compiler/nir/nir_lower_system_values.c
> > index fb560ee..104df51 100644
> > --- a/src/compiler/nir/nir_lower_system_values.c
> > +++ b/src/compiler/nir/nir_lower_system_values.c
> > @@ -39,10 +39,15 @@ convert_block(nir_block *block, nir_builder *b)
> >
> >nir_intrinsic_instr *load_var = nir_instr_as_intrinsic(instr);
> >
> > -  if (load_var->intrinsic != nir_intrinsic_load_var)
> > - continue;
> > +  nir_variable *var;
> > +  if (load_var->intrinsic == nir_intrinsic_load_var) {
> > + var = load_var->variables[0]->var;
> > +  } else if (load_var->intrinsic == nir_intrinsic_load_deref) {
> > + var = nir_deref_instr_get_variable(nir_src_as_deref(load_var->src[0]));
>
> Question: nir_deref_instr_get_variable will walk the deref instr
> chain, but does it even make sense if there's an array or struct in
> this deref chain? Should this be asserted?
>

That's an interesting question.  Certainly, at this point in the patch
series, we can't make that assumption.  This is because we haven't checked
the mode yet.  However, once we can assume deref instructions, we can check
the mode and then go on to find the var.  Maybe something like this
(untested):

https://gitlab.freedesktop.org/jekstrand/mesa/commit/33aee39955eff842d6ea263da2f36e60287e62ee


Re: [Mesa-dev] [PATCH v3 016/104] nir: Use nir_builder in lower_io_to_temporaries

2018-04-06 Thread Jason Ekstrand
On Fri, Apr 6, 2018 at 2:51 PM, Caio Marcelo de Oliveira Filho <
caio.olive...@intel.com> wrote:

> On Tue, Apr 03, 2018 at 11:32:43AM -0700, Jason Ekstrand wrote:
> > ---
> >  src/compiler/nir/nir_lower_io_to_temporaries.c | 37
> --
> >  1 file changed, 17 insertions(+), 20 deletions(-)
>
> This one could land before the rest of the series.
>

Yes, it can.


> Reviewed-by: Caio Marcelo de Oliveira Filho 
>

Thanks!


[Mesa-dev] [PATCH 1/3] radeonsi: implement mechanism for IBs without partial flushes at the end (v6)

2018-04-06 Thread Marek Olšák
From: Marek Olšák 

(This patch doesn't enable the behavior. It will be enabled in a later
commit.)

Draw calls from multiple IBs can be executed in parallel.

v2: do emit partial flushes on SI
v3: invalidate all shader caches at the beginning of IBs
v4: don't call si_emit_cache_flush in si_flush_gfx_cs if not needed,
only do this for flushes invoked internally
v5: empty IBs should wait for idle if the flush requires it
v6: split the commit

If we artificially limit the number of draw calls per IB to 5, we'll get
a lot more IBs, leading to a lot more partial flushes. Let's see how
the removal of partial flushes changes GPU utilization in that scenario:

With partial flushes (time busy):
CP: 99%
SPI: 86%
CB: 73%

Without partial flushes (time busy):
CP: 99%
SPI: 93%
CB: 81%
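
As a reading aid for the changelog above (v2 and v5 in particular), here is a rough sketch of how the wait flags are chosen and when a flush is dropped. It mirrors the si_flush_gfx_cs hunk in this patch but is not driver code: every `mock_`/`MOCK_` name and the bit values of the first three flags are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for chip_class and the SI_CONTEXT_* / RADEON_FLUSH_* bits. */
enum mock_chip { MOCK_SI, MOCK_CIK, MOCK_VI, MOCK_GFX9 };

#define MOCK_PS_PARTIAL_FLUSH   (1u << 0)
#define MOCK_CS_PARTIAL_FLUSH   (1u << 1)
#define MOCK_INV_GLOBAL_L2      (1u << 2)
#define MOCK_START_NEXT_IB_NOW  (1u << 31)

/* Decide which waits must be emitted at the end of the IB. */
static unsigned
mock_wait_flags(enum mock_chip chip, unsigned drm_minor, unsigned flush_flags)
{
   unsigned wait = 0;

   assert(chip <= MOCK_GFX9);

   if (chip == MOCK_VI && drm_minor <= 1) {
      /* Old kernel on VI: always wait and invalidate L2. */
      wait |= MOCK_PS_PARTIAL_FLUSH | MOCK_CS_PARTIAL_FLUSH |
              MOCK_INV_GLOBAL_L2;
   } else if (chip == MOCK_SI) {
      /* v2: SI always emits the partial flushes. */
      wait |= MOCK_PS_PARTIAL_FLUSH | MOCK_CS_PARTIAL_FLUSH;
   } else if (!(flush_flags & MOCK_START_NEXT_IB_NOW)) {
      /* The caller did not allow the next IB to start early. */
      wait |= MOCK_PS_PARTIAL_FLUSH | MOCK_CS_PARTIAL_FLUSH;
   }
   return wait;
}

/* v5: an empty IB is skipped only if no wait is needed or the previous
 * IB is no longer busy. */
static bool
mock_flush_is_noop(bool ib_has_commands, unsigned wait, bool last_ib_busy)
{
   return !ib_has_commands && (!wait || !last_ib_busy);
}
```

Only the last branch depends on the new flag, which is why the SI and old-kernel VI paths behave as before.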
---
 src/gallium/drivers/radeon/radeon_winsys.h |  7 
 src/gallium/drivers/radeonsi/si_gfx_cs.c   | 52 ++
 src/gallium/drivers/radeonsi/si_pipe.h |  1 +
 3 files changed, 46 insertions(+), 14 deletions(-)

diff --git a/src/gallium/drivers/radeon/radeon_winsys.h b/src/gallium/drivers/radeon/radeon_winsys.h
index 157b2e40550..fae4fb7a95d 100644
--- a/src/gallium/drivers/radeon/radeon_winsys.h
+++ b/src/gallium/drivers/radeon/radeon_winsys.h
@@ -21,20 +21,27 @@
  * THE AUTHOR(S) AND/OR THEIR SUPPLIERS BE LIABLE FOR ANY CLAIM,
  * DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
  * OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
  * USE OR OTHER DEALINGS IN THE SOFTWARE. */
 
 #ifndef RADEON_WINSYS_H
 #define RADEON_WINSYS_H
 
 /* The public winsys interface header for the radeon driver. */
 
+/* Whether the next IB can start immediately and not wait for draws and
+ * dispatches from the current IB to finish. */
+#define RADEON_FLUSH_START_NEXT_GFX_IB_NOW (1u << 31)
+
+#define RADEON_FLUSH_ASYNC_START_NEXT_GFX_IB_NOW \
+   (PIPE_FLUSH_ASYNC | RADEON_FLUSH_START_NEXT_GFX_IB_NOW)
+
 #include "pipebuffer/pb_buffer.h"
 
 #include "amd/common/ac_gpu_info.h"
 #include "amd/common/ac_surface.h"
 
 /* Tiling flags. */
 enum radeon_bo_layout {
 RADEON_LAYOUT_LINEAR = 0,
 RADEON_LAYOUT_TILED,
 RADEON_LAYOUT_SQUARETILED,
diff --git a/src/gallium/drivers/radeonsi/si_gfx_cs.c b/src/gallium/drivers/radeonsi/si_gfx_cs.c
index 2d5e510b19e..63bff29e63a 100644
--- a/src/gallium/drivers/radeonsi/si_gfx_cs.c
+++ b/src/gallium/drivers/radeonsi/si_gfx_cs.c
@@ -62,25 +62,42 @@ void si_need_gfx_cs_space(struct si_context *ctx)
unsigned need_dwords = 2048 + ctx->num_cs_dw_queries_suspend;
if (!ctx->ws->cs_check_space(cs, need_dwords))
si_flush_gfx_cs(ctx, PIPE_FLUSH_ASYNC, NULL);
 }
 
 void si_flush_gfx_cs(struct si_context *ctx, unsigned flags,
 struct pipe_fence_handle **fence)
 {
struct radeon_winsys_cs *cs = ctx->gfx_cs;
struct radeon_winsys *ws = ctx->ws;
+   unsigned wait_flags = 0;
 
if (ctx->gfx_flush_in_progress)
return;
 
-   if (!radeon_emitted(cs, ctx->initial_gfx_cs_size))
+   if (ctx->chip_class == VI && ctx->screen->info.drm_minor <= 1) {
+   /* DRM 3.1.0 doesn't flush TC for VI correctly. */
+   wait_flags |= SI_CONTEXT_PS_PARTIAL_FLUSH |
+ SI_CONTEXT_CS_PARTIAL_FLUSH |
+ SI_CONTEXT_INV_GLOBAL_L2;
+   } else if (ctx->chip_class == SI) {
+   /* The kernel flushes L2 before shaders are finished. */
+   wait_flags |= SI_CONTEXT_PS_PARTIAL_FLUSH |
+ SI_CONTEXT_CS_PARTIAL_FLUSH;
+   } else if (!(flags & RADEON_FLUSH_START_NEXT_GFX_IB_NOW)) {
+   wait_flags |= SI_CONTEXT_PS_PARTIAL_FLUSH |
+ SI_CONTEXT_CS_PARTIAL_FLUSH;
+   }
+
+   /* Drop this flush if it's a no-op. */
+   if (!radeon_emitted(cs, ctx->initial_gfx_cs_size) &&
+   (!wait_flags || !ctx->gfx_last_ib_is_busy))
return;
 
if (si_check_device_reset(ctx))
return;
 
if (ctx->screen->debug_flags & DBG(CHECK_VM))
flags &= ~PIPE_FLUSH_ASYNC;
 
/* If the state tracker is flushing the GFX IB, si_flush_from_st is
 * responsible for flushing the DMA IB and merging the fences from both.
@@ -96,27 +113,25 @@ void si_flush_gfx_cs(struct si_context *ctx, unsigned flags,
 
if (!LIST_IS_EMPTY(&ctx->active_queries))
si_suspend_queries(ctx);
 
ctx->streamout.suspended = false;
if (ctx->streamout.begin_emitted) {
si_emit_streamout_end(ctx);
ctx->streamout.suspended = true;
}
 
-   ctx->flags |= SI_CONTEXT_CS_PARTIAL_FLUSH |
-   SI_CONTEXT_PS_PARTIAL_FLUSH;
-
-   /* DRM 3.1.0 doesn't flush TC for VI correctly. */
-   if (ctx->chip_class == VI && ctx->screen->info.drm_minor <= 1)

[Mesa-dev] [PATCH 2/3] winsys/amdgpu: always set AMDGPU_IB_FLAG_TC_WB_NOT_INVALIDATE

2018-04-06 Thread Marek Olšák
From: Marek Olšák 

There is a kernel patch that adds the new flag.

Reviewed-by: Samuel Pitoiset 
---
 src/gallium/winsys/amdgpu/drm/amdgpu_cs.c | 36 ++-
 1 file changed, 26 insertions(+), 10 deletions(-)

diff --git a/src/gallium/winsys/amdgpu/drm/amdgpu_cs.c b/src/gallium/winsys/amdgpu/drm/amdgpu_cs.c
index a3feeb93026..eb050b8fdb2 100644
--- a/src/gallium/winsys/amdgpu/drm/amdgpu_cs.c
+++ b/src/gallium/winsys/amdgpu/drm/amdgpu_cs.c
@@ -26,20 +26,24 @@
  * of the Software.
  */
 
 #include "amdgpu_cs.h"
 #include "util/os_time.h"
 #include 
 #include 
 
 #include "amd/common/sid.h"
 
+#ifndef AMDGPU_IB_FLAG_TC_WB_NOT_INVALIDATE
+#define AMDGPU_IB_FLAG_TC_WB_NOT_INVALIDATE (1 << 3)
+#endif
+
 DEBUG_GET_ONCE_BOOL_OPTION(noop, "RADEON_NOOP", false)
 
 /* FENCES */
 
 static struct pipe_fence_handle *
 amdgpu_fence_create(struct amdgpu_ctx *ctx, unsigned ip_type,
 unsigned ip_instance, unsigned ring)
 {
struct amdgpu_fence *fence = CALLOC_STRUCT(amdgpu_fence);
 
@@ -801,56 +805,68 @@ static void amdgpu_set_ib_size(struct amdgpu_ib *ib)
 }
 
 static void amdgpu_ib_finalize(struct amdgpu_winsys *ws, struct amdgpu_ib *ib)
 {
amdgpu_set_ib_size(ib);
ib->used_ib_space += ib->base.current.cdw * 4;
ib->used_ib_space = align(ib->used_ib_space, ws->info.ib_start_alignment);
ib->max_ib_size = MAX2(ib->max_ib_size, ib->base.prev_dw + ib->base.current.cdw);
 }
 
-static bool amdgpu_init_cs_context(struct amdgpu_cs_context *cs,
+static bool amdgpu_init_cs_context(struct amdgpu_winsys *ws,
+   struct amdgpu_cs_context *cs,
enum ring_type ring_type)
 {
switch (ring_type) {
case RING_DMA:
   cs->ib[IB_MAIN].ip_type = AMDGPU_HW_IP_DMA;
   break;
 
case RING_UVD:
   cs->ib[IB_MAIN].ip_type = AMDGPU_HW_IP_UVD;
   break;
 
case RING_UVD_ENC:
   cs->ib[IB_MAIN].ip_type = AMDGPU_HW_IP_UVD_ENC;
   break;
 
case RING_VCE:
   cs->ib[IB_MAIN].ip_type = AMDGPU_HW_IP_VCE;
   break;
 
-   case RING_COMPUTE:
-  cs->ib[IB_MAIN].ip_type = AMDGPU_HW_IP_COMPUTE;
-  break;
-
case RING_VCN_DEC:
   cs->ib[IB_MAIN].ip_type = AMDGPU_HW_IP_VCN_DEC;
   break;
 
-  case RING_VCN_ENC:
+   case RING_VCN_ENC:
   cs->ib[IB_MAIN].ip_type = AMDGPU_HW_IP_VCN_ENC;
   break;
 
-   default:
+   case RING_COMPUTE:
case RING_GFX:
-  cs->ib[IB_MAIN].ip_type = AMDGPU_HW_IP_GFX;
+  cs->ib[IB_MAIN].ip_type = ring_type == RING_GFX ? AMDGPU_HW_IP_GFX :
+AMDGPU_HW_IP_COMPUTE;
+
+  /* The kernel shouldn't invalidate L2 and vL1. The proper place for cache
+   * invalidation is the beginning of IBs (the previous commit does that),
+   * because completion of an IB doesn't care about the state of GPU caches,
+   * but the beginning of an IB does. Draw calls from multiple IBs can be
+   * executed in parallel, so draw calls from the current IB can finish after
+   * the next IB starts drawing, and so the cache flush at the end of IB
+   * is always late.
+   */
+  if (ws->info.drm_minor >= 26)
+ cs->ib[IB_MAIN].flags = AMDGPU_IB_FLAG_TC_WB_NOT_INVALIDATE;
   break;
+
+   default:
+  assert(0);
}
 
memset(cs->buffer_indices_hashlist, -1, sizeof(cs->buffer_indices_hashlist));
cs->last_added_bo = NULL;
return true;
 }
 
 static void amdgpu_cs_context_cleanup(struct amdgpu_cs_context *cs)
 {
unsigned i;
@@ -918,26 +934,26 @@ amdgpu_cs_create(struct radeon_winsys_ctx *rwctx,
cs->flush_data = flush_ctx;
cs->ring_type = ring_type;
 
struct amdgpu_cs_fence_info fence_info;
fence_info.handle = cs->ctx->user_fence_bo;
fence_info.offset = cs->ring_type;
amdgpu_cs_chunk_fence_info_to_data(&fence_info, (void*)&cs->fence_chunk);
 
cs->main.ib_type = IB_MAIN;
 
-   if (!amdgpu_init_cs_context(&cs->csc1, ring_type)) {
+   if (!amdgpu_init_cs_context(ctx->ws, &cs->csc1, ring_type)) {
   FREE(cs);
   return NULL;
}
 
-   if (!amdgpu_init_cs_context(&cs->csc2, ring_type)) {
+   if (!amdgpu_init_cs_context(ctx->ws, &cs->csc2, ring_type)) {
   amdgpu_destroy_cs_context(&cs->csc1);
   FREE(cs);
   return NULL;
}
 
/* Set the first submission context as current. */
cs->csc = &cs->csc1;
cs->cst = &cs->csc2;
 
if (!amdgpu_get_new_ib(&ctx->ws->base, cs, IB_MAIN)) {
-- 
2.15.1
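
The ring-type switch in this patch can be read as the small model below. It is a sketch under stated assumptions, not the winsys code: all `mock_`/`MOCK_` names are hypothetical, only three ring types are modelled, and an unknown ring returns false here where the patch uses assert(0). The flag value (1 << 3) and the drm_minor >= 26 check are taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced set of ring and IP types. */
enum mock_ring { MOCK_RING_GFX, MOCK_RING_COMPUTE, MOCK_RING_DMA, MOCK_RING_LAST };
enum mock_ip { MOCK_IP_GFX, MOCK_IP_COMPUTE, MOCK_IP_DMA };

/* Same bit as the fallback #define added by the patch. */
#define MOCK_IB_FLAG_TC_WB_NOT_INVALIDATE (1u << 3)

struct mock_ib {
   enum mock_ip ip_type;
   unsigned flags;
};

/* GFX and compute share one case: both get the "write back, do not
 * invalidate" flag when the kernel is new enough. Other rings get no flag. */
static bool
mock_init_ib(struct mock_ib *ib, enum mock_ring ring, unsigned drm_minor)
{
   assert(ib != NULL);
   ib->flags = 0;   /* assumption: the real struct starts zeroed */

   switch (ring) {
   case MOCK_RING_DMA:
      ib->ip_type = MOCK_IP_DMA;
      return true;

   case MOCK_RING_COMPUTE:
   case MOCK_RING_GFX:
      ib->ip_type = ring == MOCK_RING_GFX ? MOCK_IP_GFX : MOCK_IP_COMPUTE;
      if (drm_minor >= 26)
         ib->flags = MOCK_IB_FLAG_TC_WB_NOT_INVALIDATE;
      return true;

   default:
      return false;
   }
}
```

Gating on the kernel version keeps older kernels, which do not know the flag, on the previous behaviour.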



[Mesa-dev] [PATCH 3/3] radeonsi: don't emit partial flushes for internal CS flushes only

2018-04-06 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_buffer.c|  6 +++---
 src/gallium/drivers/radeonsi/si_dma_cs.c|  2 +-
 src/gallium/drivers/radeonsi/si_fence.c |  5 -
 src/gallium/drivers/radeonsi/si_gfx_cs.c|  4 ++--
 src/gallium/drivers/radeonsi/si_pipe.h  |  2 +-
 src/gallium/drivers/radeonsi/si_state_shaders.c |  4 ++--
 src/gallium/drivers/radeonsi/si_texture.c   |  2 +-
 src/gallium/winsys/amdgpu/drm/amdgpu_bo.c   | 12 
 src/gallium/winsys/radeon/drm/radeon_drm_bo.c   | 12 
 src/gallium/winsys/radeon/drm/radeon_drm_cs.c   |  3 ++-
 10 files changed, 32 insertions(+), 20 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_buffer.c b/src/gallium/drivers/radeonsi/si_buffer.c
index 1420702d8d4..d17b2c6a831 100644
--- a/src/gallium/drivers/radeonsi/si_buffer.c
+++ b/src/gallium/drivers/radeonsi/si_buffer.c
@@ -57,24 +57,24 @@ void *si_buffer_map_sync_with_rings(struct si_context *sctx,
 
if (!(usage & PIPE_TRANSFER_WRITE)) {
/* have to wait for the last write */
rusage = RADEON_USAGE_WRITE;
}
 
if (radeon_emitted(sctx->gfx_cs, sctx->initial_gfx_cs_size) &&
sctx->ws->cs_is_buffer_referenced(sctx->gfx_cs,
resource->buf, rusage)) {
if (usage & PIPE_TRANSFER_DONTBLOCK) {
-   si_flush_gfx_cs(sctx, PIPE_FLUSH_ASYNC, NULL);
+   si_flush_gfx_cs(sctx, RADEON_FLUSH_ASYNC_START_NEXT_GFX_IB_NOW, NULL);
return NULL;
} else {
-   si_flush_gfx_cs(sctx, 0, NULL);
+   si_flush_gfx_cs(sctx, RADEON_FLUSH_ASYNC_START_NEXT_GFX_IB_NOW, NULL);
busy = true;
}
}
if (radeon_emitted(sctx->dma_cs, 0) &&
sctx->ws->cs_is_buffer_referenced(sctx->dma_cs,
resource->buf, rusage)) {
if (usage & PIPE_TRANSFER_DONTBLOCK) {
si_flush_dma_cs(sctx, PIPE_FLUSH_ASYNC, NULL);
return NULL;
} else {
@@ -718,21 +718,21 @@ static bool si_resource_commit(struct pipe_context *pctx,
/*
 * Since buffer commitment changes cannot be pipelined, we need to
 * (a) flush any pending commands that refer to the buffer we're about
 * to change, and
 * (b) wait for threaded submit to finish, including those that were
 * triggered by some other, earlier operation.
 */
if (radeon_emitted(ctx->gfx_cs, ctx->initial_gfx_cs_size) &&
ctx->ws->cs_is_buffer_referenced(ctx->gfx_cs,
   res->buf, RADEON_USAGE_READWRITE)) {
-   si_flush_gfx_cs(ctx, PIPE_FLUSH_ASYNC, NULL);
+   si_flush_gfx_cs(ctx, RADEON_FLUSH_ASYNC_START_NEXT_GFX_IB_NOW, NULL);
}
if (radeon_emitted(ctx->dma_cs, 0) &&
ctx->ws->cs_is_buffer_referenced(ctx->dma_cs,
   res->buf, RADEON_USAGE_READWRITE)) {
si_flush_dma_cs(ctx, PIPE_FLUSH_ASYNC, NULL);
}
 
ctx->ws->cs_sync_flush(ctx->dma_cs);
ctx->ws->cs_sync_flush(ctx->gfx_cs);
 
diff --git a/src/gallium/drivers/radeonsi/si_dma_cs.c b/src/gallium/drivers/radeonsi/si_dma_cs.c
index 7af7c5623b7..1eefaeb6ad5 100644
--- a/src/gallium/drivers/radeonsi/si_dma_cs.c
+++ b/src/gallium/drivers/radeonsi/si_dma_cs.c
@@ -51,21 +51,21 @@ void si_need_dma_space(struct si_context *ctx, unsigned num_dw,
}
 
/* Flush the GFX IB if DMA depends on it. */
if (radeon_emitted(ctx->gfx_cs, ctx->initial_gfx_cs_size) &&
((dst &&
  ctx->ws->cs_is_buffer_referenced(ctx->gfx_cs, dst->buf,
 RADEON_USAGE_READWRITE)) ||
 (src &&
  ctx->ws->cs_is_buffer_referenced(ctx->gfx_cs, src->buf,
 RADEON_USAGE_WRITE
-   si_flush_gfx_cs(ctx, PIPE_FLUSH_ASYNC, NULL);
+   si_flush_gfx_cs(ctx, RADEON_FLUSH_ASYNC_START_NEXT_GFX_IB_NOW, NULL);
 
/* Flush if there's not enough space, or if the memory usage per IB
 * is too large.
 *
 * IBs using too little memory are limited by the IB submission overhead.
 * IBs using too much memory are limited by the kernel/TTM overhead.
 * Too long IBs create CPU-GPU pipeline bubbles and add latency.
 *
 * This heuristic makes sure that DMA requests are executed
 * very soon after the call is made and lowers memory usage.
diff --git a/src/gallium/drivers/radeonsi/si_fence.c b/src/gallium/drivers/radeonsi/si_fence.c
index 26d6c43b34d..19fcb96041f 100644
--- a/src/gallium/drivers/radeonsi/si_fence.c

[Mesa-dev] [PATCH] radeonsi: don't emit partial flushes at the end of IBs for internal flushes (v5)

2018-04-06 Thread Marek Olšák
From: Marek Olšák 

Now draw calls from multiple IBs can be executed in parallel.

v2: do emit partial flushes on SI
v3: invalidate all shader caches at the beginning of IBs
v4: squash with the AMDGPU_IB_FLAG_TC_WB_NOT_INVALIDATE change,
don't call si_emit_cache_flush in si_flush_gfx_cs if not needed,
only do this for flushes invoked internally
v5: empty IBs should wait for idle if the flush requires it

If we artificially limit the number of draw calls per IB to 5, we'll get
a lot more IBs, leading to a lot more partial flushes. Let's see how
the removal of partial flushes changes GPU utilization in that scenario:

With partial flushes (time busy):
CP: 99%
SPI: 86%
CB: 73%

Without partial flushes (time busy):
CP: 99%
SPI: 93%
CB: 81%
---
 src/gallium/drivers/radeon/radeon_winsys.h  |  7 
 src/gallium/drivers/radeonsi/si_buffer.c|  6 +--
 src/gallium/drivers/radeonsi/si_dma_cs.c|  2 +-
 src/gallium/drivers/radeonsi/si_fence.c |  5 ++-
 src/gallium/drivers/radeonsi/si_gfx_cs.c| 56 ++---
 src/gallium/drivers/radeonsi/si_pipe.h  |  3 +-
 src/gallium/drivers/radeonsi/si_state_shaders.c |  4 +-
 src/gallium/drivers/radeonsi/si_texture.c   |  2 +-
 src/gallium/winsys/amdgpu/drm/amdgpu_bo.c   | 12 --
 src/gallium/winsys/amdgpu/drm/amdgpu_cs.c   | 36 +++-
 src/gallium/winsys/radeon/drm/radeon_drm_bo.c   | 12 --
 src/gallium/winsys/radeon/drm/radeon_drm_cs.c   |  3 +-
 12 files changed, 104 insertions(+), 44 deletions(-)

diff --git a/src/gallium/drivers/radeon/radeon_winsys.h b/src/gallium/drivers/radeon/radeon_winsys.h
index 157b2e40550..fae4fb7a95d 100644
--- a/src/gallium/drivers/radeon/radeon_winsys.h
+++ b/src/gallium/drivers/radeon/radeon_winsys.h
@@ -21,20 +21,27 @@
  * THE AUTHOR(S) AND/OR THEIR SUPPLIERS BE LIABLE FOR ANY CLAIM,
  * DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
  * OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
  * USE OR OTHER DEALINGS IN THE SOFTWARE. */
 
 #ifndef RADEON_WINSYS_H
 #define RADEON_WINSYS_H
 
 /* The public winsys interface header for the radeon driver. */
 
+/* Whether the next IB can start immediately and not wait for draws and
+ * dispatches from the current IB to finish. */
+#define RADEON_FLUSH_START_NEXT_GFX_IB_NOW (1u << 31)
+
+#define RADEON_FLUSH_ASYNC_START_NEXT_GFX_IB_NOW \
+   (PIPE_FLUSH_ASYNC | RADEON_FLUSH_START_NEXT_GFX_IB_NOW)
+
 #include "pipebuffer/pb_buffer.h"
 
 #include "amd/common/ac_gpu_info.h"
 #include "amd/common/ac_surface.h"
 
 /* Tiling flags. */
 enum radeon_bo_layout {
 RADEON_LAYOUT_LINEAR = 0,
 RADEON_LAYOUT_TILED,
 RADEON_LAYOUT_SQUARETILED,
diff --git a/src/gallium/drivers/radeonsi/si_buffer.c b/src/gallium/drivers/radeonsi/si_buffer.c
index 1420702d8d4..d17b2c6a831 100644
--- a/src/gallium/drivers/radeonsi/si_buffer.c
+++ b/src/gallium/drivers/radeonsi/si_buffer.c
@@ -57,24 +57,24 @@ void *si_buffer_map_sync_with_rings(struct si_context *sctx,
 
if (!(usage & PIPE_TRANSFER_WRITE)) {
/* have to wait for the last write */
rusage = RADEON_USAGE_WRITE;
}
 
if (radeon_emitted(sctx->gfx_cs, sctx->initial_gfx_cs_size) &&
sctx->ws->cs_is_buffer_referenced(sctx->gfx_cs,
resource->buf, rusage)) {
if (usage & PIPE_TRANSFER_DONTBLOCK) {
-   si_flush_gfx_cs(sctx, PIPE_FLUSH_ASYNC, NULL);
+   si_flush_gfx_cs(sctx, RADEON_FLUSH_ASYNC_START_NEXT_GFX_IB_NOW, NULL);
return NULL;
} else {
-   si_flush_gfx_cs(sctx, 0, NULL);
+   si_flush_gfx_cs(sctx, RADEON_FLUSH_ASYNC_START_NEXT_GFX_IB_NOW, NULL);
busy = true;
}
}
if (radeon_emitted(sctx->dma_cs, 0) &&
sctx->ws->cs_is_buffer_referenced(sctx->dma_cs,
resource->buf, rusage)) {
if (usage & PIPE_TRANSFER_DONTBLOCK) {
si_flush_dma_cs(sctx, PIPE_FLUSH_ASYNC, NULL);
return NULL;
} else {
@@ -718,21 +718,21 @@ static bool si_resource_commit(struct pipe_context *pctx,
/*
 * Since buffer commitment changes cannot be pipelined, we need to
 * (a) flush any pending commands that refer to the buffer we're about
 * to change, and
 * (b) wait for threaded submit to finish, including those that were
 * triggered by some other, earlier operation.
 */
if (radeon_emitted(ctx->gfx_cs, ctx->initial_gfx_cs_size) &&
ctx->ws->cs_is_buffer_referenced(ctx->gfx_cs,
   res->buf, RADEON_USAGE_READWRITE)) {
-   si_flush_gfx_cs(ctx, 

[Mesa-dev] [PATCH] radeonsi: don't emit partial flushes at the end of IBs for internal flushes (v4)

2018-04-06 Thread Marek Olšák
From: Marek Olšák 

Now draw calls from multiple IBs can be executed in parallel.

v2: do emit partial flushes on SI
v3: invalidate all shader caches at the beginning of IBs
v4: squash with the AMDGPU_IB_FLAG_TC_WB_NOT_INVALIDATE change,
don't call si_emit_cache_flush in si_flush_gfx_cs if not needed,
only do this for flushes invoked internally

If we artificially limit the number of draw calls per IB to 5, we'll get
a lot more IBs, leading to a lot more partial flushes. Let's see how
the removal of partial flushes changes GPU utilization in that scenario:

With partial flushes (time busy):
CP: 99%
SPI: 86%
CB: 73%

Without partial flushes (time busy):
CP: 99%
SPI: 93%
CB: 81%
---
 src/gallium/drivers/radeon/radeon_winsys.h  |  7 
 src/gallium/drivers/radeonsi/si_buffer.c|  6 ++--
 src/gallium/drivers/radeonsi/si_dma_cs.c|  2 +-
 src/gallium/drivers/radeonsi/si_fence.c |  5 ++-
 src/gallium/drivers/radeonsi/si_gfx_cs.c| 48 +
 src/gallium/drivers/radeonsi/si_pipe.h  |  2 +-
 src/gallium/drivers/radeonsi/si_state_shaders.c |  4 +--
 src/gallium/drivers/radeonsi/si_texture.c   |  2 +-
 src/gallium/winsys/amdgpu/drm/amdgpu_bo.c   | 12 ---
 src/gallium/winsys/amdgpu/drm/amdgpu_cs.c   | 36 +--
 src/gallium/winsys/radeon/drm/radeon_drm_bo.c   | 12 ---
 src/gallium/winsys/radeon/drm/radeon_drm_cs.c   |  3 +-
 12 files changed, 96 insertions(+), 43 deletions(-)

diff --git a/src/gallium/drivers/radeon/radeon_winsys.h b/src/gallium/drivers/radeon/radeon_winsys.h
index 157b2e40550..fae4fb7a95d 100644
--- a/src/gallium/drivers/radeon/radeon_winsys.h
+++ b/src/gallium/drivers/radeon/radeon_winsys.h
@@ -21,20 +21,27 @@
  * THE AUTHOR(S) AND/OR THEIR SUPPLIERS BE LIABLE FOR ANY CLAIM,
  * DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
  * OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
  * USE OR OTHER DEALINGS IN THE SOFTWARE. */
 
 #ifndef RADEON_WINSYS_H
 #define RADEON_WINSYS_H
 
 /* The public winsys interface header for the radeon driver. */
 
+/* Whether the next IB can start immediately and not wait for draws and
+ * dispatches from the current IB to finish. */
+#define RADEON_FLUSH_START_NEXT_GFX_IB_NOW (1u << 31)
+
+#define RADEON_FLUSH_ASYNC_START_NEXT_GFX_IB_NOW \
+   (PIPE_FLUSH_ASYNC | RADEON_FLUSH_START_NEXT_GFX_IB_NOW)
+
 #include "pipebuffer/pb_buffer.h"
 
 #include "amd/common/ac_gpu_info.h"
 #include "amd/common/ac_surface.h"
 
 /* Tiling flags. */
 enum radeon_bo_layout {
 RADEON_LAYOUT_LINEAR = 0,
 RADEON_LAYOUT_TILED,
 RADEON_LAYOUT_SQUARETILED,
diff --git a/src/gallium/drivers/radeonsi/si_buffer.c b/src/gallium/drivers/radeonsi/si_buffer.c
index 1420702d8d4..d17b2c6a831 100644
--- a/src/gallium/drivers/radeonsi/si_buffer.c
+++ b/src/gallium/drivers/radeonsi/si_buffer.c
@@ -57,24 +57,24 @@ void *si_buffer_map_sync_with_rings(struct si_context *sctx,
 
if (!(usage & PIPE_TRANSFER_WRITE)) {
/* have to wait for the last write */
rusage = RADEON_USAGE_WRITE;
}
 
if (radeon_emitted(sctx->gfx_cs, sctx->initial_gfx_cs_size) &&
sctx->ws->cs_is_buffer_referenced(sctx->gfx_cs,
resource->buf, rusage)) {
if (usage & PIPE_TRANSFER_DONTBLOCK) {
-   si_flush_gfx_cs(sctx, PIPE_FLUSH_ASYNC, NULL);
+   si_flush_gfx_cs(sctx, RADEON_FLUSH_ASYNC_START_NEXT_GFX_IB_NOW, NULL);
return NULL;
} else {
-   si_flush_gfx_cs(sctx, 0, NULL);
+   si_flush_gfx_cs(sctx, RADEON_FLUSH_ASYNC_START_NEXT_GFX_IB_NOW, NULL);
busy = true;
}
}
if (radeon_emitted(sctx->dma_cs, 0) &&
sctx->ws->cs_is_buffer_referenced(sctx->dma_cs,
resource->buf, rusage)) {
if (usage & PIPE_TRANSFER_DONTBLOCK) {
si_flush_dma_cs(sctx, PIPE_FLUSH_ASYNC, NULL);
return NULL;
} else {
@@ -718,21 +718,21 @@ static bool si_resource_commit(struct pipe_context *pctx,
/*
 * Since buffer commitment changes cannot be pipelined, we need to
 * (a) flush any pending commands that refer to the buffer we're about
 * to change, and
 * (b) wait for threaded submit to finish, including those that were
 * triggered by some other, earlier operation.
 */
if (radeon_emitted(ctx->gfx_cs, ctx->initial_gfx_cs_size) &&
ctx->ws->cs_is_buffer_referenced(ctx->gfx_cs,
   res->buf, RADEON_USAGE_READWRITE)) {
-   si_flush_gfx_cs(ctx, PIPE_FLUSH_ASYNC, NULL);
+   

[Mesa-dev] [PATCH] intel: aubinator: print out addresses of invalid instructions

2018-04-06 Thread Lionel Landwerlin
Signed-off-by: Lionel Landwerlin 
---
 src/intel/tools/gen_batch_decoder.c | 22 +-
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/src/intel/tools/gen_batch_decoder.c b/src/intel/tools/gen_batch_decoder.c
index 1a8794c84e7..b56aea53f1d 100644
--- a/src/intel/tools/gen_batch_decoder.c
+++ b/src/intel/tools/gen_batch_decoder.c
@@ -57,6 +57,7 @@ gen_batch_decode_ctx_finish(struct gen_batch_decode_ctx *ctx)
 }
 
 #define CSI "\e["
+#define RED_COLOR    CSI "31m"
 #define BLUE_HEADER  CSI "0;44m"
 #define GREEN_HEADER CSI "1;42m"
 #define NORMAL   CSI "0m"
@@ -734,14 +735,22 @@ gen_print_batch(struct gen_batch_decode_ctx *ctx,
   length = gen_group_get_length(inst, p);
   assert(inst == NULL || length > 0);
   length = MAX2(1, length);
+
+  const char *reset_color = ctx->flags & GEN_BATCH_DECODE_IN_COLOR ? NORMAL : "";
+
+  uint64_t offset;
+  if (ctx->flags & GEN_BATCH_DECODE_OFFSETS)
+ offset = batch_addr + ((char *)p - (char *)batch);
+  else
+ offset = 0;
+
   if (inst == NULL) {
- fprintf(ctx->fp, "unknown instruction %08x\n", p[0]);
+ fprintf(ctx->fp, "%s0x%08"PRIx64": unknown instruction %08x%s\n",
+ RED_COLOR, offset, p[0], reset_color);
  continue;
   }
 
-  const char *color, *reset_color;
-  uint64_t offset;
-
+  const char *color;
   const char *inst_name = gen_group_get_name(inst);
   if (ctx->flags & GEN_BATCH_DECODE_IN_COLOR) {
  reset_color = NORMAL;
@@ -759,11 +768,6 @@ gen_print_batch(struct gen_batch_decode_ctx *ctx,
  reset_color = "";
   }
 
-  if (ctx->flags & GEN_BATCH_DECODE_OFFSETS)
- offset = batch_addr + ((char *)p - (char *)batch);
-  else
- offset = 0;
-
   fprintf(ctx->fp, "%s0x%08"PRIx64":  0x%08x:  %-80s%s\n",
   color, offset, p[0], inst_name, reset_color);
 
-- 
2.17.0
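
The address computation and the new "unknown instruction" line can be tried out in isolation with the sketch below. The helper names are hypothetical and this is not the decoder's code. One deliberate difference from the patch: the red escape sequence is emitted only when colour output is requested, whereas the hunk above passes RED_COLOR unconditionally.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Address of the dword at p, given the GPU address of the batch start.
 * Returns 0 when offsets were not requested, as in the patch. */
static uint64_t
mock_dword_address(uint64_t batch_addr, const uint32_t *batch,
                   const uint32_t *p, bool want_offsets)
{
   assert(p >= batch);
   if (!want_offsets)
      return 0;
   return batch_addr + (uint64_t)((const char *)p - (const char *)batch);
}

/* Format the line printed for a dword that matches no known instruction.
 * Returns the snprintf result (length of the formatted line). */
static int
mock_format_unknown(char *buf, size_t size, uint64_t addr, uint32_t dword,
                    bool in_color)
{
   const char *color = in_color ? "\x1b[31m" : "";
   const char *reset = in_color ? "\x1b[0m" : "";

   return snprintf(buf, size,
                   "%s0x%08" PRIx64 ": unknown instruction %08" PRIx32 "%s\n",
                   color, addr, dword, reset);
}
```

Computing the address before the NULL check is what lets the unknown-instruction path print it, which is the point of moving the offset code up in the patch.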



Re: [Mesa-dev] [PATCH 04/11] gallium: Use Array._DrawVAO in st_atom_array.c.

2018-04-06 Thread Marek Olšák
So interleaved attribs are unsupported, right?

is_interleaved_arrays was probably slowing things down, so I'm OK with that.

Marek

On Sun, Apr 1, 2018 at 2:13 PM,  wrote:

> From: Mathias Fröhlich 
>
> Finally make use of the binding information in the VAO when
> setting up arrays for draw.
>
> Signed-off-by: Mathias Fröhlich 
> ---
>  src/mesa/state_tracker/st_atom_array.c | 448
> +
>  1 file changed, 124 insertions(+), 324 deletions(-)
>
> diff --git a/src/mesa/state_tracker/st_atom_array.c b/src/mesa/state_tracker/st_atom_array.c
> index 2fd67e8d84..46934a718a 100644
> --- a/src/mesa/state_tracker/st_atom_array.c
> +++ b/src/mesa/state_tracker/st_atom_array.c
> @@ -48,6 +48,7 @@
>  #include "main/bufferobj.h"
>  #include "main/glformats.h"
>  #include "main/varray.h"
> +#include "main/arrayobj.h"
>
>  /* vertex_formats[gltype - GL_BYTE][integer*2 + normalized][size - 1] */
>  static const uint16_t vertex_formats[][4][4] = {
> @@ -306,79 +307,6 @@ st_pipe_vertex_format(const struct gl_array_attributes *attrib)
> return vertex_formats[type - GL_BYTE][index][size-1];
>  }
>
> -static const struct gl_vertex_array *
> -get_client_array(const struct gl_vertex_array *arrays,
> - unsigned mesaAttr)
> -{
> -   /* st_program uses 0x to denote a double placeholder attribute */
> -   if (mesaAttr == ST_DOUBLE_ATTRIB_PLACEHOLDER)
> -  return NULL;
> -   return &arrays[mesaAttr];
> -}
> -
> -/**
> - * Examine the active arrays to determine if we have interleaved
> - * vertex arrays all living in one VBO, or all living in user space.
> - */
> -static GLboolean
> -is_interleaved_arrays(const struct st_vertex_program *vp,
> -  const struct gl_vertex_array *arrays,
> -  unsigned num_inputs)
> -{
> -   GLuint attr;
> -   const struct gl_buffer_object *firstBufObj = NULL;
> -   GLint firstStride = -1;
> -   const GLubyte *firstPtr = NULL;
> -   GLboolean userSpaceBuffer = GL_FALSE;
> -
> -   for (attr = 0; attr < num_inputs; attr++) {
> -  const struct gl_vertex_array *array;
> -  const struct gl_vertex_buffer_binding *binding;
> -  const struct gl_array_attributes *attrib;
> -  const GLubyte *ptr;
> -  const struct gl_buffer_object *bufObj;
> -  GLsizei stride;
> -
> -  array = get_client_array(arrays, vp->index_to_input[attr]);
> -  if (!array)
> -continue;
> -
> -  binding = array->BufferBinding;
> -  attrib = array->VertexAttrib;
> -  stride = binding->Stride; /* in bytes */
> -  ptr = _mesa_vertex_attrib_address(attrib, binding);
> -
> -  /* To keep things simple, don't allow interleaved zero-stride
> attribs. */
> -  if (stride == 0)
> - return false;
> -
> -  bufObj = binding->BufferObj;
> -  if (attr == 0) {
> - /* save info about the first array */
> - firstStride = stride;
> - firstPtr = ptr;
> - firstBufObj = bufObj;
> - userSpaceBuffer = !_mesa_is_bufferobj(bufObj);
> -  }
> -  else {
> - /* check if other arrays interleave with the first, in same
> buffer */
> - if (stride != firstStride)
> -return GL_FALSE; /* strides don't match */
> -
> - if (bufObj != firstBufObj)
> -return GL_FALSE; /* arrays in different VBOs */
> -
> - if (llabs(ptr - firstPtr) > firstStride)
> -return GL_FALSE; /* arrays start too far apart */
> -
> - if ((!_mesa_is_bufferobj(bufObj)) != userSpaceBuffer)
> -return GL_FALSE; /* mix of VBO and user-space arrays */
> -  }
> -   }
> -
> -   return GL_TRUE;
> -}
> -
>  static void init_velement(struct pipe_vertex_element *velement,
>int src_offset, int format,
>int instance_divisor, int vbo_index)
> @@ -392,13 +320,14 @@ static void init_velement(struct pipe_vertex_element
> *velement,
>
>  static void init_velement_lowered(const struct st_vertex_program *vp,
>struct pipe_vertex_element *velements,
> -  int src_offset, int format,
> -  int instance_divisor, int vbo_index,
> -  int nr_components, GLboolean doubles,
> -  GLuint *attr_idx)
> +  const struct gl_array_attributes
> *attrib,
> +  int src_offset, int instance_divisor,
> +  int vbo_index, int idx)
>  {
> -   int idx = *attr_idx;
> -   if (doubles) {
> +   const unsigned format = st_pipe_vertex_format(attrib);
> +   const GLubyte nr_components = attrib->Size;
> +
> +   if (attrib->Doubles) {
>int lower_format;
>
>if (nr_components < 2)
> @@ -427,15 +356,11 @@ static void init_velement_lowered(const struct
> 

Re: [Mesa-dev] [PATCH v3 026/104] nir: Support deref instructions in propagate_invariant

2018-04-06 Thread Caio Marcelo de Oliveira Filho
> +static nir_variable *
> +intrinsic_get_var(nir_intrinsic_instr *intrin, unsigned i)
> +{
> +   if (nir_intrinsic_infos[intrin->intrinsic].num_variables == 0)
> +  return nir_deref_instr_get_variable(nir_src_as_deref(intrin->src[i]));
> +   else
> +  return intrin->variables[0]->var;

Should this be intrin->variables[i]->var?
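
To make the concern concrete, here is a minimal sketch. The toy_* types
are hypothetical stand-ins, not the real NIR definitions; the point is
only that the accessor should honor the index it is given instead of
always returning the first variable.

```c
#include <assert.h>

/* Hypothetical stand-ins for nir_variable / nir_intrinsic_instr. */
struct toy_var { int id; };

struct toy_intrin {
   unsigned num_variables;
   struct toy_var *variables[4];
};

/* Returns the i-th variable rather than unconditionally variables[0]. */
static struct toy_var *
toy_intrinsic_get_var(const struct toy_intrin *intrin, unsigned i)
{
   assert(i < intrin->num_variables);
   return intrin->variables[i];
}
```

With a hard-coded index 0, a caller asking for the second variable of a
two-variable intrinsic would silently get the first one back.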


Thanks,
Caio
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 105846] Assertion failure @ st_atom_array.c:675 when playing Natural Selection 2

2018-04-06 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=105846

--- Comment #12 from l...@protonmail.ch ---
Just crashed with MESA_VERBOSE=all and MESA_DEBUG=context, but I got a nice
error message:
Mesa: User error: GL_INVALID_OPERATION in glVertexAttribPointer(non-VBO array)
ns2_linux: ../src/mesa/state_tracker/st_atom_array.c:675:
setup_non_interleaved_attribs: Assertion `attrib->Ptr' failed.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


[Mesa-dev] [Bug 105932] OpenGL scene corrupt using VMware driver

2018-04-06 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=105932

Logan McNaughton  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |NOTABUG

--- Comment #1 from Logan McNaughton  ---
I figured out the issue: our application is reading/writing texels in shaders
without proper barriers in place. It is a bug in the application, not
the driver.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


[Mesa-dev] [PATCH v4 2/6] intel/compiler: Add a uses_firstvertex flag

2018-04-06 Thread Ian Romanick
From: Neil Roberts 

Reviewed-by: Kenneth Graunke 
---
 src/intel/compiler/brw_compiler.h | 1 +
 src/intel/compiler/brw_vec4.cpp   | 4 
 2 files changed, 5 insertions(+)

diff --git a/src/intel/compiler/brw_compiler.h 
b/src/intel/compiler/brw_compiler.h
index d3ae6499b91..ffcf577d3d1 100644
--- a/src/intel/compiler/brw_compiler.h
+++ b/src/intel/compiler/brw_compiler.h
@@ -977,6 +977,7 @@ struct brw_vs_prog_data {
bool uses_vertexid;
bool uses_instanceid;
bool uses_basevertex;
+   bool uses_firstvertex;
bool uses_baseinstance;
bool uses_drawid;
 };
diff --git a/src/intel/compiler/brw_vec4.cpp b/src/intel/compiler/brw_vec4.cpp
index 218925ccb12..9459d61af6c 100644
--- a/src/intel/compiler/brw_vec4.cpp
+++ b/src/intel/compiler/brw_vec4.cpp
@@ -2836,6 +2836,10 @@ brw_compile_vs(const struct brw_compiler *compiler, void 
*log_data,
BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX))
   prog_data->uses_basevertex = true;
 
+   if (shader->info.system_values_read &
+   BITFIELD64_BIT(SYSTEM_VALUE_FIRST_VERTEX))
+  prog_data->uses_firstvertex = true;
+
if (shader->info.system_values_read &
BITFIELD64_BIT(SYSTEM_VALUE_BASE_INSTANCE))
   prog_data->uses_baseinstance = true;
-- 
2.14.3



[Mesa-dev] [PATCH v4 1/6] compiler: Add SYSTEM_VALUE_FIRST_VERTEX and intrinsics

2018-04-06 Thread Ian Romanick
From: Antia Puentes 

This VS system value will contain the value passed as <basevertex> for
indexed draw calls or the value passed as <first> for non-indexed draw
calls. It can be used to calculate the gl_VertexID as
SYSTEM_VALUE_VERTEX_ID_ZERO_BASE plus SYSTEM_VALUE_FIRST_VERTEX.

From the OpenGL 4.6 spec, 10.4 "Drawing Commands Using Vertex Arrays":

-  Page 352:
"The index of any element transferred to the GL by DrawArraysOneInstance
is referred to as its vertex ID, and may be read by a vertex shader as
gl_VertexID.  The vertex ID of the ith element transferred is first +
i."

- Page 355:
"The index of any element transferred to the GL by
DrawElementsOneInstance is referred to as its vertex ID, and may be read
by a vertex shader as gl_VertexID.  The vertex ID of the ith element
transferred is the sum of basevertex and the value stored in the
currently bound element array buffer at offset indices + i."

Currently the gl_VertexID calculation uses SYSTEM_VALUE_BASE_VERTEX but
this will have to change when the value of gl_BaseVertex is
fixed. Currently its value is broken for non-indexed draw calls because
it must be zero but we are setting it to <first>.
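
As a rough sketch of the intended semantics (the toy_draw struct and the
toy_* helpers are hypothetical and only mirror the description above,
they are not Mesa code):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified draw description; not Mesa's _mesa_prim. */
struct toy_draw {
   bool indexed;
   int basevertex;   /* only meaningful for indexed draws */
   int first;        /* only meaningful for non-indexed draws */
};

/* Value that SYSTEM_VALUE_FIRST_VERTEX is described as carrying. */
static int toy_first_vertex(const struct toy_draw *d)
{
   return d->indexed ? d->basevertex : d->first;
}

/* Value gl_BaseVertex should have once it is fixed. */
static int toy_base_vertex(const struct toy_draw *d)
{
   return d->indexed ? d->basevertex : 0;
}

/* gl_VertexID rebuilt from the zero-based ID plus the first vertex. */
static int toy_vertex_id(const struct toy_draw *d, int vertex_id_zero_base)
{
   assert(vertex_id_zero_base >= 0);
   return vertex_id_zero_base + toy_first_vertex(d);
}
```

For a non-indexed draw the two values differ (first vs. zero), which is
why the gl_VertexID calculation cannot keep using the base vertex once
gl_BaseVertex is corrected.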

v2: use SYSTEM_VALUE_FIRST_VERTEX as name for the value, instead of
SYSTEM_VALUE_BASE_VERTEX_ID (Kenneth).

v3 (idr): Rebase on Rob Clark converting nir_intrinsics.h to be
generated.  Reformat commit message to 72 columns.

Reviewed-by: Neil Roberts 
Reviewed-by: Kenneth Graunke 
---
 src/compiler/nir/nir.c |  4 
 src/compiler/nir/nir_gather_info.c |  1 +
 src/compiler/nir/nir_intrinsics.py |  1 +
 src/compiler/shader_enums.c|  1 +
 src/compiler/shader_enums.h| 14 ++
 5 files changed, 21 insertions(+)

diff --git a/src/compiler/nir/nir.c b/src/compiler/nir/nir.c
index 8364197480b..c4933b05c11 100644
--- a/src/compiler/nir/nir.c
+++ b/src/compiler/nir/nir.c
@@ -1988,6 +1988,8 @@ nir_intrinsic_from_system_value(gl_system_value val)
   return nir_intrinsic_load_base_instance;
case SYSTEM_VALUE_VERTEX_ID_ZERO_BASE:
   return nir_intrinsic_load_vertex_id_zero_base;
+   case SYSTEM_VALUE_FIRST_VERTEX:
+  return nir_intrinsic_load_first_vertex;
case SYSTEM_VALUE_BASE_VERTEX:
   return nir_intrinsic_load_base_vertex;
case SYSTEM_VALUE_INVOCATION_ID:
@@ -2063,6 +2065,8 @@ nir_system_value_from_intrinsic(nir_intrinsic_op intrin)
   return SYSTEM_VALUE_BASE_INSTANCE;
case nir_intrinsic_load_vertex_id_zero_base:
   return SYSTEM_VALUE_VERTEX_ID_ZERO_BASE;
+   case nir_intrinsic_load_first_vertex:
+  return SYSTEM_VALUE_FIRST_VERTEX;
case nir_intrinsic_load_base_vertex:
   return SYSTEM_VALUE_BASE_VERTEX;
case nir_intrinsic_load_invocation_id:
diff --git a/src/compiler/nir/nir_gather_info.c 
b/src/compiler/nir/nir_gather_info.c
index 5530009255d..5e5779ac4b7 100644
--- a/src/compiler/nir/nir_gather_info.c
+++ b/src/compiler/nir/nir_gather_info.c
@@ -265,6 +265,7 @@ gather_intrinsic_info(nir_intrinsic_instr *instr, 
nir_shader *shader)
case nir_intrinsic_load_vertex_id:
case nir_intrinsic_load_vertex_id_zero_base:
case nir_intrinsic_load_base_vertex:
+   case nir_intrinsic_load_first_vertex:
case nir_intrinsic_load_base_instance:
case nir_intrinsic_load_instance_id:
case nir_intrinsic_load_sample_id:
diff --git a/src/compiler/nir/nir_intrinsics.py 
b/src/compiler/nir/nir_intrinsics.py
index 1bc99552cd7..f26aaf35ee3 100644
--- a/src/compiler/nir/nir_intrinsics.py
+++ b/src/compiler/nir/nir_intrinsics.py
@@ -413,6 +413,7 @@ system_value("frag_coord", 4)
 system_value("front_face", 1)
 system_value("vertex_id", 1)
 system_value("vertex_id_zero_base", 1)
+system_value("first_vertex", 1)
 system_value("base_vertex", 1)
 system_value("instance_id", 1)
 system_value("base_instance", 1)
diff --git a/src/compiler/shader_enums.c b/src/compiler/shader_enums.c
index d0ff11b41e2..ebee076b43c 100644
--- a/src/compiler/shader_enums.c
+++ b/src/compiler/shader_enums.c
@@ -216,6 +216,7 @@ gl_system_value_name(gl_system_value sysval)
  ENUM(SYSTEM_VALUE_INSTANCE_ID),
  ENUM(SYSTEM_VALUE_INSTANCE_INDEX),
  ENUM(SYSTEM_VALUE_VERTEX_ID_ZERO_BASE),
+ ENUM(SYSTEM_VALUE_FIRST_VERTEX),
  ENUM(SYSTEM_VALUE_BASE_VERTEX),
  ENUM(SYSTEM_VALUE_BASE_INSTANCE),
  ENUM(SYSTEM_VALUE_DRAW_ID),
diff --git a/src/compiler/shader_enums.h b/src/compiler/shader_enums.h
index 2aedbb9b3fa..8a277a14f21 100644
--- a/src/compiler/shader_enums.h
+++ b/src/compiler/shader_enums.h
@@ -503,6 +503,20 @@ typedef enum
 */
SYSTEM_VALUE_BASE_VERTEX,
 
+   /**
+* Depending on the type of the draw call (indexed or non-indexed),
+* is the value of \c basevertex passed to \c glDrawElementsBaseVertex and
+* similar, or is the value of \c first passed to \c glDrawArrays and
+* similar.
+*
+* \note
+* It can be used to calculate the \c SYSTEM_VALUE_VERTEX_ID as
+* \c SYSTEM_VALUE_VERTEX_ID_ZERO_BASE 

[Mesa-dev] [PATCH v4 3/6] intel: Handle firstvertex in an identical way to BaseVertex

2018-04-06 Thread Ian Romanick
From: Antia Puentes 

Until we set gl_BaseVertex to zero for non-indexed draw calls
both have an identical value.

The Vertex Elements are kept like that:
* VE 1: 
* VE 2: 

v2 (idr): Mark nir_intrinsic_load_first_vertex as "unreachable" in
emit_system_values_block and fs_visitor::nir_emit_vs_intrinsic.
---
 src/intel/compiler/brw_fs_nir.cpp |  4 
 src/intel/compiler/brw_nir.c  |  3 +++
 src/intel/compiler/brw_vec4.cpp   |  1 +
 src/mesa/drivers/dri/i965/brw_context.h   |  8 ++--
 src/mesa/drivers/dri/i965/brw_draw.c  | 14 +-
 src/mesa/drivers/dri/i965/brw_draw_upload.c   |  7 +--
 src/mesa/drivers/dri/i965/genX_state_upload.c | 11 +++
 7 files changed, 35 insertions(+), 13 deletions(-)

diff --git a/src/intel/compiler/brw_fs_nir.cpp 
b/src/intel/compiler/brw_fs_nir.cpp
index 6c4bcd1c113..a830bb9fcd6 100644
--- a/src/intel/compiler/brw_fs_nir.cpp
+++ b/src/intel/compiler/brw_fs_nir.cpp
@@ -116,6 +116,7 @@ emit_system_values_block(nir_block *block, fs_visitor *v)
 
   case nir_intrinsic_load_vertex_id_zero_base:
   case nir_intrinsic_load_base_vertex:
+  case nir_intrinsic_load_first_vertex:
   case nir_intrinsic_load_instance_id:
   case nir_intrinsic_load_base_instance:
   case nir_intrinsic_load_draw_id:
@@ -2458,6 +2459,9 @@ fs_visitor::nir_emit_vs_intrinsic(const fs_builder &bld,
   break;
}
 
+   case nir_intrinsic_load_first_vertex:
+  unreachable("lowered by brw_nir_lower_vs_inputs");
+
default:
   nir_emit_intrinsic(bld, instr);
   break;
diff --git a/src/intel/compiler/brw_nir.c b/src/intel/compiler/brw_nir.c
index 69ab162f888..16b0d86814f 100644
--- a/src/intel/compiler/brw_nir.c
+++ b/src/intel/compiler/brw_nir.c
@@ -239,6 +239,7 @@ brw_nir_lower_vs_inputs(nir_shader *nir,
const bool has_sgvs =
   nir->info.system_values_read &
   (BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX) |
+   BITFIELD64_BIT(SYSTEM_VALUE_FIRST_VERTEX) |
BITFIELD64_BIT(SYSTEM_VALUE_BASE_INSTANCE) |
BITFIELD64_BIT(SYSTEM_VALUE_VERTEX_ID_ZERO_BASE) |
BITFIELD64_BIT(SYSTEM_VALUE_INSTANCE_ID));
@@ -261,6 +262,7 @@ brw_nir_lower_vs_inputs(nir_shader *nir,
 
 switch (intrin->intrinsic) {
 case nir_intrinsic_load_base_vertex:
+case nir_intrinsic_load_first_vertex:
 case nir_intrinsic_load_base_instance:
 case nir_intrinsic_load_vertex_id_zero_base:
 case nir_intrinsic_load_instance_id:
@@ -278,6 +280,7 @@ brw_nir_lower_vs_inputs(nir_shader *nir,
nir_intrinsic_set_base(load, num_inputs);
switch (intrin->intrinsic) {
case nir_intrinsic_load_base_vertex:
+   case nir_intrinsic_load_first_vertex:
   nir_intrinsic_set_component(load, 0);
   break;
case nir_intrinsic_load_base_instance:
diff --git a/src/intel/compiler/brw_vec4.cpp b/src/intel/compiler/brw_vec4.cpp
index 9459d61af6c..1e384f5bf4d 100644
--- a/src/intel/compiler/brw_vec4.cpp
+++ b/src/intel/compiler/brw_vec4.cpp
@@ -2826,6 +2826,7 @@ brw_compile_vs(const struct brw_compiler *compiler, void 
*log_data,
 */
if (shader->info.system_values_read &
(BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX) |
+BITFIELD64_BIT(SYSTEM_VALUE_FIRST_VERTEX) |
 BITFIELD64_BIT(SYSTEM_VALUE_BASE_INSTANCE) |
 BITFIELD64_BIT(SYSTEM_VALUE_VERTEX_ID_ZERO_BASE) |
 BITFIELD64_BIT(SYSTEM_VALUE_INSTANCE_ID))) {
diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index f049d086492..c65a22c38bb 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -886,8 +886,12 @@ struct brw_context
 
struct {
   struct {
- /** The value of gl_BaseVertex for the current _mesa_prim. */
- int gl_basevertex;
+ /**
+  * Either the value of gl_BaseVertex for indexed draw calls or the
+  * value of the argument <first> for non-indexed draw calls for the
+  * current _mesa_prim.
+  */
+ int firstvertex;
 
  /** The value of gl_BaseInstance for the current _mesa_prim. */
  int gl_baseinstance;
diff --git a/src/mesa/drivers/dri/i965/brw_draw.c 
b/src/mesa/drivers/dri/i965/brw_draw.c
index 4caaadd560d..f51f083178e 100644
--- a/src/mesa/drivers/dri/i965/brw_draw.c
+++ b/src/mesa/drivers/dri/i965/brw_draw.c
@@ -819,25 +819,29 @@ brw_draw_single_prim(struct gl_context *ctx,
 * always flag if the shader uses one of the values. For direct draws,
 * we only flag if the values change.
 */
-   const int new_basevertex =
+   const int new_firstvertex =
   prim->indexed ? prim->basevertex : prim->start;
const int new_baseinstance = prim->base_instance;
const struct brw_vs_prog_data *vs_prog_data =
   brw_vs_prog_data(brw->vs.base.prog_data);
if 

[Mesa-dev] [PATCH v4 6/6] i965: gl_BaseVertex must be zero for non-indexed draw calls

2018-04-06 Thread Ian Romanick
From: Antia Puentes 

We keep 'firstvertex' as it is and move gl_BaseVertex to the drawID
vertex element. The previous Vertex Elements order was:

  * VE 1: 
  * VE 2: 

and now it is:

  * VE 1: 
  * VE 2: 

Moving BaseVertex while keeping VE 1 as it is lets the vertex buffer
associated with VE 1 keep pointing to the indirect buffer for indirect
draw calls.
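
A rough sketch of where the values end up, going only by the
brw_nir_lower_vs_inputs hunk in this patch. The toy_* names are
hypothetical, and only the three system values touched by that hunk are
modelled:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names; only the values touched by the hunk are listed. */
enum toy_sysval { TOY_FIRST_VERTEX, TOY_DRAW_ID, TOY_BASE_VERTEX };

struct toy_slot { unsigned base; unsigned component; };

static struct toy_slot
toy_sysval_slot(enum toy_sysval v, unsigned num_inputs, bool has_sgvs)
{
   struct toy_slot s = { num_inputs, 0 };

   switch (v) {
   case TOY_FIRST_VERTEX:
      /* Stays in component 0 of the first extra element. */
      assert(has_sgvs);
      break;
   case TOY_DRAW_ID:
      /* Second extra element (or the first if there are no SGVS). */
      s.base = num_inputs + has_sgvs;
      s.component = 0;
      break;
   case TOY_BASE_VERTEX:
      /* Now shares the element with the draw ID, one component over. */
      s.base = num_inputs + has_sgvs;
      s.component = 1;
      break;
   }
   return s;
}
```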

From the OpenGL 4.6 (11.1.3.9 Shader Inputs) specification:

  "gl_BaseVertex holds the integer value passed to the baseVertex
  parameter to the command that resulted in the current shader
  invocation. In the case where the command has no baseVertex parameter,
  the value of gl_BaseVertex is zero."

Fixes CTS tests:

  * KHR-GL45.shader_draw_parameters_tests.ShaderDrawArraysParameters
  * KHR-GL45.shader_draw_parameters_tests.ShaderDrawArraysInstancedParameters
  * KHR-GL45.shader_draw_parameters_tests.ShaderMultiDrawArraysParameters
  * KHR-GL45.shader_draw_parameters_tests.ShaderMultiDrawArraysIndirectParameters
  * KHR-GL45.shader_draw_parameters_tests.MultiDrawArraysIndirectCountParameters

v2 (idr): Make changes to brw_prepare_shader_draw_parameters matching
those in genX(emit_vertices).  Reformat commit message to 72 columns.

Signed-off-by: Ian Romanick 
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102678
---
 src/intel/compiler/brw_nir.c  | 14 +
 src/intel/compiler/brw_vec4.cpp   | 14 +
 src/mesa/drivers/dri/i965/brw_context.h   | 32 ++-
 src/mesa/drivers/dri/i965/brw_draw.c  | 45 ++-
 src/mesa/drivers/dri/i965/brw_draw_upload.c   | 14 -
 src/mesa/drivers/dri/i965/genX_state_upload.c | 38 +++---
 6 files changed, 97 insertions(+), 60 deletions(-)

diff --git a/src/intel/compiler/brw_nir.c b/src/intel/compiler/brw_nir.c
index 16b0d86814f..16ab529737b 100644
--- a/src/intel/compiler/brw_nir.c
+++ b/src/intel/compiler/brw_nir.c
@@ -238,8 +238,7 @@ brw_nir_lower_vs_inputs(nir_shader *nir,
 */
const bool has_sgvs =
   nir->info.system_values_read &
-  (BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX) |
-   BITFIELD64_BIT(SYSTEM_VALUE_FIRST_VERTEX) |
+  (BITFIELD64_BIT(SYSTEM_VALUE_FIRST_VERTEX) |
BITFIELD64_BIT(SYSTEM_VALUE_BASE_INSTANCE) |
BITFIELD64_BIT(SYSTEM_VALUE_VERTEX_ID_ZERO_BASE) |
BITFIELD64_BIT(SYSTEM_VALUE_INSTANCE_ID));
@@ -279,7 +278,6 @@ brw_nir_lower_vs_inputs(nir_shader *nir,
 
nir_intrinsic_set_base(load, num_inputs);
switch (intrin->intrinsic) {
-   case nir_intrinsic_load_base_vertex:
case nir_intrinsic_load_first_vertex:
   nir_intrinsic_set_component(load, 0);
   break;
@@ -293,11 +291,15 @@ brw_nir_lower_vs_inputs(nir_shader *nir,
   nir_intrinsic_set_component(load, 3);
   break;
case nir_intrinsic_load_draw_id:
-  /* gl_DrawID is stored right after gl_VertexID and friends
-   * if any of them exist.
+   case nir_intrinsic_load_base_vertex:
+  /* gl_DrawID and gl_BaseVertex are stored right after
+ gl_VertexID and friends if any of them exist.
*/
   nir_intrinsic_set_base(load, num_inputs + has_sgvs);
-  nir_intrinsic_set_component(load, 0);
+  if (intrin->intrinsic == nir_intrinsic_load_draw_id)
+ nir_intrinsic_set_component(load, 0);
+  else
+ nir_intrinsic_set_component(load, 1);
   break;
default:
   unreachable("Invalid system value intrinsic");
diff --git a/src/intel/compiler/brw_vec4.cpp b/src/intel/compiler/brw_vec4.cpp
index 1e384f5bf4d..d33caefdea9 100644
--- a/src/intel/compiler/brw_vec4.cpp
+++ b/src/intel/compiler/brw_vec4.cpp
@@ -2825,14 +2825,19 @@ brw_compile_vs(const struct brw_compiler *compiler, 
void *log_data,
 * incoming vertex attribute.  So, add an extra slot.
 */
if (shader->info.system_values_read &
-   (BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX) |
-BITFIELD64_BIT(SYSTEM_VALUE_FIRST_VERTEX) |
+   (BITFIELD64_BIT(SYSTEM_VALUE_FIRST_VERTEX) |
 BITFIELD64_BIT(SYSTEM_VALUE_BASE_INSTANCE) |
 BITFIELD64_BIT(SYSTEM_VALUE_VERTEX_ID_ZERO_BASE) |
 BITFIELD64_BIT(SYSTEM_VALUE_INSTANCE_ID))) {
   nr_attribute_slots++;
}
 
+   if (shader->info.system_values_read &
+   (BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX) |
+BITFIELD64_BIT(SYSTEM_VALUE_DRAW_ID))) {
+  nr_attribute_slots++;
+   }
+
if (shader->info.system_values_read &
BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX))
   prog_data->uses_basevertex = true;
@@ -2853,12 +2858,9 @@ brw_compile_vs(const struct brw_compiler 

[Mesa-dev] [PATCH v4 4/6] spirv: Lower BaseVertex to FIRST_VERTEX instead of BASE_VERTEX

2018-04-06 Thread Ian Romanick
From: Neil Roberts 

The base vertex in Vulkan is different from GL in that for non-indexed
primitives the value is taken from the firstVertex parameter instead
of being set to zero. This coincides with the new SYSTEM_VALUE_FIRST_VERTEX
instead of BASE_VERTEX.

Reviewed-by: Ian Romanick 
---
 src/compiler/spirv/vtn_variables.c |  2 +-
 src/intel/vulkan/genX_cmd_buffer.c | 16 
 src/intel/vulkan/genX_pipeline.c   |  2 ++
 3 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/src/compiler/spirv/vtn_variables.c 
b/src/compiler/spirv/vtn_variables.c
index b2897407fb1..9bb7d5a575e 100644
--- a/src/compiler/spirv/vtn_variables.c
+++ b/src/compiler/spirv/vtn_variables.c
@@ -1296,7 +1296,7 @@ vtn_get_builtin_location(struct vtn_builder *b,
   set_mode_system_value(b, mode);
   break;
case SpvBuiltInBaseVertex:
-  *location = SYSTEM_VALUE_BASE_VERTEX;
+  *location = SYSTEM_VALUE_FIRST_VERTEX;
   set_mode_system_value(b, mode);
   break;
case SpvBuiltInBaseInstance:
diff --git a/src/intel/vulkan/genX_cmd_buffer.c 
b/src/intel/vulkan/genX_cmd_buffer.c
index 3c703f6be44..7d190a4d5cf 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -2673,7 +2673,9 @@ void genX(CmdDraw)(
 
genX(cmd_buffer_flush_state)(cmd_buffer);
 
-   if (vs_prog_data->uses_basevertex || vs_prog_data->uses_baseinstance)
+   if (vs_prog_data->uses_firstvertex ||
+   vs_prog_data->uses_basevertex ||
+   vs_prog_data->uses_baseinstance)
   emit_base_vertex_instance(cmd_buffer, firstVertex, firstInstance);
if (vs_prog_data->uses_drawid)
   emit_draw_index(cmd_buffer, 0);
@@ -2711,7 +2713,9 @@ void genX(CmdDrawIndexed)(
 
genX(cmd_buffer_flush_state)(cmd_buffer);
 
-   if (vs_prog_data->uses_basevertex || vs_prog_data->uses_baseinstance)
+   if (vs_prog_data->uses_firstvertex ||
+   vs_prog_data->uses_basevertex ||
+   vs_prog_data->uses_baseinstance)
   emit_base_vertex_instance(cmd_buffer, vertexOffset, firstInstance);
if (vs_prog_data->uses_drawid)
   emit_draw_index(cmd_buffer, 0);
@@ -2850,7 +2854,9 @@ void genX(CmdDrawIndirect)(
   struct anv_bo *bo = buffer->bo;
   uint32_t bo_offset = buffer->offset + offset;
 
-  if (vs_prog_data->uses_basevertex || vs_prog_data->uses_baseinstance)
+  if (vs_prog_data->uses_firstvertex ||
+  vs_prog_data->uses_basevertex ||
+  vs_prog_data->uses_baseinstance)
  emit_base_vertex_instance_bo(cmd_buffer, bo, bo_offset + 8);
   if (vs_prog_data->uses_drawid)
  emit_draw_index(cmd_buffer, i);
@@ -2889,7 +2895,9 @@ void genX(CmdDrawIndexedIndirect)(
   uint32_t bo_offset = buffer->offset + offset;
 
   /* TODO: We need to stomp base vertex to 0 somehow */
-  if (vs_prog_data->uses_basevertex || vs_prog_data->uses_baseinstance)
+  if (vs_prog_data->uses_firstvertex ||
+  vs_prog_data->uses_basevertex ||
+  vs_prog_data->uses_baseinstance)
  emit_base_vertex_instance_bo(cmd_buffer, bo, bo_offset + 12);
   if (vs_prog_data->uses_drawid)
  emit_draw_index(cmd_buffer, i);
diff --git a/src/intel/vulkan/genX_pipeline.c b/src/intel/vulkan/genX_pipeline.c
index eb2d4147357..a473f42c7e1 100644
--- a/src/intel/vulkan/genX_pipeline.c
+++ b/src/intel/vulkan/genX_pipeline.c
@@ -98,6 +98,7 @@ emit_vertex_input(struct anv_pipeline *pipeline,
const bool needs_svgs_elem = vs_prog_data->uses_vertexid ||
 vs_prog_data->uses_instanceid ||
 vs_prog_data->uses_basevertex ||
+vs_prog_data->uses_firstvertex ||
 vs_prog_data->uses_baseinstance;
 
uint32_t elem_count = __builtin_popcount(elements) -
@@ -178,6 +179,7 @@ emit_vertex_input(struct anv_pipeline *pipeline,
* well.  Just do all or nothing.
*/
   uint32_t base_ctrl = (vs_prog_data->uses_basevertex ||
+vs_prog_data->uses_firstvertex ||
 vs_prog_data->uses_baseinstance) ?
VFCOMP_STORE_SRC : VFCOMP_STORE_0;
 
-- 
2.14.3



[Mesa-dev] [PATCH v4 5/6] nir: Offset vertex_id by first_vertex instead of base_vertex

2018-04-06 Thread Ian Romanick
From: Neil Roberts 

base_vertex will be zero for non-indexed calls and in that case we
need vertex_id to be offset by the ‘first’ parameter instead. That is
what we get with first_vertex. This is true for both GL and Vulkan.

The freedreno driver is also setting vertex_id_zero_based on
nir_options. In order to avoid breakage this patch switches the
relevant code to handle SYSTEM_VALUE_FIRST_VERTEX so that it can
retain the same behavior.

v2: change a3xx/fd3_emit.c and a4xx/fd4_emit.c from
SYSTEM_VALUE_BASE_VERTEX to SYSTEM_VALUE_FIRST_VERTEX (Kenneth).

Reviewed-by: Ian Romanick 
Cc: Rob Clark 
Acked-by: Marek Olšák 
---
 src/compiler/nir/nir_lower_system_values.c   | 2 +-
 src/gallium/drivers/freedreno/a3xx/fd3_emit.c| 2 +-
 src/gallium/drivers/freedreno/a4xx/fd4_emit.c| 2 +-
 src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c | 5 ++---
 src/intel/vulkan/genX_cmd_buffer.c   | 4 
 src/intel/vulkan/genX_pipeline.c | 4 +---
 6 files changed, 6 insertions(+), 13 deletions(-)

diff --git a/src/compiler/nir/nir_lower_system_values.c 
b/src/compiler/nir/nir_lower_system_values.c
index fb560ee21bb..96643793a70 100644
--- a/src/compiler/nir/nir_lower_system_values.c
+++ b/src/compiler/nir/nir_lower_system_values.c
@@ -105,7 +105,7 @@ convert_block(nir_block *block, nir_builder *b)
  if (b->shader->options->vertex_id_zero_based) {
 sysval = nir_iadd(b,
   nir_load_vertex_id_zero_base(b),
-  nir_load_base_vertex(b));
+  nir_load_first_vertex(b));
  } else {
 sysval = nir_load_vertex_id(b);
  }
diff --git a/src/gallium/drivers/freedreno/a3xx/fd3_emit.c 
b/src/gallium/drivers/freedreno/a3xx/fd3_emit.c
index b9e1af00e2c..3419ba86d46 100644
--- a/src/gallium/drivers/freedreno/a3xx/fd3_emit.c
+++ b/src/gallium/drivers/freedreno/a3xx/fd3_emit.c
@@ -374,7 +374,7 @@ fd3_emit_vertex_bufs(struct fd_ringbuffer *ring, struct 
fd3_emit *emit)
continue;
if (vp->inputs[i].sysval) {
switch(vp->inputs[i].slot) {
-   case SYSTEM_VALUE_BASE_VERTEX:
+   case SYSTEM_VALUE_FIRST_VERTEX:
/* handled elsewhere */
break;
case SYSTEM_VALUE_VERTEX_ID_ZERO_BASE:
diff --git a/src/gallium/drivers/freedreno/a4xx/fd4_emit.c 
b/src/gallium/drivers/freedreno/a4xx/fd4_emit.c
index 5fec2b6b08a..42268ceea71 100644
--- a/src/gallium/drivers/freedreno/a4xx/fd4_emit.c
+++ b/src/gallium/drivers/freedreno/a4xx/fd4_emit.c
@@ -378,7 +378,7 @@ fd4_emit_vertex_bufs(struct fd_ringbuffer *ring, struct 
fd4_emit *emit)
continue;
if (vp->inputs[i].sysval) {
switch(vp->inputs[i].slot) {
-   case SYSTEM_VALUE_BASE_VERTEX:
+   case SYSTEM_VALUE_FIRST_VERTEX:
/* handled elsewhere */
break;
case SYSTEM_VALUE_VERTEX_ID_ZERO_BASE:
diff --git a/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c 
b/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
index 174141b7fec..356d1bc44ee 100644
--- a/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
+++ b/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
@@ -2182,11 +2182,10 @@ emit_intrinsic(struct ir3_context *ctx, 
nir_intrinsic_instr *intr)
ctx->ir->outputs[n] = src[i];
}
break;
-   case nir_intrinsic_load_base_vertex:
+   case nir_intrinsic_load_first_vertex:
if (!ctx->basevertex) {
ctx->basevertex = create_driver_param(ctx, 
IR3_DP_VTXID_BASE);
-   add_sysval_input(ctx, SYSTEM_VALUE_BASE_VERTEX,
-   ctx->basevertex);
+   add_sysval_input(ctx, SYSTEM_VALUE_FIRST_VERTEX, 
ctx->basevertex);
}
dst[0] = ctx->basevertex;
break;
diff --git a/src/intel/vulkan/genX_cmd_buffer.c 
b/src/intel/vulkan/genX_cmd_buffer.c
index 7d190a4d5cf..e945c46dac2 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -2674,7 +2674,6 @@ void genX(CmdDraw)(
genX(cmd_buffer_flush_state)(cmd_buffer);
 
if (vs_prog_data->uses_firstvertex ||
-   vs_prog_data->uses_basevertex ||
vs_prog_data->uses_baseinstance)
   emit_base_vertex_instance(cmd_buffer, firstVertex, firstInstance);
if (vs_prog_data->uses_drawid)
@@ -2714,7 +2713,6 @@ void genX(CmdDrawIndexed)(
genX(cmd_buffer_flush_state)(cmd_buffer);
 
if (vs_prog_data->uses_firstvertex ||
-   vs_prog_data->uses_basevertex ||

Re: [Mesa-dev] [PATCH v3 016/104] nir: Use nir_builder in lower_io_to_temporaries

2018-04-06 Thread Caio Marcelo de Oliveira Filho
On Tue, Apr 03, 2018 at 11:32:43AM -0700, Jason Ekstrand wrote:
> ---
>  src/compiler/nir/nir_lower_io_to_temporaries.c | 37 
> --
>  1 file changed, 17 insertions(+), 20 deletions(-)

This one could land before the rest of the series.

Reviewed-by: Caio Marcelo de Oliveira Filho 



Thanks,
Caio


Re: [Mesa-dev] [PATCH v3 024/104] nir: Support deref instructions in lower_system_values

2018-04-06 Thread Caio Marcelo de Oliveira Filho
On Tue, Apr 03, 2018 at 11:32:51AM -0700, Jason Ekstrand wrote:
> ---
>  src/compiler/nir/nir_lower_system_values.c | 13 ++---
>  1 file changed, 10 insertions(+), 3 deletions(-)
> 
> diff --git a/src/compiler/nir/nir_lower_system_values.c 
> b/src/compiler/nir/nir_lower_system_values.c
> index fb560ee..104df51 100644
> --- a/src/compiler/nir/nir_lower_system_values.c
> +++ b/src/compiler/nir/nir_lower_system_values.c
> @@ -39,10 +39,15 @@ convert_block(nir_block *block, nir_builder *b)
>  
>nir_intrinsic_instr *load_var = nir_instr_as_intrinsic(instr);
>  
> -  if (load_var->intrinsic != nir_intrinsic_load_var)
> - continue;
> +  nir_variable *var;
> +  if (load_var->intrinsic == nir_intrinsic_load_var) {
> + var = load_var->variables[0]->var;
> +  } else if (load_var->intrinsic == nir_intrinsic_load_deref) {
> + var = 
> nir_deref_instr_get_variable(nir_src_as_deref(load_var->src[0]));

Question: nir_deref_instr_get_variable will walk the deref instr
chain, but does it even make sense if there's an array or struct in
this deref chain? Should this be asserted?
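
Something along these lines is what I have in mind. This is only a
sketch with hypothetical toy_* types, not the real nir_deref_instr, and
it assumes the helper walks parents until it reaches the variable:

```c
#include <assert.h>
#include <stddef.h>

enum toy_deref_type { TOY_DEREF_VAR, TOY_DEREF_ARRAY, TOY_DEREF_STRUCT };

struct toy_var { int location; };

/* Hypothetical deref node; each non-root node points at its parent. */
struct toy_deref {
   enum toy_deref_type type;
   struct toy_deref *parent;   /* NULL for TOY_DEREF_VAR */
   struct toy_var *var;        /* only set for TOY_DEREF_VAR */
};

/* Walks up to the root of the chain and returns its variable. */
static struct toy_var *toy_deref_get_variable(const struct toy_deref *d)
{
   while (d->type != TOY_DEREF_VAR) {
      assert(d->parent != NULL);
      d = d->parent;
   }
   return d->var;
}

/* Stricter variant: only accept a deref that names the variable directly. */
static struct toy_var *toy_deref_get_direct_variable(const struct toy_deref *d)
{
   assert(d->type == TOY_DEREF_VAR);
   return d->var;
}
```

The stricter variant would catch an array or struct deref reaching this
pass instead of quietly resolving it to the underlying variable.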


Thanks,
Caio


[Mesa-dev] [PATCH] st/nine: Do not use scratch for face register

2018-04-06 Thread Axel Davy
Scratch registers are reused every instruction.
Since vFace is reused, a new temporary register
should be used.
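
A toy model of the problem, for illustration only. The allocator below
is hypothetical and much simpler than nine_shader.c; it assumes scratch
indices are reset at the start of every instruction while declared
temporaries are never handed out twice:

```c
#include <assert.h>

#define TOY_MAX_REGS 16
#define TOY_SCRATCH_BASE 100   /* keeps scratch indices apart from temps */

/* Hypothetical translator state; not the real shader_translator. */
struct toy_translator {
   unsigned num_temps;     /* persistent temporaries handed out so far */
   unsigned num_scratch;   /* scratch registers used by this instruction */
};

/* Persistent temporary: its index is never handed out again. */
static unsigned toy_decl_temporary(struct toy_translator *tx)
{
   assert(tx->num_temps < TOY_MAX_REGS);
   return tx->num_temps++;
}

/* Scratch register: only valid until the next instruction starts. */
static unsigned toy_scratch(struct toy_translator *tx)
{
   assert(tx->num_scratch < TOY_MAX_REGS);
   return TOY_SCRATCH_BASE + tx->num_scratch++;
}

/* Starting a new instruction recycles every scratch register. */
static void toy_next_instruction(struct toy_translator *tx)
{
   tx->num_scratch = 0;
}
```

In this model a value kept in a scratch register shares its index with
the first scratch register of the next instruction, so anything cached
across instructions (like vFace) can be overwritten.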

Fixes: https://github.com/iXit/Mesa-3D/issues/311

Signed-off-by: Axel Davy 

CC: "17.3 18.0" 
---
 src/gallium/state_trackers/nine/nine_shader.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/nine/nine_shader.c 
b/src/gallium/state_trackers/nine/nine_shader.c
index 9b8bd16f8b..5d7f944f85 100644
--- a/src/gallium/state_trackers/nine/nine_shader.c
+++ b/src/gallium/state_trackers/nine/nine_shader.c
@@ -1104,7 +1104,7 @@ tx_src_param(struct shader_translator *tx, const struct 
sm1_src_param *param)
 case D3DSMO_FACE:
if (ureg_src_is_undef(tx->regs.vFace)) {
if (tx->face_is_sysval_integer) {
-   tmp = tx_scratch(tx);
+   tmp = ureg_DECL_temporary(ureg);
tx->regs.vFace =
ureg_DECL_system_value(ureg, TGSI_SEMANTIC_FACE, 0);
 
-- 
2.16.2



[Mesa-dev] [Bug 105846] Assertion failure @ st_atom_array.c:675 when playing Natural Selection 2

2018-04-06 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=105846

--- Comment #11 from l...@protonmail.ch ---
(In reply to Timothy Arceri from comment #5)
> (In reply to las from comment #2)
> > link to coredump (also uploading as attachment):
> > https://doc-0o-4k-docs.googleusercontent.com/docs/securesc/
> > pum26a1iie2onuvj4oi878g6cm4t7ee5/ogcgqid4h8vlgsdm101jckd6jo5jqqc5/
> > 152266320/00738005966732616001/00738005966732616001/
> > 11no6HF0WfEwwlE2IoeMx6aigsJ-
> > 504qp?e=download=lohv7ot3cnbqm=00738005966732616001=29id47asc
> > b0lesgtto26eu8r8cmsptdn
> 
> I get access denied trying to view this link. Can you try again to upload
> here? I had to try multiple times yesterday but it eventually worked,
> otherwise can you place it somewhere with public access?
> 
> I tried to run the game but it's missing a library on fedora which it seems
> needs to be built from source and configured to run as a service or
> something like that which I'm not willing to mess around with.
> 
> Anyway are you able to either build mesa from git or use a ppa (assuming you
> are using ubuntu) [1] to see if this is still happening in the current dev
> version of mesa/llvm?
> 
> [1] https://launchpad.net/~paulo-miguel-dias/+archive/ubuntu/mesa/+packages

So yeah, it still happens with or without mesa_glthread as of commit
7728720f07a1f0931d7d394d5842de3803e9192b.

I need to do more testing to find out if setting either MESA_VERBOSE or
MESA_DEBUG prevents it, as it seems like it's not happening as often as before
(perhaps due to a game update?).

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


Re: [Mesa-dev] [PATCH v2] meson: fix warnings about comparing unlike types

2018-04-06 Thread Caio Marcelo de Oliveira Filho
> v2: - Use dependency('', required : false) instead of
>   declare_dependency(); the latter will always report that it is
>   found, which is not what we want.
> 
> Signed-off-by: Dylan Baker 
> ---
>  meson.build   | 89 
> ---
>  src/gallium/auxiliary/meson.build |  2 +-
>  src/glx/apple/meson.build |  2 +-
>  src/glx/meson.build   |  2 +-
>  4 files changed, 49 insertions(+), 46 deletions(-)

Reviewed-by: Caio Marcelo de Oliveira Filho 


Thanks,
Caio


[Mesa-dev] [Bug 105846] Assertion failure @ st_atom_array.c:675 when playing Natural Selection 2

2018-04-06 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=105846

--- Comment #10 from l...@protonmail.ch ---
Seems like it's not mesa_glthread causing it. It just happened again with
mesa_glthread=false.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH v3 022/104] nir: Support deref instructions in lower_indirect_derefs

2018-04-06 Thread Caio Marcelo de Oliveira Filho
Hi,

> +static void
> +emit_load_store_deref(nir_builder *b, nir_intrinsic_instr *orig_instr,
> +  nir_deref_instr *parent,
> +  nir_deref_instr **deref_arr,
> +  nir_ssa_def **dest, nir_ssa_def *src)
> +{
> +   for (; *deref_arr; deref_arr++) {
> +  nir_deref_instr *deref = *deref_arr;
> +  if (deref->deref_type == nir_deref_type_array &&
> +  nir_src_as_const_value(deref->arr.index) == NULL) {
> + int length = glsl_get_length(parent->type);
> +
> + emit_indirect_load_store_deref(b, orig_instr, parent, deref_arr,
> +0, length, dest, src);

Side note: after reading the existing code (that goes from
-base_offset to length - base_offset, and later adds base_offset), I'm
kind of glad this goes from 0 to length.
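
A minimal sketch of why a plain 0..length range is easy to follow, assuming the lowering resolves an indirect index by repeatedly halving the candidate range (that is my reading of the pass, not something stated in this mail). The function name and the plain int array are hypothetical stand-ins, not NIR code:

```c
#include <assert.h>

/* Resolve arr[index] using only comparisons against the index, halving
 * the candidate range [start, end) at each step.  Hypothetical stand-in
 * for the if/else ladder a lowering pass would emit; not Mesa code. */
static int
select_element(const int *arr, int index, int start, int end)
{
   assert(start < end);

   /* One candidate left: this is where a direct access would be emitted. */
   if (start == end - 1)
      return arr[start];

   int mid = start + (end - start) / 2;
   if (index < mid)
      return select_element(arr, index, start, mid);
   else
      return select_element(arr, index, mid, end);
}
```

Calling select_element(arr, i, 0, length) covers the whole array without any base offset bookkeeping.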


> +static bool
> +lower_indirect_derefs_block(nir_block *block, nir_builder *b,
> +nir_variable_mode modes)
> +{

(...)

> +  nir_deref_instr *deref =
> + nir_instr_as_deref(intrin->src[0].ssa->parent_instr);

Maybe use the helper 'nir_src_as_deref(intrin->src[0])'?

> +
> +  /* Walk the deref chain back to the base and look for indirects */
> +  bool has_indirect = false;
> +  nir_deref_instr *base = deref;
> +  while (base->deref_type != nir_deref_type_var) {
> + if (base->deref_type == nir_deref_type_array &&
> + nir_src_as_const_value(base->arr.index) == NULL)
> +has_indirect = true;
> +
> + base = nir_instr_as_deref(base->parent.ssa->parent_instr);

Maybe use the helper 'base = nir_deref_instr_parent(base);'?
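
To illustrate the shape of the walk with a parent helper, here is a small self-contained sketch. All mock_* types and functions are hypothetical stand-ins and are not the definitions from nir.h; unlike the patch, it returns as soon as it finds an indirect instead of continuing to the base variable:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the NIR deref types; not the real nir.h. */
enum mock_deref_type { MOCK_DEREF_VAR, MOCK_DEREF_ARRAY, MOCK_DEREF_STRUCT };

struct mock_deref {
   enum mock_deref_type type;
   bool index_is_const;        /* only meaningful for MOCK_DEREF_ARRAY */
   struct mock_deref *parent;  /* NULL for MOCK_DEREF_VAR */
};

/* Plays the role suggested for nir_deref_instr_parent() in the review. */
static struct mock_deref *
mock_deref_parent(const struct mock_deref *d)
{
   return d->parent;
}

/* Walk from the leaf back towards the variable and report whether any
 * array step uses a non-constant index. */
static bool
mock_chain_has_indirect(const struct mock_deref *leaf)
{
   for (const struct mock_deref *d = leaf;
        d != NULL && d->type != MOCK_DEREF_VAR;
        d = mock_deref_parent(d)) {
      if (d->type == MOCK_DEREF_ARRAY && !d->index_is_const)
         return true;
   }
   return false;
}
```

The point of the helper is only readability: the loop body no longer has to spell out how a deref reaches its parent instruction.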


Thanks,
Caio


[Mesa-dev] [PATCH v2] meson: fix warnings about comparing unlike types

2018-04-06 Thread Dylan Baker
In the old days (0.42.x), when mesa's meson system was written, the
recommendation for handling conditional dependencies was to define them
as empty lists. When meson would evaluate the dependencies of a target
it would recursively flatten all of the arguments, and empty lists would
be removed. There are some problems with this, among them that lists and
dependencies have different methods (namely .found()), so the
recommendation changed to use `dependency('', required : false)` for
such cases.  This has the advantage of providing a .found() method, so
there is no need to do things like `dep_foo != [] and dep_foo.found()`;
such a dependency should never exist.

I've tested this with 0.42 (the minimum we claim to support) and 0.45.
On 0.45 this removes warnings about comparing unlike types.

v2: - Use dependency('', required : false) instead of
  declare_dependency(); the latter will always report that it is
  found, which is not what we want.

Signed-off-by: Dylan Baker 
---
 meson.build   | 89 ---
 src/gallium/auxiliary/meson.build |  2 +-
 src/glx/apple/meson.build |  2 +-
 src/glx/meson.build   |  2 +-
 4 files changed, 49 insertions(+), 46 deletions(-)

diff --git a/meson.build b/meson.build
index ee2b4151e2f..7b01d0ab4b7 100644
--- a/meson.build
+++ b/meson.build
@@ -29,6 +29,8 @@ project(
   default_options : ['buildtype=debugoptimized', 'c_std=c99', 'cpp_std=c++11']
 )
 
+null_dep = dependency('', required : false)
+
 system_has_kms_drm = ['openbsd', 'netbsd', 'freebsd', 'dragonfly', 
'linux'].contains(host_machine.system())
 
 # Arguments for the preprocessor, put these in a separate array from the C and
@@ -422,7 +424,7 @@ elif _vdpau == 'auto'
   _vdpau = 'true'
 endif
 with_gallium_vdpau = _vdpau == 'true'
-dep_vdpau = []
+dep_vdpau = null_dep
 if with_gallium_vdpau
   dep_vdpau = dependency('vdpau', version : '>= 1.1')
   dep_vdpau = declare_dependency(
@@ -461,7 +463,7 @@ elif _xvmc == 'auto'
   _xvmc = 'true'
 endif
 with_gallium_xvmc = _xvmc == 'true'
-dep_xvmc = []
+dep_xvmc = null_dep
 if with_gallium_xvmc
   dep_xvmc = dependency('xvmc', version : '>= 1.0.6')
 endif
@@ -491,7 +493,8 @@ elif not (with_gallium_r600 or with_gallium_radeonsi or 
with_gallium_nouveau)
 error('OMX state tracker requires at least one of the following gallium 
drivers: r600, radeonsi, nouveau.')
   endif
 endif
-dep_omx = []
+with_gallium_omx = _omx
+dep_omx = null_dep
 dep_omx_other = []
 if ['auto', 'bellagio'].contains(_omx)
   dep_omx = dependency(
@@ -579,7 +582,7 @@ elif _va == 'auto'
   _va = 'true'
 endif
 with_gallium_va = _va == 'true'
-dep_va = []
+dep_va = null_dep
 if with_gallium_va
   dep_va = dependency('libva', version : '>= 0.38.0')
   dep_va_headers = declare_dependency(
@@ -638,7 +641,7 @@ if _opencl != 'disabled'
   with_gallium_opencl = true
   with_opencl_icd = _opencl == 'icd'
 else
-  dep_clc = []
+  dep_clc = null_dep
   with_gallium_opencl = false
   with_gallium_icd = false
 endif
@@ -831,7 +834,7 @@ else
 endif
 
 # Check for GCC style atomics
-dep_atomic = declare_dependency()
+dep_atomic = null_dep
 
 if cc.compiles('int main() { int n; return __atomic_load_n(&n, 
__ATOMIC_ACQUIRE); }',
name : 'GCC atomic builtins')
@@ -976,7 +979,7 @@ endif
 
 # check for dl support
 if cc.has_function('dlopen')
-  dep_dl = []
+  dep_dl = null_dep
 else
   dep_dl = cc.find_library('dl')
 endif
@@ -995,7 +998,7 @@ endif
 
 # Determine whether or not the rt library is needed for time functions
 if cc.has_function('clock_gettime')
-  dep_clock = []
+  dep_clock = null_dep
 else
   dep_clock = cc.find_library('rt')
 endif
@@ -1013,7 +1016,7 @@ if with_amd_vk or with_gallium_radeonsi or 
with_gallium_r600 or with_gallium_ope
 dep_elf = cc.find_library('elf')
   endif
 else
-  dep_elf = []
+  dep_elf = null_dep
 endif
 dep_expat = dependency('expat')
 # this only exists on linux so either this is linux and it will be found, or
@@ -1024,12 +1027,12 @@ dep_m = cc.find_library('m', required : false)
 # but we always want to use the same version for all libdrm modules. That means
 # even if driver foo requires 2.4.0 and driver bar requires 2.4.3, if foo and
 # bar are both on use 2.4.3 for both of them
-dep_libdrm_amdgpu = []
-dep_libdrm_radeon = []
-dep_libdrm_nouveau = []
-dep_libdrm_etnaviv = []
-dep_libdrm_freedreno = []
-dep_libdrm_intel = []
+dep_libdrm_amdgpu = null_dep
+dep_libdrm_radeon = null_dep
+dep_libdrm_nouveau = null_dep
+dep_libdrm_etnaviv = null_dep
+dep_libdrm_freedreno = null_dep
+dep_libdrm_intel = null_dep
 
 _drm_amdgpu_ver = '2.4.91'
 _drm_radeon_ver = '2.4.71'
@@ -1114,7 +1117,7 @@ elif _llvm == 'true'
   dep_llvm = dependency('llvm', version : _llvm_version, modules : 
llvm_modules)
   with_llvm = true
 else
-  dep_llvm = []
+  dep_llvm = null_dep
   with_llvm = false
 endif
 if with_llvm
@@ -1144,7 +1147,7 @@ elif with_amd_vk or with_gallium_radeonsi or 

Re: [Mesa-dev] [PATCH v3 023/104] nir/deref: Add a deref cleanup function

2018-04-06 Thread Caio Marcelo de Oliveira Filho
On Tue, Apr 03, 2018 at 11:32:50AM -0700, Jason Ekstrand wrote:
> Sometimes it's useful for a pass to be able to clean up its own derefs
> instead of waiting for DCE.  This little helper makes it very easy.
> ---
>  src/compiler/nir/nir.h   |  2 ++
>  src/compiler/nir/nir_deref.c | 13 +
>  2 files changed, 15 insertions(+)


The helper is used in earlier patches, so maybe reorder. If I'm not
mistaken it is used as early as patch 13 ("nir: Support deref
instructions in remove_dead_variables").


Thanks,
Caio


Re: [Mesa-dev] [Mesa-stable] [PATCH] radeon/vce: move feedback command inside of destroy function

2018-04-06 Thread Ian Romanick
On 04/04/2018 11:30 AM, Mark Janes wrote:
> Emil Velikov  writes:
> 
>> On 4 April 2018 at 17:40, Mark Janes  wrote:
>>> Leo Liu  writes:
>>>
>>>> On the CI family, firmware requires the destory command have to be the
>>>> last command in the IB, moving feedback command after destroy is causing
>>>> issues on CI cards, so we have to keep the previous logic that moves
>>>> destroy back to the last command.
>>>>
>>>> But as the original issue fixed previously, with the newer family like
>>>> Vega10,
>>>> feedback command have to be included inside of the task info command along
>>>> with destroy command.
>>>>
>>>> Fixes: 6d74cb25("radeon/vce: move destroy command before feedback command")
>>>>
>>>> Signed-off-by: Leo Liu
>>>> Cc: mesa-sta...@lists.freedesktop.org
>>>
>>> These tags seem ambiguous to me.  If this commit fixes a specific
>>> commit, then the patch should be applied only to stable branches which
>>> contain that commit.
>>>
>>> However, the mesa-stable CC caused this patch to be applied to 17.3,
>>> which does *not* contain the broken patch.
>>>
>>> Leo: did you intend for the mesa-stable CC to cause this patch to be
>>> applied to older stable branches?
>>>
>>> Release managers: is there a protocol for how this specification should
>>> be parsed, when considering a patch for stable?
>>>
>> If the Fixes tag references a commit that is missing from a specific
>> stable branch then obviously the fix is not suitable.
>> Hence the stable piece can be ignored, alongside a reply to the
>> patch that it will not be in $stable_branch because $reason.
> 
> OK, we have violated this policy at least a couple times in the previous
> release, based on my audits:
> 
>   2f67c9b17518cf0d2fe946e39e5b8ff5ec2797c5
>   i965/vec4: Fix null destination register in 3-source instructions

I thought:

NOTE: I have sent a couple tests to the piglit list that reproduce this
bug *without* the commit mentioned below.  This commit fixes those
tests.

made it pretty clear that this is a pre-existing bug.  The commit
mentioned in Fixes: just made it happen more.


[Mesa-dev] [Bug 105871] Discolored KDE panels after updating to Mesa 18.0 on Intel broadwell

2018-04-06 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=105871

--- Comment #21 from Alexey Min  ---
(In reply to Tapani Pälli from comment #20)
> Problem is that kwin_x11 cannot easily avoid this because there is only one
> 32bit visual exposed and without the patch mentioned in bug #103699 it
> always has sRGB capability which makes things go bad. It is likely that
> somewhere we are missing a sRGB->linear conversion in that case. With that
> patch for Xorg I wanted to restore previous behaviour back.

Thank you for clarification, that's what I suspected after reading various bug
reports and mail list archives today. Well, hope it will be fixed then :)

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


[Mesa-dev] [Bug 105932] OpenGL scene corrupt using VMware driver

2018-04-06 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=105932

Bug ID: 105932
   Summary: OpenGL scene corrupt using VMware driver
   Product: Mesa
   Version: 17.2
  Hardware: Other
OS: All
Status: NEW
  Severity: normal
  Priority: medium
 Component: Drivers/X11
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: lo...@bacoosta.com
QA Contact: mesa-dev@lists.freedesktop.org

Created attachment 138661
  --> https://bugs.freedesktop.org/attachment.cgi?id=138661&action=edit
apitrace of scene

I have attached an apitrace of the scene in question.

The apitrace plays back normally in Windows, and using the llvmpipe driver in
Mesa. However, in VMware, the scene is "garbled", everything looks wrong.

I actually took the apitrace capture using the VMware driver, where it looked
wrong, and when I play it back anywhere else, it looks normal.

I should point out, we've had one other user report this issue on a different
GPU, and it happens on Android as well, but no other desktop drivers.

https://github.com/gonetz/GLideN64/issues/1702

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


[Mesa-dev] [Bug 94957] dEQP failures on llvmpipe

2018-04-06 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=94957

--- Comment #11 from Roland Scheidegger  ---
(In reply to msdhedhi007 from comment #10)
> I am also seeing this same issue on Mesa 17.3.6. I wanted to know if there
> is an update /patch available for this issue. 

There is no goal as such to pass dEQP. As mentioned, some bugs are due to
performance optimizations (which can be disabled, albeit only on debug builds).
Some might not even be real bugs (also as mentioned, where I think dEQP relies
on behavior not guaranteed by the spec). For both of these types, there's no
interest in addressing these (albeit I suppose if you're talking about making
it possible to disable performance hacks on release builds, that could be
done).
As for the rest, patches welcome, but personally I've got little interest and
definitely no time to specifically look into dEQP failures.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


Re: [Mesa-dev] [Mesa-stable] [PATCH] radeon/vce: move feedback command inside of destroy function

2018-04-06 Thread Mark Janes
Emil Velikov  writes:

> (Dropping Leo, since it doesn't affect him. He's already subscribed to
> the list.)
>
> On 6 April 2018 at 19:20, Mark Janes  wrote:
>
>> I agree with you, however our release process still has a gap.  We
>> (Intel) test commits on master, and file bugs when we find them in i965
>> or other components.
>>
>> If those commits already have a stable tag in the commit message, they
>> will be shipped at a later date directly to customers, with no testing.
>> There is no way to blacklist broken patches in our Mesa's release
>> automation.
>>
> That's why I mentioned that the process cannot be fully automated ;-)
>
> Let me try to explain slightly differently. Amongst others you want:
>
> a) 24h (ish) buffer (getting closer to 0, as we reach the pre-release
> announcement) before landing fix in the stable branch.
>
> We have broken things _badly_ multiple times, so a balance between the two
> is essential.
>
> Looking at it from Jenkins POV:
> You don't want to test/bisect that master is broken, only to apply
> the same patch and run Jenkins on the same broken patch.
>
>  - when issues do happen, for example fdo#103626, there are currently
> two ways to handle it:
>
> 1) add the commit to bin/.cherry-ignore, which means that
> you miss the patch when it's actually fixed up.
> See a094314340387ef2463ed8b4ddc9317bc539832b for context.

You are right.  We can just add the commit to .cherry-ignore files in
affected branches when the bug bisects to something with a stable tag.

> 2) carefully/manually git cherry-pick
> Doing this allowed me to add the regression to the tracker, as
> otherwise we would have missed it for 18.0.0 ;-)
>
> Yet we could introduce on-hold list to cherry-ignore. It's fairly trivial.
>
>
> Hope that makes things a bit clearer.
>
> -Emil


Re: [Mesa-dev] [Mesa-stable] [PATCH] radeon/vce: move feedback command inside of destroy function

2018-04-06 Thread Emil Velikov
(Dropping Leo, since it doesn't affect him. He's already subscribed to
the list.)

On 6 April 2018 at 19:20, Mark Janes  wrote:

> I agree with you, however our release process still has a gap.  We
> (Intel) test commits on master, and file bugs when we find them in i965
> or other components.
>
> If those commits already have a stable tag in the commit message, they
> will be shipped at a later date directly to customers, with no testing.
> There is no way to blacklist broken patches in our Mesa's release
> automation.
>
That's why I mentioned that the process cannot be fully automated ;-)

Let me try to explain slightly differently. Amongst others you want:

a) 24h (ish) buffer (getting closer to 0, as we reach the pre-release
announcement) before landing fix in the stable branch.

We have broken things _badly_ multiple times, so a balance between the two
is essential.

Looking at it from Jenkins POV:
You don't want to test/bisect that master is broken, only to apply
the same patch and run Jenkins on the same broken patch.

 - when issues do happen, for example fdo#103626, there are currently
two ways to handle it:

1) add the commit to bin/.cherry-ignore, which means that
you miss the patch when it's actually fixed up.
See a094314340387ef2463ed8b4ddc9317bc539832b for context.

2) carefully/manually git cherry-pick
Doing this allowed me to add the regression to the tracker, as
otherwise we would have missed it for 18.0.0 ;-)

Yet we could introduce an on-hold list to cherry-ignore. It's fairly trivial.


Hope that makes things a bit clearer.

-Emil


Re: [Mesa-dev] [PATCH] RFC: Externd IMG_context_priority with NV_context_priority_realtime

2018-04-06 Thread Ben Widawsky

On 18-03-31 12:00:16, Chris Wilson wrote:

Quoting Kenneth Graunke (2018-03-30 19:20:57)

On Friday, March 30, 2018 7:40:13 AM PDT Chris Wilson wrote:
> For i915, we are proposing to use a quality-of-service parameter in
> addition to that of just a priority that usurps everyone. Due to our HW,
> preemption may not be immediate and will be forced to wait until an
> uncooperative process hits an arbitration point. To prevent that unduly
> impacting the privileged RealTime context, we back up the preemption
> request with a timeout to reset the GPU and forcibly evict the GPU hog
> in order to execute the new context.

I am strongly against exposing this in general.  Performing a GPU reset
in the middle of a batch can completely screw up whatever application
was running.  If the application is using robustness extensions, we may
be forced to return GL_DEVICE_LOST, causing the application to have to
recreate their entire GL context and start over.  If not, we may try to
let them limp on(*) - and hope they didn't get too badly damaged by some
of their commands not executing, or executing twice (if the kernel tries
to resubmit it).  But it may very well cause the app to misrender, or
even crash.


Yes, I think the revulsion has been universal. However, as a
quality-of-service guarantee, I can understand the appeal. The
difference is that instead of allowing a DoS for 6s or so as we
currently allow, we allow that to be specified by the context. As it
does allow one context to impact another, I want it locked down to
privileged processes. I have been using CAP_SYS_ADMIN as the potential
to do harm is even greater than exploiting the weak scheduler by
changing priority.



I'm not terribly worried about this on our hardware for 3d. Today, there is
exactly one case I think where this would happen, if you have a sufficiently
long running shader on a sufficiently large triangle.

The concern I have is about compute where I think we don't do preemption nearly
as well.


This seems like a crazy plan to me.  Scheduling has never been allowed
to just kill random processes.


That's not strictly true, as processes have their limits which if they
exceed they will be killed. On the CPU preemption is much better, the
issue of unyielding processes is pretty much limited to the kernel, where
we can run the NMI watchdog to kill broken code.


If you ever hit that case, then your
customers will see random application crashes, glitches, GPU hangs,
and be pretty unhappy with the result.  And not because something was
broken, but because somebody was impatient and an app was a bit slow.


Yes, that is their decision. Kill random apps so that their
uber-critical interface updates the clock.


If you have work that is so mission critical, maybe you shouldn't run it
on the same machine as one that runs applications which you care so
little about that you're willing to watch them crash and burn.  Don't
run the entertainment system on the flight computer, so to speak.


You are not the first to say that ;)
-Chris





Re: [Mesa-dev] [PATCH 3/3] egl/x11: Handle both depth 30 formats for eglCreateImage().

2018-04-06 Thread Mario Kleiner

On 04/06/2018 06:41 PM, Michel Dänzer wrote:

On 2018-04-06 06:18 PM, Mario Kleiner wrote:

On Fri, Apr 6, 2018 at 12:01 PM, Michel Dänzer  wrote:

On 2018-03-27 07:53 PM, Daniel Stone wrote:

On 12 March 2018 at 20:45, Mario Kleiner  wrote:

We need to distinguish if a backing pixmap of a window is
XRGB2101010 or XBGR2101010, as different gpu hw supports
different formats. NVidia hw prefers XBGR, whereas AMD and
Intel are happy with XRGB.

We use the red channel mask of the visual to distinguish at
depth 30, but because we can't easily get the associated
visual of a Pixmap, we use the visual of the x-screens root
window instead as a proxy.

This fixes desktop composition of color depth 30 windows
when the X11 compositor uses EGL.
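
A minimal sketch of the red-mask test described above, assuming the usual 2:10:10:10 packing in which XRGB2101010 keeps red in bits 29:20 and XBGR2101010 keeps red in bits 9:0. The enum and function names are hypothetical and are not taken from the patch:

```c
#include <stdint.h>

/* Hypothetical result type; the real code would map to DRI/DRM formats. */
enum depth30_format {
   DEPTH30_UNKNOWN,
   DEPTH30_XRGB2101010,
   DEPTH30_XBGR2101010
};

/* Classify a depth 30 visual by where its red channel sits. */
static enum depth30_format
depth30_format_from_red_mask(uint32_t red_mask)
{
   switch (red_mask) {
   case 0x3ff00000u:   /* red in bits 29:20 */
      return DEPTH30_XRGB2101010;
   case 0x000003ffu:   /* red in bits 9:0 */
      return DEPTH30_XBGR2101010;
   default:
      return DEPTH30_UNKNOWN;
   }
}
```

In this sketch the mask would come from the visual of the x-screen's root window, which is the proxy the patch description proposes.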


I have no reason to doubt your testing, so this patch is:
Acked-by: Daniel Stone 

But it does rather fill me with trepidation, given that X11 Pixmaps
are supposed to be a dumb 'bag of bits', doing nothing else than
providing the same number and size of channels to the actual client
data for the Visual associated with the Window.


As far as X11 is concerned, the number of channels and their sizes don't
even matter; a pixmap is simply a container for an unsigned integer of n
bits (where n is the pixmap depth) per pixel, with no inherent meaning
attached to those values.

That said, I'm not sure this is true for EGL as well. But even if it
isn't, there would have to be another mechanism to determine the format,
e.g. a config associated with the EGL pixmap. The pixmap doesn't even
necessarily have the same depth as the root window, so using the
latter's visual doesn't make much sense.


Hi Michel. I thought with this patch i was implementing what you
proposed earlier as a heuristic on how to get around the "pixmaps
don't have an inherent format, only a depth" problem?


Do you have a pointer to that discussion?



Ok, apologies, i think i was just taking your comment too far as an 
inspiration. The best i can find in my inbox atm. is this message of 
yours from 24th November 2017 10:44 AM in a mesa-dev thread "Re: 
[Mesa-dev] 10-bit Mesa/Gallium support":


"Apologies for the badly formatted followup before, let's try that again:

On 2017-11-23 07:31 PM, Mario Kleiner wrote:
>
> 3. In principle the clean solution for nouveau would be to upgrade the
> ddx to drmAddFB2 ioctl, and use xbgr2101010 scanout to support
> everything back to nv50+, but everything we have in X or Wayland is
> meant for xrgb2101010 not xbgr2101010. And we run into ambiguities of
> what, e.g., a depth 30 pixmap means in some extensions like
> glx_texture_form_pixmap.

A pixmap itself never has a format per se, it's just a container for an
n-bit integer value per pixel (where n is the pixmap depth). A
compositor using GLX_EXT_texture_from_pixmap has to determine the format
from the corresponding window's visual.


--
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer
"

There's nothing in there that suggests my root window solution.
I guess i thought given that we can not get the visual of the window 
corresponding to the pixmap, let's find some window which is a good 
enough proxy for onscreen windows with associated depth 30 pixmaps on 
the same x-screen.





My (possibly inaccurate) understanding is that one can only create a
depth 30 pixmap if the x-screen runs at depth >= 30. It only exposes
depth 30 as supported pixmap format (xdpyinfo) if xorg.conf
DefaultDepth 30 is selected, whereas other depths like
1,4,8,15,16,24,32 are always supported at default depth 24.


That sounds like an X server issue. Just like 32, there's no fundamental
reason a pixmap couldn't have depth 30 despite the screen depth being lower.

Out of curiosity, can you share the output of xdpyinfo with nouveau at
depth 30?



Will have to do that later at the machine. But unless i misremember that 
as well, xdpyinfo always gives me this, if i run at DefaultDepth 24:


"number of supported pixmap formats:7
supported pixmap formats:
depth 1, bits_per_pixel 1, scanline_pad 32
depth 4, bits_per_pixel 8, scanline_pad 32
depth 8, bits_per_pixel 8, scanline_pad 32
depth 15, bits_per_pixel 16, scanline_pad 32
depth 16, bits_per_pixel 16, scanline_pad 32
depth 24, bits_per_pixel 32, scanline_pad 32
depth 32, bits_per_pixel 32, scanline_pad 32
keycode range:minimum 8, maximum 255
"

At least i don't remember seeing any "depth 30, ..." line ever on any 
driver+gpu combo if i run X at default depth 24?





Iff depth 30 is selected, then the root window has depth 30, and a depth 30
visual. If each driver only exports one channel ordering for depth 30,
then the channel ordering of any pixmaps associated drawable should be
the same as the one of the root window.


Repeat after me: "X11 pixmaps don't have a format." :) They're just bags
of bits.




[Mesa-dev] [Bug 105904] Needed to delete mesa shader cache after driver upgrade for 32 bit wine vulkan programs to work.

2018-04-06 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=105904

--- Comment #2 from Snubb  ---
I'm sorry but I can't be more specific than that the game "World of Tanks" that
I've been playing didn't start after the driver update: no window, nothing.
When it didn't start I tried to run the 64 bit cube.exe from the LunarG Vulkan SDK, and
it did run.
Then I tried the 32 bit cube.exe from the LunarG SDK, and it crashed. Weird, because
it ran before the driver update, and the 64 bit cube.exe still worked.
I didn't know what to do (I'm no genius), so I deleted ~/.cache/mesa/ and
~/.cache/radv_builtin_shaders, and voila, 32 bit Vulkan apps in wine started
working (32 bit cube.exe and dxvk).

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


[Mesa-dev] [Bug 94957] dEQP failures on llvmpipe

2018-04-06 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=94957

--- Comment #10 from msdhedhi...@gmail.com ---
I am also seeing this same issue on Mesa 17.3.6. I wanted to know if there is
an update /patch available for this issue. 
Thanks.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


Re: [Mesa-dev] [Mesa-stable] [PATCH] radeon/vce: move feedback command inside of destroy function

2018-04-06 Thread Mark Janes
Emil Velikov  writes:

> On 5 April 2018 at 20:33, Mark Janes  wrote:
>> Emil Velikov  writes:
>>
>>> On 4 April 2018 at 22:50, Mark Janes  wrote:
>>>> Leo Liu  writes:
>>>>
>>>>> On 04/04/2018 12:40 PM, Mark Janes wrote:
>>>>>> Leo Liu  writes:
>>>>>>
>>>>>>> On the CI family, firmware requires the destory command have to be the
>>>>>>> last command in the IB, moving feedback command after destroy is causing
>>>>>>> issues on CI cards, so we have to keep the previous logic that moves
>>>>>>> destroy back to the last command.
>>>>>>>
>>>>>>> But as the original issue fixed previously, with the newer family like
>>>>>>> Vega10,
>>>>>>> feedback command have to be included inside of the task info command
>>>>>>> along
>>>>>>> with destroy command.
>>>>>>>
>>>>>>> Fixes: 6d74cb25("radeon/vce: move destroy command before feedback
>>>>>>> command")
>>>>>>>
>>>>>>> Signed-off-by: Leo Liu
>>>>>>> Cc: mesa-sta...@lists.freedesktop.org
>>>>>> These tags seem ambiguous to me.  If this commit fixes a specific
>>>>>> commit, then the patch should be applied only to stable branches which
>>>>>> contain that commit.
>>>>>>
>>>>>> However, the mesa-stable CC caused this patch to be applied to 17.3,
>>>>>> which does *not* contain the broken patch.
>>>>>>
>>>>>> Leo: did you intend for the mesa-stable CC to cause this patch to be
>>>>>> applied to older stable branches?
>>>>> I would like to have this patch apply to branches "17.2", "17.3",
>>>>> "18.0", which got patch titled "radeon/vce: move destroy command before
>>>>> feedback command"
>>>>
>>>> Ok, I understand now.  You cc'd a buggy patch to stable, and the bug was
>>>> shipped in 17.3.1.

>>> May I suggest phrasing things less personally. Mistakes happen, so
>>> let's work in providing suggestions for improvement as opposed to "you
>>> did X/Y".
>>
>> Thank you for the feedback.  I was trying to state the facts, but I
>> understand how this could be read as a criticism.
>>
> Does that mean you tested radeon/vce and observed the breakage?
> 
>
>> As you say, mistakes happen -- and when they happen on the stable
>> branches, there is very little to protect the end users.  Could we
>> enhance automation to prevent this situation?  For example:
>>
> Consistent testing/reporting is needed. I believe I've mentioned it before:
>
> You are the only one who consistently provides feedback about the state.
> There have been individuals who report, and while I'm very grateful, these
> reports are few and far between.
>
> Approx 4 years ago Carl suggested another alternative. Roughly put:
>
> Driver specific patches are _omitted_ unless team member has
> explicitly tested them.
> Needless to say plan did not go forward - see the whole thread [1].
>
> One thing that it had in common with recent discussion is a
> tester/frontman/maintainer/etc for each team.
>
> Having such a person alongside the actual testing is optional, yet
> _highly_ recommended.
>
> As you know Intel's team is the largest one, perhaps as big as all the
> others combined.  So expecting the same amount of manpower and time
> dedication is impossible.

I agree with you, however our release process still has a gap.  We
(Intel) test commits on master, and file bugs when we find them in i965
or other components.

If those commits already have a stable tag in the commit message, they
will be shipped at a later date directly to customers, with no testing.
There is no way to blacklist broken patches in our Mesa's release
automation.

> HTH
> Emil
>
> [1] https://lists.freedesktop.org/archives/mesa-dev/2014-July/062992.html
> ___
> mesa-stable mailing list
> mesa-sta...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-stable
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] dri3: Prevent multiple freeing of buffers.

2018-04-06 Thread Thomas Hellstrom

Hi,

On 04/06/2018 04:51 PM, Daniel Stone wrote:

Hi Sergii,

On 6 April 2018 at 09:12, Sergii Romantsov  wrote:

Commit 3160cb86aa92 adds an optimization with the 'reallocate' flag.
Processing this flag frees buffers while a pointer to them is still
held on the caller's stack; that pointer is then used to free them again.

Thanks a lot for writing this. I take it the core of the problem is
that dri3_handle_present_event() can be called whilst we're inside
dri3_get_buffer(), which wasn't the case before.

This was only introduced as of a727c804a2c1, and I'm not sure I fully
follow the rationale for that commit. Thomas, why do we need to
process the events? I guess we could also fake it by turning 'busy'
into a refcount, which would be incremented/decremented as it is today
when posting buffers and getting Idle events, but also when we're
holding a local pointer which we can't have stolen from under us.

Cheers,
Daniel


The motivation for this commit IIRC was that with internal glretrace
automated tests, we typically would end up with corrupt rendering due to
invalid viewports after window resizes. The resize events were typically
not picked up as fast with dri3 as with dri2, so due to the lack of a
documented strategy for handling window and viewport resizes with dri3
clients, I tried to make it mimic dri2, where we had no such issues. The
reason for the slow pickup was that dri3 was waiting for fences rather
than on X replies...


/Thomas
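
Daniel's suggestion above, turning 'busy' into a refcount so that a buffer
cannot be freed while a caller still holds a local pointer to it, could be
sketched roughly as follows. This is only an illustration under stated
assumptions: struct sketch_buffer and the sketch_* functions are
hypothetical names, not the loader_dri3 data structures, and the "free" is
modelled as a flag instead of releasing real resources.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a DRI3 buffer; not the real loader struct. */
struct sketch_buffer {
   int  refcount;      /* replaces the boolean 'busy' flag */
   bool free_pending;  /* a free was requested while still referenced */
   bool freed;         /* illustration only: marks the buffer as released */
};

/* Take a reference: when posting the buffer, or when holding a local
 * pointer across a call that may process events. */
static void sketch_buffer_ref(struct sketch_buffer *buf)
{
   buf->refcount++;
}

/* Drop a reference; carry out a deferred free once nobody holds the buffer. */
static void sketch_buffer_unref(struct sketch_buffer *buf)
{
   if (--buf->refcount == 0 && buf->free_pending)
      buf->freed = true;
}

/* Request a free, e.g. from 'reallocate' handling in an event handler.
 * The free is deferred while any reference is outstanding. */
static void sketch_buffer_request_free(struct sketch_buffer *buf)
{
   if (buf->refcount > 0)
      buf->free_pending = true;
   else
      buf->freed = true;
}
```

With this shape, an event handler that runs while a caller still holds a
reference only marks the buffer, and the actual free happens exactly once,
at the final unref.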



Re: [Mesa-dev] Nouveau driver problem when using EGL_LINUX_DMA_BUF_EXT

2018-04-06 Thread Ilia Mirkin
Is the dma buf backed by a GEM object? In
nouveau_screen_bo_from_handle, we assume that it's a PRIME handle, and
look up the associated GEM object.

https://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/drivers/nouveau/nouveau_screen.c#n90
https://cgit.freedesktop.org/mesa/drm/tree/nouveau/nouveau.c#n789

Not sure if this is correct. Is this DMABUF even in system memory?
Otherwise this whole thing can't work.

I think you may be the first to explore this use-case, so expect a
bumpy road ahead. Note that rendering to sysram = very slow, so you
probably will just want to copy such texture objects to vram (i.e.
just create a new texture, and use glCopyImageSubData). Depends on
precisely what you're doing, I suppose.

Cheers,

  -ilia

On Fri, Apr 6, 2018 at 1:33 PM, Volker Vogelhuber
 wrote:
> Not sure if this is the right mailing list, or if the problem may belong to
> the libdrm part.
> I'm currently trying to import a DMABUF from V4L2 UVC source (using
> VIDIOC_EXPBUF) into OpenGL using EGL_LINUX_DMA_BUF_EXT. While this is
> working fine with the i915 driver it fails with the Nouveau driver.
> As a test case I have a UVC camera with a resolution of 400x400 and an 8bit
> raw bayer format. So the following attributes are set during the EGL image
> creation:
>
> // Texture width
> attrs.push_back(EGL_WIDTH);
> attrs.push_back(400);
> // Texture height
> attrs.push_back(EGL_HEIGHT);
> attrs.push_back(400);
> // Color
> attrs.push_back(EGL_LINUX_DRM_FOURCC_EXT);
> attrs.push_back(DRM_FORMAT_R8);
> // FD
> attrs.push_back(EGL_DMA_BUF_PLANE0_FD_EXT);
> attrs.push_back(fd);
> // Offset
> attrs.push_back(EGL_DMA_BUF_PLANE0_OFFSET_EXT);
> attrs.push_back(0);
> // Pitch
> attrs.push_back(EGL_DMA_BUF_PLANE0_PITCH_EXT);
> attrs.push_back(400);
>
> eglCreateImage( eglGetCurrentDisplay(), EGL_NO_CONTEXT,
> EGL_LINUX_DMA_BUF_EXT, NULL, [0] );
>
> So far no error or any other problem. But when I want to render the image it
> is distorted, as if the stride is not correct. I debugged into the system
> libraries but couldn't find any code that may give a hint if there are some
> constraints to be met regarding the stride when importing a DMABUF into the
> nouveau driver. The only thing I found was that while the size of the V4L2
> buffer is 400x400x1 = 160000 bytes, the size returned by
>
> drmCommandWriteRead(drm->fd, DRM_NOUVEAU_GEM_INFO, , sizeof(req));
>
> in the libdrm nouveau part is a page-aligned 163840 bytes.
>
> While the sample case with this camera only resulted in a wrongly displayed
> image, I also have another V4L2 source with RGBX format where using the
> texture with DMABUF even results in a total crash of my machine. I haven't
> debugged that case further as I wanted to resolve the issue with the 400x400
> image first (debugging is easier if the machine does not freeze all the
> time).
>
> I'm currently running Ubuntu 17.10 with libdrm 2.4.83 and mesa 17.2.8. So
> the libraries are not the most current ones. Are there any known issues for
> my use case?
>


Re: [Mesa-dev] [Mesa-stable] [PATCH] radeon/vce: move feedback command inside of destroy function

2018-04-06 Thread Emil Velikov
On 5 April 2018 at 20:33, Mark Janes  wrote:
> Emil Velikov  writes:
>
>> On 4 April 2018 at 22:50, Mark Janes  wrote:
>>> Leo Liu  writes:
>>>
>>>> On 04/04/2018 12:40 PM, Mark Janes wrote:
>>>>> Leo Liu  writes:
>>>>>
>>>>>> On the CI family, firmware requires the destroy command to be the
>>>>>> last command in the IB. Moving the feedback command after destroy
>>>>>> causes issues on CI cards, so we have to keep the previous logic
>>>>>> that moves destroy back to the last command.
>>>>>>
>>>>>> But as with the original issue fixed previously, on newer families
>>>>>> like Vega10 the feedback command has to be included inside the task
>>>>>> info command along with the destroy command.
>>>>>>
>>>>>> Fixes: 6d74cb25("radeon/vce: move destroy command before feedback command")
>>>>>>
>>>>>> Signed-off-by: Leo Liu 
>>>>>> Cc: mesa-sta...@lists.freedesktop.org
>>>>> These tags seem ambiguous to me.  If this commit fixes a specific
>>>>> commit, then the patch should be applied only to stable branches which
>>>>> contain that commit.
>>>>>
>>>>> However, the mesa-stable CC caused this patch to be applied to 17.3,
>>>>> which does *not* contain the broken patch.
>>>>>
>>>>> Leo: did you intend for the mesa-stable CC to cause this patch to be
>>>>> applied to older stable branches?
>>>> I would like to have this patch apply to branches "17.2", "17.3",
>>>> "18.0", which got patch titled "radeon/vce: move destroy command before
>>>> feedback command"
>>>
>>> Ok, I understand now.  You cc'd a buggy patch to stable, and the bug was
>>> shipped in 17.3.1.
>>>
>> May I suggest phrasing things less personally. Mistakes happen, so
>> let's work on providing suggestions for improvement as opposed to "you
>> did X/Y".
>
> Thank you for the feedback.  I was trying to state the facts, but I
> understand how this could be read as a criticism.
>
Does that mean you tested radeon/vce and observed the breakage?


> As you say, mistakes happen -- and when they happen on the stable
> branches, there is very little to protect the end users.  Could we
> enhance automation to prevent this situation?  For example:
>
Consistent testing/reporting is needed. I believe I've mentioned it before:

You are the only one who consistently provides feedback about the state.
Other individuals have reported issues and, while I'm very grateful,
these reports are few and far between.

Approx 4 years ago Carl suggested another alternative. Roughly put:

Driver specific patches are _omitted_ unless a team member has
explicitly tested them.
Needless to say, the plan did not go forward - see the whole thread [1].

One thing that it had in common with recent discussion is a
tester/frontman/maintainer/etc for each team.

Having such a person alongside the actual testing is optional, yet
_highly_ recommended.

As you know Intel's team is the largest one, perhaps as big as all the
others combined.
So expecting the same amount of manpower and time dedication is impossible.

HTH
Emil

[1] https://lists.freedesktop.org/archives/mesa-dev/2014-July/062992.html
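
The rule Mark states in the quoted text above (a patch carrying a Fixes:
tag should only be applied to stable branches that contain the commit it
fixes) could be sketched as below. This is a minimal sketch and not Mesa's
release tooling: the function names and the representation of a branch as a
list of abbreviated hashes are assumptions made purely for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical: 'branch_commits' lists the abbreviated hashes present on a
 * stable branch. Hashes are compared as exact strings for simplicity. */
static bool branch_contains(const char *const *branch_commits, size_t count,
                            const char *sha)
{
   for (size_t i = 0; i < count; ++i) {
      if (strcmp(branch_commits[i], sha) == 0)
         return true;
   }
   return false;
}

/* A stable-nominated patch with a Fixes: tag is picked only when the branch
 * contains the commit it fixes. Without a Fixes: tag (fixes_sha == NULL),
 * the nomination alone decides. */
static bool should_pick_for_branch(const char *fixes_sha,
                                   const char *const *branch_commits,
                                   size_t count)
{
   if (fixes_sha == NULL)
      return true;
   return branch_contains(branch_commits, count, fixes_sha);
}
```

A check of this kind would only address the "wrong branch" case; it does
not replace the per-driver testing discussed in this thread.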


[Mesa-dev] Nouveau driver problem when using EGL_LINUX_DMA_BUF_EXT

2018-04-06 Thread Volker Vogelhuber
Not sure if this is the right mailing list, or if the problem may belong 
to the libdrm part.
I'm currently trying to import a DMABUF from V4L2 UVC source (using 
VIDIOC_EXPBUF) into OpenGL using EGL_LINUX_DMA_BUF_EXT. While this is 
working fine with the i915 driver it fails with the Nouveau driver.
As a test case I have a UVC camera with a resolution of 400x400 and an 
8bit raw bayer format. So the following attributes are set during the 
EGL image creation:


// Texture width
attrs.push_back(EGL_WIDTH);
attrs.push_back(400);
// Texture height
attrs.push_back(EGL_HEIGHT);
attrs.push_back(400);
// Color
attrs.push_back(EGL_LINUX_DRM_FOURCC_EXT);
attrs.push_back(DRM_FORMAT_R8);
// FD
attrs.push_back(EGL_DMA_BUF_PLANE0_FD_EXT);
attrs.push_back(fd);
// Offset
attrs.push_back(EGL_DMA_BUF_PLANE0_OFFSET_EXT);
attrs.push_back(0);
// Pitch
attrs.push_back(EGL_DMA_BUF_PLANE0_PITCH_EXT);
attrs.push_back(400);

eglCreateImage( eglGetCurrentDisplay(), EGL_NO_CONTEXT, 
EGL_LINUX_DMA_BUF_EXT, NULL, [0] );


So far no error or any other problem. But when I want to render the
image it is distorted, as if the stride is not correct. I debugged
into the system libraries but couldn't find any code that may give a
hint if there are some constraints to be met regarding the stride when
importing a DMABUF into the nouveau driver. The only thing I found was
that while the size of the V4L2 buffer is 400x400x1 = 160000 bytes, the
size returned by


drmCommandWriteRead(drm->fd, DRM_NOUVEAU_GEM_INFO, , sizeof(req));

in the libdrm nouveau part is a page-aligned 163840 bytes.
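
For reference, the two sizes mentioned here are consistent with each other
if a 4096-byte page size is assumed (an assumption, the page size is not
stated in this message): 400 * 400 * 1 = 160000 bytes, and rounding that up
to whole pages gives 163840. The helper names below are made up for this
sketch.

```c
#include <stddef.h>

/* Round 'size' up to the next multiple of 'page_size'.
 * 'page_size' must be a power of two. */
static size_t align_up(size_t size, size_t page_size)
{
   return (size + page_size - 1) & ~(page_size - 1);
}

/* Size of a tightly packed image: no padding at the end of each row. */
static size_t packed_image_size(size_t width, size_t height,
                                size_t bytes_per_pixel)
{
   return width * height * bytes_per_pixel;
}
```

So the larger GEM size on its own would fit plain page rounding of the
buffer and says nothing about the row pitch the driver expects.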

While the sample case with this camera only resulted in a wrongly 
displayed image, I also have another V4L2 source with RGBX format where 
using the texture with DMABUF even results in a total crash of my 
machine. I haven't debugged that case further as I wanted to resolve the 
issue with the 400x400 image first (debugging is easier if the machine 
does not freeze all the time).


I'm currently running Ubuntu 17.10 with libdrm 2.4.83 and mesa 17.2.8.
So the libraries are not the most current ones. Are there any known
issues for my use case?




[Mesa-dev] [PATCH 1/8] radv: add radv_get_cmask_fast_clear_value() helper

2018-04-06 Thread Samuel Pitoiset
DCC for MSAA textures is currently unsupported, but this helper will
be used later on once it is.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_meta_clear.c | 23 ++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/src/amd/vulkan/radv_meta_clear.c b/src/amd/vulkan/radv_meta_clear.c
index 678de4275fa..7de2f2d0133 100644
--- a/src/amd/vulkan/radv_meta_clear.c
+++ b/src/amd/vulkan/radv_meta_clear.c
@@ -859,6 +859,22 @@ fail:
return res;
 }
 
+static uint32_t
+radv_get_cmask_fast_clear_value(const struct radv_image *image)
+{
+   uint32_t value = 0; /* Default value when no DCC. */
+
+   /* The fast-clear value is different for images that have both DCC and
+* CMASK metadata.
+*/
+   if (image->surface.dcc_size) {
+   /* DCC fast clear with MSAA should clear CMASK to 0xC. */
+   return image->info.samples > 1 ? 0x : 0x;
+   }
+
+   return value;
+}
+
 uint32_t
 radv_clear_cmask(struct radv_cmd_buffer *cmd_buffer,
 struct radv_image *image, uint32_t value)
@@ -970,6 +986,7 @@ emit_fast_color_clear(struct radv_cmd_buffer *cmd_buffer,
const struct radv_image_view *iview = 
fb->attachments[pass_att].attachment;
VkClearColorValue clear_value = clear_att->clearValue.color;
uint32_t clear_color[2], flush_bits;
+   uint32_t cmask_clear_value;
bool ret;
 
if (!iview->image->cmask.size && !iview->image->surface.dcc_size)
@@ -1030,6 +1047,9 @@ emit_fast_color_clear(struct radv_cmd_buffer *cmd_buffer,
} else
cmd_buffer->state.flush_bits |= RADV_CMD_FLAG_FLUSH_AND_INV_CB |

RADV_CMD_FLAG_FLUSH_AND_INV_CB_META;
+
+   cmask_clear_value = radv_get_cmask_fast_clear_value(iview->image);
+
/* clear cmask buffer */
if (iview->image->surface.dcc_size) {
uint32_t reset_value;
@@ -1043,7 +1063,8 @@ emit_fast_color_clear(struct radv_cmd_buffer *cmd_buffer,
radv_set_dcc_need_cmask_elim_pred(cmd_buffer, iview->image,
  !can_avoid_fast_clear_elim);
} else {
-   flush_bits = radv_clear_cmask(cmd_buffer, iview->image, 0);
+   flush_bits = radv_clear_cmask(cmd_buffer, iview->image,
+ cmask_clear_value);
}
 
if (post_flush) {
-- 
2.16.3



[Mesa-dev] [PATCH 6/8] radv: rename radv_image_is_tc_compat_htile()

2018-04-06 Thread Samuel Pitoiset
... to radv_use_tc_compat_htile_for_image(). This function
name makes more sense to me because we want to know if and
only if TC-compat HTILE should be used.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_image.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/amd/vulkan/radv_image.c b/src/amd/vulkan/radv_image.c
index 1a8352fea27..dc4781231d9 100644
--- a/src/amd/vulkan/radv_image.c
+++ b/src/amd/vulkan/radv_image.c
@@ -65,8 +65,8 @@ radv_choose_tiling(struct radv_device *device,
 }
 
 static bool
-radv_image_is_tc_compat_htile(struct radv_device *device,
- const VkImageCreateInfo *pCreateInfo)
+radv_use_tc_compat_htile_for_image(struct radv_device *device,
+  const VkImageCreateInfo *pCreateInfo)
 {
/* TC-compat HTILE is only available for GFX8+. */
if (device->physical_device->rad_info.chip_class < VI)
@@ -149,7 +149,7 @@ radv_init_surface(struct radv_device *device,
 
if (is_depth) {
surface->flags |= RADEON_SURF_ZBUFFER;
-   if (radv_image_is_tc_compat_htile(device, pCreateInfo))
+   if (radv_use_tc_compat_htile_for_image(device, pCreateInfo))
surface->flags |= RADEON_SURF_TC_COMPATIBLE_HTILE;
}
 
-- 
2.16.3



[Mesa-dev] [PATCH 5/8] radv: simplify a check in radv_initialise_color_surface()

2018-04-06 Thread Samuel Pitoiset
If the image has FMASK metadata, the number of samples is > 1
because radv_image_can_enable_fmask() handles that already.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_device.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index 846639eab0d..39e320e3771 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -3548,7 +3548,7 @@ radv_initialise_color_surface(struct radv_device *device,
format != V_028C70_COLOR_24_8) |
S_028C70_NUMBER_TYPE(ntype) |
S_028C70_ENDIAN(endian);
-   if ((iview->image->info.samples > 1) && 
radv_image_has_fmask(iview->image)) {
+   if (radv_image_has_fmask(iview->image)) {
cb->cb_color_info |= S_028C70_COMPRESSION(1);
if (device->physical_device->rad_info.chip_class == SI) {
unsigned fmask_bankh = 
util_logbase2(iview->image->fmask.bank_height);
-- 
2.16.3



[Mesa-dev] [PATCH 7/8] radv: add radv_use_dcc_for_image() helper

2018-04-06 Thread Samuel Pitoiset
And add some TODOs.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_image.c | 98 +++--
 1 file changed, 68 insertions(+), 30 deletions(-)

diff --git a/src/amd/vulkan/radv_image.c b/src/amd/vulkan/radv_image.c
index dc4781231d9..86d97ff83bf 100644
--- a/src/amd/vulkan/radv_image.c
+++ b/src/amd/vulkan/radv_image.c
@@ -103,6 +103,71 @@ radv_use_tc_compat_htile_for_image(struct radv_device 
*device,
return true;
 }
 
+static bool
+radv_use_dcc_for_image(struct radv_device *device,
+  const struct radv_image_create_info *create_info,
+  const VkImageCreateInfo *pCreateInfo)
+{
+   bool dcc_compatible_formats;
+   bool blendable;
+
+   /* DCC (Delta Color Compression) is only available for GFX8+. */
+   if (device->physical_device->rad_info.chip_class < VI)
+   return false;
+
+   if (device->instance->debug_flags & RADV_DEBUG_NO_DCC)
+   return false;
+
+   /* TODO: Enable DCC for storage images. */
+   if ((pCreateInfo->usage & VK_IMAGE_USAGE_STORAGE_BIT) ||
+   (pCreateInfo->flags & VK_IMAGE_CREATE_EXTENDED_USAGE_BIT_KHR))
+   return false;
+
+   if (pCreateInfo->tiling == VK_IMAGE_TILING_LINEAR)
+   return false;
+
+   /* TODO: Enable DCC for mipmaps and array layers. */
+   if (pCreateInfo->mipLevels > 1 || pCreateInfo->arrayLayers > 1)
+   return false;
+
+   if (create_info->scanout)
+   return false;
+
+   /* TODO: Enable DCC for MSAA textures. */
+   if (pCreateInfo->samples >= 2)
+   return false;
+
+   /* Determine if the formats are DCC compatible. */
+   dcc_compatible_formats =
+   radv_is_colorbuffer_format_supported(pCreateInfo->format,
+);
+
+   if (pCreateInfo->flags & VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT) {
+   const struct VkImageFormatListCreateInfoKHR *format_list =
+   (const struct  VkImageFormatListCreateInfoKHR *)
+   vk_find_struct_const(pCreateInfo->pNext,
+
IMAGE_FORMAT_LIST_CREATE_INFO_KHR);
+
+   /* We have to ignore the existence of the list if 
viewFormatCount = 0 */
+   if (format_list && format_list->viewFormatCount) {
+   /* compatibility is transitive, so we only need to check
+* one format with everything else. */
+   for (unsigned i = 0; i < format_list->viewFormatCount; 
++i) {
+   if 
(!radv_dcc_formats_compatible(pCreateInfo->format,
+
format_list->pViewFormats[i]))
+   dcc_compatible_formats = false;
+   }
+   } else {
+   dcc_compatible_formats = false;
+   }
+   }
+
+   if (!dcc_compatible_formats)
+   return false;
+
+   return true;
+}
+
 static int
 radv_init_surface(struct radv_device *device,
  struct radeon_surf *surface,
@@ -112,7 +177,7 @@ radv_init_surface(struct radv_device *device,
unsigned array_mode = radv_choose_tiling(device, create_info);
const struct vk_format_description *desc =
vk_format_description(pCreateInfo->format);
-   bool is_depth, is_stencil, blendable;
+   bool is_depth, is_stencil;
 
is_depth = vk_format_has_depth(desc);
is_stencil = vk_format_has_stencil(desc);
@@ -158,36 +223,9 @@ radv_init_surface(struct radv_device *device,
 
surface->flags |= RADEON_SURF_OPTIMIZE_FOR_SPACE;
 
-   bool dcc_compatible_formats = 
radv_is_colorbuffer_format_supported(pCreateInfo->format, );
-   if (pCreateInfo->flags & VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT) {
-   const struct  VkImageFormatListCreateInfoKHR *format_list =
- (const struct  VkImageFormatListCreateInfoKHR *)
-   vk_find_struct_const(pCreateInfo->pNext,
-
IMAGE_FORMAT_LIST_CREATE_INFO_KHR);
-
-   /* We have to ignore the existence of the list if 
viewFormatCount = 0 */
-   if (format_list && format_list->viewFormatCount) {
-   /* compatibility is transitive, so we only need to check
-* one format with everything else. */
-   for (unsigned i = 0; i < format_list->viewFormatCount; 
++i) {
-   if 
(!radv_dcc_formats_compatible(pCreateInfo->format,
-
format_list->pViewFormats[i]))
-   dcc_compatible_formats = false;
-   
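
The comment in the patch above relies on compatibility being transitive,
so checking the image format against each view format is enough and no
pairwise check between view formats is needed. A self-contained sketch of
that idea follows; same_class() is a made-up relation used only for
illustration and is NOT radv_dcc_formats_compatible(), and formats are
plain integers here.

```c
#include <stdbool.h>

/* Hypothetical compatibility relation: two formats are compatible when
 * they fall in the same block of ten. Equality of a class id is
 * symmetric and transitive by construction. */
typedef bool (*compat_fn)(int a, int b);

static bool same_class(int a, int b)
{
   return (a / 10) == (b / 10);
}

/* Returns true when every view format is compatible with the base format.
 * Given a symmetric, transitive relation this also implies the view
 * formats are compatible with each other, so 'count' checks suffice. */
static bool all_compatible_with_base(int base, const int *views,
                                     unsigned count, compat_fn compatible)
{
   for (unsigned i = 0; i < count; ++i) {
      if (!compatible(base, views[i]))
         return false;
   }
   return true;
}
```

Note that the patch treats a missing or empty format list as incompatible,
whereas this sketch simply returns true for an empty list.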

[Mesa-dev] [PATCH 0/8] radv: some cleanups & preliminary work for DCC MSAA

2018-04-06 Thread Samuel Pitoiset
Hi,

This small series is preliminary work ahead of some improvements
in the DCC/CMASK/FMASK codepaths. What I plan to do is:

- implement DCC for MSAA textures (I have a WIP branch)
- implement TC-compatible CMASK
- implement DCC for mipmaps and arrays

And probably some other improvements/cleanups.

Please review,
Thanks!

Samuel Pitoiset (8):
  radv: add radv_get_cmask_fast_clear_value() helper
  radv: add radv_image_has_{cmask,fmask,dcc,htile}() helpers
  radv: clean up radv_htile_enabled()
  radv: clean up radv_vi_dcc_enabled()
  radv: simplify a check in radv_initialise_color_surface()
  radv: rename radv_image_is_tc_compat_htile()
  radv: add radv_use_dcc_for_image() helper
  radv: add radv_image_is_tc_compat_htile() helper

 src/amd/vulkan/radv_cmd_buffer.c  |  18 ++---
 src/amd/vulkan/radv_device.c  |  22 +++---
 src/amd/vulkan/radv_image.c   | 124 ++
 src/amd/vulkan/radv_meta_clear.c  |  35 --
 src/amd/vulkan/radv_meta_copy.c   |   4 +-
 src/amd/vulkan/radv_meta_decompress.c |   2 +-
 src/amd/vulkan/radv_meta_fast_clear.c |   8 +--
 src/amd/vulkan/radv_meta_resolve.c|   4 +-
 src/amd/vulkan/radv_private.h |  58 +++-
 9 files changed, 193 insertions(+), 82 deletions(-)

-- 
2.16.3



[Mesa-dev] [PATCH 4/8] radv: clean up radv_vi_dcc_enabled()

2018-04-06 Thread Samuel Pitoiset
And rename to radv_dcc_enabled() to be consistent.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_device.c  |  2 +-
 src/amd/vulkan/radv_image.c   |  2 +-
 src/amd/vulkan/radv_private.h | 16 ++--
 3 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index caf6f00e634..846639eab0d 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -3560,7 +3560,7 @@ radv_initialise_color_surface(struct radv_device *device,
!(device->instance->debug_flags & RADV_DEBUG_NO_FAST_CLEARS))
cb->cb_color_info |= S_028C70_FAST_CLEAR(1);
 
-   if (radv_vi_dcc_enabled(iview->image, iview->base_mip))
+   if (radv_dcc_enabled(iview->image, iview->base_mip))
cb->cb_color_info |= S_028C70_DCC_ENABLE(1);
 
if (device->physical_device->rad_info.chip_class >= VI) {
diff --git a/src/amd/vulkan/radv_image.c b/src/amd/vulkan/radv_image.c
index 56b9ba1cdaf..1a8352fea27 100644
--- a/src/amd/vulkan/radv_image.c
+++ b/src/amd/vulkan/radv_image.c
@@ -294,7 +294,7 @@ si_set_mutable_tex_desc_fields(struct radv_device *device,
if (chip_class >= VI) {
state[6] &= C_008F28_COMPRESSION_EN;
state[7] = 0;
-   if (!is_storage_image && radv_vi_dcc_enabled(image, 
first_level)) {
+   if (!is_storage_image && radv_dcc_enabled(image, first_level)) {
meta_va = gpu_address + image->dcc_offset;
if (chip_class <= VI)
meta_va += base_level_info->dcc_offset;
diff --git a/src/amd/vulkan/radv_private.h b/src/amd/vulkan/radv_private.h
index fbdaa7d1601..d1acb8748e9 100644
--- a/src/amd/vulkan/radv_private.h
+++ b/src/amd/vulkan/radv_private.h
@@ -1383,12 +1383,6 @@ bool radv_layout_dcc_compressed(const struct radv_image 
*image,
VkImageLayout layout,
unsigned queue_mask);
 
-static inline bool
-radv_vi_dcc_enabled(const struct radv_image *image, unsigned level)
-{
-   return image->surface.dcc_size && level < image->surface.num_dcc_levels;
-}
-
 /**
  * Return whether the image has CMASK metadata for color surfaces.
  */
@@ -1416,6 +1410,16 @@ radv_image_has_dcc(const struct radv_image *image)
return image->surface.dcc_size;
 }
 
+/**
+ * Return whether DCC metadata is enabled for a level.
+ */
+static inline bool
+radv_dcc_enabled(const struct radv_image *image, unsigned level)
+{
+   return radv_image_has_dcc(image) &&
+  level < image->surface.num_dcc_levels;
+}
+
 /**
  * Return whether the image has HTILE metadata for depth surfaces.
  */
-- 
2.16.3



[Mesa-dev] [PATCH 2/8] radv: add radv_image_has_{cmask, fmask, dcc, htile}() helpers

2018-04-06 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_cmd_buffer.c  | 18 +-
 src/amd/vulkan/radv_device.c  | 10 +-
 src/amd/vulkan/radv_image.c   | 14 +++---
 src/amd/vulkan/radv_meta_clear.c  | 12 ++--
 src/amd/vulkan/radv_meta_copy.c   |  4 ++--
 src/amd/vulkan/radv_meta_decompress.c |  2 +-
 src/amd/vulkan/radv_meta_fast_clear.c |  8 
 src/amd/vulkan/radv_meta_resolve.c|  4 ++--
 src/amd/vulkan/radv_private.h | 36 +++
 9 files changed, 72 insertions(+), 36 deletions(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index d47325cc985..8e3537ded26 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -950,7 +950,7 @@ radv_set_depth_clear_regs(struct radv_cmd_buffer 
*cmd_buffer,
va += image->offset + image->clear_value_offset;
unsigned reg_offset = 0, reg_count = 0;
 
-   assert(image->surface.htile_size);
+   assert(radv_image_has_htile(image));
 
if (aspects & VK_IMAGE_ASPECT_STENCIL_BIT) {
++reg_count;
@@ -988,7 +988,7 @@ radv_load_depth_clear_regs(struct radv_cmd_buffer 
*cmd_buffer,
va += image->offset + image->clear_value_offset;
unsigned reg_offset = 0, reg_count = 0;
 
-   if (!image->surface.htile_size)
+   if (!radv_image_has_htile(image))
return;
 
if (aspects & VK_IMAGE_ASPECT_STENCIL_BIT) {
@@ -1027,7 +1027,7 @@ radv_set_dcc_need_cmask_elim_pred(struct radv_cmd_buffer 
*cmd_buffer,
uint64_t va = radv_buffer_get_va(image->bo);
va += image->offset + image->dcc_pred_offset;
 
-   assert(image->surface.dcc_size);
+   assert(radv_image_has_dcc(image));
 
radeon_emit(cmd_buffer->cs, PKT3(PKT3_WRITE_DATA, 4, 0));
radeon_emit(cmd_buffer->cs, S_370_DST_SEL(V_370_MEM_ASYNC) |
@@ -1048,7 +1048,7 @@ radv_set_color_clear_regs(struct radv_cmd_buffer 
*cmd_buffer,
uint64_t va = radv_buffer_get_va(image->bo);
va += image->offset + image->clear_value_offset;
 
-   assert(image->cmask.size || image->surface.dcc_size);
+   assert(radv_image_has_cmask(image) || radv_image_has_dcc(image));
 
radeon_emit(cmd_buffer->cs, PKT3(PKT3_WRITE_DATA, 4, 0));
radeon_emit(cmd_buffer->cs, S_370_DST_SEL(V_370_MEM_ASYNC) |
@@ -1072,7 +1072,7 @@ radv_load_color_clear_regs(struct radv_cmd_buffer 
*cmd_buffer,
uint64_t va = radv_buffer_get_va(image->bo);
va += image->offset + image->clear_value_offset;
 
-   if (!image->cmask.size && !image->surface.dcc_size)
+   if (!radv_image_has_cmask(image) && !radv_image_has_dcc(image))
return;
 
uint32_t reg = R_028C8C_CB_COLOR0_CLEAR_WORD0 + idx * 0x3c;
@@ -3635,7 +3635,7 @@ static void radv_handle_cmask_image_transition(struct 
radv_cmd_buffer *cmd_buffe
   const VkImageSubresourceRange 
*range)
 {
if (src_layout == VK_IMAGE_LAYOUT_UNDEFINED) {
-   if (image->fmask.size)
+   if (radv_image_has_fmask(image))
radv_initialise_cmask(cmd_buffer, image, 0xu);
else
radv_initialise_cmask(cmd_buffer, image, 0xu);
@@ -3711,18 +3711,18 @@ static void radv_handle_image_transition(struct 
radv_cmd_buffer *cmd_buffer,
unsigned src_queue_mask = radv_image_queue_family_mask(image, 
src_family, cmd_buffer->queue_family_index);
unsigned dst_queue_mask = radv_image_queue_family_mask(image, 
dst_family, cmd_buffer->queue_family_index);
 
-   if (image->surface.htile_size)
+   if (radv_image_has_htile(image))
radv_handle_depth_image_transition(cmd_buffer, image, 
src_layout,
   dst_layout, src_queue_mask,
   dst_queue_mask, range,
   pending_clears);
 
-   if (image->cmask.size || image->fmask.size)
+   if (radv_image_has_cmask(image) || radv_image_has_fmask(image))
radv_handle_cmask_image_transition(cmd_buffer, image, 
src_layout,
   dst_layout, src_queue_mask,
   dst_queue_mask, range);
 
-   if (image->surface.dcc_size)
+   if (radv_image_has_dcc(image))
radv_handle_dcc_image_transition(cmd_buffer, image, src_layout,
 dst_layout, src_queue_mask,
 dst_queue_mask, range);
diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index 41f8242754c..caf6f00e634 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -3463,7 +3463,7 @@ 

[Mesa-dev] [PATCH 3/8] radv: clean up radv_htile_enabled()

2018-04-06 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_private.h | 15 +--
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/src/amd/vulkan/radv_private.h b/src/amd/vulkan/radv_private.h
index 4847afc7424..fbdaa7d1601 100644
--- a/src/amd/vulkan/radv_private.h
+++ b/src/amd/vulkan/radv_private.h
@@ -1389,12 +1389,6 @@ radv_vi_dcc_enabled(const struct radv_image *image, 
unsigned level)
return image->surface.dcc_size && level < image->surface.num_dcc_levels;
 }
 
-static inline bool
-radv_htile_enabled(const struct radv_image *image, unsigned level)
-{
-   return image->surface.htile_size && level == 0;
-}
-
 /**
  * Return whether the image has CMASK metadata for color surfaces.
  */
@@ -1431,6 +1425,15 @@ radv_image_has_htile(const struct radv_image *image)
return image->surface.htile_size;
 }
 
+/**
+ * Return whether HTILE metadata is enabled for a level.
+ */
+static inline bool
+radv_htile_enabled(const struct radv_image *image, unsigned level)
+{
+   return radv_image_has_htile(image) && level == 0;
+}
+
 unsigned radv_image_queue_family_mask(const struct radv_image *image, uint32_t 
family, uint32_t queue_family);
 
 static inline uint32_t
-- 
2.16.3



[Mesa-dev] [PATCH 8/8] radv: add radv_image_is_tc_compat_htile() helper

2018-04-06 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_device.c | 10 +-
 src/amd/vulkan/radv_image.c  | 10 +-
 src/amd/vulkan/radv_meta_clear.c |  2 +-
 src/amd/vulkan/radv_meta_copy.c  |  2 +-
 src/amd/vulkan/radv_private.h|  9 +
 5 files changed, 21 insertions(+), 12 deletions(-)

diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index 39e320e3771..de184603eb0 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -3622,7 +3622,7 @@ radv_calc_decompress_on_z_planes(struct radv_device 
*device,
 {
unsigned max_zplanes = 0;
 
-   assert(iview->image->tc_compatible_htile);
+   assert(radv_image_is_tc_compat_htile(iview->image));
 
if (device->physical_device->rad_info.chip_class >= GFX9) {
/* Default value for 32-bit depth surfaces. */
@@ -3724,7 +3724,7 @@ radv_initialise_ds_surface(struct radv_device *device,
if (radv_htile_enabled(iview->image, level)) {
ds->db_z_info |= S_028038_TILE_SURFACE_ENABLE(1);
 
-   if (iview->image->tc_compatible_htile) {
+   if (radv_image_is_tc_compat_htile(iview->image)) {
unsigned max_zplanes =

radv_calc_decompress_on_z_planes(device, iview);
 
@@ -3752,7 +3752,7 @@ radv_initialise_ds_surface(struct radv_device *device,
z_offs += iview->image->surface.u.legacy.level[level].offset;
s_offs += 
iview->image->surface.u.legacy.stencil_level[level].offset;
 
-   ds->db_depth_info = 
S_02803C_ADDR5_SWIZZLE_MASK(!iview->image->tc_compatible_htile);
+   ds->db_depth_info = 
S_02803C_ADDR5_SWIZZLE_MASK(!radv_image_is_tc_compat_htile(iview->image));
ds->db_z_info = S_028040_FORMAT(format) | 
S_028040_ZRANGE_PRECISION(1);
ds->db_stencil_info = S_028044_FORMAT(stencil_format);
 
@@ -3797,7 +3797,7 @@ radv_initialise_ds_surface(struct radv_device *device,
ds->db_z_info |= S_028040_TILE_SURFACE_ENABLE(1);
 
if (!iview->image->surface.has_stencil &&
-   !iview->image->tc_compatible_htile)
+   !radv_image_is_tc_compat_htile(iview->image))
/* Use all of the htile_buffer for depth if 
there's no stencil. */
ds->db_stencil_info |= 
S_028044_TILE_STENCIL_DISABLE(1);
 
@@ -3806,7 +3806,7 @@ radv_initialise_ds_surface(struct radv_device *device,
ds->db_htile_data_base = va >> 8;
ds->db_htile_surface = S_028ABC_FULL_CACHE(1);
 
-   if (iview->image->tc_compatible_htile) {
+   if (radv_image_is_tc_compat_htile(iview->image)) {
unsigned max_zplanes =

radv_calc_decompress_on_z_planes(device, iview);
 
diff --git a/src/amd/vulkan/radv_image.c b/src/amd/vulkan/radv_image.c
index 86d97ff83bf..b35df1d172a 100644
--- a/src/amd/vulkan/radv_image.c
+++ b/src/amd/vulkan/radv_image.c
@@ -336,8 +336,8 @@ si_set_mutable_tex_desc_fields(struct radv_device *device,
meta_va = gpu_address + image->dcc_offset;
if (chip_class <= VI)
meta_va += base_level_info->dcc_offset;
-   } else if(!is_storage_image && image->tc_compatible_htile &&
- radv_image_has_htile(image)) {
+   } else if (!is_storage_image &&
+  radv_image_is_tc_compat_htile(image)) {
meta_va = gpu_address + image->htile_offset;
}
 
@@ -488,7 +488,7 @@ si_make_texture_descriptor(struct radv_device *device,
/* S8 with either Z16 or Z32 HTILE need a special format. */
if (device->physical_device->rad_info.chip_class >= GFX9 &&
vk_format == VK_FORMAT_S8_UINT &&
-   image->tc_compatible_htile) {
+   radv_image_is_tc_compat_htile(image)) {
if (image->vk_format == VK_FORMAT_D32_SFLOAT_S8_UINT)
data_format = V_008F14_IMG_DATA_FORMAT_S8_32;
else if (image->vk_format == VK_FORMAT_D16_UNORM_S8_UINT)
@@ -1201,7 +1201,7 @@ bool radv_layout_has_htile(const struct radv_image *image,
VkImageLayout layout,
unsigned queue_mask)
 {
-   if (radv_image_has_htile(image) && image->tc_compatible_htile)
+   if (radv_image_is_tc_compat_htile(image))
return layout != VK_IMAGE_LAYOUT_GENERAL;
 
return radv_image_has_htile(image) &&
@@ -1214,7 +1214,7 @@ bool radv_layout_is_htile_compressed(const struct 
radv_image *image,
  VkImageLayout layout,
   

Re: [Mesa-dev] [PATCH] mesa: Assert base format before truncating to unsigned short

2018-04-06 Thread Marek Olšák
Reviewed-by: Marek Olšák 

Marek

On Fri, Apr 6, 2018 at 10:26 AM, Topi Pohjolainen <
topi.pohjolai...@gmail.com> wrote:

> CID: 1433709
> Fixes: ca721b3d8: mesa: use GLenum16 in a few more places
> CC: Marek Olšák 
> CC: Brian Paul 
>
> Signed-off-by: Topi Pohjolainen 
> ---
>  src/mesa/main/teximage.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/src/mesa/main/teximage.c b/src/mesa/main/teximage.c
> index 8f53510..f560512 100644
> --- a/src/mesa/main/teximage.c
> +++ b/src/mesa/main/teximage.c
> @@ -845,6 +845,7 @@ _mesa_init_teximage_fields_ms(struct gl_context *ctx,
>  mesa_format format,
>  GLuint numSamples, GLboolean fixedSampleLocations)
>  {
> +   const GLint base_format =_mesa_base_tex_format(ctx, internalFormat);
> GLenum target;
> assert(img);
> assert(width >= 0);
> @@ -852,8 +853,8 @@ _mesa_init_teximage_fields_ms(struct gl_context *ctx,
> assert(depth >= 0);
>
> target = img->TexObject->Target;
> -   img->_BaseFormat = _mesa_base_tex_format( ctx, internalFormat );
> -   assert(img->_BaseFormat != -1);
> +   assert(base_format != -1);
> +   img->_BaseFormat = (GLenum16)base_format;
> img->InternalFormat = internalFormat;
> img->Border = border;
> img->Width = width;
> --
> 2.7.4
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
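
For reference, the reason the assert has to look at the wide value: once -1 has been stored in a 16-bit unsigned field, it reads back as 65535 and can never compare equal to -1. A minimal sketch, assuming a hypothetical lookup_base_format() in place of _mesa_base_tex_format() and a uint16_t field in place of GLenum16 (none of these names are from the patch):

```c
#include <stdint.h>

/* Hypothetical stand-in for _mesa_base_tex_format(): -1 means "unknown". */
int lookup_base_format(int internal_format)
{
   return internal_format == 0x1908 /* GL_RGBA */ ? 0x1908 : -1;
}

/* Hypothetical image struct with a 16-bit format field. */
struct tex_image {
   uint16_t base_format;
};

/* Validate the int result first, then narrow it.
 * Returns 0 on success, -1 if the format is unknown. */
int init_base_format(struct tex_image *img, int internal_format)
{
   const int base_format = lookup_base_format(internal_format);

   if (base_format == -1)
      return -1;

   img->base_format = (uint16_t)base_format;
   return 0;
}

/* What the error value looks like after being narrowed and read back. */
int narrowed_error_value(void)
{
   const uint16_t stored = (uint16_t)lookup_base_format(0 /* unknown */);

   return stored; /* promoted back to int */
}
```

The patch uses an assert rather than an error return; the sketch returns -1 only so the check stays visible in release builds.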


Re: [Mesa-dev] [PATCH] nir: rename variables in nir_lower_io_to_temporaries for clarity

2018-04-06 Thread Jason Ekstrand
On Wed, Apr 4, 2018 at 4:16 PM, Caio Marcelo de Oliveira Filho <
caio.olive...@intel.com> wrote:

> In the emit_copies() function, the use of "newv" and "temp" names made
> sense when only copies from temporaries to the new variables were
> being done. But now there are other calls to copy with other pairings,
> and "temp" doesn't always refer to a temporary created in this
> pass. Use the names "dest" and "src" instead.
> ---
>  .../nir/nir_lower_io_to_temporaries.c | 22 +--
>  1 file changed, 11 insertions(+), 11 deletions(-)
>
> diff --git a/src/compiler/nir/nir_lower_io_to_temporaries.c
> b/src/compiler/nir/nir_lower_io_to_temporaries.c
> index 301ba65892..c3e1207f4e 100644
> --- a/src/compiler/nir/nir_lower_io_to_temporaries.c
> +++ b/src/compiler/nir/nir_lower_io_to_temporaries.c
> @@ -40,34 +40,34 @@ struct lower_io_state {
>  };
>
>  static void
> -emit_copies(nir_cursor cursor, nir_shader *shader, struct exec_list
> *new_vars,
> -  struct exec_list *old_vars)
> +emit_copies(nir_cursor cursor, nir_shader *shader, struct exec_list
> *dest_vars,
> +struct exec_list *src_vars)
>  {
> -   assert(exec_list_length(new_vars) == exec_list_length(old_vars));
> +   assert(exec_list_length(dest_vars) == exec_list_length(src_vars));
>
> -   foreach_two_lists(new_node, new_vars, old_node, old_vars) {
> -  nir_variable *newv = exec_node_data(nir_variable, new_node, node);
> -  nir_variable *temp = exec_node_data(nir_variable, old_node, node);
> +   foreach_two_lists(new_node, dest_vars, old_node, src_vars) {
> +  nir_variable *dest = exec_node_data(nir_variable, new_node, node);
> +  nir_variable *src = exec_node_data(nir_variable, old_node, node);
>

We probably want to use src_node and dst_node here.

Reviewed-by: Jason Ekstrand 

I've made the above change and will push once Jenkins comes back one more
time.

--Jason


>
>/* No need to copy the contents of a non-fb_fetch_output output
> variable
> * to the temporary allocated for it, since its initial value is
> * undefined.
> */
> -  if (temp->data.mode == nir_var_shader_out &&
> -  !temp->data.fb_fetch_output)
> +  if (src->data.mode == nir_var_shader_out &&
> +  !src->data.fb_fetch_output)
>   continue;
>
>/* Can't copy the contents of the temporary back to a read-only
> * interface variable.  The value of the temporary won't have been
> * modified by the shader anyway.
> */
> -  if (newv->data.read_only)
> +  if (dest->data.read_only)
>   continue;
>
>nir_intrinsic_instr *copy =
>   nir_intrinsic_instr_create(shader, nir_intrinsic_copy_var);
> -  copy->variables[0] = nir_deref_var_create(copy, newv);
> -  copy->variables[1] = nir_deref_var_create(copy, temp);
> +  copy->variables[0] = nir_deref_var_create(copy, dest);
> +  copy->variables[1] = nir_deref_var_create(copy, src);
>
>nir_instr_insert(cursor, &copy->instr);
> }
> --
> 2.17.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
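
To show how the two skip conditions read after the rename, here is a rough standalone sketch of the loop. It assumes plain arrays instead of exec_lists and a hypothetical struct var carrying only the fields the loop inspects; the real code emits copy intrinsics rather than assigning values.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified variable: only the fields the loop looks at. */
struct var {
   bool shader_out;       /* mode == nir_var_shader_out in the patch */
   bool fb_fetch_output;
   bool read_only;
   int value;             /* stand-in for the variable's contents */
};

/* Copy src[i] into dest[i] pairwise; returns the number of copies made. */
size_t copy_vars(struct var *dest, const struct var *src, size_t count)
{
   size_t copies = 0;

   for (size_t i = 0; i < count; i++) {
      /* Initial value of a plain output is undefined: nothing to copy. */
      if (src[i].shader_out && !src[i].fb_fetch_output)
         continue;

      /* Never write back into a read-only variable. */
      if (dest[i].read_only)
         continue;

      dest[i].value = src[i].value;
      copies++;
   }
   return copies;
}
```

With "dest" and "src", each check names the side it is about, which is the point of the rename.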


Re: [Mesa-dev] [PATCH 17/17] winsys/amdgpu: always set AMDGPU_IB_FLAG_TC_WB_NOT_INVALIDATE

2018-04-06 Thread Marek Olšák
On Fri, Apr 6, 2018 at 3:05 AM, Samuel Pitoiset 
wrote:

>
>
> On 04/05/2018 10:54 PM, Marek Olšák wrote:
>
>> On Thu, Apr 5, 2018 at 4:22 AM, Samuel Pitoiset <
>> samuel.pitoi...@gmail.com> wrote:
>>
>> Patches 16-17 are:
>>
>> Reviewed-by: Samuel Pitoiset
>>
>> Those two are quite interesting. I will probably update my kernel
>> and experiment with something.
>>
>>
>> Patch 16 breaks things and needs more changes.
>>
>
> What does it break?
>

Everything can be broken by that. It depends on the timing between CPU and
GPU work.

Marek


Re: [Mesa-dev] [PATCH 02/17] ac/surface: don't set the display flag for obviously unsupported cases

2018-04-06 Thread Marek Olšák
On Fri, Apr 6, 2018 at 11:41 AM, Michel Dänzer  wrote:

> On 2018-04-06 03:25 PM, Marek Olšák wrote:
> > On Thu, Apr 5, 2018, 3:09 AM Michel Dänzer  wrote:
> >> On 2018-04-04 07:35 PM, Marek Olšák wrote:
> >>> On Wed, Apr 4, 2018 at 9:01 AM, Michel Dänzer wrote:
> >>>> On 2018-04-04 02:57 PM, Marek Olšák wrote:
> >>>>> On Wed, Apr 4, 2018, 6:18 AM Michel Dänzer wrote:
> >>>>>
> >>>>> On 2018-04-04 03:59 AM, Marek Olšák wrote:
> >>>>> > From: Marek Olšák
> >>>>> >
> >>>>> > This enables the tile swizzle for some cases of the displayable micro mode,
> >>>>> > and it also fixes an addrlib assertion failure on Vega.
> >>>>> > ---
> >>>>> >  src/amd/common/ac_surface.c | 18 ++
> >>>>> >  1 file changed, 14 insertions(+), 4 deletions(-)
> >>>>> >
> >>>>> > diff --git a/src/amd/common/ac_surface.c b/src/amd/common/ac_surface.c
> >>>>> > index b294cd85259..2b20a553d51 100644
> >>>>> > --- a/src/amd/common/ac_surface.c
> >>>>> > +++ b/src/amd/common/ac_surface.c
> >>>>> > @@ -408,20 +408,29 @@ static unsigned cik_get_macro_tile_index(struct radeon_surf *surf)
> >>>>> >   tileb = 8 * 8 * surf->bpe;
> >>>>> >   tileb = MIN2(surf->u.legacy.tile_split, tileb);
> >>>>> >
> >>>>> >   for (index = 0; tileb > 64; index++)
> >>>>> >   tileb >>= 1;
> >>>>> >
> >>>>> >   assert(index < 16);
> >>>>> >   return index;
> >>>>> >  }
> >>>>> >
> >>>>> > +static bool get_display_flag(const struct ac_surf_config *config,
> >>>>> > +  const struct radeon_surf *surf)
> >>>>> > +{
> >>>>> > + return surf->flags & RADEON_SURF_SCANOUT &&
> >>>>> > +!(surf->flags & RADEON_SURF_FMASK) &&
> >>>>> > +config->info.samples <= 1 &&
> >>>>> > +surf->bpe >= 4 && surf->bpe <= 8;
> >>>>>
> >>>>> surf->bpe is the number of bytes used to store each pixel, right? If so,
> >>>>> this cannot exclude surf->bpe < 4, since 16 bpp and 8 bpp formats can be
> >>>>> displayed.
> >>>>>
> >>>>>
> >>>>> > Sure, but what are the chances they will be displayed with the current
> >>>>> > stack? GLX doesn't have 16bpp visuals for on-screen rendering.
> >>>>
> >>>> Maybe not when the X server runs at depth 24, but it can also run at
> >>>> depths 8, 15 & 16, in which case displayable surfaces with bpe == 1 or 2
> >>>> are needed even before GLX even comes into the picture.
> >>>>
> >>>
> >>> OK. Let me ask differently. Do we wanna support displayable 8, 15, and 16
> >>> bpp?
> >>
> >> We do support it, it's not really a question of whether we want to
> >> anymore. :)
> >>
> >>> Can we just say that we don't support those?
> >>
> >> I'm afraid we can't.
> >>
> >>
> >> Which kind of surfaces are you trying to exclude like this? Maybe they
> >> can be excluded in a different way.
> >
> > Currently just the MSAA resolve temporary destination buffer.
>
> Do those actually have surf->bpe < 4? I'm not getting any hits with
> glxgears -samples 8.
>

The main purpose of the patch is to fix addrlib crashes on Vega when bpe ==
16. Everything else you see in the patch is just a bonus.

Marek
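
For anyone following along, the v1 predicate from the quoted patch can be tried in isolation. This is only a sketch: the SURF_* bit values and the struct are hypothetical stand-ins, not the real RADEON_SURF_* definitions.

```c
#include <stdbool.h>

/* Hypothetical flag bits standing in for RADEON_SURF_SCANOUT / _FMASK. */
#define SURF_SCANOUT (1u << 0)
#define SURF_FMASK   (1u << 1)

/* Hypothetical, reduced surface description. */
struct surf_v1 {
   unsigned flags;
   unsigned bpe;     /* bytes per element */
   unsigned samples;
};

/* Same conditions as the v1 get_display_flag() quoted above. */
bool display_flag_v1(const struct surf_v1 *s)
{
   return (s->flags & SURF_SCANOUT) &&
          !(s->flags & SURF_FMASK) &&
          s->samples <= 1 &&
          s->bpe >= 4 && s->bpe <= 8;
}
```

It rejects bpe == 16, which is the Vega case, but it also rejects bpe == 1 and bpe == 2, which is what Michel objected to.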


[Mesa-dev] [PATCH 1/4] ac/surface: don't set the display flag for obviously unsupported cases (v2)

2018-04-06 Thread Marek Olšák
From: Marek Olšák 

This enables the tile swizzle for some cases of the displayable micro mode,
and it also fixes an addrlib assertion failure on Vega.
---
 src/amd/common/ac_surface.c| 34 +++---
 src/amd/common/ac_surface.h|  1 +
 src/amd/vulkan/radv_image.c|  1 +
 src/gallium/winsys/amdgpu/drm/amdgpu_surface.c |  1 +
 4 files changed, 33 insertions(+), 4 deletions(-)

diff --git a/src/amd/common/ac_surface.c b/src/amd/common/ac_surface.c
index b294cd85259..1b4d72e31bd 100644
--- a/src/amd/common/ac_surface.c
+++ b/src/amd/common/ac_surface.c
@@ -408,20 +408,45 @@ static unsigned cik_get_macro_tile_index(struct 
radeon_surf *surf)
tileb = 8 * 8 * surf->bpe;
tileb = MIN2(surf->u.legacy.tile_split, tileb);
 
for (index = 0; tileb > 64; index++)
tileb >>= 1;
 
assert(index < 16);
return index;
 }
 
+static bool get_display_flag(const struct ac_surf_config *config,
+const struct radeon_surf *surf)
+{
+   unsigned num_channels = config->info.num_channels;
+   unsigned bpe = surf->bpe;
+
+   if (surf->flags & RADEON_SURF_SCANOUT &&
+   !(surf->flags & RADEON_SURF_FMASK) &&
+   config->info.samples <= 1 &&
+   surf->blk_w <= 2 && surf->blk_h == 1) {
+   /* subsampled */
+   if (surf->blk_w == 2 && surf->blk_h == 1)
+   return true;
+
+   if  (/* RGBA8 or RGBA16F */
+(bpe >= 4 && bpe <= 8 && num_channels == 4) ||
+/* R5G6B5 or R5G5B5A1 */
+(bpe == 2 && num_channels >= 3) ||
+/* C8 palette */
+(bpe == 1 && num_channels == 1))
+   return true;
+   }
+   return false;
+}
+
 /**
  * This must be called after the first level is computed.
  *
  * Copy surface-global settings like pipe/bank config from level 0 surface
  * computation, and compute tile swizzle.
  */
 static int gfx6_surface_settings(ADDR_HANDLE addrlib,
 const struct radeon_info *info,
 const struct ac_surf_config *config,
 ADDR_COMPUTE_SURFACE_INFO_OUTPUT* csio,
@@ -442,21 +467,21 @@ static int gfx6_surface_settings(ADDR_HANDLE addrlib,
} else {
surf->u.legacy.macro_tile_index = 0;
}
 
/* Compute tile swizzle. */
/* TODO: fix tile swizzle with mipmapping for SI */
if ((info->chip_class >= CIK || config->info.levels == 1) &&
config->info.surf_index &&
surf->u.legacy.level[0].mode == RADEON_SURF_MODE_2D &&
!(surf->flags & (RADEON_SURF_Z_OR_SBUFFER | RADEON_SURF_SHAREABLE)) 
&&
-   (config->info.samples > 1 || !(surf->flags & RADEON_SURF_SCANOUT))) 
{
+   !get_display_flag(config, surf)) {
ADDR_COMPUTE_BASE_SWIZZLE_INPUT AddrBaseSwizzleIn = {0};
ADDR_COMPUTE_BASE_SWIZZLE_OUTPUT AddrBaseSwizzleOut = {0};
 
AddrBaseSwizzleIn.size = 
sizeof(ADDR_COMPUTE_BASE_SWIZZLE_INPUT);
AddrBaseSwizzleOut.size = 
sizeof(ADDR_COMPUTE_BASE_SWIZZLE_OUTPUT);
 
AddrBaseSwizzleIn.surfIndex = 
p_atomic_inc_return(config->info.surf_index) - 1;
AddrBaseSwizzleIn.tileIndex = csio->tileIndex;
AddrBaseSwizzleIn.macroModeIndex = csio->macroModeIndex;
AddrBaseSwizzleIn.pTileInfo = csio->pTileInfo;
@@ -561,21 +586,21 @@ static int gfx6_compute_surface(ADDR_HANDLE addrlib,
AddrSurfInfoIn.tileType = ADDR_DISPLAYABLE;
else if (surf->flags & (RADEON_SURF_Z_OR_SBUFFER | RADEON_SURF_FMASK))
AddrSurfInfoIn.tileType = ADDR_DEPTH_SAMPLE_ORDER;
else
AddrSurfInfoIn.tileType = ADDR_NON_DISPLAYABLE;
 
AddrSurfInfoIn.flags.color = !(surf->flags & RADEON_SURF_Z_OR_SBUFFER);
AddrSurfInfoIn.flags.depth = (surf->flags & RADEON_SURF_ZBUFFER) != 0;
AddrSurfInfoIn.flags.cube = config->is_cube;
AddrSurfInfoIn.flags.fmask = (surf->flags & RADEON_SURF_FMASK) != 0;
-   AddrSurfInfoIn.flags.display = (surf->flags & RADEON_SURF_SCANOUT) != 0;
+   AddrSurfInfoIn.flags.display = get_display_flag(config, surf);
AddrSurfInfoIn.flags.pow2Pad = config->info.levels > 1;
AddrSurfInfoIn.flags.tcCompatible = (surf->flags & 
RADEON_SURF_TC_COMPATIBLE_HTILE) != 0;
 
/* Only degrade the tile mode for space if TC-compatible HTILE hasn't 
been
 * requested, because TC-compatible HTILE requires 2D tiling.
 */
AddrSurfInfoIn.flags.opt4Space = !AddrSurfInfoIn.flags.tcCompatible &&
 !AddrSurfInfoIn.flags.fmask &&
 config->info.samples <= 1 &&
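
A standalone sketch of the v2 get_display_flag() logic, for checking which formats end up displayable. The struct and the SURF_* bit values are hypothetical simplifications; only the conditions are taken from the hunk above.

```c
#include <stdbool.h>

/* Hypothetical flag bits standing in for RADEON_SURF_SCANOUT / _FMASK. */
#define SURF_SCANOUT (1u << 0)
#define SURF_FMASK   (1u << 1)

/* Hypothetical, reduced surface description. */
struct surf_info {
   unsigned flags;
   unsigned bpe;           /* bytes per element */
   unsigned num_channels;
   unsigned samples;
   unsigned blk_w, blk_h;  /* block size; 1x1 for plain formats */
};

bool display_flag_v2(const struct surf_info *s)
{
   if ((s->flags & SURF_SCANOUT) &&
       !(s->flags & SURF_FMASK) &&
       s->samples <= 1 &&
       s->blk_w <= 2 && s->blk_h == 1) {
      /* subsampled */
      if (s->blk_w == 2 && s->blk_h == 1)
         return true;

      if (/* RGBA8 or RGBA16F */
          (s->bpe >= 4 && s->bpe <= 8 && s->num_channels == 4) ||
          /* R5G6B5 or R5G5B5A1 */
          (s->bpe == 2 && s->num_channels >= 3) ||
          /* C8 palette */
          (s->bpe == 1 && s->num_channels == 1))
         return true;
   }
   return false;
}
```

Compared with v1, the 8 and 16 bpp cases are accepted again, while anything with bpe == 16 is still rejected.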
   

[Mesa-dev] [PATCH 2/4] radeonsi: make sure CP DMA is idle at the end of IBs

2018-04-06 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_cp_dma.c | 12 +++-
 src/gallium/drivers/radeonsi/si_gfx_cs.c |  5 -
 src/gallium/drivers/radeonsi/si_pipe.h   |  1 +
 3 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_cp_dma.c 
b/src/gallium/drivers/radeonsi/si_cp_dma.c
index 358b33c4eb1..b316637d94b 100644
--- a/src/gallium/drivers/radeonsi/si_cp_dma.c
+++ b/src/gallium/drivers/radeonsi/si_cp_dma.c
@@ -58,21 +58,20 @@ static inline unsigned cp_dma_max_byte_count(struct 
si_context *sctx)
  * a buffer. The size must fit in bits [20:0]. If CP_DMA_CLEAR is set, src_va 
is a 32-bit
  * clear value.
  */
 static void si_emit_cp_dma(struct si_context *sctx, uint64_t dst_va,
   uint64_t src_va, unsigned size, unsigned flags,
   enum si_coherency coher)
 {
struct radeon_winsys_cs *cs = sctx->gfx_cs;
uint32_t header = 0, command = 0;
 
-   assert(size);
assert(size <= cp_dma_max_byte_count(sctx));
 
if (sctx->chip_class >= GFX9)
command |= S_414_BYTE_COUNT_GFX9(size);
else
command |= S_414_BYTE_COUNT_GFX6(size);
 
/* Sync flags. */
if (flags & CP_DMA_SYNC)
header |= S_411_CP_SYNC(1);
@@ -121,20 +120,31 @@ static void si_emit_cp_dma(struct si_context *sctx, 
uint64_t dst_va,
 * This ensures that ME (CP DMA) is idle before PFP starts fetching
 * indices. If we wanted to execute CP DMA in PFP, this packet
 * should precede it.
 */
if (coher == SI_COHERENCY_SHADER && flags & CP_DMA_SYNC) {
radeon_emit(cs, PKT3(PKT3_PFP_SYNC_ME, 0, 0));
radeon_emit(cs, 0);
}
 }
 
+void si_cp_dma_wait_for_idle(struct si_context *sctx)
+{
+   /* Issue a dummy DMA that copies zero bytes.
+*
+* The DMA engine will see that there's no work to do and skip this
+* DMA request, however, the CP will see the sync flag and still wait
+* for all DMAs to complete.
+*/
+   si_emit_cp_dma(sctx, 0, 0, 0, CP_DMA_SYNC, SI_COHERENCY_NONE);
+}
+
 static unsigned get_flush_flags(struct si_context *sctx, enum si_coherency 
coher)
 {
switch (coher) {
default:
case SI_COHERENCY_NONE:
return 0;
case SI_COHERENCY_SHADER:
return SI_CONTEXT_INV_SMEM_L1 |
   SI_CONTEXT_INV_VMEM_L1 |
   (sctx->chip_class == SI ? SI_CONTEXT_INV_GLOBAL_L2 : 0);
diff --git a/src/gallium/drivers/radeonsi/si_gfx_cs.c 
b/src/gallium/drivers/radeonsi/si_gfx_cs.c
index f99bc324c98..2d5e510b19e 100644
--- a/src/gallium/drivers/radeonsi/si_gfx_cs.c
+++ b/src/gallium/drivers/radeonsi/si_gfx_cs.c
@@ -104,21 +104,24 @@ void si_flush_gfx_cs(struct si_context *ctx, unsigned 
flags,
}
 
ctx->flags |= SI_CONTEXT_CS_PARTIAL_FLUSH |
SI_CONTEXT_PS_PARTIAL_FLUSH;
 
/* DRM 3.1.0 doesn't flush TC for VI correctly. */
if (ctx->chip_class == VI && ctx->screen->info.drm_minor <= 1)
ctx->flags |= SI_CONTEXT_INV_GLOBAL_L2 |
SI_CONTEXT_INV_VMEM_L1;
 
-   si_emit_cache_flush(ctx);
+   /* Make sure CP DMA is idle at the end of IBs after L2 prefetches
+* because the kernel doesn't wait for it. */
+   if (ctx->chip_class >= CIK)
+   si_cp_dma_wait_for_idle(ctx);
 
if (ctx->current_saved_cs) {
si_trace_emit(ctx);
si_log_hw_flush(ctx);
 
/* Save the IB for debug contexts. */
si_save_cs(ws, cs, &ctx->current_saved_cs->gfx, true);
ctx->current_saved_cs->flushed = true;
ctx->current_saved_cs->time_flush = os_time_get_nano();
}
diff --git a/src/gallium/drivers/radeonsi/si_pipe.h 
b/src/gallium/drivers/radeonsi/si_pipe.h
index 8d4072c9c2e..0c90a6c6e46 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.h
+++ b/src/gallium/drivers/radeonsi/si_pipe.h
@@ -883,20 +883,21 @@ void si_init_clear_functions(struct si_context *sctx);
   SI_CPDMA_SKIP_SYNC_BEFORE | \
   SI_CPDMA_SKIP_GFX_SYNC | \
   SI_CPDMA_SKIP_BO_LIST_UPDATE)
 
 enum si_coherency {
SI_COHERENCY_NONE, /* no cache flushes needed */
SI_COHERENCY_SHADER,
SI_COHERENCY_CB_META,
 };
 
+void si_cp_dma_wait_for_idle(struct si_context *sctx);
 void si_clear_buffer(struct si_context *sctx, struct pipe_resource *dst,
 uint64_t offset, uint64_t size, unsigned value,
 enum si_coherency coher);
 void si_copy_buffer(struct si_context *sctx,
struct pipe_resource *dst, struct pipe_resource *src,
uint64_t dst_offset, uint64_t src_offset, unsigned size,
unsigned user_flags);
 void 

[Mesa-dev] [PATCH 4/4] radeonsi: relax DCC format compatibility constraints

2018-04-06 Thread Marek Olšák
From: Marek Olšák 

The swizzle has no effect on DCC encoding.
---
 src/gallium/drivers/radeonsi/si_texture.c | 10 ++
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_texture.c 
b/src/gallium/drivers/radeonsi/si_texture.c
index 5a29624f1fa..1f0de5e71ec 100644
--- a/src/gallium/drivers/radeonsi/si_texture.c
+++ b/src/gallium/drivers/radeonsi/si_texture.c
@@ -1926,38 +1926,32 @@ vi_get_dcc_channel_type(const struct 
util_format_description *desc)
return dcc_channel_incompatible;
}
 }
 
 /* Return if it's allowed to reinterpret one format as another with DCC 
enabled. */
 bool vi_dcc_formats_compatible(enum pipe_format format1,
   enum pipe_format format2)
 {
const struct util_format_description *desc1, *desc2;
enum dcc_channel_type type1, type2;
-   int i;
 
if (format1 == format2)
return true;
 
desc1 = util_format_description(format1);
desc2 = util_format_description(format2);
 
+   /* This constraint is only needed if we use the TC-compatible
+* DCC clear encoding with the clear value of 1. */
if (desc1->nr_channels != desc2->nr_channels)
return false;
 
-   /* Swizzles must be the same. */
-   for (i = 0; i < desc1->nr_channels; i++)
-   if (desc1->swizzle[i] <= PIPE_SWIZZLE_W &&
-   desc2->swizzle[i] <= PIPE_SWIZZLE_W &&
-   desc1->swizzle[i] != desc2->swizzle[i])
-   return false;
-
type1 = vi_get_dcc_channel_type(desc1);
type2 = vi_get_dcc_channel_type(desc2);
 
return type1 != dcc_channel_incompatible &&
   type2 != dcc_channel_incompatible &&
   type1 == type2;
 }
 
 bool vi_dcc_formats_are_incompatible(struct pipe_resource *tex,
 unsigned level,
-- 
2.15.1
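
A small sketch of what the check reduces to after this change. The format descriptor and channel-type enum below are hypothetical simplifications of util_format_description and dcc_channel_type, not the real definitions.

```c
#include <stdbool.h>

/* Hypothetical channel classification. */
enum channel_type {
   CHANNEL_UNORM8,
   CHANNEL_FLOAT16,
   CHANNEL_INCOMPATIBLE,
};

/* Hypothetical, reduced format descriptor. */
struct format_desc {
   int id;                    /* stand-in for enum pipe_format */
   unsigned nr_channels;
   enum channel_type type;
   unsigned char swizzle[4];  /* kept only to show that it is ignored */
};

bool dcc_formats_compatible(const struct format_desc *a,
                            const struct format_desc *b)
{
   if (a->id == b->id)
      return true;

   if (a->nr_channels != b->nr_channels)
      return false;

   /* The swizzle is deliberately not compared. */
   return a->type != CHANNEL_INCOMPATIBLE &&
          b->type != CHANNEL_INCOMPATIBLE &&
          a->type == b->type;
}
```

With this, two 4x8-bit UNORM formats that differ only in channel order (RGBA vs BGRA) count as compatible.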



[Mesa-dev] [PATCH 3/4] radeonsi: add shader binary padding for UMR

2018-04-06 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_shader.c | 18 +++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index c18915488e5..8c62d53e2ad 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -5302,33 +5302,37 @@ void si_shader_apply_scratch_relocs(struct si_shader 
*shader,
if (!strcmp(scratch_rsrc_dword0_symbol, reloc->name)) {
util_memcpy_cpu_to_le32(shader->binary.code + 
reloc->offset,
&scratch_rsrc_dword0, 4);
} else if (!strcmp(scratch_rsrc_dword1_symbol, reloc->name)) {
util_memcpy_cpu_to_le32(shader->binary.code + 
reloc->offset,
&scratch_rsrc_dword1, 4);
}
}
 }
 
+/* For the UMR disassembler. */
+#define DEBUGGER_END_OF_CODE_MARKER 0xbf9f /* invalid instruction */
+#define DEBUGGER_NUM_MARKERS   5
+
 static unsigned si_get_shader_binary_size(const struct si_shader *shader)
 {
unsigned size = shader->binary.code_size;
 
if (shader->prolog)
size += shader->prolog->binary.code_size;
if (shader->previous_stage)
size += shader->previous_stage->binary.code_size;
if (shader->prolog2)
size += shader->prolog2->binary.code_size;
if (shader->epilog)
size += shader->epilog->binary.code_size;
-   return size;
+   return size + DEBUGGER_NUM_MARKERS * 4;
 }
 
 int si_shader_binary_upload(struct si_screen *sscreen, struct si_shader 
*shader)
 {
const struct ac_shader_binary *prolog =
shader->prolog ? &shader->prolog->binary : NULL;
const struct ac_shader_binary *previous_stage =
shader->previous_stage ? &shader->previous_stage->binary : NULL;
const struct ac_shader_binary *prolog2 =
shader->prolog2 ? &shader->prolog2->binary : NULL;
@@ -5373,24 +5377,32 @@ int si_shader_binary_upload(struct si_screen *sscreen, 
struct si_shader *shader)
ptr += previous_stage->code_size;
}
if (prolog2) {
memcpy(ptr, prolog2->code, prolog2->code_size);
ptr += prolog2->code_size;
}
 
memcpy(ptr, mainb->code, mainb->code_size);
ptr += mainb->code_size;
 
-   if (epilog)
+   if (epilog) {
memcpy(ptr, epilog->code, epilog->code_size);
-   else if (mainb->rodata_size > 0)
+   ptr += epilog->code_size;
+   } else if (mainb->rodata_size > 0) {
memcpy(ptr, mainb->rodata, mainb->rodata_size);
+   ptr += mainb->rodata_size;
+   }
+
+   /* Add end-of-code markers for the UMR disassembler. */
+   uint32_t *ptr32 = (uint32_t*)ptr;
+   for (unsigned i = 0; i < DEBUGGER_NUM_MARKERS; i++)
+   ptr32[i] = DEBUGGER_END_OF_CODE_MARKER;
 
sscreen->ws->buffer_unmap(shader->bo->buf);
return 0;
 }
 
 static void si_shader_dump_disassembly(const struct ac_shader_binary *binary,
   struct pipe_debug_callback *debug,
   const char *name, FILE *file)
 {
char *line, *p;
-- 
2.15.1
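
The padding scheme can be sketched on a plain byte buffer. Assumptions: the function names are made up, the marker value is copied as it appears in this archived copy of the patch (it may have lost digits in transit), and memcpy is used for the stores instead of the uint32_t pointer cast so the sketch does not depend on alignment.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Values as they appear in the archived patch text. */
#define END_OF_CODE_MARKER 0xbf9fu
#define NUM_MARKERS        5

/* Size of the upload buffer needed for code_size bytes of shader code. */
size_t padded_binary_size(size_t code_size)
{
   return code_size + NUM_MARKERS * 4;
}

/* Copy the code into dst and append the end-of-code markers.
 * dst must hold padded_binary_size(code_size) bytes.
 * Returns the number of bytes written. */
size_t upload_with_markers(uint8_t *dst, const uint8_t *code, size_t code_size)
{
   uint8_t *ptr = dst;

   memcpy(ptr, code, code_size);
   ptr += code_size;

   for (unsigned i = 0; i < NUM_MARKERS; i++) {
      const uint32_t marker = END_OF_CODE_MARKER;

      memcpy(ptr + 4 * i, &marker, sizeof(marker));
   }
   return padded_binary_size(code_size);
}
```

The size helper and the upload have to agree on the marker count, which is why the patch touches both si_get_shader_binary_size() and si_shader_binary_upload().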



Re: [Mesa-dev] [PATCH 3/3] egl/x11: Handle both depth 30 formats for eglCreateImage().

2018-04-06 Thread Michel Dänzer
On 2018-04-06 06:18 PM, Mario Kleiner wrote:
> On Fri, Apr 6, 2018 at 12:01 PM, Michel Dänzer  wrote:
>> On 2018-03-27 07:53 PM, Daniel Stone wrote:
>>> On 12 March 2018 at 20:45, Mario Kleiner  wrote:
>>>> We need to distinguish if a backing pixmap of a window is
>>>> XRGB2101010 or XBGR2101010, as different gpu hw supports
>>>> different formats. NVidia hw prefers XBGR, whereas AMD and
>>>> Intel are happy with XRGB.
>>>>
>>>> We use the red channel mask of the visual to distinguish at
>>>> depth 30, but because we can't easily get the associated
>>>> visual of a Pixmap, we use the visual of the x-screens root
>>>> window instead as a proxy.
>>>>
>>>> This fixes desktop composition of color depth 30 windows
>>>> when the X11 compositor uses EGL.
>>>
>>> I have no reason to doubt your testing, so this patch is:
>>> Acked-by: Daniel Stone 
>>>
>>> But it does rather fill me with trepidation, given that X11 Pixmaps
>>> are supposed to be a dumb 'bag of bits', doing nothing else than
>>> providing the same number and size of channels to the actual client
>>> data for the Visual associated with the Window.
>>
>> As far as X11 is concerned, the number of channels and their sizes don't
>> even matter; a pixmap is simply a container for an unsigned integer of n
>> bits (where n is the pixmap depth) per pixel, with no inherent meaning
>> attached to those values.
>>
>> That said, I'm not sure this is true for EGL as well. But even if it
>> isn't, there would have to be another mechanism to determine the format,
>> e.g. a config associated with the EGL pixmap. The pixmap doesn't even
>> necessarily have the same depth as the root window, so using the
>> latter's visual doesn't make much sense.
> 
> Hi Michel. I thought with this patch I was implementing what you
> proposed earlier as a heuristic on how to get around the "pixmaps
> don't have an inherent format, only a depth" problem?

Do you have a pointer to that discussion?


> My (possibly inaccurate) understanding is that one can only create a
> depth 30 pixmap if the x-screen runs at depth >= 30. It only exposes
> depth 30 as supported pixmap format (xdpyinfo) if xorg.conf
> DefaultDepth 30 is selected, whereas other depths like
> 1,4,8,15,16,24,32 are always supported at default depth 24.

That sounds like an X server issue. Just like 32, there's no fundamental
reason a pixmap couldn't have depth 30 despite the screen depth being lower.

Out of curiosity, can you share the output of xdpyinfo with nouveau at
depth 30?


> Iff depth 30 is selected, then the root window has depth 30, and a depth 30
> visual. If each driver only exports one channel ordering for depth 30,
> then the channel ordering of any pixmaps associated drawable should be
> the same as the one of the root window.

Repeat after me: "X11 pixmaps don't have a format." :) They're just bags
of bits.


Does __DRI_IMAGE_FORMAT_ARGB work for depth 30 as well, by any chance?


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


Re: [Mesa-dev] [PATCH 3/3] egl/x11: Handle both depth 30 formats for eglCreateImage().

2018-04-06 Thread Mario Kleiner
On Fri, Apr 6, 2018 at 12:01 PM, Michel Dänzer  wrote:
> On 2018-03-27 07:53 PM, Daniel Stone wrote:
>> On 12 March 2018 at 20:45, Mario Kleiner  wrote:
>>> We need to distinguish if a backing pixmap of a window is
>>> XRGB2101010 or XBGR2101010, as different gpu hw supports
>>> different formats. NVidia hw prefers XBGR, whereas AMD and
>>> Intel are happy with XRGB.
>>>
>>> We use the red channel mask of the visual to distinguish at
>>> depth 30, but because we can't easily get the associated
>>> visual of a Pixmap, we use the visual of the x-screens root
>>> window instead as a proxy.
>>>
>>> This fixes desktop composition of color depth 30 windows
>>> when the X11 compositor uses EGL.
>>
>> I have no reason to doubt your testing, so this patch is:
>> Acked-by: Daniel Stone 
>>
>> But it does rather fill me with trepidation, given that X11 Pixmaps
>> are supposed to be a dumb 'bag of bits', doing nothing else than
>> providing the same number and size of channels to the actual client
>> data for the Visual associated with the Window.
>
> As far as X11 is concerned, the number of channels and their sizes don't
> even matter; a pixmap is simply a container for an unsigned integer of n
> bits (where n is the pixmap depth) per pixel, with no inherent meaning
> attached to those values.
>
> That said, I'm not sure this is true for EGL as well. But even if it
> isn't, there would have to be another mechanism to determine the format,
> e.g. a config associated with the EGL pixmap. The pixmap doesn't even
> necessarily have the same depth as the root window, so using the
> latter's visual doesn't make much sense.
>

Hi Michel. I thought with this patch I was implementing what you
proposed earlier as a heuristic on how to get around the "pixmaps
don't have an inherent format, only a depth" problem?

My (possibly inaccurate) understanding is that one can only create a
depth 30 pixmap if the x-screen runs at depth >= 30. It only exposes
depth 30 as supported pixmap format (xdpyinfo) if xorg.conf
DefaultDepth 30 is selected, whereas other depths like
1,4,8,15,16,24,32 are always supported at default depth 24. Iff depth
30 is selected, then the root window has depth 30, and a depth 30
visual. If each driver only exports one channel ordering for depth 30,
then the channel ordering of any pixmaps associated drawable should be
the same as the one of the root window.

Wrong thinking?
-mario
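
To make the red-mask test concrete, a minimal sketch. The enum and function names are hypothetical; only the two mask values follow from the channel layouts (red in bits 29:20 for XRGB2101010, red in bits 9:0 for XBGR2101010). Which visual the mask should come from is exactly the open question in this thread, so the sketch takes the mask as a parameter.

```c
#include <stdint.h>

/* Hypothetical result type for the sketch. */
enum depth30_format {
   DEPTH30_UNKNOWN,
   DEPTH30_XRGB2101010,
   DEPTH30_XBGR2101010,
};

/* Classify a depth 30 visual by where its red channel sits. */
enum depth30_format depth30_format_from_red_mask(uint32_t red_mask)
{
   switch (red_mask) {
   case 0x3ff00000u:
      return DEPTH30_XRGB2101010; /* red in bits 29:20 */
   case 0x000003ffu:
      return DEPTH30_XBGR2101010; /* red in bits 9:0 */
   default:
      return DEPTH30_UNKNOWN;
   }
}
```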


>
> --
> Earthling Michel Dänzer   |   http://www.amd.com
> Libre software enthusiast | Mesa and X developer


[Mesa-dev] [Bug 105871] Discolored KDE panels after updating to Mesa 18.0 on Intel broadwell

2018-04-06 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=105871

--- Comment #20 from Tapani Pälli  ---
(In reply to Alexey Min from comment #18)
> (In reply to Tasev from comment #16)
> > (In reply to sergio.callegari from comment #13)
> > > As an alternate/complement solution to the patched xserver-xorg-core on
> > > the Padoka ppa, for those using kde plasma, there is now also a patch to
> > > kwin to fix the visuals selection.
> > > 
> > > https://phabricator.kde.org/D11758
> > 
> > I tried that patch without success; it doesn't fix this bug.
> 
> This is understandable, because I only fixed kwin_wayland session with that,
> and I only have amd hardware to test with.
> 
> If you can point me to which config attribute kwin should/should not be
> using to avoid this, we can at least try to fix kwin_x11 too.

The problem is that kwin_x11 cannot easily avoid this: there is only one
32-bit visual exposed, and without the patch mentioned in bug #103699 it always
has sRGB capability, which makes things go bad. It is likely that we are
missing an sRGB->linear conversion somewhere in that case. With that patch for
Xorg I wanted to restore the previous behaviour.



Re: [Mesa-dev] [PATCH] i965/miptree: Initialize mcs buffer only until clear color

2018-04-06 Thread Pohjolainen, Topi
On Fri, Apr 06, 2018 at 08:53:39AM -0700, Jason Ekstrand wrote:
> On Fri, Apr 6, 2018 at 8:22 AM, Rafael Antognolli <
> rafael.antogno...@intel.com> wrote:
> 
> > On Fri, Apr 06, 2018 at 06:07:52PM +0300, Topi Pohjolainen wrote:
> > > Otherwise even the clear color gets initialised to 0xFF. This
> > > allows enabling of color fast clears on ICL without regressing
> > > multisampling tests.
> > >
> > > CC: Rafael Antognolli 
> > > CC: Jason Ekstrand 
> > > CC: Nanley Chery 
> > > Signed-off-by: Topi Pohjolainen 
> > > ---
> > >  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 7 ++-
> > >  1 file changed, 6 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > > index 89074a6..25f901d 100644
> > > --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > > +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > > @@ -1680,7 +1680,12 @@ intel_miptree_init_mcs(struct brw_context *brw,
> > >return;
> > > }
> > > void *data = map;
> > > -   memset(data, init_value, mt->mcs_buf->size);
> > > +
> > > +   /* Only initialize until clear color (if present). */
> > > +   const unsigned aux_size = mt->mcs_buf->clear_color_offset ?
> > > +mt->mcs_buf->clear_color_offset :
> > > +mt->mcs_buf->size;
> > > +   memset(data, init_value, aux_size);
> >
> 
> Why not just use mt->mcs_buf->aux_surf.size?
> 
> Also, I think we probably want to memset the clear color to 0 in case we
> get a recycled BO with unknown garbage in the clear value.

Good thinking, both points.

> 
> 
> > Hmm... that's a good catch, and I think we definitely should not
> > overwrite the clear color here.
> >
> > However, the initial value of the clear color shouldn't matter, right? I
> > think there might still be a bug hidden somewhere...

I agree. I started to look into MCS in more detail - I don't think I fully
understand how the clear color works there.

> >
> > Regardless of that, this patch is
> >
> > Reviewed-by: Rafael Antognolli 
> >
> > > brw_bo_unmap(mt->mcs_buf->bo);
> > >  }
> > >
> > > --
> > > 2.7.4
> > >
> >


Re: [Mesa-dev] [PATCH v3 019/104] nir: Support deref instructions in lower_var_copies

2018-04-06 Thread Caio Marcelo de Oliveira Filho
On Thu, Apr 05, 2018 at 01:53:46PM -0700, Jason Ekstrand wrote:
> On Thu, Apr 5, 2018 at 12:55 PM, Caio Marcelo de Oliveira Filho <
> caio.olive...@intel.com> wrote:
> 
> > Hello,
> >
> > > +static nir_deref_instr *
> > > +build_deref_to_next_wildcard(nir_builder *b,
> > > + nir_deref_instr *parent,
> > > + nir_deref_instr ***deref_arr)
> > > +{
> > > +   for (; **deref_arr; (*deref_arr)++) {
> > > +  if ((**deref_arr)->deref_type == nir_deref_type_array_wildcard)
> > > + return parent;
> > > +
> > > +  parent = nir_build_deref_follower(b, parent, **deref_arr);
> > > +   }
> > > +
> > > +   assert(**deref_arr == NULL);
> > > +   *deref_arr = NULL;
> > > +   return parent;
> > > +}
> >
> > Question: in a scenario where there are no wildcards in the chain,
> > could we just return the original deref (i.e. the last element in
> > deref_arr)?
> >
> 
> Yes, and nir_build_deref_follower magically does that. :-)  It's admittedly
> a bit sketchy because we don't know if the follower actually
> dominates the current builder cursor but all of the callers in this series
> do have that guarantee.

Ah, I see how it works now. And it also covers partial reuse of the
original deref chain. I'd consider adding a note about the fact that
it might reuse pieces of the chain to the function comment.
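
A minimal sketch of the reuse behaviour discussed above, assuming simplified
stand-in types. The toy_* names are hypothetical and are not the NIR API; the
sketch only models "reuse the existing link when the parent already matches"
and the cursor walk through the NULL-terminated array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for deref instructions; not the real NIR types. */
enum toy_deref_type {
   TOY_DEREF_VAR,
   TOY_DEREF_ARRAY,
   TOY_DEREF_ARRAY_WILDCARD,
   TOY_DEREF_STRUCT,
};

struct toy_deref {
   enum toy_deref_type type;
   struct toy_deref *parent;
};

/* Fixed-size pool so the sketch does not need dynamic allocation. */
struct toy_pool {
   struct toy_deref nodes[16];
   size_t used;
};

/* Build a deref of the same kind as `leader` on top of `parent`.
 * If `leader` already hangs off `parent`, reuse it instead of copying. */
static struct toy_deref *
toy_build_follower(struct toy_pool *pool, struct toy_deref *parent,
                   struct toy_deref *leader)
{
   if (leader->parent == parent)
      return leader;

   assert(pool->used < sizeof(pool->nodes) / sizeof(pool->nodes[0]));
   struct toy_deref *d = &pool->nodes[pool->used++];
   d->type = leader->type;
   d->parent = parent;
   return d;
}

/* Walk the NULL-terminated path until the next wildcard, advancing the
 * caller's cursor. The cursor is set to NULL once the path is exhausted. */
static struct toy_deref *
toy_build_to_next_wildcard(struct toy_pool *pool, struct toy_deref *parent,
                           struct toy_deref ***path)
{
   for (; **path; (*path)++) {
      if ((**path)->type == TOY_DEREF_ARRAY_WILDCARD)
         return parent;

      parent = toy_build_follower(pool, parent, **path);
   }

   *path = NULL;
   return parent;
}
```

With a chain that has no wildcards, every step hits the reuse branch, so the
walk ends on the last element of the original chain and nothing new is built.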


Thanks,
Caio


[Mesa-dev] [Bug 105846] Assertion failure @ st_atom_array.c:675 when playing Natural Selection 2

2018-04-06 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=105846

--- Comment #9 from l...@protonmail.ch ---
NVM it just happened again, exact same error too.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH] i965/miptree: Initialize mcs buffer only until clear color

2018-04-06 Thread Jason Ekstrand
On Fri, Apr 6, 2018 at 8:22 AM, Rafael Antognolli <
rafael.antogno...@intel.com> wrote:

> On Fri, Apr 06, 2018 at 06:07:52PM +0300, Topi Pohjolainen wrote:
> > Otherwise even the clear color gets initialised to 0xFF. This
> > allows enabling of color fast clears on ICL without regressing
> > multisampling tests.
> >
> > CC: Rafael Antognolli 
> > CC: Jason Ekstrand 
> > CC: Nanley Chery 
> > Signed-off-by: Topi Pohjolainen 
> > ---
> >  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 7 ++-
> >  1 file changed, 6 insertions(+), 1 deletion(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > index 89074a6..25f901d 100644
> > --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > @@ -1680,7 +1680,12 @@ intel_miptree_init_mcs(struct brw_context *brw,
> >return;
> > }
> > void *data = map;
> > -   memset(data, init_value, mt->mcs_buf->size);
> > +
> > +   /* Only initialize until clear color (if present). */
> > +   const unsigned aux_size = mt->mcs_buf->clear_color_offset ?
> > +mt->mcs_buf->clear_color_offset :
> > +mt->mcs_buf->size;
> > +   memset(data, init_value, aux_size);
>

Why not just use mt->mcs_buf->aux_surf.size?

Also, I think we probably want to memset the clear color to 0 in case we
get a recycled BO with unknown garbage in the clear value.
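
A rough sketch of both suggestions, assuming a simplified buffer description.
The struct and its field names are hypothetical (they only loosely mirror the
names used in this thread), and the clear color block size is an assumed
parameter rather than a value taken from the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the aux buffer bookkeeping. */
struct toy_aux_buf {
   size_t size;               /* total size of the mapped buffer */
   size_t aux_surf_size;      /* size of the aux (MCS) surface itself */
   size_t clear_color_offset; /* 0 when no clear color lives in the buffer */
   size_t clear_color_size;   /* assumed size of the clear color block */
};

static void
toy_init_aux(uint8_t *map, const struct toy_aux_buf *buf, int init_value)
{
   /* Fill only the aux surface with the initial pattern. */
   memset(map, init_value, buf->aux_surf_size);

   /* Zero the clear color so a recycled buffer cannot carry stale data. */
   if (buf->clear_color_offset) {
      assert(buf->clear_color_offset >= buf->aux_surf_size);
      assert(buf->clear_color_offset + buf->clear_color_size <= buf->size);
      memset(map + buf->clear_color_offset, 0, buf->clear_color_size);
   }
}
```

Any padding between the aux surface and the clear color is left untouched in
this sketch.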


> Hmm... that's a good catch, and I think we definitely should not
> overwrite the clear color here.
>
> However, the initial value of the clear color shouldn't matter, right? I
> think there might still be a bug hidden somewhere...
>
> Regardless of that, this patch is
>
> Reviewed-by: Rafael Antognolli 
>
> > brw_bo_unmap(mt->mcs_buf->bo);
> >  }
> >
> > --
> > 2.7.4
> >
>


Re: [Mesa-dev] [PATCH 02/17] ac/surface: don't set the display flag for obviously unsupported cases

2018-04-06 Thread Michel Dänzer
On 2018-04-06 03:25 PM, Marek Olšák wrote:
> On Thu, Apr 5, 2018, 3:09 AM Michel Dänzer  wrote:
>> On 2018-04-04 07:35 PM, Marek Olšák wrote:
>>> On Wed, Apr 4, 2018 at 9:01 AM, Michel Dänzer 
>> wrote:
 On 2018-04-04 02:57 PM, Marek Olšák wrote:
> On Wed, Apr 4, 2018, 6:18 AM Michel Dänzer  > wrote:
>
> On 2018-04-04 03:59 AM, Marek Olšák wrote:
> > From: Marek Olšák  marek.ol...@amd.com
>>
> >
> > This enables the tile swizzle for some cases of the displayable
> micro mode,
> > and it also fixes an addrlib assertion failure on Vega.
> > ---
> >  src/amd/common/ac_surface.c | 18 ++
> >  1 file changed, 14 insertions(+), 4 deletions(-)
> >
> > diff --git a/src/amd/common/ac_surface.c
 b/src/amd/common/ac_surface.c
> > index b294cd85259..2b20a553d51 100644
> > --- a/src/amd/common/ac_surface.c
> > +++ b/src/amd/common/ac_surface.c
> > @@ -408,20 +408,29 @@ static unsigned
> cik_get_macro_tile_index(struct radeon_surf *surf)
> >   tileb = 8 * 8 * surf->bpe;
> >   tileb = MIN2(surf->u.legacy.tile_split, tileb);
> >
> >   for (index = 0; tileb > 64; index++)
> >   tileb >>= 1;
> >
> >   assert(index < 16);
> >   return index;
> >  }
> >
> > +static bool get_display_flag(const struct ac_surf_config
>> *config,
> > +  const struct radeon_surf *surf)
> > +{
> > + return surf->flags & RADEON_SURF_SCANOUT &&
> > +!(surf->flags & RADEON_SURF_FMASK) &&
> > +config->info.samples <= 1 &&
> > +surf->bpe >= 4 && surf->bpe <= 8;
>
> surf->bpe is the number of bytes used to store each pixel, right?
>> If
 so,
> this cannot exclude surf->bpe < 4, since 16 bpp and 8 bpp formats
 can be
> displayed.
>
>
> Sure, but what are the chances they will be displayed with the current
> stack? GLX doesn't have 16bpp visuals for on-screen rendering.

 Maybe not when the X server runs at depth 24, but it can also run at
 depths 8, 15 & 16, in which case displayable surfaces with bpe == 1 or 2
 are needed even before GLX even comes into the picture.

>>>
>>> OK. Let me ask differently. Do we wanna support displayable 8, 15, and 16
>>> bpp?
>>
>> We do support it, it's not really a question of whether we want to
>> anymore. :)
>>
>>> Can we just say that we don't support those?
>>
>> I'm afraid we can't.
>>
>>
>> Which kind of surfaces are you trying to exclude like this? Maybe they
>> can be excluded in a different way.
> 
> Currently just the MSAA resolve temporary destination buffer.

Do those actually have surf->bpe < 4? I'm not getting any hits with
glxgears -samples 8.
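
To make the bpe objection from earlier in this thread concrete, here is a
small sketch that compares the quoted predicate with a variant that drops the
lower bound. The flag values and struct are hypothetical stand-ins, and the
relaxed variant is only an illustration, not a claim about what should be
merged.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits; the values are illustrative only. */
#define TOY_SURF_SCANOUT (1u << 0)
#define TOY_SURF_FMASK   (1u << 1)

struct toy_surf {
   uint32_t flags;
   unsigned bpe; /* bytes per element */
};

/* The predicate as quoted in the patch under review. */
static bool
toy_display_flag_quoted(const struct toy_surf *s, unsigned samples)
{
   return (s->flags & TOY_SURF_SCANOUT) &&
          !(s->flags & TOY_SURF_FMASK) &&
          samples <= 1 &&
          s->bpe >= 4 && s->bpe <= 8;
}

/* Same predicate without the lower bound, so that 8 and 16 bpp scanout
 * surfaces (bpe 1 and 2) are still treated as displayable. */
static bool
toy_display_flag_relaxed(const struct toy_surf *s, unsigned samples)
{
   return (s->flags & TOY_SURF_SCANOUT) &&
          !(s->flags & TOY_SURF_FMASK) &&
          samples <= 1 &&
          s->bpe <= 8;
}
```

A single-sample scanout surface with bpe == 2 is rejected by the first
function and accepted by the second.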


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


Re: [Mesa-dev] [PATCH] i965/miptree: Initialize mcs buffer only until clear color

2018-04-06 Thread Rafael Antognolli
On Fri, Apr 06, 2018 at 06:07:52PM +0300, Topi Pohjolainen wrote:
> Otherwise even the clear color gets initialised to 0xFF. This
> allows enabling of color fast clears on ICL without regressing
> multisampling tests.
> 
> CC: Rafael Antognolli 
> CC: Jason Ekstrand 
> CC: Nanley Chery 
> Signed-off-by: Topi Pohjolainen 
> ---
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 7 ++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
> b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> index 89074a6..25f901d 100644
> --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> @@ -1680,7 +1680,12 @@ intel_miptree_init_mcs(struct brw_context *brw,
>return;
> }
> void *data = map;
> -   memset(data, init_value, mt->mcs_buf->size);
> +
> +   /* Only initialize until clear color (if present). */
> +   const unsigned aux_size = mt->mcs_buf->clear_color_offset ?
> +mt->mcs_buf->clear_color_offset :
> +mt->mcs_buf->size;
> +   memset(data, init_value, aux_size);

Hmm... that's a good catch, and I think we definitely should not
overwrite the clear color here.

However, the initial value of the clear color shouldn't matter, right? I
think there might still be a bug hidden somewhere...

Regardless of that, this patch is

Reviewed-by: Rafael Antognolli 

> brw_bo_unmap(mt->mcs_buf->bo);
>  }
>  
> -- 
> 2.7.4
> 


[Mesa-dev] [Bug 105871] Discolored KDE panels after updating to Mesa 18.0 on Intel broadwell

2018-04-06 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=105871

Alexey Min changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |alexey@gmail.com

-- 
You are receiving this mail because:
You are the assignee for the bug.


[Mesa-dev] [PATCH] i965/miptree: Initialize mcs buffer only until clear color

2018-04-06 Thread Topi Pohjolainen
Otherwise even the clear color gets initialised to 0xFF. This
allows enabling of color fast clears on ICL without regressing
multisampling tests.

CC: Rafael Antognolli 
CC: Jason Ekstrand 
CC: Nanley Chery 
Signed-off-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 89074a6..25f901d 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -1680,7 +1680,12 @@ intel_miptree_init_mcs(struct brw_context *brw,
   return;
}
void *data = map;
-   memset(data, init_value, mt->mcs_buf->size);
+
+   /* Only initialize until clear color (if present). */
+   const unsigned aux_size = mt->mcs_buf->clear_color_offset ?
+mt->mcs_buf->clear_color_offset :
+mt->mcs_buf->size;
+   memset(data, init_value, aux_size);
brw_bo_unmap(mt->mcs_buf->bo);
 }
 
-- 
2.7.4



Re: [Mesa-dev] [PATCH] dri3: Prevent multiple freeing of buffers.

2018-04-06 Thread Daniel Stone
Hi Sergii,

On 6 April 2018 at 09:12, Sergii Romantsov  wrote:
> Commit 3160cb86aa92 adds an optimization with the flag 'reallocate'.
> Processing of the flag causes buffers to be freed while a pointer
> is still held on the caller's stack and is then freed again.

Thanks a lot for writing this. I take it the core of the problem is
that dri3_handle_present_event() can be called whilst we're inside
dri3_get_buffer(), which wasn't the case before.

This was only introduced as of a727c804a2c1, and I'm not sure I fully
follow the rationale for that commit. Thomas, why do we need to
process the events? I guess we could also fake it by turning 'busy'
into a refcount, which would be incremented/decremented as it is today
when posting buffers and getting Idle events, but also when we're
holding a local pointer which we can't have stolen from under us.
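
A minimal sketch of the refcount idea, assuming a hypothetical buffer type.
None of these names come from the dri3 loader; the free counter exists only
so the demo can observe when the buffer goes away.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical buffer type; not the real dri3 loader structure. */
struct toy_buffer {
   int busy;          /* reference count instead of a boolean */
   bool doomed;       /* set when a reallocation wants the buffer gone */
   int *free_counter; /* incremented on free, for the demo only */
};

static void
toy_buffer_ref(struct toy_buffer *buf)
{
   buf->busy++;
}

static void
toy_buffer_unref(struct toy_buffer *buf)
{
   assert(buf->busy > 0);
   if (--buf->busy == 0 && buf->doomed) {
      (*buf->free_counter)++;
      free(buf);
   }
}

/* Called by event handling that wants to drop the buffer. The free is
 * deferred while anyone still holds a reference. */
static void
toy_buffer_discard(struct toy_buffer *buf)
{
   buf->doomed = true;
   if (buf->busy == 0) {
      (*buf->free_counter)++;
      free(buf);
   }
}

/* Returns 1 when the buffer survived the discard and was freed exactly
 * once afterwards, 0 otherwise, and -1 if the allocation failed. */
static int
toy_demo(void)
{
   int frees = 0;
   struct toy_buffer *buf = calloc(1, sizeof(*buf));
   if (!buf)
      return -1;
   buf->free_counter = &frees;

   toy_buffer_ref(buf);          /* caller holds a local pointer */
   toy_buffer_discard(buf);      /* event handler asks for removal */
   int frees_while_held = frees; /* the buffer must still be alive here */
   toy_buffer_unref(buf);        /* last holder lets go */

   return frees_while_held == 0 && frees == 1;
}
```

The existing posting and Idle-event paths would take and drop references the
same way the local holder does here.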

Cheers,
Daniel


Re: [Mesa-dev] [PATCH v3 000/104] nir: Move to using instructions for derefs

2018-04-06 Thread Rob Clark
On Tue, Apr 3, 2018 at 2:32 PM, Jason Ekstrand  wrote:
> This is something that Connor and I have been talking about for some time
> now.  The basic idea is to replace the current singly linked nir_deref list
> with deref instructions.  This is similar to what LLVM does and it offers
> quite a bit more freedom when we start getting more realistic pointers from
> compute applications.
>
> This series implements a complete conversion for both i965 and anv.  As can
> be seen in the penultimate patch, there are three core NIR passes remaining
> to be converted.  None of those three passes is used by any of the Intel
> drivers so I have no ability to test them.  The final patch deletes support
> for old-school deref chains from NIR entirely.  The only NIR-using drivers
> which build with that patch are i965 and anv but it shows that the
> conversion for those two is complete and has also been very useful in
> finding things I missed the first time around.
>
> Somehow, this series manages to shave off 700 lines of code but I wouldn't
> take that to mean much.  Some of that is whole-sale deleting lower_io_types
> (170 lines).  Some of it is that deref instructions and the new function
> call mechanism are more efficient from a data structure perspective because
> you don't have deref chains attached to texture ops and intrinsics.  I've
> also been modernizing as I go and converting some things to use nir_builder
> instead of building instructions manually.  The amount that deref
> instructions make things easier over deref chains is totally a wash.
>
> Clearly, this can't really proceed until other drivers have added the bits
> (which should be small at this point) to do the conversion.  Someone also
> needs to add "Support deref instructions in..." and "Remove deref chain
> support from..." patches for the three remaining core NIR passes.
>
> My next plan is to try and start experimenting with more advanced
> load/store elimination on shared variables and maybe even SSBOs.  This will
> require properly handling barriers and, thanks to Vulkan's pointer support,
> cast derefs where the source may have come from a phi node or variable.
>
> This series can be found as a branch on gitlab:
>
> https://gitlab.freedesktop.org/jekstrand/mesa/commits/review/nir-deref-instr-v3
>

Fwiw, I've converted the remaining passes that mesa/st uses (but
i965/anv do not), and things seem to be generally working(ish).  I
still have a piglit run going, I'm sure there are some little things
to fix here/there.

The nir_lower_samplers_as_deref isn't so well tested, since freedreno
doesn't use it.  I hacked up ir3_cmdline compiler with a call to that
pass with nir_print_shader before/after, and the results looked sane.
But not tested at all on real hw.

  https://github.com/freedreno/mesa/commits/nir-deref-instr-v3

(there was one unrelated patch to avoid rebasing jason's -v3 branch
and one to make ir3_cmdline compiler work so I could test-drive
nir_lower_samplers_as_deref)

I've also not tried to make this bisectable yet.  But I guess it
should be enough for others to start testing/converting their drivers.

BR,
-R

> Cc: Rob Clark 
> Cc: Timothy Arceri 
> Cc: Eric Anholt 
> Cc: Connor Abbott 
> Cc: Bas Nieuwenhuizen 
> Cc: Karol Herbst 
>
> Jason Ekstrand (104):
>   nir/validate: Rework intrinsic type validation
>   nir: Add a deref instruction type
>   nir/builder: Add deref building helpers
>   nir: Add _deref versions of all of the _var intrinsics
>   nir: Add deref sources to texture instructions
>   nir: Add helpers for working with deref instructions
>   anv,i965,radv,st,ir3: Call nir_lower_deref_instrs
>   glsl/nir: Only claim to handle intrinsic functions
>   glsl/nir: Use deref instructions instead of dref chains
>   prog/nir: Simplify some load/store operations
>   prog/nir: Use deref instructions for params
>   nir/lower_atomics: Rework the main walker loop a bit
>   nir: Support deref instructions in remove_dead_variables
>   nir: Add a pass for fixing deref modes
>   nir: Support deref instructions in lower_global_vars_to_local
>   nir: Use nir_builder in lower_io_to_temporaries
>   nir: Support deref instructions in lower_io_to_temporaries
>   nir: Add a deref path helper struct
>   nir: Support deref instructions in lower_var_copies
>   nir: Support deref instructions in split_var_copies
>   nir: Support deref instructions in lower_vars_to_ssa
>   nir: Support deref instructions in lower_indirect_derefs
>   nir/deref: Add a deref cleanup function
>   nir: Support deref instructions in lower_system_values
>   nir: Support deref instructions in lower_clip_cull
>   nir: Support deref instructions in propagate_invariant
>   nir: Support deref instructions in gather_info
>   nir: Support deref instructions in lower_io
>   nir: Support deref instructions in lower_atomics
>   nir: 

[Mesa-dev] [Bug 105871] Discolored KDE panels after updating to Mesa 18.0 on Intel broadwell

2018-04-06 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=105871

--- Comment #19 from Tasev  ---
(In reply to Alexey Min from comment #18)
> (In reply to Tasev from comment #16)
> > (In reply to sergio.callegari from comment #13)
> > > As an alternate/complement solution to the patched xserver-xorg-core on
> > > the Padoka ppa, for those using kde plasma, there is now also a patch to
> > > kwin to fix the visuals selection.
> > > 
> > > https://phabricator.kde.org/D11758
> > 
> > I tried that patch without success; it doesn't fix this bug.
> 
> This is understandable, because I only fixed kwin_wayland session with that,
> and I only have amd hardware to test with.
> 
> If you can point me to which config attribute kwin should/should not be
> using to avoid this, we can at least try to fix kwin_x11 too.

I'm just an average user; I can test a patch, but I cannot help more
than that.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


Re: [Mesa-dev] [PATCH v2 0/3] mesa/i965: Add support for INTEL_blackhole_render

2018-04-06 Thread Lionel Landwerlin
I should have added the branch where you can check this out : 
https://github.com/djdeath/mesa/tree/wip/intel-blackhole-render


Thanks!

On 06/04/18 15:31, Lionel Landwerlin wrote:

Hi all,

This is an update after Emil's feedback and a different implementation
in i965.

Cheers,

Lionel Landwerlin (3):
   include: bump GL/GLES headers & registry
   mesa: add INTEL_blackhole_render
   i965: enable INTEL_blackhole_render

  include/GL/glcorearb.h   |   52 +-
  include/GL/glext.h   |   65 +-
  include/GL/glxext.h  |   20 +-
  include/GL/wglext.h  |6 +-
  include/GLES/gl.h|   15 +-
  include/GLES/glext.h |   33 +-
  include/GLES2/gl2.h  |6 +-
  include/GLES2/gl2ext.h   |  143 +-
  include/GLES3/gl3.h  |6 +-
  src/mapi/glapi/registry/gl.xml   | 3995 --
  src/mesa/drivers/dri/i965/brw_compute.c  |   46 +-
  src/mesa/drivers/dri/i965/brw_defines.h  |8 +-
  src/mesa/drivers/dri/i965/brw_draw.c |   20 +-
  src/mesa/drivers/dri/i965/intel_extensions.c |1 +
  src/mesa/main/clear.c|2 +-
  src/mesa/main/enable.c   |   14 +
  src/mesa/main/extensions_table.h |1 +
  src/mesa/main/mtypes.h   |7 +
  18 files changed, 3118 insertions(+), 1322 deletions(-)

--
2.17.0





[Mesa-dev] [PATCH v2 0/3] mesa/i965: Add support for INTEL_blackhole_render

2018-04-06 Thread Lionel Landwerlin
Hi all,

This is an update after Emil's feedback and a different implementation
in i965.

Cheers,

Lionel Landwerlin (3):
  include: bump GL/GLES headers & registry
  mesa: add INTEL_blackhole_render
  i965: enable INTEL_blackhole_render

 include/GL/glcorearb.h   |   52 +-
 include/GL/glext.h   |   65 +-
 include/GL/glxext.h  |   20 +-
 include/GL/wglext.h  |6 +-
 include/GLES/gl.h|   15 +-
 include/GLES/glext.h |   33 +-
 include/GLES2/gl2.h  |6 +-
 include/GLES2/gl2ext.h   |  143 +-
 include/GLES3/gl3.h  |6 +-
 src/mapi/glapi/registry/gl.xml   | 3995 --
 src/mesa/drivers/dri/i965/brw_compute.c  |   46 +-
 src/mesa/drivers/dri/i965/brw_defines.h  |8 +-
 src/mesa/drivers/dri/i965/brw_draw.c |   20 +-
 src/mesa/drivers/dri/i965/intel_extensions.c |1 +
 src/mesa/main/clear.c|2 +-
 src/mesa/main/enable.c   |   14 +
 src/mesa/main/extensions_table.h |1 +
 src/mesa/main/mtypes.h   |7 +
 18 files changed, 3118 insertions(+), 1322 deletions(-)

--
2.17.0


[Mesa-dev] [PATCH v2 3/3] i965: enable INTEL_blackhole_render

2018-04-06 Thread Lionel Landwerlin
v2: condition the extension on context isolation support from the
kernel (Chris)

v3: (Lionel)

The initial version of this change used a feature of the Gen7+
command parser to turn the primitive instructions into no-ops.
Unfortunately this doesn't play well with how we're using the
hardware outside of the user submitted commands. For example
resolves are implicit operations which should not be turned into
no-ops as part of the previously submitted commands (before
blackhole_render is enabled) might not be disabled. For example
this sequence :

   glClear();
   glEnable(GL_BLACKHOLE_RENDER_INTEL);
   glDrawArrays(...);
   glReadPixels(...);
   glDisable(GL_BLACKHOLE_RENDER_INTEL);

While clear has been emitted outside the blackhole render, it
should still be resolved properly in the read pixels. Hence we
need to be more selective and only disable user submitted
commands.

This v3 manually turns primitives into MI_NOOP if blackhole render
is enabled. This lets us enable this feature on any platform.
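
A toy model of the v3 approach, assuming a simplified batch buffer. The
toy_* names are hypothetical, and the all-zero encoding for the no-op dword
is an assumption made for the sketch; the actual change is in the diff below.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MI_NOOP      0u /* assumed encoding: an all-zero dword */
#define TOY_BATCH_DWORDS 64

struct toy_batch {
   uint32_t dw[TOY_BATCH_DWORDS];
   size_t used;
};

static void
toy_out(struct toy_batch *b, uint32_t value)
{
   assert(b->used < TOY_BATCH_DWORDS);
   b->dw[b->used++] = value;
}

/* User-submitted primitive: when blackhole render is enabled it is
 * replaced by no-ops of the same length, so the batch layout is unchanged. */
static void
toy_emit_user_primitive(struct toy_batch *b, bool blackhole,
                        const uint32_t *cmd, size_t n)
{
   for (size_t i = 0; i < n; i++)
      toy_out(b, blackhole ? TOY_MI_NOOP : cmd[i]);
}

/* Driver-internal work such as a resolve: always emitted as-is. */
static void
toy_emit_internal(struct toy_batch *b, const uint32_t *cmd, size_t n)
{
   for (size_t i = 0; i < n; i++)
      toy_out(b, cmd[i]);
}
```

Because the decision is made per emitted command rather than globally, the
resolve needed by a later read still reaches the hardware.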

Signed-off-by: Lionel Landwerlin 
---
 src/mesa/drivers/dri/i965/brw_compute.c  | 46 +++-
 src/mesa/drivers/dri/i965/brw_defines.h  |  8 +++-
 src/mesa/drivers/dri/i965/brw_draw.c | 20 ++---
 src/mesa/drivers/dri/i965/intel_extensions.c |  1 +
 4 files changed, 49 insertions(+), 26 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_compute.c b/src/mesa/drivers/dri/i965/brw_compute.c
index 5ce899bcbcc..a368e5fb2c6 100644
--- a/src/mesa/drivers/dri/i965/brw_compute.c
+++ b/src/mesa/drivers/dri/i965/brw_compute.c
@@ -131,29 +131,35 @@ brw_emit_gpgpu_walker(struct brw_context *brw)
if (right_non_aligned != 0)
   right_mask >>= (simd_size - right_non_aligned);
 
+   struct gl_context *ctx = &brw->ctx;
uint32_t dwords = devinfo->gen < 8 ? 11 : 15;
BEGIN_BATCH(dwords);
-   OUT_BATCH(GPGPU_WALKER << 16 | (dwords - 2) | indirect_flag);
-   OUT_BATCH(0);
-   if (devinfo->gen >= 8) {
-  OUT_BATCH(0); /* Indirect Data Length */
-  OUT_BATCH(0); /* Indirect Data Start Address */
+   if (ctx->IntelBlackholeRender) {
+  for (uint32_t d = 0; d < dwords; d++)
+ OUT_BATCH(MI_NOOP);
+   } else {
+  OUT_BATCH(GPGPU_WALKER << 16 | (dwords - 2) | indirect_flag);
+  OUT_BATCH(0);
+  if (devinfo->gen >= 8) {
+ OUT_BATCH(0); /* Indirect Data Length */
+ OUT_BATCH(0); /* Indirect Data Start Address */
+  }
+  assert(thread_width_max <= brw->screen->devinfo.max_cs_threads);
+  OUT_BATCH(SET_FIELD(simd_size / 16, GPGPU_WALKER_SIMD_SIZE) |
+SET_FIELD(thread_width_max - 1, GPGPU_WALKER_THREAD_WIDTH_MAX));
+  OUT_BATCH(0);/* Thread Group ID Starting X */
+  if (devinfo->gen >= 8)
+ OUT_BATCH(0); /* MBZ */
+  OUT_BATCH(num_groups[0]);/* Thread Group ID X Dimension */
+  OUT_BATCH(0);/* Thread Group ID Starting Y */
+  if (devinfo->gen >= 8)
+ OUT_BATCH(0); /* MBZ */
+  OUT_BATCH(num_groups[1]);/* Thread Group ID Y Dimension */
+  OUT_BATCH(0);/* Thread Group ID Starting/Resume Z */
+  OUT_BATCH(num_groups[2]);/* Thread Group ID Z Dimension */
+  OUT_BATCH(right_mask);   /* Right Execution Mask */
+  OUT_BATCH(0xffffffff);   /* Bottom Execution Mask */
}
-   assert(thread_width_max <= brw->screen->devinfo.max_cs_threads);
-   OUT_BATCH(SET_FIELD(simd_size / 16, GPGPU_WALKER_SIMD_SIZE) |
- SET_FIELD(thread_width_max - 1, GPGPU_WALKER_THREAD_WIDTH_MAX));
-   OUT_BATCH(0);/* Thread Group ID Starting X */
-   if (devinfo->gen >= 8)
-  OUT_BATCH(0); /* MBZ */
-   OUT_BATCH(num_groups[0]);/* Thread Group ID X Dimension */
-   OUT_BATCH(0);/* Thread Group ID Starting Y */
-   if (devinfo->gen >= 8)
-  OUT_BATCH(0); /* MBZ */
-   OUT_BATCH(num_groups[1]);/* Thread Group ID Y Dimension */
-   OUT_BATCH(0);/* Thread Group ID Starting/Resume Z */
-   OUT_BATCH(num_groups[2]);/* Thread Group ID Z Dimension */
-   OUT_BATCH(right_mask);   /* Right Execution Mask */
-   OUT_BATCH(0xffffffff);   /* Bottom Execution Mask */
ADVANCE_BATCH();
 
BEGIN_BATCH(2);
diff --git a/src/mesa/drivers/dri/i965/brw_defines.h b/src/mesa/drivers/dri/i965/brw_defines.h
index 8bf6f68b67c..c8a597c8ad0 100644
--- a/src/mesa/drivers/dri/i965/brw_defines.h
+++ b/src/mesa/drivers/dri/i965/brw_defines.h
@@ -1650,11 +1650,17 @@ enum brw_pixel_shader_coverage_mask_mode {
 #define GEN10_CACHE_MODE_SS0x0e420
 #define 

[Mesa-dev] [PATCH v2 2/3] mesa: add INTEL_blackhole_render

2018-04-06 Thread Lionel Landwerlin
v2: Implement missing Enable/Disable (Emil)

Signed-off-by: Lionel Landwerlin 
---
 src/mesa/main/clear.c|  2 +-
 src/mesa/main/enable.c   | 14 ++
 src/mesa/main/extensions_table.h |  1 +
 src/mesa/main/mtypes.h   |  7 +++
 4 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/src/mesa/main/clear.c b/src/mesa/main/clear.c
index 6beff9ed842..e9ab59b7116 100644
--- a/src/mesa/main/clear.c
+++ b/src/mesa/main/clear.c
@@ -175,7 +175,7 @@ clear(struct gl_context *ctx, GLbitfield mask, bool no_error)
   return;
}
 
-   if (ctx->RasterDiscard)
+   if (ctx->RasterDiscard || ctx->IntelBlackholeRender)
   return;
 
if (ctx->RenderMode == GL_RENDER) {
diff --git a/src/mesa/main/enable.c b/src/mesa/main/enable.c
index 7625a4c9577..978258390c1 100644
--- a/src/mesa/main/enable.c
+++ b/src/mesa/main/enable.c
@@ -1127,6 +1127,16 @@ _mesa_set_enable(struct gl_context *ctx, GLenum cap, GLboolean state)
  ctx->Color.BlendCoherent = state;
  break;
 
+  case GL_BLACKHOLE_RENDER_INTEL:
+ if (!_mesa_has_INTEL_blackhole_render(ctx))
+goto invalid_enum_error;
+ if (ctx->IntelBlackholeRender == state)
+return;
+ FLUSH_VERTICES(ctx, 0);
+ ctx->NewDriverState |= ctx->DriverFlags.NewIntelBlackholeRender;
+ ctx->IntelBlackholeRender = state;
+ break;
+
   default:
  goto invalid_enum_error;
}
@@ -1762,6 +1772,10 @@ _mesa_IsEnabled( GLenum cap )
  CHECK_EXTENSION(MESA_tile_raster_order);
  return ctx->TileRasterOrderIncreasingY;
 
+  case GL_BLACKHOLE_RENDER_INTEL:
+ CHECK_EXTENSION(INTEL_blackhole_render);
+ return ctx->IntelBlackholeRender;
+
   default:
  goto invalid_enum_error;
}
diff --git a/src/mesa/main/extensions_table.h b/src/mesa/main/extensions_table.h
index 492f7c3d20a..f2df7cead60 100644
--- a/src/mesa/main/extensions_table.h
+++ b/src/mesa/main/extensions_table.h
@@ -307,6 +307,7 @@ EXT(IBM_texture_mirrored_repeat , dummy_true
 
 EXT(INGR_blend_func_separate                , EXT_blend_func_separate                , GLL,  x ,  x ,  x , 1999)
 
+EXT(INTEL_blackhole_render                  , INTEL_blackhole_render                 ,  30,  30,  x , ES2, 2018)
 EXT(INTEL_conservative_rasterization        , INTEL_conservative_rasterization       ,  x , GLC,  x ,  31, 2013)
 EXT(INTEL_performance_query                 , INTEL_performance_query                , GLL, GLC,  x , ES2, 2013)
 
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index b7a7b34a090..efae386aa78 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -4335,6 +4335,7 @@ struct gl_extensions
GLboolean ATI_fragment_shader;
GLboolean ATI_separate_stencil;
GLboolean GREMEDY_string_marker;
+   GLboolean INTEL_blackhole_render;
GLboolean INTEL_conservative_rasterization;
GLboolean INTEL_performance_query;
GLboolean KHR_blend_equation_advanced;
@@ -4704,6 +4705,11 @@ struct gl_driver_flags
 
/** Shader constants (uniforms, program parameters, state constants) */
uint64_t NewShaderConstants[MESA_SHADER_STAGES];
+
+   /**
+* gl_context::IntelBlackholeRender
+*/
+   uint64_t NewIntelBlackholeRender;
 };
 
 struct gl_buffer_binding
@@ -5120,6 +5126,7 @@ struct gl_context
 
GLboolean RasterDiscard;  /**< GL_RASTERIZER_DISCARD */
 GLboolean IntelConservativeRasterization; /**< GL_INTEL_CONSERVATIVE_RASTERIZATION */
+   GLboolean IntelBlackholeRender; /**< GL_INTEL_blackhole_render */
 
/** Does glVertexAttrib(0) alias glVertex()? */
bool _AttribZeroAliasesVertex;
-- 
2.17.0



[Mesa-dev] [PATCH] mesa: Assert base format before truncating to unsigned short

2018-04-06 Thread Topi Pohjolainen
CID: 1433709
Fixes: ca721b3d8: mesa: use GLenum16 in a few more places
CC: Marek Olšák 
CC: Brian Paul 

Signed-off-by: Topi Pohjolainen 
---
 src/mesa/main/teximage.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/mesa/main/teximage.c b/src/mesa/main/teximage.c
index 8f53510..f560512 100644
--- a/src/mesa/main/teximage.c
+++ b/src/mesa/main/teximage.c
@@ -845,6 +845,7 @@ _mesa_init_teximage_fields_ms(struct gl_context *ctx,
 mesa_format format,
 GLuint numSamples, GLboolean fixedSampleLocations)
 {
+   const GLint base_format =_mesa_base_tex_format(ctx, internalFormat);
GLenum target;
assert(img);
assert(width >= 0);
@@ -852,8 +853,8 @@ _mesa_init_teximage_fields_ms(struct gl_context *ctx,
assert(depth >= 0);
 
target = img->TexObject->Target;
-   img->_BaseFormat = _mesa_base_tex_format( ctx, internalFormat );
-   assert(img->_BaseFormat != -1);
+   assert(base_format != -1);
+   img->_BaseFormat = (GLenum16)base_format;
img->InternalFormat = internalFormat;
img->Border = border;
img->Width = width;
-- 
2.7.4
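
For readers wondering why the assert has to run before the narrowing store,
here is a small sketch. It assumes a 16-bit unsigned stand-in for the field
type and an int wider than 16 bits; the toy_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef uint16_t toy_enum16; /* stand-in for a 16-bit enum field */

/* Check performed after the value was stored in the narrow field.
 * Returns 1 if the stored value still compares equal to -1. */
static int
toy_check_after_narrowing(int base_format)
{
   toy_enum16 stored = (toy_enum16)base_format;
   int promoted = stored; /* -1 was stored as 65535 and stays 65535 here */
   return promoted == -1;
}

/* Check performed on the full-width value before narrowing. */
static int
toy_check_before_narrowing(int base_format)
{
   return base_format == -1;
}
```

The first function can never report the error value, which is the situation
the patch avoids by asserting on the wide local first.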



Re: [Mesa-dev] [RFC] - Rewrite mesa website in Sphinx

2018-04-06 Thread Rob Clark
On Fri, Apr 6, 2018 at 8:30 AM, Emil Velikov  wrote:
> On 6 April 2018 at 10:40, Daniel Stone  wrote:
>> Hi all,
>>
>> On 5 April 2018 at 23:55, Laura Ekstrand  wrote:
>>> So I spoke with Daniel Stone today about the infrastructure.  He estimates
>>> it will be ready to deploy the website in 2-3 weeks, at the most.  So I'd
>>> say the infrastructure will be there when we are ready.
>>>
>>> In the new system, our website will be running in its own container managed
>>> by freedesktop's new Gitlab server. So what we need to do for the deploy is:
>>
>> Right. We can't keep stacking random unmanaged bits for every project
>> into unconfined spaces, not least as it makes upgrades super painful.
>> What we want to do is use GitLab's CI mechanism to automatically run a
>> pipeline to generate the static pages, and likely serve them from
>> GitLab Pages. Having it all done in containers means that you can be
>> in complete control of the environment without having to block on
>> admins, and it's also easier for others to reproduce.
>>
>>> 1.  Fork mesa into a repo on Gitlab.com (https://gitlab.freedesktop.org).
>>
>> This should be actual gitlab.com for now. I don't have CI enabled on
>> gitlab.fd.o yet as it's surprisingly difficult to do without exposing
>> root to everyone. I'm aiming for the next couple of weeks to have this
>> done. I'll ping this thread when it is up.
>>
> Thanks for the confirmation and the work Dan.
>
> Some time back I looked how lxc manages its containers w/o root. I
> could help out porting that to docker/others.
>
> Laura, others,
>
> As you can see things are not finalised nor official*, yet. Hence why
> I would not rush into rewriting the docs/website. Otherwise the work
> would sit in a branch, again :-\
>

drive-by comment but would pointing mesa3d.org at
mesa3d.readthedocs.org, or something like that, suffice as a temporary
solution, if needed?

BR,
-R

> HTH
> Emil
> * Did anyone mention gitlab to the folks who are not on IRC?


[Mesa-dev] Mesa 18.1.0 release plan

2018-04-06 Thread Emil Velikov
Hi all,

Here is the tentative release plan for 18.1.0. While it hasn't been on the
mesa3d.org website, it shouldn't be a surprise for anyone.

 Apr 20 2018 - Feature freeze/Release candidate 1
 Apr 27 2018 - Release candidate 2
 May 04 2018 - Release candidate 3
 May 11 2018 - Release candidate 4/final release

This gives us approximately two weeks until the branch point.

Note: In the spirit of keeping things clearer and more transparent, we will be
keeping track of any features planned for the release in Bugzilla [1].

Do add a separate "Depends on" for each piece of work you have planned.
Alternatively you can reply to this email and I'll add them for you.

Thanks
Emil

[1] https://bugs.freedesktop.org/show_bug.cgi?id=105928


[Mesa-dev] [Bug 105928] [TRACKER] Mesa 18.1 feature tracker

2018-04-06 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=105928

Bug ID: 105928
   Summary: [TRACKER] Mesa 18.1 feature tracker
   Product: Mesa
   Version: unspecified
  Hardware: All
OS: All
Status: NEW
  Severity: normal
  Priority: medium
 Component: Other
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: emil.l.veli...@gmail.com
QA Contact: mesa-dev@lists.freedesktop.org

This is a tracker for features planned for the 18.1.0 release.

Note: features should be merged prior to the branchpoint - schedule is on
mesa3d.org

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


Re: [Mesa-dev] [PATCH] dri3: Prevent multiple freeing of buffers.

2018-04-06 Thread Eero Tamminen

Hi,

I tested this on KBL GT2.  Compiz crashes during a 3h test run dropped 
from 30 to none.  There were changes of a couple of percent in a few 
synthetic tests, but I think that's just because Compiz is now running properly.


-> looks good!


- Eero

Tested-by: Eero Tamminen 


On 06.04.2018 11:12, Sergii Romantsov wrote:

Commit 3160cb86aa92 adds an optimization controlled by the 'reallocate' flag.
Processing that flag can free a buffer while a pointer to it is still
held on the caller's stack, and the caller then frees it a second time.
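
The ownership problem described above can be modelled in a few lines. This is only a sketch under stated assumptions: the `model_*` types and functions are hypothetical stand-ins, not the loader's real structures, and the event handler is called directly where the real code would process events while waiting on a fence. It shows the idea of the `in_use_to_destroy` flag from the patch: the caller claims the buffer first, so the event path skips it and only the caller frees it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, heavily simplified stand-ins for the loader structures. */
struct model_buffer {
   bool reallocate;         /* buffer should be replaced, not reused */
   bool in_use_to_destroy;  /* a caller holds it and will free it itself */
};

#define MODEL_NUM_BUFFERS 4

struct model_drawable {
   struct model_buffer *buffers[MODEL_NUM_BUFFERS];
   int free_count;          /* counts frees so a double free would be visible */
};

static void
model_free_buffer(struct model_drawable *draw, struct model_buffer *buf)
{
   draw->free_count++;
   free(buf);
}

/* Event side: drops buffers flagged for reallocation, unless a caller
 * has already claimed them for destruction. */
static void
model_handle_events(struct model_drawable *draw)
{
   for (int b = 0; b < MODEL_NUM_BUFFERS; b++) {
      struct model_buffer *buf = draw->buffers[b];

      if (buf && buf->reallocate && !buf->in_use_to_destroy) {
         model_free_buffer(draw, buf);
         draw->buffers[b] = NULL;
      }
   }
}

/* Caller side: claims the old buffer before events can run, then frees
 * it exactly once and installs a replacement. */
static struct model_buffer *
model_get_buffer(struct model_drawable *draw, int id)
{
   struct model_buffer *old = draw->buffers[id];

   if (old && old->reallocate) {
      old->in_use_to_destroy = true;
      model_handle_events(draw);   /* stands in for event processing */
      model_free_buffer(draw, old);
      draw->buffers[id] = calloc(1, sizeof(*old));
   }
   return draw->buffers[id];
}
```

Without the claim in `model_get_buffer`, the event handler would free `old` first and the caller would then free the same pointer again, which is the failure the commit message describes.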

Fixes: 3160cb86aa92 "egl/x11: Re-allocate buffers if format is suboptimal"

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105906
Signed-off-by: Sergii Romantsov 
Tested-by: Andriy Khulap 
---
  src/loader/loader_dri3_helper.c | 7 +--
  src/loader/loader_dri3_helper.h | 1 +
  2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/src/loader/loader_dri3_helper.c b/src/loader/loader_dri3_helper.c
index fe17df1..5f9cc42 100644
--- a/src/loader/loader_dri3_helper.c
+++ b/src/loader/loader_dri3_helper.c
@@ -422,7 +422,7 @@ dri3_handle_present_event(struct loader_dri3_drawable *draw,
  buf->busy = 0;
  
   if (buf && draw->cur_blit_source != b && !buf->busy &&

- (buf->reallocate ||
+ ((buf->reallocate && !buf->in_use_to_destroy) ||
   (draw->num_back <= b && b < LOADER_DRI3_MAX_BACK))) {
  dri3_free_render_buffer(draw, buf);
  draw->buffers[b] = NULL;
@@ -1688,6 +1688,7 @@ dri3_get_buffer(__DRIdrawable *driDrawable,
 (buffer_type == loader_dri3_buffer_front && draw->have_fake_front))
&& buffer) {
  
+ buffer->in_use_to_destroy = true;

   /* Fill the new buffer with data from an old buffer */
   dri3_fence_await(draw->conn, draw, buffer);
   if (!loader_dri3_blit_image(draw,
@@ -1731,6 +1732,7 @@ dri3_get_buffer(__DRIdrawable *driDrawable,
draw->buffers[buf_id] = buffer;
 }
 dri3_fence_await(draw->conn, draw, buffer);
+   buffer = draw->buffers[buf_id];
  
 /*

  * Do we need to preserve the content of a previous buffer?
@@ -1744,7 +1746,8 @@ dri3_get_buffer(__DRIdrawable *driDrawable,
 if (buffer_type == loader_dri3_buffer_back &&
 draw->cur_blit_source != -1 &&
 draw->buffers[draw->cur_blit_source] &&
-   buffer != draw->buffers[draw->cur_blit_source]) {
+   buffer != draw->buffers[draw->cur_blit_source] &&
+   buffer != NULL) {
  
struct loader_dri3_buffer *source = draw->buffers[draw->cur_blit_source];
  
diff --git a/src/loader/loader_dri3_helper.h b/src/loader/loader_dri3_helper.h

index 7e3d829..9232d61 100644
--- a/src/loader/loader_dri3_helper.h
+++ b/src/loader/loader_dri3_helper.h
@@ -62,6 +62,7 @@ struct loader_dri3_buffer {
 bool busy;   /* Set on swap, cleared on IdleNotify */
 bool own_pixmap; /* We allocated the pixmap ID, free on 
destroy */
 bool reallocate; /* Buffer should be reallocated and not 
reused */
+   bool in_use_to_destroy; /* Buffer is in use and will be 
destroyed soon */
  
 uint32_t num_planes;

 uint32_t size;





Re: [Mesa-dev] [PATCH 02/17] ac/surface: don't set the display flag for obviously unsupported cases

2018-04-06 Thread Marek Olšák
On Thu, Apr 5, 2018, 3:09 AM Michel Dänzer  wrote:

> On 2018-04-04 07:35 PM, Marek Olšák wrote:
> > On Wed, Apr 4, 2018 at 9:01 AM, Michel Dänzer 
> wrote:
> >> On 2018-04-04 02:57 PM, Marek Olšák wrote:
> >>> On Wed, Apr 4, 2018, 6:18 AM Michel Dänzer wrote:
> >>>
> >>> On 2018-04-04 03:59 AM, Marek Olšák wrote:
> >>> > From: Marek Olšák  
> >>> >
> >>> > This enables the tile swizzle for some cases of the displayable
> >>> micro mode,
> >>> > and it also fixes an addrlib assertion failure on Vega.
> >>> > ---
> >>> >  src/amd/common/ac_surface.c | 18 ++
> >>> >  1 file changed, 14 insertions(+), 4 deletions(-)
> >>> >
> >>> > diff --git a/src/amd/common/ac_surface.c
> >> b/src/amd/common/ac_surface.c
> >>> > index b294cd85259..2b20a553d51 100644
> >>> > --- a/src/amd/common/ac_surface.c
> >>> > +++ b/src/amd/common/ac_surface.c
> >>> > @@ -408,20 +408,29 @@ static unsigned
> >>> cik_get_macro_tile_index(struct radeon_surf *surf)
> >>> >   tileb = 8 * 8 * surf->bpe;
> >>> >   tileb = MIN2(surf->u.legacy.tile_split, tileb);
> >>> >
> >>> >   for (index = 0; tileb > 64; index++)
> >>> >   tileb >>= 1;
> >>> >
> >>> >   assert(index < 16);
> >>> >   return index;
> >>> >  }
> >>> >
> >>> > +static bool get_display_flag(const struct ac_surf_config
> *config,
> >>> > +  const struct radeon_surf *surf)
> >>> > +{
> >>> > + return surf->flags & RADEON_SURF_SCANOUT &&
> >>> > +!(surf->flags & RADEON_SURF_FMASK) &&
> >>> > +config->info.samples <= 1 &&
> >>> > +surf->bpe >= 4 && surf->bpe <= 8;
> >>>
> >>> surf->bpe is the number of bytes used to store each pixel, right?
> If
> >> so,
> >>> this cannot exclude surf->bpe < 4, since 16 bpp and 8 bpp formats
> >> can be
> >>> displayed.
> >>>
> >>>
> >>> Sure, but what are the chances they will be displayed with the current
> >>> stack? GLX doesn't have 16bpp visuals for on-screen rendering.
> >>
> >> Maybe not when the X server runs at depth 24, but it can also run at
> >> depths 8, 15 & 16, in which case displayable surfaces with bpe == 1 or 2
> >> are needed before GLX even comes into the picture.
> >>
> >
> > OK. Let me ask differently. Do we wanna support displayable 8, 15, and 16
> > bpp?
>
> We do support it, it's not really a question of whether we want to
> anymore. :)
>
> > Can we just say that we don't support those?
>
> I'm afraid we can't.
>
>
> Which kind of surfaces are you trying to exclude like this? Maybe they
> can be excluded in a different way.
>

Currently just the MSAA resolve temporary destination buffer.

Marek
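
To make Michel's objection concrete, here is a minimal sketch of the predicate with the lower bound on bpe relaxed so that 8 and 16 bpp surfaces (bpe of 1 or 2) still count as displayable. Everything named `model_*` or `MODEL_*` is hypothetical: the flag values are placeholders rather than the real RADEON_SURF_* bits, and the sample count is passed directly instead of through a config struct. It is not the change that was eventually made, and it does not answer how to exclude the MSAA resolve temporary, which the thread leaves open.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag values; the real RADEON_SURF_* bits are defined in Mesa. */
#define MODEL_SURF_SCANOUT (1u << 0)
#define MODEL_SURF_FMASK   (1u << 1)

struct model_surf {
   uint32_t flags;
   unsigned bpe;   /* bytes per element */
};

/* Same shape as get_display_flag() in the quoted patch, but accepting
 * bpe values of 1 and 2 as well. */
static bool
model_get_display_flag(unsigned samples, const struct model_surf *surf)
{
   return (surf->flags & MODEL_SURF_SCANOUT) &&
          !(surf->flags & MODEL_SURF_FMASK) &&
          samples <= 1 &&
          surf->bpe >= 1 && surf->bpe <= 8;
}
```

With this variant a single-sample scanout surface with bpe == 2 is reported as displayable, while multisampled and FMASK surfaces are still rejected.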


>
> --
> Earthling Michel Dänzer   |   http://www.amd.com
> Libre software enthusiast | Mesa and X developer
>


Re: [Mesa-dev] [PATCH 4/4] radv: implement VK_AMD_shader_core_properties

2018-04-06 Thread Grazvydas Ignotas
On Fri, Apr 6, 2018 at 3:28 PM, Samuel Pitoiset
 wrote:
> Simple extension that only returns information for AMD hw.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_device.c  | 71 
> +++
>  src/amd/vulkan/radv_extensions.py |  1 +
>  2 files changed, 72 insertions(+)
>
> diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
> index 41f8242754..fba0b5c586 100644
> --- a/src/amd/vulkan/radv_device.c
> +++ b/src/amd/vulkan/radv_device.c
> @@ -888,6 +888,39 @@ void radv_GetPhysicalDeviceProperties(
> memcpy(pProperties->pipelineCacheUUID, pdevice->cache_uuid, 
> VK_UUID_SIZE);
>  }
>
> +static uint32_t
> +radv_get_max_cu_per_sh(struct radv_physical_device *device)
> +{
> +   /* This should be queried from the KMD, like the number of SEs. */
> +   switch (device->rad_info.family) {
> +   case CHIP_TAHITI:
> +   return 8;
> +   case CHIP_HAINAN:
> +   return 5;
> +   case CHIP_BONAIRE:
> +   return 7;
> +   case CHIP_HAWAII:
> +   return 11;
> +   case CHIP_ICELAND:
> +   return 6;
> +   case CHIP_CARRIZO:
> +   return 8;
> +   case CHIP_TONGA:
> +   return 8;
> +   case CHIP_FIJI:
> +   return 16;
> +   case CHIP_STONEY:
> +   return 3;
> +   case CHIP_VEGA10:
> +   return 16;
> +   case CHIP_RAVEN:
> +   return 11;
> +   default:
> +   fprintf(stderr, "Number of CUs per SH unknown!\n");
> +   return 0;
> +   }
> +}
> +
>  void radv_GetPhysicalDeviceProperties2(
> VkPhysicalDevicephysicalDevice,
> VkPhysicalDeviceProperties2KHR *pProperties)
> @@ -961,6 +994,44 @@ void radv_GetPhysicalDeviceProperties2(
> properties->filterMinmaxSingleComponentFormats = true;
> break;
> }
> +   case 
> VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SHADER_CORE_PROPERTIES_AMD: {
> +   VkPhysicalDeviceShaderCorePropertiesAMD *properties =
> +   (VkPhysicalDeviceShaderCorePropertiesAMD 
> *)ext;
> +
> +   /* Shader engines. */
> +   properties->shaderEngineCount =
> +   pdevice->rad_info.max_se;
> +   properties->shaderArraysPerEngineCount =
> +   pdevice->rad_info.max_sh_per_se;
> +   properties->computeUnitsPerShaderArray =
> +   radv_get_max_cu_per_sh(pdevice);

Maybe
pdevice->rad_info.num_good_compute_units / (pdevice->rad_info.max_se *
pdevice->rad_info.max_sh_per_se);
would do the trick?
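
As a rough sketch of that suggestion, assuming a hypothetical `model_gpu_info` struct that carries only the three fields named above (the real ones live in the device info structure), the per-chip table could be replaced by a division:

```c
#include <stdint.h>

/* Hypothetical subset of the device info fields used in the suggestion. */
struct model_gpu_info {
   uint32_t num_good_compute_units;
   uint32_t max_se;
   uint32_t max_sh_per_se;
};

/* Derive compute units per shader array from the totals instead of a
 * per-family switch. Returns 0 if the shader array count is unknown. */
static uint32_t
model_cu_per_sh(const struct model_gpu_info *info)
{
   uint32_t num_sh = info->max_se * info->max_sh_per_se;

   if (num_sh == 0)
      return 0;

   /* Integer division: truncates if the CUs do not divide evenly. */
   return info->num_good_compute_units / num_sh;
}
```

For example, 64 compute units spread over 4 shader engines with 1 shader array each gives 16. One caveat, going only by the field name: if num_good_compute_units counts usable rather than physical units, parts with disabled CUs would report a lower value than the table, and the division may not be exact.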

Gražvydas


Re: [Mesa-dev] [PATCH 4/4] radv: implement VK_AMD_shader_core_properties

2018-04-06 Thread Samuel Pitoiset



On 04/06/2018 03:01 PM, Nils Wallménius wrote:

Hi Samuel, a question below

On Fri, 6 Apr 2018 at 14:28, Samuel Pitoiset wrote:


Simple extension that only returns information for AMD hw.

Signed-off-by: Samuel Pitoiset
---
  src/amd/vulkan/radv_device.c      | 71
+++
  src/amd/vulkan/radv_extensions.py |  1 +
  2 files changed, 72 insertions(+)

diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index 41f8242754..fba0b5c586 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -888,6 +888,39 @@ void radv_GetPhysicalDeviceProperties(
         memcpy(pProperties->pipelineCacheUUID, pdevice->cache_uuid,
VK_UUID_SIZE);
  }

+static uint32_t
+radv_get_max_cu_per_sh(struct radv_physical_device *device)
+{
+       /* This should be queried from the KMD, like the number of
SEs. */
+       switch (device->rad_info.family) {


Isn't Polaris missing from this switch?


You are right. I did try this too quickly.



BR
Nils

+       case CHIP_TAHITI:
+               return 8;
+       case CHIP_HAINAN:
+               return 5;
+       case CHIP_BONAIRE:
+               return 7;
+       case CHIP_HAWAII:
+               return 11;
+       case CHIP_ICELAND:
+               return 6;
+       case CHIP_CARRIZO:
+               return 8;
+       case CHIP_TONGA:
+               return 8;
+       case CHIP_FIJI:
+               return 16;
+       case CHIP_STONEY:
+               return 3;
+       case CHIP_VEGA10:
+               return 16;
+       case CHIP_RAVEN:
+               return 11;
+       default:
+               fprintf(stderr, "Number of CUs per SH unknown!\n");
+               return 0;
+       }
+}
+
  void radv_GetPhysicalDeviceProperties2(
         VkPhysicalDevice                            physicalDevice,
         VkPhysicalDeviceProperties2KHR             *pProperties)
@@ -961,6 +994,44 @@ void radv_GetPhysicalDeviceProperties2(

properties->filterMinmaxSingleComponentFormats = true;

                         break;
                 }
+               case
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SHADER_CORE_PROPERTIES_AMD: {
+                       VkPhysicalDeviceShaderCorePropertiesAMD
*properties =
+ 
  (VkPhysicalDeviceShaderCorePropertiesAMD *)ext;

+
+                       /* Shader engines. */
+                       properties->shaderEngineCount =
+                               pdevice->rad_info.max_se;
+                       properties->shaderArraysPerEngineCount =
+                               pdevice->rad_info.max_sh_per_se;
+                       properties->computeUnitsPerShaderArray =
+                               radv_get_max_cu_per_sh(pdevice);
+                       properties->simdPerComputeUnit = 4;
+                       properties->wavefrontsPerSimd =
+                               pdevice->rad_info.family ==
CHIP_TONGA ||
+                               pdevice->rad_info.family ==
CHIP_ICELAND ||
+                               pdevice->rad_info.family ==
CHIP_POLARIS10 ||
+                               pdevice->rad_info.family ==
CHIP_POLARIS11 ||
+                               pdevice->rad_info.family ==
CHIP_POLARIS12 ? 8 : 10;
+                       properties->wavefrontSize = 64;
+
+                       /* SGPR. */
+                       properties->sgprsPerSimd =
+                               radv_get_num_physical_sgprs(pdevice);
+                       properties->minSgprAllocation =
+                               pdevice->rad_info.chip_class >= VI ?
16 : 8;
+                       properties->maxSgprAllocation =
+                               pdevice->rad_info.family ==
CHIP_TONGA ||
+                               pdevice->rad_info.family ==
CHIP_ICELAND ? 96 : 104;
+                       properties->sgprAllocationGranularity =
+                               pdevice->rad_info.chip_class >= VI ?
16 : 8;
+
+                       /* VGPR. */
+                       properties->vgprsPerSimd =
RADV_NUM_PHYSICAL_VGPRS;
+                       properties->minVgprAllocation = 4;
+                       properties->maxVgprAllocation = 256;
+                       properties->vgprAllocationGranularity = 4;
+                       break;
+               }
                 default:
                         break;
                 }
diff --git 

Re: [Mesa-dev] [PATCH 4/4] radv: implement VK_AMD_shader_core_properties

2018-04-06 Thread Nils Wallménius
Hi Samuel, a question below

On Fri, 6 Apr 2018 at 14:28, Samuel Pitoiset wrote:

> Simple extension that only returns information for AMD hw.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_device.c  | 71
> +++
>  src/amd/vulkan/radv_extensions.py |  1 +
>  2 files changed, 72 insertions(+)
>
> diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
> index 41f8242754..fba0b5c586 100644
> --- a/src/amd/vulkan/radv_device.c
> +++ b/src/amd/vulkan/radv_device.c
> @@ -888,6 +888,39 @@ void radv_GetPhysicalDeviceProperties(
> memcpy(pProperties->pipelineCacheUUID, pdevice->cache_uuid,
> VK_UUID_SIZE);
>  }
>
> +static uint32_t
> +radv_get_max_cu_per_sh(struct radv_physical_device *device)
> +{
> +   /* This should be queried from the KMD, like the number of SEs. */
> +   switch (device->rad_info.family) {
>

Isn't Polaris missing from this switch?

BR
Nils

> +   case CHIP_TAHITI:
> +   return 8;
> +   case CHIP_HAINAN:
> +   return 5;
> +   case CHIP_BONAIRE:
> +   return 7;
> +   case CHIP_HAWAII:
> +   return 11;
> +   case CHIP_ICELAND:
> +   return 6;
> +   case CHIP_CARRIZO:
> +   return 8;
> +   case CHIP_TONGA:
> +   return 8;
> +   case CHIP_FIJI:
> +   return 16;
> +   case CHIP_STONEY:
> +   return 3;
> +   case CHIP_VEGA10:
> +   return 16;
> +   case CHIP_RAVEN:
> +   return 11;
> +   default:
> +   fprintf(stderr, "Number of CUs per SH unknown!\n");
> +   return 0;
> +   }
> +}
> +
>  void radv_GetPhysicalDeviceProperties2(
> VkPhysicalDevicephysicalDevice,
> VkPhysicalDeviceProperties2KHR *pProperties)
> @@ -961,6 +994,44 @@ void radv_GetPhysicalDeviceProperties2(
> properties->filterMinmaxSingleComponentFormats =
> true;
> break;
> }
> +   case
> VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SHADER_CORE_PROPERTIES_AMD: {
> +   VkPhysicalDeviceShaderCorePropertiesAMD
> *properties =
> +   (VkPhysicalDeviceShaderCorePropertiesAMD
> *)ext;
> +
> +   /* Shader engines. */
> +   properties->shaderEngineCount =
> +   pdevice->rad_info.max_se;
> +   properties->shaderArraysPerEngineCount =
> +   pdevice->rad_info.max_sh_per_se;
> +   properties->computeUnitsPerShaderArray =
> +   radv_get_max_cu_per_sh(pdevice);
> +   properties->simdPerComputeUnit = 4;
> +   properties->wavefrontsPerSimd =
> +   pdevice->rad_info.family == CHIP_TONGA ||
> +   pdevice->rad_info.family == CHIP_ICELAND ||
> +   pdevice->rad_info.family == CHIP_POLARIS10
> ||
> +   pdevice->rad_info.family == CHIP_POLARIS11
> ||
> +   pdevice->rad_info.family == CHIP_POLARIS12
> ? 8 : 10;
> +   properties->wavefrontSize = 64;
> +
> +   /* SGPR. */
> +   properties->sgprsPerSimd =
> +   radv_get_num_physical_sgprs(pdevice);
> +   properties->minSgprAllocation =
> +   pdevice->rad_info.chip_class >= VI ? 16 :
> 8;
> +   properties->maxSgprAllocation =
> +   pdevice->rad_info.family == CHIP_TONGA ||
> +   pdevice->rad_info.family == CHIP_ICELAND ?
> 96 : 104;
> +   properties->sgprAllocationGranularity =
> +   pdevice->rad_info.chip_class >= VI ? 16 :
> 8;
> +
> +   /* VGPR. */
> +   properties->vgprsPerSimd = RADV_NUM_PHYSICAL_VGPRS;
> +   properties->minVgprAllocation = 4;
> +   properties->maxVgprAllocation = 256;
> +   properties->vgprAllocationGranularity = 4;
> +   break;
> +   }
> default:
> break;
> }
> diff --git a/src/amd/vulkan/radv_extensions.py
> b/src/amd/vulkan/radv_extensions.py
> index bc63a34896..a25db637e2 100644
> --- a/src/amd/vulkan/radv_extensions.py
> +++ b/src/amd/vulkan/radv_extensions.py
> @@ -96,6 +96,7 @@ EXTENSIONS = [
>  Extension('VK_AMD_draw_indirect_count',   1, True),
>  Extension('VK_AMD_gcn_shader',1, True),
>  Extension('VK_AMD_rasterization_order',   1,
> 

[Mesa-dev] [Bug 105871] Discolored KDE panels after updating to Mesa 18.0 on Intel broadwell

2018-04-06 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=105871

--- Comment #18 from Alexey Min  ---
(In reply to Tasev from comment #16)
> (In reply to sergio.callegari from comment #13)
> > As an alternate/complement solution to the patched xserver-xorg-core on the
> > Padoka ppa, for those using kde plasma, there is now also a patch to kwin to
> > fix the visuals selection.
> > 
> > https://phabricator.kde.org/D11758
> 
> I tried that patch without success; it doesn't fix this bug.

This is understandable, because that patch only fixes the kwin_wayland session,
and I only have AMD hardware to test with.

If you can point me to which config attribute kwin should/should not be using
to avoid this, we can at least try to fix kwin_x11 too.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


Re: [Mesa-dev] [PATCH] radv: add radv_clear_{cmask,dcc} helpers

2018-04-06 Thread Samuel Pitoiset



On 04/06/2018 02:35 PM, Emil Velikov wrote:

On 6 April 2018 at 11:25, Samuel Pitoiset  wrote:


--- a/src/amd/vulkan/radv_meta.h
+++ b/src/amd/vulkan/radv_meta.h
@@ -195,6 +195,11 @@ void radv_blit_to_prime_linear(struct radv_cmd_buffer 
*cmd_buffer,
struct radv_image *image,
struct radv_image *linear_image);

+uint32_t radv_clear_cmask(struct radv_cmd_buffer *cmd_buffer,
+ struct radv_image *image, uint32_t value);
+uint32_t radv_clear_dcc(struct radv_cmd_buffer *cmd_buffer,
+   struct radv_image *image, uint32_t value);
+
  /* common nir builder helpers */
  #include "nir/nir_builder.h"


Unrelated comment:

Having an include in the middle of the header is normally a bad idea.
Here this creates the 'extern "C" { #include "..." }' pattern which
should be avoided.


I do agree. We should probably try to remove it.



HTH
Emil




Re: [Mesa-dev] [PATCH] radv: add radv_clear_{cmask,dcc} helpers

2018-04-06 Thread Emil Velikov
On 6 April 2018 at 11:25, Samuel Pitoiset  wrote:

> --- a/src/amd/vulkan/radv_meta.h
> +++ b/src/amd/vulkan/radv_meta.h
> @@ -195,6 +195,11 @@ void radv_blit_to_prime_linear(struct radv_cmd_buffer 
> *cmd_buffer,
>struct radv_image *image,
>struct radv_image *linear_image);
>
> +uint32_t radv_clear_cmask(struct radv_cmd_buffer *cmd_buffer,
> + struct radv_image *image, uint32_t value);
> +uint32_t radv_clear_dcc(struct radv_cmd_buffer *cmd_buffer,
> +   struct radv_image *image, uint32_t value);
> +
>  /* common nir builder helpers */
>  #include "nir/nir_builder.h"
>
Unrelated comment:

Having an include in the middle of the header is normally a bad idea.
Here this creates the 'extern "C" { #include "..." }' pattern which
should be avoided.
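
A minimal sketch of the layout being recommended, using a hypothetical header guard and function name rather than anything from radv_meta.h: all includes go at the top, outside the linkage block, and only declarations sit inside it.

```c
#ifndef MODEL_META_H
#define MODEL_META_H

/* All includes first, before any extern "C" block is opened, so that
 * included headers are never pulled in with C linkage forced on them. */
#include <stdint.h>

#ifdef __cplusplus
extern "C" {
#endif

/* Only this header's own declarations live inside the linkage block. */
uint32_t model_clear_value(uint32_t value);

#ifdef __cplusplus
}
#endif

#endif /* MODEL_META_H */

/* Trivial definition so the sketch compiles and links on its own. */
uint32_t
model_clear_value(uint32_t value)
{
   return value;
}
```

The point of the ordering is that an included header may itself contain C++ code when built as C++, and wrapping it in extern "C" from the outside would change how its declarations are linked.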

HTH
Emil


Re: [Mesa-dev] [RFC] - Rewrite mesa website in Sphinx

2018-04-06 Thread Emil Velikov
On 6 April 2018 at 10:40, Daniel Stone  wrote:
> Hi all,
>
> On 5 April 2018 at 23:55, Laura Ekstrand  wrote:
>> So I spoke with Daniel Stone today about the infrastructure.  He estimates
>> it will be ready to deploy the website in 2-3 weeks, at the most.  So I'd
>> say the infrastructure will be there when we are ready.
>>
>> In the new system, our website will be running in its own container managed
>> by freedesktop's new Gitlab server. So what we need to do for the deploy is:
>
> Right. We can't keep stacking random unmanaged bits for every project
> into unconfined spaces, not least as it makes upgrades super painful.
> What we want to do is use GitLab's CI mechanism to automatically run a
> pipeline to generate the static pages, and likely serve them from
> GitLab Pages. Having it all done in containers means that you can be
> in complete control of the environment without having to block on
> admins, and it's also easier for others to reproduce.
>
>> 1.  Fork mesa into a repo on Gitlab.com (https://gitlab.freedesktop.org).
>
> This should be actual gitlab.com for now. I don't have CI enabled on
> gitlab.fd.o yet as it's surprisingly difficult to do without exposing
> root to everyone. I'm aiming for the next couple of weeks to have this
> done. I'll ping this thread when it is up.
>
Thanks for the confirmation and the work Dan.

Some time back I looked at how lxc manages its containers w/o root. I
could help out porting that to docker/others.

Laura, others,

As you can see things are not finalised nor official*, yet. Hence why
I would not rush into rewriting the docs/website. Otherwise the work
would sit in a branch, again :-\

HTH
Emil
* Did anyone mention gitlab to the folks who are not on IRC?


[Mesa-dev] [PATCH 4/4] radv: implement VK_AMD_shader_core_properties

2018-04-06 Thread Samuel Pitoiset
Simple extension that only returns information for AMD hw.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_device.c  | 71 +++
 src/amd/vulkan/radv_extensions.py |  1 +
 2 files changed, 72 insertions(+)

diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index 41f8242754..fba0b5c586 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -888,6 +888,39 @@ void radv_GetPhysicalDeviceProperties(
memcpy(pProperties->pipelineCacheUUID, pdevice->cache_uuid, 
VK_UUID_SIZE);
 }
 
+static uint32_t
+radv_get_max_cu_per_sh(struct radv_physical_device *device)
+{
+   /* This should be queried from the KMD, like the number of SEs. */
+   switch (device->rad_info.family) {
+   case CHIP_TAHITI:
+   return 8;
+   case CHIP_HAINAN:
+   return 5;
+   case CHIP_BONAIRE:
+   return 7;
+   case CHIP_HAWAII:
+   return 11;
+   case CHIP_ICELAND:
+   return 6;
+   case CHIP_CARRIZO:
+   return 8;
+   case CHIP_TONGA:
+   return 8;
+   case CHIP_FIJI:
+   return 16;
+   case CHIP_STONEY:
+   return 3;
+   case CHIP_VEGA10:
+   return 16;
+   case CHIP_RAVEN:
+   return 11;
+   default:
+   fprintf(stderr, "Number of CUs per SH unknown!\n");
+   return 0;
+   }
+}
+
 void radv_GetPhysicalDeviceProperties2(
VkPhysicalDevicephysicalDevice,
VkPhysicalDeviceProperties2KHR *pProperties)
@@ -961,6 +994,44 @@ void radv_GetPhysicalDeviceProperties2(
properties->filterMinmaxSingleComponentFormats = true;
break;
}
+   case 
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SHADER_CORE_PROPERTIES_AMD: {
+   VkPhysicalDeviceShaderCorePropertiesAMD *properties =
+   (VkPhysicalDeviceShaderCorePropertiesAMD *)ext;
+
+   /* Shader engines. */
+   properties->shaderEngineCount =
+   pdevice->rad_info.max_se;
+   properties->shaderArraysPerEngineCount =
+   pdevice->rad_info.max_sh_per_se;
+   properties->computeUnitsPerShaderArray =
+   radv_get_max_cu_per_sh(pdevice);
+   properties->simdPerComputeUnit = 4;
+   properties->wavefrontsPerSimd =
+   pdevice->rad_info.family == CHIP_TONGA ||
+   pdevice->rad_info.family == CHIP_ICELAND ||
+   pdevice->rad_info.family == CHIP_POLARIS10 ||
+   pdevice->rad_info.family == CHIP_POLARIS11 ||
+   pdevice->rad_info.family == CHIP_POLARIS12 ? 8 
: 10;
+   properties->wavefrontSize = 64;
+
+   /* SGPR. */
+   properties->sgprsPerSimd =
+   radv_get_num_physical_sgprs(pdevice);
+   properties->minSgprAllocation =
+   pdevice->rad_info.chip_class >= VI ? 16 : 8;
+   properties->maxSgprAllocation =
+   pdevice->rad_info.family == CHIP_TONGA ||
+   pdevice->rad_info.family == CHIP_ICELAND ? 96 : 
104;
+   properties->sgprAllocationGranularity =
+   pdevice->rad_info.chip_class >= VI ? 16 : 8;
+
+   /* VGPR. */
+   properties->vgprsPerSimd = RADV_NUM_PHYSICAL_VGPRS;
+   properties->minVgprAllocation = 4;
+   properties->maxVgprAllocation = 256;
+   properties->vgprAllocationGranularity = 4;
+   break;
+   }
default:
break;
}
diff --git a/src/amd/vulkan/radv_extensions.py 
b/src/amd/vulkan/radv_extensions.py
index bc63a34896..a25db637e2 100644
--- a/src/amd/vulkan/radv_extensions.py
+++ b/src/amd/vulkan/radv_extensions.py
@@ -96,6 +96,7 @@ EXTENSIONS = [
 Extension('VK_AMD_draw_indirect_count',   1, True),
 Extension('VK_AMD_gcn_shader',1, True),
 Extension('VK_AMD_rasterization_order',   1, 
'device->has_out_of_order_rast'),
+Extension('VK_AMD_shader_core_properties',1, True),
 Extension('VK_AMD_shader_info',   1, True),
 Extension('VK_AMD_shader_trinary_minmax', 1, True),
 ]
-- 
2.16.3


[Mesa-dev] [PATCH 3/4] radv: add RADV_NUM_PHYSICAL_VGPRS constant

2018-04-06 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_shader.c | 6 --
 src/amd/vulkan/radv_shader.h | 2 ++
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c
index 59ad2f3819..eaf24dcdee 100644
--- a/src/amd/vulkan/radv_shader.c
+++ b/src/amd/vulkan/radv_shader.c
@@ -633,7 +633,9 @@ generate_shader_stats(struct radv_device *device,
 
radv_get_num_physical_sgprs(device->physical_device) / conf->num_sgprs);
 
if (conf->num_vgprs)
-   max_simd_waves = MIN2(max_simd_waves, 256 / conf->num_vgprs);
+   max_simd_waves =
+   MIN2(max_simd_waves,
+RADV_NUM_PHYSICAL_VGPRS / conf->num_vgprs);
 
/* LDS is 64KB per CU (4 SIMDs), divided into 16KB blocks per SIMD
 * that PS can use.
@@ -712,7 +714,7 @@ radv_GetShaderInfoAMD(VkDevice _device,
 
VkShaderStatisticsInfoAMD statistics = {};
statistics.shaderStageMask = shaderStage;
-   statistics.numPhysicalVgprs = 256;
+   statistics.numPhysicalVgprs = RADV_NUM_PHYSICAL_VGPRS;
statistics.numPhysicalSgprs = 
radv_get_num_physical_sgprs(device->physical_device);
statistics.numAvailableSgprs = 
statistics.numPhysicalSgprs;
 
diff --git a/src/amd/vulkan/radv_shader.h b/src/amd/vulkan/radv_shader.h
index f5c0645b5f..cbb7394eea 100644
--- a/src/amd/vulkan/radv_shader.h
+++ b/src/amd/vulkan/radv_shader.h
@@ -46,6 +46,8 @@
 // Match MAX_SETS from radv_descriptor_set.h
 #define RADV_UD_MAX_SETS MAX_SETS
 
+#define RADV_NUM_PHYSICAL_VGPRS 256
+
 struct radv_shader_module {
struct nir_shader *nir;
unsigned char sha1[20];
-- 
2.16.3
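
For readers skimming the diff, the occupancy limit that the new constant feeds into can be sketched in isolation. The `model_*` names are hypothetical, and the macro simply mirrors the value of RADV_NUM_PHYSICAL_VGPRS from the patch; the real generate_shader_stats() also applies SGPR and LDS limits that are left out here.

```c
#include <stdint.h>

/* Mirrors the value of RADV_NUM_PHYSICAL_VGPRS introduced by the patch. */
#define MODEL_NUM_PHYSICAL_VGPRS 256

static uint32_t
model_min_u32(uint32_t a, uint32_t b)
{
   return a < b ? a : b;
}

/* Cap the number of waves per SIMD by how many copies of the shader's
 * VGPR allocation fit into the physical register file. A shader that
 * uses no VGPRs leaves the limit unchanged. */
static uint32_t
model_limit_waves_by_vgprs(uint32_t max_simd_waves, uint32_t num_vgprs)
{
   if (num_vgprs)
      max_simd_waves = model_min_u32(max_simd_waves,
                                     MODEL_NUM_PHYSICAL_VGPRS / num_vgprs);
   return max_simd_waves;
}
```

A shader using 64 VGPRs is limited to 4 waves per SIMD (256 / 64), while one using 24 VGPRs still fits 10, so a starting limit of 10 is not reduced.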



[Mesa-dev] [PATCH 1/4] vulkan: Update the XML and headers to 1.1.72

2018-04-06 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset 
---
 include/vulkan/vulkan_android.h |  66 ++
 include/vulkan/vulkan_core.h| 144 +++-
 src/vulkan/registry/vk.xml  | 286 +---
 3 files changed, 445 insertions(+), 51 deletions(-)

diff --git a/include/vulkan/vulkan_android.h b/include/vulkan/vulkan_android.h
index 5e61c0531a..07aaeda28e 100644
--- a/include/vulkan/vulkan_android.h
+++ b/include/vulkan/vulkan_android.h
@@ -53,6 +53,72 @@ VKAPI_ATTR VkResult VKAPI_CALL vkCreateAndroidSurfaceKHR(
 VkSurfaceKHR*   pSurface);
 #endif
 
+#define VK_ANDROID_external_memory_android_hardware_buffer 1
+struct AHardwareBuffer;
+
+#define VK_ANDROID_EXTERNAL_MEMORY_ANDROID_HARDWARE_BUFFER_SPEC_VERSION 3
+#define VK_ANDROID_EXTERNAL_MEMORY_ANDROID_HARDWARE_BUFFER_EXTENSION_NAME "VK_ANDROID_external_memory_android_hardware_buffer"
+
+typedef struct VkAndroidHardwareBufferUsageANDROID {
+    VkStructureType    sType;
+    void*              pNext;
+    uint64_t           androidHardwareBufferUsage;
+} VkAndroidHardwareBufferUsageANDROID;
+
+typedef struct VkAndroidHardwareBufferPropertiesANDROID {
+    VkStructureType    sType;
+    void*              pNext;
+    VkDeviceSize       allocationSize;
+    uint32_t           memoryTypeBits;
+} VkAndroidHardwareBufferPropertiesANDROID;
+
+typedef struct VkAndroidHardwareBufferFormatPropertiesANDROID {
+    VkStructureType                  sType;
+    void*                            pNext;
+    VkFormat                         format;
+    uint64_t                         externalFormat;
+    VkFormatFeatureFlags             formatFeatures;
+    VkComponentMapping               samplerYcbcrConversionComponents;
+    VkSamplerYcbcrModelConversion    suggestedYcbcrModel;
+    VkSamplerYcbcrRange              suggestedYcbcrRange;
+    VkChromaLocation                 suggestedXChromaOffset;
+    VkChromaLocation                 suggestedYChromaOffset;
+} VkAndroidHardwareBufferFormatPropertiesANDROID;
+
+typedef struct VkImportAndroidHardwareBufferInfoANDROID {
+    VkStructureType            sType;
+    const void*                pNext;
+    struct AHardwareBuffer*    buffer;
+} VkImportAndroidHardwareBufferInfoANDROID;
+
+typedef struct VkMemoryGetAndroidHardwareBufferInfoANDROID {
+    VkStructureType    sType;
+    const void*        pNext;
+    VkDeviceMemory     memory;
+} VkMemoryGetAndroidHardwareBufferInfoANDROID;
+
+typedef struct VkExternalFormatANDROID {
+    VkStructureType    sType;
+    void*              pNext;
+    uint64_t           externalFormat;
+} VkExternalFormatANDROID;
+
+
+typedef VkResult (VKAPI_PTR *PFN_vkGetAndroidHardwareBufferPropertiesANDROID)(VkDevice device, const struct AHardwareBuffer* buffer, VkAndroidHardwareBufferPropertiesANDROID* pProperties);
+typedef VkResult (VKAPI_PTR *PFN_vkGetMemoryAndroidHardwareBufferANDROID)(VkDevice device, const VkMemoryGetAndroidHardwareBufferInfoANDROID* pInfo, struct AHardwareBuffer** pBuffer);
+
+#ifndef VK_NO_PROTOTYPES
+VKAPI_ATTR VkResult VKAPI_CALL vkGetAndroidHardwareBufferPropertiesANDROID(
+    VkDevice                                    device,
+    const struct AHardwareBuffer*               buffer,
+    VkAndroidHardwareBufferPropertiesANDROID*   pProperties);
+
+VKAPI_ATTR VkResult VKAPI_CALL vkGetMemoryAndroidHardwareBufferANDROID(
+    VkDevice                                    device,
+    const VkMemoryGetAndroidHardwareBufferInfoANDROID* pInfo,
+    struct AHardwareBuffer**                    pBuffer);
+#endif
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/include/vulkan/vulkan_core.h b/include/vulkan/vulkan_core.h
index 92de120bc8..ed0d596f67 100644
--- a/include/vulkan/vulkan_core.h
+++ b/include/vulkan/vulkan_core.h
@@ -43,11 +43,11 @@ extern "C" {
 #define VK_VERSION_MINOR(version) (((uint32_t)(version) >> 12) & 0x3ff)
 #define VK_VERSION_PATCH(version) ((uint32_t)(version) & 0xfff)
 // Version of this file
-#define VK_HEADER_VERSION 70
+#define VK_HEADER_VERSION 72
 
 
 #define VK_NULL_HANDLE 0
-
+
 
 
 #define VK_DEFINE_HANDLE(object) typedef struct object##_T* object;
@@ -60,7 +60,7 @@ extern "C" {
 #define VK_DEFINE_NON_DISPATCHABLE_HANDLE(object) typedef uint64_t object;
 #endif
 #endif
-
+
 
 
 typedef uint32_t VkFlags;
@@ -147,6 +147,7 @@ typedef enum VkResult {
 VK_ERROR_INCOMPATIBLE_DISPLAY_KHR = -103001,
 VK_ERROR_VALIDATION_FAILED_EXT = -111001,
 VK_ERROR_INVALID_SHADER_NV = -112000,
+VK_ERROR_FRAGMENTATION_EXT = -1000161000,
 VK_ERROR_NOT_PERMITTED_EXT = -1000174001,
 VK_ERROR_OUT_OF_POOL_MEMORY_KHR = VK_ERROR_OUT_OF_POOL_MEMORY,
 VK_ERROR_INVALID_EXTERNAL_HANDLE_KHR = VK_ERROR_INVALID_EXTERNAL_HANDLE,
@@ -356,6 +357,12 @@ typedef enum VkStructureType {
 VK_STRUCTURE_TYPE_DEBUG_UTILS_LABEL_EXT = 1000128002,
 VK_STRUCTURE_TYPE_DEBUG_UTILS_MESSENGER_CALLBACK_DATA_EXT = 1000128003,
 

[Mesa-dev] [PATCH 2/4] radv: add radv_get_num_physical_sgprs() helper

2018-04-06 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_shader.c | 15 ---
 src/amd/vulkan/radv_shader.h |  6 ++
 2 files changed, 10 insertions(+), 11 deletions(-)

diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c
index f46beab8c1..59ad2f3819 100644
--- a/src/amd/vulkan/radv_shader.c
+++ b/src/amd/vulkan/radv_shader.c
@@ -597,15 +597,6 @@ radv_get_shader_name(struct radv_shader_variant *var, gl_shader_stage stage)
};
 }
 
-static uint32_t
-get_total_sgprs(struct radv_device *device)
-{
-   if (device->physical_device->rad_info.chip_class >= VI)
-      return 800;
-   else
-      return 512;
-}
-
 static void
 generate_shader_stats(struct radv_device *device,
  struct radv_shader_variant *variant,
@@ -637,7 +628,9 @@ generate_shader_stats(struct radv_device *device,
}
 
    if (conf->num_sgprs)
-      max_simd_waves = MIN2(max_simd_waves, get_total_sgprs(device) / conf->num_sgprs);
+      max_simd_waves =
+         MIN2(max_simd_waves,
+              radv_get_num_physical_sgprs(device->physical_device) / conf->num_sgprs);
 
    if (conf->num_vgprs)
       max_simd_waves = MIN2(max_simd_waves, 256 / conf->num_vgprs);
@@ -720,7 +713,7 @@ radv_GetShaderInfoAMD(VkDevice _device,
    VkShaderStatisticsInfoAMD statistics = {};
    statistics.shaderStageMask = shaderStage;
    statistics.numPhysicalVgprs = 256;
-   statistics.numPhysicalSgprs = get_total_sgprs(device);
+   statistics.numPhysicalSgprs = radv_get_num_physical_sgprs(device->physical_device);
    statistics.numAvailableSgprs = statistics.numPhysicalSgprs;
 
    if (stage == MESA_SHADER_COMPUTE) {
diff --git a/src/amd/vulkan/radv_shader.h b/src/amd/vulkan/radv_shader.h
index ae30d6125b..f5c0645b5f 100644
--- a/src/amd/vulkan/radv_shader.h
+++ b/src/amd/vulkan/radv_shader.h
@@ -362,4 +362,10 @@ static inline unsigned shader_io_get_unique_index(gl_varying_slot slot)
unreachable("illegal slot in get unique index\n");
 }
 
+static inline uint32_t
+radv_get_num_physical_sgprs(struct radv_physical_device *physical_device)
+{
+   return physical_device->rad_info.chip_class >= VI ? 800 : 512;
+}
+
 #endif
-- 
2.16.3
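
As a reading aid for the patch above, here is a minimal standalone sketch of how the physical SGPR count bounds the number of waves per SIMD. It only assumes what the diff shows: 800 physical SGPRs on VI and later, 512 before, and a MIN2-style clamp that is skipped when the shader uses no SGPRs. The sketch_* names and the enum are hypothetical stand-ins, not RADV types.

```c
#include <stdint.h>

/* Hypothetical stand-in for the driver's chip_class enum; only the
 * ordering relative to VI matters for this sketch. */
enum sketch_chip_class { SKETCH_SI, SKETCH_CIK, SKETCH_VI, SKETCH_GFX9 };

/* Mirrors the helper added by the patch: 800 physical SGPRs on VI and
 * later, 512 on older chips. */
static inline uint32_t
sketch_num_physical_sgprs(enum sketch_chip_class chip_class)
{
	return chip_class >= SKETCH_VI ? 800 : 512;
}

/* Clamp a wave count by SGPR usage, following the MIN2() expression in
 * generate_shader_stats(). A shader that uses no SGPRs leaves the
 * bound unchanged, which also avoids a division by zero. */
static inline uint32_t
sketch_limit_waves_by_sgprs(uint32_t max_simd_waves,
			    enum sketch_chip_class chip_class,
			    uint32_t num_sgprs)
{
	if (num_sgprs) {
		uint32_t by_sgprs =
			sketch_num_physical_sgprs(chip_class) / num_sgprs;
		if (by_sgprs < max_simd_waves)
			max_simd_waves = by_sgprs;
	}
	return max_simd_waves;
}
```

For instance, starting from a bound of 10 waves, a shader using 100 SGPRs is limited to 8 waves on VI (800 / 100) and to 5 on older parts (512 / 100).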



[Mesa-dev] [Bug 105871] Discolored KDE panels after updating to Mesa 18.0 on Intel broadwell

2018-04-06 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=105871

Tapani Pälli changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |DUPLICATE
             Status|NEW                         |RESOLVED

--- Comment #17 from Tapani Pälli ---
Resolving as a duplicate, thanks for testing.

*** This bug has been marked as a duplicate of bug 103699 ***

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


[Mesa-dev] [Bug 103699] Latest mesa breaks firefox on kde plasma with compositing on

2018-04-06 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=103699

Tapani Pälli changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kamalah...@openmailbox.org

--- Comment #32 from Tapani Pälli ---
*** Bug 105871 has been marked as a duplicate of this bug. ***

-- 
You are receiving this mail because:
You are the QA Contact for the bug.


[Mesa-dev] [PATCH] radv: add radv_clear_{cmask,dcc} helpers

2018-04-06 Thread Samuel Pitoiset
They will help for DCC MSAA textures and if we support mipmaps
in the future.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_cmd_buffer.c  |  8 ++--
 src/amd/vulkan/radv_meta.h|  5 +
 src/amd/vulkan/radv_meta_clear.c  | 27 +--
 src/amd/vulkan/radv_meta_fast_clear.c |  4 +---
 4 files changed, 29 insertions(+), 15 deletions(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index 526b618f2a7..d47325cc985 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -3621,9 +3621,7 @@ void radv_initialise_cmask(struct radv_cmd_buffer *cmd_buffer,
state->flush_bits |= RADV_CMD_FLAG_FLUSH_AND_INV_CB |
RADV_CMD_FLAG_FLUSH_AND_INV_CB_META;
 
-   state->flush_bits |= radv_fill_buffer(cmd_buffer, image->bo,
-                                         image->offset + image->cmask.offset,
-                                         image->cmask.size, value);
+   state->flush_bits |= radv_clear_cmask(cmd_buffer, image, value);
 
state->flush_bits |= RADV_CMD_FLAG_FLUSH_AND_INV_CB_META;
 }
@@ -3655,9 +3653,7 @@ void radv_initialize_dcc(struct radv_cmd_buffer *cmd_buffer,
state->flush_bits |= RADV_CMD_FLAG_FLUSH_AND_INV_CB |
 RADV_CMD_FLAG_FLUSH_AND_INV_CB_META;
 
-   state->flush_bits |= radv_fill_buffer(cmd_buffer, image->bo,
-                                         image->offset + image->dcc_offset,
-                                         image->surface.dcc_size, value);
+   state->flush_bits |= radv_clear_dcc(cmd_buffer, image, value);
 
state->flush_bits |= RADV_CMD_FLAG_FLUSH_AND_INV_CB |
 RADV_CMD_FLAG_FLUSH_AND_INV_CB_META;
diff --git a/src/amd/vulkan/radv_meta.h b/src/amd/vulkan/radv_meta.h
index 9f3198e8797..57b76c13262 100644
--- a/src/amd/vulkan/radv_meta.h
+++ b/src/amd/vulkan/radv_meta.h
@@ -195,6 +195,11 @@ void radv_blit_to_prime_linear(struct radv_cmd_buffer *cmd_buffer,
   struct radv_image *image,
   struct radv_image *linear_image);
 
+uint32_t radv_clear_cmask(struct radv_cmd_buffer *cmd_buffer,
+ struct radv_image *image, uint32_t value);
+uint32_t radv_clear_dcc(struct radv_cmd_buffer *cmd_buffer,
+   struct radv_image *image, uint32_t value);
+
 /* common nir builder helpers */
 #include "nir/nir_builder.h"
 
diff --git a/src/amd/vulkan/radv_meta_clear.c b/src/amd/vulkan/radv_meta_clear.c
index 98fb8fa6a7c..678de4275fa 100644
--- a/src/amd/vulkan/radv_meta_clear.c
+++ b/src/amd/vulkan/radv_meta_clear.c
@@ -859,6 +859,24 @@ fail:
return res;
 }
 
+uint32_t
+radv_clear_cmask(struct radv_cmd_buffer *cmd_buffer,
+struct radv_image *image, uint32_t value)
+{
+   return radv_fill_buffer(cmd_buffer, image->bo,
+   image->offset + image->cmask.offset,
+   image->cmask.size, value);
+}
+
+uint32_t
+radv_clear_dcc(struct radv_cmd_buffer *cmd_buffer,
+  struct radv_image *image, uint32_t value)
+{
+   return radv_fill_buffer(cmd_buffer, image->bo,
+   image->offset + image->dcc_offset,
+   image->surface.dcc_size, value);
+}
+
 static void vi_get_fast_clear_parameters(VkFormat format,
 const VkClearColorValue *clear_value,
 uint32_t* reset_value,
@@ -1020,15 +1038,12 @@ emit_fast_color_clear(struct radv_cmd_buffer *cmd_buffer,
                                     &clear_value, &reset_value,
                                     &can_avoid_fast_clear_elim);
 
-   flush_bits = radv_fill_buffer(cmd_buffer, iview->image->bo,
-                                 iview->image->offset + iview->image->dcc_offset,
-                                 iview->image->surface.dcc_size, reset_value);
+   flush_bits = radv_clear_dcc(cmd_buffer, iview->image, reset_value);
+
radv_set_dcc_need_cmask_elim_pred(cmd_buffer, iview->image,
  !can_avoid_fast_clear_elim);
} else {
-   flush_bits = radv_fill_buffer(cmd_buffer, iview->image->bo,
-                                 iview->image->offset + iview->image->cmask.offset,
-                                 iview->image->cmask.size, 0);
+   flush_bits = radv_clear_cmask(cmd_buffer, iview->image, 0);
}
 
if (post_flush) {
diff --git a/src/amd/vulkan/radv_meta_fast_clear.c b/src/amd/vulkan/radv_meta_fast_clear.c
index affecfac742..928062b5abc 100644
--- a/src/amd/vulkan/radv_meta_fast_clear.c
+++ b/src/amd/vulkan/radv_meta_fast_clear.c
@@ -771,9 +771,7 @@ 
