Re: [Mesa-dev] [PATCH 2/2] loader_dri3: Variant 2: Wait for pending swaps to complete before drawable_fini.

2018-05-04 Thread Mario Kleiner
On Sat, May 5, 2018 at 4:08 AM, Mike Lothian  wrote:
> I definitely saw the steam bug with patch 1 but not with plasmashell,
> I started seeing it with patch 2 but it seemed to fix itself
>

I had two hangs of kwin_x11 within the last 6 hours when alt-tabbing
between windows, where it got stuck in
loader_dri3_swapbuffer_barrier() from patch 1/2. I'm not sure how that is
possible, or whether the stack trace was misleading, because I had to VT
switch to a text console to attach the debugger, and the hang might be just
a side effect of that. But if it is true, then patch 1/2 would not be
the fix. Also, 1/2 has a potential performance impact, whereas 2/2 doesn't.
However, 2/2 would also need more work, as I can think of more complex
scenarios where it would filter the wrong events, although not in the
case of plasmashell or steam. We'd probably need to sacrifice a few
sbc bits in the Present event's serial field to transport a unique tag
for each incarnation of the loader_dri3_drawable, like a mini-hash of
the draw->eid. Ugly ugly...
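
To make the tagging idea concrete, here is a minimal sketch, not code from
either patch. It assumes the serial is 32 bits wide, reserves the top 4 bits
for a tag folded from the event id, and keeps the low 28 bits of the swap
counter. The bit split, the folding hash, and every function name are
assumptions made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical split of a 32-bit serial: the high TAG_BITS carry the tag,
 * the remaining low bits carry the truncated swap-buffer count (sbc). */
#define TAG_BITS 4u
#define SBC_BITS (32u - TAG_BITS)
#define TAG_MASK ((1u << TAG_BITS) - 1u)
#define SBC_MASK ((1u << SBC_BITS) - 1u)

/* Fold a 32-bit event id down to TAG_BITS bits (illustrative mini-hash). */
static uint32_t
eid_tag(uint32_t eid)
{
   uint32_t h = eid;
   h ^= h >> 16;
   h ^= h >> 8;
   h ^= h >> 4;
   return h & TAG_MASK;
}

/* Build the serial to send with a swap: tag in the top bits, sbc below. */
static uint32_t
serial_pack(uint32_t eid, uint64_t sbc)
{
   return (eid_tag(eid) << SBC_BITS) | ((uint32_t)sbc & SBC_MASK);
}

/* True if the serial carries the tag of the given drawable incarnation. */
static bool
serial_matches(uint32_t serial, uint32_t eid)
{
   return (serial >> SBC_BITS) == eid_tag(eid);
}

/* Recover the (truncated) swap counter from a serial. */
static uint32_t
serial_sbc(uint32_t serial)
{
   return serial & SBC_MASK;
}
```

With only 4 tag bits, two incarnations can still collide, so a filter built
this way would reduce the number of wrongly accepted events rather than rule
them out.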

-mario
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/4] intel/compiler: Fix lower_conversions for 8-bit types.

2018-05-04 Thread Jason Ekstrand
Series is

Reviewed-by: Jason Ekstrand 



On Fri, May 4, 2018 at 10:45 AM, Matt Turner  wrote:

> On Fri, May 4, 2018 at 4:10 AM, Iago Toral Quiroga 
> wrote:
> > From: Jose Maria Casanova Crespo 
> >
> > For 8-bit types the execution type is word. A byte raw MOV has 16-bit
> > execution type and 8-bit destination and it shouldn't be considered
> > a conversion case. So there is no need to change alignment and enter
> > in lower_conversions for these instructions.
> >
> > Fixes a regression in the piglit test "glsl-fs-shader-stencil-export"
> > that is introduced with this patch from the Vulkan shaderInt16 series:
> > 'i965/compiler: handle conversion to smaller type in the lowering
> > pass for that'. The problem is caused because there is already a case
> > in the driver that injects Byte instructions like this:
> >
> > mov(8)  g127<1>UB   g2<32,8,4>UB
> >
> > And the aforementioned pass was not accounting for the special
> > handling of the execution size of Byte instructions. This patch
> > fixes this.
> >
> > v2: (Jason Ekstrand)
> >- Simplify is_byte_raw_mov, include reference to PRM and not
> >consider B <-> UB conversions as raw movs.
> >
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106393
> > ---
> >  src/intel/compiler/brw_fs_lower_conversions.cpp | 21
> -
> >  1 file changed, 20 insertions(+), 1 deletion(-)
> >
> > diff --git a/src/intel/compiler/brw_fs_lower_conversions.cpp
> b/src/intel/compiler/brw_fs_lower_conversions.cpp
> > index f95b39d3e8..f6c936cf21 100644
> > --- a/src/intel/compiler/brw_fs_lower_conversions.cpp
> > +++ b/src/intel/compiler/brw_fs_lower_conversions.cpp
> > @@ -43,6 +43,24 @@ supports_type_conversion(const fs_inst *inst) {
> > }
> >  }
> >
> > +/* From the SKL PRM Vol 2a, "Move":
> > + *
> > + *"A mov with the same source and destination type, no source
> modifier,
> > + *and no saturation is a raw move. A packed byte destination region
> (B
> > + *or UB type with HorzStride == 1 and ExecSize > 1) can only be
> written
> > + *using raw move."
> > + */
>
> Indent the last three lines one more space to align vertically with
> the first line.
>
> > +static bool
> > +is_byte_raw_mov (const fs_inst *inst)
> > +{
> > +   return type_sz(inst->dst.type) == 1 &&
> > +  inst->opcode == BRW_OPCODE_MOV &&
> > +  inst->src[0].type == inst->dst.type &&
> > +  !inst->saturate &&
> > +  !inst->src[0].negate &&
> > +  !inst->src[0].abs;
>
> And indent these to align vertically as well.
>
> The first three patches are
>
> Reviewed-by: Matt Turner 


[Mesa-dev] [PATCH 4/4] i965/drm: Purge the bucket when its cached buffer is evicted

2018-05-04 Thread James Xiong
From: "Xiong, James" 

When one of the cached buffers is found to have been evicted by the kernel,
the buffers freed earlier than it have most likely been evicted too. Go
through the bucket's cached list and purge them.
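
As a rough illustration of the purge described above (a sketch, not the code
in this patch), the following assumes the cached list is kept oldest-first
and that "freed earlier" means a free_time less than or equal to that of the
evicted buffer. The struct and the function names are hypothetical stand-ins.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical stand-in for a cached buffer; the real struct brw_bo has
 * many more fields. Entries are linked oldest-first. */
struct cached_buf {
   time_t free_time;
   struct cached_buf *next;
};

/* Free every entry at the front of the list whose free_time is not later
 * than cutoff and return the new list head. Because the list is ordered
 * by free_time, the walk can stop at the first newer entry. */
static struct cached_buf *
purge_older_than(struct cached_buf *head, time_t cutoff, unsigned *purged)
{
   assert(purged != NULL);
   *purged = 0;
   while (head != NULL && head->free_time <= cutoff) {
      struct cached_buf *victim = head;
      head = head->next;
      free(victim);
      (*purged)++;
   }
   return head;
}

/* Helper for building a list in examples: prepend an entry. */
static struct cached_buf *
push_front(struct cached_buf *head, time_t free_time)
{
   struct cached_buf *buf = malloc(sizeof(*buf));
   if (buf == NULL)
      abort();
   buf->free_time = free_time;
   buf->next = head;
   return buf;
}
```

Stopping at the first newer entry keeps the cost proportional to the number
of buffers actually purged, which only works because of the ordering
assumption.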

Signed-off-by: Xiong, James 
---
 src/mesa/drivers/dri/i965/brw_bufmgr.c | 17 ++---
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.c 
b/src/mesa/drivers/dri/i965/brw_bufmgr.c
index 5235aa6..9f2e566 100644
--- a/src/mesa/drivers/dri/i965/brw_bufmgr.c
+++ b/src/mesa/drivers/dri/i965/brw_bufmgr.c
@@ -248,17 +248,20 @@ brw_bo_madvise(struct brw_bo *bo, int state)
return madv.retained;
 }
 
-/* drop the oldest entries that have been purged by the kernel */
+/* drop the entries that are older than the given time */
 static void
 brw_bo_cache_purge_bucket(struct brw_bufmgr *bufmgr,
-  struct bo_cache_bucket *bucket)
+  struct bo_cache_bucket *bucket,
+  time_t time)
 {
list_for_each_entry_safe(struct brw_bo, bo, &bucket->head, head) {
-  if (brw_bo_madvise(bo, I915_MADV_DONTNEED))
+  if (bo->free_time >= time) {
+ brw_bo_madvise(bo, I915_MADV_DONTNEED);
+ list_del(&bo->head);
+ bo_free(bo);
+  } else {
  break;
-
-  list_del(&bo->head);
-  bo_free(bo);
+  }
}
 }
 
@@ -319,8 +322,8 @@ retry:
 
   if (bo) {
  if (!brw_bo_madvise(bo, I915_MADV_WILLNEED)) {
+brw_bo_cache_purge_bucket(bufmgr, bucket, bo->free_time);
 bo_free(bo);
-brw_bo_cache_purge_bucket(bufmgr, bucket);
 return NULL;
  }
 
-- 
2.7.4



[Mesa-dev] [PATCH 0/4] improve buffer cache and reuse

2018-05-04 Thread James Xiong
From: "Xiong, James" 

With the current implementation, brw_bufmgr may round a request
size up to the next bucket size, resulting in 25% more memory being
allocated in the worst scenario. For example:

Request size    Actual size
32KB+1Byte      40KB
.
8MB+1Byte       10MB
.
96MB+1Byte      112MB

This series aligns the buffer size up to the page size instead of a
bucket size to improve memory allocation efficiency.
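
The figures in the example can be reproduced with a small sketch. It assumes
4 KiB pages and a bucket layout of 1, 2 and 3 pages followed by four evenly
spaced sizes per power of two (4, 5, 6, 7, 8, 10, 12, 14, 16, 20, ... pages).
The function names are made up for illustration, and this is not how
bucket_for_size() computes its index.

```c
#include <assert.h>
#include <stdint.h>

#define EXAMPLE_PAGE_SIZE 4096u  /* assumed 4 KiB pages */

static uint64_t
size_to_pages(uint64_t size)
{
   return (size + EXAMPLE_PAGE_SIZE - 1) / EXAMPLE_PAGE_SIZE;
}

/* Size a request would get when rounded up to a page boundary. */
static uint64_t
page_aligned_size(uint64_t size)
{
   return size_to_pages(size) * EXAMPLE_PAGE_SIZE;
}

/* Size a request would get when rounded up to the next bucket, under the
 * bucket layout assumed in the text above. */
static uint64_t
bucket_size(uint64_t size)
{
   uint64_t pages = size_to_pages(size);
   assert(size > 0);
   if (pages <= 4)
      return pages * EXAMPLE_PAGE_SIZE;

   /* The largest power of two strictly below pages sets the spacing. */
   uint64_t pow2 = 4;
   while (pow2 * 2 < pages)
      pow2 *= 2;
   uint64_t step = pow2 / 4;

   return (pages + step - 1) / step * step * EXAMPLE_PAGE_SIZE;
}
```

For a 32KB+1Byte request, bucket_size() gives 40KB while page_aligned_size()
gives 36KB, so under these assumptions the overhead drops from 8KB to 4KB.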

Performance and memory usage were measured on a gen9 platform using
Basemark ES3 and GfxBench 4 and 5; each test case was run 6 times.

Basemark ES3
score                                      peak memory size(KB)
before      after       diff               before  after   diff
max   avg   max   avg   max     avg
22    21    23    21    2.83%   1.21%      409928  395573  -14355
20    20    20    20    0.53%   0.41%

GfxBench 4.0
score                                      peak memory size(KB)
before      after       diff               before  after   diff
max   avg   max   avg   max     avg
584   577   586   583   0.45%   1.02%      566489  539699  -26791
728   727   727   726   -0.03%  -0.16%     614730  586794  -27936
1604  1144  1650  1202  2.81%   4.86%      439220  411596  -27624
2711        2718  2152  0.25%   -3.25%     126065  121398  -4667
1218  1213  1212  1154  -0.53%  -5.10%     54153   53868   -285
106   104   106   103   0.85%   -1.66%     12730   12666   -64
1732  1709  1740  1728  0.49%   1.11%      475716  447726  -27990
3051  2969  3066  3047  0.50%   2.55%      154169  148962  -5207
2626  2607  2626  2625  0.00%   0.70%      84119   83150   -969
211   208   208   205   -1.26%  -1.21%     39924   39667   -257


GfxBench 5.0
score                                      peak memory size(KB)
before      after       diff               before   after    diff
max   avg   max   avg   max     avg
260   258   259   256   -0.39%  -0.85%     037      1013520  -97517
298   295   298   297   0.00%   0.45%      1143593  1040844  -102749

Xiong, James (4):
  i965/drm: Reorganize code for the next patch
  i965/drm: Round down buffer size and calculate the bucket index
  i965/drm: Searching for a cached buffer for reuse
  i965/drm: Purge the bucket when its cached buffer is evicted

 src/mesa/drivers/dri/i965/brw_bufmgr.c | 139 ++---
 src/util/list.h|   5 ++
 2 files changed, 79 insertions(+), 65 deletions(-)

-- 
2.7.4



[Mesa-dev] [PATCH 2/4] i965/drm: Round down buffer size and calculate the bucket index

2018-05-04 Thread James Xiong
From: "Xiong, James" 

A buffer is now put in cached bucket #n when its size is between
bucket[n].size and bucket[n+1].size - 1.
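
The mapping above amounts to a round-down lookup. A minimal sketch of it
follows, using a plain binary search over a sorted size table rather than the
clz-based arithmetic in bucket_for_size(). The function name, the open-ended
last bucket, and mapping undersized requests to bucket 0 are all assumptions
of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return the index n such that sizes[n] <= size < sizes[n + 1], treating
 * the last bucket as open-ended. Requests below sizes[0] map to bucket 0.
 * sizes[] must be sorted in ascending order. */
static size_t
bucket_index_round_down(const uint64_t *sizes, size_t count, uint64_t size)
{
   assert(count > 0);
   size_t lo = 0, hi = count;   /* the answer lies in [lo, hi) */
   while (hi - lo > 1) {
      size_t mid = lo + (hi - lo) / 2;
      if (sizes[mid] <= size)
         lo = mid;
      else
         hi = mid;
   }
   return lo;
}
```

One consequence of rounding down is that a bucket now holds buffers of
several sizes, so a lookup has to check each candidate's size before reusing
it.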

Signed-off-by: Xiong, James 
---
 src/mesa/drivers/dri/i965/brw_bufmgr.c | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.c 
b/src/mesa/drivers/dri/i965/brw_bufmgr.c
index e68da26..6a9b005 100644
--- a/src/mesa/drivers/dri/i965/brw_bufmgr.c
+++ b/src/mesa/drivers/dri/i965/brw_bufmgr.c
@@ -189,8 +189,8 @@ bo_tile_pitch(struct brw_bufmgr *bufmgr, uint32_t pitch, 
uint32_t tiling)
 static struct bo_cache_bucket *
 bucket_for_size(struct brw_bufmgr *bufmgr, uint64_t size)
 {
-   /* Calculating the pages and rounding up to the page size. */
-   const unsigned pages = (size + PAGE_SIZE - 1) / PAGE_SIZE;
+   /* Calculating the pages and rounding down to the  page size. */
+   const unsigned pages = (size < PAGE_SIZE) ? 1 : size / PAGE_SIZE;
 
/* Row  Bucket sizesclz((x-1) | 3)   RowColumn
 *in pages  stride   size
@@ -211,8 +211,7 @@ bucket_for_size(struct brw_bufmgr *bufmgr, uint64_t size)
int col_size_log2 = row - 1;
col_size_log2 += (col_size_log2 < 0);
 
-   const unsigned col = (pages - prev_row_max_pages +
-((1 << col_size_log2) - 1)) >> col_size_log2;
+   const unsigned col = (pages - prev_row_max_pages) >> col_size_log2;
 
/* Calculating the index based on the row and column. */
const unsigned index = (row * 4) + (col - 1);
@@ -1285,9 +1284,9 @@ add_bucket(struct brw_bufmgr *bufmgr, int size)
bufmgr->cache_bucket[i].size = size;
bufmgr->num_buckets++;
 
+   assert(bucket_for_size(bufmgr, size - 1) == &bufmgr->cache_bucket[i==0?0:i-1]);
assert(bucket_for_size(bufmgr, size) == &bufmgr->cache_bucket[i]);
-   assert(bucket_for_size(bufmgr, size - 2048) == &bufmgr->cache_bucket[i]);
-   assert(bucket_for_size(bufmgr, size + 1) != &bufmgr->cache_bucket[i]);
+   assert(bucket_for_size(bufmgr, size + 1) == &bufmgr->cache_bucket[i]);
 }
 
 static void
-- 
2.7.4



[Mesa-dev] [PATCH 1/4] i965/drm: Reorganize code for the next patch

2018-05-04 Thread James Xiong
From: "Xiong, James" 

Split bo_alloc_internal and add a new function, cached_bo_for_size, that
searches for a suitable cached buffer of a given size.

Signed-off-by: Xiong, James 
---
 src/mesa/drivers/dri/i965/brw_bufmgr.c | 92 +-
 1 file changed, 45 insertions(+), 47 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.c 
b/src/mesa/drivers/dri/i965/brw_bufmgr.c
index 7cb1f03..e68da26 100644
--- a/src/mesa/drivers/dri/i965/brw_bufmgr.c
+++ b/src/mesa/drivers/dri/i965/brw_bufmgr.c
@@ -263,53 +263,29 @@ brw_bo_cache_purge_bucket(struct brw_bufmgr *bufmgr,
}
 }
 
+/* search for a suitable cached bo for reuse */
 static struct brw_bo *
-bo_alloc_internal(struct brw_bufmgr *bufmgr,
-  const char *name,
-  uint64_t size,
-  unsigned flags,
-  uint32_t tiling_mode,
-  uint32_t stride)
+cached_bo_for_size(struct brw_bufmgr *bufmgr,
+   uint64_t size,
+   uint32_t tiling_mode,
+   uint32_t stride,
+   unsigned flags)
 {
-   struct brw_bo *bo;
-   unsigned int page_size = getpagesize();
-   int ret;
-   struct bo_cache_bucket *bucket;
-   bool alloc_from_cache;
-   uint64_t bo_size;
-   bool busy = false;
-   bool zeroed = false;
-
-   if (flags & BO_ALLOC_BUSY)
-  busy = true;
-
-   if (flags & BO_ALLOC_ZEROED)
-  zeroed = true;
+   bool busy = (flags & BO_ALLOC_BUSY) ? true : false;
+   bool zeroed = (flags & BO_ALLOC_ZEROED) ? true : false;
+   struct bo_cache_bucket *bucket =
+  (bufmgr->bo_reuse) ? bucket_for_size(bufmgr, size) : NULL;
 
/* BUSY does doesn't really jive with ZEROED as we have to wait for it to
 * be idle before we can memset.  Just disallow that combination.
 */
assert(!(busy && zeroed));
 
-   /* Round the allocated size up to a power of two number of pages. */
-   bucket = bucket_for_size(bufmgr, size);
-
-   /* If we don't have caching at this size, don't actually round the
-* allocation up.
-*/
-   if (bucket == NULL) {
-  bo_size = size;
-  if (bo_size < page_size)
- bo_size = page_size;
-   } else {
-  bo_size = bucket->size;
-   }
-
-   mtx_lock(&bufmgr->lock);
-   /* Get a buffer out of the cache if available */
+   if(bucket != NULL && !list_empty(&bucket->head)) {
+  struct brw_bo *bo;
 retry:
-   alloc_from_cache = false;
-   if (bucket != NULL && !list_empty(&bucket->head)) {
+  bo = NULL;
+
   if (busy && !zeroed) {
  /* Allocate new render-target BOs from the tail (MRU)
   * of the list, as it will likely be hot in the GPU
@@ -319,7 +295,6 @@ retry:
   */
  bo = LIST_ENTRY(struct brw_bo, bucket->head.prev, head);
  list_del(&bo->head);
- alloc_from_cache = true;
   } else {
  /* For non-render-target BOs (where we're probably
   * going to map it first thing in order to fill it
@@ -330,16 +305,15 @@ retry:
   */
  bo = LIST_ENTRY(struct brw_bo, bucket->head.next, head);
  if (!brw_bo_busy(bo)) {
-alloc_from_cache = true;
 list_del(&bo->head);
  }
   }
 
-  if (alloc_from_cache) {
+  if (bo) {
  if (!brw_bo_madvise(bo, I915_MADV_WILLNEED)) {
 bo_free(bo);
 brw_bo_cache_purge_bucket(bufmgr, bucket);
-goto retry;
+return NULL;
  }
 
  if (bo_set_tiling_internal(bo, tiling_mode, stride)) {
@@ -353,20 +327,44 @@ retry:
bo_free(bo);
goto retry;
 }
-memset(map, 0, bo_size);
+memset(map, 0, bo->size);
  }
   }
+
+  return bo;
}
 
-   if (!alloc_from_cache) {
+   return NULL;
+}
+
+static struct brw_bo *
+bo_alloc_internal(struct brw_bufmgr *bufmgr,
+  const char *name,
+  uint64_t size,
+  unsigned flags,
+  uint32_t tiling_mode,
+  uint32_t stride)
+{
+   struct brw_bo *bo;
+   int ret;
+
+   /* align the request size to page size */
+   size = ALIGN(size, getpagesize());
+
+   mtx_lock(&bufmgr->lock);
+
+   /* Get a buffer out of the cache if available */
+   bo = cached_bo_for_size(bufmgr, size, tiling_mode, stride, flags);
+
+   if (bo == NULL) {
   bo = calloc(1, sizeof(*bo));
   if (!bo)
  goto err;
 
-  bo->size = bo_size;
+  bo->size = size;
   bo->idle = true;
 
-  struct drm_i915_gem_create create = { .size = bo_size };
+  struct drm_i915_gem_create create = { .size = size };
 
   /* All new BOs we get from the kernel are zeroed, so we don't need to
* worry about that here.
@@ -413,7 +411,7 @@ retry:
mtx_unlock(&bufmgr->lock);
 
DBG("bo_create: buf %d (%s) %llub\n", bo->gem_handle, bo->name,
-   (unsigned long long) size);
+   (unsigned long long) bo->size);
 
return bo;
 
-- 
2.7.4


[Mesa-dev] [PATCH 3/4] i965/drm: Searching for a cached buffer for reuse

2018-05-04 Thread James Xiong
From: "Xiong, James" 

Now that a bucket contains cached buffers of different sizes, go
through its list and search for a cached buffer that is large enough.
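
The two search directions can be sketched on a self-contained list as
follows. This is an illustration under assumptions, not the patch itself:
struct buf, its busy flag, and the helper names are hypothetical, and the
real code uses Mesa's list macros and brw_bo_busy() instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cached buffer on a circular doubly linked list whose
 * sentinel node is never returned by the search functions. */
struct buf {
   uint64_t size;
   bool busy;
   struct buf *prev, *next;
};

static void
buf_list_init(struct buf *sentinel)
{
   assert(sentinel != NULL);
   sentinel->size = 0;
   sentinel->busy = false;
   sentinel->prev = sentinel->next = sentinel;
}

/* Append at the tail, i.e. the most recently used end. */
static void
buf_list_add_tail(struct buf *sentinel, struct buf *b)
{
   b->prev = sentinel->prev;
   b->next = sentinel;
   sentinel->prev->next = b;
   sentinel->prev = b;
}

static void
buf_unlink(struct buf *b)
{
   b->prev->next = b->next;
   b->next->prev = b->prev;
   b->prev = b->next = b;
}

/* Walk from the tail (MRU) and take the first buffer that is big enough. */
static struct buf *
take_mru_fit(struct buf *sentinel, uint64_t size)
{
   for (struct buf *b = sentinel->prev; b != sentinel; b = b->prev) {
      if (b->size >= size) {
         buf_unlink(b);
         return b;
      }
   }
   return NULL;
}

/* Walk from the head (LRU) and take the first idle buffer that fits. */
static struct buf *
take_lru_idle_fit(struct buf *sentinel, uint64_t size)
{
   for (struct buf *b = sentinel->next; b != sentinel; b = b->next) {
      if (b->size >= size && !b->busy) {
         buf_unlink(b);
         return b;
      }
   }
   return NULL;
}
```

Both walks are first-fit, so the buffer returned may be larger than the
request by up to the spread of sizes within one bucket.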

Signed-off-by: Xiong, James 
---
 src/mesa/drivers/dri/i965/brw_bufmgr.c | 21 +++--
 src/util/list.h|  5 +
 2 files changed, 20 insertions(+), 6 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.c 
b/src/mesa/drivers/dri/i965/brw_bufmgr.c
index 6a9b005..5235aa6 100644
--- a/src/mesa/drivers/dri/i965/brw_bufmgr.c
+++ b/src/mesa/drivers/dri/i965/brw_bufmgr.c
@@ -281,7 +281,7 @@ cached_bo_for_size(struct brw_bufmgr *bufmgr,
assert(!(busy && zeroed));
 
if(bucket != NULL && !list_empty(&bucket->head)) {
-  struct brw_bo *bo;
+  struct brw_bo *bo, *temp_bo;
 retry:
   bo = NULL;
 
@@ -292,8 +292,13 @@ retry:
   * asked us to zero the buffer, we don't want this
   * because we are going to mmap it.
   */
- bo = LIST_ENTRY(struct brw_bo, bucket->head.prev, head);
- list_del(&bo->head);
+ LIST_FOR_EACH_ENTRY_REV(temp_bo, &bucket->head, head) {
+if (temp_bo->size >= size) {
+   bo = temp_bo;
+   list_del(&bo->head);
+   break;
+}
+ }
   } else {
  /* For non-render-target BOs (where we're probably
   * going to map it first thing in order to fill it
@@ -302,9 +307,13 @@ retry:
   * allocating a new buffer is probably faster than
   * waiting for the GPU to finish.
   */
- bo = LIST_ENTRY(struct brw_bo, bucket->head.next, head);
- if (!brw_bo_busy(bo)) {
-list_del(&bo->head);
+ LIST_FOR_EACH_ENTRY(temp_bo, &bucket->head, head) {
+if (temp_bo->size >= size &&
+!brw_bo_busy(temp_bo)) {
+   bo = temp_bo;
+   list_del(&bo->head);
+   break;
+}
  }
   }
 
diff --git a/src/util/list.h b/src/util/list.h
index 6edb750..9362072 100644
--- a/src/util/list.h
+++ b/src/util/list.h
@@ -189,6 +189,11 @@ static inline void list_validate(struct list_head *list)
&pos->member != (head); \
pos = container_of(pos->member.next, pos, member))
 
+#define LIST_FOR_EACH_ENTRY_REV(pos, head, member)  \
+   for (pos = NULL, pos = container_of((head)->prev, pos, member);  \
+&pos->member != (head); \
+pos = container_of(pos->member.prev, pos, member))
+
 #define LIST_FOR_EACH_ENTRY_SAFE(pos, storage, head, member)   \
for (pos = NULL, pos = container_of((head)->next, pos, member), \
storage = container_of(pos->member.next, pos, member);  \
-- 
2.7.4



Re: [Mesa-dev] [PATCH 2/2] i965: add {X, A}BGR2101010 to |intel_image_formats|

2018-05-04 Thread Chad Versace
On Wed 02 May 2018, Miguel Casas wrote:
> This patch adds {X,A}BGR2101010 entries to the list of supported
> |intel_image_formats|.
> 
> BUG=https://crbug.com/776093
> TEST=Compile and deploy mesa+this patch, then playback
> a VP9 Profile 2 video with sw decoder using crrev.com/c/897894.
> ---
>  src/mesa/drivers/dri/i965/intel_screen.c | 6 ++
>  1 file changed, 6 insertions(+)

Everything I said about the previous patch applies here too :)

Also, two new comments:

  * Don't add |bars| around token names. If you annotate the token
    name, then use 'single_quotes'. Sometimes people use `backticks` too.
    It's also fine to omit the annotation.

  * In brace expansion, spaces are preserved. Therefore, it should be
{X,A} and not {X, A}.

By the way, most people would squash these two patches together and use
the prefix "i965:" or "i965,dri:". Me, I prefer to keep them separate.
It's your call.

When you resend the patch (or patches), CC Kenneth Graunke too.

> 
> diff --git a/src/mesa/drivers/dri/i965/intel_screen.c 
> b/src/mesa/drivers/dri/i965/intel_screen.c
> index 409f763b64..d3488b9f29 100644
> --- a/src/mesa/drivers/dri/i965/intel_screen.c
> +++ b/src/mesa/drivers/dri/i965/intel_screen.c
> @@ -190,6 +190,12 @@ static const struct intel_image_format 
> intel_image_formats[] = {
> { __DRI_IMAGE_FOURCC_XRGB2101010, __DRI_IMAGE_COMPONENTS_RGB, 1,
>   { { 0, 0, 0, __DRI_IMAGE_FORMAT_XRGB2101010, 4 } } },
>  
> +   { __DRI_IMAGE_FOURCC_ABGR2101010, __DRI_IMAGE_COMPONENTS_RGBA, 1,
> + { { 0, 0, 0, __DRI_IMAGE_FORMAT_ABGR2101010, 4 } } },
> +
> +   { __DRI_IMAGE_FOURCC_XBGR2101010, __DRI_IMAGE_COMPONENTS_RGB, 1,
> + { { 0, 0, 0, __DRI_IMAGE_FORMAT_XBGR2101010, 4 } } },
> +
> { __DRI_IMAGE_FOURCC_ARGB, __DRI_IMAGE_COMPONENTS_RGBA, 1,
>   { { 0, 0, 0, __DRI_IMAGE_FORMAT_ARGB, 4 } } },
>  
> -- 
> 2.17.0.441.gb46fe60e1d-goog
> 


Re: [Mesa-dev] [PATCH 1/2] mesa: add R10G10B10{A, X}2 to MESA <-> DRI format translations

2018-05-04 Thread Chad Versace
Thanks for the patches. The code looks good. All my suggestions are
merely nitpicks to make the patches follow Mesa conventions.

In general, if you have questions about commit message style, examine
the git log for previous patches that touched the same files and
directories as yours. Sometimes, different directories in Mesa can have
very different code style as well as commit style.

First, when a patch touches src/mesa/drivers/dri/common and/or
include/GL/internal/dri_interface.h, and touches nothing else, the patch
subject should probably have the prefix "dri:". For patches that touch
only dri_util.c, like yours, there is also a precedent for using the
"dri_util:" prefix.

On Wed 02 May 2018, Miguel Casas wrote:
> This patch adds R10G10B10{A,X}2 MESA <-> DRI translation entries
> in the appropriate places for dri2 functions to accept them.

"DRI translation entries in the appropriate places for dri2 functions to
accept them" is quite vague but a lot of text. Dense, precise git logs
are the best. Please omit the phrase, or replace it with a precise one.

At the risk of over-laboring the point, aim for something short, sweet,
and precise, like any of the following:

* Add R10G10B10{A,X}2 translation between mesa_format and DRI format.

* Add R10G10B10{A,X}2 translation between mesa_format and DRI format
  to driGLFormatToImageFormat() and driImageFormatToGLFormat().

* Teach driGLFormatToImageFormat() and driImageFormatToGLFormat() to
  translate __DRI_IMAGE_FORMAT_ABGR2101010 and
  __DRI_IMAGE_FORMAT_XBGR2101010.

* Teach dri_util.c to translate R10G10B10{A,X}2 between mesa_format
  and DRI format.

> BUG=https://crbug.com/776093
> TEST=Compile and deploy mesa+this patch, then playback
> a VP9 Profile 2 video with sw decoder using crrev.com/c/897894.

The Chromium-specific taglines BUG= and TEST= appear nowhere in the Mesa
git log.

The BUG line should be converted to any of the following trailer lines:

Bug: https://crbug.com/776093
(This is my favorite).
Fixes: https://crbug.com/776093
(But only use Fixes if it fully fixes the bug).
Reference: https://crbug.com/776093
References: https://crbug.com/776093
(Some people use singular Reference, others plural. Shrug).

The TEST line doesn't have a clear translation. Some people would simply
add a paragraph to the commit message like this:

Tested by playing a VP9 Profile 2 video with sw
decoder using foo.

Other people try to put it in a trailer line, like below. If you use
a trailer, then *please* indent wrapped lines with at least two spaces,
just like RFC 822. Read man:git-interpret-trailers(1) if you want to learn
more about trailers.

Test: Play a VP9 Profile 2 video with sw
  decoder using foo.

Regardless, in the test description:

* Don't say you built and deployed the patch, *then* ran a test. If
  you ran the test, then we trust you ran it with the patch applied :-)
  Dense git log == good.

* For a test like this, it's critical to mention what GPU you
  used. If you used Eve, then saying "on Kabylake" would be
  sufficient for this patch.

* How is the VP9 video related to DRI images? Did you import each
  frame as a dma_buf into a single EGLImage? Into multiple
  EGLImages, one per plane? I don't understand how VP9 is related to
  this patch without more description.

* Whose software decoder? I don't believe the source of the VP9
  video is important to this patch. You could probably
  s/video with sw decoder/video/ without losing significant
  information. But if you think it's important to mention that the
  video was sw-decoded, then please tell us what decoder you used.

Woo... that was a lot... Thanks for your first Mesa patch!

> ---
>  src/mesa/drivers/dri/common/dri_util.c | 8 
>  1 file changed, 8 insertions(+)
> 
> diff --git a/src/mesa/drivers/dri/common/dri_util.c 
> b/src/mesa/drivers/dri/common/dri_util.c
> index 7cb6248b13..78c6bbf234 100644
> --- a/src/mesa/drivers/dri/common/dri_util.c
> +++ b/src/mesa/drivers/dri/common/dri_util.c
> @@ -886,6 +886,10 @@ driGLFormatToImageFormat(mesa_format format)
>return __DRI_IMAGE_FORMAT_ARGB2101010;
> case MESA_FORMAT_B10G10R10X2_UNORM:
>return __DRI_IMAGE_FORMAT_XRGB2101010;
> +   case MESA_FORMAT_R10G10B10A2_UNORM:
> +  return __DRI_IMAGE_FORMAT_ABGR2101010;
> +   case MESA_FORMAT_R10G10B10X2_UNORM:
> +  return __DRI_IMAGE_FORMAT_XBGR2101010;
> case MESA_FORMAT_B8G8R8A8_UNORM:
>return __DRI_IMAGE_FORMAT_ARGB;
> case MESA_FORMAT_R8G8B8A8_UNORM:
> @@ -923,6 +927,10 @@ driImageFormatToGLFormat(uint32_t image_format)
>return MESA_FORMAT_B10G10R10A2_UNORM;
> case __DRI_IMAGE_FORMAT_XRGB2101010:
>return MESA_FORMAT_B10G10R10X2_UNORM;
> +   case __DRI_IMAGE_FORMAT_ABGR2101010:
> +  return MESA_FORMAT_R10G10B10A2_UNORM;
> +   case __DRI_IMAGE_FORMAT_XBGR2101010:
> +  return 

Re: [Mesa-dev] [PATCH 1½/9] util: Add a randomized test for the virtual memory allocator

2018-05-04 Thread Scott D Phillips
Scott D Phillips  writes:

> The test pseudo-randomly makes allocations and deallocations with
> the virtual memory allocator and checks that the results are
> consistent. Specifically, we test that:
>
>  * no result from the allocator overlaps an already allocated range
>  * allocated memory fulfills the stated alignment requirement
>  * a failed result from the allocator could not have been fulfilled
>  * memory freed to the allocator can later be allocated again
> ---
>  configure.ac   |   1 +
>  src/util/Makefile.am   |   3 +-
>  src/util/meson.build   |   1 +
>  src/util/tests/vma/Makefile.am |  37 +
>  src/util/tests/vma/meson.build |  29 
>  src/util/tests/vma/vma_random_test.cpp | 239 
> +
>  6 files changed, 309 insertions(+), 1 deletion(-)
>  create mode 100644 src/util/tests/vma/Makefile.am
>  create mode 100644 src/util/tests/vma/meson.build
>  create mode 100644 src/util/tests/vma/vma_random_test.cpp
>
> diff --git a/configure.ac b/configure.ac
> index f1fbdcc6c7c..3dbbccc5ec9 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -3109,6 +3109,7 @@ AC_CONFIG_FILES([Makefile
>   src/util/Makefile
>   src/util/tests/hash_table/Makefile
>   src/util/tests/string_buffer/Makefile
> + src/util/tests/vma/Makefile
>   src/util/xmlpool/Makefile
>   src/vulkan/Makefile])
>  
> diff --git a/src/util/Makefile.am b/src/util/Makefile.am
> index 07bf052175b..b51dccdadfd 100644
> --- a/src/util/Makefile.am
> +++ b/src/util/Makefile.am
> @@ -22,7 +22,8 @@
>  SUBDIRS = . \
>   xmlpool \
>   tests/hash_table \
> - tests/string_buffer
> + tests/string_buffer \
> + tests/vma
>  
>  include Makefile.sources
>  
> diff --git a/src/util/meson.build b/src/util/meson.build
> index 14660e0fa0c..c777984e28d 100644
> --- a/src/util/meson.build
> +++ b/src/util/meson.build
> @@ -159,4 +159,5 @@ if with_tests
>  
>subdir('tests/hash_table')
>subdir('tests/string_buffer')
> +  subdir('tests/vma')
>  endif
> diff --git a/src/util/tests/vma/Makefile.am b/src/util/tests/vma/Makefile.am
> new file mode 100644
> index 000..1c4dd302bfa
> --- /dev/null
> +++ b/src/util/tests/vma/Makefile.am
> @@ -0,0 +1,37 @@
> +# Copyright © 2018 Intel Corporation
> +#
> +#  Permission is hereby granted, free of charge, to any person obtaining a
> +#  copy of this software and associated documentation files (the "Software"),
> +#  to deal in the Software without restriction, including without limitation
> +#  the rights to use, copy, modify, merge, publish, distribute, sublicense,
> +#  and/or sell copies of the Software, and to permit persons to whom the
> +#  Software is furnished to do so, subject to the following conditions:
> +#
> +#  The above copyright notice and this permission notice (including the next
> +#  paragraph) shall be included in all copies or substantial portions of the
> +#  Software.
> +#
> +#  THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> +#  IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> +#  FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> +#  THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> +#  LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> +#  FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 
> DEALINGS
> +#  IN THE SOFTWARE.
> +
> +AM_CPPFLAGS = \
> + -I$(top_srcdir)/include \
> + -I$(top_srcdir)/src/util \
> + $(DEFINES)
> +
> +TESTS = vma_random_test
> +
> +check_PROGRAMS = $(TESTS)
> +
> +vma_random_test_SOURCES = \
> + vma_random_test.cpp
> +
> +vma_random_test_LDADD = \
> + $(top_builddir)/src/util/libmesautil.la
> +
> +EXTRA_DIST = meson.build
> diff --git a/src/util/tests/vma/meson.build b/src/util/tests/vma/meson.build
> new file mode 100644
> index 000..53562db312b
> --- /dev/null
> +++ b/src/util/tests/vma/meson.build
> @@ -0,0 +1,29 @@
> +# Copyright © 2018 Intel Corporation
> +
> +# Permission is hereby granted, free of charge, to any person obtaining a 
> copy
> +# of this software and associated documentation files (the "Software"), to 
> deal
> +# in the Software without restriction, including without limitation the 
> rights
> +# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
> +# copies of the Software, and to permit persons to whom the Software is
> +# furnished to do so, subject to the following conditions:
> +
> +# The above copyright notice and this permission notice shall be included in
> +# all copies or substantial portions of the Software.
> +
> +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> +# FITNESS FOR A 

[Mesa-dev] [PATCH 1½/9] util: Add a randomized test for the virtual memory allocator

2018-05-04 Thread Scott D Phillips
The test pseudo-randomly makes allocations and deallocations with
the virtual memory allocator and checks that the results are
consistent. Specifically, we test that:

 * no result from the allocator overlaps an already allocated range
 * allocated memory fulfills the stated alignment requirement
 * a failed result from the allocator could not have been fulfilled
 * memory freed to the allocator can later be allocated again
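
The first two checks in the list can be sketched in a few lines. This is a
hedged illustration in C rather than the C++ of vma_random_test.cpp; struct
range and all function names are hypothetical. The sketch assumes non-zero
sizes and ranges that do not wrap around the end of the address space.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one live allocation tracked by the test. */
struct range {
   uint64_t start;
   uint64_t size;
};

/* True if the half-open ranges [a.start, a.start + a.size) and
 * [b.start, b.start + b.size) share at least one address. */
static bool
ranges_overlap(struct range a, struct range b)
{
   return a.start < b.start + b.size && b.start < a.start + a.size;
}

/* True if addr is a multiple of align; align must be a power of two. */
static bool
is_aligned(uint64_t addr, uint64_t align)
{
   assert(align != 0 && (align & (align - 1)) == 0);
   return (addr & (align - 1)) == 0;
}

/* Check a fresh allocation against everything that is already live. */
static bool
allocation_is_consistent(const struct range *live, size_t count,
                         struct range fresh, uint64_t align)
{
   if (!is_aligned(fresh.start, align))
      return false;
   for (size_t i = 0; i < count; i++) {
      if (ranges_overlap(live[i], fresh))
         return false;
   }
   return true;
}
```

A linear scan over the live ranges is quadratic over a whole run, which is
usually acceptable for a test of this size and keeps the checker simple
enough to trust.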
---
 configure.ac   |   1 +
 src/util/Makefile.am   |   3 +-
 src/util/meson.build   |   1 +
 src/util/tests/vma/Makefile.am |  37 +
 src/util/tests/vma/meson.build |  29 
 src/util/tests/vma/vma_random_test.cpp | 239 +
 6 files changed, 309 insertions(+), 1 deletion(-)
 create mode 100644 src/util/tests/vma/Makefile.am
 create mode 100644 src/util/tests/vma/meson.build
 create mode 100644 src/util/tests/vma/vma_random_test.cpp

diff --git a/configure.ac b/configure.ac
index f1fbdcc6c7c..3dbbccc5ec9 100644
--- a/configure.ac
+++ b/configure.ac
@@ -3109,6 +3109,7 @@ AC_CONFIG_FILES([Makefile
  src/util/Makefile
  src/util/tests/hash_table/Makefile
  src/util/tests/string_buffer/Makefile
+ src/util/tests/vma/Makefile
  src/util/xmlpool/Makefile
  src/vulkan/Makefile])
 
diff --git a/src/util/Makefile.am b/src/util/Makefile.am
index 07bf052175b..b51dccdadfd 100644
--- a/src/util/Makefile.am
+++ b/src/util/Makefile.am
@@ -22,7 +22,8 @@
 SUBDIRS = . \
xmlpool \
tests/hash_table \
-   tests/string_buffer
+   tests/string_buffer \
+   tests/vma
 
 include Makefile.sources
 
diff --git a/src/util/meson.build b/src/util/meson.build
index 14660e0fa0c..c777984e28d 100644
--- a/src/util/meson.build
+++ b/src/util/meson.build
@@ -159,4 +159,5 @@ if with_tests
 
   subdir('tests/hash_table')
   subdir('tests/string_buffer')
+  subdir('tests/vma')
 endif
diff --git a/src/util/tests/vma/Makefile.am b/src/util/tests/vma/Makefile.am
new file mode 100644
index 000..1c4dd302bfa
--- /dev/null
+++ b/src/util/tests/vma/Makefile.am
@@ -0,0 +1,37 @@
+# Copyright © 2018 Intel Corporation
+#
+#  Permission is hereby granted, free of charge, to any person obtaining a
+#  copy of this software and associated documentation files (the "Software"),
+#  to deal in the Software without restriction, including without limitation
+#  the rights to use, copy, modify, merge, publish, distribute, sublicense,
+#  and/or sell copies of the Software, and to permit persons to whom the
+#  Software is furnished to do so, subject to the following conditions:
+#
+#  The above copyright notice and this permission notice (including the next
+#  paragraph) shall be included in all copies or substantial portions of the
+#  Software.
+#
+#  THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+#  IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+#  FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+#  THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+#  LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+#  FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+#  IN THE SOFTWARE.
+
+AM_CPPFLAGS = \
+   -I$(top_srcdir)/include \
+   -I$(top_srcdir)/src/util \
+   $(DEFINES)
+
+TESTS = vma_random_test
+
+check_PROGRAMS = $(TESTS)
+
+vma_random_test_SOURCES = \
+   vma_random_test.cpp
+
+vma_random_test_LDADD = \
+   $(top_builddir)/src/util/libmesautil.la
+
+EXTRA_DIST = meson.build
diff --git a/src/util/tests/vma/meson.build b/src/util/tests/vma/meson.build
new file mode 100644
index 000..53562db312b
--- /dev/null
+++ b/src/util/tests/vma/meson.build
@@ -0,0 +1,29 @@
+# Copyright © 2018 Intel Corporation
+
+# Permission is hereby granted, free of charge, to any person obtaining a copy
+# of this software and associated documentation files (the "Software"), to deal
+# in the Software without restriction, including without limitation the rights
+# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+# copies of the Software, and to permit persons to whom the Software is
+# furnished to do so, subject to the following conditions:
+
+# The above copyright notice and this permission notice shall be included in
+# all copies or substantial portions of the Software.
+
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 

[Mesa-dev] [Bug 106400] [CI][SNB] glsl-1.40 and 1.50 texturesize / geometry tests failing

2018-05-04 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106400

Timothy Arceri  changed:

           What    |Removed                        |Added
----------------------------------------------------------------------------
          Component|glsl-compiler                  |Drivers/DRI/i965
           Assignee|mesa-dev@lists.freedesktop.org |intel-3d-bugs@lists.freedesktop.org

-- 
You are receiving this mail because:
You are the assignee for the bug.


[Mesa-dev] [PATCH] mesa: fix error handling in get_framebuffer_parameteriv

2018-05-04 Thread Rhys Perry
CC: 
Signed-off-by: Rhys Perry 
---
 src/mesa/main/fbobject.c | 72 +++-
 1 file changed, 41 insertions(+), 31 deletions(-)

diff --git a/src/mesa/main/fbobject.c b/src/mesa/main/fbobject.c
index c72204e11a..0fc3319fe9 100644
--- a/src/mesa/main/fbobject.c
+++ b/src/mesa/main/fbobject.c
@@ -1488,45 +1488,66 @@ _mesa_FramebufferParameteri(GLenum target, GLenum pname, GLint param)
 }
 
 static bool
-_pname_valid_for_default_framebuffer(struct gl_context *ctx,
- GLenum pname)
+validate_get_framebuffer_parameteriv_pname(struct gl_context *ctx,
+   struct gl_framebuffer *fb,
+   GLuint pname, const char *func)
 {
-   if (!_mesa_is_desktop_gl(ctx))
-  return false;
+   bool cannotBeWinsysFbo = true;
 
switch (pname) {
+   case GL_FRAMEBUFFER_DEFAULT_LAYERS:
+  /*
+   * According to the OpenGL ES 3.1 specification section 9.2.3, the
+   * GL_FRAMEBUFFER_LAYERS parameter name is not supported.
+   */
+  if (_mesa_is_gles31(ctx) && !ctx->Extensions.OES_geometry_shader) {
+ _mesa_error(ctx, GL_INVALID_ENUM, "%s(pname=0x%x)", func, pname);
+ return false;
+  }
+  break;
+   case GL_FRAMEBUFFER_DEFAULT_WIDTH:
+   case GL_FRAMEBUFFER_DEFAULT_HEIGHT:
+   case GL_FRAMEBUFFER_DEFAULT_SAMPLES:
+   case GL_FRAMEBUFFER_DEFAULT_FIXED_SAMPLE_LOCATIONS:
+  break;
case GL_DOUBLEBUFFER:
case GL_IMPLEMENTATION_COLOR_READ_FORMAT:
case GL_IMPLEMENTATION_COLOR_READ_TYPE:
case GL_SAMPLES:
case GL_SAMPLE_BUFFERS:
case GL_STEREO:
-  return true;
+  /* From OpenGL 4.5 spec, section 9.2.3 "Framebuffer Object Queries:
+   *
+   *"An INVALID_OPERATION error is generated by GetFramebufferParameteriv
+   * if the default framebuffer is bound to target and pname is not one
+   * of the accepted values from table 23.73, other than
+   * SAMPLE_POSITION."
+   *
+   * For OpenGL ES, using default framebuffer raises INVALID_OPERATION
+   * for any pname.
+   */
+  cannotBeWinsysFbo = !_mesa_is_desktop_gl(ctx);
+  break;
default:
+  _mesa_error(ctx, GL_INVALID_ENUM, "%s(pname=0x%x)", func, pname);
   return false;
}
+
+   if (cannotBeWinsysFbo && _mesa_is_winsys_fbo(fb)) {
+  _mesa_error(ctx, GL_INVALID_OPERATION,
+  "%s(invalid pname=0x%x for default framebuffer)", func, pname);
+  return false;
+   }
+
+   return true;
 }
 
 static void
 get_framebuffer_parameteriv(struct gl_context *ctx, struct gl_framebuffer *fb,
 GLenum pname, GLint *params, const char *func)
 {
-   /* From OpenGL 4.5 spec, section 9.2.3 "Framebuffer Object Queries:
-*
-*"An INVALID_OPERATION error is generated by GetFramebufferParameteriv
-* if the default framebuffer is bound to target and pname is not one
-* of the accepted values from table 23.73, other than
-* SAMPLE_POSITION."
-*
-* For OpenGL ES, using default framebuffer still raises INVALID_OPERATION
-* for any pname.
-*/
-   if (_mesa_is_winsys_fbo(fb) &&
-   !_pname_valid_for_default_framebuffer(ctx, pname)) {
-  _mesa_error(ctx, GL_INVALID_OPERATION,
-  "%s(invalid pname=0x%x for default framebuffer)", func, pname);
+   if (!validate_get_framebuffer_parameteriv_pname(ctx, fb, pname, func))
   return;
-   }
 
switch (pname) {
case GL_FRAMEBUFFER_DEFAULT_WIDTH:
@@ -1536,14 +1557,6 @@ get_framebuffer_parameteriv(struct gl_context *ctx, struct gl_framebuffer *fb,
   *params = fb->DefaultGeometry.Height;
   break;
case GL_FRAMEBUFFER_DEFAULT_LAYERS:
-  /*
-   * According to the OpenGL ES 3.1 specification section 9.2.3, the
-   * GL_FRAMEBUFFER_LAYERS parameter name is not supported.
-   */
-  if (_mesa_is_gles31(ctx) && !ctx->Extensions.OES_geometry_shader) {
- _mesa_error(ctx, GL_INVALID_ENUM, "%s(pname=0x%x)", func, pname);
- break;
-  }
   *params = fb->DefaultGeometry.Layers;
   break;
case GL_FRAMEBUFFER_DEFAULT_SAMPLES:
@@ -1570,9 +1583,6 @@ get_framebuffer_parameteriv(struct gl_context *ctx, struct gl_framebuffer *fb,
case GL_STEREO:
   *params = fb->Visual.stereoMode;
   break;
-   default:
-  _mesa_error(ctx, GL_INVALID_ENUM,
-  "%s(pname=0x%x)", func, pname);
}
 }
 
-- 
2.14.3



Re: [Mesa-dev] [RFC PATCH] mesa/st/cb_clear: in st_Clear also validate the render state (needed by virgl)

2018-05-04 Thread Ilia Mirkin
On Fri, May 4, 2018 at 5:30 PM, Gert Wollny  wrote:
> Am Freitag, den 04.05.2018, 16:11 -0400 schrieb Ilia Mirkin:
>> On Fri, May 4, 2018 at 2:50 PM, Gert Wollny wrote:
>> > For that reason I think that a render state validation within
>> > st_Clear  makes sense, because it makes sure that the states
>> > like SCISSOR_TEST or DISCARD_RASTERIZER are properly sent to the
>> > driver, before clear() is executed that makes use of these states.
>> > And while the workaround for DISCARD_RASTERIZER is possible because
>> > if it is set then the host should actually never see the clear
>> > command, SCISSOR_TEST is different, its state can not be guessed on
>> > the host.
>> >
>> > Anyway, I happily take every hint on how this can be fixed directly
>> > in virgl_clear or virgl in general, and thank you for taking the
>> > time to answer me,
>>
>> I thought someone else took care of the scissor thing for clears. At
>> least they were asking about it.
>
> When I implemented the "disable rasterizer_discard" before clear in
> virglrenderer I had tests failing that include scissors, so that's why
> I assumed that it might be a problem, but with your list below, I don't
> think that it is really that important which state does not get updated
> properly, because they all should be sent to virglrenderer.
>
>>
>> I have to assume that virglrenderer keeps track of the last-set
>> rasterizer state. Based on that, you have to turn off scissors,
>> rasterizer discard, window rectangles, stencil tests, sample
>> masks(?), and anything else which affects clears (have a look at
>> st_Clear -- anything that causes it to hit the fallback path which
>> draws should be something you turn off because calling glClear() in
>> virglrenderer).
> As far as I can see, the rasterizer state is updated with the last draw
> command, but, to take scissors as an example, when I set and enable
> scissors before a clear, then this is not transmitted, and the clear on
> the host doesn't take a possibly active scissor state into account.
> How does turning off scissors in virglrenderer help me with that? (same
> for the other states).
>
>> If you don't keep track of the last-set rasterizer state, then now
>> may be a good time to start :)
> When you refer to the states that are set after the last draw command
> but before a clear, then no, they are not updated in virglrenderer
> because there is no code path that exposed them to the driver, and this
> is where I'm stuck.
>
> Now reading a bit more in the code I find it strange that in
> st_validate_state with (pipeline = ST_PIPELINE_CLEAR) not all states
> that actually have an influence on the clear command are updated (i.e.
> the render states). I find this surprising ...

Rasterizer state doesn't affect the gallium clear callback. clear just
clears -- it's a memset. However rasterizer state affects glClear() so
you have to make sure to keep that in mind when you try to implement
clear() with glClear() (i.e. turn off whatever GL state the rasterizer
may have set which would cause you issues).

Note that the idea of gallium is that it's close to how the hardware
works. For example on NVIDIA, the CLEAR_BUFFERS command just clears.
No questions asked. No dependency on the rast state (except if you
explicitly ask it). You're running into problems because you're using
GL to implement the gallium API, and it doesn't quite match.

If you want, you could construct a temporary rasterizer state, bind
that, do the glClear() and then bind the old one back.

  -ilia


Re: [Mesa-dev] [PATCH 07/10] i965: Prepare batchbuffer module for softpin support.

2018-05-04 Thread Chris Wilson
Quoting Kenneth Graunke (2018-05-04 02:12:37)
> @@ -933,6 +945,14 @@ emit_reloc(struct intel_batchbuffer *batch,
>  {
> assert(target != NULL);
>  
> +   if (target->kflags & EXEC_OBJECT_PINNED) {
> +  brw_use_pinned_bo(batch, target, reloc_flags & RELOC_WRITE);
> +  return target->gtt_offset + target_offset;
> +   }
> +
> +   unsigned int index = add_exec_bo(batch, target);
> +   struct drm_i915_gem_exec_object2 *entry = &batch->validation_list[index];
> +
> if (rlist->reloc_count == rlist->reloc_array_size) {
>rlist->reloc_array_size *= 2;
>rlist->relocs = realloc(rlist->relocs,
> @@ -940,9 +960,6 @@ emit_reloc(struct intel_batchbuffer *batch,
>sizeof(struct drm_i915_gem_relocation_entry));
> }
>  
> -   unsigned int index = add_exec_bo(batch, target);
> -   struct drm_i915_gem_exec_object2 *entry = &batch->validation_list[index];
> -
> if (reloc_flags & RELOC_32BIT) {
>/* Restrict this buffer to the low 32 bits of the address space.
> *
> @@ -976,6 +993,21 @@ emit_reloc(struct intel_batchbuffer *batch,
> return entry->offset + target_offset;
>  }
>  
> +void
> +brw_use_pinned_bo(struct intel_batchbuffer *batch, struct brw_bo *bo,
> +  unsigned writable_flag)
> +{
> +   assert(bo->kflags & EXEC_OBJECT_PINNED);
> +   assert((writable_flag & ~EXEC_OBJECT_WRITE) == 0);
> +
> +   unsigned int index = add_exec_bo(batch, bo);
> +   struct drm_i915_gem_exec_object2 *entry = &batch->validation_list[index];
> +   assert(entry->offset == bo->gtt_offset);
> +
> +   if (writable_flag)
> +  entry->flags |= EXEC_OBJECT_WRITE;
> +}

I'm not fond of this (at least the emit_reloc() perspective). In
emit_reloc() we were very careful to order the writes to ensure the
validation object was always consistent with the batchbuffer entry,
and this throws it all away (granted it's not a concern, my worry is
just that the code looks dangerous).

My preference would be something like:

static uint64_t
emit_reloc(struct intel_batchbuffer *batch,
   struct brw_reloc_list *rlist, uint32_t offset,
   struct brw_bo *target, int32_t target_offset,
   unsigned int reloc_flags)
{
   assert(target != NULL);

   unsigned int index = add_exec_bo(batch, target);
   struct drm_i915_gem_exec_object2 *entry = &batch->validation_list[index];

   if (target->kflags & EXEC_OBJECT_PINNED) {
  assert(!(reloc_flags & ~EXEC_OBJECT_WRITE));
  goto skip_relocs;
   }

   if (rlist->reloc_count == rlist->reloc_array_size) {
  rlist->reloc_array_size *= 2;
  rlist->relocs = realloc(rlist->relocs,
  rlist->reloc_array_size *
  sizeof(struct drm_i915_gem_relocation_entry));
   }

   if (reloc_flags & RELOC_32BIT) {
  /* Restrict this buffer to the low 32 bits of the address space.
   *
   * Altering the validation list flags restricts it for this batch,
   * but we also alter the BO's kflags to restrict it permanently
   * (until the BO is destroyed and put back in the cache).  Buffers
   * may stay bound across batches, and we want to keep it constrained.
   */
  target->kflags &= ~EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
  entry->flags &= ~EXEC_OBJECT_SUPPORTS_48B_ADDRESS;

  /* RELOC_32BIT is not an EXEC_OBJECT_* flag, so get rid of it. */
  reloc_flags &= ~RELOC_32BIT;
   }

   rlist->relocs[rlist->reloc_count++] =
  (struct drm_i915_gem_relocation_entry) {
 .offset = offset,
 .delta = target_offset,
 .target_handle = batch->use_batch_first ? index : target->gem_handle,
 .presumed_offset = entry->offset,
  };

skip_relocs:
   if (reloc_flags)
  entry->flags |= reloc_flags & batch->valid_reloc_flags;

   /* Using the old buffer offset, write in what the right data would be, in
* case the buffer doesn't move and we can short-circuit the relocation
* processing in the kernel
*/
   return entry->offset + target_offset;
}

The relationship between validation object entry and the batch buffer
contents is much easier to verify.
-Chris
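
The control flow proposed above can be sketched in isolation as follows. This is a reduced model under stated assumptions, not the i965 code: exec_entry, reloc, batch_model and emit_reloc_model are hypothetical names, the validation list is shrunk to a single entry, and the pinned case is passed in as a plain bool. The point is that both the pinned and the relocated path end in the same tail, which updates the entry flags and computes the returned address from the same entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_RELOCS 8   /* arbitrary bound for this sketch */

/* Hypothetical, reduced stand-ins for the structures in the patch. */
struct exec_entry { uint64_t offset; uint64_t flags; };
struct reloc      { uint32_t offset; int32_t delta; uint64_t presumed_offset; };

struct batch_model {
   struct exec_entry entry;            /* single validation-list entry */
   struct reloc relocs[MAX_RELOCS];
   unsigned reloc_count;
};

static uint64_t
emit_reloc_model(struct batch_model *b, bool pinned, uint32_t batch_offset,
                 int32_t target_offset, uint64_t write_flag)
{
   struct exec_entry *entry = &b->entry;

   if (!pinned) {
      /* Only unpinned buffers need a relocation entry. */
      assert(b->reloc_count < MAX_RELOCS);
      b->relocs[b->reloc_count++] = (struct reloc) {
         .offset = batch_offset,
         .delta = target_offset,
         .presumed_offset = entry->offset,
      };
   }

   /* Shared tail: flags and the returned address always come from the
    * same validation entry, pinned or not. */
   entry->flags |= write_flag;
   return entry->offset + target_offset;
}
```

In this model a pinned target adds no relocation but still returns the same address as the relocated path would for an unmoved buffer.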


Re: [Mesa-dev] [RFC PATCH] mesa/st/cb_clear: in st_Clear also validate the render state (needed by virgl)

2018-05-04 Thread Gert Wollny
Am Freitag, den 04.05.2018, 16:11 -0400 schrieb Ilia Mirkin:
> On Fri, May 4, 2018 at 2:50 PM, Gert Wollny wrote:
> > For that reason I think that a render state validation within
> > st_Clear  makes sense, because it makes sure that the states
> > like SCISSOR_TEST or DISCARD_RASTERIZER are properly sent to the
> > driver, before clear() is executed that makes use of these states.
> > And while the workaround for DISCARD_RASTERIZER is possible because
> > if it is set then the host should actually never see the clear
> > command, SCISSOR_TEST is different, its state can not be guessed on
> > the host.
> > 
> > Anyway, I happily take every hint on how this can be fixed directly
> > in virgl_clear or virgl in general, and thank you for taking the
> > time to answer me,
> 
> I thought someone else took care of the scissor thing for clears. At
> least they were asking about it.

When I implemented the "disable rasterizer_discard" before clear in
virglrenderer I had tests failing that include scissors, so that's why
I assumed that it might be a problem, but with your list below, I don't
think that it is really that important which state does not get updated
properly, because they all should be sent to virglrenderer.

> 
> I have to assume that virglrenderer keeps track of the last-set
> rasterizer state. Based on that, you have to turn off scissors,
> rasterizer discard, window rectangles, stencil tests, sample
> masks(?), and anything else which affects clears (have a look at
> st_Clear -- anything that causes it to hit the fallback path which
> draws should be something you turn off because calling glClear() in
> virglrenderer).
As far as I can see, the rasterizer state is updated with the last draw
command, but, to take scissors as an example, when I set and enable
scissors before a clear, then this is not transmitted, and the clear on
the host doesn't take a possibly active scissor state into account.
How does turning off scissors in virglrenderer help me with that? (same
for the other states).

> If you don't keep track of the last-set rasterizer state, then now
> may be a good time to start :)
When you refer to the states that are set after the last draw command
but before a clear, then no, they are not updated in virglrenderer
because there is no code path that exposed them to the driver, and this
is where I'm stuck. 

Now reading a bit more in the code I find it strange that in 
st_validate_state with (pipeline = ST_PIPELINE_CLEAR) not all states
that actually have an influence on the clear command are updated (i.e.
the render states). I find this surprising ... 

Best, 
Gert 



Re: [Mesa-dev] [PATCH 3/5] ffvertex: Don't try to read output registers in fog calculation

2018-05-04 Thread Kenneth Graunke
On Thursday, April 26, 2018 2:28:41 PM PDT Ian Romanick wrote:
> From: Ian Romanick 
> 
> This code was clearly never tested.  It's a good thing that no driver
> ever tried to enable this path.
> 
> Signed-off-by: Ian Romanick 
> ---
>  src/mesa/main/ffvertex_prog.c | 10 ++
>  1 file changed, 6 insertions(+), 4 deletions(-)

That's not quite accurate.  Gallium drivers advertise NV_fog_distance,
but st_program calls _mesa_remove_output_reads(), which presumably
lowers those away.  It looks like none of the classic drivers are
calling that, though, so they would be broken if they tried to enable
this.  Either your patch, or calling _mesa_remove_output_reads(),
would probably solve the issue.

At any rate, using a temporary seems reasonable.

Series is:
Reviewed-by: Kenneth Graunke 




Re: [Mesa-dev] [PATCH 06/10] i965: Add virtual memory allocator infrastructure to brw_bufmgr.

2018-05-04 Thread Chris Wilson
Quoting Kenneth Graunke (2018-05-04 02:12:36)
> +   if (brw_using_softpin(bufmgr) && bo->gtt_offset == 0ull) {
> +  bo->gtt_offset = vma_alloc(bufmgr, memzone, bo->size, 1);
> +
> +  if (bo->gtt_offset == 0ull)
> + goto err_free;
> +   }
> +
> bo->name = name;
> p_atomic_set(&bo->refcount, 1);
> bo->reusable = true;
> @@ -545,6 +792,9 @@ brw_bo_gem_create_from_name(struct brw_bufmgr *bufmgr,
> bo->external = true;
> bo->kflags = bufmgr->initial_kflags;
>  
> +   if (brw_using_softpin(bufmgr))
> +  bo->gtt_offset = vma_alloc(bufmgr, BRW_MEMZONE_OTHER, bo->size, 1);

At this point, I think you want bo_using_softpin() and pull it from the
kflags. Not any different today, but I think more defensive, especially
on the free paths.

On exec I think you want something like

assert(!bo_using_softpin(bo) ||
   bo->gtt_offset == execobj[bo->index].offset);

to document and check that the kernel isn't moving the objects and
breaking the vma manager.

Overall impression: very, very neat.
-Chris
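
A small sketch of the post-exec check suggested above, assuming reduced stand-in types. bo_model, exec_obj_model, SKETCH_OBJECT_PINNED and offsets_consistent() are hypothetical names, not brw_bufmgr or kernel definitions; the per-BO predicate reads the pinned bit from the BO's own kflags, as the review recommends, rather than from a bufmgr-wide setting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bit standing in for EXEC_OBJECT_PINNED. */
#define SKETCH_OBJECT_PINNED (UINT64_C(1) << 4)

/* Hypothetical, reduced stand-ins for the BO and the exec object. */
struct bo_model {
   uint64_t kflags;
   uint64_t gtt_offset;
   unsigned index;        /* position in the exec object array */
};

struct exec_obj_model {
   uint64_t offset;       /* address reported back after execution */
};

/* Per-BO check taken from the BO's own kflags. */
static bool
bo_using_softpin(const struct bo_model *bo)
{
   return (bo->kflags & SKETCH_OBJECT_PINNED) != 0;
}

/* Returns true if no softpinned BO was moved away from the address the
 * userspace allocator assigned to it. */
static bool
offsets_consistent(const struct bo_model *bos, unsigned count,
                   const struct exec_obj_model *execobj)
{
   for (unsigned i = 0; i < count; i++) {
      const struct bo_model *bo = &bos[i];
      if (bo_using_softpin(bo) &&
          bo->gtt_offset != execobj[bo->index].offset)
         return false;
   }
   return true;
}
```

Unpinned BOs are allowed to move in this model; only a mismatch on a pinned BO is treated as a broken invariant.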


Re: [Mesa-dev] [PATCH 1/1] vulkan/wsi/x11: Fix properly check if the system has DRI_MODIFIERS

2018-05-04 Thread Bas Nieuwenhuizen
On Thu, May 3, 2018 at 8:10 PM, mercuriete  wrote:
> From: Abel Garcia Dorta 
>
> Bug: https://bugs.freedesktop.org/show_bug.cgi?id=106180
>
> Fixes: c80c08e226 "vulkan/wsi/x11: Add support for DRI3 v1.2"
>
> Cc: 
> ---
>  src/vulkan/wsi/wsi_common_x11.c | 5 +
>  1 file changed, 1 insertion(+), 4 deletions(-)
>
> diff --git a/src/vulkan/wsi/wsi_common_x11.c b/src/vulkan/wsi/wsi_common_x11.c
> index 3a00caddfb..ee7b247ee1 100644
> --- a/src/vulkan/wsi/wsi_common_x11.c
> +++ b/src/vulkan/wsi/wsi_common_x11.c
> @@ -1055,10 +1055,7 @@ x11_image_init(VkDevice device_h, struct x11_swapchain *chain,
> image->pixmap = xcb_generate_id(chain->conn);
>
>  #ifdef HAVE_DRI3_MODIFIERS
> -   if (image->base.drm_modifier != DRM_FORMAT_MOD_INVALID) {
> -  /* If the image has a modifier, we must have DRI3 v1.2. */
> -  assert(chain->has_dri3_modifiers);
> -
> +   if (chain->has_dri3_modifiers && image->base.drm_modifier != DRM_FORMAT_MOD_INVALID) {

I strongly suspect this is not the right fix, as it seems to be an
invariant we are trying to keep that if has_dri3_modifiers is false,
then the modifier has to be DRM_FORMAT_MOD_INVALID. So the question
is why we are still getting a modifier.
>cookie =
>   xcb_dri3_pixmap_from_buffers_checked(chain->conn,
>image->pixmap,
> --
> 2.16.1
>
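
The invariant described in the reply above can be illustrated with a small model. This is a sketch under assumptions, not the wsi_common_x11.c code: choose_modifier() and use_modifier_path() are hypothetical helpers and SKETCH_MOD_INVALID is a local stand-in for DRM_FORMAT_MOD_INVALID. It shows the invariant being established where the modifier is chosen, so the later assert documents it instead of being replaced by an extra condition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for DRM_FORMAT_MOD_INVALID. */
#define SKETCH_MOD_INVALID UINT64_C(0x00ffffffffffffff)

/* Hypothetical image-creation step: without DRI3 modifier support the
 * image never receives a real modifier. */
static uint64_t
choose_modifier(bool has_dri3_modifiers,
                const uint64_t *supported, unsigned count)
{
   if (!has_dri3_modifiers || count == 0)
      return SKETCH_MOD_INVALID;
   return supported[0];
}

/* Hypothetical pixmap-creation step: decides whether the modifier-aware
 * request is used. The assert holds as long as every producer of the
 * modifier goes through choose_modifier(). */
static bool
use_modifier_path(bool has_dri3_modifiers, uint64_t modifier)
{
   if (modifier == SKETCH_MOD_INVALID)
      return false;

   assert(has_dri3_modifiers);
   return true;
}
```

If the assert fires in such a design, the place to look is whichever code path produced a modifier without consulting has_dri3_modifiers.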


Re: [Mesa-dev] [PATCH 4/4] anv/device: expose shaderInt16 support in gen8+

2018-05-04 Thread Mark Janes
Patches 1-4 are

Tested-by: Mark Janes 

Iago Toral Quiroga  writes:

> This rolls back the revert of this patch introduced with
> commit 7cf284f18e6774c810ed6db17b98e597bf96f8a5.
> ---
>  src/intel/vulkan/anv_device.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
> index 0563eae5c1..fd516fb846 100644
> --- a/src/intel/vulkan/anv_device.c
> +++ b/src/intel/vulkan/anv_device.c
> @@ -759,7 +759,7 @@ void anv_GetPhysicalDeviceFeatures(
>   pdevice->info.has_64bit_types,
>.shaderInt64  = pdevice->info.gen >= 8 &&
>   pdevice->info.has_64bit_types,
> -  .shaderInt16  = false,
> +  .shaderInt16  = pdevice->info.gen >= 8,
>.shaderResourceMinLod = false,
>.variableMultisampleRate  = true,
>.inheritedQueries = true,
> -- 
> 2.14.1


Re: [Mesa-dev] [PATCH 05/10] i965: Introduce a "memory zone" concept on BO allocation.

2018-05-04 Thread Chris Wilson
Quoting Kenneth Graunke (2018-05-04 02:12:35)
> diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.c b/src/mesa/drivers/dri/i965/brw_bufmgr.c
> index 66f30a1637f..66828f319be 100644
> --- a/src/mesa/drivers/dri/i965/brw_bufmgr.c
> +++ b/src/mesa/drivers/dri/i965/brw_bufmgr.c
> @@ -268,6 +268,7 @@ static struct brw_bo *
>  bo_alloc_internal(struct brw_bufmgr *bufmgr,
>const char *name,
>uint64_t size,
> +  enum brw_memory_zone memzone,
>unsigned flags,
>uint32_t tiling_mode,
>uint32_t stride)

Did you consider carving out some space from flags for the zone?

> @@ -426,23 +427,27 @@ err:
>  
>  struct brw_bo *
>  brw_bo_alloc(struct brw_bufmgr *bufmgr,
> - const char *name, uint64_t size)
> + const char *name, uint64_t size,
> + enum brw_memory_zone memzone)
>  {
> -   return bo_alloc_internal(bufmgr, name, size, 0, I915_TILING_NONE, 0);
> +   return bo_alloc_internal(bufmgr, name, size, memzone,
> +0, I915_TILING_NONE, 0);
>  }
>  
>  struct brw_bo *
>  brw_bo_alloc_tiled(struct brw_bufmgr *bufmgr, const char *name,
> -   uint64_t size, uint32_t tiling_mode, uint32_t pitch,
> +   uint64_t size, enum brw_memory_zone memzone,
> +   uint32_t tiling_mode, uint32_t pitch,
> unsigned flags)
>  {
> -   return bo_alloc_internal(bufmgr, name, size, flags, tiling_mode, pitch);
> +   return bo_alloc_internal(bufmgr, name, size, memzone,
> +flags, tiling_mode, pitch);
>  }
>  
>  struct brw_bo *
>  brw_bo_alloc_tiled_2d(struct brw_bufmgr *bufmgr, const char *name,
> -  int x, int y, int cpp, uint32_t tiling,
> -  uint32_t *pitch, unsigned flags)
> +  int x, int y, int cpp, enum brw_memory_zone memzone,
> +  uint32_t tiling, uint32_t *pitch, unsigned flags)

The main benefit would be one less parameter here, but is that really
more readable or less error-prone?
-Chris
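
For illustration, carving the zone out of the flags word could look roughly like this. It is a sketch only: the enum values, ZONE_SHIFT, ZONE_MASK and the helper names are hypothetical and were not taken from the series, and which bits are actually free in the real flags is an assumption.

```c
#include <assert.h>

/* Hypothetical zone enum; the real brw_memory_zone values may differ. */
enum memzone_model {
   ZONE_LOW_4G,
   ZONE_OTHER,
   ZONE_COUNT
};

/* Assumption: the top four bits of the 32-bit flags word are unused. */
#define ZONE_SHIFT 28u
#define ZONE_MASK  (0xfu << ZONE_SHIFT)

/* Fold the zone into the flags word passed to the allocator. */
static unsigned
pack_flags(unsigned flags, enum memzone_model zone)
{
   assert((flags & ZONE_MASK) == 0);      /* zone bits must be free */
   assert((unsigned)zone < ZONE_COUNT);
   return flags | ((unsigned)zone << ZONE_SHIFT);
}

/* Recover the zone inside the allocator. */
static enum memzone_model
unpack_zone(unsigned packed)
{
   return (enum memzone_model)((packed & ZONE_MASK) >> ZONE_SHIFT);
}

/* Recover the original flags without the zone bits. */
static unsigned
unpack_flags(unsigned packed)
{
   return packed & ~ZONE_MASK;
}
```

This saves a parameter at each call site, at the cost of the zone no longer being type-checked as an enum argument, which is the readability trade-off raised above.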


Re: [Mesa-dev] [RFC PATCH] mesa/st/cb_clear: in st_Clear also validate the render state (needed by virgl)

2018-05-04 Thread Gurchetan Singh
On Fri, May 4, 2018 at 1:11 PM, Ilia Mirkin  wrote:
> On Fri, May 4, 2018 at 2:50 PM, Gert Wollny  wrote:
>> For that reason I think that a render state validation within st_Clear
>> makes sense, because it makes sure that the states like  SCISSOR_TEST
>> or DISCARD_RASTERIZER are properly sent to the driver, before clear()
>> is executed that makes use of these states. And while the workaround
>> for DISCARD_RASTERIZER is possible because if it is set then the host
>> should actually never see the clear command, SCISSOR_TEST is different,
>> its state can not be guessed on the host.
>>
>> Anyway, I happily take every hint on how this can be fixed directly in
>> virgl_clear or virgl in general, and thank you for taking the time to
>> answer me,
>
> I thought someone else took care of the scissor thing for clears. At
> least they were asking about it.

Yes, there were similar problems with stencils and scissors.

https://cgit.freedesktop.org/virglrenderer/commit/?id=b75e0a1dabdfbda44c310a69026a9dbd7d980294
https://cgit.freedesktop.org/virglrenderer/commit/?id=252b00d77c30ce39608c1a9de18523cbdcaca623

Essentially, we need to save and restore the prior rasterizer state.

> I have to assume that virglrenderer keeps track of the last-set
> rasterizer state. Based on that, you have to turn off scissors,
> rasterizer discard, window rectangles, stencil tests, sample masks(?),
> and anything else which affects clears (have a look at st_Clear --
> anything that causes it to hit the fallback path which draws should be
> something you turn off because calling glClear() in virglrenderer).
>
> If you don't keep track of the last-set rasterizer state, then now may
> be a good time to start :)
>
>   -ilia


Re: [Mesa-dev] [PATCH 01/10] i965: Set initial kflags on BO creation.

2018-05-04 Thread Chris Wilson
Quoting Kenneth Graunke (2018-05-04 02:12:31)
> This simplifies kflag initialization, by creating a bufmgr-wide setting
> for initial kflags, and just applying it whenever we create a new BO.
> 
> This also properly allows 48-bit addresses for imported BOs (via prime
> or flink), which I had missed in my earlier 48-bit support series.
> 
> This will be useful when adding softpin support, as we'll want to add
> EXEC_OBJECT_PINNED to initial_kflags as well.

Reviewed-by: Chris Wilson 
-Chris


[Mesa-dev] [PATCH 0/1] vulkan/wsi/x11: Fix properly check if the system has DRI_MODIFIERS

2018-05-04 Thread mercuriete
Hi, this is my first patch to Mesa, and I'm not sure how to contribute.

In some conditions the assertion is triggered because the code doesn't
properly check whether you have DRI_MODIFIERS.

The comment says "If the image has a modifier, we must have DRI3 v1.2"
but this is not true on my system.

Happy review! :)

Abel Garcia Dorta (1):
  vulkan/wsi/x11: Fix properly check if the system has DRI_MODIFIERS

 src/vulkan/wsi/wsi_common_x11.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

-- 
2.16.1



[Mesa-dev] [PATCH 1/1] vulkan/wsi/x11: Fix properly check if the system has DRI_MODIFIERS

2018-05-04 Thread mercuriete
From: Abel Garcia Dorta 

Bug: https://bugs.freedesktop.org/show_bug.cgi?id=106180

Fixes: c80c08e226 "vulkan/wsi/x11: Add support for DRI3 v1.2"

Cc: 
---
 src/vulkan/wsi/wsi_common_x11.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/src/vulkan/wsi/wsi_common_x11.c b/src/vulkan/wsi/wsi_common_x11.c
index 3a00caddfb..ee7b247ee1 100644
--- a/src/vulkan/wsi/wsi_common_x11.c
+++ b/src/vulkan/wsi/wsi_common_x11.c
@@ -1055,10 +1055,7 @@ x11_image_init(VkDevice device_h, struct x11_swapchain *chain,
image->pixmap = xcb_generate_id(chain->conn);
 
 #ifdef HAVE_DRI3_MODIFIERS
-   if (image->base.drm_modifier != DRM_FORMAT_MOD_INVALID) {
-  /* If the image has a modifier, we must have DRI3 v1.2. */
-  assert(chain->has_dri3_modifiers);
-
+   if (chain->has_dri3_modifiers && image->base.drm_modifier != DRM_FORMAT_MOD_INVALID) {
   cookie =
  xcb_dri3_pixmap_from_buffers_checked(chain->conn,
   image->pixmap,
-- 
2.16.1



Re: [Mesa-dev] radv: Support for control flow attributes (SPIR-V: Unroll, DontUnroll and maybe Flatten, DontFlatten)

2018-05-04 Thread Panagiotis Christopoulos Charitos
ping

Panagiotis Christopoulos Charitos
www.ancient-ritual.com 
www.anki3d.org

On Mon, Apr 30, 2018 at 12:36 PM, Panagiotis Christopoulos Charitos <
godl...@ancient-ritual.com> wrote:

> Hi,
>
> I have multiple shaders that are part of my opensource engine and I see
> VGPR usage improvements (on AMDVLK).
>
> I can gather a few and attach them here. They can be in GLSL or SPIR-V
> form. Will that work, or did I misunderstand the question?
>
> Panagiotis Christopoulos Charitos
> www.ancient-ritual.com 
> www.anki3d.org
>
> On Mon, Apr 30, 2018 at 12:20 PM, Bas Nieuwenhuizen <
> b...@basnieuwenhuizen.nl> wrote:
>
>> We can look at passing it through to the optimizations. Do you have
>> any particular shaders where you are interested in the use of these?
>>
>> On Sat, Apr 28, 2018 at 3:02 PM, Panagiotis Christopoulos Charitos
>>  wrote:
>> > Hi,
>> >
>> > SPIR-V supports a few control flow decorations. These decorations are
>> > exposed in glslang as well as part of GL_EXT_control_flow_attributes
>> > extension.
>> >
>> > AMDVLK's compiler supports those decorations, and using them may
>> > result in lower VGPR usage.
>> >
>> > On the other hand RADV seems to ignore them.
>> >
>> > Does it make sense to support them as well? If yes, are there any
>> > plans?
>> >
>> > PS: I've built mesa with LLVM 6.0.0
>> >
>> >
>> > Panagiotis Christopoulos Charitos
>> > www.ancient-ritual.com
>> > www.anki3d.org
>> >


[Mesa-dev] [Bug 106180] [bisected] radv vulkan smoke test black screen (Add support for DRI3 v1.2)

2018-05-04 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106180

Bas Nieuwenhuizen  changed:

           What    |Removed                        |Added
----------------------------------------------------------------------------
         QA Contact|mesa-dev@lists.freedesktop.org |
                 CC|                               |airl...@freedesktop.org,
                   |                               |chadvers...@chromium.org,
                   |                               |dan...@fooishbar.org,
                   |                               |ja...@jlekstrand.net
          Component|Drivers/Vulkan/radeon          |Drivers/Vulkan/Common

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


[Mesa-dev] [Bug 106180] [bisected] radv vulkan smoke test black screen (Add support for DRI3 v1.2)

2018-05-04 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106180

--- Comment #13 from Bas Nieuwenhuizen  ---
To confirm, you are running radv while using intel for the X server?

-- 
You are receiving this mail because:
You are the assignee for the bug.


Re: [Mesa-dev] [RFC PATCH] mesa/st/cb_clear: in st_Clear also validate the render state (needed by virgl)

2018-05-04 Thread Ilia Mirkin
On Fri, May 4, 2018 at 2:50 PM, Gert Wollny  wrote:
> For that reason I think that a render state validation within st_Clear
> makes sense, because it makes sure that states like SCISSOR_TEST
> or RASTERIZER_DISCARD are properly sent to the driver before clear(),
> which makes use of these states, is executed. And while the workaround
> for RASTERIZER_DISCARD is possible because if it is set then the host
> should actually never see the clear command, SCISSOR_TEST is different:
> its state cannot be guessed on the host.
>
> Anyway, I happily take every hint on how this can be fixed directly in
> virgl_clear or virgl in general, and thank you for taking the time to
> answer me,

I thought someone else took care of the scissor thing for clears. At
least they were asking about it.

I have to assume that virglrenderer keeps track of the last-set
rasterizer state. Based on that, you have to turn off scissors,
rasterizer discard, window rectangles, stencil tests, sample masks(?),
and anything else which affects clears (have a look at st_Clear --
anything that causes it to hit the fallback path which draws should be
something you turn off before calling glClear() in virglrenderer).

If you don't keep track of the last-set rasterizer state, then now may
be a good time to start :)
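
The suggestion above (track the last-set state on the host, switch off whatever affects clears, clear, then restore) could be sketched roughly as follows. This is only a toy model in Python, not virglrenderer code: FakeGL, CLEAR_AFFECTING and renderer_clear are hypothetical names, and only two of the states mentioned above are modelled.

```python
from contextlib import contextmanager


class FakeGL:
    """Hypothetical stand-in for the host GL context: a set of enabled caps."""

    def __init__(self):
        self.enabled = set()
        self.log = []

    def enable(self, cap):
        self.enabled.add(cap)

    def disable(self, cap):
        self.enabled.discard(cap)

    def clear(self):
        # Modelled after glClear(): dropped under rasterizer discard,
        # restricted to the scissor box when the scissor test is enabled.
        if "RASTERIZER_DISCARD" in self.enabled:
            self.log.append("clear dropped")
        elif "SCISSOR_TEST" in self.enabled:
            self.log.append("clear scissored")
        else:
            self.log.append("clear full")


# Assumption: a subset of the states listed in the mail, for illustration only.
CLEAR_AFFECTING = ("RASTERIZER_DISCARD", "SCISSOR_TEST")


@contextmanager
def clear_state_neutralized(gl):
    """Temporarily disable every tracked state that would alter a clear."""
    saved = [cap for cap in CLEAR_AFFECTING if cap in gl.enabled]
    for cap in saved:
        gl.disable(cap)
    try:
        yield
    finally:
        # Restore exactly what was enabled before the clear.
        for cap in saved:
            gl.enable(cap)


def renderer_clear(gl):
    """Perform a clear that behaves like a plain memset of the framebuffer."""
    with clear_state_neutralized(gl):
        gl.clear()
```

With both states enabled beforehand, renderer_clear() records a full clear and leaves the enabled set as it found it.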

  -ilia


Re: [Mesa-dev] [PATCH v3] i965: Fix ETC2/EAC GetCompressed* functions on Gen7 GPUs

2018-05-04 Thread Eleni Maria Stea
Hi Eero,

On Fri, 4 May 2018 18:29:55 +0300
Eero Tamminen  wrote:

> You mean returning CAVEAT_SUPPORT in params for compressed formats
> which are transparently converted to uncompressed data?

Well, that would be the best solution I think, if it's possible to
modify an existing query in the extension, although I am not certain
which is the best query to modify: TEXTURE_COMPRESSED, or
INTERNALFORMAT_SUPPORTED (or maybe both?). 

There's also another solution that we already have, but we are not sure
if it is correct:

I noticed that both mesa and nvidia drivers return GL_FALSE when the
pname is GL_TEXTURE_COMPRESSED and the format is emulated and GL_TRUE
for the natively supported formats. (Specifically on mesa the code that
performs the check is in src/mesa/main/formatquery.c and tests only
for native support). 

But if you take a look at this part of the extension specification:

TEXTURE_COMPRESSED: If <internalformat> is a compressed format
  that is supported for this type of resource, TRUE is returned in
  <params>. If the internal format is not compressed, or the type of
  resource is not supported, FALSE is returned.

it is not very clear if we should return true or false for an
emulated format. Maybe returning false when we provide emulation is a
bug in both drivers, just a convenient one in this case. :-)

Is there any way to clarify what should be the correct behavior?

Do you think that even if the current behavior of the
TEXTURE_COMPRESSED query is correct, in which case it should keep
returning GL_FALSE for the emulated formats, we should nevertheless
modify something else, e.g. the INTERNALFORMAT_SUPPORTED query, to
return CAVEAT_SUPPORT? 

> 
> That API's not available for GLES v2, where I think ETC is most widely
> used, so it would be more of a solution for GLES v3.x applications
> only. Sounds OK to me.
> 
> Hardest part will be propagating use of this query to engines &
> toolkits that would benefit from using it. :-)

+1 on that :)

Thanks a lot for the suggestions and the feedback,
Eleni

PS: here is some code to clarify the current situation:

[1]: https://github.com/hikiko/test-compression is a test program to
quickly check the compressed formats supported (see
the function print_compressed_formats at the end of main.c)

[2]: https://pastebin.com/Qif74fFn is the output of [1] on HSW using
the ETC patch and on nvidia where you can see that the natively
supported compression formats return GL_TRUE in both cards whereas the
emulated ones return GL_FALSE in both cards


Re: [Mesa-dev] [RFC PATCH] mesa/st/cb_clear: in st_Clear also validate the render state (needed by virgl)

2018-05-04 Thread Gert Wollny
Am Freitag, den 04.05.2018, 08:21 -0400 schrieb Ilia Mirkin:
> On Fri, May 4, 2018 at 7:36 AM, Gert Wollny wrote:
> > Am Donnerstag, den 03.05.2018, 19:52 +0200 schrieb Gert Wollny:
> > > Am Donnerstag, den 03.05.2018, 13:24 -0400 schrieb Ilia Mirkin:
> > > > 
> > > > The api call is "clear", not "glClear in the context of
> > > > whatever
> > > > random GL state there might be". When the gallium clear API is
> > > > invoked, the bound framebuffer needs to be cleared. This is how
> > > > the
> > > > API works, this is how all drivers implement it. It's basically
> > > > memset(). It doesn't care about rasterizer discard or anything
> > > > else.
> > > > 
> > > 
> > > I stand corrected and sorry for the noise.
> > 
> > Actually no:
> > 
> > The OpenGL standard section 14.1 says:
> 
> Not sure what this has to do with anything. We're not talking about
> glClear(). We're talking about the gallium clear API. Here's how it's
> specified:
> 
> http://gallium.readthedocs.io/en/latest/context.html

Thanks for the pointers. I am reading it and looking through the
code, but I don't really see how to solve this properly.

> What you need to do in the virgl_clear() function is to make that
> happen. The fact that you're messing around with GL state on the
> driver end isn't the gallium api's concern, nor is it the gallium
> api's users'. My recommendation is instead of virgl_clear working
> around virglrenderer's limitations by sending extra state (such as to
> force-disable raster discard), to instead have virglrenderer track it
> properly, 

virglrenderer passes all information it gets through to the underlying
OpenGL implementation, in that sense it tracks the state. Having a
closer look at some simple example pseudo code: 

  ...
  glEnable(GL_RASTERIZER_DISCARD);
-> the gallium state tracker sets the flag for this state
-> mesa keeps another variable, ctx->RasterDiscard, to hold this
   state
  glClear(...)
-> mesa does an early exit because ctx->RasterDiscard is true.

  glDrawSomething(...);
-> before this draw gallium issues bind_rasterizer_state, which
   sends the state to virglrenderer, so the host now has
   rasterizer discard enabled. If mesa is the host backend, then
   the host ctx->RasterDiscard is now true.

  glDisable(GL_RASTERIZER_DISCARD);
-> the corresponding flags in the guest gallium state tracker and
   mesa's ctx->RasterDiscard are cleared, but the state change is
   not passed to virgl and hence also not to virglrenderer on the
   host.

  glClear();
-> the guest gallium state tracker issues the clear call
   virgl_clear, which transmits the clear command to virglrenderer,
   but since the host state was not updated, the command is
   ignored.

> and if raster discard has been enabled, to first disable it
> before calling glClear() (and then re-enable it).

I tried that, but seven out of 17 tests are still failing,
so there is probably some other state change involved. In fact the test
suite in question combines RASTERIZER_DISCARD with SCISSOR_TEST, which
is also communicated by bind_rasterizer_state and also taken into
account by glClear.

Now, from where I stand it is gallium as the state tracker that keeps
track of these states, and virgl (and respectively virglrenderer) only
see the actual state changes when they are passed in by
bind_rasterizer_state, which is always called from gallium.

For that reason I think that a render state validation within st_Clear
makes sense, because it makes sure that states like SCISSOR_TEST
or RASTERIZER_DISCARD are properly sent to the driver before clear(),
which makes use of these states, is executed. And while the workaround
for RASTERIZER_DISCARD is possible because if it is set then the host
should actually never see the clear command, SCISSOR_TEST is different:
its state cannot be guessed on the host.
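
The call sequence described in this mail can be replayed with a small toy model. It is only a sketch in Python under simplifying assumptions: Host, Guest and run_sequence are hypothetical names, a single boolean stands in for the whole rasterizer state, and validate_on_clear models the proposed validation in st_Clear.

```python
class Host:
    """Hypothetical host side; it only knows the state it was last sent."""

    def __init__(self):
        self.raster_discard = False
        self.cleared = 0

    def bind_rasterizer_state(self, raster_discard):
        self.raster_discard = raster_discard

    def clear(self):
        # Mirrors a host glClear(): ignored while rasterizer discard is on.
        if not self.raster_discard:
            self.cleared += 1


class Guest:
    """Hypothetical guest state tracker with lazily flushed state."""

    def __init__(self, host, validate_on_clear):
        self.host = host
        self.validate_on_clear = validate_on_clear
        self.raster_discard = False
        self.dirty = False

    def set_raster_discard(self, on):
        self.raster_discard = on
        self.dirty = True  # recorded locally, not yet sent to the host

    def _validate(self):
        if self.dirty:
            self.host.bind_rasterizer_state(self.raster_discard)
            self.dirty = False

    def draw(self):
        self._validate()  # state is flushed before every draw

    def clear(self):
        if self.raster_discard:
            return  # early exit on the guest, as in the first glClear()
        if self.validate_on_clear:
            self._validate()  # the behaviour the RFC patch proposes
        self.host.clear()


def run_sequence(validate_on_clear):
    """Replay enable/clear/draw/disable/clear; return clears seen by the host."""
    host = Host()
    guest = Guest(host, validate_on_clear)
    guest.set_raster_discard(True)
    guest.clear()
    guest.draw()
    guest.set_raster_discard(False)
    guest.clear()
    return host.cleared
```

In this model the final clear is lost when state is only validated on draws, and it reaches the host once clear also validates.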

Anyway, I happily take every hint on how this can be fixed directly in
virgl_clear or virgl in general, and thank you for taking the time to
answer me,

best,
Gert


[Mesa-dev] [PATCH 4/7] swr/rast: Add ConcatLists()

2018-05-04 Thread George Kyriazis
for concatenating lists
---
 src/gallium/drivers/swr/rasterizer/codegen/gen_common.py | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/src/gallium/drivers/swr/rasterizer/codegen/gen_common.py 
b/src/gallium/drivers/swr/rasterizer/codegen/gen_common.py
index 44a0cc8..60b749d 100644
--- a/src/gallium/drivers/swr/rasterizer/codegen/gen_common.py
+++ b/src/gallium/drivers/swr/rasterizer/codegen/gen_common.py
@@ -32,6 +32,12 @@ from mako.template import Template
 from mako.exceptions import RichTraceback
 
 #==
+def ConcatLists(list_of_lists):
+    output = []
+    for l in list_of_lists: output += l
+    return output
+
+#==
 def MakeTmpDir(suffix=''):
     '''
     Create temporary directory for use in codegen scripts.
-- 
2.7.4
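
For illustration, the helper added by this patch can be tried on its own. The ConcatLists body below follows the patch; concat_lists_chain is not part of the patch and is only an assumed equivalent built on the standard library.

```python
import itertools


def ConcatLists(list_of_lists):
    # Same logic as in the patch: append each inner list to one flat list.
    output = []
    for l in list_of_lists:
        output += l
    return output


def concat_lists_chain(list_of_lists):
    # Hypothetical alternative with the same result, using itertools.
    return list(itertools.chain.from_iterable(list_of_lists))
```

Both flatten exactly one level of nesting, so inner lists that themselves contain lists are kept as elements.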



[Mesa-dev] [PATCH 6/7] swr/rast: Add Builder::GetVectorType()

2018-05-04 Thread George Kyriazis
---
 .../drivers/swr/rasterizer/jitter/builder.cpp  | 44 ++
 .../drivers/swr/rasterizer/jitter/builder.h|  1 +
 2 files changed, 45 insertions(+)

diff --git a/src/gallium/drivers/swr/rasterizer/jitter/builder.cpp 
b/src/gallium/drivers/swr/rasterizer/jitter/builder.cpp
index 3248735..e1c5d80 100644
--- a/src/gallium/drivers/swr/rasterizer/jitter/builder.cpp
+++ b/src/gallium/drivers/swr/rasterizer/jitter/builder.cpp
@@ -170,4 +170,48 @@ namespace SwrJit
 return (pGenIntrin->getMetadata("is_evaluate") != nullptr);
 }
 
+    //
+    /// @brief Packetizes the type. Assumes SOA conversion.
+    Type* Builder::GetVectorType(Type* pType)
+    {
+        if (pType->isVectorTy())
+        {
+            return pType;
+        }
+
+        // [N x float] should packetize to [N x <8 x float>]
+        if (pType->isArrayTy())
+        {
+            uint32_t arraySize = pType->getArrayNumElements();
+            Type* pArrayType = pType->getArrayElementType();
+            Type* pVecArrayType = GetVectorType(pArrayType);
+            Type* pVecType = ArrayType::get(pVecArrayType, arraySize);
+            return pVecType;
+        }
+
+        // {float,int} should packetize to {<8 x float>, <8 x int>}
+        if (pType->isAggregateType())
+        {
+            uint32_t numElems = pType->getStructNumElements();
+            SmallVector<Type*, 8> vecTypes;
+            for (uint32_t i = 0; i < numElems; ++i)
+            {
+                Type* pElemType = pType->getStructElementType(i);
+                Type* pVecElemType = GetVectorType(pElemType);
+                vecTypes.push_back(pVecElemType);
+            }
+            Type* pVecType = StructType::get(JM()->mContext, vecTypes);
+            return pVecType;
+        }
+
+        // [N x float]* should packetize to [N x <8 x float>]*
+        if (pType->isPointerTy() && pType->getPointerElementType()->isArrayTy())
+        {
+            return PointerType::get(GetVectorType(pType->getPointerElementType()), pType->getPointerAddressSpace());
+        }
+
+        // <ty> should packetize to <8 x <ty>>
+        Type* vecType = VectorType::get(pType, JM()->mVWidth);
+        return vecType;
+    }
 }
diff --git a/src/gallium/drivers/swr/rasterizer/jitter/builder.h 
b/src/gallium/drivers/swr/rasterizer/jitter/builder.h
index 82c5f8c..6ca128d 100644
--- a/src/gallium/drivers/swr/rasterizer/jitter/builder.h
+++ b/src/gallium/drivers/swr/rasterizer/jitter/builder.h
@@ -123,6 +123,7 @@ namespace SwrJit
 bool IsTempAlloca(Value* inst);
 bool SetTexelMaskEvaluate(Instruction* inst);
 bool IsTexelMaskEvaluate(Instruction* inst);
+Type* GetVectorType(Type* pType);
 
 #include "gen_builder.hpp"
 #include "gen_builder_meta.hpp"
-- 
2.7.4
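
The packetization rules in GetVectorType() can be modelled without LLVM. The following is a sketch in Python under stated assumptions: types are represented as plain tuples, which is a made-up encoding, and VWIDTH stands in for JM()->mVWidth with an assumed value of 8. The order of the checks follows the patch, with arrays handled before other aggregates.

```python
VWIDTH = 8  # assumed SIMD width, standing in for JM()->mVWidth


def get_vector_type(ty, width=VWIDTH):
    """Packetize a toy type description.

    Hypothetical encoding of types:
      ("scalar", name)
      ("vector", count, elem)
      ("array", count, elem)
      ("struct", (elem, ...))
      ("pointer", pointee, addrspace)
    """
    kind = ty[0]
    if kind == "vector":
        return ty  # already a vector, returned unchanged
    if kind == "array":
        _, count, elem = ty
        return ("array", count, get_vector_type(elem, width))
    if kind == "struct":
        return ("struct", tuple(get_vector_type(e, width) for e in ty[1]))
    if kind == "pointer" and ty[1][0] == "array":
        _, pointee, addrspace = ty
        return ("pointer", get_vector_type(pointee, width), addrspace)
    # Everything else, including pointers to non-arrays, becomes a vector.
    return ("vector", width, ty)
```

For example, an array of four floats maps to an array of four 8-wide float vectors, while a plain float pointer falls through to the last rule and becomes a vector of pointers.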



[Mesa-dev] [PATCH 0/7] SWR changes

2018-05-04 Thread George Kyriazis
Misc OpenSWR changes

George Kyriazis (7):
  swr/rast: Change formatting
  swr/rast: Use binner topology to assemble backend attributes
  swr/rast: Add constant initializer for uint64_t
  swr/rast: Add ConcatLists()
  swr/rast: Prepend the console output with a newline
  swr/rast: Add Builder::GetVectorType()
  swr/rast: Thread locked tiles improvement

 .../drivers/swr/rasterizer/codegen/gen_common.py   |   6 ++
 .../codegen/templates/gen_ar_eventhandlerfile.hpp  |   2 +-
 src/gallium/drivers/swr/rasterizer/core/api.cpp|  18 +++-
 src/gallium/drivers/swr/rasterizer/core/binner.cpp |   2 +-
 src/gallium/drivers/swr/rasterizer/core/context.h  |   2 +-
 .../drivers/swr/rasterizer/core/threads.cpp|   5 +-
 src/gallium/drivers/swr/rasterizer/core/threads.h  |   2 +-
 .../drivers/swr/rasterizer/core/tilemgr.cpp|  31 +++---
 src/gallium/drivers/swr/rasterizer/core/tilemgr.h  |  20 ++--
 src/gallium/drivers/swr/rasterizer/core/tileset.h  | 105 +
 .../drivers/swr/rasterizer/jitter/builder.cpp  |  44 +
 .../drivers/swr/rasterizer/jitter/builder.h|   1 +
 .../drivers/swr/rasterizer/jitter/builder_misc.cpp |   5 +
 .../drivers/swr/rasterizer/jitter/builder_misc.h   |   1 +
 14 files changed, 217 insertions(+), 27 deletions(-)
 create mode 100644 src/gallium/drivers/swr/rasterizer/core/tileset.h

-- 
2.7.4



[Mesa-dev] [PATCH 5/7] swr/rast: Prepend the console output with a newline

2018-05-04 Thread George Kyriazis
It can get jumbled with output from other threads.
---
 .../swr/rasterizer/codegen/templates/gen_ar_eventhandlerfile.hpp| 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git 
a/src/gallium/drivers/swr/rasterizer/codegen/templates/gen_ar_eventhandlerfile.hpp
 
b/src/gallium/drivers/swr/rasterizer/codegen/templates/gen_ar_eventhandlerfile.hpp
index ceded82..79612f3 100644
--- 
a/src/gallium/drivers/swr/rasterizer/codegen/templates/gen_ar_eventhandlerfile.hpp
+++ 
b/src/gallium/drivers/swr/rasterizer/codegen/templates/gen_ar_eventhandlerfile.hpp
@@ -59,7 +59,7 @@ namespace ArchRast
 outDir << KNOB_DEBUG_OUTPUT_DIR << pBaseName << "_" << pid << 
std::ends;
 mOutputDir = outDir.str();
 if (CreateDirectory(mOutputDir.c_str(), NULL)) {
-std::cout << "ArchRast Dir:  " << mOutputDir << std::endl 
<< std::endl << std::flush;
+std::cout << std::endl << "ArchRast Dir:   " << mOutputDir 
<< std::endl << std::endl << std::flush;
 }
 
 // There could be multiple threads creating thread pools. We
-- 
2.7.4



[Mesa-dev] [PATCH 3/7] swr/rast: Add constant initializer for uint64_t

2018-05-04 Thread George Kyriazis
---
 src/gallium/drivers/swr/rasterizer/jitter/builder_misc.cpp | 5 +
 src/gallium/drivers/swr/rasterizer/jitter/builder_misc.h   | 1 +
 2 files changed, 6 insertions(+)

diff --git a/src/gallium/drivers/swr/rasterizer/jitter/builder_misc.cpp 
b/src/gallium/drivers/swr/rasterizer/jitter/builder_misc.cpp
index 619a67b..231fa94 100644
--- a/src/gallium/drivers/swr/rasterizer/jitter/builder_misc.cpp
+++ b/src/gallium/drivers/swr/rasterizer/jitter/builder_misc.cpp
@@ -182,6 +182,11 @@ namespace SwrJit
 return ConstantInt::get(IRB()->getInt32Ty(), i);
 }
 
+Constant *Builder::C(uint64_t i)
+{
+return ConstantInt::get(IRB()->getInt64Ty(), i);
+}
+
 Constant *Builder::C(float i)
 {
 return ConstantFP::get(IRB()->getFloatTy(), i);
diff --git a/src/gallium/drivers/swr/rasterizer/jitter/builder_misc.h 
b/src/gallium/drivers/swr/rasterizer/jitter/builder_misc.h
index a51aad0..d7732ef 100644
--- a/src/gallium/drivers/swr/rasterizer/jitter/builder_misc.h
+++ b/src/gallium/drivers/swr/rasterizer/jitter/builder_misc.h
@@ -34,6 +34,7 @@ Constant *C(char i);
 Constant *C(uint8_t i);
 Constant *C(int i);
 Constant *C(int64_t i);
+Constant *C(uint64_t i);
 Constant *C(uint16_t i);
 Constant *C(uint32_t i);
 Constant *C(float i);
-- 
2.7.4



[Mesa-dev] [PATCH 2/7] swr/rast: Use binner topology to assemble backend attributes

2018-05-04 Thread George Kyriazis
Previously the draw topology was used, which may change if GS or Tess
are active. This only affected attributes marked with constant interpolation,
which limited the impact.
---
 src/gallium/drivers/swr/rasterizer/core/binner.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/swr/rasterizer/core/binner.cpp 
b/src/gallium/drivers/swr/rasterizer/core/binner.cpp
index 9f8dc88..7b9c20e 100644
--- a/src/gallium/drivers/swr/rasterizer/core/binner.cpp
+++ b/src/gallium/drivers/swr/rasterizer/core/binner.cpp
@@ -81,7 +81,7 @@ INLINE void ProcessAttributes(
 // Conservative Rasterization requires degenerate tris to have constant 
attribute interpolation
 uint32_t constantInterpMask = IsDegenerate::value ? 0x : 
backendState.constantInterpolationMask;
 const uint32_t provokingVertex = 
pDC->pState->state.frontendState.topologyProvokingVertex;
-const PRIMITIVE_TOPOLOGY topo = pDC->pState->state.topology;
+const PRIMITIVE_TOPOLOGY topo = pa.binTopology;
 
 static const float constTable[3][4] = {
 { 0.0f, 0.0f, 0.0f, 0.0f },
-- 
2.7.4



[Mesa-dev] [PATCH 1/7] swr/rast: Change formatting

2018-05-04 Thread George Kyriazis
---
 src/gallium/drivers/swr/rasterizer/core/api.cpp | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/swr/rasterizer/core/api.cpp 
b/src/gallium/drivers/swr/rasterizer/core/api.cpp
index a2ee85d..3458793 100644
--- a/src/gallium/drivers/swr/rasterizer/core/api.cpp
+++ b/src/gallium/drivers/swr/rasterizer/core/api.cpp
@@ -475,7 +475,12 @@ void SetupDefaultState(SWR_CONTEXT *pContext)
 pState->depthBoundsState.depthBoundsTestMaxValue = 1.0f;
 }
 
-void SwrSync(HANDLE hContext, PFN_CALLBACK_FUNC pfnFunc, uint64_t userData, 
uint64_t userData2, uint64_t userData3)
+void SWR_API SwrSync(
+HANDLE hContext,
+PFN_CALLBACK_FUNC pfnFunc,
+uint64_t userData,
+uint64_t userData2,
+uint64_t userData3)
 {
 SWR_ASSERT(pfnFunc != nullptr);
 
-- 
2.7.4



[Mesa-dev] [PATCH 7/7] swr/rast: Thread locked tiles improvement

2018-05-04 Thread George Kyriazis
- Change tilemgr TILE_ID encoding to use Morton-order (Z-order).
- Change locked tiles set to bitset.  Makes clear, set, get much faster.
---
 src/gallium/drivers/swr/rasterizer/core/api.cpp|  11 ++-
 src/gallium/drivers/swr/rasterizer/core/context.h  |   2 +-
 .../drivers/swr/rasterizer/core/threads.cpp|   5 +-
 src/gallium/drivers/swr/rasterizer/core/threads.h  |   2 +-
 .../drivers/swr/rasterizer/core/tilemgr.cpp|  31 +++---
 src/gallium/drivers/swr/rasterizer/core/tilemgr.h  |  20 ++--
 src/gallium/drivers/swr/rasterizer/core/tileset.h  | 105 +
 7 files changed, 152 insertions(+), 24 deletions(-)
 create mode 100644 src/gallium/drivers/swr/rasterizer/core/tileset.h

diff --git a/src/gallium/drivers/swr/rasterizer/core/api.cpp 
b/src/gallium/drivers/swr/rasterizer/core/api.cpp
index 3458793..47f3633 100644
--- a/src/gallium/drivers/swr/rasterizer/core/api.cpp
+++ b/src/gallium/drivers/swr/rasterizer/core/api.cpp
@@ -42,6 +42,7 @@
 #include "core/tilemgr.h"
 #include "core/clip.h"
 #include "core/utils.h"
+#include "core/tileset.h"
 
 #include "common/os.h"
 
@@ -139,6 +140,11 @@ HANDLE SwrCreateContext(
 BindApiThread(pContext, 0);
 }
 
+if (pContext->threadInfo.SINGLE_THREADED)
+{
+pContext->pSingleThreadLockedTiles = new TileSet();
+}
+
 pContext->ppScratch = new uint8_t*[pContext->NumWorkerThreads];
 pContext->pStats = (SWR_STATS*)AlignedMalloc(sizeof(SWR_STATS) * 
pContext->NumWorkerThreads, 64);
 
@@ -245,7 +251,7 @@ void QueueWork(SWR_CONTEXT *pContext)
 {
 uint32_t curDraw[2] = { pContext->pCurDrawContext->drawId, 
pContext->pCurDrawContext->drawId };
 WorkOnFifoFE(pContext, 0, curDraw[0]);
-WorkOnFifoBE(pContext, 0, curDraw[1], 
pContext->singleThreadLockedTiles, 0, 0);
+WorkOnFifoBE(pContext, 0, curDraw[1], 
*pContext->pSingleThreadLockedTiles, 0, 0);
 }
 else
 {
@@ -427,7 +433,8 @@ void SwrDestroyContext(HANDLE hContext)
 delete[] pContext->ppScratch;
 AlignedFree(pContext->pStats);
 
-delete(pContext->pHotTileMgr);
+delete pContext->pHotTileMgr;
+delete pContext->pSingleThreadLockedTiles;
 
 pContext->~SWR_CONTEXT();
 AlignedFree(GetContext(hContext));
diff --git a/src/gallium/drivers/swr/rasterizer/core/context.h 
b/src/gallium/drivers/swr/rasterizer/core/context.h
index af8f4b8..2cd61e4 100644
--- a/src/gallium/drivers/swr/rasterizer/core/context.h
+++ b/src/gallium/drivers/swr/rasterizer/core/context.h
@@ -516,7 +516,7 @@ struct SWR_CONTEXT
 
 uint32_t lastFrameChecked;
 uint64_t lastDrawChecked;
-TileSet singleThreadLockedTiles;
+TileSet* pSingleThreadLockedTiles;
 
 // ArchRast thread contexts.
 HANDLE* pArContext;
diff --git a/src/gallium/drivers/swr/rasterizer/core/threads.cpp 
b/src/gallium/drivers/swr/rasterizer/core/threads.cpp
index 9e16246..f77ae22 100644
--- a/src/gallium/drivers/swr/rasterizer/core/threads.cpp
+++ b/src/gallium/drivers/swr/rasterizer/core/threads.cpp
@@ -49,6 +49,7 @@
 #include "rasterizer.h"
 #include "rdtsc_core.h"
 #include "tilemgr.h"
+#include "tileset.h"
 
 
 
@@ -587,7 +588,7 @@ bool WorkOnFifoBE(
 }
 
 // can only work on this draw if it's not in use by other threads
-if (lockedTiles.find(tileID) != lockedTiles.end())
+if (lockedTiles.get(tileID))
 {
 continue;
 }
@@ -645,7 +646,7 @@ bool WorkOnFifoBE(
 else
 {
 // This tile is already locked. So let's add it to our locked 
tiles set. This way we don't try locking this one again.
-lockedTiles.insert(tileID);
+lockedTiles.set(tileID);
 }
 }
 }
diff --git a/src/gallium/drivers/swr/rasterizer/core/threads.h 
b/src/gallium/drivers/swr/rasterizer/core/threads.h
index cb918dd..0489a3c 100644
--- a/src/gallium/drivers/swr/rasterizer/core/threads.h
+++ b/src/gallium/drivers/swr/rasterizer/core/threads.h
@@ -62,7 +62,7 @@ struct THREAD_POOL
 THREAD_DATA *pApiThreadData;
 };
 
-typedef std::unordered_set TileSet;
+struct TileSet;
 
 void CreateThreadPool(SWR_CONTEXT *pContext, THREAD_POOL *pPool);
 void StartThreadPool(SWR_CONTEXT* pContext, THREAD_POOL* pPool);
diff --git a/src/gallium/drivers/swr/rasterizer/core/tilemgr.cpp 
b/src/gallium/drivers/swr/rasterizer/core/tilemgr.cpp
index 28fa787..1bdef4b 100644
--- a/src/gallium/drivers/swr/rasterizer/core/tilemgr.cpp
+++ b/src/gallium/drivers/swr/rasterizer/core/tilemgr.cpp
@@ -33,8 +33,6 @@
 #include "core/multisample.h"
 #include "rdtsc_core.h"
 
-#define TILE_ID(x,y) ((x << 16 | y))
-
 MacroTileMgr::MacroTileMgr(CachingArena& arena) : mArena(arena)
 {
 }
@@ -50,26 +48,35 @@ void MacroTileMgr::enqueue(uint32_t x, uint32_t y, BE_WORK 
*pWork)
 return;
 }
 
-uint32_t id = TILE_ID(x, y);
+uint32_t id = getTileId(x, y);
+
+if (id >= mTiles.size())
+  
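
The patch is cut off here, so the body of getTileId() and the contents of tileset.h are not visible in this archive. As a rough illustration of the two ideas named in the commit message (Morton-order tile IDs and a bitset of locked tiles), here is a sketch in Python. All names are hypothetical, and which coordinate lands on the even bits is an assumption.

```python
def part1by1(v):
    """Spread the low 16 bits of v onto the even bit positions."""
    v &= 0x0000FFFF
    v = (v | (v << 8)) & 0x00FF00FF
    v = (v | (v << 4)) & 0x0F0F0F0F
    v = (v | (v << 2)) & 0x33333333
    v = (v | (v << 1)) & 0x55555555
    return v


def morton_tile_id(x, y):
    """Interleave the bits of x and y (Z-order); x on even bits is assumed."""
    return part1by1(x) | (part1by1(y) << 1)


def old_tile_id(x, y):
    """The removed TILE_ID(x, y) macro: x in the high half, y in the low."""
    return (x << 16) | y


class TileBitset:
    """Toy bitset keyed by tile ID, with the set/get/clear operations."""

    def __init__(self):
        self.bits = 0

    def set(self, tile_id):
        self.bits |= 1 << tile_id

    def get(self, tile_id):
        return ((self.bits >> tile_id) & 1) == 1

    def clear(self):
        self.bits = 0
```

One consequence visible in the sketch: tile (1, 1) gets ID 3 in Morton order but 65537 with the old encoding, so Morton IDs stay small for small render targets, which is what makes indexing an array or a bitset by tile ID practical.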

[Mesa-dev] [PATCH 1/2] ac: add subgroup scan functions for SI

2018-05-04 Thread Daniel Schürmann
This functionality is currently disabled due to a bug in LLVM.
The idea of this implementation is taken from the ROCm Device Libs:
https://github.com/RadeonOpenCompute/ROCm-Device-Libs/blob/master/ockl/src/wfredscan.cl
---
 src/amd/common/ac_llvm_build.c | 98 --
 1 file changed, 82 insertions(+), 16 deletions(-)

diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c
index c9b2e36b63..8300c9d6bd 100644
--- a/src/amd/common/ac_llvm_build.c
+++ b/src/amd/common/ac_llvm_build.c
@@ -2967,26 +2967,66 @@ ac_build_alu_op(struct ac_llvm_context *ctx, 
LLVMValueRef lhs, LLVMValueRef rhs,
}
 }
 
-/* TODO: add inclusive and excluse scan functions for SI chip class.  */
 static LLVMValueRef
 ac_build_scan(struct ac_llvm_context *ctx, nir_op op, LLVMValueRef src, 
LLVMValueRef identity)
 {
LLVMValueRef result, tmp;
result = src;
-   tmp = ac_build_dpp(ctx, identity, src, dpp_row_sr(1), 0xf, 0xf, false);
-   result = ac_build_alu_op(ctx, result, tmp, op);
-   tmp = ac_build_dpp(ctx, identity, src, dpp_row_sr(2), 0xf, 0xf, false);
-   result = ac_build_alu_op(ctx, result, tmp, op);
-   tmp = ac_build_dpp(ctx, identity, src, dpp_row_sr(3), 0xf, 0xf, false);
-   result = ac_build_alu_op(ctx, result, tmp, op);
-   tmp = ac_build_dpp(ctx, identity, result, dpp_row_sr(4), 0xf, 0xe, 
false);
-   result = ac_build_alu_op(ctx, result, tmp, op);
-   tmp = ac_build_dpp(ctx, identity, result, dpp_row_sr(8), 0xf, 0xc, 
false);
-   result = ac_build_alu_op(ctx, result, tmp, op);
-   tmp = ac_build_dpp(ctx, identity, result, dpp_row_bcast15, 0xa, 0xf, 
false);
-   result = ac_build_alu_op(ctx, result, tmp, op);
-   tmp = ac_build_dpp(ctx, identity, result, dpp_row_bcast31, 0xc, 0xf, 
false);
-   result = ac_build_alu_op(ctx, result, tmp, op);
+   if (ctx->chip_class >= VI) {
+   tmp = ac_build_dpp(ctx, identity, src, dpp_row_sr(1), 0xf, 0xf, 
false);
+   result = ac_build_alu_op(ctx, result, tmp, op);
+   tmp = ac_build_dpp(ctx, identity, src, dpp_row_sr(2), 0xf, 0xf, 
false);
+   result = ac_build_alu_op(ctx, result, tmp, op);
+   tmp = ac_build_dpp(ctx, identity, src, dpp_row_sr(3), 0xf, 0xf, 
false);
+   result = ac_build_alu_op(ctx, result, tmp, op);
+   tmp = ac_build_dpp(ctx, identity, result, dpp_row_sr(4), 0xf, 
0xe, false);
+   result = ac_build_alu_op(ctx, result, tmp, op);
+   tmp = ac_build_dpp(ctx, identity, result, dpp_row_sr(8), 0xf, 
0xc, false);
+   result = ac_build_alu_op(ctx, result, tmp, op);
+   tmp = ac_build_dpp(ctx, identity, result, dpp_row_bcast15, 0xa, 
0xf, false);
+   result = ac_build_alu_op(ctx, result, tmp, op);
+   tmp = ac_build_dpp(ctx, identity, result, dpp_row_bcast31, 0xc, 
0xf, false);
+   result = ac_build_alu_op(ctx, result, tmp, op);
+   } else {
+   LLVMValueRef tid = ac_get_thread_id(ctx);
+   LLVMValueRef active;
+   tmp = ac_build_ds_swizzle(ctx, src, ds_pattern_bitmode(0x1e, 
0x00, 0x00));
+   active = LLVMBuildICmp(ctx->builder, LLVMIntNE,
+  LLVMBuildAnd(ctx->builder, tid, 
ctx->i32_1, ""),
+  ctx->i32_0, "");
+   tmp = LLVMBuildSelect(ctx->builder, active, tmp, identity, "");
+   result = ac_build_alu_op(ctx, result, tmp, op);
+   tmp = ac_build_ds_swizzle(ctx, result, ds_pattern_bitmode(0x1c, 
0x01, 0x00));
+   active = LLVMBuildICmp(ctx->builder, LLVMIntNE,
+  LLVMBuildAnd(ctx->builder, tid, 
LLVMConstInt(ctx->i32, 2, 0), ""),
+  ctx->i32_0, "");
+   tmp = LLVMBuildSelect(ctx->builder, active, tmp, identity, "");
+   result = ac_build_alu_op(ctx, result, tmp, op);
+   tmp = ac_build_ds_swizzle(ctx, result, ds_pattern_bitmode(0x18, 
0x03, 0x00));
+   active = LLVMBuildICmp(ctx->builder, LLVMIntNE,
+  LLVMBuildAnd(ctx->builder, tid, 
LLVMConstInt(ctx->i32, 4, 0), ""),
+  ctx->i32_0, "");
+   tmp = LLVMBuildSelect(ctx->builder, active, tmp, identity, "");
+   result = ac_build_alu_op(ctx, result, tmp, op);
+   tmp = ac_build_ds_swizzle(ctx, result, ds_pattern_bitmode(0x10, 
0x07, 0x00));
+   active = LLVMBuildICmp(ctx->builder, LLVMIntNE,
+  LLVMBuildAnd(ctx->builder, tid, 
LLVMConstInt(ctx->i32, 8, 0), ""),
+  ctx->i32_0, "");
+   tmp = LLVMBuildSelect(ctx->builder, active, tmp, identity, "");
+   result = ac_build_alu_op(ctx, result, tmp, op);
+   tmp = 
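
The patch is truncated at this point. To show why the swizzle steps above produce an inclusive scan, here is a small simulation in Python. It is a sketch, not the Mesa code: it assumes that a bitmode swizzle lets each lane read from lane ((id & and_mask) | or_mask) ^ xor_mask within a group of 32 lanes, the first four steps use the masks visible in the diff, and the fifth step (0x00, 0x0f, lane bit 16) is an extrapolation of the pattern, not taken from the patch. Combining the two 32-lane halves of a 64-wide wave is not modelled.

```python
def ds_swizzle_bitmode(values, and_mask, or_mask, xor_mask):
    """Each lane i reads the value held by lane ((i & and) | or) ^ xor."""
    return [values[((i & and_mask) | or_mask) ^ xor_mask]
            for i in range(len(values))]


def inclusive_scan_32(src, op, identity):
    """Simulate the swizzle-based inclusive scan over 32 lanes."""
    assert len(src) == 32
    result = list(src)
    steps = [
        (0x1e, 0x00, 1),   # odd lanes read the even lane of their pair
        (0x1c, 0x01, 2),   # upper pair reads the last lane of the lower pair
        (0x18, 0x03, 4),
        (0x10, 0x07, 8),
        (0x00, 0x0f, 16),  # extrapolated step, not visible in the patch
    ]
    for and_mask, or_mask, bit in steps:
        swizzled = ds_swizzle_bitmode(result, and_mask, or_mask, 0x00)
        # Lanes whose thread ID has `bit` clear combine with the identity,
        # matching the LLVMBuildSelect on `active` in the patch.
        tmp = [swizzled[tid] if tid & bit else identity for tid in range(32)]
        result = [op(r, t) for r, t in zip(result, tmp)]
    return result
```

After step k every lane holds the inclusive prefix within its aligned block of 2^(k+1) lanes: the upper half of each block adds the total of the lower half, which sits in the lower half's last lane.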

[Mesa-dev] [PATCH 2/2] radv: enable subgroups arithmetic for VI+

2018-05-04 Thread Daniel Schürmann
---
 src/amd/vulkan/radv_device.c | 4 +++-
 src/amd/vulkan/radv_shader.c | 1 +
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index 9fe415c4e0..ec60f9a9c5 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -970,7 +970,9 @@ void radv_GetPhysicalDeviceProperties2(

VK_SUBGROUP_FEATURE_QUAD_BIT |

VK_SUBGROUP_FEATURE_SHUFFLE_BIT |

VK_SUBGROUP_FEATURE_SHUFFLE_RELATIVE_BIT |
-   
VK_SUBGROUP_FEATURE_VOTE_BIT;
+   
VK_SUBGROUP_FEATURE_VOTE_BIT |
+   
(pdevice->rad_info.chip_class >= VI && HAVE_LLVM >= 0x0600 ?
+   
VK_SUBGROUP_FEATURE_ARITHMETIC_BIT : 0 );
properties->quadOperationsInAllStages = true;
break;
}
diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c
index 27b3fbed16..953cebb7e8 100644
--- a/src/amd/vulkan/radv_shader.c
+++ b/src/amd/vulkan/radv_shader.c
@@ -213,6 +213,7 @@ radv_shader_compile_to_nir(struct radv_device *device,
.tessellation = true,
.int64 = true,
.multiview = true,
+   .subgroup_arithmetic = true,
.subgroup_ballot = true,
.subgroup_basic = true,
.subgroup_quad = true,
-- 
2.17.0



[Mesa-dev] [ANNOUNCE] mesa 18.1.0-rc3

2018-05-04 Thread Dylan Baker
Hello list,

Release candidate 3 for Mesa 18.1.0 is now available. Currently we have:
 - 16 queued
 - 0 nominated (outstanding)
 - 0 rejected patches

Lots of fixes for radv; a few for the intel compiler; and spirv, omx, and utils
each got a single fix.

Take a look at section "Mesa stable queue" for more information.


Testing reports/general approval

Any testing reports (or general approval of the state of the branch) will be
greatly appreciated.

The plan is to have 18.1.0-rc4 next Friday (2018/05/11), around or shortly after
1200 PDT. Assuming that all blocking bugs are closed and no serious regressions
are found, 18.1.0-rc4 will be the last RC before the 18.1.0 release.

If you have any questions or suggestions - be that about the current patch
queue or otherwise, please go ahead.


Trivial merge conflicts
---

No merge conflicts.

Dylan


Mesa stable queue
-

Nominated (0)
==


Queued (16)
===

Andres Rodriguez (1):
  radv/winsys: fix leaking resources from bo's imported by fd

Bas Nieuwenhuizen (2):
  radv: Allow vkEnumerateInstanceVersion ProcAddr without instance.
  radv: Don't check the incoming apiVersion on CreateInstance.

Deepak Rawat (1):
  egl/x11: Send invalidate to driver on copy_region path in swap_buffer

Dylan Baker (1):
  bump version to 18.1.0-rc3

Jason Ekstrand (1):
  anv: Allow lookup of vkEnumerateInstanceVersion without an instance

Jose Maria Casanova Crespo (2):
  intel/compiler: fix 16-bit int brw_negate_immediate and brw_abs_immediate
  intel/compiler: fix brw_imm_w for negative 16-bit integers

Juan A. Suarez Romero (1):
  autotools, meson: bump up required VA version

Leo Liu (1):
  st/omx/enc: fix blit setup for YUV LoadImage

Marek Olšák (2):
  util/u_queue: fix a deadlock in util_queue_finish
  radeonsi/gfx9: workaround for INTERP with indirect indexing

Matthew Nicholls (1):
  radv: fix multisample image copies

Nanley Chery (1):
  i965/tex_image: Avoid the ASTC LDR workaround on gen9lp

Neil Roberts (1):
  spirv: Apply OriginUpperLeft to FragCoord

Samuel Pitoiset (1):
  radv: compute the number of subpass attachments correctly



Rejected (0)
=



git tag: mesa-18.1.0-rc3

https://mesa.freedesktop.org/archive/mesa-18.1.0-rc3.tar.gz
MD5:  e2f2762276b9a97cdf57d754ace8d1db  mesa-18.1.0-rc3.tar.gz
SHA1: 05980c7a2a14c6a4f31f10d4034565db6ea36cdf  mesa-18.1.0-rc3.tar.gz
SHA256: 2d898f6e957e0a1b7646135d7fde4073d83d6feb3452662eb81d2264a85fae95  mesa-18.1.0-rc3.tar.gz
SHA512: 4f3adc0f498604ef7ebd380ba27dd5644a9d619acab35217a66141e27344ad6aaf01a8b08fe7369f8cd5a4820eb8b1a32b326246b3e783e845e74d3a49175b37  mesa-18.1.0-rc3.tar.gz
PGP:  https://mesa.freedesktop.org/archive/mesa-18.1.0-rc3.tar.gz.sig

https://mesa.freedesktop.org/archive/mesa-18.1.0-rc3.tar.xz
MD5:  e29bb19aeeb6d6024bb633bf20342a4f  mesa-18.1.0-rc3.tar.xz
SHA1: c96409a1b10ef0d1eb475f744f333ff68e0edb30  mesa-18.1.0-rc3.tar.xz
SHA256: 1f1f596f3232b9ae02c71b390593859e35cb4c60a42e776eddd5d9800e279123  mesa-18.1.0-rc3.tar.xz
SHA512: aae834672ce2ce28c9223626019b7ade5908919e9c2423ded070b215248b502dc1cc24e5c12527e28ae23ddce0b37e69c5d777ff923143c273f0a775ee215fc0  mesa-18.1.0-rc3.tar.xz
PGP:  https://mesa.freedesktop.org/archive/mesa-18.1.0-rc3.tar.xz.sig



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/1] clover: Add explicit virtual destructors to argument and scalar_argument class

2018-05-04 Thread Francisco Jerez
Jan Vesely  writes:

> These are needed to destroy the v vector.
> Fixes memory leaks on kernel launch.
> Signed-off-by: Jan Vesely 
> ---
>  src/gallium/state_trackers/clover/core/kernel.hpp | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/src/gallium/state_trackers/clover/core/kernel.hpp 
> b/src/gallium/state_trackers/clover/core/kernel.hpp
> index 4ba6ff467b..218f92cb1d 100644
> --- a/src/gallium/state_trackers/clover/core/kernel.hpp
> +++ b/src/gallium/state_trackers/clover/core/kernel.hpp
> @@ -93,6 +93,7 @@ namespace clover {
>   /// Free any resources that were allocated in bind().
>   virtual void unbind(exec_context &ctx) = 0;
>  
> + virtual ~argument() {};
>protected:
>   argument();
>  
> @@ -143,6 +144,7 @@ namespace clover {
>class scalar_argument : public argument {
>public:
>   scalar_argument(size_t size);
> + virtual ~scalar_argument() {};

This line shouldn't be necessary.  With that fixed:

Reviewed-by: Francisco Jerez 

Thanks!

>  
>   virtual void set(size_t size, const void *value);
>   virtual void bind(exec_context ,
> -- 
> 2.17.0




Re: [Mesa-dev] [PATCH 03/17] i965/miptree: Move init_mcs into alloc_aux_buffer

2018-05-04 Thread Nanley Chery
On Fri, May 04, 2018 at 10:00:32AM -0700, Nanley Chery wrote:
> On Fri, May 04, 2018 at 09:42:34AM -0700, Nanley Chery wrote:
> > On Thu, May 03, 2018 at 12:03:50PM -0700, Nanley Chery wrote:
> > > Add infrastructure for initializing the clear color BO.
> > > ---
> > >  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 68 
> > > ---
> > >  1 file changed, 31 insertions(+), 37 deletions(-)
> > > 
> > > diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
> > > b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > > index 182a896e23a..5d3ee569bd8 100644
> > > --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > > +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > > @@ -1658,41 +1658,13 @@ intel_miptree_copy_teximage(struct brw_context 
> > > *brw,
> > > intel_obj->needs_validate = true;
> > >  }
> > >  
> > > -static bool
> > > -intel_miptree_init_mcs(struct brw_context *brw,
> > > -   struct intel_mipmap_tree *mt,
> > > -   int init_value)
> > > -{
> > > -   assert(mt->aux_buf != NULL);
> > > -
> > > -   /* From the Ivy Bridge PRM, Vol 2 Part 1 p326:
> > > -*
> > > -* When MCS buffer is enabled and bound to MSRT, it is required 
> > > that it
> > > -* is cleared prior to any rendering.
> > > -*
> > > -* Since we don't use the MCS buffer for any purpose other than 
> > > rendering,
> > > -* it makes sense to just clear it immediately upon allocation.
> > > -*
> > > -* Note: the clear value for MCS buffers is all 1's, so we memset to 
> > > 0xff.
> > > -*/
> > > -   void *map = brw_bo_map(brw, mt->aux_buf->bo, MAP_WRITE | MAP_RAW);
> > > -   if (unlikely(map == NULL)) {
> > > -  fprintf(stderr, "Failed to map mcs buffer into GTT\n");
> > > -  intel_miptree_aux_buffer_free(mt->aux_buf);
> > > -  mt->aux_buf = NULL;
> > > -  return false;
> > > -   }
> > > -   void *data = map;
> > > -   memset(data, init_value, mt->aux_buf->size);
> > > -   brw_bo_unmap(mt->aux_buf->bo);
> > > -   return true;
> > > -}
> > > -
> > >  static struct intel_miptree_aux_buffer *
> > >  intel_alloc_aux_buffer(struct brw_context *brw,
> > > const char *name,
> > > const struct isl_surf *aux_surf,
> > > uint32_t alloc_flags,
> > > +   bool wants_memset,
> > > +   uint8_t memset_value,
> > > struct intel_mipmap_tree *mt)
> > >  {
> > > struct intel_miptree_aux_buffer *buf = calloc(sizeof(*buf), 1);
> > > @@ -1725,6 +1697,19 @@ intel_alloc_aux_buffer(struct brw_context *brw,
> > >return NULL;
> > > }
> > >  
> > > +   /* Initialize the bo to the desired value */
> > > +   if (wants_memset) {
> > > +  assert(!(alloc_flags & BO_ALLOC_BUSY));
> > > +
> > > +  void *map = brw_bo_map(brw, buf->bo, MAP_WRITE | MAP_RAW);
> > > +  if (map == NULL) {
> > > + intel_miptree_aux_buffer_free(buf);
> > > + return NULL;
> > > +  }
> > > +  memset(map, memset_value, mt->aux_buf->size);
> > 
> > Found a bug here. The last argument should be buf->size because
> > mt->aux_buf hasn't been assigned yet. Will fix locally.
> > 
> > -Nanley
> > 
> 
> False alarm. Sorry for the noise.
> 
> -Nanley
> 

... I was looking at the wrong code. I'll fix this.
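
For readers following the thread, here is a minimal sketch of the bug being discussed, using hypothetical stand-in types rather than the real i965 structures. Inside the allocator, the miptree's aux_buf pointer has not been assigned yet, so the memset length has to come from the local buffer (buf->size), not from mt->aux_buf->size. The struct layouts and the alloc_aux_buffer name below are assumptions made for illustration only.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the driver structures; not the real i965 types. */
struct aux_buffer {
   uint8_t *data;
   size_t size;
};

struct miptree {
   struct aux_buffer *aux_buf;   /* still NULL while the buffer is being built */
};

/* Allocate an aux buffer and optionally fill it with memset_value. */
static struct aux_buffer *
alloc_aux_buffer(struct miptree *mt, size_t size,
                 int wants_memset, uint8_t memset_value)
{
   (void) mt;   /* deliberately unused: mt->aux_buf is not valid here */

   struct aux_buffer *buf = calloc(1, sizeof(*buf));
   if (buf == NULL)
      return NULL;

   buf->data = malloc(size);
   if (buf->data == NULL) {
      free(buf);
      return NULL;
   }
   buf->size = size;

   if (wants_memset) {
      /* Use buf->size. Dereferencing mt->aux_buf->size would read through
       * a pointer the caller only assigns after this function returns.
       */
      memset(buf->data, memset_value, buf->size);
   }

   return buf;
}
```

A caller would assign the result to mt->aux_buf only after checking it for NULL, which mirrors the order of operations in the quoted patch.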


Re: [Mesa-dev] [PATCH 02/10] i965: perf: sklgt2: drop special programming for pre-production stepping

2018-05-04 Thread Kenneth Graunke
On Friday, May 4, 2018 7:52:45 AM PDT Lionel Landwerlin wrote:
> Production steppings don't need this anymore.
> 
> Signed-off-by: Lionel Landwerlin 
> ---
>  src/mesa/drivers/dri/i965/brw_oa_sklgt2.xml | 10 ++
>  1 file changed, 2 insertions(+), 8 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_oa_sklgt2.xml 
> b/src/mesa/drivers/dri/i965/brw_oa_sklgt2.xml
> index ba52a63f06d..3a528135070 100644
> --- a/src/mesa/drivers/dri/i965/brw_oa_sklgt2.xml
> +++ b/src/mesa/drivers/dri/i965/brw_oa_sklgt2.xml
> @@ -1,5 +1,5 @@
>  
> -
> +
> mdapi_supported_apis="OGL OCL IO BB"
> underscore_name="render_basic"
> @@ -818,7 +818,7 @@
> mdapi_supported_apis="OGL4 OCL IO BB"
> underscore_name="compute_basic"
> -   hw_config_guid="fe47b29d-ae51-423e-bff4-27d965a95b60"
> +   hw_config_guid="9ae1360b-8036-4af0-943a-27460801d981"
> chipset="SKLGT2"
> symbol_name="ComputeBasic"
> >
> @@ -1514,12 +1514,6 @@
>  
>  
>  
> - - availability="$SliceMask 0x02 AND $SkuRevisionId 0x02 
> ULT "
> - priority="2"
> - >
> -
> -
>  
>  
>  
> 

I cannot find any documentation at all indicating that 0x80 can be
changed to 0 on modern Skylakes...register 0x9840 only appears to be
documented on Broadwell.  At any rate, I was able to confirm that
Skylake PCI ID rev 02 appears to be preproduction, and shouldn't exist,
so I think this is fine to drop.

Reviewed-by: Kenneth Graunke 




Re: [Mesa-dev] [PATCH 01/10] i965: perf: sklgt2: update a priority for register programming

2018-05-04 Thread Kenneth Graunke
On Friday, May 4, 2018 7:52:44 AM PDT Lionel Landwerlin wrote:
> This makes no difference in terms of programming; it's just a cleanup.
> 
> Signed-off-by: Lionel Landwerlin 
> ---
>  src/mesa/drivers/dri/i965/brw_oa_sklgt2.xml | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_oa_sklgt2.xml 
> b/src/mesa/drivers/dri/i965/brw_oa_sklgt2.xml
> index 343de0c5ea9..ba52a63f06d 100644
> --- a/src/mesa/drivers/dri/i965/brw_oa_sklgt2.xml
> +++ b/src/mesa/drivers/dri/i965/brw_oa_sklgt2.xml
> @@ -1,5 +1,5 @@
>  
> -
> +
> mdapi_supported_apis="OGL OCL IO BB"
> underscore_name="render_basic"
> @@ -724,7 +724,7 @@
>   />
> availability="$SkuRevisionId 0x02 UGTE"
> - priority="1"
> + priority="0"
>   >
>  
>  
> 

Yeah...kind of annoying to have useless fields in our XML, but...
I guess if we're trying to reuse XML created by other people for
other purposes...

Reviewed-by: Kenneth Graunke 




Re: [Mesa-dev] [PATCH] intel: devinfo: fix assertion on devices with odd number of EUs

2018-05-04 Thread Kenneth Graunke
On Monday, April 23, 2018 9:52:37 AM PDT Lionel Landwerlin wrote:
> I forgot to change the assert in the second helper function in a
> previous change.
> 
> This hit the assert() on a Broadwell platform with quite a few EUs
> fused off :
>https://i.imgur.com/4Wx6tjz.png

Not sure this link will stay live forever...

Reviewed-by: Kenneth Graunke 

> Fixes: c1900f5b0fb ("intel: devinfo: add helper functions to fill fusing 
> masks values")
> Signed-off-by: Lionel Landwerlin 
> ---
>  src/intel/dev/gen_device_info.c | 6 +-
>  1 file changed, 1 insertion(+), 5 deletions(-)
> 
> diff --git a/src/intel/dev/gen_device_info.c b/src/intel/dev/gen_device_info.c
> index dfeab6e6060..aa72d96e178 100644
> --- a/src/intel/dev/gen_device_info.c
> +++ b/src/intel/dev/gen_device_info.c
> @@ -1044,11 +1044,7 @@ gen_device_info_update_from_topology(struct 
> gen_device_info *devinfo,
> for (int b = 0; b < eu_mask_len; b++)
>n_eus += __builtin_popcount(devinfo->eu_masks[b]);
> 
> -   /* We expect the total number of EUs to be uniformly distributed 
> throughout
> -* the subslices.
> -*/
> -   assert(n_subslices && (n_eus % n_subslices) == 0);
> -   devinfo->num_eu_per_subslice = n_eus / n_subslices;
> +   devinfo->num_eu_per_subslice = DIV_ROUND_UP(n_eus, n_subslices);
>  }
> 
>  bool
> --
> 2.17.0
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> 
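
As a rough illustration of why the assert was dropped, the sketch below counts EUs from a fusing mask and divides by the subslice count, rounding up. It is not the Mesa implementation: count_eus and eus_per_subslice are hypothetical helpers, DIV_ROUND_UP is redefined locally, and the bit counting is written out by hand instead of using __builtin_popcount. The caller is assumed to guarantee n_subslices is non-zero.

```c
#include <stdint.h>

/* Local definition for the sketch: integer division rounding up. */
#define DIV_ROUND_UP(a, b) (((a) + (b) - 1) / (b))

/* Count the set bits in one mask byte. */
static unsigned
count_bits(uint8_t v)
{
   unsigned n = 0;
   while (v) {
      n += v & 1u;
      v >>= 1;
   }
   return n;
}

/* Total number of enabled EUs across all mask bytes. */
static unsigned
count_eus(const uint8_t *eu_masks, int eu_mask_len)
{
   unsigned n_eus = 0;
   for (int b = 0; b < eu_mask_len; b++)
      n_eus += count_bits(eu_masks[b]);
   return n_eus;
}

/* EUs per subslice, rounded up so a part with some EUs fused off does
 * not need the total to divide evenly. n_subslices must be non-zero.
 */
static unsigned
eus_per_subslice(const uint8_t *eu_masks, int eu_mask_len,
                 unsigned n_subslices)
{
   return DIV_ROUND_UP(count_eus(eu_masks, eu_mask_len), n_subslices);
}
```

With masks 0xFF, 0x7F, 0xFF there are 23 EUs over 3 subslices. The old check (n_eus % n_subslices) == 0 fails for that input, while the rounded-up division yields 8.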





Re: [Mesa-dev] [PATCH 1/5] mesa: add support for ARB_sample_locations

2018-05-04 Thread Ian Romanick
A bunch of comments below.  This is probably the only patch in the
series that I'm going to review.

On 05/04/2018 05:09 AM, Rhys Perry wrote:
> Signed-off-by: Rhys Perry 
> ---
>  src/mapi/glapi/gen/gl_API.xml   |  52 +++
>  src/mesa/main/config.h  |   7 +
>  src/mesa/main/dd.h  |   7 +
>  src/mesa/main/extensions_table.h|   1 +
>  src/mesa/main/fbobject.c| 247 
> 
>  src/mesa/main/fbobject.h|  20 +++
>  src/mesa/main/framebuffer.c |  10 ++
>  src/mesa/main/get.c |  32 +
>  src/mesa/main/get_hash_params.py|   6 +
>  src/mesa/main/mtypes.h  |   6 +
>  src/mesa/main/multisample.c |  18 +++
>  src/mesa/main/tests/dispatch_sanity.cpp |  10 ++
>  12 files changed, 386 insertions(+), 30 deletions(-)
> 
> diff --git a/src/mapi/glapi/gen/gl_API.xml b/src/mapi/glapi/gen/gl_API.xml
> index 38c1921047..a23094a548 100644
> --- a/src/mapi/glapi/gen/gl_API.xml
> +++ b/src/mapi/glapi/gen/gl_API.xml
> @@ -10891,6 +10891,58 @@
>  
>  
>  
> +
> + value="0x933D">
> +
> +
> +
> + value="0x933E">
> +
> +
> +
> + value="0x933F">
> +
> +
> +
> + value="0x9340">
> +
> +
> +
> + value="0x8E50">
> +
> +
> +
> + value="0x9341">
> +
> +
> +
> + value="0x9342">
> +
> +
> +
> +
> + value="0x9343">
> +
> +
> +
> +
> + es2="3.1">
> +
> +
> +
> +
> +
> +
> + es2="3.1">
> +
> +
> +
> +
> +
> +
> +
> +
> +
>  
>  
>  
> diff --git a/src/mesa/main/config.h b/src/mesa/main/config.h
> index 81573bfbf2..444e4dedad 100644
> --- a/src/mesa/main/config.h
> +++ b/src/mesa/main/config.h
> @@ -315,4 +315,11 @@
>  #define MAX_CLIPPED_VERTICES ((2 * (6 + MAX_CLIP_PLANES))+1)
>  
>  
> +/** For GL_ARB_sample_locations - maximum of 
> SAMPLE_LOCATION_PIXEL_GRID_*_ARB */
> +#define MAX_SAMPLE_LOCATION_GRID_SIZE 4

Blank line here.

> +/* It is theoretically possible for Consts.MaxSamples to be >32 but
> + * other code seems to assume that is not the case */

*/ goes on its own line.

> +#define MAX_SAMPLE_LOCATION_TABLE_SIZE \
> +   (MAX_SAMPLE_LOCATION_GRID_SIZE*MAX_SAMPLE_LOCATION_GRID_SIZE*32)

Spaces around the * operators.

> +
>  #endif /* MESA_CONFIG_H_INCLUDED */
> diff --git a/src/mesa/main/dd.h b/src/mesa/main/dd.h
> index d85d89ef50..8929a2e267 100644
> --- a/src/mesa/main/dd.h
> +++ b/src/mesa/main/dd.h
> @@ -785,6 +785,13 @@ struct dd_function_table {
>GLenum target, GLsizei numAttachments,
>const GLenum *attachments);
>  
> +   /**
> +* \name Functions for GL_ARB_sample_locations
> +*/
> +   void (*GetProgrammableSampleCaps)(struct gl_context *ctx, struct 
> gl_framebuffer *fb,
> + GLuint *bits, GLuint *width, GLuint 
> *height);
> +   void (*EvaluateDepthValues)(struct gl_context *ctx, struct gl_framebuffer 
> *fb);
> +
> /**
>  * \name Query objects
>  */
> diff --git a/src/mesa/main/extensions_table.h 
> b/src/mesa/main/extensions_table.h
> index 492f7c3d20..3497cbea0e 100644
> --- a/src/mesa/main/extensions_table.h
> +++ b/src/mesa/main/extensions_table.h
> @@ -103,6 +103,7 @@ EXT(ARB_provoking_vertex, 
> EXT_provoking_vertex
>  EXT(ARB_query_buffer_object , ARB_query_buffer_object
> , GLL, GLC,  x ,  x , 2013)
>  EXT(ARB_robust_buffer_access_behavior   , 
> ARB_robust_buffer_access_behavior  , GLL, GLC,  x ,  x , 2012)
>  EXT(ARB_robustness  , dummy_true 
> , GLL, GLC,  x ,  x , 2010)
> +EXT(ARB_sample_locations, ARB_sample_locations   
> , GLL, GLC,  x , ES2, 2015)

We are strictly forbidden from exposing ARB extensions in OpenGL ES.  If
we submit conformance results for an ES implementation that exposes an
ARB extension, the submission will be rejected.  I haven't looked
closely at NV_sample_locations, but if it is implementable on the same
hardware as ARB_sample_locations we can enable that.

>  EXT(ARB_sample_shading  , ARB_sample_shading 
> , GLL, GLC,  x ,  x , 2009)
>  EXT(ARB_sampler_objects , dummy_true 
> , GLL, GLC,  x ,  x , 2009)
>  EXT(ARB_seamless_cube_map   , ARB_seamless_cube_map  
> , GLL, GLC,  x ,  x , 2009)
> diff --git a/src/mesa/main/fbobject.c b/src/mesa/main/fbobject.c
> index c72204e11a..33d7f0307d 100644
> --- a/src/mesa/main/fbobject.c
> +++ b/src/mesa/main/fbobject.c
> @@ -35,6 +35,7 @@
>  
>  #include "buffers.h"
>  #include "context.h"
> +#include "debug_output.h"
>  #include 

Re: [Mesa-dev] [PATCH 3/4] intel/compiler: Fix lower_conversions for 8-bit types.

2018-05-04 Thread Matt Turner
On Fri, May 4, 2018 at 4:10 AM, Iago Toral Quiroga  wrote:
> From: Jose Maria Casanova Crespo 
>
> For 8-bit types the execution type is word. A byte raw MOV has 16-bit
> execution type and 8-bit destination and it shouldn't be considered
> a conversion case. So there is no need to change alignment and enter
> in lower_conversions for these instructions.
>
> Fixes a regression in the piglit test "glsl-fs-shader-stencil-export"
> that is introduced with this patch from the Vulkan shaderInt16 series:
> 'i965/compiler: handle conversion to smaller type in the lowering
> pass for that'. The problem is caused because there is already a case
> in the driver that injects Byte instructions like this:
>
> mov(8)  g127<1>UB   g2<32,8,4>UB
>
> And the aforementioned pass was not accounting for the special
> handling of the execution size of Byte instructions. This patch
> fixes this.
>
> v2: (Jason Ekstrand)
>- Simplify is_byte_raw_mov, include reference to PRM and not
>consider B <-> UB conversions as raw movs.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106393
> ---
>  src/intel/compiler/brw_fs_lower_conversions.cpp | 21 -
>  1 file changed, 20 insertions(+), 1 deletion(-)
>
> diff --git a/src/intel/compiler/brw_fs_lower_conversions.cpp 
> b/src/intel/compiler/brw_fs_lower_conversions.cpp
> index f95b39d3e8..f6c936cf21 100644
> --- a/src/intel/compiler/brw_fs_lower_conversions.cpp
> +++ b/src/intel/compiler/brw_fs_lower_conversions.cpp
> @@ -43,6 +43,24 @@ supports_type_conversion(const fs_inst *inst) {
> }
>  }
>
> +/* From the SKL PRM Vol 2a, "Move":
> + *
> + *"A mov with the same source and destination type, no source modifier,
> + *and no saturation is a raw move. A packed byte destination region (B
> + *or UB type with HorzStride == 1 and ExecSize > 1) can only be written
> + *using raw move."
> + */

Indent the last three lines one more space to align vertically with
the first line.

> +static bool
> +is_byte_raw_mov (const fs_inst *inst)
> +{
> +   return type_sz(inst->dst.type) == 1 &&
> +  inst->opcode == BRW_OPCODE_MOV &&
> +  inst->src[0].type == inst->dst.type &&
> +  !inst->saturate &&
> +  !inst->src[0].negate &&
> +  !inst->src[0].abs;

And indent these to align vertically as well.

The first three patches are

Reviewed-by: Matt Turner 


Re: [Mesa-dev] [PATCH 2/2] loader_dri3: Variant 2: Wait for pending swaps to complete before drawable_fini.

2018-05-04 Thread Mike Lothian
 src/loader/loader_dri3_helper.c | 25 +
 1 file changed, 25 insertions(+)

diff --git a/src/loader/loader_dri3_helper.c b/src/loader/loader_dri3_helper.c
index 23729f7ecb..2b2a8d21d8 100644
--- a/src/loader/loader_dri3_helper.c
+++ b/src/loader/loader_dri3_helper.c
@@ -26,6 +26,7 @@
 #include 
 #include 

+#include 
 #include 
 #include 
 #include 
@@ -234,6 +235,10 @@ loader_dri3_drawable_fini(struct
loader_dri3_drawable *draw)
 {
int i;

+   printf("FINI: wxh = %d x %d, drawable %d eid %d recv_sbc %lu,
send_sbc %lu PENDING %lu\n",
+  draw->width, draw->height, draw->drawable, draw->eid,
draw->recv_sbc, draw->send_sbc,
+  draw->send_sbc - draw->recv_sbc);
+
draw->ext->core->destroyDrawable(draw->dri_drawable);

for (i = 0; i < ARRAY_SIZE(draw->buffers); i++) {
@@ -370,6 +375,15 @@ dri3_handle_present_event(struct
loader_dri3_drawable *draw,
* checking for wrap.
*/
   if (ce->kind == XCB_PRESENT_COMPLETE_KIND_PIXMAP) {
+ /* Filter out orphan events sent for a previous incarnation
of draw. */
+ if (!(draw->send_sbc & 0xLL) &&
+ ce->serial > draw->send_sbc) {
+printf("ORPHAN-C: %d x %d, drawable %d: recv %u vs send_sbc %lu\n",
+   draw->width, draw->height, draw->drawable, ce->serial,
+   draw->send_sbc);
+break;
+ }
+
  draw->recv_sbc = (draw->send_sbc & 0xLL) | ce->serial;
  if (draw->recv_sbc > draw->send_sbc)
 draw->recv_sbc -= 0x1;
@@ -415,6 +429,15 @@ dri3_handle_present_event(struct
loader_dri3_drawable *draw,
   xcb_present_idle_notify_event_t *ie = (void *) ge;
   int b;

+  /* Filter out orphan events sent for a previous incarnation of draw. */
+  if (!(draw->send_sbc & 0xLL) &&
+  ie->serial > draw->send_sbc) {
+ printf("ORPHAN-I: %d x %d, drawable %d: recv %u vs send_sbc %lu\n",
+draw->width, draw->height, draw->drawable, ie->serial,
+draw->send_sbc);
+ break;
+  }
+
   for (b = 0; b < ARRAY_SIZE(draw->buffers); b++) {
  struct loader_dri3_buffer *buf = draw->buffers[b];

@@ -1432,6 +1455,8 @@ dri3_update_drawable(__DRIdrawable *driDrawable,
  xcb_unregister_for_special_event(draw->conn, draw->special_event);
  draw->special_event = NULL;
   }
+
+  printf("INIT: wxh = %d x %d, drawable %d eid %d\n",
draw->width, draw->height, draw->drawable, draw->eid);
}
dri3_flush_present_events(draw);
mtx_unlock(&draw->mtx);
-- 
2.17.0

On 4 May 2018 at 17:54, Mario Kleiner  wrote:
> On Fri, May 4, 2018 at 6:45 PM, Mike Lothian  wrote:
>> Hi
>>
>> The first hunk doesn't apply, the other 3 gives this with GCC 8.1
>>
>
> Oops, the perils of applying debug patches on top of debug patches...
>
> Can you add a...
>
> #include <stdio.h>
>
> at the top of the file, e.g, after '#include 
>
>  #include 
>  #include 
>  #include 
> -
> +#include <stdio.h>
>  #include 
>  #include 
>  #include 
>
> Then it should compile, albeit with some format warnings, but those
> shouldn't affect the outcome.
> -mario
>
>>
>> ../mesa-/src/loader/loader_dri3_helper.c: In function
>> ‘dri3_handle_present_event’:
>> ../mesa-/src/loader/loader_dri3_helper.c:376:13: error: implicit
>> declaration of function ‘printf’
>> [-Werror=implicit-function-declaration]
>>  printf("ORPHAN-C: %d x %d, drawable %d: recv %u vs send_sbc 
>> %lu\n",
>>  ^~
>> ../mesa-/src/loader/loader_dri3_helper.c:376:13: warning:
>> incompatible implicit declaration of built-in function ‘printf’
>> ../mesa-/src/loader/loader_dri3_helper.c:376:13: note: include
>> ‘<stdio.h>’ or provide a declaration of ‘printf’
>> ../mesa-/src/loader/loader_dri3_helper.c:39:1:
>> +#include <stdio.h>
>>
>> ../mesa-/src/loader/loader_dri3_helper.c:376:13:
>>  printf("ORPHAN-C: %d x %d, drawable %d: recv %u vs send_sbc 
>> %lu\n",
>>  ^~
>> ../mesa-/src/loader/loader_dri3_helper.c:376:75: warning: format
>> ‘%lu’ expects argument of type ‘long unsigned int’, but argument 6 has
>> type ‘uint64_t’ {aka ‘long long unsigned int’} [-Wformat=]
>>  printf("ORPHAN-C: %d x %d, drawable %d: recv %u vs send_sbc 
>> %lu\n",
>>  ~~^
>>  %llu
>> ../mesa-/src/loader/loader_dri3_helper.c:378:20:
>> draw->send_sbc);
>> ~~
>> ../mesa-/src/loader/loader_dri3_helper.c:430:10: warning:
>> incompatible implicit declaration of built-in function ‘printf’
>>   printf("ORPHAN-I: %d x %d, drawable %d: recv %u vs send_sbc %lu\n",
>>   ^~
>> ../mesa-/src/loader/loader_dri3_helper.c:430:10: note: include
>> ‘<stdio.h>’ or provide a 

Re: [Mesa-dev] [PATCH] mesa/main/readpix: Correct handling of packed floating point values

2018-05-04 Thread Ian Romanick
On 04/27/2018 09:04 AM, Gert Wollny wrote:
> From: Gert Wollny 
> 
> Make sure that clamping in the pixel transfer operations is enabled/disabled
> for packed floating point values just like it is done for single normal and
> half precision floating point values.
> 
> This fixes a series of CTS tests with virgl that use r11f_g11f_b10f
> buffers as target, and where virglrenderer reads these surfaces back 
> using the format GL_UNSIGNED_INT_10F_11F_11F_REV.
> 
> Signed-off-by: Gert Wollny 
> ---
> PS: I don't have write permissions to mesa git
> 
>  src/mesa/main/readpix.c | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/src/mesa/main/readpix.c b/src/mesa/main/readpix.c
> index a4eea3043d..c5fc66988b 100644
> --- a/src/mesa/main/readpix.c
> +++ b/src/mesa/main/readpix.c
> @@ -106,7 +106,8 @@ _mesa_get_readpixels_transfer_ops(const struct gl_context 
> *ctx,
>/* For blit-based ReadPixels packing, the clamping is done 
> automatically
> * unless the type is float. */
>if (_mesa_get_clamp_read_color(ctx, ctx->ReadBuffer) &&
> -  (type == GL_FLOAT || type == GL_HALF_FLOAT)) {
> +  (type == GL_FLOAT || type == GL_HALF_FLOAT ||
> +   type == GL_UNSIGNED_INT_10F_11F_11F_REV)) {

I was thinking that maybe we needed to handle another type or two here,
but I verified that those types (e.g., the 999e5 type) aren't possible.
I'm going to run this through Intel's CI, then push it.

Reviewed-by: Ian Romanick 

>   transferOps |= IMAGE_CLAMP_BIT;
>}
> }
> @@ -114,7 +115,8 @@ _mesa_get_readpixels_transfer_ops(const struct gl_context 
> *ctx,
>/* For CPU-based ReadPixels packing, the clamping must always be done
> * for non-float types, */
>if (_mesa_get_clamp_read_color(ctx, ctx->ReadBuffer) ||
> -  (type != GL_FLOAT && type != GL_HALF_FLOAT)) {
> +  (type != GL_FLOAT && type != GL_HALF_FLOAT &&
> +   type != GL_UNSIGNED_INT_10F_11F_11F_REV)) {
>   transferOps |= IMAGE_CLAMP_BIT;
>}
> }
> 
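
To make the two conditions in the quoted patch easier to compare, here is a small sketch of the clamp decision as a pure function. It is an illustration under assumptions: needs_clamp and is_float_type are hypothetical names, the blit/CPU split is passed in as a flag, and the GL enum values are copied from the standard OpenGL headers.

```c
#include <stdbool.h>

/* Enum values as defined in the OpenGL headers. */
#define GL_UNSIGNED_BYTE                0x1401
#define GL_FLOAT                        0x1406
#define GL_HALF_FLOAT                   0x140B
#define GL_UNSIGNED_INT_10F_11F_11F_REV 0x8C3B

/* Types whose components are floating point, including the packed
 * 10F_11F_11F format that the patch adds to both checks.
 */
static bool
is_float_type(unsigned type)
{
   return type == GL_FLOAT ||
          type == GL_HALF_FLOAT ||
          type == GL_UNSIGNED_INT_10F_11F_11F_REV;
}

/* Blit-based packing clamps automatically unless the type is float, so
 * clamping is requested only for float types when read clamping is on.
 * CPU-based packing must always clamp non-float types.
 */
static bool
needs_clamp(bool blit_based, bool clamp_read_color, unsigned type)
{
   if (blit_based)
      return clamp_read_color && is_float_type(type);
   return clamp_read_color || !is_float_type(type);
}
```

With read clamping disabled, a packed-float readback on the CPU path is left unclamped, which is the behaviour the patch is after for the r11f_g11f_b10f case.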



Re: [Mesa-dev] [PATCH 10/10] i965: Require softpin support for Cannonlake and later.

2018-05-04 Thread Kenneth Graunke
On Friday, May 4, 2018 2:10:19 AM PDT Daniel Vetter wrote:
> On Fri, May 04, 2018 at 11:07:45AM +0200, Daniel Vetter wrote:
> > On Fri, May 04, 2018 at 01:28:02AM -0700, Kenneth Graunke wrote:
> > > On Friday, May 4, 2018 1:16:29 AM PDT Kenneth Graunke wrote:
> > > > On Friday, May 4, 2018 12:39:12 AM PDT Chris Wilson wrote:
> > > > > Quoting Kenneth Graunke (2018-05-04 08:34:07)
> > > > > > On Thursday, May 3, 2018 11:53:24 PM PDT Chris Wilson wrote:
> > > > > > > Quoting Kenneth Graunke (2018-05-04 02:12:40)
> > > > > > > > This isn't strictly necessary, but anyone running Cannonlake 
> > > > > > > > will
> > > > > > > > already have Kernel 4.5 or later, so there's no reason to 
> > > > > > > > support
> > > > > > > > the relocation model on Gen10+.
> > > > > > > 
> > > > > > > /o\ gvt. Need I say more?
> > > > > > > -Chris
> > > > > > 
> > > > > > Yes.  What's the deal with GVT?
> > > > > 
> > > > > Their current restrictions involve forcing the use of a 32b
> > > > > aliasing-ppgtt. Not that they support cnl+ yet, so they might remember
> > > > > to lift that restriction in time.
> > > > > -Chris
> > > > 
> > > > Wow, that's really miserable.  So, we can't actually depend on real
> > > > PPGTT existing?  Do you know if/when they might fix this?
> > > > 
> > > > This seriously wrecks a lot of my plans if we can't assume PPGTT
> > > > and have to deal with relocations for all of eternity.  Jason and I
> > > > have been planning on doing PPGTT-only drivers for months.  We figured
> > > > that full PPGTT had been working since Gen8 and surely would be viable
> > > > on anything modern.  If it weren't for enterprise kernels, I would have
> > > > required this all the way back to Gen8 in a heartbeat.
> > > > 
> > > > --Ken
> > > 
> > > Okay, whew, it looks like we were wrong.  There's a bit of confusing
> > > code in the kernel:
> > > 
> > > if (intel_vgpu_active(dev_priv)) {
> > > /* GVT-g has no support for 32bit ppgtt */
> > > has_full_ppgtt = false;
> > > has_full_48bit_ppgtt = 
> > > intel_vgpu_has_full_48bit_ppgtt(dev_priv);
> > > }
> > > 
> > > But Joonas explained that this means GVT-g does support full PPGTT
> > > with 48-bit addresses, it just never did the 32-bit only thing.
> > > 
> > > So, I think we're fine here after all.
> > 
> > Oh indeed. But only fixed in 4.14 by:
> > 
> > commit 6b3816d69628becb7ff35978aa0751798b4a940a
> > Author: Tina Zhang 
> > Date:   Mon Aug 14 15:24:14 2017 +0800
> > 
> > drm/i915/gvt: Fix guest i915 full ppgtt blocking issue
> > 
> > Maybe note that somewhere in your huge table of "stuff the kernel
> > supports" ...
> 
> Also, for your patch here this means you're fine - since cnl support isn't
> merged yet, but will have full 48bit ppgtt support (since it's there
> already) I think it's safe to require it.
> -Daniel

Yeah, I agree.  I've added a note above this condition:

+ /* Softpin landed in 4.5, but GVT used an aliasing PPGTT until
+  * kernel commit 6b3816d69628becb7ff35978aa0751798b4a940a in
+  * 4.14.  Gen10+ GVT hasn't landed yet, so it's not actually a
+  * problem - but extending this requirement back to earlier gens
+  * might actually mean requiring 4.14.
+  */

That way, if we ever are thinking about bumping kernel requirements to
4.5 and requiring softpin support on Gen8+, we'll remember that GVT
actually needs 4.14, and think twice :)




Re: [Mesa-dev] [PATCH 03/17] i965/miptree: Move init_mcs into alloc_aux_buffer

2018-05-04 Thread Nanley Chery
On Fri, May 04, 2018 at 09:42:34AM -0700, Nanley Chery wrote:
> On Thu, May 03, 2018 at 12:03:50PM -0700, Nanley Chery wrote:
> > Add infrastructure for initializing the clear color BO.
> > ---
> >  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 68 
> > ---
> >  1 file changed, 31 insertions(+), 37 deletions(-)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
> > b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > index 182a896e23a..5d3ee569bd8 100644
> > --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > @@ -1658,41 +1658,13 @@ intel_miptree_copy_teximage(struct brw_context *brw,
> > intel_obj->needs_validate = true;
> >  }
> >  
> > -static bool
> > -intel_miptree_init_mcs(struct brw_context *brw,
> > -   struct intel_mipmap_tree *mt,
> > -   int init_value)
> > -{
> > -   assert(mt->aux_buf != NULL);
> > -
> > -   /* From the Ivy Bridge PRM, Vol 2 Part 1 p326:
> > -*
> > -* When MCS buffer is enabled and bound to MSRT, it is required 
> > that it
> > -* is cleared prior to any rendering.
> > -*
> > -* Since we don't use the MCS buffer for any purpose other than 
> > rendering,
> > -* it makes sense to just clear it immediately upon allocation.
> > -*
> > -* Note: the clear value for MCS buffers is all 1's, so we memset to 
> > 0xff.
> > -*/
> > -   void *map = brw_bo_map(brw, mt->aux_buf->bo, MAP_WRITE | MAP_RAW);
> > -   if (unlikely(map == NULL)) {
> > -  fprintf(stderr, "Failed to map mcs buffer into GTT\n");
> > -  intel_miptree_aux_buffer_free(mt->aux_buf);
> > -  mt->aux_buf = NULL;
> > -  return false;
> > -   }
> > -   void *data = map;
> > -   memset(data, init_value, mt->aux_buf->size);
> > -   brw_bo_unmap(mt->aux_buf->bo);
> > -   return true;
> > -}
> > -
> >  static struct intel_miptree_aux_buffer *
> >  intel_alloc_aux_buffer(struct brw_context *brw,
> > const char *name,
> > const struct isl_surf *aux_surf,
> > uint32_t alloc_flags,
> > +   bool wants_memset,
> > +   uint8_t memset_value,
> > struct intel_mipmap_tree *mt)
> >  {
> > struct intel_miptree_aux_buffer *buf = calloc(sizeof(*buf), 1);
> > @@ -1725,6 +1697,19 @@ intel_alloc_aux_buffer(struct brw_context *brw,
> >return NULL;
> > }
> >  
> > +   /* Initialize the bo to the desired value */
> > +   if (wants_memset) {
> > +  assert(!(alloc_flags & BO_ALLOC_BUSY));
> > +
> > +  void *map = brw_bo_map(brw, buf->bo, MAP_WRITE | MAP_RAW);
> > +  if (map == NULL) {
> > + intel_miptree_aux_buffer_free(buf);
> > + return NULL;
> > +  }
> > +  memset(map, memset_value, mt->aux_buf->size);
> 
> Found a bug here. The last argument should be buf->size because
> mt->aux_buf hasn't been assigned yet. Will fix locally.
> 
> -Nanley
> 

False alarm. Sorry for the noise.

-Nanley

> > +  brw_bo_unmap(buf->bo);
> > +   }
> > +
> > if (devinfo->gen >= 10) {
> >buf->clear_color_bo = buf->bo;
> >brw_bo_reference(buf->clear_color_bo);
> > @@ -1763,10 +1748,19 @@ intel_miptree_alloc_mcs(struct brw_context *brw,
> >  * to be just used by the GPU.
> >  */
> > const uint32_t alloc_flags = 0;
> > -   mt->aux_buf = intel_alloc_aux_buffer(brw, "mcs-miptree",
> > -_mcs_surf, alloc_flags, mt);
> > -   if (!mt->aux_buf ||
> > -   !intel_miptree_init_mcs(brw, mt, 0xFF)) {
> > +   /* From the Ivy Bridge PRM, Vol 2 Part 1 p326:
> > +*
> > +* When MCS buffer is enabled and bound to MSRT, it is required 
> > that it
> > +* is cleared prior to any rendering.
> > +*
> > +* Since we don't use the MCS buffer for any purpose other than 
> > rendering,
> > +* it makes sense to just clear it immediately upon allocation.
> > +*
> > +* Note: the clear value for MCS buffers is all 1's, so we memset to 
> > 0xff.
> > +*/
> > +   mt->aux_buf = intel_alloc_aux_buffer(brw, "mcs-miptree", _mcs_surf,
> > +alloc_flags, true, 0xFF, mt);
> > +   if (!mt->aux_buf) {
> >free(aux_state);
> >return false;
> > }
> > @@ -1810,7 +1804,7 @@ intel_miptree_alloc_ccs(struct brw_context *brw,
> >  * bits in the aux buffer.
> >  */
> > mt->aux_buf = intel_alloc_aux_buffer(brw, "ccs-miptree", _ccs_surf,
> > -BO_ALLOC_ZEROED, mt);
> > +BO_ALLOC_ZEROED, false, 0, mt);
> > if (!mt->aux_buf) {
> >free(aux_state);
> >return false;
> > @@ -1876,8 +1870,8 @@ intel_miptree_alloc_hiz(struct brw_context *brw,
> > assert(ok);
> >  
> > const uint32_t alloc_flags = BO_ALLOC_BUSY;
> > -   mt->aux_buf = 

Re: [Mesa-dev] [PATCH 2/2] loader_dri3: Variant 2: Wait for pending swaps to complete before drawable_fini.

2018-05-04 Thread Mario Kleiner
On Fri, May 4, 2018 at 6:45 PM, Mike Lothian  wrote:
> Hi
>
> The first hunk doesn't apply, the other 3 gives this with GCC 8.1
>

Oops, the perils of applying debug patches on top of debug patches...

Can you add a...

#include <stdio.h>

at the top of the file, e.g, after '#include 

 #include 
 #include 
 #include 
-
+#include <stdio.h>
 #include 
 #include 
 #include 

Then it should compile, albeit with some format warnings, but those
shouldn't affect the outcome.
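
As a side note on those format warnings: a minimal sketch, assuming the debug
printouts were to be kept around, of how a uint64_t counter can be printed
without tripping -Wformat on either 32-bit or 64-bit builds. The helper name
format_sbc is hypothetical and not part of Mesa; it only illustrates the
PRIu32/PRIu64 macros from <cinttypes> (<inttypes.h> in C).

```cpp
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper (not Mesa code): formats a 32-bit received serial and
// a 64-bit swap count using the fixed-width format macros, so the format
// string matches the argument types regardless of how uint64_t is defined.
std::string format_sbc(std::uint32_t recv, std::uint64_t send_sbc)
{
    char buf[64];
    std::snprintf(buf, sizeof(buf),
                  "recv %" PRIu32 " vs send_sbc %" PRIu64, recv, send_sbc);
    return std::string(buf);
}
```

The same macros work directly inside a printf() call; the string-returning
wrapper is only there to make the sketch easy to check.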
-mario

>
> ../mesa-/src/loader/loader_dri3_helper.c: In function
> ‘dri3_handle_present_event’:
> ../mesa-/src/loader/loader_dri3_helper.c:376:13: error: implicit
> declaration of function ‘printf’
> [-Werror=implicit-function-declaration]
>  printf("ORPHAN-C: %d x %d, drawable %d: recv %u vs send_sbc 
> %lu\n",
>  ^~
> ../mesa-/src/loader/loader_dri3_helper.c:376:13: warning:
> incompatible implicit declaration of built-in function ‘printf’
> ../mesa-/src/loader/loader_dri3_helper.c:376:13: note: include
> ‘<stdio.h>’ or provide a declaration of ‘printf’
> ../mesa-/src/loader/loader_dri3_helper.c:39:1:
> +#include <stdio.h>
>
> ../mesa-/src/loader/loader_dri3_helper.c:376:13:
>  printf("ORPHAN-C: %d x %d, drawable %d: recv %u vs send_sbc 
> %lu\n",
>  ^~
> ../mesa-/src/loader/loader_dri3_helper.c:376:75: warning: format
> ‘%lu’ expects argument of type ‘long unsigned int’, but argument 6 has
> type ‘uint64_t’ {aka ‘long long unsigned int’} [-Wformat=]
>  printf("ORPHAN-C: %d x %d, drawable %d: recv %u vs send_sbc 
> %lu\n",
>  ~~^
>  %llu
> ../mesa-/src/loader/loader_dri3_helper.c:378:20:
> draw->send_sbc);
> ~~
> ../mesa-/src/loader/loader_dri3_helper.c:430:10: warning:
> incompatible implicit declaration of built-in function ‘printf’
>   printf("ORPHAN-I: %d x %d, drawable %d: recv %u vs send_sbc %lu\n",
>   ^~
> ../mesa-/src/loader/loader_dri3_helper.c:430:10: note: include
> ‘<stdio.h>’ or provide a declaration of ‘printf’
> ../mesa-/src/loader/loader_dri3_helper.c:430:72: warning: format
> ‘%lu’ expects argument of type ‘long unsigned int’, but argument 6 has
> type ‘uint64_t’ {aka ‘long long unsigned int’} [-Wformat=]
>   printf("ORPHAN-I: %d x %d, drawable %d: recv %u vs send_sbc %lu\n",
>   ~~^
>   %llu
> ../mesa-/src/loader/loader_dri3_helper.c:432:17:
>  draw->send_sbc);
>  ~~
> ../mesa-/src/loader/loader_dri3_helper.c: In function
> ‘dri3_update_drawable’:
> ../mesa-/src/loader/loader_dri3_helper.c:1454:7: warning:
> incompatible implicit declaration of built-in function ‘printf’
>printf("INIT: wxh = %d x %d, drawable %d eid %d\n",
> draw->width, draw->height, draw->drawable, draw->eid);
>^~
> ../mesa-/src/loader/loader_dri3_helper.c:1454:7: note: include
> ‘<stdio.h>’ or provide a declaration of ‘printf’
> cc1: some warnings being treated as errors
>
> Cheers
>
> Mike
>
> On 4 May 2018 at 14:45, Mario Kleiner  wrote:
>> See previous patch in series for explanation of the problem.
>>
>> This method avoids a blocking loader_dri3_swapbuffer_barrier() call
>> whenever a GL context's drawables are changed via glXMakeCurrent et al.
>>
>> Instead it filters out the "orphaned" PresentNotify events from
>> previous incarnations of the loader_dri3_drawable. This should deal
>> correctly with PixmapInvalidate, PixmapPresentCompleteNotify and
>> MscCompleteNotify events, but i don't know a way to filter out
>> WindowConfigureNotify events, or if it even matters to filter them.
>>
>> This PoC one is only meaningful if the first patch is omitted, and
>> shows the spurious "ORPHAN" printouts which would hang KDE plasmashell
>> if not filtered out.
>>
>> Test from a terminal: killall plasmashell; plasmashell
>> Wiggle the mouse around, click etc. on the KDE taskbar, K-Menu,
>> system tray icons, trigger volume/brightness feedback widgets
>> to provoke the occasional ORPHAN event.
>>
>> Signed-off-by: Mario Kleiner 
>> Cc: xorg-de...@lists.x.org
>> Cc: dan...@fooishbar.org
>> Cc: eero.t.tammi...@intel.com
>> Cc: m...@fireburn.co.uk
>> ---
>>  src/loader/loader_dri3_helper.c | 24 
>>  1 file changed, 24 insertions(+)
>>
>> diff --git a/src/loader/loader_dri3_helper.c 
>> b/src/loader/loader_dri3_helper.c
>> index 7bd79af..123a996 100644
>> --- a/src/loader/loader_dri3_helper.c
>> +++ b/src/loader/loader_dri3_helper.c
>> @@ -234,6 +234,10 @@ loader_dri3_drawable_fini(struct loader_dri3_drawable 
>> *draw)
>>  {
>> int i;
>>
>> +   printf("FINI: wxh = %d 

Re: [Mesa-dev] [PATCH 1/2] loader_dri3: Wait for pending swaps to complete before drawable_fini.

2018-05-04 Thread Mike Lothian
Hi

Yes I've not seen any freezes with Plasmashell, so a big improvement

Had issues with patch 2

Cheers

Mike

On 4 May 2018 at 17:36, Mario Kleiner  wrote:
> On Fri, May 4, 2018 at 6:31 PM, Mike Lothian  wrote:
>> Hi
>>
>> I'm still seeing the freeze in the Steam client with this patch
>>
>> Just about to test the other one
>>
>
> Thanks for testing. So the plasmashell hang is gone, right?
> Maybe the steam issue is a different bug. With *only* patch 2/2
> applied, seeing ORPHAN printouts would hint at a similar problem,
> absence of them would probably mean something else.
>
> -mario
>
>> Mike
>>
>> On 4 May 2018 at 17:17, Mike Lothian  wrote:
>>> Hi Mario
>>>
>>> Again thanks for looking into this issue :D
>>>
>>> Tested-by: Mike Lothian 
>>>
>>> I'll give the other patch a whirl now
>>>
>>> Cheers
>>>
>>> Mike
>>>
>>> On 4 May 2018 at 14:45, Mario Kleiner  wrote:
>>>> Before destroying the loader_dri3_drawable, make sure all pending
>>>> swaps for it have completed. This guards against the following scenario,
>>>> which happens, e.g., with KDE Plasma-5's plasmashell (which uses
>>>> QT-5's QtGui/QtQuick for rendering), when it repaints multiple
>>>> UI elements, each represented by its own Window/GLXDrawable, using
>>>> one common GLXContext for all GLXDrawable's:
>>>>
>>>> 1. glXMakeCurrent(dpy, drawable1, context);
>>>> 2. glXXX render to drawable1
>>>> 3. glXSwapBuffers(dpy, drawable1); #1
>>>> 4. glXMakeCurrent(dpy, drawable2, context);
>>>> 5. glXXX render to drawable2
>>>> 6. glXSwapBuffers(dpy, drawable2);
>>>> // While the swap #1 is still pending for drawable1:
>>>> 7. glXMakeCurrent(dpy, drawable1, context);
>>>> 8. glXXX render to drawable1
>>>> 9. glXSwapBuffers(dpy, drawable1);
>>>>
>>>> Binding a different drawable2 to the same context via glXMakeCurrent
>>>> will cause its previous drawable1 to be released (cf. dri3_bind_context
>>>> -> driReleaseDrawables), which in turn calls loader_dri3_drawable_fini().
>>>> This unselects for Present notify event delivery on the associated
>>>> X-Window and loses all dri3 related state. If drawable1 is selected for
>>>> the context again [7], a new incarnation of loader_dri3_drawable is
>>>> created in dri3_bind_context->driFetchDrawable->dri3_create_drawable->
>>>> loader_dri3_drawable_init(), which again selects for Present notify
>>>> event delivery for the underlying X-Window, but the new incarnation lost
>>>> all state wrt. previous rendering and swaps. The server now delivers
>>>> PresentPixmapIdle and PresentPixmapComplete events from the completed
>>>> previous swapbuffers call #1 [3] to the new loader_dri3_drawable, which
>>>> doesn't expect those. One problem is that the new incarnation has a
>>>> draw->send_sbc == 0, but now receives PresentPixmapComplete events with
>>>> sbc's > 0, therefore updating draw->recv_sbc to > 0 in
>>>> dri3_handle_present_event(). The draw->recv_sbc > draw->send_sbc is
>>>> misinterpreted as sbc wraparound, triggers recv_sbc wraparound handling
>>>> and ends up with a very large draw->recv_sbc. During the next swapbuffers
>>>> call [9], the totally wrong recv_sbc is used for calculating the target_msc
>>>> for the PresentPixmap request, leading to a target_msc billions of vblanks
>>>> in the future, leading to a swap that never completes and thereby frozen UI
>>>> and hang of the client.
>>>>
>>>> Make sure that a loader_dri3_drawable can only be destroyed after all
>>>> its pending swaps have completed, to prevent misdelivery of PresentNotify
>>>> events to the right X-Window, but the wrong incarnation of the associated
>>>> loader_dri3_drawable.
>>>>
>>>> Signed-off-by: Mario Kleiner 
>>>> Cc: xorg-de...@lists.x.org
>>>> Cc: dan...@fooishbar.org
>>>> Cc: eero.t.tammi...@intel.com
>>>> Cc: m...@fireburn.co.uk
>>>> ---
>>>>  src/loader/loader_dri3_helper.c | 3 +++
>>>>  1 file changed, 3 insertions(+)
>>>>
>>>> diff --git a/src/loader/loader_dri3_helper.c 
>>>> b/src/loader/loader_dri3_helper.c
>>>> index 6bb11c4..7bd79af 100644
>>>> --- a/src/loader/loader_dri3_helper.c
>>>> +++ b/src/loader/loader_dri3_helper.c
>>>> @@ -234,6 +234,9 @@ loader_dri3_drawable_fini(struct loader_dri3_drawable 
>>>> *draw)
>>>>  {
>>>> int i;
>>>>
>>>> +   if (draw->special_event)
>>>> +  loader_dri3_swapbuffer_barrier(draw);
>>>> +
>>>> draw->ext->core->destroyDrawable(draw->dri_drawable);
>>>>
>>>> for (i = 0; i < ARRAY_SIZE(draw->buffers); i++) {
>>>> --
>>>> 2.7.4



Re: [Mesa-dev] [PATCH 2/2] loader_dri3: Variant 2: Wait for pending swaps to complete before drawable_fini.

2018-05-04 Thread Mike Lothian
Hi

The first hunk doesn't apply, the other 3 gives this with GCC 8.1


../mesa-/src/loader/loader_dri3_helper.c: In function
‘dri3_handle_present_event’:
../mesa-/src/loader/loader_dri3_helper.c:376:13: error: implicit
declaration of function ‘printf’
[-Werror=implicit-function-declaration]
 printf("ORPHAN-C: %d x %d, drawable %d: recv %u vs send_sbc %lu\n",
 ^~
../mesa-/src/loader/loader_dri3_helper.c:376:13: warning:
incompatible implicit declaration of built-in function ‘printf’
../mesa-/src/loader/loader_dri3_helper.c:376:13: note: include
‘<stdio.h>’ or provide a declaration of ‘printf’
../mesa-/src/loader/loader_dri3_helper.c:39:1:
+#include <stdio.h>

../mesa-/src/loader/loader_dri3_helper.c:376:13:
 printf("ORPHAN-C: %d x %d, drawable %d: recv %u vs send_sbc %lu\n",
 ^~
../mesa-/src/loader/loader_dri3_helper.c:376:75: warning: format
‘%lu’ expects argument of type ‘long unsigned int’, but argument 6 has
type ‘uint64_t’ {aka ‘long long unsigned int’} [-Wformat=]
 printf("ORPHAN-C: %d x %d, drawable %d: recv %u vs send_sbc %lu\n",
 ~~^
 %llu
../mesa-/src/loader/loader_dri3_helper.c:378:20:
draw->send_sbc);
~~
../mesa-/src/loader/loader_dri3_helper.c:430:10: warning:
incompatible implicit declaration of built-in function ‘printf’
  printf("ORPHAN-I: %d x %d, drawable %d: recv %u vs send_sbc %lu\n",
  ^~
../mesa-/src/loader/loader_dri3_helper.c:430:10: note: include
‘<stdio.h>’ or provide a declaration of ‘printf’
../mesa-/src/loader/loader_dri3_helper.c:430:72: warning: format
‘%lu’ expects argument of type ‘long unsigned int’, but argument 6 has
type ‘uint64_t’ {aka ‘long long unsigned int’} [-Wformat=]
  printf("ORPHAN-I: %d x %d, drawable %d: recv %u vs send_sbc %lu\n",
  ~~^
  %llu
../mesa-/src/loader/loader_dri3_helper.c:432:17:
 draw->send_sbc);
 ~~
../mesa-/src/loader/loader_dri3_helper.c: In function
‘dri3_update_drawable’:
../mesa-/src/loader/loader_dri3_helper.c:1454:7: warning:
incompatible implicit declaration of built-in function ‘printf’
   printf("INIT: wxh = %d x %d, drawable %d eid %d\n",
draw->width, draw->height, draw->drawable, draw->eid);
   ^~
../mesa-/src/loader/loader_dri3_helper.c:1454:7: note: include
‘<stdio.h>’ or provide a declaration of ‘printf’
cc1: some warnings being treated as errors

Cheers

Mike

On 4 May 2018 at 14:45, Mario Kleiner  wrote:
> See previous patch in series for explanation of the problem.
>
> This method avoids a blocking loader_dri3_swapbuffer_barrier() call
> whenever a GL context's drawables are changed via glXMakeCurrent et al.
>
> Instead it filters out the "orphaned" PresentNotify events from
> previous incarnations of the loader_dri3_drawable. This should deal
> correctly with PixmapInvalidate, PixmapPresentCompleteNotify and
> MscCompleteNotify events, but i don't know a way to filter out
> WindowConfigureNotify events, or if it even matters to filter them.
>
> This PoC one is only meaningful if the first patch is omitted, and
> shows the spurious "ORPHAN" printouts which would hang KDE plasmashell
> if not filtered out.
>
> Test from a terminal: killall plasmashell; plasmashell
> Wiggle the mouse around, click etc. on the KDE taskbar, K-Menu,
> system tray icons, trigger volume/brightness feedback widgets
> to provoke the occasional ORPHAN event.
>
> Signed-off-by: Mario Kleiner 
> Cc: xorg-de...@lists.x.org
> Cc: dan...@fooishbar.org
> Cc: eero.t.tammi...@intel.com
> Cc: m...@fireburn.co.uk
> ---
>  src/loader/loader_dri3_helper.c | 24 
>  1 file changed, 24 insertions(+)
>
> diff --git a/src/loader/loader_dri3_helper.c b/src/loader/loader_dri3_helper.c
> index 7bd79af..123a996 100644
> --- a/src/loader/loader_dri3_helper.c
> +++ b/src/loader/loader_dri3_helper.c
> @@ -234,6 +234,10 @@ loader_dri3_drawable_fini(struct loader_dri3_drawable 
> *draw)
>  {
> int i;
>
> +   printf("FINI: wxh = %d x %d, drawable %d eid %d recv_sbc %lu, send_sbc 
> %lu PENDING %lu\n",
> +  draw->width, draw->height, draw->drawable, draw->eid, 
> draw->recv_sbc, draw->send_sbc,
> +  draw->send_sbc - draw->recv_sbc);
> +
> if (draw->special_event)
>loader_dri3_swapbuffer_barrier(draw);
>
> @@ -373,6 +377,15 @@ dri3_handle_present_event(struct loader_dri3_drawable 
> *draw,
> * checking for wrap.
> */
>if (ce->kind == XCB_PRESENT_COMPLETE_KIND_PIXMAP) {
> + /* Filter out orphan events sent for a previous incarnation of 
> 

Re: [Mesa-dev] [PATCH 03/17] i965/miptree: Move init_mcs into alloc_aux_buffer

2018-05-04 Thread Nanley Chery
On Thu, May 03, 2018 at 12:03:50PM -0700, Nanley Chery wrote:
> Add infrastructure for initializing the clear color BO.
> ---
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 68 
> ---
>  1 file changed, 31 insertions(+), 37 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
> b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> index 182a896e23a..5d3ee569bd8 100644
> --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> @@ -1658,41 +1658,13 @@ intel_miptree_copy_teximage(struct brw_context *brw,
> intel_obj->needs_validate = true;
>  }
>  
> -static bool
> -intel_miptree_init_mcs(struct brw_context *brw,
> -   struct intel_mipmap_tree *mt,
> -   int init_value)
> -{
> -   assert(mt->aux_buf != NULL);
> -
> -   /* From the Ivy Bridge PRM, Vol 2 Part 1 p326:
> -*
> -* When MCS buffer is enabled and bound to MSRT, it is required that 
> it
> -* is cleared prior to any rendering.
> -*
> -* Since we don't use the MCS buffer for any purpose other than rendering,
> -* it makes sense to just clear it immediately upon allocation.
> -*
> -* Note: the clear value for MCS buffers is all 1's, so we memset to 0xff.
> -*/
> -   void *map = brw_bo_map(brw, mt->aux_buf->bo, MAP_WRITE | MAP_RAW);
> -   if (unlikely(map == NULL)) {
> -  fprintf(stderr, "Failed to map mcs buffer into GTT\n");
> -  intel_miptree_aux_buffer_free(mt->aux_buf);
> -  mt->aux_buf = NULL;
> -  return false;
> -   }
> -   void *data = map;
> -   memset(data, init_value, mt->aux_buf->size);
> -   brw_bo_unmap(mt->aux_buf->bo);
> -   return true;
> -}
> -
>  static struct intel_miptree_aux_buffer *
>  intel_alloc_aux_buffer(struct brw_context *brw,
> const char *name,
> const struct isl_surf *aux_surf,
> uint32_t alloc_flags,
> +   bool wants_memset,
> +   uint8_t memset_value,
> struct intel_mipmap_tree *mt)
>  {
> struct intel_miptree_aux_buffer *buf = calloc(sizeof(*buf), 1);
> @@ -1725,6 +1697,19 @@ intel_alloc_aux_buffer(struct brw_context *brw,
>return NULL;
> }
>  
> +   /* Initialize the bo to the desired value */
> +   if (wants_memset) {
> +  assert(!(alloc_flags & BO_ALLOC_BUSY));
> +
> +  void *map = brw_bo_map(brw, buf->bo, MAP_WRITE | MAP_RAW);
> +  if (map == NULL) {
> + intel_miptree_aux_buffer_free(buf);
> + return NULL;
> +  }
> +  memset(map, memset_value, mt->aux_buf->size);

Found a bug here. The last argument should be buf->size because
mt->aux_buf hasn't been assigned yet. Will fix locally.

-Nanley

> +  brw_bo_unmap(buf->bo);
> +   }
> +
> if (devinfo->gen >= 10) {
>buf->clear_color_bo = buf->bo;
>brw_bo_reference(buf->clear_color_bo);
> @@ -1763,10 +1748,19 @@ intel_miptree_alloc_mcs(struct brw_context *brw,
>  * to be just used by the GPU.
>  */
> const uint32_t alloc_flags = 0;
> -   mt->aux_buf = intel_alloc_aux_buffer(brw, "mcs-miptree",
> -_mcs_surf, alloc_flags, mt);
> -   if (!mt->aux_buf ||
> -   !intel_miptree_init_mcs(brw, mt, 0xFF)) {
> +   /* From the Ivy Bridge PRM, Vol 2 Part 1 p326:
> +*
> +* When MCS buffer is enabled and bound to MSRT, it is required that 
> it
> +* is cleared prior to any rendering.
> +*
> +* Since we don't use the MCS buffer for any purpose other than rendering,
> +* it makes sense to just clear it immediately upon allocation.
> +*
> +* Note: the clear value for MCS buffers is all 1's, so we memset to 0xff.
> +*/
> +   mt->aux_buf = intel_alloc_aux_buffer(brw, "mcs-miptree", _mcs_surf,
> +alloc_flags, true, 0xFF, mt);
> +   if (!mt->aux_buf) {
>free(aux_state);
>return false;
> }
> @@ -1810,7 +1804,7 @@ intel_miptree_alloc_ccs(struct brw_context *brw,
>  * bits in the aux buffer.
>  */
> mt->aux_buf = intel_alloc_aux_buffer(brw, "ccs-miptree", _ccs_surf,
> -BO_ALLOC_ZEROED, mt);
> +BO_ALLOC_ZEROED, false, 0, mt);
> if (!mt->aux_buf) {
>free(aux_state);
>return false;
> @@ -1876,8 +1870,8 @@ intel_miptree_alloc_hiz(struct brw_context *brw,
> assert(ok);
>  
> const uint32_t alloc_flags = BO_ALLOC_BUSY;
> -   mt->aux_buf = intel_alloc_aux_buffer(brw, "hiz-miptree",
> -_hiz_surf, alloc_flags, mt);
> +   mt->aux_buf = intel_alloc_aux_buffer(brw, "hiz-miptree", _hiz_surf,
> +alloc_flags, false, 0, mt);
>  
> if (!mt->aux_buf) {
>free(aux_state);
> -- 
> 2.16.2
> 

Re: [Mesa-dev] [PATCH 1/2] loader_dri3: Wait for pending swaps to complete before drawable_fini.

2018-05-04 Thread Mario Kleiner
On Fri, May 4, 2018 at 6:31 PM, Mike Lothian  wrote:
> Hi
>
> I'm still seeing the freeze in the Steam client with this patch
>
> Just about to test the other one
>

Thanks for testing. So the plasmashell hang is gone, right?
Maybe the steam issue is a different bug. With *only* patch 2/2
applied, seeing ORPHAN printouts would hint at a similar problem,
absence of them would probably mean something else.

-mario

> Mike
>
> On 4 May 2018 at 17:17, Mike Lothian  wrote:
>> Hi Mario
>>
>> Again thanks for looking into this issue :D
>>
>> Tested-by: Mike Lothian 
>>
>> I'll give the other patch a whirl now
>>
>> Cheers
>>
>> Mike
>>
>> On 4 May 2018 at 14:45, Mario Kleiner  wrote:
>>> Before destroying the loader_dri3_drawable, make sure all pending
>>> swaps for it have completed. This guards against the following scenario,
>>> which happens, e.g., with KDE Plasma-5's plasmashell (which uses
>>> QT-5's QtGui/QtQuick for rendering), when it repaints multiple
>>> UI elements, each represented by its own Window/GLXDrawable, using
>>> one common GLXContext for all GLXDrawable's:
>>>
>>> 1. glXMakeCurrent(dpy, drawable1, context);
>>> 2. glXXX render to drawable1
>>> 3. glXSwapBuffers(dpy, drawable1); #1
>>> 4. glXMakeCurrent(dpy, drawable2, context);
>>> 5. glXXX render to drawable2
>>> 6. glXSwapBuffers(dpy, drawable2);
>>> // While the swap #1 is still pending for drawable1:
>>> 7. glXMakeCurrent(dpy, drawable1, context);
>>> 8. glXXX render to drawable1
>>> 9. glXSwapBuffers(dpy, drawable1);
>>>
>>> Binding a different drawable2 to the same context via glXMakeCurrent
>>> will cause its previous drawable1 to be released (cf. dri3_bind_context
>>> -> driReleaseDrawables), which in turn calls loader_dri3_drawable_fini().
>>> This unselects for Present notify event delivery on the associated
>>> X-Window and loses all dri3 related state. If drawable1 is selected for
>>> the context again [7], a new incarnation of loader_dri3_drawable is
>>> created in dri3_bind_context->driFetchDrawable->dri3_create_drawable->
>>> loader_dri3_drawable_init(), which again selects for Present notify
>>> event delivery for the underlying X-Window, but the new incarnation lost
>>> all state wrt. previous rendering and swaps. The server now delivers
>>> PresentPixmapIdle and PresentPixmapComplete events from the completed
>>> previous swapbuffers call #1 [3] to the new loader_dri3_drawable, which
>>> doesn't expect those. One problem is that the new incarnation has a
>>> draw->send_sbc == 0, but now receives PresentPixmapComplete events with
>>> sbc's > 0, therefore updating draw->recv_sbc to > 0 in
>>> dri3_handle_present_event(). The draw->recv_sbc > draw->send_sbc is
>>> misinterpreted as sbc wraparound, triggers recv_sbc wraparound handling
>>> and ends up with a very large draw->recv_sbc. During the next swapbuffers
>>> call [9], the totally wrong recv_sbc is used for calculating the target_msc
>>> for the PresentPixmap request, leading to a target_msc billions of vblanks
>>> in the future, leading to a swap that never completes and thereby frozen UI
>>> and hang of the client.
>>>
>>> Make sure that a loader_dri3_drawable can only be destroyed after all
>>> its pending swaps have completed, to prevent misdelivery of PresentNotify
>>> events to the right X-Window, but the wrong incarnation of the associated
>>> loader_dri3_drawable.
>>>
>>> Signed-off-by: Mario Kleiner 
>>> Cc: xorg-de...@lists.x.org
>>> Cc: dan...@fooishbar.org
>>> Cc: eero.t.tammi...@intel.com
>>> Cc: m...@fireburn.co.uk
>>> ---
>>>  src/loader/loader_dri3_helper.c | 3 +++
>>>  1 file changed, 3 insertions(+)
>>>
>>> diff --git a/src/loader/loader_dri3_helper.c 
>>> b/src/loader/loader_dri3_helper.c
>>> index 6bb11c4..7bd79af 100644
>>> --- a/src/loader/loader_dri3_helper.c
>>> +++ b/src/loader/loader_dri3_helper.c
>>> @@ -234,6 +234,9 @@ loader_dri3_drawable_fini(struct loader_dri3_drawable 
>>> *draw)
>>>  {
>>> int i;
>>>
>>> +   if (draw->special_event)
>>> +  loader_dri3_swapbuffer_barrier(draw);
>>> +
>>> draw->ext->core->destroyDrawable(draw->dri_drawable);
>>>
>>> for (i = 0; i < ARRAY_SIZE(draw->buffers); i++) {
>>> --
>>> 2.7.4
>>>


Re: [Mesa-dev] [PATCH 1/2] loader_dri3: Wait for pending swaps to complete before drawable_fini.

2018-05-04 Thread Mike Lothian
Hi

I'm still seeing the freeze in the Steam client with this patch

Just about to test the other one

Mike

On 4 May 2018 at 17:17, Mike Lothian  wrote:
> Hi Mario
>
> Again thanks for looking into this issue :D
>
> Tested-by: Mike Lothian 
>
> I'll give the other patch a whirl now
>
> Cheers
>
> Mike
>
> On 4 May 2018 at 14:45, Mario Kleiner  wrote:
>> Before destroying the loader_dri3_drawable, make sure all pending
>> swaps for it have completed. This guards against the following scenario,
>> which happens, e.g., with KDE Plasma-5's plasmashell (which uses
>> QT-5's QtGui/QtQuick for rendering), when it repaints multiple
>> UI elements, each represented by its own Window/GLXDrawable, using
>> one common GLXContext for all GLXDrawable's:
>>
>> 1. glXMakeCurrent(dpy, drawable1, context);
>> 2. glXXX render to drawable1
>> 3. glXSwapBuffers(dpy, drawable1); #1
>> 4. glXMakeCurrent(dpy, drawable2, context);
>> 5. glXXX render to drawable2
>> 6. glXSwapBuffers(dpy, drawable2);
>> // While the swap #1 is still pending for drawable1:
>> 7. glXMakeCurrent(dpy, drawable1, context);
>> 8. glXXX render to drawable1
>> 9. glXSwapBuffers(dpy, drawable1);
>>
>> Binding a different drawable2 to the same context via glXMakeCurrent
>> will cause its previous drawable1 to be released (cf. dri3_bind_context
>> -> driReleaseDrawables), which in turn calls loader_dri3_drawable_fini().
>> This unselects for Present notify event delivery on the associated
>> X-Window and loses all dri3 related state. If drawable1 is selected for
>> the context again [7], a new incarnation of loader_dri3_drawable is
>> created in dri3_bind_context->driFetchDrawable->dri3_create_drawable->
>> loader_dri3_drawable_init(), which again selects for Present notify
>> event delivery for the underlying X-Window, but the new incarnation lost
>> all state wrt. previous rendering and swaps. The server now delivers
>> PresentPixmapIdle and PresentPixmapComplete events from the completed
>> previous swapbuffers call #1 [3] to the new loader_dri3_drawable, which
>> doesn't expect those. One problem is that the new incarnation has a
>> draw->send_sbc == 0, but now receives PresentPixmapComplete events with
>> sbc's > 0, therefore updating draw->recv_sbc to > 0 in
>> dri3_handle_present_event(). The draw->recv_sbc > draw->send_sbc is
>> misinterpreted as sbc wraparound, triggers recv_sbc wraparound handling
>> and ends up with a very large draw->recv_sbc. During the next swapbuffers
>> call [9], the totally wrong recv_sbc is used for calculating the target_msc
>> for the PresentPixmap request, leading to a target_msc billions of vblanks
>> in the future, leading to a swap that never completes and thereby frozen UI
>> and hang of the client.
>>
>> Make sure that a loader_dri3_drawable can only be destroyed after all
>> its pending swaps have completed, to prevent misdelivery of PresentNotify
>> events to the right X-Window, but the wrong incarnation of the associated
>> loader_dri3_drawable.
>>
>> Signed-off-by: Mario Kleiner 
>> Cc: xorg-de...@lists.x.org
>> Cc: dan...@fooishbar.org
>> Cc: eero.t.tammi...@intel.com
>> Cc: m...@fireburn.co.uk
>> ---
>>  src/loader/loader_dri3_helper.c | 3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/src/loader/loader_dri3_helper.c 
>> b/src/loader/loader_dri3_helper.c
>> index 6bb11c4..7bd79af 100644
>> --- a/src/loader/loader_dri3_helper.c
>> +++ b/src/loader/loader_dri3_helper.c
>> @@ -234,6 +234,9 @@ loader_dri3_drawable_fini(struct loader_dri3_drawable 
>> *draw)
>>  {
>> int i;
>>
>> +   if (draw->special_event)
>> +  loader_dri3_swapbuffer_barrier(draw);
>> +
>> draw->ext->core->destroyDrawable(draw->dri_drawable);
>>
>> for (i = 0; i < ARRAY_SIZE(draw->buffers); i++) {
>> --
>> 2.7.4
>>


[Mesa-dev] [PATCH 1/1] clover: Add explicit virtual destructors to argument and scalar_argument class

2018-05-04 Thread Jan Vesely
These are needed to destroy the v vector.
Fixes memory leaks on kernel launch.
Signed-off-by: Jan Vesely 
---
 src/gallium/state_trackers/clover/core/kernel.hpp | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/gallium/state_trackers/clover/core/kernel.hpp 
b/src/gallium/state_trackers/clover/core/kernel.hpp
index 4ba6ff467b..218f92cb1d 100644
--- a/src/gallium/state_trackers/clover/core/kernel.hpp
+++ b/src/gallium/state_trackers/clover/core/kernel.hpp
@@ -93,6 +93,7 @@ namespace clover {
  /// Free any resources that were allocated in bind().
  virtual void unbind(exec_context ) = 0;
 
+ virtual ~argument() {};
   protected:
  argument();
 
@@ -143,6 +144,7 @@ namespace clover {
   class scalar_argument : public argument {
   public:
  scalar_argument(size_t size);
+ virtual ~scalar_argument() {};
 
  virtual void set(size_t size, const void *value);
  virtual void bind(exec_context ,
-- 
2.17.0
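
To illustrate why the virtual destructors matter: a minimal sketch, using
hypothetical stand-in classes (base_argument, vector_argument) rather than
clover's real ones. When an object is destroyed through a pointer to its base
class, the derived destructor, and with it the destructor of members such as
a std::vector, is only guaranteed to run if the base destructor is virtual.

```cpp
#include <memory>
#include <vector>

// Hypothetical stand-ins for the classes touched by the patch; this is not
// clover's code, only a model of the ownership pattern.
struct base_argument {
    virtual ~base_argument() {}       // virtual: lets delete-through-base work
    virtual void set(int value) = 0;
};

struct vector_argument : base_argument {
    explicit vector_argument(int *live) : live_count(live) { ++*live_count; }
    ~vector_argument() override { --*live_count; }  // also destroys v
    void set(int value) override { v.push_back(value); }

    std::vector<int> v;   // heap storage that must be released
    int *live_count;      // counts live objects, for the check below
};

// Creates an object, destroys it through a base-class pointer, and returns
// how many vector_argument objects are still alive afterwards.
int live_after_destroy()
{
    int live = 0;
    {
        std::unique_ptr<base_argument> arg(new vector_argument(&live));
        arg->set(42);
    }   // unique_ptr deletes via base_argument*
    return live;
}
```

With the virtual destructor in place, live_after_destroy() returns 0. Without
it, deleting through the base pointer is undefined behaviour, and in practice
the vector's storage may never be freed.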



Re: [Mesa-dev] [PATCH 1/2] loader_dri3: Wait for pending swaps to complete before drawable_fini.

2018-05-04 Thread Mike Lothian
Hi Mario

Again thanks for looking into this issue :D

Tested-by: Mike Lothian 

I'll give the other patch a whirl now

Cheers

Mike

On 4 May 2018 at 14:45, Mario Kleiner  wrote:
> Before destroying the loader_dri3_drawable, make sure all pending
> swaps for it have completed. This guards against the following scenario,
> which happens, e.g., with KDE Plasma-5's plasmashell (which uses
> QT-5's QtGui/QtQuick for rendering), when it repaints multiple
> UI elements, each represented by its own Window/GLXDrawable, using
> one common GLXContext for all GLXDrawable's:
>
> 1. glXMakeCurrent(dpy, drawable1, context);
> 2. glXXX render to drawable1
> 3. glXSwapBuffers(dpy, drawable1); #1
> 4. glXMakeCurrent(dpy, drawable2, context);
> 5. glXXX render to drawable2
> 6. glXSwapBuffers(dpy, drawable2);
> // While the swap #1 is still pending for drawable1:
> 7. glXMakeCurrent(dpy, drawable1, context);
> 8. glXXX render to drawable1
> 9. glXSwapBuffers(dpy, drawable1);
>
> Binding a different drawable2 to the same context via glXMakeCurrent
> will cause its previous drawable1 to be released (cf. dri3_bind_context
> -> driReleaseDrawables), which in turn calls loader_dri3_drawable_fini().
> This unselects for Present notify event delivery on the associated
> X-Window and loses all dri3 related state. If drawable1 is selected for
> the context again [7], a new incarnation of loader_dri3_drawable is
> created in dri3_bind_context->driFetchDrawable->dri3_create_drawable->
> loader_dri3_drawable_init(), which again selects for Present notify
> event delivery for the underlying X-Window, but the new incarnation lost
> all state wrt. previous rendering and swaps. The server now delivers
> PresentPixmapIdle and PresentPixmapComplete events from the completed
> previous swapbuffers call #1 [3] to the new loader_dri3_drawable, which
> doesn't expect those. One problem is that the new incarnation has a
> draw->send_sbc == 0, but now receives PresentPixmapComplete events with
> sbc's > 0, therefore updating draw->recv_sbc to > 0 in
> dri3_handle_present_event(). The draw->recv_sbc > draw->send_sbc is
> misinterpreted as sbc wraparound, triggers recv_sbc wraparound handling
> and ends up with a very large draw->recv_sbc. During the next swapbuffers
> call [9], the totally wrong recv_sbc is used for calculating the target_msc
> for the PresentPixmap request, leading to a target_msc billions of vblanks
> in the future, leading to a swap that never completes and thereby frozen UI
> and hang of the client.
>
> Make sure that a loader_dri3_drawable can only be destroyed after all
> its pending swaps have completed, to prevent misdelivery of PresentNotify
> events to the right X-Window, but the wrong incarnation of the associated
> loader_dri3_drawable.
>
> Signed-off-by: Mario Kleiner 
> Cc: xorg-de...@lists.x.org
> Cc: dan...@fooishbar.org
> Cc: eero.t.tammi...@intel.com
> Cc: m...@fireburn.co.uk
> ---
>  src/loader/loader_dri3_helper.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/src/loader/loader_dri3_helper.c b/src/loader/loader_dri3_helper.c
> index 6bb11c4..7bd79af 100644
> --- a/src/loader/loader_dri3_helper.c
> +++ b/src/loader/loader_dri3_helper.c
> @@ -234,6 +234,9 @@ loader_dri3_drawable_fini(struct loader_dri3_drawable 
> *draw)
>  {
> int i;
>
> +   if (draw->special_event)
> +  loader_dri3_swapbuffer_barrier(draw);
> +
> draw->ext->core->destroyDrawable(draw->dri_drawable);
>
> for (i = 0; i < ARRAY_SIZE(draw->buffers); i++) {
> --
> 2.7.4
>
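
The failure mode described in the quoted commit message can be illustrated
with a small model. This is a sketch under assumptions: the function name
reconstruct_recv_sbc is hypothetical, and the logic is a simplified guess at
how a 64-bit swap count might be rebuilt from a 32-bit event serial with
wraparound handling; it is not copied from loader_dri3_helper.c.

```cpp
#include <cstdint>

// Hypothetical model (not Mesa code): rebuild a 64-bit received swap count
// from the 32-bit serial carried by a completion event. If the result is
// ahead of what was sent, assume the low 32 bits wrapped and step back by
// 2^32. The arithmetic is unsigned, so it wraps modulo 2^64.
std::uint64_t reconstruct_recv_sbc(std::uint64_t send_sbc, std::uint32_t serial)
{
    std::uint64_t recv = (send_sbc & 0xffffffff00000000ULL) | serial;
    if (recv > send_sbc)
        recv -= 0x100000000ULL;
    return recv;
}
```

For a fresh drawable with send_sbc == 0 that receives a stale event carrying
serial 5, this model yields 0xFFFFFFFF00000005, which matches the "very large
draw->recv_sbc" described in the quoted text. In the legitimate cases the
event's serial never runs ahead of the swaps sent, so the correction only
triggers on a real wrap.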


Re: [Mesa-dev] [PATCH 2/2] i965/fs: Register allocator shoudn't use grf127 for sends dest (v2)

2018-05-04 Thread Chema Casanova
This patch is still pending review.

Also adding Cc: to stable, as it fixes some CTS issues.

Chema
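
For readers skimming the thread, the idea behind the quoted v2 patch can be
sketched roughly as follows. This is a simplified, hypothetical model: the
interference_graph type and the add_grf127_send_hack function are invented
for illustration and do not correspond to Mesa's register allocator API. Only
the gating condition (gen >= 8 and dispatch_width == 8) is taken from the
patch itself.

```cpp
#include <algorithm>
#include <set>
#include <utility>
#include <vector>

// Hypothetical, simplified interference graph: an undirected edge between
// two nodes means they may not share a register.
struct interference_graph {
    std::set<std::pair<int, int>> edges;

    void add(int a, int b)
    {
        edges.insert(std::make_pair(std::min(a, b), std::max(a, b)));
    }

    bool interferes(int a, int b) const
    {
        return edges.count(std::make_pair(std::min(a, b), std::max(a, b))) != 0;
    }
};

// Reserves one extra node standing in for grf127 and makes every send
// destination interfere with it, but only for Gen8+ in SIMD8 mode.
// Returns the new node count.
int add_grf127_send_hack(interference_graph &g,
                         const std::vector<int> &send_dst_nodes,
                         int node_count, int gen, int dispatch_width)
{
    if (gen < 8 || dispatch_width != 8)
        return node_count;              // no extra node needed

    const int grf127_node = node_count; // new node pinned to grf127
    for (int dst : send_dst_nodes)
        g.add(dst, grf127_node);
    return node_count + 1;
}
```

In this model, any node that interferes with the pinned grf127 node can never
be assigned grf127, which is the effect the patch description aims for.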


On 19/04/18 at 02:38, Jose Maria Casanova Crespo wrote:
> Since Gen8+, the Intel PRM states that "r127 must not be used for return
> address when there is a src and dest overlap in send instruction."
> 
> This patch implements this restriction by creating a new
> grf127_send_hack_node in the register allocator. This node has a fixed
> assignment to grf127.
> 
> For vgrfs that are used as the destination of send messages, we create node
> interferences with the grf127_send_hack_node, so the register allocator
> will never assign these vgrfs a register that involves grf127.
> 
> If dispatch_width > 8 we don't create these interferences, because
> all instructions already have node interferences between sources and
> destination. That is enough to avoid the r127 restriction.
> 
> This fixes CTS tests that raised this issue as they were executed as SIMD8:
>   dEQP-VK.spirv_assembly.instruction.graphics.16bit_storage.uniform_16struct_to_32struct.uniform_buffer_block_vert
>   dEQP-VK.spirv_assembly.instruction.graphics.16bit_storage.uniform_16struct_to_32struct.uniform_buffer_block_tessc
> 
> Shader-db results on Skylake:
>total instructions in shared programs: 7686798 -> 7686797 (<.01%)
>instructions in affected programs: 301 -> 300 (-0.33%)
>helped: 1
>HURT: 0
> 
>total cycles in shared programs: 337092322 -> 337091919 (<.01%)
>cycles in affected programs: 22420415 -> 22420012 (<.01%)
>helped: 712
>HURT: 588
> 
> Shader-db results on Broadwell:
> 
>total instructions in shared programs: 7658574 -> 7658625 (<.01%)
>instructions in affected programs: 19610 -> 19661 (0.26%)
>helped: 3
>HURT: 4
> 
>total cycles in shared programs: 340694553 -> 340676378 (<.01%)
>cycles in affected programs: 24724915 -> 24706740 (-0.07%)
>helped: 998
>HURT: 916
> 
>total spills in shared programs: 4300 -> 4311 (0.26%)
>spills in affected programs: 333 -> 344 (3.30%)
>helped: 1
>HURT: 3
> 
>total fills in shared programs: 5370 -> 5378 (0.15%)
>fills in affected programs: 274 -> 282 (2.92%)
>helped: 1
>HURT: 3
> 
> v2: Avoid duplicating register classes without grf127. Let's use a node
> with a fixed assignation to grf127 and create interferences to send
> message vgrf destinations. (Eric Anholt)
> ---
>  src/intel/compiler/brw_fs_reg_allocate.cpp | 25 ++
>  1 file changed, 25 insertions(+)
> 
> diff --git a/src/intel/compiler/brw_fs_reg_allocate.cpp b/src/intel/compiler/brw_fs_reg_allocate.cpp
> index ec8e116cb38..59e047483c0 100644
> --- a/src/intel/compiler/brw_fs_reg_allocate.cpp
> +++ b/src/intel/compiler/brw_fs_reg_allocate.cpp
> @@ -548,6 +548,9 @@ fs_visitor::assign_regs(bool allow_spilling, bool spill_all)
> int first_mrf_hack_node = node_count;
> if (devinfo->gen >= 7)
>node_count += BRW_MAX_GRF - GEN7_MRF_HACK_START;
> +   int grf127_send_hack_node = node_count;
> +   if (devinfo->gen >= 8 && dispatch_width == 8)
> +  node_count ++;
> struct ra_graph *g =
>ra_alloc_interference_graph(compiler->fs_reg_sets[rsi].regs, node_count);
>  
> @@ -653,6 +656,28 @@ fs_visitor::assign_regs(bool allow_spilling, bool spill_all)
>}
> }
>  
> +   if (devinfo->gen >= 8 && dispatch_width == 8) {
> +  /* At Intel Broadwell PRM, vol 07, section "Instruction Set Reference",
> +   * subsection "EUISA Instructions", Send Message (page 990):
> +   *
> +   * "r127 must not be used for return address when there is a src and
> +   * dest overlap in send instruction."
> +   *
> +   * We are avoiding using grf127 as part of the destination of send
> +   * messages adding a node interference to the grf127_send_hack_node.
> +   * This node has a fixed assignment to grf127.
> +   *
> +   * We don't apply it to SIMD16 because previous code avoids any register
> +   * overlap between sources and destination.
> +   */
> +  ra_set_node_reg(g, grf127_send_hack_node, 127);
> +  foreach_block_and_inst(block, fs_inst, inst, cfg) {
> + if (inst->is_send_from_grf() && inst->dst.file == VGRF) {
> +ra_add_node_interference(g, inst->dst.nr, grf127_send_hack_node);
> + }
> +  }
> +   }
> +
> /* Debug of register spilling: Go spill everything. */
> if (unlikely(spill_all)) {
>int reg = choose_spill_reg(g);
> 
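For readers following the quoted commit message, the interference trick can be sketched in isolation. This is only a toy model, not Mesa's ra_graph API: the toy_* names, the 8-node limit and the "pick the highest free register" policy are all assumptions made for illustration, and real vgrfs can span several registers, which this sketch ignores.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_REGS   128   /* r0..r127 */
#define MAX_NODES  8     /* small, illustrative limit */
#define UNASSIGNED (-1)

/* Toy interference graph; a stand-in, not Mesa's ra_graph. */
struct toy_graph {
   bool interferes[MAX_NODES][MAX_NODES];
   int  reg[MAX_NODES];
   int  node_count;
};

static void toy_init(struct toy_graph *g, int node_count)
{
   assert(node_count <= MAX_NODES);
   g->node_count = node_count;
   for (int a = 0; a < MAX_NODES; a++) {
      g->reg[a] = UNASSIGNED;
      for (int b = 0; b < MAX_NODES; b++)
         g->interferes[a][b] = false;
   }
}

/* Pin a node to one register, the way the hack node is pinned to 127. */
static void toy_set_node_reg(struct toy_graph *g, int node, int reg)
{
   g->reg[node] = reg;
}

/* Two interfering nodes may never share a register. */
static void toy_add_interference(struct toy_graph *g, int a, int b)
{
   g->interferes[a][b] = true;
   g->interferes[b][a] = true;
}

/* Give `node` the highest register not held by an interfering node. */
static int toy_assign(struct toy_graph *g, int node)
{
   for (int r = NUM_REGS - 1; r >= 0; r--) {
      bool taken = false;
      for (int other = 0; other < g->node_count; other++) {
         if (g->interferes[node][other] && g->reg[other] == r)
            taken = true;
      }
      if (!taken) {
         g->reg[node] = r;
         return r;
      }
   }
   return UNASSIGNED;
}
```

With node 2 pinned to 127 and marked as interfering with node 0 (a send destination), node 0 can no longer land on r127, while an unrelated node 1 still can.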


Re: [Mesa-dev] [PATCH v3] i965: Fix ETC2/EAC GetCompressed* functions on Gen7 GPUs

2018-05-04 Thread Eero Tamminen

Hi,

On 04.05.2018 14:11, Eleni Maria Stea wrote:

Thanks for your feedback,
On Thu, 3 May 2018 13:30:38 +0300 Eero Tamminen wrote:

On 02.05.2018 20:19, Matt Turner wrote:

On Wed, May 2, 2018 at 9:13 AM, Eleni Maria Stea wrote:

Gen 7 GPUs store the compressed EAC/ETC2 images in other
non-compressed formats that can render. When GetCompressed*
functions are called, the pixels are returned in the
non-compressed format that is used for the rendering.

With this patch we store both the compressed and non-compressed
versions of the image, so that both rendering commands and
GetCompressed* commands work.

Also, the assertions for GL_MAP_WRITE_BIT and
GL_MAP_INVALIDATE_RANGE_BIT in intel_miptree_map_etc function have
been removed because when the miptree is mapped for reading (for
example from a GetCompress* function) the GL_MAP_WRITE_BIT won't
be set (and shouldn't be set).

Fixes: the following test in CTS for gen7:
KHR-GL45.direct_state_access.textures_compressed_subimage test

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104272


I think you can add

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=81843

as well :)


This is a really lovely feature.

Compressed texture formats are used to:
1. Reduce disk / network usage for the application install
2. Increase run-time performance (by reducing required bandwidth)
3. Reduce program memory usage

At the cost of worse texture quality.

Mesa transparently converting ETC to uncompressed data on platforms
that don't support ETC in HW means that the application doesn't get 2),
just worse texture quality, although some applications would be able
to fall back to another (HW-supported) texture compression format.


Examples of these are the GLES versions of the GLBenchmark and GfxBench
benchmarks.

(The first one defaults to ETC1, but can also use DXT if ETC is not
available.  GfxBench v5 defaults to ASTC/ETC2, but can also use DXT
if those textures are shipped with it.  DXT is the default for the GL
version.)



And this new patch means that instead of 3), memory usage actually
_increases_ compared to application using non-compressed textures
directly.



You are right about the memory usage and about the purpose of the patch.


This was a bit of a cry for "why couldn't the world be a better place?".
I know that hoping for a better solution, when it also requires users of
the driver to do something extra, is a bit desperate...



Some (many?) applications might fail to run if ETC isn't supported, so
I understand why this feature exists, but it would be nice to have
some better way to handle it.

Maybe some new extension that can be used by future game engines &
application toolkits to query which of the compressed texture formats
are faked, so that they can instead select a compression format that
actually provides run-time benefits?


After having a look, I think that we already have an extension for that
type of query: ARB_internalformat_query2. We are not sure if the
existing queries would fit, but if they don't, we think that a new
extension should add more queries to this one rather than introduce
new methods. What do you think?


You mean returning CAVEAT_SUPPORT in params for compressed formats
which are transparently converted to uncompressed data?

That API's not available for GLES v2, where I think ETC is most widely
used, so it would be more of a solution for GLES v3.x applications only.
Sounds OK to me.

Hardest part will be propagating use of this query to engines & toolkits
that would benefit from using it. :-)
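To make the idea concrete, here is a rough sketch of what an engine could do once it knows a support level per candidate format. Everything in it is hypothetical: the enum only mirrors the FULL_SUPPORT / CAVEAT_SUPPORT / NONE style of answer, and which query would actually report a "faked" compressed format is exactly the open question in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical support levels, modelled on the FULL_SUPPORT /
 * CAVEAT_SUPPORT / NONE answers of ARB_internalformat_query2. */
enum support_level { SUPPORT_NONE, SUPPORT_CAVEAT, SUPPORT_FULL };

struct format_candidate {
   const char *name;          /* label for illustration only */
   enum support_level level;  /* what the driver reported */
};

/* Return the index of the first fully supported candidate, else the
 * first candidate supported with a caveat, else -1. */
static int pick_texture_format(const struct format_candidate *c, size_t n)
{
   assert(c != NULL || n == 0);
   int fallback = -1;
   for (size_t i = 0; i < n; i++) {
      if (c[i].level == SUPPORT_FULL)
         return (int)i;
      if (c[i].level == SUPPORT_CAVEAT && fallback < 0)
         fallback = (int)i;
   }
   return fallback;
}
```

An application listing its formats in order of preference would then skip a transparently converted ETC2 in favour of a natively supported DXT, but still use ETC2 when nothing better is available.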


- Eero


Re: [Mesa-dev] [PATCH 01/17] i965/miptree: Fix handling of uninitialized MCS buffers

2018-05-04 Thread Nanley Chery
On Fri, May 04, 2018 at 09:03:32AM +0300, Tapani Pälli wrote:
> Hi Nanley;
> 

Hey Tapani,

> On 05/03/2018 10:03 PM, Nanley Chery wrote:
> > Before this patch, if we failed to initialize an MCS buffer, we'd
> > end up in a state in which the miptree thinks it has an MCS buffer,
> > but doesn't. We also leaked the clear_color_bo if it existed.
> > 
> > With this patch, we now free the miptree aux buffer resources and let
> > intel_miptree_alloc_mcs() know that the MCS buffer no longer exists.
> 
> I triggered this by forcing the failure (by substituting 1 for
> unlikely(map == NULL) in the MCS buffer map condition), and the following
> currently happens:

Thanks for testing this! It'd be great if we could have a piglit test
that would trigger this case for us, but it's not clear to me what that
would look like.

> 
> --- 8< ---
> Failed to map mcs buffer into GTT
> deqp-egl: ../src/intel/blorp/blorp_clear.c:313: blorp_fast_clear: Assertion
> `start_layer + num_layers <= MAX2(surf->aux_surf->logical_level0_px.depth >>
> level, surf->aux_surf->logical_level0_px.array_len)' failed.
> Aborted (core dumped)

I can confirm a similar assertion with piglit's
arb_clear_texture-multisample. What's happening here is that
blorp_fast_clear is receiving garbage aux surfaces, which causes it to
randomly trigger different assertions.

> --- 8< ---
> 
> However, even with this fix it seems we end up in the same or a similar case
> (segfault happening in do_single_blorp_clear):
> 
> --- 8< ---
> Failed to map mcs buffer into GTT
> Segmentation fault (core dumped)
> --- 8< ---
> 
> Test case used was:
>dEQP-EGL.functional.color_clears.single_context.gles2.rgba_window

This is interesting. What happens here is that we enter
do_single_blorp_clear() with irb->mt == NULL and we segfault at:

   if (irb->Base.Base.Format != irb->mt->format)

It looks like some callers of intel_miptree_create() don't handle
failure well. Maybe they should throw an EGL error? Here's the backtrace
from the failed init:

#0  intel_miptree_init_mcs
#1  intel_miptree_alloc_mcs
#2  intel_miptree_alloc_aux
#3  intel_miptree_create
#4  intel_miptree_create_for_renderbuffer
#5  intel_update_winsys_renderbuffer_miptree
#6  intel_update_image_buffer
#7  intel_update_image_buffers
#8  intel_update_renderbuffers
#9  intel_prepare_render
#10 intelMakeCurrent
#11 driBindContext
#12 ?? () from /usr/lib/libEGL_mesa.so.0
#13 eglMakeCurrent
#14 ?? () from /usr/lib/libEGL.so
#15 ?? () from /usr/lib/libEGL.so
#16 deqp::egl::SingleThreadColorClearCase::executeForContexts
#17 deqp::egl::MultiContextRenderCase::executeForSurface
#18 deqp::egl::RenderCase::executeForConfig
#19 deqp::egl::SimpleConfigCase::iterate
#20 tcu::TestSessionExecutor::iterateTestCase
#21 tcu::TestSessionExecutor::iterate 
#22 tcu::App::iterate
#23 main (argc=, argv=)

> 
> 
> Are we expected to survive without mcs?

I think it depends. Once we determine that we can enable MCS for a
miptree, we require that the aux buffer exist in order to successfully
create the miptree. In GL, if we can't successfully create a miptree, we
can throw a GL_OUT_OF_MEMORY error, and have mostly undefined behavior
for the rest of our existence. In EGL however, I think we can throw an
error that's not EGL_BAD_CONTEXT and carry on.
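The ownership rule the patch establishes (on a failed init, free the aux buffer, clear the pointer, and report failure so the caller can bail out) can be sketched outside the driver. The toy_* types below are stand-ins, not the i965 structures, and the map_fails flag merely simulates brw_bo_map() returning NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Stand-ins for the miptree and its aux buffer; not the i965 structs. */
struct toy_aux_buf { unsigned char *data; size_t size; };
struct toy_miptree { struct toy_aux_buf *aux_buf; };

static void toy_aux_buf_free(struct toy_aux_buf *buf)
{
   if (buf) {
      free(buf->data);
      free(buf);
   }
}

/* Returns false and leaves mt->aux_buf == NULL on failure, so the
 * miptree never claims a buffer it does not have. `map_fails`
 * simulates a failed buffer map. */
static bool toy_init_aux(struct toy_miptree *mt, int init_value, bool map_fails)
{
   assert(mt->aux_buf != NULL);
   unsigned char *map = map_fails ? NULL : mt->aux_buf->data;
   if (map == NULL) {
      toy_aux_buf_free(mt->aux_buf);
      mt->aux_buf = NULL;
      return false;
   }
   memset(map, init_value, mt->aux_buf->size);
   return true;
}

/* Allocate and initialise; the caller only has to check the result. */
static bool toy_alloc_aux(struct toy_miptree *mt, size_t size, bool map_fails)
{
   mt->aux_buf = malloc(sizeof(*mt->aux_buf));
   if (!mt->aux_buf)
      return false;
   mt->aux_buf->size = size;
   mt->aux_buf->data = malloc(size);
   if (!mt->aux_buf->data) {
      toy_aux_buf_free(mt->aux_buf);
      mt->aux_buf = NULL;
      return false;
   }
   return toy_init_aux(mt, 0xff, map_fails);
}
```

What happens after the caller sees the failure (GL_OUT_OF_MEMORY, an EGL error, or something else) is the part still under discussion above.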

-Nanley

> 
> 
> > Cc: 
> > ---
> >   src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 14 +++---
> >   1 file changed, 7 insertions(+), 7 deletions(-)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > index b9a564552df..377efae32c9 100644
> > --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > @@ -1658,7 +1658,7 @@ intel_miptree_copy_teximage(struct brw_context *brw,
> >  intel_obj->needs_validate = true;
> >   }
> > -static void
> > +static bool
> >   intel_miptree_init_mcs(struct brw_context *brw,
> >  struct intel_mipmap_tree *mt,
> >  int init_value)
> > @@ -1678,13 +1678,14 @@ intel_miptree_init_mcs(struct brw_context *brw,
> >  void *map = brw_bo_map(brw, mt->aux_buf->bo, MAP_WRITE | MAP_RAW);
> >  if (unlikely(map == NULL)) {
> > fprintf(stderr, "Failed to map mcs buffer into GTT\n");
> > -  brw_bo_unreference(mt->aux_buf->bo);
> > -  free(mt->aux_buf);
> > -  return;
> > +  intel_miptree_aux_buffer_free(mt->aux_buf);
> > +  mt->aux_buf = NULL;
> > +  return false;
> >  }
> >  void *data = map;
> >  memset(data, init_value, mt->aux_buf->size);
> >  brw_bo_unmap(mt->aux_buf->bo);
> > +   return true;
> >   }
> >   static struct intel_miptree_aux_buffer *
> > @@ -1764,15 +1765,14 @@ intel_miptree_alloc_mcs(struct brw_context *brw,
> >  const uint32_t alloc_flags = 0;
> >  mt->aux_buf = intel_alloc_aux_buffer(brw, "mcs-miptree",
> >   _mcs_surf, alloc_flags, mt);
> 

[Mesa-dev] [PATCH 07/10] i965: perf: drop register programming not needed on HSW

2018-05-04 Thread Lionel Landwerlin
This register is flagged as IVB only in the documentation.

Signed-off-by: Lionel Landwerlin 
---
 src/mesa/drivers/dri/i965/brw_oa_hsw.xml | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_oa_hsw.xml b/src/mesa/drivers/dri/i965/brw_oa_hsw.xml
index 861cc1212a7..a3bed73086b 100644
--- a/src/mesa/drivers/dri/i965/brw_oa_hsw.xml
+++ b/src/mesa/drivers/dri/i965/brw_oa_hsw.xml
@@ -1,5 +1,5 @@
 [XML hunk content not preserved in this archive copy]
-- 
2.17.0



[Mesa-dev] [PATCH 06/10] i965: perf: chv: fixup counters names

2018-05-04 Thread Lionel Landwerlin
Signed-off-by: Lionel Landwerlin 
---
 src/mesa/drivers/dri/i965/brw_oa_chv.xml | 50 
 1 file changed, 25 insertions(+), 25 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_oa_chv.xml b/src/mesa/drivers/dri/i965/brw_oa_chv.xml
index 2116c895ab9..85c50bb43bb 100644
--- a/src/mesa/drivers/dri/i965/brw_oa_chv.xml
+++ b/src/mesa/drivers/dri/i965/brw_oa_chv.xml
@@ -1,5 +1,5 @@
 [XML hunk content not preserved in this archive copy]


[Mesa-dev] [PATCH 03/10] i965: perf: sklgt2: update memory write config

2018-05-04 Thread Lionel Landwerlin
This is another pre-production stepping case going away.

Signed-off-by: Lionel Landwerlin 
---
 src/mesa/drivers/dri/i965/brw_oa_sklgt2.xml | 56 ++---
 1 file changed, 49 insertions(+), 7 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_oa_sklgt2.xml b/src/mesa/drivers/dri/i965/brw_oa_sklgt2.xml
index 3a528135070..3128c88b7ea 100644
--- a/src/mesa/drivers/dri/i965/brw_oa_sklgt2.xml
+++ b/src/mesa/drivers/dri/i965/brw_oa_sklgt2.xml
@@ -1,5 +1,5 @@
 [XML hunk content not preserved in this archive copy]
@@ -3790,6 +3790,53 @@
 [XML hunk content not preserved in this archive copy]
-- 
2.17.0



[Mesa-dev] [PATCH 01/10] i965: perf: sklgt2: update a priority for register programming

2018-05-04 Thread Lionel Landwerlin
This makes no difference in terms of programming, it's just a cleanup.

Signed-off-by: Lionel Landwerlin 
---
 src/mesa/drivers/dri/i965/brw_oa_sklgt2.xml | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_oa_sklgt2.xml b/src/mesa/drivers/dri/i965/brw_oa_sklgt2.xml
index 343de0c5ea9..ba52a63f06d 100644
--- a/src/mesa/drivers/dri/i965/brw_oa_sklgt2.xml
+++ b/src/mesa/drivers/dri/i965/brw_oa_sklgt2.xml
@@ -1,5 +1,5 @@
 [XML hunk content not preserved in this archive copy]
-- 
2.17.0



[Mesa-dev] [PATCH 10/10] i965: perf: enable Icelake metrics

2018-05-04 Thread Lionel Landwerlin
Signed-off-by: Lionel Landwerlin 
---
 src/mesa/drivers/dri/i965/Makefile.sources|  3 ++-
 src/mesa/drivers/dri/i965/brw_performance_query.c | 10 --
 src/mesa/drivers/dri/i965/meson.build |  1 +
 3 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/Makefile.sources b/src/mesa/drivers/dri/i965/Makefile.sources
index 5e53d874d88..55df6ce90f7 100644
--- a/src/mesa/drivers/dri/i965/Makefile.sources
+++ b/src/mesa/drivers/dri/i965/Makefile.sources
@@ -174,4 +174,5 @@ i965_oa_xml_FILES = \
brw_oa_glk.xml \
brw_oa_cflgt2.xml \
brw_oa_cflgt3.xml \
-   brw_oa_cnl.xml
+   brw_oa_cnl.xml \
+   brw_oa_icl.xml
diff --git a/src/mesa/drivers/dri/i965/brw_performance_query.c b/src/mesa/drivers/dri/i965/brw_performance_query.c
index 77d23133ad4..f549db34e92 100644
--- a/src/mesa/drivers/dri/i965/brw_performance_query.c
+++ b/src/mesa/drivers/dri/i965/brw_performance_query.c
@@ -1993,17 +1993,21 @@ compute_topology_builtins(struct brw_context *brw)
brw->perfquery.sys_vars.eu_threads_count =
   brw->perfquery.sys_vars.n_eus * devinfo->num_thread_per_eu;
 
-   /* At the moment the subslice mask builtin has groups of 3bits for each
+   /* The subslice mask builtin contains bits for all slices. Prior to Gen11
+* it had groups of 3bits for each slice, on Gen11 it's 8bits for each
 * slice.
 *
 * Ideally equations would be updated to have a slice/subslice query
 * function/operator.
 */
brw->perfquery.sys_vars.subslice_mask = 0;
+
+   int bits_per_subslice = devinfo->gen == 11 ? 8 : 3;
+
for (int s = 0; s < util_last_bit(devinfo->slice_masks); s++) {
   for (int ss = 0; ss < (devinfo->subslice_slice_stride * 8); ss++) {
  if (gen_device_info_subslice_available(devinfo, s, ss))
-brw->perfquery.sys_vars.subslice_mask |= 1UL << (s * 3 + ss);
+brw->perfquery.sys_vars.subslice_mask |= 1UL << (s * bits_per_subslice + ss);
   }
}
 }
@@ -2148,6 +2152,8 @@ get_register_queries_function(const struct gen_device_info *devinfo)
}
if (devinfo->is_cannonlake)
   return brw_oa_register_queries_cnl;
+   if (devinfo->gen == 11)
+  return brw_oa_register_queries_icl;
 
return NULL;
 }
diff --git a/src/mesa/drivers/dri/i965/meson.build b/src/mesa/drivers/dri/i965/meson.build
index a2c1896fecb..0ef0bb9e69e 100644
--- a/src/mesa/drivers/dri/i965/meson.build
+++ b/src/mesa/drivers/dri/i965/meson.build
@@ -160,6 +160,7 @@ i965_hw_metrics = [
   'cflgt2', 'cflgt3',
   'bxt', 'glk',
   'cnl',
+  'icl',
 ]
 
 i965_hw_metrics_xml_files = []
-- 
2.17.0
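As a standalone illustration of the mask layout touched by this patch (3 bits per slice before Gen11, 8 bits per slice on Gen11), here is a small sketch. The toy_topology struct is a made-up stand-in for the devinfo topology queries, and unlike the patch the inner loop is bounded by the per-slice width, which is an assumption made here so that neighbouring groups cannot overlap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_MAX_SLICES    4
#define TOY_MAX_SUBSLICES 8

/* Hypothetical topology table: available[s][ss] is true when
 * subslice ss of slice s is usable. */
struct toy_topology {
   int  gen;
   int  num_slices;
   bool available[TOY_MAX_SLICES][TOY_MAX_SUBSLICES];
};

/* Pack one group of bits per slice into a single mask. */
static uint64_t toy_subslice_mask(const struct toy_topology *t)
{
   assert(t->num_slices <= TOY_MAX_SLICES);
   const int bits_per_slice = t->gen == 11 ? 8 : 3;
   uint64_t mask = 0;
   for (int s = 0; s < t->num_slices; s++) {
      for (int ss = 0; ss < bits_per_slice; ss++) {
         if (t->available[s][ss])
            mask |= UINT64_C(1) << (s * bits_per_slice + ss);
      }
   }
   return mask;
}
```

On a pre-Gen11 part, slice 1 subslice 0 lands on bit 3; on Gen11 the same subslice lands on bit 8.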



[Mesa-dev] [PATCH 00/10] i965: perf: config updates

2018-05-04 Thread Lionel Landwerlin
Hi,

This series adds and updates a bunch of metrics for measuring performance.

There are three main things:

   - SKL GT2 had a few special cases for pre-production hardware that
 we can remove.

   - We get a new PMA stall metric for Gen8/9

   - We add Icelake support

And also a few minor cleanups.

You can find this branch at:
https://github.com/djdeath/mesa/tree/wip/djdeath/oa-gen-11

Cheers,

Lionel Landwerlin (10):
  i965: perf: sklgt2: update a priority for register programming
  i965: perf: sklgt2: drop special programming for pre-production
stepping
  i965: perf: sklgt2: update memory write config
  i965: perf: sklgt2: Add FF Bottlenecks metric
  i965: perf: add PMA stall metrics
  i965: perf: chv: fixup counters names
  i965: perf: drop register programming not needed on HSW
  i965: perf: sklgt2: drop programming of an unused NOA mux
  i965: perf: add Icelake metrics
  i965: perf: enable Icelake metrics

 src/mesa/drivers/dri/i965/Makefile.sources| 3 +-
 src/mesa/drivers/dri/i965/brw_oa_bdw.xml  |   203 +-
 src/mesa/drivers/dri/i965/brw_oa_bxt.xml  |   113 +-
 src/mesa/drivers/dri/i965/brw_oa_cflgt2.xml   |   103 +-
 src/mesa/drivers/dri/i965/brw_oa_cflgt3.xml   |   103 +-
 src/mesa/drivers/dri/i965/brw_oa_chv.xml  |50 +-
 src/mesa/drivers/dri/i965/brw_oa_glk.xml  |   113 +-
 src/mesa/drivers/dri/i965/brw_oa_hsw.xml  | 3 +-
 src/mesa/drivers/dri/i965/brw_oa_icl.xml  | 12092 
 src/mesa/drivers/dri/i965/brw_oa_kblgt2.xml   |   103 +-
 src/mesa/drivers/dri/i965/brw_oa_kblgt3.xml   |   103 +-
 src/mesa/drivers/dri/i965/brw_oa_sklgt2.xml   |   898 +-
 src/mesa/drivers/dri/i965/brw_oa_sklgt3.xml   |   103 +-
 src/mesa/drivers/dri/i965/brw_oa_sklgt4.xml   |   103 +-
 .../drivers/dri/i965/brw_performance_query.c  |10 +-
 src/mesa/drivers/dri/i965/meson.build | 1 +
 16 files changed, 13931 insertions(+), 173 deletions(-)
 create mode 100644 src/mesa/drivers/dri/i965/brw_oa_icl.xml

--
2.17.0


[Mesa-dev] [PATCH 04/10] i965: perf: sklgt2: Add FF Bottlenecks metric

2018-05-04 Thread Lionel Landwerlin
Signed-off-by: Lionel Landwerlin 
---
 src/mesa/drivers/dri/i965/brw_oa_sklgt2.xml | 500 +++-
 1 file changed, 499 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_oa_sklgt2.xml b/src/mesa/drivers/dri/i965/brw_oa_sklgt2.xml
index 3128c88b7ea..8723c2124df 100644
--- a/src/mesa/drivers/dri/i965/brw_oa_sklgt2.xml
+++ b/src/mesa/drivers/dri/i965/brw_oa_sklgt2.xml
@@ -1,5 +1,5 @@
 [XML hunk content not preserved in this archive copy]


[Mesa-dev] [PATCH 08/10] i965: perf: sklgt2: drop programming of an unused NOA mux

2018-05-04 Thread Lionel Landwerlin
Signed-off-by: Lionel Landwerlin 
---
 src/mesa/drivers/dri/i965/brw_oa_sklgt2.xml | 17 ++---
 1 file changed, 6 insertions(+), 11 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_oa_sklgt2.xml b/src/mesa/drivers/dri/i965/brw_oa_sklgt2.xml
index 5fbb17344ec..5f8d01ca890 100644
--- a/src/mesa/drivers/dri/i965/brw_oa_sklgt2.xml
+++ b/src/mesa/drivers/dri/i965/brw_oa_sklgt2.xml
@@ -1,5 +1,5 @@
@@ -1414,7 +1414,6 @@
@@ -1535,7 +1534,7 @@
@@ -2232,7 +2231,6 @@
@@ -2405,7 +2403,7 @@
@@ -3000,7 +2998,6 @@
@@ -3179,7 +3176,7 @@
@@ -3775,7 +3772,6 @@
@@ -3995,7 +3991,7 @@
@@ -4567,7 +4563,6 @@
 [XML content of these hunks not preserved in this archive copy]
-- 
2.17.0



[Mesa-dev] [PATCH 05/10] i965: perf: add PMA stall metrics

2018-05-04 Thread Lionel Landwerlin
These are new metrics for Gen8/9 to measure the effect of the PMA
stall workaround fix.

Signed-off-by: Lionel Landwerlin 
---
 src/mesa/drivers/dri/i965/brw_oa_bdw.xml| 203 +++-
 src/mesa/drivers/dri/i965/brw_oa_bxt.xml| 113 ++-
 src/mesa/drivers/dri/i965/brw_oa_cflgt2.xml | 103 +-
 src/mesa/drivers/dri/i965/brw_oa_cflgt3.xml | 103 +-
 src/mesa/drivers/dri/i965/brw_oa_glk.xml| 113 ++-
 src/mesa/drivers/dri/i965/brw_oa_kblgt2.xml | 103 +-
 src/mesa/drivers/dri/i965/brw_oa_kblgt3.xml | 103 +-
 src/mesa/drivers/dri/i965/brw_oa_sklgt2.xml | 103 +-
 src/mesa/drivers/dri/i965/brw_oa_sklgt3.xml | 103 +-
 src/mesa/drivers/dri/i965/brw_oa_sklgt4.xml | 103 +-
 10 files changed, 1140 insertions(+), 10 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_oa_bdw.xml b/src/mesa/drivers/dri/i965/brw_oa_bdw.xml
index 258fcacfd59..714a1f08ea7 100644
--- a/src/mesa/drivers/dri/i965/brw_oa_bdw.xml
+++ b/src/mesa/drivers/dri/i965/brw_oa_bdw.xml
@@ -1,5 +1,5 @@
 [XML hunk content not preserved in this archive copy]
diff --git a/src/mesa/drivers/dri/i965/brw_oa_bxt.xml b/src/mesa/drivers/dri/i965/brw_oa_bxt.xml
index 5b18369b2cb..db018a3d62b 100644
--- a/src/mesa/drivers/dri/i965/brw_oa_bxt.xml
+++ b/src/mesa/drivers/dri/i965/brw_oa_bxt.xml
@@ -1,5 +1,5 @@
 [XML hunk content not preserved in this archive copy]
diff --git a/src/mesa/drivers/dri/i965/brw_oa_cflgt2.xml b/src/mesa/drivers/dri/i965/brw_oa_cflgt2.xml
index 15a8e13378e..29f71d47d2f 100644
--- a/src/mesa/drivers/dri/i965/brw_oa_cflgt2.xml
+++ b/src/mesa/drivers/dri/i965/brw_oa_cflgt2.xml
@@ -1,5 +1,5 @@
 [XML hunk content not preserved in this archive copy]
diff --git a/src/mesa/drivers/dri/i965/brw_oa_cflgt3.xml b/src/mesa/drivers/dri/i965/brw_oa_cflgt3.xml
index b7b8f52f9b0..986668f672a 100644
--- a/src/mesa/drivers/dri/i965/brw_oa_cflgt3.xml
+++ b/src/mesa/drivers/dri/i965/brw_oa_cflgt3.xml
@@ -1,5 +1,5 @@
 [XML hunk content not preserved in this archive copy]
diff --git a/src/mesa/drivers/dri/i965/brw_oa_glk.xml b/src/mesa/drivers/dri/i965/brw_oa_glk.xml
index 68fcd2d6dde..e3da757f46f 100644
--- a/src/mesa/drivers/dri/i965/brw_oa_glk.xml
+++ b/src/mesa/drivers/dri/i965/brw_oa_glk.xml
@@ -1,5 +1,5 @@
 [XML hunk content not preserved in this archive copy]
diff --git a/src/mesa/drivers/dri/i965/brw_oa_kblgt2.xml b/src/mesa/drivers/dri/i965/brw_oa_kblgt2.xml
index da2954e9e2d..c9150c8dcbe 100644
--- a/src/mesa/drivers/dri/i965/brw_oa_kblgt2.xml
+++ b/src/mesa/drivers/dri/i965/brw_oa_kblgt2.xml
@@ -1,5 +1,5 @@
 [XML hunk content not preserved in this archive copy]

[Mesa-dev] [PATCH 02/10] i965: perf: sklgt2: drop special programming for pre-production stepping

2018-05-04 Thread Lionel Landwerlin
Production steppings don't need this anymore.

Signed-off-by: Lionel Landwerlin 
---
 src/mesa/drivers/dri/i965/brw_oa_sklgt2.xml | 10 ++
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_oa_sklgt2.xml b/src/mesa/drivers/dri/i965/brw_oa_sklgt2.xml
index ba52a63f06d..3a528135070 100644
--- a/src/mesa/drivers/dri/i965/brw_oa_sklgt2.xml
+++ b/src/mesa/drivers/dri/i965/brw_oa_sklgt2.xml
@@ -1,5 +1,5 @@
 [XML hunk content not preserved in this archive copy]
@@ -1514,12 +1514,6 @@
 [XML hunk content not preserved in this archive copy]
-- 
2.17.0



Re: [Mesa-dev] [PATCH] egl: add EGL_BAD_MATCH error case for surfaceless and android

2018-05-04 Thread Emil Velikov
On 4 May 2018 at 06:28, Tapani Pälli  wrote:
> Just like is done for other backends when suitable config is not
> found (added in fd4eba4929).
>
> Signed-off-by: Tapani Pälli 

Thanks for fixing this up.

Reviewed-by: Emil Velikov 

-Emil


[Mesa-dev] [PATCH 1/2] loader_dri3: Wait for pending swaps to complete before drawable_fini.

2018-05-04 Thread Mario Kleiner
Before destroying the loader_dri3_drawable, make sure all pending
swaps for it have completed. This guards against the following scenario,
which happens, e.g., with KDE Plasma-5's plasmashell (which uses
QT-5's QtGui/QtQuick for rendering), when it repaints multiple
UI elements, each represented by its own Window/GLXDrawable, using
one common GLXContext for all GLXDrawables:

1. glXMakeCurrent(dpy, drawable1, context);
2. glXXX render to drawable1
3. glXSwapBuffers(dpy, drawable1); #1
4. glXMakeCurrent(dpy, drawable2, context);
5. glXXX render to drawable2
6. glXSwapBuffers(dpy, drawable2);
// While the swap #1 is still pending for drawable1:
7. glXMakeCurrent(dpy, drawable1, context);
8. glXXX render to drawable1
9. glXSwapBuffers(dpy, drawable1);

Binding a different drawable2 to the same context via glXMakeCurrent
will cause its previous drawable1 to be released (cf. dri3_bind_context
-> driReleaseDrawables), which in turn calls loader_dri3_drawable_fini().
This unselects Present notify event delivery on the associated
X-Window and loses all dri3 related state. If drawable1 is selected for
the context again [7], a new incarnation of loader_dri3_drawable is
created in dri3_bind_context->driFetchDrawable->dri3_create_drawable->
loader_dri3_drawable_init(), which again selects for Present notify
event delivery for the underlying X-Window, but the new incarnation has
lost all state wrt. previous rendering and swaps. The server now delivers
PresentPixmapIdle and PresentPixmapComplete events from the completed
previous swapbuffers call #1 [3] to the new loader_dri3_drawable, which
doesn't expect those. One problem is that the new incarnation has
draw->send_sbc == 0, but now receives PresentPixmapComplete events with
sbc's > 0, therefore updating draw->recv_sbc to > 0 in
dri3_handle_present_event(). The draw->recv_sbc > draw->send_sbc condition
is misinterpreted as sbc wraparound, triggers recv_sbc wraparound handling
and ends up with a very large draw->recv_sbc. During the next swapbuffers
call [9], the totally wrong recv_sbc is used for calculating the target_msc
for the PresentPixmap request, leading to a target_msc billions of vblanks
in the future, and thus to a swap that never completes, a frozen UI
and a hang of the client.
Make sure that a loader_dri3_drawable can only be destroyed after all
its pending swaps have completed, to prevent misdelivery of PresentNotify
events to the right X-Window, but the wrong incarnation of the associated
loader_dri3_drawable.
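The arithmetic behind the "very large draw->recv_sbc" can be reproduced in a few lines. This is a sketch under assumptions: toy_recv_sbc() is a made-up helper, and the two constants (upper-32-bit mask and a 2^32 step) are what the wraparound handling is assumed to use when widening the 32-bit event serial to a 64-bit sbc.

```c
#include <assert.h>
#include <stdint.h>

/* Rebuild a 64-bit recv_sbc from the 32-bit serial carried by a
 * Present complete event, using the wraparound handling described
 * above. The constants are assumptions for illustration. */
static uint64_t toy_recv_sbc(uint64_t send_sbc, uint32_t serial)
{
   /* Borrow the upper 32 bits from the last sbc we sent. */
   uint64_t recv_sbc = (send_sbc & UINT64_C(0xffffffff00000000)) | serial;

   /* A received sbc ahead of the sent one is taken to mean that the
    * lower 32 bits wrapped, so step back by 2^32. */
   if (recv_sbc > send_sbc)
      recv_sbc -= UINT64_C(0x100000000);

   return recv_sbc;
}
```

In the healthy case, toy_recv_sbc(10, 9) yields 9. For a fresh incarnation with send_sbc == 0 that receives a stale event with serial 5, the subtraction underflows and the result is 0xffffffff00000005, which is the bogus value that then feeds the target_msc calculation.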

Signed-off-by: Mario Kleiner 
Cc: xorg-de...@lists.x.org
Cc: dan...@fooishbar.org
Cc: eero.t.tammi...@intel.com
Cc: m...@fireburn.co.uk
---
 src/loader/loader_dri3_helper.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/loader/loader_dri3_helper.c b/src/loader/loader_dri3_helper.c
index 6bb11c4..7bd79af 100644
--- a/src/loader/loader_dri3_helper.c
+++ b/src/loader/loader_dri3_helper.c
@@ -234,6 +234,9 @@ loader_dri3_drawable_fini(struct loader_dri3_drawable *draw)
 {
int i;
 
+   if (draw->special_event)
+  loader_dri3_swapbuffer_barrier(draw);
+
draw->ext->core->destroyDrawable(draw->dri_drawable);
 
for (i = 0; i < ARRAY_SIZE(draw->buffers); i++) {
-- 
2.7.4



[Mesa-dev] [PATCH 2/2] loader_dri3: Variant 2: Wait for pending swaps to complete before drawable_fini.

2018-05-04 Thread Mario Kleiner
See previous patch in series for explanation of the problem.

This method avoids a blocking loader_dri3_swapbuffer_barrier() call
whenever a GL context's drawables are changed via glXMakeCurrent et al.

Instead it filters out the "orphaned" PresentNotify events from
previous incarnations of the loader_dri3_drawable. This should deal
correctly with PixmapInvalidate, PixmapPresentCompleteNotify and
MscCompleteNotify events, but I don't know a way to filter out
WindowConfigureNotify events, or if it even matters to filter them.

This PoC one is only meaningful if the first patch is omitted, and
shows the spurious "ORPHAN" printouts which would hang KDE plasmashell
if not filtered out.
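The filter condition itself is small enough to show on its own. The sketch below is an assumption-laden restatement, not the patch: toy_is_orphan_event() is a hypothetical name, and the upper-32-bit mask is assumed to be the value the check uses to decide that send_sbc has never wrapped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* An event counts as an orphan when send_sbc has never left the
 * 32-bit range and the event's serial is still ahead of it, i.e. the
 * swap cannot have been issued by this incarnation of the drawable.
 * The mask value is an assumption for illustration. */
static bool toy_is_orphan_event(uint64_t send_sbc, uint32_t serial)
{
   const bool never_wrapped =
      (send_sbc & UINT64_C(0xffffffff00000000)) == 0;

   return never_wrapped && serial > send_sbc;
}
```

A fresh drawable (send_sbc == 0) seeing serial 5 is flagged, while an event for a swap it did send (send_sbc == 10, serial 9) passes. Once send_sbc exceeds 32 bits the check is deliberately disabled, because a larger serial may then be a genuine wraparound.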

Test from a terminal: killall plasmashell; plasmashell
Wiggle the mouse around, click etc. on the KDE taskbar, K-Menu,
system tray icons, trigger volume/brightness feedback widgets
to provoke the occasional ORPHAN event.

Signed-off-by: Mario Kleiner 
Cc: xorg-de...@lists.x.org
Cc: dan...@fooishbar.org
Cc: eero.t.tammi...@intel.com
Cc: m...@fireburn.co.uk
---
 src/loader/loader_dri3_helper.c | 24 
 1 file changed, 24 insertions(+)

diff --git a/src/loader/loader_dri3_helper.c b/src/loader/loader_dri3_helper.c
index 7bd79af..123a996 100644
--- a/src/loader/loader_dri3_helper.c
+++ b/src/loader/loader_dri3_helper.c
@@ -234,6 +234,10 @@ loader_dri3_drawable_fini(struct loader_dri3_drawable *draw)
 {
int i;
 
+   printf("FINI: wxh = %d x %d, drawable %d eid %d recv_sbc %lu, send_sbc %lu PENDING %lu\n",
+  draw->width, draw->height, draw->drawable, draw->eid, draw->recv_sbc, draw->send_sbc,
+  draw->send_sbc - draw->recv_sbc);
+
if (draw->special_event)
   loader_dri3_swapbuffer_barrier(draw);
 
@@ -373,6 +377,15 @@ dri3_handle_present_event(struct loader_dri3_drawable *draw,
* checking for wrap.
*/
   if (ce->kind == XCB_PRESENT_COMPLETE_KIND_PIXMAP) {
+ /* Filter out orphan events sent for a previous incarnation of draw. */
+ if (!(draw->send_sbc & 0xLL) &&
+ ce->serial > draw->send_sbc) {
+printf("ORPHAN-C: %d x %d, drawable %d: recv %u vs send_sbc %lu\n",
+   draw->width, draw->height, draw->drawable, ce->serial,
+   draw->send_sbc);
+break;
+ }
+
  draw->recv_sbc = (draw->send_sbc & 0xLL) | ce->serial;
  if (draw->recv_sbc > draw->send_sbc)
 draw->recv_sbc -= 0x1;
@@ -418,6 +431,15 @@ dri3_handle_present_event(struct loader_dri3_drawable *draw,
   xcb_present_idle_notify_event_t *ie = (void *) ge;
   int b;
 
+  /* Filter out orphan events sent for a previous incarnation of draw. */
+  if (!(draw->send_sbc & 0xLL) &&
+  ie->serial > draw->send_sbc) {
+ printf("ORPHAN-I: %d x %d, drawable %d: recv %u vs send_sbc %lu\n",
+draw->width, draw->height, draw->drawable, ie->serial,
+draw->send_sbc);
+ break;
+  }
+
   for (b = 0; b < ARRAY_SIZE(draw->buffers); b++) {
  struct loader_dri3_buffer *buf = draw->buffers[b];
 
@@ -1435,6 +1457,8 @@ dri3_update_drawable(__DRIdrawable *driDrawable,
  xcb_unregister_for_special_event(draw->conn, draw->special_event);
  draw->special_event = NULL;
   }
+
+  printf("INIT: wxh = %d x %d, drawable %d eid %d\n", draw->width, draw->height, draw->drawable, draw->eid);
}
dri3_flush_present_events(draw);
mtx_unlock(&draw->mtx);
-- 
2.7.4



[Mesa-dev] [RFC] Fix attempt for Mesa + X-Server 1.20 + modesetting-ddx hangs on KDE5.

2018-05-04 Thread Mario Kleiner
Two patches, solving the same problem in two different ways. The 1st
one is ready to go, the 2nd one would need the debug statements removed.

Only apply one of them for testing. The 2nd one is useless with
the 1st one applied, but demonstrates the problem.

So X-Server 1.20 RC + modesetting-ddx with DRI3/Present hangs at least
KDE-5's plasmashell and makes KDE-5 unusable with that setup.

As KDE's plasmashell uses Qt 5's QtQuick OpenGL-based rendering APIs
to render scene graphs, this bug might affect other Qt applications
as well.

This fix works, but it points to some problems in modesetting-ddx's
current vblank handling, because other DDXes seem to be mostly
unaffected by this Mesa bug.

Neither of these two fixes is a proper final solution, but either is
better than nothing. Both leave the OML_sync_control extension's
glXWaitForSbcOML() and glXWaitForMscOML() calls and the
SGI_video_sync glXWaitVideoSyncSGI() function broken for some
usage patterns.

The real problem, if i understand it correctly, is the way the lifetime
of dri3_drawables and loader_dri3_drawables is currently managed by Mesa's
bindContext() functions. Whenever glXMakeCurrent() etc. is called to
assign new/different GLXDrawables to the same context (i.e. one context
reused for drawing into many different drawables, as opposed to using
one dedicated context for each drawable), we destroy the underlying
DRIDrawables/dri3_drawables/loader_dri3_drawables and they lose all
state wrt. pending bufferswaps, msc, sbc, ust.

Nothing in the specs says that clients should expect to lose such
state on a GLXDrawable d1 whenever they reassign drawables other than
d1 to a GL context. A sequence like...

1. glXMakeCurrent(context, drawable1);
2. draw draw draw
3. glXSwapBuffers(context, drawable1);
4. glXMakeCurrent(context, drawable2); // drawable1 loses all state!
5. glXWaitForSbcOML(dpy, drawable1, ...);

... would probably cause a hang of the client in glXWaitForSbcOML, as
the function requires information stored in the "original" drawable1
up to step 3, but lost in step 4 due to dri3_drawable destruction.

Patch 1 has a potentially large performance impact when switching
drawables on a given context, due to the enforced wait on swap completion,
but might save OML clients which wait for sbc/msc on a separate thread,
whereas patch 2 has no performance impact, but doesn't even
partially solve the trouble with OML_sync_control.
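
For illustration, the orphan filter of patch 2 can be modelled in isolation. This is only a sketch, not the patch itself: struct sbc_state and both function names are made up for the example, and the upper-32-bit mask is an assumption, since the hex constants in the archived diff are garbled. It shows a 32-bit Present serial being widened to the 64-bit sbc, and a serial being dropped when it is larger than anything the current drawable incarnation has sent.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two counters of struct loader_dri3_drawable
 * that matter for this sketch. */
struct sbc_state {
   uint64_t send_sbc;   /* number of swaps submitted so far */
   uint64_t recv_sbc;   /* last swap known to have completed */
};

/* Assumption: the mask selects the upper 32 bits of the 64-bit counter. */
#define SBC_HIGH_MASK UINT64_C(0xffffffff00000000)

/* True if an event with this 32-bit serial cannot belong to the current
 * incarnation: send_sbc has not wrapped past 32 bits yet, and the serial is
 * larger than anything sent so far. */
static bool
is_orphan_serial(const struct sbc_state *s, uint32_t serial)
{
   return (s->send_sbc & SBC_HIGH_MASK) == 0 && serial > s->send_sbc;
}

/* Widen the serial to 64 bits and record it as recv_sbc.
 * Returns false if the event was filtered out as an orphan. */
static bool
handle_complete(struct sbc_state *s, uint32_t serial)
{
   if (is_orphan_serial(s, serial))
      return false;

   s->recv_sbc = (s->send_sbc & SBC_HIGH_MASK) | serial;
   /* The event may predate a wrap of the lower 32 bits. */
   if (s->recv_sbc > s->send_sbc)
      s->recv_sbc -= UINT64_C(0x100000000);
   return true;
}
```

With a freshly created drawable (send_sbc = 2), a stale event with serial 57 is dropped and recv_sbc stays untouched, while serial 2 is accepted. Once send_sbc has grown beyond 32 bits, the filter no longer applies, which is one of the cases where a unique per-incarnation tag would be needed instead.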

However, i'm totally out of time atm. and probably not the right person
to think about a better solution, and by dumb luck, my own application
doesn't recycle the same context for different drawables, but uses a
dedicated context for each drawable, so it dodges this bullet.

Therefore one of these patches is either a good enough fix for the KDE
hang problems atm. or a diagnosis of the problem as a starting point for
brighter people to deal with the root cause ;-)

Thanks,
-mario



Re: [Mesa-dev] [PATCH] radeon/vcn: fix mpeg4 msg buffer settings

2018-05-04 Thread Christian König

On 03.05.2018 at 23:29, boyuan.zh...@amd.com wrote:

From: Boyuan Zhang 

The previous bit-field assignments are incorrect and cause certain mpeg4
decodes to fail due to wrong flag values. This patch fixes these assignments.

Cc: 18.0 18.1 

Signed-off-by: Boyuan Zhang 
Reviewed-by: Leo Liu 


Reviewed-by: Christian König  as well.


---
  src/gallium/drivers/radeon/radeon_vcn_dec.c | 18 +-
  1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/src/gallium/drivers/radeon/radeon_vcn_dec.c b/src/gallium/drivers/radeon/radeon_vcn_dec.c
index f83e9e5..4bc922d 100644
--- a/src/gallium/drivers/radeon/radeon_vcn_dec.c
+++ b/src/gallium/drivers/radeon/radeon_vcn_dec.c
@@ -554,15 +554,15 @@ static rvcn_dec_message_mpeg4_asp_vld_t get_mpeg4_msg(struct radeon_decoder *dec
  
  	result.vop_time_increment_resolution = pic->vop_time_increment_resolution;
  
-   result.short_video_header |= pic->short_video_header << 0;
-   result.interlaced |= pic->interlaced << 2;
-   result.load_intra_quant_mat |= 1 << 3;
-   result.load_nonintra_quant_mat |= 1 << 4;
-   result.quarter_sample |= pic->quarter_sample << 5;
-   result.complexity_estimation_disable |= 1 << 6;
-   result.resync_marker_disable |= pic->resync_marker_disable << 7;
-   result.newpred_enable |= 0 << 10; //
-   result.reduced_resolution_vop_enable |= 0 << 11;
+   result.short_video_header = pic->short_video_header;
+   result.interlaced = pic->interlaced;
+   result.load_intra_quant_mat = 1;
+   result.load_nonintra_quant_mat = 1;
+   result.quarter_sample = pic->quarter_sample;
+   result.complexity_estimation_disable = 1;
+   result.resync_marker_disable = pic->resync_marker_disable;
+   result.newpred_enable = 0;
+   result.reduced_resolution_vop_enable = 0;
  
  	result.quant_type = pic->quant_type;
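
Why the old assignments go wrong can be shown with a small stand-alone sketch. The struct below is hypothetical and is not the real rvcn_dec_message_mpeg4_asp_vld_t. It only assumes that the flags are 1-bit unsigned bit-fields: a value shifted as if it were packed into a raw word is reduced to the width of the field on assignment, so every shift other than 0 leaves the flag at 0.

```c
/* Hypothetical flag layout, assuming 1-bit unsigned bit-fields. */
struct mpeg4_flags {
   unsigned int short_video_header : 1;
   unsigned int interlaced : 1;
   unsigned int load_intra_quant_mat : 1;
};

/* Old pattern: shift the value to its "bit position" and OR it in.
 * The result is converted to the 1-bit field, so only bit 0 survives. */
static unsigned int
store_shifted(unsigned int value, unsigned int shift)
{
   struct mpeg4_flags f = {0};

   f.interlaced |= value << shift;
   return f.interlaced;
}

/* Fixed pattern: the compiler already knows where the field lives,
 * so the value is assigned directly. */
static unsigned int
store_direct(unsigned int value)
{
   struct mpeg4_flags f = {0};

   f.interlaced = value;
   return f.interlaced;
}
```

Here store_shifted(1, 2) yields 0 although the flag was meant to be set, whereas store_direct(1) yields 1. Only the shift by 0 happened to work in the old code.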
  




Re: [Mesa-dev] [RFC PATCH] mesa/st/cb_clear: in st_Clear also validate the render state (needed by virgl)

2018-05-04 Thread Ilia Mirkin
On Fri, May 4, 2018 at 7:36 AM, Gert Wollny  wrote:
> On Thursday, 03.05.2018 at 19:52 +0200, Gert Wollny wrote:
>> On Thursday, 03.05.2018 at 13:24 -0400, Ilia Mirkin wrote:
>> >
>> > The api call is "clear", not "glClear in the context of whatever
>> > random GL state there might be". When the gallium clear API is
>> > invoked, the bound framebuffer needs to be cleared. This is how the
>> > API works, this is how all drivers implement it. It's basically
>> > memset(). It doesn't care about rasterizer discard or anything
>> > else.
>> >
>>
>> I stand corrected and sorry for the noise.
> Actually no:
>
> The OpenGL standard section 14.1 says:

Not sure what this has to do with anything. We're not talking about
glClear(). We're talking about the gallium clear API. Here's how it's
specified:

http://gallium.readthedocs.io/en/latest/context.html

"""
clear initializes some or all of the surfaces currently bound to the
framebuffer to particular RGBA, depth, or stencil values. Currently,
this does not take into account color or stencil write masks (as used
by GL), and always clears the whole surfaces (no scissoring as used by
GL clear or explicit rectangles like d3d9 uses). It can, however, also
clear only depth or stencil in a combined depth/stencil surface. If a
surface includes several layers then all layers will be cleared.
"""

What you need to do in the virgl_clear() function is to make that
happen. The fact that you're messing around with GL state on the
driver end isn't the gallium api's concern, nor is it the gallium
api's users'. My recommendation is instead of virgl_clear working
around virglrenderer's limitations by sending extra state (such as to
force-disable raster discard), to instead have virglrenderer track it
properly, and if raster discard has been enabled, to first disable it
before calling glClear() (and then re-enable it).
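
As a rough sketch of that recommendation (a toy model only, not virglrenderer code): struct host_state, host_clear() and renderer_clear() are invented names, and host_clear() merely stands in for glClear() under the assumption that the host ignores clears while rasterizer discard is enabled.

```c
#include <stdbool.h>

/* Hypothetical shadow of the host GL state tracked by the renderer. */
struct host_state {
   bool rasterizer_discard;   /* mirrors GL_RASTERIZER_DISCARD */
   unsigned clears_executed;  /* counts clears that actually took effect */
};

/* Stand-in for glClear() on the host: assumed to be a no-op while
 * rasterizer discard is enabled. */
static void
host_clear(struct host_state *hs)
{
   if (!hs->rasterizer_discard)
      hs->clears_executed++;
}

/* Gallium-style clear: always clears, whatever discard state is bound.
 * The tracked state is saved, disabled for the clear and then restored. */
static void
renderer_clear(struct host_state *hs)
{
   bool was_discarding = hs->rasterizer_discard;

   if (was_discarding)
      hs->rasterizer_discard = false;   /* would be glDisable() */
   host_clear(hs);
   if (was_discarding)
      hs->rasterizer_discard = true;    /* would be glEnable() */
}
```

The point of the model is that the guest driver sends nothing extra: the renderer consults its own tracked state, so the clear takes effect and the discard flag ends up unchanged.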

Cheers,

  -ilia


[Mesa-dev] [PATCH 5/5] docs/features: mark GL_ARB_sample_locations as DONE for nvc0

2018-05-04 Thread Rhys Perry
Signed-off-by: Rhys Perry 
---
 docs/features.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/features.txt b/docs/features.txt
index b1eb9e91a2..cf15ac854f 100644
--- a/docs/features.txt
+++ b/docs/features.txt
@@ -305,7 +305,7 @@ Khronos, ARB, and OES extensions that are not part of any OpenGL or OpenGL ES ve
   GL_ARB_parallel_shader_compile        not started, but Chia-I Wu did some related work in 2014
   GL_ARB_post_depth_coverage            DONE (i965)
   GL_ARB_robustness_isolation           not started
-  GL_ARB_sample_locations               not started
+  GL_ARB_sample_locations               DONE (nvc0)
   GL_ARB_seamless_cubemap_per_texture   DONE (i965, nvc0, radeonsi, r600, softpipe, swr)
   GL_ARB_shader_ballot                  DONE (i965/gen8+, nvc0, radeonsi)
   GL_ARB_shader_clock                   DONE (i965/gen7+, nv50, nvc0, r600, radeonsi)
-- 
2.14.3



[Mesa-dev] [PATCH 3/5] st/mesa: add support for ARB_sample_locations

2018-05-04 Thread Rhys Perry
Signed-off-by: Rhys Perry 
---
 src/mesa/state_tracker/st_atom_framebuffer.c | 64 
 src/mesa/state_tracker/st_cb_msaa.c  | 22 ++
 src/mesa/state_tracker/st_extensions.c   |  1 +
 3 files changed, 87 insertions(+)

diff --git a/src/mesa/state_tracker/st_atom_framebuffer.c b/src/mesa/state_tracker/st_atom_framebuffer.c
index 3ef3ff34a9..bb5f02125f 100644
--- a/src/mesa/state_tracker/st_atom_framebuffer.c
+++ b/src/mesa/state_tracker/st_atom_framebuffer.c
@@ -102,6 +102,68 @@ framebuffer_quantize_num_samples(struct st_context *st, 
unsigned num_samples)
return quantized_samples;
 }
 
+/**
+ * Update the pipe_context's sample location state
+ */
+static void
+update_sample_locations(struct st_context *st,
+const struct pipe_framebuffer_state *fb_state)
+{
+   struct pipe_sample_locations_state locations;
+   struct gl_framebuffer *fb = st->ctx->DrawBuffer;
+
+   if (!st->ctx->Extensions.ARB_sample_locations)
+  return;
+
+   locations.enabled = fb->ProgrammableSampleLocations;
+   if (locations.enabled) {
+  unsigned grid_width, grid_height;
+  int samples = _mesa_geometric_samples(fb);
+  int pixel, sample_index;
+  bool sample_location_pixel_grid = fb->SampleLocationPixelGrid;
+
+  st->pipe->get_sample_pixel_grid(st->pipe, samples, &grid_width, &grid_height);
+
+  /**
+   * when a dimension is greater than MAX_SAMPLE_LOCATION_GRID_SIZE,
+   * st->ctx->Driver.GetSamplePixelGrid() returns 1 for both dimensions.
+   */
+  if (grid_width>MAX_SAMPLE_LOCATION_GRID_SIZE ||
+  grid_height>MAX_SAMPLE_LOCATION_GRID_SIZE)
+ sample_location_pixel_grid = false;
+
+  for (pixel = 0; pixel < grid_width * grid_height; pixel++) {
+ for (sample_index = 0; sample_index < samples; sample_index++) {
+int table_index = sample_index;
+float x = 0.5f, y = 0.5f;
+uint8_t loc;
+if (sample_location_pixel_grid)
+   table_index = pixel * samples + sample_index;
+if (fb->SampleLocationTable) {
+   x = fb->SampleLocationTable[table_index*2];
+   y = fb->SampleLocationTable[table_index*2+1];
+}
+if (st->state.fb_orientation == Y_0_BOTTOM)
+   y = 1.0 - y;
+
+loc = roundf(CLAMP(x*16.0f, 0.0f, 15.0f));
+loc |= (int)roundf(CLAMP(y*16.0f, 0.0f, 15.0f)) << 4;
+locations.locations[pixel*samples+sample_index] = loc;
+ }
+  }
+
+  util_sample_locations_flip_y(st->pipe, &locations, fb_state);
+   } else {
+  /**
+   * util_sample_locations_flip_y() initializes unused data to 0x88, so
+   * this memset is not useful when locations.enabled is true.
+   */
+  memset(locations.locations, 0x88, sizeof(locations.locations));
+   }
+
+   cso_set_sample_locations(st->cso_context, &locations);
+}
+
 /**
  * Update framebuffer state (color, depth, stencil, etc. buffers)
  */
@@ -209,4 +271,6 @@ st_update_framebuffer_state( struct st_context *st )
st->state.fb_num_samples = util_framebuffer_get_num_samples(&framebuffer);
st->state.fb_num_layers = util_framebuffer_get_num_layers(&framebuffer);
st->state.fb_num_cb = framebuffer.nr_cbufs;
+
+   update_sample_locations(st, &framebuffer);
 }
diff --git a/src/mesa/state_tracker/st_cb_msaa.c b/src/mesa/state_tracker/st_cb_msaa.c
index 7f1b4fde91..092e74d28e 100644
--- a/src/mesa/state_tracker/st_cb_msaa.c
+++ b/src/mesa/state_tracker/st_cb_msaa.c
@@ -56,8 +56,30 @@ st_GetSamplePosition(struct gl_context *ctx,
 }
 
 
+static void
+st_GetProgrammableSampleCaps(struct gl_context *ctx, struct gl_framebuffer *fb,
+ GLuint *outBits, GLuint *outWidth, GLuint *outHeight)
+{
+   struct st_context *st = st_context(ctx);
+
+   st_validate_state(st, ST_PIPELINE_UPDATE_FRAMEBUFFER);
+
+   if (st->pipe->get_sample_pixel_grid)
+  st->pipe->get_sample_pixel_grid(st->pipe, _mesa_geometric_samples(fb),
+  outWidth, outHeight);
+   *outBits = 4;
+
+   /* We could handle this better in some circumstances,
+* but it's not really an issue */
+   if (*outWidth>MAX_SAMPLE_LOCATION_GRID_SIZE || *outHeight>MAX_SAMPLE_LOCATION_GRID_SIZE) {
+  *outWidth = 1;
+  *outHeight = 1;
+   }
+}
+
 void
 st_init_msaa_functions(struct dd_function_table *functions)
 {
functions->GetSamplePosition = st_GetSamplePosition;
+   functions->GetProgrammableSampleCaps = st_GetProgrammableSampleCaps;
 }
diff --git a/src/mesa/state_tracker/st_extensions.c b/src/mesa/state_tracker/st_extensions.c
index 0dc8adb262..b74fb63cef 100644
--- a/src/mesa/state_tracker/st_extensions.c
+++ b/src/mesa/state_tracker/st_extensions.c
@@ -636,6 +636,7 @@ void st_init_extensions(struct pipe_screen *screen,
   { o(ARB_query_buffer_object),  PIPE_CAP_QUERY_BUFFER_OBJECT  
},
   { o(ARB_robust_buffer_access_behavior), 
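
The 4-bit packing used by update_sample_locations() above can be tried out on its own. The following is a sketch under assumptions: pack_sample_location(), unpack_sample_location() and clampf() are names invented for the example, and rounding is done with (int)(v + 0.5f), which matches roundf() only because the clamped values are never negative. It packs x into the low nibble and y into the high nibble in 1/16 pixel steps, and decodes them again by multiplying with 1/16, similar to what the nvc0 lowering in patch 4/5 appears to do for GM200+.

```c
#include <stdint.h>

/* Hypothetical helper, equivalent to CLAMP() for floats. */
static float
clampf(float v, float lo, float hi)
{
   return v < lo ? lo : (v > hi ? hi : v);
}

/* Pack a sample position in [0, 1) into one byte:
 * low nibble = x, high nibble = y, both in 1/16 pixel units. */
static uint8_t
pack_sample_location(float x, float y)
{
   /* Round to nearest; valid because the clamped value is >= 0. */
   int xi = (int)(clampf(x * 16.0f, 0.0f, 15.0f) + 0.5f);
   int yi = (int)(clampf(y * 16.0f, 0.0f, 15.0f) + 0.5f);

   return (uint8_t)(xi | (yi << 4));
}

/* Inverse: extract each nibble and scale by 1/16. */
static void
unpack_sample_location(uint8_t loc, float *x, float *y)
{
   *x = (float)(loc & 0xf) / 16.0f;
   *y = (float)(loc >> 4) / 16.0f;
}
```

The pixel centre (0.5, 0.5) packs to 0x88, which is the same value the patch uses to fill the table when programmable locations are disabled. Positions at or beyond 15/16 saturate at 15, so (0.0, 1.0) packs to 0xf0.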

[Mesa-dev] [PATCH 1/5] mesa: add support for ARB_sample_locations

2018-05-04 Thread Rhys Perry
Signed-off-by: Rhys Perry 
---
 src/mapi/glapi/gen/gl_API.xml   |  52 +++
 src/mesa/main/config.h  |   7 +
 src/mesa/main/dd.h  |   7 +
 src/mesa/main/extensions_table.h|   1 +
 src/mesa/main/fbobject.c| 247 
 src/mesa/main/fbobject.h|  20 +++
 src/mesa/main/framebuffer.c |  10 ++
 src/mesa/main/get.c |  32 +
 src/mesa/main/get_hash_params.py|   6 +
 src/mesa/main/mtypes.h  |   6 +
 src/mesa/main/multisample.c |  18 +++
 src/mesa/main/tests/dispatch_sanity.cpp |  10 ++
 12 files changed, 386 insertions(+), 30 deletions(-)

diff --git a/src/mapi/glapi/gen/gl_API.xml b/src/mapi/glapi/gen/gl_API.xml
index 38c1921047..a23094a548 100644
--- a/src/mapi/glapi/gen/gl_API.xml
+++ b/src/mapi/glapi/gen/gl_API.xml
@@ -10891,6 +10891,58 @@
[hunk body not preserved: 52 added lines of XML for the ARB_sample_locations category were stripped by the list archive]
diff --git a/src/mesa/main/config.h b/src/mesa/main/config.h
index 81573bfbf2..444e4dedad 100644
--- a/src/mesa/main/config.h
+++ b/src/mesa/main/config.h
@@ -315,4 +315,11 @@
 #define MAX_CLIPPED_VERTICES ((2 * (6 + MAX_CLIP_PLANES))+1)
 
 
+/** For GL_ARB_sample_locations - maximum of SAMPLE_LOCATION_PIXEL_GRID_*_ARB */
+#define MAX_SAMPLE_LOCATION_GRID_SIZE 4
+/* It is theoretically possible for Consts.MaxSamples to be >32 but
+ * other code seems to assume that is not the case */
+#define MAX_SAMPLE_LOCATION_TABLE_SIZE \
+   (MAX_SAMPLE_LOCATION_GRID_SIZE*MAX_SAMPLE_LOCATION_GRID_SIZE*32)
+
 #endif /* MESA_CONFIG_H_INCLUDED */
diff --git a/src/mesa/main/dd.h b/src/mesa/main/dd.h
index d85d89ef50..8929a2e267 100644
--- a/src/mesa/main/dd.h
+++ b/src/mesa/main/dd.h
@@ -785,6 +785,13 @@ struct dd_function_table {
   GLenum target, GLsizei numAttachments,
   const GLenum *attachments);
 
+   /**
+* \name Functions for GL_ARB_sample_locations
+*/
+   void (*GetProgrammableSampleCaps)(struct gl_context *ctx, struct gl_framebuffer *fb,
+ GLuint *bits, GLuint *width, GLuint *height);
+   void (*EvaluateDepthValues)(struct gl_context *ctx, struct gl_framebuffer *fb);
+
/**
 * \name Query objects
 */
diff --git a/src/mesa/main/extensions_table.h b/src/mesa/main/extensions_table.h
index 492f7c3d20..3497cbea0e 100644
--- a/src/mesa/main/extensions_table.h
+++ b/src/mesa/main/extensions_table.h
@@ -103,6 +103,7 @@ EXT(ARB_provoking_vertex, EXT_provoking_vertex
 EXT(ARB_query_buffer_object             , ARB_query_buffer_object            , GLL, GLC,  x ,  x , 2013)
 EXT(ARB_robust_buffer_access_behavior   , ARB_robust_buffer_access_behavior  , GLL, GLC,  x ,  x , 2012)
 EXT(ARB_robustness                      , dummy_true                         , GLL, GLC,  x ,  x , 2010)
+EXT(ARB_sample_locations                , ARB_sample_locations               , GLL, GLC,  x , ES2, 2015)
 EXT(ARB_sample_shading                  , ARB_sample_shading                 , GLL, GLC,  x ,  x , 2009)
 EXT(ARB_sampler_objects                 , dummy_true                         , GLL, GLC,  x ,  x , 2009)
 EXT(ARB_seamless_cube_map               , ARB_seamless_cube_map              , GLL, GLC,  x ,  x , 2009)
diff --git a/src/mesa/main/fbobject.c b/src/mesa/main/fbobject.c
index c72204e11a..33d7f0307d 100644
--- a/src/mesa/main/fbobject.c
+++ b/src/mesa/main/fbobject.c
@@ -35,6 +35,7 @@
 
 #include "buffers.h"
 #include "context.h"
+#include "debug_output.h"
 #include "enums.h"
 #include "fbobject.h"
 #include "formats.h"
@@ -1403,15 +1404,57 @@ _mesa_BindRenderbufferEXT(GLenum target, GLuint renderbuffer)
bind_renderbuffer(target, renderbuffer, true);
 }
 
+static bool
+_pname_writable_for_default_framebuffer(struct gl_context *ctx,
+GLenum pname)
+{
+   switch (pname) {
+   case GL_FRAMEBUFFER_PROGRAMMABLE_SAMPLE_LOCATIONS_ARB:
+   case GL_FRAMEBUFFER_SAMPLE_LOCATION_PIXEL_GRID_ARB:
+  return true;
+   default:
+  return false;
+   }
+}
+
 /**
- * ARB_framebuffer_no_attachment - Application passes requested param's
- * here. NOTE: NumSamples requested need not be _NumSamples which is
- * what the hw supports.
+ * ARB_framebuffer_no_attachment and ARB_sample_locations - Application passes
+ * requested param's here. NOTE: NumSamples requested need not be _NumSamples
+ * which is what the hw supports.
  */
 static void
 framebuffer_parameteri(struct gl_context *ctx, struct gl_framebuffer *fb,
  

[Mesa-dev] [PATCH 0/5] Implement ARB_sample_locations for nvc0

2018-05-04 Thread Rhys Perry
This patch set adds support for GL_ARB_sample_locations in mesa core, gallium,
the mesa OpenGL state tracker and the nvc0 driver.

Rhys Perry (5):
  mesa: add support for ARB_sample_locations
  gallium: add support for programmable sample locations
  st/mesa: add support for ARB_sample_locations
  nvc0: add support for programmable sample locations
  docs/features: mark GL_ARB_sample_locations as DONE for nvc0

 docs/features.txt  |   2 +-
 src/gallium/auxiliary/cso_cache/cso_context.c  |  31 +++
 src/gallium/auxiliary/cso_cache/cso_context.h  |   5 +
 src/gallium/auxiliary/util/u_framebuffer.c |  32 +++
 src/gallium/auxiliary/util/u_framebuffer.h |   6 +
 src/gallium/docs/source/context.rst|   2 +
 src/gallium/docs/source/screen.rst |   3 +
 src/gallium/drivers/etnaviv/etnaviv_screen.c   |   1 +
 src/gallium/drivers/freedreno/freedreno_screen.c   |   1 +
 src/gallium/drivers/i915/i915_screen.c |   1 +
 src/gallium/drivers/llvmpipe/lp_screen.c   |   1 +
 .../drivers/nouveau/codegen/nv50_ir_driver.h   |   2 +
 .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp  |   7 +
 .../nouveau/codegen/nv50_ir_lowering_nvc0.cpp  |  91 +++-
 .../nouveau/codegen/nv50_ir_lowering_nvc0.h|   2 +
 src/gallium/drivers/nouveau/nv30/nv30_screen.c |   1 +
 src/gallium/drivers/nouveau/nv50/nv50_miptree.c|   1 +
 src/gallium/drivers/nouveau/nv50/nv50_resource.h   |   1 +
 src/gallium/drivers/nouveau/nv50/nv50_screen.c |   1 +
 src/gallium/drivers/nouveau/nvc0/nvc0_context.c|  35 +++
 src/gallium/drivers/nouveau/nvc0/nvc0_context.h|  12 +-
 src/gallium/drivers/nouveau/nvc0/nvc0_miptree.c|   1 +
 src/gallium/drivers/nouveau/nvc0/nvc0_program.c|   3 +
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.c |   1 +
 src/gallium/drivers/nouveau/nvc0/nvc0_state.c  |  11 +
 .../drivers/nouveau/nvc0/nvc0_state_validate.c | 155 ++---
 src/gallium/drivers/r300/r300_screen.c |   1 +
 src/gallium/drivers/r600/r600_pipe.c   |   1 +
 src/gallium/drivers/radeonsi/si_get.c  |   1 +
 src/gallium/drivers/softpipe/sp_screen.c   |   1 +
 src/gallium/drivers/svga/svga_screen.c |   1 +
 src/gallium/drivers/swr/swr_screen.cpp |   1 +
 src/gallium/drivers/vc4/vc4_screen.c   |   1 +
 src/gallium/drivers/vc5/vc5_screen.c   |   1 +
 src/gallium/drivers/virgl/virgl_screen.c   |   1 +
 src/gallium/include/pipe/p_context.h   |  17 +-
 src/gallium/include/pipe/p_defines.h   |   1 +
 src/gallium/include/pipe/p_state.h |  26 +++
 src/mapi/glapi/gen/gl_API.xml  |  52 +
 src/mesa/main/config.h |   7 +
 src/mesa/main/dd.h |   7 +
 src/mesa/main/extensions_table.h   |   1 +
 src/mesa/main/fbobject.c   | 247 ++---
 src/mesa/main/fbobject.h   |  20 ++
 src/mesa/main/framebuffer.c|  10 +
 src/mesa/main/get.c|  32 +++
 src/mesa/main/get_hash_params.py   |   6 +
 src/mesa/main/mtypes.h |   6 +
 src/mesa/main/multisample.c|  18 ++
 src/mesa/main/tests/dispatch_sanity.cpp|  10 +
 src/mesa/state_tracker/st_atom_framebuffer.c   |  64 ++
 src/mesa/state_tracker/st_cb_msaa.c|  22 ++
 src/mesa/state_tracker/st_extensions.c |   1 +
 53 files changed, 893 insertions(+), 72 deletions(-)

-- 
2.14.3



[Mesa-dev] [PATCH 4/5] nvc0: add support for programmable sample locations

2018-05-04 Thread Rhys Perry
Signed-off-by: Rhys Perry 
---
 .../drivers/nouveau/codegen/nv50_ir_driver.h   |   2 +
 .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp  |   7 +
 .../nouveau/codegen/nv50_ir_lowering_nvc0.cpp  |  91 ++--
 .../nouveau/codegen/nv50_ir_lowering_nvc0.h|   2 +
 src/gallium/drivers/nouveau/nv50/nv50_miptree.c|   1 +
 src/gallium/drivers/nouveau/nv50/nv50_resource.h   |   1 +
 src/gallium/drivers/nouveau/nvc0/nvc0_context.c|  35 +
 src/gallium/drivers/nouveau/nvc0/nvc0_context.h|  12 +-
 src/gallium/drivers/nouveau/nvc0/nvc0_miptree.c|   1 +
 src/gallium/drivers/nouveau/nvc0/nvc0_program.c|   3 +
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.c |   3 +-
 src/gallium/drivers/nouveau/nvc0/nvc0_state.c  |  11 ++
 .../drivers/nouveau/nvc0/nvc0_state_validate.c | 155 +
 13 files changed, 283 insertions(+), 41 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h
index 3d0782f86b..7c835ceab8 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h
@@ -73,6 +73,7 @@ struct nv50_ir_prog_symbol
 #define NVISA_GK104_CHIPSET0xe0
 #define NVISA_GK20A_CHIPSET0xea
 #define NVISA_GM107_CHIPSET0x110
+#define NVISA_GM200_CHIPSET0x120
 
 struct nv50_ir_prog_info
 {
@@ -145,6 +146,7 @@ struct nv50_ir_prog_info
  bool persampleInvocation;
  bool usesSampleMaskIn;
  bool readsFramebuffer;
+ bool readsSampleLocations;
   } fp;
   struct {
  uint32_t inputOffset; /* base address for user args */
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
index 3c5bad05fe..d7844d7381 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
@@ -1520,6 +1520,10 @@ void Source::scanInstructionSrc(const Instruction& insn,
  info->out[src.getIndex(0)].oread = 1;
   }
}
+   if (src.getFile() == TGSI_FILE_SYSTEM_VALUE) {
+  if (info->sv[src.getIndex(0)].sn == TGSI_SEMANTIC_SAMPLEPOS)
+ info->prop.fp.readsSampleLocations = true;
+   }
if (src.getFile() != TGSI_FILE_INPUT)
   return;
 
@@ -1560,6 +1564,9 @@ bool Source::scanInstruction(const struct 
tgsi_full_instruction *inst)
if (insn.getOpcode() == TGSI_OPCODE_FBFETCH)
   info->prop.fp.readsFramebuffer = true;
 
+   if (insn.getOpcode() == TGSI_OPCODE_INTERP_SAMPLE)
+  info->prop.fp.readsSampleLocations = true;
+
if (insn.dstCount()) {
   Instruction::DstRegister dst = insn.getDst(0);
 
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
index 29f674b451..f64439dbdc 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
@@ -2662,17 +2662,33 @@ NVC0LoweringPass::handleRDSV(Instruction *i)
   ld->subOp = NV50_IR_SUBOP_PIXLD_SAMPLEID;
   break;
case SV_SAMPLE_POS: {
-  Value *off = new_LValue(func, FILE_GPR);
-  ld = bld.mkOp1(OP_PIXLD, TYPE_U32, i->getDef(0), bld.mkImm(0));
+  Value *sampleID = bld.getScratch();
+  ld = bld.mkOp1(OP_PIXLD, TYPE_U32, sampleID, bld.mkImm(0));
   ld->subOp = NV50_IR_SUBOP_PIXLD_SAMPLEID;
-  bld.mkOp2(OP_SHL, TYPE_U32, off, i->getDef(0), bld.mkImm(3));
-  bld.mkLoad(TYPE_F32,
- i->getDef(0),
- bld.mkSymbol(
-   FILE_MEMORY_CONST, prog->driver->io.auxCBSlot,
-   TYPE_U32, prog->driver->io.sampleInfoBase +
-   4 * sym->reg.data.sv.index),
- off);
+  Value *offset = calculateSampleOffset(sampleID);
+
+  assert(prog->driver->prop.fp.readsSampleLocations);
+
+  if (targ->getChipset() >= NVISA_GM200_CHIPSET) {
+ bld.mkLoad(TYPE_F32,
+i->getDef(0),
+bld.mkSymbol(
+  FILE_MEMORY_CONST, prog->driver->io.auxCBSlot,
+  TYPE_U32, prog->driver->io.sampleInfoBase),
+offset);
+ bld.mkOp2(OP_EXTBF, TYPE_U32, i->getDef(0), i->getDef(0),
+   bld.mkImm(0x040c + sym->reg.data.sv.index * 16));
+ bld.mkCvt(OP_CVT, TYPE_F32, i->getDef(0), TYPE_U32, i->getDef(0));
+ bld.mkOp2(OP_MUL, TYPE_F32, i->getDef(0), i->getDef(0), bld.mkImm(1.0f/16.0f));
+  } else {
+ bld.mkLoad(TYPE_F32,
+i->getDef(0),
+bld.mkSymbol(
+  FILE_MEMORY_CONST, prog->driver->io.auxCBSlot,
+  TYPE_U32, prog->driver->io.sampleInfoBase +
+  4 * sym->reg.data.sv.index),
+offset);
+ 

[Mesa-dev] [PATCH 2/5] gallium: add support for programmable sample locations

2018-05-04 Thread Rhys Perry
Signed-off-by: Rhys Perry 
---
 src/gallium/auxiliary/cso_cache/cso_context.c| 31 +++
 src/gallium/auxiliary/cso_cache/cso_context.h|  5 
 src/gallium/auxiliary/util/u_framebuffer.c   | 32 
 src/gallium/auxiliary/util/u_framebuffer.h   |  6 +
 src/gallium/docs/source/context.rst  |  2 ++
 src/gallium/docs/source/screen.rst   |  3 +++
 src/gallium/drivers/etnaviv/etnaviv_screen.c |  1 +
 src/gallium/drivers/freedreno/freedreno_screen.c |  1 +
 src/gallium/drivers/i915/i915_screen.c   |  1 +
 src/gallium/drivers/llvmpipe/lp_screen.c |  1 +
 src/gallium/drivers/nouveau/nv30/nv30_screen.c   |  1 +
 src/gallium/drivers/nouveau/nv50/nv50_screen.c   |  1 +
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.c   |  2 +-
 src/gallium/drivers/r300/r300_screen.c   |  1 +
 src/gallium/drivers/r600/r600_pipe.c |  1 +
 src/gallium/drivers/radeonsi/si_get.c|  1 +
 src/gallium/drivers/softpipe/sp_screen.c |  1 +
 src/gallium/drivers/svga/svga_screen.c   |  1 +
 src/gallium/drivers/swr/swr_screen.cpp   |  1 +
 src/gallium/drivers/vc4/vc4_screen.c |  1 +
 src/gallium/drivers/vc5/vc5_screen.c |  1 +
 src/gallium/drivers/virgl/virgl_screen.c |  1 +
 src/gallium/include/pipe/p_context.h | 17 -
 src/gallium/include/pipe/p_defines.h |  1 +
 src/gallium/include/pipe/p_state.h   | 26 +++
 25 files changed, 138 insertions(+), 2 deletions(-)

diff --git a/src/gallium/auxiliary/cso_cache/cso_context.c 
b/src/gallium/auxiliary/cso_cache/cso_context.c
index 3a3a63a332..b3e09ada6e 100644
--- a/src/gallium/auxiliary/cso_cache/cso_context.c
+++ b/src/gallium/auxiliary/cso_cache/cso_context.c
@@ -121,6 +121,7 @@ struct cso_context {
boolean render_condition_cond, render_condition_cond_saved;
 
struct pipe_framebuffer_state fb, fb_saved;
+   struct pipe_sample_locations_state sample_locations, sample_locations_saved;
struct pipe_viewport_state vp, vp_saved;
struct pipe_blend_color blend_color;
unsigned sample_mask, sample_mask_saved;
@@ -745,6 +746,32 @@ cso_restore_framebuffer(struct cso_context *ctx)
 }
 
 
+void cso_set_sample_locations(struct cso_context *ctx,
+  const struct pipe_sample_locations_state *locs)
+{
+   size_t size = sizeof(ctx->sample_locations);
+   if (memcmp(&ctx->sample_locations, locs, size)) {
+  memcpy(&ctx->sample_locations, locs, size);
+  ctx->pipe->set_sample_locations_state(ctx->pipe, locs);
+   }
+}
+
+static void
+cso_save_sample_locations(struct cso_context *ctx)
+{
+   size_t size = sizeof(ctx->sample_locations);
+   memcpy(&ctx->sample_locations_saved, &ctx->sample_locations, size);
+}
+
+static void
+cso_restore_sample_locations(struct cso_context *ctx)
+{
+   size_t size = sizeof(ctx->sample_locations);
+   if (memcmp(&ctx->sample_locations, &ctx->sample_locations_saved, size))
+  cso_set_sample_locations(ctx, &ctx->sample_locations_saved);
+}
+
+
 void cso_set_viewport(struct cso_context *ctx,
   const struct pipe_viewport_state *vp)
 {
@@ -1636,6 +1663,8 @@ cso_save_state(struct cso_context *cso, unsigned state_mask)
   cso->pipe->set_active_query_state(cso->pipe, false);
if (state_mask & CSO_BIT_FRAGMENT_IMAGE0)
   cso_save_fragment_image0(cso);
+   if (state_mask & CSO_BIT_SAMPLE_LOCATIONS)
+  cso_save_sample_locations(cso);
 }
 
 
@@ -1691,6 +1720,8 @@ cso_restore_state(struct cso_context *cso)
   cso->pipe->set_active_query_state(cso->pipe, true);
if (state_mask & CSO_BIT_FRAGMENT_IMAGE0)
   cso_restore_fragment_image0(cso);
+   if (state_mask & CSO_BIT_SAMPLE_LOCATIONS)
+  cso_restore_sample_locations(cso);
 
cso->saved_state = 0;
 }
diff --git a/src/gallium/auxiliary/cso_cache/cso_context.h 
b/src/gallium/auxiliary/cso_cache/cso_context.h
index 3a4e808f0c..3f3b0ae3d4 100644
--- a/src/gallium/auxiliary/cso_cache/cso_context.h
+++ b/src/gallium/auxiliary/cso_cache/cso_context.h
@@ -133,6 +133,10 @@ void cso_set_framebuffer(struct cso_context *cso,
  const struct pipe_framebuffer_state *fb);
 
 
+void cso_set_sample_locations(struct cso_context *ctx,
+  const struct pipe_sample_locations_state *locs);
+
+
 void cso_set_viewport(struct cso_context *cso,
   const struct pipe_viewport_state *vp);
 void cso_set_viewport_dims(struct cso_context *ctx,
@@ -176,6 +180,7 @@ void cso_set_render_condition(struct cso_context *cso,
 #define CSO_BIT_VIEWPORT  0x4
 #define CSO_BIT_PAUSE_QUERIES 0x8
 #define CSO_BIT_FRAGMENT_IMAGE0  0x10
+#define CSO_BIT_SAMPLE_LOCATIONS 0x20
 
 #define CSO_BITS_ALL_SHADERS (CSO_BIT_VERTEX_SHADER | \
   CSO_BIT_FRAGMENT_SHADER | \
diff --git 

[Mesa-dev] [Bug 106401] swr driver failing mesa build on 32 bit

2018-05-04 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106401

Bug ID: 106401
   Summary: swr driver failing mesa build on 32 bit
   Product: Mesa
   Version: git
  Hardware: Other
OS: All
Status: NEW
  Severity: normal
  Priority: medium
 Component: Drivers/Gallium/swr
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: bluescreen_aven...@verizon.net
QA Contact: mesa-dev@lists.freedesktop.org

Hi

I am trying to rebuild mesa master in a 32-bit chroot. It looks like, after a
recent change, the build now tries to compile code that is 64-bit only.


I am getting this error,

  CXX  rasterizer/core/libswrAVX2_la-backend_clear.lo
  CXX  rasterizer/core/libswrAVX2_la-backend_sample.lo
In file included from ./rasterizer/common/simdintrin.h:28:0,
 from ./rasterizer/core/format_types.h:31,
 from ./rasterizer/core/format_traits.h:32,
 from ./rasterizer/core/multisample.h:30,
 from rasterizer/core/backend.h:33,
 from rasterizer/core/backend.cpp:32:
./rasterizer/common/simdlib.hpp: In member function ‘uint32_t
SIMDVecHash::operator()(Integer) const’:
./rasterizer/common/simdlib.hpp:610:9: error: static assertion failed: This
path only meant for 64-bit code
 static_assert(sizeof(void*) == 8, "This path only meant for 64-bit
code");
 ^
./rasterizer/common/simdlib.hpp:619:50: error: there are no arguments to
‘_mm_crc32_u64’ that depend on a template parameter, so a declaration of
‘_mm_crc32_u64’ must be available [-fpermissive]
 crc32 = _mm_crc32_u64(crc32, pData[i]);
  ^
./rasterizer/common/simdlib.hpp:619:50: note: (if you use ‘-fpermissive’, G++
will accept your code, but allowing the use of an undeclared name is
deprecated)
In file included from ./rasterizer/common/simdintrin.h:28:0,
 from ./rasterizer/core/format_types.h:31,
 from ./rasterizer/core/format_traits.h:32,
 from ./rasterizer/core/multisample.h:30,
 from ./rasterizer/core/backend.h:33,
 from rasterizer/core/api.cpp:35:
./rasterizer/common/simdlib.hpp: In member function ‘uint32_t
SIMDVecHash::operator()(Integer) const’:
./rasterizer/common/simdlib.hpp:610:9: error: static assertion failed: This
path only meant for 64-bit code
 static_assert(sizeof(void*) == 8, "This path only meant for 64-bit
code");
 ^
./rasterizer/common/simdlib.hpp:619:50: error: there are no arguments to
‘_mm_crc32_u64’ that depend on a template parameter, so a declaration of
‘_mm_crc32_u64’ must be available [-fpermissive]
 crc32 = _mm_crc32_u64(crc32, pData[i]);
  ^
./rasterizer/common/simdlib.hpp:619:50: note: (if you use ‘-fpermissive’, G++
will accept your code, but allowing the use of an undeclared name is
deprecated)
  CXX  rasterizer/core/libswrAVX2_la-backend_singlesample.lo
  CXX  rasterizer/core/libswrAVX2_la-binner.lo
Makefile:3101: recipe for target 'rasterizer/core/libswrAVX2_la-backend.lo'
failed
make[5]: *** [rasterizer/core/libswrAVX2_la-backend.lo] Error 1
make[5]: *** Waiting for unfinished jobs
Makefile:3094: recipe for target 'rasterizer/core/libswrAVX2_la-api.lo' failed
make[5]: *** [rasterizer/core/libswrAVX2_la-api.lo] Error 1
In file included from ./rasterizer/common/simdintrin.h:28:0,
 from ./rasterizer/core/format_types.h:31,
 from ./rasterizer/core/format_traits.h:32,
 from ./rasterizer/core/multisample.h:30,
 from rasterizer/core/backend.h:33,
 from rasterizer/core/backend_sample.cpp:32:
./rasterizer/common/simdlib.hpp: In member function ‘uint32_t
SIMDVecHash::operator()(Integer) const’:
./rasterizer/common/simdlib.hpp:610:9: error: static assertion failed: This
path only meant for 64-bit code
 static_assert(sizeof(void*) == 8, "This path only meant for 64-bit
code");
 ^
./rasterizer/common/simdlib.hpp:619:50: error: there are no arguments to
‘_mm_crc32_u64’ that depend on a template parameter, so a declaration of
‘_mm_crc32_u64’ must be available [-fpermissive]
 crc32 = _mm_crc32_u64(crc32, pData[i]);
  ^
./rasterizer/common/simdlib.hpp:619:50: note: (if you use ‘-fpermissive’, G++
will accept your code, but allowing the use of an undeclared name is
deprecated)
In file included from ./rasterizer/common/simdintrin.h:28:0,
 from rasterizer/core/conservativeRast.h:28,
 from rasterizer/core/binner.h:29,
 from rasterizer/core/binner.cpp:29:
./rasterizer/common/simdlib.hpp: In member function ‘uint32_t
SIMDVecHash::operator()(Integer) const’:
./rasterizer/common/simdlib.hpp:610:9: error: 
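
A minimal sketch of one way such a hash could stay buildable on both pointer
widths, assuming the 64-bit CRC step can be replaced by two 32-bit steps on a
32-bit target. The helpers below (crc32c_u8, crc32c_u32, crc32c_u64,
hash_words) are hypothetical software stand-ins for the SSE4.2 intrinsics so
that the example compiles without -msse4.2; this is not the swr code and not
necessarily the fix that was applied.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Software model of one CRC-32C (Castagnoli) byte step.
 * Hypothetical stand-in for the _mm_crc32_u8 intrinsic. */
static uint32_t crc32c_u8(uint32_t crc, uint8_t byte)
{
    crc ^= byte;
    for (int bit = 0; bit < 8; bit++)
        crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    return crc;
}

/* Feed a 32-bit value, least significant byte first. */
static uint32_t crc32c_u32(uint32_t crc, uint32_t v)
{
    for (int i = 0; i < 4; i++)
        crc = crc32c_u8(crc, (uint8_t)(v >> (8 * i)));
    return crc;
}

/* Feed a 64-bit value, least significant byte first. */
static uint32_t crc32c_u64(uint32_t crc, uint64_t v)
{
    for (int i = 0; i < 8; i++)
        crc = crc32c_u8(crc, (uint8_t)(v >> (8 * i)));
    return crc;
}

/* Hash an array of 64-bit words, picking the step by pointer width
 * instead of asserting that the build is 64-bit. */
static uint32_t hash_words(const uint64_t *data, size_t count)
{
    uint32_t crc = 0;

    assert(data != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (sizeof(void *) == 8) {
            crc = crc32c_u64(crc, data[i]);
        } else {
            /* 32-bit path: low half first, then high half. */
            crc = crc32c_u32(crc, (uint32_t)data[i]);
            crc = crc32c_u32(crc, (uint32_t)(data[i] >> 32));
        }
    }
    return crc;
}
```

In this model both paths consume the same bytes in the same order, so the
resulting hash does not depend on which branch was compiled in.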

Re: [Mesa-dev] [RFC PATCH] mesa/st/cb_clear: in st_Clear also validate the render state (needed by virgl)

2018-05-04 Thread Gert Wollny
On Thursday, 03.05.2018, at 19:52 +0200, Gert Wollny wrote:
> On Thursday, 03.05.2018, at 13:24 -0400, Ilia Mirkin wrote:
> > 
> > The api call is "clear", not "glClear in the context of whatever
> > random GL state there might be". When the gallium clear API is
> > invoked, the bound framebuffer needs to be cleared. This is how the
> > API works, this is how all drivers implement it. It's basically
> > memset(). It doesn't care about rasterizer discard or anything
> > else.
> > 
> 
> I stand corrected and sorry for the noise.
Actually no: 

The OpenGL standard section 14.1 says: 

"When enabled, RASTERIZER_DISCARD also causes the Clear and
ClearBuffer* commands to be ignored."

The glClear command has the following call chain: 

  glClear(...) = _mesa_Clear at mesa/main/clear.c:242
  clear(...)  at mesa/main/clear.c:221
  st_Clear(...) at mesa/state_tracker/st_cb_clear.c:366

clear() takes care of validating the non-driver state changes stored in
ctx->NewState, but changes to RASTERIZER_DISCARD are stored in
ctx->NewDriverState. In this case the driver is the gallium state
tracker, and hence st_Clear should also validate the render state.

The reason most drivers get away without this validation is that they
use the blitter, which also validates the render state before doing the
actual clear operation.

best, 
Gert 
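
A minimal sketch of the ordering problem described above, assuming a toy
context with a single dirty flag. None of these names (toy_ctx,
toy_validate_driver_state, toy_clear_validated, ...) exist in Mesa; they only
model the idea that a clear which skips validation acts on stale
rasterizer-discard state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy context; all fields are hypothetical. */
struct toy_ctx {
    bool discard_requested;  /* what the application set last */
    bool discard_validated;  /* what the driver currently believes */
    bool driver_state_dirty; /* models a pending NewDriverState bit */
    int  clears_executed;
};

/* Changing the state only records it and marks it dirty. */
static void toy_set_rasterizer_discard(struct toy_ctx *ctx, bool enable)
{
    assert(ctx != NULL);
    ctx->discard_requested = enable;
    ctx->driver_state_dirty = true;
}

/* Validation pushes the pending state into the driver's view. */
static void toy_validate_driver_state(struct toy_ctx *ctx)
{
    if (ctx->driver_state_dirty) {
        ctx->discard_validated = ctx->discard_requested;
        ctx->driver_state_dirty = false;
    }
}

/* A clear that trusts whatever was validated last. */
static void toy_clear_unvalidated(struct toy_ctx *ctx)
{
    if (!ctx->discard_validated)
        ctx->clears_executed++;
}

/* A clear that validates first, as proposed for st_Clear. */
static void toy_clear_validated(struct toy_ctx *ctx)
{
    toy_validate_driver_state(ctx);
    toy_clear_unvalidated(ctx);
}
```

With discard freshly enabled, toy_clear_unvalidated() still clears because it
sees the stale value, while toy_clear_validated() correctly ignores the clear.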



[Mesa-dev] [Bug 106400] [CI][SNB] glsl-1.40 and 1.50 texturesize / geometry tests failing

2018-05-04 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106400

--- Comment #2 from Tomi Sarvela  ---
Thanks,

It seems that there is some issue with blacklisting. From runlog, it seems that
these tests should be excluded (line beginning with Options)

https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_4143/pig-snb-2600/run0.log

So the problem is in our execution environment.

-- 
You are receiving this mail because:
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH v2 08/14] mesa: handle OES_texture_half_float formats in _mesa_base_tex_format()

2018-05-04 Thread Tapani Pälli



On 05/04/2018 02:12 PM, Tapani Pälli wrote:



On 05/04/2018 10:15 AM, Christian Gmeiner wrote:

Hi



On 05/01/2018 05:48 PM, Christian Gmeiner wrote:

Signed-off-by: Christian Gmeiner 
Reviewed-by: Wladimir J. van der Laan 
---
   src/mesa/main/glformats.c | 19 +++
   1 file changed, 19 insertions(+)

diff --git a/src/mesa/main/glformats.c b/src/mesa/main/glformats.c
index cba5e670db..1d3d524875 100644
--- a/src/mesa/main/glformats.c
+++ b/src/mesa/main/glformats.c
@@ -2424,6 +2424,25 @@ _mesa_base_tex_format(const struct gl_context *ctx, GLint internalFormat)

    return GL_YCBCR_MESA;
  }

+   if (ctx->Extensions.OES_texture_half_float) {



this could be



if (ctx->Extensions.ARB_texture_float ||
    ctx->Extensions.OES_texture_half_float)



I tried to separate half float and float handling, so I think your
suggestion will not work here.


+   switch (internalFormat) {
+   case GL_ALPHA16F_ARB:
+  return GL_ALPHA;
+   case GL_RGBA16F_ARB:
+  return GL_RGBA;
+   case GL_RGB16F_ARB:
+  return GL_RGB;
+   case GL_INTENSITY16F_ARB:
+  return GL_INTENSITY;
+   case GL_LUMINANCE16F_ARB:
+  return GL_LUMINANCE;
+   case GL_LUMINANCE_ALPHA16F_ARB:
+  return GL_LUMINANCE_ALPHA;
+   default:
+  ; /* fallthrough */
+   }
+   }
+



It seems like we miss OES_texture_float as well .. with the above change
(separation of half float formats from full float ones) we could have a
OES_texture_float || ARB_texture_float check for the rest?



Yeah.. makes sense.



  if (ctx->Extensions.ARB_texture_float) {
 switch (internalFormat) {
 case GL_ALPHA16F_ARB:




What about something like this:

-8<---

diff --git a/src/mesa/main/glformats.c b/src/mesa/main/glformats.c
index cba5e670db..f6c252cdf0 100644
--- a/src/mesa/main/glformats.c
+++ b/src/mesa/main/glformats.c
@@ -2424,7 +2424,27 @@ _mesa_base_tex_format(const struct gl_context *ctx, GLint internalFormat)
    return GL_YCBCR_MESA;
  }

-   if (ctx->Extensions.ARB_texture_float) {
+   if (ctx->Extensions.OES_texture_half_float) {
+   switch (internalFormat) {
+   case GL_ALPHA16F_ARB:
+  return GL_ALPHA;
+   case GL_RGBA16F_ARB:
+  return GL_RGBA;
+   case GL_RGB16F_ARB:
+  return GL_RGB;
+   case GL_INTENSITY16F_ARB:
+  return GL_INTENSITY;
+   case GL_LUMINANCE16F_ARB:
+  return GL_LUMINANCE;
+   case GL_LUMINANCE_ALPHA16F_ARB:
+  return GL_LUMINANCE_ALPHA;
+   default:
+  ; /* fallthrough */
+   }
+   }
+
+   if (ctx->Extensions.ARB_texture_float ||
+   ctx->Extensions.OES_texture_float) {
 switch (internalFormat) {
 case GL_ALPHA16F_ARB:
 case GL_ALPHA32F_ARB:

-8<---



Right this is what I was after, I believe you should be able to also 
remove all the 16F formats from the list below since they are already 
checked above. So first check 16F ones and then 32F ones separately.




So for this separation to happen we would have ...

for 16F ones:

if (ctx->Extensions.ARB_texture_float || 
ctx->Extensions.OES_texture_half_float)


and another one for 32F ones:

if (ctx->Extensions.ARB_texture_float ||
ctx->Extensions.OES_texture_float)


// Tapani
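
A minimal sketch of the two-stage check proposed above, assuming a reduced toy
model with one 16F and one 32F format. The enum and struct names are
hypothetical and stand in for the GL enums and gl_context fields; only the
shape of the extension checks is taken from the discussion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for GL_RGBA16F_ARB / GL_RGBA32F_ARB etc. */
enum toy_format { TOY_RGBA16F, TOY_RGBA32F };
enum toy_base   { TOY_BASE_NONE, TOY_BASE_RGBA };

struct toy_extensions {
    bool ARB_texture_float;
    bool OES_texture_float;
    bool OES_texture_half_float;
};

static enum toy_base
toy_base_tex_format(const struct toy_extensions *ext, enum toy_format fmt)
{
    assert(ext != NULL);

    /* 16F formats: desktop float textures or the ES half-float extension. */
    if (ext->ARB_texture_float || ext->OES_texture_half_float) {
        switch (fmt) {
        case TOY_RGBA16F:
            return TOY_BASE_RGBA;
        default:
            break; /* fall through to the 32F check */
        }
    }

    /* 32F formats: desktop float textures or the ES float extension. */
    if (ext->ARB_texture_float || ext->OES_texture_float) {
        switch (fmt) {
        case TOY_RGBA32F:
            return TOY_BASE_RGBA;
        default:
            break;
        }
    }

    return TOY_BASE_NONE;
}
```

With only OES_texture_half_float set, the 16F format resolves and the 32F one
does not; with only OES_texture_float set it is the other way around.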


Re: [Mesa-dev] [PATCH v2 08/14] mesa: handle OES_texture_half_float formats in _mesa_base_tex_format()

2018-05-04 Thread Tapani Pälli



On 05/04/2018 10:15 AM, Christian Gmeiner wrote:

Hi



On 05/01/2018 05:48 PM, Christian Gmeiner wrote:

Signed-off-by: Christian Gmeiner 
Reviewed-by: Wladimir J. van der Laan 
---
   src/mesa/main/glformats.c | 19 +++
   1 file changed, 19 insertions(+)

diff --git a/src/mesa/main/glformats.c b/src/mesa/main/glformats.c
index cba5e670db..1d3d524875 100644
--- a/src/mesa/main/glformats.c
+++ b/src/mesa/main/glformats.c
@@ -2424,6 +2424,25 @@ _mesa_base_tex_format(const struct gl_context *ctx, GLint internalFormat)

return GL_YCBCR_MESA;
  }

+   if (ctx->Extensions.OES_texture_half_float) {



this could be



if (ctx->Extensions.ARB_texture_float ||
    ctx->Extensions.OES_texture_half_float)



I tried to separate half float and float handling, so I think your
suggestion will not work here.


+   switch (internalFormat) {
+   case GL_ALPHA16F_ARB:
+  return GL_ALPHA;
+   case GL_RGBA16F_ARB:
+  return GL_RGBA;
+   case GL_RGB16F_ARB:
+  return GL_RGB;
+   case GL_INTENSITY16F_ARB:
+  return GL_INTENSITY;
+   case GL_LUMINANCE16F_ARB:
+  return GL_LUMINANCE;
+   case GL_LUMINANCE_ALPHA16F_ARB:
+  return GL_LUMINANCE_ALPHA;
+   default:
+  ; /* fallthrough */
+   }
+   }
+



It seems like we miss OES_texture_float as well .. with the above change
(separation of half float formats from full float ones) we could have a
OES_texture_float || ARB_texture_float check for the rest?



Yeah.. makes sense.



  if (ctx->Extensions.ARB_texture_float) {
 switch (internalFormat) {
 case GL_ALPHA16F_ARB:




What about something like this:

-8<---

diff --git a/src/mesa/main/glformats.c b/src/mesa/main/glformats.c
index cba5e670db..f6c252cdf0 100644
--- a/src/mesa/main/glformats.c
+++ b/src/mesa/main/glformats.c
@@ -2424,7 +2424,27 @@ _mesa_base_tex_format(const struct gl_context *ctx, GLint internalFormat)
return GL_YCBCR_MESA;
  }

-   if (ctx->Extensions.ARB_texture_float) {
+   if (ctx->Extensions.OES_texture_half_float) {
+   switch (internalFormat) {
+   case GL_ALPHA16F_ARB:
+  return GL_ALPHA;
+   case GL_RGBA16F_ARB:
+  return GL_RGBA;
+   case GL_RGB16F_ARB:
+  return GL_RGB;
+   case GL_INTENSITY16F_ARB:
+  return GL_INTENSITY;
+   case GL_LUMINANCE16F_ARB:
+  return GL_LUMINANCE;
+   case GL_LUMINANCE_ALPHA16F_ARB:
+  return GL_LUMINANCE_ALPHA;
+   default:
+  ; /* fallthrough */
+   }
+   }
+
+   if (ctx->Extensions.ARB_texture_float ||
+   ctx->Extensions.OES_texture_float) {
 switch (internalFormat) {
 case GL_ALPHA16F_ARB:
 case GL_ALPHA32F_ARB:

-8<---



Right this is what I was after, I believe you should be able to also 
remove all the 16F formats from the list below since they are already 
checked above. So first check 16F ones and then 32F ones separately.


// Tapani


Re: [Mesa-dev] [PATCH v3] i965: Fix ETC2/EAC GetCompressed* functions on Gen7 GPUs

2018-05-04 Thread Eleni Maria Stea
Hi Eero,

Thanks for your feedback,

On Thu, 3 May 2018 13:30:38 +0300
Eero Tamminen  wrote:

> Hi,
> 
> On 02.05.2018 20:19, Matt Turner wrote:
> > On Wed, May 2, 2018 at 9:13 AM, Eleni Maria Stea 
> > wrote:  
> >> Gen 7 GPUs store the compressed EAC/ETC2 images in other
> >> non-compressed formats that can render. When GetCompressed*
> >> functions are called, the pixels are returned in the
> >> non-compressed format that is used for the rendering.
> >>
> >> With this patch we store both the compressed and non-compressed
> >> versions of the image, so that both rendering commands and
> >> GetCompressed* commands work.
> >>
> >> Also, the assertions for GL_MAP_WRITE_BIT and
> >> GL_MAP_INVALIDATE_RANGE_BIT in intel_miptree_map_etc function have
> >> been removed because when the miptree is mapped for reading (for
> >> example from a GetCompress* function) the GL_MAP_WRITE_BIT won't
> >> be set (and shouldn't be set).
> >>
> >> Fixes: the following test in CTS for gen7:
> >> KHR-GL45.direct_state_access.textures_compressed_subimage test
> >>
> >> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104272  
> > 
> > I think you can add
> > 
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=81843
> > 
> > as well :)  
> 
> This is really lovely feature.
> 
> Compressed texture formats are used to:
> 1. Reduce disk / network usage for the application install
> 2. Increase run-time performance (by reducing required bandwidth)
> 3. Reduce program memory usage
> 
> At the cost of worse texture quality.
> 
> Mesa transparently converting ETC to uncompressed data on platforms
> that don't support ETC in HW, means that application doesn't get 2),
> just worse texture quality, although some applications would have
> capability to fall back to another (HW supported) texture compression
> format.
> 
> And this new patch means that instead of 3), memory usage actually
> _increases_ compared to application using non-compressed textures
> directly.
> 

You are right about the memory usage and about the purpose of the patch.

> Some (many?) applications might fail to run if ETC isn't supported, so
> I  understand why this feature is done, but it would be nice to have
> some better way to handle it.
> 
> Maybe some new extension that can be used by future game engines &
> application toolkits to query which of the compressed texture formats
> are faked, so that they can instead select a compression format that
> actually provides run-time benefits?
> 
> 
>   - Eero

After having a look, I think that we already have an extension for that
type of query: ARB_internalformat_query2. We are not sure if the
existing queries would fit, but if they don't, we think a new extension
should add more queries to this one rather than introduce new methods.
What do you think?

Thank you,
Eleni 
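
The memory trade-off discussed above can be made concrete with a small
calculation. This is only a sketch: ETC2 RGB8 uses 8 bytes per 4x4 block, and
the decompressed copy is assumed here to take 4 bytes per pixel. The function
names are hypothetical and do not come from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes for a width x height ETC2 RGB8 image: 8 bytes per 4x4 block. */
static size_t etc2_rgb8_bytes(size_t width, size_t height)
{
    size_t blocks_x = (width + 3) / 4;
    size_t blocks_y = (height + 3) / 4;
    return blocks_x * blocks_y * 8;
}

/* Bytes for the decompressed copy, assuming 4 bytes per pixel. */
static size_t uncompressed_bytes(size_t width, size_t height)
{
    return width * height * 4;
}

/* Total when both the compressed and the decompressed copy are kept. */
static size_t shadow_copy_total_bytes(size_t width, size_t height)
{
    assert(width > 0 && height > 0);
    return etc2_rgb8_bytes(width, height) + uncompressed_bytes(width, height);
}
```

For a 1024x1024 texture this gives 512 KiB compressed plus 4 MiB decompressed,
so 4.5 MiB in total, which is more than the 4 MiB an application would use by
uploading uncompressed data directly.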



[Mesa-dev] [PATCH 2/4] i965/compiler: handle conversion to smaller type in the lowering pass for that

2018-05-04 Thread Iago Toral Quiroga
This rolls back the revert of this same patch, which was introduced in
commit 7b9c15628aae8729118b648f5f473e6ac926b99b.
---
 src/intel/compiler/brw_fs_lower_conversions.cpp |  5 -
 src/intel/compiler/brw_fs_nir.cpp   | 14 +++---
 2 files changed, 7 insertions(+), 12 deletions(-)

diff --git a/src/intel/compiler/brw_fs_lower_conversions.cpp b/src/intel/compiler/brw_fs_lower_conversions.cpp
index 663c9674c4..f95b39d3e8 100644
--- a/src/intel/compiler/brw_fs_lower_conversions.cpp
+++ b/src/intel/compiler/brw_fs_lower_conversions.cpp
@@ -54,7 +54,7 @@ fs_visitor::lower_conversions()
   bool saturate = inst->saturate;
 
   if (supports_type_conversion(inst)) {
- if (get_exec_type_size(inst) == 8 && type_sz(inst->dst.type) < 8) {
+ if (type_sz(inst->dst.type) < get_exec_type_size(inst)) {
 /* From the Broadwell PRM, 3D Media GPGPU, "Double Precision Float to
  * Single Precision Float":
  *
@@ -64,6 +64,9 @@ fs_visitor::lower_conversions()
  * So we need to allocate a temporary that's two registers, and then do
  * a strided MOV to get the lower DWord of every Qword that has the
  * result.
+ *
+ * This restriction applies, in general, whenever we convert to
+ * a type with a smaller bit-size.
  */
 fs_reg temp = ibld.vgrf(get_exec_type(inst));
 fs_reg strided_temp = subscript(temp, dst.type, 0);
diff --git a/src/intel/compiler/brw_fs_nir.cpp b/src/intel/compiler/brw_fs_nir.cpp
index c7f7bc21b8..1ce89520bf 100644
--- a/src/intel/compiler/brw_fs_nir.cpp
+++ b/src/intel/compiler/brw_fs_nir.cpp
@@ -755,19 +755,9 @@ fs_visitor::nir_emit_alu(const fs_builder &bld, nir_alu_instr *instr)
*/
 
case nir_op_f2f16_undef:
-   case nir_op_i2i16:
-   case nir_op_u2u16: {
-  /* TODO: Fixing aligment rules for conversions from 32-bits to
-   * 16-bit types should be moved to lower_conversions
-   */
-  fs_reg tmp = bld.vgrf(op[0].type, 1);
-  tmp = subscript(tmp, result.type, 0);
-  inst = bld.MOV(tmp, op[0]);
-  inst->saturate = instr->dest.saturate;
-  inst = bld.MOV(result, tmp);
+  inst = bld.MOV(result, op[0]);
   inst->saturate = instr->dest.saturate;
   break;
-   }
 
case nir_op_f2f64:
case nir_op_f2i64:
@@ -807,6 +797,8 @@ fs_visitor::nir_emit_alu(const fs_builder &bld, nir_alu_instr *instr)
case nir_op_f2u16:
case nir_op_i2i32:
case nir_op_u2u32:
+   case nir_op_i2i16:
+   case nir_op_u2u16:
case nir_op_i2f16:
case nir_op_u2f16:
   inst = bld.MOV(result, op[0]);
-- 
2.14.1
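
A minimal sketch of what the two-step lowering amounts to for a 32-bit to
16-bit conversion, modelled on plain arrays. The converting MOV writes each
result into the low word of a 32-bit slot (a stride of 2 in units of words),
and a second MOV of matching type packs the words. The function names and the
fixed channel count are hypothetical; this models the data movement only, not
the compiler pass itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_CHANNELS 8

/* Step 1: narrowing MOV into a temporary that keeps one 32-bit slot per
 * channel; the 16-bit result lands in word 0 of each slot. */
static void toy_convert_strided(const uint32_t *src, uint16_t *temp_words,
                                size_t channels)
{
    for (size_t ch = 0; ch < channels; ch++)
        temp_words[2 * ch] = (uint16_t)src[ch];
}

/* Step 2: same-type MOV from the strided temporary to the packed
 * destination. */
static void toy_pack(const uint16_t *temp_words, uint16_t *dst,
                     size_t channels)
{
    for (size_t ch = 0; ch < channels; ch++)
        dst[ch] = temp_words[2 * ch];
}

static void toy_lower_u32_to_u16(const uint32_t *src, uint16_t *dst,
                                 size_t channels)
{
    uint16_t temp_words[2 * TOY_CHANNELS] = { 0 };

    assert(channels <= TOY_CHANNELS);
    toy_convert_strided(src, temp_words, channels);
    toy_pack(temp_words, dst, channels);
}
```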



[Mesa-dev] [PATCH 4/4] anv/device: expose shaderInt16 support in gen8+

2018-05-04 Thread Iago Toral Quiroga
This rolls back the revert of this patch, which was introduced with
commit 7cf284f18e6774c810ed6db17b98e597bf96f8a5.
---
 src/intel/vulkan/anv_device.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index 0563eae5c1..fd516fb846 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -759,7 +759,7 @@ void anv_GetPhysicalDeviceFeatures(
   
pdevice->info.has_64bit_types,
   .shaderInt64  = pdevice->info.gen >= 8 &&
   
pdevice->info.has_64bit_types,
-  .shaderInt16  = false,
+  .shaderInt16  = pdevice->info.gen >= 8,
   .shaderResourceMinLod = false,
   .variableMultisampleRate  = true,
   .inheritedQueries = true,
-- 
2.14.1



[Mesa-dev] [PATCH 1/4] intel/compiler: handle 16-bit to 64-bit conversions in BSW platforms

2018-05-04 Thread Iago Toral Quiroga
These are subject to the general restriction that anything that is converted
to 64-bit needs to be aligned to 64-bit.  We had this already in place for
32-bit to 64-bit conversions, so this patch generalizes the implementation
to take effect on any conversion to 64-bit from a source smaller than
64-bit.

Fixes assembly validation errors in the following CTS tests in BSW:
dEQP-VK.spirv_assembly.instruction.compute.sconvert.int16_to_int64
dEQP-VK.spirv_assembly.instruction.compute.uconvert.uint16_to_uint64
dEQP-VK.spirv_assembly.instruction.compute.sconvert.int16_to_uint64

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106389
---
 src/intel/compiler/brw_fs_nir.cpp | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/intel/compiler/brw_fs_nir.cpp b/src/intel/compiler/brw_fs_nir.cpp
index f9fde145a1..c7f7bc21b8 100644
--- a/src/intel/compiler/brw_fs_nir.cpp
+++ b/src/intel/compiler/brw_fs_nir.cpp
@@ -785,12 +785,12 @@ fs_visitor::nir_emit_alu(const fs_builder &bld, nir_alu_instr *instr)
*the same qword.
* (...)"
*
-   * This means that 32-bit to 64-bit conversions need to have the 32-bit
-   * data elements aligned to 64-bit. This restriction does not apply to
-   * BDW and later.
+   * This means that conversions from bit-sizes smaller than 64-bit to
+   * 64-bit need to have the source data elements aligned to 64-bit.
+   * This restriction does not apply to BDW and later.
*/
   if (nir_dest_bit_size(instr->dest.dest) == 64 &&
-  nir_src_bit_size(instr->src[0].src) == 32 &&
+  nir_src_bit_size(instr->src[0].src) < 64 &&
   (devinfo->is_cherryview || gen_device_info_is_9lp(devinfo))) {
  fs_reg tmp = bld.vgrf(result.type, 1);
  tmp = subscript(tmp, op[0].type, 0);
-- 
2.14.1
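
A minimal sketch of the generalized condition, assuming a toy device
description. toy_devinfo and toy_needs_aligned_source are hypothetical names;
the predicate mirrors the check in the hunk above (destination is 64-bit,
source is narrower, platform is Cherryview or gen9lp).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reduced device description. */
struct toy_devinfo {
    bool is_cherryview;
    bool is_gen9lp;
};

/* True when the source of a conversion must first be moved to a
 * 64-bit aligned temporary. */
static bool toy_needs_aligned_source(const struct toy_devinfo *dev,
                                     unsigned dst_bits, unsigned src_bits)
{
    assert(dev != NULL);
    return dst_bits == 64 &&
           src_bits < 64 &&
           (dev->is_cherryview || dev->is_gen9lp);
}
```

The old check (src_bits == 32) would have returned false for a 16-bit source
on Cherryview, which is the case the listed CTS tests exercise.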



[Mesa-dev] [PATCH 3/4] intel/compiler: Fix lower_conversions for 8-bit types.

2018-05-04 Thread Iago Toral Quiroga
From: Jose Maria Casanova Crespo 

For 8-bit types the execution type is word. A byte raw MOV has a 16-bit
execution type and an 8-bit destination, and it shouldn't be considered
a conversion case. So there is no need to change the alignment or to
lower these instructions in lower_conversions.

Fixes a regression in the piglit test "glsl-fs-shader-stencil-export"
that is introduced with this patch from the Vulkan shaderInt16 series:
'i965/compiler: handle conversion to smaller type in the lowering
pass for that'. The problem arises because there is already a case
in the driver that injects Byte instructions like this:

mov(8)  g127<1>UB   g2<32,8,4>UB

And the aforementioned pass was not accounting for the special
handling of the execution size of Byte instructions. This patch
fixes this.

v2: (Jason Ekstrand)
   - Simplify is_byte_raw_mov, include a reference to the PRM, and do
   not consider B <-> UB conversions as raw movs.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106393
---
 src/intel/compiler/brw_fs_lower_conversions.cpp | 21 -
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/src/intel/compiler/brw_fs_lower_conversions.cpp b/src/intel/compiler/brw_fs_lower_conversions.cpp
index f95b39d3e8..f6c936cf21 100644
--- a/src/intel/compiler/brw_fs_lower_conversions.cpp
+++ b/src/intel/compiler/brw_fs_lower_conversions.cpp
@@ -43,6 +43,24 @@ supports_type_conversion(const fs_inst *inst) {
}
 }
 
+/* From the SKL PRM Vol 2a, "Move":
+ *
+ *"A mov with the same source and destination type, no source modifier,
+ *and no saturation is a raw move. A packed byte destination region (B
+ *or UB type with HorzStride == 1 and ExecSize > 1) can only be written
+ *using raw move."
+ */
+static bool
+is_byte_raw_mov (const fs_inst *inst)
+{
+   return type_sz(inst->dst.type) == 1 &&
+  inst->opcode == BRW_OPCODE_MOV &&
+  inst->src[0].type == inst->dst.type &&
+  !inst->saturate &&
+  !inst->src[0].negate &&
+  !inst->src[0].abs;
+}
+
 bool
 fs_visitor::lower_conversions()
 {
@@ -54,7 +72,8 @@ fs_visitor::lower_conversions()
   bool saturate = inst->saturate;
 
   if (supports_type_conversion(inst)) {
- if (type_sz(inst->dst.type) < get_exec_type_size(inst)) {
+ if (type_sz(inst->dst.type) < get_exec_type_size(inst) &&
+ !is_byte_raw_mov(inst)) {
 /* From the Broadwell PRM, 3D Media GPGPU, "Double Precision Float to
  * Single Precision Float":
  *
-- 
2.14.1
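
A minimal sketch of the resulting lowering decision on a toy instruction
record. The types and helpers (toy_inst, toy_exec_size, toy_needs_lowering)
are hypothetical simplifications: the execution size is modelled as the source
type size promoted to at least a word, which is an assumption made for this
example rather than the real get_exec_type_size().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_type   { TOY_B, TOY_UB, TOY_W, TOY_UW, TOY_D, TOY_UD };
enum toy_opcode { TOY_MOV, TOY_ADD };

struct toy_inst {
    enum toy_opcode opcode;
    enum toy_type   dst_type;
    enum toy_type   src_type;
    bool            saturate;
    bool            src_negate;
    bool            src_abs;
};

static unsigned toy_type_size(enum toy_type t)
{
    switch (t) {
    case TOY_B:
    case TOY_UB:
        return 1;
    case TOY_W:
    case TOY_UW:
        return 2;
    default:
        return 4;
    }
}

/* Assumption: byte sources execute as words. */
static unsigned toy_exec_size(const struct toy_inst *inst)
{
    unsigned size = toy_type_size(inst->src_type);
    return size < 2 ? 2 : size;
}

/* Same type on both sides, no modifiers, no saturation, byte-sized. */
static bool toy_is_byte_raw_mov(const struct toy_inst *inst)
{
    return toy_type_size(inst->dst_type) == 1 &&
           inst->opcode == TOY_MOV &&
           inst->src_type == inst->dst_type &&
           !inst->saturate &&
           !inst->src_negate &&
           !inst->src_abs;
}

static bool toy_needs_lowering(const struct toy_inst *inst)
{
    assert(inst != NULL);
    return toy_type_size(inst->dst_type) < toy_exec_size(inst) &&
           !toy_is_byte_raw_mov(inst);
}
```

A UB to UB MOV without modifiers is left alone, whereas the same MOV with
saturation, a B to UB MOV, or a UD to UW MOV is still sent down the lowering
path.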



[Mesa-dev] [PATCH 0/4] Intel: fixes to regressions caused by shaderInt16

2018-05-04 Thread Iago Toral Quiroga
This series fixes these two bug reports:
https://bugs.freedesktop.org/show_bug.cgi?id=106393
https://bugs.freedesktop.org/show_bug.cgi?id=106389

Caused by the shaderInt16 series we landed yesterday. Details follow:

Bug #106389 was triggered on BSW-like platforms, which apparently are not
run during developer Jenkins runs due to excessive run times. The problem
was manifesting as some test failures in Intel CI. Although the same tests would
not fail for me, at least on a BSW NUC I have around, I could see assembly
validation errors being printed when INTEL_DEBUG was used. Thankfully, this is
a trivial fix: all conversions to 64b need to be aligned to 64b on these
platforms, and we were already doing this for 32b to 64b conversions, so we
just had to generalize the code.

Bug #106393: I am not sure why we were not seeing this in our Jenkins runs,
but we can certainly reproduce it locally. We had, in fact, seen the problem
before while we were experimenting with int8 support in the compiler since it is
specific to Byte instructions (MOVs in particular). It seems that there is at
least one case in the driver where we emit a Byte MOV instruction already and
that was causing the problem. We already had a fix for this that we had been
discussing with Jason for some time; we just were not aware that we already had
this situation in the driver.

Patch 1 should fix the first bug. I verified that the CTS tests mentioned in
the bug report no longer show validation errors in the assembly output with
INTEL_DEBUG=cs. The tests were already passing for me without the fix though.

Patch 2 is a rollback of the revert of that same commit.

Patch 3 fixes the second bug, which is introduced in Patch 2. Patch 3 was
created on top of patch 2 and it is a very specific fix, so at least for
review purposes we think it makes sense to keep both patches separate. However,
since patch 2 introduces a regression, we think we want to squash them
together before pushing.

Patch 4 re-enables shaderInt16.

Mark, I ran these patches through Jenkins and didn't get any regressions, but
since I didn't get them when I was testing the original shaderInt16 series
either, I'd ask if you can verify the fixes on your end (especially on BSW-like
platforms).

Iago

Iago Toral Quiroga (3):
  intel/compiler: handle 16-bit to 64-bit conversions in BSW platforms
  i965/compiler: handle conversion to smaller type in the lowering pass
for that
  anv/device: expose shaderInt16 support in gen8+

Jose Maria Casanova Crespo (1):
  intel/compiler: Fix lower_conversions for 8-bit types.

 src/intel/compiler/brw_fs_lower_conversions.cpp | 24 +++-
 src/intel/compiler/brw_fs_nir.cpp   | 22 +++---
 src/intel/vulkan/anv_device.c   |  2 +-
 3 files changed, 31 insertions(+), 17 deletions(-)

-- 
2.14.1



Re: [Mesa-dev] [PATCH v2] egl: check if colorspace/surface type is supported

2018-05-04 Thread Juan A. Suarez Romero
On Fri, 2018-05-04 at 08:29 +0300, Tapani Pälli wrote:
> 
> On 05/03/2018 08:49 PM, Emil Velikov wrote:
> > On 2 May 2018 at 17:23, Juan A. Suarez Romero  wrote:
> > > According to EGL 1.4 spec, section 3.5.1 ("Creating On-Screen Rendering
> > > Surfaces"), if config does not support the colorspace or alpha format
> > > attributes specified in attrib_list (as defined for
> > > eglCreateWindowSurface), an EGL_BAD_MATCH error is generated.
> > > 
> > > This fixes dEQP-EGL.functional.wide_color.*_888_colorspace_srgb (still
> > > not merged,
> > > https://android-review.googlesource.com/c/platform/external/deqp/+/667322),
> > > > which is crashing when trying to create a window surface with RGB888
> > > configuration and sRGB colorspace.
> > > 
> > > v2: Handle the fix in other backends (Tapani)
> > > ---
> > >   src/egl/drivers/dri2/platform_drm.c  | 5 +
> > >   src/egl/drivers/dri2/platform_wayland.c  | 6 ++
> > >   src/egl/drivers/dri2/platform_x11.c  | 5 +
> > >   src/egl/drivers/dri2/platform_x11_dri3.c | 5 +
> > >   4 files changed, 21 insertions(+)
> > > 
> > > > diff --git a/src/egl/drivers/dri2/platform_drm.c b/src/egl/drivers/dri2/platform_drm.c
> > > > index dc4efea9103..35bc4b5b1ac 100644
> > > > --- a/src/egl/drivers/dri2/platform_drm.c
> > > > +++ b/src/egl/drivers/dri2/platform_drm.c
> > > > @@ -155,6 +155,11 @@ dri2_drm_create_window_surface(_EGLDriver *drv, _EGLDisplay *disp,
> > > >  config = dri2_get_dri_config(dri2_conf, EGL_WINDOW_BIT,
> > > >   dri2_surf->base.GLColorspace);
> > > > 
> > > > +   if (!config) {
> > > > +  _eglError(EGL_BAD_MATCH, "Unsupported surfacetype/colorspace configuration");
> > 
> > Seems like android and surfaceless are missing the EGL_BAD_MATCH bit.
> > AFAICT they need it right?
> > 
> 
> That is true, I've sent a patch to add the error case.
> 
> BTW there seems something wrong in dEQP surfaceless support (?) Are you 
> guys able to run it? For me it says EGL is not supported.
> 

Same here. It seems it is not very complete, and intended to be used with
ChromeOS.

https://android.googlesource.com/platform/external/deqp/+/04589e1a53e11599a386ff99dc1ce135e3770653


J.A.

> // Tapani
> 
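
A minimal sketch of the error path discussed in this thread, assuming a toy
config that may or may not have an sRGB variant. All names (toy_config,
toy_get_config, toy_create_window_surface, toy_last_error) are hypothetical;
they only model "lookup returns NULL, so report a bad match instead of
dereferencing it".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_error      { TOY_SUCCESS, TOY_BAD_MATCH };
enum toy_colorspace { TOY_COLORSPACE_LINEAR, TOY_COLORSPACE_SRGB };

/* Hypothetical config: an sRGB variant may be missing. */
struct toy_config {
    int  linear_id;
    int  srgb_id;
    bool has_srgb;
};

static enum toy_error toy_last_error = TOY_SUCCESS;

/* Returns NULL when the config has no variant for this colorspace. */
static const int *toy_get_config(const struct toy_config *conf,
                                 enum toy_colorspace cs)
{
    assert(conf != NULL);
    if (cs == TOY_COLORSPACE_SRGB)
        return conf->has_srgb ? &conf->srgb_id : NULL;
    return &conf->linear_id;
}

/* Fails with a bad-match error rather than using a NULL config. */
static bool toy_create_window_surface(const struct toy_config *conf,
                                      enum toy_colorspace cs)
{
    const int *config = toy_get_config(conf, cs);

    if (!config) {
        toy_last_error = TOY_BAD_MATCH;
        return false;
    }
    toy_last_error = TOY_SUCCESS;
    return true;
}
```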


Re: [Mesa-dev] [PATCH 10/10] i965: Require softpin support for Cannonlake and later.

2018-05-04 Thread Daniel Vetter
On Fri, May 04, 2018 at 11:07:45AM +0200, Daniel Vetter wrote:
> On Fri, May 04, 2018 at 01:28:02AM -0700, Kenneth Graunke wrote:
> > On Friday, May 4, 2018 1:16:29 AM PDT Kenneth Graunke wrote:
> > > On Friday, May 4, 2018 12:39:12 AM PDT Chris Wilson wrote:
> > > > Quoting Kenneth Graunke (2018-05-04 08:34:07)
> > > > > On Thursday, May 3, 2018 11:53:24 PM PDT Chris Wilson wrote:
> > > > > > Quoting Kenneth Graunke (2018-05-04 02:12:40)
> > > > > > > This isn't strictly necessary, but anyone running Cannonlake will
> > > > > > > already have Kernel 4.5 or later, so there's no reason to support
> > > > > > > the relocation model on Gen10+.
> > > > > > 
> > > > > > /o\ gvt. Need I say more?
> > > > > > -Chris
> > > > > 
> > > > > Yes.  What's the deal with GVT?
> > > > 
> > > > Their current restrictions involve forcing the use of a 32b
> > > > aliasing-ppgtt. Not that they support cnl+ yet, so they might remember
> > > > to lift that restriction in time.
> > > > -Chris
> > > 
> > > Wow, that's really miserable.  So, we can't actually depend on real
> > > PPGTT existing?  Do you know if/when they might fix this?
> > > 
> > > This seriously wrecks a lot of my plans if we can't assume PPGTT
> > > and have to deal with relocations for all of eternity.  Jason and I
> > > have been planning on doing PPGTT-only drivers for months.  We figured
> > > that full PPGTT had been working since Gen8 and surely would be viable
> > > on anything modern.  If it weren't for enterprise kernels, I would have
> > > required this all the way back to Gen8 in a heartbeat.
> > > 
> > > --Ken
> > 
> > Okay, whew, it looks like we were wrong.  There's a bit of confusing
> > code in the kernel:
> > 
> > if (intel_vgpu_active(dev_priv)) {
> >         /* GVT-g has no support for 32bit ppgtt */
> >         has_full_ppgtt = false;
> >         has_full_48bit_ppgtt = intel_vgpu_has_full_48bit_ppgtt(dev_priv);
> > }
> > 
> > But Joonas explained that this means GVT-g does support full PPGTT
> > with 48-bit addresses, it just never did the 32-bit only thing.
> > 
> > So, I think we're fine here after all.
> 
> Oh indeed. But only fixed in 4.14 by:
> 
> commit 6b3816d69628becb7ff35978aa0751798b4a940a
> Author: Tina Zhang 
> Date:   Mon Aug 14 15:24:14 2017 +0800
> 
> drm/i915/gvt: Fix guest i915 full ppgtt blocking issue
> 
> Maybe note that somewhere in your huge table of "stuff the kernel
> supports" ...

Also, for your patch here this means you're fine: since cnl support isn't
merged yet, but will have full 48bit ppgtt support (since it's there
already), I think it's safe to require it.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

