[Mesa-dev] [PATCH 1/3] egl: Allow creation of per surface out fence

2017-09-14 Thread yogesh . marathe
From: Zhongmin Wu 

Add plumbing to allow creation of per display surface out fence.

Currently enabled only on Android, since the system expects a valid
fd in ANativeWindow::{queue,cancel}Buffer. We currently pass an fd of
-1, with which native applications such as flatland fail. The patch
enables explicit sync on Android and fixes one of the functional issues
for apps or buffer consumers that depend on the fence and its timestamp.
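
For illustration only, the fd ownership rule behind the per-surface out
fence can be sketched as below. This is a standalone sketch, not the Mesa
code: struct surface_sketch and both function names are hypothetical, and
the "take" helper is an assumption about how a platform would hand the fd
to a queueBuffer-style call. The surface holds at most one fd, -1 means
"no fence", and storing a new value closes the previous one.

```c
#include <assert.h>
#include <unistd.h>

/* Hypothetical stand-in for the per-surface state; not the Mesa struct. */
struct surface_sketch {
   int out_fence_fd;   /* -1 means "no fence" */
};

/* Store a new fence fd and take ownership of it. Any fd stored earlier
 * is closed first, so the surface never holds more than one fd. */
static void
surface_set_out_fence_fd(struct surface_sketch *surf, int fence_fd)
{
   if (surf->out_fence_fd >= 0)
      close(surf->out_fence_fd);

   surf->out_fence_fd = fence_fd;
}

/* Hand the stored fd to a consumer and give up ownership, leaving -1
 * behind. The consumer is then responsible for closing the fd. */
static int
surface_take_out_fence_fd(struct surface_sketch *surf)
{
   int fd = surf->out_fence_fd;

   surf->out_fence_fd = -1;
   return fd;
}
```

Resetting the field to -1 after handing the fd over keeps a later set or
fini call from closing an fd that the consumer now owns.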

v2: a) Also implement the fence in cancelBuffer.
b) The last sync fence is stored in drawable object
   rather than brw context.
c) format clear.

v3: a) Save the last fence fd in DRI Context object.
b) Return the last fence if the batch buffer is empty and
   there is nothing to flush in _intel_batchbuffer_flush_fence
c) Add the new interface in vtbl to set the retrieved fence

v3.1 a) close fd in the new vtbl interface on non-Android platforms

v4: a) The last fence is saved in brw context.
b) The retrieve fd is for all the platform but not just Android
c) Add a uniform dri2 interface to initialize the surface.

v4.1: a) make some changes of variable name.
  b) the patch is broken into two patches.

v4.2: a) Add a deinit interface for surface to clear the out fence

v5: a) Add enable_out_fence to init, platform sets it true or
   false
b) Change get fd to update fd and check for fence
c) Commit description updated

v6: a) Heading and commit description updated
b) enable_out_fence is set only if fence is supported
c) Review comments on function names
d) Test with standalone patch, resolves the bug

v6.1: Check for old display fence reverted

v6.2: enable_out_fence initialized to false by default,
  dri2_surf_update_fence_fd updated, deinit changed to fini

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101655

Signed-off-by: Zhongmin Wu 
Signed-off-by: Yogesh Marathe 
Reviewed-by: Emil Velikov 
Reviewed-by: Tomasz Figa 
---
 src/egl/drivers/dri2/egl_dri2.c | 71 +
 src/egl/drivers/dri2/egl_dri2.h |  9 
 src/egl/drivers/dri2/platform_android.c | 29 ++--
 src/egl/drivers/dri2/platform_drm.c |  3 +-
 src/egl/drivers/dri2/platform_surfaceless.c |  3 +-
 src/egl/drivers/dri2/platform_wayland.c |  3 +-
 src/egl/drivers/dri2/platform_x11.c |  3 +-
 src/egl/drivers/dri2/platform_x11_dri3.c|  3 +-
 8 files changed, 106 insertions(+), 18 deletions(-)

diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c
index 2667aa5..af238a9 100644
--- a/src/egl/drivers/dri2/egl_dri2.c
+++ b/src/egl/drivers/dri2/egl_dri2.c
@@ -1388,6 +1388,45 @@ dri2_destroy_context(_EGLDriver *drv, _EGLDisplay *disp, _EGLContext *ctx)
    return EGL_TRUE;
 }
 
+EGLBoolean
+dri2_init_surface(_EGLSurface *surf, _EGLDisplay *dpy, EGLint type,
+                  _EGLConfig *conf, const EGLint *attrib_list, EGLBoolean enable_out_fence)
+{
+   struct dri2_egl_surface *dri2_surf = dri2_egl_surface(surf);
+   struct dri2_egl_display *dri2_dpy = dri2_egl_display(dpy);
+
+   dri2_surf->out_fence_fd = -1;
+   dri2_surf->enable_out_fence = false;
+   if (dri2_dpy->fence && dri2_dpy->fence->base.version >= 2 &&
+       dri2_dpy->fence->get_capabilities &&
+       (dri2_dpy->fence->get_capabilities(dri2_dpy->dri_screen) &
+        __DRI_FENCE_CAP_NATIVE_FD)) {
+      dri2_surf->enable_out_fence = enable_out_fence;
+   }
+
+   return _eglInitSurface(surf, dpy, type, conf, attrib_list);
+}
+
+static void
+dri2_surface_set_out_fence_fd( _EGLSurface *surf, int fence_fd)
+{
+   struct dri2_egl_surface *dri2_surf = dri2_egl_surface(surf);
+
+   if (dri2_surf->out_fence_fd >=0)
+      close(dri2_surf->out_fence_fd);
+
+   dri2_surf->out_fence_fd = fence_fd;
+}
+
+void
+dri2_fini_surface(_EGLSurface *surf)
+{
+   struct dri2_egl_surface *dri2_surf = dri2_egl_surface(surf);
+
+   dri2_surface_set_out_fence_fd(surf, -1);
+   dri2_surf->enable_out_fence = false;
+}
+
 static EGLBoolean
 dri2_destroy_surface(_EGLDriver *drv, _EGLDisplay *dpy, _EGLSurface *surf)
 {
@@ -1399,6 +1438,28 @@ dri2_destroy_surface(_EGLDriver *drv, _EGLDisplay *dpy, _EGLSurface *surf)
    return dri2_dpy->vtbl->destroy_surface(drv, dpy, surf);
 }
 
+static void
+dri2_surf_update_fence_fd(_EGLContext *ctx,
+                          _EGLDisplay *dpy, _EGLSurface *surf)
+{
+   __DRIcontext *dri_ctx = dri2_egl_context(ctx)->dri_context;
+   struct dri2_egl_display *dri2_dpy = dri2_egl_display(dpy);
+   struct dri2_egl_surface *dri2_surf = dri2_egl_surface(surf);
+   int fence_fd = -1;
+   void *fence;
+
+   if (!dri2_surf->enable_out_fence)
+      return;
+
+   fence = dri2_dpy->fence->create_fence_fd(dri_ctx, -1);
+   if (fence) {
+      fence_fd = dri2_dpy->fence->get_fence_fd(dri2_dpy->dri_screen,
+                                               fence);
+      dri2_dpy->fence->destroy_fence(dri2_dpy->dri_screen, fence);
+   }
+   dri2_surface_set_out_fence_fd(surf, fence_fd);
+}
+
 /**
  * Called via eglMa

[Mesa-dev] [PATCH 2/3] egl: Wrap dri3 surface primitive around dri2 egl surface

2017-09-14 Thread yogesh . marathe
From: Yogesh Marathe 

Originally the dri3 egl surface was wrapped around _EGLSurface. To
support explicit sync, new variables (e.g. enable_out_fence) were added
to dri2_egl_surface. When we reference these new variables on a dri3
surface, we write onto the dri3 loader bits, which the dri3 loader later
changes during execution. As a result enable_out_fence ends up with a
garbage value, which further triggers an assert on dri3 platforms even
where fences are not supported in the kernel.

Thanks to Rafael Antognolli, Emil Velikov and Mark Janes for catching
and root causing this.

Tested with Intel Mesa CI.
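
As a rough illustration of the layout problem described above, the sketch
below compares the two wrappings with offsetof(). All struct and function
names are hypothetical, simplified stand-ins for _EGLSurface,
dri2_egl_surface, loader_dri3_drawable and dri3_egl_surface; the real
structs have many more members.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the real Mesa structs. */
struct base_surface {            /* plays the role of _EGLSurface */
   int width, height;
};

struct dri2_surface {            /* plays the role of dri2_egl_surface */
   struct base_surface base;
   int out_fence_fd;
   bool enable_out_fence;
};

struct loader_drawable {         /* plays the role of loader_dri3_drawable */
   int swap_interval;
};

/* Old layout: wraps the base directly, so loader state follows it. */
struct dri3_surface_old {
   struct base_surface base;
   struct loader_drawable loader_drawable;
};

/* New layout: wraps the dri2 surface, which itself starts with the base. */
struct dri3_surface_new {
   struct dri2_surface surf;
   struct loader_drawable loader_drawable;
};

/* Nonzero if generic code that treats the old struct as a dri2_surface
 * would touch loader state when it accesses out_fence_fd. */
static int
old_layout_overlaps_loader(void)
{
   return offsetof(struct dri2_surface, out_fence_fd) >=
          offsetof(struct dri3_surface_old, loader_drawable);
}

/* Nonzero if the last dri2 field would reach into loader state in the
 * new layout; embedding the whole dri2_surface rules this out. */
static int
new_layout_overlaps_loader(void)
{
   return offsetof(struct dri2_surface, enable_out_fence) + sizeof(bool) >
          offsetof(struct dri3_surface_new, loader_drawable);
}
```

Because the embedded struct stays the first member, a pointer to the dri3
surface can still be used wherever the base surface is expected.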

Signed-off-by: Yogesh Marathe 
---
 src/egl/drivers/dri2/platform_x11_dri3.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/egl/drivers/dri2/platform_x11_dri3.h b/src/egl/drivers/dri2/platform_x11_dri3.h
index 13d8572..96e7ee9 100644
--- a/src/egl/drivers/dri2/platform_x11_dri3.h
+++ b/src/egl/drivers/dri2/platform_x11_dri3.h
@@ -28,7 +28,7 @@
 _EGL_DRIVER_TYPECAST(dri3_egl_surface, _EGLSurface, obj)
 
 struct dri3_egl_surface {
-   _EGLSurface base;
+   struct dri2_egl_surface surf;
    struct loader_dri3_drawable loader_drawable;
 };
 
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/3] egl: dri3 changes to support surface primitive wrap around dri2

2017-09-14 Thread yogesh . marathe
From: Yogesh Marathe 

As base is moved one level down, the corresponding implementation in
dri3 needs to change.

Tested with Intel Mesa CI

Signed-off-by: Yogesh Marathe 
---
 src/egl/drivers/dri2/platform_x11_dri3.c | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/src/egl/drivers/dri2/platform_x11_dri3.c b/src/egl/drivers/dri2/platform_x11_dri3.c
index 5c4be4d..45bb56c 100644
--- a/src/egl/drivers/dri2/platform_x11_dri3.c
+++ b/src/egl/drivers/dri2/platform_x11_dri3.c
@@ -51,8 +51,8 @@ egl_dri3_set_drawable_size(struct loader_dri3_drawable *draw,
 {
struct dri3_egl_surface *dri3_surf = loader_drawable_to_egl_surface(draw);
 
-   dri3_surf->base.Width = width;
-   dri3_surf->base.Height = height;
+   dri3_surf->surf.base.Width = width;
+   dri3_surf->surf.base.Height = height;
 }
 
 static bool
@@ -61,7 +61,7 @@ egl_dri3_in_current_context(struct loader_dri3_drawable *draw)
struct dri3_egl_surface *dri3_surf = loader_drawable_to_egl_surface(draw);
_EGLContext *ctx = _eglGetCurrentContext();
 
-   return ctx->Resource.Display == dri3_surf->base.Resource.Display;
+   return ctx->Resource.Display == dri3_surf->surf.base.Resource.Display;
 }
 
 static __DRIcontext *
@@ -79,9 +79,9 @@ static void
 egl_dri3_flush_drawable(struct loader_dri3_drawable *draw, unsigned flags)
 {
struct dri3_egl_surface *dri3_surf = loader_drawable_to_egl_surface(draw);
-   _EGLDisplay *disp = dri3_surf->base.Resource.Display;
+   _EGLDisplay *disp = dri3_surf->surf.base.Resource.Display;
 
-   dri2_flush_drawable_for_swapbuffers(disp, &dri3_surf->base);
+   dri2_flush_drawable_for_swapbuffers(disp, &dri3_surf->surf.base);
 }
 
 static const struct loader_dri3_vtable egl_dri3_vtable = {
@@ -113,7 +113,7 @@ dri3_set_swap_interval(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface *surf,
 {
struct dri3_egl_surface *dri3_surf = dri3_egl_surface(surf);
 
-   dri3_surf->base.SwapInterval = interval;
+   dri3_surf->surf.base.SwapInterval = interval;
loader_dri3_set_swap_interval(&dri3_surf->loader_drawable, interval);
 
return EGL_TRUE;
@@ -145,14 +145,14 @@ dri3_create_surface(_EGLDriver *drv, _EGLDisplay *disp, EGLint type,
   drawable = xcb_generate_id(dri2_dpy->conn);
   xcb_create_pixmap(dri2_dpy->conn, conf->BufferSize,
 drawable, dri2_dpy->screen->root,
-dri3_surf->base.Width, dri3_surf->base.Height);
+dri3_surf->surf.base.Width, dri3_surf->surf.base.Height);
} else {
   STATIC_ASSERT(sizeof(uintptr_t) == sizeof(native_surface));
   drawable = (uintptr_t) native_surface;
}
 
dri_config = dri2_get_dri_config(dri2_conf, type,
-dri3_surf->base.GLColorspace);
+dri3_surf->surf.base.GLColorspace);
 
if (loader_dri3_drawable_init(dri2_dpy->conn, drawable,
  dri2_dpy->dri_screen,
@@ -164,7 +164,7 @@ dri3_create_surface(_EGLDriver *drv, _EGLDisplay *disp, EGLint type,
   goto cleanup_pixmap;
}
 
-   return &dri3_surf->base;
+   return &dri3_surf->surf.base;
 
  cleanup_pixmap:
if (type == EGL_PBUFFER_BIT)
-- 
2.7.4



Re: [Mesa-dev] [PATCH] i965: fix build warning on clang

2017-09-14 Thread Tapani Pälli



On 09/14/2017 09:31 PM, Jordan Justen wrote:
> On 2017-09-14 00:26:39, Tapani Pälli wrote:
> > fixes following warning:
> > warning: format specifies type 'long' but the argument has type 'uint64_t' (aka 'unsigned long long')
> >
> > cast is needed to avoid this turning in to another warning on 32bit build:
> > warning: format specifies type 'unsigned long long' but the argument has type 'uint64_t' (aka 'unsigned long')
>
> size is uint64_t, so the (unsigned long long) cast shouldn't be needed
> for 32-bit, right?

Ah right, got confused with the warnings .. cast was required so that
64bit build would not warn.

> Otherwise: Reviewed-by: Jordan Justen
>
> > Signed-off-by: Tapani Pälli
> > ---
> >  src/mesa/drivers/dri/i965/brw_bufmgr.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.c b/src/mesa/drivers/dri/i965/brw_bufmgr.c
> > index b9d6a39f1f..cc1a2d1f49 100644
> > --- a/src/mesa/drivers/dri/i965/brw_bufmgr.c
> > +++ b/src/mesa/drivers/dri/i965/brw_bufmgr.c
> > @@ -396,7 +396,8 @@ retry:
> >
> >     pthread_mutex_unlock(&bufmgr->lock);
> >
> > -   DBG("bo_create: buf %d (%s) %ldb\n", bo->gem_handle, bo->name, size);
> > +   DBG("bo_create: buf %d (%s) %llub\n", bo->gem_handle, bo->name,
> > +       (unsigned long long) size);
> >
> >     return bo;
> >
> > --
> > 2.13.5
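
For reference, a small sketch of the two common ways to print a uint64_t
without a format warning on either 32-bit or 64-bit builds. The first
function mirrors the cast used in the patch; the second uses the PRIu64
macro from <inttypes.h>, which is an alternative and not what the patch
does. Both function names are made up for this example.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Format a byte count with an explicit cast, as the patch does. The
 * argument type always matches %llu, whatever uint64_t is defined as. */
static int
format_size_cast(char *buf, size_t len, uint64_t size)
{
   return snprintf(buf, len, "%llub", (unsigned long long) size);
}

/* Same output using PRIu64, which expands to the right conversion
 * specifier for uint64_t on the current platform, so no cast is needed. */
static int
format_size_priu64(char *buf, size_t len, uint64_t size)
{
   return snprintf(buf, len, "%" PRIu64 "b", size);
}
```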



Re: [Mesa-dev] [PATCH 1/2] anv: android build system changes

2017-09-14 Thread Tapani Pälli



On 09/14/2017 07:54 PM, Rob Herring wrote:
> On Thu, Sep 14, 2017 at 1:57 AM, Tapani Pälli wrote:
> > Following changes are made to support VK_ANDROID_native_buffer:
> >
> > - bring in vk_android_native_buffer.xml
> > - rename target as vulkan.$(TARGET_BOARD_PLATFORM)
> > - use LOCAL_PROPRIETARY_MODULE to install under vendor path
>
> Good to see this. I was working on a patch to change all the targets
> to /vendor. Any issues with doing that from your perspective?

That would be fine, we want to move everything to vendor directory.

// Tapani


Re: [Mesa-dev] [PATCH 2/2] anv: set has_exec_async to false on Android

2017-09-14 Thread Tapani Pälli



On 09/15/2017 01:09 AM, Chad Versace wrote:
> On Thu 14 Sep 2017, Emil Velikov wrote:
> > On 14 September 2017 at 07:57, Tapani Pälli wrote:
> > > Other WSI implementations set has_exec_async false for WSI buffers,
> > > so far haven't found a place to do it so we just claim to not have
> > > async exec.
> >
> > What's the actual side-effects you're seeing? I'd imagine Jason, Chris
> > and the gang may have some tips/suggestions - be that wrt Mesa or the
> > kernel.
> >
> > I'm not saying "don't upstream this", but a comment and/or bug
> > reference will be beneficial.
> > Esp. since disabling async exec may have noticeable implication on performance.
>
> Tapani, thanks for finding this problem. I completely overlooked it.
>
> Instead of disabling ASYNC globally on Android, I believe the correct
> fix is to set it only on imported gralloc buffers. Anvil already does
> that for all X11 and Wayland buffers in anv_wsi.c.
>
> I added that fix in v4 of my patch at
> https://lists.freedesktop.org/archives/mesa-dev/2017-September/169698.html.

OK cool, will test the new patches!

// Tapani


[Mesa-dev] [PATCH 1/5] mesa: align atomic buffer handling code with ubo/ssbo

2017-09-14 Thread Dave Airlie
From: Dave Airlie 

This adds automatic size support to the atomic buffer code and
realigns it to behave like the ubo/ssbo code.

Signed-off-by: Dave Airlie 
---
 src/mesa/main/bufferobj.c | 132 ++
 src/mesa/main/mtypes.h|   1 +
 2 files changed, 88 insertions(+), 45 deletions(-)

diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c
index 2da2128..93b66dc 100644
--- a/src/mesa/main/bufferobj.c
+++ b/src/mesa/main/bufferobj.c
@@ -1268,18 +1268,19 @@ set_atomic_buffer_binding(struct gl_context *ctx,
   struct gl_atomic_buffer_binding *binding,
   struct gl_buffer_object *bufObj,
   GLintptr offset,
-  GLsizeiptr size)
+  GLsizeiptr size,
+  bool autoSize)
 {
_mesa_reference_buffer_object(ctx, &binding->BufferObject, bufObj);
 
-   if (bufObj == ctx->Shared->NullBufferObj) {
-  binding->Offset = 0;
-  binding->Size = 0;
-   } else {
-  binding->Offset = offset;
-  binding->Size = size;
-  bufObj->UsageHistory |= USAGE_ATOMIC_COUNTER_BUFFER;
-   }
+   binding->Offset = offset;
+   binding->Size = size;
+   binding->AutomaticSize = autoSize;
+   /* If this is a real buffer object, mark it has having been used
+* at some point as an atomic counter buffer.
+*/
+   if (size >= 0)
+ bufObj->UsageHistory |= USAGE_ATOMIC_COUNTER_BUFFER;
 }
 
 /**
@@ -1399,6 +1400,33 @@ bind_shader_storage_buffer(struct gl_context *ctx,
 }
 
 /**
+ * Binds a buffer object to an atomic buffer binding point.
+ *
+ * Unlike set_atomic_buffer_binding(), this function also flushes vertices
+ * and updates NewDriverState.  It also checks if the binding
+ * has actually changed before updating it.
+ */
+static void
+bind_atomic_buffer(struct gl_context *ctx, unsigned index,
+   struct gl_buffer_object *bufObj, GLintptr offset,
+   GLsizeiptr size, GLboolean autoSize)
+{
+   struct gl_atomic_buffer_binding *binding =
+  &ctx->AtomicBufferBindings[index];
+   if (binding->BufferObject == bufObj &&
+   binding->Offset == offset &&
+   binding->Size == size &&
+   binding->AutomaticSize == autoSize) {
+  return;
+   }
+
+   FLUSH_VERTICES(ctx, 0);
+   ctx->NewDriverState |= ctx->DriverFlags.NewAtomicBuffer;
+
+   set_atomic_buffer_binding(ctx, binding, bufObj, offset, size, autoSize);
+}
+
+/**
  * Bind a buffer object to a uniform block binding point.
  * As above, but offset = 0.
  */
@@ -1442,25 +1470,26 @@ bind_buffer_base_shader_storage_buffer(struct gl_context *ctx,
   bind_shader_storage_buffer(ctx, index, bufObj, 0, 0, GL_TRUE);
 }
 
+/**
+ * Bind a buffer object to an atomic buffer binding point.
+ * As above, but offset = 0.
+ */
 static void
-bind_atomic_buffer(struct gl_context *ctx, unsigned index,
-   struct gl_buffer_object *bufObj, GLintptr offset,
-   GLsizeiptr size)
+bind_buffer_base_atomic_buffer(struct gl_context *ctx,
+   GLuint index,
+   struct gl_buffer_object *bufObj)
 {
-   _mesa_reference_buffer_object(ctx, &ctx->AtomicBuffer, bufObj);
-
-   struct gl_atomic_buffer_binding *binding =
-  &ctx->AtomicBufferBindings[index];
-   if (binding->BufferObject == bufObj &&
-   binding->Offset == offset &&
-   binding->Size == size) {
+   if (index >= ctx->Const.MaxAtomicBufferBindings) {
+  _mesa_error(ctx, GL_INVALID_VALUE, "glBindBufferBase(index=%d)", index);
   return;
}
 
-   FLUSH_VERTICES(ctx, 0);
-   ctx->NewDriverState |= ctx->DriverFlags.NewAtomicBuffer;
+   _mesa_reference_buffer_object(ctx, &ctx->AtomicBuffer, bufObj);
 
-   set_atomic_buffer_binding(ctx, binding, bufObj, offset, size);
+   if (bufObj == ctx->Shared->NullBufferObj)
+  bind_atomic_buffer(ctx, index, bufObj, -1, -1, GL_TRUE);
+   else
+  bind_atomic_buffer(ctx, index, bufObj, 0, 0, GL_TRUE);
 }
 
 /**
@@ -1562,8 +1591,8 @@ delete_buffers(struct gl_context *ctx, GLsizei n, const GLuint *ids)
  /* unbind Atomci Buffer binding points */
  for (j = 0; j < ctx->Const.MaxAtomicBufferBindings; j++) {
 if (ctx->AtomicBufferBindings[j].BufferObject == bufObj) {
-   _mesa_BindBufferBase( GL_ATOMIC_COUNTER_BUFFER, j, 0 );
-   bind_atomic_buffer(ctx, j, ctx->Shared->NullBufferObj, 0, 0);
+   bind_buffer_base_atomic_buffer(ctx, j,
+  ctx->Shared->NullBufferObj);
 }
  }
 
@@ -3564,32 +3593,46 @@ bind_buffer_range_shader_storage_buffer_err(struct gl_context *ctx,
bind_buffer_range_shader_storage_buffer(ctx, index, bufObj, offset, size);
 }
 
+static void
+bind_buffer_range_atomic_buffer(struct gl_context *ctx, GLuint index,
+ struct gl_buffer_object *bufObj,
+ 

[Mesa-dev] [PATCH 3/5] mesa/bufferobj: consolidate some codepaths between ubo/ssbo/atomics.

2017-09-14 Thread Dave Airlie
From: Dave Airlie 

These are 90% the same code, consolidate them into a couple of
common codepaths.

Signed-off-by: Dave Airlie 
---
 src/mesa/main/bufferobj.c | 146 +++---
 1 file changed, 47 insertions(+), 99 deletions(-)

diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c
index 7eb7ccf..052a671 100644
--- a/src/mesa/main/bufferobj.c
+++ b/src/mesa/main/bufferobj.c
@@ -1258,18 +1258,18 @@ _mesa_BindBuffer(GLenum target, GLuint buffer)
 }
 
 /**
- * Binds a buffer object to an atomic buffer binding point.
+ * Binds a buffer object to a binding point.
  *
  * The caller is responsible for validating the offset,
  * flushing the vertices and updating NewDriverState.
  */
 static void
-set_atomic_buffer_binding(struct gl_context *ctx,
-  struct gl_buffer_binding *binding,
-  struct gl_buffer_object *bufObj,
-  GLintptr offset,
-  GLsizeiptr size,
-  bool autoSize)
+set_buffer_binding(struct gl_context *ctx,
+   struct gl_buffer_binding *binding,
+   struct gl_buffer_object *bufObj,
+   GLintptr offset,
+   GLsizeiptr size,
+   bool autoSize, gl_buffer_usage usage)
 {
_mesa_reference_buffer_object(ctx, &binding->BufferObject, bufObj);
 
@@ -1280,67 +1280,38 @@ set_atomic_buffer_binding(struct gl_context *ctx,
 * at some point as an atomic counter buffer.
 */
if (size >= 0)
- bufObj->UsageHistory |= USAGE_ATOMIC_COUNTER_BUFFER;
+  bufObj->UsageHistory |= usage;
 }
 
-/**
- * Binds a buffer object to a uniform buffer binding point.
- *
- * The caller is responsible for flushing vertices and updating
- * NewDriverState.
- */
 static void
-set_ubo_binding(struct gl_context *ctx,
-struct gl_buffer_binding *binding,
-struct gl_buffer_object *bufObj,
-GLintptr offset,
-GLsizeiptr size,
-GLboolean autoSize)
+set_buffer_multi_binding(struct gl_context *ctx,
+ const GLuint *buffers,
+ int idx,
+ const char *caller,
+ struct gl_buffer_binding *binding,
+ GLintptr offset,
+ GLsizeiptr size,
+ bool range,
+ gl_buffer_usage usage)
 {
-   _mesa_reference_buffer_object(ctx, &binding->BufferObject, bufObj);
-
-   binding->Offset = offset;
-   binding->Size = size;
-   binding->AutomaticSize = autoSize;
-
-   /* If this is a real buffer object, mark it has having been used
-* at some point as a UBO.
-*/
-   if (size >= 0)
-  bufObj->UsageHistory |= USAGE_UNIFORM_BUFFER;
-}
-
-/**
- * Binds a buffer object to a shader storage buffer binding point.
- *
- * The caller is responsible for flushing vertices and updating
- * NewDriverState.
- */
-static void
-set_ssbo_binding(struct gl_context *ctx,
- struct gl_buffer_binding *binding,
- struct gl_buffer_object *bufObj,
- GLintptr offset,
- GLsizeiptr size,
- GLboolean autoSize)
-{
-   _mesa_reference_buffer_object(ctx, &binding->BufferObject, bufObj);
-
-   binding->Offset = offset;
-   binding->Size = size;
-   binding->AutomaticSize = autoSize;
+   struct gl_buffer_object *bufObj;
+   if (binding->BufferObject && binding->BufferObject->Name == buffers[idx])
+  bufObj = binding->BufferObject;
+   else
+  bufObj = _mesa_multi_bind_lookup_bufferobj(ctx, buffers, idx, caller);
 
-   /* If this is a real buffer object, mark it has having been used
-* at some point as a SSBO.
-*/
-   if (size >= 0)
-  bufObj->UsageHistory |= USAGE_SHADER_STORAGE_BUFFER;
+   if (bufObj) {
+  if (bufObj == ctx->Shared->NullBufferObj)
+ set_buffer_binding(ctx, binding, bufObj, -1, -1, !range, usage);
+  else
+ set_buffer_binding(ctx, binding, bufObj, offset, size, !range, usage);
+   }
 }
 
 /**
  * Binds a buffer object to a uniform buffer binding point.
  *
- * Unlike set_ubo_binding(), this function also flushes vertices
+ * Unlike set_buffer_binding(), this function also flushes vertices
  * and updates NewDriverState.  It also checks if the binding
  * has actually changed before updating it.
  */
@@ -1365,7 +1336,7 @@ bind_uniform_buffer(struct gl_context *ctx,
FLUSH_VERTICES(ctx, 0);
ctx->NewDriverState |= ctx->DriverFlags.NewUniformBuffer;
 
-   set_ubo_binding(ctx, binding, bufObj, offset, size, autoSize);
+   set_buffer_binding(ctx, binding, bufObj, offset, size, autoSize, USAGE_UNIFORM_BUFFER);
 }
 
 /**
@@ -1396,7 +1367,7 @@ bind_shader_storage_buffer(struct gl_context *ctx,
FLUSH_VERTICES(ctx, 0);
ctx->NewDriverState |= ctx->DriverFlags.NewShaderStorageBuffer;
 
-   set_ssbo_binding(ctx, binding, bufO

[Mesa-dev] realign some atomic/ssbo/ubo code

2017-09-14 Thread Dave Airlie
I was digging around the atomic code looking at r600 again,
and noticed this code had some inconsistencies for the 3 codepaths
that should really be the same. There is probably further room
for consolidation here.

This saves 300 bytes in the text segment :-P

Dave.



[Mesa-dev] [PATCH 2/5] mesa: rename various buffer bindings to one struct.

2017-09-14 Thread Dave Airlie
From: Dave Airlie 

One binding to bind them all, these are all the same thing.

Signed-off-by: Dave Airlie 
---
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c |  6 ++---
 src/mesa/drivers/dri/i965/genX_state_upload.c|  2 +-
 src/mesa/main/bufferobj.c| 18 ++---
 src/mesa/main/mtypes.h   | 33 +++-
 src/mesa/state_tracker/st_atom_atomicbuf.c   |  2 +-
 src/mesa/state_tracker/st_atom_constbuf.c|  2 +-
 src/mesa/state_tracker/st_atom_storagebuf.c  |  2 +-
 7 files changed, 20 insertions(+), 45 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index d110482..dae0439 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -1279,7 +1279,7 @@ brw_upload_ubo_surfaces(struct brw_context *brw, struct gl_program *prog,
   &stage_state->surf_offset[prog_data->binding_table.ubo_start];
 
for (int i = 0; i < prog->info.num_ubos; i++) {
-  struct gl_uniform_buffer_binding *binding =
+  struct gl_buffer_binding *binding =
  &ctx->UniformBufferBindings[prog->sh.UniformBlocks[i]->Binding];
 
   if (binding->BufferObject == ctx->Shared->NullBufferObj) {
@@ -1304,7 +1304,7 @@ brw_upload_ubo_surfaces(struct brw_context *brw, struct gl_program *prog,
   &stage_state->surf_offset[prog_data->binding_table.ssbo_start];
 
for (int i = 0; i < prog->info.num_ssbos; i++) {
-  struct gl_shader_storage_buffer_binding *binding =
+  struct gl_buffer_binding *binding =
  &ctx->ShaderStorageBufferBindings[prog->sh.ShaderStorageBlocks[i]->Binding];
 
   if (binding->BufferObject == ctx->Shared->NullBufferObj) {
@@ -1386,7 +1386,7 @@ brw_upload_abo_surfaces(struct brw_context *brw,
 
if (prog->info.num_abos) {
   for (unsigned i = 0; i < prog->info.num_abos; i++) {
- struct gl_atomic_buffer_binding *binding =
+ struct gl_buffer_binding *binding =
 &ctx->AtomicBufferBindings[prog->sh.AtomicBuffers[i]->Binding];
  struct intel_buffer_object *intel_bo =
 intel_buffer_object(binding->BufferObject);
diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c b/src/mesa/drivers/dri/i965/genX_state_upload.c
index 6127616..54fada7 100644
--- a/src/mesa/drivers/dri/i965/genX_state_upload.c
+++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
@@ -3076,7 +3076,7 @@ genX(upload_push_constant_packets)(struct brw_context *brw)
 
const struct gl_uniform_block *block =
   prog->sh.UniformBlocks[range->block];
-   const struct gl_uniform_buffer_binding *binding =
+   const struct gl_buffer_binding *binding =
   &ctx->UniformBufferBindings[block->Binding];
 
if (binding->BufferObject == ctx->Shared->NullBufferObj) {
diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c
index 93b66dc..7eb7ccf 100644
--- a/src/mesa/main/bufferobj.c
+++ b/src/mesa/main/bufferobj.c
@@ -1265,7 +1265,7 @@ _mesa_BindBuffer(GLenum target, GLuint buffer)
  */
 static void
 set_atomic_buffer_binding(struct gl_context *ctx,
-  struct gl_atomic_buffer_binding *binding,
+  struct gl_buffer_binding *binding,
   struct gl_buffer_object *bufObj,
   GLintptr offset,
   GLsizeiptr size,
@@ -1291,7 +1291,7 @@ set_atomic_buffer_binding(struct gl_context *ctx,
  */
 static void
 set_ubo_binding(struct gl_context *ctx,
-struct gl_uniform_buffer_binding *binding,
+struct gl_buffer_binding *binding,
 struct gl_buffer_object *bufObj,
 GLintptr offset,
 GLsizeiptr size,
@@ -1318,7 +1318,7 @@ set_ubo_binding(struct gl_context *ctx,
  */
 static void
 set_ssbo_binding(struct gl_context *ctx,
- struct gl_shader_storage_buffer_binding *binding,
+ struct gl_buffer_binding *binding,
  struct gl_buffer_object *bufObj,
  GLintptr offset,
  GLsizeiptr size,
@@ -1352,7 +1352,7 @@ bind_uniform_buffer(struct gl_context *ctx,
 GLsizeiptr size,
 GLboolean autoSize)
 {
-   struct gl_uniform_buffer_binding *binding =
+   struct gl_buffer_binding *binding =
   &ctx->UniformBufferBindings[index];
 
if (binding->BufferObject == bufObj &&
@@ -1383,7 +1383,7 @@ bind_shader_storage_buffer(struct gl_context *ctx,
GLsizeiptr size,
GLboolean autoSize)
 {
-   struct gl_shader_storage_buffer_binding *binding =
+   struct gl_buffer_binding *binding =
   &ctx->ShaderStorageBufferBindings[index];
 
if (binding->BufferObject == bufObj &&
@@ -1411,7 +1411,7 @@ bind_atomic_buffer(struct gl_context *ctx,

[Mesa-dev] [PATCH 4/5] mesa/bufferobj: consolidate some buffer binding code.

2017-09-14 Thread Dave Airlie
From: Dave Airlie 

These paths are again 90% the same, consolidate them into
one.
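
The shape of the consolidated helper can be sketched in isolation as
follows. This is a simplified illustration with hypothetical types
(binding_sketch, context_sketch) and a counter standing in for
FLUSH_VERTICES; it is not the Mesa code, and it returns a bool only to
make the behaviour easy to check. The point is that the dirty bit is the
only thing that differs between buffer types, so it becomes a parameter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; not the Mesa definitions. */
struct binding_sketch {
   const void *buffer;
   intptr_t offset;
   intptr_t size;
   bool automatic_size;
};

struct context_sketch {
   uint64_t new_driver_state;   /* accumulated dirty bits */
   unsigned flush_count;        /* counts simulated vertex flushes */
};

/* Update a binding only if something changed. Returns true when the
 * binding was updated and the given dirty bit was raised. */
static bool
bind_buffer_sketch(struct context_sketch *ctx,
                   struct binding_sketch *binding,
                   const void *buffer, intptr_t offset, intptr_t size,
                   bool automatic_size, uint64_t driver_state)
{
   if (binding->buffer == buffer &&
       binding->offset == offset &&
       binding->size == size &&
       binding->automatic_size == automatic_size)
      return false;

   ctx->flush_count++;                     /* stands in for FLUSH_VERTICES */
   ctx->new_driver_state |= driver_state;

   binding->buffer = buffer;
   binding->offset = offset;
   binding->size = size;
   binding->automatic_size = automatic_size;
   return true;
}
```

Binding the same buffer with the same range twice leaves the flush count
and dirty bits untouched on the second call.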

Signed-off-by: Dave Airlie 
---
 src/mesa/main/bufferobj.c | 76 ++-
 1 file changed, 35 insertions(+), 41 deletions(-)

diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c
index 052a671..fba1f44 100644
--- a/src/mesa/main/bufferobj.c
+++ b/src/mesa/main/bufferobj.c
@@ -1308,6 +1308,29 @@ set_buffer_multi_binding(struct gl_context *ctx,
}
 }
 
+static void
+bind_buffer(struct gl_context *ctx,
+            struct gl_buffer_binding *binding,
+            struct gl_buffer_object *bufObj,
+            GLintptr offset,
+            GLsizeiptr size,
+            GLboolean autoSize,
+            uint64_t driver_state,
+            gl_buffer_usage usage)
+{
+   if (binding->BufferObject == bufObj &&
+       binding->Offset == offset &&
+       binding->Size == size &&
+       binding->AutomaticSize == autoSize) {
+      return;
+   }
+   }
+
+   FLUSH_VERTICES(ctx, 0);
+   ctx->NewDriverState |= driver_state;
+
+   set_buffer_binding(ctx, binding, bufObj, offset, size, autoSize, usage);
+}
+
 /**
  * Binds a buffer object to a uniform buffer binding point.
  *
@@ -1323,20 +1346,10 @@ bind_uniform_buffer(struct gl_context *ctx,
 GLsizeiptr size,
 GLboolean autoSize)
 {
-   struct gl_buffer_binding *binding =
-  &ctx->UniformBufferBindings[index];
-
-   if (binding->BufferObject == bufObj &&
-   binding->Offset == offset &&
-   binding->Size == size &&
-   binding->AutomaticSize == autoSize) {
-  return;
-   }
-
-   FLUSH_VERTICES(ctx, 0);
-   ctx->NewDriverState |= ctx->DriverFlags.NewUniformBuffer;
-
-   set_buffer_binding(ctx, binding, bufObj, offset, size, autoSize, USAGE_UNIFORM_BUFFER);
+   bind_buffer(ctx, &ctx->UniformBufferBindings[index],
+   bufObj, offset, size, autoSize,
+   ctx->DriverFlags.NewUniformBuffer,
+   USAGE_UNIFORM_BUFFER);
 }
 
 /**
@@ -1354,20 +1367,10 @@ bind_shader_storage_buffer(struct gl_context *ctx,
GLsizeiptr size,
GLboolean autoSize)
 {
-   struct gl_buffer_binding *binding =
-  &ctx->ShaderStorageBufferBindings[index];
-
-   if (binding->BufferObject == bufObj &&
-   binding->Offset == offset &&
-   binding->Size == size &&
-   binding->AutomaticSize == autoSize) {
-  return;
-   }
-
-   FLUSH_VERTICES(ctx, 0);
-   ctx->NewDriverState |= ctx->DriverFlags.NewShaderStorageBuffer;
-
-   set_buffer_binding(ctx, binding, bufObj, offset, size, autoSize, USAGE_SHADER_STORAGE_BUFFER);
+   bind_buffer(ctx, &ctx->ShaderStorageBufferBindings[index],
+   bufObj, offset, size, autoSize,
+   ctx->DriverFlags.NewShaderStorageBuffer,
+   USAGE_SHADER_STORAGE_BUFFER);
 }
 
 /**
@@ -1382,19 +1385,10 @@ bind_atomic_buffer(struct gl_context *ctx, unsigned index,
struct gl_buffer_object *bufObj, GLintptr offset,
GLsizeiptr size, GLboolean autoSize)
 {
-   struct gl_buffer_binding *binding =
-  &ctx->AtomicBufferBindings[index];
-   if (binding->BufferObject == bufObj &&
-   binding->Offset == offset &&
-   binding->Size == size &&
-   binding->AutomaticSize == autoSize) {
-  return;
-   }
-
-   FLUSH_VERTICES(ctx, 0);
-   ctx->NewDriverState |= ctx->DriverFlags.NewAtomicBuffer;
-
-   set_buffer_binding(ctx, binding, bufObj, offset, size, autoSize, USAGE_ATOMIC_COUNTER_BUFFER);
+   bind_buffer(ctx, &ctx->AtomicBufferBindings[index],
+   bufObj, offset, size, autoSize,
+   ctx->DriverFlags.NewAtomicBuffer,
+   USAGE_ATOMIC_COUNTER_BUFFER);
 }
 
 /**
-- 
2.9.5



[Mesa-dev] [PATCH 5/5] mesa/st: fix atomic buffer sizing to align with ssbo.

2017-09-14 Thread Dave Airlie
From: Dave Airlie 

This respects the size from the range setting like ssbo.
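
As a worked illustration of the sizing rule, the sketch below computes
the size handed to the driver. The function name and the plain unsigned
parameters are hypothetical; the patch itself works on binding->Size,
binding->Offset and the resource width. It assumes the caller has already
checked that the offset does not exceed the buffer width.

```c
#include <assert.h>
#include <stdbool.h>

/* Size of the bound region: everything from offset to the end of the
 * buffer, clamped to the requested range size when the binding came
 * from a range call (automatic_size == false). */
static unsigned
effective_buffer_size(unsigned buffer_width, unsigned offset,
                      unsigned bound_size, bool automatic_size)
{
   unsigned size = buffer_width - offset;   /* assumes offset <= width */

   if (!automatic_size && bound_size < size)
      size = bound_size;

   return size;
}
```

For a 256-byte buffer bound at offset 64, an automatic binding covers the
remaining 192 bytes, while a range binding of 32 bytes covers only 32.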

Signed-off-by: Dave Airlie 
---
 src/mesa/state_tracker/st_atom_atomicbuf.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/src/mesa/state_tracker/st_atom_atomicbuf.c b/src/mesa/state_tracker/st_atom_atomicbuf.c
index 7ebcd08..ee5944f 100644
--- a/src/mesa/state_tracker/st_atom_atomicbuf.c
+++ b/src/mesa/state_tracker/st_atom_atomicbuf.c
@@ -62,6 +62,12 @@ st_bind_atomics(struct st_context *st, struct gl_program *prog,
  sb.buffer = st_obj->buffer;
  sb.buffer_offset = binding->Offset;
  sb.buffer_size = st_obj->buffer->width0 - binding->Offset;
+
+/* AutomaticSize is FALSE if the buffer was set with BindBufferRange.
+  * Take the minimum just to be sure.
+  */
+ if (!binding->AutomaticSize)
+sb.buffer_size = MIN2(sb.buffer_size, (unsigned) binding->Size);
   }
 
   st->pipe->set_shader_buffers(st->pipe, shader_type,
-- 
2.9.5



Re: [Mesa-dev] [PATCH 3/3] i965: Disable stencil cache optimization combining two 4x2 blocks

2017-09-14 Thread Pohjolainen, Topi
On Thu, Sep 14, 2017 at 04:16:05PM -0700, Kenneth Graunke wrote:
> On Monday, September 11, 2017 5:48:26 AM PDT Topi Pohjolainen wrote:
> > From the BDW PRM, Volume 15, Workarounds:
> > 
> > KMD Wa4x4STCOptimizationDisable HIZ/STC hang in hawx frames.
> > 
> > W/A: Disable 4x4 RCPFE STC optimization and therefore only send one
> >  valid 4x4 to STC on 4x4 interface. This will require setting bit
> >  6 of reg. 0x7004. Must be done at boot and all save/restore paths.
> > 
> > From the SKL PRM, Volume 16, Workarounds:
> > 
> > 0556 KMD Wa4x4STCOptimizationDisable HIZ/STC hang in hawx frames.
> > 
> > W/A: Disable 4 x4 RCPFE STC optimization and therefore only send
> >  one valid 4x4 to STC on 4x4 interface.  This will require setting
> >  bit 6 of reg. 0x7004. Must be done at boot and all save/restore
> >  paths.
> 
> The kernel has already implemented this workaround since v4.1 for Skylake,
> v4.0 for Cherryview, and maybe v3.18 or older for Broadwell.  So I don't
> think there's any need for us to do it in Mesa (and the text you quoted
> makes me wonder whether it'd even work to do it from userspace).

Ah, right, I should have checked. Thanks!
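
A minimal sketch of what "setting bit 6 of reg. 0x7004" amounts to, under the assumption that 0x7004 is a masked register, i.e. the upper 16 bits of the written value select which of the lower 16 bits actually get updated. That assumption, and the macro and function names, are illustrative and not taken from the PRM text quoted above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical name for the workaround bit (bit 6). */
#define STC_4X4_OPT_DISABLE_SKETCH (UINT32_C(1) << 6)

/* Build the value to write to a masked register so that only `bits`
 * are touched and all of them are set. */
static uint32_t
masked_bit_enable(uint32_t bits)
{
   assert((bits & ~UINT32_C(0xffff)) == 0);
   return (bits << 16) | bits;
}
```

Under that assumption the value written would be 0x00400040, which leaves every other bit in the register untouched.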


Re: [Mesa-dev] [PATCH 20/20] anv: Implement VK_ANDROID_native_buffer (v2)

2017-09-14 Thread zhoucm1



On 2017-09-14 07:03, Chad Versace wrote:

From: Chad Versace 

This implementation is correct (afaict), but takes two shortcuts
regarding the import/export of Android sync fds.

   Shortcut 1. When Android calls vkAcquireImageANDROID to import a sync
   fd into a VkSemaphore or VkFence, the driver instead simply blocks on
   the sync fd, then puts the VkSemaphore or VkFence into the signalled
   state. Thanks to implicit sync, this produces correct behavior (with
   extra latency overhead, perhaps) despite its ugliness.

   Shortcut 2. When Android calls vkQueueSignalReleaseImageANDROID to export
   a collection of wait semaphores as a sync fd, the driver instead
   submits the semaphores to the queue, then returns sync fd -1, which
   informs the caller that no additional synchronization is needed.
   Again, thanks to implicit sync, this produces correct behavior (with
   extra batch submission overhead) despite its ugliness.
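
As a rough illustration of Shortcut 1, blocking on a sync fd can be sketched with plain poll(). This assumes the fd becomes readable once the fence signals, and that the caller hands over ownership of the fd; the function names are hypothetical and this is not the code in anv_android.c.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/* Returns 0 once fd is readable (signalled), -1 on error or timeout.
 * fd == -1 means "no fence", so there is nothing to wait for. */
static int
wait_for_sync_fd(int fd, int timeout_ms)
{
   if (fd == -1)
      return 0;

   struct pollfd pfd = { .fd = fd, .events = POLLIN };
   int ret;

   do {
      ret = poll(&pfd, 1, timeout_ms);
   } while (ret == -1 && (errno == EINTR || errno == EAGAIN));

   if (ret <= 0)
      return -1; /* poll failed or timed out */
   if (pfd.revents & (POLLERR | POLLNVAL))
      return -1;
   return 0;
}

/* Shortcut-1 style import: block until signalled, then drop the fd.
 * The caller would then mark its semaphore/fence as signalled. */
static int
import_sync_fd_by_blocking(int fd)
{
   assert(fd >= -1);
   if (wait_for_sync_fd(fd, -1) != 0)
      return -1;
   if (fd != -1)
      close(fd);
   return 0;
}
```

The cost is the one the commit message names: the CPU waits where a properly imported fence would let the GPU wait.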

I chose to take the shortcuts instead of properly importing/exporting
the sync fds for two reasons:

   Reason 1. I've already tested this patch with dEQP and with demo
   apps. It works. I wanted to get the tested patches into the tree now,
   and polish the implementation afterwards.

   Reason 2. I want to run this on a 3.18 kernel (gasp!). In 3.18, i915
   supports neither Android's sync_fence, nor upstream's sync_file, nor
   drm_syncobj. Again, I tested these patches on Android with a 3.18
   kernel and they work.

I plan to quickly follow-up with patches that remove the shortcuts and
properly import/export the sync fds.

Non-Testing
===
I did not test at all using the Android.mk buildsystem. I probably
broke it. Please test and review that.

Testing
===
I tested with 64-bit ARC++ on a Skylake Chromebook and a 3.18 kernel.
The following pass:

   a little spinning cube demo APK
   dEQP-VK.info.*
   dEQP-VK.api.smoke.*
   dEQP-VK.api.info.instance.*
   dEQP-VK.api.info.device.*
   dEQP-VK.api.wsi.android.*

v2:
   - Reject VkNativeBufferANDROID if the dma-buf's size is too small for
 the VkImage.
   - Stop abusing VkNativeBufferANDROID by passing it to vkAllocateMemory
 during vkCreateImage. Instead, directly import its dma-buf during
 vkCreateImage with anv_bo_cache_import(). [for jekstrand]
   - Rebase onto Tapani's VK_EXT_debug_report changes.
   - Drop `CPPFLAGS += $(top_srcdir)/include/android`. The dir does not
 exist.
---
  src/intel/Makefile.sources  |   3 +
  src/intel/Makefile.vulkan.am|   2 +
  src/intel/vulkan/anv_android.c  | 245 
  src/intel/vulkan/anv_device.c   |  12 +-
  src/intel/vulkan/anv_entrypoints_gen.py |  10 +-
  src/intel/vulkan/anv_extensions.py  |   1 +
  src/intel/vulkan/anv_image.c| 141 --
  src/intel/vulkan/anv_private.h  |   1 +
  8 files changed, 405 insertions(+), 10 deletions(-)
  create mode 100644 src/intel/vulkan/anv_android.c

diff --git a/src/intel/Makefile.sources b/src/intel/Makefile.sources
index 8ca50ff622b..6f2dfa91e20 100644
--- a/src/intel/Makefile.sources
+++ b/src/intel/Makefile.sources
@@ -229,6 +229,9 @@ VULKAN_FILES := \
vulkan/anv_wsi.c \
vulkan/vk_format_info.h
  
+VULKAN_ANDROID_FILES := \
+   vulkan/anv_android.c
+
  VULKAN_WSI_WAYLAND_FILES := \
vulkan/anv_wsi_wayland.c
  
diff --git a/src/intel/Makefile.vulkan.am b/src/intel/Makefile.vulkan.am
index d1b1132ed2e..e9c824f717b 100644
--- a/src/intel/Makefile.vulkan.am
+++ b/src/intel/Makefile.vulkan.am
@@ -147,8 +147,10 @@ VULKAN_LIB_DEPS = \
-lm
  
  if HAVE_PLATFORM_ANDROID
+VULKAN_CPPFLAGS += $(ANDROID_CPPFLAGS)
  VULKAN_CFLAGS += $(ANDROID_CFLAGS)
  VULKAN_LIB_DEPS += $(ANDROID_LIBS)
+VULKAN_SOURCES += $(VULKAN_ANDROID_FILES)
  endif
  
  if HAVE_PLATFORM_X11
diff --git a/src/intel/vulkan/anv_android.c b/src/intel/vulkan/anv_android.c
new file mode 100644
index 000..6b19ace4d2d
--- /dev/null
+++ b/src/intel/vulkan/anv_android.c
@@ -0,0 +1,245 @@
+/*
+ * Copyright 2017 Google
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR 

[Mesa-dev] [PATCH 3/5] i965/tex: Make a couple of helpers static

2017-09-14 Thread Kenneth Graunke
From: Jason Ekstrand 

Reviewed-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/intel_tex.h   | 20 
 src/mesa/drivers/dri/i965/intel_tex_image.c |  4 ++--
 2 files changed, 2 insertions(+), 22 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_tex.h b/src/mesa/drivers/dri/i965/intel_tex.h
index 2c5913ad2d1..42565baebf6 100644
--- a/src/mesa/drivers/dri/i965/intel_tex.h
+++ b/src/mesa/drivers/dri/i965/intel_tex.h
@@ -52,24 +52,4 @@ intel_miptree_create_for_teximage(struct brw_context *brw,
 
 void intel_finalize_mipmap_tree(struct brw_context *brw, GLuint unit);
 
-bool
-intel_texsubimage_tiled_memcpy(struct gl_context *ctx,
-   GLuint dims,
-   struct gl_texture_image *texImage,
-   GLint xoffset, GLint yoffset, GLint zoffset,
-   GLsizei width, GLsizei height, GLsizei depth,
-   GLenum format, GLenum type,
-   const GLvoid *pixels,
-   const struct gl_pixelstore_attrib *packing,
-   bool for_glTexImage);
-
-bool
-intel_gettexsubimage_tiled_memcpy(struct gl_context *ctx,
-  struct gl_texture_image *texImage,
-  GLint xoffset, GLint yofset,
-  GLsizei width, GLsizei height,
-  GLenum format, GLenum type,
-  GLvoid *pixels,
-  const struct gl_pixelstore_attrib *packing);
-
 #endif
diff --git a/src/mesa/drivers/dri/i965/intel_tex_image.c b/src/mesa/drivers/dri/i965/intel_tex_image.c
index b8318a5719f..cb1550bf639 100644
--- a/src/mesa/drivers/dri/i965/intel_tex_image.c
+++ b/src/mesa/drivers/dri/i965/intel_tex_image.c
@@ -153,7 +153,7 @@ intel_miptree_create_for_teximage(struct brw_context *brw,
  * regions are updated with glTexSubImage2D. On some workloads, the
  * performance gain of this fastpath on Sandybridge is over 5x.
  */
-bool
+static bool
 intel_texsubimage_tiled_memcpy(struct gl_context * ctx,
GLuint dims,
struct gl_texture_image *texImage,
@@ -581,7 +581,7 @@ intel_image_target_texture_2d(struct gl_context *ctx, GLenum target,
  *
  * \see intel_readpixels_tiled_memcpy()
  */
-bool
+static bool
 intel_gettexsubimage_tiled_memcpy(struct gl_context *ctx,
   struct gl_texture_image *texImage,
   GLint xoffset, GLint yoffset,
-- 
2.14.1



[Mesa-dev] [PATCH 4/5] i965/tex: Remove the for_glTexImage parameter from texsubimage_tiled_memcpy

2017-09-14 Thread Kenneth Graunke
From: Jason Ekstrand 

It is set to false in both callers.  It isn't needed for glTexImage
because intelTexImage calls AllocTextureImageBuffer before calling
texsubimage_tiled_memcpy.

Reviewed-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/intel_tex_image.c | 19 +--
 1 file changed, 5 insertions(+), 14 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_tex_image.c b/src/mesa/drivers/dri/i965/intel_tex_image.c
index cb1550bf639..29a49840d81 100644
--- a/src/mesa/drivers/dri/i965/intel_tex_image.c
+++ b/src/mesa/drivers/dri/i965/intel_tex_image.c
@@ -131,8 +131,6 @@ intel_miptree_create_for_teximage(struct brw_context *brw,
 /**
  * \brief A fast path for glTexImage and glTexSubImage.
  *
- * \param for_glTexImage Was this called from glTexImage or glTexSubImage?
- *
  * This fast path is taken when the texture format is BGRA, RGBA,
  * A or L and when the texture memory is X- or Y-tiled.  It uploads
  * the texture data by mapping the texture memory without a GTT fence, thus
@@ -161,8 +159,7 @@ intel_texsubimage_tiled_memcpy(struct gl_context * ctx,
GLsizei width, GLsizei height, GLsizei depth,
GLenum format, GLenum type,
const GLvoid *pixels,
-   const struct gl_pixelstore_attrib *packing,
-   bool for_glTexImage)
+   const struct gl_pixelstore_attrib *packing)
 {
struct brw_context *brw = brw_context(ctx);
const struct gen_device_info *devinfo = &brw->screen->devinfo;
@@ -210,9 +207,6 @@ intel_texsubimage_tiled_memcpy(struct gl_context * ctx,
if (texImage->TexObject->MinLayer)
   return false;
 
-   if (for_glTexImage)
-  ctx->Driver.AllocTextureImageBuffer(ctx, texImage);
-
if (!image->mt ||
(image->mt->surf.tiling != ISL_TILING_X &&
 image->mt->surf.tiling != ISL_TILING_Y0)) {
@@ -261,12 +255,11 @@ intel_texsubimage_tiled_memcpy(struct gl_context * ctx,
 */
DBG("%s: level=%d offset=(%d,%d) (w,h)=(%d,%d) format=0x%x type=0x%x "
"mesa_format=0x%x tiling=%d "
-   "packing=(alignment=%d row_length=%d skip_pixels=%d skip_rows=%d) "
-   "for_glTexImage=%d\n",
+   "packing=(alignment=%d row_length=%d skip_pixels=%d skip_rows=%d) ",
__func__, texImage->Level, xoffset, yoffset, width, height,
format, type, texImage->TexFormat, image->mt->surf.tiling,
packing->Alignment, packing->RowLength, packing->SkipPixels,
-   packing->SkipRows, for_glTexImage);
+   packing->SkipRows);
 
/* Adjust x and y offset based on miplevel */
unsigned level_x, level_y;
@@ -332,8 +325,7 @@ intelTexImage(struct gl_context * ctx,
texImage->Width,
texImage->Height,
texImage->Depth,
-   format, type, pixels, unpack,
-   false /*allocate_storage*/);
+   format, type, pixels, unpack);
if (ok)
   return;
 
@@ -380,8 +372,7 @@ intelTexSubImage(struct gl_context * ctx,
ok = intel_texsubimage_tiled_memcpy(ctx, dims, texImage,
xoffset, yoffset, zoffset,
width, height, depth,
-   format, type, pixels, packing,
-   false /*for_glTexImage*/);
+   format, type, pixels, packing);
if (ok)
  return;
 
-- 
2.14.1



[Mesa-dev] [PATCH 5/5] i965/tex: Unify the TexImage and TexSubImage code

2017-09-14 Thread Kenneth Graunke
From: Jason Ekstrand 

It's nearly the same so there's no good reason why it can't be in a
common function.  The one difference is that _mesa_store_teximage
calls AllocTextureImageBuffer for us, while _mesa_store_texsubimage
doesn't, but we don't need that anyway - intelTexImage already does it.

Reviewed-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/intel_tex_image.c | 103 
 1 file changed, 45 insertions(+), 58 deletions(-)
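
The shape of the common function is a chain of fast paths followed by a generic store that always works. A self-contained sketch of that pattern, with entirely hypothetical types and function names (nothing here is Mesa API), might look like this:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of one upload request. */
struct upload_sketch {
   const void *pixels;
   size_t size;
};

typedef bool (*fast_path_fn)(const struct upload_sketch *up);
typedef void (*slow_path_fn)(const struct upload_sketch *up);

/* Try each fast path in order; the first one that returns true wins.
 * Returns the index of the fast path used, or -1 if the generic
 * slow path had to run. */
static int
upload_with_fallback(const fast_path_fn *fast, size_t num_fast,
                     slow_path_fn slow, const struct upload_sketch *up)
{
   assert(slow != NULL);

   for (size_t i = 0; i < num_fast; i++) {
      if (fast[i](up))
         return (int)i;
   }

   slow(up);
   return -1;
}

/* Toy paths used only to exercise the chain. */
static bool reject_all(const struct upload_sketch *up) { (void)up; return false; }
static bool accept_small(const struct upload_sketch *up) { return up->size <= 64; }

static int slow_calls;
static void generic_store(const struct upload_sketch *up) { (void)up; slow_calls++; }
```

Both entry points can then call the one chain and differ only in the offsets and extents they pass, which is what the diff below does with intel_upload_tex.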

diff --git a/src/mesa/drivers/dri/i965/intel_tex_image.c b/src/mesa/drivers/dri/i965/intel_tex_image.c
index 29a49840d81..ceb0b3f9810 100644
--- a/src/mesa/drivers/dri/i965/intel_tex_image.c
+++ b/src/mesa/drivers/dri/i965/intel_tex_image.c
@@ -283,6 +283,45 @@ intel_texsubimage_tiled_memcpy(struct gl_context * ctx,
 }
 
 
+static void
+intel_upload_tex(struct gl_context * ctx,
+ GLuint dims,
+ struct gl_texture_image *texImage,
+ GLint xoffset, GLint yoffset, GLint zoffset,
+ GLsizei width, GLsizei height, GLsizei depth,
+ GLenum format, GLenum type,
+ const GLvoid * pixels,
+ const struct gl_pixelstore_attrib *packing)
+{
+   struct intel_mipmap_tree *mt = intel_texture_image(texImage)->mt;
+   bool ok;
+
+   bool tex_busy = mt && brw_bo_busy(mt->bo);
+
+   if (mt && mt->format == MESA_FORMAT_S_UINT8)
+  mt->r8stencil_needs_update = true;
+
+   ok = _mesa_meta_pbo_TexSubImage(ctx, dims, texImage,
+   xoffset, yoffset, zoffset,
+   width, height, depth, format, type,
+   pixels, tex_busy, packing);
+   if (ok)
+  return;
+
+   ok = intel_texsubimage_tiled_memcpy(ctx, dims, texImage,
+   xoffset, yoffset, zoffset,
+   width, height, depth,
+   format, type, pixels, packing);
+   if (ok)
+ return;
+
+   _mesa_store_texsubimage(ctx, dims, texImage,
+   xoffset, yoffset, zoffset,
+   width, height, depth,
+   format, type, pixels, packing);
+}
+
+
 static void
 intelTexImage(struct gl_context * ctx,
   GLuint dims,
@@ -290,11 +329,6 @@ intelTexImage(struct gl_context * ctx,
   GLenum format, GLenum type, const void *pixels,
   const struct gl_pixelstore_attrib *unpack)
 {
-   struct intel_texture_image *intelImage = intel_texture_image(texImage);
-   bool ok;
-
-   bool tex_busy = intelImage->mt && brw_bo_busy(intelImage->mt->bo);
-
DBG("%s mesa_format %s target %s format %s type %s level %d %dx%dx%d\n",
__func__, _mesa_get_format_name(texImage->TexFormat),
_mesa_enum_to_string(texImage->TexObject->Target),
@@ -307,34 +341,11 @@ intelTexImage(struct gl_context * ctx,
   return;
}
 
-   assert(intelImage->mt);
-
-   if (intelImage->mt->format == MESA_FORMAT_S_UINT8)
-  intelImage->mt->r8stencil_needs_update = true;
-
-   ok = _mesa_meta_pbo_TexSubImage(ctx, dims, texImage, 0, 0, 0,
-   texImage->Width, texImage->Height,
-   texImage->Depth,
-   format, type, pixels,
-   tex_busy, unpack);
-   if (ok)
-  return;
+   assert(intel_texture_image(texImage)->mt);
 
-   ok = intel_texsubimage_tiled_memcpy(ctx, dims, texImage,
-   0, 0, 0, /*x,y,z offsets*/
-   texImage->Width,
-   texImage->Height,
-   texImage->Depth,
-   format, type, pixels, unpack);
-   if (ok)
-  return;
-
-   DBG("%s: upload image %dx%dx%d pixels %p\n",
-   __func__, texImage->Width, texImage->Height, texImage->Depth,
-   pixels);
-
-   _mesa_store_teximage(ctx, dims, texImage,
-format, type, pixels, unpack);
+   intel_upload_tex(ctx, dims, texImage, 0, 0, 0,
+texImage->Width, texImage->Height, texImage->Depth,
+format, type, pixels, unpack);
 }
 
 
@@ -348,38 +359,14 @@ intelTexSubImage(struct gl_context * ctx,
  const GLvoid * pixels,
  const struct gl_pixelstore_attrib *packing)
 {
-   struct intel_mipmap_tree *mt = intel_texture_image(texImage)->mt;
-   bool ok;
-
-   bool tex_busy = mt && brw_bo_busy(mt->bo);
-
-   if (mt && mt->format == MESA_FORMAT_S_UINT8)
-  mt->r8stencil_needs_update = true;
-
DBG("%s mesa_format %s target %s format %s type %s level %d %dx%dx%d\n",
__func__, _mesa_get_format_name(texImage->TexFormat),
_mesa_enum_to_string(texImage->TexObject->Target),
_mesa_enum_to_string(format), _mesa_enum_to_string(type),
texImage->Level, texImage->Width, texImage->

[Mesa-dev] [PATCH 1/5] i965/blorp: Set r8stencil_needs_update when writing stencil

2017-09-14 Thread Kenneth Graunke
From: Jason Ekstrand 

This fixes a crash on Haswell when we try to upload a stencil texture
with blorp.  It would also be a problem if someone tried to texture from
stencil after glBlitFramebuffers.

Cc: "17.2 17.1" 
Reviewed-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_blorp.c | 6 ++
 1 file changed, 6 insertions(+)
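
The flag follows the usual dirty-bit pattern for a shadow copy: every writer marks the copy stale, and the reader refreshes it lazily. The sketch below assumes, based on the flag name and the gen <= 7 check in the diff, that a separate R8 copy of the stencil data is what gets textured from; the struct and functions are hypothetical and much simpler than a miptree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_TEXELS 4

/* Hypothetical stand-in for a stencil surface plus its shadow copy. */
struct stencil_sketch {
   uint8_t stencil[SKETCH_TEXELS];  /* authoritative stencil data */
   uint8_t shadow[SKETCH_TEXELS];   /* copy used for texturing */
   bool shadow_needs_update;
};

/* Every write path must mark the shadow stale; forgetting one path
 * is the kind of bug this patch fixes. */
static void
write_stencil(struct stencil_sketch *s, unsigned i, uint8_t value)
{
   assert(i < SKETCH_TEXELS);
   s->stencil[i] = value;
   s->shadow_needs_update = true;
}

/* Texturing reads the shadow, refreshing it first if it is stale. */
static uint8_t
sample_stencil(struct stencil_sketch *s, unsigned i)
{
   assert(i < SKETCH_TEXELS);
   if (s->shadow_needs_update) {
      memcpy(s->shadow, s->stencil, sizeof(s->shadow));
      s->shadow_needs_update = false;
   }
   return s->shadow[i];
}
```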

diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c b/src/mesa/drivers/dri/i965/brw_blorp.c
index 4c6ae369196..0c58e74b67d 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.c
+++ b/src/mesa/drivers/dri/i965/brw_blorp.c
@@ -135,6 +135,8 @@ blorp_surf_for_miptree(struct brw_context *brw,
unsigned start_layer, unsigned num_layers,
struct isl_surf tmp_surfs[1])
 {
+   const struct gen_device_info *devinfo = &brw->screen->devinfo;
+
if (mt->surf.msaa_layout == ISL_MSAA_LAYOUT_ARRAY) {
   const unsigned num_samples = mt->surf.samples;
   for (unsigned i = 0; i < num_layers; i++) {
@@ -163,6 +165,10 @@ blorp_surf_for_miptree(struct brw_context *brw,
else if (mt->hiz_buf)
   aux_surf = &mt->hiz_buf->surf;
 
+   if (mt->format == MESA_FORMAT_S_UINT8 && is_render_target &&
+   devinfo->gen <= 7)
+  mt->r8stencil_needs_update = true;
+
if (surf->aux_usage == ISL_AUX_USAGE_HIZ &&
!intel_miptree_level_has_hiz(mt, *level))
   surf->aux_usage = ISL_AUX_USAGE_NONE;
-- 
2.14.1



[Mesa-dev] [PATCH 2/5] i965: Move TexSubImage functions to intel_tex_image.c

2017-09-14 Thread Kenneth Graunke
From: Jason Ekstrand 

These two paths are basically the same.  There's no good reason to have
them in different files.

Reviewed-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/Makefile.sources |   1 -
 src/mesa/drivers/dri/i965/brw_context.c|   1 -
 src/mesa/drivers/dri/i965/intel_tex.h  |   2 -
 src/mesa/drivers/dri/i965/intel_tex_image.c| 210 
 src/mesa/drivers/dri/i965/intel_tex_subimage.c | 256 -
 5 files changed, 210 insertions(+), 260 deletions(-)
 delete mode 100644 src/mesa/drivers/dri/i965/intel_tex_subimage.c

diff --git a/src/mesa/drivers/dri/i965/Makefile.sources b/src/mesa/drivers/dri/i965/Makefile.sources
index e33dea07128..a701b6a3579 100644
--- a/src/mesa/drivers/dri/i965/Makefile.sources
+++ b/src/mesa/drivers/dri/i965/Makefile.sources
@@ -106,7 +106,6 @@ i965_FILES = \
intel_tex.h \
intel_tex_image.c \
intel_tex_obj.h \
-   intel_tex_subimage.c \
intel_tex_validate.c \
intel_tiled_memcpy.c \
intel_tiled_memcpy.h \
diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c
index 6441311d47e..f67c30f3aa4 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -282,7 +282,6 @@ brw_init_driver_functions(struct brw_context *brw,
 
intelInitTextureFuncs(functions);
intelInitTextureImageFuncs(functions);
-   intelInitTextureSubImageFuncs(functions);
intelInitTextureCopyImageFuncs(functions);
intelInitCopyImageFuncs(functions);
intelInitClearFuncs(functions);
diff --git a/src/mesa/drivers/dri/i965/intel_tex.h b/src/mesa/drivers/dri/i965/intel_tex.h
index f1b55c706ea..2c5913ad2d1 100644
--- a/src/mesa/drivers/dri/i965/intel_tex.h
+++ b/src/mesa/drivers/dri/i965/intel_tex.h
@@ -35,8 +35,6 @@ void intelInitTextureFuncs(struct dd_function_table *functions);
 
 void intelInitTextureImageFuncs(struct dd_function_table *functions);
 
-void intelInitTextureSubImageFuncs(struct dd_function_table *functions);
-
 void intelInitTextureCopyImageFuncs(struct dd_function_table *functions);
 
 void intelInitCopyImageFuncs(struct dd_function_table *functions);
diff --git a/src/mesa/drivers/dri/i965/intel_tex_image.c b/src/mesa/drivers/dri/i965/intel_tex_image.c
index 4661581244e..b8318a5719f 100644
--- a/src/mesa/drivers/dri/i965/intel_tex_image.c
+++ b/src/mesa/drivers/dri/i965/intel_tex_image.c
@@ -127,6 +127,169 @@ intel_miptree_create_for_teximage(struct brw_context *brw,
flags);
 }
 
+
+/**
+ * \brief A fast path for glTexImage and glTexSubImage.
+ *
+ * \param for_glTexImage Was this called from glTexImage or glTexSubImage?
+ *
+ * This fast path is taken when the texture format is BGRA, RGBA,
+ * A or L and when the texture memory is X- or Y-tiled.  It uploads
+ * the texture data by mapping the texture memory without a GTT fence, thus
+ * acquiring a tiled view of the memory, and then copying sucessive
+ * spans within each tile.
+ *
+ * This is a performance win over the conventional texture upload path because
+ * it avoids the performance penalty of writing through the write-combine
+ * buffer. In the conventional texture upload path,
+ * texstore.c:store_texsubimage(), the texture memory is mapped through a GTT
+ * fence, thus acquiring a linear view of the memory, then each row in the
+ * image is memcpy'd. In this fast path, we replace each row's copy with
+ * a sequence of copies over each linear span in tile.
+ *
+ * One use case is Google Chrome's paint rectangles.  Chrome (as
+ * of version 21) renders each page as a tiling of 256x256 GL_BGRA textures.
+ * Each page's content is initially uploaded with glTexImage2D and damaged
+ * regions are updated with glTexSubImage2D. On some workloads, the
+ * performance gain of this fastpath on Sandybridge is over 5x.
+ */
+bool
+intel_texsubimage_tiled_memcpy(struct gl_context * ctx,
+   GLuint dims,
+   struct gl_texture_image *texImage,
+   GLint xoffset, GLint yoffset, GLint zoffset,
+   GLsizei width, GLsizei height, GLsizei depth,
+   GLenum format, GLenum type,
+   const GLvoid *pixels,
+   const struct gl_pixelstore_attrib *packing,
+   bool for_glTexImage)
+{
+   struct brw_context *brw = brw_context(ctx);
+   const struct gen_device_info *devinfo = &brw->screen->devinfo;
+   struct intel_texture_image *image = intel_texture_image(texImage);
+   int src_pitch;
+
+   /* The miptree's buffer. */
+   struct brw_bo *bo;
+
+   uint32_t cpp;
+   mem_copy_fn mem_copy = NULL;
+
+   /* This fastpath is restricted to specific texture types:
+* a 2D BGRA, RGBA, L8 or A8 texture. It could be generalized to support
+* more types.
+*
+* FINISHME: The restrictions b

Re: [Mesa-dev] [RFC] NIR serialization

2017-09-14 Thread Connor Abbott
On Thu, Sep 14, 2017 at 10:15 PM, Connor Abbott  wrote:
> me too :) I'll push my stuff now
>
> On Thu, Sep 14, 2017 at 8:58 PM, Jason Ekstrand  wrote:
>> On Thu, Sep 14, 2017 at 12:40 PM, Connor Abbott  wrote:
>>>
>>> On Tue, Sep 12, 2017 at 2:09 PM, Jason Ekstrand 
>>> wrote:
>>> > On Tue, Sep 12, 2017 at 10:12 AM, Ian Romanick 
>>> > wrote:
>>> >>
>>> >> On 09/11/2017 11:17 PM, Kenneth Graunke wrote:
>>> >> > On Monday, September 11, 2017 9:23:05 PM PDT Ian Romanick wrote:
>>> >> >> On 09/08/2017 01:59 AM, Kenneth Graunke wrote:
>>> >> >>> On Thursday, September 7, 2017 4:26:04 PM PDT Jordan Justen wrote:
>>> >>  On 2017-09-06 14:12:41, Daniel Schürmann wrote:
>>> >> > Hello together!
>>> >> > Recently, we had a small discussion (off the list) about the NIR
>>> >> > serialization, which was previously discussed in [RFC]
>>> >> > ARB_gl_spirv
>>> >> > and
>>> >> > NIR backend for radeonsi.
>>> >> >
>>> >> > As this topic could be interesting to more people, I would like
>>> >> > to
>>> >> > share, what was talked about so far (You might want to read from
>>> >> > bottom up).
>>> >> >
>>> >> > TL;DR:
>>> >> > - NIR serialization is in demand for shader cache
>>> >> > - could be done either directly (NIR binary form) or via SPIR-V
>>> >> > - Ian et al. are working on GLSL IR -> SPIR-V transformation,
>>> >> > which
>>> >> > could be adapted for a NIR -> SPIR-V pass
>>> >> > - in NIR representation, some type information is lost
>>> >> > - thus, a serialization via SPIR-V could NOT be a glslang
>>> >> > alternative
>>> >> > (otoh, the GLSL IR->SPIR-V pass could), but only for spirv-opt
>>> >> > (if
>>> >> > the
>>> >> > output is valid SPIR-V)
>>> >> 
>>> >>  Ian,
>>> >> 
>>> >>  Tim was suggesting that we might look at serializing nir for the
>>> >>  i965
>>> >>  shader cache. Based on this email, it sounds like serialized nir
>>> >>  would
>>> >>  not be enough for the shader cache as some GLSL type info would be
>>> >>  lost. It sounds like GLSL IR => SPIR-V would be good enough. Is
>>> >>  that
>>> >>  right?
>>> >> 
>>> >>  I don't think we have a strict requirement for the GLSL IR =>
>>> >>  SPIR-V
>>> >>  path for GL 4.6, right? So, this is more of a 'nice-to-have'?
>>> >> 
>>> >>  I'm not sure we'd want to make i965 shader cache depend on a
>>> >>  nice-to-have feature. (Unless we're pretty sure it'll be available
>>> >>  soon.)
>>> >> 
>>> >>  But, it would be nice to not have to fallback to compiling the
>>> >>  GLSL
>>> >>  for i965 shader cache, so it would be worth waiting a little bit
>>> >>  to
>>> >>  be
>>> >>  able to rely on a SPIR-V serialization of the GLSL IR.
>>> >> 
>>> >>  What do you suggest?
>>> >> 
>>> >>  -Jordan
>>> >> >>>
>>> >> >>> We shouldn't use SPIR-V for the shader cache.
>>> >> >>>
>>> >> >>> The compilation process for GLSL is: GLSL -> GLSL IR -> NIR -> i965
>>> >> >>> IRs.
>>> >> >>> Storing the content at one of those points, and later loading it
>>> >> >>> and
>>> >> >>> resuming the normal compilation process from that point...that's
>>> >> >>> totally
>>> >> >>> reasonable.
>>> >> >>>
>>> >> >>> Having a fallback for "some things in the cache but not all the
>>> >> >>> variants
>>> >> >>> we needed" suddenly take a different compilation pipeline, i.e.
>>> >> >>> SPIR-V
>>> >> >>> -> NIR -> ... seems risky.  It's a different compilation path that
>>> >> >>> we
>>> >> >>> don't normally use.  And one you'd only hit in limited
>>> >> >>> circumstances.
>>> >> >>> There's a lot of potential for really obscure bugs.
>>> >> >>
>>> >> >> Since we're going to expose exactly that path for GL_ARB_spirv /
>>> >> >> OpenGL
>>> >> >> 4.6, we'd better make sure it works always.  Right?
>>> >> >
>>> >> > In addition to the old pipeline:
>>> >> >
>>> >> > - GLSL from the app -> GLSL IR -> NIR -> i965 IR
>>> >> >
>>> >> > GL_ARB_spirv and OpenGL 4.6 add a second pipeline:
>>> >> >
>>> >> > - SPIR-V from the app -> NIR -> i965 IR
>>> >> >
>>> >> > Both of those absolutely have to work.  But these:
>>> >> >
>>> >> > - GLSL -> GLSL IR -> NIR -> SPIR-V -> NIR -> i965 IRs
>>> >> > - GLSL -> GLSL IR -> SPIR-V -> NIR -> i965 IRs
>>> >> >
>>> >> > aren't required to work, or even be supported.  It makes a lot of
>>> >> > sense
>>> >> > to support them - both for testing purposes, and as an alternative to
>>> >> > glslang, for a broader tooling ecosystem.
>>> >> >
>>> >> > The thing that concerns me is that if you use SPIR-V for the cache,
>>> >> > you
>>> >> > need these paths to not just work, but be _indistinguishable_ from
>>> >> > one
>>> >> > another:
>>> >> >
>>> >> > - GLSL -> GLSL IR -> NIR -> ...
>>> >> > - GLSL -> GLSL IR -> NIR -> SPIR-V, then SPIR-V -> NIR -> ...
>>> >> >
>>> >> > Otherwise the original compile and partially-cached recompile might
>>> >> > h

Re: [Mesa-dev] [RFC] NIR serialization

2017-09-14 Thread Connor Abbott
me too :) I'll push my stuff now

On Thu, Sep 14, 2017 at 8:58 PM, Jason Ekstrand  wrote:
> On Thu, Sep 14, 2017 at 12:40 PM, Connor Abbott  wrote:
>>
>> On Tue, Sep 12, 2017 at 2:09 PM, Jason Ekstrand 
>> wrote:
>> > On Tue, Sep 12, 2017 at 10:12 AM, Ian Romanick 
>> > wrote:
>> >>
>> >> On 09/11/2017 11:17 PM, Kenneth Graunke wrote:
>> >> > On Monday, September 11, 2017 9:23:05 PM PDT Ian Romanick wrote:
>> >> >> On 09/08/2017 01:59 AM, Kenneth Graunke wrote:
>> >> >>> On Thursday, September 7, 2017 4:26:04 PM PDT Jordan Justen wrote:
>> >>  On 2017-09-06 14:12:41, Daniel Schürmann wrote:
>> >> > Hello together!
>> >> > Recently, we had a small discussion (off the list) about the NIR
>> >> > serialization, which was previously discussed in [RFC]
>> >> > ARB_gl_spirv
>> >> > and
>> >> > NIR backend for radeonsi.
>> >> >
>> >> > As this topic could be interesting to more people, I would like
>> >> > to
>> >> > share, what was talked about so far (You might want to read from
>> >> > bottom up).
>> >> >
>> >> > TL;DR:
>> >> > - NIR serialization is in demand for shader cache
>> >> > - could be done either directly (NIR binary form) or via SPIR-V
>> >> > - Ian et al. are working on GLSL IR -> SPIR-V transformation,
>> >> > which
>> >> > could be adapted for a NIR -> SPIR-V pass
>> >> > - in NIR representation, some type information is lost
>> >> > - thus, a serialization via SPIR-V could NOT be a glslang
>> >> > alternative
>> >> > (otoh, the GLSL IR->SPIR-V pass could), but only for spirv-opt
>> >> > (if
>> >> > the
>> >> > output is valid SPIR-V)
>> >> 
>> >>  Ian,
>> >> 
>> >>  Tim was suggesting that we might look at serializing nir for the
>> >>  i965
>> >>  shader cache. Based on this email, it sounds like serialized nir
>> >>  would
>> >>  not be enough for the shader cache as some GLSL type info would be
>> >>  lost. It sounds like GLSL IR => SPIR-V would be good enough. Is
>> >>  that
>> >>  right?
>> >> 
>> >>  I don't think we have a strict requirement for the GLSL IR =>
>> >>  SPIR-V
>> >>  path for GL 4.6, right? So, this is more of a 'nice-to-have'?
>> >> 
>> >>  I'm not sure we'd want to make i965 shader cache depend on a
>> >>  nice-to-have feature. (Unless we're pretty sure it'll be available
>> >>  soon.)
>> >> 
>> >>  But, it would be nice to not have to fallback to compiling the
>> >>  GLSL
>> >>  for i965 shader cache, so it would be worth waiting a little bit
>> >>  to
>> >>  be
>> >>  able to rely on a SPIR-V serialization of the GLSL IR.
>> >> 
>> >>  What do you suggest?
>> >> 
>> >>  -Jordan
>> >> >>>
>> >> >>> We shouldn't use SPIR-V for the shader cache.
>> >> >>>
>> >> >>> The compilation process for GLSL is: GLSL -> GLSL IR -> NIR -> i965
>> >> >>> IRs.
>> >> >>> Storing the content at one of those points, and later loading it
>> >> >>> and
>> >> >>> resuming the normal compilation process from that point...that's
>> >> >>> totally
>> >> >>> reasonable.
>> >> >>>
>> >> >>> Having a fallback for "some things in the cache but not all the
>> >> >>> variants
>> >> >>> we needed" suddenly take a different compilation pipeline, i.e.
>> >> >>> SPIR-V
>> >> >>> -> NIR -> ... seems risky.  It's a different compilation path that
>> >> >>> we
>> >> >>> don't normally use.  And one you'd only hit in limited
>> >> >>> circumstances.
>> >> >>> There's a lot of potential for really obscure bugs.
>> >> >>
>> >> >> Since we're going to expose exactly that path for GL_ARB_spirv /
>> >> >> OpenGL
>> >> >> 4.6, we'd better make sure it works always.  Right?
>> >> >
>> >> > In addition to the old pipeline:
>> >> >
>> >> > - GLSL from the app -> GLSL IR -> NIR -> i965 IR
>> >> >
>> >> > GL_ARB_spirv and OpenGL 4.6 add a second pipeline:
>> >> >
>> >> > - SPIR-V from the app -> NIR -> i965 IR
>> >> >
>> >> > Both of those absolutely have to work.  But these:
>> >> >
>> >> > - GLSL -> GLSL IR -> NIR -> SPIR-V -> NIR -> i965 IRs
>> >> > - GLSL -> GLSL IR -> SPIR-V -> NIR -> i965 IRs
>> >> >
>> >> > aren't required to work, or even be supported.  It makes a lot of
>> >> > sense
>> >> > to support them - both for testing purposes, and as an alternative to
>> >> > glslang, for a broader tooling ecosystem.
>> >> >
>> >> > The thing that concerns me is that if you use SPIR-V for the cache,
>> >> > you
>> >> > need these paths to not just work, but be _indistinguishable_ from
>> >> > one
>> >> > another:
>> >> >
>> >> > - GLSL -> GLSL IR -> NIR -> ...
>> >> > - GLSL -> GLSL IR -> NIR -> SPIR-V, then SPIR-V -> NIR -> ...
>> >> >
>> >> > Otherwise the original compile and partially-cached recompile might
>> >> > have
>> >> > different properties.  For example, if the SPIR-V step messes
>> >> > with
>> >> > variables or instruction ordering a little, it could trip up the loop
>> >> > unrol

Re: [Mesa-dev] [PATCH 3/4] i965/tex_image: Reference the renderbuffer miptree in setTexBuffer2

2017-09-14 Thread Eric Anholt
Jason Ekstrand  writes:

> The old code made a new miptree that referenced the same BO as the
> renderbuffer and just trusted in the memory aliasing to work.  There are
> only two ways in which the new miptree is liable to differ from the one
> in the renderbuffer and neither of them matter:
>
>  1) It may have a different target.  The only targets that we can ever
> see in intelSetTexBuffer2 are GL_TEXTURE_2D and GL_TEXTURE_RECTANGLE
> and the difference between the two doesn't matter as far as the
> miptree is concerned; genX(update_sampler_state) only looks at the
> gl_texture_object and not the miptree when determining whether or
> not to use normalized coordinates.
>
>  2) It may have a very slightly different format.  Again, this doesn't
> matter because we've supported texture views for quite some time so
> we always look at the gl_texture_object format instead of the
> miptree format for hardware setup anyway.
>
> On the other hand, because we were recreating the miptree, we were using
> intel_miptree_create_for_bo which doesn't understand modifiers.  We
> really want this function to work without doing a resolve so long as you
> have modifiers so we need to fix that.

I think this patch in particular might be why I was CCed.  This looks
like a very good, and very obvious (to me) cleanup.

Reviewed-by: Eric Anholt 


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 4/4] i965: Use prepare_external instead of make_shareable in setTexBuffer2

2017-09-14 Thread Eric Anholt
Jason Ekstrand  writes:

> The setTexBuffer2 hook from GLX is used to implement glxBindTexImageEXT
> which has tighter restrictions than just "it's shared".  In particular,
> it says that any rendering to the image while it is bound causes the
> contents to become undefined.  This means that we can do whatever aux
> tracking we want between glxBindTexImageEXT and glxReleaseTexImageEXT so
> long as we always transition from external in Bind and to external in
> Release.

The intent of the spec was to get at the hard-to-define "you get pixels
at least as new as the outstanding X11 rendering when you called
glxBindTexImageEXT(), but if X11 keeps on rendering to the thing then
you may get newer pixels, too."  With your CCS plan and X11 rendering in
parallel with your GL texturing from the X11 pixmap, will we always see
either old or new pixels but not anything else?




Re: [Mesa-dev] [RFC] NIR serialization

2017-09-14 Thread Jason Ekstrand
On Thu, Sep 14, 2017 at 12:40 PM, Connor Abbott  wrote:

> On Tue, Sep 12, 2017 at 2:09 PM, Jason Ekstrand 
> wrote:
> > On Tue, Sep 12, 2017 at 10:12 AM, Ian Romanick 
> wrote:
> >>
> >> On 09/11/2017 11:17 PM, Kenneth Graunke wrote:
> >> > On Monday, September 11, 2017 9:23:05 PM PDT Ian Romanick wrote:
> >> >> On 09/08/2017 01:59 AM, Kenneth Graunke wrote:
> >> >>> On Thursday, September 7, 2017 4:26:04 PM PDT Jordan Justen wrote:
> >>  On 2017-09-06 14:12:41, Daniel Schürmann wrote:
> >> > Hello together!
> >> > Recently, we had a small discussion (off the list) about the NIR
> >> > serialization, which was previously discussed in [RFC]
> ARB_gl_spirv
> >> > and
> >> > NIR backend for radeonsi.
> >> >
> >> > As this topic could be interesting to more people, I would like to
> >> > share, what was talked about so far (You might want to read from
> >> > bottom up).
> >> >
> >> > TL;DR:
> >> > - NIR serialization is in demand for shader cache
> >> > - could be done either directly (NIR binary form) or via SPIR-V
> >> > - Ian et al. are working on GLSL IR -> SPIR-V transformation,
> which
> >> > could be adapted for a NIR -> SPIR-V pass
> >> > - in NIR representation, some type information is lost
> >> > - thus, a serialization via SPIR-V could NOT be a glslang
> >> > alternative
> >> > (otoh, the GLSL IR->SPIR-V pass could), but only for spirv-opt (if
> >> > the
> >> > output is valid SPIR-V)
> >> 
> >>  Ian,
> >> 
> >>  Tim was suggesting that we might look at serializing nir for the
> i965
> >>  shader cache. Based on this email, it sounds like serialized nir
> >>  would
> >>  not be enough for the shader cache as some GLSL type info would be
> >>  lost. It sounds like GLSL IR => SPIR-V would be good enough. Is
> that
> >>  right?
> >> 
> >>  I don't think we have a strict requirement for the GLSL IR =>
> SPIR-V
> >>  path for GL 4.6, right? So, this is more of a 'nice-to-have'?
> >> 
> >>  I'm not sure we'd want to make i965 shader cache depend on a
> >>  nice-to-have feature. (Unless we're pretty sure it'll be available
> >>  soon.)
> >> 
> >>  But, it would be nice to not have to fallback to compiling the GLSL
> >>  for i965 shader cache, so it would be worth waiting a little bit to
> >>  be
> >>  able to rely on a SPIR-V serialization of the GLSL IR.
> >> 
> >>  What do you suggest?
> >> 
> >>  -Jordan
> >> >>>
> >> >>> We shouldn't use SPIR-V for the shader cache.
> >> >>>
> >> >>> The compilation process for GLSL is: GLSL -> GLSL IR -> NIR -> i965
> >> >>> IRs.
> >> >>> Storing the content at one of those points, and later loading it and
> >> >>> resuming the normal compilation process from that point...that's
> >> >>> totally
> >> >>> reasonable.
> >> >>>
> >> >>> Having a fallback for "some things in the cache but not all the
> >> >>> variants
> >> >>> we needed" suddenly take a different compilation pipeline, i.e.
> SPIR-V
> >> >>> -> NIR -> ... seems risky.  It's a different compilation path that
> we
> >> >>> don't normally use.  And one you'd only hit in limited
> circumstances.
> >> >>> There's a lot of potential for really obscure bugs.
> >> >>
> >> >> Since we're going to expose exactly that path for GL_ARB_spirv /
> OpenGL
> >> >> 4.6, we'd better make sure it works always.  Right?
> >> >
> >> > In addition to the old pipeline:
> >> >
> >> > - GLSL from the app -> GLSL IR -> NIR -> i965 IR
> >> >
> >> > GL_ARB_spirv and OpenGL 4.6 add a second pipeline:
> >> >
> >> > - SPIR-V from the app -> NIR -> i965 IR
> >> >
> >> > Both of those absolutely have to work.  But these:
> >> >
> >> > - GLSL -> GLSL IR -> NIR -> SPIR-V -> NIR -> i965 IRs
> >> > - GLSL -> GLSL IR -> SPIR-V -> NIR -> i965 IRs
> >> >
> >> > aren't required to work, or even be supported.  It makes a lot of
> sense
> >> > to support them - both for testing purposes, and as an alternative to
> >> > glslang, for a broader tooling ecosystem.
> >> >
> >> > The thing that concerns me is that if you use SPIR-V for the cache,
> you
> >> > need these paths to not just work, but be _indistinguishable_ from one
> >> > another:
> >> >
> >> > - GLSL -> GLSL IR -> NIR -> ...
> >> > - GLSL -> GLSL IR -> NIR -> SPIR-V, then SPIR-V -> NIR -> ...
> >> >
> >> > Otherwise the original compile and partially-cached recompile might
> have
> >> > different properties.  For example, if the SPIR-V step messes with
> >> > variables or instruction ordering a little, it could trip up the loop
> >> > unroller so the original compile gets unrolled, and the recompile
> from
> >> > partial cache doesn't get unrolled.  I don't want to have to debug
> that.
> >>
> >> That is a very compelling argument.  If we want Mesa to be an
> >> alternative to glslang, I think we would like to have that property, but
> >> it's not a hard requirement for that use case.
> >
> >
> > I a
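
The "indistinguishable" property asked for in the quoted text can be phrased as a round-trip check: serializing an IR and reading it back must reproduce the original exactly, instruction order included. Below is a minimal sketch in C under stated assumptions; struct toy_instr and the toy_* helpers are hypothetical stand-ins for illustration and are not Mesa or NIR API.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical toy instruction; real NIR carries far more state. */
struct toy_instr {
   uint32_t opcode;
   uint32_t dest;
   uint32_t src[2];
};

#define TOY_WORDS_PER_INSTR 4

/* Write each instruction as four words, in program order. */
static void
toy_serialize(const struct toy_instr *in, size_t count, uint32_t *blob)
{
   for (size_t i = 0; i < count; i++) {
      uint32_t *w = &blob[i * TOY_WORDS_PER_INSTR];
      w[0] = in[i].opcode;
      w[1] = in[i].dest;
      w[2] = in[i].src[0];
      w[3] = in[i].src[1];
   }
}

/* Read the words back in the same order they were written. */
static void
toy_deserialize(const uint32_t *blob, size_t count, struct toy_instr *out)
{
   for (size_t i = 0; i < count; i++) {
      const uint32_t *w = &blob[i * TOY_WORDS_PER_INSTR];
      out[i].opcode = w[0];
      out[i].dest = w[1];
      out[i].src[0] = w[2];
      out[i].src[1] = w[3];
   }
}

/* Returns 1 if serialize + deserialize reproduces the input exactly,
 * 0 otherwise (including on allocation failure).
 */
static int
toy_round_trip_is_identity(const struct toy_instr *in, size_t count)
{
   if (count == 0)
      return 1;

   uint32_t *blob = malloc(count * TOY_WORDS_PER_INSTR * sizeof(*blob));
   struct toy_instr *copy = malloc(count * sizeof(*copy));
   int same = 0;

   if (blob != NULL && copy != NULL) {
      toy_serialize(in, count, blob);
      toy_deserialize(blob, count, copy);
      same = 1;
      for (size_t i = 0; i < count; i++) {
         if (copy[i].opcode != in[i].opcode ||
             copy[i].dest != in[i].dest ||
             copy[i].src[0] != in[i].src[0] ||
             copy[i].src[1] != in[i].src[1]) {
            same = 0;
            break;
         }
      }
   }

   free(blob);
   free(copy);
   return same;
}
```

A check like this only shows that a serializer is lossless for the fields it knows about. The worry raised in the thread is that a detour through SPIR-V may reorder or rewrite things, in which case a check of this kind would fail.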

Re: [Mesa-dev] [PATCH] r600: fork and import gallium/radeon

2017-09-14 Thread Dave Airlie
On 15 September 2017 at 02:10, Marek Olšák  wrote:
> On Thu, Sep 14, 2017 at 4:19 PM, Emil Velikov  
> wrote:
>> Hi Marek,
>>
>> On 14 September 2017 at 14:06, Marek Olšák  wrote:
>>> From: Marek Olšák 
>>>
>>> This marks the end of code sharing between r600 and radeonsi.
>>>
>> It has the "what" but it's missing the "why". Can you please add some
>> information.
>>
>> From a quick look this will make each binary ~140KiB larger (dri,
>> omx, vdpau ...). As a reference point drivers/r600 and
>> drivers/radeonsi themselves are around 620KiB and 280KiB respectively.
>>
>> With the bits duplicated/forked, should one `mv radeon{,si}` or you're
>> planning that at a later stage?
>>
>>> A lot of functions had to be renamed to prevent linker conflicts.
>>>
>>> There are also minor cleanups.
>>> ---
>>>
>>> This one is huge. Please review here:
>>> https://cgit.freedesktop.org/~mareko/mesa/commit/?h=master&id=858b2d1c8cec727fdf750192c8c210f72d38f853
>>>
>> I'll look at those in an hour or so.
>
> The plan is to merge gallium/radeon into radeonsi gradually over a
> longer period of time.
>
> Existing uncommitted work in gallium/radeon should apply more or less
> cleanly, but will only affect radeonsi, not r600,
>

I don't love it, but we've got to drop the ties at some point, and
it's getting less likely we can usefully share.

if only we had addrlib support for r600->ni :-P

Anyways,

Acked-by: Dave Airlie 

The disk space doesn't concern me.

Dave.


Re: [Mesa-dev] Vulkan extensions

2017-09-14 Thread Bas Nieuwenhuizen
On Fri, Sep 15, 2017 at 1:18 AM, Dave Airlie  wrote:
> On 15 September 2017 at 09:12, Jordan Justen  
> wrote:
>> On 2017-09-14 15:36:10, Romain Failliot wrote:
>>> Le 14 sept. 2017 6:11 PM, "Bas Nieuwenhuizen"  a
>>> écrit :
>>>
>>> > For vulkan, because 1.0 is the initial version, there are no
>>> > extensions to implement to get to that version, so having an
>>> > extensions list would be nonsensical.
>>>
>>> I don't think it is nonsensical, say the nouveau devs start to work on a
>>> Vulkan 1.0 driver and they'd like to show their progress in features.txt. I
>>> think it would be interesting for them to have the list of extensions to
>>> implement to be Vulkan 1.0 compliant, so they could flag which extensions
>>> are done, in progress or not started.
>>
>> That would be fine, except I don't think the 1.0 features are bucketed
>> into a set of 'extensions'. Right? I thought 1.0 was the baseline, and
>> extensions were built upon that.
>>
>
> I think Romain missed Bas's point. There is no extension list to get to 1.0.
> 1.0 is step one. The closest thing is probably the device features list,
> and even that you don't expect any device to fill all of it, so what 100% is
> differs for every device.

Also you can implement 0% of the feature list and still be vulkan 1.0
compliant ;)

- Bas
>
> Dave.


Re: [Mesa-dev] Vulkan extensions

2017-09-14 Thread Dave Airlie
On 15 September 2017 at 09:12, Jordan Justen  wrote:
> On 2017-09-14 15:36:10, Romain Failliot wrote:
>> Le 14 sept. 2017 6:11 PM, "Bas Nieuwenhuizen"  a
>> écrit :
>>
>> > For vulkan, because 1.0 is the initial version, there are no
>> > extensions to implement to get to that version, so having an
>> > extensions list would be nonsensical.
>>
>> I don't think it is nonsensical, say the nouveau devs start to work on a
>> Vulkan 1.0 driver and they'd like to show their progress in features.txt. I
>> think it would be interesting for them to have the list of extensions to
>> implement to be Vulkan 1.0 compliant, so they could flag which extensions
>> are done, in progress or not started.
>
> That would be fine, except I don't think the 1.0 features are bucketed
> into a set of 'extensions'. Right? I thought 1.0 was the baseline, and
> extensions were built upon that.
>

I think Romain missed Bas's point. There is no extension list to get to 1.0.
1.0 is step one. The closest thing is probably the device features list,
and even that you don't expect any device to fill all of it, so what 100% is
differs for every device.

Dave.


Re: [Mesa-dev] [PATCH 1/3] i965/gen8: Remove unused gen8_emit_3dstate_multisample()

2017-09-14 Thread Kenneth Graunke
On Monday, September 11, 2017 5:48:24 AM PDT Topi Pohjolainen wrote:
> Signed-off-by: Topi Pohjolainen 
> ---
>  src/mesa/drivers/dri/i965/brw_context.h|  1 -
>  src/mesa/drivers/dri/i965/gen8_multisample_state.c | 16 
>  2 files changed, 17 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
> index 92fc16de13..bd56ffc819 100644
> --- a/src/mesa/drivers/dri/i965/brw_context.h
> +++ b/src/mesa/drivers/dri/i965/brw_context.h
> @@ -1510,7 +1510,6 @@ void
>  gen6_set_sample_maps(struct gl_context *ctx);
>  
>  /* gen8_multisample_state.c */
> -void gen8_emit_3dstate_multisample(struct brw_context *brw, unsigned num_samp);
>  void gen8_emit_3dstate_sample_pattern(struct brw_context *brw);
>  
>  /* gen7_urb.c */
> diff --git a/src/mesa/drivers/dri/i965/gen8_multisample_state.c b/src/mesa/drivers/dri/i965/gen8_multisample_state.c
> index 7a31a5df4a..3afa586275 100644
> --- a/src/mesa/drivers/dri/i965/gen8_multisample_state.c
> +++ b/src/mesa/drivers/dri/i965/gen8_multisample_state.c
> @@ -28,22 +28,6 @@
>  #include "brw_multisample_state.h"
>  
>  /**
> - * 3DSTATE_MULTISAMPLE
> - */
> -void
> -gen8_emit_3dstate_multisample(struct brw_context *brw, unsigned num_samples)
> -{
> -   assert(num_samples <= 16);
> -
> -   unsigned log2_samples = ffs(MAX2(num_samples, 1)) - 1;
> -
> -   BEGIN_BATCH(2);
> -   OUT_BATCH(GEN8_3DSTATE_MULTISAMPLE << 16 | (2 - 2));
> -   OUT_BATCH(MS_PIXEL_LOCATION_CENTER | log2_samples << 1);
> -   ADVANCE_BATCH();
> -}
> -
> -/**
>   * 3DSTATE_SAMPLE_PATTERN
>   */
>  void
> 

Acked-by: Kenneth Graunke 




Re: [Mesa-dev] [PATCH 3/3] i965: Disable stencil cache optimization combining two 4x2 blocks

2017-09-14 Thread Kenneth Graunke
On Monday, September 11, 2017 5:48:26 AM PDT Topi Pohjolainen wrote:
> From the BDW PRM, Volume 15, Workarounds:
> 
> KMD Wa4x4STCOptimizationDisable HIZ/STC hang in hawx frames.
> 
> W/A: Disable 4x4 RCPFE STC optimization and therefore only send one
>  valid 4x4 to STC on 4x4 interface. This will require setting bit
>  6 of reg. 0x7004. Must be done at boot and all save/restore paths.
> 
> From the SKL PRM, Volume 16, Workarounds:
> 
> 0556 KMD Wa4x4STCOptimizationDisable HIZ/STC hang in hawx frames.
> 
> W/A: Disable 4 x4 RCPFE STC optimization and therefore only send
>  one valid 4x4 to STC on 4x4 interface.  This will require setting
>  bit 6 of reg. 0x7004. Must be done at boot and all save/restore
>  paths.

The kernel has already implemented this workaround since v4.1 for Skylake,
v4.0 for Cherryview, and maybe v3.18 or older for Broadwell.  So I don't
think there's any need for us to do it in Mesa (and the text you quoted
makes me wonder whether it'd even work to do it from userspace).

--Ken
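
For readers unfamiliar with how such a chicken bit is usually programmed, here is a small C sketch. It assumes that register 0x7004 follows the masked-write convention used by many Intel GPU registers, where the upper 16 bits select which of the lower 16 bits the write may change. The quoted PRM text only says "bit 6 of reg. 0x7004", so the masking behaviour and all TOY_* names below are assumptions made for illustration and are not kernel or Mesa definitions.

```c
#include <stdint.h>

/* Hypothetical names; only the offset and bit number come from the
 * workaround text quoted above.
 */
#define TOY_REG_STC_CONTROL      0x7004u
#define TOY_STC_4X4_OPT_DISABLE  (1u << 6)

/* Assumed masked-write convention: the high half carries the write-enable
 * mask and the low half carries the new bit values.
 */
static inline uint32_t
toy_masked_bit_enable(uint32_t bit)
{
   return (bit << 16) | bit;
}

static inline uint32_t
toy_masked_bit_disable(uint32_t bit)
{
   return bit << 16;
}
```

Under that assumption the value written to set the workaround bit would be toy_masked_bit_enable(TOY_STC_4X4_OPT_DISABLE). As noted above, the write may only be effective from the kernel.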



Re: [Mesa-dev] [PATCH 2/3] intel/blorp/hiz: Always set sample number

2017-09-14 Thread Kenneth Graunke
On Monday, September 11, 2017 5:48:25 AM PDT Topi Pohjolainen wrote:
> Signed-off-by: Topi Pohjolainen 
> ---
>  src/intel/blorp/blorp_genX_exec.h | 11 +++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/src/intel/blorp/blorp_genX_exec.h b/src/intel/blorp/blorp_genX_exec.h
> index 5f9a8ab4a5..5389262098 100644
> --- a/src/intel/blorp/blorp_genX_exec.h
> +++ b/src/intel/blorp/blorp_genX_exec.h
> @@ -1454,6 +1454,17 @@ blorp_emit_gen8_hiz_op(struct blorp_batch *batch,
> if (params->stencil.enabled)
>assert(params->hiz_op == BLORP_HIZ_OP_DEPTH_CLEAR);
>  
> +   /* From the BDW PRM Volume 2, 3DSTATE_WM_HZ_OP:
> +*
> +* 3DSTATE_MULTISAMPLE packet must be used prior to this packet to change
> +* the Number of Multisamples. This packet must not be used to change
> +* Number of Multisamples in a rendering sequence.
> +*
> +* Since HIZ may be the first thing in a batch buffer, play safe and always
> +* emit 3DSTATE_MULTISAMPLE.
> +*/
> +   blorp_emit_3dstate_multisample(batch, params);
> +
> /* If we can't alter the depth stencil config and multiple layers are
>  * involved, the HiZ op will fail. This is because the op requires that a
>  * new config is emitted for each additional layer.
> 

Reviewed-by: Kenneth Graunke 




Re: [Mesa-dev] Vulkan extensions

2017-09-14 Thread Jordan Justen
On 2017-09-14 15:36:10, Romain Failliot wrote:
> Le 14 sept. 2017 6:11 PM, "Bas Nieuwenhuizen"  a
> écrit :
> 
> > For vulkan, because 1.0 is the initial version, there are no
> > extensions to implement to get to that version, so having an
> > extensions list would be nonsensical.
> 
> I don't think it is nonsensical, say the nouveau devs start to work on a
> Vulkan 1.0 driver and they'd like to show their progress in features.txt. I
> think it would be interesting for them to have the list of extensions to
> implement to be Vulkan 1.0 compliant, so they could flag which extensions
> are done, in progress or not started.

That would be fine, except I don't think the 1.0 features are bucketed
into a set of 'extensions'. Right? I thought 1.0 was the baseline, and
extensions were built upon that.

-Jordan


Re: [Mesa-dev] [PATCH 1/3] glsl: don't drop intructions from unreachable terminators continue branch

2017-09-14 Thread Matt Turner
On Wed, Sep 13, 2017 at 9:47 PM, Timothy Arceri  wrote:
> These instructions will be executed on every iteration of the loop,
> so we cannot drop them.
> ---
>  src/compiler/glsl/loop_analysis.h   |  7 +++
>  src/compiler/glsl/loop_controls.cpp | 15 +++
>  src/compiler/glsl/loop_unroll.cpp   |  7 ---
>  3 files changed, 22 insertions(+), 7 deletions(-)
>
> diff --git a/src/compiler/glsl/loop_analysis.h b/src/compiler/glsl/loop_analysis.h
> index 2894c6359b..0e1bfd8142 100644
> --- a/src/compiler/glsl/loop_analysis.h
> +++ b/src/compiler/glsl/loop_analysis.h
> @@ -27,20 +27,27 @@
>
>  #include "ir.h"
>  #include "util/hash_table.h"
>
>  /**
>   * Analyze and classify all variables used in all loops in the instruction list
>   */
>  extern class loop_state *
>  analyze_loop_variables(exec_list *instructions);
>
> +static inline bool
> +is_break(ir_instruction *ir)
> +{
> +   return ir != NULL && ir->ir_type == ir_type_loop_jump &&
> +  ((ir_loop_jump *) ir)->is_break();

Please indent this expression to align vertically with ir != NULL


Re: [Mesa-dev] [PATCH 1/3] glsl: don't drop intructions from unreachable terminators continue branch

2017-09-14 Thread Timothy Arceri

On 15/09/17 04:25, Emil Velikov wrote:
> Hi Tim
>
> On 14 September 2017 at 05:47, Timothy Arceri  wrote:
>> These instructions will be executed on every iteration of the loop,
>> so we cannot drop them.
>
> This and 2/3 sound like very nice bugfixes.
> I haven't checked if they apply for 17.2 - if not can you do some backports.

To fix things properly we would really need to land all three. Patch 3
makes unrolling better but it also fixes potential bugs. That said, these
have been there since day 1 and are kinda corner cases so I'm not overly
concerned about getting these in stable. If you think it's worth it then
feel free, I believe they should all apply to 17.2 cleanly as no one
normally touches this code.

> Once people have checked the lot and confirmed everything of course.
>
> Thanks!
> Emil




Re: [Mesa-dev] Vulkan extensions

2017-09-14 Thread Bas Nieuwenhuizen
So AFAIK we always put the extensions there that we need to implement
to be able to claim that version, or that is my understanding for GL
at least.

For vulkan, because 1.0 is the initial version, there are no
extensions to implement to get to that version, so having an
extensions list would be nonsensical.

All the extensions you will find using the listed commands are the
ones that are shown in the subsequent list for non version-specific
extensions, or are not listed at all because of being KHX/EXT or
vendor extensions.

- Bas

On Thu, Sep 14, 2017 at 11:46 PM, Romain Failliot
 wrote:
> Hi!
>
> I'm working on exposing the vulkan information recently added in
> features.txt in mesamatrix.net, but there is no extension list under "Vulkan
> 1.0 - all DONE: anv, radv"
>
> There is a couple of command lines in the commit message though:
> https://cgit.freedesktop.org/mesa/mesa/commit/?id=fe3d2559d941f8f69dbdb369221af69a9974d017
>
> I could generate the list locally, but if a new driver comes in (nvidia for
> instance), or if new extensions are added to Vulkan 1.0.xx, this will start
> to be hard to maintain.
>
> Is it possible to do the same as for OpenGL and list all the vulkan
> extensions directly in features.txt?
>
> Thanks,
> Romain


Re: [Mesa-dev] [PATCH 2/2] anv: set has_exec_async to false on Android

2017-09-14 Thread Chad Versace
On Thu 14 Sep 2017, Emil Velikov wrote:
> On 14 September 2017 at 07:57, Tapani Pälli  wrote:
> > Other WSI implementations set has_exec_async false for WSI buffers,
> > so far haven't found a place to do it so we just claim to not have
> > async exec.
> >
> What's the actual side-effects you're seeing? I'd imagine Jason, Chris
> and the gang may have some tips/suggestions - be that wrt Mesa or the
> kernel.
> 
> I'm not saying "don't upstream this", but a comment and/or bug
> reference will be beneficial.
> Esp. since disabling async exec may have noticeable implication on 
> performance.

Tapani, thanks for finding this problem. I completely overlooked it.

Instead of disabling ASYNC globally on Android, I believe the correct
fix is to set it only on imported gralloc buffers. Anvil already does
that for all X11 and Wayland buffers in anv_wsi.c.

I added that fix in v4 of my patch at
https://lists.freedesktop.org/archives/mesa-dev/2017-September/169698.html.
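
The per-buffer fix described above comes down to adjusting the execbuf object flags of the imported buffer only. A rough C sketch follows. The TOY_* constants are local stand-ins whose bit positions are assumed to mirror EXEC_OBJECT_WRITE and EXEC_OBJECT_ASYNC from i915_drm.h, and toy_flags_for_winsys_bo() is a hypothetical helper, not the code from the patch.

```c
#include <stdint.h>

/* Local stand-ins; real code would include i915_drm.h and use its
 * EXEC_OBJECT_* definitions. Bit positions are assumed to match.
 */
#define TOY_EXEC_OBJECT_WRITE (1u << 2)
#define TOY_EXEC_OBJECT_ASYNC (1u << 6)

/* For a winsys (here: gralloc) buffer, opt back in to implicit sync by
 * clearing ASYNC, and mark the buffer as written so that other users of
 * the buffer wait for our rendering.
 */
static inline uint64_t
toy_flags_for_winsys_bo(uint64_t flags)
{
   flags &= ~(uint64_t)TOY_EXEC_OBJECT_ASYNC;
   flags |= TOY_EXEC_OBJECT_WRITE;
   return flags;
}
```

All other buffer objects keep whatever flags the driver chose for them, so async exec stays enabled where it is safe.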


[Mesa-dev] [PATCH 20/20 v4] anv: Implement VK_ANDROID_native_buffer (v4)

2017-09-14 Thread Chad Versace
This implementation is correct (afaict), but takes two shortcuts
regarding the import/export of Android sync fds.

  Shortcut 1. When Android calls vkAcquireImageANDROID to import a sync
  fd into a VkSemaphore or VkFence, the driver instead simply blocks on
  the sync fd, then puts the VkSemaphore or VkFence into the signalled
  state. Thanks to implicit sync, this produces correct behavior (with
  extra latency overhead, perhaps) despite its ugliness.

  Shortcut 2. When Android calls vkQueueSignalReleaseImageANDROID to export
  a collection of wait semaphores as a sync fd, the driver instead
  submits the semaphores to the queue, then returns sync fd -1, which
  informs the caller that no additional synchronization is needed.
  Again, thanks to implicit sync, this produces correct behavior (with
  extra batch submission overhead) despite its ugliness.

I chose to take the shortcuts instead of properly importing/exporting
the sync fds for two reasons:

  Reason 1. I've already tested this patch with dEQP and with demo
  apps. It works. I wanted to get the tested patches into the tree now,
  and polish the implementation afterwards.

  Reason 2. I want to run this on a 3.18 kernel (gasp!). In 3.18, i915
  supports neither Android's sync_fence, nor upstream's sync_file, nor
  drm_syncobj. Again, I tested these patches on Android with a 3.18
  kernel and they work.

I plan to quickly follow-up with patches that remove the shortcuts and
properly import/export the sync fds.

Non-Testing
===
I did not test at all using the Android.mk buildsystem. I probably
broke it. Please test and review that.

Testing
===
I tested with 64-bit ARC++ on a Skylake Chromebook and a 3.18 kernel.
The following pass:

  a little spinning cube demo APK
  dEQP-VK.info.*
  dEQP-VK.api.smoke.*
  dEQP-VK.api.info.instance.*
  dEQP-VK.api.info.device.*
  dEQP-VK.api.wsi.android.*

v2:
  - Reject VkNativeBufferANDROID if the dma-buf's size is too small for
the VkImage.
  - Stop abusing VkNativeBufferANDROID by passing it to vkAllocateMemory
during vkCreateImage. Instead, directly import its dma-buf during
vkCreateImage with anv_bo_cache_import(). [for jekstrand]
  - Rebase onto Tapani's VK_EXT_debug_report changes.
  - Drop `CPPFLAGS += $(top_srcdir)/include/android`. The dir does not
exist.

v3:
  - Delete duplicate #include "anv_private.h". [per Tapani]
  - Try to fix the Android-IA build in Android.vulkan.mk by following
Tapani's example in

.
But I truncated the added include path from
"frameworks/native/vulkan/include/hardware" to
"frameworks/native/vulkan/include", and inserted it *after*
$(MESA_TOP)/include/vulkan, so that #include 
hopefully works in both the Autotools and Android.mk build.

v4:
  - Unset EXEC_OBJECT_ASYNC and set EXEC_OBJECT_WRITE on the imported
gralloc buffer, just as we do for all other winsys buffers in
anv_wsi.c. [found by Tapani]

Cc: Tapani Pälli 
Cc: Jason Ekstrand 
---
 src/intel/Android.vulkan.mk |   7 +-
 src/intel/Makefile.sources  |   3 +
 src/intel/Makefile.vulkan.am|   2 +
 src/intel/vulkan/anv_android.c  | 242 
 src/intel/vulkan/anv_device.c   |  12 +-
 src/intel/vulkan/anv_entrypoints_gen.py |  10 +-
 src/intel/vulkan/anv_extensions.py  |   1 +
 src/intel/vulkan/anv_image.c| 148 +--
 src/intel/vulkan/anv_private.h  |   1 +
 9 files changed, 414 insertions(+), 12 deletions(-)
 create mode 100644 src/intel/vulkan/anv_android.c

diff --git a/src/intel/Android.vulkan.mk b/src/intel/Android.vulkan.mk
index e20b32b87c..dcd6d8b68a 100644
--- a/src/intel/Android.vulkan.mk
+++ b/src/intel/Android.vulkan.mk
@@ -28,6 +28,7 @@ VK_ENTRYPOINTS_SCRIPT := $(MESA_PYTHON2) $(LOCAL_PATH)/vulkan/anv_entrypoints_ge
 VK_EXTENSIONS_SCRIPT := $(MESA_PYTHON2) $(LOCAL_PATH)/vulkan/anv_extensions.py
 
 VULKAN_COMMON_INCLUDES := \
+   $(MESA_TOP)/include/vulkan \
$(MESA_TOP)/src/mapi \
$(MESA_TOP)/src/gallium/auxiliary \
$(MESA_TOP)/src/gallium/include \
@@ -36,7 +37,8 @@ VULKAN_COMMON_INCLUDES := \
$(MESA_TOP)/src/vulkan/util \
$(MESA_TOP)/src/intel \
$(MESA_TOP)/include/drm-uapi \
-   $(MESA_TOP)/src/intel/vulkan
+   $(MESA_TOP)/src/intel/vulkan \
+   frameworks/native/vulkan/include
 
 # libmesa_anv_entrypoints with header and dummy.c
 #
@@ -254,7 +256,8 @@ LOCAL_CFLAGS := -DLOG_TAG=\"INTEL-MESA\"
 LOCAL_LDFLAGS += -Wl,--build-id=sha1
 
 LOCAL_SRC_FILES := \
-   $(VULKAN_GEM_FILES)
+   $(VULKAN_GEM_FILES) \
+   $(VULKAN_ANDROID_FILES)
 
 LOCAL_C_INCLUDES := \
$(VULKAN_COMMON_INCLUDES) \
diff --git a/src/intel/Makefile.sources b/src/intel/Makefile.sources
index 8ca50ff622..6f2dfa91e2 100644
--- a/src/intel/Makefile.sources
+++ b/src/intel/Makefile.sources
@@ -229,6 +229,9 @@ VU

[Mesa-dev] [PATCH 21/20] anv: Install as Vulkan HAL module in Android.mk build

2017-09-14 Thread Chad Versace
From: Tapani Pälli 

Now that anvil fully implements the Vulkan HAL interface, we can install
it as the vendor HAL module at /vendor/lib/hw/vulkan.${board}.so. To do
so:

  - Rename LOCAL_MODULE to vulkan.$(TARGET_BOARD_PLATFORM).
  - Use LOCAL_PROPRIETARY_MODULE to install under vendor path.

Tested by running different Sascha Willems demos on Android-IA.

Signed-off-by: Tapani Pälli 
[chadv: Extract this hunk from Tapani's patch, and embed it as
 stand-alone patch in my arc-vulkan series].
Signed-off-by: Chad Versace 
---
 src/intel/Android.vulkan.mk | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/intel/Android.vulkan.mk b/src/intel/Android.vulkan.mk
index dcd6d8b68a..2aad968f32 100644
--- a/src/intel/Android.vulkan.mk
+++ b/src/intel/Android.vulkan.mk
@@ -248,8 +248,10 @@ include $(BUILD_STATIC_LIBRARY)
 
 include $(CLEAR_VARS)
 
-LOCAL_MODULE := libvulkan_intel
+LOCAL_MODULE := vulkan.$(TARGET_BOARD_PLATFORM)
 LOCAL_MODULE_CLASS := SHARED_LIBRARIES
+LOCAL_PROPRIETARY_MODULE := true
+LOCAL_MODULE_RELATIVE_PATH := hw
 
 LOCAL_CFLAGS := -DLOG_TAG=\"INTEL-MESA\"
 
-- 
2.13.0



Re: [Mesa-dev] [PATCH v4 1/1] clover: Wait for requested operation if blocking flag is set

2017-09-14 Thread Francisco Jerez
Jan Vesely  writes:

> On Mon, 2017-09-04 at 13:23 -0700, Francisco Jerez wrote:
>> Jan Vesely  writes:
>> 
>> > v2: wait in map_buffer and map_image as well
>> > v3: use event::wait instead of wait (skips fence wait for hard_event)
>> > v4: use wait_signalled()
>> > 
>> > Signed-off-by: Jan Vesely 
>> > ---
>> > Hi Francisco,
>> > 
>> > once again sorry for the delay, and thanks for your patience.
>> > This patch applies on top of the two you attached during our email
>> > discussion.
>> > From what I can tell, the functionality is identical to v3 after your
>> > two patches are applied ("event:wait()" calls "wait_signalled()"), but I
>> > suppose calling a non-virtual function is preferable. If not, feel free
>> > to use v3.
>> > 
>> 
>> Yeah, I find v4 more readable than calling the base class'
>> implementation of wait().  Patch is:
>> 
>> Reviewed-by: Francisco Jerez 
>
> thanks. will you include with the other 2 patches,  or should I push it
> separately after those 2 are in?
>

I don't have review tags for the other two, but assuming they get your
R-b feel free to push all the three patches yourself.

> regards,
> Jan
>
>> 
>> Thanks.
>> 
>> > thanks,
>> > Jan
>> > 
>> >  src/gallium/state_trackers/clover/api/transfer.cpp | 30 
>> > --
>> >  1 file changed, 28 insertions(+), 2 deletions(-)
>> > 
>> > diff --git a/src/gallium/state_trackers/clover/api/transfer.cpp 
>> > b/src/gallium/state_trackers/clover/api/transfer.cpp
>> > index f7046253be..34559042ae 100644
>> > --- a/src/gallium/state_trackers/clover/api/transfer.cpp
>> > +++ b/src/gallium/state_trackers/clover/api/transfer.cpp
>> > @@ -295,6 +295,9 @@ clEnqueueReadBuffer(cl_command_queue d_q, cl_mem 
>> > d_mem, cl_bool blocking,
>> > &mem, obj_origin, obj_pitch,
>> > region));
>> >  
>> > +   if (blocking)
>> > +   hev().wait_signalled();
>> > +
>> > ret_object(rd_ev, hev);
>> > return CL_SUCCESS;
>> >  
>> > @@ -325,6 +328,9 @@ clEnqueueWriteBuffer(cl_command_queue d_q, cl_mem 
>> > d_mem, cl_bool blocking,
>> > ptr, {}, obj_pitch,
>> > region));
>> >  
>> > +   if (blocking)
>> > +   hev().wait_signalled();
>> > +
>> > ret_object(rd_ev, hev);
>> > return CL_SUCCESS;
>> >  
>> > @@ -362,6 +368,9 @@ clEnqueueReadBufferRect(cl_command_queue d_q, cl_mem 
>> > d_mem, cl_bool blocking,
>> > &mem, obj_origin, obj_pitch,
>> > region));
>> >  
>> > +   if (blocking)
>> > +   hev().wait_signalled();
>> > +
>> > ret_object(rd_ev, hev);
>> > return CL_SUCCESS;
>> >  
>> > @@ -399,6 +408,9 @@ clEnqueueWriteBufferRect(cl_command_queue d_q, cl_mem 
>> > d_mem, cl_bool blocking,
>> > ptr, host_origin, host_pitch,
>> > region));
>> >  
>> > +   if (blocking)
>> > +   hev().wait_signalled();
>> > +
>> > ret_object(rd_ev, hev);
>> > return CL_SUCCESS;
>> >  
>> > @@ -504,6 +516,9 @@ clEnqueueReadImage(cl_command_queue d_q, cl_mem d_mem, 
>> > cl_bool blocking,
>> > &img, src_origin, src_pitch,
>> > region));
>> >  
>> > +   if (blocking)
>> > +   hev().wait_signalled();
>> > +
>> > ret_object(rd_ev, hev);
>> > return CL_SUCCESS;
>> >  
>> > @@ -538,6 +553,9 @@ clEnqueueWriteImage(cl_command_queue d_q, cl_mem 
>> > d_mem, cl_bool blocking,
>> > ptr, {}, src_pitch,
>> > region));
>> >  
>> > +   if (blocking)
>> > +   hev().wait_signalled();
>> > +
>> > ret_object(rd_ev, hev);
>> > return CL_SUCCESS;
>> >  
>> > @@ -667,7 +685,11 @@ clEnqueueMapBuffer(cl_command_queue d_q, cl_mem 
>> > d_mem, cl_bool blocking,
>> >  
>> > void *map = mem.resource(q).add_map(q, flags, blocking, obj_origin, 
>> > region);
>> >  
>> > -   ret_object(rd_ev, create(q, CL_COMMAND_MAP_BUFFER, deps));
>> > +   auto hev = create(q, CL_COMMAND_MAP_BUFFER, deps);
>> > +   if (blocking)
>> > +   hev().wait_signalled();
>> > +
>> > +   ret_object(rd_ev, hev);
>> > ret_error(r_errcode, CL_SUCCESS);
>> > return map;
>> >  
>> > @@ -695,7 +717,11 @@ clEnqueueMapImage(cl_command_queue d_q, cl_mem d_mem, 
>> > cl_bool blocking,
>> >  
>> > void *map = img.resource(q).add_map(q, flags, blocking, origin, 
>> > region);
>> >  
>> > -   ret_object(rd_ev, create(q, CL_COMMAND_MAP_IMAGE, deps));
>> > +   auto hev = create(q, CL_COMMAND_MAP_IMAGE, deps);
>> > +   if (blocking)
>> > +   hev().wait_signalled();
>> > +
>> > +   ret_object(rd_ev, hev);
>> > ret_error(r_errcode, CL_SUCCESS);
>> > return map;
>> >  
>> > -- 
>> > 2.13.5


signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v4 1/1] clover: Wait for requested operation if blocking flag is set

2017-09-14 Thread Jan Vesely
On Mon, 2017-09-04 at 13:23 -0700, Francisco Jerez wrote:
> Jan Vesely  writes:
> 
> > v2: wait in map_buffer and map_image as well
> > v3: use event::wait instead of wait (skips fence wait for hard_event)
> > v4: use wait_signalled()
> > 
> > Signed-off-by: Jan Vesely 
> > ---
> > Hi Francisco,
> > 
> > once again sorry for the delay, and thanks for your patience.
> > This patch applies on top of the two you attached during our email
> > discussion.
> > From what I can tell, the functionality is identical to v3 after your
> > two patches are applied ("event::wait()" calls "wait_signalled()"), but I
> > suppose calling a non-virtual function is preferable. If not, feel free
> > to use v3.
> > 
> 
> Yeah, I find v4 more readable than calling the base class'
> implementation of wait().  Patch is:
> 
> Reviewed-by: Francisco Jerez 

thanks. will you include it with the other 2 patches, or should I push it
separately after those 2 are in?

regards,
Jan

> 
> Thanks.
> 
> > thanks,
> > Jan
> > 
> >  src/gallium/state_trackers/clover/api/transfer.cpp | 30 ++++++++++++++++++++++++++++--
> >  1 file changed, 28 insertions(+), 2 deletions(-)
> > 
> > diff --git a/src/gallium/state_trackers/clover/api/transfer.cpp 
> > b/src/gallium/state_trackers/clover/api/transfer.cpp
> > index f7046253be..34559042ae 100644
> > --- a/src/gallium/state_trackers/clover/api/transfer.cpp
> > +++ b/src/gallium/state_trackers/clover/api/transfer.cpp
> > @@ -295,6 +295,9 @@ clEnqueueReadBuffer(cl_command_queue d_q, cl_mem d_mem, 
> > cl_bool blocking,
> > &mem, obj_origin, obj_pitch,
> > region));
> >  
> > +   if (blocking)
> > +   hev().wait_signalled();
> > +
> > ret_object(rd_ev, hev);
> > return CL_SUCCESS;
> >  
> > @@ -325,6 +328,9 @@ clEnqueueWriteBuffer(cl_command_queue d_q, cl_mem 
> > d_mem, cl_bool blocking,
> > ptr, {}, obj_pitch,
> > region));
> >  
> > +   if (blocking)
> > +   hev().wait_signalled();
> > +
> > ret_object(rd_ev, hev);
> > return CL_SUCCESS;
> >  
> > @@ -362,6 +368,9 @@ clEnqueueReadBufferRect(cl_command_queue d_q, cl_mem 
> > d_mem, cl_bool blocking,
> > &mem, obj_origin, obj_pitch,
> > region));
> >  
> > +   if (blocking)
> > +   hev().wait_signalled();
> > +
> > ret_object(rd_ev, hev);
> > return CL_SUCCESS;
> >  
> > @@ -399,6 +408,9 @@ clEnqueueWriteBufferRect(cl_command_queue d_q, cl_mem 
> > d_mem, cl_bool blocking,
> > ptr, host_origin, host_pitch,
> > region));
> >  
> > +   if (blocking)
> > +   hev().wait_signalled();
> > +
> > ret_object(rd_ev, hev);
> > return CL_SUCCESS;
> >  
> > @@ -504,6 +516,9 @@ clEnqueueReadImage(cl_command_queue d_q, cl_mem d_mem, 
> > cl_bool blocking,
> > &img, src_origin, src_pitch,
> > region));
> >  
> > +   if (blocking)
> > +   hev().wait_signalled();
> > +
> > ret_object(rd_ev, hev);
> > return CL_SUCCESS;
> >  
> > @@ -538,6 +553,9 @@ clEnqueueWriteImage(cl_command_queue d_q, cl_mem d_mem, 
> > cl_bool blocking,
> > ptr, {}, src_pitch,
> > region));
> >  
> > +   if (blocking)
> > +   hev().wait_signalled();
> > +
> > ret_object(rd_ev, hev);
> > return CL_SUCCESS;
> >  
> > @@ -667,7 +685,11 @@ clEnqueueMapBuffer(cl_command_queue d_q, cl_mem d_mem, 
> > cl_bool blocking,
> >  
> > void *map = mem.resource(q).add_map(q, flags, blocking, obj_origin, 
> > region);
> >  
> > -   ret_object(rd_ev, create(q, CL_COMMAND_MAP_BUFFER, deps));
> > +   auto hev = create(q, CL_COMMAND_MAP_BUFFER, deps);
> > +   if (blocking)
> > +   hev().wait_signalled();
> > +
> > +   ret_object(rd_ev, hev);
> > ret_error(r_errcode, CL_SUCCESS);
> > return map;
> >  
> > @@ -695,7 +717,11 @@ clEnqueueMapImage(cl_command_queue d_q, cl_mem d_mem, 
> > cl_bool blocking,
> >  
> > void *map = img.resource(q).add_map(q, flags, blocking, origin, region);
> >  
> > -   ret_object(rd_ev, create(q, CL_COMMAND_MAP_IMAGE, deps));
> > +   auto hev = create(q, CL_COMMAND_MAP_IMAGE, deps);
> > +   if (blocking)
> > +   hev().wait_signalled();
> > +
> > +   ret_object(rd_ev, hev);
> > ret_error(r_errcode, CL_SUCCESS);
> > return map;
> >  
> > -- 
> > 2.13.5




[Mesa-dev] Vulkan extensions

2017-09-14 Thread Romain Failliot
Hi!

I'm working on exposing the Vulkan information recently added in
features.txt in mesamatrix.net, but there is no extension list under
"Vulkan 1.0 - all DONE: anv, radv"

There are a couple of command lines in the commit message, though:
https://cgit.freedesktop.org/mesa/mesa/commit/?id=fe3d2559d941f8f69dbdb369221af69a9974d017

I could generate the list locally, but if a new driver comes in (nvidia for
instance), or if new extensions are added to Vulkan 1.0.xx, this will start
to be hard to maintain.

Is it possible to do the same as for OpenGL and list all the Vulkan
extensions directly in features.txt?

Thanks,
Romain


Re: [Mesa-dev] [PATCH 1/2] anv: android build system changes

2017-09-14 Thread Chad Versace
On Thu 14 Sep 2017, Tapani Pälli wrote:
> Following changes are made to support VK_ANDROID_native_buffer:
> 
>- bring in vk_android_native_buffer.xml
>- rename target as vulkan.$(TARGET_BOARD_PLATFORM)
>- use LOCAL_PROPRIETARY_MODULE to install under vendor path
>- link with libsync and liblog
> 
> Tested by running different Sascha Willems demos on Android-IA.
> 
> Signed-off-by: Tapani Pälli 
> ---
>  src/intel/Android.vulkan.mk| 14 ++
>  src/intel/vulkan/anv_android.c |  2 +-
>  2 files changed, 11 insertions(+), 5 deletions(-)

Tapani, I don't want my patches to break the Android.mk build in the
middle of the series. So I've begun cherry-picking individual hunks from
your patch into my patch series. You're CC'd on the revised patches.


[Mesa-dev] [PATCH 20/20 v3] anv: Implement VK_ANDROID_native_buffer (v3)

2017-09-14 Thread Chad Versace
This implementation is correct (afaict), but takes two shortcuts
regarding the import/export of Android sync fds.

  Shortcut 1. When Android calls vkAcquireImageANDROID to import a sync
  fd into a VkSemaphore or VkFence, the driver instead simply blocks on
  the sync fd, then puts the VkSemaphore or VkFence into the signalled
  state. Thanks to implicit sync, this produces correct behavior (with
  extra latency overhead, perhaps) despite its ugliness.

  Shortcut 2. When Android calls vkQueueSignalReleaseImageANDROID to export
  a collection of wait semaphores as a sync fd, the driver instead
  submits the semaphores to the queue, then returns sync fd -1, which
  informs the caller that no additional synchronization is needed.
  Again, thanks to implicit sync, this produces correct behavior (with
  extra batch submission overhead) despite its ugliness.

I chose to take the shortcuts instead of properly importing/exporting
the sync fds for two reasons:

  Reason 1. I've already tested this patch with dEQP and with demo
  apps. It works. I wanted to get the tested patches into the tree now,
  and polish the implementation afterwards.

  Reason 2. I want to run this on a 3.18 kernel (gasp!). In 3.18, i915
  supports neither Android's sync_fence, nor upstream's sync_file, nor
  drm_syncobj. Again, I tested these patches on Android with a 3.18
  kernel and they work.

I plan to quickly follow up with patches that remove the shortcuts and
properly import/export the sync fds.
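
For readers unfamiliar with Shortcut 1, here is a minimal sketch of what
"block on the sync fd" can look like. It is not code from this patch. The
helper names wait_on_sync_fd() and import_fence_by_blocking() are
hypothetical, and the sketch assumes that a sync fd becomes readable under
poll() once it has signalled, which is the same approach libsync's
sync_wait() takes. An fd of -1 is treated as "already signalled".

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: wait until `fd` becomes readable (i.e. the fence
 * has signalled) or `timeout_ms` expires. A negative timeout waits
 * forever. Returns 0 on signal, -1 on timeout or error. */
static int
wait_on_sync_fd(int fd, int timeout_ms)
{
   struct pollfd pfd = { .fd = fd, .events = POLLIN };
   int ret;

   assert(fd >= 0);

   /* Retry if a signal handler interrupted the wait. */
   do {
      ret = poll(&pfd, 1, timeout_ms);
   } while (ret == -1 && (errno == EINTR || errno == EAGAIN));

   if (ret == 0) {
      errno = ETIME;   /* timed out before the fence signalled */
      return -1;
   }
   if (ret < 0)
      return -1;
   if (pfd.revents & (POLLERR | POLLNVAL)) {
      errno = EINVAL;
      return -1;
   }
   return 0;
}

/* Hypothetical helper illustrating the shortcut: rather than importing
 * the fd into a semaphore or fence, wait for it on the CPU and then
 * close it, since the driver owns the fd after the acquire call. */
static int
import_fence_by_blocking(int native_fence_fd)
{
   int ret;

   if (native_fence_fd < 0)
      return 0;   /* -1 means there is nothing to wait for */

   ret = wait_on_sync_fd(native_fence_fd, -1);
   close(native_fence_fd);
   return ret;
}
```

After the wait returns, the caller can mark the VkSemaphore or VkFence as
signalled without any further synchronization, at the cost of stalling the
calling thread.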

Non-Testing
===========
I did not test at all using the Android.mk buildsystem. I probably
broke it. Please test and review that.

Testing
=======
I tested with 64-bit ARC++ on a Skylake Chromebook and a 3.18 kernel.
The following pass:

  a little spinning cube demo APK
  dEQP-VK.info.*
  dEQP-VK.api.smoke.*
  dEQP-VK.api.info.instance.*
  dEQP-VK.api.info.device.*
  dEQP-VK.api.wsi.android.*

v2:
  - Reject VkNativeBufferANDROID if the dma-buf's size is too small for
the VkImage.
  - Stop abusing VkNativeBufferANDROID by passing it to vkAllocateMemory
during vkCreateImage. Instead, directly import its dma-buf during
vkCreateImage with anv_bo_cache_import(). [for jekstrand]
  - Rebase onto Tapani's VK_EXT_debug_report changes.
  - Drop `CPPFLAGS += $(top_srcdir)/include/android`. The dir does not
exist.

v3:
  - Delete duplicate #include "anv_private.h". [per Tapani]
  - Try to fix the Android-IA build in Android.vulkan.mk by following
Tapani's example in

.
But I truncated the added include path from
"frameworks/native/vulkan/include/hardware" to
"frameworks/native/vulkan/include", and inserted it *after*
$(MESA_TOP)/include/vulkan, so that #include 
hopefully works in both the Autotools and Android.mk build.
---
 src/intel/Android.vulkan.mk |   7 +-
 src/intel/Makefile.sources  |   3 +
 src/intel/Makefile.vulkan.am|   2 +
 src/intel/vulkan/anv_android.c  | 242 
 src/intel/vulkan/anv_device.c   |  12 +-
 src/intel/vulkan/anv_entrypoints_gen.py |  10 +-
 src/intel/vulkan/anv_extensions.py  |   1 +
 src/intel/vulkan/anv_image.c| 141 +--
 src/intel/vulkan/anv_private.h  |   1 +
 9 files changed, 407 insertions(+), 12 deletions(-)
 create mode 100644 src/intel/vulkan/anv_android.c

diff --git a/src/intel/Android.vulkan.mk b/src/intel/Android.vulkan.mk
index e20b32b87c..dcd6d8b68a 100644
--- a/src/intel/Android.vulkan.mk
+++ b/src/intel/Android.vulkan.mk
@@ -28,6 +28,7 @@ VK_ENTRYPOINTS_SCRIPT := $(MESA_PYTHON2) 
$(LOCAL_PATH)/vulkan/anv_entrypoints_ge
 VK_EXTENSIONS_SCRIPT := $(MESA_PYTHON2) $(LOCAL_PATH)/vulkan/anv_extensions.py
 
 VULKAN_COMMON_INCLUDES := \
+   $(MESA_TOP)/include/vulkan \
$(MESA_TOP)/src/mapi \
$(MESA_TOP)/src/gallium/auxiliary \
$(MESA_TOP)/src/gallium/include \
@@ -36,7 +37,8 @@ VULKAN_COMMON_INCLUDES := \
$(MESA_TOP)/src/vulkan/util \
$(MESA_TOP)/src/intel \
$(MESA_TOP)/include/drm-uapi \
-   $(MESA_TOP)/src/intel/vulkan
+   $(MESA_TOP)/src/intel/vulkan \
+   frameworks/native/vulkan/include
 
 # libmesa_anv_entrypoints with header and dummy.c
 #
@@ -254,7 +256,8 @@ LOCAL_CFLAGS := -DLOG_TAG=\"INTEL-MESA\"
 LOCAL_LDFLAGS += -Wl,--build-id=sha1
 
 LOCAL_SRC_FILES := \
-   $(VULKAN_GEM_FILES)
+   $(VULKAN_GEM_FILES) \
+   $(VULKAN_ANDROID_FILES)
 
 LOCAL_C_INCLUDES := \
$(VULKAN_COMMON_INCLUDES) \
diff --git a/src/intel/Makefile.sources b/src/intel/Makefile.sources
index 8ca50ff622..6f2dfa91e2 100644
--- a/src/intel/Makefile.sources
+++ b/src/intel/Makefile.sources
@@ -229,6 +229,9 @@ VULKAN_FILES := \
vulkan/anv_wsi.c \
vulkan/vk_format_info.h
 
+VULKAN_ANDROID_FILES := \
+   vulkan/anv_android.c
+
 VULKAN_WSI_WAYLAND_FILES := \
vulkan/anv_wsi_wayland.c
 
diff --git a

Re: [Mesa-dev] [PATCH 20/20] anv: Implement VK_ANDROID_native_buffer (v2)

2017-09-14 Thread Chad Versace
On Thu 14 Sep 2017, Tapani Pälli wrote:
> 
> 
> On 09/14/2017 08:51 AM, Tapani Pälli wrote:
> > 
> > 
> > On 09/14/2017 02:03 AM, Chad Versace wrote:
> > > From: Chad Versace 
> > > 
> > > This implementation is correct (afaict), but takes two shortcuts
> > > regarding the import/export of Android sync fds.
> > > 
> > >Shortcut 1. When Android calls vkAcquireImageANDROID to import a sync
> > >fd into a VkSemaphore or VkFence, the driver instead simply blocks on
> > >the sync fd, then puts the VkSemaphore or VkFence into the signalled
> > >state. Thanks to implicit sync, this produces correct behavior (with
> > >extra latency overhead, perhaps) despite its ugliness.
> > > 
> > >Shortcut 2. When Android calls vkQueueSignalReleaseImageANDROID
> > > to export
> > >a collection of wait semaphores as a sync fd, the driver instead
> > >submits the semaphores to the queue, then returns sync fd -1, which
> > >informs the caller that no additional synchronization is needed.
> > >Again, thanks to implicit sync, this produces correct behavior (with
> > >extra batch submission overhead) despite its ugliness.
> > > 
> > > I chose to take the shortcuts instead of properly importing/exporting
> > > the sync fds for two reasons:
> > > 
> > >Reason 1. I've already tested this patch with dEQP and with demos
> > >apps. It works. I wanted to get the tested patches into the tree now,
> > >and polish the implementation afterwards.
> > > 
> > >Reason 2. I want to run this on a 3.18 kernel (gasp!). In 3.18, i915
> > >supports neither Android's sync_fence, nor upstream's sync_file, nor
> > >drm_syncobj. Again, I tested these patches on Android with a 3.18
> > >kernel and they work.
> > > 
> > > I plan to quickly follow-up with patches that remove the shortcuts and
> > > properly import/export the sync fds.
> > > 
> > > Non-Testing
> > > ===
> > > I did not test at all using the Android.mk buildsystem. I probably
> > > broke it. Please test and review that.
> > > 
> > > Testing
> > > ===
> > > I tested with 64-bit ARC++ on a Skylake Chromebook and a 3.18 kernel.
> > > The following pass:
> > > 
> > >a little spinning cube demo APK
> > >dEQP-VK.info.*
> > >dEQP-VK.api.smoke.*
> > >dEQP-VK.api.info.instance.*
> > >dEQP-VK.api.info.device.*
> > >dEQP-VK.api.wsi.android.*
> > > 
> > > v2:
> > >- Reject VkNativeBufferANDROID if the dma-buf's size is too small for
> > >  the VkImage.
> > >- Stop abusing VkNativeBufferANDROID by passing it to vkAllocateMemory
> > >  during vkCreateImage. Instead, directly import its dma-buf during
> > >  vkCreateImage with anv_bo_cache_import(). [for jekstrand]
> > >- Rebase onto Tapani's VK_EXT_debug_report changes.
> > >- Drop `CPPFLAGS += $(top_srcdir)/include/android`. The dir does not
> > >  exist.
> > > ---
> > >   src/intel/Makefile.sources  |   3 +
> > >   src/intel/Makefile.vulkan.am|   2 +
> > >   src/intel/vulkan/anv_android.c  | 245
> > > 
> > >   src/intel/vulkan/anv_device.c   |  12 +-
> > >   src/intel/vulkan/anv_entrypoints_gen.py |  10 +-
> > >   src/intel/vulkan/anv_extensions.py  |   1 +
> > >   src/intel/vulkan/anv_image.c| 141 --
> > >   src/intel/vulkan/anv_private.h  |   1 +
> > >   8 files changed, 405 insertions(+), 10 deletions(-)
> > >   create mode 100644 src/intel/vulkan/anv_android.c





> > > +
> > > +#include 
> > > +#include 
> > > +#include 
> > 
> > This include breaks things for us because hwvulkan.h includes vulkan.h
> > from android which has different content than Mesa's vulkan.h. If I make
> > this:
> > 
> > #include "hwvulkan.h"
> > 
> > and in Android.vulkan.mk include dir
> > "frameworks/native/vulkan/include/hardware", then this won't cause harm
> > since I believe that causes us to use mesa's vulkan.h. Harm looks like:
> > 
> > --- 8< ---
> > In file included from
> > vendor/intel/external/android_ia/mesa/src/intel/vulkan/anv_gem.c:32:
> > In file included from
> > vendor/intel/external/android_ia/mesa/src/intel/vulkan/anv_private.h:74:
> > out/target/product/androidia_64/gen/STATIC_LIBRARIES/libmesa_anv_entrypoints_intermediates/vulkan/anv_entrypoints.h:187:11:
> > error: unknown type name 'PFN_vkGetPhysicalDeviceFeatures2KHR'; did you
> > mean 'PFN_vkGetPhysicalDeviceFeatures'?
> >PFN_vkGetPhysicalDeviceFeatures2KHR
> > GetPhysicalDeviceFeatures2KHR;
> >^~~
> >PFN_vkGetPhysicalDeviceFeatures
> > ...
> > 
> > --- 8< ---
> > 
> > this same happens with bazillions of other functions not included in
> > android's old header but present in Mesa's vulkan.h.

Replacing #include  with #include "hwvulkan.h"
requires changing the Autotools build too.

Instead, I think a better fix is to ensure that
$(MESA_TOP)/include/vulkan appears before any Android fra

[Mesa-dev] [PATCH 9.5/20] anv/android: Link to libsync, liblog in Android.mk

2017-09-14 Thread Chad Versace
From: Tapani Pälli 

chadv: I made this patch by extracting the hunk from Tapani's patch in
https://lists.freedesktop.org/archives/mesa-dev/2017-September/169602.html.

Signed-off-by: Chad Versace 
---
 src/intel/Android.vulkan.mk | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel/Android.vulkan.mk b/src/intel/Android.vulkan.mk
index a15d916942..ee98e30c35 100644
--- a/src/intel/Android.vulkan.mk
+++ b/src/intel/Android.vulkan.mk
@@ -268,7 +268,7 @@ LOCAL_WHOLE_STATIC_LIBRARIES := \
libmesa_intel_compiler \
libmesa_anv_entrypoints
 
-LOCAL_SHARED_LIBRARIES := libdrm libz
+LOCAL_SHARED_LIBRARIES := libdrm libz libsync liblog
 
 include $(MESA_COMMON_MK)
 include $(BUILD_SHARED_LIBRARY)
-- 
2.13.0



Re: [Mesa-dev] [PATCH] anv: Feed vk_android_native_buffer.xml to generators (v2)

2017-09-14 Thread Chad Versace
OOPS. Ignore this. This message has the wrong In-Reply-To.

On Thu 14 Sep 2017, Chad Versace wrote:
> Feed the XML to anv_extensions.py and anv_entrypoints_gen.py.
> Do it on all platforms, not just Android. Tested on Android and Fedora.
> 
> We always parse the Android XML, regardless of target platform, to
> help reduce the chance that people working on non-Android break the
> Android build.
> 
> v2:
>   - Squash in Tapani's changes to Android.*.mk.
> 
> Reviewed-by: Tapani Pälli  (v1)
> ---
>  src/intel/Android.vulkan.mk|  5 -
>  src/intel/Makefile.vulkan.am   | 17 +
>  src/intel/vulkan/anv_extensions.py | 12 +++-
>  3 files changed, 28 insertions(+), 6 deletions(-)


[Mesa-dev] [PATCH] anv: Feed vk_android_native_buffer.xml to generators (v2)

2017-09-14 Thread Chad Versace
Feed the XML to anv_extensions.py and anv_entrypoints_gen.py.
Do it on all platforms, not just Android. Tested on Android and Fedora.

We always parse the Android XML, regardless of target platform, to
help reduce the chance that people working on non-Android break the
Android build.

v2:
  - Squash in Tapani's changes to Android.*.mk.

Reviewed-by: Tapani Pälli  (v1)
---
 src/intel/Android.vulkan.mk|  5 -
 src/intel/Makefile.vulkan.am   | 17 +
 src/intel/vulkan/anv_extensions.py | 12 +++-
 3 files changed, 28 insertions(+), 6 deletions(-)

diff --git a/src/intel/Android.vulkan.mk b/src/intel/Android.vulkan.mk
index 17ae4b071b..a15d916942 100644
--- a/src/intel/Android.vulkan.mk
+++ b/src/intel/Android.vulkan.mk
@@ -65,7 +65,8 @@ $(intermediates)/vulkan/dummy.c:
 $(intermediates)/vulkan/anv_entrypoints.h: $(intermediates)/vulkan/dummy.c
$(VK_ENTRYPOINTS_SCRIPT) \
--outdir $(dir $@) \
-   --xml $(MESA_TOP)/src/vulkan/registry/vk.xml
+   --xml $(MESA_TOP)/src/vulkan/registry/vk.xml \
+   --xml 
$(MESA_TOP)/src/vulkan/registry/vk_android_native_buffer.xml
 
 LOCAL_EXPORT_C_INCLUDE_DIRS := \
 $(intermediates)
@@ -214,12 +215,14 @@ $(intermediates)/vulkan/anv_entrypoints.c:
@mkdir -p $(dir $@)
$(VK_ENTRYPOINTS_SCRIPT) \
--xml $(MESA_TOP)/src/vulkan/registry/vk.xml \
+   --xml 
$(MESA_TOP)/src/vulkan/registry/vk_android_native_buffer.xml \
--outdir $(dir $@)
 
 $(intermediates)/vulkan/anv_extensions.c:
@mkdir -p $(dir $@)
$(VK_EXTENSIONS_SCRIPT) \
--xml $(MESA_TOP)/src/vulkan/registry/vk.xml \
+   --xml 
$(MESA_TOP)/src/vulkan/registry/vk_android_native_buffer.xml \
--out $@
 
 LOCAL_SHARED_LIBRARIES := libdrm
diff --git a/src/intel/Makefile.vulkan.am b/src/intel/Makefile.vulkan.am
index fa9b6ba724..8a19f96096 100644
--- a/src/intel/Makefile.vulkan.am
+++ b/src/intel/Makefile.vulkan.am
@@ -23,18 +23,27 @@
 # rules must be outside of any AM_CONDITIONALs. Otherwise they will be 
commented
 # out and we'll fail at `make dist'
 vulkan_api_xml = $(top_srcdir)/src/vulkan/registry/vk.xml
+vk_android_native_buffer_xml = 
$(top_srcdir)/src/vulkan/registry/vk_android_native_buffer.xml
 
 vulkan/anv_entrypoints.c: vulkan/anv_entrypoints_gen.py \
- vulkan/anv_extensions.py $(vulkan_api_xml)
+ vulkan/anv_extensions.py \
+ $(vulkan_api_xml) \
+ $(vk_android_native_buffer_xml)
$(MKDIR_GEN)
$(AM_V_GEN)$(PYTHON2) $(srcdir)/vulkan/anv_entrypoints_gen.py \
-   --xml $(vulkan_api_xml) --outdir $(builddir)/vulkan
+   --xml $(vulkan_api_xml) \
+   --xml $(vk_android_native_buffer_xml) \
+   --outdir $(builddir)/vulkan
 vulkan/anv_entrypoints.h: vulkan/anv_entrypoints.c
 
-vulkan/anv_extensions.c: vulkan/anv_extensions.py $(vulkan_api_xml)
+vulkan/anv_extensions.c: vulkan/anv_extensions.py \
+$(vulkan_api_xml) \
+$(vk_android_native_buffer_xml)
$(MKDIR_GEN)
$(AM_V_GEN)$(PYTHON2) $(srcdir)/vulkan/anv_extensions.py \
-   --xml $(vulkan_api_xml) --out $@
+   --xml $(vulkan_api_xml) \
+   --xml $(vk_android_native_buffer_xml) \
+   --out $@
 
 BUILT_SOURCES += $(VULKAN_GENERATED_FILES)
 CLEANFILES += \
diff --git a/src/intel/vulkan/anv_extensions.py 
b/src/intel/vulkan/anv_extensions.py
index d995f9f177..316e1f04e7 100644
--- a/src/intel/vulkan/anv_extensions.py
+++ b/src/intel/vulkan/anv_extensions.py
@@ -134,8 +134,18 @@ def _init_exts_from_xml(xml):
 ext_name = ext_elem.attrib['name']
 if ext_name not in ext_name_map:
 continue
-ext = ext_name_map[ext_name]
 
+# Workaround for VK_ANDROID_native_buffer. Its <extension> element in
+# vk.xml lists it as supported="disabled" and provides only a stub
+# definition.  Its <extension> element in Mesa's custom
+# vk_android_native_buffer.xml, though, lists it as
+# supported='android-vendor' and fully defines the extension. We want
+# to skip the <extension> element in vk.xml.
+if ext_elem.attrib['supported'] == 'disabled':
+assert ext_name == 'VK_ANDROID_native_buffer'
+continue
+
+ext = ext_name_map[ext_name]
 ext.type = ext_elem.attrib['type']
 
 _TEMPLATE = Template(COPYRIGHT + """
-- 
2.13.0



[Mesa-dev] [PATCH] util: Add a string buffer implementation

2017-09-14 Thread Thomas Helland
Based on Vladislav Egorov's work on the preprocessor, but split
out into a util component that should be universal. Setup, teardown,
memory handling and general layout is modeled around the hash_table
and the set, to make it familiar for everyone.

A notable change is that this implementation is always null terminated.
The rationale is that it will be less error-prone, as one might
access the buffer directly, thereby reading a non-terminated string.
Also, vsnprintf and friends print the null-terminator.
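
As a rough illustration of the "always null terminated" invariant combined
with capacity doubling, here is a minimal sketch. It is not the code in
this patch: it uses plain malloc()/realloc() instead of ralloc, and the
names struct strbuf, strbuf_init() and strbuf_append_len() are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the string buffer described above. */
struct strbuf {
   char *buf;
   uint32_t length;     /* bytes in use, excluding the terminator */
   uint32_t capacity;   /* bytes allocated, including the terminator */
};

static bool
strbuf_init(struct strbuf *s, uint32_t initial_capacity)
{
   /* If no initial capacity is given, fall back to a small default. */
   s->capacity = initial_capacity ? initial_capacity : 32;
   s->buf = malloc(s->capacity);
   if (s->buf == NULL)
      return false;

   s->length = 0;
   s->buf[0] = '\0';   /* an empty buffer is still a valid C string */
   return true;
}

static bool
strbuf_append_len(struct strbuf *s, const char *c, uint32_t len)
{
   assert(s != NULL && s->buf != NULL);

   /* Reject appends whose total size would overflow uint32_t. */
   if (len > UINT32_MAX - s->length - 1)
      return false;
   uint32_t needed = s->length + len + 1;

   if (needed > s->capacity) {
      /* Too small: double until the new string fits. */
      uint32_t new_capacity = s->capacity;
      while (new_capacity < needed) {
         if (new_capacity > UINT32_MAX / 2) {
            new_capacity = needed;
            break;
         }
         new_capacity *= 2;
      }

      /* Keep the old pointer valid if the reallocation fails. */
      char *new_buf = realloc(s->buf, new_capacity);
      if (new_buf == NULL)
         return false;
      s->buf = new_buf;
      s->capacity = new_capacity;
   }

   memcpy(s->buf + s->length, c, len);
   s->length += len;
   s->buf[s->length] = '\0';   /* restore the invariant after every append */
   return true;
}
```

Because the terminator is rewritten after every append, s->buf can be
handed to printf-style functions at any point without an extra copy.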

Reviewed-by: Nicolai Hähnle 

V2: Address review feedback from Timothy and Grazvydas
   - Fix MINGW preprocessor check
   - Changed len from uint to int
   - Make string argument const in append function
   - Move to header and inline append function
   - Add crimp_to_fit function for resizing buffer

V3: Move include of ralloc to string_buffer.h

V4: Use u_string.h for a cross-platform working vsnprintf

V5: Remember to cast to char * in crimp function

V6: Address review feedback from Nicolai
   - Handle !str->buf in buffer_create
   - Ensure va_end is always called in buffer_append_all
   - Add overflow check in buffer_append_len
   - Do not expose buffer_space_left, just remove it
   - Clarify why a loop is used in vprintf, change to for-loop
   - Add a va_copy to buffer_vprintf to fix failure to append arguments
 when having to resize the buffer for vsnprintf.

V7: Address more review feedback from Nicolai
   - Add missing va_end corresponding to va_copy
   - Error check failure to allocate in crimp_to_fit
---
 src/util/Makefile.sources |   2 +
 src/util/string_buffer.c  | 148 ++
 src/util/string_buffer.h  | 104 
 3 files changed, 254 insertions(+)
 create mode 100644 src/util/string_buffer.c
 create mode 100644 src/util/string_buffer.h

diff --git a/src/util/Makefile.sources b/src/util/Makefile.sources
index 4ed4e39f03..c7f6516a99 100644
--- a/src/util/Makefile.sources
+++ b/src/util/Makefile.sources
@@ -37,6 +37,8 @@ MESA_UTIL_FILES := \
simple_list.h \
slab.c \
slab.h \
+   string_buffer.c \
+   string_buffer.h \
strndup.h \
strtod.c \
strtod.h \
diff --git a/src/util/string_buffer.c b/src/util/string_buffer.c
new file mode 100644
index 00..c33173bfa0
--- /dev/null
+++ b/src/util/string_buffer.c
@@ -0,0 +1,148 @@
+/*
+ * Copyright © 2017 Thomas Helland
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ */
+
+#include "string_buffer.h"
+
+static bool
+ensure_capacity(struct _mesa_string_buffer *str, uint32_t needed_capacity)
+{
+   if (needed_capacity <= str->capacity)
+  return true;
+
+   /* Too small, double until we can fit the new string */
+   uint32_t new_capacity = str->capacity * 2;
+   while (needed_capacity > new_capacity)
+  new_capacity *= 2;
+
+   str->buf = reralloc_array_size(str, str->buf, sizeof(char), new_capacity);
+   if (str->buf == NULL)
+  return false;
+
+   str->capacity = new_capacity;
+   return true;
+}
+
+struct _mesa_string_buffer *
+_mesa_string_buffer_create(void *mem_ctx, uint32_t initial_capacity)
+{
+   struct _mesa_string_buffer *str;
+   str = ralloc(mem_ctx, struct _mesa_string_buffer);
+
+   if (str == NULL)
+  return NULL;
+
+   /* If no initial capacity is set then set it to something */
+   str->capacity = initial_capacity ? initial_capacity : 32;
+   str->buf = ralloc_array(str, char, str->capacity);
+
+   if (!str->buf) {
+  ralloc_free(str);
+  return NULL;
+   }
+
+   str->length = 0;
+   str->buf[str->length] = '\0';
+   return str;
+}
+
+bool
+_mesa_string_buffer_append_all(struct _mesa_string_buffer *str,
+   uint32_t num_args, ...)
+{
+   int i;
+   char* s;
+   va_list args;
+   va_start(args, num_args);
+   for (i = 0; i < num_args; i++) {
+  s = va_arg(args, char*);
+  if (!_mesa_s
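
For readers following the growth policy in ensure_capacity() above, the sketch below shows the same doubling strategy in plain C. It is an illustration only, not the Mesa implementation: it uses malloc()/realloc() instead of ralloc, and all demo_buf names are hypothetical. Unlike the patch, it also guards the doubling against 32-bit overflow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct _mesa_string_buffer. */
struct demo_buf {
   char *buf;
   uint32_t length;    /* bytes used, excluding the terminator */
   uint32_t capacity;  /* bytes allocated */
};

static bool
demo_buf_init(struct demo_buf *b, uint32_t initial_capacity)
{
   /* Fall back to a small default when no capacity is requested. */
   b->capacity = initial_capacity ? initial_capacity : 32;
   b->buf = malloc(b->capacity);
   if (b->buf == NULL)
      return false;
   b->length = 0;
   b->buf[0] = '\0';
   return true;
}

static bool
demo_buf_reserve(struct demo_buf *b, uint32_t needed)
{
   if (needed <= b->capacity)
      return true;

   /* Double until the request fits; give up rather than overflow. */
   uint32_t new_capacity = b->capacity;
   while (new_capacity < needed) {
      if (new_capacity > UINT32_MAX / 2)
         return false;
      new_capacity *= 2;
   }

   char *p = realloc(b->buf, new_capacity);
   if (p == NULL)
      return false;   /* the old buffer is still valid */

   b->buf = p;
   b->capacity = new_capacity;
   return true;
}

static bool
demo_buf_append(struct demo_buf *b, const char *s)
{
   size_t len = strlen(s);
   if (len > UINT32_MAX - b->length - 1)
      return false;
   if (!demo_buf_reserve(b, b->length + (uint32_t)len + 1))
      return false;
   memcpy(b->buf + b->length, s, len + 1);   /* copies the terminator too */
   b->length += (uint32_t)len;
   return true;
}
```

Doubling keeps the number of reallocations logarithmic in the final length, which is presumably why the patch chose it over growing by a fixed step.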

[Mesa-dev] [PATCH] glcpp: Avoid unnecessary call to strlen

2017-09-14 Thread Thomas Helland
Length of the token was already calculated by flex and stored in yyleng,
no need to implicitly call strlen() via linear_strdup().
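
As an illustration of the idea (not the code from this patch): when the scanner already knows the token length, the copy can be a single allocation plus memcpy(). The helper name below is hypothetical, and it uses malloc() where the patch uses linear_alloc_child().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Copy a NUL-terminated token whose length the caller already knows,
 * so no strlen() pass over the text is needed.
 */
static char *
dup_with_known_length(const char *text, size_t len)
{
   char *copy = malloc(len + 1);
   if (copy == NULL)
      return NULL;
   memcpy(copy, text, len + 1);   /* +1 also copies the terminator */
   return copy;
}
```

In a flex action the call would look like dup_with_known_length(yytext, yyleng), since flex guarantees yytext is NUL-terminated at yyleng.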

Reviewed-by: Nicolai Hähnle 
Reviewed-by: Timothy Arceri 

V2: Also convert this pattern in glsl_lexer.ll

V3: Remove a misplaced comment

V4: Use a temporary char to avoid type change
Remove bogus +1 on length check of identifier
---
 src/compiler/glsl/glcpp/glcpp-lex.l |  9 -
 src/compiler/glsl/glsl_lexer.ll | 39 +
 2 files changed, 39 insertions(+), 9 deletions(-)

diff --git a/src/compiler/glsl/glcpp/glcpp-lex.l b/src/compiler/glsl/glcpp/glcpp-lex.l
index 381b97364a..9cfcc12022 100644
--- a/src/compiler/glsl/glcpp/glcpp-lex.l
+++ b/src/compiler/glsl/glcpp/glcpp-lex.l
@@ -101,7 +101,14 @@ void glcpp_set_column (int  column_no , yyscan_t yyscanner);
 #define RETURN_STRING_TOKEN(token) \
do {\
if (! parser->skipping) {   \
-   yylval->str = linear_strdup(yyextra->linalloc, yytext); \
+   /* We're not doing linear_strdup here, to avoid \
+* an implicit call on strlen() for the length  \
+* of the string, as this is already found by   \
+* flex and stored in yyleng */ \
+   void *mem_ctx = yyextra->linalloc;  \
+   yylval->str = linear_alloc_child(mem_ctx,   \
+yyleng + 1);   \
+   memcpy(yylval->str, yytext, yyleng + 1);\
RETURN_TOKEN_NEVER_SKIP (token);\
}   \
} while(0)
diff --git a/src/compiler/glsl/glsl_lexer.ll b/src/compiler/glsl/glsl_lexer.ll
index 7c41455d98..56519bf92d 100644
--- a/src/compiler/glsl/glsl_lexer.ll
+++ b/src/compiler/glsl/glsl_lexer.ll
@@ -81,8 +81,13 @@ static int classify_identifier(struct _mesa_glsl_parse_state *, const char *);
  "illegal use of reserved word `%s'", yytext); \
 return ERROR_TOK;  \
   } else { \
-void *mem_ctx = yyextra->linalloc; \
-yylval->identifier = linear_strdup(mem_ctx, yytext);   \
+/* We're not doing linear_strdup here, to avoid an implicit\
+ * call on strlen() for the length of the string, as this is   \
+ * already found by flex and stored in yyleng */   \
+void *mem_ctx = yyextra->linalloc; \
+ char *id = (char *) linear_alloc_child(mem_ctx, yyleng + 1);   \
+ memcpy(id, yytext, yyleng + 1);\
+ yylval->identifier = id;   \
 return classify_identifier(yyextra, yytext);   \
   } \
} while (0)
@@ -261,8 +266,14 @@ HASH   ^{SPC}#{SPC}
 [ \t\r]*   { }
 :  return COLON;
 [_a-zA-Z][_a-zA-Z0-9]* {
-  void *mem_ctx = yyextra->linalloc;
-  yylval->identifier = linear_strdup(mem_ctx, yytext);
+  /* We're not doing linear_strdup here, to avoid an implicit call
+   * on strlen() for the length of the string, as this is already
+   * found by flex and stored in yyleng
+   */
+void *mem_ctx = yyextra->linalloc;
+char *id = (char *) linear_alloc_child(mem_ctx, yyleng + 1);
+memcpy(id, yytext, yyleng + 1);
+yylval->identifier = id;
   return IDENTIFIER;
}
 [1-9][0-9]*{
@@ -449,8 +460,14 @@ layout {
   || yyextra->ARB_tessellation_shader_enable) {
  return LAYOUT_TOK;
   } else {
- void *mem_ctx = yyextra->linalloc;
- yylval->identifier = linear_strdup(mem_ctx, yytext);
+ /* We're not doing linear_strdup here, to avoid an implicit call
+  * on strlen() for the length of the string, as this is already
+  * found by flex and stored in yyleng
+  */
+  void *mem_ctx = yyextra->linalloc;
+  char *id = (char

Re: [Mesa-dev] [PATCH 2/2] radv: dump the device name into the hang report

2017-09-14 Thread Bas Nieuwenhuizen
This series is

Reviewed-by: Bas Nieuwenhuizen 

On Thu, Sep 14, 2017 at 11:25 AM, Samuel Pitoiset
 wrote:
> Similar to RadeonSI renderer string.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_debug.c | 30 ++
>  1 file changed, 30 insertions(+)
>
> diff --git a/src/amd/vulkan/radv_debug.c b/src/amd/vulkan/radv_debug.c
> index 662e29694f..f6f4dad65c 100644
> --- a/src/amd/vulkan/radv_debug.c
> +++ b/src/amd/vulkan/radv_debug.c
> @@ -27,6 +27,7 @@
>
>  #include 
>  #include 
> +#include 
>
>  #include "sid.h"
>  #include "gfx9d.h"
> @@ -605,6 +606,32 @@ radv_dump_enabled_options(struct radv_device *device, FILE *f)
> fprintf(f, "\n");
>  }
>
> +static void
> +radv_dump_device_name(struct radv_device *device, FILE *f)
> +{
> +   struct radeon_info *info = &device->physical_device->rad_info;
> +   char llvm_string[32] = {}, kernel_version[128] = {};
> +   struct utsname uname_data;
> +   const char *chip_name;
> +
> +   chip_name = device->ws->get_chip_name(device->ws);
> +
> +   if (uname(&uname_data) == 0)
> +   snprintf(kernel_version, sizeof(kernel_version),
> +" / %s", uname_data.release);
> +
> +   if (HAVE_LLVM > 0) {
> +   snprintf(llvm_string, sizeof(llvm_string),
> +", LLVM %i.%i.%i", (HAVE_LLVM >> 8) & 0xff,
> +HAVE_LLVM & 0xff, MESA_LLVM_VERSION_PATCH);
> +   }
> +
> +   fprintf(f, "Device name: %s (%s DRM %i.%i.%i%s%s)\n\n",
> +   chip_name, device->physical_device->name,
> +   info->drm_major, info->drm_minor, info->drm_patchlevel,
> +   kernel_version, llvm_string);
> +}
> +
>  static bool
>  radv_gpu_hang_occured(struct radv_queue *queue, enum ring_type ring)
>  {
> @@ -637,6 +664,9 @@ radv_check_gpu_hangs(struct radv_queue *queue, struct radeon_winsys_cs *cs)
> graphics_pipeline = radv_get_saved_graphics_pipeline(device);
> compute_pipeline = radv_get_saved_compute_pipeline(device);
>
> +   fprintf(stderr, "GPU hang report:\n\n");
> +   radv_dump_device_name(device, stderr);
> +
> radv_dump_enabled_options(device, stderr);
> radv_dump_dmesg(stderr);
>
> --
> 2.14.1
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
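
The quoted helper assembles its report line from small snprintf() pieces. A minimal standalone sketch of that pattern follows, assuming the LLVM version is encoded as (major << 8) | minor, as the shifts in the patch suggest. The function names are hypothetical and this is not the radv code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/utsname.h>

/* Write ", LLVM major.minor.patch" into out, or an empty string when no
 * LLVM version is encoded. Assumes size >= 1.
 */
static void
format_llvm_version(char *out, size_t size, int encoded, int patch)
{
   out[0] = '\0';
   if (encoded > 0)
      snprintf(out, size, ", LLVM %i.%i.%i",
               (encoded >> 8) & 0xff, encoded & 0xff, patch);
}

/* Write " / <kernel release>" into out, or an empty string if uname()
 * fails. Assumes size >= 1.
 */
static void
format_kernel_release(char *out, size_t size)
{
   struct utsname data;

   out[0] = '\0';
   if (uname(&data) == 0)
      snprintf(out, size, " / %s", data.release);
}
```

Because both pieces degrade to an empty string, the final format string can concatenate them unconditionally, which is what the "%s%s" at the end of the patch's fprintf() relies on.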


Re: [Mesa-dev] Mesa (master): drirc: enable glthread for more games (Civ5, CivBE, Dreamfall, Hitman, SR3)

2017-09-14 Thread Samuel Pitoiset



On 09/14/2017 09:57 PM, Marek Olšák wrote:
> On Thu, Sep 14, 2017 at 9:18 PM, Samuel Pitoiset
>  wrote:
>>
>> On 09/14/2017 09:03 PM, Marek Olšák wrote:
>>>
>>> Module: Mesa
>>> Branch: master
>>> Commit: 7ffd4d2a6670ccefd4d697954a1ac67b5839da7d
>>> URL:
>>> http://cgit.freedesktop.org/mesa/mesa/commit/?id=7ffd4d2a6670ccefd4d697954a1ac67b5839da7d
>>>
>>> Author: Christoph Berliner 
>>> Date:   Thu Sep 14 21:01:04 2017 +0200
>>>
>>> drirc: enable glthread for more games (Civ5, CivBE, Dreamfall, Hitman,
>>> SR3)
>>
>> I didn't see any differences last time I tried glthread with SR3.
>>
>> What about Hitman, do you see real performance improvements?
>
> The guy sent it to me as a patch. I dropped one game from it but kept the
> rest. He documented his findings here:
> https://www.gamingonlinux.com/wiki/Performance_impact_of_Mesa_glthread#Results

Ah right! The number of games tested is just awesome. :)

> Marek




Re: [Mesa-dev] Mesa (master): drirc: enable glthread for more games (Civ5, CivBE, Dreamfall, Hitman, SR3)

2017-09-14 Thread Marek Olšák
On Thu, Sep 14, 2017 at 9:18 PM, Samuel Pitoiset
 wrote:
>
>
> On 09/14/2017 09:03 PM, Marek Olšák wrote:
>>
>> Module: Mesa
>> Branch: master
>> Commit: 7ffd4d2a6670ccefd4d697954a1ac67b5839da7d
>> URL:
>> http://cgit.freedesktop.org/mesa/mesa/commit/?id=7ffd4d2a6670ccefd4d697954a1ac67b5839da7d
>>
>> Author: Christoph Berliner 
>> Date:   Thu Sep 14 21:01:04 2017 +0200
>>
>> drirc: enable glthread for more games (Civ5, CivBE, Dreamfall, Hitman,
>> SR3)
>
>
> I didn't see any differences last time I tried glthread with SR3.
>
> What about Hitman, do you see real performance improvements?

The guy sent it to me as a patch. I dropped one game from it but kept the
rest. He documented his findings here:
https://www.gamingonlinux.com/wiki/Performance_impact_of_Mesa_glthread#Results

Marek


Re: [Mesa-dev] [PATCH 3/3] i965: Disable stencil cache optimization combining two 4x2 blocks

2017-09-14 Thread Chad Versace
On Mon 11 Sep 2017, Topi Pohjolainen wrote:
> From the BDW PRM, Volume 15, Workarounds:
> 
> KMD Wa4x4STCOptimizationDisable HIZ/STC hang in hawx frames.
> 
> W/A: Disable 4x4 RCPFE STC optimization and therefore only send one
>  valid 4x4 to STC on 4x4 interface. This will require setting bit
>  6 of reg. 0x7004. Must be done at boot and all save/restore paths.
> 
> From the SKL PRM, Volume 16, Workarounds:
> 
> 0556 KMD Wa4x4STCOptimizationDisable HIZ/STC hang in hawx frames.
> 
> W/A: Disable 4 x4 RCPFE STC optimization and therefore only send
>  one valid 4x4 to STC on 4x4 interface.  This will require setting
>  bit 6 of reg. 0x7004. Must be done at boot and all save/restore
>  paths.
> 
> Signed-off-by: Topi Pohjolainen 
> ---
>  src/mesa/drivers/dri/i965/brw_defines.h  | 5 -
>  src/mesa/drivers/dri/i965/brw_state_upload.c | 1 +
>  2 files changed, 5 insertions(+), 1 deletion(-)

Acked-by: Chad Versace 



Re: [Mesa-dev] [PATCH] radv: fix a potential crash if attachments allocation failed

2017-09-14 Thread Bas Nieuwenhuizen
Reviewed-by: Bas Nieuwenhuizen 

On Thu, Sep 14, 2017 at 6:47 PM, Samuel Pitoiset
 wrote:
> Also, it's useless to set the error code twice. Though, we
> should probably skip the next commands when the command buffer
> is considered invalid.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_cmd_buffer.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
> index af9f8210bf..0b56087a09 100644
> --- a/src/amd/vulkan/radv_cmd_buffer.c
> +++ b/src/amd/vulkan/radv_cmd_buffer.c
> @@ -2713,9 +2713,10 @@ void radv_CmdBeginRenderPass(
> cmd_buffer->state.framebuffer = framebuffer;
> cmd_buffer->state.pass = pass;
> cmd_buffer->state.render_area = pRenderPassBegin->renderArea;
> +
> result = radv_cmd_state_setup_attachments(cmd_buffer, pass, pRenderPassBegin);
> if (result != VK_SUCCESS)
> -   cmd_buffer->record_result = result;
> +   return;
>
> radv_cmd_buffer_set_subpass(cmd_buffer, pass->subpasses, true);
> assert(cmd_buffer->cs->cdw <= cdw_max);
> --
> 2.14.1
>


Re: [Mesa-dev] [RFC] NIR serialization

2017-09-14 Thread Connor Abbott
On Tue, Sep 12, 2017 at 2:09 PM, Jason Ekstrand  wrote:
> On Tue, Sep 12, 2017 at 10:12 AM, Ian Romanick  wrote:
>>
>> On 09/11/2017 11:17 PM, Kenneth Graunke wrote:
>> > On Monday, September 11, 2017 9:23:05 PM PDT Ian Romanick wrote:
>> >> On 09/08/2017 01:59 AM, Kenneth Graunke wrote:
>> >>> On Thursday, September 7, 2017 4:26:04 PM PDT Jordan Justen wrote:
>> >>>> On 2017-09-06 14:12:41, Daniel Schürmann wrote:
>> >>>>> Hello together!
>> >>>>> Recently, we had a small discussion (off the list) about the NIR
>> >>>>> serialization, which was previously discussed in [RFC] ARB_gl_spirv
>> >>>>> and
>> >>>>> NIR backend for radeonsi.
>> >>>>>
>> >>>>> As this topic could be interesting to more people, I would like to
>> >>>>> share, what was talked about so far (You might want to read from
>> >>>>> bottom up).
>> >>>>>
>> >>>>> TL;DR:
>> >>>>> - NIR serialization is in demand for shader cache
>> >>>>> - could be done either directly (NIR binary form) or via SPIR-V
>> >>>>> - Ian et al. are working on GLSL IR -> SPIR-V transformation, which
>> >>>>> could be adapted for a NIR -> SPIR-V pass
>> >>>>> - in NIR representation, some type information is lost
>> >>>>> - thus, a serialization via SPIR-V could NOT be a glslang
>> >>>>> alternative
>> >>>>> (otoh, the GLSL IR->SPIR-V pass could), but only for spirv-opt (if
>> >>>>> the
>> >>>>> output is valid SPIR-V)
>> >>>>
>> >>>> Ian,
>> >>>>
>> >>>> Tim was suggesting that we might look at serializing nir for the i965
>> >>>> shader cache. Based on this email, it sounds like serialized nir
>> >>>> would
>> >>>> not be enough for the shader cache as some GLSL type info would be
>> >>>> lost. It sounds like GLSL IR => SPIR-V would be good enough. Is that
>> >>>> right?
>> >>>>
>> >>>> I don't think we have a strict requirement for the GLSL IR => SPIR-V
>> >>>> path for GL 4.6, right? So, this is more of a 'nice-to-have'?
>> >>>>
>> >>>> I'm not sure we'd want to make i965 shader cache depend on a
>> >>>> nice-to-have feature. (Unless we're pretty sure it'll be available
>> >>>> soon.)
>> >>>>
>> >>>> But, it would be nice to not have to fallback to compiling the GLSL
>> >>>> for i965 shader cache, so it would be worth waiting a little bit to
>> >>>> be
>> >>>> able to rely on a SPIR-V serialization of the GLSL IR.
>> >>>>
>> >>>> What do you suggest?
>> >>>>
>> >>>> -Jordan
>> >>>
>> >>> We shouldn't use SPIR-V for the shader cache.
>> >>>
>> >>> The compilation process for GLSL is: GLSL -> GLSL IR -> NIR -> i965
>> >>> IRs.
>> >>> Storing the content at one of those points, and later loading it and
>> >>> resuming the normal compilation process from that point...that's
>> >>> totally
>> >>> reasonable.
>> >>>
>> >>> Having a fallback for "some things in the cache but not all the
>> >>> variants
>> >>> we needed" suddenly take a different compilation pipeline, i.e. SPIR-V
>> >>> -> NIR -> ... seems risky.  It's a different compilation path that we
>> >>> don't normally use.  And one you'd only hit in limited circumstances.
>> >>> There's a lot of potential for really obscure bugs.
>> >>
>> >> Since we're going to expose exactly that path for GL_ARB_gl_spirv / OpenGL
>> >> 4.6, we'd better make sure it works always.  Right?
>> >
>> > In addition to the old pipeline:
>> >
>> > - GLSL from the app -> GLSL IR -> NIR -> i965 IR
>> >
>> > GL_ARB_gl_spirv and OpenGL 4.6 add a second pipeline:
>> >
>> > - SPIR-V from the app -> NIR -> i965 IR
>> >
>> > Both of those absolutely have to work.  But these:
>> >
>> > - GLSL -> GLSL IR -> NIR -> SPIR-V -> NIR -> i965 IRs
>> > - GLSL -> GLSL IR -> SPIR-V -> NIR -> i965 IRs
>> >
>> > aren't required to work, or even be supported.  It makes a lot of sense
>> > to support them - both for testing purposes, and as an alternative to
>> > glslang, for a broader tooling ecosystem.
>> >
>> > The thing that concerns me is that if you use SPIR-V for the cache, you
>> > need these paths to not just work, but be _indistinguishable_ from one
>> > another:
>> >
>> > - GLSL -> GLSL IR -> NIR -> ...
>> > - GLSL -> GLSL IR -> NIR -> SPIR-V, then SPIR-V -> NIR -> ...
>> >
>> > Otherwise the original compile and partially-cached recompile might have
>> > different properties.  For example, if the SPIR-V step messes with
>> > variables or instruction ordering a little, it could trip up the loop
>> > unroller so the original compile gets unrolled, and the recompile from
>> > partial cache doesn't get unrolled.  I don't want to have to debug that.
>>
>> That is a very compelling argument.  If we want Mesa to be an
>> alternative to glslang, I think we would like to have that property, but
>> it's not a hard requirement for that use case.
>
>
> I also find that argument rather compelling.  The SPIR-V -> NIR pass is
> *not* a simple pass.  It does piles of lowering and things on-the-fly as
> well as creating temporary variables for various things.  The best we could
> hope to guarantee would be that NIR -> SPIR-V -> NIR -> vars_to_ssa -> CSE
> is idemp

Re: [Mesa-dev] Mesa (master): drirc: enable glthread for more games (Civ5, CivBE, Dreamfall, Hitman, SR3)

2017-09-14 Thread Samuel Pitoiset



On 09/14/2017 09:03 PM, Marek Olšák wrote:
> Module: Mesa
> Branch: master
> Commit: 7ffd4d2a6670ccefd4d697954a1ac67b5839da7d
> URL:
> http://cgit.freedesktop.org/mesa/mesa/commit/?id=7ffd4d2a6670ccefd4d697954a1ac67b5839da7d
>
> Author: Christoph Berliner 
> Date:   Thu Sep 14 21:01:04 2017 +0200
>
> drirc: enable glthread for more games (Civ5, CivBE, Dreamfall, Hitman, SR3)

I didn't see any differences last time I tried glthread with SR3.

What about Hitman, do you see real performance improvements?

> Signed-off-by: Marek Olšák 
>
> ---
>
>   src/util/drirc | 15 +++
>   1 file changed, 15 insertions(+)
>
> diff --git a/src/util/drirc b/src/util/drirc
> index 30ac9c839b..145467bac4 100644
> --- a/src/util/drirc
> +++ b/src/util/drirc
> @@ -178,6 +178,21 @@ TODO: document the other workarounds.
>   
>   
>   
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
>   
>   
>   
>
> ___
> mesa-commit mailing list
> mesa-com...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-commit




Re: [Mesa-dev] [PATCH 2/2] util/u_atomic: Add implementation of __sync_val_compare_and_swap_8

2017-09-14 Thread Matt Turner
Grazvydas,

I noticed that there are some __atomic functions in this file, but I'm
not sure what they do or why they're necessary. Remind me?

Thanks,
Matt


[Mesa-dev] [PATCH 2/2] util/u_atomic: Add implementation of __sync_val_compare_and_swap_8

2017-09-14 Thread Matt Turner
Needed for 32-bit PowerPC.
---
 src/util/u_atomic.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/src/util/u_atomic.c b/src/util/u_atomic.c
index 44b75fb0c0..b32527fe34 100644
--- a/src/util/u_atomic.c
+++ b/src/util/u_atomic.c
@@ -61,6 +61,20 @@ __sync_sub_and_fetch_8(uint64_t *ptr, uint64_t val)
 }
 
 WEAK uint64_t
+__sync_val_compare_and_swap_8(uint64_t *ptr, uint64_t oldval, uint64_t newval)
+{
+   uint64_t r;
+
+   pthread_mutex_lock(&sync_mutex);
+   r = *ptr;
+   if (*ptr == oldval)
+  *ptr = newval;
+   pthread_mutex_unlock(&sync_mutex);
+
+   return r;
+}
+
+WEAK uint64_t
 __atomic_fetch_add_8(uint64_t *ptr, uint64_t val, int memorder)
 {
return __sync_add_and_fetch(ptr, val);
-- 
2.13.5
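
A standalone sketch of the lock-based fallback in this patch, for illustration. The names demo_mutex and demo_val_compare_and_swap_u64 are hypothetical; the real function is the weak __sync_val_compare_and_swap_8 symbol above, which takes sync_mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* One global lock serializes every emulated 64-bit operation. */
static pthread_mutex_t demo_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Store newval into *ptr only if *ptr currently equals oldval.
 * Always returns the value that was in *ptr before the call.
 */
static uint64_t
demo_val_compare_and_swap_u64(uint64_t *ptr, uint64_t oldval, uint64_t newval)
{
   uint64_t seen;

   pthread_mutex_lock(&demo_mutex);
   seen = *ptr;
   if (seen == oldval)
      *ptr = newval;
   pthread_mutex_unlock(&demo_mutex);

   return seen;
}
```

A caller detects success by comparing the return value with oldval. Note that this scheme is only atomic with respect to other code that takes the same mutex; a plain store to the same location from elsewhere would bypass it.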



[Mesa-dev] [PATCH 1/2] util: Link libmesautil into u_atomic_test

2017-09-14 Thread Matt Turner
Platforms without particular atomic operations require the
implementations in u_atomic.c
---
 src/util/Makefile.am | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/util/Makefile.am b/src/util/Makefile.am
index 4512dc99d5..9885bbe968 100644
--- a/src/util/Makefile.am
+++ b/src/util/Makefile.am
@@ -62,6 +62,7 @@ libxmlconfig_la_LIBADD = $(EXPAT_LIBS) -lm
 
 sysconf_DATA = drirc
 
+u_atomic_test_LDADD = libmesautil.la
 roundeven_test_LDADD = -lm
 
 check_PROGRAMS = u_atomic_test roundeven_test
-- 
2.13.5



Re: [Mesa-dev] [PATCH v2 0/4] gbm: Add a modifier_plane_count query

2017-09-14 Thread Daniel Stone
Hi,

On 5 September 2017 at 16:48, Jason Ekstrand  wrote:
> This is mostly just a re-send of the original patch series I sent out only
> with a couple of reviews and fixes applied.  I'm happy with it and I think
> Daniel can confirm that it fixes the problem we're having in modesetting
> when trying to enable CCS.  Anyone opposed?

Specifically, the X server really wants to do front-buffer rendering
during scanout, and multi-plane formats push this from the standard
tearing that we all know and love, to far more creative noise as the
two planes become desynchronised.

With the GBM_EXPORT fix, series is:
Reviewed-by: Daniel Stone 

Cheers,
Daniel


Re: [Mesa-dev] [RFC 0/3] Flag new aux state when we create/drop aux surfaces

2017-09-14 Thread Kenneth Graunke
On Thursday, September 14, 2017 10:33:22 AM PDT Jason Ekstrand wrote:
> I read through the series and, while I think it will fix the issue in 90%
> of cases, I don't think it's quite the right solution.  There's a *lot* of
> subtlety here and we need to tread carefully.  I think the better thing to
> do would be to whack the new flag in two places:  Whenever we change the
> fast clear value (brw_blorp.c and brw_clear.c for depth) and whenever
> intel_miptree_set_aux_state actually changes something.  There are several
> things which look at aux_state and it would be better to flag on that
> changing just to make sure we get them all.
> 
> --Jason

There's another subtle factor:

intel_miptree_texture_aux_usage() can return AUX_USAGE_NONE sometimes,
even if mt->aux_usage hasn't changed.  For example, we report NONE if
there's no unresolved color to avoid having to look at CCS, even though
there still is a CCS...

We'd want to re-emit in that case, too.

--Ken



Re: [Mesa-dev] [PATCH v2 2/4] gbm: Add a gbm_device_get_format_modifier_plane_count function

2017-09-14 Thread Daniel Stone
On 5 September 2017 at 16:48, Jason Ekstrand  wrote:
> +/** Get the number of planes that are required for a given format+modifier
> + *
> + * \param gbm The gbm device returned from gbm_create_device()
> + * \param format The format to query
> + * \param modifier The modifier to query
> + */
> +int
> +gbm_device_get_format_modifier_plane_count(struct gbm_device *gbm,

Needs to be GBM_EXPORT


Re: [Mesa-dev] [PATCH] i965: fix build warning on clang

2017-09-14 Thread Jordan Justen
On 2017-09-14 00:26:39, Tapani Pälli wrote:
> Fixes the following warning:
>warning: format specifies type 'long' but the argument has type 'uint64_t' (aka 'unsigned long long')
> 
> The cast is needed to avoid this turning into another warning on 32-bit builds:
>warning: format specifies type 'unsigned long long' but the argument has type 'uint64_t' (aka 'unsigned long')

size is uint64_t, so the (unsigned long long) cast shouldn't be needed
for 32-bit, right?

Otherwise: Reviewed-by: Jordan Justen 
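
One possible alternative that avoids the cast, shown here only as a sketch: the PRIu64 macro from <inttypes.h> expands to the correct conversion specifier for uint64_t on both 32-bit and 64-bit builds. The helper name format_bo_size is hypothetical; the patch itself uses the (unsigned long long) cast.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>
#include <string.h>

/* Format a buffer size the way the debug message does ("<size>b"),
 * letting PRIu64 pick the right length modifier for uint64_t.
 */
static int
format_bo_size(char *out, size_t out_size, uint64_t size)
{
   return snprintf(out, out_size, "%" PRIu64 "b", size);
}
```

Either approach silences the warning; the macro just keeps the format string and the argument type in agreement without a conversion at the call site.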

> 
> Signed-off-by: Tapani Pälli 
> ---
>  src/mesa/drivers/dri/i965/brw_bufmgr.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.c b/src/mesa/drivers/dri/i965/brw_bufmgr.c
> index b9d6a39f1f..cc1a2d1f49 100644
> --- a/src/mesa/drivers/dri/i965/brw_bufmgr.c
> +++ b/src/mesa/drivers/dri/i965/brw_bufmgr.c
> @@ -396,7 +396,8 @@ retry:
>  
> pthread_mutex_unlock(&bufmgr->lock);
>  
> -   DBG("bo_create: buf %d (%s) %ldb\n", bo->gem_handle, bo->name, size);
> +   DBG("bo_create: buf %d (%s) %llub\n", bo->gem_handle, bo->name,
> +   (unsigned long long) size);
>  
> return bo;
>  
> -- 
> 2.13.5
> 


Re: [Mesa-dev] [PATCH 1/3] glsl: don't drop instructions from unreachable terminators continue branch

2017-09-14 Thread Emil Velikov
Hi Tim

On 14 September 2017 at 05:47, Timothy Arceri  wrote:
> These instructions will be executed on every iteration of the loop,
> so we cannot drop them.

This and 2/3 sound like very nice bugfixes.
I haven't checked if they apply for 17.2 - if not, can you do some backports?

Once people have checked the lot and confirmed everything of course.

Thanks!
Emil


Re: [Mesa-dev] [PATCH 2/3] intel/blorp/hiz: Always set sample number

2017-09-14 Thread Chad Versace
On Wed 13 Sep 2017, Matt Turner wrote:
> In-Reply-To: <20170913220107.gd6...@ivybridge.mattst88.com>

Matt, upgrade your machine. Ivy Bridge is ancient.

> On 09/11, Topi Pohjolainen wrote:
> > Signed-off-by: Topi Pohjolainen 
> 
> Patches 2-3 (because I don't know a lot about this code) are
> 
> Acked-by: Matt Turner 

Patch 2 is
Reviewed-by: Chad Versace 

Please reference Chromium commit
https://chromium-review.googlesource.com/c/chromiumos/overlays/chromiumos-overlay/+/490534
somewhere in the message. You can also use the short url
http://crrev.com/c/490534. That's the commit where marcheu disabled HiZ
on Braswell due to hangs.

But what about 3DSTATE_RASTER? In the same section the Broadwell PRM says that
3DSTATE_RASTER must also be emitted prior to 3DSTATE_WM_HZ_OP, but only
"if 3DSTATE_RASTER is used".

I'm unsure how exactly to interpret that self-referential condition. "If
you use it, then you must emit... ok?"

> Chromium has been disabling HiZ on Braswell for stability reasons
> (trying to find out more information, but I currently don't have
> permissions to view their bug tracker). Not sure if they have the
> bandwidth to test these patches, but Cc'ing Chad in case they do.

Summary of problems we saw on Chrome OS: Braswell boards were
sporadically GPU hanging in the test lab while running GLBench [1].
marcheu caught an i915_error_state that shows the hang happening shortly
after GEN7_3DSTATE_HIER_DEPTH_BUFFER, so he disabled HiZ on
Braswell.


I'll submit your patches to the test lab and see what happens.  The hang
is hard to reproduce, though, so don't wait on me for verification that
this patch fixes it.

[1]: https://chromium.googlesource.com/chromiumos/third_party/autotest/+/master/client/deps/glbench/


Re: [Mesa-dev] [PATCH 2/2] anv: set has_exec_async to false on Android

2017-09-14 Thread Emil Velikov
On 14 September 2017 at 07:57, Tapani Pälli  wrote:
> Other WSI implementations set has_exec_async false for WSI buffers,
> so far haven't found a place to do it so we just claim to not have
> async exec.
>
What are the actual side effects you're seeing? I'd imagine Jason, Chris
and the gang may have some tips/suggestions - be that wrt Mesa or the
kernel.

I'm not saying "don't upstream this", but a comment and/or bug
reference will be beneficial.
Esp. since disabling async exec may have a noticeable impact on performance.

-Emil


Re: [Mesa-dev] [PATCH 1/2] anv: android build system changes

2017-09-14 Thread Emil Velikov
Hi Tapani,

On 14 September 2017 at 07:57, Tapani Pälli  wrote:
> Following changes are made to support VK_ANDROID_native_buffer:
>
>- bring in vk_android_native_buffer.xml
>- rename target as vulkan.$(TARGET_BOARD_PLATFORM)
>- use LOCAL_PROPRIETARY_MODULE to install under vendor path
>- link with libsync and liblog
>
Hope I don't come across as too pretentious - can you split this up?
 - the build/xml changes - can be folded into Chad's patch
 - linking - as above
 - the proprietary/vendor change (perhaps the vulkan.FOO rename?)
 - and most importantly, the hwvulkan.h include - I'm suspecting that
it'll cause grief to Chad/CrOS devs.

On the last one - I guess we could address the original issue by
carefully massaging the include order and/or -I directives?

-Emil


Re: [Mesa-dev] [PATCH 1/3] i965/gen8: Remove unused gen8_emit_3dstate_multisample()

2017-09-14 Thread Chad Versace
On Mon 11 Sep 2017, Topi Pohjolainen wrote:
> Signed-off-by: Topi Pohjolainen 
> ---
>  src/mesa/drivers/dri/i965/brw_context.h|  1 -
>  src/mesa/drivers/dri/i965/gen8_multisample_state.c | 16 
>  2 files changed, 17 deletions(-)

Reviewed-by: Chad Versace 


[Mesa-dev] [PATCH 2/2] radeonsi: reallocate if a non-sharable texture is being shared

2017-09-14 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeon/r600_texture.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeon/r600_texture.c b/src/gallium/drivers/radeon/r600_texture.c
index 26afc98..e9507c3 100644
--- a/src/gallium/drivers/radeon/r600_texture.c
+++ b/src/gallium/drivers/radeon/r600_texture.c
@@ -28,20 +28,21 @@
 #include "r600_cs.h"
 #include "r600_query.h"
 #include "util/u_format.h"
 #include "util/u_log.h"
 #include "util/u_memory.h"
 #include "util/u_pack_color.h"
 #include "util/u_surface.h"
 #include "os/os_time.h"
 #include 
 #include 
+#include "state_tracker/drm_driver.h"
 
 static void r600_texture_discard_cmask(struct r600_common_screen *rscreen,
   struct r600_texture *rtex);
 static enum radeon_surf_mode
 r600_choose_tiling(struct r600_common_screen *rscreen,
   const struct pipe_resource *templ);
 
 
 bool r600_prepare_for_dma_blit(struct r600_common_context *rctx,
   struct r600_texture *rdst,
@@ -598,27 +599,30 @@ static boolean r600_texture_get_handle(struct pipe_screen* screen,
 
if (resource->target != PIPE_BUFFER) {
/* This is not supported now, but it might be required for OpenCL
 * interop in the future.
 */
if (resource->nr_samples > 1 || rtex->is_depth)
return false;
 
/* Move a suballocated texture into a non-suballocated allocation. */
if (rscreen->ws->buffer_is_suballocated(res->buf) ||
-   rtex->surface.tile_swizzle) {
+   rtex->surface.tile_swizzle ||
+   (rtex->resource.flags & RADEON_FLAG_NO_INTERPROCESS_SHARING &&
+whandle->type != DRM_API_HANDLE_TYPE_KMS)) {
assert(!res->b.is_shared);
r600_reallocate_texture_inplace(rctx, rtex,
PIPE_BIND_SHARED, false);
rctx->b.flush(&rctx->b, NULL, 0);
assert(res->b.b.bind & PIPE_BIND_SHARED);
assert(res->flags & RADEON_FLAG_NO_SUBALLOC);
+   assert(!(res->flags & RADEON_FLAG_NO_INTERPROCESS_SHARING));
assert(rtex->surface.tile_swizzle == 0);
}
 
/* Since shader image stores don't support DCC on VI,
 * disable it for external clients that want write
 * access.
 */
if (usage & PIPE_HANDLE_USAGE_WRITE && rtex->dcc_offset) {
if (r600_texture_disable_dcc(rctx, rtex))
update_metadata = true;
-- 
2.7.4



[Mesa-dev] [PATCH 1/2] radeonsi: PIPE_BIND_SHARED should allow inter-process sharing

2017-09-14 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeon/r600_buffer_common.c | 9 -
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/src/gallium/drivers/radeon/r600_buffer_common.c b/src/gallium/drivers/radeon/r600_buffer_common.c
index f35bc2c..7515f7d 100644
--- a/src/gallium/drivers/radeon/r600_buffer_common.c
+++ b/src/gallium/drivers/radeon/r600_buffer_common.c
@@ -162,25 +162,24 @@ void r600_init_resource_fields(struct r600_common_screen *rscreen,
/* Tiled textures are unmappable. Always put them in VRAM. */
if ((res->b.b.target != PIPE_BUFFER && !rtex->surface.is_linear) ||
res->flags & R600_RESOURCE_FLAG_UNMAPPABLE) {
res->domains = RADEON_DOMAIN_VRAM;
res->flags |= RADEON_FLAG_NO_CPU_ACCESS |
 RADEON_FLAG_GTT_WC;
}
 
/* Only displayable single-sample textures can be shared between
 * processes. */
-   if (res->b.b.target == PIPE_BUFFER ||
-   res->b.b.nr_samples >= 2 ||
-   (rtex->surface.micro_tile_mode != RADEON_MICRO_MODE_DISPLAY &&
-/* Raven doesn't use display micro mode for 32bpp, so check this: */
-!(res->b.b.bind & PIPE_BIND_SCANOUT)))
+   if (!(res->b.b.bind & (PIPE_BIND_SHARED | PIPE_BIND_SCANOUT)) &&
+   (res->b.b.target == PIPE_BUFFER ||
+res->b.b.nr_samples >= 2 ||
+rtex->surface.micro_tile_mode != RADEON_MICRO_MODE_DISPLAY))
res->flags |= RADEON_FLAG_NO_INTERPROCESS_SHARING;
 
/* If VRAM is just stolen system memory, allow both VRAM and
 * GTT, whichever has free space. If a buffer is evicted from
 * VRAM to GTT, it will stay there.
 *
 * DRM 3.6.0 has good BO move throttling, so we can allow VRAM-only
 * placements even with a low amount of stolen VRAM.
 */
if (!rscreen->info.has_dedicated_vram &&
-- 
2.7.4
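
The condition this patch changes can be read in isolation. The sketch below is an illustration only: the flag values, `struct tex_desc` and `needs_no_interprocess_sharing()` are hypothetical stand-ins, not the driver's definitions. It shows that a resource bound with either the shared or the scanout flag is never marked unshareable, whatever its target, sample count or micro tile mode.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the gallium bind flags; the values are
 * illustrative and not taken from the driver headers. */
#define BIND_SHARED  (1u << 0)
#define BIND_SCANOUT (1u << 1)

struct tex_desc {
    unsigned bind;            /* bitmask of BIND_* flags */
    bool is_buffer;           /* target == PIPE_BUFFER in the patch */
    unsigned nr_samples;
    bool display_micro_tile;  /* micro_tile_mode == DISPLAY in the patch */
};

/* Returns true when the resource should get the
 * "no inter-process sharing" flag under the new condition. */
static bool needs_no_interprocess_sharing(const struct tex_desc *t)
{
    /* Anything explicitly shared or scanned out must stay shareable. */
    if (t->bind & (BIND_SHARED | BIND_SCANOUT))
        return false;

    return t->is_buffer || t->nr_samples >= 2 || !t->display_micro_tile;
}
```

Under these assumptions a buffer created with the shared bind flag stays shareable, while the same buffer without it does not.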



Re: [Mesa-dev] [RFC 0/3] Flag new aux state when we create/drop aux surfaces

2017-09-14 Thread Jason Ekstrand
I read through the series and, while I think it will fix the issue in 90%
of cases, I don't think it's quite the right solution.  There's a *lot* of
subtlety here and we need to tread carefully.  I think the better thing to
do would be to whack the new flag in two places:  Whenever we change the
fast clear value (brw_blorp.c and brw_clear.c for depth) and whenever
intel_miptree_set_aux_state actually changes something.  There are several
things which look at aux_state and it would be better to flag on that
changing just to make sure we get them all.
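
A minimal sketch of that idea, assuming simplified types: `struct ctx`, `struct slice`, `NEW_AUX_STATE` and both setters are hypothetical names for illustration, not the i965 code. The dirty bit is raised only when the stored aux state or fast clear value actually changes, so every consumer of either value sees the flag without callers having to remember it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NEW_AUX_STATE (1ull << 0)  /* hypothetical dirty bit */

enum aux_state { AUX_NONE, AUX_CLEAR, AUX_COMPRESSED, AUX_RESOLVED };

struct ctx {
    uint64_t dirty;            /* pending state flags */
};

struct slice {
    enum aux_state aux_state;
    uint32_t fast_clear_value;
};

/* Flag only on a real transition; returns whether anything changed. */
static bool set_aux_state(struct ctx *ctx, struct slice *s,
                          enum aux_state new_state)
{
    if (s->aux_state == new_state)
        return false;

    s->aux_state = new_state;
    ctx->dirty |= NEW_AUX_STATE;
    return true;
}

/* Same rule for the fast clear value. */
static bool set_fast_clear_value(struct ctx *ctx, struct slice *s,
                                 uint32_t value)
{
    if (s->fast_clear_value == value)
        return false;

    s->fast_clear_value = value;
    ctx->dirty |= NEW_AUX_STATE;
    return true;
}
```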

--Jason


On Thu, Sep 14, 2017 at 3:54 AM, Iago Toral Quiroga 
wrote:

> Jason, Ken: this is what I came up with based on your findings that Ken
> reported in https://bugs.freedesktop.org/show_bug.cgi?id=102611.
>
> Ken mentioned that he is not completely certain that we need to flag
> dirty state every time we update the surfaces and that maybe flagging
> only when we go from/to AUX_USAGE_NONE might suffice. I am not sure,
> but to me it sounds safer to flag when we create the surfaces, since
> at that point we know we are going to use them and in that case we
> need the new surface states emitted, at least until we find specific
> cases where we don't need this and we can drop the flag for them. Let
> me know if you think a different approach is better though.
>
> The last two patches should probably be squashed. The last patch
> signals dirty AUX state when we drop aux surfaces. I think this
> is necessary too since when we upload new renderbuffers we
> also consider the case where we don't have aux, but I decided to
> split it because we have not been signaling anything for dropped
> aux surfaces before.
>
> Iago Toral Quiroga (3):
>   i965: rename BRW_NEW_FAST_CLEAR_COLOR to BRW_NEW_AUX_STATE
>   i965: emit BRW_NEW_AUX_STATE when we allocate aux surfaces
>   i965: flag BRW_NEW_AUX_STATE if we drop the aux buffer
>
>  src/mesa/drivers/dri/i965/brw_blorp.c |  2 +-
>  src/mesa/drivers/dri/i965/brw_context.h   |  4 ++--
>  src/mesa/drivers/dri/i965/brw_gs_surface_state.c  |  2 +-
>  src/mesa/drivers/dri/i965/brw_state_upload.c  |  2 +-
>  src/mesa/drivers/dri/i965/brw_tcs_surface_state.c |  2 +-
>  src/mesa/drivers/dri/i965/brw_tes_surface_state.c |  2 +-
>  src/mesa/drivers/dri/i965/brw_vs_surface_state.c  |  2 +-
>  src/mesa/drivers/dri/i965/brw_wm_surface_state.c  | 12 ++--
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c |  7 +++
>  9 files changed, 21 insertions(+), 14 deletions(-)
>
> --
> 2.11.0
>


[Mesa-dev] [Bug 101747] Steam-Game Turmoil, Segfault on start

2017-09-14 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=101747

--- Comment #5 from John  ---
In case it helps, this is the minimum needed: R600_DEBUG="vs,ps" to not crash.
The others don't seem to matter.

-- 
You are receiving this mail because:
You are the assignee for the bug.


[Mesa-dev] [Bug 101747] Steam-Game Turmoil, Segfault on start

2017-09-14 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=101747

--- Comment #4 from John  ---
That's right, it does not!

Out of curiosity, why is a debug print command preventing the crash?

-- 
You are receiving this mail because:
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH 1/2] anv: android build system changes

2017-09-14 Thread Rob Herring
On Thu, Sep 14, 2017 at 1:57 AM, Tapani Pälli  wrote:
> Following changes are made to support VK_ANDROID_native_buffer:
>
>- bring in vk_android_native_buffer.xml
>- rename target as vulkan.$(TARGET_BOARD_PLATFORM)
>- use LOCAL_PROPRIETARY_MODULE to install under vendor path

Good to see this. I was working on a patch to change all the targets
to /vendor. Any issues with doing that from your perspective?

Rob


[Mesa-dev] [PATCH 4/4] radv: move compute related code to radv_compute.c

2017-09-14 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/Makefile.sources  |   2 +
 src/amd/vulkan/radv_cmd_buffer.c | 239 +-
 src/amd/vulkan/radv_compute.c| 275 +++
 src/amd/vulkan/radv_compute.h|  69 ++
 src/amd/vulkan/radv_meta.h   |   1 +
 src/amd/vulkan/radv_private.h|  35 +++--
 src/amd/vulkan/si_cmd_buffer.c   |  50 +--
 7 files changed, 381 insertions(+), 290 deletions(-)
 create mode 100644 src/amd/vulkan/radv_compute.c
 create mode 100644 src/amd/vulkan/radv_compute.h

diff --git a/src/amd/vulkan/Makefile.sources b/src/amd/vulkan/Makefile.sources
index 9489219f5b..7cef56b43d 100644
--- a/src/amd/vulkan/Makefile.sources
+++ b/src/amd/vulkan/Makefile.sources
@@ -32,6 +32,8 @@ RADV_WS_AMDGPU_FILES := \
 
 VULKAN_FILES := \
radv_cmd_buffer.c \
+   radv_compute.c \
+   radv_compute.h \
radv_cs.h \
radv_debug.c \
radv_debug.h \
diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index 10a071c3d6..af9f8210bf 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -34,6 +34,7 @@
 #include "vk_format.h"
 #include "radv_debug.h"
 #include "radv_meta.h"
+#include "radv_compute.h"
 
 #include "ac_debug.h"
 
@@ -366,7 +367,7 @@ void radv_cmd_buffer_trace_emit(struct radv_cmd_buffer *cmd_buffer)
radeon_emit(cs, AC_ENCODE_TRACE_POINT(cmd_buffer->state.trace_id));
 }
 
-static void
+void
 radv_cmd_buffer_after_draw(struct radv_cmd_buffer *cmd_buffer)
 {
if (cmd_buffer->device->debug_flags & RADV_DEBUG_SYNC_SHADERS) {
@@ -386,7 +387,7 @@ radv_cmd_buffer_after_draw(struct radv_cmd_buffer *cmd_buffer)
radv_cmd_buffer_trace_emit(cmd_buffer);
 }
 
-static void
+void
 radv_save_pipeline(struct radv_cmd_buffer *cmd_buffer,
   struct radv_pipeline *pipeline, enum ring_type ring)
 {
@@ -601,14 +602,6 @@ radv_emit_graphics_raster_state(struct radv_cmd_buffer *cmd_buffer,
   raster->pa_su_sc_mode_cntl);
 }
 
-static inline void
-radv_emit_prefetch(struct radv_cmd_buffer *cmd_buffer, uint64_t va,
-  unsigned size)
-{
-   if (cmd_buffer->device->physical_device->rad_info.chip_class >= CIK)
-   si_cp_dma_prefetch(cmd_buffer, va, size);
-}
-
 static void
 radv_emit_hw_vs(struct radv_cmd_buffer *cmd_buffer,
struct radv_pipeline *pipeline,
@@ -1577,7 +1570,7 @@ radv_flush_indirect_descriptor_sets(struct radv_cmd_buffer *cmd_buffer)
   AC_UD_INDIRECT_DESCRIPTOR_SETS, va);
 }
 
-static void
+void
 radv_flush_descriptors(struct radv_cmd_buffer *cmd_buffer,
   VkShaderStageFlags stages)
 {
@@ -1615,7 +1608,7 @@ radv_flush_descriptors(struct radv_cmd_buffer *cmd_buffer,
assert(cmd_buffer->cs->cdw <= cdw_max);
 }
 
-static void
+void
 radv_flush_constants(struct radv_cmd_buffer *cmd_buffer,
 struct radv_pipeline *pipeline,
 VkShaderStageFlags stages)
@@ -2108,7 +2101,8 @@ VkResult radv_BeginCommandBuffer(
radv_set_db_count_control(cmd_buffer);
break;
case RADV_QUEUE_COMPUTE:
-   si_init_compute(cmd_buffer);
+   radv_init_compute(cmd_buffer->device->physical_device,
+ cmd_buffer->cs);
break;
case RADV_QUEUE_TRANSFER:
default:
@@ -2378,58 +2372,6 @@ VkResult radv_EndCommandBuffer(
return cmd_buffer->record_result;
 }
 
-static void
-radv_emit_compute_pipeline(struct radv_cmd_buffer *cmd_buffer)
-{
-   struct radeon_winsys *ws = cmd_buffer->device->ws;
-   struct radv_shader_variant *compute_shader;
-   struct radv_pipeline *pipeline = cmd_buffer->state.compute_pipeline;
-   uint64_t va;
-
-   if (!pipeline || pipeline == cmd_buffer->state.emitted_compute_pipeline)
-   return;
-
-   cmd_buffer->state.emitted_compute_pipeline = pipeline;
-
-   compute_shader = pipeline->shaders[MESA_SHADER_COMPUTE];
-   va = ws->buffer_get_va(compute_shader->bo) + compute_shader->bo_offset;
-
-   ws->cs_add_buffer(cmd_buffer->cs, compute_shader->bo, 8);
-   radv_emit_prefetch(cmd_buffer, va, compute_shader->code_size);
-
-   MAYBE_UNUSED unsigned cdw_max = radeon_check_space(cmd_buffer->device->ws,
-  cmd_buffer->cs, 16);
-
-   radeon_set_sh_reg_seq(cmd_buffer->cs, R_00B830_COMPUTE_PGM_LO, 2);
-   radeon_emit(cmd_buffer->cs, va >> 8);
-   radeon_emit(cmd_buffer->cs, va >> 40);
-
-   radeon_set_sh_reg_seq(cmd_buffer->cs, R_00B848_COMPUTE_PGM_RSRC1, 2);
-   radeon_emit(cmd_buffer->cs, compute_shader->rsrc1);
-   radeon_emit(cmd_buffer->cs, compute_shader->rsrc2);
-
-
-   cmd_buffer->compute_scratch_size

[Mesa-dev] [PATCH 3/4] radv: inline radv_flush_compute_state() into radv_dispatch()

2017-09-14 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_cmd_buffer.c | 18 +++---
 1 file changed, 7 insertions(+), 11 deletions(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index 143acf1719..10a071c3d6 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -3124,16 +3124,6 @@ void radv_CmdDrawIndexedIndirectCountAMD(
 maxDrawCount, stride);
 }
 
-static void
-radv_flush_compute_state(struct radv_cmd_buffer *cmd_buffer)
-{
-   radv_emit_compute_pipeline(cmd_buffer);
-   radv_flush_descriptors(cmd_buffer, VK_SHADER_STAGE_COMPUTE_BIT);
-   radv_flush_constants(cmd_buffer, cmd_buffer->state.compute_pipeline,
-VK_SHADER_STAGE_COMPUTE_BIT);
-   si_emit_cache_flush(cmd_buffer);
-}
-
 struct radv_dispatch_info {
/**
 * Determine the layout of the grid (in block units) to be used.
@@ -3272,7 +3262,13 @@ static void
 radv_dispatch(struct radv_cmd_buffer *cmd_buffer,
  const struct radv_dispatch_info *info)
 {
-   radv_flush_compute_state(cmd_buffer);
+   radv_emit_compute_pipeline(cmd_buffer);
+
+   radv_flush_descriptors(cmd_buffer, VK_SHADER_STAGE_COMPUTE_BIT);
+   radv_flush_constants(cmd_buffer, cmd_buffer->state.compute_pipeline,
+VK_SHADER_STAGE_COMPUTE_BIT);
+
+   si_emit_cache_flush(cmd_buffer);
 
radv_emit_dispatch_packets(cmd_buffer, info);
 
-- 
2.14.1



[Mesa-dev] [PATCH 1/4] radv: add radv_emit_dispatch_packets() helper

2017-09-14 Thread Samuel Pitoiset
To share common dispatch compute code.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_cmd_buffer.c | 252 +++
 1 file changed, 149 insertions(+), 103 deletions(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index 068247d04d..6ffbba2f1d 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -3134,6 +3134,140 @@ radv_flush_compute_state(struct radv_cmd_buffer *cmd_buffer)
si_emit_cache_flush(cmd_buffer);
 }
 
+struct radv_dispatch_info {
+   /**
+* Determine the layout of the grid (in block units) to be used.
+*/
+   uint32_t blocks[3];
+
+   /**
+* Whether it's an unaligned compute dispatch.
+*/
+   bool unaligned;
+
+   /**
+* Indirect compute parameters resource.
+*/
+   struct radv_buffer *indirect;
+   uint64_t indirect_offset;
+};
+
+static void
+radv_emit_dispatch_packets(struct radv_cmd_buffer *cmd_buffer,
+  const struct radv_dispatch_info *info)
+{
+   struct radv_pipeline *pipeline = cmd_buffer->state.compute_pipeline;
+   struct radv_shader_variant *compute_shader = pipeline->shaders[MESA_SHADER_COMPUTE];
+   struct radeon_winsys *ws = cmd_buffer->device->ws;
+   struct radeon_winsys_cs *cs = cmd_buffer->cs;
+   struct ac_userdata_info *loc;
+   uint8_t grid_used;
+
+   grid_used = compute_shader->info.info.cs.grid_components_used;
+
+   loc = radv_lookup_user_sgpr(pipeline, MESA_SHADER_COMPUTE,
+   AC_UD_CS_GRID_SIZE);
+
+   MAYBE_UNUSED unsigned cdw_max = radeon_check_space(ws, cs, 25);
+
+   if (info->indirect) {
+   uint64_t va = ws->buffer_get_va(info->indirect->bo);
+
+   va += info->indirect->offset + info->indirect_offset;
+
+   ws->cs_add_buffer(cs, info->indirect->bo, 8);
+
+   if (loc->sgpr_idx != -1) {
+   for (unsigned i = 0; i < grid_used; ++i) {
+   radeon_emit(cs, PKT3(PKT3_COPY_DATA, 4, 0));
+   radeon_emit(cs, COPY_DATA_SRC_SEL(COPY_DATA_MEM) |
+   COPY_DATA_DST_SEL(COPY_DATA_REG));
+   radeon_emit(cs, (va +  4 * i));
+   radeon_emit(cs, (va + 4 * i) >> 32);
+   radeon_emit(cs, ((R_00B900_COMPUTE_USER_DATA_0
++ loc->sgpr_idx * 4) >> 2) + i);
+   radeon_emit(cs, 0);
+   }
+   }
+
+   if (radv_cmd_buffer_uses_mec(cmd_buffer)) {
+   radeon_emit(cs, PKT3(PKT3_DISPATCH_INDIRECT, 2, 0) |
+   PKT3_SHADER_TYPE_S(1));
+   radeon_emit(cs, va);
+   radeon_emit(cs, va >> 32);
+   radeon_emit(cs, 1);
+   } else {
+   radeon_emit(cs, PKT3(PKT3_SET_BASE, 2, 0) |
+   PKT3_SHADER_TYPE_S(1));
+   radeon_emit(cs, 1);
+   radeon_emit(cs, va);
+   radeon_emit(cs, va >> 32);
+
+   radeon_emit(cs, PKT3(PKT3_DISPATCH_INDIRECT, 1, 0) |
+   PKT3_SHADER_TYPE_S(1));
+   radeon_emit(cs, 0);
+   radeon_emit(cs, 1);
+   }
+   } else {
+   unsigned blocks[3] = { info->blocks[0], info->blocks[1], info->blocks[2] };
+   unsigned dispatch_initiator = S_00B800_COMPUTE_SHADER_EN(1);
+
+   if (info->unaligned) {
+   unsigned *cs_block_size = compute_shader->info.cs.block_size;
+   unsigned remainder[3];
+
+   /* If aligned, these should be an entire block size,
+* not 0.
+*/
+   remainder[0] = blocks[0] + cs_block_size[0] -
+  align_u32_npot(blocks[0], cs_block_size[0]);
+   remainder[1] = blocks[1] + cs_block_size[1] -
+  align_u32_npot(blocks[1], cs_block_size[1]);
+   remainder[2] = blocks[2] + cs_block_size[2] -
+  align_u32_npot(blocks[2], cs_block_size[2]);
+
+   blocks[0] = round_up_u32(blocks[0], cs_block_size[0]);
+   blocks[1] = round_up_u32(blocks[1], cs_block_size[1]);
+   blocks[2] = round_up_u32(blocks[2], cs_block_size[2]);
+
+   radeon_set_sh_reg_seq(cs, R_00B81C_COMPUTE_NUM_THREAD_X, 3);
+   radeon_emit(cs,
+   S_00B81C_NUM_THREAD_FULL(cs_block_siz

[Mesa-dev] [PATCH 2/4] radv: add radv_dispatch() helper

2017-09-14 Thread Samuel Pitoiset
To share common dispatch compute code.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_cmd_buffer.c | 29 ++---
 1 file changed, 14 insertions(+), 15 deletions(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index 6ffbba2f1d..143acf1719 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -3268,6 +3268,17 @@ radv_emit_dispatch_packets(struct radv_cmd_buffer *cmd_buffer,
assert(cmd_buffer->cs->cdw <= cdw_max);
 }
 
+static void
+radv_dispatch(struct radv_cmd_buffer *cmd_buffer,
+ const struct radv_dispatch_info *info)
+{
+   radv_flush_compute_state(cmd_buffer);
+
+   radv_emit_dispatch_packets(cmd_buffer, info);
+
+   radv_cmd_buffer_after_draw(cmd_buffer);
+}
+
 void radv_CmdDispatch(
VkCommandBuffer commandBuffer,
uint32_t x,
@@ -3277,15 +3288,11 @@ void radv_CmdDispatch(
RADV_FROM_HANDLE(radv_cmd_buffer, cmd_buffer, commandBuffer);
struct radv_dispatch_info info = {};
 
-   radv_flush_compute_state(cmd_buffer);
-
info.blocks[0] = x;
info.blocks[1] = y;
info.blocks[2] = z;
 
-   radv_emit_dispatch_packets(cmd_buffer, &info);
-
-   radv_cmd_buffer_after_draw(cmd_buffer);
+   radv_dispatch(cmd_buffer, &info);
 }
 
 void radv_CmdDispatchIndirect(
@@ -3297,14 +3304,10 @@ void radv_CmdDispatchIndirect(
RADV_FROM_HANDLE(radv_buffer, buffer, _buffer);
struct radv_dispatch_info info = {};
 
-   radv_flush_compute_state(cmd_buffer);
-
info.indirect = buffer;
info.indirect_offset = offset;
 
-   radv_emit_dispatch_packets(cmd_buffer, &info);
-
-   radv_cmd_buffer_after_draw(cmd_buffer);
+   radv_dispatch(cmd_buffer, &info);
 }
 
 void radv_unaligned_dispatch(
@@ -3320,11 +3323,7 @@ void radv_unaligned_dispatch(
info.blocks[2] = z;
info.unaligned = 1;
 
-   radv_flush_compute_state(cmd_buffer);
-
-   radv_emit_dispatch_packets(cmd_buffer, &info);
-
-   radv_cmd_buffer_after_draw(cmd_buffer);
+   radv_dispatch(cmd_buffer, &info);
 }
 
 void radv_CmdEndRenderPass(
-- 
2.14.1
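
The refactoring pattern in this patch can be sketched outside the driver. Everything below is a simplified illustration under stated assumptions: `struct dispatch_info` mirrors the patch's info struct but replaces the buffer pointer with a boolean, and `struct recorder` is a hypothetical stand-in that merely counts the three steps. Each entry point only fills in the descriptor, and one helper runs the common flush, emit and after-draw sequence.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for the patch's dispatch info struct. */
struct dispatch_info {
    uint32_t blocks[3];
    bool unaligned;
    bool indirect;             /* the patch stores a buffer pointer instead */
    uint64_t indirect_offset;
};

/* Hypothetical recorder that counts the steps every dispatch runs. */
struct recorder {
    unsigned flushes;
    unsigned packets;
    unsigned after_draws;
    struct dispatch_info last;
};

/* Common path shared by all entry points. */
static void dispatch(struct recorder *r, const struct dispatch_info *info)
{
    r->flushes++;              /* flush compute state */
    r->last = *info;           /* emit packets described by info */
    r->packets++;
    r->after_draws++;          /* post-dispatch bookkeeping */
}

static void cmd_dispatch(struct recorder *r,
                         uint32_t x, uint32_t y, uint32_t z)
{
    struct dispatch_info info = {0};

    info.blocks[0] = x;
    info.blocks[1] = y;
    info.blocks[2] = z;
    dispatch(r, &info);
}

static void cmd_dispatch_indirect(struct recorder *r, uint64_t offset)
{
    struct dispatch_info info = {0};

    info.indirect = true;
    info.indirect_offset = offset;
    dispatch(r, &info);
}

static void unaligned_dispatch(struct recorder *r,
                               uint32_t x, uint32_t y, uint32_t z)
{
    struct dispatch_info info = {0};

    info.blocks[0] = x;
    info.blocks[1] = y;
    info.blocks[2] = z;
    info.unaligned = true;
    dispatch(r, &info);
}
```

With this shape, a later change to the common sequence touches only `dispatch()`.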



[Mesa-dev] [PATCH] radv: fix a potential crash if attachments allocation failed

2017-09-14 Thread Samuel Pitoiset
Also, it's useless to set the error code twice. Though, we
should probably skip the next commands when the command buffer
is considered invalid.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_cmd_buffer.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index af9f8210bf..0b56087a09 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -2713,9 +2713,10 @@ void radv_CmdBeginRenderPass(
cmd_buffer->state.framebuffer = framebuffer;
cmd_buffer->state.pass = pass;
cmd_buffer->state.render_area = pRenderPassBegin->renderArea;
+
result = radv_cmd_state_setup_attachments(cmd_buffer, pass, pRenderPassBegin);
if (result != VK_SUCCESS)
-   cmd_buffer->record_result = result;
+   return;
 
radv_cmd_buffer_set_subpass(cmd_buffer, pass->subpasses, true);
assert(cmd_buffer->cs->cdw <= cdw_max);
-- 
2.14.1
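
The control flow of the fix can be sketched as follows. This is an illustration under assumptions, not the radv code: the types and function names are hypothetical, and the sketch assumes (as the commit message suggests) that the setup helper already records the error in the command buffer. Returning early keeps the caller from touching attachments that were never allocated.

```c
#include <assert.h>
#include <stdlib.h>

enum result { OK = 0, ERR_OUT_OF_MEMORY = -1 };

struct cmd_buffer {
    enum result record_result;
    int *attachments;          /* NULL when allocation failed */
    unsigned subpasses_begun;
};

/* Allocates attachment state; records the error itself on failure.
 * `fail` simulates an allocation failure; `count` must be non-zero. */
static enum result setup_attachments(struct cmd_buffer *cb,
                                     size_t count, int fail)
{
    cb->attachments = fail ? NULL : calloc(count, sizeof(*cb->attachments));
    if (!cb->attachments) {
        cb->record_result = ERR_OUT_OF_MEMORY;
        return cb->record_result;
    }
    return OK;
}

static void begin_render_pass(struct cmd_buffer *cb, size_t count, int fail)
{
    /* The error is already recorded, so just stop here. */
    if (setup_attachments(cb, count, fail) != OK)
        return;

    /* Safe: attachments is known to be non-NULL at this point. */
    cb->attachments[0] = 1;
    cb->subpasses_begun++;
}
```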



Re: [Mesa-dev] [PATCH v2 12/15] i965: Make BLORP properly avoid batch wrapping.

2017-09-14 Thread Kenneth Graunke
On Thursday, September 14, 2017 3:22:48 AM PDT Chris Wilson wrote:
> Quoting Kenneth Graunke (2017-09-13 21:54:14)
> > We need to set brw->no_batch_wrap to actually avoid flushing in the
> > middle of our BLORP operation, and instead grow the batchbuffer.
> > ---
> >  src/mesa/drivers/dri/i965/genX_blorp_exec.c | 16 ++--
> >  1 file changed, 2 insertions(+), 14 deletions(-)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/genX_blorp_exec.c b/src/mesa/drivers/dri/i965/genX_blorp_exec.c
> > index feb87923ccb..5bff7eaff59 100644
> > --- a/src/mesa/drivers/dri/i965/genX_blorp_exec.c
> > +++ b/src/mesa/drivers/dri/i965/genX_blorp_exec.c
> > @@ -224,9 +224,7 @@ genX(blorp_exec)(struct blorp_batch *batch,
> >  retry:
> > intel_batchbuffer_require_space(brw, estimated_max_batch_usage, RENDER_RING);
> > intel_batchbuffer_save_state(brw);
> > -   struct brw_bo *saved_bo = brw->batch.bo;
> > -   uint32_t saved_used = USED_BATCH(brw->batch);
> > -   uint32_t saved_state_used = brw->batch.state_used;
> > +   brw->no_batch_wrap = true;
> >  
> >  #if GEN_GEN == 6
> > /* Emit workaround flushes when we switch from drawing to blorping. */
> > @@ -254,17 +252,7 @@ retry:
> >  
> > blorp_exec(batch, params);
> >  
> > -   /* Make sure we didn't wrap the batch unintentionally, and make sure we
> > -* reserved enough space that a wrap will never happen.
> > -*/
> > -   assert(brw->batch.bo == saved_bo);
> > -   assert((USED_BATCH(brw->batch) - saved_used) * 4 +
> > -  (brw->batch.state_used - saved_state_used) <
> > -  estimated_max_batch_usage);
> > -   /* Shut up compiler warnings on release build */
> > -   (void)saved_bo;
> > -   (void)saved_used;
> > -   (void)saved_state_used;
> > +   brw->no_batch_wrap = false;
> 
> Hmm, did you add an assert(brw->no_batch_wrap) into do_flush_locked()?
> Would be good to have that assertion back now that you should have fixed
> all the early flushing...
> -Chris

The assertion was there the whole time in _intel_batchbuffer_flush_fence.
This series just prevents us from hitting it. :)
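
As a rough illustration of the mechanism discussed here (a sketch with hypothetical names and a toy `struct batch`, not the i965 implementation): while the no-wrap flag is set, running out of space grows the buffer instead of flushing, and the flush path asserts that the flag is clear.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct batch {
    size_t used;
    size_t capacity;           /* must be non-zero */
    bool no_wrap;              /* set around operations that must not be split */
    unsigned flushes;
};

static void batch_flush(struct batch *b)
{
    /* Flushing in the middle of a no-wrap section is a bug. */
    assert(!b->no_wrap);
    b->used = 0;
    b->flushes++;
}

static void batch_require_space(struct batch *b, size_t bytes)
{
    if (b->used + bytes <= b->capacity)
        return;

    if (b->no_wrap) {
        /* Grow instead of flushing. */
        while (b->used + bytes > b->capacity)
            b->capacity *= 2;
    } else {
        batch_flush(b);
        assert(bytes <= b->capacity);
    }
}

static void batch_emit(struct batch *b, size_t bytes)
{
    batch_require_space(b, bytes);
    b->used += bytes;
}
```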



Re: [Mesa-dev] [PATCH 1/3] glsl: don't drop intructions from unreachable terminators continue branch

2017-09-14 Thread tournier.elie
With the small nitpick from Eric, the series is:
Reviewed-by: Elie Tournier 

On 14 September 2017 at 05:47, Timothy Arceri  wrote:
> These instructions will be executed on every iteration of the loop,
> so we cannot drop them.
> ---
>  src/compiler/glsl/loop_analysis.h   |  7 +++
>  src/compiler/glsl/loop_controls.cpp | 15 +++
>  src/compiler/glsl/loop_unroll.cpp   |  7 ---
>  3 files changed, 22 insertions(+), 7 deletions(-)
>
> diff --git a/src/compiler/glsl/loop_analysis.h b/src/compiler/glsl/loop_analysis.h
> index 2894c6359b..0e1bfd8142 100644
> --- a/src/compiler/glsl/loop_analysis.h
> +++ b/src/compiler/glsl/loop_analysis.h
> @@ -27,20 +27,27 @@
>
>  #include "ir.h"
>  #include "util/hash_table.h"
>
>  /**
>   * Analyze and classify all variables used in all loops in the instruction list
>   */
>  extern class loop_state *
>  analyze_loop_variables(exec_list *instructions);
>
> +static inline bool
> +is_break(ir_instruction *ir)
> +{
> +   return ir != NULL && ir->ir_type == ir_type_loop_jump &&
> +  ((ir_loop_jump *) ir)->is_break();
> +}
> +
>
>  /**
>   * Fill in loop control fields
>   *
>   * Based on analysis of loop variables, this function tries to remove
>   * redundant sequences in the loop of the form
>   *
>   *  (if (expression bool ...) (break))
>   *
>   * For example, if it is provable that one loop exit condition will
> diff --git a/src/compiler/glsl/loop_controls.cpp b/src/compiler/glsl/loop_controls.cpp
> index 895954fc2d..2dff26aec0 100644
> --- a/src/compiler/glsl/loop_controls.cpp
> +++ b/src/compiler/glsl/loop_controls.cpp
> @@ -215,21 +215,36 @@ loop_control_visitor::visit_leave(ir_loop *ir)
>  * that are associated with a fixed iteration count, except for the one
>  * associated with the limiting terminator--that one needs to stay, since
>  * it terminates the loop.  Exception: if the loop still has a normative
>  * bound, then that terminates the loop, so we don't even need the limiting
>  * terminator.
>  */
> foreach_in_list(loop_terminator, t, &ls->terminators) {
>if (t->iterations < 0)
>   continue;
>
> +  exec_list *branch_instructions;
>if (t != ls->limiting_terminator) {
> + ir_instruction *ir_if_last = (ir_instruction *)
> +   t->ir->then_instructions.get_tail();
> + if (is_break(ir_if_last)) {
> +branch_instructions = &t->ir->else_instructions;
> + } else {
> +branch_instructions = &t->ir->then_instructions;
> +assert(is_break(ir_if_last));
> + }
> +
> + exec_list copy_list;
> + copy_list.make_empty();
> + clone_ir_list(ir, &copy_list, branch_instructions);
>
> + t->ir->insert_before(&copy_list);
>   t->ir->remove();
>
>   assert(ls->num_loop_jumps > 0);
>   ls->num_loop_jumps--;
>
>   this->progress = true;
>}
> }
>
> return visit_continue;
> diff --git a/src/compiler/glsl/loop_unroll.cpp b/src/compiler/glsl/loop_unroll.cpp
> index dbb3fa2fa5..7f601295a1 100644
> --- a/src/compiler/glsl/loop_unroll.cpp
> +++ b/src/compiler/glsl/loop_unroll.cpp
> @@ -46,27 +46,20 @@ public:
> void splice_post_if_instructions(ir_if *ir_if, exec_list *splice_dest);
>
> loop_state *state;
>
> bool progress;
> const struct gl_shader_compiler_options *options;
>  };
>
>  } /* anonymous namespace */
>
> -static bool
> -is_break(ir_instruction *ir)
> -{
> -   return ir != NULL && ir->ir_type == ir_type_loop_jump
> -&& ((ir_loop_jump *) ir)->is_break();
> -}
> -
>  class loop_unroll_count : public ir_hierarchical_visitor {
>  public:
> int nodes;
> bool unsupported_variable_indexing;
> bool array_indexed_by_induction_var_with_exact_iterations;
> /* If there are nested loops, the node count will be inaccurate. */
> bool nested_loop;
>
> loop_unroll_count(exec_list *list, loop_variable_state *ls,
>   const struct gl_shader_compiler_options *options)
> --
> 2.13.5
>


Re: [Mesa-dev] [PATCH] r600: fork and import gallium/radeon

2017-09-14 Thread Marek Olšák
On Thu, Sep 14, 2017 at 4:19 PM, Emil Velikov  wrote:
> Hi Marek,
>
> On 14 September 2017 at 14:06, Marek Olšák  wrote:
>> From: Marek Olšák 
>>
>> This marks the end of code sharing between r600 and radeonsi.
>>
> It has the "what" but it's missing the "why". Can you please add some
> information.
>
> From a quick look, this will make each binary ~140KiB larger (dri,
> omx, vdpau ...). As a reference point drivers/r600 and
> drivers/radeonsi themselves are around 620KiB and 280KiB respectively.
>
> With the bits duplicated/forked, should one `mv radeon{,si}` or you're
> planning that at a later stage?
>
>> A lot of functions had to be renamed to prevent linker conflicts.
>>
>> There are also minor cleanups.
>> ---
>>
>> This one is huge. Please review here:
>> https://cgit.freedesktop.org/~mareko/mesa/commit/?h=master&id=858b2d1c8cec727fdf750192c8c210f72d38f853
>>
> I'll look at those in an hour or so.

The plan is to merge gallium/radeon into radeonsi gradually over a
longer period of time.

Existing uncommitted work in gallium/radeon should apply more or less
cleanly, but will only affect radeonsi, not r600,

Marek


Re: [Mesa-dev] [PATCH 4/7] radeonsi: make use of LOAD for UBOs

2017-09-14 Thread Marek Olšák
Reviewed-by: Marek Olšák 

Marek

On Tue, Aug 22, 2017 at 2:14 PM, Timothy Arceri  wrote:
> v2: always set can_speculate and allow_smem to true
> ---
>  src/gallium/drivers/radeonsi/si_shader_tgsi_mem.c | 31 +++
>  1 file changed, 21 insertions(+), 10 deletions(-)
>
> diff --git a/src/gallium/drivers/radeonsi/si_shader_tgsi_mem.c b/src/gallium/drivers/radeonsi/si_shader_tgsi_mem.c
> index f8c99ff7e7..83cd8cd938 100644
> --- a/src/gallium/drivers/radeonsi/si_shader_tgsi_mem.c
> +++ b/src/gallium/drivers/radeonsi/si_shader_tgsi_mem.c
> @@ -66,32 +66,36 @@ static LLVMValueRef get_buffer_size(
>   LLVMConstInt(ctx->i32, 0x3FFF, 0), "");
>
> size = LLVMBuildUDiv(builder, size, stride, "");
> }
>
> return size;
>  }
>
>  static LLVMValueRef
>  shader_buffer_fetch_rsrc(struct si_shader_context *ctx,
> -const struct tgsi_full_src_register *reg)
> +const struct tgsi_full_src_register *reg,
> +bool ubo)
>  {
> LLVMValueRef index;
>
> if (!reg->Register.Indirect) {
> index = LLVMConstInt(ctx->i32, reg->Register.Index, false);
> } else {
> index = si_get_indirect_index(ctx, &reg->Indirect,
>   reg->Register.Index);
> }
>
> -   return ctx->abi.load_ssbo(&ctx->abi, index, false);
> +   if (ubo)
> +   return ctx->abi.load_ubo(&ctx->abi, index);
> +   else
> +   return ctx->abi.load_ssbo(&ctx->abi, index, false);
>  }
>
>  static bool tgsi_is_array_sampler(unsigned target)
>  {
> return target == TGSI_TEXTURE_1D_ARRAY ||
>target == TGSI_TEXTURE_SHADOW1D_ARRAY ||
>target == TGSI_TEXTURE_2D_ARRAY ||
>target == TGSI_TEXTURE_SHADOW2D_ARRAY ||
>target == TGSI_TEXTURE_CUBE_ARRAY ||
>target == TGSI_TEXTURE_SHADOWCUBE_ARRAY ||
> @@ -356,26 +360,28 @@ static void load_fetch_args(
> struct lp_build_emit_data * emit_data)
>  {
> struct si_shader_context *ctx = si_shader_context(bld_base);
> struct gallivm_state *gallivm = &ctx->gallivm;
> const struct tgsi_full_instruction * inst = emit_data->inst;
> unsigned target = inst->Memory.Texture;
> LLVMValueRef rsrc;
>
> emit_data->dst_type = ctx->v4f32;
>
> -   if (inst->Src[0].Register.File == TGSI_FILE_BUFFER) {
> +   if (inst->Src[0].Register.File == TGSI_FILE_BUFFER ||
> +  inst->Src[0].Register.File == TGSI_FILE_CONSTBUF) {
> LLVMBuilderRef builder = gallivm->builder;
> LLVMValueRef offset;
> LLVMValueRef tmp;
>
> -   rsrc = shader_buffer_fetch_rsrc(ctx, &inst->Src[0]);
> +   bool ubo = inst->Src[0].Register.File == TGSI_FILE_CONSTBUF;
> +   rsrc = shader_buffer_fetch_rsrc(ctx, &inst->Src[0], ubo);
>
> tmp = lp_build_emit_fetch(bld_base, inst, 1, 0);
> offset = LLVMBuildBitCast(builder, tmp, ctx->i32, "");
>
> buffer_append_args(ctx, emit_data, rsrc, ctx->i32_0,
>offset, false, false);
> } else if (inst->Src[0].Register.File == TGSI_FILE_IMAGE ||
>tgsi_is_bindless_image_file(inst->Src[0].Register.File)) {
> LLVMValueRef coords;
>
> @@ -407,21 +413,21 @@ static unsigned get_load_intr_attribs(bool can_speculate)
>
>  static unsigned get_store_intr_attribs(bool writeonly_memory)
>  {
> return writeonly_memory && HAVE_LLVM >= 0x0400 ?
>   LP_FUNC_ATTR_INACCESSIBLE_MEM_ONLY :
>   LP_FUNC_ATTR_WRITEONLY;
>  }
>
>  static void load_emit_buffer(struct si_shader_context *ctx,
>  struct lp_build_emit_data *emit_data,
> -bool can_speculate)
> +bool can_speculate, bool allow_smem)
>  {
> const struct tgsi_full_instruction *inst = emit_data->inst;
> uint writemask = inst->Dst[0].Register.WriteMask;
> uint count = util_last_bit(writemask);
> LLVMValueRef *args = emit_data->args;
>
> /* Don't use SMEM for shader buffer loads, because LLVM doesn't
>  * select SMEM for SI.load.const with a non-constant offset, and
>  * constant offsets practically don't exist with shader buffers.
>  *
> @@ -434,21 +440,21 @@ static void load_emit_buffer(struct si_shader_context *ctx,
>  *   After that, si_memory_barrier should invalidate sL1 for shader
>  *   buffers.
>  */
>
> assert(LLVMConstIntGetZExtValue(args[1]) == 0); /* vindex */
> emit_data->output[emit_data->chan] =
> ac_build_buffer_load(&ctx->ac, args[0], count, NULL,
> 

[Mesa-dev] [Bug 101747] Steam-Game Turmoil, Segfault on start

2017-09-14 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=101747

--- Comment #3 from Samuel Pitoiset  ---
I guess it shouldn't crash with R600_DEBUG="vs,ps,gs,tcs,tes,cs" ?

-- 
You are receiving this mail because:
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 0/2] Add libunwind to travis gallium build and fix it

2017-09-14 Thread Eric Engestrom
On Thursday, 2017-09-14 12:27:40 +0200, Gert Wollny wrote:
> The second version of the two patches now forces llvm-3.3 for 
> "make Gallium ST Other" and no longer requires changing any Makefile.am. 
> In addition, like suggested by Emil, --enable/disable-libunwind is set 
> appropriately.

Thanks, series looks good to me:
Reviewed-by: Eric Engestrom 

Emil, I'll let you push it.

> 
> Best, 
> Gert 
> 
> Gert Wollny (2):
>   .travis.yml: force llvm-3.3 for "make Gallium ST Other"
>   .travis.yml: Add libunwind-dev to gallium/make builds
> 
>  .travis.yml | 15 +++
>  1 file changed, 15 insertions(+)
> 
> -- 
> 2.13.5
> 


Re: [Mesa-dev] [PATCH] r600: fork and import gallium/radeon

2017-09-14 Thread Nicolai Hähnle

On 14.09.2017 16:19, Emil Velikov wrote:
> Hi Marek,
>
> On 14 September 2017 at 14:06, Marek Olšák  wrote:
>> From: Marek Olšák 
>>
>> This marks the end of code sharing between r600 and radeonsi.
>
> It has the "what" but it's missing the "why". Can you please add some
> information.

Basically, it's getting difficult to work on radeonsi without breaking r600.

It's not ideal, but (1) without a solid testing infrastructure that 
tests all the way back to r600, this is a reasonable measure to reduce 
the development risk, and (2) with some rare exceptions, our work on 
radeonsi tends not to help r600, neither in performance nor in features 
(and least of all bug fixes).

I haven't looked at the patch itself yet, but I'm okay with the general 
idea.

Cheers,
Nicolai

> From a quick look which will make each binary ~140KiB larger (dri,
> omx, vdpau ...). As a reference point drivers/r600 and
> drivers/radeonsi themselves are around 620KiB and 280KiB respectively.
>
> With the bits duplicated/forked, should one `mv radeon{,si}` or you're
> planning that at a later stage?
>
>> A lot of functions had to be renamed to prevent linker conflicts.
>>
>> There are also minor cleanups.
>> ---
>>
>> This one is huge. Please review here:
>> https://cgit.freedesktop.org/~mareko/mesa/commit/?h=master&id=858b2d1c8cec727fdf750192c8c210f72d38f853
>
> I'll look at those in an hour or so.
>
> -Emil

--
Learn how the world really is,
But never forget how it should be.


Re: [Mesa-dev] V2 radeonsi use STD430 packing of UBOs by default

2017-09-14 Thread Nicolai Hähnle

On 14.09.2017 15:14, Marek Olšák wrote:
> On Thu, Sep 14, 2017 at 12:31 PM, Timothy Arceri  wrote:
>> On 31/08/17 01:55, Marek Olšák wrote:
>>> On Wed, Aug 30, 2017 at 2:22 PM, Timothy Arceri wrote:
>>>> On 30/08/17 20:07, Marek Olšák wrote:
>>>>> If LLVM was fixed to do the correct thing, we could enable CONSTBUF
>>>>> LOAD for LLVM 6.0 and later.
>>>>
>>>> You seem to think that the compiler *should* be placing them near where
>>>> they are used? What part of LLVM were you expecting to do this? I'm happy
>>>> to do some digging around but don't know where I should start looking.
>>>
>>> I think the LLVM machine instruction scheduler should do that. The
>>> starting point would be to add "-print-after-all" to llc or LLVM
>>> arguments in Mesa to have visibility into what LLVM is doing. From
>>> that point it's just about learning to understand that. By default,
>>> LLVM assumes that most or all loads may be affected by any store. LLVM
>>> might also think that the instruction order is OK and doesn't need
>>> changes. I don't know what the exact issue is.
>>>
>>> If Natural Selection 2 is the only game showing small changes in
>>> shader-db stats and there are no differences in *real performance* of
>>> NS2 and other apps, I'd say let's merge this.
>>
>> Retesting with master and more recent LLVM I'm getting:
>>
>> MaxWaves -1.68% (previously was -2.94%) with -1.60% for NS2.
>>
>> My care factor for NS2 has officially dropped to 0. I got a copy of it for
>> testing but I noticed:
>>
>>  1. OpenGL support is still marked as beta
>>  2. It crashes when I try to load the tutorial, I assume its related to
>>     this bug [1].
>>
>> Since this is the case I'd rather not hold up this work based on the results
>> of a buggy game. Marek is patch 4 ok with you? Everything else has you r-b
>> (once I split patch 7).
>
> Can you remind what the name of patch 4 was?

"radeonsi: make use of LOAD for UBOs"

The advantages of using a real e-mail client ;-)

Cheers,
Nicolai

> Thanks,
> Marek

--
Learn how the world really is,
But never forget how it should be.


Re: [Mesa-dev] [PATCH] r600: fork and import gallium/radeon

2017-09-14 Thread Emil Velikov
Hi Marek,

On 14 September 2017 at 14:06, Marek Olšák  wrote:
> From: Marek Olšák 
>
> This marks the end of code sharing between r600 and radeonsi.
>
It has the "what" but it's missing the "why". Can you please add some
information?

From a quick look, this will make each binary ~140KiB larger (dri,
omx, vdpau ...). As a reference point, drivers/r600 and
drivers/radeonsi themselves are around 620KiB and 280KiB respectively.

With the bits duplicated/forked, should one `mv radeon{,si}`, or are you
planning that at a later stage?

> A lot of functions had to be renamed to prevent linker conflicts.
>
> There are also minor cleanups.
> ---
>
> This one is huge. Please review here:
> https://cgit.freedesktop.org/~mareko/mesa/commit/?h=master&id=858b2d1c8cec727fdf750192c8c210f72d38f853
>
I'll look at those in an hour or so.

-Emil


[Mesa-dev] [PATCH v2] r600g/sb: Don't require array declarations for TGSI_FILE_SYSTEM_VALUE

2017-09-14 Thread Gert Wollny
Although gl_SampleMaskIn is declared as an array in GLSL, it is
effectively a 32 bit mask on all hardware supported by mesa, so the
array indexing is ignored (Thanks Glenn Kennard for the explanation).

Add a comment that the assert is not made superfluous by the else branch.

Corrects: piglit spec@arb_gpu_shader5@execution@samplemaskin-indirect for
debug builds (it already passed in release).
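
A minimal sketch of the mask test this patch extends, for readers who do not
have the parser in front of them. The enum values and the function name
below are hypothetical stand-ins (the real TGSI_FILE_* values differ); only
the shape of the check is taken from the diff.

```c
#include <stdbool.h>

/* Hypothetical register-file IDs; the real TGSI_FILE_* values differ. */
enum sketch_file {
   SKETCH_FILE_CONSTANT,
   SKETCH_FILE_INPUT,
   SKETCH_FILE_TEMPORARY,
   SKETCH_FILE_SAMPLER,
   SKETCH_FILE_SYSTEM_VALUE
};

/* Returns true when at least one indirectly addressed register file
 * is outside the exempt set, i.e. when array declarations are expected. */
static bool sketch_needs_array_decls(unsigned indirect_files)
{
   const unsigned exempt = (1u << SKETCH_FILE_CONSTANT) |
                           (1u << SKETCH_FILE_SAMPLER) |
                           (1u << SKETCH_FILE_SYSTEM_VALUE);

   return (indirect_files & ~exempt) != 0;
}
```

With SYSTEM_VALUE in the exempt set, a shader whose only indirect access is
to gl_SampleMaskIn no longer reaches the assert on num_arrays.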
---
- v1 was "r600/sb: remove superfluos assert", but this name doesn't make 
  sense anymore
- Submitter has no mesa git write access. 

 src/gallium/drivers/r600/sb/sb_bc_parser.cpp | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/r600/sb/sb_bc_parser.cpp b/src/gallium/drivers/r600/sb/sb_bc_parser.cpp
index ae92a767b4..c7b9032049 100644
--- a/src/gallium/drivers/r600/sb/sb_bc_parser.cpp
+++ b/src/gallium/drivers/r600/sb/sb_bc_parser.cpp
@@ -125,7 +125,9 @@ int bc_parser::parse_decls() {
 		return 0;
 	}
 
-	if (pshader->indirect_files & ~((1 << TGSI_FILE_CONSTANT) | (1 << TGSI_FILE_SAMPLER))) {
+	if (pshader->indirect_files &
+	    ~((1 << TGSI_FILE_CONSTANT) | (1 << TGSI_FILE_SAMPLER) |
+	      (1 << TGSI_FILE_SYSTEM_VALUE))) {
 
 		assert(pshader->num_arrays);
 
@@ -135,6 +137,10 @@ int bc_parser::parse_decls() {
 				sh->add_gpr_array(a.gpr_start, a.gpr_count, a.comp_mask);
 			}
 		} else {
+			/* When the above assert is disabled and proper array info
+			 * is missing for some reason then, as a fallback, make sure
+			 * that all GPRs can be accessed indirectly.
+			 */
 			sh->add_gpr_array(0, pshader->bc.ngpr, 0x0F);
 		}
 	}
-- 
2.13.5



[Mesa-dev] [Bug 102318] Mesa3D Scons build - LLVM 5.0 not supported

2017-09-14 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=102318

--- Comment #4 from Emil Velikov  ---
(In reply to Alex Granni from comment #3)
> Created attachment 134221 [details] [review]
> Patch that enables LLVM 5.0 support to Mesa3D scons build
> 
> Here is the patch that adds support for LLVM 5.0 to Scons build.
> Turned out to be very straight-forward.

Nicely done. Please send it to the list for review.
Small nit: please keep the hunk like the 4.0 one, adding any extra entries at
the end.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


Re: [Mesa-dev] V2 radeonsi use STD430 packing of UBOs by default

2017-09-14 Thread Marek Olšák
On Thu, Sep 14, 2017 at 12:31 PM, Timothy Arceri  wrote:
>
>
> On 31/08/17 01:55, Marek Olšák wrote:
>>
>> On Wed, Aug 30, 2017 at 2:22 PM, Timothy Arceri 
>> wrote:
>>>
>>> On 30/08/17 20:07, Marek Olšák wrote:


>>>> If LLVM was fixed to do the correct thing, we could enable CONSTBUF
>>>> LOAD for LLVM 6.0 and later.
>>>
>>>
>>>
>>> You seem to think that the compiler *should* be placing them near where
>>> they
>>> are used? What part of LLVM were you expecting to do this? I'm happy to
>>> do
>>> some digging around but don't know where I should start looking.
>>
>>
>> I think the LLVM machine instruction scheduler should do that. The
>> starting point would be to add "-print-after-all" to llc or LLVM
>> arguments in Mesa to have visibility into what LLVM is doing. From
>> that point it's just about learning to understand that. By default,
>> LLVM assumes that most or all loads may be affected by any store. LLVM
>> might also think that the instruction order is OK and doesn't need
>> changes. I don't know what the exact issue is.
>>
>> If Natural Selection 2 is the only game showing small changes in
>> shader-db stats and there are no differences in *real performance* of
>> NS2 and other apps, I'd say let's merge this.
>
>
> Retesting with master and more recent LLVM I'm getting:
>
> MaxWaves -1.68% (previously was -2.94%) with -1.60% for NS2.
>
> My care factor for NS2 has officially dropped to 0. I got a copy of it for
> testing but I noticed:
>
>  1. OpenGL support is still marked as beta
>  2. It crashes when I try to load the tutorial, I assume its related to
> this bug [1].
>
> Since this is the case I'd rather not hold up this work based on the results
> of a buggy game. Marek is patch 4 ok with you? Everything else has you r-b
> (once I split patch 7).

Can you remind what the name of patch 4 was?

Thanks,
Marek


[Mesa-dev] [PATCH] r600: fork and import gallium/radeon

2017-09-14 Thread Marek Olšák
From: Marek Olšák 

This marks the end of code sharing between r600 and radeonsi.

A lot of functions had to be renamed to prevent linker conflicts.

There are also minor cleanups.
---

This one is huge. Please review here:
https://cgit.freedesktop.org/~mareko/mesa/commit/?h=master&id=858b2d1c8cec727fdf750192c8c210f72d38f853

Thanks,
Marek

 configure.ac   |3 +-
 src/gallium/Makefile.am|2 +-
 src/gallium/drivers/r600/Automake.inc  |2 -
 src/gallium/drivers/r600/Makefile.am   |3 +-
 src/gallium/drivers/r600/Makefile.sources  |   21 +-
 src/gallium/drivers/r600/cayman_msaa.c |  269 +++
 src/gallium/drivers/r600/r600_buffer_common.c  |  687 ++
 src/gallium/drivers/r600/r600_cs.h |  209 ++
 src/gallium/drivers/r600/r600_gpu_load.c   |  283 +++
 src/gallium/drivers/r600/r600_perfcounter.c|  649 ++
 src/gallium/drivers/r600/r600_pipe.c   |4 +-
 src/gallium/drivers/r600/r600_pipe.h   |4 +-
 src/gallium/drivers/r600/r600_pipe_common.c| 1622 +
 src/gallium/drivers/r600/r600_pipe_common.h| 1020 
 src/gallium/drivers/r600/r600_query.c  | 2201 +
 src/gallium/drivers/r600/r600_query.h  |  327 +++
 src/gallium/drivers/r600/r600_streamout.c  |  381 +++
 src/gallium/drivers/r600/r600_test_dma.c   |  398 
 src/gallium/drivers/r600/r600_texture.c| 2464 
 src/gallium/drivers/r600/r600_uvd.c|6 +-
 src/gallium/drivers/r600/r600_viewport.c   |  433 
 src/gallium/drivers/r600/radeon_uvd.c  | 1618 +
 src/gallium/drivers/r600/radeon_uvd.h  |  447 
 src/gallium/drivers/r600/radeon_vce.c  |  553 +
 src/gallium/drivers/r600/radeon_vce.h  |  462 
 src/gallium/drivers/r600/radeon_video.c|  372 +++
 src/gallium/drivers/r600/radeon_video.h|   85 +
 src/gallium/drivers/radeon/cayman_msaa.c   |   30 +-
 src/gallium/drivers/radeon/r600_buffer_common.c|   72 +-
 src/gallium/drivers/radeon/r600_gpu_load.c |8 +-
 src/gallium/drivers/radeon/r600_perfcounter.c  |   44 +-
 src/gallium/drivers/radeon/r600_pipe_common.c  |  172 +-
 src/gallium/drivers/radeon/r600_pipe_common.h  |  279 ++-
 src/gallium/drivers/radeon/r600_query.c|  229 +-
 src/gallium/drivers/radeon/r600_query.h|   60 +-
 src/gallium/drivers/radeon/r600_streamout.c|   22 +-
 src/gallium/drivers/radeon/r600_test_dma.c |2 +-
 src/gallium/drivers/radeon/r600_texture.c  |  225 +-
 src/gallium/drivers/radeon/r600_viewport.c |   16 +-
 src/gallium/drivers/radeon/radeon_uvd.c|   58 +-
 src/gallium/drivers/radeon/radeon_uvd.h|   10 +-
 src/gallium/drivers/radeon/radeon_vce.c|   66 +-
 src/gallium/drivers/radeon/radeon_vce.h|   52 +-
 src/gallium/drivers/radeon/radeon_vce_40_2_2.c |   14 +-
 src/gallium/drivers/radeon/radeon_vce_50.c |   16 +-
 src/gallium/drivers/radeon/radeon_vce_52.c |   14 +-
 src/gallium/drivers/radeon/radeon_vcn_dec.c|   56 +-
 src/gallium/drivers/radeon/radeon_video.c  |   44 +-
 src/gallium/drivers/radeon/radeon_video.h  |   36 +-
 src/gallium/drivers/radeonsi/cik_sdma.c|   12 +-
 src/gallium/drivers/radeonsi/si_blit.c |8 +-
 src/gallium/drivers/radeonsi/si_compute.c  |4 +-
 src/gallium/drivers/radeonsi/si_cp_dma.c   |2 +-
 src/gallium/drivers/radeonsi/si_debug.c|4 +-
 src/gallium/drivers/radeonsi/si_descriptors.c  |   12 +-
 src/gallium/drivers/radeonsi/si_dma.c  |8 +-
 src/gallium/drivers/radeonsi/si_hw_context.c   |   10 +-
 src/gallium/drivers/radeonsi/si_perfcounter.c  |   14 +-
 src/gallium/drivers/radeonsi/si_pipe.c |   24 +-
 src/gallium/drivers/radeonsi/si_shader.c   |   16 +-
 .../drivers/radeonsi/si_shader_tgsi_setup.c|4 +-
 src/gallium/drivers/radeonsi/si_state.c|   18 +-
 src/gallium/drivers/radeonsi/si_state_draw.c   |6 +-
 src/gallium/drivers/radeonsi/si_state_shaders.c|   18 +-
 src/gallium/drivers/radeonsi/si_uvd.c  |8 +-
 src/gallium/targets/pipe-loader/Makefile.am|1 -
 66 files changed, 15240 insertions(+), 979 deletions(-)
 create mode 100644 src/gallium/drivers/r600/cayman_msaa.c
 create mode 100644 src/gallium/drivers/r600/r600_buffer_common.c
 create mode 100644 src/gallium/drivers/r600/r600_cs.h
 create mode 100644 src/gallium/drivers/r600/r600_gpu_load.c
 create mode 100644 src/gallium/drivers/r600/r600_perfcounter.c
 create mode 100644 src/gallium/drivers/r600/r600_pipe_common.c
 create mode 100644 src/gallium/drivers/r600/r600_pipe_

Re: [Mesa-dev] i965 NIR linking

2017-09-14 Thread Timothy Arceri



On 14/09/17 18:19, Eduardo Lima Mitev wrote:
> On 09/13/2017 01:37 AM, Timothy Arceri wrote:
>> This started out based off the work Jason did back in 2015 to add
>> NIR linking to the Intel VK driver. It needed a reasonable amount
>> of updates to work with the GL driver, tess, xfb, etc.
>>
>> As per the results in patch 8, it can provide some nice
>> improvements despite the GLSL IR linker already doing the same
>> link time removal of unused varyings.
>>
>> Ultimately I'd like to use this with radv but adding it to i965
>> first provides a good test platform given the mature test suites,
>> and extensive shader-db collections available for OpenGL. I'm
>> planning on also adding a NIR packing pass and it makes sense
>> to test that here also. I believe the packing pass should be the
>> last step towards removing any dependency on the GLSL IR
>> optimisation passes.
>>
>> Please review.
>
> Hi Timothy,
>
> Apart from the comments I left in some patches, series is
>
> Reviewed-by: Eduardo Lima Mitev 
>
> Thank you for bringing up this series. This is interesting ground work
> looking into support for ARB_gl_spirv on i965. We are still analyzing
> different approaches, and one option is having "SPIR-V -> NIR -> BRW",
> doing linkage in NIR.

Hi Eduardo,

You might want to talk with Nicolai, he is working on ARB_gl_spirv 
support for radeonsi [1]. He is taking a SPIR-V -> NIR -> LLVM path, so 
most of your work will likely overlap.

[1] https://lists.freedesktop.org/archives/mesa-dev/2017-May/156413.html

> Related to that, I have a couple questions:
>
> Do you plan to continue working on improving on this (tackling the open
> points you mention)? If so, do you have a roadmap?
>
> Have you thoughts already on potential issues to implement a fully
> capable NIR linker (e.g, one that would avoid any use of the GLSL linker)?.

No real roadmap. I'm going to start playing with a NIR varying packing 
pass tomorrow/next week. I've tried to do this a number of times in the 
past for i965 but there always seemed to be something else that needed 
to be done first for it to work. This time however I think everything is 
pretty much in place.

My plan was never to avoid the GLSL linker; it was to do something 
like this around the same spot I call this linking pass.


  - remove from GLSL IR the varyings/uniforms that the NIR opt
    passes removed. This is safe because at this point all we want
    from the GLSL IR is to assign locations and do validation on
    varyings and uniforms; we don't care if we make the IR invalid.

  - call GLSL IR assign varying/uniform location/validation passes this
will throw any OpenGL api errors we are required to, setup all the
various GL state that can be queried, etc.

  - free GLSL IR

  - add a NIR varying packing pass here

We *could* replace the above last stage of the GLSL IR linker with a 
version that just uses NIR but that seemed like too much work for a 
single person so I was going for the short cut.
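
As a rough illustration of the "link time removal of unused varyings" that
both linkers perform, here is a minimal sketch that treats each stage's
varyings as a bitmask of slots. The struct and function names are
hypothetical, and it deliberately ignores outputs that must be kept even
when unread (position, transform feedback captures); it is not how the NIR
pass in this series is written.

```c
#include <stdint.h>

/* Hypothetical per-stage summary: one bit per varying slot. */
struct sketch_stage {
   uint64_t outputs_written;  /* slots the producer stage writes */
   uint64_t inputs_read;      /* slots the consumer stage reads */
};

/* Keeps only the slots that are both written by the producer and read by
 * the consumer. Returns how many slots were dropped across both stages. */
static unsigned
sketch_remove_unused_varyings(struct sketch_stage *producer,
                              struct sketch_stage *consumer)
{
   uint64_t live = producer->outputs_written & consumer->inputs_read;
   uint64_t dead = (producer->outputs_written | consumer->inputs_read) & ~live;
   unsigned removed = 0;

   producer->outputs_written = live;
   consumer->inputs_read = live;

   /* Count the dropped slots by clearing the lowest set bit each pass. */
   while (dead) {
      dead &= dead - 1;
      removed++;
   }
   return removed;
}
```

Once the NIR side has shrunk these sets, the first bullet above amounts to
mirroring the same removal back into the GLSL IR before location assignment.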


I believe we also need these patches, which I never got a review for, 
which move some GLSL IR lowering passes earlier [1][2].


[1] https://patchwork.freedesktop.org/patch/112044/
[2] https://patchwork.freedesktop.org/patch/112045/

> Thanks!
>
> Eduardo




[Mesa-dev] [RFC 2/3] i965: emit BRW_NEW_AUX_STATE when we allocate aux surfaces

2017-09-14 Thread Iago Toral Quiroga
Fixes a regression introduced with b96313c0e1289b296d7, which removed
BRW_NEW_BLORP from a bunch of SURFACE_STATE setup code, including render
targets, on the basis that blorp invalidates binding tables but not
surface states. However, at least on Broadwell, this caused a regression
in a CTS test, which Ken and Jason tracked down to the fact that we
are not uploading new render target surface states after allocating
new CCS_D surfaces for fast clears (this allocation is deferred until
an actual clear occurs).

The reason this only fails on BDW is that on SKL+ we use CCS_E, which
is allocated up front, so it exists in the initial surface state. The
problem can be reproduced on these platforms too if we use
INTEL_DEBUG=norcb to force the CCS_D path.

This patch ensures that any time we create a new aux surface we
flag BRW_NEW_AUX_STATE so we upload new surface state for it. In theory,
we strictly need to do this only for CCS_D, because it is the only
kind for which allocation can be deferred; all the others are allocated
in the initial state so they will always be uploaded. Still, it is probably
not a bad idea to flag all of them anyway.

Credit goes to Jason and Ken for figuring out the reason for the
regression.
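
The failure mode can be sketched in a few lines. This is a toy model, not
driver code: the context struct, the bit position, and the function names
are all assumptions made for illustration. It shows why a deferred aux
allocation has to set a dirty bit that the surface-state upload listens to.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_NEW_AUX_STATE (1ull << 0)  /* hypothetical bit position */

struct sketch_context {
   uint64_t new_driver_state;   /* accumulated dirty bits */
   bool has_aux;                /* stand-in for a miptree's aux buffer */
   bool surface_state_has_aux;  /* what the last uploaded state describes */
   unsigned uploads;            /* number of surface state re-emissions */
};

/* Deferred allocation, as happens for CCS_D on the first fast clear. */
static void sketch_alloc_aux(struct sketch_context *ctx)
{
   ctx->has_aux = true;
   ctx->new_driver_state |= SKETCH_NEW_AUX_STATE;
}

/* Re-emits the surface state only if a bit it listens to is dirty,
 * then clears the dirty bits as a state upload would. */
static void sketch_upload_state(struct sketch_context *ctx)
{
   if (ctx->new_driver_state & SKETCH_NEW_AUX_STATE) {
      ctx->surface_state_has_aux = ctx->has_aux;
      ctx->uploads++;
   }
   ctx->new_driver_state = 0;
}
```

Without the flag in sketch_alloc_aux(), the upload step would skip the
surface state and keep describing a render target with no aux buffer.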

Fixes:
KHR-GL45.transform_feedback.draw_xfb_test
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 32394ca3aaa..8a809a7320d 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -953,6 +953,8 @@ create_ccs_buf_for_image(struct brw_context *brw,
    mt->mcs_buf->qpitch = 0;
    mt->mcs_buf->surf = temp_ccs_surf;
 
+   brw->ctx.NewDriverState |= BRW_NEW_AUX_STATE;
+
    return true;
 }
 
@@ -1731,6 +1733,7 @@ intel_miptree_alloc_mcs(struct brw_context *brw,
    }
 
    mt->aux_state = aux_state;
+   brw->ctx.NewDriverState |= BRW_NEW_AUX_STATE;
 
    intel_miptree_init_mcs(brw, mt, 0xFF);
 
@@ -1780,6 +1783,7 @@ intel_miptree_alloc_ccs(struct brw_context *brw,
    }
 
    mt->aux_state = aux_state;
+   brw->ctx.NewDriverState |= BRW_NEW_AUX_STATE;
 
    return true;
 }
@@ -1851,6 +1855,7 @@ intel_miptree_alloc_hiz(struct brw_context *brw,
       intel_miptree_level_enable_hiz(brw, mt, level);
 
    mt->aux_state = aux_state;
+   brw->ctx.NewDriverState |= BRW_NEW_AUX_STATE;
 
    return true;
 }
-- 
2.11.0



[Mesa-dev] [RFC 3/3] i965: flag BRW_NEW_AUX_STATE if we drop the aux buffer

2017-09-14 Thread Iago Toral Quiroga
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 8a809a7320d..0328a4604dd 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -2859,6 +2859,7 @@ intel_miptree_make_shareable(struct brw_context *brw,
        */
       free(mt->aux_state);
       mt->aux_state = NULL;
+      brw->ctx.NewDriverState |= BRW_NEW_AUX_STATE;
    }
 
    if (mt->hiz_buf) {
@@ -2875,6 +2876,7 @@ intel_miptree_make_shareable(struct brw_context *brw,
        */
       free(mt->aux_state);
       mt->aux_state = NULL;
+      brw->ctx.NewDriverState |= BRW_NEW_AUX_STATE;
    }
 
    mt->aux_usage = ISL_AUX_USAGE_NONE;
-- 
2.11.0



[Mesa-dev] [RFC 0/3] Flag new aux state when we create/drop aux surfaces

2017-09-14 Thread Iago Toral Quiroga
Jason, Ken: this is what I came up with based on your findings that Ken
reported in https://bugs.freedesktop.org/show_bug.cgi?id=102611.

Ken mentioned that he is not completely certain that we need to flag
dirty state every time we update the surfaces and that maybe flagging
only when we go from/to AUX_USAGE_NONE might suffice. I am not sure,
but to me it sounds safer to flag when we create the surfaces, since
at that point we know we are going to use them and in that case we
need the new surface states emitted, at least until we find specific
cases where we don't need this and we can drop the flag for them. Let
me know if you think a different approach is better though.

The last two patches should probably be squashed. The last patch
signals dirty AUX state when we drop aux surfaces. I think this
is necessary too, since when we upload new renderbuffers we
also consider the case where we don't have aux, but I decided to
split it because we have not been signaling anything for dropped
aux surfaces before.

Iago Toral Quiroga (3):
  i965: rename BRW_NEW_FAST_CLEAR_COLOR to BRW_NEW_AUX_STATE
  i965: emit BRW_NEW_AUX_STATE when we allocate aux surfaces
  i965: flag BRW_NEW_AUX_STATE if we drop the aux buffer

 src/mesa/drivers/dri/i965/brw_blorp.c |  2 +-
 src/mesa/drivers/dri/i965/brw_context.h   |  4 ++--
 src/mesa/drivers/dri/i965/brw_gs_surface_state.c  |  2 +-
 src/mesa/drivers/dri/i965/brw_state_upload.c  |  2 +-
 src/mesa/drivers/dri/i965/brw_tcs_surface_state.c |  2 +-
 src/mesa/drivers/dri/i965/brw_tes_surface_state.c |  2 +-
 src/mesa/drivers/dri/i965/brw_vs_surface_state.c  |  2 +-
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c  | 12 ++--
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c |  7 +++
 9 files changed, 21 insertions(+), 14 deletions(-)

-- 
2.11.0



[Mesa-dev] [RFC 1/3] i965: rename BRW_NEW_FAST_CLEAR_COLOR to BRW_NEW_AUX_STATE

2017-09-14 Thread Iago Toral Quiroga
We want to use this flag to signal changes to the aux surfaces,
so let's not make it about fast clearing only. Suggested by Jason.
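
For context, a minimal sketch of the pattern the rename touches: each state
ID is an enum entry, each BRW_NEW_* flag is one bit derived from it, and a
tracked-state atom lists the bits it depends on. All SKETCH_* and sketch_*
names are hypothetical and the enum is trimmed to three entries; the real
enum brw_state_id is much longer.

```c
#include <stdint.h>

/* Hypothetical, trimmed-down list of state IDs. */
enum sketch_state_id {
   SKETCH_STATE_BATCH,
   SKETCH_STATE_DRAW_CALL,
   SKETCH_STATE_AUX,        /* renamed from a fast-clear-only meaning */
   SKETCH_NUM_STATE_BITS
};

/* One dirty bit per state ID. */
#define SKETCH_NEW_BATCH      (1ull << SKETCH_STATE_BATCH)
#define SKETCH_NEW_DRAW_CALL  (1ull << SKETCH_STATE_DRAW_CALL)
#define SKETCH_NEW_AUX_STATE  (1ull << SKETCH_STATE_AUX)

struct sketch_tracked_state {
   uint64_t dirty;   /* bits that make this atom stale */
};

/* An image-surface atom that listens for aux changes, among others. */
static const struct sketch_tracked_state sketch_image_surfaces = {
   .dirty = SKETCH_NEW_BATCH | SKETCH_NEW_AUX_STATE,
};

/* Non-zero if any bit the atom depends on is set in the dirty mask. */
static int
sketch_atom_is_dirty(const struct sketch_tracked_state *atom,
                     uint64_t new_driver_state)
{
   return (atom->dirty & new_driver_state) != 0;
}
```

Because only the name changes and the bit keeps its position, every atom
that listened to the old flag keeps firing under the new one.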
---
 src/mesa/drivers/dri/i965/brw_blorp.c |  2 +-
 src/mesa/drivers/dri/i965/brw_context.h   |  4 ++--
 src/mesa/drivers/dri/i965/brw_gs_surface_state.c  |  2 +-
 src/mesa/drivers/dri/i965/brw_state_upload.c  |  2 +-
 src/mesa/drivers/dri/i965/brw_tcs_surface_state.c |  2 +-
 src/mesa/drivers/dri/i965/brw_tes_surface_state.c |  2 +-
 src/mesa/drivers/dri/i965/brw_vs_surface_state.c  |  2 +-
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c  | 12 ++--
 8 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c 
b/src/mesa/drivers/dri/i965/brw_blorp.c
index 4c6ae369196..fde12576237 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.c
+++ b/src/mesa/drivers/dri/i965/brw_blorp.c
@@ -862,7 +862,7 @@ do_single_blorp_clear(struct brw_context *brw, struct 
gl_framebuffer *fb,
* on the next draw call.
*/
   if (!same_clear_color)
- ctx->NewDriverState |= BRW_NEW_FAST_CLEAR_COLOR;
+ ctx->NewDriverState |= BRW_NEW_AUX_STATE;
 
   DBG("%s (fast) to mt %p level %d layers %d+%d\n", __FUNCTION__,
   irb->mt, irb->mt_level, irb->mt_layer, num_layers);
diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index 92fc16de136..7205c058665 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -215,7 +215,7 @@ enum brw_state_id {
BRW_STATE_VIEWPORT_COUNT,
BRW_STATE_CONSERVATIVE_RASTERIZATION,
BRW_STATE_DRAW_CALL,
-   BRW_STATE_FAST_CLEAR_COLOR,
+   BRW_STATE_AUX,
BRW_NUM_STATE_BITS
 };
 
@@ -307,7 +307,7 @@ enum brw_state_id {
 #define BRW_NEW_BLORP   (1ull << BRW_STATE_BLORP)
 #define BRW_NEW_CONSERVATIVE_RASTERIZATION (1ull << 
BRW_STATE_CONSERVATIVE_RASTERIZATION)
 #define BRW_NEW_DRAW_CALL   (1ull << BRW_STATE_DRAW_CALL)
-#define BRW_NEW_FAST_CLEAR_COLOR(1ull << BRW_STATE_FAST_CLEAR_COLOR)
+#define BRW_NEW_AUX_STATE   (1ull << BRW_STATE_AUX)
 
 struct brw_state_flags {
/** State update flags signalled by mesa internals */
diff --git a/src/mesa/drivers/dri/i965/brw_gs_surface_state.c 
b/src/mesa/drivers/dri/i965/brw_gs_surface_state.c
index 99219af8ac9..f79ce53d9a5 100644
--- a/src/mesa/drivers/dri/i965/brw_gs_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_gs_surface_state.c
@@ -129,7 +129,7 @@ const struct brw_tracked_state brw_gs_image_surfaces = {
.dirty = {
   .mesa = _NEW_TEXTURE,
   .brw = BRW_NEW_BATCH |
- BRW_NEW_FAST_CLEAR_COLOR |
+ BRW_NEW_AUX_STATE |
  BRW_NEW_GEOMETRY_PROGRAM |
  BRW_NEW_GS_PROG_DATA |
  BRW_NEW_IMAGE_UNITS,
diff --git a/src/mesa/drivers/dri/i965/brw_state_upload.c 
b/src/mesa/drivers/dri/i965/brw_state_upload.c
index 7b31aad170a..3c8a0566596 100644
--- a/src/mesa/drivers/dri/i965/brw_state_upload.c
+++ b/src/mesa/drivers/dri/i965/brw_state_upload.c
@@ -353,7 +353,7 @@ static struct dirty_bit_map brw_bits[] = {
DEFINE_BIT(BRW_NEW_VIEWPORT_COUNT),
DEFINE_BIT(BRW_NEW_CONSERVATIVE_RASTERIZATION),
DEFINE_BIT(BRW_NEW_DRAW_CALL),
-   DEFINE_BIT(BRW_NEW_FAST_CLEAR_COLOR),
+   DEFINE_BIT(BRW_NEW_AUX_STATE),
{0, 0, 0}
 };
 
diff --git a/src/mesa/drivers/dri/i965/brw_tcs_surface_state.c 
b/src/mesa/drivers/dri/i965/brw_tcs_surface_state.c
index 72b1b809e77..df618e0a2aa 100644
--- a/src/mesa/drivers/dri/i965/brw_tcs_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_tcs_surface_state.c
@@ -129,7 +129,7 @@ brw_upload_tcs_image_surfaces(struct brw_context *brw)
 const struct brw_tracked_state brw_tcs_image_surfaces = {
.dirty = {
   .brw = BRW_NEW_BATCH |
- BRW_NEW_FAST_CLEAR_COLOR |
+ BRW_NEW_AUX_STATE |
  BRW_NEW_IMAGE_UNITS |
  BRW_NEW_TCS_PROG_DATA |
  BRW_NEW_TESS_PROGRAMS,
diff --git a/src/mesa/drivers/dri/i965/brw_tes_surface_state.c 
b/src/mesa/drivers/dri/i965/brw_tes_surface_state.c
index 83c625ff43b..a6204ced28b 100644
--- a/src/mesa/drivers/dri/i965/brw_tes_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_tes_surface_state.c
@@ -129,7 +129,7 @@ brw_upload_tes_image_surfaces(struct brw_context *brw)
 const struct brw_tracked_state brw_tes_image_surfaces = {
.dirty = {
   .brw = BRW_NEW_BATCH |
- BRW_NEW_FAST_CLEAR_COLOR |
+ BRW_NEW_AUX_STATE |
  BRW_NEW_IMAGE_UNITS |
  BRW_NEW_TESS_PROGRAMS |
  BRW_NEW_TES_PROG_DATA,
diff --git a/src/mesa/drivers/dri/i965/brw_vs_surface_state.c 
b/src/mesa/drivers/dri/i965/brw_vs_surface_state.c
index 2906a927c9a..00b5077894c 100644
--- a/src/mesa/drivers/dri/i965/brw_vs_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_vs_surface_state.c
@@ -194,7 +194,7 @@ const struct brw_tracked_state brw_vs_image_surfaces = {
.dirty = {
 

Re: [Mesa-dev] [PATCH v2 01/15] i965: Delete a batch size assertion that isn't very useful.

2017-09-14 Thread Chris Wilson
Quoting Matt Turner (2017-09-14 00:22:45)
> The series looks good to me. I had a question on 03/15, mostly for
> clarification in the commit message. The series is
> 
> Reviewed-by: Matt Turner 

Seconded. Just a minor tweak around reallocing the shadow in my case,
but that doesn't have to happen immediately.

Reviewed-by: Chris Wilson 

P.S. Pretty please do make the reloc/execobject vectors a power of two :)
-Chris


Re: [Mesa-dev] V2 radeonsi use STD430 packing of UBOs by default

2017-09-14 Thread Timothy Arceri



On 31/08/17 01:55, Marek Olšák wrote:
> On Wed, Aug 30, 2017 at 2:22 PM, Timothy Arceri  wrote:
>> On 30/08/17 20:07, Marek Olšák wrote:
>>> If LLVM was fixed to do the correct thing, we could enable CONSTBUF
>>> LOAD for LLVM 6.0 and later.
>>
>> You seem to think that the compiler *should* be placing them near where they
>> are used? What part of LLVM were you expecting to do this? I'm happy to do
>> some digging around but don't know where I should start looking.
>
> I think the LLVM machine instruction scheduler should do that. The
> starting point would be to add "-print-after-all" to llc or LLVM
> arguments in Mesa to have visibility into what LLVM is doing. From
> that point it's just about learning to understand that. By default,
> LLVM assumes that most or all loads may be affected by any store. LLVM
> might also think that the instruction order is OK and doesn't need
> changes. I don't know what the exact issue is.
>
> If Natural Selection 2 is the only game showing small changes in
> shader-db stats and there are no differences in *real performance* of
> NS2 and other apps, I'd say let's merge this.

Retesting with master and a more recent LLVM I'm getting:

MaxWaves -1.68% (previously was -2.94%) with -1.60% for NS2.

My care factor for NS2 has officially dropped to 0. I got a copy of it 
for testing but I noticed:

 1. OpenGL support is still marked as beta
 2. It crashes when I try to load the tutorial; I assume it's related to
    this bug [1].

Since this is the case I'd rather not hold up this work based on the 
results of a buggy game. Marek, is patch 4 ok with you? Everything else 
has your r-b (once I split patch 7).

Thanks,
Tim

[1] https://bugs.freedesktop.org/show_bug.cgi?id=93301

> Marek



