date:20180502

[Mesa-dev] [Bug 106337] eglWaitClient() doesn't work as documented using DRI2 backend

2018-05-02 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=106337

Tapani Pälli  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
           Assignee|mesa-dev@lists.freedesktop. |lem...@gmail.com
                   |org                         |

--- Comment #8 from Tapani Pälli  ---
Created attachment 139292
  --> https://bugs.freedesktop.org/attachment.cgi?id=139292&action=edit
fix attempt v2

I thought about this some more, and I think ideally we should just call
glFinish(); that way we ensure the same behaviour. Mike, I would appreciate
it if you could test this one too.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2] egl: check if colorspace/surface type is supported

2018-05-02 Thread Tapani Pälli


Reviewed-by: Tapani Pälli 

On 02.05.2018 19:23, Juan A. Suarez Romero wrote:

According to the EGL 1.4 spec, section 3.5.1 ("Creating On-Screen Rendering
Surfaces"), if config does not support the colorspace or alpha format
attributes specified in attrib_list (as defined for
eglCreateWindowSurface), an EGL_BAD_MATCH error is generated.

This fixes dEQP-EGL.functional.wide_color.*_888_colorspace_srgb (not yet
merged, see
https://android-review.googlesource.com/c/platform/external/deqp/+/667322),
which crashes when trying to create a window surface with an RGB888
configuration and the sRGB colorspace.

v2: Handle the fix in other backends (Tapani)
---
  src/egl/drivers/dri2/platform_drm.c  | 5 +
  src/egl/drivers/dri2/platform_wayland.c  | 6 ++
  src/egl/drivers/dri2/platform_x11.c  | 5 +
  src/egl/drivers/dri2/platform_x11_dri3.c | 5 +
  4 files changed, 21 insertions(+)

diff --git a/src/egl/drivers/dri2/platform_drm.c b/src/egl/drivers/dri2/platform_drm.c
index dc4efea9103..35bc4b5b1ac 100644
--- a/src/egl/drivers/dri2/platform_drm.c
+++ b/src/egl/drivers/dri2/platform_drm.c
@@ -155,6 +155,11 @@ dri2_drm_create_window_surface(_EGLDriver *drv, _EGLDisplay *disp,
    config = dri2_get_dri_config(dri2_conf, EGL_WINDOW_BIT,
                                 dri2_surf->base.GLColorspace);
 
+   if (!config) {
+      _eglError(EGL_BAD_MATCH, "Unsupported surfacetype/colorspace configuration");
+      goto cleanup_surf;
+   }
+
    if (!dri2_drm_config_is_compatible(dri2_dpy, config, surface)) {
       _eglError(EGL_BAD_MATCH, "EGL config not compatible with GBM format");
       goto cleanup_surf;
diff --git a/src/egl/drivers/dri2/platform_wayland.c b/src/egl/drivers/dri2/platform_wayland.c
index 80853ac00b8..63da21cdf55 100644
--- a/src/egl/drivers/dri2/platform_wayland.c
+++ b/src/egl/drivers/dri2/platform_wayland.c
@@ -249,6 +249,12 @@ dri2_wl_create_window_surface(_EGLDriver *drv, _EGLDisplay *disp,
 
    config = dri2_get_dri_config(dri2_conf, EGL_WINDOW_BIT,
                                 dri2_surf->base.GLColorspace);
+
+   if (!config) {
+      _eglError(EGL_BAD_MATCH, "Unsupported surfacetype/colorspace configuration");
+      goto cleanup_surf;
+   }
+
    visual_idx = dri2_wl_visual_idx_from_config(dri2_dpy, config);
    assert(visual_idx != -1);
 
diff --git a/src/egl/drivers/dri2/platform_x11.c b/src/egl/drivers/dri2/platform_x11.c
index 6c287b4d06b..fa838f6721e 100644
--- a/src/egl/drivers/dri2/platform_x11.c
+++ b/src/egl/drivers/dri2/platform_x11.c
@@ -251,6 +251,11 @@ dri2_x11_create_surface(_EGLDriver *drv, _EGLDisplay *disp, EGLint type,
    config = dri2_get_dri_config(dri2_conf, type,
                                 dri2_surf->base.GLColorspace);
 
+   if (!config) {
+      _eglError(EGL_BAD_MATCH, "Unsupported surfacetype/colorspace configuration");
+      goto cleanup_pixmap;
+   }
+
    if (dri2_dpy->dri2) {
       dri2_surf->dri_drawable =
          dri2_dpy->dri2->createNewDrawable(dri2_dpy->dri_screen, config,
diff --git a/src/egl/drivers/dri2/platform_x11_dri3.c b/src/egl/drivers/dri2/platform_x11_dri3.c
index a41e40156df..5cb6d65c0a3 100644
--- a/src/egl/drivers/dri2/platform_x11_dri3.c
+++ b/src/egl/drivers/dri2/platform_x11_dri3.c
@@ -183,6 +183,11 @@ dri3_create_surface(_EGLDriver *drv, _EGLDisplay *disp, EGLint type,
    dri_config = dri2_get_dri_config(dri2_conf, type,
                                     dri3_surf->surf.base.GLColorspace);
 
+   if (!dri_config) {
+      _eglError(EGL_BAD_MATCH, "Unsupported surfacetype/colorspace configuration");
+      goto cleanup_pixmap;
+   }
+
    if (loader_dri3_drawable_init(dri2_dpy->conn, drawable,
                                  dri2_dpy->dri_screen,
                                  dri2_dpy->is_different_gpu,
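
All four hunks follow the same pattern: look up the config, record
EGL_BAD_MATCH if the lookup returns NULL, then jump to the existing cleanup
label. A minimal standalone sketch of that control flow is shown here. It is
an illustration under stated assumptions only: sketch_get_config(),
sketch_create_surface(), struct sketch_surface and the SKETCH_* error codes
are hypothetical stand-ins, not Mesa or EGL API.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical error codes standing in for EGL_SUCCESS / EGL_BAD_MATCH. */
enum sketch_error { SKETCH_SUCCESS = 0, SKETCH_BAD_MATCH = 1 };

/* Hypothetical "last error" slot, loosely modelling what _eglError() records. */
static enum sketch_error sketch_last_error = SKETCH_SUCCESS;

struct sketch_surface {
   const int *config;
};

/* Hypothetical lookup: only colorspace 0 has a matching config. */
static const int *
sketch_get_config(int colorspace)
{
   static const int linear_config = 1;
   return colorspace == 0 ? &linear_config : NULL;
}

static struct sketch_surface *
sketch_create_surface(int colorspace)
{
   struct sketch_surface *surf = calloc(1, sizeof(*surf));
   if (!surf)
      return NULL;

   surf->config = sketch_get_config(colorspace);

   /* The check the patch adds: bail out before the config is used. */
   if (!surf->config) {
      sketch_last_error = SKETCH_BAD_MATCH;
      goto cleanup_surf;
   }

   return surf;

cleanup_surf:
   free(surf);
   return NULL;
}
```

Without the NULL check, the code after the lookup would receive a NULL
config, which matches the crash described in the commit message.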



Re: [Mesa-dev] [PATCH 20/22] i965: account for NIR uniforms without name

2018-05-02 Thread Timothy Arceri


Reviewed-by: Timothy Arceri 

On 18/04/18 00:36, Alejandro Piñeiro wrote:

From: Eduardo Lima Mitev 

Right now, the BRW linker code assumes nir_variable::name is always
non-NULL, but thanks to ARB_gl_spirv we will soon be linking
SPIR-V programs, and those explicitly require matching uniforms by location.
The name is just a debug hint.

v2: simplified, most of it moved to glsl/nir/spirv (Neil Roberts)

Signed-off-by: Eduardo Lima 
Signed-off-by: Neil Roberts 
---
  src/mesa/drivers/dri/i965/brw_link.cpp | 2 +-
  src/mesa/drivers/dri/i965/brw_nir_uniforms.cpp | 2 +-
  2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_link.cpp b/src/mesa/drivers/dri/i965/brw_link.cpp
index 7d89ccd7d14..5bd9783aa01 100644
--- a/src/mesa/drivers/dri/i965/brw_link.cpp
+++ b/src/mesa/drivers/dri/i965/brw_link.cpp
@@ -320,7 +320,7 @@ brw_link_shader(struct gl_context *ctx, struct gl_shader_program *shProg)
        * get sent to the shader.
        */
       nir_foreach_variable(var, &prog->nir->uniforms) {
-         if (strncmp(var->name, "gl_", 3) == 0) {
+         if (var->name && strncmp(var->name, "gl_", 3) == 0) {
             const nir_state_slot *const slots = var->state_slots;
             assert(var->state_slots != NULL);
 
diff --git a/src/mesa/drivers/dri/i965/brw_nir_uniforms.cpp b/src/mesa/drivers/dri/i965/brw_nir_uniforms.cpp
index 69da83ad364..62b2951432a 100644
--- a/src/mesa/drivers/dri/i965/brw_nir_uniforms.cpp
+++ b/src/mesa/drivers/dri/i965/brw_nir_uniforms.cpp
@@ -202,7 +202,7 @@ brw_nir_setup_glsl_uniforms(void *mem_ctx, nir_shader *shader,
       if (var->interface_type != NULL || var->type->contains_atomic())
          continue;
 
-      if (strncmp(var->name, "gl_", 3) == 0) {
+      if (var->name && strncmp(var->name, "gl_", 3) == 0) {
          brw_nir_setup_glsl_builtin_uniform(var, prog, stage_prog_data,
                                             is_scalar);
       } else {
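
Both hunks rely on short-circuit evaluation: var->name is tested first, so
strncmp() is never handed a NULL pointer. A minimal standalone sketch of the
same guard is given here; has_gl_prefix() is a hypothetical helper name used
only for illustration and does not exist in Mesa.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of Mesa): true only when a name is
 * present and starts with the "gl_" prefix used by built-ins. */
static bool
has_gl_prefix(const char *name)
{
   /* && short-circuits, so strncmp() is skipped when name is NULL. */
   return name != NULL && strncmp(name, "gl_", 3) == 0;
}
```

An unnamed variable is treated like any other non-built-in uniform, which is
the behaviour the patch gives SPIR-V programs.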



Re: [Mesa-dev] [PATCH 12/22] compiler/link: add linker_util.h, move linker_error/warning to it

2018-05-02 Thread Timothy Arceri

After looking at patch 16 I've changed my mind: this seems OK, but it should
be moved to src/compiler/glsl/, and at the top of the file you should add a
comment noting that these functions are shared between the GLSL IR and NIR
linkers.


On 03/05/18 12:43, Timothy Arceri wrote:

NAK. As per the discussion about moving the NIR helpers into
src/compiler/glsl, this patch should no longer be needed.

On 18/04/18 00:36, Alejandro Piñeiro wrote:

Linker utilities common to glsl and nir. As a first step, it moves
linker_error/warning from glsl/linker.h
---
  src/compiler/Makefile.sources                      |  2 +
  .../glsl/link_uniform_block_active_visitor.cpp     |  1 +
  src/compiler/glsl/linker.cpp                       | 27
  src/compiler/glsl/linker.h                         |  8 +---
  src/compiler/glsl/program.h                        |  8
  src/compiler/linker_util.cpp                       | 51 ++
  src/compiler/linker_util.h                         | 43 ++
  src/compiler/meson.build                           |  2 +
  8 files changed, 101 insertions(+), 41 deletions(-)
  create mode 100644 src/compiler/linker_util.cpp
  create mode 100644 src/compiler/linker_util.h

diff --git a/src/compiler/Makefile.sources b/src/compiler/Makefile.sources
index aca9dab476e..bb86972ea1a 100644
--- a/src/compiler/Makefile.sources
+++ b/src/compiler/Makefile.sources
@@ -4,6 +4,8 @@ LIBCOMPILER_FILES = \
  builtin_type_macros.h \
  glsl_types.cpp \
  glsl_types.h \
+    linker_util.h \
+    linker_util.cpp \
  nir_types.cpp \
  nir_types.h \
  shader_enums.c \
diff --git a/src/compiler/glsl/link_uniform_block_active_visitor.cpp b/src/compiler/glsl/link_uniform_block_active_visitor.cpp
index cd1baf78e80..578a201b0f0 100644
--- a/src/compiler/glsl/link_uniform_block_active_visitor.cpp
+++ b/src/compiler/glsl/link_uniform_block_active_visitor.cpp
@@ -23,6 +23,7 @@
  #include "link_uniform_block_active_visitor.h"
  #include "program.h"
+#include "compiler/linker_util.h"
  static link_uniform_block_active *
  process_block(void *mem_ctx, struct hash_table *ht, ir_variable *var)
diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
index f060c5316fa..09488cbd4d2 100644
--- a/src/compiler/glsl/linker.cpp
+++ b/src/compiler/glsl/linker.cpp
@@ -447,33 +447,6 @@ private:
  } /* anonymous namespace */
-void
-linker_error(gl_shader_program *prog, const char *fmt, ...)
-{
-   va_list ap;
-
-   ralloc_strcat(&prog->data->InfoLog, "error: ");
-   va_start(ap, fmt);
-   ralloc_vasprintf_append(&prog->data->InfoLog, fmt, ap);
-   va_end(ap);
-
-   prog->data->LinkStatus = LINKING_FAILURE;
-}
-
-
-void
-linker_warning(gl_shader_program *prog, const char *fmt, ...)
-{
-   va_list ap;
-
-   ralloc_strcat(&prog->data->InfoLog, "warning: ");
-   va_start(ap, fmt);
-   ralloc_vasprintf_append(&prog->data->InfoLog, fmt, ap);
-   va_end(ap);
-
-}
-
-
  /**
   * Given a string identifying a program resource, break it into a base name
   * and an optional array index in square brackets.
diff --git a/src/compiler/glsl/linker.h b/src/compiler/glsl/linker.h
index 454b65aebdf..20dbd7adcfa 100644
--- a/src/compiler/glsl/linker.h
+++ b/src/compiler/glsl/linker.h
@@ -29,6 +29,8 @@ struct gl_shader_program;
  struct gl_shader;
  struct gl_linked_shader;
+#include "compiler/linker_util.h"
+
  extern bool
  link_function_calls(gl_shader_program *prog, gl_linked_shader *main,
  gl_shader **shader_list, unsigned num_shaders);
@@ -192,12 +194,6 @@ private:
    const glsl_struct_field *named_ifc_member);
  };
-void
-linker_error(gl_shader_program *prog, const char *fmt, ...);
-
-void
-linker_warning(gl_shader_program *prog, const char *fmt, ...);
-
  /**
   * Sometimes there are empty slots left over in UniformRemapTable after we
   * allocate slots to explicit locations. This struct represents a single
diff --git a/src/compiler/glsl/program.h b/src/compiler/glsl/program.h
index 480379b10b8..9df42ddc1c4 100644
--- a/src/compiler/glsl/program.h
+++ b/src/compiler/glsl/program.h
@@ -48,14 +48,6 @@ extern void
  build_program_resource_list(struct gl_context *ctx,
  struct gl_shader_program *shProg);
-extern void
-linker_error(struct gl_shader_program *prog, const char *fmt, ...)
-   PRINTFLIKE(2, 3);
-
-extern void
-linker_warning(struct gl_shader_program *prog, const char *fmt, ...)
-   PRINTFLIKE(2, 3);
-
  extern long
  parse_program_resource_name(const GLchar *name,
  const GLchar **out_base_name_end);
diff --git a/src/compiler/linker_util.cpp b/src/compiler/linker_util.cpp
new file mode 100644
index 00000000000..c7d26616245
--- /dev/null
+++ b/src/compiler/linker_util.cpp
@@ -0,0 +1,51 @@
+/*
+ * Copyright © 2018 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the

Re: [Mesa-dev] [PATCH 16/22] compiler/link: move add_program_resource to linker_util

2018-05-02 Thread Timothy Arceri

I'd rename add_program_resource to link_util_add_program_resource or
something like that, and add a comment at the top of the file saying that
these functions are shared between the GLSL IR and NIR linkers.

Also, the new file should now be located in src/compiler/glsl/.

On 18/04/18 00:36, Alejandro Piñeiro wrote:

So that it can be used by both the GLSL and NIR linkers.
---
  src/compiler/glsl/linker.cpp | 36 
  src/compiler/linker_util.cpp | 37 +
  src/compiler/linker_util.h   |  5 +
  3 files changed, 42 insertions(+), 36 deletions(-)

diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
index 09488cbd4d2..4371fc071d6 100644
--- a/src/compiler/glsl/linker.cpp
+++ b/src/compiler/glsl/linker.cpp
@@ -3586,42 +3586,6 @@ should_add_buffer_variable(struct gl_shader_program *shProg,
    return false;
 }
 
-static bool
-add_program_resource(struct gl_shader_program *prog,
-                     struct set *resource_set,
-                     GLenum type, const void *data, uint8_t stages)
-{
-   assert(data);
-
-   /* If resource already exists, do not add it again. */
-   if (_mesa_set_search(resource_set, data))
-      return true;
-
-   prog->data->ProgramResourceList =
-      reralloc(prog->data,
-               prog->data->ProgramResourceList,
-               gl_program_resource,
-               prog->data->NumProgramResourceList + 1);
-
-   if (!prog->data->ProgramResourceList) {
-      linker_error(prog, "Out of memory during linking.\n");
-      return false;
-   }
-
-   struct gl_program_resource *res =
-      &prog->data->ProgramResourceList[prog->data->NumProgramResourceList];
-
-   res->Type = type;
-   res->Data = data;
-   res->StageReferences = stages;
-
-   prog->data->NumProgramResourceList++;
-
-   _mesa_set_add(resource_set, data);
-
-   return true;
-}
-
  /* Function checks if a variable var is a packed varying and
   * if given name is part of packed varying's list.
   *
diff --git a/src/compiler/linker_util.cpp b/src/compiler/linker_util.cpp
index c7d26616245..c47e4dc75da 100644
--- a/src/compiler/linker_util.cpp
+++ b/src/compiler/linker_util.cpp
@@ -23,6 +23,7 @@
  */
 #include "main/mtypes.h"
 #include "linker_util.h"
+#include "util/set.h"
 
 void
 linker_error(struct gl_shader_program *prog, const char *fmt, ...)
@@ -49,3 +50,39 @@ linker_warning(struct gl_shader_program *prog, const char *fmt, ...)
    va_end(ap);
 
 }
+
+bool
+add_program_resource(struct gl_shader_program *prog,
+                     struct set *resource_set,
+                     GLenum type, const void *data, uint8_t stages)
+{
+   assert(data);
+
+   /* If resource already exists, do not add it again. */
+   if (_mesa_set_search(resource_set, data))
+      return true;
+
+   prog->data->ProgramResourceList =
+      reralloc(prog->data,
+               prog->data->ProgramResourceList,
+               gl_program_resource,
+               prog->data->NumProgramResourceList + 1);
+
+   if (!prog->data->ProgramResourceList) {
+      linker_error(prog, "Out of memory during linking.\n");
+      return false;
+   }
+
+   struct gl_program_resource *res =
+      &prog->data->ProgramResourceList[prog->data->NumProgramResourceList];
+
+   res->Type = type;
+   res->Data = data;
+   res->StageReferences = stages;
+
+   prog->data->NumProgramResourceList++;
+
+   _mesa_set_add(resource_set, data);
+
+   return true;
+}
diff --git a/src/compiler/linker_util.h b/src/compiler/linker_util.h
index 162db3e532f..8fc1785a041 100644
--- a/src/compiler/linker_util.h
+++ b/src/compiler/linker_util.h
@@ -36,6 +36,11 @@ linker_error(struct gl_shader_program *prog, const char *fmt, ...);
 void
 linker_warning(struct gl_shader_program *prog, const char *fmt, ...);
 
+bool
+add_program_resource(struct gl_shader_program *prog,
+                     struct set *resource_set,
+                     GLenum type, const void *data, uint8_t stages);
+
 #ifdef __cplusplus
 }
 #endif



Re: [Mesa-dev] [PATCH 12/22] compiler/link: add linker_util.h, move linker_error/warning to it

2018-05-02 Thread Timothy Arceri


NAK. As per the discussion about moving the NIR helpers into
src/compiler/glsl, this patch should no longer be needed.

On 18/04/18 00:36, Alejandro Piñeiro wrote:

Linker utilities common to glsl and nir. As a first step, it moves
linker_error/warning from glsl/linker.h
---
  src/compiler/Makefile.sources  |  2 +
  .../glsl/link_uniform_block_active_visitor.cpp |  1 +
  src/compiler/glsl/linker.cpp   | 27 
  src/compiler/glsl/linker.h |  8 +---
  src/compiler/glsl/program.h|  8 
  src/compiler/linker_util.cpp   | 51 ++
  src/compiler/linker_util.h | 43 ++
  src/compiler/meson.build   |  2 +
  8 files changed, 101 insertions(+), 41 deletions(-)
  create mode 100644 src/compiler/linker_util.cpp
  create mode 100644 src/compiler/linker_util.h

diff --git a/src/compiler/Makefile.sources b/src/compiler/Makefile.sources
index aca9dab476e..bb86972ea1a 100644
--- a/src/compiler/Makefile.sources
+++ b/src/compiler/Makefile.sources
@@ -4,6 +4,8 @@ LIBCOMPILER_FILES = \
builtin_type_macros.h \
glsl_types.cpp \
glsl_types.h \
+   linker_util.h \
+   linker_util.cpp \
nir_types.cpp \
nir_types.h \
shader_enums.c \
diff --git a/src/compiler/glsl/link_uniform_block_active_visitor.cpp b/src/compiler/glsl/link_uniform_block_active_visitor.cpp
index cd1baf78e80..578a201b0f0 100644
--- a/src/compiler/glsl/link_uniform_block_active_visitor.cpp
+++ b/src/compiler/glsl/link_uniform_block_active_visitor.cpp
@@ -23,6 +23,7 @@
 
 #include "link_uniform_block_active_visitor.h"
 #include "program.h"
+#include "compiler/linker_util.h"
 
 static link_uniform_block_active *
 process_block(void *mem_ctx, struct hash_table *ht, ir_variable *var)
diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
index f060c5316fa..09488cbd4d2 100644
--- a/src/compiler/glsl/linker.cpp
+++ b/src/compiler/glsl/linker.cpp
@@ -447,33 +447,6 @@ private:
 
 } /* anonymous namespace */
 
-void
-linker_error(gl_shader_program *prog, const char *fmt, ...)
-{
-   va_list ap;
-
-   ralloc_strcat(&prog->data->InfoLog, "error: ");
-   va_start(ap, fmt);
-   ralloc_vasprintf_append(&prog->data->InfoLog, fmt, ap);
-   va_end(ap);
-
-   prog->data->LinkStatus = LINKING_FAILURE;
-}
-
-
-void
-linker_warning(gl_shader_program *prog, const char *fmt, ...)
-{
-   va_list ap;
-
-   ralloc_strcat(&prog->data->InfoLog, "warning: ");
-   va_start(ap, fmt);
-   ralloc_vasprintf_append(&prog->data->InfoLog, fmt, ap);
-   va_end(ap);
-
-}
-
-
 /**
  * Given a string identifying a program resource, break it into a base name
  * and an optional array index in square brackets.
diff --git a/src/compiler/glsl/linker.h b/src/compiler/glsl/linker.h
index 454b65aebdf..20dbd7adcfa 100644
--- a/src/compiler/glsl/linker.h
+++ b/src/compiler/glsl/linker.h
@@ -29,6 +29,8 @@ struct gl_shader_program;
 struct gl_shader;
 struct gl_linked_shader;
 
+#include "compiler/linker_util.h"
+
 extern bool
 link_function_calls(gl_shader_program *prog, gl_linked_shader *main,
                     gl_shader **shader_list, unsigned num_shaders);
@@ -192,12 +194,6 @@ private:
 const glsl_struct_field *named_ifc_member);
 };
 
-void
-linker_error(gl_shader_program *prog, const char *fmt, ...);
-
-void
-linker_warning(gl_shader_program *prog, const char *fmt, ...);
-
 /**
  * Sometimes there are empty slots left over in UniformRemapTable after we
  * allocate slots to explicit locations. This struct represents a single
diff --git a/src/compiler/glsl/program.h b/src/compiler/glsl/program.h
index 480379b10b8..9df42ddc1c4 100644
--- a/src/compiler/glsl/program.h
+++ b/src/compiler/glsl/program.h
@@ -48,14 +48,6 @@ extern void
 build_program_resource_list(struct gl_context *ctx,
                             struct gl_shader_program *shProg);
 
-extern void
-linker_error(struct gl_shader_program *prog, const char *fmt, ...)
-   PRINTFLIKE(2, 3);
-
-extern void
-linker_warning(struct gl_shader_program *prog, const char *fmt, ...)
-   PRINTFLIKE(2, 3);
-
 extern long
 parse_program_resource_name(const GLchar *name,
                             const GLchar **out_base_name_end);
diff --git a/src/compiler/linker_util.cpp b/src/compiler/linker_util.cpp
new file mode 100644
index 00000000000..c7d26616245
--- /dev/null
+++ b/src/compiler/linker_util.cpp
@@ -0,0 +1,51 @@
+/*
+ * Copyright © 2018 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is

Re: [Mesa-dev] [PATCH 06/22] nir/types: Add a glsl_get_component_slots() utility

2018-05-02 Thread Timothy Arceri


Reviewed-by: Timothy Arceri 

On 18/04/18 00:36, Alejandro Piñeiro wrote:

From: Eduardo Lima Mitev 

It is basically a wrapper around glsl_type::component_slots().
---
  src/compiler/nir_types.cpp | 6 ++
  src/compiler/nir_types.h   | 1 +
  2 files changed, 7 insertions(+)

diff --git a/src/compiler/nir_types.cpp b/src/compiler/nir_types.cpp
index 78b66803f08..51ca797497e 100644
--- a/src/compiler/nir_types.cpp
+++ b/src/compiler/nir_types.cpp
@@ -124,6 +124,12 @@ glsl_count_attribute_slots(const struct glsl_type *type,
    return type->count_attribute_slots(is_vertex_input);
 }
 
+unsigned
+glsl_get_component_slots(const struct glsl_type *type)
+{
+   return type->component_slots();
+}
+
 const char *
 glsl_get_struct_elem_name(const struct glsl_type *type, unsigned index)
 {
diff --git a/src/compiler/nir_types.h b/src/compiler/nir_types.h
index 5b441af1486..9c81980042f 100644
--- a/src/compiler/nir_types.h
+++ b/src/compiler/nir_types.h
@@ -73,6 +73,7 @@ unsigned glsl_get_aoa_size(const struct glsl_type *type);
 
 unsigned glsl_count_attribute_slots(const struct glsl_type *type,
                                     bool is_vertex_input);
+unsigned glsl_get_component_slots(const struct glsl_type *type);
 
 const char *glsl_get_struct_elem_name(const struct glsl_type *type,
                                       unsigned index);



Re: [Mesa-dev] [PATCH 04/22] nir: Add explicit_binding to nir_variable

2018-05-02 Thread Timothy Arceri


Reviewed-by: Timothy Arceri 

On 18/04/18 00:36, Alejandro Piñeiro wrote:

From: Neil Roberts 

This is copied from the corresponding value in ir_variable. The
intention is to eventually use it in a pure-NIR linker.
---
  src/compiler/glsl/glsl_to_nir.cpp | 1 +
  src/compiler/nir/nir.h| 5 +
  2 files changed, 6 insertions(+)

diff --git a/src/compiler/glsl/glsl_to_nir.cpp b/src/compiler/glsl/glsl_to_nir.cpp
index 8e5e9c34912..817e9dad2b8 100644
--- a/src/compiler/glsl/glsl_to_nir.cpp
+++ b/src/compiler/glsl/glsl_to_nir.cpp
@@ -435,6 +435,7 @@ nir_visitor::visit(ir_variable *ir)
    var->data.index = ir->data.index;
    var->data.descriptor_set = 0;
    var->data.binding = ir->data.binding;
+   var->data.explicit_binding = ir->data.explicit_binding;
    var->data.bindless = ir->data.bindless;
    var->data.offset = ir->data.offset;
    var->data.image.read_only = ir->data.memory_read_only;
diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
index f3326e6df94..1c64efedd8e 100644
--- a/src/compiler/nir/nir.h
+++ b/src/compiler/nir/nir.h
@@ -255,6 +255,11 @@ typedef struct nir_variable {
        */
       unsigned bindless:1;
 
+      /**
+       * Was an explicit binding set in the shader?
+       */
+      unsigned explicit_binding:1;
+
       /**
        * \brief Layout qualifier for gl_FragDepth.
        *



Re: [Mesa-dev] [PATCH 03/22] mesa/main: add NULL name check when searching for a resource name

2018-05-02 Thread Timothy Arceri


Reviewed-by: Timothy Arceri 

On 18/04/18 00:36, Alejandro Piñeiro wrote:

With ARB_gl_spirv, name reflection information can be missing. piglit's
shader_runner performs several resource checks, so this commit is useful
to get even the simplest piglit tests running without crashing in
SPIR-V mode.
---
  src/mesa/main/shader_query.cpp | 5 +
  1 file changed, 5 insertions(+)

diff --git a/src/mesa/main/shader_query.cpp b/src/mesa/main/shader_query.cpp
index 86064a98b97..11ecd71c575 100644
--- a/src/mesa/main/shader_query.cpp
+++ b/src/mesa/main/shader_query.cpp
@@ -533,6 +533,11 @@ _mesa_program_resource_find_name(struct gl_shader_program *shProg,
 
       /* Resource basename. */
       const char *rname = _mesa_program_resource_name(res);
+
+      /* Since ARB_gl_spirv lack of name reflections is a possibility */
+      if (rname == NULL)
+         continue;
+
       unsigned baselen = strlen(rname);
       unsigned baselen_without_array_index = baselen;
       const char *rname_last_square_bracket = strrchr(rname, '[');
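
The effect of the new check can be shown with a small standalone lookup
loop: entries without a name are skipped before strlen() or strcmp() ever
sees them. This is a sketch only; find_resource_by_name() and its
array-of-strings interface are assumptions made for illustration, not the
signature of _mesa_program_resource_find_name().

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical lookup (not Mesa API): returns the index of the first
 * entry equal to `name`, or -1 if there is none. NULL entries model
 * resources that carry no name reflection and are skipped. */
static int
find_resource_by_name(const char *const *names, size_t count,
                      const char *name)
{
   for (size_t i = 0; i < count; i++) {
      /* A missing name can never match; skip it before strcmp(). */
      if (names[i] == NULL)
         continue;

      if (strcmp(names[i], name) == 0)
         return (int)i;
   }

   return -1;
}
```

Unnamed resources stay in the list and remain reachable by location; they
are only invisible to name-based queries.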



Re: [Mesa-dev] [PATCH v2 3/3] mesa: enable geom shaders in OpenGL 3.2 Compat profile

2018-05-02 Thread Ilia Mirkin

On Wed, May 2, 2018 at 10:05 PM, Timothy Arceri  wrote:
> On 03/05/18 11:55, Ilia Mirkin wrote:
>>
>> On Wed, May 2, 2018 at 9:27 PM, Timothy Arceri 
>> wrote:
>>>
>>> As far as I understand it apiexec.py came about because of the need to
>>> not expose functionality in compat. Always setting compat to core would
>>> bypass that, getting us back to where we started.
>>
>>
>> Other way around - not exposing things in core that only belong in
>> compat. (Like glTexEnvf, etc.) And similarly for ES.
>>
>
> Ok fair enough, but it looks like it's used for both reasons these days.
> Changing it would enable features in compat that don't necessarily have all
> the compat pieces implemented. I'd rather we work through the list of
> features one at a time. Geometry shaders is a little special in that we
> enable it only starting with the GL version where it becomes a required
> extension.
>
> e.g
>
> commit 4e5efa9e7ddb6d5273996cf9b09677d918759d17
> Author: Ian Romanick 
> Date:   Tue May 19 11:48:11 2015 -0700
>
> glapi: Make GL_ARB_direct_state_access functions exclusive to core
> profile

Hm yeah, OK. Oh well. These have to be audited one-by-one.

Re: [Mesa-dev] [PATCH v2 3/3] mesa: enable geom shaders in OpenGL 3.2 Compat profile

2018-05-02 Thread Timothy Arceri


On 03/05/18 11:55, Ilia Mirkin wrote:
> On Wed, May 2, 2018 at 9:27 PM, Timothy Arceri wrote:
>> As far as I understand it apiexec.py came about because of the need to not
>> expose functionality in compat. Always setting compat to core would bypass
>> that, getting us back to where we started.
>
> Other way around - not exposing things in core that only belong in
> compat. (Like glTexEnvf, etc.) And similarly for ES.



Ok fair enough, but it looks like it's used for both reasons these days. 
Changing it would enable features in compat that don't necessarily have 
all the compat pieces implemented. I'd rather we work through the list 
of features one at a time. Geometry shaders is a little special in that 
we enable it only starting with the GL version where it becomes a 
required extension.


e.g

commit 4e5efa9e7ddb6d5273996cf9b09677d918759d17
Author: Ian Romanick 
Date:   Tue May 19 11:48:11 2015 -0700

glapi: Make GL_ARB_direct_state_access functions exclusive to core 
profile


Re: [Mesa-dev] [PATCH v2 3/3] mesa: enable geom shaders in OpenGL 3.2 Compat profile

2018-05-02 Thread Ilia Mirkin

On Wed, May 2, 2018 at 9:27 PM, Timothy Arceri  wrote:
> As far as I understand it apiexec.py came about because of the need to not
> expose functionality in compat. Always setting compat to core would bypass
> that, getting us back to where we started.

Other way around - not exposing things in core that only belong in
compat. (Like glTexEnvf, etc.) And similarly for ES.

Re: [Mesa-dev] [PATCH v2 3/3] mesa: enable geom shaders in OpenGL 3.2 Compat profile

2018-05-02 Thread Timothy Arceri


On 03/05/18 10:49, Ilia Mirkin wrote:

On Wed, May 2, 2018 at 7:21 PM, Timothy Arceri  wrote:

On 03/05/18 02:58, Ilia Mirkin wrote:


On Wed, May 2, 2018 at 6:27 AM, Timothy Arceri 
wrote:


---
   src/mapi/glapi/gen/apiexec.py | 4 ++--
   1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/mapi/glapi/gen/apiexec.py
b/src/mapi/glapi/gen/apiexec.py
index b5e0ad4a179..d33cc85d47f 100644
--- a/src/mapi/glapi/gen/apiexec.py
+++ b/src/mapi/glapi/gen/apiexec.py
@@ -46,7 +46,7 @@ class exec_info():
   if compatibility is not None:
   assert isinstance(compatibility, int)
   assert compatibility >= 10
-assert compatibility <= 30
+assert compatibility <= 46

   if core is not None:
   assert isinstance(core, int)
@@ -70,7 +70,7 @@ functions = {
   "TexBuffer": exec_info(compatibility=20, core=31, es2=31),

   # OpenGL 3.2 / GL_OES_geometry_shader.
-"FramebufferTexture": exec_info(core=32, es2=31),
+"FramebufferTexture": exec_info(compatibility=32, core=32, es2=31),



Does it make sense to list out compat explicitly in the presence of
core? Are there any core functions that aren't available in compat
contexts of that version?

IMHO it's worth changing the exec_info class to say

if core and compatibility is None:
    compatibility = core

... or something along those lines.



If core and compatibility are none then compatibility = core is redundant.
Am I missing something?


If core is not none and compatibility is none...


As far as I understand it apiexec.py came about because of the need to
not expose functionality in compat. Always setting compat to core would
bypass that, getting us back to where we started.



Re: [Mesa-dev] [PATCH v2 7.5/18] intel/compiler: support negate and abs of half float immediates

2018-05-02 Thread Jason Ekstrand

Reviewed-by: Jason Ekstrand 

Have I reviewed everything?  Can we land shaderInt16 now?

--Jason

On Wed, May 2, 2018 at 5:18 PM, Jose Maria Casanova Crespo <
jmcasan...@igalia.com> wrote:

> ---
>  src/intel/compiler/brw_shader.cpp | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/src/intel/compiler/brw_shader.cpp b/src/intel/compiler/brw_shader.cpp
> index 284c2e8233c..537defd05d9 100644
> --- a/src/intel/compiler/brw_shader.cpp
> +++ b/src/intel/compiler/brw_shader.cpp
> @@ -605,7 +605,8 @@ brw_negate_immediate(enum brw_reg_type type, struct brw_reg *reg)
>     case BRW_REGISTER_TYPE_V:
>        assert(!"unimplemented: negate UV/V immediate");
>     case BRW_REGISTER_TYPE_HF:
> -      assert(!"unimplemented: negate HF immediate");
> +      reg->ud ^= 0x80008000;
> +      return true;
>     case BRW_REGISTER_TYPE_NF:
>        unreachable("no NF immediates");
>     }
> @@ -651,7 +652,8 @@ brw_abs_immediate(enum brw_reg_type type, struct brw_reg *reg)
>     case BRW_REGISTER_TYPE_V:
>        assert(!"unimplemented: abs V immediate");
>     case BRW_REGISTER_TYPE_HF:
> -      assert(!"unimplemented: abs HF immediate");
> +      reg->ud &= ~0x80008000;
> +      return true;
>     case BRW_REGISTER_TYPE_NF:
>        unreachable("no NF immediates");
>     }
> --
> 2.14.3
>
>
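
The two one-liners in the quoted patch work because a half-float immediate
is stored twice in the 32-bit immediate field, once per 16-bit word, so the
sign bit sits at bit 15 and again at bit 31. A minimal standalone sketch of
that bit manipulation, assuming plain uint32_t values instead of struct
brw_reg (the hf_imm_* helper names are hypothetical and not part of the
compiler):

```c
#include <stdint.h>

/* Hypothetical helper: copy a 16-bit half-float bit pattern into both
 * the low and the high word of a 32-bit immediate. */
static uint32_t
hf_imm_replicate(uint16_t hf)
{
   return ((uint32_t)hf << 16) | hf;
}

/* Flip the sign bit of both replicated halves (bits 15 and 31). */
static uint32_t
hf_imm_negate(uint32_t imm)
{
   return imm ^ 0x80008000u;
}

/* Clear the sign bit of both replicated halves. */
static uint32_t
hf_imm_abs(uint32_t imm)
{
   return imm & ~0x80008000u;
}
```

For example, half-float 1.0 is 0x3C00, so its replicated immediate is
0x3C003C00; negating it gives 0xBC00BC00, and taking the absolute value of
that restores 0x3C003C00.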

Re: [Mesa-dev] [PATCH v2 3/3] mesa: enable geom shaders in OpenGL 3.2 Compat profile

2018-05-02 Thread Ilia Mirkin

On Wed, May 2, 2018 at 7:21 PM, Timothy Arceri  wrote:
>
>
> On 03/05/18 02:58, Ilia Mirkin wrote:
>>
>> On Wed, May 2, 2018 at 6:27 AM, Timothy Arceri 
>> wrote:
>>>
>>> ---
>>>   src/mapi/glapi/gen/apiexec.py | 4 ++--
>>>   1 file changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/src/mapi/glapi/gen/apiexec.py
>>> b/src/mapi/glapi/gen/apiexec.py
>>> index b5e0ad4a179..d33cc85d47f 100644
>>> --- a/src/mapi/glapi/gen/apiexec.py
>>> +++ b/src/mapi/glapi/gen/apiexec.py
>>> @@ -46,7 +46,7 @@ class exec_info():
>>>   if compatibility is not None:
>>>   assert isinstance(compatibility, int)
>>>   assert compatibility >= 10
>>> -assert compatibility <= 30
>>> +assert compatibility <= 46
>>>
>>>   if core is not None:
>>>   assert isinstance(core, int)
>>> @@ -70,7 +70,7 @@ functions = {
>>>   "TexBuffer": exec_info(compatibility=20, core=31, es2=31),
>>>
>>>   # OpenGL 3.2 / GL_OES_geometry_shader.
>>> -"FramebufferTexture": exec_info(core=32, es2=31),
>>> +"FramebufferTexture": exec_info(compatibility=32, core=32, es2=31),
>>
>>
>> Does it make sense to list out compat explicitly in the presence of
>> core? Are there any core functions that aren't available in compat
>> contexts of that version?
>>
>> IMHO it's worth changing the exec_info class to say
>>
>> if core and compatibility is None:
>>compatibility = core
>>
>> ... or something along those lines.
>
>
> If core and compatibility are none then compatibility = core is redundant.
> Am I missing something?

If core is not none and compatibility is none...
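
A rough sketch of the defaulting being suggested, assuming a simplified stand-in for exec_info. The class name ExecInfoSketch and its reduced argument list are hypothetical; only the assertions visible in the quoted diff are reproduced:

```python
class ExecInfoSketch:
    """Hypothetical stand-in for exec_info, reduced to the fields discussed here."""

    def __init__(self, compatibility=None, core=None, es2=None):
        # Suggested default: a function listed for core is also exposed in
        # compat contexts of the same version unless compat is given explicitly.
        if core is not None and compatibility is None:
            compatibility = core

        if compatibility is not None:
            assert isinstance(compatibility, int)
            assert compatibility >= 10
            assert compatibility <= 46

        if core is not None:
            assert isinstance(core, int)

        self.compatibility = compatibility
        self.core = core
        self.es2 = es2


# With the default in place, the table entry would not need compatibility=32.
entry = ExecInfoSketch(core=32, es2=31)
```

An entry that already sets compatibility explicitly, such as TexBuffer with compatibility=20, keeps its own value.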

Re: [Mesa-dev] [PATCH v3] intel/compiler: fix 16-bit int brw_negate_immediate and brw_abs_immediate

2018-05-02 Thread Jason Ekstrand


Rb

On May 2, 2018 16:44:52 Jose Maria Casanova Crespo wrote:

From Intel Skylake PRM, vol 07, "Immediate" section (page 768):

"For a word, unsigned word, or half-float immediate data,
software must replicate the same 16-bit immediate value to both
the lower word and the high word of the 32-bit immediate field
in a GEN instruction."

This fixes the int16/uint16 negate and abs immediates that weren't
taking into account the replication in lower and upper words.
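
As an illustration of the replication rule quoted from the PRM, here is a minimal Python model of the integer cases. It assumes the immediate arrives already replicated; the function names are hypothetical and only mirror what the C hunks below compute:

```python
def _replicate16(value: int) -> int:
    """Copy the low 16 bits of value into both words of a 32-bit immediate."""
    value &= 0xFFFF
    return value | (value << 16)


def _as_int16(ud: int) -> int:
    """Interpret the low word of the immediate as a signed 16-bit integer."""
    low = ud & 0xFFFF
    return low - 0x10000 if low & 0x8000 else low


def negate_w_immediate(ud: int) -> int:
    """Model of the W/UW case in brw_negate_immediate after the fix."""
    return _replicate16(-_as_int16(ud))


def abs_w_immediate(ud: int) -> int:
    """Model of the W case in brw_abs_immediate after the fix."""
    return _replicate16(abs(_as_int16(ud)))
```

For example, negating the replicated immediate 0x00050005 (5) gives 0xFFFBFFFB (-5 in both words), whereas writing the negated value through the 32-bit signed field alone would leave 0xFFFF in the high word.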

v2: Integer cases are different to Float cases. (Jason Ekstrand)
   Included reference to PRM (Jose Maria Casanova)
v3: Make explicit uint32_t casting for left shift (Jason Ekstrand)
   Split half float implementation. (Jason Ekstrand)
   Fix brw_abs_immediate (Jose Maria Casanova)

Cc: "18.0 18.1"
---
src/intel/compiler/brw_shader.cpp | 12 
1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/src/intel/compiler/brw_shader.cpp b/src/intel/compiler/brw_shader.cpp
index 9cdf9fcb23d..284c2e8233c 100644
--- a/src/intel/compiler/brw_shader.cpp
+++ b/src/intel/compiler/brw_shader.cpp
@@ -580,9 +580,11 @@ brw_negate_immediate(enum brw_reg_type type, struct brw_reg *reg)
  reg->d = -reg->d;
  return true;
   case BRW_REGISTER_TYPE_W:
-   case BRW_REGISTER_TYPE_UW:
-  reg->d = -(int16_t)reg->ud;
+   case BRW_REGISTER_TYPE_UW: {
+  uint16_t value = -(int16_t)reg->ud;
+  reg->ud = value | (uint32_t)value << 16;
  return true;
+   }
   case BRW_REGISTER_TYPE_F:
  reg->f = -reg->f;
  return true;
@@ -618,9 +620,11 @@ brw_abs_immediate(enum brw_reg_type type, struct brw_reg *reg)
   case BRW_REGISTER_TYPE_D:
  reg->d = abs(reg->d);
  return true;
-   case BRW_REGISTER_TYPE_W:
-  reg->d = abs((int16_t)reg->ud);
+   case BRW_REGISTER_TYPE_W: {
+  uint16_t value = abs((int16_t)reg->ud);
+  reg->ud = value | (uint32_t)value << 16;
  return true;
+   }
   case BRW_REGISTER_TYPE_F:
  reg->f = fabsf(reg->f);
  return true;
--
2.14.3





[Mesa-dev] [PATCH 12/14] ac/gpu_info: add has_sparse_vm_mappings

2018-05-02 Thread Marek Olšák

From: Marek Olšák 

---
 src/amd/common/ac_gpu_info.c  |  8 
 src/amd/common/ac_gpu_info.h  |  1 +
 src/gallium/drivers/radeonsi/si_get.c | 13 ++---
 src/gallium/winsys/radeon/drm/radeon_drm_winsys.c |  1 +
 4 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/src/amd/common/ac_gpu_info.c b/src/amd/common/ac_gpu_info.c
index b7b8c91e264..61454ae9491 100644
--- a/src/amd/common/ac_gpu_info.c
+++ b/src/amd/common/ac_gpu_info.c
@@ -323,20 +323,27 @@ bool ac_query_gpu_info(int fd, amdgpu_device_handle dev,
info->has_gpu_reset_status_query = true;
info->has_gpu_reset_counter_query = false;
info->has_eqaa_surface_allocator = true;
info->has_format_bc1_through_bc7 = true;
/* DRM 3.1.0 doesn't flush TC for VI correctly. */
info->kernel_flushes_tc_l2_after_ib = info->chip_class != VI ||
  info->drm_minor >= 2;
info->has_indirect_compute_dispatch = true;
/* SI doesn't support unaligned loads. */
info->has_unaligned_shader_loads = info->chip_class != SI;
+   /* Disable sparse mappings on SI due to VM faults in CP DMA. Enable 
them once
+* these faults are mitigated in software.
+* Disable sparse mappings on GFX9 due to hangs.
+*/
+   info->has_sparse_vm_mappings =
+   info->chip_class >= CIK && info->chip_class <= VI &&
+   info->drm_minor >= 13;
 
info->num_render_backends = amdinfo->rb_pipes;
/* The value returned by the kernel driver was wrong. */
if (info->family == CHIP_KAVERI)
info->num_render_backends = 2;
 
info->clock_crystal_freq = amdinfo->gpu_counter_freq;
if (!info->clock_crystal_freq) {
fprintf(stderr, "amdgpu: clock crystal frequency is 0, 
timestamps will be wrong\n");
info->clock_crystal_freq = 1;
@@ -481,20 +488,21 @@ void ac_print_gpu_info(struct radeon_info *info)
printf("htile_cmask_support_1d_tiling = %u\n", 
info->htile_cmask_support_1d_tiling);
printf("si_TA_CS_BC_BASE_ADDR_allowed = %u\n", 
info->si_TA_CS_BC_BASE_ADDR_allowed);
printf("has_bo_metadata = %u\n", info->has_bo_metadata);
printf("has_gpu_reset_status_query = %u\n", 
info->has_gpu_reset_status_query);
printf("has_gpu_reset_counter_query = %u\n", 
info->has_gpu_reset_counter_query);
printf("has_eqaa_surface_allocator = %u\n", 
info->has_eqaa_surface_allocator);
printf("has_format_bc1_through_bc7 = %u\n", 
info->has_format_bc1_through_bc7);
printf("kernel_flushes_tc_l2_after_ib = %u\n", 
info->kernel_flushes_tc_l2_after_ib);
printf("has_indirect_compute_dispatch = %u\n", 
info->has_indirect_compute_dispatch);
printf("has_unaligned_shader_loads = %u\n", 
info->has_unaligned_shader_loads);
+   printf("has_sparse_vm_mappings = %u\n", 
info->has_sparse_vm_mappings);
 
printf("Shader core info:\n");
printf("max_shader_clock = %i\n", info->max_shader_clock);
printf("num_good_compute_units = %i\n", 
info->num_good_compute_units);
printf("max_se = %i\n", info->max_se);
printf("max_sh_per_se = %i\n", info->max_sh_per_se);
 
printf("Render backend info:\n");
printf("num_render_backends = %i\n", info->num_render_backends);
printf("num_tile_pipes = %i\n", info->num_tile_pipes);
diff --git a/src/amd/common/ac_gpu_info.h b/src/amd/common/ac_gpu_info.h
index e95dcbd906c..7caa6543695 100644
--- a/src/amd/common/ac_gpu_info.h
+++ b/src/amd/common/ac_gpu_info.h
@@ -100,20 +100,21 @@ struct radeon_info {
boolhtile_cmask_support_1d_tiling;
boolsi_TA_CS_BC_BASE_ADDR_allowed;
boolhas_bo_metadata;
boolhas_gpu_reset_status_query;
boolhas_gpu_reset_counter_query;
boolhas_eqaa_surface_allocator;
boolhas_format_bc1_through_bc7;
boolkernel_flushes_tc_l2_after_ib;
boolhas_indirect_compute_dispatch;
boolhas_unaligned_shader_loads;
+   boolhas_sparse_vm_mappings;
 
/* Shader cores. */
uint32_tr600_max_quad_pipes; /* wave size / 16 */
uint32_tmax_shader_clock;
uint32_tnum_good_compute_units;
uint32_tmax_se; /* shader engines */
uint32_tmax_sh_per_se; /* shader arrays per shader 
engine */
 
/* Render backends (color + depth blocks). */
uint32_tr300_num_gb_pipes;
diff --git
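
The gating that this patch adds to ac_query_gpu_info() can be read as a simple predicate over the chip generation and the DRM minor version. A small Python sketch, assuming the generations are ordered SI < CIK < VI < GFX9 as the range check in the hunk implies; the ChipClass enum and its numeric values are hypothetical:

```python
from enum import IntEnum


class ChipClass(IntEnum):
    # Hypothetical numeric values; only the relative order matters here.
    SI = 1
    CIK = 2
    VI = 3
    GFX9 = 4


def has_sparse_vm_mappings(chip_class: ChipClass, drm_minor: int) -> bool:
    """SI is excluded (VM faults in CP DMA), GFX9 is excluded (hangs)."""
    return ChipClass.CIK <= chip_class <= ChipClass.VI and drm_minor >= 13
```

Keeping the check in one place lets si_get.c consume a single boolean instead of repeating the chip and kernel version tests.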

[Mesa-dev] [PATCH 10/14] radeonsi: expose ARB_query_buffer_object on ancient kernels too

2018-05-02 Thread Marek Olšák

From: Marek Olšák 

It doesn't use indirect dispatches.
---
 src/gallium/drivers/radeonsi/si_get.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_get.c b/src/gallium/drivers/radeonsi/si_get.c
index 0e7d28e334c..3feb1ae7823 100644
--- a/src/gallium/drivers/radeonsi/si_get.c
+++ b/src/gallium/drivers/radeonsi/si_get.c
@@ -144,20 +144,21 @@ static int si_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
case PIPE_CAP_SAMPLER_VIEW_TARGET:
case PIPE_CAP_TEXTURE_QUERY_LOD:
case PIPE_CAP_TEXTURE_GATHER_SM5:
case PIPE_CAP_TGSI_TXQS:
case PIPE_CAP_FORCE_PERSAMPLE_INTERP:
case PIPE_CAP_COPY_BETWEEN_COMPRESSED_AND_PLAIN_FORMATS:
case PIPE_CAP_TGSI_FS_POSITION_IS_SYSVAL:
case PIPE_CAP_TGSI_FS_FACE_IS_INTEGER_SYSVAL:
case PIPE_CAP_INVALIDATE_BUFFER:
case PIPE_CAP_SURFACE_REINTERPRET_BLOCKS:
+   case PIPE_CAP_QUERY_BUFFER_OBJECT:
case PIPE_CAP_QUERY_MEMORY_INFO:
case PIPE_CAP_TGSI_PACK_HALF_FLOAT:
case PIPE_CAP_FRAMEBUFFER_NO_ATTACHMENT:
case PIPE_CAP_ROBUST_BUFFER_ACCESS_BEHAVIOR:
case PIPE_CAP_GENERATE_MIPMAP:
case PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED:
case PIPE_CAP_STRING_MARKER:
case PIPE_CAP_CLEAR_TEXTURE:
case PIPE_CAP_CULL_DISTANCE:
case PIPE_CAP_TGSI_ARRAY_COMPONENTS:
@@ -276,23 +277,20 @@ static int si_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
 
case PIPE_CAP_FENCE_SIGNAL:
return sscreen->info.has_syncobj;
 
case PIPE_CAP_CONSTBUF0_FLAGS:
return SI_RESOURCE_FLAG_32BIT;
 
case PIPE_CAP_NATIVE_FENCE_FD:
return sscreen->info.has_fence_to_handle;
 
-   case PIPE_CAP_QUERY_BUFFER_OBJECT:
-   return sscreen->info.has_indirect_compute_dispatch;
-
case PIPE_CAP_DRAW_PARAMETERS:
case PIPE_CAP_MULTI_DRAW_INDIRECT:
case PIPE_CAP_MULTI_DRAW_INDIRECT_PARAMS:
return sscreen->has_draw_indirect_multi;
 
case PIPE_CAP_MAX_SHADER_PATCH_VARYINGS:
return 30;
 
case PIPE_CAP_TEXTURE_BORDER_COLOR_QUIRK:
return sscreen->info.chip_class <= VI ?
-- 
2.17.0


[Mesa-dev] [PATCH 08/14] ac/gpu_info: add kernel_flushes_tc_l2_after_ib

2018-05-02 Thread Marek Olšák

From: Marek Olšák 

---
 src/amd/common/ac_gpu_info.c  | 4 
 src/amd/common/ac_gpu_info.h  | 1 +
 src/gallium/drivers/radeonsi/si_gfx_cs.c  | 3 +--
 src/gallium/winsys/radeon/drm/radeon_drm_winsys.c | 1 +
 4 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/src/amd/common/ac_gpu_info.c b/src/amd/common/ac_gpu_info.c
index e0e30a4a572..5c1bab2e9a0 100644
--- a/src/amd/common/ac_gpu_info.c
+++ b/src/amd/common/ac_gpu_info.c
@@ -317,20 +317,23 @@ bool ac_query_gpu_info(int fd, amdgpu_device_handle dev,
info->has_local_buffers = info->drm_minor >= 20 &&
  !info->has_dedicated_vram;
info->kernel_flushes_hdp_before_ib = true;
info->htile_cmask_support_1d_tiling = true;
info->si_TA_CS_BC_BASE_ADDR_allowed = true;
info->has_bo_metadata = true;
info->has_gpu_reset_status_query = true;
info->has_gpu_reset_counter_query = false;
info->has_eqaa_surface_allocator = true;
info->has_format_bc1_through_bc7 = true;
+   /* DRM 3.1.0 doesn't flush TC for VI correctly. */
+   info->kernel_flushes_tc_l2_after_ib = info->chip_class != VI ||
+ info->drm_minor >= 2;
 
info->num_render_backends = amdinfo->rb_pipes;
/* The value returned by the kernel driver was wrong. */
if (info->family == CHIP_KAVERI)
info->num_render_backends = 2;
 
info->clock_crystal_freq = amdinfo->gpu_counter_freq;
if (!info->clock_crystal_freq) {
fprintf(stderr, "amdgpu: clock crystal frequency is 0, 
timestamps will be wrong\n");
info->clock_crystal_freq = 1;
@@ -472,20 +475,21 @@ void ac_print_gpu_info(struct radeon_info *info)
printf("has_ctx_priority = %u\n", info->has_ctx_priority);
printf("has_local_buffers = %u\n", info->has_local_buffers);
printf("kernel_flushes_hdp_before_ib = %u\n", 
info->kernel_flushes_hdp_before_ib);
printf("htile_cmask_support_1d_tiling = %u\n", 
info->htile_cmask_support_1d_tiling);
printf("si_TA_CS_BC_BASE_ADDR_allowed = %u\n", 
info->si_TA_CS_BC_BASE_ADDR_allowed);
printf("has_bo_metadata = %u\n", info->has_bo_metadata);
printf("has_gpu_reset_status_query = %u\n", 
info->has_gpu_reset_status_query);
printf("has_gpu_reset_counter_query = %u\n", 
info->has_gpu_reset_counter_query);
printf("has_eqaa_surface_allocator = %u\n", 
info->has_eqaa_surface_allocator);
printf("has_format_bc1_through_bc7 = %u\n", 
info->has_format_bc1_through_bc7);
+   printf("kernel_flushes_tc_l2_after_ib = %u\n", 
info->kernel_flushes_tc_l2_after_ib);
 
printf("Shader core info:\n");
printf("max_shader_clock = %i\n", info->max_shader_clock);
printf("num_good_compute_units = %i\n", 
info->num_good_compute_units);
printf("max_se = %i\n", info->max_se);
printf("max_sh_per_se = %i\n", info->max_sh_per_se);
 
printf("Render backend info:\n");
printf("num_render_backends = %i\n", info->num_render_backends);
printf("num_tile_pipes = %i\n", info->num_tile_pipes);
diff --git a/src/amd/common/ac_gpu_info.h b/src/amd/common/ac_gpu_info.h
index 9c4c6cb11f0..5e404714db6 100644
--- a/src/amd/common/ac_gpu_info.h
+++ b/src/amd/common/ac_gpu_info.h
@@ -97,20 +97,21 @@ struct radeon_info {
boolhas_ctx_priority;
boolhas_local_buffers;
boolkernel_flushes_hdp_before_ib;
boolhtile_cmask_support_1d_tiling;
boolsi_TA_CS_BC_BASE_ADDR_allowed;
boolhas_bo_metadata;
boolhas_gpu_reset_status_query;
boolhas_gpu_reset_counter_query;
boolhas_eqaa_surface_allocator;
boolhas_format_bc1_through_bc7;
+   boolkernel_flushes_tc_l2_after_ib;
 
/* Shader cores. */
uint32_tr600_max_quad_pipes; /* wave size / 16 */
uint32_tmax_shader_clock;
uint32_tnum_good_compute_units;
uint32_tmax_se; /* shader engines */
uint32_tmax_sh_per_se; /* shader arrays per shader 
engine */
 
/* Render backends (color + depth blocks). */
uint32_tr300_num_gb_pipes;
diff --git a/src/gallium/drivers/radeonsi/si_gfx_cs.c b/src/gallium/drivers/radeonsi/si_gfx_cs.c
index 0af16dd3474..ec74c1bc703 100644
--- a/src/gallium/drivers/radeonsi/si_gfx_cs.c
+++ b/src/gallium/drivers/radeonsi/si_gfx_cs.c
@@ -67,22 +67,21 @@ void si_need_gfx_cs_space(struct si_context

[Mesa-dev] [PATCH 03/14] ac/gpu_info: add si_TA_CS_BC_BASE_ADDR_allowed

2018-05-02 Thread Marek Olšák

From: Marek Olšák 

---
 src/amd/common/ac_gpu_info.c  | 2 ++
 src/amd/common/ac_gpu_info.h  | 1 +
 src/gallium/drivers/radeonsi/si_compute.c | 4 +---
 src/gallium/winsys/radeon/drm/radeon_drm_winsys.c | 1 +
 4 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/src/amd/common/ac_gpu_info.c b/src/amd/common/ac_gpu_info.c
index d9b5b4a1960..b2c29f657ca 100644
--- a/src/amd/common/ac_gpu_info.c
+++ b/src/amd/common/ac_gpu_info.c
@@ -311,20 +311,21 @@ bool ac_query_gpu_info(int fd, amdgpu_device_handle dev,
info->has_userptr = true;
info->has_syncobj = has_syncobj(fd);
info->has_syncobj_wait_for_submit = info->has_syncobj && 
info->drm_minor >= 20;
info->has_fence_to_handle = info->has_syncobj && info->drm_minor >= 21;
info->has_ctx_priority = info->drm_minor >= 22;
/* TODO: Enable this once the kernel handles it efficiently. */
info->has_local_buffers = info->drm_minor >= 20 &&
  !info->has_dedicated_vram;
info->kernel_flushes_hdp_before_ib = true;
info->htile_cmask_support_1d_tiling = true;
+   info->si_TA_CS_BC_BASE_ADDR_allowed = true;
 
info->num_render_backends = amdinfo->rb_pipes;
/* The value returned by the kernel driver was wrong. */
if (info->family == CHIP_KAVERI)
info->num_render_backends = 2;
 
info->clock_crystal_freq = amdinfo->gpu_counter_freq;
if (!info->clock_crystal_freq) {
fprintf(stderr, "amdgpu: clock crystal frequency is 0, 
timestamps will be wrong\n");
info->clock_crystal_freq = 1;
@@ -460,20 +461,21 @@ void ac_print_gpu_info(struct radeon_info *info)
printf("drm = %i.%i.%i\n", info->drm_major,
   info->drm_minor, info->drm_patchlevel);
printf("has_userptr = %i\n", info->has_userptr);
printf("has_syncobj = %u\n", info->has_syncobj);
printf("has_syncobj_wait_for_submit = %u\n", 
info->has_syncobj_wait_for_submit);
printf("has_fence_to_handle = %u\n", info->has_fence_to_handle);
printf("has_ctx_priority = %u\n", info->has_ctx_priority);
printf("has_local_buffers = %u\n", info->has_local_buffers);
printf("kernel_flushes_hdp_before_ib = %u\n", 
info->kernel_flushes_hdp_before_ib);
printf("htile_cmask_support_1d_tiling = %u\n", 
info->htile_cmask_support_1d_tiling);
+   printf("si_TA_CS_BC_BASE_ADDR_allowed = %u\n", 
info->si_TA_CS_BC_BASE_ADDR_allowed);
 
printf("Shader core info:\n");
printf("max_shader_clock = %i\n", info->max_shader_clock);
printf("num_good_compute_units = %i\n", 
info->num_good_compute_units);
printf("max_se = %i\n", info->max_se);
printf("max_sh_per_se = %i\n", info->max_sh_per_se);
 
printf("Render backend info:\n");
printf("num_render_backends = %i\n", info->num_render_backends);
printf("num_tile_pipes = %i\n", info->num_tile_pipes);
diff --git a/src/amd/common/ac_gpu_info.h b/src/amd/common/ac_gpu_info.h
index 578c3fb7da1..bc6350b5625 100644
--- a/src/amd/common/ac_gpu_info.h
+++ b/src/amd/common/ac_gpu_info.h
@@ -91,20 +91,21 @@ struct radeon_info {
uint32_tdrm_minor;
uint32_tdrm_patchlevel;
boolhas_userptr;
boolhas_syncobj;
boolhas_syncobj_wait_for_submit;
boolhas_fence_to_handle;
boolhas_ctx_priority;
boolhas_local_buffers;
boolkernel_flushes_hdp_before_ib;
boolhtile_cmask_support_1d_tiling;
+   boolsi_TA_CS_BC_BASE_ADDR_allowed;
 
/* Shader cores. */
uint32_tr600_max_quad_pipes; /* wave size / 16 */
uint32_tmax_shader_clock;
uint32_tnum_good_compute_units;
uint32_tmax_se; /* shader engines */
uint32_tmax_sh_per_se; /* shader arrays per shader 
engine */
 
/* Render backends (color + depth blocks). */
uint32_tr300_num_gb_pipes;
diff --git a/src/gallium/drivers/radeonsi/si_compute.c b/src/gallium/drivers/radeonsi/si_compute.c
index e95e79c7b46..e20bae0afc4 100644
--- a/src/gallium/drivers/radeonsi/si_compute.c
+++ b/src/gallium/drivers/radeonsi/si_compute.c
@@ -324,23 +324,21 @@ static void si_initialize_compute(struct si_context *sctx)
}
 
/* Set the pointer to border colors. */
bc_va = sctx->border_color_buffer->gpu_address;
 
if (sctx->chip_class >= CIK) {
radeon_set_uconfig_reg_seq(cs, R_030E00_TA_CS_BC_BASE_ADDR,

[Mesa-dev] [PATCH 09/14] ac/gpu_info: add has_indirect_compute_dispatch

2018-05-02 Thread Marek Olšák

From: Marek Olšák 

---
 src/amd/common/ac_gpu_info.c |  2 ++
 src/amd/common/ac_gpu_info.h |  1 +
 src/gallium/drivers/radeonsi/si_get.c| 16 +++-
 .../winsys/radeon/drm/radeon_drm_winsys.c|  5 +
 4 files changed, 11 insertions(+), 13 deletions(-)

diff --git a/src/amd/common/ac_gpu_info.c b/src/amd/common/ac_gpu_info.c
index 5c1bab2e9a0..aa18c97826c 100644
--- a/src/amd/common/ac_gpu_info.c
+++ b/src/amd/common/ac_gpu_info.c
@@ -320,20 +320,21 @@ bool ac_query_gpu_info(int fd, amdgpu_device_handle dev,
info->htile_cmask_support_1d_tiling = true;
info->si_TA_CS_BC_BASE_ADDR_allowed = true;
info->has_bo_metadata = true;
info->has_gpu_reset_status_query = true;
info->has_gpu_reset_counter_query = false;
info->has_eqaa_surface_allocator = true;
info->has_format_bc1_through_bc7 = true;
/* DRM 3.1.0 doesn't flush TC for VI correctly. */
info->kernel_flushes_tc_l2_after_ib = info->chip_class != VI ||
  info->drm_minor >= 2;
+   info->has_indirect_compute_dispatch = true;
 
info->num_render_backends = amdinfo->rb_pipes;
/* The value returned by the kernel driver was wrong. */
if (info->family == CHIP_KAVERI)
info->num_render_backends = 2;
 
info->clock_crystal_freq = amdinfo->gpu_counter_freq;
if (!info->clock_crystal_freq) {
fprintf(stderr, "amdgpu: clock crystal frequency is 0, 
timestamps will be wrong\n");
info->clock_crystal_freq = 1;
@@ -476,20 +477,21 @@ void ac_print_gpu_info(struct radeon_info *info)
printf("has_local_buffers = %u\n", info->has_local_buffers);
printf("kernel_flushes_hdp_before_ib = %u\n", 
info->kernel_flushes_hdp_before_ib);
printf("htile_cmask_support_1d_tiling = %u\n", 
info->htile_cmask_support_1d_tiling);
printf("si_TA_CS_BC_BASE_ADDR_allowed = %u\n", 
info->si_TA_CS_BC_BASE_ADDR_allowed);
printf("has_bo_metadata = %u\n", info->has_bo_metadata);
printf("has_gpu_reset_status_query = %u\n", 
info->has_gpu_reset_status_query);
printf("has_gpu_reset_counter_query = %u\n", 
info->has_gpu_reset_counter_query);
printf("has_eqaa_surface_allocator = %u\n", 
info->has_eqaa_surface_allocator);
printf("has_format_bc1_through_bc7 = %u\n", 
info->has_format_bc1_through_bc7);
printf("kernel_flushes_tc_l2_after_ib = %u\n", 
info->kernel_flushes_tc_l2_after_ib);
+   printf("has_indirect_compute_dispatch = %u\n", 
info->has_indirect_compute_dispatch);
 
printf("Shader core info:\n");
printf("max_shader_clock = %i\n", info->max_shader_clock);
printf("num_good_compute_units = %i\n", 
info->num_good_compute_units);
printf("max_se = %i\n", info->max_se);
printf("max_sh_per_se = %i\n", info->max_sh_per_se);
 
printf("Render backend info:\n");
printf("num_render_backends = %i\n", info->num_render_backends);
printf("num_tile_pipes = %i\n", info->num_tile_pipes);
diff --git a/src/amd/common/ac_gpu_info.h b/src/amd/common/ac_gpu_info.h
index 5e404714db6..d5d10c60102 100644
--- a/src/amd/common/ac_gpu_info.h
+++ b/src/amd/common/ac_gpu_info.h
@@ -98,20 +98,21 @@ struct radeon_info {
boolhas_local_buffers;
boolkernel_flushes_hdp_before_ib;
boolhtile_cmask_support_1d_tiling;
boolsi_TA_CS_BC_BASE_ADDR_allowed;
boolhas_bo_metadata;
boolhas_gpu_reset_status_query;
boolhas_gpu_reset_counter_query;
boolhas_eqaa_surface_allocator;
boolhas_format_bc1_through_bc7;
boolkernel_flushes_tc_l2_after_ib;
+   boolhas_indirect_compute_dispatch;
 
/* Shader cores. */
uint32_tr600_max_quad_pipes; /* wave size / 16 */
uint32_tmax_shader_clock;
uint32_tnum_good_compute_units;
uint32_tmax_se; /* shader engines */
uint32_tmax_sh_per_se; /* shader arrays per shader 
engine */
 
/* Render backends (color + depth blocks). */
uint32_tr300_num_gb_pipes;
diff --git a/src/gallium/drivers/radeonsi/si_get.c b/src/gallium/drivers/radeonsi/si_get.c
index cd3e63c73d7..0e7d28e334c 100644
--- a/src/gallium/drivers/radeonsi/si_get.c
+++ b/src/gallium/drivers/radeonsi/si_get.c
@@ -76,30 +76,20 @@ const char *si_get_family_name(const struct si_screen *sscreen)
case CHIP_POLARIS11: return "AMD POLARIS11";

[Mesa-dev] [PATCH 05/14] radeonsi: clean up the reset status query implementation

2018-05-02 Thread Marek Olšák

From: Marek Olšák 

---
 src/amd/common/ac_gpu_info.c  |  4 +++
 src/amd/common/ac_gpu_info.h  |  2 ++
 src/gallium/drivers/radeonsi/si_get.c |  5 ++-
 src/gallium/drivers/radeonsi/si_pipe.c| 36 +--
 .../winsys/radeon/drm/radeon_drm_winsys.c |  2 ++
 5 files changed, 26 insertions(+), 23 deletions(-)

diff --git a/src/amd/common/ac_gpu_info.c b/src/amd/common/ac_gpu_info.c
index 85c739ca343..94dfff77ac1 100644
--- a/src/amd/common/ac_gpu_info.c
+++ b/src/amd/common/ac_gpu_info.c
@@ -313,20 +313,22 @@ bool ac_query_gpu_info(int fd, amdgpu_device_handle dev,
info->has_syncobj_wait_for_submit = info->has_syncobj && 
info->drm_minor >= 20;
info->has_fence_to_handle = info->has_syncobj && info->drm_minor >= 21;
info->has_ctx_priority = info->drm_minor >= 22;
/* TODO: Enable this once the kernel handles it efficiently. */
info->has_local_buffers = info->drm_minor >= 20 &&
  !info->has_dedicated_vram;
info->kernel_flushes_hdp_before_ib = true;
info->htile_cmask_support_1d_tiling = true;
info->si_TA_CS_BC_BASE_ADDR_allowed = true;
info->has_bo_metadata = true;
+   info->has_gpu_reset_status_query = true;
+   info->has_gpu_reset_counter_query = false;
 
info->num_render_backends = amdinfo->rb_pipes;
/* The value returned by the kernel driver was wrong. */
if (info->family == CHIP_KAVERI)
info->num_render_backends = 2;
 
info->clock_crystal_freq = amdinfo->gpu_counter_freq;
if (!info->clock_crystal_freq) {
fprintf(stderr, "amdgpu: clock crystal frequency is 0, 
timestamps will be wrong\n");
info->clock_crystal_freq = 1;
@@ -464,20 +466,22 @@ void ac_print_gpu_info(struct radeon_info *info)
printf("has_userptr = %i\n", info->has_userptr);
printf("has_syncobj = %u\n", info->has_syncobj);
printf("has_syncobj_wait_for_submit = %u\n", 
info->has_syncobj_wait_for_submit);
printf("has_fence_to_handle = %u\n", info->has_fence_to_handle);
printf("has_ctx_priority = %u\n", info->has_ctx_priority);
printf("has_local_buffers = %u\n", info->has_local_buffers);
printf("kernel_flushes_hdp_before_ib = %u\n", 
info->kernel_flushes_hdp_before_ib);
printf("htile_cmask_support_1d_tiling = %u\n", 
info->htile_cmask_support_1d_tiling);
printf("si_TA_CS_BC_BASE_ADDR_allowed = %u\n", 
info->si_TA_CS_BC_BASE_ADDR_allowed);
printf("has_bo_metadata = %u\n", info->has_bo_metadata);
+   printf("has_gpu_reset_status_query = %u\n", 
info->has_gpu_reset_status_query);
+   printf("has_gpu_reset_counter_query = %u\n", 
info->has_gpu_reset_counter_query);
 
printf("Shader core info:\n");
printf("max_shader_clock = %i\n", info->max_shader_clock);
printf("num_good_compute_units = %i\n", 
info->num_good_compute_units);
printf("max_se = %i\n", info->max_se);
printf("max_sh_per_se = %i\n", info->max_sh_per_se);
 
printf("Render backend info:\n");
printf("num_render_backends = %i\n", info->num_render_backends);
printf("num_tile_pipes = %i\n", info->num_tile_pipes);
diff --git a/src/amd/common/ac_gpu_info.h b/src/amd/common/ac_gpu_info.h
index 340c368bda3..f5b74579ef1 100644
--- a/src/amd/common/ac_gpu_info.h
+++ b/src/amd/common/ac_gpu_info.h
@@ -93,20 +93,22 @@ struct radeon_info {
boolhas_userptr;
boolhas_syncobj;
boolhas_syncobj_wait_for_submit;
boolhas_fence_to_handle;
boolhas_ctx_priority;
boolhas_local_buffers;
boolkernel_flushes_hdp_before_ib;
boolhtile_cmask_support_1d_tiling;
boolsi_TA_CS_BC_BASE_ADDR_allowed;
boolhas_bo_metadata;
+   boolhas_gpu_reset_status_query;
+   boolhas_gpu_reset_counter_query;
 
/* Shader cores. */
uint32_tr600_max_quad_pipes; /* wave size / 16 */
uint32_tmax_shader_clock;
uint32_tnum_good_compute_units;
uint32_tmax_se; /* shader engines */
uint32_tmax_sh_per_se; /* shader arrays per shader 
engine */
 
/* Render backends (color + depth blocks). */
uint32_tr300_num_gb_pipes;
diff --git a/src/gallium/drivers/radeonsi/si_get.c b/src/gallium/drivers/radeonsi/si_get.c
index c31ab43cb42..cd3e63c73d7 100644
--- a/src/gallium/drivers/radeonsi/si_get.c
+++

[Mesa-dev] [PATCH 13/14] ac/gpu_info: add has_2d_tiling

2018-05-02 Thread Marek Olšák

From: Marek Olšák 

---
 src/amd/common/ac_gpu_info.c  | 2 ++
 src/amd/common/ac_gpu_info.h  | 1 +
 src/gallium/drivers/radeonsi/si_get.c | 6 +-
 src/gallium/winsys/radeon/drm/radeon_drm_winsys.c | 2 ++
 4 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/src/amd/common/ac_gpu_info.c b/src/amd/common/ac_gpu_info.c
index 61454ae9491..99f1996b414 100644
--- a/src/amd/common/ac_gpu_info.c
+++ b/src/amd/common/ac_gpu_info.c
@@ -330,20 +330,21 @@ bool ac_query_gpu_info(int fd, amdgpu_device_handle dev,
info->has_indirect_compute_dispatch = true;
/* SI doesn't support unaligned loads. */
info->has_unaligned_shader_loads = info->chip_class != SI;
/* Disable sparse mappings on SI due to VM faults in CP DMA. Enable 
them once
 * these faults are mitigated in software.
 * Disable sparse mappings on GFX9 due to hangs.
 */
info->has_sparse_vm_mappings =
info->chip_class >= CIK && info->chip_class <= VI &&
info->drm_minor >= 13;
+   info->has_2d_tiling = true;
 
info->num_render_backends = amdinfo->rb_pipes;
/* The value returned by the kernel driver was wrong. */
if (info->family == CHIP_KAVERI)
info->num_render_backends = 2;
 
info->clock_crystal_freq = amdinfo->gpu_counter_freq;
if (!info->clock_crystal_freq) {
fprintf(stderr, "amdgpu: clock crystal frequency is 0, 
timestamps will be wrong\n");
info->clock_crystal_freq = 1;
@@ -489,20 +490,21 @@ void ac_print_gpu_info(struct radeon_info *info)
printf("si_TA_CS_BC_BASE_ADDR_allowed = %u\n", 
info->si_TA_CS_BC_BASE_ADDR_allowed);
printf("has_bo_metadata = %u\n", info->has_bo_metadata);
printf("has_gpu_reset_status_query = %u\n", 
info->has_gpu_reset_status_query);
printf("has_gpu_reset_counter_query = %u\n", 
info->has_gpu_reset_counter_query);
printf("has_eqaa_surface_allocator = %u\n", 
info->has_eqaa_surface_allocator);
printf("has_format_bc1_through_bc7 = %u\n", 
info->has_format_bc1_through_bc7);
printf("kernel_flushes_tc_l2_after_ib = %u\n", 
info->kernel_flushes_tc_l2_after_ib);
printf("has_indirect_compute_dispatch = %u\n", 
info->has_indirect_compute_dispatch);
printf("has_unaligned_shader_loads = %u\n", 
info->has_unaligned_shader_loads);
printf("has_sparse_vm_mappings = %u\n", 
info->has_sparse_vm_mappings);
+   printf("has_2d_tiling = %u\n", info->has_2d_tiling);
 
printf("Shader core info:\n");
printf("max_shader_clock = %i\n", info->max_shader_clock);
printf("num_good_compute_units = %i\n", 
info->num_good_compute_units);
printf("max_se = %i\n", info->max_se);
printf("max_sh_per_se = %i\n", info->max_sh_per_se);
 
printf("Render backend info:\n");
printf("num_render_backends = %i\n", info->num_render_backends);
printf("num_tile_pipes = %i\n", info->num_tile_pipes);
diff --git a/src/amd/common/ac_gpu_info.h b/src/amd/common/ac_gpu_info.h
index 7caa6543695..fb44f7c8af4 100644
--- a/src/amd/common/ac_gpu_info.h
+++ b/src/amd/common/ac_gpu_info.h
@@ -101,20 +101,21 @@ struct radeon_info {
boolsi_TA_CS_BC_BASE_ADDR_allowed;
boolhas_bo_metadata;
boolhas_gpu_reset_status_query;
boolhas_gpu_reset_counter_query;
boolhas_eqaa_surface_allocator;
boolhas_format_bc1_through_bc7;
boolkernel_flushes_tc_l2_after_ib;
boolhas_indirect_compute_dispatch;
boolhas_unaligned_shader_loads;
boolhas_sparse_vm_mappings;
+   boolhas_2d_tiling;
 
/* Shader cores. */
uint32_tr600_max_quad_pipes; /* wave size / 16 */
uint32_tmax_shader_clock;
uint32_tnum_good_compute_units;
uint32_tmax_se; /* shader engines */
uint32_tmax_sh_per_se; /* shader arrays per shader 
engine */
 
/* Render backends (color + depth blocks). */
uint32_tr300_num_gb_pipes;
diff --git a/src/gallium/drivers/radeonsi/si_get.c b/src/gallium/drivers/radeonsi/si_get.c
index ef74cd457b8..757192f309c 100644
--- a/src/gallium/drivers/radeonsi/si_get.c
+++ b/src/gallium/drivers/radeonsi/si_get.c
@@ -191,25 +191,21 @@ static int si_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
return HAVE_LLVM >= 0x0500;
 
case PIPE_CAP_RESOURCE_FROM_USER_MEMORY:
return

[Mesa-dev] [PATCH 11/14] ac/gpu_info: add has_unaligned_shader_loads

2018-05-02 Thread Marek Olšák

From: Marek Olšák 

---
 src/amd/common/ac_gpu_info.c  | 3 +++
 src/amd/common/ac_gpu_info.h  | 1 +
 src/gallium/drivers/radeonsi/si_get.c | 6 +-
 src/gallium/winsys/radeon/drm/radeon_drm_winsys.c | 3 +++
 4 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/src/amd/common/ac_gpu_info.c b/src/amd/common/ac_gpu_info.c
index aa18c97826c..b7b8c91e264 100644
--- a/src/amd/common/ac_gpu_info.c
+++ b/src/amd/common/ac_gpu_info.c
@@ -321,20 +321,22 @@ bool ac_query_gpu_info(int fd, amdgpu_device_handle dev,
info->si_TA_CS_BC_BASE_ADDR_allowed = true;
info->has_bo_metadata = true;
info->has_gpu_reset_status_query = true;
info->has_gpu_reset_counter_query = false;
info->has_eqaa_surface_allocator = true;
info->has_format_bc1_through_bc7 = true;
/* DRM 3.1.0 doesn't flush TC for VI correctly. */
info->kernel_flushes_tc_l2_after_ib = info->chip_class != VI ||
  info->drm_minor >= 2;
info->has_indirect_compute_dispatch = true;
+   /* SI doesn't support unaligned loads. */
+   info->has_unaligned_shader_loads = info->chip_class != SI;
 
info->num_render_backends = amdinfo->rb_pipes;
/* The value returned by the kernel driver was wrong. */
if (info->family == CHIP_KAVERI)
info->num_render_backends = 2;
 
info->clock_crystal_freq = amdinfo->gpu_counter_freq;
if (!info->clock_crystal_freq) {
fprintf(stderr, "amdgpu: clock crystal frequency is 0, 
timestamps will be wrong\n");
info->clock_crystal_freq = 1;
@@ -478,20 +480,21 @@ void ac_print_gpu_info(struct radeon_info *info)
printf("kernel_flushes_hdp_before_ib = %u\n", 
info->kernel_flushes_hdp_before_ib);
printf("htile_cmask_support_1d_tiling = %u\n", 
info->htile_cmask_support_1d_tiling);
printf("si_TA_CS_BC_BASE_ADDR_allowed = %u\n", 
info->si_TA_CS_BC_BASE_ADDR_allowed);
printf("has_bo_metadata = %u\n", info->has_bo_metadata);
printf("has_gpu_reset_status_query = %u\n", 
info->has_gpu_reset_status_query);
printf("has_gpu_reset_counter_query = %u\n", 
info->has_gpu_reset_counter_query);
printf("has_eqaa_surface_allocator = %u\n", 
info->has_eqaa_surface_allocator);
printf("has_format_bc1_through_bc7 = %u\n", 
info->has_format_bc1_through_bc7);
printf("kernel_flushes_tc_l2_after_ib = %u\n", 
info->kernel_flushes_tc_l2_after_ib);
printf("has_indirect_compute_dispatch = %u\n", 
info->has_indirect_compute_dispatch);
+   printf("has_unaligned_shader_loads = %u\n", 
info->has_unaligned_shader_loads);
 
printf("Shader core info:\n");
printf("max_shader_clock = %i\n", info->max_shader_clock);
printf("num_good_compute_units = %i\n", 
info->num_good_compute_units);
printf("max_se = %i\n", info->max_se);
printf("max_sh_per_se = %i\n", info->max_sh_per_se);
 
printf("Render backend info:\n");
printf("num_render_backends = %i\n", info->num_render_backends);
printf("num_tile_pipes = %i\n", info->num_tile_pipes);
diff --git a/src/amd/common/ac_gpu_info.h b/src/amd/common/ac_gpu_info.h
index d5d10c60102..e95dcbd906c 100644
--- a/src/amd/common/ac_gpu_info.h
+++ b/src/amd/common/ac_gpu_info.h
@@ -99,20 +99,21 @@ struct radeon_info {
bool kernel_flushes_hdp_before_ib;
bool htile_cmask_support_1d_tiling;
bool si_TA_CS_BC_BASE_ADDR_allowed;
bool has_bo_metadata;
bool has_gpu_reset_status_query;
bool has_gpu_reset_counter_query;
bool has_eqaa_surface_allocator;
bool has_format_bc1_through_bc7;
bool kernel_flushes_tc_l2_after_ib;
bool has_indirect_compute_dispatch;
+   bool has_unaligned_shader_loads;
 
/* Shader cores. */
uint32_t r600_max_quad_pipes; /* wave size / 16 */
uint32_t max_shader_clock;
uint32_t num_good_compute_units;
uint32_t max_se; /* shader engines */
uint32_t max_sh_per_se; /* shader arrays per shader engine */
 
/* Render backends (color + depth blocks). */
uint32_t r300_num_gb_pipes;
diff --git a/src/gallium/drivers/radeonsi/si_get.c 
b/src/gallium/drivers/radeonsi/si_get.c
index 3feb1ae7823..d2bee21a1fe 100644
--- a/src/gallium/drivers/radeonsi/si_get.c
+++ b/src/gallium/drivers/radeonsi/si_get.c
@@ -219,25 +219,21 @@ static int si_get_param(struct

[Mesa-dev] [PATCH 14/14] ac/gpu_info: add has_read_registers_query

2018-05-02 Thread Marek Olšák

From: Marek Olšák 

---
 src/amd/common/ac_gpu_info.c  | 2 ++
 src/amd/common/ac_gpu_info.h  | 1 +
 src/gallium/drivers/radeonsi/si_debug.c   | 5 ++---
 src/gallium/winsys/radeon/drm/radeon_drm_winsys.c | 1 +
 4 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/src/amd/common/ac_gpu_info.c b/src/amd/common/ac_gpu_info.c
index 99f1996b414..a02fb4e4dc4 100644
--- a/src/amd/common/ac_gpu_info.c
+++ b/src/amd/common/ac_gpu_info.c
@@ -331,20 +331,21 @@ bool ac_query_gpu_info(int fd, amdgpu_device_handle dev,
/* SI doesn't support unaligned loads. */
info->has_unaligned_shader_loads = info->chip_class != SI;
/* Disable sparse mappings on SI due to VM faults in CP DMA. Enable 
them once
 * these faults are mitigated in software.
 * Disable sparse mappings on GFX9 due to hangs.
 */
info->has_sparse_vm_mappings =
info->chip_class >= CIK && info->chip_class <= VI &&
info->drm_minor >= 13;
info->has_2d_tiling = true;
+   info->has_read_registers_query = true;
 
info->num_render_backends = amdinfo->rb_pipes;
/* The value returned by the kernel driver was wrong. */
if (info->family == CHIP_KAVERI)
info->num_render_backends = 2;
 
info->clock_crystal_freq = amdinfo->gpu_counter_freq;
if (!info->clock_crystal_freq) {
fprintf(stderr, "amdgpu: clock crystal frequency is 0, timestamps will be wrong\n");
info->clock_crystal_freq = 1;
@@ -491,20 +492,21 @@ void ac_print_gpu_info(struct radeon_info *info)
printf("has_bo_metadata = %u\n", info->has_bo_metadata);
printf("has_gpu_reset_status_query = %u\n", 
info->has_gpu_reset_status_query);
printf("has_gpu_reset_counter_query = %u\n", 
info->has_gpu_reset_counter_query);
printf("has_eqaa_surface_allocator = %u\n", 
info->has_eqaa_surface_allocator);
printf("has_format_bc1_through_bc7 = %u\n", 
info->has_format_bc1_through_bc7);
printf("kernel_flushes_tc_l2_after_ib = %u\n", 
info->kernel_flushes_tc_l2_after_ib);
printf("has_indirect_compute_dispatch = %u\n", 
info->has_indirect_compute_dispatch);
printf("has_unaligned_shader_loads = %u\n", 
info->has_unaligned_shader_loads);
printf("has_sparse_vm_mappings = %u\n", 
info->has_sparse_vm_mappings);
printf("has_2d_tiling = %u\n", info->has_2d_tiling);
+   printf("has_read_registers_query = %u\n", 
info->has_read_registers_query);
 
printf("Shader core info:\n");
printf("max_shader_clock = %i\n", info->max_shader_clock);
printf("num_good_compute_units = %i\n", 
info->num_good_compute_units);
printf("max_se = %i\n", info->max_se);
printf("max_sh_per_se = %i\n", info->max_sh_per_se);
 
printf("Render backend info:\n");
printf("num_render_backends = %i\n", info->num_render_backends);
printf("num_tile_pipes = %i\n", info->num_tile_pipes);
diff --git a/src/amd/common/ac_gpu_info.h b/src/amd/common/ac_gpu_info.h
index fb44f7c8af4..1201d811361 100644
--- a/src/amd/common/ac_gpu_info.h
+++ b/src/amd/common/ac_gpu_info.h
@@ -102,20 +102,21 @@ struct radeon_info {
bool has_bo_metadata;
bool has_gpu_reset_status_query;
bool has_gpu_reset_counter_query;
bool has_eqaa_surface_allocator;
bool has_format_bc1_through_bc7;
bool kernel_flushes_tc_l2_after_ib;
bool has_indirect_compute_dispatch;
bool has_unaligned_shader_loads;
bool has_sparse_vm_mappings;
bool has_2d_tiling;
+   bool has_read_registers_query;
 
/* Shader cores. */
uint32_t r600_max_quad_pipes; /* wave size / 16 */
uint32_t max_shader_clock;
uint32_t num_good_compute_units;
uint32_t max_se; /* shader engines */
uint32_t max_sh_per_se; /* shader arrays per shader engine */
 
/* Render backends (color + depth blocks). */
uint32_t r300_num_gb_pipes;
diff --git a/src/gallium/drivers/radeonsi/si_debug.c 
b/src/gallium/drivers/radeonsi/si_debug.c
index b7d40db21cb..36cbb8866ed 100644
--- a/src/gallium/drivers/radeonsi/si_debug.c
+++ b/src/gallium/drivers/radeonsi/si_debug.c
@@ -287,23 +287,22 @@ static void si_dump_mmapped_reg(struct si_context *sctx, 
FILE *f,
 {
struct radeon_winsys *ws = sctx->ws;
uint32_t value;
 
if (ws->read_registers(ws, offset, 1, &value))

[Mesa-dev] [PATCH 04/14] ac/gpu_info: add has_bo_metadata

2018-05-02 Thread Marek Olšák

From: Marek Olšák 

---
 src/amd/common/ac_gpu_info.c  | 2 ++
 src/amd/common/ac_gpu_info.h  | 1 +
 src/gallium/drivers/radeonsi/si_texture.c | 3 +--
 src/gallium/winsys/radeon/drm/radeon_drm_winsys.c | 1 +
 4 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/src/amd/common/ac_gpu_info.c b/src/amd/common/ac_gpu_info.c
index b2c29f657ca..85c739ca343 100644
--- a/src/amd/common/ac_gpu_info.c
+++ b/src/amd/common/ac_gpu_info.c
@@ -312,20 +312,21 @@ bool ac_query_gpu_info(int fd, amdgpu_device_handle dev,
info->has_syncobj = has_syncobj(fd);
info->has_syncobj_wait_for_submit = info->has_syncobj && 
info->drm_minor >= 20;
info->has_fence_to_handle = info->has_syncobj && info->drm_minor >= 21;
info->has_ctx_priority = info->drm_minor >= 22;
/* TODO: Enable this once the kernel handles it efficiently. */
info->has_local_buffers = info->drm_minor >= 20 &&
  !info->has_dedicated_vram;
info->kernel_flushes_hdp_before_ib = true;
info->htile_cmask_support_1d_tiling = true;
info->si_TA_CS_BC_BASE_ADDR_allowed = true;
+   info->has_bo_metadata = true;
 
info->num_render_backends = amdinfo->rb_pipes;
/* The value returned by the kernel driver was wrong. */
if (info->family == CHIP_KAVERI)
info->num_render_backends = 2;
 
info->clock_crystal_freq = amdinfo->gpu_counter_freq;
if (!info->clock_crystal_freq) {
fprintf(stderr, "amdgpu: clock crystal frequency is 0, timestamps will be wrong\n");
info->clock_crystal_freq = 1;
@@ -462,20 +463,21 @@ void ac_print_gpu_info(struct radeon_info *info)
   info->drm_minor, info->drm_patchlevel);
printf("has_userptr = %i\n", info->has_userptr);
printf("has_syncobj = %u\n", info->has_syncobj);
printf("has_syncobj_wait_for_submit = %u\n", 
info->has_syncobj_wait_for_submit);
printf("has_fence_to_handle = %u\n", info->has_fence_to_handle);
printf("has_ctx_priority = %u\n", info->has_ctx_priority);
printf("has_local_buffers = %u\n", info->has_local_buffers);
printf("kernel_flushes_hdp_before_ib = %u\n", 
info->kernel_flushes_hdp_before_ib);
printf("htile_cmask_support_1d_tiling = %u\n", 
info->htile_cmask_support_1d_tiling);
printf("si_TA_CS_BC_BASE_ADDR_allowed = %u\n", 
info->si_TA_CS_BC_BASE_ADDR_allowed);
+   printf("has_bo_metadata = %u\n", info->has_bo_metadata);
 
printf("Shader core info:\n");
printf("max_shader_clock = %i\n", info->max_shader_clock);
printf("num_good_compute_units = %i\n", 
info->num_good_compute_units);
printf("max_se = %i\n", info->max_se);
printf("max_sh_per_se = %i\n", info->max_sh_per_se);
 
printf("Render backend info:\n");
printf("num_render_backends = %i\n", info->num_render_backends);
printf("num_tile_pipes = %i\n", info->num_tile_pipes);
diff --git a/src/amd/common/ac_gpu_info.h b/src/amd/common/ac_gpu_info.h
index bc6350b5625..340c368bda3 100644
--- a/src/amd/common/ac_gpu_info.h
+++ b/src/amd/common/ac_gpu_info.h
@@ -92,20 +92,21 @@ struct radeon_info {
uint32_t drm_patchlevel;
bool has_userptr;
bool has_syncobj;
bool has_syncobj_wait_for_submit;
bool has_fence_to_handle;
bool has_ctx_priority;
bool has_local_buffers;
bool kernel_flushes_hdp_before_ib;
bool htile_cmask_support_1d_tiling;
bool si_TA_CS_BC_BASE_ADDR_allowed;
+   bool has_bo_metadata;
 
/* Shader cores. */
uint32_t r600_max_quad_pipes; /* wave size / 16 */
uint32_t max_shader_clock;
uint32_t num_good_compute_units;
uint32_t max_se; /* shader engines */
uint32_t max_sh_per_se; /* shader arrays per shader engine */
 
/* Render backends (color + depth blocks). */
uint32_t r300_num_gb_pipes;
diff --git a/src/gallium/drivers/radeonsi/si_texture.c 
b/src/gallium/drivers/radeonsi/si_texture.c
index f38d4721331..3601c2806bc 100644
--- a/src/gallium/drivers/radeonsi/si_texture.c
+++ b/src/gallium/drivers/radeonsi/si_texture.c
@@ -601,22 +601,21 @@ static void si_query_opaque_metadata(struct si_screen 
*sscreen,
struct pipe_resource *res = &rtex->buffer.b.b;
static const unsigned char swizzle[] = {
PIPE_SWIZZLE_X,
PIPE_SWIZZLE_Y,
PIPE_SWIZZLE_Z,

[Mesa-dev] [PATCH 02/14] ac/gpu_info: add htile_cmask_support_1d_tiling

2018-05-02 Thread Marek Olšák

From: Marek Olšák 

---
 src/amd/common/ac_gpu_info.c  | 2 ++
 src/amd/common/ac_gpu_info.h  | 1 +
 src/gallium/drivers/radeonsi/si_clear.c   | 7 ++-
 src/gallium/drivers/radeonsi/si_texture.c | 6 ++
 src/gallium/winsys/radeon/drm/radeon_drm_winsys.c | 3 +++
 5 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/src/amd/common/ac_gpu_info.c b/src/amd/common/ac_gpu_info.c
index b1022ef75de..d9b5b4a1960 100644
--- a/src/amd/common/ac_gpu_info.c
+++ b/src/amd/common/ac_gpu_info.c
@@ -310,20 +310,21 @@ bool ac_query_gpu_info(int fd, amdgpu_device_handle dev,
uvd_enc.available_rings ? true : false;
info->has_userptr = true;
info->has_syncobj = has_syncobj(fd);
info->has_syncobj_wait_for_submit = info->has_syncobj && 
info->drm_minor >= 20;
info->has_fence_to_handle = info->has_syncobj && info->drm_minor >= 21;
info->has_ctx_priority = info->drm_minor >= 22;
/* TODO: Enable this once the kernel handles it efficiently. */
info->has_local_buffers = info->drm_minor >= 20 &&
  !info->has_dedicated_vram;
info->kernel_flushes_hdp_before_ib = true;
+   info->htile_cmask_support_1d_tiling = true;
 
info->num_render_backends = amdinfo->rb_pipes;
/* The value returned by the kernel driver was wrong. */
if (info->family == CHIP_KAVERI)
info->num_render_backends = 2;
 
info->clock_crystal_freq = amdinfo->gpu_counter_freq;
if (!info->clock_crystal_freq) {
fprintf(stderr, "amdgpu: clock crystal frequency is 0, timestamps will be wrong\n");
info->clock_crystal_freq = 1;
@@ -458,20 +459,21 @@ void ac_print_gpu_info(struct radeon_info *info)
printf("Kernel info:\n");
printf("drm = %i.%i.%i\n", info->drm_major,
   info->drm_minor, info->drm_patchlevel);
printf("has_userptr = %i\n", info->has_userptr);
printf("has_syncobj = %u\n", info->has_syncobj);
printf("has_syncobj_wait_for_submit = %u\n", 
info->has_syncobj_wait_for_submit);
printf("has_fence_to_handle = %u\n", info->has_fence_to_handle);
printf("has_ctx_priority = %u\n", info->has_ctx_priority);
printf("has_local_buffers = %u\n", info->has_local_buffers);
printf("kernel_flushes_hdp_before_ib = %u\n", 
info->kernel_flushes_hdp_before_ib);
+   printf("htile_cmask_support_1d_tiling = %u\n", 
info->htile_cmask_support_1d_tiling);
 
printf("Shader core info:\n");
printf("max_shader_clock = %i\n", info->max_shader_clock);
printf("num_good_compute_units = %i\n", 
info->num_good_compute_units);
printf("max_se = %i\n", info->max_se);
printf("max_sh_per_se = %i\n", info->max_sh_per_se);
 
printf("Render backend info:\n");
printf("num_render_backends = %i\n", info->num_render_backends);
printf("num_tile_pipes = %i\n", info->num_tile_pipes);
diff --git a/src/amd/common/ac_gpu_info.h b/src/amd/common/ac_gpu_info.h
index 8a9721750a6..578c3fb7da1 100644
--- a/src/amd/common/ac_gpu_info.h
+++ b/src/amd/common/ac_gpu_info.h
@@ -90,20 +90,21 @@ struct radeon_info {
uint32_t drm_major; /* version */
uint32_t drm_minor;
uint32_t drm_patchlevel;
bool has_userptr;
bool has_syncobj;
bool has_syncobj_wait_for_submit;
bool has_fence_to_handle;
bool has_ctx_priority;
bool has_local_buffers;
bool kernel_flushes_hdp_before_ib;
+   bool htile_cmask_support_1d_tiling;
 
/* Shader cores. */
uint32_t r600_max_quad_pipes; /* wave size / 16 */
uint32_t max_shader_clock;
uint32_t num_good_compute_units;
uint32_t max_se; /* shader engines */
uint32_t max_sh_per_se; /* shader arrays per shader engine */
 
/* Render backends (color + depth blocks). */
uint32_t r300_num_gb_pipes;
diff --git a/src/gallium/drivers/radeonsi/si_clear.c 
b/src/gallium/drivers/radeonsi/si_clear.c
index 23977186611..0e2d2f1013b 100644
--- a/src/gallium/drivers/radeonsi/si_clear.c
+++ b/src/gallium/drivers/radeonsi/si_clear.c
@@ -430,27 +430,24 @@ static void si_do_fast_color_clear(struct si_context 
*sctx,
}
 
/* shared textures can't use fast clear without an explicit 
flush,
 * because there is no way to communicate the clear color among
 * all clients
 */

[Mesa-dev] [PATCH 07/14] ac/gpu_info: add has_format_bc1_through_bc7

2018-05-02 Thread Marek Olšák

From: Marek Olšák 

---
 src/amd/common/ac_gpu_info.c  | 2 ++
 src/amd/common/ac_gpu_info.h  | 1 +
 src/gallium/drivers/radeonsi/si_state.c   | 9 +++--
 src/gallium/winsys/radeon/drm/radeon_drm_winsys.c | 1 +
 4 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/src/amd/common/ac_gpu_info.c b/src/amd/common/ac_gpu_info.c
index 232a8bcd523..e0e30a4a572 100644
--- a/src/amd/common/ac_gpu_info.c
+++ b/src/amd/common/ac_gpu_info.c
@@ -316,20 +316,21 @@ bool ac_query_gpu_info(int fd, amdgpu_device_handle dev,
/* TODO: Enable this once the kernel handles it efficiently. */
info->has_local_buffers = info->drm_minor >= 20 &&
  !info->has_dedicated_vram;
info->kernel_flushes_hdp_before_ib = true;
info->htile_cmask_support_1d_tiling = true;
info->si_TA_CS_BC_BASE_ADDR_allowed = true;
info->has_bo_metadata = true;
info->has_gpu_reset_status_query = true;
info->has_gpu_reset_counter_query = false;
info->has_eqaa_surface_allocator = true;
+   info->has_format_bc1_through_bc7 = true;
 
info->num_render_backends = amdinfo->rb_pipes;
/* The value returned by the kernel driver was wrong. */
if (info->family == CHIP_KAVERI)
info->num_render_backends = 2;
 
info->clock_crystal_freq = amdinfo->gpu_counter_freq;
if (!info->clock_crystal_freq) {
fprintf(stderr, "amdgpu: clock crystal frequency is 0, timestamps will be wrong\n");
info->clock_crystal_freq = 1;
@@ -470,20 +471,21 @@ void ac_print_gpu_info(struct radeon_info *info)
printf("has_fence_to_handle = %u\n", info->has_fence_to_handle);
printf("has_ctx_priority = %u\n", info->has_ctx_priority);
printf("has_local_buffers = %u\n", info->has_local_buffers);
printf("kernel_flushes_hdp_before_ib = %u\n", 
info->kernel_flushes_hdp_before_ib);
printf("htile_cmask_support_1d_tiling = %u\n", 
info->htile_cmask_support_1d_tiling);
printf("si_TA_CS_BC_BASE_ADDR_allowed = %u\n", 
info->si_TA_CS_BC_BASE_ADDR_allowed);
printf("has_bo_metadata = %u\n", info->has_bo_metadata);
printf("has_gpu_reset_status_query = %u\n", 
info->has_gpu_reset_status_query);
printf("has_gpu_reset_counter_query = %u\n", 
info->has_gpu_reset_counter_query);
printf("has_eqaa_surface_allocator = %u\n", 
info->has_eqaa_surface_allocator);
+   printf("has_format_bc1_through_bc7 = %u\n", 
info->has_format_bc1_through_bc7);
 
printf("Shader core info:\n");
printf("max_shader_clock = %i\n", info->max_shader_clock);
printf("num_good_compute_units = %i\n", 
info->num_good_compute_units);
printf("max_se = %i\n", info->max_se);
printf("max_sh_per_se = %i\n", info->max_sh_per_se);
 
printf("Render backend info:\n");
printf("num_render_backends = %i\n", info->num_render_backends);
printf("num_tile_pipes = %i\n", info->num_tile_pipes);
diff --git a/src/amd/common/ac_gpu_info.h b/src/amd/common/ac_gpu_info.h
index f8e4adf0d41..9c4c6cb11f0 100644
--- a/src/amd/common/ac_gpu_info.h
+++ b/src/amd/common/ac_gpu_info.h
@@ -96,20 +96,21 @@ struct radeon_info {
bool has_fence_to_handle;
bool has_ctx_priority;
bool has_local_buffers;
bool kernel_flushes_hdp_before_ib;
bool htile_cmask_support_1d_tiling;
bool si_TA_CS_BC_BASE_ADDR_allowed;
bool has_bo_metadata;
bool has_gpu_reset_status_query;
bool has_gpu_reset_counter_query;
bool has_eqaa_surface_allocator;
+   bool has_format_bc1_through_bc7;
 
/* Shader cores. */
uint32_t r600_max_quad_pipes; /* wave size / 16 */
uint32_t max_shader_clock;
uint32_t num_good_compute_units;
uint32_t max_se; /* shader engines */
uint32_t max_sh_per_se; /* shader arrays per shader engine */
 
/* Render backends (color + depth blocks). */
uint32_t r300_num_gb_pipes;
diff --git a/src/gallium/drivers/radeonsi/si_state.c 
b/src/gallium/drivers/radeonsi/si_state.c
index c7585b285e9..675b1adbe65 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -1566,23 +1566,20 @@ static uint32_t si_translate_dbformat(enum pipe_format 
format)
 /*
  * Texture translation
  */
 
 static uint32_t si_translate_texformat(struct pipe_screen *screen,

[Mesa-dev] [PATCH 06/14] ac/gpu_info: add has_eqaa_surface_allocator

2018-05-02 Thread Marek Olšák

From: Marek Olšák 

---
 src/amd/common/ac_gpu_info.c  | 4 +++-
 src/amd/common/ac_gpu_info.h  | 3 ++-
 src/gallium/drivers/radeonsi/si_pipe.c| 2 +-
 src/gallium/winsys/radeon/drm/radeon_drm_winsys.c | 1 +
 4 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/src/amd/common/ac_gpu_info.c b/src/amd/common/ac_gpu_info.c
index 94dfff77ac1..232a8bcd523 100644
--- a/src/amd/common/ac_gpu_info.c
+++ b/src/amd/common/ac_gpu_info.c
@@ -315,20 +315,21 @@ bool ac_query_gpu_info(int fd, amdgpu_device_handle dev,
info->has_ctx_priority = info->drm_minor >= 22;
/* TODO: Enable this once the kernel handles it efficiently. */
info->has_local_buffers = info->drm_minor >= 20 &&
  !info->has_dedicated_vram;
info->kernel_flushes_hdp_before_ib = true;
info->htile_cmask_support_1d_tiling = true;
info->si_TA_CS_BC_BASE_ADDR_allowed = true;
info->has_bo_metadata = true;
info->has_gpu_reset_status_query = true;
info->has_gpu_reset_counter_query = false;
+   info->has_eqaa_surface_allocator = true;
 
info->num_render_backends = amdinfo->rb_pipes;
/* The value returned by the kernel driver was wrong. */
if (info->family == CHIP_KAVERI)
info->num_render_backends = 2;
 
info->clock_crystal_freq = amdinfo->gpu_counter_freq;
if (!info->clock_crystal_freq) {
fprintf(stderr, "amdgpu: clock crystal frequency is 0, timestamps will be wrong\n");
info->clock_crystal_freq = 1;
@@ -453,35 +454,36 @@ void ac_print_gpu_info(struct radeon_info *info)
printf("ce_fw_version = %i\n", info->ce_fw_version);
printf("ce_fw_feature = %i\n", info->ce_fw_feature);
 
printf("Multimedia info:\n");
printf("has_hw_decode = %u\n", info->has_hw_decode);
printf("uvd_enc_supported = %u\n", info->uvd_enc_supported);
printf("uvd_fw_version = %u\n", info->uvd_fw_version);
printf("vce_fw_version = %u\n", info->vce_fw_version);
printf("vce_harvest_config = %i\n", info->vce_harvest_config);
 
-   printf("Kernel info:\n");
+   printf("Kernel & winsys capabilities:\n");
printf("drm = %i.%i.%i\n", info->drm_major,
   info->drm_minor, info->drm_patchlevel);
printf("has_userptr = %i\n", info->has_userptr);
printf("has_syncobj = %u\n", info->has_syncobj);
printf("has_syncobj_wait_for_submit = %u\n", 
info->has_syncobj_wait_for_submit);
printf("has_fence_to_handle = %u\n", info->has_fence_to_handle);
printf("has_ctx_priority = %u\n", info->has_ctx_priority);
printf("has_local_buffers = %u\n", info->has_local_buffers);
printf("kernel_flushes_hdp_before_ib = %u\n", 
info->kernel_flushes_hdp_before_ib);
printf("htile_cmask_support_1d_tiling = %u\n", 
info->htile_cmask_support_1d_tiling);
printf("si_TA_CS_BC_BASE_ADDR_allowed = %u\n", 
info->si_TA_CS_BC_BASE_ADDR_allowed);
printf("has_bo_metadata = %u\n", info->has_bo_metadata);
printf("has_gpu_reset_status_query = %u\n", 
info->has_gpu_reset_status_query);
printf("has_gpu_reset_counter_query = %u\n", 
info->has_gpu_reset_counter_query);
+   printf("has_eqaa_surface_allocator = %u\n", 
info->has_eqaa_surface_allocator);
 
printf("Shader core info:\n");
printf("max_shader_clock = %i\n", info->max_shader_clock);
printf("num_good_compute_units = %i\n", 
info->num_good_compute_units);
printf("max_se = %i\n", info->max_se);
printf("max_sh_per_se = %i\n", info->max_sh_per_se);
 
printf("Render backend info:\n");
printf("num_render_backends = %i\n", info->num_render_backends);
printf("num_tile_pipes = %i\n", info->num_tile_pipes);
diff --git a/src/amd/common/ac_gpu_info.h b/src/amd/common/ac_gpu_info.h
index f5b74579ef1..f8e4adf0d41 100644
--- a/src/amd/common/ac_gpu_info.h
+++ b/src/amd/common/ac_gpu_info.h
@@ -79,36 +79,37 @@ struct radeon_info {
uint32_t ce_fw_version;
uint32_t ce_fw_feature;
 
/* Multimedia info. */
bool has_hw_decode;
bool uvd_enc_supported;
uint32_t uvd_fw_version;
uint32_t vce_fw_version;
uint32_t vce_harvest_config;
 
-   /* Kernel info. */
+   /* Kernel & winsys capabilities. */
uint32_t drm_major; /* version */
uint32_t drm_minor;
uint32_t drm_patchlevel;
bool has_userptr;
bool has_syncobj;
bool

[Mesa-dev] [PATCH 01/14] ac/gpu_info: add kernel_flushes_hdp_before_ib

2018-05-02 Thread Marek Olšák

From: Marek Olšák 

---
 src/amd/common/ac_gpu_info.c  | 2 ++
 src/amd/common/ac_gpu_info.h  | 1 +
 src/gallium/drivers/radeonsi/si_buffer.c  | 6 ++
 src/gallium/winsys/radeon/drm/radeon_drm_winsys.c | 1 +
 4 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/src/amd/common/ac_gpu_info.c b/src/amd/common/ac_gpu_info.c
index fd49dbefd58..b1022ef75de 100644
--- a/src/amd/common/ac_gpu_info.c
+++ b/src/amd/common/ac_gpu_info.c
@@ -309,20 +309,21 @@ bool ac_query_gpu_info(int fd, amdgpu_device_handle dev,
info->uvd_enc_supported =
uvd_enc.available_rings ? true : false;
info->has_userptr = true;
info->has_syncobj = has_syncobj(fd);
info->has_syncobj_wait_for_submit = info->has_syncobj && 
info->drm_minor >= 20;
info->has_fence_to_handle = info->has_syncobj && info->drm_minor >= 21;
info->has_ctx_priority = info->drm_minor >= 22;
/* TODO: Enable this once the kernel handles it efficiently. */
info->has_local_buffers = info->drm_minor >= 20 &&
  !info->has_dedicated_vram;
+   info->kernel_flushes_hdp_before_ib = true;
 
info->num_render_backends = amdinfo->rb_pipes;
/* The value returned by the kernel driver was wrong. */
if (info->family == CHIP_KAVERI)
info->num_render_backends = 2;
 
info->clock_crystal_freq = amdinfo->gpu_counter_freq;
if (!info->clock_crystal_freq) {
fprintf(stderr, "amdgpu: clock crystal frequency is 0, timestamps will be wrong\n");
info->clock_crystal_freq = 1;
@@ -456,20 +457,21 @@ void ac_print_gpu_info(struct radeon_info *info)
 
printf("Kernel info:\n");
printf("drm = %i.%i.%i\n", info->drm_major,
   info->drm_minor, info->drm_patchlevel);
printf("has_userptr = %i\n", info->has_userptr);
printf("has_syncobj = %u\n", info->has_syncobj);
printf("has_syncobj_wait_for_submit = %u\n", 
info->has_syncobj_wait_for_submit);
printf("has_fence_to_handle = %u\n", info->has_fence_to_handle);
printf("has_ctx_priority = %u\n", info->has_ctx_priority);
printf("has_local_buffers = %u\n", info->has_local_buffers);
+   printf("kernel_flushes_hdp_before_ib = %u\n", 
info->kernel_flushes_hdp_before_ib);
 
printf("Shader core info:\n");
printf("max_shader_clock = %i\n", info->max_shader_clock);
printf("num_good_compute_units = %i\n", 
info->num_good_compute_units);
printf("max_se = %i\n", info->max_se);
printf("max_sh_per_se = %i\n", info->max_sh_per_se);
 
printf("Render backend info:\n");
printf("num_render_backends = %i\n", info->num_render_backends);
printf("num_tile_pipes = %i\n", info->num_tile_pipes);
diff --git a/src/amd/common/ac_gpu_info.h b/src/amd/common/ac_gpu_info.h
index 7af6fbfca97..8a9721750a6 100644
--- a/src/amd/common/ac_gpu_info.h
+++ b/src/amd/common/ac_gpu_info.h
@@ -89,20 +89,21 @@ struct radeon_info {
/* Kernel info. */
uint32_t drm_major; /* version */
uint32_t drm_minor;
uint32_t drm_patchlevel;
bool has_userptr;
bool has_syncobj;
bool has_syncobj_wait_for_submit;
bool has_fence_to_handle;
bool has_ctx_priority;
bool has_local_buffers;
+   bool kernel_flushes_hdp_before_ib;
 
/* Shader cores. */
uint32_t r600_max_quad_pipes; /* wave size / 16 */
uint32_t max_shader_clock;
uint32_t num_good_compute_units;
uint32_t max_se; /* shader engines */
uint32_t max_sh_per_se; /* shader arrays per shader engine */
 
/* Render backends (color + depth blocks). */
uint32_t r300_num_gb_pipes;
diff --git a/src/gallium/drivers/radeonsi/si_buffer.c 
b/src/gallium/drivers/radeonsi/si_buffer.c
index 504e0c723dc..2d68edc3404 100644
--- a/src/gallium/drivers/radeonsi/si_buffer.c
+++ b/src/gallium/drivers/radeonsi/si_buffer.c
@@ -118,22 +118,21 @@ void si_init_resource_fields(struct si_screen *sscreen,
/* fall through */
case PIPE_USAGE_STAGING:
/* Transfers are likely to occur more often with these
 * resources. */
res->domains = RADEON_DOMAIN_GTT;
break;
case PIPE_USAGE_DYNAMIC:
/* Older kernels didn't always flush the HDP cache before
 * CS execution
 */
-   if (sscreen->info.drm_major == 2 &&
-

[Mesa-dev] [PATCH v2 7.5/18] intel/compiler: support negate and abs of half float immediates

2018-05-02 Thread Jose Maria Casanova Crespo

---
 src/intel/compiler/brw_shader.cpp | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/intel/compiler/brw_shader.cpp 
b/src/intel/compiler/brw_shader.cpp
index 284c2e8233c..537defd05d9 100644
--- a/src/intel/compiler/brw_shader.cpp
+++ b/src/intel/compiler/brw_shader.cpp
@@ -605,7 +605,8 @@ brw_negate_immediate(enum brw_reg_type type, struct brw_reg 
*reg)
case BRW_REGISTER_TYPE_V:
   assert(!"unimplemented: negate UV/V immediate");
case BRW_REGISTER_TYPE_HF:
-  assert(!"unimplemented: negate HF immediate");
+  reg->ud ^= 0x80008000;
+  return true;
case BRW_REGISTER_TYPE_NF:
   unreachable("no NF immediates");
}
@@ -651,7 +652,8 @@ brw_abs_immediate(enum brw_reg_type type, struct brw_reg 
*reg)
case BRW_REGISTER_TYPE_V:
   assert(!"unimplemented: abs V immediate");
case BRW_REGISTER_TYPE_HF:
-  assert(!"unimplemented: abs HF immediate");
+  reg->ud &= ~0x80008000;
+  return true;
case BRW_REGISTER_TYPE_NF:
   unreachable("no NF immediates");
}
-- 
2.14.3


Re: [Mesa-dev] [PATCH v2 3/3] mesa: enable geom shaders in OpenGL 3.2 Compat profile

2018-05-02 Thread Marek Olšák

On Wed, May 2, 2018 at 6:27 AM, Timothy Arceri 
wrote:

> ---
>  src/mapi/glapi/gen/apiexec.py | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/src/mapi/glapi/gen/apiexec.py b/src/mapi/glapi/gen/apiexec.py
> index b5e0ad4a179..d33cc85d47f 100644
> --- a/src/mapi/glapi/gen/apiexec.py
> +++ b/src/mapi/glapi/gen/apiexec.py
> @@ -46,7 +46,7 @@ class exec_info():
>  if compatibility is not None:
>  assert isinstance(compatibility, int)
>  assert compatibility >= 10
> -assert compatibility <= 30
> +assert compatibility <= 46
>

I would prefer this assertion to be removed completely. With that:

Reviewed-by: Marek Olšák 

Marek

[Mesa-dev] [PATCH v3] intel/compiler: fix 16-bit int brw_negate_immediate and brw_abs_immediate

2018-05-02 Thread Jose Maria Casanova Crespo

From Intel Skylake PRM, vol 07, "Immediate" section (page 768):

"For a word, unsigned word, or half-float immediate data,
software must replicate the same 16-bit immediate value to both
the lower word and the high word of the 32-bit immediate field
in a GEN instruction."

This fixes the int16/uint16 negate and abs immediates that weren't
taking into account the replication in lower and upper words.
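
The replication rule quoted above can be sketched in isolation as follows. This is a minimal illustration rather than the driver code: replicate16, negate_w_imm and abs_w_imm are hypothetical names that operate on a bare uint32_t instead of struct brw_reg, and the sketch assumes the usual two's-complement truncation when converting to int16_t.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: copy a 16-bit value into both words of a
 * 32-bit immediate field, as the PRM text requires. */
static uint32_t replicate16(uint16_t value)
{
   return value | (uint32_t)value << 16;
}

/* Negate a replicated W/UW immediate. Only the low word is read;
 * the result is replicated again so both words stay identical. */
static uint32_t negate_w_imm(uint32_t ud)
{
   uint16_t value = -(int16_t)ud;
   uint32_t result = replicate16(value);

   /* Both halves of a valid 16-bit immediate must match. */
   assert((result & 0xffff) == (result >> 16));
   return result;
}

/* Absolute value of a replicated W immediate, replicated again. */
static uint32_t abs_w_imm(uint32_t ud)
{
   uint16_t value = abs((int16_t)ud);
   return replicate16(value);
}
```

With these helpers, negating the replicated immediate for 3 (0x00030003) gives 0xfffdfffd, and taking the absolute value of 0xfffdfffd gives 0x00030003 back.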

v2: Integer cases are different to Float cases. (Jason Ekstrand)
Included reference to PRM (Jose Maria Casanova)
v3: Make explicit uint32_t casting for left shift (Jason Ekstrand)
Split half float implementation. (Jason Ekstrand)
Fix brw_abs_immediate (Jose Maria Casanova)

Cc: "18 . 0 18 . 1" 
---
 src/intel/compiler/brw_shader.cpp | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/src/intel/compiler/brw_shader.cpp 
b/src/intel/compiler/brw_shader.cpp
index 9cdf9fcb23d..284c2e8233c 100644
--- a/src/intel/compiler/brw_shader.cpp
+++ b/src/intel/compiler/brw_shader.cpp
@@ -580,9 +580,11 @@ brw_negate_immediate(enum brw_reg_type type, struct 
brw_reg *reg)
   reg->d = -reg->d;
   return true;
case BRW_REGISTER_TYPE_W:
-   case BRW_REGISTER_TYPE_UW:
-  reg->d = -(int16_t)reg->ud;
+   case BRW_REGISTER_TYPE_UW: {
+  uint16_t value = -(int16_t)reg->ud;
+  reg->ud = value | (uint32_t)value << 16;
   return true;
+   }
case BRW_REGISTER_TYPE_F:
   reg->f = -reg->f;
   return true;
@@ -618,9 +620,11 @@ brw_abs_immediate(enum brw_reg_type type, struct brw_reg 
*reg)
case BRW_REGISTER_TYPE_D:
   reg->d = abs(reg->d);
   return true;
-   case BRW_REGISTER_TYPE_W:
-  reg->d = abs((int16_t)reg->ud);
+   case BRW_REGISTER_TYPE_W: {
+  uint16_t value = abs((int16_t)reg->ud);
+  reg->ud = value | (uint32_t)value << 16;
   return true;
+   }
case BRW_REGISTER_TYPE_F:
   reg->f = fabsf(reg->f);
   return true;
-- 
2.14.3


[Mesa-dev] [PATCH v3] intel/compiler: fix brw_imm_w for negative 16-bit integers

2018-05-02 Thread Jose Maria Casanova Crespo

16-bit immediates need to replicate the 16-bit immediate value
in both words of the 32-bit value. Care is needed to avoid
sign extension, which the previous implementation did not
handle properly.

For example, with the previous implementation, storing the value
-3 would generate imm.d = 0xfffffffd due to signed integer sign
extension, which is not correct. Instead, we should cast to
uint16_t, which gives us the correct result: imm.ud = 0xfffdfffd.
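
A small standalone sketch of the difference, assuming nothing beyond standard C: imm_w_old and imm_w_new are hypothetical stand-ins for the old and new bodies of brw_imm_w() that return the raw 32-bit field. The old expression is modelled with unsigned arithmetic so the sketch itself avoids shifting a negative value.

```c
#include <assert.h>
#include <stdint.h>

/* Model of the old behaviour: w is widened with sign extension
 * before the OR, so the high word is filled with sign bits. */
static uint32_t imm_w_old(int16_t w)
{
   uint32_t extended = (uint32_t)(int32_t)w;
   return extended | (extended << 16);
}

/* New behaviour: truncate to 16 bits first, then replicate. */
static uint32_t imm_w_new(int16_t w)
{
   return (uint16_t)w | (uint32_t)(uint16_t)w << 16;
}

/* Hypothetical self-check covering the values named in the text. */
static void check_imm_w(void)
{
   assert(imm_w_old(-3) == 0xfffffffdu); /* high word is wrong */
   assert(imm_w_new(-3) == 0xfffdfffdu); /* both words hold -3 */
   assert(imm_w_old(-1) == imm_w_new(-1)); /* all ones either way */
   assert(imm_w_new(-2) == 0xfffefffeu);
}
```

Non-negative values are unaffected by the change, since no sign bits are set in the first place.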

We only had a couple of cases hitting this path in the driver
until now, one with value -1, which would work since all bits are
one in this case, and another with value -2 in brw_clip_tri(),
which would hit the aforementioned issue (this case only affects
gen4 although we are not aware of whether this was causing an
actual bug somewhere).

v2: Make explicit uint32_t casting for left shift (Jason Ekstrand)

Reviewed-by: Jason Ekstrand 

Cc: "18 . 0 18 . 1" 
---
 src/intel/compiler/brw_reg.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel/compiler/brw_reg.h b/src/intel/compiler/brw_reg.h
index dff9b970b2f..ac12ab3d2dd 100644
--- a/src/intel/compiler/brw_reg.h
+++ b/src/intel/compiler/brw_reg.h
@@ -705,7 +705,7 @@ static inline struct brw_reg
 brw_imm_w(int16_t w)
 {
struct brw_reg imm = brw_imm_reg(BRW_REGISTER_TYPE_W);
-   imm.d = w | (w << 16);
+   imm.ud = (uint16_t)w | (uint32_t)(uint16_t)w << 16;
return imm;
 }
 
-- 
2.14.3


Re: [Mesa-dev] [PATCH v2 3/3] mesa: enable geom shaders in OpenGL 3.2 Compat profile

2018-05-02 Thread Timothy Arceri




On 03/05/18 02:58, Ilia Mirkin wrote:

On Wed, May 2, 2018 at 6:27 AM, Timothy Arceri  wrote:

---
  src/mapi/glapi/gen/apiexec.py | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/mapi/glapi/gen/apiexec.py b/src/mapi/glapi/gen/apiexec.py
index b5e0ad4a179..d33cc85d47f 100644
--- a/src/mapi/glapi/gen/apiexec.py
+++ b/src/mapi/glapi/gen/apiexec.py
@@ -46,7 +46,7 @@ class exec_info():
  if compatibility is not None:
  assert isinstance(compatibility, int)
  assert compatibility >= 10
-assert compatibility <= 30
+assert compatibility <= 46

  if core is not None:
  assert isinstance(core, int)
@@ -70,7 +70,7 @@ functions = {
  "TexBuffer": exec_info(compatibility=20, core=31, es2=31),

  # OpenGL 3.2 / GL_OES_geometry_shader.
-"FramebufferTexture": exec_info(core=32, es2=31),
+"FramebufferTexture": exec_info(compatibility=32, core=32, es2=31),


Does it make sense to list out compat explicitly in the presence of
core? Are there any core functions that aren't available in compat
contexts of that version?

IMHO it's worth changing the exec_info class to say

if core and compatibility is None:
   compatibility = core

... or something along those lines.


If core and compatibility are none then compatibility = core is 
redundant. Am I missing something?
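
For what it's worth, the condition as written parses as "core and (compatibility is None)", because "is" binds tighter than "and". Presumably the intent is to default compatibility to core only when core was given and compatibility was not. A minimal sketch of that reading follows; resolve_compat is a hypothetical stand-in, not the real exec_info class.

```python
def resolve_compat(core=None, compatibility=None):
    """Return the compatibility version under the suggested defaulting rule."""
    # Parsed as: core and (compatibility is None)
    if core and compatibility is None:
        compatibility = core
    return compatibility
```

Under this reading an explicit compatibility value always wins, and an entry with neither argument stays at None.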







  # OpenGL 4.0 / GL_ARB_shader_subroutines. Mesa only exposes this
  # extension with core profile.
--
2.17.0


Re: [Mesa-dev] [PATCH v2 07/18] intel/compiler: fix brw_negate_immediate for 16-bit types

2018-05-02 Thread Chema Casanova



On 30/04/18 at 23:14, Jason Ekstrand wrote:
> 
> 
> On Mon, Apr 30, 2018 at 7:18 AM, Iago Toral Quiroga wrote:
> 
> From: Jose Maria Casanova Crespo
> 
> From Intel Skylake PRM, vol 07, "Immediate" section (page 768):
> 
> "For a word, unsigned word, or half-float immediate data,
> software must replicate the same 16-bit immediate value to both
> the lower word and the high word of the 32-bit immediate field
> in a GEN instruction."
> 
> This patch implements float16 negate and fixes the int16/uint16
> negate that wasn't taking into account the replication in lower
> and higher words.
> 
> 
> Since this fixes a bug, do we want to split it in two and send the
> bug-fix to stable?

Makes sense to split. I'm also going to send the brw_imm_w patch to stable.

I detected the same issue with brw_abs_immediate. So I'm including it in
the v3 of this patch.


> 
> v2: Integer cases are different to Float cases. (Jason Ekstrand)
>     Included reference to PRM (Jose Maria Casanova)
> ---
>  src/intel/compiler/brw_shader.cpp | 11 +++
>  1 file changed, 7 insertions(+), 4 deletions(-)
> 
> diff --git a/src/intel/compiler/brw_shader.cpp
> b/src/intel/compiler/brw_shader.cpp
> index 9cdf9fcb23..76dd1173fa 100644
> --- a/src/intel/compiler/brw_shader.cpp
> +++ b/src/intel/compiler/brw_shader.cpp
> @@ -580,8 +580,13 @@ brw_negate_immediate(enum brw_reg_type type,
> struct brw_reg *reg)
>        reg->d = -reg->d;
>        return true;
>     case BRW_REGISTER_TYPE_W:
> -   case BRW_REGISTER_TYPE_UW:
> -      reg->d = -(int16_t)reg->ud;
> +   case BRW_REGISTER_TYPE_UW: {
> +      uint16_t value = -(int16_t)reg->ud;
> +      reg->ud = value | value << 16;
> 
> 
> You're shifting an explicitly 16-bit value by 16.  I think you want to
> cast to uint32_t.

As agreed, I'll change this to:

reg->ud = value | (uint32_t) value << 16;
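
As a rough illustration of the replication rule from the PRM quote, here is a sketch of the integer and half-float negate paths. It is a model only: the function names are invented for this example, masks stand in for the C integer types, and the struct 'e' format is used just to obtain half-float bit patterns.

```python
import struct


def replicate16(value):
    """Place the same 16-bit pattern in both halves of a 32-bit immediate."""
    value &= 0xffff
    return value | (value << 16)


def negate_w(imm_ud):
    """Negate a W/UW immediate and replicate the 16-bit result again."""
    low = imm_ud & 0xffff
    return replicate16(-low)


def negate_hf(imm_ud):
    """Negate a replicated half-float immediate by flipping both sign bits."""
    return imm_ud ^ 0x80008000


def hf_bits(x):
    """16-bit pattern of x encoded as an IEEE half float."""
    return struct.unpack('<H', struct.pack('<e', x))[0]
```

Negating the replicated immediate for 3 gives 0xfffdfffd rather than the 0xfffffffd that the unreplicated version produced, and negating twice returns the original value.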


> +      return true;
> +   }
> +   case BRW_REGISTER_TYPE_HF:
> +      reg->ud ^= 0x80008000;
>        return true;
>     case BRW_REGISTER_TYPE_F:
>        reg->f = -reg->f;
> @@ -602,8 +607,6 @@ brw_negate_immediate(enum brw_reg_type type,
> struct brw_reg *reg)
>     case BRW_REGISTER_TYPE_UV:
>     case BRW_REGISTER_TYPE_V:
>        assert(!"unimplemented: negate UV/V immediate");
> -   case BRW_REGISTER_TYPE_HF:
> -      assert(!"unimplemented: negate HF immediate");
>     case BRW_REGISTER_TYPE_NF:
>        unreachable("no NF immediates");
>     }
> -- 
> 2.14.1
> 
> 

Re: [Mesa-dev] [PATCH v2 2/3] mesa: actually support GLSL version overrides in compat profile

2018-05-02 Thread Timothy Arceri




On 03/05/18 02:51, Emil Velikov wrote:

On 2 May 2018 at 11:27, Timothy Arceri  wrote:

---
  src/mesa/main/version.c | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/src/mesa/main/version.c b/src/mesa/main/version.c
index 84babd69e2f..540f5482034 100644
--- a/src/mesa/main/version.c
+++ b/src/mesa/main/version.c
@@ -591,6 +591,8 @@ _mesa_get_version(const struct gl_extensions *extensions,
   if (consts->GLSLVersion > 140) {
  consts->GLSLVersion = 140;
   }
+ /* Support GLSL version overrides in compat profile */
+ _mesa_override_glsl_version(consts);


Why are we allowing this only for compat? As-is this feels very dirty
and skimming through the existing code doesn't help much.


Because the code above this hard-codes compat to GLSL 140. We have no 
consts->GLSLVersionCompat (and probably shouldn't), so drivers will 
otherwise set whatever they support in core for compat. This is how the 
code currently works, and this patch allows the override to work as expected.
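
A deliberately simplified model of the ordering issue, assuming the override is first applied during driver init and the compat cap of 140 is applied afterwards. The function names and the plain-integer interface are hypothetical; the real code works on consts->GLSLVersion.

```python
COMPAT_GLSL_CAP = 140


def compat_version_before(driver_version, override=None):
    """Override applied early, then clamped by the compat cap."""
    version = override if override is not None else driver_version
    return min(version, COMPAT_GLSL_CAP)


def compat_version_after(driver_version, override=None):
    """Same as above, but the override is applied again after the cap."""
    version = compat_version_before(driver_version, override)
    if override is not None:
        version = override
    return version
```

In this model an override of 330 on a driver that supports 450 ends up as 140 without the second call and as 330 with it, while the no-override case is unchanged.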




* classic drivers - 965, starting at create_context
_mesa_initialize_context -> _mesa_init_constants -> GLSLVersion (120) + override
intelInitExtensions -> GLSLVersion + override combo
_mesa_compute_version -> _mesa_get_version -> [optional] cap up-to 140
-> override
_mesa_compute_version -> tweak/match GLSL version based on the GL version

Not to mention the initial 120 (effectively) in
_mesa_initialize_context is bonkers for the following:
  - Intel Gen2 (GL 1.3) and Gen3 (GL 1.4 or 2.1)
  - nouveau vieux - GL 1.2 or 1.3
  - radeon (r100/r200) - GL 1.3

* gallium - two paths - create_screen and create_context, latter more
or less identical to i965
For the create_screen part:
st_api_query_versions (for max_gl*_version) -> _mesa_init_constants -> see above
st_api_query_versions (for max_gl*_version) -> st_init_extensions ->
GLSLVersion + override combo
st_api_query_versions (for max_gl*_version) -> _mesa_get_version -> see above

As you can see things are hairy.

A few ideas that come to mind:
  - drop the _mesa_init_constants bits and update any drivers needed
  - each of GLSLVersion, override and tweaks should happen [ideally]
once per ctx.


I have no interest in tackling this right now. I disagree that this 
change requires such a big overhaul. Right now this fixes a bug, we can 
tidy up later. We are likely going to need to tweak how versioning works 
once 3.2 compat profiles become supported.




In theory one ought to be able to reuse the gallium approach for
classic drivers, but that's going on a far too big tangent.

HTH
Emil

Re: [Mesa-dev] [PATCH 9/9] radeonsi: add an environment variable that forces EQAA for MSAA allocations

2018-05-02 Thread Marek Olšák

FYI, the environment variable will only have an effect on amdgpu.

Marek

On Wed, May 2, 2018 at 12:13 AM, Marek Olšák  wrote:

> From: Marek Olšák 
>
> This is for testing and experiments.
> ---
>  src/gallium/drivers/radeonsi/si_pipe.c| 22 
>  src/gallium/drivers/radeonsi/si_pipe.h|  3 +++
>  src/gallium/drivers/radeonsi/si_state.c   |  5 
>  src/gallium/drivers/radeonsi/si_texture.c | 31 +++
>  4 files changed, 56 insertions(+), 5 deletions(-)
>
> diff --git a/src/gallium/drivers/radeonsi/si_pipe.c b/src/gallium/drivers/
> radeonsi/si_pipe.c
> index 1ca38ed55cb..35c2c200e57 100644
> --- a/src/gallium/drivers/radeonsi/si_pipe.c
> +++ b/src/gallium/drivers/radeonsi/si_pipe.c
> @@ -1065,20 +1065,42 @@ struct pipe_screen *radeonsi_screen_create(struct
> radeon_winsys *ws,
> sscreen->barrier_flags.cp_to_L2 = SI_CONTEXT_INV_SMEM_L1 |
> SI_CONTEXT_INV_VMEM_L1;
> if (sscreen->info.chip_class <= VI) {
> sscreen->barrier_flags.cp_to_L2 |=
> SI_CONTEXT_INV_GLOBAL_L2;
> sscreen->barrier_flags.L2_to_cp |=
> SI_CONTEXT_WRITEBACK_GLOBAL_L2;
> }
>
> if (debug_get_bool_option("RADEON_DUMP_SHADERS", false))
> sscreen->debug_flags |= DBG_ALL_SHADERS;
>
> +   /* Syntax:
> +* EQAA=s,z,c
> +* Example:
> +* EQAA=8,4,2
> +
> +* That means 8 coverage samples, 4 Z/S samples, and 2 color
> samples.
> +* Constraints:
> +* s >= z >= c (ignoring this only wastes memory)
> +* s = [2..16]
> +* z = [2..8]
> +* c = [2..8]
> +*
> +* Only MSAA color and depth buffers are overriden.
> +*/
> +   const char *eqaa = debug_get_option("EQAA", NULL);
> +   unsigned s,z,f;
> +   if (eqaa && sscanf(eqaa, "%u,%u,%u", &s, &z, &f) == 3 && s && z && f) {
> +   sscreen->eqaa_force_coverage_samples = s;
> +   sscreen->eqaa_force_z_samples = z;
> +   sscreen->eqaa_force_color_samples = f;
> +   }
> +
> for (i = 0; i < num_comp_hi_threads; i++)
> si_init_compiler(sscreen, &sscreen->compiler[i]);
> for (i = 0; i < num_comp_lo_threads; i++)
> si_init_compiler(sscreen, &sscreen->compiler_lowp[i]);
>
> /* Create the auxiliary context. This must be done last. */
> sscreen->aux_context = si_create_context(&sscreen->b, 0);
>
> if (sscreen->debug_flags & DBG(TEST_DMA))
> si_test_dma(sscreen);
> diff --git a/src/gallium/drivers/radeonsi/si_pipe.h b/src/gallium/drivers/
> radeonsi/si_pipe.h
> index 55a135f3870..6917d5e6068 100644
> --- a/src/gallium/drivers/radeonsi/si_pipe.h
> +++ b/src/gallium/drivers/radeonsi/si_pipe.h
> @@ -409,20 +409,23 @@ struct si_screen {
>
> struct radeon_info  info;
> uint64_t debug_flags;
> char renderer_string[100];
>
> unsigned gs_table_depth;
> unsigned tess_offchip_block_dw_size;
> unsigned tess_offchip_ring_size;
> unsigned tess_factor_ring_size;
> unsigned vgt_hs_offchip_param;
> +   unsigned eqaa_force_coverage_samples;
> +   unsigned eqaa_force_z_samples;
> +   unsigned eqaa_force_color_samples;
> bool has_clear_state;
> bool has_distributed_tess;
> bool has_draw_indirect_multi;
> bool has_out_of_order_rast;
> bool assume_no_z_fights;
> bool commutative_blend_add;
> bool clear_db_cache_before_clear;
> bool has_msaa_sample_loc_bug;
> bool has_ls_vgpr_init_bug;
> bool dpbb_allowed;
> diff --git a/src/gallium/drivers/radeonsi/si_state.c
> b/src/gallium/drivers/radeonsi/si_state.c
> index e133bf28589..c7585b285e9 100644
> --- a/src/gallium/drivers/radeonsi/si_state.c
> +++ b/src/gallium/drivers/radeonsi/si_state.c
> @@ -2112,20 +2112,21 @@ static bool si_is_zs_format_supported(enum
> pipe_format format)
>  {
> return si_translate_dbformat(format) != V_028040_Z_INVALID;
>  }
>
>  static boolean si_is_format_supported(struct pipe_screen *screen,
>   enum pipe_format format,
>   enum pipe_texture_target target,
>   unsigned sample_count,
>   unsigned usage)
>  {
> +   struct si_screen
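
For anyone who wants to sanity-check a value before exporting it, the EQAA=s,z,c syntax from the patch comment can be modelled like this. It is a sketch only: parse_eqaa and wastes_memory are hypothetical helpers, and the range checks are stricter than the patch itself, which only requires three non-zero numbers.

```python
def parse_eqaa(value):
    """Parse "s,z,c" into (coverage, z, color) sample counts, or None."""
    parts = value.split(",")
    if len(parts) != 3:
        return None
    try:
        s, z, c = (int(p) for p in parts)
    except ValueError:
        return None
    # Ranges taken from the patch comment: s = [2..16], z = [2..8], c = [2..8]
    if not (2 <= s <= 16 and 2 <= z <= 8 and 2 <= c <= 8):
        return None
    return s, z, c


def wastes_memory(s, z, c):
    """The comment notes that ignoring s >= z >= c only wastes memory."""
    return not (s >= z >= c)
```

The example from the comment, "8,4,2", parses to 8 coverage samples, 4 Z/S samples and 2 color samples, and satisfies the ordering constraint.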

Re: [Mesa-dev] [PATCH v2] egl: check if colorspace/surface type is supported

2018-05-02 Thread Marek Olšák

Reviewed-by: Marek Olšák 

Marek

On Wed, May 2, 2018 at 12:23 PM, Juan A. Suarez Romero wrote:

> According to EGL 1.4 spec, section 3.5.1 ("Creating On-Screen Rendering
> Surfaces"), if config does not support the colorspace or alpha format
> attributes specified in attrib_list (as defined for
> eglCreateWindowSurface), an EGL_BAD_MATCH error is generated.
>
> This fixes dEQP-EGL.functional.wide_color.*_888_colorspace_srgb (still
> not merged,
> https://android-review.googlesource.com/c/platform/external/deqp/+/667322),
> which is crashing when trying to create a window surface with RGB888
> configuration and sRGB colorspace.
>
> v2: Handle the fix in other backends (Tapani)
> ---
>  src/egl/drivers/dri2/platform_drm.c  | 5 +
>  src/egl/drivers/dri2/platform_wayland.c  | 6 ++
>  src/egl/drivers/dri2/platform_x11.c  | 5 +
>  src/egl/drivers/dri2/platform_x11_dri3.c | 5 +
>  4 files changed, 21 insertions(+)
>
> diff --git a/src/egl/drivers/dri2/platform_drm.c b/src/egl/drivers/dri2/
> platform_drm.c
> index dc4efea9103..35bc4b5b1ac 100644
> --- a/src/egl/drivers/dri2/platform_drm.c
> +++ b/src/egl/drivers/dri2/platform_drm.c
> @@ -155,6 +155,11 @@ dri2_drm_create_window_surface(_EGLDriver *drv,
> _EGLDisplay *disp,
> config = dri2_get_dri_config(dri2_conf, EGL_WINDOW_BIT,
>  dri2_surf->base.GLColorspace);
>
> +   if (!config) {
> +  _eglError(EGL_BAD_MATCH, "Unsupported surfacetype/colorspace
> configuration");
> +  goto cleanup_surf;
> +   }
> +
> if (!dri2_drm_config_is_compatible(dri2_dpy, config, surface)) {
>_eglError(EGL_BAD_MATCH, "EGL config not compatible with GBM
> format");
>goto cleanup_surf;
> diff --git a/src/egl/drivers/dri2/platform_wayland.c
> b/src/egl/drivers/dri2/platform_wayland.c
> index 80853ac00b8..63da21cdf55 100644
> --- a/src/egl/drivers/dri2/platform_wayland.c
> +++ b/src/egl/drivers/dri2/platform_wayland.c
> @@ -249,6 +249,12 @@ dri2_wl_create_window_surface(_EGLDriver *drv,
> _EGLDisplay *disp,
>
> config = dri2_get_dri_config(dri2_conf, EGL_WINDOW_BIT,
>  dri2_surf->base.GLColorspace);
> +
> +   if (!config) {
> +  _eglError(EGL_BAD_MATCH, "Unsupported surfacetype/colorspace
> configuration");
> +  goto cleanup_surf;
> +   }
> +
> visual_idx = dri2_wl_visual_idx_from_config(dri2_dpy, config);
> assert(visual_idx != -1);
>
> diff --git a/src/egl/drivers/dri2/platform_x11.c b/src/egl/drivers/dri2/
> platform_x11.c
> index 6c287b4d06b..fa838f6721e 100644
> --- a/src/egl/drivers/dri2/platform_x11.c
> +++ b/src/egl/drivers/dri2/platform_x11.c
> @@ -251,6 +251,11 @@ dri2_x11_create_surface(_EGLDriver *drv, _EGLDisplay
> *disp, EGLint type,
> config = dri2_get_dri_config(dri2_conf, type,
>  dri2_surf->base.GLColorspace);
>
> +   if (!config) {
> +  _eglError(EGL_BAD_MATCH, "Unsupported surfacetype/colorspace
> configuration");
> +  goto cleanup_pixmap;
> +   }
> +
> if (dri2_dpy->dri2) {
>dri2_surf->dri_drawable =
>   dri2_dpy->dri2->createNewDrawable(dri2_dpy->dri_screen, config,
> diff --git a/src/egl/drivers/dri2/platform_x11_dri3.c
> b/src/egl/drivers/dri2/platform_x11_dri3.c
> index a41e40156df..5cb6d65c0a3 100644
> --- a/src/egl/drivers/dri2/platform_x11_dri3.c
> +++ b/src/egl/drivers/dri2/platform_x11_dri3.c
> @@ -183,6 +183,11 @@ dri3_create_surface(_EGLDriver *drv, _EGLDisplay
> *disp, EGLint type,
> dri_config = dri2_get_dri_config(dri2_conf, type,
>  dri3_surf->surf.base.GLColorspace);
>
> +   if (!dri_config) {
> +  _eglError(EGL_BAD_MATCH, "Unsupported surfacetype/colorspace
> configuration");
> +  goto cleanup_pixmap;
> +   }
> +
> if (loader_dri3_drawable_init(dri2_dpy->conn, drawable,
>   dri2_dpy->dri_screen,
>   dri2_dpy->is_different_gpu,
> --
> 2.14.3
>

[Mesa-dev] [PATCH] intel/genxml: recognize 0x, 0o and 0b when setting default value

2018-05-02 Thread Caio Marcelo de Oliveira Filho

Remove the need to convert values that are documented in
hexadecimal. This patch would allow writing



instead of


---
 src/intel/genxml/gen_pack_header.py | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/intel/genxml/gen_pack_header.py 
b/src/intel/genxml/gen_pack_header.py
index 8989f625d3..6a4c8033a7 100644
--- a/src/intel/genxml/gen_pack_header.py
+++ b/src/intel/genxml/gen_pack_header.py
@@ -241,7 +241,8 @@ class Field(object):
         self.prefix = None
 
         if "default" in attrs:
-            self.default = int(attrs["default"])
+            # Base 0 recognizes 0x, 0o, 0b prefixes in addition to decimal ints.
+            self.default = int(attrs["default"], base=0)
         else:
             self.default = None
 
-- 
2.17.0
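
A quick way to see what the base=0 change accepts, written as a standalone sketch. The two helper names are illustrative and not part of gen_pack_header.py.

```python
def parse_default(text):
    """Parse a field default the way int(text, base=0) does."""
    return int(text, base=0)


def parses_as_decimal_only(text):
    """True if the old int(text) call would have accepted the string."""
    try:
        int(text)
        return True
    except ValueError:
        return False
```

"16", "0x10", "0o20" and "0b10000" all parse to 16 with base 0, while the old call rejects every prefixed form. One thing to watch: base 0 rejects decimal strings with a leading zero such as "010", which the old call accepted.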


Re: [Mesa-dev] [PATCH] opencl: autotools: Fix linking order for OpenCL target

2018-05-02 Thread Jan Vesely

On Wed, 2018-05-02 at 18:38 +0200, Kai Wasserbäch wrote:
> Hey Jan,
> Jan Vesely wrote on 01.05.2018 23:59:
> > On Tue, 2018-05-01 at 18:23 +0200, Kai Wasserbäch wrote:
> > > Jan Vesely wrote on 01.05.2018 17:19:
> > > > On Tue, 2018-05-01 at 14:14 +0200, Kai Wasserbäch wrote:
> > > > > [...]
> > > > > 
> > > > >  src/gallium/targets/opencl/Makefile.am | 3 +--
> > > > >  1 file changed, 1 insertion(+), 2 deletions(-)
> > > > > 
> > > > > diff --git a/src/gallium/targets/opencl/Makefile.am 
> > > > > b/src/gallium/targets/opencl/Makefile.am
> > > > > index de68a93ad5..f0e1de7797 100644
> > > > > --- a/src/gallium/targets/opencl/Makefile.am
> > > > > +++ b/src/gallium/targets/opencl/Makefile.am
> > > > > @@ -23,11 +23,10 @@ lib@OPENCL_LIBNAME@_la_LIBADD = \
> > > > >   $(LIBELF_LIBS) \
> > > > >   $(DLOPEN_LIBS) \
> > > > >   -lclangCodeGen \
> > > > > - -lclangFrontendTool \
> > > > >   -lclangFrontend \
> > > > > + -lclangFrontendTool \
> > > > 
> > > > This is strange. Why does reordering help here? Do we use -Wl,--as-
> > > > needed anywhere?
> > > 
> > > No, not that I can see.
> > > 
> > > > Should we use -Wl,--start-group/-Wl,--end-group for all clang libraries
> > > > instead?
> > > 
> > > Maybe? This was the simplest fix I could come up with, but if there's a
> > > preference for a link group, I can give that a try as well.
> > 
> > So the fix is to change ordering?
> 
> yes.
> 
> > Does using groups fix the issue as well? I think that would be
> > preferable, but I use split .so files, so I don't hit this issue.
> 
> I tried convincing autotools to work with those flags but failed. The only
> option I see to solve this, is very messy IMHO (and would still need the
> ordering fix): putting -Wl,--{start,end}-group directly into the right places 
> in
> lib@OPENCL_LIBNAME@_la_LIBADD is forbidden by automake ("error: linker flags
> such as '-Wl,--start-group' belong in 'lib@OPENCL_LIBNAME@_la_LDFLAGS'") and
> adding them to lib@OPENCL_LIBNAME@_la_LDFLAGS like automake is suggesting 
> won't
> work for obvious reasons. The only solution I can see is to work with
> substitution because automake seems to "not see" the flags then. I could do an
> unconditional replacement, but there are probably linkers with no support for
> these flags, which would mean I'd have to do the ordering fix in any case and
> then conditionally set "-Wl,--{start,end}-group" just for the GNU toolchain 
> with
> no immediate benefit beyond future-proofing this section.
> But maybe people who are deeper into the whole autotools stuff (Emil?
> Francisco?) can point me to a solution? Otherwise I'd like to return to my
> original patch which fixes the FTBFS and works for now. Or maybe the library
> could be linked against libclang.so (at least when --enable-llvm-shared-libs 
> is set)?

Thanks for looking into this. We probably need CLANG_LIBS handling
similar to LLVM_LIBS. I agree this is the best fix for now.

Acked-by: Jan Vesely 

libclang.so might be a solution, but I'm not sure how it interacts
with older or statically built clang. It's also weird that we are linking
to clang here instead of clover which actually uses clang symbols.

@Emil, are you OK with this patch?

> 
> > > > >   -lclangDriver \
> > > > >   -lclangSerialization \
> > > > > - -lclangCodeGen \
> > > > 
> > > > Is this change related?
> > > 
> > > Not really, just a minor clean-up while I was busy a few lines above.
> > > "clangCodeGen" is already named on the first Clang library line.
> > 
> > ah, all right, maybe mention it in the commit message?
> 
> Do I need to resend the patch for that or can you just add a line like "This
> change also removes the duplicate clangCodeGen line (trivial change)." before
> pushing, considering, that there are two T-b tags to be added anyway?

I'll add it on my side before pushing the patch.

thanks,
Jan

> 
> Cheers,
> Kai
> 



Re: [Mesa-dev] [PATCH 5/9] anv: Add vma_heap allocators in anv_device

2018-05-02 Thread Chris Wilson

Quoting Scott D Phillips (2018-05-02 20:24:01)
> Chris Wilson  writes:
> 
> > Quoting Scott D Phillips (2018-05-02 17:01:05)
> >> +bool
> >> +anv_vma_alloc(struct anv_device *device, struct anv_bo *bo)
> >> +{
> >> +   if (!(bo->flags & EXEC_OBJECT_PINNED))
> >> +  return true;
> >> +
> >> +   pthread_mutex_lock(&device->vma_mutex);
> >> +
> >> +   bo->offset = 0;
> >
> > So bo are device scoped. There can only be a single vma per bo, so why
> > not store the vma node inside the bo? No extra allocations, no
> > searching in anv_vma_close() (a linear walk!!! Granted you have the
> > excuse of doing a full walk for list validation on top of that).
> 
> ah, I see what you're saying, something like:
> 
> bool util_vma_heap_alloc(struct util_vma_heap *heap, uint64_t size,
>uint64_t alignment, struct util_vma_heap_node *node);
> 
> to place the node in the vma, where node has a list_head and uint64_t
> addr. And then free becomes a few data structure twiddles.

Yes.

> > I guess you don't have much that stresses the vma manager :)
> 
> Right, Jason's advice is that the "good enough" point for vulkan is
> fairly low, where the model is that memory allocations are not happening
> dynamically and are few in number (~100s, I guess). There's lots of room
> for optimization in the allocator, for sure.

I'm just looking at it from the pov of drm_mm which this "replaces" (not
a complete replacement as the kernel still calls into drm_mm to validate
what the user tells us, but only on first binding). The main lesson,
imho, being that allocation and linear lists are terrible.

> The hope is that that's localized behind the allocator interface and
> everything else is fine in the series.

Also a clear API begs for unittests ;)

> > The decision to split low/high ranges rather than have an up/down
> > allocator wants a few words of explanation.
> 
> I suppose I saw that as a six-of-one/half-dozen-of-another type
> thing. It would save having two allocators, but low-memory allocation
> would still be funny with {mem=alloc; if (it wasn't actually low)
> {free;fail;}}. Maybe that logic could get hoisted into the low-side
> allocation function?
> 
> That being said, I don't really mind changing to a double sided
> allocator if there's a good argument in favor.

I don't mind either way, what I care about is that the rationale for
picking one over the other is recorded. Knowing what your expectations
are helps us all live up to them.
-Chris

[Mesa-dev] [Bug 104302] Wolfenstein 2 (2017) under wine graphical artifacting on RADV

2018-05-02 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=104302

--- Comment #18 from Charadon  ---
I dunno if this information helps, but the face issue doesn't exist in AMD's
enterprise drivers.


Re: [Mesa-dev] [PATCH 5/9] anv: Add vma_heap allocators in anv_device

2018-05-02 Thread Scott D Phillips

Chris Wilson  writes:

> Quoting Scott D Phillips (2018-05-02 17:01:05)
>> +bool
>> +anv_vma_alloc(struct anv_device *device, struct anv_bo *bo)
>> +{
>> +   if (!(bo->flags & EXEC_OBJECT_PINNED))
>> +  return true;
>> +
>> +   pthread_mutex_lock(&device->vma_mutex);
>> +
>> +   bo->offset = 0;
>
> So bo are device scoped. There can only be a single vma per bo, so why
> not store the vma node inside the bo? No extra allocations, no
> searching in anv_vma_close() (a linear walk!!! Granted you have the
> excuse of doing a full walk for list validation on top of that).

ah, I see what you're saying, something like:

bool util_vma_heap_alloc(struct util_vma_heap *heap, uint64_t size,
   uint64_t alignment, struct util_vma_heap_node *node);

to place the node in the vma, where node has a list_head and uint64_t
addr. And then free becomes a few data structure twiddles.

> I guess you don't have much that stresses the vma manager :)

Right, Jason's advice is that the "good enough" point for vulkan is
fairly low, where the model is that memory allocations are not happening
dynamically and are few in number (~100s, I guess). There's lots of room
for optimization in the allocator, for sure.

The hope is that that's localized behind the allocator interface and
everything else is fine in the series.

> The decision to split low/high ranges rather than have an up/down
> allocator wants a few words of explanation.

I suppose I saw that as a six-of-one/half-dozen-of-another type
thing. It would save having two allocators, but low-memory allocation
would still be funny with {mem=alloc; if (it wasn't actually low)
{free;fail;}}. Maybe that logic could get hoisted into the low-side
allocation function?

That being said, I don't really mind changing to a double sided
allocator if there's a good argument in favor.

> -Chris

Re: [Mesa-dev] [PATCH 02/16] ac: enable both RBs on Kaveri

2018-05-02 Thread Marek Olšák

The changed field in PA_SC_RASTER_CONFIG is pretty clear here:

0 - RB0 renders all pixels (default for single rb per pkr configs)
1 - RB0 renders rb_tile_id==1, RB1 renders rb_tile_id==0
2 - RB0 renders rb_tile_id==0, RB1 renders rb_tile_id==1
3 - RB1 renders all pixels

So you see that values 0 and 3 are for single RB or harvested configs. You
need either 1 or 2 for full configs. Full configs can also use this to
disable RBs if the bus can't handle too many transfers efficiently (e.g.
dGPUs blitting to GTT).

Marek


On Wed, May 2, 2018 at 3:11 PM, Marek Olšák  wrote:

> If you are shader-bound, you won't see any difference. In order to become
> RB-bound, you need more work in RBs than shaders. Blending or MSAA might
> help add that.
>
> Marek
>
> On Wed, May 2, 2018 at 5:31 AM, Michel Dänzer  wrote:
>
>> On 2018-05-02 06:00 AM, Marek Olšák wrote:
>> > From: Marek Olšák 
>> >
>> > This can result in 2x increase in performance on non-harvested Kaveris.
>>
>> FWIW, no difference with glxgears on my (non-harvested AFAIK) Kaveris,
>> neither with amdgpu nor radeon. Should I try something else?
>>
>> Tested-by: Michel Dänzer 
>>
>>
>> --
>> Earthling Michel Dänzer   |   http://www.amd.com
>> Libre software enthusiast | Mesa and X developer
>>
>
>

Re: [Mesa-dev] [PATCH 02/16] ac: enable both RBs on Kaveri

2018-05-02 Thread Marek Olšák

If you are shader-bound, you won't see any difference. In order to become
RB-bound, you need more work in RBs than shaders. Blending or MSAA might
help add that.

Marek

On Wed, May 2, 2018 at 5:31 AM, Michel Dänzer  wrote:

> On 2018-05-02 06:00 AM, Marek Olšák wrote:
> > From: Marek Olšák 
> >
> > This can result in 2x increase in performance on non-harvested Kaveris.
>
> FWIW, no difference with glxgears on my (non-harvested AFAIK) Kaveris,
> neither with amdgpu nor radeon. Should I try something else?
>
> Tested-by: Michel Dänzer 
>
>
> --
> Earthling Michel Dänzer   |   http://www.amd.com
> Libre software enthusiast | Mesa and X developer
>

Re: [Mesa-dev] [PATCH 1/9] util: Add a virtual memory allocator

2018-05-02 Thread Marek Olšák

Hi Jason,

The radeon kernel driver only allows 4GB or 8GB of virtual address space,
so we are talking about 33 usable bits. The radeon winsys also has to
support 32-bit allocations where the high bits are 0, and it uses a
separate allocator if the address space size is 8GB, or not if it's just
4GB.

If you want to duplicate this, that's totally OK with me. I'd much rather
have the radeon winsys code duplicated than having it broken randomly by
changes to common code. Right now I know that if nobody is going to touch
it in the winsys, it's not going to break.

Marek

On Wed, May 2, 2018 at 2:36 PM, Jason Ekstrand  wrote:

> Marek & Nicolai,
>
> FYI, I am NOT trying to NIH anything here.  My hope was that eventually,
> the radeon winsys code and the Intel drivers should share an allocator.  I
> considered starting by pulling the one out of the radeon winsys code and
> modifying it but there were a couple of issues (mentioned below) and I
> didn't want to start stuff off by changing the behavior of your driver.
> :-)  Please review and give your opinions as to whether or not the version
> here would work for radeon's use-cases.
>
> --Jason
>
> On Wed, May 2, 2018 at 9:01 AM, Scott D Phillips <
> scott.d.phill...@intel.com> wrote:
>
>> From: Jason Ekstrand 
>>
>> This is a simple linear-walk first-fit allocator roughly based on the
>> allocator in the radeon winsys code.  This allocator has two primary
>> functional differences:
>>
>>  1) It cleanly returns 0 on allocation failure
>>
>>  2) It allocates addresses top-down instead of bottom-up.
>>
>> The second one is needed for Intel because high addresses (with bit 47
>> set) need to be canonicalized in order to work properly.  If we allocate
>> bottom-up, then high addresses will be very rare (if they ever happen).
>> We'd rather always have high addresses so that the canonicalization code
>> gets better testing.
>> ---
>>  src/util/Makefile.sources |   4 +-
>>  src/util/meson.build  |   2 +
>>  src/util/vma.c| 231 ++
>> 
>>  src/util/vma.h|  53 +++
>>  4 files changed, 289 insertions(+), 1 deletion(-)
>>  create mode 100644 src/util/vma.c
>>  create mode 100644 src/util/vma.h
>>
>> diff --git a/src/util/Makefile.sources b/src/util/Makefile.sources
>> index 104ecae8ed3..534520ce763 100644
>> --- a/src/util/Makefile.sources
>> +++ b/src/util/Makefile.sources
>> @@ -56,7 +56,9 @@ MESA_UTIL_FILES := \
>> u_string.h \
>> u_thread.h \
>> u_vector.c \
>> -   u_vector.h
>> +   u_vector.h \
>> +   vma.c \
>> +   vma.h
>>
>>  MESA_UTIL_GENERATED_FILES = \
>> format_srgb.c
>> diff --git a/src/util/meson.build b/src/util/meson.build
>> index eece1cefef6..14660e0fa0c 100644
>> --- a/src/util/meson.build
>> +++ b/src/util/meson.build
>> @@ -81,6 +81,8 @@ files_mesa_util = files(
>>'u_thread.h',
>>'u_vector.c',
>>'u_vector.h',
>> +  'vma.c',
>> +  'vma.h',
>>  )
>>
>>  install_data('drirc', install_dir : get_option('sysconfdir'))
>> diff --git a/src/util/vma.c b/src/util/vma.c
>> new file mode 100644
>> index 000..0d4e097e21f
>> --- /dev/null
>> +++ b/src/util/vma.c
>> @@ -0,0 +1,231 @@
>> +/*
>> + * Copyright © 2018 Intel Corporation
>> + *
>> + * Permission is hereby granted, free of charge, to any person obtaining
>> a
>> + * copy of this software and associated documentation files (the
>> "Software"),
>> + * to deal in the Software without restriction, including without
>> limitation
>> + * the rights to use, copy, modify, merge, publish, distribute,
>> sublicense,
>> + * and/or sell copies of the Software, and to permit persons to whom the
>> + * Software is furnished to do so, subject to the following conditions:
>> + *
>> + * The above copyright notice and this permission notice (including the
>> next
>> + * paragraph) shall be included in all copies or substantial portions of
>> the
>> + * Software.
>> + *
>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
>> EXPRESS OR
>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
>> MERCHANTABILITY,
>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT
>> SHALL
>> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
>> OTHER
>> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
>> ARISING
>> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
>> DEALINGS
>> + * IN THE SOFTWARE.
>> + */
>> +
>> +#include 
>> +
>> +#include "util/u_math.h"
>> +#include "util/vma.h"
>> +
>> +struct util_vma_hole {
>> +   struct list_head link;
>> +   uint64_t offset;
>> +   uint64_t size;
>> +};
>> +
>> +#define util_vma_foreach_hole(_hole, _heap) \
>> +   list_for_each_entry(struct util_vma_hole, _hole, &(_heap)->holes,
>> link)
>> +
>> +#define util_vma_foreach_hole_safe(_hole, _heap) \
>> +   list_for_each_entry_safe(struct util_vma_hole,

Re: [Mesa-dev] [PATCH v3] intel/compiler: fix 16-bit comparisons

2018-05-02 Thread Jason Ekstrand

On Wed, May 2, 2018 at 3:48 AM, Iago Toral Quiroga 
wrote:

> NIR assumes that booleans are always 32-bit, but Intel hardware produces
> 16-bit booleans for 16-bit comparisons. This means that we need to convert
> the 16-bit result to 32-bit.
>
> In the future we want to add an optimization pass to clean this up and
> hopefully remove the conversions.
>
> v2 (Jason): use the type of the source for the temporary and use
> brw_reg_type_from_bit_size for the conversion to 32-bit.
> ---
>  src/intel/compiler/brw_fs_nir.cpp | 42 ++
> +
>  1 file changed, 34 insertions(+), 8 deletions(-)
>
> diff --git a/src/intel/compiler/brw_fs_nir.cpp
> b/src/intel/compiler/brw_fs_nir.cpp
> index b9d8ade4cf..f763dfa4f2 100644
> --- a/src/intel/compiler/brw_fs_nir.cpp
> +++ b/src/intel/compiler/brw_fs_nir.cpp
> @@ -1017,9 +1017,11 @@ fs_visitor::nir_emit_alu(const fs_builder &bld,
> nir_alu_instr *instr)
> case nir_op_feq:
> case nir_op_fne: {
>fs_reg dest = result;
> -  if (nir_src_bit_size(instr->src[0].src) > 32) {
> - dest = bld.vgrf(BRW_REGISTER_TYPE_DF, 1);
> -  }
> +
> +  const uint32_t bit_size =  nir_src_bit_size(instr->src[0].src);
> +  if (bit_size != 32)
> + dest = bld.vgrf(op[0].type, 1);
> +
>brw_conditional_mod cond;
>switch (instr->op) {
>case nir_op_flt:
> @@ -1037,9 +1039,21 @@ fs_visitor::nir_emit_alu(const fs_builder &bld,
> nir_alu_instr *instr)
>default:
>   unreachable("bad opcode");
>}
> +
>bld.CMP(dest, op[0], op[1], cond);
> -  if (nir_src_bit_size(instr->src[0].src) > 32) {
> +
> +  if (bit_size > 32) {
>   bld.MOV(result, subscript(dest, BRW_REGISTER_TYPE_UD, 0));
> +  } else if(bit_size < 32) {
> + /* When we convert the result to 32-bit we need to be careful
> and do
> +  * it as a signed conversion to get sign extension (for 32-bit
> true)
> +  */
> + const brw_reg_type dst_type =
> +brw_reg_type_from_bit_size(32, BRW_REGISTER_TYPE_D);
>

I think you can just use BRW_REGISTER_TYPE_D here.  But whatever, this is
ok too.

Reviewed-by: Jason Ekstrand 
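
As a stand-alone illustration of why the conversion in the quoted hunk has to be signed, here is a minimal C sketch. It assumes a 16-bit "true" is all ones (0xffff) and a 32-bit "true" is 0xffffffff; the helper names are hypothetical and not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Signed widening: going through int16_t replicates the sign bit, so a
 * 16-bit all-ones "true" becomes a 32-bit all-ones "true".  Converting an
 * out-of-range value to int16_t is implementation-defined in C; the
 * common compilers wrap modulo 2^16, which is what this sketch assumes. */
static uint32_t widen_bool16_signed(uint16_t b16)
{
   assert(b16 == 0 || b16 == 0xffff);
   return (uint32_t)(int32_t)(int16_t)b16;
}

/* Unsigned widening: zero-extends, so "true" loses its upper 16 bits. */
static uint32_t widen_bool16_unsigned(uint16_t b16)
{
   assert(b16 == 0 || b16 == 0xffff);
   return (uint32_t)b16;
}
```

With these helpers, only the signed variant maps 0xffff to 0xffffffff; the unsigned one yields 0x0000ffff, which would not compare equal to a 32-bit "true".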


> + const brw_reg_type src_type =
> +brw_reg_type_from_bit_size(bit_size, BRW_REGISTER_TYPE_D);
> +
> + bld.MOV(retype(result, dst_type), retype(dest, src_type));
>}
>break;
> }
> @@ -1051,9 +1065,10 @@ fs_visitor::nir_emit_alu(const fs_builder &bld,
> nir_alu_instr *instr)
> case nir_op_ieq:
> case nir_op_ine: {
>fs_reg dest = result;
> -  if (nir_src_bit_size(instr->src[0].src) > 32) {
> - dest = bld.vgrf(BRW_REGISTER_TYPE_UQ, 1);
> -  }
> +
> +  const uint32_t bit_size = nir_src_bit_size(instr->src[0].src);
> +  if (bit_size != 32)
> + dest = bld.vgrf(op[0].type, 1);
>
>brw_conditional_mod cond;
>switch (instr->op) {
> @@ -1075,8 +1090,19 @@ fs_visitor::nir_emit_alu(const fs_builder &bld,
> nir_alu_instr *instr)
>   unreachable("bad opcode");
>}
>bld.CMP(dest, op[0], op[1], cond);
> -  if (nir_src_bit_size(instr->src[0].src) > 32) {
> +
> +  if (bit_size > 32) {
>   bld.MOV(result, subscript(dest, BRW_REGISTER_TYPE_UD, 0));
> +  } else if (bit_size < 32) {
> + /* When we convert the result to 32-bit we need to be careful
> and do
> +  * it as a signed conversion to get sign extension (for 32-bit
> true)
> +  */
> + const brw_reg_type dst_type =
> +brw_reg_type_from_bit_size(32, BRW_REGISTER_TYPE_D);
> + const brw_reg_type src_type =
> +brw_reg_type_from_bit_size(bit_size, BRW_REGISTER_TYPE_D);
> +
> + bld.MOV(retype(result, dst_type), retype(dest, src_type));
>}
>break;
> }
> --
> 2.14.1
>
>

Re: [Mesa-dev] [PATCH 1/9] util: Add a virtual memory allocator

2018-05-02 Thread Jason Ekstrand

Marek & Nicolai,

FYI, I am NOT trying to NIH anything here.  My hope was that eventually,
the radeon winsys code and the Intel drivers should share an allocator.  I
considered starting by pulling the one out of the radeon winsys code and
modifying it but there were a couple of issues (mentioned below) and I
didn't want to start stuff off by changing the behavior of your driver.
:-)  Please review and give your opinions as to whether or not the version
here would work for radeon's use-cases.

--Jason

On Wed, May 2, 2018 at 9:01 AM, Scott D Phillips  wrote:

> From: Jason Ekstrand 
>
> This is a simple linear-walk first-fit allocator roughly based on the
> allocator in the radeon winsys code.  This allocator has two primary
> functional differences:
>
>  1) It cleanly returns 0 on allocation failure
>
>  2) It allocates addresses top-down instead of bottom-up.
>
> The second one is needed for Intel because high addresses (with bit 47
> set) need to be canonicalized in order to work properly.  If we allocate
> bottom-up, then high addresses will be very rare (if they ever happen).
> We'd rather always have high addresses so that the canonicalization code
> gets better testing.
> ---
>  src/util/Makefile.sources |   4 +-
>  src/util/meson.build  |   2 +
>  src/util/vma.c| 231 ++
> 
>  src/util/vma.h|  53 +++
>  4 files changed, 289 insertions(+), 1 deletion(-)
>  create mode 100644 src/util/vma.c
>  create mode 100644 src/util/vma.h
>
> diff --git a/src/util/Makefile.sources b/src/util/Makefile.sources
> index 104ecae8ed3..534520ce763 100644
> --- a/src/util/Makefile.sources
> +++ b/src/util/Makefile.sources
> @@ -56,7 +56,9 @@ MESA_UTIL_FILES := \
> u_string.h \
> u_thread.h \
> u_vector.c \
> -   u_vector.h
> +   u_vector.h \
> +   vma.c \
> +   vma.h
>
>  MESA_UTIL_GENERATED_FILES = \
> format_srgb.c
> diff --git a/src/util/meson.build b/src/util/meson.build
> index eece1cefef6..14660e0fa0c 100644
> --- a/src/util/meson.build
> +++ b/src/util/meson.build
> @@ -81,6 +81,8 @@ files_mesa_util = files(
>'u_thread.h',
>'u_vector.c',
>'u_vector.h',
> +  'vma.c',
> +  'vma.h',
>  )
>
>  install_data('drirc', install_dir : get_option('sysconfdir'))
> diff --git a/src/util/vma.c b/src/util/vma.c
> new file mode 100644
> index 000..0d4e097e21f
> --- /dev/null
> +++ b/src/util/vma.c
> @@ -0,0 +1,231 @@
> +/*
> + * Copyright © 2018 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the
> "Software"),
> + * to deal in the Software without restriction, including without
> limitation
> + * the rights to use, copy, modify, merge, publish, distribute,
> sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the
> next
> + * paragraph) shall be included in all copies or substantial portions of
> the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
> EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT
> SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
> OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> DEALINGS
> + * IN THE SOFTWARE.
> + */
> +
> +#include 
> +
> +#include "util/u_math.h"
> +#include "util/vma.h"
> +
> +struct util_vma_hole {
> +   struct list_head link;
> +   uint64_t offset;
> +   uint64_t size;
> +};
> +
> +#define util_vma_foreach_hole(_hole, _heap) \
> +   list_for_each_entry(struct util_vma_hole, _hole, &(_heap)->holes, link)
> +
> +#define util_vma_foreach_hole_safe(_hole, _heap) \
> +   list_for_each_entry_safe(struct util_vma_hole, _hole,
> &(_heap)->holes, link)
> +
> +void
> +util_vma_heap_init(struct util_vma_heap *heap,
> +   uint64_t start, uint64_t size)
> +{
> +   list_inithead(&heap->holes);
> +   util_vma_heap_free(heap, start, size);
> +}
> +
> +void
> +util_vma_heap_finish(struct util_vma_heap *heap)
> +{
> +   util_vma_foreach_hole_safe(hole, heap)
> +  free(hole);
> +}
> +
> +static void
> +util_vma_heap_validate(struct util_vma_heap *heap)
> +{
> +   uint64_t prev_offset = 0;
> +   util_vma_foreach_hole(hole, heap) {
> +  assert(hole->offset > 0);
> +  assert(hole->size > 0);
> +
> +  if (&hole->link == heap->holes.next) {
> + /* This must be the top-most hole.  Assert that, if it
> overflows, it
> +  * overflows to 0, i.e. 2^64.
> +  */
> +
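
The allocator described in the commit message above (linear-walk first fit, top-down, returning 0 on failure) can be sketched in isolation as follows. This is a simplified illustration, not the code from the patch: the struct and function names are hypothetical, freeing ranges back to the heap is left out, and canonical_address() assumes that canonicalizing means copying bit 47 into bits 48..63.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the patch's hole list: free ranges, kept
 * sorted from the highest address to the lowest. */
struct hole {
   struct hole *next;
   uint64_t offset;
   uint64_t size;
};

struct heap {
   struct hole *holes;
};

static void heap_init(struct heap *heap, uint64_t start, uint64_t size)
{
   /* Address 0 is reserved as the failure value; this sketch also
    * assumes the range does not wrap around 2^64. */
   assert(start > 0 && size > 0 && size <= UINT64_MAX - start);
   heap->holes = malloc(sizeof(*heap->holes));
   assert(heap->holes != NULL);
   heap->holes->next = NULL;
   heap->holes->offset = start;
   heap->holes->size = size;
}

static void heap_finish(struct heap *heap)
{
   while (heap->holes) {
      struct hole *next = heap->holes->next;
      free(heap->holes);
      heap->holes = next;
   }
}

/* First fit, walking holes from the top of the address space down.
 * Returns the allocated address, or 0 on failure. */
static uint64_t heap_alloc(struct heap *heap, uint64_t size, uint64_t align)
{
   assert(size > 0 && align > 0);
   for (struct hole **link = &heap->holes; *link; link = &(*link)->next) {
      struct hole *h = *link;
      if (size > h->size)
         continue;

      /* Highest address that fits, rounded down to the alignment. */
      uint64_t addr = h->offset + h->size - size;
      addr -= addr % align;
      if (addr < h->offset)
         continue;

      uint64_t above = h->offset + h->size - (addr + size);
      uint64_t below = addr - h->offset;
      if (above == 0 && below == 0) {
         *link = h->next;            /* hole fully consumed */
         free(h);
      } else if (below == 0) {
         h->offset = addr + size;    /* keep the part above */
         h->size = above;
      } else if (above == 0) {
         h->size = below;            /* keep the part below */
      } else {
         /* Split: h keeps the upper part, a new hole takes the lower. */
         struct hole *low = malloc(sizeof(*low));
         if (low == NULL)
            return 0;
         low->offset = h->offset;
         low->size = below;
         low->next = h->next;
         h->offset = addr + size;
         h->size = above;
         h->next = low;
      }
      return addr;
   }
   return 0;
}

/* Assumed canonical form: copy bit 47 into bits 48..63. */
static uint64_t canonical_address(uint64_t addr)
{
   const uint64_t high_bits = 0xffff000000000000ull;
   return (addr & (1ull << 47)) ? (addr | high_bits) : (addr & ~high_bits);
}
```

Because allocation starts from the top of the range, a heap that spans bit 47 hands out high addresses first, so a helper like canonical_address() would be exercised from the very first allocation.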

Re: [Mesa-dev] [PATCH] spirv: Apply OriginUpperLeft to FragCoord

2018-05-02 Thread Anuj Phogat

On Wed, May 2, 2018 at 10:49 AM, Neil Roberts  wrote:
> This behaviour was changed in 1e5b09f42f694687ac. The commit message
> for that says it is just a “tidy up” so my assumption is that the
> behaviour change was a mistake. It’s a little hard to decipher looking
> at the diff, but the previous code before that patch was:
>
>   if (builtin == SpvBuiltInFragCoord || builtin == SpvBuiltInSamplePosition)
>  nir_var->data.origin_upper_left = b->origin_upper_left;
>
>   if (builtin == SpvBuiltInFragCoord)
>  nir_var->data.pixel_center_integer = b->pixel_center_integer;
>
> After the patch the code was:
>
>   case SpvBuiltInSamplePosition:
>  nir_var->data.origin_upper_left = b->origin_upper_left;
>  /* fallthrough */
>   case SpvBuiltInFragCoord:
>  nir_var->data.pixel_center_integer = b->pixel_center_integer;
>  break;
>
> Before the patch origin_upper_left affected both builtins and
> pixel_center_integer only affected FragCoord. After the patch
> origin_upper_left only affects SamplePosition and pixel_center_integer
> affects both variables.
>
> This patch tries to restore the previous behaviour by changing the
> code to:
>
>   case SpvBuiltInFragCoord:
>  nir_var->data.pixel_center_integer = b->pixel_center_integer;
>  /* fallthrough */
>   case SpvBuiltInSamplePosition:
>  nir_var->data.origin_upper_left = b->origin_upper_left;
>  break;
>
> This change will be important for ARB_gl_spirv which is meant to
> support OriginLowerLeft.
> ---
>  src/compiler/spirv/vtn_variables.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/src/compiler/spirv/vtn_variables.c 
> b/src/compiler/spirv/vtn_variables.c
> index 9679ff6526c..fd8ab7f247a 100644
> --- a/src/compiler/spirv/vtn_variables.c
> +++ b/src/compiler/spirv/vtn_variables.c
> @@ -1419,11 +1419,11 @@ apply_var_decoration(struct vtn_builder *b, 
> nir_variable *nir_var,
>case SpvBuiltInTessLevelInner:
>   nir_var->data.compact = true;
>   break;
> -  case SpvBuiltInSamplePosition:
> - nir_var->data.origin_upper_left = b->origin_upper_left;
> - /* fallthrough */
>case SpvBuiltInFragCoord:
>   nir_var->data.pixel_center_integer = b->pixel_center_integer;
> + /* fallthrough */
> +  case SpvBuiltInSamplePosition:
> + nir_var->data.origin_upper_left = b->origin_upper_left;
>   break;
>default:
>   break;
> --
> 2.14.3
>

Reviewed-by: Anuj Phogat 
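
The effect of the case ordering discussed in the quoted patch can be checked with a small stand-alone sketch. The enum, struct and function names below are hypothetical stand-ins for the SPIR-V builtins and the nir_variable fields; only the switch layout mirrors the two versions quoted above.

```c
#include <assert.h>
#include <stdbool.h>

enum builtin { BUILTIN_FRAG_COORD, BUILTIN_SAMPLE_POSITION, BUILTIN_OTHER };

struct var_data {
   bool origin_upper_left;
   bool pixel_center_integer;
};

/* Layout after the "tidy up" commit: SamplePosition falls through into
 * FragCoord, so FragCoord never receives origin_upper_left. */
static struct var_data apply_regressed(enum builtin builtin, bool oul, bool pci)
{
   struct var_data d = { false, false };
   assert(builtin <= BUILTIN_OTHER);
   switch (builtin) {
   case BUILTIN_SAMPLE_POSITION:
      d.origin_upper_left = oul;
      /* fallthrough */
   case BUILTIN_FRAG_COORD:
      d.pixel_center_integer = pci;
      break;
   default:
      break;
   }
   return d;
}

/* Layout proposed by the patch: FragCoord falls through into
 * SamplePosition, so both receive origin_upper_left. */
static struct var_data apply_fixed(enum builtin builtin, bool oul, bool pci)
{
   struct var_data d = { false, false };
   assert(builtin <= BUILTIN_OTHER);
   switch (builtin) {
   case BUILTIN_FRAG_COORD:
      d.pixel_center_integer = pci;
      /* fallthrough */
   case BUILTIN_SAMPLE_POSITION:
      d.origin_upper_left = oul;
      break;
   default:
      break;
   }
   return d;
}
```

Calling both functions with oul and pci set to true shows the difference: the regressed layout leaves origin_upper_left unset for FragCoord and sets pixel_center_integer for SamplePosition, while the fixed layout matches the original pair of if statements.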

[Mesa-dev] [Bug 106337] eglWaitClient() doesn't work as documented using DRI2 backend

2018-05-02 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=106337

--- Comment #7 from Tapani Pälli  ---
(In reply to Mike Gorchak from comment #6)
> I was able to test your changes and had to add the following change to
> intel_screen.c:
> 
> @@ -171,7 +176,7 @@
>  }
> 
>  static const struct __DRI2flushExtensionRec intelFlushExtension = {
> -.base = { __DRI2_FLUSH, 4 },
> +.base = { __DRI2_FLUSH, 5 },

Sorry, I forgot to bump the version.

>  .flush  = intel_dri2_flush,
>  .invalidate = dri2InvalidateDrawable,
> 
> Now I can confirm that it flushes all data to drawable surface and waits for
> it properly. Speed has decreased dramatically, only a bit better than
> glFinish(). I think we cannot do too much with it.

OK, yeah, depending on the use case you could perhaps do multibuffering.

> Another "issue", which I'm not sure if it is issue or expected behavior,
> related to this topic: when FBO is used together with surfaceless contexts. 
> 
> eglWaitClient() bails out with error if surfaceless contexts are in use to
> draw to FBO. Is this expected behavior?
> 
> Specification says: "All rendering calls for the currently bound context,
> for the current rendering API, made prior to eglWaitClient are guaranteed to
> be executed before native rendering calls made after eglWaitClient." and it
> doesn't mention "surfaces", only "contexts".

Right, this feels like another bug.


Re: [Mesa-dev] [PATCH] spirv: Apply OriginUpperLeft to FragCoord

2018-05-02 Thread Jason Ekstrand

Looks good to me.  Thanks for catching this!

Reviewed-by: Jason Ekstrand 
Fixes: 1e5b09f42f694687ac "spirv: Tidy some repeated if checks..."

On Wed, May 2, 2018 at 10:49 AM, Neil Roberts  wrote:

> This behaviour was changed in 1e5b09f42f694687ac. The commit message
> for that says it is just a “tidy up” so my assumption is that the
> behaviour change was a mistake. It’s a little hard to decipher looking
> at the diff, but the previous code before that patch was:
>
>   if (builtin == SpvBuiltInFragCoord || builtin ==
> SpvBuiltInSamplePosition)
>  nir_var->data.origin_upper_left = b->origin_upper_left;
>
>   if (builtin == SpvBuiltInFragCoord)
>  nir_var->data.pixel_center_integer = b->pixel_center_integer;
>
> After the patch the code was:
>
>   case SpvBuiltInSamplePosition:
>  nir_var->data.origin_upper_left = b->origin_upper_left;
>  /* fallthrough */
>   case SpvBuiltInFragCoord:
>  nir_var->data.pixel_center_integer = b->pixel_center_integer;
>  break;
>
> Before the patch origin_upper_left affected both builtins and
> pixel_center_integer only affected FragCoord. After the patch
> origin_upper_left only affects SamplePosition and pixel_center_integer
> affects both variables.
>
> This patch tries to restore the previous behaviour by changing the
> code to:
>
>   case SpvBuiltInFragCoord:
>  nir_var->data.pixel_center_integer = b->pixel_center_integer;
>  /* fallthrough */
>   case SpvBuiltInSamplePosition:
>  nir_var->data.origin_upper_left = b->origin_upper_left;
>  break;
>
> This change will be important for ARB_gl_spirv which is meant to
> support OriginLowerLeft.
> ---
>  src/compiler/spirv/vtn_variables.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/src/compiler/spirv/vtn_variables.c b/src/compiler/spirv/vtn_
> variables.c
> index 9679ff6526c..fd8ab7f247a 100644
> --- a/src/compiler/spirv/vtn_variables.c
> +++ b/src/compiler/spirv/vtn_variables.c
> @@ -1419,11 +1419,11 @@ apply_var_decoration(struct vtn_builder *b,
> nir_variable *nir_var,
>case SpvBuiltInTessLevelInner:
>   nir_var->data.compact = true;
>   break;
> -  case SpvBuiltInSamplePosition:
> - nir_var->data.origin_upper_left = b->origin_upper_left;
> - /* fallthrough */
>case SpvBuiltInFragCoord:
>   nir_var->data.pixel_center_integer = b->pixel_center_integer;
> + /* fallthrough */
> +  case SpvBuiltInSamplePosition:
> + nir_var->data.origin_upper_left = b->origin_upper_left;
>   break;
>default:
>   break;
> --
> 2.14.3
>
>

Re: [Mesa-dev] [PATCH 0/4] improve buffer cache and reuse

2018-05-02 Thread James Xiong

On Wed, 2 May 2018 14:18:21 +0300
Eero Tamminen  wrote:

> Hi,
> 
> On 02.05.2018 02:25, James Xiong wrote:
> > From: "Xiong, James" 
> > 
> > With the current implementation, brw_bufmgr may round up a request
> > size to the next bucket size, resulting in 25% more memory allocated
> > in the worst-case scenario. For example:
> >   Request size   Actual size
> >   32KB+1Byte     40KB
> >   .
> >   8MB+1Byte      10MB
> >   .
> >   96MB+1Byte     112MB
> > This series aligns the buffer size up to a page instead of a bucket
> > size to improve memory allocation efficiency. Performance is
> > almost the same with Basemark ES3, GfxBench4 and 5:
> > 
> > Basemark ES3
> >              score                     peak memory allocation
> >   before     after      diff    before     after      diff
> >   21.537462  21.888784  1.61%   419766272  408809472  -10956800
> >   19.566198  19.763429  1.00%   
> 
> What memory are you measuring:
> 
> * VmSize (not that relevant unless you're running out of address
> space)?
> 
> * PrivateDirty (listed in /proc/PID/smaps and e.g. by "smem" tool
> [1])?
> 
> * total of allocation sizes used by Mesa?
> 
> Or something else?
> 
> In general, unused memory isn't much of a problem, only dirty
> (written) memory.  Kernel maps all unused memory to a single zero
> page, so unused memory takes only a few bytes of RAM for the page table
> entries (required for tracking the allocation pages).
I did the measurements in brw_bufmgr from user space. I kept track
of the allocated size for each brw_bufmgr context and printed out the
peak allocated size when the test completed and the context was destroyed.
Basically, I increased/decreased the size when I915_GEM_CREATE or
GEM_CLOSE was called, so cached, imported, or user_ptr buffers were
excluded.

The brw_bufmgr context is created when the test starts and destroyed
after it completes; the size is for the test case, in bytes. This method
measures the exact size allocated for a given test case, so the result
is precise.
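
A minimal sketch of the kind of counter described above, assuming one is kept per context; the struct and function names are hypothetical and not taken from brw_bufmgr:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-context allocation statistics, in bytes. */
struct alloc_stats {
   uint64_t current;   /* bytes currently allocated */
   uint64_t peak;      /* highest value current has reached */
};

/* Called where a buffer is created through the kernel. */
static void stats_on_create(struct alloc_stats *stats, uint64_t size)
{
   stats->current += size;
   if (stats->current > stats->peak)
      stats->peak = stats->current;
}

/* Called where a buffer is closed. */
static void stats_on_close(struct alloc_stats *stats, uint64_t size)
{
   assert(size <= stats->current);
   stats->current -= size;
}
```

Printing stats.peak when the context is destroyed would give the kind of peak figure reported in the tables.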
> 
> 
> > GfxBench 4.0
> >                                score                            peak memory
> >                    before           after            diff    before     after      diff
> > gl_4               564.6052246094   565.2348632813   0.11%   578490368  550199296  -28291072
> > gl_4_off           727.0440063477   703.5833129883   -3.33%  629501952  598216704  -31285248
> > gl_manhattan       1053.4223632813  1057.3690185547  0.37%   449568768  421134336  -28434432
> > gl_trex            2708.0656738281  2699.2646484375  -0.33%  130076672  125042688  -5033984
> > gl_alu2            1207.1490478516  1212.2220458984  0.42%   55496704   55029760   -466944
> > gl_driver2         103.0383071899   103.5478439331   0.49%   13107200   12980224   -126976
> > gl_manhattan_off   1703.4780273438  1736.9074707031  1.92%   490016768  456548352  -33468416
> > gl_trex_off        2951.6809082031  3058.5422363281  3.49%   157511680  152260608  -5251072
> > gl_alu2_off        2604.0903320313  2626.2524414063  0.84%   86130688   85483520   -647168
> > gl_driver2_off     204.0173187256   207.0510101318   1.47%   40869888   40615936   -253952
> 
> You're missing information on:
> * On which platform you did the testing (affects variance)
> * how many test rounds you ran, and
> * what is your variance
I ran these tests on a gen9 platform/ubuntu 17.10 LTS. Most of the tests
are consistent, especially the memory usage. The only exception is
GfxBench 4.0 gl_manhattan, which I had to run 3 times, picking the
highest result. I will apply this method to all tests and re-send with
updated results.
> 
> -> I don't know whether your numbers are just random noise.  
> 
> 
> Memory is allocated in pages from kernel, so there's no point in
> showing its usage as bytes.  Please use KBs, that's more readable.
> 
> (Because of randomness e.g. interactions with the windowing system, 
> there can be some variance also in process memory usage, which may
> also be useful to report.)
> 
> Because of variance, you don't need that many decimals for the scores.
> Removing the extra ones makes that data a bit more readable too.
> 
> 
>   - Eero
> 
> [1] "smem" is python based tool available at least in Debian.
> If you want something simpler, e.g. shell script working with
> minimal shells like Busybox, you can use this:
> https://github.com/maemo-tools-old/sp-memusage/blob/master/scripts/mem-smaps-private
> 
> 
> > GfxBench 5.0
> >                score             peak memory
> >            before  after   before      after       diff
> > gl_5       259     259     1137549312  1038286848  -99262464
> > gl_5_off   297     297     1170853888  1071357952  -99495936
> > 
> > Xiong, James (4):
> >i965/drm: Reorganize code for the next patch
> >i965/drm: Round down buffer size and calculate the bucket index
> >i965/drm: Searching for a cached buffer for reuse
> >i965/drm: Purge the bucket when its cached buffer is evicted
> > 
> >
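
The request sizes in the cover letter's first table are consistent with buckets placed at 1, 1.25, 1.5 and 1.75 times each power of two. The sketch below assumes that layout and a 4 KiB page purely for illustration; the names are hypothetical and this is not the brw_bufmgr code.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u   /* assumed page size */

static uint64_t align_up(uint64_t value, uint64_t alignment)
{
   assert(alignment > 0);
   return (value + alignment - 1) / alignment * alignment;
}

/* What the series proposes: round the request up to a whole page. */
static uint64_t page_round_up(uint64_t size)
{
   return align_up(size, SKETCH_PAGE_SIZE);
}

/* Assumed bucket layout: page multiples up to 4 pages, then four buckets
 * per power of two (1x, 1.25x, 1.5x and 1.75x). */
static uint64_t bucket_round_up(uint64_t size)
{
   assert(size > 0 && size <= (1ull << 40));
   if (size <= 4 * SKETCH_PAGE_SIZE)
      return page_round_up(size);

   /* Largest power of two strictly below the request. */
   uint64_t pot = 4 * SKETCH_PAGE_SIZE;
   while (pot * 2 < size)
      pot *= 2;

   /* Buckets between pot and 2 * pot are spaced pot / 4 apart. */
   return pot + align_up(size - pot, pot / 4);
}
```

Under these assumptions a request of 32KB+1 byte lands in the 40KB bucket, whereas page rounding would allocate 36KB.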

Re: [Mesa-dev] [PATCH v2 06/18] intel/compiler: fix brw_imm_w for negative 16-bit integers

2018-05-02 Thread Jason Ekstrand

On Wed, May 2, 2018 at 8:19 AM, Chema Casanova 
wrote:

>
>
> El 01/05/18 a las 01:22, Jason Ekstrand escribió:
> > On Mon, Apr 30, 2018 at 3:53 PM, Chema Casanova  > > wrote:
> >
> >
> >
> > On 30/04/18 23:12, Jason Ekstrand wrote:
> > > On Mon, Apr 30, 2018 at 7:18 AM, Iago Toral Quiroga <
> ito...@igalia.com 
> > > >> wrote:
> > >
> > > From: Jose Maria Casanova Crespo  
> > > >>
> > >
> > > 16-bit immediates need to replicate the 16-bit immediate value
> > > in both words of the 32-bit value. This needs to be careful
> > > to avoid sign-extension, which the previous implementation was
> > > not handling properly.
> > >
> > > For example, with the previous implementation, storing the
> value
> > > -3 would generate imm.d = 0xfffd due to signed integer sign
> > > extension, which is not correct. Instead, we should cast to
> > > unsigned, which gives us the correct result: imm.ud =
> 0xfffdfffd.
> > >
> > > We only had a couple of cases hitting this path in the driver
> > > until now, one with value -1, which would work since all bits
> are
> > > one in this case, and another with value -2 in brw_clip_tri(),
> > > which would hit the aforementioned issue (this case only
> affects
> > > gen4 although we are not aware of whether this was causing an
> > > actual bug somewhere).
> > > ---
> > >  src/intel/compiler/brw_reg.h | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/src/intel/compiler/brw_reg.h
> > b/src/intel/compiler/brw_reg.h
> > > index dff9b970b2..0084a78af6 100644
> > > --- a/src/intel/compiler/brw_reg.h
> > > +++ b/src/intel/compiler/brw_reg.h
> > > @@ -705,7 +705,7 @@ static inline struct brw_reg
> > >  brw_imm_w(int16_t w)
> > >  {
> > > struct brw_reg imm = brw_imm_reg(BRW_REGISTER_TYPE_W);
> > > -   imm.d = w | (w << 16);
> > > +   imm.ud = (uint16_t)w | ((uint16_t)w << 16);
> >
> > > Uh... Is this cast right?  Doing a << 16 on a 16-bit data type
> should
> > > yield undefined results.  I think you want a (uint32_t) cast.
> >
> > In my test code it was working at least with GCC, I think it is
> because
> > at the end we have an integer promotion for any type with lower rank
> > than int.
> >
> > "Formally, the rule says (C11 6.3.1.1):
> >
> > If an int can represent all values of the original type (as
> > restricted by the width, for a bit-field), the value is converted to
> an
> > int; otherwise, it is converted to an unsigned int. These are called
> the
> > integer promotions."
> >
> > But I agree that is clearer if we just use (uint32_t).
> > I can change also the brw_imm_uw case that has the same issue.
> >
> >
> > Yeah, best to make it clear. :-)
>
> I was wrong, we can't just replace the (uint16_t) cast with (uint32_t)
> because the cast from signed short to uint32_t implies sign extension;
> it seems that sign extension is done whenever the source is signed,
> regardless of the destination type.
>
> So for example, with w = -2 (0xfffe):
>
> imm.ud = (uint32_t)w | (uint32_t)w << 16;
>
> becomes: 0xfffffffe
>
> So the alternatives I found that give the correct result are:
>
> imm.ud = (uint32_t) w & 0xffff | (uint32_t)w << 16;
>
> Or:
>
> uint16_t value = w;
> imm.ud = (uint32_t)value | (uint32_t)value << 16;
>
> Or something like:
>
> imm.ud = (uint32_t)(uint16_t)w | ((uint32_t)(uint16_t)w << 16);
>

I think I like this one, only you can drop the first (uint32_t), and I
don't think you need the extra parens on the right.

Honestly, I think I'd probably be ok with the original version too now that
I understand better what's going on.  Either way,

Reviewed-by: Jason Ekstrand 
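
For reference, the difference between the variants discussed above can be reproduced outside the driver with a few lines of C. The function names are hypothetical; the second one exists only to show the sign-extension pitfall.

```c
#include <assert.h>
#include <stdint.h>

/* Replicate a 16-bit value into both halves of a 32-bit word.  Going
 * through uint16_t first discards the sign, and the (uint32_t) cast
 * keeps the shift in unsigned arithmetic. */
static uint32_t replicate_w(int16_t w)
{
   uint16_t u = (uint16_t)w;
   uint32_t r = (uint32_t)u | ((uint32_t)u << 16);
   assert((r >> 16) == (r & 0xffff));
   return r;
}

/* Casting the signed value straight to uint32_t sign-extends it, so the
 * upper half is already all ones before the OR for negative inputs. */
static uint32_t replicate_w_sign_extended(int16_t w)
{
   return (uint32_t)w | ((uint32_t)w << 16);
}
```

For w = -3 the first function returns 0xfffdfffd, while the second returns 0xfffffffd. With a plain (uint16_t) cast the operand is promoted to int before the shift, and shifting a set bit into the sign position of an int is formally undefined in C, which is one more reason to prefer the explicit (uint32_t).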

Re: [Mesa-dev] [PATCH v2] spirv: convert some operands for bitwise shift and bitwise ops to uint32

2018-05-02 Thread Jason Ekstrand

On Wed, May 2, 2018 at 1:10 AM, Samuel Iglesias Gonsálvez <
sigles...@igalia.com> wrote:

> SPIR-V allows the shift, offset and count operands of shift and
> bitfield opcodes to have a bit size other than 32 bits, but the NIR
> opcodes require 32 bits. As agreed on the mailing list, this patch
> adds a conversion to 32 bits to fix this.
>
> For more info, see:
>
> https://lists.freedesktop.org/archives/mesa-dev/2018-April/193026.html
>
> v2:
> - src_bit_size will have zero value for variable bit-size operands (Jason).
>
> Signed-off-by: Samuel Iglesias Gonsálvez 
> ---
>  src/compiler/spirv/vtn_alu.c | 34 ++
>  1 file changed, 34 insertions(+)
>
> diff --git a/src/compiler/spirv/vtn_alu.c b/src/compiler/spirv/vtn_alu.c
> index 3134849ba90..4b21aa9b8ab 100644
> --- a/src/compiler/spirv/vtn_alu.c
> +++ b/src/compiler/spirv/vtn_alu.c
> @@ -635,6 +635,40 @@ vtn_handle_alu(struct vtn_builder *b, SpvOp opcode,
>break;
> }
>
> +   case SpvOpBitFieldInsert:
> +   case SpvOpBitFieldSExtract:
> +   case SpvOpBitFieldUExtract:
> +   case SpvOpShiftLeftLogical:
> +   case SpvOpShiftRightArithmetic:
> +   case SpvOpShiftRightLogical: {
> +  bool swap;
> +  unsigned src_bit_size = glsl_get_bit_size(vtn_src[0]->type);
>

Maybe call this src0_bit_size?  With that,

Reviewed-by: Jason Ekstrand 

Thanks for fixing this!
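
To make the intent of the conversion concrete, here is a small C analogue. It is only a sketch: u2u32_from_u16 and u2u32_from_u64 are hypothetical helpers that mimic an unsigned conversion to 32 bits for a 16-bit and a 64-bit operand, and they are not NIR functions.

```c
#include <assert.h>
#include <stdint.h>

/* Widening case: a 16-bit operand is zero-extended to 32 bits. */
static uint32_t u2u32_from_u16(uint16_t value)
{
   return (uint32_t)value;
}

/* Narrowing case: a 64-bit operand keeps only its low 32 bits. */
static uint32_t u2u32_from_u64(uint64_t value)
{
   return (uint32_t)value;
}

/* A 64-bit shift whose count arrives as a 64-bit operand: the count is
 * converted to 32 bits first, the value being shifted is left alone. */
static uint64_t shift_left_64(uint64_t value, uint64_t count)
{
   uint32_t count32 = u2u32_from_u64(count);
   assert(count32 < 64);
   return value << count32;
}
```

Only the shift, offset and count operands go through the conversion; the operand being shifted or extracted from keeps its original bit size.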


> +  unsigned dst_bit_size = glsl_get_bit_size(type);
> +  nir_op op = vtn_nir_alu_op_for_spirv_opcode(b, opcode, &swap,
> +  src_bit_size,
> dst_bit_size);
> +
> +  assert (op == nir_op_ushr || op == nir_op_ishr || op == nir_op_ishl
> ||
> +  op == nir_op_bitfield_insert || op ==
> nir_op_ubitfield_extract ||
> +  op == nir_op_ibitfield_extract);
> +
> +  for (unsigned i = 0; i < nir_op_infos[op].num_inputs; i++) {
> + src_bit_size = nir_alu_type_get_type_size(
> nir_op_infos[op].input_types[i]);
> + if (src_bit_size == 0)
> +continue;
> + if (src_bit_size != src[i]->bit_size) {
> +assert(src_bit_size == 32);
> +/* Convert the Shift, Offset and Count  operands to 32 bits,
> which is the bitsize
> + * supported by the NIR instructions. See discussion here:
> + *
> + * https://lists.freedesktop.org/
> archives/mesa-dev/2018-April/193026.html
> + */
> +src[i] = nir_u2u32(&b->nb, src[i]);
> + }
> +  }
> +  val->ssa->def = nir_build_alu(&b->nb, op, src[0], src[1], src[2],
> src[3]);
> +  break;
> +   }
> +
> default: {
>bool swap;
>unsigned src_bit_size = glsl_get_bit_size(vtn_src[0]->type);
> --
> 2.14.1
>
>

Re: [Mesa-dev] [PATCH] intel: fix aubinator include

2018-05-02 Thread Anuj Phogat

On Wed, May 2, 2018 at 9:52 AM, Lionel Landwerlin
 wrote:
> Signed-off-by: Lionel Landwerlin 
> Fixes: 7c22c150c40b3 ("intel: Move batch decoder/disassembler from tools/ to 
> common/")
> ---
>  src/intel/tools/aubinator.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/intel/tools/aubinator.c b/src/intel/tools/aubinator.c
> index ab053c66b36..bc263dbf846 100644
> --- a/src/intel/tools/aubinator.c
> +++ b/src/intel/tools/aubinator.c
> @@ -40,8 +40,8 @@
>  #include "util/macros.h"
>
>  #include "common/gen_decoder.h"
> +#include "common/gen_disasm.h"
>  #include "intel_aub.h"
> -#include "gen_disasm.h"
>
>  /* Below is the only command missing from intel_aub.h in libdrm
>   * So, reuse intel_aub.h from libdrm and #define the
> --
> 2.17.0
>

Reviewed-by: Anuj Phogat 

Re: [Mesa-dev] [PATCH] anv: Advertise variableMultisampleRate

2018-05-02 Thread Anuj Phogat

On Mon, Apr 30, 2018 at 3:10 PM, Jason Ekstrand  wrote:
>
> Initially, I didn't understand this feature.  Turns out that all it
> means is that you can switch multisample rates in the middle of a
> zero-attachment subpass.  We've been able to do this since forever.
> ---
>  src/intel/vulkan/anv_device.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
> index 202fe73..adcd506 100644
> --- a/src/intel/vulkan/anv_device.c
> +++ b/src/intel/vulkan/anv_device.c
> @@ -761,7 +761,7 @@ void anv_GetPhysicalDeviceFeatures(
>.shaderInt64  = pdevice->info.gen >= 8,
>.shaderInt16  = false,
>.shaderResourceMinLod = false,
> -  .variableMultisampleRate  = false,
> +  .variableMultisampleRate  = true,
>.inheritedQueries = true,
> };
>
> --
> 2.5.0.400.gff86faf
>

Reviewed-by: Anuj Phogat 

Re: [Mesa-dev] [PATCH 2/2] intel: aubinator: add an option to limit the number of decoded VBO lines

2018-05-02 Thread Kenneth Graunke

On Wednesday, May 2, 2018 10:42:05 AM PDT Lionel Landwerlin wrote:
> Signed-off-by: Lionel Landwerlin 
> ---
>  src/intel/tools/aubinator.c | 39 ++---
>  1 file changed, 23 insertions(+), 16 deletions(-)
> 
> diff --git a/src/intel/tools/aubinator.c b/src/intel/tools/aubinator.c
> index bc263dbf846..3120e82b22e 100644
> --- a/src/intel/tools/aubinator.c
> +++ b/src/intel/tools/aubinator.c
> @@ -58,6 +58,7 @@
>  
>  static bool option_full_decode = true;
>  static bool option_print_offsets = true;
> +static int max_vbo_lines = -1;
>  static enum { COLOR_AUTO, COLOR_ALWAYS, COLOR_NEVER } option_color;
>  
>  /* state */
> @@ -179,6 +180,7 @@ aubinator_init(uint16_t aub_pci_id, const char *app_name)
>  
> gen_batch_decode_ctx_init(_ctx, , outfile, batch_flags,
>   xml_path, get_gen_batch_bo, NULL, NULL);
> +   batch_ctx.max_vbo_decoded_lines = max_vbo_lines;
>  
> char *color = GREEN_HEADER, *reset_color = NORMAL;
> if (option_color == COLOR_NEVER)
> @@ -547,14 +549,15 @@ print_help(const char *progname, FILE *file)
> "Usage: %s [OPTION]... [FILE]\n"
> "Decode aub file contents from either FILE or the standard 
> input.\n\n"
> "A valid --gen option must be provided.\n\n"
> -   "  --help  display this help and exit\n"
> -   "  --gen=platform  decode for given platform (3 letter 
> platform name)\n"
> -   "  --headers   decode only command headers\n"
> -   "  --color[=WHEN]  colorize the output; WHEN can be 'auto' 
> (default\n"
> -   "if omitted), 'always', or 'never'\n"
> -   "  --no-pager  don't launch pager\n"
> -   "  --no-offsetsdon't print instruction offsets\n"
> -   "  --xml=DIR   load hardware xml description from 
> directory DIR\n",
> +   "  --help display this help and exit\n"
> +   "  --gen=platform decode for given platform (3 letter 
> platform name)\n"
> +   "  --headers  decode only command headers\n"
> +   "  --color[=WHEN] colorize the output; WHEN can be 'auto' 
> (default\n"
> +   " if omitted), 'always', or 'never'\n"
> +   "  --max-vbo-lines=N  limit the number of decoded VBO lines\n"
> +   "  --no-pager don't launch pager\n"
> +   "  --no-offsets   don't print instruction offsets\n"
> +   "  --xml=DIR  load hardware xml description from 
> directory DIR\n",
> progname);
>  }
>  
> @@ -564,14 +567,15 @@ int main(int argc, char *argv[])
> int c, i;
> bool help = false, pager = true;
> const struct option aubinator_opts[] = {
> -  { "help",   no_argument,   (int *) &help, true
> },
> -  { "no-pager",   no_argument,   (int *) &pager,
> false },
> -  { "no-offsets", no_argument,   (int *) &option_print_offsets,
> false },
> -  { "gen",required_argument, NULL,  'g'
> },
> -  { "headers",no_argument,   (int *) &option_full_decode,
> false },
> -  { "color",  required_argument, NULL,  'c' 
> },
> -  { "xml",required_argument, NULL,  'x' 
> },
> -  { NULL, 0, NULL,  0 }
> +  { "help",  no_argument,   (int *) , 
> true },
> +  { "no-pager",  no_argument,   (int *) ,
> false },
> +  { "no-offsets",no_argument,   (int *) _print_offsets, 
> false },
> +  { "gen",   required_argument, NULL,  
> 'g' },
> +  { "headers",   no_argument,   (int *) _full_decode,   
> false },
> +  { "color", required_argument, NULL,  
> 'c' },
> +  { "xml",   required_argument, NULL,  
> 'x' },
> +  { "max-vbo-lines", required_argument, NULL,  
> 'v' },
> +  { NULL,0, NULL,  0 
> }
> };
>  
> outfile = stdout;
> @@ -605,6 +609,9 @@ int main(int argc, char *argv[])
>case 'x':
>   xml_path = strdup(optarg);
>   break;
> +  case 'v':
> + max_vbo_lines = atoi(optarg);
> + break;
>default:
>   break;
>}
> 

Series is:
Reviewed-by: Kenneth Graunke 



[Mesa-dev] [PATCH] spirv: Apply OriginUpperLeft to FragCoord

2018-05-02 Thread Neil Roberts

This behaviour was changed in 1e5b09f42f694687ac. The commit message
for that says it is just a “tidy up” so my assumption is that the
behaviour change was a mistake. It’s a little hard to decipher looking
at the diff, but the previous code before that patch was:

  if (builtin == SpvBuiltInFragCoord || builtin == SpvBuiltInSamplePosition)
 nir_var->data.origin_upper_left = b->origin_upper_left;

  if (builtin == SpvBuiltInFragCoord)
 nir_var->data.pixel_center_integer = b->pixel_center_integer;

After the patch the code was:

  case SpvBuiltInSamplePosition:
 nir_var->data.origin_upper_left = b->origin_upper_left;
 /* fallthrough */
  case SpvBuiltInFragCoord:
 nir_var->data.pixel_center_integer = b->pixel_center_integer;
 break;

Before the patch origin_upper_left affected both builtins and
pixel_center_integer only affected FragCoord. After the patch
origin_upper_left only affects SamplePosition and pixel_center_integer
affects both variables.

This patch tries to restore the previous behaviour by changing the
code to:

  case SpvBuiltInFragCoord:
 nir_var->data.pixel_center_integer = b->pixel_center_integer;
 /* fallthrough */
  case SpvBuiltInSamplePosition:
 nir_var->data.origin_upper_left = b->origin_upper_left;
 break;

This change will be important for ARB_gl_spirv which is meant to
support OriginLowerLeft.
---
 src/compiler/spirv/vtn_variables.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/compiler/spirv/vtn_variables.c 
b/src/compiler/spirv/vtn_variables.c
index 9679ff6526c..fd8ab7f247a 100644
--- a/src/compiler/spirv/vtn_variables.c
+++ b/src/compiler/spirv/vtn_variables.c
@@ -1419,11 +1419,11 @@ apply_var_decoration(struct vtn_builder *b, 
nir_variable *nir_var,
   case SpvBuiltInTessLevelInner:
  nir_var->data.compact = true;
  break;
-  case SpvBuiltInSamplePosition:
- nir_var->data.origin_upper_left = b->origin_upper_left;
- /* fallthrough */
   case SpvBuiltInFragCoord:
  nir_var->data.pixel_center_integer = b->pixel_center_integer;
+ /* fallthrough */
+  case SpvBuiltInSamplePosition:
+ nir_var->data.origin_upper_left = b->origin_upper_left;
  break;
   default:
  break;
-- 
2.14.3
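
For anyone who finds the case ordering hard to follow in the diff, here is a small standalone C sketch of the switch as it looks after this patch. The enum, struct and function names are made up for illustration and are not the vtn/NIR types; only the order of the cases and the fallthrough mirror the patch.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the SPIR-V builtins and the NIR variable data. */
enum builtin { BUILTIN_FRAG_COORD, BUILTIN_SAMPLE_POSITION, BUILTIN_OTHER };

struct var_data {
   bool origin_upper_left;
   bool pixel_center_integer;
};

/* Case order as in this patch: FragCoord sets pixel_center_integer and then
 * falls through, so both builtins pick up origin_upper_left. */
static struct var_data
apply_builtin(enum builtin b, bool origin_upper_left, bool pixel_center_integer)
{
   struct var_data data = { false, false };

   switch (b) {
   case BUILTIN_FRAG_COORD:
      data.pixel_center_integer = pixel_center_integer;
      /* fallthrough */
   case BUILTIN_SAMPLE_POSITION:
      data.origin_upper_left = origin_upper_left;
      break;
   default:
      break;
   }

   return data;
}
```

With the two cases swapped (the state after 1e5b09f42f694687ac), the same function would leave origin_upper_left unset for FragCoord, which is the regression described above.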


[Mesa-dev] [PATCH 2/2] intel: aubinator: add an option to limit the number of decoded VBO lines

2018-05-02 Thread Lionel Landwerlin

Signed-off-by: Lionel Landwerlin 
---
 src/intel/tools/aubinator.c | 39 ++---
 1 file changed, 23 insertions(+), 16 deletions(-)

diff --git a/src/intel/tools/aubinator.c b/src/intel/tools/aubinator.c
index bc263dbf846..3120e82b22e 100644
--- a/src/intel/tools/aubinator.c
+++ b/src/intel/tools/aubinator.c
@@ -58,6 +58,7 @@
 
 static bool option_full_decode = true;
 static bool option_print_offsets = true;
+static int max_vbo_lines = -1;
 static enum { COLOR_AUTO, COLOR_ALWAYS, COLOR_NEVER } option_color;
 
 /* state */
@@ -179,6 +180,7 @@ aubinator_init(uint16_t aub_pci_id, const char *app_name)
 
gen_batch_decode_ctx_init(&batch_ctx, &devinfo, outfile, batch_flags,
  xml_path, get_gen_batch_bo, NULL, NULL);
+   batch_ctx.max_vbo_decoded_lines = max_vbo_lines;
 
char *color = GREEN_HEADER, *reset_color = NORMAL;
if (option_color == COLOR_NEVER)
@@ -547,14 +549,15 @@ print_help(const char *progname, FILE *file)
"Usage: %s [OPTION]... [FILE]\n"
"Decode aub file contents from either FILE or the standard input.\n\n"
"A valid --gen option must be provided.\n\n"
-   "  --help  display this help and exit\n"
-   "  --gen=platform  decode for given platform (3 letter platform name)\n"
-   "  --headers   decode only command headers\n"
-   "  --color[=WHEN]  colorize the output; WHEN can be 'auto' (default\n"
-   "if omitted), 'always', or 'never'\n"
-   "  --no-pager  don't launch pager\n"
-   "  --no-offsets don't print instruction offsets\n"
-   "  --xml=DIR   load hardware xml description from directory DIR\n",
+   "  --help display this help and exit\n"
+   "  --gen=platform decode for given platform (3 letter platform name)\n"
+   "  --headers  decode only command headers\n"
+   "  --color[=WHEN] colorize the output; WHEN can be 'auto' (default\n"
+   " if omitted), 'always', or 'never'\n"
+   "  --max-vbo-lines=N  limit the number of decoded VBO lines\n"
+   "  --no-pager don't launch pager\n"
+   "  --no-offsets   don't print instruction offsets\n"
+   "  --xml=DIR  load hardware xml description from directory DIR\n",
progname);
 }
 
@@ -564,14 +567,15 @@ int main(int argc, char *argv[])
int c, i;
bool help = false, pager = true;
const struct option aubinator_opts[] = {
-  { "help",   no_argument,   (int *) &help, true },
-  { "no-pager",   no_argument,   (int *) &pager, false },
-  { "no-offsets", no_argument,   (int *) &option_print_offsets, false },
-  { "gen", required_argument, NULL,  'g' },
-  { "headers", no_argument,   (int *) &option_full_decode,   false },
-  { "color",  required_argument, NULL,  'c' },
-  { "xml", required_argument, NULL,  'x' },
-  { NULL, 0, NULL,  0 }
+  { "help",  no_argument,   (int *) &help, true },
+  { "no-pager",  no_argument,   (int *) &pager, false },
+  { "no-offsets", no_argument,   (int *) &option_print_offsets, false },
+  { "gen",   required_argument, NULL,  'g' },
+  { "headers",   no_argument,   (int *) &option_full_decode,   false },
+  { "color", required_argument, NULL,  'c' },
+  { "xml",   required_argument, NULL,  'x' },
+  { "max-vbo-lines", required_argument, NULL,  'v' },
+  { NULL, 0, NULL,  0 }
};
 
outfile = stdout;
@@ -605,6 +609,9 @@ int main(int argc, char *argv[])
   case 'x':
  xml_path = strdup(optarg);
  break;
+  case 'v':
+ max_vbo_lines = atoi(optarg);
+ break;
   default:
  break;
   }
-- 
2.17.0
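
A side note on the 'v' case: atoi() gives no error indication, so "--max-vbo-lines=abc" silently becomes 0. If stricter parsing is ever wanted, something along these lines might do. This is only a sketch; parse_line_limit() is a hypothetical helper and is not part of the patch.

```c
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper, not part of the patch: parse a non-negative line
 * limit. Returns false (leaving *out untouched) on empty, non-numeric,
 * negative or out-of-range input. */
static bool
parse_line_limit(const char *arg, int *out)
{
   char *end = NULL;

   if (arg == NULL || *arg == '\0')
      return false;

   errno = 0;
   long value = strtol(arg, &end, 10);
   if (errno != 0 || *end != '\0' || value < 0 || value > INT_MAX)
      return false;

   *out = (int) value;
   return true;
}
```

The 'v' case could then call parse_line_limit(optarg, &max_vbo_lines) and print the help text when it returns false, keeping -1 as the "no limit" default.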


[Mesa-dev] [PATCH 1/2] intel: decoder: limit to the number decoded lines from VBO

2018-05-02 Thread Lionel Landwerlin

By default we set no limit, but the debug batch decoder in i965 sets
it to 100.

Signed-off-by: Lionel Landwerlin 
---
 src/intel/common/gen_batch_decoder.c  | 22 ---
 src/intel/common/gen_decoder.h|  2 ++
 src/mesa/drivers/dri/i965/intel_batchbuffer.c |  1 +
 3 files changed, 17 insertions(+), 8 deletions(-)

diff --git a/src/intel/common/gen_batch_decoder.c 
b/src/intel/common/gen_batch_decoder.c
index c059b194974..3852f32de36 100644
--- a/src/intel/common/gen_batch_decoder.c
+++ b/src/intel/common/gen_batch_decoder.c
@@ -43,6 +43,7 @@ gen_batch_decode_ctx_init(struct gen_batch_decode_ctx *ctx,
ctx->user_data = user_data;
ctx->fp = fp;
ctx->flags = flags;
+   ctx->max_vbo_decoded_lines = -1; /* No limit! */
 
if (xml_path == NULL)
   ctx->spec = gen_spec_load(devinfo);
@@ -165,24 +166,29 @@ static void
 ctx_print_buffer(struct gen_batch_decode_ctx *ctx,
  struct gen_batch_decode_bo bo,
  uint32_t read_length,
- uint32_t pitch)
+ uint32_t pitch,
+ int max_lines)
 {
const uint32_t *dw_end = bo.map + MIN2(bo.size, read_length);
 
-   unsigned line_count = 0;
+   int column_count = 0, line_count = -1;
for (const uint32_t *dw = bo.map; dw < dw_end; dw++) {
-  if (line_count * 4 == pitch || line_count == 8) {
+  if (column_count * 4 == pitch || column_count == 8) {
  fprintf(ctx->fp, "\n");
- line_count = 0;
+ column_count = 0;
+ line_count++;
+
+ if (max_lines >= 0 && line_count >= max_lines)
+break;
   }
-  fprintf(ctx->fp, line_count == 0 ? "  " : " ");
+  fprintf(ctx->fp, column_count == 0 ? "  " : " ");
 
   if ((ctx->flags & GEN_BATCH_DECODE_FLOATS) && probably_float(*dw))
  fprintf(ctx->fp, "  %8.2f", *(float *) dw);
   else
  fprintf(ctx->fp, "  0x%08x", *dw);
 
-  line_count++;
+  column_count++;
}
fprintf(ctx->fp, "\n");
 }
@@ -387,7 +393,7 @@ handle_3dstate_vertex_buffers(struct gen_batch_decode_ctx 
*ctx,
  if (vb.map == 0 || vb_size == 0)
 continue;
 
- ctx_print_buffer(ctx, vb, vb_size, pitch);
+ ctx_print_buffer(ctx, vb, vb_size, pitch, ctx->max_vbo_decoded_lines);
 
  vb.map = NULL;
  vb_size = 0;
@@ -576,7 +582,7 @@ decode_3dstate_constant(struct gen_batch_decode_ctx *ctx, 
const uint32_t *p)
  unsigned size = read_length[i] * 32;
  fprintf(ctx->fp, "constant buffer %d, size %u\n", i, size);
 
- ctx_print_buffer(ctx, buffer[i], size, 0);
+ ctx_print_buffer(ctx, buffer[i], size, 0, -1);
   }
}
 }
diff --git a/src/intel/common/gen_decoder.h b/src/intel/common/gen_decoder.h
index 37f6c3ee989..f2207ddf889 100644
--- a/src/intel/common/gen_decoder.h
+++ b/src/intel/common/gen_decoder.h
@@ -220,6 +220,8 @@ struct gen_batch_decode_ctx {
struct gen_batch_decode_bo surface_base;
struct gen_batch_decode_bo dynamic_base;
struct gen_batch_decode_bo instruction_base;
+
+   int max_vbo_decoded_lines;
 };
 
 void gen_batch_decode_ctx_init(struct gen_batch_decode_ctx *ctx,
diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c 
b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
index b3e4bdc981e..bac6e6dae85 100644
--- a/src/mesa/drivers/dri/i965/intel_batchbuffer.c
+++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
@@ -166,6 +166,7 @@ intel_batchbuffer_init(struct brw_context *brw)
   gen_batch_decode_ctx_init(&batch->decoder, devinfo, stderr,
 decode_flags, NULL, decode_get_bo,
 decode_get_state_size, brw);
+  batch->decoder.max_vbo_decoded_lines = 100;
}
 
batch->use_batch_first =
-- 
2.17.0
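
To illustrate the idea of capping the dump independently of the decoder, here is a simplified standalone sketch. It is not the ctx_print_buffer() logic from the patch: print_dwords_limited() and its parameters are hypothetical, and it counts lines in a more direct way than the patch does. As in the patch, a negative limit means "no limit".

```c
#include <stdint.h>
#include <stdio.h>

/* Print `count` dwords from `data`, `columns` values per line, stopping after
 * `max_lines` lines when max_lines >= 0 (a negative value means no limit).
 * Returns the number of lines written. */
static int
print_dwords_limited(FILE *fp, const uint32_t *data, size_t count,
                     unsigned columns, int max_lines)
{
   int lines = 0;

   if (columns == 0)
      columns = 8;

   for (size_t i = 0; i < count; i++) {
      if (i % columns == 0) {
         /* About to start a new line: stop if the limit is reached. */
         if (max_lines >= 0 && lines >= max_lines)
            break;
         if (i > 0)
            fputc('\n', fp);
         lines++;
      }
      fprintf(fp, "  0x%08x", (unsigned) data[i]);
   }

   /* Terminate the last line that was started, if any. */
   if (lines > 0)
      fputc('\n', fp);

   return lines;
}
```

For a 20-dword buffer printed 4 per line, a limit of 2 stops after two lines, and -1 prints all five.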


Re: [Mesa-dev] [PATCH v2] nir: Implement optional b2f->iand lowering

2018-05-02 Thread Matt Turner

Thanks!

Reviewed-by: Matt Turner 

Re: [Mesa-dev] [PATCH v3] i965: Fix ETC2/EAC GetCompressed* functions on Gen7 GPUs

2018-05-02 Thread Matt Turner

On Wed, May 2, 2018 at 9:13 AM, Eleni Maria Stea  wrote:
> Gen 7 GPUs store the compressed EAC/ETC2 images in other non-compressed
> formats that can render. When GetCompressed* functions are called, the
> pixels are returned in the non-compressed format that is used for the
> rendering.
>
> With this patch we store both the compressed and non-compressed versions
> of the image, so that both rendering commands and GetCompressed*
> commands work.
>
> Also, the assertions for GL_MAP_WRITE_BIT and GL_MAP_INVALIDATE_RANGE_BIT
> in intel_miptree_map_etc function have been removed because when the
> miptree is mapped for reading (for example from a GetCompress*
> function) the GL_MAP_WRITE_BIT won't be set (and shouldn't be set).
>
> Fixes: the following test in CTS for gen7:
> KHR-GL45.direct_state_access.textures_compressed_subimage test
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104272

I think you can add

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=81843

as well :)

Re: [Mesa-dev] [PATCH] nir: add missing dependency in meson.build

2018-05-02 Thread Kenneth Graunke

On Wednesday, May 2, 2018 9:49:34 AM PDT Rob Clark wrote:
> nir_builder_opcodes.h also depends on nir_intrinsics.py for generating
> the system-value builders.
> 
> Reported-by: Christoph Haag 
> Reported-by: Kenneth Graunke 
> Signed-off-by: Rob Clark 
> ---
>  src/compiler/nir/meson.build | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/compiler/nir/meson.build b/src/compiler/nir/meson.build
> index 84715a58912..4fffbb7a1ee 100644
> --- a/src/compiler/nir/meson.build
> +++ b/src/compiler/nir/meson.build
> @@ -18,7 +18,7 @@
>  # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN 
> THE
>  # SOFTWARE.
>  
> -nir_depends = files('nir_opcodes.py')
> +nir_depends = files('nir_opcodes.py', 'nir_intrinsics.py')
>  
>  nir_builder_opcodes_h = custom_target(
>'nir_builder_opcodes.h',
> 

Thank you!

Reviewed-by: Kenneth Graunke 



Re: [Mesa-dev] [PATCH] intel: Fix 3DSTATE_CONSTANT buffer decoding.

2018-05-02 Thread Kenneth Graunke

On Wednesday, May 2, 2018 9:50:33 AM PDT Lionel Landwerlin wrote:
> On 02/05/18 17:45, Kenneth Graunke wrote:
> > First, this was iterating over the 3DSTATE_CONSTANT_* instruction
> > but trying to process fields of the 3DSTATE_CONSTANT_BODY substructure.
> >
> > Secondly, the fields have been called Buffer[0] and Read Length[0],
> > for a while now, and we were not handling the subscripts correctly.
> > ---
> >   src/intel/common/gen_batch_decoder.c | 40 +---
> >   1 file changed, 25 insertions(+), 15 deletions(-)
> >
> > diff --git a/src/intel/common/gen_batch_decoder.c 
> > b/src/intel/common/gen_batch_decoder.c
> > index dd78e07827e..c059b194974 100644
> > --- a/src/intel/common/gen_batch_decoder.c
> > +++ b/src/intel/common/gen_batch_decoder.c
> > @@ -543,31 +543,41 @@ static void
> >   decode_3dstate_constant(struct gen_batch_decode_ctx *ctx, const uint32_t 
> > *p)
> >   {
> >  struct gen_group *inst = gen_spec_find_instruction(ctx->spec, p);
> > +   struct gen_group *body =
> > +  gen_spec_find_struct(ctx->spec, "3DSTATE_CONSTANT_BODY");
> >   
> >  uint32_t read_length[4];
> >  struct gen_batch_decode_bo buffer[4];
> >  memset(buffer, 0, sizeof(buffer));
> >   
> > -   int rlidx = 0, bidx = 0;
> > +   struct gen_field_iterator outer;
> > +   gen_field_iterator_init(&outer, inst, p, 0, false);
> > +   while (gen_field_iterator_next(&outer)) {
> > +  if (outer.struct_desc != body)
> > + continue;
> >   
> > -   struct gen_field_iterator iter;
> > -   gen_field_iterator_init(&iter, inst, p, 0, false);
> > -   while (gen_field_iterator_next(&iter)) {
> > -  if (strcmp(iter.name, "Read Length") == 0) {
> > - read_length[rlidx++] = iter.raw_value;
> > -  } else if (strcmp(iter.name, "Buffer") == 0) {
> > - buffer[bidx++] = ctx_get_bo(ctx, iter.raw_value);
> > +  struct gen_field_iterator iter;
> > +  gen_field_iterator_init(&iter, body, &p[outer.start_bit / 32],
> > +  0, false);
> > +
> > +  while (gen_field_iterator_next(&iter)) {
> > + int idx;
> > + if (sscanf(iter.name, "Read Length[%d]", &idx) == 1) {
> > +read_length[idx] = iter.raw_value;
> > + } else if (sscanf(iter.name, "Buffer[%d]", &idx) == 1) {
> > +buffer[idx] = ctx_get_bo(ctx, iter.raw_value);
> > + }
> > }
> > -   }
> >   
> > -   for (int i = 0; i < 4; i++) {
> > -  if (read_length[i] == 0 || buffer[i].map == NULL)
> > - continue;
> 
> I'm kind of thinking we could get rid of the read_length[4] buffer[4] 
> arrays and just put that into the outer loop.
> What do you think?
> 
> Either way this is :
> 
> Reviewed-by: Lionel Landwerlin 

FTR, I was confused by this suggestion, so I talked to Lionel on IRC. His
thinking was that we could print them out as we fetch the buffers...

But the iteration order is

   - Read Length[0]
   - Read Length[1]
   - Read Length[2]
   - Read Length[3]
   - Buffer[0]
   - Buffer[1]
   - Buffer[2]
   - Buffer[3]

So we'd at least need the read_length array, still.  We decided to
just leave it as it is.
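
A rough standalone sketch of why the array has to stay, for the archives. The struct field type and collect_constant_fields() are hypothetical stand-ins for the gen_field_iterator walk; only the sscanf() name matching is taken from the patch, and the bounds checks are an extra that the patch does not have.

```c
#include <stdint.h>
#include <stdio.h>

#define NUM_BUFFERS 4

/* Hypothetical flattened view of the fields seen while iterating. */
struct field {
   const char *name;
   uint64_t raw_value;
};

/* Every "Read Length[i]" arrives before any "Buffer[i]", so the lengths have
 * to be stored until the matching buffer address shows up. */
static void
collect_constant_fields(const struct field *fields, size_t count,
                        uint32_t read_length[NUM_BUFFERS],
                        uint64_t address[NUM_BUFFERS])
{
   for (size_t i = 0; i < count; i++) {
      int idx;

      if (sscanf(fields[i].name, "Read Length[%d]", &idx) == 1) {
         if (idx >= 0 && idx < NUM_BUFFERS)
            read_length[idx] = (uint32_t) fields[i].raw_value;
      } else if (sscanf(fields[i].name, "Buffer[%d]", &idx) == 1) {
         if (idx >= 0 && idx < NUM_BUFFERS)
            address[idx] = fields[i].raw_value;
      }
   }
}
```

Printing "constant buffer %d, size %u" at the point where Buffer[i] is seen would work with just the stored lengths, but it would not make the code much shorter than the separate loop.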



Re: [Mesa-dev] [PATCH 2/9] util/set: add a set_clear function

2018-05-02 Thread Kenneth Graunke

On Wednesday, May 2, 2018 9:01:02 AM PDT Scott D Phillips wrote:
> Clear a set back to the state of having zero entries.
> ---
>  src/util/set.c | 23 +++
>  src/util/set.h |  3 +++
>  2 files changed, 26 insertions(+)
> 
> diff --git a/src/util/set.c b/src/util/set.c
> index d71f771807f..2c9b09319ff 100644
> --- a/src/util/set.c
> +++ b/src/util/set.c
> @@ -155,6 +155,29 @@ _mesa_set_destroy(struct set *ht, void 
> (*delete_function)(struct set_entry *entr
> ralloc_free(ht);
>  }
>  
> +/**
> + * Clears all values from the given set.
> + *
> + * If delete_function is passed, it gets called on each entry present before
> + * the set is cleared.
> + */
> +void
> +_mesa_set_clear(struct set *set, void (*delete_function)(struct set_entry 
> *entry))
> +{
> +   struct set_entry *entry;
> +
> +   if (!set)
> +  return;
> +
> +   set_foreach (set, entry) {
> +  if (delete_function)
> + delete_function(entry);
> +  entry->key = deleted_key;
> +   }
> +
> +   set->entries = set->deleted_entries = 0;
> +}
> +
>  /**
>   * Finds a set entry with the given key and hash of that key.
>   *
> diff --git a/src/util/set.h b/src/util/set.h
> index 9acd2c28c9f..06e79e15867 100644
> --- a/src/util/set.h
> +++ b/src/util/set.h
> @@ -61,6 +61,9 @@ _mesa_set_create(void *mem_ctx,
>  void
>  _mesa_set_destroy(struct set *set,
>void (*delete_function)(struct set_entry *entry));
> +void
> +_mesa_set_clear(struct set *set,
> +void (*delete_function)(struct set_entry *entry));
>  
>  struct set_entry *
>  _mesa_set_add(struct set *set, const void *key);
> 

Reviewed-by: Kenneth Graunke 

Cc'ing Eric since I think this was imported from a separate project
of his, in case he wants to port this back to the original repo.



Re: [Mesa-dev] [PATCH v2 3/3] mesa: enable geom shaders in OpenGL 3.2 Compat profile

2018-05-02 Thread Ilia Mirkin

On Wed, May 2, 2018 at 6:27 AM, Timothy Arceri  wrote:
> ---
>  src/mapi/glapi/gen/apiexec.py | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/src/mapi/glapi/gen/apiexec.py b/src/mapi/glapi/gen/apiexec.py
> index b5e0ad4a179..d33cc85d47f 100644
> --- a/src/mapi/glapi/gen/apiexec.py
> +++ b/src/mapi/glapi/gen/apiexec.py
> @@ -46,7 +46,7 @@ class exec_info():
>  if compatibility is not None:
>  assert isinstance(compatibility, int)
>  assert compatibility >= 10
> -assert compatibility <= 30
> +assert compatibility <= 46
>
>  if core is not None:
>  assert isinstance(core, int)
> @@ -70,7 +70,7 @@ functions = {
>  "TexBuffer": exec_info(compatibility=20, core=31, es2=31),
>
>  # OpenGL 3.2 / GL_OES_geometry_shader.
> -"FramebufferTexture": exec_info(core=32, es2=31),
> +"FramebufferTexture": exec_info(compatibility=32, core=32, es2=31),

Does it make sense to list out compat explicitly in the presence of
core? Are there any core functions that aren't available in compat
contexts of that version?

IMHO it's worth changing the exec_info class to say

if core and compatibility is None:
  compatibility = core

... or something along those lines.

>
>  # OpenGL 4.0 / GL_ARB_shader_subroutines. Mesa only exposes this
>  # extension with core profile.
> --
> 2.17.0
>

Re: [Mesa-dev] [PATCH v2 3/3] mesa: enable geom shaders in OpenGL 3.2 Compat profile

2018-05-02 Thread Emil Velikov

On 2 May 2018 at 11:27, Timothy Arceri  wrote:
> ---
>  src/mapi/glapi/gen/apiexec.py | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/src/mapi/glapi/gen/apiexec.py b/src/mapi/glapi/gen/apiexec.py
> index b5e0ad4a179..d33cc85d47f 100644
> --- a/src/mapi/glapi/gen/apiexec.py
> +++ b/src/mapi/glapi/gen/apiexec.py
> @@ -46,7 +46,7 @@ class exec_info():
>  if compatibility is not None:
>  assert isinstance(compatibility, int)
>  assert compatibility >= 10
> -assert compatibility <= 30
> +assert compatibility <= 46
I'd keep that 32 for now. With that the patch is
Reviewed-by: Emil Velikov 

A mildly related fix or two (in glsl) coming shortly.
-Emil

[Mesa-dev] [PATCH] intel: fix aubinator include

2018-05-02 Thread Lionel Landwerlin

Signed-off-by: Lionel Landwerlin 
Fixes: 7c22c150c40b3 ("intel: Move batch decoder/disassembler from tools/ to 
common/")
---
 src/intel/tools/aubinator.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel/tools/aubinator.c b/src/intel/tools/aubinator.c
index ab053c66b36..bc263dbf846 100644
--- a/src/intel/tools/aubinator.c
+++ b/src/intel/tools/aubinator.c
@@ -40,8 +40,8 @@
 #include "util/macros.h"
 
 #include "common/gen_decoder.h"
+#include "common/gen_disasm.h"
 #include "intel_aub.h"
-#include "gen_disasm.h"
 
 /* Below is the only command missing from intel_aub.h in libdrm
  * So, reuse intel_aub.h from libdrm and #define the
-- 
2.17.0


Re: [Mesa-dev] [PATCH v2 2/3] mesa: actually support GLSL version overrides in compat profile

2018-05-02 Thread Emil Velikov

On 2 May 2018 at 11:27, Timothy Arceri  wrote:
> ---
>  src/mesa/main/version.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/src/mesa/main/version.c b/src/mesa/main/version.c
> index 84babd69e2f..540f5482034 100644
> --- a/src/mesa/main/version.c
> +++ b/src/mesa/main/version.c
> @@ -591,6 +591,8 @@ _mesa_get_version(const struct gl_extensions *extensions,
>   if (consts->GLSLVersion > 140) {
>  consts->GLSLVersion = 140;
>   }
> + /* Support GLSL version overrides in compat profile */
> + _mesa_override_glsl_version(consts);

Why are we allowing this only for compat? As-is this feels very dirty
and skimming through the existing code doesn't help much.

* classic drivers - 965, starting at create_context
_mesa_initialize_context -> _mesa_init_constants -> GLSLVersion (120) + override
intelInitExtensions -> GLSLVersion + override combo
_mesa_compute_version -> _mesa_get_version -> [optional] cap up-to 140
-> override
_mesa_compute_version -> tweak/match GLSL version based on the GL version

Not to mention the initial 120 (effectively) in
_mesa_initialize_context is bonkers for the following:
 - Intel Gen2 (GL 1.3) and Gen3 (GL 1.4 or 2.1)
 - nouveau vieux - GL 1.2 or 1.3
 - radeon (r100/r200) - GL 1.3

* gallium - two paths - create_screen and create_context, latter more
or less identical to i965
For the create_screen part:
st_api_query_versions (for max_gl*_version) -> _mesa_init_constants -> see above
st_api_query_versions (for max_gl*_version) -> st_init_extensions ->
GLSLVersion + override combo
st_api_query_versions (for max_gl*_version) -> _mesa_get_version -> see above

As you can see things are hairy.

A few ideas that come to mind:
 - drop the _mesa_init_constants bits and update any drivers needed
 - each of GLSLVersion, override and tweaks should happen [ideally]
once per ctx.

In theory one ought to be able to reuse the gallium approach for
classic drivers, but that's going on a far too big tangent.

HTH
Emil

Re: [Mesa-dev] [PATCH] intel: Fix 3DSTATE_CONSTANT buffer decoding.

2018-05-02 Thread Lionel Landwerlin


On 02/05/18 17:45, Kenneth Graunke wrote:

First, this was iterating over the 3DSTATE_CONSTANT_* instruction
but trying to process fields of the 3DSTATE_CONSTANT_BODY substructure.

Secondly, the fields have been called Buffer[0] and Read Length[0],
for a while now, and we were not handling the subscripts correctly.
---
  src/intel/common/gen_batch_decoder.c | 40 +---
  1 file changed, 25 insertions(+), 15 deletions(-)

diff --git a/src/intel/common/gen_batch_decoder.c 
b/src/intel/common/gen_batch_decoder.c
index dd78e07827e..c059b194974 100644
--- a/src/intel/common/gen_batch_decoder.c
+++ b/src/intel/common/gen_batch_decoder.c
@@ -543,31 +543,41 @@ static void
  decode_3dstate_constant(struct gen_batch_decode_ctx *ctx, const uint32_t *p)
  {
 struct gen_group *inst = gen_spec_find_instruction(ctx->spec, p);
+   struct gen_group *body =
+  gen_spec_find_struct(ctx->spec, "3DSTATE_CONSTANT_BODY");
  
 uint32_t read_length[4];

 struct gen_batch_decode_bo buffer[4];
 memset(buffer, 0, sizeof(buffer));
  
-   int rlidx = 0, bidx = 0;

+   struct gen_field_iterator outer;
+   gen_field_iterator_init(&outer, inst, p, 0, false);
+   while (gen_field_iterator_next(&outer)) {
+  if (outer.struct_desc != body)
+ continue;
  
-   struct gen_field_iterator iter;
-   gen_field_iterator_init(&iter, inst, p, 0, false);
-   while (gen_field_iterator_next(&iter)) {
-  if (strcmp(iter.name, "Read Length") == 0) {
- read_length[rlidx++] = iter.raw_value;
-  } else if (strcmp(iter.name, "Buffer") == 0) {
- buffer[bidx++] = ctx_get_bo(ctx, iter.raw_value);
+  struct gen_field_iterator iter;
+  gen_field_iterator_init(&iter, body, &p[outer.start_bit / 32],
+  0, false);
+
+  while (gen_field_iterator_next(&iter)) {
+ int idx;
+ if (sscanf(iter.name, "Read Length[%d]", &idx) == 1) {
+read_length[idx] = iter.raw_value;
+ } else if (sscanf(iter.name, "Buffer[%d]", &idx) == 1) {
+buffer[idx] = ctx_get_bo(ctx, iter.raw_value);
+ }
}
-   }
  
-   for (int i = 0; i < 4; i++) {

-  if (read_length[i] == 0 || buffer[i].map == NULL)
- continue;


I'm kind of thinking we could get rid of the read_length[4] buffer[4] 
arrays and just put that into the outer loop.

What do you think?

Either way this is :

Reviewed-by: Lionel Landwerlin 


+  for (int i = 0; i < 4; i++) {
+ if (read_length[i] == 0 || buffer[i].map == NULL)
+continue;
  
-  unsigned size = read_length[i] * 32;

-  fprintf(ctx->fp, "constant buffer %d, size %u\n", i, size);
+ unsigned size = read_length[i] * 32;
+ fprintf(ctx->fp, "constant buffer %d, size %u\n", i, size);
  
-  ctx_print_buffer(ctx, buffer[i], size, 0);

+ ctx_print_buffer(ctx, buffer[i], size, 0);
+  }
 }
  }
  




[Mesa-dev] [PATCH] nir: add missing dependency in meson.build

2018-05-02 Thread Rob Clark

nir_builder_opcodes.h also depends on nir_intrinsics.py for generating
the system-value builders.

Reported-by: Christoph Haag 
Reported-by: Kenneth Graunke 
Signed-off-by: Rob Clark 
---
 src/compiler/nir/meson.build | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/compiler/nir/meson.build b/src/compiler/nir/meson.build
index 84715a58912..4fffbb7a1ee 100644
--- a/src/compiler/nir/meson.build
+++ b/src/compiler/nir/meson.build
@@ -18,7 +18,7 @@
 # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 # SOFTWARE.
 
-nir_depends = files('nir_opcodes.py')
+nir_depends = files('nir_opcodes.py', 'nir_intrinsics.py')
 
 nir_builder_opcodes_h = custom_target(
   'nir_builder_opcodes.h',
-- 
2.14.3


[Mesa-dev] [PATCH] intel: Fix 3DSTATE_CONSTANT buffer decoding.

2018-05-02 Thread Kenneth Graunke

First, this was iterating over the 3DSTATE_CONSTANT_* instruction
but trying to process fields of the 3DSTATE_CONSTANT_BODY substructure.

Secondly, the fields have been called Buffer[0] and Read Length[0],
for a while now, and we were not handling the subscripts correctly.
---
 src/intel/common/gen_batch_decoder.c | 40 +---
 1 file changed, 25 insertions(+), 15 deletions(-)

diff --git a/src/intel/common/gen_batch_decoder.c 
b/src/intel/common/gen_batch_decoder.c
index dd78e07827e..c059b194974 100644
--- a/src/intel/common/gen_batch_decoder.c
+++ b/src/intel/common/gen_batch_decoder.c
@@ -543,31 +543,41 @@ static void
 decode_3dstate_constant(struct gen_batch_decode_ctx *ctx, const uint32_t *p)
 {
struct gen_group *inst = gen_spec_find_instruction(ctx->spec, p);
+   struct gen_group *body =
+  gen_spec_find_struct(ctx->spec, "3DSTATE_CONSTANT_BODY");
 
uint32_t read_length[4];
struct gen_batch_decode_bo buffer[4];
memset(buffer, 0, sizeof(buffer));
 
-   int rlidx = 0, bidx = 0;
+   struct gen_field_iterator outer;
+   gen_field_iterator_init(&outer, inst, p, 0, false);
+   while (gen_field_iterator_next(&outer)) {
+  if (outer.struct_desc != body)
+ continue;
 
-   struct gen_field_iterator iter;
-   gen_field_iterator_init(&iter, inst, p, 0, false);
-   while (gen_field_iterator_next(&iter)) {
-  if (strcmp(iter.name, "Read Length") == 0) {
- read_length[rlidx++] = iter.raw_value;
-  } else if (strcmp(iter.name, "Buffer") == 0) {
- buffer[bidx++] = ctx_get_bo(ctx, iter.raw_value);
+  struct gen_field_iterator iter;
+  gen_field_iterator_init(&iter, body, &p[outer.start_bit / 32],
+  0, false);
+
+  while (gen_field_iterator_next(&iter)) {
+ int idx;
+ if (sscanf(iter.name, "Read Length[%d]", &idx) == 1) {
+read_length[idx] = iter.raw_value;
+ } else if (sscanf(iter.name, "Buffer[%d]", &idx) == 1) {
+buffer[idx] = ctx_get_bo(ctx, iter.raw_value);
+ }
   }
-   }
 
-   for (int i = 0; i < 4; i++) {
-  if (read_length[i] == 0 || buffer[i].map == NULL)
- continue;
+  for (int i = 0; i < 4; i++) {
+ if (read_length[i] == 0 || buffer[i].map == NULL)
+continue;
 
-  unsigned size = read_length[i] * 32;
-  fprintf(ctx->fp, "constant buffer %d, size %u\n", i, size);
+ unsigned size = read_length[i] * 32;
+ fprintf(ctx->fp, "constant buffer %d, size %u\n", i, size);
 
-  ctx_print_buffer(ctx, buffer[i], size, 0);
+ ctx_print_buffer(ctx, buffer[i], size, 0);
+  }
}
 }
 
-- 
2.17.0


[Mesa-dev] [Bug 106337] eglWaitClient() doesn't work as documented using DRI2 backend

2018-05-02 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=106337

--- Comment #6 from mgorc...@qnx.com  ---
I was able to test your changes, and had to make the following addition to
intel_screen.c:

@@ -171,7 +176,7 @@
 }

 static const struct __DRI2flushExtensionRec intelFlushExtension = {
-.base = { __DRI2_FLUSH, 4 },
+.base = { __DRI2_FLUSH, 5 },

 .flush  = intel_dri2_flush,
 .invalidate = dri2InvalidateDrawable,

Now I can confirm that it flushes all data to the drawable surface and waits for
it properly. Speed has decreased dramatically; it is only a bit better than
glFinish(). I don't think we can do much about that.

Another "issue" related to this topic, which I'm not sure is a bug or expected
behavior, shows up when an FBO is used together with a surfaceless context.

eglWaitClient() bails out with an error if a surfaceless context is used to draw
to an FBO. Is this expected behavior?

The specification says: "All rendering calls for the currently bound context, for
the current rendering API, made prior to eglWaitClient are guaranteed to be
executed before native rendering calls made after eglWaitClient." and it
doesn't mention "surfaces", only "contexts".

Usually most people create dummy 1x1 pbuffers for FBO rendering, but
surfaceless contexts are more convenient.


[Mesa-dev] [PATCH] nv50/ir: fix spilling regression introduced by 5428066f5e

2018-05-02 Thread Karol Herbst

This is a minor mistake made while moving the code out into a new
function. The original code contained a loop that could terminate
early and skip setting noSpill to 1. After the refactoring, noSpill was always
set.

Fixes: 5428066f5e1ef5ea6ae04c84019f270023cfc6aa
("nv50/ir: make a copy of tex src if it's referenced multiple times")
Signed-off-by: Karol Herbst 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
index 28e0e260cee..b660fec75c9 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
@@ -2353,6 +2353,8 @@ 
RegAlloc::InsertConstraintsPass::insertConstraintMove(Instruction *cst, int s)
 
cst->setSrc(s, mov->getDef(0));
cst->bb->insertBefore(cst, mov);
+
+   cst->getDef(0)->asLValue()->noSpill = 1; // doesn't help
 }
 
 // Insert extra moves so that, if multiple register constraints on a value are
@@ -2397,8 +2399,6 @@ RegAlloc::InsertConstraintsPass::insertConstraintMoves()
 }
 
 insertConstraintMove(cst, s);
-
-cst->getDef(0)->asLValue()->noSpill = 1; // doesn't help
  }
   }
}
-- 
2.17.0
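
To illustrate the class of regression described in the commit message, here is a minimal sketch. It is not the nouveau code: `struct demo_value`, `demo_insert_move` and both `demo_constrain_*` functions are hypothetical names, and the "needs a copy" condition is an assumption made only for the example. It shows how a flag that used to be skipped by an early exit ends up being set unconditionally once the early exit moves into a helper.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a value that may or may not need a copy. */
struct demo_value {
   int  uses;      /* how many constraints reference the value */
   bool no_spill;  /* should only be set when a move was inserted */
   int  moves;     /* number of moves inserted so far */
};

/* Helper with an early exit, as a refactoring might produce it. */
void demo_insert_move(struct demo_value *v, bool set_flag_inside)
{
   if (v->uses < 2)
      return;                /* early exit: nothing is inserted */

   v->moves++;
   if (set_flag_inside)
      v->no_spill = true;    /* fixed shape: flag follows the move */
}

/* Regressed shape: the caller sets the flag on every path. */
void demo_constrain_regressed(struct demo_value *v)
{
   demo_insert_move(v, false);
   v->no_spill = true;
}

/* Fixed shape: the flag is set only where the move is inserted. */
void demo_constrain_fixed(struct demo_value *v)
{
   demo_insert_move(v, true);
}
```

With a value that has a single use, the regressed variant marks it as noSpill even though no move was inserted, while the fixed variant leaves it untouched.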


Re: [Mesa-dev] [PATCH] opencl: autotools: Fix linking order for OpenCL target

2018-05-02 Thread Kai Wasserbäch

Hey Jan,
Jan Vesely wrote on 01.05.2018 23:59:
> On Tue, 2018-05-01 at 18:23 +0200, Kai Wasserbäch wrote:
>> Jan Vesely wrote on 01.05.2018 17:19:
>>> On Tue, 2018-05-01 at 14:14 +0200, Kai Wasserbäch wrote:
 [...]

  src/gallium/targets/opencl/Makefile.am | 3 +--
  1 file changed, 1 insertion(+), 2 deletions(-)

 diff --git a/src/gallium/targets/opencl/Makefile.am 
 b/src/gallium/targets/opencl/Makefile.am
 index de68a93ad5..f0e1de7797 100644
 --- a/src/gallium/targets/opencl/Makefile.am
 +++ b/src/gallium/targets/opencl/Makefile.am
 @@ -23,11 +23,10 @@ lib@OPENCL_LIBNAME@_la_LIBADD = \
$(LIBELF_LIBS) \
$(DLOPEN_LIBS) \
-lclangCodeGen \
 -  -lclangFrontendTool \
-lclangFrontend \
 +  -lclangFrontendTool \
>>>
>>> This is strange. Why does reordering help here? Do we use -Wl,--as-
>>> needed anywhere?
>>
>> No, not that I can see.
>>
>>> Should we use -Wl,--start-group/-Wl,--end-group for all clang libraries
>>> instead?
>>
>> Maybe? This was the simplest fix I could come up with, but if there's a
>> preference for a link group, I can give that a try as well.
> 
> So the fix is to change ordering?

yes.

> Does using groups fix the issue as well? I think that would be
> preferable, but I use split .so files, so I don't hit this issue.

I tried convincing autotools to work with those flags but failed. The only
option I see to solve this, is very messy IMHO (and would still need the
ordering fix): putting -Wl,--{start,end}-group directly into the right places in
lib@OPENCL_LIBNAME@_la_LIBADD is forbidden by automake ("error: linker flags
such as '-Wl,--start-group' belong in 'lib@OPENCL_LIBNAME@_la_LDFLAGS'") and
adding them to lib@OPENCL_LIBNAME@_la_LDFLAGS like automake is suggesting won't
work for obvious reasons. The only solution I can see is to work with
substitution because automake seems to "not see" the flags then. I could do an
unconditional replacement, but there are probably linkers with no support for
these flags, which would mean I'd have to do the ordering fix in any case and
then conditionally set "-Wl,--{start,end}-group" just for the GNU toolchain with
no immediate benefit beyond future-proofing this section.
But maybe people who are deeper into the whole autotools stuff (Emil?
Francisco?) can point me to a solution? Otherwise I'd like to return to my
original patch which fixes the FTBFS and works for now. Or maybe the library
could be linked against libclang.so (at least when --enable-llvm-shared-libs is
set)?

-lclangDriver \
-lclangSerialization \
 -  -lclangCodeGen \
>>>
>>> Is this change related?
>>
>> Not really, just a minor clean-up while I was busy a few lines above.
>> "clangCodeGen" is already named on the first Clang library line.
> 
> ah, all right, maybe mention it in the commit message?

Do I need to resend the patch for that or can you just add a line like "This
change also removes the duplicate clangCodeGen line (trivial change)." before
pushing, considering that there are two T-b tags to be added anyway?

Cheers,
Kai


Re: [Mesa-dev] [PATCH 5/9] anv: Add vma_heap allocators in anv_device

2018-05-02 Thread Chris Wilson

Quoting Scott D Phillips (2018-05-02 17:01:05)
> +bool
> +anv_vma_alloc(struct anv_device *device, struct anv_bo *bo)
> +{
> +   if (!(bo->flags & EXEC_OBJECT_PINNED))
> +  return true;
> +
> +   pthread_mutex_lock(&device->vma_mutex);
> +
> +   bo->offset = 0;

So bo are device scoped. There can only be a single vma per bo, so why
not store the vma node inside the bo? No extra allocations, no
searching in anv_vma_close() (a linear walk!!! Granted you have the
excuse of doing a full walk for list validation on top of that).

I guess you don't have much that stresses the vma manager :)

The decision to split low/high ranges rather than have an up/down
allocator wants a few words of explanation.
-Chris
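
A rough sketch of the "store the vma node inside the bo" idea, for readers following along. Everything here is hypothetical (`demo_bo`, `demo_vma_node`, `demo_heap` and the trivial bump allocator are assumptions for illustration, not the anv or util/vma API). The point is only that when the node is embedded in the bo, freeing needs neither a separate allocation nor a search.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-bo address range, embedded in the bo itself. */
struct demo_vma_node {
   uint64_t start;
   uint64_t size;
   bool     in_use;
};

struct demo_bo {
   uint64_t             size;
   uint64_t             offset;
   struct demo_vma_node vma;   /* exactly one vma per bo */
};

/* Deliberately simplistic bump allocator; it never reuses freed ranges. */
struct demo_heap {
   uint64_t next;
   uint64_t end;
};

bool demo_bo_vma_alloc(struct demo_heap *heap, struct demo_bo *bo)
{
   if (bo->size == 0 || bo->size > heap->end - heap->next)
      return false;

   bo->vma.start = heap->next;
   bo->vma.size = bo->size;
   bo->vma.in_use = true;
   bo->offset = bo->vma.start;
   heap->next += bo->size;
   return true;
}

/* O(1): the node is reachable from the bo, so no list walk is needed. */
void demo_bo_vma_free(struct demo_bo *bo)
{
   bo->vma.in_use = false;
   bo->offset = 0;
}
```

A real allocator would of course return the range to the heap on free; that part is left out to keep the sketch short.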

[Mesa-dev] [PATCH v2] egl: check if colorspace/surface type is supported

2018-05-02 Thread Juan A. Suarez Romero

According to EGL 1.4 spec, section 3.5.1 ("Creating On-Screen Rendering
Surfaces"), if config does not support the colorspace or alpha format
attributes specified in attrib_list (as defined for
eglCreateWindowSurface), an EGL_BAD_MATCH error is generated.

This fixes dEQP-EGL.functional.wide_color.*_888_colorspace_srgb (still
not merged,
https://android-review.googlesource.com/c/platform/external/deqp/+/667322),
which is crashing when trying to create a windows surface with RGB888
configuration and sRGB colorspace.

v2: Handle the fix in other backends (Tapani)
---
 src/egl/drivers/dri2/platform_drm.c  | 5 +
 src/egl/drivers/dri2/platform_wayland.c  | 6 ++
 src/egl/drivers/dri2/platform_x11.c  | 5 +
 src/egl/drivers/dri2/platform_x11_dri3.c | 5 +
 4 files changed, 21 insertions(+)

diff --git a/src/egl/drivers/dri2/platform_drm.c 
b/src/egl/drivers/dri2/platform_drm.c
index dc4efea9103..35bc4b5b1ac 100644
--- a/src/egl/drivers/dri2/platform_drm.c
+++ b/src/egl/drivers/dri2/platform_drm.c
@@ -155,6 +155,11 @@ dri2_drm_create_window_surface(_EGLDriver *drv, 
_EGLDisplay *disp,
config = dri2_get_dri_config(dri2_conf, EGL_WINDOW_BIT,
 dri2_surf->base.GLColorspace);
 
+   if (!config) {
+  _eglError(EGL_BAD_MATCH, "Unsupported surfacetype/colorspace 
configuration");
+  goto cleanup_surf;
+   }
+
if (!dri2_drm_config_is_compatible(dri2_dpy, config, surface)) {
   _eglError(EGL_BAD_MATCH, "EGL config not compatible with GBM format");
   goto cleanup_surf;
diff --git a/src/egl/drivers/dri2/platform_wayland.c 
b/src/egl/drivers/dri2/platform_wayland.c
index 80853ac00b8..63da21cdf55 100644
--- a/src/egl/drivers/dri2/platform_wayland.c
+++ b/src/egl/drivers/dri2/platform_wayland.c
@@ -249,6 +249,12 @@ dri2_wl_create_window_surface(_EGLDriver *drv, _EGLDisplay 
*disp,
 
config = dri2_get_dri_config(dri2_conf, EGL_WINDOW_BIT,
 dri2_surf->base.GLColorspace);
+
+   if (!config) {
+  _eglError(EGL_BAD_MATCH, "Unsupported surfacetype/colorspace 
configuration");
+  goto cleanup_surf;
+   }
+
visual_idx = dri2_wl_visual_idx_from_config(dri2_dpy, config);
assert(visual_idx != -1);
 
diff --git a/src/egl/drivers/dri2/platform_x11.c 
b/src/egl/drivers/dri2/platform_x11.c
index 6c287b4d06b..fa838f6721e 100644
--- a/src/egl/drivers/dri2/platform_x11.c
+++ b/src/egl/drivers/dri2/platform_x11.c
@@ -251,6 +251,11 @@ dri2_x11_create_surface(_EGLDriver *drv, _EGLDisplay 
*disp, EGLint type,
config = dri2_get_dri_config(dri2_conf, type,
 dri2_surf->base.GLColorspace);
 
+   if (!config) {
+  _eglError(EGL_BAD_MATCH, "Unsupported surfacetype/colorspace 
configuration");
+  goto cleanup_pixmap;
+   }
+
if (dri2_dpy->dri2) {
   dri2_surf->dri_drawable =
  dri2_dpy->dri2->createNewDrawable(dri2_dpy->dri_screen, config,
diff --git a/src/egl/drivers/dri2/platform_x11_dri3.c 
b/src/egl/drivers/dri2/platform_x11_dri3.c
index a41e40156df..5cb6d65c0a3 100644
--- a/src/egl/drivers/dri2/platform_x11_dri3.c
+++ b/src/egl/drivers/dri2/platform_x11_dri3.c
@@ -183,6 +183,11 @@ dri3_create_surface(_EGLDriver *drv, _EGLDisplay *disp, 
EGLint type,
dri_config = dri2_get_dri_config(dri2_conf, type,
 dri3_surf->surf.base.GLColorspace);
 
+   if (!dri_config) {
+  _eglError(EGL_BAD_MATCH, "Unsupported surfacetype/colorspace 
configuration");
+  goto cleanup_pixmap;
+   }
+
if (loader_dri3_drawable_init(dri2_dpy->conn, drawable,
  dri2_dpy->dri_screen,
  dri2_dpy->is_different_gpu,
-- 
2.14.3


[Mesa-dev] [PATCH v3] i965: Fix ETC2/EAC GetCompressed* functions on Gen7 GPUs

2018-05-02 Thread Eleni Maria Stea

Gen7 GPUs store compressed EAC/ETC2 images in other, non-compressed
formats that the hardware can render. When GetCompressed* functions are
called, the pixels are returned in the non-compressed format that is used
for rendering.

With this patch we store both the compressed and non-compressed versions
of the image, so that both rendering commands and GetCompressed*
commands work.

Also, the assertions for GL_MAP_WRITE_BIT and GL_MAP_INVALIDATE_RANGE_BIT
in the intel_miptree_map_etc function have been removed because when the
miptree is mapped for reading (for example from a GetCompressed*
function) the GL_MAP_WRITE_BIT won't be set (and shouldn't be set).

Fixes: the following test in CTS for gen7:
KHR-GL45.direct_state_access.textures_compressed_subimage test

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104272

v2: fixes issues:
   a) initialized uninitialized variables (Juan A. Suarez, Andres Gomez)
   b) fixed race condition where mt and cmt were mapped at the same time
   c) fixed indentation issues (Andres Gomez)
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c |  10 +-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  21 
 src/mesa/drivers/dri/i965/intel_tex.c | 151 ++
 src/mesa/drivers/dri/i965/intel_tex.h |   8 ++
 src/mesa/drivers/dri/i965/intel_tex_image.c   |  94 +++-
 src/mesa/drivers/dri/i965/intel_tex_obj.h |   8 ++
 6 files changed, 260 insertions(+), 32 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index b9a564552d..2eba8c792c 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -734,9 +734,10 @@ miptree_create(struct brw_context *brw,
mesa_format etc_format = MESA_FORMAT_NONE;
uint32_t alloc_flags = 0;
 
-   format = intel_lower_compressed_format(brw, format);
-
-   etc_format = (format != tex_format) ? tex_format : MESA_FORMAT_NONE;
+   if (!(flags & MIPTREE_CREATE_ETC)) {
+  format = intel_lower_compressed_format(brw, format);
+  etc_format = (format != tex_format) ? tex_format : MESA_FORMAT_NONE;
+   }
 
if (flags & MIPTREE_CREATE_BUSY)
   alloc_flags |= BO_ALLOC_BUSY;
@@ -3393,9 +3394,6 @@ intel_miptree_map_etc(struct brw_context *brw,
   assert(mt->format == MESA_FORMAT_R8G8B8X8_UNORM);
}
 
-   assert(map->mode & GL_MAP_WRITE_BIT);
-   assert(map->mode & GL_MAP_INVALIDATE_RANGE_BIT);
-
map->stride = _mesa_format_row_stride(mt->etc_format, map->w);
map->buffer = malloc(_mesa_format_image_size(mt->etc_format,
 map->w, map->h, 1));
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
index 8cea562dfa..7f70a6c341 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
@@ -405,6 +405,27 @@ enum intel_miptree_create_flags {
 * that the miptree will be created with mt->aux_usage == NONE.
 */
MIPTREE_CREATE_NO_AUX   = 1 << 2,
+
+   /** Create a second miptree for the compressed pixels (Gen7 only)
+*
+* On Gen7, we need to store 2 miptrees for some compressed
+* formats so we can handle rendering as well as getting the
+* compressed image data. This flag indicates that the miptree
+* is expected to hold compressed data for the latter case.
+*/
+   MIPTREE_CREATE_ETC  = 1 << 3,
+};
+
+enum intel_miptree_upload_flags {
+   MIPTREE_UPLOAD_DEFAULT = 0,
+
+   /** Upload the miptree that holds the compressed pixels (Gen 7 only)
+*
+* On Gen7, sometimes we need to map the miptree that stores the
+* image data for the rendering and sometimes the miptree that holds
+* the compressed data. This flag is for the latter case.
+*/
+   MIPTREE_UPLOAD_ETC,
 };
 
 struct intel_mipmap_tree *intel_miptree_create(struct brw_context *brw,
diff --git a/src/mesa/drivers/dri/i965/intel_tex.c 
b/src/mesa/drivers/dri/i965/intel_tex.c
index 0650b6e629..54a0431265 100644
--- a/src/mesa/drivers/dri/i965/intel_tex.c
+++ b/src/mesa/drivers/dri/i965/intel_tex.c
@@ -66,6 +66,8 @@ intel_alloc_texture_image_buffer(struct gl_context *ctx,
struct intel_texture_image *intel_image = intel_texture_image(image);
struct gl_texture_object *texobj = image->TexObject;
struct intel_texture_object *intel_texobj = intel_texture_object(texobj);
+   struct gen_device_info *devinfo = >screen->devinfo;
+   mesa_format fmt = image->TexFormat;
 
assert(image->Border == 0);
 
@@ -110,6 +112,33 @@ intel_alloc_texture_image_buffer(struct gl_context *ctx,
   image->Width, image->Height, image->Depth, intel_image->mt);
}
 
+   if (devinfo->gen == 7 && _mesa_is_format_etc2(fmt)) {
+  if (intel_texobj->cmt &&
+  intel_miptree_match_image(intel_texobj->cmt, image)) {
+ intel_miptree_reference(&intel_image->cmt, intel_texobj->cmt);
+ DBG("%s: alloc obj %p

Re: [Mesa-dev] [Mesa-stable] [PATCH] radeon/vcn: fix mpeg4 msg buffer settings

2018-05-02 Thread Mark Janes

Leo Liu  writes:

> Reviewed-by: Leo Liu 
>
> And Cc Mesa-stable as well.

Please include stable tags in your commit message, eg:

Cc: 18.0 18.1 

> On 2018-04-24 04:49 PM, boyuan.zh...@amd.com wrote:
>> From: Boyuan Zhang 
>>
>> The previous bit-field assignments are incorrect and cause certain mpeg4
>> decodes to fail due to wrong flag values. This patch fixes these assignments.
>>
>> Signed-off-by: Boyuan Zhang 
>> ---
>>   src/gallium/drivers/radeon/radeon_vcn_dec.c | 18 +-
>>   1 file changed, 9 insertions(+), 9 deletions(-)
>>
>> diff --git a/src/gallium/drivers/radeon/radeon_vcn_dec.c 
>> b/src/gallium/drivers/radeon/radeon_vcn_dec.c
>> index f83e9e5..4bc922d 100644
>> --- a/src/gallium/drivers/radeon/radeon_vcn_dec.c
>> +++ b/src/gallium/drivers/radeon/radeon_vcn_dec.c
>> @@ -554,15 +554,15 @@ static rvcn_dec_message_mpeg4_asp_vld_t 
>> get_mpeg4_msg(struct radeon_decoder *dec
>>   
>>  result.vop_time_increment_resolution = 
>> pic->vop_time_increment_resolution;
>>   
>> -result.short_video_header |= pic->short_video_header << 0;
>> -result.interlaced |= pic->interlaced << 2;
>> -result.load_intra_quant_mat |= 1 << 3;
>> -result.load_nonintra_quant_mat |= 1 << 4;
>> -result.quarter_sample |= pic->quarter_sample << 5;
>> -result.complexity_estimation_disable |= 1 << 6;
>> -result.resync_marker_disable |= pic->resync_marker_disable << 7;
>> -result.newpred_enable |= 0 << 10; //
>> -result.reduced_resolution_vop_enable |= 0 << 11;
>> +result.short_video_header = pic->short_video_header;
>> +result.interlaced = pic->interlaced;
>> +result.load_intra_quant_mat = 1;
>> +result.load_nonintra_quant_mat = 1;
>> +result.quarter_sample = pic->quarter_sample;
>> +result.complexity_estimation_disable = 1;
>> +result.resync_marker_disable = pic->resync_marker_disable;
>> +result.newpred_enable = 0;
>> +result.reduced_resolution_vop_enable = 0;
>>   
>>  result.quant_type = pic->quant_type;
>>   
>
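
For anyone wondering why the shifted assignments in the quoted patch were wrong, here is a small sketch. It assumes the flags are 1-bit unsigned bit-fields (the real layout of the message struct is not shown in this thread), and `struct demo_flags` with its members is purely illustrative. Shifting a value before storing it into a 1-bit field pushes the bit out of the field, so the stored value is truncated to 0.

```c
/* Hypothetical flags struct; assumes each flag is a 1-bit bit-field. */
struct demo_flags {
   unsigned int a : 1;
   unsigned int b : 1;
   unsigned int c : 1;
};

/* Old style: shifts as if the struct were one packed word. */
struct demo_flags demo_set_with_shifts(unsigned a, unsigned b, unsigned c)
{
   struct demo_flags f = { 0, 0, 0 };

   f.a |= a << 0;   /* bit 0 still fits in the field */
   f.b |= b << 1;   /* bit 1 does not fit in 1 bit: truncated to 0 */
   f.c |= c << 2;   /* bit 2 does not fit in 1 bit: truncated to 0 */
   return f;
}

/* New style: the compiler places each field, so store the value as is. */
struct demo_flags demo_set_directly(unsigned a, unsigned b, unsigned c)
{
   struct demo_flags f = { 0, 0, 0 };

   f.a = a;
   f.b = b;
   f.c = c;
   return f;
}
```

Only the flag that was shifted by 0 survives in the old style, which lines up with the patch description that only some streams failed to decode.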

Re: [Mesa-dev] [PATCH 4/5] intel: decoder: fix starting dword of struct fields

2018-05-02 Thread Kenneth Graunke

On Wednesday, May 2, 2018 2:57:50 AM PDT Lionel Landwerlin wrote:
> On 02/05/18 07:44, Kenneth Graunke wrote:
> > On Tuesday, May 1, 2018 4:43:05 PM PDT Lionel Landwerlin wrote:
> >> Struct fields might span several dwords, but iter_dword is incremented
> >> up to the last dword of the current field before we print out the
> >> struct's fields. We can't use iter_dword for computing the offset into
> >> the pointer of data to decode.
> >>
> >> Signed-off-by: Lionel Landwerlin 
> >> ---
> >>   src/intel/common/gen_decoder.c | 2 +-
> >>   1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/src/intel/common/gen_decoder.c 
> >> b/src/intel/common/gen_decoder.c
> >> index 93fa4864ee3..a0a9634c5d9 100644
> >> --- a/src/intel/common/gen_decoder.c
> >> +++ b/src/intel/common/gen_decoder.c
> >> @@ -1064,7 +1064,7 @@ gen_print_group(FILE *outfile, struct gen_group 
> >> *group, uint64_t offset,
> >>if (iter.struct_desc) {
> >>   uint64_t struct_offset = offset + 4 * iter_dword;
> >>   gen_print_group(outfile, iter.struct_desc, struct_offset,
> >> -[iter_dword], iter.start_bit % 32, color);
> >> +[iter.start_bit / 32], iter.start_bit % 32, 
> >> color);
> >>}
> >> }
> >>  }
> >>
> > Does struct_offset need to use iter.start_bit / 32 then, too?
> Good catch!
> Thanks,
> 
> -
> Lionel

With uint64_t struct_offset = offset + 4 * (iter.start_bit / 32);
this would be:

Reviewed-by: Kenneth Graunke 
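
As a worked example of the arithmetic discussed above, a sketch with hypothetical helper names (none of these are gen_decoder functions): for a field that spans several dwords, both the data pointer index and the byte offset should be derived from the field's starting bit, not from an iterator that has already advanced to the field's last dword.

```c
#include <stdint.h>

/* Index of the dword in which a field starts. */
uint32_t demo_field_start_dword(uint32_t start_bit)
{
   return start_bit / 32;
}

/* Bit position of the field start within that dword. */
uint32_t demo_field_start_bit_in_dword(uint32_t start_bit)
{
   return start_bit % 32;
}

/* Index of the dword in which a field ends (end_bit is inclusive). */
uint32_t demo_field_last_dword(uint32_t end_bit)
{
   return end_bit / 32;
}

/* Byte offset of the struct, computed from the starting dword. */
uint64_t demo_struct_offset(uint64_t base, uint32_t start_bit)
{
   return base + 4 * (uint64_t)demo_field_start_dword(start_bit);
}
```

For a field covering bits 40 to 95, the start dword is 1 and the last dword is 2, so using the last dword would place the struct 4 bytes too far.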



[Mesa-dev] [PATCH 2/9] util/set: add a set_clear function

2018-05-02 Thread Scott D Phillips

Clear a set back to the state of having zero entries.
---
 src/util/set.c | 23 +++
 src/util/set.h |  3 +++
 2 files changed, 26 insertions(+)

diff --git a/src/util/set.c b/src/util/set.c
index d71f771807f..2c9b09319ff 100644
--- a/src/util/set.c
+++ b/src/util/set.c
@@ -155,6 +155,29 @@ _mesa_set_destroy(struct set *ht, void 
(*delete_function)(struct set_entry *entr
ralloc_free(ht);
 }
 
+/**
+ * Clears all values from the given set.
+ *
+ * If delete_function is passed, it gets called on each entry present before
+ * the set is cleared.
+ */
+void
+_mesa_set_clear(struct set *set, void (*delete_function)(struct set_entry 
*entry))
+{
+   struct set_entry *entry;
+
+   if (!set)
+  return;
+
+   set_foreach (set, entry) {
+  if (delete_function)
+ delete_function(entry);
+  entry->key = deleted_key;
+   }
+
+   set->entries = set->deleted_entries = 0;
+}
+
 /**
  * Finds a set entry with the given key and hash of that key.
  *
diff --git a/src/util/set.h b/src/util/set.h
index 9acd2c28c9f..06e79e15867 100644
--- a/src/util/set.h
+++ b/src/util/set.h
@@ -61,6 +61,9 @@ _mesa_set_create(void *mem_ctx,
 void
 _mesa_set_destroy(struct set *set,
   void (*delete_function)(struct set_entry *entry));
+void
+_mesa_set_clear(struct set *set,
+void (*delete_function)(struct set_entry *entry));
 
 struct set_entry *
 _mesa_set_add(struct set *set, const void *key);
-- 
2.14.3
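
For illustration, the contract of a "clear" operation (optional per-entry callback, then back to zero entries) can be sketched on a toy container. This is not the util/set implementation: `struct demo_set` is a hypothetical fixed-size array of pointers without hashing, and `demo_count_delete` exists only so the callback has something observable to do.

```c
#include <stddef.h>

#define DEMO_SET_SLOTS 8

/* Toy pointer set: NULL marks a free slot, no hashing involved. */
struct demo_set {
   const void *slots[DEMO_SET_SLOTS];
   size_t      entries;
};

/* Counter bumped by the example callback below. */
size_t demo_deleted_count;

void demo_count_delete(const void *key)
{
   (void)key;
   demo_deleted_count++;
}

/* Returns 1 if the key was inserted, 0 if it was present or the set is full. */
int demo_set_add(struct demo_set *set, const void *key)
{
   size_t free_slot = DEMO_SET_SLOTS;

   for (size_t i = 0; i < DEMO_SET_SLOTS; i++) {
      if (set->slots[i] == key)
         return 0;
      if (set->slots[i] == NULL && free_slot == DEMO_SET_SLOTS)
         free_slot = i;
   }
   if (free_slot == DEMO_SET_SLOTS)
      return 0;

   set->slots[free_slot] = key;
   set->entries++;
   return 1;
}

/* Calls delete_function (if any) on each entry, then empties the set. */
void demo_set_clear(struct demo_set *set, void (*delete_function)(const void *key))
{
   if (!set)
      return;

   for (size_t i = 0; i < DEMO_SET_SLOTS; i++) {
      if (set->slots[i] == NULL)
         continue;
      if (delete_function)
         delete_function(set->slots[i]);
      set->slots[i] = NULL;
   }
   set->entries = 0;
}
```

After a clear the container can be reused without reallocating, which is the use the later patches in this series make of it.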


[Mesa-dev] [PATCH 7/9] anv: use a separate pool for binding tables when soft pinning

2018-05-02 Thread Scott D Phillips

Soft pinning lets us satisfy the binding table address
requirements without using both sides of a growing state_pool.

If you do use both sides of a state pool, then you need to read
the state pool's center_bo_offset (with the device mutex held) to
know the final offset of relocations that target the state pool
bo.

By having a separate pool for binding tables that only grows in
the forward direction, the center_bo_offset is always 0 and
relocations don't need an update pass with
the mutex held.
---
 src/intel/vulkan/anv_batch_chain.c | 23 +++
 src/intel/vulkan/anv_device.c  | 28 +++-
 src/intel/vulkan/anv_private.h | 36 
 3 files changed, 78 insertions(+), 9 deletions(-)

diff --git a/src/intel/vulkan/anv_batch_chain.c 
b/src/intel/vulkan/anv_batch_chain.c
index 09514c7b84a..52f69045519 100644
--- a/src/intel/vulkan/anv_batch_chain.c
+++ b/src/intel/vulkan/anv_batch_chain.c
@@ -452,7 +452,7 @@ anv_cmd_buffer_surface_base_address(struct anv_cmd_buffer 
*cmd_buffer)
 {
struct anv_state *bt_block = u_vector_head(&cmd_buffer->bt_block_states);
return (struct anv_address) {
-  .bo = &cmd_buffer->device->surface_state_pool.block_pool.bo,
+  .bo = anv_binding_table_pool_bo(cmd_buffer->device),
   .offset = bt_block->offset,
};
 }
@@ -619,7 +619,8 @@ struct anv_state
 anv_cmd_buffer_alloc_binding_table(struct anv_cmd_buffer *cmd_buffer,
uint32_t entries, uint32_t *state_offset)
 {
-   struct anv_state_pool *state_pool = &cmd_buffer->device->surface_state_pool;
+   struct anv_device *device = cmd_buffer->device;
+   struct anv_state_pool *state_pool = &device->surface_state_pool;
struct anv_state *bt_block = u_vector_head(&cmd_buffer->bt_block_states);
struct anv_state state;
 
@@ -629,12 +630,18 @@ anv_cmd_buffer_alloc_binding_table(struct anv_cmd_buffer 
*cmd_buffer,
   return (struct anv_state) { 0 };
 
state.offset = cmd_buffer->bt_next;
-   state.map = state_pool->block_pool.map + bt_block->offset + state.offset;
+   state.map = anv_binding_table_pool_map(device) + bt_block->offset +
+  state.offset;
 
cmd_buffer->bt_next += state.alloc_size;
 
-   assert(bt_block->offset < 0);
-   *state_offset = -bt_block->offset;
+   if (device->use_separate_binding_table_pool) {
+  *state_offset = device->surface_state_pool.block_pool.offset -
+ device->binding_table_pool.block_pool.offset - bt_block->offset;
+   } else {
+  assert(bt_block->offset < 0);
+  *state_offset = -bt_block->offset;
+   }
 
return state;
 }
@@ -666,7 +673,7 @@ anv_cmd_buffer_new_binding_table_block(struct 
anv_cmd_buffer *cmd_buffer)
   return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
}
 
-   *bt_block = anv_state_pool_alloc_back(state_pool);
+   *bt_block = anv_binding_table_pool_alloc(cmd_buffer->device);
cmd_buffer->bt_next = 0;
 
return VK_SUCCESS;
@@ -740,7 +747,7 @@ anv_cmd_buffer_fini_batch_bo_chain(struct anv_cmd_buffer 
*cmd_buffer)
 {
struct anv_state *bt_block;
u_vector_foreach(bt_block, &cmd_buffer->bt_block_states)
-  anv_state_pool_free(&cmd_buffer->device->surface_state_pool, *bt_block);
+  anv_binding_table_pool_free(cmd_buffer->device, *bt_block);
u_vector_finish(&cmd_buffer->bt_block_states);
 
anv_reloc_list_finish(&cmd_buffer->surface_relocs, 
&cmd_buffer->pool->alloc);
@@ -772,7 +779,7 @@ anv_cmd_buffer_reset_batch_bo_chain(struct anv_cmd_buffer 
*cmd_buffer)
 
while (u_vector_length(&cmd_buffer->bt_block_states) > 1) {
   struct anv_state *bt_block = 
u_vector_remove(&cmd_buffer->bt_block_states);
-  anv_state_pool_free(&cmd_buffer->device->surface_state_pool, *bt_block);
+  anv_binding_table_pool_free(cmd_buffer->device, *bt_block);
}
assert(u_vector_length(&cmd_buffer->bt_block_states) == 1);
cmd_buffer->bt_next = 0;
diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index 2837d2f83ca..d1f57081b77 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -1637,9 +1637,32 @@ VkResult anv_CreateDevice(
if (result != VK_SUCCESS)
   goto fail_instruction_state_pool;
 
+   device->use_separate_binding_table_pool = physical_device->has_exec_softpin;
+
+   if (device->use_separate_binding_table_pool) {
+  result = anv_state_pool_init(&device->binding_table_pool, device, 4096,
+   bo_flags);
+  if (result != VK_SUCCESS)
+ goto fail_surface_state_pool;
+
+  if (device->surface_state_pool.block_pool.offset <
+  device->binding_table_pool.block_pool.offset) {
+
+ uint64_t tmp;
+ tmp = device->surface_state_pool.block_pool.offset;
+ device->surface_state_pool.block_pool.offset =
+device->binding_table_pool.block_pool.offset;
+ device->binding_table_pool.block_pool.offset = tmp;
+ tmp = device->surface_state_pool.block_pool.bo.offset;
+

[Mesa-dev] [PATCH 3/9] anv: remove unused field anv_queue::pool

2018-05-02 Thread Scott D Phillips

The last use of the field was removed in 2015 by commit 48a87f4ba06
("anv/queue: Get rid of the serial").
---
 src/intel/vulkan/anv_device.c  | 1 -
 src/intel/vulkan/anv_private.h | 2 --
 2 files changed, 3 deletions(-)

diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index 856035b8b91..c0cec175826 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -1268,7 +1268,6 @@ anv_queue_init(struct anv_device *device, struct 
anv_queue *queue)
 {
queue->_loader_data.loaderMagic = ICD_LOADER_MAGIC;
queue->device = device;
-   queue->pool = &device->surface_state_pool;
queue->flags = 0;
 }
 
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index d8b34b149e4..d043c77826e 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -838,8 +838,6 @@ struct anv_queue {
 
 struct anv_device * device;
 
-struct anv_state_pool * pool;
-
 VkDeviceQueueCreateFlagsflags;
 };
 
-- 
2.14.3


[Mesa-dev] [PATCH 8/9] anv: elide relocations to pinned target bos

2018-05-02 Thread Scott D Phillips

References to pinned bos won't need to be relocated, so just write the
final value of the reference into the bo. Add a `set` to the
relocation lists for tracking dependencies that were previously
tracked by relocations.
---
 src/intel/vulkan/anv_batch_chain.c | 36 
 src/intel/vulkan/anv_private.h |  3 +++
 2 files changed, 39 insertions(+)

diff --git a/src/intel/vulkan/anv_batch_chain.c 
b/src/intel/vulkan/anv_batch_chain.c
index 52f69045519..a4856376d8d 100644
--- a/src/intel/vulkan/anv_batch_chain.c
+++ b/src/intel/vulkan/anv_batch_chain.c
@@ -75,11 +75,24 @@ anv_reloc_list_init_clone(struct anv_reloc_list *list,
   return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
}
 
+   list->deps = _mesa_set_create(NULL, _mesa_hash_pointer,
+ _mesa_key_pointer_equal);
+
+   if (!list->deps) {
+  vk_free(alloc, list->relocs);
+  vk_free(alloc, list->reloc_bos);
+  return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
+   }
+
if (other_list) {
   memcpy(list->relocs, other_list->relocs,
  list->array_length * sizeof(*list->relocs));
   memcpy(list->reloc_bos, other_list->reloc_bos,
  list->array_length * sizeof(*list->reloc_bos));
+  struct set_entry *entry;
+  set_foreach(other_list->deps, entry) {
+ _mesa_set_add_pre_hashed(list->deps, entry->hash, entry->key);
+  }
}
 
return VK_SUCCESS;
@@ -98,6 +111,7 @@ anv_reloc_list_finish(struct anv_reloc_list *list,
 {
vk_free(alloc, list->relocs);
vk_free(alloc, list->reloc_bos);
+   _mesa_set_destroy(list->deps, NULL);
 }
 
 static VkResult
@@ -148,6 +162,11 @@ anv_reloc_list_add(struct anv_reloc_list *list,
struct drm_i915_gem_relocation_entry *entry;
int index;
 
+   if (target_bo->flags & EXEC_OBJECT_PINNED) {
+  _mesa_set_add(list->deps, target_bo);
+  return VK_SUCCESS;
+   }
+
VkResult result = anv_reloc_list_grow(list, alloc, 1);
if (result != VK_SUCCESS)
   return result;
@@ -185,6 +204,12 @@ anv_reloc_list_append(struct anv_reloc_list *list,
   list->relocs[i + list->num_relocs].offset += offset;
 
list->num_relocs += other->num_relocs;
+
+   struct set_entry *entry;
+   set_foreach(other->deps, entry) {
+  _mesa_set_add_pre_hashed(list->deps, entry->hash, entry->key);
+   }
+
return VK_SUCCESS;
 }
 
@@ -338,6 +363,7 @@ anv_batch_bo_start(struct anv_batch_bo *bbo, struct 
anv_batch *batch,
batch->end = bbo->bo.map + bbo->bo.size - batch_padding;
batch->relocs = &bbo->relocs;
bbo->relocs.num_relocs = 0;
+   _mesa_set_clear(bbo->relocs.deps, NULL);
 }
 
 static void
@@ -785,6 +811,7 @@ anv_cmd_buffer_reset_batch_bo_chain(struct anv_cmd_buffer 
*cmd_buffer)
cmd_buffer->bt_next = 0;
 
cmd_buffer->surface_relocs.num_relocs = 0;
+   _mesa_set_clear(cmd_buffer->surface_relocs.deps, NULL);
cmd_buffer->last_ss_pool_center = 0;
 
/* Reset the list of seen buffers */
@@ -1070,6 +1097,15 @@ anv_execbuf_add_bo(struct anv_execbuf *exec,
  if (result != VK_SUCCESS)
 return result;
   }
+
+  struct set_entry *entry;
+  set_foreach(relocs->deps, entry) {
+ VkResult result = anv_execbuf_add_bo(exec, entry->key, NULL,
+  extra_flags, alloc);
+
+ if (result != VK_SUCCESS)
+return result;
+  }
}
 
return VK_SUCCESS;
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index 81d50b3e770..3a448e41bae 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -46,7 +46,9 @@
 #include "blorp/blorp.h"
 #include "compiler/brw_compiler.h"
 #include "util/macros.h"
+#include "util/hash_table.h"
 #include "util/list.h"
+#include "util/set.h"
 #include "util/u_atomic.h"
 #include "util/u_vector.h"
 #include "util/vma.h"
@@ -1047,6 +1049,7 @@ struct anv_reloc_list {
uint32_t array_length;
struct drm_i915_gem_relocation_entry *   relocs;
struct anv_bo ** reloc_bos;
+   struct set * deps;
 };
 
 VkResult anv_reloc_list_init(struct anv_reloc_list *list,
-- 
2.14.3
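
The idea of this patch can be sketched in isolation as follows. This is a simplified illustration under stated assumptions, not the anv code: `demo_bo`, `demo_reloc_list` and `demo_reloc_add` are hypothetical, the dependency "set" is a small array with a linear duplicate check, and the sketch writes the address itself whereas in the patch the caller does that.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX 16

struct demo_bo {
   uint64_t offset;   /* current (or fixed, if pinned) GPU address */
   bool     pinned;   /* address will never change */
};

struct demo_reloc_list {
   const struct demo_bo *relocs[DEMO_MAX];  /* one entry per relocation */
   size_t                num_relocs;
   const struct demo_bo *deps[DEMO_MAX];    /* unique pinned targets */
   size_t                num_deps;
};

/* Add bo to the dependency array unless it is already there. */
static bool demo_deps_add(struct demo_reloc_list *list, const struct demo_bo *bo)
{
   for (size_t i = 0; i < list->num_deps; i++) {
      if (list->deps[i] == bo)
         return true;
   }
   if (list->num_deps == DEMO_MAX)
      return false;

   list->deps[list->num_deps++] = bo;
   return true;
}

/*
 * Write the target address into *slot. Pinned targets only become a
 * dependency; other targets get a relocation entry so the address can be
 * patched later. Returns false when the list is full.
 */
bool demo_reloc_add(struct demo_reloc_list *list, const struct demo_bo *target,
                    uint64_t delta, uint64_t *slot)
{
   *slot = target->offset + delta;

   if (target->pinned)
      return demo_deps_add(list, target);

   if (list->num_relocs == DEMO_MAX)
      return false;

   list->relocs[list->num_relocs++] = target;
   return true;
}
```

The dependency list still matters at submit time, since every bo referenced by a batch has to be handed to the kernel even when no relocation points at it.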


[Mesa-dev] [PATCH 9/9] anv: soft pin the remaining bos

2018-05-02 Thread Scott D Phillips

---
 src/intel/vulkan/anv_allocator.c   | 16 +++-
 src/intel/vulkan/anv_batch_chain.c | 27 +--
 src/intel/vulkan/anv_device.c  | 32 
 src/intel/vulkan/anv_private.h | 16 
 src/intel/vulkan/anv_queue.c   |  2 +-
 src/intel/vulkan/genX_blorp_exec.c |  6 ++
 src/intel/vulkan/genX_cmd_buffer.c | 26 +-
 src/intel/vulkan/genX_query.c  |  6 ++
 8 files changed, 102 insertions(+), 29 deletions(-)

diff --git a/src/intel/vulkan/anv_allocator.c b/src/intel/vulkan/anv_allocator.c
index fa4e7d74ac7..d6d065283a2 100644
--- a/src/intel/vulkan/anv_allocator.c
+++ b/src/intel/vulkan/anv_allocator.c
@@ -1001,6 +1001,7 @@ anv_bo_pool_finish(struct anv_bo_pool *pool)
  struct bo_pool_bo_link link_copy = VG_NOACCESS_READ(link);
 
  anv_gem_munmap(link_copy.bo.map, link_copy.bo.size);
+ anv_vma_free(pool->device, &link_copy.bo);
  anv_gem_close(pool->device, link_copy.bo.gem_handle);
  link = link_copy.next;
   }
@@ -1040,11 +1041,15 @@ anv_bo_pool_alloc(struct anv_bo_pool *pool, struct 
anv_bo *bo, uint32_t size)
 
new_bo.flags = pool->bo_flags;
 
+   if (!anv_vma_alloc(pool->device, &new_bo))
+  return vk_error(VK_ERROR_OUT_OF_DEVICE_MEMORY);
+
assert(new_bo.size == pow2_size);
 
new_bo.map = anv_gem_mmap(pool->device, new_bo.gem_handle, 0, pow2_size, 0);
if (new_bo.map == MAP_FAILED) {
   anv_gem_close(pool->device, new_bo.gem_handle);
+  anv_vma_free(pool->device, &new_bo);
   return vk_error(VK_ERROR_MEMORY_MAP_FAILED);
}
 
@@ -1088,8 +1093,10 @@ anv_scratch_pool_finish(struct anv_device *device, 
struct anv_scratch_pool *pool
for (unsigned s = 0; s < MESA_SHADER_STAGES; s++) {
   for (unsigned i = 0; i < 16; i++) {
  struct anv_scratch_bo *bo = &pool->bos[i][s];
- if (bo->exists > 0)
+ if (bo->exists > 0) {
+anv_vma_free(device, &bo->bo);
 anv_gem_close(device, bo->bo.gem_handle);
+ }
   }
}
 }
@@ -1187,6 +1194,11 @@ anv_scratch_pool_alloc(struct anv_device *device, struct 
anv_scratch_pool *pool,
if (device->instance->physicalDevice.has_exec_async)
   bo->bo.flags |= EXEC_OBJECT_ASYNC;
 
+   if (device->instance->physicalDevice.has_exec_softpin)
+  bo->bo.flags |= EXEC_OBJECT_PINNED;
+
+   anv_vma_alloc(device, &bo->bo);
+
/* Set the exists last because it may be read by other threads */
__sync_synchronize();
bo->exists = true;
@@ -1406,6 +1418,8 @@ anv_bo_cache_release(struct anv_device *device,
if (bo->bo.map)
   anv_gem_munmap(bo->bo.map, bo->bo.size);
 
+   anv_vma_free(device, bo);
+
anv_gem_close(device, bo->bo.gem_handle);
 
/* Don't unlock until we've actually closed the BO.  The whole point of
diff --git a/src/intel/vulkan/anv_batch_chain.c 
b/src/intel/vulkan/anv_batch_chain.c
index a4856376d8d..60b749e0405 100644
--- a/src/intel/vulkan/anv_batch_chain.c
+++ b/src/intel/vulkan/anv_batch_chain.c
@@ -430,6 +430,7 @@ anv_batch_bo_list_clone(const struct list_head *list,
 struct list_head *new_list)
 {
VkResult result = VK_SUCCESS;
+   struct anv_device *device = cmd_buffer->device;
 
list_inithead(new_list);
 
@@ -448,8 +449,14 @@ anv_batch_bo_list_clone(const struct list_head *list,
   * as it will always be the last relocation in the list.
   */
  uint32_t last_idx = prev_bbo->relocs.num_relocs - 1;
- assert(prev_bbo->relocs.reloc_bos[last_idx] == &bbo->bo);
- prev_bbo->relocs.reloc_bos[last_idx] = &new_bbo->bo;
+ if (last_idx == -1) {
+write_reloc(device, prev_bbo->bo.map + prev_bbo->length -
+(device->info.gen >= 8 ? 8 : 4), new_bbo->bo.offset,
+false);
+ } else {
+assert(prev_bbo->relocs.reloc_bos[last_idx] == &bbo->bo);
+prev_bbo->relocs.reloc_bos[last_idx] = &new_bbo->bo;
+ }
   }
 
   prev_bbo = new_bbo;
@@ -1148,22 +1155,6 @@ anv_cmd_buffer_process_relocs(struct anv_cmd_buffer 
*cmd_buffer,
   list->relocs[i].target_handle = list->reloc_bos[i]->index;
 }
 
-static void
-write_reloc(const struct anv_device *device, void *p, uint64_t v, bool flush)
-{
-   unsigned reloc_size = 0;
-   if (device->info.gen >= 8) {
-  reloc_size = sizeof(uint64_t);
-  *(uint64_t *)p = canonical_address(v);
-   } else {
-  reloc_size = sizeof(uint32_t);
-  *(uint32_t *)p = v;
-   }
-
-   if (flush && !device->info.has_llc)
-  gen_flush_range(p, reloc_size);
-}
-
 static void
 adjust_relocations_from_state_pool(struct anv_state_pool *pool,
struct anv_reloc_list *relocs,
diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index d1f57081b77..11c23363746 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -1325,6 +1325,11 @@ anv_device_init_trivial_batch(struct

[Mesa-dev] [PATCH 5/9] anv: Add vma_heap allocators in anv_device

2018-05-02 Thread Scott D Phillips

These will be used to assign virtual addresses to soft pinned
buffers in a later patch.
---
 src/intel/vulkan/anv_device.c  | 75 ++
 src/intel/vulkan/anv_private.h | 11 +++
 2 files changed, 86 insertions(+)

diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index c0cec175826..d3d9c779d62 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -369,6 +369,8 @@ anv_physical_device_init(struct anv_physical_device *device,
device->has_exec_async = anv_gem_get_param(fd, I915_PARAM_HAS_EXEC_ASYNC);
device->has_exec_capture = anv_gem_get_param(fd, 
I915_PARAM_HAS_EXEC_CAPTURE);
device->has_exec_fence = anv_gem_get_param(fd, I915_PARAM_HAS_EXEC_FENCE);
+   device->has_exec_softpin = anv_gem_get_param(fd, 
I915_PARAM_HAS_EXEC_SOFTPIN)
+  && device->supports_48bit_addresses;
device->has_syncobj = anv_gem_get_param(fd, 
I915_PARAM_HAS_EXEC_FENCE_ARRAY);
device->has_syncobj_wait = device->has_syncobj &&
   anv_gem_supports_syncobj_wait(fd);
@@ -1527,6 +1529,26 @@ VkResult anv_CreateDevice(
   goto fail_fd;
}
 
+   if (physical_device->has_exec_softpin) {
+  if (pthread_mutex_init(&device->vma_mutex, NULL) != 0) {
+ result = vk_error(VK_ERROR_INITIALIZATION_FAILED);
+ goto fail_fd;
+  }
+
+  /* keep the page with address zero out of the allocator */
+  util_vma_heap_init(&device->vma_lo, 4096, (1ull << 32) - 2 * 4096);
+  device->vma_lo_available =
+ physical_device->memory.heaps[physical_device->memory.heap_count - 
1].size;
+
+  /* Leave the last 4GiB out of the high vma range, so that no state base
+   * address + size can overflow 48 bits.
+   */
+  util_vma_heap_init(&device->vma_hi, (1ull << 32) + 4096,
+ (1ull << 48) - 2 * (1ull << 32) - 2 * 4096);
+  device->vma_hi_available = physical_device->memory.heap_count == 1 ? 0 :
+ physical_device->memory.heaps[0].size;
+   }
+
/* As per spec, the driver implementation may deny requests to acquire
 * a priority above the default priority (MEDIUM) if the caller does not
 * have sufficient privileges. In this scenario VK_ERROR_NOT_PERMITTED_EXT
@@ -1887,6 +1909,59 @@ VkResult anv_DeviceWaitIdle(
return anv_device_submit_simple_batch(device, &batch);
 }
 
+bool
+anv_vma_alloc(struct anv_device *device, struct anv_bo *bo)
+{
+   if (!(bo->flags & EXEC_OBJECT_PINNED))
+  return true;
+
+   pthread_mutex_lock(&device->vma_mutex);
+
+   bo->offset = 0;
+
+   if (bo->flags & EXEC_OBJECT_SUPPORTS_48B_ADDRESS &&
+   device->vma_hi_available >= bo->size) {
+  uint64_t addr = util_vma_heap_alloc(&device->vma_hi, bo->size, 4096);
+  if (addr) {
+ bo->offset = canonical_address(addr);
+ device->vma_hi_available -= bo->size;
+  }
+   }
+
+   if (bo->offset == 0 && device->vma_lo_available >= bo->size) {
+  uint64_t addr = util_vma_heap_alloc(&device->vma_lo, bo->size, 4096);
+  if (addr) {
+ bo->offset = canonical_address(addr);
+ device->vma_lo_available -= bo->size;
+  }
+   }
+
+   pthread_mutex_unlock(&device->vma_mutex);
+
+   return bo->offset != 0;
+}
+
+void
+anv_vma_free(struct anv_device *device, struct anv_bo *bo)
+{
+   if (!(bo->flags & EXEC_OBJECT_PINNED))
+  return;
+
+   pthread_mutex_lock(&device->vma_mutex);
+
+   if (bo->offset >= 1ull << 32) {
+  util_vma_heap_free(&device->vma_hi, bo->offset, bo->size);
+  device->vma_hi_available += bo->size;
+   } else {
+  util_vma_heap_free(&device->vma_lo, bo->offset, bo->size);
+  device->vma_lo_available += bo->size;
+   }
+
+   pthread_mutex_unlock(&device->vma_mutex);
+
+   bo->offset = 0;
+}
+
 VkResult
 anv_bo_init_new(struct anv_bo *bo, struct anv_device *device, uint64_t size)
 {
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index 761601d1e37..708c3a540d3 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -49,6 +49,7 @@
 #include "util/list.h"
 #include "util/u_atomic.h"
 #include "util/u_vector.h"
+#include "util/vma.h"
 #include "vk_alloc.h"
 #include "vk_debug_report.h"
 
@@ -802,6 +803,7 @@ struct anv_physical_device {
 bool has_exec_async;
 bool has_exec_capture;
 bool has_exec_fence;
+bool has_exec_softpin;
 bool has_syncobj;
 bool has_syncobj_wait;
 bool has_context_priority;
@@ -898,6 +900,12 @@ struct anv_device {
 struct anv_device_extension_table   enabled_extensions;
 struct anv_dispatch_table   dispatch;
 
+pthread_mutex_t vma_mutex;
+    struct util_vma_heap vma_lo;
+struct util_vma_heap
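
As a sanity check on the two heap ranges set up in anv_CreateDevice above, the arithmetic can be worked through in a few lines of C. This is only a sketch: the struct and helper names are hypothetical and not part of the patch; only the start and size values are taken from the two util_vma_heap_init() calls.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type for illustration; not part of the patch. */
struct sketch_range {
   uint64_t start;
   uint64_t end; /* exclusive */
};

/* Low heap: start and size as passed to util_vma_heap_init() for vma_lo. */
static struct sketch_range
sketch_vma_lo_range(void)
{
   const uint64_t start = 4096;
   const uint64_t size = (1ull << 32) - 2 * 4096;
   return (struct sketch_range){ start, start + size };
}

/* High heap: start and size as passed to util_vma_heap_init() for vma_hi. */
static struct sketch_range
sketch_vma_hi_range(void)
{
   const uint64_t start = (1ull << 32) + 4096;
   const uint64_t size = (1ull << 48) - 2 * (1ull << 32) - 2 * 4096;
   return (struct sketch_range){ start, start + size };
}
```

With these values the low heap covers [4 KiB, 4 GiB - 4 KiB) and the high heap covers [4 GiB + 4 KiB, 2^48 - 4 GiB - 4 KiB). The two ranges do not overlap, which is consistent with anv_vma_free choosing the heap by comparing the offset against 1ull << 32.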

[Mesa-dev] [PATCH 6/9] anv: soft pin state pools

2018-05-02 Thread Scott D Phillips

The state_pools reserve virtual address space of the full
BLOCK_POOL_MEMFD_SIZE, but maintain the current behavior of
growing from the middle.
---
 src/intel/vulkan/anv_allocator.c | 25 +
 src/intel/vulkan/anv_device.c| 13 +
 src/intel/vulkan/anv_private.h   |  2 ++
 3 files changed, 36 insertions(+), 4 deletions(-)

diff --git a/src/intel/vulkan/anv_allocator.c b/src/intel/vulkan/anv_allocator.c
index 642e1618c10..fa4e7d74ac7 100644
--- a/src/intel/vulkan/anv_allocator.c
+++ b/src/intel/vulkan/anv_allocator.c
@@ -250,6 +250,27 @@ anv_block_pool_init(struct anv_block_pool *pool,
 
pool->device = device;
pool->bo_flags = bo_flags;
+
+   if (bo_flags & EXEC_OBJECT_PINNED) {
+  pool->offset = 0;
+
+  pthread_mutex_lock(&device->vma_mutex);
+
+  if (bo_flags & EXEC_OBJECT_SUPPORTS_48B_ADDRESS)
+ pool->offset = util_vma_heap_alloc(&device->vma_hi,
+BLOCK_POOL_MEMFD_SIZE, 4096);
+
+  if (!pool->offset)
+ pool->offset = util_vma_heap_alloc(&device->vma_lo,
+BLOCK_POOL_MEMFD_SIZE, 4096);
+
+  pthread_mutex_unlock(&device->vma_mutex);
+
+  if (!pool->offset)
+ return vk_error(VK_ERROR_OUT_OF_DEVICE_MEMORY);
+
+  pool->offset = canonical_address(pool->offset);
+   }
anv_bo_init(&pool->bo, 0, 0);
 
pool->fd = memfd_create("block pool", MFD_CLOEXEC);
@@ -402,6 +423,10 @@ anv_block_pool_expand_range(struct anv_block_pool *pool,
 * hard work for us.
 */
anv_bo_init(&pool->bo, gem_handle, size);
+   if (pool->bo_flags & EXEC_OBJECT_PINNED) {
+  pool->bo.offset = pool->offset + BLOCK_POOL_MEMFD_CENTER -
+ center_bo_offset;
+   }
pool->bo.flags = pool->bo_flags;
pool->bo.map = map;
 
diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index d3d9c779d62..2837d2f83ca 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -1613,12 +1613,17 @@ VkResult anv_CreateDevice(
if (result != VK_SUCCESS)
   goto fail_batch_bo_pool;
 
-   /* For the state pools we explicitly disable 48bit. */
-   bo_flags = (physical_device->has_exec_async ? EXEC_OBJECT_ASYNC : 0) |
-  (physical_device->has_exec_capture ? EXEC_OBJECT_CAPTURE : 0);
+   if (physical_device->has_exec_softpin)
+  bo_flags |= EXEC_OBJECT_PINNED;
+   else
+  bo_flags &= ~EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
 
+   /* dynamic_state_pool needs to stay in the same 4GiB as index and
+* vertex buffers. For rationale, see the comment in
+* anv_physical_device_init_heaps.
+*/
result = anv_state_pool_init(&device->dynamic_state_pool, device, 16384,
-bo_flags);
+bo_flags & ~EXEC_OBJECT_SUPPORTS_48B_ADDRESS);
if (result != VK_SUCCESS)
   goto fail_bo_cache;
 
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index 708c3a540d3..23527eebaab 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -582,6 +582,8 @@ struct anv_block_pool {
 
struct anv_bo bo;
 
+   uint64_t offset;
+
/* The offset from the start of the bo to the "center" of the block
 * pool.  Pointers to allocated blocks are given by
 * bo.map + center_bo_offset + offsets.
-- 
2.14.3


[Mesa-dev] [PATCH 4/9] anv: move canonical_address calculation into a separate function

2018-05-02 Thread Scott D Phillips

A later patch will make use of this in other places. Also, remove
dependency on undefined behavior of left-shifting a signed value.
---
 src/intel/vulkan/anv_batch_chain.c | 12 +---
 src/intel/vulkan/anv_private.h | 15 +++
 2 files changed, 16 insertions(+), 11 deletions(-)

diff --git a/src/intel/vulkan/anv_batch_chain.c 
b/src/intel/vulkan/anv_batch_chain.c
index e083d79d35b..09514c7b84a 100644
--- a/src/intel/vulkan/anv_batch_chain.c
+++ b/src/intel/vulkan/anv_batch_chain.c
@@ -1110,18 +1110,8 @@ write_reloc(const struct anv_device *device, void *p, 
uint64_t v, bool flush)
 {
unsigned reloc_size = 0;
if (device->info.gen >= 8) {
-  /* From the Broadwell PRM Vol. 2a, MI_LOAD_REGISTER_MEM::MemoryAddress:
-   *
-   *"This field specifies the address of the memory location where the
-   *register value specified in the DWord above will read from. The
-   *address specifies the DWord location of the data. Range =
-   *GraphicsVirtualAddress[63:2] for a DWord register GraphicsAddress
-   *[63:48] are ignored by the HW and assumed to be in correct
-   *canonical form [63:48] == [47]."
-   */
-  const int shift = 63 - 47;
   reloc_size = sizeof(uint64_t);
-  *(uint64_t *)p = (((int64_t)v) << shift) >> shift;
+  *(uint64_t *)p = canonical_address(v);
} else {
   reloc_size = sizeof(uint32_t);
   *(uint32_t *)p = v;
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index d043c77826e..761601d1e37 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -155,6 +155,21 @@ align_i32(int32_t v, int32_t a)
return (v + a - 1) & ~(a - 1);
 }
 
+static inline uint64_t
+canonical_address(uint64_t v) {
+   /* From the Broadwell PRM Vol. 2a, MI_LOAD_REGISTER_MEM::MemoryAddress:
+*
+*"This field specifies the address of the memory location where the
+*register value specified in the DWord above will read from. The
+*address specifies the DWord location of the data. Range =
+*GraphicsVirtualAddress[63:2] for a DWord register GraphicsAddress
+*[63:48] are ignored by the HW and assumed to be in correct
+*canonical form [63:48] == [47]."
+*/
+   const int shift = 63 - 47;
+   return (int64_t)(v << shift) >> shift;
+}
+
 /** Alignment must be a power of 2. */
 static inline bool
 anv_is_aligned(uintmax_t n, uintmax_t a)
-- 
2.14.3
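
The canonicalization in the patch above amounts to sign-extending bit 47 into bits 63:48. A minimal standalone sketch of the same idea follows; the function name is made up for illustration. The left shift is done on the unsigned value to avoid the undefined behavior the commit message mentions, while the right shift of a negative signed value is implementation-defined in C and is assumed here to be arithmetic, as it is with GCC and Clang.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical standalone version of the helper, for illustration only. */
static inline uint64_t
sketch_canonical_address(uint64_t v)
{
   const int shift = 63 - 47;

   /* Shift left as unsigned so bit 47 lands in bit 63 without signed
    * overflow, then shift right as signed so that bit is replicated
    * into bits 63:48 (assumes an arithmetic right shift).
    */
   return (uint64_t)((int64_t)(v << shift) >> shift);
}
```

An address with bit 47 clear comes back unchanged, while one with bit 47 set comes back with its top 16 bits filled with ones.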


[Mesa-dev] [PATCH 1/9] util: Add a virtual memory allocator

2018-05-02 Thread Scott D Phillips

From: Jason Ekstrand 

This is a simple linear-walk first-fit allocator roughly based on the
allocator in the radeon winsys code.  This allocator has two primary
functional differences:

 1) It cleanly returns 0 on allocation failure

 2) It allocates addresses top-down instead of bottom-up.

The second one is needed for Intel because high addresses (with bit 47
set) need to be canonicalized in order to work properly.  If we allocate
bottom-up, then high addresses will be very rare (if they ever happen).
We'd rather always have high addresses so that the canonicalization code
gets better testing.
---
 src/util/Makefile.sources |   4 +-
 src/util/meson.build  |   2 +
 src/util/vma.c| 231 ++
 src/util/vma.h|  53 +++
 4 files changed, 289 insertions(+), 1 deletion(-)
 create mode 100644 src/util/vma.c
 create mode 100644 src/util/vma.h

diff --git a/src/util/Makefile.sources b/src/util/Makefile.sources
index 104ecae8ed3..534520ce763 100644
--- a/src/util/Makefile.sources
+++ b/src/util/Makefile.sources
@@ -56,7 +56,9 @@ MESA_UTIL_FILES := \
u_string.h \
u_thread.h \
u_vector.c \
-   u_vector.h
+   u_vector.h \
+   vma.c \
+   vma.h
 
 MESA_UTIL_GENERATED_FILES = \
format_srgb.c
diff --git a/src/util/meson.build b/src/util/meson.build
index eece1cefef6..14660e0fa0c 100644
--- a/src/util/meson.build
+++ b/src/util/meson.build
@@ -81,6 +81,8 @@ files_mesa_util = files(
   'u_thread.h',
   'u_vector.c',
   'u_vector.h',
+  'vma.c',
+  'vma.h',
 )
 
 install_data('drirc', install_dir : get_option('sysconfdir'))
diff --git a/src/util/vma.c b/src/util/vma.c
new file mode 100644
index 00000000000..0d4e097e21f
--- /dev/null
+++ b/src/util/vma.c
@@ -0,0 +1,231 @@
+/*
+ * Copyright © 2018 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include <stdlib.h>
+
+#include "util/u_math.h"
+#include "util/vma.h"
+
+struct util_vma_hole {
+   struct list_head link;
+   uint64_t offset;
+   uint64_t size;
+};
+
+#define util_vma_foreach_hole(_hole, _heap) \
+   list_for_each_entry(struct util_vma_hole, _hole, &(_heap)->holes, link)
+
+#define util_vma_foreach_hole_safe(_hole, _heap) \
+   list_for_each_entry_safe(struct util_vma_hole, _hole, &(_heap)->holes, link)
+
+void
+util_vma_heap_init(struct util_vma_heap *heap,
+   uint64_t start, uint64_t size)
+{
+   list_inithead(&heap->holes);
+   util_vma_heap_free(heap, start, size);
+}
+
+void
+util_vma_heap_finish(struct util_vma_heap *heap)
+{
+   util_vma_foreach_hole_safe(hole, heap)
+  free(hole);
+}
+
+static void
+util_vma_heap_validate(struct util_vma_heap *heap)
+{
+   uint64_t prev_offset = 0;
+   util_vma_foreach_hole(hole, heap) {
+  assert(hole->offset > 0);
+  assert(hole->size > 0);
+
+  if (&hole->link == heap->holes.next) {
+ /* This must be the top-most hole.  Assert that, if it overflows, it
+  * overflows to 0, i.e. 2^64.
+  */
+ assert(hole->size + hole->offset == 0 ||
+hole->size + hole->offset > hole->offset);
+  } else {
+ /* This is not the top-most hole so it must not overflow and, in
+  * fact, must be strictly lower than the top-most hole.  If
+  * hole->size + hole->offset == prev_offset, then we failed to join
+  * holes during a util_vma_heap_free.
+  */
+ assert(hole->size + hole->offset > hole->offset &&
+hole->size + hole->offset < prev_offset);
+  }
+  prev_offset = hole->offset;
+   }
+}
+
+uint64_t
+util_vma_heap_alloc(struct util_vma_heap *heap,
+uint64_t size, uint64_t alignment)
+{
+   /* The caller is expected to reject zero-size allocations */
+   assert(size > 0);
+
+   assert(util_is_power_of_two_nonzero(alignment));
+
+
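
The patch text is cut off above before the body of util_vma_heap_alloc. As a rough illustration of the top-down first-fit idea described in the commit message, here is a heavily simplified sketch. It is not the code from the patch: the types and names are hypothetical, holes live in a small fixed array sorted from highest address to lowest rather than in a linked list, and any alignment slack above an allocation is simply dropped instead of being kept as a new hole.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical simplified free range; not the struct from the patch. */
struct sketch_hole {
   uint64_t offset;
   uint64_t size;
};

/* Holes are assumed sorted from highest address to lowest, and no hole
 * is assumed to contain address 0 or to reach past 2^64.
 */
struct sketch_heap {
   struct sketch_hole holes[8];
   size_t count;
};

/* Returns the allocated address, or 0 on failure. */
static uint64_t
sketch_alloc_top_down(struct sketch_heap *heap, uint64_t size,
                      uint64_t alignment)
{
   assert(size > 0);
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);

   for (size_t i = 0; i < heap->count; i++) {
      struct sketch_hole *hole = &heap->holes[i];
      if (size > hole->size)
         continue;

      /* Highest aligned start address that still fits in this hole. */
      uint64_t start = (hole->offset + hole->size - size) & ~(alignment - 1);
      if (start < hole->offset)
         continue;

      /* Shrink the hole to the part below the allocation.  A real
       * allocator would also track any slack left above it.
       */
      hole->size = start - hole->offset;
      return start;
   }

   return 0;
}
```

Because the first hole visited is the highest one and the allocation is carved from its top, the first addresses handed out are the highest available, which is the behavior the commit message asks for so that canonicalization gets exercised.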

[Mesa-dev] [PATCH 0/9] anv: softpin

2018-05-02 Thread Scott D Phillips

This series teaches anv how to pick its own virtual graphics addresses
instead of using the relocation facility provided by the kernel.

Jason Ekstrand (1):
  util: Add a virtual memory allocator

Scott D Phillips (8):
  util/set: add a set_clear function
  anv: remove unused field anv_queue::pool
  anv: move canonical_address calculation into a separate function
  anv: Add vma_heap allocators in anv_device
  anv: soft pin state pools
  anv: use a separate pool for binding tables when soft pinning
  anv: elide relocations to pinned target bos
  anv: soft pin the remaining bos

 src/intel/vulkan/anv_allocator.c   |  41 ++-
 src/intel/vulkan/anv_batch_chain.c |  96 +--
 src/intel/vulkan/anv_device.c  | 143 +--
 src/intel/vulkan/anv_private.h |  85 +-
 src/intel/vulkan/anv_queue.c   |   2 +-
 src/intel/vulkan/genX_blorp_exec.c |   6 +
 src/intel/vulkan/genX_cmd_buffer.c |  26 -
 src/intel/vulkan/genX_query.c  |   6 +
 src/util/Makefile.sources  |   4 +-
 src/util/meson.build   |   2 +
 src/util/set.c |  23 
 src/util/set.h |   3 +
 src/util/vma.c | 231 +
 src/util/vma.h |  53 +
 14 files changed, 668 insertions(+), 53 deletions(-)
 create mode 100644 src/util/vma.c
 create mode 100644 src/util/vma.h

-- 
2.14.3


Re: [Mesa-dev] [PATCH 2/3] intel: Give the batch decoder a callback to ask about state size.

2018-05-02 Thread Lionel Landwerlin


On 02/05/18 16:45, Kenneth Graunke wrote:

On Wednesday, May 2, 2018 3:52:22 AM PDT Lionel Landwerlin wrote:

On 02/05/18 06:50, Kenneth Graunke wrote:

Given an arbitrary batch, we don't always know what the size of certain
things are, such as how many entries are in a binding table.  But it's
easy for the driver to track that information, so with a simple callback
we can calculate this correctly for INTEL_DEBUG=bat.
---
   src/intel/common/gen_batch_decoder.c | 23 +++
   src/intel/common/gen_decoder.h   |  4 
   src/intel/tools/aubinator.c  |  2 +-
   src/intel/tools/aubinator_error_decode.c |  2 +-
   4 files changed, 25 insertions(+), 6 deletions(-)

diff --git a/src/intel/common/gen_batch_decoder.c 
b/src/intel/common/gen_batch_decoder.c
index c6b908758b2..37eac1ab2a1 100644
--- a/src/intel/common/gen_batch_decoder.c
+++ b/src/intel/common/gen_batch_decoder.c
@@ -33,11 +33,13 @@ gen_batch_decode_ctx_init(struct gen_batch_decode_ctx *ctx,
 const char *xml_path,
 struct gen_batch_decode_bo (*get_bo)(void *,
  uint64_t),
+  unsigned (*get_state_size)(void *, uint32_t),
 void *user_data)
   {
  memset(ctx, 0, sizeof(*ctx));
   
  ctx->get_bo = get_bo;

+   ctx->get_state_size = get_state_size;
  ctx->user_data = user_data;
  ctx->fp = fp;
  ctx->flags = flags;
@@ -103,6 +105,21 @@ ctx_get_bo(struct gen_batch_decode_ctx *ctx, uint64_t addr)
  return bo;
   }
   
+static int

+update_count(struct gen_batch_decode_ctx *ctx,
+ uint32_t offset_from_dsba,
+ unsigned element_dwords,
+ unsigned guess)
+{
+   unsigned size = ctx->get_state_size(ctx->user_data, offset_from_dsba);

You probably want to handle the case where ctx->get_state_size might be NULL?

Eep, yes...thanks!  I'd done that at first, but botched it in a
refactor...clearly I hadn't actually tested the aubinator tools...

I've changed it to:

-   unsigned size = ctx->get_state_size(ctx->user_data, offset_from_dsba);
+   unsigned size = 0;
+
+   if (ctx->get_state_size)
+  size = ctx->get_state_size(ctx->user_data, offset_from_dsba);


Thanks,

Reviewed-by: Lionel Landwerlin 

Re: [Mesa-dev] [PATCH v2 1/3] mesa: actually support compat profile creation with MESA_GL_VERSION_OVERRIDE

2018-05-02 Thread Timothy Arceri




On 03/05/18 01:41, Emil Velikov wrote:

On 2 May 2018 at 11:27, Timothy Arceri  wrote:

Since this has gone unnoticed for a while, it proves to be subtle. Add
some commit message elaborating on the issue/solution.


---
  src/mesa/drivers/dri/common/dri_util.c | 9 +
  1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/common/dri_util.c 
b/src/mesa/drivers/dri/common/dri_util.c
index 7cb6248b130..d72f72d0756 100644
--- a/src/mesa/drivers/dri/common/dri_util.c
+++ b/src/mesa/drivers/dri/common/dri_util.c
@@ -389,10 +389,11 @@ driCreateContextAttribs(__DRIscreen *screen, int api,
  screen->max_gl_compat_version < 31)
 mesa_api = API_OPENGL_CORE;

-if (mesa_api == API_OPENGL_COMPAT
-&& ((ctx_config.major_version > 3)
-|| (ctx_config.major_version == 3 &&
-ctx_config.minor_version >= 2))) {
+if (mesa_api == API_OPENGL_COMPAT &&
+((ctx_config.major_version > 3) || (ctx_config.major_version == 3 &&
+ctx_config.minor_version >= 2)) &&
+!((ctx_config.major_version * 10 + ctx_config.minor_version) <=
+  screen->max_gl_compat_version)) {


Unless I'm misreading it, this seems to do the opposite of what the
commit message says.
Namely it causes an error out when the major/minor (overridden or not)
is greater than the max supported one.

In other words the code 'restricts', while the summary implies 'allow'.


The existing code hard-codes a limit of 3.1. This change ignores the 
restriction if we have an environment var with a higher gl version.
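
To make the disagreement above concrete, the new condition can be isolated into a small predicate. This is a sketch only: the function name is invented, and it assumes, based on the expression in the patch, that max_gl_compat_version is encoded as major * 10 + minor.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical predicate mirroring the patched condition for a
 * compatibility-profile request: true means the request is rejected.
 */
static bool
sketch_compat_version_rejected(unsigned major, unsigned minor,
                               unsigned max_gl_compat_version)
{
   const bool is_3_2_or_later = major > 3 || (major == 3 && minor >= 2);
   const bool within_max = major * 10 + minor <= max_gl_compat_version;

   return is_3_2_or_later && !within_max;
}
```

With max_gl_compat_version at 31, a 3.2 or 4.5 compat request is still rejected, as it was before the patch. If the maximum is raised to 45, a 4.5 compat request is no longer rejected, which is the "allow" case the reply describes.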





-Emil

Re: [Mesa-dev] [PATCH 2/3] intel: Give the batch decoder a callback to ask about state size.

2018-05-02 Thread Kenneth Graunke

On Wednesday, May 2, 2018 3:52:22 AM PDT Lionel Landwerlin wrote:
> On 02/05/18 06:50, Kenneth Graunke wrote:
> > Given an arbitrary batch, we don't always know what the size of certain
> > things are, such as how many entries are in a binding table.  But it's
> > easy for the driver to track that information, so with a simple callback
> > we can calculate this correctly for INTEL_DEBUG=bat.
> > ---
> >   src/intel/common/gen_batch_decoder.c | 23 +++
> >   src/intel/common/gen_decoder.h   |  4 
> >   src/intel/tools/aubinator.c  |  2 +-
> >   src/intel/tools/aubinator_error_decode.c |  2 +-
> >   4 files changed, 25 insertions(+), 6 deletions(-)
> >
> > diff --git a/src/intel/common/gen_batch_decoder.c 
> > b/src/intel/common/gen_batch_decoder.c
> > index c6b908758b2..37eac1ab2a1 100644
> > --- a/src/intel/common/gen_batch_decoder.c
> > +++ b/src/intel/common/gen_batch_decoder.c
> > @@ -33,11 +33,13 @@ gen_batch_decode_ctx_init(struct gen_batch_decode_ctx 
> > *ctx,
> > const char *xml_path,
> > struct gen_batch_decode_bo (*get_bo)(void *,
> >  uint64_t),
> > +  unsigned (*get_state_size)(void *, uint32_t),
> > void *user_data)
> >   {
> >  memset(ctx, 0, sizeof(*ctx));
> >   
> >  ctx->get_bo = get_bo;
> > +   ctx->get_state_size = get_state_size;
> >  ctx->user_data = user_data;
> >  ctx->fp = fp;
> >  ctx->flags = flags;
> > @@ -103,6 +105,21 @@ ctx_get_bo(struct gen_batch_decode_ctx *ctx, uint64_t 
> > addr)
> >  return bo;
> >   }
> >   
> > +static int
> > +update_count(struct gen_batch_decode_ctx *ctx,
> > + uint32_t offset_from_dsba,
> > + unsigned element_dwords,
> > + unsigned guess)
> > +{
> > +   unsigned size = ctx->get_state_size(ctx->user_data, offset_from_dsba);
> 
> You probably want to handle the case where ctx->get_state_size might be NULL?

Eep, yes...thanks!  I'd done that at first, but botched it in a
refactor...clearly I hadn't actually tested the aubinator tools...

I've changed it to:

-   unsigned size = ctx->get_state_size(ctx->user_data, offset_from_dsba);
+   unsigned size = 0;
+
+   if (ctx->get_state_size)
+  size = ctx->get_state_size(ctx->user_data, offset_from_dsba);




Re: [Mesa-dev] [PATCH v2 1/3] mesa: actually support compat profile creation with MESA_GL_VERSION_OVERRIDE

2018-05-02 Thread Emil Velikov

On 2 May 2018 at 11:27, Timothy Arceri  wrote:

Since this has gone unnoticed for a while, it proves to be subtle. Add
some commit message elaborating on the issue/solution.

> ---
>  src/mesa/drivers/dri/common/dri_util.c | 9 +
>  1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/common/dri_util.c 
> b/src/mesa/drivers/dri/common/dri_util.c
> index 7cb6248b130..d72f72d0756 100644
> --- a/src/mesa/drivers/dri/common/dri_util.c
> +++ b/src/mesa/drivers/dri/common/dri_util.c
> @@ -389,10 +389,11 @@ driCreateContextAttribs(__DRIscreen *screen, int api,
>  screen->max_gl_compat_version < 31)
> mesa_api = API_OPENGL_CORE;
>
> -if (mesa_api == API_OPENGL_COMPAT
> -&& ((ctx_config.major_version > 3)
> -|| (ctx_config.major_version == 3 &&
> -ctx_config.minor_version >= 2))) {
> +if (mesa_api == API_OPENGL_COMPAT &&
> +((ctx_config.major_version > 3) || (ctx_config.major_version == 3 &&
> +ctx_config.minor_version >= 2)) 
> &&
> +!((ctx_config.major_version * 10 + ctx_config.minor_version) <=
> +  screen->max_gl_compat_version)) {

Unless I'm misreading it, this seems to do the opposite of what the
commit message says.
Namely it causes an error out when the major/minor (overridden or not)
is greater than the max supported one.

In other words the code 'restricts', while the summary implies 'allow'.

-Emil

Re: [Mesa-dev] [PATCH v2 06/18] intel/compiler: fix brw_imm_w for negative 16-bit integers

2018-05-02 Thread Chema Casanova



El 01/05/18 a las 01:22, Jason Ekstrand escribió:
> On Mon, Apr 30, 2018 at 3:53 PM, Chema Casanova wrote:
> 
> 
> 
> On 30/04/18 23:12, Jason Ekstrand wrote:
> > On Mon, Apr 30, 2018 at 7:18 AM, Iago Toral Quiroga wrote:
> > 
> >     From: Jose Maria Casanova Crespo
> >
> >     16-bit immediates need to replicate the 16-bit immediate value
> >     in both words of the 32-bit value. This needs to be careful
> >     to avoid sign-extension, which the previous implementation was
> >     not handling properly.
> >
> >     For example, with the previous implementation, storing the value
> >     -3 would generate imm.d = 0xfffd due to signed integer sign
> >     -3 would generate imm.d = 0xfffffffd due to signed integer sign
> >     unsigned, which gives us the correct result: imm.ud = 0xfffdfffd.
> >
> >     We only had a couple of cases hitting this path in the driver
> >     until now, one with value -1, which would work since all bits are
> >     one in this case, and another with value -2 in brw_clip_tri(),
> >     which would hit the aforementioned issue (this case only affects
> >     gen4 although we are not aware of whether this was causing an
> >     actual bug somewhere).
> >     ---
> >      src/intel/compiler/brw_reg.h | 2 +-
> >      1 file changed, 1 insertion(+), 1 deletion(-)
> >
> >     diff --git a/src/intel/compiler/brw_reg.h
> b/src/intel/compiler/brw_reg.h
> >     index dff9b970b2..0084a78af6 100644
> >     --- a/src/intel/compiler/brw_reg.h
> >     +++ b/src/intel/compiler/brw_reg.h
> >     @@ -705,7 +705,7 @@ static inline struct brw_reg
> >      brw_imm_w(int16_t w)
> >      {
> >         struct brw_reg imm = brw_imm_reg(BRW_REGISTER_TYPE_W);
> >     -   imm.d = w | (w << 16);
> >     +   imm.ud = (uint16_t)w | ((uint16_t)w << 16);
> 
> > Uh... Is this cast right?  Doing a << 16 on a 16-bit data type should
> > yield undefined results.  I think you want a (uint32_t) cast.
> 
> In my test code it was working at least with GCC, I think it is because
> at the end we have an integer promotion for any type with lower rank
> than int.
> 
> "Formally, the rule says (C11 6.3.1.1):
> 
>     If an int can represent all values of the original type (as
> restricted by the width, for a bit-field), the value is converted to an
> int; otherwise, it is converted to an unsigned int. These are called the
> integer promotions."
> 
> But I agree that is clearer if we just use (uint32_t).
> I can change also the brw_imm_uw case that has the same issue.
> 
> 
> Yeah, best to make it clear. :-)

I was wrong: we can't just replace the (uint16_t) cast with (uint32_t),
because converting a signed short to uint32_t sign-extends the value.
Sign extension happens whenever the source type is signed, regardless of
the signedness of the destination type.

For example, with w = -2 (0xfffe):

imm.ud = (uint32_t)w | (uint32_t)w << 16;

becomes 0xfffffffe.

The alternatives I found that give the correct result are:

imm.ud = (uint32_t) w & 0xffff | (uint32_t)w << 16;

Or:

uint16_t value = w;
imm.ud = (uint32_t)value | (uint32_t)value << 16;

Or something like:

imm.ud = (uint32_t)(uint16_t)w | ((uint32_t)(uint16_t)w << 16);

Any preference?

Chema
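
The difference between the casts discussed above can be checked with a short standalone sketch. The function names here are invented for illustration and this is not the brw_imm_w code itself; it only reproduces the integer conversions being compared.

```c
#include <assert.h>
#include <stdint.h>

/* Direct cast to uint32_t: the int16_t value is sign-extended first, so
 * the upper half is already all ones for negative inputs.
 */
static inline uint32_t
sketch_replicate_w_naive(int16_t w)
{
   return (uint32_t)w | (uint32_t)w << 16;
}

/* Going through uint16_t first truncates to the low 16 bits, so the
 * value is zero-extended before being copied into the high word.
 */
static inline uint32_t
sketch_replicate_w(int16_t w)
{
   const uint32_t lo = (uint16_t)w;
   return lo | (lo << 16);
}
```

For w = -2 the naive version yields 0xfffffffe, while the version that goes through uint16_t yields 0xfffefffe, the replicated pattern that the commit message describes as correct.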

[Mesa-dev] [Bug 106039] Undefined version strings in pc files with meson build

2018-05-02 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=106039

Juan A. Suarez  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #2 from Juan A. Suarez  ---
18.0.1 was released, so closing this.


[Mesa-dev] [PATCH] st/vdpau: allow progressive video surface with interop

2018-05-02 Thread Leo Liu

mpv now does interop with the video surface instead of the output surface
as it did previously, so it fails with "vlVdpVideoSurfaceDMABuf". This is
fine for Mesa GL, since the code path falls back to
"vlVdpVideoSurfaceGallium", but that is not the case for others.

Signed-off-by: Leo Liu 
Cc: Christian König 
Cc: "18.1 18.0" 
---
 src/gallium/state_trackers/vdpau/surface.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/vdpau/surface.c 
b/src/gallium/state_trackers/vdpau/surface.c
index 012d303641..d63e761350 100644
--- a/src/gallium/state_trackers/vdpau/surface.c
+++ b/src/gallium/state_trackers/vdpau/surface.c
@@ -513,12 +513,15 @@ VdpStatus vlVdpVideoSurfaceDMABuf(VdpVideoSurface surface,
}
 
/* Check if surface match interop requirements */
-   if (p_surf->video_buffer == NULL || !p_surf->video_buffer->interlaced ||
+   if (p_surf->video_buffer == NULL ||
p_surf->video_buffer->buffer_format != PIPE_FORMAT_NV12) {
   mtx_unlock(&p_surf->device->mutex);
   return VDP_STATUS_NO_IMPLEMENTATION;
}
 
+   if (!p_surf->video_buffer->interlaced)
+  plane >>= 1;
+
surf = p_surf->video_buffer->get_surfaces(p_surf->video_buffer)[plane];
if (!surf) {
   mtx_unlock(&p_surf->device->mutex);
-- 
2.14.1
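
The index adjustment in the last hunk can be read as the following standalone sketch. It assumes the interop plane index counts four entries for an NV12 surface (luma top, luma bottom, chroma top, chroma bottom) and that a progressive buffer exposes one surface per plane; that layout and the helper name are assumptions for illustration, not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of the patch: map a four-entry interop
 * plane index onto the surface array of a video buffer. */
static uint32_t interop_surface_index(uint32_t plane, bool interlaced)
{
   /* Assumed layout: 0 = luma top,   1 = luma bottom,
    *                 2 = chroma top, 3 = chroma bottom. */
   if (!interlaced)
      plane >>= 1; /* both fields share a single surface per plane */
   return plane;
}

/* Small self-check of the mapping for both buffer kinds. */
static void check_interop_surface_index(void)
{
   /* Interlaced buffers keep the index unchanged. */
   assert(interop_surface_index(3, true) == 3);
   /* Progressive buffers collapse field pairs: 0,1 -> 0 and 2,3 -> 1. */
   assert(interop_surface_index(1, false) == 0);
   assert(interop_surface_index(2, false) == 1);
}
```

With this mapping a caller that always asks for four planes still receives a valid surface from a progressive buffer, at the cost of getting the same surface for both fields.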


[Mesa-dev] [Bug 106355] Strange artifacts when using Chromium with hardware acceleration

2018-05-02 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=106355

Michel Dänzer  changed:

           What    |Removed                        |Added
----------------------------------------------------------------------------
         QA Contact|mesa-dev@lists.freedesktop.org |dri-devel@lists.freedesktop.org
          Component|Mesa core                      |Drivers/Gallium/radeonsi
           Assignee|mesa-dev@lists.freedesktop.org |dri-devel@lists.freedesktop.org


[Mesa-dev] [Bug 106355] Strange artifacts when using Chromium with hardware acceleration

2018-05-02 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=106355

            Bug ID: 106355
           Summary: Strange artifacts when using Chromium with hardware
                    acceleration
           Product: Mesa
           Version: git
          Hardware: x86-64 (AMD64)
                OS: Linux (All)
            Status: NEW
          Severity: normal
          Priority: medium
         Component: Mesa core
          Assignee: mesa-dev@lists.freedesktop.org
          Reporter: klarnorb...@gmail.com
        QA Contact: mesa-dev@lists.freedesktop.org

Created attachment 139276
  --> https://bugs.freedesktop.org/attachment.cgi?id=139276&action=edit
screenshot

OS: Kubuntu 18.04.0 LTS 64bit
VGA: MSI RX 480 8GB
Mesa: Mesa 18.2.0-devel

I am getting these strange artifacts (see attachment) when using Chromium
with hardware acceleration enabled (tried with three different versions of
the browser).

After changing tabs in the browser, the artifacts disappear, but they come
back after some time.

