date:20181113

Re: [Mesa-dev] [PATCH 2/2] util/ralloc: Make sizeof(linear_header) a multiple of 8

2018-11-13 Thread Gustaw Smolarczyk

Tue, 13 Nov 2018, 06:03: Matt Turner wrote:

> On Mon, Nov 12, 2018 at 3:07 PM Eric Anholt  wrote:
> >
> > Matt Turner  writes:
> >
> > > Prior to this patch sizeof(linear_header) was 20 bytes in a
> > > non-debug build on 32-bit platforms. We do some pointer arithmetic to
> > > calculate the next available location with
> > >
> > > >    ptr = (linear_size_chunk *)((char *)&latest[1] + latest->offset);
> > > >
> > > > in linear_alloc_child(). The &latest[1] adds 20 bytes, so an allocation
> > > > would only be 4-byte aligned.
> > >
> > > > On 32-bit SPARC a 'sttw' instruction (which stores a consecutive pair of
> > > > 4-byte registers to memory) requires an 8-byte aligned address. Such an
> > > instruction is used to store to an 8-byte integer type, like intmax_t
> > > which is used in glcpp's expression_value_t struct.
> > >
> > > As a result of the 4-byte alignment returned by linear_alloc_child() we
> > > would generate a SIGBUS (unaligned exception) on SPARC.
> > >
> > > > According to the GNU libc manual malloc() always returns memory that has
> > > > at least an alignment of 8-bytes [1]. I think our allocator should do
> > > > the same.
> > >
> > > So, simple fix with two parts:
> > >
> > > >    (1) Increase SUBALLOC_ALIGNMENT to 8 unconditionally.
> > > >    (2) Mark linear_header with an aligned attribute, which will cause
> > > >        its sizeof to be rounded up to that alignment. (We already do
> > > >        this for ralloc_header)
> > >
> > > With this done, all Mesa's unit tests now pass on SPARC.
> > >
> > > [1]
> https://www.gnu.org/software/libc/manual/html_node/Aligned-Memory-Blocks.html
> > >
> > > Fixes: 47e17586924f ("glcpp: use the linear allocator for most
> objects")
> > > Bug: https://bugs.gentoo.org/636326
> > > ---
> > > >  src/util/ralloc.c | 14 ++++++++++++--
> > >  1 file changed, 12 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/src/util/ralloc.c b/src/util/ralloc.c
> > > index 745b4cf1226..fc35661996d 100644
> > > --- a/src/util/ralloc.c
> > > +++ b/src/util/ralloc.c
> > > @@ -552,10 +552,18 @@ ralloc_vasprintf_rewrite_tail(char **str, size_t
> *start, const char *fmt,
> > >   */
> > >
> > >  #define MIN_LINEAR_BUFSIZE 2048
> > > -#define SUBALLOC_ALIGNMENT sizeof(uintptr_t)
> > > +#define SUBALLOC_ALIGNMENT 8
> > >  #define LMAGIC 0x87b9c7d3
> > >
> > > -struct linear_header {
> > > +struct
> > > +#ifdef _MSC_VER
> > > + __declspec(align(8))
> > > +#elif defined(__LP64__)
> > > + __attribute__((aligned(16)))
> > > +#else
> > > + __attribute__((aligned(8)))
> > > +#endif
> > > +   linear_header {
> > >  #ifndef NDEBUG
> > > unsigned magic;   /* for debugging */
> > >  #endif
> > > @@ -647,6 +655,8 @@ linear_alloc_child(void *parent, unsigned size)
> > > > ptr = (linear_size_chunk *)((char*)&latest[1] + latest->offset);
> > > > ptr->size = size;
> > > > latest->offset += full_size;
> > > > +
> > > > +   assert((uintptr_t)&ptr[1] % SUBALLOC_ALIGNMENT == 0);
> > > > return &ptr[1];
> > >  }
> >
> > These patches are:
> >
> > Reviewed-by: Eric Anholt 
>
> Thanks a bunch! I hope this is useful for you on arm as well.
>
> > However, shouldn't we also bump SUBALLOC_ALIGNMENT to 16 on LP64, too,
> > if that's what glibc is doing for malloc?
>
> 16-byte alignment is necessary for SSE aligned vector load/store
> instructions. I suppose we're not getting any vectorized SSE
> load/store instructions to memory allocated by linear_alloc_* and
> that's why we haven't seen problems?
>

FWIW, at least clang on x86 assumes malloc/new return pointers aligned to
16 bytes, though it probably doesn't detect linear_alloc_* as such.

Regards,
Gustaw Smolarczyk


> Seems reasonable to bump it to 16-bytes.
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
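
A minimal sketch of the two parts of the fix discussed in this thread, assuming a GCC or Clang style compiler. The names `plain_header`, `padded_header` and `align_up` are hypothetical stand-ins for `linear_header` and the allocator's size rounding; they are not the Mesa definitions. It only illustrates why a 20-byte header leaves the following allocation 4-byte aligned, and how the aligned attribute changes that.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SUBALLOC_ALIGNMENT 8

/* Hypothetical stand-in for a 20-byte header (five 32-bit fields). */
struct plain_header {
   uint32_t fields[5];
};

/* Same members, but the aligned attribute pads sizeof to a multiple of 8. */
struct __attribute__((aligned(SUBALLOC_ALIGNMENT))) padded_header {
   uint32_t fields[5];
};

/* Round size up to the next multiple of SUBALLOC_ALIGNMENT (a power of two). */
static size_t
align_up(size_t size)
{
   return (size + SUBALLOC_ALIGNMENT - 1) & ~(size_t)(SUBALLOC_ALIGNMENT - 1);
}

/* Byte offset of the first allocation placed directly after each header,
 * i.e. what (char *)&header[1] - (char *)header evaluates to. */
static size_t
first_alloc_offset_plain(void)
{
   return sizeof(struct plain_header);
}

static size_t
first_alloc_offset_padded(void)
{
   return sizeof(struct padded_header);
}
```

With the plain header the first allocation starts at offset 20, which is only 4-byte aligned; with the padded header it starts at offset 24. Raising SUBALLOC_ALIGNMENT to 16, as suggested for LP64, would apply the same arithmetic with a larger constant.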

Re: [Mesa-dev] [radeonsi] Blender/vsraytrace/fsraytrace/gsraytrace - GPUShader: compile error

2018-11-13 Thread Richard Biener


SR#648671

On Tue, 13 Nov 2018, Dieter Nützel  wrote:

> GREAT hint Tim!
> 
> Yes, of course.
> 
> /home/dieter> gcc --version
> gcc (SUSE Linux) 8.2.1 20181025 [gcc-8-branch revision 265488]
> 
> So I have to ping SUSE to push the fix, too.
> 
> Thanks a lot.
> 
> Dieter
> 
> On 12.11.2018 08:28, Timothy Arceri wrote:
> > I'm guessing you're using GCC 8.2.1 to compile Mesa? There was a compiler bug:
> > 
> > https://bugzilla.redhat.com/show_bug.cgi?id=1645400
> > 
> > On 12/11/18 2:11 pm, Dieter Nützel wrote:
> > > Hello,
> > > 
> > > I get broken shaders with Blender, and the above demos no longer
> > > start.
> > > 
> > > NOT NIR related.
> > > Have to start bisect.
> > > 
> > > OpenGL renderer string: Radeon RX 580 Series (POLARIS10, DRM 3.27.0,
> > > 4.19.0-rc1-1.g7262353-default+, LLVM 8.0.0)
> > > OpenGL core profile version string: 4.5 (Core Profile) Mesa 19.0.0-devel
> > > (git-590fcb50e7)
> > > OpenGL core profile shading language version string: 4.50
> > > 
> > > mesa-demos/glsl> blender
> > > Read prefs: /home/dieter/.config/blender/2.79/config/userpref.blend
> > > libGL: Can't open configuration file /usr/local/etc/drirc: No such file or
> > > directory.
> > > libGL: Can't open configuration file /usr/local/etc/drirc: No such file or
> > > directory.
> > > libGL: pci id for fd 16: 1002:67df, driver radeonsi
> > > libGL: OpenDriver: trying /usr/local/lib64/dri/tls/radeonsi_dri.so
> > > libGL: OpenDriver: trying /usr/local/lib64/dri/radeonsi_dri.so
> > > libGL: Can't open configuration file /usr/local/etc/drirc: No such file or
> > > directory.
> > > libGL: Can't open configuration file /usr/local/etc/drirc: No such file or
> > > directory.
> > > libGL: Can't open configuration file /usr/local/etc/drirc: No such file or
> > > directory.
> > > usr/share/libdrm/amdgpu.ids version: 1.0.0
> > > libGL: Using DRI3 for screen 0
> > > Read blend: /data/Blender/BMW3v2.blend
> > > 2.66 versioning fix: replacing black sky with premultiplied alpha for
> > > scene Scene
> > > Read blend: /data/Blender/BMW27GE.blend
> > > GPUShader: compile error:
> > > 0:1177(22): error: invalid input layout qualifier used
> > > [-]
> > > 
> > > Read blend: /data/Blender/BMW27.blend
> > > skipping driver '100*power', automatic scripts are disabled
> > > skipping driver '-100*power', automatic scripts are disabled
> > > skipping driver '-90*brake', automatic scripts are disabled
> > > skipping driver '90*brake', automatic scripts are disabled
> > > libGL: Can't open configuration file /usr/local/etc/drirc: No such file or
> > > directory.
> > > libGL: Can't open configuration file /usr/local/etc/drirc: No such file or
> > > directory.
> > > libGL: Can't open configuration file /usr/local/etc/drirc: No such file or
> > > directory.
> > > libGL: Can't open configuration file /usr/local/etc/drirc: No such file or
> > > directory.
> > > Read blend: /data/Blender/sanisidro.blend
> > > Read blend: /data/Blender/bh.blend
> > > Info: Read library:  '/projeto.blend', '//../../projeto.blend', parent
> > > ''
> > > Warning: Cannot find lib '/projeto.blend'
> > > Warning: LIB: Group: 'Projeto' missing from '/projeto.blend', parent
> > > ''
> > > Info: Read library:  '/projeto.blend', '//../../projeto.blend', parent
> > > ''
> > > Warning: Unable to open '/projeto.blend': No such file or directory
> > > Warning: Cannot find lib '/projeto.blend'
> > > Warning: LIB: Group: 'Projeto' missing from '/projeto.blend', parent
> > > ''
> > > 
> > > GPUShader: compile error:
> > > 0:1177(22): error: invalid input layout qualifier used
> > > [-]
> > > 
> > > mesa-demos/glsl> ./fsraytrace
> > > libGL: Can't open configuration file /usr/local/etc/drirc: No such file or
> > > directory.
> > > libGL: Can't open configuration file /usr/local/etc/drirc: No such file or
> > > directory.
> > > libGL: pci id for fd 4: 1002:67df, driver radeonsi
> > > libGL: OpenDriver: trying /usr/local/lib64/dri/tls/radeonsi_dri.so
> > > libGL: OpenDriver: trying /usr/local/lib64/dri/radeonsi_dri.so
> > > libGL: Can't open configuration file /usr/local/etc/drirc: No such file or
> > > directory.
> > > libGL: Can't open configuration file /usr/local/etc/drirc: No such file or
> > > directory.
> > > libGL: Can't open configuration file /usr/local/etc/drirc: No such file or
> > > directory.
> > > usr/share/libdrm/amdgpu.ids version: 1.0.0
> > > libGL: Using DRI3 for screen 0
> > > Error: problem compiling shader: 0:48(2): error: invalid input layout
> > > qualifier used
> > > 
> > > Same with 'vsraytrace' and 'gsraytrace'.
> > > 
> > > Thanks,
> > > Dieter
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)

Re: [Mesa-dev] [PATCH 1/3 v2] glsl: prevent qualifiers modification of predeclared variables

2018-11-13 Thread andrey simiklit

Hello,

Thanks a lot for the review.

Regards,
Andrii.


On Sat, Nov 10, 2018 at 5:38 AM Timothy Arceri wrote:

> Nice! Series is:
>
> Reviewed-by: Timothy Arceri 
>
> On 10/10/18 9:07 am, Ian Romanick wrote:
> > From: Ian Romanick 
> >
> > Section 3.7 (Identifiers) of the GLSL spec says:
> >
> >  However, as noted in the specification, there are some cases where
> >  previously declared variables can be redeclared to change or add
> >  some property, and predeclared "gl_" names are allowed to be
> >  redeclared in a shader only for these specific purposes.  More
> >  generally, it is an error to redeclare a variable, including those
> >  starting "gl_".
> >
> > This patch should fix piglit tests:
> > clip-distance-redeclare-without-inout.frag
> > clip-distance-redeclare-without-inout.vert
> >
> > However, this causes a regression in
> > clip-distance-out-values.shader_test.  A fix for that test has been sent
> > to the piglit list for review:
> >
> >  https://patchwork.freedesktop.org/patch/255201/
> >
> > As far as I understand from the following mailing list thread:
> > https://lists.freedesktop.org/archives/piglit/2013-October/007935.html
> > it looks like we agreed to remove the ability to change qualifiers
> > but have not done so yet, unless I missed something.
> >
> > v2 (idr): Move 'earlier->data.mode != var->data.mode' test much earlier
> > in the function.  Add special handling for gl_LastFragData.
> >
> > Signed-off-by: Andrii Simiklit 
> > Signed-off-by: Ian Romanick 
> > ---
> >   src/compiler/glsl/ast_to_hir.cpp | 51
> +---
> >   1 file changed, 27 insertions(+), 24 deletions(-)
> >
> > diff --git a/src/compiler/glsl/ast_to_hir.cpp b/src/compiler/glsl/ast_to_hir.cpp
> > index 1082d6c91cf..2e4c9ef6776 100644
> > --- a/src/compiler/glsl/ast_to_hir.cpp
> > +++ b/src/compiler/glsl/ast_to_hir.cpp
> > @@ -4238,6 +4238,29 @@ get_variable_being_redeclared(ir_variable **var_ptr, YYLTYPE loc,
> >
> >  *is_redeclaration = true;
> >
> > +   if (earlier->data.how_declared == ir_var_declared_implicitly) {
> > +  /* Verify that the redeclaration of a built-in does not change the
> > +   * storage qualifier.  There are a couple special cases.
> > +   *
> > +   * 1. Some built-in variables that are defined as 'in' in the
> > +   *specification are implemented as system values.  Allow
> > +   *ir_var_system_value -> ir_var_shader_in.
> > +   *
> > +   * 2. gl_LastFragData is implemented as a ir_var_shader_out, but the
> > +   *specification requires that redeclarations omit any qualifier.
> > +   *Allow ir_var_shader_out -> ir_var_auto for this one variable.
> > +   */
> > +  if (earlier->data.mode != var->data.mode &&
> > +  !(earlier->data.mode == ir_var_system_value &&
> > +var->data.mode == ir_var_shader_in) &&
> > +  !(strcmp(var->name, "gl_LastFragData") == 0 &&
> > +var->data.mode == ir_var_auto)) {
> > + _mesa_glsl_error(&loc, state,
> > +  "redeclaration cannot change qualification of `%s'",
> > +  var->name);
> > +  }
> > +   }
> > +
> >  /* From page 24 (page 30 of the PDF) of the GLSL 1.50 spec,
> >   *
> >   * "It is legal to declare an array without a size and then
> > @@ -4246,11 +4269,6 @@ get_variable_being_redeclared(ir_variable **var_ptr, YYLTYPE loc,
> >   */
> >  if (earlier->type->is_unsized_array() && var->type->is_array()
> >  && (var->type->fields.array == earlier->type->fields.array)) {
> > -  /* FINISHME: This doesn't match the qualifiers on the two
> > -   * FINISHME: declarations.  It's not 100% clear whether this is
> > -   * FINISHME: required or not.
> > -   */
> > -
> > const int size = var->type->array_size();
> > check_builtin_array_max_size(var->name, size, loc, state);
> > if ((size > 0) && (size <= earlier->data.max_array_access)) {
> > @@ -4342,28 +4360,13 @@ get_variable_being_redeclared(ir_variable **var_ptr, YYLTYPE loc,
> > earlier->data.precision = var->data.precision;
> > earlier->data.memory_coherent = var->data.memory_coherent;
> >
> > -   } else if (earlier->data.how_declared == ir_var_declared_implicitly &&
> > -  state->allow_builtin_variable_redeclaration) {
> > +   } else if ((earlier->data.how_declared == ir_var_declared_implicitly &&
> > +   state->allow_builtin_variable_redeclaration) ||
> > +  allow_all_redeclarations) {
> > /* Allow verbatim redeclarations of built-in variables. Not explicitly
> >  * valid, but some applications do it.
> >  */
> > -  if (earlier->data.mode != var->data.mode &&
> > -  !(earlier->data.mode == ir_var_system_value &&
> > -var->data.mode == ir_var_shader_in)) {
> > - _mesa_glsl_error(&loc, state,
> > -

Re: [Mesa-dev] [PATCH 12/15] anv: introduce helper to resolve vk_format from anv_format

2018-11-13 Thread Tapani Pälli




On 11/6/18 3:01 PM, Lionel Landwerlin wrote:
> We could touch the macros in anv_formats.c to include VkFormat in
> anv_format if that makes your life easier.


Yep, this makes sense. I'll add VkFormat there.
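
A rough sketch of the two options in this exchange: recovering the format from an entry's position in its table (what the patch quoted below does) versus storing the value in the entry itself (Lionel's suggestion). The names `fmt_entry`, `fmt_table` and both helper functions are hypothetical and not the anv definitions. Note that pointer arithmetic on a table counts elements, not bytes, so the last valid entry is computed from the element count.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical table entry; vk_format is only needed for the stored variant. */
struct fmt_entry {
   int vk_format;
   int planes;
};

static const struct fmt_entry fmt_table[] = {
   { 0, 0 }, { 1, 1 }, { 2, 1 }, { 3, 2 },
};

/* Option 1: recover the index from the entry's position in the table.
 * Returns -1 when the pointer does not lie inside the table. The range
 * check goes through uintptr_t so that unrelated pointers can be compared. */
static int
fmt_index_from_position(const struct fmt_entry *entry)
{
   const struct fmt_entry *first = fmt_table;
   const struct fmt_entry *last = fmt_table + ARRAY_SIZE(fmt_table) - 1;
   uintptr_t p = (uintptr_t)entry;

   if (p >= (uintptr_t)first && p <= (uintptr_t)last)
      return (int)(entry - first);

   return -1;
}

/* Option 2: the entry carries its own format value. */
static int
fmt_index_stored(const struct fmt_entry *entry)
{
   return entry->vk_format;
}
```

The stored variant costs one extra field per entry but needs no knowledge of which table an entry came from, which is presumably why it was preferred here.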


On 30/10/2018 05:26, Tapani Pälli wrote:

Signed-off-by: Tapani Pälli 
---
  src/intel/vulkan/anv_formats.c | 18 ++++++++++++++++++
  src/intel/vulkan/anv_private.h |  3 +++
  2 files changed, 21 insertions(+)

diff --git a/src/intel/vulkan/anv_formats.c b/src/intel/vulkan/anv_formats.c
index 1d3b1f67928..166b50f5a07 100644
--- a/src/intel/vulkan/anv_formats.c
+++ b/src/intel/vulkan/anv_formats.c
@@ -405,6 +405,24 @@ anv_get_format(VkFormat vk_format)
 return format;
  }
+VkFormat
+anv_get_vkformat(const struct anv_format *format)
+{
+#define LAST_FORMAT(table) table + sizeof(table) - sizeof(struct anv_format)
+
+   const struct anv_format *last_main = LAST_FORMAT(main_formats);
+   const struct anv_format *last_ycbcr = LAST_FORMAT(ycbcr_formats);
+
+#undef LAST_FORMAT
+
+   if (format >= main_formats && format <= last_main)
+  return format - main_formats;
+   else if (format >= ycbcr_formats && format <= last_ycbcr)
+  return format - ycbcr_formats;
+
+   return VK_FORMAT_UNDEFINED;
+}
+
  /**
   * Exactly one bit must be set in \a aspect.
   */
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index 882de030ae0..bfdb711337e 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -2634,6 +2634,9 @@ anv_plane_to_aspect(VkImageAspectFlags image_aspects,
  const struct anv_format *
  anv_get_format(VkFormat format);
+VkFormat
+anv_get_vkformat(const struct anv_format *format);
+
  static inline uint32_t
  anv_get_format_planes(VkFormat vk_format)
  {





Re: [Mesa-dev] [PATCH v3] intel/decoder: tools: Use engine for decoding batch instructions

2018-11-13 Thread Lionel Landwerlin


I forgot that aubinator_viewer_decoder.cpp needs to be updated too.
But updated locally and will push with the fix.

Thanks!

-
Lionel

On 08/11/2018 10:36, Lionel Landwerlin wrote:

Reviewed-by: Lionel Landwerlin 

On 07/11/2018 14:50, Toni Lönnberg wrote:
The engine to which the batch was sent is now set in the decoder context when
decoding the batch. This is needed so that we can distinguish between
instructions, as the render and video pipes share some of the instruction opcodes.

v2: The engine is now in the decoder context and the batch decoder uses a local
function for finding the instruction for an engine.

v3: Spec uses engine_mask now instead of engine, replaced engine class enums
with the definitions from UAPI.
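
The idea described above can be sketched as a lookup that filters on an engine mask before matching the opcode. Everything here is illustrative: the enum values, the `instruction` struct, the table contents and the opcode values are made up for the example and are not the gen_decoder API or real hardware opcodes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical engine classes; the patch uses the I915_ENGINE_CLASS_* values. */
enum engine_class { ENGINE_RENDER = 0, ENGINE_VIDEO = 2 };

#define ENGINE_TO_MASK(e) (1u << (e))

struct instruction {
   const char *name;
   uint32_t opcode;       /* expected bits after masking */
   uint32_t opcode_mask;  /* which bits of the first dword form the opcode */
   uint32_t engine_mask;  /* engines on which this encoding is valid */
};

/* Two made-up instructions that share an opcode but belong to different engines. */
static const struct instruction instructions[] = {
   { "RENDER_CMD", 0x7a000000u, 0xffff0000u, ENGINE_TO_MASK(ENGINE_RENDER) },
   { "VIDEO_CMD",  0x7a000000u, 0xffff0000u, ENGINE_TO_MASK(ENGINE_VIDEO) },
};

/* Return the instruction matching the first dword at p for the given engine,
 * or NULL if nothing matches. */
static const struct instruction *
find_instruction(enum engine_class engine, const uint32_t *p)
{
   for (size_t i = 0; i < sizeof(instructions) / sizeof(instructions[0]); i++) {
      const struct instruction *inst = &instructions[i];

      if ((inst->engine_mask & ENGINE_TO_MASK(engine)) == 0)
         continue;
      if ((p[0] & inst->opcode_mask) == inst->opcode)
         return inst;
   }
   return NULL;
}
```

Keeping the engine in the decoder context, as the v2 note says, means callers only pass the context and the dword pointer, and a small wrapper supplies the engine.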
---
  src/intel/common/gen_batch_decoder.c | 25 +---
  src/intel/common/gen_decoder.c   |  7 ++-
  src/intel/common/gen_decoder.h   |  6 +-
  src/intel/tools/aubinator.c  |  3 +-
  src/intel/tools/aubinator_error_decode.c | 73 
  5 files changed, 63 insertions(+), 51 deletions(-)

diff --git a/src/intel/common/gen_batch_decoder.c b/src/intel/common/gen_batch_decoder.c
index 63f04627572..d5482a4d455 100644
--- a/src/intel/common/gen_batch_decoder.c
+++ b/src/intel/common/gen_batch_decoder.c
@@ -45,6 +45,7 @@ gen_batch_decode_ctx_init(struct gen_batch_decode_ctx *ctx,
 ctx->fp = fp;
 ctx->flags = flags;
 ctx->max_vbo_decoded_lines = -1; /* No limit! */
+   ctx->engine = I915_ENGINE_CLASS_RENDER;
   if (xml_path == NULL)
    ctx->spec = gen_spec_load(devinfo);
@@ -192,10 +193,16 @@ ctx_print_buffer(struct gen_batch_decode_ctx *ctx,
 fprintf(ctx->fp, "\n");
  }
  +static struct gen_group *
+gen_ctx_find_instruction(struct gen_batch_decode_ctx *ctx, const uint32_t *p)
+{
+   return gen_spec_find_instruction(ctx->spec, ctx->engine, p);
+}
+
  static void
  handle_state_base_address(struct gen_batch_decode_ctx *ctx, const uint32_t *p)
  {
-   struct gen_group *inst = gen_spec_find_instruction(ctx->spec, p);
+   struct gen_group *inst = gen_ctx_find_instruction(ctx, p);
   struct gen_field_iterator iter;
 gen_field_iterator_init(&iter, inst, p, 0, false);
@@ -309,7 +316,7 @@ static void
  handle_media_interface_descriptor_load(struct gen_batch_decode_ctx 
*ctx,

 const uint32_t *p)
  {
-   struct gen_group *inst = gen_spec_find_instruction(ctx->spec, p);
+   struct gen_group *inst = gen_ctx_find_instruction(ctx, p);
 struct gen_group *desc =
    gen_spec_find_struct(ctx->spec, "INTERFACE_DESCRIPTOR_DATA");
  @@ -373,7 +380,7 @@ static void
  handle_3dstate_vertex_buffers(struct gen_batch_decode_ctx *ctx,
    const uint32_t *p)
  {
-   struct gen_group *inst = gen_spec_find_instruction(ctx->spec, p);
+   struct gen_group *inst = gen_ctx_find_instruction(ctx, p);
 struct gen_group *vbs = gen_spec_find_struct(ctx->spec, 
"VERTEX_BUFFER_STATE");

   struct gen_batch_decode_bo vb = {};
@@ -436,7 +443,7 @@ static void
  handle_3dstate_index_buffer(struct gen_batch_decode_ctx *ctx,
  const uint32_t *p)
  {
-   struct gen_group *inst = gen_spec_find_instruction(ctx->spec, p);
+   struct gen_group *inst = gen_ctx_find_instruction(ctx, p);
   struct gen_batch_decode_bo ib = {};
 uint32_t ib_size = 0;
@@ -486,7 +493,7 @@ handle_3dstate_index_buffer(struct 
gen_batch_decode_ctx *ctx,

  static void
  decode_single_ksp(struct gen_batch_decode_ctx *ctx, const uint32_t *p)
  {
-   struct gen_group *inst = gen_spec_find_instruction(ctx->spec, p);
+   struct gen_group *inst = gen_ctx_find_instruction(ctx, p);
   uint64_t ksp = 0;
 bool is_simd8 = false; /* vertex shaders on Gen8+ only */
@@ -528,7 +535,7 @@ decode_single_ksp(struct gen_batch_decode_ctx 
*ctx, const uint32_t *p)

  static void
  decode_ps_kernels(struct gen_batch_decode_ctx *ctx, const uint32_t *p)
  {
-   struct gen_group *inst = gen_spec_find_instruction(ctx->spec, p);
+   struct gen_group *inst = gen_ctx_find_instruction(ctx, p);
   uint64_t ksp[3] = {0, 0, 0};
 bool enabled[3] = {false, false, false};
@@ -576,7 +583,7 @@ decode_ps_kernels(struct gen_batch_decode_ctx 
*ctx, const uint32_t *p)

  static void
  decode_3dstate_constant(struct gen_batch_decode_ctx *ctx, const 
uint32_t *p)

  {
-   struct gen_group *inst = gen_spec_find_instruction(ctx->spec, p);
+   struct gen_group *inst = gen_ctx_find_instruction(ctx, p);
 struct gen_group *body =
    gen_spec_find_struct(ctx->spec, "3DSTATE_CONSTANT_BODY");
  @@ -658,7 +665,7 @@ decode_dynamic_state_pointers(struct 
gen_batch_decode_ctx *ctx,
    const char *struct_type, const 
uint32_t *p,

    int count)
  {
-   struct gen_group *inst = gen_spec_find_instruction(ctx->spec, p);
+   struct gen_group *inst = gen_ctx_find_instruction(ctx, p);
   uint32_t state_offset = 0;
  @@ -802,7 +809,7

Re: [Mesa-dev] [PATCH v2 3/4] dri: add AYUV format

2018-11-13 Thread Lionel Landwerlin


I think this chunk (or the whole patch) should be cherry picked to stable.
Otherwise we get a BAD_ATTRIBUTE error for trying to create an AYUV 
EGLImage.

We should have BAD_MATCH instead.
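
A small sketch of the kind of plane-count lookup touched by the hunk below, assuming (as this message suggests) that a fourcc missing from the switch is treated as unknown and rejected before the driver is asked whether it can import it. The `FOURCC` macro, the `FMT_*` names and `num_planes` are local stand-ins written for this example, not the `drm_fourcc.h` or Mesa definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Local fourcc helper: four ASCII characters packed little-endian. */
#define FOURCC(a, b, c, d) \
   ((uint32_t)(a) | ((uint32_t)(b) << 8) | \
    ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

#define FMT_YUYV FOURCC('Y', 'U', 'Y', 'V')
#define FMT_AYUV FOURCC('A', 'Y', 'U', 'V')
#define FMT_NV12 FOURCC('N', 'V', '1', '2')

/* Number of planes for a format, or 0 when the format is not known at all.
 * A caller would treat 0 as "unknown format" and fail early. */
static unsigned
num_planes(uint32_t format)
{
   switch (format) {
   case FMT_YUYV:
   case FMT_AYUV:   /* packed format: a single plane */
      return 1;
   case FMT_NV12:   /* luma plane plus interleaved chroma plane */
      return 2;
   default:
      return 0;
   }
}
```

Once a format is listed, the early "unknown format" path is no longer taken and a later stage decides whether the combination is supported.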

-
Lionel

On 09/11/2018 10:55, Lionel Landwerlin wrote:

diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c
index 87e1a704c6e..3b63aebbf9a 100644
--- a/src/egl/drivers/dri2/egl_dri2.c
+++ b/src/egl/drivers/dri2/egl_dri2.c
@@ -2278,6 +2278,7 @@ dri2_num_fourcc_format_planes(EGLint format)
 case DRM_FORMAT_YVYU:
 case DRM_FORMAT_UYVY:
 case DRM_FORMAT_VYUY:
+   case DRM_FORMAT_AYUV:
return 1;
  
 case DRM_FORMAT_NV12:




Re: [Mesa-dev] [PATCH v3] virgl: Add command and flags to initiate debugging on the host (v2)

2018-11-13 Thread Gert Wollny

The host side has now landed, but because I re-worked the guest side
since Erik gave his R-B, I thought I'd ask you to take another look.
Best, 
Gert
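
For readers skimming the patch quoted below, the length handling in the encoder can be sketched as follows: the string (including its terminating NUL) is sent as a payload whose length is counted in 32-bit words. This is only an illustration. `pack_length`, `struct packed_len` and `MAX_PAYLOAD_DWORDS` are hypothetical names, and the limit value is an assumption made for this example; the actual constant is not legible in the quoted text.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed upper bound on the payload length in dwords (illustrative only). */
#define MAX_PAYLOAD_DWORDS 0xffffu

struct packed_len {
   uint32_t dwords;  /* payload length announced in the command header */
   uint32_t bytes;   /* number of string bytes actually copied */
};

/* Compute how many dwords are needed to carry s including its NUL byte,
 * truncating to the assumed maximum payload size. */
static struct packed_len
pack_length(const char *s)
{
   size_t slen = strlen(s) + 1;
   struct packed_len out;

   if (slen > 4u * MAX_PAYLOAD_DWORDS)
      slen = 4u * MAX_PAYLOAD_DWORDS;

   out.dwords = (uint32_t)((slen + 3) / 4);  /* round up to whole dwords */
   out.bytes = (uint32_t)slen;
   return out;
}
```

Since `slen` already includes the NUL byte it is never zero, and `bytes` can never exceed `dwords * 4`, so this sketch needs neither an empty-string check nor a minimum of the two lengths.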

On Wednesday, 12.09.2018 at 11:59 +0200, Gert Wollny wrote:
> From: Gert Wollny 
> 
> On the host VREND_DEBUG=guestallow must be set to let the guest override
> the debug flags.
> 
> v2: Send flag string instead of flags, this avoids the need to keep
> the flags in sync.
> v3: Only request host logging if the host actually understands the command
> 
> Signed-off-by: Gert Wollny 
> ---
> The corresponding virglrenderer patches can be found in this MR: 
> https://gitlab.freedesktop.org/virgl/virglrenderer/merge_requests/39
> 
> Thanks for reviewing, 
> Gert 
> 
>  src/gallium/drivers/virgl/virgl_context.c  |  8 ++++++++
>  src/gallium/drivers/virgl/virgl_encode.c   | 24 ++++++++++++++++++++++++
>  src/gallium/drivers/virgl/virgl_encode.h   |  3 +++
>  src/gallium/drivers/virgl/virgl_hw.h       |  1 +
>  src/gallium/drivers/virgl/virgl_protocol.h |  1 +
>  5 files changed, 37 insertions(+)
> 
> diff --git a/src/gallium/drivers/virgl/virgl_context.c
> b/src/gallium/drivers/virgl/virgl_context.c
> index 4511bf3b2f..96932c473d 100644
> --- a/src/gallium/drivers/virgl/virgl_context.c
> +++ b/src/gallium/drivers/virgl/virgl_context.c
> @@ -1164,6 +1164,7 @@ struct pipe_context
> *virgl_context_create(struct pipe_screen *pscreen,
> struct virgl_context *vctx;
> struct virgl_screen *rs = virgl_screen(pscreen);
> vctx = CALLOC_STRUCT(virgl_context);
> +   const char *host_debug_flagstring;
>  
> vctx->cbuf = rs->vws->cmd_buf_create(rs->vws);
> if (!vctx->cbuf) {
> @@ -1268,6 +1269,13 @@ struct pipe_context
> *virgl_context_create(struct pipe_screen *pscreen,
> virgl_encoder_create_sub_ctx(vctx, vctx->hw_sub_ctx_id);
>  
> virgl_encoder_set_sub_ctx(vctx, vctx->hw_sub_ctx_id);
> +
> +   if (rs->caps.caps.v2.capability_bits & VIRGL_CAP_GUEST_MAY_INIT_LOG) {
> +  host_debug_flagstring = getenv("VIRGL_HOST_DEBUG");
> +  if (host_debug_flagstring)
> + virgl_encode_host_debug_flagstring(vctx, host_debug_flagstring);
> +   }
> +
> return &vctx->base;
>  fail:
> return NULL;
> diff --git a/src/gallium/drivers/virgl/virgl_encode.c
> b/src/gallium/drivers/virgl/virgl_encode.c
> index e86d0711a5..400ba68474 100644
> --- a/src/gallium/drivers/virgl/virgl_encode.c
> +++ b/src/gallium/drivers/virgl/virgl_encode.c
> @@ -1044,3 +1044,27 @@ int virgl_encode_texture_barrier(struct
> virgl_context *ctx,
> virgl_encoder_write_dword(ctx->cbuf, flags);
> return 0;
>  }
> +
> +int virgl_encode_host_debug_flagstring(struct virgl_context *ctx,
> +   char *flagstring)
> +{
> +   unsigned long slen = strlen(flagstring) + 1;
> +   uint32_t sslen;
> +   uint32_t string_length;
> +
> +   if (!slen)
> +  return 0;
> +
> +   if (slen > 4 * 0x) {
> +  debug_printf("VIRGL: host debug flag string too long, will be
> truncated\n");
> +  slen = 4 * 0x;
> +   }
> +
> +   sslen = (uint32_t )(slen + 3) / 4;
> +   string_length = (uint32_t)MIN2(sslen * 4, slen);
> +
> +   virgl_encoder_write_cmd_dword(ctx,
> VIRGL_CMD0(VIRGL_CCMD_SET_DEBUG_FLAGS, 0, sslen));
> +   virgl_encoder_write_block(ctx->cbuf, (uint8_t *)flagstring,
> string_length);
> +
> +   return 0;
> +}
> diff --git a/src/gallium/drivers/virgl/virgl_encode.h
> b/src/gallium/drivers/virgl/virgl_encode.h
> index 40e62d453b..80b943a6b3 100644
> --- a/src/gallium/drivers/virgl/virgl_encode.h
> +++ b/src/gallium/drivers/virgl/virgl_encode.h
> @@ -276,4 +276,7 @@ int virgl_encode_launch_grid(struct virgl_context
> *ctx,
>   const struct pipe_grid_info
> *grid_info);
>  int virgl_encode_texture_barrier(struct virgl_context *ctx,
>   unsigned flags);
> +
> +int virgl_encode_host_debug_flagstring(struct virgl_context *ctx,
> +  char *envname);
>  #endif
> diff --git a/src/gallium/drivers/virgl/virgl_hw.h
> b/src/gallium/drivers/virgl/virgl_hw.h
> index 7736ceb935..e682c750e7 100644
> --- a/src/gallium/drivers/virgl/virgl_hw.h
> +++ b/src/gallium/drivers/virgl/virgl_hw.h
> @@ -231,6 +231,7 @@ enum virgl_formats {
>  #define VIRGL_CAP_SHADER_CLOCK (1 << 11)
>  #define VIRGL_CAP_TEXTURE_BARRIER  (1 << 12)
>  #define VIRGL_CAP_TGSI_COMPONENTS  (1 << 13)
> +#define VIRGL_CAP_GUEST_MAY_INIT_LOG   (1 << 14)
>  
>  /* virgl bind flags - these are compatible with mesa 10.5 gallium.
>   * but are fixed, no other should be passed to virgl either.
> diff --git a/src/gallium/drivers/virgl/virgl_protocol.h
> b/src/gallium/drivers/virgl/virgl_protocol.h
> index 8d99c5ed47..3373121bf7 100644
> --- a/src/gallium/drivers/virgl/virgl_protocol.h
> +++ b/src/gallium/drivers/virgl/virgl_protocol.h
> @@ -92,6 +92,7 @@ enum virgl_context_cmd {
> VIRGL_CCMD_SET_FRAMEBUFFER_STATE_NO_ATTACH,
> VIRGL_CCMD_TEXTURE_BARRIER,
>

Re: [Mesa-dev] [PATCH 6/7] RFC: nir/xfb_info: arrays of basic types adds just one varying

2018-11-13 Thread Alejandro Piñeiro

Hi Jason, just one thing here. Although I appreciate your interest in
understanding how varyings are enumerated, I think we are digressing
here, as in the end that is something I would need to solve myself.
I just wanted to know which way to go.

The main question here is whether we are really interested in adding such
complexity to the general xfb gathering pass. This RFC was basically a
way to show how many changes we would need, even for an incomplete
solution that I'm not totally happy with.

So at this point, do you think it is worth adding varying
computation to the general pass in the name of code reuse, or should
ARB_gl_spirv stick to its own gathering pass?


On 10/11/18 12:13, Alejandro Piñeiro wrote:
> On 09/11/18 16:58, Jason Ekstrand wrote:
>> On November 9, 2018 06:39:25 Alejandro Piñeiro wrote:
>>> On 08/11/18 23:14, Jason Ekstrand wrote:
 On Thu, Nov 8, 2018 at 7:22 AM Alejandro Piñeiro wrote:

 On OpenGL, an array of a simple type adds just one varying. So the
 gl_transform_feedback_varying_info struct defined in mtypes.h includes
 the parameters Type (base_type) and Size (number of elements).

 This commit checks this when the recursive add_var_xfb_outputs call
 handles arrays, to ensure that just one is added.

 RFC: Until this point, all changes were reasonable, but this change is
 (imho) ugly. My idea was to introduce as few changes as possible to
 the code, especially to its logic/flow. But this commit is almost a
 hack. The ideal solution would be to change the focus of the recursive
 function, focusing on varyings, and at each varying, recursively add
 outputs. But that seems like overkill for a pass that was
 originally intended for consumers only caring about the outputs. So
 perhaps ARB_gl_spirv should keep its own gathering pass, with
 varyings and outputs, and leave this one untouched for those that only
 care about outputs.
 ---
  src/compiler/nir/nir_gather_xfb_info.c | 52
 --
  1 file changed, 43 insertions(+), 9 deletions(-)

 diff --git a/src/compiler/nir/nir_gather_xfb_info.c
 b/src/compiler/nir/nir_gather_xfb_info.c
 index 948b802a815..cb0e2724cab 100644
 --- a/src/compiler/nir/nir_gather_xfb_info.c
 +++ b/src/compiler/nir/nir_gather_xfb_info.c
 @@ -36,23 +36,59 @@ nir_gather_xfb_info_create(void *mem_ctx,
 uint16_t output_count, uint16_t varyin
     return xfb;
  }

 +static bool
 +glsl_type_is_leaf(const struct glsl_type *type)
 +{
 +   if (glsl_type_is_struct(type) ||
 +       (glsl_type_is_array(type) &&
 +        (glsl_type_is_array(glsl_get_array_element(type)) ||
 +         glsl_type_is_struct(glsl_get_array_element(type) {


 I'm trying to understand exactly what this means.  From what you
 wrote here it looks like the following are all one varying:

 float var[3];
 vec2 var[3];
 mat4 var[3];
>>>
>>> Yes, GLSL returns one varying per each one (Size 3).
>>
>> Just to be clear, a matrix or an array of matrices is one varying?
>
> Yep, and being more clear, for this shader:
> #version 150
> #extension GL_ARB_enhanced_layouts: require
>
> layout(xfb_offset = 0) out mat4 var[3];
>
> void main() {
>   mat4 m4;
>
>   gl_Position = vec4(0.0);
>
>   var[0] = m4;
> }
>
> We get the following when we dump gl_program::LinkedTransformFeedback,
> that is a struct gl_transform_feedback_info defined at mtypes.h:
>
> [gl_transform_feedback_info]
>     NumOuputs = 12, (OutputRegister, OutputBuffer, NumComponents,
> StreamId, DstOffset, ComponentOffset)
>             0:(31,  0,  4,  0,  0,  0)
>             1:(32,  0,  4,  0,  4,  0)
>             2:(33,  0,  4,  0,  8,  0)
>             3:(34,  0,  4,  0, 12,  0)
>             4:(35,  0,  4,  0, 16,  0)
>             5:(36,  0,  4,  0, 20,  0)
>             6:(37,  0,  4,  0, 24,  0)
>             7:(38,  0,  4,  0, 28,  0)
>             8:(39,  0,  4,  0, 32,  0)
>             9:(40,  0,  4,  0, 36,  0)
>             10:(41,  0,  4,  0, 40,  0)
>             11:(42,  0,  4,  0, 44,  0)
>     NumVarying=1, (Offset, Type, BufferIndex, Size, Name)
>             0:( 0,   GL_FLOAT_MAT4,  0,  3, var)
>     ActiveBuffers=1, (Binding, NumVaryings, Stride, Stream):
>             0:( 0,  1, 192,  0)
>
> FWIW, in some cases we are also getting a slightly different number of
> Outputs. But I'm personally not really worried about that as long as it
> keeps working. The number of varyings is a different matter, as it is
> exposed through the program interface queries, so (I assume) it should
> be consistent.
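
The arithmetic behind the dump above can be sketched as follows, assuming one output slot per matrix column and 4-byte floats. The struct and function names are made up for this example and do not correspond to the NIR or GLSL linker code.

```c
#include <assert.h>

struct xfb_counts {
   unsigned outputs;       /* per-slot outputs, as in NumOutputs */
   unsigned varyings;      /* entries visible through program interface queries */
   unsigned stride_bytes;  /* buffer stride when this is the only varying */
};

/* Counts for an array of `length` float matrices with `columns` columns
 * of `rows` components each, captured with xfb_offset = 0. */
static struct xfb_counts
count_matrix_array(unsigned length, unsigned columns, unsigned rows)
{
   struct xfb_counts c;

   c.outputs = length * columns;            /* one output per column vector */
   c.varyings = 1;                          /* array of a basic type: one varying */
   c.stride_bytes = c.outputs * rows * 4;   /* 4 bytes per float component */
   return c;
}
```

For `mat4 var[3]` this gives 12 outputs, 1 varying and a stride of 192 bytes, matching the dump; the DstOffset column there advances by 4 per output because it counts dwords rather than bytes.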
>
>>
>>>

 but the following are not

 struct S {
    float f;
    vec4 v;
 };

Re: [Mesa-dev] [PATCH mesa] xmlpool: update translation po files

2018-11-13 Thread Emil Velikov

On Mon, 12 Nov 2018 at 18:14, Dylan Baker  wrote:
>
> Quoting Eric Engestrom (2018-11-12 09:47:22)
> > On Monday, 2018-11-12 16:56:32 +, Emil Velikov wrote:
> > > > On Mon, 12 Nov 2018 at 14:24, Eric Engestrom wrote:
> > > >
> > > > These files are close to 4 years out of date; a lot's changed since.
> > > > Let's just check in a recently-regenerated version.
> > > >
> > > Worth removing them from git and letting the build regenerate them as 
> > > needed?
> >
> > No, the point is for them to be filled with the translations.
> > They aren't 100% generated, they're more like "refreshed" by running the
> > ninja command, to add new strings to be translated and adjust file/line
> > references.
> >
> >
> > That said, I've just looked at the state of the translations, and
> > "partial" is already generous. Users would currently get a mostly
> > > English driconf interface with a few strings translated here and there,
> > which I'm not sure is worth the hassle of maintaining all this.
> >
> > Should we just drop the translation infrastructure?
>
> I'd try pinging the people who provided the translations in the first place to
> see if they're interested in updating them. If not, I'd be in favor of dropping
> unmaintained translations; if there are no maintained translations, drop the
> whole thing.
>
> Just my 2¢
>
Very well said Dylan. I'm on the same page.

-Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v3] virgl: native fence fd support

2018-11-13 Thread Emil Velikov

[This time with mesa-dev@ in the list, and fewer typos]

Hi Rob,

On Mon, 12 Nov 2018 at 15:14, Robert Foss  wrote:

> +++ b/src/gallium/drivers/virgl/virgl_screen.c
> @@ -340,7 +340,7 @@ virgl_get_param(struct pipe_screen *screen, enum pipe_cap 
> param)
> case PIPE_CAP_VIDEO_MEMORY:
>return 0;
> case PIPE_CAP_NATIVE_FENCE_FD:
> -  return 0;
> +  return vscreen->vws->driver_version(vscreen->vws) >= 1;

It seems like the driver_version() vfunc is missing for the vtest winsys.

One could go with an empty stub or drop the function in favour of a winsys
variable (or bitmask). Personally, I'm leaning towards the latter, although
either will do.
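
A minimal sketch of the bitmask option, assuming a trimmed-down stand-in for the winsys struct. Every `sketch_` name and the capability bit are hypothetical and only illustrate the idea; none of them exist in the virgl code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capability bit; not part of the real virgl winsys. */
#define SKETCH_WINSYS_CAP_FENCE_FD (1u << 0)

/* Hypothetical, trimmed-down stand-in for struct virgl_winsys. */
struct sketch_winsys {
   uint32_t caps; /* filled in once by each winsys at creation time */
};

/* A DRM-backed winsys would set the bit after checking the driver version. */
static struct sketch_winsys sketch_drm_winsys_create(int driver_version)
{
   struct sketch_winsys ws = { 0 };
   if (driver_version >= 1)
      ws.caps |= SKETCH_WINSYS_CAP_FENCE_FD;
   return ws;
}

/* A vtest-like winsys leaves the bit clear, so no stub vfunc is needed. */
static struct sketch_winsys sketch_vtest_winsys_create(void)
{
   struct sketch_winsys ws = { 0 };
   return ws;
}

/* What a get_param-style query would consult instead of calling a vfunc. */
static bool sketch_supports_native_fence_fd(const struct sketch_winsys *ws)
{
   assert(ws != NULL);
   return (ws->caps & SKETCH_WINSYS_CAP_FENCE_FD) != 0;
}
```

With this shape the screen code only reads a field, so a winsys that never sets the bit needs no extra code.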

> +static void virgl_fence_server_sync(struct virgl_winsys *vws,
> +   struct virgl_cmd_buf *cbuf,
> +struct pipe_fence_handle *fence)
> +{
> +   struct virgl_hw_res *hw_res = virgl_hw_res(fence);
> +
> +   assert(hw_res->fence_fd != -1);
> +
Skimming other drivers: they're not using an assert, so I'd change this to
an if statement.

> +   if (cbuf->in_fence_fd == -1) {
> +   cbuf->in_fence_fd = dup(hw_res->fence_fd);
> +   } else {
> +int new_fd = sync_merge("virgl", cbuf->in_fence_fd, 
> hw_res->fence_fd);
> +close(cbuf->in_fence_fd);
> +cbuf->in_fence_fd = new_fd;

The above if/else seems like an open-coded version of sync_accumulate().
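
For illustration, the accumulate pattern can be sketched on its own as below. This is not the libsync implementation: the merge step is passed in as a hypothetical callback standing in for sync_merge(), and `sketch_fake_merge` exists only so the example runs without real sync files.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical merge callback standing in for sync_merge(); it must return a
 * new fd, or -1 on failure. */
typedef int (*sketch_merge_fn)(int fd1, int fd2);

/* Demo-only merge: returns a duplicate of fd2. A real merge would produce a
 * fence that signals once both inputs have signalled. */
static int sketch_fake_merge(int fd1, int fd2)
{
   (void)fd1;
   return dup(fd2);
}

/* Fold new_fd into *acc_fd. Returns 0 on success, -1 on failure.
 * The caller keeps ownership of new_fd in every case. */
static int sketch_fence_accumulate(int *acc_fd, int new_fd,
                                   sketch_merge_fn merge)
{
   assert(acc_fd != NULL);

   /* Runtime check rather than an assert: nothing to wait on. */
   if (new_fd < 0)
      return 0;

   if (*acc_fd < 0) {
      /* First fence: take a private copy. */
      int copy = dup(new_fd);
      if (copy < 0)
         return -1;
      *acc_fd = copy;
      return 0;
   }

   /* Later fences: merge, then drop the old accumulated fd. */
   int merged = merge(*acc_fd, new_fd);
   if (merged < 0)
      return -1;
   close(*acc_fd);
   *acc_fd = merged;
   return 0;
}
```

Note that an invalid incoming fd is handled with an if statement here, matching the comment about the assert further up.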

Despite the above comments, the kernel interface seems reasonable IMHO.
It would be great if one more person double-checked it, though.

With the three bits handled the patch is:
Reviewed-by: Emil Velikov 

HTH
Emil

[Mesa-dev] [Bug 107822] Just Cause 3 Flickering Textures with AMD RADV

2018-11-13 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=107822

--- Comment #6 from Alexander  ---
I already have tested that.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.

[Mesa-dev] [PATCH v2 5/5] intel/tools: avoid 'ignoring return value'

2018-11-13 Thread asimiklit . work

From: Andrii Simiklit 

1. tools/i965_disasm.c:58:4: warning:
 ignoring return value of ‘fread’,
 declared with attribute warn_unused_result
 fread(assembly, *end, 1, fp);

v2: - Fixed incorrect return value check.
   ( Eric Engestrom  )

Signed-off-by: Andrii Simiklit 
---
 src/intel/tools/i965_disasm.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/intel/tools/i965_disasm.c b/src/intel/tools/i965_disasm.c
index 73a6760fc1..329f6327ed 100644
--- a/src/intel/tools/i965_disasm.c
+++ b/src/intel/tools/i965_disasm.c
@@ -55,7 +55,8 @@ i965_disasm_read_binary(FILE *fp, size_t *end)
if (assembly == NULL)
   return NULL;
 
-   fread(assembly, *end, 1, fp);
+   MAYBE_UNUSED size_t size = fread(assembly, *end, 1, fp);
+   assert((size || (*end == 0)) && "error: unable to read all elements!");
fclose(fp);
 
return assembly;
-- 
2.17.1
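
As a side note to the patch above: the assert compiles away when NDEBUG is defined, so a short read is only caught in debug builds. A different approach, sketched here under the assumption that the caller can handle a NULL return, is to check the fread() result at run time. `sketch_read_exact` is a hypothetical helper, not part of i965_disasm.c.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Read exactly `size` bytes from fp into a freshly allocated buffer.
 * Returns NULL on allocation failure or short read; the caller frees. */
static void *sketch_read_exact(FILE *fp, size_t size)
{
   assert(fp != NULL);

   /* Allocate at least one byte so a zero-size read still returns non-NULL. */
   void *buf = malloc(size ? size : 1);
   if (buf == NULL)
      return NULL;

   /* With an item count of 1, fread() returns 1 only if all bytes arrived. */
   if (size != 0 && fread(buf, size, 1, fp) != 1) {
      free(buf);
      return NULL;
   }
   return buf;
}
```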


[Mesa-dev] [PATCH v2 4/5] main: avoid 'may be used uninitialized' warnings

2018-11-13 Thread asimiklit . work

From: Andrii Simiklit 

1. main/texcompress_etc.c:1314:12:
warning: ‘*((void *)+2)’ may be used uninitialized in this function
2. main/texcompress_etc.c:1354:12:
warning: ‘*((void *)+2)’ may be used uninitialized in this function
3. main/texcompress_etc.c:1293:12:
warning: ‘dst’ may be used uninitialized in this function
4. main/texcompress_etc.c:1335:12:
warning: ‘dst’ may be used uninitialized in this function
5. main/texcompress_etc.c:1460:12:
warning: ‘*((void *)+1)’ may be used uninitialized in this function

v2: Fixed by adding the unreachable case to the etc2_rgb8_fetch_texel
   ( Eric Engestrom  )
Changes for warning 'pixerrorcolorbest' were removed.

Signed-off-by: Andrii Simiklit 
---
 src/mesa/main/texcompress_etc.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/main/texcompress_etc.c b/src/mesa/main/texcompress_etc.c
index b39ab33d36..f1da4d0f11 100644
--- a/src/mesa/main/texcompress_etc.c
+++ b/src/mesa/main/texcompress_etc.c
@@ -548,6 +548,7 @@ etc2_rgb8_fetch_texel(const struct etc2_block *block,
   if (punchthrough_alpha)
  dst[3] = 255;
}
+   else unreachable("unhandled block mode");
 }
 
 static void
-- 
2.17.1
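
Why an unreachable branch silences 'may be used uninitialized' can be shown with a small standalone sketch. The macro below is illustrative and is not Mesa's definition of unreachable(); the enum and function are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Illustrative macro, not Mesa's unreachable(). */
#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_UNREACHABLE(msg) \
   do { assert(!msg); __builtin_unreachable(); } while (0)
#else
#define SKETCH_UNREACHABLE(msg) \
   do { assert(!msg); abort(); } while (0)
#endif

enum sketch_mode { SKETCH_MODE_A, SKETCH_MODE_B };

static uint8_t sketch_fetch(enum sketch_mode mode)
{
   uint8_t dst;

   if (mode == SKETCH_MODE_A)
      dst = 10;
   else if (mode == SKETCH_MODE_B)
      dst = 20;
   else
      /* Tells the compiler no path reaches the return with dst unset. */
      SKETCH_UNREACHABLE("unhandled mode");

   return dst;
}
```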


[Mesa-dev] [PATCH v2 2/5] compiler: avoid 'unused variable'

2018-11-13 Thread asimiklit . work

From: Andrii Simiklit 

1. nir/nir_lower_vars_to_ssa.c:691:21: warning:
   unused variable ‘var’
   nir_variable *var = path->path[0]->var;

v2: Changes for some of the 'may be used uninitialized'
warnings were removed; they seem to be a compiler issue.
( Eric Engestrom  )
Possibly like this one:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=46684
That issue is flagged as a duplicate, but the
original one is not closed yet.

Signed-off-by: Andrii Simiklit 
---
 src/compiler/nir/nir_lower_vars_to_ssa.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/src/compiler/nir/nir_lower_vars_to_ssa.c 
b/src/compiler/nir/nir_lower_vars_to_ssa.c
index 8e517a7895..646efd9ad8 100644
--- a/src/compiler/nir/nir_lower_vars_to_ssa.c
+++ b/src/compiler/nir/nir_lower_vars_to_ssa.c
@@ -683,10 +683,9 @@ nir_lower_vars_to_ssa_impl(nir_function_impl *impl)
   nir_deref_path *path = >path;
 
   assert(path->path[0]->deref_type == nir_deref_type_var);
-  nir_variable *var = path->path[0]->var;
 
   /* We don't build deref nodes for non-local variables */
-  assert(var->data.mode == nir_var_local);
+  assert(path->path[0]->var->data.mode == nir_var_local);
 
   if (path_may_be_aliased(path, )) {
  exec_node_remove(>direct_derefs_link);
-- 
2.17.1


[Mesa-dev] [PATCH v2 3/5] i965: avoid 'unused variable'

2018-11-13 Thread asimiklit . work

From: Andrii Simiklit 

1. brw_pipe_control.c:311:34: warning:
unused variable ‘devinfo’
2. brw_program_binary.c:209:19: warning:
unused variable ‘gen_size’
3. brw_program_binary.c:216:19: warning:
unused variable ‘nir_size’

v2: Changes for unreproducible issues were removed

Signed-off-by: Andrii Simiklit 
---
 src/mesa/drivers/dri/i965/brw_pipe_control.c   | 2 +-
 src/mesa/drivers/dri/i965/brw_program_binary.c | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_pipe_control.c 
b/src/mesa/drivers/dri/i965/brw_pipe_control.c
index 122ac26070..a3f521b5ae 100644
--- a/src/mesa/drivers/dri/i965/brw_pipe_control.c
+++ b/src/mesa/drivers/dri/i965/brw_pipe_control.c
@@ -308,7 +308,7 @@ brw_emit_depth_stall_flushes(struct brw_context *brw)
 void
 gen7_emit_vs_workaround_flush(struct brw_context *brw)
 {
-   const struct gen_device_info *devinfo = >screen->devinfo;
+   MAYBE_UNUSED const struct gen_device_info *devinfo = >screen->devinfo;
 
assert(devinfo->gen == 7);
brw_emit_pipe_control_write(brw,
diff --git a/src/mesa/drivers/dri/i965/brw_program_binary.c 
b/src/mesa/drivers/dri/i965/brw_program_binary.c
index db03332241..1298d9e765 100644
--- a/src/mesa/drivers/dri/i965/brw_program_binary.c
+++ b/src/mesa/drivers/dri/i965/brw_program_binary.c
@@ -206,14 +206,14 @@ brw_program_deserialize_driver_blob(struct gl_context 
*ctx,
  break;
   switch ((enum driver_cache_blob_part)part_type) {
   case GEN_PART: {
- uint32_t gen_size = blob_read_uint32();
+ MAYBE_UNUSED uint32_t gen_size = blob_read_uint32();
  assert(!reader.overrun &&
 (uintptr_t)(reader.end - reader.current) > gen_size);
  deserialize_gen_program(, ctx, prog, stage);
  break;
   }
   case NIR_PART: {
- uint32_t nir_size = blob_read_uint32();
+ MAYBE_UNUSED uint32_t nir_size = blob_read_uint32();
  assert(!reader.overrun &&
 (uintptr_t)(reader.end - reader.current) > nir_size);
  const struct nir_shader_compiler_options *options =
-- 
2.17.1


[Mesa-dev] [PATCH v2 0/5] mesa: fix against several compilation warnings

2018-11-13 Thread asimiklit . work

From: Andrii Simiklit 

Fixes several compilation warnings for a release configuration

v2: the patch '1/4' was separated to '1/5' and '5/5'
   ( Eric Engestrom  )

Andrii Simiklit (5):
  intel/tools: avoid 'unused variable' warnings
  compiler: avoid 'unused variable'
  i965: avoid 'unused variable'
  main: avoid 'may be used uninitialized' warnings
  intel/tools: avoid 'ignoring return value'

 src/compiler/nir/nir_lower_vars_to_ssa.c   |  3 +--
 src/intel/tools/aub_mem.c  | 10 ++
 src/intel/tools/aub_read.c |  3 ++-
 src/intel/tools/i965_disasm.c  |  3 ++-
 src/mesa/drivers/dri/i965/brw_pipe_control.c   |  2 +-
 src/mesa/drivers/dri/i965/brw_program_binary.c |  4 ++--
 src/mesa/main/texcompress_etc.c|  1 +
 7 files changed, 15 insertions(+), 11 deletions(-)

-- 
2.17.1
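
For readers unfamiliar with why these warnings only show up in a release configuration: a variable that is read only inside assert() has no remaining use once NDEBUG removes the assert. A minimal sketch, assuming a GCC/Clang-style attribute; `SKETCH_MAYBE_UNUSED` is a hypothetical stand-in for Mesa's MAYBE_UNUSED, and `sketch_sum` is made up for the example.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_MAYBE_UNUSED __attribute__((unused))
#else
#define SKETCH_MAYBE_UNUSED
#endif

static int sketch_sum(const int *data, size_t len)
{
   /* Only read by the assert below, so it is "unused" when NDEBUG is set.
    * Declared on its own line so the attribute applies to it alone. */
   SKETCH_MAYBE_UNUSED const int *end = data + len;
   int sum = 0;

   for (size_t i = 0; i < len; i++) {
      assert(&data[i] < end);
      sum += data[i];
   }
   return sum;
}
```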


[Mesa-dev] [PATCH v2 1/5] intel/tools: avoid 'unused variable' warnings

2018-11-13 Thread asimiklit . work

From: Andrii Simiklit 

1. tools/aub_read.c:271:31: warning: unused variable ‘end’
const uint32_t *p = data, *end = data + data_len, *next;

2. tools/aub_mem.c:292:13: warning: unused variable ‘res’
   void *res = mmap((uint8_t *)bo.map + map_offset, 4096, PROT_READ,
   tools/aub_mem.c:357:13: warning: unused variable ‘res’
   void *res = mmap((uint8_t *)bo.map + (page - bo.addr), 4096, PROT_READ,

v2: The i965_disasm.c changes were moved into a separate patch.
The 'end' variable is declared separately with MAYBE_UNUSED
so the attribute does not affect the other variables.
   ( Eric Engestrom  )

Signed-off-by: Andrii Simiklit 
---
 src/intel/tools/aub_mem.c  | 10 ++
 src/intel/tools/aub_read.c |  3 ++-
 2 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/src/intel/tools/aub_mem.c b/src/intel/tools/aub_mem.c
index 58b51b78a5..98e14219c5 100644
--- a/src/intel/tools/aub_mem.c
+++ b/src/intel/tools/aub_mem.c
@@ -289,8 +289,9 @@ aub_mem_get_ggtt_bo(void *_mem, uint64_t address)
  continue;
 
   uint32_t map_offset = i->virt_addr - address;
-  void *res = mmap((uint8_t *)bo.map + map_offset, 4096, PROT_READ,
-   MAP_SHARED | MAP_FIXED, mem->mem_fd, 
phys_mem->fd_offset);
+  MAYBE_UNUSED void *res =
+mmap((uint8_t *)bo.map + map_offset, 4096, PROT_READ,
+  MAP_SHARED | MAP_FIXED, mem->mem_fd, phys_mem->fd_offset);
   assert(res != MAP_FAILED);
}
 
@@ -354,8 +355,9 @@ aub_mem_get_ppgtt_bo(void *_mem, uint64_t address)
for (uint64_t page = address; page < end; page += 4096) {
   struct phys_mem *phys_mem = ppgtt_walk(mem, mem->pml4, page);
 
-  void *res = mmap((uint8_t *)bo.map + (page - bo.addr), 4096, PROT_READ,
-   MAP_SHARED | MAP_FIXED, mem->mem_fd, 
phys_mem->fd_offset);
+  MAYBE_UNUSED void *res =
+mmap((uint8_t *)bo.map + (page - bo.addr), 4096, PROT_READ,
+  MAP_SHARED | MAP_FIXED, mem->mem_fd, phys_mem->fd_offset);
   assert(res != MAP_FAILED);
}
 
diff --git a/src/intel/tools/aub_read.c b/src/intel/tools/aub_read.c
index d83e88ddce..552ca2cc62 100644
--- a/src/intel/tools/aub_read.c
+++ b/src/intel/tools/aub_read.c
@@ -294,7 +294,8 @@ handle_memtrace_mem_write(struct aub_read *read, const 
uint32_t *p)
 int
 aub_read_command(struct aub_read *read, const void *data, uint32_t data_len)
 {
-   const uint32_t *p = data, *end = data + data_len, *next;
+   const uint32_t *p = data, *next;
+   MAYBE_UNUSED const uint32_t *end = data + data_len;
uint32_t h, header_length, bias;
 
assert(data_len >= 4);
-- 
2.17.1


Re: [Mesa-dev] [PATCH v4 01/10] intel/genxml: Add engine definition to render engine instructions (gen4)

2018-11-13 Thread Lionel Landwerlin


For all the xml changes :


Reviewed-by: Lionel Landwerlin 


Re: [Mesa-dev] [PATCH v2 3/4] dri: add AYUV format

2018-11-13 Thread Tapani Pälli



On 11/13/18 1:43 PM, Lionel Landwerlin wrote:

I think this chunk (or the whole patch) should be cherry picked to stable.
Otherwise we get a BAD_ATTRIBUTE error for trying to create an AYUV 
EGLImage.

We should have BAD_MATCH instead.


Or should we change the reported error code in places where this is 
called? That seems like an existing bug, things wouldn't work correctly 
if someone adds a new format to drm_fourcc.h.
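
A rough sketch of that idea, with local stand-ins so it builds without EGL or libdrm headers. The `SKETCH_` names, the error enum and both functions are hypothetical; only the FourCC byte layout and the "unknown format gives BAD_MATCH" rule come from the discussion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins for FourCC codes (little-endian packing of four chars). */
#define SKETCH_FOURCC(a, b, c, d) \
   ((uint32_t)(a) | ((uint32_t)(b) << 8) | \
    ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))
#define SKETCH_FORMAT_YUYV SKETCH_FOURCC('Y', 'U', 'Y', 'V')
#define SKETCH_FORMAT_NV12 SKETCH_FOURCC('N', 'V', '1', '2')
#define SKETCH_FORMAT_AYUV SKETCH_FOURCC('A', 'Y', 'U', 'V')

/* Hypothetical error codes standing in for the EGL ones. */
enum sketch_error { SKETCH_SUCCESS, SKETCH_BAD_MATCH };

/* Returns the plane count, or 0 for a format this build does not know.
 * AYUV is deliberately missing to model the "older driver" case. */
static unsigned sketch_num_planes(uint32_t fourcc)
{
   switch (fourcc) {
   case SKETCH_FORMAT_YUYV:
      return 1;
   case SKETCH_FORMAT_NV12:
      return 2;
   default:
      return 0;
   }
}

/* Caller side: an unknown format is a match error, not an attribute error. */
static enum sketch_error sketch_check_format(uint32_t fourcc, unsigned *planes)
{
   assert(planes != NULL);
   *planes = sketch_num_planes(fourcc);
   return *planes ? SKETCH_SUCCESS : SKETCH_BAD_MATCH;
}
```

Mapping the error at the call site means a format added to the header later still fails predictably until the plane table learns about it.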



-
Lionel

On 09/11/2018 10:55, Lionel Landwerlin wrote:
diff --git a/src/egl/drivers/dri2/egl_dri2.c 
b/src/egl/drivers/dri2/egl_dri2.c

index 87e1a704c6e..3b63aebbf9a 100644
--- a/src/egl/drivers/dri2/egl_dri2.c
+++ b/src/egl/drivers/dri2/egl_dri2.c
@@ -2278,6 +2278,7 @@ dri2_num_fourcc_format_planes(EGLint format)
 case DRM_FORMAT_YVYU:
 case DRM_FORMAT_UYVY:
 case DRM_FORMAT_VYUY:
+   case DRM_FORMAT_AYUV:
    return 1;
 case DRM_FORMAT_NV12:





[Mesa-dev] [Bug 32211] [GLSL] lower_jumps with continue-statements in for-loops prevents loop unrolling

2018-11-13 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=32211

Danylo  changed:

   What|Removed |Added

 CC||danylo.pilia...@gmail.com

--- Comment #12 from Danylo  ---
(In reply to Timothy Arceri from comment #11)
> 
> So all we need to do is move everything after the if into the else block and
> remove the continue. Removing myself as assignee, this would probably be a
> good beginners task.
Hi,

I've tried this and it works for me; however, on its own it doesn't solve the
problem.

Consider the resulting nir:

loop {
block block_1:
/* preds: block_0 block_7 */
vec1 32 ssa_8 = phi block_0: ssa_4, block_7: ssa_20
vec1 32 ssa_9 = phi block_0: ssa_0, block_7: ssa_4
vec1 32 ssa_10 = phi block_0: ssa_1, block_7: ssa_4
vec1 32 ssa_11 = phi block_0: ssa_2, block_7: ssa_21
vec1 32 ssa_12 = phi block_0: ssa_3, block_7: ssa_22
vec4 32 ssa_13 = vec4 ssa_12, ssa_11, ssa_10, ssa_9
vec1 32 ssa_14 = ige ssa_8, ssa_5
/* succs: block_2 block_3 */
if ssa_14 {
block block_2:
/* preds: block_1 */
break
/* succs: block_8 */
} else {
block block_3:
/* preds: block_1 */
/* succs: block_4 */
}
block block_4:
/* preds: block_3 */
vec1 32 ssa_15 = ilt ssa_6, ssa_8
/* succs: block_5 block_6 */
if ssa_15 {
block block_5:
/* preds: block_4 */
vec1 32 ssa_16 = iadd ssa_8, ssa_7
vec1 32 ssa_17 = load_const (0x3f800000 /* 1.000000 */)
/* succs: block_7 */
} else {
block block_6:
/* preds: block_4 */
vec1 32 ssa_18 = iadd ssa_8, ssa_7
vec1 32 ssa_19 = load_const (0x3f800000 /* 1.000000 */)
/* succs: block_7 */
}
block block_7:
/* preds: block_5 block_6 */
vec1 32 ssa_20 = phi block_5: ssa_16, block_6: ssa_18
vec1 32 ssa_21 = phi block_5: ssa_17, block_6: ssa_4
vec1 32 ssa_22 = phi block_5: ssa_4, block_6: ssa_19
/* succs: block_1 */
}

Now both the "if" (block_5) and "else" (block_6) blocks contain identical
expressions and no continue.
However, no optimization pulls these identical expressions out: the CSE
pass won't do it since it is a local CSE, and only a global CSE would help.
There is no active global CSE pass. There is only the Global Code Motion pass
with Global Value Numbering, which is not enabled. Enabling it optimizes the
condition in question, and in the end the whole loop disappears as expected.
However, this pass doesn't look like something we want, since in other cases
it hurts shaders and it does more than just global CSE.

Any opinions on this?
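
To make the three shapes concrete, here is a source-level analogy in C (not NIR, and not the shader from the bug): the loop with a continue, the same loop after moving the tail into the else block, and the form a global CSE would effectively produce by hoisting the shared increment. All function names are made up.

```c
#include <assert.h>

/* Shape 1: `continue` inside the loop body. */
static int sketch_with_continue(int n, int limit, int step)
{
   assert(step > 0);
   int acc = 0;
   int i = 0;
   while (i < n) {
      if (i > limit) {
         i += step;
         acc += 1;
         continue;
      }
      i += step;
      acc += 2;
   }
   return acc;
}

/* Shape 2: tail moved into the else block, continue removed.
 * Both branches now contain the identical `i += step`. */
static int sketch_if_else(int n, int limit, int step)
{
   assert(step > 0);
   int acc = 0;
   int i = 0;
   while (i < n) {
      if (i > limit) {
         i += step;
         acc += 1;
      } else {
         i += step;
         acc += 2;
      }
   }
   return acc;
}

/* Shape 3: the shared increment hoisted out of the branches, leaving a
 * simple induction variable that a loop analysis can recognise. */
static int sketch_hoisted(int n, int limit, int step)
{
   assert(step > 0);
   int acc = 0;
   int i = 0;
   while (i < n) {
      acc += (i > limit) ? 1 : 2; /* uses the pre-increment value of i */
      i += step;
   }
   return acc;
}
```

Only in the third shape is the increment unconditional, which is roughly what unrolling needs to see.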


Re: [Mesa-dev] [PATCH v3] i965: Fix calculation of layers array length for isl_view

2018-11-13 Thread Danylo Piliaiev


Hello,

Could anyone look at the patch?
Thanks!

On 10/24/18 2:22 PM, Danylo Piliaiev wrote:

I have made a Piglit test that exercises the issue:

https://patchwork.freedesktop.org/patch/258180/

- Danil

On 9/10/18 6:21 PM, Danylo Piliaiev wrote:

Handle all cases in calculation of layers count for isl_view
taking into account texture view and image unit.
st_convert_image was taken as a reference.

When u->Layered is true, the whole level is taken with respect to the
image view. Otherwise only one layer is taken.

v3: (Józef Kucia and Ilia Mirkin)
 - Rewrote patch by taking st_convert_image as a reference
 - Removed now unused get_image_num_layers function
 - Changed commit message

Fixes: 5a8c8903
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107856

Signed-off-by: Danylo Piliaiev 
---
  .../drivers/dri/i965/brw_wm_surface_state.c   | 32 ++-
  1 file changed, 17 insertions(+), 15 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c

index 944762ec46..9bfe6e2037 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -1499,18 +1499,6 @@ update_buffer_image_param(struct brw_context 
*brw,

 param->stride[0] = _mesa_get_format_bytes(u->_ActualFormat);
  }
  -static unsigned
-get_image_num_layers(const struct intel_mipmap_tree *mt, GLenum target,
- unsigned level)
-{
-   if (target == GL_TEXTURE_CUBE_MAP)
-  return 6;
-
-   return target == GL_TEXTURE_3D ?
-  minify(mt->surf.logical_level0_px.depth, level) :
-  mt->surf.logical_level0_px.array_len;
-}
-
  static void
  update_image_surface(struct brw_context *brw,
   struct gl_image_unit *u,
@@ -1541,14 +1529,28 @@ update_image_surface(struct brw_context *brw,
    } else {
   struct intel_texture_object *intel_obj = 
intel_texture_object(obj);

   struct intel_mipmap_tree *mt = intel_obj->mt;
- const unsigned num_layers = u->Layered ?
-    get_image_num_layers(mt, obj->Target, u->Level) : 1;
+
+ unsigned base_layer, num_layers;
+ if (u->Layered) {
+    if (obj->Target == GL_TEXTURE_3D) {
+   base_layer = 0;
+   num_layers = minify(mt->surf.logical_level0_px.depth, 
u->Level);

+    } else {
+   base_layer = obj->MinLayer;
+   num_layers = obj->Immutable ?
+    obj->NumLayers :
+ mt->surf.logical_level0_px.array_len;
+    }
+ } else {
+    base_layer = obj->MinLayer + u->_Layer;
+    num_layers = 1;
+ }
     struct isl_view view = {
  .format = format,
  .base_level = obj->MinLevel + u->Level,
  .levels = 1,
-    .base_array_layer = obj->MinLayer + u->_Layer,
+    .base_array_layer = base_layer,
  .array_len = num_layers,
  .swizzle = ISL_SWIZZLE_IDENTITY,
  .usage = ISL_SURF_USAGE_STORAGE_BIT,





Re: [Mesa-dev] [PATCH v2 3/4] dri: add AYUV format

2018-11-13 Thread Lionel Landwerlin


On 13/11/2018 12:04, Tapani Pälli wrote:


On 11/13/18 1:43 PM, Lionel Landwerlin wrote:
I think this chunk (or the whole patch) should be cherry picked to 
stable.
Otherwise we get a BAD_ATTRIBUTE error for trying to create an AYUV 
EGLImage.

We should have BAD_MATCH instead.


Or should we change the reported error code in places where this is 
called? That seems like an existing bug, things wouldn't work 
correctly if someone adds a new format to drm_fourcc.h.



Sounds fair, running a patch through CI.





-
Lionel

On 09/11/2018 10:55, Lionel Landwerlin wrote:
diff --git a/src/egl/drivers/dri2/egl_dri2.c 
b/src/egl/drivers/dri2/egl_dri2.c

index 87e1a704c6e..3b63aebbf9a 100644
--- a/src/egl/drivers/dri2/egl_dri2.c
+++ b/src/egl/drivers/dri2/egl_dri2.c
@@ -2278,6 +2278,7 @@ dri2_num_fourcc_format_planes(EGLint format)
 case DRM_FORMAT_YVYU:
 case DRM_FORMAT_UYVY:
 case DRM_FORMAT_VYUY:
+   case DRM_FORMAT_AYUV:
    return 1;
 case DRM_FORMAT_NV12:








Re: [Mesa-dev] [PATCH v3] virgl: Add command and flags to initiate debugging on the host (v2)

2018-11-13 Thread Erik Faye-Lund

On Wed, 2018-09-12 at 11:59 +0200, Gert Wollny wrote:
> From: Gert Wollny 
> 
> On the host VREND_DEBUG=guestallow must be set to let the guest
> override
> the debug flags.
> 
> v2: Send flag string instead of flags, this avoids the need to keep
> the flags in sync.
> v3: Only request host logging if the host actually understands the
> command
> 
> Signed-off-by: Gert Wollny 

Looks good to me!

Reviewed-by: Erik Faye-Lund 
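
For reference, the dword rounding used when sending the flag string can be sketched in isolation. This is a simplified illustration, not the code from the patch below: `sketch_string_dwords` and its `max_dwords` limit are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Number of 32-bit words needed to hold `s` including its NUL terminator,
 * clamped to max_dwords (a hypothetical protocol limit). */
static uint32_t sketch_string_dwords(const char *s, uint32_t max_dwords)
{
   assert(s != NULL);
   size_t bytes = strlen(s) + 1;       /* payload plus terminator */
   size_t dwords = (bytes + 3) / 4;    /* round up to a whole dword */
   return dwords > max_dwords ? max_dwords : (uint32_t)dwords;
}
```

A three-character string fits in one dword with its terminator, while a four-character string needs a second dword for the NUL.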

> ---
> The corresponding virglrenderer patches can be found in this MR: 
> https://gitlab.freedesktop.org/virgl/virglrenderer/merge_requests/39
> 
> Thanks for reviewing, 
> Gert 
> 
>  src/gallium/drivers/virgl/virgl_context.c  |  8 
>  src/gallium/drivers/virgl/virgl_encode.c   | 24
> 
>  src/gallium/drivers/virgl/virgl_encode.h   |  3 +++
>  src/gallium/drivers/virgl/virgl_hw.h   |  1 +
>  src/gallium/drivers/virgl/virgl_protocol.h |  1 +
>  5 files changed, 37 insertions(+)
> 
> diff --git a/src/gallium/drivers/virgl/virgl_context.c
> b/src/gallium/drivers/virgl/virgl_context.c
> index 4511bf3b2f..96932c473d 100644
> --- a/src/gallium/drivers/virgl/virgl_context.c
> +++ b/src/gallium/drivers/virgl/virgl_context.c
> @@ -1164,6 +1164,7 @@ struct pipe_context
> *virgl_context_create(struct pipe_screen *pscreen,
> struct virgl_context *vctx;
> struct virgl_screen *rs = virgl_screen(pscreen);
> vctx = CALLOC_STRUCT(virgl_context);
> +   const char *host_debug_flagstring;
>  
> vctx->cbuf = rs->vws->cmd_buf_create(rs->vws);
> if (!vctx->cbuf) {
> @@ -1268,6 +1269,13 @@ struct pipe_context
> *virgl_context_create(struct pipe_screen *pscreen,
> virgl_encoder_create_sub_ctx(vctx, vctx->hw_sub_ctx_id);
>  
> virgl_encoder_set_sub_ctx(vctx, vctx->hw_sub_ctx_id);
> +
> +   if (rs->caps.caps.v2.capability_bits &
> VIRGL_CAP_GUEST_MAY_INIT_LOG) {
> +  host_debug_flagstring = getenv("VIRGL_HOST_DEBUG");
> +  if (host_debug_flagstring)
> + virgl_encode_host_debug_flagstring(vctx,
> host_debug_flagstring);
> +   }
> +
> return >base;
>  fail:
> return NULL;
> diff --git a/src/gallium/drivers/virgl/virgl_encode.c
> b/src/gallium/drivers/virgl/virgl_encode.c
> index e86d0711a5..400ba68474 100644
> --- a/src/gallium/drivers/virgl/virgl_encode.c
> +++ b/src/gallium/drivers/virgl/virgl_encode.c
> @@ -1044,3 +1044,27 @@ int virgl_encode_texture_barrier(struct
> virgl_context *ctx,
> virgl_encoder_write_dword(ctx->cbuf, flags);
> return 0;
>  }
> +
> +int virgl_encode_host_debug_flagstring(struct virgl_context *ctx,
> +   char *flagstring)
> +{
> +   unsigned long slen = strlen(flagstring) + 1;
> +   uint32_t sslen;
> +   uint32_t string_length;
> +
> +   if (!slen)
> +  return 0;
> +
> +   if (slen > 4 * 0x) {
> +  debug_printf("VIRGL: host debug flag string too long, will be
> truncated\n");
> +  slen = 4 * 0x;
> +   }
> +
> +   sslen = (uint32_t )(slen + 3) / 4;
> +   string_length = (uint32_t)MIN2(sslen * 4, slen);
> +
> +   virgl_encoder_write_cmd_dword(ctx,
> VIRGL_CMD0(VIRGL_CCMD_SET_DEBUG_FLAGS, 0, sslen));
> +   virgl_encoder_write_block(ctx->cbuf, (uint8_t *)flagstring,
> string_length);
> +
> +   return 0;
> +}
> diff --git a/src/gallium/drivers/virgl/virgl_encode.h
> b/src/gallium/drivers/virgl/virgl_encode.h
> index 40e62d453b..80b943a6b3 100644
> --- a/src/gallium/drivers/virgl/virgl_encode.h
> +++ b/src/gallium/drivers/virgl/virgl_encode.h
> @@ -276,4 +276,7 @@ int virgl_encode_launch_grid(struct virgl_context
> *ctx,
>   const struct pipe_grid_info
> *grid_info);
>  int virgl_encode_texture_barrier(struct virgl_context *ctx,
>   unsigned flags);
> +
> +int virgl_encode_host_debug_flagstring(struct virgl_context *ctx,
> +  char *envname);
>  #endif
> diff --git a/src/gallium/drivers/virgl/virgl_hw.h
> b/src/gallium/drivers/virgl/virgl_hw.h
> index 7736ceb935..e682c750e7 100644
> --- a/src/gallium/drivers/virgl/virgl_hw.h
> +++ b/src/gallium/drivers/virgl/virgl_hw.h
> @@ -231,6 +231,7 @@ enum virgl_formats {
>  #define VIRGL_CAP_SHADER_CLOCK (1 << 11)
>  #define VIRGL_CAP_TEXTURE_BARRIER  (1 << 12)
>  #define VIRGL_CAP_TGSI_COMPONENTS  (1 << 13)
> +#define VIRGL_CAP_GUEST_MAY_INIT_LOG   (1 << 14)
>  
>  /* virgl bind flags - these are compatible with mesa 10.5 gallium.
>   * but are fixed, no other should be passed to virgl either.
> diff --git a/src/gallium/drivers/virgl/virgl_protocol.h
> b/src/gallium/drivers/virgl/virgl_protocol.h
> index 8d99c5ed47..3373121bf7 100644
> --- a/src/gallium/drivers/virgl/virgl_protocol.h
> +++ b/src/gallium/drivers/virgl/virgl_protocol.h
> @@ -92,6 +92,7 @@ enum virgl_context_cmd {
> VIRGL_CCMD_SET_FRAMEBUFFER_STATE_NO_ATTACH,
> VIRGL_CCMD_TEXTURE_BARRIER,
> VIRGL_CCMD_SET_ATOMIC_BUFFERS,
> +   VIRGL_CCMD_SET_DEBUG_FLAGS,
>  };
>  
>  /*

Re: [Mesa-dev] [PATCH 2/3] radv: make use of nir_move_out_const_to_consumer()

2018-11-13 Thread Samuel Pitoiset


Reviewed-by: Samuel Pitoiset 

On 11/7/18 5:20 AM, Timothy Arceri wrote:

vkpipeline-db results:

Totals from affected shaders:
SGPRS: 28400 -> 28576 (0.62 %)
VGPRS: 27916 -> 27692 (-0.80 %)
Spilled SGPRs: 140 -> 138 (-1.43 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 1534456 -> 1520560 (-0.91 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 3541 -> 3582 (1.16 %)
Wait states: 0 -> 0 (0.00 %)
---
  src/amd/vulkan/radv_pipeline.c | 4 
  1 file changed, 4 insertions(+)

diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
index bced19573c1..12e7f43bde7 100644
--- a/src/amd/vulkan/radv_pipeline.c
+++ b/src/amd/vulkan/radv_pipeline.c
@@ -1814,6 +1814,10 @@ radv_link_shaders(struct radv_pipeline *pipeline, 
nir_shader **shaders)
nir_lower_io_arrays_to_elements(ordered_shaders[i],
ordered_shaders[i - 1]);
  
+		if (nir_move_out_const_to_consumer(ordered_shaders[i],

+  ordered_shaders[i - 1]))
+   radv_optimize_nir(ordered_shaders[i - 1], false, false);
+
nir_remove_dead_variables(ordered_shaders[i],
  nir_var_shader_out);
nir_remove_dead_variables(ordered_shaders[i - 1],



Re: [Mesa-dev] [PATCH 16/28] Replace IROUND_POS with _mesa_roundevenf

2018-11-13 Thread Erik Faye-Lund

On Mon, 2018-11-12 at 09:22 -0800, Dylan Baker wrote:
> Quoting Erik Faye-Lund (2018-11-12 04:51:47)
> > On Fri, 2018-11-09 at 10:40 -0800, Dylan Baker wrote:
> > > Which has the same behavior.
> > 
> > Does it? I'm not so sure... IROUND_POS seems to round to nearest
> > integer depending on the FPU rounding mode, _mesa_roundevenf rounds
> > to
> > the nearest *even* value regardless of the FPU rounding mode, no?
> > 
> > I'm not sure if it matters or not, but *at least* point that out in
> > the
> > commit message. Unless I'm missing something, of course...
> 
> I should put it in the commit message, but there is a comment in
> rounding.h that
> if you change the rounding mode you get to keep the pieces.

Well, this might regress performance pretty badly. Especially in the
swrast code, this could be bad...
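
To illustrate how two rounding rules can disagree on ties, here is a small sketch. Neither function is Mesa's IROUND_POS or _mesa_roundevenf; both are hand-written stand-ins, assuming non-negative inputs that fit in an int.

```c
#include <assert.h>

/* Round-half-up for non-negative input, via truncation of f + 0.5. */
static int sketch_round_half_up(float f)
{
   assert(f >= 0.0f);
   return (int)(f + 0.5f);
}

/* Round-half-to-even for non-negative input small enough to fit in an int. */
static int sketch_round_half_even(float f)
{
   assert(f >= 0.0f);
   int i = (int)f;              /* truncate toward zero */
   float frac = f - (float)i;   /* fractional part in [0, 1) */
   if (frac > 0.5f)
      return i + 1;
   if (frac < 0.5f)
      return i;
   return i + (i & 1);          /* tie: pick the even neighbour */
}
```

The two agree everywhere except on exact halves, where the second picks the even neighbour (2.5 gives 3 with the first rule and 2 with the second).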


Re: [Mesa-dev] [PATCH] egl/dri: fix error value with unknown drm format

2018-11-13 Thread Lionel Landwerlin


On 13/11/2018 15:43, Emil Velikov wrote:

On Tue, 13 Nov 2018 at 14:11, Lionel Landwerlin
 wrote:

According to the EGL_EXT_image_dma_buf_import spec, creating an EGL
image with a DRM format not supported should yield the BAD_MATCH
error :

"
* If  is EGL_LINUX_DMA_BUF_EXT, and the EGL_LINUX_DRM_FOURCC_EXT
  attribute is set to a format not supported by the EGL, EGL_BAD_MATCH
  is generated.
"

Signed-off-by: Lionel Landwerlin 
Fixes: 20de7f9f226401 ("egl/dri2: support for creating images out of dma 
buffers")

Reviewed-by: Emil Velikov 

Great catch Lionel. Out of curiosity, how did you spot this?

-Emil



Running :

ext_image_dma_buf_import-sample_yuv -fmt=AYUV

on an older driver.


-

Lionel


[Mesa-dev] [PATCH 22/22] nir/spirv: handle OpBitcasts for pointers

2018-11-13 Thread Karol Herbst

Signed-off-by: Karol Herbst 
---
 src/compiler/spirv/spirv_to_nir.c |   5 +-
 src/compiler/spirv/vtn_alu.c  | 187 +-
 src/compiler/spirv/vtn_private.h  |   3 +
 3 files changed, 115 insertions(+), 80 deletions(-)

diff --git a/src/compiler/spirv/spirv_to_nir.c 
b/src/compiler/spirv/spirv_to_nir.c
index 8c341e9c1fa..cbd40df7473 100644
--- a/src/compiler/spirv/spirv_to_nir.c
+++ b/src/compiler/spirv/spirv_to_nir.c
@@ -4067,7 +4067,6 @@ vtn_handle_body_instruction(struct vtn_builder *b, SpvOp 
opcode,
case SpvOpConvertUToPtr:
case SpvOpPtrCastToGeneric:
case SpvOpGenericCastToPtr:
-   case SpvOpBitcast:
case SpvOpIsNan:
case SpvOpIsInf:
case SpvOpIsFinite:
@@ -4152,6 +4151,10 @@ vtn_handle_body_instruction(struct vtn_builder *b, SpvOp 
opcode,
   vtn_handle_alu(b, opcode, w, count);
   break;
 
+   case SpvOpBitcast:
+  vtn_handle_bitcast(b, opcode, w, count);
+  break;
+
case SpvOpVectorExtractDynamic:
case SpvOpVectorInsertDynamic:
case SpvOpVectorShuffle:
diff --git a/src/compiler/spirv/vtn_alu.c b/src/compiler/spirv/vtn_alu.c
index 32825da29cb..e1088a7e9db 100644
--- a/src/compiler/spirv/vtn_alu.c
+++ b/src/compiler/spirv/vtn_alu.c
@@ -211,81 +211,6 @@ vtn_handle_matrix_alu(struct vtn_builder *b, SpvOp opcode,
}
 }
 
-static void
-vtn_handle_bitcast(struct vtn_builder *b, struct vtn_ssa_value *dest,
-   struct nir_ssa_def *src)
-{
-   if (glsl_get_vector_elements(dest->type) == src->num_components) {
-  /* From the definition of OpBitcast in the SPIR-V 1.2 spec:
-   *
-   * "If Result Type has the same number of components as Operand, they
-   * must also have the same component width, and results are computed per
-   * component."
-   */
-  dest->def = nir_imov(>nb, src);
-  return;
-   }
-
-   /* From the definition of OpBitcast in the SPIR-V 1.2 spec:
-*
-* "If Result Type has a different number of components than Operand, the
-* total number of bits in Result Type must equal the total number of bits
-* in Operand. Let L be the type, either Result Type or Operand’s type, that
-* has the larger number of components. Let S be the other type, with the
-* smaller number of components. The number of components in L must be an
-* integer multiple of the number of components in S. The first component
-* (that is, the only or lowest-numbered component) of S maps to the first
-* components of L, and so on, up to the last component of S mapping to the
-* last components of L. Within this mapping, any single component of S
-* (mapping to multiple components of L) maps its lower-ordered bits to the
-* lower-numbered components of L."
-*/
-   unsigned src_bit_size = src->bit_size;
-   unsigned dest_bit_size = glsl_get_bit_size(dest->type);
-   unsigned src_components = src->num_components;
-   unsigned dest_components = glsl_get_vector_elements(dest->type);
-   vtn_assert(src_bit_size * src_components == dest_bit_size * 
dest_components);
-
-   nir_ssa_def *dest_chan[NIR_MAX_VEC_COMPONENTS];
-   if (src_bit_size > dest_bit_size) {
-  vtn_assert(src_bit_size % dest_bit_size == 0);
-  unsigned divisor = src_bit_size / dest_bit_size;
-  for (unsigned comp = 0; comp < src_components; comp++) {
- nir_ssa_def *split;
- if (src_bit_size == 64) {
-assert(dest_bit_size == 32 || dest_bit_size == 16);
-split = dest_bit_size == 32 ?
-   nir_unpack_64_2x32(&b->nb, nir_channel(&b->nb, src, comp)) :
-   nir_unpack_64_4x16(&b->nb, nir_channel(&b->nb, src, comp));
- } else {
-vtn_assert(src_bit_size == 32);
-vtn_assert(dest_bit_size == 16);
-split = nir_unpack_32_2x16(&b->nb, nir_channel(&b->nb, src, comp));
- }
- for (unsigned i = 0; i < divisor; i++)
-dest_chan[divisor * comp + i] = nir_channel(&b->nb, split, i);
-  }
-   } else {
-  vtn_assert(dest_bit_size % src_bit_size == 0);
-  unsigned divisor = dest_bit_size / src_bit_size;
-  for (unsigned comp = 0; comp < dest_components; comp++) {
- unsigned channels = ((1 << divisor) - 1) << (comp * divisor);
- nir_ssa_def *src_chan = nir_channels(&b->nb, src, channels);
- if (dest_bit_size == 64) {
-assert(src_bit_size == 32 || src_bit_size == 16);
-dest_chan[comp] = src_bit_size == 32 ?
-   nir_pack_64_2x32(&b->nb, src_chan) :
-   nir_pack_64_4x16(&b->nb, src_chan);
- } else {
-vtn_assert(dest_bit_size == 32);
-vtn_assert(src_bit_size == 16);
-dest_chan[comp] = nir_pack_32_2x16(&b->nb, src_chan);
- }
-  }
-   }
-   dest->def = nir_vec(&b->nb, dest_chan, dest_components);
-}
-
 nir_op
 vtn_nir_alu_op_for_spirv_opcode(struct vtn_builder *b,
 SpvOp opcode, bool *swap,
@@ -451,6 +376,114 @@ handle_rounding_mode(struct
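
The OpBitcast rule quoted in the removed vtn_handle_bitcast() comment (each component of the smaller-count type S maps its lower-ordered bits to the lower-numbered components of L) can be sketched on plain integers. This is only an illustration of the lane mapping, not the NIR code; the function names bitcast_split and bitcast_merge are made up for the example.

```python
def bitcast_split(values, src_bits, dst_bits):
    """Split each wide component into narrower ones, low bits first."""
    assert src_bits % dst_bits == 0
    divisor = src_bits // dst_bits
    mask = (1 << dst_bits) - 1
    out = []
    for v in values:
        for i in range(divisor):
            # Component i of the result takes bits [i*dst_bits, (i+1)*dst_bits).
            out.append((v >> (i * dst_bits)) & mask)
    return out


def bitcast_merge(values, src_bits, dst_bits):
    """Pack groups of narrow components into wider ones, low lanes first."""
    assert dst_bits % src_bits == 0
    divisor = dst_bits // src_bits
    assert len(values) % divisor == 0
    out = []
    for comp in range(len(values) // divisor):
        word = 0
        for i in range(divisor):
            word |= values[comp * divisor + i] << (i * src_bits)
        out.append(word)
    return out


halves = bitcast_split([0x1122334455667788], 64, 32)
print([hex(h) for h in halves])
print([hex(w) for w in bitcast_merge(halves, 32, 64)])
```

Splitting and then merging with swapped bit sizes returns the original components, which is the round-trip property the spec wording implies.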

[Mesa-dev] [PATCH 02/22] nir: replace nir_load_system_value calls with appropiate builder functions

2018-11-13 Thread Karol Herbst

this helps reduce the overall code changes when a bit_size parameter is
added to nir_load_system_value

Reviewed-by: Jason Ekstrand 
Reviewed-by: Eric Anholt 
Signed-off-by: Karol Herbst 
---
 src/amd/vulkan/radv_meta_buffer.c|  8 
 src/amd/vulkan/radv_meta_bufimage.c  | 16 
 src/amd/vulkan/radv_meta_clear.c |  8 
 src/amd/vulkan/radv_meta_fast_clear.c|  4 ++--
 src/amd/vulkan/radv_meta_resolve_cs.c|  4 ++--
 src/amd/vulkan/radv_query.c  |  8 
 src/compiler/nir/nir_lower_clip.c|  3 +--
 src/compiler/nir/nir_lower_wpos_center.c |  3 +--
 .../vulkan/anv_nir_lower_input_attachments.c |  3 +--
 9 files changed, 27 insertions(+), 30 deletions(-)

diff --git a/src/amd/vulkan/radv_meta_buffer.c b/src/amd/vulkan/radv_meta_buffer.c
index 8726d36f5fa..76854d7bbad 100644
--- a/src/amd/vulkan/radv_meta_buffer.c
+++ b/src/amd/vulkan/radv_meta_buffer.c
@@ -15,8 +15,8 @@ build_buffer_fill_shader(struct radv_device *dev)
b.shader->info.cs.local_size[1] = 1;
b.shader->info.cs.local_size[2] = 1;
 
-   nir_ssa_def *invoc_id = nir_load_system_value(&b, nir_intrinsic_load_local_invocation_id, 0);
-   nir_ssa_def *wg_id = nir_load_system_value(&b, nir_intrinsic_load_work_group_id, 0);
+   nir_ssa_def *invoc_id = nir_load_local_invocation_id(&b);
+   nir_ssa_def *wg_id = nir_load_work_group_id(&b);
nir_ssa_def *block_size = nir_imm_ivec4(&b,
b.shader->info.cs.local_size[0],
b.shader->info.cs.local_size[1],
@@ -67,8 +67,8 @@ build_buffer_copy_shader(struct radv_device *dev)
b.shader->info.cs.local_size[1] = 1;
b.shader->info.cs.local_size[2] = 1;
 
-   nir_ssa_def *invoc_id = nir_load_system_value(&b, nir_intrinsic_load_local_invocation_id, 0);
-   nir_ssa_def *wg_id = nir_load_system_value(&b, nir_intrinsic_load_work_group_id, 0);
+   nir_ssa_def *invoc_id = nir_load_local_invocation_id(&b);
+   nir_ssa_def *wg_id = nir_load_work_group_id(&b);
nir_ssa_def *block_size = nir_imm_ivec4(&b,
b.shader->info.cs.local_size[0],
b.shader->info.cs.local_size[1],
diff --git a/src/amd/vulkan/radv_meta_bufimage.c b/src/amd/vulkan/radv_meta_bufimage.c
index 6f074a70b4c..f5b68f6c9a6 100644
--- a/src/amd/vulkan/radv_meta_bufimage.c
+++ b/src/amd/vulkan/radv_meta_bufimage.c
@@ -60,8 +60,8 @@ build_nir_itob_compute_shader(struct radv_device *dev, bool is_3d)
output_img->data.descriptor_set = 0;
output_img->data.binding = 1;
 
-   nir_ssa_def *invoc_id = nir_load_system_value(&b, nir_intrinsic_load_local_invocation_id, 0);
-   nir_ssa_def *wg_id = nir_load_system_value(&b, nir_intrinsic_load_work_group_id, 0);
+   nir_ssa_def *invoc_id = nir_load_local_invocation_id(&b);
+   nir_ssa_def *wg_id = nir_load_work_group_id(&b);
nir_ssa_def *block_size = nir_imm_ivec4(&b,
b.shader->info.cs.local_size[0],
b.shader->info.cs.local_size[1],
@@ -289,8 +289,8 @@ build_nir_btoi_compute_shader(struct radv_device *dev, bool is_3d)
output_img->data.descriptor_set = 0;
output_img->data.binding = 1;
 
-   nir_ssa_def *invoc_id = nir_load_system_value(&b, nir_intrinsic_load_local_invocation_id, 0);
-   nir_ssa_def *wg_id = nir_load_system_value(&b, nir_intrinsic_load_work_group_id, 0);
+   nir_ssa_def *invoc_id = nir_load_local_invocation_id(&b);
+   nir_ssa_def *wg_id = nir_load_work_group_id(&b);
nir_ssa_def *block_size = nir_imm_ivec4(&b,
b.shader->info.cs.local_size[0],
b.shader->info.cs.local_size[1],
@@ -719,8 +719,8 @@ build_nir_itoi_compute_shader(struct radv_device *dev, bool is_3d)
output_img->data.descriptor_set = 0;
output_img->data.binding = 1;
 
-   nir_ssa_def *invoc_id = nir_load_system_value(&b, nir_intrinsic_load_local_invocation_id, 0);
-   nir_ssa_def *wg_id = nir_load_system_value(&b, nir_intrinsic_load_work_group_id, 0);
+   nir_ssa_def *invoc_id = nir_load_local_invocation_id(&b);
+   nir_ssa_def *wg_id = nir_load_work_group_id(&b);
nir_ssa_def *block_size = nir_imm_ivec4(&b,
b.shader->info.cs.local_size[0],
b.shader->info.cs.local_size[1],
@@ -1139,8 +1139,8 @@ build_nir_cleari_compute_shader(struct radv_device *dev, bool is_3d)
output_img->data.descriptor_set = 0;
output_img->data.binding = 0;
 
-   nir_ssa_def *invoc_id = nir_load_system_value(&b, nir_intrinsic_load_local_invocation_id, 0);
-   nir_ssa_def *wg_id =

[Mesa-dev] [PATCH 01/22] nir: add const_index parameters to system value builder function

2018-11-13 Thread Karol Herbst

This allows replacing some nir_load_system_value calls with the specific
system value constructor.

Reviewed-by: Jason Ekstrand 
Reviewed-by: Eric Anholt 
Signed-off-by: Karol Herbst 
---
 src/compiler/nir/nir_builder_opcodes_h.py | 21 +++--
 1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/src/compiler/nir/nir_builder_opcodes_h.py b/src/compiler/nir/nir_builder_opcodes_h.py
index e600093e9f6..34b8c4371e1 100644
--- a/src/compiler/nir/nir_builder_opcodes_h.py
+++ b/src/compiler/nir/nir_builder_opcodes_h.py
@@ -55,11 +55,28 @@ nir_load_system_value(nir_builder *build, nir_intrinsic_op op, int index)
return &load->dest.ssa;
 }
 
+<%
+def sysval_decl_list(opcode):
+   res = ''
+   if opcode.indices:
+  res += ', unsigned ' + opcode.indices[0].lower()
+   return res
+
+def sysval_arg_list(opcode):
+   args = []
+   if opcode.indices:
+  args.append(opcode.indices[0].lower())
+   else:
+  args.append('0')
+   return ', '.join(args)
+%>
+
 % for name, opcode in filter(lambda v: v[1].sysval, sorted(INTR_OPCODES.items())):
 static inline nir_ssa_def *
-nir_${name}(nir_builder *build)
+nir_${name}(nir_builder *build${sysval_decl_list(opcode)})
 {
-   return nir_load_system_value(build, nir_intrinsic_${name}, 0);
+   return nir_load_system_value(build, nir_intrinsic_${name},
+${sysval_arg_list(opcode)});
 }
 % endfor
 
-- 
2.19.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
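
As a rough illustration of what the two Mako helpers in the patch above produce, the sketch below runs them outside the template against stand-in opcode objects. The helper bodies are copied from the patch; the FakeOpcode class, the EXAMPLE_INDEX index name, the intrinsic name load_example and the render_decl wrapper are assumptions made only for this example and are not part of Mesa.

```python
from collections import namedtuple

# Hypothetical stand-in for the Intrinsic class: only the field the helpers read.
FakeOpcode = namedtuple("FakeOpcode", ["indices"])


def sysval_decl_list(opcode):
    # Extra C parameter for the first constant index, if there is one.
    res = ''
    if opcode.indices:
        res += ', unsigned ' + opcode.indices[0].lower()
    return res


def sysval_arg_list(opcode):
    # Argument forwarded to nir_load_system_value: the index, or a literal 0.
    args = []
    if opcode.indices:
        args.append(opcode.indices[0].lower())
    else:
        args.append('0')
    return ', '.join(args)


def render_decl(name, opcode):
    # Hypothetical one-line summary of the generated wrapper.
    return ("nir_%s(nir_builder *build%s) -> "
            "nir_load_system_value(build, nir_intrinsic_%s, %s)"
            % (name, sysval_decl_list(opcode), name, sysval_arg_list(opcode)))


no_index = FakeOpcode(indices=[])
with_index = FakeOpcode(indices=["EXAMPLE_INDEX"])
print(render_decl("load_example", no_index))
print(render_decl("load_example", with_index))
```

Intrinsics without constant indices keep their one-argument builder, so existing callers are unaffected; only intrinsics that declare an index gain a parameter.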

Re: [Mesa-dev] [PATCH 16/28] Replace IROUND_POS with _mesa_roundevenf

2018-11-13 Thread Dylan Baker

Quoting Erik Faye-Lund (2018-11-13 01:34:53)
> On Mon, 2018-11-12 at 09:22 -0800, Dylan Baker wrote:
> > Quoting Erik Faye-Lund (2018-11-12 04:51:47)
> > > On Fri, 2018-11-09 at 10:40 -0800, Dylan Baker wrote:
> > > > Which has the same behavior.
> > > 
> > > Does it? I'm not so sure... IROUND_POS seems to round to nearest
> > > integer depending on the FPU rounding mode, _mesa_roundevenf rounds
> > > to the nearest *even* value regardless of the FPU rounding mode, no?
> > > 
> > > I'm not sure if it matters or not, but *at least* point that out in
> > > the commit message. Unless I'm missing something, of course...
> > 
> > I should put it in the commit message, but there is a comment in
> > rounding.h that if you change the rounding mode you get to keep the pieces.
> 
> Well, this might regress performance pretty badly. Especially in the
> swrast code, this could be bad...
> 

Why? We already have the assumption that you don't change the rounding mode in
core mesa and many of the drivers.

For performance, I measured a simple loop of 1000 roundings, and found that the
only way the rounding.h function was slower is if you used the __SSE4_1__
path... (It had the same performance as the int cast +0.5 implementation.)

Dylan
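
For readers following the rounding question, here is a minimal sketch of where "add 0.5 and truncate" and "round half to even" differ. It is not Mesa code: iround_pos models the "int cast +0.5" approach mentioned above under the assumption that the input is non-negative, and round_even relies on Python's built-in round(), which rounds ties to even.

```python
def iround_pos(x):
    """Add 0.5 and truncate toward zero; assumes x >= 0."""
    assert x >= 0.0
    return int(x + 0.5)


def round_even(x):
    """Round to nearest, ties to even (Python's round() behaves this way)."""
    return round(x)


for x in (0.5, 1.5, 2.5, 2.4, 2.6):
    print(x, iround_pos(x), round_even(x))
```

The two only disagree on exact halves whose lower neighbour is even (0.5, 2.5, ...), so whether the substitution is visible depends on how often such inputs occur.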



[Mesa-dev] [PATCH 14/22] nir: add legal bit_sizes to intrinsics

2018-11-13 Thread Karol Herbst

With OpenCL some system values match the address bits, but in GLSL we also
have some system values being 64 bit.

With this it is possible to adjust the builder functions so that depending
on the bit_sizes the correct bit_size is used or an additional argument is
added in case of multiple possible values.

Also this allows for further validation
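
A small sketch of how the list of legal bit sizes collapses into the bitfield. The reduce(operator.or_, ...) expression is the one used in the generator change in this patch; the bit_sizes_mask and is_legal names are hypothetical, and the validation check is only an assumption about how such a mask could be consumed.

```python
from functools import reduce
import operator


def bit_sizes_mask(bit_sizes):
    # Each legal size is itself a power of two, so OR-ing the sizes
    # yields one bit per legal size.
    return reduce(operator.or_, bit_sizes, 0)


def is_legal(mask, bit_size):
    # Hypothetical validation helper: a size is legal if its bit is set.
    return (mask & bit_size) != 0


print(bit_sizes_mask([32]))
print(bit_sizes_mask([32, 64]))
print(is_legal(bit_sizes_mask([32, 64]), 16))
```

Since the sizes 1, 8, 16, 32 and 64 OR together to 121, any combination of them fits in the 7-bit field added to nir_intrinsic_info.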

Signed-off-by: Karol Herbst 
---
 src/compiler/nir/nir.h   |  3 +++
 src/compiler/nir/nir_intrinsics.py   | 15 ++-
 src/compiler/nir/nir_intrinsics_c.py |  6 +-
 src/nouveau/meson.build  | 25 +
 4 files changed, 43 insertions(+), 6 deletions(-)
 create mode 100644 src/nouveau/meson.build

diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
index be4f64464f9..3855eb0b582 100644
--- a/src/compiler/nir/nir.h
+++ b/src/compiler/nir/nir.h
@@ -1283,6 +1283,9 @@ typedef struct {
 
/** semantic flags for calls to this intrinsic */
nir_intrinsic_semantic_flag flags;
+
+   /** bitfield of legal bit sizes */
+   unsigned bit_sizes : 7;
 } nir_intrinsic_info;
 
 extern const nir_intrinsic_info nir_intrinsic_infos[nir_num_intrinsics];
diff --git a/src/compiler/nir/nir_intrinsics.py b/src/compiler/nir/nir_intrinsics.py
index ec3049ca06d..9ada44aad8a 100644
--- a/src/compiler/nir/nir_intrinsics.py
+++ b/src/compiler/nir/nir_intrinsics.py
@@ -32,7 +32,7 @@ class Intrinsic(object):
NOTE: this must be kept in sync with nir_intrinsic_info.
"""
def __init__(self, name, src_components, dest_components,
-indices, flags, sysval):
+indices, flags, sysval, bit_sizes):
"""Parameters:
 
- name: the intrinsic name
@@ -45,6 +45,7 @@ class Intrinsic(object):
- indices: list of constant indicies
- flags: list of semantic flags
- sysval: is this a system-value intrinsic
+   - bit_sizes: allowed dest bit_sizes
"""
assert isinstance(name, str)
assert isinstance(src_components, list)
@@ -58,6 +59,8 @@ class Intrinsic(object):
if flags:
assert isinstance(flags[0], str)
assert isinstance(sysval, bool)
+   if bit_sizes:
+   assert isinstance(bit_sizes[0], int)
 
self.name = name
self.num_srcs = len(src_components)
@@ -68,6 +71,7 @@ class Intrinsic(object):
self.indices = indices
self.flags = flags
self.sysval = sysval
+   self.bit_sizes = bit_sizes
 
 #
 # Possible indices:
@@ -120,10 +124,10 @@ CAN_REORDER   = "NIR_INTRINSIC_CAN_REORDER"
 INTR_OPCODES = {}
 
 def intrinsic(name, src_comp=[], dest_comp=-1, indices=[],
-  flags=[], sysval=False):
+  flags=[], sysval=False, bit_sizes=[]):
 assert name not in INTR_OPCODES
 INTR_OPCODES[name] = Intrinsic(name, src_comp, dest_comp,
-   indices, flags, sysval)
+   indices, flags, sysval, bit_sizes)
 
 intrinsic("nop", flags=[CAN_ELIMINATE])
 
@@ -446,9 +450,10 @@ intrinsic("shared_atomic_fmin",  src_comp=[1, 1], dest_comp=1, indices=[BASE])
 intrinsic("shared_atomic_fmax",  src_comp=[1, 1], dest_comp=1, indices=[BASE])
 intrinsic("shared_atomic_fcomp_swap", src_comp=[1, 1, 1], dest_comp=1, indices=[BASE])
 
-def system_value(name, dest_comp, indices=[]):
+def system_value(name, dest_comp, indices=[], bit_sizes=[32]):
 intrinsic("load_" + name, [], dest_comp, indices,
-  flags=[CAN_ELIMINATE, CAN_REORDER], sysval=True)
+  flags=[CAN_ELIMINATE, CAN_REORDER], sysval=True,
+  bit_sizes=bit_sizes)
 
 system_value("frag_coord", 4)
 system_value("front_face", 1)
diff --git a/src/compiler/nir/nir_intrinsics_c.py b/src/compiler/nir/nir_intrinsics_c.py
index ac45b94d496..d0f1c29fa39 100644
--- a/src/compiler/nir/nir_intrinsics_c.py
+++ b/src/compiler/nir/nir_intrinsics_c.py
@@ -1,3 +1,5 @@
+from functools import reduce
+import operator
 
 template = """\
 /* Copyright (C) 2018 Red Hat
@@ -45,6 +47,7 @@ const nir_intrinsic_info nir_intrinsic_infos[nir_num_intrinsics] = {
 },
 % endif
.flags = ${"0" if len(opcode.flags) == 0 else " | ".join(opcode.flags)},
+   .bit_sizes = ${reduce(operator.or_, opcode.bit_sizes, 0)},
 },
 % endfor
 };
@@ -54,6 +57,7 @@ from nir_intrinsics import INTR_OPCODES
 from mako.template import Template
 import argparse
 import os
+import functools
 
 def main():
 parser = argparse.ArgumentParser()
@@ -64,7 +68,7 @@ def main():
 
 path = os.path.join(args.outdir, 'nir_intrinsics.c')
 with open(path, 'wb') as f:
-f.write(Template(template, output_encoding='utf-8').render(INTR_OPCODES=INTR_OPCODES))
+f.write(Template(template, output_encoding='utf-8').render(INTR_OPCODES=INTR_OPCODES, reduce=reduce, operator=operator))
 
 if __name__ == '__main__':
 main()
diff --git a/src/nouveau/meson.build b/src/nouveau/meson.build
new file mode 100644
index 000..5c265f207ab
--- /dev/null
+++

[Mesa-dev] [PATCH 20/22] nir/spirv: physical pointer support

2018-11-13 Thread Karol Herbst

this adds support for pointers from CL kernels. The basic idea here is to be
able to start a deref chain from a random ssa value and vice versa.

changes summed up:
1. derefs can start from a deref_cast
2. new ptr_as_array deref type to offset a pointer
3. derefs can end with a ssa_from_deref intrinsic

One problem with this implementation is that we don't track whether a deref
points to a value or a pointer, although this shouldn't cause any problems at
runtime as a pointer gets fed into a load or ssa_from_deref.

What annoys me more is that we need to keep track of the original SSA value we
started the pointer from. For graphics that's most of the time a variable or
something we can base the pointer on, but for kernels we usually start from a
plain SSA value.

Currently I reuse the UBO offset variable to keep track of the initial ssa
value, but it's hacky. Most bits of the patch feel rather hacky, but it is
much smaller than what we had with the fat ptr approach and I don't have to add
a new phys_pointer value type, which made things a lot easier.

Signed-off-by: Karol Herbst 
---
 src/compiler/nir/nir.c|  4 +-
 src/compiler/nir/nir.h|  9 ++
 src/compiler/nir/nir_builder.h| 37 +++-
 src/compiler/nir/nir_clone.c  |  1 +
 src/compiler/nir/nir_deref.c  | 26 +-
 src/compiler/nir/nir_instr_set.c  |  2 +
 src/compiler/nir/nir_intrinsics.py|  7 ++
 src/compiler/nir/nir_loop_analyze.c   |  2 +-
 src/compiler/nir/nir_lower_indirect_derefs.c  |  6 +-
 src/compiler/nir/nir_lower_io.c   | 79 +++-
 .../nir/nir_lower_io_arrays_to_elements.c |  4 +-
 src/compiler/nir/nir_lower_locals_to_regs.c   |  9 +-
 src/compiler/nir/nir_lower_var_copies.c   |  3 +-
 src/compiler/nir/nir_lower_vars_to_ssa.c  | 12 ++-
 src/compiler/nir/nir_opt_copy_propagate.c |  2 +-
 src/compiler/nir/nir_opt_dead_write_vars.c|  4 +-
 src/compiler/nir/nir_print.c  |  6 +-
 src/compiler/nir/nir_propagate_invariant.c|  2 +
 src/compiler/nir/nir_remove_dead_variables.c  |  2 +
 src/compiler/nir/nir_serialize.c  |  2 +
 src/compiler/nir/nir_split_vars.c |  4 +-
 src/compiler/nir/nir_validate.c   | 17 +++-
 src/compiler/spirv/spirv_to_nir.c | 28 +-
 src/compiler/spirv/vtn_cfg.c  |  3 +-
 src/compiler/spirv/vtn_private.h  |  1 +
 src/compiler/spirv/vtn_variables.c| 90 +++
 26 files changed, 296 insertions(+), 66 deletions(-)

diff --git a/src/compiler/nir/nir.c b/src/compiler/nir/nir.c
index ca258b7c80e..66648885ec7 100644
--- a/src/compiler/nir/nir.c
+++ b/src/compiler/nir/nir.c
@@ -463,7 +463,7 @@ nir_deref_instr_create(nir_shader *shader, nir_deref_type deref_type)
if (deref_type != nir_deref_type_var)
   src_init(&instr->parent);
 
-   if (deref_type == nir_deref_type_array)
+   if (nir_deref_is_array(instr))
   src_init(&instr->arr.index);
 
dest_init(&instr->dest);
@@ -1069,7 +1069,7 @@ visit_deref_instr_src(nir_deref_instr *instr,
  return false;
}
 
-   if (instr->deref_type == nir_deref_type_array) {
+   if (nir_deref_is_array(instr)) {
   if (!visit_src(&instr->arr.index, cb, state))
  return false;
}
diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
index 35f2ec02c31..40f5a0e4e06 100644
--- a/src/compiler/nir/nir.h
+++ b/src/compiler/nir/nir.h
@@ -990,6 +990,7 @@ typedef enum {
nir_deref_type_array_wildcard,
nir_deref_type_struct,
nir_deref_type_cast,
+   nir_deref_type_ptr_as_array,
 } nir_deref_type;
 
 typedef struct {
@@ -1014,6 +1015,7 @@ typedef struct {
 
/** Additional deref parameters */
union {
+  /** used for deref_array and deref_ptr_as_array */
   struct {
  nir_src index;
   } arr;
@@ -1068,6 +1070,13 @@ bool nir_deref_instr_has_indirect(nir_deref_instr 
*instr);
 
 bool nir_deref_instr_remove_if_unused(nir_deref_instr *instr);
 
+static inline bool
+nir_deref_is_array(const nir_deref_instr *instr)
+{
+   return instr->deref_type == nir_deref_type_array ||
+  instr->deref_type == nir_deref_type_ptr_as_array;
+}
+
 typedef struct {
nir_instr instr;
 
diff --git a/src/compiler/nir/nir_builder.h b/src/compiler/nir/nir_builder.h
index 57f0a188c46..428c6b4fd78 100644
--- a/src/compiler/nir/nir_builder.h
+++ b/src/compiler/nir/nir_builder.h
@@ -623,6 +623,7 @@ nir_ssa_for_alu_src(nir_builder *build, nir_alu_instr 
*instr, unsigned srcn)
 static inline nir_deref_instr *
 nir_build_deref_var(nir_builder *build, nir_variable *var)
 {
+   unsigned ptr_size = build->shader->ptr_size ? build->shader->ptr_size : 32;
nir_deref_instr *deref =
   nir_deref_instr_create(build->shader, nir_deref_type_var);
 
@@ -630,7 +631,7 @@ nir_build_deref_var(nir_builder *build, nir_variable *var)
deref->type = var->type;
deref->var = var;
 
-

[Mesa-dev] [PATCH 19/22] nir/spirv: handle kernel function parameters

2018-11-13 Thread Karol Herbst

the idea here is to generate an entry point stub function wrapping around the
actual kernel function and turn all parameters into shader inputs with byte
addressing instead of vec4.

This gives us several advantages:
1. calling kernel functions doesn't differ from calling any other function
2. CL inputs match uniforms in most ways and we can just take advantage of most
   of nir_lower_io

Signed-off-by: Karol Herbst 
---
 src/compiler/spirv/spirv_to_nir.c | 32 +++
 1 file changed, 32 insertions(+)

diff --git a/src/compiler/spirv/spirv_to_nir.c b/src/compiler/spirv/spirv_to_nir.c
index a350a95e27e..dbac3b2e052 100644
--- a/src/compiler/spirv/spirv_to_nir.c
+++ b/src/compiler/spirv/spirv_to_nir.c
@@ -4335,6 +4335,38 @@ spirv_to_nir(const uint32_t *words, size_t word_count,
nir_function *entry_point = b->entry_point->func->impl->function;
vtn_assert(entry_point);
 
+   /* post process entry_points with input params */
+   if (entry_point->num_params) {
+  /* we shouldn't have any inputs yet */
+  vtn_assert(!entry_point->shader->num_inputs);
+
+  nir_function *main_entry_point = nir_function_create(b->shader, ralloc_strdup(b->shader, "main"));
+  main_entry_point->impl = nir_function_impl_create(main_entry_point);
+  nir_builder_init(&b->nb, main_entry_point->impl);
+  b->nb.cursor = nir_after_cf_list(&main_entry_point->impl->body);
+  b->func_param_idx = 0;
+
+  nir_call_instr *call = nir_call_instr_create(b->nb.shader, entry_point);
+
+  for (unsigned i = 0; i < entry_point->num_params; ++i) {
+ /* input variable */
+ nir_variable *in_var = rzalloc(b->nb.shader, nir_variable);
+ in_var->data.mode = nir_var_shader_in;
+ in_var->data.read_only = true;
+ in_var->data.location = i;
+ in_var->type = b->entry_point->func->type->params[i]->type;
+
+ nir_shader_add_variable(b->nb.shader, in_var);
+ b->nb.shader->num_inputs++;
+
+ call->params[i] = nir_src_for_ssa(nir_load_var(&b->nb, in_var));
+  }
+
+  nir_builder_instr_insert(&b->nb, &call->instr);
+
+  entry_point = main_entry_point;
+   }
+
/* Unparent the shader from the vtn_builder before we delete the builder */
ralloc_steal(NULL, b->shader);
 
-- 
2.19.1


[Mesa-dev] [Bug 108734] Regression: [bisected] dEQP-GLES31.functional.tessellation.invariance.* start failing on r600

2018-11-13 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=108734

Bug ID: 108734
   Summary: Regression: [bisected]
dEQP-GLES31.functional.tessellation.invariance.* start
failing on r600
   Product: Mesa
   Version: git
  Hardware: Other
OS: All
Status: NEW
  Severity: normal
  Priority: medium
 Component: Other
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: gw.foss...@gmail.com
QA Contact: mesa-dev@lists.freedesktop.org

The patch  

  5d517a599b1eabd1d5696bf31e26f16568d35770 
  st/mesa: Don't record garbage streamout information in the non-SSO case.

breaks dEQP-GLES31.functional.tessellation.invariance.* on r600. All the tests
pass without this patch, but with the patch applied 

   glGetQueryObjectuiv(queryObject, GL_QUERY_RESULT, &result);

returns zero in result for all the tests from this set, which is not correct.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.

Re: [Mesa-dev] [PATCH v2 4/5] main: avoid 'may be used uninitialized' warnings

2018-11-13 Thread Eric Engestrom

On Tuesday, 2018-11-13 14:19:31 +0200, asimiklit.w...@gmail.com wrote:
> From: Andrii Simiklit 
> 
> 1. main/texcompress_etc.c:1314:12:
> warning: ‘*((void *)+2)’ may be used uninitialized in this function
> 2. main/texcompress_etc.c:1354:12:
> warning: ‘*((void *)+2)’ may be used uninitialized in this function
> 3. main/texcompress_etc.c:1293:12:
> warning: ‘dst’ may be used uninitialized in this function
> 4. main/texcompress_etc.c:1335:12:
> warning: ‘dst’ may be used uninitialized in this function
> 5. main/texcompress_etc.c:1460:12:
> warning: ‘*((void *)+1)’ may be used uninitialized in this function
> 
> v2: Fixed by adding the unreachable case to the etc2_rgb8_fetch_texel
>( Eric Engestrom  )
> Changes for warning 'pixerrorcolorbest' were removed.
> 
> Signed-off-by: Andrii Simiklit 

This is the right way of fixing this code-wise, but I'm not 100% sure we
can actually guarantee this logic-wise, so I'll let someone else review
(and push) this patch.

Acked-by: Eric Engestrom 

> ---
>  src/mesa/main/texcompress_etc.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/src/mesa/main/texcompress_etc.c b/src/mesa/main/texcompress_etc.c
> index b39ab33d36..f1da4d0f11 100644
> --- a/src/mesa/main/texcompress_etc.c
> +++ b/src/mesa/main/texcompress_etc.c
> @@ -548,6 +548,7 @@ etc2_rgb8_fetch_texel(const struct etc2_block *block,
>if (punchthrough_alpha)
>   dst[3] = 255;
> }
> +   else unreachable("unhandled block mode");
>  }
>  
>  static void
> -- 
> 2.17.1
> 

Re: [Mesa-dev] [PATCH v2 5/5] intel/tools: avoid 'ignoring return value'

2018-11-13 Thread Eric Engestrom

On Tuesday, 2018-11-13 14:19:32 +0200, asimiklit.w...@gmail.com wrote:
> From: Andrii Simiklit 
> 
> 1. tools/i965_disasm.c:58:4: warning:
>  ignoring return value of ‘fread’,
>  declared with attribute warn_unused_result
>  fread(assembly, *end, 1, fp);
> 
> v2: - Fixed incorrect return value check.
>( Eric Engestrom  )
> 
> Signed-off-by: Andrii Simiklit 
> ---
>  src/intel/tools/i965_disasm.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/src/intel/tools/i965_disasm.c b/src/intel/tools/i965_disasm.c
> index 73a6760fc1..329f6327ed 100644
> --- a/src/intel/tools/i965_disasm.c
> +++ b/src/intel/tools/i965_disasm.c
> @@ -55,7 +55,8 @@ i965_disasm_read_binary(FILE *fp, size_t *end)
> if (assembly == NULL)
>return NULL;
>  
> -   fread(assembly, *end, 1, fp);
> +   MAYBE_UNUSED size_t size = fread(assembly, *end, 1, fp);
> +   assert((size || (*end == 0)) && "error: unable to read all elements!");

I think `*end == 0` should be handled before fread() with an exit(), or
just leave it out from the assert so that it fails here. (Realistically
most people will be using these tools from debug builds anyway.)

> fclose(fp);
>  
> return assembly;
> -- 
> 2.17.1
> 

Re: [Mesa-dev] [PATCH v2 3/5] i965: avoid 'unused variable'

2018-11-13 Thread Eric Engestrom

On Tuesday, 2018-11-13 14:19:30 +0200, asimiklit.w...@gmail.com wrote:
> From: Andrii Simiklit 
> 
> 1. brw_pipe_control.c:311:34: warning:
> unused variable ‘devinfo’
> 2. brw_program_binary.c:209:19: warning:
> unused variable ‘gen_size’
> 3. brw_program_binary.c:216:19: warning:
> unused variable ‘nir_size’
> 
> v2: Changes for unreproducible issues were removed
> 
> Signed-off-by: Andrii Simiklit 
> ---
>  src/mesa/drivers/dri/i965/brw_pipe_control.c   | 2 +-
>  src/mesa/drivers/dri/i965/brw_program_binary.c | 4 ++--
>  2 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_pipe_control.c 
> b/src/mesa/drivers/dri/i965/brw_pipe_control.c
> index 122ac26070..a3f521b5ae 100644
> --- a/src/mesa/drivers/dri/i965/brw_pipe_control.c
> +++ b/src/mesa/drivers/dri/i965/brw_pipe_control.c
> @@ -308,7 +308,7 @@ brw_emit_depth_stall_flushes(struct brw_context *brw)
>  void
>  gen7_emit_vs_workaround_flush(struct brw_context *brw)
>  {
> -   const struct gen_device_info *devinfo = &brw->screen->devinfo;
> +   MAYBE_UNUSED const struct gen_device_info *devinfo = &brw->screen->devinfo;
>  
> assert(devinfo->gen == 7);

This could've just been folded into the assert, but this works.

Patches 1-3 are:
Reviewed-by: Eric Engestrom 

I assume you want me to push them for you?

> brw_emit_pipe_control_write(brw,
> diff --git a/src/mesa/drivers/dri/i965/brw_program_binary.c 
> b/src/mesa/drivers/dri/i965/brw_program_binary.c
> index db03332241..1298d9e765 100644
> --- a/src/mesa/drivers/dri/i965/brw_program_binary.c
> +++ b/src/mesa/drivers/dri/i965/brw_program_binary.c
> @@ -206,14 +206,14 @@ brw_program_deserialize_driver_blob(struct gl_context 
> *ctx,
>   break;
>switch ((enum driver_cache_blob_part)part_type) {
>case GEN_PART: {
> - uint32_t gen_size = blob_read_uint32(&reader);
> + MAYBE_UNUSED uint32_t gen_size = blob_read_uint32(&reader);
>   assert(!reader.overrun &&
>  (uintptr_t)(reader.end - reader.current) > gen_size);
>   deserialize_gen_program(&reader, ctx, prog, stage);
>   break;
>}
>case NIR_PART: {
> - uint32_t nir_size = blob_read_uint32(&reader);
> + MAYBE_UNUSED uint32_t nir_size = blob_read_uint32(&reader);
>   assert(!reader.overrun &&
>  (uintptr_t)(reader.end - reader.current) > nir_size);
>   const struct nir_shader_compiler_options *options =
> -- 
> 2.17.1
> 

[Mesa-dev] [PATCH] egl/dri: fix error value with unknown drm format

2018-11-13 Thread Lionel Landwerlin

According to the EGL_EXT_image_dma_buf_import spec, creating an EGL
image with a DRM format not supported should yield the BAD_MATCH
error :

"
   * If <target> is EGL_LINUX_DMA_BUF_EXT, and the EGL_LINUX_DRM_FOURCC_EXT
 attribute is set to a format not supported by the EGL, EGL_BAD_MATCH
 is generated.
"

Signed-off-by: Lionel Landwerlin 
Fixes: 20de7f9f226401 ("egl/dri2: support for creating images out of dma buffers")
---
 src/egl/drivers/dri2/egl_dri2.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c
index 3b63aebbf9a..198ba73247f 100644
--- a/src/egl/drivers/dri2/egl_dri2.c
+++ b/src/egl/drivers/dri2/egl_dri2.c
@@ -2310,7 +2310,7 @@ dri2_check_dma_buf_format(const _EGLImageAttribs *attrs)
 {
unsigned plane_n = dri2_num_fourcc_format_planes(attrs->DMABufFourCC.Value);
if (plane_n == 0) {
-  _eglError(EGL_BAD_ATTRIBUTE, "invalid format");
+  _eglError(EGL_BAD_MATCH, "unknown drm fourcc format");
   return 0;
}
 
-- 
2.19.1


Re: [Mesa-dev] [PATCH 2/3] nir: combine fmul and fadd across ffma operations

2018-11-13 Thread Jonathan marek

The brw_nir_opt_peephole_ffma pass is only doing what the fuse_ffma 
option already does. It produces the same result as the fuse_ffma 
option, which is not optimal.


This is what I get:
   vec4 32 ssa_7 = fmul ssa_6, ssa_1.
   vec4 32 ssa_8 = ffma ssa_5, ssa_1., ssa_7
   vec4 32 ssa_10 = ffma ssa_9, ssa_1., ssa_8
   vec4 32 ssa_12 = fadd ssa_10, ssa_11
But better optimized as (example with the least rearrangements):
   vec4 32 ssa_7 = ffma ssa_6, ssa_1., ssa_11
   vec4 32 ssa_8 = ffma ssa_5, ssa_1., ssa_7
   vec4 32 ssa_10 = ffma ssa_9, ssa_1., ssa_8

Fusing the fmul and fadd in this case is not obvious. Could this patch 
be OK if it is behind the fuse_ffma option?
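
To make the rearrangement above concrete, here is a toy rewriter over nested tuples that applies the two patterns under discussion: the existing fadd(fmul(a, b), c) -> ffma(a, b, c) fusion and the proposed fadd(ffma(a, b, c), d) -> ffma(a, b, fadd(c, d)). It is a sketch only; it is not nir_opt_algebraic, it ignores the inexact ('~') and option conditions, and the operand names x, y, z are placeholders for the swizzled ssa_1 sources.

```python
def rewrite(expr):
    """Apply the two fusion patterns bottom-up until none matches."""
    if not isinstance(expr, tuple):
        return expr
    op, *args = expr
    args = [rewrite(a) for a in args]
    if op == 'fadd':
        lhs, rhs = args
        if isinstance(lhs, tuple) and lhs[0] == 'ffma':
            # Proposed pattern: push the add into the ffma's addend.
            _, a, b, c = lhs
            return rewrite(('ffma', a, b, ('fadd', c, rhs)))
        if isinstance(lhs, tuple) and lhs[0] == 'fmul':
            # Existing fuse_ffma pattern.
            _, a, b = lhs
            return ('ffma', a, b, rhs)
    return (op, *args)


# fmul, ffma, ffma, fadd: the shape reported for matrix * vec4(coord, 1.0)
before = ('fadd',
          ('ffma', 'ssa_9', 'x',
           ('ffma', 'ssa_5', 'y',
            ('fmul', 'ssa_6', 'z'))),
          'ssa_11')
after = rewrite(before)
print(after)
```

The result is three nested ffma operations with ssa_11 as the innermost addend, matching the hand-optimized listing. Because the transformation changes where rounding happens, it is only valid where inexact results are allowed.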


On 11/12/2018 02:30 PM, Jason Ekstrand wrote:

In general, you're not supposed to mess around with the precision of fma...
What we do in the Intel drivers is to leave fma split, apply operations,
and then we have a special mul+add fusion pass we run at the end.  Leaving
them split allows for exactly this kind of optimization without mixing up
those FMAs that are supposed to be kept fused and those generated by
mul+add fusion which can be split back apart and re-optimized.

On Mon, Nov 12, 2018 at 12:17 PM Jonathan Marek  wrote:


This works by moving the fadd up across the ffma operations, so that it
can eventually be combined with a fmul. I'm not sure it works in all
cases, but it works in all the common cases.

Example:
 matrix * vec4(coord, 1.0)
is compiled as:
 fmul, ffma, ffma, fadd
and with this patch:
 ffma, ffma, ffma

Signed-off-by: Jonathan Marek 
---
  src/compiler/nir/nir_opt_algebraic.py | 1 +
  1 file changed, 1 insertion(+)

diff --git a/src/compiler/nir/nir_opt_algebraic.py
b/src/compiler/nir/nir_opt_algebraic.py
index 8f4df891b8..82e10731a6 100644
--- a/src/compiler/nir/nir_opt_algebraic.py
+++ b/src/compiler/nir/nir_opt_algebraic.py
@@ -133,6 +133,7 @@ optimizations = [
 (('~fadd@64', a, ('fmul', c , ('fadd', b, ('fneg', a,
('flrp', a, b, c), '!options->lower_flrp64'),
 (('ffma', a, b, c), ('fadd', ('fmul', a, b), c),
'options->lower_ffma'),
 (('~fadd', ('fmul', a, b), c), ('ffma', a, b, c),
'options->fuse_ffma'),
+   (('~fadd', ('ffma', a, b, c), d), ('ffma', a, b, ('fadd', c, d))),

 (('fdot4', ('vec4', a, b,   c,   1.0), d), ('fdph',  ('vec3', a, b,
c), d)),
 (('fdot4', ('vec4', a, 0.0, 0.0, 0.0), b), ('fmul', a, b)),
--
2.17.1


Re: [Mesa-dev] [PATCH] egl/dri: fix error value with unknown drm format

2018-11-13 Thread Emil Velikov

On Tue, 13 Nov 2018 at 14:11, Lionel Landwerlin
 wrote:
>
> According to the EGL_EXT_image_dma_buf_import spec, creating an EGL
> image with a DRM format not supported should yield the BAD_MATCH
> error :
>
> "
>    * If <target> is EGL_LINUX_DMA_BUF_EXT, and the EGL_LINUX_DRM_FOURCC_EXT
>  attribute is set to a format not supported by the EGL, EGL_BAD_MATCH
>  is generated.
> "
>
> Signed-off-by: Lionel Landwerlin 
> Fixes: 20de7f9f226401 ("egl/dri2: support for creating images out of dma 
> buffers")

Reviewed-by: Emil Velikov 

Great catch Lionel. Out of curiosity, how did you spot this?

-Emil

[Mesa-dev] [PATCH 09/22] glsl: add cl_size and cl_alignment

2018-11-13 Thread Karol Herbst

Signed-off-by: Karol Herbst 
---
 src/compiler/glsl_types.cpp | 48 +
 src/compiler/glsl_types.h   | 10 
 src/compiler/nir_types.cpp  | 12 ++
 src/compiler/nir_types.h|  4 
 4 files changed, 74 insertions(+)

diff --git a/src/compiler/glsl_types.cpp b/src/compiler/glsl_types.cpp
index 9b1fd809b41..c74e67f7be1 100644
--- a/src/compiler/glsl_types.cpp
+++ b/src/compiler/glsl_types.cpp
@@ -2236,3 +2236,51 @@ decode_type_from_blob(struct blob_reader *blob)
   return NULL;
}
 }
+
+unsigned
+glsl_type::cl_alignment() const
+{
+   /* vectors unlike arrays are aligned to their size */
+   if (this->is_scalar() || this->is_vector())
+  return this->cl_size();
+   else if (this->is_array())
+  return this->without_array()->cl_alignment();
+   else if (this->is_record()) {
+  /* Packed Structs are 0x1 aligned despite their size. */
+  if (this->packed)
+ return 1;
+
+  unsigned res = 1;
+  for (unsigned i = 0; i < this->length; ++i) {
+ struct glsl_struct_field &field = this->fields.structure[i];
+ res = MAX2(res, field.type->cl_alignment());
+  }
+  return res;
+   }
+   return 1;
+}
+
+unsigned
+glsl_type::cl_size() const
+{
+   if (this->is_scalar()) {
+  return glsl_base_get_byte_size(this->base_type);
+   } else if (this->is_vector()) {
+  unsigned vec_elemns = this->vector_elements == 3 ? 4 : this->vector_elements;
+  return vec_elemns * glsl_base_get_byte_size(this->base_type);
+   } else if (this->is_array()) {
+  unsigned size = this->without_array()->cl_size();
+  return size * this->length;
+   } else if (this->is_record()) {
+  unsigned size = 0;
+  for (unsigned i = 0; i < this->length; ++i) {
+ struct glsl_struct_field &field = this->fields.structure[i];
+ /* if a struct is packed, members don't get aligned */
+ if (!this->packed)
+size = align(size, field.type->cl_alignment());
+ size += field.type->cl_size();
+  }
+  return size;
+   }
+   return 1;
+}
diff --git a/src/compiler/glsl_types.h b/src/compiler/glsl_types.h
index efcbc70af26..c72109cdcfe 100644
--- a/src/compiler/glsl_types.h
+++ b/src/compiler/glsl_types.h
@@ -421,6 +421,16 @@ public:
 */
unsigned std430_size(bool row_major) const;
 
+   /**
+* Alignment in bytes of the start of this type in OpenCL memory.
+*/
+   unsigned cl_alignment() const;
+
+   /**
+* Size in bytes of this type in OpenCL memory
+*/
+   unsigned cl_size() const;
+
/**
 * \brief Can this type be implicitly converted to another?
 *
diff --git a/src/compiler/nir_types.cpp b/src/compiler/nir_types.cpp
index 506dabdeb1d..cdde3597f70 100644
--- a/src/compiler/nir_types.cpp
+++ b/src/compiler/nir_types.cpp
@@ -597,3 +597,15 @@ glsl_contains_atomic(const struct glsl_type *type)
 {
return type->contains_atomic();
 }
+
+int
+glsl_get_cl_size(const struct glsl_type *type)
+{
+   return type->cl_size();
+}
+
+int
+glsl_get_cl_alignment(const struct glsl_type *type)
+{
+   return type->cl_alignment();
+}
diff --git a/src/compiler/nir_types.h b/src/compiler/nir_types.h
index c06d227e45a..8304b22254b 100644
--- a/src/compiler/nir_types.h
+++ b/src/compiler/nir_types.h
@@ -91,6 +91,10 @@ unsigned glsl_get_record_location_offset(const struct 
glsl_type *type,
 
 unsigned glsl_atomic_size(const struct glsl_type *type);
 
+int glsl_get_cl_size(const struct glsl_type *type);
+
+int glsl_get_cl_alignment(const struct glsl_type *type);
+
 static inline unsigned
 glsl_get_bit_size(const struct glsl_type *type)
 {
-- 
2.19.1


[Mesa-dev] [PATCH 07/22] glsl: add packed for struct types

2018-11-13 Thread Karol Herbst

OpenCL kernels require C rules for alignment and padding inside structs, so we
also need to know whether a struct is packed.
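
As a small illustration of why the packed flag matters for layout, here is a sketch in plain C. It is not Mesa code: the struct and function names are made up for the example, and `__attribute__((packed))` is a GCC/Clang extension standing in for an OpenCL C packed struct.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Natural C layout: 'b' is padded up to its own alignment. */
struct unpacked_pair {
    uint8_t  a;
    uint32_t b;
};

/* Packed layout: no padding is inserted between members. */
struct __attribute__((packed)) packed_pair {
    uint8_t  a;
    uint32_t b;
};

size_t unpacked_b_offset(void) { return offsetof(struct unpacked_pair, b); }
size_t packed_b_offset(void)   { return offsetof(struct packed_pair, b); }

/* The same two members end up at different offsets and total sizes. */
void check_packed_layout(void)
{
    assert(packed_b_offset() == 1);
    assert(sizeof(struct packed_pair) == 5);
    assert(unpacked_b_offset() == _Alignof(uint32_t));
    assert(sizeof(struct unpacked_pair) > sizeof(struct packed_pair));
}
```

On common ABIs, where `uint32_t` is 4-byte aligned, the unpacked struct is 8 bytes with `b` at offset 4.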

Signed-off-by: Karol Herbst 
---
 src/compiler/glsl_types.cpp   | 17 +++--
 src/compiler/glsl_types.h | 12 ++--
 src/compiler/nir_types.cpp|  5 +++--
 src/compiler/nir_types.h  |  3 ++-
 src/compiler/spirv/spirv_to_nir.c | 10 +-
 src/compiler/spirv/vtn_private.h  |  7 +++
 6 files changed, 42 insertions(+), 12 deletions(-)

diff --git a/src/compiler/glsl_types.cpp b/src/compiler/glsl_types.cpp
index e6262371bd0..9b1fd809b41 100644
--- a/src/compiler/glsl_types.cpp
+++ b/src/compiler/glsl_types.cpp
@@ -91,11 +91,11 @@ glsl_type::glsl_type(GLenum gl_type, glsl_base_type 
base_type,
 }
 
 glsl_type::glsl_type(const glsl_struct_field *fields, unsigned num_fields,
- const char *name) :
+ const char *name, bool packed) :
gl_type(0),
base_type(GLSL_TYPE_STRUCT), sampled_type(GLSL_TYPE_VOID),
sampler_dimensionality(0), sampler_shadow(0), sampler_array(0),
-   interface_packing(0), interface_row_major(0),
+   interface_packing(0), interface_row_major(0), packed(packed),
vector_elements(0), matrix_columns(0),
length(num_fields)
 {
@@ -1004,9 +1004,10 @@ glsl_type::record_key_hash(const void *a)
 const glsl_type *
 glsl_type::get_record_instance(const glsl_struct_field *fields,
unsigned num_fields,
-   const char *name)
+   const char *name,
+   bool packed)
 {
-   const glsl_type key(fields, num_fields, name);
+   const glsl_type key(fields, num_fields, name, packed);
 
mtx_lock(&glsl_type::hash_mutex);
 
@@ -1018,7 +1019,7 @@ glsl_type::get_record_instance(const glsl_struct_field 
*fields,
const struct hash_entry *entry = _mesa_hash_table_search(record_types,
 &key);
if (entry == NULL) {
-  const glsl_type *t = new glsl_type(fields, num_fields, name);
+  const glsl_type *t = new glsl_type(fields, num_fields, name, packed);
 
   entry = _mesa_hash_table_insert(record_types, t, (void *) t);
}
@@ -1026,6 +1027,7 @@ glsl_type::get_record_instance(const glsl_struct_field 
*fields,
assert(((glsl_type *) entry->data)->base_type == GLSL_TYPE_STRUCT);
assert(((glsl_type *) entry->data)->length == num_fields);
assert(strcmp(((glsl_type *) entry->data)->name, name) == 0);
+   assert(((glsl_type *) entry->data)->packed == packed);
 
mtx_unlock(&glsl_type::hash_mutex);
 
@@ -2138,6 +2140,8 @@ encode_type_to_blob(struct blob *blob, const glsl_type 
*type)
   if (type->is_interface()) {
  blob_write_uint32(blob, type->interface_packing);
  blob_write_uint32(blob, type->interface_row_major);
+  } else {
+ blob_write_uint32(blob, type->packed);
   }
   return;
case GLSL_TYPE_VOID:
@@ -2217,7 +2221,8 @@ decode_type_from_blob(struct blob_reader *blob)
  t = glsl_type::get_interface_instance(fields, num_fields, packing,
row_major, name);
   } else {
- t = glsl_type::get_record_instance(fields, num_fields, name);
+ unsigned packed = blob_read_uint32(blob);
+ t = glsl_type::get_record_instance(fields, num_fields, name, packed);
   }
 
   free(fields);
diff --git a/src/compiler/glsl_types.h b/src/compiler/glsl_types.h
index d32b580acc1..f2163728610 100644
--- a/src/compiler/glsl_types.h
+++ b/src/compiler/glsl_types.h
@@ -176,6 +176,13 @@ struct glsl_type {
unsigned interface_packing:2;
unsigned interface_row_major:1;
 
+   /**
+* For \c GLSL_TYPE_STRUCT this specifies if the struct is packed or not.
+*
+* Only used for Compute kernels
+*/
+   unsigned packed:1;
+
 private:
glsl_type() : mem_ctx(NULL)
{
@@ -299,7 +306,8 @@ public:
 */
static const glsl_type *get_record_instance(const glsl_struct_field *fields,
   unsigned num_fields,
-  const char *name);
+  const char *name,
+  bool packed = false);
 
/**
 * Get the instance of an interface block type
@@ -888,7 +896,7 @@ private:
 
/** Constructor for record types */
glsl_type(const glsl_struct_field *fields, unsigned num_fields,
-const char *name);
+const char *name, bool packed = false);
 
/** Constructor for interface types */
glsl_type(const glsl_struct_field *fields, unsigned num_fields,
diff --git a/src/compiler/nir_types.cpp b/src/compiler/nir_types.cpp
index 3cd61f66056..506dabdeb1d 100644
--- a/src/compiler/nir_types.cpp
+++ b/src/compiler/nir_types.cpp
@@ -439,9 +439,10 @@ glsl_array_type(const

[Mesa-dev] [PATCH 13/22] nir/spirv: parse memory model

2018-11-13 Thread Karol Herbst

Signed-off-by: Karol Herbst 
---
 src/compiler/nir/nir.h|  8 
 src/compiler/nir/nir_clone.c  |  1 +
 src/compiler/nir/nir_serialize.c  |  2 ++
 src/compiler/spirv/spirv_to_nir.c | 15 +--
 4 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
index 11e3d18320a..be4f64464f9 100644
--- a/src/compiler/nir/nir.h
+++ b/src/compiler/nir/nir.h
@@ -2204,6 +2204,14 @@ typedef struct nir_shader {
 */
void *constant_data;
unsigned constant_data_size;
+
+   /**
+* pointer size is:
+*   AddressingModelLogical:    0 (default)
+*   AddressingModelPhysical32: 32
+*   AddressingModelPhysical64: 64
+*/
+   unsigned ptr_size;
 } nir_shader;
 
 static inline nir_function_impl *
diff --git a/src/compiler/nir/nir_clone.c b/src/compiler/nir/nir_clone.c
index 989c5051a54..d47d3e8cb72 100644
--- a/src/compiler/nir/nir_clone.c
+++ b/src/compiler/nir/nir_clone.c
@@ -733,6 +733,7 @@ nir_shader_clone(void *mem_ctx, const nir_shader *s)
ns->num_uniforms = s->num_uniforms;
ns->num_outputs = s->num_outputs;
ns->num_shared = s->num_shared;
+   ns->ptr_size = s->ptr_size;
 
ns->constant_data_size = s->constant_data_size;
if (s->constant_data_size > 0) {
diff --git a/src/compiler/nir/nir_serialize.c b/src/compiler/nir/nir_serialize.c
index 43016310048..5ec6972b02a 100644
--- a/src/compiler/nir/nir_serialize.c
+++ b/src/compiler/nir/nir_serialize.c
@@ -1106,6 +1106,7 @@ nir_serialize(struct blob *blob, const nir_shader *nir)
blob_write_uint32(blob, nir->num_uniforms);
blob_write_uint32(blob, nir->num_outputs);
blob_write_uint32(blob, nir->num_shared);
+   blob_write_uint32(blob, nir->ptr_size);
 
blob_write_uint32(blob, exec_list_length(&nir->functions));
nir_foreach_function(fxn, nir) {
@@ -1165,6 +1166,7 @@ nir_deserialize(void *mem_ctx,
ctx.nir->num_uniforms = blob_read_uint32(blob);
ctx.nir->num_outputs = blob_read_uint32(blob);
ctx.nir->num_shared = blob_read_uint32(blob);
+   ctx.nir->ptr_size = blob_read_uint32(blob);
 
unsigned num_functions = blob_read_uint32(blob);
for (unsigned i = 0; i < num_functions; i++)
diff --git a/src/compiler/spirv/spirv_to_nir.c 
b/src/compiler/spirv/spirv_to_nir.c
index db2ee51340c..e597b2462cb 100644
--- a/src/compiler/spirv/spirv_to_nir.c
+++ b/src/compiler/spirv/spirv_to_nir.c
@@ -3588,9 +3588,20 @@ vtn_handle_preamble_instruction(struct vtn_builder *b, 
SpvOp opcode,
   break;
 
case SpvOpMemoryModel:
+  if (w[2] == SpvMemoryModelOpenCL) {
+ if (w[1] == SpvAddressingModelPhysical32)
+b->shader->ptr_size = 32;
+ else if (w[1] == SpvAddressingModelPhysical64)
+b->shader->ptr_size = 64;
+ else
+vtn_fail("Couldn't parse OpenCL Memory Model");
+ break;
+  }
+
   vtn_assert(w[1] == SpvAddressingModelLogical);
   vtn_assert(w[2] == SpvMemoryModelSimple ||
  w[2] == SpvMemoryModelGLSL450);
+  b->shader->ptr_size = 0;
   break;
 
case SpvOpEntryPoint:
@@ -4265,6 +4276,8 @@ spirv_to_nir(const uint32_t *words, size_t word_count,
/* Skip the SPIR-V header, handled at vtn_create_builder */
words+= 5;
 
+   b->shader = nir_shader_create(b, stage, nir_options, NULL);
+
/* Handle all the preamble instructions */
words = vtn_foreach_instruction(b, words, word_end,
vtn_handle_preamble_instruction);
@@ -4275,8 +4288,6 @@ spirv_to_nir(const uint32_t *words, size_t word_count,
   return NULL;
}
 
-   b->shader = nir_shader_create(b, stage, nir_options, NULL);
-
/* Set shader info defaults */
b->shader->info.gs.invocations = 1;
 
-- 
2.19.1


[Mesa-dev] [PATCH 03/22] nir/spirv: initial handling of OpenCL.std extension opcodes

2018-11-13 Thread Karol Herbst

Not complete, mostly just adding things as I encounter them in CTS. But
not getting far enough yet to hit most of the OpenCL.std instructions.

Anyway, this is better than nothing and covers the most common builtins.
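
One of the builtins added here, the halving add, relies on an overflow-free averaging identity. A minimal sketch on plain 32-bit unsigned integers follows; `uhadd32` and `check_uhadd32` are hypothetical names for illustration, and only the unsigned variant is shown because right-shifting negative values is implementation-defined in C.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned halving add: floor((x + y) / 2) without overflow.
 * x & y keeps the bits set in both operands (each counts fully),
 * x ^ y keeps the bits set in exactly one operand (each counts half). */
uint32_t uhadd32(uint32_t x, uint32_t y)
{
    return (x & y) + ((x ^ y) >> 1);
}

/* The naive (x + y) / 2 would wrap for these inputs. */
void check_uhadd32(void)
{
    assert(uhadd32(UINT32_MAX, UINT32_MAX) == UINT32_MAX);
    assert(uhadd32(UINT32_MAX, 1u) == 0x80000000u);
}
```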

Signed-off-by: Karol Herbst 
---
 src/compiler/nir/meson.build   |   1 +
 src/compiler/nir/nir_builtin_builder.c | 249 +-
 src/compiler/nir/nir_builtin_builder.h | 150 -
 src/compiler/spirv/spirv_to_nir.c  |   2 +
 src/compiler/spirv/vtn_alu.c   |  15 ++
 src/compiler/spirv/vtn_glsl450.c   |   2 +-
 src/compiler/spirv/vtn_opencl.c| 284 +
 src/compiler/spirv/vtn_private.h   |   3 +
 8 files changed, 701 insertions(+), 5 deletions(-)
 create mode 100644 src/compiler/spirv/vtn_opencl.c

diff --git a/src/compiler/nir/meson.build b/src/compiler/nir/meson.build
index b0c3a7feb31..00d7f56e6eb 100644
--- a/src/compiler/nir/meson.build
+++ b/src/compiler/nir/meson.build
@@ -206,6 +206,7 @@ files_libnir = files(
   '../spirv/vtn_amd.c',
   '../spirv/vtn_cfg.c',
   '../spirv/vtn_glsl450.c',
+  '../spirv/vtn_opencl.c',
   '../spirv/vtn_private.h',
   '../spirv/vtn_subgroup.c',
   '../spirv/vtn_variables.c',
diff --git a/src/compiler/nir/nir_builtin_builder.c 
b/src/compiler/nir/nir_builtin_builder.c
index 252a7691f36..e37915e92ca 100644
--- a/src/compiler/nir/nir_builtin_builder.c
+++ b/src/compiler/nir/nir_builtin_builder.c
@@ -21,11 +21,43 @@
  * IN THE SOFTWARE.
  */
 
+#include <math.h>
+
 #include "nir.h"
 #include "nir_builtin_builder.h"
 
 nir_ssa_def*
-nir_cross(nir_builder *b, nir_ssa_def *x, nir_ssa_def *y)
+nir_iadd_sat(nir_builder *b, nir_ssa_def *x, nir_ssa_def *y)
+{
+   int64_t max;
+   switch (x->bit_size) {
+   case 64:
+  max = INT64_MAX;
+  break;
+   case 32:
+  max = INT32_MAX;
+  break;
+   case 16:
+  max = INT16_MAX;
+  break;
+   case  8:
+  max = INT8_MAX;
+  break;
+   }
+
+   nir_ssa_def *sum = nir_iadd(b, x, y);
+
+   nir_ssa_def *hi = nir_bcsel(b, nir_ilt(b, sum, x),
+   nir_imm_intN_t(b, max, x->bit_size), sum);
+
+   nir_ssa_def *lo = nir_bcsel(b, nir_ilt(b, x, sum),
+   nir_imm_intN_t(b, max + 1, x->bit_size), sum);
+
+   return nir_bcsel(b, nir_ige(b, y, nir_imm_intN_t(b, 1, y->bit_size)), hi, lo);
+}
+
+nir_ssa_def*
+nir_cross3(nir_builder *b, nir_ssa_def *x, nir_ssa_def *y)
 {
unsigned yzx[3] = { 1, 2, 0 };
unsigned zxy[3] = { 2, 0, 1 };
@@ -36,6 +68,63 @@ nir_cross(nir_builder *b, nir_ssa_def *x, nir_ssa_def *y)
   nir_swizzle(b, y, yzx, 3, true)));
 }
 
+nir_ssa_def*
+nir_cross4(nir_builder *b, nir_ssa_def *x, nir_ssa_def *y)
+{
+   nir_ssa_def *cross = nir_cross3(b, x, y);
+
+   return nir_vec4(b,
+  nir_channel(b, cross, 0),
+  nir_channel(b, cross, 1),
+  nir_channel(b, cross, 2),
+  nir_imm_intN_t(b, 0, cross->bit_size));
+}
+
+static nir_ssa_def*
+nir_hadd(nir_builder *b, nir_ssa_def *x, nir_ssa_def *y, bool sign)
+{
+   nir_ssa_def *imm1 = nir_imm_int(b, 1);
+
+   nir_ssa_def *t0 = nir_ixor(b, x, y);
+   nir_ssa_def *t1 = nir_iand(b, x, y);
+
+   nir_ssa_def *t2;
+   if (sign)
+  t2 = nir_ishr(b, t0, imm1);
+   else
+  t2 = nir_ushr(b, t0, imm1);
+   return nir_iadd(b, t1, t2);
+}
+
+nir_ssa_def*
+nir_ihadd(nir_builder *b, nir_ssa_def *x, nir_ssa_def *y)
+{
+   return nir_hadd(b, x, y, true);
+}
+
+nir_ssa_def*
+nir_uhadd(nir_builder *b, nir_ssa_def *x, nir_ssa_def *y)
+{
+   return nir_hadd(b, x, y, false);
+}
+
+nir_ssa_def*
+nir_length(nir_builder *b, nir_ssa_def *vec)
+{
+   nir_ssa_def *finf = nir_imm_floatN_t(b, INFINITY, vec->bit_size);
+
+   nir_ssa_def *abs = nir_fabs(b, vec);
+   if (vec->num_components == 1)
+  return abs;
+
+   nir_ssa_def *maxc = nir_fmax(b, nir_channel(b, abs, 0), nir_channel(b, abs, 1));
+   for (int i = 2; i < vec->num_components; ++i)
+  maxc = nir_fmax(b, maxc, nir_channel(b, abs, i));
+   abs = nir_fdiv(b, abs, maxc);
+   nir_ssa_def *res = nir_fmul(b, nir_fsqrt(b, nir_fdot(b, abs, abs)), maxc);
+   return nir_bcsel(b, nir_feq(b, maxc, finf), maxc, res);
+}
+
 nir_ssa_def*
 nir_fast_length(nir_builder *b, nir_ssa_def *vec)
 {
@@ -49,6 +138,107 @@ nir_fast_length(nir_builder *b, nir_ssa_def *vec)
}
 }
 
+nir_ssa_def*
+nir_nextafter(nir_builder *b, nir_ssa_def *x, nir_ssa_def *y)
+{
+   nir_ssa_def *zero = nir_imm_intN_t(b, 0, x->bit_size);
+   nir_ssa_def *one = nir_imm_intN_t(b, 1, x->bit_size);
+   nir_ssa_def *nzero = nir_imm_intN_t(b, 1ull << (x->bit_size - 1), x->bit_size);
+
+   nir_ssa_def *condeq = nir_feq(b, x, y);
+   nir_ssa_def *conddir = nir_flt(b, x, y);
+   nir_ssa_def *condnzero = nir_feq(b, x, nzero);
+
+   // beware of -0.0 - 1 == NaN
+   nir_ssa_def *xn =
+  nir_bcsel(b,
+condnzero,
+nir_imm_intN_t(b, (1 << (x->bit_size - 1)) + 1, x->bit_size),
+nir_isub(b, x, one));
+
+   // beware of -0.0 + 1 == -0x1p-149
+

[Mesa-dev] [PATCH 12/22] nir: add type alignment support to lower_io

2018-11-13 Thread Karol Herbst

From: Rob Clark 

For CL we can have structs with 8/16/32/64-bit scalar types (as well as, of
course, arrays/structs/etc.), which are padded according to C rules. So for
lowering struct derefs we need to consider not just a field's size but also
its alignment.
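
The C padding rule described above can be sketched as a standalone offset computation. This is an illustration, not the patch's code: `struct field_layout`, `align_up` and `field_offset` are hypothetical names, and the final alignment of the requested field is this sketch's own choice to match C layout.

```c
#include <assert.h>

/* Hypothetical description of one struct member. */
struct field_layout {
    int size;       /* size in bytes */
    int alignment;  /* required alignment in bytes, a power of two */
};

/* Round value up to the next multiple of alignment. */
int align_up(int value, int alignment)
{
    assert(alignment > 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}

/* Byte offset of fields[index] under C layout rules. */
int field_offset(const struct field_layout *fields, int index)
{
    int offset = 0;
    for (int i = 0; i < index; i++) {
        /* Each preceding field starts aligned, then occupies its size. */
        offset = align_up(offset, fields[i].alignment);
        offset += fields[i].size;
    }
    /* The requested field must itself start on an aligned address. */
    return align_up(offset, fields[index].alignment);
}
```

For a struct laid out like `{ char; short; double; }` with sizes and alignments of 1, 2 and 8 bytes, this yields offsets 0, 2 and 8, whereas summing sizes alone would give 0, 1 and 3.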

Signed-off-by: Karol Herbst 
---
 src/compiler/nir/nir.h  | 10 +++
 src/compiler/nir/nir_lower_io.c | 52 -
 2 files changed, 49 insertions(+), 13 deletions(-)

diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
index c469e111b2c..11e3d18320a 100644
--- a/src/compiler/nir/nir.h
+++ b/src/compiler/nir/nir.h
@@ -2825,10 +2825,20 @@ typedef enum {
 */
nir_lower_io_force_sample_interpolation = (1 << 1),
 } nir_lower_io_options;
+typedef struct nir_memory_model {
+   int (*type_size)(const struct glsl_type *);
+   int (*type_align)(const struct glsl_type *);
+} nir_memory_model;
 bool nir_lower_io(nir_shader *shader,
   nir_variable_mode modes,
   int (*type_size)(const struct glsl_type *),
   nir_lower_io_options);
+// TEMP use different name to avoid fixing all the callers yet:
+bool nir_lower_io2(nir_shader *shader,
+  nir_variable_mode modes,
+  const nir_memory_model *mm,
+  nir_lower_io_options);
+
 nir_src *nir_get_io_offset_src(nir_intrinsic_instr *instr);
 nir_src *nir_get_io_vertex_index_src(nir_intrinsic_instr *instr);
 
diff --git a/src/compiler/nir/nir_lower_io.c b/src/compiler/nir/nir_lower_io.c
index 2a6c284de2b..292baf9e4fc 100644
--- a/src/compiler/nir/nir_lower_io.c
+++ b/src/compiler/nir/nir_lower_io.c
@@ -38,7 +38,7 @@
 struct lower_io_state {
void *dead_ctx;
nir_builder builder;
-   int (*type_size)(const struct glsl_type *type);
+   const nir_memory_model *mm;
nir_variable_mode modes;
nir_lower_io_options options;
 };
@@ -86,12 +86,26 @@ nir_is_per_vertex_io(const nir_variable *var, 
gl_shader_stage stage)
return false;
 }
 
+static int
+default_type_align(const struct glsl_type *type)
+{
+   return 1;
+}
+
+static inline int
+align(int value, int alignment)
+{
+   return (value + alignment - 1) & ~(alignment - 1);
+}
+
 static nir_ssa_def *
 get_io_offset(nir_deref_instr *deref, nir_ssa_def **vertex_index,
   struct lower_io_state *state, unsigned *component)
 {
nir_builder *b = &state->builder;
-   int (*type_size)(const struct glsl_type *) = state->type_size;
+   int (*type_size)(const struct glsl_type *) = state->mm->type_size;
+   int (*type_align)(const struct glsl_type *) = state->mm->type_align ?
+  state->mm->type_align : default_type_align;
nir_deref_path path;
nir_deref_path_init(&path, deref, NULL);
 
@@ -137,7 +151,10 @@ get_io_offset(nir_deref_instr *deref, nir_ssa_def 
**vertex_index,
 
  unsigned field_offset = 0;
  for (unsigned i = 0; i < (*p)->strct.index; i++) {
-field_offset += type_size(glsl_get_struct_field(parent->type, i));
+const struct glsl_type *field_type =
+   glsl_get_struct_field(parent->type, i);
+field_offset = align(field_offset, type_align(field_type));
+field_offset += type_size(field_type);
  }
  offset = nir_iadd(b, offset, nir_imm_int(b, field_offset));
   } else {
@@ -207,7 +224,7 @@ lower_load(nir_intrinsic_instr *intrin, struct 
lower_io_state *state,
   nir_intrinsic_set_component(load, component);
 
if (load->intrinsic == nir_intrinsic_load_uniform)
-  nir_intrinsic_set_range(load, state->type_size(var->type));
+  nir_intrinsic_set_range(load, state->mm->type_size(var->type));
 
if (vertex_index) {
   load->src[0] = nir_src_for_ssa(vertex_index);
@@ -488,10 +505,8 @@ nir_lower_io_block(nir_block *block,
 }
 
 static bool
-nir_lower_io_impl(nir_function_impl *impl,
-  nir_variable_mode modes,
-  int (*type_size)(const struct glsl_type *),
-  nir_lower_io_options options)
+nir_lower_io_impl(nir_function_impl *impl, nir_variable_mode modes,
+  const nir_memory_model *mm, nir_lower_io_options options)
 {
struct lower_io_state state;
bool progress = false;
@@ -499,7 +514,7 @@ nir_lower_io_impl(nir_function_impl *impl,
nir_builder_init(&state.builder, impl);
state.dead_ctx = ralloc_context(NULL);
state.modes = modes;
-   state.type_size = type_size;
+   state.mm = mm;
state.options = options;
 
nir_foreach_block(block, impl) {
@@ -514,22 +529,33 @@ nir_lower_io_impl(nir_function_impl *impl,
 }
 
 bool
-nir_lower_io(nir_shader *shader, nir_variable_mode modes,
- int (*type_size)(const struct glsl_type *),
- nir_lower_io_options options)
+nir_lower_io2(nir_shader *shader, nir_variable_mode modes,
+ const nir_memory_model *mm, nir_lower_io_options options)
 {
bool progress = false;
 
nir_foreach_function(function, shader) {
   if (function->impl) {
  progress

[Mesa-dev] [PATCH 17/22] nir: rename global to private memory

2018-11-13 Thread Karol Herbst

The naming is a bit confusing no matter how you look at it. Within OpenCL,
"global" memory is memory accessible from all threads. GLSL "global" memory
normally refers to shader-thread-private memory declared at global scope. As
we already use "shared" for memory shared across all threads of a work group,
the solution everybody could be happy with is to rename "global" to "private"
and use "global" later for memory usually stored within system-accessible
memory (be it VRAM or system RAM, if keeping SVM in mind).

Signed-off-by: Karol Herbst 
---
 src/compiler/glsl/glsl_to_nir.cpp|  4 ++--
 src/compiler/nir/nir.c   |  2 +-
 src/compiler/nir/nir.h   |  2 +-
 src/compiler/nir/nir_linking_helpers.c   |  2 +-
 .../nir/nir_lower_constant_initializers.c|  2 +-
 .../nir/nir_lower_global_vars_to_local.c |  4 ++--
 src/compiler/nir/nir_lower_io_to_temporaries.c   |  2 +-
 src/compiler/nir/nir_opt_copy_prop_vars.c|  4 ++--
 src/compiler/nir/nir_opt_dead_write_vars.c   |  2 +-
 src/compiler/nir/nir_print.c |  4 ++--
 src/compiler/nir/nir_remove_dead_variables.c |  4 ++--
 src/compiler/nir/nir_split_vars.c| 16 
 src/compiler/nir/tests/vars_tests.cpp|  2 +-
 src/compiler/spirv/vtn_private.h |  2 +-
 src/compiler/spirv/vtn_variables.c   |  6 +++---
 src/gallium/auxiliary/nir/tgsi_to_nir.c  |  2 +-
 src/mesa/state_tracker/st_glsl_to_nir.cpp|  2 +-
 17 files changed, 31 insertions(+), 31 deletions(-)

diff --git a/src/compiler/glsl/glsl_to_nir.cpp 
b/src/compiler/glsl/glsl_to_nir.cpp
index 0479f8fcfe4..8564cd89b5a 100644
--- a/src/compiler/glsl/glsl_to_nir.cpp
+++ b/src/compiler/glsl/glsl_to_nir.cpp
@@ -312,7 +312,7 @@ nir_visitor::visit(ir_variable *ir)
case ir_var_auto:
case ir_var_temporary:
   if (is_global)
- var->data.mode = nir_var_global;
+ var->data.mode = nir_var_private;
   else
  var->data.mode = nir_var_local;
   break;
@@ -1433,7 +1433,7 @@ nir_visitor::visit(ir_expression *ir)
   * sense, we'll just turn it into a load which will probably
   * eventually end up as an SSA definition.
   */
- assert(this->deref->mode == nir_var_global);
+ assert(this->deref->mode == nir_var_private);
  op = nir_intrinsic_load_deref;
   }
 
diff --git a/src/compiler/nir/nir.c b/src/compiler/nir/nir.c
index 249b9357c3f..27f5d1b7bca 100644
--- a/src/compiler/nir/nir.c
+++ b/src/compiler/nir/nir.c
@@ -129,7 +129,7 @@ nir_shader_add_variable(nir_shader *shader, nir_variable 
*var)
   assert(!"nir_shader_add_variable cannot be used for local variables");
   break;
 
-   case nir_var_global:
+   case nir_var_private:
   exec_list_push_tail(&shader->globals, &var->node);
   break;
 
diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
index 89c28e36618..78f3204d3e2 100644
--- a/src/compiler/nir/nir.h
+++ b/src/compiler/nir/nir.h
@@ -96,7 +96,7 @@ typedef struct {
 typedef enum {
nir_var_shader_in   = (1 << 0),
nir_var_shader_out  = (1 << 1),
-   nir_var_global  = (1 << 2),
+   nir_var_private = (1 << 2),
nir_var_local   = (1 << 3),
nir_var_uniform = (1 << 4),
nir_var_shader_storage  = (1 << 5),
diff --git a/src/compiler/nir/nir_linking_helpers.c 
b/src/compiler/nir/nir_linking_helpers.c
index a05890ada43..d8358e08e5a 100644
--- a/src/compiler/nir/nir_linking_helpers.c
+++ b/src/compiler/nir/nir_linking_helpers.c
@@ -134,7 +134,7 @@ nir_remove_unused_io_vars(nir_shader *shader, struct 
exec_list *var_list,
   if (!(other_stage & get_variable_io_mask(var, shader->info.stage))) {
  /* This one is invalid, make it a global variable instead */
  var->data.location = 0;
- var->data.mode = nir_var_global;
+ var->data.mode = nir_var_private;
 
  exec_node_remove(&var->node);
  exec_list_push_tail(&shader->globals, &var->node);
diff --git a/src/compiler/nir/nir_lower_constant_initializers.c 
b/src/compiler/nir/nir_lower_constant_initializers.c
index 4e9cea46157..932a32b3c9c 100644
--- a/src/compiler/nir/nir_lower_constant_initializers.c
+++ b/src/compiler/nir/nir_lower_constant_initializers.c
@@ -98,7 +98,7 @@ nir_lower_constant_initializers(nir_shader *shader, 
nir_variable_mode modes)
if (modes & nir_var_shader_out)
   progress |= lower_const_initializer(&builder, &shader->outputs);
 
-   if (modes & nir_var_global)
+   if (modes & nir_var_private)
   progress |= lower_const_initializer(&builder, &shader->globals);
 
if (modes & nir_var_system_value)
diff --git a/src/compiler/nir/nir_lower_global_vars_to_local.c 
b/src/compiler/nir/nir_lower_global_vars_to_local.c
index be99cf9ad02..6c6d9a9d25c 100644
--- a/src/compiler/nir/nir_lower_global_vars_to_local.c
+++ b/src/compiler/nir/nir_lower_global_vars_to_local.c
@@ -36,7 +36,7 @@ static void

[Mesa-dev] [PATCH 04/22] nir/spirv: add OpIsFinite and OpIsNormal

2018-11-13 Thread Karol Herbst

From: Rob Clark 

changes by Karol:
v2: make compatible with 64 bit floats
fix isfinite
v3: use snake_case.
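
The exponent test used for OpIsNormal can be sketched for a single 32-bit float in plain C. This is an illustration under the assumption that `float` is IEEE 754 binary32; `f32_from_bits`, `is_normal_f32` and `check_is_normal_f32` are hypothetical helper names, not Mesa functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Build a float from its raw bit pattern (assumes IEEE 754 binary32). */
float f32_from_bits(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* True when f is a normal value (not zero, subnormal, infinity or NaN). */
bool is_normal_f32(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);

    /* Drop the sign bit, then shift out the 23 mantissa bits. */
    uint32_t exp = (bits & 0x7fffffffu) >> 23;

    /* Normal values have a biased exponent in [1, 254]. Subtracting 1
     * wraps 0 around to UINT32_MAX, so one unsigned compare covers
     * both ends of the range. */
    return (exp - 1u) < 254u;
}

void check_is_normal_f32(void)
{
    assert(is_normal_f32(1.0f));
    assert(!is_normal_f32(0.0f));
    assert(!is_normal_f32(f32_from_bits(0x7f800000u)));  /* +infinity */
}
```

The constant 254 corresponds to `(1 << m) - 2` with `m = 8` exponent bits; for 16-bit and 64-bit floats the same comparison would use `m = 5` and `m = 11`.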

Signed-off-by: Karol Herbst 
---
 src/compiler/spirv/vtn_alu.c | 32 
 1 file changed, 32 insertions(+)

diff --git a/src/compiler/spirv/vtn_alu.c b/src/compiler/spirv/vtn_alu.c
index b1492c1501a..ea25d4bcbdc 100644
--- a/src/compiler/spirv/vtn_alu.c
+++ b/src/compiler/spirv/vtn_alu.c
@@ -583,6 +583,38 @@ vtn_handle_alu(struct vtn_builder *b, SpvOp opcode,
   break;
}
 
+   case SpvOpIsFinite: {
+  nir_ssa_def *inf = nir_imm_floatN_t(&b->nb, INFINITY, src[0]->bit_size);
+  nir_ssa_def *is_number = nir_feq(&b->nb, src[0], src[0]);
+  nir_ssa_def *is_not_inf = nir_ine(&b->nb, nir_fabs(&b->nb, src[0]), inf);
+  val->ssa->def = nir_iand(&b->nb, is_number, is_not_inf);
+  break;
+   }
+
+   case SpvOpIsNormal: {
+  unsigned bit_size = src[0]->bit_size;
+
+  uint32_t m;
+  if (bit_size == 64)
+ m = 11;
+  else if (bit_size == 32)
+ m = 8;
+  else if (bit_size == 16)
+ m = 5;
+  else
+ assert(!"unknown float type");
+
+  nir_ssa_def *shift = nir_imm_int(&b->nb, bit_size - m - 1);
+  nir_ssa_def *abs = nir_fabs(&b->nb, src[0]);
+  nir_ssa_def *exp = nir_iadd(&b->nb,
+  nir_ushr(&b->nb, abs, shift),
+  nir_imm_intN_t(&b->nb, -1, bit_size));
+  val->ssa->def = nir_ult(&b->nb,
+  exp,
+  nir_imm_intN_t(&b->nb, (1 << m) - 2, bit_size));
+  break;
+   }
+
case SpvOpFUnordEqual:
case SpvOpFUnordNotEqual:
case SpvOpFUnordLessThan:
-- 
2.19.1


[Mesa-dev] [PATCH 10/22] nir/vtn: add caps for some cl related capabilities

2018-11-13 Thread Karol Herbst

From: Rob Clark 

vtn supports these, so don't squawk if the user is happy to enable them.

Signed-off-by: Karol Herbst 
---
 src/compiler/shader_info.h |  3 +++
 src/compiler/spirv/spirv_to_nir.c  | 16 +---
 src/compiler/spirv/vtn_variables.c |  6 --
 3 files changed, 20 insertions(+), 5 deletions(-)

diff --git a/src/compiler/shader_info.h b/src/compiler/shader_info.h
index 65bc0588d67..5286cf8fc5f 100644
--- a/src/compiler/shader_info.h
+++ b/src/compiler/shader_info.h
@@ -62,6 +62,9 @@ struct spirv_supported_capabilities {
bool post_depth_coverage;
bool transform_feedback;
bool geometry_streams;
+   bool address;
+   bool kernel;
+   bool int8;
 };
 
 typedef struct shader_info {
diff --git a/src/compiler/spirv/spirv_to_nir.c 
b/src/compiler/spirv/spirv_to_nir.c
index d7dd5a67cc4..db2ee51340c 100644
--- a/src/compiler/spirv/spirv_to_nir.c
+++ b/src/compiler/spirv/spirv_to_nir.c
@@ -792,8 +792,10 @@ struct_member_decoration_cb(struct vtn_builder *b,
case SpvDecorationFPRoundingMode:
case SpvDecorationFPFastMathMode:
case SpvDecorationAlignment:
-  vtn_warn("Decoration only allowed for CL-style kernels: %s",
-   spirv_decoration_to_string(dec->decoration));
+  if (!b->kernel_mode) {
+ vtn_warn("Decoration only allowed for CL-style kernels: %s",
+  spirv_decoration_to_string(dec->decoration));
+  }
   break;
 
case SpvDecorationHlslSemanticGOOGLE:
@@ -3428,7 +3430,6 @@ vtn_handle_preamble_instruction(struct vtn_builder *b, 
SpvOp opcode,
   case SpvCapabilityFloat16:
   case SpvCapabilityInt64Atomics:
   case SpvCapabilityStorageImageMultisample:
-  case SpvCapabilityInt8:
   case SpvCapabilitySparseResidency:
   case SpvCapabilityMinLod:
  vtn_warn("Unsupported SPIR-V capability: %s",
@@ -3457,8 +3458,17 @@ vtn_handle_preamble_instruction(struct vtn_builder *b, 
SpvOp opcode,
  spv_check_supported(geometry_streams, cap);
  break;
 
+  case SpvCapabilityInt8:
+ spv_check_supported(int8, cap);
+ break;
+
   case SpvCapabilityAddresses:
+ spv_check_supported(address, cap);
+ break;
   case SpvCapabilityKernel:
+ spv_check_supported(kernel, cap);
+ break;
+
   case SpvCapabilityImageBasic:
   case SpvCapabilityImageReadWrite:
   case SpvCapabilityImageMipmap:
diff --git a/src/compiler/spirv/vtn_variables.c 
b/src/compiler/spirv/vtn_variables.c
index c5cf345d02a..e7654b768af 100644
--- a/src/compiler/spirv/vtn_variables.c
+++ b/src/compiler/spirv/vtn_variables.c
@@ -1371,8 +1371,10 @@ apply_var_decoration(struct vtn_builder *b,
case SpvDecorationFPRoundingMode:
case SpvDecorationFPFastMathMode:
case SpvDecorationAlignment:
-  vtn_warn("Decoration only allowed for CL-style kernels: %s",
-   spirv_decoration_to_string(dec->decoration));
+  if (!b->kernel_mode) {
+ vtn_warn("Decoration only allowed for CL-style kernels: %s",
+  spirv_decoration_to_string(dec->decoration));
+  }
   break;
 
case SpvDecorationHlslSemanticGOOGLE:
-- 
2.19.1


[Mesa-dev] [PATCH 15/22] nir: add support for address bit sized system values

2018-11-13 Thread Karol Herbst

Signed-off-by: Karol Herbst 
---
 src/amd/vulkan/radv_meta_buffer.c |  8 ++--
 src/amd/vulkan/radv_meta_bufimage.c   | 16 
 src/amd/vulkan/radv_meta_fast_clear.c |  4 +-
 src/amd/vulkan/radv_meta_resolve_cs.c |  4 +-
 src/amd/vulkan/radv_query.c   |  8 ++--
 src/compiler/nir/nir_builder_opcodes_h.py | 15 ++-
 src/compiler/nir/nir_intrinsics.py| 10 ++---
 src/compiler/nir/nir_lower_system_values.c| 40 ---
 src/gallium/auxiliary/nir/tgsi_to_nir.c   |  2 +-
 src/gallium/drivers/vc4/vc4_nir_lower_blend.c |  4 +-
 src/intel/compiler/brw_nir.c  |  2 +-
 11 files changed, 67 insertions(+), 46 deletions(-)

diff --git a/src/amd/vulkan/radv_meta_buffer.c 
b/src/amd/vulkan/radv_meta_buffer.c
index 76854d7bbad..208988c3775 100644
--- a/src/amd/vulkan/radv_meta_buffer.c
+++ b/src/amd/vulkan/radv_meta_buffer.c
@@ -15,8 +15,8 @@ build_buffer_fill_shader(struct radv_device *dev)
b.shader->info.cs.local_size[1] = 1;
b.shader->info.cs.local_size[2] = 1;
 
-   nir_ssa_def *invoc_id = nir_load_local_invocation_id(&b);
-   nir_ssa_def *wg_id = nir_load_work_group_id(&b);
+   nir_ssa_def *invoc_id = nir_load_local_invocation_id(&b, 32);
+   nir_ssa_def *wg_id = nir_load_work_group_id(&b, 32);
nir_ssa_def *block_size = nir_imm_ivec4(&b,
b.shader->info.cs.local_size[0],
b.shader->info.cs.local_size[1],
@@ -67,8 +67,8 @@ build_buffer_copy_shader(struct radv_device *dev)
b.shader->info.cs.local_size[1] = 1;
b.shader->info.cs.local_size[2] = 1;
 
-   nir_ssa_def *invoc_id = nir_load_local_invocation_id(&b);
-   nir_ssa_def *wg_id = nir_load_work_group_id(&b);
+   nir_ssa_def *invoc_id = nir_load_local_invocation_id(&b, 32);
+   nir_ssa_def *wg_id = nir_load_work_group_id(&b, 32);
nir_ssa_def *block_size = nir_imm_ivec4(&b,
b.shader->info.cs.local_size[0],
b.shader->info.cs.local_size[1],
diff --git a/src/amd/vulkan/radv_meta_bufimage.c 
b/src/amd/vulkan/radv_meta_bufimage.c
index f5b68f6c9a6..e79919a984b 100644
--- a/src/amd/vulkan/radv_meta_bufimage.c
+++ b/src/amd/vulkan/radv_meta_bufimage.c
@@ -60,8 +60,8 @@ build_nir_itob_compute_shader(struct radv_device *dev, bool 
is_3d)
output_img->data.descriptor_set = 0;
output_img->data.binding = 1;
 
-   nir_ssa_def *invoc_id = nir_load_local_invocation_id(&b);
-   nir_ssa_def *wg_id = nir_load_work_group_id(&b);
+   nir_ssa_def *invoc_id = nir_load_local_invocation_id(&b, 32);
+   nir_ssa_def *wg_id = nir_load_work_group_id(&b, 32);
nir_ssa_def *block_size = nir_imm_ivec4(&b,
b.shader->info.cs.local_size[0],
b.shader->info.cs.local_size[1],
@@ -289,8 +289,8 @@ build_nir_btoi_compute_shader(struct radv_device *dev, bool 
is_3d)
output_img->data.descriptor_set = 0;
output_img->data.binding = 1;
 
-   nir_ssa_def *invoc_id = nir_load_local_invocation_id(&b);
-   nir_ssa_def *wg_id = nir_load_work_group_id(&b);
+   nir_ssa_def *invoc_id = nir_load_local_invocation_id(&b, 32);
+   nir_ssa_def *wg_id = nir_load_work_group_id(&b, 32);
nir_ssa_def *block_size = nir_imm_ivec4(&b,
b.shader->info.cs.local_size[0],
b.shader->info.cs.local_size[1],
@@ -719,8 +719,8 @@ build_nir_itoi_compute_shader(struct radv_device *dev, bool 
is_3d)
output_img->data.descriptor_set = 0;
output_img->data.binding = 1;
 
-   nir_ssa_def *invoc_id = nir_load_local_invocation_id(&b);
-   nir_ssa_def *wg_id = nir_load_work_group_id(&b);
+   nir_ssa_def *invoc_id = nir_load_local_invocation_id(&b, 32);
+   nir_ssa_def *wg_id = nir_load_work_group_id(&b, 32);
nir_ssa_def *block_size = nir_imm_ivec4(&b,
b.shader->info.cs.local_size[0],
b.shader->info.cs.local_size[1],
@@ -1139,8 +1139,8 @@ build_nir_cleari_compute_shader(struct radv_device *dev, 
bool is_3d)
output_img->data.descriptor_set = 0;
output_img->data.binding = 0;
 
-   nir_ssa_def *invoc_id = nir_load_local_invocation_id(&b);
-   nir_ssa_def *wg_id = nir_load_work_group_id(&b);
+   nir_ssa_def *invoc_id = nir_load_local_invocation_id(&b, 32);
+   nir_ssa_def *wg_id = nir_load_work_group_id(&b, 32);
nir_ssa_def *block_size = nir_imm_ivec4(&b,
b.shader->info.cs.local_size[0],
b.shader->info.cs.local_size[1],
diff --git a/src/amd/vulkan/radv_meta_fast_clear.c 
b/src/amd/vulkan/radv_meta_fast_clear.c
index

[Mesa-dev] [PATCH 16/22] nir+vtn: vec8+vec16 support

2018-11-13 Thread Karol Herbst

This introduces new vec8 and vec16 instructions (which are the only
instructions taking more than 4 sources), in order to construct 8 and 16
component vectors.

In order to avoid fixing up the non-autogenerated nir_build_alu() sites
and making them pass 16 src args for the benefit of the two instructions
that take more than 4 srcs (i.e. vec8 and vec16), nir_build_alu() has its
tail split out into nir_build_alu_tail(), which is re-used by
nir_build_alu2() (used for the > 4 src args case).

Signed-off-by: Rob Clark 
Signed-off-by: Karol Herbst 
---
 src/compiler/nir/nir.h   |  4 +-
 src/compiler/nir/nir_builder.h   | 58 +++-
 src/compiler/nir/nir_builder_opcodes_h.py|  5 +-
 src/compiler/nir/nir_constant_expressions.py | 33 +--
 src/compiler/nir/nir_lower_alu_to_scalar.c   |  2 +
 src/compiler/nir/nir_opcodes.py  | 39 -
 src/compiler/nir/nir_print.c | 17 --
 src/compiler/nir/nir_search.c|  8 ++-
 src/compiler/spirv/spirv_to_nir.c|  4 +-
 9 files changed, 140 insertions(+), 30 deletions(-)

diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
index 3855eb0b582..89c28e36618 100644
--- a/src/compiler/nir/nir.h
+++ b/src/compiler/nir/nir.h
@@ -57,8 +57,8 @@ extern "C" {
 
 #define NIR_FALSE 0u
 #define NIR_TRUE (~0u)
-#define NIR_MAX_VEC_COMPONENTS 4
-typedef uint8_t nir_component_mask_t;
+#define NIR_MAX_VEC_COMPONENTS 16
+typedef uint16_t nir_component_mask_t;
 
 /** Defines a cast function
  *
diff --git a/src/compiler/nir/nir_builder.h b/src/compiler/nir/nir_builder.h
index 3271a480520..57f0a188c46 100644
--- a/src/compiler/nir/nir_builder.h
+++ b/src/compiler/nir/nir_builder.h
@@ -352,24 +352,12 @@ nir_imm_ivec4(nir_builder *build, int x, int y, int z, 
int w)
 }
 
 static inline nir_ssa_def *
-nir_build_alu(nir_builder *build, nir_op op, nir_ssa_def *src0,
-  nir_ssa_def *src1, nir_ssa_def *src2, nir_ssa_def *src3)
+nir_build_alu_tail(nir_builder *build, nir_alu_instr *instr)
 {
-   const nir_op_info *op_info = &nir_op_infos[op];
-   nir_alu_instr *instr = nir_alu_instr_create(build->shader, op);
-   if (!instr)
-  return NULL;
+   const nir_op_info *op_info = &nir_op_infos[instr->op];
 
instr->exact = build->exact;
 
-   instr->src[0].src = nir_src_for_ssa(src0);
-   if (src1)
-  instr->src[1].src = nir_src_for_ssa(src1);
-   if (src2)
-  instr->src[2].src = nir_src_for_ssa(src2);
-   if (src3)
-  instr->src[3].src = nir_src_for_ssa(src3);
-
/* Guess the number of components the destination temporary should have
 * based on our input sizes, if it's not fixed for the op.
 */
@@ -425,12 +413,54 @@ nir_build_alu(nir_builder *build, nir_op op, nir_ssa_def 
*src0,
return &instr->dest.dest.ssa;
 }
 
+static inline nir_ssa_def *
+nir_build_alu(nir_builder *build, nir_op op, nir_ssa_def *src0,
+  nir_ssa_def *src1, nir_ssa_def *src2, nir_ssa_def *src3)
+{
+   nir_alu_instr *instr = nir_alu_instr_create(build->shader, op);
+   if (!instr)
+  return NULL;
+
+   instr->src[0].src = nir_src_for_ssa(src0);
+   if (src1)
+  instr->src[1].src = nir_src_for_ssa(src1);
+   if (src2)
+  instr->src[2].src = nir_src_for_ssa(src2);
+   if (src3)
+  instr->src[3].src = nir_src_for_ssa(src3);
+
+   return nir_build_alu_tail(build, instr);
+}
+
+/* for the couple special cases with more than 4 src args: */
+static inline nir_ssa_def *
+nir_build_alu2(nir_builder *build, nir_op op, nir_ssa_def **srcs)
+{
+   const nir_op_info *op_info = &nir_op_infos[op];
+   nir_alu_instr *instr = nir_alu_instr_create(build->shader, op);
+   if (!instr)
+  return NULL;
+
+   for (unsigned i = 0; i < op_info->num_inputs; i++)
+  instr->src[i].src = nir_src_for_ssa(srcs[i]);
+
+   return nir_build_alu_tail(build, instr);
+}
+
 #include "nir_builder_opcodes.h"
 
 static inline nir_ssa_def *
 nir_vec(nir_builder *build, nir_ssa_def **comp, unsigned num_components)
 {
switch (num_components) {
+   case 16:
+  return nir_vec16(build, comp[0], comp[1], comp[2], comp[3],
+   comp[4], comp[5], comp[6], comp[7],
+   comp[8], comp[9], comp[10], comp[11],
+   comp[12], comp[13], comp[14], comp[15]);
+   case 8:
+  return nir_vec8(build, comp[0], comp[1], comp[2], comp[3],
+  comp[4], comp[5], comp[6], comp[7]);
case 4:
   return nir_vec4(build, comp[0], comp[1], comp[2], comp[3]);
case 3:
diff --git a/src/compiler/nir/nir_builder_opcodes_h.py 
b/src/compiler/nir/nir_builder_opcodes_h.py
index 84e5400958e..47edc02896c 100644
--- a/src/compiler/nir/nir_builder_opcodes_h.py
+++ b/src/compiler/nir/nir_builder_opcodes_h.py
@@ -31,14 +31,15 @@ def src_decl_list(num_srcs):
return ', '.join('nir_ssa_def *src' + str(i) for i in range(num_srcs))
 
 def src_list(num_srcs):
-   return ', '.join('src' + str(i) if i < num_srcs else 'NULL' for i in 
range(4))
+   return ',
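
The nir.h hunk above raises NIR_MAX_VEC_COMPONENTS to 16 and widens nir_component_mask_t to uint16_t. A minimal sketch of why the wider type is needed, using hypothetical names (MAX_VEC_COMPONENTS, component_mask_t, full_component_mask) that only mirror the patch and are not Mesa API:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for this sketch; they mirror NIR_MAX_VEC_COMPONENTS
 * and nir_component_mask_t from the patch but are not Mesa identifiers. */
#define MAX_VEC_COMPONENTS 16
typedef uint16_t component_mask_t;

/* Return a mask with the low num_components bits set. */
static inline component_mask_t
full_component_mask(unsigned num_components)
{
   assert(num_components >= 1 && num_components <= MAX_VEC_COMPONENTS);
   /* Shift in 32 bits so that num_components == 16 does not overflow
    * before the result is narrowed to the 16-bit mask type. */
   return (component_mask_t)((UINT32_C(1) << num_components) - 1u);
}
```

With an 8-bit mask type, the full mask of a vec16 would not be representable, which is presumably why the typedef changes together with the component limit.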

[Mesa-dev] [PATCH 08/22] glsl: add glsl_base_get_byte_size

2018-11-13 Thread Karol Herbst

Signed-off-by: Karol Herbst 
---
 src/compiler/glsl_types.h | 34 ++
 src/compiler/nir_types.h  | 30 +-
 2 files changed, 35 insertions(+), 29 deletions(-)

diff --git a/src/compiler/glsl_types.h b/src/compiler/glsl_types.h
index f2163728610..efcbc70af26 100644
--- a/src/compiler/glsl_types.h
+++ b/src/compiler/glsl_types.h
@@ -1089,4 +1089,38 @@ glsl_align(unsigned int a, unsigned int align)
return (a + align - 1) / align * align;
 }
 
+static inline unsigned
+glsl_base_get_byte_size(const enum glsl_base_type base_type)
+{
+   switch (base_type) {
+   case GLSL_TYPE_INT:
+   case GLSL_TYPE_UINT:
+   case GLSL_TYPE_BOOL:
+   case GLSL_TYPE_FLOAT: /* TODO handle mediump */
+   case GLSL_TYPE_SUBROUTINE:
+  return 4;
+
+   case GLSL_TYPE_FLOAT16:
+   case GLSL_TYPE_UINT16:
+   case GLSL_TYPE_INT16:
+  return 2;
+
+   case GLSL_TYPE_UINT8:
+   case GLSL_TYPE_INT8:
+  return 1;
+
+   case GLSL_TYPE_DOUBLE:
+   case GLSL_TYPE_INT64:
+   case GLSL_TYPE_UINT64:
+   case GLSL_TYPE_IMAGE:
+   case GLSL_TYPE_SAMPLER:
+  return 8;
+
+   default:
+  unreachable("unknown base type");
+   }
+
+   return 0;
+}
+
 #endif /* GLSL_TYPES_H */
diff --git a/src/compiler/nir_types.h b/src/compiler/nir_types.h
index 7080a23e1cc..c06d227e45a 100644
--- a/src/compiler/nir_types.h
+++ b/src/compiler/nir_types.h
@@ -94,35 +94,7 @@ unsigned glsl_atomic_size(const struct glsl_type *type);
 static inline unsigned
 glsl_get_bit_size(const struct glsl_type *type)
 {
-   switch (glsl_get_base_type(type)) {
-   case GLSL_TYPE_INT:
-   case GLSL_TYPE_UINT:
-   case GLSL_TYPE_BOOL:
-   case GLSL_TYPE_FLOAT: /* TODO handle mediump */
-   case GLSL_TYPE_SUBROUTINE:
-  return 32;
-
-   case GLSL_TYPE_FLOAT16:
-   case GLSL_TYPE_UINT16:
-   case GLSL_TYPE_INT16:
-  return 16;
-
-   case GLSL_TYPE_UINT8:
-   case GLSL_TYPE_INT8:
-  return 8;
-
-   case GLSL_TYPE_DOUBLE:
-   case GLSL_TYPE_INT64:
-   case GLSL_TYPE_UINT64:
-   case GLSL_TYPE_IMAGE:
-   case GLSL_TYPE_SAMPLER:
-  return 64;
-
-   default:
-  unreachable("unknown base type");
-   }
-
-   return 0;
+   return glsl_base_get_byte_size(glsl_get_base_type(type)) * 8;
 }
 
 bool glsl_type_is_16bit(const struct glsl_type *type);
-- 
2.19.1
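
The refactoring in this patch boils down to deriving the bit size from a byte-size table. A minimal, self-contained sketch of that relationship follows; the enum and function names are hypothetical stand-ins and cover only a few of the base types handled above:

```c
#include <assert.h>

/* Hypothetical stand-in for a few glsl_base_type values; not Mesa's enum. */
enum base_type {
   BASE_INT8,
   BASE_FLOAT16,
   BASE_FLOAT,
   BASE_DOUBLE
};

/* Size of one scalar of the given base type, in bytes. */
static inline unsigned
base_byte_size(enum base_type t)
{
   switch (t) {
   case BASE_INT8:    return 1;
   case BASE_FLOAT16: return 2;
   case BASE_FLOAT:   return 4;
   case BASE_DOUBLE:  return 8;
   }
   return 0; /* unknown type */
}

/* The bit size is derived from the byte size, as glsl_get_bit_size()
 * does after this patch. */
static inline unsigned
base_bit_size(enum base_type t)
{
   return base_byte_size(t) * 8;
}
```

Keeping a single table means the byte-size and bit-size answers cannot drift apart when a new base type is added.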


[Mesa-dev] [PATCH 06/22] vtn: handle SpvExecutionModelKernel

2018-11-13 Thread Karol Herbst

Signed-off-by: Karol Herbst 
---
 src/compiler/spirv/spirv_to_nir.c | 3 +++
 src/compiler/spirv/vtn_private.h  | 2 ++
 2 files changed, 5 insertions(+)

diff --git a/src/compiler/spirv/spirv_to_nir.c 
b/src/compiler/spirv/spirv_to_nir.c
index 2c214324774..650eb6a977c 100644
--- a/src/compiler/spirv/spirv_to_nir.c
+++ b/src/compiler/spirv/spirv_to_nir.c
@@ -3318,6 +3318,9 @@ stage_for_execution_model(struct vtn_builder *b, 
SpvExecutionModel model)
   return MESA_SHADER_FRAGMENT;
case SpvExecutionModelGLCompute:
   return MESA_SHADER_COMPUTE;
+   case SpvExecutionModelKernel:
+  b->kernel_mode = true;
+  return MESA_SHADER_COMPUTE;
default:
   vtn_fail("Unsupported execution model");
}
diff --git a/src/compiler/spirv/vtn_private.h b/src/compiler/spirv/vtn_private.h
index 643a88d1abe..df6356f50fe 100644
--- a/src/compiler/spirv/vtn_private.h
+++ b/src/compiler/spirv/vtn_private.h
@@ -605,6 +605,8 @@ struct vtn_builder {
unsigned func_param_idx;
 
bool has_loop_continue;
+
+   bool kernel_mode;
 };
 
 nir_ssa_def *
-- 
2.19.1


[Mesa-dev] [PATCH 11/22] nir: simplify get_io_offset() parameters

2018-11-13 Thread Karol Herbst

From: Rob Clark 

For pointers we'll need to add another caller, plus a type_align()
function pointer. So just simplify things and pass the
lower_io_state to get_io_offset().

Signed-off-by: Karol Herbst 
---
 src/compiler/nir/nir_lower_io.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/src/compiler/nir/nir_lower_io.c b/src/compiler/nir/nir_lower_io.c
index b3595bb19d5..2a6c284de2b 100644
--- a/src/compiler/nir/nir_lower_io.c
+++ b/src/compiler/nir/nir_lower_io.c
@@ -87,11 +87,11 @@ nir_is_per_vertex_io(const nir_variable *var, 
gl_shader_stage stage)
 }
 
 static nir_ssa_def *
-get_io_offset(nir_builder *b, nir_deref_instr *deref,
-  nir_ssa_def **vertex_index,
-  int (*type_size)(const struct glsl_type *),
-  unsigned *component)
+get_io_offset(nir_deref_instr *deref, nir_ssa_def **vertex_index,
+  struct lower_io_state *state, unsigned *component)
 {
+   nir_builder *b = &state->builder;
+   int (*type_size)(const struct glsl_type *) = state->type_size;
nir_deref_path path;
nir_deref_path_init(&path, deref, NULL);
 
@@ -421,8 +421,8 @@ nir_lower_io_block(nir_block *block,
   nir_ssa_def *vertex_index = NULL;
   unsigned component_offset = var->data.location_frac;
 
-  offset = get_io_offset(b, deref, per_vertex ? &vertex_index : NULL,
- state->type_size, &component_offset);
+  offset = get_io_offset(deref, per_vertex ? &vertex_index : NULL,
+ state, &component_offset);
 
   nir_intrinsic_instr *replacement;
 
-- 
2.19.1


[Mesa-dev] [PATCH 18/22] nir/spirv: handle SpvStorageClassCrossWorkgroup

2018-11-13 Thread Karol Herbst

Signed-off-by: Karol Herbst 
---
 src/compiler/nir/nir.c | 4 
 src/compiler/nir/nir.h | 1 +
 src/compiler/nir/nir_print.c   | 2 ++
 src/compiler/spirv/vtn_private.h   | 1 +
 src/compiler/spirv/vtn_variables.c | 4 
 5 files changed, 12 insertions(+)

diff --git a/src/compiler/nir/nir.c b/src/compiler/nir/nir.c
index 27f5d1b7bca..ca258b7c80e 100644
--- a/src/compiler/nir/nir.c
+++ b/src/compiler/nir/nir.c
@@ -129,6 +129,10 @@ nir_shader_add_variable(nir_shader *shader, nir_variable 
*var)
   assert(!"nir_shader_add_variable cannot be used for local variables");
   break;
 
+   case nir_var_global:
+  assert(!"nir_shader_add_variable cannot be used for global memory");
+  break;
+
case nir_var_private:
   exec_list_push_tail(&shader->globals, &var->node);
   break;
diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
index 78f3204d3e2..35f2ec02c31 100644
--- a/src/compiler/nir/nir.h
+++ b/src/compiler/nir/nir.h
@@ -102,6 +102,7 @@ typedef enum {
nir_var_shader_storage  = (1 << 5),
nir_var_system_value= (1 << 6),
nir_var_shared  = (1 << 8),
+   nir_var_global  = (1 << 9),
nir_var_all = ~0,
 } nir_variable_mode;
 
diff --git a/src/compiler/nir/nir_print.c b/src/compiler/nir/nir_print.c
index 88f91087134..2fb041039c6 100644
--- a/src/compiler/nir/nir_print.c
+++ b/src/compiler/nir/nir_print.c
@@ -420,6 +420,8 @@ get_variable_mode_str(nir_variable_mode mode, bool 
want_local_global_mode)
   return want_local_global_mode ? "private" : "";
case nir_var_local:
   return want_local_global_mode ? "local" : "";
+   case nir_var_global:
+  return want_local_global_mode ? "global" : "";
default:
   return "";
}
diff --git a/src/compiler/spirv/vtn_private.h b/src/compiler/spirv/vtn_private.h
index 86f98083f58..4dec2b66ff0 100644
--- a/src/compiler/spirv/vtn_private.h
+++ b/src/compiler/spirv/vtn_private.h
@@ -424,6 +424,7 @@ enum vtn_variable_mode {
vtn_variable_mode_ssbo,
vtn_variable_mode_push_constant,
vtn_variable_mode_workgroup,
+   vtn_variable_mode_cross_workgroup,
vtn_variable_mode_input,
vtn_variable_mode_output,
 };
diff --git a/src/compiler/spirv/vtn_variables.c 
b/src/compiler/spirv/vtn_variables.c
index 5738941ffb6..7896e58f7e5 100644
--- a/src/compiler/spirv/vtn_variables.c
+++ b/src/compiler/spirv/vtn_variables.c
@@ -1572,6 +1572,9 @@ vtn_storage_class_to_mode(struct vtn_builder *b,
   nir_mode = nir_var_uniform;
   break;
case SpvStorageClassCrossWorkgroup:
+  mode = vtn_variable_mode_cross_workgroup;
+  nir_mode = nir_var_global;
+  break;
case SpvStorageClassGeneric:
default:
   vtn_fail("Unhandled variable storage class");
@@ -1830,6 +1833,7 @@ vtn_create_variable(struct vtn_builder *b, struct 
vtn_value *val,
case vtn_variable_mode_ubo:
case vtn_variable_mode_ssbo:
case vtn_variable_mode_push_constant:
+   case vtn_variable_mode_cross_workgroup:
   /* These don't need actual variables. */
   break;
}
-- 
2.19.1


[Mesa-dev] [PATCH 05/22] nir/spirv: cast shift operand to u32

2018-11-13 Thread Karol Herbst

v2: fix for specialization constants as well

Signed-off-by: Karol Herbst 
---
 src/compiler/spirv/spirv_to_nir.c | 20 
 src/compiler/spirv/vtn_alu.c  | 11 +++
 2 files changed, 31 insertions(+)

diff --git a/src/compiler/spirv/spirv_to_nir.c 
b/src/compiler/spirv/spirv_to_nir.c
index d72f07dc1f9..2c214324774 100644
--- a/src/compiler/spirv/spirv_to_nir.c
+++ b/src/compiler/spirv/spirv_to_nir.c
@@ -1813,6 +1813,26 @@ vtn_handle_constant(struct vtn_builder *b, SpvOp opcode,
 src[j] = src_val->constant->values[0];
  }
 
+ /* fix up fixed size sources */
+ switch (op) {
+ case nir_op_ishl:
+ case nir_op_ishr:
+ case nir_op_ushr: {
+if (bit_size == 32)
+   break;
+for (unsigned i = 0; i < num_components; ++i) {
+   switch (bit_size) {
+   case 64: src[1].u32[i] = src[1].u64[i]; break;
+   case 16: src[1].u32[i] = src[1].u16[i]; break;
+   case  8: src[1].u32[i] = src[1].u8[i];  break;
+   }
+}
+break;
+ }
+ default:
+break;
+ }
+
  val->constant->values[0] =
 nir_eval_const_opcode(op, num_components, bit_size, src);
  break;
diff --git a/src/compiler/spirv/vtn_alu.c b/src/compiler/spirv/vtn_alu.c
index ea25d4bcbdc..32825da29cb 100644
--- a/src/compiler/spirv/vtn_alu.c
+++ b/src/compiler/spirv/vtn_alu.c
@@ -743,6 +743,17 @@ vtn_handle_alu(struct vtn_builder *b, SpvOp opcode,
  src[1] = tmp;
   }
 
+  switch (op) {
+  case nir_op_ishl:
+  case nir_op_ishr:
+  case nir_op_ushr:
+ if (src[1]->bit_size != 32)
+src[1] = nir_u2u32(&b->nb, src[1]);
+ break;
+  default:
+ break;
+  }
+
   val->ssa->def = nir_build_alu(&b->nb, op, src[0], src[1], src[2], src[3]);
   break;
} /* default */
-- 
2.19.1
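
The constant-folding hunk above re-stores the shift count as a 32-bit value whenever the operation's bit size is not 32. A minimal sketch of that conversion, assuming a hypothetical union `const_vec` with one array per bit size sharing the same storage (loosely modelled on the `src[1].u16[i]` style accesses in the hunk; whether nir_const_value has exactly this layout is not shown in the patch):

```c
#include <assert.h>
#include <stdint.h>

#define VEC_MAX 4 /* hypothetical component limit for this sketch */

/* Hypothetical constant container: one array per bit size, all sharing
 * the same storage. Not Mesa's nir_const_value. */
union const_vec {
   uint8_t  u8[VEC_MAX];
   uint16_t u16[VEC_MAX];
   uint32_t u32[VEC_MAX];
   uint64_t u64[VEC_MAX];
};

/* Re-store num_components shift counts of the given bit size as u32. */
static inline void
shift_counts_to_u32(union const_vec *v, unsigned bit_size,
                    unsigned num_components)
{
   uint32_t tmp[VEC_MAX];

   assert(num_components <= VEC_MAX);

   /* Gather first: in a union like this, u32[i] shares bytes with
    * narrower slots of higher index (u32[0] overlaps u16[1]), so
    * writing in place could clobber values not yet read. */
   for (unsigned i = 0; i < num_components; i++) {
      switch (bit_size) {
      case 64: tmp[i] = (uint32_t)v->u64[i]; break;
      case 16: tmp[i] = v->u16[i];           break;
      case  8: tmp[i] = v->u8[i];            break;
      default: tmp[i] = v->u32[i];           break; /* already 32-bit */
      }
   }

   for (unsigned i = 0; i < num_components; i++)
      v->u32[i] = tmp[i];
}
```

The temporary array is a design choice of this sketch, not something taken from the patch; it keeps the conversion correct regardless of how the narrow and wide slots overlap.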


[Mesa-dev] [PATCH 21/22] spirv/cl: support vload/vstore

2018-11-13 Thread Karol Herbst

Signed-off-by: Karol Herbst 
---
 src/compiler/spirv/vtn_opencl.c | 59 +
 1 file changed, 59 insertions(+)

diff --git a/src/compiler/spirv/vtn_opencl.c b/src/compiler/spirv/vtn_opencl.c
index 089e6168fd8..ecaca4c17bc 100644
--- a/src/compiler/spirv/vtn_opencl.c
+++ b/src/compiler/spirv/vtn_opencl.c
@@ -191,6 +191,59 @@ handle_special(struct vtn_builder *b, enum OpenCLstd 
opcode, unsigned num_srcs,
}
 }
 
+static void
+_handle_v_load_store(struct vtn_builder *b, enum OpenCLstd opcode,
+ const uint32_t *w, unsigned count, bool load)
+{
+   struct vtn_type *type;
+   if (load)
+  type = vtn_value(b, w[1], vtn_value_type_type)->type;
+   else
+  type = vtn_untyped_value(b, w[5])->type;
+   unsigned a = load ? 0 : 1;
+
+   const struct glsl_type *dest_type = type->type;
+   enum glsl_base_type base_type = glsl_get_base_type(dest_type);
+   const struct glsl_type *scalar_type = glsl_scalar_type(base_type);
+
+   nir_ssa_def *offset = vtn_ssa_value(b, w[5 + a])->def;
+   struct vtn_value *p = vtn_value(b, w[6 + a], vtn_value_type_pointer);
+
+   nir_deref_instr *deref = vtn_pointer_to_deref(b, p->pointer);
+
+   /* we have to manually handle alignment here for vec3 */
+   /* 1. cast to scalar type */
+   deref = nir_build_deref_cast(&b->nb, &deref->dest.ssa, nir_var_global, scalar_type);
+   /* 2. multiply offset by vector size */
+   offset = nir_imul(&b->nb, offset, nir_imm_intN_t(&b->nb, glsl_get_vector_elements(dest_type), offset->bit_size));
+   /* 3. deref ptr_as_array */
+   deref = nir_build_deref_ptr_as_array(&b->nb, deref, offset, scalar_type);
+   /* 4. cast to vec type */
+   deref = nir_build_deref_cast(&b->nb, &deref->dest.ssa, nir_var_global, dest_type);
+
+   if (load) {
+  struct vtn_ssa_value *val = vtn_local_load(b, deref);
+  vtn_push_ssa(b, w[2], type, val);
+   } else {
+  struct vtn_ssa_value *val = vtn_ssa_value(b, w[5]);
+  vtn_local_store(b, val, deref);
+   }
+}
+
+static void
+vtn_handle_opencl_vload(struct vtn_builder *b, enum OpenCLstd opcode,
+const uint32_t *w, unsigned count)
+{
+   _handle_v_load_store(b, opcode, w, count, true);
+}
+
+static void
+vtn_handle_opencl_vstore(struct vtn_builder *b, enum OpenCLstd opcode,
+ const uint32_t *w, unsigned count)
+{
+   _handle_v_load_store(b, opcode, w, count, false);
+}
+
 static nir_ssa_def *
 handle_printf(struct vtn_builder *b, enum OpenCLstd opcode, unsigned num_srcs,
   nir_ssa_def **srcs, const struct glsl_type *dest_type)
@@ -271,6 +324,12 @@ vtn_handle_opencl_instruction(struct vtn_builder *b, 
uint32_t ext_opcode,
case U_Upsample:
   handle_instr(b, ext_opcode, w, count, handle_special);
   return true;
+   case Vloadn:
+  vtn_handle_opencl_vload(b, ext_opcode, w, count);
+  return true;
+   case Vstoren:
+  vtn_handle_opencl_vstore(b, ext_opcode, w, count);
+  return true;
case Printf:
   handle_instr(b, ext_opcode, w, count, handle_printf);
   return true;
-- 
2.19.1
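
The four numbered steps in _handle_v_load_store() implement the OpenCL C addressing rule for vloadn: the n elements are read from p + offset * n, indexed in scalars. This matters for vec3, because a 3-component vector type is sized and aligned like a 4-component one, so indexing in whole vectors would use the wrong stride. A plain C sketch of the same arithmetic, with a hypothetical helper name and float as the element type:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy n consecutive floats starting at
 * p + offset * n into dst, mirroring the vloadn addressing rule. */
static inline void
vload_n_float(float *dst, const float *p, size_t offset, unsigned n)
{
   /* Scale by the element count, not by a padded vector size. */
   const float *src = p + offset * n;
   memcpy(dst, src, n * sizeof(float));
}
```

For a buffer holding 0, 1, 2, ... and n = 3, offset 2 starts at element 6, not at element 8 as a 16-byte vec3 stride would give.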


[Mesa-dev] [PATCH 00/22] nir/spirv: support for CL kernel

2018-11-13 Thread Karol Herbst

Some of these patches are already reviewed, but not pushed. I just wanted to
post the patches to show the most current approach and to start a discussion
on what we might want to handle differently.

There are some things I am not so happy about as well, like the bit_size
handling for system values or how the derefs for pointers are created.

But overall it feels like we require fewer changes with my new approach
to support physical pointers inside nir and vtn.

Karol Herbst (18):
  nir: add const_index parameters to system value builder function
  nir: replace nir_load_system_value calls with appropiate builder
functions
  nir/spirv: initial handling of OpenCL.std extension opcodes
  nir/spirv: cast shift operand to u32
  vtn: handle SpvExecutionModelKernel
  glsl: add packed for struct types
  glsl: add glsl_base_get_byte_size
  glsl: add cl_size and cl_alignment
  nir/spirv: parse memory model
  nir: add legal bit_sizes to intrinsics
  nir: add support for address bit sized system values
  nir+vtn: vec8+vec16 support
  nir: rename global to private memory
  nir/spirv: handle SpvStorageClassCrossWorkgroup
  nir/spirv: handle kernel function parameters
  nir/spirv: physical pointer support
  spirv/cl: support vload/vstore
  nir/spirv: handle OpBitcasts for pointers

Rob Clark (4):
  nir/spirv: add OpIsFinite and OpIsNormal
  nir/vtn: add caps for some cl related capabilities
  nir: simplify get_io_offset() parameters
  nir: add type alignment support to lower_io

 src/amd/vulkan/radv_meta_buffer.c |   8 +-
 src/amd/vulkan/radv_meta_bufimage.c   |  16 +-
 src/amd/vulkan/radv_meta_clear.c  |   8 +-
 src/amd/vulkan/radv_meta_fast_clear.c |   4 +-
 src/amd/vulkan/radv_meta_resolve_cs.c |   4 +-
 src/amd/vulkan/radv_query.c   |   8 +-
 src/compiler/glsl/glsl_to_nir.cpp |   4 +-
 src/compiler/glsl_types.cpp   |  65 +++-
 src/compiler/glsl_types.h |  56 ++-
 src/compiler/nir/meson.build  |   1 +
 src/compiler/nir/nir.c|   8 +-
 src/compiler/nir/nir.h|  37 +-
 src/compiler/nir/nir_builder.h|  95 -
 src/compiler/nir/nir_builder_opcodes_h.py |  41 ++-
 src/compiler/nir/nir_builtin_builder.c| 249 -
 src/compiler/nir/nir_builtin_builder.h| 150 +++-
 src/compiler/nir/nir_clone.c  |   2 +
 src/compiler/nir/nir_constant_expressions.py  |  33 +-
 src/compiler/nir/nir_deref.c  |  26 +-
 src/compiler/nir/nir_instr_set.c  |   2 +
 src/compiler/nir/nir_intrinsics.py|  32 +-
 src/compiler/nir/nir_intrinsics_c.py  |   6 +-
 src/compiler/nir/nir_linking_helpers.c|   2 +-
 src/compiler/nir/nir_loop_analyze.c   |   2 +-
 src/compiler/nir/nir_lower_alu_to_scalar.c|   2 +
 src/compiler/nir/nir_lower_clip.c |   3 +-
 .../nir/nir_lower_constant_initializers.c |   2 +-
 .../nir/nir_lower_global_vars_to_local.c  |   4 +-
 src/compiler/nir/nir_lower_indirect_derefs.c  |   6 +-
 src/compiler/nir/nir_lower_io.c   | 141 +--
 .../nir/nir_lower_io_arrays_to_elements.c |   4 +-
 .../nir/nir_lower_io_to_temporaries.c |   2 +-
 src/compiler/nir/nir_lower_locals_to_regs.c   |   9 +-
 src/compiler/nir/nir_lower_system_values.c|  40 +-
 src/compiler/nir/nir_lower_var_copies.c   |   3 +-
 src/compiler/nir/nir_lower_vars_to_ssa.c  |  12 +-
 src/compiler/nir/nir_lower_wpos_center.c  |   3 +-
 src/compiler/nir/nir_opcodes.py   |  39 +-
 src/compiler/nir/nir_opt_copy_prop_vars.c |   4 +-
 src/compiler/nir/nir_opt_copy_propagate.c |   2 +-
 src/compiler/nir/nir_opt_dead_write_vars.c|   6 +-
 src/compiler/nir/nir_print.c  |  29 +-
 src/compiler/nir/nir_propagate_invariant.c|   2 +
 src/compiler/nir/nir_remove_dead_variables.c  |   6 +-
 src/compiler/nir/nir_search.c |   8 +-
 src/compiler/nir/nir_serialize.c  |   4 +
 src/compiler/nir/nir_split_vars.c |  20 +-
 src/compiler/nir/nir_validate.c   |  17 +-
 src/compiler/nir/tests/vars_tests.cpp |   2 +-
 src/compiler/nir_types.cpp|  17 +-
 src/compiler/nir_types.h  |  37 +-
 src/compiler/shader_info.h|   3 +
 src/compiler/spirv/spirv_to_nir.c | 131 ++-
 src/compiler/spirv/vtn_alu.c  | 245 +
 src/compiler/spirv/vtn_cfg.c  |   3 +-
 src/compiler/spirv/vtn_glsl450.c  |   2 +-
 src/compiler/spirv/vtn_opencl.c   | 343 ++
 src/compiler/spirv/vtn_private.h  |  19 +-
 src/compiler/spirv/vtn_variables.c| 106 --
 src/gallium/auxiliary/nir/tgsi_to_nir.c   |   4 +-
 src/gallium/drivers/vc4/vc4_nir_lower_blend.c |   4 +-
 src/intel/compiler/brw_nir.c

Re: [Mesa-dev] [PATCH mesa] xmlpool: update translation po files

2018-11-13 Thread Eric Engestrom

On Tuesday, 2018-11-13 13:37:14 +, Emil Velikov wrote:
> On Mon, 12 Nov 2018 at 18:14, Dylan Baker  wrote:
> >
> > Quoting Eric Engestrom (2018-11-12 09:47:22)
> > > On Monday, 2018-11-12 16:56:32 +, Emil Velikov wrote:
> > > > On Mon, 12 Nov 2018 at 14:24, Eric Engestrom  
> > > > wrote:
> > > > >
> > > > > These files are close to 4 years out of date; a lot's changed since.
> > > > > Let's just check in a recently-regenerated version.
> > > > >
> > > > Worth removing them from git and letting the build regenerate them as 
> > > > needed?
> > >
> > > No, the point is for them to be filled with the translations.
> > > They aren't 100% generated, they're more like "refreshed" by running the
> > > ninja command, to add new strings to be translated and adjust file/line
> > > references.
> > >
> > >
> > > That said, I've just looked at the state of the translations, and
> > > "partial" is already generous. Users would currently get a mostly
> > > english driconf interface with a few strings translated here and there,
> > > which I'm not sure is worth the hassle of maintaining all this.
> > >
> > > Should we just drop the translation infrastructure?
> >
> > I'd try pinging the people who provided the translations in the first place 
> > to
> > see if they're interested in updating them. If not I'd be in favor of 
> > dropping
> > unmaintained translations, if there are no maintained translations drop the
> > whole things.
> >
> > Just my 2¢
> >
> Very well said Dylan. I'm on the same page.

Sounds like a good plan; I'll ping them privately and we'll see from there :)

[Mesa-dev] [RFC PATCH 4/8] mesa/main/version: Lower the requirements for GLES 3.0

2018-11-13 Thread Gert Wollny

From: Gert Wollny 

GLES 3.0 does not actually require support for EXT_framebuffer_sRGB, it
only needs support for sRGB attachments to framebuffers.

Signed-off-by: Gert Wollny 
---
 src/mesa/main/version.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/mesa/main/version.c b/src/mesa/main/version.c
index 610ba2f08c..2f7ac75a81 100644
--- a/src/mesa/main/version.c
+++ b/src/mesa/main/version.c
@@ -512,8 +512,9 @@ compute_version_es2(const struct gl_extensions *extensions,
  extensions->ARB_texture_float &&
  extensions->ARB_texture_rg &&
  extensions->ARB_depth_buffer_float &&
- /* extensions->ARB_framebuffer_object && */
- extensions->EXT_framebuffer_sRGB &&
+ (extensions->EXT_framebuffer_sRGB ||
+  (extensions->ARB_framebuffer_object &&
+   extensions->EXT_sRGB)) &&
  extensions->EXT_packed_float &&
  extensions->EXT_texture_array &&
  extensions->EXT_texture_shared_exponent &&
-- 
2.18.1
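
The relaxed requirement in this patch is a small boolean condition: either EXT_framebuffer_sRGB is present, or both ARB_framebuffer_object and EXT_sRGB are. A minimal sketch of that predicate, using a hypothetical three-field struct rather than Mesa's struct gl_extensions:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical subset of extension flags; not Mesa's struct gl_extensions. */
struct ext_flags {
   bool EXT_framebuffer_sRGB;
   bool ARB_framebuffer_object;
   bool EXT_sRGB;
};

/* sRGB part of the GLES 3.0 check as written in the hunk above. */
static inline bool
has_es3_srgb_support(const struct ext_flags *e)
{
   return e->EXT_framebuffer_sRGB ||
          (e->ARB_framebuffer_object && e->EXT_sRGB);
}
```

A driver exposing only EXT_sRGB without framebuffer objects would still fail the check, which matches the commit message: what is needed is sRGB attachments to framebuffers.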


[Mesa-dev] [RFC PATCH 2/8] virgl: Set sRGB write control CAP based on host capabilities

2018-11-13 Thread Gert Wollny

From: Gert Wollny 

Signed-off-by: Gert Wollny 
---
 src/gallium/drivers/virgl/virgl_hw.h | 1 +
 src/gallium/drivers/virgl/virgl_screen.c | 2 ++
 2 files changed, 3 insertions(+)

diff --git a/src/gallium/drivers/virgl/virgl_hw.h 
b/src/gallium/drivers/virgl/virgl_hw.h
index e682c750e7..7b4c063f35 100644
--- a/src/gallium/drivers/virgl/virgl_hw.h
+++ b/src/gallium/drivers/virgl/virgl_hw.h
@@ -232,6 +232,7 @@ enum virgl_formats {
 #define VIRGL_CAP_TEXTURE_BARRIER  (1 << 12)
 #define VIRGL_CAP_TGSI_COMPONENTS  (1 << 13)
 #define VIRGL_CAP_GUEST_MAY_INIT_LOG   (1 << 14)
+#define VIRGL_CAP_SRGB_WRITE_CONTROL   (1 << 15)
 
 /* virgl bind flags - these are compatible with mesa 10.5 gallium.
  * but are fixed, no other should be passed to virgl either.
diff --git a/src/gallium/drivers/virgl/virgl_screen.c 
b/src/gallium/drivers/virgl/virgl_screen.c
index e71883b06f..ec486463fe 100644
--- a/src/gallium/drivers/virgl/virgl_screen.c
+++ b/src/gallium/drivers/virgl/virgl_screen.c
@@ -341,6 +341,8 @@ virgl_get_param(struct pipe_screen *screen, enum pipe_cap 
param)
   return 0;
case PIPE_CAP_NATIVE_FENCE_FD:
   return 0;
+   case PIPE_CAP_SRGB_WRITE_CONTROL:
+  return vscreen->caps.caps.v2.capability_bits & 
VIRGL_CAP_SRGB_WRITE_CONTROL;
default:
   return u_pipe_screen_get_param_defaults(screen, param);
}
-- 
2.18.1
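
The new case in virgl_get_param() returns the masked capability word itself, so an enabled cap yields 1 << 15 rather than 1. Callers that only test for non-zero are unaffected. A minimal sketch of the bit test, normalised to 0 or 1, with hypothetical names (CAP_SRGB_WRITE_CONTROL, cap_enabled) that only reuse the bit position from the hunk:

```c
#include <assert.h>
#include <stdint.h>

/* Same bit position as VIRGL_CAP_SRGB_WRITE_CONTROL in the hunk above;
 * the macro and function names here are hypothetical. */
#define CAP_SRGB_WRITE_CONTROL (UINT32_C(1) << 15)

/* Return 1 if the cap bit is set in capability_bits, 0 otherwise. */
static inline int
cap_enabled(uint32_t capability_bits, uint32_t cap)
{
   return (capability_bits & cap) != 0;
}
```

Whether the normalisation is wanted in the driver is a judgment call; the sketch only shows the difference between the masked value and a boolean result.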


[Mesa-dev] [RFC PATCH 7/8] mesa/main: Remove now superfluos tests for both EXT_sRGB and EXT_framebuffer_sRGB

2018-11-13 Thread Gert Wollny

From: Gert Wollny 

Signed-off-by: Gert Wollny 
---
 src/mesa/main/fbobject.c | 2 +-
 src/mesa/main/teximage.c | 3 +--
 src/mesa/main/version.c  | 5 ++---
 3 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/src/mesa/main/fbobject.c b/src/mesa/main/fbobject.c
index ca3f3f7f76..7d45ce43f4 100644
--- a/src/mesa/main/fbobject.c
+++ b/src/mesa/main/fbobject.c
@@ -4253,7 +4253,7 @@ get_framebuffer_attachment_parameter(struct gl_context 
*ctx,
  }
   }
   else {
- if (ctx->Extensions.EXT_framebuffer_sRGB || ctx->Extensions.EXT_sRGB) 
{
+ if (ctx->Extensions.EXT_sRGB) {
 *params =
_mesa_get_format_color_encoding(att->Renderbuffer->Format);
  }
diff --git a/src/mesa/main/teximage.c b/src/mesa/main/teximage.c
index e1d652824e..3c9c8ada99 100644
--- a/src/mesa/main/teximage.c
+++ b/src/mesa/main/teximage.c
@@ -2438,8 +2438,7 @@ copytexture_error_check( struct gl_context *ctx, GLuint 
dimensions,
   bool rb_is_srgb = false;
   bool dst_is_srgb = false;
 
-  if ((ctx->Extensions.EXT_framebuffer_sRGB ||
-   ctx->Extensions.EXT_sRGB) &&
+  if (ctx->Extensions.EXT_sRGB &&
   _mesa_get_format_color_encoding(rb->Format) == GL_SRGB) {
  rb_is_srgb = true;
   }
diff --git a/src/mesa/main/version.c b/src/mesa/main/version.c
index 2f7ac75a81..5709d283f3 100644
--- a/src/mesa/main/version.c
+++ b/src/mesa/main/version.c
@@ -512,9 +512,8 @@ compute_version_es2(const struct gl_extensions *extensions,
  extensions->ARB_texture_float &&
  extensions->ARB_texture_rg &&
  extensions->ARB_depth_buffer_float &&
- (extensions->EXT_framebuffer_sRGB ||
-  (extensions->ARB_framebuffer_object &&
-   extensions->EXT_sRGB)) &&
+ extensions->ARB_framebuffer_object &&
+ extensions->EXT_sRGB &&
  extensions->EXT_packed_float &&
  extensions->EXT_texture_array &&
  extensions->EXT_texture_shared_exponent &&
-- 
2.18.1


[Mesa-dev] [RFC PATCH 8/8] mesa/main: Expose EXT_sRGB_write_control

2018-11-13 Thread Gert Wollny

From: Gert Wollny 

Use EXT_framebuffer_sRGB to expose EXT_sRGB_write_control on GLES. Remove
the checks for desktop GL in the enable calls, since EXT_framebuffer_sRGB
now also indicates support for switching the linear-sRGB color
space conversion on GLES.

Thanks to Ilia Mirkin for all the helpful discussions that helped to rework
this series.

Signed-off-by: Gert Wollny 
---
 src/mesa/main/enable.c   | 4 
 src/mesa/main/extensions_table.h | 1 +
 src/mesa/main/get_hash_params.py | 4 +++-
 3 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/src/mesa/main/enable.c b/src/mesa/main/enable.c
index bd3e493da5..d03ffc9d80 100644
--- a/src/mesa/main/enable.c
+++ b/src/mesa/main/enable.c
@@ -1125,8 +1125,6 @@ _mesa_set_enable(struct gl_context *ctx, GLenum cap, 
GLboolean state)
 
   /* GL3.0 - GL_framebuffer_sRGB */
   case GL_FRAMEBUFFER_SRGB_EXT:
- if (!_mesa_is_desktop_gl(ctx))
-goto invalid_enum_error;
  CHECK_EXTENSION(EXT_framebuffer_sRGB, cap);
  _mesa_set_framebuffer_srgb(ctx, state);
  return;
@@ -1765,8 +1763,6 @@ _mesa_IsEnabled( GLenum cap )
 
   /* GL3.0 - GL_framebuffer_sRGB */
   case GL_FRAMEBUFFER_SRGB_EXT:
- if (!_mesa_is_desktop_gl(ctx))
-goto invalid_enum_error;
  CHECK_EXTENSION(EXT_framebuffer_sRGB);
  return ctx->Color.sRGBEnabled;
 
diff --git a/src/mesa/main/extensions_table.h b/src/mesa/main/extensions_table.h
index a516a1b17f..ea9f54ecdc 100644
--- a/src/mesa/main/extensions_table.h
+++ b/src/mesa/main/extensions_table.h
@@ -266,6 +266,7 @@ EXT(EXT_shader_integer_mix  , 
EXT_shader_integer_mix
 EXT(EXT_shader_io_blocks, dummy_true   
  ,  x ,  x ,  x ,  31, 2014)
 EXT(EXT_shader_samples_identical, EXT_shader_samples_identical 
  , GLL, GLC,  x ,  31, 2015)
 EXT(EXT_shadow_funcs, ARB_shadow   
  , GLL,  x ,  x ,  x , 2002)
+EXT(EXT_sRGB_write_control  , EXT_framebuffer_sRGB 
  ,   x,  x ,  x ,  30, 2013)
 EXT(EXT_stencil_two_side, EXT_stencil_two_side 
  , GLL,  x ,  x ,  x , 2001)
 EXT(EXT_stencil_wrap, dummy_true   
  , GLL,  x ,  x ,  x , 2002)
 EXT(EXT_subtexture  , dummy_true   
  , GLL,  x ,  x ,  x , 1995)
diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/get_hash_params.py
index 1840db6ebb..8de634e90a 100644
--- a/src/mesa/main/get_hash_params.py
+++ b/src/mesa/main/get_hash_params.py
@@ -463,6 +463,9 @@ descriptor=[
   [ "MIN_FRAGMENT_INTERPOLATION_OFFSET", 
"CONTEXT_FLOAT(Const.MinFragmentInterpolationOffset), 
extra_ARB_gpu_shader5_or_OES_sample_variables" ],
   [ "MAX_FRAGMENT_INTERPOLATION_OFFSET", 
"CONTEXT_FLOAT(Const.MaxFragmentInterpolationOffset), 
extra_ARB_gpu_shader5_or_OES_sample_variables" ],
   [ "FRAGMENT_INTERPOLATION_OFFSET_BITS", 
"CONST(FRAGMENT_INTERPOLATION_OFFSET_BITS), 
extra_ARB_gpu_shader5_or_OES_sample_variables" ],
+
+# GL_EXT_framebuffer_sRGB / GLES 3.0 + EXT_sRGB_write_control
+  [ "FRAMEBUFFER_SRGB_EXT", "CONTEXT_BOOL(Color.sRGBEnabled), 
extra_EXT_framebuffer_sRGB" ],
 ]},
 
 { "apis": ["GLES", "GLES2"], "params": [
@@ -934,7 +937,6 @@ descriptor=[
   [ "RGBA_FLOAT_MODE_ARB", "BUFFER_FIELD(Visual.floatMode, TYPE_BOOLEAN), 
extra_core_ARB_color_buffer_float_and_new_buffers" ],
 
 # GL3.0 / GL_EXT_framebuffer_sRGB
-  [ "FRAMEBUFFER_SRGB_EXT", "CONTEXT_BOOL(Color.sRGBEnabled), 
extra_EXT_framebuffer_sRGB" ],
   [ "FRAMEBUFFER_SRGB_CAPABLE_EXT", "BUFFER_INT(Visual.sRGBCapable), 
extra_EXT_framebuffer_sRGB_and_new_buffers" ],
 
 # GL 3.1
-- 
2.18.1

[Mesa-dev] [RFC PATCH 6/8] i965: Set flag for EXT_sRGB

2018-11-13 Thread Gert Wollny

From: Gert Wollny 

Signed-off-by: Gert Wollny 
---
 src/mesa/drivers/dri/i965/intel_extensions.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c 
b/src/mesa/drivers/dri/i965/intel_extensions.c
index d7e02efb54..ca369e39f2 100644
--- a/src/mesa/drivers/dri/i965/intel_extensions.c
+++ b/src/mesa/drivers/dri/i965/intel_extensions.c
@@ -104,6 +104,7 @@ intelInitExtensions(struct gl_context *ctx)
ctx->Extensions.EXT_point_parameters = true;
ctx->Extensions.EXT_provoking_vertex = true;
ctx->Extensions.EXT_render_snorm = true;
+   ctx->Extensions.EXT_sRGB = true;
ctx->Extensions.EXT_stencil_two_side = true;
ctx->Extensions.EXT_texture_array = true;
ctx->Extensions.EXT_texture_env_dot3 = true;
-- 
2.18.1

[Mesa-dev] [RFC PATCH 5/8] mesa/st: rework support for sRGB framebuffer attachements

2018-11-13 Thread Gert Wollny

From: Gert Wollny 

For GLES, sRGB framebuffer attachment support is provided in two steps:
sRGB attachments as described in EXT_sRGB and GLES 3.0, which enable the
linear-to-sRGB color space transformation automatically, and sRGB write
control, which brings GLES on par with EXT_framebuffer_sRGB. Set the
corresponding flags to reflect these two parts.

Desktop GL and GLES differ in one respect: for an sRGB framebuffer
attachment, the linear-to-sRGB conversion is turned off by default on
desktop GL and turned on by default on GLES. This needs to be taken into
account when creating framebuffer attachments.
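
As a minimal sketch of that default, assuming nothing about the actual
state tracker code (enum api_sketch and both helpers are hypothetical
names used only for illustration):

```c
#include <assert.h>
#include <stdbool.h>

enum api_sketch { API_DESKTOP_GL, API_GLES };

/* Initial state of the linear-to-sRGB conversion switch for a context:
 * off on desktop GL, on for GLES. */
static bool
default_srgb_conversion(enum api_sketch api)
{
   assert(api == API_DESKTOP_GL || api == API_GLES);
   return api == API_GLES;
}

/* The conversion only takes effect if the attachment itself is sRGB. */
static bool
srgb_conversion_applied(bool attachment_is_srgb, bool conversion_enabled)
{
   return attachment_is_srgb && conversion_enabled;
}
```

So an sRGB attachment created for a fresh GLES context is written with
conversion, while the same attachment on desktop GL is written linearly
until the application enables the switch.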

v2: - always enable the extension when sRGB is supported (Ilia Mirkin).
- Correct handling by moving extension initialization to the
  place where gallium/st actually takes care of this. This also
  fixes properly disabling the extension via MESA_EXTENSION_OVERRIDE
- reinstate check for desktop GL and add check for the extension
  when creating the framebuffer

v3: - Only create sRGB renderbuffers based on Visual.srgbCapable when
  on desktop GL.

v4: - Use PIPE_FORMAT_B8G8R8A8_SRGB to check for the capability, since this
  is also the format that is used to check for EGL_KHR_gl_colorspace
  support.  virgl on a GLES host usually doesn't provide this format but
  one can make it available to signal that the host supports this
  extension.

v5: - drop check for PIPE_FORMAT_B8G8R8A8_SRGB in favour of using the new 
  PIPE_CAP_SRGB_WRITE_CONTROL cap flag.
- enable EXT_sRGB based on the sRGB formats supported and 
  EXT_framebuffer_sRGB by checking for PIPE_CAP_SRGB_WRITE_CONTROL.
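
The v5 flag derivation can be sketched as follows. The struct and function
names are hypothetical, and the sketch assumes that one supported sRGB
format is enough to enable EXT_sRGB, which the quoted hunk does not show:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct ext_sketch {
   bool EXT_sRGB;               /* sRGB attachments are supported */
   bool EXT_framebuffer_sRGB;   /* ... and the conversion can be switched */
};

/* srgb_format_supported[i] says whether the i-th candidate sRGB format
 * is renderable; write_control_cap models PIPE_CAP_SRGB_WRITE_CONTROL. */
static struct ext_sketch
derive_srgb_flags(const bool *srgb_format_supported, size_t count,
                  bool write_control_cap)
{
   struct ext_sketch ext = { false, false };

   assert(srgb_format_supported || count == 0);
   for (size_t i = 0; i < count; i++) {
      if (srgb_format_supported[i])
         ext.EXT_sRGB = true;
   }
   /* Write control is only meaningful on top of sRGB attachments. */
   ext.EXT_framebuffer_sRGB = write_control_cap && ext.EXT_sRGB;
   return ext;
}
```

A driver that reports the cap but no sRGB format ends up with both flags
cleared, matching the "&& extensions->EXT_sRGB" in the st_extensions.c hunk.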

Signed-off-by: Gert Wollny 
---
 src/mesa/state_tracker/st_cb_fbo.c |  4 +--
 src/mesa/state_tracker/st_extensions.c |  6 -
 src/mesa/state_tracker/st_format.c |  2 +-
 src/mesa/state_tracker/st_manager.c| 37 --
 4 files changed, 31 insertions(+), 18 deletions(-)

diff --git a/src/mesa/state_tracker/st_cb_fbo.c 
b/src/mesa/state_tracker/st_cb_fbo.c
index 0e535257cb..49a989f126 100644
--- a/src/mesa/state_tracker/st_cb_fbo.c
+++ b/src/mesa/state_tracker/st_cb_fbo.c
@@ -139,7 +139,7 @@ st_renderbuffer_alloc_storage(struct gl_context * ctx,
/* If an sRGB framebuffer is unsupported, sRGB formats behave like linear
 * formats.
 */
-   if (!ctx->Extensions.EXT_framebuffer_sRGB) {
+   if (!ctx->Extensions.EXT_sRGB) {
   internalFormat = _mesa_get_linear_internalformat(internalFormat);
}
 
@@ -656,7 +656,7 @@ st_validate_attachment(struct gl_context *ctx,
/* If the encoding is sRGB and sRGB rendering cannot be enabled,
 * check for linear format support instead.
 * Later when we create a surface, we change the format to a linear one. */
-   if (!ctx->Extensions.EXT_framebuffer_sRGB &&
+   if (!ctx->Extensions.EXT_sRGB &&
_mesa_get_format_color_encoding(texFormat) == GL_SRGB) {
   const mesa_format linearFormat = _mesa_get_srgb_format_linear(texFormat);
   format = st_mesa_format_to_pipe_format(st_context(ctx), linearFormat);
diff --git a/src/mesa/state_tracker/st_extensions.c 
b/src/mesa/state_tracker/st_extensions.c
index 16889074f6..9e63e7b74c 100644
--- a/src/mesa/state_tracker/st_extensions.c
+++ b/src/mesa/state_tracker/st_extensions.c
@@ -786,7 +786,7 @@ void st_init_extensions(struct pipe_screen *screen,
   PIPE_FORMAT_B10G10R10A2_UINT },
  GL_TRUE }, /* at least one format must be supported */
 
-  { { o(EXT_framebuffer_sRGB) },
+  { { o(EXT_sRGB) },
 { PIPE_FORMAT_A8B8G8R8_SRGB,
   PIPE_FORMAT_B8G8R8A8_SRGB,
   PIPE_FORMAT_R8G8B8A8_SRGB },
@@ -1316,6 +1316,10 @@ void st_init_extensions(struct pipe_screen *screen,
   extensions->ARB_texture_buffer_object_rgb32 &&
   extensions->ARB_shader_image_load_store;
 
+   extensions->EXT_framebuffer_sRGB =
+ screen->get_param(screen, PIPE_CAP_SRGB_WRITE_CONTROL) &&
+ extensions->EXT_sRGB;
+
/* Unpacking a varying in the fragment shader costs 1 texture indirection.
 * If the number of available texture indirections is very limited, then we
 * prefer to disable varying packing rather than run the risk of varying
diff --git a/src/mesa/state_tracker/st_format.c 
b/src/mesa/state_tracker/st_format.c
index caddd76c5d..aacb878828 100644
--- a/src/mesa/state_tracker/st_format.c
+++ b/src/mesa/state_tracker/st_format.c
@@ -2457,7 +2457,7 @@ st_QuerySamplesForFormat(struct gl_context *ctx, GLenum 
target,
/* If an sRGB framebuffer is unsupported, sRGB formats behave like linear
 * formats.
 */
-   if (!ctx->Extensions.EXT_framebuffer_sRGB) {
+   if (!ctx->Extensions.EXT_sRGB) {
   internalFormat = _mesa_get_linear_internalformat(internalFormat);
}
 
diff --git a/src/mesa/state_tracker/st_manager.c 
b/src/mesa/state_tracker/st_manager.c
index 076ad42646..25e2dcad4c 100644
--- a/src/mesa/state_tracker/st_manager.c
+++ b/src/mesa/state_tracker/st_manager.c
@@ -295,7 +295,7 @@

[Mesa-dev] [RFC PATCH 3/8] mesa/main: Add flag for EXT_sRGB and use it parallel with EXT_framebuffer_sRGB

2018-11-13 Thread Gert Wollny

From: Gert Wollny 

EXT_sRGB is an (incomplete) GLES extension that provides support for sRGB
framebuffer attachments, hence it can be used to check for this support
as an alternative to EXT_framebuffer_sRGB, which provides the same
functionality plus sRGB write control.

All drivers that support EXT_framebuffer_sRGB also support EXT_sRGB, but
in order to keep this commit minimal and not break any drivers, both
flags are checked.

Since EXT_sRGB is incomplete and superseded by GLES 3.0, it will not be
exposed as an extension.

Signed-off-by: Gert Wollny 
---
 src/mesa/main/fbobject.c| 2 +-
 src/mesa/main/formatquery.c | 3 ++-
 src/mesa/main/framebuffer.c | 3 ++-
 src/mesa/main/mtypes.h  | 1 +
 src/mesa/main/teximage.c| 3 ++-
 5 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/src/mesa/main/fbobject.c b/src/mesa/main/fbobject.c
index 68e0daf342..ca3f3f7f76 100644
--- a/src/mesa/main/fbobject.c
+++ b/src/mesa/main/fbobject.c
@@ -4253,7 +4253,7 @@ get_framebuffer_attachment_parameter(struct gl_context 
*ctx,
  }
   }
   else {
- if (ctx->Extensions.EXT_framebuffer_sRGB) {
+ if (ctx->Extensions.EXT_framebuffer_sRGB || ctx->Extensions.EXT_sRGB) 
{
 *params =
_mesa_get_format_color_encoding(att->Renderbuffer->Format);
  }
diff --git a/src/mesa/main/formatquery.c b/src/mesa/main/formatquery.c
index 84b5f512ba..1d43c1e860 100644
--- a/src/mesa/main/formatquery.c
+++ b/src/mesa/main/formatquery.c
@@ -1241,7 +1241,8 @@ _mesa_GetInternalformativ(GLenum target, GLenum 
internalformat, GLenum pname,
   break;
 
case GL_SRGB_WRITE:
-  if (!_mesa_has_EXT_framebuffer_sRGB(ctx) ||
+  if ((!_mesa_has_EXT_framebuffer_sRGB(ctx) &&
+   !ctx->Extensions.EXT_sRGB) ||
   !_mesa_is_color_format(internalformat)) {
  goto end;
   }
diff --git a/src/mesa/main/framebuffer.c b/src/mesa/main/framebuffer.c
index 10dd2fde44..90314ee1bd 100644
--- a/src/mesa/main/framebuffer.c
+++ b/src/mesa/main/framebuffer.c
@@ -459,7 +459,8 @@ _mesa_update_framebuffer_visual(struct gl_context *ctx,
 fb->Visual.rgbBits = fb->Visual.redBits
+ fb->Visual.greenBits + fb->Visual.blueBits;
 if (_mesa_get_format_color_encoding(fmt) == GL_SRGB)
-fb->Visual.sRGBCapable = ctx->Extensions.EXT_framebuffer_sRGB;
+fb->Visual.sRGBCapable = ctx->Extensions.EXT_framebuffer_sRGB 
||
+ ctx->Extensions.EXT_sRGB;
 break;
  }
   }
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 656e1226f9..4ee55266e5 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -4253,6 +4253,7 @@ struct gl_extensions
GLboolean EXT_semaphore_fd;
GLboolean EXT_shader_integer_mix;
GLboolean EXT_shader_samples_identical;
+   GLboolean EXT_sRGB;
GLboolean EXT_stencil_two_side;
GLboolean EXT_texture_array;
GLboolean EXT_texture_compression_latc;
diff --git a/src/mesa/main/teximage.c b/src/mesa/main/teximage.c
index 6805b47c72..e1d652824e 100644
--- a/src/mesa/main/teximage.c
+++ b/src/mesa/main/teximage.c
@@ -2438,7 +2438,8 @@ copytexture_error_check( struct gl_context *ctx, GLuint 
dimensions,
   bool rb_is_srgb = false;
   bool dst_is_srgb = false;
 
-  if (ctx->Extensions.EXT_framebuffer_sRGB &&
+  if ((ctx->Extensions.EXT_framebuffer_sRGB ||
+   ctx->Extensions.EXT_sRGB) &&
   _mesa_get_format_color_encoding(rb->Format) == GL_SRGB) {
  rb_is_srgb = true;
   }
-- 
2.18.1

[Mesa-dev] [RFC PATCH 0/8] Add and enable extension EXT_sRGB_write_control (reworked)

2018-11-13 Thread Gert Wollny

From: Gert Wollny 

Dear all,

based on the feedback given by Ilia I've completely reworked the series to add
internal support for EXT_sRGB as a stepping stone to implementing
EXT_sRGB_write_control and exposing GLES 3.0 properly.

Since the series has been reworked thoroughly, most of the original patches
have changed completely, so carrying a history didn't make much sense for
most patches.

I'd like to thank Ilia for all his comments on the first series, which helped
me a lot to rework the series.

Thanks for any comments,
Gert


Gert Wollny (8):
  Gallium: Add new CAPS to indicate whether a driver can switch SRGB
write
  virgl: Set sRGB write control CAP based on host capabilities
  mesa/main: Add flag for EXT_sRGB and use it parallel with
EXT_framebuffer_sRGB
  mesa/main/version: Lower the requirements for GLES 3.0
  mesa/st: rework support for sRGB framebuffer attachements
  i965: Set flag for EXT_sRGB
  mesa/main: Remove now superfluos tests for both EXT_sRGB and
EXT_framebuffer_sRGB
  mesa/main: Expose EXT_sRGB_write_control

 src/gallium/auxiliary/util/u_screen.c|  3 ++
 src/gallium/docs/source/screen.rst   |  3 ++
 src/gallium/drivers/virgl/virgl_hw.h |  1 +
 src/gallium/drivers/virgl/virgl_screen.c |  2 ++
 src/gallium/include/pipe/p_defines.h |  1 +
 src/mesa/drivers/dri/i965/intel_extensions.c |  1 +
 src/mesa/main/enable.c   |  4 ---
 src/mesa/main/extensions_table.h |  1 +
 src/mesa/main/fbobject.c |  2 +-
 src/mesa/main/formatquery.c  |  3 +-
 src/mesa/main/framebuffer.c  |  3 +-
 src/mesa/main/get_hash_params.py |  4 ++-
 src/mesa/main/mtypes.h   |  1 +
 src/mesa/main/teximage.c |  2 +-
 src/mesa/main/version.c  |  4 +--
 src/mesa/state_tracker/st_cb_fbo.c   |  4 +--
 src/mesa/state_tracker/st_extensions.c   |  6 +++-
 src/mesa/state_tracker/st_format.c   |  2 +-
 src/mesa/state_tracker/st_manager.c  | 37 
 19 files changed, 55 insertions(+), 29 deletions(-)

-- 
2.18.1

[Mesa-dev] [RFC PATCH 1/8] Gallium: Add new CAPS to indicate whether a driver can switch SRGB write

2018-11-13 Thread Gert Wollny

From: Gert Wollny 

Add a new cap that indicates whether the driver supports
enabling/disabling the conversion from linear space to sRGB
for a framebuffer attachment.
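
How such a cap is typically consumed can be sketched like this. All names
are hypothetical; the only facts taken from this series are that the
default for the new cap is 1 and that a driver running on a GLES host
(patch 2/8, virgl) reports it based on host capabilities:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical miniature of the cap enum. */
enum cap_sketch { CAP_SKETCH_SRGB_WRITE_CONTROL, CAP_SKETCH_COUNT };

/* Shared defaults: drivers are assumed able to switch the conversion. */
static int
get_param_default_sketch(enum cap_sketch cap)
{
   switch (cap) {
   case CAP_SKETCH_SRGB_WRITE_CONTROL:
      return 1;
   default:
      assert(!"bad cap");
      return 0;
   }
}

/* A driver that forwards to a host API answers from what the host offers
 * and falls back to the shared defaults for everything else. */
static int
hosted_get_param_sketch(enum cap_sketch cap, bool host_can_switch)
{
   if (cap == CAP_SKETCH_SRGB_WRITE_CONTROL)
      return host_can_switch ? 1 : 0;
   return get_param_default_sketch(cap);
}
```

Defaulting to 1 keeps existing drivers unchanged; only drivers that
cannot guarantee the switch need to override the value.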

Signed-off-by: Gert Wollny 
---
 src/gallium/auxiliary/util/u_screen.c | 3 +++
 src/gallium/docs/source/screen.rst| 3 +++
 src/gallium/include/pipe/p_defines.h  | 1 +
 3 files changed, 7 insertions(+)

diff --git a/src/gallium/auxiliary/util/u_screen.c 
b/src/gallium/auxiliary/util/u_screen.c
index 73dbbee94a..1d9f367501 100644
--- a/src/gallium/auxiliary/util/u_screen.c
+++ b/src/gallium/auxiliary/util/u_screen.c
@@ -326,6 +326,9 @@ u_pipe_screen_get_param_defaults(struct pipe_screen 
*pscreen,
case PIPE_CAP_MAX_VERTEX_ELEMENT_SRC_OFFSET:
   return 2047;
 
+   case PIPE_CAP_SRGB_WRITE_CONTROL:
+  return 1;
+
default:
   unreachable("bad PIPE_CAP_*");
}
diff --git a/src/gallium/docs/source/screen.rst 
b/src/gallium/docs/source/screen.rst
index 0abd164494..da677eb04b 100644
--- a/src/gallium/docs/source/screen.rst
+++ b/src/gallium/docs/source/screen.rst
@@ -477,6 +477,9 @@ subpixel precision bias in bits during conservative 
rasterization.
   0 means no limit.
 * ``PIPE_CAP_MAX_VERTEX_ELEMENT_SRC_OFFSET``: The maximum supported value for
   of pipe_vertex_element::src_offset.
+* ``PIPE_CAP_SRGB_WRITE_CONTROL``: Indicates whether the drivers on GLES 
supports
+  enabling/disabling the conversion from linear space to sRGB at framebuffer or
+  blend time.
 
 .. _pipe_capf:
 
diff --git a/src/gallium/include/pipe/p_defines.h 
b/src/gallium/include/pipe/p_defines.h
index 693f041b1d..7838b18be8 100644
--- a/src/gallium/include/pipe/p_defines.h
+++ b/src/gallium/include/pipe/p_defines.h
@@ -826,6 +826,7 @@ enum pipe_cap
PIPE_CAP_MAX_COMBINED_HW_ATOMIC_COUNTER_BUFFERS,
PIPE_CAP_MAX_TEXTURE_UPLOAD_MEMORY_BUDGET,
PIPE_CAP_MAX_VERTEX_ELEMENT_SRC_OFFSET,
+   PIPE_CAP_SRGB_WRITE_CONTROL,
 };
 
 /**
-- 
2.18.1

Re: [Mesa-dev] [RFC PATCH 4/8] mesa/main/version: Lower the requirements for GLES 3.0

2018-11-13 Thread Ilia Mirkin

Is ARB_framebuffer_object really needed? IIRC one of the sticking
points is that it allows differently-sized render targets. Does ES3
allow that? If so, this is fine.

On Tue, Nov 13, 2018 at 12:28 PM Gert Wollny  wrote:
>
> From: Gert Wollny 
>
> GLES 3.0 does not actually require support for EXT_framebuffer_sRGB, it
> only needs support for sRGB attachments to framebuffers.
>
> Signed-off-by: Gert Wollny 
> ---
>  src/mesa/main/version.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/src/mesa/main/version.c b/src/mesa/main/version.c
> index 610ba2f08c..2f7ac75a81 100644
> --- a/src/mesa/main/version.c
> +++ b/src/mesa/main/version.c
> @@ -512,8 +512,9 @@ compute_version_es2(const struct gl_extensions 
> *extensions,
>   extensions->ARB_texture_float &&
>   extensions->ARB_texture_rg &&
>   extensions->ARB_depth_buffer_float &&
> - /* extensions->ARB_framebuffer_object && */
> - extensions->EXT_framebuffer_sRGB &&
> + (extensions->EXT_framebuffer_sRGB ||
> +  (extensions->ARB_framebuffer_object &&
> +   extensions->EXT_sRGB)) &&
>   extensions->EXT_packed_float &&
>   extensions->EXT_texture_array &&
>   extensions->EXT_texture_shared_exponent &&
> --
> 2.18.1
>

Re: [Mesa-dev] [RFC PATCH 7/8] mesa/main: Remove now superfluos tests for both EXT_sRGB and EXT_framebuffer_sRGB

2018-11-13 Thread Ilia Mirkin

Why not order the series such that this commit is not needed?
On Tue, Nov 13, 2018 at 12:28 PM Gert Wollny  wrote:
>
> From: Gert Wollny 
>
> Signed-off-by: Gert Wollny 
> ---
>  src/mesa/main/fbobject.c | 2 +-
>  src/mesa/main/teximage.c | 3 +--
>  src/mesa/main/version.c  | 5 ++---
>  3 files changed, 4 insertions(+), 6 deletions(-)
>
> diff --git a/src/mesa/main/fbobject.c b/src/mesa/main/fbobject.c
> index ca3f3f7f76..7d45ce43f4 100644
> --- a/src/mesa/main/fbobject.c
> +++ b/src/mesa/main/fbobject.c
> @@ -4253,7 +4253,7 @@ get_framebuffer_attachment_parameter(struct gl_context 
> *ctx,
>   }
>}
>else {
> - if (ctx->Extensions.EXT_framebuffer_sRGB || 
> ctx->Extensions.EXT_sRGB) {
> + if (ctx->Extensions.EXT_sRGB) {
>  *params =
> _mesa_get_format_color_encoding(att->Renderbuffer->Format);
>   }
> diff --git a/src/mesa/main/teximage.c b/src/mesa/main/teximage.c
> index e1d652824e..3c9c8ada99 100644
> --- a/src/mesa/main/teximage.c
> +++ b/src/mesa/main/teximage.c
> @@ -2438,8 +2438,7 @@ copytexture_error_check( struct gl_context *ctx, GLuint 
> dimensions,
>bool rb_is_srgb = false;
>bool dst_is_srgb = false;
>
> -  if ((ctx->Extensions.EXT_framebuffer_sRGB ||
> -   ctx->Extensions.EXT_sRGB) &&
> +  if (ctx->Extensions.EXT_sRGB &&
>_mesa_get_format_color_encoding(rb->Format) == GL_SRGB) {
>   rb_is_srgb = true;
>}
> diff --git a/src/mesa/main/version.c b/src/mesa/main/version.c
> index 2f7ac75a81..5709d283f3 100644
> --- a/src/mesa/main/version.c
> +++ b/src/mesa/main/version.c
> @@ -512,9 +512,8 @@ compute_version_es2(const struct gl_extensions 
> *extensions,
>   extensions->ARB_texture_float &&
>   extensions->ARB_texture_rg &&
>   extensions->ARB_depth_buffer_float &&
> - (extensions->EXT_framebuffer_sRGB ||
> -  (extensions->ARB_framebuffer_object &&
> -   extensions->EXT_sRGB)) &&
> + extensions->ARB_framebuffer_object &&
> + extensions->EXT_sRGB &&
>   extensions->EXT_packed_float &&
>   extensions->EXT_texture_array &&
>   extensions->EXT_texture_shared_exponent &&
> --
> 2.18.1
>

[Mesa-dev] [PATCH 1/3] st/xa: Fix transformations when we have both source and mask samplers

2018-11-13 Thread Thomas Hellstrom

In the case when we had both source and mask samplers, transformations were
typically not applied correctly.
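
The underlying issue can be illustrated with a small sketch: the old
two-sampler path transformed only two opposite corners of the rectangle,
which is only correct while the transform keeps the quad axis-aligned.
The code below is not the xa implementation; it assumes a 3x3 matrix
stored column by column (so that mat[2], mat[5], mat[8] form the
projective row, as is_affine() suggests), and all names are made up.

```c
#include <assert.h>
#include <stddef.h>

/* Map (x, y) through an assumed column-major 3x3 matrix; NULL means
 * identity. */
static void
map_point_sketch(const float *mat, float x, float y,
                 float *out_x, float *out_y)
{
   assert(out_x && out_y);
   if (!mat) {
      *out_x = x;
      *out_y = y;
      return;
   }
   float w = mat[2] * x + mat[5] * y + mat[8];
   *out_x = (mat[0] * x + mat[3] * y + mat[6]) / w;
   *out_y = (mat[1] * x + mat[4] * y + mat[7]) / w;
}

/* Transform all four corners, then normalize by the texture size. */
static void
quad_tex_coords_sketch(float sx, float sy, float width, float height,
                       const float *mat, float tex_w, float tex_h,
                       float tc[4][2])
{
   const float corners[4][2] = {
      { sx, sy },
      { sx + width, sy },
      { sx + width, sy + height },
      { sx, sy + height },
   };

   for (int i = 0; i < 4; i++) {
      map_point_sketch(mat, corners[i][0], corners[i][1],
                       &tc[i][0], &tc[i][1]);
      tc[i][0] /= tex_w;
      tc[i][1] /= tex_h;
   }
}
```

With a 90 degree rotation, the second corner of a 2x1 rectangle at the
origin maps to (0, 2), whereas combining the two transformed opposite
corners would give (-1, 0) for it.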

Signed-off-by: Thomas Hellstrom 
Reviewed-by: Brian Paul 
---
 src/gallium/state_trackers/xa/xa_renderer.c | 117 
 1 file changed, 49 insertions(+), 68 deletions(-)

diff --git a/src/gallium/state_trackers/xa/xa_renderer.c 
b/src/gallium/state_trackers/xa/xa_renderer.c
index 0cb75a8c968..ac26c5508cf 100644
--- a/src/gallium/state_trackers/xa/xa_renderer.c
+++ b/src/gallium/state_trackers/xa/xa_renderer.c
@@ -192,47 +192,55 @@ add_vertex_2tex(struct xa_context *r,
 }
 
 static void
-add_vertex_data1(struct xa_context *r,
- float srcX, float srcY,  float dstX, float dstY,
- float width, float height,
- struct pipe_resource *src, const float *src_matrix)
+compute_src_coords(float sx, float sy, struct pipe_resource *src,
+   const float *src_matrix,
+   float width, float height,
+   float tc0[2], float tc1[2], float tc2[2], float tc3[2])
 {
-float s0, t0, s1, t1, s2, t2, s3, t3;
-float pt0[2], pt1[2], pt2[2], pt3[2];
-
-pt0[0] = srcX;
-pt0[1] = srcY;
-pt1[0] = (srcX + width);
-pt1[1] = srcY;
-pt2[0] = (srcX + width);
-pt2[1] = (srcY + height);
-pt3[0] = srcX;
-pt3[1] = (srcY + height);
+tc0[0] = sx;
+tc0[1] = sy;
+tc1[0] = (sx + width);
+tc1[1] = sy;
+tc2[0] = (sx + width);
+tc2[1] = (sy + height);
+tc3[0] = sx;
+tc3[1] = (sy + height);
 
 if (src_matrix) {
-   map_point((float *)src_matrix, pt0[0], pt0[1], &pt0[0], &pt0[1]);
-   map_point((float *)src_matrix, pt1[0], pt1[1], &pt1[0], &pt1[1]);
-   map_point((float *)src_matrix, pt2[0], pt2[1], &pt2[0], &pt2[1]);
-   map_point((float *)src_matrix, pt3[0], pt3[1], &pt3[0], &pt3[1]);
+   map_point((float *)src_matrix, tc0[0], tc0[1], &tc0[0], &tc0[1]);
+   map_point((float *)src_matrix, tc1[0], tc1[1], &tc1[0], &tc1[1]);
+   map_point((float *)src_matrix, tc2[0], tc2[1], &tc2[0], &tc2[1]);
+   map_point((float *)src_matrix, tc3[0], tc3[1], &tc3[0], &tc3[1]);
 }
 
-s0 =  pt0[0] / src->width0;
-s1 =  pt1[0] / src->width0;
-s2 =  pt2[0] / src->width0;
-s3 =  pt3[0] / src->width0;
-t0 =  pt0[1] / src->height0;
-t1 =  pt1[1] / src->height0;
-t2 =  pt2[1] / src->height0;
-t3 =  pt3[1] / src->height0;
+tc0[0] /= src->width0;
+tc1[0] /= src->width0;
+tc2[0] /= src->width0;
+tc3[0] /= src->width0;
+tc0[1] /= src->height0;
+tc1[1] /= src->height0;
+tc2[1] /= src->height0;
+tc3[1] /= src->height0;
+}
 
+static void
+add_vertex_data1(struct xa_context *r,
+ float srcX, float srcY,  float dstX, float dstY,
+ float width, float height,
+ struct pipe_resource *src, const float *src_matrix)
+{
+float tc0[2], tc1[2], tc2[2], tc3[2];
+
+compute_src_coords(srcX, srcY, src, src_matrix, width, height,
+   tc0, tc1, tc2, tc3);
 /* 1st vertex */
-add_vertex_1tex(r, dstX, dstY, s0, t0);
+add_vertex_1tex(r, dstX, dstY, tc0[0], tc0[1]);
 /* 2nd vertex */
-add_vertex_1tex(r, dstX + width, dstY, s1, t1);
+add_vertex_1tex(r, dstX + width, dstY, tc1[0], tc1[1]);
 /* 3rd vertex */
-add_vertex_1tex(r, dstX + width, dstY + height, s2, t2);
+add_vertex_1tex(r, dstX + width, dstY + height, tc2[0], tc2[1]);
 /* 4th vertex */
-add_vertex_1tex(r, dstX, dstY + height, s3, t3);
+add_vertex_1tex(r, dstX, dstY + height, tc3[0], tc3[1]);
 }
 
 static void
@@ -243,53 +251,26 @@ add_vertex_data2(struct xa_context *r,
  struct pipe_resource *mask,
  const float *src_matrix, const float *mask_matrix)
 {
-float src_s0, src_t0, src_s1, src_t1;
-float mask_s0, mask_t0, mask_s1, mask_t1;
-float spt0[2], spt1[2];
-float mpt0[2], mpt1[2];
-
-spt0[0] = srcX;
-spt0[1] = srcY;
-spt1[0] = srcX + width;
-spt1[1] = srcY + height;
-
-mpt0[0] = maskX;
-mpt0[1] = maskY;
-mpt1[0] = maskX + width;
-mpt1[1] = maskY + height;
-
-if (src_matrix) {
-   map_point((float *)src_matrix, spt0[0], spt0[1], &spt0[0], &spt0[1]);
-   map_point((float *)src_matrix, spt1[0], spt1[1], &spt1[0], &spt1[1]);
-}
-
-if (mask_matrix) {
-   map_point((float *)mask_matrix, mpt0[0], mpt0[1], &mpt0[0], &mpt0[1]);
-   map_point((float *)mask_matrix, mpt1[0], mpt1[1], &mpt1[0], &mpt1[1]);
-}
-
-src_s0 = spt0[0] / src->width0;
-src_t0 = spt0[1] / src->height0;
-src_s1 = spt1[0] / src->width0;
-src_t1 = spt1[1] / src->height0;
+float spt0[2], spt1[2], spt2[2], spt3[2];
+float mpt0[2], mpt1[2], mpt2[2], mpt3[2];
 
-mask_s0 = mpt0[0] / mask->width0;
-mask_t0 = mpt0[1] / mask->height0;
-mask_s1 = mpt1[0] / mask->width0;
-mask_t1 = mpt1[1] / mask->height0;
+compute_src_coords(srcX, srcY, src, src_matrix, width, height,
+   spt0, spt1, spt2, spt3);
+

[Mesa-dev] [PATCH 3/3] st/xa: Support Component Alpha with trivial blending

2018-11-13 Thread Thomas Hellstrom

Support Component Alpha for those composite operations that do not require
per-channel alpha blending.
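
The difference between a plain mask and a component-alpha mask in the
"source IN mask" step can be sketched on the CPU as below. This is only
an illustration with hypothetical names; it covers the non-luminance
mask case and leaves out the blend stage entirely. Operations whose
blend factors read source alpha are still rejected by this patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Colors are RGBA with alpha in index 3.  Without component alpha every
 * channel of src is scaled by the mask's alpha; with component alpha
 * each channel is scaled by the matching mask channel. */
static void
src_in_mask_sketch(const float src[4], const float mask[4],
                   bool component_alpha, float dst[4])
{
   assert(src && mask && dst);
   for (int i = 0; i < 4; i++)
      dst[i] = src[i] * (component_alpha ? mask[i] : mask[3]);
}
```

For a white source and a mask of (0.5, 0.25, 0, 1), the plain path leaves
the source unchanged, while the component-alpha path yields the mask
itself, which is what subpixel text rendering relies on.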

Signed-off-by: Thomas Hellstrom 
Reviewed-by: Brian Paul 
---
 src/gallium/state_trackers/xa/xa_composite.c | 33 
 src/gallium/state_trackers/xa/xa_priv.h  |  1 +
 src/gallium/state_trackers/xa/xa_tgsi.c  | 18 ---
 3 files changed, 35 insertions(+), 17 deletions(-)

diff --git a/src/gallium/state_trackers/xa/xa_composite.c 
b/src/gallium/state_trackers/xa/xa_composite.c
index b0746327522..34d78027e27 100644
--- a/src/gallium/state_trackers/xa/xa_composite.c
+++ b/src/gallium/state_trackers/xa/xa_composite.c
@@ -111,12 +111,6 @@ blend_for_op(struct xa_composite_blend *blend,
 int i;
 boolean supported = FALSE;
 
-/*
- * No component alpha yet.
- */
-if (mask_pic && mask_pic->component_alpha)
-   return FALSE;
-
 /*
  * our default in case something goes wrong
  */
@@ -130,6 +124,12 @@ blend_for_op(struct xa_composite_blend *blend,
}
 }
 
+/*
+ * No component alpha yet.
+ */
+if (mask_pic && mask_pic->component_alpha && blend->alpha_src)
+   return FALSE;
+
 if (!dst_pic->srf)
return supported;
 
@@ -224,15 +224,9 @@ xa_src_pict_is_accelerated(const union xa_source_pict 
*src_pic)
 XA_EXPORT int
 xa_composite_check_accelerated(const struct xa_composite *comp)
 {
-struct xa_composite_blend blend;
 struct xa_picture *src_pic = comp->src;
 struct xa_picture *mask_pic = comp->mask;
-
-/*
- * No component alpha yet.
- */
-if (mask_pic && mask_pic->component_alpha)
-   return -XA_ERR_INVAL;
+struct xa_composite_blend blend;
 
 if (!xa_is_filter_accelerated(src_pic) ||
!xa_is_filter_accelerated(comp->mask)) {
@@ -246,6 +240,12 @@ xa_composite_check_accelerated(const struct xa_composite 
*comp)
 if (!blend_for_op(&blend, comp->op, comp->src, comp->mask, comp->dst))
return -XA_ERR_INVAL;
 
+/*
+ * No component alpha yet.
+ */
+if (mask_pic && mask_pic->component_alpha && blend.alpha_src)
+   return -XA_ERR_INVAL;
+
 return XA_ERR_NONE;
 }
 
@@ -382,10 +382,15 @@ bind_shaders(struct xa_context *ctx, const struct 
xa_composite *comp)
 struct xa_shader shader;
 struct xa_picture *src_pic = comp->src;
 struct xa_picture *mask_pic = comp->mask;
+struct xa_picture *dst_pic = comp->dst;
 
 ctx->has_solid_src = FALSE;
 ctx->has_solid_mask = FALSE;
 
+if (dst_pic && xa_format_type(dst_pic->pict_format) !=
+xa_format_type(xa_surface_format(dst_pic->srf)))
+   return -XA_ERR_INVAL;
+
 if (src_pic) {
if (src_pic->wrap == xa_wrap_clamp_to_border && src_pic->has_transform)
fs_traits |= FS_SRC_REPEAT_NONE;
@@ -405,6 +410,8 @@ bind_shaders(struct xa_context *ctx, const struct 
xa_composite *comp)
 if (mask_pic) {
vs_traits |= VS_MASK;
fs_traits |= FS_MASK;
+if (mask_pic->component_alpha)
+   fs_traits |= FS_CA;
 if (mask_pic->src_pict) {
 if (!xa_handle_src_pict(ctx, mask_pic->src_pict, true))
 return -XA_ERR_INVAL;
diff --git a/src/gallium/state_trackers/xa/xa_priv.h 
b/src/gallium/state_trackers/xa/xa_priv.h
index 09a858ff972..f368de3b81f 100644
--- a/src/gallium/state_trackers/xa/xa_priv.h
+++ b/src/gallium/state_trackers/xa/xa_priv.h
@@ -166,6 +166,7 @@ enum xa_fs_traits {
 FS_SRC_LUMINANCE = 1 << 11,
 FS_MASK_LUMINANCE = 1 << 12,
 FS_DST_LUMINANCE = 1 << 13,
+FS_CA = 1 << 14,
 };
 
 struct xa_shader {
diff --git a/src/gallium/state_trackers/xa/xa_tgsi.c 
b/src/gallium/state_trackers/xa/xa_tgsi.c
index 5f2608aee55..ed3e0895d98 100644
--- a/src/gallium/state_trackers/xa/xa_tgsi.c
+++ b/src/gallium/state_trackers/xa/xa_tgsi.c
@@ -82,6 +82,7 @@ print_fs_traits(int fs_traits)
"FS_SRC_LUMINANCE", /* = 1 << 11, */
"FS_MASK_LUMINANCE",/* = 1 << 12, */
"FS_DST_LUMINANCE", /* = 1 << 13, */
+"FS_CA",/* = 1 << 14, */
 };
 int i, k;
 
@@ -107,12 +108,20 @@ src_in_mask(struct ureg_program *ureg,
struct ureg_dst dst,
struct ureg_src src,
struct ureg_src mask,
-   unsigned mask_luminance)
+   unsigned mask_luminance, boolean component_alpha)
 {
 if (mask_luminance)
-ureg_MUL(ureg, dst, src, ureg_scalar(mask, TGSI_SWIZZLE_X));
-else
+if (component_alpha) {
+ureg_MOV(ureg, dst, src);
+ureg_MUL(ureg, ureg_writemask(dst, TGSI_WRITEMASK_W),
+ src, ureg_scalar(mask, TGSI_SWIZZLE_X));
+} else {
+ureg_MUL(ureg, dst, src, ureg_scalar(mask, TGSI_SWIZZLE_X));
+}
+else if (!component_alpha)
 ureg_MUL(ureg, dst, src, ureg_scalar(mask, TGSI_SWIZZLE_W));
+else
+ureg_MUL(ureg, dst, src, mask);
 }
 
 static struct ureg_src
@@ -347,6 +356,7 @@ create_fs(struct pipe_context *pipe,

[Mesa-dev] [PATCH 2/3] st/xa: Minor renderer cleanups

2018-11-13 Thread Thomas Hellstrom

Constify function arguments to clean up the code a bit.

Reported-by: Brian Paul 
Signed-off-by: Thomas Hellstrom 
Reviewed-by: Brian Paul 
---
 src/gallium/state_trackers/xa/xa_renderer.c | 24 ++---
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/src/gallium/state_trackers/xa/xa_renderer.c 
b/src/gallium/state_trackers/xa/xa_renderer.c
index ac26c5508cf..582a5fa1308 100644
--- a/src/gallium/state_trackers/xa/xa_renderer.c
+++ b/src/gallium/state_trackers/xa/xa_renderer.c
@@ -46,14 +46,14 @@ renderer_set_constants(struct xa_context *r,
   int shader_type, const float *params, int param_bytes);
 
 static inline boolean
-is_affine(float *matrix)
+is_affine(const float *matrix)
 {
 return floatIsZero(matrix[2]) && floatIsZero(matrix[5])
&& floatsEqual(matrix[8], 1);
 }
 
 static inline void
-map_point(float *mat, float x, float y, float *out_x, float *out_y)
+map_point(const float *mat, float x, float y, float *out_x, float *out_y)
 {
 if (!mat) {
*out_x = x;
@@ -192,25 +192,25 @@ add_vertex_2tex(struct xa_context *r,
 }
 
 static void
-compute_src_coords(float sx, float sy, struct pipe_resource *src,
+compute_src_coords(float sx, float sy, const struct pipe_resource *src,
const float *src_matrix,
float width, float height,
float tc0[2], float tc1[2], float tc2[2], float tc3[2])
 {
 tc0[0] = sx;
 tc0[1] = sy;
-tc1[0] = (sx + width);
+tc1[0] = sx + width;
 tc1[1] = sy;
-tc2[0] = (sx + width);
-tc2[1] = (sy + height);
+tc2[0] = sx + width;
+tc2[1] = sy + height;
 tc3[0] = sx;
-tc3[1] = (sy + height);
+tc3[1] = sy + height;
 
 if (src_matrix) {
-   map_point((float *)src_matrix, tc0[0], tc0[1], &tc0[0], &tc0[1]);
-   map_point((float *)src_matrix, tc1[0], tc1[1], &tc1[0], &tc1[1]);
-   map_point((float *)src_matrix, tc2[0], tc2[1], &tc2[0], &tc2[1]);
-   map_point((float *)src_matrix, tc3[0], tc3[1], &tc3[0], &tc3[1]);
+   map_point(src_matrix, tc0[0], tc0[1], &tc0[0], &tc0[1]);
+   map_point(src_matrix, tc1[0], tc1[1], &tc1[0], &tc1[1]);
+   map_point(src_matrix, tc2[0], tc2[1], &tc2[0], &tc2[1]);
+   map_point(src_matrix, tc3[0], tc3[1], &tc3[0], &tc3[1]);
 }
 
 tc0[0] /= src->width0;
@@ -227,7 +227,7 @@ static void
 add_vertex_data1(struct xa_context *r,
  float srcX, float srcY,  float dstX, float dstY,
  float width, float height,
- struct pipe_resource *src, const float *src_matrix)
+ const struct pipe_resource *src, const float *src_matrix)
 {
 float tc0[2], tc1[2], tc2[2], tc3[2];
 
-- 
2.19.0.rc1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] mesa/st: swap order of clear() and clear_with_quad()

2018-11-13 Thread Rob Clark

If we can't clear all the buffers with pctx->clear() (say, for example,
because of ColorMask), push the buffers we *can* clear with pctx->clear()
first.  Tilers want to see clears coming before draws to enable fast-
paths, and clearing one of the attachments with a quad-draw first
confuses that logic.

Signed-off-by: Rob Clark 
---
 src/mesa/state_tracker/st_cb_clear.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/mesa/state_tracker/st_cb_clear.c 
b/src/mesa/state_tracker/st_cb_clear.c
index 22e85019764..3b51bd2c8a7 100644
--- a/src/mesa/state_tracker/st_cb_clear.c
+++ b/src/mesa/state_tracker/st_cb_clear.c
@@ -442,9 +442,6 @@ st_Clear(struct gl_context *ctx, GLbitfield mask)
 * use pipe->clear. We want to always use pipe->clear for the other
 * renderbuffers, because it's likely to be faster.
 */
-   if (quad_buffers) {
-  clear_with_quad(ctx, quad_buffers);
-   }
if (clear_buffers) {
   /* We can't translate the clear color to the colorbuffer format,
* because different colorbuffers may have different formats.
@@ -453,6 +450,9 @@ st_Clear(struct gl_context *ctx, GLbitfield mask)
   (union pipe_color_union*)&ctx->Color.ClearColor,
   ctx->Depth.Clear, ctx->Stencil.Clear);
}
+   if (quad_buffers) {
+  clear_with_quad(ctx, quad_buffers);
+   }
if (mask & BUFFER_BIT_ACCUM)
   _mesa_clear_accum_buffer(ctx);
 }
-- 
2.19.1


Re: [Mesa-dev] [PATCH 16/28] Replace IROUND_POS with _mesa_roundevenf

2018-11-13 Thread Roland Scheidegger

On 13.11.18 at 18:00, Dylan Baker wrote:
> Quoting Erik Faye-Lund (2018-11-13 01:34:53)
>> On Mon, 2018-11-12 at 09:22 -0800, Dylan Baker wrote:
>>> Quoting Erik Faye-Lund (2018-11-12 04:51:47)
>>>> On Fri, 2018-11-09 at 10:40 -0800, Dylan Baker wrote:
>>>>> Which has the same behavior.
>>>>
>>>> Does it? I'm not so sure... IROUND_POS seems to round to nearest
>>>> integer depending on the FPU rounding mode, _mesa_roundevenf rounds to
>>>> the nearest *even* value regardless of the FPU rounding mode, no?
>>>>
>>>> I'm not sure if it matters or not, but *at least* point that out in the
>>>> commit message. Unless I'm missing something, of course...
>>>
>>> I should put it in the commit message, but there is a comment in
>>> rounding.h that
>>> if you change the rounding mode you get to keep the pieces.
>>
>> Well, this might regress performance pretty badly. Especially in the
>> swrast code, this could be bad...
>>
> 
> Why? we have the assumption that you don't change the rounding mode already in
> core mesa and many of the drivers.
> 
> For performance, I measured a simple 1000 loops of rounding, and found that
> the only way the rounding.h function was slower is if you used the __SSE4_1__
> path... (It was the same performance as the int cast +0.5 implementation)
FWIW I'm not entirely sure it's useful to have an sse41 implementation,
since all sse2-capable CPUs can natively do rintf. Although maybe it
should be pointed out that the sse41 implementation will use a defined
rounding mode, whereas rintf will use the current rounding mode. But I
don't think anyone ever cares about the results if a different rounding
mode is set. Of course, rint and its variants do not actually
guarantee the even part of it (but if it's an sse41-capable box we
pretty much know it would do just that anyway)... (And technically
nearbyintf would probably be an even better solution, since we never
want to get involved with the clunky exceptions; otherwise it's
identical. But there might be reasons why it isn't used.)

Roland
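
For anyone following the tie-breaking question raised earlier in the thread, here is a minimal sketch of where "int cast +0.5" rounding and round-half-to-even disagree. Both helper names are made up for illustration and neither is Mesa's actual IROUND_POS or _mesa_roundevenf; the second one spells the even rule out by hand instead of relying on rintf() and the FPU rounding mode, and it is only meant for small non-negative inputs.

```c
#include <assert.h>

/* Hypothetical helper: the "int cast + 0.5" form, for f >= 0.
 * Exact ties (x.5) always round up. */
static int
round_half_up_pos(float f)
{
   assert(f >= 0.0f);
   return (int)(f + 0.5f);
}

/* Hypothetical helper: round-half-to-even for f >= 0, written out by
 * hand so the result does not depend on the current FPU rounding mode.
 * Assumes f is small enough to fit in an int. */
static int
round_half_even_pos(float f)
{
   assert(f >= 0.0f);
   int i = (int)f;             /* truncation equals floor for f >= 0 */
   float frac = f - (float)i;  /* fractional part in [0, 1) */

   /* Round up past the midpoint, or at the midpoint when i is odd. */
   if (frac > 0.5f || (frac == 0.5f && (i & 1)))
      i++;
   return i;
}
```

The two only differ on exact ties: 0.5 and 2.5 go up with the first helper and to the even neighbour with the second, while 3.5 rounds to 4 either way.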


> 
> Dylan
> 
> 
> 


[Mesa-dev] [PATCH 8/8] intel/compiler: Lower SSBO and shared loads/stores in NIR

2018-11-13 Thread Jason Ekstrand

We have a bunch of code to do this in the back-end compiler but it's
fairly specific to typed surface messages and the way we emit them.
This breaks it out into NIR where it's easier to do things a bit more
generally.  It also means we can easily share the code between the vec4
and FS back-ends if we wish.
---
 src/intel/Makefile.sources|   1 +
 src/intel/compiler/brw_fs_nir.cpp | 381 --
 src/intel/compiler/brw_nir.c  |   2 +
 src/intel/compiler/brw_nir.h  |   2 +
 .../brw_nir_lower_mem_access_bit_sizes.c  | 313 ++
 src/intel/compiler/brw_vec4_nir.cpp   | 126 +-
 src/intel/compiler/meson.build|   1 +
 7 files changed, 421 insertions(+), 405 deletions(-)
 create mode 100644 src/intel/compiler/brw_nir_lower_mem_access_bit_sizes.c

diff --git a/src/intel/Makefile.sources b/src/intel/Makefile.sources
index 4da887f7ed2..5e7d32293b7 100644
--- a/src/intel/Makefile.sources
+++ b/src/intel/Makefile.sources
@@ -85,6 +85,7 @@ COMPILER_FILES = \
compiler/brw_nir_attribute_workarounds.c \
compiler/brw_nir_lower_cs_intrinsics.c \
compiler/brw_nir_lower_image_load_store.c \
+   compiler/brw_nir_lower_mem_access_bit_sizes.c \
compiler/brw_nir_opt_peephole_ffma.c \
compiler/brw_nir_tcs_workarounds.c \
compiler/brw_packed_float.c \
diff --git a/src/intel/compiler/brw_fs_nir.cpp 
b/src/intel/compiler/brw_fs_nir.cpp
index 2b36171136e..84d0c6be6c3 100644
--- a/src/intel/compiler/brw_fs_nir.cpp
+++ b/src/intel/compiler/brw_fs_nir.cpp
@@ -26,6 +26,7 @@
 #include "brw_fs_surface_builder.h"
 #include "brw_nir.h"
 #include "util/u_math.h"
+#include "util/bitscan.h"
 
 using namespace brw;
 using namespace brw::surface_access;
@@ -2250,107 +2251,6 @@ fs_visitor::get_indirect_offset(nir_intrinsic_instr 
*instr)
return get_nir_src(*offset_src);
 }
 
-static void
-do_untyped_vector_read(const fs_builder &bld,
-   const fs_reg dest,
-   const fs_reg surf_index,
-   const fs_reg offset_reg,
-   unsigned num_components)
-{
-   if (type_sz(dest.type) <= 2) {
-  assert(dest.stride == 1);
-  boolean is_const_offset = offset_reg.file == BRW_IMMEDIATE_VALUE;
-
-  if (is_const_offset) {
- uint32_t start = offset_reg.ud & ~3;
- uint32_t end = offset_reg.ud + num_components * type_sz(dest.type);
- end = ALIGN(end, 4);
- assert (end - start <= 16);
-
- /* At this point we have 16-bit component/s that have constant
-  * offset aligned to 4-bytes that can be read with untyped_reads.
-  * untyped_read message requires 32-bit aligned offsets.
-  */
- unsigned first_component = (offset_reg.ud & 3) / type_sz(dest.type);
- unsigned num_components_32bit = (end - start) / 4;
-
- fs_reg read_result =
-emit_untyped_read(bld, surf_index, brw_imm_ud(start),
-  1 /* dims */,
-  num_components_32bit,
-  BRW_PREDICATE_NONE);
- shuffle_from_32bit_read(bld, dest, read_result, first_component,
- num_components);
-  } else {
- fs_reg read_offset = bld.vgrf(BRW_REGISTER_TYPE_UD);
- for (unsigned i = 0; i < num_components; i++) {
-if (i == 0) {
-   bld.MOV(read_offset, offset_reg);
-} else {
-   bld.ADD(read_offset, offset_reg,
-   brw_imm_ud(i * type_sz(dest.type)));
-}
-/* Non constant offsets are not guaranteed to be aligned 32-bits
- * so they are read using one byte_scattered_read message
- * for each component.
- */
-fs_reg read_result =
-   emit_byte_scattered_read(bld, surf_index, read_offset,
-1 /* dims */, 1,
-type_sz(dest.type) * 8 /* bit_size */,
-BRW_PREDICATE_NONE);
-bld.MOV(offset(dest, bld, i),
-subscript (read_result, dest.type, 0));
- }
-  }
-   } else if (type_sz(dest.type) == 4) {
-  fs_reg read_result = emit_untyped_read(bld, surf_index, offset_reg,
- 1 /* dims */,
- num_components,
- BRW_PREDICATE_NONE);
-  read_result.type = dest.type;
-  for (unsigned i = 0; i < num_components; i++)
- bld.MOV(offset(dest, bld, i), offset(read_result, bld, i));
-   } else if (type_sz(dest.type) == 8) {
-  /* Reading a dvec, so we need to:
-   *
-   * 1. Multiply num_components by 2, to account for the fact that we
-   *need to read 64-bit components.
-   * 2. Shuffle the result

[Mesa-dev] [PATCH 7/8] nir: Add alignment parameters to SSBO, UBO, and shared access

2018-11-13 Thread Jason Ekstrand

This also changes spirv_to_nir and glsl_to_nir to set them.  The one
place that doesn't set them is shared memory access lowering in
nir_lower_io.  That will have to be updated before any consumers of it
can effectively use these new alignments.
---
 src/compiler/glsl/glsl_to_nir.cpp| 14 +++
 src/compiler/nir/nir.h   | 41 
 src/compiler/nir/nir_intrinsics.py   | 26 -
 src/compiler/nir/nir_lower_atomics_to_ssbo.c |  4 ++
 src/compiler/nir/nir_print.c |  2 +
 src/compiler/spirv/spirv_to_nir.c|  2 +
 src/compiler/spirv/vtn_variables.c   |  6 +++
 7 files changed, 85 insertions(+), 10 deletions(-)

diff --git a/src/compiler/glsl/glsl_to_nir.cpp 
b/src/compiler/glsl/glsl_to_nir.cpp
index 9bb0f5d4044..9f73b721e39 100644
--- a/src/compiler/glsl/glsl_to_nir.cpp
+++ b/src/compiler/glsl/glsl_to_nir.cpp
@@ -33,6 +33,7 @@
 #include "compiler/nir/nir_builder.h"
 #include "main/imports.h"
 #include "main/mtypes.h"
+#include "util/u_math.h"
 
 /*
  * pass to lower GLSL IR to NIR
@@ -603,6 +604,14 @@ nir_visitor::visit(ir_return *ir)
nir_builder_instr_insert(, >instr);
 }
 
+static void
+intrinsic_set_std430_align(nir_intrinsic_instr *intrin, const glsl_type *type)
+{
+   unsigned bit_size = type->is_boolean() ? 32 : glsl_get_bit_size(type);
+   unsigned pow2_components = util_next_power_of_two(type->vector_elements);
+   nir_intrinsic_set_align(intrin, (bit_size / 8) * pow2_components, 0);
+}
+
 void
 nir_visitor::visit(ir_call *ir)
 {
@@ -1006,6 +1015,7 @@ nir_visitor::visit(ir_call *ir)
  instr->src[0] = nir_src_for_ssa(nir_val);
  instr->src[1] = nir_src_for_ssa(evaluate_rvalue(block));
  instr->src[2] = nir_src_for_ssa(evaluate_rvalue(offset));
+ intrinsic_set_std430_align(instr, val->type);
  nir_intrinsic_set_write_mask(instr, write_mask->value.u[0]);
  instr->num_components = val->type->vector_elements;
 
@@ -1024,6 +1034,7 @@ nir_visitor::visit(ir_call *ir)
 
  const glsl_type *type = ir->return_deref->var->type;
  instr->num_components = type->vector_elements;
+ intrinsic_set_std430_align(instr, type);
 
  /* Setup destination register */
  unsigned bit_size = type->is_boolean() ? 32 : glsl_get_bit_size(type);
@@ -1101,6 +1112,7 @@ nir_visitor::visit(ir_call *ir)
 
  const glsl_type *type = ir->return_deref->var->type;
  instr->num_components = type->vector_elements;
+ intrinsic_set_std430_align(instr, type);
 
  /* Setup destination register */
  unsigned bit_size = type->is_boolean() ? 32 : glsl_get_bit_size(type);
@@ -1131,6 +1143,7 @@ nir_visitor::visit(ir_call *ir)
 
  instr->src[0] = nir_src_for_ssa(nir_val);
  instr->num_components = val->type->vector_elements;
+ intrinsic_set_std430_align(instr, val->type);
 
  nir_builder_instr_insert(, >instr);
  break;
@@ -1388,6 +1401,7 @@ nir_visitor::visit(ir_expression *ir)
   load->num_components = ir->type->vector_elements;
   load->src[0] = nir_src_for_ssa(evaluate_rvalue(ir->operands[0]));
   load->src[1] = nir_src_for_ssa(evaluate_rvalue(ir->operands[1]));
+  intrinsic_set_std430_align(load, ir->type);
   add_instr(&load->instr, ir->type->vector_elements, bit_size);
 
   /*
diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
index c469e111b2c..41d61dc8105 100644
--- a/src/compiler/nir/nir.h
+++ b/src/compiler/nir/nir.h
@@ -34,6 +34,7 @@
 #include "util/list.h"
 #include "util/ralloc.h"
 #include "util/set.h"
+#include "util/bitscan.h"
 #include "util/bitset.h"
 #include "util/macros.h"
 #include "compiler/nir_types.h"
@@ -1248,6 +1249,18 @@ typedef enum {
 */
NIR_INTRINSIC_ACCESS = 16,
 
+   /**
+* Alignment for offsets and addresses
+*
+* These two parameters specify an alignment in terms of a multiplier and
+* an offset.  The offset or address parameter X of the intrinsic is
+* guaranteed to satisfy the following:
+*
+*(X - align_offset) % align_mul == 0
+*/
+   NIR_INTRINSIC_ALIGN_MUL = 17,
+   NIR_INTRINSIC_ALIGN_OFFSET = 18,
+
NIR_INTRINSIC_NUM_INDEX_FLAGS,
 
 } nir_intrinsic_index_flag;
@@ -1342,6 +1355,34 @@ INTRINSIC_IDX_ACCESSORS(image_dim, IMAGE_DIM, enum 
glsl_sampler_dim)
 INTRINSIC_IDX_ACCESSORS(image_array, IMAGE_ARRAY, bool)
 INTRINSIC_IDX_ACCESSORS(access, ACCESS, enum gl_access_qualifier)
 INTRINSIC_IDX_ACCESSORS(format, FORMAT, unsigned)
+INTRINSIC_IDX_ACCESSORS(align_mul, ALIGN_MUL, unsigned)
+INTRINSIC_IDX_ACCESSORS(align_offset, ALIGN_OFFSET, unsigned)
+
+static inline void
+nir_intrinsic_set_align(nir_intrinsic_instr *intrin,
+unsigned align_mul, unsigned align_offset)
+{
+   assert(util_is_power_of_two_nonzero(align_mul));
+   assert(align_offset < align_mul);
+   nir_intrinsic_set_align_mul(intrin, align_mul);
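
To make the (X - align_offset) % align_mul == 0 contract from the nir.h comment concrete, here is a small standalone sketch. All function names are hypothetical and nothing here is NIR API. It shows the check itself, the largest power-of-two alignment a consumer could rely on for a given (align_mul, align_offset) pair, and the same arithmetic as intrinsic_set_std430_align() in the glsl_to_nir hunk above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: true for 1, 2, 4, 8, ... */
static bool
is_pow2_nonzero(unsigned v)
{
   return v != 0 && (v & (v - 1)) == 0;
}

/* Does offset/address x satisfy (x - align_offset) % align_mul == 0?
 * Unsigned wrap-around is harmless because align_mul is a power of two. */
static bool
satisfies_align(unsigned x, unsigned align_mul, unsigned align_offset)
{
   assert(is_pow2_nonzero(align_mul));
   assert(align_offset < align_mul);
   return (x - align_offset) % align_mul == 0;
}

/* Largest power-of-two alignment guaranteed for every x that satisfies
 * the contract: align_mul itself when the offset is zero, otherwise the
 * lowest set bit of the offset. */
static unsigned
guaranteed_align(unsigned align_mul, unsigned align_offset)
{
   assert(is_pow2_nonzero(align_mul));
   assert(align_offset < align_mul);
   if (align_offset == 0)
      return align_mul;
   return align_offset & (~align_offset + 1u);
}

/* Smallest power of two >= v (v >= 1, small values only). */
static unsigned
next_pow2(unsigned v)
{
   unsigned p = 1;
   while (p < v)
      p <<= 1;
   return p;
}

/* std430-style vector alignment in bytes: component size times the
 * component count rounded up to a power of two. */
static unsigned
std430_vector_align(unsigned bit_size, unsigned components)
{
   return (bit_size / 8) * next_pow2(components);
}
```

For example, a 32-bit vec3 gets a 16-byte alignment, and an access known to sit 4 bytes past a 16-byte boundary can only be treated as 4-byte aligned.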
+

[Mesa-dev] [PATCH 2/8] nir/builder: Assert that intN_t immediates fit

2018-11-13 Thread Jason Ekstrand

This assert won't catch all mistakes with this helper but it will at
least ensure that the top bits are all zero or all one which should help
catch bugs.
---
 src/compiler/nir/nir_builder.h | 4 
 1 file changed, 4 insertions(+)

diff --git a/src/compiler/nir/nir_builder.h b/src/compiler/nir/nir_builder.h
index 3271a480520..3be630ab3dd 100644
--- a/src/compiler/nir/nir_builder.h
+++ b/src/compiler/nir/nir_builder.h
@@ -330,6 +330,10 @@ nir_imm_intN_t(nir_builder *build, uint64_t x, unsigned 
bit_size)
 {
nir_const_value v;
 
+   assert(bit_size == 64 ||
+  (int64_t)x >> bit_size == 0 ||
+  (int64_t)x >> bit_size == -1);
+
   memset(&v, 0, sizeof(v));
assert(bit_size <= 64);
v.i64[0] = x & (~0ull >> (64 - bit_size));
-- 
2.19.1
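
As a rough standalone illustration of what the new assert accepts, the sketch below mirrors the two expressions from the hunk in plain C. The function names are hypothetical and not part of nir_builder.h. It assumes, as the patch implicitly does, that right-shifting a negative int64_t is an arithmetic shift, which C leaves implementation-defined.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when the bits of x above bit_size are all
 * zero or all one, i.e. x looks like a zero- or sign-extended value. */
static bool
imm_fits(uint64_t x, unsigned bit_size)
{
   assert(bit_size >= 1 && bit_size <= 64);
   if (bit_size == 64)
      return true;

   /* Assumes an arithmetic shift for negative values. */
   int64_t top = (int64_t)x >> bit_size;
   return top == 0 || top == -1;
}

/* Hypothetical helper: keep only the low bit_size bits, as the existing
 * masking line in nir_imm_intN_t does. */
static uint64_t
imm_truncate(uint64_t x, unsigned bit_size)
{
   assert(bit_size >= 1 && bit_size <= 64);
   return x & (~0ull >> (64 - bit_size));
}
```

This also shows why the commit message says the check will not catch every mistake: for bit_size 8 it accepts 255 and -256, neither of which is a valid int8_t value, because it only looks at the bits above bit 7.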


[Mesa-dev] [PATCH 1/8] nir/lower_alu_to_scalar: Don't try to lower unpack_32_2x16

2018-11-13 Thread Jason Ekstrand

It messes up when trying to lower.

Cc: mesa-sta...@lists.freedesktop.org
---
 src/compiler/nir/nir_lower_alu_to_scalar.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/compiler/nir/nir_lower_alu_to_scalar.c 
b/src/compiler/nir/nir_lower_alu_to_scalar.c
index 0be3aba9456..7ef032cd164 100644
--- a/src/compiler/nir/nir_lower_alu_to_scalar.c
+++ b/src/compiler/nir/nir_lower_alu_to_scalar.c
@@ -194,6 +194,7 @@ lower_alu_instr_scalar(nir_alu_instr *instr, nir_builder *b)
}
 
case nir_op_unpack_64_2x32:
+   case nir_op_unpack_32_2x16:
   return false;
 
   LOWER_REDUCTION(nir_op_fdot, nir_op_fmul, nir_op_fadd);
-- 
2.19.1


[Mesa-dev] [PATCH 0/8] intel: Move shared/SSBO access lowering to NIR

2018-11-13 Thread Jason Ekstrand

In order to properly do all the different kinds of SSBO and SLM writes that
we have in GL and Vulkan, we have to do some lowering.  The hardware
doesn't have instructions for writing an N-bit vecM with an arbitrary
write-mask.  Instead, we have byte scattered messages which work on a
scalar byte, word, or dword at an unaligned address and untyped surface
messages which work on a 32-bit vecN.  All SSBO and SLM access has to be
lowered to one of these two things.

Previously we did this in the back-end and had separate copies for fs and
vec4.  This works but it was fairly heavily tied to the fs_surface_builder
and the way we emit typed load/store ops.  I've been interested in wiring
up the A64 messages for doing "global" reads and writes and they will need
exactly the same lowering but I'm not at all convinced I want to shove them
through the same emit_untyped_read/write helpers we have today.  In any
case, this lets us share code between vec4 and fs and I think the
implementation is overall cleaner for it.  This series has a few other
advantages beyond just code sharing:

 1) The new splitting code acts on ranges of bytes and is able to combine
loads/stores in more cases than the old code could.  For example, an
indirect u8vec3 load is now just a single dword load where we throw
away the last 16 bits.  Another example is that a u16vec4 write with a
YZ writemask is now written with a single unaligned dword store.

 2) OpBitcast in SPIR-V now works correctly on 8-bit types.

 3) Writes to 8 and 16-bit shared variables should now work.
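
The "ranges of bytes" idea in point 1 can be pictured with a small standalone sketch. This is not the code from brw_nir_lower_mem_access_bit_sizes.c; the struct and function names are hypothetical, and little-endian component packing is assumed. It turns a write-mask into the byte range of its first contiguous run and pulls 8-bit components back out of a loaded dword.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type: a contiguous span of bytes within a vector. */
struct byte_range {
   unsigned start; /* byte offset of the first written component */
   unsigned size;  /* number of bytes covered by the run */
};

/* Byte range of the first contiguous run of enabled components in
 * write_mask, for components that are comp_bytes wide. */
static struct byte_range
first_write_run(unsigned write_mask, unsigned comp_bytes)
{
   struct byte_range r = { 0, 0 };
   if (write_mask == 0)
      return r;

   unsigned first = 0;
   while (!(write_mask & (1u << first)))
      first++;

   unsigned count = 0;
   while (first + count < 32 && (write_mask & (1u << (first + count))))
      count++;

   r.start = first * comp_bytes;
   r.size = count * comp_bytes;
   return r;
}

/* Component i (0..3) of a vector of 8-bit values packed into one dword,
 * assuming component 0 sits in the low byte. */
static uint8_t
unpack_u8(uint32_t dword, unsigned i)
{
   assert(i < 4);
   return (uint8_t)(dword >> (8 * i));
}
```

With 16-bit components, a YZ write-mask (0x6) maps to 4 bytes starting at byte offset 2, which is the single unaligned dword store mentioned above. A u8vec3 load reads one dword and simply ignores the byte that unpack_u8(dword, 3) would return.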

Cc: Samuel Iglesias Gonsálvez 
Cc: Jose Maria Casanova Crespo 

Jason Ekstrand (8):
  nir/lower_alu_to_scalar: Don't try to lower unpack_32_2x16
  nir/builder: Assert that intN_t immediates fit
  nir/builder: Add iadd_imm and imul_imm helpers
  nir/builder: Add a nir_pack/unpack/bitcast helpers
  nir/spirv: Force 32-bit for UBO and SSBO Booleans
  nir/glsl: Force 32-bit for UBO and SSBO Booleans
  nir: Add alignment parameters to SSBO, UBO, and shared access
  intel/compiler: Lower SSBO and shared loads/stores in NIR

 src/compiler/glsl/glsl_to_nir.cpp |  31 +-
 src/compiler/nir/nir.h|  41 ++
 src/compiler/nir/nir_builder.h| 142 +++
 src/compiler/nir/nir_intrinsics.py|  26 +-
 src/compiler/nir/nir_lower_alu_to_scalar.c|   1 +
 src/compiler/nir/nir_lower_atomics_to_ssbo.c  |   4 +
 src/compiler/nir/nir_lower_io.c   |   5 +-
 src/compiler/nir/nir_print.c  |   2 +
 src/compiler/spirv/spirv_to_nir.c |   2 +
 src/compiler/spirv/vtn_alu.c  | 101 ++---
 src/compiler/spirv/vtn_variables.c|  30 +-
 src/intel/Makefile.sources|   1 +
 src/intel/compiler/brw_fs_nir.cpp | 381 --
 src/intel/compiler/brw_nir.c  |   2 +
 src/intel/compiler/brw_nir.h  |   2 +
 .../brw_nir_lower_mem_access_bit_sizes.c  | 313 ++
 src/intel/compiler/brw_vec4_nir.cpp   | 126 +-
 src/intel/compiler/meson.build|   1 +
 18 files changed, 702 insertions(+), 509 deletions(-)
 create mode 100644 src/intel/compiler/brw_nir_lower_mem_access_bit_sizes.c

-- 
2.19.1


[Mesa-dev] [PATCH 6/8] nir/glsl: Force 32-bit for UBO and SSBO Booleans

2018-11-13 Thread Jason Ekstrand

---
 src/compiler/glsl/glsl_to_nir.cpp | 17 -
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/src/compiler/glsl/glsl_to_nir.cpp 
b/src/compiler/glsl/glsl_to_nir.cpp
index 0479f8fcfe4..9bb0f5d4044 100644
--- a/src/compiler/glsl/glsl_to_nir.cpp
+++ b/src/compiler/glsl/glsl_to_nir.cpp
@@ -1000,7 +1000,10 @@ nir_visitor::visit(ir_call *ir)
  ir_constant *write_mask = ((ir_instruction *)param)->as_constant();
  assert(write_mask);
 
- instr->src[0] = nir_src_for_ssa(evaluate_rvalue(val));
+ nir_ssa_def *nir_val = evaluate_rvalue(val);
+ assert(!val->type->is_boolean() || nir_val->bit_size == 32);
+
+ instr->src[0] = nir_src_for_ssa(nir_val);
  instr->src[1] = nir_src_for_ssa(evaluate_rvalue(block));
  instr->src[2] = nir_src_for_ssa(evaluate_rvalue(offset));
  nir_intrinsic_set_write_mask(instr, write_mask->value.u[0]);
@@ -1023,7 +1026,7 @@ nir_visitor::visit(ir_call *ir)
  instr->num_components = type->vector_elements;
 
  /* Setup destination register */
- unsigned bit_size = glsl_get_bit_size(type);
+ unsigned bit_size = type->is_boolean() ? 32 : glsl_get_bit_size(type);
   nir_ssa_dest_init(&instr->instr, &instr->dest,
type->vector_elements, bit_size, NULL);
 
@@ -1100,7 +1103,7 @@ nir_visitor::visit(ir_call *ir)
  instr->num_components = type->vector_elements;
 
  /* Setup destination register */
- unsigned bit_size = glsl_get_bit_size(type);
+ unsigned bit_size = type->is_boolean() ? 32 : glsl_get_bit_size(type);
   nir_ssa_dest_init(&instr->instr, &instr->dest,
type->vector_elements, bit_size, NULL);
 
@@ -1123,7 +1126,10 @@ nir_visitor::visit(ir_call *ir)
 
  nir_intrinsic_set_write_mask(instr, write_mask->value.u[0]);
 
- instr->src[0] = nir_src_for_ssa(evaluate_rvalue(val));
+ nir_ssa_def *nir_val = evaluate_rvalue(val);
+ assert(!val->type->is_boolean() || nir_val->bit_size == 32);
+
+ instr->src[0] = nir_src_for_ssa(nir_val);
  instr->num_components = val->type->vector_elements;
 
  nir_builder_instr_insert(, >instr);
@@ -1377,7 +1383,8 @@ nir_visitor::visit(ir_expression *ir)
case ir_binop_ubo_load: {
   nir_intrinsic_instr *load =
  nir_intrinsic_instr_create(this->shader, nir_intrinsic_load_ubo);
-  unsigned bit_size = glsl_get_bit_size(ir->type);
+  unsigned bit_size = ir->type->is_boolean() ? 32 :
+  glsl_get_bit_size(ir->type);
   load->num_components = ir->type->vector_elements;
   load->src[0] = nir_src_for_ssa(evaluate_rvalue(ir->operands[0]));
   load->src[1] = nir_src_for_ssa(evaluate_rvalue(ir->operands[1]));
-- 
2.19.1


[Mesa-dev] [PATCH] radeonsi: fix video APIs on Raven2

2018-11-13 Thread Marek Olšák

From: Marek Olšák 

This was missed when I added the new enum.

Cc: 18.3 
---
 src/gallium/drivers/radeonsi/si_get.c | 9 ++---
 src/gallium/drivers/radeonsi/si_uvd.c | 3 ++-
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_get.c 
b/src/gallium/drivers/radeonsi/si_get.c
index b440230d227..91f38329d59 100644
--- a/src/gallium/drivers/radeonsi/si_get.c
+++ b/src/gallium/drivers/radeonsi/si_get.c
@@ -573,24 +573,26 @@ static int si_get_video_param(struct pipe_screen *screen,
  enum pipe_video_cap param)
 {
struct si_screen *sscreen = (struct si_screen *)screen;
enum pipe_video_format codec = u_reduce_video_profile(profile);
 
if (entrypoint == PIPE_VIDEO_ENTRYPOINT_ENCODE) {
switch (param) {
case PIPE_VIDEO_CAP_SUPPORTED:
return (codec == PIPE_VIDEO_FORMAT_MPEG4_AVC &&
(si_vce_is_fw_version_supported(sscreen) ||
-   sscreen->info.family == CHIP_RAVEN)) ||
+sscreen->info.family == CHIP_RAVEN ||
+sscreen->info.family == CHIP_RAVEN2)) ||
(profile == PIPE_VIDEO_PROFILE_HEVC_MAIN &&
(sscreen->info.family == CHIP_RAVEN ||
-   si_radeon_uvd_enc_supported(sscreen)));
+sscreen->info.family == CHIP_RAVEN2 ||
+si_radeon_uvd_enc_supported(sscreen)));
case PIPE_VIDEO_CAP_NPOT_TEXTURES:
return 1;
case PIPE_VIDEO_CAP_MAX_WIDTH:
return (sscreen->info.family < CHIP_TONGA) ? 2048 : 
4096;
case PIPE_VIDEO_CAP_MAX_HEIGHT:
return (sscreen->info.family < CHIP_TONGA) ? 1152 : 
2304;
case PIPE_VIDEO_CAP_PREFERED_FORMAT:
return PIPE_FORMAT_NV12;
case PIPE_VIDEO_CAP_PREFERS_INTERLACED:
return false;
@@ -624,21 +626,22 @@ static int si_get_video_param(struct pipe_screen *screen,
return true;
case PIPE_VIDEO_FORMAT_HEVC:
/* Carrizo only supports HEVC Main */
if (sscreen->info.family >= CHIP_STONEY)
return (profile == PIPE_VIDEO_PROFILE_HEVC_MAIN 
||
profile == 
PIPE_VIDEO_PROFILE_HEVC_MAIN_10);
else if (sscreen->info.family >= CHIP_CARRIZO)
return profile == PIPE_VIDEO_PROFILE_HEVC_MAIN;
return false;
case PIPE_VIDEO_FORMAT_JPEG:
-   if (sscreen->info.family == CHIP_RAVEN)
+   if (sscreen->info.family == CHIP_RAVEN ||
+   sscreen->info.family == CHIP_RAVEN2)
return true;
if (sscreen->info.family < CHIP_CARRIZO || 
sscreen->info.family >= CHIP_VEGA10)
return false;
if (!(sscreen->info.drm_major == 3 && 
sscreen->info.drm_minor >= 19)) {
RVID_ERR("No MJPEG support for the kernel 
version\n");
return false;
}
return true;
case PIPE_VIDEO_FORMAT_VP9:
if (sscreen->info.family < CHIP_RAVEN)
diff --git a/src/gallium/drivers/radeonsi/si_uvd.c 
b/src/gallium/drivers/radeonsi/si_uvd.c
index 1a9d8f8d9fa..8c9553acbf3 100644
--- a/src/gallium/drivers/radeonsi/si_uvd.c
+++ b/src/gallium/drivers/radeonsi/si_uvd.c
@@ -139,21 +139,22 @@ static void si_vce_get_buffer(struct pipe_resource 
*resource,
*surface = >surface;
 }
 
 /**
  * creates an UVD compatible decoder
  */
 struct pipe_video_codec *si_uvd_create_decoder(struct pipe_context *context,
   const struct pipe_video_codec 
*templ)
 {
struct si_context *ctx = (struct si_context *)context;
-   bool vcn = (ctx->family == CHIP_RAVEN) ? true : false;
+   bool vcn = ctx->family == CHIP_RAVEN ||
+  ctx->family == CHIP_RAVEN2;
 
if (templ->entrypoint == PIPE_VIDEO_ENTRYPOINT_ENCODE) {
if (vcn) {
return radeon_create_encoder(context, templ, ctx->ws, 
si_vce_get_buffer);
} else {
if (u_reduce_video_profile(templ->profile) == 
PIPE_VIDEO_FORMAT_HEVC)
return radeon_uvd_create_encoder(context, 
templ, ctx->ws, si_vce_get_buffer);
else
return si_vce_create_encoder(context, templ, 
ctx->ws, si_vce_get_buffer);
}
-- 
2.17.1

Re: [Mesa-dev] [PATCH 16/28] Replace IROUND_POS with _mesa_roundevenf

2018-11-13 Thread Roland Scheidegger

On 14.11.18 at 03:02, Roland Scheidegger wrote:
> On 13.11.18 at 23:49, Dylan Baker wrote:
>> Quoting Roland Scheidegger (2018-11-13 14:13:00)
>>> On 13.11.18 at 18:00, Dylan Baker wrote:
>>>> Quoting Erik Faye-Lund (2018-11-13 01:34:53)
>>>>> On Mon, 2018-11-12 at 09:22 -0800, Dylan Baker wrote:
>>>>>> Quoting Erik Faye-Lund (2018-11-12 04:51:47)
>>>>>>> On Fri, 2018-11-09 at 10:40 -0800, Dylan Baker wrote:
>>>>>>>> Which has the same behavior.
>>>>>>>
>>>>>>> Does it? I'm not so sure... IROUND_POS seems to round to nearest
>>>>>>> integer depending on the FPU rounding mode, _mesa_roundevenf rounds
>>>>>>> to
>>>>>>> the nearest *even* value regardless of the FPU rounding mode, no?
>>>>>>>
>>>>>>> I'm not sure if it matters or not, but *at least* point that out in
>>>>>>> the
>>>>>>> commit message. Unless I'm missing something, of course...
>>>>>>
>>>>>> I should put it in the commit message, but there is a comment in
>>>>>> rounding.h that
>>>>>> if you change the rounding mode you get to keep the pieces.
>>>>>
>>>>> Well, this might regress performance pretty badly. Especially in the
>>>>> swrast code, this could be bad...
>>>>>
>>>>
>>>> Why? we have the assumption that you don't change the rounding mode
>>>> already in core mesa and many of the drivers.
>>>>
>>>> For performance, I measured a simple 1000 loops of rounding, and found
>>>> that the only way the rounding.h function was slower is if you used the
>>>> __SSE4_1__ path... (It was the same performance as the int cast +0.5
>>>> implementation)
>>> FWIW I'm not entirely sure it's useful to have a sse41 implementation -
>>> since all sse2 capable cpus can natively do rintf. Although maybe it
>>> should be pointed out that the sse41 implementation will use a defined
>>> rounding mode, whereas rintf will use current rounding mode. But I don't
>>> think anyone ever cares for the results if a different rounding mode
>>> would be set. Although of course rint and its variant do not actually
>>> guarantee the even part of it (but well if it's a sse41 capable box we
>>> pretty much know it would do just that anyway)... (And technically
>>> nearbyintf would probably be an even better solution, since we never
>>> want to get involved with the clunky exceptions, otherwise it's
>>> identical. But there might be reasons why it isn't used.)
>>>
>>> Roland
>>
>> I'm not convinced we want it either, since it seems to be slower than glibc's
>> rintf. I guess it probably does make sense to use the nearbyintf instead.
>>
>> As an aside (since I know 0 about assembly), does _MM_FROUND_CUR_DIRECTION 
>> not
>> check the rounding mode?
> Oh indeed, I didn't check the code too closely (I was just assuming
> _mm_round_ss() was used because it is possible to use round-to-nearest
> regardless the actual rounding mode, but that's not the case).
> 
> But actually I misread this code: the point of mesa_roundevenf is to
> round to float WITHOUT conversion to int. In which case it makes more
> sense at least at first look...
> 
> But if you want to round to nearest integer WITH conversion to int, you
> probably really want to use something else. nearbyint family doesn't
> have variants which give you ints. There's rint functions which give you
> ints directly, but they are likely a very bad idea (aside from exception
> handling, not quite sure if this really causes the compiler to do
> something different) because of giving you long (or long long) results -
> meaning that you can't use the simple cpu instructions giving you 32bit
> results (because conversion to 64bit long + trunc to 32bit will give you
> defined (although meaningless) results in some cases where direct
> conversion to 32bit int wouldn't).
> So ideally you'd pick a variant where the compiler is smart enough to
> recognize it can be done with a single instruction. I would guess
> nearbyintf + int cast should do just about everywhere, at least as long
> as x64 or x86 + sse2 is used, my suspicion is the old IROUND function
> was done in a time where x87 was still relevant. Or maybe rintf + int
> cast, no idea how the compiler really handles them differently (I tried
> to quickly look at it in gcc source, but no idea where those are
> buried). As a side note, I hate it when the assembly solution is obvious
> and you can't really figure out how the hell you should coax the
> compiler in giving you the right answer (I mean, high level languages
> are there to help, not get in your way...).
> 
> All that said, I still don't really see the point of the manual sse41
> assembly (even for the case when we don't want to convert to int) -
> assuming there is an easy solution to get the compiler to do the right
> thing...

Err, I tried it out and was completely unable to come up with something
which wouldn't generate huge amounts of crap code (or library calls).
WTF. (But might depend on compiler, of course.)
So I guess maybe for round conversion to int you actually want to
manually do sse2 inline asm

Re: [Mesa-dev] [PATCH v3] i965: Fix calculation of layers array length for isl_view

2018-11-13 Thread Jason Ekstrand

On Mon, Sep 10, 2018 at 10:21 AM Danylo Piliaiev 
wrote:

> Handle all cases in calculation of layers count for isl_view
> taking into account texture view and image unit.
> st_convert_image was taken as a reference.
>
> When u->Layered is true the whole level is taken with respect to
> image view. In other case only one layer is taken.
>
> v3: (Józef Kucia and Ilia Mirkin)
> - Rewrote patch by taking st_convert_image as a reference
> - Removed now unused get_image_num_layers function
> - Changed commit message
>
> Fixes: 5a8c8903
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107856
>
> Signed-off-by: Danylo Piliaiev 
> ---
>  .../drivers/dri/i965/brw_wm_surface_state.c   | 32 ++-
>  1 file changed, 17 insertions(+), 15 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> index 944762ec46..9bfe6e2037 100644
> --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> @@ -1499,18 +1499,6 @@ update_buffer_image_param(struct brw_context *brw,
> param->stride[0] = _mesa_get_format_bytes(u->_ActualFormat);
>  }
>
> -static unsigned
> -get_image_num_layers(const struct intel_mipmap_tree *mt, GLenum target,
> - unsigned level)
> -{
> -   if (target == GL_TEXTURE_CUBE_MAP)
> -  return 6;
> -
> -   return target == GL_TEXTURE_3D ?
> -  minify(mt->surf.logical_level0_px.depth, level) :
> -  mt->surf.logical_level0_px.array_len;
> -}
> -
>  static void
>  update_image_surface(struct brw_context *brw,
>   struct gl_image_unit *u,
> @@ -1541,14 +1529,28 @@ update_image_surface(struct brw_context *brw,
>} else {
>   struct intel_texture_object *intel_obj =
> intel_texture_object(obj);
>   struct intel_mipmap_tree *mt = intel_obj->mt;
> - const unsigned num_layers = u->Layered ?
> -get_image_num_layers(mt, obj->Target, u->Level) : 1;
> +
> + unsigned base_layer, num_layers;
> + if (u->Layered) {
> +if (obj->Target == GL_TEXTURE_3D) {
> +   base_layer = 0;
> +   num_layers = minify(mt->surf.logical_level0_px.depth,
> u->Level);
> +} else {
> +   base_layer = obj->MinLayer;
> +   num_layers = obj->Immutable ?
> +obj->NumLayers :
> +mt->surf.logical_level0_px.array_len;
>

Doesn't this need to be array_len - base_layer?  I'm not sure on the others
without digging.


> +}
> + } else {
> +base_layer = obj->MinLayer + u->_Layer;
> +num_layers = 1;
> + }
>
>   struct isl_view view = {
>  .format = format,
>  .base_level = obj->MinLevel + u->Level,
>  .levels = 1,
> -.base_array_layer = obj->MinLayer + u->_Layer,
> +.base_array_layer = base_layer,
>  .array_len = num_layers,
>  .swizzle = ISL_SWIZZLE_IDENTITY,
>  .usage = ISL_SURF_USAGE_STORAGE_BIT,
> --
> 2.18.0
>
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 16/28] Replace IROUND_POS with _mesa_roundevenf

2018-11-13 Thread Dylan Baker

Quoting Roland Scheidegger (2018-11-13 14:13:00)
> On 13.11.18 at 18:00, Dylan Baker wrote:
> > Quoting Erik Faye-Lund (2018-11-13 01:34:53)
> >> On Mon, 2018-11-12 at 09:22 -0800, Dylan Baker wrote:
> >>> Quoting Erik Faye-Lund (2018-11-12 04:51:47)
> >>>> On Fri, 2018-11-09 at 10:40 -0800, Dylan Baker wrote:
> >>>>> Which has the same behavior.
> >>>>
> >>>> Does it? I'm not so sure... IROUND_POS seems to round to nearest
> >>>> integer depending on the FPU rounding mode, _mesa_roundevenf rounds to
> >>>> the nearest *even* value regardless of the FPU rounding mode, no?
> >>>>
> >>>> I'm not sure if it matters or not, but *at least* point that out in the
> >>>> commit message. Unless I'm missing something, of course...
> >>>
> >>> I should put it in the commit message, but there is a comment in
> >>> rounding.h that
> >>> if you change the rounding mode you get to keep the pieces.
> >>
> >> Well, this might regress performance pretty badly. Especially in the
> >> swrast code, this could be bad...
> >>
> > 
> > Why? we have the assumption that you don't change the rounding mode already 
> > in
> > core mesa and many of the drivers.
> > 
> > For performance, I measured a simple 1000 loops of rounding, and found that 
> > the
> > only way the rounding.h function was slower is if you used the __SSE4_1__
> > path... (It was the same performance as the int cast +0.5 implementation)
> FWIW I'm not entirely sure it's useful to have a sse41 implementation -
> since all sse2 capable cpus can natively do rintf. Although maybe it
> should be pointed out that the sse41 implementation will use a defined
> rounding mode, whereas rintf will use current rounding mode. But I don't
> think anyone ever cares for the results if a different rounding mode
> would be set. Although of course rint and its variant do not actually
> guarantee the even part of it (but well if it's a sse41 capable box we
> pretty much know it would do just that anyway)... (And technically
> nearbyintf would probably be an even better solution, since we never
> want to get involved with the clunky exceptions, otherwise it's
> identical. But there might be reasons why it isn't used.)
> 
> Roland

I'm not convinced we want it either, since it seems to be slower than glibc's
rintf. I guess it probably does make sense to use the nearbyintf instead.

As an aside (since I know 0 about assembly), does _MM_FROUND_CUR_DIRECTION not
check the rounding mode?

Dylan



Re: [Mesa-dev] [PATCH] mesa/st: swap order of clear() and clear_with_quad()

2018-11-13 Thread Rob Clark

On Tue, Nov 13, 2018 at 5:25 PM Eric Anholt  wrote:
>
> Rob Clark  writes:
>
> > If we can't clear all the buffers with pctx->clear() (say, for example,
> > because of ColorMask), push the buffers we *can* clear with pctx->clear()
> > first.  Tilers want to see clears coming before draws to enable fast-
> > paths, and clearing one of the attachments with a quad-draw first
> > confuses that logic.
>
> Oh, nice!
>
> Reviewed-by: Eric Anholt 
>
> Though it feels pretty silly that the ->clear() caller needs a
> clear_with_quad implementation when the ->clear() implementation in the
> driver also needs a clear_with_quad implementation for non-fast-cleared
> buffers.  :/

hmm, so perhaps one easy option is to change pctx->clear() to return a
boolean, so driver can return false to ask the state tracker to do a
clear_with_quad()..  maybe that would be a first step towards allowing
driver to handle clears w/ colormask and possibly scissor (although
for the latter, plus
glInvalidateFramebuffer()/glInvalidateSubFramebuffer(), I was thinking
of pctx->invalidate_surface()/pctx->invalidate_sub_surface()).

But either way, I guess this patch is a simple stop-gap solution.

BR,
-R

Re: [Mesa-dev] [PATCH] mesa/st: swap order of clear() and clear_with_quad()

2018-11-13 Thread Ilia Mirkin

On Tue, Nov 13, 2018 at 6:50 PM Rob Clark  wrote:
>
> On Tue, Nov 13, 2018 at 6:19 PM Eric Anholt  wrote:
> >
> > Rob Clark  writes:
> >
> > > On Tue, Nov 13, 2018 at 5:25 PM Eric Anholt  wrote:
> > >>
> > >> Rob Clark  writes:
> > >>
> > >> > If we can't clear all the buffers with pctx->clear() (say, for example,
> > >> > because of ColorMask), push the buffers we *can* clear with 
> > >> > pctx->clear()
> > >> > first.  Tilers want to see clears coming before draws to enable fast-
> > >> > paths, and clearing one of the attachments with a quad-draw first
> > >> > confuses that logic.
> > >>
> > >> Oh, nice!
> > >>
> > >> Reviewed-by: Eric Anholt 
> > >>
> > >> Though it feels pretty silly that the ->clear() caller needs a
> > >> clear_with_quad implementation when the ->clear() implementation in the
> > >> driver also needs a clear_with_quad implementation for non-fast-cleared
> > >> buffers.  :/
> > >
> > > hmm, so perhaps one easy option is to change pctx->clear() to return a
> > > boolean, so driver can return false to ask the state tracker to do a
> > > clear_with_quad()..  maybe that would be a first step towards allowing
> > > driver to handle clears w/ colormask and possibly scissor (although
> > > for the latter, plus
> > > glInvalidateFramebuffer()/glInvalidateSubFramebuffer(), I was thinking
> > > of pctx->invalidate_surface()/pctx->invalidate_sub_surface()).
> >
> > I was thinking you'd return the mask of what buffers you couldn't (fast)
> > clear.
>
> yeah, makes sense.. I kinda came to same conclusion when I started
> thinking some drivers might not want us to split up the clear per
> attachment.. still not quite sure about adding scissor/colormask,
> might end up needing a pipe cap so st_Clear() would know to flush the
> corresponding state down to driver.  I guess low hanging fruit is to
> not change the definition of pctx->clear() but just let driver ask for
> fallback path for some/all attachments.

You could also create a pipe_clear_info which would take that data
directly and let the driver worry about it. FWIW, nvidia command
stream clears can take into account stencil, scissors, window
rectangles, color masks - maybe everything that st_Clear needs to
worry about. It never seemed important enough to address myself, but
I'll happily go along for the ride.

  -ilia

[Mesa-dev] [Bug 102597] [Regression] mpv, high rendering times (two to three times higher)

2018-11-13 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=102597

--- Comment #10 from Dieter Nützel  ---
Code fix under way:

https://lists.freedesktop.org/archives/mesa-dev/2018-November/209473.html

With this patch, mpv's rendering times drop notably. Note that '--vo=opengl-hq'
is no longer available; it was replaced by '--vo=gpu'.


Re: [Mesa-dev] [ANNOUNCE] Mesa 18.2.5 release candidate

2018-11-13 Thread Matt Turner

On Mon, Nov 12, 2018 at 8:35 AM Juan A. Suarez Romero
 wrote:
>
> Hello list,
>
> The candidate for the Mesa 18.2.5 is now available. Currently we have:
>  - 25 queued
>  - 0 nominated (outstanding)
>  - and 2 rejected patches

If it's not a big deal, it would be convenient for me (for Gentoo) to
have the following patches included in 18.2.5:

efb1ccadca89 ("util/ralloc: Make sizeof(linear_header) a multiple of 8")
 - Maybe needs 7e3748c268cd ("util/ralloc: Switch from DEBUG to NDEBUG")

4eab98b66e7d ("meson: fix libatomic tests")

and the patches to fix https://bugs.freedesktop.org/show_bug.cgi?id=105328#c8

Emil says that the needed commits are

87c156183cd6 ("configure: install KHR/khrplatform.h when needed")
e02f061b690d ("meson: install KHR/khrplatform.h when needed")
f7d42ee7d319 ("include: update GL & GLES headers (v2)")

If they slip to 18.2.6 it's okay.

Thanks!
Matt

Re: [Mesa-dev] [PATCH 16/28] Replace IROUND_POS with _mesa_roundevenf

2018-11-13 Thread Roland Scheidegger

On 14.11.18 at 03:21, Matt Turner wrote:
> On Tue, Nov 13, 2018 at 6:03 PM Roland Scheidegger  wrote:
>>
>> On 13.11.18 at 23:49, Dylan Baker wrote:
>>> Quoting Roland Scheidegger (2018-11-13 14:13:00)
>>>> On 13.11.18 at 18:00, Dylan Baker wrote:
>>>>> Quoting Erik Faye-Lund (2018-11-13 01:34:53)
>>>>>> On Mon, 2018-11-12 at 09:22 -0800, Dylan Baker wrote:
>>>>>>> Quoting Erik Faye-Lund (2018-11-12 04:51:47)
>>>>>>>> On Fri, 2018-11-09 at 10:40 -0800, Dylan Baker wrote:
>>>>>>>>> Which has the same behavior.
>>>>>>>>
>>>>>>>> Does it? I'm not so sure... IROUND_POS seems to round to nearest
>>>>>>>> integer depending on the FPU rounding mode, _mesa_roundevenf rounds to
>>>>>>>> the nearest *even* value regardless of the FPU rounding mode, no?
>>>>>>>>
>>>>>>>> I'm not sure if it matters or not, but *at least* point that out in the
>>>>>>>> commit message. Unless I'm missing something, of course...
>>>>>>>
>>>>>>> I should put it in the commit message, but there is a comment in
>>>>>>> rounding.h that if you change the rounding mode you get to keep the
>>>>>>> pieces.
>>>>>>
>>>>>> Well, this might regress performance pretty badly. Especially in the
>>>>>> swrast code, this could be bad...
>>>>>>
>>>>>
>>>>> Why? we have the assumption that you don't change the rounding mode
>>>>> already in core mesa and many of the drivers.
>>>>>
>>>>> For performance, I measured a simple 1000 loops of rounding, and found
>>>>> that the only way the rounding.h function was slower is if you used the
>>>>> __SSE4_1__ path... (It was the same performance as the int cast +0.5
>>>>> implementation)
>>>> FWIW I'm not entirely sure it's useful to have a sse41 implementation -
>>>> since all sse2 capable cpus can natively do rintf. Although maybe it
>>>> should be pointed out that the sse41 implementation will use a defined
>>>> rounding mode, whereas rintf will use current rounding mode. But I don't
>>>> think anyone ever cares for the results if a different rounding mode
>>>> would be set. Although of course rint and its variant do not actually
>>>> guarantee the even part of it (but well if it's a sse41 capable box we
>>>> pretty much know it would do just that anyway)... (And technically
>>>> nearbyintf would probably be an even better solution, since we never
>>>> want to get involved with the clunky exceptions, otherwise it's
>>>> identical. But there might be reasons why it isn't used.)
>>>>
>>>> Roland
>>>
>>> I'm not convinced we want it either, since it seems to be slower than 
>>> glibc's
>>> rintf. I guess it probably does make sense to use the nearbyintf instead.
>>>
>>> As an aside (since I know 0 about assembly), does _MM_FROUND_CUR_DIRECTION 
>>> not
>>> check the rounding mode?
>> Oh indeed, I didn't check the code too closely (I was just assuming
>> _mm_round_ss() was used because it is possible to use round-to-nearest
>> regardless of the actual rounding mode, but that's not the case).
>>
>> But actually I misread this code: the point of _mesa_roundevenf is to
>> round to float WITHOUT conversion to int. In which case it makes more
>> sense at least at first look...
>>
>> But if you want to round to nearest integer WITH conversion to int, you
>> probably really want to use something else. nearbyint family doesn't
>> have variants which give you ints. There's rint functions which give you
>> ints directly, but they are likely a very bad idea (aside from exception
> 
> Why?
Not sure what the why refers to here?


> 
>> handling, not quite sure if this really causes the compiler to do
>> something different) because of giving you long (or long long) results -
>> meaning that you can't use the simple cpu instructions giving you 32bit
>> results (because conversion to 64bit long + trunc to 32bit will give you
>> defined (although meaningless) results in some cases where direct
>> conversion to 32bit int wouldn't).
>> So ideally you'd pick a variant where the compiler is smart enough to
>> recognize it can be done with a single instruction. I would guess
>> nearbyintf + int cast should do just about everywhere, at least as long
>> as x64 or x86 + sse2 is used, my suspicion is the old IROUND function
>> was done in a time where x87 was still relevant. Or maybe rintf + int
>> cast, no idea how the compiler really handles them differently (I tried
>> to quickly look at it in gcc source, but no idea where those are
>> buried). As a side note, I hate it when the assembly solution is obvious
>> and you can't really figure out how the hell you should coax the
>> compiler into giving you the right answer (I mean, high level languages
>> are there to help, not get in your way...).
> 
> Please read the commit message of
> 
> commit dd0d3a2c0fb388745519c8a3be800720541eccfe
> Author: Matt Turner 
> Date:   Tue Mar 10 17:55:21 2015 -0700
> 
> mesa: Replace _mesa_round_to_even() with _mesa_roundeven().
> 
> for a lot of the background.
> 
> I expect IROUND_POS can be replaced with the _mesa_lroundevenf

[Mesa-dev] [PATCH 3/5] intel/icl: Set way_size_per_bank to 4

2018-11-13 Thread Anuj Phogat

Signed-off-by: Anuj Phogat 
Cc: Kenneth Graunke 
Cc: Francisco Jerez 
Cc: Lionel Landwerlin 
---
 src/intel/common/gen_l3_config.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/intel/common/gen_l3_config.c b/src/intel/common/gen_l3_config.c
index 079608198bc..de16ad23017 100644
--- a/src/intel/common/gen_l3_config.c
+++ b/src/intel/common/gen_l3_config.c
@@ -313,7 +313,8 @@ static unsigned
 get_l3_way_size(const struct gen_device_info *devinfo)
 {
const unsigned way_size_per_bank =
-  devinfo->gen >= 9 && devinfo->l3_banks == 1 ? 4 : 2;
+  (devinfo->gen >= 9 && devinfo->l3_banks == 1) || devinfo->gen == 11 ?
+  4 : 2;
 
assert(devinfo->l3_banks);
return way_size_per_bank * devinfo->l3_banks;
-- 
2.17.1
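
A minimal sketch of the selection logic after this patch, for readers who
want to see the resulting values. The struct `sketch_devinfo` and the
function `sketch_l3_way_size` are hypothetical stand-ins (not Mesa's
`gen_device_info` or `get_l3_way_size`); only the `gen` and `l3_banks`
fields named in the diff are modelled.

```c
#include <assert.h>

/* Hypothetical stand-in for the two gen_device_info fields used in the diff. */
struct sketch_devinfo {
   int gen;
   unsigned l3_banks;
};

/* Mirrors the post-patch expression: a per-bank way size of 4 is chosen for
 * gen >= 9 parts with a single L3 bank, and for gen 11 regardless of the
 * bank count; everything else keeps 2. */
unsigned
sketch_l3_way_size(const struct sketch_devinfo *devinfo)
{
   const unsigned way_size_per_bank =
      (devinfo->gen >= 9 && devinfo->l3_banks == 1) || devinfo->gen == 11 ?
      4 : 2;

   assert(devinfo->l3_banks);
   return way_size_per_bank * devinfo->l3_banks;
}
```

The conditional operator applies to the whole `||` expression, so the added
parentheses around the `&&` term only make the existing precedence explicit.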


[Mesa-dev] [PATCH 5/5] anv/icl: Set use full ways in L3CNTLREG

2018-11-13 Thread Anuj Phogat

L3 allocation table in h/w specification recommends using 4 KB
granularity for programming allocation fields in L3CNTLREG.

Signed-off-by: Anuj Phogat 
Cc: Kenneth Graunke 
Cc: Francisco Jerez 
Cc: Lionel Landwerlin 
---
 src/intel/genxml/gen11.xml | 1 +
 src/intel/vulkan/genX_cmd_buffer.c | 1 +
 2 files changed, 2 insertions(+)

diff --git a/src/intel/genxml/gen11.xml b/src/intel/genxml/gen11.xml
index b975fe94776..1239ed011ed 100644
--- a/src/intel/genxml/gen11.xml
+++ b/src/intel/genxml/gen11.xml
@@ -3547,6 +3547,7 @@
 
 
 
+
 
 
 
diff --git a/src/intel/vulkan/genX_cmd_buffer.c 
b/src/intel/vulkan/genX_cmd_buffer.c
index ed88157170d..c7e5ef9596e 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -1623,6 +1623,7 @@ genX(cmd_buffer_config_l3)(struct anv_cmd_buffer 
*cmd_buffer,
 * desirable behavior.
*/
.ErrorDetectionBehaviorControl = true,
+   .UseFullWays = true,
 #endif
.URBAllocation = cfg->n[GEN_L3P_URB],
.ROAllocation = cfg->n[GEN_L3P_RO],
-- 
2.17.1


[Mesa-dev] [PATCH 1/5] i965/icl: Fix L3 configurations

2018-11-13 Thread Anuj Phogat

Use L3 configuration table specified in h/w specification.

Signed-off-by: Anuj Phogat 
Cc: Kenneth Graunke 
Cc: Francisco Jerez 
Cc: Lionel Landwerlin 
---
 src/intel/common/gen_l3_config.c | 16 ++--
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/src/intel/common/gen_l3_config.c b/src/intel/common/gen_l3_config.c
index b977c6ab136..079608198bc 100644
--- a/src/intel/common/gen_l3_config.c
+++ b/src/intel/common/gen_l3_config.c
@@ -137,12 +137,16 @@ static const struct gen_l3_config cnl_l3_configs[] = {
  */
 static const struct gen_l3_config icl_l3_configs[] = {
/* SLM URB ALL DC  RO  IS   C   T */
-   {{  0, 64, 64,  0,  0,  0,  0,  0 }},
-   {{  0, 64,  0, 16, 48,  0,  0,  0 }},
-   {{  0, 48,  0, 16, 64,  0,  0,  0 }},
-   {{  0, 32,  0,  0, 96,  0,  0,  0 }},
-   {{  0, 32, 96,  0,  0,  0,  0,  0 }},
-   {{  0, 32,  0, 16, 80,  0,  0,  0 }},
+   {{  0, 32, 32,  0,  0,  0,  0,  0 }},
+   {{  0, 32, 28,  0,  0,  0,  0,  0 }},
+   {{  0, 24,  0,  8, 28,  0,  0,  0 }},
+   {{  0, 16,  0,  0, 44,  0,  0,  0 }},
+   {{  0, 16, 12,  0,  0,  0,  0,  0 }},
+   {{  0, 16,  0,  0, 12,  0,  0,  0 }},
+   {{  0, 16, 80,  0,  0,  0,  0,  0 }},
+   {{  0, 16, 48,  0,  0,  0,  0,  0 }},
+   {{  0, 16, 44,  0,  0,  0,  0,  0 }},
+   {{  0, 32, 64,  0,  0,  0,  0,  0 }},
{{  0 }}
 };
 
-- 
2.17.1


[Mesa-dev] [PATCH 2/5] i965/icl: Set use full ways in L3CNTLREG

2018-11-13 Thread Anuj Phogat

L3 allocation table in h/w specification recommends using 4 KB
granularity for programming allocation fields in L3CNTLREG.

Signed-off-by: Anuj Phogat 
Cc: Kenneth Graunke 
Cc: Francisco Jerez 
Cc: Lionel Landwerlin 
---
 src/mesa/drivers/dri/i965/brw_defines.h   | 1 +
 src/mesa/drivers/dri/i965/gen7_l3_state.c | 1 +
 2 files changed, 2 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_defines.h 
b/src/mesa/drivers/dri/i965/brw_defines.h
index 897c91aa31e..b8ada02d6eb 100644
--- a/src/mesa/drivers/dri/i965/brw_defines.h
+++ b/src/mesa/drivers/dri/i965/brw_defines.h
@@ -1647,6 +1647,7 @@ enum brw_pixel_shader_coverage_mask_mode {
 # define GEN8_L3CNTLREG_ALL_ALLOC_SHIFT 25
 # define GEN8_L3CNTLREG_ALL_ALLOC_MASK INTEL_MASK(31, 25)
 # define GEN8_L3CNTLREG_EDBC_NO_HANG   (1 << 9)
+# define GEN8_L3CNTLREG_USE_FULL_WAYS  (1 << 10)
 
 #define GEN10_CACHE_MODE_SS 0x0e420
 #define GEN10_FLOAT_BLEND_OPTIMIZATION_ENABLE (1 << 4)
diff --git a/src/mesa/drivers/dri/i965/gen7_l3_state.c 
b/src/mesa/drivers/dri/i965/gen7_l3_state.c
index 8c6c4c47481..fb9b2703a50 100644
--- a/src/mesa/drivers/dri/i965/gen7_l3_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_l3_state.c
@@ -119,6 +119,7 @@ setup_l3_config(struct brw_context *brw, const struct 
gen_l3_config *cfg)
   assert(!cfg->n[GEN_L3P_IS] && !cfg->n[GEN_L3P_C] && !cfg->n[GEN_L3P_T]);
 
   const unsigned imm_data = ((has_slm ? GEN8_L3CNTLREG_SLM_ENABLE : 0) |
+ (devinfo->gen == 11 ? GEN8_L3CNTLREG_USE_FULL_WAYS : 0) |
  SET_FIELD(cfg->n[GEN_L3P_URB], GEN8_L3CNTLREG_URB_ALLOC) |
  SET_FIELD(cfg->n[GEN_L3P_RO], GEN8_L3CNTLREG_RO_ALLOC) |
  SET_FIELD(cfg->n[GEN_L3P_DC], GEN8_L3CNTLREG_DC_ALLOC) |
-- 
2.17.1
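
A rough sketch of how the new bit combines with a shifted allocation field.
All `SKETCH_*` names and `sketch_l3cntlreg` are hypothetical; the bit
position (1 << 10) and the ALL_ALLOC shift/mask (25, bits 31:25) are taken
from the diff above, and the mask helper is only assumed to behave like the
`INTEL_MASK` macro it imitates.

```c
#include <stdint.h>

/* Assumed equivalent of an inclusive bit-range mask such as INTEL_MASK(). */
#define SKETCH_MASK(high, low) (((1u << ((high) - (low) + 1)) - 1) << (low))

#define SKETCH_L3CNTLREG_USE_FULL_WAYS   (1u << 10)
#define SKETCH_L3CNTLREG_ALL_ALLOC_SHIFT 25
#define SKETCH_L3CNTLREG_ALL_ALLOC_MASK  SKETCH_MASK(31, 25)

/* Build a register value: the "use full ways" bit is set only on gen 11,
 * and the allocation count is shifted into its field and masked. */
uint32_t
sketch_l3cntlreg(int gen, uint32_t all_alloc)
{
   return (gen == 11 ? SKETCH_L3CNTLREG_USE_FULL_WAYS : 0) |
          ((all_alloc << SKETCH_L3CNTLREG_ALL_ALLOC_SHIFT) &
           SKETCH_L3CNTLREG_ALL_ALLOC_MASK);
}
```

Only the ALL field is modelled here; the real `imm_data` in the patch also
packs the SLM enable bit and the URB, RO and DC fields.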


Re: [Mesa-dev] [PATCH v3] i965: Fix calculation of layers array length for isl_view

2018-11-13 Thread Ilia Mirkin

On Tue, Nov 13, 2018 at 4:53 PM Jason Ekstrand  wrote:
>
> On Mon, Sep 10, 2018 at 10:21 AM Danylo Piliaiev  
> wrote:
>>
>> Handle all cases in calculation of layers count for isl_view
>> taking into account texture view and image unit.
>> st_convert_image was taken as a reference.
>>
>> When u->Layered is true the whole level is taken with respect to
>> image view. In other case only one layer is taken.
>>
>> v3: (Józef Kucia and Ilia Mirkin)
>> - Rewrote patch by taking st_convert_image as a reference
>> - Removed now unused get_image_num_layers function
>> - Changed commit message
>>
>> Fixes: 5a8c8903
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107856
>>
>> Signed-off-by: Danylo Piliaiev 
>> ---
>>  .../drivers/dri/i965/brw_wm_surface_state.c   | 32 ++-
>>  1 file changed, 17 insertions(+), 15 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
>> b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
>> index 944762ec46..9bfe6e2037 100644
>> --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
>> +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
>> @@ -1499,18 +1499,6 @@ update_buffer_image_param(struct brw_context *brw,
>> param->stride[0] = _mesa_get_format_bytes(u->_ActualFormat);
>>  }
>>
>> -static unsigned
>> -get_image_num_layers(const struct intel_mipmap_tree *mt, GLenum target,
>> - unsigned level)
>> -{
>> -   if (target == GL_TEXTURE_CUBE_MAP)
>> -  return 6;
>> -
>> -   return target == GL_TEXTURE_3D ?
>> -  minify(mt->surf.logical_level0_px.depth, level) :
>> -  mt->surf.logical_level0_px.array_len;
>> -}
>> -
>>  static void
>>  update_image_surface(struct brw_context *brw,
>>   struct gl_image_unit *u,
>> @@ -1541,14 +1529,28 @@ update_image_surface(struct brw_context *brw,
>>} else {
>>   struct intel_texture_object *intel_obj = intel_texture_object(obj);
>>   struct intel_mipmap_tree *mt = intel_obj->mt;
>> - const unsigned num_layers = u->Layered ?
>> -get_image_num_layers(mt, obj->Target, u->Level) : 1;
>> +
>> + unsigned base_layer, num_layers;
>> + if (u->Layered) {
>> +if (obj->Target == GL_TEXTURE_3D) {
>> +   base_layer = 0;
>> +   num_layers = minify(mt->surf.logical_level0_px.depth, 
>> u->Level);
>> +} else {
>> +   base_layer = obj->MinLayer;
>> +   num_layers = obj->Immutable ?
>> +obj->NumLayers :
>> +mt->surf.logical_level0_px.array_len;
>
>
> Doesn't this need to be array_len - base_layer?  I'm not sure on the others 
> without digging.

Probably not intuitively obvious, but MinLayer/NumLayers are only set
for Immutable textures.

  -ilia
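
A small sketch of the layer selection in the quoted patch, to make the point
above concrete. The `sketch_*` types and functions are hypothetical
simplifications (not Mesa's `gl_texture_object`, `gl_image_unit` or
`isl_view`), and the sketch assumes, as stated above, that `min_layer` is 0
for textures that are not immutable.

```c
#include <stdbool.h>

/* Hypothetical, flattened stand-ins for the fields the patch reads. */
struct sketch_tex {
   bool is_3d;          /* obj->Target == GL_TEXTURE_3D */
   bool immutable;      /* obj->Immutable */
   unsigned min_layer;  /* obj->MinLayer, assumed 0 unless immutable */
   unsigned num_layers; /* obj->NumLayers, meaningful only if immutable */
   unsigned depth0;     /* level-0 depth of a 3D texture */
   unsigned array_len;  /* level-0 array length */
};

struct sketch_unit {
   bool layered;   /* u->Layered */
   unsigned level; /* u->Level */
   unsigned layer; /* u->_Layer */
};

struct sketch_view {
   unsigned base_layer;
   unsigned num_layers;
};

/* Size of a dimension at a given mip level, never smaller than 1. */
unsigned
sketch_minify(unsigned value, unsigned level)
{
   unsigned v = value >> level;
   return v ? v : 1;
}

struct sketch_view
sketch_view_layers(const struct sketch_tex *t, const struct sketch_unit *u)
{
   struct sketch_view v;

   if (u->layered) {
      if (t->is_3d) {
         v.base_layer = 0;
         v.num_layers = sketch_minify(t->depth0, u->level);
      } else {
         v.base_layer = t->min_layer;
         /* With min_layer == 0 for mutable textures, array_len needs no
          * "- base_layer" adjustment. */
         v.num_layers = t->immutable ? t->num_layers : t->array_len;
      }
   } else {
      v.base_layer = t->min_layer + u->layer;
      v.num_layers = 1;
   }
   return v;
}
```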

Re: [Mesa-dev] [PATCH 1/2] radeonsi: don't send data after write-confirm with BOTTOM_OF_PIPE_TS

2018-11-13 Thread Dieter Nützel


For the series

Tested-by: Dieter Nützel 

mpv's rendering times drop notably. Note that '--vo=opengl-hq' is no longer
available; it was replaced by '--vo=gpu'.


Dieter

On 13.11.2018 at 22:23, Marek Olšák wrote:

From: Marek Olšák 

There are no writes.
---
 src/gallium/drivers/radeonsi/si_fence.c   | 3 +--
 src/gallium/drivers/radeonsi/si_perfcounter.c | 3 +--
 src/gallium/drivers/radeonsi/si_query.c   | 8 +++-
 3 files changed, 5 insertions(+), 9 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_fence.c
b/src/gallium/drivers/radeonsi/si_fence.c
index 3f22ee31ae8..d385f445774 100644
--- a/src/gallium/drivers/radeonsi/si_fence.c
+++ b/src/gallium/drivers/radeonsi/si_fence.c
@@ -270,22 +270,21 @@ static void si_fine_fence_set(struct si_context 
*ctx,

radeon_emit(cs, PKT3(PKT3_WRITE_DATA, 3, 0));
radeon_emit(cs, S_370_DST_SEL(V_370_MEM_ASYNC) |
S_370_WR_CONFIRM(1) |
S_370_ENGINE_SEL(V_370_PFP));
radeon_emit(cs, fence_va);
radeon_emit(cs, fence_va >> 32);
radeon_emit(cs, 0x8000);
} else if (flags & PIPE_FLUSH_BOTTOM_OF_PIPE) {
si_cp_release_mem(ctx,
  V_028A90_BOTTOM_OF_PIPE_TS, 0,
- EOP_DST_SEL_MEM,
- EOP_INT_SEL_SEND_DATA_AFTER_WR_CONFIRM,
+ EOP_DST_SEL_MEM, EOP_INT_SEL_NONE,
  EOP_DATA_SEL_VALUE_32BIT,
  NULL, fence_va, 0x8000,
  PIPE_QUERY_GPU_FINISHED);
} else {
assert(false);
}
 }

 static boolean si_fence_finish(struct pipe_screen *screen,
   struct pipe_context *ctx,
diff --git a/src/gallium/drivers/radeonsi/si_perfcounter.c
b/src/gallium/drivers/radeonsi/si_perfcounter.c
index 2ca6d2d7410..cea7d57e518 100644
--- a/src/gallium/drivers/radeonsi/si_perfcounter.c
+++ b/src/gallium/drivers/radeonsi/si_perfcounter.c
@@ -574,22 +574,21 @@ static void si_pc_emit_start(struct si_context 
*sctx,

 }

 /* Note: The buffer was already added in si_pc_emit_start, so we don't have to
  * do it again in here. */
 static void si_pc_emit_stop(struct si_context *sctx,
struct r600_resource *buffer, uint64_t va)
 {
struct radeon_cmdbuf *cs = sctx->gfx_cs;

si_cp_release_mem(sctx, V_028A90_BOTTOM_OF_PIPE_TS, 0,
- EOP_DST_SEL_MEM,
- EOP_INT_SEL_SEND_DATA_AFTER_WR_CONFIRM,
+ EOP_DST_SEL_MEM, EOP_INT_SEL_NONE,
  EOP_DATA_SEL_VALUE_32BIT,
  buffer, va, 0, SI_NOT_QUERY);
si_cp_wait_mem(sctx, va, 0, 0x, 0);

radeon_emit(cs, PKT3(PKT3_EVENT_WRITE, 0, 0));
 	radeon_emit(cs, EVENT_TYPE(V_028A90_PERFCOUNTER_SAMPLE) | 
EVENT_INDEX(0));

radeon_emit(cs, PKT3(PKT3_EVENT_WRITE, 0, 0));
 	radeon_emit(cs, EVENT_TYPE(V_028A90_PERFCOUNTER_STOP) | 
EVENT_INDEX(0));

radeon_set_uconfig_reg(cs, R_036020_CP_PERFMON_CNTL,
   S_036020_PERFMON_STATE(V_036020_STOP_COUNTING) |
diff --git a/src/gallium/drivers/radeonsi/si_query.c
b/src/gallium/drivers/radeonsi/si_query.c
index 9b09c74d48a..21b9aeeac28 100644
--- a/src/gallium/drivers/radeonsi/si_query.c
+++ b/src/gallium/drivers/radeonsi/si_query.c
@@ -883,23 +883,22 @@ static void si_query_hw_do_emit_stop(struct
si_context *sctx,
break;
case PIPE_QUERY_SO_OVERFLOW_ANY_PREDICATE:
va += 16;
for (unsigned stream = 0; stream < SI_MAX_STREAMS; ++stream)
emit_sample_streamout(cs, va + 32 * stream, stream);
break;
case PIPE_QUERY_TIME_ELAPSED:
va += 8;
/* fall through */
case PIPE_QUERY_TIMESTAMP:
-   si_cp_release_mem(sctx, V_028A90_BOTTOM_OF_PIPE_TS,
- 0, EOP_DST_SEL_MEM,
- EOP_INT_SEL_SEND_DATA_AFTER_WR_CONFIRM,
+   si_cp_release_mem(sctx, V_028A90_BOTTOM_OF_PIPE_TS, 0,
+ EOP_DST_SEL_MEM, EOP_INT_SEL_NONE,
  EOP_DATA_SEL_TIMESTAMP, NULL, va,
  0, query->b.type);
fence_va = va + 8;
break;
case PIPE_QUERY_PIPELINE_STATISTICS: {
unsigned sample_size = (query->result_size - 8) / 2;

va += sample_size;
radeon_emit(cs, PKT3(PKT3_EVENT_WRITE, 2, 0));
 		radeon_emit(cs, EVENT_TYPE(V_028A90_SAMPLE_PIPELINESTAT) | 
EVENT_INDEX(2));

@@ -910,22 +909,21 @@ static void si_query_hw_do_emit_stop(struct
si_context *sctx,
break;
}
default:
assert(0);
}

Re: [Mesa-dev] [PATCH] mesa/st: swap order of clear() and clear_with_quad()

2018-11-13 Thread Eric Anholt

Rob Clark  writes:

> On Tue, Nov 13, 2018 at 5:25 PM Eric Anholt  wrote:
>>
>> Rob Clark  writes:
>>
>> > If we can't clear all the buffers with pctx->clear() (say, for example,
>> > because of ColorMask), push the buffers we *can* clear with pctx->clear()
>> > first.  Tilers want to see clears coming before draws to enable fast-
>> > paths, and clearing one of the attachments with a quad-draw first
>> > confuses that logic.
>>
>> Oh, nice!
>>
>> Reviewed-by: Eric Anholt 
>>
>> Though it feels pretty silly that the ->clear() caller needs a
>> clear_with_quad implementation when the ->clear() implementation in the
>> driver also needs a clear_with_quad implementation for non-fast-cleared
>> buffers.  :/
>
> hmm, so perhaps one easy option is to change pctx->clear() to return a
> boolean, so driver can return false to ask the state tracker to do a
> clear_with_quad()..  maybe that would be a first step towards allowing
> driver to handle clears w/ colormask and possibly scissor (although
> for the latter, plus
> glInvalidateFramebuffer()/glInvalidateSubFramebuffer(), I was thinking
> of pctx->invalidate_surface()/pctx->invalidate_sub_surface()).

I was thinking you'd return the mask of what buffers you couldn't (fast)
clear.
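
To illustrate the idea, here is a very rough sketch of a clear hook that
returns the mask of buffers it could not handle, with the caller falling
back to a quad draw for the remainder. Everything named `sketch_*` /
`SKETCH_*` is hypothetical; this is not the Gallium `pipe_context`
interface or its `PIPE_CLEAR_*` bits, just the control flow.

```c
#include <stdint.h>

/* Hypothetical buffer bits, for illustration only. */
enum {
   SKETCH_CLEAR_COLOR0  = 1u << 0,
   SKETCH_CLEAR_DEPTH   = 1u << 1,
   SKETCH_CLEAR_STENCIL = 1u << 2,
};

/* Hypothetical driver state that just records what each path handled. */
struct sketch_driver {
   uint32_t fast_clearable; /* buffers the driver can fast-clear right now */
   uint32_t fast_cleared;   /* buffers handled by sketch_clear() */
   uint32_t quad_cleared;   /* buffers handled by the quad fallback */
};

/* Driver side: clear what is possible, return the mask that was NOT cleared. */
uint32_t
sketch_clear(struct sketch_driver *drv, uint32_t buffers)
{
   const uint32_t done = buffers & drv->fast_clearable;
   drv->fast_cleared |= done;
   return buffers & ~done;
}

void
sketch_clear_with_quad(struct sketch_driver *drv, uint32_t buffers)
{
   drv->quad_cleared |= buffers;
}

/* Caller side: issue the fast clears first, then draw a quad for the rest. */
void
sketch_st_clear(struct sketch_driver *drv, uint32_t buffers)
{
   const uint32_t leftover = sketch_clear(drv, buffers);
   if (leftover)
      sketch_clear_with_quad(drv, leftover);
}
```

Returning a mask instead of a boolean lets the driver accept part of a
request, which keeps the "clears before draws" ordering from the patch.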



Re: [Mesa-dev] [PATCH 16/28] Replace IROUND_POS with _mesa_roundevenf

2018-11-13 Thread Roland Scheidegger

On 13.11.18 at 23:49, Dylan Baker wrote:
> Quoting Roland Scheidegger (2018-11-13 14:13:00)
>> On 13.11.18 at 18:00, Dylan Baker wrote:
>>> Quoting Erik Faye-Lund (2018-11-13 01:34:53)
>>>> On Mon, 2018-11-12 at 09:22 -0800, Dylan Baker wrote:
>>>>> Quoting Erik Faye-Lund (2018-11-12 04:51:47)
>>>>>> On Fri, 2018-11-09 at 10:40 -0800, Dylan Baker wrote:
>>>>>>> Which has the same behavior.
>>>>>>
>>>>>> Does it? I'm not so sure... IROUND_POS seems to round to nearest
>>>>>> integer depending on the FPU rounding mode, _mesa_roundevenf rounds to
>>>>>> the nearest *even* value regardless of the FPU rounding mode, no?
>>>>>>
>>>>>> I'm not sure if it matters or not, but *at least* point that out in the
>>>>>> commit message. Unless I'm missing something, of course...
>>>>>
>>>>> I should put it in the commit message, but there is a comment in
>>>>> rounding.h that if you change the rounding mode you get to keep the
>>>>> pieces.
>>>>
>>>> Well, this might regress performance pretty badly. Especially in the
>>>> swrast code, this could be bad...
>>>>
>>>
>>> Why? we have the assumption that you don't change the rounding mode already 
>>> in
>>> core mesa and many of the drivers.
>>>
>>> For performance, I measured a simple 1000 loops of rounding, and found that 
>>> the
>>> only way the rounding.h function was slower is if you used the __SSE4_1__
>>> path... (It was the same performance as the int cast +0.5 implementation)
>> FWIW I'm not entirely sure it's useful to have a sse41 implementation -
>> since all sse2 capable cpus can natively do rintf. Although maybe it
>> should be pointed out that the sse41 implementation will use a defined
>> rounding mode, whereas rintf will use current rounding mode. But I don't
>> think anyone ever cares for the results if a different rounding mode
>> would be set. Although of course rint and its variant do not actually
>> guarantee the even part of it (but well if it's a sse41 capable box we
>> pretty much know it would do just that anyway)... (And technically
>> nearbyintf would probably be an even better solution, since we never
>> want to get involved with the clunky exceptions, otherwise it's
>> identical. But there might be reasons why it isn't used.)
>>
>> Roland
> 
> I'm not convinced we want it either, since it seems to be slower than glibc's
> rintf. I guess it probably does make sense to use the nearbyintf instead.
> 
> As an aside (since I know 0 about assembly), does _MM_FROUND_CUR_DIRECTION not
> check the rounding mode?
Oh indeed, I didn't check the code too closely (I was just assuming
_mm_round_ss() was used because it is possible to use round-to-nearest
regardless of the actual rounding mode, but that's not the case).

But actually I misread this code: the point of _mesa_roundevenf is to
round to float WITHOUT conversion to int. In which case it makes more
sense at least at first look...

But if you want to round to nearest integer WITH conversion to int, you
probably really want to use something else. nearbyint family doesn't
have variants which give you ints. There's rint functions which give you
ints directly, but they are likely a very bad idea (aside from exception
handling, not quite sure if this really causes the compiler to do
something different) because of giving you long (or long long) results -
meaning that you can't use the simple cpu instructions giving you 32bit
results (because conversion to 64bit long + trunc to 32bit will give you
defined (although meaningless) results in some cases where direct
conversion to 32bit int wouldn't).
So ideally you'd pick a variant where the compiler is smart enough to
recognize it can be done with a single instruction. I would guess
nearbyintf + int cast should do just about everywhere, at least as long
as x64 or x86 + sse2 is used, my suspicion is the old IROUND function
was done in a time where x87 was still relevant. Or maybe rintf + int
cast, no idea how the compiler really handles them differently (I tried
to quickly look at it in gcc source, but no idea where those are
buried). As a side note, I hate it when the assembly solution is obvious
and you can't really figure out how the hell you should coax the
compiler into giving you the right answer (I mean, high level languages
are there to help, not get in your way...).

All that said, I still don't really see the point of the manual sse41
assembly (even for the case when we don't want to convert to int) -
assuming there is an easy solution to get the compiler to do the right
thing...

Roland

> 
> Dylan
> 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 16/28] Replace IROUND_POS with _mesa_roundevenf

2018-11-13 Thread Matt Turner

On Tue, Nov 13, 2018 at 6:03 PM Roland Scheidegger  wrote:
>
> Am 13.11.18 um 23:49 schrieb Dylan Baker:
> > Quoting Roland Scheidegger (2018-11-13 14:13:00)
> >> Am 13.11.18 um 18:00 schrieb Dylan Baker:
> >>> Quoting Erik Faye-Lund (2018-11-13 01:34:53)
> >>>> On Mon, 2018-11-12 at 09:22 -0800, Dylan Baker wrote:
> >>>>> Quoting Erik Faye-Lund (2018-11-12 04:51:47)
> >>>>>> On Fri, 2018-11-09 at 10:40 -0800, Dylan Baker wrote:
> >>>>>>> Which has the same behavior.
> >>>>>>
> >>>>>> Does it? I'm not so sure... IROUND_POS seems to round to nearest
> >>>>>> integer depending on the FPU rounding mode, _mesa_roundevenf rounds to
> >>>>>> the nearest *even* value regardless of the FPU rounding mode, no?
> >>>>>>
> >>>>>> I'm not sure if it matters or not, but *at least* point that out in the
> >>>>>> commit message. Unless I'm missing something, of course...
> >>>>>
> >>>>> I should put it in the commit message, but there is a comment in
> >>>>> rounding.h that if you change the rounding mode you get to keep the pieces.
> >>>>
> >>>> Well, this might regress performance pretty badly. Especially in the
> >>>> swrast code, this could be bad...
> >>>>
> >>>
> >>> Why? we have the assumption that you don't change the rounding mode
> >>> already in core mesa and many of the drivers.
> >>>
> >>> For performance, I measured a simple 1000 loops of rounding, and found
> >>> that the only way the rounding.h function was slower is if you used the
> >>> __SSE4_1__ path... (It was the same performance as the int cast +0.5
> >>> implementation)
> >> FWIW I'm not entirely sure it's useful to have a sse41 implementation -
> >> since all sse2 capable cpus can natively do rintf. Although maybe it
> >> should be pointed out that the sse41 implementation will use a defined
> >> rounding mode, whereas rintf will use current rounding mode. But I don't
> >> think anyone ever cares for the results if a different rounding mode
> >> would be set. Although of course rint and its variant do not actually
> >> guarantee the even part of it (but well if it's a sse41 capable box we
> >> pretty much know it would do just that anyway)... (And technically
> >> nearbyintf would probably be an even better solution, since we never
> >> want to get involved with the clunky exceptions, otherwise it's
> >> identical. But there might be reasons why it isn't used.)
> >>
> >> Roland
> >
> > I'm not convinced we want it either, since it seems to be slower than
> > glibc's rintf. I guess it probably does make sense to use the nearbyintf
> > instead.
> >
> > As an aside (since I know 0 about assembly), does _MM_FROUND_CUR_DIRECTION
> > not check the rounding mode?
> Oh indeed, I didn't check the code too closely (I was just assuming
> _mm_round_ss() was used because it is possible to use round-to-nearest
> regardless of the actual rounding mode, but that's not the case).
>
> But actually I misread this code: the point of mesa_roundevenf is to
> round to float WITHOUT conversion to int. In which case it makes more
> sense at least at first look...
>
> But if you want to round to nearest integer WITH conversion to int, you
> probably really want to use something else. nearbyint family doesn't
> have variants which give you ints. There's rint functions which give you
> ints directly, but they are likely a very bad idea (aside from exception

Why?

> handling, not quite sure if this really causes the compiler to do
> something different) because of giving you long (or long long) results -
> meaning that you can't use the simple cpu instructions giving you 32bit
> results (because conversion to 64bit long + trunc to 32bit will give you
> defined (although meaningless) results in some cases where direct
> conversion to 32bit int wouldn't).
> So ideally you'd pick a variant where the compiler is smart enough to
> recognize it can be done with a single instruction. I would guess
> nearbyintf + int cast should do just about everywhere, at least as long
> as x64 or x86 + sse2 is used, my suspicion is the old IROUND function
> was done in a time where x87 was still relevant. Or maybe rintf + int
> cast, no idea how the compiler really handles them differently (I tried
> to quickly look at it in gcc source, but no idea where those are
> buried). As a side note, I hate it when the assembly solution is obvious
> and you can't really figure out how the hell you should coax the
> compiler in giving you the right answer (I mean, high level languages
> are there to help, not get in your way...).

Please read the commit message of

commit dd0d3a2c0fb388745519c8a3be800720541eccfe
Author: Matt Turner 
Date:   Tue Mar 10 17:55:21 2015 -0700

mesa: Replace _mesa_round_to_even() with _mesa_roundeven().

for a lot of the background.

I expect IROUND_POS can be replaced with the _mesa_lroundevenf function.

Re: [Mesa-dev] [PATCH] st/mesa: don't do L3 thread pinning for Blender

2018-11-13 Thread Edmondo Tommasina

Hi Marek

Sure. Thanks for writing these patches.

They look good.

I've done some small testing:

The drawoverhead numbers look great to my eyes:
  29: DrawElements ( 1 VBO, 8 UBO,  8 Tex) w/ sample mask enable change:
  6.63 million (94.7%)

The Hitman benchmark runs nicely, even a little faster than before, and
uses all the cores:
 52.61fps Average

The Tomb Raider benchmark is fine:
 105 FPS

Thanks
edmondo


On Tue, Nov 13, 2018 at 1:21 AM Marek Olšák  wrote:

> Hi Edmondo,
>
> can you test the two attached patches? They re-enable and rework the
> thread pinning.
>
> Thanks,
> Marek
>
> On Mon, Nov 12, 2018 at 4:31 PM Edmondo Tommasina <
> edmondo.tommas...@gmail.com> wrote:
>
>> On Mon, Nov 12, 2018 at 6:43 PM Michel Dänzer  wrote:
>>
>>> On 2018-11-08 6:23 a.m., Marek Olšák wrote:
>>> > Thanks a lot man. I'll reconsider this depending on the results I
>>> receive.
>>> >
>>> > I may also just pin the Mesa threads and keep the app thread intact. It
>>> > should perform OK with glthread, but not without glthread.
>>> >
>>> > Another option is to have the gallium and winsys threads "chase" the
>>> main
>>> > thread within the CPU by changing the thread affinity based on
>>> getcpu().
>>>
>>> While those are interesting ideas for the future, I'm afraid it's too
>>> late for them for the 18.3.0 release (scheduled for November 21st IIRC).
>>>
>>> Please make sure the thread pinning code is disabled for the release, at
>>> least by default.
>>>
>>
>> I'm not sure what the best solution is, but pinning the threads to
>> the L3 CCX has shown great potential on my Ryzen 5 2600 and it would
>> be nice to explore the ideas presented by Marek or maybe understand,
>> why the kernel scheduler prefers to put the threads on cores on
>> different CCX.
>>
>> For example, The Witcher 2 goes from 60 FPS to 70 FPS average, which
>> is impressive. Tomb Raider increases by only about 1 FPS (average 104
>> FPS), but this may just be noise and is certainly not noticeable.
>>
>> Regards
>> edmondo
>>
>>

Re: [Mesa-dev] [PATCH v2] nir: Allow to skip integer ops in nir_lower_to_source_mods

2018-11-13 Thread Jason Ekstrand

Looks correct.

Reviewed-by: Jason Ekstrand 

On Mon, Nov 12, 2018 at 2:17 AM Gert Wollny  wrote:

> From: Gert Wollny 
>
> Some hardware supports source mods only for float operations. Make it
> possible to skip lowering to source mods in these cases.
>
> v2: use option flags instead of a boolean (Jason Ekstrand)
>
> Signed-off-by: Gert Wollny 
> ---
>  src/compiler/nir/nir.h  | 10 ++-
>  src/compiler/nir/nir_lower_to_source_mods.c | 78 +
>  src/intel/compiler/brw_nir.c|  2 +-
>  3 files changed, 58 insertions(+), 32 deletions(-)
>
> diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
> index dc3c729dee..c4601ed218 100644
> --- a/src/compiler/nir/nir.h
> +++ b/src/compiler/nir/nir.h
> @@ -3013,7 +3013,15 @@ typedef struct nir_lower_bitmap_options {
>  void nir_lower_bitmap(nir_shader *shader, const nir_lower_bitmap_options
> *options);
>
>  bool nir_lower_atomics_to_ssbo(nir_shader *shader, unsigned ssbo_offset);
> -bool nir_lower_to_source_mods(nir_shader *shader);
> +
> +typedef enum  {
> +   nir_lower_int_source_mods = 1 << 0,
> +   nir_lower_float_source_mods = 1 << 1,
> +   nir_lower_all_source_mods = (1 << 2) - 1
> +} nir_lower_to_source_mods_flags;
> +
> +
> +bool nir_lower_to_source_mods(nir_shader *shader,
> nir_lower_to_source_mods_flags options);
>
>  bool nir_lower_gs_intrinsics(nir_shader *shader);
>
> diff --git a/src/compiler/nir/nir_lower_to_source_mods.c
> b/src/compiler/nir/nir_lower_to_source_mods.c
> index 077ca53704..657bf8a3d7 100644
> --- a/src/compiler/nir/nir_lower_to_source_mods.c
> +++ b/src/compiler/nir/nir_lower_to_source_mods.c
> @@ -34,7 +34,8 @@
>   */
>
>  static bool
> -nir_lower_to_source_mods_block(nir_block *block)
> +nir_lower_to_source_mods_block(nir_block *block,
> +   nir_lower_to_source_mods_flags options)
>  {
> bool progress = false;
>
> @@ -58,10 +59,14 @@ nir_lower_to_source_mods_block(nir_block *block)
>
>   switch
> (nir_alu_type_get_base_type(nir_op_infos[alu->op].input_types[i])) {
>   case nir_type_float:
> +if (!(options & nir_lower_float_source_mods))
> +   continue;
>  if (parent->op != nir_op_fmov)
> continue;
>  break;
>   case nir_type_int:
> +if (!(options & nir_lower_int_source_mods))
> +   continue;
>  if (parent->op != nir_op_imov)
> continue;
>  break;
> @@ -97,33 +102,41 @@ nir_lower_to_source_mods_block(nir_block *block)
>   progress = true;
>}
>
> -  switch (alu->op) {
> -  case nir_op_fsat:
> - alu->op = nir_op_fmov;
> - alu->dest.saturate = true;
> - break;
> -  case nir_op_ineg:
> - alu->op = nir_op_imov;
> - alu->src[0].negate = !alu->src[0].negate;
> - break;
> -  case nir_op_fneg:
> - alu->op = nir_op_fmov;
> - alu->src[0].negate = !alu->src[0].negate;
> - break;
> -  case nir_op_iabs:
> - alu->op = nir_op_imov;
> - alu->src[0].abs = true;
> - alu->src[0].negate = false;
> - break;
> -  case nir_op_fabs:
> - alu->op = nir_op_fmov;
> - alu->src[0].abs = true;
> - alu->src[0].negate = false;
> - break;
> -  default:
> - break;
> +  if (options & nir_lower_float_source_mods) {
> + switch (alu->op) {
> + case nir_op_fsat:
> +alu->op = nir_op_fmov;
> +alu->dest.saturate = true;
> +break;
> + case nir_op_fneg:
> +alu->op = nir_op_fmov;
> +alu->src[0].negate = !alu->src[0].negate;
> +break;
> + case nir_op_fabs:
> +alu->op = nir_op_fmov;
> +alu->src[0].abs = true;
> +alu->src[0].negate = false;
> +break;
> + default:
> +break;
> + }
>}
>
> +  if (options & nir_lower_int_source_mods) {
> + switch (alu->op) {
> + case nir_op_ineg:
> +alu->op = nir_op_imov;
> +alu->src[0].negate = !alu->src[0].negate;
> +break;
> + case nir_op_iabs:
> +alu->op = nir_op_imov;
> +alu->src[0].abs = true;
> +alu->src[0].negate = false;
> +break;
> + default:
> +break;
> + }
> +  }
>/* We've covered sources.  Now we're going to try and saturate the
> * destination if we can.
> */
> @@ -136,6 +149,9 @@ nir_lower_to_source_mods_block(nir_block *block)
>nir_type_float)
>   continue;
>
> +  if (!(options & nir_lower_float_source_mods))
> + continue;
> +
>if (!list_empty(&alu->dest.dest.ssa.if_uses))
>   continue;
>
> @@ -185,12 +201,13 @@ nir_lower_to_source_mods_block(nir_block *block)
>  }
>
>  static bool
>
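
As a rough illustration of how the new option flags gate the lowering,
here is a self-contained sketch. The enum is copied from the quoted
nir.h hunk; the base_type enum and the helper are hypothetical stand-ins
for the NIR types and for the per-source checks in the pass, and
returning false for other types is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Copied from the quoted nir.h hunk. */
typedef enum {
   nir_lower_int_source_mods = 1 << 0,
   nir_lower_float_source_mods = 1 << 1,
   nir_lower_all_source_mods = (1 << 2) - 1
} nir_lower_to_source_mods_flags;

/* Hypothetical stand-in for the base types returned by
 * nir_alu_type_get_base_type(). */
typedef enum { base_type_int, base_type_float, base_type_other } base_type;

/* Hypothetical helper mirroring the checks the patch adds to the switch:
 * a source is only considered when its type's flag is set. */
static bool
source_mods_enabled_for(base_type type, nir_lower_to_source_mods_flags options)
{
   /* Reject bits outside the defined flags. */
   assert((options & ~nir_lower_all_source_mods) == 0);

   switch (type) {
   case base_type_float:
      return (options & nir_lower_float_source_mods) != 0;
   case base_type_int:
      return (options & nir_lower_int_source_mods) != 0;
   default:
      return false; /* assumption: other types are never lowered */
   }
}
```

A float-only driver would pass nir_lower_float_source_mods, while
nir_lower_all_source_mods (int | float) keeps the behaviour from before
the patch.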
