Re: [Mesa-dev] [PATCH] AndroidIA: android: add libmesa_genxml as dep to libmesa_isl

2017-03-29 Thread Tapani Pälli
Doh, sorry about the 'AndroidIA' prefix there; we use it to differentiate
patches that are in our tree but not yet in Mesa master.


On 03/30/2017 08:51 AM, Tapani Pälli wrote:

This fixes the following compile error with libmesa_isl:
   mesa/src/intel/isl/isl.c:28:10: fatal error: 'genxml/genX_bits.h' file not found

Fixes: f0eaf38 ("genxml: New generated header genX_bits.h (v6)")
Signed-off-by: Tapani Pälli 
---
 src/intel/Android.isl.mk | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/intel/Android.isl.mk b/src/intel/Android.isl.mk
index bc58b97..67e6d2d 100644
--- a/src/intel/Android.isl.mk
+++ b/src/intel/Android.isl.mk
@@ -186,7 +186,8 @@ LOCAL_WHOLE_STATIC_LIBRARIES := \
libmesa_isl_gen7 \
libmesa_isl_gen75 \
libmesa_isl_gen8 \
-   libmesa_isl_gen9
+   libmesa_isl_gen9 \
+   libmesa_genxml

 # Autogenerated sources



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] AndroidIA: android: add libmesa_genxml as dep to libmesa_isl

2017-03-29 Thread Tapani Pälli
This fixes the following compile error with libmesa_isl:
   mesa/src/intel/isl/isl.c:28:10: fatal error: 'genxml/genX_bits.h' file not found

Fixes: f0eaf38 ("genxml: New generated header genX_bits.h (v6)")
Signed-off-by: Tapani Pälli 
---
 src/intel/Android.isl.mk | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/intel/Android.isl.mk b/src/intel/Android.isl.mk
index bc58b97..67e6d2d 100644
--- a/src/intel/Android.isl.mk
+++ b/src/intel/Android.isl.mk
@@ -186,7 +186,8 @@ LOCAL_WHOLE_STATIC_LIBRARIES := \
libmesa_isl_gen7 \
libmesa_isl_gen75 \
libmesa_isl_gen8 \
-   libmesa_isl_gen9
+   libmesa_isl_gen9 \
+   libmesa_genxml
 
 # Autogenerated sources
 
-- 
2.9.3



[Mesa-dev] [PATCH] mesa: disable glthread when DEBUG_OUTPUT_SYNCHRONOUS is enabled

2017-03-29 Thread Timothy Arceri
We could re-enable it also but I haven't tested that yet, and I'm
not sure we care much anyway.
---
 src/mesa/main/debug_output.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/src/mesa/main/debug_output.c b/src/mesa/main/debug_output.c
index bc933db..2b22645 100644
--- a/src/mesa/main/debug_output.c
+++ b/src/mesa/main/debug_output.c
@@ -22,20 +22,21 @@
  * OTHER DEALINGS IN THE SOFTWARE.
  */
 
 
 #include 
 #include 
 #include "context.h"
 #include "debug_output.h"
 #include "dispatch.h"
 #include "enums.h"
+#include "glthread.h"
 #include "imports.h"
 #include "hash.h"
 #include "mtypes.h"
 #include "version.h"
 #include "util/hash_table.h"
 #include "util/simple_list.h"
 
 
 static mtx_t DynamicIDMutex = _MTX_INITIALIZER_NP;
 static GLuint NextDynamicID = 1;
@@ -741,20 +742,24 @@ _mesa_set_debug_state_int(struct gl_context *ctx, GLenum 
pname, GLint val)
 
if (!debug)
   return false;
 
switch (pname) {
case GL_DEBUG_OUTPUT:
   debug->DebugOutput = (val != 0);
   break;
case GL_DEBUG_OUTPUT_SYNCHRONOUS_ARB:
   debug->SyncOutput = (val != 0);
+  if (debug->SyncOutput) {
+ _mesa_glthread_finish(ctx);
+ _mesa_glthread_restore_dispatch(ctx);
+  }
   break;
default:
   assert(!"unknown debug output param");
   break;
}
 
_mesa_unlock_debug_state(ctx);
 
return true;
 }
-- 
2.9.3



[Mesa-dev] [AppVeyor] mesa master #3903 completed

2017-03-29 Thread AppVeyor


Build mesa 3903 completed



Commit 36cb2003f1 by Harish Krupo on 3/28/2017 6:38 PM:

android: pass sse4.1 flag as appropriate

We have functions which depend on sse4.1 support but we didn't pass
the right compile flag for it. This patch fixes it.

Signed-off-by: Kalyan Kondapally
Signed-off-by: Harish Krupo
Reviewed-by: Tapani Pälli




Re: [Mesa-dev] [PATCH] util/u_atomic: provide 64bit atomics where they're missing

2017-03-29 Thread Jonathan Gray
On Wed, Mar 29, 2017 at 04:55:54PM -0700, Matt Turner wrote:
> On Wed, Mar 29, 2017 at 4:13 PM, Grazvydas Ignotas  wrote:
> > There are still some distributions trying to support unfortunate people
> > with old or exotic CPUs that don't have 64bit atomic operations. When
> > compiling for such a machine, gcc conveniently inserts a library call to
> > a helper, but its implementation is missing and we get a linker error.
> > This allows us to provide our implementation, which is marked weak to
> > prefer a better implementation, should one exist.
> >
> > Cc: Matt Turner 
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93089
> > Signed-off-by: Grazvydas Ignotas 
> > ---
> 
> Thanks, this is a really good idea.
> 
> >  configure.ac  | 12 
> >  src/util/Makefile.sources |  1 +
> >  src/util/u_atomic.c   | 71 
> > +++
> >  3 files changed, 84 insertions(+)
> >  create mode 100644 src/util/u_atomic.c
> >
> > diff --git a/configure.ac b/configure.ac
> > index ab9a91e..89b615b 100644
> > --- a/configure.ac
> > +++ b/configure.ac
> > @@ -413,10 +413,22 @@ int main() {
> >  if test "x$GCC_ATOMIC_BUILTINS_SUPPORTED" = x1; then
> >  DEFINES="$DEFINES -DUSE_GCC_ATOMIC_BUILTINS"
> >  fi
> >  AM_CONDITIONAL([GCC_ATOMIC_BUILTINS_SUPPORTED], [test 
> > x$GCC_ATOMIC_BUILTINS_SUPPORTED = x1])
> >
> > +dnl Check if host supports 64bit atomics
> > +dnl note that lack of support usually results in link (not compile) error
> > +AC_LINK_IFELSE([AC_LANG_SOURCE([[
> > +#include <stdint.h>
> > +uint64_t v;
> > +int main() {
> > +return __sync_add_and_fetch(&v, (uint64_t)1);
> > +}]])], GCC_64BIT_ATOMICS_SUPPORTED=1)
> > +if test "x$GCC_64BIT_ATOMICS_SUPPORTED" != x1; then
> > +DEFINES="$DEFINES -DMISSING_64BIT_ATOMICS"
> > +fi
> > +
> >  dnl Check for Endianness
> >  AC_C_BIGENDIAN(
> > little_endian=no,
> > little_endian=yes,
> > little_endian=no,
> > diff --git a/src/util/Makefile.sources b/src/util/Makefile.sources
> > index 8ee45d5..e905734 100644
> > --- a/src/util/Makefile.sources
> > +++ b/src/util/Makefile.sources
> > @@ -41,10 +41,11 @@ MESA_UTIL_FILES := \
> > string_to_uint_map.h \
> > strndup.h \
> > strtod.c \
> > strtod.h \
> > texcompress_rgtc_tmp.h \
> > +   u_atomic.c \
> > u_atomic.h \
> > u_endian.h \
> > u_queue.c \
> > u_queue.h \
> > u_string.h \
> > diff --git a/src/util/u_atomic.c b/src/util/u_atomic.c
> > new file mode 100644
> > index 000..77ef119
> > --- /dev/null
> > +++ b/src/util/u_atomic.c
> > @@ -0,0 +1,71 @@
> > +/*
> > + * Copyright © 2017 The Mesa Project
> 
> The Mesa Project isn't something that can hold copyright. Your name
> should be here.
> 
> > + *
> > + * Permission is hereby granted, free of charge, to any person obtaining a
> > + * copy of this software and associated documentation files (the 
> > "Software"),
> > + * to deal in the Software without restriction, including without 
> > limitation
> > + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> > + * and/or sell copies of the Software, and to permit persons to whom the
> > + * Software is furnished to do so, subject to the following conditions:
> > + *
> > + * The above copyright notice and this permission notice (including the 
> > next
> > + * paragraph) shall be included in all copies or substantial portions of 
> > the
> > + * Software.
> > + *
> > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS 
> > OR
> > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR 
> > OTHER
> > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 
> > DEALINGS
> > + * IN THE SOFTWARE.
> > + */
> > +
> > +#if defined(MISSING_64BIT_ATOMICS) && defined(HAVE_PTHREAD)
> > +
> > +#include <stdint.h>
> > +#include <pthread.h>
> > +
> > +#if defined(HAVE_FUNC_ATTRIBUTE_WEAK) && !defined(__CYGWIN__)
> > +#define WEAK __attribute__((weak))
> > +#else
> > +#define WEAK
> > +#endif
> > +
> > +static pthread_mutex_t sync_mutex = PTHREAD_MUTEX_INITIALIZER;
> > +
> > +WEAK uint64_t __sync_add_and_fetch_8(uint64_t *ptr, uint64_t val)
> 
> Let's do BSD-style function declarations, with the qualifiers and
> return type on their own line.
> 
> With those two trivial things changed, this is
> 
> Reviewed-by: Matt Turner 
> 
> Grazvydas, if you have not already, please file a request for a
> Freedesktop account [1] [2] and let's get you commit access.
> 
> Jonathan, can you check whether this resolves the bug entirely? Or are
> there some other __sync functions we need to implement? I see
> 

Re: [Mesa-dev] [PATCH] mesa/glthread: add custom marshalling for ClearBufferfv()

2017-03-29 Thread Timothy Arceri

On 28/03/17 01:02, Gregory Hainaut wrote:

Hello Timothy,

2 small questions:

Will it work for DSA equivalent function, namely
glClearNamedFramebufferfv ?


It looks like we don't currently even bother to implement 
glClearNamedFramebufferfv properly.




Would it be interesting to also do the equivalent for
glClearBufferiv/glClearBufferuiv ?
Note the *uiv variant could be easier as the size is always 4 INT, so it
can be done with a scale attribute on the XML.


Sure. For now I was seeing this one in use so I added it.



Cheers,
Gregory
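
The point about value counts can be sketched in C. This is only an illustration and not Mesa's marshalling code: `clear_buffer_value_count` is a hypothetical helper, and the enum values are the standard OpenGL ones, defined locally so the block compiles without GL headers.

```c
#include <stddef.h>

/* Standard OpenGL enum values, defined locally for self-containment. */
#define GL_COLOR   0x1800
#define GL_DEPTH   0x1801
#define GL_STENCIL 0x1802

/* Hypothetical helper: number of elements read from the `value` pointer
 * of glClearBuffer{fv,iv,uiv}() for a given `buffer` argument.
 * Returns 0 for an enum this sketch does not handle. */
static size_t
clear_buffer_value_count(unsigned buffer)
{
   switch (buffer) {
   case GL_COLOR:
      return 4;   /* RGBA */
   case GL_DEPTH:
   case GL_STENCIL:
      return 1;   /* a single depth or stencil value */
   default:
      return 0;
   }
}
```

Since glClearBufferuiv() only accepts GL_COLOR, its count is always 4, which is why a fixed-size description would be enough for that variant.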



Re: [Mesa-dev] [PATCH] gbm/dri: Flush after unmap

2017-03-29 Thread Michel Dänzer
On 30/03/17 12:56 AM, Thomas Hellstrom wrote:
> On 03/29/2017 02:34 PM, Emil Velikov wrote:
>> On 29 March 2017 at 13:02, Thomas Hellstrom  wrote:
>>> On 03/29/2017 01:30 PM, Emil Velikov wrote:
>>>> On 28 March 2017 at 20:39, Thomas Hellstrom  wrote:
>>>>>
>>>>> Signed-off-by: Thomas Hellstrom
>>>>> ---
>>>>>  src/gbm/backends/dri/gbm_dri.c | 9 -
>>>>>  1 file changed, 8 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/src/gbm/backends/dri/gbm_dri.c
>>>>> b/src/gbm/backends/dri/gbm_dri.c
>>>>> index ac7ede8..6c2244c 100644
>>>>> --- a/src/gbm/backends/dri/gbm_dri.c
>>>>> +++ b/src/gbm/backends/dri/gbm_dri.c
>>>>> @@ -243,7 +243,7 @@ struct dri_extension_match {
>>>>>  };
>>>>>
>>>>>  static struct dri_extension_match dri_core_extensions[] = {
>>>>> -   { __DRI2_FLUSH, 1, offsetof(struct gbm_dri_device, flush) },
>>>>> +   { __DRI2_FLUSH, 4, offsetof(struct gbm_dri_device, flush) },
>>>> Currently the classic nouveau, radeon/r200 and i915 drivers do not
>>>> support v4 of the extension.
>>>> As-is this will 'break' them... if they ever worked to begin with.
>>>>
>>>> One solution is to bail out (return -ENOSYS or similar) in the
>>>> map/unmap API when the DRI module is too old.
>>>> Just some ^^ food for thought.
>>> Hmm. Is there even a use-case for gbm with those drivers? If so we
>>> should perhaps make them up-to-date with the flush extension.
>>>
>> Of the above:
>>
>> - nouveau: Does not support DRI_IMAGE, thus it doesn't work even
>> before the patch.
>> - i915: I have some untested ancient patches. Will see if I can rebase
>> + send out.
>> - radeons: ??
>>
>> If someone reports an issue we can ask them to write/test some code, I guess 
>> ;-)
> 
> Indeed. It looks like gbm is mostly used together with KMS anyway...

All of the above drivers are KMS based, FWIW.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer
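
The "bail out when the DRI module is too old" suggestion quoted above could look roughly like the following. This is a sketch under stated assumptions, not gbm code: `struct flush_ext`, `REQUIRED_FLUSH_VERSION` and `check_flush_for_map` are hypothetical names introduced only for illustration.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a loaded DRI flush extension. */
struct flush_ext {
   int version;
};

/* Hypothetical minimum version needed by the map/unmap path. */
#define REQUIRED_FLUSH_VERSION 4

/* Returns 0 if the driver's flush extension is new enough, or
 * -ENOSYS if it is missing or too old. */
static int
check_flush_for_map(const struct flush_ext *flush)
{
   if (flush == NULL || flush->version < REQUIRED_FLUSH_VERSION)
      return -ENOSYS;

   return 0;
}
```

With a check like this, the extension match table could keep accepting version 1, and only the map/unmap entry points would refuse to run on older drivers.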


Re: [Mesa-dev] [PATCH] [RFC v3] mesa/glthread: Call unmarshal_batch directly in glthread_finish

2017-03-29 Thread Michel Dänzer
On 30/03/17 02:31 AM, Bartosz Tomczyk wrote:
> Call it directly when batch queue is empty. This avoids costly thread
> synchronisation. With this fix games that previously regressed
> with mesa_glthread=true like xonotic or grid autosport.

The second sentence here is missing a verb (at least).


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


[Mesa-dev] [AppVeyor] mesa master #3902 failed

2017-03-29 Thread AppVeyor



Build mesa 3902 failed


Commit a930c2c612 by Dave Airlie on 3/30/2017 3:09 AM:

radv: fix mask attribs properly.

some days it just doesn't pay to get out of bed.

Signed-off-by: Dave Airlie




Re: [Mesa-dev] [PATCH] drirc: Set glsl_zero_init for Kerbal Space Program.

2017-03-29 Thread Matt Turner
On Wed, Mar 29, 2017 at 7:41 PM, Francisco Jerez  wrote:
> This fixes the stripes of garbage rendered on the floor of the vehicle
> assembly building among other rendering issues.  The reason for the
> misrendering seems to be that some of the GLSL shaders used by the
> application use variables before initializing them, incorrectly
> assuming that they will be implicitly set to zero by the
> implementation.

Sigh.

Acked-by: Matt Turner 


[Mesa-dev] [PATCH] drirc: Set glsl_zero_init for Kerbal Space Program.

2017-03-29 Thread Francisco Jerez
This fixes the stripes of garbage rendered on the floor of the vehicle
assembly building among other rendering issues.  The reason for the
misrendering seems to be that some of the GLSL shaders used by the
application use variables before initializing them, incorrectly
assuming that they will be implicitly set to zero by the
implementation.
---
 src/mesa/drivers/dri/common/drirc | 8 
 1 file changed, 8 insertions(+)

diff --git a/src/mesa/drivers/dri/common/drirc 
b/src/mesa/drivers/dri/common/drirc
index 494e9e1..f8babb7 100644
--- a/src/mesa/drivers/dri/common/drirc
+++ b/src/mesa/drivers/dri/common/drirc
@@ -120,5 +120,13 @@ TODO: document the other workarounds.
 
 
 
+
+
+
+
+
+
+
+
 
 
-- 
2.10.2



[Mesa-dev] [PATCH] stash

2017-03-29 Thread Jason Ekstrand
---
 src/intel/blorp/blorp_gen4_exec_priv.h | 81 ++
 src/intel/blorp/blorp_priv.h   |  1 +
 2 files changed, 82 insertions(+)

diff --git a/src/intel/blorp/blorp_gen4_exec_priv.h 
b/src/intel/blorp/blorp_gen4_exec_priv.h
index 90f9613..b0e4cba 100644
--- a/src/intel/blorp/blorp_gen4_exec_priv.h
+++ b/src/intel/blorp/blorp_gen4_exec_priv.h
@@ -40,13 +40,94 @@ blorp_emit_vs_state(struct blorp_batch *batch,
return blorp_general_state_address(batch, offset);
 }
 
+struct blorp_sf_key {
+   enum blorp_shader_type shader_type; /* Must be BLORP_SHADER_TYPE_GEN4_SF */
+
+   struct brw_sf_prog_key key;
+};
+
+static bool
+blorp_get_gen4_sf_program(struct blorp_context *blorp,
+  const struct brw_wm_prog_data *wm_prog_data,
+  uint32_t *kernel,
+  struct brw_sf_prog_data **prog_data)
+{
+   struct blorp_sf_key key = {
+  .shader_type = BLORP_SHADER_TYPE_GEN4_SF,
+   };
+
+   assert(wm_prog_data);
+
+   /* Everything gets compacted in vertex setup, so we just need a
+* pass-through for the correct number of input varyings.
+*/
+   const uint64_t slots_valid = VARYING_BIT_POS |
+  ((1 << wm_prog_data->num_varying_inputs) - 1) << VARYING_SLOT_VAR0;
+
+   key.key.contains_flat_varying = wm_prog_data->contains_flat_varying;
+   key.key.attrs = slots_valid;
+
+   STATIC_ASSERT(sizeof(key.key.interp_mode) ==
+ sizeof(wm_prog_data->interp_mode));
+   memcpy(key.key.interp_mode, wm_prog_data->interp_mode,
+  sizeof(key.key.interp_mode));
+
+   if (blorp->lookup_shader(blorp, &key, sizeof(key), kernel, prog_data))
+  return true;
+
+   void *mem_ctx = ralloc_context(NULL);
+
+   const unsigned *program;
+   unsigned program_size;
+
+   struct brw_vue_map vue_map;
+   brw_compute_vue_map(blorp->compiler->devinfo, &vue_map, slots_valid, false);
+
+   struct brw_sf_prog_data prog_data_tmp;
+   program = brw_compile_sf(blorp->compiler, mem_ctx, &key.key,
+                            &prog_data_tmp, &vue_map, &program_size);
+
+   bool result =
+      blorp->upload_shader(blorp, &key, sizeof(key), program, program_size,
+                           (void *)&prog_data_tmp, sizeof(prog_data_tmp),
+                           kernel, prog_data);
+
+   ralloc_free(mem_ctx);
+
+   return result;
+}
+
 static struct blorp_address
 blorp_emit_sf_state(struct blorp_batch *batch,
 const struct blorp_params *params)
 {
+   uint32_t kernel;
+   struct brw_sf_prog_data *prog_data;
+   blorp_get_gen4_sf_program(batch->blorp, params->wm_prog_data,
+                             &kernel, &prog_data);
+   /* TODO: Handle error? */
+
uint32_t offset;
blorp_emit_dynamic(batch, GENX(SF_STATE), sf,
   AUB_TRACE_SF_STATE, 64, &offset) {
+  sf.KernelStartPointer = kernel;
+  sf.GRFRegisterCount = DIV_ROUND_UP(prog_data->total_grf, 16);
+  sf.VertexURBEntryReadLength = prog_data->urb_read_length;
+  sf.VertexURBEntryReadOffset = BRW_SF_URB_ENTRY_READ_OFFSET;
+  sf.DispatchGRFStartRegisterforURBData = 3;
+
+#if GEN_GEN == 5
+  sf.MaximumNumberofThreads = 48;
+#else
+  sf.MaximumNumberofThreads = 24;
+#endif
+
+  sf.URBEntryAllocationSize = prog_data->urb_entry_size;
+  sf.NumberofURBEntries;
+
+  sf.ViewportTransformEnable = false;
+
+  sf.CullMode = CULLMODE_NONE;
}
 
return blorp_general_state_address(batch, offset);
diff --git a/src/intel/blorp/blorp_priv.h b/src/intel/blorp/blorp_priv.h
index c61ab08..e7b3508 100644
--- a/src/intel/blorp/blorp_priv.h
+++ b/src/intel/blorp/blorp_priv.h
@@ -201,6 +201,7 @@ enum blorp_shader_type {
BLORP_SHADER_TYPE_BLIT,
BLORP_SHADER_TYPE_CLEAR,
BLORP_SHADER_TYPE_LAYER_OFFSET_VS,
+   BLORP_SHADER_TYPE_GEN4_SF,
 };
 
 struct brw_blorp_blit_prog_key
-- 
2.5.0.400.gff86faf
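
The pass-through slot mask described in the patch comment (position plus one bit per input varying) can be sketched in isolation. The slot numbers below are hypothetical values chosen for illustration, and `passthrough_slots` is not a Mesa function. The sketch shifts a 64-bit constant, which avoids overflowing `int` if the varying range reaches past bit 31.

```c
#include <stdint.h>

/* Hypothetical slot numbers, chosen only for illustration. */
#define SLOT_POS   0
#define SLOT_VAR0  31

/* Builds a mask with the position slot set plus `num_varying_inputs`
 * consecutive slots starting at SLOT_VAR0.
 * Assumes num_varying_inputs <= 32 so every shift stays within 64 bits. */
static uint64_t
passthrough_slots(unsigned num_varying_inputs)
{
   const uint64_t varyings =
      ((UINT64_C(1) << num_varying_inputs) - 1) << SLOT_VAR0;

   return (UINT64_C(1) << SLOT_POS) | varyings;
}
```

For two input varyings this sets bits 0, 31 and 32, that is, the position slot and the first two generic varying slots.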



Re: [Mesa-dev] [PATCH] glsl: allow glsl_type::sampler_index() with images

2017-03-29 Thread Timothy Arceri

Reviewed-by: Timothy Arceri 


Re: [Mesa-dev] [PATCH] st/glsl_to_tgsi: use glsl_type::sampler_index()

2017-03-29 Thread Timothy Arceri

Reviewed-by: Timothy Arceri 


Re: [Mesa-dev] [PATCH] util/u_atomic: provide 64bit atomics where they're missing

2017-03-29 Thread Matt Turner
On Wed, Mar 29, 2017 at 4:13 PM, Grazvydas Ignotas  wrote:
> There are still some distributions trying to support unfortunate people
> with old or exotic CPUs that don't have 64bit atomic operations. When
> compiling for such a machine, gcc conveniently inserts a library call to
> a helper, but its implementation is missing and we get a linker error.
> This allows us to provide our implementation, which is marked weak to
> prefer a better implementation, should one exist.
>
> Cc: Matt Turner 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93089
> Signed-off-by: Grazvydas Ignotas 
> ---

Thanks, this is a really good idea.

>  configure.ac  | 12 
>  src/util/Makefile.sources |  1 +
>  src/util/u_atomic.c   | 71 
> +++
>  3 files changed, 84 insertions(+)
>  create mode 100644 src/util/u_atomic.c
>
> diff --git a/configure.ac b/configure.ac
> index ab9a91e..89b615b 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -413,10 +413,22 @@ int main() {
>  if test "x$GCC_ATOMIC_BUILTINS_SUPPORTED" = x1; then
>  DEFINES="$DEFINES -DUSE_GCC_ATOMIC_BUILTINS"
>  fi
>  AM_CONDITIONAL([GCC_ATOMIC_BUILTINS_SUPPORTED], [test 
> x$GCC_ATOMIC_BUILTINS_SUPPORTED = x1])
>
> +dnl Check if host supports 64bit atomics
> +dnl note that lack of support usually results in link (not compile) error
> +AC_LINK_IFELSE([AC_LANG_SOURCE([[
> +#include <stdint.h>
> +uint64_t v;
> +int main() {
> +return __sync_add_and_fetch(&v, (uint64_t)1);
> +}]])], GCC_64BIT_ATOMICS_SUPPORTED=1)
> +if test "x$GCC_64BIT_ATOMICS_SUPPORTED" != x1; then
> +DEFINES="$DEFINES -DMISSING_64BIT_ATOMICS"
> +fi
> +
>  dnl Check for Endianness
>  AC_C_BIGENDIAN(
> little_endian=no,
> little_endian=yes,
> little_endian=no,
> diff --git a/src/util/Makefile.sources b/src/util/Makefile.sources
> index 8ee45d5..e905734 100644
> --- a/src/util/Makefile.sources
> +++ b/src/util/Makefile.sources
> @@ -41,10 +41,11 @@ MESA_UTIL_FILES := \
> string_to_uint_map.h \
> strndup.h \
> strtod.c \
> strtod.h \
> texcompress_rgtc_tmp.h \
> +   u_atomic.c \
> u_atomic.h \
> u_endian.h \
> u_queue.c \
> u_queue.h \
> u_string.h \
> diff --git a/src/util/u_atomic.c b/src/util/u_atomic.c
> new file mode 100644
> index 000..77ef119
> --- /dev/null
> +++ b/src/util/u_atomic.c
> @@ -0,0 +1,71 @@
> +/*
> + * Copyright © 2017 The Mesa Project

The Mesa Project isn't something that can hold copyright. Your name
should be here.

> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the next
> + * paragraph) shall be included in all copies or substantial portions of the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 
> DEALINGS
> + * IN THE SOFTWARE.
> + */
> +
> +#if defined(MISSING_64BIT_ATOMICS) && defined(HAVE_PTHREAD)
> +
> +#include <stdint.h>
> +#include <pthread.h>
> +
> +#if defined(HAVE_FUNC_ATTRIBUTE_WEAK) && !defined(__CYGWIN__)
> +#define WEAK __attribute__((weak))
> +#else
> +#define WEAK
> +#endif
> +
> +static pthread_mutex_t sync_mutex = PTHREAD_MUTEX_INITIALIZER;
> +
> +WEAK uint64_t __sync_add_and_fetch_8(uint64_t *ptr, uint64_t val)

Let's do BSD-style function declarations, with the qualifiers and
return type on their own line.

With those two trivial things changed, this is

Reviewed-by: Matt Turner 

Grazvydas, if you have not already, please file a request for a
Freedesktop account [1] [2] and let's get you commit access.

Jonathan, can you check whether this resolves the bug entirely? Or are
there some other __sync functions we need to implement? I see
__sync_add_and_fetch_4, etc, in the bug report.

[1] https://bugs.freedesktop.org/enter_bug.cgi?product=freedesktop.org&component=New%20Accounts
[2] https://www.freedesktop.org/wiki/AccountRequests/


[Mesa-dev] [PATCH] util/u_atomic: provide 64bit atomics where they're missing

2017-03-29 Thread Grazvydas Ignotas
There are still some distributions trying to support unfortunate people
with old or exotic CPUs that don't have 64bit atomic operations. When
compiling for such a machine, gcc conveniently inserts a library call to
a helper, but its implementation is missing and we get a linker error.
This allows us to provide our implementation, which is marked weak to
prefer a better implementation, should one exist.

Cc: Matt Turner 
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93089
Signed-off-by: Grazvydas Ignotas 
---
 configure.ac  | 12 
 src/util/Makefile.sources |  1 +
 src/util/u_atomic.c   | 71 +++
 3 files changed, 84 insertions(+)
 create mode 100644 src/util/u_atomic.c

diff --git a/configure.ac b/configure.ac
index ab9a91e..89b615b 100644
--- a/configure.ac
+++ b/configure.ac
@@ -413,10 +413,22 @@ int main() {
 if test "x$GCC_ATOMIC_BUILTINS_SUPPORTED" = x1; then
 DEFINES="$DEFINES -DUSE_GCC_ATOMIC_BUILTINS"
 fi
 AM_CONDITIONAL([GCC_ATOMIC_BUILTINS_SUPPORTED], [test 
x$GCC_ATOMIC_BUILTINS_SUPPORTED = x1])
 
+dnl Check if host supports 64bit atomics
+dnl note that lack of support usually results in link (not compile) error
+AC_LINK_IFELSE([AC_LANG_SOURCE([[
+#include <stdint.h>
+uint64_t v;
+int main() {
+return __sync_add_and_fetch(&v, (uint64_t)1);
+}]])], GCC_64BIT_ATOMICS_SUPPORTED=1)
+if test "x$GCC_64BIT_ATOMICS_SUPPORTED" != x1; then
+DEFINES="$DEFINES -DMISSING_64BIT_ATOMICS"
+fi
+
 dnl Check for Endianness
 AC_C_BIGENDIAN(
little_endian=no,
little_endian=yes,
little_endian=no,
diff --git a/src/util/Makefile.sources b/src/util/Makefile.sources
index 8ee45d5..e905734 100644
--- a/src/util/Makefile.sources
+++ b/src/util/Makefile.sources
@@ -41,10 +41,11 @@ MESA_UTIL_FILES := \
string_to_uint_map.h \
strndup.h \
strtod.c \
strtod.h \
texcompress_rgtc_tmp.h \
+   u_atomic.c \
u_atomic.h \
u_endian.h \
u_queue.c \
u_queue.h \
u_string.h \
diff --git a/src/util/u_atomic.c b/src/util/u_atomic.c
new file mode 100644
index 000..77ef119
--- /dev/null
+++ b/src/util/u_atomic.c
@@ -0,0 +1,71 @@
+/*
+ * Copyright © 2017 The Mesa Project
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#if defined(MISSING_64BIT_ATOMICS) && defined(HAVE_PTHREAD)
+
+#include <stdint.h>
+#include <pthread.h>
+
+#if defined(HAVE_FUNC_ATTRIBUTE_WEAK) && !defined(__CYGWIN__)
+#define WEAK __attribute__((weak))
+#else
+#define WEAK
+#endif
+
+static pthread_mutex_t sync_mutex = PTHREAD_MUTEX_INITIALIZER;
+
+WEAK uint64_t __sync_add_and_fetch_8(uint64_t *ptr, uint64_t val)
+{
+   uint64_t r;
+
+   pthread_mutex_lock(&sync_mutex);
+   *ptr += val;
+   r = *ptr;
+   pthread_mutex_unlock(&sync_mutex);
+
+   return r;
+}
+
+WEAK uint64_t __sync_sub_and_fetch_8(uint64_t *ptr, uint64_t val)
+{
+   uint64_t r;
+
+   pthread_mutex_lock(&sync_mutex);
+   *ptr -= val;
+   r = *ptr;
+   pthread_mutex_unlock(&sync_mutex);
+
+   return r;
+}
+
+WEAK uint64_t __atomic_fetch_add_8(uint64_t *ptr, uint64_t val, int memorder)
+{
+   return __sync_add_and_fetch(ptr, val);
+}
+
+WEAK uint64_t __atomic_fetch_sub_8(uint64_t *ptr, uint64_t val, int memorder)
+{
+   return __sync_sub_and_fetch(ptr, val);
+}
+
+#endif
-- 
2.7.4
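
The mutex-based fallback idea can be tried outside the build system with a small standalone sketch. This is not the code from the patch: `fallback_mutex`, `fallback_add_and_fetch_u64` and `fallback_fetch_add_u64` are hypothetical names, picked so they do not collide with the compiler's own `__sync_*` and `__atomic_*` symbols. The sketch also spells out the two return conventions, since GCC's `__sync_add_and_fetch` returns the new value while `__atomic_fetch_add` returns the previous one.

```c
#include <pthread.h>
#include <stdint.h>

/* A single global lock serializes all emulated 64-bit operations. */
static pthread_mutex_t fallback_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Adds val to *ptr and returns the NEW value. */
static uint64_t
fallback_add_and_fetch_u64(uint64_t *ptr, uint64_t val)
{
   uint64_t r;

   pthread_mutex_lock(&fallback_mutex);
   *ptr += val;
   r = *ptr;
   pthread_mutex_unlock(&fallback_mutex);

   return r;
}

/* Adds val to *ptr and returns the OLD value. */
static uint64_t
fallback_fetch_add_u64(uint64_t *ptr, uint64_t val)
{
   uint64_t r;

   pthread_mutex_lock(&fallback_mutex);
   r = *ptr;
   *ptr += val;
   pthread_mutex_unlock(&fallback_mutex);

   return r;
}
```

A single global mutex is simple but serializes every emulated operation, which is presumably acceptable for a fallback that only exists on CPUs without native 64-bit atomics.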



Re: [Mesa-dev] [PATCH v2 1/2] anv: Add support for 48-bit addresses

2017-03-29 Thread Chris Wilson
On Wed, Mar 29, 2017 at 04:51:12PM +0100, Chris Wilson wrote:
> On Wed, Mar 29, 2017 at 08:36:36AM -0700, Jason Ekstrand wrote:
> >On Wed, Mar 29, 2017 at 1:51 AM, Chris Wilson
> ><[1]ch...@chris-wilson.co.uk> wrote:
> >  > diff --git a/src/intel/vulkan/anv_private.h
> >  b/src/intel/vulkan/anv_private.h
> >  > index 27c887c..425e376 100644
> >  > --- a/src/intel/vulkan/anv_private.h
> >  > +++ b/src/intel/vulkan/anv_private.h
> >  > @@ -299,11 +299,34 @@ struct anv_bo {
> >  >      * writing to them and synchronize uses on other rings (eg if the
> >  display
> >  >      * server uses the blitter ring).
> >  >      */
> >  > -   bool is_winsys_bo;
> >  > +   bool is_winsys_bo:1;
> >  > +
> >  > +   /* Whether or not this BO supports having a 48-bit address.  Not
> >  all
> >  > +    * buffers support arbitrary 48-bit addresses.  In particular, we
> >  need to
> >  > +    * be careful with general and instruction state buffers because
> >  we set the
> >  > +    * size in STATE_BASE_ADDRESS to 0xf (the maximum) even 
> > though
> >  the BO
> >  > +    * is most likely significantly smaller.  If we let the kernel
> >  place it
> >  > +    * anywhere it wants, it will default to placing it as high up 
> > the
> >  address
> >  > +    * space as possible, the range specified by STATE_BASE_ADDRESS
> >  will
> >  > +    * over-flow the 48-bit address range, and the GPU will hang.  In
> >  order to
> >  > +    * avoid this problem, we tell the kernel that the buffer does 
> > not
> >  support
> >  > +    * 48-bit addresses, and it places the buffer at a 32-bit
> >  address.  While
> >  > +    * this solution is probably overkill, it is effective.
> > 
> >  How about just setting the field to the bo->size? You must know the bo
> >  already at that point so that you can set the relocation target.
> > 
> >Actually, we don't.  We have a pointer to a thing that claims to be a BO
> >but the actual GEM handle and size aren't known until execbuf time.  
> > (Yes,
> >that's a bit weird but there are good reasons for it and it's not likely
> >to change.  When we stop doing relocations, there's a separate plan for
> >how to handle that.)
> 
> Hmm. I honestly didn't expect that.

Since you have the machinery to resolve the relocations after the fact,
you could treat the size field as a different type of patching. Just
an idea to resolve the placement restriction later.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
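
The overflow concern in the quoted comment comes down to simple arithmetic: a base address plus the programmed range must not exceed the 48-bit address space. A minimal sketch of that check follows; `range_fits_48bit` is a hypothetical helper and the values in the usage note are illustrative, not taken from the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* One past the highest address in a 48-bit address space. */
#define ADDR_48BIT_LIMIT (UINT64_C(1) << 48)

/* Returns true if [base, base + size) lies entirely within the
 * 48-bit address space. Written to avoid wrap-around in base + size. */
static bool
range_fits_48bit(uint64_t base, uint64_t size)
{
   return base <= ADDR_48BIT_LIMIT && size <= ADDR_48BIT_LIMIT - base;
}
```

For example, a buffer placed one page below the top of the 48-bit space fails the check for any programmed range larger than a page, whereas the same range starting at a 32-bit address passes easily.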


[Mesa-dev] [PATCH] glsl: allow glsl_type::sampler_index() with images

2017-03-29 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset 
---
 src/compiler/glsl_types.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/compiler/glsl_types.cpp b/src/compiler/glsl_types.cpp
index 405aa3679a..cf0fe71d1a 100644
--- a/src/compiler/glsl_types.cpp
+++ b/src/compiler/glsl_types.cpp
@@ -315,7 +315,7 @@ glsl_type::sampler_index() const
 {
const glsl_type *const t = (this->is_array()) ? this->fields.array : this;
 
-   assert(t->is_sampler());
+   assert(t->is_sampler() || t->is_image());
 
switch (t->sampler_dimensionality) {
case GLSL_SAMPLER_DIM_1D:
-- 
2.12.1
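
The motivation for relaxing the assert is that samplers and images share the same dimensionality-to-target mapping, so one helper can serve both. A reduced sketch of such a mapping is shown below; the enums and `texture_index_for` are hypothetical stand-ins and cover only a subset of the dimensionalities that appear in the Mesa sources.

```c
#include <stdbool.h>

/* Hypothetical, reduced versions of the dimensionality and target enums. */
enum sampler_dim { DIM_1D, DIM_2D, DIM_3D, DIM_CUBE };

enum tex_index {
   TEX_1D, TEX_1D_ARRAY,
   TEX_2D, TEX_2D_ARRAY,
   TEX_3D,
   TEX_CUBE, TEX_CUBE_ARRAY,
};

/* Maps a dimensionality plus array flag to a texture target index.
 * The same function works for sampler and image types alike. */
static enum tex_index
texture_index_for(enum sampler_dim dim, bool is_array)
{
   switch (dim) {
   case DIM_1D:
      return is_array ? TEX_1D_ARRAY : TEX_1D;
   case DIM_2D:
      return is_array ? TEX_2D_ARRAY : TEX_2D;
   case DIM_3D:
      return TEX_3D;
   case DIM_CUBE:
      return is_array ? TEX_CUBE_ARRAY : TEX_CUBE;
   }

   return TEX_2D; /* not reached for valid enum values */
}
```

Keeping the switch in one place means callers that handle textures and callers that handle images cannot drift apart.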



[Mesa-dev] [PATCH] st/glsl_to_tgsi: use glsl_type::sampler_index()

2017-03-29 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset 
---
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 68 +-
 1 file changed, 2 insertions(+), 66 deletions(-)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 46c97783d8..d70018c8a8 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -3883,39 +3883,7 @@ glsl_to_tgsi_visitor::visit_image_intrinsic(ir_call *ir)
inst->sampler_array_size = sampler_array_size;
inst->sampler_base = sampler_base;
 
-   switch (type->sampler_dimensionality) {
-   case GLSL_SAMPLER_DIM_1D:
-  inst->tex_target = (type->sampler_array)
- ? TEXTURE_1D_ARRAY_INDEX : TEXTURE_1D_INDEX;
-  break;
-   case GLSL_SAMPLER_DIM_2D:
-  inst->tex_target = (type->sampler_array)
- ? TEXTURE_2D_ARRAY_INDEX : TEXTURE_2D_INDEX;
-  break;
-   case GLSL_SAMPLER_DIM_3D:
-  inst->tex_target = TEXTURE_3D_INDEX;
-  break;
-   case GLSL_SAMPLER_DIM_CUBE:
-  inst->tex_target = (type->sampler_array)
- ? TEXTURE_CUBE_ARRAY_INDEX : TEXTURE_CUBE_INDEX;
-  break;
-   case GLSL_SAMPLER_DIM_RECT:
-  inst->tex_target = TEXTURE_RECT_INDEX;
-  break;
-   case GLSL_SAMPLER_DIM_BUF:
-  inst->tex_target = TEXTURE_BUFFER_INDEX;
-  break;
-   case GLSL_SAMPLER_DIM_EXTERNAL:
-  inst->tex_target = TEXTURE_EXTERNAL_INDEX;
-  break;
-   case GLSL_SAMPLER_DIM_MS:
-  inst->tex_target = (type->sampler_array)
- ? TEXTURE_2D_MULTISAMPLE_ARRAY_INDEX : TEXTURE_2D_MULTISAMPLE_INDEX;
-  break;
-   default:
-  assert(!"Should not get here.");
-   }
-
+   inst->tex_target = type->sampler_index();
inst->image_format = st_mesa_format_to_pipe_format(st_context(ctx),
  _mesa_get_shader_image_format(imgvar->data.image_format));
 
@@ -4425,39 +4393,7 @@ glsl_to_tgsi_visitor::visit(ir_texture *ir)
   inst->tex_offset_num_offset = i;
}
 
-   switch (sampler_type->sampler_dimensionality) {
-   case GLSL_SAMPLER_DIM_1D:
-  inst->tex_target = (sampler_type->sampler_array)
- ? TEXTURE_1D_ARRAY_INDEX : TEXTURE_1D_INDEX;
-  break;
-   case GLSL_SAMPLER_DIM_2D:
-  inst->tex_target = (sampler_type->sampler_array)
- ? TEXTURE_2D_ARRAY_INDEX : TEXTURE_2D_INDEX;
-  break;
-   case GLSL_SAMPLER_DIM_3D:
-  inst->tex_target = TEXTURE_3D_INDEX;
-  break;
-   case GLSL_SAMPLER_DIM_CUBE:
-  inst->tex_target = (sampler_type->sampler_array)
- ? TEXTURE_CUBE_ARRAY_INDEX : TEXTURE_CUBE_INDEX;
-  break;
-   case GLSL_SAMPLER_DIM_RECT:
-  inst->tex_target = TEXTURE_RECT_INDEX;
-  break;
-   case GLSL_SAMPLER_DIM_BUF:
-  inst->tex_target = TEXTURE_BUFFER_INDEX;
-  break;
-   case GLSL_SAMPLER_DIM_EXTERNAL:
-  inst->tex_target = TEXTURE_EXTERNAL_INDEX;
-  break;
-   case GLSL_SAMPLER_DIM_MS:
-  inst->tex_target = (sampler_type->sampler_array)
- ? TEXTURE_2D_MULTISAMPLE_ARRAY_INDEX : TEXTURE_2D_MULTISAMPLE_INDEX;
-  break;
-   default:
-  assert(!"Should not get here.");
-   }
-
+   inst->tex_target = sampler_type->sampler_index();
inst->tex_type = ir->type->base_type;
 
this->result = result_src;
-- 
2.12.1
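
For readers following along, here is a minimal standalone sketch of the mapping that the two removed switch statements spell out and that the patch now delegates to glsl_type::sampler_index(). The enum names are hypothetical stand-ins, not Mesa's definitions, and the assumption that sampler_index() matches the removed code case for case rests only on this patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for GLSL_SAMPLER_DIM_* (not Mesa's real enum). */
enum sampler_dim {
   DIM_1D, DIM_2D, DIM_3D, DIM_CUBE, DIM_RECT, DIM_BUF, DIM_EXTERNAL, DIM_MS
};

/* Hypothetical stand-ins for TEXTURE_*_INDEX. */
enum tex_index {
   TEX_1D, TEX_1D_ARRAY, TEX_2D, TEX_2D_ARRAY, TEX_3D,
   TEX_CUBE, TEX_CUBE_ARRAY, TEX_RECT, TEX_BUFFER, TEX_EXTERNAL,
   TEX_2D_MS, TEX_2D_MS_ARRAY, TEX_INVALID
};

/* Map (dimensionality, is-array) to a texture target index. Only 1D, 2D,
 * cube and multisample dimensionalities have an array variant. */
static enum tex_index
sampler_index(enum sampler_dim dim, bool is_array)
{
   switch (dim) {
   case DIM_1D:       return is_array ? TEX_1D_ARRAY : TEX_1D;
   case DIM_2D:       return is_array ? TEX_2D_ARRAY : TEX_2D;
   case DIM_3D:       return TEX_3D;
   case DIM_CUBE:     return is_array ? TEX_CUBE_ARRAY : TEX_CUBE;
   case DIM_RECT:     return TEX_RECT;
   case DIM_BUF:      return TEX_BUFFER;
   case DIM_EXTERNAL: return TEX_EXTERNAL;
   case DIM_MS:       return is_array ? TEX_2D_MS_ARRAY : TEX_2D_MS;
   }
   assert(!"unknown sampler dimensionality");
   return TEX_INVALID;
}
```

Keeping the table in one helper means the image and texture paths cannot drift apart, which appears to be the point of the cleanup.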



Re: [Mesa-dev] Meson mesademos (Was: [RFC libdrm 0/2] Replace the build system with meson)

2017-03-29 Thread Jose Fonseca

On 28/03/17 22:37, Dylan Baker wrote:

Quoting Jose Fonseca (2017-03-28 13:45:57)

On 28/03/17 21:32, Dylan Baker wrote:

Quoting Jose Fonseca (2017-03-28 09:19:48)

On 28/03/17 00:12, Dylan Baker wrote:

Quoting Jose Fonseca (2017-03-27 09:58:59)

On 27/03/17 17:42, Dylan Baker wrote:

Quoting Jose Fonseca (2017-03-27 09:31:04)

On 27/03/17 17:24, Dylan Baker wrote:

Quoting Jose Fonseca (2017-03-26 14:53:50)

I've pushed the branch to mesa/demos, so we can all collaborate without
wasting time crossporting patches between private branches.

   https://cgit.freedesktop.org/mesa/demos/commit/?h=meson

Unfortunately, I couldn't actually go very far until I hit a wall, as
you can see in the last commit message.


The issue is that Windows has no standard paths for dependencies
includes/libraries (like /usr/include or /usr/lib), nor standard tool
for dependencies (no pkgconfig).  But it seems that Meson presumes any
unknown dependency can be resolved with pkgconfig.


The question is: how do I tell Meson where the GLEW headers/library for
MinGW are supposed to be found?


I know one solution might be Meson Wraps.  Is that the only way?


CMake makes it very easy to do it (via Cache files as explained in my
commit message.)  Is there a way to achieve the same, perhaps via
cross_file properties or something like that?


Jose


I think there are two ways you could solve this:

Wraps are probably the most generically correct method; what I mean by that is
that a proper wrap would solve the problem for everyone, on every operating
system, forever.


Yeah, that sounds like a good solution, particularly for Windows, where
it's so much easier to just build the dependencies as a subproject rather
than fetch them from somewhere, since MSVC runtime versions have to match
and so on.

That said, I took a look at GLEW and it doesn't look like a
straightforward project to port to meson, since it uses a huge pile of gnu
makefiles for compilation, without any autoconf/cmake/etc. I still might take a
swing at it since I want to know how hard it would be to write a wrap file for
something like GLEW (and it would probably be a pretty useful project to wrap)
where a meson build system is likely never going to go upstream.


BTW, regarding GLEW, some time ago I actually prototyped using GLAD
instead of GLEW for mesademos:

   https://cgit.freedesktop.org/~jrfonseca/mesademos/log/?h=glad

I find GLAD much nicer than GLEW: it's easier to build, it uses upstream
XML files, it supports EGL, and it's easy to bundle.

Maybe we could migrate mesademos to GLAD as part of this work instead of
trying to get glew "mesonfied".


The other option I think you can use is cross properties[1], which I believe
is the closest thing meson has to cmake's cache files.

I've pushed a couple of commits, the last one implements the cross properties
idea, which gets the build farther, but then it can't find the glut headers,
and I don't understand why, since "cc.has_header('GL/glut')" returns true. I
still think that wraps are a better plan, but I'll have to spend some time today
working on a glew wrap.

[1] https://github.com/mesonbuild/meson/wiki/Cross-compilation (at the bottom
under the heading "Custom Data")


I'm running out of time today, but I'll try to take a look tomorrow.

Jose



I'd had a similar thought, but what about libepoxy? It supports the platforms
we want, and already has a meson build system that works for Windows.


I have no experience with libepoxy.  I know GLAD is really easy to
understand, use and integrate.  It's completely agnostic to toolkits like
GLUT/GLFW/etc., and it doesn't try to alias equivalent entrypoints or do
anything smart like libepoxy does.

In particular I don't fully understand libepoxy's behavior regarding
wglMakeCurrent, and whether that will create problems with GLUT,
since GLUT will call wglMakeCurrent.


Jose


Okay, I have libepoxy working for windows. I also got libepoxy working as a
subproject, but it took a bit of hacking on their build system (there are
some things they're doing that make them non-subproject safe; I'll send patches
and work that out with them).

https://github.com/dcbaker/libepoxy.git fix-suproject


Thanks.

GLEW is not the only case though.  There's also FREEGLUT.  So we
can't really avoid the problem of external windows binaries/subprojects.

So I've been thinking, and I suspect it is better if we first get things
working with binary GLEW / FREEGLUT projects, then try the glew ->
libepoxy switch in a 2nd step, so there's less to take in to merge meson
into master.


Clone that repo into $mesa-demos-root/subprojects and things should just work,
or mostly work. I got epoxy compiling, but ran into some issues in the mingw glu
header.

Dylan


I'm pretty sure the problem with MinGW glu is the lack of windows.h.  We
need to do the same as the CMakeLists.txt snippet quoted below.

I'm running out of time today, but I'll look into porting this over to
meson tomorrow if you don't beat me to it.

Jose



if (WIN32)
   

[Mesa-dev] [PATCH] i965/fs: Gracefully handle TXS on multisampled textures with no LOD

2017-03-29 Thread Jason Ekstrand
This can happen for multisampled textures since they are never mipmapped
and textureSize(gsampler2DMS*) does not take an LOD parameter.  This
fixes a shader validation error in the new Sascha deferredmultisampling
demo.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100391
Cc: "13.0 17.0" 
---

We could also easily enough handle this in spirv_to_nir like we do with
GLSL.  However, it seems perfectly reasonable that multisampled txs should
allow no LOD in NIR.

 src/intel/compiler/brw_fs_nir.cpp | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/src/intel/compiler/brw_fs_nir.cpp 
b/src/intel/compiler/brw_fs_nir.cpp
index bc1ccfb..60604e1 100644
--- a/src/intel/compiler/brw_fs_nir.cpp
+++ b/src/intel/compiler/brw_fs_nir.cpp
@@ -4381,9 +4381,12 @@ fs_visitor::nir_emit_texture(const fs_builder &bld, 
nir_tex_instr *instr)
srcs[TEX_LOGICAL_SRC_GRAD_COMPONENTS] = brw_imm_d(lod_components);
 
if (instr->op == nir_texop_query_levels ||
+   (instr->op == nir_texop_txs &&
+instr->sampler_dim == GLSL_SAMPLER_DIM_MS) ||
(instr->op == nir_texop_tex && stage != MESA_SHADER_FRAGMENT)) {
-  /* textureQueryLevels() and texture() are implemented in terms of TXS
-   * and TXL respectively, so we need to pass a valid LOD argument.
+  /* textureQueryLevels(), textureSize(), and texture() are implemented in
+   * terms of TXS and TXL respectively, so we need to pass a valid LOD
+   * argument.
*/
   assert(srcs[TEX_LOGICAL_SRC_LOD].file == BAD_FILE);
   srcs[TEX_LOGICAL_SRC_LOD] = brw_imm_ud(0u);
-- 
2.5.0.400.gff86faf
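
A minimal sketch of the condition the patch extends, written as a standalone predicate. The enum and function names are hypothetical and only model the three cases named in the hunk; this is not the driver's code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the NIR texture opcodes used in the hunk. */
enum tex_op { OP_TEX, OP_TXS, OP_QUERY_LEVELS };

/* Hypothetical stand-ins for the shader stages. */
enum shader_stage { STAGE_VERTEX, STAGE_FRAGMENT };

/* Returns true when the backend has to supply LOD 0 itself because the
 * instruction arrives without an LOD source:
 *  - level queries,
 *  - size queries on multisampled textures (never mipmapped, so the API
 *    call takes no LOD parameter),
 *  - plain texture() outside the fragment stage. */
static bool
needs_implicit_lod_zero(enum tex_op op, bool is_multisampled,
                        enum shader_stage stage)
{
   assert(op == OP_TEX || op == OP_TXS || op == OP_QUERY_LEVELS);

   return op == OP_QUERY_LEVELS ||
          (op == OP_TXS && is_multisampled) ||
          (op == OP_TEX && stage != STAGE_FRAGMENT);
}
```

A size query on a non-multisampled texture is expected to carry its own LOD, so the predicate returns false for it.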



Re: [Mesa-dev] [PATCH v3] swr: [rasterizer codegen] Fix windows build

2017-03-29 Thread Rowley, Timothy O
Commit comment should not include “[rasterizer codegen]”, as it doesn’t modify 
that code.

With that fixed, Reviewed-by: Tim Rowley

On Mar 28, 2017, at 4:44 PM, George Kyriazis wrote:

Fix codegen build break that was introduced earlier

v2: update rules for gen_knobs.cpp and gen_knobs.h

v3: Introduce bldroot and revert generator file changes, making patch simpler.
---
src/gallium/drivers/swr/SConscript | 38 +++---
1 file changed, 31 insertions(+), 7 deletions(-)

diff --git a/src/gallium/drivers/swr/SConscript 
b/src/gallium/drivers/swr/SConscript
index ad16162..18d6c9b 100644
--- a/src/gallium/drivers/swr/SConscript
+++ b/src/gallium/drivers/swr/SConscript
@@ -47,20 +47,25 @@ if not env['msvc'] :
])

swrroot = '#src/gallium/drivers/swr/'
+bldroot = Dir('.').abspath

env.CodeGenerate(
target = 'rasterizer/codegen/gen_knobs.cpp',
script = swrroot + 'rasterizer/codegen/gen_knobs.py',
-source = 'rasterizer/codegen/templates/gen_knobs.cpp',
-command = python_cmd + ' $SCRIPT --input $SOURCE --output $TARGET 
--gen_cpp'
+source = '',
+command = python_cmd + ' $SCRIPT --output $TARGET --gen_cpp'
)
+Depends('rasterizer/codegen/gen_knobs.cpp',
+swrroot + 'rasterizer/codegen/templates/gen_knobs.cpp')

env.CodeGenerate(
target = 'rasterizer/codegen/gen_knobs.h',
script = swrroot + 'rasterizer/codegen/gen_knobs.py',
-source = 'rasterizer/codegen/templates/gen_knobs.cpp',
-command = python_cmd + ' $SCRIPT --input $SOURCE --output $TARGET --gen_h'
+source = '',
+command = python_cmd + ' $SCRIPT --output $TARGET --gen_h'
)
+Depends('rasterizer/codegen/gen_knobs.cpp',
+swrroot + 'rasterizer/codegen/templates/gen_knobs.cpp')

env.CodeGenerate(
target = 'rasterizer/jitter/gen_state_llvm.h',
@@ -68,20 +73,26 @@ env.CodeGenerate(
source = 'rasterizer/core/state.h',
command = python_cmd + ' $SCRIPT --input $SOURCE --output $TARGET'
)
+Depends('rasterizer/jitter/gen_state_llvm.h',
+swrroot + 'rasterizer/codegen/templates/gen_llvm.hpp')

env.CodeGenerate(
target = 'rasterizer/jitter/gen_builder.hpp',
script = swrroot + 'rasterizer/codegen/gen_llvm_ir_macros.py',
source = os.path.join(llvm_includedir, 'llvm/IR/IRBuilder.h'),
-command = python_cmd + ' $SCRIPT --input $SOURCE --output 
rasterizer/jitter --gen_h'
+command = python_cmd + ' $SCRIPT --input $SOURCE --output ' + bldroot + 
'/rasterizer/jitter --gen_h'
)
+Depends('rasterizer/jitter/gen_builder.hpp',
+swrroot + 'rasterizer/codegen/templates/gen_builder.hpp')

env.CodeGenerate(
target = 'rasterizer/jitter/gen_builder_x86.hpp',
script = swrroot + 'rasterizer/codegen/gen_llvm_ir_macros.py',
source = '',
-command = python_cmd + ' $SCRIPT --output rasterizer/jitter --gen_x86_h'
+command = python_cmd + ' $SCRIPT --output ' + bldroot + 
'/rasterizer/jitter --gen_x86_h'
)
+Depends('rasterizer/jitter/gen_builder.hpp',
+swrroot + 'rasterizer/codegen/templates/gen_builder.hpp')

env.CodeGenerate(
target = './gen_swr_context_llvm.h',
@@ -89,6 +100,8 @@ env.CodeGenerate(
source = 'swr_context.h',
command = python_cmd + ' $SCRIPT --input $SOURCE --output $TARGET'
)
+Depends('rasterizer/jitter/gen_state_llvm.h',
+swrroot + 'rasterizer/codegen/templates/gen_llvm.hpp')

env.CodeGenerate(
target = 'rasterizer/archrast/gen_ar_event.hpp',
@@ -96,6 +109,8 @@ env.CodeGenerate(
source = 'rasterizer/archrast/events.proto',
command = python_cmd + ' $SCRIPT --proto $SOURCE --output $TARGET 
--gen_event_h'
)
+Depends('rasterizer/jitter/gen_state_llvm.h',
+swrroot + 'rasterizer/codegen/templates/gen_ar_event.hpp')

env.CodeGenerate(
target = 'rasterizer/archrast/gen_ar_event.cpp',
@@ -103,6 +118,8 @@ env.CodeGenerate(
source = 'rasterizer/archrast/events.proto',
command = python_cmd + ' $SCRIPT --proto $SOURCE --output $TARGET 
--gen_event_cpp'
)
+Depends('rasterizer/jitter/gen_state_llvm.h',
+swrroot + 'rasterizer/codegen/templates/gen_ar_event.cpp')

env.CodeGenerate(
target = 'rasterizer/archrast/gen_ar_eventhandler.hpp',
@@ -110,6 +127,8 @@ env.CodeGenerate(
source = 'rasterizer/archrast/events.proto',
command = python_cmd + ' $SCRIPT --proto $SOURCE --output $TARGET 
--gen_eventhandler_h'
)
+Depends('rasterizer/jitter/gen_state_llvm.h',
+swrroot + 'rasterizer/codegen/templates/gen_ar_eventhandler.hpp')

env.CodeGenerate(
target = 'rasterizer/archrast/gen_ar_eventhandlerfile.hpp',
@@ -117,6 +136,8 @@ env.CodeGenerate(
source = 'rasterizer/archrast/events.proto',
command = python_cmd + ' $SCRIPT --proto $SOURCE --output $TARGET 
--gen_eventhandlerfile_h'
)
+Depends('rasterizer/jitter/gen_state_llvm.h',
+swrroot + 'rasterizer/codegen/templates/gen_ar_eventhandlerfile.hpp')

# 5 

Re: [Mesa-dev] [PATCH 0/9] RadeonSI cleanups

2017-03-29 Thread Samuel Pitoiset

Patches 1-4 & 7 are:

Reviewed-by: Samuel Pitoiset 

On 03/29/2017 07:58 PM, Marek Olšák wrote:

General cleanups and cleanups in preparation for threaded gallium.

Please review.

Thanks,
Marek
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev




Re: [Mesa-dev] [PATCH 0/9] RadeonSI cleanups

2017-03-29 Thread Edmondo Tommasina
This series is
Tested-by: Edmondo Tommasina 


On Wed, Mar 29, 2017 at 7:58 PM, Marek Olšák  wrote:
> General cleanups and cleanups in preparation for threaded gallium.
>
> Please review.
>
> Thanks,
> Marek
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 4/4] radv: Use the guard band.

2017-03-29 Thread Bas Nieuwenhuizen
Signed-off-by: Bas Nieuwenhuizen 
---
 src/amd/vulkan/radv_cmd_buffer.c |  6 ++-
 src/amd/vulkan/radv_private.h|  3 +-
 src/amd/vulkan/si_cmd_buffer.c   | 94 +++-
 3 files changed, 90 insertions(+), 13 deletions(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index 09ba7cf4e18..e50245251fb 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -750,7 +750,9 @@ radv_emit_scissor(struct radv_cmd_buffer *cmd_buffer)
 {
uint32_t count = cmd_buffer->state.dynamic.scissor.count;
si_write_scissors(cmd_buffer->cs, 0, count,
- cmd_buffer->state.dynamic.scissor.scissors);
+ cmd_buffer->state.dynamic.scissor.scissors,
+ cmd_buffer->state.dynamic.viewport.viewports,
+ 
cmd_buffer->state.emitted_pipeline->graphics.can_use_guardband);
radeon_set_context_reg(cmd_buffer->cs, R_028A48_PA_SC_MODE_CNTL_0,
   
cmd_buffer->state.pipeline->graphics.ms.pa_sc_mode_cntl_0 | 
S_028A48_VPORT_SCISSOR_ENABLE(count ? 1 : 0));
 }
@@ -1281,7 +1283,7 @@ radv_cmd_buffer_flush_state(struct radv_cmd_buffer 
*cmd_buffer,
if (cmd_buffer->state.dirty & (RADV_CMD_DIRTY_DYNAMIC_VIEWPORT))
radv_emit_viewport(cmd_buffer);
 
-   if (cmd_buffer->state.dirty & (RADV_CMD_DIRTY_DYNAMIC_SCISSOR))
+   if (cmd_buffer->state.dirty & (RADV_CMD_DIRTY_DYNAMIC_SCISSOR | 
RADV_CMD_DIRTY_DYNAMIC_VIEWPORT))
radv_emit_scissor(cmd_buffer);
 
ia_multi_vgt_param = si_get_ia_multi_vgt_param(cmd_buffer, 
instanced_draw, indirect_draw, draw_vertex_count);
diff --git a/src/amd/vulkan/radv_private.h b/src/amd/vulkan/radv_private.h
index 410e63ba413..e8f14dcfe02 100644
--- a/src/amd/vulkan/radv_private.h
+++ b/src/amd/vulkan/radv_private.h
@@ -758,7 +758,8 @@ void cik_create_gfx_config(struct radv_device *device);
 void si_write_viewport(struct radeon_winsys_cs *cs, int first_vp,
   int count, const VkViewport *viewports);
 void si_write_scissors(struct radeon_winsys_cs *cs, int first,
-  int count, const VkRect2D *scissors);
+  int count, const VkRect2D *scissors,
+  const VkViewport *viewports, bool can_use_guardband);
 uint32_t si_get_ia_multi_vgt_param(struct radv_cmd_buffer *cmd_buffer,
   bool instanced_draw, bool indirect_draw,
   uint32_t draw_vertex_count);
diff --git a/src/amd/vulkan/si_cmd_buffer.c b/src/amd/vulkan/si_cmd_buffer.c
index 55c82a9a685..66a4681dad3 100644
--- a/src/amd/vulkan/si_cmd_buffer.c
+++ b/src/amd/vulkan/si_cmd_buffer.c
@@ -361,11 +361,6 @@ si_emit_config(struct radv_physical_device 
*physical_device,
radeon_set_context_reg(cs, R_028234_PA_SU_HARDWARE_SCREEN_OFFSET, 0);
radeon_set_context_reg(cs, R_028820_PA_CL_NANINF_CNTL, 0);
 
-   radeon_set_context_reg(cs, R_028BE8_PA_CL_GB_VERT_CLIP_ADJ, fui(1.0));
-   radeon_set_context_reg(cs, R_028BEC_PA_CL_GB_VERT_DISC_ADJ, fui(1.0));
-   radeon_set_context_reg(cs, R_028BF0_PA_CL_GB_HORZ_CLIP_ADJ, fui(1.0));
-   radeon_set_context_reg(cs, R_028BF4_PA_CL_GB_HORZ_DISC_ADJ, fui(1.0));
-
radeon_set_context_reg(cs, R_028AC0_DB_SRESULTS_COMPARE_STATE0, 0x0);
radeon_set_context_reg(cs, R_028AC4_DB_SRESULTS_COMPARE_STATE1, 0x0);
radeon_set_context_reg(cs, R_028AC8_DB_PRELOAD_CONTROL, 0x0);
@@ -500,6 +495,22 @@ get_viewport_xform(const VkViewport *viewport,
translate[2] = n;
 }
 
+static void
+get_viewport_xform_scissor(const VkRect2D *scissor,
+   float scale[2], float translate[2])
+{
+   float x = scissor->offset.x;
+   float y = scissor->offset.y;
+   float half_width = 0.5f * scissor->extent.width;
+   float half_height = 0.5f * scissor->extent.height;
+
+   scale[0] = half_width;
+   translate[0] = half_width + x;
+   scale[1] = half_height;
+   translate[1] = half_height + y;
+
+}
+
 void
 si_write_viewport(struct radeon_winsys_cs *cs, int first_vp,
   int count, const VkViewport *viewports)
@@ -533,21 +544,84 @@ si_write_viewport(struct radeon_winsys_cs *cs, int 
first_vp,
}
 }
 
+static VkRect2D si_scissor_from_viewport(const VkViewport *viewport)
+{
+   float scale[3], translate[3];
+   VkRect2D rect;
+
+   get_viewport_xform(viewport, scale, translate);
+
+   rect.offset.x = translate[0] - abs(scale[0]);
+   rect.offset.y = translate[1] - abs(scale[1]);
+   rect.extent.width = ceilf(translate[0] + abs(scale[0])) - rect.offset.x;
+   rect.extent.height = ceilf(translate[1] + abs(scale[1])) - 
rect.offset.y;
+
+   return rect;
+}
+
+static VkRect2D si_intersect_scissor(const VkRect2D *a, const VkRect2D *b) {
+   VkRect2D ret;
+   ret.offset.x = 

[Mesa-dev] [PATCH 1/4] radv: Set proper viewport & scissor for meta draws.

2017-03-29 Thread Bas Nieuwenhuizen
Signed-off-by: Bas Nieuwenhuizen 
---
 src/amd/vulkan/radv_meta_blit.c   | 53 --
 src/amd/vulkan/radv_meta_blit2d.c | 52 +++--
 src/amd/vulkan/radv_meta_clear.c  | 54 +--
 src/amd/vulkan/radv_meta_decompress.c | 39 +++--
 src/amd/vulkan/radv_meta_fast_clear.c | 52 +
 src/amd/vulkan/radv_meta_resolve.c| 39 +++--
 6 files changed, 214 insertions(+), 75 deletions(-)

diff --git a/src/amd/vulkan/radv_meta_blit.c b/src/amd/vulkan/radv_meta_blit.c
index 9d4d3f02555..228aefaf4b6 100644
--- a/src/amd/vulkan/radv_meta_blit.c
+++ b/src/amd/vulkan/radv_meta_blit.c
@@ -246,8 +246,8 @@ meta_emit_blit(struct radv_cmd_buffer *cmd_buffer,
unsigned vb_size = 3 * sizeof(*vb_data);
vb_data[0] = (struct blit_vb_data) {
.pos = {
-   dest_offset_0.x,
-   dest_offset_0.y,
+   -1.0,
+   -1.0,
},
.tex_coord = {
(float)src_offset_0.x / (float)src_iview->extent.width,
@@ -258,8 +258,8 @@ meta_emit_blit(struct radv_cmd_buffer *cmd_buffer,
 
vb_data[1] = (struct blit_vb_data) {
.pos = {
-   dest_offset_0.x,
-   dest_offset_1.y,
+   -1.0,
+   1.0,
},
.tex_coord = {
(float)src_offset_0.x / (float)src_iview->extent.width,
@@ -270,8 +270,8 @@ meta_emit_blit(struct radv_cmd_buffer *cmd_buffer,
 
vb_data[2] = (struct blit_vb_data) {
.pos = {
-   dest_offset_1.x,
-   dest_offset_0.y,
+   1.0,
+   -1.0,
},
.tex_coord = {
(float)src_offset_1.x / (float)src_iview->extent.width,
@@ -444,6 +444,23 @@ meta_emit_blit(struct radv_cmd_buffer *cmd_buffer,
   device->meta_state.blit.pipeline_layout, 0, 
1,
   , 0, NULL);
 
+   radv_CmdSetViewport(radv_cmd_buffer_to_handle(cmd_buffer), 0, 1, 
&(VkViewport) {
+   .x = dest_offset_0.x,
+   .y = dest_offset_0.y,
+   .width = dest_offset_1.x - dest_offset_0.x,
+   .height = dest_offset_1.y - dest_offset_0.y,
+   .minDepth = 0.0f,
+   .maxDepth = 1.0f
+   });
+
+   radv_CmdSetScissor(radv_cmd_buffer_to_handle(cmd_buffer), 0, 1, 
&(VkRect2D) {
+   .offset = (VkOffset2D) { MIN2(dest_offset_0.x, 
dest_offset_1.x), MIN2(dest_offset_0.y, dest_offset_1.y) },
+   .extent = (VkExtent2D) {
+   abs(dest_offset_1.x - dest_offset_0.x),
+   abs(dest_offset_1.y - dest_offset_0.y)
+   },
+   });
+
radv_CmdDraw(radv_cmd_buffer_to_handle(cmd_buffer), 3, 1, 0, 0);
 
radv_CmdEndRenderPass(radv_cmd_buffer_to_handle(cmd_buffer));
@@ -813,8 +830,8 @@ radv_device_init_meta_blit_color(struct radv_device *device,
},
.pViewportState = &(VkPipelineViewportStateCreateInfo) {
.sType = 
VK_STRUCTURE_TYPE_PIPELINE_VIEWPORT_STATE_CREATE_INFO,
-   .viewportCount = 0,
-   .scissorCount = 0,
+   .viewportCount = 1,
+   .scissorCount = 1,
},
.pRasterizationState = 
&(VkPipelineRasterizationStateCreateInfo) {
.sType = 
VK_STRUCTURE_TYPE_PIPELINE_RASTERIZATION_STATE_CREATE_INFO,
@@ -842,8 +859,10 @@ radv_device_init_meta_blit_color(struct radv_device 
*device,
},
.pDynamicState = &(VkPipelineDynamicStateCreateInfo) {
.sType = 
VK_STRUCTURE_TYPE_PIPELINE_DYNAMIC_STATE_CREATE_INFO,
-   .dynamicStateCount = 2,
+   .dynamicStateCount = 4,
.pDynamicStates = (VkDynamicState[]) {
+   VK_DYNAMIC_STATE_VIEWPORT,
+   VK_DYNAMIC_STATE_SCISSOR,
VK_DYNAMIC_STATE_LINE_WIDTH,
VK_DYNAMIC_STATE_BLEND_CONSTANTS,
},
@@ -990,8 +1009,8 @@ radv_device_init_meta_blit_depth(struct radv_device 
*device,
},
.pViewportState = &(VkPipelineViewportStateCreateInfo) {
.sType = 
VK_STRUCTURE_TYPE_PIPELINE_VIEWPORT_STATE_CREATE_INFO,
-   .viewportCount = 0,
- 
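
The scissor setup added to meta_emit_blit() above takes two destination corners that may come in either order (a flipped blit). A minimal sketch of that normalization follows; the struct and function names are hypothetical and merely mirror the shape of VkOffset2D/VkRect2D.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical mirrors of VkOffset2D / VkRect2D. */
struct offset2d { int32_t x, y; };
struct rect2d { struct offset2d offset; uint32_t width, height; };

/* Build a rectangle from two opposite corners given in any order: the
 * origin is the component-wise minimum and the extent is the absolute
 * difference, as in the MIN2()/abs() pair used by the patch. */
static struct rect2d
rect_from_corners(struct offset2d a, struct offset2d b)
{
   struct rect2d r;
   /* Widen before subtracting so the difference cannot overflow. */
   int64_t w = llabs((int64_t)b.x - (int64_t)a.x);
   int64_t h = llabs((int64_t)b.y - (int64_t)a.y);

   assert(w <= UINT32_MAX && h <= UINT32_MAX);

   r.offset.x = a.x < b.x ? a.x : b.x;
   r.offset.y = a.y < b.y ? a.y : b.y;
   r.width = (uint32_t)w;
   r.height = (uint32_t)h;
   return r;
}
```

Note that the viewport in the patch keeps the signed width and height, so only the scissor needs this normalization.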

[Mesa-dev] [PATCH 3/4] radv: Prepare for not using the guard band for lines & points.

2017-03-29 Thread Bas Nieuwenhuizen
Vulkan clipping is defined in terms of vertices, while scissor-based
clipping happens on pixels. This makes a difference for points and
lines, as a vertex can be outside the viewport while some of its pixels
are inside. In Vulkan those pixels shouldn't be drawn, but they would be
with the guard band.

Signed-off-by: Bas Nieuwenhuizen 
---
 src/amd/vulkan/radv_cmd_buffer.c |  5 +
 src/amd/vulkan/radv_pipeline.c   | 26 ++
 src/amd/vulkan/radv_private.h|  1 +
 3 files changed, 32 insertions(+)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index e6f098c208d..09ba7cf4e18 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -730,6 +730,11 @@ radv_emit_graphics_pipeline(struct radv_cmd_buffer 
*cmd_buffer,
radeon_set_context_reg(cmd_buffer->cs, R_0286E8_SPI_TMPRING_SIZE,
   S_0286E8_WAVES(pipeline->max_waves) |
   
S_0286E8_WAVESIZE(pipeline->scratch_bytes_per_wave >> 10));
+
+   if (!cmd_buffer->state.emitted_pipeline ||
+   cmd_buffer->state.emitted_pipeline->graphics.can_use_guardband !=
+pipeline->graphics.can_use_guardband)
+   cmd_buffer->state.dirty |= RADV_CMD_DIRTY_DYNAMIC_SCISSOR;
cmd_buffer->state.emitted_pipeline = pipeline;
 }
 
diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
index 07020e8c387..a564085c884 100644
--- a/src/amd/vulkan/radv_pipeline.c
+++ b/src/amd/vulkan/radv_pipeline.c
@@ -1214,6 +1214,28 @@ radv_pipeline_init_multisample_state(struct 
radv_pipeline *pipeline,
ms->pa_sc_aa_mask[1] = mask | (mask << 16);
 }
 
+static bool
+radv_prim_can_use_guardband(enum VkPrimitiveTopology topology)
+{
+   switch (topology) {
+   case VK_PRIMITIVE_TOPOLOGY_POINT_LIST:
+   case VK_PRIMITIVE_TOPOLOGY_LINE_LIST:
+   case VK_PRIMITIVE_TOPOLOGY_LINE_STRIP:
+   case VK_PRIMITIVE_TOPOLOGY_LINE_LIST_WITH_ADJACENCY:
+   case VK_PRIMITIVE_TOPOLOGY_LINE_STRIP_WITH_ADJACENCY:
+   return false;
+   case VK_PRIMITIVE_TOPOLOGY_TRIANGLE_LIST:
+   case VK_PRIMITIVE_TOPOLOGY_TRIANGLE_STRIP:
+   case VK_PRIMITIVE_TOPOLOGY_TRIANGLE_FAN:
+   case VK_PRIMITIVE_TOPOLOGY_TRIANGLE_LIST_WITH_ADJACENCY:
+   case VK_PRIMITIVE_TOPOLOGY_TRIANGLE_STRIP_WITH_ADJACENCY:
+   case VK_PRIMITIVE_TOPOLOGY_PATCH_LIST:
+   return true;
+   default:
+   unreachable("unhandled primitive type");
+   }
+}
+
 static uint32_t
 si_translate_prim(enum VkPrimitiveTopology topology)
 {
@@ -1714,14 +1736,18 @@ radv_pipeline_init(struct radv_pipeline *pipeline,
radv_pipeline_init_raster_state(pipeline, pCreateInfo);
radv_pipeline_init_multisample_state(pipeline, pCreateInfo);
pipeline->graphics.prim = 
si_translate_prim(pCreateInfo->pInputAssemblyState->topology);
+   pipeline->graphics.can_use_guardband = 
radv_prim_can_use_guardband(pCreateInfo->pInputAssemblyState->topology);
+
if (radv_pipeline_has_gs(pipeline)) {
pipeline->graphics.gs_out = 
si_conv_gl_prim_to_gs_out(pipeline->shaders[MESA_SHADER_GEOMETRY]->info.gs.output_prim);
+   pipeline->graphics.can_use_guardband = 
pipeline->graphics.gs_out == V_028A6C_OUTPRIM_TYPE_TRISTRIP;
} else {
pipeline->graphics.gs_out = 
si_conv_prim_to_gs_out(pCreateInfo->pInputAssemblyState->topology);
}
if (extra && extra->use_rectlist) {
pipeline->graphics.prim = V_008958_DI_PT_RECTLIST;
pipeline->graphics.gs_out = V_028A6C_OUTPRIM_TYPE_TRISTRIP;
+   pipeline->graphics.can_use_guardband = true;
}
pipeline->graphics.prim_restart_enable = 
!!pCreateInfo->pInputAssemblyState->primitiveRestartEnable;
/* prim vertex count will need TESS changes */
diff --git a/src/amd/vulkan/radv_private.h b/src/amd/vulkan/radv_private.h
index 31e08287c9c..410e63ba413 100644
--- a/src/amd/vulkan/radv_private.h
+++ b/src/amd/vulkan/radv_private.h
@@ -968,6 +968,7 @@ struct radv_pipeline {
uint32_t pa_cl_vs_out_cntl;
uint32_t vgt_shader_stages_en;
struct radv_prim_vertex_count prim_vertex_count;
+   bool can_use_guardband;
} graphics;
};
 
-- 
2.12.1
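
To illustrate the commit message, here is a simplified one-dimensional model of a wide point near the viewport edge. It is only a sketch under stated assumptions: the function names are hypothetical, and real clipping works on the clip volume rather than on window coordinates.

```c
#include <assert.h>
#include <stdbool.h>

/* Vertex-based test: the point survives only if its center lies inside
 * the viewport range [vp_min, vp_max]. */
static bool
vertex_inside(float center, float vp_min, float vp_max)
{
   assert(vp_min <= vp_max);
   return center >= vp_min && center <= vp_max;
}

/* Pixel-based test: with a guard band, only the scissor trims the point,
 * so any part of its footprint that overlaps the viewport is drawn. */
static bool
pixels_overlap(float center, float size, float vp_min, float vp_max)
{
   assert(size >= 0.0f && vp_min <= vp_max);
   float lo = center - 0.5f * size;
   float hi = center + 0.5f * size;
   return hi > vp_min && lo < vp_max;
}
```

For a point of size 4 centered at -1 with a viewport of [0, 100], the vertex test rejects the point while the pixel test still finds covered pixels, which is the mismatch the patch prepares to avoid for points and lines.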



[Mesa-dev] [PATCH 2/4] radv: Drop the default viewport when 0 viewports are given.

2017-03-29 Thread Bas Nieuwenhuizen
Signed-off-by: Bas Nieuwenhuizen 
---
 src/amd/vulkan/si_cmd_buffer.c | 19 ++-
 1 file changed, 2 insertions(+), 17 deletions(-)

diff --git a/src/amd/vulkan/si_cmd_buffer.c b/src/amd/vulkan/si_cmd_buffer.c
index 6e50f64a29a..55c82a9a685 100644
--- a/src/amd/vulkan/si_cmd_buffer.c
+++ b/src/amd/vulkan/si_cmd_buffer.c
@@ -506,21 +506,7 @@ si_write_viewport(struct radeon_winsys_cs *cs, int 
first_vp,
 {
int i;
 
-   if (count == 0) {
-   radeon_set_context_reg_seq(cs, R_02843C_PA_CL_VPORT_XSCALE, 6);
-   radeon_emit(cs, fui(1.0));
-   radeon_emit(cs, fui(0.0));
-   radeon_emit(cs, fui(1.0));
-   radeon_emit(cs, fui(0.0));
-   radeon_emit(cs, fui(1.0));
-   radeon_emit(cs, fui(0.0));
-
-   radeon_set_context_reg_seq(cs, R_0282D0_PA_SC_VPORT_ZMIN_0, 2);
-   radeon_emit(cs, fui(0.0));
-   radeon_emit(cs, fui(1.0));
-
-   return;
-   }
+   assert(count);
radeon_set_context_reg_seq(cs, R_02843C_PA_CL_VPORT_XSCALE +
   first_vp * 4 * 6, count * 6);
 
@@ -552,8 +538,7 @@ si_write_scissors(struct radeon_winsys_cs *cs, int first,
   int count, const VkRect2D *scissors)
 {
int i;
-   if (count == 0)
-   return;
+   assert(count);
 
radeon_set_context_reg_seq(cs, R_028250_PA_SC_VPORT_SCISSOR_0_TL + 
first * 4 * 2, count * 2);
for (i = 0; i < count; i++) {
-- 
2.12.1



[Mesa-dev] [Bug 100259] [EGL] [GBM] undefined reference to `gbm_bo_create_with_modifiers'

2017-03-29 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=100259

ovarieg...@yahoo.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|REOPENED                    |RESOLVED

--- Comment #11 from ovarieg...@yahoo.com ---
It turns out this was all my fault and it was a bug in my pkgconf.SlackBuild.

I was lacking /usr/lib64 as a system libdir...

Sorry for the noise.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH] anv/cmd_buffer: fix host memory leak

2017-03-29 Thread Jason Ekstrand
Reviewed-by: Jason Ekstrand 

And pushed.

On Wed, Mar 29, 2017 at 12:14 PM,  wrote:

> From: Craig Stout 
>
> push_constants must be free'd.
>
> https://bugs.freedesktop.org/show_bug.cgi?id=100452
> ---
>  src/intel/vulkan/anv_cmd_buffer.c | 10 +-
>  1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/src/intel/vulkan/anv_cmd_buffer.c b/src/intel/vulkan/anv_cmd_
> buffer.c
> index 909bee2..c65eba2 100644
> --- a/src/intel/vulkan/anv_cmd_buffer.c
> +++ b/src/intel/vulkan/anv_cmd_buffer.c
> @@ -120,7 +120,12 @@ anv_cmd_state_reset(struct anv_cmd_buffer *cmd_buffer)
> cmd_buffer->batch.status = VK_SUCCESS;
>
> memset(&state->descriptors, 0, sizeof(state->descriptors));
> -   memset(&state->push_constants, 0, sizeof(state->push_constants));
> +   for (uint32_t i = 0; i < MESA_SHADER_STAGES; i++) {
> +  if (state->push_constants[i] != NULL) {
> + vk_free(&cmd_buffer->pool->alloc, state->push_constants[i]);
> + state->push_constants[i] = NULL;
> +  }
> +   }
> memset(state->binding_tables, 0, sizeof(state->binding_tables));
> memset(state->samplers, 0, sizeof(state->samplers));
>
> @@ -193,6 +198,9 @@ static VkResult anv_create_cmd_buffer(
>
> cmd_buffer->batch.status = VK_SUCCESS;
>
> +   for (uint32_t i = 0; i < MESA_SHADER_STAGES; i++) {
> +  cmd_buffer->state.push_constants[i] = NULL;
> +   }
> cmd_buffer->_loader_data.loaderMagic = ICD_LOADER_MAGIC;
> cmd_buffer->device = device;
> cmd_buffer->pool = pool;
> --
> 2.7.4
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
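
For context, a minimal sketch of the leak pattern the patch fixes: zeroing an array of owning pointers with memset() drops the allocations, so each entry has to be freed first. The struct, the stage count and the use of plain free() instead of vk_free() with the pool allocator are assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>

#define NUM_STAGES 6 /* hypothetical stand-in for MESA_SHADER_STAGES */

/* Hypothetical stand-in for the command buffer state. */
struct cmd_state {
   void *push_constants[NUM_STAGES]; /* each entry is lazily allocated */
};

/* Reset the state without leaking: release every per-stage allocation
 * before clearing its pointer. */
static void
cmd_state_reset(struct cmd_state *state)
{
   assert(state != NULL);

   for (int i = 0; i < NUM_STAGES; i++) {
      free(state->push_constants[i]); /* free(NULL) is a no-op */
      state->push_constants[i] = NULL;
   }
}
```

Clearing the pointers after freeing also makes a second reset harmless, which matters because the reset path can run more than once per command buffer.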


Re: [Mesa-dev] [PATCH] gallium: remove support for predicates from TGSI

2017-03-29 Thread Jose Fonseca

On 29/03/17 19:02, Roland Scheidegger wrote:

[resend with snipped bits as it's too big]

A couple comments inline.

[snip]


--- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
@@ -746,39 +746,30 @@ static void lp_exec_default(struct lp_exec_mask *mask,
 }


 /* stores val into an address pointed to by dst_ptr.
  * mask->exec_mask is used to figure out which bits of val
  * should be stored into the address
  * (0 means don't store this bit, 1 means do store).
  */
 static void lp_exec_mask_store(struct lp_exec_mask *mask,
struct lp_build_context *bld_store,
-   LLVMValueRef pred,
LLVMValueRef val,
LLVMValueRef dst_ptr)
 {
LLVMBuilderRef builder = mask->bld->gallivm->builder;
+   LLVMValueRef pred = mask->has_mask ? mask->exec_mask : NULL;

Calling this "pred" now seems to be somewhat of a misnomer (wasn't all
that great before because it then included exec_mask but it's worse now).





assert(lp_check_value(bld_store->type, val));
assert(LLVMGetTypeKind(LLVMTypeOf(dst_ptr)) == LLVMPointerTypeKind);
assert(LLVMGetElementType(LLVMTypeOf(dst_ptr)) == LLVMTypeOf(val));

-   /* Mix the predicate and execution mask */
-   if (mask->has_mask) {
-  if (pred) {
- pred = LLVMBuildAnd(builder, pred, mask->exec_mask, "");
-  } else {
- pred = mask->exec_mask;
-  }
-   }
-
if (pred) {
   LLVMValueRef res, dst;

   dst = LLVMBuildLoad(builder, dst_ptr, "");
   res = lp_build_select(bld_store, pred, val, dst);
   LLVMBuildStore(builder, res, dst_ptr);
} else
   LLVMBuildStore(builder, val, dst_ptr);
 }

@@ -1029,36 +1020,26 @@ build_gather(struct lp_build_tgsi_context *bld_base,


 /**
  * Scatter/store vector.
  */
 static void
 emit_mask_scatter(struct lp_build_tgsi_soa_context *bld,
   LLVMValueRef base_ptr,
   LLVMValueRef indexes,
   LLVMValueRef values,
-  struct lp_exec_mask *mask,
-  LLVMValueRef pred)
+  struct lp_exec_mask *mask)
 {
struct gallivm_state *gallivm = bld->bld_base.base.gallivm;
LLVMBuilderRef builder = gallivm->builder;
unsigned i;
-
-   /* Mix the predicate and execution mask */
-   if (mask->has_mask) {
-  if (pred) {
- pred = LLVMBuildAnd(builder, pred, mask->exec_mask, "");
-  }
-  else {
- pred = mask->exec_mask;
-  }
-   }
+   LLVMValueRef pred = mask->has_mask ? mask->exec_mask : NULL;

same here.
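
For readers following the naming remark, here is a minimal scalar sketch of what the select-based store in lp_exec_mask_store amounts to once only the execution mask is left. It is an illustration under assumptions, not the gallivm code: `masked_store`, its parameters, and the all-ones/zero lane encoding are hypothetical stand-ins for the LLVM vector select.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of a masked store: a lane whose mask is zero
 * keeps the old destination value, a lane whose mask is all-ones takes the
 * new value. A NULL mask means "no mask active", so every lane is stored. */
static void
masked_store(uint32_t *dst, const uint32_t *val, const uint32_t *exec_mask,
             size_t lanes)
{
   assert(dst != NULL && val != NULL);

   for (size_t i = 0; i < lanes; i++) {
      if (exec_mask == NULL)
         dst[i] = val[i];
      else
         dst[i] = (val[i] & exec_mask[i]) | (dst[i] & ~exec_mask[i]);
   }
}
```

Seen this way, the value passed to the select is simply the execution mask (or nothing), which is why a name other than "pred" may read more naturally.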



diff --git a/src/gallium/include/pipe/p_shader_tokens.h 
b/src/gallium/include/pipe/p_shader_tokens.h
index 6a3fb98..87d2d92 100644
--- a/src/gallium/include/pipe/p_shader_tokens.h
+++ b/src/gallium/include/pipe/p_shader_tokens.h
@@ -62,21 +62,20 @@ struct tgsi_token

 enum tgsi_file_type {
TGSI_FILE_NULL,
TGSI_FILE_CONSTANT,
TGSI_FILE_INPUT,
TGSI_FILE_OUTPUT,
TGSI_FILE_TEMPORARY,
TGSI_FILE_SAMPLER,
TGSI_FILE_ADDRESS,
TGSI_FILE_IMMEDIATE,
-   TGSI_FILE_PREDICATE,
TGSI_FILE_SYSTEM_VALUE,
TGSI_FILE_IMAGE,
TGSI_FILE_SAMPLER_VIEW,
TGSI_FILE_BUFFER,
TGSI_FILE_MEMORY,
TGSI_FILE_COUNT,  /**< how many TGSI_FILE_ types */
 };


 #define TGSI_WRITEMASK_NONE 0x00
@@ -609,34 +608,31 @@ struct tgsi_property_data {

 /**
  * Opcode is the operation code to execute. A given operation defines the
  * semantics how the source registers (if any) are interpreted and what is
  * written to the destination registers (if any) as a result of execution.
  *
  * NumDstRegs and NumSrcRegs is the number of destination and source registers,
  * respectively. For a given operation code, those numbers are fixed and are
  * present here only for convenience.
  *
- * If Predicate is TRUE, tgsi_instruction_predicate token immediately follows.
- *
  * Saturate controls how are final results in destination registers modified.
  */

 struct tgsi_instruction
 {
unsigned Type   : 4;  /* TGSI_TOKEN_TYPE_INSTRUCTION */
unsigned NrTokens   : 8;  /* UINT */
unsigned Opcode : 8;  /* TGSI_OPCODE_ */
unsigned Saturate   : 1;  /* BOOL */
unsigned NumDstRegs : 2;  /* UINT */
unsigned NumSrcRegs : 4;  /* UINT */
-   unsigned Predicate  : 1;  /* BOOL */
unsigned Label  : 1;
unsigned Texture: 1;
unsigned Memory : 1;
unsigned Padding: 1;

The Padding doesn't match.



So, we still have code which uses this. However, this code is only used
for some testing; otherwise we translate this D3D9 stuff away like
everybody else.
Maybe it's time to ditch this stuff then: clearly no other drivers are
ever going to support it, and APIs have moved away from it.

Jose, any opinion on that?


Yes, I agree.

Jose
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
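
The "Padding doesn't match" remark can be checked with simple arithmetic. The sketch below adds up the field widths quoted in the hunk after Predicate is dropped, assuming the instruction token is meant to stay exactly 32 bits wide. The enum names are hypothetical and exist only for this illustration.

```c
#include <assert.h>

/* Field widths of struct tgsi_instruction as quoted in the hunk, with the
 * Predicate bit already removed. Names are illustrative only. */
enum {
   TYPE_BITS         = 4,
   NR_TOKENS_BITS    = 8,
   OPCODE_BITS       = 8,
   SATURATE_BITS     = 1,
   NUM_DST_REGS_BITS = 2,
   NUM_SRC_REGS_BITS = 4,
   LABEL_BITS        = 1,
   TEXTURE_BITS      = 1,
   MEMORY_BITS       = 1,

   USED_BITS = TYPE_BITS + NR_TOKENS_BITS + OPCODE_BITS + SATURATE_BITS +
               NUM_DST_REGS_BITS + NUM_SRC_REGS_BITS + LABEL_BITS +
               TEXTURE_BITS + MEMORY_BITS,

   /* Assumption: the token must stay 32 bits wide. */
   PADDING_BITS = 32 - USED_BITS
};

/* Compile-time check that the remaining fields fit into one 32-bit token. */
_Static_assert(USED_BITS <= 32, "instruction token overflows 32 bits");
```

With Predicate gone, the named fields add up to 30 bits, so under that assumption Padding would need to grow from 1 to 2 bits.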

Re: [Mesa-dev] [RFC 3/3] mesa: expose KHR_no_error for GL

2017-03-29 Thread Samuel Pitoiset



On 03/29/2017 11:01 PM, Timothy Arceri wrote:



On 30/03/17 06:53, Marek Olšák wrote:

The series looks good to me except the "==" -> "&" in patch 2. The
patches have no effect without the GLX extension, right?


Correct. I was partly sending this out to see if anyone knew what was
going on since Nvidia exposes this on their driver but I couldn't find
the GLX extension anywhere.

Anyway, in the meantime we could add an environment variable to enable it.


And a driconf option also?





Marek

On Tue, Mar 28, 2017 at 6:35 AM, Timothy Arceri
 wrote:

There is no ES support for now, as this requires
EGL_KHR_create_context_no_error to be implemented.
---
 src/mesa/main/extensions_table.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/main/extensions_table.h b/src/mesa/main/extensions_table.h
index ec71791..4439731 100644
--- a/src/mesa/main/extensions_table.h
+++ b/src/mesa/main/extensions_table.h
@@ -294,20 +294,21 @@ EXT(IBM_texture_mirrored_repeat , dummy_true

 EXT(INGR_blend_func_separate                , EXT_blend_func_separate                 , GLL,  x ,  x ,  x , 1999)

 EXT(INTEL_conservative_rasterization        , INTEL_conservative_rasterization        ,  x , GLC,  x ,  31, 2013)
 EXT(INTEL_performance_query                 , INTEL_performance_query                 , GLL, GLC,  x , ES2, 2013)

 EXT(KHR_blend_equation_advanced             , KHR_blend_equation_advanced             , GLL, GLC,  x , ES2, 2014)
 EXT(KHR_blend_equation_advanced_coherent    , KHR_blend_equation_advanced_coherent    , GLL, GLC,  x , ES2, 2014)
 EXT(KHR_context_flush_control               , dummy_true                              , GLL, GLC,  x , ES2, 2014)
 EXT(KHR_debug                               , dummy_true                              , GLL, GLC,  11, ES2, 2012)
+EXT(KHR_no_error                            , dummy_true                              , GLL, GLC,  x ,  x , 2015)
 EXT(KHR_robust_buffer_access_behavior       , ARB_robust_buffer_access_behavior       , GLL, GLC,  x , ES2, 2014)
 EXT(KHR_robustness                          , KHR_robustness                          , GLL, GLC,  x , ES2, 2012)
 EXT(KHR_texture_compression_astc_hdr        , KHR_texture_compression_astc_hdr        , GLL, GLC,  x , ES2, 2012)
 EXT(KHR_texture_compression_astc_ldr        , KHR_texture_compression_astc_ldr        , GLL, GLC,  x , ES2, 2012)
 EXT(KHR_texture_compression_astc_sliced_3d  , KHR_texture_compression_astc_sliced_3d  , GLL, GLC,  x , ES2, 2015)

 EXT(MESA_pack_invert                        , MESA_pack_invert                        , GLL, GLC,  x ,  x , 2002)
 EXT(MESA_shader_integer_functions           , MESA_shader_integer_functions           , GLL, GLC,  x ,  30, 2016)
 EXT(MESA_texture_signed_rgba                , EXT_texture_snorm                       , GLL, GLC,  x ,  x , 2009)
 EXT(MESA_window_pos                         , dummy_true                              , GLL,  x ,  x ,  x , 2000)
--
2.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev



Re: [Mesa-dev] [RFC 3/3] mesa: expose KHR_no_error for GL

2017-03-29 Thread Timothy Arceri



On 30/03/17 06:53, Marek Olšák wrote:

The series looks good to me except the "==" -> "&" in patch 2. The
patches have no effect without the GLX extension, right?


Correct. I was partly sending this out to see if anyone knew what was 
going on since Nvidia exposes this on their driver but I couldn't find 
the GLX extension anywhere.


Anyway, in the meantime we could add an environment variable to enable it.



Marek

On Tue, Mar 28, 2017 at 6:35 AM, Timothy Arceri  wrote:

There is no ES support for now, as this requires
EGL_KHR_create_context_no_error to be implemented.
---
 src/mesa/main/extensions_table.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/main/extensions_table.h b/src/mesa/main/extensions_table.h
index ec71791..4439731 100644
--- a/src/mesa/main/extensions_table.h
+++ b/src/mesa/main/extensions_table.h
@@ -294,20 +294,21 @@ EXT(IBM_texture_mirrored_repeat , dummy_true

 EXT(INGR_blend_func_separate                , EXT_blend_func_separate                 , GLL,  x ,  x ,  x , 1999)

 EXT(INTEL_conservative_rasterization        , INTEL_conservative_rasterization        ,  x , GLC,  x ,  31, 2013)
 EXT(INTEL_performance_query                 , INTEL_performance_query                 , GLL, GLC,  x , ES2, 2013)

 EXT(KHR_blend_equation_advanced             , KHR_blend_equation_advanced             , GLL, GLC,  x , ES2, 2014)
 EXT(KHR_blend_equation_advanced_coherent    , KHR_blend_equation_advanced_coherent    , GLL, GLC,  x , ES2, 2014)
 EXT(KHR_context_flush_control               , dummy_true                              , GLL, GLC,  x , ES2, 2014)
 EXT(KHR_debug                               , dummy_true                              , GLL, GLC,  11, ES2, 2012)
+EXT(KHR_no_error                            , dummy_true                              , GLL, GLC,  x ,  x , 2015)
 EXT(KHR_robust_buffer_access_behavior       , ARB_robust_buffer_access_behavior       , GLL, GLC,  x , ES2, 2014)
 EXT(KHR_robustness                          , KHR_robustness                          , GLL, GLC,  x , ES2, 2012)
 EXT(KHR_texture_compression_astc_hdr        , KHR_texture_compression_astc_hdr        , GLL, GLC,  x , ES2, 2012)
 EXT(KHR_texture_compression_astc_ldr        , KHR_texture_compression_astc_ldr        , GLL, GLC,  x , ES2, 2012)
 EXT(KHR_texture_compression_astc_sliced_3d  , KHR_texture_compression_astc_sliced_3d  , GLL, GLC,  x , ES2, 2015)

 EXT(MESA_pack_invert                        , MESA_pack_invert                        , GLL, GLC,  x ,  x , 2002)
 EXT(MESA_shader_integer_functions           , MESA_shader_integer_functions           , GLL, GLC,  x ,  30, 2016)
 EXT(MESA_texture_signed_rgba                , EXT_texture_snorm                       , GLL, GLC,  x ,  x , 2009)
 EXT(MESA_window_pos                         , dummy_true                              , GLL,  x ,  x ,  x , 2000)
--
2.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
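
The environment-variable idea floated in this thread could look roughly like the sketch below. Nothing here is an existing Mesa interface: `parse_flag`, `env_flag_enabled`, and the variable name in the usage note are hypothetical, and the accepted spellings are an assumption made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: interpret a string as an on/off switch.
 * NULL, the empty string, "0" and "false" mean off; anything else is on. */
static bool
parse_flag(const char *value)
{
   if (value == NULL || value[0] == '\0')
      return false;
   return strcmp(value, "0") != 0 && strcmp(value, "false") != 0;
}

/* Hypothetical helper: read the switch from the process environment. */
static bool
env_flag_enabled(const char *name)
{
   assert(name != NULL);
   return parse_flag(getenv(name));
}
```

A driver could then call something like `env_flag_enabled("EXAMPLE_NO_ERROR")` once at context creation; the variable name is made up for this example.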


Re: [Mesa-dev] [PATCH] configure.ac: require libdrm_amdgpu 2.4.76 for Vega

2017-03-29 Thread Samuel Pitoiset

Reviewed-by: Samuel Pitoiset 

On 03/29/2017 08:23 PM, Marek Olšák wrote:

From: Marek Olšák 

---
 configure.ac | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/configure.ac b/configure.ac
index ab9a91e..70885fb 100644
--- a/configure.ac
+++ b/configure.ac
@@ -67,21 +67,21 @@ OPENCL_VERSION=1
 AC_SUBST([OPENCL_VERSION])

 # The idea is that libdrm is distributed as one cohesive package, even
 # though it is composed of multiple libraries. However some drivers
 # may have different version requirements than others. This list
 # codifies which drivers need which version of libdrm. Any libdrm
 # version dependencies in non-driver-specific code should be reflected
 # in the first entry.
 LIBDRM_REQUIRED=2.4.75
 LIBDRM_RADEON_REQUIRED=2.4.71
-LIBDRM_AMDGPU_REQUIRED=2.4.63
+LIBDRM_AMDGPU_REQUIRED=2.4.76
 LIBDRM_INTEL_REQUIRED=2.4.75
 LIBDRM_NVVIEUX_REQUIRED=2.4.66
 LIBDRM_NOUVEAU_REQUIRED=2.4.66
 LIBDRM_FREEDRENO_REQUIRED=2.4.74
 LIBDRM_VC4_REQUIRED=2.4.69
 LIBDRM_ETNAVIV_REQUIRED=2.4.74

 dnl Versions for external dependencies
 DRI2PROTO_REQUIRED=2.8
 DRI3PROTO_REQUIRED=1.0


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] anv/cmd_buffer: fix dynamic state leak

2017-03-29 Thread Jason Ekstrand
Reviewed-by: Jason Ekstrand 

On Wed, Mar 29, 2017 at 12:11 PM,  wrote:

> From: Craig Stout 
>
> anv_state_pool_alloc requires a matching free, whereas
> anv_state_stream_alloc will be cleaned up on finish.
>
> Applies only to 13.0 branch.
> x
> https://bugs.freedesktop.org/show_bug.cgi?id=100365
> ---
>  src/intel/vulkan/anv_private.h | 12 
>  src/intel/vulkan/genX_cmd_buffer.c | 32 
>  2 files changed, 28 insertions(+), 16 deletions(-)
>
> diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_
> private.h
> index dd67508..12a6aa1 100644
> --- a/src/intel/vulkan/anv_private.h
> +++ b/src/intel/vulkan/anv_private.h
> @@ -765,6 +765,18 @@ _anv_combine_address(struct anv_batch *batch, void
> *location,
>__state;  \
> })
>
> +#define anv_state_stream_emit(stream, cmd, align, ...)
>  \
> +   ({
>   \
> +  const uint32_t __size = __anv_cmd_length(cmd) * 4;
>  \
> +  struct anv_state __state = anv_state_stream_alloc((stream),
> __size, align);  \
> +  struct cmd __template = {__VA_ARGS__};
>  \
> +  __anv_cmd_pack(cmd)(NULL, __state.map, &__template);
>  \
> +  VG(VALGRIND_CHECK_MEM_IS_DEFINED(__state.map,
> __anv_cmd_length(cmd) * 4));   \
> +  if (!(stream)->block_pool->device->info.has_llc)
>\
> + anv_state_clflush(__state);
>  \
> +  __state;
>  \
> +   })
> +
>  #define GEN7_MOCS (struct GEN7_MEMORY_OBJECT_CONTROL_STATE) {  \
> .GraphicsDataTypeGFDT= 0,   \
> .LLCCacheabilityControlLLCCC = 0,   \
> diff --git a/src/intel/vulkan/genX_cmd_buffer.c
> b/src/intel/vulkan/genX_cmd_buffer.c
> index 45fefc9..33db7ce 100644
> --- a/src/intel/vulkan/genX_cmd_buffer.c
> +++ b/src/intel/vulkan/genX_cmd_buffer.c
> @@ -1367,26 +1367,26 @@ flush_compute_descriptor_set(struct
> anv_cmd_buffer *cmd_buffer)
> const uint32_t slm_size = encode_slm_size(GEN_GEN,
> prog_data->total_shared);
>
> struct anv_state state =
> -  anv_state_pool_emit(&device->dynamic_state_pool,
> -  GENX(INTERFACE_DESCRIPTOR_DATA), 64,
> -  .KernelStartPointer = pipeline->cs_simd,
> -  .BindingTablePointer = surfaces.offset,
> -  .BindingTableEntryCount = 0,
> -  .SamplerStatePointer = samplers.offset,
> -  .SamplerCount = 0,
> +  anv_state_stream_emit(&cmd_buffer->dynamic_state_stream,
> +GENX(INTERFACE_DESCRIPTOR_DATA), 64,
> +.KernelStartPointer = pipeline->cs_simd,
> +.BindingTablePointer = surfaces.offset,
> +.BindingTableEntryCount = 0,
> +.SamplerStatePointer = samplers.offset,
> +.SamplerCount = 0,
>  #if !GEN_IS_HASWELL
> -  .ConstantURBEntryReadOffset = 0,
> +.ConstantURBEntryReadOffset = 0,
>  #endif
> -  .ConstantURBEntryReadLength =
> - cs_prog_data->push.per_thread.regs,
> +.ConstantURBEntryReadLength =
> +  cs_prog_data->push.per_thread.regs,
>  #if GEN_GEN >= 8 || GEN_IS_HASWELL
> -  .CrossThreadConstantDataReadLength =
> - cs_prog_data->push.cross_thread.regs,
> +.CrossThreadConstantDataReadLength =
> +  cs_prog_data->push.cross_thread.regs,
>  #endif
> -  .BarrierEnable = cs_prog_data->uses_barrier,
> -  .SharedLocalMemorySize = slm_size,
> -  .NumberofThreadsinGPGPUThreadGroup =
> - cs_prog_data->threads);
> +.BarrierEnable = cs_prog_data->uses_barrier,
> +.SharedLocalMemorySize = slm_size,
> +.NumberofThreadsinGPGPUThreadGroup =
> +  cs_prog_data->threads);
>
> uint32_t size = GENX(INTERFACE_DESCRIPTOR_DATA_length) *
> sizeof(uint32_t);
> anv_batch_emit(&cmd_buffer->batch,
> --
> 2.7.4
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
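
The commit message contrasts pool allocations, which each need a matching free, with stream allocations, which are released together on finish. A minimal sketch of the second pattern follows. It is a generic bump allocator, not the anv implementation: the `state_stream_*` names and the single malloc'd backing block are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical bump allocator standing in for a "state stream": individual
 * allocations are never freed, the whole backing block goes away at finish. */
struct state_stream {
   char *base;
   size_t size;
   size_t next;   /* offset of the first unused byte */
};

static int
state_stream_init(struct state_stream *s, size_t size)
{
   s->base = malloc(size);
   s->size = s->base ? size : 0;
   s->next = 0;
   return s->base ? 0 : -1;
}

/* Returns NULL when the stream is full. `align` must be a power of two and
 * applies to the offset inside the stream, not to the absolute address. */
static void *
state_stream_alloc(struct state_stream *s, size_t size, size_t align)
{
   assert(align != 0 && (align & (align - 1)) == 0);

   size_t start = (s->next + align - 1) & ~(align - 1);
   if (start > s->size || size > s->size - start)
      return NULL;

   s->next = start + size;
   return s->base + start;
}

/* Releases every allocation made from the stream in one step. */
static void
state_stream_finish(struct state_stream *s)
{
   free(s->base);
   s->base = NULL;
   s->size = 0;
   s->next = 0;
}
```

Because nothing is freed per allocation, a caller that emits a descriptor on every flush cannot leak individual entries; the trade-off is that memory is only reclaimed when the owning object is finished or reset.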


[Mesa-dev] [Bug 100259] [EGL] [GBM] undefined reference to `gbm_bo_create_with_modifiers'

2017-03-29 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=100259

--- Comment #10 from ovarieg...@yahoo.com ---
It turns out that in my case this is an issue with using pkgconf (which worked
previously) instead of pkg-config. It builds fine with pkg-config.

I'd prefer to keep this open until the pkgconf devs have a chance to take a
look, but it can be closed again if someone finds that preferable.

As for my friend, apparently he was trying to use the Perl pkg-config from
OpenBSD on Gentoo (don't ask...).

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] radv: move to using nir clip/cull merge pass.

2017-03-29 Thread Bas Nieuwenhuizen
Acked-by: Bas Nieuwenhuizen 

On Wed, Mar 29, 2017 at 7:14 AM, Dave Airlie  wrote:
> From: Dave Airlie 
>
> Doing this before tessellation makes doing some bits of
> tessellation a bit cleaner. It also cleans up a bit of the
> llvm generator code.
>
> Signed-off-by: Dave Airlie 
> ---
>  src/amd/common/ac_nir_to_llvm.c | 144 
> ++--
>  src/amd/vulkan/radv_pipeline.c  |   1 +
>  2 files changed, 36 insertions(+), 109 deletions(-)
>
> diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
> index f164d8f..78602fd 100644
> --- a/src/amd/common/ac_nir_to_llvm.c
> +++ b/src/amd/common/ac_nir_to_llvm.c
> @@ -144,8 +144,6 @@ struct nir_to_llvm_context {
> int num_locals;
> LLVMValueRef *locals;
> bool has_ddxy;
> -   uint8_t num_input_clips;
> -   uint8_t num_input_culls;
> uint8_t num_output_clips;
> uint8_t num_output_culls;
>
> @@ -170,12 +168,9 @@ static unsigned 
> shader_io_get_unique_index(gl_varying_slot slot)
> return 0;
> if (slot == VARYING_SLOT_PSIZ)
> return 1;
> -   if (slot == VARYING_SLOT_CLIP_DIST0 ||
> -   slot == VARYING_SLOT_CULL_DIST0)
> +   if (slot == VARYING_SLOT_CLIP_DIST0)
> return 2;
> -   if (slot == VARYING_SLOT_CLIP_DIST1 ||
> -   slot == VARYING_SLOT_CULL_DIST1)
> -   return 3;
> +   /* 3 is reserved for clip dist as well */
> if (slot >= VARYING_SLOT_VAR0 && slot <= VARYING_SLOT_VAR31)
> return 4 + (slot - VARYING_SLOT_VAR0);
> unreachable("illegal slot in get unique index\n");
> @@ -2195,7 +2190,6 @@ load_gs_input(struct nir_to_llvm_context *ctx,
> unsigned param, vtx_offset_param;
> LLVMValueRef value[4], result;
> unsigned vertex_index;
> -   unsigned cull_offset = 0;
> radv_get_deref_offset(ctx, >variables[0]->deref,
>   false, _index,
>   _index, _index);
> @@ -2205,13 +2199,11 @@ load_gs_input(struct nir_to_llvm_context *ctx,
>   LLVMConstInt(ctx->i32, 4, false), "");
>
> param = 
> shader_io_get_unique_index(instr->variables[0]->var->data.location);
> -   if (instr->variables[0]->var->data.location == 
> VARYING_SLOT_CULL_DIST0)
> -   cull_offset += ctx->num_input_clips;
> for (unsigned i = 0; i < instr->num_components; i++) {
>
> args[0] = ctx->esgs_ring;
> args[1] = vtx_offset;
> -   args[2] = LLVMConstInt(ctx->i32, (param * 4 + i + const_index 
> + cull_offset) * 256, false);
> +   args[2] = LLVMConstInt(ctx->i32, (param * 4 + i + 
> const_index) * 256, false);
> args[3] = ctx->i32zero;
> args[4] = ctx->i32one; /* OFFEN */
> args[5] = ctx->i32zero; /* IDXEN */
> @@ -2366,8 +2358,7 @@ visit_store_var(struct nir_to_llvm_context *ctx,
>
> value = llvm_extract_elem(ctx, src, chan);
>
> -   if (instr->variables[0]->var->data.location == 
> VARYING_SLOT_CLIP_DIST0 ||
> -   instr->variables[0]->var->data.location == 
> VARYING_SLOT_CULL_DIST0)
> +   if (instr->variables[0]->var->data.compact)
> stride = 1;
> if (indir_index) {
> unsigned count = glsl_count_attribute_slots(
> @@ -3143,7 +3134,7 @@ visit_emit_vertex(struct nir_to_llvm_context *ctx,
> LLVMValueRef gs_next_vertex;
> LLVMValueRef can_emit, kill;
> int idx;
> -   int clip_cull_slot = -1;
> +
> assert(instr->const_index[0] == 0);
> /* Write vertex attribute values to GSVS ring */
> gs_next_vertex = LLVMBuildLoad(ctx->builder,
> @@ -3175,27 +3166,11 @@ visit_emit_vertex(struct nir_to_llvm_context *ctx,
> if (!(ctx->output_mask & (1ull << i)))
> continue;
>
> -   if (i == VARYING_SLOT_CLIP_DIST1 ||
> -   i == VARYING_SLOT_CULL_DIST1)
> -   continue;
> -
> -   if (i == VARYING_SLOT_CLIP_DIST0 ||
> -   i == VARYING_SLOT_CULL_DIST0) {
> +   if (i == VARYING_SLOT_CLIP_DIST0) {
> /* pack clip and cull into a single set of slots */
> -   if (clip_cull_slot == -1) {
> -   clip_cull_slot = idx;
> -   if (ctx->num_output_clips + 
> ctx->num_output_culls > 4)
> -   slot_inc = 2;
> -   } else {
> -   slot = clip_cull_slot;
> -   slot_inc = 0;
> -   }
> -   if 

Re: [Mesa-dev] [PATCH] winsys/amdgpu: remove AMDGPU_INFO_NUM_EVICTIONS

2017-03-29 Thread Marek Olšák
Reviewed-by: Marek Olšák 

Marek

On Wed, Mar 29, 2017 at 9:06 PM, Samuel Pitoiset
 wrote:
> This is now exposed with libdrm_amdgpu 2.4.76.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c | 4 
>  1 file changed, 4 deletions(-)
>
> diff --git a/src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c 
> b/src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c
> index 37e0140311..39a05d0f02 100644
> --- a/src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c
> +++ b/src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c
> @@ -59,10 +59,6 @@
>  #define CIK__PIPE_CONFIG__ADDR_SURF_P16_32X32_8X16   16
>  #define CIK__PIPE_CONFIG__ADDR_SURF_P16_32X32_16X16  17
>
> -#ifndef AMDGPU_INFO_NUM_EVICTIONS
> -#define AMDGPU_INFO_NUM_EVICTIONS  0x18
> -#endif
> -
>  static struct util_hash_table *dev_tab = NULL;
>  static mtx_t dev_tab_mutex = _MTX_INITIALIZER_NP;
>
> --
> 2.12.1
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] [RFC v3] mesa/glthread: Call unmarshal_batch directly in glthread_finish

2017-03-29 Thread Thomas Helland
2017-03-29 21:17 GMT+02:00 Thomas Helland :
> 2017-03-29 19:35 GMT+02:00 Bartosz Tomczyk :
>> I would be very grateful if someone could help with testing the performance
>> impact of this change.
>>
>
> Currently prepping some tests on my HTPC, which is a bit CPU-bound.
> I'll report back in about an hour or so.
>

My HTPC has an RX460 and i3-6100 combination, so I was thinking
the low number of threads on the processor could be impacted
by the threaded dispatch. However, I've tested Talos Principle,
Dota 2, and Metro Last Light, and none of these show any
regressions like those seen in Michael Larabel's tests on
Phoronix back in mid-March. My system is probably too
GPU-limited for any possible bottleneck to show.

I'll see if I can get my workstation up and running.
It has an RX480 and FX-8320 combination,
so weaker cores and a stronger graphics card.
Hopefully I will be able to reproduce things there.

>> On Wed, Mar 29, 2017 at 7:31 PM, Bartosz Tomczyk
>>  wrote:
>>>
>>> Call it directly when the batch queue is empty. This avoids costly thread
>>> synchronisation. With this fix, games that previously regressed
>>> with mesa_glthread=true, like Xonotic or GRID Autosport, should no longer
>>> regress.
>>> ---
>>>  src/mesa/main/glthread.c | 47
>>> ++-
>>>  1 file changed, 34 insertions(+), 13 deletions(-)
>>>
>>> diff --git a/src/mesa/main/glthread.c b/src/mesa/main/glthread.c
>>> index 06115b916d..faf42c2b89 100644
>>> --- a/src/mesa/main/glthread.c
>>> +++ b/src/mesa/main/glthread.c
>>> @@ -194,16 +194,12 @@ _mesa_glthread_restore_dispatch(struct gl_context
>>> *ctx)
>>> }
>>>  }
>>>
>>> -void
>>> -_mesa_glthread_flush_batch(struct gl_context *ctx)
>>> +static void
>>> +_mesa_glthread_flush_batch_locked(struct gl_context *ctx)
>>>  {
>>> struct glthread_state *glthread = ctx->GLThread;
>>> -   struct glthread_batch *batch;
>>> -
>>> -   if (!glthread)
>>> -  return;
>>> -
>>> -   batch = glthread->batch;
>>> +   struct glthread_batch *batch = glthread->batch;
>>> +
>>> if (!batch->used)
>>>return;
>>>
>>> @@ -223,10 +219,26 @@ _mesa_glthread_flush_batch(struct gl_context *ctx)
>>>return;
>>> }
>>>
>>> -   pthread_mutex_lock(&glthread->mutex);
>>> *glthread->batch_queue_tail = batch;
>>> glthread->batch_queue_tail = &batch->next;
>>> pthread_cond_broadcast(&glthread->new_work);
>>> +
>>> +}
>>> +void
>>> +_mesa_glthread_flush_batch(struct gl_context *ctx)
>>> +{
>>> +   struct glthread_state *glthread = ctx->GLThread;
>>> +   struct glthread_batch *batch;
>>> +
>>> +   if (!glthread)
>>> +  return;
>>> +
>>> +   batch = glthread->batch;
>>> +   if (!batch->used)
>>> +  return;
>>> +
>>> +   pthread_mutex_lock(&glthread->mutex);
>>> +   _mesa_glthread_flush_batch_locked(ctx);
>>> pthread_mutex_unlock(&glthread->mutex);
>>>  }
>>>
>>> @@ -252,12 +264,21 @@ _mesa_glthread_finish(struct gl_context *ctx)
>>> if (pthread_self() == glthread->thread)
>>>return;
>>>
>>> -   _mesa_glthread_flush_batch(ctx);
>>> -
>>> pthread_mutex_lock(&glthread->mutex);
>>>
>>> -   while (glthread->batch_queue || glthread->busy)
>>> -  pthread_cond_wait(&glthread->work_done, &glthread->mutex);
>>> +   if (!(glthread->batch_queue || glthread->busy)) {
>>> +  if (glthread->batch && glthread->batch->used) {
>>> + struct _glapi_table *dispatch = _glapi_get_dispatch();
>>> + glthread_unmarshal_batch(ctx, glthread->batch);
>>> + _glapi_set_dispatch(dispatch);
>>> + glthread_allocate_batch(ctx);
>>> +  }
>>> +   }
>>> +   else {
>>> +  _mesa_glthread_flush_batch_locked(ctx);
>>> +  while (glthread->batch_queue || glthread->busy)
>>> + pthread_cond_wait(&glthread->work_done, &glthread->mutex);
>>> +   }
>>>
>>> pthread_mutex_unlock(&glthread->mutex);
>>>  }
>>> --
>>> 2.12.2
>>>
>>
>>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
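
The decision the quoted patch makes in _mesa_glthread_finish can be modelled without any real threads. The sketch below is such a model, with the caller assumed to hold the queue mutex (as in the `_locked` helper of the patch). `struct finish_model`, `finish_locked`, and the counters are hypothetical names used only to show the two paths; the actual waiting on a condition variable is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, single-threaded model of the state the patch inspects. */
struct finish_model {
   int queued_batches;   /* batches already handed to the worker thread */
   bool worker_busy;     /* worker is executing a batch right now */
   int pending_calls;    /* calls recorded in the current, unflushed batch */
   int ran_inline;       /* calls executed on the application thread */
   int handed_off;       /* calls queued for the worker thread */
};

enum finish_path { FINISH_INLINE, FINISH_WAIT };

/* Caller is assumed to hold the queue mutex. */
static enum finish_path
finish_locked(struct finish_model *m)
{
   assert(m != NULL);

   if (m->queued_batches == 0 && !m->worker_busy) {
      /* Worker is idle: execute the current batch on this thread and skip
       * the wake-up/wait round trip entirely. */
      m->ran_inline += m->pending_calls;
      m->pending_calls = 0;
      return FINISH_INLINE;
   }

   /* Worker still has work: queue the batch behind it. The real code would
    * then block on a condition variable until the queue drains. */
   if (m->pending_calls > 0) {
      m->handed_off += m->pending_calls;
      m->pending_calls = 0;
      m->queued_batches++;
   }
   return FINISH_WAIT;
}
```

The inline path is where the saving would come from: when the worker has nothing to do, handing it a batch only to wait for it costs two thread wake-ups that direct execution avoids.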


[Mesa-dev] [PATCH 7/7] intel: tools: add aubinator_error_decode tool

2017-03-29 Thread Lionel Landwerlin
This is pretty much the same tool as what i-g-t has, only with fancier
decoding of the instructions/registers. It also doesn't support
anything before gen4.

Signed-off-by: Lionel Landwerlin 
---
 src/intel/Makefile.tools.am  |  20 +-
 src/intel/common/gen_decoder.c   |  10 +
 src/intel/common/gen_decoder.h   |   1 +
 src/intel/tools/.gitignore   |   1 +
 src/intel/tools/aubinator_error_decode.c | 783 +++
 5 files changed, 814 insertions(+), 1 deletion(-)
 create mode 100644 src/intel/tools/aubinator_error_decode.c

diff --git a/src/intel/Makefile.tools.am b/src/intel/Makefile.tools.am
index 245bd03eef..a3a917d50e 100644
--- a/src/intel/Makefile.tools.am
+++ b/src/intel/Makefile.tools.am
@@ -19,7 +19,9 @@
 # FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
 # IN THE SOFTWARE.
 
-noinst_PROGRAMS += tools/aubinator
+noinst_PROGRAMS += \
+   tools/aubinator \
+   tools/aubinator_error_decode
 
 tools_aubinator_SOURCES = \
tools/aubinator.c \
@@ -41,3 +43,19 @@ tools_aubinator_LDADD = \
$(EXPAT_LIBS) \
$(ZLIB_LIBS) \
-lm
+
+
+tools_aubinator_error_decode_SOURCES = \
+   tools/aubinator_error_decode.c
+
+tools_aubinator_error_decode_LDADD = \
+   common/libintel_common.la \
+   $(top_builddir)/src/util/libmesautil.la \
+   $(aubinator_DEPS) \
+   $(EXPAT_LIBS) \
+   $(ZLIB_LIBS)
+
+tools_aubinator_error_decode_CFLAGS = \
+   $(AM_CFLAGS) \
+   $(EXPAT_CFLAGS) \
+   $(ZLIB_CFLAGS)
diff --git a/src/intel/common/gen_decoder.c b/src/intel/common/gen_decoder.c
index 1c3246f265..3af472caef 100644
--- a/src/intel/common/gen_decoder.c
+++ b/src/intel/common/gen_decoder.c
@@ -112,6 +112,16 @@ gen_spec_find_register(struct gen_spec *spec, uint32_t 
offset)
return NULL;
 }
 
+struct gen_group *
+gen_spec_find_register_by_name(struct gen_spec *spec, const char *name)
+{
+   for (int i = 0; i < spec->nregisters; i++)
+  if (strcmp(spec->registers[i]->name, name) == 0)
+ return spec->registers[i];
+
+   return NULL;
+}
+
 struct gen_enum *
 gen_spec_find_enum(struct gen_spec *spec, const char *name)
 {
diff --git a/src/intel/common/gen_decoder.h b/src/intel/common/gen_decoder.h
index 1c41de80a4..936b052455 100644
--- a/src/intel/common/gen_decoder.h
+++ b/src/intel/common/gen_decoder.h
@@ -45,6 +45,7 @@ struct gen_spec *gen_spec_load_from_path(const struct 
gen_device_info *devinfo,
 uint32_t gen_spec_get_gen(struct gen_spec *spec);
 struct gen_group *gen_spec_find_instruction(struct gen_spec *spec, const 
uint32_t *p);
 struct gen_group *gen_spec_find_register(struct gen_spec *spec, uint32_t 
offset);
+struct gen_group *gen_spec_find_register_by_name(struct gen_spec *spec, const 
char *name);
 int gen_group_get_length(struct gen_group *group, const uint32_t *p);
 const char *gen_group_get_name(struct gen_group *group);
 uint32_t gen_group_get_opcode(struct gen_group *group);
diff --git a/src/intel/tools/.gitignore b/src/intel/tools/.gitignore
index 0c80a6fed2..27437f9eef 100644
--- a/src/intel/tools/.gitignore
+++ b/src/intel/tools/.gitignore
@@ -1 +1,2 @@
 /aubinator
+/aubinator_error_decode
diff --git a/src/intel/tools/aubinator_error_decode.c 
b/src/intel/tools/aubinator_error_decode.c
new file mode 100644
index 00..a477086cd8
--- /dev/null
+++ b/src/intel/tools/aubinator_error_decode.c
@@ -0,0 +1,783 @@
+/*
+ * Copyright © 2007-2017 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ * Authors:
+ *Eric Anholt 
+ *Carl Worth 
+ *Chris Wilson 
+ *
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 

[Mesa-dev] [PATCH 2/7] intel: genxml: add GFX_ARB_ERROR_RPT register

2017-03-29 Thread Lionel Landwerlin
Signed-off-by: Lionel Landwerlin 
---
 src/intel/genxml/gen6.xml  | 12 
 src/intel/genxml/gen7.xml  | 12 
 src/intel/genxml/gen75.xml | 13 +
 src/intel/genxml/gen8.xml  | 18 ++
 src/intel/genxml/gen9.xml  | 18 ++
 5 files changed, 73 insertions(+)

diff --git a/src/intel/genxml/gen6.xml b/src/intel/genxml/gen6.xml
index 3ec13cd8fc..02ed465c5d 100644
--- a/src/intel/genxml/gen6.xml
+++ b/src/intel/genxml/gen6.xml
@@ -2075,4 +2075,16 @@
[XML hunk body not preserved in the archive]
diff --git a/src/intel/genxml/gen7.xml b/src/intel/genxml/gen7.xml
index d79aad9d14..ba9c8e8154 100644
--- a/src/intel/genxml/gen7.xml
+++ b/src/intel/genxml/gen7.xml
@@ -2653,4 +2653,16 @@
[XML hunk body not preserved in the archive]
diff --git a/src/intel/genxml/gen75.xml b/src/intel/genxml/gen75.xml
index 18481f1f50..979f1e3ee2 100644
--- a/src/intel/genxml/gen75.xml
+++ b/src/intel/genxml/gen75.xml
@@ -3076,4 +3076,17 @@
[XML hunk body not preserved in the archive]
diff --git a/src/intel/genxml/gen8.xml b/src/intel/genxml/gen8.xml
index b6af98a194..91573ae73a 100644
--- a/src/intel/genxml/gen8.xml
+++ b/src/intel/genxml/gen8.xml
@@ -3330,4 +3330,22 @@
[XML hunk body not preserved in the archive]
diff --git a/src/intel/genxml/gen9.xml b/src/intel/genxml/gen9.xml
index b4dc6c4966..448ac6c8ab 100644
--- a/src/intel/genxml/gen9.xml
+++ b/src/intel/genxml/gen9.xml
@@ -3614,4 +3614,22 @@
[XML hunk body not preserved in the archive]
-- 
2.11.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/7] intel: genxml: add ACTHD registers

2017-03-29 Thread Lionel Landwerlin
Signed-off-by: Lionel Landwerlin 
---
 src/intel/genxml/gen8.xml | 16 
 src/intel/genxml/gen9.xml | 16 
 2 files changed, 32 insertions(+)

diff --git a/src/intel/genxml/gen8.xml b/src/intel/genxml/gen8.xml
index 91573ae73a..be54748876 100644
--- a/src/intel/genxml/gen8.xml
+++ b/src/intel/genxml/gen8.xml
@@ -3348,4 +3348,20 @@
[XML hunk body not preserved in the archive]
diff --git a/src/intel/genxml/gen9.xml b/src/intel/genxml/gen9.xml
index 448ac6c8ab..7509e49236 100644
--- a/src/intel/genxml/gen9.xml
+++ b/src/intel/genxml/gen9.xml
@@ -3632,4 +3632,20 @@
[XML hunk body not preserved in the archive]
-- 
2.11.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/7] intel: genxml: add INSTDONE registers

2017-03-29 Thread Lionel Landwerlin
Signed-off-by: Lionel Landwerlin 
---
 src/intel/genxml/gen6.xml  | 110 +
 src/intel/genxml/gen7.xml  |  64 ++
 src/intel/genxml/gen75.xml |  71 +
 src/intel/genxml/gen8.xml  |  71 +
 src/intel/genxml/gen9.xml  |  71 +
 5 files changed, 387 insertions(+)

diff --git a/src/intel/genxml/gen6.xml b/src/intel/genxml/gen6.xml
index 33969d937e..3ec13cd8fc 100644
--- a/src/intel/genxml/gen6.xml
+++ b/src/intel/genxml/gen6.xml
@@ -1965,4 +1965,114 @@
 
   
 
+  
+
+
+
+
+  
+
+  
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+  
+
+  
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+  
+
+  
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+  
+
 
diff --git a/src/intel/genxml/gen7.xml b/src/intel/genxml/gen7.xml
index f46dae7ce0..d79aad9d14 100644
--- a/src/intel/genxml/gen7.xml
+++ b/src/intel/genxml/gen7.xml
@@ -2546,6 +2546,70 @@
 
   
 
+  
+
+
+
+
+  
+
+  
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+  
+
+  
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+  
+
   
 
 
diff --git a/src/intel/genxml/gen75.xml b/src/intel/genxml/gen75.xml
index 7fe9b02d6e..18481f1f50 100644
--- a/src/intel/genxml/gen75.xml
+++ b/src/intel/genxml/gen75.xml
@@ -2954,6 +2954,77 @@
 
   
 
+  
+
+
+
+
+  
+
+  
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+  
+
+  
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+  
+
+  
+
+
+
+  
+
   
 
 
diff --git a/src/intel/genxml/gen8.xml b/src/intel/genxml/gen8.xml
index 0ebf2aa9c0..b6af98a194 100644
--- a/src/intel/genxml/gen8.xml
+++ b/src/intel/genxml/gen8.xml
@@ -3214,6 +3214,77 @@
 
   
 
+  
+
+
+
+
+  
+
+  
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+  
+
+  
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+  
+
+  
+
+
+
+  
+
   
 
 
diff --git a/src/intel/genxml/gen9.xml b/src/intel/genxml/gen9.xml
index 79fad000b2..b4dc6c4966 100644
--- a/src/intel/genxml/gen9.xml
+++ b/src/intel/genxml/gen9.xml
@@ -3491,6 +3491,77 @@
 
   
 
+  
+
+
+
+
+  
+
+  
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+  
+
+  
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+  
+
+  
+
+
+
+  
+
   
 
 
-- 
2.11.0



[Mesa-dev] [PATCH 4/7] intel: genxml: add gen7 ERR_INT register

2017-03-29 Thread Lionel Landwerlin
Signed-off-by: Lionel Landwerlin 
---
 src/intel/genxml/gen7.xml | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/src/intel/genxml/gen7.xml b/src/intel/genxml/gen7.xml
index ba9c8e8154..08307b3506 100644
--- a/src/intel/genxml/gen7.xml
+++ b/src/intel/genxml/gen7.xml
@@ -2665,4 +2665,15 @@
 
   
 
+  
+
+
+
+
+
+
+
+
+  
+
 
-- 
2.11.0



[Mesa-dev] [PATCH 0/7] Aubinator error decode

2017-03-29 Thread Lionel Landwerlin
Hi,

This series introduces a slightly enhanced version of
intel_error_decode. Most Mesa developers working on the i965/vulkan
drivers may have to deal with hangs related to a specific workload.
Having the complete decoding of the instruction stream is quite
useful.

With the Anv driver, genxml files were introduced, and we have used
them successfully in aubinator to look at .aub files. With this change
we can do the same for error states reported by the kernel driver.

Cheers,

Lionel Landwerlin (7):
  intel: genxml: add INSTDONE registers
  intel: genxml: add GFX_ARB_ERROR_RPT register
  intel: genxml: add ACTHD registers
  intel: genxml: add gen7 ERR_INT register
  intel: genxml: add FAULT_REG register
  intel: genxml: add RING_BUFFER_CTL registers
  intel: tools: add aubinator_error_decode tool

 src/intel/Makefile.tools.am  |  20 +-
 src/intel/common/gen_decoder.c   |  10 +
 src/intel/common/gen_decoder.h   |   1 +
 src/intel/genxml/gen6.xml| 210 +
 src/intel/genxml/gen7.xml| 175 +++
 src/intel/genxml/gen75.xml   | 202 
 src/intel/genxml/gen8.xml| 197 
 src/intel/genxml/gen9.xml| 197 
 src/intel/tools/.gitignore   |   1 +
 src/intel/tools/aubinator_error_decode.c | 783 +++
 10 files changed, 1795 insertions(+), 1 deletion(-)
 create mode 100644 src/intel/tools/aubinator_error_decode.c

--
2.11.0


[Mesa-dev] [PATCH 6/7] intel: genxml: add RING_BUFFER_CTL registers

2017-03-29 Thread Lionel Landwerlin
Signed-off-by: Lionel Landwerlin 
---
 src/intel/genxml/gen6.xml  | 40 +++
 src/intel/genxml/gen7.xml  | 40 +++
 src/intel/genxml/gen75.xml | 54 
 src/intel/genxml/gen8.xml  | 69 ++
 src/intel/genxml/gen9.xml  | 69 ++
 5 files changed, 272 insertions(+)

diff --git a/src/intel/genxml/gen6.xml b/src/intel/genxml/gen6.xml
index 99683ceed5..5083f074a1 100644
--- a/src/intel/genxml/gen6.xml
+++ b/src/intel/genxml/gen6.xml
@@ -2135,4 +2135,44 @@
 
   
 
+  
+
+
+  
+  
+  
+  
+
+
+
+
+
+  
+
+  
+
+
+  
+  
+  
+
+
+
+
+  
+
+  
+
+
+  
+  
+  
+  
+
+
+
+
+
+  
+
 
diff --git a/src/intel/genxml/gen7.xml b/src/intel/genxml/gen7.xml
index cbd5bbbf5a..ada8f74396 100644
--- a/src/intel/genxml/gen7.xml
+++ b/src/intel/genxml/gen7.xml
@@ -2724,4 +2724,44 @@
 
   
 
+  
+
+
+  
+  
+  
+  
+
+
+
+
+
+  
+
+  
+
+
+  
+  
+  
+
+
+
+
+  
+
+  
+
+
+  
+  
+  
+  
+
+
+
+
+
+  
+
 
diff --git a/src/intel/genxml/gen75.xml b/src/intel/genxml/gen75.xml
index 9137e6f460..50d6d8d8aa 100644
--- a/src/intel/genxml/gen75.xml
+++ b/src/intel/genxml/gen75.xml
@@ -3153,4 +3153,58 @@
 
   
 
+  
+
+
+  
+  
+  
+  
+
+
+
+
+
+  
+
+  
+
+
+  
+  
+  
+
+
+
+
+  
+
+  
+
+
+  
+  
+  
+  
+
+
+
+
+
+  
+
+  
+
+
+  
+  
+  
+  
+
+
+
+
+
+  
+
 
diff --git a/src/intel/genxml/gen8.xml b/src/intel/genxml/gen8.xml
index 8835cb99f7..1390fe68c1 100644
--- a/src/intel/genxml/gen8.xml
+++ b/src/intel/genxml/gen8.xml
@@ -3387,4 +3387,73 @@
 
   
 
+  
+
+
+  
+  
+  
+  
+
+
+
+
+
+  
+
+  
+
+
+  
+  
+  
+  
+
+
+
+
+  
+
+  
+
+
+  
+  
+  
+  
+
+
+
+
+
+  
+
+  
+
+
+  
+  
+  
+  
+
+
+
+
+
+  
+
+  
+
+
+  
+  
+  
+  
+
+
+
+
+
+  
+
 
diff --git a/src/intel/genxml/gen9.xml b/src/intel/genxml/gen9.xml
index 26e6459e4d..4bf0fb6199 100644
--- a/src/intel/genxml/gen9.xml
+++ b/src/intel/genxml/gen9.xml
@@ -3671,4 +3671,73 @@
 
   
 
+  
+
+
+  
+  
+  
+  
+
+
+
+
+
+  
+
+  
+
+
+  
+  
+  
+  
+
+
+
+
+  
+
+  
+
+
+  
+  
+  
+  
+
+
+
+
+
+  
+
+  
+
+
+  
+  
+  
+  
+
+
+
+
+
+  
+
+  
+
+
+  
+  
+  
+  
+
+
+
+
+
+  
+
 
-- 
2.11.0



[Mesa-dev] [PATCH 5/7] intel: genxml: add FAULT_REG register

2017-03-29 Thread Lionel Landwerlin
Signed-off-by: Lionel Landwerlin 
---
 src/intel/genxml/gen6.xml  | 48 ++
 src/intel/genxml/gen7.xml  | 48 ++
 src/intel/genxml/gen75.xml | 64 ++
 src/intel/genxml/gen8.xml  | 23 +
 src/intel/genxml/gen9.xml  | 23 +
 5 files changed, 206 insertions(+)

diff --git a/src/intel/genxml/gen6.xml b/src/intel/genxml/gen6.xml
index 02ed465c5d..99683ceed5 100644
--- a/src/intel/genxml/gen6.xml
+++ b/src/intel/genxml/gen6.xml
@@ -2087,4 +2087,52 @@
 
   
 
+  
+
+
+  
+  
+  
+  
+
+
+
+  
+  
+
+
+  
+
+  
+
+
+  
+  
+  
+  
+
+
+
+  
+  
+
+
+  
+
+  
+
+
+  
+  
+  
+  
+
+
+
+  
+  
+
+
+  
+
 
diff --git a/src/intel/genxml/gen7.xml b/src/intel/genxml/gen7.xml
index 08307b3506..cbd5bbbf5a 100644
--- a/src/intel/genxml/gen7.xml
+++ b/src/intel/genxml/gen7.xml
@@ -2676,4 +2676,52 @@
 
   
 
+  
+
+
+  
+  
+  
+  
+
+
+
+  
+  
+
+
+  
+
+  
+
+
+  
+  
+  
+  
+
+
+
+  
+  
+
+
+  
+
+  
+
+
+  
+  
+  
+  
+
+
+
+  
+  
+
+
+  
+
 
diff --git a/src/intel/genxml/gen75.xml b/src/intel/genxml/gen75.xml
index 979f1e3ee2..9137e6f460 100644
--- a/src/intel/genxml/gen75.xml
+++ b/src/intel/genxml/gen75.xml
@@ -3089,4 +3089,68 @@
 
   
 
+  
+
+
+  
+  
+  
+  
+
+
+
+  
+  
+
+
+  
+
+  
+
+
+  
+  
+  
+  
+
+
+
+  
+  
+
+
+  
+
+  
+
+
+  
+  
+  
+  
+
+
+
+  
+  
+
+
+  
+
+  
+
+
+  
+  
+  
+  
+
+
+
+  
+  
+
+
+  
+
 
diff --git a/src/intel/genxml/gen8.xml b/src/intel/genxml/gen8.xml
index be54748876..8835cb99f7 100644
--- a/src/intel/genxml/gen8.xml
+++ b/src/intel/genxml/gen8.xml
@@ -3364,4 +3364,27 @@
 
   
 
+  
+
+
+  
+  
+  
+  
+
+
+
+  
+  
+
+
+  
+  
+  
+  
+  
+  
+
+  
+
 
diff --git a/src/intel/genxml/gen9.xml b/src/intel/genxml/gen9.xml
index 7509e49236..26e6459e4d 100644
--- a/src/intel/genxml/gen9.xml
+++ b/src/intel/genxml/gen9.xml
@@ -3648,4 +3648,27 @@
 
   
 
+  
+
+
+  
+  
+  
+  
+
+
+
+  
+  
+
+
+  
+  
+  
+  
+  
+  
+
+  
+
 
-- 
2.11.0



Re: [Mesa-dev] [PATCH] st: Add cubeMapFace parameter to st_finalize_texture.

2017-03-29 Thread Marek Olšák
I'm OK with this patch.

Marek

On Wed, Mar 29, 2017 at 12:57 PM, Nicolai Hähnle  wrote:
> Hi Michal,
>
> thanks for the patch. That piglit test actually fails on radeonsi as well.
>
>
> On 28.03.2017 22:39, Michal Srb wrote:
>>
>> st_finalize_texture always accesses image at face 0, but it may not be set
>> if we are working with cubemap that had other face set.
>>
>> This fixes crash in piglit
>> same-attachment-glFramebufferTexture2D-GL_DEPTH_STENCIL_ATTACHMENT.
>
>
> Please make sure commit messages are wrapped to <75 characters.
>
> Also:
>
> Cc: mesa-sta...@lists.freedesktop.org
>
>
>> ---
>> Hi, this is my attempt to fix crash in piglit test
>> same-attachment-glFramebufferTexture2D-GL_DEPTH_STENCIL_ATTACHMENT ran with
>> LIBGL_ALWAYS_INDIRECT=1.
>> I am not sure if it is the right approach. From what I found online
>> rendering into a face of a cube texture that doesn't have all faces set
>> would be invalid, but the test passes with other drivers, so maybe it's ok.
>> This makes it pass with software rendering as well.
>
>
> I actually don't see anything in the spec that would require texture
> completeness. That makes sense, since rendering into one image of a texture
> doesn't imply using sampler state. So allowing the test to pass is good.
>
> The flip-side is that this means calling st_finalize_texture at all may not
> be the right thing to do in the FBO code (except perhaps as an opportunistic
> optimization). After all, we could have a messed up situation where there
> are incompatible mip levels in a texture, and we render to one of them
> anyway.
>
> Cleaning that up would be quite involved. I think this fix is fine for now,
> since it does improve the situation:
>
> Reviewed-by: Nicolai Hähnle 
>
> Let's see if there are any other opinions...
>
> Cheers,
> Nicolai
>
>
>
>>
>>  src/gallium/state_trackers/dri/dri2.c| 2 +-
>>  src/mesa/state_tracker/st_atom_image.c   | 2 +-
>>  src/mesa/state_tracker/st_atom_texture.c | 2 +-
>>  src/mesa/state_tracker/st_cb_fbo.c   | 2 +-
>>  src/mesa/state_tracker/st_cb_texture.c   | 5 +++--
>>  src/mesa/state_tracker/st_cb_texture.h   | 3 ++-
>>  src/mesa/state_tracker/st_gen_mipmap.c   | 2 +-
>>  7 files changed, 10 insertions(+), 8 deletions(-)
>>
>> diff --git a/src/gallium/state_trackers/dri/dri2.c
>> b/src/gallium/state_trackers/dri/dri2.c
>> index b50e096..ed6004f 100644
>> --- a/src/gallium/state_trackers/dri/dri2.c
>> +++ b/src/gallium/state_trackers/dri/dri2.c
>> @@ -1808,7 +1808,7 @@ dri2_interop_export_object(__DRIcontext *_ctx,
>>   return MESA_GLINTEROP_INVALID_MIP_LEVEL;
>>}
>>
>> -  if (!st_finalize_texture(ctx, st->pipe, obj)) {
>> +  if (!st_finalize_texture(ctx, st->pipe, obj, 0)) {
>>   mtx_unlock(&ctx->Shared->Mutex);
>>   return MESA_GLINTEROP_OUT_OF_RESOURCES;
>>}
>> diff --git a/src/mesa/state_tracker/st_atom_image.c
>> b/src/mesa/state_tracker/st_atom_image.c
>> index 5dd2cd6..4101552 100644
>> --- a/src/mesa/state_tracker/st_atom_image.c
>> +++ b/src/mesa/state_tracker/st_atom_image.c
>> @@ -64,7 +64,7 @@ st_bind_images(struct st_context *st, struct gl_program
>> *prog,
>>struct pipe_image_view *img = &images[i];
>>
>>if (!_mesa_is_image_unit_valid(st->ctx, u) ||
>> -  !st_finalize_texture(st->ctx, st->pipe, u->TexObj) ||
>> +  !st_finalize_texture(st->ctx, st->pipe, u->TexObj, 0) ||
>>!stObj->pt) {
>>   memset(img, 0, sizeof(*img));
>>   continue;
>> diff --git a/src/mesa/state_tracker/st_atom_texture.c
>> b/src/mesa/state_tracker/st_atom_texture.c
>> index 92023e0..5b481ec 100644
>> --- a/src/mesa/state_tracker/st_atom_texture.c
>> +++ b/src/mesa/state_tracker/st_atom_texture.c
>> @@ -73,7 +73,7 @@ update_single_texture(struct st_context *st,
>> }
>> stObj = st_texture_object(texObj);
>>
>> -   retval = st_finalize_texture(ctx, st->pipe, texObj);
>> +   retval = st_finalize_texture(ctx, st->pipe, texObj, 0);
>> if (!retval) {
>>/* out of mem */
>>return GL_FALSE;
>> diff --git a/src/mesa/state_tracker/st_cb_fbo.c
>> b/src/mesa/state_tracker/st_cb_fbo.c
>> index 78433bf..dce4239 100644
>> --- a/src/mesa/state_tracker/st_cb_fbo.c
>> +++ b/src/mesa/state_tracker/st_cb_fbo.c
>> @@ -488,7 +488,7 @@ st_render_texture(struct gl_context *ctx,
>> struct st_renderbuffer *strb = st_renderbuffer(rb);
>> struct pipe_resource *pt;
>>
>> -   if (!st_finalize_texture(ctx, pipe, att->Texture))
>> +   if (!st_finalize_texture(ctx, pipe, att->Texture, att->CubeMapFace))
>>return;
>>
>> pt = st_get_texobj_resource(att->Texture);
>> diff --git a/src/mesa/state_tracker/st_cb_texture.c
>> b/src/mesa/state_tracker/st_cb_texture.c
>> index bc6f108..1b486d7 100644
>> --- a/src/mesa/state_tracker/st_cb_texture.c
>> +++ b/src/mesa/state_tracker/st_cb_texture.c
>> @@ -2434,7 +2434,8 @@ copy_image_data_to_texture(struct st_context *st,
>>  GLboolean
>>  

Re: [Mesa-dev] [PATCH] [RFC v3] mesa/glthread: Call unmarshal_batch directly in glthread_finish

2017-03-29 Thread Edmondo Tommasina
This patch helps against the massive performance drop of glthread with
Two Worlds.

The performance boost in Civ5 is not hurt by this patch. It looks good.

Some trivial comments in the patch:

On Wed, Mar 29, 2017 at 7:35 PM, Bartosz Tomczyk
 wrote:
> I  would be very grateful if someone could help with testing performance
> impact of this change.
>
> On Wed, Mar 29, 2017 at 7:31 PM, Bartosz Tomczyk
>  wrote:
>>
>> Call it directly when batch queue is empty. This avoids costly thread
>> synchronisation. With this fix games that previously regressed
>> with mesa_glthread=true like xonotic or grid autosport.
>> ---
>>  src/mesa/main/glthread.c | 47
>> ++-
>>  1 file changed, 34 insertions(+), 13 deletions(-)
>>
>> diff --git a/src/mesa/main/glthread.c b/src/mesa/main/glthread.c
>> index 06115b916d..faf42c2b89 100644
>> --- a/src/mesa/main/glthread.c
>> +++ b/src/mesa/main/glthread.c
>> @@ -194,16 +194,12 @@ _mesa_glthread_restore_dispatch(struct gl_context
>> *ctx)
>> }
>>  }
>>
>> -void
>> -_mesa_glthread_flush_batch(struct gl_context *ctx)
>> +static void
>> +_mesa_glthread_flush_batch_locked(struct gl_context *ctx)
>>  {
>> struct glthread_state *glthread = ctx->GLThread;
>> -   struct glthread_batch *batch;
>> -
>> -   if (!glthread)
>> -  return;
>> -
>> -   batch = glthread->batch;
>> +   struct glthread_batch *batch = glthread->batch;
>> +

Trailing whitespace.

>> if (!batch->used)
>>return;
>>
>> @@ -223,10 +219,26 @@ _mesa_glthread_flush_batch(struct gl_context *ctx)
>>return;
>> }
>>
>> -   pthread_mutex_lock(&glthread->mutex);
>> *glthread->batch_queue_tail = batch;
>> glthread->batch_queue_tail = &batch->next;
>> pthread_cond_broadcast(&glthread->new_work);
>> +
>> +}

Move the bracket one line up.

Thanks
edmondo

>> +void
>> +_mesa_glthread_flush_batch(struct gl_context *ctx)
>> +{
>> +   struct glthread_state *glthread = ctx->GLThread;
>> +   struct glthread_batch *batch;
>> +
>> +   if (!glthread)
>> +  return;
>> +
>> +   batch = glthread->batch;
>> +   if (!batch->used)
>> +  return;
>> +
>> +   pthread_mutex_lock(&glthread->mutex);
>> +   _mesa_glthread_flush_batch_locked(ctx);
>> pthread_mutex_unlock(&glthread->mutex);
>>  }
>>
>> @@ -252,12 +264,21 @@ _mesa_glthread_finish(struct gl_context *ctx)
>> if (pthread_self() == glthread->thread)
>>return;
>>
>> -   _mesa_glthread_flush_batch(ctx);
>> -
>> pthread_mutex_lock(&glthread->mutex);
>>
>> -   while (glthread->batch_queue || glthread->busy)
>> -  pthread_cond_wait(&glthread->work_done, &glthread->mutex);
>> +   if (!(glthread->batch_queue || glthread->busy)) {
>> +  if (glthread->batch && glthread->batch->used) {
>> + struct _glapi_table *dispatch = _glapi_get_dispatch();
>> + glthread_unmarshal_batch(ctx, glthread->batch);
>> + _glapi_set_dispatch(dispatch);
>> + glthread_allocate_batch(ctx);
>> +  }
>> +   }
>> +   else {
>> +  _mesa_glthread_flush_batch_locked(ctx);
>> +  while (glthread->batch_queue || glthread->busy)
>> + pthread_cond_wait(&glthread->work_done, &glthread->mutex);
>> +   }
>>
>> pthread_mutex_unlock(&glthread->mutex);
>>  }
>> --
>> 2.12.2
>>
>
>
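
The fast path discussed in this thread can be sketched in isolation. This
is a simplified model under stated assumptions, not Mesa's code: the
glt_* names and struct glt are hypothetical, a batch is reduced to a
count of commands, and only the "run inline when the worker is idle"
idea is taken from the quoted patch.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical model of a command-batching worker thread. */
struct glt {
   pthread_mutex_t mutex;
   pthread_cond_t new_work, work_done;
   pthread_t thread;
   int pending;      /* recorded by the caller, not yet handed off */
   int queued;       /* handed to the worker, not yet picked up */
   bool busy;        /* worker is executing a batch */
   bool shutdown;
   int executed;     /* total commands executed so far */
   int inline_runs;  /* times finish ran a batch on the calling thread */
};

void glt_execute(struct glt *g, int n) { g->executed += n; }

void *glt_worker(void *arg)
{
   struct glt *g = arg;
   pthread_mutex_lock(&g->mutex);
   while (!g->shutdown) {
      if (g->queued == 0) {
         pthread_cond_wait(&g->new_work, &g->mutex);
         continue;
      }
      int n = g->queued;
      g->queued = 0;
      g->busy = true;
      pthread_mutex_unlock(&g->mutex);
      glt_execute(g, n);                 /* runs outside the lock */
      pthread_mutex_lock(&g->mutex);
      g->busy = false;
      pthread_cond_broadcast(&g->work_done);
   }
   pthread_mutex_unlock(&g->mutex);
   return NULL;
}

void glt_init(struct glt *g)
{
   g->pending = g->queued = g->executed = g->inline_runs = 0;
   g->busy = g->shutdown = false;
   pthread_mutex_init(&g->mutex, NULL);
   pthread_cond_init(&g->new_work, NULL);
   pthread_cond_init(&g->work_done, NULL);
   pthread_create(&g->thread, NULL, glt_worker, g);
}

void glt_record(struct glt *g, int n) { g->pending += n; }

/* Asynchronous hand-off: always pays for the wake-up of the worker. */
void glt_flush(struct glt *g)
{
   pthread_mutex_lock(&g->mutex);
   g->queued += g->pending;
   g->pending = 0;
   pthread_cond_broadcast(&g->new_work);
   pthread_mutex_unlock(&g->mutex);
}

void glt_finish(struct glt *g)
{
   pthread_mutex_lock(&g->mutex);
   if (g->queued == 0 && !g->busy) {
      /* Worker is idle: execute on the caller, no hand-off, no waiting. */
      if (g->pending > 0) {
         glt_execute(g, g->pending);
         g->pending = 0;
         g->inline_runs++;
      }
   } else {
      /* Earlier work is still in flight: queue behind it to keep order. */
      g->queued += g->pending;
      g->pending = 0;
      pthread_cond_broadcast(&g->new_work);
      while (g->queued != 0 || g->busy)
         pthread_cond_wait(&g->work_done, &g->mutex);
   }
   pthread_mutex_unlock(&g->mutex);
}

void glt_destroy(struct glt *g)
{
   glt_finish(g);
   pthread_mutex_lock(&g->mutex);
   g->shutdown = true;
   pthread_cond_broadcast(&g->new_work);
   pthread_mutex_unlock(&g->mutex);
   pthread_join(g->thread, NULL);
   pthread_mutex_destroy(&g->mutex);
   pthread_cond_destroy(&g->new_work);
   pthread_cond_destroy(&g->work_done);
}
```

In this model a finish that follows directly after another finish takes
the inline branch, which is the case where the quoted patch avoids the
thread synchronisation. The slow branch keeps command ordering by queuing
behind work the worker already owns.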


Re: [Mesa-dev] [RFC 3/3] mesa: expose KHR_no_error for GL

2017-03-29 Thread Marek Olšák
The series looks good to me except the "==" -> "&" in patch 2. The
patches have no effect without the GLX extension, right?

Marek

On Tue, Mar 28, 2017 at 6:35 AM, Timothy Arceri  wrote:
> There is no ES support for now as this requires
> EGL_KHR_create_context_no_error to be implemented.
> ---
>  src/mesa/main/extensions_table.h | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/src/mesa/main/extensions_table.h 
> b/src/mesa/main/extensions_table.h
> index ec71791..4439731 100644
> --- a/src/mesa/main/extensions_table.h
> +++ b/src/mesa/main/extensions_table.h
> @@ -294,20 +294,21 @@ EXT(IBM_texture_mirrored_repeat , dummy_true
>
>  EXT(INGR_blend_func_separate, EXT_blend_func_separate
> , GLL,  x ,  x ,  x , 1999)
>
>  EXT(INTEL_conservative_rasterization, 
> INTEL_conservative_rasterization   ,  x , GLC,  x ,  31, 2013)
>  EXT(INTEL_performance_query , INTEL_performance_query
> , GLL, GLC,  x , ES2, 2013)
>
>  EXT(KHR_blend_equation_advanced , KHR_blend_equation_advanced
> , GLL, GLC,  x , ES2, 2014)
>  EXT(KHR_blend_equation_advanced_coherent, 
> KHR_blend_equation_advanced_coherent   , GLL, GLC,  x , ES2, 2014)
>  EXT(KHR_context_flush_control   , dummy_true 
> , GLL, GLC,  x , ES2, 2014)
>  EXT(KHR_debug   , dummy_true 
> , GLL, GLC,  11, ES2, 2012)
> +EXT(KHR_no_error, dummy_true 
> , GLL, GLC,  x ,  x , 2015)
>  EXT(KHR_robust_buffer_access_behavior   , 
> ARB_robust_buffer_access_behavior  , GLL, GLC,  x , ES2, 2014)
>  EXT(KHR_robustness  , KHR_robustness 
> , GLL, GLC,  x , ES2, 2012)
>  EXT(KHR_texture_compression_astc_hdr, 
> KHR_texture_compression_astc_hdr   , GLL, GLC,  x , ES2, 2012)
>  EXT(KHR_texture_compression_astc_ldr, 
> KHR_texture_compression_astc_ldr   , GLL, GLC,  x , ES2, 2012)
>  EXT(KHR_texture_compression_astc_sliced_3d  , 
> KHR_texture_compression_astc_sliced_3d , GLL, GLC,  x , ES2, 2015)
>
>  EXT(MESA_pack_invert, MESA_pack_invert   
> , GLL, GLC,  x ,  x , 2002)
>  EXT(MESA_shader_integer_functions   , MESA_shader_integer_functions  
> , GLL, GLC,  x ,  30, 2016)
>  EXT(MESA_texture_signed_rgba, EXT_texture_snorm  
> , GLL, GLC,  x ,  x , 2009)
>  EXT(MESA_window_pos , dummy_true 
> , GLL,  x ,  x ,  x , 2000)
> --
> 2.9.3
>


[Mesa-dev] [Bug 100424] X hang (in kernel) after some event in Serious Sam Fusion using radv. 4.9/amd-staging-4.9

2017-03-29 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=100424

--- Comment #4 from Darren Salt  ---
… okay, it's looking like the Steam overlay has a lot to do with this problem.
(Tested with current Mesa git, but the same LLVM as before.)

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


Re: [Mesa-dev] [PATCH v2 25/25] radeonsi: enable ARB_sparse_buffer

2017-03-29 Thread Marek Olšák
For patches 13-25:

Reviewed-by: Marek Olšák 

I think the series will also need a newer libdrm than the one required
by configure.ac, but my latest configure.ac patch for Vega should
address that.

Marek

On Tue, Mar 28, 2017 at 11:12 AM, Nicolai Hähnle  wrote:
> From: Nicolai Hähnle 
>
> TODO add features.txt and ChangeLog
>
> v2:
> - fill in DRM version requirement
> - disable on SI due to CP DMA faults
> ---
>  src/gallium/drivers/radeonsi/si_pipe.c | 10 ++
>  1 file changed, 10 insertions(+)
>
> diff --git a/src/gallium/drivers/radeonsi/si_pipe.c 
> b/src/gallium/drivers/radeonsi/si_pipe.c
> index 277fa28..9096f16 100644
> --- a/src/gallium/drivers/radeonsi/si_pipe.c
> +++ b/src/gallium/drivers/radeonsi/si_pipe.c
> @@ -461,20 +461,30 @@ static int si_get_param(struct pipe_screen* pscreen, 
> enum pipe_cap param)
>
> case PIPE_CAP_VERTEX_BUFFER_OFFSET_4BYTE_ALIGNED_ONLY:
> case PIPE_CAP_VERTEX_BUFFER_STRIDE_4BYTE_ALIGNED_ONLY:
> case PIPE_CAP_VERTEX_ELEMENT_SRC_OFFSET_4BYTE_ALIGNED_ONLY:
> /* SI doesn't support unaligned loads.
>  * CIK needs DRM 2.50.0 on radeon. */
> return sscreen->b.chip_class == SI ||
>(sscreen->b.info.drm_major == 2 &&
> sscreen->b.info.drm_minor < 50);
>
> +   case PIPE_CAP_SPARSE_BUFFER_PAGE_SIZE:
> +   /* Disable on SI due to VM faults in CP DMA. Enable once these
> +* faults are mitigated in software.
> +*/
> +   if (sscreen->b.chip_class >= CIK &&
> +   sscreen->b.info.drm_major == 3 &&
> +   sscreen->b.info.drm_minor >= 13)
> +   return RADEON_SPARSE_PAGE_SIZE;
> +   return 0;
> +
> /* Unsupported features. */
> case PIPE_CAP_BUFFER_SAMPLER_VIEW_RGBA_ONLY:
> case PIPE_CAP_TGSI_FS_COORD_ORIGIN_LOWER_LEFT:
> case PIPE_CAP_TGSI_CAN_COMPACT_CONSTANTS:
> case PIPE_CAP_USER_VERTEX_BUFFERS:
> case PIPE_CAP_FAKE_SW_MSAA:
> case PIPE_CAP_TEXTURE_GATHER_OFFSETS:
> case PIPE_CAP_VERTEXID_NOBASE:
> case PIPE_CAP_PRIMITIVE_RESTART_FOR_PATCHES:
> case PIPE_CAP_TGSI_VOTE:
> --
> 2.9.3
>


[Mesa-dev] [PATCH v3 2/2] anv: Query the kernel for reset status

2017-03-29 Thread Jason Ekstrand
When a client causes a GPU hang (or experiences issues due to a hang in
another client) we want to let it know as soon as possible.  In
particular, if it submits work with a fence and calls vkWaitForFences or
vkQueueWaitIdle and it returns VK_SUCCESS, then the client should be
able to trust the results of that rendering.  In order to provide this
guarantee, we have to ask the kernel for context status in a few key
locations.
---
 src/intel/vulkan/anv_device.c  | 114 +
 src/intel/vulkan/anv_gem.c |  17 ++
 src/intel/vulkan/anv_private.h |   5 ++
 src/intel/vulkan/genX_query.c  |  11 ++--
 4 files changed, 107 insertions(+), 40 deletions(-)
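
The wait-then-query pattern described in the commit message can be
sketched on its own. This is a minimal model, assuming a made-up
struct fake_kernel in place of the real ioctls; the names device_wait,
device_query_status and enum status are hypothetical and only mirror the
shape of the helpers this patch adds.

```c
#include <stdbool.h>
#include <stdint.h>

enum status { STATUS_SUCCESS, STATUS_TIMEOUT, STATUS_DEVICE_LOST };

/* Hypothetical stand-in for the kernel side of the interface. */
struct fake_kernel {
   uint32_t active;    /* hangs blamed on this context */
   uint32_t pending;   /* hangs that hit this context's in-flight work */
   int wait_result;    /* 0 = finished, 1 = timed out, -1 = error */
};

struct device {
   bool lost;                    /* sticky: never cleared once set */
   struct fake_kernel *kernel;
};

enum status device_query_status(struct device *dev)
{
   /* Cheap early-out that avoids asking the kernel again. */
   if (dev->lost)
      return STATUS_DEVICE_LOST;

   if (dev->kernel->active || dev->kernel->pending) {
      dev->lost = true;
      return STATUS_DEVICE_LOST;
   }
   return STATUS_SUCCESS;
}

enum status device_wait(struct device *dev)
{
   int ret = dev->kernel->wait_result;
   if (ret == 1)
      return STATUS_TIMEOUT;
   if (ret == -1) {
      dev->lost = true;
      return STATUS_DEVICE_LOST;
   }
   /* The wait completed, but a hang may have discarded the work, so
    * success is only reported after the status query agrees. */
   return device_query_status(dev);
}
```

The point of the sketch is the ordering: a completed wait alone is not
treated as success, and a detected hang marks the device lost for every
later call.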

diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index 5f0d00f..bc3be23 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -884,8 +884,6 @@ anv_device_submit_simple_batch(struct anv_device *device,
struct anv_bo bo, *exec_bos[1];
VkResult result = VK_SUCCESS;
uint32_t size;
-   int64_t timeout;
-   int ret;
 
/* Kernel driver requires 8 byte aligned batch length */
size = align_u32(batch->next - batch->start, 8);
@@ -925,14 +923,7 @@ anv_device_submit_simple_batch(struct anv_device *device,
if (result != VK_SUCCESS)
   goto fail;
 
-   timeout = INT64_MAX;
-   ret = anv_gem_wait(device, bo.gem_handle, &timeout);
-   if (ret != 0) {
-  /* We don't know the real error. */
-  device->lost = true;
-  result = vk_errorf(VK_ERROR_DEVICE_LOST, "execbuf2 failed: %m");
-  goto fail;
-   }
+   result = anv_device_wait(device, &bo, INT64_MAX);
 
  fail:
anv_bo_pool_free(&device->batch_bo_pool, &bo);
@@ -1264,6 +1255,58 @@ anv_device_execbuf(struct anv_device *device,
return VK_SUCCESS;
 }
 
+VkResult
+anv_device_query_status(struct anv_device *device)
+{
+   /* This isn't likely as most of the callers of this function already check
+* for it.  However, it doesn't hurt to check and it potentially lets us
+* avoid an ioctl.
+*/
+   if (unlikely(device->lost))
+  return VK_ERROR_DEVICE_LOST;
+
+   uint32_t active, pending;
+   int ret = anv_gem_gpu_get_reset_stats(device, &active, &pending);
+   if (ret == -1) {
+  /* We don't know the real error. */
+  device->lost = true;
+  return vk_errorf(VK_ERROR_DEVICE_LOST, "get_reset_stats failed: %m");
+   }
+
+   if (active) {
+  device->lost = true;
+  return vk_errorf(VK_ERROR_DEVICE_LOST,
+   "GPU hung on one of our command buffers");
+   } else if (pending) {
+  device->lost = true;
+  return vk_errorf(VK_ERROR_DEVICE_LOST,
+   "GPU hung with commands in-flight");
+   }
+
+   return VK_SUCCESS;
+}
+
+VkResult
+anv_device_wait(struct anv_device *device, struct anv_bo *bo,
+int64_t timeout)
+{
+   int ret = anv_gem_wait(device, bo->gem_handle, &timeout);
+   if (ret == -1 && errno == ETIME) {
+  return VK_TIMEOUT;
+   } else if (ret == -1) {
+  /* We don't know the real error. */
+  device->lost = true;
+  return vk_errorf(VK_ERROR_DEVICE_LOST, "gem wait failed: %m");
+   }
+
+   /* Query for device status after the wait.  If the BO we're waiting on got
+* caught in a GPU hang we don't want to return VK_SUCCESS to the client
+* because it clearly doesn't have valid data.  Yes, this most likely means
+* an ioctl, but we just did an ioctl to wait so it's no great loss.
+*/
+   return anv_device_query_status(device);
+}
+
 VkResult anv_QueueSubmit(
 VkQueue _queue,
 uint32_tsubmitCount,
@@ -1273,10 +1316,17 @@ VkResult anv_QueueSubmit(
ANV_FROM_HANDLE(anv_queue, queue, _queue);
ANV_FROM_HANDLE(anv_fence, fence, _fence);
struct anv_device *device = queue->device;
-   if (unlikely(device->lost))
-  return VK_ERROR_DEVICE_LOST;
 
-   VkResult result = VK_SUCCESS;
+   /* Query for device status prior to submitting.  Technically, we don't need
+* to do this.  However, if we have a client that's submitting piles of
+* garbage, we would rather break as early as possible to keep the GPU
+* hanging contained.  If we don't check here, we'll either be waiting for
+* the kernel to kick us or we'll have to wait until the client waits on a
+* fence before we actually know whether or not we've hung.
+*/
+   VkResult result = anv_device_query_status(device);
+   if (result != VK_SUCCESS)
+  return result;
 
/* We lock around QueueSubmit for three main reasons:
 *
@@ -1802,9 +1852,6 @@ VkResult anv_GetFenceStatus(
if (unlikely(device->lost))
   return VK_ERROR_DEVICE_LOST;
 
-   int64_t t = 0;
-   int ret;
-
switch (fence->state) {
case ANV_FENCE_STATE_RESET:
   /* If it hasn't even been sent off to the GPU yet, it's not ready */
@@ -1814,15 +1861,18 @@ VkResult anv_GetFenceStatus(
   /* It's been signaled, return success */
   return VK_SUCCESS;
 
-   case 

[Mesa-dev] [Bug 100259] [EGL] [GBM] undefined reference to `gbm_bo_create_with_modifiers'

2017-03-29 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=100259

ovarieg...@yahoo.com changed:

   What|Removed |Added

 Status|RESOLVED|REOPENED
 Resolution|FIXED   |---

--- Comment #9 from ovarieg...@yahoo.com ---
I am still experiencing this issue with the current mesa git master in both my
multilib 64 bit system and my clean 32-bit chroot, both with Slackware
installed. I also have the latest libdrm git master installed. A friend who
runs gentoo also experienced this issue.

drivers/dri2/.libs/platform_drm.o: In function `get_back_bo':
platform_drm.c:(.text+0x1d4): undefined reference to
`gbm_bo_create_with_modifiers'
collect2: error: ld returned 1 exit status
libtool:   error: error: relink 'libEGL.la' with the above command before
installing it
make[4]: *** [Makefile:910: install-libLTLIBRARIES] Error 1
make[4]: Leaving directory '/tmp/SBo/mesa/src/egl'
make[3]: *** [Makefile:1385: install-am] Error 2
make[3]: Leaving directory '/tmp/SBo/mesa/src/egl'
make[2]: *** [Makefile:852: install-recursive] Error 1
make[2]: Leaving directory '/tmp/SBo/mesa/src'
make[1]: *** [Makefile:1009: install] Error 2
make[1]: Leaving directory '/tmp/SBo/mesa/src'
make: *** [Makefile:643: install-recursive] Error 1

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH] [RFC v3] mesa/glthread: Call unmarshal_batch directly in glthread_finish

2017-03-29 Thread Thomas Helland
2017-03-29 19:35 GMT+02:00 Bartosz Tomczyk :
> I  would be very grateful if someone could help with testing performance
> impact of this change.
>

Currently prepping some tests on my HTPC, which is a bit CPU-bound.
I'll report back in about an hour or so.

> On Wed, Mar 29, 2017 at 7:31 PM, Bartosz Tomczyk
>  wrote:
>>
>> Call it directly when batch queue is empty. This avoids costly thread
>> synchronisation. With this fix games that previously regressed
>> with mesa_glthread=true like xonotic or grid autosport.
>> ---
>>  src/mesa/main/glthread.c | 47
>> ++-
>>  1 file changed, 34 insertions(+), 13 deletions(-)
>>
>> diff --git a/src/mesa/main/glthread.c b/src/mesa/main/glthread.c
>> index 06115b916d..faf42c2b89 100644
>> --- a/src/mesa/main/glthread.c
>> +++ b/src/mesa/main/glthread.c
>> @@ -194,16 +194,12 @@ _mesa_glthread_restore_dispatch(struct gl_context
>> *ctx)
>> }
>>  }
>>
>> -void
>> -_mesa_glthread_flush_batch(struct gl_context *ctx)
>> +static void
>> +_mesa_glthread_flush_batch_locked(struct gl_context *ctx)
>>  {
>> struct glthread_state *glthread = ctx->GLThread;
>> -   struct glthread_batch *batch;
>> -
>> -   if (!glthread)
>> -  return;
>> -
>> -   batch = glthread->batch;
>> +   struct glthread_batch *batch = glthread->batch;
>> +
>> if (!batch->used)
>>return;
>>
>> @@ -223,10 +219,26 @@ _mesa_glthread_flush_batch(struct gl_context *ctx)
>>return;
>> }
>>
>> -   pthread_mutex_lock(&glthread->mutex);
>> *glthread->batch_queue_tail = batch;
>> glthread->batch_queue_tail = &batch->next;
>> pthread_cond_broadcast(&glthread->new_work);
>> +
>> +}
>> +void
>> +_mesa_glthread_flush_batch(struct gl_context *ctx)
>> +{
>> +   struct glthread_state *glthread = ctx->GLThread;
>> +   struct glthread_batch *batch;
>> +
>> +   if (!glthread)
>> +  return;
>> +
>> +   batch = glthread->batch;
>> +   if (!batch->used)
>> +  return;
>> +
>> +   pthread_mutex_lock(&glthread->mutex);
>> +   _mesa_glthread_flush_batch_locked(ctx);
>> pthread_mutex_unlock(&glthread->mutex);
>>  }
>>
>> @@ -252,12 +264,21 @@ _mesa_glthread_finish(struct gl_context *ctx)
>> if (pthread_self() == glthread->thread)
>>return;
>>
>> -   _mesa_glthread_flush_batch(ctx);
>> -
>> pthread_mutex_lock(&glthread->mutex);
>>
>> -   while (glthread->batch_queue || glthread->busy)
>> -  pthread_cond_wait(&glthread->work_done, &glthread->mutex);
>> +   if (!(glthread->batch_queue || glthread->busy)) {
>> +  if (glthread->batch && glthread->batch->used) {
>> + struct _glapi_table *dispatch = _glapi_get_dispatch();
>> + glthread_unmarshal_batch(ctx, glthread->batch);
>> + _glapi_set_dispatch(dispatch);
>> + glthread_allocate_batch(ctx);
>> +  }
>> +   }
>> +   else {
>> +  _mesa_glthread_flush_batch_locked(ctx);
>> +  while (glthread->batch_queue || glthread->busy)
>> + pthread_cond_wait(&glthread->work_done, &glthread->mutex);
>> +   }
>>
>> pthread_mutex_unlock(&glthread->mutex);
>>  }
>> --
>> 2.12.2
>>
>
>


[Mesa-dev] [PATCH] anv/cmd_buffer: fix host memory leak

2017-03-29 Thread cstout
From: Craig Stout 

push_constants must be free'd.

https://bugs.freedesktop.org/show_bug.cgi?id=100452
---
 src/intel/vulkan/anv_cmd_buffer.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)
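
Why the memset leaked can be shown with a small stand-alone sketch. It
assumes plain malloc/free instead of the driver's allocator, and
NUM_STAGES, struct cmd_state and cmd_state_reset are hypothetical names
standing in for the real ones.

```c
#include <stdlib.h>

#define NUM_STAGES 6   /* assumption: stands in for MESA_SHADER_STAGES */

struct cmd_state {
   /* One lazily heap-allocated block per shader stage. */
   void *push_constants[NUM_STAGES];
};

/* Zeroing the array with memset would drop the only pointers to the
 * blocks. Freeing each slot first, then clearing it, releases the
 * memory and keeps a second reset harmless.
 * Returns the number of blocks that were freed. */
int cmd_state_reset(struct cmd_state *state)
{
   int freed = 0;
   for (int i = 0; i < NUM_STAGES; i++) {
      if (state->push_constants[i] != NULL) {
         free(state->push_constants[i]);
         state->push_constants[i] = NULL;
         freed++;
      }
   }
   return freed;
}
```

Initialising every slot to NULL at creation time, as the second hunk of
this patch does, is what makes the NULL check in the reset loop safe on
a freshly created command buffer.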

diff --git a/src/intel/vulkan/anv_cmd_buffer.c 
b/src/intel/vulkan/anv_cmd_buffer.c
index 909bee2..c65eba2 100644
--- a/src/intel/vulkan/anv_cmd_buffer.c
+++ b/src/intel/vulkan/anv_cmd_buffer.c
@@ -120,7 +120,12 @@ anv_cmd_state_reset(struct anv_cmd_buffer *cmd_buffer)
cmd_buffer->batch.status = VK_SUCCESS;
 
memset(&state->descriptors, 0, sizeof(state->descriptors));
-   memset(&state->push_constants, 0, sizeof(state->push_constants));
+   for (uint32_t i = 0; i < MESA_SHADER_STAGES; i++) {
+  if (state->push_constants[i] != NULL) {
+ vk_free(&cmd_buffer->pool->alloc, state->push_constants[i]);
+ state->push_constants[i] = NULL;
+  }
+   }
memset(state->binding_tables, 0, sizeof(state->binding_tables));
memset(state->samplers, 0, sizeof(state->samplers));
 
@@ -193,6 +198,9 @@ static VkResult anv_create_cmd_buffer(
 
cmd_buffer->batch.status = VK_SUCCESS;
 
+   for (uint32_t i = 0; i < MESA_SHADER_STAGES; i++) {
+  cmd_buffer->state.push_constants[i] = NULL;
+   }
cmd_buffer->_loader_data.loaderMagic = ICD_LOADER_MAGIC;
cmd_buffer->device = device;
cmd_buffer->pool = pool;
-- 
2.7.4



[Mesa-dev] [PATCH] anv/cmd_buffer: fix dynamic state leak

2017-03-29 Thread cstout
From: Craig Stout 

anv_state_pool_alloc requires a matching free, whereas
anv_state_stream_alloc will be cleaned up on finish.

Applies only to the 13.0 branch.

https://bugs.freedesktop.org/show_bug.cgi?id=100365
---
 src/intel/vulkan/anv_private.h | 12 
 src/intel/vulkan/genX_cmd_buffer.c | 32 
 2 files changed, 28 insertions(+), 16 deletions(-)

diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index dd67508..12a6aa1 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -765,6 +765,18 @@ _anv_combine_address(struct anv_batch *batch, void 
*location,
   __state;  \
})
 
+#define anv_state_stream_emit(stream, cmd, align, ...) \
+   ({ \
+      const uint32_t __size = __anv_cmd_length(cmd) * 4; \
+      struct anv_state __state = anv_state_stream_alloc((stream), __size, align); \
+      struct cmd __template = {__VA_ARGS__}; \
+      __anv_cmd_pack(cmd)(NULL, __state.map, &__template); \
+      VG(VALGRIND_CHECK_MEM_IS_DEFINED(__state.map, __anv_cmd_length(cmd) * 4)); \
+      if (!(stream)->block_pool->device->info.has_llc) \
+         anv_state_clflush(__state); \
+      __state; \
+   })
+
 #define GEN7_MOCS (struct GEN7_MEMORY_OBJECT_CONTROL_STATE) {  \
.GraphicsDataTypeGFDT= 0,   \
.LLCCacheabilityControlLLCCC = 0,   \
diff --git a/src/intel/vulkan/genX_cmd_buffer.c 
b/src/intel/vulkan/genX_cmd_buffer.c
index 45fefc9..33db7ce 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -1367,26 +1367,26 @@ flush_compute_descriptor_set(struct anv_cmd_buffer 
*cmd_buffer)
const uint32_t slm_size = encode_slm_size(GEN_GEN, prog_data->total_shared);
 
struct anv_state state =
-  anv_state_pool_emit(&device->dynamic_state_pool,
-  GENX(INTERFACE_DESCRIPTOR_DATA), 64,
-  .KernelStartPointer = pipeline->cs_simd,
-  .BindingTablePointer = surfaces.offset,
-  .BindingTableEntryCount = 0,
-  .SamplerStatePointer = samplers.offset,
-  .SamplerCount = 0,
+  anv_state_stream_emit(&cmd_buffer->dynamic_state_stream,
+GENX(INTERFACE_DESCRIPTOR_DATA), 64,
+.KernelStartPointer = pipeline->cs_simd,
+.BindingTablePointer = surfaces.offset,
+.BindingTableEntryCount = 0,
+.SamplerStatePointer = samplers.offset,
+.SamplerCount = 0,
 #if !GEN_IS_HASWELL
-  .ConstantURBEntryReadOffset = 0,
+.ConstantURBEntryReadOffset = 0,
 #endif
-  .ConstantURBEntryReadLength =
- cs_prog_data->push.per_thread.regs,
+.ConstantURBEntryReadLength =
+  cs_prog_data->push.per_thread.regs,
 #if GEN_GEN >= 8 || GEN_IS_HASWELL
-  .CrossThreadConstantDataReadLength =
- cs_prog_data->push.cross_thread.regs,
+.CrossThreadConstantDataReadLength =
+  cs_prog_data->push.cross_thread.regs,
 #endif
-  .BarrierEnable = cs_prog_data->uses_barrier,
-  .SharedLocalMemorySize = slm_size,
-  .NumberofThreadsinGPGPUThreadGroup =
- cs_prog_data->threads);
+.BarrierEnable = cs_prog_data->uses_barrier,
+.SharedLocalMemorySize = slm_size,
+.NumberofThreadsinGPGPUThreadGroup =
+  cs_prog_data->threads);
 
uint32_t size = GENX(INTERFACE_DESCRIPTOR_DATA_length) * sizeof(uint32_t);
anv_batch_emit(&cmd_buffer->batch,
-- 
2.7.4



[Mesa-dev] [PATCH] winsys/amdgpu: remove AMDGPU_INFO_NUM_EVICTIONS

2017-03-29 Thread Samuel Pitoiset
This is now exposed with libdrm_amdgpu 2.4.76.

Signed-off-by: Samuel Pitoiset 
---
 src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c 
b/src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c
index 37e0140311..39a05d0f02 100644
--- a/src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c
+++ b/src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c
@@ -59,10 +59,6 @@
 #define CIK__PIPE_CONFIG__ADDR_SURF_P16_32X32_8X16   16
 #define CIK__PIPE_CONFIG__ADDR_SURF_P16_32X32_16X16  17
 
-#ifndef AMDGPU_INFO_NUM_EVICTIONS
-#define AMDGPU_INFO_NUM_EVICTIONS  0x18
-#endif
-
 static struct util_hash_table *dev_tab = NULL;
 static mtx_t dev_tab_mutex = _MTX_INITIALIZER_NP;
 
-- 
2.12.1



Re: [Mesa-dev] [PATCH v2] anv: add support for allocating more than 1 block of memory

2017-03-29 Thread Jason Ekstrand
Looking over the patch, I think I've convinced myself that it's correct.
(I honestly wasn't expecting to come to that conclusion without more
iteration.)  That said, this raises some interesting questions.  I added
Kristian to the Cc in case he has any input.

 1. Should we do powers of two or linear?  I'm still a fan of powers of two.

 2. Should block pools even have a block size at all? We could just make
every block pool allow any power-of-two size from 4 KiB up to, say, 1 MiB
and then make the block size part of the state pool or stream that's
allocating from it.  At the moment, I like this idea, but I've given it
very little thought.

 3. If we go with the idea in 2., should we still call it block_pool?  I
think we can keep the name, but it doesn't fit as well as it once did.
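
For illustration only, the power-of-two option from 1. and 2. could be
sketched roughly as below.  This is not code from the patch or from anv:
the names size_to_bucket, bucket_to_size and the *_LOG2 constants are made
up here, and the 4 KiB and 1 MiB limits are just the example values
mentioned above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical limits taken from the example in the mail: 4 KiB .. 1 MiB. */
#define MIN_BLOCK_SIZE_LOG2 12u
#define MAX_BLOCK_SIZE_LOG2 20u
#define NUM_BUCKETS (MAX_BLOCK_SIZE_LOG2 - MIN_BLOCK_SIZE_LOG2 + 1u)

/* Map a request size in bytes to the index of the smallest power-of-two
 * bucket that can hold it.  Returns -1 for a zero-sized request or one
 * larger than the biggest bucket.
 */
static int
size_to_bucket(uint32_t size)
{
   if (size == 0 || size > (1u << MAX_BLOCK_SIZE_LOG2))
      return -1;

   uint32_t log2_size = MIN_BLOCK_SIZE_LOG2;
   while ((1u << log2_size) < size)
      log2_size++;

   return (int)(log2_size - MIN_BLOCK_SIZE_LOG2);
}

/* Inverse mapping: the block size in bytes served by a given bucket. */
static uint32_t
bucket_to_size(int bucket)
{
   assert(bucket >= 0 && (uint32_t)bucket < NUM_BUCKETS);
   return 1u << (MIN_BLOCK_SIZE_LOG2 + (uint32_t)bucket);
}
```

With one free list per bucket, a request for 4097 bytes would land in
bucket 1 and be served from 8 KiB blocks.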

Thanks for working on this!  I'm sorry it's taken so long to respond.
Every time I've looked at it, my brain hasn't been in the right state to
think about lock-free code. :-/

On Wed, Mar 15, 2017 at 5:05 AM, Juan A. Suarez Romero 
wrote:

> The current Anv allocator assigns memory in terms of a fixed block size.
>
> But there can be cases where this block is not enough for a memory
> request, and thus several blocks must be assigned in a row.
>
> This commit adds support for specifying how many blocks of memory must
> be assigned.
>
> This fixes a number of dEQP-VK.pipeline.render_to_image.* tests that crash.
>
> v2: lock-free free-list is not handled correctly (Jason)
> ---
>  src/intel/vulkan/anv_allocator.c   | 81 +++---
> 
>  src/intel/vulkan/anv_batch_chain.c |  4 +-
>  src/intel/vulkan/anv_private.h |  7 +++-
>  3 files changed, 66 insertions(+), 26 deletions(-)
>
> diff --git a/src/intel/vulkan/anv_allocator.c b/src/intel/vulkan/anv_allocator.c
> index 45c663b..3924551 100644
> --- a/src/intel/vulkan/anv_allocator.c
> +++ b/src/intel/vulkan/anv_allocator.c
> @@ -257,7 +257,8 @@ anv_block_pool_init(struct anv_block_pool *pool,
> pool->device = device;
> anv_bo_init(&pool->bo, 0, 0);
> pool->block_size = block_size;
> -   pool->free_list = ANV_FREE_LIST_EMPTY;
> +   for (uint32_t i = 0; i < ANV_MAX_BLOCKS; i++)
> +  pool->free_list[i] = ANV_FREE_LIST_EMPTY;
> pool->back_free_list = ANV_FREE_LIST_EMPTY;
>
> pool->fd = memfd_create("block pool", MFD_CLOEXEC);
> @@ -500,30 +501,35 @@ fail:
>
>  static uint32_t
>  anv_block_pool_alloc_new(struct anv_block_pool *pool,
> - struct anv_block_state *pool_state)
> + struct anv_block_state *pool_state,
> + uint32_t n_blocks)
>

Maybe have this take a size rather than n_blocks?  It's only ever called by
stuff in the block pool so the caller can do the multiplication.  It would
certainly make some of the math below easier.


>  {
> struct anv_block_state state, old, new;
>
> while (1) {
> -  state.u64 = __sync_fetch_and_add(&pool_state->u64, pool->block_size);
> -  if (state.next < state.end) {
> +  state.u64 = __sync_fetch_and_add(&pool_state->u64, n_blocks * pool->block_size);
> +  if (state.next > state.end) {
> + futex_wait(&pool_state->end, state.end);
> + continue;
> +  } else if ((state.next + (n_blocks - 1) * pool->block_size) < state.end) {
>

First off, please keep the if's in the same order unless we have a reason
to re-arrange them.  It would make this way easier to review. :-)

Second, I think this would be much easier to read as:

if (state.next + size <= state.end) {
   /* Success */
} else if (state.next <= state.end) {
   /* Our block is the one that crosses the line */
} else {
   /* Wait like everyone else */
}
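
A self-contained version of that three-way test, as a sketch only: the
enum and the classify_alloc helper are hypothetical names, and "size" is
assumed to be the allocation size in bytes (n_blocks * block_size) as
suggested further up.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical outcome names for the three branches. */
enum alloc_outcome {
   ALLOC_SUCCESS,   /* the whole allocation fits below end */
   ALLOC_MUST_GROW, /* this allocation is the one that crosses end */
   ALLOC_MUST_WAIT, /* it starts past end; someone else is growing */
};

/* next is the offset returned by the fetch-and-add, end is the current
 * pool end, size is the number of bytes requested.
 */
static enum alloc_outcome
classify_alloc(uint32_t next, uint32_t end, uint32_t size)
{
   /* Widen before adding so next + size cannot wrap around. */
   if ((uint64_t)next + size <= end)
      return ALLOC_SUCCESS;
   else if (next <= end)
      return ALLOC_MUST_GROW;
   else
      return ALLOC_MUST_WAIT;
}
```

Exactly one allocation can start at or below end and finish above it, so
only one thread ever takes the grow path.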


>   assert(pool->map);
>   return state.next;
> -  } else if (state.next == state.end) {
> - /* We allocated the first block outside the pool, we have to grow it.
> -  * pool_state->next acts a mutex: threads who try to allocate now will
> -  * get block indexes above the current limit and hit futex_wait
> -  * below. */
> - new.next = state.next + pool->block_size;
> +  } else {
> + /* We allocated the firsts blocks outside the pool, we have to grow
> +  * it. pool_state->next acts a mutex: threads who try to allocate
> +  * now will get block indexes above the current limit and hit
> +  * futex_wait below.
> +  */
> + new.next = state.next + n_blocks * pool->block_size;
>   new.end = anv_block_pool_grow(pool, pool_state);
> + /* We assume that just growing once the pool is enough to fulfil the
> +  * memory requirements
> +  */
>

I think this is probably a reasonable assumption.  That said, it wouldn't
hurt to add a size parameter to block_pool_grow but I don't know that it's
needed.


>   assert(new.end >= new.next && new.end % pool->block_size == 0);
>   old.u64 = __sync_lock_test_and_set(&pool_state->u64, new.u64);
>

[Mesa-dev] [PATCH] configure.ac: require libdrm_amdgpu 2.4.76 for Vega

2017-03-29 Thread Marek Olšák
From: Marek Olšák 

---
 configure.ac | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/configure.ac b/configure.ac
index ab9a91e..70885fb 100644
--- a/configure.ac
+++ b/configure.ac
@@ -67,21 +67,21 @@ OPENCL_VERSION=1
 AC_SUBST([OPENCL_VERSION])
 
 # The idea is that libdrm is distributed as one cohesive package, even
 # though it is composed of multiple libraries. However some drivers
 # may have different version requirements than others. This list
 # codifies which drivers need which version of libdrm. Any libdrm
 # version dependencies in non-driver-specific code should be reflected
 # in the first entry.
 LIBDRM_REQUIRED=2.4.75
 LIBDRM_RADEON_REQUIRED=2.4.71
-LIBDRM_AMDGPU_REQUIRED=2.4.63
+LIBDRM_AMDGPU_REQUIRED=2.4.76
 LIBDRM_INTEL_REQUIRED=2.4.75
 LIBDRM_NVVIEUX_REQUIRED=2.4.66
 LIBDRM_NOUVEAU_REQUIRED=2.4.66
 LIBDRM_FREEDRENO_REQUIRED=2.4.74
 LIBDRM_VC4_REQUIRED=2.4.69
 LIBDRM_ETNAVIV_REQUIRED=2.4.74
 
 dnl Versions for external dependencies
 DRI2PROTO_REQUIRED=2.8
 DRI3PROTO_REQUIRED=1.0
-- 
2.7.4



Re: [Mesa-dev] [PATCH] gallium: remove support for predicates from TGSI

2017-03-29 Thread Roland Scheidegger
[resend with snipped bits as it's too big]

A couple comments inline.

[snip]

> --- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
> +++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
> @@ -746,39 +746,30 @@ static void lp_exec_default(struct lp_exec_mask *mask,
>  }
>  
>  
>  /* stores val into an address pointed to by dst_ptr.
>   * mask->exec_mask is used to figure out which bits of val
>   * should be stored into the address
>   * (0 means don't store this bit, 1 means do store).
>   */
>  static void lp_exec_mask_store(struct lp_exec_mask *mask,
> struct lp_build_context *bld_store,
> -   LLVMValueRef pred,
> LLVMValueRef val,
> LLVMValueRef dst_ptr)
>  {
> LLVMBuilderRef builder = mask->bld->gallivm->builder;
> +   LLVMValueRef pred = mask->has_mask ? mask->exec_mask : NULL;
Calling this "pred" now seems to be somewhat of a misnomer (wasn't all
that great before because it then included exec_mask but it's worse now).



>  
> assert(lp_check_value(bld_store->type, val));
> assert(LLVMGetTypeKind(LLVMTypeOf(dst_ptr)) == LLVMPointerTypeKind);
> assert(LLVMGetElementType(LLVMTypeOf(dst_ptr)) == LLVMTypeOf(val));
>  
> -   /* Mix the predicate and execution mask */
> -   if (mask->has_mask) {
> -  if (pred) {
> - pred = LLVMBuildAnd(builder, pred, mask->exec_mask, "");
> -  } else {
> - pred = mask->exec_mask;
> -  }
> -   }
> -
> if (pred) {
>LLVMValueRef res, dst;
>  
>dst = LLVMBuildLoad(builder, dst_ptr, "");
>res = lp_build_select(bld_store, pred, val, dst);
>LLVMBuildStore(builder, res, dst_ptr);
> } else
>LLVMBuildStore(builder, val, dst_ptr);
>  }
>  
> @@ -1029,36 +1020,26 @@ build_gather(struct lp_build_tgsi_context *bld_base,
>  
>  
>  /**
>   * Scatter/store vector.
>   */
>  static void
>  emit_mask_scatter(struct lp_build_tgsi_soa_context *bld,
>LLVMValueRef base_ptr,
>LLVMValueRef indexes,
>LLVMValueRef values,
> -  struct lp_exec_mask *mask,
> -  LLVMValueRef pred)
> +  struct lp_exec_mask *mask)
>  {
> struct gallivm_state *gallivm = bld->bld_base.base.gallivm;
> LLVMBuilderRef builder = gallivm->builder;
> unsigned i;
> -
> -   /* Mix the predicate and execution mask */
> -   if (mask->has_mask) {
> -  if (pred) {
> - pred = LLVMBuildAnd(builder, pred, mask->exec_mask, "");
> -  }
> -  else {
> - pred = mask->exec_mask;
> -  }
> -   }
> +   LLVMValueRef pred = mask->has_mask ? mask->exec_mask : NULL;
same here.


> diff --git a/src/gallium/include/pipe/p_shader_tokens.h 
> b/src/gallium/include/pipe/p_shader_tokens.h
> index 6a3fb98..87d2d92 100644
> --- a/src/gallium/include/pipe/p_shader_tokens.h
> +++ b/src/gallium/include/pipe/p_shader_tokens.h
> @@ -62,21 +62,20 @@ struct tgsi_token
>  
>  enum tgsi_file_type {
> TGSI_FILE_NULL,
> TGSI_FILE_CONSTANT,
> TGSI_FILE_INPUT,
> TGSI_FILE_OUTPUT,
> TGSI_FILE_TEMPORARY,
> TGSI_FILE_SAMPLER,
> TGSI_FILE_ADDRESS,
> TGSI_FILE_IMMEDIATE,
> -   TGSI_FILE_PREDICATE,
> TGSI_FILE_SYSTEM_VALUE,
> TGSI_FILE_IMAGE,
> TGSI_FILE_SAMPLER_VIEW,
> TGSI_FILE_BUFFER,
> TGSI_FILE_MEMORY,
> TGSI_FILE_COUNT,  /**< how many TGSI_FILE_ types */
>  };
>  
>  
>  #define TGSI_WRITEMASK_NONE 0x00
> @@ -609,34 +608,31 @@ struct tgsi_property_data {
>  
>  /**
>   * Opcode is the operation code to execute. A given operation defines the
>   * semantics how the source registers (if any) are interpreted and what is
>   * written to the destination registers (if any) as a result of execution.
>   *
>   * NumDstRegs and NumSrcRegs is the number of destination and source 
> registers,
>   * respectively. For a given operation code, those numbers are fixed and are
>   * present here only for convenience.
>   *
> - * If Predicate is TRUE, tgsi_instruction_predicate token immediately 
> follows.
> - *
>   * Saturate controls how are final results in destination registers modified.
>   */
>  
>  struct tgsi_instruction
>  {
> unsigned Type   : 4;  /* TGSI_TOKEN_TYPE_INSTRUCTION */
> unsigned NrTokens   : 8;  /* UINT */
> unsigned Opcode : 8;  /* TGSI_OPCODE_ */
> unsigned Saturate   : 1;  /* BOOL */
> unsigned NumDstRegs : 2;  /* UINT */
> unsigned NumSrcRegs : 4;  /* UINT */
> -   unsigned Predicate  : 1;  /* BOOL */
> unsigned Label  : 1;
> unsigned Texture: 1;
> unsigned Memory : 1;
> unsigned Padding: 1;
The Padding doesn't match.
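
To spell out the arithmetic: with the 1-bit Predicate field gone, the
widths quoted above add up to 31 bits.  The sketch below assumes the
intended fix is to widen Padding to 2 bits; that is only one reading of
the comment, and the BITS_* names and instruction_token_bits() are made up
for illustration.

```c
#include <assert.h>

/* Field widths of the instruction token once Predicate is removed. */
enum {
   BITS_TYPE         = 4,
   BITS_NR_TOKENS    = 8,
   BITS_OPCODE       = 8,
   BITS_SATURATE     = 1,
   BITS_NUM_DST_REGS = 2,
   BITS_NUM_SRC_REGS = 4,
   BITS_LABEL        = 1,
   BITS_TEXTURE      = 1,
   BITS_MEMORY       = 1,
   BITS_PADDING      = 2, /* assumption: widened from 1 to absorb Predicate */

   BITS_TOTAL = BITS_TYPE + BITS_NR_TOKENS + BITS_OPCODE + BITS_SATURATE +
                BITS_NUM_DST_REGS + BITS_NUM_SRC_REGS + BITS_LABEL +
                BITS_TEXTURE + BITS_MEMORY + BITS_PADDING
};

/* Fails to compile if the fields no longer fill one 32-bit token. */
static_assert(BITS_TOTAL == 32, "instruction token must fill 32 bits");

/* Returns the summed width so it can also be checked at run time. */
static unsigned
instruction_token_bits(void)
{
   return BITS_TOTAL;
}
```

With BITS_PADDING left at 1 the sum is 31 and the static_assert trips.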



So, we still have code which uses this - however this code is only used
for some testing, otherwise we translate this d3d9 stuff away like
everybody else.
Maybe it's time to ditch this stuff then - clearly no other drivers are
ever going to support 

[Mesa-dev] [PATCH 6/9] radeonsi: handle incompatible DCC formats in resource_copy_region

2017-03-29 Thread Marek Olšák
From: Marek Olšák 

Required because of later commits.
---
 src/gallium/drivers/radeonsi/si_blit.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/src/gallium/drivers/radeonsi/si_blit.c 
b/src/gallium/drivers/radeonsi/si_blit.c
index bc5c2d6..ded8beb 100644
--- a/src/gallium/drivers/radeonsi/si_blit.c
+++ b/src/gallium/drivers/radeonsi/si_blit.c
@@ -934,20 +934,25 @@ void si_resource_copy_region(struct pipe_context *ctx,
src_templ.format = 
PIPE_FORMAT_R32G32B32A32_UINT;
break;
default:
fprintf(stderr, "Unhandled format %s with 
blocksize %u\n",
util_format_short_name(src->format), 
blocksize);
assert(0);
}
}
}
 
+   vi_dcc_disable_if_incompatible_format(&sctx->b, dst, dst_level,
+ dst_templ.format);
+   vi_dcc_disable_if_incompatible_format(&sctx->b, src, src_level,
+ src_templ.format);
+
/* Initialize the surface. */
dst_view = r600_create_surface_custom(ctx, dst, &dst_templ,
  dst_width, dst_height);
 
/* Initialize the sampler view. */
src_view = si_create_sampler_view_custom(ctx, src, &src_templ,
 src_width0, src_height0,
 src_force_level);
 
u_box_3d(dstx, dsty, dstz, abs(src_box->width), abs(src_box->height),
-- 
2.7.4



[Mesa-dev] [PATCH 7/9] gallium/radeon: s/dcc_disable/disable_dcc/

2017-03-29 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeon/r600_pipe_common.h |  2 +-
 src/gallium/drivers/radeon/r600_texture.c |  4 ++--
 src/gallium/drivers/radeonsi/si_blit.c| 10 +-
 src/gallium/drivers/radeonsi/si_state.c   |  2 +-
 4 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/src/gallium/drivers/radeon/r600_pipe_common.h 
b/src/gallium/drivers/radeon/r600_pipe_common.h
index 53fce50..035ab1c 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.h
+++ b/src/gallium/drivers/radeon/r600_pipe_common.h
@@ -784,21 +784,21 @@ void r600_texture_get_cmask_info(struct 
r600_common_screen *rscreen,
 struct r600_texture *rtex,
 struct r600_cmask_info *out);
 bool r600_init_flushed_depth_texture(struct pipe_context *ctx,
 struct pipe_resource *texture,
 struct r600_texture **staging);
 void r600_print_texture_info(struct r600_texture *rtex, FILE *f);
 struct pipe_resource *r600_texture_create(struct pipe_screen *screen,
const struct pipe_resource *templ);
 bool vi_dcc_formats_compatible(enum pipe_format format1,
   enum pipe_format format2);
-void vi_dcc_disable_if_incompatible_format(struct r600_common_context *rctx,
+void vi_disable_dcc_if_incompatible_format(struct r600_common_context *rctx,
   struct pipe_resource *tex,
   unsigned level,
   enum pipe_format view_format);
 struct pipe_surface *r600_create_surface_custom(struct pipe_context *pipe,
struct pipe_resource *texture,
const struct pipe_surface 
*templ,
unsigned width, unsigned 
height);
 unsigned r600_translate_colorswap(enum pipe_format format, bool 
do_endian_swap);
 void vi_separate_dcc_start_query(struct pipe_context *ctx,
 struct r600_texture *tex);
diff --git a/src/gallium/drivers/radeon/r600_texture.c 
b/src/gallium/drivers/radeon/r600_texture.c
index 94024c8..783f50c 100644
--- a/src/gallium/drivers/radeon/r600_texture.c
+++ b/src/gallium/drivers/radeon/r600_texture.c
@@ -1733,21 +1733,21 @@ bool vi_dcc_formats_compatible(enum pipe_format format1,
return false;
 
type1 = vi_get_dcc_channel_type(desc1);
type2 = vi_get_dcc_channel_type(desc2);
 
return type1 != dcc_channel_incompatible &&
   type2 != dcc_channel_incompatible &&
   type1 == type2;
 }
 
-void vi_dcc_disable_if_incompatible_format(struct r600_common_context *rctx,
+void vi_disable_dcc_if_incompatible_format(struct r600_common_context *rctx,
   struct pipe_resource *tex,
   unsigned level,
   enum pipe_format view_format)
 {
struct r600_texture *rtex = (struct r600_texture *)tex;
 
if (vi_dcc_enabled(rtex, level) &&
!vi_dcc_formats_compatible(tex->format, view_format))
if (!r600_texture_disable_dcc(rctx, (struct r600_texture*)tex))
rctx->decompress_dcc(&rctx->b, rtex);
@@ -1769,21 +1769,21 @@ struct pipe_surface *r600_create_surface_custom(struct 
pipe_context *pipe,
 
pipe_reference_init(&surface->base.reference, 1);
pipe_resource_reference(&surface->base.texture, texture);
surface->base.context = pipe;
surface->base.format = templ->format;
surface->base.width = width;
surface->base.height = height;
surface->base.u = templ->u;
 
if (texture->target != PIPE_BUFFER)
-   vi_dcc_disable_if_incompatible_format(rctx, texture,
+   vi_disable_dcc_if_incompatible_format(rctx, texture,
  templ->u.tex.level,
  templ->format);
 
return &surface->base;
 }
 
 static struct pipe_surface *r600_create_surface(struct pipe_context *pipe,
struct pipe_resource *tex,
const struct pipe_surface 
*templ)
 {
diff --git a/src/gallium/drivers/radeonsi/si_blit.c 
b/src/gallium/drivers/radeonsi/si_blit.c
index ded8beb..06a28f4 100644
--- a/src/gallium/drivers/radeonsi/si_blit.c
+++ b/src/gallium/drivers/radeonsi/si_blit.c
@@ -934,23 +934,23 @@ void si_resource_copy_region(struct pipe_context *ctx,
src_templ.format = 
PIPE_FORMAT_R32G32B32A32_UINT;
break;
default:
fprintf(stderr, "Unhandled format %s with 
blocksize %u\n",
 

[Mesa-dev] [PATCH 8/9] radeonsi: decompress DCC in set_framebuffer_state instead of create_surface

2017-03-29 Thread Marek Olšák
From: Marek Olšák 

for threaded gallium, which can't use pipe_context in create_surface
---
 src/gallium/drivers/radeon/r600_pipe_common.h |  8 +++
 src/gallium/drivers/radeon/r600_texture.c | 33 +++
 src/gallium/drivers/radeonsi/si_state.c   | 26 +
 3 files changed, 62 insertions(+), 5 deletions(-)

diff --git a/src/gallium/drivers/radeon/r600_pipe_common.h 
b/src/gallium/drivers/radeon/r600_pipe_common.h
index 035ab1c..c9cb586 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.h
+++ b/src/gallium/drivers/radeon/r600_pipe_common.h
@@ -276,20 +276,21 @@ struct r600_surface {
struct pipe_surface base;
 
bool color_initialized;
bool depth_initialized;
 
/* Misc. color flags. */
bool alphatest_bypass;
bool export_16bpc;
bool color_is_int8;
bool color_is_int10;
+   bool dcc_incompatible;
 
/* Color registers. */
unsigned cb_color_info;
unsigned cb_color_base;
unsigned cb_color_view;
unsigned cb_color_size; /* R600 only */
unsigned cb_color_dim;  /* EG only */
unsigned cb_color_pitch;/* EG and later */
unsigned cb_color_slice;/* EG and later */
unsigned cb_color_attrib;   /* EG and later */
@@ -784,20 +785,27 @@ void r600_texture_get_cmask_info(struct 
r600_common_screen *rscreen,
 struct r600_texture *rtex,
 struct r600_cmask_info *out);
 bool r600_init_flushed_depth_texture(struct pipe_context *ctx,
 struct pipe_resource *texture,
 struct r600_texture **staging);
 void r600_print_texture_info(struct r600_texture *rtex, FILE *f);
 struct pipe_resource *r600_texture_create(struct pipe_screen *screen,
const struct pipe_resource *templ);
 bool vi_dcc_formats_compatible(enum pipe_format format1,
   enum pipe_format format2);
+bool vi_dcc_formats_are_incompatible(struct pipe_resource *tex,
+unsigned level,
+enum pipe_format view_format);
+void vi_disable_dcc_if_incompatible_flag(struct r600_common_context *rctx,
+struct pipe_resource *tex,
+unsigned level,
+bool dcc_incompatible);
 void vi_disable_dcc_if_incompatible_format(struct r600_common_context *rctx,
   struct pipe_resource *tex,
   unsigned level,
   enum pipe_format view_format);
 struct pipe_surface *r600_create_surface_custom(struct pipe_context *pipe,
struct pipe_resource *texture,
const struct pipe_surface 
*templ,
unsigned width, unsigned 
height);
 unsigned r600_translate_colorswap(enum pipe_format format, bool 
do_endian_swap);
 void vi_separate_dcc_start_query(struct pipe_context *ctx,
diff --git a/src/gallium/drivers/radeon/r600_texture.c 
b/src/gallium/drivers/radeon/r600_texture.c
index 783f50c..1191a74 100644
--- a/src/gallium/drivers/radeon/r600_texture.c
+++ b/src/gallium/drivers/radeon/r600_texture.c
@@ -1733,59 +1733,82 @@ bool vi_dcc_formats_compatible(enum pipe_format format1,
return false;
 
type1 = vi_get_dcc_channel_type(desc1);
type2 = vi_get_dcc_channel_type(desc2);
 
return type1 != dcc_channel_incompatible &&
   type2 != dcc_channel_incompatible &&
   type1 == type2;
 }
 
+bool vi_dcc_formats_are_incompatible(struct pipe_resource *tex,
+unsigned level,
+enum pipe_format view_format)
+{
+   struct r600_texture *rtex = (struct r600_texture *)tex;
+
+   return vi_dcc_enabled(rtex, level) &&
+  !vi_dcc_formats_compatible(tex->format, view_format);
+}
+
+void vi_disable_dcc_if_incompatible_flag(struct r600_common_context *rctx,
+struct pipe_resource *tex,
+unsigned level,
+bool dcc_incompatible)
+{
+   struct r600_texture *rtex = (struct r600_texture *)tex;
+
+   if (vi_dcc_enabled(rtex, level) && dcc_incompatible)
+   if (!r600_texture_disable_dcc(rctx, (struct r600_texture*)tex))
+   rctx->decompress_dcc(&rctx->b, rtex);
+}
+
+/* This can't be merged with the above function, because
+ * vi_dcc_formats_compatible should be called only when DCC is enabled. */
 void vi_disable_dcc_if_incompatible_format(struct 

[Mesa-dev] [PATCH 9/9] radeonsi: decompress DCC in set_sampler_view instead of create_sampler_view

2017-03-29 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_descriptors.c | 14 +++---
 src/gallium/drivers/radeonsi/si_pipe.h|  1 +
 src/gallium/drivers/radeonsi/si_state.c   |  7 ---
 3 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c 
b/src/gallium/drivers/radeonsi/si_descriptors.c
index 8010e59..9b1d1f4 100644
--- a/src/gallium/drivers/radeonsi/si_descriptors.c
+++ b/src/gallium/drivers/radeonsi/si_descriptors.c
@@ -422,47 +422,55 @@ static void si_set_sampler_view(struct si_context *sctx,
struct si_sampler_views *views = &sctx->samplers[shader].views;
struct si_sampler_view *rview = (struct si_sampler_view*)view;
struct si_descriptors *descs = si_sampler_descriptors(sctx, shader);
uint32_t *desc = descs->list + slot * 16;
 
if (views->views[slot] == view && !disallow_early_out)
return;
 
if (view) {
struct r600_texture *rtex = (struct r600_texture 
*)view->texture;
+   bool is_buffer = rtex->resource.b.b.target == PIPE_BUFFER;
+
+   if (unlikely(!is_buffer && rview->dcc_incompatible)) {
+   vi_disable_dcc_if_incompatible_flag(&sctx->b,
+   &rtex->resource.b.b,
+   view->u.tex.first_level,
+   rview->dcc_incompatible);
+   rview->dcc_incompatible = false;
+   }
 
assert(rtex); /* views with texture == NULL aren't supported */
pipe_sampler_view_reference(&views->views[slot], view);
memcpy(desc, rview->state, 8*4);
 
-   if (rtex->resource.b.b.target == PIPE_BUFFER) {
+   if (is_buffer) {
rtex->resource.bind_history |= PIPE_BIND_SAMPLER_VIEW;
 
si_set_buf_desc_address(&rtex->resource,
view->u.buf.offset,
desc + 4);
} else {
bool is_separate_stencil =
rtex->db_compatible &&
rview->is_stencil_sampler;
 
si_set_mutable_tex_desc_fields(rtex,
   rview->base_level_info,
   rview->base_level,
   
rview->base.u.tex.first_level,
   rview->block_width,
   is_separate_stencil,
   desc);
}
 
-   if (rtex->resource.b.b.target != PIPE_BUFFER &&
-   rtex->fmask.size) {
+   if (!is_buffer && rtex->fmask.size) {
memcpy(desc + 8,
   rview->fmask_state, 8*4);
} else {
/* Disable FMASK and bind sampler state in [12:15]. */
memcpy(desc + 8,
   null_texture_descriptor, 4*4);
 
if (views->sampler_states[slot])
memcpy(desc + 12,
   views->sampler_states[slot]->val, 4*4);
diff --git a/src/gallium/drivers/radeonsi/si_pipe.h 
b/src/gallium/drivers/radeonsi/si_pipe.h
index 617ec20..d1a8393 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.h
+++ b/src/gallium/drivers/radeonsi/si_pipe.h
@@ -120,20 +120,21 @@ struct si_blend_color {
 struct si_sampler_view {
struct pipe_sampler_viewbase;
 /* [0..7] = image descriptor
  * [4..7] = buffer descriptor */
uint32_tstate[8];
uint32_tfmask_state[8];
const struct radeon_surf_level  *base_level_info;
unsignedbase_level;
unsignedblock_width;
bool is_stencil_sampler;
+   bool dcc_incompatible;
 };
 
 #define SI_SAMPLER_STATE_MAGIC 0x34f1c35a
 
 struct si_sampler_state {
 #ifdef DEBUG
unsignedmagic;
 #endif
uint32_tval[4];
 };
diff --git a/src/gallium/drivers/radeonsi/si_state.c 
b/src/gallium/drivers/radeonsi/si_state.c
index 39b9152..23b6473 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -3185,23 +3185,24 @@ si_create_sampler_view_custom(struct pipe_context *ctx,
case PIPE_FORMAT_X24S8_UINT:
case PIPE_FORMAT_S8X24_UINT:
case PIPE_FORMAT_X32_S8X24_UINT:
pipe_format = PIPE_FORMAT_S8_UINT;

[Mesa-dev] [PATCH 3/9] gallium/radeon: formalize that r600_query_hw_add_result doesn't need a context

2017-03-29 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeon/r600_perfcounter.c |  2 +-
 src/gallium/drivers/radeon/r600_query.c   | 13 +++--
 src/gallium/drivers/radeon/r600_query.h   |  2 +-
 3 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/src/gallium/drivers/radeon/r600_perfcounter.c 
b/src/gallium/drivers/radeon/r600_perfcounter.c
index bf24aab..48f609b 100644
--- a/src/gallium/drivers/radeon/r600_perfcounter.c
+++ b/src/gallium/drivers/radeon/r600_perfcounter.c
@@ -189,21 +189,21 @@ static void r600_pc_query_emit_stop(struct 
r600_common_context *ctx,
 }
 
 static void r600_pc_query_clear_result(struct r600_query_hw *hwquery,
   union pipe_query_result *result)
 {
struct r600_query_pc *query = (struct r600_query_pc *)hwquery;
 
memset(result, 0, sizeof(result->batch[0]) * query->num_counters);
 }
 
-static void r600_pc_query_add_result(struct r600_common_context *ctx,
+static void r600_pc_query_add_result(struct r600_common_screen *rscreen,
 struct r600_query_hw *hwquery,
 void *buffer,
 union pipe_query_result *result)
 {
struct r600_query_pc *query = (struct r600_query_pc *)hwquery;
uint64_t *results = buffer;
unsigned i, j;
 
for (i = 0; i < query->num_counters; ++i) {
struct r600_pc_counter *counter = &query->counters[i];
diff --git a/src/gallium/drivers/radeon/r600_query.c 
b/src/gallium/drivers/radeon/r600_query.c
index e269c39..b4e36c8 100644
--- a/src/gallium/drivers/radeon/r600_query.c
+++ b/src/gallium/drivers/radeon/r600_query.c
@@ -508,21 +508,21 @@ static struct r600_query_ops query_hw_ops = {
 };
 
 static void r600_query_hw_do_emit_start(struct r600_common_context *ctx,
struct r600_query_hw *query,
struct r600_resource *buffer,
uint64_t va);
 static void r600_query_hw_do_emit_stop(struct r600_common_context *ctx,
   struct r600_query_hw *query,
   struct r600_resource *buffer,
   uint64_t va);
-static void r600_query_hw_add_result(struct r600_common_context *ctx,
+static void r600_query_hw_add_result(struct r600_common_screen *rscreen,
 struct r600_query_hw *, void *buffer,
 union pipe_query_result *result);
 static void r600_query_hw_clear_result(struct r600_query_hw *,
   union pipe_query_result *);
 
 static struct r600_query_hw_ops query_hw_default_hw_ops = {
.prepare_buffer = r600_query_hw_prepare_buffer,
.emit_start = r600_query_hw_do_emit_start,
.emit_stop = r600_query_hw_do_emit_stop,
.clear_result = r600_query_hw_clear_result,
@@ -1030,26 +1030,26 @@ static unsigned r600_query_read_result(void *map, 
unsigned start_index, unsigned
end = (uint64_t)current_result[end_index] |
  (uint64_t)current_result[end_index+1] << 32;
 
if (!test_status_bit ||
((start & 0x8000UL) && (end & 0x8000UL))) {
return end - start;
}
return 0;
 }
 
-static void r600_query_hw_add_result(struct r600_common_context *ctx,
+static void r600_query_hw_add_result(struct r600_common_screen *rscreen,
 struct r600_query_hw *query,
 void *buffer,
 union pipe_query_result *result)
 {
-   unsigned max_rbs = ctx->screen->info.num_render_backends;
+   unsigned max_rbs = rscreen->info.num_render_backends;
 
switch (query->b.type) {
case PIPE_QUERY_OCCLUSION_COUNTER: {
for (unsigned i = 0; i < max_rbs; ++i) {
unsigned results_base = i * 16;
result->u64 +=
r600_query_read_result(buffer + results_base, 
0, 2, true);
}
break;
}
@@ -1085,21 +1085,21 @@ static void r600_query_hw_add_result(struct 
r600_common_context *ctx,
r600_query_read_result(buffer, 2, 6, true);
result->so_statistics.primitives_storage_needed +=
r600_query_read_result(buffer, 0, 4, true);
break;
case PIPE_QUERY_SO_OVERFLOW_PREDICATE:
result->b = result->b ||
r600_query_read_result(buffer, 2, 6, true) !=
r600_query_read_result(buffer, 0, 4, true);
break;
case PIPE_QUERY_PIPELINE_STATISTICS:
-   if (ctx->chip_class >= EVERGREEN) {
+   if (rscreen->chip_class >= EVERGREEN) {

[Mesa-dev] [PATCH 0/9] RadeonSI cleanups

2017-03-29 Thread Marek Olšák
General cleanups and cleanups in preparation for threaded gallium.

Please review.

Thanks,
Marek


[Mesa-dev] [PATCH 5/9] radeonsi: remove a workaround for inexact *8_SNORM blits

2017-03-29 Thread Marek Olšák
From: Marek Olšák 

All tests pass on Fiji now. Removing the fallback also avoids disabling
DCC because of the incompatible formats that the fallback introduced.
---
 src/gallium/drivers/radeonsi/si_blit.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_blit.c 
b/src/gallium/drivers/radeonsi/si_blit.c
index 0466f19..bc5c2d6 100644
--- a/src/gallium/drivers/radeonsi/si_blit.c
+++ b/src/gallium/drivers/radeonsi/si_blit.c
@@ -888,23 +888,21 @@ void si_resource_copy_region(struct pipe_context *ctx,
 
sbox.x = util_format_get_nblocksx(src->format, src_box->x);
sbox.y = util_format_get_nblocksy(src->format, src_box->y);
sbox.z = src_box->z;
sbox.width = util_format_get_nblocksx(src->format, 
src_box->width);
sbox.height = util_format_get_nblocksy(src->format, 
src_box->height);
sbox.depth = src_box->depth;
src_box = &sbox;
 
src_force_level = src_level;
-   } else if (!util_blitter_is_copy_supported(sctx->blitter, dst, src) ||
-  /* also *8_SNORM has precision issues, use UNORM instead */
-  util_format_is_snorm8(src->format)) {
+   } else if (!util_blitter_is_copy_supported(sctx->blitter, dst, src)) {
if (util_format_is_subsampled_422(src->format)) {
src_templ.format = PIPE_FORMAT_R8G8B8A8_UINT;
dst_templ.format = PIPE_FORMAT_R8G8B8A8_UINT;
 
dst_width = util_format_get_nblocksx(dst->format, 
dst_width);
src_width0 = util_format_get_nblocksx(src->format, 
src_width0);
 
dstx = util_format_get_nblocksx(dst->format, dstx);
 
sbox = *src_box;
-- 
2.7.4



[Mesa-dev] [PATCH 4/9] gallium/radeon: add and use a new helper vi_dcc_enabled

2017-03-29 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeon/r600_pipe_common.h |  6 ++
 src/gallium/drivers/radeon/r600_texture.c | 11 +--
 src/gallium/drivers/radeonsi/si_blit.c|  6 ++
 src/gallium/drivers/radeonsi/si_descriptors.c |  5 ++---
 src/gallium/drivers/radeonsi/si_state.c   |  2 +-
 5 files changed, 16 insertions(+), 14 deletions(-)

diff --git a/src/gallium/drivers/radeon/r600_pipe_common.h 
b/src/gallium/drivers/radeon/r600_pipe_common.h
index 3516884..53fce50 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.h
+++ b/src/gallium/drivers/radeon/r600_pipe_common.h
@@ -941,20 +941,26 @@ r600_get_sampler_view_priority(struct r600_resource *res)
return RADEON_PRIO_SAMPLER_TEXTURE;
 }
 
 static inline bool
 r600_can_sample_zs(struct r600_texture *tex, bool stencil_sampler)
 {
return (stencil_sampler && tex->can_sample_s) ||
   (!stencil_sampler && tex->can_sample_z);
 }
 
+static inline bool
+vi_dcc_enabled(struct r600_texture *tex, unsigned level)
+{
+   return tex->dcc_offset && level < tex->surface.num_dcc_levels;
+}
+
 #define COMPUTE_DBG(rscreen, fmt, args...) \
do { \
if ((rscreen->b.debug_flags & DBG_COMPUTE)) fprintf(stderr, 
fmt, ##args); \
} while (0);
 
 #define R600_ERR(fmt, args...) \
fprintf(stderr, "EE %s:%d %s - " fmt, __FILE__, __LINE__, __func__, 
##args)
 
 /* For MSAA sample positions. */
 #define FILL_SREG(s0x, s0y, s1x, s1y, s2x, s2y, s3x, s3y)  \
diff --git a/src/gallium/drivers/radeon/r600_texture.c 
b/src/gallium/drivers/radeon/r600_texture.c
index ec7a325..94024c8 100644
--- a/src/gallium/drivers/radeon/r600_texture.c
+++ b/src/gallium/drivers/radeon/r600_texture.c
@@ -65,22 +65,22 @@ bool r600_prepare_for_dma_blit(struct r600_common_context 
*rctx,
 *   When dst is linear, the DB->CB copy preserves HTILE.
 *   When dst is tiled, the 3D path must be used to update HTILE.
 */
if (rsrc->is_depth || rdst->is_depth)
return false;
 
/* DCC as:
 *   src: Use the 3D path. DCC decompression is expensive.
 *   dst: Use the 3D path to compress the pixels with DCC.
 */
-   if ((rsrc->dcc_offset && src_level < rsrc->surface.num_dcc_levels) ||
-   (rdst->dcc_offset && dst_level < rdst->surface.num_dcc_levels))
+   if (vi_dcc_enabled(rsrc, src_level) ||
+   vi_dcc_enabled(rdst, dst_level))
return false;
 
/* CMASK as:
 *   src: Both texture and SDMA paths need decompression. Use SDMA.
 *   dst: If overwriting the whole texture, discard CMASK and use
 *SDMA. Otherwise, use the 3D path.
 */
if (rdst->cmask.size && rdst->dirty_level_mask & (1 << dst_level)) {
/* The CMASK clear is only enabled for the first level. */
assert(dst_level == 0);
@@ -1740,22 +1740,21 @@ bool vi_dcc_formats_compatible(enum pipe_format format1,
   type1 == type2;
 }
 
 void vi_dcc_disable_if_incompatible_format(struct r600_common_context *rctx,
   struct pipe_resource *tex,
   unsigned level,
   enum pipe_format view_format)
 {
struct r600_texture *rtex = (struct r600_texture *)tex;
 
-   if (rtex->dcc_offset &&
-   level < rtex->surface.num_dcc_levels &&
+   if (vi_dcc_enabled(rtex, level) &&
!vi_dcc_formats_compatible(tex->format, view_format))
if (!r600_texture_disable_dcc(rctx, (struct r600_texture*)tex))
rctx->decompress_dcc(&rctx->b, rtex);
 }
 
 struct pipe_surface *r600_create_surface_custom(struct pipe_context *pipe,
struct pipe_resource *texture,
const struct pipe_surface 
*templ,
unsigned width, unsigned height)
 {
@@ -2307,21 +2306,21 @@ static bool vi_get_fast_clear_parameters(enum 
pipe_format surface_format,
return true;
 }
 
 void vi_dcc_clear_level(struct r600_common_context *rctx,
struct r600_texture *rtex,
unsigned level, unsigned clear_value)
 {
struct pipe_resource *dcc_buffer;
uint64_t dcc_offset;
 
-   assert(rtex->dcc_offset && level < rtex->surface.num_dcc_levels);
+   assert(vi_dcc_enabled(rtex, level));
 
if (rtex->dcc_separate_buffer) {
dcc_buffer = &rtex->dcc_separate_buffer->b.b;
dcc_offset = 0;
} else {
dcc_buffer = &rtex->resource.b.b;
dcc_offset = rtex->dcc_offset;
}
 
dcc_offset += rtex->surface.level[level].dcc_offset;
@@ -2478,21 +2477,21 @@ void evergreen_do_fast_color_clear(struct 
r600_common_context *rctx,
/* Stoney can't do 

[Mesa-dev] [PATCH 1/9] gallium/util: use const in u_index_modify helpers

2017-03-29 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/auxiliary/util/u_index_modify.c | 6 +++---
 src/gallium/auxiliary/util/u_index_modify.h | 6 +++---
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/src/gallium/auxiliary/util/u_index_modify.c 
b/src/gallium/auxiliary/util/u_index_modify.c
index 7b072b2..d86be24 100644
--- a/src/gallium/auxiliary/util/u_index_modify.c
+++ b/src/gallium/auxiliary/util/u_index_modify.c
@@ -20,21 +20,21 @@
  * OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
  * USE OR OTHER DEALINGS IN THE SOFTWARE. */
 
 #include "pipe/p_context.h"
 #include "util/u_index_modify.h"
 #include "util/u_inlines.h"
 
 /* Ubyte indices. */
 
 void util_shorten_ubyte_elts_to_userptr(struct pipe_context *context,
-   struct pipe_index_buffer *ib,
+   const struct pipe_index_buffer *ib,
 unsigned add_transfer_flags,
int index_bias,
unsigned start,
unsigned count,
void *out)
 {
 struct pipe_transfer *src_transfer = NULL;
 const unsigned char *in_map;
 unsigned short *out_map = out;
 unsigned i;
@@ -55,21 +55,21 @@ void util_shorten_ubyte_elts_to_userptr(struct pipe_context 
*context,
 out_map++;
 }
 
 if (src_transfer)
pipe_buffer_unmap(context, src_transfer);
 }
 
 /* Ushort indices. */
 
 void util_rebuild_ushort_elts_to_userptr(struct pipe_context *context,
-struct pipe_index_buffer *ib,
+const struct pipe_index_buffer *ib,
  unsigned add_transfer_flags,
 int index_bias,
 unsigned start, unsigned count,
 void *out)
 {
 struct pipe_transfer *in_transfer = NULL;
 const unsigned short *in_map;
 unsigned short *out_map = out;
 unsigned i;
 
@@ -89,21 +89,21 @@ void util_rebuild_ushort_elts_to_userptr(struct 
pipe_context *context,
 out_map++;
 }
 
 if (in_transfer)
pipe_buffer_unmap(context, in_transfer);
 }
 
 /* Uint indices. */
 
 void util_rebuild_uint_elts_to_userptr(struct pipe_context *context,
-  struct pipe_index_buffer *ib,
+  const struct pipe_index_buffer *ib,
unsigned add_transfer_flags,
   int index_bias,
   unsigned start, unsigned count,
   void *out)
 {
 struct pipe_transfer *in_transfer = NULL;
 const unsigned int *in_map;
 unsigned int *out_map = out;
 unsigned i;
 
diff --git a/src/gallium/auxiliary/util/u_index_modify.h 
b/src/gallium/auxiliary/util/u_index_modify.h
index 0cfc189..d009199 100644
--- a/src/gallium/auxiliary/util/u_index_modify.h
+++ b/src/gallium/auxiliary/util/u_index_modify.h
@@ -21,32 +21,32 @@
  * USE OR OTHER DEALINGS IN THE SOFTWARE. */
 
 #ifndef UTIL_INDEX_MODIFY_H
 #define UTIL_INDEX_MODIFY_H
 
 struct pipe_context;
 struct pipe_resource;
 struct pipe_index_buffer;
 
 void util_shorten_ubyte_elts_to_userptr(struct pipe_context *context,
-   struct pipe_index_buffer *ib,
+   const struct pipe_index_buffer *ib,
 unsigned add_transfer_flags,
int index_bias,
unsigned start,
unsigned count,
void *out);
 
 void util_rebuild_ushort_elts_to_userptr(struct pipe_context *context,
-struct pipe_index_buffer *ib,
+const struct pipe_index_buffer *ib,
  unsigned add_transfer_flags,
 int index_bias,
 unsigned start, unsigned count,
 void *out);
 
 void util_rebuild_uint_elts_to_userptr(struct pipe_context *context,
-  struct pipe_index_buffer *ib,
+  const struct pipe_index_buffer *ib,
unsigned add_transfer_flags,
   int index_bias,
   unsigned start, unsigned count,
   void *out);
 
 #endif
-- 
2.7.4


[Mesa-dev] [PATCH 2/9] radeonsi: don't make a copy of pipe_index_buffer in draw_vbo

2017-03-29 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_state_draw.c | 59 +---
 1 file changed, 27 insertions(+), 32 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c 
b/src/gallium/drivers/radeonsi/si_state_draw.c
index 1ff1547..6882ff4 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -991,21 +991,22 @@ void si_ce_post_draw_synchronization(struct si_context 
*sctx)
radeon_emit(sctx->b.gfx.cs, 0);
 
sctx->ce_need_synchronization = false;
}
 }
 
 void si_draw_vbo(struct pipe_context *ctx, const struct pipe_draw_info *info)
 {
struct si_context *sctx = (struct si_context *)ctx;
struct si_state_rasterizer *rs = sctx->queued.named.rasterizer;
-   struct pipe_index_buffer ib = {};
+   const struct pipe_index_buffer *ib = &sctx->index_buffer;
+   struct pipe_index_buffer ib_tmp; /* for index buffer uploads only */
unsigned mask, dirty_tex_counter, rast_prim;
 
if (likely(!info->indirect)) {
/* SI-CI treat instance_count==0 as instance_count==1. There is
 * no workaround for indirect draws, but we can at least skip
 * direct draws.
 */
if (unlikely(!info->instance_count))
return;
 
@@ -1076,78 +1077,72 @@ void si_draw_vbo(struct pipe_context *ctx, const struct 
pipe_draw_info *info)
sctx->do_update_shaders = true;
}
}
 
if (sctx->do_update_shaders && !si_update_shaders(sctx))
return;
 
if (!si_upload_graphics_shader_descriptors(sctx))
return;
 
-   if (info->indexed) {
-   /* Initialize the index buffer struct. */
-   pipe_resource_reference(&ib.buffer, sctx->index_buffer.buffer);
-   ib.user_buffer = sctx->index_buffer.user_buffer;
-   ib.index_size = sctx->index_buffer.index_size;
-   ib.offset = sctx->index_buffer.offset;
+   ib_tmp.buffer = NULL;
 
+   if (info->indexed) {
/* Translate or upload, if needed. */
/* 8-bit indices are supported on VI. */
-   if (sctx->b.chip_class <= CIK && ib.index_size == 1) {
-   struct pipe_resource *out_buffer = NULL;
-   unsigned out_offset, start, count, start_offset, size;
+   if (sctx->b.chip_class <= CIK && ib->index_size == 1) {
+   unsigned start, count, start_offset, size;
void *ptr;
 
si_get_draw_start_count(sctx, info, &start, &count);
start_offset = start * 2;
size = count * 2;
 
u_upload_alloc(ctx->stream_uploader, start_offset,
   size,
   si_optimal_tcc_alignment(sctx, size),
-  &out_offset, &out_buffer, &ptr);
-   if (!out_buffer) {
-   pipe_resource_reference(&ib.buffer, NULL);
+  &ib_tmp.offset, &ib_tmp.buffer, &ptr);
+   if (!ib_tmp.buffer)
return;
-   }
 
-   util_shorten_ubyte_elts_to_userptr(&sctx->b.b, &ib, 0, 
0,
-  ib.offset + start,
+   util_shorten_ubyte_elts_to_userptr(&sctx->b.b, ib, 0, 0,
+  ib->offset + start,
   count, ptr);
 
-   pipe_resource_reference(&ib.buffer, NULL);
-   ib.user_buffer = NULL;
-   ib.buffer = out_buffer;
/* info->start will be added by the drawing code */
-   ib.offset = out_offset - start_offset;
-   ib.index_size = 2;
-   } else if (ib.user_buffer && !ib.buffer) {
+   ib_tmp.offset -= start_offset;
+   ib_tmp.index_size = 2;
+   ib = &ib_tmp;
+   } else if (ib->user_buffer && !ib->buffer) {
unsigned start, count, start_offset;
 
si_get_draw_start_count(sctx, info, &start, &count);
-   start_offset = start * ib.index_size;
+   start_offset = start * ib->index_size;
 
u_upload_data(ctx->stream_uploader, start_offset,
- count * ib.index_size,
+ count * ib->index_size,
  sctx->screen->b.info.tcc_cache_line_size,
- (char*)ib.user_buffer + start_offset,
- , );

Re: [Mesa-dev] [PATCH v2 1/2] anv: Add support for 48-bit addresses

2017-03-29 Thread Jason Ekstrand
On Wed, Mar 29, 2017 at 8:59 AM, Kristian H. Kristensen 
wrote:

> Jason Ekstrand  writes:
>
> > This commit adds support for using the full 48-bit address space on
> > Broadwell and newer hardware.  Thanks to certain limitations, not all
> > objects can be placed above the 32-bit boundary.  In particular, general
> > and state base address need to live within 32 bits.  (See also
> > Wa32bitGeneralStateOffset and Wa32bitInstructionBaseOffset.)  In order
> > to handle this, we add a supports_48bit_address field to anv_bo and only
> > set EXEC_OBJECT_SUPPORTS_48B_ADDRESS if that bit is set.  We set the bit
> > for all client-allocated memory objects but leave it false for
> > driver-allocated objects.  While this is more conservative than needed,
> > all driver allocations should easily fit in the first 32 bits of address
> > space and keeps things simple because we don't have to think about
> > whether or not any given one of our allocation data structures will be
> > used in a 48-bit-unsafe way.
> > ---
> >  src/intel/vulkan/anv_allocator.c   | 10 --
> >  src/intel/vulkan/anv_batch_chain.c | 14 ++
> >  src/intel/vulkan/anv_device.c  |  4 +++-
> >  src/intel/vulkan/anv_gem.c | 18 ++
> >  src/intel/vulkan/anv_intel.c   |  2 +-
> >  src/intel/vulkan/anv_private.h | 29 +++--
> >  6 files changed, 67 insertions(+), 10 deletions(-)
> >
> > diff --git a/src/intel/vulkan/anv_allocator.c b/src/intel/vulkan/anv_
> allocator.c
> > index 45c663b..88c9c13 100644
> > --- a/src/intel/vulkan/anv_allocator.c
> > +++ b/src/intel/vulkan/anv_allocator.c
> > @@ -255,7 +255,7 @@ anv_block_pool_init(struct anv_block_pool *pool,
> > assert(util_is_power_of_two(block_size));
> >
> > pool->device = device;
> > -   anv_bo_init(&pool->bo, 0, 0);
> > +   anv_bo_init(&pool->bo, 0, 0, false);
> > pool->block_size = block_size;
> > pool->free_list = ANV_FREE_LIST_EMPTY;
> > pool->back_free_list = ANV_FREE_LIST_EMPTY;
> > @@ -475,7 +475,13 @@ anv_block_pool_grow(struct anv_block_pool *pool,
> struct anv_block_state *state)
> >  * values back into pool. */
> > pool->map = map + center_bo_offset;
> > pool->center_bo_offset = center_bo_offset;
> > -   anv_bo_init(&pool->bo, gem_handle, size);
> > +
> > +   /* Block pool BOs are marked as not supporting 48-bit addresses
> because
> > +* they are used to back STATE_BASE_ADDRESS.
> > +*
> > +* See also anv_bo::supports_48bit_address.
> > +*/
> > +   anv_bo_init(&pool->bo, gem_handle, size, false);
> > pool->bo.map = map;
> >
> >  done:
> > diff --git a/src/intel/vulkan/anv_batch_chain.c
> b/src/intel/vulkan/anv_batch_chain.c
> > index 5d7abc6..b098e4b 100644
> > --- a/src/intel/vulkan/anv_batch_chain.c
> > +++ b/src/intel/vulkan/anv_batch_chain.c
> > @@ -979,7 +979,8 @@ anv_execbuf_finish(struct anv_execbuf *exec,
> >  }
> >
> >  static VkResult
> > -anv_execbuf_add_bo(struct anv_execbuf *exec,
> > +anv_execbuf_add_bo(struct anv_device *device,
> > +   struct anv_execbuf *exec,
> > struct anv_bo *bo,
> > struct anv_reloc_list *relocs,
> > const VkAllocationCallbacks *alloc)
> > @@ -1039,6 +1040,10 @@ anv_execbuf_add_bo(struct anv_execbuf *exec,
> >obj->flags = bo->is_winsys_bo ? EXEC_OBJECT_WRITE : 0;
> >obj->rsvd1 = 0;
> >obj->rsvd2 = 0;
> > +
> > +  if (device->instance->physicalDevice.supports_48bit_addresses &&
> > +  bo->supports_48bit_address)
> > + obj->flags |= EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
> > }
> >
> > if (relocs != NULL && obj->relocation_count == 0) {
> > @@ -1052,7 +1057,7 @@ anv_execbuf_add_bo(struct anv_execbuf *exec,
> >for (size_t i = 0; i < relocs->num_relocs; i++) {
> >   /* A quick sanity check on relocations */
> >   assert(relocs->relocs[i].offset < bo->size);
> > - anv_execbuf_add_bo(exec, relocs->reloc_bos[i], NULL, alloc);
> > + anv_execbuf_add_bo(device, exec, relocs->reloc_bos[i], NULL,
> alloc);
> >}
> > }
> >
> > @@ -1264,7 +1269,8 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
> > adjust_relocations_from_state_pool(ss_pool,
> _buffer->surface_relocs,
> >cmd_buffer->last_ss_pool_center);
> > VkResult result =
> > -  anv_execbuf_add_bo(, _pool->bo,
> _buffer->surface_relocs,
> > +  anv_execbuf_add_bo(device, , _pool->bo,
> > + _buffer->surface_relocs,
> >   _buffer->pool->alloc);
> > if (result != VK_SUCCESS)
> >return result;
> > @@ -1277,7 +1283,7 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
> >adjust_relocations_to_state_pool(ss_pool, &(*bbo)->bo,
> &(*bbo)->relocs,
> > cmd_buffer->last_ss_pool_
> center);
> >
> > -  anv_execbuf_add_bo(, &(*bbo)->bo, 

[Mesa-dev] [PATCH v2 2/2] anv: Query the kernel for reset status

2017-03-29 Thread Jason Ekstrand
When a client causes a GPU hang (or experiences issues due to a hang in
another client) we want to let it know as soon as possible.  In
particular, if it submits work with a fence and calls vkWaitForFences or
vkQueueWaitIdle and it returns VK_SUCCESS, then the client should be
able to trust the results of that rendering.  In order to provide this
guarantee, we have to ask the kernel for context status in a few key
locations.

v2 (Jason Ekstrand):
 - Slight restructuring and much better error logging
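
The wait-then-query idea described above can be sketched in isolation. This
is a minimal sketch, not the driver code: every sketch_* name is
hypothetical, the two counters stand in for whatever a reset-status query
reports, and wait_errno stands in for the outcome of a buffer wait.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the result codes discussed in the patch. */
enum sketch_result { SKETCH_SUCCESS, SKETCH_TIMEOUT, SKETCH_DEVICE_LOST };

/* Hypothetical device state.  active_hangs and pending_hangs mimic the
 * values a reset-status query would report; wait_errno mimics the result
 * of waiting on a buffer (0 means the wait completed). */
struct sketch_device {
   bool lost;
   uint32_t active_hangs;
   uint32_t pending_hangs;
   int wait_errno;
};

static enum sketch_result
sketch_query_status(struct sketch_device *dev)
{
   /* The lost flag is sticky, so a lost device needs no further query. */
   if (dev->lost)
      return SKETCH_DEVICE_LOST;

   if (dev->active_hangs || dev->pending_hangs) {
      dev->lost = true;
      return SKETCH_DEVICE_LOST;
   }

   return SKETCH_SUCCESS;
}

static enum sketch_result
sketch_wait(struct sketch_device *dev)
{
   if (dev->wait_errno == ETIME)
      return SKETCH_TIMEOUT;

   if (dev->wait_errno != 0) {
      /* Any other failure is treated as a lost device. */
      dev->lost = true;
      return SKETCH_DEVICE_LOST;
   }

   /* The wait completed, but a hang may have discarded the work, so
    * success is only reported after the status query agrees. */
   return sketch_query_status(dev);
}
```

The point of the ordering is that a completed wait alone never yields
success; a timeout is reported as such and does not mark the device lost.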
---
 src/intel/vulkan/anv_device.c  | 114 +
 src/intel/vulkan/anv_gem.c |  17 ++
 src/intel/vulkan/anv_private.h |   5 ++
 src/intel/vulkan/genX_query.c  |  11 ++--
 4 files changed, 107 insertions(+), 40 deletions(-)

diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index 5f0d00f..109a2a1 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -884,8 +884,6 @@ anv_device_submit_simple_batch(struct anv_device *device,
struct anv_bo bo, *exec_bos[1];
VkResult result = VK_SUCCESS;
uint32_t size;
-   int64_t timeout;
-   int ret;
 
/* Kernel driver requires 8 byte aligned batch length */
size = align_u32(batch->next - batch->start, 8);
@@ -925,14 +923,7 @@ anv_device_submit_simple_batch(struct anv_device *device,
if (result != VK_SUCCESS)
   goto fail;
 
-   timeout = INT64_MAX;
-   ret = anv_gem_wait(device, bo.gem_handle, &timeout);
-   if (ret != 0) {
-  /* We don't know the real error. */
-  device->lost = true;
-  result = vk_errorf(VK_ERROR_DEVICE_LOST, "execbuf2 failed: %m");
-  goto fail;
-   }
+   result = anv_device_wait(device, &bo, INT64_MAX);
 
  fail:
anv_bo_pool_free(&device->batch_bo_pool, &bo);
@@ -1264,6 +1255,58 @@ anv_device_execbuf(struct anv_device *device,
return VK_SUCCESS;
 }
 
+VkResult
+anv_device_query_status(struct anv_device *device)
+{
+   /* This isn't likely as most of the callers of this function already check
+* for it.  However, it doesn't hurt to check and it potentially lets us
+* avoid an ioctl.
+*/
+   if (unlikely(device->lost))
+  return VK_ERROR_DEVICE_LOST;
+
+   uint32_t active, pending;
+   int ret = anv_gem_gpu_get_reset_stats(device, &active, &pending);
+   if (ret == -1) {
+  /* We don't know the real error. */
+  device->lost = true;
+  return vk_errorf(VK_ERROR_DEVICE_LOST, "get_reset_stats failed: %m");
+   }
+
+   if (active) {
+  device->lost = true;
+  return vk_errorf(VK_ERROR_DEVICE_LOST,
+   "GPU hung on one of our command buffers");
+   } else if (pending) {
+  device->lost = true;
+  return vk_errorf(VK_ERROR_DEVICE_LOST,
+   "GPU hung with commands in-flight");
+   }
+
+   return VK_SUCCESS;
+}
+
+VkResult
+anv_device_wait(struct anv_device *device, struct anv_bo *bo,
+int64_t timeout)
+{
+   int ret = anv_gem_wait(device, bo->gem_handle, &timeout);
+   if (ret == -1 && errno == ETIME) {
+  return VK_TIMEOUT;
+   } else if (ret == -1) {
+  /* We don't know the real error. */
+  device->lost = true;
+  return vk_errorf(VK_ERROR_DEVICE_LOST, "gem wait failed: %m");
+   }
+
+   /* Query for device status after the wait.  If the BO we're waiting on got
+* caught in a GPU hang we don't want to return VK_SUCCESS to the client
+* because it clearly doesn't have valid data.  Yes, this most likely means
+* an ioctl, but we just did an ioctl to wait so it's no great loss.
+*/
+   return anv_device_query_status(device);
+}
+
 VkResult anv_QueueSubmit(
 VkQueue _queue,
 uint32_tsubmitCount,
@@ -1273,10 +1316,17 @@ VkResult anv_QueueSubmit(
ANV_FROM_HANDLE(anv_queue, queue, _queue);
ANV_FROM_HANDLE(anv_fence, fence, _fence);
struct anv_device *device = queue->device;
-   if (unlikely(device->lost))
-  return VK_ERROR_DEVICE_LOST;
 
-   VkResult result = VK_SUCCESS;
+   /* Query for device status prior to submitting.  Technically, we don't need
+* to do this.  However, if we have a client that's submitting piles of
+* garbage, we would rather break as early as possible to keep the GPU
+* hanging contained.  If we don't check here, we'll either be waiting for
+* the kernel to kick us or we'll have to wait until the client waits on a
+* fence before we actually know whether or not we've hung.
+*/
+   VkResult result = anv_device_query_status(device);
+   if (result != VK_SUCCESS)
+  return result;
 
/* We lock around QueueSubmit for three main reasons:
 *
@@ -1802,9 +1852,6 @@ VkResult anv_GetFenceStatus(
if (unlikely(device->lost))
   return VK_ERROR_DEVICE_LOST;
 
-   int64_t t = 0;
-   int ret;
-
switch (fence->state) {
case ANV_FENCE_STATE_RESET:
   /* If it hasn't even been sent off to the GPU yet, it's not ready */
@@ -1814,15 +1861,18 @@ VkResult anv_GetFenceStatus(
   /* It's been 

Re: [Mesa-dev] [PATCH] [RFC v3] mesa/glthread: Call unmarshal_batch directly in glthread_finish

2017-03-29 Thread Bartosz Tomczyk
I would be very grateful if someone could help test the performance
impact of this change.

On Wed, Mar 29, 2017 at 7:31 PM, Bartosz Tomczyk <
bartosz.tomczy...@gmail.com> wrote:

> Call it directly when the batch queue is empty. This avoids costly thread
> synchronisation. This fixes games that previously regressed with
> mesa_glthread=true, such as Xonotic or GRID Autosport.
> ---
>  src/mesa/main/glthread.c | 47 ++
> -
>  1 file changed, 34 insertions(+), 13 deletions(-)
>
> diff --git a/src/mesa/main/glthread.c b/src/mesa/main/glthread.c
> index 06115b916d..faf42c2b89 100644
> --- a/src/mesa/main/glthread.c
> +++ b/src/mesa/main/glthread.c
> @@ -194,16 +194,12 @@ _mesa_glthread_restore_dispatch(struct gl_context
> *ctx)
> }
>  }
>
> -void
> -_mesa_glthread_flush_batch(struct gl_context *ctx)
> +static void
> +_mesa_glthread_flush_batch_locked(struct gl_context *ctx)
>  {
> struct glthread_state *glthread = ctx->GLThread;
> -   struct glthread_batch *batch;
> -
> -   if (!glthread)
> -  return;
> -
> -   batch = glthread->batch;
> +   struct glthread_batch *batch = glthread->batch;
> +
> if (!batch->used)
>return;
>
> @@ -223,10 +219,26 @@ _mesa_glthread_flush_batch(struct gl_context *ctx)
>return;
> }
>
> -   pthread_mutex_lock(&glthread->mutex);
> *glthread->batch_queue_tail = batch;
> glthread->batch_queue_tail = &batch->next;
> pthread_cond_broadcast(&glthread->new_work);
> +
> +}
> +void
> +_mesa_glthread_flush_batch(struct gl_context *ctx)
> +{
> +   struct glthread_state *glthread = ctx->GLThread;
> +   struct glthread_batch *batch;
> +
> +   if (!glthread)
> +  return;
> +
> +   batch = glthread->batch;
> +   if (!batch->used)
> +  return;
> +
> +   pthread_mutex_lock(&glthread->mutex);
> +   _mesa_glthread_flush_batch_locked(ctx);
> pthread_mutex_unlock(&glthread->mutex);
>  }
>
> @@ -252,12 +264,21 @@ _mesa_glthread_finish(struct gl_context *ctx)
> if (pthread_self() == glthread->thread)
>return;
>
> -   _mesa_glthread_flush_batch(ctx);
> -
> pthread_mutex_lock(&glthread->mutex);
>
> -   while (glthread->batch_queue || glthread->busy)
> -  pthread_cond_wait(&glthread->work_done, &glthread->mutex);
> +   if (!(glthread->batch_queue || glthread->busy)) {
> +  if (glthread->batch && glthread->batch->used) {
> + struct _glapi_table *dispatch = _glapi_get_dispatch();
> + glthread_unmarshal_batch(ctx, glthread->batch);
> + _glapi_set_dispatch(dispatch);
> + glthread_allocate_batch(ctx);
> +  }
> +   }
> +   else {
> +  _mesa_glthread_flush_batch_locked(ctx);
> +  while (glthread->batch_queue || glthread->busy)
> + pthread_cond_wait(&glthread->work_done, &glthread->mutex);
> +   }
>
> pthread_mutex_unlock(&glthread->mutex);
>  }
> --
> 2.12.2
>
>


[Mesa-dev] [PATCH] [RFC v3] mesa/glthread: Call unmarshal_batch directly in glthread_finish

2017-03-29 Thread Bartosz Tomczyk
Call it directly when the batch queue is empty. This avoids costly thread
synchronisation. This fixes games that previously regressed with
mesa_glthread=true, such as Xonotic or GRID Autosport.
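
The fast path described above can be sketched independently of Mesa. This
is a minimal sketch under stated assumptions: sketch_queue and its fields
are hypothetical simplifications of the real glthread state, batches are
reduced to command counts, and the worker thread itself is left out.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical, simplified state; not the real glthread_state. */
struct sketch_queue {
   pthread_mutex_t mutex;
   pthread_cond_t work_done;
   int queued;            /* batches handed to the worker, not yet done */
   bool busy;             /* worker is executing a batch right now */
   int pending;           /* commands recorded in the unsubmitted batch */
   int executed_inline;   /* commands run on the calling thread */
   int submitted;         /* commands handed to the worker */
};

static void
sketch_queue_init(struct sketch_queue *q)
{
   pthread_mutex_init(&q->mutex, NULL);
   pthread_cond_init(&q->work_done, NULL);
   q->queued = 0;
   q->busy = false;
   q->pending = 0;
   q->executed_inline = 0;
   q->submitted = 0;
}

static void
sketch_finish(struct sketch_queue *q)
{
   pthread_mutex_lock(&q->mutex);

   if (q->queued == 0 && !q->busy) {
      /* The worker is idle, so nothing can be reordered: run the pending
       * commands on the calling thread and skip the handoff and wakeup. */
      q->executed_inline += q->pending;
      q->pending = 0;
   } else {
      /* Earlier batches are still in flight: queue behind them to keep
       * ordering, then wait.  A real implementation would also signal
       * the worker that new work is available. */
      if (q->pending) {
         q->submitted += q->pending;
         q->pending = 0;
         q->queued++;
      }
      while (q->queued || q->busy)
         pthread_cond_wait(&q->work_done, &q->mutex);
   }

   pthread_mutex_unlock(&q->mutex);
}
```

Holding the mutex across the idle check and the inline execution is what
keeps the worker from picking up new work in between; the cost is that the
calling thread does the work itself on that path.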
---
 src/mesa/main/glthread.c | 47 ++-
 1 file changed, 34 insertions(+), 13 deletions(-)

diff --git a/src/mesa/main/glthread.c b/src/mesa/main/glthread.c
index 06115b916d..faf42c2b89 100644
--- a/src/mesa/main/glthread.c
+++ b/src/mesa/main/glthread.c
@@ -194,16 +194,12 @@ _mesa_glthread_restore_dispatch(struct gl_context *ctx)
}
 }
 
-void
-_mesa_glthread_flush_batch(struct gl_context *ctx)
+static void
+_mesa_glthread_flush_batch_locked(struct gl_context *ctx)
 {
struct glthread_state *glthread = ctx->GLThread;
-   struct glthread_batch *batch;
-
-   if (!glthread)
-  return;
-
-   batch = glthread->batch;
+   struct glthread_batch *batch = glthread->batch;
+   
if (!batch->used)
   return;
 
@@ -223,10 +219,26 @@ _mesa_glthread_flush_batch(struct gl_context *ctx)
   return;
}
 
-   pthread_mutex_lock(&glthread->mutex);
*glthread->batch_queue_tail = batch;
glthread->batch_queue_tail = &batch->next;
pthread_cond_broadcast(&glthread->new_work);
+
+}
+void
+_mesa_glthread_flush_batch(struct gl_context *ctx)
+{
+   struct glthread_state *glthread = ctx->GLThread;
+   struct glthread_batch *batch;
+
+   if (!glthread)
+  return;
+
+   batch = glthread->batch;
+   if (!batch->used)
+  return;
+
+   pthread_mutex_lock(&glthread->mutex);
+   _mesa_glthread_flush_batch_locked(ctx);
pthread_mutex_unlock(&glthread->mutex);
 }
 
@@ -252,12 +264,21 @@ _mesa_glthread_finish(struct gl_context *ctx)
if (pthread_self() == glthread->thread)
   return;
 
-   _mesa_glthread_flush_batch(ctx);
-
pthread_mutex_lock(&glthread->mutex);
 
-   while (glthread->batch_queue || glthread->busy)
-  pthread_cond_wait(&glthread->work_done, &glthread->mutex);
+   if (!(glthread->batch_queue || glthread->busy)) {
+  if (glthread->batch && glthread->batch->used) {
+ struct _glapi_table *dispatch = _glapi_get_dispatch();
+ glthread_unmarshal_batch(ctx, glthread->batch);
+ _glapi_set_dispatch(dispatch);
+ glthread_allocate_batch(ctx);
+  }
+   }
+   else {
+  _mesa_glthread_flush_batch_locked(ctx);
+  while (glthread->batch_queue || glthread->busy)
+ pthread_cond_wait(&glthread->work_done, &glthread->mutex);
+   }
 
pthread_mutex_unlock(&glthread->mutex);
 }
-- 
2.12.2



Re: [Mesa-dev] [PATCH v2 05/25] gallium: add sparse buffer interface and capability

2017-03-29 Thread Nicolai Hähnle

On 29.03.2017 16:27, Marek Olšák wrote:

On Wed, Mar 29, 2017 at 12:26 PM, Nicolai Hähnle  wrote:

On 28.03.2017 21:46, Marek Olšák wrote:


On Tue, Mar 28, 2017 at 11:11 AM, Nicolai Hähnle 
wrote:


From: Nicolai Hähnle 

TODO fill out caps in all drivers

v2:
- explain the resource_commit interface in more detail
---
 src/gallium/docs/source/context.rst  | 25 +
 src/gallium/docs/source/screen.rst   |  3 +++
 src/gallium/include/pipe/p_context.h | 13 +
 src/gallium/include/pipe/p_defines.h |  2 ++
 4 files changed, 43 insertions(+)

diff --git a/src/gallium/docs/source/context.rst
b/src/gallium/docs/source/context.rst
index a053193..5949ff2 100644
--- a/src/gallium/docs/source/context.rst
+++ b/src/gallium/docs/source/context.rst
@@ -611,20 +611,45 @@ for both regular textures as well as for
framebuffers read via FBFETCH.
 .. _memory_barrier:

 memory_barrier
 %%%

 This function flushes caches according to which of the PIPE_BARRIER_*
flags
 are set.



+.. _resource_commit:
+
+resource_commit
+%%%
+
+This function changes the commit state of a part of a sparse resource.
Sparse
+resources are created by setting the ``PIPE_RESOURCE_FLAG_SPARSE`` flag
when
+calling ``resource_create``. Initially, sparse resources only reserve a
virtual
+memory region that is not backed by memory (i.e., it is uncommitted).
The
+``resource_commit`` function can be called to commit or uncommit parts
(or all)
+of a resource. The driver manages the underlying backing memory.
+
+The contents of newly committed memory regions are undefined. Calling
this
+function to commit an already committed memory region is allowed and
leaves its
+content unchanged. Similarly, calling this function to uncommit an
already
+uncommitted memory region is allowed.
+
+For buffers, the given box must be aligned to multiples of
+``PIPE_CAP_SPARSE_BUFFER_PAGE_SIZE``. As an exception to this rule, if
the size
+of the buffer is not a multiple of the page size, changing the commit
state of
+the last (partial) page requires a box that ends at the end of the
buffer
+(i.e., box->x + box->width == buffer->width0).
+
+
+
 .. _pipe_transfer:

 PIPE_TRANSFER
 ^

 These flags control the behavior of a transfer object.

 ``PIPE_TRANSFER_READ``
   Resource contents read back (or accessed directly) at transfer create
time.

diff --git a/src/gallium/docs/source/screen.rst
b/src/gallium/docs/source/screen.rst
index 00c9503..8759639 100644
--- a/src/gallium/docs/source/screen.rst
+++ b/src/gallium/docs/source/screen.rst
@@ -369,20 +369,23 @@ The integer capabilities:
   opcode to retrieve the current value in the framebuffer.
 * ``PIPE_CAP_TGSI_MUL_ZERO_WINS``: Whether TGSI shaders support the
   ``TGSI_PROPERTY_MUL_ZERO_WINS`` shader property.
 * ``PIPE_CAP_DOUBLES``: Whether double precision floating-point
operations
   are supported.
 * ``PIPE_CAP_INT64``: Whether 64-bit integer operations are supported.
 * ``PIPE_CAP_INT64_DIVMOD``: Whether 64-bit integer division/modulo
   operations are supported.
 * ``PIPE_CAP_TGSI_TEX_TXF_LZ``: Whether TEX_LZ and TXF_LZ opcodes are
   supported.
+* ``PIPE_CAP_SPARSE_BUFFER_PAGE_SIZE``: The page size of sparse buffers
in
+  bytes, or 0 if sparse buffers are not supported. The page size must be
at
+  most 64KB.


 .. _pipe_capf:

 PIPE_CAPF_*
 

 The floating-point capabilities are:

 * ``PIPE_CAPF_MAX_LINE_WIDTH``: The maximum width of a regular line.
diff --git a/src/gallium/include/pipe/p_context.h
b/src/gallium/include/pipe/p_context.h
index a29fff5..4d5535b 100644
--- a/src/gallium/include/pipe/p_context.h
+++ b/src/gallium/include/pipe/p_context.h
@@ -578,20 +578,33 @@ struct pipe_context {
 * Flush any pending framebuffer writes and invalidate texture
caches.
 */
void (*texture_barrier)(struct pipe_context *, unsigned flags);

/**
 * Flush caches according to flags.
 */
void (*memory_barrier)(struct pipe_context *, unsigned flags);

/**
+* Change the commitment status of a part of the given resource, which must
+* have been created with the PIPE_RESOURCE_FLAG_SPARSE bit.
+*
+* \param level The texture level whose commitment should be changed.
+* \param box The region of the resource whose commitment should be changed.
+* \param commit Whether memory should be committed or un-committed.
+*
+* \return false if out of memory, true on success.
+*/
+   bool (*resource_commit)(struct pipe_context *, struct pipe_resource *,
+   unsigned level, struct pipe_box *box, bool commit);



I wonder what the behavior for threaded gallium should be. Possibilities:
1) Sync the context thread and execute directly.
2) Ignore the return value, always return true, and execute it
asynchronously.

If the "false" return value is very unlikely, I may use the second
approach.



"false" here means 

Re: [Mesa-dev] [PATCH] [RFC v2] mesa/glthread: Call unmarshal_batch directly in glthread_finish when batch queue is empty.

2017-03-29 Thread Matt Turner
On Wed, Mar 29, 2017 at 9:11 AM, Bartosz Tomczyk
 wrote:
> This avoids costly thread synchronisation. With this fix, games that
> previously regressed with mesa_glthread=true, such as Xonotic or GRID
> Autosport, should no longer regress.
> Could someone test whether games that benefit from glthread have not regressed?
> ---
>  src/mesa/main/glthread.c | 49 
> +---
>  1 file changed, 34 insertions(+), 15 deletions(-)
>
> diff --git a/src/mesa/main/glthread.c b/src/mesa/main/glthread.c
> index 06115b916d..eef7202f01 100644
> --- a/src/mesa/main/glthread.c
> +++ b/src/mesa/main/glthread.c
> @@ -194,18 +194,11 @@ _mesa_glthread_restore_dispatch(struct gl_context *ctx)
> }
>  }
>
> -void
> -_mesa_glthread_flush_batch(struct gl_context *ctx)
> +static void
> +_mesa_glthread_flush_batch_no_lock(struct gl_context *ctx)
>  {
> struct glthread_state *glthread = ctx->GLThread;
> -   struct glthread_batch *batch;
> -
> -   if (!glthread)
> -  return;
> -
> -   batch = glthread->batch;
> -   if (!batch->used)
> -  return;
> +   struct glthread_batch *batch = glthread->batch;
>
> /* Immediately reallocate a new batch, since the next marshalled call would
>  * just do it.
> @@ -223,10 +216,26 @@ _mesa_glthread_flush_batch(struct gl_context *ctx)
>return;
> }
>
> -   pthread_mutex_lock(&glthread->mutex);
> *glthread->batch_queue_tail = batch;
> glthread->batch_queue_tail = &batch->next;
> pthread_cond_broadcast(&glthread->new_work);
> +
> +}
> +void
> +_mesa_glthread_flush_batch(struct gl_context *ctx)
> +{
> +   struct glthread_state *glthread = ctx->GLThread;
> +   struct glthread_batch *batch;
> +
> +   if (!glthread)
> +  return;
> +
> +   batch = glthread->batch;
> +   if (!batch->used)
> +  return;
> +
> +   pthread_mutex_lock(&glthread->mutex);
> +   _mesa_glthread_flush_batch_no_lock(ctx);
> pthread_mutex_unlock(&glthread->mutex);
>  }
>
> @@ -252,12 +261,22 @@ _mesa_glthread_finish(struct gl_context *ctx)
> if (pthread_self() == glthread->thread)
>return;
>
> -   _mesa_glthread_flush_batch(ctx);
> -
> pthread_mutex_lock(&glthread->mutex);
>
> -   while (glthread->batch_queue || glthread->busy)
> -  pthread_cond_wait(&glthread->work_done, &glthread->mutex);
> +   if (!(glthread->batch_queue || glthread->busy))
> +   {
> +  if (glthread->batch && glthread->batch->used)
> +  {
> + glthread_unmarshal_batch(ctx, glthread->batch);
> +  }

Please follow the existing style of putting the braces on the same
line as the if and else.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] gallium: remove support for predicates from TGSI

2017-03-29 Thread Emil Velikov
On 29 March 2017 at 16:51, Marek Olšák  wrote:
> From: Marek Olšák 
>
> Never used.
> ---
>  src/gallium/auxiliary/gallivm/lp_bld_limits.h  |   4 -
>  src/gallium/auxiliary/gallivm/lp_bld_tgsi.h|   2 -
>  src/gallium/auxiliary/gallivm/lp_bld_tgsi_aos.c|  46 ---
>  src/gallium/auxiliary/gallivm/lp_bld_tgsi_info.c   |   6 +-
>  src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c| 137 ++-
>  src/gallium/auxiliary/tgsi/tgsi_build.c|  66 -
>  src/gallium/auxiliary/tgsi/tgsi_build.h|   3 -
>  src/gallium/auxiliary/tgsi/tgsi_dump.c |  24 
>  src/gallium/auxiliary/tgsi/tgsi_exec.c |  59 
>  src/gallium/auxiliary/tgsi/tgsi_exec.h |   7 -
>  src/gallium/auxiliary/tgsi/tgsi_parse.c|   4 -
>  src/gallium/auxiliary/tgsi/tgsi_parse.h|   1 -
>  src/gallium/auxiliary/tgsi/tgsi_sanity.c   |   1 -
>  src/gallium/auxiliary/tgsi/tgsi_strings.c  |   1 -
>  src/gallium/auxiliary/tgsi/tgsi_text.c |  37 -
>  src/gallium/auxiliary/tgsi/tgsi_ureg.c |  84 +---
>  src/gallium/auxiliary/tgsi/tgsi_ureg.h | 149 
> +
>  src/gallium/docs/source/screen.rst |   1 -
>  src/gallium/drivers/freedreno/freedreno_screen.c   |   2 -
>  src/gallium/drivers/i915/i915_fpc.h|   1 -
>  src/gallium/drivers/i915/i915_screen.c |   2 -
>  .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp  |  10 +-
>  src/gallium/drivers/nouveau/nv30/nv30_screen.c |   2 -
>  src/gallium/drivers/nouveau/nv50/nv50_screen.c |   2 -
>  src/gallium/drivers/nouveau/nvc0/nvc0_screen.c |   2 -
>  src/gallium/drivers/r300/r300_screen.c |   4 -
>  src/gallium/drivers/r600/r600_pipe.c   |   2 -
>  src/gallium/drivers/r600/r600_shader.c |   4 -
>  src/gallium/drivers/radeonsi/si_pipe.c |   1 -
>  src/gallium/drivers/svga/svga_screen.c |   6 -
>  src/gallium/drivers/vc4/vc4_screen.c   |   2 -
>  src/gallium/drivers/virgl/virgl_screen.c   |   2 -
>  src/gallium/include/pipe/p_defines.h   |   1 -
>  src/gallium/include/pipe/p_shader_tokens.h |  19 ---
>  src/gallium/state_trackers/nine/nine_shader.c  |  18 +--
>  35 files changed, 23 insertions(+), 689 deletions(-)
>
Quick grep for PIPE_SHADER_CAP_MAX_PREDS shows one instance in the
etnaviv driver.

Jose, Brian - you might want to check if nothing is using it on your end.

-Emil


Re: [Mesa-dev] [PATCH] [RFC v2] mesa/glthread: Call unmarshal_batch directly in glthread_finish when batch queue is empty.

2017-03-29 Thread Nicolai Hähnle

On 29.03.2017 18:11, Bartosz Tomczyk wrote:

This avoids costly thread synchronisation. With this fix, games that previously
regressed with mesa_glthread=true, such as Xonotic or GRID Autosport, should no
longer regress.
Could someone test whether games that benefit from glthread have not regressed?


Please make sure the commit message is wrapped to 75 characters.

The approach seems like a good idea: if the current thread is going to 
wait anyway, we might as well do any pending work locally to avoid 
context switch overhead. Would be nice to see some benchmarks, but this 
should mostly be a win -- the only reason I could imagine why it might 
not be is cache effects, and those could go either way.




---
 src/mesa/main/glthread.c | 49 +---
 1 file changed, 34 insertions(+), 15 deletions(-)

diff --git a/src/mesa/main/glthread.c b/src/mesa/main/glthread.c
index 06115b916d..eef7202f01 100644
--- a/src/mesa/main/glthread.c
+++ b/src/mesa/main/glthread.c
@@ -194,18 +194,11 @@ _mesa_glthread_restore_dispatch(struct gl_context *ctx)
}
 }

-void
-_mesa_glthread_flush_batch(struct gl_context *ctx)
+static void
+_mesa_glthread_flush_batch_no_lock(struct gl_context *ctx)


A better and more idiomatic name for this function would be 
_mesa_glthread_flush_batch_locked.




 {
struct glthread_state *glthread = ctx->GLThread;
-   struct glthread_batch *batch;
-
-   if (!glthread)
-  return;
-
-   batch = glthread->batch;
-   if (!batch->used)
-  return;
+   struct glthread_batch *batch = glthread->batch;

/* Immediately reallocate a new batch, since the next marshalled call would
 * just do it.
@@ -223,10 +216,26 @@ _mesa_glthread_flush_batch(struct gl_context *ctx)
   return;
}

-   pthread_mutex_lock(&glthread->mutex);
*glthread->batch_queue_tail = batch;
glthread->batch_queue_tail = &batch->next;
pthread_cond_broadcast(&glthread->new_work);
+
+}
+void
+_mesa_glthread_flush_batch(struct gl_context *ctx)
+{
+   struct glthread_state *glthread = ctx->GLThread;
+   struct glthread_batch *batch;
+
+   if (!glthread)
+  return;
+
+   batch = glthread->batch;
+   if (!batch->used)
+  return;
+
+   pthread_mutex_lock(&glthread->mutex);
+   _mesa_glthread_flush_batch_no_lock(ctx);
pthread_mutex_unlock(&glthread->mutex);
 }

@@ -252,12 +261,22 @@ _mesa_glthread_finish(struct gl_context *ctx)
if (pthread_self() == glthread->thread)
   return;

-   _mesa_glthread_flush_batch(ctx);
-
pthread_mutex_lock(&glthread->mutex);

-   while (glthread->batch_queue || glthread->busy)
-  pthread_cond_wait(&glthread->work_done, &glthread->mutex);
+   if (!(glthread->batch_queue || glthread->busy))
+   {
+  if (glthread->batch && glthread->batch->used)
+  {
+ glthread_unmarshal_batch(ctx, glthread->batch);


You _must_ reset the api dispatch afterwards; otherwise, your change 
here effectively disables glthread forever. To be on the safe side, I 
think you need to save the current dispatch in a temp variable and then 
reset after unmarshalling.
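
A rough sketch of the save/restore pattern, to make the point concrete. Everything here is a stand-in invented for illustration (struct dispatch_table, get_dispatch, set_dispatch, unmarshal_batch, finish_locally); none of it is the real Mesa code, and it assumes that unmarshalling switches the calling thread to a different dispatch table as a side effect.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for illustration only; these are not the real Mesa symbols. */
struct dispatch_table { const char *name; };

static struct dispatch_table marshal_table = { "marshal" };
static struct dispatch_table direct_table = { "direct" };

/* The application thread normally dispatches into the marshalling table. */
static struct dispatch_table *current_dispatch = &marshal_table;

static struct dispatch_table *get_dispatch(void) { return current_dispatch; }
static void set_dispatch(struct dispatch_table *t) { current_dispatch = t; }

/* Assumption: executing a batch on the calling thread leaves that thread
 * pointing at the direct (non-marshalling) table. */
static void
unmarshal_batch(int *executed)
{
   set_dispatch(&direct_table);
   (*executed)++;
}

/* Save the dispatch in a temporary, run the batch locally, then restore,
 * so later GL calls are still marshalled. */
static void
finish_locally(int *executed)
{
   struct dispatch_table *saved = get_dispatch();

   unmarshal_batch(executed);
   set_dispatch(saved);
}
```

Without the restore in finish_locally, every later call from the application thread would go through the direct table and never reach the worker thread again.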


Cheers,
Nicolai



+  }
+  glthread_allocate_batch(ctx);
+   }
+   else
+   {
+  _mesa_glthread_flush_batch_no_lock(ctx);
+  while (glthread->batch_queue || glthread->busy)
+ pthread_cond_wait(&glthread->work_done, &glthread->mutex);
+   }

pthread_mutex_unlock(&glthread->mutex);
 }




--
Lerne, wie die Welt wirklich ist,
Aber vergiss niemals, wie sie sein sollte.


[Mesa-dev] [PATCH] [RFC v2] mesa/glthread: Call unmarshal_batch directly in glthread_finish when batch queue is empty.

2017-03-29 Thread Bartosz Tomczyk
This avoids costly thread synchronisation. With this fix, games that previously
regressed with mesa_glthread=true, such as Xonotic or GRID Autosport, should no
longer regress.
Could someone test whether games that benefit from glthread have not regressed?
---
 src/mesa/main/glthread.c | 49 +---
 1 file changed, 34 insertions(+), 15 deletions(-)

diff --git a/src/mesa/main/glthread.c b/src/mesa/main/glthread.c
index 06115b916d..eef7202f01 100644
--- a/src/mesa/main/glthread.c
+++ b/src/mesa/main/glthread.c
@@ -194,18 +194,11 @@ _mesa_glthread_restore_dispatch(struct gl_context *ctx)
}
 }
 
-void
-_mesa_glthread_flush_batch(struct gl_context *ctx)
+static void
+_mesa_glthread_flush_batch_no_lock(struct gl_context *ctx)
 {
struct glthread_state *glthread = ctx->GLThread;
-   struct glthread_batch *batch;
-
-   if (!glthread)
-  return;
-
-   batch = glthread->batch;
-   if (!batch->used)
-  return;
+   struct glthread_batch *batch = glthread->batch;
 
/* Immediately reallocate a new batch, since the next marshalled call would
 * just do it.
@@ -223,10 +216,26 @@ _mesa_glthread_flush_batch(struct gl_context *ctx)
   return;
}
 
-   pthread_mutex_lock(&glthread->mutex);
*glthread->batch_queue_tail = batch;
glthread->batch_queue_tail = &batch->next;
pthread_cond_broadcast(&glthread->new_work);
+
+}
+void
+_mesa_glthread_flush_batch(struct gl_context *ctx)
+{
+   struct glthread_state *glthread = ctx->GLThread;
+   struct glthread_batch *batch;
+
+   if (!glthread)
+  return;
+
+   batch = glthread->batch;
+   if (!batch->used)
+  return;
+
+   pthread_mutex_lock(&glthread->mutex);
+   _mesa_glthread_flush_batch_no_lock(ctx);
pthread_mutex_unlock(&glthread->mutex);
 }
 
@@ -252,12 +261,22 @@ _mesa_glthread_finish(struct gl_context *ctx)
if (pthread_self() == glthread->thread)
   return;
 
-   _mesa_glthread_flush_batch(ctx);
-
pthread_mutex_lock(&glthread->mutex);
 
-   while (glthread->batch_queue || glthread->busy)
-  pthread_cond_wait(&glthread->work_done, &glthread->mutex);
+   if (!(glthread->batch_queue || glthread->busy))
+   {
+  if (glthread->batch && glthread->batch->used)
+  {
+ glthread_unmarshal_batch(ctx, glthread->batch);
+  }
+  glthread_allocate_batch(ctx);
+   }
+   else
+   {
+  _mesa_glthread_flush_batch_no_lock(ctx);
+  while (glthread->batch_queue || glthread->busy)
+ pthread_cond_wait(&glthread->work_done, &glthread->mutex);
+   }
 
pthread_mutex_unlock(&glthread->mutex);
 }
-- 
2.12.2



Re: [Mesa-dev] [PATCH] gbm/dri: Flush after unmap

2017-03-29 Thread Thomas Hellstrom
Hi, Emil,

On 03/29/2017 02:34 PM, Emil Velikov wrote:
> On 29 March 2017 at 13:02, Thomas Hellstrom  wrote:
>> Hi, Emil,
>>
>> On 03/29/2017 01:30 PM, Emil Velikov wrote:
>>> Hi Thomas,
>>>
>>> On 28 March 2017 at 20:39, Thomas Hellstrom  wrote:
>>>> Drivers may queue dma operations on the context at unmap time so we need
>>>> to flush to make sure the data gets to the bo. Ideally the application
>>>> would take care of this, but since there appears to be no exported gbm
>>>> flush functionality we need to explicitly flush at unmap time.
>>>>
>>>> This fixes a problem where kmscube on vmwgfx in rgba textured mode would
>>>> render using an uninitialized texture rather than the intended
>>>> rgba pattern.

>>> I haven't checked but the issue should not be restricted to vmwgfx, right ?
>>>
>>> Perhaps we should add the following
>>> Fixes: 8aeb6d768b4 ("gbm: Add map/unmap functions")
>>> CC: 
>> Unfortunately I've, perhaps a bit prematurely, already pushed the fix.
>> Is there a way to get it
>> into stable after push?
>>
> Adding mesa-stable@ to the CC list should do it. Check out the
> instructions for more examples.
>
> https://www.mesa3d.org/submittingpatches.html#nominations
>  

Ok. I'll try the option of forwarding the commit id to mesa-stable...



>
>>>> Signed-off-by: Thomas Hellstrom
>>>> ---
>>>>  src/gbm/backends/dri/gbm_dri.c | 9 -
>>>>  1 file changed, 8 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/src/gbm/backends/dri/gbm_dri.c b/src/gbm/backends/dri/gbm_dri.c
>>>> index ac7ede8..6c2244c 100644
>>>> --- a/src/gbm/backends/dri/gbm_dri.c
>>>> +++ b/src/gbm/backends/dri/gbm_dri.c
>>>> @@ -243,7 +243,7 @@ struct dri_extension_match {
>>>>  };
>>>>
>>>>  static struct dri_extension_match dri_core_extensions[] = {
>>>> -   { __DRI2_FLUSH, 1, offsetof(struct gbm_dri_device, flush) },
>>>> +   { __DRI2_FLUSH, 4, offsetof(struct gbm_dri_device, flush) },
>>> Currently the classic nouveau, radeon/r200 and i915 drivers do not
>>> support v4 of the extension.
>>> As-is this will 'break' them... if they ever worked to begin with.
>>>
>>> One solution is to bail out (return -ENOSYS or similar) in the map/unmap
>>> API when the DRI module is too old.
>>> Just some ^^ food for thought.
>> Hmm. Is there even a use-case for gbm with those drivers? If so we
>> should perhaps make them up-to-date with the flush extension.
>>
> Of the above:
>
> - nouveau: Does not support DRI_IMAGE, thus it doesn't work even
> before the patch.
> - i915: I have some untested ancient patches. Will see if I can rebase
> + send out.
> - radeons: ??
>
> If someone reports an issue we can ask them to write/test some code, I guess 
> ;-)

Indeed. It looks like gbm is mostly used together with KMS anyway...
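
For the record, the bail-out suggested earlier in the thread could look roughly like the sketch below. It is only an illustration under stated assumptions: sketch_flush_ext, sketch_device and sketch_unmap are made-up names rather than the gbm or DRI types, and version 4 is taken from the extension version this patch asks for.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Made-up stand-ins for the flush extension and the gbm device. */
struct sketch_flush_ext { int version; };
struct sketch_device { const struct sketch_flush_ext *flush; };

/* Refuse to unmap when the loaded driver's flush extension is missing or
 * older than version 4; otherwise record that a flush was issued. */
static int
sketch_unmap(const struct sketch_device *dev, int *flushed)
{
   if (dev->flush == NULL || dev->flush->version < 4)
      return -ENOSYS;

   *flushed = 1;
   return 0;
}
```

That would keep an old driver loadable and fail only the map/unmap path, instead of rejecting the driver when the extension list is matched.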

/Thomas


>
> -Emil




Re: [Mesa-dev] [PATCH v2 2/2] st/mesa: EGLImageTarget* error handling

2017-03-29 Thread Nicolai Hähnle

On 29.03.2017 14:28, Philipp Zabel wrote:

On Wed, 2017-03-29 at 13:01 +0200, Nicolai Hähnle wrote:

On 29.03.2017 09:44, Philipp Zabel wrote:

Stop trying to specify texture or renderbuffer objects for unsupported
EGL images. Generate the error codes specified in the OES_EGL_image
extension.

EGLImageTargetTexture2D and EGLImageTargetRenderbuffer would call
the pipe driver's create_surface callback without ever checking that
the given EGL image is actually compatible with the chosen target
texture or renderbuffer. This patch adds a call to the pipe driver's
is_format_supported callback and generates an INVALID_OPERATION error
for unsupported EGL images. If the EGL image handle does not describe
a valid EGL image, an INVALID_VALUE error is generated.

Signed-off-by: Philipp Zabel 
Reviewed-by: Nicolai Hähnle 
---
v2: fixed get_surface to actually use the usage and error parameters


The v2 usually goes above :)


Ok, I'll remember that next time.


Do you need someone to commit this for you?


Yes, please.


Done.



regards
Philipp




--
Lerne, wie die Welt wirklich ist,
Aber vergiss niemals, wie sie sein sollte.


Re: [Mesa-dev] [PATCH v2 1/2] anv: Add support for 48-bit addresses

2017-03-29 Thread Kristian H. Kristensen
Jason Ekstrand  writes:

> This commit adds support for using the full 48-bit address space on
> Broadwell and newer hardware.  Thanks to certain limitations, not all
> objects can be placed above the 32-bit boundary.  In particular, general
> and state base address need to live within 32 bits.  (See also
> Wa32bitGeneralStateOffset and Wa32bitInstructionBaseOffset.)  In order
> to handle this, we add a supports_48bit_address field to anv_bo and only
> set EXEC_OBJECT_SUPPORTS_48B_ADDRESS if that bit is set.  We set the bit
> for all client-allocated memory objects but leave it false for
> driver-allocated objects.  While this is more conservative than needed,
> all driver allocations should easily fit in the first 32 bits of address
> space and keeps things simple because we don't have to think about
> whether or not any given one of our allocation data structures will be
> used in a 48-bit-unsafe way.
> ---
>  src/intel/vulkan/anv_allocator.c   | 10 --
>  src/intel/vulkan/anv_batch_chain.c | 14 ++
>  src/intel/vulkan/anv_device.c  |  4 +++-
>  src/intel/vulkan/anv_gem.c | 18 ++
>  src/intel/vulkan/anv_intel.c   |  2 +-
>  src/intel/vulkan/anv_private.h | 29 +++--
>  6 files changed, 67 insertions(+), 10 deletions(-)
>
> diff --git a/src/intel/vulkan/anv_allocator.c b/src/intel/vulkan/anv_allocator.c
> index 45c663b..88c9c13 100644
> --- a/src/intel/vulkan/anv_allocator.c
> +++ b/src/intel/vulkan/anv_allocator.c
> @@ -255,7 +255,7 @@ anv_block_pool_init(struct anv_block_pool *pool,
> assert(util_is_power_of_two(block_size));
>  
> pool->device = device;
> -   anv_bo_init(&pool->bo, 0, 0);
> +   anv_bo_init(&pool->bo, 0, 0, false);
> pool->block_size = block_size;
> pool->free_list = ANV_FREE_LIST_EMPTY;
> pool->back_free_list = ANV_FREE_LIST_EMPTY;
> @@ -475,7 +475,13 @@ anv_block_pool_grow(struct anv_block_pool *pool, struct anv_block_state *state)
>  * values back into pool. */
> pool->map = map + center_bo_offset;
> pool->center_bo_offset = center_bo_offset;
> -   anv_bo_init(&pool->bo, gem_handle, size);
> +
> +   /* Block pool BOs are marked as not supporting 48-bit addresses because
> +* they are used to back STATE_BASE_ADDRESS.
> +*
> +* See also anv_bo::supports_48bit_address.
> +*/
> +   anv_bo_init(&pool->bo, gem_handle, size, false);
> pool->bo.map = map;
>  
>  done:
> diff --git a/src/intel/vulkan/anv_batch_chain.c b/src/intel/vulkan/anv_batch_chain.c
> index 5d7abc6..b098e4b 100644
> --- a/src/intel/vulkan/anv_batch_chain.c
> +++ b/src/intel/vulkan/anv_batch_chain.c
> @@ -979,7 +979,8 @@ anv_execbuf_finish(struct anv_execbuf *exec,
>  }
>  
>  static VkResult
> -anv_execbuf_add_bo(struct anv_execbuf *exec,
> +anv_execbuf_add_bo(struct anv_device *device,
> +   struct anv_execbuf *exec,
> struct anv_bo *bo,
> struct anv_reloc_list *relocs,
> const VkAllocationCallbacks *alloc)
> @@ -1039,6 +1040,10 @@ anv_execbuf_add_bo(struct anv_execbuf *exec,
>obj->flags = bo->is_winsys_bo ? EXEC_OBJECT_WRITE : 0;
>obj->rsvd1 = 0;
>obj->rsvd2 = 0;
> +
> +  if (device->instance->physicalDevice.supports_48bit_addresses &&
> +  bo->supports_48bit_address)
> + obj->flags |= EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
> }
>  
> if (relocs != NULL && obj->relocation_count == 0) {
> @@ -1052,7 +1057,7 @@ anv_execbuf_add_bo(struct anv_execbuf *exec,
>for (size_t i = 0; i < relocs->num_relocs; i++) {
>   /* A quick sanity check on relocations */
>   assert(relocs->relocs[i].offset < bo->size);
> - anv_execbuf_add_bo(exec, relocs->reloc_bos[i], NULL, alloc);
> + anv_execbuf_add_bo(device, exec, relocs->reloc_bos[i], NULL, alloc);
>}
> }
>  
> @@ -1264,7 +1269,8 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
> adjust_relocations_from_state_pool(ss_pool, &cmd_buffer->surface_relocs,
>cmd_buffer->last_ss_pool_center);
> VkResult result =
> -  anv_execbuf_add_bo(&execbuf, &ss_pool->bo, &cmd_buffer->surface_relocs,
> +  anv_execbuf_add_bo(device, &execbuf, &ss_pool->bo,
> + &cmd_buffer->surface_relocs,
>   &cmd_buffer->pool->alloc);
> if (result != VK_SUCCESS)
>return result;
> @@ -1277,7 +1283,7 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
>adjust_relocations_to_state_pool(ss_pool, &(*bbo)->bo, &(*bbo)->relocs,
> cmd_buffer->last_ss_pool_center);
>  
> -  anv_execbuf_add_bo(&execbuf, &(*bbo)->bo, &(*bbo)->relocs,
> +  anv_execbuf_add_bo(device, &execbuf, &(*bbo)->bo, &(*bbo)->relocs,
>   &cmd_buffer->pool->alloc);
> }
>  
> diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
> index 4e4fa19..f9d04ee 100644
> --- 

Re: [Mesa-dev] [PATCH v2 1/2] anv: Add support for 48-bit addresses

2017-03-29 Thread Chris Wilson
On Wed, Mar 29, 2017 at 08:36:36AM -0700, Jason Ekstrand wrote:
>On Wed, Mar 29, 2017 at 1:51 AM, Chris Wilson
><[1]ch...@chris-wilson.co.uk> wrote:
> 
>  On Tue, Mar 28, 2017 at 05:41:12PM -0700, Jason Ekstrand wrote:
>  > This commit adds support for using the full 48-bit address space on
>  > Broadwell and newer hardware.  Thanks to certain limitations, not all
>  > objects can be placed above the 32-bit boundary.  In particular,
>  general
>  > and state base address need to live within 32 bits.  (See also
>  > Wa32bitGeneralStateOffset and Wa32bitInstructionBaseOffset.)  In order
>  > to handle this, we add a supports_48bit_address field to anv_bo and
>  only
>  > set EXEC_OBJECT_SUPPORTS_48B_ADDRESS if that bit is set.  We set the
>  bit
>  > for all client-allocated memory objects but leave it false for
>  > driver-allocated objects.  While this is more conservative than
>  needed,
>  > all driver allocations should easily fit in the first 32 bits of
>  address
>  > space and keeps things simple because we don't have to think about
>  > whether or not any given one of our allocation data structures will be
>  > used in a 48-bit-unsafe way.
>  > ---
>  >  static VkResult
>  > -anv_execbuf_add_bo(struct anv_execbuf *exec,
>  > +anv_execbuf_add_bo(struct anv_device *device,
>  > +                   struct anv_execbuf *exec,
>  >                     struct anv_bo *bo,
>  >                     struct anv_reloc_list *relocs,
>  >                     const VkAllocationCallbacks *alloc)
>  > @@ -1039,6 +1040,10 @@ anv_execbuf_add_bo(struct anv_execbuf *exec,
>  >        obj->flags = bo->is_winsys_bo ? EXEC_OBJECT_WRITE : 0;
>  >        obj->rsvd1 = 0;
>  >        obj->rsvd2 = 0;
>  > +
>  > +      if (device->instance->physicalDevice.supports_48bit_addresses
>  &&
>  > +          bo->supports_48bit_address)
>  > +         obj->flags |= EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
> 
>  Don't set bo->supports_48bit_address when
>  !device->instance->physicalDevice.supports_48bit_addresses? My guess is
>  that flagging bo is a rarer task than add_bo(), and it looks like you
>  already have device available in the callers of bo_init(true).
> 
>I thought about making that change right before I sent it but decided I
>marginally liked this better.  I'm happy to change it if you'd like.

You're also the maintainer, you have to live with it so pick whichever
you find easier to read and less likely to get in the way of future
changes:)
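
To spell out why the two placements are interchangeable, here is a small sketch. The struct and function names and SKETCH_SUPPORTS_48B are made up for illustration and are not the anv types or the kernel flag; the only assumption is that a BO's flag is decided once, when the BO is initialised.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-in for EXEC_OBJECT_SUPPORTS_48B_ADDRESS. */
#define SKETCH_SUPPORTS_48B (1u << 3)

struct sketch_device { bool supports_48bit_addresses; };
struct sketch_bo { bool supports_48bit_address; };

/* As in the patch: test the device and the BO every time a BO is added. */
static uint64_t
flags_checked_at_execbuf(const struct sketch_device *dev,
                         const struct sketch_bo *bo)
{
   uint64_t flags = 0;

   if (dev->supports_48bit_addresses && bo->supports_48bit_address)
      flags |= SKETCH_SUPPORTS_48B;
   return flags;
}

/* As suggested above: fold the device capability into the BO once. */
static void
bo_init_folded(struct sketch_bo *bo, const struct sketch_device *dev,
               bool wants_48bit)
{
   bo->supports_48bit_address = wants_48bit && dev->supports_48bit_addresses;
}

/* With the folded flag, adding a BO only needs to look at the BO. */
static uint64_t
flags_checked_on_bo_only(const struct sketch_bo *bo)
{
   return bo->supports_48bit_address ? SKETCH_SUPPORTS_48B : 0;
}
```

Both variants produce the same flags for every combination of device and BO capability; the difference is only whether the test runs once per BO or once per add_bo() call.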

>  > diff --git a/src/intel/vulkan/anv_private.h
>  b/src/intel/vulkan/anv_private.h
>  > index 27c887c..425e376 100644
>  > --- a/src/intel/vulkan/anv_private.h
>  > +++ b/src/intel/vulkan/anv_private.h
>  > @@ -299,11 +299,34 @@ struct anv_bo {
>  >      * writing to them and synchronize uses on other rings (eg if the
>  display
>  >      * server uses the blitter ring).
>  >      */
>  > -   bool is_winsys_bo;
>  > +   bool is_winsys_bo:1;
>  > +
>  > +   /* Whether or not this BO supports having a 48-bit address.  Not
>  all
>  > +    * buffers support arbitrary 48-bit addresses.  In particular, we
>  need to
>  > +    * be careful with general and instruction state buffers because
>  we set the
>  > +    * size in STATE_BASE_ADDRESS to 0xf (the maximum) even though
>  the BO
>  > +    * is most likely significantly smaller.  If we let the kernel
>  place it
>  > +    * anywhere it wants, it will default to placing it as high up the
>  address
>  > +    * space as possible, the range specified by STATE_BASE_ADDRESS
>  will
>  > +    * over-flow the 48-bit address range, and the GPU will hang.  In
>  order to
>  > +    * avoid this problem, we tell the kernel that the buffer does not
>  support
>  > +    * 48-bit addresses, and it places the buffer at a 32-bit
>  address.  While
>  > +    * this solution is probably overkill, it is effective.
> 
>  How about just setting the field to the bo->size? You must know the bo
>  already at that point so that you can set the relocation target.
> 
>Actually, we don't.  We have a pointer to a thing that claims to be a BO
>but the actual GEM handle and size aren't known until execbuf time.  (Yes,
>that's a bit weird but there are good reasons for it and it's not likely
>to change.  When we stop doing relocations, there's a separate plan for
>how to handle that.)

Hmm. I honestly didn't expect that. Another thing you can do is to use
execobject.size = 4GiB for those buffers. The kernel will then allocate
it 4GiB of space in the GTT; it feels like overkill though. Just limiting
them to the low 4GiB shouldn't be restrictive. I may have to check that
we do allocate those from the bottom -- iirc, we don't require any

Re: [Mesa-dev] [PATCH] [RFC] mesa/glthread: Call unmarshal_batch directly in glthread_finish when batch queue is empty.

2017-03-29 Thread Bartosz Tomczyk
Please ignore above patch.

On Wed, Mar 29, 2017 at 5:48 PM, Bartosz Tomczyk <
bartosz.tomczy...@gmail.com> wrote:

> This avoids costly thread synchronisation. With this fix, games that
> previously regressed with mesa_glthread=true, such as Xonotic or GRID
> Autosport, should no longer regress.
> Could someone test whether games that benefit from glthread have not regressed?
> ---
>  src/mesa/main/glthread.c | 17 +
>  1 file changed, 13 insertions(+), 4 deletions(-)
>
> diff --git a/src/mesa/main/glthread.c b/src/mesa/main/glthread.c
> index 06115b916d..d46288c242 100644
> --- a/src/mesa/main/glthread.c
> +++ b/src/mesa/main/glthread.c
> @@ -252,12 +252,21 @@ _mesa_glthread_finish(struct gl_context *ctx)
> if (pthread_self() == glthread->thread)
>return;
>
> -   _mesa_glthread_flush_batch(ctx);
> -
> pthread_mutex_lock(&glthread->mutex);
>
> -   while (glthread->batch_queue || glthread->busy)
> -  pthread_cond_wait(&glthread->work_done, &glthread->mutex);
> +   if (!(glthread->batch_queue || glthread->busy))
> +   {
> +  if (glthread->batch && glthread->batch->used)
> +  {
> + glthread_unmarshal_batch(ctx, glthread->batch);
> +  }
> +  glthread_allocate_batch(ctx);
> +   }
> +   else
> +   {
> +  while (glthread->batch_queue || glthread->busy)
> + pthread_cond_wait(&glthread->work_done, &glthread->mutex);
> +   }
>
> pthread_mutex_unlock(&glthread->mutex);
>  }
> --
> 2.12.2
>
>


[Mesa-dev] [PATCH] [RFC] mesa/glthread: Call unmarshal_batch directly in glthread_finish when batch queue is empty.

2017-03-29 Thread Bartosz Tomczyk
This avoids costly thread synchronisation. With this fix, games that previously
regressed with mesa_glthread=true, such as Xonotic or GRID Autosport, should no
longer regress.
Could someone test whether games that benefit from glthread have not regressed?
---
 src/mesa/main/glthread.c | 17 +
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/src/mesa/main/glthread.c b/src/mesa/main/glthread.c
index 06115b916d..d46288c242 100644
--- a/src/mesa/main/glthread.c
+++ b/src/mesa/main/glthread.c
@@ -252,12 +252,21 @@ _mesa_glthread_finish(struct gl_context *ctx)
if (pthread_self() == glthread->thread)
   return;
 
-   _mesa_glthread_flush_batch(ctx);
-
pthread_mutex_lock(&glthread->mutex);
 
-   while (glthread->batch_queue || glthread->busy)
-  pthread_cond_wait(&glthread->work_done, &glthread->mutex);
+   if (!(glthread->batch_queue || glthread->busy))
+   {
+  if (glthread->batch && glthread->batch->used)
+  {
+ glthread_unmarshal_batch(ctx, glthread->batch);
+  }
+  glthread_allocate_batch(ctx);
+   }
+   else
+   {
+  while (glthread->batch_queue || glthread->busy)
+ pthread_cond_wait(&glthread->work_done, &glthread->mutex);
+   }
 
pthread_mutex_unlock(&glthread->mutex);
 }
-- 
2.12.2



Re: [Mesa-dev] [PATCH v2 1/2] anv: Add support for 48-bit addresses

2017-03-29 Thread Jason Ekstrand
On Wed, Mar 29, 2017 at 1:51 AM, Chris Wilson 
wrote:

> On Tue, Mar 28, 2017 at 05:41:12PM -0700, Jason Ekstrand wrote:
> > This commit adds support for using the full 48-bit address space on
> > Broadwell and newer hardware.  Thanks to certain limitations, not all
> > objects can be placed above the 32-bit boundary.  In particular, general
> > and state base address need to live within 32 bits.  (See also
> > Wa32bitGeneralStateOffset and Wa32bitInstructionBaseOffset.)  In order
> > to handle this, we add a supports_48bit_address field to anv_bo and only
> > set EXEC_OBJECT_SUPPORTS_48B_ADDRESS if that bit is set.  We set the bit
> > for all client-allocated memory objects but leave it false for
> > driver-allocated objects.  While this is more conservative than needed,
> > all driver allocations should easily fit in the first 32 bits of address
> > space and keeps things simple because we don't have to think about
> > whether or not any given one of our allocation data structures will be
> > used in a 48-bit-unsafe way.
> > ---
> >  static VkResult
> > -anv_execbuf_add_bo(struct anv_execbuf *exec,
> > +anv_execbuf_add_bo(struct anv_device *device,
> > +   struct anv_execbuf *exec,
> > struct anv_bo *bo,
> > struct anv_reloc_list *relocs,
> > const VkAllocationCallbacks *alloc)
> > @@ -1039,6 +1040,10 @@ anv_execbuf_add_bo(struct anv_execbuf *exec,
> >obj->flags = bo->is_winsys_bo ? EXEC_OBJECT_WRITE : 0;
> >obj->rsvd1 = 0;
> >obj->rsvd2 = 0;
> > +
> > +  if (device->instance->physicalDevice.supports_48bit_addresses &&
> > +  bo->supports_48bit_address)
> > + obj->flags |= EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
>
> Don't set bo->supports_48bit_address when
> !device->instance->physicalDevice.supports_48bit_addresses? My guess is
> that flagging bo is a rarer task than add_bo(), and it looks like you
> already have device available in the callers of bo_init(true).
>

I thought about making that change right before I sent it but decided I
marginally liked this better.  I'm happy to change it if you'd like.


> > }
> >
> > if (relocs != NULL && obj->relocation_count == 0) {
> > @@ -1052,7 +1057,7 @@ anv_execbuf_add_bo(struct anv_execbuf *exec,
> >for (size_t i = 0; i < relocs->num_relocs; i++) {
> >   /* A quick sanity check on relocations */
> >   assert(relocs->relocs[i].offset < bo->size);
> > - anv_execbuf_add_bo(exec, relocs->reloc_bos[i], NULL, alloc);
> > + anv_execbuf_add_bo(device, exec, relocs->reloc_bos[i], NULL, alloc);
> >}
> > }
> >
> > @@ -1264,7 +1269,8 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
> > adjust_relocations_from_state_pool(ss_pool, &cmd_buffer->surface_relocs,
> >cmd_buffer->last_ss_pool_center);
> > VkResult result =
> > -  anv_execbuf_add_bo(&execbuf, &ss_pool->bo, &cmd_buffer->surface_relocs,
> > +  anv_execbuf_add_bo(device, &execbuf, &ss_pool->bo,
> > + &cmd_buffer->surface_relocs,
> >   &cmd_buffer->pool->alloc);
> > if (result != VK_SUCCESS)
> >return result;
> > @@ -1277,7 +1283,7 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
> >adjust_relocations_to_state_pool(ss_pool, &(*bbo)->bo, &(*bbo)->relocs,
> > cmd_buffer->last_ss_pool_center);
> >
> > -  anv_execbuf_add_bo(&execbuf, &(*bbo)->bo, &(*bbo)->relocs,
> > +  anv_execbuf_add_bo(device, &execbuf, &(*bbo)->bo, &(*bbo)->relocs,
> >   &cmd_buffer->pool->alloc);
> > }
> >
> > diff --git a/src/intel/vulkan/anv_device.c
> b/src/intel/vulkan/anv_device.c
> > index 4e4fa19..f9d04ee 100644
> > --- a/src/intel/vulkan/anv_device.c
> > +++ b/src/intel/vulkan/anv_device.c
> > @@ -149,6 +149,8 @@ anv_physical_device_init(struct anv_physical_device
> *device,
> >goto fail;
> > }
> >
> > +   device->supports_48bit_addresses = anv_gem_supports_48b_
> addresses(fd);
> > +
> > if (!anv_device_get_cache_uuid(device->uuid)) {
> >result = vk_errorf(VK_ERROR_INITIALIZATION_FAILED,
> >   "cannot generate UUID");
> > @@ -1396,7 +1398,7 @@ anv_bo_init_new(struct anv_bo *bo, struct
> anv_device *device, uint64_t size)
> > if (!gem_handle)
> >return vk_error(VK_ERROR_OUT_OF_DEVICE_MEMORY);
> >
> > -   anv_bo_init(bo, gem_handle, size);
> > +   anv_bo_init(bo, gem_handle, size, true);
> >
> > return VK_SUCCESS;
> >  }
> > diff --git a/src/intel/vulkan/anv_gem.c b/src/intel/vulkan/anv_gem.c
> > index 0dde6d9..3d45243 100644
> > --- a/src/intel/vulkan/anv_gem.c
> > +++ b/src/intel/vulkan/anv_gem.c
> > @@ -301,6 +301,24 @@ anv_gem_get_aperture(int fd, uint64_t *size)
> > return 0;
> >  }
> >
> > +bool
> > +anv_gem_supports_48b_addresses(int fd)
> > +{
> > +   struct drm_i915_gem_exec_object2 obj = {
> > +  .flags = 

Re: [Mesa-dev] [Request for Comments] - Port documentation to Markdown

2017-03-29 Thread Emil Velikov
Hi Jean,

On 8 March 2017 at 16:12, Brian Paul  wrote:

>> >One thing that I would prefer to not see is heavy things like
>> Bootstrap.
>> >We definitely don't need it, I think writing our own few lines of CSS
>> >(which can be inspired by anything you want) is better. We have more
>> >than enough people who know how to do it (myself included), it will
>> be
>> >cleaner (we won't need to include the whole forest to get our tree)
>> and
>> >much easier to fix when there's a bug.
>>
>>
>> I would tend to agree but I don't care too much about those details so
>> long as it's maintainable.  My primary concern is that while a lot of
>> random developers in the community are liable to have brushed into CSS a
>> time or two, most probably won't know bootstrap.
>
>
> Yeah, I can't stress that too much.  The site has to be easily maintainable
> by the developers.  I, for one, don't know much about websites beyond html
> and a little CSS.  If you create a new website infrastructure and then
> disappear after a few months we need to be able to take over.  Also, we
> can't funnel documentation updates through a handful of people that know a
> complex system.
>
Have you had some time to look into this ?

It would be great if we can get things rolling, even if not perfect.

Thanks
Emil


Re: [Mesa-dev] [PATCH 0/2] nir: Add support for 8 and 16-bit types

2017-03-29 Thread Jason Ekstrand
On Wed, Mar 29, 2017 at 12:41 AM, Eduardo Lima Mitev 
wrote:

> Both patches need rebase, but look fine otherwise.
>

The first has already landed (I think).  The second definitely needs
rebasing.  Yesterday, I rebased it on top of the other two
constant_expressions fixup patches I sent out:
https://patchwork.freedesktop.org/series/21244/  It would be nice if that
series landed first as it cleans things up substantially.


> Series is:
>
> Reviewed-by: Eduardo Lima Mitev 
>

Thanks!


> On 03/09/2017 11:05 PM, Jason Ekstrand wrote:
> > This tiny series adds support in NIR for 8 and 16-bit types.  In
> > particular, it now supports int8_t, uint8_t, int16_t, uint16_t, and
> > float16_t.  No 8-bit floating-point type is supported because 8-bit float
> > would be stupid.
> >
> > These patches have been tested in Jenkins but no 8 or 16-bit code has
> been
> > run through it yet.  Even if people don't want to land the second
> > patch (due to not having a vertical slice), I'd like to land the first
> > refactor patch.
> >
> > Jason Ekstrand (2):
> >   nir/constant_expressions: Refactor helper functions
> >   nir: Add support for 8 and 16-bit types
> >
> >  src/compiler/nir/nir.h   |  4 ++
> >  src/compiler/nir/nir_constant_expressions.py | 67
> +---
> >  src/compiler/nir/nir_opcodes.py  |  6 ++-
> >  3 files changed, 51 insertions(+), 26 deletions(-)
> >
>
>
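
To make the idea of per-bit-size constant folding concrete, here is a minimal sketch of folding an integer add at 8, 16 and 32 bits. It is an assumption-laden illustration, not the generated NIR code: `sketch_const` and `sketch_fold_iadd` are hypothetical names, and only unsigned wrap-around is shown.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constant holder: one storage union shared by all bit sizes.
 * This is not the NIR type, just a stand-in for the example. */
union sketch_const {
   uint8_t  u8;
   uint16_t u16;
   uint32_t u32;
};

/* Fold an integer add at the given bit size. The result wraps at that size,
 * which is why each case narrows back to its own type. */
union sketch_const
sketch_fold_iadd(unsigned bit_size, union sketch_const a, union sketch_const b)
{
   union sketch_const r = { .u32 = 0 };
   switch (bit_size) {
   case 8:
      r.u8 = (uint8_t)(a.u8 + b.u8);
      break;
   case 16:
      r.u16 = (uint16_t)(a.u16 + b.u16);
      break;
   case 32:
      r.u32 = a.u32 + b.u32;
      break;
   default:
      assert(0 && "unsupported bit size");
   }
   return r;
}
```

With this sketch, adding 250 and 10 at 8 bits wraps to 4, while the same operands at 16 bits give 260.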


Re: [Mesa-dev] [PATCH v3] i965: expose BRW_OPCODE_[F32TO16/F16TO32] name on gen8+

2017-03-29 Thread Matt Turner
Thanks. That looks good.

Reviewed-by: Matt Turner 


[Mesa-dev] [PATCH v3] i965: expose BRW_OPCODE_[F32TO16/F16TO32] name on gen8+

2017-03-29 Thread Alejandro Piñeiro
Technically those hw operations are only available on gen7, as gen8+
support the conversion on the MOV. But, when using the builder to
implement nir operations (example: nir_op_fquantize2f16), it is not
needed to do the gen check. This check is done later, on the final
emission at brw_F32TO16 (brw_eu_emit), choosing between the MOV or the
specific operation accordingly.

So in the middle, during optimization phases those hw operations can
be around for gen8+ too.

Without this patch, several (at least 95) vulkan-cts quantize tests
crash when using INTEL_DEBUG=optimizer. For example:
dEQP-VK.spirv_assembly.instruction.graphics.opquantize.too_small_vert

v2: simplify the code using GEN_GE (Ilia Mirkin)
v3: tweak brw_instruction_name instead of changing opcode_descs
table, that is used for validation (Matt Turner)
---

I'm not really proud of the comment, but I hope it explains well
why it is needed. Comments are welcome.

 src/intel/compiler/brw_shader.cpp | 9 +
 1 file changed, 9 insertions(+)

diff --git a/src/intel/compiler/brw_shader.cpp 
b/src/intel/compiler/brw_shader.cpp
index bfaa5e7..73bbc93 100644
--- a/src/intel/compiler/brw_shader.cpp
+++ b/src/intel/compiler/brw_shader.cpp
@@ -157,6 +157,15 @@ brw_instruction_name(const struct gen_device_info 
*devinfo, enum opcode op)
   if (devinfo->gen >= 6 && op == BRW_OPCODE_DO)
  return "do";
 
+  /* The following conversion opcodes don't exist on Gen8+, but we use
+   * them to mark that we want to do the conversion.
+   */
+  if (devinfo->gen > 7 && op == BRW_OPCODE_F32TO16)
+ return "f32to16";
+
+  if (devinfo->gen > 7 && op == BRW_OPCODE_F16TO32)
+ return "f16to32";
+
   assert(brw_opcode_desc(devinfo, op)->name);
   return brw_opcode_desc(devinfo, op)->name;
case FS_OPCODE_FB_WRITE:
-- 
2.9.3
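
The late generation check described in the commit message can be pictured with the sketch below. It is a simplified assumption of how the final emission might choose an instruction, not the body of brw_F32TO16: the `sketch_` enum and function are hypothetical, and the generation is reduced to a plain integer.

```c
#include <assert.h>

/* Hypothetical hardware opcodes for the example. */
enum sketch_hw_op {
   SKETCH_HW_MOV,
   SKETCH_HW_F32TO16,
};

/* The IR carries a single "convert f32 to f16" operation on every
 * generation; only at final emission is it turned into whatever the
 * hardware really has. */
enum sketch_hw_op
sketch_lower_f32to16(int gen)
{
   /* Assumption for this sketch: the conversion is only requested on gen7+. */
   assert(gen >= 7);

   if (gen >= 8)
      return SKETCH_HW_MOV;      /* gen8+: the MOV performs the conversion */

   return SKETCH_HW_F32TO16;     /* gen7: dedicated instruction */
}
```

Because the choice happens this late, a debug dump taken between optimization passes still sees the IR-level opcode on gen8+, which is the situation the patch addresses.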



Re: [Mesa-dev] [PATCH] i965: expose BRW_OPCODE_[F32TO16/F16TO32] opcode_descs on gen8+

2017-03-29 Thread Alejandro Piñeiro
On 29/03/17 16:15, Matt Turner wrote:
> On Wed, Mar 29, 2017 at 4:47 AM, Alejandro Piñeiro  
> wrote:
>> Technically those hw operations are only available on gen7, as gen8+
>> support the conversion on the MOV. But, when using the builder to
>> implement nir operations (example: nir_op_fquantize2f16), it is not
>> needed to do the gen check. This check is done later, on the final
>> emission at brw_F32TO16 (brw_eu_emit), choosing between the MOV or the
>> specific operation accordingly.
>>
>> So in the middle, during optimization phases those hw operations can
>> be around for gen8+ too.
>>
>> Without this patch, several (at least 95) vulkan-cts quantize tests
>> crash when using INTEL_DEBUG=optimizer. For example:
>> dEQP-VK.spirv_assembly.instruction.graphics.opquantize.too_small_vert
>> ---
>>  src/intel/compiler/brw_eu.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/intel/compiler/brw_eu.c b/src/intel/compiler/brw_eu.c
>> index 77400c1..bff37d7 100644
>> --- a/src/intel/compiler/brw_eu.c
>> +++ b/src/intel/compiler/brw_eu.c
>> @@ -499,10 +499,10 @@ static const struct opcode_desc opcode_descs[128] = {
>>.name = "csel",.nsrc = 3, .ndst = 1, .gens = GEN_GE(GEN8),
>> },
>> [BRW_OPCODE_F32TO16] = {
>> -  .name = "f32to16", .nsrc = 1, .ndst = 1, .gens = GEN7 | GEN75,
>> +  .name = "f32to16", .nsrc = 1, .ndst = 1, .gens = GEN7 | GEN75 | GEN8 
>> | GEN9,
>> },
>> [BRW_OPCODE_F16TO32] = {
>> -  .name = "f16to32", .nsrc = 1, .ndst = 1, .gens = GEN7 | GEN75,
>> +  .name = "f16to32", .nsrc = 1, .ndst = 1, .gens = GEN7 | GEN75 | GEN8 
>> | GEN9,
>> },
> This table is for hardware information, used by brw_eu_validate.c.
> Since these opcodes do not exist on Gen8+, we should not add that to
> the table.
>
> I assume that the crashes you are referring to are assertion failures
> in brw_instruction_name() -- assert(brw_opcode_desc(devinfo,
> op)->name)
>
> If that's the case, there's an identical case immediately above. We
> use BRW_OPCODE_DO in the backend IRs, but that opcode is not used on
> Gen6+. I would add two more cases for f32to16 and f16to32 there.

Ok, thanks for the hints. I will work on a v3 of the patch.

> Perhaps we should not use BRW_OPCODE_* for operations used in the
> backend IR that may not actually exist as a real opcode in hardware.
> Not sure.


Yes, at first I found it somewhat counter-intuitive, so I checked just
in case, and the same thing (or something really similar) happens
with several other hw opcodes. The alternative would be to create a new
kind of opcode, having hw_opcode and _opcode. But I don't think
that it is worth the effort, and it is okish to just remember that
there is still a lot happening after calling bld.emit(opcode, ...).

BR




Re: [Mesa-dev] [PATCH v2 05/25] gallium: add sparse buffer interface and capability

2017-03-29 Thread Marek Olšák
On Wed, Mar 29, 2017 at 12:26 PM, Nicolai Hähnle  wrote:
> On 28.03.2017 21:46, Marek Olšák wrote:
>>
>> On Tue, Mar 28, 2017 at 11:11 AM, Nicolai Hähnle 
>> wrote:
>>>
>>> From: Nicolai Hähnle 
>>>
>>> TODO fill out caps in all drivers
>>>
>>> v2:
>>> - explain the resource_commit interface in more detail
>>> ---
>>>  src/gallium/docs/source/context.rst  | 25 +
>>>  src/gallium/docs/source/screen.rst   |  3 +++
>>>  src/gallium/include/pipe/p_context.h | 13 +
>>>  src/gallium/include/pipe/p_defines.h |  2 ++
>>>  4 files changed, 43 insertions(+)
>>>
>>> diff --git a/src/gallium/docs/source/context.rst
>>> b/src/gallium/docs/source/context.rst
>>> index a053193..5949ff2 100644
>>> --- a/src/gallium/docs/source/context.rst
>>> +++ b/src/gallium/docs/source/context.rst
>>> @@ -611,20 +611,45 @@ for both regular textures as well as for
>>> framebuffers read via FBFETCH.
>>>  .. _memory_barrier:
>>>
>>>  memory_barrier
>>>  %%%
>>>
>>>  This function flushes caches according to which of the PIPE_BARRIER_*
>>> flags
>>>  are set.
>>>
>>>
>>>
>>> +.. _resource_commit:
>>> +
>>> +resource_commit
>>> +%%%
>>> +
>>> +This function changes the commit state of a part of a sparse resource.
>>> Sparse
>>> +resources are created by setting the ``PIPE_RESOURCE_FLAG_SPARSE`` flag
>>> when
>>> +calling ``resource_create``. Initially, sparse resources only reserve a
>>> virtual
>>> +memory region that is not backed by memory (i.e., it is uncommitted).
>>> The
>>> +``resource_commit`` function can be called to commit or uncommit parts
>>> (or all)
>>> +of a resource. The driver manages the underlying backing memory.
>>> +
>>> +The contents of newly committed memory regions are undefined. Calling
>>> this
>>> +function to commit an already committed memory region is allowed and
>>> leaves its
>>> +content unchanged. Similarly, calling this function to uncommit an
>>> already
>>> +uncommitted memory region is allowed.
>>> +
>>> +For buffers, the given box must be aligned to multiples of
>>> +``PIPE_CAP_SPARSE_BUFFER_PAGE_SIZE``. As an exception to this rule, if
>>> the size
>>> +of the buffer is not a multiple of the page size, changing the commit
>>> state of
>>> +the last (partial) page requires a box that ends at the end of the
>>> buffer
>>> +(i.e., box->x + box->width == buffer->width0).
>>> +
>>> +
>>> +
>>>  .. _pipe_transfer:
>>>
>>>  PIPE_TRANSFER
>>>  ^
>>>
>>>  These flags control the behavior of a transfer object.
>>>
>>>  ``PIPE_TRANSFER_READ``
>>>Resource contents read back (or accessed directly) at transfer create
>>> time.
>>>
>>> diff --git a/src/gallium/docs/source/screen.rst
>>> b/src/gallium/docs/source/screen.rst
>>> index 00c9503..8759639 100644
>>> --- a/src/gallium/docs/source/screen.rst
>>> +++ b/src/gallium/docs/source/screen.rst
>>> @@ -369,20 +369,23 @@ The integer capabilities:
>>>opcode to retrieve the current value in the framebuffer.
>>>  * ``PIPE_CAP_TGSI_MUL_ZERO_WINS``: Whether TGSI shaders support the
>>>``TGSI_PROPERTY_MUL_ZERO_WINS`` shader property.
>>>  * ``PIPE_CAP_DOUBLES``: Whether double precision floating-point
>>> operations
>>>are supported.
>>>  * ``PIPE_CAP_INT64``: Whether 64-bit integer operations are supported.
>>>  * ``PIPE_CAP_INT64_DIVMOD``: Whether 64-bit integer division/modulo
>>>operations are supported.
>>>  * ``PIPE_CAP_TGSI_TEX_TXF_LZ``: Whether TEX_LZ and TXF_LZ opcodes are
>>>supported.
>>> +* ``PIPE_CAP_SPARSE_BUFFER_PAGE_SIZE``: The page size of sparse buffers
>>> in
>>> +  bytes, or 0 if sparse buffers are not supported. The page size must be
>>> at
>>> +  most 64KB.
>>>
>>>
>>>  .. _pipe_capf:
>>>
>>>  PIPE_CAPF_*
>>>  
>>>
>>>  The floating-point capabilities are:
>>>
>>>  * ``PIPE_CAPF_MAX_LINE_WIDTH``: The maximum width of a regular line.
>>> diff --git a/src/gallium/include/pipe/p_context.h
>>> b/src/gallium/include/pipe/p_context.h
>>> index a29fff5..4d5535b 100644
>>> --- a/src/gallium/include/pipe/p_context.h
>>> +++ b/src/gallium/include/pipe/p_context.h
>>> @@ -578,20 +578,33 @@ struct pipe_context {
>>>  * Flush any pending framebuffer writes and invalidate texture
>>> caches.
>>>  */
>>> void (*texture_barrier)(struct pipe_context *, unsigned flags);
>>>
>>> /**
>>>  * Flush caches according to flags.
>>>  */
>>> void (*memory_barrier)(struct pipe_context *, unsigned flags);
>>>
>>> /**
>>> +* Change the commitment status of a part of the given resource,
>>> which must
>>> +* have been created with the PIPE_RESOURCE_FLAG_SPARSE bit.
>>> +*
>>> +* \param level The texture level whose commitment should be changed.
>>> +* \param box The region of the resource whose commitment should be
>>> changed.
>>> +* \param commit Whether memory should be committed or un-committed.
>>> +*
>>> +* \return false if out of 
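
The buffer alignment rule from the resource_commit documentation above can be expressed as a small check. This is a sketch under assumptions, not part of the gallium interface: `sketch_commit_box_is_valid` is a hypothetical helper, the box is reduced to a byte offset and width, and the requirement that the box stays inside the buffer is an extra assumption of the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true if a buffer commit box [x, x + width) satisfies the rule
 * quoted above: aligned to the sparse page size, except that a box ending
 * exactly at the end of the buffer may cover the last, partial page. */
bool
sketch_commit_box_is_valid(uint32_t page_size, uint32_t buffer_size,
                           uint32_t x, uint32_t width)
{
   /* A page size of 0 means sparse buffers are unsupported. */
   assert(page_size > 0);

   /* Use 64 bits so x + width cannot overflow. */
   uint64_t end = (uint64_t)x + width;

   if (end > buffer_size)   /* assumption: the box must stay in bounds */
      return false;
   if (x % page_size != 0)  /* the start is always page-aligned */
      return false;

   /* The end is page-aligned, or it coincides with the end of the buffer. */
   return end % page_size == 0 || end == buffer_size;
}
```

For a 100000-byte buffer with 64KB pages, the box (65536, 34464) is accepted because it ends at the end of the buffer, while (65536, 1000) is rejected.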

Re: [Mesa-dev] [PATCH] radv: move to using nir clip/cull merge pass.

2017-03-29 Thread Edward O'Callaghan
Reviewed-by: Edward O'Callaghan 

On 03/29/2017 04:14 PM, Dave Airlie wrote:
> From: Dave Airlie 
> 
> Doing this before tessellation makes doing some bits of
> tessellation a bit cleaner. It also cleans up a bit of the
> llvm generator code.
> 
> Signed-off-by: Dave Airlie 
> ---
>  src/amd/common/ac_nir_to_llvm.c | 144 
> ++--
>  src/amd/vulkan/radv_pipeline.c  |   1 +
>  2 files changed, 36 insertions(+), 109 deletions(-)
> 
> diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
> index f164d8f..78602fd 100644
> --- a/src/amd/common/ac_nir_to_llvm.c
> +++ b/src/amd/common/ac_nir_to_llvm.c
> @@ -144,8 +144,6 @@ struct nir_to_llvm_context {
>   int num_locals;
>   LLVMValueRef *locals;
>   bool has_ddxy;
> - uint8_t num_input_clips;
> - uint8_t num_input_culls;
>   uint8_t num_output_clips;
>   uint8_t num_output_culls;
>  
> @@ -170,12 +168,9 @@ static unsigned 
> shader_io_get_unique_index(gl_varying_slot slot)
>   return 0;
>   if (slot == VARYING_SLOT_PSIZ)
>   return 1;
> - if (slot == VARYING_SLOT_CLIP_DIST0 ||
> - slot == VARYING_SLOT_CULL_DIST0)
> + if (slot == VARYING_SLOT_CLIP_DIST0)
>   return 2;
> - if (slot == VARYING_SLOT_CLIP_DIST1 ||
> - slot == VARYING_SLOT_CULL_DIST1)
> - return 3;
> + /* 3 is reserved for clip dist as well */
>   if (slot >= VARYING_SLOT_VAR0 && slot <= VARYING_SLOT_VAR31)
>   return 4 + (slot - VARYING_SLOT_VAR0);
>   unreachable("illegal slot in get unique index\n");
> @@ -2195,7 +2190,6 @@ load_gs_input(struct nir_to_llvm_context *ctx,
>   unsigned param, vtx_offset_param;
>   LLVMValueRef value[4], result;
>   unsigned vertex_index;
> - unsigned cull_offset = 0;
>   radv_get_deref_offset(ctx, >variables[0]->deref,
> false, _index,
> _index, _index);
> @@ -2205,13 +2199,11 @@ load_gs_input(struct nir_to_llvm_context *ctx,
> LLVMConstInt(ctx->i32, 4, false), "");
>  
>   param = 
> shader_io_get_unique_index(instr->variables[0]->var->data.location);
> - if (instr->variables[0]->var->data.location == VARYING_SLOT_CULL_DIST0)
> - cull_offset += ctx->num_input_clips;
>   for (unsigned i = 0; i < instr->num_components; i++) {
>  
>   args[0] = ctx->esgs_ring;
>   args[1] = vtx_offset;
> - args[2] = LLVMConstInt(ctx->i32, (param * 4 + i + const_index + 
> cull_offset) * 256, false);
> + args[2] = LLVMConstInt(ctx->i32, (param * 4 + i + const_index) 
> * 256, false);
>   args[3] = ctx->i32zero;
>   args[4] = ctx->i32one; /* OFFEN */
>   args[5] = ctx->i32zero; /* IDXEN */
> @@ -2366,8 +2358,7 @@ visit_store_var(struct nir_to_llvm_context *ctx,
>  
>   value = llvm_extract_elem(ctx, src, chan);
>  
> - if (instr->variables[0]->var->data.location == 
> VARYING_SLOT_CLIP_DIST0 ||
> - instr->variables[0]->var->data.location == 
> VARYING_SLOT_CULL_DIST0)
> + if (instr->variables[0]->var->data.compact)
>   stride = 1;
>   if (indir_index) {
>   unsigned count = glsl_count_attribute_slots(
> @@ -3143,7 +3134,7 @@ visit_emit_vertex(struct nir_to_llvm_context *ctx,
>   LLVMValueRef gs_next_vertex;
>   LLVMValueRef can_emit, kill;
>   int idx;
> - int clip_cull_slot = -1;
> +
>   assert(instr->const_index[0] == 0);
>   /* Write vertex attribute values to GSVS ring */
>   gs_next_vertex = LLVMBuildLoad(ctx->builder,
> @@ -3175,27 +3166,11 @@ visit_emit_vertex(struct nir_to_llvm_context *ctx,
>   if (!(ctx->output_mask & (1ull << i)))
>   continue;
>  
> - if (i == VARYING_SLOT_CLIP_DIST1 ||
> - i == VARYING_SLOT_CULL_DIST1)
> - continue;
> -
> - if (i == VARYING_SLOT_CLIP_DIST0 ||
> - i == VARYING_SLOT_CULL_DIST0) {
> + if (i == VARYING_SLOT_CLIP_DIST0) {
>   /* pack clip and cull into a single set of slots */
> - if (clip_cull_slot == -1) {
> - clip_cull_slot = idx;
> - if (ctx->num_output_clips + 
> ctx->num_output_culls > 4)
> - slot_inc = 2;
> - } else {
> - slot = clip_cull_slot;
> - slot_inc = 0;
> - }
> - if (i == VARYING_SLOT_CLIP_DIST0)
> - length = ctx->num_output_clips;
> - if (i == VARYING_SLOT_CULL_DIST0) {

Re: [Mesa-dev] [PATCH] i965: expose BRW_OPCODE_[F32TO16/F16TO32] opcode_descs on gen8+

2017-03-29 Thread Matt Turner
On Wed, Mar 29, 2017 at 4:47 AM, Alejandro Piñeiro  wrote:
> Technically those hw operations are only available on gen7, as gen8+
> support the conversion on the MOV. But, when using the builder to
> implement nir operations (example: nir_op_fquantize2f16), it is not
> needed to do the gen check. This check is done later, on the final
> emission at brw_F32TO16 (brw_eu_emit), choosing between the MOV or the
> specific operation accordingly.
>
> So in the middle, during optimization phases those hw operations can
> be around for gen8+ too.
>
> Without this patch, several (at least 95) vulkan-cts quantize tests
> crash when using INTEL_DEBUG=optimizer. For example:
> dEQP-VK.spirv_assembly.instruction.graphics.opquantize.too_small_vert
> ---
>  src/intel/compiler/brw_eu.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/src/intel/compiler/brw_eu.c b/src/intel/compiler/brw_eu.c
> index 77400c1..bff37d7 100644
> --- a/src/intel/compiler/brw_eu.c
> +++ b/src/intel/compiler/brw_eu.c
> @@ -499,10 +499,10 @@ static const struct opcode_desc opcode_descs[128] = {
>.name = "csel",.nsrc = 3, .ndst = 1, .gens = GEN_GE(GEN8),
> },
> [BRW_OPCODE_F32TO16] = {
> -  .name = "f32to16", .nsrc = 1, .ndst = 1, .gens = GEN7 | GEN75,
> +  .name = "f32to16", .nsrc = 1, .ndst = 1, .gens = GEN7 | GEN75 | GEN8 | 
> GEN9,
> },
> [BRW_OPCODE_F16TO32] = {
> -  .name = "f16to32", .nsrc = 1, .ndst = 1, .gens = GEN7 | GEN75,
> +  .name = "f16to32", .nsrc = 1, .ndst = 1, .gens = GEN7 | GEN75 | GEN8 | 
> GEN9,
> },

This table is for hardware information, used by brw_eu_validate.c.
Since these opcodes do not exist on Gen8+, we should not add that to
the table.

I assume that the crashes you are referring to are assertion failures
in brw_instruction_name() -- assert(brw_opcode_desc(devinfo,
op)->name)

If that's the case, there's an identical case immediately above. We
use BRW_OPCODE_DO in the backend IRs, but that opcode is not used on
Gen6+. I would add two more cases for f32to16 and f16to32 there.

Perhaps we should not use BRW_OPCODE_* for operations used in the
backend IR that may not actually exist as a real opcode in hardware.
Not sure.
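
A rough sketch of the distinction being made here, assuming a much simplified table: the hardware table answers "does this instruction exist on this generation?" (which is what a validator wants), while the name lookup also has to cope with opcodes that only live in the backend IR. All `sketch_` names are hypothetical, the generation is a plain integer (gen7.5 is folded into 7), and the generation ranges are illustrative rather than copied from brw_eu.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum sketch_opcode {
   SKETCH_OP_MOV,
   SKETCH_OP_DO,
   SKETCH_OP_F32TO16,
   SKETCH_OP_F16TO32,
   SKETCH_OP_COUNT,
};

struct sketch_desc {
   const char *name;
   int min_gen;
   int max_gen;
};

/* Hardware information only: which generations really have the opcode. */
static const struct sketch_desc sketch_descs[SKETCH_OP_COUNT] = {
   [SKETCH_OP_MOV]     = { "mov",     4, 9 },
   [SKETCH_OP_DO]      = { "do",      4, 5 },
   [SKETCH_OP_F32TO16] = { "f32to16", 7, 7 },
   [SKETCH_OP_F16TO32] = { "f16to32", 7, 7 },
};

/* Table lookup: NULL when the opcode is not a real instruction on gen. */
const char *
sketch_hw_name(int gen, enum sketch_opcode op)
{
   const struct sketch_desc *d = &sketch_descs[op];
   return (gen >= d->min_gen && gen <= d->max_gen) ? d->name : NULL;
}

/* A validator keeps relying on the unmodified hardware table. */
bool
sketch_validate(int gen, enum sketch_opcode op)
{
   return sketch_hw_name(gen, op) != NULL;
}

/* The name lookup special-cases opcodes that the IR uses as markers on
 * generations where the hardware no longer has them. */
const char *
sketch_instruction_name(int gen, enum sketch_opcode op)
{
   if (gen >= 6 && op == SKETCH_OP_DO)
      return "do";
   if (gen > 7 && op == SKETCH_OP_F32TO16)
      return "f32to16";
   if (gen > 7 && op == SKETCH_OP_F16TO32)
      return "f16to32";

   const char *name = sketch_hw_name(gen, op);
   assert(name != NULL);
   return name;
}
```

With this split, dumping the IR on gen8 can print "f32to16" while validation of a final gen8 program would still reject that opcode.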


[Mesa-dev] [PATCH v2] i965: expose BRW_OPCODE_[F32TO16/F16TO32] opcode_descs on gen8+

2017-03-29 Thread Alejandro Piñeiro
Technically those hw operations are only available on gen7, as gen8+
support the conversion on the MOV. But, when using the builder to
implement nir operations (example: nir_op_fquantize2f16), it is not
needed to do the gen check. This check is done later, on the final
emission at brw_F32TO16 (brw_eu_emit), choosing between the MOV or the
specific operation accordingly.

So in the middle, during optimization phases those hw operations can
be around for gen8+ too.

Without this patch, several (at least 95) vulkan-cts quantize tests
crash when using INTEL_DEBUG=optimizer. For example:
dEQP-VK.spirv_assembly.instruction.graphics.opquantize.too_small_vert

v2: simplify the code using GEN_GE (Ilia Mirkin)
---
 src/intel/compiler/brw_eu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/intel/compiler/brw_eu.c b/src/intel/compiler/brw_eu.c
index 77400c1..e7dd325 100644
--- a/src/intel/compiler/brw_eu.c
+++ b/src/intel/compiler/brw_eu.c
@@ -499,10 +499,10 @@ static const struct opcode_desc opcode_descs[128] = {
   .name = "csel",.nsrc = 3, .ndst = 1, .gens = GEN_GE(GEN8),
},
[BRW_OPCODE_F32TO16] = {
-  .name = "f32to16", .nsrc = 1, .ndst = 1, .gens = GEN7 | GEN75,
+  .name = "f32to16", .nsrc = 1, .ndst = 1, .gens = GEN_GE(GEN7),
},
[BRW_OPCODE_F16TO32] = {
-  .name = "f16to32", .nsrc = 1, .ndst = 1, .gens = GEN7 | GEN75,
+  .name = "f16to32", .nsrc = 1, .ndst = 1, .gens = GEN_GE(GEN7),
},
/* Reserved - 21-22 */
[BRW_OPCODE_BFREV] = {
-- 
2.9.3



Re: [Mesa-dev] [PATCH] i965: expose BRW_OPCODE_[F32TO16/F16TO32] opcode_descs on gen8+

2017-03-29 Thread Ilia Mirkin
I guess you want GEN_GE(GEN7), no?

On Mar 29, 2017 7:48 AM, "Alejandro Piñeiro"  wrote:

> Technically those hw operations are only available on gen7, as gen8+
> support the conversion on the MOV. But, when using the builder to
> implement nir operations (example: nir_op_fquantize2f16), it is not
> needed to do the gen check. This check is done later, on the final
> emission at brw_F32TO16 (brw_eu_emit), choosing between the MOV or the
> specific operation accordingly.
>
> So in the middle, during optimization phases those hw operations can
> be around for gen8+ too.
>
> Without this patch, several (at least 95) vulkan-cts quantize tests
> crash when using INTEL_DEBUG=optimizer. For example:
> dEQP-VK.spirv_assembly.instruction.graphics.opquantize.too_small_vert
> ---
>  src/intel/compiler/brw_eu.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/src/intel/compiler/brw_eu.c b/src/intel/compiler/brw_eu.c
> index 77400c1..bff37d7 100644
> --- a/src/intel/compiler/brw_eu.c
> +++ b/src/intel/compiler/brw_eu.c
> @@ -499,10 +499,10 @@ static const struct opcode_desc opcode_descs[128] = {
>.name = "csel",.nsrc = 3, .ndst = 1, .gens = GEN_GE(GEN8),
> },
> [BRW_OPCODE_F32TO16] = {
> -  .name = "f32to16", .nsrc = 1, .ndst = 1, .gens = GEN7 | GEN75,
> +  .name = "f32to16", .nsrc = 1, .ndst = 1, .gens = GEN7 | GEN75 |
> GEN8 | GEN9,
> },
> [BRW_OPCODE_F16TO32] = {
> -  .name = "f16to32", .nsrc = 1, .ndst = 1, .gens = GEN7 | GEN75,
> +  .name = "f16to32", .nsrc = 1, .ndst = 1, .gens = GEN7 | GEN75 |
> GEN8 | GEN9,
> },
> /* Reserved - 21-22 */
> [BRW_OPCODE_BFREV] = {
> --
> 2.9.3
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>


Re: [Mesa-dev] [PATCH] gbm/dri: Flush after unmap

2017-03-29 Thread Emil Velikov
On 29 March 2017 at 13:02, Thomas Hellstrom  wrote:
> Hi, Emil,
>
> On 03/29/2017 01:30 PM, Emil Velikov wrote:
>> Hi Thomas,
>>
>> On 28 March 2017 at 20:39, Thomas Hellstrom  wrote:
>>> Drivers may queue dma operations on the context at unmap time so we need
>>> to flush to make sure the data gets to the bo. Ideally the application
>>> would take care of this, but since there appears to be no exported gbm
>>> flush functionality we need to explicitly flush at unmap time.
>>>
>>> This fixes a problem where kmscube on vmwgfx in rgba textured mode would
>>> render using an uninitialized texture rather than the intended
>>> rgba pattern.
>>>
>> I haven't checked but the issue should not be restricted to vmwgfx, right ?
>>
>> Perhaps we should add the following
>> Fixes: 8aeb6d768b4 ("gbm: Add map/unmap functions")
>> CC: 
>
> Unfortunately I've, perhaps a bit prematurely, already pushed the fix.
> Is there a way to get it
> into stable after push?
>
Adding mesa-stable@ to the CC list should do it. Check out the
instructions for more examples.

https://www.mesa3d.org/submittingpatches.html#nominations

>
>>
>>> Signed-off-by: Thomas Hellstrom 
>>> ---
>>>  src/gbm/backends/dri/gbm_dri.c | 9 -
>>>  1 file changed, 8 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/src/gbm/backends/dri/gbm_dri.c b/src/gbm/backends/dri/gbm_dri.c
>>> index ac7ede8..6c2244c 100644
>>> --- a/src/gbm/backends/dri/gbm_dri.c
>>> +++ b/src/gbm/backends/dri/gbm_dri.c
>>> @@ -243,7 +243,7 @@ struct dri_extension_match {
>>>  };
>>>
>>>  static struct dri_extension_match dri_core_extensions[] = {
>>> -   { __DRI2_FLUSH, 1, offsetof(struct gbm_dri_device, flush) },
>>> +   { __DRI2_FLUSH, 4, offsetof(struct gbm_dri_device, flush) },
>> Currently the classic nouveau, radeon/r200 and i915 drivers do not
>> support v4 of the extension.
>> As-is this will 'break' them... if they ever worked to begin with.
>>
>> One solution is to bail out (return -ENOSYS or similar) in the map/unmap
>> API when the DRI module is too old.
>> Just some ^^ food for thought.
>
> Hmm. Is there even a use-case for gbm with those drivers? If so we
> should perhaps make them up-to-date with the flush extension.
>
Of the above:

- nouveau: Does not support DRI_IMAGE, thus it doesn't work even
before the patch.
- i915: I have some untested ancient patches. Will see if I can rebase
+ send out.
- radeons: ??

If someone reports an issue we can ask them to write/test some code, I guess ;-)

-Emil
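
The "bail out when the DRI module is too old" idea could look roughly like the sketch below. It is a hypothetical illustration, not gbm_dri.c: the `sketch_` structs, the callback and the function are invented for the example, and the only detail taken from the thread is that the patch requires version 4 of the flush extension.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Version of the flush extension the patch asks for. */
#define SKETCH_MIN_FLUSH_VERSION 4

/* Hypothetical, trimmed-down flush extension and device. */
struct sketch_flush_ext {
   int version;
   void (*flush)(void *ctx);
};

struct sketch_gbm_device {
   const struct sketch_flush_ext *flush;
   void *ctx;
};

/* Stand-in flush callback that just counts how often it was called. */
void
sketch_count_flush(void *ctx)
{
   ++*(int *)ctx;
}

/* Unmap path: refuse to run on drivers whose flush extension is too old,
 * otherwise unmap and flush so queued DMA reaches the BO. */
int
sketch_bo_unmap(struct sketch_gbm_device *dev)
{
   assert(dev != NULL);

   if (dev->flush == NULL || dev->flush->version < SKETCH_MIN_FLUSH_VERSION)
      return -ENOSYS;

   /* ... the driver's unmap call would go here ... */

   dev->flush->flush(dev->ctx);
   return 0;
}
```

The version check keeps older drivers loadable for everything else and confines the failure to the map/unmap entry points.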


Re: [Mesa-dev] [PATCH v2 2/2] st/mesa: EGLImageTarget* error handling

2017-03-29 Thread Philipp Zabel
On Wed, 2017-03-29 at 13:01 +0200, Nicolai Hähnle wrote:
> On 29.03.2017 09:44, Philipp Zabel wrote:
> > Stop trying to specify texture or renderbuffer objects for unsupported
> > EGL images. Generate the error codes specified in the OES_EGL_image
> > extension.
> >
> > EGLImageTargetTexture2D and EGLImageTargetRenderbuffer would call
> > the pipe driver's create_surface callback without ever checking that
> > the given EGL image is actually compatible with the chosen target
> > texture or renderbuffer. This patch adds a call to the pipe driver's
> > is_format_supported callback and generates an INVALID_OPERATION error
> > for unsupported EGL images. If the EGL image handle does not describe
> > a valid EGL image, an INVALID_VALUE error is generated.
> >
> > Signed-off-by: Philipp Zabel 
> > Reviewed-by: Nicolai Hähnle 
> > ---
> > v2: fixed get_surface to actually use the usage and error parameters
> 
> The v2 usually goes above :)

Ok, I'll remember that next time.

> Do you need someone to commit this for you?

Yes, please.

regards
Philipp
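
The error handling described in the commit message above can be summarized by the following sketch. It only illustrates the order of the checks and is not the st/mesa code: the `sketch_` types, the callback signature and the error enum are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for the GL error codes named in the commit message. */
enum sketch_error {
   SKETCH_NO_ERROR,
   SKETCH_INVALID_VALUE,
   SKETCH_INVALID_OPERATION,
};

/* Hypothetical resolved EGL image; NULL means the handle was not valid. */
struct sketch_image {
   int format;
};

/* Hypothetical shape of the driver's format query. */
typedef bool (*sketch_format_check)(int format, unsigned usage);

/* Example driver callback: pretends only format 1 is supported. */
bool
sketch_only_format_one(int format, unsigned usage)
{
   (void)usage;
   return format == 1;
}

/* Decide which error, if any, to raise before touching the target
 * texture or renderbuffer. */
enum sketch_error
sketch_check_egl_image(const struct sketch_image *image,
                       sketch_format_check is_format_supported,
                       unsigned usage)
{
   assert(is_format_supported != NULL);

   if (image == NULL)
      return SKETCH_INVALID_VALUE;      /* not a valid EGL image */

   if (!is_format_supported(image->format, usage))
      return SKETCH_INVALID_OPERATION;  /* valid image, unsupported here */

   return SKETCH_NO_ERROR;
}
```

Only when this returns no error would the code go on to create a surface for the image.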



Re: [Mesa-dev] [PATCH] radv: Invalidate L2 for TRANSFER_WRITE barriers

2017-03-29 Thread Emil Velikov
On 28 March 2017 at 19:11, Bas Nieuwenhuizen  wrote:
> On Tue, Mar 28, 2017 at 6:31 PM, Alex Smith  
> wrote:
>> On 28 March 2017 at 17:09, Emil Velikov  wrote:
>>>
>>> On 22 March 2017 at 10:06, Bas Nieuwenhuizen 
>>> wrote:
>>> > On Tue, Mar 21, 2017 at 1:02 PM, Alex Smith
>>> >  wrote:
>>> >> CP DMA and PKT3_WRITE_DATA (in CmdUpdateBuffer) don't (currently) write
>>> >> through L2. Therefore, to make these writes visible to later accesses
>>> >> we must invalidate L2 rather than just writing it back, to avoid the
>>> >> possibility that stale data is read through L2.
>>> >>
>>> >> Cc: "17.0" 
>>> >> Signed-off-by: Alex Smith 
>>> >> ---
>>> >> It's possible for both CP DMA and PKT3_WRITE_DATA to write through L2
>>> >> as far as I can see, and changing things so that they do also solves
>>> >> the problems that this patch fixes.
>>> >>
>>> >> However, I don't know what the exact consequences of doing so are, or
>>> >> whether there are any situations where that shouldn't be done, so I've
>>> >> gone with this fix instead as it seems like a safer option for now.
>>> >
>>> > Yeah we should be able to. I'm more comfortable sending this patch to
>>> > stable though, so this patch is
>>> >
>>> Bas, others,
>>>
>>> Patch addresses radv_{src,dst}_access_flush() which landed with commit
>>> 6dbb0eaccc3, after the 17.0 branchpoint.
>>
>>
>> Oops, my mistake.
>>
>> I think radv_CmdPipelineBarrier on the 17.0 branch still needs
>> RADV_CMD_FLAG_INV_GLOBAL_L2 added for TRANSFER_WRITE barriers at least. Bas,
>> do you think that should be added in a separate patch just for stable, or
>> would you prefer to push those later changes to stable as well? Looks like
>> there's some fixes in those as well.
>
> I'd prefer to backport this patch. The other patches IMO contain too
> much risk for regression and are actually mostly for optimizations.
Amazing, thank you.

Please add a note like below, so that we get some nice and clear references

[Bas: patch is a backport for 17.0 of the cherry-pick below]
(cherry picked from commit bc5d587a80b64fb3e0a5ea8067e6317fbca2bbc5)

-Emil
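
For readers following the L2 discussion quoted above, the reasoning can be sketched as a tiny decision function. This is a hypothetical illustration, not radv code: the `SKETCH_` bits and the function are invented, and the only logic taken from the thread is that writes bypassing L2 need an invalidate while writes going through L2 would be covered by a writeback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in access and flush bits (hypothetical values). */
#define SKETCH_ACCESS_TRANSFER_WRITE (1u << 0)
#define SKETCH_KNOWN_ACCESS_BITS     SKETCH_ACCESS_TRANSFER_WRITE

#define SKETCH_FLUSH_WRITEBACK_L2    (1u << 0)
#define SKETCH_FLUSH_INVALIDATE_L2   (1u << 1)

/* Pick the cache action for the source side of a barrier. */
uint32_t
sketch_src_access_flush(uint32_t src_access_mask, bool transfer_writes_use_l2)
{
   /* The sketch only models the transfer-write case. */
   assert((src_access_mask & ~SKETCH_KNOWN_ACCESS_BITS) == 0);

   uint32_t flush = 0;
   if (src_access_mask & SKETCH_ACCESS_TRANSFER_WRITE) {
      if (transfer_writes_use_l2) {
         /* Data sits in L2: writing it back makes it visible. */
         flush |= SKETCH_FLUSH_WRITEBACK_L2;
      } else {
         /* Data went around L2: L2 may hold stale lines, so drop them. */
         flush |= SKETCH_FLUSH_INVALIDATE_L2;
      }
   }
   return flush;
}
```

The patch under discussion corresponds to the second branch; making the transfer paths write through L2 would correspond to the first.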

