Re: [Mesa-dev] [PATCH 2/2] Android: EGL: fix missing nativewindow.h include on O

2017-08-24 Thread Chih-Wei Huang
2017-08-24 22:02 GMT+08:00 Rob Herring :
> On Thu, Aug 24, 2017 at 4:08 AM, Chih-Wei Huang  
> wrote:
>> 2017-08-24 1:25 GMT+08:00 Rob Herring :
>>
>> I'm also trying to fix it.
>> Seems it only requires the headers instead
>> of the shared library. Adding libnativewindow to
>> LOCAL_SHARED_LIBRARIES would add the
>> unnecessary dependency to libGLES_mesa.so.
>
> Then the correct way to do this is LOCAL_HEADER_LIBRARIES instead.

Right. Unfortunately these two libs don't define
cc_library_headers in their Android.bp
so we can't use it.

>> Locally I fixed it in this way:
>>
>> diff --git a/src/egl/Android.mk b/src/egl/Android.mk
>> index 4ccbb9b..9e96aca 100644
>> --- a/src/egl/Android.mk
>> +++ b/src/egl/Android.mk
>> @@ -43,6 +43,8 @@ LOCAL_CFLAGS := \
>> -D_EGL_BUILT_IN_DRIVER_DRI2
>>
>>  LOCAL_C_INCLUDES := \
>> +   frameworks/native/libs/arect/include \
>> +   frameworks/native/libs/nativewindow/include \
>
> Doing external includes this way is exactly what we don't want to do.
> There's a defined way to do cross project headers.

Agree.
But it's better than adding an extra unnecessary dependency IMO.

Fortunately we have a much better solution
as suggested by Emil.
I just tested it and it works as expected.
Since it only needs the name ANativeWindow,
forward declaration fits the purpose perfectly.
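
A minimal sketch of the forward-declaration approach, assuming the code only ever stores and passes pointers to ANativeWindow. The example_surface type and its helpers are hypothetical names for illustration, not Mesa code:

```c
#include <assert.h>
#include <stddef.h>

/* Forward declaration: enough to declare pointers to the type without
 * including the platform header that defines it. */
struct ANativeWindow;

/* Hypothetical wrapper; it only stores the pointer, never dereferences it. */
struct example_surface {
   struct ANativeWindow *window;
};

static void
example_surface_init(struct example_surface *surf, struct ANativeWindow *win)
{
   assert(surf != NULL);
   surf->window = win;
}

static int
example_surface_has_window(const struct example_surface *surf)
{
   return surf->window != NULL;
}
```

This compiles without any Android include path, which is the point: no header dependency and no extra shared library.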


-- 
Chih-Wei
Android-x86 project
http://www.android-x86.org
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 00/10] gallium: normalize CONST file accesses to 2D

2017-08-24 Thread Dieter Nützel

On 23.08.2017 18:41, Nicolai Hähnle wrote:

Hi all,

Following the discussion on Timothy's std430 packing series, here's
a quick proposal to just always use 2D accesses to the CONST file
in TGSI.

The first patch should be sufficient for all drivers to accept
those 2D accesses. It seems that most older drivers simply ignore
the dimension, and newer ones should handle it directly.

Subsequent patches modify the producers of TGSI to always use 2D
constant references. This is mostly done by changing ureg.

Finally, the last patch adds an assertion to radeonsi to make
sure all constant references are really 2D. It has survived my
very superficial initial testing.

What needs to be tested is:
- some more drivers
- Nine


Sorry Nicolai,

but I see Nine corruption with Wine (LS2017 / FarmingSimulator2017) on RX580 
here.


After a KDE relogin there is partial window/screen corruption (window border 
pixel flickering).


Dieter



- TGSI-to-NIR

You can find the series here:
https://cgit.freedesktop.org/~nh/mesa/log/?h=tgsi-const-2d

Please comment/review!
Thanks,
Nicolai
--
 src/gallium/auxiliary/hud/hud_context.c  |   8 +-
 src/gallium/auxiliary/nir/tgsi_to_nir.c  |   2 +-
 src/gallium/auxiliary/postprocess/pp_mlaa.h  |  20 +--
 src/gallium/auxiliary/tgsi/tgsi_ureg.c   |  22 +--
 src/gallium/auxiliary/util/u_tests.c |   4 +-
 src/gallium/docs/source/screen.rst   |  11 +-
 src/gallium/drivers/radeon/r600_query.c  |  36 ++--
 src/gallium/drivers/radeonsi/si_shader.c |   1 +
 src/gallium/state_trackers/nine/nine_ff.c|   2 +-
 .../state_trackers/nine/nine_shader.c|  10 +-
 .../tests/graw/fragment-shader/frag-cb-1d.sh |   8 +-
 .../tests/graw/vertex-shader/vert-cb-1d.sh   |   8 +-
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp   | 153 +
 13 files changed, 136 insertions(+), 149 deletions(-)



Re: [Mesa-dev] [PATCH 1/2] anv: implementation of VK_EXT_debug_report extension

2017-08-24 Thread Tapani Pälli



On 08/25/2017 08:08 AM, Jason Ekstrand wrote:
On Thu, Aug 24, 2017 at 9:52 PM, Tapani Pälli wrote:



 +
 +   vk_foreach_struct(info, pCreateInfo) {


Usually, we handle the primary structure directly and then call
vk_foreach_struct on pCreateInfo->pNext.  This is because the
things in the pNext chain are going to be modifiers to the
original thing so they probably need to happen between
allocating the callback and list_addtail().


Right, this would be for extending the functionality. I wrote it
like this because it seems for me that the typical pattern would be
to chain few report callbacks and send them all at once rather than
calling the function n times. I'll change so that I handle first
item out of the loop.


Is that allowed??? That would be weird but it honestly wouldn't surprise 
me that much for this extension.  Normally, you don't chain multiple of 
the same struct together.  You chain extension structs on to modify the 
original struct.  If inserting multiple callbacks at one go this way is 
allowed, then we should probably do it the way you did it.


Heh I see now that this was my own invention, I thought these chains 
allow one to create multiple objects with one call but that's not true, 
these are meant for extending. Sorry about that, will fix.
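
To illustrate the pattern Jason describes (handle the primary struct directly, then walk pNext only for extension structs), here is a rough sketch. All example_* types and enum values are made up for illustration; they only mimic the sType/pNext header convention and are not Vulkan or anv definitions:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the sType/pNext header convention. */
enum example_stype {
   EXAMPLE_STYPE_CREATE_INFO = 1,
   EXAMPLE_STYPE_EXTENSION_A = 2,
};

struct example_base {
   enum example_stype sType;
   const void *pNext;
};

struct example_create_info {
   enum example_stype sType;
   const void *pNext;
   int flags;
};

struct example_extension_a {
   enum example_stype sType;
   const void *pNext;
   int extra;
};

struct example_object {
   int flags;
   int extra;
};

static void
example_create(const struct example_create_info *info,
               struct example_object *obj)
{
   /* The primary struct is handled directly, outside the loop. */
   assert(info->sType == EXAMPLE_STYPE_CREATE_INFO);
   obj->flags = info->flags;
   obj->extra = 0;

   /* Only the chained structs are iterated; each one modifies the object. */
   for (const struct example_base *ext = info->pNext; ext; ext = ext->pNext) {
      switch (ext->sType) {
      case EXAMPLE_STYPE_EXTENSION_A:
         obj->extra = ((const struct example_extension_a *)ext)->extra;
         break;
      default:
         /* Unknown extension structs are ignored. */
         break;
      }
   }
}
```

One call still creates exactly one object; the chain only changes how that object is set up.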




 +  switch (info->sType) {
 +  case
VK_STRUCTURE_TYPE_DEBUG_REPORT_CALLBACK_CREATE_INFO_EXT: {
 + struct anv_debug_callback *cb =
 +vk_alloc(alloc, sizeof(struct
anv_debug_callback), 8,
 + VK_SYSTEM_ALLOCATION_SCOPE_INSTANCE);
 + if (!cb)
 +return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
 +
 + cb->flags = pCreateInfo->flags;
 + cb->callback = pCreateInfo->pfnCallback;
 + cb->data = pCreateInfo->pUserData;
 +
 + list_addtail(&cb->link, &instance->callbacks);


What kind of threading guarantees does debug_report provide? 
I'm guessing none in which case we need to lock around this list.



True, this is something I completely forgot. Will add locking.
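
A rough sketch of what locking around the callback list could look like, assuming a plain pthread mutex stored next to the list. The example_* names and the singly linked list are illustrative only; the actual patch uses Mesa's list helpers and may choose a different lock:

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct example_callback {
   struct example_callback *next;
   int id;
};

struct example_instance {
   pthread_mutex_t callbacks_mutex;
   struct example_callback *callbacks;
};

static void
example_instance_init(struct example_instance *inst)
{
   pthread_mutex_init(&inst->callbacks_mutex, NULL);
   inst->callbacks = NULL;
}

/* Insert at the head of the list while holding the mutex. */
static void
example_add_callback(struct example_instance *inst,
                     struct example_callback *cb)
{
   assert(cb != NULL);
   pthread_mutex_lock(&inst->callbacks_mutex);
   cb->next = inst->callbacks;
   inst->callbacks = cb;
   pthread_mutex_unlock(&inst->callbacks_mutex);
}

/* Unlink the first callback with a matching id; returns 1 if one was found. */
static int
example_remove_callback(struct example_instance *inst, int id)
{
   int removed = 0;

   pthread_mutex_lock(&inst->callbacks_mutex);
   for (struct example_callback **p = &inst->callbacks; *p; p = &(*p)->next) {
      if ((*p)->id == id) {
         *p = (*p)->next;
         removed = 1;
         break;
      }
   }
   pthread_mutex_unlock(&inst->callbacks_mutex);

   return removed;
}
```

Any code that walks the list to deliver a report would presumably need to take the same mutex.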



 + break;
 +  }
 +  default:
 + anv_debug_ignored_stype(info->sType);
 + break;
 +  }
 +   }
 +
 +   return VK_SUCCESS;
 +}
 +
 +void
 +anv_DestroyDebugReportCallbackEXT(VkInstance _instance,
 +  VkDebugReportCallbackEXT
callback,
 +  const VkAllocationCallbacks*
 pAllocator)
 +{
 +   ANV_FROM_HANDLE(anv_instance, instance, _instance);
 +   const VkAllocationCallbacks *alloc =
 +  pAllocator ? pAllocator : &instance->alloc;
 +
 +   list_for_each_entry_safe(struct anv_debug_callback,
debug_cb,
 +&instance->callbacks, link) {
 +  /* Found a match, remove from list and destroy given
callback. */
 +  if ((VkDebugReportCallbackEXT)debug_cb->callback ==
callback) {
 + list_del(&debug_cb->link);


lock

 + vk_free(alloc, debug_cb);
 +  }
 +   }
 +}
 +
 +void
 +anv_DebugReportMessageEXT(VkInstance _instance,
 +  VkDebugReportFlagsEXT flags,
 +  VkDebugReportObjectTypeEXT
objectType,
 +  uint64_t object,
 +  size_t location,
 +  int32_t messageCode,
 +  const char* pLayerPrefix,
 +  const char* pMessage)


Woah... This is a bit unexpected.  I wonder why this entrypoint
even exists.  One would think that the loader could just do the
aggregation without it.


Yep, I was wondering about this one as well.

 +{
 +   ANV_FROM_HANDLE(anv_instance, instance, _instance);
 +   anv_debug_report(instance, flags, objectType, object,
 +location, messageCode, pLayerPrefix,
pMessage);
 +
 +}
 +
 +void
 +anv_debug_report_call(struct anv_debug_callback *cb,
 +  VkDebugReportFlagsEXT flags,
 +  VkDebugReportObjectTypeEXT object_type,
 +  

Re: [Mesa-dev] [PATCH 1/2] anv: implementation of VK_EXT_debug_report extension

2017-08-24 Thread Jason Ekstrand
On Thu, Aug 24, 2017 at 9:52 PM, Tapani Pälli 
wrote:

> Hi;
>
> On 08/24/2017 08:36 PM, Jason Ekstrand wrote:
>
>> On Wed, Aug 23, 2017 at 11:23 PM, Tapani Pälli wrote:
>>
>> Patch adds required functionality for extension to manage a list of
>> application provided callbacks and handle debug reporting from driver
>> and application side.
>>
>> Signed-off-by: Tapani Pälli
>>
>> ---
>>   src/intel/Makefile.sources  |   1 +
>>   src/intel/vulkan/anv_debug_report.c | 133
>> 
>>   src/intel/vulkan/anv_device.c   |  40 +++
>>   src/intel/vulkan/anv_extensions.py  |   1 +
>>   src/intel/vulkan/anv_private.h  |  32 +
>>   5 files changed, 207 insertions(+)
>>   create mode 100644 src/intel/vulkan/anv_debug_report.c
>>
>> diff --git a/src/intel/Makefile.sources b/src/intel/Makefile.sources
>> index 4074ba9ee5..200713b06e 100644
>> --- a/src/intel/Makefile.sources
>> +++ b/src/intel/Makefile.sources
>> @@ -205,6 +205,7 @@ VULKAN_FILES := \
>>  vulkan/anv_batch_chain.c \
>>  vulkan/anv_blorp.c \
>>  vulkan/anv_cmd_buffer.c \
>> +   vulkan/anv_debug_report.c \
>>  vulkan/anv_descriptor_set.c \
>>  vulkan/anv_device.c \
>>  vulkan/anv_dump.c \
>> diff --git a/src/intel/vulkan/anv_debug_report.c
>> b/src/intel/vulkan/anv_debug_report.c
>> new file mode 100644
>> index 00..1a4868cd52
>> --- /dev/null
>> +++ b/src/intel/vulkan/anv_debug_report.c
>> @@ -0,0 +1,133 @@
>> +/*
>> + * Copyright © 2017 Intel Corporation
>> + *
>> + * Permission is hereby granted, free of charge, to any person
>> obtaining a
>> + * copy of this software and associated documentation files (the
>> "Software"),
>> + * to deal in the Software without restriction, including without
>> limitation
>> + * the rights to use, copy, modify, merge, publish, distribute,
>> sublicense,
>> + * and/or sell copies of the Software, and to permit persons to
>> whom the
>> + * Software is furnished to do so, subject to the following
>> conditions:
>> + *
>> + * The above copyright notice and this permission notice (including
>> the next
>> + * paragraph) shall be included in all copies or substantial
>> portions of the
>> + * Software.
>> + *
>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
>> EXPRESS OR
>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
>> MERCHANTABILITY,
>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO
>> EVENT SHALL
>> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
>> DAMAGES OR OTHER
>> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
>> ARISING
>> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
>> OTHER DEALINGS
>> + * IN THE SOFTWARE.
>> + */
>> +
>> +#include "anv_private.h"
>> +#include "vk_util.h"
>> +
>> +/* This file contains implementation for VK_EXT_debug_report. */
>> +
>> +VkResult
>> +anv_CreateDebugReportCallbackEXT(VkInstance _instance,
>> + const
>> VkDebugReportCallbackCreateInfoEXT* pCreateInfo,
>> + const VkAllocationCallbacks*
>> pAllocator,
>> + VkDebugReportCallbackEXT* pCallback)
>> +{
>> +   ANV_FROM_HANDLE(anv_instance, instance, _instance);
>> +   const VkAllocationCallbacks *alloc =
>> +  pAllocator ? pAllocator : &instance->alloc;
>>
>>
>> This is what vk_alloc2 is for.
>>
>
> Thanks, I had a feeling that there might be a helper for this but did not
> figure it out.
>
> +
>> +   vk_foreach_struct(info, pCreateInfo) {
>>
>>
>> Usually, we handle the primary structure directly and then call
>> vk_foreach_struct on pCreateInfo->pNext.  This is because the things in the
>> pNext chain are going to be modifiers to the original thing so they
>> probably need to happen between allocating the callback and list_addtail().
>>
>
> Right, this would be for extending the functionality. I wrote it like this
> because it seems for me that the typical pattern would be to chain few
> report callbacks and send them all at once rather than calling the function
> n times. I'll change so that I handle first item out of the loop.
>

Is that allowed??? That would be weird but it honestly wouldn't surprise me
that much for this extension.  Normally, you don't chain multiple of the
same struct together.  You chain extension structs on to modify the
original struct.  If inserting multiple callbacks at one go this way is
allowed, then we 

Re: [Mesa-dev] [PATCH] i965: add 2xMSAA and 16xMSAA to DRI configs for Gen9.

2017-08-24 Thread Tapani Pälli

Hi;

On 08/25/2017 12:30 AM, Kenneth Graunke wrote:

On Thursday, August 24, 2017 4:16:39 AM PDT kevin.rogo...@intel.com wrote:

From: Kevin Rogovin 

Special thanks to Eero Tamminen for reporting rasterizer
numbers being twice what it should be for 2xMSAA under
a benchmark.

Signed-off-by: Kevin Rogovin 


Nice catch!  Thanks for fixing this.

Reviewed-by: Kenneth Graunke 

Ian requested that I run this through a full CTS run before pushing, so
that we actually hit all the new visuals, and make sure 2x/16x works as
expected.  Assuming that comes back green, I'll plan to push this.



I did a CI run yesterday with this one; it had 2 failing Piglit tests:

bin/read-front -samples=16 -auto
bin/read-front clear-front-first -samples=16 -auto

not sure what's going on there.

// Tapani


Re: [Mesa-dev] [PATCH 1/2] anv: implementation of VK_EXT_debug_report extension

2017-08-24 Thread Tapani Pälli

Hi;

On 08/24/2017 08:36 PM, Jason Ekstrand wrote:
On Wed, Aug 23, 2017 at 11:23 PM, Tapani Pälli wrote:


Patch adds required functionality for extension to manage a list of
application provided callbacks and handle debug reporting from driver
and application side.

Signed-off-by: Tapani Pälli
---
  src/intel/Makefile.sources  |   1 +
  src/intel/vulkan/anv_debug_report.c | 133

  src/intel/vulkan/anv_device.c   |  40 +++
  src/intel/vulkan/anv_extensions.py  |   1 +
  src/intel/vulkan/anv_private.h  |  32 +
  5 files changed, 207 insertions(+)
  create mode 100644 src/intel/vulkan/anv_debug_report.c

diff --git a/src/intel/Makefile.sources b/src/intel/Makefile.sources
index 4074ba9ee5..200713b06e 100644
--- a/src/intel/Makefile.sources
+++ b/src/intel/Makefile.sources
@@ -205,6 +205,7 @@ VULKAN_FILES := \
 vulkan/anv_batch_chain.c \
 vulkan/anv_blorp.c \
 vulkan/anv_cmd_buffer.c \
+   vulkan/anv_debug_report.c \
 vulkan/anv_descriptor_set.c \
 vulkan/anv_device.c \
 vulkan/anv_dump.c \
diff --git a/src/intel/vulkan/anv_debug_report.c
b/src/intel/vulkan/anv_debug_report.c
new file mode 100644
index 00..1a4868cd52
--- /dev/null
+++ b/src/intel/vulkan/anv_debug_report.c
@@ -0,0 +1,133 @@
+/*
+ * Copyright © 2017 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person
obtaining a
+ * copy of this software and associated documentation files (the
"Software"),
+ * to deal in the Software without restriction, including without
limitation
+ * the rights to use, copy, modify, merge, publish, distribute,
sublicense,
+ * and/or sell copies of the Software, and to permit persons to
whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including
the next
+ * paragraph) shall be included in all copies or substantial
portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO
EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include "anv_private.h"
+#include "vk_util.h"
+
+/* This file contains implementation for VK_EXT_debug_report. */
+
+VkResult
+anv_CreateDebugReportCallbackEXT(VkInstance _instance,
+ const
VkDebugReportCallbackCreateInfoEXT* pCreateInfo,
+ const VkAllocationCallbacks*
pAllocator,
+ VkDebugReportCallbackEXT* pCallback)
+{
+   ANV_FROM_HANDLE(anv_instance, instance, _instance);
+   const VkAllocationCallbacks *alloc =
+  pAllocator ? pAllocator : &instance->alloc;


This is what vk_alloc2 is for.


Thanks, I had a feeling that there might be a helper for this but did 
not figure it out.
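
For readers unfamiliar with the helper: as I understand it, the idea is simply "use the caller's allocator if one was given, otherwise fall back to the parent's". A simplified sketch with hypothetical types follows; this is not the real vk_alloc2 signature, which also takes an alignment and an allocation scope:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, stripped-down allocator callbacks. */
struct example_allocator {
   void *(*alloc)(void *user_data, size_t size);
   void *user_data;
};

static void *
example_alloc(const struct example_allocator *a, size_t size)
{
   return a->alloc(a->user_data, size);
}

/* Prefer the per-call allocator; fall back to the parent's otherwise. */
static void *
example_alloc2(const struct example_allocator *parent,
               const struct example_allocator *local,
               size_t size)
{
   assert(parent != NULL);
   return example_alloc(local ? local : parent, size);
}

/* Demo callback that counts how often it is used. */
static void *
example_counting_malloc(void *user_data, size_t size)
{
   int *count = user_data;
   (*count)++;
   return malloc(size);
}
```

With a helper like this the open-coded `pAllocator ? pAllocator : ...` selection in each entrypoint goes away.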



+
+   vk_foreach_struct(info, pCreateInfo) {


Usually, we handle the primary structure directly and then call 
vk_foreach_struct on pCreateInfo->pNext.  This is because the things in 
the pNext chain are going to be modifiers to the original thing so they 
probably need to happen between allocating the callback and list_addtail().


Right, this would be for extending the functionality. I wrote it like 
this because it seems for me that the typical pattern would be to chain 
few report callbacks and send them all at once rather than calling the 
function n times. I'll change so that I handle first item out of the loop.




+  switch (info->sType) {
+  case VK_STRUCTURE_TYPE_DEBUG_REPORT_CALLBACK_CREATE_INFO_EXT: {
+ struct anv_debug_callback *cb =
+vk_alloc(alloc, sizeof(struct anv_debug_callback), 8,
+ VK_SYSTEM_ALLOCATION_SCOPE_INSTANCE);
+ if (!cb)
+return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
+
+ cb->flags = pCreateInfo->flags;
+ cb->callback = pCreateInfo->pfnCallback;
+ cb->data = pCreateInfo->pUserData;
+
+ list_addtail(&cb->link, &instance->callbacks);


What kind of threading guarantees does debug_report provide?  I'm 
guessing none in which case we need to lock 

[Mesa-dev] [PATCH] a2xx: add support for a few 16-bit color rendering formats

2017-08-24 Thread Ilia Mirkin
The rest should be possible too, just needs some additional
investigation. Passes fbo-*-formats piglit tests.

Signed-off-by: Ilia Mirkin 
---
 src/gallium/drivers/freedreno/a2xx/fd2_gmem.c   | 5 +
 src/gallium/drivers/freedreno/a2xx/fd2_screen.c | 7 ++-
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/freedreno/a2xx/fd2_gmem.c 
b/src/gallium/drivers/freedreno/a2xx/fd2_gmem.c
index aaba4127e0a..faf4dbccbc9 100644
--- a/src/gallium/drivers/freedreno/a2xx/fd2_gmem.c
+++ b/src/gallium/drivers/freedreno/a2xx/fd2_gmem.c
@@ -50,6 +50,11 @@ static uint32_t fmt2swap(enum pipe_format format)
switch (format) {
case PIPE_FORMAT_B8G8R8A8_UNORM:
case PIPE_FORMAT_B8G8R8X8_UNORM:
+   case PIPE_FORMAT_B5G6R5_UNORM:
+   case PIPE_FORMAT_B5G5R5A1_UNORM:
+   case PIPE_FORMAT_B5G5R5X1_UNORM:
+   case PIPE_FORMAT_B4G4R4A4_UNORM:
+   case PIPE_FORMAT_B4G4R4X4_UNORM:
/* TODO probably some more.. */
return 1;
default:
diff --git a/src/gallium/drivers/freedreno/a2xx/fd2_screen.c 
b/src/gallium/drivers/freedreno/a2xx/fd2_screen.c
index 714948c1cef..8e176b1341f 100644
--- a/src/gallium/drivers/freedreno/a2xx/fd2_screen.c
+++ b/src/gallium/drivers/freedreno/a2xx/fd2_screen.c
@@ -54,7 +54,12 @@ fd2_screen_is_format_supported(struct pipe_screen *pscreen,
 
/* TODO figure out how to render to other formats.. */
if ((usage & PIPE_BIND_RENDER_TARGET) &&
-   ((format != PIPE_FORMAT_B8G8R8A8_UNORM) &&
+   ((format != PIPE_FORMAT_B5G6R5_UNORM) &&
+(format != PIPE_FORMAT_B5G5R5A1_UNORM) &&
+(format != PIPE_FORMAT_B5G5R5X1_UNORM) &&
+(format != PIPE_FORMAT_B4G4R4A4_UNORM) &&
+(format != PIPE_FORMAT_B4G4R4X4_UNORM) &&
+(format != PIPE_FORMAT_B8G8R8A8_UNORM) &&
 (format != PIPE_FORMAT_B8G8R8X8_UNORM) &&
 (format != PIPE_FORMAT_R8G8B8A8_UNORM) &&
 (format != PIPE_FORMAT_R8G8B8X8_UNORM))) {
-- 
2.13.5
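
As an aside, a chain of `!=` comparisons like the one in fd2_screen_is_format_supported() could also be written as a switch, which may be easier to extend as more formats are figured out. A minimal sketch with a hypothetical enum standing in for enum pipe_format; it is not a proposed change to the driver:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical format enum, for illustration only. */
enum example_format {
   EXAMPLE_FORMAT_B8G8R8A8_UNORM,
   EXAMPLE_FORMAT_B5G6R5_UNORM,
   EXAMPLE_FORMAT_B4G4R4A4_UNORM,
   EXAMPLE_FORMAT_R32G32B32A32_FLOAT,
};

/* Returns true for formats on the render-target allow-list. */
static bool
example_is_render_target_format(enum example_format format)
{
   switch (format) {
   case EXAMPLE_FORMAT_B8G8R8A8_UNORM:
   case EXAMPLE_FORMAT_B5G6R5_UNORM:
   case EXAMPLE_FORMAT_B4G4R4A4_UNORM:
      return true;
   default:
      return false;
   }
}
```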



[Mesa-dev] [PATCH] disk_cache: assert if a cache entries keys don't match mesa

2017-08-24 Thread Timothy Arceri
In ef42423e7be9 I enabled the check for release builds; however, we
still want to assert in debug builds to alert us to collisions or
general bugs in the key building/compare code. Otherwise the lookup
just fails silently, effectively disabling the cache.
---
 src/util/disk_cache.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/util/disk_cache.c b/src/util/disk_cache.c
index 36c1e8e72c..b789a454eb 100644
--- a/src/util/disk_cache.c
+++ b/src/util/disk_cache.c
@@ -1078,22 +1078,24 @@ disk_cache_get(struct disk_cache *cache, const 
cache_key key, size_t *size)
   goto fail;
 
if (sb.st_size < ck_size)
   goto fail;
 
ret = read_all(fd, file_header, ck_size);
if (ret == -1)
   goto fail;
 
/* Check for extremely unlikely hash collisions */
-   if (memcmp(cache->driver_keys_blob, file_header, ck_size) != 0)
+   if (memcmp(cache->driver_keys_blob, file_header, ck_size) != 0) {
+  assert(!"Mesa cache keys mismatch!");
   goto fail;
+   }
 
size_t cache_item_md_size = sizeof(uint32_t);
uint32_t md_type;
 ret = read_all(fd, &md_type, cache_item_md_size);
if (ret == -1)
   goto fail;
 
if (md_type == CACHE_ITEM_TYPE_GLSL) {
   uint32_t num_keys;
   cache_item_md_size += sizeof(uint32_t);
-- 
2.13.4
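
The pattern in the patch (assert in debug builds, fall through to a graceful failure in release builds) can be shown in isolation. A small sketch, with a hypothetical helper name; the release-build behaviour assumes NDEBUG is defined so that assert() compiles to nothing:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: compares the expected key blob with what was read. */
static bool
example_header_matches(const void *expected, const void *found, size_t size)
{
   if (memcmp(expected, found, size) != 0) {
      /* Debug builds abort here so the mismatch gets noticed. With NDEBUG
       * the assert disappears and the caller just sees a cache miss. */
      assert(!"cache keys mismatch");
      return false;
   }
   return true;
}
```

`!"string"` is always false, so the assert fires unconditionally when reached while still printing a readable message.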



Re: [Mesa-dev] [PATCH V3 1/5] util/disk_cache: rename mesa cache dir and introduce cache versioning

2017-08-24 Thread Dieter Nützel

For the series:

Tested-by: Dieter Nützel 

on RX580 with Nine

Dieter

On 24.08.2017 03:11, Timothy Arceri wrote:

Steam is already analysing cache items, unfortunately we did not
introduce a versioning mechanism for identifying structural changes
to cache entries earlier so the only way to do so is to rename the
cache directory.

Since we are renaming it we take the opportunity to give the directory
a more meaningful name.

Adding a version field to the header of cache entries will help us to
avoid having to rename the directory in future. Please note this is
versioning for the internal structure of the entries as defined in
disk_cache.{c,h} as opposed to the structure of the data provided to
the disk cache by the GLSL compiler and the various driver backends.

V3: fix silly bug where cache->driver_keys_blob was incremented 
directly
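
To make the versioning idea and the V3 fix concrete, here is a rough sketch of writing a version byte at the start of an entry header and checking it on read. The layout, names, and version value are assumptions for illustration and do not match the real disk_cache.c format. Note that a local cursor is advanced rather than the stored buffer pointer, which is the kind of bug the V3 note mentions:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define EXAMPLE_CACHE_VERSION 1

/* Writes a version byte followed by the NUL-terminated driver id.
 * Returns the number of bytes written, or 0 if the buffer is too small. */
static size_t
example_write_header(uint8_t *buf, size_t buf_size, const char *driver_id)
{
   assert(buf != NULL);
   size_t id_size = strlen(driver_id) + 1;
   if (buf_size < 1 + id_size)
      return 0;

   uint8_t *cursor = buf;   /* advance a copy, never the caller's pointer */
   *cursor++ = EXAMPLE_CACHE_VERSION;
   memcpy(cursor, driver_id, id_size);
   cursor += id_size;

   return (size_t)(cursor - buf);
}

/* An entry is only usable if it was written with the current version. */
static bool
example_header_is_current(const uint8_t *buf, size_t size)
{
   return size >= 1 && buf[0] == EXAMPLE_CACHE_VERSION;
}
```

A third-party reader could use the same leading field to decide whether it understands an entry.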

---
 src/compiler/glsl/tests/cache_test.c |  6 +++--
 src/util/disk_cache.c| 47 
+++-

 src/util/disk_cache.h|  2 ++
 3 files changed, 41 insertions(+), 14 deletions(-)

diff --git a/src/compiler/glsl/tests/cache_test.c
b/src/compiler/glsl/tests/cache_test.c
index af1b66fb3d..3796ce6170 100644
--- a/src/compiler/glsl/tests/cache_test.c
+++ b/src/compiler/glsl/tests/cache_test.c
@@ -178,38 +178,40 @@ test_disk_cache_create(void)
/* Test with XDG_CACHE_HOME set */
setenv("XDG_CACHE_HOME", CACHE_TEST_TMP "/xdg-cache-home", 1);
cache = disk_cache_create("test", "make_check", 0);
expect_null(cache, "disk_cache_create with XDG_CACHE_HOME set with"
"a non-existing parent directory");

mkdir(CACHE_TEST_TMP, 0755);
cache = disk_cache_create("test", "make_check", 0);
expect_non_null(cache, "disk_cache_create with XDG_CACHE_HOME 
set");


-   check_directories_created(CACHE_TEST_TMP "/xdg-cache-home/mesa");
+   check_directories_created(CACHE_TEST_TMP "/xdg-cache-home/"
+ CACHE_DIR_NAME);

disk_cache_destroy(cache);

/* Test with MESA_GLSL_CACHE_DIR set */
err = rmrf_local(CACHE_TEST_TMP);
expect_equal(err, 0, "Removing " CACHE_TEST_TMP);

setenv("MESA_GLSL_CACHE_DIR", CACHE_TEST_TMP 
"/mesa-glsl-cache-dir", 1);

cache = disk_cache_create("test", "make_check", 0);
expect_null(cache, "disk_cache_create with MESA_GLSL_CACHE_DIR set 
with"

"a non-existing parent directory");

mkdir(CACHE_TEST_TMP, 0755);
cache = disk_cache_create("test", "make_check", 0);
expect_non_null(cache, "disk_cache_create with MESA_GLSL_CACHE_DIR 
set");


-   check_directories_created(CACHE_TEST_TMP 
"/mesa-glsl-cache-dir/mesa");

+   check_directories_created(CACHE_TEST_TMP "/mesa-glsl-cache-dir/"
+ CACHE_DIR_NAME);

disk_cache_destroy(cache);
 }

 static bool
 does_cache_contain(struct disk_cache *cache, const cache_key key)
 {
void *result;

result = disk_cache_get(cache, key, NULL);
diff --git a/src/util/disk_cache.c b/src/util/disk_cache.c
index b2229874e0..b2747fbce4 100644
--- a/src/util/disk_cache.c
+++ b/src/util/disk_cache.c
@@ -51,20 +51,34 @@

 /* Number of bits to mask off from a cache key to get an index. */
 #define CACHE_INDEX_KEY_BITS 16

 /* Mask for computing an index from a key. */
 #define CACHE_INDEX_KEY_MASK ((1 << CACHE_INDEX_KEY_BITS) - 1)

 /* The number of keys that can be stored in the index. */
 #define CACHE_INDEX_MAX_KEYS (1 << CACHE_INDEX_KEY_BITS)

+/* The cache version should be bumped whenever a change is made to the
+ * structure of cache entries or the index. This will give any 3rd 
party
+ * applications reading the cache entries a chance to adjust to the 
changes.

+ *
+ * - The cache version is checked internally when reading a cache 
entry. If we
+ *   ever have a mismatch we are in big trouble as this means we had a 
cache
+ *   collision. In case of such an event please check the skys for 
giant
+ *   asteroids and that the entire Mesa team hasn't been eaten by 
wolves.

+ *
+ * - There is no strict requirement that cache versions be backwards
+ *   compatible but effort should be taken to limit disruption where 
possible.

+ */
+#define CACHE_VERSION 1
+
 struct disk_cache {
/* The path to the cache directory. */
char *path;

/* Thread queue for compressing and writing cache entries to disk 
*/

struct util_queue cache_queue;

/* Seed for rand, which is used to pick a random directory */
uint64_t seed_xorshift128plus[2];

@@ -153,20 +167,25 @@ concatenate_and_mkdir(void *ctx, const char
*path, const char *name)
   return NULL;

new_path = ralloc_asprintf(ctx, "%s/%s", path, name);

if (mkdir_if_needed(new_path) == 0)
   return new_path;
else
   return NULL;
 }

+#define DRV_KEY_CPY(_dst, _src, _src_size) { \
+   memcpy(_dst, _src, _src_size);\
+   _dst += _src_size;\
+} while (0)
+
 struct disk_cache *
 disk_cache_create(const char 

Re: [Mesa-dev] [PATCH v2 08/10] anv: Use DRM sync objects to back fences whenever possible

2017-08-24 Thread Jason Ekstrand
A v3 of this will be coming tomorrow.  Dave asked me to rework some kernel 
apis.



On August 24, 2017 4:49:53 PM Jason Ekstrand  wrote:


In order to implement VK_KHR_external_fence, we need to back our fences
with something that's shareable.  Since the kernel wait interface for
sync objects already supports waiting for multiple fences in one go, it
makes anv_WaitForFences much simpler if we only have one type of fence.
---
 src/intel/vulkan/anv_batch_chain.c |   8 +++
 src/intel/vulkan/anv_device.c  |   2 +
 src/intel/vulkan/anv_private.h |   4 ++
 src/intel/vulkan/anv_queue.c   | 133 ++---
 4 files changed, 138 insertions(+), 9 deletions(-)

diff --git a/src/intel/vulkan/anv_batch_chain.c 
b/src/intel/vulkan/anv_batch_chain.c

index 0a0be8d..52c4510 100644
--- a/src/intel/vulkan/anv_batch_chain.c
+++ b/src/intel/vulkan/anv_batch_chain.c
@@ -1560,6 +1560,14 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
 return result;
  break;

+  case ANV_FENCE_TYPE_SYNCOBJ:
+ result = anv_execbuf_add_syncobj(, impl->syncobj,
+  I915_EXEC_FENCE_SIGNAL,
+  >alloc);
+ if (result != VK_SUCCESS)
+return result;
+ break;
+
   default:
  unreachable("Invalid fence type");
   }
diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index a6d5215..2e0fa19 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -339,6 +339,8 @@ anv_physical_device_init(struct anv_physical_device 
*device,

device->has_exec_async = anv_gem_get_param(fd, I915_PARAM_HAS_EXEC_ASYNC);
device->has_exec_fence = anv_gem_get_param(fd, I915_PARAM_HAS_EXEC_FENCE);
device->has_syncobj = anv_gem_get_param(fd, 
I915_PARAM_HAS_EXEC_FENCE_ARRAY);
+   device->has_syncobj_wait = device->has_syncobj &&
+  anv_gem_supports_syncobj_wait(fd);

bool swizzled = anv_gem_get_bit6_swizzle(fd, I915_TILING_X);

diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index 66a85db..4d83767 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -646,6 +646,7 @@ struct anv_physical_device {
 boolhas_exec_async;
 boolhas_exec_fence;
 boolhas_syncobj;
+boolhas_syncobj_wait;

 uint32_teu_total;
 uint32_tsubslice_total;
@@ -1748,6 +1749,9 @@ struct anv_fence_impl {
  struct anv_bo bo;
  enum anv_bo_fence_state state;
   } bo;
+
+  /** DRM syncobj handle for syncobj-based fences */
+  uint32_t syncobj;
};
 };

diff --git a/src/intel/vulkan/anv_queue.c b/src/intel/vulkan/anv_queue.c
index 7348e15..5e84566 100644
--- a/src/intel/vulkan/anv_queue.c
+++ b/src/intel/vulkan/anv_queue.c
@@ -271,17 +271,28 @@ VkResult anv_CreateFence(
if (fence == NULL)
   return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);

-   fence->permanent.type = ANV_FENCE_TYPE_BO;
+   if (device->instance->physicalDevice.has_syncobj_wait) {
+  fence->permanent.type = ANV_FENCE_TYPE_SYNCOBJ;

-   VkResult result = anv_bo_pool_alloc(&device->batch_bo_pool,
-   &fence->permanent.bo.bo, 4096);
-   if (result != VK_SUCCESS)
-  return result;
+  fence->permanent.syncobj = anv_gem_syncobj_create(device);
+  if (!fence->permanent.syncobj)
+ return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);

-   if (pCreateInfo->flags & VK_FENCE_CREATE_SIGNALED_BIT) {
-  fence->permanent.bo.state = ANV_BO_FENCE_STATE_SIGNALED;
+  if (pCreateInfo->flags & VK_FENCE_CREATE_SIGNALED_BIT)
+ anv_gem_syncobj_signal(device, fence->permanent.syncobj);
} else {
-  fence->permanent.bo.state = ANV_BO_FENCE_STATE_RESET;
+  fence->permanent.type = ANV_FENCE_TYPE_BO;
+
+  VkResult result = anv_bo_pool_alloc(&device->batch_bo_pool,
+  &fence->permanent.bo.bo, 4096);
+  if (result != VK_SUCCESS)
+ return result;
+
+  if (pCreateInfo->flags & VK_FENCE_CREATE_SIGNALED_BIT) {
+ fence->permanent.bo.state = ANV_BO_FENCE_STATE_SIGNALED;
+  } else {
+ fence->permanent.bo.state = ANV_BO_FENCE_STATE_RESET;
+  }
}

*pFence = anv_fence_to_handle(fence);
@@ -301,6 +312,10 @@ anv_fence_impl_cleanup(struct anv_device *device,
case ANV_FENCE_TYPE_BO:
   anv_bo_pool_free(&device->batch_bo_pool, &impl->bo.bo);
   return;
+
+   case ANV_FENCE_TYPE_SYNCOBJ:
+  anv_gem_syncobj_destroy(device, impl->syncobj);
+  return;
}

unreachable("Invalid fence type");
@@ -328,6 +343,8 @@ VkResult anv_ResetFences(
 uint32_tfenceCount,
 const 

Re: [Mesa-dev] [PATCH 0/5] intel/isl: Set MOCS based on view usage

2017-08-24 Thread Jason Ekstrand

Ken,

Did you ever look at this?

--Jason


On August 23, 2017 12:50:25 AM "Pohjolainen, Topi" 
 wrote:



On Tue, Aug 01, 2017 at 03:48:29PM -0700, Jason Ekstrand wrote:

This little series changes things around so that, instead of passing MOCS
values into ISL, ISL knows how to set them itself.  This allows us to
centralize some of the decisions about how MOCS gets set for surfaces and
hopefully, if we ever do anything crazy in the future, we can share it
between GL and Vulkan.  Unfortunately, surfaces are not the only places
where MOCS is used.  It also shows up in vertex buffers, index buffers, and
streamout buffers.  However those are always set to the platform equivalent
of I915_MOCS_CACHED (and that's not all that liable to change) so they're
not particularly interesting.

If people like this approach, I'd like to Cc it to stable for 17.2 because
it has the side-effect of making Vulkan MOCS a bit more sane.


Looks like there wasn't any input yet. I think this is clearer and less error
prone. So +1 and series:

Reviewed-by: Topi Pohjolainen 



Jason Ekstrand (5):
  intel/isl: Set MOCS based on usage for surface states
  intel/blorp: Delete the MOCS plumbing
  i965: Stop passing MOCS information into ISL
  anv: Stop passing MOCS information into ISL
  intel/isl: Get rid of the mocs fields in fill/emit_info

 src/intel/blorp/blorp.h  |  6 ---
 src/intel/blorp/blorp_genX_exec.h| 37 +++--
 src/intel/isl/isl.h  | 22 --
 src/intel/isl/isl_emit_depth_stencil.c   | 12 +++---
 src/intel/isl/isl_genX_mocs.h| 53 
 src/intel/isl/isl_surface_state.c|  9 ++--
 src/intel/vulkan/anv_blorp.c |  3 --
 src/intel/vulkan/anv_device.c|  1 -
 src/intel/vulkan/anv_image.c | 12 ++
 src/intel/vulkan/anv_private.h   |  2 -
 src/intel/vulkan/genX_cmd_buffer.c   | 13 ++
 src/intel/vulkan/genX_state.c|  3 --
 src/mesa/drivers/dri/i965/brw_blorp.c| 15 ---
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 26 ++--
 14 files changed, 101 insertions(+), 113 deletions(-)
 create mode 100644 src/intel/isl/isl_genX_mocs.h

--
2.5.0.400.gff86faf



[Mesa-dev] [Bug 102394] RBDOOM3BFG digital vomit

2017-08-24 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=102394

--- Comment #2 from dlol...@gmail.com ---
To be honest, I'm not sure which version it last worked with. On my laptop, which
has a Radeon HD 6000 series GPU, it also ran fine a few weeks ago, and then this
started to happen after an update.  Removing the Padoka PPA fixes the problem as
well, but then I revert to whatever comes with Mint 18.2.

Anyway, this is what my laptop reports; it has the same problem as my
desktop with the Radeon HD 7850.  I'll see what I can do about finding which
version it last worked with.

OpenGL vendor string: X.Org
OpenGL renderer string: AMD CEDAR (DRM 2.50.0 / 4.12.5-041205-generic, LLVM
6.0.0)
OpenGL core profile version string: 3.3 (Core Profile) Mesa 17.3.0-devel -
padoka PPA
OpenGL core profile shading language version string: 3.30
OpenGL core profile context flags: (none)
OpenGL core profile profile mask: core profile
OpenGL core profile extensions:
OpenGL version string: 3.0 Mesa 17.3.0-devel - padoka PPA
OpenGL shading language version string: 1.30
OpenGL context flags: (none)
OpenGL extensions:
OpenGL ES profile version string: OpenGL ES 3.0 Mesa 17.3.0-devel - padoka PPA
OpenGL ES profile shading language version string: OpenGL ES GLSL ES 3.00
OpenGL ES profile extensions:

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v9 0/7] mesa/st: glsl_to_tgsi: refined register merge algorithm

2017-08-24 Thread Michel Dänzer
On 25/08/17 02:38 AM, Gert Wollny wrote:
> 
> The patch doesn't introduce piglit regression (I tested the shader subset). 

I'd recommend testing at least the gpu profile, ideally running on X.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 102394] RBDOOM3BFG digital vomit

2017-08-24 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=102394

--- Comment #1 from Timothy Arceri  ---
Please find out what version of Mesa was working vs the Padoka PPA version and
add that to this bug report. To get that you can use:

glxinfo | grep OpenGL

From there it would be very helpful if you could build mesa from source and do
a git bisect to discover the commit that caused the regression. You can get
assistance with this on the freenode #radeon IRC channel if you need it.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/2] radeonsi: rewrite late alloc VS limit computation

2017-08-24 Thread Marek Olšák
From: Marek Olšák 

This is still very simple, but it's better than before.

Loosely ported from Vulkan.
---
 src/gallium/drivers/radeonsi/si_state.c | 38 ++---
 1 file changed, 26 insertions(+), 12 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_state.c 
b/src/gallium/drivers/radeonsi/si_state.c
index 24e509c..393f960 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -4649,40 +4649,54 @@ static void si_init_config(struct si_context *sctx)
/* If this is 0, Bonaire can hang even if GS isn't 
being used.
 * Other chips are unaffected. These are suboptimal 
values,
 * but we don't use on-chip GS.
 */
si_pm4_set_reg(pm4, R_028A44_VGT_GS_ONCHIP_CNTL,
   S_028A44_ES_VERTS_PER_SUBGRP(64) |
   S_028A44_GS_PRIMS_PER_SUBGRP(4));
}
si_pm4_set_reg(pm4, R_00B21C_SPI_SHADER_PGM_RSRC3_GS, 
S_00B21C_CU_EN(0x));
 
-   if (sscreen->b.info.num_good_compute_units /
-   (sscreen->b.info.max_se * sscreen->b.info.max_sh_per_se) <= 
4) {
+   /* Compute LATE_ALLOC_VS.LIMIT. */
+   unsigned num_cu_per_sh = sscreen->b.info.num_good_compute_units 
/
+(sscreen->b.info.max_se *
+ sscreen->b.info.max_sh_per_se);
+   unsigned late_alloc_limit; /* The limit is per SH. */
+
+   if (sctx->b.family == CHIP_KABINI) {
+   late_alloc_limit = 0; /* Potential hang on Kabini. */
+   } else if (num_cu_per_sh <= 4) {
/* Too few available compute units per SH. Disallowing
-* VS to run on CU0 could hurt us more than late VS
+* VS to run on one CU could hurt us more than late VS
 * allocation would help.
 *
-* LATE_ALLOC_VS = 2 is the highest safe number.
+* 2 is the highest safe number that allows us to keep
+* all CUs enabled.
 */
-   si_pm4_set_reg(pm4, R_00B118_SPI_SHADER_PGM_RSRC3_VS, 
S_00B118_CU_EN(0x));
-   si_pm4_set_reg(pm4, R_00B11C_SPI_SHADER_LATE_ALLOC_VS, 
S_00B11C_LIMIT(2));
+   late_alloc_limit = 2;
} else {
-   /* Set LATE_ALLOC_VS == 31. It should be less than
-* the number of scratch waves. Limitations:
-* - VS can't execute on CU0.
-* - If HS writes outputs to LDS, LS can't execute on 
CU0.
+   /* This is a good initial value.
+* We shouldn't run into a VS-PS deadlock, because it
+* only allows 1 late_alloc wave per SIMD on num_cu - 2.
 */
-   si_pm4_set_reg(pm4, R_00B118_SPI_SHADER_PGM_RSRC3_VS, 
S_00B118_CU_EN(0xfffe));
-   si_pm4_set_reg(pm4, R_00B11C_SPI_SHADER_LATE_ALLOC_VS, 
S_00B11C_LIMIT(31));
+   late_alloc_limit = (num_cu_per_sh - 2) * 4;
+
+   /* The limit is 0-based, so 0 means 1. */
+   assert(late_alloc_limit > 0 && late_alloc_limit <= 64);
+   late_alloc_limit -= 1;
}
 
+   /* VS can't execute on one CU if the limit is > 2. */
+   si_pm4_set_reg(pm4, R_00B118_SPI_SHADER_PGM_RSRC3_VS,
+  S_00B118_CU_EN(late_alloc_limit > 2 ? 0xfffe : 
0x));
+   si_pm4_set_reg(pm4, R_00B11C_SPI_SHADER_LATE_ALLOC_VS,
+  S_00B11C_LIMIT(late_alloc_limit));
si_pm4_set_reg(pm4, R_00B01C_SPI_SHADER_PGM_RSRC3_PS, 
S_00B01C_CU_EN(0x));
}
 
if (sctx->b.chip_class >= VI) {
unsigned vgt_tess_distribution;
 
si_pm4_set_reg(pm4, R_028424_CB_DCC_CONTROL,
   
S_028424_OVERWRITE_COMBINER_MRT_SHARING_DISABLE(1) |
   S_028424_OVERWRITE_COMBINER_WATERMARK(4));
 
-- 
2.7.4
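
For readers following the new LATE_ALLOC_VS logic, here is a minimal standalone sketch of the decision the patch makes, with the register writes left out. The struct, the function name and the is_kabini flag are hypothetical and exist only for illustration. The "all CUs enabled" mask is truncated in the archived diff, so 0xffff is used here as an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical container for the two values the patch derives. */
struct late_alloc_cfg {
   unsigned limit;    /* 0-based value for LATE_ALLOC_VS.LIMIT (per SH) */
   unsigned vs_cu_en; /* CU enable mask for the VS stage */
};

static struct late_alloc_cfg
compute_late_alloc(unsigned num_good_cus, unsigned max_se,
                   unsigned max_sh_per_se, bool is_kabini)
{
   struct late_alloc_cfg cfg;
   unsigned num_cu_per_sh = num_good_cus / (max_se * max_sh_per_se);

   if (is_kabini) {
      /* Potential hang on Kabini: disable late allocation. */
      cfg.limit = 0;
   } else if (num_cu_per_sh <= 4) {
      /* Too few CUs per SH: 2 is the highest value that keeps
       * every CU available to VS. */
      cfg.limit = 2;
   } else {
      /* One late-alloc wave per SIMD (4 SIMDs per CU) on num_cu - 2. */
      unsigned waves = (num_cu_per_sh - 2) * 4;

      assert(waves > 0 && waves <= 64);
      cfg.limit = waves - 1; /* The field is 0-based, so 0 means 1. */
   }

   /* VS loses one CU once the limit exceeds 2. The 0xffff value is an
    * assumption; the archived diff only shows a truncated "0x". */
   cfg.vs_cu_en = cfg.limit > 2 ? 0xfffe : 0xffff;
   return cfg;
}
```

With 32 good CUs on 4 SEs and 1 SH per SE, this yields 8 CUs per SH, so the waves count is (8 - 2) * 4 = 24 and the programmed limit is 23.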

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/2] gallium/radeon: set EVENT_WRITE_EOP.INT_SEL = wait for write confirmation

2017-08-24 Thread Marek Olšák
From: Marek Olšák 

Ported from Vulkan.
Not sure what this is good for.. maybe write confirmation from L2 flushes?
---
 src/amd/common/r600d_common.h |  3 +++
 src/gallium/drivers/radeon/r600_pipe_common.c | 12 +---
 2 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/src/amd/common/r600d_common.h b/src/amd/common/r600d_common.h
index 5775746..76c5c4f 100644
--- a/src/amd/common/r600d_common.h
+++ b/src/amd/common/r600d_common.h
@@ -60,20 +60,23 @@
 #defineCOPY_DATA_MEM   1
 #define COPY_DATA_PERF  4
 #define COPY_DATA_IMM   5
 #define COPY_DATA_TIMESTAMP 9
 #defineCOPY_DATA_DST_SEL(x)(((unsigned)(x) & 0xf) 
<< 8)
 #define COPY_DATA_MEM_ASYNC 5
 #defineCOPY_DATA_COUNT_SEL (1 << 16)
 #defineCOPY_DATA_WR_CONFIRM(1 << 20)
 #define PKT3_EVENT_WRITE   0x46
 #define PKT3_EVENT_WRITE_EOP   0x47
+#define EOP_INT_SEL(x)  ((x) << 24)
+#defineEOP_INT_SEL_NONE0
+#defineEOP_INT_SEL_SEND_DATA_AFTER_WR_CONFIRM  3
 #define EOP_DATA_SEL(x) ((x) << 29)
 #defineEOP_DATA_SEL_DISCARD0
 #defineEOP_DATA_SEL_VALUE_32BIT1
 #defineEOP_DATA_SEL_VALUE_64BIT2
 #defineEOP_DATA_SEL_TIMESTAMP  3
 #define PKT3_RELEASE_MEM   0x49 /* GFX9+ */
 #define PKT3_SET_CONFIG_REG   0x68
 #define PKT3_SET_CONTEXT_REG  0x69
 #define PKT3_STRMOUT_BASE_UPDATE  0x72 /* r700 only */
 #define PKT3_SURFACE_BASE_UPDATE   0x73 /* r600 only */
diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c 
b/src/gallium/drivers/radeon/r600_pipe_common.c
index 7226fc2..7c12565 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.c
+++ b/src/gallium/drivers/radeon/r600_pipe_common.c
@@ -102,20 +102,26 @@ void radeon_shader_binary_clean(struct ac_shader_binary 
*b)
 void r600_gfx_write_event_eop(struct r600_common_context *ctx,
  unsigned event, unsigned event_flags,
  unsigned data_sel,
  struct r600_resource *buf, uint64_t va,
  uint32_t new_fence, unsigned query_type)
 {
struct radeon_winsys_cs *cs = ctx->gfx.cs;
unsigned op = EVENT_TYPE(event) |
  EVENT_INDEX(5) |
  event_flags;
+   unsigned sel = EOP_DATA_SEL(data_sel);
+
+   /* Wait for write confirmation before writing data, but don't send
+* an interrupt. */
+   if (ctx->chip_class >= SI && data_sel != EOP_DATA_SEL_DISCARD)
+   sel |= EOP_INT_SEL(EOP_INT_SEL_SEND_DATA_AFTER_WR_CONFIRM);
 
if (ctx->chip_class >= GFX9) {
/* A ZPASS_DONE or PIXEL_STAT_DUMP_EVENT (of the DB occlusion
 * counters) must immediately precede every timestamp event to
 * prevent a GPU hang on GFX9.
 *
 * Occlusion queries don't need to do it here, because they
 * always do ZPASS_DONE before the timestamp.
 */
if (ctx->chip_class == GFX9 &&
@@ -129,51 +135,51 @@ void r600_gfx_write_event_eop(struct r600_common_context 
*ctx,
radeon_emit(cs, EVENT_TYPE(EVENT_TYPE_ZPASS_DONE) | 
EVENT_INDEX(1));
radeon_emit(cs, scratch->gpu_address);
radeon_emit(cs, scratch->gpu_address >> 32);
 
radeon_add_to_buffer_list(ctx, &ctx->gfx, scratch,
  RADEON_USAGE_WRITE, 
RADEON_PRIO_QUERY);
}
 
radeon_emit(cs, PKT3(PKT3_RELEASE_MEM, 6, 0));
radeon_emit(cs, op);
-   radeon_emit(cs, EOP_DATA_SEL(data_sel));
+   radeon_emit(cs, sel);
radeon_emit(cs, va);/* address lo */
radeon_emit(cs, va >> 32);  /* address hi */
radeon_emit(cs, new_fence); /* immediate data lo */
radeon_emit(cs, 0); /* immediate data hi */
radeon_emit(cs, 0); /* unused */
} else {
if (ctx->chip_class == CIK ||
ctx->chip_class == VI) {
struct r600_resource *scratch = ctx->eop_bug_scratch;
uint64_t va = scratch->gpu_address;
 
/* Two EOP events are required to make all engines go 
idle
 * (and optional cache flushes executed) before the 
timestamp
 * is written.
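
The way the patch combines DATA_SEL and INT_SEL into one dword can be sketched in isolation as follows. The field shifts and values are the ones from the r600d_common.h hunk; the helper name build_eop_sel and the boolean standing in for the chip_class >= SI check are hypothetical, and the uint32_t casts are added only to keep the shifts well defined in a standalone file.

```c
#include <stdbool.h>
#include <stdint.h>

/* Encodings copied from the r600d_common.h hunk of the patch. */
#define EOP_INT_SEL(x)                         ((uint32_t)(x) << 24)
#define EOP_INT_SEL_NONE                       0
#define EOP_INT_SEL_SEND_DATA_AFTER_WR_CONFIRM 3
#define EOP_DATA_SEL(x)                        ((uint32_t)(x) << 29)
#define EOP_DATA_SEL_DISCARD                   0
#define EOP_DATA_SEL_VALUE_32BIT               1
#define EOP_DATA_SEL_TIMESTAMP                 3

/* Hypothetical helper: si_or_newer stands in for chip_class >= SI. */
static uint32_t
build_eop_sel(unsigned data_sel, bool si_or_newer)
{
   uint32_t sel = EOP_DATA_SEL(data_sel);

   /* Wait for write confirmation before writing data, but only when
    * there is data to write and the chip supports it. */
   if (si_or_newer && data_sel != EOP_DATA_SEL_DISCARD)
      sel |= EOP_INT_SEL(EOP_INT_SEL_SEND_DATA_AFTER_WR_CONFIRM);

   return sel;
}
```

A 32-bit value write on SI or newer therefore encodes as (1 << 29) | (3 << 24), while a discard event leaves both fields at zero.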
   

Re: [Mesa-dev] [PATCH 00/14] KHR_no_error support to various TFB functions

2017-08-24 Thread Timothy Arceri

On 24/08/17 23:21, Samuel Pitoiset wrote:

Hi,
Some more simple patches for KHR_no_error support.
Please review, thanks!



With the change suggested in patch 3 series is:

Reviewed-by: Timothy Arceri 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 03/14] mesa: add begin_transform_feedback() helper

2017-08-24 Thread Timothy Arceri

On 24/08/17 23:21, Samuel Pitoiset wrote:

Signed-off-by: Samuel Pitoiset 
---
  src/mesa/main/transformfeedback.c | 49 ---
  1 file changed, 30 insertions(+), 19 deletions(-)

diff --git a/src/mesa/main/transformfeedback.c 
b/src/mesa/main/transformfeedback.c
index 307728c399..b217d0d84a 100644
--- a/src/mesa/main/transformfeedback.c
+++ b/src/mesa/main/transformfeedback.c
@@ -381,22 +381,22 @@ get_xfb_source(struct gl_context *ctx)
  }
  
  
-void GLAPIENTRY

-_mesa_BeginTransformFeedback(GLenum mode)
+static ALWAYS_INLINE void
+begin_transform_feedback(struct gl_context *ctx, GLenum mode, bool no_error)
  {
 struct gl_transform_feedback_object *obj;
 struct gl_transform_feedback_info *info = NULL;
+   struct gl_program *source;
 GLuint i;
 unsigned vertices_per_prim;
-   GET_CURRENT_CONTEXT(ctx);
  
 obj = ctx->TransformFeedback.CurrentObject;
  
 /* Figure out what pipeline stage is the source of data for transform

  * feedback.
  */
-   struct gl_program *source = get_xfb_source(ctx);
-   if (source == NULL) {
+   source = get_xfb_source(ctx);



Personally I would rather this were left as is. We no longer have to bend to 
MSVC's refusal to support C99 for so many years.




+   if (!no_error && source == NULL) {
_mesa_error(ctx, GL_INVALID_OPERATION,
"glBeginTransformFeedback(no program active)");
return;
@@ -404,7 +404,7 @@ _mesa_BeginTransformFeedback(GLenum mode)
  
 info = source->sh.LinkedTransformFeedback;
  
-   if (info->NumOutputs == 0) {

+   if (!no_error && info->NumOutputs == 0) {
_mesa_error(ctx, GL_INVALID_OPERATION,
"glBeginTransformFeedback(no varyings to record)");
return;
@@ -421,23 +421,26 @@ _mesa_BeginTransformFeedback(GLenum mode)
vertices_per_prim = 3;
break;
 default:
-  _mesa_error(ctx, GL_INVALID_ENUM, "glBeginTransformFeedback(mode)");
+  if (!no_error)
+ _mesa_error(ctx, GL_INVALID_ENUM, "glBeginTransformFeedback(mode)");


You should be able to make this something like:

if (!no_error) {
   _mesa_error(ctx, GL_INVALID_ENUM, "glBeginTransformFeedback(mode)");
   return;
} else {
   /* Stop compiler warnings */
   unreachable("Error in API use when using KHR_no_error");
}




return;
 }
  
-   if (obj->Active) {

-  _mesa_error(ctx, GL_INVALID_OPERATION,
-  "glBeginTransformFeedback(already active)");
-  return;
-   }
+   if (!no_error) {
+  if (obj->Active) {
+ _mesa_error(ctx, GL_INVALID_OPERATION,
+ "glBeginTransformFeedback(already active)");
+ return;
+  }
  
-   for (i = 0; i < ctx->Const.MaxTransformFeedbackBuffers; i++) {

-  if ((info->ActiveBuffers >> i) & 1) {
- if (obj->BufferNames[i] == 0) {
-_mesa_error(ctx, GL_INVALID_OPERATION,
-"glBeginTransformFeedback(binding point %d does not "
-"have a buffer object bound)", i);
-return;
+  for (i = 0; i < ctx->Const.MaxTransformFeedbackBuffers; i++) {
+ if ((info->ActiveBuffers >> i) & 1) {
+if (obj->BufferNames[i] == 0) {
+   _mesa_error(ctx, GL_INVALID_OPERATION,
+   "glBeginTransformFeedback(binding point %d does not 
"
+   "have a buffer object bound)", i);
+   return;
+}
   }
}
 }
@@ -472,6 +475,14 @@ _mesa_BeginTransformFeedback(GLenum mode)
  }
  
  
+void GLAPIENTRY

+_mesa_BeginTransformFeedback(GLenum mode)
+{
+   GET_CURRENT_CONTEXT(ctx);
+   begin_transform_feedback(ctx, mode, false);
+}
+
+
  void GLAPIENTRY
  _mesa_EndTransformFeedback(void)
  {


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/3] Revert "st/va: add enviromental variable to disable interlace"

2017-08-24 Thread Michel Dänzer
On 25/08/17 12:11 AM, Leo Liu wrote:
> This reverts commit 10dec2de2d9f568675d66d736b48701fa26f7b50.
> 
> The environment variable is no longer needed with the previous change

Thanks for clarifying the commit log.

Reviewed-by: Michel Dänzer 


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] mesa: port the LastLookedUpVAO optimisation to _mesa_lookup_vao()

2017-08-24 Thread Timothy Arceri

Thanks for fixing this; I'd been meaning to look into it.

Reviewed-by: Timothy Arceri 

On 24/08/17 23:53, Samuel Pitoiset wrote:

It was only used in the error path.

Signed-off-by: Samuel Pitoiset 
---
  src/mesa/main/arrayobj.c | 20 
  1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/src/mesa/main/arrayobj.c b/src/mesa/main/arrayobj.c
index 600177cc5c..88a5702f41 100644
--- a/src/mesa/main/arrayobj.c
+++ b/src/mesa/main/arrayobj.c
@@ -66,11 +66,23 @@
  struct gl_vertex_array_object *
  _mesa_lookup_vao(struct gl_context *ctx, GLuint id)
  {
-   if (id == 0)
+   if (id == 0) {
return NULL;
-   else
-  return (struct gl_vertex_array_object *)
- _mesa_HashLookupLocked(ctx->Array.Objects, id);
+   } else {
+  struct gl_vertex_array_object *vao;
+
+  if (ctx->Array.LastLookedUpVAO &&
+  ctx->Array.LastLookedUpVAO->Name == id) {
+ vao = ctx->Array.LastLookedUpVAO;
+  } else {
+ vao = (struct gl_vertex_array_object *)
+_mesa_HashLookupLocked(ctx->Array.Objects, id);
+
+ _mesa_reference_vao(ctx, &ctx->Array.LastLookedUpVAO, vao);
+  }
+
+  return vao;
+   }
  }
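
As a rough illustration of the optimisation being ported, the sketch below puts a one-entry "last looked up" cache in front of a slower table search. All names are hypothetical; the linear search stands in for _mesa_HashLookupLocked(), and the reference counting that _mesa_reference_vao() performs in the real code is deliberately omitted.

```c
#include <stddef.h>

struct object {
   unsigned name;
};

struct context {
   struct object *objects;        /* stand-in for the hash table */
   size_t num_objects;
   struct object *last_looked_up; /* one-entry cache */
   unsigned slow_lookups;         /* counts table searches, for illustration */
};

/* Slow path: search the whole table. */
static struct object *
table_lookup(struct context *ctx, unsigned id)
{
   ctx->slow_lookups++;
   for (size_t i = 0; i < ctx->num_objects; i++) {
      if (ctx->objects[i].name == id)
         return &ctx->objects[i];
   }
   return NULL;
}

static struct object *
lookup_object(struct context *ctx, unsigned id)
{
   if (id == 0)
      return NULL;

   /* Fast path: the same name as last time needs no table search. */
   if (ctx->last_looked_up && ctx->last_looked_up->name == id)
      return ctx->last_looked_up;

   struct object *obj = table_lookup(ctx, id);
   ctx->last_looked_up = obj; /* remember the result, even if NULL */
   return obj;
}
```

Repeated lookups of the same name, which are common when an application issues several calls against one VAO, then cost a single comparison.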
  
  


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] mesa: don't error check the default buffer object in glBindBufferOffsetEXT()

2017-08-24 Thread Timothy Arceri

Reviewed-by: Timothy Arceri 

On 24/08/17 21:44, Samuel Pitoiset wrote:

An allocation check is already done when the buffer is created at
context creation.

Signed-off-by: Samuel Pitoiset 
---
  src/mesa/main/transformfeedback.c | 11 +--
  1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/src/mesa/main/transformfeedback.c 
b/src/mesa/main/transformfeedback.c
index 07a9f9e940..a075d0875a 100644
--- a/src/mesa/main/transformfeedback.c
+++ b/src/mesa/main/transformfeedback.c
@@ -778,12 +778,11 @@ _mesa_BindBufferOffsetEXT(GLenum target, GLuint index, 
GLuint buffer,
bufObj = ctx->Shared->NullBufferObj;
 } else {
bufObj = _mesa_lookup_bufferobj(ctx, buffer);
-   }
-
-   if (!bufObj) {
-  _mesa_error(ctx, GL_INVALID_OPERATION,
-  "glBindBufferOffsetEXT(invalid buffer=%u)", buffer);
-  return;
+  if (!bufObj) {
+ _mesa_error(ctx, GL_INVALID_OPERATION,
+ "glBindBufferOffsetEXT(invalid buffer=%u)", buffer);
+ return;
+  }
 }
  
 _mesa_bind_buffer_range_xfb(ctx, obj, index, bufObj, offset, 0);



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] radv/gfx9: fix buffer size on gfx9.

2017-08-24 Thread Dave Airlie
From: Dave Airlie 

The VI sizing only applies to VI.

This fixes:
dEQP-VK.image.image_size.buffer.*

Signed-off-by: Dave Airlie 
---
 src/amd/common/ac_nir_to_llvm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index cdfbff2..dd5ed1d 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -2012,7 +2012,7 @@ get_buffer_size(struct ac_nir_context *ctx, LLVMValueRef 
descriptor, bool in_ele
LLVMConstInt(ctx->ac.i32, 2, false), 
"");
 
/* VI only */
-   if (ctx->abi->chip_class >= VI && in_elements) {
+   if (ctx->abi->chip_class == VI && in_elements) {
/* On VI, the descriptor contains the size in bytes,
 * but TXQ must return the size in elements.
 * The stride is always non-zero for resources using TXQ.
-- 
2.9.4
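
The effect of changing ">=" to "==" can be seen in a simplified model of the size query. This is only a sketch: the real code builds LLVM IR and extracts the fields from the descriptor, whereas the hypothetical function below takes the size field and stride as plain integers. The enum order mirrors the generation order the comparison relies on.

```c
#include <stdbool.h>
#include <stdint.h>

/* Generations in ascending order, as the chip_class comparison assumes. */
enum chip_class { SI, CIK, VI, GFX9 };

/* Hypothetical model of get_buffer_size(): returns what a size query
 * should report given the raw descriptor size field. */
static uint32_t
buffer_size_for_query(enum chip_class chip, uint32_t size_field,
                      uint32_t stride, bool in_elements)
{
   /* VI only: the descriptor holds the size in bytes, but the query
    * must return elements. The stride is non-zero for such resources. */
   if (chip == VI && in_elements)
      return size_field / stride;

   /* Other generations: the field is returned unchanged. */
   return size_field;
}
```

With the old ">= VI" test, a GFX9 size field of 64 and a stride of 16 would have been divided down to 4; with "== VI" it is returned as 64.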

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/4] mesa: check allocation failures in new_transform_feedback()

2017-08-24 Thread Timothy Arceri



On 24/08/17 21:43, Samuel Pitoiset wrote:

Signed-off-by: Samuel Pitoiset 
---
  src/mesa/main/transformfeedback.c | 4 
  1 file changed, 4 insertions(+)

diff --git a/src/mesa/main/transformfeedback.c 
b/src/mesa/main/transformfeedback.c
index 9ffbfe7e6c..e30448ea4f 100644
--- a/src/mesa/main/transformfeedback.c
+++ b/src/mesa/main/transformfeedback.c
@@ -207,7 +207,11 @@ static struct gl_transform_feedback_object *
  new_transform_feedback(struct gl_context *ctx, GLuint name)
  {
 struct gl_transform_feedback_object *obj;
+
 obj = CALLOC_STRUCT(gl_transform_feedback_object);
+   if (!obj)
+  return NULL;
+


We should probably be doing a NULL check in 
_mesa_init_transform_feedback() also, and have that return a bool to 
init_attrib_groups(), but this doesn't make things any worse so for the 
series:


Reviewed-by: Timothy Arceri 


 _mesa_init_transform_feedback_object(obj, name);
 return obj;
  }
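
The review comment about propagating the failure up to the caller could look roughly like the sketch below. The types and function names are hypothetical, and the allocator is passed in as a function pointer purely so that the failure path can be exercised; Mesa's real code calls CALLOC_STRUCT directly.

```c
#include <stdbool.h>
#include <stdlib.h>

struct feedback_object {
   unsigned name;
   int ref_count;
};

/* Allocator signature matching calloc(), injectable for testing. */
typedef void *(*calloc_fn)(size_t, size_t);

static struct feedback_object *
new_feedback_object(calloc_fn alloc, unsigned name)
{
   struct feedback_object *obj = alloc(1, sizeof(*obj));

   if (!obj)
      return NULL; /* never initialise a failed allocation */

   obj->name = name;
   obj->ref_count = 1;
   return obj;
}

/* Init function that reports failure instead of silently continuing. */
static bool
init_default_object(calloc_fn alloc, struct feedback_object **out)
{
   *out = new_feedback_object(alloc, 0);
   return *out != NULL;
}

/* Allocator that always fails, used to simulate out-of-memory. */
static void *
failing_calloc(size_t n, size_t size)
{
   (void)n;
   (void)size;
   return NULL;
}
```

A caller such as context initialisation can then check the boolean and abort setup cleanly rather than dereferencing a NULL default object later.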


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] radv: fix predication on gfx9

2017-08-24 Thread Bas Nieuwenhuizen
Both are

Reviewed-by: Bas Nieuwenhuizen 

On Fri, Aug 25, 2017, at 01:42, Dave Airlie wrote:
> From: Dave Airlie 
> 
> When I added gfx9 I did it wrong, this fixes it.
> 
> Fixes: 5247b311e9 "radv/gfx9: fix set predication packet."
> Signed-off-by: Dave Airlie 
> ---
>  src/amd/vulkan/si_cmd_buffer.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/src/amd/vulkan/si_cmd_buffer.c
> b/src/amd/vulkan/si_cmd_buffer.c
> index 913ec0e..ef4f926 100644
> --- a/src/amd/vulkan/si_cmd_buffer.c
> +++ b/src/amd/vulkan/si_cmd_buffer.c
> @@ -1133,8 +1133,10 @@ si_emit_cache_flush(struct radv_cmd_buffer
> *cmd_buffer)
>  void
>  si_emit_set_predication_state(struct radv_cmd_buffer *cmd_buffer,
>  uint64_t va)
>  {
> -   uint32_t op = PRED_OP(PREDICATION_OP_BOOL64) |
> PREDICATION_DRAW_VISIBLE;
> +   uint32_t op = 0;
>  
> +   if (va)
> +   op = PRED_OP(PREDICATION_OP_BOOL64) |
> PREDICATION_DRAW_VISIBLE;
>   if (cmd_buffer->device->physical_device->rad_info.chip_class >= GFX9) {
>   radeon_emit(cmd_buffer->cs, PKT3(PKT3_SET_PREDICATION, 2, 0));
>   radeon_emit(cmd_buffer->cs, op);
> -- 
> 2.9.4
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2 10/10] anv: Add support for the SYNC_FD handle type for fences

2017-08-24 Thread Jason Ekstrand
---
 src/intel/vulkan/anv_gem.c   | 28 +
 src/intel/vulkan/anv_gem_stubs.c | 13 ++
 src/intel/vulkan/anv_private.h   |  4 +++
 src/intel/vulkan/anv_queue.c | 53 +++-
 4 files changed, 87 insertions(+), 11 deletions(-)

diff --git a/src/intel/vulkan/anv_gem.c b/src/intel/vulkan/anv_gem.c
index ed94a7e..8d4f56f 100644
--- a/src/intel/vulkan/anv_gem.c
+++ b/src/intel/vulkan/anv_gem.c
@@ -489,6 +489,34 @@ anv_gem_syncobj_fd_to_handle(struct anv_device *device, 
int fd)
return args.handle;
 }
 
+int
+anv_gem_syncobj_export_sync_file(struct anv_device *device, uint32_t handle)
+{
+   struct drm_syncobj_handle args = {
+  .handle = handle,
+  .flags = DRM_SYNCOBJ_HANDLE_TO_FD_FLAGS_EXPORT_SYNC_FILE,
+   };
+
+   int ret = anv_ioctl(device->fd, DRM_IOCTL_SYNCOBJ_HANDLE_TO_FD, &args);
+   if (ret)
+  return -1;
+
+   return args.fd;
+}
+
+int
+anv_gem_syncobj_import_sync_file(struct anv_device *device,
+ uint32_t handle, int fd)
+{
+   struct drm_syncobj_handle args = {
+  .handle = handle,
+  .fd = fd,
+  .flags = DRM_SYNCOBJ_FD_TO_HANDLE_FLAGS_IMPORT_SYNC_FILE,
+   };
+
+   return anv_ioctl(device->fd, DRM_IOCTL_SYNCOBJ_FD_TO_HANDLE, &args);
+}
+
 void
 anv_gem_syncobj_reset(struct anv_device *device, uint32_t handle)
 {
diff --git a/src/intel/vulkan/anv_gem_stubs.c b/src/intel/vulkan/anv_gem_stubs.c
index 17f84f2..1a01e27 100644
--- a/src/intel/vulkan/anv_gem_stubs.c
+++ b/src/intel/vulkan/anv_gem_stubs.c
@@ -187,6 +187,19 @@ anv_gem_sync_file_merge(struct anv_device *device, int 
fd1, int fd2)
unreachable("Unused");
 }
 
+int
+anv_gem_syncobj_export_sync_file(struct anv_device *device, uint32_t handle)
+{
+   unreachable("Unused");
+}
+
+int
+anv_gem_syncobj_import_sync_file(struct anv_device *device,
+ uint32_t handle, int fd)
+{
+   unreachable("Unused");
+}
+
 uint32_t
 anv_gem_syncobj_create(struct anv_device *device)
 {
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index 4d83767..808f090 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -810,6 +810,10 @@ uint32_t anv_gem_syncobj_create(struct anv_device *device);
 void anv_gem_syncobj_destroy(struct anv_device *device, uint32_t handle);
 int anv_gem_syncobj_handle_to_fd(struct anv_device *device, uint32_t handle);
 uint32_t anv_gem_syncobj_fd_to_handle(struct anv_device *device, int fd);
+int anv_gem_syncobj_export_sync_file(struct anv_device *device,
+ uint32_t handle);
+int anv_gem_syncobj_import_sync_file(struct anv_device *device,
+ uint32_t handle, int fd);
 void anv_gem_syncobj_reset(struct anv_device *device, uint32_t handle);
 void anv_gem_syncobj_signal(struct anv_device *device, uint32_t handle);
 bool anv_gem_supports_syncobj_wait(int fd);
diff --git a/src/intel/vulkan/anv_queue.c b/src/intel/vulkan/anv_queue.c
index e17e731..0ea6196 100644
--- a/src/intel/vulkan/anv_queue.c
+++ b/src/intel/vulkan/anv_queue.c
@@ -687,11 +687,14 @@ void anv_GetPhysicalDeviceExternalFencePropertiesKHR(
 
switch (pExternalFenceInfo->handleType) {
case VK_EXTERNAL_FENCE_HANDLE_TYPE_OPAQUE_FD_BIT_KHR:
+   case VK_EXTERNAL_FENCE_HANDLE_TYPE_SYNC_FD_BIT_KHR:
   if (device->has_syncobj_wait) {
  pExternalFenceProperties->exportFromImportedHandleTypes =
-VK_EXTERNAL_FENCE_HANDLE_TYPE_OPAQUE_FD_BIT_KHR;
+VK_EXTERNAL_FENCE_HANDLE_TYPE_OPAQUE_FD_BIT_KHR |
+VK_EXTERNAL_FENCE_HANDLE_TYPE_SYNC_FD_BIT_KHR;
  pExternalFenceProperties->compatibleHandleTypes =
-VK_EXTERNAL_FENCE_HANDLE_TYPE_OPAQUE_FD_BIT_KHR;
+VK_EXTERNAL_FENCE_HANDLE_TYPE_OPAQUE_FD_BIT_KHR |
+VK_EXTERNAL_FENCE_HANDLE_TYPE_SYNC_FD_BIT_KHR;
  pExternalFenceProperties->externalFenceFeatures =
 VK_EXTERNAL_FENCE_FEATURE_EXPORTABLE_BIT_KHR |
 VK_EXTERNAL_FENCE_FEATURE_IMPORTABLE_BIT_KHR;
@@ -731,22 +734,41 @@ VkResult anv_ImportFenceFdKHR(
   if (!new_impl.syncobj)
  return vk_error(VK_ERROR_INVALID_EXTERNAL_HANDLE_KHR);
 
-  /* From the Vulkan 1.0.53 spec:
-   *
-   *"Importing a fence payload from a file descriptor transfers
-   *ownership of the file descriptor from the application to the
-   *Vulkan implementation. The application must not perform any
-   *operations on the file descriptor after a successful import."
-   *
-   * If the import fails, we leave the file descriptor open.
+  break;
+
+   case VK_EXTERNAL_FENCE_HANDLE_TYPE_SYNC_FD_BIT_KHR:
+  /* Sync files are a bit tricky.  Because we want to continue using the
+   * syncobj implementation of WaitForFences, we don't use the sync file
+   * directly but instead import it into a syncobj.
*/
-  close(fd);
+  new_impl.type = 

[Mesa-dev] [PATCH v2 04/10] anv: Rename anv_fence_state to anv_bo_fence_state

2017-08-24 Thread Jason Ekstrand
It only applies to legacy BO fences.
---
 src/intel/vulkan/anv_batch_chain.c |  2 +-
 src/intel/vulkan/anv_private.h | 10 +-
 src/intel/vulkan/anv_queue.c   | 24 
 3 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/src/intel/vulkan/anv_batch_chain.c 
b/src/intel/vulkan/anv_batch_chain.c
index 775009c..0a0be8d 100644
--- a/src/intel/vulkan/anv_batch_chain.c
+++ b/src/intel/vulkan/anv_batch_chain.c
@@ -1619,7 +1619,7 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
* vkGetFenceStatus() return a valid result (VK_ERROR_DEVICE_LOST or
* VK_SUCCESS) in a finite amount of time even if execbuf fails.
*/
-  fence->permanent.bo.state = ANV_FENCE_STATE_SUBMITTED;
+  fence->permanent.bo.state = ANV_BO_FENCE_STATE_SUBMITTED;
}
 
if (result == VK_SUCCESS && need_out_fence) {
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index ab6e5e2..3b50c49 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -1713,16 +1713,16 @@ enum anv_fence_type {
ANV_FENCE_TYPE_SYNCOBJ,
 };
 
-enum anv_fence_state {
+enum anv_bo_fence_state {
/** Indicates that this is a new (or newly reset fence) */
-   ANV_FENCE_STATE_RESET,
+   ANV_BO_FENCE_STATE_RESET,
 
/** Indicates that this fence has been submitted to the GPU but is still
 * (as far as we know) in use by the GPU.
 */
-   ANV_FENCE_STATE_SUBMITTED,
+   ANV_BO_FENCE_STATE_SUBMITTED,
 
-   ANV_FENCE_STATE_SIGNALED,
+   ANV_BO_FENCE_STATE_SIGNALED,
 };
 
 struct anv_fence_impl {
@@ -1740,7 +1740,7 @@ struct anv_fence_impl {
*/
   struct {
  struct anv_bo bo;
- enum anv_fence_state state;
+ enum anv_bo_fence_state state;
   } bo;
};
 };
diff --git a/src/intel/vulkan/anv_queue.c b/src/intel/vulkan/anv_queue.c
index a3f4cd8..7348e15 100644
--- a/src/intel/vulkan/anv_queue.c
+++ b/src/intel/vulkan/anv_queue.c
@@ -279,9 +279,9 @@ VkResult anv_CreateFence(
   return result;
 
if (pCreateInfo->flags & VK_FENCE_CREATE_SIGNALED_BIT) {
-  fence->permanent.bo.state = ANV_FENCE_STATE_SIGNALED;
+  fence->permanent.bo.state = ANV_BO_FENCE_STATE_SIGNALED;
} else {
-  fence->permanent.bo.state = ANV_FENCE_STATE_RESET;
+  fence->permanent.bo.state = ANV_BO_FENCE_STATE_RESET;
}
 
*pFence = anv_fence_to_handle(fence);
@@ -336,7 +336,7 @@ VkResult anv_ResetFences(
 
   switch (impl->type) {
   case ANV_FENCE_TYPE_BO:
- impl->bo.state = ANV_FENCE_STATE_RESET;
+ impl->bo.state = ANV_BO_FENCE_STATE_RESET;
  break;
 
   default:
@@ -363,18 +363,18 @@ VkResult anv_GetFenceStatus(
switch (impl->type) {
case ANV_FENCE_TYPE_BO:
   switch (impl->bo.state) {
-  case ANV_FENCE_STATE_RESET:
+  case ANV_BO_FENCE_STATE_RESET:
  /* If it hasn't even been sent off to the GPU yet, it's not ready */
  return VK_NOT_READY;
 
-  case ANV_FENCE_STATE_SIGNALED:
+  case ANV_BO_FENCE_STATE_SIGNALED:
  /* It's been signaled, return success */
  return VK_SUCCESS;
 
-  case ANV_FENCE_STATE_SUBMITTED: {
+  case ANV_BO_FENCE_STATE_SUBMITTED: {
  VkResult result = anv_device_bo_busy(device, &impl->bo.bo);
  if (result == VK_SUCCESS) {
-impl->bo.state = ANV_FENCE_STATE_SIGNALED;
+impl->bo.state = ANV_BO_FENCE_STATE_SIGNALED;
 return VK_SUCCESS;
  } else {
 return result;
@@ -427,7 +427,7 @@ anv_wait_for_bo_fences(struct anv_device *device,
  struct anv_fence_impl *impl = &fence->permanent;
 
  switch (impl->bo.state) {
- case ANV_FENCE_STATE_RESET:
+ case ANV_BO_FENCE_STATE_RESET:
 /* This fence hasn't been submitted yet, we'll catch it the next
  * time around.  Yes, this may mean we dead-loop but, short of
  * lots of locking and a condition variable, there's not much that
@@ -436,7 +436,7 @@ anv_wait_for_bo_fences(struct anv_device *device,
 pending_fences++;
 continue;
 
- case ANV_FENCE_STATE_SIGNALED:
+ case ANV_BO_FENCE_STATE_SIGNALED:
 /* This fence is not pending.  If waitAll isn't set, we can return
  * early.  Otherwise, we have to keep going.
  */
@@ -446,14 +446,14 @@ anv_wait_for_bo_fences(struct anv_device *device,
 }
 continue;
 
- case ANV_FENCE_STATE_SUBMITTED:
+ case ANV_BO_FENCE_STATE_SUBMITTED:
 /* These are the fences we really care about.  Go ahead and wait
  * on it until we hit a timeout.
  */
 result = anv_device_wait(device, &impl->bo.bo, timeout);
 switch (result) {
 case VK_SUCCESS:
-   impl->bo.state = ANV_FENCE_STATE_SIGNALED;
+   impl->bo.state = ANV_BO_FENCE_STATE_SIGNALED;
signaled_fences = true;
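
The three BO fence states renamed by this patch form a small state machine, which the following sketch models on its own. The types are hypothetical simplifications: a gpu_busy flag stands in for the anv_device_bo_busy() query, and a two-value status enum stands in for VkResult.

```c
#include <stdbool.h>

enum bo_fence_state {
   BO_FENCE_STATE_RESET,     /* new or newly reset, not yet submitted */
   BO_FENCE_STATE_SUBMITTED, /* handed to the GPU, may still be in use */
   BO_FENCE_STATE_SIGNALED,  /* known to be complete */
};

enum fence_status { FENCE_SUCCESS, FENCE_NOT_READY };

struct bo_fence {
   enum bo_fence_state state;
   bool gpu_busy; /* stand-in for querying the kernel about the BO */
};

static enum fence_status
bo_fence_get_status(struct bo_fence *fence)
{
   switch (fence->state) {
   case BO_FENCE_STATE_RESET:
      /* Not even sent to the GPU yet, so it cannot be ready. */
      return FENCE_NOT_READY;

   case BO_FENCE_STATE_SIGNALED:
      return FENCE_SUCCESS;

   case BO_FENCE_STATE_SUBMITTED:
      if (fence->gpu_busy)
         return FENCE_NOT_READY;
      /* Cache the result so later queries skip the busy check. */
      fence->state = BO_FENCE_STATE_SIGNALED;
      return FENCE_SUCCESS;
   }

   return FENCE_NOT_READY;
}
```

Only the SUBMITTED state needs to consult the GPU; once the BO is idle, the fence moves to SIGNALED and stays there until it is reset.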

[Mesa-dev] [PATCH v2 08/10] anv: Use DRM sync objects to back fences whenever possible

2017-08-24 Thread Jason Ekstrand
In order to implement VK_KHR_external_fence, we need to back our fences
with something that's shareable.  Since the kernel wait interface for
sync objects already supports waiting for multiple fences in one go, it
makes anv_WaitForFences much simpler if we only have one type of fence.
---
 src/intel/vulkan/anv_batch_chain.c |   8 +++
 src/intel/vulkan/anv_device.c  |   2 +
 src/intel/vulkan/anv_private.h |   4 ++
 src/intel/vulkan/anv_queue.c   | 133 ++---
 4 files changed, 138 insertions(+), 9 deletions(-)

diff --git a/src/intel/vulkan/anv_batch_chain.c 
b/src/intel/vulkan/anv_batch_chain.c
index 0a0be8d..52c4510 100644
--- a/src/intel/vulkan/anv_batch_chain.c
+++ b/src/intel/vulkan/anv_batch_chain.c
@@ -1560,6 +1560,14 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
 return result;
  break;
 
+  case ANV_FENCE_TYPE_SYNCOBJ:
+ result = anv_execbuf_add_syncobj(&execbuf, impl->syncobj,
+  I915_EXEC_FENCE_SIGNAL,
+  &device->alloc);
+ if (result != VK_SUCCESS)
+return result;
+ break;
+
   default:
  unreachable("Invalid fence type");
   }
diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index a6d5215..2e0fa19 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -339,6 +339,8 @@ anv_physical_device_init(struct anv_physical_device *device,
device->has_exec_async = anv_gem_get_param(fd, I915_PARAM_HAS_EXEC_ASYNC);
device->has_exec_fence = anv_gem_get_param(fd, I915_PARAM_HAS_EXEC_FENCE);
device->has_syncobj = anv_gem_get_param(fd, 
I915_PARAM_HAS_EXEC_FENCE_ARRAY);
+   device->has_syncobj_wait = device->has_syncobj &&
+  anv_gem_supports_syncobj_wait(fd);
 
bool swizzled = anv_gem_get_bit6_swizzle(fd, I915_TILING_X);
 
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index 66a85db..4d83767 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -646,6 +646,7 @@ struct anv_physical_device {
 boolhas_exec_async;
 boolhas_exec_fence;
 boolhas_syncobj;
+boolhas_syncobj_wait;
 
 uint32_teu_total;
 uint32_tsubslice_total;
@@ -1748,6 +1749,9 @@ struct anv_fence_impl {
  struct anv_bo bo;
  enum anv_bo_fence_state state;
   } bo;
+
+  /** DRM syncobj handle for syncobj-based fences */
+  uint32_t syncobj;
};
 };
 
diff --git a/src/intel/vulkan/anv_queue.c b/src/intel/vulkan/anv_queue.c
index 7348e15..5e84566 100644
--- a/src/intel/vulkan/anv_queue.c
+++ b/src/intel/vulkan/anv_queue.c
@@ -271,17 +271,28 @@ VkResult anv_CreateFence(
if (fence == NULL)
   return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
 
-   fence->permanent.type = ANV_FENCE_TYPE_BO;
+   if (device->instance->physicalDevice.has_syncobj_wait) {
+  fence->permanent.type = ANV_FENCE_TYPE_SYNCOBJ;
 
-   VkResult result = anv_bo_pool_alloc(&device->batch_bo_pool,
-   &fence->permanent.bo.bo, 4096);
-   if (result != VK_SUCCESS)
-  return result;
+  fence->permanent.syncobj = anv_gem_syncobj_create(device);
+  if (!fence->permanent.syncobj)
+ return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
 
-   if (pCreateInfo->flags & VK_FENCE_CREATE_SIGNALED_BIT) {
-  fence->permanent.bo.state = ANV_BO_FENCE_STATE_SIGNALED;
+  if (pCreateInfo->flags & VK_FENCE_CREATE_SIGNALED_BIT)
+ anv_gem_syncobj_signal(device, fence->permanent.syncobj);
} else {
-  fence->permanent.bo.state = ANV_BO_FENCE_STATE_RESET;
+  fence->permanent.type = ANV_FENCE_TYPE_BO;
+
+  VkResult result = anv_bo_pool_alloc(&device->batch_bo_pool,
+  &fence->permanent.bo.bo, 4096);
+  if (result != VK_SUCCESS)
+ return result;
+
+  if (pCreateInfo->flags & VK_FENCE_CREATE_SIGNALED_BIT) {
+ fence->permanent.bo.state = ANV_BO_FENCE_STATE_SIGNALED;
+  } else {
+ fence->permanent.bo.state = ANV_BO_FENCE_STATE_RESET;
+  }
}
 
*pFence = anv_fence_to_handle(fence);
@@ -301,6 +312,10 @@ anv_fence_impl_cleanup(struct anv_device *device,
case ANV_FENCE_TYPE_BO:
   anv_bo_pool_free(&device->batch_bo_pool, &impl->bo.bo);
   return;
+
+   case ANV_FENCE_TYPE_SYNCOBJ:
+  anv_gem_syncobj_destroy(device, impl->syncobj);
+  return;
}
 
unreachable("Invalid fence type");
@@ -328,6 +343,8 @@ VkResult anv_ResetFences(
 uint32_tfenceCount,
 const VkFence*  pFences)
 {
+   ANV_FROM_HANDLE(anv_device, device, _device);
+
for (uint32_t i = 0; i < fenceCount; i++) {
  

Re: [Mesa-dev] [PATCH] glsl: fix glsl_struct_field size calculations for shader cache

2017-08-24 Thread Timothy Arceri

Whoops.

Reviewed-by: Timothy Arceri 

Thanks!

On 24/08/17 23:42, Nicolai Hähnle wrote:

From: Nicolai Hähnle 

Found by address sanitizer:

==22621==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x6140cbd8 at pc 0x7f561610a4ff bp 0x7ffca85f9d50 sp 0x7ffca85f94f8
READ of size 344 at 0x6140cbd8 thread T0
 #0 0x7f561610a4fe  (/usr/lib/x86_64-linux-gnu/libasan.so.3+0x5f4fe)
 #1 0x7f560bb305a5 in memcpy /usr/include/x86_64-linux-gnu/bits/string3.h:53
 #2 0x7f560bb305a5 in blob_write_bytes ../../../mesa-src/src/compiler/glsl/blob.c:136
 #3 0x7f560be7d7ff in encode_type_to_blob ../../../mesa-src/src/compiler/glsl/shader_cache.cpp:153
 #4 0x7f560be81222 in write_program_resource_data ../../../mesa-src/src/compiler/glsl/shader_cache.cpp:950
 #5 0x7f560be81222 in write_program_resource_list ../../../mesa-src/src/compiler/glsl/shader_cache.cpp:1118
 #6 0x7f560be81222 in shader_cache_write_program_metadata(gl_context*, gl_shader_program*) ../../../mesa-src/src/compiler/glsl/shader_cache.cpp:1407
 #7 0x7f560b825fdb in link_program ../../../mesa-src/src/mesa/main/shaderapi.c:1163

Fixes: 073a84ff60db ("glsl: stop adding pointers from glsl_struct_field to the cache")
---
  src/compiler/glsl/shader_cache.cpp | 11 ---
  1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/src/compiler/glsl/shader_cache.cpp b/src/compiler/glsl/shader_cache.cpp
index aa6c067d041..8eb7a5cb792 100644
--- a/src/compiler/glsl/shader_cache.cpp
+++ b/src/compiler/glsl/shader_cache.cpp
@@ -69,24 +69,23 @@ extern "C" {
  
  static void

  compile_shaders(struct gl_context *ctx, struct gl_shader_program *prog) {
 for (unsigned i = 0; i < prog->NumShaders; i++) {
_mesa_glsl_compile_shader(ctx, prog->Shaders[i], false, false, true);
 }
  }
  
  static void

  get_struct_type_field_and_pointer_sizes(size_t *s_field_size,
-size_t *s_field_ptrs,
-unsigned num_fields)
+size_t *s_field_ptrs)
  {
-   *s_field_size = sizeof(glsl_struct_field) * num_fields;
+   *s_field_size = sizeof(glsl_struct_field);
 *s_field_ptrs =
   sizeof(((glsl_struct_field *)0)->type) +
   sizeof(((glsl_struct_field *)0)->name);
  }
  
  static void

  encode_type_to_blob(struct blob *blob, const glsl_type *type)
  {
 uint32_t encoding;
  
@@ -133,22 +132,21 @@ encode_type_to_blob(struct blob *blob, const glsl_type *type)

blob_write_uint32(blob, type->length);
encode_type_to_blob(blob, type->fields.array);
return;
 case GLSL_TYPE_STRUCT:
 case GLSL_TYPE_INTERFACE:
blob_write_uint32(blob, (type->base_type) << 24);
blob_write_string(blob, type->name);
blob_write_uint32(blob, type->length);
  
size_t s_field_size, s_field_ptrs;

-  get_struct_type_field_and_pointer_sizes(&s_field_size, &s_field_ptrs,
-  type->length);
+  get_struct_type_field_and_pointer_sizes(&s_field_size, &s_field_ptrs);
  
for (unsigned i = 0; i < type->length; i++) {

   encode_type_to_blob(blob, type->fields.structure[i].type);
   blob_write_string(blob, type->fields.structure[i].name);
  
   /* Write the struct field skipping the pointers */

   blob_write_bytes(blob,
((char *)&type->fields.structure[i]) + s_field_ptrs,
s_field_size - s_field_ptrs);
}
@@ -206,22 +204,21 @@ decode_type_from_blob(struct blob_reader *blob)
unsigned length = blob_read_uint32(blob);
return glsl_type::get_array_instance(decode_type_from_blob(blob),
 length);
 }
 case GLSL_TYPE_STRUCT:
 case GLSL_TYPE_INTERFACE: {
char *name = blob_read_string(blob);
unsigned num_fields = blob_read_uint32(blob);
  
size_t s_field_size, s_field_ptrs;

-  get_struct_type_field_and_pointer_sizes(&s_field_size, &s_field_ptrs,
-  num_fields);
+  get_struct_type_field_and_pointer_sizes(&s_field_size, &s_field_ptrs);
  
glsl_struct_field *fields =

   (glsl_struct_field *) malloc(s_field_size * num_fields);
for (unsigned i = 0; i < num_fields; i++) {
   fields[i].type = decode_type_from_blob(blob);
   fields[i].name = blob_read_string(blob);
  
   blob_copy_bytes(blob, ((uint8_t *) &fields[i]) + s_field_ptrs,
   s_field_size - s_field_ptrs);
}


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
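
[Editor's sketch] The overflow in the trace above appears to come from the helper returning sizeof(glsl_struct_field) * num_fields while the caller used that value as the size of a single field. Below is a minimal, self-contained sketch of the corrected per-field computation. The struct and function names (sketch_field, sketch_field_sizes, sketch_write_field_tails) are hypothetical stand-ins and not the Mesa code; the sketch also assumes the two pointer members come first in the struct with no padding after them.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for glsl_struct_field: two leading pointers
 * followed by plain data that can be copied byte for byte. */
struct sketch_field {
   const void *type;
   const char *name;
   int location;
   int offset;
};

/* Size of ONE field and of its leading pointer members. */
static void
sketch_field_sizes(size_t *field_size, size_t *ptr_size)
{
   *field_size = sizeof(struct sketch_field);
   *ptr_size = sizeof(((struct sketch_field *)0)->type) +
               sizeof(((struct sketch_field *)0)->name);
}

/* Copy the non-pointer tail of every field into out.
 * Returns the number of bytes written. */
static size_t
sketch_write_field_tails(unsigned char *out,
                         const struct sketch_field *fields,
                         unsigned num_fields)
{
   size_t field_size, ptr_size, written = 0;
   sketch_field_sizes(&field_size, &ptr_size);
   assert(field_size > ptr_size);

   for (unsigned i = 0; i < num_fields; i++) {
      /* Per-field tail size; multiplying by num_fields here would read
       * past the end of the array on every iteration but the first. */
      const size_t tail = field_size - ptr_size;
      memcpy(out + written, (const char *)&fields[i] + ptr_size, tail);
      written += tail;
   }
   return written;
}
```

The array allocation on the decode side is the only place where the per-field size is multiplied by the field count.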


[Mesa-dev] [PATCH v2 09/10] anv: Implement VK_KHR_external_fence

2017-08-24 Thread Jason Ekstrand
---
 src/intel/vulkan/anv_batch_chain.c |  19 -
 src/intel/vulkan/anv_extensions.py |   5 ++
 src/intel/vulkan/anv_queue.c   | 142 -
 3 files changed, 161 insertions(+), 5 deletions(-)

diff --git a/src/intel/vulkan/anv_batch_chain.c b/src/intel/vulkan/anv_batch_chain.c
index 52c4510..4f5137c 100644
--- a/src/intel/vulkan/anv_batch_chain.c
+++ b/src/intel/vulkan/anv_batch_chain.c
@@ -1549,8 +1549,20 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
}
 
if (fence) {
-  assert(fence->temporary.type == ANV_FENCE_TYPE_NONE);
-  struct anv_fence_impl *impl = &fence->permanent;
+  /* Under most circumstances, out fences won't be temporary.  However,
+   * the spec does allow it for opaque_fd.  From the Vulkan 1.0.53 spec:
+   *
+   *"If the import is temporary, the implementation must restore the
+   *semaphore to its prior permanent state after submitting the next
+   *semaphore wait operation."
+   *
+   * The spec says nothing whatsoever about signal operations on
+   * temporarily imported semaphores so it appears they are allowed.
+   * There are also CTS tests that require this to work.
+   */
+  struct anv_fence_impl *impl =
+ fence->temporary.type != ANV_FENCE_TYPE_NONE ?
+ &fence->temporary : &fence->permanent;
 
   switch (impl->type) {
   case ANV_FENCE_TYPE_BO:
@@ -1617,6 +1629,9 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
}
 
if (fence && fence->permanent.type == ANV_FENCE_TYPE_BO) {
+  /* BO fences can't be shared, so they can't be temporary. */
+  assert(fence->temporary.type == ANV_FENCE_TYPE_NONE);
+
   /* Once the execbuf has returned, we need to set the fence state to
* SUBMITTED.  We can't do this before calling execbuf because
* anv_GetFenceStatus does take the global device lock before checking
diff --git a/src/intel/vulkan/anv_extensions.py b/src/intel/vulkan/anv_extensions.py
index 3252e0f..6b3d72e 100644
--- a/src/intel/vulkan/anv_extensions.py
+++ b/src/intel/vulkan/anv_extensions.py
@@ -47,6 +47,11 @@ class Extension:
 EXTENSIONS = [
 Extension('VK_KHR_dedicated_allocation',  1, True),
 Extension('VK_KHR_descriptor_update_template',1, True),
+Extension('VK_KHR_external_fence',1,
+  'device->has_syncobj_wait'),
+Extension('VK_KHR_external_fence_capabilities',   1, True),
+Extension('VK_KHR_external_fence_fd', 1,
+  'device->has_syncobj_wait'),
 Extension('VK_KHR_external_memory',   1, True),
 Extension('VK_KHR_external_memory_capabilities',  1, True),
 Extension('VK_KHR_external_memory_fd',1, True),
diff --git a/src/intel/vulkan/anv_queue.c b/src/intel/vulkan/anv_queue.c
index 5e84566..e17e731 100644
--- a/src/intel/vulkan/anv_queue.c
+++ b/src/intel/vulkan/anv_queue.c
@@ -348,7 +348,18 @@ VkResult anv_ResetFences(
for (uint32_t i = 0; i < fenceCount; i++) {
   ANV_FROM_HANDLE(anv_fence, fence, pFences[i]);
 
-  assert(fence->temporary.type == ANV_FENCE_TYPE_NONE);
+  /* From the Vulkan 1.0.53 spec:
+   *
+   *"If any member of pFences currently has its payload imported with
+   *temporary permanence, that fence’s prior permanent payload is
+   *first restored. The remaining operations described therefore
+   *operate on the restored payload."
+   */
+  if (fence->temporary.type != ANV_FENCE_TYPE_NONE) {
+ anv_fence_impl_cleanup(device, &fence->temporary);
+ fence->temporary.type = ANV_FENCE_TYPE_NONE;
+  }
+
   struct anv_fence_impl *impl = &fence->permanent;
 
   switch (impl->type) {
@@ -378,11 +389,14 @@ VkResult anv_GetFenceStatus(
if (unlikely(device->lost))
   return VK_ERROR_DEVICE_LOST;
 
-   assert(fence->temporary.type == ANV_FENCE_TYPE_NONE);
-   struct anv_fence_impl *impl = &fence->permanent;
+   struct anv_fence_impl *impl =
+  fence->temporary.type != ANV_FENCE_TYPE_NONE ?
+  &fence->temporary : &fence->permanent;
 
switch (impl->type) {
case ANV_FENCE_TYPE_BO:
+  /* BO fences don't support import/export */
+  assert(fence->temporary.type == ANV_FENCE_TYPE_NONE);
   switch (impl->bo.state) {
   case ANV_BO_FENCE_STATE_RESET:
  /* If it hasn't even been sent off to the GPU yet, it's not ready */
@@ -664,6 +678,128 @@ VkResult anv_WaitForFences(
}
 }
 
+void anv_GetPhysicalDeviceExternalFencePropertiesKHR(
+VkPhysicalDevicephysicalDevice,
+const VkPhysicalDeviceExternalFenceInfoKHR* pExternalFenceInfo,
+VkExternalFencePropertiesKHR*   pExternalFenceProperties)
+{
+   ANV_FROM_HANDLE(anv_physical_device, device, physicalDevice);
+
+   switch (pExternalFenceInfo->handleType) {
+   case VK_EXTERNAL_FENCE_HANDLE_TYPE_OPAQUE_FD_BIT_KHR:
+  if (device->has_syncobj_wait) {
+ 
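
[Editor's sketch] The hunks above repeatedly pick the temporary payload when one is present and fall back to the permanent one otherwise, and ResetFences drops the temporary payload first. The following is a minimal sketch of that selection in isolation. All sketch_ names are hypothetical stand-ins for anv_fence and anv_fence_impl; the real reset path also releases the temporary payload through anv_fence_impl_cleanup, which is only noted in a comment here.

```c
#include <stdint.h>

/* Hypothetical stand-ins for anv_fence_type / anv_fence_impl / anv_fence. */
enum sketch_fence_type {
   SKETCH_FENCE_TYPE_NONE = 0,
   SKETCH_FENCE_TYPE_BO,
   SKETCH_FENCE_TYPE_SYNCOBJ,
};

struct sketch_fence_impl {
   enum sketch_fence_type type;
   uint32_t syncobj;
};

struct sketch_fence {
   struct sketch_fence_impl permanent;
   struct sketch_fence_impl temporary;
};

/* A temporary (imported) payload, when present, shadows the permanent one. */
static struct sketch_fence_impl *
sketch_fence_active_impl(struct sketch_fence *fence)
{
   return fence->temporary.type != SKETCH_FENCE_TYPE_NONE ?
          &fence->temporary : &fence->permanent;
}

/* Reset drops the temporary payload so the permanent one is used again.
 * A real driver would also release the underlying kernel object here. */
static void
sketch_fence_drop_temporary(struct sketch_fence *fence)
{
   fence->temporary.type = SKETCH_FENCE_TYPE_NONE;
   fence->temporary.syncobj = 0;
}
```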

[Mesa-dev] [PATCH v2 07/10] anv/gem: Add support for syncobj wait and reset

2017-08-24 Thread Jason Ekstrand
---
 src/intel/vulkan/anv_gem.c   | 71 
 src/intel/vulkan/anv_gem_stubs.c | 26 +++
 src/intel/vulkan/anv_private.h   |  6 
 3 files changed, 103 insertions(+)

diff --git a/src/intel/vulkan/anv_gem.c b/src/intel/vulkan/anv_gem.c
index 57a8b79..ed94a7e 100644
--- a/src/intel/vulkan/anv_gem.c
+++ b/src/intel/vulkan/anv_gem.c
@@ -488,3 +488,74 @@ anv_gem_syncobj_fd_to_handle(struct anv_device *device, int fd)
 
return args.handle;
 }
+
+void
+anv_gem_syncobj_reset(struct anv_device *device, uint32_t handle)
+{
+   struct drm_syncobj_reset args = {
+  .handle = handle,
+   };
+
+   anv_ioctl(device->fd, DRM_IOCTL_SYNCOBJ_RESET, &args);
+}
+
+void
+anv_gem_syncobj_signal(struct anv_device *device, uint32_t handle)
+{
+   struct drm_syncobj_signal args = {
+  .handle = handle,
+   };
+
+   anv_ioctl(device->fd, DRM_IOCTL_SYNCOBJ_SIGNAL, &args);
+}
+
+bool
+anv_gem_supports_syncobj_wait(int fd)
+{
+   int ret;
+
+   struct drm_syncobj_create create = {
+  .flags = 0,
+   };
+   ret = anv_ioctl(fd, DRM_IOCTL_SYNCOBJ_CREATE, &create);
+   if (ret)
+  return false;
+
+   uint32_t syncobj = create.handle;
+
+   struct drm_syncobj_wait wait = {
+  .handles = (uint64_t)(uintptr_t)&syncobj,
+  .count_handles = 1,
+  .timeout_nsec = 0,
+  .flags = DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT,
+   };
+   ret = anv_ioctl(fd, DRM_IOCTL_SYNCOBJ_WAIT, &wait);
+
+   struct drm_syncobj_destroy destroy = {
+  .handle = syncobj,
+   };
+   anv_ioctl(fd, DRM_IOCTL_SYNCOBJ_DESTROY, &destroy);
+
+   /* If it timed out, then we have the ioctl and it supports the
+* DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT flag.
+*/
+   return ret == -1 && errno == ETIME;
+}
+
+int
+anv_gem_syncobj_wait(struct anv_device *device,
+ uint32_t *handles, uint32_t num_handles,
+ int64_t abs_timeout_ns, bool wait_all)
+{
+   struct drm_syncobj_wait args = {
+  .handles = (uint64_t)(uintptr_t)handles,
+  .count_handles = num_handles,
+  .timeout_nsec = abs_timeout_ns,
+  .flags = DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT,
+   };
+
+   if (wait_all)
+  args.flags |= DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL;
+
+   return anv_ioctl(device->fd, DRM_IOCTL_SYNCOBJ_WAIT, &args);
+}
diff --git a/src/intel/vulkan/anv_gem_stubs.c b/src/intel/vulkan/anv_gem_stubs.c
index c9f05ee..17f84f2 100644
--- a/src/intel/vulkan/anv_gem_stubs.c
+++ b/src/intel/vulkan/anv_gem_stubs.c
@@ -210,3 +210,29 @@ anv_gem_syncobj_fd_to_handle(struct anv_device *device, int fd)
 {
unreachable("Unused");
 }
+
+void
+anv_gem_syncobj_reset(struct anv_device *device, uint32_t handle)
+{
+   unreachable("Unused");
+}
+
+void
+anv_gem_syncobj_signal(struct anv_device *device, uint32_t handle)
+{
+   unreachable("Unused");
+}
+
+bool
+anv_gem_supports_syncobj_wait(int fd)
+{
+   return false;
+}
+
+int
+anv_gem_syncobj_wait(struct anv_device *device,
+ uint32_t *handles, uint32_t num_handles,
+ int64_t abs_timeout_ns, bool wait_all)
+{
+   unreachable("Unused");
+}
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index 3b50c49..66a85db 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -809,6 +809,12 @@ uint32_t anv_gem_syncobj_create(struct anv_device *device);
 void anv_gem_syncobj_destroy(struct anv_device *device, uint32_t handle);
 int anv_gem_syncobj_handle_to_fd(struct anv_device *device, uint32_t handle);
 uint32_t anv_gem_syncobj_fd_to_handle(struct anv_device *device, int fd);
+void anv_gem_syncobj_reset(struct anv_device *device, uint32_t handle);
+void anv_gem_syncobj_signal(struct anv_device *device, uint32_t handle);
+bool anv_gem_supports_syncobj_wait(int fd);
+int anv_gem_syncobj_wait(struct anv_device *device,
+ uint32_t *handles, uint32_t num_handles,
+ int64_t abs_timeout_ns, bool wait_all);
 
 VkResult anv_bo_init_new(struct anv_bo *bo, struct anv_device *device, uint64_t size);
 
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
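
[Editor's sketch] anv_gem_supports_syncobj_wait() in the patch above detects kernel support by waiting with a zero timeout on a freshly created sync object and treating only a timeout as proof that both the ioctl and the WAIT_FOR_SUBMIT flag exist. The sketch below isolates that classification step so it can be exercised without a DRM device. The function name and the idea of passing ret and errno as parameters are assumptions made for illustration.

```c
#include <errno.h>
#include <stdbool.h>

/* Classify the result of a zero-timeout wait on a new, unsignaled sync
 * object.  ret is the ioctl return value and err the errno observed right
 * after it. */
static bool
sketch_probe_says_wait_supported(int ret, int err)
{
   /* Only a timeout shows that the kernel accepted the request and then
    * actually waited.  Any other error is treated as "not supported",
    * and so is an unexpected success. */
   return ret == -1 && err == ETIME;
}
```

Treating an unexpected success as unsupported matches the conservative return statement in the patch.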


[Mesa-dev] [PATCH v2 06/10] drm-uapi/drm: Add DRM_IOCTL_SYNCOBJ_WAIT and RESET

2017-08-24 Thread Jason Ekstrand
---
 include/drm-uapi/drm.h | 25 +
 1 file changed, 25 insertions(+)

diff --git a/include/drm-uapi/drm.h b/include/drm-uapi/drm.h
index bf3674a..f93f80a 100644
--- a/include/drm-uapi/drm.h
+++ b/include/drm-uapi/drm.h
@@ -712,6 +712,28 @@ struct drm_syncobj_handle {
__u32 pad;
 };
 
+#define DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL (1 << 0)
+#define DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT (1 << 1)
+struct drm_syncobj_wait {
+   __u64 handles;
+   /* absolute timeout */
+   __s64 timeout_nsec;
+   __u32 count_handles;
+   __u32 flags;
+   __u32 first_signaled; /* only valid when not waiting all */
+   __u32 pad;
+};
+
+struct drm_syncobj_reset {
+   __u32 handle;
+   __u32 flags;
+};
+
+struct drm_syncobj_signal {
+   __u32 handle;
+   __u32 flags;
+};
+
 #if defined(__cplusplus)
 }
 #endif
@@ -834,6 +856,9 @@ extern "C" {
 #define DRM_IOCTL_SYNCOBJ_DESTROY  DRM_IOWR(0xC0, struct drm_syncobj_destroy)
 #define DRM_IOCTL_SYNCOBJ_HANDLE_TO_FD DRM_IOWR(0xC1, struct drm_syncobj_handle)
 #define DRM_IOCTL_SYNCOBJ_FD_TO_HANDLE DRM_IOWR(0xC2, struct drm_syncobj_handle)
+#define DRM_IOCTL_SYNCOBJ_WAIT DRM_IOWR(0xC3, struct drm_syncobj_wait)
+#define DRM_IOCTL_SYNCOBJ_RESET DRM_IOWR(0xC4, struct drm_syncobj_reset)
+#define DRM_IOCTL_SYNCOBJ_SIGNAL DRM_IOWR(0xC5, struct drm_syncobj_signal)
 
 /**
  * Device specific ioctls should only be in their respective headers
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2 05/10] vulkan/util: Add a vk_zalloc helper

2017-08-24 Thread Jason Ekstrand
---
 src/vulkan/util/vk_alloc.h | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/src/vulkan/util/vk_alloc.h b/src/vulkan/util/vk_alloc.h
index 2915021..f58a806 100644
--- a/src/vulkan/util/vk_alloc.h
+++ b/src/vulkan/util/vk_alloc.h
@@ -37,6 +37,20 @@ vk_alloc(const VkAllocationCallbacks *alloc,
 }
 
 static inline void *
+vk_zalloc(const VkAllocationCallbacks *alloc,
+  size_t size, size_t align,
+  VkSystemAllocationScope scope)
+{
+   void *mem = vk_alloc(alloc, size, align, scope);
+   if (mem == NULL)
+  return NULL;
+
+   memset(mem, 0, size);
+
+   return mem;
+}
+
+static inline void *
 vk_realloc(const VkAllocationCallbacks *alloc,
void *ptr, size_t size, size_t align,
VkSystemAllocationScope scope)
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
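
[Editor's sketch] The vk_zalloc helper above is a thin "allocate, then clear" wrapper over vk_alloc. A self-contained sketch of the same pattern follows. The callback struct is a simplified, hypothetical stand-in for VkAllocationCallbacks, and the malloc-backed allocator ignores the alignment argument, which is only acceptable for a sketch.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for VkAllocationCallbacks. */
struct sketch_alloc_callbacks {
   void *user_data;
   void *(*alloc)(void *user_data, size_t size, size_t align);
};

/* malloc-backed allocator; alignment is ignored in this sketch. */
static void *
sketch_default_alloc(void *user_data, size_t size, size_t align)
{
   (void)user_data;
   (void)align;
   return malloc(size);
}

static void *
sketch_alloc(const struct sketch_alloc_callbacks *cb,
             size_t size, size_t align)
{
   return cb->alloc(cb->user_data, size, align);
}

/* Allocate through the callbacks and zero the result, propagating NULL. */
static void *
sketch_zalloc(const struct sketch_alloc_callbacks *cb,
              size_t size, size_t align)
{
   void *mem = sketch_alloc(cb, size, align);
   if (mem == NULL)
      return NULL;

   memset(mem, 0, size);
   return mem;
}
```

Zeroing at allocation time is what lets a later patch in the series rely on fence->temporary.type starting out as the zero-valued NONE type.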


[Mesa-dev] [PATCH v2 01/10] anv: Rework fences to work more like BO semaphores

2017-08-24 Thread Jason Ekstrand
This commit changes fences to work a bit more like BO semaphores.
Instead of the fence being a batch, it's simply a BO that gets added
to the validation list for the last execbuf call in the QueueSubmit
operation.  It's a bit annoying finding the last submit in the execbuf
but this allows us to avoid the dummy execbuf.
---
 src/intel/vulkan/anv_batch_chain.c | 26 ++-
 src/intel/vulkan/anv_private.h |  5 +--
 src/intel/vulkan/anv_queue.c   | 88 +++---
 3 files changed, 51 insertions(+), 68 deletions(-)

diff --git a/src/intel/vulkan/anv_batch_chain.c b/src/intel/vulkan/anv_batch_chain.c
index 1e7455f..ef6ada4 100644
--- a/src/intel/vulkan/anv_batch_chain.c
+++ b/src/intel/vulkan/anv_batch_chain.c
@@ -1451,8 +1451,11 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
const VkSemaphore *in_semaphores,
uint32_t num_in_semaphores,
const VkSemaphore *out_semaphores,
-   uint32_t num_out_semaphores)
+   uint32_t num_out_semaphores,
+   VkFence _fence)
 {
+   ANV_FROM_HANDLE(anv_fence, fence, _fence);
+
struct anv_execbuf execbuf;
anv_execbuf_init(&execbuf);
 
@@ -1545,6 +1548,13 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
   }
}
 
+   if (fence) {
+  result = anv_execbuf_add_bo(&execbuf, &fence->bo, NULL,
+  EXEC_OBJECT_WRITE, &device->alloc);
+  if (result != VK_SUCCESS)
+ return result;
+   }
+
if (cmd_buffer)
   result = setup_execbuf_for_cmd_buffer(&execbuf, cmd_buffer);
else
@@ -1588,6 +1598,20 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
   anv_semaphore_reset_temporary(device, semaphore);
}
 
+   if (fence) {
+  /* Once the execbuf has returned, we need to set the fence state to
+   * SUBMITTED.  We can't do this before calling execbuf because
+   * anv_GetFenceStatus does take the global device lock before checking
+   * fence->state.
+   *
+   * We set the fence state to SUBMITTED regardless of whether or not the
+   * execbuf succeeds because we need to ensure that vkWaitForFences() and
+   * vkGetFenceStatus() return a valid result (VK_ERROR_DEVICE_LOST or
+   * VK_SUCCESS) in a finite amount of time even if execbuf fails.
+   */
+  fence->state = ANV_FENCE_STATE_SUBMITTED;
+   }
+
if (result == VK_SUCCESS && need_out_fence) {
   int out_fence = execbuf.execbuf.rsvd2 >> 32;
   for (uint32_t i = 0; i < num_out_semaphores; i++) {
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index 6b24144..715e0ad 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -1642,7 +1642,8 @@ VkResult anv_cmd_buffer_execbuf(struct anv_device *device,
 const VkSemaphore *in_semaphores,
 uint32_t num_in_semaphores,
 const VkSemaphore *out_semaphores,
-uint32_t num_out_semaphores);
+uint32_t num_out_semaphores,
+VkFence fence);
 
 VkResult anv_cmd_buffer_reset(struct anv_cmd_buffer *cmd_buffer);
 
@@ -1720,8 +1721,6 @@ enum anv_fence_state {
 
 struct anv_fence {
struct anv_bo bo;
-   struct drm_i915_gem_execbuffer2 execbuf;
-   struct drm_i915_gem_exec_object2 exec2_objects[1];
enum anv_fence_state state;
 };
 
diff --git a/src/intel/vulkan/anv_queue.c b/src/intel/vulkan/anv_queue.c
index 0a40ebc..04d6972 100644
--- a/src/intel/vulkan/anv_queue.c
+++ b/src/intel/vulkan/anv_queue.c
@@ -114,10 +114,9 @@ VkResult anv_QueueSubmit(
 VkQueue _queue,
 uint32_tsubmitCount,
 const VkSubmitInfo* pSubmits,
-VkFence _fence)
+VkFence fence)
 {
ANV_FROM_HANDLE(anv_queue, queue, _queue);
-   ANV_FROM_HANDLE(anv_fence, fence, _fence);
struct anv_device *device = queue->device;
 
/* Query for device status prior to submitting.  Technically, we don't need
@@ -158,7 +157,20 @@ VkResult anv_QueueSubmit(
 */
pthread_mutex_lock(&device->mutex);
 
+   if (fence && submitCount == 0) {
+  /* If we don't have any command buffers, we need to submit a dummy
+   * batch to give GEM something to wait on.  We could, potentially,
+   * come up with something more efficient but this shouldn't be a
+   * common case.
+   */
+  result = anv_cmd_buffer_execbuf(device, NULL, NULL, 0, NULL, 0, fence);
+  goto out;
+   }
+
for (uint32_t i = 0; i < submitCount; i++) {
+  /* Fence for this submit.  NULL for all but the last one */
+  VkFence submit_fence = (i == submitCount - 1) ? fence : NULL;
+
   if (pSubmits[i].commandBufferCount == 0) {
  /* If we don't have any 

[Mesa-dev] [PATCH v2 03/10] anv: Pull the guts of anv_fence into anv_fence_impl

2017-08-24 Thread Jason Ekstrand
This is just a refactor, similar to what we did for semaphores, in
preparation for handling VK_KHR_external_fence.
---
 src/intel/vulkan/anv_batch_chain.c |  22 --
 src/intel/vulkan/anv_private.h |  42 ++-
 src/intel/vulkan/anv_queue.c   | 144 ++---
 3 files changed, 159 insertions(+), 49 deletions(-)

diff --git a/src/intel/vulkan/anv_batch_chain.c b/src/intel/vulkan/anv_batch_chain.c
index ef6ada4..775009c 100644
--- a/src/intel/vulkan/anv_batch_chain.c
+++ b/src/intel/vulkan/anv_batch_chain.c
@@ -1549,10 +1549,20 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
}
 
if (fence) {
-  result = anv_execbuf_add_bo(&execbuf, &fence->bo, NULL,
-  EXEC_OBJECT_WRITE, &device->alloc);
-  if (result != VK_SUCCESS)
- return result;
+  assert(fence->temporary.type == ANV_FENCE_TYPE_NONE);
+  struct anv_fence_impl *impl = &fence->permanent;
+
+  switch (impl->type) {
+  case ANV_FENCE_TYPE_BO:
+ result = anv_execbuf_add_bo(&execbuf, &impl->bo.bo, NULL,
+ EXEC_OBJECT_WRITE, &device->alloc);
+ if (result != VK_SUCCESS)
+return result;
+ break;
+
+  default:
+ unreachable("Invalid fence type");
+  }
}
 
if (cmd_buffer)
@@ -1598,7 +1608,7 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
   anv_semaphore_reset_temporary(device, semaphore);
}
 
-   if (fence) {
+   if (fence && fence->permanent.type == ANV_FENCE_TYPE_BO) {
   /* Once the execbuf has returned, we need to set the fence state to
* SUBMITTED.  We can't do this before calling execbuf because
* anv_GetFenceStatus does take the global device lock before checking
@@ -1609,7 +1619,7 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
* vkGetFenceStatus() return a valid result (VK_ERROR_DEVICE_LOST or
* VK_SUCCESS) in a finite amount of time even if execbuf fails.
*/
-  fence->state = ANV_FENCE_STATE_SUBMITTED;
+  fence->permanent.bo.state = ANV_FENCE_STATE_SUBMITTED;
}
 
if (result == VK_SUCCESS && need_out_fence) {
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index 715e0ad..ab6e5e2 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -1707,6 +1707,12 @@ anv_cmd_buffer_alloc_blorp_binding_table(struct anv_cmd_buffer *cmd_buffer,
 
 void anv_cmd_buffer_dump(struct anv_cmd_buffer *cmd_buffer);
 
+enum anv_fence_type {
+   ANV_FENCE_TYPE_NONE = 0,
+   ANV_FENCE_TYPE_BO,
+   ANV_FENCE_TYPE_SYNCOBJ,
+};
+
 enum anv_fence_state {
/** Indicates that this is a new (or newly reset fence) */
ANV_FENCE_STATE_RESET,
@@ -1719,9 +1725,41 @@ enum anv_fence_state {
ANV_FENCE_STATE_SIGNALED,
 };
 
+struct anv_fence_impl {
+   enum anv_fence_type type;
+
+   union {
+  /** Fence implementation for BO fences
+   *
+   * These fences use a BO and a set of CPU-tracked state flags.  The BO
+   * is added to the object list of the last execbuf call in a QueueSubmit
+   * and is marked EXEC_OBJECT_WRITE.  The state flags track when the BO has been
+   * submitted to the kernel.  We need to do this because Vulkan lets you
+   * wait on a fence that has not yet been submitted and I915_GEM_BUSY
+   * will say it's idle in this case.
+   */
+  struct {
+ struct anv_bo bo;
+ enum anv_fence_state state;
+  } bo;
+   };
+};
+
 struct anv_fence {
-   struct anv_bo bo;
-   enum anv_fence_state state;
+   /* Permanent fence state.  Every fence has some form of permanent state
+* (type != ANV_FENCE_TYPE_NONE).  This may be a BO to fence on (for
+* cross-process fences) or it could just be a dummy for use internally.
+*/
+   struct anv_fence_impl permanent;
+
+   /* Temporary fence state.  A fence *may* have temporary state.  That state
+* is added to the fence by an import operation and is reset back to
+* ANV_FENCE_TYPE_NONE when the fence is reset.  A fence with temporary
+* state cannot be signaled because the fence must already be signaled
+* before the temporary state can be exported from the fence in the other
+* process and imported here.
+*/
+   struct anv_fence_impl temporary;
 };
 
 struct anv_event {
diff --git a/src/intel/vulkan/anv_queue.c b/src/intel/vulkan/anv_queue.c
index 04d6972..a3f4cd8 100644
--- a/src/intel/vulkan/anv_queue.c
+++ b/src/intel/vulkan/anv_queue.c
@@ -262,23 +262,26 @@ VkResult anv_CreateFence(
 VkFence*pFence)
 {
ANV_FROM_HANDLE(anv_device, device, _device);
-   struct anv_bo fence_bo;
struct anv_fence *fence;
 
assert(pCreateInfo->sType == VK_STRUCTURE_TYPE_FENCE_CREATE_INFO);
 
-   VkResult result = anv_bo_pool_alloc(&device->batch_bo_pool, &fence_bo, 4096);
+   fence = vk_zalloc2(&device->alloc, pAllocator, sizeof(*fence), 8,
+  VK_SYSTEM_ALLOCATION_SCOPE_OBJECT);
+   if (fence == NULL)
+  return 

[Mesa-dev] [PATCH v2 02/10] anv/wsi: Use QueueSubmit to trigger the fence in AcquireNextImage

2017-08-24 Thread Jason Ekstrand
---
 src/intel/vulkan/anv_wsi.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/src/intel/vulkan/anv_wsi.c b/src/intel/vulkan/anv_wsi.c
index 9369f26..00edb22 100644
--- a/src/intel/vulkan/anv_wsi.c
+++ b/src/intel/vulkan/anv_wsi.c
@@ -364,22 +364,25 @@ VkResult anv_GetSwapchainImagesKHR(
 }
 
 VkResult anv_AcquireNextImageKHR(
-VkDevice device,
+VkDevice _device,
 VkSwapchainKHR   _swapchain,
 uint64_t timeout,
 VkSemaphore  semaphore,
 VkFence  _fence,
 uint32_t*pImageIndex)
 {
+   ANV_FROM_HANDLE(anv_device, device, _device);
ANV_FROM_HANDLE(wsi_swapchain, swapchain, _swapchain);
ANV_FROM_HANDLE(anv_fence, fence, _fence);
 
VkResult result = swapchain->acquire_next_image(swapchain, timeout,
semaphore, pImageIndex);
 
-   /* Thanks to implicit sync, the image is ready immediately. */
+   /* Thanks to implicit sync, the image is ready immediately.  However, we
+* should wait for the current GPU state to finish.
+*/
if (fence)
-  fence->state = ANV_FENCE_STATE_SIGNALED;
+  anv_QueueSubmit(anv_queue_to_handle(&device->queue), 0, NULL, _fence);
 
return result;
 }
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] radv/gfx9: gfx9 has buffer sizing rules like pre-VI.

2017-08-24 Thread Dave Airlie
From: Dave Airlie 

This fixes:
dEQP-VK.robustness.buffer_access.* on GFX9.

Signed-off-by: Dave Airlie 
---
 src/amd/vulkan/radv_image.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/amd/vulkan/radv_image.c b/src/amd/vulkan/radv_image.c
index f561919..560d90e 100644
--- a/src/amd/vulkan/radv_image.c
+++ b/src/amd/vulkan/radv_image.c
@@ -186,7 +186,7 @@ radv_make_buffer_descriptor(struct radv_device *device,
state[1] = S_008F04_BASE_ADDRESS_HI(va >> 32) |
S_008F04_STRIDE(stride);
 
-   if (device->physical_device->rad_info.chip_class < VI && stride) {
+   if (device->physical_device->rad_info.chip_class != VI && stride) {
range /= stride;
}
 
-- 
2.9.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
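
[Editor's sketch] The one-line change above makes every chip class except VI express the buffer range in units of stride rather than bytes when a stride is set. The sketch below shows only that rule. The enum and function names are hypothetical, the enum values are not the radv chip_class values, and nothing here is taken from hardware documentation.

```c
#include <stdint.h>

/* Hypothetical chip classes, ordered as in the discussion above. */
enum sketch_chip_class {
   SKETCH_SI,
   SKETCH_CIK,
   SKETCH_VI,
   SKETCH_GFX9,
};

/* Return the range field for a buffer descriptor: elements when the rule
 * applies, bytes otherwise. */
static uint32_t
sketch_descriptor_range(enum sketch_chip_class chip,
                        uint32_t range_bytes, uint32_t stride)
{
   if (chip != SKETCH_VI && stride)
      return range_bytes / stride;
   return range_bytes;
}
```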


[Mesa-dev] [PATCH] radv: fix predication on gfx9

2017-08-24 Thread Dave Airlie
From: Dave Airlie 

When I added gfx9 I did it wrong, this fixes it.

Fixes: 5247b311e9 "radv/gfx9: fix set predication packet."
Signed-off-by: Dave Airlie 
---
 src/amd/vulkan/si_cmd_buffer.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/amd/vulkan/si_cmd_buffer.c b/src/amd/vulkan/si_cmd_buffer.c
index 913ec0e..ef4f926 100644
--- a/src/amd/vulkan/si_cmd_buffer.c
+++ b/src/amd/vulkan/si_cmd_buffer.c
@@ -1133,8 +1133,10 @@ si_emit_cache_flush(struct radv_cmd_buffer *cmd_buffer)
 void
 si_emit_set_predication_state(struct radv_cmd_buffer *cmd_buffer, uint64_t va)
 {
-   uint32_t op = PRED_OP(PREDICATION_OP_BOOL64) | PREDICATION_DRAW_VISIBLE;
+   uint32_t op = 0;
 
+   if (va)
+   op = PRED_OP(PREDICATION_OP_BOOL64) | PREDICATION_DRAW_VISIBLE;
if (cmd_buffer->device->physical_device->rad_info.chip_class >= GFX9) {
radeon_emit(cmd_buffer->cs, PKT3(PKT3_SET_PREDICATION, 2, 0));
radeon_emit(cmd_buffer->cs, op);
-- 
2.9.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
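
[Editor's sketch] The fix above builds the predication op only when a non-zero address is supplied, so that va == 0 presumably emits an op of 0 and turns predication off. A minimal sketch of that selection follows. The SKETCH_ constants are illustrative placeholder encodings and are not copied from the radv headers.

```c
#include <stdint.h>

/* Illustrative placeholder encodings, not the real packet fields. */
#define SKETCH_PRED_OP_BOOL64    (3u << 16)
#define SKETCH_PRED_DRAW_VISIBLE (1u << 8)

/* Pick the op word for a SET_PREDICATION-style packet. */
static uint32_t
sketch_predication_op(uint64_t va)
{
   if (va == 0)
      return 0; /* no address: leave predication disabled */

   return SKETCH_PRED_OP_BOOL64 | SKETCH_PRED_DRAW_VISIBLE;
}
```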


Re: [Mesa-dev] [PATCH 1/3] st/omx: move YUV deinterlace function to common

2017-08-24 Thread Andy Furniss

Christian König wrote:

Am 24.08.2017 um 17:11 schrieb Leo Liu:

Signed-off-by: Leo Liu 


Reviewed-by: Christian König  for the series.

Andy do you want to test this? Should make VA-API transcoding simpler to 
use.


Oh, nice, it will be great to lose that env.

I started testing before mention of v3 patch - but one thing they seem 
to have fixed is the


[drm:amdgpu_vce_cs_reloc [amdgpu]] *ERROR* BO to small for addr 0x01000a 48 47


that I had with 2160p raw vid enc since

st/va: clear the video surface on allocation

:-)

I'll try with the latest version tomorrow.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] radv: Don't set a new subpass on compute resolve.

2017-08-24 Thread Dave Airlie
On 25 August 2017 at 09:17, Bas Nieuwenhuizen  wrote:
> We don't use the render path so totally unneeded.
>
> Fixes: f4e499ec791 "radv: add initial non-conformant radv vulkan driver"

I think there is a more recent commit to fix this one :-)

either way,

Reviewed-by: Dave Airlie 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/2] radv: Don't set a new subpass on compute resolve.

2017-08-24 Thread Bas Nieuwenhuizen
We don't use the render path so totally unneeded.

Fixes: f4e499ec791 "radv: add initial non-conformant radv vulkan driver"
---
 src/amd/vulkan/radv_meta_resolve_cs.c | 8 
 1 file changed, 8 deletions(-)

diff --git a/src/amd/vulkan/radv_meta_resolve_cs.c b/src/amd/vulkan/radv_meta_resolve_cs.c
index da6ca76b6d4..ce02884d2d6 100644
--- a/src/amd/vulkan/radv_meta_resolve_cs.c
+++ b/src/amd/vulkan/radv_meta_resolve_cs.c
@@ -521,14 +521,6 @@ radv_cmd_buffer_resolve_subpass_cs(struct radv_cmd_buffer *cmd_buffer)
if (dest_att.attachment == VK_ATTACHMENT_UNUSED)
continue;
 
-   struct radv_subpass resolve_subpass = {
-   .color_count = 1,
-   .color_attachments = (VkAttachmentReference[]) { 
dest_att },
-   .depth_stencil_attachment = { .attachment = 
VK_ATTACHMENT_UNUSED },
-   };
-
-   radv_cmd_buffer_set_subpass(cmd_buffer, &resolve_subpass, 
false);
-
emit_resolve(cmd_buffer,
 src_iview,
 dst_iview,
-- 
2.14.1



[Mesa-dev] [PATCH 1/2] radv: Remove some intel comments from the resolve code.

2017-08-24 Thread Bas Nieuwenhuizen
These are clearly not applicable to radv.
---
 src/amd/vulkan/radv_meta_resolve.c| 7 ---
 src/amd/vulkan/radv_meta_resolve_cs.c | 7 ---
 src/amd/vulkan/radv_meta_resolve_fs.c | 7 ---
 3 files changed, 21 deletions(-)

diff --git a/src/amd/vulkan/radv_meta_resolve.c 
b/src/amd/vulkan/radv_meta_resolve.c
index 6023e0f8999..dd811c25142 100644
--- a/src/amd/vulkan/radv_meta_resolve.c
+++ b/src/amd/vulkan/radv_meta_resolve.c
@@ -612,13 +612,6 @@ radv_cmd_buffer_resolve_subpass(struct radv_cmd_buffer 
*cmd_buffer)
 
radv_cmd_buffer_set_subpass(cmd_buffer, &resolve_subpass, 
false);
 
-   /* Subpass resolves must respect the render area. We can ignore 
the
-* render area here because vkCmdBeginRenderPass set the render 
area
-* with 3DSTATE_DRAWING_RECTANGLE.
-*
-* XXX(chadv): Does the hardware really respect
-* 3DSTATE_DRAWING_RECTANGLE when draing a 3DPRIM_RECTLIST?
-*/
emit_resolve(cmd_buffer,
 &(VkOffset2D) { 0, 0 },
 &(VkExtent2D) { fb->width, fb->height });
diff --git a/src/amd/vulkan/radv_meta_resolve_cs.c 
b/src/amd/vulkan/radv_meta_resolve_cs.c
index d20d04231ed..da6ca76b6d4 100644
--- a/src/amd/vulkan/radv_meta_resolve_cs.c
+++ b/src/amd/vulkan/radv_meta_resolve_cs.c
@@ -529,13 +529,6 @@ radv_cmd_buffer_resolve_subpass_cs(struct radv_cmd_buffer 
*cmd_buffer)
 
radv_cmd_buffer_set_subpass(cmd_buffer, &resolve_subpass, 
false);
 
-   /* Subpass resolves must respect the render area. We can ignore 
the
-* render area here because vkCmdBeginRenderPass set the render 
area
-* with 3DSTATE_DRAWING_RECTANGLE.
-*
-* XXX(chadv): Does the hardware really respect
-* 3DSTATE_DRAWING_RECTANGLE when draing a 3DPRIM_RECTLIST?
-*/
emit_resolve(cmd_buffer,
 src_iview,
 dst_iview,
diff --git a/src/amd/vulkan/radv_meta_resolve_fs.c 
b/src/amd/vulkan/radv_meta_resolve_fs.c
index 2f745f0ea09..373dd9665a7 100644
--- a/src/amd/vulkan/radv_meta_resolve_fs.c
+++ b/src/amd/vulkan/radv_meta_resolve_fs.c
@@ -633,13 +633,6 @@ radv_cmd_buffer_resolve_subpass_fs(struct radv_cmd_buffer 
*cmd_buffer)
 
radv_cmd_buffer_set_subpass(cmd_buffer, &resolve_subpass, 
false);
 
-   /* Subpass resolves must respect the render area. We can ignore 
the
-* render area here because vkCmdBeginRenderPass set the render 
area
-* with 3DSTATE_DRAWING_RECTANGLE.
-*
-* XXX(chadv): Does the hardware really respect
-* 3DSTATE_DRAWING_RECTANGLE when draing a 3DPRIM_RECTLIST?
-*/
emit_resolve(cmd_buffer,
 src_iview,
 dest_iview,
-- 
2.14.1



Re: [Mesa-dev] glmark2 terrain errors on imx6q

2017-08-24 Thread Chris Healy
I believe this is already fixed on the GC3000 in Mesa 17.2, but it
still fails on the GC2000.  Here is the commit that I think addresses
the issue on the GC3000:

https://cgit.freedesktop.org/mesa/mesa/commit/src/gallium/drivers/etnaviv?id=39056b0e2ac10342d8a3a6000f12a510f5dbd773

Can you try with the GC3000 to validate this?

On Thu, Aug 24, 2017 at 4:00 PM, Fabio Estevam  wrote:
> Hi,
>
> Getting the following errors when running glmark2 terrain test on imx6q:
>
> # glmark2-es2-drm -b terrain
> ** Failed to set swap interval. Results may be bounded above by refresh rate.
> ===
> glmark2 2014.03
> ===
> OpenGL Information
> GL_VENDOR: etnaviv
> GL_RENDERER:   Gallium 0.4 on Vivante GC2000 rev 5108
> GL_VERSION:OpenGL ES 2.0 Mesa 17.1.7
> ===
> ** Failed to set swap interval. Results may be bounded above by refresh rate.
> [terrain] :error: compile failed!
> etna_draw_vbo:199: compiled shaders are not okay
> error: compile failed!
> etna_draw_vbo:199: compiled shaders are not okay
> error: compile failed!
> etna_draw_vbo:199: compiled shaders are not okay
> error: compile failed!
> etna_draw_vbo:199: compiled shaders are not okay
> error: compile failed!
> etna_draw_vbo:199: compiled shaders are not okay
> error: compile failed!
> etna_draw_vbo:199: compiled shaders are not okay
> error: compile failed!
> etna_draw_vbo:199: compiled shaders are not okay
> error: compile failed!
> etna_draw_vbo:199: compiled shaders are not okay
> error: compile failed!
> etna_draw_vbo:199: compiled shaders are not okay
> error: compile failed!
> etna_draw_vbo:199: compiled shaders are not okay
> error: compile failed!
> etna_draw_vbo:199: compiled shaders are not okay
> error: compile failed!
> etna_draw_vbo:199: compiled shaders are not okay
> error: compile failed!
> etna_draw_vbo:199: compiled shaders are not okay
> error: compile failed!
> etna_draw_vbo:199: compiled shaders are not okay
> error: compile failed!
> etna_draw_vbo:199: compiled shaders are not okay
> error: compile failed!
> etna_draw_vbo:199: compiled shaders are not okay
> error: compile failed!
> etna_draw_vbo:199: compiled shaders are not okay
> error: compile failed!
> etna_draw_vbo:199: compiled shaders are not okay
> error: compile failed!
> etna_draw_vbo:199: compiled shaders are not okay
> error: compile failed!
> etna_draw_vbo:199: compiled shaders are not okay
> error: compile failed!
> etna_draw_vbo:199: compiled shaders are not okay
> error: compile failed!
> etna_draw_vbo:199: compiled shaders are not okay
> error: compile failed!
> etna_draw_vbo:199: compiled shaders are not okay
> error: compile failed!
> etna_draw_vbo:199: compiled shaders are not okay
> error: compile failed!
> etna_draw_vbo:199: compiled shaders are not okay
> error: compile failed!
> etna_draw_vbo:199: compiled shaders are not okay
> error: compile failed!
> etna_draw_vbo:199: compiled shaders are not okay
> error: compile failed!
> etna_draw_vbo:199: compiled shaders are not okay
> error: compile failed!
> etna_draw_vbo:199: compiled shaders are not okay
> error: compile failed!
> etna_draw_vbo:199: compiled shaders are not okay
> error: compile failed!
> etna_draw_vbo:199: compiled shaders are not okay
> error: compile failed!
> etna_draw_vbo:199: compiled shaders are not okay
> error: compile failed!
> etna_draw_vbo:199: compiled shaders are not okay
> error: compile failed!
> etna_draw_vbo:199: compiled shaders are not okay
> error: compile failed!
> etna_draw_vbo:199: compiled shaders are not okay
>  FPS: 3 FrameTime: 333.333 ms
> ===
>   glmark2 Score: 3
> ===
>
> Does anyone know of a possible fix for this test?
>
> Thanks
> ___
> etnaviv mailing list
> etna...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/etnaviv


[Mesa-dev] [Bug 101851] [regression] libEGL_common.a undefined reference to '__gxx_personality_v0'

2017-08-24 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=101851

Alec Moskvin  changed:

   What|Removed |Added

 CC||al...@gmx.com

-- 
You are receiving this mail because:
You are the assignee for the bug.


[Mesa-dev] glmark2 terrain errors on imx6q

2017-08-24 Thread Fabio Estevam
Hi,

Getting the following errors when running glmark2 terrain test on imx6q:

# glmark2-es2-drm -b terrain
** Failed to set swap interval. Results may be bounded above by refresh rate.
===
glmark2 2014.03
===
OpenGL Information
GL_VENDOR: etnaviv
GL_RENDERER:   Gallium 0.4 on Vivante GC2000 rev 5108
GL_VERSION:OpenGL ES 2.0 Mesa 17.1.7
===
** Failed to set swap interval. Results may be bounded above by refresh rate.
[terrain] :error: compile failed!
etna_draw_vbo:199: compiled shaders are not okay
error: compile failed!
etna_draw_vbo:199: compiled shaders are not okay
error: compile failed!
etna_draw_vbo:199: compiled shaders are not okay
error: compile failed!
etna_draw_vbo:199: compiled shaders are not okay
error: compile failed!
etna_draw_vbo:199: compiled shaders are not okay
error: compile failed!
etna_draw_vbo:199: compiled shaders are not okay
error: compile failed!
etna_draw_vbo:199: compiled shaders are not okay
error: compile failed!
etna_draw_vbo:199: compiled shaders are not okay
error: compile failed!
etna_draw_vbo:199: compiled shaders are not okay
error: compile failed!
etna_draw_vbo:199: compiled shaders are not okay
error: compile failed!
etna_draw_vbo:199: compiled shaders are not okay
error: compile failed!
etna_draw_vbo:199: compiled shaders are not okay
error: compile failed!
etna_draw_vbo:199: compiled shaders are not okay
error: compile failed!
etna_draw_vbo:199: compiled shaders are not okay
error: compile failed!
etna_draw_vbo:199: compiled shaders are not okay
error: compile failed!
etna_draw_vbo:199: compiled shaders are not okay
error: compile failed!
etna_draw_vbo:199: compiled shaders are not okay
error: compile failed!
etna_draw_vbo:199: compiled shaders are not okay
error: compile failed!
etna_draw_vbo:199: compiled shaders are not okay
error: compile failed!
etna_draw_vbo:199: compiled shaders are not okay
error: compile failed!
etna_draw_vbo:199: compiled shaders are not okay
error: compile failed!
etna_draw_vbo:199: compiled shaders are not okay
error: compile failed!
etna_draw_vbo:199: compiled shaders are not okay
error: compile failed!
etna_draw_vbo:199: compiled shaders are not okay
error: compile failed!
etna_draw_vbo:199: compiled shaders are not okay
error: compile failed!
etna_draw_vbo:199: compiled shaders are not okay
error: compile failed!
etna_draw_vbo:199: compiled shaders are not okay
error: compile failed!
etna_draw_vbo:199: compiled shaders are not okay
error: compile failed!
etna_draw_vbo:199: compiled shaders are not okay
error: compile failed!
etna_draw_vbo:199: compiled shaders are not okay
error: compile failed!
etna_draw_vbo:199: compiled shaders are not okay
error: compile failed!
etna_draw_vbo:199: compiled shaders are not okay
error: compile failed!
etna_draw_vbo:199: compiled shaders are not okay
error: compile failed!
etna_draw_vbo:199: compiled shaders are not okay
error: compile failed!
etna_draw_vbo:199: compiled shaders are not okay
 FPS: 3 FrameTime: 333.333 ms
===
  glmark2 Score: 3
===

Does anyone know of a possible fix for this test?

Thanks


Re: [Mesa-dev] [PATCH] ac/debug: use util_strchrnul() to fix android build error

2017-08-24 Thread Rob Herring
On Thu, Aug 24, 2017 at 4:52 PM, Mauro Rossi  wrote:
> Similar to e09d04cd56 "radeonsi: use util_strchrnul() to fix android build 
> error"
>
> Android Bionic does not support strchrnul() string function,
> gallium auxiliary util/u_string.h provides util_strchrnul()
>
> This change avoids the following warning and error:
>
> external/mesa/src/amd/common/ac_debug.c:501:15: warning: implicit declaration 
> of function 'strchrnul' is invalid in C99
> char *end = strchrnul(out, '\n');
> ^
> external/mesa/src/amd/common/ac_debug.c:501:9: error: incompatible integer to 
> pointer conversion initializing 'char *' with an expression of type 'int'
> char *end = strchrnul(out, '\n');
>   ^ 
> 1 warning and 1 error generated.
>
> Fixes: c2c3912410 "ac/debug: annotate IB dumps with the raw values"
> ---
>  src/amd/common/ac_debug.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)

You beat me to it. I've applied and pushed this.

Rob


Re: [Mesa-dev] [PATCH] ac/debug: use util_strchrnul() to fix android build error

2017-08-24 Thread Marek Olšák
Reviewed-by: Marek Olšák 

Marek

On Thu, Aug 24, 2017 at 11:52 PM, Mauro Rossi  wrote:
> Similar to e09d04cd56 "radeonsi: use util_strchrnul() to fix android build 
> error"
>
> Android Bionic does not support strchrnul() string function,
> gallium auxiliary util/u_string.h provides util_strchrnul()
>
> This change avoids the following warning and error:
>
> external/mesa/src/amd/common/ac_debug.c:501:15: warning: implicit declaration 
> of function 'strchrnul' is invalid in C99
> char *end = strchrnul(out, '\n');
> ^
> external/mesa/src/amd/common/ac_debug.c:501:9: error: incompatible integer to 
> pointer conversion initializing 'char *' with an expression of type 'int'
> char *end = strchrnul(out, '\n');
>   ^ 
> 1 warning and 1 error generated.
>
> Fixes: c2c3912410 "ac/debug: annotate IB dumps with the raw values"
> ---
>  src/amd/common/ac_debug.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/src/amd/common/ac_debug.c b/src/amd/common/ac_debug.c
> index 2af83a146c..3daa556f5d 100644
> --- a/src/amd/common/ac_debug.c
> +++ b/src/amd/common/ac_debug.c
> @@ -39,6 +39,7 @@
>  #include "sid_tables.h"
>  #include "util/u_math.h"
>  #include "util/u_memory.h"
> +#include "util/u_string.h"
>
>  /* Parsed IBs are difficult to read without colors. Use "less -R file" to
>   * read them, or use "aha -b -f file" to convert them to html.
> @@ -498,7 +499,7 @@ static void format_ib_output(FILE *f, char *out)
> if (indent)
> print_spaces(f, indent);
>
> -   char *end = strchrnul(out, '\n');
> +   char *end = util_strchrnul(out, '\n');
> fwrite(out, end - out, 1, f);
> fputc('\n', f); /* always end with a new line */
> if (!*end)
> --
> 2.14.1
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] ac/debug: use util_strchrnul() to fix android build error

2017-08-24 Thread Mauro Rossi
Similar to e09d04cd56 "radeonsi: use util_strchrnul() to fix android build 
error"

Android Bionic does not support the strchrnul() string function;
the gallium auxiliary header util/u_string.h provides util_strchrnul().

This change avoids the following warning and error:

external/mesa/src/amd/common/ac_debug.c:501:15: warning: implicit declaration 
of function 'strchrnul' is invalid in C99
char *end = strchrnul(out, '\n');
^
external/mesa/src/amd/common/ac_debug.c:501:9: error: incompatible integer to 
pointer conversion initializing 'char *' with an expression of type 'int'
char *end = strchrnul(out, '\n');
  ^ 
1 warning and 1 error generated.

Fixes: c2c3912410 "ac/debug: annotate IB dumps with the raw values"
---
 src/amd/common/ac_debug.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/amd/common/ac_debug.c b/src/amd/common/ac_debug.c
index 2af83a146c..3daa556f5d 100644
--- a/src/amd/common/ac_debug.c
+++ b/src/amd/common/ac_debug.c
@@ -39,6 +39,7 @@
 #include "sid_tables.h"
 #include "util/u_math.h"
 #include "util/u_memory.h"
+#include "util/u_string.h"
 
 /* Parsed IBs are difficult to read without colors. Use "less -R file" to
  * read them, or use "aha -b -f file" to convert them to html.
@@ -498,7 +499,7 @@ static void format_ib_output(FILE *f, char *out)
if (indent)
print_spaces(f, indent);
 
-   char *end = strchrnul(out, '\n');
+   char *end = util_strchrnul(out, '\n');
fwrite(out, end - out, 1, f);
fputc('\n', f); /* always end with a new line */
if (!*end)
-- 
2.14.1
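
For readers unfamiliar with strchrnul(), here is a minimal sketch of the
kind of fallback this patch relies on. The names my_strchrnul and
write_first_line are hypothetical and only illustrate the semantics
(return a pointer to the first match, or to the terminating NUL when
there is no match). This is not Mesa's util_strchrnul() implementation.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a strchrnul() replacement on libcs that
 * lack it. Returns a pointer to the first occurrence of c in s, or to
 * the terminating NUL byte if c does not occur. */
static char *my_strchrnul(const char *s, int c)
{
    assert(s != NULL);
    /* Advance until we hit the character or the end of the string. */
    while (*s && *s != (char)c)
        s++;
    return (char *)s;
}

/* Write the first line of text (without its newline) to f, followed by
 * a newline, and return the number of characters in that line. */
static size_t write_first_line(FILE *f, const char *text)
{
    const char *end = my_strchrnul(text, '\n');
    size_t len = (size_t)(end - text);

    fwrite(text, 1, len, f);
    fputc('\n', f); /* always end with a new line */
    return len;
}
```

Because the helper never returns NULL, the caller can compute
end - text unconditionally and then test *end to see whether more
input follows, which is the pattern format_ib_output() uses.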



Re: [Mesa-dev] [PATCH] radv/meta: missing initialisations in create_pass().

2017-08-24 Thread Bas Nieuwenhuizen
On Thu, Aug 24, 2017, at 11:10, Xavier Bouchoux wrote:
> Otherwise radv_cmd_state_setup_attachments() will complain it has no
> clearvalues,
> when called via radv_process_depth_image_inplace().
> 
> Signed-off-by: Xavier Bouchoux 
> ---
>  src/amd/vulkan/radv_meta_decompress.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/src/amd/vulkan/radv_meta_decompress.c
> b/src/amd/vulkan/radv_meta_decompress.c
> index f68ce8d2b0..f932b4c492 100644
> --- a/src/amd/vulkan/radv_meta_decompress.c
> +++ b/src/amd/vulkan/radv_meta_decompress.c
> @@ -38,10 +38,13 @@ create_pass(struct radv_device *device,
>   const VkAllocationCallbacks *alloc = &device->meta_state.alloc;
>   VkAttachmentDescription attachment;
>  
> +   attachment.flags = 0;
>   attachment.format = VK_FORMAT_D32_SFLOAT_S8_UINT;
>   attachment.samples = samples;
>   attachment.loadOp = VK_ATTACHMENT_LOAD_OP_LOAD;
>   attachment.storeOp = VK_ATTACHMENT_STORE_OP_STORE;
> +   attachment.stencilLoadOp = VK_ATTACHMENT_LOAD_OP_DONT_CARE;
> +   attachment.stencilStoreOp = VK_ATTACHMENT_STORE_OP_DONT_CARE;

I think we should make these LOAD/STORE instead of DONT_CARE, since
HTILE decompression needs to preserve stencil.
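
A small sketch of the two points above: every field gets initialized,
and the stencil aspect is loaded and stored rather than discarded. It
uses hypothetical stand-in types (struct attachment_desc, enum load_op,
enum store_op) instead of the real Vulkan headers, so it only models
the idea and is not the radv code.

```c
#include <assert.h>

/* Hypothetical stand-ins for the Vulkan load/store op enums. */
enum load_op  { LOAD_OP_LOAD, LOAD_OP_CLEAR, LOAD_OP_DONT_CARE };
enum store_op { STORE_OP_STORE, STORE_OP_DONT_CARE };

/* Hypothetical stand-in for a subset of VkAttachmentDescription. */
struct attachment_desc {
    unsigned flags;
    unsigned samples;
    enum load_op load_op;
    enum store_op store_op;
    enum load_op stencil_load_op;
    enum store_op stencil_store_op;
};

static struct attachment_desc make_depth_stencil_attachment(unsigned samples)
{
    assert(samples > 0);

    /* A designated initializer zeroes every field that is not named,
     * so no member is left holding stack garbage. */
    return (struct attachment_desc) {
        .flags = 0,
        .samples = samples,
        .load_op = LOAD_OP_LOAD,
        .store_op = STORE_OP_STORE,
        /* Preserve stencil across the decompression pass. */
        .stencil_load_op = LOAD_OP_LOAD,
        .stencil_store_op = STORE_OP_STORE,
    };
}
```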

>   attachment.initialLayout = 
> VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL;
>   attachment.finalLayout = 
> VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL;
>  
> -- 
> 2.14.1
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] mesa: Implement GL_ARB_polygon_offset_clamp

2017-08-24 Thread Ilia Mirkin
My personal inclination would have been the inverse (i.e. alias
glPolygonOffsetClampEXT to glPolygonOffsetClamp), but ... you've been
through enough silly revisions, and I don't sufficiently care.

Reviewed-by: Ilia Mirkin 

On Thu, Aug 24, 2017 at 5:00 PM, Adam Jackson  wrote:
> Semantically identical to the EXT version (whose string is still valid
> for GLES), so rename the bit but expose both extension strings.
> (Suggested by Ilia Mirkin and Ian Romanick.)
>
> v3: Fix the entrypoint alias in GL4x.xml (Ilia)
>
> Signed-off-by: Adam Jackson 
> ---
>  docs/features.txt| 2 +-
>  docs/relnotes/17.3.0.html| 1 +
>  src/mapi/glapi/gen/GL4x.xml  | 9 +
>  src/mesa/drivers/dri/i965/intel_extensions.c | 2 +-
>  src/mesa/main/dlist.c| 2 +-
>  src/mesa/main/extensions_table.h | 3 ++-
>  src/mesa/main/get.c  | 2 +-
>  src/mesa/main/get_hash_params.py | 4 ++--
>  src/mesa/main/mtypes.h   | 2 +-
>  src/mesa/main/polygon.c  | 9 +++--
>  src/mesa/main/version.c  | 2 +-
>  src/mesa/state_tracker/st_extensions.c   | 2 +-
>  12 files changed, 24 insertions(+), 16 deletions(-)
>
> diff --git a/docs/features.txt b/docs/features.txt
> index 3f91c2daae..0435ce61ff 100644
> --- a/docs/features.txt
> +++ b/docs/features.txt
> @@ -226,7 +226,7 @@ GL 4.6, GLSL 4.60
>GL_ARB_gl_spirv   in progress (Nicolai 
> Hähnle, Ian Romanick)
>GL_ARB_indirect_parametersDONE (nvc0, radeonsi)
>GL_ARB_pipeline_statistics_query  DONE (i965, nvc0, 
> radeonsi, softpipe, swr)
> -  GL_ARB_polygon_offset_clamp   not started
> +  GL_ARB_polygon_offset_clamp   DONE (i965, nv50, 
> nvc0, r600, radeonsi, llvmpipe, swr)
>GL_ARB_shader_atomic_counter_ops  DONE (i965/gen7+, 
> nvc0, radeonsi, softpipe)
>GL_ARB_shader_draw_parameters DONE (i965, nvc0, 
> radeonsi)
>GL_ARB_shader_group_vote  DONE (i965, nvc0, 
> radeonsi)
> diff --git a/docs/relnotes/17.3.0.html b/docs/relnotes/17.3.0.html
> index 8da43f22f0..4a74284632 100644
> --- a/docs/relnotes/17.3.0.html
> +++ b/docs/relnotes/17.3.0.html
> @@ -44,6 +44,7 @@ Note: some of the new features are only available with 
> certain drivers.
>  
>
>  
> +GL_ARB_polygon_offset_clamp on i965, nv50, nvc0, r600, radeonsi, 
> llvmpipe, swr
>  GL_ARB_transform_feedback_overflow_query on radeonsi
>  GL_ARB_texture_filter_anisotropic on i965, nv50, nvc0, r600, 
> radeonsi
>  GL_EXT_memory_object on radeonsi
> diff --git a/src/mapi/glapi/gen/GL4x.xml b/src/mapi/glapi/gen/GL4x.xml
> index e958ee70c7..88dba5cd71 100644
> --- a/src/mapi/glapi/gen/GL4x.xml
> +++ b/src/mapi/glapi/gen/GL4x.xml
> @@ -66,4 +66,13 @@
>
>  
>
> +
> +  
> +
> +
> +
> +  
> +  
> +
> +
>  
> diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c 
> b/src/mesa/drivers/dri/i965/intel_extensions.c
> index c3cd8004a1..deacd0d9df 100644
> --- a/src/mesa/drivers/dri/i965/intel_extensions.c
> +++ b/src/mesa/drivers/dri/i965/intel_extensions.c
> @@ -66,6 +66,7 @@ intelInitExtensions(struct gl_context *ctx)
> ctx->Extensions.ARB_occlusion_query = true;
> ctx->Extensions.ARB_occlusion_query2 = true;
> ctx->Extensions.ARB_point_sprite = true;
> +   ctx->Extensions.ARB_polygon_offset_clamp = true;
> ctx->Extensions.ARB_seamless_cube_map = true;
> ctx->Extensions.ARB_shader_bit_encoding = true;
> ctx->Extensions.ARB_shader_draw_parameters = true;
> @@ -100,7 +101,6 @@ intelInitExtensions(struct gl_context *ctx)
> ctx->Extensions.EXT_packed_float = true;
> ctx->Extensions.EXT_pixel_buffer_object = true;
> ctx->Extensions.EXT_point_parameters = true;
> -   ctx->Extensions.EXT_polygon_offset_clamp = true;
> ctx->Extensions.EXT_provoking_vertex = true;
> ctx->Extensions.EXT_stencil_two_side = true;
> ctx->Extensions.EXT_texture_array = true;
> diff --git a/src/mesa/main/dlist.c b/src/mesa/main/dlist.c
> index 208471aca7..b7d1406eb7 100644
> --- a/src/mesa/main/dlist.c
> +++ b/src/mesa/main/dlist.c
> @@ -10062,7 +10062,7 @@ _mesa_initialize_save_table(const struct gl_context 
> *ctx)
> SET_ProgramUniformMatrix3x4fv(table, save_ProgramUniformMatrix3x4fv);
> SET_ProgramUniformMatrix4x3fv(table, save_ProgramUniformMatrix4x3fv);
>
> -   /* GL_EXT_polygon_offset_clamp */
> +   /* GL_{ARB,EXT}_polygon_offset_clamp */
> SET_PolygonOffsetClampEXT(table, save_PolygonOffsetClampEXT);
>
> /* GL_EXT_window_rectangles */
> diff --git a/src/mesa/main/extensions_table.h 
> b/src/mesa/main/extensions_table.h
> index d096260891..9475c1b69d 100644
> --- a/src/mesa/main/extensions_table.h
> +++ 

Re: [Mesa-dev] [PATCH 4/6] ac/debug: annotate IB dumps with the raw values

2017-08-24 Thread Rob Herring
On Thu, Aug 24, 2017 at 4:24 PM, Rob Herring  wrote:
> On Tue, Aug 22, 2017 at 10:45 AM, Nicolai Hähnle  wrote:
>> From: Nicolai Hähnle 
>>
>> ---
>>  src/amd/common/ac_debug.c | 84 
>> +--
>>  1 file changed, 66 insertions(+), 18 deletions(-)
>>
>> diff --git a/src/amd/common/ac_debug.c b/src/amd/common/ac_debug.c
>> index 518893ff481..e92dfbd0e4a 100644
>> --- a/src/amd/common/ac_debug.c
>> +++ b/src/amd/common/ac_debug.c
>
> [...]
>
>> +static void format_ib_output(FILE *f, char *out)
>> +{
>> +   unsigned depth = 0;
>> +
>> +   for (;;) {
>> +   char op = 0;
>> +
>> +   if (out[0] == '\n' && out[1] == '\035')
>> +   out++;
>> +   if (out[0] == '\035') {
>> +   op = out[1];
>> +   out += 2;
>> +   }
>> +
>> +   if (op == '<')
>> +   depth--;
>> +
>> +   unsigned indent = 4 * depth;
>> +   if (op != '#')
>> +   indent += 9;
>> +
>> +   if (indent)
>> +   print_spaces(f, indent);
>> +
>> +   char *end = strchrnul(out, '\n');
>
> This fails to build on Android. I think you need an explicit include
> of string.h.

Actually, it's not enabled in bionic, so perhaps util_strchrnul()
instead. I'll send a patch.

Rob


Re: [Mesa-dev] [PATCH] i965: add 2xMSAA and 16xMSAA to DRI configs for Gen9.

2017-08-24 Thread Kenneth Graunke
On Thursday, August 24, 2017 4:16:39 AM PDT kevin.rogo...@intel.com wrote:
> From: Kevin Rogovin 
> 
> Special thanks to Eero Tamminen for reporting rasterizer
> numbers being twice what they should be for 2xMSAA under
> a benchmark.
> 
> Signed-off-by: Kevin Rogovin 

Nice catch!  Thanks for fixing this.

Reviewed-by: Kenneth Graunke 

Ian requested that I run this through a full CTS run before pushing, so
that we actually hit all the new visuals, and make sure 2x/16x works as
expected.  Assuming that comes back green, I'll plan to push this.



Re: [Mesa-dev] [PATCH 4/6] ac/debug: annotate IB dumps with the raw values

2017-08-24 Thread Rob Herring
On Tue, Aug 22, 2017 at 10:45 AM, Nicolai Hähnle  wrote:
> From: Nicolai Hähnle 
>
> ---
>  src/amd/common/ac_debug.c | 84 
> +--
>  1 file changed, 66 insertions(+), 18 deletions(-)
>
> diff --git a/src/amd/common/ac_debug.c b/src/amd/common/ac_debug.c
> index 518893ff481..e92dfbd0e4a 100644
> --- a/src/amd/common/ac_debug.c
> +++ b/src/amd/common/ac_debug.c

[...]

> +static void format_ib_output(FILE *f, char *out)
> +{
> +   unsigned depth = 0;
> +
> +   for (;;) {
> +   char op = 0;
> +
> +   if (out[0] == '\n' && out[1] == '\035')
> +   out++;
> +   if (out[0] == '\035') {
> +   op = out[1];
> +   out += 2;
> +   }
> +
> +   if (op == '<')
> +   depth--;
> +
> +   unsigned indent = 4 * depth;
> +   if (op != '#')
> +   indent += 9;
> +
> +   if (indent)
> +   print_spaces(f, indent);
> +
> +   char *end = strchrnul(out, '\n');

This fails to build on Android. I think you need an explicit include
of string.h.

Rob


[Mesa-dev] [PATCH] mesa: Implement GL_ARB_polygon_offset_clamp

2017-08-24 Thread Adam Jackson
Semantically identical to the EXT version (whose string is still valid
for GLES), so rename the bit but expose both extension strings.
(Suggested by Ilia Mirkin and Ian Romanick.)

v3: Fix the entrypoint alias in GL4x.xml (Ilia)

Signed-off-by: Adam Jackson 
---
 docs/features.txt| 2 +-
 docs/relnotes/17.3.0.html| 1 +
 src/mapi/glapi/gen/GL4x.xml  | 9 +
 src/mesa/drivers/dri/i965/intel_extensions.c | 2 +-
 src/mesa/main/dlist.c| 2 +-
 src/mesa/main/extensions_table.h | 3 ++-
 src/mesa/main/get.c  | 2 +-
 src/mesa/main/get_hash_params.py | 4 ++--
 src/mesa/main/mtypes.h   | 2 +-
 src/mesa/main/polygon.c  | 9 +++--
 src/mesa/main/version.c  | 2 +-
 src/mesa/state_tracker/st_extensions.c   | 2 +-
 12 files changed, 24 insertions(+), 16 deletions(-)

diff --git a/docs/features.txt b/docs/features.txt
index 3f91c2daae..0435ce61ff 100644
--- a/docs/features.txt
+++ b/docs/features.txt
@@ -226,7 +226,7 @@ GL 4.6, GLSL 4.60
   GL_ARB_gl_spirv   in progress (Nicolai 
Hähnle, Ian Romanick)
   GL_ARB_indirect_parametersDONE (nvc0, radeonsi)
   GL_ARB_pipeline_statistics_query  DONE (i965, nvc0, 
radeonsi, softpipe, swr)
-  GL_ARB_polygon_offset_clamp   not started
+  GL_ARB_polygon_offset_clamp   DONE (i965, nv50, 
nvc0, r600, radeonsi, llvmpipe, swr)
   GL_ARB_shader_atomic_counter_ops  DONE (i965/gen7+, 
nvc0, radeonsi, softpipe)
   GL_ARB_shader_draw_parameters DONE (i965, nvc0, 
radeonsi)
   GL_ARB_shader_group_vote  DONE (i965, nvc0, 
radeonsi)
diff --git a/docs/relnotes/17.3.0.html b/docs/relnotes/17.3.0.html
index 8da43f22f0..4a74284632 100644
--- a/docs/relnotes/17.3.0.html
+++ b/docs/relnotes/17.3.0.html
@@ -44,6 +44,7 @@ Note: some of the new features are only available with 
certain drivers.
 
 
 
+GL_ARB_polygon_offset_clamp on i965, nv50, nvc0, r600, radeonsi, llvmpipe, 
swr
 GL_ARB_transform_feedback_overflow_query on radeonsi
 GL_ARB_texture_filter_anisotropic on i965, nv50, nvc0, r600, radeonsi
 GL_EXT_memory_object on radeonsi
diff --git a/src/mapi/glapi/gen/GL4x.xml b/src/mapi/glapi/gen/GL4x.xml
index e958ee70c7..88dba5cd71 100644
--- a/src/mapi/glapi/gen/GL4x.xml
+++ b/src/mapi/glapi/gen/GL4x.xml
@@ -66,4 +66,13 @@
   
 
 
+
+  
+
+
+
+  
+  
+
+
 
diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c 
b/src/mesa/drivers/dri/i965/intel_extensions.c
index c3cd8004a1..deacd0d9df 100644
--- a/src/mesa/drivers/dri/i965/intel_extensions.c
+++ b/src/mesa/drivers/dri/i965/intel_extensions.c
@@ -66,6 +66,7 @@ intelInitExtensions(struct gl_context *ctx)
ctx->Extensions.ARB_occlusion_query = true;
ctx->Extensions.ARB_occlusion_query2 = true;
ctx->Extensions.ARB_point_sprite = true;
+   ctx->Extensions.ARB_polygon_offset_clamp = true;
ctx->Extensions.ARB_seamless_cube_map = true;
ctx->Extensions.ARB_shader_bit_encoding = true;
ctx->Extensions.ARB_shader_draw_parameters = true;
@@ -100,7 +101,6 @@ intelInitExtensions(struct gl_context *ctx)
ctx->Extensions.EXT_packed_float = true;
ctx->Extensions.EXT_pixel_buffer_object = true;
ctx->Extensions.EXT_point_parameters = true;
-   ctx->Extensions.EXT_polygon_offset_clamp = true;
ctx->Extensions.EXT_provoking_vertex = true;
ctx->Extensions.EXT_stencil_two_side = true;
ctx->Extensions.EXT_texture_array = true;
diff --git a/src/mesa/main/dlist.c b/src/mesa/main/dlist.c
index 208471aca7..b7d1406eb7 100644
--- a/src/mesa/main/dlist.c
+++ b/src/mesa/main/dlist.c
@@ -10062,7 +10062,7 @@ _mesa_initialize_save_table(const struct gl_context 
*ctx)
SET_ProgramUniformMatrix3x4fv(table, save_ProgramUniformMatrix3x4fv);
SET_ProgramUniformMatrix4x3fv(table, save_ProgramUniformMatrix4x3fv);
 
-   /* GL_EXT_polygon_offset_clamp */
+   /* GL_{ARB,EXT}_polygon_offset_clamp */
SET_PolygonOffsetClampEXT(table, save_PolygonOffsetClampEXT);
 
/* GL_EXT_window_rectangles */
diff --git a/src/mesa/main/extensions_table.h b/src/mesa/main/extensions_table.h
index d096260891..9475c1b69d 100644
--- a/src/mesa/main/extensions_table.h
+++ b/src/mesa/main/extensions_table.h
@@ -94,6 +94,7 @@ EXT(ARB_pipeline_statistics_query   , 
ARB_pipeline_statistics_query
 EXT(ARB_pixel_buffer_object , EXT_pixel_buffer_object  
  , GLL, GLC,  x ,  x , 2004)
 EXT(ARB_point_parameters, EXT_point_parameters 
  , GLL,  x ,  x ,  x , 1997)
 EXT(ARB_point_sprite, ARB_point_sprite 
  , GLL, GLC,  x ,  x , 2003)
+EXT(ARB_polygon_offset_clamp, ARB_polygon_offset_clamp 
  , 

[Mesa-dev] [Bug 102377] PIPE_*_4BYTE_ALIGNED_ONLY caps crashing

2017-08-24 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=102377

--- Comment #3 from Bruce Cherniak  ---
Tim headed out on vacation today.  He'll be out nearly 2 weeks.  I'm going to
try to get to testing this in the next day or so.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


[Mesa-dev] [Bug 102377] PIPE_*_4BYTE_ALIGNED_ONLY caps crashing

2017-08-24 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=102377

--- Comment #2 from Brian Paul  ---
Tim, I'll hold off on pushing this patch until you can test it.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH 41/47] i965/fs: Add reuse_16bit_conversions_register optimization

2017-08-24 Thread Connor Abbott
Hi Alejandro,

This seems really suspicious. If the live ranges are really independent,
then the register allocator should be able to assign the two virtual
registers to the same physical register if it needs to. This change forces
the two to be the same, which constrains the register allocator
unnecessarily and should make it worse, so I'm confused as to why this would
help at all.
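
To make the argument concrete, here is a simplified sketch of the
interference test I have in mind. The struct live_range type and the
ranges_interfere helper are hypothetical, and this is a toy model, not
the i965 register allocator: a value is live from the instruction that
defines it up to the instruction that last reads it, and a read is
assumed to happen before the write in the same instruction.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical live range: def is the instruction index that writes
 * the value, last_use is the index of the final instruction reading it. */
struct live_range {
    int def;
    int last_use;
};

/* Two virtual registers interfere only if one is written while the
 * other is still needed. Using strict comparisons means a value whose
 * last read is at instruction N does not interfere with a value first
 * written at instruction N. */
static bool ranges_interfere(struct live_range a, struct live_range b)
{
    assert(a.def <= a.last_use && b.def <= b.last_use);
    return a.def < b.last_use && b.def < a.last_use;
}
```

Under this model, a 32-bit value defined at instruction 2 and last read
by the conversion at instruction 5 does not interfere with the 16-bit
result written at instruction 5, so the allocator is already free to
give both the same physical register.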

IIRC there were some issues where we unnecessarily made the sources and
destination of an instruction interfere with each other, but if that's
what's causing this, then we should fix that underlying issue.

(From what I remember, a lot of SIMD16 instructions were expanded to SIMD8 in the
generator, in which case the second half of the source is read after the
first half of the destination is written, and we falsely thought that the
HW did that too, so we had some code to add a fake interference between
them, but a while ago Curro moved the expansion to happen before register
allocation. I don't have the code in front of me, but I think we still have
this useless code lying around, and I would guess this is the source of the
problem.)

Connor

On Aug 24, 2017 2:59 PM, "Alejandro Piñeiro"  wrote:

When dealing with HF/W/UW, it is common to have a register holding an
F/D/UD value, convert it to HF/W/UW, and never use the F/D/UD value
again. In those cases it is possible to reuse the register where the
F value is initially stored instead of having two. Note also that
when operating on HF/W/UW you need to use the full register anyway
(stride 2). Packs/unpacks are only useful when loading/storing
several HF/W/UW values.

Note that no instruction is removed. The main benefit is reducing the
number of registers used, which decreases the pressure on the register
allocator for big shaders.

Possibly this could be integrated into an existing optimization, or even
into the register allocator, but it was far easier to write and cleaner
to read as a separate optimization.

We found this issue when dealing with some Vulkan CTS tests that
needed several minutes to compile. Most of the time was spent on the
register allocator.

Right now the optimization only handles 32 to 16 bit conversion. It
could be possible to do the equivalent for 16 to 32 bit too, but in
practice, we didn't need it.
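
To illustrate why reusing the source register is safe with a stride-2 destination, here is a small sketch in C. It is a toy model, not compiler code: the names are hypothetical, a register is modelled as eight 32-bit slots, and the 16-bit type is W (signed integer) rather than HF because plain C has no half-float type.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_LANES 8

/* Toy model of one SIMD8 register: eight 32-bit slots, one per lane. */
struct toy_vgrf {
   uint32_t slot[TOY_LANES];
};

/* Stores a float into lane `lane`, as the F-typed source would hold it. */
static void
toy_write_f(struct toy_vgrf *reg, int lane, float f)
{
   memcpy(&reg->slot[lane], &f, sizeof(f));
}

/* Models "mov(8) reg<2>:W, reg:F" with source and destination in the same
 * register. Lane i reads the float stored in slot i and writes the 16-bit
 * result into the first two bytes of that same slot (stride 2 in 16-bit
 * units), so no lane overwrites data that another lane still has to read. */
static void
toy_convert_f_to_w_in_place(struct toy_vgrf *reg)
{
   for (int i = 0; i < TOY_LANES; i++) {
      float f;
      memcpy(&f, &reg->slot[i], sizeof(f));
      assert(f >= INT16_MIN && f <= INT16_MAX);
      int16_t w = (int16_t)f;
      memcpy(&reg->slot[i], &w, sizeof(w));
   }
}

/* Reads lane `lane` of the register as a 16-bit value at stride 2. */
static int16_t
toy_read_w(const struct toy_vgrf *reg, int lane)
{
   int16_t w;
   memcpy(&w, &reg->slot[lane], sizeof(w));
   return w;
}
```

Each lane only touches its own 32-bit slot, which is the property the optimization relies on once the F value is dead after the MOV.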
---
 src/intel/compiler/brw_fs.cpp | 77 ++
+
 src/intel/compiler/brw_fs.h   |  1 +
 2 files changed, 78 insertions(+)

diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp
index b6013a5ce85..1342150b44e 100644
--- a/src/intel/compiler/brw_fs.cpp
+++ b/src/intel/compiler/brw_fs.cpp
@@ -39,6 +39,7 @@
 #include "compiler/glsl_types.h"
 #include "compiler/nir/nir_builder.h"
 #include "program/prog_parameter.h"
+#include "brw_fs_live_variables.h"

 using namespace brw;

@@ -3133,6 +3134,81 @@ fs_visitor::remove_extra_rounding_modes()
return progress;
 }

+/**
+ * When dealing with HF/W/UW, it is common to have a register with an F/D/UD
+ * value, convert it to HF/W/UW, and not use the F/D/UD value again. In those
+ * cases it would be possible to reuse the register where the F value is
+ * initially stored instead of having two. Take also into account that when
+ * operating with HF/W/UW, you would need to use the full register (so stride
+ * 2). Packs/unpacks would be only useful when loading/storing several
+ * HF/W/UWs.
+ *
+ * So something like this:
+ *  mov(8) vgrf14<2>:HF, vgrf39:F
+ *
+ * Becomes:
+ *  mov(8) vgrf39<2>:HF, vgrf39:F
+ *
+ * Note that no instruction is removed. The main benefit is reducing the
+ * number of registers used, so the pressure on the register allocator is
+ * decreased with big shaders.
+ */
+bool
+fs_visitor::reuse_16bit_conversions_vgrf()
+{
+   bool progress = false;
+   int ip = -1;
+
+   calculate_live_intervals();
+
+   foreach_block_and_inst_safe (block, fs_inst, inst, cfg) {
+  ip++;
+
+  if (inst->dst.file != VGRF || inst->src[0].file != VGRF)
+ continue;
+
+  if (inst->opcode != BRW_OPCODE_MOV)
+ continue;
+
+  if (type_sz(inst->dst.type) != 2 || inst->dst.stride != 2 ||
+  type_sz(inst->src[0].type) != 4 || inst->src[0].stride != 1) {
+ continue;
+  }
+
+  int src_reg = inst->src[0].nr;
+  int src_offset = inst->src[0].offset;
+  unsigned src_var = live_intervals->var_from_vgrf[src_reg];
+  int src_end = live_intervals->end[src_var];
+  int dst_reg = inst->dst.nr;
+
+  if (src_end > ip)
+ continue;
+
+  foreach_block_and_inst(block, fs_inst, scan_inst, cfg) {
+ if (scan_inst->dst.file == VGRF &&
+ scan_inst->dst.nr == dst_reg) {
+scan_inst->dst.nr = src_reg;
+scan_inst->dst.offset = src_offset;
+progress = true;
+ }
+
+ for (int i = 0; i < scan_inst->sources; i++) {
+if (scan_inst->src[i].file == VGRF &&
+scan_inst->src[i].nr 

[Mesa-dev] [PATCH] egl/drm: Don't "fall back" to /dev/dri/card0 if the first open fails

2017-08-24 Thread Adam Jackson
The snprintf stuff here already constructs the right name for the device
node, and if it doesn't, you configured Mesa wrong, don't do that.

Signed-off-by: Adam Jackson 
---
 src/egl/drivers/dri2/platform_drm.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/src/egl/drivers/dri2/platform_drm.c b/src/egl/drivers/dri2/platform_drm.c
index 259b1cd519..0ccbd9a30a 100644
--- a/src/egl/drivers/dri2/platform_drm.c
+++ b/src/egl/drivers/dri2/platform_drm.c
@@ -667,8 +667,6 @@ dri2_initialize_drm(_EGLDriver *drv, _EGLDisplay *disp)
   int n = snprintf(buf, sizeof(buf), DRM_DEV_NAME, DRM_DIR_NAME, 0);
   if (n != -1 && n < sizeof(buf))
  dri2_dpy->fd = loader_open_device(buf);
-  if (dri2_dpy->fd < 0)
- dri2_dpy->fd = loader_open_device("/dev/dri/card0");
   gbm = gbm_create_device(dri2_dpy->fd);
   if (gbm == NULL) {
  err = "DRI2: failed to create gbm device";
-- 
2.13.5
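
For readers unfamiliar with the snprintf check kept by the patch above, here is a minimal standalone sketch in C. The helper name is hypothetical, and the two macros are assumed to mirror libdrm's DRM_DIR_NAME and DRM_DEV_NAME; a differently configured build may use other values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Assumed to match libdrm's DRM_DIR_NAME and DRM_DEV_NAME. */
#define TOY_DIR_NAME "/dev/dri"
#define TOY_DEV_NAME "%s/card%d"

/* Formats the device node path for the given minor into buf.
 * Returns true only if the whole path fit. */
static bool
toy_format_card_path(char *buf, size_t size, int minor)
{
   assert(buf != NULL && size > 0);
   int n = snprintf(buf, size, TOY_DEV_NAME, TOY_DIR_NAME, minor);
   /* snprintf returns the length it wanted to write: a negative value is an
    * output error, and n >= size means the result was truncated. */
   return n >= 0 && (size_t)n < size;
}
```

With these assumed values the name built for minor 0 is already /dev/dri/card0, so a second attempt with that literal path would try the same node again.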



Re: [Mesa-dev] [PATCH 2/2] mesa: Implement GL_ARB_polygon_offset_clamp

2017-08-24 Thread Ilia Mirkin
On Thu, Aug 24, 2017 at 2:40 PM, Adam Jackson  wrote:
> Semantically identical to the EXT version (whose string is still valid
> for GLES), so rename the bit but expose both extension strings.
> (Suggested by Ilia Mirkin and Ian Romanick.)
>
> Signed-off-by: Adam Jackson 
> ---
>  docs/features.txt|  2 +-
>  docs/relnotes/17.3.0.html|  1 +
>  src/mapi/glapi/gen/GL4x.xml  |  9 +
>  src/mesa/drivers/dri/i965/intel_extensions.c |  2 +-
>  src/mesa/main/dlist.c| 10 +-
>  src/mesa/main/extensions_table.h |  3 ++-
>  src/mesa/main/get.c  |  2 +-
>  src/mesa/main/get_hash_params.py |  4 ++--
>  src/mesa/main/mtypes.h   |  2 +-
>  src/mesa/main/polygon.c  | 15 +--
>  src/mesa/main/polygon.h  |  3 +++
>  src/mesa/main/version.c  |  2 +-
>  src/mesa/state_tracker/st_extensions.c   |  2 +-
>  13 files changed, 37 insertions(+), 20 deletions(-)
>
> diff --git a/docs/features.txt b/docs/features.txt
> index 3f91c2daae..0435ce61ff 100644
> --- a/docs/features.txt
> +++ b/docs/features.txt
> @@ -226,7 +226,7 @@ GL 4.6, GLSL 4.60
>GL_ARB_gl_spirv   in progress (Nicolai 
> Hähnle, Ian Romanick)
>GL_ARB_indirect_parametersDONE (nvc0, radeonsi)
>GL_ARB_pipeline_statistics_query  DONE (i965, nvc0, 
> radeonsi, softpipe, swr)
> -  GL_ARB_polygon_offset_clamp   not started
> +  GL_ARB_polygon_offset_clamp   DONE (i965, nv50, 
> nvc0, r600, radeonsi, llvmpipe, swr)
>GL_ARB_shader_atomic_counter_ops  DONE (i965/gen7+, 
> nvc0, radeonsi, softpipe)
>GL_ARB_shader_draw_parameters DONE (i965, nvc0, 
> radeonsi)
>GL_ARB_shader_group_vote  DONE (i965, nvc0, 
> radeonsi)
> diff --git a/docs/relnotes/17.3.0.html b/docs/relnotes/17.3.0.html
> index 8da43f22f0..4a74284632 100644
> --- a/docs/relnotes/17.3.0.html
> +++ b/docs/relnotes/17.3.0.html
> @@ -44,6 +44,7 @@ Note: some of the new features are only available with 
> certain drivers.
>  
>
>  
> +GL_ARB_polygon_offset_clamp on i965, nv50, nvc0, r600, radeonsi, 
> llvmpipe, swr
>  GL_ARB_transform_feedback_overflow_query on radeonsi
>  GL_ARB_texture_filter_anisotropic on i965, nv50, nvc0, r600, 
> radeonsi
>  GL_EXT_memory_object on radeonsi
> diff --git a/src/mapi/glapi/gen/GL4x.xml b/src/mapi/glapi/gen/GL4x.xml
> index e958ee70c7..9e7685e5fd 100644
> --- a/src/mapi/glapi/gen/GL4x.xml
> +++ b/src/mapi/glapi/gen/GL4x.xml
> @@ -66,4 +66,13 @@
>
>  
>
> +
> +  
> +
> +
> +
> +  
> +  
> +
> +
>  
> diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c 
> b/src/mesa/drivers/dri/i965/intel_extensions.c
> index c3cd8004a1..deacd0d9df 100644
> --- a/src/mesa/drivers/dri/i965/intel_extensions.c
> +++ b/src/mesa/drivers/dri/i965/intel_extensions.c
> @@ -66,6 +66,7 @@ intelInitExtensions(struct gl_context *ctx)
> ctx->Extensions.ARB_occlusion_query = true;
> ctx->Extensions.ARB_occlusion_query2 = true;
> ctx->Extensions.ARB_point_sprite = true;
> +   ctx->Extensions.ARB_polygon_offset_clamp = true;
> ctx->Extensions.ARB_seamless_cube_map = true;
> ctx->Extensions.ARB_shader_bit_encoding = true;
> ctx->Extensions.ARB_shader_draw_parameters = true;
> @@ -100,7 +101,6 @@ intelInitExtensions(struct gl_context *ctx)
> ctx->Extensions.EXT_packed_float = true;
> ctx->Extensions.EXT_pixel_buffer_object = true;
> ctx->Extensions.EXT_point_parameters = true;
> -   ctx->Extensions.EXT_polygon_offset_clamp = true;
> ctx->Extensions.EXT_provoking_vertex = true;
> ctx->Extensions.EXT_stencil_two_side = true;
> ctx->Extensions.EXT_texture_array = true;
> diff --git a/src/mesa/main/dlist.c b/src/mesa/main/dlist.c
> index 208471aca7..b710e1b263 100644
> --- a/src/mesa/main/dlist.c
> +++ b/src/mesa/main/dlist.c
> @@ -3504,7 +3504,7 @@ save_PolygonOffsetEXT(GLfloat factor, GLfloat bias)
>  }
>
>  static void GLAPIENTRY
> -save_PolygonOffsetClampEXT(GLfloat factor, GLfloat units, GLfloat clamp)
> +save_PolygonOffsetClamp(GLfloat factor, GLfloat units, GLfloat clamp)
>  {
> GET_CURRENT_CONTEXT(ctx);
> Node *n;
> @@ -3516,7 +3516,7 @@ save_PolygonOffsetClampEXT(GLfloat factor, GLfloat 
> units, GLfloat clamp)
>n[3].f = clamp;
> }
> if (ctx->ExecuteFlag) {
> -  CALL_PolygonOffsetClampEXT(ctx->Exec, (factor, units, clamp));
> +  CALL_PolygonOffsetClamp(ctx->Exec, (factor, units, clamp));
> }
>  }
>
> @@ -10062,11 +10062,11 @@ _mesa_initialize_save_table(const struct gl_context 
> *ctx)
> SET_ProgramUniformMatrix3x4fv(table, save_ProgramUniformMatrix3x4fv);
> SET_ProgramUniformMatrix4x3fv(table, 

[Mesa-dev] [PATCH] st/va: move YUV content to deinterlaced buffer when reallocated for encoder

2017-08-24 Thread Leo Liu
v2: use deinterlace common function
v3: make sure deinterlace only

Signed-off-by: Leo Liu 
---
 src/gallium/state_trackers/va/picture.c | 22 --
 1 file changed, 16 insertions(+), 6 deletions(-)

diff --git a/src/gallium/state_trackers/va/picture.c b/src/gallium/state_trackers/va/picture.c
index 6c3c4fe..aa4062d 100644
--- a/src/gallium/state_trackers/va/picture.c
+++ b/src/gallium/state_trackers/va/picture.c
@@ -613,17 +613,22 @@ vlVaEndPicture(VADriverContextP ctx, VAContextID context_id)
    mtx_lock(&drv->mutex);
surf = handle_table_get(drv->htab, context->target_id);
context->mpeg4.frame_num++;
-
screen = context->decoder->context->screen;
interlaced = screen->get_video_param(screen, context->decoder->profile,
 context->decoder->entrypoint,
 PIPE_VIDEO_CAP_SUPPORTS_INTERLACED);
 
if (surf->buffer->interlaced != interlaced) {
-  surf->templat.interlaced = screen->get_video_param(screen, 
context->decoder->profile,
- 
PIPE_VIDEO_ENTRYPOINT_BITSTREAM,
- 
PIPE_VIDEO_CAP_PREFERS_INTERLACED);
-  realloc = true;
+  interlaced = screen->get_video_param(screen, context->decoder->profile,
+   context->decoder->entrypoint,
+   PIPE_VIDEO_CAP_PREFERS_INTERLACED);
+  if (!interlaced) {
+ /* The current cases for buffer reallocation are
+all from the interlaced to the deinterlaced,
+and there is no case for the other way around */
+ surf->templat.interlaced = false;
+ realloc = true;
+  }
}
 
if (u_reduce_video_profile(context->templat.profile) == 
PIPE_VIDEO_FORMAT_JPEG &&
@@ -640,13 +645,18 @@ vlVaEndPicture(VADriverContextP ctx, VAContextID 
context_id)
}
 
if (realloc) {
-  surf->buffer->destroy(surf->buffer);
+  struct pipe_video_buffer *old_buf = surf->buffer;
 
       if (vlVaHandleSurfaceAllocate(ctx, surf, &surf->templat) != VA_STATUS_SUCCESS) {
+         old_buf->destroy(old_buf);
          mtx_unlock(&drv->mutex);
  return VA_STATUS_ERROR_ALLOCATION_FAILED;
   }
 
+  if (context->decoder->entrypoint == PIPE_VIDEO_ENTRYPOINT_ENCODE)
+         vl_compositor_yuv_deint(&drv->cstate, &drv->compositor, old_buf, surf->buffer);
+
+  old_buf->destroy(old_buf);
   context->target = surf->buffer;
}
 
-- 
2.7.4
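
The ordering change in the realloc path above (keep the old buffer alive until its content has been moved, and destroy it on both the success and the failure path) can be sketched in plain C as follows. This is a toy model with hypothetical names and heap memory standing in for video buffers; it is not the state tracker code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a video buffer. */
struct toy_buffer {
   size_t size;
   unsigned char *data;
};

static struct toy_buffer *
toy_buffer_create(size_t size)
{
   assert(size > 0);
   struct toy_buffer *buf = malloc(sizeof(*buf));
   if (!buf)
      return NULL;
   buf->data = calloc(size, 1);
   if (!buf->data) {
      free(buf);
      return NULL;
   }
   buf->size = size;
   return buf;
}

static void
toy_buffer_destroy(struct toy_buffer *buf)
{
   free(buf->data);
   free(buf);
}

/* Replaces *slot with a freshly allocated buffer of new_size. The old buffer
 * is destroyed only after allocation has been attempted and, if requested,
 * its content has been copied over. */
static bool
toy_buffer_reallocate(struct toy_buffer **slot, size_t new_size, bool keep_content)
{
   struct toy_buffer *old_buf = *slot;
   struct toy_buffer *new_buf = toy_buffer_create(new_size);

   if (!new_buf) {
      toy_buffer_destroy(old_buf);
      *slot = NULL;
      return false;
   }

   if (keep_content) {
      size_t n = old_buf->size < new_size ? old_buf->size : new_size;
      memcpy(new_buf->data, old_buf->data, n);
   }

   toy_buffer_destroy(old_buf);
   *slot = new_buf;
   return true;
}
```

Destroying the old buffer before the new one exists, as the pre-patch code did, would leave nothing to copy from.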



Re: [Mesa-dev] [PATCH 1/3] st/omx: move YUV deinterlace function to common

2017-08-24 Thread Leo Liu



On 08/24/2017 11:34 AM, Christian König wrote:

On 24.08.2017 at 17:11, Leo Liu wrote:

Signed-off-by: Leo Liu 


Reviewed-by: Christian König  for the series.

Andy do you want to test this? Should make VA-API transcoding simpler 
to use.


I just got a chance to test transcoding (previously I had only tested
encoding). There is an issue with the current patch 2: the encoder and
decoder use deinterlaced and interlaced buffers, respectively.

v3 will address that, and performance stays the same as before.

Regards,
Leo






Regards,
Christian.


---
  src/gallium/auxiliary/vl/vl_compositor.c | 87 
+---

  src/gallium/auxiliary/vl/vl_compositor.h | 21 
  src/gallium/state_trackers/omx/vid_dec.c | 32 +---
  3 files changed, 68 insertions(+), 72 deletions(-)

diff --git a/src/gallium/auxiliary/vl/vl_compositor.c 
b/src/gallium/auxiliary/vl/vl_compositor.c

index a79bf11264..794c8b5b17 100644
--- a/src/gallium/auxiliary/vl/vl_compositor.c
+++ b/src/gallium/auxiliary/vl/vl_compositor.c
@@ -885,6 +885,32 @@ draw_layers(struct vl_compositor *c, struct 
vl_compositor_state *s, struct u_rec

 }
  }
  +static void
+set_yuv_layer(struct vl_compositor_state *s, struct vl_compositor *c, unsigned layer,
+              struct pipe_video_buffer *buffer, struct u_rect *src_rect,
+              struct u_rect *dst_rect, bool y)
+{
+   struct pipe_sampler_view **sampler_views;
+   unsigned i;
+
+   assert(s && c && buffer);
+
+   assert(layer < VL_COMPOSITOR_MAX_LAYERS);
+
+   s->used_layers |= 1 << layer;
+   sampler_views = buffer->get_sampler_view_components(buffer);
+   for (i = 0; i < 3; ++i) {
+      s->layers[layer].samplers[i] = c->sampler_linear;
+      pipe_sampler_view_reference(&s->layers[layer].sampler_views[i], sampler_views[i]);
+   }
+
+   calc_src_and_dst(&s->layers[layer], buffer->width, buffer->height,
+                    src_rect ? *src_rect : default_rect(&s->layers[layer]),
+                    dst_rect ? *dst_rect : default_rect(&s->layers[layer]));
+
+   s->layers[layer].fs = (y) ? c->fs_weave_yuv.y : c->fs_weave_yuv.uv;
+}
+
  void
  vl_compositor_reset_dirty_area(struct u_rect *dirty)
  {
@@ -1143,36 +1169,6 @@ vl_compositor_set_layer_rotation(struct 
vl_compositor_state *s,

  }
void
-vl_compositor_set_yuv_layer(struct vl_compositor_state *s,
-struct vl_compositor *c,
-unsigned layer,
-struct pipe_video_buffer *buffer,
-struct u_rect *src_rect,
-struct u_rect *dst_rect,
-bool y)
-{
-   struct pipe_sampler_view **sampler_views;
-   unsigned i;
-
-   assert(s && c && buffer);
-
-   assert(layer < VL_COMPOSITOR_MAX_LAYERS);
-
-   s->used_layers |= 1 << layer;
-   sampler_views = buffer->get_sampler_view_components(buffer);
-   for (i = 0; i < 3; ++i) {
-  s->layers[layer].samplers[i] = c->sampler_linear;
-      pipe_sampler_view_reference(&s->layers[layer].sampler_views[i], sampler_views[i]);
-   }
-
-   calc_src_and_dst(&s->layers[layer], buffer->width, buffer->height,
-                    src_rect ? *src_rect : default_rect(&s->layers[layer]),
-                    dst_rect ? *dst_rect : default_rect(&s->layers[layer]));
-
-   s->layers[layer].fs = (y) ? c->fs_weave_yuv.y : c->fs_weave_yuv.uv;
-}
-
-void
  vl_compositor_render(struct vl_compositor_state *s,
   struct vl_compositor   *c,
   struct pipe_surface*dst_surface,
@@ -1215,6 +1211,37 @@ vl_compositor_render(struct 
vl_compositor_state *s,

 draw_layers(c, s, dirty_area);
  }
  +void
+vl_compositor_yuv_deint(struct vl_compositor_state *s,
+struct vl_compositor *c,
+struct pipe_video_buffer *src,
+struct pipe_video_buffer *dst)
+{
+   struct pipe_surface **dst_surfaces;
+   struct u_rect dst_rect;
+
+   dst_surfaces = dst->get_surfaces(dst);
+   vl_compositor_clear_layers(s);
+
+   dst_rect.x0 = 0;
+   dst_rect.x1 = src->width;
+   dst_rect.y0 = 0;
+   dst_rect.y1 = src->height;
+
+   set_yuv_layer(s, c, 0, src, NULL, NULL, true);
+   vl_compositor_set_layer_dst_area(s, 0, &dst_rect);
+   vl_compositor_render(s, c, dst_surfaces[0], NULL, false);
+
+   dst_rect.x1 /= 2;
+   dst_rect.y1 /= 2;
+
+   set_yuv_layer(s, c, 0, src, NULL, NULL, false);
+   vl_compositor_set_layer_dst_area(s, 0, &dst_rect);
+   vl_compositor_render(s, c, dst_surfaces[1], NULL, false);
+
+   s->pipe->flush(s->pipe, NULL, 0);
+}
+
  bool
  vl_compositor_init(struct vl_compositor *c, struct pipe_context *pipe)
  {
diff --git a/src/gallium/auxiliary/vl/vl_compositor.h 
b/src/gallium/auxiliary/vl/vl_compositor.h

index 535abb75cd..2546d75b23 100644
--- a/src/gallium/auxiliary/vl/vl_compositor.h
+++ b/src/gallium/auxiliary/vl/vl_compositor.h
@@ -240,18 +240,6 @@ vl_compositor_set_layer_rotation(struct 
vl_compositor_state *state,


Re: [Mesa-dev] [PATCH mesa] egl/android: add missing include

2017-08-24 Thread Rob Herring
On Thu, Aug 24, 2017 at 10:08 AM, Eric Engestrom
 wrote:
> On Thursday, 2017-08-24 10:02:29 -0500, Rob Herring wrote:
>> On Thu, Aug 24, 2017 at 9:26 AM, Rob Herring  wrote:
>> > On Thu, Aug 24, 2017 at 9:22 AM, Eric Engestrom
>> >  wrote:
>> >> Cc: Rob Herring 
>> >> Signed-off-by: Eric Engestrom 
>> >> ---
>> >> This needs to land before [1], otherwise the latter will break android.
>> >>
>> >> [1] 
>> >> https://lists.freedesktop.org/archives/mesa-dev/2017-August/167428.html
>> >>
>> >>  src/egl/drivers/dri2/platform_android.c | 1 +
>> >>  1 file changed, 1 insertion(+)
>> >
>> > For both patches:
>> >
>> > Reviewed-by: Rob Herring 
>>
>> Actually, on further examination I think this isn't needed.
>
> I pushed it a minute before your email: 688d866eca
>
> (I'm waiting on Emil (and maybe others?) to chime in on the khronos
> header change before I push it.)
>
>> egl_dri2.h includes system/window.h which in turn includes
>> native_window.h. I'll test out the EGL change without this.
>
> Personally, I like the include-what-you-use philosophy [1], so I'd
> rather have this include here anyway.

I agree with that, but we don't actually use anything in
android/native_window.h. native_window.h only provides an opaque
struct. Everything we use is in system/window.h.
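
As a quick illustration of what "only provides an opaque struct" means in practice, here is a self-contained sketch in C. All names are hypothetical stand-ins and nothing here is taken from the Android headers; the point is only that code holding a pointer never needs the full definition.

```c
#include <assert.h>
#include <stddef.h>

/* What a header that only exposes an opaque type provides: just the name. */
struct toy_native_window;

/* Client code can store and pass pointers to the opaque type without ever
 * seeing its definition. */
struct toy_surface {
   struct toy_native_window *window;
};

static void
toy_surface_init(struct toy_surface *surf, struct toy_native_window *window)
{
   assert(surf != NULL);
   surf->window = window;
}

/* Only code that looks inside the object needs the full definition, which in
 * this analogy would come from a different header. */
struct toy_native_window {
   int width;
   int height;
};

static int
toy_window_width(const struct toy_native_window *window)
{
   return window->width;
}
```

Everything above the struct definition compiles with the forward declaration alone, which is why a header that only names the type adds nothing for code that already gets the full definition elsewhere.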

With your eglplatform.h change alone and without my O fix, O builds
for me. This is because they kept system/window.h for compatibility.
Really, we're supposed to move to vndk/window.h and would need
libnativewindow to pull that in. However, as long as we support pre-O,
we're not going to be changing.

Rob


[Mesa-dev] [PATCH 2/2] mesa: Implement GL_ARB_polygon_offset_clamp

2017-08-24 Thread Adam Jackson
Semantically identical to the EXT version (whose string is still valid
for GLES), so rename the bit but expose both extension strings.
(Suggested by Ilia Mirkin and Ian Romanick.)
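
For reference, the clamp behaviour shared by the EXT and ARB versions can be sketched in C as below. This follows the offset formula from the extension specification as I understand it; the function name is hypothetical, m is the maximum depth slope of the polygon and r is the minimum resolvable depth difference, both supplied by the caller here.

```c
#include <assert.h>

/* Computes the polygon depth offset with an optional clamp:
 *   clamp > 0: the offset is limited from above by clamp
 *   clamp < 0: the offset is limited from below by clamp
 *   clamp == 0 or NaN: no clamping, same as plain glPolygonOffset */
static float
toy_polygon_offset(float m, float r, float factor, float units, float clamp)
{
   assert(r >= 0.0f);
   float o = m * factor + r * units;

   if (clamp > 0.0f)
      return o < clamp ? o : clamp;
   if (clamp < 0.0f)
      return o > clamp ? o : clamp;
   return o; /* both comparisons are false for zero and for NaN */
}
```

Since both extension strings describe this same function, a single enable bit is enough to back them.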

Signed-off-by: Adam Jackson 
---
 docs/features.txt|  2 +-
 docs/relnotes/17.3.0.html|  1 +
 src/mapi/glapi/gen/GL4x.xml  |  9 +
 src/mesa/drivers/dri/i965/intel_extensions.c |  2 +-
 src/mesa/main/dlist.c| 10 +-
 src/mesa/main/extensions_table.h |  3 ++-
 src/mesa/main/get.c  |  2 +-
 src/mesa/main/get_hash_params.py |  4 ++--
 src/mesa/main/mtypes.h   |  2 +-
 src/mesa/main/polygon.c  | 15 +--
 src/mesa/main/polygon.h  |  3 +++
 src/mesa/main/version.c  |  2 +-
 src/mesa/state_tracker/st_extensions.c   |  2 +-
 13 files changed, 37 insertions(+), 20 deletions(-)

diff --git a/docs/features.txt b/docs/features.txt
index 3f91c2daae..0435ce61ff 100644
--- a/docs/features.txt
+++ b/docs/features.txt
@@ -226,7 +226,7 @@ GL 4.6, GLSL 4.60
   GL_ARB_gl_spirv   in progress (Nicolai 
Hähnle, Ian Romanick)
   GL_ARB_indirect_parametersDONE (nvc0, radeonsi)
   GL_ARB_pipeline_statistics_query  DONE (i965, nvc0, 
radeonsi, softpipe, swr)
-  GL_ARB_polygon_offset_clamp   not started
+  GL_ARB_polygon_offset_clamp   DONE (i965, nv50, 
nvc0, r600, radeonsi, llvmpipe, swr)
   GL_ARB_shader_atomic_counter_ops  DONE (i965/gen7+, 
nvc0, radeonsi, softpipe)
   GL_ARB_shader_draw_parameters DONE (i965, nvc0, 
radeonsi)
   GL_ARB_shader_group_vote  DONE (i965, nvc0, 
radeonsi)
diff --git a/docs/relnotes/17.3.0.html b/docs/relnotes/17.3.0.html
index 8da43f22f0..4a74284632 100644
--- a/docs/relnotes/17.3.0.html
+++ b/docs/relnotes/17.3.0.html
@@ -44,6 +44,7 @@ Note: some of the new features are only available with 
certain drivers.
 
 
 
+GL_ARB_polygon_offset_clamp on i965, nv50, nvc0, r600, radeonsi, llvmpipe, 
swr
 GL_ARB_transform_feedback_overflow_query on radeonsi
 GL_ARB_texture_filter_anisotropic on i965, nv50, nvc0, r600, radeonsi
 GL_EXT_memory_object on radeonsi
diff --git a/src/mapi/glapi/gen/GL4x.xml b/src/mapi/glapi/gen/GL4x.xml
index e958ee70c7..9e7685e5fd 100644
--- a/src/mapi/glapi/gen/GL4x.xml
+++ b/src/mapi/glapi/gen/GL4x.xml
@@ -66,4 +66,13 @@
   
 
 
+
+  
+
+
+
+  
+  
+
+
 
diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c b/src/mesa/drivers/dri/i965/intel_extensions.c
index c3cd8004a1..deacd0d9df 100644
--- a/src/mesa/drivers/dri/i965/intel_extensions.c
+++ b/src/mesa/drivers/dri/i965/intel_extensions.c
@@ -66,6 +66,7 @@ intelInitExtensions(struct gl_context *ctx)
ctx->Extensions.ARB_occlusion_query = true;
ctx->Extensions.ARB_occlusion_query2 = true;
ctx->Extensions.ARB_point_sprite = true;
+   ctx->Extensions.ARB_polygon_offset_clamp = true;
ctx->Extensions.ARB_seamless_cube_map = true;
ctx->Extensions.ARB_shader_bit_encoding = true;
ctx->Extensions.ARB_shader_draw_parameters = true;
@@ -100,7 +101,6 @@ intelInitExtensions(struct gl_context *ctx)
ctx->Extensions.EXT_packed_float = true;
ctx->Extensions.EXT_pixel_buffer_object = true;
ctx->Extensions.EXT_point_parameters = true;
-   ctx->Extensions.EXT_polygon_offset_clamp = true;
ctx->Extensions.EXT_provoking_vertex = true;
ctx->Extensions.EXT_stencil_two_side = true;
ctx->Extensions.EXT_texture_array = true;
diff --git a/src/mesa/main/dlist.c b/src/mesa/main/dlist.c
index 208471aca7..b710e1b263 100644
--- a/src/mesa/main/dlist.c
+++ b/src/mesa/main/dlist.c
@@ -3504,7 +3504,7 @@ save_PolygonOffsetEXT(GLfloat factor, GLfloat bias)
 }
 
 static void GLAPIENTRY
-save_PolygonOffsetClampEXT(GLfloat factor, GLfloat units, GLfloat clamp)
+save_PolygonOffsetClamp(GLfloat factor, GLfloat units, GLfloat clamp)
 {
GET_CURRENT_CONTEXT(ctx);
Node *n;
@@ -3516,7 +3516,7 @@ save_PolygonOffsetClampEXT(GLfloat factor, GLfloat units, 
GLfloat clamp)
   n[3].f = clamp;
}
if (ctx->ExecuteFlag) {
-  CALL_PolygonOffsetClampEXT(ctx->Exec, (factor, units, clamp));
+  CALL_PolygonOffsetClamp(ctx->Exec, (factor, units, clamp));
}
 }
 
@@ -10062,11 +10062,11 @@ _mesa_initialize_save_table(const struct gl_context 
*ctx)
SET_ProgramUniformMatrix3x4fv(table, save_ProgramUniformMatrix3x4fv);
SET_ProgramUniformMatrix4x3fv(table, save_ProgramUniformMatrix4x3fv);
 
-   /* GL_EXT_polygon_offset_clamp */
-   SET_PolygonOffsetClampEXT(table, save_PolygonOffsetClampEXT);
-
/* GL_EXT_window_rectangles */
SET_WindowRectanglesEXT(table, save_WindowRectanglesEXT);
+
+   /* GL_ARB_polygon_offset_clamp */
+   SET_PolygonOffsetClamp(table, 

[Mesa-dev] [PATCH 1/2] mesa: Implement GL_ARB_texture_filter_anisotropic

2017-08-24 Thread Adam Jackson
The only difference from the EXT version is bumping the minmax to 16, so
just hit all the drivers at once.

v2: Fix driver names, add to 17.3 release notes (Ilia Mirkin)

Reviewed-by: Ilia Mirkin 
Signed-off-by: Adam Jackson 
---
 docs/features.txt| 4 +++-
 docs/relnotes/17.3.0.html| 1 +
 src/glx/glxextensions.c  | 1 +
 src/glx/glxextensions.h  | 1 +
 src/mesa/drivers/dri/i965/intel_extensions.c | 1 +
 src/mesa/drivers/dri/r200/r200_context.c | 1 +
 src/mesa/drivers/dri/radeon/radeon_context.c | 1 +
 src/mesa/main/extensions.c   | 1 +
 src/mesa/main/extensions_table.h | 1 +
 src/mesa/main/mtypes.h   | 1 +
 src/mesa/main/version.c  | 2 +-
 src/mesa/state_tracker/st_extensions.c   | 4 
 12 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/docs/features.txt b/docs/features.txt
index 6f57ec26fd..3f91c2daae 100644
--- a/docs/features.txt
+++ b/docs/features.txt
@@ -231,10 +231,12 @@ GL 4.6, GLSL 4.60
   GL_ARB_shader_draw_parameters DONE (i965, nvc0, 
radeonsi)
   GL_ARB_shader_group_vote  DONE (i965, nvc0, 
radeonsi)
   GL_ARB_spirv_extensions   in progress (Nicolai 
Hähnle, Ian Romanick)
-  GL_ARB_texture_filter_anisotropic not started
+  GL_ARB_texture_filter_anisotropic DONE (i965, nv50, 
nvc0, r600, radeonsi, softpipe (*), llvmpipe (*))
   GL_ARB_transform_feedback_overflow_query  DONE (i965/gen6+, 
radeonsi, llvmpipe, softpipe)
   GL_KHR_no_error   started (Timothy 
Arceri)
 
+(*) softpipe and llvmpipe advertise 16x anisotropy but simply ignore the 
setting
+
 These are the extensions cherry-picked to make GLES 3.1
 GLES3.1, GLSL ES 3.1 -- all DONE: i965/hsw+, nvc0, radeonsi
 
diff --git a/docs/relnotes/17.3.0.html b/docs/relnotes/17.3.0.html
index 25d02cdca7..8da43f22f0 100644
--- a/docs/relnotes/17.3.0.html
+++ b/docs/relnotes/17.3.0.html
@@ -45,6 +45,7 @@ Note: some of the new features are only available with 
certain drivers.
 
 
 GL_ARB_transform_feedback_overflow_query on radeonsi
+GL_ARB_texture_filter_anisotropic on i965, nv50, nvc0, r600, radeonsi
 GL_EXT_memory_object on radeonsi
 GL_EXT_memory_object_fd on radeonsi
 
diff --git a/src/glx/glxextensions.c b/src/glx/glxextensions.c
index 22b078ce48..88bf0de3e6 100644
--- a/src/glx/glxextensions.c
+++ b/src/glx/glxextensions.c
@@ -190,6 +190,7 @@ static const struct extension_info known_gl_extensions[] = {
{ GL(ARB_texture_env_combine),VER(1,3), Y, N, N, N },
{ GL(ARB_texture_env_crossbar),   VER(1,4), Y, N, N, N },
{ GL(ARB_texture_env_dot3),   VER(1,3), Y, N, N, N },
+   { GL(ARB_texture_filter_anisotropic), VER(0,0), Y, N, N, N },
{ GL(ARB_texture_mirrored_repeat),VER(1,4), Y, N, N, N },
{ GL(ARB_texture_non_power_of_two),   VER(1,5), Y, N, N, N },
{ GL(ARB_texture_rectangle),  VER(0,0), Y, N, N, N },
diff --git a/src/glx/glxextensions.h b/src/glx/glxextensions.h
index 21ad02a44b..2a595516ee 100644
--- a/src/glx/glxextensions.h
+++ b/src/glx/glxextensions.h
@@ -101,6 +101,7 @@ enum
GL_ARB_texture_env_combine_bit,
GL_ARB_texture_env_crossbar_bit,
GL_ARB_texture_env_dot3_bit,
+   GL_ARB_texture_filter_anisotropic_bit,
GL_ARB_texture_mirrored_repeat_bit,
GL_ARB_texture_non_power_of_two_bit,
GL_ARB_texture_rectangle_bit,
diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c b/src/mesa/drivers/dri/i965/intel_extensions.c
index b91bbdc8d9..c3cd8004a1 100644
--- a/src/mesa/drivers/dri/i965/intel_extensions.c
+++ b/src/mesa/drivers/dri/i965/intel_extensions.c
@@ -80,6 +80,7 @@ intelInitExtensions(struct gl_context *ctx)
ctx->Extensions.ARB_texture_env_combine = true;
ctx->Extensions.ARB_texture_env_crossbar = true;
ctx->Extensions.ARB_texture_env_dot3 = true;
+   ctx->Extensions.ARB_texture_filter_anisotropic = true;
ctx->Extensions.ARB_texture_float = true;
ctx->Extensions.ARB_texture_mirror_clamp_to_edge = true;
ctx->Extensions.ARB_texture_non_power_of_two = true;
diff --git a/src/mesa/drivers/dri/r200/r200_context.c b/src/mesa/drivers/dri/r200/r200_context.c
index ca1023c5c3..0a27985de7 100644
--- a/src/mesa/drivers/dri/r200/r200_context.c
+++ b/src/mesa/drivers/dri/r200/r200_context.c
@@ -339,6 +339,7 @@ GLboolean r200CreateContext( gl_api api,
ctx->Extensions.ARB_texture_env_combine = true;
ctx->Extensions.ARB_texture_env_dot3 = true;
ctx->Extensions.ARB_texture_env_crossbar = true;
+   ctx->Extensions.ARB_texture_filter_anisotropic = true;
ctx->Extensions.ARB_texture_mirror_clamp_to_edge = true;
ctx->Extensions.ARB_vertex_program = true;
ctx->Extensions.ATI_fragment_shader = (ctx->Const.MaxTextureUnits == 6);
diff --git 

Re: [Mesa-dev] [PATCH 08/10] anv: Use DRM sync objects to back fences whenever possible

2017-08-24 Thread Jason Ekstrand
On Thu, Aug 24, 2017 at 10:39 AM, Jason Ekstrand 
wrote:

> On Thu, Aug 24, 2017 at 10:20 AM, Lionel Landwerlin <
> lionel.g.landwer...@intel.com> wrote:
>
>> On 08/08/17 23:45, Jason Ekstrand wrote:
>>
>>> In order to implement VK_KHR_external_fence, we need to back our fences
>>> with something that's shareable.  Since the kernel wait interface for
>>> sync objects already supports waiting for multiple fences in one go, it
>>> makes anv_WaitForFences much simpler if we only have one type of fence.
>>> ---
>>>   src/intel/vulkan/anv_batch_chain.c |   8 +++
>>>   src/intel/vulkan/anv_device.c  |   2 +
>>>   src/intel/vulkan/anv_private.h |   4 ++
>>>   src/intel/vulkan/anv_queue.c   | 132 ++
>>> ---
>>>   4 files changed, 136 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/src/intel/vulkan/anv_batch_chain.c
>>> b/src/intel/vulkan/anv_batch_chain.c
>>> index 5d876e4..15082b5 100644
>>> --- a/src/intel/vulkan/anv_batch_chain.c
>>> +++ b/src/intel/vulkan/anv_batch_chain.c
>>> @@ -1556,6 +1556,14 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
>>>   return result;
>>>break;
>>>   +  case ANV_FENCE_TYPE_SYNCOBJ:
>>> + result = anv_execbuf_add_syncobj(, impl->syncobj,
>>> +  I915_EXEC_FENCE_SIGNAL,
>>> +  >alloc);
>>> + if (result != VK_SUCCESS)
>>> +return result;
>>> + break;
>>> +
>>> default:
>>>unreachable("Invalid fence type");
>>> }
>>> diff --git a/src/intel/vulkan/anv_device.c
>>> b/src/intel/vulkan/anv_device.c
>>> index a6d5215..2e0fa19 100644
>>> --- a/src/intel/vulkan/anv_device.c
>>> +++ b/src/intel/vulkan/anv_device.c
>>> @@ -339,6 +339,8 @@ anv_physical_device_init(struct anv_physical_device
>>> *device,
>>>  device->has_exec_async = anv_gem_get_param(fd,
>>> I915_PARAM_HAS_EXEC_ASYNC);
>>>  device->has_exec_fence = anv_gem_get_param(fd,
>>> I915_PARAM_HAS_EXEC_FENCE);
>>>  device->has_syncobj = anv_gem_get_param(fd,
>>> I915_PARAM_HAS_EXEC_FENCE_ARRAY);
>>> +   device->has_syncobj_wait = device->has_syncobj &&
>>> +  anv_gem_supports_syncobj_wait(fd);
>>>bool swizzled = anv_gem_get_bit6_swizzle(fd, I915_TILING_X);
>>>   diff --git a/src/intel/vulkan/anv_private.h
>>> b/src/intel/vulkan/anv_private.h
>>> index 2f89d3f..430652d 100644
>>> --- a/src/intel/vulkan/anv_private.h
>>> +++ b/src/intel/vulkan/anv_private.h
>>> @@ -654,6 +654,7 @@ struct anv_physical_device {
>>>   boolhas_exec_async;
>>>   boolhas_exec_fence;
>>>   boolhas_syncobj;
>>> +boolhas_syncobj_wait;
>>> uint32_teu_total;
>>>   uint32_tsubslice_total;
>>> @@ -1755,6 +1756,9 @@ struct anv_fence_impl {
>>>struct anv_bo bo;
>>>enum anv_bo_fence_state state;
>>> } bo;
>>> +
>>> +  /** DRM syncobj handle for syncobj-based fences */
>>> +  uint32_t syncobj;
>>>  };
>>>   };
>>>   diff --git a/src/intel/vulkan/anv_queue.c
>>> b/src/intel/vulkan/anv_queue.c
>>> index 7348e15..8e45bb2 100644
>>> --- a/src/intel/vulkan/anv_queue.c
>>> +++ b/src/intel/vulkan/anv_queue.c
>>> @@ -271,17 +271,25 @@ VkResult anv_CreateFence(
>>>  if (fence == NULL)
>>> return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
>>>   -   fence->permanent.type = ANV_FENCE_TYPE_BO;
>>> +   if (device->instance->physicalDevice.has_syncobj_wait) {
>>> +  fence->permanent.type = ANV_FENCE_TYPE_SYNCOBJ;
>>>   -   VkResult result = anv_bo_pool_alloc(>batch_bo_pool,
>>> -   >permanent.bo.bo, 4096);
>>> -   if (result != VK_SUCCESS)
>>> -  return result;
>>> -
>>> -   if (pCreateInfo->flags & VK_FENCE_CREATE_SIGNALED_BIT) {
>>> -  fence->permanent.bo.state = ANV_BO_FENCE_STATE_SIGNALED;
>>> +  fence->permanent.syncobj = anv_gem_syncobj_create(device);
>>> +  if (!fence->permanent.syncobj)
>>> + return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
>>>
>>
>> Don't you need to do something when the fence is created with the
>> signaled bit with drm syncobj?
>> I didn't see anything in the spec that would make this illegal so I
>> assume we have to handle it.
>>
>
> Hrm... Yes, I think we do.  Unfortunately, that's going to require
> additional kernel API. :(  Thanks for catching that, I'll work on it today.
>

Correction:  This won't require more kernel API.  I can fire off a dummy
execbuf to trigger the fence in the create function.  It's just going to
require more kernel API to do cleanly.

--Jason


>
> --Jason
>
>
>>  } else {
>>> -  fence->permanent.bo.state = ANV_BO_FENCE_STATE_RESET;
>>> +  fence->permanent.type 

Re: [Mesa-dev] [PATCH 08/10] anv: Use DRM sync objects to back fences whenever possible

2017-08-24 Thread Jason Ekstrand
On Thu, Aug 24, 2017 at 10:20 AM, Lionel Landwerlin <
lionel.g.landwer...@intel.com> wrote:

> On 08/08/17 23:45, Jason Ekstrand wrote:
>
>> In order to implement VK_KHR_external_fence, we need to back our fences
>> with something that's shareable.  Since the kernel wait interface for
>> sync objects already supports waiting for multiple fences in one go, it
>> makes anv_WaitForFences much simpler if we only have one type of fence.
>> ---
>>   src/intel/vulkan/anv_batch_chain.c |   8 +++
>>   src/intel/vulkan/anv_device.c  |   2 +
>>   src/intel/vulkan/anv_private.h |   4 ++
>>   src/intel/vulkan/anv_queue.c   | 132 ++
>> ---
>>   4 files changed, 136 insertions(+), 10 deletions(-)
>>
>> diff --git a/src/intel/vulkan/anv_batch_chain.c
>> b/src/intel/vulkan/anv_batch_chain.c
>> index 5d876e4..15082b5 100644
>> --- a/src/intel/vulkan/anv_batch_chain.c
>> +++ b/src/intel/vulkan/anv_batch_chain.c
>> @@ -1556,6 +1556,14 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
>>   return result;
>>break;
>>   +  case ANV_FENCE_TYPE_SYNCOBJ:
>> + result = anv_execbuf_add_syncobj(, impl->syncobj,
>> +  I915_EXEC_FENCE_SIGNAL,
>> +  >alloc);
>> + if (result != VK_SUCCESS)
>> +return result;
>> + break;
>> +
>> default:
>>unreachable("Invalid fence type");
>> }
>> diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.
>> c
>> index a6d5215..2e0fa19 100644
>> --- a/src/intel/vulkan/anv_device.c
>> +++ b/src/intel/vulkan/anv_device.c
>> @@ -339,6 +339,8 @@ anv_physical_device_init(struct anv_physical_device
>> *device,
>>  device->has_exec_async = anv_gem_get_param(fd,
>> I915_PARAM_HAS_EXEC_ASYNC);
>>  device->has_exec_fence = anv_gem_get_param(fd,
>> I915_PARAM_HAS_EXEC_FENCE);
>>  device->has_syncobj = anv_gem_get_param(fd,
>> I915_PARAM_HAS_EXEC_FENCE_ARRAY);
>> +   device->has_syncobj_wait = device->has_syncobj &&
>> +  anv_gem_supports_syncobj_wait(fd);
>>bool swizzled = anv_gem_get_bit6_swizzle(fd, I915_TILING_X);
>>   diff --git a/src/intel/vulkan/anv_private.h
>> b/src/intel/vulkan/anv_private.h
>> index 2f89d3f..430652d 100644
>> --- a/src/intel/vulkan/anv_private.h
>> +++ b/src/intel/vulkan/anv_private.h
>> @@ -654,6 +654,7 @@ struct anv_physical_device {
>>   boolhas_exec_async;
>>   boolhas_exec_fence;
>>   boolhas_syncobj;
>> +boolhas_syncobj_wait;
>> uint32_teu_total;
>>   uint32_tsubslice_total;
>> @@ -1755,6 +1756,9 @@ struct anv_fence_impl {
>>struct anv_bo bo;
>>enum anv_bo_fence_state state;
>> } bo;
>> +
>> +  /** DRM syncobj handle for syncobj-based fences */
>> +  uint32_t syncobj;
>>  };
>>   };
>>   diff --git a/src/intel/vulkan/anv_queue.c b/src/intel/vulkan/anv_queue.c
>> index 7348e15..8e45bb2 100644
>> --- a/src/intel/vulkan/anv_queue.c
>> +++ b/src/intel/vulkan/anv_queue.c
>> @@ -271,17 +271,25 @@ VkResult anv_CreateFence(
>>  if (fence == NULL)
>> return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
>>   -   fence->permanent.type = ANV_FENCE_TYPE_BO;
>> +   if (device->instance->physicalDevice.has_syncobj_wait) {
>> +  fence->permanent.type = ANV_FENCE_TYPE_SYNCOBJ;
>>   -   VkResult result = anv_bo_pool_alloc(>batch_bo_pool,
>> -   >permanent.bo.bo, 4096);
>> -   if (result != VK_SUCCESS)
>> -  return result;
>> -
>> -   if (pCreateInfo->flags & VK_FENCE_CREATE_SIGNALED_BIT) {
>> -  fence->permanent.bo.state = ANV_BO_FENCE_STATE_SIGNALED;
>> +  fence->permanent.syncobj = anv_gem_syncobj_create(device);
>> +  if (!fence->permanent.syncobj)
>> + return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
>>
>
> Don't you need to do something when the fence is created with the signaled
> bit with drm syncobj?
> I didn't see anything in the spec that would make this illegal so I assume
> we have to handle it.
>

Hrm... Yes, I think we do.  Unfortunately, that's going to require
additional kernel API. :(  Thanks for catching that, I'll work on it today.

--Jason


>  } else {
>> -  fence->permanent.bo.state = ANV_BO_FENCE_STATE_RESET;
>> +  fence->permanent.type = ANV_FENCE_TYPE_BO;
>> +
>> +  VkResult result = anv_bo_pool_alloc(>batch_bo_pool,
>> +  >permanent.bo.bo,
>> 4096);
>> +  if (result != VK_SUCCESS)
>> + return result;
>> +
>> +  if (pCreateInfo->flags & VK_FENCE_CREATE_SIGNALED_BIT) {
>> + fence->permanent.bo.state = ANV_BO_FENCE_STATE_SIGNALED;
>> +  } else {
>> +   

[Mesa-dev] [PATCH v9 4/7] mesa/st: glsl_to_tgsi: add tests for the new temporary lifetime tracker

2017-08-24 Thread Gert Wollny
This patch adds a set of unit tests for the new lifetime tracker.
---
 configure.ac   |1 +
 src/mesa/Makefile.am   |2 +-
 .../state_tracker/st_glsl_to_tgsi_temprename.cpp   |   12 +-
 .../state_tracker/st_glsl_to_tgsi_temprename.h |8 +-
 src/mesa/state_tracker/tests/Makefile.am   |   36 +
 .../tests/test_glsl_to_tgsi_lifetime.cpp   | 1405 
 6 files changed, 1460 insertions(+), 4 deletions(-)
 create mode 100644 src/mesa/state_tracker/tests/Makefile.am
 create mode 100644 src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp

diff --git a/configure.ac b/configure.ac
index 53d52f6d52..3ca67b7632 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2931,6 +2931,7 @@ AC_CONFIG_FILES([Makefile
  src/mesa/drivers/osmesa/osmesa.pc
  src/mesa/drivers/x11/Makefile
  src/mesa/main/tests/Makefile
+ src/mesa/state_tracker/tests/Makefile
  src/util/Makefile
  src/util/tests/hash_table/Makefile
  src/util/xmlpool/Makefile
diff --git a/src/mesa/Makefile.am b/src/mesa/Makefile.am
index 97a9bbd8c2..865735be27 100644
--- a/src/mesa/Makefile.am
+++ b/src/mesa/Makefile.am
@@ -19,7 +19,7 @@
 # FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
 # IN THE SOFTWARE.
 
-SUBDIRS = . main/tests
+SUBDIRS = . main/tests state_tracker/tests
 
 if HAVE_XLIB_GLX
 SUBDIRS += drivers/x11
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
index 9690e47fd6..8a73f8c99c 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
@@ -601,7 +601,7 @@ static void dump_instruction(int line, prog_scope *scope,
 /* Scan the program and estimate the required register life times.
  * The array lifetimes must be pre-allocated
  */
-void
+bool
 get_temp_registers_required_lifetimes(void *mem_ctx, exec_list *instructions,
   int ntemps, struct lifetime *lifetimes)
 {
@@ -743,6 +743,15 @@ get_temp_registers_required_lifetimes(void *mem_ctx, 
exec_list *instructions,
  }
  break;
   }
+  case TGSI_OPCODE_CAL:
+  case TGSI_OPCODE_RET:
+ /* These opcodes are not supported and if a subroutine would
+  * be called in a shader, then the lifetime tracking would have
+  * to follow that call to see which registers are used there.
+  * Since this is not done, we have to bail out here and signal
+  * that no register merge will take place.
+  */
+ return false;
   default: {
  for (unsigned j = 0; j < num_inst_src_regs(inst); j++) {
 const st_src_reg& src = inst->src[j];
@@ -782,6 +791,7 @@ get_temp_registers_required_lifetimes(void *mem_ctx, 
exec_list *instructions,
RENAME_DEBUG(cerr << "==\n\n");
 
delete[] acc;
+   return true;
 }
 
 /* Code below used for debugging */
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h 
b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h
index 44998cca97..4bfe383aaa 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h
@@ -38,7 +38,9 @@ struct lifetime {
int end;
 };
 
-/** Evaluates the required life times of temporary registers in a shader
+/** Evaluates the required life times of temporary registers in a shader.
+ * The life time estimation can only be run successfully if the shader doesn't
+ * call a subroutine.
  * @param[in] mem_ctx a memory context that can be used with the ralloc_* 
functions
  * @param[in] instructions the shader to be anlzyed
  * @param[in] ntemps number of temporaries reserved for this shader
@@ -47,8 +49,10 @@ struct lifetime {
  *   allocated memory that can hold ntemps lifetime structures. On output
  *   the life times contains the life times for the registers with the
  *   exception of TEMP[0].
+ * @returns: true if the lifetimes were estimated, false if not (i.e. if a
+ * subroutine was called).
  */
-void
+bool
 get_temp_registers_required_lifetimes(void *mem_ctx, exec_list *instructions,
   int ntemps, struct lifetime *lifetimes);
 
diff --git a/src/mesa/state_tracker/tests/Makefile.am 
b/src/mesa/state_tracker/tests/Makefile.am
new file mode 100644
index 00..fb64cf9dc2
--- /dev/null
+++ b/src/mesa/state_tracker/tests/Makefile.am
@@ -0,0 +1,36 @@
+AM_CFLAGS = \
+   $(PTHREAD_CFLAGS)
+
+AM_CXXFLAGS = \
+   $(LLVM_CXXFLAGS)
+
+AM_CPPFLAGS = \
+   -I$(top_srcdir)/src/gtest/include \
+   -I$(top_srcdir)/src \
+   -I$(top_srcdir)/src/mapi \
+   -I$(top_builddir)/src/mesa \
+   -I$(top_srcdir)/src/mesa \
+   -I$(top_srcdir)/include \
+   

[Mesa-dev] [PATCH v9 7/7] mesa/st: glsl_to_tgsi: tie in new temporary register merge approach

2017-08-24 Thread Gert Wollny
This patch replaces the old register lifetime estimation and
rename mapping evaluation with the new one.

To compare the performance of the current and the new implementation,
both were measured by running shader-db in one thread.

---
                        old            new (std::sort)

 time ./run -j1 shaders

  real                  5.80s          5.75s
  user                  5.75s          5.70s
  sys                   0.05s          0.05s

 valgrind --tool=callgrind --dump-instr=yes

  merge                 0.08%          0.18%
  estimate lifetime     0.02%          0.11%
  evaluate mapping      (incl=0.3%)    0.04%
  apply mapping         0.03%          0.02%

 perf (approximate because of statistical sampling)

  merge (total)         0.09%          0.16%
  estimate lifetime     0.03%          0.10%
  evaluate mapping      (incl=0.02%)   0.04%
  apply mapping         0.04%          0.04%
---
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 66 --
 1 file changed, 16 insertions(+), 50 deletions(-)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 9ec8268d3c..e0dceff31c 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -55,7 +55,7 @@
 #include "st_format.h"
 #include "st_nir.h"
 #include "st_shader_cache.h"
-#include "st_glsl_to_tgsi_private.h"
+#include "st_glsl_to_tgsi_temprename.h"
 
 #include "util/hash_table.h"
 #include 
@@ -574,7 +574,7 @@ glsl_to_tgsi_visitor::emit_asm(ir_instruction *ir, unsigned 
op,
if (swz > 1) {
   dinst->src[j].double_reg2 = true;
   dinst->src[j].index++;
-  }
+  }
 
if (swz & 1)
   dinst->src[j].swizzle = MAKE_SWIZZLE4(SWIZZLE_Z, SWIZZLE_W, 
SWIZZLE_Z, SWIZZLE_W);
@@ -2067,7 +2067,7 @@ glsl_to_tgsi_visitor::visit_expression(ir_expression* ir, 
st_src_reg *op)
   st_src_reg temp = get_temp(glsl_type::uvec4_type);
   st_dst_reg temp_dst = st_dst_reg(temp);
   unsigned orig_swz = op[0].swizzle;
-  /* 
+  /*
* To convert unsigned to 64-bit:
* zero Y channel, copy X channel.
*/
@@ -2553,8 +2553,8 @@ glsl_to_tgsi_visitor::visit(ir_dereference_array *ir)
if (index) {
 
   if (this->prog->Target == GL_VERTEX_PROGRAM_ARB &&
- src.file == PROGRAM_INPUT)
-element_size = attrib_type_size(ir->type, true);
+ src.file == PROGRAM_INPUT)
+element_size = attrib_type_size(ir->type, true);
   if (is_2D) {
  src.index2D = index->value.i[0];
  src.has_index2 = true;
@@ -2840,7 +2840,7 @@ glsl_to_tgsi_visitor::emit_block_mov(ir_assignment *ir, 
const struct glsl_type *
if (type->is_dual_slot()) {
   l->index++;
   if (r->is_double_vertex_input == false)
-r->index++;
+r->index++;
}
 }
 
@@ -5135,54 +5135,20 @@ glsl_to_tgsi_visitor::merge_two_dsts(void)
 void
 glsl_to_tgsi_visitor::merge_registers(void)
 {
-   int *last_reads = ralloc_array(mem_ctx, int, this->next_temp);
-   int *first_writes = ralloc_array(mem_ctx, int, this->next_temp);
-   struct rename_reg_pair *renames = rzalloc_array(mem_ctx, struct 
rename_reg_pair, this->next_temp);
-   int i, j;
 
-   /* Read the indices of the last read and first write to each temp register
-* into an array so that we don't have to traverse the instruction list as
-* much. */
-   for (i = 0; i < this->next_temp; i++) {
-  last_reads[i] = -1;
-  first_writes[i] = -1;
-   }
-   get_last_temp_read_first_temp_write(last_reads, first_writes);
+   struct lifetime *lifetimes =
+ rzalloc_array(mem_ctx, struct lifetime, this->next_temp);
 
-   /* Start looking for registers with non-overlapping usages that can be
-* merged together. */
-   for (i = 0; i < this->next_temp; i++) {
-  /* Don't touch unused registers. */
-  if (last_reads[i] < 0 || first_writes[i] < 0) continue;
-
-  for (j = 0; j < this->next_temp; j++) {
- /* Don't touch unused registers. */
- if (last_reads[j] < 0 || first_writes[j] < 0) continue;
-
- /* We can merge the two registers if the first write to j is after or
-  * in the same instruction as the last read from i.  Note that the
-  * register at index i will always be used earlier or at the same time
-  * as the register at index j. */
- if (first_writes[i] <= first_writes[j] &&
- last_reads[i] <= first_writes[j]) {
-renames[j].new_reg = i;
-renames[j].valid = true;
-
-/* Update the first_writes and last_reads arrays with the new
- * values for the merged register index, and mark the newly unused
- * register index as such. */
-assert(last_reads[j] >= last_reads[i]);
-last_reads[i] = last_reads[j];
-

[Mesa-dev] [PATCH v9 5/7] mesa/st: glsl_to_tgsi: add register rename mapping evaluator

2017-08-24 Thread Gert Wollny
The remapping evaluator first sorts the temporary registers in ascending
order of the first instruction of their life time, and then uses a binary
search to find merge candidates.
For the initial sorting it uses std::sort because qsort is quite slow in
comparison. By removing the define USE_STL_SORT in
  src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
one can enable the alternative code path that uses qsort.

Registers that are not written to are not considered for renaming since in
glsl_to_tgsi_visitor::renumber_registers they are eliminated anyway.
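
As an illustration only, the sort-then-search idea described above might be
sketched as follows. This is not the code from the patch: the names Lifetime,
AccessRecord and compute_remapping are made up for this sketch, it uses
std::lower_bound in place of the hand-written binary search, it uses
std::stable_sort so that the result is deterministic for equal start points,
and it skips already merged records instead of compacting the array.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

/* Hypothetical stand-in for the life time of one temporary register. */
struct Lifetime {
   int begin;   /* first write, -1 if the register is never written */
   int end;     /* last read */
};

/* Hypothetical record used for sorting and searching. */
struct AccessRecord {
   int begin;
   int end;
   int reg;
   bool merged;
};

/* Returns, for each register, the register it is renamed to
 * (a register that is not renamed maps to itself). */
std::vector<int> compute_remapping(const std::vector<Lifetime>& lifetimes)
{
   std::vector<int> mapping(lifetimes.size());
   std::vector<AccessRecord> recs;

   for (std::size_t i = 0; i < lifetimes.size(); ++i) {
      mapping[i] = static_cast<int>(i);
      /* Registers that are never written are not considered. */
      if (lifetimes[i].begin >= 0)
         recs.push_back({lifetimes[i].begin, lifetimes[i].end,
                         static_cast<int>(i), false});
   }

   /* Sort ascending by the start of the life time. */
   std::stable_sort(recs.begin(), recs.end(),
                    [](const AccessRecord& a, const AccessRecord& b) {
                       return a.begin < b.begin;
                    });

   for (auto target = recs.begin(); target != recs.end(); ++target) {
      if (target->merged)
         continue;

      auto search_start = target + 1;
      for (;;) {
         /* Binary search for the first record whose life time starts
          * at or after the end of the target's life time. */
         auto src = std::lower_bound(search_start, recs.end(), target->end,
                                     [](const AccessRecord& r, int bound) {
                                        return r.begin < bound;
                                     });
         /* Skip records that were already merged into another register. */
         while (src != recs.end() && src->merged)
            ++src;
         if (src == recs.end())
            break;

         mapping[src->reg] = target->reg;
         target->end = src->end;   /* the target now lives longer */
         src->merged = true;
         search_start = src + 1;   /* only search forward */
      }
   }
   return mapping;
}
```

With the life times {-1,-1}, {0,1}, {0,2}, {3,4}, {4,5} this sketch maps
registers 3 and 4 onto register 1 and leaves register 2 alone.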
---
 .../state_tracker/st_glsl_to_tgsi_temprename.cpp   | 117 +
 .../state_tracker/st_glsl_to_tgsi_temprename.h |  12 +++
 .../tests/test_glsl_to_tgsi_lifetime.cpp   |  13 ++-
 3 files changed, 137 insertions(+), 5 deletions(-)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
index 8a73f8c99c..6dff909ebc 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
@@ -590,6 +590,19 @@ lifetime temp_comp_access::get_required_lifetime()
return make_lifetime(first_write, last_read);
 }
 
+/* Helper class for sorting and searching the registers based
+ * on life times. */
+struct access_record {
+   int begin;
+   int end;
+   int reg;
+   bool erase;
+
+   bool operator < (const access_record& rhs) const {
+  return begin < rhs.begin;
+   }
+};
+
 }
 
 #ifndef NDEBUG
@@ -794,6 +807,110 @@ get_temp_registers_required_lifetimes(void *mem_ctx, 
exec_list *instructions,
return true;
 }
 
+/* Find the next register between [start, end) that has a life time starting
+ * at or after bound by using a binary search.
+ * start points at the beginning of the search range,
+ * end points at the element past the end of the search range, and
+ * the array comprising [start, end) must be sorted in ascending order.
+ */
+static access_record*
+find_next_rename(access_record* start, access_record* end, int bound)
+{
+   int delta = (end - start);
+
+   while (delta > 0) {
+  int half = delta >> 1;
+  access_record* middle = start + half;
+
+  if (bound <= middle->begin) {
+ delta = half;
+  } else {
+ start = middle;
+ ++start;
+ delta -= half + 1;
+  }
+   }
+
+   return start;
+}
+
+#ifndef USE_STL_SORT
+static int access_record_compare (const void *a, const void *b) {
+   const access_record *aa = static_cast<const access_record *>(a);
+   const access_record *bb = static_cast<const access_record *>(b);
+   return aa->begin < bb->begin ? -1 : (aa->begin > bb->begin ? 1 : 0);
+}
+#endif
+
+/* This function evaluates the register merges by using a binary
+ * search to find suitable merge candidates. */
+void get_temp_registers_remapping(void *mem_ctx, int ntemps,
+  const struct lifetime* lifetimes,
+  struct rename_reg_pair *result)
+{
+   access_record *reg_access = ralloc_array(mem_ctx, access_record, ntemps);
+
+   int used_temps = 0;
+   for (int i = 0; i < ntemps; ++i) {
+  if (lifetimes[i].begin >= 0) {
+ reg_access[used_temps].begin = lifetimes[i].begin;
+ reg_access[used_temps].end = lifetimes[i].end;
+ reg_access[used_temps].reg = i;
+ reg_access[used_temps].erase = false;
+ ++used_temps;
+  }
+   }
+
+#ifdef USE_STL_SORT
+   std::sort(reg_access, reg_access + used_temps);
+#else
+   std::qsort(reg_access, used_temps, sizeof(access_record), 
access_record_compare);
+#endif
+
+   access_record *trgt = reg_access;
+   access_record *reg_access_end = reg_access + used_temps;
+   access_record *first_erase = reg_access_end;
+   access_record *search_start = trgt + 1;
+
+   while (trgt != reg_access_end) {
+  access_record *src = find_next_rename(search_start, reg_access_end,
+trgt->end);
+  if (src != reg_access_end) {
+ result[src->reg].new_reg = trgt->reg;
+ result[src->reg].valid = true;
+ trgt->end = src->end;
+
+ /* Since we only search forward, don't remove the renamed
+  * register just now, only mark it. */
+ src->erase = true;
+
+ if (first_erase == reg_access_end)
+first_erase = src;
+
+ search_start = src + 1;
+  } else {
+ /* Moving to the next target register it is time to remove
+  * the already merged registers from the search range */
+ if (first_erase != reg_access_end) {
+access_record *outp = first_erase;
+access_record *inp = first_erase + 1;
+
+while (inp != reg_access_end) {
+   if (!inp->erase)
+  *outp++ = *inp;
+   ++inp;
+}
+
+reg_access_end = outp;
+first_erase = reg_access_end;
+ }
+ ++trgt;
+ search_start = trgt + 1;
+  }
+   }
+   ralloc_free(reg_access);

[Mesa-dev] [PATCH v9 6/7] mesa/st: glsl_to_tgsi: Add test set for evaluation of rename mapping

2017-08-24 Thread Gert Wollny
The patch adds tests for the register rename mapping evaluation and
combined life time estimation and renaming.
---
 .../tests/test_glsl_to_tgsi_lifetime.cpp   | 192 +
 1 file changed, 192 insertions(+)

diff --git a/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp 
b/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
index 11f0c7f127..a1c77d14ea 100644
--- a/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
+++ b/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
@@ -124,6 +124,33 @@ protected:
void check(const vector& result, const expectation& e);
 };
 
+/* With this test class the renaming mapping estimation is tested */
+class RegisterRemappingTest : public MesaTestWithMemCtx {
+protected:
+   void run(const vector& lt, const vector& expect);
+};
+
+/* With this test class the combined lifetime estimation and renaming
+ * mapping estimation is tested
+ */
+class RegisterLifetimeAndRemappingTest : public RegisterRemappingTest  {
+protected:
+   using RegisterRemappingTest::run;
+   template 
+   void run(const vector& code, const vector& expect);
+};
+
+template 
+void RegisterLifetimeAndRemappingTest::run(const vector& code,
+  const vector& expect)
+{
+ MockShader shader(code);
+ std::vector lt(shader.get_num_temps());
+ get_temp_registers_required_lifetimes(mem_ctx, shader.get_program(),
+   shader.get_num_temps(), &lt[0]);
+ this->run(lt, expect);
+}
+
 TEST_F(LifetimeEvaluatorExactTest, SimpleMoveAdd)
 {
const vector code = {
@@ -1175,6 +1202,148 @@ TEST_F(LifetimeEvaluatorExactTest, 
NestedLoopWithWriteAfterBreak)
run (code, expectation({{-1,-1}, {0,8}}));
 }
 
+/* Test remapping table of registers. The tests don't assume
+ * that the sorting algorithm used to sort the lifetimes
+ * based on their 'begin' is stable.
+ */
+TEST_F(RegisterRemappingTest, RegisterRemapping1)
+{
+   vector lt({{-1,-1},
+{0,1},
+{0,2},
+{1,2},
+{2,10},
+{3,5},
+{5,10}
+   });
+
+   vector expect({0,1,2,1,1,2,2});
+   run(lt, expect);
+}
+
+TEST_F(RegisterRemappingTest, RegisterRemapping2)
+{
+   vector lt({{-1,-1},
+{0,1},
+{0,2},
+{3,4},
+{4,5},
+   });
+   vector expect({0,1,2,1,1});
+   run(lt, expect);
+}
+
+TEST_F(RegisterRemappingTest, RegisterRemappingMergeAllToOne)
+{
+   vector lt({{-1,-1},
+{0,1},
+{1,2},
+{2,3},
+{3,4},
+   });
+   vector expect({0,1,1,1,1});
+   run(lt, expect);
+}
+
+TEST_F(RegisterRemappingTest, RegisterRemappingIgnoreUnused)
+{
+   vector lt({{-1,-1},
+{0,1},
+{1,2},
+{2,3},
+{-1,-1},
+{3,4},
+   });
+   vector expect({0,1,1,1,4,1});
+   run(lt, expect);
+}
+
+TEST_F(RegisterRemappingTest, RegisterRemappingMergeZeroLifetimeRegisters)
+{
+   vector lt({{-1,-1},
+{0,1},
+{1,2},
+{2,3},
+{3,3},
+{3,4},
+   });
+   vector expect({0,1,1,1,1,1});
+   run(lt, expect);
+}
+
+TEST_F(RegisterLifetimeAndRemappingTest, LifetimeAndRemapping)
+{
+   const vector code = {
+  {TGSI_OPCODE_USEQ, {5}, {in0,in1}, {}},
+  {TGSI_OPCODE_UCMP, {1}, {5,in1,1}, {}},
+  {TGSI_OPCODE_UCMP, {1}, {5,in1,1}, {}},
+  {TGSI_OPCODE_UCMP, {1}, {5,in1,1}, {}},
+  {TGSI_OPCODE_UCMP, {1}, {5,in1,1}, {}},
+  {TGSI_OPCODE_FSLT, {2}, {1,in1}, {}},
+  {TGSI_OPCODE_UIF, {}, {2}, {}},
+  {  TGSI_OPCODE_MOV, {3}, {in1}, {}},
+  {TGSI_OPCODE_ELSE},
+  {  TGSI_OPCODE_MOV, {4}, {in1}, {}},
+  {  TGSI_OPCODE_MOV, {4}, {4}, {}},
+  {  TGSI_OPCODE_MOV, {3}, {4}, {}},
+  {TGSI_OPCODE_ENDIF},
+  {TGSI_OPCODE_MOV, {out1}, {3}, {}},
+  {TGSI_OPCODE_END}
+   };
+   run (code, vector({0,1,5,5,1,5}));
+}
+
+TEST_F(RegisterLifetimeAndRemappingTest, 
LifetimeAndRemappingWithUnusedReadOnlyIgnored)
+{
+   const vector code = {
+  {TGSI_OPCODE_USEQ, {1}, {in0,in1}, {}},
+  {TGSI_OPCODE_UCMP, {2}, {1,in1,2}, {}},
+  {TGSI_OPCODE_UCMP, {4}, {2,in1,1}, {}},
+  {TGSI_OPCODE_ADD, {5}, {2,4}, {}},
+  {TGSI_OPCODE_UIF, {}, {7}, {}},
+  {  TGSI_OPCODE_ADD, {8}, {5,4}, {}},
+  {TGSI_OPCODE_ENDIF},
+  {TGSI_OPCODE_MOV, {out1}, {8}, {}},
+  {TGSI_OPCODE_END}
+   };
+   /* lt: 1: 0-2,2: 1-3 3: u 4: 2-5 5: 3-5 6: u 7: 0-(-1),8: 5-7 */
+   run (code, vector({0,1,2,3,1,2,6,7,1}));
+}
+
+TEST_F(RegisterLifetimeAndRemappingTest, 

[Mesa-dev] [PATCH v9 3/7] mesa/st: glsl_to_tgsi: implement new temporary register lifetime tracker

2017-08-24 Thread Gert Wollny
This patch adds a class for tracking the life times of temporary registers
in the glsl to tgsi translation. The algorithm runs in three steps:
First, to minimize the number of memory allocations, the program is
scanned to count the scopes.
Then, the program is scanned a second time to record the important register
access points: first and last reads and writes, and their link to the
execution scope (loop, if/else branch, switch case).
In the third step, the actual minimal life time is evaluated for each
register.

In addition, when compiled in debug mode (i.e. NDEBUG is not defined)
the shaders and estimated temporary life times can be logged to stderr
by setting the environment variable GLSL_TO_TGSI_RENAME_DEBUG.
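
To make the recorded quantities concrete, here is a much reduced sketch of
what the second and third step compute for a program without any control
flow. It is not the implementation from the patch: the types Instruction and
Lifetime and the function estimate_lifetimes are invented for this example,
scope handling (loops, if/else, switch) is left out completely, and treating
a register that is written but never read as living only at its write is an
assumption of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

/* Hypothetical life time: first write and last read of a temporary. */
struct Lifetime {
   int begin;
   int end;
};

/* Hypothetical instruction: indices of the temporaries it writes and reads. */
struct Instruction {
   std::vector<int> dst;
   std::vector<int> src;
};

/* Straight-line code only: a register lives from its first write to its
 * last read. Registers that are never written keep {-1, -1}. */
std::vector<Lifetime>
estimate_lifetimes(const std::vector<Instruction>& program, int ntemps)
{
   std::vector<Lifetime> lifetimes(ntemps, Lifetime{-1, -1});

   for (std::size_t i = 0; i < program.size(); ++i) {
      const int line = static_cast<int>(i);

      /* Reads are handled first so that "MOV t1, t1" extends t1. */
      for (int reg : program[i].src) {
         if (lifetimes[reg].begin >= 0)
            lifetimes[reg].end = line;
      }
      /* Only the first write opens the life time. */
      for (int reg : program[i].dst) {
         if (lifetimes[reg].begin < 0) {
            lifetimes[reg].begin = line;
            lifetimes[reg].end = line;
         }
      }
   }
   return lifetimes;
}
```

For a three-instruction program that writes t1 at line 0, computes t2 from t1
at line 1 and reads t2 at line 2, the sketch yields 0-1 for t1 and 1-2 for t2.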
---
 src/mesa/Makefile.sources  |   2 +
 .../state_tracker/st_glsl_to_tgsi_temprename.cpp   | 886 +
 .../state_tracker/st_glsl_to_tgsi_temprename.h |  55 ++
 3 files changed, 943 insertions(+)
 create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
 create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h

diff --git a/src/mesa/Makefile.sources b/src/mesa/Makefile.sources
index 40d1a2e538..2e4cf6f638 100644
--- a/src/mesa/Makefile.sources
+++ b/src/mesa/Makefile.sources
@@ -513,6 +513,8 @@ STATETRACKER_FILES = \
state_tracker/st_glsl_to_tgsi.h \
state_tracker/st_glsl_to_tgsi_private.cpp \
state_tracker/st_glsl_to_tgsi_private.h \
+   state_tracker/st_glsl_to_tgsi_temprename.cpp \
+   state_tracker/st_glsl_to_tgsi_temprename.h \
state_tracker/st_glsl_types.cpp \
state_tracker/st_glsl_types.h \
state_tracker/st_manager.c \
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
new file mode 100644
index 00..9690e47fd6
--- /dev/null
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
@@ -0,0 +1,886 @@
+/*
+ * Copyright © 2017 Gert Wollny
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#include "st_glsl_to_tgsi_temprename.h"
+#include 
+#include 
+#include 
+#include 
+#include 
+
+/* std::sort is significantly faster than qsort */
+#define USE_STL_SORT
+#ifdef USE_STL_SORT
+#include 
+#endif
+
+#ifndef NDEBUG
+#include 
+#include 
+#include 
+#include 
+using std::cerr;
+using std::setw;
+#endif
+
+using std::numeric_limits;
+
+/* Without c++11 define the nullptr for forward-compatibility
+ * and better readability */
+#if __cplusplus < 201103L
+#define nullptr 0
+#endif
+
+#ifndef NDEBUG
+/* Helper function to check whether we want to see debugging output */
+static inline bool is_debug_enabled ()
+{
+   static int debug_enabled = -1;
+   if (debug_enabled < 0)
+  debug_enabled = env_var_as_boolean("GLSL_TO_TGSI_RENAME_DEBUG", false);
+   return debug_enabled > 0;
+}
+#define RENAME_DEBUG(X) if (is_debug_enabled()) do { X; } while (false);
+#else
+#define RENAME_DEBUG(X)
+#endif
+
+namespace {
+
+enum prog_scope_type {
+   outer_scope,   /* Outer program scope */
+   loop_body, /* Inside a loop */
+   if_branch, /* Inside if branch */
+   else_branch,   /* Inside else branch */
+   switch_body,   /* Inside switch statement */
+   switch_case_branch,/* Inside switch case statement */
+   switch_default_branch, /* Inside switch default statement */
+   undefined_scope
+};
+
+class prog_scope {
+public:
+   prog_scope(prog_scope *parent, prog_scope_type type, int id,
+  int depth, int begin);
+
+   prog_scope_type type() const;
+   prog_scope *parent() const;
+   int nesting_depth() const;
+   int id() const;
+   int end() const;
+   int begin() const;
+   int loop_break_line() const;
+
+   const prog_scope *in_ifelse_scope() const;
+   const prog_scope *in_switchcase_scope() const;
+   const prog_scope 

[Mesa-dev] [PATCH v9 2/7] mesa/st: glsl_to_tgsi move some helper classes to extra files

2017-08-24 Thread Gert Wollny
To prepare for the implementation of a temp register lifetime tracker,
some of the classes are moved into separate header/implementation
files to make them accessible from other files.

Specifically these are:

class st_src_reg;
class st_dst_reg;
class glsl_to_tgsi_instruction;
struct rename_reg_pair;

int swizzle_for_type(const glsl_type *type, int component);

  as inline:

bool is_resource_instruction(unsigned opcode);
unsigned num_inst_dst_regs(const glsl_to_tgsi_instruction *op);
unsigned num_inst_src_regs(const glsl_to_tgsi_instruction *op);
---
 src/mesa/Makefile.sources  |   2 +
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 289 +
 src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp | 196 ++
 src/mesa/state_tracker/st_glsl_to_tgsi_private.h   | 168 
 4 files changed, 368 insertions(+), 287 deletions(-)
 create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp
 create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_private.h

diff --git a/src/mesa/Makefile.sources b/src/mesa/Makefile.sources
index e65b091fe8..40d1a2e538 100644
--- a/src/mesa/Makefile.sources
+++ b/src/mesa/Makefile.sources
@@ -511,6 +511,8 @@ STATETRACKER_FILES = \
state_tracker/st_glsl_to_nir.cpp \
state_tracker/st_glsl_to_tgsi.cpp \
state_tracker/st_glsl_to_tgsi.h \
+   state_tracker/st_glsl_to_tgsi_private.cpp \
+   state_tracker/st_glsl_to_tgsi_private.h \
state_tracker/st_glsl_types.cpp \
state_tracker/st_glsl_types.h \
state_tracker/st_manager.c \
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index d05c27cd7a..9ec8268d3c 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -55,6 +55,7 @@
 #include "st_format.h"
 #include "st_nir.h"
 #include "st_shader_cache.h"
+#include "st_glsl_to_tgsi_private.h"
 
 #include "util/hash_table.h"
 #include 
@@ -65,28 +66,6 @@
 
 #define MAX_GLSL_TEXTURE_OFFSET 4
 
-class st_src_reg;
-class st_dst_reg;
-
-static int swizzle_for_size(int size);
-
-static int swizzle_for_type(const glsl_type *type, int component = 0)
-{
-   unsigned num_elements = 4;
-
-   if (type) {
-  type = type->without_array();
-  if (type->is_scalar() || type->is_vector() || type->is_matrix())
- num_elements = type->vector_elements;
-   }
-
-   int swizzle = swizzle_for_size(num_elements);
-   assert(num_elements + component <= 4);
-
-   swizzle += component * MAKE_SWIZZLE4(1, 1, 1, 1);
-   return swizzle;
-}
-
 static unsigned is_precise(const ir_variable *ir)
 {
if (!ir)
@@ -94,231 +73,6 @@ static unsigned is_precise(const ir_variable *ir)
return ir->data.precise || ir->data.invariant;
 }
 
-/**
- * This struct is a corresponding struct to TGSI ureg_src.
- */
-class st_src_reg {
-public:
-   st_src_reg(gl_register_file file, int index, const glsl_type *type,
-  int component = 0, unsigned array_id = 0)
-   {
-  assert(file != PROGRAM_ARRAY || array_id != 0);
-  this->file = file;
-  this->index = index;
-  this->swizzle = swizzle_for_type(type, component);
-  this->negate = 0;
-  this->abs = 0;
-  this->index2D = 0;
-  this->type = type ? type->base_type : GLSL_TYPE_ERROR;
-  this->reladdr = NULL;
-  this->reladdr2 = NULL;
-  this->has_index2 = false;
-  this->double_reg2 = false;
-  this->array_id = array_id;
-  this->is_double_vertex_input = false;
-   }
-
-   st_src_reg(gl_register_file file, int index, enum glsl_base_type type)
-   {
-  assert(file != PROGRAM_ARRAY); /* need array_id > 0 */
-  this->type = type;
-  this->file = file;
-  this->index = index;
-  this->index2D = 0;
-  this->swizzle = SWIZZLE_XYZW;
-  this->negate = 0;
-  this->abs = 0;
-  this->reladdr = NULL;
-  this->reladdr2 = NULL;
-  this->has_index2 = false;
-  this->double_reg2 = false;
-  this->array_id = 0;
-  this->is_double_vertex_input = false;
-   }
-
-   st_src_reg(gl_register_file file, int index, enum glsl_base_type type, int 
index2D)
-   {
-  assert(file != PROGRAM_ARRAY); /* need array_id > 0 */
-  this->type = type;
-  this->file = file;
-  this->index = index;
-  this->index2D = index2D;
-  this->swizzle = SWIZZLE_XYZW;
-  this->negate = 0;
-  this->abs = 0;
-  this->reladdr = NULL;
-  this->reladdr2 = NULL;
-  this->has_index2 = false;
-  this->double_reg2 = false;
-  this->array_id = 0;
-  this->is_double_vertex_input = false;
-   }
-
-   st_src_reg()
-   {
-  this->type = GLSL_TYPE_ERROR;
-  this->file = PROGRAM_UNDEFINED;
-  this->index = 0;
-  this->index2D = 0;
-  this->swizzle = 0;
-  this->negate = 0;
-  this->abs = 0;
-  this->reladdr = NULL;
-  this->reladdr2 = NULL;
-  this->has_index2 = false;
-  

[Mesa-dev] [PATCH v9 0/7] mesa/st: glsl_to_tgsi: refined register merge algorithm

2017-08-24 Thread Gert Wollny
Dear all, 

I thought I might send out this patch another time with its full history and 
freshly rebased. All the changes that I applied were a result of reviews by 
Nicolai (mostly) and Emil (thanks again to both of you).  

The set is mirrored at 
   https://github.com/gerddie/mesa/tree/regrename-v9

The patch fixes a series of bugs where shader compilation fails with 
  "translation from TGSI failed!" 
Among these are 
  * https://bugs.freedesktop.org/show_bug.cgi?id=65448, which I can confirm 
    is fixed when R600_DEBUG=nosb is set (with sb enabled it fails with an 
    assertion in the sb code).
  * According to a user report against v5, the patch also fixes #99349.

I can also confirm that the patch fixes the Piano and Volplosion benchmarks 
implemented in gputest on BARTS (r600g).

The patch has no significant impact on run-time. This does not take into
account Dave's patch, which by itself reduces the register renaming run-time 
for shaders with a large number of temporary registers.

The patch doesn't introduce any piglit regressions (I tested the shader subset). 
spec@glsl-1.50@execution@variable-indexing@gs-input-array-vec2-index-rd is 
fixed though. 

The algorithm works as follows:
- first the program is scanned: the loop, switch, and if/else scopes are
  collected; for each temporary the first and last reads and writes and the
  corresponding scopes are collected; and it is recorded whether a variable is
  written conditionally and whether loops have continue or break statements,
- then, after the whole program has been scanned, the life-times are estimated
  by merging the read and write scopes for each temporary on a per-component
  basis,
- the life-times of the components are merged,
- the register mapping is evaluated, and
- the mapping is applied with the rename_temp_registers method implemented
  by Dave.
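
As a rough illustration of the last three steps, here is a minimal sketch. It is not the code from this series: the names lifetime, rename_entry, merge_components and evaluate_mapping are hypothetical, and the greedy mapping is a simplification that assumes a temporary may take over a register whose life-time ends at or before the instruction where its own begins.

```cpp
#include <algorithm>
#include <array>
#include <vector>

// Hypothetical types for illustration; not the names used in the patches.
struct lifetime {
   int begin;  // first instruction where the value is live, -1 if unused
   int end;    // last instruction where the value is live
};

struct rename_entry {
   bool valid;   // true if the register is renamed
   int new_reg;  // target register index when valid
};

// Merge the life-times of the four components of one temporary into a
// single register life-time: earliest begin and latest end of the used ones.
lifetime merge_components(const std::array<lifetime, 4> &comp)
{
   lifetime result = {-1, -1};
   for (const lifetime &c : comp) {
      if (c.begin < 0)
         continue;  // this component is never used
      if (result.begin < 0 || c.begin < result.begin)
         result.begin = c.begin;
      result.end = std::max(result.end, c.end);
   }
   return result;
}

// Greedy mapping evaluation: visit the used registers ordered by the start
// of their life-time and map each one onto the first kept register whose
// life-time has already ended. Unused registers are ignored.
std::vector<rename_entry> evaluate_mapping(std::vector<lifetime> lt)
{
   std::vector<rename_entry> result(lt.size(), rename_entry{false, 0});

   std::vector<int> order;
   for (int i = 0; i < static_cast<int>(lt.size()); ++i) {
      if (lt[i].begin >= 0)
         order.push_back(i);
   }
   std::stable_sort(order.begin(), order.end(),
                    [&lt](int a, int b) { return lt[a].begin < lt[b].begin; });

   std::vector<int> targets;  // registers that keep their own index
   for (int reg : order) {
      bool merged = false;
      for (int t : targets) {
         if (lt[t].end <= lt[reg].begin) {
            result[reg] = rename_entry{true, t};
            lt[t].end = std::max(lt[t].end, lt[reg].end);
            merged = true;
            break;
         }
      }
      if (!merged)
         targets.push_back(reg);
   }
   return result;
}
```

The returned table is indexed by the old register number and carries a valid flag, which is the shape rename_temp_registers expects after Dave's patch.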

I've used the patches for quite some time now and so far I haven't encountered
any problems. Many thanks for any comments, 
 
Gert

Patch history: 

v2:* significantly cut down on the memory allocations, 
   * expose only a minimal interface to register lifetime estimation and
 calculating the rename table,
   
v3: was broken and v4 restarted from v2 

v4:* split the changes into more patches 
   * correct formatting errors,
   * remove the use of the STL, with one exception: std::sort is already
     used in st_glsl_to_tgsi.cpp and its run-time performance is
     significantly better than that of qsort, so it is used in the register
     rename mapping evaluation. It can be disabled by commenting out the
     define USE_STL_SORT in st_glsl_to_tgsi_temprename.cpp.
   * add more tests and improve the life-time evaluation accordingly,
   * further reduce memory allocations,
   * rename functions and methods to better clarify what they are used for,
   * remove unused methods and variables in prog_scope,
   * eliminate the class tgsi_temp_lifetime,
   * no longer require C++11 for the core library code, however, the tests 
 make use of C++11 and the STL

v5: * correct formatting following Emil's suggestions
* remove un-needed libraries for the tests

v6:* the components are now tracked individually and the life-time of a
     temporary is evaluated by merging the life-times of its components,
   * BRK/CONT are now handled separately,
   * the final algorithm to evaluate the life-times was simplified,
   * read and write in the same instruction is now considered to be always
     well defined,
   * adherence to the coding style was improved,
   * the case scope level is now below the according switch scope level,
   * the new register merge method replaces the old version, i.e. there are no
     environment variables to switch between implementations. In theory, one
     could also remove the function get_last_temp_read_first_temp_write, but
     it is still used in some code in an #if 0 block, so I didn't touch it.
   * when compiled in debug mode and with the environment variable
     GLSL_TO_TGSI_RENAME_DEBUG specified, the TGSI and resulting register
     life-times will be dumped to stderr. Here Nicolai suggested using
     _mesa_register_file_name instead of my hand-rolled array of strings,
     but currently that function doesn't write out all the needed names,
     so I thought it might be better to address this in another patch that
     also extends _mesa_register_file_name,
   * unused registers are now ignored in the rename mapping evaluation,
   * registers that are only read get a life-time {x,x}, with x the instruction
     line where the register is last read, so they can be merged,
   * the patch has been rebased against 7d7bcd65d

v7:* Correct documentation of GLSL_TO_TGSI_RENAME_DEBUG in commit message.  
   * fix typos, include files, formatting, and some variable names,
   * add anonymous namespace around classes, 
   * replace debug_log singleton by a function,
   * cleanup the switch-case scope creation, 
   * track switch 

[Mesa-dev] [PATCH v9 1/7] st_glsl_to_tgsi: rewrite rename registers to use array fully.

2017-08-24 Thread Gert Wollny
From: Dave Airlie 

Instead of having to search the whole array, just use the whole
thing and store a valid bit in there with the rename.

Removes this from the profile on some of the fp64 tests

Reviewed-by: Timothy Arceri 
Signed-off-by: Dave Airlie 
---
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 55 ++
 1 file changed, 26 insertions(+), 29 deletions(-)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 9f021962e4..d05c27cd7a 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -399,7 +399,7 @@ find_array_type(struct inout_decl *decls, unsigned count, 
unsigned array_id)
 }
 
 struct rename_reg_pair {
-   int old_reg;
+   bool valid;
int new_reg;
 };
 
@@ -568,7 +568,7 @@ public:
 
void simplify_cmp(void);
 
-   void rename_temp_registers(int num_renames, struct rename_reg_pair 
*renames);
+   void rename_temp_registers(struct rename_reg_pair *renames);
void get_first_temp_read(int *first_reads);
void get_first_temp_write(int *first_writes);
void get_last_temp_read_first_temp_write(int *last_reads, int 
*first_writes);
@@ -4813,36 +4813,37 @@ glsl_to_tgsi_visitor::simplify_cmp(void)
 
 /* Replaces all references to a temporary register index with another index. */
 void
-glsl_to_tgsi_visitor::rename_temp_registers(int num_renames, struct 
rename_reg_pair *renames)
+glsl_to_tgsi_visitor::rename_temp_registers(struct rename_reg_pair *renames)
 {
foreach_in_list(glsl_to_tgsi_instruction, inst, &this->instructions) {
   unsigned j;
-  int k;
   for (j = 0; j < num_inst_src_regs(inst); j++) {
- if (inst->src[j].file == PROGRAM_TEMPORARY)
-for (k = 0; k < num_renames; k++)
-   if (inst->src[j].index == renames[k].old_reg)
-  inst->src[j].index = renames[k].new_reg;
+ if (inst->src[j].file == PROGRAM_TEMPORARY) {
+int old_idx = inst->src[j].index;
+if (renames[old_idx].valid)
+   inst->src[j].index = renames[old_idx].new_reg;
+ }
   }
 
   for (j = 0; j < inst->tex_offset_num_offset; j++) {
- if (inst->tex_offsets[j].file == PROGRAM_TEMPORARY)
-for (k = 0; k < num_renames; k++)
-   if (inst->tex_offsets[j].index == renames[k].old_reg)
-  inst->tex_offsets[j].index = renames[k].new_reg;
+ if (inst->tex_offsets[j].file == PROGRAM_TEMPORARY) {
+int old_idx = inst->tex_offsets[j].index;
+if (renames[old_idx].valid)
+   inst->tex_offsets[j].index = renames[old_idx].new_reg;
+ }
   }
 
   if (inst->resource.file == PROGRAM_TEMPORARY) {
- for (k = 0; k < num_renames; k++)
-if (inst->resource.index == renames[k].old_reg)
-   inst->resource.index = renames[k].new_reg;
+ int old_idx = inst->resource.index;
+ if (renames[old_idx].valid)
+inst->resource.index = renames[old_idx].new_reg;
   }
 
   for (j = 0; j < num_inst_dst_regs(inst); j++) {
- if (inst->dst[j].file == PROGRAM_TEMPORARY)
- for (k = 0; k < num_renames; k++)
-if (inst->dst[j].index == renames[k].old_reg)
-   inst->dst[j].index = renames[k].new_reg;
+ if (inst->dst[j].file == PROGRAM_TEMPORARY) {
+int old_idx = inst->dst[j].index;
+if (renames[old_idx].valid)
+   inst->dst[j].index = renames[old_idx].new_reg;}
   }
}
 }
@@ -5423,7 +5424,6 @@ glsl_to_tgsi_visitor::merge_registers(void)
int *first_writes = ralloc_array(mem_ctx, int, this->next_temp);
struct rename_reg_pair *renames = rzalloc_array(mem_ctx, struct 
rename_reg_pair, this->next_temp);
int i, j;
-   int num_renames = 0;
 
/* Read the indices of the last read and first write to each temp register
 * into an array so that we don't have to traverse the instruction list as
@@ -5450,9 +5450,8 @@ glsl_to_tgsi_visitor::merge_registers(void)
   * as the register at index j. */
  if (first_writes[i] <= first_writes[j] &&
  last_reads[i] <= first_writes[j]) {
-renames[num_renames].old_reg = j;
-renames[num_renames].new_reg = i;
-num_renames++;
+renames[j].new_reg = i;
+renames[j].valid = true;
 
 /* Update the first_writes and last_reads arrays with the new
  * values for the merged register index, and mark the newly unused
@@ -5465,7 +5464,7 @@ glsl_to_tgsi_visitor::merge_registers(void)
   }
}
 
-   rename_temp_registers(num_renames, renames);
+   rename_temp_registers(renames);
ralloc_free(renames);
ralloc_free(last_reads);
ralloc_free(first_writes);
@@ -5480,7 +5479,6 @@ glsl_to_tgsi_visitor::renumber_registers(void)
int new_index = 0;
  

Re: [Mesa-dev] [PATCH 1/2] anv: implementation of VK_EXT_debug_report extension

2017-08-24 Thread Jason Ekstrand
On Wed, Aug 23, 2017 at 11:23 PM, Tapani Pälli 
wrote:

> Patch adds required functionality for extension to manage a list of
> application provided callbacks and handle debug reporting from driver
> and application side.
>
> Signed-off-by: Tapani Pälli 
> ---
>  src/intel/Makefile.sources  |   1 +
>  src/intel/vulkan/anv_debug_report.c | 133 ++
> ++
>  src/intel/vulkan/anv_device.c   |  40 +++
>  src/intel/vulkan/anv_extensions.py  |   1 +
>  src/intel/vulkan/anv_private.h  |  32 +
>  5 files changed, 207 insertions(+)
>  create mode 100644 src/intel/vulkan/anv_debug_report.c
>
> diff --git a/src/intel/Makefile.sources b/src/intel/Makefile.sources
> index 4074ba9ee5..200713b06e 100644
> --- a/src/intel/Makefile.sources
> +++ b/src/intel/Makefile.sources
> @@ -205,6 +205,7 @@ VULKAN_FILES := \
> vulkan/anv_batch_chain.c \
> vulkan/anv_blorp.c \
> vulkan/anv_cmd_buffer.c \
> +   vulkan/anv_debug_report.c \
> vulkan/anv_descriptor_set.c \
> vulkan/anv_device.c \
> vulkan/anv_dump.c \
> diff --git a/src/intel/vulkan/anv_debug_report.c
> b/src/intel/vulkan/anv_debug_report.c
> new file mode 100644
> index 00..1a4868cd52
> --- /dev/null
> +++ b/src/intel/vulkan/anv_debug_report.c
> @@ -0,0 +1,133 @@
> +/*
> + * Copyright © 2017 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the
> "Software"),
> + * to deal in the Software without restriction, including without
> limitation
> + * the rights to use, copy, modify, merge, publish, distribute,
> sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the
> next
> + * paragraph) shall be included in all copies or substantial portions of
> the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
> EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT
> SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
> OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> DEALINGS
> + * IN THE SOFTWARE.
> + */
> +
> +#include "anv_private.h"
> +#include "vk_util.h"
> +
> +/* This file contains implementation for VK_EXT_debug_report. */
> +
> +VkResult
> +anv_CreateDebugReportCallbackEXT(VkInstance _instance,
> + const VkDebugReportCallbackCreateInfoEXT*
> pCreateInfo,
> + const VkAllocationCallbacks* pAllocator,
> + VkDebugReportCallbackEXT* pCallback)
> +{
> +   ANV_FROM_HANDLE(anv_instance, instance, _instance);
> +   const VkAllocationCallbacks *alloc =
> +  pAllocator ? pAllocator : &instance->alloc;
>

This is what vk_alloc2 is for.


> +
> +   vk_foreach_struct(info, pCreateInfo) {
>

Usually, we handle the primary structure directly and then call
vk_foreach_struct on pCreateInfo->pNext.  This is because the things in the
pNext chain are going to be modifiers to the original thing so they
probably need to happen between allocating the callback and list_addtail().
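
A minimal sketch of that pattern, using hypothetical stand-in types rather than the real Vulkan headers or anv structures: the primary structure is read directly, and only the extension structures hanging off pNext are visited in the loop, before the object would be published anywhere.

```cpp
#include <cstdint>

// Hypothetical stand-ins for illustration; these are not Vulkan or anv types.
enum structure_type : uint32_t {
   TYPE_CALLBACK_CREATE_INFO = 1,
   TYPE_SOME_EXTENSION = 2,
   TYPE_UNKNOWN = 99,
};

// Common header placed first in every chainable structure.
struct base_header {
   structure_type sType;
   const base_header *pNext;
};

struct callback_create_info {
   base_header base;
   uint32_t flags;
};

struct some_extension_info {
   base_header base;
   uint32_t extra_flags;
};

struct debug_callback {
   uint32_t flags;
   uint32_t extra_flags;
   int ignored;  // number of chained structures that were not recognized
};

debug_callback create_callback(const callback_create_info *info)
{
   debug_callback cb = {};

   // The primary structure is handled directly.
   cb.flags = info->base.sType == TYPE_CALLBACK_CREATE_INFO ? info->flags : 0;

   // Only the modifiers in the pNext chain are walked.
   for (const base_header *ext = info->base.pNext; ext; ext = ext->pNext) {
      switch (ext->sType) {
      case TYPE_SOME_EXTENSION:
         // The header is the first member, so the cast back is well defined.
         cb.extra_flags =
            reinterpret_cast<const some_extension_info *>(ext)->extra_flags;
         break;
      default:
         cb.ignored++;
         break;
      }
   }

   // Adding cb to a shared list would happen here, after all modifiers
   // have been applied.
   return cb;
}
```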


> +  switch (info->sType) {
> +  case VK_STRUCTURE_TYPE_DEBUG_REPORT_CALLBACK_CREATE_INFO_EXT: {
> + struct anv_debug_callback *cb =
> +vk_alloc(alloc, sizeof(struct anv_debug_callback), 8,
> + VK_SYSTEM_ALLOCATION_SCOPE_INSTANCE);
> + if (!cb)
> +return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
> +
> + cb->flags = pCreateInfo->flags;
> + cb->callback = pCreateInfo->pfnCallback;
> + cb->data = pCreateInfo->pUserData;
> +
> + list_addtail(&cb->link, &instance->callbacks);
>

What kind of threading guarantees does debug_report provide?  I'm guessing
none, in which case we need to lock around this list.
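
If the extension really gives no threading guarantees, the locking could look roughly like the sketch below. It uses std::mutex and std::list purely for illustration; the driver itself is C and would use its own list and mutex helpers, and callback_registry is a hypothetical name.

```cpp
#include <algorithm>
#include <cstddef>
#include <list>
#include <mutex>

// Hypothetical callback record; the real one would hold flags, the
// function pointer and the user data.
struct debug_callback {
   int id;
};

// Every access to the list takes the same mutex, so create, destroy and
// report can run concurrently from different threads.
class callback_registry {
public:
   void add(const debug_callback &cb)
   {
      std::lock_guard<std::mutex> lock(mutex_);
      callbacks_.push_back(cb);
   }

   // Returns true if a callback with this id was found and removed.
   bool remove(int id)
   {
      std::lock_guard<std::mutex> lock(mutex_);
      auto it = std::find_if(callbacks_.begin(), callbacks_.end(),
                             [id](const debug_callback &cb) { return cb.id == id; });
      if (it == callbacks_.end())
         return false;
      callbacks_.erase(it);
      return true;
   }

   std::size_t size() const
   {
      std::lock_guard<std::mutex> lock(mutex_);
      return callbacks_.size();
   }

private:
   mutable std::mutex mutex_;
   std::list<debug_callback> callbacks_;
};
```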


> + break;
> +  }
> +  default:
> + anv_debug_ignored_stype(info->sType);
> + break;
> +  }
> +   }
> +
> +   return VK_SUCCESS;
> +}
> +
> +void
> +anv_DestroyDebugReportCallbackEXT(VkInstance _instance,
> +  VkDebugReportCallbackEXT callback,
> +  const VkAllocationCallbacks* pAllocator)
> +{
> +   ANV_FROM_HANDLE(anv_instance, instance, _instance);
> +   const VkAllocationCallbacks *alloc =
> +  pAllocator ? pAllocator : &instance->alloc;
> +
> +   list_for_each_entry_safe(struct anv_debug_callback, debug_cb,
> +&instance->callbacks, link) {
> +  /* Found a match, remove from list and 

Re: [Mesa-dev] [PATCH 08/10] anv: Use DRM sync objects to back fences whenever possible

2017-08-24 Thread Lionel Landwerlin

On 08/08/17 23:45, Jason Ekstrand wrote:

In order to implement VK_KHR_external_fence, we need to back our fences
with something that's shareable.  Since the kernel wait interface for
sync objects already supports waiting for multiple fences in one go, it
makes anv_WaitForFences much simpler if we only have one type of fence.
---
  src/intel/vulkan/anv_batch_chain.c |   8 +++
  src/intel/vulkan/anv_device.c  |   2 +
  src/intel/vulkan/anv_private.h |   4 ++
  src/intel/vulkan/anv_queue.c   | 132 ++---
  4 files changed, 136 insertions(+), 10 deletions(-)

diff --git a/src/intel/vulkan/anv_batch_chain.c 
b/src/intel/vulkan/anv_batch_chain.c
index 5d876e4..15082b5 100644
--- a/src/intel/vulkan/anv_batch_chain.c
+++ b/src/intel/vulkan/anv_batch_chain.c
@@ -1556,6 +1556,14 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
  return result;
   break;
  
+  case ANV_FENCE_TYPE_SYNCOBJ:

+ result = anv_execbuf_add_syncobj(&execbuf, impl->syncobj,
+  I915_EXEC_FENCE_SIGNAL,
+  &device->alloc);
+ if (result != VK_SUCCESS)
+return result;
+ break;
+
default:
   unreachable("Invalid fence type");
}
diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index a6d5215..2e0fa19 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -339,6 +339,8 @@ anv_physical_device_init(struct anv_physical_device *device,
 device->has_exec_async = anv_gem_get_param(fd, I915_PARAM_HAS_EXEC_ASYNC);
 device->has_exec_fence = anv_gem_get_param(fd, I915_PARAM_HAS_EXEC_FENCE);
 device->has_syncobj = anv_gem_get_param(fd, 
I915_PARAM_HAS_EXEC_FENCE_ARRAY);
+   device->has_syncobj_wait = device->has_syncobj &&
+  anv_gem_supports_syncobj_wait(fd);
  
 bool swizzled = anv_gem_get_bit6_swizzle(fd, I915_TILING_X);
  
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h

index 2f89d3f..430652d 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -654,6 +654,7 @@ struct anv_physical_device {
  boolhas_exec_async;
  boolhas_exec_fence;
  boolhas_syncobj;
+boolhas_syncobj_wait;
  
  uint32_teu_total;

  uint32_tsubslice_total;
@@ -1755,6 +1756,9 @@ struct anv_fence_impl {
   struct anv_bo bo;
   enum anv_bo_fence_state state;
} bo;
+
+  /** DRM syncobj handle for syncobj-based fences */
+  uint32_t syncobj;
 };
  };
  
diff --git a/src/intel/vulkan/anv_queue.c b/src/intel/vulkan/anv_queue.c

index 7348e15..8e45bb2 100644
--- a/src/intel/vulkan/anv_queue.c
+++ b/src/intel/vulkan/anv_queue.c
@@ -271,17 +271,25 @@ VkResult anv_CreateFence(
 if (fence == NULL)
return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
  
-   fence->permanent.type = ANV_FENCE_TYPE_BO;

+   if (device->instance->physicalDevice.has_syncobj_wait) {
+  fence->permanent.type = ANV_FENCE_TYPE_SYNCOBJ;
  
-   VkResult result = anv_bo_pool_alloc(&device->batch_bo_pool,
-   &fence->permanent.bo.bo, 4096);
-   if (result != VK_SUCCESS)
-  return result;
-
-   if (pCreateInfo->flags & VK_FENCE_CREATE_SIGNALED_BIT) {
-  fence->permanent.bo.state = ANV_BO_FENCE_STATE_SIGNALED;
+  fence->permanent.syncobj = anv_gem_syncobj_create(device);
+  if (!fence->permanent.syncobj)
+ return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);


Don't you need to do something when the fence is created with the 
signaled bit with drm syncobj?
I didn't see anything in the spec that would make this illegal so I 
assume we have to handle it.



 } else {
-  fence->permanent.bo.state = ANV_BO_FENCE_STATE_RESET;
+  fence->permanent.type = ANV_FENCE_TYPE_BO;
+
+  VkResult result = anv_bo_pool_alloc(&device->batch_bo_pool,
+  &fence->permanent.bo.bo, 4096);
+  if (result != VK_SUCCESS)
+ return result;
+
+  if (pCreateInfo->flags & VK_FENCE_CREATE_SIGNALED_BIT) {
+ fence->permanent.bo.state = ANV_BO_FENCE_STATE_SIGNALED;
+  } else {
+ fence->permanent.bo.state = ANV_BO_FENCE_STATE_RESET;
+  }
 }
  
 *pFence = anv_fence_to_handle(fence);

@@ -301,6 +309,10 @@ anv_fence_impl_cleanup(struct anv_device *device,
 case ANV_FENCE_TYPE_BO:
anv_bo_pool_free(&device->batch_bo_pool, &impl->bo.bo);
return;
+
+   case ANV_FENCE_TYPE_SYNCOBJ:
+  anv_gem_syncobj_destroy(device, impl->syncobj);
+  return;
 }
  
 unreachable("Invalid fence type");

@@ -328,6 +340,8 @@ VkResult anv_ResetFences(
  uint32_tfenceCount,

Re: [Mesa-dev] [PATCH] i965: add 2xMSAA and 16xMSAA to DRI configs for Gen9.

2017-08-24 Thread Ben Widawsky

On 17-08-24 14:16:39, kevin.rogo...@intel.com wrote:

From: Kevin Rogovin 

Special thanks to Eero Tamminen for reporting rasterizer
numbers being twice what it should be for 2xMSAA under
a benchmark.

Signed-off-by: Kevin Rogovin 
---
src/mesa/drivers/dri/i965/intel_screen.c | 14 +++---
1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_screen.c 
b/src/mesa/drivers/dri/i965/intel_screen.c
index 579554f..67eb776 100644
--- a/src/mesa/drivers/dri/i965/intel_screen.c
+++ b/src/mesa/drivers/dri/i965/intel_screen.c
@@ -1882,7 +1882,7 @@ intel_screen_make_configs(__DRIscreen *dri_screen)
   };

   static const uint8_t singlesample_samples[1] = {0};
-   static const uint8_t multisample_samples[2]  = {4, 8};
+   static const uint8_t multisample_samples_2_4_8_16[]  = {2, 4, 8, 16};

   struct intel_screen *screen = dri_screen->driverPrivate;
   const struct gen_device_info *devinfo = &screen->devinfo;
@@ -1959,6 +1959,7 @@ intel_screen_make_configs(__DRIscreen *dri_screen)
* supported.  Singlebuffer configs are not supported because no one wants
* them.
*/
+


No unnecessary whitespace changes, please.


   for (unsigned i = 0; i < ARRAY_SIZE(formats); i++) {
  if (devinfo->gen < 6)
 break;
@@ -1966,6 +1967,7 @@ intel_screen_make_configs(__DRIscreen *dri_screen)
  __DRIconfig **new_configs;
  const int num_depth_stencil_bits = 2;
  int num_msaa_modes = 0;
+  const uint8_t *multisample_samples = NULL;

  depth_bits[0] = 0;
  stencil_bits[0] = 0;
@@ -1978,10 +1980,16 @@ intel_screen_make_configs(__DRIscreen *dri_screen)
 stencil_bits[1] = 8;
  }

-  if (devinfo->gen >= 7)
+  if (devinfo->gen >= 9) {
+ multisample_samples = multisample_samples_2_4_8_16;
+ num_msaa_modes = 4;
+  } else if (devinfo->gen >= 7) {
+ multisample_samples = multisample_samples_2_4_8_16 + 1;
 num_msaa_modes = 2;
-  else if (devinfo->gen == 6)
+  } else if (devinfo->gen == 6) {
+ multisample_samples = multisample_samples_2_4_8_16 + 1;
 num_msaa_modes = 1;
+  }


I think it'd be a little cleaner to just make GEN-specific arrays. It's easier to
read, and you can just use ARRAY_SIZE, but I honestly don't care much.

if (devinfo->gen >= 9) {
  multisample_samples = multisample_samples_gen9;
  num_msaa_modes = ARRAY_SIZE(multisample_samples_gen9);
}
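
Spelled out for all three cases, the suggestion might look like the sketch below. Only multisample_samples_gen9 comes from the snippet above; the gen7 and gen6 array names, the msaa_modes struct and select_msaa_modes are hypothetical, and the sample counts are the ones the patch selects for each generation.

```cpp
#include <cstddef>
#include <cstdint>

// Local definition so the sketch is self-contained.
#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))

static const uint8_t multisample_samples_gen9[] = {2, 4, 8, 16};
static const uint8_t multisample_samples_gen7[] = {4, 8};  // hypothetical name
static const uint8_t multisample_samples_gen6[] = {4};     // hypothetical name

struct msaa_modes {
   const uint8_t *samples;  // NULL when no multisample configs are exposed
   size_t count;
};

// Pick the sample-count table for a hardware generation. Each branch
// names its own array, so the count always matches the table.
static msaa_modes select_msaa_modes(int gen)
{
   if (gen >= 9)
      return {multisample_samples_gen9, ARRAY_SIZE(multisample_samples_gen9)};
   if (gen >= 7)
      return {multisample_samples_gen7, ARRAY_SIZE(multisample_samples_gen7)};
   if (gen == 6)
      return {multisample_samples_gen6, ARRAY_SIZE(multisample_samples_gen6)};
   return {nullptr, 0};
}
```

Compared with offsetting into one shared array, this costs a few extra bytes of static data but removes the hard-coded mode counts.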



  new_configs = driCreateConfigs(formats[i],
 depth_bits,


Kind of shocking to me that we missed this previously for both when we added 2x
MSAA and later 16x. Indeed looking at glxinfo, I see no 2x or 16x visuals.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH mesa] i965: add missing `const` in function signature

2017-08-24 Thread Ben Widawsky

On 17-08-24 11:01:34, Matt Turner wrote:

Reviewed-by: Matt Turner 


I'm blaming this one on someone else's rebase ;-)
Reviewed-by: Ben Widawsky 




Re: [Mesa-dev] [PATCH] gallium/vbuf: fix buffer reference bugs

2017-08-24 Thread Marek Olšák
Reviewed-by: Marek Olšák 

Marek

On Thu, Aug 24, 2017 at 6:48 PM, Brian Paul  wrote:
> In two places we called pipe_resource_reference() to remove a reference
> to a vertex buffer resource.  But we neglected to check if the buffer was
> a user buffer and not a pipe_resource.  This caused us to pass an invalid
> pipe_resource pointer to pipe_resource_reference().
>
> Instead of calling pipe_resource_reference(&vbuf->resource, NULL), use
> pipe_vertex_buffer_unreference() which checks the is_user_buffer
> field and does the right thing.
>
> Also, explicitly set the is_user_buffer field to false after setting the
> vbuf->resource pointer to out_buffer.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102377
> ---
>  src/gallium/auxiliary/util/u_vbuf.c | 7 +++
>  1 file changed, 3 insertions(+), 4 deletions(-)
>
> diff --git a/src/gallium/auxiliary/util/u_vbuf.c 
> b/src/gallium/auxiliary/util/u_vbuf.c
> index 6dc8bc7..80c30ac 100644
> --- a/src/gallium/auxiliary/util/u_vbuf.c
> +++ b/src/gallium/auxiliary/util/u_vbuf.c
> @@ -513,9 +513,9 @@ u_vbuf_translate_buffers(struct u_vbuf *mgr, struct 
> translate_key *key,
> mgr->real_vertex_buffer[out_vb].stride = key->output_stride;
>
> /* Move the buffer reference. */
> -   pipe_resource_reference(
> -  &mgr->real_vertex_buffer[out_vb].buffer.resource, NULL);
> +   pipe_vertex_buffer_unreference(&mgr->real_vertex_buffer[out_vb]);
> mgr->real_vertex_buffer[out_vb].buffer.resource = out_buffer;
> +   mgr->real_vertex_buffer[out_vb].is_user_buffer = false;
>
> return PIPE_OK;
>  }
> @@ -833,8 +833,7 @@ void u_vbuf_set_vertex_buffers(struct u_vbuf *mgr,
>   unsigned dst_index = start_slot + i;
>
>   pipe_vertex_buffer_unreference(&mgr->vertex_buffer[dst_index]);
> - pipe_resource_reference(&mgr->real_vertex_buffer[dst_index].buffer.resource,
> - NULL);
> + pipe_vertex_buffer_unreference(&mgr->real_vertex_buffer[dst_index]);
>}
>
>pipe->set_vertex_buffers(pipe, start_slot, count, NULL);
> --
> 1.9.1
>


[Mesa-dev] [PATCH] gallium/vbuf: fix buffer reference bugs

2017-08-24 Thread Brian Paul
In two places we called pipe_resource_reference() to remove a reference
to a vertex buffer resource.  But we neglected to check if the buffer was
a user buffer and not a pipe_resource.  This caused us to pass an invalid
pipe_resource pointer to pipe_resource_reference().

Instead of calling pipe_resource_reference(&vbuf->resource, NULL), use
pipe_vertex_buffer_unreference() which checks the is_user_buffer
field and does the right thing.

Also, explicitly set the is_user_buffer field to false after setting the
vbuf->resource pointer to out_buffer.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102377
---
 src/gallium/auxiliary/util/u_vbuf.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/src/gallium/auxiliary/util/u_vbuf.c 
b/src/gallium/auxiliary/util/u_vbuf.c
index 6dc8bc7..80c30ac 100644
--- a/src/gallium/auxiliary/util/u_vbuf.c
+++ b/src/gallium/auxiliary/util/u_vbuf.c
@@ -513,9 +513,9 @@ u_vbuf_translate_buffers(struct u_vbuf *mgr, struct 
translate_key *key,
mgr->real_vertex_buffer[out_vb].stride = key->output_stride;
 
/* Move the buffer reference. */
-   pipe_resource_reference(
-  &mgr->real_vertex_buffer[out_vb].buffer.resource, NULL);
+   pipe_vertex_buffer_unreference(&mgr->real_vertex_buffer[out_vb]);
mgr->real_vertex_buffer[out_vb].buffer.resource = out_buffer;
+   mgr->real_vertex_buffer[out_vb].is_user_buffer = false;
 
return PIPE_OK;
 }
@@ -833,8 +833,7 @@ void u_vbuf_set_vertex_buffers(struct u_vbuf *mgr,
  unsigned dst_index = start_slot + i;
 
  pipe_vertex_buffer_unreference(&mgr->vertex_buffer[dst_index]);
- pipe_resource_reference(&mgr->real_vertex_buffer[dst_index].buffer.resource,
- NULL);
+ pipe_vertex_buffer_unreference(&mgr->real_vertex_buffer[dst_index]);
   }
 
   pipe->set_vertex_buffers(pipe, start_slot, count, NULL);
-- 
1.9.1



[Mesa-dev] [Bug 102377] PIPE_*_4BYTE_ALIGNED_ONLY caps crashing

2017-08-24 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=102377

--- Comment #1 from Brian Paul  ---
See proposed patch on mesa-dev.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


Re: [Mesa-dev] TGSI 16-bit support

2017-08-24 Thread Marek Olšák
On Thu, Aug 24, 2017 at 3:39 PM, Nicolai Hähnle  wrote:
> On 24.08.2017 14:19, Erik Faye-Lund wrote:
>>
>> On Wed, Aug 23, 2017 at 6:19 PM, Marek Olšák  wrote:
>>>
>>> On Wed, Aug 23, 2017 at 3:08 PM, Nicolai Hähnle 
>>> wrote:


 Here's another question: What does "low precision" mean on a texture
 instruction? Are the offsets low precision or is it the output? Maybe we
 can
 punt on this for now -- at least GCN doesn't have low precision there
 anyway.
>>>
>>>
>>> HalfPrecision means that all dst and src sources can be 16-bit.
>>>
>>> If the consumer of a TEX instruction is 16-bit, TEX should return
>>> 16-bit automatically. If a source of a TEX instruction is 16-bit, TEX
>>> should accept 16-bit automatically.
>>
>>
>> This sounds inconsistent with how lowp works; a texture-sampler
>> declared as lowp in GLSL only have low precision output AFAIK.
>
>
> lowp and mediump are entirely optional concepts. Their precision can be as
> high as highp operations.

FWIW, we probably won't treat lowp differently from mediump. Its
precision requirements are low and weird: float could be an 11-bit
fixed-point type with 8 fractional bits, and int/uint should have at
least 9 bits.
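
To make the float case concrete, here is a small sketch of what rounding to such a type would do. The helper quantize_lowp is hypothetical, and it assumes a two's-complement layout (raw range -1024 to 1023, step 1/256) with saturation at the ends; neither assumption comes from the spec text.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical helper, for illustration only: round a float to the nearest
// value representable in an 11-bit two's-complement fixed-point format with
// 8 fractional bits, saturating at the ends of the range.
float quantize_lowp(float value)
{
   const float scale = 256.0f;  // 2^8 steps per unit
   const long raw_min = -1024;  // -(2^10)
   const long raw_max = 1023;   // 2^10 - 1
   long raw = std::lround(value * scale);
   raw = std::min(raw_max, std::max(raw_min, raw));
   return static_cast<float>(raw) / scale;
}
```

With these assumptions the step between neighbouring values is 1/256 and anything beyond roughly +/-4 saturates.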

Marek


Re: [Mesa-dev] [PATCH 02/47] mesa/st: Handle 16-bit types at st_glsl_storage_type_size()

2017-08-24 Thread Marek Olšák
Reviewed-by: Marek Olšák 

Marek

On Thu, Aug 24, 2017 at 3:54 PM, Alejandro Piñeiro  wrote:
> From: Eduardo Lima Mitev 
>
> This is basically to avoid "not handle in switch" warnings.
>
> v2: Let the new types hit the assertion instead. (Marek Olšák
> and Jason Ekstrand)
> ---
>  src/mesa/state_tracker/st_glsl_types.cpp | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/src/mesa/state_tracker/st_glsl_types.cpp 
> b/src/mesa/state_tracker/st_glsl_types.cpp
> index 50936025d9f..e57fbc8f314 100644
> --- a/src/mesa/state_tracker/st_glsl_types.cpp
> +++ b/src/mesa/state_tracker/st_glsl_types.cpp
> @@ -98,6 +98,9 @@ st_glsl_storage_type_size(const struct glsl_type *type, 
> bool is_bindless)
> case GLSL_TYPE_VOID:
> case GLSL_TYPE_ERROR:
> case GLSL_TYPE_FUNCTION:
> +   case GLSL_TYPE_FLOAT16:
> +   case GLSL_TYPE_UINT16:
> +   case GLSL_TYPE_INT16:
>assert(!"Invalid type in type_size");
>break;
> }
> --
> 2.11.0
>


Re: [Mesa-dev] [PATCH] gallium/docs: add reference links for resource_create method

2017-08-24 Thread Marek Olšák
Reviewed-by: Marek Olšák 

Marek

On Thu, Aug 24, 2017 at 4:17 PM, Gwan-gyeong Mun  wrote:
> It adds reference links for arguments usage and bind of resource_create().
>
> Signed-off-by: Mun Gwan-gyeong 
> ---
>  src/gallium/docs/source/screen.rst | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/src/gallium/docs/source/screen.rst 
> b/src/gallium/docs/source/screen.rst
> index 930e5bd5f0..426edadf7f 100644
> --- a/src/gallium/docs/source/screen.rst
> +++ b/src/gallium/docs/source/screen.rst
> @@ -739,9 +739,9 @@ For cube maps this must be 6, for other textures 1.
>  **nr_samples** the nr of msaa samples. 0 (or 1) specifies a resource
>  which isn't multisampled.
>
> -**usage** one of the PIPE_USAGE flags.
> +**usage** one of the :ref:`PIPE_USAGE` flags.
>
> -**bind** bitmask of the PIPE_BIND flags.
> +**bind** bitmask of the :ref:`PIPE_BIND` flags.
>
>  **flags** bitmask of PIPE_RESOURCE_FLAG flags.
>
> --
> 2.14.1
>


Re: [Mesa-dev] [PATCH] gallium/docs: fix a reference link for get_paramf

2017-08-24 Thread Marek Olšák
Reviewed-by: Marek Olšák 

Marek

On Thu, Aug 24, 2017 at 3:53 PM, Gwan-gyeong Mun  wrote:
> Previous get_paramf links same as get_param. It changes the reference link to
> PIPE_CAPF_*
>
> Signed-off-by: Mun Gwan-gyeong 
> ---
>  src/gallium/docs/source/screen.rst | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/gallium/docs/source/screen.rst 
> b/src/gallium/docs/source/screen.rst
> index be14ddd0c0..930e5bd5f0 100644
> --- a/src/gallium/docs/source/screen.rst
> +++ b/src/gallium/docs/source/screen.rst
> @@ -670,7 +670,7 @@ get_paramf
>
>  Get a floating-point screen parameter.
>
> -**param** is one of the :ref:`PIPE_CAP` names.
> +**param** is one of the :ref:`PIPE_CAPF` names.
>
>  context_create
>  ^^
> --
> 2.14.1
>


Re: [Mesa-dev] [PATCH] radeonsi: set IF_THRESHOLD to 4

2017-08-24 Thread Marek Olšák
Acked-by: Marek Olšák 

Marek

On Thu, Aug 24, 2017 at 2:46 PM, Timothy Arceri  wrote:
> In 74e39de9324d it was set to 3 and it was reported that 4 caused
> tesseract to start spilling VGPRs. This no longer seems to be the
> case.
>
> Totals:
> SGPRS: 2787844 -> 2787764 (-0.00 %)
> VGPRS: 1713121 -> 1712717 (-0.02 %)
> Spilled SGPRs: 7532 -> 7532 (0.00 %)
> Spilled VGPRs: 49 -> 33 (-32.65 %)
> Private memory VGPRs: 2060 -> 2060 (0.00 %)
> Scratch size: 2200 -> 2180 (-0.91 %) dwords per thread
> Code Size: 79265520 -> 79248360 (-0.02 %) bytes
> LDS: 436 -> 436 (0.00 %) blocks
> Max Waves: 670535 -> 670608 (0.01 %)
> Wait states: 0 -> 0 (0.00 %)
>
> Before:
>  VGPR SPILLING APPS     Shaders  SpillVGPR  PrivVGPR  ScratchSize
>  EffectsCaveDemo            301          0       256          264
>  ReflectionsSubwayDemo      264          0       256          264
>  VehicleGame                295          0       128          132
>  bioshock-infinite         1140          0       448          516
>  dirt-showdown              453         33         0           28
>  gang-beasts                364          0       500          496
>  kerbal-space-program      1228          0       472          480
>  tomb-raider-ultra         1199         16         0           20
>
> After:
>  VGPR SPILLING APPS     Shaders  SpillVGPR  PrivVGPR  ScratchSize
>  EffectsCaveDemo            301          0       256          264
>  ReflectionsSubwayDemo      264          0       256          264
>  VehicleGame                295          0       128          132
>  bioshock-infinite         1140          0       448          516
>  dirt-showdown              453         33         0           28
>  gang-beasts                364          0       500          496
>  kerbal-space-program      1228          0       472          480
>
> The only change in VGPR spills is the elimination of all spills
> in Tomb Raider at Ultra settings. Closer examination shows that
> the shaders go over the limit because they contain three
> expressions: a mul, an rcp and a ubo load. The ubo load is actually
> used elsewhere and is therefore already stored in a temp in IRs
> such as tgsi, but glsl ir counts it against the if cost.
>
> Cc: Marek Olšák 
> ---
>  src/gallium/drivers/radeonsi/si_pipe.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/gallium/drivers/radeonsi/si_pipe.c 
> b/src/gallium/drivers/radeonsi/si_pipe.c
> index 80a77a8f1f..1d2b7528ee 100644
> --- a/src/gallium/drivers/radeonsi/si_pipe.c
> +++ b/src/gallium/drivers/radeonsi/si_pipe.c
> @@ -769,21 +769,21 @@ static int si_get_shader_param(struct pipe_screen* 
> pscreen,
> return SI_NUM_IMAGES;
> case PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINT:
> return 32;
> case PIPE_SHADER_CAP_PREFERRED_IR:
> if (sscreen->b.debug_flags & DBG_NIR &&
> (shader == PIPE_SHADER_VERTEX ||
>  shader == PIPE_SHADER_FRAGMENT))
> return PIPE_SHADER_IR_NIR;
> return PIPE_SHADER_IR_TGSI;
> case PIPE_SHADER_CAP_LOWER_IF_THRESHOLD:
> -   return 3;
> +   return 4;
>
> /* Supported boolean features. */
> case PIPE_SHADER_CAP_TGSI_CONT_SUPPORTED:
> case PIPE_SHADER_CAP_TGSI_SQRT_SUPPORTED:
> case PIPE_SHADER_CAP_INDIRECT_TEMP_ADDR:
> case PIPE_SHADER_CAP_INDIRECT_CONST_ADDR:
> case PIPE_SHADER_CAP_INTEGERS:
> case PIPE_SHADER_CAP_TGSI_FMA_SUPPORTED:
> case PIPE_SHADER_CAP_TGSI_ANY_INOUT_DECL_RANGE:
> case PIPE_SHADER_CAP_TGSI_SKIP_MERGE_REGISTERS:
> --
> 2.13.4
>
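
For anyone following the threshold change, here is a minimal sketch in C of why a branch with three expressions is affected. It assumes, as I read the Gallium documentation for PIPE_SHADER_CAP_LOWER_IF_THRESHOLD, that branches with a cost lower than the threshold get lowered. The helper name branch_is_lowered and the one-unit-per-expression cost model are hypothetical, not the actual state tracker code.

```c
#include <stdbool.h>

/* Hypothetical cost model: one unit per expression inside the branch. */
static bool
branch_is_lowered(unsigned branch_cost, unsigned threshold)
{
   /* Assumption: a branch is flattened only when its cost is strictly
    * below the threshold reported by the driver. */
   return branch_cost < threshold;
}
```

Under that assumption a branch costing 3 (mul, rcp, UBO load) is not lowered with a threshold of 3 but is lowered with a threshold of 4, which would match the observation in the commit message.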


Re: [Mesa-dev] [PATCH 1/4] radeonsi: correct maximum wave count per SIMD

2017-08-24 Thread Marek Olšák
On Thu, Aug 24, 2017 at 3:24 PM, Nicolai Hähnle  wrote:
> On 24.08.2017 13:32, Marek Olšák wrote:
>>
>>
>>
>> On Aug 24, 2017 1:21 PM, "Nicolai Hähnle" wrote:
>>
>> On 24.08.2017 12:34, Marek Olšák wrote:
>>
>>
>>
>> On Aug 24, 2017 10:17 AM, "Nicolai Hähnle" wrote:
>>
>>  On 23.08.2017 22:44, Marek Olšák wrote:
>>
>>  From: Marek Olšák
>>
>>
>>
>>  ---
>> src/gallium/drivers/radeonsi/si_shader.c | 17
>> -
>> 1 file changed, 16 insertions(+), 1 deletion(-)
>>
>>  diff --git a/src/gallium/drivers/radeonsi/si_shader.c
>>  b/src/gallium/drivers/radeonsi/si_shader.c
>>  index f02fc9e..186a3dd 100644
>>  --- a/src/gallium/drivers/radeonsi/si_shader.c
>>  +++ b/src/gallium/drivers/radeonsi/si_shader.c
>>  @@ -5029,21 +5029,36 @@ static void
>> si_shader_dump_stats(struct
>>  si_screen *sscreen,
>>struct
>> pipe_debug_callback *debug,
>>unsigned processor,
>>FILE *file,
>>bool
>> check_debug_option)
>> {
>>   const struct si_shader_config *conf = &shader->config;
>>   unsigned num_inputs = shader->selector ?
>>  shader->selector->info.num_inputs : 0;
>>   unsigned code_size =
>> si_get_shader_binary_size(shader);
>>   unsigned lds_increment = sscreen->b.chip_class
>>  >= CIK ?
>>  512 : 256;
>>   unsigned lds_per_wave = 0;
>>  -   unsigned max_simd_waves = 10;
>>  +   unsigned max_simd_waves;
>>  +
>>  +   switch (sscreen->b.family) {
>>  +   /* SGPR initialization bug workaround on Tonga
>> and
>>  Iceland reduces
>>  +* the wave count to 8. */
>>  +   case CHIP_TONGA:
>>  +   case CHIP_ICELAND:
>>  +   /* These always have 8 waves: */
>>  +   case CHIP_POLARIS10:
>>  +   case CHIP_POLARIS11:
>>  +   case CHIP_POLARIS12:
>>  +   max_simd_waves = 8;
>>
>>
>>  This should be implied by the num_sgprs set by LLVM, though.
>>
>>
>> I have no idea what you mean or why it's relevant.
>>
>>
>> There's this code later in the function:
>>
>>  if (conf->num_sgprs) {
>>  if (sscreen->b.chip_class >= VI)
>>  max_simd_waves = MIN2(max_simd_waves, 800 /
>> conf->num_sgprs);
>>  else
>>  max_simd_waves = MIN2(max_simd_waves, 512 /
>> conf->num_sgprs);
>>  }
>>
>>
>> Yes, but that's alright. Why is it important? num_sgprs is always nonzero.
>
>
> It means that the SGPR initialization bug part of the change is redundant.
> LLVM sets the num_sgprs in such a way that the calculation
>
>   max_simd_waves = MIN2(max_simd_waves, 800 / conf->num_sgprs);
>
> will set max_simd_waves to 8.

Ah yes, that's true.

Marek
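
To make the arithmetic above concrete, here is a small sketch of the clamp quoted from si_shader_dump_stats. The helper name clamp_waves_by_sgprs and the boolean parameter are hypothetical, and the num_sgprs value of 96 used in the note below is only an assumed example of what LLVM might report with the workaround active; the thread does not state the exact number.

```c
#include <stdbool.h>

#define MIN2(a, b) ((a) < (b) ? (a) : (b))

/* Hypothetical helper mirroring the quoted snippet: clamp the per-SIMD
 * wave count by the SGPR budget (800 on VI and later, 512 before). */
static unsigned
clamp_waves_by_sgprs(unsigned max_simd_waves, unsigned num_sgprs,
                     bool is_vi_or_later)
{
   if (num_sgprs) {
      unsigned budget = is_vi_or_later ? 800 : 512;
      max_simd_waves = MIN2(max_simd_waves, budget / num_sgprs);
   }
   return max_simd_waves;
}
```

With the assumed value, clamp_waves_by_sgprs(10, 96, true) gives 800 / 96 = 8 by integer division, so the starting value of 10 is reduced to 8 without a per-chip case. Any num_sgprs from 89 to 100 gives the same result.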


Re: [Mesa-dev] [PATCH 00/10] gallium: normalize CONST file accesses to 2D

2017-08-24 Thread Roland Scheidegger
We'll definitely have to adjust some code too, but looks alright to me.
For the series:
Acked-by: Roland Scheidegger 


Am 23.08.2017 um 18:41 schrieb Nicolai Hähnle:
> Hi all,
> 
> Following the discussion on Timothy's std430 packing series, here's
> a quick proposal to just always use 2D accesses to the CONST file
> in TGSI.
> 
> The first patch should be sufficient for all drivers to accept
> those 2D accesses. It seems that most older drivers simply ignore
> the dimension, and newer ones should handle it directly.
> 
> Subsequent patches modify the producers of TGSI to always use 2D
> constant references. This is mostly done by changing ureg.
> 
> Finally, the last patch adds an assertion to radeonsi to make
> sure all constant references are really 2D. It has survived my
> very superficial initial testing.
> 
> What needs to be tested is:
> - some more drivers
> - Nine
> - TGSI-to-NIR
> 
> You can find the series here: 
> https://cgit.freedesktop.org/~nh/mesa/log/?h=tgsi-const-2d
> 
> Please comment/review!
> Thanks,
> Nicolai
> --
>  src/gallium/auxiliary/hud/hud_context.c  |   8 +-
>  src/gallium/auxiliary/nir/tgsi_to_nir.c  |   2 +-
>  src/gallium/auxiliary/postprocess/pp_mlaa.h  |  20 +--
>  src/gallium/auxiliary/tgsi/tgsi_ureg.c   |  22 +--
>  src/gallium/auxiliary/util/u_tests.c |   4 +-
>  src/gallium/docs/source/screen.rst   |  11 +-
>  src/gallium/drivers/radeon/r600_query.c  |  36 ++--
>  src/gallium/drivers/radeonsi/si_shader.c |   1 +
>  src/gallium/state_trackers/nine/nine_ff.c|   2 +-
>  .../state_trackers/nine/nine_shader.c|  10 +-
>  .../tests/graw/fragment-shader/frag-cb-1d.sh |   8 +-
>  .../tests/graw/vertex-shader/vert-cb-1d.sh   |   8 +-
>  src/mesa/state_tracker/st_glsl_to_tgsi.cpp   | 153 +
>  13 files changed, 136 insertions(+), 149 deletions(-)
> 



Re: [Mesa-dev] [PATCH 4/4] radeonsi: get the raster config from AMDGPU on SI

2017-08-24 Thread Marek Olšák
On Thu, Aug 24, 2017 at 3:44 PM, Alex Deucher  wrote:
> On Thu, Aug 24, 2017 at 4:20 AM, Nicolai Hähnle  wrote:
>> Patches 2-4:
>>
>> Reviewed-by: Nicolai Hähnle 
>>
>>
>> On 23.08.2017 22:44, Marek Olšák wrote:
>>>
>>> From: Marek Olšák 
>>>
>>> Not sure yet if we wanna do this on CIK and VI too.
>
> Any reason why we only do this for SI?

I don't remember the DRM version which started exposing correct raster
configs. Also, SI is still kinda experimental.

Marek


[Mesa-dev] [Bug 102394] RBDOOM3BFG digital vomit

2017-08-24 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=102394

Bug ID: 102394
   Summary: RBDOOM3BFG digital vomit
   Product: Mesa
   Version: unspecified
  Hardware: Other
OS: All
Status: NEW
  Severity: normal
  Priority: medium
 Component: Drivers/Gallium/swr
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: dlol...@gmail.com
QA Contact: mesa-dev@lists.freedesktop.org

Before I installed the Padoka PPA RBDOOM3BFG worked fine, but afterwards I get
this corruption on screen.  I'm using 64-bit Mint 18.2 on a Radeon HD 7850.  

Here's an image of it in action.
https://user-images.githubusercontent.com/31163475/29541369-954b34b6-86a1-11e7-9825-4305f71498b5.png

Here's what I reported to the RBDOOM3BFG github.  
https://github.com/RobertBeckebans/RBDOOM-3-BFG/issues/386

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


[Mesa-dev] [Bug 100038] [BDW] gpu hangs ecode 8:0:0x84d77c1c

2017-08-24 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=100038

Elizabeth changed:

           What    |Removed                               |Added
-----------------------------------------------------------------------------------------
            Product|DRI                                   |Mesa
           Assignee|intel-gfx-bugs@lists.freedesktop.org  |mesa-dev@lists.freedesktop.org
          Component|DRM/Intel                             |Other
         QA Contact|intel-gfx-bugs@lists.freedesktop.org  |mesa-dev@lists.freedesktop.org
            Version|XOrg git                              |unspecified

--- Comment #3 from Elizabeth  ---
From GPU crash dump:
0xff02d5d8:  0x7b05: 3DPRIMITIVE: fail sequential
0xff02d5dc:  0x:vertex count
0xff02d5e0:  0x101a:start vertex
0xff02d5e4:  0x0fa8:instance count
0xff02d5e8:  0x0001:start instance
0xff02d5ec:  0x:index bias
0xff02d5f0:  0x: MI_NOOP
0xff02d5f4:  0x0500: MI_BATCH_BUFFER_END

The last action executed when the hang occurred:
0xff02d5d8:  0x7b05: 3DPRIMITIVE: fail sequential

This seems to be a Mesa problem. Please add your Mesa version information
and try the latest version.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


Re: [Mesa-dev] [PATCH 1/3] st/omx: move YUV deinterlace function to common

2017-08-24 Thread Christian König

Am 24.08.2017 um 17:11 schrieb Leo Liu:

Signed-off-by: Leo Liu 


Reviewed-by: Christian König  for the series.

Andy, do you want to test this? It should make VA-API transcoding simpler
to use.


Regards,
Christian.


---
  src/gallium/auxiliary/vl/vl_compositor.c | 87 +---
  src/gallium/auxiliary/vl/vl_compositor.h | 21 
  src/gallium/state_trackers/omx/vid_dec.c | 32 +---
  3 files changed, 68 insertions(+), 72 deletions(-)

diff --git a/src/gallium/auxiliary/vl/vl_compositor.c 
b/src/gallium/auxiliary/vl/vl_compositor.c
index a79bf11264..794c8b5b17 100644
--- a/src/gallium/auxiliary/vl/vl_compositor.c
+++ b/src/gallium/auxiliary/vl/vl_compositor.c
@@ -885,6 +885,32 @@ draw_layers(struct vl_compositor *c, struct 
vl_compositor_state *s, struct u_rec
 }
  }
  
+static void

+set_yuv_layer(struct vl_compositor_state *s, struct vl_compositor *c, unsigned layer,
+  struct pipe_video_buffer *buffer, struct u_rect *src_rect,
+  struct u_rect *dst_rect, bool y)
+{
+   struct pipe_sampler_view **sampler_views;
+   unsigned i;
+
+   assert(s && c && buffer);
+
+   assert(layer < VL_COMPOSITOR_MAX_LAYERS);
+
+   s->used_layers |= 1 << layer;
+   sampler_views = buffer->get_sampler_view_components(buffer);
+   for (i = 0; i < 3; ++i) {
+  s->layers[layer].samplers[i] = c->sampler_linear;
+  pipe_sampler_view_reference(&s->layers[layer].sampler_views[i], sampler_views[i]);
+   }
+
+   calc_src_and_dst(&s->layers[layer], buffer->width, buffer->height,
+                    src_rect ? *src_rect : default_rect(&s->layers[layer]),
+                    dst_rect ? *dst_rect : default_rect(&s->layers[layer]));
+
+   s->layers[layer].fs = (y) ? c->fs_weave_yuv.y : c->fs_weave_yuv.uv;
+}
+
  void
  vl_compositor_reset_dirty_area(struct u_rect *dirty)
  {
@@ -1143,36 +1169,6 @@ vl_compositor_set_layer_rotation(struct 
vl_compositor_state *s,
  }
  
  void

-vl_compositor_set_yuv_layer(struct vl_compositor_state *s,
-struct vl_compositor *c,
-unsigned layer,
-struct pipe_video_buffer *buffer,
-struct u_rect *src_rect,
-struct u_rect *dst_rect,
-bool y)
-{
-   struct pipe_sampler_view **sampler_views;
-   unsigned i;
-
-   assert(s && c && buffer);
-
-   assert(layer < VL_COMPOSITOR_MAX_LAYERS);
-
-   s->used_layers |= 1 << layer;
-   sampler_views = buffer->get_sampler_view_components(buffer);
-   for (i = 0; i < 3; ++i) {
-  s->layers[layer].samplers[i] = c->sampler_linear;
-  pipe_sampler_view_reference(&s->layers[layer].sampler_views[i], sampler_views[i]);
-   }
-
-   calc_src_and_dst(&s->layers[layer], buffer->width, buffer->height,
-                    src_rect ? *src_rect : default_rect(&s->layers[layer]),
-                    dst_rect ? *dst_rect : default_rect(&s->layers[layer]));
-
-   s->layers[layer].fs = (y) ? c->fs_weave_yuv.y : c->fs_weave_yuv.uv;
-}
-
-void
  vl_compositor_render(struct vl_compositor_state *s,
   struct vl_compositor   *c,
   struct pipe_surface*dst_surface,
@@ -1215,6 +1211,37 @@ vl_compositor_render(struct vl_compositor_state *s,
 draw_layers(c, s, dirty_area);
  }
  
+void

+vl_compositor_yuv_deint(struct vl_compositor_state *s,
+struct vl_compositor *c,
+struct pipe_video_buffer *src,
+struct pipe_video_buffer *dst)
+{
+   struct pipe_surface **dst_surfaces;
+   struct u_rect dst_rect;
+
+   dst_surfaces = dst->get_surfaces(dst);
+   vl_compositor_clear_layers(s);
+
+   dst_rect.x0 = 0;
+   dst_rect.x1 = src->width;
+   dst_rect.y0 = 0;
+   dst_rect.y1 = src->height;
+
+   set_yuv_layer(s, c, 0, src, NULL, NULL, true);
+   vl_compositor_set_layer_dst_area(s, 0, &dst_rect);
+   vl_compositor_render(s, c, dst_surfaces[0], NULL, false);
+
+   dst_rect.x1 /= 2;
+   dst_rect.y1 /= 2;
+
+   set_yuv_layer(s, c, 0, src, NULL, NULL, false);
+   vl_compositor_set_layer_dst_area(s, 0, &dst_rect);
+   vl_compositor_render(s, c, dst_surfaces[1], NULL, false);
+
+   s->pipe->flush(s->pipe, NULL, 0);
+}
+
  bool
  vl_compositor_init(struct vl_compositor *c, struct pipe_context *pipe)
  {
diff --git a/src/gallium/auxiliary/vl/vl_compositor.h 
b/src/gallium/auxiliary/vl/vl_compositor.h
index 535abb75cd..2546d75b23 100644
--- a/src/gallium/auxiliary/vl/vl_compositor.h
+++ b/src/gallium/auxiliary/vl/vl_compositor.h
@@ -240,18 +240,6 @@ vl_compositor_set_layer_rotation(struct 
vl_compositor_state *state,
   unsigned layer,
   enum vl_compositor_rotation rotate);
  
-/**

- * set a layer of y or uv to render
- */
-void
-vl_compositor_set_yuv_layer(struct vl_compositor_state *s,
-struct vl_compositor *c,
-

[Mesa-dev] [Bug 101982] Weston crashes when running an OpenGL program on i965

2017-08-24 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=101982

Link Mauve changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #10 from Link Mauve  ---
It doesn’t happen anymore on master, as of
fe2f5cfdc7439cbe481d4bea393b46395967a8a3.

The main difference is that programs were using the I915_FORMAT_MOD_X_TILED
modifier with create_immed, which was failing (and still is) on my HD4000 for
some reason.  Now Mesa both uses I915_FORMAT_MOD_Y_TILED (which works fine
here) and no longer crashes the compositor on an unsupported modifier.

As an aside, when I revert 85ef0215dd3fac2d2a141018467361cff92f4bab I still get
a crash of the compositor, so this glvnd change was indeed needed (this was
asked by Emil).

-- 
You are receiving this mail because:
You are the QA Contact for the bug.


Re: [Mesa-dev] [PATCH 1/2] anv: implementation of VK_EXT_debug_report extension

2017-08-24 Thread Tapani Pälli
I've noticed one missing thing: for VK_ERROR_OUT_OF_HOST_MEMORY
situations we would need a function that calls the debug report without
an actual object (since we failed to allocate it!), passing only the
object type enum. Maybe we should have a separate vk_memory_error macro
for these cases?
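
To illustrate what I mean, here is a rough, self-contained sketch in C. Everything in it is hypothetical: the toy_ enums stand in for the Vulkan types, toy_report_error stands in for whatever reporting entry point the driver ends up with, and toy_memory_error is only one possible shape for the macro.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins for the Vulkan result and object type enums. */
enum toy_result { TOY_SUCCESS = 0, TOY_ERROR_OUT_OF_HOST_MEMORY = -1 };
enum toy_object_type { TOY_OBJECT_TYPE_UNKNOWN = 0, TOY_OBJECT_TYPE_DEBUG_REPORT = 1 };

static int toy_reports_sent;      /* number of reports issued so far */
static uint64_t toy_last_object;  /* handle passed with the last report */

/* Hypothetical report entry point: takes an object type plus a handle,
 * where the handle may be 0 when no object exists. */
static enum toy_result
toy_report_error(enum toy_object_type type, uint64_t object,
                 enum toy_result error, const char *file, int line)
{
   fprintf(stderr, "%s:%d: error %d, object type %d, handle 0x%llx\n",
           file, line, (int)error, (int)type, (unsigned long long)object);
   toy_reports_sent++;
   toy_last_object = object;
   return error;
}

/* The allocation failed, so there is no handle to pass: report the
 * object type only and use 0 as the handle. */
#define toy_memory_error(type) \
   toy_report_error((type), 0, TOY_ERROR_OUT_OF_HOST_MEMORY, __FILE__, __LINE__)
```

A failed allocation path could then do "return toy_memory_error(TOY_OBJECT_TYPE_DEBUG_REPORT);" and still reach the application callbacks with the type of the object that could not be created.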


On 08/24/2017 09:23 AM, Tapani Pälli wrote:

Patch adds required functionality for extension to manage a list of
application provided callbacks and handle debug reporting from driver
and application side.

Signed-off-by: Tapani Pälli 
---
  src/intel/Makefile.sources  |   1 +
  src/intel/vulkan/anv_debug_report.c | 133 
  src/intel/vulkan/anv_device.c   |  40 +++
  src/intel/vulkan/anv_extensions.py  |   1 +
  src/intel/vulkan/anv_private.h  |  32 +
  5 files changed, 207 insertions(+)
  create mode 100644 src/intel/vulkan/anv_debug_report.c

diff --git a/src/intel/Makefile.sources b/src/intel/Makefile.sources
index 4074ba9ee5..200713b06e 100644
--- a/src/intel/Makefile.sources
+++ b/src/intel/Makefile.sources
@@ -205,6 +205,7 @@ VULKAN_FILES := \
vulkan/anv_batch_chain.c \
vulkan/anv_blorp.c \
vulkan/anv_cmd_buffer.c \
+   vulkan/anv_debug_report.c \
vulkan/anv_descriptor_set.c \
vulkan/anv_device.c \
vulkan/anv_dump.c \
diff --git a/src/intel/vulkan/anv_debug_report.c 
b/src/intel/vulkan/anv_debug_report.c
new file mode 100644
index 00..1a4868cd52
--- /dev/null
+++ b/src/intel/vulkan/anv_debug_report.c
@@ -0,0 +1,133 @@
+/*
+ * Copyright © 2017 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include "anv_private.h"
+#include "vk_util.h"
+
+/* This file contains implementation for VK_EXT_debug_report. */
+
+VkResult
+anv_CreateDebugReportCallbackEXT(VkInstance _instance,
+ const VkDebugReportCallbackCreateInfoEXT* pCreateInfo,
+ const VkAllocationCallbacks* pAllocator,
+ VkDebugReportCallbackEXT* pCallback)
+{
+   ANV_FROM_HANDLE(anv_instance, instance, _instance);
+   const VkAllocationCallbacks *alloc =
+  pAllocator ? pAllocator : &instance->alloc;
+
+   vk_foreach_struct(info, pCreateInfo) {
+  switch (info->sType) {
+  case VK_STRUCTURE_TYPE_DEBUG_REPORT_CALLBACK_CREATE_INFO_EXT: {
+ struct anv_debug_callback *cb =
+vk_alloc(alloc, sizeof(struct anv_debug_callback), 8,
+ VK_SYSTEM_ALLOCATION_SCOPE_INSTANCE);
+ if (!cb)
+return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
+
+ cb->flags = pCreateInfo->flags;
+ cb->callback = pCreateInfo->pfnCallback;
+ cb->data = pCreateInfo->pUserData;
+
+ list_addtail(&cb->link, &instance->callbacks);
+ break;
+  }
+  default:
+ anv_debug_ignored_stype(info->sType);
+ break;
+  }
+   }
+
+   return VK_SUCCESS;
+}
+
+void
+anv_DestroyDebugReportCallbackEXT(VkInstance _instance,
+  VkDebugReportCallbackEXT callback,
+  const VkAllocationCallbacks* pAllocator)
+{
+   ANV_FROM_HANDLE(anv_instance, instance, _instance);
+   const VkAllocationCallbacks *alloc =
+  pAllocator ? pAllocator : &instance->alloc;
+
+   list_for_each_entry_safe(struct anv_debug_callback, debug_cb,
+&instance->callbacks, link) {
+  /* Found a match, remove from list and destroy given callback. */
+  if ((VkDebugReportCallbackEXT)debug_cb->callback == callback) {
+ list_del(&debug_cb->link);
+ vk_free(alloc, debug_cb);
+  }
+   }
+}
+
+void
+anv_DebugReportMessageEXT(VkInstance _instance,
+  VkDebugReportFlagsEXT flags,
+  VkDebugReportObjectTypeEXT objectType,
+  uint64_t object,
+  size_t location,
+   

Re: [Mesa-dev] [PATCH mesa] egl/android: add missing include

2017-08-24 Thread Rob Herring
On Thu, Aug 24, 2017 at 10:09 AM, Tapani Pälli  wrote:
>
>
> On 08/24/2017 06:02 PM, Rob Herring wrote:
>>
>> On Thu, Aug 24, 2017 at 9:26 AM, Rob Herring  wrote:
>>>
>>> On Thu, Aug 24, 2017 at 9:22 AM, Eric Engestrom
>>>  wrote:

 Cc: Rob Herring 
 Signed-off-by: Eric Engestrom 
 ---
 This needs to land before [1], otherwise the latter will break android.

 [1]
 https://lists.freedesktop.org/archives/mesa-dev/2017-August/167428.html

   src/egl/drivers/dri2/platform_android.c | 1 +
   1 file changed, 1 insertion(+)
>>>
>>>
>>> For both patches:
>>>
>>> Reviewed-by: Rob Herring 
>>
>>
>> Actually, on further examination I think this isn't needed. egl_dri2.h
>> includes system/window.h which in turn includes native_window.h. I'll
>> test out the EGL change without this.
>>
>
> Not 100% sure but you might need include for android_native_rect_t from
> 'arect' lib, it was moved there from 'nativewindow' lib

I checked this. At least for O it is implicitly added as an include
path by libnativewindow.

Rob


Re: [Mesa-dev] [PATCH] i965: Simplify MOCS mashing in genX_state_upload.c.

2017-08-24 Thread Lionel Landwerlin

On 24/08/17 16:15, Kenneth Graunke wrote:

> On Thursday, August 24, 2017 4:04:26 AM PDT Lionel Landwerlin wrote:
>> Looks good, but it looks like you could replace an additional one in
>> upload_push_constant_packets().
>
> That one is a bit weird - it uses 0 on Gen8+.  I've wondered about that,
> actually - the docs claim that you must use 0 - but at least on Skylake,
> 0 is an entry in the table that means uncached.  So is the requirement
> that the bits be 0, or the requirement that you bypass caching?
>
> Things we'll never know I guess.  I'm not sure if it matters, though,
> since it's just pulling the data into a segment of the L3 anyway...so
> it's only read one time...
>
> At any rate, I left it open coded because it's different than the others.
>
>> Also why not name it GEN_MOCS ? (so it's a bit more consistent with
>> other macros defined per gen).
>>
>> Thanks!
>
> I like this.  Changed locally.



Reviewed-by: Lionel Landwerlin 



[Mesa-dev] [PATCH] nir/spirv: handle if's with same label in both branches

2017-08-24 Thread Juan A. Suarez Romero
When a conditional branch has the same label in both the "if" part and
the "else" part, both sides lead to the same cfg block, which must be
handled only once.

Fixes:
dEQP-VK.spirv_assembly.instruction.compute.conditional_branch.same_labels*
dEQP-VK.spirv_assembly.instruction.graphics.conditional_branch.same_labels*
---
 src/compiler/spirv/vtn_cfg.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/src/compiler/spirv/vtn_cfg.c b/src/compiler/spirv/vtn_cfg.c
index 03c452cb31..bfca7043cc 100644
--- a/src/compiler/spirv/vtn_cfg.c
+++ b/src/compiler/spirv/vtn_cfg.c
@@ -356,8 +356,11 @@ vtn_cfg_walk_blocks(struct vtn_builder *b, struct 
list_head *cf_list,
   switch_case, switch_break,
   loop_break, loop_cont);
 
- if (if_stmt->then_type == vtn_branch_type_none &&
- if_stmt->else_type == vtn_branch_type_none) {
+ if (then_block == else_block) {
+block = then_block;
+continue;
+ } else if (if_stmt->then_type == vtn_branch_type_none &&
+if_stmt->else_type == vtn_branch_type_none) {
 /* Neither side of the if is something we can short-circuit. */
 assert((*block->merge & SpvOpCodeMask) == SpvOpSelectionMerge);
 struct vtn_block *merge_block =
-- 
2.13.5
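
The idea behind the change can be shown with a simplified, self-contained sketch in C. The toy_block type and toy_walk function are hypothetical and far smaller than the real vtn_block and vtn_cfg_walk_blocks; they only show why continuing from the shared target handles it exactly once.

```c
#include <stddef.h>

/* Hypothetical, simplified block; the real vtn_block carries much more. */
struct toy_block {
   int id;
   struct toy_block *then_block;  /* NULL when the block does not branch */
   struct toy_block *else_block;
   struct toy_block *next;        /* fall-through or merge successor */
   int visits;                    /* how many times the walk handled it */
};

/* Walk the blocks starting at 'block', counting each visit. */
static void
toy_walk(struct toy_block *block)
{
   while (block) {
      block->visits++;

      if (block->then_block == NULL) {
         block = block->next;
         continue;
      }

      if (block->then_block == block->else_block) {
         /* Both branch targets are the same block: there is no real
          * "if" to build, so keep walking from that block. */
         block = block->then_block;
         continue;
      }

      /* Distinct targets: handle each side, then go on to the merge. */
      toy_walk(block->then_block);
      toy_walk(block->else_block);
      block = block->next;
   }
}
```

In this sketch a branch whose two targets are identical visits the target once, whereas walking both sides separately would have visited it twice.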



Re: [Mesa-dev] [PATCH] i965: Simplify MOCS mashing in genX_state_upload.c.

2017-08-24 Thread Kenneth Graunke
On Thursday, August 24, 2017 4:04:26 AM PDT Lionel Landwerlin wrote:
> Looks good, but it looks like you could replace an additional one in 
> upload_push_constant_packets().

That one is a bit weird - it uses 0 on Gen8+.  I've wondered about that,
actually - the docs claim that you must use 0 - but at least on Skylake,
0 is an entry in the table that means uncached.  So is the requirement
that the bits be 0, or the requirement that you bypass caching?

Things we'll never know I guess.  I'm not sure if it matters, though,
since it's just pulling the data into a segment of the L3 anyway...so
it's only read one time...

At any rate, I left it open coded because it's different than the others.

> Also why not name it GEN_MOCS ? (so it's a bit more consistent with 
> other macros defined per gen).
> 
> Thanks!

I like this.  Changed locally.



[Mesa-dev] [Bug 102390] centroid interpolation causes broken attribute values

2017-08-24 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=102390

--- Comment #1 from timon  ---
(In reply to timon from comment #0)
> If I add the centroid interpolation decoration to any attribute:
> - the values will be mostly broken

To clarify: only the values for that attribute are broken; the rest are fine.
Also, this happens when running without MSAA (I can't test with MSAA at the
moment).

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


[Mesa-dev] [Bug 101941] Getting different output depending on attribute declaration order

2017-08-24 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=101941

Brian Paul changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #2 from Brian Paul  ---
Fixed with commit fe2f5cfdc7439cbe481d4bea393b46395967a8a3

-- 
You are receiving this mail because:
You are the assignee for the bug.


[Mesa-dev] [PATCH 2/3] st/va: move YUV content to deinterlaced buffer when reallocated for encoder

2017-08-24 Thread Leo Liu
v2: use deinterlace common function

Signed-off-by: Leo Liu 
---
 src/gallium/state_trackers/va/picture.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/src/gallium/state_trackers/va/picture.c 
b/src/gallium/state_trackers/va/picture.c
index 47e63d3b30..9282dabc63 100644
--- a/src/gallium/state_trackers/va/picture.c
+++ b/src/gallium/state_trackers/va/picture.c
@@ -627,7 +627,7 @@ vlVaEndPicture(VADriverContextP ctx, VAContextID context_id)
 
if (surf->buffer->interlaced != interlaced) {
   surf->templat.interlaced = screen->get_video_param(screen, context->decoder->profile,
-                                PIPE_VIDEO_ENTRYPOINT_BITSTREAM,
+                                context->decoder->entrypoint,
                                 PIPE_VIDEO_CAP_PREFERS_INTERLACED);
   realloc = true;
}
@@ -657,13 +657,17 @@ vlVaEndPicture(VADriverContextP ctx, VAContextID 
context_id)
}
 
if (realloc) {
-  surf->buffer->destroy(surf->buffer);
+  struct pipe_video_buffer *old_buf = surf->buffer;
 
   if (vlVaHandleSurfaceAllocate(ctx, surf, &surf->templat) != VA_STATUS_SUCCESS) {
  mtx_unlock(&drv->mutex);
  return VA_STATUS_ERROR_ALLOCATION_FAILED;
   }
 
+  if (context->decoder->entrypoint == PIPE_VIDEO_ENTRYPOINT_ENCODE)
+ vl_compositor_yuv_deint(&drv->cstate, &drv->compositor, old_buf, surf->buffer);
+
+  old_buf->destroy(old_buf);
   context->target = surf->buffer;
}
 
-- 
2.11.0
