Re: [Mesa-dev] [MR] WIP: VK_KHR_shader_float_controls implementation on ANV

2019-02-21 Thread Samuel Iglesias Gonsálvez
It is still unreviewed.

Sam

On 2/8/19 8:17 AM, Samuel Iglesias Gonsálvez wrote:
> First three versions of this branch were sent for review to the mailing
> list. In order to avoid flooding the list with more emails, I create
> this MR to continue the review process there.
> 
> Reminder: this patch series relies on VK_KHR_shader_float16_int8 patch
> series which is currently under review. For that reason, I mark this
> Merge Request as Work In Progress.
> 
> https://gitlab.freedesktop.org/mesa/mesa/merge_requests/223
> 
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> 



signature.asc
Description: OpenPGP digital signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 109711] [DXVK] Skyrim Special edition hangs

2019-02-21 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=109711

--- Comment #3 from root@scoopta.ninja ---
It turns out this issue only occurs under XWayland and NOT under X.Org however
I'm not sure if it's actually a mesa bug with XWayland or an XWayland bug
directly.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2 2/2] isl: the display engine requires 64B alignment for linear surfaces

2019-02-21 Thread Samuel Iglesias Gonsálvez
Lionel, are you going to push it with this quote? I can add it
otherwise.

Sam

On Thu, 2019-02-21 at 13:41 +, Lionel Landwerlin wrote:
> On 21/02/2019 13:30, Chris Wilson wrote:
> > Quoting Lionel Landwerlin (2019-02-21 12:57:09)
> > > I did not find the PRM bit that says it must be 64b aligned, but
> > > I can
> > > see that's what i915 checks.
> > > 
> > > Chris: If you have a pointer to it, I could add the quote.
> > In amongst the register specs,
> > PLANE_STRIDE:
> > For Linear memory, this field specifies the stride in chunks of 64
> > bytes (1 cache line).
> > -Chris
> > 
> Thanks a lot!
> 
> 


signature.asc
Description: This is a digitally signed message part
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] android: intel/isl: remove redundant building rules

2019-02-21 Thread Tapani Pälli
OK now I understand, this hunk got added accidentially twice from 2 
separate commits!


Reviewed-by: Tapani Pälli 

On 2/22/19 7:50 AM, Tapani Pälli wrote:

Hi;

On 2/22/19 1:30 AM, Mauro Rossi wrote:

Fixes the following building error:

including ./external/mesa/Android.mk ...
build/core/base_rules.mk:183: *** external/mesa/src/intel:
MODULE.TARGET.STATIC_LIBRARIES.libmesa_isl_tiled_memcpy already 
defined by external/mesa/src/intel.
make: *** [build/core/ninja.mk:164: out/build-android_x86_64.ninja] 
Error 1


This is strange, craftyguy also reported same error .. I haven't seen 
this though.



ISL_TILED_MEMCPY_FILES is isl/isl_tiled_memcpy_normal.c
and that source file includes isl_tiled_memcpy.c source


Yes, that is intentional. We want to compile 2 versions of the library, 
'normal' one and 'sse41' one.



Fixes: 96bb328 ("iris: add Android build")
Signed-off-by: Mauro Rossi 
---
  src/intel/Android.isl.mk | 13 -
  1 file changed, 13 deletions(-)

diff --git a/src/intel/Android.isl.mk b/src/intel/Android.isl.mk
index 28944875e0..07a64b8ed1 100644
--- a/src/intel/Android.isl.mk
+++ b/src/intel/Android.isl.mk
@@ -198,19 +198,6 @@ LOCAL_WHOLE_STATIC_LIBRARIES := libmesa_genxml
  include $(MESA_COMMON_MK)
  include $(BUILD_STATIC_LIBRARY)
-
-# ---
-# Build libmesa_isl_tiled_memcpy
-# ---
-include $(CLEAR_VARS)
-
-LOCAL_MODULE := libmesa_isl_tiled_memcpy
-
-LOCAL_SRC_FILES := isl/isl_tiled_memcpy.c
-
-include $(MESA_COMMON_MK)
-include $(BUILD_STATIC_LIBRARY)
-
  # ---
  # Build libmesa_isl_tiled_memcpy
  # ---


     ^^^

Do you have libmesa_isl_tiled_memcpy 2 times there? This does not match 
what Mesa has ATM.


// Tapani
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] android: intel/isl: remove redundant building rules

2019-02-21 Thread Tapani Pälli

Hi;

On 2/22/19 1:30 AM, Mauro Rossi wrote:

Fixes the following building error:

including ./external/mesa/Android.mk ...
build/core/base_rules.mk:183: *** external/mesa/src/intel:
MODULE.TARGET.STATIC_LIBRARIES.libmesa_isl_tiled_memcpy already defined by 
external/mesa/src/intel.
make: *** [build/core/ninja.mk:164: out/build-android_x86_64.ninja] Error 1


This is strange, craftyguy also reported same error .. I haven't seen 
this though.



ISL_TILED_MEMCPY_FILES is isl/isl_tiled_memcpy_normal.c
and that source file includes isl_tiled_memcpy.c source


Yes, that is intentional. We want to compile 2 versions of the library, 
'normal' one and 'sse41' one.



Fixes: 96bb328 ("iris: add Android build")
Signed-off-by: Mauro Rossi 
---
  src/intel/Android.isl.mk | 13 -
  1 file changed, 13 deletions(-)

diff --git a/src/intel/Android.isl.mk b/src/intel/Android.isl.mk
index 28944875e0..07a64b8ed1 100644
--- a/src/intel/Android.isl.mk
+++ b/src/intel/Android.isl.mk
@@ -198,19 +198,6 @@ LOCAL_WHOLE_STATIC_LIBRARIES := libmesa_genxml
  include $(MESA_COMMON_MK)
  include $(BUILD_STATIC_LIBRARY)
  
-

-# ---
-# Build libmesa_isl_tiled_memcpy
-# ---
-include $(CLEAR_VARS)
-
-LOCAL_MODULE := libmesa_isl_tiled_memcpy
-
-LOCAL_SRC_FILES := isl/isl_tiled_memcpy.c
-
-include $(MESA_COMMON_MK)
-include $(BUILD_STATIC_LIBRARY)
-
  # ---
  # Build libmesa_isl_tiled_memcpy
  # ---


^^^

Do you have libmesa_isl_tiled_memcpy 2 times there? This does not match 
what Mesa has ATM.


// Tapani
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 109735] [Regression] broken font with mesa_vulkan_overlay

2019-02-21 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=109735

Shmerl  changed:

   What|Removed |Added

 CC||shtetl...@gmail.com

-- 
You are receiving this mail because:
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 109735] [Regression] broken font with mesa_vulkan_overlay

2019-02-21 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=109735

--- Comment #3 from Shmerl  ---
Yep, I can confirm it renders OK with radv 18.3.2 / llvm 7.0.1, but font is
broken with radv in Mesa master / llvm 9.0

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 109739] Mesa build fails when vulkan-overlay-layer option is enabled

2019-02-21 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=109739

--- Comment #1 from Shmerl  ---
Another problem was that ninja install doesn't deploy actual layer .so file.

I did this in my build script:

DESTDIR="$dest_dir" LC_ALL=C.UTF-8 ninja install

It deployed VkLayer_MESA_overlay.json, but missed libVkLayer_MESA_overlay.so.

-- 
You are receiving this mail because:
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 109739] Mesa build fails when vulkan-overlay-layer option is enabled

2019-02-21 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=109739

Bug ID: 109739
   Summary: Mesa build fails when vulkan-overlay-layer option is
enabled
   Product: Mesa
   Version: git
  Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: NEW
  Severity: normal
  Priority: medium
 Component: Drivers/Vulkan/Common
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: shtetl...@gmail.com
CC: airl...@freedesktop.org, chadvers...@chromium.org,
dan...@fooishbar.org, ja...@jlekstrand.net

I tried building Mesa master on Debian testing, enabling vulkan-overlay-layer,
and it failed because of mismatching includes. I installed glslang-tools
vulkan-validationlayers-dev and enabled the option:

-Dvulkan-overlay-layer=true

I had to modify includes like this, to make it build:

diff --git a/src/vulkan/overlay-layer/overlay.cpp
b/src/vulkan/overlay-layer/overlay.cpp
index f3678198b00..4b8010ba003 100644
--- a/src/vulkan/overlay-layer/overlay.cpp
+++ b/src/vulkan/overlay-layer/overlay.cpp
@@ -25,13 +25,13 @@
 #include 
 #include 

-#include 
+#include 
 #include 
-#include 
+#include 
 #include 
-#include "vk_layer_data.h"
+#include "vulkan/vk_layer_data.h"
 #include "vk_layer_table.h"
-#include "vk_layer_extension_utils.h"
+#include "vulkan/vk_layer_extension_utils.h"

 #include "imgui.h"

diff --git a/src/vulkan/overlay-layer/vk_layer_table.cpp
b/src/vulkan/overlay-layer/vk_layer_table.cpp
index 4a033b9add6..f5865fa3a6f 100644
--- a/src/vulkan/overlay-layer/vk_layer_table.cpp
+++ b/src/vulkan/overlay-layer/vk_layer_table.cpp
@@ -19,7 +19,7 @@
  */
 #include 
 #include 
-#include "vk_dispatch_table_helper.h"
+#include "vulkan/vk_dispatch_table_helper.h"
 #include "vulkan/vk_layer.h"
 #include "vk_layer_table.h"
 static device_table_map tableMap;

-- 
You are receiving this mail because:
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] radeonsi: fix query buffer allocation

2019-02-21 Thread Timothy Arceri
Fix the logic for buffer full check on alloc.

This patch just takes the fix Nicolai attached to the bug report
and updates it to work on master.

Fixes: e0f0d3675d4 ("radeonsi: factor si_query_buffer logic out of si_query_hw")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109561
---
 src/gallium/drivers/radeonsi/si_query.c | 42 +
 src/gallium/drivers/radeonsi/si_query.h |  5 +--
 2 files changed, 25 insertions(+), 22 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_query.c 
b/src/gallium/drivers/radeonsi/si_query.c
index 266b9d3ce84..04685b8bb3f 100644
--- a/src/gallium/drivers/radeonsi/si_query.c
+++ b/src/gallium/drivers/radeonsi/si_query.c
@@ -561,29 +561,31 @@ bool si_query_buffer_alloc(struct si_context *sctx, 
struct si_query_buffer *buff
   bool (*prepare_buffer)(struct si_context *, struct 
si_query_buffer*),
   unsigned size)
 {
-   if (buffer->buf && buffer->results_end + size >= 
buffer->buf->b.b.width0)
-   return true;
+   bool unprepared = buffer->unprepared;
+   buffer->unprepared = false;
+
+   if (!buffer->buf || buffer->results_end + size > 
buffer->buf->b.b.width0) {
+   if (buffer->buf) {
+   struct si_query_buffer *qbuf = 
MALLOC_STRUCT(si_query_buffer);
+   memcpy(qbuf, buffer, sizeof(*qbuf));
+   buffer->previous = qbuf;
+   }
+   buffer->results_end = 0;
 
-   if (buffer->buf) {
-   struct si_query_buffer *qbuf = MALLOC_STRUCT(si_query_buffer);
-   memcpy(qbuf, buffer, sizeof(*qbuf));
-   buffer->previous = qbuf;
+   /* Queries are normally read by the CPU after
+* being written by the gpu, hence staging is probably a good
+* usage pattern.
+*/
+   struct si_screen *screen = sctx->screen;
+   unsigned buf_size = MAX2(size, screen->info.min_alloc_size);
+   buffer->buf = si_resource(
+   pipe_buffer_create(>b, 0, PIPE_USAGE_STAGING, 
buf_size));
+   if (unlikely(!buffer->buf))
+   return false;
+   unprepared = true;
}
 
-   buffer->results_end = 0;
-
-   /* Queries are normally read by the CPU after
-* being written by the gpu, hence staging is probably a good
-* usage pattern.
-*/
-   struct si_screen *screen = sctx->screen;
-   unsigned buf_size = MAX2(size, screen->info.min_alloc_size);
-   buffer->buf = si_resource(
-   pipe_buffer_create(>b, 0, PIPE_USAGE_STAGING, 
buf_size));
-   if (unlikely(!buffer->buf))
-   return false;
-
-   if (prepare_buffer) {
+   if (unprepared && prepare_buffer) {
if (unlikely(!prepare_buffer(sctx, buffer))) {
si_resource_reference(>buf, NULL);
return false;
diff --git a/src/gallium/drivers/radeonsi/si_query.h 
b/src/gallium/drivers/radeonsi/si_query.h
index aaf0bd03aca..c61af51d57c 100644
--- a/src/gallium/drivers/radeonsi/si_query.h
+++ b/src/gallium/drivers/radeonsi/si_query.h
@@ -177,12 +177,13 @@ struct si_query_hw_ops {
 struct si_query_buffer {
/* The buffer where query results are stored. */
struct si_resource  *buf;
-   /* Offset of the next free result after current query data */
-   unsignedresults_end;
/* If a query buffer is full, a new buffer is created and the old one
 * is put in here. When we calculate the result, we sum up the samples
 * from all buffers. */
struct si_query_buffer  *previous;
+   /* Offset of the next free result after current query data */
+   unsignedresults_end;
+   bool unprepared;
 };
 
 void si_query_buffer_destroy(struct si_screen *sctx, struct si_query_buffer 
*buffer);
-- 
2.20.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 109735] [Regression] broken font with mesa_vulkan_overlay

2019-02-21 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=109735

--- Comment #2 from osch...@web.de ---
Created attachment 143436
  --> https://bugs.freedesktop.org/attachment.cgi?id=143436=edit
bisect log

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 109735] [Regression] broken font with mesa_vulkan_overlay

2019-02-21 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=109735

--- Comment #1 from osch...@web.de ---
Created attachment 143435
  --> https://bugs.freedesktop.org/attachment.cgi?id=143435=edit
screenshot

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 109735] [Regression] broken font with mesa_vulkan_overlay

2019-02-21 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=109735

Bug ID: 109735
   Summary: [Regression] broken font with mesa_vulkan_overlay
   Product: Mesa
   Version: git
  Hardware: Other
OS: All
Status: NEW
  Severity: normal
  Priority: medium
 Component: Drivers/Vulkan/radeon
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: osch...@web.de
QA Contact: mesa-dev@lists.freedesktop.org

The mesa_vulkan_overlay renders broken font with recent radv-git, 18.3.3 works
fine (both build against LLVM 7.0.1). GPU is a RX580. Bisected to:

commit da2959463616756e89cece39f2bae1d437df4bbd
Author: Jason Ekstrand 
Date:   Wed Jan 16 11:52:03 2019 -0600

spirv: Only split blocks

Instead of splitting every per-vertex struct, just split the ones that
are actually blocks.  The reason for the split is so that we have
separate variables for separate locations, qualifiers, and builtin
decorations.  The vulkan spec only allows these on members of blocks.

Reviewed-by: Alejandro Piñeiro 

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] panfrost: Use tiler fast path (performance boost)

2019-02-21 Thread Rob Clark
On Thu, Feb 21, 2019 at 5:11 PM Connor Abbott  wrote:
>
> On Thu, Feb 21, 2019 at 5:07 PM Alyssa Rosenzweig  
> wrote:
>>
>> > Yes, there is a buffer for holding the results of the tiler. The way it
>> > works is that the userspace driver allocates a very large buffer with a
>> > "grow on page fault" bit, so the kernel will allocate more memory as the
>> > tiler asks for it.
>>
>> To be clear, right now I have this magic misc_0 buffer setup that way,
>> too (allocating something enormous and setting GROW_ON_GPF). Although,
>> it's not clear to me that that is the _correct_ thing to do; IIRC, the
>> blob allocates "just enough", no GROW_ON_GPF needed, but I should
>> probably check that.
>
>
> I don't know about Midgard, but on Bifrost it definitely uses GROW_ON_GPF, 
> telling the kernel to preallocate some small-ish initial set, presumably as 
> an optimization. I don't think you can accurately predict how much memory 
> you're going to need ahead of time, since it depends on how the triangles are 
> distributed along the screen.

the GROW_ON_GPF thing is cute..  not as badly needed on adreno since
we don't actually store pos/psize/varyings from binning pass, only a
compressed visibility stream.. but if only we had proper
stall-on-fault support from iommu driver, we could use a similar trick
to dynamically grow the VSC buffers..

BR,
-R
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] android: intel/isl: remove redundant building rules

2019-02-21 Thread Mauro Rossi
Fixes the following building error:

including ./external/mesa/Android.mk ...
build/core/base_rules.mk:183: *** external/mesa/src/intel:
MODULE.TARGET.STATIC_LIBRARIES.libmesa_isl_tiled_memcpy already defined by 
external/mesa/src/intel.
make: *** [build/core/ninja.mk:164: out/build-android_x86_64.ninja] Error 1

ISL_TILED_MEMCPY_FILES is isl/isl_tiled_memcpy_normal.c
and that source file includes isl_tiled_memcpy.c source

Fixes: 96bb328 ("iris: add Android build")
Signed-off-by: Mauro Rossi 
---
 src/intel/Android.isl.mk | 13 -
 1 file changed, 13 deletions(-)

diff --git a/src/intel/Android.isl.mk b/src/intel/Android.isl.mk
index 28944875e0..07a64b8ed1 100644
--- a/src/intel/Android.isl.mk
+++ b/src/intel/Android.isl.mk
@@ -198,19 +198,6 @@ LOCAL_WHOLE_STATIC_LIBRARIES := libmesa_genxml
 include $(MESA_COMMON_MK)
 include $(BUILD_STATIC_LIBRARY)
 
-
-# ---
-# Build libmesa_isl_tiled_memcpy
-# ---
-include $(CLEAR_VARS)
-
-LOCAL_MODULE := libmesa_isl_tiled_memcpy
-
-LOCAL_SRC_FILES := isl/isl_tiled_memcpy.c
-
-include $(MESA_COMMON_MK)
-include $(BUILD_STATIC_LIBRARY)
-
 # ---
 # Build libmesa_isl_tiled_memcpy
 # ---
-- 
2.19.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 109535] [Tracker] Mesa 19.0 release tracker

2019-02-21 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=109535
Bug 109535 depends on bug 109328, which changed state.

Bug 109328 Summary: [BSW BXT GLK] dEQP-VK.subgroups.arithmetic.subgroup 
regressions
https://bugs.freedesktop.org/show_bug.cgi?id=109328

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v3] i965: fixed clamping in set_scissor_bits when the y is flipped

2019-02-21 Thread Nanley Chery
On Thu, Feb 21, 2019 at 12:02:58PM +0200, Eleni Maria Stea wrote:
> Calculating the scissor rectangle fields with the y flipped (0 on top)
> can generate negative values that will cause assertion failure later on
> as the scissor fields are all unsigned. We must clamp the bbox values
> again to make sure they don't exceed the fb_height. Also fixed a
> calculation error.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108999
> 

I guess we'll want to add the other bug and Cc stable.

> v2:
>- I initially clamped the values inside the if (Y is flipped) case
>and I made a mistake in the calculation: the clamp of the bbox[2] should
>be a check if (bbox[2] >= fbheight) bbox[2] = fbheight - 1 instead and I
>shouldn't have changed the ScissorRectangleYMax calculation. As the
>fixed code is equivalent with using CLAMP instead of MAX2 at the top of
>the function when bbox[2] and bbox[3] are calculated, and the 2nd is more
>clear, I replaced it. (Nanley Chery)
> 
> v3:
>- Reversed the CLAMP change in bbox[3] as the API guarantees that the
>viewport height is positive. (Nanley Chery)
> ---
>  src/mesa/drivers/dri/i965/genX_state_upload.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 

This patch is
Reviewed-by: Nanley Chery 

> diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c 
> b/src/mesa/drivers/dri/i965/genX_state_upload.c
> index dcdfb3c9292..47f3741e673 100644
> --- a/src/mesa/drivers/dri/i965/genX_state_upload.c
> +++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
> @@ -2445,7 +2445,7 @@ set_scissor_bits(const struct gl_context *ctx, int i,
>  
> bbox[0] = MAX2(ctx->ViewportArray[i].X, 0);
> bbox[1] = MIN2(bbox[0] + ctx->ViewportArray[i].Width, fb_width);
> -   bbox[2] = MAX2(ctx->ViewportArray[i].Y, 0);
> +   bbox[2] = CLAMP(ctx->ViewportArray[i].Y, 0, fb_height);
> bbox[3] = MIN2(bbox[2] + ctx->ViewportArray[i].Height, fb_height);
> _mesa_intersect_scissor_bounding_box(ctx, i, bbox);
>  
> -- 
> 2.20.1
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] panfrost: Use tiler fast path (performance boost)

2019-02-21 Thread Alyssa Rosenzweig
> I don't know about Midgard, but on Bifrost it definitely uses GROW_ON_GPF,
> telling the kernel to preallocate some small-ish initial set, presumably as
> an optimization. I don't think you can accurately predict how much memory
> you're going to need ahead of time, since it depends on how the triangles
> are distributed along the screen.

Good to know, thank you :)


signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] panfrost: Use tiler fast path (performance boost)

2019-02-21 Thread Connor Abbott
On Thu, Feb 21, 2019 at 5:07 PM Alyssa Rosenzweig 
wrote:

> > Yes, there is a buffer for holding the results of the tiler. The way it
> > works is that the userspace driver allocates a very large buffer with a
> > "grow on page fault" bit, so the kernel will allocate more memory as the
> > tiler asks for it.
>
> To be clear, right now I have this magic misc_0 buffer setup that way,
> too (allocating something enormous and setting GROW_ON_GPF). Although,
> it's not clear to me that that is the _correct_ thing to do; IIRC, the
> blob allocates "just enough", no GROW_ON_GPF needed, but I should
> probably check that.
>

I don't know about Midgard, but on Bifrost it definitely uses GROW_ON_GPF,
telling the kernel to preallocate some small-ish initial set, presumably as
an optimization. I don't think you can accurately predict how much memory
you're going to need ahead of time, since it depends on how the triangles
are distributed along the screen.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 109711] [DXVK] Skyrim Special edition hangs

2019-02-21 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=109711

--- Comment #2 from root@scoopta.ninja ---
Yeah, apparently I didn't make that very clear. It is a GAME hang. As in I have
to forcibly kill the game from htop but it is not a full system hang.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] nir/lower_clip_cull: Fix an incorrect assert

2019-02-21 Thread Kenneth Graunke
On Thursday, February 21, 2019 8:55:31 AM PST Jason Ekstrand wrote:
> Copy+paste error.  It was supposed to test cull and not clip.
> 
> Fixes: 4e69fba534e "nir: Rewrite lower_clip_cull_distance_arrays..."
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109717
> Cc: Kenneth Graunke 
> ---
>  src/compiler/nir/nir_lower_clip_cull_distance_arrays.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/compiler/nir/nir_lower_clip_cull_distance_arrays.c 
> b/src/compiler/nir/nir_lower_clip_cull_distance_arrays.c
> index d98ffa69596..70578d6f3fd 100644
> --- a/src/compiler/nir/nir_lower_clip_cull_distance_arrays.c
> +++ b/src/compiler/nir/nir_lower_clip_cull_distance_arrays.c
> @@ -101,7 +101,7 @@ combine_clip_cull(nir_shader *nir,
> }
>  
> if (cull) {
> -  assert(clip->data.compact);
> +  assert(cull->data.compact);
>cull->data.how_declared = nir_var_hidden;
>cull->data.location = VARYING_SLOT_CLIP_DIST0 + clip_array_size / 4;
>cull->data.location_frac = clip_array_size % 4;
> 

Whoops.

Reviewed-by: Kenneth Graunke 


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] intel: limit urb size for SKL/KBL/CFL GT1

2019-02-21 Thread Kristian Høgsberg
What happened to this patch? We carry it downstream for the deqp fix.
Can we get it into 19.0?

Thanks,
Kristian

On Wed, Aug 29, 2018 at 5:39 PM Lionel Landwerlin
 wrote:
>
> The documentation puts the URB size for SKL GT1 as "128K - 192K". I
> guess this means we can't tell which one it is, so we have to go for
> the lower bound. This change also changes the max VS URB entries which
> is lower on GT1 skus.
>
> Fixes a CTS test :
>
>   dEQP-GLES31.functional.geometry_shading.layered.render_with_default_layer_3d
>
> Signed-off-by: Lionel Landwerlin 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107505
> ---
>  src/intel/dev/gen_device_info.c | 7 ++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/src/intel/dev/gen_device_info.c b/src/intel/dev/gen_device_info.c
> index b0ae4d18034..ed1e73efa61 100644
> --- a/src/intel/dev/gen_device_info.c
> +++ b/src/intel/dev/gen_device_info.c
> @@ -617,7 +617,8 @@ static const struct gen_device_info 
> gen_device_info_skl_gt1 = {
> .num_subslices = { 2, },
> .num_eu_per_subslice = 6,
> .l3_banks = 2,
> -   .urb.size = 192,
> +   .urb.size = 128,
> +   .urb.max_entries[MESA_SHADER_VERTEX] = 928,
> .simulator_id = 12,
>  };
>
> @@ -689,6 +690,8 @@ static const struct gen_device_info 
> gen_device_info_kbl_gt1 = {
> .num_subslices = { 2, },
> .num_eu_per_subslice = 6,
> .l3_banks = 2,
> +   .urb.size = 128,
> +   .urb.max_entries[MESA_SHADER_VERTEX] = 928,
> .simulator_id = 16,
>  };
>
> @@ -775,6 +778,8 @@ static const struct gen_device_info 
> gen_device_info_cfl_gt1 = {
> .num_subslices = { 2, },
> .num_eu_per_subslice = 6,
> .l3_banks = 2,
> +   .urb.size = 128,
> +   .urb.max_entries[MESA_SHADER_VERTEX] = 928,
> .simulator_id = 24,
>  };
>  static const struct gen_device_info gen_device_info_cfl_gt2 = {
> --
> 2.18.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] nir/lower_clip_cull: Fix an incorrect assert

2019-02-21 Thread Lionel Landwerlin

On 21/02/2019 16:55, Jason Ekstrand wrote:

Copy+paste error.  It was supposed to test cull and not clip.

Fixes: 4e69fba534e "nir: Rewrite lower_clip_cull_distance_arrays..."
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109717
Cc: Kenneth Graunke 



Trivial, I really thought it was harder than that :)

Thanks!


Reviewed-by: Lionel Landwerlin 



---
  src/compiler/nir/nir_lower_clip_cull_distance_arrays.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/compiler/nir/nir_lower_clip_cull_distance_arrays.c 
b/src/compiler/nir/nir_lower_clip_cull_distance_arrays.c
index d98ffa69596..70578d6f3fd 100644
--- a/src/compiler/nir/nir_lower_clip_cull_distance_arrays.c
+++ b/src/compiler/nir/nir_lower_clip_cull_distance_arrays.c
@@ -101,7 +101,7 @@ combine_clip_cull(nir_shader *nir,
 }
  
 if (cull) {

-  assert(clip->data.compact);
+  assert(cull->data.compact);
cull->data.how_declared = nir_var_hidden;
cull->data.location = VARYING_SLOT_CLIP_DIST0 + clip_array_size / 4;
cull->data.location_frac = clip_array_size % 4;



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] nir/lower_clip_cull: Fix an incorrect assert

2019-02-21 Thread Jason Ekstrand
Copy+paste error.  It was supposed to test cull and not clip.

Fixes: 4e69fba534e "nir: Rewrite lower_clip_cull_distance_arrays..."
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109717
Cc: Kenneth Graunke 
---
 src/compiler/nir/nir_lower_clip_cull_distance_arrays.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/compiler/nir/nir_lower_clip_cull_distance_arrays.c 
b/src/compiler/nir/nir_lower_clip_cull_distance_arrays.c
index d98ffa69596..70578d6f3fd 100644
--- a/src/compiler/nir/nir_lower_clip_cull_distance_arrays.c
+++ b/src/compiler/nir/nir_lower_clip_cull_distance_arrays.c
@@ -101,7 +101,7 @@ combine_clip_cull(nir_shader *nir,
}
 
if (cull) {
-  assert(clip->data.compact);
+  assert(cull->data.compact);
   cull->data.how_declared = nir_var_hidden;
   cull->data.location = VARYING_SLOT_CLIP_DIST0 + clip_array_size / 4;
   cull->data.location_frac = clip_array_size % 4;
-- 
2.20.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 3/5] intel/fs: Cap dst-aligned region stride to maximum representable hstride value.

2019-02-21 Thread Jason Ekstrand
On Thu, Feb 14, 2019 at 4:53 PM Francisco Jerez 
wrote:

> Jason Ekstrand  writes:
>
> > On Fri, Jan 18, 2019 at 6:09 PM Francisco Jerez 
> > wrote:
> >
> >> This is required in combination with the following commit, because
> >> otherwise if a source region with an extended 8+ stride is present in
> >> the instruction (which we're about to declare legal) we'll end up
> >> emitting code that attempts to write to such a region, even though
> >> strides greater than four are still illegal for the destination.
> >> ---
> >>  src/intel/compiler/brw_fs_lower_regioning.cpp | 20 ++-
> >>  1 file changed, 15 insertions(+), 5 deletions(-)
> >>
> >> diff --git a/src/intel/compiler/brw_fs_lower_regioning.cpp
> >> b/src/intel/compiler/brw_fs_lower_regioning.cpp
> >> index 6a3c23892b4..b86e95ed9eb 100644
> >> --- a/src/intel/compiler/brw_fs_lower_regioning.cpp
> >> +++ b/src/intel/compiler/brw_fs_lower_regioning.cpp
> >> @@ -71,15 +71,25 @@ namespace {
> >>!is_byte_raw_mov(inst)) {
> >>   return get_exec_type_size(inst);
> >>} else {
> >> - unsigned stride = inst->dst.stride * type_sz(inst->dst.type);
> >> + /* Calculate the maximum byte stride and the minimum type size
> >> across
> >> +  * all source and destination operands.
> >> +  */
> >> + unsigned max_stride = inst->dst.stride *
> type_sz(inst->dst.type);
> >> + unsigned min_size = type_sz(inst->dst.type);
> >>
> >>   for (unsigned i = 0; i < inst->sources; i++) {
> >> -if (!is_uniform(inst->src[i]) &&
> !inst->is_control_source(i))
> >> -   stride = MAX2(stride, inst->src[i].stride *
> >> - type_sz(inst->src[i].type));
> >> +if (!is_uniform(inst->src[i]) &&
> !inst->is_control_source(i))
> >> {
> >> +   max_stride = MAX2(max_stride, inst->src[i].stride *
> >> + type_sz(inst->src[i].type));
> >> +   min_size = MIN2(min_size, type_sz(inst->src[i].type));
> >> +}
> >>   }
> >>
> >> - return stride;
> >> + /* Attempt to use the largest byte stride among all present
> >> operands,
> >> +  * but never exceed a stride of 4 since that would lead to
> >> illegal
> >> +  * destination regions during lowering.
> >> +  */
> >> + return MIN2(max_stride, 4 * min_size);
> >>
> >
> > Why not just fall back to tightly packed in this case?  I think I can
> > answer my own question:  Because using something that's equal to one of
> the
> > strides reduces the liklihood that we'll need a temporary.  If that's the
> > correct answer, then maybe what we want is the maximum of all strides
> with
> > stride_in_bytes <= 4 * type_sz?
> >
>
> We also want the result to be greater than or equal to the size of the
> largest non-uniform, non-control source type, since packing a vector of
> such a type into a temporary of lower byte stride than its size is
> impossible.  This patch guarantees that as long as max_size <= 4 *
> min_size, which is necessary for the lowering code that calls this
> function to work at all.
>
> It would be possible to preserve this guarantee while attempting to pick
> one of the strides of the pre-existing sources as you say -- I would be
> happy to review that change as a follow-up micro-optimization patch, but
> there are some corner cases to consider I don't necessarily want to
> bother with in the patch doing the functional change, for the sake of
> bisectability.
>

That's fine with me.

Reviewed-by: Jason Ekstrand 


> It may make sense to add an assert here that max_size <= 4 * min_size
> for the case such an instruction doesn't blow up already at the EU
> validator (it doesn't look like the validator is currently enforcing the
> lack of conversions between 8 and 64 bit types?), it will just involve
> calculating max_size in addition to max_stride and min_size above.
>
> > --Jason
> >
> >
> >>}
> >> }
> >>
> >> --
> >> 2.19.2
> >>
> >> ___
> >> mesa-dev mailing list
> >> mesa-dev@lists.freedesktop.org
> >> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> >>
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/5] intel/fs: Exclude control sources from execution type and region alignment calculations.

2019-02-21 Thread Jason Ekstrand
On Thu, Feb 14, 2019 at 9:33 PM Francisco Jerez 
wrote:

> Jason Ekstrand  writes:
>
> > On Fri, Jan 18, 2019 at 6:09 PM Francisco Jerez 
> > wrote:
> >
> >> Currently the execution type calculation will return a bogus value in
> >> cases like:
> >>
> >>   mov_indirect(8) vgrf0:w, vgrf1:w, vgrf2:ud, 32u
> >>
> >> Which will be considered to have a 32-bit integer execution type even
> >> though the actual indirect move operation will be carried out with
> >> 16-bit precision.
> >>
> >> Similarly there's no need to apply the CHV/BXT double-precision region
> >> alignment restrictions to such control sources, since they aren't
> >> directly involved in the double-precision arithmetic operations
> >> emitted by these virtual instructions.  Applying the CHV/BXT
> >> restrictions to control sources was expected to be harmless if mildly
> >> inefficient, but unfortunately it exposed problems at codegen level
> >> for virtual instructions (namely the SHUFFLE instruction used for the
> >> Vulkan 1.1 subgroup feature) that weren't prepared to accept control
> >> sources with an arbitrary strided region.
> >>
> >> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109328
> >> Reported-by: Mark Janes 
> >> Fixes: efa4e4bc5fc "intel/fs: Introduce regioning lowering pass."
> >> ---
> >>  src/intel/compiler/brw_fs.cpp | 54 +++
> >>  src/intel/compiler/brw_fs_lower_regioning.cpp |  6 +--
> >>  src/intel/compiler/brw_ir_fs.h| 10 +++-
> >>  3 files changed, 66 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/src/intel/compiler/brw_fs.cpp
> b/src/intel/compiler/brw_fs.cpp
> >> index 0359eb079f7..f475b617df2 100644
> >> --- a/src/intel/compiler/brw_fs.cpp
> >> +++ b/src/intel/compiler/brw_fs.cpp
> >> @@ -271,6 +271,60 @@ fs_inst::is_send_from_grf() const
> >> }
> >>  }
> >>
> >> +bool
> >> +fs_inst::is_control_source(unsigned arg) const
> >> +{
> >> +   switch (opcode) {
> >> +   case FS_OPCODE_UNIFORM_PULL_CONSTANT_LOAD:
> >> +   case FS_OPCODE_UNIFORM_PULL_CONSTANT_LOAD_GEN7:
> >> +   case FS_OPCODE_VARYING_PULL_CONSTANT_LOAD_GEN4:
> >> +   case FS_OPCODE_VARYING_PULL_CONSTANT_LOAD_GEN7:
> >> +  return arg == 0;
> >> +
> >> +   case SHADER_OPCODE_BROADCAST:
> >> +   case SHADER_OPCODE_SHUFFLE:
> >> +   case SHADER_OPCODE_QUAD_SWIZZLE:
> >> +   case FS_OPCODE_INTERPOLATE_AT_SAMPLE:
> >> +   case FS_OPCODE_INTERPOLATE_AT_SHARED_OFFSET:
> >> +   case FS_OPCODE_INTERPOLATE_AT_PER_SLOT_OFFSET:
> >> +   case SHADER_OPCODE_IMAGE_SIZE:
> >> +   case SHADER_OPCODE_GET_BUFFER_SIZE:
> >> +  return arg == 1;
> >> +
> >> +   case SHADER_OPCODE_MOV_INDIRECT:
> >> +   case SHADER_OPCODE_CLUSTER_BROADCAST:
> >> +   case SHADER_OPCODE_TEX:
> >> +   case FS_OPCODE_TXB:
> >> +   case SHADER_OPCODE_TXD:
> >> +   case SHADER_OPCODE_TXF:
> >> +   case SHADER_OPCODE_TXF_LZ:
> >> +   case SHADER_OPCODE_TXF_CMS:
> >> +   case SHADER_OPCODE_TXF_CMS_W:
> >> +   case SHADER_OPCODE_TXF_UMS:
> >> +   case SHADER_OPCODE_TXF_MCS:
> >> +   case SHADER_OPCODE_TXL:
> >> +   case SHADER_OPCODE_TXL_LZ:
> >> +   case SHADER_OPCODE_TXS:
> >> +   case SHADER_OPCODE_LOD:
> >> +   case SHADER_OPCODE_TG4:
> >> +   case SHADER_OPCODE_TG4_OFFSET:
> >> +   case SHADER_OPCODE_SAMPLEINFO:
> >> +   case SHADER_OPCODE_UNTYPED_ATOMIC:
> >> +   case SHADER_OPCODE_UNTYPED_ATOMIC_FLOAT:
> >> +   case SHADER_OPCODE_UNTYPED_SURFACE_READ:
> >> +   case SHADER_OPCODE_UNTYPED_SURFACE_WRITE:
> >> +   case SHADER_OPCODE_BYTE_SCATTERED_READ:
> >> +   case SHADER_OPCODE_BYTE_SCATTERED_WRITE:
> >> +   case SHADER_OPCODE_TYPED_ATOMIC:
> >> +   case SHADER_OPCODE_TYPED_SURFACE_READ:
> >> +   case SHADER_OPCODE_TYPED_SURFACE_WRITE:
> >>
> >
> > As of b284d222db, we are no longer using many of the opcodes in this list
> > (gen7 pull constant loads, [un]typed surface reads/writes, etc.)  It will
> > need to be rebased and we need to add SHADER_OPCODE_SEND to the list.
> > Fortunately, the changes to add SHADER_OPCODE_SEND landed before the 19.0
> > cut-off so there is no need to make two versions for backporting.
> >
>
> Yes, that's roughly what I had done during one of my previous rebases of
> this series, see:
>
>
> https://cgit.freedesktop.org/~currojerez/mesa/commit/?h=jenkins=30f8f3ff48b02ead688705e0679a98c0d6c9c87e
>

Looks good.  Some of those opcodes are no longer used.  I sent out a MR to
clean some of it up but I think it's better if this lands first so that it
can be more cleanly back-ported

Reviewed-by: Jason Ekstrand 


> > Other than that, this patch seems perfectly reasonable to me
> >
> > Reviewed-by: Jason Ekstrand 
> >
> > If you want me to hand-review the new list of opcodes, feel free to send
> a
> > v2 and cc me.
> >
> >
> >> +  return arg == 1 || arg == 2;
> >> +
> >> +   default:
> >> +  return false;
> >> +   }
> >> +}
> >> +
> >>  /**
> >>   * Returns true if this instruction's sources and destinations cannot
> >>   * safely be the same register.
> >> diff --git 

[Mesa-dev] [MR] intel/compiler: Clean up image operations

2019-02-21 Thread Jason Ekstrand
This MR cleans up some old cruft lying around in our back-end compiler
surrounding images.

The first three commits provide an enum for surface opcode sources and drop
the surface builder from the scalar back-end.  Now that image format
lowering has moved to NIR and we have logical opcodes for HW image ops,
there builder isn't really gaining us anything except some convenience
wrappers.  IMO, those wrappers are starting to be more trouble than they're
worth.  One could argue about it in the case of atomics but I think we'd be
better off with a quick helper function in brw_fs_nir.cpp.

Next, we drop the typed surface code from the vec4 back-end (I think it
works but it's never been used) and then drop some now unused ocodes.  We
could probably wire up image_load_store for vec4 if we really wanted to
(and now that the lowering is in NIR, it might not be too hard).  However,
it's kind-of pointless and the chances of someone actually doing that work
are getting increasingly remote.  Let's just drop the dead code.

Finally, we clean up some opcodes by deleting opcodes that are no longer
used and renaming some vec4-only opcodes VEC4.

https://gitlab.freedesktop.org/mesa/mesa/merge_requests/287/commits
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] tgsi: don't set tgsi_info::uses_bindless_images for constbufs and hw atomics

2019-02-21 Thread Ilia Mirkin
On Thu, Feb 21, 2019 at 11:09 AM Marek Olšák  wrote:
>
> On Thu, Feb 21, 2019, 12:08 AM Ilia Mirkin  wrote:
>>
>> I don't think that's right... this is used in e.g.
>>
>>case TGSI_OPCODE_RESQ:
>>   if (tgsi_is_bindless_image_file(fullinst->Src[0].Register.File))
>>  info->uses_bindless_images = true;
>>   break;
>>
>> And if you call RESQ with a src0 that is not IMAGE or BUFFER, then
>> that's a bindless reference.
>
>
> No it's not. TEMP, CONST, IN, and OUT are bindless references. CONSTBUF isn't.

Indeed you're quite right. Looks like the collective "we" thought of
this problem ahead of time. Excellent. (Now that you mention it, I
feel like I might have even been involved in that... how time flies.)

Reviewed-by: Ilia Mirkin 

>
> Marek
>
>>
>> I think the problem is that this is interacting poorly with the load
>> constbuf stuff -- probably needs a bit of subtlety there.
>>
>>   -ilia
>>
>> On Thu, Feb 21, 2019 at 12:03 AM Marek Olšák  wrote:
>> >
>> > From: Marek Olšák 
>> >
>> > This might have decreased performance for radeonsi/tgsi, because most
>> > most shaders claimed they used bindless.
>> >
>> > Cc: 18.3 19.0 
>> > ---
>> >  src/gallium/auxiliary/tgsi/tgsi_scan.h | 4 +++-
>> >  1 file changed, 3 insertions(+), 1 deletion(-)
>> >
>> > diff --git a/src/gallium/auxiliary/tgsi/tgsi_scan.h 
>> > b/src/gallium/auxiliary/tgsi/tgsi_scan.h
>> > index 580c73a2814..51d85af4fcb 100644
>> > --- a/src/gallium/auxiliary/tgsi/tgsi_scan.h
>> > +++ b/src/gallium/auxiliary/tgsi/tgsi_scan.h
>> > @@ -214,18 +214,20 @@ tgsi_scan_arrays(const struct tgsi_token *tokens,
>> >  void
>> >  tgsi_scan_tess_ctrl(const struct tgsi_token *tokens,
>> >  const struct tgsi_shader_info *info,
>> >  struct tgsi_tessctrl_info *out);
>> >
>> >  static inline bool
>> >  tgsi_is_bindless_image_file(unsigned file)
>> >  {
>> > return file != TGSI_FILE_IMAGE &&
>> >file != TGSI_FILE_MEMORY &&
>> > -  file != TGSI_FILE_BUFFER;
>> > +  file != TGSI_FILE_BUFFER &&
>> > +  file != TGSI_FILE_CONSTBUF &&
>> > +  file != TGSI_FILE_HW_ATOMIC;
>> >  }
>> >
>> >  #ifdef __cplusplus
>> >  } // extern "C"
>> >  #endif
>> >
>> >  #endif /* TGSI_SCAN_H */
>> > --
>> > 2.17.1
>> >
>> > ___
>> > mesa-dev mailing list
>> > mesa-dev@lists.freedesktop.org
>> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] tgsi: don't set tgsi_info::uses_bindless_images for constbufs and hw atomics

2019-02-21 Thread Marek Olšák
On Thu, Feb 21, 2019, 12:08 AM Ilia Mirkin  wrote:

> I don't think that's right... this is used in e.g.
>
>case TGSI_OPCODE_RESQ:
>   if (tgsi_is_bindless_image_file(fullinst->Src[0].Register.File))
>  info->uses_bindless_images = true;
>   break;
>
> And if you call RESQ with a src0 that is not IMAGE or BUFFER, then
> that's a bindless reference.
>

No it's not. TEMP, CONST, IN, and OUT are bindless references. CONSTBUF
isn't.

Marek


> I think the problem is that this is interacting poorly with the load
> constbuf stuff -- probably needs a bit of subtlety there.
>
>   -ilia
>
> On Thu, Feb 21, 2019 at 12:03 AM Marek Olšák  wrote:
> >
> > From: Marek Olšák 
> >
> > This might have decreased performance for radeonsi/tgsi, because most
> > most shaders claimed they used bindless.
> >
> > Cc: 18.3 19.0 
> > ---
> >  src/gallium/auxiliary/tgsi/tgsi_scan.h | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/src/gallium/auxiliary/tgsi/tgsi_scan.h
> b/src/gallium/auxiliary/tgsi/tgsi_scan.h
> > index 580c73a2814..51d85af4fcb 100644
> > --- a/src/gallium/auxiliary/tgsi/tgsi_scan.h
> > +++ b/src/gallium/auxiliary/tgsi/tgsi_scan.h
> > @@ -214,18 +214,20 @@ tgsi_scan_arrays(const struct tgsi_token *tokens,
> >  void
> >  tgsi_scan_tess_ctrl(const struct tgsi_token *tokens,
> >  const struct tgsi_shader_info *info,
> >  struct tgsi_tessctrl_info *out);
> >
> >  static inline bool
> >  tgsi_is_bindless_image_file(unsigned file)
> >  {
> > return file != TGSI_FILE_IMAGE &&
> >file != TGSI_FILE_MEMORY &&
> > -  file != TGSI_FILE_BUFFER;
> > +  file != TGSI_FILE_BUFFER &&
> > +  file != TGSI_FILE_CONSTBUF &&
> > +  file != TGSI_FILE_HW_ATOMIC;
> >  }
> >
> >  #ifdef __cplusplus
> >  } // extern "C"
> >  #endif
> >
> >  #endif /* TGSI_SCAN_H */
> > --
> > 2.17.1
> >
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] panfrost: Use tiler fast path (performance boost)

2019-02-21 Thread Alyssa Rosenzweig
> Yes, there is a buffer for holding the results of the tiler. The way it
> works is that the userspace driver allocates a very large buffer with a
> "grow on page fault" bit, so the kernel will allocate more memory as the
> tiler asks for it.

To be clear, right now I have this magic misc_0 buffer setup that way,
too (allocating something enormous and setting GROW_ON_GPF). Although,
it's not clear to me that that is the _correct_ thing to do; IIRC, the
blob allocates "just enough", no GROW_ON_GPF needed, but I should
probably check that.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] panfrost: Use tiler fast path (performance boost)

2019-02-21 Thread Alyssa Rosenzweig
> *somewhere* the result of VS (or binning shader if VS is split in two)
> needs to be saved for use during the per-tile rendering.  Presumably
> you have to give the hw a buffer of appropriate/matching size
> somewhere, or with enough geometry (and too small a buffer) things
> will overflow and go badly.

We allocate (in another part of a driver) the varying_mem buffer, which
is where varyings, including gl_Position (i.e. all the output of the VS)
get written to after VS and read from by the tiler.

That's not this, but it's not clear what this...

> I guess if you exceed the size given in .tiler_meta, then hw falls
> back to running VS per tile for all geometry (not just geom visible in
> the tile), hence big diff in perf for large tile counts and lotsa
> geometry.

This is possible, although I would expect that would show up in the
performance counters as unusually high VS activity (which was not the
case...? To be fair, I'm not convinced I'm reading the counters right
^^)
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 109532] ir_variable has maximum access out of bounds -- but it's not out of bounds

2019-02-21 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=109532

--- Comment #35 from asimiklit  ---
Created attachment 143430
  --> https://bugs.freedesktop.org/attachment.cgi?id=143430=edit
patch to disallow an elimination of the first unused ub/ssbo array elements

(In reply to Ian Romanick from comment #33)
> (In reply to andrii simiklit from comment #32)
> > (In reply to andrii simiklit from comment #31)
> > > (In reply to Mark Janes from comment #30)
> > > > (In reply to Mark Janes from comment #28)
> > > > >   
> > > > > https://android-review.googlesource.com/c/platform/external/deqp/+/901894
> > > > 
> > > > Mesa still asserts with this fix.  I also tested Andrii's mesa patch 
> > > > with
> > > > the dEQP fix and the test fails.
> > > Do you mean the Chris's dEQP fix here, yes?
> > > But looks like the mentioned Chris's dEQP fix considers some GL 
> > > limitations
> > > and doesn't affect the expectations of binding points.
> > > 
> > > Also the assertion is a separate issue, I created the piglit test for 
> > > that:
> > > https://patchwork.freedesktop.org/patch/286287/
> > > But yes, we unable to fix the test fail without assertion because of crash
> > > :-)
> > > 
> > > > 
> > > > Since non-mesa drivers have found issues with the original dEQP change, 
> > > > I
> > > > suspect there are still deeper problems with the test.
> > > Possible they have the same issue with binding points mismatch after
> > > optimizations by glsl compiler. 
> > > They could try this fix/hack for deqp which is already helped us:
> > > https://github.com/asimiklit/deqp/commit/
> > > 91cff8150944213f6da533e281ee76d95ca00f21
> > > If it helps them we will know that it is a common issue and it could
> > > expedite this:
> > > https://github.com/KhronosGroup/OpenGL-API/issues/46
> > 
> > So we have an answer from Piers Daniell:
> >   "I believe all buffer binding points should be consumed, regardless
> > whether 
> >the array elements are used or not. This would be the behavior of least 
> >surprise to the developer. I didn't see any language that would indicate 
> >that unused elements should not be counted when assigning the element to 
> >the buffer binding point."
> 
> I think this basically agrees with my earlier sentiment that we shouldn't
> trim elements from the beginning of the array.  It's generally ok (and in
> some cases expected) to trim elements from the end.

Are you talking about something like attached variant?

-- 
You are receiving this mail because:
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] panfrost: Use tiler fast path (performance boost)

2019-02-21 Thread Connor Abbott
On Thu, Feb 21, 2019 at 3:01 PM Rob Clark  wrote:

> On Thu, Feb 21, 2019 at 1:19 AM Alyssa Rosenzweig 
> wrote:
> >
> > For reasons that are still unclear (speculation included in the comment
> > added in this patch), the tiler? metadata has a fast path that we were
> > not enabling; there looks to be a possible time/memory tradeoff, but the
> > details remain unclear.
> >
> > Regardless, this patch improves performance dramatically. Particular
> > wins are for geometry-heavy scenes. For instance, glmark2-es2's
> > Phong-shaded bunny, rendering at fullscreen (2400x1600) via GBM, jumped
> > from ~20fps to hitting vsync cap at 60fps. Gains are even more obvious
> > when vsync is disabled, as in glmark2-es2-wayland.
> >
> > With this patch, on GLES 2.0 samples not involving FBOs, it appears
> > performance is converging with (and sometimes surpassing) the blob.
> >
> > Signed-off-by: Alyssa Rosenzweig 
> > ---
> >  src/gallium/drivers/panfrost/pan_context.c | 42 +++---
> >  1 file changed, 38 insertions(+), 4 deletions(-)
> >
> > diff --git a/src/gallium/drivers/panfrost/pan_context.c
> b/src/gallium/drivers/panfrost/pan_context.c
> > index 822b5a0dfef..2996a9c1e09 100644
> > --- a/src/gallium/drivers/panfrost/pan_context.c
> > +++ b/src/gallium/drivers/panfrost/pan_context.c
> > @@ -256,7 +256,28 @@ static struct bifrost_framebuffer
> >  panfrost_emit_mfbd(struct panfrost_context *ctx)
> >  {
> >  struct bifrost_framebuffer framebuffer = {
> > -.tiler_meta = 0xf0c600,
> > +/* It is not yet clear what tiler_meta means or how it's
> > + * calculated, but we can tell the lower 32-bits are a
> > + * (monotonically increasing?) function of tile count
> and
> > + * geometry complexity; I suspect it defines a memory
> size of
> > + * some kind? for the tiler. It's really unclear at the
> > + * moment... but to add to the confusion, the hardware
> is happy
> > + * enough to accept a zero in this field, so we don't
> even have
> > + * to worry about it right now.
>
> *somewhere* the result of VS (or binning shader if VS is split in two)
> needs to be saved for use during the per-tile rendering.  Presumably
> you have to give the hw a buffer of appropriate/matching size
> somewhere, or with enough geometry (and too small a buffer) things
> will overflow and go badly.
>
>
Yes, there is a buffer for holding the results of the tiler. The way it
works is that the userspace driver allocates a very large buffer with a
"grow on page fault" bit, so the kernel will allocate more memory as the
tiler asks for it.


> I guess if you exceed the size given in .tiler_meta, then hw falls
> back to running VS per tile for all geometry (not just geom visible in
> the tile), hence big diff in perf for large tile counts and lotsa
> geometry.
>

No, this isn't it. The vertex shaders aren't split in half, they all run
entirely before the tiler even starts. The only thing I could imagine would
be some optional part of the tiler buffer where the aforementioned grow on
page fault thing doesn't quite work out.


>
> It does sound a bit strange that it would depend on tile count.  I'd
> expect it would be a function strictly of amount of geometry (and
> possibly effected by whether or not gl_PointSize is written?).. and
> possibly amount of varyings if VS isn't split into two parts.
>
> BR,
> -R
>
> > + * The byte (just after the 32-bit mark) is much more
> > + * interesting. The higher nibble I've only ever seen
> as 0xF,
> > + * but the lower one I've seen as 0x0 or 0xF, and it's
> not
> > + * obvious what the difference is. But what -is-
> obvious is
> > + * that when the lower nibble is zero, performance is
> severely
> > + * degraded compared to when the lower nibble is set.
> > + * Evidently, that nibble enables some sort of fast
> path,
> > + * perhaps relating to caching or tile flush?
> Regardless, at
> > + * this point there's no clear reason not to set it,
> aside from
> > + * substantially increased memory requirements (of the
> misc_0
> > + * buffer) */
> > +
> > +.tiler_meta = ((uint64_t) 0xff << 32) | 0x0,
> >
> >  .width1 = MALI_POSITIVE(ctx->pipe_framebuffer.width),
> >  .height1 = MALI_POSITIVE(ctx->pipe_framebuffer.height),
> > @@ -271,10 +292,23 @@ panfrost_emit_mfbd(struct panfrost_context *ctx)
> >
> >  .unknown2 = 0x1f,
> >
> > -/* Presumably corresponds to unknown_address_X of SFBD
> */
> > +/* Corresponds to unknown_address_X of SFBD */
> >  .scratchpad = ctx->scratchpad.gpu,
> >  .tiler_scratch_start  = ctx->misc_0.gpu,
> > -

Re: [Mesa-dev] [PATCH v2] anv: advertise 8 subpixel precision bits

2019-02-21 Thread Jason Ekstrand
On Thu, Feb 21, 2019 at 3:45 AM Juan A. Suarez Romero 
wrote:

> On one side, when emitting 3DSTATE_SF, VertexSubPixelPrecisionSelect is
> used to select between 8 bit subpixel precision (value 0) or 4 bit
> subpixel precision (value 1). As this value is not set, means it is
> taking the value 0, so 8 bit are used.
>
> On the other side, in the Vulkan CTS tests, if the reference rasterizer,
> which uses 8 bit precision, as it is used to check what should be the
> expected value for the tests, is changed to use 4 bit as ANV was
> advertising so far, some of the tests will fail.
>
> So it seems ANV is actually using 8 bits.
>
> v2: explicitly set 3DSTATE_SF::VertexSubPixelPrecisionSelect (Jason)
>
> CC: Jason Ekstrand 
> CC: Kenneth Graunke 
> ---
>  src/intel/vulkan/anv_device.c| 2 +-
>  src/intel/vulkan/genX_pipeline.c | 1 +
>  2 files changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
> index 3120865466a..95224407318 100644
> --- a/src/intel/vulkan/anv_device.c
> +++ b/src/intel/vulkan/anv_device.c
> @@ -1095,7 +1095,7 @@ void anv_GetPhysicalDeviceProperties(
>   16 * devinfo->max_cs_threads,
>   16 * devinfo->max_cs_threads,
>},
> -  .subPixelPrecisionBits= 4 /* FIXME */,
> +  .subPixelPrecisionBits= 8,
>.subTexelPrecisionBits= 4 /* FIXME */,
>.mipmapPrecisionBits  = 4 /* FIXME */,
>.maxDrawIndexedIndexValue = UINT32_MAX,
> diff --git a/src/intel/vulkan/genX_pipeline.c
> b/src/intel/vulkan/genX_pipeline.c
> index 6255e5d83c5..b06036a6fc7 100644
> --- a/src/intel/vulkan/genX_pipeline.c
> +++ b/src/intel/vulkan/genX_pipeline.c
> @@ -464,6 +464,7 @@ emit_rs_state(struct anv_pipeline *pipeline,
> sf.TriangleStripListProvokingVertexSelect = 0;
> sf.LineStripListProvokingVertexSelect = 0;
> sf.TriangleFanProvokingVertexSelect = 1;
> +   sf.VertexSubPixelPrecisionSelect = 0;
>

Please use the _8bit #define.  It looks like you may have to modify the
gen7 XML (and possibly earlier if you don't mind) to add the enum values.


>
> const struct brw_vue_prog_data *last_vue_prog_data =
>anv_pipeline_get_last_vue_prog_data(pipeline);
> --
> 2.20.1
>
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] nir: clone instruction set rather than removing individual entries

2019-02-21 Thread Jason Ekstrand
Reviewed-by: Jason Ekstrand 

On Thu, Feb 21, 2019 at 5:33 AM Thomas Helland 
wrote:

> This patch is:
>
> Reviewed-by: Thomas Helland
>
> Den ons. 20. feb. 2019 kl. 04:04 skrev Timothy Arceri <
> tarc...@itsqueeze.com>:
> >
> > This reduces the time spent in nir_opt_cse() by almost a half.
> > ---
> >  src/compiler/nir/nir_opt_cse.c | 6 +++---
> >  1 file changed, 3 insertions(+), 3 deletions(-)
> >
> > diff --git a/src/compiler/nir/nir_opt_cse.c
> b/src/compiler/nir/nir_opt_cse.c
> > index bf42a6a33dc..3c3617d852a 100644
> > --- a/src/compiler/nir/nir_opt_cse.c
> > +++ b/src/compiler/nir/nir_opt_cse.c
> > @@ -39,9 +39,10 @@
> >   */
> >
> >  static bool
> > -cse_block(nir_block *block, struct set *instr_set)
> > +cse_block(nir_block *block, struct set *dominance_set)
> >  {
> > bool progress = false;
> > +   struct set *instr_set = _mesa_set_clone(dominance_set, NULL);
> >
> > nir_foreach_instr_safe(instr, block) {
> >if (nir_instr_set_add_or_rewrite(instr_set, instr)) {
> > @@ -55,8 +56,7 @@ cse_block(nir_block *block, struct set *instr_set)
> >progress |= cse_block(child, instr_set);
> > }
> >
> > -   nir_foreach_instr(instr, block)
> > - nir_instr_set_remove(instr_set, instr);
> > +   _mesa_set_destroy(instr_set, NULL);
> >
> > return progress;
> >  }
> > --
> > 2.20.1
> >
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v3] i965: fixed clamping in set_scissor_bits when the y is flipped

2019-02-21 Thread Qa Qa
I've run the test with applied patch and it's successfully passed.
Also, this patch fixed an one issue from another ticket -
https://bugs.freedesktop.org/show_bug.cgi?id=109594 - with this patch,
totem app doesn't crashes at resizing

Thanks!

Tested-by: Paul Chelombitko 

чт, 21 февр. 2019 г. в 12:03, Eleni Maria Stea :

> Calculating the scissor rectangle fields with the y flipped (0 on top)
> can generate negative values that will cause assertion failure later on
> as the scissor fields are all unsigned. We must clamp the bbox values
> again to make sure they don't exceed the fb_height. Also fixed a
> calculation error.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108999
>
> v2:
>- I initially clamped the values inside the if (Y is flipped) case
>and I made a mistake in the calculation: the clamp of the bbox[2] should
>be a check if (bbox[2] >= fbheight) bbox[2] = fbheight - 1 instead and I
>shouldn't have changed the ScissorRectangleYMax calculation. As the
>fixed code is equivalent with using CLAMP instead of MAX2 at the top of
>the function when bbox[2] and bbox[3] are calculated, and the 2nd is
> more
>clear, I replaced it. (Nanley Chery)
>
> v3:
>- Reversed the CLAMP change in bbox[3] as the API guarantees that the
>viewport height is positive. (Nanley Chery)
> ---
>  src/mesa/drivers/dri/i965/genX_state_upload.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c
> b/src/mesa/drivers/dri/i965/genX_state_upload.c
> index dcdfb3c9292..47f3741e673 100644
> --- a/src/mesa/drivers/dri/i965/genX_state_upload.c
> +++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
> @@ -2445,7 +2445,7 @@ set_scissor_bits(const struct gl_context *ctx, int i,
>
> bbox[0] = MAX2(ctx->ViewportArray[i].X, 0);
> bbox[1] = MIN2(bbox[0] + ctx->ViewportArray[i].Width, fb_width);
> -   bbox[2] = MAX2(ctx->ViewportArray[i].Y, 0);
> +   bbox[2] = CLAMP(ctx->ViewportArray[i].Y, 0, fb_height);
> bbox[3] = MIN2(bbox[2] + ctx->ViewportArray[i].Height, fb_height);
> _mesa_intersect_scissor_bounding_box(ctx, i, bbox);
>
> --
> 2.20.1
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 109535] [Tracker] Mesa 19.0 release tracker

2019-02-21 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=109535
Bug 109535 depends on bug 109594, which changed state.

Bug 109594 Summary: totem assert failure: totem: 
src/intel/genxml/gen9_pack.h:72: __gen_uint: La declaración `v <= max' no se 
cumple.
https://bugs.freedesktop.org/show_bug.cgi?id=109594

   What|Removed |Added

 Status|RESOLVED|REOPENED
 Resolution|FIXED   |---

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 109535] [Tracker] Mesa 19.0 release tracker

2019-02-21 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=109535
Bug 109535 depends on bug 109594, which changed state.

Bug 109594 Summary: totem assert failure: totem: 
src/intel/genxml/gen9_pack.h:72: __gen_uint: La declaración `v <= max' no se 
cumple.
https://bugs.freedesktop.org/show_bug.cgi?id=109594

   What|Removed |Added

 Status|NEEDINFO|RESOLVED
 Resolution|--- |FIXED

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] gallium/auxiliary/vl: Fix duplicate symbol build errors.

2019-02-21 Thread James Zhu

On 2019-02-19 10:58 a.m., Emil Velikov wrote:
> On Tue, 19 Feb 2019 at 03:32, Vinson Lee  wrote:
>>CXXLDgallium_dri.la
>> duplicate symbol _compute_shader_video_buffer in:
>>  
>> ../../../../src/gallium/auxiliary/.libs/libgalliumvl.a(libgalliumvl_la-vl_compositor.o)
>>  
>> ../../../../src/gallium/auxiliary/.libs/libgalliumvl.a(libgalliumvl_la-vl_compositor_cs.o)
>> duplicate symbol _compute_shader_weave in:
>>  
>> ../../../../src/gallium/auxiliary/.libs/libgalliumvl.a(libgalliumvl_la-vl_compositor.o)
>>  
>> ../../../../src/gallium/auxiliary/.libs/libgalliumvl.a(libgalliumvl_la-vl_compositor_cs.o)
>> duplicate symbol _compute_shader_rgba in:
>>  
>> ../../../../src/gallium/auxiliary/.libs/libgalliumvl.a(libgalliumvl_la-vl_compositor.o)
>>  
>> ../../../../src/gallium/auxiliary/.libs/libgalliumvl.a(libgalliumvl_la-vl_compositor_cs.o)
>>
>> Fixes: 9364d66cb7f7 ("gallium/auxiliary/vl: Add video compositor compute 
>> shader render")
>> Signed-off-by: Vinson Lee 
>> ---
>>   src/gallium/auxiliary/vl/vl_compositor_cs.h | 6 +++---
>>   1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/src/gallium/auxiliary/vl/vl_compositor_cs.h 
>> b/src/gallium/auxiliary/vl/vl_compositor_cs.h
>> index 7a203d327eda..a73a8755fc2a 100644
>> --- a/src/gallium/auxiliary/vl/vl_compositor_cs.h
>> +++ b/src/gallium/auxiliary/vl/vl_compositor_cs.h
>> @@ -32,9 +32,9 @@
>>
>>   #include "vl_compositor.h"
>>
>> -char *compute_shader_video_buffer;
>> -char *compute_shader_weave;
>> -char *compute_shader_rgba;
>> +extern char *compute_shader_video_buffer;
>> +extern char *compute_shader_weave;
>> +extern char *compute_shader_rgba;
>>
> Please make them also "const" - in both here and C file.
>
> With that the patch is
> Reviewed-by: Emil Velikov 

Agreed!

This patch is Reviewed-by: James Zhu 

James

>
> Thanks
> Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] panfrost: Use tiler fast path (performance boost)

2019-02-21 Thread Rob Clark
On Thu, Feb 21, 2019 at 1:19 AM Alyssa Rosenzweig  wrote:
>
> For reasons that are still unclear (speculation included in the comment
> added in this patch), the tiler? metadata has a fast path that we were
> not enabling; there looks to be a possible time/memory tradeoff, but the
> details remain unclear.
>
> Regardless, this patch improves performance dramatically. Particular
> wins are for geometry-heavy scenes. For instance, glmark2-es2's
> Phong-shaded bunny, rendering at fullscreen (2400x1600) via GBM, jumped
> from ~20fps to hitting vsync cap at 60fps. Gains are even more obvious
> when vsync is disabled, as in glmark2-es2-wayland.
>
> With this patch, on GLES 2.0 samples not involving FBOs, it appears
> performance is converging with (and sometimes surpassing) the blob.
>
> Signed-off-by: Alyssa Rosenzweig 
> ---
>  src/gallium/drivers/panfrost/pan_context.c | 42 +++---
>  1 file changed, 38 insertions(+), 4 deletions(-)
>
> diff --git a/src/gallium/drivers/panfrost/pan_context.c 
> b/src/gallium/drivers/panfrost/pan_context.c
> index 822b5a0dfef..2996a9c1e09 100644
> --- a/src/gallium/drivers/panfrost/pan_context.c
> +++ b/src/gallium/drivers/panfrost/pan_context.c
> @@ -256,7 +256,28 @@ static struct bifrost_framebuffer
>  panfrost_emit_mfbd(struct panfrost_context *ctx)
>  {
>  struct bifrost_framebuffer framebuffer = {
> -.tiler_meta = 0xf0c600,
> +/* It is not yet clear what tiler_meta means or how it's
> + * calculated, but we can tell the lower 32-bits are a
> + * (monotonically increasing?) function of tile count and
> + * geometry complexity; I suspect it defines a memory size of
> + * some kind? for the tiler. It's really unclear at the
> + * moment... but to add to the confusion, the hardware is 
> happy
> + * enough to accept a zero in this field, so we don't even 
> have
> + * to worry about it right now.

*somewhere* the result of VS (or binning shader if VS is split in two)
needs to be saved for use during the per-tile rendering.  Presumably
you have to give the hw a buffer of appropriate/matching size
somewhere, or with enough geometry (and too small a buffer) things
will overflow and go badly.

I guess if you exceed the size given in .tiler_meta, then hw falls
back to running VS per tile for all geometry (not just geom visible in
the tile), hence big diff in perf for large tile counts and lotsa
geometry.

It does sound a bit strange that it would depend on tile count.  I'd
expect it would be a function strictly of amount of geometry (and
possibly effected by whether or not gl_PointSize is written?).. and
possibly amount of varyings if VS isn't split into two parts.

BR,
-R

> + * The byte (just after the 32-bit mark) is much more
> + * interesting. The higher nibble I've only ever seen as 0xF,
> + * but the lower one I've seen as 0x0 or 0xF, and it's not
> + * obvious what the difference is. But what -is- obvious is
> + * that when the lower nibble is zero, performance is 
> severely
> + * degraded compared to when the lower nibble is set.
> + * Evidently, that nibble enables some sort of fast path,
> + * perhaps relating to caching or tile flush? Regardless, at
> + * this point there's no clear reason not to set it, aside 
> from
> + * substantially increased memory requirements (of the misc_0
> + * buffer) */
> +
> +.tiler_meta = ((uint64_t) 0xff << 32) | 0x0,
>
>  .width1 = MALI_POSITIVE(ctx->pipe_framebuffer.width),
>  .height1 = MALI_POSITIVE(ctx->pipe_framebuffer.height),
> @@ -271,10 +292,23 @@ panfrost_emit_mfbd(struct panfrost_context *ctx)
>
>  .unknown2 = 0x1f,
>
> -/* Presumably corresponds to unknown_address_X of SFBD */
> +/* Corresponds to unknown_address_X of SFBD */
>  .scratchpad = ctx->scratchpad.gpu,
>  .tiler_scratch_start  = ctx->misc_0.gpu,
> -.tiler_scratch_middle = ctx->misc_0.gpu + 
> /*ctx->misc_0.size*/40960, /* Size depends on the size of the framebuffer and 
> the number of vertices */
> +
> +/* The constant added here is, like the lower word of
> + * tiler_meta, (loosely) another product of framebuffer size
> + * and geometry complexity. It must be sufficiently large for
> + * the tiler_meta fast path to work; if it's too small, there
> + * will be DATA_INVALID_FAULTs. Conversely, it must be less
> + * than the total size of misc_0, or else there's no room. 
> It's
> + * possible this constant configures a partition between two
> + * parts of 

Re: [Mesa-dev] [PATCH v2 2/2] isl: the display engine requires 64B alignment for linear surfaces

2019-02-21 Thread Lionel Landwerlin

On 21/02/2019 13:30, Chris Wilson wrote:

Quoting Lionel Landwerlin (2019-02-21 12:57:09)

I did not find the PRM bit that says it must be 64b aligned, but I can
see that's what i915 checks.

Chris: If you have a pointer to it, I could add the quote.

In amongst the register specs,
PLANE_STRIDE:
For Linear memory, this field specifies the stride in chunks of 64 bytes (1 
cache line).
-Chris


Thanks a lot!

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2 2/2] isl: the display engine requires 64B alignment for linear surfaces

2019-02-21 Thread Chris Wilson
Quoting Lionel Landwerlin (2019-02-21 12:57:09)
> I did not find the PRM bit that says it must be 64b aligned, but I can 
> see that's what i915 checks.
> 
> Chris: If you have a pointer to it, I could add the quote.

In amongst the register specs,
PLANE_STRIDE:
For Linear memory, this field specifies the stride in chunks of 64 bytes (1 
cache line).
-Chris
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2 2/2] isl: the display engine requires 64B alignment for linear surfaces

2019-02-21 Thread Lionel Landwerlin
I did not find the PRM bit that says it must be 64b aligned, but I can 
see that's what i915 checks.


Chris: If you have a pointer to it, I could add the quote.

Thanks!

Reviewed-by: Lionel Landwerlin 

On 19/02/2019 12:06, Samuel Iglesias Gonsálvez wrote:

Signed-off-by: Samuel Iglesias Gonsálvez 
---
  src/intel/isl/isl.c | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
index 5c34efb9a13..7fb469687fa 100644
--- a/src/intel/isl/isl.c
+++ b/src/intel/isl/isl.c
@@ -1519,6 +1519,9 @@ isl_surf_init_s(const struct isl_device *dev,
   }
}
base_alignment_B = isl_round_up_to_power_of_two(base_alignment_B);
+  /* The display engine requires 64B alignment for linear surfaces.  */
+  if (isl_surf_usage_is_display(info->usage))
+ base_alignment_B = MAX(base_alignment_B, 64);
 } else {
const uint32_t total_h_tl =
   isl_align_div(phys_total_el.h, tile_info.logical_extent_el.height);



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [RFC PATCH 2/2] etnaviv: try to use CPU to tile/untile when the region is small enough

2019-02-21 Thread Boris Brezillon
Using the GPU has a cost, it implies preparing the RS/BLT request and
then possibly requires extra GPU -> CPU syncs, so let's only use
GPU-based tiling/untiling for rather big regions.

While at it, move the "should we use the GPU to tile/untile" logic to
its own function to make things more readable.

Signed-off-by: Boris Brezillon 
---
 .../drivers/etnaviv/etnaviv_transfer.c| 67 +--
 1 file changed, 61 insertions(+), 6 deletions(-)

diff --git a/src/gallium/drivers/etnaviv/etnaviv_transfer.c 
b/src/gallium/drivers/etnaviv/etnaviv_transfer.c
index 38648564b701..a3013e624ead 100644
--- a/src/gallium/drivers/etnaviv/etnaviv_transfer.c
+++ b/src/gallium/drivers/etnaviv/etnaviv_transfer.c
@@ -132,6 +132,66 @@ etna_transfer_unmap(struct pipe_context *pctx, struct 
pipe_transfer *ptrans)
slab_free(>transfer_pool, trans);
 }
 
+#define GPU_TILING_MIN_SURFACE_SIZE1024
+#define GPU_TILING_MIN_UNALIGNED_SURFACE_SIZE  8192
+
+static bool
+etna_transfer_use_gpu(struct etna_context *ctx, struct etna_resource *rsc,
+  const struct pipe_box *box)
+{
+   unsigned w_align, h_align;
+
+   /* Always use the GPU to tile/untile when the resource has a Tile Status
+* buffer attached. */
+   if (rsc->ts_bo)
+  return true;
+
+   /* Nothing to tile/untile if the resource is using a linear format. */
+   if (rsc->layout == ETNA_LAYOUT_LINEAR)
+  return false;
+
+   /* Do not use the GPU when the resource is using a 1byte/pixel format. */
+   if (util_format_get_blocksize(rsc->base.format) == 1)
+  return false;
+
+   /* HALIGN 4 resources are incompatible with the resolve engine, so fall back
+* to using software to detile this resource. */
+   if (rsc->halign == TEXTURE_HALIGN_FOUR)
+  return false;
+
+   /* Using the GPU has a cost, it implies preparing the RS/BLT request and
+* then possibly requires extra GPU -> CPU syncs, so let's only use
+* GPU-based tiling for rather big regions. Right now the minimum surface
+* surface size is arbitrarily set to 1024 pixels. */
+   if (box->width * box->height < GPU_TILING_MIN_SURFACE_SIZE)
+  return false;
+
+   /* No alignment constraints when using BLT. */
+   if (ctx->specs.use_blt)
+  return true;
+
+   if (rsc->layout & ETNA_LAYOUT_BIT_SUPER) {
+  w_align = h_align = 64;
+   } else {
+  w_align = ETNA_RS_WIDTH_MASK + 1;
+  h_align = ETNA_RS_HEIGHT_MASK + 1;
+   }
+   h_align *= ctx->screen->specs.pixel_pipes;
+
+   /* Everything is properly aligned, let's use the RS engine to
+* tile/untile. */
+   if (!((box->x | box->width) & (w_align - 1)) &&
+   !((box->y | box->height) & (h_align - 1)))
+  return true;
+
+   /* We want the minimum surface size to be even bigger for unaligned
+* requests. */
+   if (box->width * box->height < GPU_TILING_MIN_UNALIGNED_SURFACE_SIZE)
+  return false;
+
+   return true;
+}
+
 static void *
 etna_transfer_map(struct pipe_context *pctx, struct pipe_resource *prsc,
   unsigned level,
@@ -179,12 +239,7 @@ etna_transfer_map(struct pipe_context *pctx, struct 
pipe_resource *prsc,
* render resource. Use the texture resource, which avoids bouncing
* pixels between the two resources, and we can de-tile it in s/w. */
   rsc = etna_resource(rsc->texture);
-   } else if (rsc->ts_bo ||
-  (rsc->layout != ETNA_LAYOUT_LINEAR &&
-   util_format_get_blocksize(format) > 1 &&
-   /* HALIGN 4 resources are incompatible with the resolve engine,
-* so fall back to using software to detile this resource. */
-   rsc->halign != TEXTURE_HALIGN_FOUR)) {
+   } else if (etna_transfer_use_gpu(ctx, rsc, box)) {
   /* If the surface has tile status, we need to resolve it first.
* The strategy we implement here is to use the RS to copy the
* depth buffer, filling in the "holes" where the tile status
-- 
2.20.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 109532] ir_variable has maximum access out of bounds -- but it's not out of bounds

2019-02-21 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=109532

--- Comment #34 from asimiklit  ---
(In reply to asimiklit from comment #26)
> (In reply to Ian Romanick from comment #17)
> > (In reply to asimiklit from comment #6)
> > > Created attachment 143288 [details]
> > > this simple program helps me to reproduce this issue.
> > > 
> > > just share my simple reproducer)
> > > 
> > > Run it in this way:
> > > 
> > >simple_reproducer shader.comp
> > 
> > It seems like this could be made into a piglit test that could be compiled
> > with glslparsertest.  If you haven't already submitted such a test, please
> > do. :)
> 
> I work on it)
The two piglit tests were suggested:
https://patchwork.freedesktop.org/patch/286287/

-- 
You are receiving this mail because:
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 1/1] Avoid leaking parameter name in prog_to_nir.

2019-02-21 Thread Matthias Lorenz
Fixes: 3d7611e9 "st/nir: use NIR for asm programs"
---
 src/mesa/program/prog_to_nir.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/mesa/program/prog_to_nir.c b/src/mesa/program/prog_to_nir.c
index 312b299361e..6e3fa9432a3 100644
--- a/src/mesa/program/prog_to_nir.c
+++ b/src/mesa/program/prog_to_nir.c
@@ -1024,7 +1024,8 @@ prog_to_nir(const struct gl_program *prog,
   c->parameters = rzalloc(s, nir_variable);
   c->parameters->type =
  glsl_array_type(glsl_vec4_type(), prog->Parameters->NumParameters, 0);
-  c->parameters->name = strdup(prog->Parameters->Parameters[0].Name);
+  c->parameters->name =
+ ralloc_strdup(c->parameters, prog->Parameters->Parameters[0].Name);
   c->parameters->data.read_only = true;
   c->parameters->data.mode = nir_var_uniform;
   exec_list_push_tail(>uniforms, >parameters->node);
--
2.20.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [RFC PATCH 0/2] etnaviv: use CPU-based tiling/untiling for small regions

2019-02-21 Thread Boris Brezillon
Hi,

This series aims at optimizing updates of small regions inside a
texture. This is particularly useful when updating a texture atlas one
bit at a time.

Before going for this CPU-based tiling/untiling approach we tried
optimizing things by keeping track of all updates triggered by the
transfer_map/unmap() calls to avoid GPU -> CPU syncs when new regions
are updated without touching other already written regions [1].
Unfortunately, because of the supertile+RS-alignment constraints this
led to almost all updates requiring such a sync (at least in our use
case where each element of the atlas is not properly aligned).

This led us to consider another approach: avoid GPU-based tiling for
small regions, especially when they're not properly aligned. This is
what this patchset implements.

Patch 1 add supports for super and multi tile untiling, and patch 2
makes use of the CPU-based tiling/untiling when the region is small
enough. Note that the thresholds used here have been chosen arbitrarily
but they seem to speed things up in our use case.

Future possible improvements involve supporting NEON-based
tiling/untiling (as started here[2]).

Another option we considered was keeping a shadow linear buffer
attached to the tiled resource so that we could update the linear
version without waiting for the GPU to finish rendering things on
the final BO, but this option will likely be more invasive.

Please let me know what you think of this CPU-based tiling approach,
and if you don't like it, feel free to propose other solution that
would be worth investigating.

Thanks,

Boris

[1]https://gitlab.collabora.com/bbrezillon/mesa/commits/etna-texture-atlas-18.2.4
[2]https://github.com/laanwj/mesa/commit/6d575b3094f17e29246be72dce8fbb6fe048db2c

Boris Brezillon (2):
  etnaviv: add support for CPU-based super/multi tile tiling/untiling
  etnaviv: try to use CPU to tile/untile when the region is small enough

 src/gallium/drivers/etnaviv/etnaviv_tiling.c  | 85 ---
 src/gallium/drivers/etnaviv/etnaviv_tiling.h  | 10 ++-
 .../drivers/etnaviv/etnaviv_transfer.c| 83 ++
 3 files changed, 145 insertions(+), 33 deletions(-)

-- 
2.20.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 0/1] Avoid leaking parameter name in prog_to_nir

2019-02-21 Thread Matthias Lorenz
Noticed this while looking at recent commits.
I'm not familiar with the Mesa codebase so this might be incorrect,
but I don't see how this strdup'ed variable would be freed.

Matthias Lorenz (1):
  Avoid leaking parameter name in prog_to_nir.

 src/mesa/program/prog_to_nir.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--
2.20.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [RFC PATCH 1/2] etnaviv: add support for CPU-based super/multi tile tiling/untiling

2019-02-21 Thread Boris Brezillon
Sometimes using CPU based untiling/tiling makes, like when updating
a really small region of the texture atlas which is not RS-aligned.
CPU-based support for simple tile layout was already supported, but
not the multi/super tile cases.
Make the etna_texture_[un]tile() more generic to support those cases.

Signed-off-by: Boris Brezillon 
---
 src/gallium/drivers/etnaviv/etnaviv_tiling.c  | 85 ---
 src/gallium/drivers/etnaviv/etnaviv_tiling.h  | 10 ++-
 .../drivers/etnaviv/etnaviv_transfer.c| 16 ++--
 3 files changed, 84 insertions(+), 27 deletions(-)

diff --git a/src/gallium/drivers/etnaviv/etnaviv_tiling.c 
b/src/gallium/drivers/etnaviv/etnaviv_tiling.c
index f4f85c1d6e6c..7e2b8bd48d3a 100644
--- a/src/gallium/drivers/etnaviv/etnaviv_tiling.c
+++ b/src/gallium/drivers/etnaviv/etnaviv_tiling.c
@@ -32,39 +32,95 @@
 #define TEX_TILE_WIDTH (4)
 #define TEX_TILE_HEIGHT (4)
 #define TEX_TILE_WORDS (TEX_TILE_WIDTH * TEX_TILE_HEIGHT)
+#define TEX_SUPERTILE_WIDTH (64)
+#define TEX_SUPERTILE_HEIGHT (64)
+#define TEX_SUPERTILE_TWIDTH (16)
+#define TEX_SUPERTILE_THEIGHT (16)
+#define TEX_SUPERTILE_WORDS (TEX_SUPERTILE_WIDTH * TEX_SUPERTILE_HEIGHT)
+#define TEX_TILES_PER_SUPERTILE (TEX_SUPERTILE_TWIDTH * TEX_SUPERTILE_THEIGHT)
+
+static unsigned tile_buf_offset(enum etna_surface_layout tlayout,
+unsigned tx, unsigned ty,
+   unsigned tstride, unsigned th,
+   unsigned elmtsize)
+{
+   unsigned offs = 0, tile;
+   unsigned tiles_per_line = (tstride / elmtsize) / TEX_TILE_WIDTH;
+
+   /* Multi tile layouts are described here:
+* 
https://github.com/laanwj/etna_viv/blob/master/doc/hardware.md#multi-tiling
+*/
+   if (tlayout & ETNA_LAYOUT_BIT_MULTI) {
+  if tx / (TEX_TILE_WIDTH * 2)) & 1) ^ ((ty / TEX_TILE_HEIGHT) & 1))) {
+ offs += ((tstride / elmtsize) * (th / 2));
+ if ((tx / (TEX_TILE_WIDTH * 2)) & 1)
+tx -= TEX_TILE_WIDTH * 2;
+else
+tx += TEX_TILE_HEIGHT * 2;
+  }
+
+  ty = ((ty / 2) & ~(TEX_TILE_HEIGHT - 1)) +
+   (ty % TEX_TILE_HEIGHT);
+   }
+
+   if (tlayout & ETNA_LAYOUT_BIT_SUPER) {
+  /* We use the supertile layout described here:
+   * 
https://github.com/laanwj/etna_viv/blob/master/doc/hardware.md#supertiling
+   * FIXME: According to
+   * https://github.com/laanwj/etna_viv/blob/master/tools/detiler.py
+   * another layout exists. We should probably support CPU-based detiling
+   * for this layout too and determine the layout to used based on HW
+   * features. */
+  tile = (ty / TEX_SUPERTILE_HEIGHT) * tiles_per_line *
+ TEX_SUPERTILE_THEIGHT;
+  ty &= TEX_SUPERTILE_HEIGHT - 1;
+  tile += (tx / TEX_SUPERTILE_WIDTH) * TEX_TILES_PER_SUPERTILE;
+  tx &= TEX_SUPERTILE_WIDTH - 1;
+  tile += (ty / (4 * TEX_TILE_HEIGHT)) * 16 * 4;
+  ty &= (4 * TEX_TILE_HEIGHT) - 1;
+  tile += (ty / TEX_TILE_HEIGHT) * 2;
+  tile += ((tx / TEX_TILE_WIDTH) & ~0x1) * 4;
+  tile += (tx / TEX_TILE_WIDTH) & 0x1;
+   } else {
+  tile = (ty / TEX_TILE_HEIGHT) * tiles_per_line;
+  tile += tx / TEX_TILE_WIDTH;
+   }
+
+   offs += tile * TEX_TILE_WORDS;
+   ty &= TEX_TILE_HEIGHT - 1;
+   tx &= TEX_TILE_WIDTH - 1;
+   offs += (ty * TEX_TILE_WIDTH) + tx;
+
+   return offs;
+}
 
 #define DO_TILE(type)   \
src_stride /= sizeof(type);  \
-   dst_stride = (dst_stride * TEX_TILE_HEIGHT) / sizeof(type);  \
for (unsigned srcy = 0; srcy < height; ++srcy) { \
-  unsigned dsty = basey + srcy; \
-  unsigned ty = (dsty / TEX_TILE_HEIGHT) * dst_stride + \
-(dsty % TEX_TILE_HEIGHT) * TEX_TILE_WIDTH;  \
   for (unsigned srcx = 0; srcx < width; ++srcx) {   \
- unsigned dstx = basex + srcx;  \
- ((type *)dest)[ty + (dstx / TEX_TILE_WIDTH) * TEX_TILE_WORDS + \
-(dstx % TEX_TILE_WIDTH)] =  \
+ unsigned destoffs = tile_buf_offset(baselayout, basex + srcx,  \
+ basey + srcy, dst_stride,  \
+ baseh, elmtsize);  \
+ ((type *)dest)[destoffs] = \
 ((type *)src)[srcy * src_stride + srcx];\
   } \
}
 
 #define DO_UNTILE(type)   \
-   src_stride = (src_stride * TEX_TILE_HEIGHT) / sizeof(type);\
dst_stride /= sizeof(type);\
for (unsigned dsty = 0; dsty < height; ++dsty) {   \
-  unsigned srcy = basey + 

Re: [Mesa-dev] [PATCH] nir: clone instruction set rather than removing individual entries

2019-02-21 Thread Thomas Helland
This patch is:

Reviewed-by: Thomas Helland

Den ons. 20. feb. 2019 kl. 04:04 skrev Timothy Arceri :
>
> This reduces the time spent in nir_opt_cse() by almost a half.
> ---
>  src/compiler/nir/nir_opt_cse.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/src/compiler/nir/nir_opt_cse.c b/src/compiler/nir/nir_opt_cse.c
> index bf42a6a33dc..3c3617d852a 100644
> --- a/src/compiler/nir/nir_opt_cse.c
> +++ b/src/compiler/nir/nir_opt_cse.c
> @@ -39,9 +39,10 @@
>   */
>
>  static bool
> -cse_block(nir_block *block, struct set *instr_set)
> +cse_block(nir_block *block, struct set *dominance_set)
>  {
> bool progress = false;
> +   struct set *instr_set = _mesa_set_clone(dominance_set, NULL);
>
> nir_foreach_instr_safe(instr, block) {
>if (nir_instr_set_add_or_rewrite(instr_set, instr)) {
> @@ -55,8 +56,7 @@ cse_block(nir_block *block, struct set *instr_set)
>progress |= cse_block(child, instr_set);
> }
>
> -   nir_foreach_instr(instr, block)
> - nir_instr_set_remove(instr_set, instr);
> +   _mesa_set_destroy(instr_set, NULL);
>
> return progress;
>  }
> --
> 2.20.1
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] anv: fix alphaToCoverage when there is no color attachment

2019-02-21 Thread Samuel Iglesias Gonsálvez
CL#3532 added a test for alpha to coverage without a color attachment.
First the test draws a primitive with alpha 0 and a subpass with only
a depth buffer. No writes to a depth buffer are expected. Then a
second draw with a color buffer and the same depth buffer is done to
verify the depth buffer still has the original clear values.

This behavior is not explicitly forbidden by the Vulkan spec, so
it seems it is allowed.

When there is no color attachment for a given output, we discard it
so at the end we have an FS assembly like:

Native code for unnamed fragment shader (null)
SIMD16 shader: 1 instructions. 0 loops. 4 cycles. 0:0 spills:fills.
Promoted 0 constants. Compacted 16 to 16 bytes (0%)
   START B0 (4 cycles)
sendc(16)   null<1>UW   g120<0,1,0>F0x90031000

render MsgDesc: RT write SIMD16 LastRT Surface = 0 mlen 8 rlen 0 { align1 1H 
EOT };

As g120 is not initialized, we see random writes to the depth buffer
due to the alphaToCoverage enablement. This commit fixes that by
keeping the output and creating a null render target for it.

Fixes tests:

dEQP-VK.pipeline.multisample.alpha_to_coverage_no_color_attachment.*

Signed-off-by: Samuel Iglesias Gonsálvez 
---
 src/intel/vulkan/anv_pipeline.c | 35 +
 1 file changed, 27 insertions(+), 8 deletions(-)

diff --git a/src/intel/vulkan/anv_pipeline.c b/src/intel/vulkan/anv_pipeline.c
index e2024212bd9..70a958bf3a8 100644
--- a/src/intel/vulkan/anv_pipeline.c
+++ b/src/intel/vulkan/anv_pipeline.c
@@ -792,7 +792,9 @@ anv_pipeline_compile_gs(const struct brw_compiler *compiler,
 
 static void
 anv_pipeline_link_fs(const struct brw_compiler *compiler,
- struct anv_pipeline_stage *stage)
+ struct anv_pipeline_stage *stage,
+ const struct anv_subpass *subpass,
+ const VkPipelineMultisampleStateCreateInfo *ms_info)
 {
unsigned num_rts = 0;
const int max_rt = FRAG_RESULT_DATA7 - FRAG_RESULT_DATA0 + 1;
@@ -843,12 +845,28 @@ anv_pipeline_link_fs(const struct brw_compiler *compiler,
   const unsigned rt = var->data.location - FRAG_RESULT_DATA0;
   if (rt >= MAX_RTS ||
   !(stage->key.wm.color_outputs_valid & (1 << rt))) {
- /* Unused or out-of-bounds, throw it away */
- deleted_output = true;
- var->data.mode = nir_var_function_temp;
- exec_node_remove(>node);
- exec_list_push_tail(>locals, >node);
- continue;
+ if (rt == 0 && ms_info && ms_info->alphaToCoverageEnable &&
+ subpass->depth_stencil_attachment && rt_to_bindings[rt] == -1) {
+/* Don't throw away the unused output, because we needed it for
+ * calculate correctly the alpha to coverage samples when there
+ * is no color buffer attached at location 0.
+ */
+rt_to_bindings[rt] = num_rts;
+/* We need a null render target */
+rt_bindings[rt_to_bindings[rt]] = (struct anv_pipeline_binding) {
+   .set = ANV_DESCRIPTOR_SET_COLOR_ATTACHMENTS,
+   .binding = 0,
+   .index = UINT32_MAX,
+};
+num_rts++;
+ } else {
+/* Unused or out-of-bounds, throw it away */
+deleted_output = true;
+var->data.mode = nir_var_function_temp;
+exec_node_remove(>node);
+exec_list_push_tail(>locals, >node);
+continue;
+ }
   }
 
   /* Give it the new location */
@@ -1075,7 +1093,8 @@ anv_pipeline_compile_graphics(struct anv_pipeline 
*pipeline,
  anv_pipeline_link_gs(compiler, [s], next_stage);
  break;
   case MESA_SHADER_FRAGMENT:
- anv_pipeline_link_fs(compiler, [s]);
+ anv_pipeline_link_fs(compiler, [s],
+  pipeline->subpass, info->pMultisampleState);
  break;
   default:
  unreachable("Invalid graphics shader stage");
-- 
2.19.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 109698] dri.pc contents invalid when built with meson

2019-02-21 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=109698

--- Comment #2 from Emil Velikov  ---
AFAICT there are two issues here.

1. the prefix wrongfully added to the custom variable
2. the custom variable can be a list of paths separated by colon, yet only the
first one ends up in dri.pc

Perhaps one process of prefixing (1) also broke the multiple paths (2)?

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v3] i965: fixed clamping in set_scissor_bits when the y is flipped

2019-02-21 Thread Eleni Maria Stea
Calculating the scissor rectangle fields with the y flipped (0 on top)
can generate negative values that will cause assertion failure later on
as the scissor fields are all unsigned. We must clamp the bbox values
again to make sure they don't exceed the fb_height. Also fixed a
calculation error.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108999

v2:
   - I initially clamped the values inside the if (Y is flipped) case
   and I made a mistake in the calculation: the clamp of the bbox[2] should
   be a check if (bbox[2] >= fbheight) bbox[2] = fbheight - 1 instead and I
   shouldn't have changed the ScissorRectangleYMax calculation. As the
   fixed code is equivalent with using CLAMP instead of MAX2 at the top of
   the function when bbox[2] and bbox[3] are calculated, and the 2nd is more
   clear, I replaced it. (Nanley Chery)

v3:
   - Reversed the CLAMP change in bbox[3] as the API guarantees that the
   viewport height is positive. (Nanley Chery)
---
 src/mesa/drivers/dri/i965/genX_state_upload.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c 
b/src/mesa/drivers/dri/i965/genX_state_upload.c
index dcdfb3c9292..47f3741e673 100644
--- a/src/mesa/drivers/dri/i965/genX_state_upload.c
+++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
@@ -2445,7 +2445,7 @@ set_scissor_bits(const struct gl_context *ctx, int i,
 
bbox[0] = MAX2(ctx->ViewportArray[i].X, 0);
bbox[1] = MIN2(bbox[0] + ctx->ViewportArray[i].Width, fb_width);
-   bbox[2] = MAX2(ctx->ViewportArray[i].Y, 0);
+   bbox[2] = CLAMP(ctx->ViewportArray[i].Y, 0, fb_height);
bbox[3] = MIN2(bbox[2] + ctx->ViewportArray[i].Height, fb_height);
_mesa_intersect_scissor_bounding_box(ctx, i, bbox);
 
-- 
2.20.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] glsl: Fix function return typechecking

2019-02-21 Thread Oscar Blumberg
Hi,

On Thu, Feb 21, 2019 at 10:35 AM Tapani Pälli 
wrote:

> Hi;
>
> On 2/11/19 6:46 PM, Oscar Blumberg wrote:
> > apply_implicit_conversion only converts and check base types but we
> > need actual type equality for function returns, otherwise you can
> > return a vec2 from a function declared as returning a float.
>
> Do you have some test shader that hits this condition? It seems to me
> that currently we will error out correctly if one tries to return vec2
> from function declared as returning a float.
>
>
Yes, I believe it triggers in even the simplest case here, such as :
float f() { return vec2(1.0); }
anywhere in the shader.
Does it fail to reproduce on your end ?
(note that the declared glsl version must be recent enough that implicit
conversion for function returns are enabled, I'm using 450 here).

I believe most of the confusion comes from the name of the
apply_implicit_conversion function, since it is mainly used for arithmetic
operations for which, e.g., vec2+float is allowed.
Because of that it only checks (and convert in place) base types without
looking at element count for vector-like things.
We can still use it to perform the conversion but it requires an additional
check, hence the patch.

That's my understanding of it anyway.

> ---
> >   src/compiler/glsl/ast_to_hir.cpp | 3 ++-
> >   1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/src/compiler/glsl/ast_to_hir.cpp
> b/src/compiler/glsl/ast_to_hir.cpp
> > index 620153e6a34..6bf2910954f 100644
> > --- a/src/compiler/glsl/ast_to_hir.cpp
> > +++ b/src/compiler/glsl/ast_to_hir.cpp
> > @@ -6248,7 +6248,8 @@ ast_jump_statement::hir(exec_list *instructions,
> >
> >   if (state->has_420pack()) {
> >  if
> (!apply_implicit_conversion(state->current_function->return_type,
> > -  ret, state)) {
> > +  ret, state)
> > +   || (ret->type !=
> state->current_function->return_type)) {
> > _mesa_glsl_error(& loc, state,
> >  "could not implicitly convert
> return value "
> >  "to %s, in function `%s'",
> >
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v2] anv: advertise 8 subpixel precision bits

2019-02-21 Thread Juan A. Suarez Romero
On one side, when emitting 3DSTATE_SF, VertexSubPixelPrecisionSelect is
used to select between 8 bit subpixel precision (value 0) or 4 bit
subpixel precision (value 1). As this value is not set, means it is
taking the value 0, so 8 bit are used.

On the other side, in the Vulkan CTS tests, if the reference rasterizer,
which uses 8 bit precision, as it is used to check what should be the
expected value for the tests, is changed to use 4 bit as ANV was
advertising so far, some of the tests will fail.

So it seems ANV is actually using 8 bits.

v2: explicitly set 3DSTATE_SF::VertexSubPixelPrecisionSelect (Jason)

CC: Jason Ekstrand 
CC: Kenneth Graunke 
---
 src/intel/vulkan/anv_device.c| 2 +-
 src/intel/vulkan/genX_pipeline.c | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index 3120865466a..95224407318 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -1095,7 +1095,7 @@ void anv_GetPhysicalDeviceProperties(
  16 * devinfo->max_cs_threads,
  16 * devinfo->max_cs_threads,
   },
-  .subPixelPrecisionBits= 4 /* FIXME */,
+  .subPixelPrecisionBits= 8,
   .subTexelPrecisionBits= 4 /* FIXME */,
   .mipmapPrecisionBits  = 4 /* FIXME */,
   .maxDrawIndexedIndexValue = UINT32_MAX,
diff --git a/src/intel/vulkan/genX_pipeline.c b/src/intel/vulkan/genX_pipeline.c
index 6255e5d83c5..b06036a6fc7 100644
--- a/src/intel/vulkan/genX_pipeline.c
+++ b/src/intel/vulkan/genX_pipeline.c
@@ -464,6 +464,7 @@ emit_rs_state(struct anv_pipeline *pipeline,
sf.TriangleStripListProvokingVertexSelect = 0;
sf.LineStripListProvokingVertexSelect = 0;
sf.TriangleFanProvokingVertexSelect = 1;
+   sf.VertexSubPixelPrecisionSelect = 0;
 
const struct brw_vue_prog_data *last_vue_prog_data =
   anv_pipeline_get_last_vue_prog_data(pipeline);
-- 
2.20.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] glsl: Fix function return typechecking

2019-02-21 Thread Tapani Pälli



On 2/21/19 11:35 AM, Tapani Pälli wrote:

Hi;

On 2/11/19 6:46 PM, Oscar Blumberg wrote:

apply_implicit_conversion only converts and check base types but we
need actual type equality for function returns, otherwise you can
return a vec2 from a function declared as returning a float.


Do you have some test shader that hits this condition? It seems to me 
that currently we will error out correctly if one tries to return vec2 
from function declared as returning a float.


Ah forget that, I can see it can happen with ARB_shading_language_420pack!

Reviewed-by: Tapani Pälli 


---
  src/compiler/glsl/ast_to_hir.cpp | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/compiler/glsl/ast_to_hir.cpp 
b/src/compiler/glsl/ast_to_hir.cpp

index 620153e6a34..6bf2910954f 100644
--- a/src/compiler/glsl/ast_to_hir.cpp
+++ b/src/compiler/glsl/ast_to_hir.cpp
@@ -6248,7 +6248,8 @@ ast_jump_statement::hir(exec_list *instructions,
  if (state->has_420pack()) {
 if 
(!apply_implicit_conversion(state->current_function->return_type,

-  ret, state)) {
+  ret, state)
+   || (ret->type != 
state->current_function->return_type)) {

    _mesa_glsl_error(& loc, state,
 "could not implicitly convert 
return value "

 "to %s, in function `%s'",


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 109711] [DXVK] Skyrim Special edition hangs

2019-02-21 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=109711

--- Comment #1 from Bas Nieuwenhuizen  ---
So is this an actual hang of the GPU or just a freeze for the game? (i.e. does
it need a reboot to do anything again or can you just close the game?) Reading
the DXVK bug I'm wondering which of the two it is.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] glsl: Fix function return typechecking

2019-02-21 Thread Tapani Pälli

Hi;

On 2/11/19 6:46 PM, Oscar Blumberg wrote:

apply_implicit_conversion only converts and check base types but we
need actual type equality for function returns, otherwise you can
return a vec2 from a function declared as returning a float.


Do you have some test shader that hits this condition? It seems to me 
that currently we will error out correctly if one tries to return vec2 
from function declared as returning a float.



---
  src/compiler/glsl/ast_to_hir.cpp | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/compiler/glsl/ast_to_hir.cpp b/src/compiler/glsl/ast_to_hir.cpp
index 620153e6a34..6bf2910954f 100644
--- a/src/compiler/glsl/ast_to_hir.cpp
+++ b/src/compiler/glsl/ast_to_hir.cpp
@@ -6248,7 +6248,8 @@ ast_jump_statement::hir(exec_list *instructions,
  
  if (state->has_420pack()) {

 if 
(!apply_implicit_conversion(state->current_function->return_type,
-  ret, state)) {
+  ret, state)
+   || (ret->type != state->current_function->return_type)) {
_mesa_glsl_error(& loc, state,
 "could not implicitly convert return value 
"
 "to %s, in function `%s'",


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 109711] [DXVK] Skyrim Special edition hangs

2019-02-21 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=109711

Bug ID: 109711
   Summary: [DXVK] Skyrim Special edition hangs
   Product: Mesa
   Version: git
  Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: NEW
  Severity: normal
  Priority: medium
 Component: Drivers/Vulkan/radeon
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: root@scoopta.ninja
QA Contact: mesa-dev@lists.freedesktop.org

Skyrim special edition hangs for me on the main menu with a Radeon R9 Fury
using RADV. The corresponding DXVK issue can be found here
https://github.com/doitsujin/dxvk/issues/927 The issue occurs using Mesa
18.3.3(LLVM 7.0.1), Mesa 19.0.0~rc4(LLVM 8), and Mesa 19.0.0~rc4(LLVM 9-svn).
The build of 18.3.3(LLVM 7.0.1) came from the debian sid repositories, the
19.0.0~rc4(LLVM 8) was built by me linking against debian sid's libllvm8 and
then 19.0.0~rc4(LLVM 9-svn) was built entirely from scratch.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 109698] dri.pc contents invalid when built with meson

2019-02-21 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=109698

--- Comment #1 from Sergii Romantsov  ---
Hello, Jeremy.
Looks like during meson configuration you are also using --prefix=/usr
So if at this moment as workaround option dri-drivers-path will exclude a
'prefix' from path it should work, like:
-Ddri-drivers-path=/lib/$(DEB_HOST_MULTIARCH)/dri

Also patch to mesa proposed:
https://gitlab.freedesktop.org/mesa/mesa/merge_requests/285
Could you, please, try if it fixes issue?

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev