Re: [Mesa-dev] [MR] WIP: VK_KHR_shader_float_controls implementation on ANV
It is still unreviewed. Sam On 2/8/19 8:17 AM, Samuel Iglesias Gonsálvez wrote: > First three versions of this branch were sent for review to the mailing > list. In order to avoid flooding the list with more emails, I create > this MR to continue the review process there. > > Reminder: this patch series relies on VK_KHR_shader_float16_int8 patch > series which is currently under review. For that reason, I mark this > Merge Request as Work In Progress. > > https://gitlab.freedesktop.org/mesa/mesa/merge_requests/223 > > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev > signature.asc Description: OpenPGP digital signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 109711] [DXVK] Skyrim Special edition hangs
https://bugs.freedesktop.org/show_bug.cgi?id=109711 --- Comment #3 from root@scoopta.ninja --- It turns out this issue only occurs under XWayland and NOT under X.Org however I'm not sure if it's actually a mesa bug with XWayland or an XWayland bug directly. -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug.___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 2/2] isl: the display engine requires 64B alignment for linear surfaces
Lionel, are you going to push it with this quote? I can add it otherwise. Sam On Thu, 2019-02-21 at 13:41 +, Lionel Landwerlin wrote: > On 21/02/2019 13:30, Chris Wilson wrote: > > Quoting Lionel Landwerlin (2019-02-21 12:57:09) > > > I did not find the PRM bit that says it must be 64b aligned, but > > > I can > > > see that's what i915 checks. > > > > > > Chris: If you have a pointer to it, I could add the quote. > > In amongst the register specs, > > PLANE_STRIDE: > > For Linear memory, this field specifies the stride in chunks of 64 > > bytes (1 cache line). > > -Chris > > > Thanks a lot! > > signature.asc Description: This is a digitally signed message part ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] android: intel/isl: remove redundant building rules
OK now I understand, this hunk got added accidentially twice from 2 separate commits! Reviewed-by: Tapani Pälli On 2/22/19 7:50 AM, Tapani Pälli wrote: Hi; On 2/22/19 1:30 AM, Mauro Rossi wrote: Fixes the following building error: including ./external/mesa/Android.mk ... build/core/base_rules.mk:183: *** external/mesa/src/intel: MODULE.TARGET.STATIC_LIBRARIES.libmesa_isl_tiled_memcpy already defined by external/mesa/src/intel. make: *** [build/core/ninja.mk:164: out/build-android_x86_64.ninja] Error 1 This is strange, craftyguy also reported same error .. I haven't seen this though. ISL_TILED_MEMCPY_FILES is isl/isl_tiled_memcpy_normal.c and that source file includes isl_tiled_memcpy.c source Yes, that is intentional. We want to compile 2 versions of the library, 'normal' one and 'sse41' one. Fixes: 96bb328 ("iris: add Android build") Signed-off-by: Mauro Rossi --- src/intel/Android.isl.mk | 13 - 1 file changed, 13 deletions(-) diff --git a/src/intel/Android.isl.mk b/src/intel/Android.isl.mk index 28944875e0..07a64b8ed1 100644 --- a/src/intel/Android.isl.mk +++ b/src/intel/Android.isl.mk @@ -198,19 +198,6 @@ LOCAL_WHOLE_STATIC_LIBRARIES := libmesa_genxml include $(MESA_COMMON_MK) include $(BUILD_STATIC_LIBRARY) - -# --- -# Build libmesa_isl_tiled_memcpy -# --- -include $(CLEAR_VARS) - -LOCAL_MODULE := libmesa_isl_tiled_memcpy - -LOCAL_SRC_FILES := isl/isl_tiled_memcpy.c - -include $(MESA_COMMON_MK) -include $(BUILD_STATIC_LIBRARY) - # --- # Build libmesa_isl_tiled_memcpy # --- ^^^ Do you have libmesa_isl_tiled_memcpy 2 times there? This does not match what Mesa has ATM. // Tapani ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] android: intel/isl: remove redundant building rules
Hi; On 2/22/19 1:30 AM, Mauro Rossi wrote: Fixes the following building error: including ./external/mesa/Android.mk ... build/core/base_rules.mk:183: *** external/mesa/src/intel: MODULE.TARGET.STATIC_LIBRARIES.libmesa_isl_tiled_memcpy already defined by external/mesa/src/intel. make: *** [build/core/ninja.mk:164: out/build-android_x86_64.ninja] Error 1 This is strange, craftyguy also reported same error .. I haven't seen this though. ISL_TILED_MEMCPY_FILES is isl/isl_tiled_memcpy_normal.c and that source file includes isl_tiled_memcpy.c source Yes, that is intentional. We want to compile 2 versions of the library, 'normal' one and 'sse41' one. Fixes: 96bb328 ("iris: add Android build") Signed-off-by: Mauro Rossi --- src/intel/Android.isl.mk | 13 - 1 file changed, 13 deletions(-) diff --git a/src/intel/Android.isl.mk b/src/intel/Android.isl.mk index 28944875e0..07a64b8ed1 100644 --- a/src/intel/Android.isl.mk +++ b/src/intel/Android.isl.mk @@ -198,19 +198,6 @@ LOCAL_WHOLE_STATIC_LIBRARIES := libmesa_genxml include $(MESA_COMMON_MK) include $(BUILD_STATIC_LIBRARY) - -# --- -# Build libmesa_isl_tiled_memcpy -# --- -include $(CLEAR_VARS) - -LOCAL_MODULE := libmesa_isl_tiled_memcpy - -LOCAL_SRC_FILES := isl/isl_tiled_memcpy.c - -include $(MESA_COMMON_MK) -include $(BUILD_STATIC_LIBRARY) - # --- # Build libmesa_isl_tiled_memcpy # --- ^^^ Do you have libmesa_isl_tiled_memcpy 2 times there? This does not match what Mesa has ATM. // Tapani ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 109735] [Regression] broken font with mesa_vulkan_overlay
https://bugs.freedesktop.org/show_bug.cgi?id=109735 Shmerl changed: What|Removed |Added CC||shtetl...@gmail.com -- You are receiving this mail because: You are the assignee for the bug.___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 109735] [Regression] broken font with mesa_vulkan_overlay
https://bugs.freedesktop.org/show_bug.cgi?id=109735 --- Comment #3 from Shmerl --- Yep, I can confirm it renders OK with radv 18.3.2 / llvm 7.0.1, but font is broken with radv in Mesa master / llvm 9.0 -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug.___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 109739] Mesa build fails when vulkan-overlay-layer option is enabled
https://bugs.freedesktop.org/show_bug.cgi?id=109739 --- Comment #1 from Shmerl --- Another problem was that ninja install doesn't deploy actual layer .so file. I did this in my build script: DESTDIR="$dest_dir" LC_ALL=C.UTF-8 ninja install It deployed VkLayer_MESA_overlay.json, but missed libVkLayer_MESA_overlay.so. -- You are receiving this mail because: You are the assignee for the bug.___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 109739] Mesa build fails when vulkan-overlay-layer option is enabled
https://bugs.freedesktop.org/show_bug.cgi?id=109739 Bug ID: 109739 Summary: Mesa build fails when vulkan-overlay-layer option is enabled Product: Mesa Version: git Hardware: x86-64 (AMD64) OS: Linux (All) Status: NEW Severity: normal Priority: medium Component: Drivers/Vulkan/Common Assignee: mesa-dev@lists.freedesktop.org Reporter: shtetl...@gmail.com CC: airl...@freedesktop.org, chadvers...@chromium.org, dan...@fooishbar.org, ja...@jlekstrand.net I tried building Mesa master on Debian testing, enabling vulkan-overlay-layer, and it failed because of mismatching includes. I installed glslang-tools vulkan-validationlayers-dev and enabled the option: -Dvulkan-overlay-layer=true I had to modify includes like this, to make it build: diff --git a/src/vulkan/overlay-layer/overlay.cpp b/src/vulkan/overlay-layer/overlay.cpp index f3678198b00..4b8010ba003 100644 --- a/src/vulkan/overlay-layer/overlay.cpp +++ b/src/vulkan/overlay-layer/overlay.cpp @@ -25,13 +25,13 @@ #include #include -#include +#include #include -#include +#include #include -#include "vk_layer_data.h" +#include "vulkan/vk_layer_data.h" #include "vk_layer_table.h" -#include "vk_layer_extension_utils.h" +#include "vulkan/vk_layer_extension_utils.h" #include "imgui.h" diff --git a/src/vulkan/overlay-layer/vk_layer_table.cpp b/src/vulkan/overlay-layer/vk_layer_table.cpp index 4a033b9add6..f5865fa3a6f 100644 --- a/src/vulkan/overlay-layer/vk_layer_table.cpp +++ b/src/vulkan/overlay-layer/vk_layer_table.cpp @@ -19,7 +19,7 @@ */ #include #include -#include "vk_dispatch_table_helper.h" +#include "vulkan/vk_dispatch_table_helper.h" #include "vulkan/vk_layer.h" #include "vk_layer_table.h" static device_table_map tableMap; -- You are receiving this mail because: You are the assignee for the bug.___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] radeonsi: fix query buffer allocation
Fix the logic for buffer full check on alloc. This patch just takes the fix Nicolai attached to the bug report and updates it to work on master. Fixes: e0f0d3675d4 ("radeonsi: factor si_query_buffer logic out of si_query_hw") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109561 --- src/gallium/drivers/radeonsi/si_query.c | 42 + src/gallium/drivers/radeonsi/si_query.h | 5 +-- 2 files changed, 25 insertions(+), 22 deletions(-) diff --git a/src/gallium/drivers/radeonsi/si_query.c b/src/gallium/drivers/radeonsi/si_query.c index 266b9d3ce84..04685b8bb3f 100644 --- a/src/gallium/drivers/radeonsi/si_query.c +++ b/src/gallium/drivers/radeonsi/si_query.c @@ -561,29 +561,31 @@ bool si_query_buffer_alloc(struct si_context *sctx, struct si_query_buffer *buff bool (*prepare_buffer)(struct si_context *, struct si_query_buffer*), unsigned size) { - if (buffer->buf && buffer->results_end + size >= buffer->buf->b.b.width0) - return true; + bool unprepared = buffer->unprepared; + buffer->unprepared = false; + + if (!buffer->buf || buffer->results_end + size > buffer->buf->b.b.width0) { + if (buffer->buf) { + struct si_query_buffer *qbuf = MALLOC_STRUCT(si_query_buffer); + memcpy(qbuf, buffer, sizeof(*qbuf)); + buffer->previous = qbuf; + } + buffer->results_end = 0; - if (buffer->buf) { - struct si_query_buffer *qbuf = MALLOC_STRUCT(si_query_buffer); - memcpy(qbuf, buffer, sizeof(*qbuf)); - buffer->previous = qbuf; + /* Queries are normally read by the CPU after +* being written by the gpu, hence staging is probably a good +* usage pattern. +*/ + struct si_screen *screen = sctx->screen; + unsigned buf_size = MAX2(size, screen->info.min_alloc_size); + buffer->buf = si_resource( + pipe_buffer_create(>b, 0, PIPE_USAGE_STAGING, buf_size)); + if (unlikely(!buffer->buf)) + return false; + unprepared = true; } - buffer->results_end = 0; - - /* Queries are normally read by the CPU after -* being written by the gpu, hence staging is probably a good -* usage pattern. -*/ - struct si_screen *screen = sctx->screen; - unsigned buf_size = MAX2(size, screen->info.min_alloc_size); - buffer->buf = si_resource( - pipe_buffer_create(>b, 0, PIPE_USAGE_STAGING, buf_size)); - if (unlikely(!buffer->buf)) - return false; - - if (prepare_buffer) { + if (unprepared && prepare_buffer) { if (unlikely(!prepare_buffer(sctx, buffer))) { si_resource_reference(>buf, NULL); return false; diff --git a/src/gallium/drivers/radeonsi/si_query.h b/src/gallium/drivers/radeonsi/si_query.h index aaf0bd03aca..c61af51d57c 100644 --- a/src/gallium/drivers/radeonsi/si_query.h +++ b/src/gallium/drivers/radeonsi/si_query.h @@ -177,12 +177,13 @@ struct si_query_hw_ops { struct si_query_buffer { /* The buffer where query results are stored. */ struct si_resource *buf; - /* Offset of the next free result after current query data */ - unsignedresults_end; /* If a query buffer is full, a new buffer is created and the old one * is put in here. When we calculate the result, we sum up the samples * from all buffers. */ struct si_query_buffer *previous; + /* Offset of the next free result after current query data */ + unsignedresults_end; + bool unprepared; }; void si_query_buffer_destroy(struct si_screen *sctx, struct si_query_buffer *buffer); -- 2.20.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 109735] [Regression] broken font with mesa_vulkan_overlay
https://bugs.freedesktop.org/show_bug.cgi?id=109735 --- Comment #2 from osch...@web.de --- Created attachment 143436 --> https://bugs.freedesktop.org/attachment.cgi?id=143436=edit bisect log -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug.___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 109735] [Regression] broken font with mesa_vulkan_overlay
https://bugs.freedesktop.org/show_bug.cgi?id=109735 --- Comment #1 from osch...@web.de --- Created attachment 143435 --> https://bugs.freedesktop.org/attachment.cgi?id=143435=edit screenshot -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug.___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 109735] [Regression] broken font with mesa_vulkan_overlay
https://bugs.freedesktop.org/show_bug.cgi?id=109735 Bug ID: 109735 Summary: [Regression] broken font with mesa_vulkan_overlay Product: Mesa Version: git Hardware: Other OS: All Status: NEW Severity: normal Priority: medium Component: Drivers/Vulkan/radeon Assignee: mesa-dev@lists.freedesktop.org Reporter: osch...@web.de QA Contact: mesa-dev@lists.freedesktop.org The mesa_vulkan_overlay renders broken font with recent radv-git, 18.3.3 works fine (both build against LLVM 7.0.1). GPU is a RX580. Bisected to: commit da2959463616756e89cece39f2bae1d437df4bbd Author: Jason Ekstrand Date: Wed Jan 16 11:52:03 2019 -0600 spirv: Only split blocks Instead of splitting every per-vertex struct, just split the ones that are actually blocks. The reason for the split is so that we have separate variables for separate locations, qualifiers, and builtin decorations. The vulkan spec only allows these on members of blocks. Reviewed-by: Alejandro Piñeiro -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug.___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] panfrost: Use tiler fast path (performance boost)
On Thu, Feb 21, 2019 at 5:11 PM Connor Abbott wrote: > > On Thu, Feb 21, 2019 at 5:07 PM Alyssa Rosenzweig > wrote: >> >> > Yes, there is a buffer for holding the results of the tiler. The way it >> > works is that the userspace driver allocates a very large buffer with a >> > "grow on page fault" bit, so the kernel will allocate more memory as the >> > tiler asks for it. >> >> To be clear, right now I have this magic misc_0 buffer setup that way, >> too (allocating something enormous and setting GROW_ON_GPF). Although, >> it's not clear to me that that is the _correct_ thing to do; IIRC, the >> blob allocates "just enough", no GROW_ON_GPF needed, but I should >> probably check that. > > > I don't know about Midgard, but on Bifrost it definitely uses GROW_ON_GPF, > telling the kernel to preallocate some small-ish initial set, presumably as > an optimization. I don't think you can accurately predict how much memory > you're going to need ahead of time, since it depends on how the triangles are > distributed along the screen. the GROW_ON_GPF thing is cute.. not as badly needed on adreno since we don't actually store pos/psize/varyings from binning pass, only a compressed visibility stream.. but if only we had proper stall-on-fault support from iommu driver, we could use a similar trick to dynamically grow the VSC buffers.. BR, -R ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] android: intel/isl: remove redundant building rules
Fixes the following building error: including ./external/mesa/Android.mk ... build/core/base_rules.mk:183: *** external/mesa/src/intel: MODULE.TARGET.STATIC_LIBRARIES.libmesa_isl_tiled_memcpy already defined by external/mesa/src/intel. make: *** [build/core/ninja.mk:164: out/build-android_x86_64.ninja] Error 1 ISL_TILED_MEMCPY_FILES is isl/isl_tiled_memcpy_normal.c and that source file includes isl_tiled_memcpy.c source Fixes: 96bb328 ("iris: add Android build") Signed-off-by: Mauro Rossi --- src/intel/Android.isl.mk | 13 - 1 file changed, 13 deletions(-) diff --git a/src/intel/Android.isl.mk b/src/intel/Android.isl.mk index 28944875e0..07a64b8ed1 100644 --- a/src/intel/Android.isl.mk +++ b/src/intel/Android.isl.mk @@ -198,19 +198,6 @@ LOCAL_WHOLE_STATIC_LIBRARIES := libmesa_genxml include $(MESA_COMMON_MK) include $(BUILD_STATIC_LIBRARY) - -# --- -# Build libmesa_isl_tiled_memcpy -# --- -include $(CLEAR_VARS) - -LOCAL_MODULE := libmesa_isl_tiled_memcpy - -LOCAL_SRC_FILES := isl/isl_tiled_memcpy.c - -include $(MESA_COMMON_MK) -include $(BUILD_STATIC_LIBRARY) - # --- # Build libmesa_isl_tiled_memcpy # --- -- 2.19.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 109535] [Tracker] Mesa 19.0 release tracker
https://bugs.freedesktop.org/show_bug.cgi?id=109535 Bug 109535 depends on bug 109328, which changed state. Bug 109328 Summary: [BSW BXT GLK] dEQP-VK.subgroups.arithmetic.subgroup regressions https://bugs.freedesktop.org/show_bug.cgi?id=109328 What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED -- You are receiving this mail because: You are the assignee for the bug. You are the QA Contact for the bug.___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v3] i965: fixed clamping in set_scissor_bits when the y is flipped
On Thu, Feb 21, 2019 at 12:02:58PM +0200, Eleni Maria Stea wrote: > Calculating the scissor rectangle fields with the y flipped (0 on top) > can generate negative values that will cause assertion failure later on > as the scissor fields are all unsigned. We must clamp the bbox values > again to make sure they don't exceed the fb_height. Also fixed a > calculation error. > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108999 > I guess we'll want to add the other bug and Cc stable. > v2: >- I initially clamped the values inside the if (Y is flipped) case >and I made a mistake in the calculation: the clamp of the bbox[2] should >be a check if (bbox[2] >= fbheight) bbox[2] = fbheight - 1 instead and I >shouldn't have changed the ScissorRectangleYMax calculation. As the >fixed code is equivalent with using CLAMP instead of MAX2 at the top of >the function when bbox[2] and bbox[3] are calculated, and the 2nd is more >clear, I replaced it. (Nanley Chery) > > v3: >- Reversed the CLAMP change in bbox[3] as the API guarantees that the >viewport height is positive. (Nanley Chery) > --- > src/mesa/drivers/dri/i965/genX_state_upload.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > This patch is Reviewed-by: Nanley Chery > diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c > b/src/mesa/drivers/dri/i965/genX_state_upload.c > index dcdfb3c9292..47f3741e673 100644 > --- a/src/mesa/drivers/dri/i965/genX_state_upload.c > +++ b/src/mesa/drivers/dri/i965/genX_state_upload.c > @@ -2445,7 +2445,7 @@ set_scissor_bits(const struct gl_context *ctx, int i, > > bbox[0] = MAX2(ctx->ViewportArray[i].X, 0); > bbox[1] = MIN2(bbox[0] + ctx->ViewportArray[i].Width, fb_width); > - bbox[2] = MAX2(ctx->ViewportArray[i].Y, 0); > + bbox[2] = CLAMP(ctx->ViewportArray[i].Y, 0, fb_height); > bbox[3] = MIN2(bbox[2] + ctx->ViewportArray[i].Height, fb_height); > _mesa_intersect_scissor_bounding_box(ctx, i, bbox); > > -- > 2.20.1 > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] panfrost: Use tiler fast path (performance boost)
> I don't know about Midgard, but on Bifrost it definitely uses GROW_ON_GPF, > telling the kernel to preallocate some small-ish initial set, presumably as > an optimization. I don't think you can accurately predict how much memory > you're going to need ahead of time, since it depends on how the triangles > are distributed along the screen. Good to know, thank you :) signature.asc Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] panfrost: Use tiler fast path (performance boost)
On Thu, Feb 21, 2019 at 5:07 PM Alyssa Rosenzweig wrote: > > Yes, there is a buffer for holding the results of the tiler. The way it > > works is that the userspace driver allocates a very large buffer with a > > "grow on page fault" bit, so the kernel will allocate more memory as the > > tiler asks for it. > > To be clear, right now I have this magic misc_0 buffer setup that way, > too (allocating something enormous and setting GROW_ON_GPF). Although, > it's not clear to me that that is the _correct_ thing to do; IIRC, the > blob allocates "just enough", no GROW_ON_GPF needed, but I should > probably check that. > I don't know about Midgard, but on Bifrost it definitely uses GROW_ON_GPF, telling the kernel to preallocate some small-ish initial set, presumably as an optimization. I don't think you can accurately predict how much memory you're going to need ahead of time, since it depends on how the triangles are distributed along the screen. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 109711] [DXVK] Skyrim Special edition hangs
https://bugs.freedesktop.org/show_bug.cgi?id=109711 --- Comment #2 from root@scoopta.ninja --- Yeah, apparently I didn't make that very clear. It is a GAME hang. As in I have to forcibly kill the game from htop but it is not a full system hang. -- You are receiving this mail because: You are the assignee for the bug. You are the QA Contact for the bug.___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] nir/lower_clip_cull: Fix an incorrect assert
On Thursday, February 21, 2019 8:55:31 AM PST Jason Ekstrand wrote: > Copy+paste error. It was supposed to test cull and not clip. > > Fixes: 4e69fba534e "nir: Rewrite lower_clip_cull_distance_arrays..." > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109717 > Cc: Kenneth Graunke > --- > src/compiler/nir/nir_lower_clip_cull_distance_arrays.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/src/compiler/nir/nir_lower_clip_cull_distance_arrays.c > b/src/compiler/nir/nir_lower_clip_cull_distance_arrays.c > index d98ffa69596..70578d6f3fd 100644 > --- a/src/compiler/nir/nir_lower_clip_cull_distance_arrays.c > +++ b/src/compiler/nir/nir_lower_clip_cull_distance_arrays.c > @@ -101,7 +101,7 @@ combine_clip_cull(nir_shader *nir, > } > > if (cull) { > - assert(clip->data.compact); > + assert(cull->data.compact); >cull->data.how_declared = nir_var_hidden; >cull->data.location = VARYING_SLOT_CLIP_DIST0 + clip_array_size / 4; >cull->data.location_frac = clip_array_size % 4; > Whoops. Reviewed-by: Kenneth Graunke signature.asc Description: This is a digitally signed message part. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] intel: limit urb size for SKL/KBL/CFL GT1
What happened to this patch? We carry it downstream for the deqp fix. Can we get it into 19.0? Thanks, Kristian On Wed, Aug 29, 2018 at 5:39 PM Lionel Landwerlin wrote: > > The documentation puts the URB size for SKL GT1 as "128K - 192K". I > guess this means we can't tell which one it is, so we have to go for > the lower bound. This change also changes the max VS URB entries which > is lower on GT1 skus. > > Fixes a CTS test : > > dEQP-GLES31.functional.geometry_shading.layered.render_with_default_layer_3d > > Signed-off-by: Lionel Landwerlin > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107505 > --- > src/intel/dev/gen_device_info.c | 7 ++- > 1 file changed, 6 insertions(+), 1 deletion(-) > > diff --git a/src/intel/dev/gen_device_info.c b/src/intel/dev/gen_device_info.c > index b0ae4d18034..ed1e73efa61 100644 > --- a/src/intel/dev/gen_device_info.c > +++ b/src/intel/dev/gen_device_info.c > @@ -617,7 +617,8 @@ static const struct gen_device_info > gen_device_info_skl_gt1 = { > .num_subslices = { 2, }, > .num_eu_per_subslice = 6, > .l3_banks = 2, > - .urb.size = 192, > + .urb.size = 128, > + .urb.max_entries[MESA_SHADER_VERTEX] = 928, > .simulator_id = 12, > }; > > @@ -689,6 +690,8 @@ static const struct gen_device_info > gen_device_info_kbl_gt1 = { > .num_subslices = { 2, }, > .num_eu_per_subslice = 6, > .l3_banks = 2, > + .urb.size = 128, > + .urb.max_entries[MESA_SHADER_VERTEX] = 928, > .simulator_id = 16, > }; > > @@ -775,6 +778,8 @@ static const struct gen_device_info > gen_device_info_cfl_gt1 = { > .num_subslices = { 2, }, > .num_eu_per_subslice = 6, > .l3_banks = 2, > + .urb.size = 128, > + .urb.max_entries[MESA_SHADER_VERTEX] = 928, > .simulator_id = 24, > }; > static const struct gen_device_info gen_device_info_cfl_gt2 = { > -- > 2.18.0 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] nir/lower_clip_cull: Fix an incorrect assert
On 21/02/2019 16:55, Jason Ekstrand wrote: Copy+paste error. It was supposed to test cull and not clip. Fixes: 4e69fba534e "nir: Rewrite lower_clip_cull_distance_arrays..." Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109717 Cc: Kenneth Graunke Trivial, I really thought it was harder than that :) Thanks! Reviewed-by: Lionel Landwerlin --- src/compiler/nir/nir_lower_clip_cull_distance_arrays.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/compiler/nir/nir_lower_clip_cull_distance_arrays.c b/src/compiler/nir/nir_lower_clip_cull_distance_arrays.c index d98ffa69596..70578d6f3fd 100644 --- a/src/compiler/nir/nir_lower_clip_cull_distance_arrays.c +++ b/src/compiler/nir/nir_lower_clip_cull_distance_arrays.c @@ -101,7 +101,7 @@ combine_clip_cull(nir_shader *nir, } if (cull) { - assert(clip->data.compact); + assert(cull->data.compact); cull->data.how_declared = nir_var_hidden; cull->data.location = VARYING_SLOT_CLIP_DIST0 + clip_array_size / 4; cull->data.location_frac = clip_array_size % 4; ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] nir/lower_clip_cull: Fix an incorrect assert
Copy+paste error. It was supposed to test cull and not clip. Fixes: 4e69fba534e "nir: Rewrite lower_clip_cull_distance_arrays..." Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109717 Cc: Kenneth Graunke --- src/compiler/nir/nir_lower_clip_cull_distance_arrays.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/compiler/nir/nir_lower_clip_cull_distance_arrays.c b/src/compiler/nir/nir_lower_clip_cull_distance_arrays.c index d98ffa69596..70578d6f3fd 100644 --- a/src/compiler/nir/nir_lower_clip_cull_distance_arrays.c +++ b/src/compiler/nir/nir_lower_clip_cull_distance_arrays.c @@ -101,7 +101,7 @@ combine_clip_cull(nir_shader *nir, } if (cull) { - assert(clip->data.compact); + assert(cull->data.compact); cull->data.how_declared = nir_var_hidden; cull->data.location = VARYING_SLOT_CLIP_DIST0 + clip_array_size / 4; cull->data.location_frac = clip_array_size % 4; -- 2.20.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 3/5] intel/fs: Cap dst-aligned region stride to maximum representable hstride value.
On Thu, Feb 14, 2019 at 4:53 PM Francisco Jerez wrote: > Jason Ekstrand writes: > > > On Fri, Jan 18, 2019 at 6:09 PM Francisco Jerez > > wrote: > > > >> This is required in combination with the following commit, because > >> otherwise if a source region with an extended 8+ stride is present in > >> the instruction (which we're about to declare legal) we'll end up > >> emitting code that attempts to write to such a region, even though > >> strides greater than four are still illegal for the destination. > >> --- > >> src/intel/compiler/brw_fs_lower_regioning.cpp | 20 ++- > >> 1 file changed, 15 insertions(+), 5 deletions(-) > >> > >> diff --git a/src/intel/compiler/brw_fs_lower_regioning.cpp > >> b/src/intel/compiler/brw_fs_lower_regioning.cpp > >> index 6a3c23892b4..b86e95ed9eb 100644 > >> --- a/src/intel/compiler/brw_fs_lower_regioning.cpp > >> +++ b/src/intel/compiler/brw_fs_lower_regioning.cpp > >> @@ -71,15 +71,25 @@ namespace { > >>!is_byte_raw_mov(inst)) { > >> return get_exec_type_size(inst); > >>} else { > >> - unsigned stride = inst->dst.stride * type_sz(inst->dst.type); > >> + /* Calculate the maximum byte stride and the minimum type size > >> across > >> + * all source and destination operands. > >> + */ > >> + unsigned max_stride = inst->dst.stride * > type_sz(inst->dst.type); > >> + unsigned min_size = type_sz(inst->dst.type); > >> > >> for (unsigned i = 0; i < inst->sources; i++) { > >> -if (!is_uniform(inst->src[i]) && > !inst->is_control_source(i)) > >> - stride = MAX2(stride, inst->src[i].stride * > >> - type_sz(inst->src[i].type)); > >> +if (!is_uniform(inst->src[i]) && > !inst->is_control_source(i)) > >> { > >> + max_stride = MAX2(max_stride, inst->src[i].stride * > >> + type_sz(inst->src[i].type)); > >> + min_size = MIN2(min_size, type_sz(inst->src[i].type)); > >> +} > >> } > >> > >> - return stride; > >> + /* Attempt to use the largest byte stride among all present > >> operands, > >> + * but never exceed a stride of 4 since that would lead to > >> illegal > >> + * destination regions during lowering. > >> + */ > >> + return MIN2(max_stride, 4 * min_size); > >> > > > > Why not just fall back to tightly packed in this case? I think I can > > answer my own question: Because using something that's equal to one of > the > > strides reduces the liklihood that we'll need a temporary. If that's the > > correct answer, then maybe what we want is the maximum of all strides > with > > stride_in_bytes <= 4 * type_sz? > > > > We also want the result to be greater than or equal to the size of the > largest non-uniform, non-control source type, since packing a vector of > such a type into a temporary of lower byte stride than its size is > impossible. This patch guarantees that as long as max_size <= 4 * > min_size, which is necessary for the lowering code that calls this > function to work at all. > > It would be possible to preserve this guarantee while attempting to pick > one of the strides of the pre-existing sources as you say -- I would be > happy to review that change as a follow-up micro-optimization patch, but > there are some corner cases to consider I don't necessarily want to > bother with in the patch doing the functional change, for the sake of > bisectability. > That's fine with me. Reviewed-by: Jason Ekstrand > It may make sense to add an assert here that max_size <= 4 * min_size > for the case such an instruction doesn't blow up already at the EU > validator (it doesn't look like the validator is currently enforcing the > lack of conversions between 8 and 64 bit types?), it will just involve > calculating max_size in addition to max_stride and min_size above. > > > --Jason > > > > > >>} > >> } > >> > >> -- > >> 2.19.2 > >> > >> ___ > >> mesa-dev mailing list > >> mesa-dev@lists.freedesktop.org > >> https://lists.freedesktop.org/mailman/listinfo/mesa-dev > >> > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/5] intel/fs: Exclude control sources from execution type and region alignment calculations.
On Thu, Feb 14, 2019 at 9:33 PM Francisco Jerez wrote: > Jason Ekstrand writes: > > > On Fri, Jan 18, 2019 at 6:09 PM Francisco Jerez > > wrote: > > > >> Currently the execution type calculation will return a bogus value in > >> cases like: > >> > >> mov_indirect(8) vgrf0:w, vgrf1:w, vgrf2:ud, 32u > >> > >> Which will be considered to have a 32-bit integer execution type even > >> though the actual indirect move operation will be carried out with > >> 16-bit precision. > >> > >> Similarly there's no need to apply the CHV/BXT double-precision region > >> alignment restrictions to such control sources, since they aren't > >> directly involved in the double-precision arithmetic operations > >> emitted by these virtual instructions. Applying the CHV/BXT > >> restrictions to control sources was expected to be harmless if mildly > >> inefficient, but unfortunately it exposed problems at codegen level > >> for virtual instructions (namely the SHUFFLE instruction used for the > >> Vulkan 1.1 subgroup feature) that weren't prepared to accept control > >> sources with an arbitrary strided region. > >> > >> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109328 > >> Reported-by: Mark Janes > >> Fixes: efa4e4bc5fc "intel/fs: Introduce regioning lowering pass." > >> --- > >> src/intel/compiler/brw_fs.cpp | 54 +++ > >> src/intel/compiler/brw_fs_lower_regioning.cpp | 6 +-- > >> src/intel/compiler/brw_ir_fs.h| 10 +++- > >> 3 files changed, 66 insertions(+), 4 deletions(-) > >> > >> diff --git a/src/intel/compiler/brw_fs.cpp > b/src/intel/compiler/brw_fs.cpp > >> index 0359eb079f7..f475b617df2 100644 > >> --- a/src/intel/compiler/brw_fs.cpp > >> +++ b/src/intel/compiler/brw_fs.cpp > >> @@ -271,6 +271,60 @@ fs_inst::is_send_from_grf() const > >> } > >> } > >> > >> +bool > >> +fs_inst::is_control_source(unsigned arg) const > >> +{ > >> + switch (opcode) { > >> + case FS_OPCODE_UNIFORM_PULL_CONSTANT_LOAD: > >> + case FS_OPCODE_UNIFORM_PULL_CONSTANT_LOAD_GEN7: > >> + case FS_OPCODE_VARYING_PULL_CONSTANT_LOAD_GEN4: > >> + case FS_OPCODE_VARYING_PULL_CONSTANT_LOAD_GEN7: > >> + return arg == 0; > >> + > >> + case SHADER_OPCODE_BROADCAST: > >> + case SHADER_OPCODE_SHUFFLE: > >> + case SHADER_OPCODE_QUAD_SWIZZLE: > >> + case FS_OPCODE_INTERPOLATE_AT_SAMPLE: > >> + case FS_OPCODE_INTERPOLATE_AT_SHARED_OFFSET: > >> + case FS_OPCODE_INTERPOLATE_AT_PER_SLOT_OFFSET: > >> + case SHADER_OPCODE_IMAGE_SIZE: > >> + case SHADER_OPCODE_GET_BUFFER_SIZE: > >> + return arg == 1; > >> + > >> + case SHADER_OPCODE_MOV_INDIRECT: > >> + case SHADER_OPCODE_CLUSTER_BROADCAST: > >> + case SHADER_OPCODE_TEX: > >> + case FS_OPCODE_TXB: > >> + case SHADER_OPCODE_TXD: > >> + case SHADER_OPCODE_TXF: > >> + case SHADER_OPCODE_TXF_LZ: > >> + case SHADER_OPCODE_TXF_CMS: > >> + case SHADER_OPCODE_TXF_CMS_W: > >> + case SHADER_OPCODE_TXF_UMS: > >> + case SHADER_OPCODE_TXF_MCS: > >> + case SHADER_OPCODE_TXL: > >> + case SHADER_OPCODE_TXL_LZ: > >> + case SHADER_OPCODE_TXS: > >> + case SHADER_OPCODE_LOD: > >> + case SHADER_OPCODE_TG4: > >> + case SHADER_OPCODE_TG4_OFFSET: > >> + case SHADER_OPCODE_SAMPLEINFO: > >> + case SHADER_OPCODE_UNTYPED_ATOMIC: > >> + case SHADER_OPCODE_UNTYPED_ATOMIC_FLOAT: > >> + case SHADER_OPCODE_UNTYPED_SURFACE_READ: > >> + case SHADER_OPCODE_UNTYPED_SURFACE_WRITE: > >> + case SHADER_OPCODE_BYTE_SCATTERED_READ: > >> + case SHADER_OPCODE_BYTE_SCATTERED_WRITE: > >> + case SHADER_OPCODE_TYPED_ATOMIC: > >> + case SHADER_OPCODE_TYPED_SURFACE_READ: > >> + case SHADER_OPCODE_TYPED_SURFACE_WRITE: > >> > > > > As of b284d222db, we are no longer using many of the opcodes in this list > > (gen7 pull constant loads, [un]typed surface reads/writes, etc.) It will > > need to be rebased and we need to add SHADER_OPCODE_SEND to the list. > > Fortunately, the changes to add SHADER_OPCODE_SEND landed before the 19.0 > > cut-off so there is no need to make two versions for backporting. > > > > Yes, that's roughly what I had done during one of my previous rebases of > this series, see: > > > https://cgit.freedesktop.org/~currojerez/mesa/commit/?h=jenkins=30f8f3ff48b02ead688705e0679a98c0d6c9c87e > Looks good. Some of those opcodes are no longer used. I sent out a MR to clean some of it up but I think it's better if this lands first so that it can be more cleanly back-ported Reviewed-by: Jason Ekstrand > > Other than that, this patch seems perfectly reasonable to me > > > > Reviewed-by: Jason Ekstrand > > > > If you want me to hand-review the new list of opcodes, feel free to send > a > > v2 and cc me. > > > > > >> + return arg == 1 || arg == 2; > >> + > >> + default: > >> + return false; > >> + } > >> +} > >> + > >> /** > >> * Returns true if this instruction's sources and destinations cannot > >> * safely be the same register. > >> diff --git
[Mesa-dev] [MR] intel/compiler: Clean up image operations
This MR cleans up some old cruft lying around in our back-end compiler surrounding images. The first three commits provide an enum for surface opcode sources and drop the surface builder from the scalar back-end. Now that image format lowering has moved to NIR and we have logical opcodes for HW image ops, there builder isn't really gaining us anything except some convenience wrappers. IMO, those wrappers are starting to be more trouble than they're worth. One could argue about it in the case of atomics but I think we'd be better off with a quick helper function in brw_fs_nir.cpp. Next, we drop the typed surface code from the vec4 back-end (I think it works but it's never been used) and then drop some now unused ocodes. We could probably wire up image_load_store for vec4 if we really wanted to (and now that the lowering is in NIR, it might not be too hard). However, it's kind-of pointless and the chances of someone actually doing that work are getting increasingly remote. Let's just drop the dead code. Finally, we clean up some opcodes by deleting opcodes that are no longer used and renaming some vec4-only opcodes VEC4. https://gitlab.freedesktop.org/mesa/mesa/merge_requests/287/commits ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] tgsi: don't set tgsi_info::uses_bindless_images for constbufs and hw atomics
On Thu, Feb 21, 2019 at 11:09 AM Marek Olšák wrote: > > On Thu, Feb 21, 2019, 12:08 AM Ilia Mirkin wrote: >> >> I don't think that's right... this is used in e.g. >> >>case TGSI_OPCODE_RESQ: >> if (tgsi_is_bindless_image_file(fullinst->Src[0].Register.File)) >> info->uses_bindless_images = true; >> break; >> >> And if you call RESQ with a src0 that is not IMAGE or BUFFER, then >> that's a bindless reference. > > > No it's not. TEMP, CONST, IN, and OUT are bindless references. CONSTBUF isn't. Indeed you're quite right. Looks like the collective "we" thought of this problem ahead of time. Excellent. (Now that you mention it, I feel like I might have even been involved in that... how time flies.) Reviewed-by: Ilia Mirkin > > Marek > >> >> I think the problem is that this is interacting poorly with the load >> constbuf stuff -- probably needs a bit of subtlety there. >> >> -ilia >> >> On Thu, Feb 21, 2019 at 12:03 AM Marek Olšák wrote: >> > >> > From: Marek Olšák >> > >> > This might have decreased performance for radeonsi/tgsi, because most >> > most shaders claimed they used bindless. >> > >> > Cc: 18.3 19.0 >> > --- >> > src/gallium/auxiliary/tgsi/tgsi_scan.h | 4 +++- >> > 1 file changed, 3 insertions(+), 1 deletion(-) >> > >> > diff --git a/src/gallium/auxiliary/tgsi/tgsi_scan.h >> > b/src/gallium/auxiliary/tgsi/tgsi_scan.h >> > index 580c73a2814..51d85af4fcb 100644 >> > --- a/src/gallium/auxiliary/tgsi/tgsi_scan.h >> > +++ b/src/gallium/auxiliary/tgsi/tgsi_scan.h >> > @@ -214,18 +214,20 @@ tgsi_scan_arrays(const struct tgsi_token *tokens, >> > void >> > tgsi_scan_tess_ctrl(const struct tgsi_token *tokens, >> > const struct tgsi_shader_info *info, >> > struct tgsi_tessctrl_info *out); >> > >> > static inline bool >> > tgsi_is_bindless_image_file(unsigned file) >> > { >> > return file != TGSI_FILE_IMAGE && >> >file != TGSI_FILE_MEMORY && >> > - file != TGSI_FILE_BUFFER; >> > + file != TGSI_FILE_BUFFER && >> > + file != TGSI_FILE_CONSTBUF && >> > + file != TGSI_FILE_HW_ATOMIC; >> > } >> > >> > #ifdef __cplusplus >> > } // extern "C" >> > #endif >> > >> > #endif /* TGSI_SCAN_H */ >> > -- >> > 2.17.1 >> > >> > ___ >> > mesa-dev mailing list >> > mesa-dev@lists.freedesktop.org >> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] tgsi: don't set tgsi_info::uses_bindless_images for constbufs and hw atomics
On Thu, Feb 21, 2019, 12:08 AM Ilia Mirkin wrote: > I don't think that's right... this is used in e.g. > >case TGSI_OPCODE_RESQ: > if (tgsi_is_bindless_image_file(fullinst->Src[0].Register.File)) > info->uses_bindless_images = true; > break; > > And if you call RESQ with a src0 that is not IMAGE or BUFFER, then > that's a bindless reference. > No it's not. TEMP, CONST, IN, and OUT are bindless references. CONSTBUF isn't. Marek > I think the problem is that this is interacting poorly with the load > constbuf stuff -- probably needs a bit of subtlety there. > > -ilia > > On Thu, Feb 21, 2019 at 12:03 AM Marek Olšák wrote: > > > > From: Marek Olšák > > > > This might have decreased performance for radeonsi/tgsi, because most > > most shaders claimed they used bindless. > > > > Cc: 18.3 19.0 > > --- > > src/gallium/auxiliary/tgsi/tgsi_scan.h | 4 +++- > > 1 file changed, 3 insertions(+), 1 deletion(-) > > > > diff --git a/src/gallium/auxiliary/tgsi/tgsi_scan.h > b/src/gallium/auxiliary/tgsi/tgsi_scan.h > > index 580c73a2814..51d85af4fcb 100644 > > --- a/src/gallium/auxiliary/tgsi/tgsi_scan.h > > +++ b/src/gallium/auxiliary/tgsi/tgsi_scan.h > > @@ -214,18 +214,20 @@ tgsi_scan_arrays(const struct tgsi_token *tokens, > > void > > tgsi_scan_tess_ctrl(const struct tgsi_token *tokens, > > const struct tgsi_shader_info *info, > > struct tgsi_tessctrl_info *out); > > > > static inline bool > > tgsi_is_bindless_image_file(unsigned file) > > { > > return file != TGSI_FILE_IMAGE && > >file != TGSI_FILE_MEMORY && > > - file != TGSI_FILE_BUFFER; > > + file != TGSI_FILE_BUFFER && > > + file != TGSI_FILE_CONSTBUF && > > + file != TGSI_FILE_HW_ATOMIC; > > } > > > > #ifdef __cplusplus > > } // extern "C" > > #endif > > > > #endif /* TGSI_SCAN_H */ > > -- > > 2.17.1 > > > > ___ > > mesa-dev mailing list > > mesa-dev@lists.freedesktop.org > > https://lists.freedesktop.org/mailman/listinfo/mesa-dev > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] panfrost: Use tiler fast path (performance boost)
> Yes, there is a buffer for holding the results of the tiler. The way it > works is that the userspace driver allocates a very large buffer with a > "grow on page fault" bit, so the kernel will allocate more memory as the > tiler asks for it. To be clear, right now I have this magic misc_0 buffer setup that way, too (allocating something enormous and setting GROW_ON_GPF). Although, it's not clear to me that that is the _correct_ thing to do; IIRC, the blob allocates "just enough", no GROW_ON_GPF needed, but I should probably check that. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] panfrost: Use tiler fast path (performance boost)
> *somewhere* the result of VS (or binning shader if VS is split in two) > needs to be saved for use during the per-tile rendering. Presumably > you have to give the hw a buffer of appropriate/matching size > somewhere, or with enough geometry (and too small a buffer) things > will overflow and go badly. We allocate (in another part of a driver) the varying_mem buffer, which is where varyings, including gl_Position (i.e. all the output of the VS) get written to after VS and read from by the tiler. That's not this, but it's not clear what this... > I guess if you exceed the size given in .tiler_meta, then hw falls > back to running VS per tile for all geometry (not just geom visible in > the tile), hence big diff in perf for large tile counts and lotsa > geometry. This is possible, although I would expect that would show up in the performance counters as unusually high VS activity (which was not the case...? To be fair, I'm not convinced I'm reading the counters right ^^) ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 109532] ir_variable has maximum access out of bounds -- but it's not out of bounds
https://bugs.freedesktop.org/show_bug.cgi?id=109532 --- Comment #35 from asimiklit --- Created attachment 143430 --> https://bugs.freedesktop.org/attachment.cgi?id=143430=edit patch to disallow an elimination of the first unused ub/ssbo array elements (In reply to Ian Romanick from comment #33) > (In reply to andrii simiklit from comment #32) > > (In reply to andrii simiklit from comment #31) > > > (In reply to Mark Janes from comment #30) > > > > (In reply to Mark Janes from comment #28) > > > > > > > > > > https://android-review.googlesource.com/c/platform/external/deqp/+/901894 > > > > > > > > Mesa still asserts with this fix. I also tested Andrii's mesa patch > > > > with > > > > the dEQP fix and the test fails. > > > Do you mean the Chris's dEQP fix here, yes? > > > But looks like the mentioned Chris's dEQP fix considers some GL > > > limitations > > > and doesn't affect the expectations of binding points. > > > > > > Also the assertion is a separate issue, I created the piglit test for > > > that: > > > https://patchwork.freedesktop.org/patch/286287/ > > > But yes, we unable to fix the test fail without assertion because of crash > > > :-) > > > > > > > > > > > Since non-mesa drivers have found issues with the original dEQP change, > > > > I > > > > suspect there are still deeper problems with the test. > > > Possible they have the same issue with binding points mismatch after > > > optimizations by glsl compiler. > > > They could try this fix/hack for deqp which is already helped us: > > > https://github.com/asimiklit/deqp/commit/ > > > 91cff8150944213f6da533e281ee76d95ca00f21 > > > If it helps them we will know that it is a common issue and it could > > > expedite this: > > > https://github.com/KhronosGroup/OpenGL-API/issues/46 > > > > So we have an answer from Piers Daniell: > > "I believe all buffer binding points should be consumed, regardless > > whether > >the array elements are used or not. This would be the behavior of least > >surprise to the developer. I didn't see any language that would indicate > >that unused elements should not be counted when assigning the element to > >the buffer binding point." > > I think this basically agrees with my earlier sentiment that we shouldn't > trim elements from the beginning of the array. It's generally ok (and in > some cases expected) to trim elements from the end. Are you talking about something like attached variant? -- You are receiving this mail because: You are the assignee for the bug.___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] panfrost: Use tiler fast path (performance boost)
On Thu, Feb 21, 2019 at 3:01 PM Rob Clark wrote: > On Thu, Feb 21, 2019 at 1:19 AM Alyssa Rosenzweig > wrote: > > > > For reasons that are still unclear (speculation included in the comment > > added in this patch), the tiler? metadata has a fast path that we were > > not enabling; there looks to be a possible time/memory tradeoff, but the > > details remain unclear. > > > > Regardless, this patch improves performance dramatically. Particular > > wins are for geometry-heavy scenes. For instance, glmark2-es2's > > Phong-shaded bunny, rendering at fullscreen (2400x1600) via GBM, jumped > > from ~20fps to hitting vsync cap at 60fps. Gains are even more obvious > > when vsync is disabled, as in glmark2-es2-wayland. > > > > With this patch, on GLES 2.0 samples not involving FBOs, it appears > > performance is converging with (and sometimes surpassing) the blob. > > > > Signed-off-by: Alyssa Rosenzweig > > --- > > src/gallium/drivers/panfrost/pan_context.c | 42 +++--- > > 1 file changed, 38 insertions(+), 4 deletions(-) > > > > diff --git a/src/gallium/drivers/panfrost/pan_context.c > b/src/gallium/drivers/panfrost/pan_context.c > > index 822b5a0dfef..2996a9c1e09 100644 > > --- a/src/gallium/drivers/panfrost/pan_context.c > > +++ b/src/gallium/drivers/panfrost/pan_context.c > > @@ -256,7 +256,28 @@ static struct bifrost_framebuffer > > panfrost_emit_mfbd(struct panfrost_context *ctx) > > { > > struct bifrost_framebuffer framebuffer = { > > -.tiler_meta = 0xf0c600, > > +/* It is not yet clear what tiler_meta means or how it's > > + * calculated, but we can tell the lower 32-bits are a > > + * (monotonically increasing?) function of tile count > and > > + * geometry complexity; I suspect it defines a memory > size of > > + * some kind? for the tiler. It's really unclear at the > > + * moment... but to add to the confusion, the hardware > is happy > > + * enough to accept a zero in this field, so we don't > even have > > + * to worry about it right now. > > *somewhere* the result of VS (or binning shader if VS is split in two) > needs to be saved for use during the per-tile rendering. Presumably > you have to give the hw a buffer of appropriate/matching size > somewhere, or with enough geometry (and too small a buffer) things > will overflow and go badly. > > Yes, there is a buffer for holding the results of the tiler. The way it works is that the userspace driver allocates a very large buffer with a "grow on page fault" bit, so the kernel will allocate more memory as the tiler asks for it. > I guess if you exceed the size given in .tiler_meta, then hw falls > back to running VS per tile for all geometry (not just geom visible in > the tile), hence big diff in perf for large tile counts and lotsa > geometry. > No, this isn't it. The vertex shaders aren't split in half, they all run entirely before the tiler even starts. The only thing I could imagine would be some optional part of the tiler buffer where the aforementioned grow on page fault thing doesn't quite work out. > > It does sound a bit strange that it would depend on tile count. I'd > expect it would be a function strictly of amount of geometry (and > possibly effected by whether or not gl_PointSize is written?).. and > possibly amount of varyings if VS isn't split into two parts. > > BR, > -R > > > + * The byte (just after the 32-bit mark) is much more > > + * interesting. The higher nibble I've only ever seen > as 0xF, > > + * but the lower one I've seen as 0x0 or 0xF, and it's > not > > + * obvious what the difference is. But what -is- > obvious is > > + * that when the lower nibble is zero, performance is > severely > > + * degraded compared to when the lower nibble is set. > > + * Evidently, that nibble enables some sort of fast > path, > > + * perhaps relating to caching or tile flush? > Regardless, at > > + * this point there's no clear reason not to set it, > aside from > > + * substantially increased memory requirements (of the > misc_0 > > + * buffer) */ > > + > > +.tiler_meta = ((uint64_t) 0xff << 32) | 0x0, > > > > .width1 = MALI_POSITIVE(ctx->pipe_framebuffer.width), > > .height1 = MALI_POSITIVE(ctx->pipe_framebuffer.height), > > @@ -271,10 +292,23 @@ panfrost_emit_mfbd(struct panfrost_context *ctx) > > > > .unknown2 = 0x1f, > > > > -/* Presumably corresponds to unknown_address_X of SFBD > */ > > +/* Corresponds to unknown_address_X of SFBD */ > > .scratchpad = ctx->scratchpad.gpu, > > .tiler_scratch_start = ctx->misc_0.gpu, > > -
Re: [Mesa-dev] [PATCH v2] anv: advertise 8 subpixel precision bits
On Thu, Feb 21, 2019 at 3:45 AM Juan A. Suarez Romero wrote: > On one side, when emitting 3DSTATE_SF, VertexSubPixelPrecisionSelect is > used to select between 8 bit subpixel precision (value 0) or 4 bit > subpixel precision (value 1). As this value is not set, means it is > taking the value 0, so 8 bit are used. > > On the other side, in the Vulkan CTS tests, if the reference rasterizer, > which uses 8 bit precision, as it is used to check what should be the > expected value for the tests, is changed to use 4 bit as ANV was > advertising so far, some of the tests will fail. > > So it seems ANV is actually using 8 bits. > > v2: explicitly set 3DSTATE_SF::VertexSubPixelPrecisionSelect (Jason) > > CC: Jason Ekstrand > CC: Kenneth Graunke > --- > src/intel/vulkan/anv_device.c| 2 +- > src/intel/vulkan/genX_pipeline.c | 1 + > 2 files changed, 2 insertions(+), 1 deletion(-) > > diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c > index 3120865466a..95224407318 100644 > --- a/src/intel/vulkan/anv_device.c > +++ b/src/intel/vulkan/anv_device.c > @@ -1095,7 +1095,7 @@ void anv_GetPhysicalDeviceProperties( > 16 * devinfo->max_cs_threads, > 16 * devinfo->max_cs_threads, >}, > - .subPixelPrecisionBits= 4 /* FIXME */, > + .subPixelPrecisionBits= 8, >.subTexelPrecisionBits= 4 /* FIXME */, >.mipmapPrecisionBits = 4 /* FIXME */, >.maxDrawIndexedIndexValue = UINT32_MAX, > diff --git a/src/intel/vulkan/genX_pipeline.c > b/src/intel/vulkan/genX_pipeline.c > index 6255e5d83c5..b06036a6fc7 100644 > --- a/src/intel/vulkan/genX_pipeline.c > +++ b/src/intel/vulkan/genX_pipeline.c > @@ -464,6 +464,7 @@ emit_rs_state(struct anv_pipeline *pipeline, > sf.TriangleStripListProvokingVertexSelect = 0; > sf.LineStripListProvokingVertexSelect = 0; > sf.TriangleFanProvokingVertexSelect = 1; > + sf.VertexSubPixelPrecisionSelect = 0; > Please use the _8bit #define. It looks like you may have to modify the gen7 XML (and possibly earlier if you don't mind) to add the enum values. > > const struct brw_vue_prog_data *last_vue_prog_data = >anv_pipeline_get_last_vue_prog_data(pipeline); > -- > 2.20.1 > > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] nir: clone instruction set rather than removing individual entries
Reviewed-by: Jason Ekstrand On Thu, Feb 21, 2019 at 5:33 AM Thomas Helland wrote: > This patch is: > > Reviewed-by: Thomas Helland > > Den ons. 20. feb. 2019 kl. 04:04 skrev Timothy Arceri < > tarc...@itsqueeze.com>: > > > > This reduces the time spent in nir_opt_cse() by almost a half. > > --- > > src/compiler/nir/nir_opt_cse.c | 6 +++--- > > 1 file changed, 3 insertions(+), 3 deletions(-) > > > > diff --git a/src/compiler/nir/nir_opt_cse.c > b/src/compiler/nir/nir_opt_cse.c > > index bf42a6a33dc..3c3617d852a 100644 > > --- a/src/compiler/nir/nir_opt_cse.c > > +++ b/src/compiler/nir/nir_opt_cse.c > > @@ -39,9 +39,10 @@ > > */ > > > > static bool > > -cse_block(nir_block *block, struct set *instr_set) > > +cse_block(nir_block *block, struct set *dominance_set) > > { > > bool progress = false; > > + struct set *instr_set = _mesa_set_clone(dominance_set, NULL); > > > > nir_foreach_instr_safe(instr, block) { > >if (nir_instr_set_add_or_rewrite(instr_set, instr)) { > > @@ -55,8 +56,7 @@ cse_block(nir_block *block, struct set *instr_set) > >progress |= cse_block(child, instr_set); > > } > > > > - nir_foreach_instr(instr, block) > > - nir_instr_set_remove(instr_set, instr); > > + _mesa_set_destroy(instr_set, NULL); > > > > return progress; > > } > > -- > > 2.20.1 > > > > ___ > > mesa-dev mailing list > > mesa-dev@lists.freedesktop.org > > https://lists.freedesktop.org/mailman/listinfo/mesa-dev > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v3] i965: fixed clamping in set_scissor_bits when the y is flipped
I've run the test with applied patch and it's successfully passed. Also, this patch fixed an one issue from another ticket - https://bugs.freedesktop.org/show_bug.cgi?id=109594 - with this patch, totem app doesn't crashes at resizing Thanks! Tested-by: Paul Chelombitko чт, 21 февр. 2019 г. в 12:03, Eleni Maria Stea : > Calculating the scissor rectangle fields with the y flipped (0 on top) > can generate negative values that will cause assertion failure later on > as the scissor fields are all unsigned. We must clamp the bbox values > again to make sure they don't exceed the fb_height. Also fixed a > calculation error. > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108999 > > v2: >- I initially clamped the values inside the if (Y is flipped) case >and I made a mistake in the calculation: the clamp of the bbox[2] should >be a check if (bbox[2] >= fbheight) bbox[2] = fbheight - 1 instead and I >shouldn't have changed the ScissorRectangleYMax calculation. As the >fixed code is equivalent with using CLAMP instead of MAX2 at the top of >the function when bbox[2] and bbox[3] are calculated, and the 2nd is > more >clear, I replaced it. (Nanley Chery) > > v3: >- Reversed the CLAMP change in bbox[3] as the API guarantees that the >viewport height is positive. (Nanley Chery) > --- > src/mesa/drivers/dri/i965/genX_state_upload.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c > b/src/mesa/drivers/dri/i965/genX_state_upload.c > index dcdfb3c9292..47f3741e673 100644 > --- a/src/mesa/drivers/dri/i965/genX_state_upload.c > +++ b/src/mesa/drivers/dri/i965/genX_state_upload.c > @@ -2445,7 +2445,7 @@ set_scissor_bits(const struct gl_context *ctx, int i, > > bbox[0] = MAX2(ctx->ViewportArray[i].X, 0); > bbox[1] = MIN2(bbox[0] + ctx->ViewportArray[i].Width, fb_width); > - bbox[2] = MAX2(ctx->ViewportArray[i].Y, 0); > + bbox[2] = CLAMP(ctx->ViewportArray[i].Y, 0, fb_height); > bbox[3] = MIN2(bbox[2] + ctx->ViewportArray[i].Height, fb_height); > _mesa_intersect_scissor_bounding_box(ctx, i, bbox); > > -- > 2.20.1 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 109535] [Tracker] Mesa 19.0 release tracker
https://bugs.freedesktop.org/show_bug.cgi?id=109535 Bug 109535 depends on bug 109594, which changed state. Bug 109594 Summary: totem assert failure: totem: src/intel/genxml/gen9_pack.h:72: __gen_uint: La declaración `v <= max' no se cumple. https://bugs.freedesktop.org/show_bug.cgi?id=109594 What|Removed |Added Status|RESOLVED|REOPENED Resolution|FIXED |--- -- You are receiving this mail because: You are the assignee for the bug. You are the QA Contact for the bug.___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 109535] [Tracker] Mesa 19.0 release tracker
https://bugs.freedesktop.org/show_bug.cgi?id=109535 Bug 109535 depends on bug 109594, which changed state. Bug 109594 Summary: totem assert failure: totem: src/intel/genxml/gen9_pack.h:72: __gen_uint: La declaración `v <= max' no se cumple. https://bugs.freedesktop.org/show_bug.cgi?id=109594 What|Removed |Added Status|NEEDINFO|RESOLVED Resolution|--- |FIXED -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug.___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] gallium/auxiliary/vl: Fix duplicate symbol build errors.
On 2019-02-19 10:58 a.m., Emil Velikov wrote: > On Tue, 19 Feb 2019 at 03:32, Vinson Lee wrote: >>CXXLDgallium_dri.la >> duplicate symbol _compute_shader_video_buffer in: >> >> ../../../../src/gallium/auxiliary/.libs/libgalliumvl.a(libgalliumvl_la-vl_compositor.o) >> >> ../../../../src/gallium/auxiliary/.libs/libgalliumvl.a(libgalliumvl_la-vl_compositor_cs.o) >> duplicate symbol _compute_shader_weave in: >> >> ../../../../src/gallium/auxiliary/.libs/libgalliumvl.a(libgalliumvl_la-vl_compositor.o) >> >> ../../../../src/gallium/auxiliary/.libs/libgalliumvl.a(libgalliumvl_la-vl_compositor_cs.o) >> duplicate symbol _compute_shader_rgba in: >> >> ../../../../src/gallium/auxiliary/.libs/libgalliumvl.a(libgalliumvl_la-vl_compositor.o) >> >> ../../../../src/gallium/auxiliary/.libs/libgalliumvl.a(libgalliumvl_la-vl_compositor_cs.o) >> >> Fixes: 9364d66cb7f7 ("gallium/auxiliary/vl: Add video compositor compute >> shader render") >> Signed-off-by: Vinson Lee >> --- >> src/gallium/auxiliary/vl/vl_compositor_cs.h | 6 +++--- >> 1 file changed, 3 insertions(+), 3 deletions(-) >> >> diff --git a/src/gallium/auxiliary/vl/vl_compositor_cs.h >> b/src/gallium/auxiliary/vl/vl_compositor_cs.h >> index 7a203d327eda..a73a8755fc2a 100644 >> --- a/src/gallium/auxiliary/vl/vl_compositor_cs.h >> +++ b/src/gallium/auxiliary/vl/vl_compositor_cs.h >> @@ -32,9 +32,9 @@ >> >> #include "vl_compositor.h" >> >> -char *compute_shader_video_buffer; >> -char *compute_shader_weave; >> -char *compute_shader_rgba; >> +extern char *compute_shader_video_buffer; >> +extern char *compute_shader_weave; >> +extern char *compute_shader_rgba; >> > Please make them also "const" - in both here and C file. > > With that the patch is > Reviewed-by: Emil Velikov Agreed! This patch is Reviewed-by: James Zhu James > > Thanks > Emil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] panfrost: Use tiler fast path (performance boost)
On Thu, Feb 21, 2019 at 1:19 AM Alyssa Rosenzweig wrote: > > For reasons that are still unclear (speculation included in the comment > added in this patch), the tiler? metadata has a fast path that we were > not enabling; there looks to be a possible time/memory tradeoff, but the > details remain unclear. > > Regardless, this patch improves performance dramatically. Particular > wins are for geometry-heavy scenes. For instance, glmark2-es2's > Phong-shaded bunny, rendering at fullscreen (2400x1600) via GBM, jumped > from ~20fps to hitting vsync cap at 60fps. Gains are even more obvious > when vsync is disabled, as in glmark2-es2-wayland. > > With this patch, on GLES 2.0 samples not involving FBOs, it appears > performance is converging with (and sometimes surpassing) the blob. > > Signed-off-by: Alyssa Rosenzweig > --- > src/gallium/drivers/panfrost/pan_context.c | 42 +++--- > 1 file changed, 38 insertions(+), 4 deletions(-) > > diff --git a/src/gallium/drivers/panfrost/pan_context.c > b/src/gallium/drivers/panfrost/pan_context.c > index 822b5a0dfef..2996a9c1e09 100644 > --- a/src/gallium/drivers/panfrost/pan_context.c > +++ b/src/gallium/drivers/panfrost/pan_context.c > @@ -256,7 +256,28 @@ static struct bifrost_framebuffer > panfrost_emit_mfbd(struct panfrost_context *ctx) > { > struct bifrost_framebuffer framebuffer = { > -.tiler_meta = 0xf0c600, > +/* It is not yet clear what tiler_meta means or how it's > + * calculated, but we can tell the lower 32-bits are a > + * (monotonically increasing?) function of tile count and > + * geometry complexity; I suspect it defines a memory size of > + * some kind? for the tiler. It's really unclear at the > + * moment... but to add to the confusion, the hardware is > happy > + * enough to accept a zero in this field, so we don't even > have > + * to worry about it right now. *somewhere* the result of VS (or binning shader if VS is split in two) needs to be saved for use during the per-tile rendering. Presumably you have to give the hw a buffer of appropriate/matching size somewhere, or with enough geometry (and too small a buffer) things will overflow and go badly. I guess if you exceed the size given in .tiler_meta, then hw falls back to running VS per tile for all geometry (not just geom visible in the tile), hence big diff in perf for large tile counts and lotsa geometry. It does sound a bit strange that it would depend on tile count. I'd expect it would be a function strictly of amount of geometry (and possibly effected by whether or not gl_PointSize is written?).. and possibly amount of varyings if VS isn't split into two parts. BR, -R > + * The byte (just after the 32-bit mark) is much more > + * interesting. The higher nibble I've only ever seen as 0xF, > + * but the lower one I've seen as 0x0 or 0xF, and it's not > + * obvious what the difference is. But what -is- obvious is > + * that when the lower nibble is zero, performance is > severely > + * degraded compared to when the lower nibble is set. > + * Evidently, that nibble enables some sort of fast path, > + * perhaps relating to caching or tile flush? Regardless, at > + * this point there's no clear reason not to set it, aside > from > + * substantially increased memory requirements (of the misc_0 > + * buffer) */ > + > +.tiler_meta = ((uint64_t) 0xff << 32) | 0x0, > > .width1 = MALI_POSITIVE(ctx->pipe_framebuffer.width), > .height1 = MALI_POSITIVE(ctx->pipe_framebuffer.height), > @@ -271,10 +292,23 @@ panfrost_emit_mfbd(struct panfrost_context *ctx) > > .unknown2 = 0x1f, > > -/* Presumably corresponds to unknown_address_X of SFBD */ > +/* Corresponds to unknown_address_X of SFBD */ > .scratchpad = ctx->scratchpad.gpu, > .tiler_scratch_start = ctx->misc_0.gpu, > -.tiler_scratch_middle = ctx->misc_0.gpu + > /*ctx->misc_0.size*/40960, /* Size depends on the size of the framebuffer and > the number of vertices */ > + > +/* The constant added here is, like the lower word of > + * tiler_meta, (loosely) another product of framebuffer size > + * and geometry complexity. It must be sufficiently large for > + * the tiler_meta fast path to work; if it's too small, there > + * will be DATA_INVALID_FAULTs. Conversely, it must be less > + * than the total size of misc_0, or else there's no room. > It's > + * possible this constant configures a partition between two > + * parts of
Re: [Mesa-dev] [PATCH v2 2/2] isl: the display engine requires 64B alignment for linear surfaces
On 21/02/2019 13:30, Chris Wilson wrote: Quoting Lionel Landwerlin (2019-02-21 12:57:09) I did not find the PRM bit that says it must be 64b aligned, but I can see that's what i915 checks. Chris: If you have a pointer to it, I could add the quote. In amongst the register specs, PLANE_STRIDE: For Linear memory, this field specifies the stride in chunks of 64 bytes (1 cache line). -Chris Thanks a lot! ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 2/2] isl: the display engine requires 64B alignment for linear surfaces
Quoting Lionel Landwerlin (2019-02-21 12:57:09) > I did not find the PRM bit that says it must be 64b aligned, but I can > see that's what i915 checks. > > Chris: If you have a pointer to it, I could add the quote. In amongst the register specs, PLANE_STRIDE: For Linear memory, this field specifies the stride in chunks of 64 bytes (1 cache line). -Chris ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 2/2] isl: the display engine requires 64B alignment for linear surfaces
I did not find the PRM bit that says it must be 64b aligned, but I can see that's what i915 checks. Chris: If you have a pointer to it, I could add the quote. Thanks! Reviewed-by: Lionel Landwerlin On 19/02/2019 12:06, Samuel Iglesias Gonsálvez wrote: Signed-off-by: Samuel Iglesias Gonsálvez --- src/intel/isl/isl.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c index 5c34efb9a13..7fb469687fa 100644 --- a/src/intel/isl/isl.c +++ b/src/intel/isl/isl.c @@ -1519,6 +1519,9 @@ isl_surf_init_s(const struct isl_device *dev, } } base_alignment_B = isl_round_up_to_power_of_two(base_alignment_B); + /* The display engine requires 64B alignment for linear surfaces. */ + if (isl_surf_usage_is_display(info->usage)) + base_alignment_B = MAX(base_alignment_B, 64); } else { const uint32_t total_h_tl = isl_align_div(phys_total_el.h, tile_info.logical_extent_el.height); ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [RFC PATCH 2/2] etnaviv: try to use CPU to tile/untile when the region is small enough
Using the GPU has a cost, it implies preparing the RS/BLT request and then possibly requires extra GPU -> CPU syncs, so let's only use GPU-based tiling/untiling for rather big regions. While at it, move the "should we use the GPU to tile/untile" logic to its own function to make things more readable. Signed-off-by: Boris Brezillon --- .../drivers/etnaviv/etnaviv_transfer.c| 67 +-- 1 file changed, 61 insertions(+), 6 deletions(-) diff --git a/src/gallium/drivers/etnaviv/etnaviv_transfer.c b/src/gallium/drivers/etnaviv/etnaviv_transfer.c index 38648564b701..a3013e624ead 100644 --- a/src/gallium/drivers/etnaviv/etnaviv_transfer.c +++ b/src/gallium/drivers/etnaviv/etnaviv_transfer.c @@ -132,6 +132,66 @@ etna_transfer_unmap(struct pipe_context *pctx, struct pipe_transfer *ptrans) slab_free(>transfer_pool, trans); } +#define GPU_TILING_MIN_SURFACE_SIZE1024 +#define GPU_TILING_MIN_UNALIGNED_SURFACE_SIZE 8192 + +static bool +etna_transfer_use_gpu(struct etna_context *ctx, struct etna_resource *rsc, + const struct pipe_box *box) +{ + unsigned w_align, h_align; + + /* Always use the GPU to tile/untile when the resource has a Tile Status +* buffer attached. */ + if (rsc->ts_bo) + return true; + + /* Nothing to tile/untile if the resource is using a linear format. */ + if (rsc->layout == ETNA_LAYOUT_LINEAR) + return false; + + /* Do not use the GPU when the resource is using a 1byte/pixel format. */ + if (util_format_get_blocksize(rsc->base.format) == 1) + return false; + + /* HALIGN 4 resources are incompatible with the resolve engine, so fall back +* to using software to detile this resource. */ + if (rsc->halign == TEXTURE_HALIGN_FOUR) + return false; + + /* Using the GPU has a cost, it implies preparing the RS/BLT request and +* then possibly requires extra GPU -> CPU syncs, so let's only use +* GPU-based tiling for rather big regions. Right now the minimum surface +* surface size is arbitrarily set to 1024 pixels. */ + if (box->width * box->height < GPU_TILING_MIN_SURFACE_SIZE) + return false; + + /* No alignment constraints when using BLT. */ + if (ctx->specs.use_blt) + return true; + + if (rsc->layout & ETNA_LAYOUT_BIT_SUPER) { + w_align = h_align = 64; + } else { + w_align = ETNA_RS_WIDTH_MASK + 1; + h_align = ETNA_RS_HEIGHT_MASK + 1; + } + h_align *= ctx->screen->specs.pixel_pipes; + + /* Everything is properly aligned, let's use the RS engine to +* tile/untile. */ + if (!((box->x | box->width) & (w_align - 1)) && + !((box->y | box->height) & (h_align - 1))) + return true; + + /* We want the minimum surface size to be even bigger for unaligned +* requests. */ + if (box->width * box->height < GPU_TILING_MIN_UNALIGNED_SURFACE_SIZE) + return false; + + return true; +} + static void * etna_transfer_map(struct pipe_context *pctx, struct pipe_resource *prsc, unsigned level, @@ -179,12 +239,7 @@ etna_transfer_map(struct pipe_context *pctx, struct pipe_resource *prsc, * render resource. Use the texture resource, which avoids bouncing * pixels between the two resources, and we can de-tile it in s/w. */ rsc = etna_resource(rsc->texture); - } else if (rsc->ts_bo || - (rsc->layout != ETNA_LAYOUT_LINEAR && - util_format_get_blocksize(format) > 1 && - /* HALIGN 4 resources are incompatible with the resolve engine, -* so fall back to using software to detile this resource. */ - rsc->halign != TEXTURE_HALIGN_FOUR)) { + } else if (etna_transfer_use_gpu(ctx, rsc, box)) { /* If the surface has tile status, we need to resolve it first. * The strategy we implement here is to use the RS to copy the * depth buffer, filling in the "holes" where the tile status -- 2.20.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 109532] ir_variable has maximum access out of bounds -- but it's not out of bounds
https://bugs.freedesktop.org/show_bug.cgi?id=109532 --- Comment #34 from asimiklit --- (In reply to asimiklit from comment #26) > (In reply to Ian Romanick from comment #17) > > (In reply to asimiklit from comment #6) > > > Created attachment 143288 [details] > > > this simple program helps me to reproduce this issue. > > > > > > just share my simple reproducer) > > > > > > Run it in this way: > > > > > >simple_reproducer shader.comp > > > > It seems like this could be made into a piglit test that could be compiled > > with glslparsertest. If you haven't already submitted such a test, please > > do. :) > > I work on it) The two piglit tests were suggested: https://patchwork.freedesktop.org/patch/286287/ -- You are receiving this mail because: You are the assignee for the bug.___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/1] Avoid leaking parameter name in prog_to_nir.
Fixes: 3d7611e9 "st/nir: use NIR for asm programs" --- src/mesa/program/prog_to_nir.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/mesa/program/prog_to_nir.c b/src/mesa/program/prog_to_nir.c index 312b299361e..6e3fa9432a3 100644 --- a/src/mesa/program/prog_to_nir.c +++ b/src/mesa/program/prog_to_nir.c @@ -1024,7 +1024,8 @@ prog_to_nir(const struct gl_program *prog, c->parameters = rzalloc(s, nir_variable); c->parameters->type = glsl_array_type(glsl_vec4_type(), prog->Parameters->NumParameters, 0); - c->parameters->name = strdup(prog->Parameters->Parameters[0].Name); + c->parameters->name = + ralloc_strdup(c->parameters, prog->Parameters->Parameters[0].Name); c->parameters->data.read_only = true; c->parameters->data.mode = nir_var_uniform; exec_list_push_tail(>uniforms, >parameters->node); -- 2.20.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [RFC PATCH 0/2] etnaviv: use CPU-based tiling/untiling for small regions
Hi, This series aims at optimizing updates of small regions inside a texture. This is particularly useful when updating a texture atlas one bit at a time. Before going for this CPU-based tiling/untiling approach we tried optimizing things by keeping track of all updates triggered by the transfer_map/unmap() calls to avoid GPU -> CPU syncs when new regions are updated without touching other already written regions [1]. Unfortunately, because of the supertile+RS-alignment constraints this led to almost all updates requiring such a sync (at least in our use case where each element of the atlas is not properly aligned). This led us to consider another approach: avoid GPU-based tiling for small regions, especially when they're not properly aligned. This is what this patchset implements. Patch 1 add supports for super and multi tile untiling, and patch 2 makes use of the CPU-based tiling/untiling when the region is small enough. Note that the thresholds used here have been chosen arbitrarily but they seem to speed things up in our use case. Future possible improvements involve supporting NEON-based tiling/untiling (as started here[2]). Another option we considered was keeping a shadow linear buffer attached to the tiled resource so that we could update the linear version without waiting for the GPU to finish rendering things on the final BO, but this option will likely be more invasive. Please let me know what you think of this CPU-based tiling approach, and if you don't like it, feel free to propose other solution that would be worth investigating. Thanks, Boris [1]https://gitlab.collabora.com/bbrezillon/mesa/commits/etna-texture-atlas-18.2.4 [2]https://github.com/laanwj/mesa/commit/6d575b3094f17e29246be72dce8fbb6fe048db2c Boris Brezillon (2): etnaviv: add support for CPU-based super/multi tile tiling/untiling etnaviv: try to use CPU to tile/untile when the region is small enough src/gallium/drivers/etnaviv/etnaviv_tiling.c | 85 --- src/gallium/drivers/etnaviv/etnaviv_tiling.h | 10 ++- .../drivers/etnaviv/etnaviv_transfer.c| 83 ++ 3 files changed, 145 insertions(+), 33 deletions(-) -- 2.20.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 0/1] Avoid leaking parameter name in prog_to_nir
Noticed this while looking at recent commits. I'm not familiar with the Mesa codebase so this might be incorrect, but I don't see how this strdup'ed variable would be freed. Matthias Lorenz (1): Avoid leaking parameter name in prog_to_nir. src/mesa/program/prog_to_nir.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) -- 2.20.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [RFC PATCH 1/2] etnaviv: add support for CPU-based super/multi tile tiling/untiling
Sometimes using CPU based untiling/tiling makes, like when updating a really small region of the texture atlas which is not RS-aligned. CPU-based support for simple tile layout was already supported, but not the multi/super tile cases. Make the etna_texture_[un]tile() more generic to support those cases. Signed-off-by: Boris Brezillon --- src/gallium/drivers/etnaviv/etnaviv_tiling.c | 85 --- src/gallium/drivers/etnaviv/etnaviv_tiling.h | 10 ++- .../drivers/etnaviv/etnaviv_transfer.c| 16 ++-- 3 files changed, 84 insertions(+), 27 deletions(-) diff --git a/src/gallium/drivers/etnaviv/etnaviv_tiling.c b/src/gallium/drivers/etnaviv/etnaviv_tiling.c index f4f85c1d6e6c..7e2b8bd48d3a 100644 --- a/src/gallium/drivers/etnaviv/etnaviv_tiling.c +++ b/src/gallium/drivers/etnaviv/etnaviv_tiling.c @@ -32,39 +32,95 @@ #define TEX_TILE_WIDTH (4) #define TEX_TILE_HEIGHT (4) #define TEX_TILE_WORDS (TEX_TILE_WIDTH * TEX_TILE_HEIGHT) +#define TEX_SUPERTILE_WIDTH (64) +#define TEX_SUPERTILE_HEIGHT (64) +#define TEX_SUPERTILE_TWIDTH (16) +#define TEX_SUPERTILE_THEIGHT (16) +#define TEX_SUPERTILE_WORDS (TEX_SUPERTILE_WIDTH * TEX_SUPERTILE_HEIGHT) +#define TEX_TILES_PER_SUPERTILE (TEX_SUPERTILE_TWIDTH * TEX_SUPERTILE_THEIGHT) + +static unsigned tile_buf_offset(enum etna_surface_layout tlayout, +unsigned tx, unsigned ty, + unsigned tstride, unsigned th, + unsigned elmtsize) +{ + unsigned offs = 0, tile; + unsigned tiles_per_line = (tstride / elmtsize) / TEX_TILE_WIDTH; + + /* Multi tile layouts are described here: +* https://github.com/laanwj/etna_viv/blob/master/doc/hardware.md#multi-tiling +*/ + if (tlayout & ETNA_LAYOUT_BIT_MULTI) { + if tx / (TEX_TILE_WIDTH * 2)) & 1) ^ ((ty / TEX_TILE_HEIGHT) & 1))) { + offs += ((tstride / elmtsize) * (th / 2)); + if ((tx / (TEX_TILE_WIDTH * 2)) & 1) +tx -= TEX_TILE_WIDTH * 2; +else +tx += TEX_TILE_HEIGHT * 2; + } + + ty = ((ty / 2) & ~(TEX_TILE_HEIGHT - 1)) + + (ty % TEX_TILE_HEIGHT); + } + + if (tlayout & ETNA_LAYOUT_BIT_SUPER) { + /* We use the supertile layout described here: + * https://github.com/laanwj/etna_viv/blob/master/doc/hardware.md#supertiling + * FIXME: According to + * https://github.com/laanwj/etna_viv/blob/master/tools/detiler.py + * another layout exists. We should probably support CPU-based detiling + * for this layout too and determine the layout to used based on HW + * features. */ + tile = (ty / TEX_SUPERTILE_HEIGHT) * tiles_per_line * + TEX_SUPERTILE_THEIGHT; + ty &= TEX_SUPERTILE_HEIGHT - 1; + tile += (tx / TEX_SUPERTILE_WIDTH) * TEX_TILES_PER_SUPERTILE; + tx &= TEX_SUPERTILE_WIDTH - 1; + tile += (ty / (4 * TEX_TILE_HEIGHT)) * 16 * 4; + ty &= (4 * TEX_TILE_HEIGHT) - 1; + tile += (ty / TEX_TILE_HEIGHT) * 2; + tile += ((tx / TEX_TILE_WIDTH) & ~0x1) * 4; + tile += (tx / TEX_TILE_WIDTH) & 0x1; + } else { + tile = (ty / TEX_TILE_HEIGHT) * tiles_per_line; + tile += tx / TEX_TILE_WIDTH; + } + + offs += tile * TEX_TILE_WORDS; + ty &= TEX_TILE_HEIGHT - 1; + tx &= TEX_TILE_WIDTH - 1; + offs += (ty * TEX_TILE_WIDTH) + tx; + + return offs; +} #define DO_TILE(type) \ src_stride /= sizeof(type); \ - dst_stride = (dst_stride * TEX_TILE_HEIGHT) / sizeof(type); \ for (unsigned srcy = 0; srcy < height; ++srcy) { \ - unsigned dsty = basey + srcy; \ - unsigned ty = (dsty / TEX_TILE_HEIGHT) * dst_stride + \ -(dsty % TEX_TILE_HEIGHT) * TEX_TILE_WIDTH; \ for (unsigned srcx = 0; srcx < width; ++srcx) { \ - unsigned dstx = basex + srcx; \ - ((type *)dest)[ty + (dstx / TEX_TILE_WIDTH) * TEX_TILE_WORDS + \ -(dstx % TEX_TILE_WIDTH)] = \ + unsigned destoffs = tile_buf_offset(baselayout, basex + srcx, \ + basey + srcy, dst_stride, \ + baseh, elmtsize); \ + ((type *)dest)[destoffs] = \ ((type *)src)[srcy * src_stride + srcx];\ } \ } #define DO_UNTILE(type) \ - src_stride = (src_stride * TEX_TILE_HEIGHT) / sizeof(type);\ dst_stride /= sizeof(type);\ for (unsigned dsty = 0; dsty < height; ++dsty) { \ - unsigned srcy = basey +
Re: [Mesa-dev] [PATCH] nir: clone instruction set rather than removing individual entries
This patch is: Reviewed-by: Thomas Helland Den ons. 20. feb. 2019 kl. 04:04 skrev Timothy Arceri : > > This reduces the time spent in nir_opt_cse() by almost a half. > --- > src/compiler/nir/nir_opt_cse.c | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) > > diff --git a/src/compiler/nir/nir_opt_cse.c b/src/compiler/nir/nir_opt_cse.c > index bf42a6a33dc..3c3617d852a 100644 > --- a/src/compiler/nir/nir_opt_cse.c > +++ b/src/compiler/nir/nir_opt_cse.c > @@ -39,9 +39,10 @@ > */ > > static bool > -cse_block(nir_block *block, struct set *instr_set) > +cse_block(nir_block *block, struct set *dominance_set) > { > bool progress = false; > + struct set *instr_set = _mesa_set_clone(dominance_set, NULL); > > nir_foreach_instr_safe(instr, block) { >if (nir_instr_set_add_or_rewrite(instr_set, instr)) { > @@ -55,8 +56,7 @@ cse_block(nir_block *block, struct set *instr_set) >progress |= cse_block(child, instr_set); > } > > - nir_foreach_instr(instr, block) > - nir_instr_set_remove(instr_set, instr); > + _mesa_set_destroy(instr_set, NULL); > > return progress; > } > -- > 2.20.1 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] anv: fix alphaToCoverage when there is no color attachment
CL#3532 added a test for alpha to coverage without a color attachment. First the test draws a primitive with alpha 0 and a subpass with only a depth buffer. No writes to a depth buffer are expected. Then a second draw with a color buffer and the same depth buffer is done to verify the depth buffer still has the original clear values. This behavior is not explicitly forbidden by the Vulkan spec, so it seems it is allowed. When there is no color attachment for a given output, we discard it so at the end we have an FS assembly like: Native code for unnamed fragment shader (null) SIMD16 shader: 1 instructions. 0 loops. 4 cycles. 0:0 spills:fills. Promoted 0 constants. Compacted 16 to 16 bytes (0%) START B0 (4 cycles) sendc(16) null<1>UW g120<0,1,0>F0x90031000 render MsgDesc: RT write SIMD16 LastRT Surface = 0 mlen 8 rlen 0 { align1 1H EOT }; As g120 is not initialized, we see random writes to the depth buffer due to the alphaToCoverage enablement. This commit fixes that by keeping the output and creating a null render target for it. Fixes tests: dEQP-VK.pipeline.multisample.alpha_to_coverage_no_color_attachment.* Signed-off-by: Samuel Iglesias Gonsálvez --- src/intel/vulkan/anv_pipeline.c | 35 + 1 file changed, 27 insertions(+), 8 deletions(-) diff --git a/src/intel/vulkan/anv_pipeline.c b/src/intel/vulkan/anv_pipeline.c index e2024212bd9..70a958bf3a8 100644 --- a/src/intel/vulkan/anv_pipeline.c +++ b/src/intel/vulkan/anv_pipeline.c @@ -792,7 +792,9 @@ anv_pipeline_compile_gs(const struct brw_compiler *compiler, static void anv_pipeline_link_fs(const struct brw_compiler *compiler, - struct anv_pipeline_stage *stage) + struct anv_pipeline_stage *stage, + const struct anv_subpass *subpass, + const VkPipelineMultisampleStateCreateInfo *ms_info) { unsigned num_rts = 0; const int max_rt = FRAG_RESULT_DATA7 - FRAG_RESULT_DATA0 + 1; @@ -843,12 +845,28 @@ anv_pipeline_link_fs(const struct brw_compiler *compiler, const unsigned rt = var->data.location - FRAG_RESULT_DATA0; if (rt >= MAX_RTS || !(stage->key.wm.color_outputs_valid & (1 << rt))) { - /* Unused or out-of-bounds, throw it away */ - deleted_output = true; - var->data.mode = nir_var_function_temp; - exec_node_remove(>node); - exec_list_push_tail(>locals, >node); - continue; + if (rt == 0 && ms_info && ms_info->alphaToCoverageEnable && + subpass->depth_stencil_attachment && rt_to_bindings[rt] == -1) { +/* Don't throw away the unused output, because we needed it for + * calculate correctly the alpha to coverage samples when there + * is no color buffer attached at location 0. + */ +rt_to_bindings[rt] = num_rts; +/* We need a null render target */ +rt_bindings[rt_to_bindings[rt]] = (struct anv_pipeline_binding) { + .set = ANV_DESCRIPTOR_SET_COLOR_ATTACHMENTS, + .binding = 0, + .index = UINT32_MAX, +}; +num_rts++; + } else { +/* Unused or out-of-bounds, throw it away */ +deleted_output = true; +var->data.mode = nir_var_function_temp; +exec_node_remove(>node); +exec_list_push_tail(>locals, >node); +continue; + } } /* Give it the new location */ @@ -1075,7 +1093,8 @@ anv_pipeline_compile_graphics(struct anv_pipeline *pipeline, anv_pipeline_link_gs(compiler, [s], next_stage); break; case MESA_SHADER_FRAGMENT: - anv_pipeline_link_fs(compiler, [s]); + anv_pipeline_link_fs(compiler, [s], + pipeline->subpass, info->pMultisampleState); break; default: unreachable("Invalid graphics shader stage"); -- 2.19.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 109698] dri.pc contents invalid when built with meson
https://bugs.freedesktop.org/show_bug.cgi?id=109698 --- Comment #2 from Emil Velikov --- AFAICT there are two issues here. 1. the prefix wrongfully added to the custom variable 2. the custom variable can be a list of paths separated by colon, yet only the first one ends up in dri.pc Perhaps one process of prefixing (1) also broke the multiple paths (2)? -- You are receiving this mail because: You are the assignee for the bug. You are the QA Contact for the bug.___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v3] i965: fixed clamping in set_scissor_bits when the y is flipped
Calculating the scissor rectangle fields with the y flipped (0 on top) can generate negative values that will cause assertion failure later on as the scissor fields are all unsigned. We must clamp the bbox values again to make sure they don't exceed the fb_height. Also fixed a calculation error. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108999 v2: - I initially clamped the values inside the if (Y is flipped) case and I made a mistake in the calculation: the clamp of the bbox[2] should be a check if (bbox[2] >= fbheight) bbox[2] = fbheight - 1 instead and I shouldn't have changed the ScissorRectangleYMax calculation. As the fixed code is equivalent with using CLAMP instead of MAX2 at the top of the function when bbox[2] and bbox[3] are calculated, and the 2nd is more clear, I replaced it. (Nanley Chery) v3: - Reversed the CLAMP change in bbox[3] as the API guarantees that the viewport height is positive. (Nanley Chery) --- src/mesa/drivers/dri/i965/genX_state_upload.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c b/src/mesa/drivers/dri/i965/genX_state_upload.c index dcdfb3c9292..47f3741e673 100644 --- a/src/mesa/drivers/dri/i965/genX_state_upload.c +++ b/src/mesa/drivers/dri/i965/genX_state_upload.c @@ -2445,7 +2445,7 @@ set_scissor_bits(const struct gl_context *ctx, int i, bbox[0] = MAX2(ctx->ViewportArray[i].X, 0); bbox[1] = MIN2(bbox[0] + ctx->ViewportArray[i].Width, fb_width); - bbox[2] = MAX2(ctx->ViewportArray[i].Y, 0); + bbox[2] = CLAMP(ctx->ViewportArray[i].Y, 0, fb_height); bbox[3] = MIN2(bbox[2] + ctx->ViewportArray[i].Height, fb_height); _mesa_intersect_scissor_bounding_box(ctx, i, bbox); -- 2.20.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] glsl: Fix function return typechecking
Hi, On Thu, Feb 21, 2019 at 10:35 AM Tapani Pälli wrote: > Hi; > > On 2/11/19 6:46 PM, Oscar Blumberg wrote: > > apply_implicit_conversion only converts and check base types but we > > need actual type equality for function returns, otherwise you can > > return a vec2 from a function declared as returning a float. > > Do you have some test shader that hits this condition? It seems to me > that currently we will error out correctly if one tries to return vec2 > from function declared as returning a float. > > Yes, I believe it triggers in even the simplest case here, such as : float f() { return vec2(1.0); } anywhere in the shader. Does it fail to reproduce on your end ? (note that the declared glsl version must be recent enough that implicit conversion for function returns are enabled, I'm using 450 here). I believe most of the confusion comes from the name of the apply_implicit_conversion function, since it is mainly used for arithmetic operations for which, e.g., vec2+float is allowed. Because of that it only checks (and convert in place) base types without looking at element count for vector-like things. We can still use it to perform the conversion but it requires an additional check, hence the patch. That's my understanding of it anyway. > --- > > src/compiler/glsl/ast_to_hir.cpp | 3 ++- > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > diff --git a/src/compiler/glsl/ast_to_hir.cpp > b/src/compiler/glsl/ast_to_hir.cpp > > index 620153e6a34..6bf2910954f 100644 > > --- a/src/compiler/glsl/ast_to_hir.cpp > > +++ b/src/compiler/glsl/ast_to_hir.cpp > > @@ -6248,7 +6248,8 @@ ast_jump_statement::hir(exec_list *instructions, > > > > if (state->has_420pack()) { > > if > (!apply_implicit_conversion(state->current_function->return_type, > > - ret, state)) { > > + ret, state) > > + || (ret->type != > state->current_function->return_type)) { > > _mesa_glsl_error(& loc, state, > > "could not implicitly convert > return value " > > "to %s, in function `%s'", > > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2] anv: advertise 8 subpixel precision bits
On one side, when emitting 3DSTATE_SF, VertexSubPixelPrecisionSelect is used to select between 8 bit subpixel precision (value 0) or 4 bit subpixel precision (value 1). As this value is not set, means it is taking the value 0, so 8 bit are used. On the other side, in the Vulkan CTS tests, if the reference rasterizer, which uses 8 bit precision, as it is used to check what should be the expected value for the tests, is changed to use 4 bit as ANV was advertising so far, some of the tests will fail. So it seems ANV is actually using 8 bits. v2: explicitly set 3DSTATE_SF::VertexSubPixelPrecisionSelect (Jason) CC: Jason Ekstrand CC: Kenneth Graunke --- src/intel/vulkan/anv_device.c| 2 +- src/intel/vulkan/genX_pipeline.c | 1 + 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c index 3120865466a..95224407318 100644 --- a/src/intel/vulkan/anv_device.c +++ b/src/intel/vulkan/anv_device.c @@ -1095,7 +1095,7 @@ void anv_GetPhysicalDeviceProperties( 16 * devinfo->max_cs_threads, 16 * devinfo->max_cs_threads, }, - .subPixelPrecisionBits= 4 /* FIXME */, + .subPixelPrecisionBits= 8, .subTexelPrecisionBits= 4 /* FIXME */, .mipmapPrecisionBits = 4 /* FIXME */, .maxDrawIndexedIndexValue = UINT32_MAX, diff --git a/src/intel/vulkan/genX_pipeline.c b/src/intel/vulkan/genX_pipeline.c index 6255e5d83c5..b06036a6fc7 100644 --- a/src/intel/vulkan/genX_pipeline.c +++ b/src/intel/vulkan/genX_pipeline.c @@ -464,6 +464,7 @@ emit_rs_state(struct anv_pipeline *pipeline, sf.TriangleStripListProvokingVertexSelect = 0; sf.LineStripListProvokingVertexSelect = 0; sf.TriangleFanProvokingVertexSelect = 1; + sf.VertexSubPixelPrecisionSelect = 0; const struct brw_vue_prog_data *last_vue_prog_data = anv_pipeline_get_last_vue_prog_data(pipeline); -- 2.20.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] glsl: Fix function return typechecking
On 2/21/19 11:35 AM, Tapani Pälli wrote: Hi; On 2/11/19 6:46 PM, Oscar Blumberg wrote: apply_implicit_conversion only converts and check base types but we need actual type equality for function returns, otherwise you can return a vec2 from a function declared as returning a float. Do you have some test shader that hits this condition? It seems to me that currently we will error out correctly if one tries to return vec2 from function declared as returning a float. Ah forget that, I can see it can happen with ARB_shading_language_420pack! Reviewed-by: Tapani Pälli --- src/compiler/glsl/ast_to_hir.cpp | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/compiler/glsl/ast_to_hir.cpp b/src/compiler/glsl/ast_to_hir.cpp index 620153e6a34..6bf2910954f 100644 --- a/src/compiler/glsl/ast_to_hir.cpp +++ b/src/compiler/glsl/ast_to_hir.cpp @@ -6248,7 +6248,8 @@ ast_jump_statement::hir(exec_list *instructions, if (state->has_420pack()) { if (!apply_implicit_conversion(state->current_function->return_type, - ret, state)) { + ret, state) + || (ret->type != state->current_function->return_type)) { _mesa_glsl_error(& loc, state, "could not implicitly convert return value " "to %s, in function `%s'", ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 109711] [DXVK] Skyrim Special edition hangs
https://bugs.freedesktop.org/show_bug.cgi?id=109711 --- Comment #1 from Bas Nieuwenhuizen --- So is this an actual hang of the GPU or just a freeze for the game? (i.e. does it need a reboot to do anything again or can you just close the game?) Reading the DXVK bug I'm wondering which of the two it is. -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug.___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] glsl: Fix function return typechecking
Hi; On 2/11/19 6:46 PM, Oscar Blumberg wrote: apply_implicit_conversion only converts and check base types but we need actual type equality for function returns, otherwise you can return a vec2 from a function declared as returning a float. Do you have some test shader that hits this condition? It seems to me that currently we will error out correctly if one tries to return vec2 from function declared as returning a float. --- src/compiler/glsl/ast_to_hir.cpp | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/compiler/glsl/ast_to_hir.cpp b/src/compiler/glsl/ast_to_hir.cpp index 620153e6a34..6bf2910954f 100644 --- a/src/compiler/glsl/ast_to_hir.cpp +++ b/src/compiler/glsl/ast_to_hir.cpp @@ -6248,7 +6248,8 @@ ast_jump_statement::hir(exec_list *instructions, if (state->has_420pack()) { if (!apply_implicit_conversion(state->current_function->return_type, - ret, state)) { + ret, state) + || (ret->type != state->current_function->return_type)) { _mesa_glsl_error(& loc, state, "could not implicitly convert return value " "to %s, in function `%s'", ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 109711] [DXVK] Skyrim Special edition hangs
https://bugs.freedesktop.org/show_bug.cgi?id=109711 Bug ID: 109711 Summary: [DXVK] Skyrim Special edition hangs Product: Mesa Version: git Hardware: x86-64 (AMD64) OS: Linux (All) Status: NEW Severity: normal Priority: medium Component: Drivers/Vulkan/radeon Assignee: mesa-dev@lists.freedesktop.org Reporter: root@scoopta.ninja QA Contact: mesa-dev@lists.freedesktop.org Skyrim special edition hangs for me on the main menu with a Radeon R9 Fury using RADV. The corresponding DXVK issue can be found here https://github.com/doitsujin/dxvk/issues/927 The issue occurs using Mesa 18.3.3(LLVM 7.0.1), Mesa 19.0.0~rc4(LLVM 8), and Mesa 19.0.0~rc4(LLVM 9-svn). The build of 18.3.3(LLVM 7.0.1) came from the debian sid repositories, the 19.0.0~rc4(LLVM 8) was built by me linking against debian sid's libllvm8 and then 19.0.0~rc4(LLVM 9-svn) was built entirely from scratch. -- You are receiving this mail because: You are the assignee for the bug. You are the QA Contact for the bug.___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 109698] dri.pc contents invalid when built with meson
https://bugs.freedesktop.org/show_bug.cgi?id=109698 --- Comment #1 from Sergii Romantsov --- Hello, Jeremy. Looks like during meson configuration you are also using --prefix=/usr So if at this moment as workaround option dri-drivers-path will exclude a 'prefix' from path it should work, like: -Ddri-drivers-path=/lib/$(DEB_HOST_MULTIARCH)/dri Also patch to mesa proposed: https://gitlab.freedesktop.org/mesa/mesa/merge_requests/285 Could you, please, try if it fixes issue? -- You are receiving this mail because: You are the assignee for the bug. You are the QA Contact for the bug.___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev