Re: [Mesa-dev] [PATCH] mesa/gles: adjust internal format in glTexSubImage2D error checks
Forgot to add: Cc: "17.3"On 11/21/2017 08:35 AM, Tapani Pälli wrote: When floating point textures are created on OpenGL ES 2.0, driver is free to choose used internal format. Mesa makes this decision in adjust_for_oes_float_texture. Error checking for glTexImage2D properly checks that sized formats are not used. We use same error checking path for glTexSubImage2D (since there is lot of overlap), however since those checks include internalFormat checks, we need to pass original internalFormat passed by the client. Patch adds oes_float_internal_format that does reverse adjust_for_oes_float_texture to get that format. Fixes following test failure: ES2-CTS.gtf.GL2ExtensionTests.texture_float.texture_float (when running test with MESA_GLES_VERSION_OVERRIDE=2.0) Signed-off-by: Tapani Pälli Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103227 --- src/mesa/main/teximage.c | 56 +++- 1 file changed, 55 insertions(+), 1 deletion(-) diff --git a/src/mesa/main/teximage.c b/src/mesa/main/teximage.c index 4ec6148bf4..d7643a52df 100644 --- a/src/mesa/main/teximage.c +++ b/src/mesa/main/teximage.c @@ -122,6 +122,56 @@ adjust_for_oes_float_texture(const struct gl_context *ctx, return format; } +/** + * Returns a corresponding base format for a given internal floating point + * format as specifed by OES_texture_float. + */ +static GLenum +oes_float_internal_format(const struct gl_context *ctx, + GLenum format, GLenum type) +{ + switch (type) { + case GL_FLOAT: + if (ctx->Extensions.OES_texture_float) { + switch (format) { + case GL_RGBA32F: +return GL_RGBA; + case GL_RGB32F: +return GL_RGB; + case GL_ALPHA32F_ARB: +return GL_ALPHA; + case GL_LUMINANCE32F_ARB: +return GL_LUMINANCE; + case GL_LUMINANCE_ALPHA32F_ARB: +return GL_LUMINANCE_ALPHA; + default: +break; + } + } + break; + + case GL_HALF_FLOAT_OES: + if (ctx->Extensions.OES_texture_half_float) { + switch (format) { + case GL_RGBA16F: +return GL_RGBA; + case GL_RGB16F: +return GL_RGB; + case GL_ALPHA16F_ARB: +return GL_ALPHA; + case GL_LUMINANCE16F_ARB: +return GL_LUMINANCE; + case GL_LUMINANCE_ALPHA16F_ARB: +return GL_LUMINANCE_ALPHA; + default: +break; + } + } + break; + } + return format; +} + /** * Install gl_texture_image in a gl_texture_object according to the target @@ -2155,6 +2205,10 @@ texsubimage_error_check(struct gl_context *ctx, GLuint dimensions, return GL_TRUE; } + GLenum internalFormat = _mesa_is_gles(ctx) ? + oes_float_internal_format(ctx, texImage->InternalFormat, type) : + texImage->InternalFormat; + /* OpenGL ES 1.x and OpenGL ES 2.0 impose additional restrictions on the * combinations of format, internalFormat, and type that can be used. * Formats and types that require additional extensions (e.g., GL_FLOAT @@ -2162,7 +2216,7 @@ texsubimage_error_check(struct gl_context *ctx, GLuint dimensions, */ if (_mesa_is_gles(ctx) && texture_format_error_check_gles(ctx, format, type, - texImage->InternalFormat, + internalFormat, dimensions, callerName)) { return GL_TRUE; } ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 97852] Unreal Engine corrupted preview viewport
https://bugs.freedesktop.org/show_bug.cgi?id=97852 --- Comment #12 from Tapani Pälli--- Created attachment 135622 --> https://bugs.freedesktop.org/attachment.cgi?id=135622=edit wokaround since we now have another case (Observer_) here's a workaround suggestion. -- You are receiving this mail because: You are the assignee for the bug.___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] mesa/gles: adjust internal format in glTexSubImage2D error checks
When floating point textures are created on OpenGL ES 2.0, driver is free to choose used internal format. Mesa makes this decision in adjust_for_oes_float_texture. Error checking for glTexImage2D properly checks that sized formats are not used. We use same error checking path for glTexSubImage2D (since there is lot of overlap), however since those checks include internalFormat checks, we need to pass original internalFormat passed by the client. Patch adds oes_float_internal_format that does reverse adjust_for_oes_float_texture to get that format. Fixes following test failure: ES2-CTS.gtf.GL2ExtensionTests.texture_float.texture_float (when running test with MESA_GLES_VERSION_OVERRIDE=2.0) Signed-off-by: Tapani PälliBugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103227 --- src/mesa/main/teximage.c | 56 +++- 1 file changed, 55 insertions(+), 1 deletion(-) diff --git a/src/mesa/main/teximage.c b/src/mesa/main/teximage.c index 4ec6148bf4..d7643a52df 100644 --- a/src/mesa/main/teximage.c +++ b/src/mesa/main/teximage.c @@ -122,6 +122,56 @@ adjust_for_oes_float_texture(const struct gl_context *ctx, return format; } +/** + * Returns a corresponding base format for a given internal floating point + * format as specifed by OES_texture_float. + */ +static GLenum +oes_float_internal_format(const struct gl_context *ctx, + GLenum format, GLenum type) +{ + switch (type) { + case GL_FLOAT: + if (ctx->Extensions.OES_texture_float) { + switch (format) { + case GL_RGBA32F: +return GL_RGBA; + case GL_RGB32F: +return GL_RGB; + case GL_ALPHA32F_ARB: +return GL_ALPHA; + case GL_LUMINANCE32F_ARB: +return GL_LUMINANCE; + case GL_LUMINANCE_ALPHA32F_ARB: +return GL_LUMINANCE_ALPHA; + default: +break; + } + } + break; + + case GL_HALF_FLOAT_OES: + if (ctx->Extensions.OES_texture_half_float) { + switch (format) { + case GL_RGBA16F: +return GL_RGBA; + case GL_RGB16F: +return GL_RGB; + case GL_ALPHA16F_ARB: +return GL_ALPHA; + case GL_LUMINANCE16F_ARB: +return GL_LUMINANCE; + case GL_LUMINANCE_ALPHA16F_ARB: +return GL_LUMINANCE_ALPHA; + default: +break; + } + } + break; + } + return format; +} + /** * Install gl_texture_image in a gl_texture_object according to the target @@ -2155,6 +2205,10 @@ texsubimage_error_check(struct gl_context *ctx, GLuint dimensions, return GL_TRUE; } + GLenum internalFormat = _mesa_is_gles(ctx) ? + oes_float_internal_format(ctx, texImage->InternalFormat, type) : + texImage->InternalFormat; + /* OpenGL ES 1.x and OpenGL ES 2.0 impose additional restrictions on the * combinations of format, internalFormat, and type that can be used. * Formats and types that require additional extensions (e.g., GL_FLOAT @@ -2162,7 +2216,7 @@ texsubimage_error_check(struct gl_context *ctx, GLuint dimensions, */ if (_mesa_is_gles(ctx) && texture_format_error_check_gles(ctx, format, type, - texImage->InternalFormat, + internalFormat, dimensions, callerName)) { return GL_TRUE; } -- 2.14.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/2] spirv: fix bug when OpSpecConstantOp calls a conversion
In that case, nir_eval_const_opcode() will evaluate the conversion but as it was using destination's bit_size, the resulting value was just a cast of the source constant value. By passing the source's bit size, it does the conversion properly. Fixes: dEQP-VK.spirv_assembly.instruction.*.opspecconstantop.*convert* Signed-off-by: Samuel Iglesias Gonsálvez--- src/compiler/spirv/spirv_to_nir.c | 31 ++- 1 file changed, 26 insertions(+), 5 deletions(-) diff --git a/src/compiler/spirv/spirv_to_nir.c b/src/compiler/spirv/spirv_to_nir.c index 2cc3c275ea9..ffbda4f1946 100644 --- a/src/compiler/spirv/spirv_to_nir.c +++ b/src/compiler/spirv/spirv_to_nir.c @@ -1348,14 +1348,35 @@ vtn_handle_constant(struct vtn_builder *b, SpvOp opcode, bool swap; nir_alu_type dst_alu_type = nir_get_nir_type_for_glsl_type(val->const_type); nir_alu_type src_alu_type = dst_alu_type; - nir_op op = vtn_nir_alu_op_for_spirv_opcode(opcode, , src_alu_type, dst_alu_type); - unsigned num_components = glsl_get_vector_elements(val->const_type); - unsigned bit_size = -glsl_get_bit_size(val->const_type); + unsigned bit_size; - nir_const_value src[4]; assert(count <= 7); + + switch (opcode) { + case SpvOpUConvert: + case SpvOpConvertFToU: + case SpvOpConvertFToS: + case SpvOpConvertSToF: + case SpvOpConvertUToF: + case SpvOpSConvert: + case SpvOpFConvert: +/* We have a source in a conversion */ +src_alu_type = + nir_get_nir_type_for_glsl_type( + vtn_value(b, w[4], vtn_value_type_constant)->const_type); +/* We use the bitsize of the conversion source to evaluate the opcode later */ +bit_size = glsl_get_bit_size( + vtn_value(b, w[4], vtn_value_type_constant)->const_type); +break; + default: +bit_size = glsl_get_bit_size(val->const_type); + }; + + + nir_op op = vtn_nir_alu_op_for_spirv_opcode(opcode, , src_alu_type, dst_alu_type); + nir_const_value src[4]; + for (unsigned i = 0; i < count - 4; i++) { nir_constant *c = vtn_value(b, w[4 + i], vtn_value_type_constant)->constant; -- 2.14.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/2] spirv: allow specialization constants with bitsize different than 32 bits
Signed-off-by: Samuel Iglesias Gonsálvez--- src/compiler/spirv/spirv_to_nir.c | 1 - 1 file changed, 1 deletion(-) diff --git a/src/compiler/spirv/spirv_to_nir.c b/src/compiler/spirv/spirv_to_nir.c index 027efab88d7..2cc3c275ea9 100644 --- a/src/compiler/spirv/spirv_to_nir.c +++ b/src/compiler/spirv/spirv_to_nir.c @@ -1361,7 +1361,6 @@ vtn_handle_constant(struct vtn_builder *b, SpvOp opcode, vtn_value(b, w[4 + i], vtn_value_type_constant)->constant; unsigned j = swap ? 1 - i : i; -assert(bit_size == 32); src[j] = c->values[0]; } -- 2.14.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 00/28] vulkan/wsi: Rework WSI to look a lot more like a layer
On 11/16/2017 01:28 PM, Jason Ekstrand wrote: This patch series is the combined brain-child of Dave and myself. The objective is to rewrite Vulkan WSI to look as much like a layer as possible and to reduce the driver <-> WSI interface. We try very hard to have as many of the WSI details as possible in common code and to use standard Vulkan interfaces for everything. Among other things, this means that prime support is now implemented in an entirely driver-agnostic way and the driver doesn't even know it's happening. As a side-effect anv now has prime support. Eventually, someone could pull what's left out into a proper layer and we could drop WSI support from our drivers entirely. There are a fiew pieces of work that would be required to do this: 1) Write all of the annoying layer bits. There are some short-cuts that we can take because we're not in a layer and those will have to go. 2) Define a VK_MESA_legacy_swapchain_image extension to replace the hack introduced in patch 8. 3) It looks like modifiers support will land before the official Vulkan extensions get finished. It will have to be ported to the official extensions. 4) Figure out what to do about the fence in AcquireNextImage. In a future world of explicit synchronization, we can just import the sync_file from X or the Wayland compositor, but with implicit sync like we have today, it's a bit harder. Right now, the helper in wsi_common does nothing with it and trusts the caller to handle it. The two drivers handle this differently today. In anv, we do a dummy QueueSubmit to trigger the fence while radv triggers it a bit more manually. In both cases, we trigger the fence immediately and trust in the GEM's implicit synchronization to sort things out for us. We can't use the anv method as directly with radv because no queue is passed in so we don't know what queue to use in the dummy QueueSubmit. (In ANV, we only have the one queue so that isn't a problem.) I don't have detailed feedback, but I read through the series and this is pretty cool. Glad things are starting to generalize across the driver stacks. I'm optimistic that things have gotten to the point where we'll never have to write a separate Wayland WSI for our driver. Is it an accurate observation to say there aren't any Vulkan API bits missing (other than stuff in the pipeline like modifiers/dma-buf) to allow the full-layer solution? Hopefully we haven't missed anything in the external_* extensions at this point. Thanks, -James Dave, I tried to pull patches from your series where practical but, because we did things in a different order, it frequently wasn't. If you want to claim credit for any of these patches, just say so and I'll --reset-author on them. The series can be found on freedesktop.org here: https://cgit.freedesktop.org/~jekstrand/mesa/log/?h=wip/vulkan-wsi-prime Cc: Dave AirlieCc: Daniel Stone Cc: Chad Versace Cc: James Jones Daniel Stone (1): vulkan/wsi: Add a wsi_image structure Dave Airlie (4): vulkan/wsi: use function ptr definitions from the spec. radv/wsi: drop allocate memory special case radv/wsi: Move the guts of QueuePresent to wsi common vulkan/wsi: move swapchain create/destroy to common code Jason Ekstrand (23): vulkan/wsi/x11: Handle the geometry check earlier in create_swapchain vulkan/wsi: Add a wsi_device_init function vulkan/wsi: Add wsi_swapchain_init/finish functions vulkan/wsi: Implement prime in a completely generic way anv/image: Add a return value to bind_memory_plane vulkan/wsi: Add a mock image creation extension anv/image: Implement the wsi "extension" radv/image: Implement the wsi "extension" vulkan/wsi: Do image creation in common code vulkan/wsi: Add a WSI_FROM_HANDLE macro vulkan/wsi: Refactor result handling in queue_present vulkan/wsi: Only wait on semaphores on the first swapchain vulkan/wsi: Set a proper pWaitDstStageMask on the dummy submit anv/wsi: Use the common QueuePresent code anv/wsi: Enable prime support vulkan/wsi: Move get_images into common code vulkan/wsi: Move prime blitting into queue_present vulkan/wsi: Add a helper for AcquireNextImage vulkan/wsi: Move wsi_swapchain to wsi_common_private.h vulkan/wsi: Drop the can_handle_different_gpu parameter from get_support vulkan/wsi: Add wrappers for all of the surface queries vulkan/wsi: Drop some unneeded cruft from the API vulkan/wsi: Initialize individual WSI interfaces in wsi_device_init src/amd/vulkan/radv_device.c| 18 +- src/amd/vulkan/radv_image.c | 15 +- src/amd/vulkan/radv_private.h | 10 - src/amd/vulkan/radv_wsi.c | 472 +++--- src/intel/vulkan/anv_image.c| 71 +++- src/intel/vulkan/anv_private.h
Re: [Mesa-dev] [PATCH v2 15/17] util: Add Mesa ARB_get_program_binary helper functions
On 21/11/17 15:34, Jordan Justen wrote: On 2017-11-20 19:00:28, Timothy Arceri wrote: I don't think this belongs in util. disk cache is generic and used by both Vulkan and Opengl drivers. These function are specific to one OpenGL extension and have a dependency on OpenGL types. I think you should just move this inside the src/mesa, I can't see this being reused anywhere else. Maybe src/mesa/program ? My goal is just to put it somewhere that all GL/GLES drivers in Mesa will use it. Since we have just the one binary_format value defined, we need to avoid different drivers using the same binary_format enum in different ways. I think I almost put it under src/mesa/main, but I guess src/mesa/program works too. Do you prefer either of those? Maybe it is so simple now that it could just be added into main/shaderobj.c? That file is already a bit of a dumping ground. I like it in the file you have now. Moving it to src/mesa/program makes sense to me. Are any of these ideas likely to push you over the edge for a Reviewed-by? :) Yeah I've skimmed everything else and it seems fine I just need to give it a better look before sending a r-b. I also thought Nicolai might want to take a look over these last few since he gave feedback on these in v1. I'll try take a better look later tonight or tomorrow morning. I'm keen to seem them land also so I can hook up Gallium support. Thanks, -Jordan On 21/11/17 09:27, Jordan Justen wrote: Signed-off-by: Jordan Justen--- src/util/Makefile.sources | 2 + src/util/meson.build | 2 + src/util/program_binary.c | 149 ++ src/util/program_binary.h | 70 ++ 4 files changed, 223 insertions(+) create mode 100644 src/util/program_binary.c create mode 100644 src/util/program_binary.h diff --git a/src/util/Makefile.sources b/src/util/Makefile.sources index 104ecae8ed3..e06f631128a 100644 --- a/src/util/Makefile.sources +++ b/src/util/Makefile.sources @@ -24,6 +24,8 @@ MESA_UTIL_FILES := \ mesa-sha1.h \ os_time.c \ os_time.h \ + program_binary.c \ + program_binary.h \ sha1/sha1.c \ sha1/sha1.h \ ralloc.c \ diff --git a/src/util/meson.build b/src/util/meson.build index ac86c9e111e..496da5c4328 100644 --- a/src/util/meson.build +++ b/src/util/meson.build @@ -48,6 +48,8 @@ files_mesa_util = files( 'mesa-sha1.h', 'os_time.c', 'os_time.h', + 'program_binary.c', + 'program_binary.h', 'sha1/sha1.c', 'sha1/sha1.h', 'ralloc.c', diff --git a/src/util/program_binary.c b/src/util/program_binary.c new file mode 100644 index 000..e10b9f17862 --- /dev/null +++ b/src/util/program_binary.c @@ -0,0 +1,149 @@ +/* + * Mesa 3-D graphics library + * + * Copyright (c) 2017 Intel Corporation + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included + * in all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS + * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. + */ + +/** + * \file program_binary.c + * + * Helper functions for serializing a binary program. + */ + + +#include "main/mtypes.h" +#include "crc32.h" +#include "program_binary.h" + +/** + * Mesa supports one binary format, but it must differentiate between formats + * produced by different drivers and different Mesa versions. + * + * Mesa uses a uint32_t value to specify an internal format. The only format + * defined has one uint32_t value of 0, followed by 20 bytes specifying a sha1 + * that uniquely identifies the Mesa driver type and version. + */ + +struct program_binary_header { + /* If internal_format is 0, it must be followed by the 20 byte sha1 that +* identifies the Mesa driver and version supported. If we want to support +* something besides a sha1, then a new internal_format value can be added. +*/ + uint32_t internal_format; + uint8_t sha1[20]; + /* Fields following sha1 can be changed since the sha1 will guarantee that +* the binary only works with the same Mesa version. +*/
Re: [Mesa-dev] NIR linking optimisations for Gallium drivers
On 21/11/17 15:35, Dieter Nützel wrote: Hello Tim, with both of your latest series I get this: CC state_tracker/st_program.lo state_tracker/st_program.c: In function ‘st_translate_geometry_program’: state_tracker/st_program.c:1435:25: error: implicit declaration of function ‘st_glsl_to_nir’; did you mean ‘st_link_nir’? [-Werror=implicit-function-declaration] nir_shader *nir = st_glsl_to_nir(st, >Base, stgp->shader_program, ^~ st_link_nir state_tracker/st_program.c:1435:25: warning: initialization makes pointer from integer without a cast [-Wint-conversion] cc1: some warnings being treated as errors make[5]: *** [Makefile:2993: state_tracker/st_program.lo] Fehler 1 Greetings, Dieter Oh sorry I think I need to push a patch that was reviewed recently. I'll push it shortly. Am 21.11.2017 04:37, schrieb Timothy Arceri: This series depends on: https://patchwork.freedesktop.org/series/34131/ Tested without regression on radeonsi. Shader-db results (still limited to GLSL 1.40 so not to interesting): Totals from affected shaders: SGPRS: 29440 -> 30464 (3.48 %) VGPRS: 19620 -> 19584 (-0.18 %) Spilled SGPRs: 131 -> 129 (-1.53 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 749312 -> 749548 (0.03 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 4751 -> 4767 (0.34 %) Wait states: 0 -> 0 (0.00 %) The middle patches just move things around so that we can make use of the linking opts, hopefully having them split up this much should make regression testing easy for the existing gallium nir drivers. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [AppVeyor] mesa master #6216 failed
Thanks Ilia. >-Original Message- >From: ibmir...@gmail.com [mailto:ibmir...@gmail.com] On Behalf Of Ilia Mirkin >Sent: Tuesday, November 21, 2017 10:11 AM >To: Marathe, Yogesh>Cc: mesa-dev@lists.freedesktop.org; Muthukumar, Aravindan > >Subject: Re: [Mesa-dev] [AppVeyor] mesa master #6216 failed > >Build started >git clone -q --depth=100 --branch=master >git://anongit.freedesktop.org/mesa/mesa C:\projects\mesa >warning: Could not find remote branch master to clone. >fatal: Remote branch master not found in upstream origin Command exited with >code 128 > >Not your commit's fault. > >On Mon, Nov 20, 2017 at 11:33 PM, Marathe, Yogesh > wrote: >> Don't know if this is a concern. The patch was built & tested with intel >> mesa CI >and locally. Does this need attention or can be ignored? >> Please advise. >> >> -Yogesh. >> >>>From: mesa-dev [mailto:mesa-dev-boun...@lists.freedesktop.org] On >>>Behalf Of AppVeyor >>>Sent: Tuesday, November 21, 2017 4:23 AM >>>To: mesa-dev@lists.freedesktop.org >>>Subject: [Mesa-dev] [AppVeyor] mesa master #6216 failed >>> >>>Build mesa 6216 failed >>>Commit 971b3c019b by Aravindan Muthukumar on 11/9/2017 5:45 AM: >>>i965: Optimize bucket index calculation\n\nReducing Bucket index >>>calculation to O(1).\n\nThis algorithm calculates the index using >>>matrix method. Assuming\nPAGE_SIZE is 4096, matrix arrangement is as >>>below:\n\n 1*4096 2*4096 3*4096 4*4096\n 5*4096 6*4096 7*4096 8*4096\n >>>10*4096 12*4096 14*4096 16*4096\n 20*4096 24*4096 28*4096 32*4096\n >... >>>... ... ...\n ... ... ... ...\n ... ... ... max_cache_size\n\nFrom >>>this matrix its clearly seen that every row follows the below way:\n\n ... >>>... ... n\n n+(1/4)n n+(1/2)n n+(3/4)n 2n\n\nRow is calculated as >>>log2(size/PAGE_SIZE) Column is calculated as\nconverting the >>>difference between the elements to fit into power size of\ntwo and >>>indexing it.\n\nFinal Index is (row*4)+(col-1)\n\nTested with Intel >>>Mesa CI.\n\nImproves performance of 3DMark on BXT by 0.705966% +/- >>>0.229767% >>>(n=20)\n\nv4: Review comments on style and code comments implemented >>>(Ian).\nv3: Review comments implemented (Ian).\nv2: Review comments >>>implemented (Jason).\n\nSigned-off-by: Aravindan Muthukumar >>> \nSigned-off-by: Kedar Karanje >>> \nReviewed-by: Yogesh Marathe >>> \nSigned-off-by: Ian Romanick >>> Configure your notification preferences >>> >> ___ >> mesa-dev mailing list >> mesa-dev@lists.freedesktop.org >> https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [AppVeyor] mesa master #6216 failed
Build started git clone -q --depth=100 --branch=master git://anongit.freedesktop.org/mesa/mesa C:\projects\mesa warning: Could not find remote branch master to clone. fatal: Remote branch master not found in upstream origin Command exited with code 128 Not your commit's fault. On Mon, Nov 20, 2017 at 11:33 PM, Marathe, Yogeshwrote: > Don't know if this is a concern. The patch was built & tested with intel mesa > CI and locally. Does this need attention or can be ignored? > Please advise. > > -Yogesh. > >>From: mesa-dev [mailto:mesa-dev-boun...@lists.freedesktop.org] On >>Behalf Of AppVeyor >>Sent: Tuesday, November 21, 2017 4:23 AM >>To: mesa-dev@lists.freedesktop.org >>Subject: [Mesa-dev] [AppVeyor] mesa master #6216 failed >> >>Build mesa 6216 failed >>Commit 971b3c019b by Aravindan Muthukumar on 11/9/2017 5:45 AM: >>i965: Optimize bucket index calculation\n\nReducing Bucket index >>calculation to O(1).\n\nThis algorithm calculates the index using >>matrix method. Assuming\nPAGE_SIZE is 4096, matrix arrangement is as >>below:\n\n 1*4096 2*4096 3*4096 4*4096\n 5*4096 6*4096 7*4096 8*4096\n >>10*4096 12*4096 14*4096 16*4096\n 20*4096 24*4096 28*4096 32*4096\n ... >>... ... ...\n ... ... ... ...\n ... ... ... max_cache_size\n\nFrom this >>matrix its clearly seen that every row follows the below way:\n\n ... >>... ... n\n n+(1/4)n n+(1/2)n n+(3/4)n 2n\n\nRow is calculated as >>log2(size/PAGE_SIZE) Column is calculated as\nconverting the difference >>between the elements to fit into power size of\ntwo and indexing >>it.\n\nFinal Index is (row*4)+(col-1)\n\nTested with Intel Mesa >>CI.\n\nImproves performance of 3DMark on BXT by 0.705966% +/- 0.229767% >>(n=20)\n\nv4: Review comments on style and code comments implemented >>(Ian).\nv3: Review comments implemented (Ian).\nv2: Review comments >>implemented (Jason).\n\nSigned-off-by: Aravindan Muthukumar >> \nSigned-off-by: Kedar Karanje >> \nReviewed-by: Yogesh Marathe >> \nSigned-off-by: Ian Romanick >> Configure your notification preferences >> > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] NIR linking optimisations for Gallium drivers
Hello Tim, with both of your latest series I get this: CC state_tracker/st_program.lo state_tracker/st_program.c: In function ‘st_translate_geometry_program’: state_tracker/st_program.c:1435:25: error: implicit declaration of function ‘st_glsl_to_nir’; did you mean ‘st_link_nir’? [-Werror=implicit-function-declaration] nir_shader *nir = st_glsl_to_nir(st, >Base, stgp->shader_program, ^~ st_link_nir state_tracker/st_program.c:1435:25: warning: initialization makes pointer from integer without a cast [-Wint-conversion] cc1: some warnings being treated as errors make[5]: *** [Makefile:2993: state_tracker/st_program.lo] Fehler 1 Greetings, Dieter Am 21.11.2017 04:37, schrieb Timothy Arceri: This series depends on: https://patchwork.freedesktop.org/series/34131/ Tested without regression on radeonsi. Shader-db results (still limited to GLSL 1.40 so not to interesting): Totals from affected shaders: SGPRS: 29440 -> 30464 (3.48 %) VGPRS: 19620 -> 19584 (-0.18 %) Spilled SGPRs: 131 -> 129 (-1.53 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 749312 -> 749548 (0.03 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 4751 -> 4767 (0.34 %) Wait states: 0 -> 0 (0.00 %) The middle patches just move things around so that we can make use of the linking opts, hopefully having them split up this much should make regression testing easy for the existing gallium nir drivers. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 15/17] util: Add Mesa ARB_get_program_binary helper functions
On 2017-11-20 19:00:28, Timothy Arceri wrote: > I don't think this belongs in util. disk cache is generic and used by > both Vulkan and Opengl drivers. These function are specific to one > OpenGL extension and have a dependency on OpenGL types. I think you > should just move this inside the src/mesa, I can't see this being reused > anywhere else. Maybe src/mesa/program ? My goal is just to put it somewhere that all GL/GLES drivers in Mesa will use it. Since we have just the one binary_format value defined, we need to avoid different drivers using the same binary_format enum in different ways. I think I almost put it under src/mesa/main, but I guess src/mesa/program works too. Do you prefer either of those? Maybe it is so simple now that it could just be added into main/shaderobj.c? Are any of these ideas likely to push you over the edge for a Reviewed-by? :) Thanks, -Jordan > > On 21/11/17 09:27, Jordan Justen wrote: > > Signed-off-by: Jordan Justen> > --- > > src/util/Makefile.sources | 2 + > > src/util/meson.build | 2 + > > src/util/program_binary.c | 149 > > ++ > > src/util/program_binary.h | 70 ++ > > 4 files changed, 223 insertions(+) > > create mode 100644 src/util/program_binary.c > > create mode 100644 src/util/program_binary.h > > > > diff --git a/src/util/Makefile.sources b/src/util/Makefile.sources > > index 104ecae8ed3..e06f631128a 100644 > > --- a/src/util/Makefile.sources > > +++ b/src/util/Makefile.sources > > @@ -24,6 +24,8 @@ MESA_UTIL_FILES := \ > > mesa-sha1.h \ > > os_time.c \ > > os_time.h \ > > + program_binary.c \ > > + program_binary.h \ > > sha1/sha1.c \ > > sha1/sha1.h \ > > ralloc.c \ > > diff --git a/src/util/meson.build b/src/util/meson.build > > index ac86c9e111e..496da5c4328 100644 > > --- a/src/util/meson.build > > +++ b/src/util/meson.build > > @@ -48,6 +48,8 @@ files_mesa_util = files( > > 'mesa-sha1.h', > > 'os_time.c', > > 'os_time.h', > > + 'program_binary.c', > > + 'program_binary.h', > > 'sha1/sha1.c', > > 'sha1/sha1.h', > > 'ralloc.c', > > diff --git a/src/util/program_binary.c b/src/util/program_binary.c > > new file mode 100644 > > index 000..e10b9f17862 > > --- /dev/null > > +++ b/src/util/program_binary.c > > @@ -0,0 +1,149 @@ > > +/* > > + * Mesa 3-D graphics library > > + * > > + * Copyright (c) 2017 Intel Corporation > > + * > > + * Permission is hereby granted, free of charge, to any person obtaining a > > + * copy of this software and associated documentation files (the > > "Software"), > > + * to deal in the Software without restriction, including without > > limitation > > + * the rights to use, copy, modify, merge, publish, distribute, sublicense, > > + * and/or sell copies of the Software, and to permit persons to whom the > > + * Software is furnished to do so, subject to the following conditions: > > + * > > + * The above copyright notice and this permission notice shall be included > > + * in all copies or substantial portions of the Software. > > + * > > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS > > + * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF > > MERCHANTABILITY, > > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL > > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR > > + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, > > + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR > > + * OTHER DEALINGS IN THE SOFTWARE. > > + */ > > + > > +/** > > + * \file program_binary.c > > + * > > + * Helper functions for serializing a binary program. > > + */ > > + > > + > > +#include "main/mtypes.h" > > +#include "crc32.h" > > +#include "program_binary.h" > > + > > +/** > > + * Mesa supports one binary format, but it must differentiate between > > formats > > + * produced by different drivers and different Mesa versions. > > + * > > + * Mesa uses a uint32_t value to specify an internal format. The only > > format > > + * defined has one uint32_t value of 0, followed by 20 bytes specifying a > > sha1 > > + * that uniquely identifies the Mesa driver type and version. > > + */ > > + > > +struct program_binary_header { > > + /* If internal_format is 0, it must be followed by the 20 byte sha1 that > > +* identifies the Mesa driver and version supported. If we want to > > support > > +* something besides a sha1, then a new internal_format value can be > > added. > > +*/ > > + uint32_t internal_format; > > + uint8_t sha1[20]; > > + /* Fields following sha1 can be changed since the sha1 will guarantee > > that > > +* the binary only works with the same Mesa version. > > +*/ > > + uint32_t size; > > + uint32_t crc32; > > +}; > > + > > +unsigned > >
Re: [Mesa-dev] [AppVeyor] mesa master #6216 failed
Don't know if this is a concern. The patch was built & tested with intel mesa CI and locally. Does this need attention or can be ignored? Please advise. -Yogesh. >From: mesa-dev [mailto:mesa-dev-boun...@lists.freedesktop.org] On >Behalf Of AppVeyor >Sent: Tuesday, November 21, 2017 4:23 AM >To: mesa-dev@lists.freedesktop.org >Subject: [Mesa-dev] [AppVeyor] mesa master #6216 failed > >Build mesa 6216 failed >Commit 971b3c019b by Aravindan Muthukumar on 11/9/2017 5:45 AM: >i965: Optimize bucket index calculation\n\nReducing Bucket index >calculation to O(1).\n\nThis algorithm calculates the index using >matrix method. Assuming\nPAGE_SIZE is 4096, matrix arrangement is as >below:\n\n 1*4096 2*4096 3*4096 4*4096\n 5*4096 6*4096 7*4096 8*4096\n >10*4096 12*4096 14*4096 16*4096\n 20*4096 24*4096 28*4096 32*4096\n ... >... ... ...\n ... ... ... ...\n ... ... ... max_cache_size\n\nFrom this >matrix its clearly seen that every row follows the below way:\n\n ... >... ... n\n n+(1/4)n n+(1/2)n n+(3/4)n 2n\n\nRow is calculated as >log2(size/PAGE_SIZE) Column is calculated as\nconverting the difference >between the elements to fit into power size of\ntwo and indexing >it.\n\nFinal Index is (row*4)+(col-1)\n\nTested with Intel Mesa >CI.\n\nImproves performance of 3DMark on BXT by 0.705966% +/- 0.229767% >(n=20)\n\nv4: Review comments on style and code comments implemented >(Ian).\nv3: Review comments implemented (Ian).\nv2: Review comments >implemented (Jason).\n\nSigned-off-by: Aravindan Muthukumar >\nSigned-off-by: Kedar Karanje > \nReviewed-by: Yogesh Marathe > \nSigned-off-by: Ian Romanick > Configure your notification preferences > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 14/14] st/glsl_to_nir: enable NIR link time opts
On 21/11/17 15:02, Ilia Mirkin wrote: Should this work out of the box for freedreno and vc4 as well? Or will those need changes like radeonsi did? I'm hopeful they shouldn't need any big changes as the radeonsi changes are due to expecting everything to be a vec4 because it code shares with the tgsi backend. However I have done 0 testing so any testing on those drivers is appreciated. On Mon, Nov 20, 2017 at 10:37 PM, Timothy Arceriwrote: --- src/mesa/state_tracker/st_glsl_to_nir.cpp | 77 +++ 1 file changed, 77 insertions(+) diff --git a/src/mesa/state_tracker/st_glsl_to_nir.cpp b/src/mesa/state_tracker/st_glsl_to_nir.cpp index 27e36bd306..53e07a78e2 100644 --- a/src/mesa/state_tracker/st_glsl_to_nir.cpp +++ b/src/mesa/state_tracker/st_glsl_to_nir.cpp @@ -468,36 +468,107 @@ st_nir_get_mesa_program(struct gl_context *ctx, NIR_PASS_V(nir, nir_lower_io_to_temporaries, nir_shader_get_entrypoint(nir), true, true); NIR_PASS_V(nir, nir_lower_global_vars_to_local); NIR_PASS_V(nir, nir_split_var_copies); NIR_PASS_V(nir, nir_lower_var_copies); return nir; } +static void +st_nir_link_shaders(nir_shader **producer, nir_shader **consumer) +{ + nir_lower_io_arrays_to_elements(*producer, *consumer); + + NIR_PASS_V(*producer, nir_remove_dead_variables, nir_var_shader_out); + NIR_PASS_V(*consumer, nir_remove_dead_variables, nir_var_shader_in); + + if (nir_remove_unused_varyings(*producer, *consumer)) { + NIR_PASS_V(*producer, nir_lower_global_vars_to_local); + NIR_PASS_V(*consumer, nir_lower_global_vars_to_local); + + /* The backend might not be able to handle indirects on + * temporaries so we need to lower indirects on any of the + * varyings we have demoted here. + * + * TODO: radeonsi shouldn't need to do this, however LLVM isn't + * currently smart enough to handle indirects without causing excess + * spilling causing the gpu to hang. + * + * See the following thread for more details of the problem: + * https://lists.freedesktop.org/archives/mesa-dev/2017-July/162106.html + */ + nir_variable_mode indirect_mask = nir_var_local; + + NIR_PASS_V(*producer, nir_lower_indirect_derefs, indirect_mask); + NIR_PASS_V(*consumer, nir_lower_indirect_derefs, indirect_mask); + + st_nir_opts(*producer); + st_nir_opts(*consumer); + } +} + extern "C" { bool st_link_nir(struct gl_context *ctx, struct gl_shader_program *shader_program) { struct st_context *st = st_context(ctx); + /* Determine first and last stage. */ + unsigned first = MESA_SHADER_STAGES; + unsigned last = 0; + for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) { + if (!shader_program->_LinkedShaders[i]) + continue; + if (first == MESA_SHADER_STAGES) + first = i; + last = i; + } + for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) { struct gl_linked_shader *shader = shader_program->_LinkedShaders[i]; if (shader == NULL) continue; nir_shader *nir = st_nir_get_mesa_program(ctx, shader_program, shader); + + nir_variable_mode mask = (nir_variable_mode) 0; + if (i != first) + mask = (nir_variable_mode)(mask | nir_var_shader_in); + + if (i != last) + mask = (nir_variable_mode)(mask | nir_var_shader_out); + + nir_lower_io_to_scalar_early(nir, mask); + st_nir_opts(nir); + } + + /* Linking the stages in the opposite order (from fragment to vertex) +* ensures that inter-shader outputs written to in an earlier stage +* are eliminated if they are (transitively) not used in a later +* stage. +*/ + int next = last; + for (int i = next - 1; i >= 0; i--) { + struct gl_linked_shader *shader = shader_program->_LinkedShaders[i]; + if (shader == NULL) + continue; + + st_nir_link_shaders(>Program->nir, + _program->_LinkedShaders[next]->Program->nir); + next = i; } + int prev = -1; for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) { struct gl_linked_shader *shader = shader_program->_LinkedShaders[i]; if (shader == NULL) continue; nir_shader *nir = shader->Program->nir; /* fragment shaders may need : */ if (nir->info.stage == MESA_SHADER_FRAGMENT) { static const gl_state_index wposTransformState[STATE_LENGTH] = { @@ -526,20 +597,26 @@ st_link_nir(struct gl_context *ctx, _mesa_add_state_reference(shader->Program->Parameters, wposTransformState); } } NIR_PASS_V(nir, nir_lower_system_values); nir_shader_gather_info(nir, nir_shader_get_entrypoint(nir)); shader->Program->info = nir->info; + if (prev != -1) { +
Re: [Mesa-dev] [PATCH 14/14] st/glsl_to_nir: enable NIR link time opts
Should this work out of the box for freedreno and vc4 as well? Or will those need changes like radeonsi did? On Mon, Nov 20, 2017 at 10:37 PM, Timothy Arceriwrote: > --- > src/mesa/state_tracker/st_glsl_to_nir.cpp | 77 > +++ > 1 file changed, 77 insertions(+) > > diff --git a/src/mesa/state_tracker/st_glsl_to_nir.cpp > b/src/mesa/state_tracker/st_glsl_to_nir.cpp > index 27e36bd306..53e07a78e2 100644 > --- a/src/mesa/state_tracker/st_glsl_to_nir.cpp > +++ b/src/mesa/state_tracker/st_glsl_to_nir.cpp > @@ -468,36 +468,107 @@ st_nir_get_mesa_program(struct gl_context *ctx, > NIR_PASS_V(nir, nir_lower_io_to_temporaries, >nir_shader_get_entrypoint(nir), >true, true); > NIR_PASS_V(nir, nir_lower_global_vars_to_local); > NIR_PASS_V(nir, nir_split_var_copies); > NIR_PASS_V(nir, nir_lower_var_copies); > > return nir; > } > > +static void > +st_nir_link_shaders(nir_shader **producer, nir_shader **consumer) > +{ > + nir_lower_io_arrays_to_elements(*producer, *consumer); > + > + NIR_PASS_V(*producer, nir_remove_dead_variables, nir_var_shader_out); > + NIR_PASS_V(*consumer, nir_remove_dead_variables, nir_var_shader_in); > + > + if (nir_remove_unused_varyings(*producer, *consumer)) { > + NIR_PASS_V(*producer, nir_lower_global_vars_to_local); > + NIR_PASS_V(*consumer, nir_lower_global_vars_to_local); > + > + /* The backend might not be able to handle indirects on > + * temporaries so we need to lower indirects on any of the > + * varyings we have demoted here. > + * > + * TODO: radeonsi shouldn't need to do this, however LLVM isn't > + * currently smart enough to handle indirects without causing excess > + * spilling causing the gpu to hang. > + * > + * See the following thread for more details of the problem: > + * > https://lists.freedesktop.org/archives/mesa-dev/2017-July/162106.html > + */ > + nir_variable_mode indirect_mask = nir_var_local; > + > + NIR_PASS_V(*producer, nir_lower_indirect_derefs, indirect_mask); > + NIR_PASS_V(*consumer, nir_lower_indirect_derefs, indirect_mask); > + > + st_nir_opts(*producer); > + st_nir_opts(*consumer); > + } > +} > + > extern "C" { > > bool > st_link_nir(struct gl_context *ctx, > struct gl_shader_program *shader_program) > { > struct st_context *st = st_context(ctx); > > + /* Determine first and last stage. */ > + unsigned first = MESA_SHADER_STAGES; > + unsigned last = 0; > + for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) { > + if (!shader_program->_LinkedShaders[i]) > + continue; > + if (first == MESA_SHADER_STAGES) > + first = i; > + last = i; > + } > + > for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) { >struct gl_linked_shader *shader = shader_program->_LinkedShaders[i]; >if (shader == NULL) > continue; > >nir_shader *nir = st_nir_get_mesa_program(ctx, shader_program, shader); > + > + nir_variable_mode mask = (nir_variable_mode) 0; > + if (i != first) > + mask = (nir_variable_mode)(mask | nir_var_shader_in); > + > + if (i != last) > + mask = (nir_variable_mode)(mask | nir_var_shader_out); > + > + nir_lower_io_to_scalar_early(nir, mask); > + st_nir_opts(nir); > + } > + > + /* Linking the stages in the opposite order (from fragment to vertex) > +* ensures that inter-shader outputs written to in an earlier stage > +* are eliminated if they are (transitively) not used in a later > +* stage. > +*/ > + int next = last; > + for (int i = next - 1; i >= 0; i--) { > + struct gl_linked_shader *shader = shader_program->_LinkedShaders[i]; > + if (shader == NULL) > + continue; > + > + st_nir_link_shaders(>Program->nir, > + > _program->_LinkedShaders[next]->Program->nir); > + next = i; > } > > + int prev = -1; > for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) { >struct gl_linked_shader *shader = shader_program->_LinkedShaders[i]; >if (shader == NULL) > continue; > >nir_shader *nir = shader->Program->nir; > >/* fragment shaders may need : */ >if (nir->info.stage == MESA_SHADER_FRAGMENT) { > static const gl_state_index wposTransformState[STATE_LENGTH] = { > @@ -526,20 +597,26 @@ st_link_nir(struct gl_context *ctx, > _mesa_add_state_reference(shader->Program->Parameters, >wposTransformState); > } >} > >NIR_PASS_V(nir, nir_lower_system_values); > >nir_shader_gather_info(nir, nir_shader_get_entrypoint(nir)); >shader->Program->info = nir->info; > > + if (prev != -1) { > + > nir_compact_varyings(shader_program->_LinkedShaders[prev]->Program->nir, > +
[Mesa-dev] [PATCH 13/14] radeonsi: use driver location when gathering tgsi info from nir
Otherwise we overflow the tgsi array because nir uses a separate var for component packing of varyings, while tgsi expects everything to be vec4. We also need to do our own count of the total number of varyings. --- src/gallium/drivers/radeonsi/si_shader_nir.c | 34 ++-- 1 file changed, 27 insertions(+), 7 deletions(-) diff --git a/src/gallium/drivers/radeonsi/si_shader_nir.c b/src/gallium/drivers/radeonsi/si_shader_nir.c index ec748c9679..2e0e3725f7 100644 --- a/src/gallium/drivers/radeonsi/si_shader_nir.c +++ b/src/gallium/drivers/radeonsi/si_shader_nir.c @@ -129,54 +129,62 @@ void si_nir_scan_shader(const struct nir_shader *nir, nir_function *func; unsigned i; assert(nir->info.stage == MESA_SHADER_VERTEX || nir->info.stage == MESA_SHADER_FRAGMENT); info->processor = pipe_shader_type_from_mesa(nir->info.stage); info->num_tokens = 2; /* indicate that the shader is non-empty */ info->num_instructions = 2; - info->num_inputs = nir->num_inputs; - info->num_outputs = nir->num_outputs; - if (nir->info.stage == MESA_SHADER_GEOMETRY) { info->properties[TGSI_PROPERTY_GS_INPUT_PRIM] = nir->info.gs.input_primitive; info->properties[TGSI_PROPERTY_GS_OUTPUT_PRIM] = nir->info.gs.output_primitive; info->properties[TGSI_PROPERTY_GS_MAX_OUTPUT_VERTICES] = nir->info.gs.vertices_out; info->properties[TGSI_PROPERTY_GS_INVOCATIONS] = nir->info.gs.invocations; } i = 0; + uint64_t processed_inputs = 0; + unsigned num_inputs = 0; nir_foreach_variable(variable, >inputs) { unsigned semantic_name, semantic_index; unsigned attrib_count = glsl_count_attribute_slots(variable->type, nir->info.stage == MESA_SHADER_VERTEX); assert(attrib_count == 1 && "not implemented"); /* Vertex shader inputs don't have semantics. The state * tracker has already mapped them to attributes via * variable->data.driver_location. */ if (nir->info.stage == MESA_SHADER_VERTEX) continue; /* Fragment shader position is a system value. */ if (nir->info.stage == MESA_SHADER_FRAGMENT && variable->data.location == VARYING_SLOT_POS) { if (variable->data.pixel_center_integer) info->properties[TGSI_PROPERTY_FS_COORD_PIXEL_CENTER] = TGSI_FS_COORD_PIXEL_CENTER_INTEGER; + + num_inputs++; continue; } + i = variable->data.driver_location; + if (processed_inputs & ((uint64_t)1 << i)) + continue; + + processed_inputs |= ((uint64_t)1 << i); + num_inputs++; + tgsi_get_gl_varying_semantic(variable->data.location, true, _name, _index); info->input_semantic_name[i] = semantic_name; info->input_semantic_index[i] = semantic_index; if (variable->data.sample) info->input_interpolate_loc[i] = TGSI_INTERPOLATE_LOC_SAMPLE; else if (variable->data.centroid) info->input_interpolate_loc[i] = TGSI_INTERPOLATE_LOC_CENTROID; @@ -228,36 +236,48 @@ void si_nir_scan_shader(const struct nir_shader *nir, case INTERP_MODE_FLAT: info->input_interpolate[i] = TGSI_INTERPOLATE_CONSTANT; break; } /* TODO make this more precise */ if (variable->data.location == VARYING_SLOT_COL0) info->colors_read |= 0x0f; else if (variable->data.location == VARYING_SLOT_COL1) info->colors_read |= 0xf0; - - i++; } + if (nir->info.stage != MESA_SHADER_VERTEX) + info->num_inputs = num_inputs; + else + info->num_inputs = nir->num_inputs; + i = 0; + uint64_t processed_outputs = 0; + unsigned num_outputs = 0; nir_foreach_variable(variable, >outputs) { unsigned semantic_name, semantic_index; if (nir->info.stage == MESA_SHADER_FRAGMENT) { tgsi_get_gl_frag_result_semantic(variable->data.location, _name, _index); } else { tgsi_get_gl_varying_semantic(variable->data.location, true, _name, _index); } + i = variable->data.driver_location; +
[Mesa-dev] [PATCH 14/14] st/glsl_to_nir: enable NIR link time opts
--- src/mesa/state_tracker/st_glsl_to_nir.cpp | 77 +++ 1 file changed, 77 insertions(+) diff --git a/src/mesa/state_tracker/st_glsl_to_nir.cpp b/src/mesa/state_tracker/st_glsl_to_nir.cpp index 27e36bd306..53e07a78e2 100644 --- a/src/mesa/state_tracker/st_glsl_to_nir.cpp +++ b/src/mesa/state_tracker/st_glsl_to_nir.cpp @@ -468,36 +468,107 @@ st_nir_get_mesa_program(struct gl_context *ctx, NIR_PASS_V(nir, nir_lower_io_to_temporaries, nir_shader_get_entrypoint(nir), true, true); NIR_PASS_V(nir, nir_lower_global_vars_to_local); NIR_PASS_V(nir, nir_split_var_copies); NIR_PASS_V(nir, nir_lower_var_copies); return nir; } +static void +st_nir_link_shaders(nir_shader **producer, nir_shader **consumer) +{ + nir_lower_io_arrays_to_elements(*producer, *consumer); + + NIR_PASS_V(*producer, nir_remove_dead_variables, nir_var_shader_out); + NIR_PASS_V(*consumer, nir_remove_dead_variables, nir_var_shader_in); + + if (nir_remove_unused_varyings(*producer, *consumer)) { + NIR_PASS_V(*producer, nir_lower_global_vars_to_local); + NIR_PASS_V(*consumer, nir_lower_global_vars_to_local); + + /* The backend might not be able to handle indirects on + * temporaries so we need to lower indirects on any of the + * varyings we have demoted here. + * + * TODO: radeonsi shouldn't need to do this, however LLVM isn't + * currently smart enough to handle indirects without causing excess + * spilling causing the gpu to hang. + * + * See the following thread for more details of the problem: + * https://lists.freedesktop.org/archives/mesa-dev/2017-July/162106.html + */ + nir_variable_mode indirect_mask = nir_var_local; + + NIR_PASS_V(*producer, nir_lower_indirect_derefs, indirect_mask); + NIR_PASS_V(*consumer, nir_lower_indirect_derefs, indirect_mask); + + st_nir_opts(*producer); + st_nir_opts(*consumer); + } +} + extern "C" { bool st_link_nir(struct gl_context *ctx, struct gl_shader_program *shader_program) { struct st_context *st = st_context(ctx); + /* Determine first and last stage. */ + unsigned first = MESA_SHADER_STAGES; + unsigned last = 0; + for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) { + if (!shader_program->_LinkedShaders[i]) + continue; + if (first == MESA_SHADER_STAGES) + first = i; + last = i; + } + for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) { struct gl_linked_shader *shader = shader_program->_LinkedShaders[i]; if (shader == NULL) continue; nir_shader *nir = st_nir_get_mesa_program(ctx, shader_program, shader); + + nir_variable_mode mask = (nir_variable_mode) 0; + if (i != first) + mask = (nir_variable_mode)(mask | nir_var_shader_in); + + if (i != last) + mask = (nir_variable_mode)(mask | nir_var_shader_out); + + nir_lower_io_to_scalar_early(nir, mask); + st_nir_opts(nir); + } + + /* Linking the stages in the opposite order (from fragment to vertex) +* ensures that inter-shader outputs written to in an earlier stage +* are eliminated if they are (transitively) not used in a later +* stage. +*/ + int next = last; + for (int i = next - 1; i >= 0; i--) { + struct gl_linked_shader *shader = shader_program->_LinkedShaders[i]; + if (shader == NULL) + continue; + + st_nir_link_shaders(>Program->nir, + _program->_LinkedShaders[next]->Program->nir); + next = i; } + int prev = -1; for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) { struct gl_linked_shader *shader = shader_program->_LinkedShaders[i]; if (shader == NULL) continue; nir_shader *nir = shader->Program->nir; /* fragment shaders may need : */ if (nir->info.stage == MESA_SHADER_FRAGMENT) { static const gl_state_index wposTransformState[STATE_LENGTH] = { @@ -526,20 +597,26 @@ st_link_nir(struct gl_context *ctx, _mesa_add_state_reference(shader->Program->Parameters, wposTransformState); } } NIR_PASS_V(nir, nir_lower_system_values); nir_shader_gather_info(nir, nir_shader_get_entrypoint(nir)); shader->Program->info = nir->info; + if (prev != -1) { + nir_compact_varyings(shader_program->_LinkedShaders[prev]->Program->nir, + nir, ctx->API != API_OPENGL_COMPAT); + } + prev = i; + st_glsl_to_nir_post_opts(st, shader->Program, shader_program); assert(shader->Program); if (!ctx->Driver.ProgramStringNotify(ctx, _mesa_shader_stage_to_program(i), shader->Program)) { _mesa_reference_program(ctx, >Program, NULL); return false;
[Mesa-dev] [PATCH 07/14] st/glsl_to_nir: make st_glsl_to_nir() static
Here we also move the extern C functions to the bottom of the file. --- src/mesa/state_tracker/st_glsl_to_nir.cpp | 102 +++--- src/mesa/state_tracker/st_nir.h | 4 -- 2 files changed, 51 insertions(+), 55 deletions(-) diff --git a/src/mesa/state_tracker/st_glsl_to_nir.cpp b/src/mesa/state_tracker/st_glsl_to_nir.cpp index 9a27f4ceea..a4a30a4199 100644 --- a/src/mesa/state_tracker/st_glsl_to_nir.cpp +++ b/src/mesa/state_tracker/st_glsl_to_nir.cpp @@ -240,27 +240,25 @@ st_nir_assign_uniform_locations(struct gl_program *prog, loc = st_nir_lookup_parameter_index(prog->Parameters, uniform->name); } uniform->data.driver_location = loc; max = MAX2(max, loc + type_size(uniform->type)); } *size = max; } -extern "C" { - /* First third of converting glsl_to_nir.. this leaves things in a pre- * nir_lower_io state, so that shader variants can more easily insert/ * replace variables, etc. */ -nir_shader * +static nir_shader * st_glsl_to_nir(struct st_context *st, struct gl_program *prog, struct gl_shader_program *shader_program, gl_shader_stage stage) { struct pipe_screen *pscreen = st->pipe->screen; enum pipe_shader_type ptarget = pipe_shader_type_from_mesa(stage); const nir_shader_compiler_options *options; assert(pscreen->get_compiler_options); /* drivers using NIR must implement this */ @@ -385,68 +383,20 @@ sort_varyings(struct exec_list *var_list) { struct exec_list new_list; exec_list_make_empty(_list); nir_foreach_variable_safe(var, var_list) { exec_node_remove(>node); insert_sorted(_list, var); } exec_list_move_nodes_to(_list, var_list); } -/* Last third of preparing nir from glsl, which happens after shader - * variant lowering. - */ -void -st_finalize_nir(struct st_context *st, struct gl_program *prog, -struct gl_shader_program *shader_program, nir_shader *nir) -{ - struct pipe_screen *screen = st->pipe->screen; - - NIR_PASS_V(nir, nir_split_var_copies); - NIR_PASS_V(nir, nir_lower_var_copies); - NIR_PASS_V(nir, nir_lower_io_types); - - if (nir->info.stage == MESA_SHADER_VERTEX) { - /* Needs special handling so drvloc matches the vbo state: */ - st_nir_assign_vs_in_locations(prog, nir); - /* Re-lower global vars, to deal with any dead VS inputs. */ - NIR_PASS_V(nir, nir_lower_global_vars_to_local); - - sort_varyings(>outputs); - st_nir_assign_var_locations(>outputs, - >num_outputs); - st_nir_fixup_varying_slots(st, >outputs); - } else if (nir->info.stage == MESA_SHADER_FRAGMENT) { - sort_varyings(>inputs); - st_nir_assign_var_locations(>inputs, - >num_inputs); - st_nir_fixup_varying_slots(st, >inputs); - st_nir_assign_var_locations(>outputs, - >num_outputs); - } else if (nir->info.stage == MESA_SHADER_COMPUTE) { - /* TODO? */ - } else { - unreachable("invalid shader type for tgsi bypass\n"); - } - - NIR_PASS_V(nir, nir_lower_atomics_to_ssbo, - st->ctx->Const.Program[nir->info.stage].MaxAtomicBuffers); - - st_nir_assign_uniform_locations(prog, shader_program, - >uniforms, >num_uniforms); - - if (screen->get_param(screen, PIPE_CAP_NIR_SAMPLERS_AS_DEREF)) - NIR_PASS_V(nir, nir_lower_samplers_as_deref, shader_program); - else - NIR_PASS_V(nir, nir_lower_samplers, shader_program); -} - static void set_st_program(struct gl_program *prog, struct gl_shader_program *shader_program, nir_shader *nir) { struct st_vertex_program *stvp; struct st_common_program *stp; struct st_fragment_program *stfp; struct st_compute_program *stcp; @@ -513,20 +463,22 @@ st_nir_get_mesa_program(struct gl_context *ctx, _mesa_update_shader_textures_used(shader_program, prog); nir_shader *nir = st_glsl_to_nir(st, prog, shader_program, shader->Stage); set_st_program(prog, shader_program, nir); prog->nir = nir; return nir; } +extern "C" { + bool st_link_nir(struct gl_context *ctx, struct gl_shader_program *shader_program) { struct st_context *st = st_context(ctx); for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) { struct gl_linked_shader *shader = shader_program->_LinkedShaders[i]; if (shader == NULL) continue; @@ -546,11 +498,59 @@ st_link_nir(struct gl_context *ctx, _mesa_shader_stage_to_program(i), shader->Program)) { _mesa_reference_program(ctx, >Program, NULL); return false; } } return true; } +/* Last third of preparing nir from glsl, which happens after shader + * variant lowering. + */ +void +st_finalize_nir(struct st_context *st, struct
[Mesa-dev] [PATCH 09/14] st/glsl_to_nir: enable NIR opts
--- src/mesa/state_tracker/st_glsl_to_nir.cpp | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/src/mesa/state_tracker/st_glsl_to_nir.cpp b/src/mesa/state_tracker/st_glsl_to_nir.cpp index 7d66a85a10..e83b5dd2ef 100644 --- a/src/mesa/state_tracker/st_glsl_to_nir.cpp +++ b/src/mesa/state_tracker/st_glsl_to_nir.cpp @@ -293,21 +293,25 @@ st_glsl_to_nir(struct st_context *st, struct gl_program *prog, assert(pscreen->get_compiler_options); /* drivers using NIR must implement this */ options = (const nir_shader_compiler_options *) pscreen->get_compiler_options(pscreen, PIPE_SHADER_IR_NIR, ptarget); assert(options); if (prog->nir) return prog->nir; - return glsl_to_nir(shader_program, stage, options); + nir_shader *nir = glsl_to_nir(shader_program, stage, options); + + st_nir_opts(nir); + + return nir; } /* Second third of converting glsl_to_nir. This creates uniforms, gathers * info on varyings, etc after NIR link time opts have been applied. */ static void st_glsl_to_nir_post_opts(struct st_context *st, struct gl_program *prog, struct gl_shader_program *shader_program) { nir_shader *nir = prog->nir; -- 2.14.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 11/14] st/glsl_to_nir: move some calls out of st_glsl_to_nir_post_opts()
NIR component packing will be inserted between these calls and the calling of st_glsl_to_nir_post_opts(). --- src/mesa/state_tracker/st_glsl_to_nir.cpp | 67 +-- 1 file changed, 37 insertions(+), 30 deletions(-) diff --git a/src/mesa/state_tracker/st_glsl_to_nir.cpp b/src/mesa/state_tracker/st_glsl_to_nir.cpp index 746dfff396..27e36bd306 100644 --- a/src/mesa/state_tracker/st_glsl_to_nir.cpp +++ b/src/mesa/state_tracker/st_glsl_to_nir.cpp @@ -340,50 +340,20 @@ st_glsl_to_nir_post_opts(struct st_context *st, struct gl_program *prog, * This should be enough for Bitmap and DrawPixels constants. */ _mesa_reserve_parameter_storage(prog->Parameters, 8); /* This has to be done last. Any operation the can cause * prog->ParameterValues to get reallocated (e.g., anything that adds a * program constant) has to happen before creating this linkage. */ _mesa_associate_uniform_storage(st->ctx, shader_program, prog, true); - /* fragment shaders may need : */ - if (prog->info.stage == MESA_SHADER_FRAGMENT) { - static const gl_state_index wposTransformState[STATE_LENGTH] = { - STATE_INTERNAL, STATE_FB_WPOS_Y_TRANSFORM - }; - nir_lower_wpos_ytransform_options wpos_options = { { 0 } }; - struct pipe_screen *pscreen = st->pipe->screen; - - memcpy(wpos_options.state_tokens, wposTransformState, - sizeof(wpos_options.state_tokens)); - wpos_options.fs_coord_origin_upper_left = - pscreen->get_param(pscreen, PIPE_CAP_TGSI_FS_COORD_ORIGIN_UPPER_LEFT); - wpos_options.fs_coord_origin_lower_left = - pscreen->get_param(pscreen, PIPE_CAP_TGSI_FS_COORD_ORIGIN_LOWER_LEFT); - wpos_options.fs_coord_pixel_center_integer = - pscreen->get_param(pscreen, PIPE_CAP_TGSI_FS_COORD_PIXEL_CENTER_INTEGER); - wpos_options.fs_coord_pixel_center_half_integer = - pscreen->get_param(pscreen, PIPE_CAP_TGSI_FS_COORD_PIXEL_CENTER_HALF_INTEGER); - - if (nir_lower_wpos_ytransform(nir, _options)) { - nir_validate_shader(nir); - _mesa_add_state_reference(prog->Parameters, wposTransformState); - } - } - - NIR_PASS_V(nir, nir_lower_system_values); - - nir_shader_gather_info(nir, nir_shader_get_entrypoint(nir)); - prog->info = nir->info; - st_set_prog_affected_state_flags(prog); NIR_PASS_V(nir, st_nir_lower_builtin); NIR_PASS_V(nir, nir_lower_atomics, shader_program); if (st->ctx->_Shader->Flags & GLSL_DUMP) { _mesa_log("\n"); _mesa_log("NIR IR for linked %s program %d:\n", _mesa_shader_stage_to_string(prog->info.stage), shader_program->Name); @@ -519,20 +489,57 @@ st_link_nir(struct gl_context *ctx, continue; nir_shader *nir = st_nir_get_mesa_program(ctx, shader_program, shader); } for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) { struct gl_linked_shader *shader = shader_program->_LinkedShaders[i]; if (shader == NULL) continue; + nir_shader *nir = shader->Program->nir; + + /* fragment shaders may need : */ + if (nir->info.stage == MESA_SHADER_FRAGMENT) { + static const gl_state_index wposTransformState[STATE_LENGTH] = { +STATE_INTERNAL, STATE_FB_WPOS_Y_TRANSFORM + }; + nir_lower_wpos_ytransform_options wpos_options = { { 0 } }; + struct pipe_screen *pscreen = st->pipe->screen; + + memcpy(wpos_options.state_tokens, wposTransformState, +sizeof(wpos_options.state_tokens)); + wpos_options.fs_coord_origin_upper_left = +pscreen->get_param(pscreen, + PIPE_CAP_TGSI_FS_COORD_ORIGIN_UPPER_LEFT); + wpos_options.fs_coord_origin_lower_left = +pscreen->get_param(pscreen, + PIPE_CAP_TGSI_FS_COORD_ORIGIN_LOWER_LEFT); + wpos_options.fs_coord_pixel_center_integer = +pscreen->get_param(pscreen, + PIPE_CAP_TGSI_FS_COORD_PIXEL_CENTER_INTEGER); + wpos_options.fs_coord_pixel_center_half_integer = +pscreen->get_param(pscreen, + PIPE_CAP_TGSI_FS_COORD_PIXEL_CENTER_HALF_INTEGER); + + if (nir_lower_wpos_ytransform(nir, _options)) { +nir_validate_shader(nir); +_mesa_add_state_reference(shader->Program->Parameters, + wposTransformState); + } + } + + NIR_PASS_V(nir, nir_lower_system_values); + + nir_shader_gather_info(nir, nir_shader_get_entrypoint(nir)); + shader->Program->info = nir->info; + st_glsl_to_nir_post_opts(st, shader->Program, shader_program); assert(shader->Program); if (!ctx->Driver.ProgramStringNotify(ctx, _mesa_shader_stage_to_program(i), shader->Program)) {
[Mesa-dev] [PATCH 12/14] radeonsi/nir: add support for packed inputs
Because NIR can create non vec4 variables when implementing component packing we need to make sure not to reprocess the same slot again. Also we can drop the fs_attr_idx counter and just use driver_location. --- src/gallium/drivers/radeonsi/si_shader_nir.c | 46 +++- 1 file changed, 25 insertions(+), 21 deletions(-) diff --git a/src/gallium/drivers/radeonsi/si_shader_nir.c b/src/gallium/drivers/radeonsi/si_shader_nir.c index 5d82715f7a..ec748c9679 100644 --- a/src/gallium/drivers/radeonsi/si_shader_nir.c +++ b/src/gallium/drivers/radeonsi/si_shader_nir.c @@ -437,46 +437,42 @@ si_lower_nir(struct si_shader_selector* sel) NIR_PASS(progress, sel->nir, nir_opt_undef); NIR_PASS(progress, sel->nir, nir_opt_conditional_discard); if (sel->nir->options->max_unroll_iterations) { NIR_PASS(progress, sel->nir, nir_opt_loop_unroll, 0); } } while (progress); } static void declare_nir_input_vs(struct si_shader_context *ctx, -struct nir_variable *variable, unsigned rel, +struct nir_variable *variable, LLVMValueRef out[4]) { - si_llvm_load_input_vs(ctx, variable->data.driver_location / 4 + rel, out); + si_llvm_load_input_vs(ctx, variable->data.driver_location / 4, out); } static void declare_nir_input_fs(struct si_shader_context *ctx, -struct nir_variable *variable, unsigned rel, -unsigned *fs_attr_idx, +struct nir_variable *variable, +unsigned input_index, LLVMValueRef out[4]) { - unsigned slot = variable->data.location + rel; - - assert(variable->data.location >= VARYING_SLOT_VAR0 || rel == 0); - + unsigned slot = variable->data.location; if (slot == VARYING_SLOT_POS) { out[0] = LLVMGetParam(ctx->main_fn, SI_PARAM_POS_X_FLOAT); out[1] = LLVMGetParam(ctx->main_fn, SI_PARAM_POS_Y_FLOAT); out[2] = LLVMGetParam(ctx->main_fn, SI_PARAM_POS_Z_FLOAT); out[3] = ac_build_fdiv(>ac, ctx->ac.f32_1, LLVMGetParam(ctx->main_fn, SI_PARAM_POS_W_FLOAT)); return; } - si_llvm_load_input_fs(ctx, *fs_attr_idx, out); - (*fs_attr_idx)++; + si_llvm_load_input_fs(ctx, input_index, out); } static LLVMValueRef si_nir_load_sampler_desc(struct ac_shader_abi *abi, unsigned descriptor_set, unsigned base_index, unsigned constant_index, LLVMValueRef dynamic_index, enum ac_descriptor_type desc_type, bool image, bool write) { struct si_shader_context *ctx = si_shader_context_from_abi(abi); @@ -516,39 +512,47 @@ si_nir_load_sampler_desc(struct ac_shader_abi *abi, index = LLVMBuildAdd(ctx->gallivm.builder, index, LLVMConstInt(ctx->i32, SI_NUM_IMAGES / 2, 0), ""); return si_load_sampler_desc(ctx, list, index, desc_type); } bool si_nir_build_llvm(struct si_shader_context *ctx, struct nir_shader *nir) { struct tgsi_shader_info *info = >shader->selector->info; - unsigned fs_attr_idx = 0; + uint64_t processed_outputs = 0; nir_foreach_variable(variable, >inputs) { unsigned attrib_count = glsl_count_attribute_slots(variable->type, nir->info.stage == MESA_SHADER_VERTEX); unsigned input_idx = variable->data.driver_location; - for (unsigned i = 0; i < attrib_count; ++i) { - LLVMValueRef data[4]; + assert(attrib_count == 1); - if (nir->info.stage == MESA_SHADER_VERTEX) - declare_nir_input_vs(ctx, variable, i, data); - else if (nir->info.stage == MESA_SHADER_FRAGMENT) - declare_nir_input_fs(ctx, variable, i, _attr_idx, data); + LLVMValueRef data[4]; + unsigned loc = variable->data.location; - for (unsigned chan = 0; chan < 4; chan++) { - ctx->inputs[input_idx + chan] = - LLVMBuildBitCast(ctx->ac.builder, data[chan], ctx->ac.i32, ""); - } + /* Packed components share the same location so skip +* them if we have already processed the location. +*/ + if (processed_outputs & ((uint64_t)1 << loc)) + continue; + + if (nir->info.stage == MESA_SHADER_VERTEX) + declare_nir_input_vs(ctx, variable, data); +
[Mesa-dev] [PATCH 10/14] st/glsl_to_nir: call some lowering passes earlier
This is required so that we can enbale NIR linking optimisations. --- src/mesa/state_tracker/st_glsl_to_nir.cpp | 14 +++--- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/src/mesa/state_tracker/st_glsl_to_nir.cpp b/src/mesa/state_tracker/st_glsl_to_nir.cpp index e83b5dd2ef..746dfff396 100644 --- a/src/mesa/state_tracker/st_glsl_to_nir.cpp +++ b/src/mesa/state_tracker/st_glsl_to_nir.cpp @@ -340,27 +340,20 @@ st_glsl_to_nir_post_opts(struct st_context *st, struct gl_program *prog, * This should be enough for Bitmap and DrawPixels constants. */ _mesa_reserve_parameter_storage(prog->Parameters, 8); /* This has to be done last. Any operation the can cause * prog->ParameterValues to get reallocated (e.g., anything that adds a * program constant) has to happen before creating this linkage. */ _mesa_associate_uniform_storage(st->ctx, shader_program, prog, true); - NIR_PASS_V(nir, nir_lower_io_to_temporaries, - nir_shader_get_entrypoint(nir), - true, true); - NIR_PASS_V(nir, nir_lower_global_vars_to_local); - NIR_PASS_V(nir, nir_split_var_copies); - NIR_PASS_V(nir, nir_lower_var_copies); - /* fragment shaders may need : */ if (prog->info.stage == MESA_SHADER_FRAGMENT) { static const gl_state_index wposTransformState[STATE_LENGTH] = { STATE_INTERNAL, STATE_FB_WPOS_Y_TRANSFORM }; nir_lower_wpos_ytransform_options wpos_options = { { 0 } }; struct pipe_screen *pscreen = st->pipe->screen; memcpy(wpos_options.state_tokens, wposTransformState, sizeof(wpos_options.state_tokens)); @@ -495,20 +488,27 @@ st_nir_get_mesa_program(struct gl_context *ctx, } prog->ExternalSamplersUsed = gl_external_samplers(prog); _mesa_update_shader_textures_used(shader_program, prog); nir_shader *nir = st_glsl_to_nir(st, prog, shader_program, shader->Stage); set_st_program(prog, shader_program, nir); prog->nir = nir; + NIR_PASS_V(nir, nir_lower_io_to_temporaries, + nir_shader_get_entrypoint(nir), + true, true); + NIR_PASS_V(nir, nir_lower_global_vars_to_local); + NIR_PASS_V(nir, nir_split_var_copies); + NIR_PASS_V(nir, nir_lower_var_copies); + return nir; } extern "C" { bool st_link_nir(struct gl_context *ctx, struct gl_shader_program *shader_program) { struct st_context *st = st_context(ctx); -- 2.14.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 08/14] st/glsl_to_nir: add basic NIR opt loop helper
We need to be able to do these NIR opts in the state tracker rather than the driver in order for the NIR linking opts to be useful. --- src/mesa/state_tracker/st_glsl_to_nir.cpp | 31 +++ 1 file changed, 31 insertions(+) diff --git a/src/mesa/state_tracker/st_glsl_to_nir.cpp b/src/mesa/state_tracker/st_glsl_to_nir.cpp index a4a30a4199..7d66a85a10 100644 --- a/src/mesa/state_tracker/st_glsl_to_nir.cpp +++ b/src/mesa/state_tracker/st_glsl_to_nir.cpp @@ -240,20 +240,51 @@ st_nir_assign_uniform_locations(struct gl_program *prog, loc = st_nir_lookup_parameter_index(prog->Parameters, uniform->name); } uniform->data.driver_location = loc; max = MAX2(max, loc + type_size(uniform->type)); } *size = max; } +static void +st_nir_opts(nir_shader *nir) +{ + bool progress; + do { + progress = false; + + NIR_PASS(progress, nir, nir_copy_prop); + NIR_PASS(progress, nir, nir_opt_remove_phis); + NIR_PASS(progress, nir, nir_opt_dce); + if (nir_opt_trivial_continues(nir)) { + progress = true; + NIR_PASS(progress, nir, nir_copy_prop); + NIR_PASS(progress, nir, nir_opt_dce); + } + NIR_PASS(progress, nir, nir_opt_if); + NIR_PASS(progress, nir, nir_opt_dead_cf); + NIR_PASS(progress, nir, nir_opt_cse); + NIR_PASS(progress, nir, nir_opt_peephole_select, 8); + + NIR_PASS(progress, nir, nir_opt_algebraic); + NIR_PASS(progress, nir, nir_opt_constant_folding); + + NIR_PASS(progress, nir, nir_opt_undef); + NIR_PASS(progress, nir, nir_opt_conditional_discard); + if (nir->options->max_unroll_iterations) { + NIR_PASS(progress, nir, nir_opt_loop_unroll, (nir_variable_mode)0); + } + } while (progress); +} + /* First third of converting glsl_to_nir.. this leaves things in a pre- * nir_lower_io state, so that shader variants can more easily insert/ * replace variables, etc. */ static nir_shader * st_glsl_to_nir(struct st_context *st, struct gl_program *prog, struct gl_shader_program *shader_program, gl_shader_stage stage) { struct pipe_screen *pscreen = st->pipe->screen; -- 2.14.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 05/14] st/glsl_to_nir: split linking loop in two in prep of nir linking opts
--- src/mesa/state_tracker/st_glsl_to_nir.cpp | 26 +++--- 1 file changed, 15 insertions(+), 11 deletions(-) diff --git a/src/mesa/state_tracker/st_glsl_to_nir.cpp b/src/mesa/state_tracker/st_glsl_to_nir.cpp index bb0ba07012..8c22bde835 100644 --- a/src/mesa/state_tracker/st_glsl_to_nir.cpp +++ b/src/mesa/state_tracker/st_glsl_to_nir.cpp @@ -470,21 +470,21 @@ set_st_program(struct gl_program *prog, stcp = (struct st_compute_program *)prog; stcp->shader_program = shader_program; stcp->tgsi.ir_type = PIPE_SHADER_IR_NIR; stcp->tgsi.prog = nir_shader_clone(NULL, nir); break; default: unreachable("unknown shader stage"); } } -struct gl_program * +static nir_shader * st_nir_get_mesa_program(struct gl_context *ctx, struct gl_shader_program *shader_program, struct gl_linked_shader *shader) { struct st_context *st = st_context(ctx); struct gl_program *prog; validate_ir_tree(shader->ir); prog = shader->Program; @@ -504,39 +504,43 @@ st_nir_get_mesa_program(struct gl_context *ctx, _mesa_log("\n\n"); } prog->ExternalSamplersUsed = gl_external_samplers(prog); _mesa_update_shader_textures_used(shader_program, prog); nir_shader *nir = st_glsl_to_nir(st, prog, shader_program, shader->Stage); set_st_program(prog, shader_program, nir); - return prog; + return nir; } bool st_link_nir(struct gl_context *ctx, struct gl_shader_program *shader_program) { for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) { struct gl_linked_shader *shader = shader_program->_LinkedShaders[i]; if (shader == NULL) continue; - struct gl_program *linked_prog = - st_nir_get_mesa_program(ctx, shader_program, shader); + nir_shader *nir = st_nir_get_mesa_program(ctx, shader_program, shader); + } - if (linked_prog) { - if (!ctx->Driver.ProgramStringNotify(ctx, - _mesa_shader_stage_to_program(i), - linked_prog)) { -_mesa_reference_program(ctx, >Program, NULL); -return false; - } + for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) { + struct gl_linked_shader *shader = shader_program->_LinkedShaders[i]; + if (shader == NULL) + continue; + + assert(shader->Program); + if (!ctx->Driver.ProgramStringNotify(ctx, + _mesa_shader_stage_to_program(i), + shader->Program)) { + _mesa_reference_program(ctx, >Program, NULL); + return false; } } return true; } } /* extern "C" */ -- 2.14.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 04/14] st/glsl_to_nir: create set_st_program() helper
--- src/mesa/state_tracker/st_glsl_to_nir.cpp | 74 +-- 1 file changed, 40 insertions(+), 34 deletions(-) diff --git a/src/mesa/state_tracker/st_glsl_to_nir.cpp b/src/mesa/state_tracker/st_glsl_to_nir.cpp index 8b66f8277a..bb0ba07012 100644 --- a/src/mesa/state_tracker/st_glsl_to_nir.cpp +++ b/src/mesa/state_tracker/st_glsl_to_nir.cpp @@ -428,58 +428,31 @@ st_finalize_nir(struct st_context *st, struct gl_program *prog, st_nir_assign_uniform_locations(prog, shader_program, >uniforms, >num_uniforms); if (screen->get_param(screen, PIPE_CAP_NIR_SAMPLERS_AS_DEREF)) NIR_PASS_V(nir, nir_lower_samplers_as_deref, shader_program); else NIR_PASS_V(nir, nir_lower_samplers, shader_program); } -struct gl_program * -st_nir_get_mesa_program(struct gl_context *ctx, -struct gl_shader_program *shader_program, -struct gl_linked_shader *shader) +static void +set_st_program(struct gl_program *prog, + struct gl_shader_program *shader_program, + nir_shader *nir) { - struct st_context *st = st_context(ctx); - struct gl_program *prog; - - validate_ir_tree(shader->ir); - - prog = shader->Program; - - prog->Parameters = _mesa_new_parameter_list(); - - _mesa_copy_linked_program_data(shader_program, shader); - _mesa_generate_parameters_list_for_uniforms(ctx, shader_program, shader, - prog->Parameters); - - if (ctx->_Shader->Flags & GLSL_DUMP) { - _mesa_log("\n"); - _mesa_log("GLSL IR for linked %s program %d:\n", - _mesa_shader_stage_to_string(shader->Stage), - shader_program->Name); - _mesa_print_ir(_mesa_get_log_file(), shader->ir, NULL); - _mesa_log("\n\n"); - } - - prog->ExternalSamplersUsed = gl_external_samplers(prog); - _mesa_update_shader_textures_used(shader_program, prog); - struct st_vertex_program *stvp; struct st_common_program *stp; struct st_fragment_program *stfp; struct st_compute_program *stcp; - nir_shader *nir = st_glsl_to_nir(st, prog, shader_program, shader->Stage); - - switch (shader->Stage) { + switch (prog->info.stage) { case MESA_SHADER_VERTEX: stvp = (struct st_vertex_program *)prog; stvp->shader_program = shader_program; stvp->tgsi.type = PIPE_SHADER_IR_NIR; stvp->tgsi.ir.nir = nir; break; case MESA_SHADER_GEOMETRY: case MESA_SHADER_TESS_CTRL: case MESA_SHADER_TESS_EVAL: stp = (struct st_common_program *)prog; @@ -493,24 +466,57 @@ st_nir_get_mesa_program(struct gl_context *ctx, stfp->tgsi.type = PIPE_SHADER_IR_NIR; stfp->tgsi.ir.nir = nir; break; case MESA_SHADER_COMPUTE: stcp = (struct st_compute_program *)prog; stcp->shader_program = shader_program; stcp->tgsi.ir_type = PIPE_SHADER_IR_NIR; stcp->tgsi.prog = nir_shader_clone(NULL, nir); break; default: - assert(!"should not be reached"); - return NULL; + unreachable("unknown shader stage"); } +} + +struct gl_program * +st_nir_get_mesa_program(struct gl_context *ctx, +struct gl_shader_program *shader_program, +struct gl_linked_shader *shader) +{ + struct st_context *st = st_context(ctx); + struct gl_program *prog; + + validate_ir_tree(shader->ir); + + prog = shader->Program; + + prog->Parameters = _mesa_new_parameter_list(); + + _mesa_copy_linked_program_data(shader_program, shader); + _mesa_generate_parameters_list_for_uniforms(ctx, shader_program, shader, + prog->Parameters); + + if (ctx->_Shader->Flags & GLSL_DUMP) { + _mesa_log("\n"); + _mesa_log("GLSL IR for linked %s program %d:\n", + _mesa_shader_stage_to_string(shader->Stage), + shader_program->Name); + _mesa_print_ir(_mesa_get_log_file(), shader->ir, NULL); + _mesa_log("\n\n"); + } + + prog->ExternalSamplersUsed = gl_external_samplers(prog); + _mesa_update_shader_textures_used(shader_program, prog); + + nir_shader *nir = st_glsl_to_nir(st, prog, shader_program, shader->Stage); + set_st_program(prog, shader_program, nir); return prog; } bool st_link_nir(struct gl_context *ctx, struct gl_shader_program *shader_program) { for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) { struct gl_linked_shader *shader = shader_program->_LinkedShaders[i]; -- 2.14.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 06/14] st/glsl_to_nir: split the st_glsl_to_nir() function in two
We want to be able to generate NIR then apply NIR optimisations. Once the optimisations are done we can then apply the new post opt function which assigns uniforms etc based on the optimised IR. --- src/mesa/state_tracker/st_glsl_to_nir.cpp | 30 -- 1 file changed, 20 insertions(+), 10 deletions(-) diff --git a/src/mesa/state_tracker/st_glsl_to_nir.cpp b/src/mesa/state_tracker/st_glsl_to_nir.cpp index 8c22bde835..9a27f4ceea 100644 --- a/src/mesa/state_tracker/st_glsl_to_nir.cpp +++ b/src/mesa/state_tracker/st_glsl_to_nir.cpp @@ -242,44 +242,53 @@ st_nir_assign_uniform_locations(struct gl_program *prog, uniform->data.driver_location = loc; max = MAX2(max, loc + type_size(uniform->type)); } *size = max; } extern "C" { -/* First half of converting glsl_to_nir.. this leaves things in a pre- +/* First third of converting glsl_to_nir.. this leaves things in a pre- * nir_lower_io state, so that shader variants can more easily insert/ * replace variables, etc. */ nir_shader * st_glsl_to_nir(struct st_context *st, struct gl_program *prog, struct gl_shader_program *shader_program, gl_shader_stage stage) { struct pipe_screen *pscreen = st->pipe->screen; enum pipe_shader_type ptarget = pipe_shader_type_from_mesa(stage); const nir_shader_compiler_options *options; - nir_shader *nir; assert(pscreen->get_compiler_options); /* drivers using NIR must implement this */ options = (const nir_shader_compiler_options *) pscreen->get_compiler_options(pscreen, PIPE_SHADER_IR_NIR, ptarget); assert(options); if (prog->nir) return prog->nir; - nir = glsl_to_nir(shader_program, stage, options); + return glsl_to_nir(shader_program, stage, options); +} + +/* Second third of converting glsl_to_nir. This creates uniforms, gathers + * info on varyings, etc after NIR link time opts have been applied. + */ +static void +st_glsl_to_nir_post_opts(struct st_context *st, struct gl_program *prog, + struct gl_shader_program *shader_program) +{ + nir_shader *nir = prog->nir; /* Make a pass over the IR to add state references for any built-in * uniforms that are used. This has to be done now (during linking). * Code generation doesn't happen until the first time this shader is * used for rendering. Waiting until then to generate the parameters is * too late. At that point, the values for the built-in uniforms won't * get sent to the shader. */ nir_foreach_variable(var, >uniforms) { if (strncmp(var->name, "gl_", 3) == 0) { @@ -306,21 +315,21 @@ st_glsl_to_nir(struct st_context *st, struct gl_program *prog, _mesa_associate_uniform_storage(st->ctx, shader_program, prog, true); NIR_PASS_V(nir, nir_lower_io_to_temporaries, nir_shader_get_entrypoint(nir), true, true); NIR_PASS_V(nir, nir_lower_global_vars_to_local); NIR_PASS_V(nir, nir_split_var_copies); NIR_PASS_V(nir, nir_lower_var_copies); /* fragment shaders may need : */ - if (stage == MESA_SHADER_FRAGMENT) { + if (prog->info.stage == MESA_SHADER_FRAGMENT) { static const gl_state_index wposTransformState[STATE_LENGTH] = { STATE_INTERNAL, STATE_FB_WPOS_Y_TRANSFORM }; nir_lower_wpos_ytransform_options wpos_options = { { 0 } }; struct pipe_screen *pscreen = st->pipe->screen; memcpy(wpos_options.state_tokens, wposTransformState, sizeof(wpos_options.state_tokens)); wpos_options.fs_coord_origin_upper_left = pscreen->get_param(pscreen, PIPE_CAP_TGSI_FS_COORD_ORIGIN_UPPER_LEFT); @@ -343,29 +352,25 @@ st_glsl_to_nir(struct st_context *st, struct gl_program *prog, prog->info = nir->info; st_set_prog_affected_state_flags(prog); NIR_PASS_V(nir, st_nir_lower_builtin); NIR_PASS_V(nir, nir_lower_atomics, shader_program); if (st->ctx->_Shader->Flags & GLSL_DUMP) { _mesa_log("\n"); _mesa_log("NIR IR for linked %s program %d:\n", - _mesa_shader_stage_to_string(stage), + _mesa_shader_stage_to_string(prog->info.stage), shader_program->Name); nir_print_shader(nir, _mesa_get_log_file()); _mesa_log("\n\n"); } - - prog->nir = nir; - - return nir; } /* TODO any better helper somewhere to sort a list? */ static void insert_sorted(struct exec_list *var_list, nir_variable *new_var) { nir_foreach_variable(var, var_list) { if (var->data.location > new_var->data.location) { exec_node_insert_node_before(>node, _var->node); @@ -380,21 +385,21 @@ sort_varyings(struct exec_list *var_list) { struct exec_list new_list; exec_list_make_empty(_list); nir_foreach_variable_safe(var, var_list) { exec_node_remove(>node); insert_sorted(_list, var); } exec_list_move_nodes_to(_list, var_list); } -/* Second half of
[Mesa-dev] [PATCH 02/14] nir: fix support for scalar arrays in nir_lower_io_types()
This was just recreating the same vector type we alreay had and hitting an assert for scalars. --- src/compiler/nir/nir_lower_io_types.c | 10 +++--- 1 file changed, 3 insertions(+), 7 deletions(-) diff --git a/src/compiler/nir/nir_lower_io_types.c b/src/compiler/nir/nir_lower_io_types.c index d31082e543..795bbd80d5 100644 --- a/src/compiler/nir/nir_lower_io_types.c +++ b/src/compiler/nir/nir_lower_io_types.c @@ -47,29 +47,25 @@ get_new_var(struct lower_io_types_state *state, nir_variable *var, assert(var->data.mode == nir_var_shader_out); list = >new_outs; } nir_foreach_variable(nvar, list) { if (nvar->data.location == (var->data.location + off)) return nvar; } /* doesn't already exist, so we need to create a new one: */ - /* TODO figure out if scalar vs vec, and if float/int/uint/(double?) -* do we need to fixup interpolation mode for int vs float components -* of a struct, etc.. + /* TODO figure out if we need to fixup interpolation mode for int vs float +* components of a struct, etc.. */ - const struct glsl_type *ntype = - glsl_vector_type(glsl_get_base_type(deref_type), - glsl_get_vector_elements(deref_type)); nir_variable *nvar = nir_variable_create(state->shader, var->data.mode, -ntype, NULL); +deref_type, NULL); nvar->name = ralloc_asprintf(nvar, "%s@%u", var->name, off); nvar->data = var->data; nvar->data.location += off; /* nir_variable_create is too clever for its own good: */ exec_node_remove(>node); exec_node_self_link(>node); /* no delinit() :-( */ exec_list_push_tail(list, >node); -- 2.14.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] NIR linking optimisations for Gallium drivers
This series depends on: https://patchwork.freedesktop.org/series/34131/ Tested without regression on radeonsi. Shader-db results (still limited to GLSL 1.40 so not to interesting): Totals from affected shaders: SGPRS: 29440 -> 30464 (3.48 %) VGPRS: 19620 -> 19584 (-0.18 %) Spilled SGPRs: 131 -> 129 (-1.53 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 749312 -> 749548 (0.03 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 4751 -> 4767 (0.34 %) Wait states: 0 -> 0 (0.00 %) The middle patches just move things around so that we can make use of the linking opts, hopefully having them split up this much should make regression testing easy for the existing gallium nir drivers. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 01/14] st/glsl_to_nir: add st_nir_assign_var_locations() helper
This avoids packed varyings being assigned different driver locations. --- src/mesa/state_tracker/st_glsl_to_nir.cpp | 43 --- 1 file changed, 34 insertions(+), 9 deletions(-) diff --git a/src/mesa/state_tracker/st_glsl_to_nir.cpp b/src/mesa/state_tracker/st_glsl_to_nir.cpp index dd4b05d7b9..d0375dd3aa 100644 --- a/src/mesa/state_tracker/st_glsl_to_nir.cpp +++ b/src/mesa/state_tracker/st_glsl_to_nir.cpp @@ -114,20 +114,48 @@ st_nir_assign_vs_in_locations(struct gl_program *prog, nir_shader *nir) * inputs array and expecting to find inputs with a driver_location * set. */ exec_node_remove(>node); var->data.mode = nir_var_global; exec_list_push_tail(>globals, >node); } } } +static void +st_nir_assign_var_locations(struct exec_list *var_list, unsigned *size) +{ + unsigned location = 0; + unsigned assigned_locations[VARYING_SLOT_MAX]; + uint64_t processed_locs = 0; + + nir_foreach_variable(var, var_list) { + /* Because component packing allows varyings to share the same location + * we may have already have processed this location. + */ + if (var->data.location >= VARYING_SLOT_VAR0 && + processed_locs & ((uint64_t)1 << var->data.location)) { + var->data.driver_location = assigned_locations[var->data.location]; + *size += type_size(var->type); + continue; + } + + assigned_locations[var->data.location] = location; + var->data.driver_location = location; + location += type_size(var->type); + + processed_locs |= ((uint64_t)1 << var->data.location); + } + + *size += location; +} + static int st_nir_lookup_parameter_index(const struct gl_program_parameter_list *params, const char *name) { int loc = _mesa_lookup_parameter_index(params, name); /* is there a better way to do this? If we have something like: * *struct S { * float f; @@ -372,33 +400,30 @@ st_finalize_nir(struct st_context *st, struct gl_program *prog, NIR_PASS_V(nir, nir_lower_var_copies); NIR_PASS_V(nir, nir_lower_io_types); if (nir->info.stage == MESA_SHADER_VERTEX) { /* Needs special handling so drvloc matches the vbo state: */ st_nir_assign_vs_in_locations(prog, nir); /* Re-lower global vars, to deal with any dead VS inputs. */ NIR_PASS_V(nir, nir_lower_global_vars_to_local); sort_varyings(>outputs); - nir_assign_var_locations(>outputs, - >num_outputs, - type_size); + st_nir_assign_var_locations(>outputs, + >num_outputs); st_nir_fixup_varying_slots(st, >outputs); } else if (nir->info.stage == MESA_SHADER_FRAGMENT) { sort_varyings(>inputs); - nir_assign_var_locations(>inputs, - >num_inputs, - type_size); + st_nir_assign_var_locations(>inputs, + >num_inputs); st_nir_fixup_varying_slots(st, >inputs); - nir_assign_var_locations(>outputs, - >num_outputs, - type_size); + st_nir_assign_var_locations(>outputs, + >num_outputs); } else if (nir->info.stage == MESA_SHADER_COMPUTE) { /* TODO? */ } else { unreachable("invalid shader type for tgsi bypass\n"); } NIR_PASS_V(nir, nir_lower_atomics_to_ssbo, st->ctx->Const.Program[nir->info.stage].MaxAtomicBuffers); st_nir_assign_uniform_locations(prog, shader_program, -- 2.14.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 03/14] st/glsl: move nir linking loop to new function st_link_nir()
This will allow us to refactor linking and include some nir link time optimisations. --- src/mesa/state_tracker/st_glsl_to_nir.cpp | 25 + src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 26 +- src/mesa/state_tracker/st_nir.h| 7 +++ 3 files changed, 41 insertions(+), 17 deletions(-) diff --git a/src/mesa/state_tracker/st_glsl_to_nir.cpp b/src/mesa/state_tracker/st_glsl_to_nir.cpp index d0375dd3aa..8b66f8277a 100644 --- a/src/mesa/state_tracker/st_glsl_to_nir.cpp +++ b/src/mesa/state_tracker/st_glsl_to_nir.cpp @@ -501,11 +501,36 @@ st_nir_get_mesa_program(struct gl_context *ctx, break; default: assert(!"should not be reached"); return NULL; } return prog; } +bool +st_link_nir(struct gl_context *ctx, +struct gl_shader_program *shader_program) +{ + for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) { + struct gl_linked_shader *shader = shader_program->_LinkedShaders[i]; + if (shader == NULL) + continue; + + struct gl_program *linked_prog = + st_nir_get_mesa_program(ctx, shader_program, shader); + + if (linked_prog) { + if (!ctx->Driver.ProgramStringNotify(ctx, + _mesa_shader_stage_to_program(i), + linked_prog)) { +_mesa_reference_program(ctx, >Program, NULL); +return false; + } + } + } + + return true; +} + } /* extern "C" */ diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp index 0772b73627..4fc1533e28 100644 --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp @@ -6894,39 +6894,46 @@ GLboolean st_link_shader(struct gl_context *ctx, struct gl_shader_program *prog) { /* Return early if we are loading the shader from on-disk cache */ if (st_load_tgsi_from_disk_cache(ctx, prog)) { return GL_TRUE; } struct pipe_screen *pscreen = ctx->st->pipe->screen; assert(prog->data->LinkStatus); + bool use_nir = false; for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) { if (prog->_LinkedShaders[i] == NULL) continue; struct gl_linked_shader *shader = prog->_LinkedShaders[i]; exec_list *ir = shader->ir; gl_shader_stage stage = shader->Stage; const struct gl_shader_compiler_options *options = >Const.ShaderCompilerOptions[stage]; enum pipe_shader_type ptarget = pipe_shader_type_from_mesa(stage); bool have_dround = pscreen->get_shader_param(pscreen, ptarget, PIPE_SHADER_CAP_TGSI_DROUND_SUPPORTED); bool have_dfrexp = pscreen->get_shader_param(pscreen, ptarget, PIPE_SHADER_CAP_TGSI_DFRACEXP_DLDEXP_SUPPORTED); bool have_ldexp = pscreen->get_shader_param(pscreen, ptarget, PIPE_SHADER_CAP_TGSI_LDEXP_SUPPORTED); unsigned if_threshold = pscreen->get_shader_param(pscreen, ptarget, PIPE_SHADER_CAP_LOWER_IF_THRESHOLD); + enum pipe_shader_ir preferred_ir = (enum pipe_shader_ir) + pscreen->get_shader_param(pscreen, ptarget, + PIPE_SHADER_CAP_PREFERRED_IR); + if (preferred_ir == PIPE_SHADER_IR_NIR) + use_nir = true; + /* If there are forms of indirect addressing that the driver * cannot handle, perform the lowering pass. */ if (options->EmitNoIndirectInput || options->EmitNoIndirectOutput || options->EmitNoIndirectTemp || options->EmitNoIndirectUniform) { lower_variable_index_to_cond_assign(stage, ir, options->EmitNoIndirectInput, options->EmitNoIndirectOutput, options->EmitNoIndirectTemp, options->EmitNoIndirectUniform); @@ -7016,38 +7023,31 @@ st_link_shader(struct gl_context *ctx, struct gl_shader_program *prog) progress |= lower_if_to_cond_assign((gl_shader_stage)i, ir, options->MaxIfDepth, if_threshold); } while (progress); } validate_ir_tree(ir); } build_program_resource_list(ctx, prog); + if (use_nir) + return st_link_nir(ctx, prog); + for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) { struct gl_linked_shader *shader = prog->_LinkedShaders[i]; if (shader == NULL) continue; - enum pipe_shader_type ptarget = - pipe_shader_type_from_mesa(shader->Stage); - enum pipe_shader_ir preferred_ir = (enum pipe_shader_ir) - pscreen->get_shader_param(pscreen,
[Mesa-dev] [PATCH 7/8] radv: enable nir varying array splitting
--- src/amd/vulkan/radv_pipeline.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c index faffca8330..03a8bd8604 100644 --- a/src/amd/vulkan/radv_pipeline.c +++ b/src/amd/vulkan/radv_pipeline.c @@ -1675,20 +1675,23 @@ radv_link_shaders(struct radv_pipeline *pipeline, nir_shader **shaders) ordered_shaders[shader_count++] = shaders[MESA_SHADER_TESS_EVAL]; } if(shaders[MESA_SHADER_TESS_CTRL]) { ordered_shaders[shader_count++] = shaders[MESA_SHADER_TESS_CTRL]; } if(shaders[MESA_SHADER_VERTEX]) { ordered_shaders[shader_count++] = shaders[MESA_SHADER_VERTEX]; } for (int i = 1; i < shader_count; ++i) { + nir_lower_io_arrays_to_elements(ordered_shaders[i], + ordered_shaders[i - 1]); + nir_remove_dead_variables(ordered_shaders[i], nir_var_shader_out); nir_remove_dead_variables(ordered_shaders[i - 1], nir_var_shader_in); bool progress = nir_remove_unused_varyings(ordered_shaders[i], ordered_shaders[i - 1]); if (progress) { nir_lower_global_vars_to_local(ordered_shaders[i]); -- 2.14.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 5/8] nir: add varying component packing helpers
v2: update shader info input/output masks when pack components v3: make sure interpolation loc matches, this is required for the radeonsi NIR backend. Reviewed-by: Bas Nieuwenhuizen(v1) --- src/compiler/nir/nir.h | 2 + src/compiler/nir/nir_linking_helpers.c | 302 + 2 files changed, 304 insertions(+) diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h index c62af4afb9..5832c05680 100644 --- a/src/compiler/nir/nir.h +++ b/src/compiler/nir/nir.h @@ -2441,20 +2441,22 @@ void nir_lower_io_to_temporaries(nir_shader *shader, nir_function_impl *entrypoint, bool outputs, bool inputs); void nir_shader_gather_info(nir_shader *shader, nir_function_impl *entrypoint); void nir_assign_var_locations(struct exec_list *var_list, unsigned *size, int (*type_size)(const struct glsl_type *)); /* Some helpers to do very simple linking */ bool nir_remove_unused_varyings(nir_shader *producer, nir_shader *consumer); +void nir_compact_varyings(nir_shader *producer, nir_shader *consumer, + bool default_to_smooth_interp); typedef enum { /* If set, this forces all non-flat fragment shader inputs to be * interpolated as if with the "sample" qualifier. This requires * nir_shader_compiler_options::use_interpolated_input_intrinsics. */ nir_lower_io_force_sample_interpolation = (1 << 1), } nir_lower_io_options; bool nir_lower_io(nir_shader *shader, nir_variable_mode modes, diff --git a/src/compiler/nir/nir_linking_helpers.c b/src/compiler/nir/nir_linking_helpers.c index 4d709c1b3c..7a1d98d7b9 100644 --- a/src/compiler/nir/nir_linking_helpers.c +++ b/src/compiler/nir/nir_linking_helpers.c @@ -166,10 +166,312 @@ nir_remove_unused_varyings(nir_shader *producer, nir_shader *consumer) bool progress = false; progress = remove_unused_io_vars(producer, >outputs, read, patches_read); progress = remove_unused_io_vars(consumer, >inputs, written, patches_written) || progress; return progress; } + +static uint8_t +get_interp_type(nir_variable *var, bool default_to_smooth_interp) +{ + if (var->data.interpolation != INTERP_MODE_NONE) + return var->data.interpolation; + else if (default_to_smooth_interp) + return INTERP_MODE_SMOOTH; + else + return INTERP_MODE_NONE; +} + +#define INTERPOLATE_LOC_SAMPLE 0 +#define INTERPOLATE_LOC_CENTROID 1 +#define INTERPOLATE_LOC_CENTER 2 + +static uint8_t +get_interp_loc(nir_variable *var) +{ + if (var->data.sample) + return INTERPOLATE_LOC_SAMPLE; + else if (var->data.centroid) + return INTERPOLATE_LOC_CENTROID; + else + return INTERPOLATE_LOC_CENTER; +} + +static void +get_slot_component_masks_and_interp_types(struct exec_list *var_list, + uint8_t *comps, + uint8_t *interp_type, + uint8_t *interp_loc, + gl_shader_stage stage, + bool default_to_smooth_interp) +{ + nir_foreach_variable_safe(var, var_list) { + assert(var->data.location >= 0); + + /* Only remap things that aren't built-ins. + * TODO: add TES patch support. + */ + if (var->data.location >= VARYING_SLOT_VAR0 && + var->data.location - VARYING_SLOT_VAR0 < 32) { + + const struct glsl_type *type = var->type; + if (nir_is_per_vertex_io(var, stage)) { +assert(glsl_type_is_array(type)); +type = glsl_get_array_element(type); + } + + unsigned location = var->data.location - VARYING_SLOT_VAR0; + unsigned elements = +glsl_get_vector_elements(glsl_without_array(type)); + + bool dual_slot = glsl_type_is_dual_slot(glsl_without_array(type)); + unsigned slots = glsl_count_attribute_slots(type, false); + unsigned comps_slot2 = 0; + for (unsigned i = 0; i < slots; i++) { +interp_type[location + i] = + get_interp_type(var, default_to_smooth_interp); +interp_loc[location + i] = get_interp_loc(var); + +if (dual_slot) { + if (i & 1) { + comps[location + i] |= ((1 << comps_slot2) - 1); + } else { + unsigned num_comps = 4 - var->data.location_frac; + comps_slot2 = (elements * 2) - num_comps; + + /* Assume ARB_enhanced_layouts packing rules for doubles */ + assert(var->data.location_frac == 0 || + var->data.location_frac == 2); + assert(comps_slot2 <= 4); + + comps[location + i] |= + ((1 <<
[Mesa-dev] [PATCH 8/8] radv: enable nir component packing
SaschaWillems Vulkan demo tessellation: ~4000fps -> ~4600fps Reviewed-by: Bas Nieuwenhuizen--- src/amd/vulkan/radv_pipeline.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c index 03a8bd8604..5f07040915 100644 --- a/src/amd/vulkan/radv_pipeline.c +++ b/src/amd/vulkan/radv_pipeline.c @@ -1824,20 +1824,21 @@ void radv_create_shaders(struct radv_pipeline *pipeline, unsigned first = MESA_SHADER_STAGES; unsigned last = 0; for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) { if (!pStages[i]) continue; if (first == MESA_SHADER_STAGES) first = i; last = i; } + int prev = -1; for (unsigned i = 0; i < MESA_SHADER_STAGES; ++i) { const VkPipelineShaderStageCreateInfo *stage = pStages[i]; if (!modules[i]) continue; nir[i] = radv_shader_compile_to_nir(device, modules[i], stage ? stage->pName : "main", i, stage ? stage->pSpecializationInfo : NULL); pipeline->active_stages |= mesa_to_vk_shader_stage(i); @@ -1854,20 +1855,25 @@ void radv_create_shaders(struct radv_pipeline *pipeline, if (i != first) mask = mask | nir_var_shader_in; if (i != last) mask = mask | nir_var_shader_out; nir_lower_io_to_scalar_early(nir[i], mask); radv_optimize_nir(nir[i]); } + + if (prev != -1) { + nir_compact_varyings(nir[prev], nir[i], true); + } + prev = i; } if (nir[MESA_SHADER_TESS_CTRL]) { nir_lower_tes_patch_vertices(nir[MESA_SHADER_TESS_EVAL], nir[MESA_SHADER_TESS_CTRL]->info.tess.tcs_vertices_out); } radv_link_shaders(pipeline, nir); for (int i = 0; i < MESA_SHADER_STAGES; ++i) { if (!(device->instance->debug_flags & RADV_DEBUG_DUMP_SHADERS)) -- 2.14.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 4/8] i965: call nir_lower_io_to_scalar() at link time for BDW and above
This will allow dead components of varyings to be removed. BDW shader-db results: total instructions in shared programs: 13190730 -> 13108459 (-0.62%) instructions in affected programs: 2110903 -> 2028632 (-3.90%) helped: 14043 HURT: 486 total cycles in shared programs: 541148990 -> 540544072 (-0.11%) cycles in affected programs: 290344296 -> 289739378 (-0.21%) helped: 23418 HURT: 11623 total loops in shared programs: 3923 -> 3920 (-0.08%) loops in affected programs: 3 -> 0 helped: 3 HURT: 0 total spills in shared programs: 85784 -> 85853 (0.08%) spills in affected programs: 1374 -> 1443 (5.02%) helped: 6 HURT: 15 total fills in shared programs: 88717 -> 88801 (0.09%) fills in affected programs: 1719 -> 1803 (4.89%) helped: 15 HURT: 9 LOST: 3 GAINED: 0 The fills/spills changes were all in the dolphin uber shaders. I tested enabling this on IVB but the results went in the other direction. --- src/mesa/drivers/dri/i965/brw_link.cpp | 35 -- 1 file changed, 25 insertions(+), 10 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_link.cpp b/src/mesa/drivers/dri/i965/brw_link.cpp index 7986bdf74c..fe2ad52ad1 100644 --- a/src/mesa/drivers/dri/i965/brw_link.cpp +++ b/src/mesa/drivers/dri/i965/brw_link.cpp @@ -221,20 +221,31 @@ extern "C" GLboolean brw_link_shader(struct gl_context *ctx, struct gl_shader_program *shProg) { struct brw_context *brw = brw_context(ctx); const struct brw_compiler *compiler = brw->screen->compiler; unsigned int stage; struct shader_info *infos[MESA_SHADER_STAGES] = { 0, }; if (shProg->data->LinkStatus == linking_skipped) return GL_TRUE; + /* Determine first and last stage. */ + unsigned first = MESA_SHADER_STAGES; + unsigned last = 0; + for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) { + if (!shProg->_LinkedShaders[i]) + continue; + if (first == MESA_SHADER_STAGES) + first = i; + last = i; + } + for (stage = 0; stage < ARRAY_SIZE(shProg->_LinkedShaders); stage++) { struct gl_linked_shader *shader = shProg->_LinkedShaders[stage]; if (!shader) continue; struct gl_program *prog = shader->Program; prog->Parameters = _mesa_new_parameter_list(); process_glsl_ir(brw, shProg, shader); @@ -248,31 +259,35 @@ brw_link_shader(struct gl_context *ctx, struct gl_shader_program *shProg) if (debug_enabled && shader->ir) { fprintf(stderr, "GLSL IR for native %s shader %d:\n", _mesa_shader_stage_to_string(shader->Stage), shProg->Name); _mesa_print_ir(stderr, shader->ir, NULL); fprintf(stderr, "\n\n"); } prog->nir = brw_create_nir(brw, shProg, prog, (gl_shader_stage) stage, compiler->scalar_stage[stage]); - } - /* Determine first and last stage. */ - unsigned first = MESA_SHADER_STAGES; - unsigned last = 0; - for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) { - if (!shProg->_LinkedShaders[i]) - continue; - if (first == MESA_SHADER_STAGES) - first = i; - last = i; + if (brw->screen->devinfo.gen >= 8) { + nir_variable_mode mask = (nir_variable_mode) 0; + + if (stage != first) +mask = (nir_variable_mode)(mask | nir_var_shader_in); + + if (stage != last) +mask = (nir_variable_mode)(mask | nir_var_shader_out); + + nir_lower_io_to_scalar_early(prog->nir, mask); + + prog->nir = brw_nir_optimize(prog->nir, compiler, + compiler->scalar_stage[stage]); + } } /* Linking the stages in the opposite order (from fragment to vertex) * ensures that inter-shader outputs written to in an earlier stage * are eliminated if they are (transitively) not used in a later * stage. * * TODO: Look into Shadow of Mordor regressions on HSW and enable this for * all platforms. See: https://bugs.freedesktop.org/show_bug.cgi?id=103537 */ -- 2.14.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 3/8] i965: enable varying array splitting
total instructions in shared programs: 13210579 -> 13199325 (-0.09%) instructions in affected programs: 89043 -> 77789 (-12.64%) helped: 430 HURT: 0 total cycles in shared programs: 539530190 -> 539493750 (-0.01%) cycles in affected programs: 584860 -> 548420 (-6.23%) helped: 437 HURT: 110 total spills in shared programs: 86646 -> 86640 (-0.01%) spills in affected programs: 6 -> 0 helped: 1 HURT: 0 total fills in shared programs: 90955 -> 90946 (-0.01%) fills in affected programs: 9 -> 0 helped: 1 HURT: 0 --- src/intel/compiler/brw_nir.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/src/intel/compiler/brw_nir.c b/src/intel/compiler/brw_nir.c index 8f3f77f89a..26d17d8f0c 100644 --- a/src/intel/compiler/brw_nir.c +++ b/src/intel/compiler/brw_nir.c @@ -672,20 +672,22 @@ brw_preprocess_nir(const struct brw_compiler *compiler, nir_shader *nir) OPT(nir_remove_dead_variables, nir_var_local); return nir; } void brw_nir_link_shaders(const struct brw_compiler *compiler, nir_shader **producer, nir_shader **consumer) { + nir_lower_io_arrays_to_elements(*producer, *consumer); + NIR_PASS_V(*producer, nir_remove_dead_variables, nir_var_shader_out); NIR_PASS_V(*consumer, nir_remove_dead_variables, nir_var_shader_in); if (nir_remove_unused_varyings(*producer, *consumer)) { NIR_PASS_V(*producer, nir_lower_global_vars_to_local); NIR_PASS_V(*consumer, nir_lower_global_vars_to_local); /* The backend might not be able to handle indirects on * temporaries so we need to lower indirects on any of the * varyings we have demoted here. -- 2.14.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 6/8] i965: enable varying component packing for BDW+
shader-db results BDW: total instructions in shared programs: 13192895 -> 13182437 (-0.08%) instructions in affected programs: 827145 -> 816687 (-1.26%) helped: 5199 HURT: 116 total cycles in shared programs: 539249342 -> 539156566 (-0.02%) cycles in affected programs: 21894552 -> 21801776 (-0.42%) helped: 10667 HURT: 7196 LOST: 0 GAINED: 17 --- src/mesa/drivers/dri/i965/brw_link.cpp | 7 +++ 1 file changed, 7 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_link.cpp b/src/mesa/drivers/dri/i965/brw_link.cpp index fe2ad52ad1..6dbceb8ed3 100644 --- a/src/mesa/drivers/dri/i965/brw_link.cpp +++ b/src/mesa/drivers/dri/i965/brw_link.cpp @@ -297,20 +297,21 @@ brw_link_shader(struct gl_context *ctx, struct gl_shader_program *shProg) if (shProg->_LinkedShaders[i] == NULL) continue; brw_nir_link_shaders(compiler, >_LinkedShaders[i]->Program->nir, >_LinkedShaders[next]->Program->nir); next = i; } } + int prev = -1; for (stage = 0; stage < ARRAY_SIZE(shProg->_LinkedShaders); stage++) { struct gl_linked_shader *shader = shProg->_LinkedShaders[stage]; if (!shader) continue; struct gl_program *prog = shader->Program; brw_shader_gather_info(prog->nir, prog); NIR_PASS_V(prog->nir, nir_lower_samplers, shProg); NIR_PASS_V(prog->nir, nir_lower_atomics, shProg); @@ -319,20 +320,26 @@ brw_link_shader(struct gl_context *ctx, struct gl_shader_program *shProg) if (brw->ctx.Cache) { struct blob writer; blob_init(); nir_serialize(, prog->nir); prog->driver_cache_blob = ralloc_size(NULL, writer.size); memcpy(prog->driver_cache_blob, writer.data, writer.size); prog->driver_cache_blob_size = writer.size; } + if (brw->screen->devinfo.gen >= 8 && prev != -1) { + nir_compact_varyings(shProg->_LinkedShaders[prev]->Program->nir, + prog->nir, ctx->API != API_OPENGL_COMPAT); + } + prev = stage; + infos[stage] = >nir->info; /* Make a pass over the IR to add state references for any built-in * uniforms that are used. This has to be done now (during linking). * Code generation doesn't happen until the first time this shader is * used for rendering. Waiting until then to generate the parameters is * too late. At that point, the values for the built-in uniforms won't * get sent to the shader. */ nir_foreach_variable(var, >nir->uniforms) { -- 2.14.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/8] i965: move update_xfb_info() call out of loop
We can just call it once. Also a following patch will also introduce link time component packing which modifies the outputs_written bit mask, we want to avoid calling update_xfb_info() until after packing is completed. --- src/mesa/drivers/dri/i965/brw_link.cpp | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_link.cpp b/src/mesa/drivers/dri/i965/brw_link.cpp index d18521e792..7986bdf74c 100644 --- a/src/mesa/drivers/dri/i965/brw_link.cpp +++ b/src/mesa/drivers/dri/i965/brw_link.cpp @@ -306,42 +306,45 @@ brw_link_shader(struct gl_context *ctx, struct gl_shader_program *shProg) struct blob writer; blob_init(); nir_serialize(, prog->nir); prog->driver_cache_blob = ralloc_size(NULL, writer.size); memcpy(prog->driver_cache_blob, writer.data, writer.size); prog->driver_cache_blob_size = writer.size; } infos[stage] = >nir->info; - update_xfb_info(prog->sh.LinkedTransformFeedback, infos[stage]); - /* Make a pass over the IR to add state references for any built-in * uniforms that are used. This has to be done now (during linking). * Code generation doesn't happen until the first time this shader is * used for rendering. Waiting until then to generate the parameters is * too late. At that point, the values for the built-in uniforms won't * get sent to the shader. */ nir_foreach_variable(var, >nir->uniforms) { if (strncmp(var->name, "gl_", 3) == 0) { const nir_state_slot *const slots = var->state_slots; assert(var->state_slots != NULL); for (unsigned int i = 0; i < var->num_state_slots; i++) { _mesa_add_state_reference(prog->Parameters, (gl_state_index *)slots[i].tokens); } } } } + if (shProg->last_vert_prog) { + update_xfb_info(shProg->last_vert_prog->sh.LinkedTransformFeedback, + >last_vert_prog->nir->info); + } + /* The linker tries to dead code eliminate unused varying components, * and make sure interfaces match. But it isn't able to do so in all * cases. So, explicitly make the interfaces match by OR'ing together * the inputs_read/outputs_written bitfields of adjacent stages. */ if (!shProg->SeparateShader) unify_interfaces(infos); if ((ctx->_Shader->Flags & GLSL_DUMP) && shProg->Name != 0) { for (unsigned i = 0; i < shProg->NumShaders; i++) { -- 2.14.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/8] nir: add varying array splitting pass
V2: fix matrix support, non-array matrices were being skipped in v1 --- src/compiler/Makefile.sources | 1 + src/compiler/nir/meson.build | 1 + src/compiler/nir/nir.h | 1 + src/compiler/nir/nir_lower_io_arrays_to_elements.c | 373 + 4 files changed, 376 insertions(+) create mode 100644 src/compiler/nir/nir_lower_io_arrays_to_elements.c diff --git a/src/compiler/Makefile.sources b/src/compiler/Makefile.sources index 2ab8e163a2..c5094b7f19 100644 --- a/src/compiler/Makefile.sources +++ b/src/compiler/Makefile.sources @@ -219,20 +219,21 @@ NIR_FILES = \ nir/nir_lower_double_ops.c \ nir/nir_lower_drawpixels.c \ nir/nir_lower_global_vars_to_local.c \ nir/nir_lower_gs_intrinsics.c \ nir/nir_lower_load_const_to_scalar.c \ nir/nir_lower_locals_to_regs.c \ nir/nir_lower_idiv.c \ nir/nir_lower_indirect_derefs.c \ nir/nir_lower_int64.c \ nir/nir_lower_io.c \ + nir/nir_lower_io_arrays_to_elements.c \ nir/nir_lower_io_to_temporaries.c \ nir/nir_lower_io_to_scalar.c \ nir/nir_lower_io_types.c \ nir/nir_lower_passthrough_edgeflags.c \ nir/nir_lower_patch_vertices.c \ nir/nir_lower_phis_to_scalar.c \ nir/nir_lower_regs_to_ssa.c \ nir/nir_lower_returns.c \ nir/nir_lower_samplers.c \ nir/nir_lower_samplers_as_deref.c \ diff --git a/src/compiler/nir/meson.build b/src/compiler/nir/meson.build index e5c8326aa0..b61a07773d 100644 --- a/src/compiler/nir/meson.build +++ b/src/compiler/nir/meson.build @@ -107,20 +107,21 @@ files_libnir = files( 'nir_lower_double_ops.c', 'nir_lower_drawpixels.c', 'nir_lower_global_vars_to_local.c', 'nir_lower_gs_intrinsics.c', 'nir_lower_load_const_to_scalar.c', 'nir_lower_locals_to_regs.c', 'nir_lower_idiv.c', 'nir_lower_indirect_derefs.c', 'nir_lower_int64.c', 'nir_lower_io.c', + 'nir_lower_io_arrays_to_elements.c', 'nir_lower_io_to_temporaries.c', 'nir_lower_io_to_scalar.c', 'nir_lower_io_types.c', 'nir_lower_passthrough_edgeflags.c', 'nir_lower_patch_vertices.c', 'nir_lower_phis_to_scalar.c', 'nir_lower_regs_to_ssa.c', 'nir_lower_returns.c', 'nir_lower_samplers.c', 'nir_lower_samplers_as_deref.c', diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h index f46f614711..c62af4afb9 100644 --- a/src/compiler/nir/nir.h +++ b/src/compiler/nir/nir.h @@ -2475,20 +2475,21 @@ bool nir_lower_constant_initializers(nir_shader *shader, nir_variable_mode modes); bool nir_move_vec_src_uses_to_dest(nir_shader *shader); bool nir_lower_vec_to_movs(nir_shader *shader); void nir_lower_alpha_test(nir_shader *shader, enum compare_func func, bool alpha_to_one); bool nir_lower_alu_to_scalar(nir_shader *shader); bool nir_lower_load_const_to_scalar(nir_shader *shader); bool nir_lower_read_invocation_to_scalar(nir_shader *shader); bool nir_lower_phis_to_scalar(nir_shader *shader); +void nir_lower_io_arrays_to_elements(nir_shader *producer, nir_shader *consumer); void nir_lower_io_to_scalar(nir_shader *shader, nir_variable_mode mask); void nir_lower_io_to_scalar_early(nir_shader *shader, nir_variable_mode mask); bool nir_lower_samplers(nir_shader *shader, const struct gl_shader_program *shader_program); bool nir_lower_samplers_as_deref(nir_shader *shader, const struct gl_shader_program *shader_program); typedef struct nir_lower_subgroups_options { uint8_t subgroup_size; diff --git a/src/compiler/nir/nir_lower_io_arrays_to_elements.c b/src/compiler/nir/nir_lower_io_arrays_to_elements.c new file mode 100644 index 00..c41f300edb --- /dev/null +++ b/src/compiler/nir/nir_lower_io_arrays_to_elements.c @@ -0,0 +1,373 @@ +/* + * Copyright © 2017 Timothy Arceri + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT
[Mesa-dev] V3 More linking optimisations
V3: Some small fixes to allow these passes to work with the radeonsi nir backend. V2: This is a resend/rebddase of the series I sent a couple of weeks ago, its rebased on some of Jason's changes to i965 NIR linking that landed in master. This series adds a varying array splitting pass as well a previous component packing series I sent out previously. This allows avoiding the workaround of calling gather shader info twice since we can more easily keep the input/output bitmasks in sync now that we don't need to worry about partial marking of arrays. Remaining improvements include adding a pass to compact varyings into consecutive slots rather than leaving empty slots when removing dead varyings. Radv results: SaschaWillems Vulkan demo tessellation: ~4000fps -> ~4600fps Shader-db results for serires on i965 (BDW): total instructions in shared programs: 13298718 -> 13191284 (-0.81%) instructions in affected programs: 2315180 -> 2207746 (-4.64%) helped: 14956 HURT: 390 total cycles in shared programs: 540151400 -> 539397048 (-0.14%) cycles in affected programs: 297905258 -> 297150906 (-0.25%) helped: 25231 HURT: 13033 total loops in shared programs: 3807 -> 3804 (-0.08%) loops in affected programs: 3 -> 0 helped: 3 HURT: 0 total spills in shared programs: 86577 -> 86640 (0.07%) spills in affected programs: 1380 -> 1443 (4.57%) helped: 7 HURT: 15 total fills in shared programs: 90871 -> 90946 (0.08%) fills in affected programs: 1728 -> 1803 (4.34%) helped: 16 HURT: 9 LOST: 4 GAINED: 15 The spill hurt is all in dolphin uber shaders (as is most of the spill improvements). Two of the lost programs are SIMD16 programs are from CS: GO because 80% of the shaders get optimised away when we remove dead varying components, these are also the shaders where the 3 loops go away. Please review. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 15/17] util: Add Mesa ARB_get_program_binary helper functions
I don't think this belongs in util. disk cache is generic and used by both Vulkan and Opengl drivers. These function are specific to one OpenGL extension and have a dependency on OpenGL types. I think you should just move this inside the src/mesa, I can't see this being reused anywhere else. Maybe src/mesa/program ? On 21/11/17 09:27, Jordan Justen wrote: Signed-off-by: Jordan Justen--- src/util/Makefile.sources | 2 + src/util/meson.build | 2 + src/util/program_binary.c | 149 ++ src/util/program_binary.h | 70 ++ 4 files changed, 223 insertions(+) create mode 100644 src/util/program_binary.c create mode 100644 src/util/program_binary.h diff --git a/src/util/Makefile.sources b/src/util/Makefile.sources index 104ecae8ed3..e06f631128a 100644 --- a/src/util/Makefile.sources +++ b/src/util/Makefile.sources @@ -24,6 +24,8 @@ MESA_UTIL_FILES := \ mesa-sha1.h \ os_time.c \ os_time.h \ + program_binary.c \ + program_binary.h \ sha1/sha1.c \ sha1/sha1.h \ ralloc.c \ diff --git a/src/util/meson.build b/src/util/meson.build index ac86c9e111e..496da5c4328 100644 --- a/src/util/meson.build +++ b/src/util/meson.build @@ -48,6 +48,8 @@ files_mesa_util = files( 'mesa-sha1.h', 'os_time.c', 'os_time.h', + 'program_binary.c', + 'program_binary.h', 'sha1/sha1.c', 'sha1/sha1.h', 'ralloc.c', diff --git a/src/util/program_binary.c b/src/util/program_binary.c new file mode 100644 index 000..e10b9f17862 --- /dev/null +++ b/src/util/program_binary.c @@ -0,0 +1,149 @@ +/* + * Mesa 3-D graphics library + * + * Copyright (c) 2017 Intel Corporation + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included + * in all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS + * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. + */ + +/** + * \file program_binary.c + * + * Helper functions for serializing a binary program. + */ + + +#include "main/mtypes.h" +#include "crc32.h" +#include "program_binary.h" + +/** + * Mesa supports one binary format, but it must differentiate between formats + * produced by different drivers and different Mesa versions. + * + * Mesa uses a uint32_t value to specify an internal format. The only format + * defined has one uint32_t value of 0, followed by 20 bytes specifying a sha1 + * that uniquely identifies the Mesa driver type and version. + */ + +struct program_binary_header { + /* If internal_format is 0, it must be followed by the 20 byte sha1 that +* identifies the Mesa driver and version supported. If we want to support +* something besides a sha1, then a new internal_format value can be added. +*/ + uint32_t internal_format; + uint8_t sha1[20]; + /* Fields following sha1 can be changed since the sha1 will guarantee that +* the binary only works with the same Mesa version. +*/ + uint32_t size; + uint32_t crc32; +}; + +unsigned +get_program_binary_header_size(void) +{ + return sizeof(struct program_binary_header); +} + +bool +write_program_binary(const void *payload, unsigned payload_size, + const void *sha1, void *binary, unsigned binary_size, + GLenum *binary_format) +{ + struct program_binary_header *hdr = binary; + + if (binary_size < sizeof(*hdr)) + return false; + + if (payload_size > binary_size - sizeof(*hdr)) + return false; + + hdr->internal_format = 0; + memcpy(hdr->sha1, sha1, sizeof(hdr->sha1)); + memcpy(hdr + 1, payload, payload_size); + hdr->size = payload_size; + + hdr->crc32 = util_hash_crc32(hdr + 1, payload_size); + *binary_format = GL_PROGRAM_BINARY_FORMAT_MESA; + + return true; +} + +static bool +simple_header_checks(const void *binary, unsigned length) +{ + const struct program_binary_header *hdr = binary; + if (binary == NULL || length < sizeof(*hdr)) + return false; + + if (hdr->internal_format != 0) + return false; + + return true; +} + +static
Re: [Mesa-dev] [PATCH 2/4] mesa: fix crash when an ATI_fs pass begins with an alpha inst
Am 21.11.2017 um 01:40 schrieb Ian Romanick: > On 11/20/2017 04:07 PM, Miklós Máté wrote: >> This fixes crash when: >> - first pass begins with alpha inst >> - first pass ends with color inst, second pass begins with alpha inst >> >> Also, use the symbolic name instead of a number. >> Piglit: spec/ati_fragment_shader/api-alphafirst >> >> Signed-off-by: Miklós Máté>> --- >> src/mesa/main/atifragshader.c | 6 -- >> 1 file changed, 4 insertions(+), 2 deletions(-) >> >> diff --git a/src/mesa/main/atifragshader.c b/src/mesa/main/atifragshader.c >> index 49ddb6e5af..d6fc37ac8f 100644 >> --- a/src/mesa/main/atifragshader.c >> +++ b/src/mesa/main/atifragshader.c >> @@ -598,8 +598,10 @@ _mesa_FragmentOpXATI(GLint optype, GLuint arg_count, >> GLenum op, GLuint dst, >>curProg->cur_pass=3; >> >> /* decide whether this is a new instruction or not ... all color >> instructions are new, >> - and alpha instructions might also be new if there was no preceding >> color inst */ >> - if ((optype == 0) || (curProg->last_optype == optype)) { >> + and alpha instructions might also be new if there was no preceding >> color inst, >> + and this may be the first inst of the pass */ > > I know the code previously used this same formatting, but the Mesa style > is for the */ of a multi-line comment to be on its own line. Each other > line should also start with a *. And line-wrap around 78 characters. > Like: > >/* Decide whether this is a new instruction or not. All color instructions > * are new, and alpha instructions might also be new if there was no > * preceding color inst. This may also be the first inst of the pass. > */ > >> + if ((optype == ATI_FRAGMENT_SHADER_COLOR_OP) || (curProg->last_optype == >> optype) >> + || (curProg->numArithInstr[curProg->cur_pass >> 1] == 0)) { > > I lost the debate about where the || (or &&) should go... it should be > on the previous line. Most of the parenthesis are unnecessary, and the > second line should line up with the opening (. > > On a side topic... if anyone understands how > ati_fragment_shader::cur_pass works, it would be really great if they > could document it in mtypes.h. This just indicates which pass is currently being specified. Some instructions will trigger a new pass, some instructions are only valid in the first or second pass and so on, and you can have a maximum of 2 passes. I guess it's a bit awkward how this needs to work due to the shader being specified bit-by-bit rather than all at once, but the actual concept is very similar to the multiple phases of d3d9 and r300 (and didn't i915 have something similar too). Of course, if you translate it away (on everything but r200) then only the error checking aspect of it really matters in the end. FWIW the patches all look correct to me, apparently there were quite some errors in there (I think it was mostly verified with doom3 at that time...). Roland > >>if (curProg->numArithInstr[curProg->cur_pass >> 1] > 7) { >> _mesa_error(ctx, GL_INVALID_OPERATION, "C/AFragmentOpATI(instrCount)"); >> return; >> > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://urldefense.proofpoint.com/v2/url?u=https-3A__lists.freedesktop.org_mailman_listinfo_mesa-2Ddev=DwIGaQ=uilaK90D4TOVoH58JNXRgQ=_QIjpv-UJ77xEQY8fIYoQtr5qv8wKrPJc7v7_-CYAb0=E021TggYOgVksqpASyumbrMCSKMsRIh5__J3WER0vyw=ek3znIpxIGsT2v0AG80jt9petuMBJvX1nXseAg81aYA= > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 09/12] compiler: All leaf Makefile.am should use +=
From: Ian RomanickThis slightly simplifies later changes that add more Makefile.*.am files. Signed-off-by: Ian Romanick --- src/compiler/Makefile.am | 1 + src/compiler/Makefile.glsl.am | 2 +- 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/src/compiler/Makefile.am b/src/compiler/Makefile.am index 45b9cc5..1f7a80b 100644 --- a/src/compiler/Makefile.am +++ b/src/compiler/Makefile.am @@ -53,6 +53,7 @@ noinst_LTLIBRARIES = libcompiler.la libcompiler_la_SOURCES = $(LIBCOMPILER_FILES) +noinst_PROGRAMS = check_PROGRAMS = TESTS = BUILT_SOURCES = diff --git a/src/compiler/Makefile.glsl.am b/src/compiler/Makefile.glsl.am index 179f415..ad19b14 100644 --- a/src/compiler/Makefile.glsl.am +++ b/src/compiler/Makefile.glsl.am @@ -54,7 +54,7 @@ check_PROGRAMS += \ glsl/tests/sampler-types-test \ glsl/tests/uniform-initializer-test -noinst_PROGRAMS = glsl_compiler +noinst_PROGRAMS += glsl_compiler glsl_tests_blob_test_SOURCES = \ glsl/tests/blob_test.c -- 2.9.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 06/12] util: Add and use util_is_power_of_two_nonzero
From: Ian RomanickSigned-off-by: Ian Romanick --- src/compiler/nir/nir_search_helpers.h | 13 - src/compiler/nir/nir_validate.c | 2 +- src/util/bitscan.h| 11 +++ 3 files changed, 16 insertions(+), 10 deletions(-) diff --git a/src/compiler/nir/nir_search_helpers.h b/src/compiler/nir/nir_search_helpers.h index 200f247..c8e0c8f 100644 --- a/src/compiler/nir/nir_search_helpers.h +++ b/src/compiler/nir/nir_search_helpers.h @@ -28,12 +28,7 @@ #define _NIR_SEARCH_HELPERS_ #include "nir.h" - -static inline bool -__is_power_of_two(unsigned int x) -{ - return ((x != 0) && !(x & (x - 1))); -} +#include "util/bitscan.h" static inline bool is_pos_power_of_two(nir_alu_instr *instr, unsigned src, unsigned num_components, @@ -50,11 +45,11 @@ is_pos_power_of_two(nir_alu_instr *instr, unsigned src, unsigned num_components, case nir_type_int: if (val->i32[swizzle[i]] < 0) return false; - if (!__is_power_of_two(val->i32[swizzle[i]])) + if (!util_is_power_of_two_nonzero(val->i32[swizzle[i]])) return false; break; case nir_type_uint: - if (!__is_power_of_two(val->u32[swizzle[i]])) + if (!util_is_power_of_two_nonzero(val->u32[swizzle[i]])) return false; break; default: @@ -80,7 +75,7 @@ is_neg_power_of_two(nir_alu_instr *instr, unsigned src, unsigned num_components, case nir_type_int: if (val->i32[swizzle[i]] > 0) return false; - if (!__is_power_of_two(abs(val->i32[swizzle[i]]))) + if (!util_is_power_of_two_nonzero(abs(val->i32[swizzle[i]]))) return false; break; default: diff --git a/src/compiler/nir/nir_validate.c b/src/compiler/nir/nir_validate.c index 9bf8c70..44b92aa 100644 --- a/src/compiler/nir/nir_validate.c +++ b/src/compiler/nir/nir_validate.c @@ -980,7 +980,7 @@ validate_var_decl(nir_variable *var, bool is_global, validate_state *state) validate_assert(state, is_global == nir_variable_is_global(var)); /* Must have exactly one mode set */ - validate_assert(state, util_bitcount(var->data.mode) == 1); + validate_assert(state, util_is_power_of_two_nonzero(var->data.mode)); if (var->data.compact) { /* The "compact" flag is only valid on arrays of scalars. */ diff --git a/src/util/bitscan.h b/src/util/bitscan.h index 2d4e46e..a3f2d41 100644 --- a/src/util/bitscan.h +++ b/src/util/bitscan.h @@ -119,6 +119,17 @@ util_is_power_of_two_or_zero(unsigned v) return (v & (v - 1)) == 0; } +/* Determine if an unsigned value is a power of two. + * + * \note + * Zero is \b not treated as a power of two. + */ +static inline bool +util_is_power_of_two_nonzero(unsigned v) +{ + return v != 0 && (v & (v - 1)) == 0; +} + /* For looping over a bitmask when you want to loop over consecutive bits * manually, for example: * -- 2.9.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 12/12] spirv: Generate SPIR-V builder infrastructure
From: Ian Romanickv2: Don't try to automatically set SpvCapabilityGeometry or SpvCapabilityTessellation. v3: Move some changes from a later patch back into this patch. They were only in the later patch because of rebase failure. Make version a parameter to emit_SpvHeader. Many Python style changes based on feedback from Dylan. Signed-off-by: Ian Romanick --- src/compiler/Makefile.sources | 2 + src/compiler/Makefile.spirv.am| 8 +- src/compiler/glsl/meson.build | 2 +- src/compiler/spirv/.gitignore | 1 + src/compiler/spirv/meson.build| 7 + src/compiler/spirv/spirv_builder.h| 245 ++ src/compiler/spirv/spirv_builder_h.py | 585 ++ 7 files changed, 847 insertions(+), 3 deletions(-) create mode 100644 src/compiler/spirv/spirv_builder.h create mode 100644 src/compiler/spirv/spirv_builder_h.py diff --git a/src/compiler/Makefile.sources b/src/compiler/Makefile.sources index 1d67cba..aea067f 100644 --- a/src/compiler/Makefile.sources +++ b/src/compiler/Makefile.sources @@ -287,6 +287,7 @@ NIR_FILES = \ nir/nir_worklist.h SPIRV_GENERATED_FILES = \ + spirv/spirv_builder_functions.h \ spirv/spirv_capabilities.cpp \ spirv/spirv_capabilities.h \ spirv/spirv_info.c @@ -295,6 +296,7 @@ SPIRV_FILES = \ spirv/GLSL.std.450.h \ spirv/nir_spirv.h \ spirv/spirv.h \ + spirv/spirv_builder.h \ spirv/spirv_info.h \ spirv/spirv_to_nir.c \ spirv/vtn_alu.c \ diff --git a/src/compiler/Makefile.spirv.am b/src/compiler/Makefile.spirv.am index 4bc684a..dc3c01c 100644 --- a/src/compiler/Makefile.spirv.am +++ b/src/compiler/Makefile.spirv.am @@ -20,13 +20,17 @@ # FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS # IN THE SOFTWARE. +spirv/spirv_builder_functions.h: spirv/spirv_builder_h.py spirv/spirv.core.grammar.json + $(MKDIR_GEN) + $(PYTHON_GEN) $(srcdir)/spirv/spirv_builder_h.py $(srcdir)/spirv/spirv.core.grammar.json $@ || ($(RM) $@; false) + spirv/spirv_capabilities.cpp: spirv/spirv_capabilities_h.py spirv/spirv.core.grammar.json $(MKDIR_GEN) - $(PYTHON_GEN) $(srcdir)/spirv/spirv_capabilities_h.py $(srcdir)/spirv/spirv.core.grammar.json $@ || ($(RM) $@; false) + $(PYTHON_GEN) $(srcdir)/spirv/spirv_capabilities_h.py --gen-cpp $@ $(srcdir)/spirv/spirv.core.grammar.json || ($(RM) $@; false) spirv/spirv_capabilities.h: spirv/spirv_capabilities_h.py spirv/spirv.core.grammar.json $(MKDIR_GEN) - $(PYTHON_GEN) $(srcdir)/spirv/spirv_capabilities_h.py $(srcdir)/spirv/spirv.core.grammar.json $@ || ($(RM) $@; false) + $(PYTHON_GEN) $(srcdir)/spirv/spirv_capabilities_h.py --gen-h $@ $(srcdir)/spirv/spirv.core.grammar.json || ($(RM) $@; false) spirv/spirv_info.c: spirv/spirv_info_c.py spirv/spirv.core.grammar.json $(MKDIR_GEN) diff --git a/src/compiler/glsl/meson.build b/src/compiler/glsl/meson.build index 6e43f80..6e264ba 100644 --- a/src/compiler/glsl/meson.build +++ b/src/compiler/glsl/meson.build @@ -199,7 +199,7 @@ libglsl = static_library( 'glsl', [files_libglsl, glsl_parser, glsl_lexer_cpp, ir_expression_operation_h, ir_expression_operation_strings_h, ir_expression_operation_constant_h, - spirv_capabilities_cpp, spirv_capabilities_h], + spirv_capabilities_cpp, spirv_capabilities_h, spirv_builder_functions_h], c_args : [c_vis_args, c_msvc_compat_args, no_override_init_args], cpp_args : [cpp_vis_args, cpp_msvc_compat_args], link_with : [libnir, libglcpp], diff --git a/src/compiler/spirv/.gitignore b/src/compiler/spirv/.gitignore index 6b5ef0a..a4753c8 100644 --- a/src/compiler/spirv/.gitignore +++ b/src/compiler/spirv/.gitignore @@ -1,3 +1,4 @@ +/spirv_builder_functions.h /spirv_capabilities.cpp /spirv_capabilities.h /spirv_info.c diff --git a/src/compiler/spirv/meson.build b/src/compiler/spirv/meson.build index 8b6071a..92c0a76 100644 --- a/src/compiler/spirv/meson.build +++ b/src/compiler/spirv/meson.build @@ -38,3 +38,10 @@ spirv_capabilities_h = custom_target( output : 'spirv_capabilities.h', command : [prog_python2, '@INPUT0@', '--gen-h', '@OUTPUT@', '@INPUT1@'], ) + +spirv_builder_functions_h = custom_target( + 'spirv_builder_functions.h', + input : files('spirv_builder_h.py', 'spirv.core.grammar.json'), + output : 'spirv_builder_functions.h', + command : [prog_python2, '@INPUT0@', '@INPUT1@', '@OUTPUT@'], +) diff --git a/src/compiler/spirv/spirv_builder.h b/src/compiler/spirv/spirv_builder.h new file mode 100644 index 000..f82f00f --- /dev/null +++ b/src/compiler/spirv/spirv_builder.h @@ -0,0 +1,245 @@ +/* -*- c++ -*- */ +/* + * Copyright (C) 2017 Intel Corporation + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal
[Mesa-dev] [PATCH 03/12] spirv: Import the latest 1.2.2 header and JSON from Khronos
From: Ian RomanickSigned-off-by: Ian Romanick --- src/compiler/spirv/spirv.core.grammar.json | 193 ++--- src/compiler/spirv/spirv.h | 23 +++- 2 files changed, 197 insertions(+), 19 deletions(-) diff --git a/src/compiler/spirv/spirv.core.grammar.json b/src/compiler/spirv/spirv.core.grammar.json index e2950dd..9a9b903 100644 --- a/src/compiler/spirv/spirv.core.grammar.json +++ b/src/compiler/spirv/spirv.core.grammar.json @@ -27,7 +27,7 @@ "magic_number" : "0x07230203", "major_version" : 1, "minor_version" : 2, - "revision" : 1, + "revision" : 2, "instructions" : [ { "opname" : "OpNop", @@ -2410,7 +2410,7 @@ { "kind" : "IdResult" }, { "kind" : "IdScope","name" : "'Execution'" }, { "kind" : "GroupOperation", "name" : "'Operation'" }, -{ "kind" : "IdRef", "name" : "X" } +{ "kind" : "IdRef", "name" : "'X'" } ], "capabilities" : [ "Groups" ] }, @@ -2434,7 +2434,7 @@ { "kind" : "IdResult" }, { "kind" : "IdScope","name" : "'Execution'" }, { "kind" : "GroupOperation", "name" : "'Operation'" }, -{ "kind" : "IdRef", "name" : "X" } +{ "kind" : "IdRef", "name" : "'X'" } ], "capabilities" : [ "Groups" ] }, @@ -2446,7 +2446,7 @@ { "kind" : "IdResult" }, { "kind" : "IdScope","name" : "'Execution'" }, { "kind" : "GroupOperation", "name" : "'Operation'" }, -{ "kind" : "IdRef", "name" : "X" } +{ "kind" : "IdRef", "name" : "'X'" } ], "capabilities" : [ "Groups" ] }, @@ -2458,7 +2458,7 @@ { "kind" : "IdResult" }, { "kind" : "IdScope","name" : "'Execution'" }, { "kind" : "GroupOperation", "name" : "'Operation'" }, -{ "kind" : "IdRef", "name" : "X" } +{ "kind" : "IdRef", "name" : "'X'" } ], "capabilities" : [ "Groups" ] }, @@ -2470,7 +2470,7 @@ { "kind" : "IdResult" }, { "kind" : "IdScope","name" : "'Execution'" }, { "kind" : "GroupOperation", "name" : "'Operation'" }, -{ "kind" : "IdRef", "name" : "X" } +{ "kind" : "IdRef", "name" : "'X'" } ], "capabilities" : [ "Groups" ] }, @@ -3223,7 +3223,7 @@ { "kind" : "IdResult" }, { "kind" : "IdScope","name" : "'Execution'" }, { "kind" : "GroupOperation", "name" : "'Operation'" }, -{ "kind" : "IdRef", "name" : "X" } +{ "kind" : "IdRef", "name" : "'X'" } ], "capabilities" : [ "Groups" ] }, @@ -3247,7 +3247,7 @@ { "kind" : "IdResult" }, { "kind" : "IdScope","name" : "'Execution'" }, { "kind" : "GroupOperation", "name" : "'Operation'" }, -{ "kind" : "IdRef", "name" : "X" } +{ "kind" : "IdRef", "name" : "'X'" } ], "capabilities" : [ "Groups" ] }, @@ -3259,7 +3259,7 @@ { "kind" : "IdResult" }, { "kind" : "IdScope","name" : "'Execution'" }, { "kind" : "GroupOperation", "name" : "'Operation'" }, -{ "kind" : "IdRef", "name" : "X" } +{ "kind" : "IdRef", "name" : "'X'" } ], "capabilities" : [ "Groups" ] }, @@ -3271,7 +3271,7 @@ { "kind" : "IdResult" }, { "kind" : "IdScope","name" : "'Execution'" }, { "kind" : "GroupOperation", "name" : "'Operation'" }, -{ "kind" : "IdRef", "name" : "X" } +{ "kind" : "IdRef", "name" : "'X'" } ], "capabilities" : [ "Groups" ] }, @@ -3283,9 +3283,118 @@ { "kind" : "IdResult" }, { "kind" : "IdScope","name" : "'Execution'" }, { "kind" : "GroupOperation", "name" : "'Operation'" }, -{ "kind" : "IdRef", "name" : "X" } +{ "kind" : "IdRef", "name" : "'X'" } ], "capabilities" : [ "Groups" ] +}, +{ + "opname" : "OpFragmentMaskFetchAMD", + "opcode" : 5011, + "operands" : [ +{ "kind" : "IdResultType" }, +{ "kind" : "IdResult" }, +{ "kind" : "IdRef", "name" : "'Image'" }, +{ "kind" : "IdRef", "name" : "'Coordinate'" } + ], + "capabilities" : [ "FragmentMaskAMD" ] +}, +{ + "opname" : "OpFragmentFetchAMD", + "opcode" : 5012, + "operands" : [ +{ "kind" : "IdResultType" }, +{ "kind" : "IdResult" }, +{ "kind" : "IdRef", "name" : "'Image'" }, +{ "kind" : "IdRef", "name" : "'Coordinate'" }, +{ "kind" : "IdRef", "name" : "'Fragment Index'" } + ], + "capabilities" : [ "FragmentMaskAMD" ] +}, +{ + "opname" : "OpSubgroupShuffleINTEL", + "opcode" :
[Mesa-dev] [PATCH 11/12] spirv: Generate code to track SPIR-V capability dependencies
From: Ian Romanickv2: Clean ups. Remove some functions that never ended up being used. v3: After updating spirv.core.grammar.json, fix the handling of ShaderViewportMaskNV. See the comment around line 71 of spirv_capabilities_h.py. v4: Many Python style changes based on feedback from Dylan. Signed-off-by: Ian Romanick --- src/compiler/Makefile.sources | 2 + src/compiler/Makefile.spirv.am | 8 + src/compiler/glsl/meson.build | 5 +- src/compiler/spirv/.gitignore | 2 + src/compiler/spirv/meson.build | 14 ++ src/compiler/spirv/spirv_capabilities_h.py | 365 + 6 files changed, 394 insertions(+), 2 deletions(-) create mode 100644 src/compiler/spirv/spirv_capabilities_h.py diff --git a/src/compiler/Makefile.sources b/src/compiler/Makefile.sources index 2ab8e16..1d67cba 100644 --- a/src/compiler/Makefile.sources +++ b/src/compiler/Makefile.sources @@ -287,6 +287,8 @@ NIR_FILES = \ nir/nir_worklist.h SPIRV_GENERATED_FILES = \ + spirv/spirv_capabilities.cpp \ + spirv/spirv_capabilities.h \ spirv/spirv_info.c SPIRV_FILES = \ diff --git a/src/compiler/Makefile.spirv.am b/src/compiler/Makefile.spirv.am index 9841004..4bc684a 100644 --- a/src/compiler/Makefile.spirv.am +++ b/src/compiler/Makefile.spirv.am @@ -20,6 +20,14 @@ # FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS # IN THE SOFTWARE. +spirv/spirv_capabilities.cpp: spirv/spirv_capabilities_h.py spirv/spirv.core.grammar.json + $(MKDIR_GEN) + $(PYTHON_GEN) $(srcdir)/spirv/spirv_capabilities_h.py $(srcdir)/spirv/spirv.core.grammar.json $@ || ($(RM) $@; false) + +spirv/spirv_capabilities.h: spirv/spirv_capabilities_h.py spirv/spirv.core.grammar.json + $(MKDIR_GEN) + $(PYTHON_GEN) $(srcdir)/spirv/spirv_capabilities_h.py $(srcdir)/spirv/spirv.core.grammar.json $@ || ($(RM) $@; false) + spirv/spirv_info.c: spirv/spirv_info_c.py spirv/spirv.core.grammar.json $(MKDIR_GEN) $(PYTHON_GEN) $(srcdir)/spirv/spirv_info_c.py $(srcdir)/spirv/spirv.core.grammar.json $@ || ($(RM) $@; false) diff --git a/src/compiler/glsl/meson.build b/src/compiler/glsl/meson.build index 5b505c0..6e43f80 100644 --- a/src/compiler/glsl/meson.build +++ b/src/compiler/glsl/meson.build @@ -198,11 +198,12 @@ files_libglsl_standalone = files( libglsl = static_library( 'glsl', [files_libglsl, glsl_parser, glsl_lexer_cpp, ir_expression_operation_h, - ir_expression_operation_strings_h, ir_expression_operation_constant_h], + ir_expression_operation_strings_h, ir_expression_operation_constant_h, + spirv_capabilities_cpp, spirv_capabilities_h], c_args : [c_vis_args, c_msvc_compat_args, no_override_init_args], cpp_args : [cpp_vis_args, cpp_msvc_compat_args], link_with : [libnir, libglcpp], - include_directories : [inc_common, inc_compiler, inc_nir], + include_directories : [inc_common, inc_compiler, inc_nir, inc_spirv], build_by_default : false, ) diff --git a/src/compiler/spirv/.gitignore b/src/compiler/spirv/.gitignore index f723c31..6b5ef0a 100644 --- a/src/compiler/spirv/.gitignore +++ b/src/compiler/spirv/.gitignore @@ -1 +1,3 @@ +/spirv_capabilities.cpp +/spirv_capabilities.h /spirv_info.c diff --git a/src/compiler/spirv/meson.build b/src/compiler/spirv/meson.build index 8f1a02e..8b6071a 100644 --- a/src/compiler/spirv/meson.build +++ b/src/compiler/spirv/meson.build @@ -24,3 +24,17 @@ spirv_info_c = custom_target( output : 'spirv_info.c', command : [prog_python2, '@INPUT0@', '@INPUT1@', '@OUTPUT@'], ) + +spirv_capabilities_cpp = custom_target( + 'spirv_capabilities.cpp', + input : files('spirv_capabilities_h.py', 'spirv.core.grammar.json'), + output : 'spirv_capabilities.cpp', + command : [prog_python2, '@INPUT0@', '--gen-cpp', '@OUTPUT@', '@INPUT1@'], +) + +spirv_capabilities_h = custom_target( + 'spirv_capabilities.h', + input : files('spirv_capabilities_h.py', 'spirv.core.grammar.json'), + output : 'spirv_capabilities.h', + command : [prog_python2, '@INPUT0@', '--gen-h', '@OUTPUT@', '@INPUT1@'], +) diff --git a/src/compiler/spirv/spirv_capabilities_h.py b/src/compiler/spirv/spirv_capabilities_h.py new file mode 100644 index 000..78b1166 --- /dev/null +++ b/src/compiler/spirv/spirv_capabilities_h.py @@ -0,0 +1,365 @@ +COPYRIGHT = """\ +/* + * Copyright (C) 2017 Intel Corporation + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this
[Mesa-dev] [PATCH 10/12] spirv: Move SPIR-V building to Makefile.spirv.am and spirv/meson.build
From: Ian RomanickFuture changes will add generated files used only from src/compiler/glsl. These can't be built from Makefile.nir.am, and we can't move all the rules from Makefile.nir.am to Makefile.spirv.am (and it would be silly anyway). v2: Do it for meson too. Signed-off-by: Ian Romanick --- src/compiler/Makefile.am | 2 ++ src/compiler/Makefile.nir.am | 31 ++-- src/compiler/Makefile.spirv.am | 54 ++ src/compiler/meson.build | 5 src/compiler/nir/meson.build | 7 -- src/compiler/spirv/meson.build | 26 6 files changed, 89 insertions(+), 36 deletions(-) create mode 100644 src/compiler/Makefile.spirv.am create mode 100644 src/compiler/spirv/meson.build diff --git a/src/compiler/Makefile.am b/src/compiler/Makefile.am index 1f7a80b..3cc96df 100644 --- a/src/compiler/Makefile.am +++ b/src/compiler/Makefile.am @@ -63,6 +63,8 @@ EXTRA_DIST = SConscript MKDIR_GEN = $(AM_V_at)$(MKDIR_P) $(@D) PYTHON_GEN = $(AM_V_GEN)$(PYTHON2) $(PYTHON_FLAGS) +include Makefile.spirv.am + include Makefile.glsl.am include Makefile.nir.am diff --git a/src/compiler/Makefile.nir.am b/src/compiler/Makefile.nir.am index 1533ee5..0b9d419 100644 --- a/src/compiler/Makefile.nir.am +++ b/src/compiler/Makefile.nir.am @@ -52,29 +52,6 @@ nir/nir_opt_algebraic.c: nir/nir_opt_algebraic.py nir/nir_algebraic.py $(MKDIR_GEN) $(PYTHON_GEN) $(srcdir)/nir/nir_opt_algebraic.py > $@ || ($(RM) $@; false) -spirv/spirv_info.c: spirv/spirv_info_c.py spirv/spirv.core.grammar.json - $(MKDIR_GEN) - $(PYTHON_GEN) $(srcdir)/spirv/spirv_info_c.py $(srcdir)/spirv/spirv.core.grammar.json $@ || ($(RM) $@; false) - -noinst_PROGRAMS += spirv2nir - -spirv2nir_SOURCES = \ - spirv/spirv2nir.c - -spirv2nir_CPPFLAGS = \ - $(AM_CPPFLAGS) \ - -I$(top_builddir)/src/compiler/nir \ - -I$(top_srcdir)/src/compiler/nir\ - -I$(top_srcdir)/src/compiler/spirv - -spirv2nir_LDADD = \ - nir/libnir.la \ - $(top_builddir)/src/util/libmesautil.la \ - -lm \ - $(PTHREAD_LIBS) - -nodist_EXTRA_spirv2nir_SOURCES = dummy.cpp - check_PROGRAMS += nir/tests/control_flow_tests nir_tests_control_flow_tests_CPPFLAGS = \ @@ -97,12 +74,10 @@ TESTS += nir/tests/control_flow_tests BUILT_SOURCES += \ - $(NIR_GENERATED_FILES) \ - $(SPIRV_GENERATED_FILES) + $(NIR_GENERATED_FILES) CLEANFILES += \ - $(NIR_GENERATED_FILES) \ - $(SPIRV_GENERATED_FILES) + $(NIR_GENERATED_FILES) EXTRA_DIST += \ nir/nir_algebraic.py\ @@ -114,6 +89,4 @@ EXTRA_DIST += \ nir/nir_opt_algebraic.py\ nir/tests \ nir/README \ - spirv/spirv_info_c.py \ - spirv/spirv.core.grammar.json \ SConscript.nir diff --git a/src/compiler/Makefile.spirv.am b/src/compiler/Makefile.spirv.am new file mode 100644 index 000..9841004 --- /dev/null +++ b/src/compiler/Makefile.spirv.am @@ -0,0 +1,54 @@ +# +# Copyright (C) 2017 Intel Corporation +# +# Permission is hereby granted, free of charge, to any person obtaining a +# copy of this software and associated documentation files (the "Software"), +# to deal in the Software without restriction, including without limitation +# the rights to use, copy, modify, merge, publish, distribute, sublicense, +# and/or sell copies of the Software, and to permit persons to whom the +# Software is furnished to do so, subject to the following conditions: +# +# The above copyright notice and this permission notice (including the next +# paragraph) shall be included in all copies or substantial portions of the +# Software. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL +# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING +# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS +# IN THE SOFTWARE. + +spirv/spirv_info.c: spirv/spirv_info_c.py spirv/spirv.core.grammar.json + $(MKDIR_GEN) + $(PYTHON_GEN) $(srcdir)/spirv/spirv_info_c.py $(srcdir)/spirv/spirv.core.grammar.json $@ || ($(RM) $@; false) + +noinst_PROGRAMS += spirv2nir + +spirv2nir_SOURCES = \ + spirv/spirv2nir.c + +spirv2nir_CPPFLAGS = \ + $(AM_CPPFLAGS) \ +
[Mesa-dev] [PATCH 05/12] util: Move util_is_power_of_two to bitscan.h and rename to util_is_power_of_two_or_zero
From: Ian RomanickThe new name make the zero-input behavior more obvious. The next patch adds a new function with different zero-input behavior. Signed-off-by: Ian Romanick Suggested-by: Matt Turner --- src/amd/common/ac_gpu_info.c | 4 ++-- src/amd/common/ac_surface.c | 2 +- src/amd/vulkan/radv_formats.c| 4 ++-- src/broadcom/compiler/nir_to_vir.c | 4 ++-- src/gallium/auxiliary/gallivm/lp_bld_arit.c | 2 +- src/gallium/auxiliary/gallivm/lp_bld_debug.cpp | 2 +- src/gallium/auxiliary/gallivm/lp_bld_format_aos.c| 4 ++-- src/gallium/auxiliary/gallivm/lp_bld_gather.c| 8 src/gallium/auxiliary/gallivm/lp_bld_pack.c | 2 +- src/gallium/auxiliary/gallivm/lp_bld_sample.c| 6 +++--- src/gallium/auxiliary/util/u_math.h | 10 +- src/gallium/auxiliary/util/u_ringbuffer.c| 2 +- src/gallium/drivers/etnaviv/etnaviv_texture.c| 2 +- src/gallium/drivers/freedreno/freedreno_query_hw.c | 2 +- src/gallium/drivers/i915/i915_state_sampler.c| 3 ++- src/gallium/drivers/llvmpipe/lp_state_fs.c | 2 +- src/gallium/drivers/llvmpipe/lp_texture.c| 2 +- src/gallium/drivers/nouveau/codegen/nv50_ir.cpp | 2 +- src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 4 ++-- src/gallium/drivers/nouveau/nv30/nv30_miptree.c | 6 +++--- src/gallium/drivers/nouveau/nvc0/nvc0_state_validate.c | 2 +- src/gallium/drivers/r300/r300_texture_desc.c | 6 +++--- src/gallium/drivers/r600/r600_texture.c | 2 +- src/gallium/drivers/radeon/r600_texture.c| 2 +- src/gallium/drivers/radeonsi/si_shader_tgsi_setup.c | 2 +- src/gallium/drivers/softpipe/sp_texture.c| 12 ++-- src/gallium/drivers/swr/swr_screen.cpp | 4 ++-- src/gallium/drivers/vc4/vc4_program.c| 4 ++-- src/intel/vulkan/anv_allocator.c | 6 +++--- src/intel/vulkan/anv_formats.c | 4 ++-- src/intel/vulkan/anv_nir_lower_multiview.c | 2 +- src/mesa/state_tracker/st_cb_readpixels.c| 4 ++-- src/mesa/state_tracker/st_cb_texture.c | 6 +++--- src/util/bitscan.h | 12 src/util/u_vector.c | 4 ++-- 35 files changed, 75 insertions(+), 70 deletions(-) diff --git a/src/amd/common/ac_gpu_info.c b/src/amd/common/ac_gpu_info.c index 6e34a07..8e44b90 100644 --- a/src/amd/common/ac_gpu_info.c +++ b/src/amd/common/ac_gpu_info.c @@ -297,8 +297,8 @@ bool ac_query_gpu_info(int fd, amdgpu_device_handle dev, } info->has_virtual_memory = true; - assert(util_is_power_of_two(dma.available_rings + 1)); - assert(util_is_power_of_two(compute.available_rings + 1)); + assert(util_is_power_of_two_or_zero(dma.available_rings + 1)); + assert(util_is_power_of_two_or_zero(compute.available_rings + 1)); info->num_sdma_rings = util_bitcount(dma.available_rings); info->num_compute_rings = util_bitcount(compute.available_rings); diff --git a/src/amd/common/ac_surface.c b/src/amd/common/ac_surface.c index f7600a3..0ee7e65 100644 --- a/src/amd/common/ac_surface.c +++ b/src/amd/common/ac_surface.c @@ -271,7 +271,7 @@ static int gfx6_compute_level(ADDR_HANDLE addrlib, AddrSurfInfoIn->bpp) { unsigned alignment = 256 / (AddrSurfInfoIn->bpp / 8); - assert(util_is_power_of_two(AddrSurfInfoIn->bpp)); + assert(util_is_power_of_two_or_zero(AddrSurfInfoIn->bpp)); AddrSurfInfoIn->width = align(AddrSurfInfoIn->width, alignment); } diff --git a/src/amd/vulkan/radv_formats.c b/src/amd/vulkan/radv_formats.c index 5c79ea7..587be9b 100644 --- a/src/amd/vulkan/radv_formats.c +++ b/src/amd/vulkan/radv_formats.c @@ -602,13 +602,13 @@ radv_physical_device_get_format_properties(struct radv_physical_device *physical tiled |= VK_FORMAT_FEATURE_COLOR_ATTACHMENT_BLEND_BIT; } } - if (tiled && util_is_power_of_two(vk_format_get_blocksize(format)) && !scaled) { + if (tiled && util_is_power_of_two_or_zero(vk_format_get_blocksize(format)) && !scaled) { tiled |= VK_FORMAT_FEATURE_TRANSFER_SRC_BIT_KHR | VK_FORMAT_FEATURE_TRANSFER_DST_BIT_KHR; } } - if (linear && util_is_power_of_two(vk_format_get_blocksize(format)) && !scaled) { + if (linear && util_is_power_of_two_or_zero(vk_format_get_blocksize(format)) &&
[Mesa-dev] [PATCH 08/12] util: Include bitscan.h directly
From: Ian RomanickPreviously bitset.h would include u_math.h to get bitscan.h. u_math.h lives in src/gallium/auxiliary/util while both bitset.h and bitscan.h live in src/util. Having the one file directly include another file that lives in the same directory makes much more sense. As a side-effect, several files need to directly include standard header files that were previously indirectly included. v2: Fix build break in src/amd/common/ac_nir_to_llvm.c. Signed-off-by: Ian Romanick --- src/amd/common/ac_nir_to_llvm.c | 1 + src/compiler/nir/nir.c | 1 + src/compiler/spirv/vtn_alu.c | 1 + src/compiler/spirv/vtn_glsl450.c | 1 + src/util/bitset.h| 2 +- 5 files changed, 5 insertions(+), 1 deletion(-) diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c index 1ecdeca..08204a4 100644 --- a/src/amd/common/ac_nir_to_llvm.c +++ b/src/amd/common/ac_nir_to_llvm.c @@ -29,6 +29,7 @@ #include "nir/nir.h" #include "../vulkan/radv_descriptor_set.h" #include "util/bitscan.h" +#include "util/u_math.h" #include #include "ac_shader_abi.h" #include "ac_shader_info.h" diff --git a/src/compiler/nir/nir.c b/src/compiler/nir/nir.c index 7380bf4..ffe880c 100644 --- a/src/compiler/nir/nir.c +++ b/src/compiler/nir/nir.c @@ -27,6 +27,7 @@ #include "nir.h" #include "nir_control_flow_private.h" +#include #include nir_shader * diff --git a/src/compiler/spirv/vtn_alu.c b/src/compiler/spirv/vtn_alu.c index ecf9cbc..ad3f217 100644 --- a/src/compiler/spirv/vtn_alu.c +++ b/src/compiler/spirv/vtn_alu.c @@ -21,6 +21,7 @@ * IN THE SOFTWARE. */ +#include #include "vtn_private.h" /* diff --git a/src/compiler/spirv/vtn_glsl450.c b/src/compiler/spirv/vtn_glsl450.c index c30dcc7..0540d5b 100644 --- a/src/compiler/spirv/vtn_glsl450.c +++ b/src/compiler/spirv/vtn_glsl450.c @@ -25,6 +25,7 @@ * */ +#include #include "vtn_private.h" #include "GLSL.std.450.h" diff --git a/src/util/bitset.h b/src/util/bitset.h index 2404ce7..8c9e856 100644 --- a/src/util/bitset.h +++ b/src/util/bitset.h @@ -31,7 +31,7 @@ #ifndef BITSET_H #define BITSET_H -#include "util/u_math.h" +#include "util/bitscan.h" / * generic bitset implementation -- 2.9.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 00/12 v2] The first of the real SPIR-V work
Almost every patch of the original send was modified, and a couple patches were added. I decided it was easier to re-send the whole thing as a group. This is the first block of patches to enable generating SPIR-V from Mesa's GLSL compiler. The initial use of this is for testing GL_ARB_spirv, but it may eventually be useful in its own right. A lot more work will be necessary for that to happen, though. Most of the work is done, but there are still some bits, notably UBOs and images, left to do. Of the ~46k shaders in my shader-db, I can generate SPIR-V that passes the validator for nearly 10k. Many shaders fail because they use legacy features that cannot be represented in SPIR-V. I am sending this series out now to prime the review process. The whole series is on the order of 80 patches, so I think it's valuable to send things out in smaller, self-contained chunks. The whole series (with some work-in-progress patches) is at https://cgit.freedesktop.org/~idr/mesa/log/?h=emit-spirv The implementation is divided into three layers, and this patch series represents the bottom layer. 1. Raw SPIR-V emit code. This layer has zero knowledge of GLSL IR or anything specific to Mesa. It should be trivial for someone to pull this code out and use it in a separate project. There are two parts. The first handles tracking and managing SPIR-V capabilities. The second part implements a very low-level SPIR-V instruction builder sytled after ir_builder and nir_builder. There are a couple rough bits here with respect to automatic setting of SPIR-V capabilites. There are cases where a SPIR-V feature requires any one of several capabilities, so it is impossible to pick one to enable. 2. Unstructured GLSL IR to SPIR-V. This layer handles emitting single bits of GLSL IR in SPIR-V without knowledge of the global structure. This is mostly things like emitting glsl_type, ir_variable, and ir_function_signature. At 38, this is the largest group of patches. This is mostly nailed down, but the UBO work is going to cause some changes. I hope to finally wrap that up next week. 3. SPIR-V visitor. This top-most layer handles the overall structure of the shader. It uses a visitor in a similar fashion to glsl_to_mesa and glsl_to_tgsi. Use a generator styled after nir_opt_algebraic has kept the amount of code here pretty small. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 04/12] spirv: Add SubgroupBallotKHR capability to SubgroupSize and SubgroupLocalInvocationId
From: Ian RomanickThe SPV_KHR_shader_ballot spec says: (Add the SubgroupBallotKHR capability to SubgroupSize.) (Add the SubgroupBallotKHR capability to SubgroupLocalInvocationId.) Yet the annotations are missing from the JSON. See also https://github.com/KhronosGroup/SPIRV-Headers/issues/44. --- src/compiler/spirv/spirv.core.grammar.json | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/compiler/spirv/spirv.core.grammar.json b/src/compiler/spirv/spirv.core.grammar.json index 9a9b903..67903d5 100644 --- a/src/compiler/spirv/spirv.core.grammar.json +++ b/src/compiler/spirv/spirv.core.grammar.json @@ -5210,7 +5210,7 @@ { "enumerant" : "SubgroupSize", "value" : 36, - "capabilities" : [ "Kernel" ] + "capabilities" : [ "Kernel", "SubgroupBallotKHR" ] }, { "enumerant" : "SubgroupMaxSize", @@ -5235,7 +5235,7 @@ { "enumerant" : "SubgroupLocalInvocationId", "value" : 41, - "capabilities" : [ "Kernel" ] + "capabilities" : [ "Kernel", "SubgroupBallotKHR" ] }, { "enumerant" : "VertexIndex", -- 2.9.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 01/12] i965: Silence "enumeral and non-enumeral type in conditional expression" warnings
From: Ian Romanickcompiler/brw_inst.h: In function ‘brw_reg_type brw_inst_dst_type(const gen_device_info*, const brw_inst*)’: compiler/brw_inst.h:801:55: warning: enumeral and non-enumeral type in conditional expression [-Wextra] unsigned file = __builtin_strcmp("dst", #reg) == 0 ? \ ~~~^ BRW_GENERAL_REGISTER_FILE :\ brw_inst_##reg##_reg_file(devinfo, inst); \ compiler/brw_inst.h:808:1: note: in expansion of macro ‘REG_TYPE’ REG_TYPE(dst) ^~~~ Signed-off-by: Ian Romanick --- src/intel/compiler/brw_inst.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/intel/compiler/brw_inst.h b/src/intel/compiler/brw_inst.h index 99e637e..dcd955d 100644 --- a/src/intel/compiler/brw_inst.h +++ b/src/intel/compiler/brw_inst.h @@ -799,7 +799,7 @@ brw_inst_##reg##_type(const struct gen_device_info *devinfo, \ const brw_inst *inst) \ { \ unsigned file = __builtin_strcmp("dst", #reg) == 0 ? \ - BRW_GENERAL_REGISTER_FILE :\ + (unsigned) BRW_GENERAL_REGISTER_FILE : \ brw_inst_##reg##_reg_file(devinfo, inst); \ unsigned hw_type = brw_inst_##reg##_reg_hw_type(devinfo, inst);\ return brw_hw_type_to_reg_type(devinfo, (enum brw_reg_file)file, hw_type); \ -- 2.9.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 02/12] i965: Don't request GLSL IR lowering of gl_VertexID
From: Ian RomanickLet the lowering in NIR handle it instead. This hurts one shader that occurs twice in shader-db (SynMark GSCloth) on IVB and HSW. No other shaders or platforms were affected. total cycles in shared programs: 253438422 -> 253438426 (0.00%) cycles in affected programs: 412 -> 416 (0.97%) helped: 0 HURT: 2 Signed-off-by: Ian Romanick --- src/mesa/drivers/dri/i965/brw_context.c | 1 - 1 file changed, 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c index dd55b43..165043e 100644 --- a/src/mesa/drivers/dri/i965/brw_context.c +++ b/src/mesa/drivers/dri/i965/brw_context.c @@ -585,7 +585,6 @@ brw_initialize_context_constants(struct brw_context *brw) ctx->Const.QuadsFollowProvokingVertexConvention = false; ctx->Const.NativeIntegers = true; - ctx->Const.VertexID_is_zero_based = true; /* Regarding the CMP instruction, the Ivybridge PRM says: * -- 2.9.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 07/12] util: Optimize util_is_power_of_two_nonzero
From: Ian RomanickSigned-off-by: Ian Romanick Suggested-by: Matt Turner --- src/util/bitscan.h | 17 + 1 file changed, 17 insertions(+) diff --git a/src/util/bitscan.h b/src/util/bitscan.h index a3f2d41..5cc75f0 100644 --- a/src/util/bitscan.h +++ b/src/util/bitscan.h @@ -38,6 +38,10 @@ #include #endif +#if defined(__POPCNT__) +#include +#endif + #include "c99_compat.h" #ifdef __cplusplus @@ -127,7 +131,20 @@ util_is_power_of_two_or_zero(unsigned v) static inline bool util_is_power_of_two_nonzero(unsigned v) { + /* __POPCNT__ is different from HAVE___BUILTIN_POPCOUNT. The latter +* indicates the existence of the __builtin_popcount function. The former +* indicates that _mm_popcnt_u32 exists and is a native instruction. +* +* The other alternative is to use SSE 4.2 compile-time flags. This has +* two drawbacks. First, there is currently no build infrastructure for +* SSE 4.2 (only 4.1), so that would have to be added. Second, some AMD +* CPUs support POPCNT but not SSE 4.2 (e.g., Barcelona). +*/ +#ifdef __POPCNT__ + return _mm_popcnt_u32(v) == 1; +#else return v != 0 && (v & (v - 1)) == 0; +#endif } /* For looping over a bitmask when you want to loop over consecutive bits -- 2.9.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] i965/cnl: Correctly sample from fast-cleared sRGB textures
On Mon, Nov 20, 2017 at 04:03:20PM -0800, Jason Ekstrand wrote: > On Mon, Nov 20, 2017 at 3:48 PM, Nanley Cherywrote: > > > On Mon, Nov 20, 2017 at 02:55:28PM -0800, Jason Ekstrand wrote: > > > On Mon, Nov 20, 2017 at 2:10 PM, Nanley Chery > > wrote: > > > > > > > On Mon, Nov 20, 2017 at 01:54:39PM -0800, Nanley Chery wrote: > > > > > When sampling from fast-cleared sRGB textures on gen10, the hardware > > > > > will not decode the clear color to linear. We must therefore do it in > > > > > software. > > > > > > > > > > > > > I should've included the tests that are now passing (added in locally): > > > > > > > > * spec@arb_framebuffer_srgb@arb_framebuffer_srgb-fast-clear-blend > > > > * spec@arb_framebuffer_srgb@fbo-fast-clear > > > > * spec@arb_framebuffer_srgb@msaa-fast-clear > > > > * spec@ext_texture_srgb@fbo-fast-clear > > > > * spec@ext_texture_srgb@multisample-fast-clear gl_ext_texture_srgb > > > > > > > > -Nanley > > > > > > > > > Signed-off-by: Nanley Chery > > > > > --- > > > > > src/mesa/drivers/dri/i965/brw_blorp.c| 7 + > > > > > src/mesa/drivers/dri/i965/brw_meta_util.c| 38 > > > > > > > > > src/mesa/drivers/dri/i965/brw_meta_util.h| 5 > > > > > src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 9 +- > > > > > 4 files changed, 58 insertions(+), 1 deletion(-) > > > > > > > > > > diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c > > > > b/src/mesa/drivers/dri/i965/brw_blorp.c > > > > > index 38284d3b85..b10e9b95b7 100644 > > > > > --- a/src/mesa/drivers/dri/i965/brw_blorp.c > > > > > +++ b/src/mesa/drivers/dri/i965/brw_blorp.c > > > > > @@ -328,6 +328,13 @@ brw_blorp_blit_miptrees(struct brw_context *brw, > > > > > struct blorp_surf src_surf, dst_surf; > > > > > blorp_surf_for_miptree(brw, _surf, src_mt, src_aux_usage, > > false, > > > > >_level, src_layer, 1, _surfs[0]); > > > > > + if (devinfo->gen >= 10 && isl_format_is_srgb(src_isl_format) && > > > > > + src_clear_supported) { > > > > > + src_surf.clear_color = > > > > > + gen10_convert_srgb_fast_clear_color(devinfo, > > src_isl_format, > > > > > + > > src_mt->fast_clear_color); > > > > > + } > > > > > + > > > > > blorp_surf_for_miptree(brw, _surf, dst_mt, dst_aux_usage, > > true, > > > > >_level, dst_layer, 1, _surfs[1]); > > > > > > > > > > diff --git a/src/mesa/drivers/dri/i965/brw_meta_util.c > > > > b/src/mesa/drivers/dri/i965/brw_meta_util.c > > > > > index d292f5a8e2..97c6114f8a 100644 > > > > > --- a/src/mesa/drivers/dri/i965/brw_meta_util.c > > > > > +++ b/src/mesa/drivers/dri/i965/brw_meta_util.c > > > > > @@ -28,6 +28,7 @@ > > > > > #include "brw_state.h" > > > > > #include "main/blend.h" > > > > > #include "main/fbobject.h" > > > > > +#include "main/format_utils.h" > > > > > #include "util/format_srgb.h" > > > > > > > > > > /** > > > > > @@ -315,6 +316,43 @@ brw_is_color_fast_clear_compatible(struct > > > > brw_context *brw, > > > > > return true; > > > > > } > > > > > > > > > > +/* When sampling from fast-cleared sRGB textures on gen10, the > > hardware > > > > > + * will not decode the clear color to linear. We must therefore do > > it in > > > > > + * software. > > > > > + */ > > > > > +union isl_color_value > > > > > +gen10_convert_srgb_fast_clear_color(const struct gen_device_info > > > > *devinfo, > > > > > +enum isl_format format, > > > > > +const union isl_color_value > > color) > > > > > +{ > > > > > + assert(devinfo->gen >= 10); > > > > > + assert(isl_format_is_srgb(format)); > > > > > + assert(isl_format_supports_ccs_d(devinfo, format)); > > > > > + > > > > > + /* All the CCS-supported sRGB textures in GL are 4-channel, > > > > 8-bit/channel, > > > > > +* UNORM formats. > > > > > +*/ > > > > > + assert(isl_format_get_num_channels(format) == 4); > > > > > + assert(isl_format_channels_have_size(format, 8)); > > > > > + assert(isl_format_has_unorm_channel(format)); > > > > > > > > > > There are exactly 4 ISL formats which match this. You could just assert > > > that it's one of the four and drop the other 5 asserts. It doesn't > > really > > > matter, but this is rather complicated. > > > > > > > > > > I agree. I'll fix this locally and send out a v2 after getting more of > > your feedback. > > > > > > > + > > > > > + /* According to do_single_blorp_clear(), fast-clears for > > > > texture-views are > > > > > +* disabled. This means that we only have to do > > channel-to-channel > > > > format > > > > > +* conversions. > > > > > +*/ > > > > > + union isl_color_value override_color = color; > > > > > + for (unsigned i = 0; i < 3; i++) { > > > > > + /* According to brw_is_color_fast_clear_compatible(), only > > > > floating-point > > > > > + *
[Mesa-dev] GBM and the Device Memory Allocator Proposals
As many here know at this point, I've been working on solving issues related to DMA-capable memory allocation for various devices for some time now. I'd like to take this opportunity to apologize for the way I handled the EGL stream proposals. I understand now that the development process followed there was unacceptable to the community and likely offended many great engineers. Moving forward, I attempted to reboot talks in a more constructive manner with the generic allocator library proposals & discussion forum at XDC 2016. Some great design ideas came out of that, and I've since been prototyping some code to prove them out before bringing them back as official proposals. Again, I understand some people are growing concerned that I've been doing this off on the side in a github project that has primarily NVIDIA contributors. My goal was only to avoid wasting everyone's time with unproven ideas. The intent was never to dump the prototype code as-is on the community and presume acceptance. It's just a public research project. Now the prototyping is nearing completion, and I'd like to renew discussion on whether and how the new mechanisms can be integrated with the Linux graphics stack. I'd be interested to know if more work is needed to demonstrate the usefulness of the new mechanisms, or whether people think they have value at this point. After talking with people on the hallway track at XDC this year, I've heard several proposals for incorporating the new mechanisms: -Include ideas from the generic allocator design into GBM. This could take the form of designing a "GBM 2.0" API, or incrementally adding to the existing GBM API. -Develop a library to replace GBM. The allocator prototype code could be massaged into something production worthy to jump start this process. -Develop a library that sits beside or on top of GBM, using GBM for low-level graphics buffer allocation, while supporting non-graphics kernel APIs directly. The additional cross-device negotiation and sorting of capabilities would be handled in this slightly higher-level API before handing off to GBM and other APIs for actual allocation somehow. -I have also heard some general comments that regardless of the relationship between GBM and the new allocator mechanisms, it might be time to move GBM out of Mesa so it can be developed as a stand-alone project. I'd be interested what others think about that, as it would be something worth coordinating with any other new development based on or inside of GBM. And of course I'm open to any other ideas for integration. Beyond just where this code would live, there is much to debate about the mechanisms themselves and all the implementation details. I was just hoping to kick things off with something high level to start. For reference, the code Miguel and I have been developing for the prototype is here: https://github.com/cubanismo/allocator And we've posted a port of kmscube that uses the new interfaces as a demonstration here: https://github.com/cubanismo/kmscube There are still some proposed mechanisms (usage transitions mainly) that aren't prototyped, but I think it makes sense to start discussing integration while prototyping continues. In addition, I'd like to note that NVIDIA is committed to providing open source driver implementations of these mechanisms for our hardware, in addition to support in our proprietary drivers. In other words, wherever modifications to the nouveau kernel & userspace drivers are needed to implement the improved allocator mechanisms, we'll be contributing patches if no one beats us to it. Thanks in advance for any feedback! -James Jones ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 4/4] mesa: fix validate for secondary interpolator
On 11/20/2017 04:07 PM, Miklós Máté wrote: > This patch fixes multiple problems: > - the interpolator check was duplicated > - both had arg instead of argRep > - I split it into color and alpha for better readability and error msg > - the DOT4 check only applies to color instruction according to the spec For the sake of reviewers and the next person to look at this code, a spec quotation or two in the code would be really helpful. > - made the DOT4 check fatal, and improved the error msg > > Piglit: spec/ati_fragment_shader/error08-secondary > > Signed-off-by: Miklós Máté> --- > src/mesa/main/atifragshader.c | 23 --- > 1 file changed, 12 insertions(+), 11 deletions(-) > > diff --git a/src/mesa/main/atifragshader.c b/src/mesa/main/atifragshader.c > index 313ba0c66d..1bca8267b8 100644 > --- a/src/mesa/main/atifragshader.c > +++ b/src/mesa/main/atifragshader.c > @@ -171,15 +171,15 @@ static int check_arith_arg(struct ati_fragment_shader > *curProg, >_mesa_error(ctx, GL_INVALID_ENUM, "C/AFragmentOpATI(arg)"); >return 0; > } > - if ((arg == GL_SECONDARY_INTERPOLATOR_ATI) && (((optype == 0) && (argRep > == GL_ALPHA)) || > - ((optype == 1) && ((arg == GL_ALPHA) || (argRep == GL_NONE) { > - _mesa_error(ctx, GL_INVALID_OPERATION, "C/AFragmentOpATI(sec_interp)"); > - return 0; > - } > - if ((arg == GL_SECONDARY_INTERPOLATOR_ATI) && (((optype == 0) && (argRep > == GL_ALPHA)) || > - ((optype == 1) && ((arg == GL_ALPHA) || (argRep == GL_NONE) { > - _mesa_error(ctx, GL_INVALID_OPERATION, "C/AFragmentOpATI(sec_interp)"); > - return 0; Adding this comment here makes it clear that the code is correct. /* The ATI_fragment_shader spec says: * *The error INVALID_OPERATION is generated by *ColorFragmentOp[1..3]ATI if is SECONDARY_INTERPOLATOR_ATI *and is ALPHA, or by AlphaFragmentOp[1..3]ATI if *is SECONDARY_INTERPOLATOR_ATI and is ALPHA or NONE, ... */ > + if (arg == GL_SECONDARY_INTERPOLATOR_ATI) { > + if ((optype == ATI_FRAGMENT_SHADER_COLOR_OP) && (argRep == GL_ALPHA)) { Like before, drop the unnecessary (). > + _mesa_error(ctx, GL_INVALID_OPERATION, > "CFragmentOpATI(sec_interp)"); > + return 0; > + } else if ((optype == ATI_FRAGMENT_SHADER_ALPHA_OP) && > +((argRep == GL_ALPHA) || (argRep == GL_NONE))) { Remove extra () and line up with the opening (. > + _mesa_error(ctx, GL_INVALID_OPERATION, > "AFragmentOpATI(sec_interp)"); > + return 0; > + } > } > if ((curProg->cur_pass == 1) && >((arg == GL_PRIMARY_COLOR_ARB) || (arg == > GL_SECONDARY_INTERPOLATOR_ATI))) { > @@ -642,10 +642,11 @@ _mesa_FragmentOpXATI(GLint optype, GLuint arg_count, > GLenum op, GLuint dst, >return; >} > } Same here regarding the comment. /* The ATI_fragment_shader spec says: * *The error INVALID_OPERATION is generated by... ColorFragmentOp2ATI *if is DOT4_ATI and is SECONDARY_INTERPOLATOR_ATI and * is ALPHA or NONE. */ > - if ((op == GL_DOT4_ATI) && > + if ((optype == ATI_FRAGMENT_SHADER_COLOR_OP) && (op == GL_DOT4_ATI) && >(((arg1 == GL_SECONDARY_INTERPOLATOR_ATI) && ((arg1Rep == GL_ALPHA) || > (arg1Rep == GL_NONE))) || >(((arg2 == GL_SECONDARY_INTERPOLATOR_ATI) && ((arg2Rep == GL_ALPHA) || > (arg2Rep == GL_NONE)) { > - _mesa_error(ctx, GL_INVALID_OPERATION, "C/AFragmentOpATI(sec_interp)"); > + _mesa_error(ctx, GL_INVALID_OPERATION, > "C/AFragmentOpATI(sec_interpDOT4)"); > + return; > } > > if (!check_arith_arg(curProg, optype, arg1, arg1Rep)) { > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [Freedreno] [PATCH 1/2] nir: allow texture offsets with cube maps
On Mon, Nov 20, 2017 at 4:10 PM, Ilia Mirkinwrote: > On Mon, Nov 20, 2017 at 7:08 PM, Jason Ekstrand > wrote: > > On Mon, Nov 20, 2017 at 3:11 PM, Ilia Mirkin > wrote: > >> > >> On Mon, Nov 20, 2017 at 5:16 PM, Jason Ekstrand > >> wrote: > >> > On Sun, Nov 19, 2017 at 11:54 AM, Ilia Mirkin > >> > wrote: > >> >> > >> >> GL doesn't have this, but some hardware supports it. This is > convenient > >> >> for lowering tg4 to plain texture calls, which is necessary on Adreno > >> >> A4xx hardware. > >> >> > >> >> Signed-off-by: Ilia Mirkin > >> >> --- > >> >> src/compiler/nir/nir.h | 15 +-- > >> >> 1 file changed, 13 insertions(+), 2 deletions(-) > >> >> > >> >> diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h > >> >> index f46f6147110..64965ae16d6 100644 > >> >> --- a/src/compiler/nir/nir.h > >> >> +++ b/src/compiler/nir/nir.h > >> >> @@ -1364,8 +1364,7 @@ nir_tex_instr_src_size(const nir_tex_instr > >> >> *instr, > >> >> unsigned src) > >> >> if (instr->src[src].src_type == nir_tex_src_ms_mcs) > >> >>return 4; > >> >> > >> >> - if (instr->src[src].src_type == nir_tex_src_offset || > >> >> - instr->src[src].src_type == nir_tex_src_ddx || > >> >> + if (instr->src[src].src_type == nir_tex_src_ddx || > >> >> instr->src[src].src_type == nir_tex_src_ddy) { > >> >>if (instr->is_array) > >> >> return instr->coord_components - 1; > >> >> @@ -1373,6 +1372,18 @@ nir_tex_instr_src_size(const nir_tex_instr > >> >> *instr, > >> >> unsigned src) > >> >> return instr->coord_components; > >> >> } > >> >> > >> >> + /* Usual APIs don't allow cube + offset, but we allow it, with 2 > >> >> coords for > >> >> +* the offset, since a cube maps to a single face. > >> >> +*/ > >> >> + if (instr->src[src].src_type == nir_tex_src_offset) { > >> >> + unsigned ret = instr->coord_components; > >> >> + if (instr->is_array) > >> >> + ret--; > >> >> + if (instr->sampler_dim == GLSL_SAMPLER_DIM_CUBE) > >> >> + ret--; > >> >> + return ret; > >> > > >> > > >> > I think I'd rather this look more like the one above: > >> > > >> > if (instr->is_array) > >> >return instr->coord_components; > >> > else if (instr->sampler_dim == GLSL_SAMPLER_DIM_CUBE) > >> >return 2; > >> > else > >> >return instr->coord_components - 1; > >> > > >> > It seems a bit cleaner and/or more explicit to me. > >> > >> OK. Although your version is slightly wrong, but I get the idea. Will > >> do that in a v2. (array should get -1, and cube should always get 2 > >> even if it's an array) > > > > > > I'd forgotten about cube arrays, yes, those would naturally be -2. In > that > > case, maybe -- for each subtraction is a good idea... > > Well, rearranging it, we get > > if (instr->sampler_dim == CUBE) > return 2; > else if (instr->is_array) > return comp - 1; > else > return comp; > Right. With that if-ladder, Reviewed-by: Jason Ekstrand ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 3/4] mesa: fix typo in ATI_fs dstMod error checking
On 11/20/2017 04:07 PM, Miklós Máté wrote: > Piglit: spec/ati_fragment_shader/error14-invalidmod > > Signed-off-by: Miklós Máté> --- > src/mesa/main/atifragshader.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/src/mesa/main/atifragshader.c b/src/mesa/main/atifragshader.c > index d6fc37ac8f..313ba0c66d 100644 > --- a/src/mesa/main/atifragshader.c > +++ b/src/mesa/main/atifragshader.c > @@ -623,7 +623,7 @@ _mesa_FragmentOpXATI(GLint optype, GLuint arg_count, > GLenum op, GLuint dst, > } > if ((modtemp != GL_NONE) && (modtemp != GL_2X_BIT_ATI) && >(modtemp != GL_4X_BIT_ATI) && (modtemp != GL_8X_BIT_ATI) && > - (modtemp != GL_HALF_BIT_ATI) && !(modtemp != GL_QUARTER_BIT_ATI) && > + (modtemp != GL_HALF_BIT_ATI) && (modtemp != GL_QUARTER_BIT_ATI) && >(modtemp != GL_EIGHTH_BIT_ATI)) { >_mesa_error(ctx, GL_INVALID_ENUM, "C/AFragmentOpATI(dstMod)%x", > modtemp); >return; A good follow-up patch would be to remove all the extra () and fix the indentation. I'm also waffling about a simplification to the code: if (!util_is_power_of_two_or_zero(modtemp) || modtemp > GL_EIGHTH_BIT_ATI) { _mesa_error(ctx, GL_INVALID_ENUM, "C/AFragmentOpATI(dstMod)%x", modtemp); return; } util_is_power_of_two_or_zero is added by a patch series I'm about to send out. Either way, this patch is clearly correct and is Reviewed-by: Ian Romanick ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 0/6] meson: some non-linux OS work
This is a rough cross section of work to get the meson build working on macOS and on Haiku, with one radeonsi fixup added. Most of this work is pretty straight forward, turning some options that can be assumed on Linux into tri-states, fixing up some early windows and macos stubs that I put in during the initial meosn pass, and removing some code duplication. Dylan Baker (6): meson: Remove duplicate _GNU_SOURCE meson: Convert platform to auto meson: convert llvm option to tristate meson: Fix LLVM requires for radeonsi meson: add logic to select apple and windows dri meson: replace with_*dri with with_dri_platform meson.build| 77 +++--- meson_options.txt | 9 +++--- src/glx/meson.build| 4 +-- src/mapi/glapi/meson.build | 2 +- 4 files changed, 54 insertions(+), 38 deletions(-) -- 2.15.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 6/6] meson: replace with_*dri with with_dri_platform
This fixes the windows and macos stubs to be consistent with the *nix path. Signed-off-by: Dylan Baker--- meson.build| 4 src/glx/meson.build| 4 ++-- src/mapi/glapi/meson.build | 2 +- 3 files changed, 3 insertions(+), 7 deletions(-) diff --git a/meson.build b/meson.build index e7d2afb3d0a..bcc560217cb 100644 --- a/meson.build +++ b/meson.build @@ -52,10 +52,6 @@ if get_option('texture-float') message('WARNING: Floating-point texture enabled. Please consult docs/patents.txt and your lawyer before building mesa.') endif -# XXX: yeah, do these -with_appledri = false -with_windowsdri = false - dri_drivers_path = get_option('dri-drivers-path') if dri_drivers_path == '' dri_drivers_path = join_paths(get_option('libdir'), 'dri') diff --git a/src/glx/meson.build b/src/glx/meson.build index 43f8fb24efd..deef3ed2235 100644 --- a/src/glx/meson.build +++ b/src/glx/meson.build @@ -95,9 +95,9 @@ if with_dri3 files_libglx += files('dri3_glx.c', 'dri3_priv.h') endif -if with_appledri +if with_dri_platform == 'apple' files_libglx += files('applegl_glx.c') -elif with_windowsdri +elif with_dri_platform == 'windows' files_libglx += files('driwindows_glx.c') # TODO #extra_libs_libglx += [ diff --git a/src/mapi/glapi/meson.build b/src/mapi/glapi/meson.build index d2d86afd6c1..f2012644785 100644 --- a/src/mapi/glapi/meson.build +++ b/src/mapi/glapi/meson.build @@ -23,7 +23,7 @@ inc_glapi = include_directories('.') static_glapi_files = [] static_glapi_args = [] -if with_appledri or with_windowsdri +if ['apple', 'windows'].contains(with_dri_platform) static_glapi_files += files('glapi_gentable.c') endif -- 2.15.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 4/6] meson: Fix LLVM requires for radeonsi
Signed-off-by: Dylan Baker--- meson.build | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/meson.build b/meson.build index c568fdb8785..52f2c1cb0d0 100644 --- a/meson.build +++ b/meson.build @@ -707,7 +707,7 @@ if with_gallium_freedreno endif llvm_modules = ['bitwriter', 'engine', 'mcdisassembler', 'mcjit'] -if with_amd_vk +if with_amd_vk or with_gallium_radeonsi llvm_modules += ['amdgpu', 'bitreader', 'ipo'] endif @@ -715,7 +715,7 @@ _llvm = get_option('llvm') if _llvm == 'auto' dep_llvm = dependency( 'llvm', version : '>= 3.9.0', modules : llvm_modules, -required : with_amd_vk, +required : with_amd_vk or with_gallium_radeonsi, ) with_llvm = dep_llvm.found() elif _llvm == 'true' -- 2.15.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 5/6] meson: add logic to select apple and windows dri
This is still not fully correct (haiku and BSD are probably not correct), but Linux is not regressed and this should be correct for macOS and Windows. Signed-off-by: Dylan Baker--- meson.build | 15 +-- 1 file changed, 13 insertions(+), 2 deletions(-) diff --git a/meson.build b/meson.build index 52f2c1cb0d0..e7d2afb3d0a 100644 --- a/meson.build +++ b/meson.build @@ -187,8 +187,19 @@ if with_dri_i915 dep_libdrm_intel = dependency('libdrm_intel', version : '>= 2.4.75') endif -# TODO: other OSes -with_dri_platform = 'drm' +# TODO: gnu +if host_machine.system() == 'darwin' + with_dri_platform = 'apple' +elif host_machine.system() == 'windows' + with_dri_platform = 'windows' +elif host_machine.system() == 'linux' + # FIXME: This should include BSD and possibly other systems + with_dri_platform = 'drm' +else + # FIXME: haiku doesn't use dri, and xlib doesn't use dri, probably should + # assert here that one of those cases has been met. + with_dri_platform = 'none' +endif with_platform_android = false with_platform_wayland = false -- 2.15.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 3/6] meson: convert llvm option to tristate
This option has been acting as a strange sort of half-tri state anyway. Signed-off-by: Dylan Baker--- meson.build | 48 +--- meson_options.txt | 5 +++-- 2 files changed, 28 insertions(+), 25 deletions(-) diff --git a/meson.build b/meson.build index 7dd6f89991b..c568fdb8785 100644 --- a/meson.build +++ b/meson.build @@ -46,7 +46,6 @@ with_tests = get_option('build-tests') with_valgrind = get_option('valgrind') with_libunwind = get_option('libunwind') with_asm = get_option('asm') -with_llvm = get_option('llvm') with_osmesa = get_option('osmesa') if get_option('texture-float') pre_args += '-DTEXTURE_FLOAT_ENABLED' @@ -711,30 +710,33 @@ llvm_modules = ['bitwriter', 'engine', 'mcdisassembler', 'mcjit'] if with_amd_vk llvm_modules += ['amdgpu', 'bitreader', 'ipo'] endif -dep_llvm = [] -if with_llvm + +_llvm = get_option('llvm') +if _llvm == 'auto' dep_llvm = dependency( -'llvm', version : '>= 3.9.0', required : with_amd_vk, modules : llvm_modules, +'llvm', version : '>= 3.9.0', modules : llvm_modules, +required : with_amd_vk, ) - if dep_llvm.found() -_llvm_version = dep_llvm.version().split('.') -# Development versions of LLVM have an 'svn' suffix, we don't want that for -# our version checks. -_llvm_patch = _llvm_version[2] -if _llvm_patch.endswith('svn') - _llvm_patch = _llvm_patch.split('s')[0] -endif -pre_args += [ - '-DHAVE_LLVM=0x0@0@@1@@2@'.format(_llvm_version[0], _llvm_version[1], _llvm_patch), - '-DMESA_LLVM_VERSION_PATCH=@0@'.format(_llvm_patch), -] - else -if with_gallium_softpipe - error('Cannot find LLVM to build LLVMPipe. If you wanted softpipe pass -Dllvm=false to meson') -elif with_amd_vk or with_gallium_radeonsi # etc - error('The following drivers requires LLVM: Radv, RadeonSI. One of these is enabled, but LLVM was not found.') -endif - endif + with_llvm = dep_llvm.found() +elif _llvm == 'true' + dep_llvm = dependency('llvm', version : '>= 3.9.0', modules : llvm_modules) + with_llvm = true +else + dep_llvm = [] + with_llvm = false +endif +if with_llvm + _llvm_version = dep_llvm.version().split('.') + # Development versions of LLVM have an 'svn' suffix, we don't want that for + # our version checks. + _llvm_patch = _llvm_version[2] + if _llvm_patch.endswith('svn') +_llvm_patch = _llvm_patch.split('s')[0] + endif + pre_args += [ +'-DHAVE_LLVM=0x0@0@@1@@2@'.format(_llvm_version[0], _llvm_version[1], _llvm_patch), +'-DMESA_LLVM_VERSION_PATCH=@0@'.format(_llvm_patch), + ] elif with_amd_vk or with_gallium_radeonsi error('The following drivers requires LLVM: Radv, RadeonSI. One of these is enabled, but LLVM is disabled.') endif diff --git a/meson_options.txt b/meson_options.txt index 1134a250295..44d46fe0b32 100644 --- a/meson_options.txt +++ b/meson_options.txt @@ -132,8 +132,9 @@ option( ) option( 'llvm', - type : 'boolean', - value : true, + type : 'combo', + value : 'auto', + choices : ['auto', 'true', 'false'], description : 'Build with LLVM support.' ) option( -- 2.15.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/6] meson: Remove duplicate _GNU_SOURCE
There is one provided unconditionally, and one guarded by platform == linux. Remove the unconditional one. Signed-off-by: Dylan Baker--- meson.build | 1 - 1 file changed, 1 deletion(-) diff --git a/meson.build b/meson.build index 383ebb36662..ce2c8c2c1d0 100644 --- a/meson.build +++ b/meson.build @@ -39,7 +39,6 @@ pre_args = [ '-DVERSION="@0@"'.format(meson.project_version()), '-DPACKAGE_VERSION=VERSION', '-DPACKAGE_BUGREPORT="https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa;', - '-D_GNU_SOURCE', ] with_vulkan_icd_dir = get_option('vulkan-icd-dir') -- 2.15.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/6] meson: Convert platform to auto
This is necessary to support operating systems other than the *nix family (excluding macOS). For Linux nothing has changed, the defaults are still the same. Signed-off-by: Dylan Baker--- meson.build | 7 +++ meson_options.txt | 4 ++-- 2 files changed, 9 insertions(+), 2 deletions(-) diff --git a/meson.build b/meson.build index ce2c8c2c1d0..7dd6f89991b 100644 --- a/meson.build +++ b/meson.build @@ -198,6 +198,13 @@ with_platform_drm = false with_platform_surfaceless = false egl_native_platform = '' _platforms = get_option('platforms') +if _platforms == 'auto' + if ['linux'].contains(host_machine.system()) +_platforms = 'x11,wayland,drm,surfaceless' + else +error('Unknown OS, no platforms enabled. Patches gladly accepted to fix this.') + endif +endif if _platforms != '' _split = _platforms.split(',') with_platform_android = _split.contains('android') diff --git a/meson_options.txt b/meson_options.txt index 6c9cd33998c..1134a250295 100644 --- a/meson_options.txt +++ b/meson_options.txt @@ -21,8 +21,8 @@ option( 'platforms', type : 'string', - value : 'x11,wayland,drm,surfaceless', - description : 'comma separated list of window systems to support. wayland, x11, surfaceless, drm, etc.' + value : 'auto', + description : 'comma separated list of window systems to support. If this is set to auto all platforms applicable to the OS will be enabled.' ) option( 'dri3', -- 2.15.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/4] mesa: fix crash when an ATI_fs pass begins with an alpha inst
On 11/20/2017 04:07 PM, Miklós Máté wrote: > This fixes crash when: > - first pass begins with alpha inst > - first pass ends with color inst, second pass begins with alpha inst > > Also, use the symbolic name instead of a number. > Piglit: spec/ati_fragment_shader/api-alphafirst > > Signed-off-by: Miklós Máté> --- > src/mesa/main/atifragshader.c | 6 -- > 1 file changed, 4 insertions(+), 2 deletions(-) > > diff --git a/src/mesa/main/atifragshader.c b/src/mesa/main/atifragshader.c > index 49ddb6e5af..d6fc37ac8f 100644 > --- a/src/mesa/main/atifragshader.c > +++ b/src/mesa/main/atifragshader.c > @@ -598,8 +598,10 @@ _mesa_FragmentOpXATI(GLint optype, GLuint arg_count, > GLenum op, GLuint dst, >curProg->cur_pass=3; > > /* decide whether this is a new instruction or not ... all color > instructions are new, > - and alpha instructions might also be new if there was no preceding > color inst */ > - if ((optype == 0) || (curProg->last_optype == optype)) { > + and alpha instructions might also be new if there was no preceding > color inst, > + and this may be the first inst of the pass */ I know the code previously used this same formatting, but the Mesa style is for the */ of a multi-line comment to be on its own line. Each other line should also start with a *. And line-wrap around 78 characters. Like: /* Decide whether this is a new instruction or not. All color instructions * are new, and alpha instructions might also be new if there was no * preceding color inst. This may also be the first inst of the pass. */ > + if ((optype == ATI_FRAGMENT_SHADER_COLOR_OP) || (curProg->last_optype == > optype) > + || (curProg->numArithInstr[curProg->cur_pass >> 1] == 0)) { I lost the debate about where the || (or &&) should go... it should be on the previous line. Most of the parenthesis are unnecessary, and the second line should line up with the opening (. On a side topic... if anyone understands how ati_fragment_shader::cur_pass works, it would be really great if they could document it in mtypes.h. >if (curProg->numArithInstr[curProg->cur_pass >> 1] > 7) { >_mesa_error(ctx, GL_INVALID_OPERATION, "C/AFragmentOpATI(instrCount)"); >return; > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] GPU (and system) monitoring
On Mon, 20 Nov 2017 13:25:26 +0100 Nicolai Hähnlewrote: > ... umr does > that for the amdgpu kernel module I downloaded the source tree, compiled umr, mounted the debugfs to use umr, and ran umr --top switching to sensor mode, you could see loads. And I collected a few second of information to a log file. Then I umounted the debugfs. The load values can change from quite high to quite low from sampling interval to interval. I think some kind of averaging would probably need to be done, just taking a single load value chosen at random would not be useful. In a sense, temperature is going to be an average related to power (load), so maybe just having the temperatures may be enough? For now, I think I should collect a longer series of data, and then see how the (moving) average behaves with respect to how many data points go into the (moving) average. Gord ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [RFC v5 02/19] i965: Only set planar_format if it's actually one
On Mon, Nov 6, 2017 at 2:02 PM, Louis-Francis Ratté-Boulianne < l...@collabora.com> wrote: > The planar_format image property was always set even for > non-planar formats. This was breaking CCS support as > intel_from_planar is now making sure we can't have both > a modifier and an planar format. > This patch is incorrect. planar_format is badly named, we set it even for single-plane images and some stuff relies on that. Really, we just need to change the intel_from_planar to check n_planes == 1. > Signed-off-by: Louis-Francis Ratté-Boulianne> Reviewed-by: Daniel Stone > --- > src/mesa/drivers/dri/i965/intel_screen.c | 6 -- > 1 file changed, 4 insertions(+), 2 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/intel_screen.c > b/src/mesa/drivers/dri/i965/intel_screen.c > index 10064c3236..b87ccab0a8 100644 > --- a/src/mesa/drivers/dri/i965/intel_screen.c > +++ b/src/mesa/drivers/dri/i965/intel_screen.c > @@ -913,7 +913,8 @@ intel_create_image_from_names(__DRIscreen *dri_screen, > if (image == NULL) >return NULL; > > -image->planar_format = f; > +if (f->nplanes > 1) > +image->planar_format = f; > for (i = 0; i < f->nplanes; i++) { > index = f->planes[i].buffer_index; > image->offsets[index] = offsets[index]; > @@ -961,7 +962,8 @@ intel_create_image_from_fds_common(__DRIscreen > *dri_screen, > image->height = height; > image->pitch = strides[0]; > > - image->planar_format = f; > + if (f->nplanes > 1) > + image->planar_format = f; > > image->bo = brw_bo_gem_create_from_prime(screen->bufmgr, fds[0]); > if (image->bo == NULL) { > -- > 2.13.0 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [Mesa-stable] [PATCH V2 1/4] i965/gen10: Implement WaSampleOffsetIZ workaround
Anuj, these 4 patches series landed in master without the "mesa-stable" tag, hence, just confirming that you dropped the nomination. Let me know if that was not the case. On Fri, 2017-10-06 at 16:30 -0700, Anuj Phogat wrote: > There are few other (duplicate) workarounds which have similar > recommendations: > WaFlushHangWhenNonPipelineStateAndMarkerStalled > WaCSStallBefore3DSamplePattern > WaPipeControlBefore3DStateSamplePattern > > WaPipeControlBefore3DStateSamplePattern has some extra recommendations if > driver is using mid batch context restore. Ignoring it for now because We're > not doing mid-batch context restore in Mesa. > > Cc: mesa-sta...@lists.freedesktop.org > Cc: Jason Ekstrand> Cc: Rafael Antognolli > Signed-off-by: Anuj Phogat > --- > src/mesa/drivers/dri/i965/brw_context.h| 2 + > src/mesa/drivers/dri/i965/brw_defines.h| 1 + > src/mesa/drivers/dri/i965/brw_pipe_control.c | 50 > ++ > src/mesa/drivers/dri/i965/gen8_multisample_state.c | 8 > 4 files changed, 61 insertions(+) > > diff --git a/src/mesa/drivers/dri/i965/brw_context.h > b/src/mesa/drivers/dri/i965/brw_context.h > index 92fc16de13..f0e8d562e9 100644 > --- a/src/mesa/drivers/dri/i965/brw_context.h > +++ b/src/mesa/drivers/dri/i965/brw_context.h > @@ -1647,6 +1647,8 @@ void brw_emit_post_sync_nonzero_flush(struct > brw_context *brw); > void brw_emit_depth_stall_flushes(struct brw_context *brw); > void gen7_emit_vs_workaround_flush(struct brw_context *brw); > void gen7_emit_cs_stall_flush(struct brw_context *brw); > +void gen10_emit_wa_cs_stall_flush(struct brw_context *brw); > +void gen10_emit_wa_lri_to_cache_mode_zero(struct brw_context *brw); > > /* brw_queryformat.c */ > void brw_query_internal_format(struct gl_context *ctx, GLenum target, > diff --git a/src/mesa/drivers/dri/i965/brw_defines.h > b/src/mesa/drivers/dri/i965/brw_defines.h > index 4abb790612..270cdf29db 100644 > --- a/src/mesa/drivers/dri/i965/brw_defines.h > +++ b/src/mesa/drivers/dri/i965/brw_defines.h > @@ -1609,6 +1609,7 @@ enum brw_pixel_shader_coverage_mask_mode { > #define GEN7_GPGPU_DISPATCHDIMY 0x2504 > #define GEN7_GPGPU_DISPATCHDIMZ 0x2508 > > +#define GEN7_CACHE_MODE_0 0x7000 > #define GEN7_CACHE_MODE_1 0x7004 > # define GEN9_FLOAT_BLEND_OPTIMIZATION_ENABLE (1 << 4) > # define GEN8_HIZ_NP_PMA_FIX_ENABLE(1 << 11) > diff --git a/src/mesa/drivers/dri/i965/brw_pipe_control.c > b/src/mesa/drivers/dri/i965/brw_pipe_control.c > index 460b8f73b6..156f5c25ec 100644 > --- a/src/mesa/drivers/dri/i965/brw_pipe_control.c > +++ b/src/mesa/drivers/dri/i965/brw_pipe_control.c > @@ -278,6 +278,56 @@ gen7_emit_cs_stall_flush(struct brw_context *brw) > brw->workaround_bo, 0, 0); > } > > +static void > +brw_flush_gpu_caches(struct brw_context *brw) { > + brw_emit_pipe_control_flush(brw, > + PIPE_CONTROL_CACHE_FLUSH_BITS | > + PIPE_CONTROL_CACHE_INVALIDATE_BITS); > +} > + > +/** > + * From Gen10 Workarounds page in h/w specs: > + * WaSampleOffsetIZ: > + * Prior to the 3DSTATE_SAMPLE_PATTERN driver must ensure there are no > + * markers in the pipeline by programming a PIPE_CONTROL with stall. > + */ > +void > +gen10_emit_wa_cs_stall_flush(struct brw_context *brw) > +{ > + const struct gen_device_info *devinfo = >screen->devinfo; > + assert(devinfo->gen == 10); > + brw_emit_pipe_control_flush(brw, > + PIPE_CONTROL_CS_STALL | > + PIPE_CONTROL_STALL_AT_SCOREBOARD); > +} > + > +/** > + * From Gen10 Workarounds page in h/w specs: > + * WaSampleOffsetIZ: > + * When 3DSTATE_SAMPLE_PATTERN is programmed, driver must then issue an > + * MI_LOAD_REGISTER_IMM command to an offset between 0x7000 and 0x7FFF(SVL) > + * after the command to ensure the state has been delivered prior to any > + * command causing a marker in the pipeline. > + */ > +void > +gen10_emit_wa_lri_to_cache_mode_zero(struct brw_context *brw) > +{ > + const struct gen_device_info *devinfo = >screen->devinfo; > + assert(devinfo->gen == 10); > + > + /* Before changing the value of CACHE_MODE_0 register, GFX pipeline must > +* be idle; i.e., full flush is required. > +*/ > + brw_flush_gpu_caches(brw); > + > + /* Write to CACHE_MODE_0 (0x7000) */ > + BEGIN_BATCH(3); > + OUT_BATCH(MI_LOAD_REGISTER_IMM | (3 - 2)); > + OUT_BATCH(GEN7_CACHE_MODE_0); > + OUT_BATCH(0); > + ADVANCE_BATCH(); > +} > + > /** > * Emits a PIPE_CONTROL with a non-zero post-sync operation, for > * implementing two workarounds on gen6. From section 1.4.7.1 > diff --git a/src/mesa/drivers/dri/i965/gen8_multisample_state.c > b/src/mesa/drivers/dri/i965/gen8_multisample_state.c > index 7a31a5df4a..14043025b6 100644 > ---
Re: [Mesa-dev] [PATCH 1/4] mesa: mesa: add fallback texture for SampleMapATI if there is nothing
Only need one mesa: in the subject. :) On 11/20/2017 04:07 PM, Miklós Máté wrote: > This fixes crash in the state tracker. > Piglit: spec/ati_fragment_shader/render-notexture > > Signed-off-by: Miklós Máté> --- > src/mesa/main/texstate.c | 35 +++ > 1 file changed, 35 insertions(+) > > diff --git a/src/mesa/main/texstate.c b/src/mesa/main/texstate.c > index 2146723d08..a0717a542f 100644 > --- a/src/mesa/main/texstate.c > +++ b/src/mesa/main/texstate.c > @@ -819,6 +819,35 @@ update_ff_texture_state(struct gl_context *ctx, > } > } > > +static void > +fix_missing_textures_for_atifs(struct gl_context *ctx, > + struct gl_program *prog, > + BITSET_WORD *enabled_texture_units) I'm fairly sure there's already a function like this somewhere in Mesa. Perhaps that code could be refactored slightly so that it can be re-used here? > +{ > + GLbitfield mask; > + GLuint s; > + int unit; > + gl_texture_index target_index; > + struct gl_texture_object *texObj; All of these declarations except mask can be moved inside the loop. Then they could all be made const. > + > + mask = prog->SamplersUsed; > + while (mask) { > + s = u_bit_scan(); > + unit = prog->SamplerUnits[s]; > + > + target_index = ffs(prog->TexturesUsed[unit]) - 1; > + > + if (!ctx->Texture.Unit[unit]._Current) { > + texObj = _mesa_get_fallback_texture(ctx, target_index); > + > + _mesa_reference_texobj(>Texture.Unit[unit]._Current, texObj); > + BITSET_SET(enabled_texture_units, unit); > + ctx->Texture._MaxEnabledTexImageUnit = > +MAX2(ctx->Texture._MaxEnabledTexImageUnit, (int)unit); > + } > + } > +} > + > /** > * \note This routine refers to derived texture matrix values to > * compute the ENABLE_TEXMAT flags, but is only called on > @@ -866,6 +895,12 @@ _mesa_update_texture_state(struct gl_context *ctx) > if (!prog[MESA_SHADER_FRAGMENT]) >update_ff_texture_state(ctx, enabled_texture_units); > > + /* add fallback texture for SampleMapATI if there is nothing */ > + if (_mesa_ati_fragment_shader_enabled(ctx) && > + ctx->ATIFragmentShader.Current->Program) This should line up with the (. > + fix_missing_textures_for_atifs(ctx, > +ctx->ATIFragmentShader.Current->Program, enabled_texture_units); Same here. > + > /* Now, clear out the _Current of any disabled texture units. */ > for (i = 0; i <= ctx->Texture._MaxEnabledTexImageUnit; i++) { >if (!BITSET_TEST(enabled_texture_units, i)) > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] i965/fs: Handle negating immediates on MADs when propagating saturates
On 11/20/2017 03:32 PM, Matt Turner wrote: > On Mon, Nov 20, 2017 at 2:50 PM, Ian Romanickwrote: >> On 11/20/2017 02:33 PM, Matt Turner wrote: >>> MADs don't take immediate sources, but we allow them in the IR since it >>> simplifies a lot of things. I neglected to consider that case. >>> >>> Fixes: 4009a9ead490 ("i965/fs: Allow saturate propagation to propagate >>> negations into MADs.") >>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103616 >>> Reported-and-Tested-by: Ruslan Kabatsayev >>> --- >>> src/intel/compiler/brw_fs_saturate_propagation.cpp | 10 -- >>> 1 file changed, 8 insertions(+), 2 deletions(-) >>> >>> diff --git a/src/intel/compiler/brw_fs_saturate_propagation.cpp >>> b/src/intel/compiler/brw_fs_saturate_propagation.cpp >>> index 1c97a507d8..d6cfa79a61 100644 >>> --- a/src/intel/compiler/brw_fs_saturate_propagation.cpp >>> +++ b/src/intel/compiler/brw_fs_saturate_propagation.cpp >>> @@ -88,8 +88,14 @@ opt_saturate_propagation_local(fs_visitor *v, bblock_t >>> *block) >>> scan_inst->src[0].negate = >>> !scan_inst->src[0].negate; >>> inst->src[0].negate = false; >>> } else if (scan_inst->opcode == BRW_OPCODE_MAD) { >>> -scan_inst->src[0].negate = >>> !scan_inst->src[0].negate; >>> -scan_inst->src[1].negate = >>> !scan_inst->src[1].negate; >>> +for (int i = 0; i < 2; i++) { >>> + if (scan_inst->src[i].file == IMM) { >>> + brw_negate_immediate(scan_inst->src[i].type, >>> + >>> _inst->src[i].as_brw_reg()); >>> + } else { >>> + scan_inst->src[i].negate = >>> !scan_inst->src[i].negate; >>> + } >>> +} >> >> Is this going to affect the number of generated instructions if there >> are multiple MADs using the same immediate value for a multiply source? >> Would it be better to find a multiply source that isn't an immediate? > > Interesting question. I think the answer is no, since > brw_fs_combine_constants.cpp builds its list of constants by ignoring > their signs. Since source negation is free, it just loads positive > values into registers and negates them with a source modifier if > needed. > > I ran shader-db to confirm -- no changes on SKL. In that case, this patch is also Reviewed-by: Ian Romanick ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] i965/bufmgr: Add a helper to mark a BO as external
Jason, this nominated series landed without mentioning any specific stable queue. From what I'm seeing, both depend on 2c4097aff1b which didn't make it for 17.2 so I'm dropping them for that queue. Let me know what you think. On Fri, 2017-11-17 at 17:02 -0800, Jason Ekstrand wrote: > --- > src/mesa/drivers/dri/i965/brw_bufmgr.c | 17 +++-- > 1 file changed, 11 insertions(+), 6 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.c > b/src/mesa/drivers/dri/i965/brw_bufmgr.c > index 17036b5..60b0dad 100644 > --- a/src/mesa/drivers/dri/i965/brw_bufmgr.c > +++ b/src/mesa/drivers/dri/i965/brw_bufmgr.c > @@ -1177,8 +1177,8 @@ err: > return NULL; > } > > -int > -brw_bo_gem_export_to_prime(struct brw_bo *bo, int *prime_fd) > +static void > +brw_bo_make_external(struct brw_bo *bo) > { > struct brw_bufmgr *bufmgr = bo->bufmgr; > > @@ -1190,6 +1190,14 @@ brw_bo_gem_export_to_prime(struct brw_bo *bo, int > *prime_fd) >} >mtx_unlock(>lock); > } > +} > + > +int > +brw_bo_gem_export_to_prime(struct brw_bo *bo, int *prime_fd) > +{ > + struct brw_bufmgr *bufmgr = bo->bufmgr; > + > + brw_bo_make_external(bo); > > if (drmPrimeHandleToFD(bufmgr->fd, bo->gem_handle, >DRM_CLOEXEC, prime_fd) != 0) > @@ -1213,11 +1221,8 @@ brw_bo_flink(struct brw_bo *bo, uint32_t *name) >if (drmIoctl(bufmgr->fd, DRM_IOCTL_GEM_FLINK, )) > return -errno; > > + brw_bo_make_external(bo); >mtx_lock(>lock); > - if (!bo->external) { > - _mesa_hash_table_insert(bufmgr->handle_table, >gem_handle, bo); > - bo->external = true; > - } >if (!bo->global_name) { > bo->global_name = flink.name; > _mesa_hash_table_insert(bufmgr->name_table, >global_name, bo); -- Br, Andres ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [Freedreno] [PATCH 1/2] nir: allow texture offsets with cube maps
On Mon, Nov 20, 2017 at 7:08 PM, Jason Ekstrandwrote: > On Mon, Nov 20, 2017 at 3:11 PM, Ilia Mirkin wrote: >> >> On Mon, Nov 20, 2017 at 5:16 PM, Jason Ekstrand >> wrote: >> > On Sun, Nov 19, 2017 at 11:54 AM, Ilia Mirkin >> > wrote: >> >> >> >> GL doesn't have this, but some hardware supports it. This is convenient >> >> for lowering tg4 to plain texture calls, which is necessary on Adreno >> >> A4xx hardware. >> >> >> >> Signed-off-by: Ilia Mirkin >> >> --- >> >> src/compiler/nir/nir.h | 15 +-- >> >> 1 file changed, 13 insertions(+), 2 deletions(-) >> >> >> >> diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h >> >> index f46f6147110..64965ae16d6 100644 >> >> --- a/src/compiler/nir/nir.h >> >> +++ b/src/compiler/nir/nir.h >> >> @@ -1364,8 +1364,7 @@ nir_tex_instr_src_size(const nir_tex_instr >> >> *instr, >> >> unsigned src) >> >> if (instr->src[src].src_type == nir_tex_src_ms_mcs) >> >>return 4; >> >> >> >> - if (instr->src[src].src_type == nir_tex_src_offset || >> >> - instr->src[src].src_type == nir_tex_src_ddx || >> >> + if (instr->src[src].src_type == nir_tex_src_ddx || >> >> instr->src[src].src_type == nir_tex_src_ddy) { >> >>if (instr->is_array) >> >> return instr->coord_components - 1; >> >> @@ -1373,6 +1372,18 @@ nir_tex_instr_src_size(const nir_tex_instr >> >> *instr, >> >> unsigned src) >> >> return instr->coord_components; >> >> } >> >> >> >> + /* Usual APIs don't allow cube + offset, but we allow it, with 2 >> >> coords for >> >> +* the offset, since a cube maps to a single face. >> >> +*/ >> >> + if (instr->src[src].src_type == nir_tex_src_offset) { >> >> + unsigned ret = instr->coord_components; >> >> + if (instr->is_array) >> >> + ret--; >> >> + if (instr->sampler_dim == GLSL_SAMPLER_DIM_CUBE) >> >> + ret--; >> >> + return ret; >> > >> > >> > I think I'd rather this look more like the one above: >> > >> > if (instr->is_array) >> >return instr->coord_components; >> > else if (instr->sampler_dim == GLSL_SAMPLER_DIM_CUBE) >> >return 2; >> > else >> >return instr->coord_components - 1; >> > >> > It seems a bit cleaner and/or more explicit to me. >> >> OK. Although your version is slightly wrong, but I get the idea. Will >> do that in a v2. (array should get -1, and cube should always get 2 >> even if it's an array) > > > I'd forgotten about cube arrays, yes, those would naturally be -2. In that > case, maybe -- for each subtraction is a good idea... Well, rearranging it, we get if (instr->sampler_dim == CUBE) return 2; else if (instr->is_array) return comp - 1; else return comp; Happy to leave it alone too though. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [Freedreno] [PATCH 1/2] nir: allow texture offsets with cube maps
On Mon, Nov 20, 2017 at 3:11 PM, Ilia Mirkinwrote: > On Mon, Nov 20, 2017 at 5:16 PM, Jason Ekstrand > wrote: > > On Sun, Nov 19, 2017 at 11:54 AM, Ilia Mirkin > wrote: > >> > >> GL doesn't have this, but some hardware supports it. This is convenient > >> for lowering tg4 to plain texture calls, which is necessary on Adreno > >> A4xx hardware. > >> > >> Signed-off-by: Ilia Mirkin > >> --- > >> src/compiler/nir/nir.h | 15 +-- > >> 1 file changed, 13 insertions(+), 2 deletions(-) > >> > >> diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h > >> index f46f6147110..64965ae16d6 100644 > >> --- a/src/compiler/nir/nir.h > >> +++ b/src/compiler/nir/nir.h > >> @@ -1364,8 +1364,7 @@ nir_tex_instr_src_size(const nir_tex_instr *instr, > >> unsigned src) > >> if (instr->src[src].src_type == nir_tex_src_ms_mcs) > >>return 4; > >> > >> - if (instr->src[src].src_type == nir_tex_src_offset || > >> - instr->src[src].src_type == nir_tex_src_ddx || > >> + if (instr->src[src].src_type == nir_tex_src_ddx || > >> instr->src[src].src_type == nir_tex_src_ddy) { > >>if (instr->is_array) > >> return instr->coord_components - 1; > >> @@ -1373,6 +1372,18 @@ nir_tex_instr_src_size(const nir_tex_instr > *instr, > >> unsigned src) > >> return instr->coord_components; > >> } > >> > >> + /* Usual APIs don't allow cube + offset, but we allow it, with 2 > >> coords for > >> +* the offset, since a cube maps to a single face. > >> +*/ > >> + if (instr->src[src].src_type == nir_tex_src_offset) { > >> + unsigned ret = instr->coord_components; > >> + if (instr->is_array) > >> + ret--; > >> + if (instr->sampler_dim == GLSL_SAMPLER_DIM_CUBE) > >> + ret--; > >> + return ret; > > > > > > I think I'd rather this look more like the one above: > > > > if (instr->is_array) > >return instr->coord_components; > > else if (instr->sampler_dim == GLSL_SAMPLER_DIM_CUBE) > >return 2; > > else > >return instr->coord_components - 1; > > > > It seems a bit cleaner and/or more explicit to me. > > OK. Although your version is slightly wrong, but I get the idea. Will > do that in a v2. (array should get -1, and cube should always get 2 > even if it's an array) > I'd forgotten about cube arrays, yes, those would naturally be -2. In that case, maybe -- for each subtraction is a good idea... > -ilia > > > > > Also, bonus points to anyone who converts this function to use a switch. > :-P > > > > --Jason > > > >> > >> + } > >> + > >> return 1; > >> } > >> > >> -- > >> 2.13.6 > >> > >> ___ > >> mesa-dev mailing list > >> mesa-dev@lists.freedesktop.org > >> https://lists.freedesktop.org/mailman/listinfo/mesa-dev > > > > > > > > ___ > > Freedreno mailing list > > freedr...@lists.freedesktop.org > > https://lists.freedesktop.org/mailman/listinfo/freedreno > > > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 0/4] Fixes for ATI_fragment_shader
These are fixes for the problems my WIP Piglit tests uncovered so far. After this series gallium drivers should pass all tests, but swrast has some issues that I'm still working on. Unfortunately I can't test on r200. Miklós Máté (4): mesa: mesa: add fallback texture for SampleMapATI if there is nothing mesa: fix crash when an ATI_fs pass begins with an alpha inst mesa: fix typo in ATI_fs dstMod error checking mesa: fix validate for secondary interpolator src/mesa/main/atifragshader.c | 31 +-- src/mesa/main/texstate.c | 35 +++ 2 files changed, 52 insertions(+), 14 deletions(-) -- 2.15.0.rc0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/4] mesa: mesa: add fallback texture for SampleMapATI if there is nothing
This fixes crash in the state tracker. Piglit: spec/ati_fragment_shader/render-notexture Signed-off-by: Miklós Máté--- src/mesa/main/texstate.c | 35 +++ 1 file changed, 35 insertions(+) diff --git a/src/mesa/main/texstate.c b/src/mesa/main/texstate.c index 2146723d08..a0717a542f 100644 --- a/src/mesa/main/texstate.c +++ b/src/mesa/main/texstate.c @@ -819,6 +819,35 @@ update_ff_texture_state(struct gl_context *ctx, } } +static void +fix_missing_textures_for_atifs(struct gl_context *ctx, + struct gl_program *prog, + BITSET_WORD *enabled_texture_units) +{ + GLbitfield mask; + GLuint s; + int unit; + gl_texture_index target_index; + struct gl_texture_object *texObj; + + mask = prog->SamplersUsed; + while (mask) { + s = u_bit_scan(); + unit = prog->SamplerUnits[s]; + + target_index = ffs(prog->TexturesUsed[unit]) - 1; + + if (!ctx->Texture.Unit[unit]._Current) { + texObj = _mesa_get_fallback_texture(ctx, target_index); + + _mesa_reference_texobj(>Texture.Unit[unit]._Current, texObj); + BITSET_SET(enabled_texture_units, unit); + ctx->Texture._MaxEnabledTexImageUnit = +MAX2(ctx->Texture._MaxEnabledTexImageUnit, (int)unit); + } + } +} + /** * \note This routine refers to derived texture matrix values to * compute the ENABLE_TEXMAT flags, but is only called on @@ -866,6 +895,12 @@ _mesa_update_texture_state(struct gl_context *ctx) if (!prog[MESA_SHADER_FRAGMENT]) update_ff_texture_state(ctx, enabled_texture_units); + /* add fallback texture for SampleMapATI if there is nothing */ + if (_mesa_ati_fragment_shader_enabled(ctx) && + ctx->ATIFragmentShader.Current->Program) + fix_missing_textures_for_atifs(ctx, +ctx->ATIFragmentShader.Current->Program, enabled_texture_units); + /* Now, clear out the _Current of any disabled texture units. */ for (i = 0; i <= ctx->Texture._MaxEnabledTexImageUnit; i++) { if (!BITSET_TEST(enabled_texture_units, i)) -- 2.15.0.rc0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 3/4] mesa: fix typo in ATI_fs dstMod error checking
Piglit: spec/ati_fragment_shader/error14-invalidmod Signed-off-by: Miklós Máté--- src/mesa/main/atifragshader.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/mesa/main/atifragshader.c b/src/mesa/main/atifragshader.c index d6fc37ac8f..313ba0c66d 100644 --- a/src/mesa/main/atifragshader.c +++ b/src/mesa/main/atifragshader.c @@ -623,7 +623,7 @@ _mesa_FragmentOpXATI(GLint optype, GLuint arg_count, GLenum op, GLuint dst, } if ((modtemp != GL_NONE) && (modtemp != GL_2X_BIT_ATI) && (modtemp != GL_4X_BIT_ATI) && (modtemp != GL_8X_BIT_ATI) && - (modtemp != GL_HALF_BIT_ATI) && !(modtemp != GL_QUARTER_BIT_ATI) && + (modtemp != GL_HALF_BIT_ATI) && (modtemp != GL_QUARTER_BIT_ATI) && (modtemp != GL_EIGHTH_BIT_ATI)) { _mesa_error(ctx, GL_INVALID_ENUM, "C/AFragmentOpATI(dstMod)%x", modtemp); return; -- 2.15.0.rc0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 4/4] mesa: fix validate for secondary interpolator
This patch fixes multiple problems: - the interpolator check was duplicated - both had arg instead of argRep - I split it into color and alpha for better readability and error msg - the DOT4 check only applies to color instruction according to the spec - made the DOT4 check fatal, and improved the error msg Piglit: spec/ati_fragment_shader/error08-secondary Signed-off-by: Miklós Máté--- src/mesa/main/atifragshader.c | 23 --- 1 file changed, 12 insertions(+), 11 deletions(-) diff --git a/src/mesa/main/atifragshader.c b/src/mesa/main/atifragshader.c index 313ba0c66d..1bca8267b8 100644 --- a/src/mesa/main/atifragshader.c +++ b/src/mesa/main/atifragshader.c @@ -171,15 +171,15 @@ static int check_arith_arg(struct ati_fragment_shader *curProg, _mesa_error(ctx, GL_INVALID_ENUM, "C/AFragmentOpATI(arg)"); return 0; } - if ((arg == GL_SECONDARY_INTERPOLATOR_ATI) && (((optype == 0) && (argRep == GL_ALPHA)) || - ((optype == 1) && ((arg == GL_ALPHA) || (argRep == GL_NONE) { - _mesa_error(ctx, GL_INVALID_OPERATION, "C/AFragmentOpATI(sec_interp)"); - return 0; - } - if ((arg == GL_SECONDARY_INTERPOLATOR_ATI) && (((optype == 0) && (argRep == GL_ALPHA)) || - ((optype == 1) && ((arg == GL_ALPHA) || (argRep == GL_NONE) { - _mesa_error(ctx, GL_INVALID_OPERATION, "C/AFragmentOpATI(sec_interp)"); - return 0; + if (arg == GL_SECONDARY_INTERPOLATOR_ATI) { + if ((optype == ATI_FRAGMENT_SHADER_COLOR_OP) && (argRep == GL_ALPHA)) { + _mesa_error(ctx, GL_INVALID_OPERATION, "CFragmentOpATI(sec_interp)"); + return 0; + } else if ((optype == ATI_FRAGMENT_SHADER_ALPHA_OP) && +((argRep == GL_ALPHA) || (argRep == GL_NONE))) { + _mesa_error(ctx, GL_INVALID_OPERATION, "AFragmentOpATI(sec_interp)"); + return 0; + } } if ((curProg->cur_pass == 1) && ((arg == GL_PRIMARY_COLOR_ARB) || (arg == GL_SECONDARY_INTERPOLATOR_ATI))) { @@ -642,10 +642,11 @@ _mesa_FragmentOpXATI(GLint optype, GLuint arg_count, GLenum op, GLuint dst, return; } } - if ((op == GL_DOT4_ATI) && + if ((optype == ATI_FRAGMENT_SHADER_COLOR_OP) && (op == GL_DOT4_ATI) && (((arg1 == GL_SECONDARY_INTERPOLATOR_ATI) && ((arg1Rep == GL_ALPHA) || (arg1Rep == GL_NONE))) || (((arg2 == GL_SECONDARY_INTERPOLATOR_ATI) && ((arg2Rep == GL_ALPHA) || (arg2Rep == GL_NONE)) { - _mesa_error(ctx, GL_INVALID_OPERATION, "C/AFragmentOpATI(sec_interp)"); + _mesa_error(ctx, GL_INVALID_OPERATION, "C/AFragmentOpATI(sec_interpDOT4)"); + return; } if (!check_arith_arg(curProg, optype, arg1, arg1Rep)) { -- 2.15.0.rc0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/4] mesa: fix crash when an ATI_fs pass begins with an alpha inst
This fixes crash when: - first pass begins with alpha inst - first pass ends with color inst, second pass begins with alpha inst Also, use the symbolic name instead of a number. Piglit: spec/ati_fragment_shader/api-alphafirst Signed-off-by: Miklós Máté--- src/mesa/main/atifragshader.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/src/mesa/main/atifragshader.c b/src/mesa/main/atifragshader.c index 49ddb6e5af..d6fc37ac8f 100644 --- a/src/mesa/main/atifragshader.c +++ b/src/mesa/main/atifragshader.c @@ -598,8 +598,10 @@ _mesa_FragmentOpXATI(GLint optype, GLuint arg_count, GLenum op, GLuint dst, curProg->cur_pass=3; /* decide whether this is a new instruction or not ... all color instructions are new, - and alpha instructions might also be new if there was no preceding color inst */ - if ((optype == 0) || (curProg->last_optype == optype)) { + and alpha instructions might also be new if there was no preceding color inst, + and this may be the first inst of the pass */ + if ((optype == ATI_FRAGMENT_SHADER_COLOR_OP) || (curProg->last_optype == optype) + || (curProg->numArithInstr[curProg->cur_pass >> 1] == 0)) { if (curProg->numArithInstr[curProg->cur_pass >> 1] > 7) { _mesa_error(ctx, GL_INVALID_OPERATION, "C/AFragmentOpATI(instrCount)"); return; -- 2.15.0.rc0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] i965/cnl: Correctly sample from fast-cleared sRGB textures
On Mon, Nov 20, 2017 at 03:48:54PM -0800, Nanley Chery wrote: > On Mon, Nov 20, 2017 at 02:55:28PM -0800, Jason Ekstrand wrote: > > On Mon, Nov 20, 2017 at 2:10 PM, Nanley Cherywrote: > > > > > On Mon, Nov 20, 2017 at 01:54:39PM -0800, Nanley Chery wrote: > > > > When sampling from fast-cleared sRGB textures on gen10, the hardware > > > > will not decode the clear color to linear. We must therefore do it in > > > > software. > > > > > > > > > > I should've included the tests that are now passing (added in locally): > > > > > > * spec@arb_framebuffer_srgb@arb_framebuffer_srgb-fast-clear-blend > > > * spec@arb_framebuffer_srgb@fbo-fast-clear > > > * spec@arb_framebuffer_srgb@msaa-fast-clear > > > * spec@ext_texture_srgb@fbo-fast-clear > > > * spec@ext_texture_srgb@multisample-fast-clear gl_ext_texture_srgb > > > > > > -Nanley > > > > > > > Signed-off-by: Nanley Chery > > > > --- > > > > src/mesa/drivers/dri/i965/brw_blorp.c| 7 + > > > > src/mesa/drivers/dri/i965/brw_meta_util.c| 38 > > > > > > > src/mesa/drivers/dri/i965/brw_meta_util.h| 5 > > > > src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 9 +- > > > > 4 files changed, 58 insertions(+), 1 deletion(-) > > > > > > > > diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c > > > b/src/mesa/drivers/dri/i965/brw_blorp.c > > > > index 38284d3b85..b10e9b95b7 100644 > > > > --- a/src/mesa/drivers/dri/i965/brw_blorp.c > > > > +++ b/src/mesa/drivers/dri/i965/brw_blorp.c > > > > @@ -328,6 +328,13 @@ brw_blorp_blit_miptrees(struct brw_context *brw, > > > > struct blorp_surf src_surf, dst_surf; > > > > blorp_surf_for_miptree(brw, _surf, src_mt, src_aux_usage, false, > > > >_level, src_layer, 1, _surfs[0]); > > > > + if (devinfo->gen >= 10 && isl_format_is_srgb(src_isl_format) && > > > > + src_clear_supported) { > > > > + src_surf.clear_color = > > > > + gen10_convert_srgb_fast_clear_color(devinfo, src_isl_format, > > > > + src_mt->fast_clear_color); > > > > + } > > > > + > > > > blorp_surf_for_miptree(brw, _surf, dst_mt, dst_aux_usage, true, > > > >_level, dst_layer, 1, _surfs[1]); > > > > > > > > diff --git a/src/mesa/drivers/dri/i965/brw_meta_util.c > > > b/src/mesa/drivers/dri/i965/brw_meta_util.c > > > > index d292f5a8e2..97c6114f8a 100644 > > > > --- a/src/mesa/drivers/dri/i965/brw_meta_util.c > > > > +++ b/src/mesa/drivers/dri/i965/brw_meta_util.c > > > > @@ -28,6 +28,7 @@ > > > > #include "brw_state.h" > > > > #include "main/blend.h" > > > > #include "main/fbobject.h" > > > > +#include "main/format_utils.h" > > > > #include "util/format_srgb.h" > > > > > > > > /** > > > > @@ -315,6 +316,43 @@ brw_is_color_fast_clear_compatible(struct > > > brw_context *brw, > > > > return true; > > > > } > > > > > > > > +/* When sampling from fast-cleared sRGB textures on gen10, the hardware > > > > + * will not decode the clear color to linear. We must therefore do it > > > > in > > > > + * software. > > > > + */ > > > > +union isl_color_value > > > > +gen10_convert_srgb_fast_clear_color(const struct gen_device_info > > > *devinfo, > > > > +enum isl_format format, > > > > +const union isl_color_value color) > > > > +{ > > > > + assert(devinfo->gen >= 10); > > > > + assert(isl_format_is_srgb(format)); > > > > + assert(isl_format_supports_ccs_d(devinfo, format)); > > > > + > > > > + /* All the CCS-supported sRGB textures in GL are 4-channel, > > > 8-bit/channel, > > > > +* UNORM formats. > > > > +*/ > > > > + assert(isl_format_get_num_channels(format) == 4); > > > > + assert(isl_format_channels_have_size(format, 8)); > > > > + assert(isl_format_has_unorm_channel(format)); > > > > > > > There are exactly 4 ISL formats which match this. You could just assert > > that it's one of the four and drop the other 5 asserts. It doesn't really > > matter, but this is rather complicated. > > > > > > I agree. I'll fix this locally and send out a v2 after getting more of > your feedback. > Just a note: after looking things over, I was surprised to find that only 3 formats match the above (R8G8B8X8_UNORM_SRGB isn't included). -Nanley ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] i965/cnl: Correctly sample from fast-cleared sRGB textures
On Mon, Nov 20, 2017 at 3:48 PM, Nanley Cherywrote: > On Mon, Nov 20, 2017 at 02:55:28PM -0800, Jason Ekstrand wrote: > > On Mon, Nov 20, 2017 at 2:10 PM, Nanley Chery > wrote: > > > > > On Mon, Nov 20, 2017 at 01:54:39PM -0800, Nanley Chery wrote: > > > > When sampling from fast-cleared sRGB textures on gen10, the hardware > > > > will not decode the clear color to linear. We must therefore do it in > > > > software. > > > > > > > > > > I should've included the tests that are now passing (added in locally): > > > > > > * spec@arb_framebuffer_srgb@arb_framebuffer_srgb-fast-clear-blend > > > * spec@arb_framebuffer_srgb@fbo-fast-clear > > > * spec@arb_framebuffer_srgb@msaa-fast-clear > > > * spec@ext_texture_srgb@fbo-fast-clear > > > * spec@ext_texture_srgb@multisample-fast-clear gl_ext_texture_srgb > > > > > > -Nanley > > > > > > > Signed-off-by: Nanley Chery > > > > --- > > > > src/mesa/drivers/dri/i965/brw_blorp.c| 7 + > > > > src/mesa/drivers/dri/i965/brw_meta_util.c| 38 > > > > > > > src/mesa/drivers/dri/i965/brw_meta_util.h| 5 > > > > src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 9 +- > > > > 4 files changed, 58 insertions(+), 1 deletion(-) > > > > > > > > diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c > > > b/src/mesa/drivers/dri/i965/brw_blorp.c > > > > index 38284d3b85..b10e9b95b7 100644 > > > > --- a/src/mesa/drivers/dri/i965/brw_blorp.c > > > > +++ b/src/mesa/drivers/dri/i965/brw_blorp.c > > > > @@ -328,6 +328,13 @@ brw_blorp_blit_miptrees(struct brw_context *brw, > > > > struct blorp_surf src_surf, dst_surf; > > > > blorp_surf_for_miptree(brw, _surf, src_mt, src_aux_usage, > false, > > > >_level, src_layer, 1, _surfs[0]); > > > > + if (devinfo->gen >= 10 && isl_format_is_srgb(src_isl_format) && > > > > + src_clear_supported) { > > > > + src_surf.clear_color = > > > > + gen10_convert_srgb_fast_clear_color(devinfo, > src_isl_format, > > > > + > src_mt->fast_clear_color); > > > > + } > > > > + > > > > blorp_surf_for_miptree(brw, _surf, dst_mt, dst_aux_usage, > true, > > > >_level, dst_layer, 1, _surfs[1]); > > > > > > > > diff --git a/src/mesa/drivers/dri/i965/brw_meta_util.c > > > b/src/mesa/drivers/dri/i965/brw_meta_util.c > > > > index d292f5a8e2..97c6114f8a 100644 > > > > --- a/src/mesa/drivers/dri/i965/brw_meta_util.c > > > > +++ b/src/mesa/drivers/dri/i965/brw_meta_util.c > > > > @@ -28,6 +28,7 @@ > > > > #include "brw_state.h" > > > > #include "main/blend.h" > > > > #include "main/fbobject.h" > > > > +#include "main/format_utils.h" > > > > #include "util/format_srgb.h" > > > > > > > > /** > > > > @@ -315,6 +316,43 @@ brw_is_color_fast_clear_compatible(struct > > > brw_context *brw, > > > > return true; > > > > } > > > > > > > > +/* When sampling from fast-cleared sRGB textures on gen10, the > hardware > > > > + * will not decode the clear color to linear. We must therefore do > it in > > > > + * software. > > > > + */ > > > > +union isl_color_value > > > > +gen10_convert_srgb_fast_clear_color(const struct gen_device_info > > > *devinfo, > > > > +enum isl_format format, > > > > +const union isl_color_value > color) > > > > +{ > > > > + assert(devinfo->gen >= 10); > > > > + assert(isl_format_is_srgb(format)); > > > > + assert(isl_format_supports_ccs_d(devinfo, format)); > > > > + > > > > + /* All the CCS-supported sRGB textures in GL are 4-channel, > > > 8-bit/channel, > > > > +* UNORM formats. > > > > +*/ > > > > + assert(isl_format_get_num_channels(format) == 4); > > > > + assert(isl_format_channels_have_size(format, 8)); > > > > + assert(isl_format_has_unorm_channel(format)); > > > > > > > There are exactly 4 ISL formats which match this. You could just assert > > that it's one of the four and drop the other 5 asserts. It doesn't > really > > matter, but this is rather complicated. > > > > > > I agree. I'll fix this locally and send out a v2 after getting more of > your feedback. > > > > > + > > > > + /* According to do_single_blorp_clear(), fast-clears for > > > texture-views are > > > > +* disabled. This means that we only have to do > channel-to-channel > > > format > > > > +* conversions. > > > > +*/ > > > > + union isl_color_value override_color = color; > > > > + for (unsigned i = 0; i < 3; i++) { > > > > + /* According to brw_is_color_fast_clear_compatible(), only > > > floating-point > > > > + * fast-clears have been enabled, so it is safe to rely on > > > > + * isl_color_value::f32. > > > > + */ > > > > > > > There is no such thing as integer sRGB, it wouldn't even make sense. > > > > > > Wouldn't we run into an issue if integer fast-clears are enabled and > texture views of different
Re: [Mesa-dev] [Mesa-stable] [PATCH 02/14] anv/cmd_buffer: Take bo_offset into account in fast clear state addresses
On Mon, Nov 20, 2017 at 3:51 PM, Andres Gomezwrote: > Jason, this nominated patch landed without mentioning any specific > stable queue. > > Similarly to the 01/14, from what I'm seeing, it depends on > a62a97933578 which didn't make it for 17.2 so I'm dropping it for that > queue. > > Let me know what you think. > Looks correct to me. > On Mon, 2017-11-13 at 08:12 -0800, Jason Ekstrand wrote: > > Otherwise, if the image is not bound to the start of the buffer, we're > > going to be reading and writing its fast clear state in the wrong spot. > > > > Cc: mesa-sta...@lists.freedesktop.org > > --- > > src/intel/vulkan/genX_cmd_buffer.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/src/intel/vulkan/genX_cmd_buffer.c > b/src/intel/vulkan/genX_cmd_buffer.c > > index 2564976..9eb6074 100644 > > --- a/src/intel/vulkan/genX_cmd_buffer.c > > +++ b/src/intel/vulkan/genX_cmd_buffer.c > > @@ -461,7 +461,7 @@ get_fast_clear_state_address(const struct > anv_device *device, > > > > return (struct anv_address) { > >.bo = image->planes[plane].bo, > > - .offset = offset, > > + .offset = image->planes[plane].bo_offset + offset, > > }; > > } > > > -- > Br, > > Andres > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [Mesa-stable] [PATCH 02/14] anv/cmd_buffer: Take bo_offset into account in fast clear state addresses
Jason, this nominated patch landed without mentioning any specific stable queue. Similarly to the 01/14, from what I'm seeing, it depends on a62a97933578 which didn't make it for 17.2 so I'm dropping it for that queue. Let me know what you think. On Mon, 2017-11-13 at 08:12 -0800, Jason Ekstrand wrote: > Otherwise, if the image is not bound to the start of the buffer, we're > going to be reading and writing its fast clear state in the wrong spot. > > Cc: mesa-sta...@lists.freedesktop.org > --- > src/intel/vulkan/genX_cmd_buffer.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/src/intel/vulkan/genX_cmd_buffer.c > b/src/intel/vulkan/genX_cmd_buffer.c > index 2564976..9eb6074 100644 > --- a/src/intel/vulkan/genX_cmd_buffer.c > +++ b/src/intel/vulkan/genX_cmd_buffer.c > @@ -461,7 +461,7 @@ get_fast_clear_state_address(const struct anv_device > *device, > > return (struct anv_address) { >.bo = image->planes[plane].bo, > - .offset = offset, > + .offset = image->planes[plane].bo_offset + offset, > }; > } > -- Br, Andres ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] i965/cnl: Correctly sample from fast-cleared sRGB textures
On Mon, Nov 20, 2017 at 02:55:28PM -0800, Jason Ekstrand wrote: > On Mon, Nov 20, 2017 at 2:10 PM, Nanley Cherywrote: > > > On Mon, Nov 20, 2017 at 01:54:39PM -0800, Nanley Chery wrote: > > > When sampling from fast-cleared sRGB textures on gen10, the hardware > > > will not decode the clear color to linear. We must therefore do it in > > > software. > > > > > > > I should've included the tests that are now passing (added in locally): > > > > * spec@arb_framebuffer_srgb@arb_framebuffer_srgb-fast-clear-blend > > * spec@arb_framebuffer_srgb@fbo-fast-clear > > * spec@arb_framebuffer_srgb@msaa-fast-clear > > * spec@ext_texture_srgb@fbo-fast-clear > > * spec@ext_texture_srgb@multisample-fast-clear gl_ext_texture_srgb > > > > -Nanley > > > > > Signed-off-by: Nanley Chery > > > --- > > > src/mesa/drivers/dri/i965/brw_blorp.c| 7 + > > > src/mesa/drivers/dri/i965/brw_meta_util.c| 38 > > > > > src/mesa/drivers/dri/i965/brw_meta_util.h| 5 > > > src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 9 +- > > > 4 files changed, 58 insertions(+), 1 deletion(-) > > > > > > diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c > > b/src/mesa/drivers/dri/i965/brw_blorp.c > > > index 38284d3b85..b10e9b95b7 100644 > > > --- a/src/mesa/drivers/dri/i965/brw_blorp.c > > > +++ b/src/mesa/drivers/dri/i965/brw_blorp.c > > > @@ -328,6 +328,13 @@ brw_blorp_blit_miptrees(struct brw_context *brw, > > > struct blorp_surf src_surf, dst_surf; > > > blorp_surf_for_miptree(brw, _surf, src_mt, src_aux_usage, false, > > >_level, src_layer, 1, _surfs[0]); > > > + if (devinfo->gen >= 10 && isl_format_is_srgb(src_isl_format) && > > > + src_clear_supported) { > > > + src_surf.clear_color = > > > + gen10_convert_srgb_fast_clear_color(devinfo, src_isl_format, > > > + src_mt->fast_clear_color); > > > + } > > > + > > > blorp_surf_for_miptree(brw, _surf, dst_mt, dst_aux_usage, true, > > >_level, dst_layer, 1, _surfs[1]); > > > > > > diff --git a/src/mesa/drivers/dri/i965/brw_meta_util.c > > b/src/mesa/drivers/dri/i965/brw_meta_util.c > > > index d292f5a8e2..97c6114f8a 100644 > > > --- a/src/mesa/drivers/dri/i965/brw_meta_util.c > > > +++ b/src/mesa/drivers/dri/i965/brw_meta_util.c > > > @@ -28,6 +28,7 @@ > > > #include "brw_state.h" > > > #include "main/blend.h" > > > #include "main/fbobject.h" > > > +#include "main/format_utils.h" > > > #include "util/format_srgb.h" > > > > > > /** > > > @@ -315,6 +316,43 @@ brw_is_color_fast_clear_compatible(struct > > brw_context *brw, > > > return true; > > > } > > > > > > +/* When sampling from fast-cleared sRGB textures on gen10, the hardware > > > + * will not decode the clear color to linear. We must therefore do it in > > > + * software. > > > + */ > > > +union isl_color_value > > > +gen10_convert_srgb_fast_clear_color(const struct gen_device_info > > *devinfo, > > > +enum isl_format format, > > > +const union isl_color_value color) > > > +{ > > > + assert(devinfo->gen >= 10); > > > + assert(isl_format_is_srgb(format)); > > > + assert(isl_format_supports_ccs_d(devinfo, format)); > > > + > > > + /* All the CCS-supported sRGB textures in GL are 4-channel, > > 8-bit/channel, > > > +* UNORM formats. > > > +*/ > > > + assert(isl_format_get_num_channels(format) == 4); > > > + assert(isl_format_channels_have_size(format, 8)); > > > + assert(isl_format_has_unorm_channel(format)); > > > > There are exactly 4 ISL formats which match this. You could just assert > that it's one of the four and drop the other 5 asserts. It doesn't really > matter, but this is rather complicated. > > I agree. I'll fix this locally and send out a v2 after getting more of your feedback. > > > + > > > + /* According to do_single_blorp_clear(), fast-clears for > > texture-views are > > > +* disabled. This means that we only have to do channel-to-channel > > format > > > +* conversions. > > > +*/ > > > + union isl_color_value override_color = color; > > > + for (unsigned i = 0; i < 3; i++) { > > > + /* According to brw_is_color_fast_clear_compatible(), only > > floating-point > > > + * fast-clears have been enabled, so it is safe to rely on > > > + * isl_color_value::f32. > > > + */ > > > > There is no such thing as integer sRGB, it wouldn't even make sense. > > Wouldn't we run into an issue if integer fast-clears are enabled and texture views of different formats are allowed to be fast-cleared? The specific example I have in mind is: * Fast clear an R8G8B8A8_UINT to (255,255,255,255) * Sample from the texture as SRGB8_ALPHA8 We'd need to use the u32 and directly instead of doing the conversion with the f32. I hope I've explained that
Re: [Mesa-dev] [Mesa-stable] [PATCH 01/14] anv/cmd_buffer: Advance the address when initializing clear colors
Jason, this nominated patch landed without mentioning any specific stable queue. From what I'm seeing, it depends on 3735af04152b which didn't make it for 17.2 so I'm dropping it for that queue, unless you want to provide a backport. Let me know what you think. On Mon, 2017-11-13 at 08:12 -0800, Jason Ekstrand wrote: > Found by inspection > > Cc: mesa-sta...@lists.freedesktop.org > --- > src/intel/vulkan/genX_cmd_buffer.c | 9 ++--- > 1 file changed, 6 insertions(+), 3 deletions(-) > > diff --git a/src/intel/vulkan/genX_cmd_buffer.c > b/src/intel/vulkan/genX_cmd_buffer.c > index fbb5706..2564976 100644 > --- a/src/intel/vulkan/genX_cmd_buffer.c > +++ b/src/intel/vulkan/genX_cmd_buffer.c > @@ -557,12 +557,13 @@ init_fast_clear_state_entry(struct anv_cmd_buffer > *cmd_buffer, > /* Other combinations of auxiliary buffers and platforms require specific > * values in the clear value dword(s). > */ > + struct anv_address addr = > + get_fast_clear_state_address(cmd_buffer->device, image, aspect, level, > + FAST_CLEAR_STATE_FIELD_CLEAR_COLOR); > unsigned i = 0; > for (; i < cmd_buffer->device->isl_dev.ss.clear_value_size; i += 4) { >anv_batch_emit(_buffer->batch, GENX(MI_STORE_DATA_IMM), sdi) { > - sdi.Address = > -get_fast_clear_state_address(cmd_buffer->device, image, aspect, > level, > - FAST_CLEAR_STATE_FIELD_CLEAR_COLOR); > + sdi.Address = addr; > > if (GEN_GEN >= 9) { > /* MCS buffers on SKL+ can only have 1/0 clear colors. */ > @@ -586,6 +587,8 @@ init_fast_clear_state_entry(struct anv_cmd_buffer > *cmd_buffer, > sdi.ImmediateData = 0; > } >} > + > + addr += 4; > } > } > -- Br, Andres ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965: Support decoding INTERFACE_DESCRIPTOR_DATA with INTEL_DEBUG=bat
Jordan Justenwrites: > This will dump the INTERFACE_DESCRIPTOR_DATA along with the associated > samplers & surfaces. > > Signed-off-by: Jordan Justen Reviewed-by: Scott D Phillips > --- > src/mesa/drivers/dri/i965/intel_batchbuffer.c | 24 > 1 file changed, 24 insertions(+) > > diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c > b/src/mesa/drivers/dri/i965/intel_batchbuffer.c > index 3412b1d0a5f..216073129ba 100644 > --- a/src/mesa/drivers/dri/i965/intel_batchbuffer.c > +++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.c > @@ -564,6 +564,30 @@ do_batch_dump(struct brw_context *brw) > decode_struct(brw, spec, "DEPTH_STENCIL_STATE", state, > state_gtt_offset, p[1] & ~0x3fu, color); > break; > + case MEDIA_INTERFACE_DESCRIPTOR_LOAD: { > + struct gen_group *group = > +gen_spec_find_struct(spec, "RENDER_SURFACE_STATE"); > + if (!group) > +break; > + > + uint32_t idd_offset = p[3] & ~0x1fu; > + decode_struct(brw, spec, "INTERFACE_DESCRIPTOR_DATA", state, > + state_gtt_offset, idd_offset, color); > + > + uint32_t ss_offset = state[idd_offset / 4 + 3] & ~0x1fu; > + decode_structs(brw, spec, "SAMPLER_STATE", state, > +state_gtt_offset, ss_offset, 4 * 4, color); > + > + uint32_t bt_offset = state[idd_offset / 4 + 4] & ~0x1fu; > + int bt_entries = brw_state_batch_size(brw, bt_offset) / 4; > + uint32_t *bt_pointers = [bt_offset / 4]; > + for (int i = 0; i < bt_entries; i++) { > +fprintf(stderr, "SURFACE_STATE - BTI = %d\n", i); > +gen_print_group(stderr, group, state_gtt_offset + bt_pointers[i], > +[bt_pointers[i] / 4], color); > + } > + break; > + } >} > } > > -- > 2.14.1 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] i965/fs: Handle negating immediates on MADs when propagating saturates
On Mon, Nov 20, 2017 at 2:50 PM, Ian Romanickwrote: > On 11/20/2017 02:33 PM, Matt Turner wrote: >> MADs don't take immediate sources, but we allow them in the IR since it >> simplifies a lot of things. I neglected to consider that case. >> >> Fixes: 4009a9ead490 ("i965/fs: Allow saturate propagation to propagate >> negations into MADs.") >> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103616 >> Reported-and-Tested-by: Ruslan Kabatsayev >> --- >> src/intel/compiler/brw_fs_saturate_propagation.cpp | 10 -- >> 1 file changed, 8 insertions(+), 2 deletions(-) >> >> diff --git a/src/intel/compiler/brw_fs_saturate_propagation.cpp >> b/src/intel/compiler/brw_fs_saturate_propagation.cpp >> index 1c97a507d8..d6cfa79a61 100644 >> --- a/src/intel/compiler/brw_fs_saturate_propagation.cpp >> +++ b/src/intel/compiler/brw_fs_saturate_propagation.cpp >> @@ -88,8 +88,14 @@ opt_saturate_propagation_local(fs_visitor *v, bblock_t >> *block) >> scan_inst->src[0].negate = >> !scan_inst->src[0].negate; >> inst->src[0].negate = false; >> } else if (scan_inst->opcode == BRW_OPCODE_MAD) { >> -scan_inst->src[0].negate = >> !scan_inst->src[0].negate; >> -scan_inst->src[1].negate = >> !scan_inst->src[1].negate; >> +for (int i = 0; i < 2; i++) { >> + if (scan_inst->src[i].file == IMM) { >> + brw_negate_immediate(scan_inst->src[i].type, >> + >> _inst->src[i].as_brw_reg()); >> + } else { >> + scan_inst->src[i].negate = >> !scan_inst->src[i].negate; >> + } >> +} > > Is this going to affect the number of generated instructions if there > are multiple MADs using the same immediate value for a multiply source? > Would it be better to find a multiply source that isn't an immediate? Interesting question. I think the answer is no, since brw_fs_combine_constants.cpp builds its list of constants by ignoring their signs. Since source negation is free, it just loads positive values into registers and negates them with a source modifier if needed. I ran shader-db to confirm -- no changes on SKL. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 103825] [glx][i965][regression][bisected] pigilt glx-query-renderer-coverage regression
https://bugs.freedesktop.org/show_bug.cgi?id=103825 Andrés Gómez Garcíachanged: What|Removed |Added QA Contact|mesa-dev@lists.freedesktop. |pig...@lists.freedesktop.or |org |g Version|git |unspecified Product|Mesa|piglit Assignee|mesa-dev@lists.freedesktop. |pig...@lists.freedesktop.or |org |g Component|GLX |tests -- You are receiving this mail because: You are the assignee for the bug. You are the QA Contact for the bug.___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] i965: Implement ARB_texture_filter_minmax
Ian Romanickwrites: > On 11/19/2017 01:31 AM, Kenneth Graunke wrote: >> On Thursday, November 16, 2017 11:50:48 AM PST Ian Romanick wrote: >>> On 11/14/2017 02:54 PM, Scott D Phillips wrote: On gen >= 9, minmax reduction modes are available as a flag in SAMPLER_STATE. --- docs/features.txt | 2 +- src/mesa/drivers/dri/i965/brw_formatquery.c | 4 src/mesa/drivers/dri/i965/genX_state_upload.c | 10 ++ src/mesa/drivers/dri/i965/intel_extensions.c | 1 + 4 files changed, 16 insertions(+), 1 deletion(-) diff --git a/docs/features.txt b/docs/features.txt index 86d07ba80b..9ec3f2b975 100644 --- a/docs/features.txt +++ b/docs/features.txt @@ -312,7 +312,7 @@ Khronos, ARB, and OES extensions that are not part of any OpenGL or OpenGL ES ve GL_ARB_sparse_texture not started GL_ARB_sparse_texture2not started GL_ARB_sparse_texture_clamp not started - GL_ARB_texture_filter_minmax not started + GL_ARB_texture_filter_minmax DONE (i965) GL_EXT_memory_object DONE (radeonsi) GL_EXT_memory_object_fd DONE (radeonsi) GL_EXT_memory_object_win32not started diff --git a/src/mesa/drivers/dri/i965/brw_formatquery.c b/src/mesa/drivers/dri/i965/brw_formatquery.c index 4f3b9e467b..bb2281f571 100644 --- a/src/mesa/drivers/dri/i965/brw_formatquery.c +++ b/src/mesa/drivers/dri/i965/brw_formatquery.c @@ -107,6 +107,10 @@ brw_query_internal_format(struct gl_context *ctx, GLenum target, break; } + case GL_TEXTURE_REDUCTION_MODE_ARB: + params[0] = GL_TRUE; + break; + >>> >>> Can Gen9 actually support all formats? When this was getting discussed >>> in Khronos, I thought we had some format restrictions. This is the >>> difference between the EXT and the ARB extension. If we don't actually >>> have any restrictions on Gen9, a good follow-on would be to add support >>> for the EXT... since that doesn't have the OpenGL 3.3 requirement and >>> has interactions with OpenGL ES. >>> >>> I also thought that Gen8 could do some min/max modes. Perhaps that's >>> where the restrictions were that I was thinking of... >> >> Skylake has the SAMPLER_STATE reduction mode field. >> >> Previous generations had sampler_min/max messages that work differently. >> >> If I recall, we discussed this previously and concluded the old messages >> weren't that useful...or at least, would result in more NOS and awkward >> restrictions. > > That's probably what I was thinking of. > > So... we think it should "just work" for all formats on SKL? It sounds > like various test suites only exercise a very small set of > single-channel formats. I think we should add piglit tests to poke at > least some other formats. I talked to Scott about this last week, and > he seemed amenable to the idea. I'm working on making the piglit tests we talked about, so that should help figure this out. FWIW, I wasn't able to find any restrictions listed in the PRM. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/8] meson: fix test source name for static glapi
fixes: 43a6e84927e3 ("meson: build mesa test.") Signed-off-by: Dylan Baker--- src/mesa/main/tests/meson.build | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/mesa/main/tests/meson.build b/src/mesa/main/tests/meson.build index 2c1d8e067e8..78bef00f61e 100644 --- a/src/mesa/main/tests/meson.build +++ b/src/mesa/main/tests/meson.build @@ -30,7 +30,7 @@ if with_shared_glapi ) link_main_test += libglapi else - files_main_test += files('stub.cpp') + files_main_test += files('stubs.cpp') endif test('main-test', executable( -- 2.15.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 3/8] glapi/check_table: Remove 'extern "C"' block
This doesn't actually accomplish what it's meant to do, as extern C doesn't undefine __cplusplus, so the included headers define a template (because __cplusplus is defined), but then that code is in an 'extern "C"' block, and explosion. Signed-off-by: Dylan Baker--- src/mapi/glapi/tests/check_table.cpp | 2 -- 1 file changed, 2 deletions(-) diff --git a/src/mapi/glapi/tests/check_table.cpp b/src/mapi/glapi/tests/check_table.cpp index 62b3a43d22f..7cded8d352f 100644 --- a/src/mapi/glapi/tests/check_table.cpp +++ b/src/mapi/glapi/tests/check_table.cpp @@ -24,10 +24,8 @@ #include #include "main/glheader.h" -extern "C" { #include "glapi/glapi.h" #include "glapi/glapitable.h" -} struct name_offset { const char *name; -- 2.15.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 4/8] glapi: remove APPLE extensions from test
Fixes: 7009955281260fbb ("mesa: Remove GL_APPLE_vertex_array_object stubs") Signed-off-by: Dylan Baker--- src/mapi/glapi/tests/check_table.cpp | 2 -- 1 file changed, 2 deletions(-) diff --git a/src/mapi/glapi/tests/check_table.cpp b/src/mapi/glapi/tests/check_table.cpp index 7cded8d352f..30f523ca5fb 100644 --- a/src/mapi/glapi/tests/check_table.cpp +++ b/src/mapi/glapi/tests/check_table.cpp @@ -1402,9 +1402,7 @@ const struct name_offset known_dispatch[] = { { "glPointParameteri", _O(PointParameteri) }, { "glPointParameteriv", _O(PointParameteriv) }, { "glActiveStencilFaceEXT", _O(ActiveStencilFaceEXT) }, - { "glBindVertexArrayAPPLE", _O(BindVertexArrayAPPLE) }, { "glDeleteVertexArrays", _O(DeleteVertexArrays) }, - { "glGenVertexArraysAPPLE", _O(GenVertexArraysAPPLE) }, { "glIsVertexArray", _O(IsVertexArray) }, { "glGetProgramNamedParameterdvNV", _O(GetProgramNamedParameterdvNV) }, { "glGetProgramNamedParameterfvNV", _O(GetProgramNamedParameterfvNV) }, -- 2.15.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 6/8] state_tracker: Don't build st-renumerate-test without shared glapi
Signed-off-by: Dylan Baker--- src/mesa/state_tracker/tests/Makefile.am | 2 ++ 1 file changed, 2 insertions(+) diff --git a/src/mesa/state_tracker/tests/Makefile.am b/src/mesa/state_tracker/tests/Makefile.am index 6c58d367694..18c0ce50621 100644 --- a/src/mesa/state_tracker/tests/Makefile.am +++ b/src/mesa/state_tracker/tests/Makefile.am @@ -16,9 +16,11 @@ AM_CPPFLAGS = \ $(DEFINES) if HAVE_STD_CXX11 +if HAVE_SHARED_GLAPI TESTS = st-renumerate-test check_PROGRAMS = st-renumerate-test endif +endif st_renumerate_test_SOURCES = \ test_glsl_to_tgsi_lifetime.cpp -- 2.15.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 8/8] glapi: fix check_table test for non-shared glapi with meson
Signed-off-by: Dylan Baker--- src/mapi/glapi/meson.build | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/src/mapi/glapi/meson.build b/src/mapi/glapi/meson.build index 892d8b35759..db6cd834511 100644 --- a/src/mapi/glapi/meson.build +++ b/src/mapi/glapi/meson.build @@ -78,10 +78,10 @@ libglapi_static = static_library( if not with_shared_glapi and with_tests glapi_static_check_table = executable( 'glapi_static_check_table', -'tests/check_table.cpp', -include_directories : inc_mesa, +['tests/check_table.cpp', glapitable_h], +include_directories : [inc_include, inc_src, inc_mesa, inc_mapi], link_with : [libglapi_static], -dependencies : [idep_gtest], +dependencies : [idep_gtest, dep_thread], ) test('glapi_static_check_table', glapi_static_check_table) -- 2.15.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 7/8] glapi: Don't search through subdirs from glapitable.h
Because meson won't put it in that folder. Signed-off-by: Dylan Baker--- src/mapi/glapi/tests/check_table.cpp | 2 +- src/mesa/main/tests/Makefile.am | 2 ++ 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/src/mapi/glapi/tests/check_table.cpp b/src/mapi/glapi/tests/check_table.cpp index 30f523ca5fb..6230f1273f3 100644 --- a/src/mapi/glapi/tests/check_table.cpp +++ b/src/mapi/glapi/tests/check_table.cpp @@ -25,7 +25,7 @@ #include "main/glheader.h" #include "glapi/glapi.h" -#include "glapi/glapitable.h" +#include "glapitable.h" struct name_offset { const char *name; diff --git a/src/mesa/main/tests/Makefile.am b/src/mesa/main/tests/Makefile.am index 47fce8a5b78..5c920b70236 100644 --- a/src/mesa/main/tests/Makefile.am +++ b/src/mesa/main/tests/Makefile.am @@ -4,6 +4,8 @@ AM_CPPFLAGS = \ -I$(top_srcdir)/src/gtest/include \ -I$(top_srcdir)/src \ -I$(top_srcdir)/src/mapi \ + -I$(top_builddir)/src/mapi/glapi \ + -I$(top_srcdir)/src/mapi/glapi \ -I$(top_builddir)/src/mesa \ -I$(top_srcdir)/src/mesa \ -I$(top_srcdir)/include \ -- 2.15.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 5/8] autotools: Fix includes for non-shared glapi tests.
Signed-off-by: Dylan Baker--- src/mapi/Makefile.am | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/mapi/Makefile.am b/src/mapi/Makefile.am index 015edc37cb3..5db888abddc 100644 --- a/src/mapi/Makefile.am +++ b/src/mapi/Makefile.am @@ -150,7 +150,8 @@ check_PROGRAMS += glapi-test glapi_test_SOURCES = glapi/tests/check_table.cpp glapi_test_CPPFLAGS = \ $(AM_CPPFLAGS) \ - -I$(top_srcdir)/src/gtest/include + -I$(top_srcdir)/src/gtest/include \ + -I$(top_srcdir)/src/mesa glapi_test_LDADD = \ $(top_builddir)/src/mapi/glapi/libglapi.la \ -- 2.15.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/8] glapi: don't walk backwards for includes
Instead just set the proper -I flags and include it from a more standard path. In this case we'll add -Isrc/mesa (which is common), and #include main/foo.h. --- src/mapi/Makefile.am | 3 ++- src/mapi/glapi/meson.build | 1 + src/mapi/glapi/tests/check_table.cpp | 2 +- 3 files changed, 4 insertions(+), 2 deletions(-) diff --git a/src/mapi/Makefile.am b/src/mapi/Makefile.am index 3f8fcc02e55..015edc37cb3 100644 --- a/src/mapi/Makefile.am +++ b/src/mapi/Makefile.am @@ -87,7 +87,8 @@ check_PROGRAMS += shared-glapi-test shared_glapi_test_SOURCES = shared-glapi/tests/check_table.cpp shared_glapi_test_CPPFLAGS = \ $(AM_CPPFLAGS) \ - -I$(top_srcdir)/src/gtest/include + -I$(top_srcdir)/src/gtest/include \ + -I$(top_srcdir)/src/mesa shared_glapi_test_LDADD = \ $(top_builddir)/src/mapi/shared-glapi/libglapi.la \ $(top_builddir)/src/gtest/libgtest.la diff --git a/src/mapi/glapi/meson.build b/src/mapi/glapi/meson.build index d2d86afd6c1..892d8b35759 100644 --- a/src/mapi/glapi/meson.build +++ b/src/mapi/glapi/meson.build @@ -79,6 +79,7 @@ if not with_shared_glapi and with_tests glapi_static_check_table = executable( 'glapi_static_check_table', 'tests/check_table.cpp', +include_directories : inc_mesa, link_with : [libglapi_static], dependencies : [idep_gtest], ) diff --git a/src/mapi/glapi/tests/check_table.cpp b/src/mapi/glapi/tests/check_table.cpp index 09bf4f3585c..62b3a43d22f 100644 --- a/src/mapi/glapi/tests/check_table.cpp +++ b/src/mapi/glapi/tests/check_table.cpp @@ -22,7 +22,7 @@ */ #include -#include "../mesa/main/glheader.h" +#include "main/glheader.h" extern "C" { #include "glapi/glapi.h" -- 2.15.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 0/8] Fix non-shared glapi path
There are two distinct set of problems this series addresses. 1) the meson build has never worked 2) the checks for this path have been broken in several different ways for a long time. Dylan Baker (8): glapi: don't walk backwards for includes meson: fix test source name for static glapi glapi/check_table: Remove 'extern "C"' block glapi: remove APPLE extensions from test autotools: Fix includes for non-shared glapi tests. state_tracker: Don't build st-renumerate-test without shared glapi glapi: Don't search through subdirs from glapitable.h glapi: fix check_table test for non-shared glapi with meson src/mapi/Makefile.am | 6 -- src/mapi/glapi/meson.build | 5 +++-- src/mapi/glapi/tests/check_table.cpp | 8 ++-- src/mesa/main/tests/Makefile.am | 2 ++ src/mesa/main/tests/meson.build | 2 +- src/mesa/state_tracker/tests/Makefile.am | 2 ++ 6 files changed, 14 insertions(+), 11 deletions(-) -- 2.15.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [Freedreno] [PATCH 1/2] nir: allow texture offsets with cube maps
On Mon, Nov 20, 2017 at 5:16 PM, Jason Ekstrandwrote: > On Sun, Nov 19, 2017 at 11:54 AM, Ilia Mirkin wrote: >> >> GL doesn't have this, but some hardware supports it. This is convenient >> for lowering tg4 to plain texture calls, which is necessary on Adreno >> A4xx hardware. >> >> Signed-off-by: Ilia Mirkin >> --- >> src/compiler/nir/nir.h | 15 +-- >> 1 file changed, 13 insertions(+), 2 deletions(-) >> >> diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h >> index f46f6147110..64965ae16d6 100644 >> --- a/src/compiler/nir/nir.h >> +++ b/src/compiler/nir/nir.h >> @@ -1364,8 +1364,7 @@ nir_tex_instr_src_size(const nir_tex_instr *instr, >> unsigned src) >> if (instr->src[src].src_type == nir_tex_src_ms_mcs) >>return 4; >> >> - if (instr->src[src].src_type == nir_tex_src_offset || >> - instr->src[src].src_type == nir_tex_src_ddx || >> + if (instr->src[src].src_type == nir_tex_src_ddx || >> instr->src[src].src_type == nir_tex_src_ddy) { >>if (instr->is_array) >> return instr->coord_components - 1; >> @@ -1373,6 +1372,18 @@ nir_tex_instr_src_size(const nir_tex_instr *instr, >> unsigned src) >> return instr->coord_components; >> } >> >> + /* Usual APIs don't allow cube + offset, but we allow it, with 2 >> coords for >> +* the offset, since a cube maps to a single face. >> +*/ >> + if (instr->src[src].src_type == nir_tex_src_offset) { >> + unsigned ret = instr->coord_components; >> + if (instr->is_array) >> + ret--; >> + if (instr->sampler_dim == GLSL_SAMPLER_DIM_CUBE) >> + ret--; >> + return ret; > > > I think I'd rather this look more like the one above: > > if (instr->is_array) >return instr->coord_components; > else if (instr->sampler_dim == GLSL_SAMPLER_DIM_CUBE) >return 2; > else >return instr->coord_components - 1; > > It seems a bit cleaner and/or more explicit to me. OK. Although your version is slightly wrong, but I get the idea. Will do that in a v2. (array should get -1, and cube should always get 2 even if it's an array) -ilia > > Also, bonus points to anyone who converts this function to use a switch. :-P > > --Jason > >> >> + } >> + >> return 1; >> } >> >> -- >> 2.13.6 >> >> ___ >> mesa-dev mailing list >> mesa-dev@lists.freedesktop.org >> https://lists.freedesktop.org/mailman/listinfo/mesa-dev > > > > ___ > Freedreno mailing list > freedr...@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/freedreno > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev