Re: VkRunner ported to Rust
Alejandro Piñeiro writes: > I have been trying to find a time slot to learn Rust too. If you have > issues around with pending features perhaps one of these days I join > your initiative. Great! I’ll have to have a brainstorm for things that need implementing. The main thing at the moment is that there is still no support for images and textures. That’s probably quite a big task. Otherwise maybe if you find yourself needing to implement a test for something that VkRunner can’t support yet that might be a good place to start. Otherwise for some really small tasks it might be good to have a look through the Clippy output. Clippy is a rust linter. It’s easier to run using Cargo so you could make a trivial Cargo.toml build file like this: [package] name = "vkrunner" version = "0.1.0" edition = "2021" [lib] path = "vkrunner/lib.rs" And then run “cargo clippy” to see the output. Another option could be to add some more unit tests. I think the coverage is already quite good but there’s probably some things missing that could be worth testing. > If the rust-based vkrunner has the same features and can run the same > kind of tests that the original c-based vkrunner, I think that using a > memory-safe language is an advantage that goes beyond just a warm fuzzy > feeling. > > If that is the case, I think that it is a good idea to replace one with > the other. Cool. Yeah, it is feature complete with the old VkRunner so it should be able to run the same scripts. I tried it with the current Piglit tests and it works fine. If no one disagrees it might be nice to merge the branch back into the official repo at Igalia’s github. I’m not sure if the CI system automatically updates to the lastest version but if so it’d be worth checking whether it can cope with switching from CMake to Meson+Rust first. Maybe in the long run it would be nice to host the repo on Freedesktop’s gitlab. Regards, – Neil
VkRunner ported to Rust
Hi folks, Does anybody remember VkRunner? It’s a little tool to help write shader-based tests for Vulkan. It’s the same concept as Piglit’s shader_runner but for Vulkan instead of OpenGL. There are a couple of tests using it in Piglit but apart from that it never really got off the ground. Anyway, I’ve been trying to learn some Rust lately and in order to get some experience working with a non-trivial project I decided to port VkRunner to Rust. The port is now complete and available here: https://github.com/bpeel/vkrunner/ It’s a drop-in replacement for the original VkRunner so it should be possible to start using it in a CI system by just changing the git repo, assuming the rust compiler is installed. I was thinking that now that Mesa has some Rust code in it anyway it might not be too unreasonable to expect CI systems to have the Rust compiler available. Other than that there’s not much advantage to using one or the other except for the warm fuzzy feeling knowing that you’re using a project written in a memory-safe language. I also took the opportunity to add a whole bunch of unit tests so in theory the Rust port might be more robust. It’s currently using Meson as the build system. In the beginning this was necessary because I did the port gradually and it’s probably the best build system if you have a mix of C and Rust code. Now that the port is complete it’d probably be trivial to start using Cargo instead. I’m not sure which would be better. It’s not using any external crates so it doesn’t really need Cargo for now. Now that the port is complete it might be nice to start adding more features. If anyone else hasn’t taken the plunge to start using Rust yet this might be a nice project to get involved in if you fancy helping. Kind regards, – Neil
[Mesa-dev] [MR] glsl: Use default precision on struct/interface members when nothing specified
Mesa already keeps track of the GLES precision for variables and stores it in the ir_variable. When no precision is explicitly specified it takes the default precision for the corresponding type. However, when the variable is a struct or interface, the precision of each individual member is attached to the glsl_type instead. The code to make it use the default precision was missing so this branch adds it in. Only the last patch actually makes this change. The rest of the patches are to fix regressions in Piglit and CTS. The underlying problem is that Mesa was considering types that have different precisions to be different when comparing interstage interfaces (varyings and UBOs). According to the spec it should be ignored. Presumably this problem already existed if mismatched precisions were explicitly specified but we didn’t have any tests to test it. Storing the default precision makes some tests fail because the default precision for ints is different in the vertex and fragment stages so it’s easier to accidentally make a test case that tests this. The tests that regressed are: dEQP-GLES31.functional.shaders.opaque_type_indexing.* (12 tests) piglit.spec.ext_transform_feedback.structs_gles3 basic-struct run piglit.spec.glsl-es-3_00.execution.varying-struct-centroid_gles3 piglit.spec.ext_transform_feedback.structs_gles3 basic-struct get https://gitlab.freedesktop.org/mesa/mesa/merge_requests/736 signature.asc Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] freedreno: Fix loading half-float immediate vectors
Previously the code to load from a constant instruction was always using the u32 pointer. If the constant is actually a 16-bit source this would end up with the wrong values because the pointer would be offset by the wrong size. This fixes it to use the u16 pointer. --- src/freedreno/ir3/ir3_compiler_nir.c | 14 +++--- 1 file changed, 11 insertions(+), 3 deletions(-) diff --git a/src/freedreno/ir3/ir3_compiler_nir.c b/src/freedreno/ir3/ir3_compiler_nir.c index 445a2b291e9..584c7e83fea 100644 --- a/src/freedreno/ir3/ir3_compiler_nir.c +++ b/src/freedreno/ir3/ir3_compiler_nir.c @@ -2525,10 +2525,18 @@ emit_load_const(struct ir3_context *ctx, nir_load_const_instr *instr) { struct ir3_instruction **dst = get_dst_ssa(ctx, >def, instr->def.num_components); - type_t type = (instr->def.bit_size < 32) ? TYPE_U16 : TYPE_U32; - for (int i = 0; i < instr->def.num_components; i++) - dst[i] = create_immed_typed(ctx->block, instr->value.u32[i], type); + if (instr->def.bit_size < 32) { + for (int i = 0; i < instr->def.num_components; i++) + dst[i] = create_immed_typed(ctx->block, + instr->value.u16[i], + TYPE_U16); + } else { + for (int i = 0; i < instr->def.num_components; i++) + dst[i] = create_immed_typed(ctx->block, + instr->value.u32[i], + TYPE_U32); + } } static void -- 2.17.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] freedreno: Add .dir-locals to the common directory
The commit aa0fed10d35 moved a bunch of Freedreno code to a common directory. The previous directory had a .dir-locals file for Emacs. This patch copies it to the new directory as well. --- src/freedreno/.dir-locals.el | 8 1 file changed, 8 insertions(+) create mode 100644 src/freedreno/.dir-locals.el diff --git a/src/freedreno/.dir-locals.el b/src/freedreno/.dir-locals.el new file mode 100644 index 000..b0e90fcbd53 --- /dev/null +++ b/src/freedreno/.dir-locals.el @@ -0,0 +1,8 @@ +((prog-mode + (indent-tabs-mode . t) + (tab-width . 4) + (c-basic-offset . 4) + (c-file-style . "k") + (fill-column . 78) + ) + ) -- 2.17.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] freedreno: Fix emacs modeline
Eric Engestrom writes: > That's absolutely fair :) > > I wanted to ack your patch earlier, since fixing it is good regardless, > but freedreno isn't my area so I didn't feel comfortable doing so; > I changed my mind in the mean time though, so here you go :P > Acked-by: Eric Engestrom > > You have push access, right? Yes, I have push access. But actually Rob already pushed my other patch to just remove it in the meantime, so there’s no need to do anything. Thanks anyway. Regards, - Neil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] freedreno: Fix the Emacs indentation configuration file
> On Wed, Oct 17, 2018 at 12:45 PM Neil Roberts wrote: > >> I wonder if you have something else in your setup that is setting it? Ilia Mirkin writes: > Perhaps. It's the default, right? It is the default but the toplevel .dir-locals.el sets it to nil. These lower-level files are trying to override it back to the default. > These might have a common source... although, HAH! IT WASN'T ME! > Michel in 8d0a1a6bc05a set it to true, I probably copied, and am so > used to emacs errors that I didn't even notice. Indents worked, so I > was happy. :) > Yes, fixing these all is probably a good move. I don't think there are > a lot of emacs users in mesa. Lucky for them :) Regards, - Neil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2] Fix setting indent-tabs-mode in the Emacs .dir-locals.el files
Some of the .dir-locals.el had the wrong name for the truthy value so it wasn’t setting indent-tabs-mode. --- src/gallium/drivers/freedreno/.dir-locals.el | 2 +- src/gallium/drivers/r600/.dir-locals.el | 2 +- src/gallium/drivers/radeon/.dir-locals.el| 2 +- src/gallium/drivers/radeonsi/.dir-locals.el | 2 +- src/mesa/drivers/dri/nouveau/.dir-locals.el | 2 +- 5 files changed, 5 insertions(+), 5 deletions(-) diff --git a/src/gallium/drivers/freedreno/.dir-locals.el b/src/gallium/drivers/freedreno/.dir-locals.el index aa20d495465..b0e90fcbd53 100644 --- a/src/gallium/drivers/freedreno/.dir-locals.el +++ b/src/gallium/drivers/freedreno/.dir-locals.el @@ -1,5 +1,5 @@ ((prog-mode - (indent-tabs-mode . true) + (indent-tabs-mode . t) (tab-width . 4) (c-basic-offset . 4) (c-file-style . "k") diff --git a/src/gallium/drivers/r600/.dir-locals.el b/src/gallium/drivers/r600/.dir-locals.el index 4e35c129e70..15cd68edb0a 100644 --- a/src/gallium/drivers/r600/.dir-locals.el +++ b/src/gallium/drivers/r600/.dir-locals.el @@ -1,5 +1,5 @@ ((prog-mode - (indent-tabs-mode . true) + (indent-tabs-mode . t) (tab-width . 8) (c-basic-offset . 8) (c-file-style . "stroustrup") diff --git a/src/gallium/drivers/radeon/.dir-locals.el b/src/gallium/drivers/radeon/.dir-locals.el index 4e35c129e70..15cd68edb0a 100644 --- a/src/gallium/drivers/radeon/.dir-locals.el +++ b/src/gallium/drivers/radeon/.dir-locals.el @@ -1,5 +1,5 @@ ((prog-mode - (indent-tabs-mode . true) + (indent-tabs-mode . t) (tab-width . 8) (c-basic-offset . 8) (c-file-style . "stroustrup") diff --git a/src/gallium/drivers/radeonsi/.dir-locals.el b/src/gallium/drivers/radeonsi/.dir-locals.el index 4e35c129e70..15cd68edb0a 100644 --- a/src/gallium/drivers/radeonsi/.dir-locals.el +++ b/src/gallium/drivers/radeonsi/.dir-locals.el @@ -1,5 +1,5 @@ ((prog-mode - (indent-tabs-mode . true) + (indent-tabs-mode . t) (tab-width . 8) (c-basic-offset . 8) (c-file-style . "stroustrup") diff --git a/src/mesa/drivers/dri/nouveau/.dir-locals.el b/src/mesa/drivers/dri/nouveau/.dir-locals.el index 774f023ae6f..9b3ddf52461 100644 --- a/src/mesa/drivers/dri/nouveau/.dir-locals.el +++ b/src/mesa/drivers/dri/nouveau/.dir-locals.el @@ -1,5 +1,5 @@ ((prog-mode - (indent-tabs-mode . true) + (indent-tabs-mode . t) (tab-width . 8) (c-basic-offset . 8) (c-file-style . "stroustrup") -- 2.17.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] freedreno: Fix the Emacs indentation configuration file
Ilia Mirkin writes: > Are you sure? It works fine for me... I'm not against fixing it to be > "t", but the current contents definitely worked fine for me. (As I > recall, I may be the one who checked this file in.) Yes, I’m sure. If you type “true” and then do C-x C-e to evaluate it then Emacs gives a void-variable error. If I leave it as “true” in the file then it does indeed indent without tabs. Also if I do C-h v it says the value is nil, whereas if I change the .dir-local.el to “t” then the indentation works properly and the variable help says its value comes from the .dir-locals.el. I wonder if you have something else in your setup that is setting it? I notice that there are some other files with the same problem. It might be worth fixing them all in one patch. $ git grep 'indent-tabs-mode *\. *true' src/gallium/drivers/freedreno/.dir-locals.el: (indent-tabs-mode . true) src/gallium/drivers/r600/.dir-locals.el: (indent-tabs-mode . true) src/gallium/drivers/radeon/.dir-locals.el: (indent-tabs-mode . true) src/gallium/drivers/radeonsi/.dir-locals.el: (indent-tabs-mode . true) src/mesa/drivers/dri/nouveau/.dir-locals.el: (indent-tabs-mode . true) Regards, - Neil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/2] freedreno: Remove the Emacs mode lines
These are not necessary because the corresponding settings are set via the .dir-locals.el file anyway. Most of them were missing a ‘:’ after “tab-width” which was making Emacs display an annoying warning whenever you open the file. This patch was made with: sed -ri '/-\*- mode:/,/^$/d' \ $(find src/gallium/{drivers,winsys} -name \*.\[ch\] \ -exec grep -l -- '-\*- mode:' {} \+) --- src/gallium/drivers/freedreno/a2xx/fd2_blend.c | 2 -- src/gallium/drivers/freedreno/a2xx/fd2_blend.h | 2 -- src/gallium/drivers/freedreno/a2xx/fd2_compiler.c | 2 -- src/gallium/drivers/freedreno/a2xx/fd2_compiler.h | 2 -- src/gallium/drivers/freedreno/a2xx/fd2_context.c| 2 -- src/gallium/drivers/freedreno/a2xx/fd2_context.h| 2 -- src/gallium/drivers/freedreno/a2xx/fd2_draw.c | 2 -- src/gallium/drivers/freedreno/a2xx/fd2_draw.h | 2 -- src/gallium/drivers/freedreno/a2xx/fd2_emit.c | 2 -- src/gallium/drivers/freedreno/a2xx/fd2_emit.h | 2 -- src/gallium/drivers/freedreno/a2xx/fd2_gmem.c | 2 -- src/gallium/drivers/freedreno/a2xx/fd2_gmem.h | 2 -- src/gallium/drivers/freedreno/a2xx/fd2_program.c| 2 -- src/gallium/drivers/freedreno/a2xx/fd2_program.h| 2 -- src/gallium/drivers/freedreno/a2xx/fd2_rasterizer.c | 2 -- src/gallium/drivers/freedreno/a2xx/fd2_rasterizer.h | 2 -- src/gallium/drivers/freedreno/a2xx/fd2_screen.c | 2 -- src/gallium/drivers/freedreno/a2xx/fd2_screen.h | 2 -- src/gallium/drivers/freedreno/a2xx/fd2_texture.c| 2 -- src/gallium/drivers/freedreno/a2xx/fd2_texture.h| 2 -- src/gallium/drivers/freedreno/a2xx/fd2_util.c | 2 -- src/gallium/drivers/freedreno/a2xx/fd2_util.h | 2 -- src/gallium/drivers/freedreno/a2xx/fd2_zsa.c| 2 -- src/gallium/drivers/freedreno/a2xx/fd2_zsa.h| 2 -- src/gallium/drivers/freedreno/a3xx/fd3_blend.c | 2 -- src/gallium/drivers/freedreno/a3xx/fd3_blend.h | 2 -- src/gallium/drivers/freedreno/a3xx/fd3_context.c| 2 -- src/gallium/drivers/freedreno/a3xx/fd3_context.h| 2 -- src/gallium/drivers/freedreno/a3xx/fd3_draw.c | 2 -- src/gallium/drivers/freedreno/a3xx/fd3_draw.h | 2 -- src/gallium/drivers/freedreno/a3xx/fd3_emit.c | 2 -- src/gallium/drivers/freedreno/a3xx/fd3_emit.h | 2 -- src/gallium/drivers/freedreno/a3xx/fd3_gmem.c | 2 -- src/gallium/drivers/freedreno/a3xx/fd3_gmem.h | 2 -- src/gallium/drivers/freedreno/a3xx/fd3_program.c| 2 -- src/gallium/drivers/freedreno/a3xx/fd3_program.h| 2 -- src/gallium/drivers/freedreno/a3xx/fd3_query.c | 2 -- src/gallium/drivers/freedreno/a3xx/fd3_query.h | 2 -- src/gallium/drivers/freedreno/a3xx/fd3_rasterizer.c | 2 -- src/gallium/drivers/freedreno/a3xx/fd3_rasterizer.h | 2 -- src/gallium/drivers/freedreno/a3xx/fd3_screen.c | 2 -- src/gallium/drivers/freedreno/a3xx/fd3_screen.h | 2 -- src/gallium/drivers/freedreno/a3xx/fd3_texture.c| 2 -- src/gallium/drivers/freedreno/a3xx/fd3_texture.h| 2 -- src/gallium/drivers/freedreno/a3xx/fd3_zsa.c| 2 -- src/gallium/drivers/freedreno/a3xx/fd3_zsa.h| 2 -- src/gallium/drivers/freedreno/a4xx/fd4_blend.c | 2 -- src/gallium/drivers/freedreno/a4xx/fd4_blend.h | 2 -- src/gallium/drivers/freedreno/a4xx/fd4_context.c| 2 -- src/gallium/drivers/freedreno/a4xx/fd4_context.h| 2 -- src/gallium/drivers/freedreno/a4xx/fd4_draw.c | 2 -- src/gallium/drivers/freedreno/a4xx/fd4_draw.h | 2 -- src/gallium/drivers/freedreno/a4xx/fd4_emit.c | 2 -- src/gallium/drivers/freedreno/a4xx/fd4_emit.h | 2 -- src/gallium/drivers/freedreno/a4xx/fd4_format.c | 2 -- src/gallium/drivers/freedreno/a4xx/fd4_format.h | 2 -- src/gallium/drivers/freedreno/a4xx/fd4_gmem.c | 2 -- src/gallium/drivers/freedreno/a4xx/fd4_gmem.h | 2 -- src/gallium/drivers/freedreno/a4xx/fd4_program.c| 2 -- src/gallium/drivers/freedreno/a4xx/fd4_program.h| 2 -- src/gallium/drivers/freedreno/a4xx/fd4_query.c | 2 -- src/gallium/drivers/freedreno/a4xx/fd4_query.h | 2 -- src/gallium/drivers/freedreno/a4xx/fd4_rasterizer.c | 2 -- src/gallium/drivers/freedreno/a4xx/fd4_rasterizer.h | 2 -- src/gallium/drivers/freedreno/a4xx/fd4_screen.c | 2 -- src/gallium/drivers/freedreno/a4xx/fd4_screen.h | 2 -- src/gallium/drivers/freedreno/a4xx/fd4_texture.c| 2 -- src/gallium/drivers/freedreno/a4xx/fd4_texture.h| 2 -- src/gallium/drivers/freedreno/a4xx/fd4_zsa.c| 2 -- src/gallium/drivers/freedreno/a4xx/fd4_zsa.h| 2 -- src/gallium/drivers/freedreno/freedreno_context.c | 2 --
[Mesa-dev] [PATCH 1/2] freedreno: Fix the Emacs indentation configuration file
The .dir-locals.el had the wrong name for the truthy value so it wasn’t setting indent-tabs-mode. --- src/gallium/drivers/freedreno/.dir-locals.el | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/gallium/drivers/freedreno/.dir-locals.el b/src/gallium/drivers/freedreno/.dir-locals.el index aa20d495465..b0e90fcbd53 100644 --- a/src/gallium/drivers/freedreno/.dir-locals.el +++ b/src/gallium/drivers/freedreno/.dir-locals.el @@ -1,5 +1,5 @@ ((prog-mode - (indent-tabs-mode . true) + (indent-tabs-mode . t) (tab-width . 4) (c-basic-offset . 4) (c-file-style . "k") -- 2.17.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] freedreno: Fix emacs modeline
Eric Engestrom writes: > You might want to remove these instead, and use the .editorconfig [1] > already present at src/gallium/drivers/freedreno/.editorconfig This is > much easier to maintain than per-files settings ;) Either fixing it or removing it is fine by me. I now notice there is a .dir-locals.el file that should make it work anyway. (apparently I was the last person to touch it too!) It has a typo which makes it fail to set indent-tabs-mode though. I can make everything work locally either way, I just wanted to get rid of the annoying warning whenever you open a file. - Neil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] freedreno: Fix emacs modeline
The modeline was missing a ‘:’ after the tab-width and Emacs was complaining every time you open a file. This patch was made with: sed -ri '1 s/; tab-width ([0-9])/; tab-width: \1/' \ $(find -name \*.\[ch\] -exec grep -l -- '-\*- mode:' {} \+) --- src/gallium/drivers/freedreno/a2xx/fd2_blend.c | 2 +- src/gallium/drivers/freedreno/a2xx/fd2_blend.h | 2 +- src/gallium/drivers/freedreno/a2xx/fd2_compiler.c | 2 +- src/gallium/drivers/freedreno/a2xx/fd2_compiler.h | 2 +- src/gallium/drivers/freedreno/a2xx/fd2_context.c| 2 +- src/gallium/drivers/freedreno/a2xx/fd2_context.h| 2 +- src/gallium/drivers/freedreno/a2xx/fd2_draw.c | 2 +- src/gallium/drivers/freedreno/a2xx/fd2_draw.h | 2 +- src/gallium/drivers/freedreno/a2xx/fd2_emit.c | 2 +- src/gallium/drivers/freedreno/a2xx/fd2_emit.h | 2 +- src/gallium/drivers/freedreno/a2xx/fd2_gmem.c | 2 +- src/gallium/drivers/freedreno/a2xx/fd2_gmem.h | 2 +- src/gallium/drivers/freedreno/a2xx/fd2_program.c| 2 +- src/gallium/drivers/freedreno/a2xx/fd2_program.h| 2 +- src/gallium/drivers/freedreno/a2xx/fd2_rasterizer.c | 2 +- src/gallium/drivers/freedreno/a2xx/fd2_rasterizer.h | 2 +- src/gallium/drivers/freedreno/a2xx/fd2_screen.c | 2 +- src/gallium/drivers/freedreno/a2xx/fd2_screen.h | 2 +- src/gallium/drivers/freedreno/a2xx/fd2_texture.c| 2 +- src/gallium/drivers/freedreno/a2xx/fd2_texture.h| 2 +- src/gallium/drivers/freedreno/a2xx/fd2_util.c | 2 +- src/gallium/drivers/freedreno/a2xx/fd2_util.h | 2 +- src/gallium/drivers/freedreno/a2xx/fd2_zsa.c| 2 +- src/gallium/drivers/freedreno/a2xx/fd2_zsa.h| 2 +- src/gallium/drivers/freedreno/a3xx/fd3_blend.c | 2 +- src/gallium/drivers/freedreno/a3xx/fd3_blend.h | 2 +- src/gallium/drivers/freedreno/a3xx/fd3_context.c| 2 +- src/gallium/drivers/freedreno/a3xx/fd3_context.h| 2 +- src/gallium/drivers/freedreno/a3xx/fd3_draw.c | 2 +- src/gallium/drivers/freedreno/a3xx/fd3_draw.h | 2 +- src/gallium/drivers/freedreno/a3xx/fd3_emit.c | 2 +- src/gallium/drivers/freedreno/a3xx/fd3_emit.h | 2 +- src/gallium/drivers/freedreno/a3xx/fd3_gmem.c | 2 +- src/gallium/drivers/freedreno/a3xx/fd3_gmem.h | 2 +- src/gallium/drivers/freedreno/a3xx/fd3_program.c| 2 +- src/gallium/drivers/freedreno/a3xx/fd3_program.h| 2 +- src/gallium/drivers/freedreno/a3xx/fd3_query.c | 2 +- src/gallium/drivers/freedreno/a3xx/fd3_query.h | 2 +- src/gallium/drivers/freedreno/a3xx/fd3_rasterizer.c | 2 +- src/gallium/drivers/freedreno/a3xx/fd3_rasterizer.h | 2 +- src/gallium/drivers/freedreno/a3xx/fd3_screen.c | 2 +- src/gallium/drivers/freedreno/a3xx/fd3_screen.h | 2 +- src/gallium/drivers/freedreno/a3xx/fd3_texture.c| 2 +- src/gallium/drivers/freedreno/a3xx/fd3_texture.h| 2 +- src/gallium/drivers/freedreno/a3xx/fd3_zsa.c| 2 +- src/gallium/drivers/freedreno/a3xx/fd3_zsa.h| 2 +- src/gallium/drivers/freedreno/a4xx/fd4_blend.c | 2 +- src/gallium/drivers/freedreno/a4xx/fd4_blend.h | 2 +- src/gallium/drivers/freedreno/a4xx/fd4_context.c| 2 +- src/gallium/drivers/freedreno/a4xx/fd4_context.h| 2 +- src/gallium/drivers/freedreno/a4xx/fd4_draw.c | 2 +- src/gallium/drivers/freedreno/a4xx/fd4_draw.h | 2 +- src/gallium/drivers/freedreno/a4xx/fd4_emit.c | 2 +- src/gallium/drivers/freedreno/a4xx/fd4_emit.h | 2 +- src/gallium/drivers/freedreno/a4xx/fd4_format.c | 2 +- src/gallium/drivers/freedreno/a4xx/fd4_format.h | 2 +- src/gallium/drivers/freedreno/a4xx/fd4_gmem.c | 2 +- src/gallium/drivers/freedreno/a4xx/fd4_gmem.h | 2 +- src/gallium/drivers/freedreno/a4xx/fd4_program.c| 2 +- src/gallium/drivers/freedreno/a4xx/fd4_program.h| 2 +- src/gallium/drivers/freedreno/a4xx/fd4_query.c | 2 +- src/gallium/drivers/freedreno/a4xx/fd4_query.h | 2 +- src/gallium/drivers/freedreno/a4xx/fd4_rasterizer.c | 2 +- src/gallium/drivers/freedreno/a4xx/fd4_rasterizer.h | 2 +- src/gallium/drivers/freedreno/a4xx/fd4_screen.c | 2 +- src/gallium/drivers/freedreno/a4xx/fd4_screen.h | 2 +- src/gallium/drivers/freedreno/a4xx/fd4_texture.c| 2 +- src/gallium/drivers/freedreno/a4xx/fd4_texture.h| 2 +- src/gallium/drivers/freedreno/a4xx/fd4_zsa.c| 2 +- src/gallium/drivers/freedreno/a4xx/fd4_zsa.h| 2 +- src/gallium/drivers/freedreno/freedreno_context.c | 2 +- src/gallium/drivers/freedreno/freedreno_context.h | 2 +- src/gallium/drivers/freedreno/freedreno_draw.c | 2 +- src/gallium/drivers/freedreno/freedreno_draw.h | 2 +-
Re: [Mesa-dev] [PATCH 04/15] mesa/glspirv: Set last_vert_prog
Timothy Arceri writes: >> + >> + int last_vert_stage = >> + util_last_bit(prog->data->linked_stages & >> +(((1 << (MESA_SHADER_GEOMETRY + 1)) - 1) ^ >> + ((1 << MESA_SHADER_VERTEX) - 1))); > > Isn't this the same as: > > int last_vert_stage = >util_last_bit(prog->data->linked_stages & > ((1 << (MESA_SHADER_GEOMETRY + 1)) - 1)); > > As ((1 << MESA_SHADER_VERTEX) - 1)) == 0 > > If you use the above simplification this patch is: > > Reviewed-by: Timothy Arceri This is based on code in link_varyings_and_uniforms which has a loop over all of the stages from geometry down to vertex to try and find the last vertex stage. I was trying to mimic this behaviour which is presumably saying vertex->geometry are the vertex stages and not just “anything up to and including geometry”. I don’t really mind either way whether we include the MESA_SHADER_VERTEX part or not. I guess the cleanest way could be to add a define for the mask in shader_enums.h to make it most likely not to break if any more stages are added. But seeing as there is precedent for having this range be open-coded in the code I guess we can just leave that to another patch. Regards, - Neil signature.asc Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 13/21] nir/linker: Add gl_nir_link_uniforms()
Timothy Arceri writes: > I still think you might be able to drop the tree without using a hash > table. Can't you just add an offset field to the state struct and use > this? You would increment it as you recurse down the arrays/structs via > nir_link_uniform() and you could detect the first leaf-node once you > reach the inner most elements in a way similar to what link_uniforms() > does in with record_type in program_resource_visitor::recursion. When > moving to the next uniform you would just reset the offset back to 0. Let’s walk through an example to demonstrate what needs to happen. Imagine there is a uniform like this: layout(location = 0) uniform struct { sampler2D foo[3]; sampler2D bar[4]; } wibble[2]; This creates a total of 2*(3+4)=14 samplers. We want to group together all of the indices for a single member so that they will be laid out in the following order: Index 0: wibble[0].foo[0] Index 1: wibble[0].foo[1] Index 2: wibble[0].foo[2] Index 3: wibble[1].foo[0] Index 4: wibble[1].foo[1] Index 5: wibble[1].foo[2] Index 6: wibble[0].bar[0] Index 7: wibble[0].bar[1] Index 8: wibble[0].bar[2] Index 9: wibble[0].bar[3] Index 10: wibble[1].bar[0] Index 11: wibble[1].bar[1] Index 12: wibble[1].bar[2] Index 13: wibble[1].bar[3] Ie, all of the foos are bunched together before all of the bars. When we walk the type tree for wibble, it will recognise that the array elements are not a simple type so it will create a separate uniform for each element of wibble. That means it will visit each of foo and bar twice. With the current approach the first time we visit foo it will look at all of the parents of the type to work out the total size needed for all of the foos and reserve a continuous block of 6 indices. Then it will reserve three of those for this instance of foo by incrementing its own next_index entry by 3. Next it will do the same thing for bar by reserving 8 indices out of the global pool and setting its own next_index to 4. Then when we handle wibble[1].foo it will recognise that its own next_index is not zero so it will continue using the pool it already allocated. It will take the next three out of its pool starting at the previous value of its next_index, 3. And then it will do the same for bar taking the next four out of the pool it already reserved. This creates the following entries in UniformStorage: 0: loc=0, elems=3, storage offset=0, name=wibble[0].foo, fragment=0 1: loc=3, elems=4, storage offset=6, name=wibble[0].bar, fragment=6 2: loc=7, elems=3, storage offset=14, name=wibble[1].foo, fragment=3 3: loc=10, elems=4, storage offset=20, name=wibble[1].bar, fragment=10 ‘loc’ is the location of the uniform and ‘fragment’ is the sampler index allocated. I understand how you could use something like record_type to recognise whether it is the first time you visit a leaf node or not, but if it isn’t the first time I don’t see how you can work out what index to use if you only have a single counter in the state struct. I think you need a way to find out what the start of the range was the first time you encountered the leaf and how many times you have visited it so far. Regards, - Neil > >> >> Thanks for looking at the patch. >> >> Regards, >> - Neil >> > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev signature.asc Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 10/21] anv/nir: Use nir_variable's type if interface_type is null
Timothy Arceri writes: > > glsl_get_sampler_dim() contains the following: > > assert(glsl_type_is_sampler(type) || glsl_type_is_image(type)); > > Which leads me to believe the code above should just be: > > const struct glsl_type *glsl_type = glsl_without_array(var->type); > > If you agree please squash this patch into the previous patch where you > can keep my r-b. I agree, that sounds sensible. It looks like var->interface_type was being overloaded as a place to store the type for the image or sampler without the array type. The comment above interface_type says: “For variables that are in an interface block or are an instance of an interface block, this is the GLSL_TYPE_INTERFACE type for that block.” So it seems a little strange to use it for this purpose. It presumably works because opaque types can’t be in interfaces so the interface_type is otherwise unused. I guess with the previous patch interface_type will always be NULL so you’re right that the check for whether it is NULL in seems pointless. Maybe we should however add a comment to the commit message for the previous commit along these lines: “The previous code appeared to be using var->interface_type as a place to store the type of the variable without the enclosing array for images and samplers. I guess this worked because opaque types can not appear in interfaces so the interface_type is sort of unused. This patch removes the overloading of var->interface_type and any places that needed the type without the array can now just deduce it from var->type.” Thanks for looking at the patch. Regards, - Neil signature.asc Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 13/21] nir/linker: Add gl_nir_link_uniforms()
Timothy Arceri writes: >> +static struct type_tree_entry * >> +build_type_tree_for_type(const struct glsl_type *type) >> +{ > > Do we really need this? As far as I can tell we walk the types here to > build a tree then in nir_link_uniform() we walk the tree. Why not just > walk the types directly in nir_link_uniform()? The tree is needed as a temporary place to store the next_index and detect whether a node in the type tree is encountered for the first time or not. The nodes can be encountered multiple times because if there is an array of structs or array of arrays then we process the subtree of the array once for each element in the array. In nir_link_uniform, it effectively _is_ directly walking the glsl_type tree. However it simultaneously walks the type_tree tree as well for this extra side-band information. The GLSL linker handles this instead by using a hash table to store the next index (see set_opaque_indices in link_uniforms.cpp). The key for the table is the name of the uniform with all array indices removed. I suppose we could try to do something similar for the nir linker but it would have to use a different key because we don’t have the names. However my gut feeling is that that wouldn’t be any more efficient then using this side-band tree and the code to generate the hash table key looks quite involved. If you can think of a simpler way to handle this then of course I am open to suggestions. Thanks for looking at the patch. Regards, - Neil signature.asc Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] i965: Fix output register sizes when variable ranges are interleaved
In 6f5abf31466aed this code was fixed to calculate the maximum size of an attribute in a seperate pass and then allocate the registers to that size. However this wasn’t taking into account ranges that overlap but don’t have the same starting location. For example: layout(location = 0, component = 0) out float a[4]; layout(location = 2, component = 1) out float b[4]; Previously, if ‘a’ was processed first then it would allocate a register of size 4 for location 0 and it wouldn’t allocate another register for location 2 because it would already be covered by the range of 0. Then if something tries to write to b[2] it would try to write past the end of the register allocated for ‘a’ and it would hit an assert. This patch changes it to scan for any overlapping ranges that start within each range to calculate the maximum extent and allocate that instead. Fixed Piglit’s arb_enhanced_layouts/execution/component-layout/ vs-fs-array-interleave-range.shader_test --- For what it’s worth I also made a simplified test for this for VkRunner here: https://github.com/Igalia/vkrunner/blob/tests/examples/overlap-outputs.shader_test src/intel/compiler/brw_fs_nir.cpp | 25 ++--- 1 file changed, 18 insertions(+), 7 deletions(-) diff --git a/src/intel/compiler/brw_fs_nir.cpp b/src/intel/compiler/brw_fs_nir.cpp index 1ce89520bf1..b179018e5e8 100644 --- a/src/intel/compiler/brw_fs_nir.cpp +++ b/src/intel/compiler/brw_fs_nir.cpp @@ -67,14 +67,25 @@ fs_visitor::nir_setup_outputs() vec4s[loc] = MAX2(vec4s[loc], var_vec4s); } - nir_foreach_variable(var, >outputs) { - const int loc = var->data.driver_location; - if (outputs[loc].file == BAD_FILE) { - fs_reg reg = bld.vgrf(BRW_REGISTER_TYPE_F, 4 * vec4s[loc]); - for (unsigned i = 0; i < vec4s[loc]; i++) { -outputs[loc + i] = offset(reg, bld, 4 * i); - } + for (unsigned loc = 0; loc < ARRAY_SIZE(vec4s);) { + if (vec4s[loc] == 0) { + loc++; + continue; } + + unsigned reg_size = vec4s[loc]; + + /* Check if there are any ranges that start within this range and extend + * past it. If so, include them in this allocation. + */ + for (unsigned i = 1; i < reg_size; i++) + reg_size = MAX2(vec4s[i + loc] + i, reg_size); + + fs_reg reg = bld.vgrf(BRW_REGISTER_TYPE_F, 4 * reg_size); + for (unsigned i = 0; i < reg_size; i++) + outputs[loc + i] = offset(reg, bld, 4 * i); + + loc += reg_size; } } -- 2.14.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] spirv: Fix InterpolateAt* instructions for vecs with dynamic index
If the glsl is something like this: in vec4 some_input; interpolateAtCentroid(some_input[idx]) then it now gets generated as if it were: interpolateAtCentroid(some_input)[idx] This is necessary because the index will get generated as a series of nir_bcsel instructions so it would no longer be an input variable. It is similar to what is done for GLSL in ca63a5ed3e9efb2bd645b42. Although I can’t find anything explicit in the Vulkan specs to say this should be allowed, the SPIR-V spec just says “the operand interpolant must be a pointer to the Input Storage Class”, which I guess doesn’t rule out any type of pointer to an input. This was found using the spec/glsl-4.40/execution/fs-interpolateAt* Piglit tests with the ARB_gl_spirv branch. --- I’ve made a bunch of tests for this for VkRunner. They can be tested with: git clone https://github.com/Igalia/vkrunner.git -b tests cd vkrunner && ./autogen.sh && make ./src/vkrunner examples/interpolation/*.shader_test src/compiler/spirv/vtn_glsl450.c | 56 +--- 1 file changed, 52 insertions(+), 4 deletions(-) diff --git a/src/compiler/spirv/vtn_glsl450.c b/src/compiler/spirv/vtn_glsl450.c index 6fa759b1bba..cd39fcf6efb 100644 --- a/src/compiler/spirv/vtn_glsl450.c +++ b/src/compiler/spirv/vtn_glsl450.c @@ -773,6 +773,23 @@ handle_glsl450_alu(struct vtn_builder *b, enum GLSLstd450 entrypoint, } } +static bool +has_vector_array_deref(nir_deref *deref) +{ + while (true) { + if (deref->child == NULL) + return false; + + if (glsl_type_is_vector(deref->type) && + deref->child->deref_type == nir_deref_type_array) { + assert(deref->child->child == NULL); + return true; + } + + deref = deref->child; + } +} + static void handle_glsl450_interpolation(struct vtn_builder *b, enum GLSLstd450 opcode, const uint32_t *w, unsigned count) @@ -799,10 +816,28 @@ handle_glsl450_interpolation(struct vtn_builder *b, enum GLSLstd450 opcode, } nir_intrinsic_instr *intrin = nir_intrinsic_instr_create(b->nb.shader, op); + const struct glsl_type *intrin_type; nir_deref_var *deref = vtn_nir_deref(b, w[5]); intrin->variables[0] = nir_deref_var_clone(deref, intrin); + /* If the value we are interpolating has an index into a vector then +* interpolate the vector and index the result of that instead. This is +* necessary because the index will get generated as a series of nir_bcsel +* instructions so it would no longer be an input variable. +*/ + bool vec_array_deref = has_vector_array_deref(>deref); + if (vec_array_deref) { + /* Remove the last deref */ + nir_deref *parent = >variables[0]->deref; + while (parent->child->child) + parent = parent->child; + parent->child = NULL; + intrin_type = parent->type; + } else { + intrin_type = dest_type; + } + switch (opcode) { case GLSLstd450InterpolateAtCentroid: break; @@ -814,13 +849,26 @@ handle_glsl450_interpolation(struct vtn_builder *b, enum GLSLstd450 opcode, vtn_fail("Invalid opcode"); } - intrin->num_components = glsl_get_vector_elements(dest_type); + intrin->num_components = glsl_get_vector_elements(intrin_type); nir_ssa_dest_init(>instr, >dest, - glsl_get_vector_elements(dest_type), - glsl_get_bit_size(dest_type), NULL); - val->ssa->def = >dest.ssa; + glsl_get_vector_elements(intrin_type), + glsl_get_bit_size(intrin_type), NULL); nir_builder_instr_insert(>nb, >instr); + + if (vec_array_deref) { + nir_deref_array *deref_array = + nir_deref_as_array(nir_deref_tail(>deref)); + if (deref_array->deref_array_type == nir_deref_array_type_direct) { + val->ssa->def = vtn_vector_extract(b, >dest.ssa, +deref_array->base_offset); + } else { + val->ssa->def = vtn_vector_extract_dynamic(b, >dest.ssa, +deref_array->indirect.ssa); + } + } else { + val->ssa->def = >dest.ssa; + } } bool -- 2.14.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 21/22] i965: Setup glsl uniforms by index rather than name matching
Thanks for the feedback. I’ve implemented your suggestion as the top two patches on a branch here: https://github.com/Igalia/mesa/tree/nroberts/count-uniform-storage It also fixes a slight bug that it wasn’t passing down the uniform_storage_locs argument into the recursion. However I think my preference would be for keeping this as a separate function because it seems a little confusing to make the function count two different things depending on a bool argument. Perhaps for the sake of maintainability it would be good to move it to glsl_types though. If there is a strong preference to merge it into one function then of course I don’t mind. Regards, - Neil Timothy Arceri <tarc...@itsqueeze.com> writes: > On 18/04/18 00:36, Alejandro Piñeiro wrote: >> From: Neil Roberts <nrobe...@igalia.com> >> >> Previously when setting up a uniform it would try to walk the uniform >> storage slots and find one that matches the name of the given >> variable. However, each variable already has a location which is an >> index into the UniformStorage array so we can just directly jump to >> the right slot. Some of the variables take up more than one slot so we >> still need to calculate how many it uses. >> >> The main reason to do this is to support ARB_gl_spirv because in that >> case the uniforms don’t have names so the previous approach won’t >> work. > > This is a nice improvement away :) > > However it might be nicer to just update the existing > glsl_type::uniform_locations() helper > rather than create the custom count_uniform_storage_slots() helper. > > e.g > > unsigned > glsl_type::uniform_locations(bool uniform_storage_locs) const > { > unsigned size = 0; > > switch (this->base_type) { > case GLSL_TYPE_UINT: > case GLSL_TYPE_INT: > case GLSL_TYPE_FLOAT: > case GLSL_TYPE_FLOAT16: > case GLSL_TYPE_DOUBLE: > case GLSL_TYPE_UINT16: > case GLSL_TYPE_UINT8: > case GLSL_TYPE_INT16: > case GLSL_TYPE_INT8: > case GLSL_TYPE_UINT64: > case GLSL_TYPE_INT64: > case GLSL_TYPE_BOOL: > case GLSL_TYPE_SAMPLER: > case GLSL_TYPE_IMAGE: > case GLSL_TYPE_SUBROUTINE: >return 1; > > case GLSL_TYPE_STRUCT: > case GLSL_TYPE_INTERFACE: >for (unsigned i = 0; i < this->length; i++) > size += this->fields.structure[i].type->uniform_locations(); >return size; > case GLSL_TYPE_ARRAY: >if (!uniform_storage_locs || >this->fields.array->is_array() || >this->fields.array->is_interface() || >this->fields.array->is_record()) { >/* For uniform locations passed to the user via the API we > * count all array elements. > */ > return this->length * this->fields.array->uniform_locations(); >} else { > /* For uniform storage the innermost array only uses a >* single slot. >*/ > return 1; >} > default: >return 0; > } > } > > If you agree this patch is: > > Reviewed-by: Timothy Arceri <tarc...@itsqueeze.com> > >> --- >> src/mesa/drivers/dri/i965/brw_nir_uniforms.cpp | 50 >> +++--- >> 1 file changed, 37 insertions(+), 13 deletions(-) >> >> diff --git a/src/mesa/drivers/dri/i965/brw_nir_uniforms.cpp >> b/src/mesa/drivers/dri/i965/brw_nir_uniforms.cpp >> index 62b2951432a..cb5e07f74ba 100644 >> --- a/src/mesa/drivers/dri/i965/brw_nir_uniforms.cpp >> +++ b/src/mesa/drivers/dri/i965/brw_nir_uniforms.cpp >> @@ -118,35 +118,59 @@ brw_setup_image_uniform_values(gl_shader_stage stage, >> } >> } >> >> +static unsigned >> +count_uniform_storage_slots(const struct glsl_type *type) >> +{ >> + /* gl_uniform_storage can cope with one level of array, so if the >> +* type is a composite type or an array where each element occupies >> +* more than one slot than we need to recursively process it. >> +*/ >> + if (glsl_type_is_struct(type)) { >> + unsigned location_count = 0; >> + >> + for (unsigned i = 0; i < glsl_get_length(type); i++) { >> + const struct glsl_type *field_type = glsl_get_struct_field(type, >> i); >> + >> + location_count += count_uniform_storage_slots(field_type); >> + } >> + >> + return location_count; >> + } >> + >> + if (glsl_type_is_array(type)) { >> + const struct glsl_type *element_type = glsl_get_array_element
[Mesa-dev] [PATCH] spirv: Apply OriginUpperLeft to FragCoord
This behaviour was changed in 1e5b09f42f694687ac. The commit message for that says it is just a “tidy up” so my assumption is that the behaviour change was a mistake. It’s a little hard to decipher looking at the diff, but the previous code before that patch was: if (builtin == SpvBuiltInFragCoord || builtin == SpvBuiltInSamplePosition) nir_var->data.origin_upper_left = b->origin_upper_left; if (builtin == SpvBuiltInFragCoord) nir_var->data.pixel_center_integer = b->pixel_center_integer; After the patch the code was: case SpvBuiltInSamplePosition: nir_var->data.origin_upper_left = b->origin_upper_left; /* fallthrough */ case SpvBuiltInFragCoord: nir_var->data.pixel_center_integer = b->pixel_center_integer; break; Before the patch origin_upper_left affected both builtins and pixel_center_integer only affected FragCoord. After the patch origin_upper_left only affects SamplePosition and pixel_center_integer affects both variables. This patch tries to restore the previous behaviour by changing the code to: case SpvBuiltInFragCoord: nir_var->data.pixel_center_integer = b->pixel_center_integer; /* fallthrough */ case SpvBuiltInSamplePosition: nir_var->data.origin_upper_left = b->origin_upper_left; break; This change will be important for ARB_gl_spirv which is meant to support OriginLowerLeft. --- src/compiler/spirv/vtn_variables.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/src/compiler/spirv/vtn_variables.c b/src/compiler/spirv/vtn_variables.c index 9679ff6526c..fd8ab7f247a 100644 --- a/src/compiler/spirv/vtn_variables.c +++ b/src/compiler/spirv/vtn_variables.c @@ -1419,11 +1419,11 @@ apply_var_decoration(struct vtn_builder *b, nir_variable *nir_var, case SpvBuiltInTessLevelInner: nir_var->data.compact = true; break; - case SpvBuiltInSamplePosition: - nir_var->data.origin_upper_left = b->origin_upper_left; - /* fallthrough */ case SpvBuiltInFragCoord: nir_var->data.pixel_center_integer = b->pixel_center_integer; + /* fallthrough */ + case SpvBuiltInSamplePosition: + nir_var->data.origin_upper_left = b->origin_upper_left; break; default: break; -- 2.14.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] NIR issue with SPIRV ops that have operands with different bit-size
Rob Clarkwrites: > Karol hit the same thing (with for example, shift instructions) with > the work for spv compute/kernel support. I *think* the number of > special cases isn't too high, so probably vtn should just insert the > appropriate conversion instruction (ie. u2u32, etc) so that if the src > bitsize is incorrect it will be converted. For what it’s worth, I also added a conversion instruction like this for the Refract GLSL extension opcode because its eta argument can have a different size: https://github.com/Igalia/mesa/commit/608d70bc02a968ee2b21a5db0f54 Regards, - Neil signature.asc Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] spirv: Don’t check for NaN for most OpFOrd* comparisons
For all of the OpFOrd* comparisons except OpFOrdNotEqual the hardware should probably already return false if one of the operands is NaN so we don’t need to have an explicit check for it. This seems to at least work on Intel hardware. This should reduce the number of instructions generated for the most common comparisons. For what it’s worth, the original code to handle this was added in e062eb6415de3a. The commit message for that says that it was to fix some CTS tests for OpFUnord* opcodes. Even if the hardware doesn’t handle NaNs this patch shouldn’t affect those tests. At any rate they have since been moved out of the mustpass list. Incidentally those tests fail on the nvidia proprietary driver so it doesn’t seem like handling NaNs correctly is a priority. --- I made a VkRunner test case for all of the OpFOrd* and OpFUnord* opcodes with and without NaNs on the test branch. It can be run like this: git clone -b tests https://github.com/Igalia/vkrunner.git cd vkrunner ./autogen.sh && make -j8 ./src/vkrunner examples/unordered-comparison.shader_test src/compiler/spirv/vtn_alu.c | 17 ++--- 1 file changed, 6 insertions(+), 11 deletions(-) diff --git a/src/compiler/spirv/vtn_alu.c b/src/compiler/spirv/vtn_alu.c index 71e743cdd1e..3134849ba90 100644 --- a/src/compiler/spirv/vtn_alu.c +++ b/src/compiler/spirv/vtn_alu.c @@ -597,23 +597,18 @@ vtn_handle_alu(struct vtn_builder *b, SpvOp opcode, break; } - case SpvOpFOrdEqual: - case SpvOpFOrdNotEqual: - case SpvOpFOrdLessThan: - case SpvOpFOrdGreaterThan: - case SpvOpFOrdLessThanEqual: - case SpvOpFOrdGreaterThanEqual: { + case SpvOpFOrdNotEqual: { + /* For all the SpvOpFOrd* comparisons apart from NotEqual, the value + * from the ALU will probably already be false if the operands are not + * ordered so we don’t need to handle it specially. + */ bool swap; unsigned src_bit_size = glsl_get_bit_size(vtn_src[0]->type); unsigned dst_bit_size = glsl_get_bit_size(type); nir_op op = vtn_nir_alu_op_for_spirv_opcode(b, opcode, , src_bit_size, dst_bit_size); - if (swap) { - nir_ssa_def *tmp = src[0]; - src[0] = src[1]; - src[1] = tmp; - } + assert(!swap); val->ssa->def = nir_iand(>nb, -- 2.14.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/6] nir/spirv: fix MSVC warning in vtn_align_u32()
Patches 2-6 are: Reviewed-by: Neil Roberts <nrobe...@igalia.com> Thanks a lot for fixing this. Regards, - Neil Brian Paul <bri...@vmware.com> writes: > Fixes warning that "negation of an unsigned value results in an > unsigned value". > --- > src/compiler/spirv/vtn_private.h | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/src/compiler/spirv/vtn_private.h > b/src/compiler/spirv/vtn_private.h > index d8a00f9..269de92 100644 > --- a/src/compiler/spirv/vtn_private.h > +++ b/src/compiler/spirv/vtn_private.h > @@ -732,7 +732,7 @@ void vtn_handle_decoration(struct vtn_builder *b, SpvOp > opcode, > static inline uint32_t > vtn_align_u32(uint32_t v, uint32_t a) > { > - assert(a != 0 && a == (a & -a)); > + assert(a != 0 && a == (a & -((int32_t) a))); > return (v + a - 1) & ~(a - 1); > } > > -- > 2.7.4 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] spirv: Fix building with SCons
The SCons build broke with commit ba975140d3c9 because a SPIR-V function is called from Mesa main. This adds a convenience library for SPIR-V and adds it to everything that was including nir. It also adds both nir and spirv to drivers/x11/SConscript. --- It would be great if someone who depends on SCons could test this. We weren’t sure if there was a good reason why nir isn’t already included in drivers/x11/SConscript and whether adding it might break some platform. src/compiler/Makefile.nir.am | 3 +- src/compiler/SConscript | 1 + src/compiler/SConscript.spirv | 54 +++ src/gallium/targets/dri/SConscript| 1 + src/gallium/targets/haiku-softpipe/SConscript | 1 + src/gallium/targets/libgl-xlib/SConscript | 1 + src/gallium/targets/osmesa/SConscript | 1 + src/mesa/drivers/x11/SConscript | 2 + 8 files changed, 63 insertions(+), 1 deletion(-) create mode 100644 src/compiler/SConscript.spirv diff --git a/src/compiler/Makefile.nir.am b/src/compiler/Makefile.nir.am index 32e4145b70f..27dc129e57b 100644 --- a/src/compiler/Makefile.nir.am +++ b/src/compiler/Makefile.nir.am @@ -100,4 +100,5 @@ EXTRA_DIST += \ nir/nir_opt_algebraic.py\ nir/tests \ nir/README \ - SConscript.nir + SConscript.nir \ + SConstript.spirv diff --git a/src/compiler/SConscript b/src/compiler/SConscript index 44509a9a95b..0a0c0737422 100644 --- a/src/compiler/SConscript +++ b/src/compiler/SConscript @@ -27,3 +27,4 @@ Export('compiler') SConscript('SConscript.glsl') SConscript('SConscript.nir') +SConscript('SConscript.spirv') diff --git a/src/compiler/SConscript.spirv b/src/compiler/SConscript.spirv new file mode 100644 index 000..49410881d0b --- /dev/null +++ b/src/compiler/SConscript.spirv @@ -0,0 +1,54 @@ +import common + +Import('*') + +from sys import executable as python_cmd + +env = env.Clone() + +env.MSVC2013Compat() + +env.Prepend(CPPPATH = [ +'#include', +'#src', +'#src/mapi', +'#src/mesa', +'#src/gallium/include', +'#src/gallium/auxiliary', +'#src/compiler/nir', +'#src/compiler/spirv', +]) + +# Make generated headers reachable from the include path. +env.Prepend(CPPPATH = [Dir('.').abspath, Dir('nir').abspath]) +env.Prepend(CPPPATH = [Dir('.').abspath, Dir('spirv').abspath]) + +# spirv generated sources + +env.CodeGenerate( +target = 'spirv/spirv_info.c', +script = 'spirv/spirv_info_c.py', +source = ['spirv/spirv.core.grammar.json'], +command = python_cmd + ' $SCRIPT $SOURCE $TARGET' +) + +env.CodeGenerate( +target = 'spirv/vtn_gather_types.c', +script = 'spirv/vtn_gather_types_c.py', +source = ['spirv/spirv.core.grammar.json'], +command = python_cmd + ' $SCRIPT $SOURCE $TARGET' +) + +# parse Makefile.sources +source_lists = env.ParseSourceList('Makefile.sources') + +spirv_sources = source_lists['SPIRV_FILES'] +spirv_sources += source_lists['SPIRV_GENERATED_FILES'] + +spirv = env.ConvenienceLibrary( +target = 'spirv', +source = spirv_sources, +) + +env.Alias('spirv', spirv) +Export('spirv') diff --git a/src/gallium/targets/dri/SConscript b/src/gallium/targets/dri/SConscript index f5c2818d04f..ff6ce3bf4e0 100644 --- a/src/gallium/targets/dri/SConscript +++ b/src/gallium/targets/dri/SConscript @@ -45,6 +45,7 @@ env.Prepend(LIBS = [ mesa, glsl, nir, +spirv, gallium, megadrivers_stub, dri_common, diff --git a/src/gallium/targets/haiku-softpipe/SConscript b/src/gallium/targets/haiku-softpipe/SConscript index f80c167d83b..89792fba132 100644 --- a/src/gallium/targets/haiku-softpipe/SConscript +++ b/src/gallium/targets/haiku-softpipe/SConscript @@ -10,6 +10,7 @@ env.Prepend(LIBS = [ mesa, glsl, nir, +spirv, gallium ]) diff --git a/src/gallium/targets/libgl-xlib/SConscript b/src/gallium/targets/libgl-xlib/SConscript index a81ac793251..b94ef350b16 100644 --- a/src/gallium/targets/libgl-xlib/SConscript +++ b/src/gallium/targets/libgl-xlib/SConscript @@ -33,6 +33,7 @@ env.Prepend(LIBS = [ mesa, glsl, nir, +spirv, gallium, ]) diff --git a/src/gallium/targets/osmesa/SConscript b/src/gallium/targets/osmesa/SConscript index 7be1b48c0b2..ccf7d5170c4 100644 --- a/src/gallium/targets/osmesa/SConscript +++ b/src/gallium/targets/osmesa/SConscript @@ -18,6 +18,7 @@ env.Prepend(LIBS = [ trace, glsl, nir, +spirv, mesautil, softpipe ]) diff --git a/src/mesa/drivers/x11/SConscript b/src/mesa/drivers/x11/SConscript index 59c8df4b3c2..b097dcc5900 100644 --- a/src/mesa/drivers/x11/SConscript +++ b/src/mesa/drivers/x11/SConscript @@ -21,6 +21,8 @@ env.Prepend(LIBS = [ compiler, glsl, mesa, +spirv, +nir, ]) sources = [ -- 2.14.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org
[Mesa-dev] [PATCH 1/3] spirv: Handle location decorations on block interface members
Previously the code was taking any location decoration on the block and using that to calculate the member locations for all of the members. I think this was assuming that there would only be one location decoration for the entire block. According to the Vulkan spec it is possible to add location decorations to individual members: “If the structure type is a Block but without a Location, then each of its members must have a Location decoration. If it is a Block with a Location decoration, then its members are assigned consecutive locations in declaration order, starting from the first member which is initially the Block. Any member with its own Location decoration is assigned that location. Each remaining member is assigned the location after the immediately preceding member in declaration order.” This patch makes it instead keep track of which members have been assigned an explicit location. It also has a space to store the location for the struct as a whole. Once all the decorations have been processed it iterates over each member to fill in the missing locations using the rules described above. --- There are tests for this on the tests branch of VkRunner: https://github.com/Igalia/vkrunner/tree/tests ./src/vkrunner examples/block-*.shader_test src/compiler/spirv/vtn_private.h | 6 src/compiler/spirv/vtn_variables.c | 60 ++ 2 files changed, 54 insertions(+), 12 deletions(-) diff --git a/src/compiler/spirv/vtn_private.h b/src/compiler/spirv/vtn_private.h index 70f660fbd48..c268b093656 100644 --- a/src/compiler/spirv/vtn_private.h +++ b/src/compiler/spirv/vtn_private.h @@ -457,6 +457,12 @@ struct vtn_variable { nir_variable *var; nir_variable **members; + /* If the variable is a struct with a location set on it then this will be +* stored here. This will be used to calculate locations for members that +* don’t have their own explicit location. +*/ + int base_location; + int shared_location; /** diff --git a/src/compiler/spirv/vtn_variables.c b/src/compiler/spirv/vtn_variables.c index b2897407fb1..29e9838d5d2 100644 --- a/src/compiler/spirv/vtn_variables.c +++ b/src/compiler/spirv/vtn_variables.c @@ -1541,18 +1541,14 @@ var_decoration_cb(struct vtn_builder *b, struct vtn_value *val, int member, */ if (dec->decoration == SpvDecorationLocation) { unsigned location = dec->literals[0]; - bool is_vertex_input; if (b->shader->info.stage == MESA_SHADER_FRAGMENT && vtn_var->mode == vtn_variable_mode_output) { - is_vertex_input = false; location += FRAG_RESULT_DATA0; } else if (b->shader->info.stage == MESA_SHADER_VERTEX && vtn_var->mode == vtn_variable_mode_input) { - is_vertex_input = true; location += VERT_ATTRIB_GENERIC0; } else if (vtn_var->mode == vtn_variable_mode_input || vtn_var->mode == vtn_variable_mode_output) { - is_vertex_input = false; location += vtn_var->patch ? VARYING_SLOT_PATCH0 : VARYING_SLOT_VAR0; } else { vtn_warn("Location must be on input or output variable"); @@ -1565,14 +1561,10 @@ var_decoration_cb(struct vtn_builder *b, struct vtn_value *val, int member, } else { /* This handles the structure member case */ assert(vtn_var->members); - unsigned length = -glsl_get_length(glsl_without_array(vtn_var->type->type)); - for (unsigned i = 0; i < length; i++) { -vtn_var->members[i]->data.location = location; -location += - glsl_count_attribute_slots(vtn_var->members[i]->interface_type, - is_vertex_input); - } + if (member == -1) +vtn_var->base_location = location; + else +vtn_var->members[member]->data.location = location; } return; } else { @@ -1756,6 +1748,40 @@ is_per_vertex_inout(const struct vtn_variable *var, gl_shader_stage stage) return false; } +static void +add_missing_member_locations(struct vtn_variable *var, + bool is_vertex_input) +{ + unsigned length = + glsl_get_length(glsl_without_array(var->type->type)); + int location = var->base_location; + + for (unsigned i = 0; i < length; i++) { + /* From the Vulkan spec: + * + * “If the structure type is a Block but without a Location, then each + * of its members must have a Location decoration.” + */ + assert(var->base_location != -1 || + var->members[i]->data.location != -1); + + /* From the Vulkan spec: + * + * “Any member with its own Location decoration is assigned that + * location. Each remaining member is assigned the location after the + * immediately preceding member in declaration order.” + */ + if (var->members[i]->data.location != -1) +
[Mesa-dev] [PATCH 3/3] spirv: Don’t use special semantics when counting vertex attribute size
Under Vulkan, the double vertex attributes take up the same size regardless of whether they are vertex inputs or any other stage interface. --- There is a test for this on the tests branch of VkRunner: https://github.com/Igalia/vkrunner/tree/tests ./src/vkrunner examples/double-vertex-input-block.shader_test src/compiler/spirv/vtn_variables.c | 9 +++-- 1 file changed, 3 insertions(+), 6 deletions(-) diff --git a/src/compiler/spirv/vtn_variables.c b/src/compiler/spirv/vtn_variables.c index 29e9838d5d2..269228d486e 100644 --- a/src/compiler/spirv/vtn_variables.c +++ b/src/compiler/spirv/vtn_variables.c @@ -1749,8 +1749,7 @@ is_per_vertex_inout(const struct vtn_variable *var, gl_shader_stage stage) } static void -add_missing_member_locations(struct vtn_variable *var, - bool is_vertex_input) +add_missing_member_locations(struct vtn_variable *var) { unsigned length = glsl_get_length(glsl_without_array(var->type->type)); @@ -1778,7 +1777,7 @@ add_missing_member_locations(struct vtn_variable *var, location += glsl_count_attribute_slots(var->members[i]->interface_type, -is_vertex_input); +false /* is_gl_vertex_input */); } } @@ -1975,9 +1974,7 @@ vtn_create_variable(struct vtn_builder *b, struct vtn_value *val, if ((var->mode == vtn_variable_mode_input || var->mode == vtn_variable_mode_output) && var->members) { - bool is_vertex_input = (b->shader->info.stage == MESA_SHADER_VERTEX && - var->mode == vtn_variable_mode_input); - add_missing_member_locations(var, is_vertex_input); + add_missing_member_locations(var); } if (var->mode == vtn_variable_mode_image || -- 2.14.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/3] glsl_types: Rename parameter of glsl_count_attribute_slots
glsl_count_attribute_slots takes a parameter to specify whether the type is being used as a vertex input because on GL double attributes only take up one slot. Vulkan doesn’t make this distinction so this patch renames the argument to is_gl_vertex_input in order to make it more clear that it should always be false on Vulkan. --- src/compiler/glsl_types.cpp | 16 ++-- src/compiler/glsl_types.h | 5 - src/compiler/nir_types.cpp | 4 ++-- src/compiler/nir_types.h| 2 +- 4 files changed, 17 insertions(+), 10 deletions(-) diff --git a/src/compiler/glsl_types.cpp b/src/compiler/glsl_types.cpp index 9d853caf721..50aac611981 100644 --- a/src/compiler/glsl_types.cpp +++ b/src/compiler/glsl_types.cpp @@ -1942,7 +1942,7 @@ glsl_type::std430_size(bool row_major) const } unsigned -glsl_type::count_attribute_slots(bool is_vertex_input) const +glsl_type::count_attribute_slots(bool is_gl_vertex_input) const { /* From page 31 (page 37 of the PDF) of the GLSL 1.50 spec: * @@ -1985,7 +1985,7 @@ glsl_type::count_attribute_slots(bool is_vertex_input) const case GLSL_TYPE_DOUBLE: case GLSL_TYPE_UINT64: case GLSL_TYPE_INT64: - if (this->vector_elements > 2 && !is_vertex_input) + if (this->vector_elements > 2 && !is_gl_vertex_input) return this->matrix_columns * 2; else return this->matrix_columns; @@ -1993,14 +1993,18 @@ glsl_type::count_attribute_slots(bool is_vertex_input) const case GLSL_TYPE_INTERFACE: { unsigned size = 0; - for (unsigned i = 0; i < this->length; i++) - size += this->fields.structure[i].type->count_attribute_slots(is_vertex_input); + for (unsigned i = 0; i < this->length; i++) { + const glsl_type *member = this->fields.structure[i].type; + size += member->count_attribute_slots(is_gl_vertex_input); + } return size; } - case GLSL_TYPE_ARRAY: - return this->length * this->fields.array->count_attribute_slots(is_vertex_input); + case GLSL_TYPE_ARRAY: { + const glsl_type *element = this->fields.array; + return this->length * element->count_attribute_slots(is_gl_vertex_input); + } case GLSL_TYPE_SUBROUTINE: return 1; diff --git a/src/compiler/glsl_types.h b/src/compiler/glsl_types.h index 88f987fabee..61eb36ca5c3 100644 --- a/src/compiler/glsl_types.h +++ b/src/compiler/glsl_types.h @@ -366,8 +366,11 @@ public: * * For vertex shader attributes - doubles only take one slot. * For inter-shader varyings - dvec3/dvec4 take two slots. +* +* Vulkan doesn’t make this distinction so the argument should always be +* false. */ - unsigned count_attribute_slots(bool is_vertex_input) const; + unsigned count_attribute_slots(bool is_gl_vertex_input) const; /** * Alignment in bytes of the start of this type in a std140 uniform diff --git a/src/compiler/nir_types.cpp b/src/compiler/nir_types.cpp index 78b66803f08..3969d51c85b 100644 --- a/src/compiler/nir_types.cpp +++ b/src/compiler/nir_types.cpp @@ -119,9 +119,9 @@ glsl_get_aoa_size(const struct glsl_type *type) unsigned glsl_count_attribute_slots(const struct glsl_type *type, - bool is_vertex_input) + bool is_gl_vertex_input) { - return type->count_attribute_slots(is_vertex_input); + return type->count_attribute_slots(is_gl_vertex_input); } const char * diff --git a/src/compiler/nir_types.h b/src/compiler/nir_types.h index 5b441af1486..3b890f88a55 100644 --- a/src/compiler/nir_types.h +++ b/src/compiler/nir_types.h @@ -72,7 +72,7 @@ unsigned glsl_get_length(const struct glsl_type *type); unsigned glsl_get_aoa_size(const struct glsl_type *type); unsigned glsl_count_attribute_slots(const struct glsl_type *type, -bool is_vertex_input); +bool is_gl_vertex_input); const char *glsl_get_struct_elem_name(const struct glsl_type *type, unsigned index); -- 2.14.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2 4/4] spirv: Accept doubles in FaceForward, Reflect and Refract
The SPIR-V spec doesn’t specify a size requirement for these and the equivalent functions in the GLSL spec have explicit alternatives for doubles. Refract is a little bit more complicated due to the fact that the final argument is always supposed to be a scalar 32- or 16- bit float regardless of the other operands. However in practice it seems there is a bug in glslang that makes it convert the argument to 64-bit if you actually try to pass it a 32-bit value while the other arguments are 64-bit. This adds an optional conversion of the final argument in order to support any type. These have been tested against the automatically generated tests of glsl-4.00/execution/built-in-functions using the ARB_gl_spirv branch which tests it with quite a large range of combinations. The issue with glslang has been filed here: https://github.com/KhronosGroup/glslang/issues/1279 v2: Convert the eta operand of Refract from any size in order to make it eventually cope with 16-bit floats. --- src/compiler/spirv/vtn_glsl450.c | 22 ++ 1 file changed, 18 insertions(+), 4 deletions(-) diff --git a/src/compiler/spirv/vtn_glsl450.c b/src/compiler/spirv/vtn_glsl450.c index 50783fbfb4d..0cabedf741d 100644 --- a/src/compiler/spirv/vtn_glsl450.c +++ b/src/compiler/spirv/vtn_glsl450.c @@ -628,14 +628,14 @@ handle_glsl450_alu(struct vtn_builder *b, enum GLSLstd450 entrypoint, case GLSLstd450FaceForward: val->ssa->def = nir_bcsel(nb, nir_flt(nb, nir_fdot(nb, src[2], src[1]), - nir_imm_float(nb, 0.0)), + NIR_IMM_FP(nb, 0.0)), src[0], nir_fneg(nb, src[0])); return; case GLSLstd450Reflect: /* I - 2 * dot(N, I) * N */ val->ssa->def = - nir_fsub(nb, src[0], nir_fmul(nb, nir_imm_float(nb, 2.0), + nir_fsub(nb, src[0], nir_fmul(nb, NIR_IMM_FP(nb, 2.0), nir_fmul(nb, nir_fdot(nb, src[0], src[1]), src[1]))); return; @@ -645,8 +645,22 @@ handle_glsl450_alu(struct vtn_builder *b, enum GLSLstd450 entrypoint, nir_ssa_def *N = src[1]; nir_ssa_def *eta = src[2]; nir_ssa_def *n_dot_i = nir_fdot(nb, N, I); - nir_ssa_def *one = nir_imm_float(nb, 1.0); - nir_ssa_def *zero = nir_imm_float(nb, 0.0); + nir_ssa_def *one = NIR_IMM_FP(nb, 1.0); + nir_ssa_def *zero = NIR_IMM_FP(nb, 0.0); + /* According to the SPIR-V and GLSL specs, eta is always a float + * regardless of the type of the other operands. However in practice it + * seems that if you try to pass it a float then glslang will just + * promote it to a double and generate invalid SPIR-V. In order to + * support a hypothetical fixed version of glslang we’ll promote eta to + * double if the other operands are double also. + */ + if (I->bit_size != eta->bit_size) { + nir_op conversion_op = +nir_type_conversion_op(nir_type_float | eta->bit_size, + nir_type_float | I->bit_size, + nir_rounding_mode_undef); + eta = nir_build_alu(nb, conversion_op, eta, NULL, NULL, NULL); + } /* k = 1.0 - eta * eta * (1.0 - dot(N, I) * dot(N, I)) */ nir_ssa_def *k = nir_fsub(nb, one, nir_fmul(nb, eta, nir_fmul(nb, eta, -- 2.14.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2 3/4] spirv: Add a 64-bit implementation of OpIsInf
The only change neccessary is to change the type of the constant used to compare against. This has been tested against the arb_gpu_shader_fp64/execution/ fs-isinf-dvec tests using the ARB_gl_spirv branch. v2: Use nir_imm_floatN_t for the constant. --- src/compiler/spirv/vtn_alu.c | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/src/compiler/spirv/vtn_alu.c b/src/compiler/spirv/vtn_alu.c index 01be397e271..3226e5b8739 100644 --- a/src/compiler/spirv/vtn_alu.c +++ b/src/compiler/spirv/vtn_alu.c @@ -563,10 +563,11 @@ vtn_handle_alu(struct vtn_builder *b, SpvOp opcode, val->ssa->def = nir_fne(>nb, src[0], src[0]); break; - case SpvOpIsInf: - val->ssa->def = nir_ieq(>nb, nir_fabs(>nb, src[0]), - nir_imm_float(>nb, INFINITY)); + case SpvOpIsInf: { + nir_ssa_def *inf = nir_imm_floatN_t(>nb, INFINITY, src[0]->bit_size); + val->ssa->def = nir_ieq(>nb, nir_fabs(>nb, src[0]), inf); break; + } case SpvOpFUnordEqual: case SpvOpFUnordNotEqual: -- 2.14.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/4] spirv: Use nir_imm_floatN_t for constants for GLSL450 builtins
There is an existing macro that is used to choose between either a float or a double immediate constant based on the bit size of the first operand to the builtin. This is now changed to use the new nir_imm_floatN_t helper function to reduce the number of places that make this decision. --- src/compiler/spirv/vtn_glsl450.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/compiler/spirv/vtn_glsl450.c b/src/compiler/spirv/vtn_glsl450.c index 7d32914d516..50783fbfb4d 100644 --- a/src/compiler/spirv/vtn_glsl450.c +++ b/src/compiler/spirv/vtn_glsl450.c @@ -513,7 +513,7 @@ vtn_nir_alu_op_for_spirv_glsl_opcode(struct vtn_builder *b, } } -#define NIR_IMM_FP(n, v) (src[0]->bit_size == 64 ? nir_imm_double(n, v) : nir_imm_float(n, v)) +#define NIR_IMM_FP(n, v) (nir_imm_floatN_t(n, v, src[0]->bit_size)) static void handle_glsl450_alu(struct vtn_builder *b, enum GLSLstd450 entrypoint, -- 2.14.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/4] nir/builder: Add a nir_imm_floatN_t helper
This lets you easily build float immediates just given the bit size. If we have this single place here to handle this then it will be easier to add support for 16-bit floats later. --- src/compiler/nir/nir_builder.h | 13 + 1 file changed, 13 insertions(+) diff --git a/src/compiler/nir/nir_builder.h b/src/compiler/nir/nir_builder.h index 36e0ae3ac63..32f86249ad3 100644 --- a/src/compiler/nir/nir_builder.h +++ b/src/compiler/nir/nir_builder.h @@ -227,6 +227,19 @@ nir_imm_double(nir_builder *build, double x) return nir_build_imm(build, 1, 64, v); } +static inline nir_ssa_def * +nir_imm_floatN_t(nir_builder *build, double x, unsigned bit_size) +{ + switch (bit_size) { + case 32: + return nir_imm_float(build, x); + case 64: + return nir_imm_double(build, x); + } + + unreachable("unknown float immediate bit size"); +} + static inline nir_ssa_def * nir_imm_vec4(nir_builder *build, float x, float y, float z, float w) { -- 2.14.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 0/4] spirv: Support doubles in some builtin functions
This adds support for doubles in some of the builtin functions. The last two patches have been posted already and are a v2 based on Jason’s feedback. These patches come out of testing using the ARB_gl_spirv branch of Mesa and Piglit. However they also affect Vulkan and can be tested with VkRunner using the test branch here: https://github.com/Igalia/vkrunner/tree/tests The corresponding tests can be run with: ./src/vkrunner \ examples/{face-forward,reflect,refract,isinf}-double.shader_test \ examples/refract-double-exp32.shader_test Neil Roberts (4): nir/builder: Add a nir_imm_floatN_t helper spirv: Use nir_imm_floatN_t for constants for GLSL450 builtins spirv: Add a 64-bit implementation of OpIsInf spirv: Accept doubles in FaceForward, Reflect and Refract src/compiler/nir/nir_builder.h | 13 + src/compiler/spirv/vtn_alu.c | 7 --- src/compiler/spirv/vtn_glsl450.c | 24 +++- 3 files changed, 36 insertions(+), 8 deletions(-) -- 2.14.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] spirv: Handle doubles when multiplying a mat by a scalar
The code to handle mat multipication by a scalar tries to pick either imul or fmul depending on whether the matrix is float or integer. However it was doing this by checking whether the base type is float. This was making it choose the int path for doubles (and presumably float16s). --- This was discovered from running the arb_gpu_shader_fp64 tests through the GL_ARB_gl_spirv branch. For example: arb_gpu_shader_fp64@execution@built-in-functions@vs-op-div-dmat4-double. Note however these tests are hidden behind a glslang bug: https://github.com/KhronosGroup/glslang/issues/1278 The fix in that issue along with this patch makes them all pass. There’s also a little test case for this with VkRunner here: https://github.com/Igalia/vkrunner/blob/tests/examples/dmat-mul-scalar.shader_test src/compiler/spirv/vtn_alu.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/src/compiler/spirv/vtn_alu.c b/src/compiler/spirv/vtn_alu.c index d0c9e316935..110fcec2a60 100644 --- a/src/compiler/spirv/vtn_alu.c +++ b/src/compiler/spirv/vtn_alu.c @@ -142,10 +142,10 @@ mat_times_scalar(struct vtn_builder *b, { struct vtn_ssa_value *dest = vtn_create_ssa_value(b, mat->type); for (unsigned i = 0; i < glsl_get_matrix_columns(mat->type); i++) { - if (glsl_get_base_type(mat->type) == GLSL_TYPE_FLOAT) - dest->elems[i]->def = nir_fmul(>nb, mat->elems[i]->def, scalar); - else + if (glsl_base_type_is_integer(glsl_get_base_type(mat->type))) dest->elems[i]->def = nir_imul(>nb, mat->elems[i]->def, scalar); + else + dest->elems[i]->def = nir_fmul(>nb, mat->elems[i]->def, scalar); } return dest; -- 2.14.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] Vulkan version of shader_runner
Hi, I’ve made a start on a Vulkan version of Piglit’s shader_runner. I mostly just wanted it as a quick way to compare problems on Vulkan and GL_ARB_gl_spirv but maybe one day it could become part of a Vulkan testing suite. I just thought I’d announce it here in case anyone else finds it useful too. https://github.com/igalia/vkrunner - Neil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/2] spirv: Add a 64-bit implementation of OpIsInf
The only change neccessary is to change the type of the constant used to compare against. This has been tested against the arb_gpu_shader_fp64/execution/ fs-isinf-dvec tests using the ARB_gl_spirv branch. --- src/compiler/spirv/vtn_alu.c | 11 --- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/src/compiler/spirv/vtn_alu.c b/src/compiler/spirv/vtn_alu.c index d0c9e316935..2b03a2e4d09 100644 --- a/src/compiler/spirv/vtn_alu.c +++ b/src/compiler/spirv/vtn_alu.c @@ -529,10 +529,15 @@ vtn_handle_alu(struct vtn_builder *b, SpvOp opcode, val->ssa->def = nir_fne(>nb, src[0], src[0]); break; - case SpvOpIsInf: - val->ssa->def = nir_ieq(>nb, nir_fabs(>nb, src[0]), - nir_imm_float(>nb, INFINITY)); + case SpvOpIsInf: { + nir_ssa_def *inf; + if (src[0]->bit_size == 64) + inf = nir_imm_double(>nb, INFINITY); + else + inf = nir_imm_float(>nb, INFINITY); + val->ssa->def = nir_ieq(>nb, nir_fabs(>nb, src[0]), inf); break; + } case SpvOpFUnordEqual: case SpvOpFUnordNotEqual: -- 2.14.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/2] spirv: Add a 64-bit implementation of Frexp
The implementation is inspired by lower_instructions_visitor::dfrexp_sig_to_arith. This has been tested against the arb_gpu_shader_fp64/fs-frexp-dvec4 test using the ARB_gl_spirv branch. --- Please also see this related patch which I probably should have bundled in this series: https://patchwork.freedesktop.org/patch/208702/ src/compiler/spirv/vtn_glsl450.c | 60 +--- 1 file changed, 56 insertions(+), 4 deletions(-) diff --git a/src/compiler/spirv/vtn_glsl450.c b/src/compiler/spirv/vtn_glsl450.c index 46ef40f5e3f..51c4cd271bc 100644 --- a/src/compiler/spirv/vtn_glsl450.c +++ b/src/compiler/spirv/vtn_glsl450.c @@ -380,7 +380,7 @@ build_atan2(nir_builder *b, nir_ssa_def *y, nir_ssa_def *x) } static nir_ssa_def * -build_frexp(nir_builder *b, nir_ssa_def *x, nir_ssa_def **exponent) +build_frexp32(nir_builder *b, nir_ssa_def *x, nir_ssa_def **exponent) { nir_ssa_def *abs_x = nir_fabs(b, x); nir_ssa_def *zero = nir_imm_float(b, 0.0f); @@ -412,6 +412,51 @@ build_frexp(nir_builder *b, nir_ssa_def *x, nir_ssa_def **exponent) nir_bcsel(b, is_not_zero, exponent_value, zero)); } +static nir_ssa_def * +build_frexp64(nir_builder *b, nir_ssa_def *x, nir_ssa_def **exponent) +{ + nir_ssa_def *abs_x = nir_fabs(b, x); + nir_ssa_def *zero = nir_imm_double(b, 0.0); + nir_ssa_def *zero32 = nir_imm_float(b, 0.0f); + + /* Double-precision floating-point values are stored as +* 1 sign bit; +* 11 exponent bits; +* 52 mantissa bits. +* +* We only need to deal with the exponent so first we extract the upper 32 +* bits using nir_unpack_64_2x32_split_y. +*/ + nir_ssa_def *upper_x = nir_unpack_64_2x32_split_y(b, x); + nir_ssa_def *abs_upper_x = nir_unpack_64_2x32_split_y(b, abs_x); + + /* An exponent shift of 20 will shift the remaining mantissa bits out, +* leaving only the exponent and sign bit (which itself may be zero, if the +* absolute value was taken before the bitcast and shift. +*/ + nir_ssa_def *exponent_shift = nir_imm_int(b, 20); + nir_ssa_def *exponent_bias = nir_imm_int(b, -1022); + + nir_ssa_def *sign_mantissa_mask = nir_imm_int(b, 0x800fu); + + /* Exponent of floating-point values in the range [0.5, 1.0). */ + nir_ssa_def *exponent_value = nir_imm_int(b, 0x3fe0u); + + nir_ssa_def *is_not_zero = nir_fne(b, abs_x, zero); + + *exponent = + nir_iadd(b, nir_ushr(b, abs_upper_x, exponent_shift), + nir_bcsel(b, is_not_zero, exponent_bias, zero32)); + + nir_ssa_def *new_upper = + nir_ior(b, nir_iand(b, upper_x, sign_mantissa_mask), + nir_bcsel(b, is_not_zero, exponent_value, zero32)); + + return nir_pack_64_2x32_split(b, + nir_unpack_64_2x32_split_x(b, x), + new_upper); +} + static nir_op vtn_nir_alu_op_for_spirv_glsl_opcode(struct vtn_builder *b, enum GLSLstd450 opcode) @@ -685,15 +730,22 @@ handle_glsl450_alu(struct vtn_builder *b, enum GLSLstd450 entrypoint, case GLSLstd450Frexp: { nir_ssa_def *exponent; - val->ssa->def = build_frexp(nb, src[0], ); + if (src[0]->bit_size == 64) + val->ssa->def = build_frexp64(nb, src[0], ); + else + val->ssa->def = build_frexp32(nb, src[0], ); nir_store_deref_var(nb, vtn_nir_deref(b, w[6]), exponent, 0xf); return; } case GLSLstd450FrexpStruct: { vtn_assert(glsl_type_is_struct(val->ssa->type)); - val->ssa->elems[0]->def = build_frexp(nb, src[0], ->ssa->elems[1]->def); + if (src[0]->bit_size == 64) + val->ssa->elems[0]->def = build_frexp64(nb, src[0], + >ssa->elems[1]->def); + else + val->ssa->elems[0]->def = build_frexp32(nb, src[0], + >ssa->elems[1]->def); return; } -- 2.14.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] spirv: Accept doubles in FaceForward, Reflect and Refract
The SPIR-V spec doesn’t specify a size requirement for these and the equivalent functions in the GLSL spec have explicit alternatives for doubles. Refract is a little bit more complicated due to the fact that the final argument is always supposed to be a scalar 32-bit float regardless of the other operands. However in practice it seems there is a bug in glslang that makes it convert the argument to 64-bit if you actually try to pass it a 32-bit value while the other arguments are 32-bit. This adds an optional conversion of the final argument in order to support either type. These have been tested against the automatically generated tests of glsl-4.00/execution/built-in-functions using the ARB_gl_spirv branch which tests it with quite a large range of combinations. The issue with glslang has been filed here: https://github.com/KhronosGroup/glslang/issues/1279 --- src/compiler/spirv/vtn_glsl450.c | 17 + 1 file changed, 13 insertions(+), 4 deletions(-) diff --git a/src/compiler/spirv/vtn_glsl450.c b/src/compiler/spirv/vtn_glsl450.c index 46ef40f5e3f..b098f082f3e 100644 --- a/src/compiler/spirv/vtn_glsl450.c +++ b/src/compiler/spirv/vtn_glsl450.c @@ -583,14 +583,14 @@ handle_glsl450_alu(struct vtn_builder *b, enum GLSLstd450 entrypoint, case GLSLstd450FaceForward: val->ssa->def = nir_bcsel(nb, nir_flt(nb, nir_fdot(nb, src[2], src[1]), - nir_imm_float(nb, 0.0)), + NIR_IMM_FP(nb, 0.0)), src[0], nir_fneg(nb, src[0])); return; case GLSLstd450Reflect: /* I - 2 * dot(N, I) * N */ val->ssa->def = - nir_fsub(nb, src[0], nir_fmul(nb, nir_imm_float(nb, 2.0), + nir_fsub(nb, src[0], nir_fmul(nb, NIR_IMM_FP(nb, 2.0), nir_fmul(nb, nir_fdot(nb, src[0], src[1]), src[1]))); return; @@ -600,8 +600,17 @@ handle_glsl450_alu(struct vtn_builder *b, enum GLSLstd450 entrypoint, nir_ssa_def *N = src[1]; nir_ssa_def *eta = src[2]; nir_ssa_def *n_dot_i = nir_fdot(nb, N, I); - nir_ssa_def *one = nir_imm_float(nb, 1.0); - nir_ssa_def *zero = nir_imm_float(nb, 0.0); + nir_ssa_def *one = NIR_IMM_FP(nb, 1.0); + nir_ssa_def *zero = NIR_IMM_FP(nb, 0.0); + /* According to the SPIR-V and GLSL specs, eta is always a float + * regardless of the type of the other operands. However in practice it + * seems that if you try to pass it a float then glslang will just + * promote it to a double and generate invalid SPIR-V. In order to + * support a hypothetical fixed version of glslang we’ll promote eta to + * double if the other operands are double also. + */ + if (I->bit_size > eta->bit_size) + eta = nir_build_alu(nb, nir_op_f2f64, eta, NULL, NULL, NULL); /* k = 1.0 - eta * eta * (1.0 - dot(N, I) * dot(N, I)) */ nir_ssa_def *k = nir_fsub(nb, one, nir_fmul(nb, eta, nir_fmul(nb, eta, -- 2.14.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] spirv/nir: Fix the stream ID when emitting a primitive or vertex
According to the SPIR-V spec: “Stream must be an of a constant instruction with a scalar integer type. That constant is the output-primitive stream number.” The previous code was treating it as an integer literal. --- This is part of the GL SPIR-V branch to enable streams for transform feedback but seeing as it is a standalone fix for existing code I thought it might be worth posting seperately. src/compiler/spirv/spirv_to_nir.c | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/src/compiler/spirv/spirv_to_nir.c b/src/compiler/spirv/spirv_to_nir.c index c6df764682e..c71ed51e4cf 100644 --- a/src/compiler/spirv/spirv_to_nir.c +++ b/src/compiler/spirv/spirv_to_nir.c @@ -2983,8 +2983,12 @@ vtn_handle_barrier(struct vtn_builder *b, SpvOp opcode, nir_intrinsic_instr *intrin = nir_intrinsic_instr_create(b->shader, intrinsic_op); - if (opcode == SpvOpEmitStreamVertex || opcode == SpvOpEndStreamPrimitive) - nir_intrinsic_set_stream_id(intrin, w[1]); + if (opcode == SpvOpEmitStreamVertex || opcode == SpvOpEndStreamPrimitive) { + struct vtn_value *stream_value = + vtn_value(b, w[1], vtn_value_type_constant); + unsigned stream = stream_value->constant->values[0].u32[0]; + nir_intrinsic_set_stream_id(intrin, stream); + } nir_builder_instr_insert(>nb, >instr); } -- 2.14.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2] i965/meta-util: Convert the clear color through the surf format
When programming the fast clear color there was previously a chunk of code to try to make the color match the constraints of the surface format such as by filling in missing components and handling luminance formats. These cases are not handled by the hardware. There are some additional possible restrictions that the hardware does seem to handle, such as clamping to [0,1] for normalised formats. However for whatever reason it doesn't clamp to [0,∞] for the special float formats that don't have a sign bit. Rather than having special cases for all of these this patch makes it instead convert the color to the actual surface format and back again so that we can be sure it will have all of the possible restrictions. Additionally this would avoid some other potential surprises such as getting more precision for the clear color when fast clears are used. Originally this patch was created to fix the following bug, but it is no longer neccessary since f1fa4be871e13c68b50685aaf64. However I think this approach is still worthwhile because it has the advantage of reducing code redundancy. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93338 v2: Rebase and add support for converting int formats --- This is a rebase of this patch from 2 years ago: https://patchwork.freedesktop.org/patch/67912/ src/mesa/drivers/dri/i965/brw_meta_util.c | 107 +++--- 1 file changed, 37 insertions(+), 70 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_meta_util.c b/src/mesa/drivers/dri/i965/brw_meta_util.c index 54dc6a5..ec727eb 100644 --- a/src/mesa/drivers/dri/i965/brw_meta_util.c +++ b/src/mesa/drivers/dri/i965/brw_meta_util.c @@ -29,6 +29,8 @@ #include "main/blend.h" #include "main/fbobject.h" #include "util/format_srgb.h" +#include "main/format_pack.h" +#include "main/format_unpack.h" /** * Helper function for handling mirror image blits. @@ -345,6 +347,7 @@ brw_meta_convert_fast_clear_color(const struct brw_context *brw, const struct intel_mipmap_tree *mt, const union gl_color_union *color) { + mesa_format linear_format = _mesa_get_srgb_format_linear(mt->format); union isl_color_value override_color = { .u32 = { color->ui[0], @@ -354,82 +357,46 @@ brw_meta_convert_fast_clear_color(const struct brw_context *brw, }, }; - /* The sampler doesn't look at the format of the surface when the fast -* clear color is used so we need to implement luminance, intensity and -* missing components manually. -*/ - switch (_mesa_get_format_base_format(mt->format)) { - case GL_INTENSITY: - override_color.u32[3] = override_color.u32[0]; - /* flow through */ - case GL_LUMINANCE: - case GL_LUMINANCE_ALPHA: - override_color.u32[1] = override_color.u32[0]; - override_color.u32[2] = override_color.u32[0]; - break; - default: - for (int i = 0; i < 3; i++) { - if (!_mesa_format_has_color_component(mt->format, i)) -override_color.u32[i] = 0; - } - break; - } - - switch (_mesa_get_format_datatype(mt->format)) { - case GL_UNSIGNED_NORMALIZED: - for (int i = 0; i < 4; i++) - override_color.f32[i] = CLAMP(override_color.f32[i], 0.0f, 1.0f); - break; - - case GL_SIGNED_NORMALIZED: - for (int i = 0; i < 4; i++) - override_color.f32[i] = CLAMP(override_color.f32[i], -1.0f, 1.0f); - break; - - case GL_UNSIGNED_INT: - for (int i = 0; i < 4; i++) { - unsigned bits = _mesa_get_format_bits(mt->format, GL_RED_BITS + i); - if (bits < 32) { -uint32_t max = (1u << bits) - 1; -override_color.u32[i] = MIN2(override_color.u32[i], max); - } - } - break; - - case GL_INT: - for (int i = 0; i < 4; i++) { - unsigned bits = _mesa_get_format_bits(mt->format, GL_RED_BITS + i); - if (bits < 32) { -int32_t max = (1 << (bits - 1)) - 1; -int32_t min = -(1 << (bits - 1)); -override_color.i32[i] = CLAMP(override_color.i32[i], min, max); - } - } - break; - - case GL_FLOAT: - if (!_mesa_is_format_signed(mt->format)) { - for (int i = 0; i < 4; i++) -override_color.f32[i] = MAX2(override_color.f32[i], 0.0f); - } - break; - } - - if (!_mesa_format_has_color_component(mt->format, 3)) { - if (_mesa_is_format_integer_color(mt->format)) - override_color.u32[3] = 1; - else - override_color.f32[3] = 1.0f; - } - /* Handle linear to SRGB conversion */ - if (brw->ctx.Color.sRGBEnabled && - _mesa_get_srgb_format_linear(mt->format) != mt->format) { + if (brw->ctx.Color.sRGBEnabled && linear_format != mt->format) { for (int i = 0; i < 3; i++) { override_color.f32[i] = util_format_linear_to_srgb_float(override_color.f32[i]); } } + union gl_color_union
[Mesa-dev] [PATCH] mesa: Tidy up the 4.6 section of GL4x.xml
The enums are moved to the top and indented like the rest of the file. Comments are added to split up the function aliases by corresponding extension. This should make no functional difference. --- This was requested by Ian Romanick as part of the review of the patch to add the aliases for GL_ARB_indirect_parameters. src/mapi/glapi/gen/GL4x.xml | 15 ++- 1 file changed, 10 insertions(+), 5 deletions(-) diff --git a/src/mapi/glapi/gen/GL4x.xml b/src/mapi/glapi/gen/GL4x.xml index 4b2703c..cd2e3b8 100644 --- a/src/mapi/glapi/gen/GL4x.xml +++ b/src/mapi/glapi/gen/GL4x.xml @@ -67,15 +67,21 @@ + + + + + + + + - - - + @@ -85,8 +91,7 @@ - - + -- 2.9.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2] mesa: Add GL4.6 aliases of functions from GL_ARB_indirect_parameters
--- This is just a rebase of the patch with a minor conflict resolution. src/mapi/glapi/gen/GL4x.xml | 22 ++ 1 file changed, 22 insertions(+) diff --git a/src/mapi/glapi/gen/GL4x.xml b/src/mapi/glapi/gen/GL4x.xml index 0a80941..4b2703c 100644 --- a/src/mapi/glapi/gen/GL4x.xml +++ b/src/mapi/glapi/gen/GL4x.xml @@ -84,6 +84,28 @@ + + + + + + + + + + + + + + + + + + + + -- 2.9.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 0/9] Fix shader_draw_parameters CTS tests
Kenneth Graunkewrites: > Neil, in case you were wondering, I suggested the above organization > of vertex elements because it would let us only upload 1 in the common > case. Looking in shader-db, there are 3579 shaders that use > gl_InstanceID, 186 shaders that use gl_VertexID, and 0 shaders that > use gl_BaseVertex, gl_BaseInstance, or gl_DrawID. > > It looks like your patches kicked gl_BaseVertex off to VE2 instead of > gl_BaseInstance. I suppose that works too, I just figured that keeping > VertexID/FirstVertex/BaseVertex together would make sense. If it's > more convenient to move gl_BaseVertex, I suppose I'm fine with that > too... Yes, Antia forwarded me your emails with the previous suggestion. The order we came up with for this patch series is: VE 1: VE 2: The “offset for vertex ID” corresponds to what we were previously (incorrectly) sending for gl_BaseVertex, so effectively the values in VE1 have not changed at all. This also means we can continue to directly take these two values from the indirect draw params buffer. I believe your previous suggestion ended up needing a register store to put the values in the right place so with this ordering we get to avoid that. VE1 now contains everything needed for the most common values VertexID and InstanceID. VE2 is only needed for the rare cases that use gl_DrawID or gl_BaseVertex. On Vulkan VE2 won’t even be needed for the BaseVertex builtin because the semantics are different and in that case it corresponds to “offset for vertex ID”. I haven’t looked into too much detail about reordering the patches yet, but it does sound reasonable to try and avoid intermediate breakages. Thanks for looking at the series. Regards, - Neil signature.asc Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/9] compiler: Add new system value SYSTEM_VALUE_BASE_VERTEX_ID
Kenneth Graunkewrites: > We have a number of similar names now: > >SYSTEM_VALUE_BASE_VERTEX >SYSTEM_VALUE_BASE_VERTEX_ID >SYSTEM_VALUE_VERTEX_ID >SYSTEM_VALUE_VERTEX_ID_ZERO_BASE > > BASE_VERTEX and BASE_VERTEX_ID are really similar names, and honestly > either one seems like it could be the name for gl_BaseVertex. I'm > afraid it would be easy to mix them up by mistake. IMHO, it would be > nice to pick a different word, just to keep some distinction between > the two fairly related concepts... > > Perhaps SYSTEM_VALUE_FIRST_VERTEX...? That's only half the meaning, > but it at least uses a different word, and makes you think "do I want > BASE_VERTEX or FIRST_VERTEX?" Yes, naming this thing is really difficult. I’m not sure if you noticed, but for Vulkan, the BaseVertex builtin should actually have the value of BASE_VERTEX_ID unlike GL. So if we rename BASE_VERTEX to something without “base vertex” in it then it will still be confusing for Vulkan. So effectively the descriptive names are like: SYSTEM_VALUE_BASE_VERTEX_ON_GL_BUT_NOT_VULKAN SYSTEM_VALUE_BASE_VERTEX_ON_VULKAN_OR_OFFSET_FOR_VERTEX_ID_ON_GL I’m not sure whether that’s enough of an argument against FIRST_VERTEX though, so personally I don’t really mind either way. Antia, what do you think? Regards, - Neil signature.asc Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] mesa: Add GL4.6 aliases of functions from GL_ARB_indirect_parameters
--- src/mapi/glapi/gen/GL4x.xml | 22 ++ 1 file changed, 22 insertions(+) diff --git a/src/mapi/glapi/gen/GL4x.xml b/src/mapi/glapi/gen/GL4x.xml index 88dba5c..ea28d8e 100644 --- a/src/mapi/glapi/gen/GL4x.xml +++ b/src/mapi/glapi/gen/GL4x.xml @@ -73,6 +73,28 @@ + + + + + + + + + + + + + + + + + + + + -- 2.9.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/2] st/glsl_to_tgsi: Add support for SYSTEM_VALUE_BASE_VERTEX_ID
SYSTEM_VALUE_BASE_VERTEX has changed to be the correct value for gl_BaseVertex, which means it will be zero when used with a non-indexed call. The new BASE_VERTEX_ID value can be used as before as an offset to calculate a value for gl_VertexID. These values should be different, but this patch just makes them same for now in order to at least retain the previous behaviour and not break gl_BaseVertexID and gl_VertexID entirely on radeonsi. Note, this hasn’t been tested apart from to verify that it compiles. --- src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 5 + 1 file changed, 5 insertions(+) diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp index 0772b73..3dfed19 100644 --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp @@ -5385,6 +5385,11 @@ _mesa_sysval_to_semantic(unsigned sysval) case SYSTEM_VALUE_VERTEX_ID_ZERO_BASE: return TGSI_SEMANTIC_VERTEXID_NOBASE; case SYSTEM_VALUE_BASE_VERTEX: + case SYSTEM_VALUE_BASE_VERTEX_ID: + /* FIXME: These two values are actually supposed to be different. The + * one used for gl_BaseVertex is supposed to be zero when a non-indexed + * draw call is used. + */ return TGSI_SEMANTIC_BASEVERTEX; case SYSTEM_VALUE_BASE_INSTANCE: return TGSI_SEMANTIC_BASEINSTANCE; -- 2.9.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/2] freedreno: Update to handle rename of the base vertex ID intrinsic
The old intrinsic called base_vertex that is used to add to gl_VertexID is now called base_vertex_id so that base_vertex can be used for the value of gl_BaseVertex, which is different. As far as I can tell freedreno doesn’t support GL_ARB_shader_draw_parameters so it won’t need any changes to generate the new base_vertex intrinsic. I haven’t tested this at all apart from to verify that it compiles. Cc: Rob Clark--- src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c b/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c index da4aeaa..e6fbf45 100644 --- a/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c +++ b/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c @@ -2071,10 +2071,10 @@ emit_intrinsic(struct ir3_context *ctx, nir_intrinsic_instr *intr) ctx->ir->outputs[n] = src[i]; } break; - case nir_intrinsic_load_base_vertex: + case nir_intrinsic_load_base_vertex_id: if (!ctx->basevertex) { ctx->basevertex = create_driver_param(ctx, IR3_DP_VTXID_BASE); - add_sysval_input(ctx, SYSTEM_VALUE_BASE_VERTEX, + add_sysval_input(ctx, SYSTEM_VALUE_BASE_VERTEX_ID, ctx->basevertex); } dst[0] = ctx->basevertex; -- 2.9.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 0/2] Fix radeonsi and freedreno for the BaseVertex patches
Here are two bonus patches to address the problems mentioned that it might break radeonsi and freedreno. The radeonsi patch doesn’t solve the problem that the value for gl_BaseVertex is presumably wrong, but it at least should stop it from breaking gl_VertexID entirely. I haven’t been able to test either patch so this is more of a request for comments. Neil Roberts (2): freedreno: Update to handle rename of the base vertex ID intrinsic st/glsl_to_tgsi: Add support for SYSTEM_VALUE_BASE_VERTEX_ID src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c | 4 ++-- src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 5 + 2 files changed, 7 insertions(+), 2 deletions(-) -- 2.9.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 0/9] Fix shader_draw_parameters CTS tests
Just to note, I made some Piglit patches to verify this new behaviour too: https://patchwork.freedesktop.org/series/33626/ This might make testing a little easier if someone wants to implement the behaviour for other drivers too. - Neil Antia Puentes <apuen...@igalia.com> writes: > Hi, > > the series sets gl_BaseVertex to zero for gl*DrawArrays* in i965. > Previously, its value was the value passed as in the non-indexed > draw call. This was convinient because the gl_VertexID was calculated as > gl_VertexIDBaseZero + gl_BaseVertex. > > However, as gl_BaseVertex must be zero for non-indexed draw calls, this > calculation is broken. > > Apart from setting the basevertex to zero in i965, the following patches add > a new VS system value which will contain or as needed > to make the gl_VertexID calculation right. > > > The relevant bits of the OpenGL 4.6 specification involved here are: > > * 11.1.3.9 Shader Inputs > > "gl_BaseVertex holds the integer value passed to the baseVertex parameter to > the > command that resulted in the current shader invocation. In the case where the > command has no baseVertex parameter, the value of gl_BaseVertex is zero." > > "gl_VertexID holds the integer index i implicitly passed by DrawArrays or > one of the other drawing commands defined in section 10.4." > > * 10.4. Drawing Commands Using Vertex Arrays: > > - Page 352: > "The index of any element transferred to the GL by DrawArraysOneInstance is > referred to as its vertex ID, and may be read by a vertex shader as > gl_VertexID. > The vertex ID of the ith element transferred is first + i." > > - Page 355: > "The index of any element transferred to the GL by DrawElementsOneInstance is > referred to as its vertex ID, and may be read by a vertex shader as > gl_VertexID. > The vertex ID of the ith element transferred is the sum of basevertex and the > value stored in the currently bound element array buffer at offset indices + > i." > > Regards, > Antia. > > Antia Puentes (5): > compiler: Add new system value SYSTEM_VALUE_BASE_VERTEX_ID > nir: Add SYSTEM_VALUE_BASE_VERTEX_ID instrinsics > i965: emit basevertexid as a vertex element component > i965: gl_BaseVertex must be zero for non-indexed draw calls > intel/compiler: implement the basevertex(id) load intrinsics > > Neil Roberts (4): > intel/compiler: Add a uses_basevertexid flag > nir: Offset vertex_id by base_vertex_id instead of base_vertex > i965: Let nir lower gl_VertexID instead of the linker > spirv: Lower BaseVertex to BASE_VERTEX_ID instead of BASE_VERTEX > > src/compiler/nir/nir.c| 4 +++ > src/compiler/nir/nir_gather_info.c| 1 + > src/compiler/nir/nir_intrinsics.h | 1 + > src/compiler/nir/nir_lower_system_values.c| 2 +- > src/compiler/shader_enums.c | 1 + > src/compiler/shader_enums.h | 15 +++ > src/compiler/spirv/vtn_variables.c| 2 +- > src/intel/compiler/brw_compiler.h | 1 + > src/intel/compiler/brw_nir.c | 12 ++--- > src/intel/compiler/brw_vec4.cpp | 18 - > src/intel/vulkan/genX_cmd_buffer.c| 8 +++--- > src/intel/vulkan/genX_pipeline.c | 4 +-- > src/mesa/drivers/dri/i965/brw_context.c | 3 ++- > src/mesa/drivers/dri/i965/brw_context.h | 36 ++--- > src/mesa/drivers/dri/i965/brw_draw.c | 33 --- > src/mesa/drivers/dri/i965/brw_draw_upload.c | 13 - > src/mesa/drivers/dri/i965/genX_state_upload.c | 39 > +++ > 17 files changed, 132 insertions(+), 61 deletions(-) > > -- > 2.14.1 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev signature.asc Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 3/3] mesa: rework how we free gl_shader_program_data
This seems like a nice solution to me. I just have a small comment below. > -static struct gl_shader_program_data * > -create_shader_program_data() > +struct gl_shader_program_data * > +_mesa_create_shader_program_data() > { > struct gl_shader_program_data *data; > data = rzalloc(NULL, struct gl_shader_program_data); > if (data) >data->RefCount = 1; > > + data->InfoLog = ralloc_strdup(data, ""); > + The addition of this line makes the “if (data)” above look odd because if the rzalloc fails it’s now going to just crash immediately anyway. I’m not sure what the policy is supposed to be but a quick grep finds other places in the compiler that seem to assume that ralloc can not fail, so it might be nice to just remove the if. Reviewed-by: Neil Roberts <nrobe...@igalia.com> Regards, - Neil signature.asc Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] mesa/shaderapi: Do a dry run of linking an in-use program
Timothy Arceriwrites: >> You’re right, I think that would be a better way to handle it. I guess >> if this was done then you don’t really need the second link. There are >> several pointers for the uniform state that you would need to keep and I >> think there is more state than just the uniforms as well. Perhaps pretty >> much everything that is freed in _mesa_clear_shader_program_data should >> be kept? > > You only need to keep things that are accessed by the backend post > linking (such as the uniforms), things that are queried via the api can > be trashed as per the spec. I'm pretty sure we don't need most of > those. Right, I guess we could keep a small selection of the pointers that are cleared by _mesa_clear_shader_program_data. However it might be messy to maintain as it’s likely that someone could add new members to gl_shader_program_data and forget to update this function. Already just preserving the neccessary uniform state is a bit fiddly because the allocation we want to maintain is owned by UniformStorage but it is accessed via UniformDataSlots in the i965 driver. I’m not exactly sure how this works in Gallium. Just to double check, I made a little Piglit test to check using atomic counters and sure enough it gets a similar Valgrind error and sporadic failures due to accessing shProg->data->AtomicBuffers, so we would at least need to conserve that too. https://github.com/bpeel/piglit/commit/d95701afbb9367ed1e82af27c98f18 Regards, - Neil signature.asc Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] mesa/shaderapi: Do a dry run of linking an in-use program
Timothy Arceriwrites: > I think I'd rather see this handled by releasing the uniforms only after > the second link is successful and using a temp/fallback pointer to hold > it until then. We need to do a similar type of thing with shader source > and the shader cache e.g [1]. > > [1] > https://cgit.freedesktop.org/mesa/mesa/commit/?id=e5bb4a0b0f4743fa0a0567991dca751ef49a7200 You’re right, I think that would be a better way to handle it. I guess if this was done then you don’t really need the second link. There are several pointers for the uniform state that you would need to keep and I think there is more state than just the uniforms as well. Perhaps pretty much everything that is freed in _mesa_clear_shader_program_data should be kept? An ideal solution might be to refactor this state into two seperate structs. One would contain everything the user can edit, such as the list of shaders to link, AttributeBindings, SSO flag etc, and the other would be everything that is the result of linking. That way the link could fill in a newly allocated link-state struct and only replace the old state after a sucessful link. My (perhaps lazy) hunch is that that could be a rather intrusive refactor and if the only reason to do it is to fix this obscure corner case then it might not be worth the hassle. I might try to have a go at this anyway but it’s probably the sort of thing that would need some discussion first. Regards, - Neil signature.asc Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] glsl: Transform fb buffers are only active if a variable uses them
The GL spec will soon be revised to clarify that a buffer binding for a transform feedback buffer is only required if a variable is actually defined to use the buffer binding point. Previously a declaration for the default transform buffer would make it require a binding even if nothing was declared to use the default buffer. Affects: KHR-GL44/45.enhanced_layouts.xfb_stride_of_empty_list KHR-GL44/45.enhanced_layouts.xfb_stride_of_empty_list_and_api --- src/compiler/glsl/link_varyings.cpp | 24 +++- 1 file changed, 15 insertions(+), 9 deletions(-) diff --git a/src/compiler/glsl/link_varyings.cpp b/src/compiler/glsl/link_varyings.cpp index 66a20a2..e663930 100644 --- a/src/compiler/glsl/link_varyings.cpp +++ b/src/compiler/glsl/link_varyings.cpp @@ -1364,7 +1364,6 @@ store_tfeedback_info(struct gl_context *ctx, struct gl_shader_program *prog, if (has_xfb_qualifiers) { for (unsigned j = 0; j < MAX_FEEDBACK_BUFFERS; j++) { if (prog->TransformFeedback.BufferStride[j]) { - buffers |= 1 << j; explicit_stride[j] = true; xfb_prog->sh.LinkedTransformFeedback->Buffers[j].Stride = prog->TransformFeedback.BufferStride[j] / 4; @@ -1389,10 +1388,24 @@ store_tfeedback_info(struct gl_context *ctx, struct gl_shader_program *prog, num_buffers++; buffer_stream_id = -1; continue; - } else if (tfeedback_decls[i].is_varying()) { + } + + if (has_xfb_qualifiers) { +buffer = tfeedback_decls[i].get_buffer(); + } else { +buffer = num_buffers; + } + + if (tfeedback_decls[i].is_varying()) { if (buffer_stream_id == -1) { /* First varying writing to this buffer: remember its stream */ buffer_stream_id = (int) tfeedback_decls[i].get_stream_id(); + + /* Only mark a buffer as active when there is a varying +* attached to it. This behaviour is based on a revised version +* of section 13.2.2 of the GL 4.6 spec. +*/ + buffers |= 1 << buffer; } else if (buffer_stream_id != (int) tfeedback_decls[i].get_stream_id()) { /* Varying writes to the same buffer from a different stream */ @@ -1408,13 +1421,6 @@ store_tfeedback_info(struct gl_context *ctx, struct gl_shader_program *prog, } } - if (has_xfb_qualifiers) { -buffer = tfeedback_decls[i].get_buffer(); - } else { -buffer = num_buffers; - } - buffers |= 1 << buffer; - if (!tfeedback_decls[i].store(ctx, prog, xfb_prog->sh.LinkedTransformFeedback, buffer, num_buffers, num_outputs, -- 2.9.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] mesa/shaderapi: Do a dry run of linking an in-use program
If an in-use program is unsuccessfully linked, the GL spec requires that the executable and the uniform state of the program should remain until a new program is bound. Previously this sort of worked in Mesa except that it would free the uniform state before attempting to link. At least on i965 this would usually continue to work because it accesses the uniform values via a dangling pointer. However it was causing sporadic failures in one of the CTS tests. To fix it, when an in-use program is about to be linked, it will now make a copy of the program and first try to link that instead. If it works it will let the link continue, otherwise it will copy the status to the new program and abandon the link. That way the link status and info log is preserved without freeing the uniform state. This isn’t very efficient but it would probably quite a big overhaul to fix it properly and it doesn’t seem worth it because I can’t imagine that any performance-sensitive application would be hitting this very strange corner case of the API. Fixes: KHR-GL46.sepshaderobjs.StateInteraction Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102177 --- src/mesa/main/shaderapi.c | 58 ++- 1 file changed, 57 insertions(+), 1 deletion(-) diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c index 7282435..3f7fe02 100644 --- a/src/mesa/main/shaderapi.c +++ b/src/mesa/main/shaderapi.c @@ -1134,6 +1134,58 @@ _mesa_compile_shader(struct gl_context *ctx, struct gl_shader *sh) } } +static bool +try_link_in_use_program(struct gl_context *ctx, +struct gl_shader_program *shProg) +{ + struct gl_shader_program *shProgCopy = _mesa_new_shader_program(0); + bool ret; + int i; + + /* If the program is in use then try linking a copy of the program first. +* If it won’t work then we’ll copy the link status to the old program and +* abandon the linking. Otherwise we let it relink as normal. This is done +* because relinking destroys the uniform state of the program but the GL +* spec actually requires that the state be preserved when the program is +* in use and the link fails. This isn’t terribly efficent but in practice +* it is probably not worth optimising because it’s a rather strange corner +* case of the spec. +* +* GLSL 4.6 spec section 7.3: +* +* “If a program object that is active for any shader stage is re-linked +* unsuccessfully, the link status will be set to FALSE, but any existing +* executables and associated state will remain part of the current +* rendering state until a subsequent call to UseProgram, +* UseProgramStages, or BindProgramPipeline removes them from use.” +*/ + + for (i = 0; i < shProg->NumShaders; i++) + attach_shader(ctx, shProgCopy, shProg->Shaders[i]); + + shProgCopy->BinaryRetreivableHint = shProg->BinaryRetreivableHint; + shProgCopy->SeparateShader = shProg->SeparateShader; + + _mesa_glsl_link_shader(ctx, shProgCopy); + + if (shProgCopy->data->LinkStatus) { + ret = true; + } else { + shProg->data->LinkStatus = shProgCopy->data->LinkStatus; + shProg->data->Validated = shProgCopy->data->Validated; + shProg->data->Version = shProgCopy->data->Version; + shProg->IsES = shProgCopy->IsES; + ralloc_free(shProg->data->InfoLog); + shProg->data->InfoLog = + ralloc_strdup(shProg->data, shProgCopy->data->InfoLog); + + ret = false; + } + + _mesa_reference_shader_program(ctx, , NULL); + + return ret; +} /** * Link a program's shaders. @@ -1159,12 +1211,16 @@ link_program(struct gl_context *ctx, struct gl_shader_program *shProg, } unsigned programs_in_use = 0; - if (ctx->_Shader) + if (ctx->_Shader) { for (unsigned stage = 0; stage < MESA_SHADER_STAGES; stage++) { if (ctx->_Shader->CurrentProgram[stage] && ctx->_Shader->CurrentProgram[stage]->Id == shProg->Name) { programs_in_use |= 1 << stage; } + } + + if (programs_in_use && !try_link_in_use_program(ctx, shProg)) + return; } FLUSH_VERTICES(ctx, 0); -- 2.9.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/3] nir/opt_intrinsics: Fix values for gl_SubGroupG{e, t}MaskARB
Thanks for the review. I’ve pushed this first patch to master and I’ll drop the rest. I’ve also pushed the piglit patch. Regards, - Neil Jason Ekstrand <ja...@jlekstrand.net> writes: > On Tue, Oct 31, 2017 at 10:55 AM, Neil Roberts <nrobe...@igalia.com> wrote: > >> Previously the values were calculated by just shifting ~0 by the >> invocation ID. This would end up including bits that are higher than >> gl_SubGroupSizeARB. The corresponding CTS test effectively requires that >> these high bits be zero so it was failing. There is a Piglit test as >> well but this appears to checking the wrong values so it passes. >> >> For the two greater-than bitmasks, this patch adds an extra mask with >> (~0>>(64-gl_SubGroupSizeARB)) to force these bits to zero. >> > > We had a big discussion about this in the office yesterday. :-) From my > reading of the GL_ARB_shader_ballot and GL_NV_shader_thread_group specs, it > looked like the current behavior was correct. However, I'm very glad to > see that I was wrong because it means that Vulkan and GL are consistent > with each other. :) It'll be a bit painful to rebase my subgroup work on > top of this but I think I'd prefer it if we land this first as it will > actually make some things a bit simpler even though it will conflict madly. > > >> Fixes: KHR-GL45.shader_ballot_tests.ShaderBallotBitmasks >> >> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102680#c3 >> Signed-off-by: Neil Roberts <nrobe...@igalia.com> >> --- >> src/compiler/nir/nir_opt_intrinsics.c | 24 ++-- >> 1 file changed, 22 insertions(+), 2 deletions(-) >> >> diff --git a/src/compiler/nir/nir_opt_intrinsics.c >> b/src/compiler/nir/nir_opt_intrinsics.c >> index 26a0f96..d5fdc51 100644 >> --- a/src/compiler/nir/nir_opt_intrinsics.c >> +++ b/src/compiler/nir/nir_opt_intrinsics.c >> @@ -28,6 +28,26 @@ >> * \file nir_opt_intrinsics.c >> */ >> >> +static nir_ssa_def * >> +high_subgroup_mask(nir_builder *b, >> + nir_ssa_def *count, >> + uint64_t base_mask) >> +{ >> + /* group_mask could probably be calculated more efficiently but we >> want to >> +* be sure not to shift by 64 if the subgroup size is 64 because the >> GLSL >> +* shift operator is undefined in that case. > > > Yeah, I'm pretty sure our hardware will just not shift in that case. > > >> In any case if we were worried >> +* about efficency this should probably be done further down because >> the >> +* subgroup size is likely to be known at compile time. >> > > I wouldn't be worried about efficiency. As I said in an earlier patch, > this becomes a constant with my series. > > Reviewed-by: Jason Ekstrand <ja...@jlekstrand.net> > Cc: mesa-sta...@lists.freedesktop.org > > Please push soon or maybe I'll just push it myself. > > >> +*/ >> + nir_ssa_def *subgroup_size = nir_load_subgroup_size(b); >> + nir_ssa_def *all_bits = nir_imm_int64(b, ~0ull); >> + nir_ssa_def *shift = nir_isub(b, nir_imm_int(b, 64), subgroup_size); >> + nir_ssa_def *group_mask = nir_ushr(b, all_bits, shift); >> + nir_ssa_def *higher_bits = nir_ishl(b, nir_imm_int64(b, base_mask), >> count); >> + >> + return nir_iand(b, higher_bits, group_mask); >> +} >> + >> static bool >> opt_intrinsics_impl(nir_function_impl *impl) >> { >> @@ -95,10 +115,10 @@ opt_intrinsics_impl(nir_function_impl *impl) >> replacement = nir_ishl(, nir_imm_int64(, 1ull), count); >> break; >> case nir_intrinsic_load_subgroup_ge_mask: >> - replacement = nir_ishl(, nir_imm_int64(, ~0ull), >> count); >> + replacement = high_subgroup_mask(, count, ~0ull); >> break; >> case nir_intrinsic_load_subgroup_gt_mask: >> - replacement = nir_ishl(, nir_imm_int64(, ~1ull), >> count); >> + replacement = high_subgroup_mask(, count, ~1ull); >> break; >> case nir_intrinsic_load_subgroup_le_mask: >> replacement = nir_inot(, nir_ishl(, nir_imm_int64(, >> ~1ull), count)); >> -- >> 2.9.5 >> >> ___ >> mesa-dev mailing list >> mesa-dev@lists.freedesktop.org >> https://lists.freedesktop.org/mailman/listinfo/mesa-dev >> > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev signature.asc Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 3/3] i965/fs/nir: Don’t let nir lower nir_intrinsic_load_subgroup_all_mask
Instead of letting nir lower nir_intrinsic_load_subgroup_all_mask this is now generated directly. This is more efficient because it can be calculated in the compiler based on the dispatch width. Sadly it’s still not totally ideal because the constant doesn’t seem to get propagated and there is still a redundant MOV. --- src/intel/compiler/brw_compiler.c | 2 +- src/intel/compiler/brw_fs_nir.cpp | 7 ++- 2 files changed, 7 insertions(+), 2 deletions(-) diff --git a/src/intel/compiler/brw_compiler.c b/src/intel/compiler/brw_compiler.c index 8df0d2e..f02fceb 100644 --- a/src/intel/compiler/brw_compiler.c +++ b/src/intel/compiler/brw_compiler.c @@ -57,7 +57,7 @@ static const struct nir_shader_compiler_options scalar_nir_options = { .lower_unpack_snorm_4x8 = true, .lower_unpack_unorm_2x16 = true, .lower_unpack_unorm_4x8 = true, - .lower_subgroup_all_mask = true, + .lower_subgroup_all_mask = false, .lower_subgroup_masks = true, .max_subgroup_size = 32, .max_unroll_iterations = 32, diff --git a/src/intel/compiler/brw_fs_nir.cpp b/src/intel/compiler/brw_fs_nir.cpp index 9202b0f..b73edc9 100644 --- a/src/intel/compiler/brw_fs_nir.cpp +++ b/src/intel/compiler/brw_fs_nir.cpp @@ -4185,7 +4185,12 @@ fs_visitor::nir_emit_intrinsic(const fs_builder , nir_intrinsic_instr *instr break; } - case nir_intrinsic_load_subgroup_all_mask: + case nir_intrinsic_load_subgroup_all_mask: { + uint32_t mask = ~UINT32_C(0) >> (32 - dispatch_width); + bld.MOV(retype(dest, BRW_REGISTER_TYPE_Q), brw_imm_d(mask)); + break; + } + case nir_intrinsic_load_subgroup_eq_mask: case nir_intrinsic_load_subgroup_ge_mask: case nir_intrinsic_load_subgroup_gt_mask: -- 2.9.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/3] nir: Add an intrinsic for a bitmask of the whole subgroup
Similar to nir_intrinsic_load_subgroup_eq_mask and friends, this adds an intrinsic which contains a bit for every member of the group. This doesn’t have a corresponding GLSL builtin but it will be used to calculate nir_intrinsic_load_subgroup_g{t,e}_mask. It has its own nir option on whether to lower it. The idea is that this should be much easier to generate than the other masks because it will likely be a compile-time constant and if so it will generate more efficient code for the other masks. --- src/compiler/nir/nir.h| 1 + src/compiler/nir/nir_intrinsics.h | 1 + src/compiler/nir/nir_opt_intrinsics.c | 41 +++ src/intel/compiler/brw_compiler.c | 1 + src/intel/compiler/brw_fs_nir.cpp | 1 + 5 files changed, 36 insertions(+), 9 deletions(-) diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h index dd833cf..edb02b9 100644 --- a/src/compiler/nir/nir.h +++ b/src/compiler/nir/nir.h @@ -1836,6 +1836,7 @@ typedef struct nir_shader_compiler_options { bool lower_extract_word; bool lower_vote_trivial; + bool lower_subgroup_all_mask; bool lower_subgroup_masks; /** diff --git a/src/compiler/nir/nir_intrinsics.h b/src/compiler/nir/nir_intrinsics.h index cefd18b..de362a8 100644 --- a/src/compiler/nir/nir_intrinsics.h +++ b/src/compiler/nir/nir_intrinsics.h @@ -350,6 +350,7 @@ SYSTEM_VALUE(layer_id, 1, 0, xx, xx, xx) SYSTEM_VALUE(view_index, 1, 0, xx, xx, xx) SYSTEM_VALUE(subgroup_size, 1, 0, xx, xx, xx) SYSTEM_VALUE(subgroup_invocation, 1, 0, xx, xx, xx) +SYSTEM_VALUE(subgroup_all_mask, 1, 0, xx, xx, xx) SYSTEM_VALUE(subgroup_eq_mask, 1, 0, xx, xx, xx) SYSTEM_VALUE(subgroup_ge_mask, 1, 0, xx, xx, xx) SYSTEM_VALUE(subgroup_gt_mask, 1, 0, xx, xx, xx) diff --git a/src/compiler/nir/nir_opt_intrinsics.c b/src/compiler/nir/nir_opt_intrinsics.c index d5fdc51..71d79d7 100644 --- a/src/compiler/nir/nir_opt_intrinsics.c +++ b/src/compiler/nir/nir_opt_intrinsics.c @@ -29,23 +29,39 @@ */ static nir_ssa_def * -high_subgroup_mask(nir_builder *b, - nir_ssa_def *count, - uint64_t base_mask) +subgroup_all_mask(nir_builder *b, + nir_ssa_def *count) { /* group_mask could probably be calculated more efficiently but we want to * be sure not to shift by 64 if the subgroup size is 64 because the GLSL -* shift operator is undefined in that case. In any case if we were worried -* about efficency this should probably be done further down because the -* subgroup size is likely to be known at compile time. +* shift operator is undefined in that case. In any case if the driver is +* worried about efficency this should probably be done further down +* because the subgroup size is likely to be known at compile time. */ nir_ssa_def *subgroup_size = nir_load_subgroup_size(b); nir_ssa_def *all_bits = nir_imm_int64(b, ~0ull); nir_ssa_def *shift = nir_isub(b, nir_imm_int(b, 64), subgroup_size); - nir_ssa_def *group_mask = nir_ushr(b, all_bits, shift); - nir_ssa_def *higher_bits = nir_ishl(b, nir_imm_int64(b, base_mask), count); + return nir_ushr(b, all_bits, shift); +} - return nir_iand(b, higher_bits, group_mask); +static nir_ssa_def * +high_subgroup_mask(nir_builder *b, + nir_ssa_def *count, + uint64_t base_mask) +{ + nir_ssa_def *higher_bits = nir_ishl(b, nir_imm_int64(b, base_mask), count); + nir_intrinsic_instr *load_group_mask = + nir_intrinsic_instr_create(b->shader, + nir_intrinsic_load_subgroup_all_mask); + load_group_mask->num_components = 1; + nir_ssa_dest_init(_group_mask->instr, + _group_mask->dest, + 1 /* num_components */, + 64 /* bit_size */, + NULL /* name */); + nir_builder_instr_insert(b, _group_mask->instr); + + return nir_iand(b, higher_bits, _group_mask->dest.ssa); } static bool @@ -100,6 +116,10 @@ opt_intrinsics_impl(nir_function_impl *impl) nir_imm_int(, 0)); break; } + case nir_intrinsic_load_subgroup_all_mask: +if (!b.shader->options->lower_subgroup_all_mask) + break; +/* flow through */ case nir_intrinsic_load_subgroup_eq_mask: case nir_intrinsic_load_subgroup_ge_mask: case nir_intrinsic_load_subgroup_gt_mask: @@ -111,6 +131,9 @@ opt_intrinsics_impl(nir_function_impl *impl) nir_ssa_def *count = nir_load_subgroup_invocation(); switch (intrin->intrinsic) { +case nir_intrinsic_load_subgroup_all_mask: + replacement = subgroup_all_mask(, count); + break; case nir_intrinsic_load_subgroup_eq_mask: replacement = nir_ishl(, nir_imm_int64(, 1ull), count); break; diff --git
[Mesa-dev] [PATCH 1/3] nir/opt_intrinsics: Fix values for gl_SubGroupG{e, t}MaskARB
Previously the values were calculated by just shifting ~0 by the invocation ID. This would end up including bits that are higher than gl_SubGroupSizeARB. The corresponding CTS test effectively requires that these high bits be zero so it was failing. There is a Piglit test as well but this appears to checking the wrong values so it passes. For the two greater-than bitmasks, this patch adds an extra mask with (~0>>(64-gl_SubGroupSizeARB)) to force these bits to zero. Fixes: KHR-GL45.shader_ballot_tests.ShaderBallotBitmasks Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102680#c3 Signed-off-by: Neil Roberts <nrobe...@igalia.com> --- src/compiler/nir/nir_opt_intrinsics.c | 24 ++-- 1 file changed, 22 insertions(+), 2 deletions(-) diff --git a/src/compiler/nir/nir_opt_intrinsics.c b/src/compiler/nir/nir_opt_intrinsics.c index 26a0f96..d5fdc51 100644 --- a/src/compiler/nir/nir_opt_intrinsics.c +++ b/src/compiler/nir/nir_opt_intrinsics.c @@ -28,6 +28,26 @@ * \file nir_opt_intrinsics.c */ +static nir_ssa_def * +high_subgroup_mask(nir_builder *b, + nir_ssa_def *count, + uint64_t base_mask) +{ + /* group_mask could probably be calculated more efficiently but we want to +* be sure not to shift by 64 if the subgroup size is 64 because the GLSL +* shift operator is undefined in that case. In any case if we were worried +* about efficency this should probably be done further down because the +* subgroup size is likely to be known at compile time. +*/ + nir_ssa_def *subgroup_size = nir_load_subgroup_size(b); + nir_ssa_def *all_bits = nir_imm_int64(b, ~0ull); + nir_ssa_def *shift = nir_isub(b, nir_imm_int(b, 64), subgroup_size); + nir_ssa_def *group_mask = nir_ushr(b, all_bits, shift); + nir_ssa_def *higher_bits = nir_ishl(b, nir_imm_int64(b, base_mask), count); + + return nir_iand(b, higher_bits, group_mask); +} + static bool opt_intrinsics_impl(nir_function_impl *impl) { @@ -95,10 +115,10 @@ opt_intrinsics_impl(nir_function_impl *impl) replacement = nir_ishl(, nir_imm_int64(, 1ull), count); break; case nir_intrinsic_load_subgroup_ge_mask: - replacement = nir_ishl(, nir_imm_int64(, ~0ull), count); + replacement = high_subgroup_mask(, count, ~0ull); break; case nir_intrinsic_load_subgroup_gt_mask: - replacement = nir_ishl(, nir_imm_int64(, ~1ull), count); + replacement = high_subgroup_mask(, count, ~1ull); break; case nir_intrinsic_load_subgroup_le_mask: replacement = nir_inot(, nir_ishl(, nir_imm_int64(, ~1ull), count)); -- 2.9.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2 1/3] glsl_parser_extra: Add utility to copy symbols between symbol tables
From: Eduardo Lima Mitev <el...@igalia.com> Some symbols gathered in the symbols table during parsing are needed later for the compile and link stages, so they are moved along the process. Currently, only functions and non-temporary variables are copied between symbol tables. However, the built-in gl_PerVertex interface blocks are also needed during the linking stage (the last step), to match re-declared blocks of inter-stage shaders. This patch adds a new utility function that will factorize current code that copies functions and variables between two symbol tables, and in addition will copy explicitly declared gl_PerVertex blocks too. The function will be used in a subsequent patch. v2 (Neil Roberts): Allow the src symbol table to be NULL and explicitly copy the gl_PerVertex symbols in case they are not referenced in the exec_list. Signed-off-by: Eduardo Lima Mitev <el...@igalia.com> Signed-off-by: Neil Roberts <nrobe...@igalia.com> --- src/compiler/glsl/glsl_parser_extras.cpp | 43 src/compiler/glsl/glsl_parser_extras.h | 5 2 files changed, 48 insertions(+) diff --git a/src/compiler/glsl/glsl_parser_extras.cpp b/src/compiler/glsl/glsl_parser_extras.cpp index 51d835b..ec4603d 100644 --- a/src/compiler/glsl/glsl_parser_extras.cpp +++ b/src/compiler/glsl/glsl_parser_extras.cpp @@ -1863,6 +1863,49 @@ set_shader_inout_layout(struct gl_shader *shader, shader->bound_image = state->bound_image_specified; } +/* src can be NULL if only the symbols found in the exec_list should be + * copied + */ +void +_mesa_glsl_copy_symbols_from_table(struct exec_list *shader_ir, + struct glsl_symbol_table *src, + struct glsl_symbol_table *dest) +{ + foreach_in_list (ir_instruction, ir, shader_ir) { + switch (ir->ir_type) { + case ir_type_function: + dest->add_function((ir_function *) ir); + break; + case ir_type_variable: { + ir_variable *const var = (ir_variable *) ir; + + if (var->data.mode != ir_var_temporary) +dest->add_variable(var); + break; + } + default: + break; + } + } + + if (src != NULL) { + /* Explicitly copy the gl_PerVertex interface definitions because these + * are needed to check they are the same during the interstage link. + * They can’t necessarily be found via the exec_list because the members + * might not be referenced. The GL spec still requires that they match + * in that case. + */ + const glsl_type *iface = + src->get_interface("gl_PerVertex", ir_var_shader_in); + if (iface) + dest->add_interface(iface->name, iface, ir_var_shader_in); + + iface = src->get_interface("gl_PerVertex", ir_var_shader_out); + if (iface) + dest->add_interface(iface->name, iface, ir_var_shader_out); + } +} + extern "C" { static void diff --git a/src/compiler/glsl/glsl_parser_extras.h b/src/compiler/glsl/glsl_parser_extras.h index fb35813..2e98bc7 100644 --- a/src/compiler/glsl/glsl_parser_extras.h +++ b/src/compiler/glsl/glsl_parser_extras.h @@ -948,6 +948,11 @@ extern int glcpp_preprocess(void *ctx, const char **shader, char **info_log, extern void _mesa_destroy_shader_compiler(void); extern void _mesa_destroy_shader_compiler_caches(void); +extern void +_mesa_glsl_copy_symbols_from_table(struct exec_list *shader_ir, + struct glsl_symbol_table *src, + struct glsl_symbol_table *dest); + #ifdef __cplusplus } #endif -- 2.9.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2 3/3] glsl/linker: Check that re-declared, inter-shader built-in blocks match
From: Eduardo Lima Mitev <el...@igalia.com> >From GLSL 4.5 spec, section "7.1 Built-In Language Variables", page 130 of the PDF states: "If multiple shaders using members of a built-in block belonging to the same interface are linked together in the same program, they must all redeclare the built-in block in the same way, as described in section 4.3.9 “Interface Blocks” for interface-block matching, or a link-time error will result." Fixes: * GL45-CTS.CommonBugs.CommonBug_PerVertexValidation v2 (Neil Roberts): Explicitly look for gl_PerVertex in the symbol tables instead of waiting to find a variable in the interface. Signed-off-by: Eduardo Lima Mitev <el...@igalia.com> Signed-off-by: Neil Roberts <nrobe...@igalia.com> --- src/compiler/glsl/link_interface_blocks.cpp | 29 + 1 file changed, 29 insertions(+) diff --git a/src/compiler/glsl/link_interface_blocks.cpp b/src/compiler/glsl/link_interface_blocks.cpp index 7037c77..510d4f7 100644 --- a/src/compiler/glsl/link_interface_blocks.cpp +++ b/src/compiler/glsl/link_interface_blocks.cpp @@ -364,6 +364,35 @@ validate_interstage_inout_blocks(struct gl_shader_program *prog, consumer->Stage != MESA_SHADER_FRAGMENT) || consumer->Stage == MESA_SHADER_GEOMETRY; + /* Check that block re-declarations of gl_PerVertex are compatible +* across shaders: From OpenGL Shading Language 4.5, section +* "7.1 Built-In Language Variables", page 130 of the PDF: +* +*"If multiple shaders using members of a built-in block belonging +* to the same interface are linked together in the same program, +* they must all redeclare the built-in block in the same way, as +* described in section 4.3.9 “Interface Blocks” for interface-block +* matching, or a link-time error will result." +* +* This is done explicitly outside of iterating the member variable +* declarations because it is possible that the variables are not used and +* so they would have been optimised out. +*/ + const glsl_type *consumer_iface = + consumer->symbols->get_interface("gl_PerVertex", + ir_var_shader_in); + + const glsl_type *producer_iface = + producer->symbols->get_interface("gl_PerVertex", + ir_var_shader_out); + + if (producer_iface && consumer_iface && + interstage_member_mismatch(prog, consumer_iface, producer_iface)) { + linker_error(prog, "Incompatible or missing gl_PerVertex re-declaration " + "in consecutive shaders"); + return; + } + /* Add output interfaces from the producer to the symbol table. */ foreach_in_list(ir_instruction, node, producer->ir) { ir_variable *var = node->as_variable(); -- 2.9.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2 2/3] glsl: Use the utility function to copy symbols between symbol tables
From: Eduardo Lima Mitev <el...@igalia.com> This effectively factorizes a couple of similar routines. v2 (Neil Roberts): Non-trivial rebase on master Signed-off-by: Eduardo Lima Mitev <el...@igalia.com> Signed-off-by: Neil Roberts <nrobe...@igalia.com> --- src/compiler/glsl/glsl_parser_extras.cpp | 25 +++-- src/compiler/glsl/linker.cpp | 16 +++- 2 files changed, 10 insertions(+), 31 deletions(-) diff --git a/src/compiler/glsl/glsl_parser_extras.cpp b/src/compiler/glsl/glsl_parser_extras.cpp index ec4603d..1fea4d6 100644 --- a/src/compiler/glsl/glsl_parser_extras.cpp +++ b/src/compiler/glsl/glsl_parser_extras.cpp @@ -1979,6 +1979,7 @@ do_late_parsing_checks(struct _mesa_glsl_parse_state *state) static void opt_shader_and_create_symbol_table(struct gl_context *ctx, + struct glsl_symbol_table *source_symbols, struct gl_shader *shader) { assert(shader->CompileStatus != compile_failure && @@ -2036,22 +2037,8 @@ opt_shader_and_create_symbol_table(struct gl_context *ctx, * We don't have to worry about types or interface-types here because those * are fly-weights that are looked up by glsl_type. */ - foreach_in_list (ir_instruction, ir, shader->ir) { - switch (ir->ir_type) { - case ir_type_function: - shader->symbols->add_function((ir_function *) ir); - break; - case ir_type_variable: { - ir_variable *const var = (ir_variable *) ir; - - if (var->data.mode != ir_var_temporary) -shader->symbols->add_variable(var); - break; - } - default: - break; - } - } + _mesa_glsl_copy_symbols_from_table(shader->ir, source_symbols, + shader->symbols); } void @@ -2088,7 +2075,9 @@ _mesa_glsl_compile_shader(struct gl_context *ctx, struct gl_shader *shader, return; if (shader->CompileStatus == compiled_no_opts) { - opt_shader_and_create_symbol_table(ctx, shader); + opt_shader_and_create_symbol_table(ctx, +NULL, /* source_symbols */ +shader); shader->CompileStatus = compile_success; return; } @@ -2149,7 +2138,7 @@ _mesa_glsl_compile_shader(struct gl_context *ctx, struct gl_shader *shader, lower_subroutine(shader->ir, state); if (!ctx->Cache || force_recompile) - opt_shader_and_create_symbol_table(ctx, shader); + opt_shader_and_create_symbol_table(ctx, state->symbols, shader); else { reparent_ir(shader->ir, shader->ir); shader->CompileStatus = compiled_no_opts; diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp index f827b68..0045291 100644 --- a/src/compiler/glsl/linker.cpp +++ b/src/compiler/glsl/linker.cpp @@ -1255,21 +1255,11 @@ interstage_cross_validate_uniform_blocks(struct gl_shader_program *prog, * Populates a shaders symbol table with all global declarations */ static void -populate_symbol_table(gl_linked_shader *sh) +populate_symbol_table(gl_linked_shader *sh, glsl_symbol_table *symbols) { sh->symbols = new(sh) glsl_symbol_table; - foreach_in_list(ir_instruction, inst, sh->ir) { - ir_variable *var; - ir_function *func; - - if ((func = inst->as_function()) != NULL) { - sh->symbols->add_function(func); - } else if ((var = inst->as_variable()) != NULL) { - if (var->data.mode != ir_var_temporary) -sh->symbols->add_variable(var); - } - } + _mesa_glsl_copy_symbols_from_table(sh->ir, symbols, sh->symbols); } @@ -2288,7 +2278,7 @@ link_intrastage_shaders(void *mem_ctx, link_bindless_layout_qualifiers(prog, shader_list, num_shaders); - populate_symbol_table(linked); + populate_symbol_table(linked, shader_list[0]->symbols); /* The pointer to the main function in the final linked shader (i.e., the * copy of the original shader that contained the main function). -- 2.9.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2 0/3] Fix for PerVertexValidation CTS test
Hi, Here is a v2 of Eduardo’s patches to fix the PerVertexValidation CTS test. They were originally posted here: https://lists.freedesktop.org/archives/mesa-dev/2017-March/146667.html The patches needed a non-trivial rebase. The previous discussion was left with a note saying that there are more cases that should fail but that weren’t handled by the patches and aren’t checked by the CTS test. Ie, if the consuming shader doesn’t reference any of the members of gl_PerVertex then it wouldn’t find the block redefinition. This v2 addresses that by directly fetching the glsl_type for the gl_PerVertex block from the symbol table rather waiting for an instruction which references it. I’ve tested it with Piglit and the public CTS repo and it doesn’t cause any regressions, and of course it fixes the CTS test. It might be nice to add a Piglit or CTS test to check this missing case. - Neil Eduardo Lima Mitev (3): glsl_parser_extra: Add utility to copy symbols between symbol tables glsl: Use the utility function to copy symbols between symbol tables glsl/linker: Check that re-declared, inter-shader built-in blocks match src/compiler/glsl/glsl_parser_extras.cpp| 68 + src/compiler/glsl/glsl_parser_extras.h | 5 +++ src/compiler/glsl/link_interface_blocks.cpp | 29 src/compiler/glsl/linker.cpp| 16 ++- 4 files changed, 87 insertions(+), 31 deletions(-) -- 2.9.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 2/3] vulkan/wsi: Report the correct min/maxImageCount
Jason Ekstrandwrites: > I wish... Unfortunately, the spec says: > > Let *n* be the total number of images in the swapchain, *m* be the > value of VkSurfaceCapabilitiesKHR::minImageCount, and *a* be the > number of presentable images that the application has currently > acquired (i.e. images acquired with vkAcquireNextImageKHR, but not yet > presented with vkQueuePresentKHR). vkAcquireNextImageKHR *can* always > succeed if a ≤ n - m at the time vkAcquireNextImageKHR is called. > vkAcquireNextImageKHR *should* not be called if a > n - m with a > timeout of UINT64_MAX; in such a case, vkAcquireNextImageKHR *may* > block indefinitely. > > Because this is based on the number of images in the swapchain and the > VkSurfaceCapabilitiesKHR field, if we return a swapchain with more > images, n will be larger but not m so that just means that they can > acquire more images, not that we've reserved more. Arguably, that's a > spec bug and I'll file one and see where it goes. However, it's > definitely what the spec says today. :-( Fair enough. Thanks for the explanation. Regards, - Neil . ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 2/3] vulkan/wsi: Report the correct min/maxImageCount
Jason Ekstrandwrites: > Hey, Neil! Hey Jason :) > Yeah... That's a bit unfortunate. The problem is that we have no way of > returning a different number of images depending on the mode. In theory, > we could start out at 2 and return SUBOPTIMAL and force the application to > recreate the swapchain with more images until we have enough. That would > be a real pain though... In not sure what the best option is. Hm, was there a problem with the previous approach? Ie, in the surface caps it always returns minImages as two but when it comes to creating the swapchain if the present mode is MAILBOX then it will go ahead and create 4 images. The parameter in the create call is also called “minImages” not “numImages” and the spec implies it’s ok to create more than were requested. It seems like that would just do the right thing. Regards, - Neil . ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 2/3] vulkan/wsi: Report the correct min/maxImageCount
Jason Ekstrandwrites: > + /* For true mailbox mode, we need at least 4 images: > +* 1) One to scan out from > +* 2) One to have queued for scan-out > +* 3) One to be currently held by the Wayland compositor > +* 4) One to render to > +*/ > caps->minImageCount = 2; > - caps->maxImageCount = 4; > + /* There is no real maximum */ > + caps->maxImageCount = 0; This patch as it was applied seems to leave the minImageCount as 2 on X11. Is this possibly a mistake? Now it doesn’t match the comment above it and there’s no explanation of this in the commit message. It seems a little surprising that it has been changed to 4 in the Wayland backend. Won’t this mean that every application using Vulkan, even if it’s just a simple GUI app will end up with four large buffers just to make the few applications that want true mailbox mode behave correctly? Regards, - Neil . ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/2] intel/compiler: Use emit_emplace in the builders
Replaces all calls of the form emit(instruction(…)) with emit_emplace(…). This should avoid a redundant copy of the stack-allocated temporary instruction. Although the temporary would have been stack-allocated so it might not seem like a big deal, the instructions do also have an internal allocation for the sources so we also get to avoid an extra allocation. I tested this with shader-db over four iterations and the average sum of the time reported for all of the threads drops from 406 seconds to 366. --- src/intel/compiler/brw_fs_builder.h | 32 src/intel/compiler/brw_vec4_builder.h | 28 ++-- 2 files changed, 30 insertions(+), 30 deletions(-) diff --git a/src/intel/compiler/brw_fs_builder.h b/src/intel/compiler/brw_fs_builder.h index 90d8285..9991dd7 100644 --- a/src/intel/compiler/brw_fs_builder.h +++ b/src/intel/compiler/brw_fs_builder.h @@ -275,7 +275,7 @@ namespace brw { instruction * emit(enum opcode opcode) const { - return emit(instruction(opcode, dispatch_width())); + return emit_emplace(opcode, dispatch_width()); } /** @@ -284,7 +284,7 @@ namespace brw { instruction * emit(enum opcode opcode, const dst_reg ) const { - return emit(instruction(opcode, dispatch_width(), dst)); + return emit_emplace(opcode, dispatch_width(), dst); } /** @@ -301,11 +301,11 @@ namespace brw { case SHADER_OPCODE_LOG2: case SHADER_OPCODE_SIN: case SHADER_OPCODE_COS: -return emit(instruction(opcode, dispatch_width(), dst, -fix_math_operand(src0))); +return emit_emplace(opcode, dispatch_width(), dst, +fix_math_operand(src0)); default: -return emit(instruction(opcode, dispatch_width(), dst, src0)); +return emit_emplace(opcode, dispatch_width(), dst, src0); } } @@ -320,12 +320,12 @@ namespace brw { case SHADER_OPCODE_POW: case SHADER_OPCODE_INT_QUOTIENT: case SHADER_OPCODE_INT_REMAINDER: -return emit(instruction(opcode, dispatch_width(), dst, -fix_math_operand(src0), -fix_math_operand(src1))); +return emit_emplace(opcode, dispatch_width(), dst, +fix_math_operand(src0), +fix_math_operand(src1)); default: -return emit(instruction(opcode, dispatch_width(), dst, src0, src1)); +return emit_emplace(opcode, dispatch_width(), dst, src0, src1); } } @@ -342,14 +342,14 @@ namespace brw { case BRW_OPCODE_BFI2: case BRW_OPCODE_MAD: case BRW_OPCODE_LRP: -return emit(instruction(opcode, dispatch_width(), dst, -fix_3src_operand(src0), -fix_3src_operand(src1), -fix_3src_operand(src2))); +return emit_emplace(opcode, dispatch_width(), dst, +fix_3src_operand(src0), +fix_3src_operand(src1), +fix_3src_operand(src2)); default: -return emit(instruction(opcode, dispatch_width(), dst, -src0, src1, src2)); +return emit_emplace(opcode, dispatch_width(), dst, +src0, src1, src2); } } @@ -361,7 +361,7 @@ namespace brw { emit(enum opcode opcode, const dst_reg , const src_reg srcs[], unsigned n) const { - return emit(instruction(opcode, dispatch_width(), dst, srcs, n)); + return emit_emplace(opcode, dispatch_width(), dst, srcs, n); } /** diff --git a/src/intel/compiler/brw_vec4_builder.h b/src/intel/compiler/brw_vec4_builder.h index 413b27c..d802990 100644 --- a/src/intel/compiler/brw_vec4_builder.h +++ b/src/intel/compiler/brw_vec4_builder.h @@ -240,7 +240,7 @@ namespace brw { instruction * emit(enum opcode opcode) const { - return emit(instruction(opcode)); + return emit_emplace(opcode); } /** @@ -249,7 +249,7 @@ namespace brw { instruction * emit(enum opcode opcode, const dst_reg ) const { - return emit(instruction(opcode, dst)); + return emit_emplace(opcode, dst); } /** @@ -267,11 +267,11 @@ namespace brw { case SHADER_OPCODE_SIN: case SHADER_OPCODE_COS: return fix_math_instruction( - emit(instruction(opcode, dst, -fix_math_operand(src0; + emit_emplace(opcode, dst, +fix_math_operand(src0))); default:
[Mesa-dev] [PATCH 1/2] intel/compiler: Add a placement version of emit to the builders
Adds “emit_emplace” to emit an instruction that is constructed inplace with the parameters given. This will be used instead of the pattern of doing emit(instruction(…)) because that ends up calling emit(const instruction&) which implies doing a copy. The “emplace” terminology comes from std::vector::emplace_back which does a very similar thing. --- Sorry if this email ends up getting sent twice. It didn’t seem to work the first time. src/intel/compiler/brw_fs_builder.h | 14 ++ src/intel/compiler/brw_vec4_builder.h | 14 ++ 2 files changed, 28 insertions(+) diff --git a/src/intel/compiler/brw_fs_builder.h b/src/intel/compiler/brw_fs_builder.h index 87394bc..90d8285 100644 --- a/src/intel/compiler/brw_fs_builder.h +++ b/src/intel/compiler/brw_fs_builder.h @@ -25,6 +25,7 @@ #ifndef BRW_FS_BUILDER_H #define BRW_FS_BUILDER_H +#include #include "brw_ir_fs.h" #include "brw_shader.h" @@ -256,6 +257,19 @@ namespace brw { } /** + * Insert an instruction by constructing it directly inplace. This uses + * perfect forwarding magic from C++11 to forward the arguments to the + * constructor. + */ + template + instruction * + emit_emplace(Args&&... args) const + { + return emit(new(shader->mem_ctx) + instruction(std::forward(args)...)); + } + + /** * Create and insert a nullary control instruction into the program. */ instruction * diff --git a/src/intel/compiler/brw_vec4_builder.h b/src/intel/compiler/brw_vec4_builder.h index 4c3efe8..413b27c 100644 --- a/src/intel/compiler/brw_vec4_builder.h +++ b/src/intel/compiler/brw_vec4_builder.h @@ -25,6 +25,7 @@ #ifndef BRW_VEC4_BUILDER_H #define BRW_VEC4_BUILDER_H +#include #include "brw_ir_vec4.h" #include "brw_ir_allocator.h" @@ -221,6 +222,19 @@ namespace brw { } /** + * Insert an instruction by constructing it directly inplace. This uses + * perfect forwarding magic from C++11 to forward the arguments to the + * constructor. + */ + template + instruction * + emit_emplace(Args&&... args) const + { + return emit(new(shader->mem_ctx) + instruction(std::forward(args)...)); + } + + /** * Create and insert a nullary control instruction into the program. */ instruction * -- 2.9.5 . ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH vulkan] anv/pipeline: Try to create all pipelines
According to the Vulkan 1.0 spec section 9.4: “When an application attempts to create many pipelines in a single command, it is possible that some subset may fail creation. In that case, the corresponding entries in the pPipelines output array will be filled with VK_NULL_HANDLE values. If any pipeline fails creation (for example, due to out of memory errors), the vkCreate*Pipelines commands will return an error code. The implementation will attempt to create all pipelines, and only return VK_NULL_HANDLE values for those that actually failed.” Previously anv_Create{Graphics,Compute}Pipelines would destroy any previous pipelines that it created. The problem with this is that if the application is expecting the driver to behave like the spec then it may try to free any pipelines that were successfully created by iterating through the results array. If any of them succeeded then the pointer will be a stale pointer and the application would probably crash. --- src/intel/vulkan/anv_pipeline.c | 52 +++-- 1 file changed, 29 insertions(+), 23 deletions(-) diff --git a/src/intel/vulkan/anv_pipeline.c b/src/intel/vulkan/anv_pipeline.c index 1173b4f..d3a01df 100644 --- a/src/intel/vulkan/anv_pipeline.c +++ b/src/intel/vulkan/anv_pipeline.c @@ -1231,24 +1231,28 @@ VkResult anv_CreateGraphicsPipelines( const VkAllocationCallbacks*pAllocator, VkPipeline* pPipelines) { - VkResult result = VK_SUCCESS; + VkResult this_result; + VkResult overall_result = VK_SUCCESS; unsigned i = 0; for (; i < count; i++) { - result = anv_graphics_pipeline_create(_device, -pipelineCache, -[i], -NULL, pAllocator, [i]); - if (result != VK_SUCCESS) { - for (unsigned j = 0; j < i; j++) { -anv_DestroyPipeline(_device, pPipelines[j], pAllocator); - } - - return result; + this_result = anv_graphics_pipeline_create(_device, + pipelineCache, + [i], + NULL, + pAllocator, + [i]); + if (this_result != VK_SUCCESS) { + /* According to the spec this should try to create all pipelines and + * if any fail then it will set the corresponding pipeline handle to + * NULL. + */ + pPipelines[i] = VK_NULL_HANDLE; + overall_result = this_result; } } - return VK_SUCCESS; + return overall_result; } static VkResult anv_compute_pipeline_create( @@ -1287,21 +1291,23 @@ VkResult anv_CreateComputePipelines( const VkAllocationCallbacks*pAllocator, VkPipeline* pPipelines) { - VkResult result = VK_SUCCESS; + VkResult this_result; + VkResult overall_result = VK_SUCCESS; unsigned i = 0; for (; i < count; i++) { - result = anv_compute_pipeline_create(_device, pipelineCache, - [i], - pAllocator, [i]); - if (result != VK_SUCCESS) { - for (unsigned j = 0; j < i; j++) { -anv_DestroyPipeline(_device, pPipelines[j], pAllocator); - } - - return result; + this_result = anv_compute_pipeline_create(_device, pipelineCache, +[i], +pAllocator, [i]); + if (this_result != VK_SUCCESS) { + /* According to the spec this should try to create all pipelines and + * if any fail then it will set the corresponding pipeline handle to + * NULL. + */ + pPipelines[i] = VK_NULL_HANDLE; + overall_result = this_result; } } - return VK_SUCCESS; + return overall_result; } -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 22/22] meta/blit: Use GL_EXT_shader_samples_identical in MSAA-SS resolve blit
Iago Toralwrites: > I don't know much about this, but using shader_samples_identical > should only give a benefit if we actually get identical samples, > otherwise it means more work, right? I noticed that the test renders a > quad with random colors for each vertex that will be interpolated > across the region, so could it be that we are hitting few cases of > identical samples? Shouldn't the results be expected to be better if > we rendered a flat color instead? It should all be identical regardless of the color chosen because it is doing per-pixel shading rather than per-sample which means the fragment shader is only run once per pixel and the same value is written to all of the samples. The only case where you get differing samples normally is at the edges of a polygon where the single color is only written to a few of the samples. In this case it is even more identical because the random color is set on a uniform rather than a vertex so it is actually using the same color for the whole framebuffer. However, it is rendered as two triangles which might mean that along the diagonal of the framebuffer where the two triangles meet there will be a polygon edge which means the hardware might not recognise that these all the same color, but that should only be a few pixels and probably won't affect the results very much. Regards, - Neil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 22/22] meta/blit: Use GL_EXT_shader_samples_identical in MSAA-SS resolve blit
I made a pathological test case (attached) which repeatedly renders into an MSAA FBO and then blits it to the screen and measures the framerate. It checks it with a range of different sample counts. The rendering is done either by rendering two triangles to fill the framebuffer or by calling glClear. The percentage increase in framerate after applying the patch is like this: With triangles to fill buffer: 16 62,27% 8 48,59% 4 27,72% 2 5,34% 0 0,58% With glClear: 16 -5,20% 8 -7,08% 4 -2,45% 2 -20,76% 0 3,71% It seems like a pretty convincing win for the triangle case but the clear case makes it slightly worse. Presumably this is because we don't do anything to detect the value stored in the MCS buffer when doing a fast clear so the fast path isn't taken and the shader being more complicated makes it slower. Not sure if we want to try and do anything about that because potentially the cleared pixels aren't very common in a framebuffer from a real use case so it might not really matter. Currently we don't use SIMD16 for 16x MSAA because we can't allocate the registers well enough to make it worthwhile. This patch makes that problem a bit more interesting because even if we end up spilling a lot it might still be worth doing SIMD16 because the cases where the spilled instructions are hit would be much less common. - Neil #include #include #include #include #include #define N_FRAMES_TO_SKIP 1000 #define N_FRAMES_TO_DRAW 1 enum draw_mode { DRAW_MODE_TRIANGLES, DRAW_MODE_CLEAR }; struct data { SDL_Window *window; SDL_GLContext gl_context; }; static SDL_GLContext create_gl_context(SDL_Window *window) { /* First try creating a core context because if we get one it * can be more efficient. */ SDL_GL_SetAttribute(SDL_GL_CONTEXT_MAJOR_VERSION, 3); SDL_GL_SetAttribute(SDL_GL_CONTEXT_MINOR_VERSION, 1); SDL_GL_SetAttribute(SDL_GL_CONTEXT_PROFILE_MASK, SDL_GL_CONTEXT_PROFILE_CORE); return SDL_GL_CreateContext(window); } static const char vertex_shader_source[] = "#version 150\n" "in vec2 piglit_vertex;\n" "\n" "void\n" "main()\n" "{\n" "gl_Position = vec4(piglit_vertex, 0.0, 1.0);\n" "}\n"; static const char fragment_shader_source[] = "#version 150\n" "uniform vec3 color;\n" "\n" "void\n" "main()\n" "{\n" "gl_FragColor = vec4(color, 1.0);\n" "}\n"; static GLuint create_shader(const char *name, GLenum type, const char *source, int source_length) { GLuint shader; GLint length, compile_status; GLsizei actual_length; GLchar *info_log; shader = glCreateShader(type); glShaderSource(shader, 1, /* n_strings */ , _length); glCompileShader(shader); glGetShaderiv(shader, GL_INFO_LOG_LENGTH, ); if (length > 0) { info_log = malloc(length); glGetShaderInfoLog(shader, length, _length, info_log); if (*info_log) { fprintf(stderr, "Info log for %s:\n%s\n", name, info_log); } free(info_log); } glGetShaderiv(shader, GL_COMPILE_STATUS, _status); if (!compile_status) { fprintf(stderr, "%s compilation failed", name); glDeleteShader(shader); return 0; } return shader; } static bool link_program(GLuint program) { GLint length, link_status; GLsizei actual_length; GLchar *info_log; glLinkProgram(program); glGetProgramiv(program, GL_INFO_LOG_LENGTH, ); if (length > 0) { info_log = malloc(length); glGetProgramInfoLog(program, length, _length, info_log); if (*info_log) fprintf(stderr, "Link info log:\n%s\n", info_log); free(info_log); } glGetProgramiv(program, GL_LINK_STATUS, _status); if (!link_status) { fprintf(stderr, "program link failed"); return false; } return true; } static bool run_test(struct data *data, int n_samples, enum draw_mode draw_mode, float *result) { float verts[] = { -1.0f, -1.0f, 1.0f, -1.0f, -1.0f, 1.0f, 1.0f, 1.0f }; GLuint
Re: [Mesa-dev] [PATCH 1/4] mesa: gl_NumSamples should always be at least one
Looks good to me. Interestingly we have a test for it in Piglit which gets run with samples=0 and it is currently passing so presumably it is broken. Reviewed-by: Neil Roberts <n...@linux.intel.com> - Neil Ilia Mirkin <imir...@alum.mit.edu> writes: > From ARB_sample_shading: > > "gl_NumSamples is the total number of samples in the framebuffer, > or one if rendering to a non-multisample framebuffer" > > So make sure to always pass in at least 1. > > Signed-off-by: Ilia Mirkin <imir...@alum.mit.edu> > --- > src/mesa/program/prog_statevars.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/src/mesa/program/prog_statevars.c > b/src/mesa/program/prog_statevars.c > index eed2412..489f75f 100644 > --- a/src/mesa/program/prog_statevars.c > +++ b/src/mesa/program/prog_statevars.c > @@ -353,7 +353,7 @@ _mesa_fetch_state(struct gl_context *ctx, const > gl_state_index state[], >} >return; > case STATE_NUM_SAMPLES: > - ((int *)value)[0] = _mesa_geometric_samples(ctx->DrawBuffer); > + ((int *)value)[0] = MAX2(1, _mesa_geometric_samples(ctx->DrawBuffer)); >return; > case STATE_DEPTH_RANGE: >value[0] = ctx->ViewportArray[0].Near;/* near */ > -- > 2.4.10 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2] Android: fix build break in libmesa_program
Looks good to me. Sorry about breaking it. Reviewed-by: Neil Roberts <n...@linux.intel.com> - Neil Rob Herring <r...@kernel.org> writes: > Commit 5fd848f6c9ee ("program: Use _mesa_geometric_samples to calculate > gl_NumSamples") broken Android builds. Add the missing include path "main" > to framebuffer.h like other includes in prog_statevars.c. > > Cc: Neil Roberts <n...@linux.intel.com> > Cc: Ilia Mirkin <imir...@alum.mit.edu> > Signed-off-by: Rob Herring <r...@kernel.org> > --- > v2: Add main to #include instead of adding the path to Android.mk > > src/mesa/program/prog_statevars.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/src/mesa/program/prog_statevars.c > b/src/mesa/program/prog_statevars.c > index eed2412..af5fefe 100644 > --- a/src/mesa/program/prog_statevars.c > +++ b/src/mesa/program/prog_statevars.c > @@ -40,7 +40,7 @@ > #include "prog_statevars.h" > #include "prog_parameter.h" > #include "main/samplerobj.h" > -#include "framebuffer.h" > +#include "main/framebuffer.h" > > > #define ONE_DIV_SQRT_LN2 (1.201122408786449815) > -- > 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 4/4] main: Use a derived value for the default sample count
Previously the framebuffer default sample count was taken directly from the value given by the application. On the i965 driver on HSW if the value wasn't one that is supported by the hardware it would hit an assert when it tried to program the state for it. This patch fixes it by adding a derived sample count to the state for the default framebuffer. The driver can then quantize this to one of the valid values in its UpdateState handler when the _NEW_BUFFERS state changes. _mesa_geometric_samples is changed to use the new derived value. Fixes the piglit test arb_framebuffer_no_attachments-query Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93957 Cc: Ilia Mirkin--- src/mesa/drivers/dri/i965/brw_context.c | 19 +++ src/mesa/main/framebuffer.h | 3 ++- src/mesa/main/mtypes.h | 4 3 files changed, 25 insertions(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c index 1032e5a..44d2fe4 100644 --- a/src/mesa/drivers/dri/i965/brw_context.c +++ b/src/mesa/drivers/dri/i965/brw_context.c @@ -167,6 +167,19 @@ intel_viewport(struct gl_context *ctx) } static void +intel_update_framebuffer(struct gl_context *ctx, + struct gl_framebuffer *fb) +{ + struct brw_context *brw = brw_context(ctx); + + /* Quantize the derived default number of samples +*/ + fb->DefaultGeometry._NumSamples = + intel_quantize_num_samples(brw->intelScreen, + fb->DefaultGeometry.NumSamples); +} + +static void intel_update_state(struct gl_context * ctx, GLuint new_state) { struct brw_context *brw = brw_context(ctx); @@ -245,6 +258,12 @@ intel_update_state(struct gl_context * ctx, GLuint new_state) } _mesa_lock_context_textures(ctx); + + if (new_state & _NEW_BUFFERS) { + intel_update_framebuffer(ctx, ctx->DrawBuffer); + if (ctx->DrawBuffer != ctx->ReadBuffer) + intel_update_framebuffer(ctx, ctx->ReadBuffer); + } } #define flushFront(screen) ((screen)->image.loader ? (screen)->image.loader->flushFrontBuffer : (screen)->dri2.loader->flushFrontBuffer) diff --git a/src/mesa/main/framebuffer.h b/src/mesa/main/framebuffer.h index ab077ed..fa434d4 100644 --- a/src/mesa/main/framebuffer.h +++ b/src/mesa/main/framebuffer.h @@ -97,7 +97,8 @@ static inline GLuint _mesa_geometric_samples(const struct gl_framebuffer *buffer) { return buffer->_HasAttachments ? - buffer->Visual.samples : buffer->DefaultGeometry.NumSamples; + buffer->Visual.samples : + buffer->DefaultGeometry._NumSamples; } static inline GLuint diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h index 58064aa..745eeb8 100644 --- a/src/mesa/main/mtypes.h +++ b/src/mesa/main/mtypes.h @@ -3223,6 +3223,10 @@ struct gl_framebuffer struct { GLuint Width, Height, Layers, NumSamples; GLboolean FixedSampleLocations; + /* Derived from NumSamples by the driver so that it can choose a valid + * value for the hardware. + */ + GLuint _NumSamples; } DefaultGeometry; /** \name Drawing bounds (Intersection of buffer size and scissor box) -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 3/4] program: Use _mesa_geometric_samples to calculate gl_NumSamples
Otherwise it won't take into account the default samples for framebuffers with no attachments. --- src/mesa/program/prog_statevars.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/mesa/program/prog_statevars.c b/src/mesa/program/prog_statevars.c index 12490d0..eed2412 100644 --- a/src/mesa/program/prog_statevars.c +++ b/src/mesa/program/prog_statevars.c @@ -40,6 +40,7 @@ #include "prog_statevars.h" #include "prog_parameter.h" #include "main/samplerobj.h" +#include "framebuffer.h" #define ONE_DIV_SQRT_LN2 (1.201122408786449815) @@ -352,7 +353,7 @@ _mesa_fetch_state(struct gl_context *ctx, const gl_state_index state[], } return; case STATE_NUM_SAMPLES: - ((int *)value)[0] = ctx->DrawBuffer->Visual.samples; + ((int *)value)[0] = _mesa_geometric_samples(ctx->DrawBuffer); return; case STATE_DEPTH_RANGE: value[0] = ctx->ViewportArray[0].Near;/* near */ -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/4] main: Use _mesa_geometric_samples to calculate GL_SAMPLE_BUFFERS
Otherwise it won't take into account the default samples for framebuffers with no attachments. --- src/mesa/main/get.c | 3 +++ src/mesa/main/get_hash_params.py | 2 +- 2 files changed, 4 insertions(+), 1 deletion(-) diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c index 3070597..f9b4a3c 100644 --- a/src/mesa/main/get.c +++ b/src/mesa/main/get.c @@ -1084,6 +1084,9 @@ find_custom_value(struct gl_context *ctx, const struct value_desc *d, union valu case GL_SAMPLES: v->value_int = _mesa_geometric_samples(ctx->DrawBuffer); break; + case GL_SAMPLE_BUFFERS: + v->value_int = _mesa_geometric_samples(ctx->DrawBuffer) > 0; + break; } } diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/get_hash_params.py index dd4aa8c..94b364a 100644 --- a/src/mesa/main/get_hash_params.py +++ b/src/mesa/main/get_hash_params.py @@ -80,7 +80,7 @@ descriptor=[ [ "SAMPLE_COVERAGE_ARB", "CONTEXT_BOOL(Multisample.SampleCoverage), NO_EXTRA" ], [ "SAMPLE_COVERAGE_VALUE_ARB", "CONTEXT_FLOAT(Multisample.SampleCoverageValue), NO_EXTRA" ], [ "SAMPLE_COVERAGE_INVERT_ARB", "CONTEXT_BOOL(Multisample.SampleCoverageInvert), NO_EXTRA" ], - [ "SAMPLE_BUFFERS_ARB", "BUFFER_INT(Visual.sampleBuffers), extra_new_buffers" ], + [ "SAMPLE_BUFFERS_ARB", "LOC_CUSTOM, TYPE_INT, 0, extra_new_buffers" ], [ "SAMPLES_ARB", "LOC_CUSTOM, TYPE_INT, 0, extra_new_buffers" ], # GL_ARB_sample_shading -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/4] main: Use _mesa_geometric_samples to calculate the value of GL_SAMPLES
Otherwise it won't take into account the default samples for framebuffers with no attachments. --- src/mesa/main/get.c | 4 src/mesa/main/get_hash_params.py | 2 +- 2 files changed, 5 insertions(+), 1 deletion(-) diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c index 0434836..3070597 100644 --- a/src/mesa/main/get.c +++ b/src/mesa/main/get.c @@ -1080,6 +1080,10 @@ find_custom_value(struct gl_context *ctx, const struct value_desc *d, union valu case GL_DISPATCH_INDIRECT_BUFFER_BINDING: v->value_int = ctx->DispatchIndirectBuffer->Name; break; + /* GL_ARB_multisample */ + case GL_SAMPLES: + v->value_int = _mesa_geometric_samples(ctx->DrawBuffer); + break; } } diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/get_hash_params.py index 04aec03..dd4aa8c 100644 --- a/src/mesa/main/get_hash_params.py +++ b/src/mesa/main/get_hash_params.py @@ -81,7 +81,7 @@ descriptor=[ [ "SAMPLE_COVERAGE_VALUE_ARB", "CONTEXT_FLOAT(Multisample.SampleCoverageValue), NO_EXTRA" ], [ "SAMPLE_COVERAGE_INVERT_ARB", "CONTEXT_BOOL(Multisample.SampleCoverageInvert), NO_EXTRA" ], [ "SAMPLE_BUFFERS_ARB", "BUFFER_INT(Visual.sampleBuffers), extra_new_buffers" ], - [ "SAMPLES_ARB", "BUFFER_INT(Visual.samples), extra_new_buffers" ], + [ "SAMPLES_ARB", "LOC_CUSTOM, TYPE_INT, 0, extra_new_buffers" ], # GL_ARB_sample_shading [ "SAMPLE_SHADING_ARB", "CONTEXT_BOOL(Multisample.SampleShading), extra_gl40_ARB_sample_shading" ], -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/2] texobj: Fix the completeness checks for cube textures
According to the GL 1.4 spec section 3.8.10, a cubemap texture is only complete if: • The level base arrays of each of the six texture images making up the cube map have identical, positive, and square dimensions. • The level base arrays were each specified with the same internal format. • The level base arrays each have the same border width. Previously the texture completeness code was only checking the first point. This patch makes it additionally check the other two. This fixes the following two dEQP tests: deqp-gles2.functional.texture.completeness.cube.format_mismatch_rgba_rgb_level_0_neg_z deqp-gles2.functional.texture.completeness.cube.format_mismatch_rgb_rgba_level_0_pos_z And also the Piglit test posted here: http://patchwork.freedesktop.org/patch/71333/ Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93792 Cc: Ilia Mirkin--- src/mesa/main/texobj.c | 12 +++- 1 file changed, 11 insertions(+), 1 deletion(-) diff --git a/src/mesa/main/texobj.c b/src/mesa/main/texobj.c index b107a8f..9ce7b4c 100644 --- a/src/mesa/main/texobj.c +++ b/src/mesa/main/texobj.c @@ -769,7 +769,8 @@ _mesa_test_texobj_completeness( const struct gl_context *ctx, } if (t->Target == GL_TEXTURE_CUBE_MAP_ARB) { - /* Make sure that all six cube map level 0 images are the same size. + /* Make sure that all six cube map level 0 images are the same size and + * format. * Note: we know that the image's width==height (we enforce that * at glTexImage time) so we only need to test the width here. */ @@ -784,6 +785,15 @@ _mesa_test_texobj_completeness( const struct gl_context *ctx, incomplete(t, BASE, "Cube face missing or mismatched size"); return; } + if (t->Image[face][baseLevel]->InternalFormat != + baseImage->InternalFormat) { +incomplete(t, BASE, "Cube face format mismatch"); +return; + } + if (t->Image[face][baseLevel]->Border != baseImage->Border) { +incomplete(t, BASE, "Cube face border size mismatch"); +return; + } } } -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/2] texobj: Remove redundant checks that the texture cube faces match size
The texture mipmap completeness checking code was checking whether all of the faces have the same size. However this is pointless because the code just above it checks whether the face has the expected size calculated for the mipmap level anyway so the error condition could never be reached. This patch just removes it. --- src/mesa/main/texobj.c | 10 -- 1 file changed, 10 deletions(-) diff --git a/src/mesa/main/texobj.c b/src/mesa/main/texobj.c index 9ce7b4c..e926c7b 100644 --- a/src/mesa/main/texobj.c +++ b/src/mesa/main/texobj.c @@ -868,16 +868,6 @@ _mesa_test_texobj_completeness( const struct gl_context *ctx, img->Depth2); return; } - - /* Extra checks for cube textures */ - if (face > 0) { - /* check that cube faces are the same size */ - if (img->Width2 != t->Image[0][i]->Width2 || - img->Height2 != t->Image[0][i]->Height2) { -incomplete(t, MIPMAP, "CubeMap Image[n][i] bad size"); -return; - } - } } } -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965/meta-fast-clear: Convert the clear color through the surf format
Bump. Anyone fancy reviewing this small patch? I think it would be good to have because it makes the code a bit simpler as well as fixing a corner case and making it more robust. - Neil Neil Roberts <n...@linux.intel.com> writes: > When programming the fast clear color there was previously a chunk of > code to try to make the color match the constraints of the surface > format such as by filling in missing components and handling luminance > formats. These cases are not handled by the hardware. There are some > additional possible restrictions that the hardware does seem to > handle, such as clamping to [0,1] for normalised formats. However for > whatever reason it doesn't clamp to [0,∞] for the special float > formats that don't have a sign bit. Rather than adding yet another > special case for this format this patch makes it instead convert the > color to the actual surface format and back again so that we can be > sure it will have all of the possible restrictions. Additionally this > would avoid some other potential surprises such as getting more > precision for the clear color when fast clears are used. > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93338 > --- > src/mesa/drivers/dri/i965/brw_meta_fast_clear.c | 57 > - > 1 file changed, 27 insertions(+), 30 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c > b/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c > index cf0e56b..29ae6f0 100644 > --- a/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c > +++ b/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c > @@ -37,6 +37,8 @@ > #include "main/uniforms.h" > #include "main/fbobject.h" > #include "main/texobj.h" > +#include "main/format_unpack.h" > +#include "main/format_pack.h" > > #include "main/api_validate.h" > #include "main/state.h" > @@ -397,45 +399,40 @@ set_fast_clear_color(struct brw_context *brw, > struct intel_mipmap_tree *mt, > const union gl_color_union *color) > { > + mesa_format linear_format = _mesa_get_srgb_format_linear(mt->format); > union gl_color_union override_color = *color; > - > - /* The sampler doesn't look at the format of the surface when the fast > -* clear color is used so we need to implement luminance, intensity and > -* missing components manually. > -*/ > - switch (_mesa_get_format_base_format(mt->format)) { > - case GL_INTENSITY: > - override_color.ui[3] = override_color.ui[0]; > - /* flow through */ > - case GL_LUMINANCE: > - case GL_LUMINANCE_ALPHA: > - override_color.ui[1] = override_color.ui[0]; > - override_color.ui[2] = override_color.ui[0]; > - break; > - default: > - for (int i = 0; i < 3; i++) { > - if (!_mesa_format_has_color_component(mt->format, i)) > -override_color.ui[i] = 0; > - } > - break; > - } > - > - if (!_mesa_format_has_color_component(mt->format, 3)) { > - if (_mesa_is_format_integer_color(mt->format)) > - override_color.ui[3] = 1; > - else > - override_color.f[3] = 1.0f; > - } > + union gl_color_union tmp_color; > > /* Handle linear→SRGB conversion */ > - if (brw->ctx.Color.sRGBEnabled && > - _mesa_get_srgb_format_linear(mt->format) != mt->format) { > + if (brw->ctx.Color.sRGBEnabled && linear_format != mt->format) { >for (int i = 0; i < 3; i++) { > override_color.f[i] = > util_format_linear_to_srgb_float(override_color.f[i]); >} > } > > + /* Convert the clear color to the surface format and back so that the > color > +* returned when sampling is guaranteed to be a value that could be stored > +* in the surface. For example if the surface is a luminance format and we > +* clear to 0.5,0.75,0.1,0.2 we want the color to come back as > +* 0.5,0.5,0.5,1.0. In general the hardware doesn't seem to look at the > +* surface format when returning the clear color so we need to do this to > +* implement luminance, intensity and missing components. However it does > +* seem to look at it in some cases such as to clamp to the range [0,1] > for > +* unorm formats. Suprisingly however it doesn't clamp to [0,∞] for the > +* special float formats that don't have a sign bit. > +*/ > + if (!_mesa_is_format_integer_color(linear_format)) { > + _mesa_pack_float_rgba_row(linear_format, > +1, /* n_pixels */ > +
[Mesa-dev] [PATCH] texobj: Check completeness with InternalFormat rather than Mesa format
The internal Mesa format used for a texture might not match the one requested in the internalFormat when the texture was created, for example if the driver is internally remapping RGB textures to RGBA. Otherwise it can cause false positives for completeness if one mipmap image is created as RGBA and the other as RGB because they would both have an RGBA Mesa format. If we check the InternalFormat instead then we are directly checking the API usage which I think better matches the intention of the check. https://bugs.freedesktop.org/show_bug.cgi?id=93700 --- src/mesa/main/texobj.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/mesa/main/texobj.c b/src/mesa/main/texobj.c index 547055e..b107a8f 100644 --- a/src/mesa/main/texobj.c +++ b/src/mesa/main/texobj.c @@ -835,7 +835,7 @@ _mesa_test_texobj_completeness( const struct gl_context *ctx, incomplete(t, MIPMAP, "TexImage[%d] is missing", i); return; } - if (img->TexFormat != baseImage->TexFormat) { + if (img->InternalFormat != baseImage->InternalFormat) { incomplete(t, MIPMAP, "Format[i] != Format[baseLevel]"); return; } -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] Revert "i965: Use MESA_FORMAT_B8G8R8X8_SRGB for RGB visuals"
This reverts commit 839793680f99b8387bee9489733d5071c10f3ace. The patch was breaking DRI3 because driGLFormatToImageFormat does not handle MESA_FORMAT_B8G8R8X8_SRGB which ended up making it fail to create the renderbuffer and it would later crash. It's not trivial to add this format because there is no __DRI_IMAGE_FORMAT nor __DRI_IMAGE_FOURCC define for the format either. I'm not sure how difficult adding this would be and whether adding a new format would require some sort of new version for DRI. Seeing as this might take a while to fix I think it makes sense to just revert the patch in the meantime in order to avoid regressing master. It is also not handled in intel_gles3_srgb_workaround and there may be other cases where it breaks. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93388 --- src/mesa/drivers/dri/i965/intel_screen.c | 13 +++-- 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/src/mesa/drivers/dri/i965/intel_screen.c b/src/mesa/drivers/dri/i965/intel_screen.c index 825a7c1..cc90efe 100644 --- a/src/mesa/drivers/dri/i965/intel_screen.c +++ b/src/mesa/drivers/dri/i965/intel_screen.c @@ -999,13 +999,14 @@ intelCreateBuffer(__DRIscreen * driScrnPriv, fb->Visual.samples = num_samples; } - if (mesaVis->redBits == 5) { + if (mesaVis->redBits == 5) rgbFormat = MESA_FORMAT_B5G6R5_UNORM; - } else { - if (mesaVis->alphaBits == 0) - rgbFormat = MESA_FORMAT_B8G8R8X8_SRGB; - else - rgbFormat = MESA_FORMAT_B8G8R8A8_SRGB; + else if (mesaVis->sRGBCapable) + rgbFormat = MESA_FORMAT_B8G8R8A8_SRGB; + else if (mesaVis->alphaBits == 0) + rgbFormat = MESA_FORMAT_B8G8R8X8_UNORM; + else { + rgbFormat = MESA_FORMAT_B8G8R8A8_SRGB; fb->Visual.sRGBCapable = true; } -- 1.9.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965: handle stencil_bits parameter for MESA_FORMAT_B8G8R8X8_UNORM format.
Ilia Mirkinwrites: > I suspect that something like this may be the right thing for the > intel driver... no reason not to expose sRGB-capable visuals when it > so happens that alpha == 0. Also probably the same treatment should be > done for BGR565... sRGB encoding will matter even more there, and from > the looks of it, all gens support that. > > diff --git a/src/mesa/drivers/dri/i965/intel_screen.c > b/src/mesa/drivers/dri/i965/intel_screen.c > index cc90efe..75f2cce 100644 > --- a/src/mesa/drivers/dri/i965/intel_screen.c > +++ b/src/mesa/drivers/dri/i965/intel_screen.c > @@ -1003,9 +1003,10 @@ intelCreateBuffer(__DRIscreen * driScrnPriv, >rgbFormat = MESA_FORMAT_B5G6R5_UNORM; > else if (mesaVis->sRGBCapable) >rgbFormat = MESA_FORMAT_B8G8R8A8_SRGB; > - else if (mesaVis->alphaBits == 0) > - rgbFormat = MESA_FORMAT_B8G8R8X8_UNORM; > - else { > + else if (mesaVis->alphaBits == 0) { > + rgbFormat = MESA_FORMAT_B8G8R8X8_SRGB; > + fb->Visual.sRGBCapable = true; > + } else { >rgbFormat = MESA_FORMAT_B8G8R8A8_SRGB; >fb->Visual.sRGBCapable = true; > } Thanks for the suggestion. I assume you didn't want to make this into a proper patch yourself so I went ahead and did it here: http://patchwork.freedesktop.org/patch/67844/ Along with the other two patches it doesn't cause any regressions in our CI system. I left the 565 format alone because there is no enum for MESA_FORMAT_B5G6R5_SRGB and it seemed like it's better treated as a separate issue. We should also fix the fact that the visuals aren't marked as sRGB-capable even though they are, but again I think that is a separate issue. Regards, - Neil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/3] i965: Add MESA_FORMAT_B8G8R8X8_SRGB to brw_format_for_mesa_format
This will be used in a subsequent patch as the format for RGB visuals. Cc: "11.0 11.1"Cc: Ilia Mirkin Suggested-by: Ilia Mirkin --- src/mesa/drivers/dri/i965/brw_surface_formats.c | 1 + 1 file changed, 1 insertion(+) diff --git a/src/mesa/drivers/dri/i965/brw_surface_formats.c b/src/mesa/drivers/dri/i965/brw_surface_formats.c index ff8aac2..3fc47c3 100644 --- a/src/mesa/drivers/dri/i965/brw_surface_formats.c +++ b/src/mesa/drivers/dri/i965/brw_surface_formats.c @@ -408,6 +408,7 @@ brw_format_for_mesa_format(mesa_format mesa_format) [MESA_FORMAT_A8R8G8B8_SRGB] = 0, [MESA_FORMAT_R8G8B8A8_SRGB] = BRW_SURFACEFORMAT_R8G8B8A8_UNORM_SRGB, [MESA_FORMAT_X8R8G8B8_SRGB] = 0, + [MESA_FORMAT_B8G8R8X8_SRGB] = BRW_SURFACEFORMAT_B8G8R8X8_UNORM_SRGB, [MESA_FORMAT_L_SRGB8] = BRW_SURFACEFORMAT_L8_UNORM_SRGB, [MESA_FORMAT_L8A8_SRGB] = BRW_SURFACEFORMAT_L8A8_UNORM_SRGB, [MESA_FORMAT_A8L8_SRGB] = 0, -- 1.9.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 3/3] i965: Use MESA_FORMAT_B8G8R8X8_SRGB for RGB visuals
Previously if the visual didn't have an alpha channel then it would pick a format that is not sRGB-capable. I don't think there's any reason not to always have an sRGB-capable visual. Since 28090b30 there are now visuals advertised without an alpha channel which means that games that don't request alpha bits in the config would end up without an sRGB-capable visual. This was breaking supertuxkart which assumes the winsys buffer is always sRGB-capable. The previous code always used an RGBA format if the visual config itself was marked as sRGB-capable regardless of whether the visual has alpha bits. I think we don't actually advertise any sRGB-capable visuals (but we just use sRGB formats anyway) so it shouldn't make any difference. However this patch also changes it to use RGBX if an sRGB-capable visual is requested without alpha bits for consistency. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92759 Cc: "11.0 11.1"Cc: Ilia Mirkin Suggested-by: Ilia Mirkin --- src/mesa/drivers/dri/i965/intel_screen.c | 13 ++--- 1 file changed, 6 insertions(+), 7 deletions(-) diff --git a/src/mesa/drivers/dri/i965/intel_screen.c b/src/mesa/drivers/dri/i965/intel_screen.c index cc90efe..825a7c1 100644 --- a/src/mesa/drivers/dri/i965/intel_screen.c +++ b/src/mesa/drivers/dri/i965/intel_screen.c @@ -999,14 +999,13 @@ intelCreateBuffer(__DRIscreen * driScrnPriv, fb->Visual.samples = num_samples; } - if (mesaVis->redBits == 5) + if (mesaVis->redBits == 5) { rgbFormat = MESA_FORMAT_B5G6R5_UNORM; - else if (mesaVis->sRGBCapable) - rgbFormat = MESA_FORMAT_B8G8R8A8_SRGB; - else if (mesaVis->alphaBits == 0) - rgbFormat = MESA_FORMAT_B8G8R8X8_UNORM; - else { - rgbFormat = MESA_FORMAT_B8G8R8A8_SRGB; + } else { + if (mesaVis->alphaBits == 0) + rgbFormat = MESA_FORMAT_B8G8R8X8_SRGB; + else + rgbFormat = MESA_FORMAT_B8G8R8A8_SRGB; fb->Visual.sRGBCapable = true; } -- 1.9.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/3] i965: Add B8G8R8X8_SRGB to the alpha format override
brw_init_surface_formats overrides the render format for RGBX formats which aren't supported for rendering so that they internally use RGBA instead. However, B8G8R8X8_SRGB was missing so it wasn't marked as a renderable format. This patch just adds it. Cc: "11.0 11.1"Cc: Ilia Mirkin --- src/mesa/drivers/dri/i965/brw_surface_formats.c | 4 1 file changed, 4 insertions(+) This might conflict when ported to the 11.x branches because the code above doesn't have the if statement there. It should be fine to add it without the if statement on those branches although it shouldn't matter if it does have it either. diff --git a/src/mesa/drivers/dri/i965/brw_surface_formats.c b/src/mesa/drivers/dri/i965/brw_surface_formats.c index 3fc47c3..7bc8b8b 100644 --- a/src/mesa/drivers/dri/i965/brw_surface_formats.c +++ b/src/mesa/drivers/dri/i965/brw_surface_formats.c @@ -674,6 +674,10 @@ brw_init_surface_formats(struct brw_context *brw) if (gen < tinfo->render_target) render = BRW_SURFACEFORMAT_B8G8R8A8_UNORM; break; + case BRW_SURFACEFORMAT_B8G8R8X8_UNORM_SRGB: + if (gen < tinfo->render_target) +render = BRW_SURFACEFORMAT_B8G8R8A8_UNORM_SRGB; + break; case BRW_SURFACEFORMAT_R8G8B8X8_UNORM: render = BRW_SURFACEFORMAT_R8G8B8A8_UNORM; break; -- 1.9.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] i965/meta-fast-clear: Convert the clear color through the surf format
When programming the fast clear color there was previously a chunk of code to try to make the color match the constraints of the surface format such as by filling in missing components and handling luminance formats. These cases are not handled by the hardware. There are some additional possible restrictions that the hardware does seem to handle, such as clamping to [0,1] for normalised formats. However for whatever reason it doesn't clamp to [0,∞] for the special float formats that don't have a sign bit. Rather than adding yet another special case for this format this patch makes it instead convert the color to the actual surface format and back again so that we can be sure it will have all of the possible restrictions. Additionally this would avoid some other potential surprises such as getting more precision for the clear color when fast clears are used. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93338 --- src/mesa/drivers/dri/i965/brw_meta_fast_clear.c | 57 - 1 file changed, 27 insertions(+), 30 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c b/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c index cf0e56b..29ae6f0 100644 --- a/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c +++ b/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c @@ -37,6 +37,8 @@ #include "main/uniforms.h" #include "main/fbobject.h" #include "main/texobj.h" +#include "main/format_unpack.h" +#include "main/format_pack.h" #include "main/api_validate.h" #include "main/state.h" @@ -397,45 +399,40 @@ set_fast_clear_color(struct brw_context *brw, struct intel_mipmap_tree *mt, const union gl_color_union *color) { + mesa_format linear_format = _mesa_get_srgb_format_linear(mt->format); union gl_color_union override_color = *color; - - /* The sampler doesn't look at the format of the surface when the fast -* clear color is used so we need to implement luminance, intensity and -* missing components manually. -*/ - switch (_mesa_get_format_base_format(mt->format)) { - case GL_INTENSITY: - override_color.ui[3] = override_color.ui[0]; - /* flow through */ - case GL_LUMINANCE: - case GL_LUMINANCE_ALPHA: - override_color.ui[1] = override_color.ui[0]; - override_color.ui[2] = override_color.ui[0]; - break; - default: - for (int i = 0; i < 3; i++) { - if (!_mesa_format_has_color_component(mt->format, i)) -override_color.ui[i] = 0; - } - break; - } - - if (!_mesa_format_has_color_component(mt->format, 3)) { - if (_mesa_is_format_integer_color(mt->format)) - override_color.ui[3] = 1; - else - override_color.f[3] = 1.0f; - } + union gl_color_union tmp_color; /* Handle linear→SRGB conversion */ - if (brw->ctx.Color.sRGBEnabled && - _mesa_get_srgb_format_linear(mt->format) != mt->format) { + if (brw->ctx.Color.sRGBEnabled && linear_format != mt->format) { for (int i = 0; i < 3; i++) { override_color.f[i] = util_format_linear_to_srgb_float(override_color.f[i]); } } + /* Convert the clear color to the surface format and back so that the color +* returned when sampling is guaranteed to be a value that could be stored +* in the surface. For example if the surface is a luminance format and we +* clear to 0.5,0.75,0.1,0.2 we want the color to come back as +* 0.5,0.5,0.5,1.0. In general the hardware doesn't seem to look at the +* surface format when returning the clear color so we need to do this to +* implement luminance, intensity and missing components. However it does +* seem to look at it in some cases such as to clamp to the range [0,1] for +* unorm formats. Suprisingly however it doesn't clamp to [0,∞] for the +* special float formats that don't have a sign bit. +*/ + if (!_mesa_is_format_integer_color(linear_format)) { + _mesa_pack_float_rgba_row(linear_format, +1, /* n_pixels */ +(const GLfloat (*)[4]) override_color.f, +_color); + _mesa_unpack_rgba_row(linear_format, +1, /* n_pixels */ +_color, +(GLfloat (*)[4]) override_color.f); + } + if (brw->gen >= 9) { mt->gen9_fast_clear_color = override_color; } else { -- 1.9.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [Mesa-stable] [PATCH] i965: Fix crash when calling glViewport with no surface bound
Emil Velikovwrites: > Worth throwing in 11.0 as well ? Yeah, that would probably be sensible. >> if (_mesa_is_winsys_fbo(ctx->DrawBuffer)) { >> - dri2InvalidateDrawable(driContext->driDrawablePriv); >> - dri2InvalidateDrawable(driContext->driReadablePriv); >> + if (driContext->driDrawablePriv) >> + dri2InvalidateDrawable(driContext->driDrawablePriv); >> + if (driContext->driReadablePriv) >> + dri2InvalidateDrawable(driContext->driReadablePriv); > > Afaict i915 could use an identical fix ? Yes, I think you're right. However I don't have any way of testing it so I feel a bit uncomfortable touching i915 driver. Regards, - Neil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] i965: Fix crash when calling glViewport with no surface bound
If EGL_KHR_surfaceless_context is used then glViewport can be called with NULL for the draw and read surfaces. This was previously causing a crash because the i965 driver tries to use this point to invalidate the surfaces and it was derferencing the NULL pointer. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93257 Cc: Nanley CheryCc: "11.1" --- I've also posted a Piglit test for this here: http://patchwork.freedesktop.org/patch/67356/ src/mesa/drivers/dri/i965/brw_context.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c index 7d7643c..5374bad 100644 --- a/src/mesa/drivers/dri/i965/brw_context.c +++ b/src/mesa/drivers/dri/i965/brw_context.c @@ -159,8 +159,10 @@ intel_viewport(struct gl_context *ctx) __DRIcontext *driContext = brw->driContext; if (_mesa_is_winsys_fbo(ctx->DrawBuffer)) { - dri2InvalidateDrawable(driContext->driDrawablePriv); - dri2InvalidateDrawable(driContext->driReadablePriv); + if (driContext->driDrawablePriv) + dri2InvalidateDrawable(driContext->driDrawablePriv); + if (driContext->driReadablePriv) + dri2InvalidateDrawable(driContext->driReadablePriv); } } -- 1.9.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] nir: Optimize useless comparisons against true/false.
Matt Turnerwrites: > # Written in the form (, ) where is an expression > # and is either an expression or a value. An expression is > # defined as a tuple of the form (, , , , ) > @@ -94,6 +97,8 @@ optimizations = [ > (('inot', ('ige', a, b)), ('ilt', a, b)), > (('inot', ('ieq', a, b)), ('ine', a, b)), > (('inot', ('ine', a, b)), ('ieq', a, b)), > + (('ieq', 'a@bool', true), a), > + (('ine', 'a@bool', false), a), I think this second one is already in the file on line 187 here: # Boolean simplifications (('ine', 'a@bool', 0), 'a'), (('ieq', 'a@bool', 0), ('inot', 'a')), Maybe you could add the first one near these two? It could be good to add a line like this as well to complete the set: (('ine', 'a@bool', true), ('inot', a)), Regards, - Neil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 0.5/5] i965/gen9/fast-clear: Handle linear???SRGB conversion
"Pohjolainen, Topi"writes: >> + /* Handle linear→SRGB conversion */ >> + if (brw->ctx.Color.sRGBEnabled && >> + _mesa_get_srgb_format_linear(mt->format) != mt->format) { > > Patch five disables fast clear for single-sampled if > brw->ctx.Color.sRGBEnabled is set. How about something like this: > > /* Fast clears are disabled for single-sampled buffers when > * sRGBEnabled is set. > */ > assert(mt->num_samples > 1); That could be a good idea but the thing that makes me a little uncomfortable is that it should actually be fine to do a fast clear with GL_FRAMEBUFFER_SRGB enabled and it's only actually a problem when it comes to rendering. Eg, if we didn't have patch 5 you could clear with SRGB enabled and then disable it before rendering something else and everything would be fine. Patch 5 is really only an optimisation which implements the heuristic that if GL_FRAMEBUFFER_SRGB is enabled when you do the clear then it's very likely that it's still going to be enabled when you render something else. The patch isn't a hard requirement because without it it would just do the resolve before rendering so it would still work, but it's just a bit wasteful. - Neil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/5] i965/gen8+: Don't upload the MCS buffer for single-sampled textures
> On Thu, Nov 26, 2015 at 11:18:34AM +0200, Pohjolainen, Topi wrote: >> On Wed, Nov 25, 2015 at 11:01:18AM -0800, Ben Widawsky wrote: >> > On Wed, Nov 25, 2015 at 06:36:36PM +0100, Neil Roberts wrote: >> > > For single-sampled textures the MCS buffer is only used to implement >> > > fast clears. However the surface always needs to be resolved before >> > > being used as a texture anyway so the the MCS buffer doesn't actually >> > > achieve anything. This is important for Gen9 because in that case SRGB >> > >> > I admit a good deal of ignorance, but why do we have to do a resolve >> > before we >> > sample from it? I thought the whole point of the MCS was that we can >> > sample from >> > it without a resolve. >> >> This is my understanding also, I can't see much benefit from the fast clear >> if it would need to be anyway resolved before reading it using GPU (reading >> using CPU is then another story of course). > > I know we have this piece of text in bspec: > > "If the MCS is enabled on a non-multisampled render target, the render > target must be resolved before being used for other purposes (display, > texture, CPU lock)." > > But on the other hand the fact that surface state supports definition of fast > cleared MCS buffer even for sampler engine suggest that "texture" in that > sentence refers to something else than sampling. Not to mention that we > already sample fast cleared color buffers with current Mesa driver and seem > to have no problems with it. The resolve needs to be done for single-sampled render targets but for multisample targets we can safely sample from the MCS buffer. The MCS (or CCS or whatever it should be called in this case) is still useful even if we have to resolve because it saves memory bandwidth during rendering. Eg, the initial clear writes less data because it only writes to the smaller MCS buffer and not the larger color buffer. Any blending operations during rendering can also read from the CCS instead of the color buffer. I guess the assumption is that the bandwidth savings during rendering outweigh the extra cost added by having to resolve it. To be clear, this patch doesn't add the resolve step needed before texturing, nor does anything in the series. The resolve step is already done for all gens in intel_update_state. Look at the bit with the comment “Resolve depth buffer and render cache of each enabled texture”. It calls intel_miptree_resolve_color which resolves the CCS buffer for each singled-sampled surface that is going to be used as a texture. The only change this patch makes it to not not even set the aux buffer in the texture surface state. - Neil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] util: Tiny optimisation for the linear→srgb conversion
When converting 0.0 it would be nice if it didn't do any arithmetic. --- src/util/format_srgb.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/util/format_srgb.h b/src/util/format_srgb.h index 4a8d73f..34b50af 100644 --- a/src/util/format_srgb.h +++ b/src/util/format_srgb.h @@ -57,7 +57,7 @@ util_format_linear_to_srgb_helper_table[104]; static inline float util_format_linear_to_srgb_float(float cl) { - if (cl < 0.0f) + if (cl <= 0.0f) return 0.0f; else if (cl < 0.0031308f) return 12.92f * cl; -- 1.9.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/5] i965/meta-fast-clear: Disable GL_FRAMEBUFFER_SRGB during clear
Adds MESA_META_FRAMEBUFFER_SRGB to the meta save state so that GL_FRAMEBUFFER_SRGB will be disabled when performing the fast clear. That way the render surface state will be programmed with the linear equivalent format during the clear. This is important for Gen9 because the SRGB formats are not marked as losslessly compressible so in theory they aren't support for fast clears. It shouldn't make any difference whether GL_FRAMEBUFFER_SRGB is enabled for the fast clear operation because the color is not actually written to the framebuffer so there is no chance for the hardware to apply the SRGB conversion on it anyway. --- src/mesa/drivers/dri/i965/brw_meta_fast_clear.c | 16 1 file changed, 16 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c b/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c index cf0e56b..b32db3f 100644 --- a/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c +++ b/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c @@ -505,8 +505,21 @@ fast_clear_attachments(struct brw_context *brw, uint32_t fast_clear_buffers, struct rect fast_clear_rect) { + struct gl_context *ctx = >ctx; + bool srgb_enabled = ctx->Color.sRGBEnabled; + assert(brw->gen >= 9); + /* Make sure the GL_FRAMEBUFFER_SRGB is disabled during fast clear so that +* the surface state will always be uploaded with a linear buffer. SRGB +* buffers are not supported on Gen9 because they are not marked as +* losslessly compressible. This shouldn't matter for the fast clear +* because the color is not written to the framebuffer anyway so the +* hardware doesn't need to do any SRGB conversion. +*/ + if (srgb_enabled) + _mesa_set_framebuffer_srgb(ctx, GL_FALSE); + brw_bind_rep_write_shader(brw, (float *) fast_clear_color); /* SKL+ also has a resolve mode for compressed render targets and thus more @@ -533,6 +546,9 @@ fast_clear_attachments(struct brw_context *brw, } set_fast_clear_op(brw, 0); + + if (srgb_enabled) + _mesa_set_framebuffer_srgb(ctx, GL_TRUE); } bool -- 1.9.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 5/5] i965/gen9: Don't do fast clears when GL_FRAMEBUFFER_SRGB is enabled
When GL_FRAMEBUFFER_SRGB is enabled any single-sampled renderbuffers are resolved in intel_update_state because the hardware can't cope with fast clears on SRGB buffers. In that case it's pointless to do a fast clear because it will just be immediately resolved. --- src/mesa/drivers/dri/i965/brw_meta_fast_clear.c | 11 +++ 1 file changed, 11 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c b/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c index b32db3f..56c90b7 100644 --- a/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c +++ b/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c @@ -603,6 +603,17 @@ brw_meta_fast_clear(struct brw_context *brw, struct gl_framebuffer *fb, brw->render_target_format[irb->mt->format]) clear_type = REP_CLEAR; + /* Gen9 doesn't support fast clear on single-sampled SRGB buffers. When + * GL_FRAMEBUFFER_SRGB is enabled any color renderbuffers will be + * resolved in intel_update_state. In that case it's pointless to do a + * fast clear because it's very likely to be immediately resolved. + */ + if (brw->gen >= 9 && + irb->mt->num_samples <= 1 && + brw->ctx.Color.sRGBEnabled && + _mesa_get_srgb_format_linear(irb->mt->format) != irb->mt->format) + clear_type = REP_CLEAR; + if (irb->mt->fast_clear_state == INTEL_FAST_CLEAR_STATE_NO_MCS) clear_type = REP_CLEAR; -- 1.9.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 4/5] i965/gen9: Allow fast clears for non-MSRT SRGB buffers
SRGB buffers are not marked as losslessly compressible so previously they would not be used for fast clears. However in practice the hardware will never actually see that we are using SRGB buffers for fast clears if we use the linear equivalent format when clearing and make sure to resolve the buffer as a linear format before sampling from it. This is an important use case because by default the window system framebuffers are created as SRGB so without this fast clears won't be used there. --- src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c index 87e0136..88c0a19 100644 --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c @@ -259,7 +259,8 @@ intel_miptree_supports_non_msrt_fast_clear(struct brw_context *brw, return false; if (brw->gen >= 9) { - const uint32_t brw_format = brw_format_for_mesa_format(mt->format); + mesa_format linear_format = _mesa_get_srgb_format_linear(mt->format); + const uint32_t brw_format = brw_format_for_mesa_format(linear_format); return brw_losslessly_compressible_format(brw, brw_format); } else return true; -- 1.9.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/5] i965/gen8+: Don't upload the MCS buffer for single-sampled textures
For single-sampled textures the MCS buffer is only used to implement fast clears. However the surface always needs to be resolved before being used as a texture anyway so the the MCS buffer doesn't actually achieve anything. This is important for Gen9 because in that case SRGB surfaces are not supported for fast clears and we don't want the hardware to see the MCS buffer in that case. --- src/mesa/drivers/dri/i965/gen8_surface_state.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/gen8_surface_state.c b/src/mesa/drivers/dri/i965/gen8_surface_state.c index 80252a5..075424a 100644 --- a/src/mesa/drivers/dri/i965/gen8_surface_state.c +++ b/src/mesa/drivers/dri/i965/gen8_surface_state.c @@ -225,7 +225,11 @@ gen8_emit_texture_surface_state(struct brw_context *brw, pitch = mt->pitch; } - if (mt->mcs_mt) { + /* The MCS is not uploaded for single-sampled surfaces because the color +* buffer should always have been resolved before it is used as a texture +* so there is no need for it. +*/ + if (mt->mcs_mt && mt->num_samples > 1) { aux_mt = mt->mcs_mt; aux_mode = GEN8_SURFACE_AUX_MODE_MCS; -- 1.9.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 0.5/5] i965/gen9/fast-clear: Handle linear→SRGB conversion
If GL_FRAMEBUFFER_SRGB is enabled when writing to an SRGB-capable framebuffer then the color will be converted from linear to SRGB before being written. There is no chance for the hardware to do this itself because it can't modify the clear color that is programmed in the surface state so it seems pretty clear that the driver should be handling this itself. Note that this wasn't a problem before Gen9 because previously we were only able to do fast clears to 0 or 1 and those values are the same in linear and SRGB space. --- I noticed this after posting the patch series but I think it belongs in this series because it only causes a problem once we enable fast clears for MSRTs. I made a test case to demonstrate this here: http://patchwork.freedesktop.org/patch/66222/ I tried to think what other similar problems there could be such as whether we need to clamp the values to [0,1] for normalised formats. However this seems to just work so I guess the sampler hardware does look at the surface format enough to determine it needs to clamp the clear color. It's a bit odd what the sampler does and doesn't do with the fast clear color. I made a test case for this as well: http://patchwork.freedesktop.org/patch/66223/ Annoyingly it doesn't clamp the value correctly for GL_R11F_G11F_B10F_EXT. That format is floating-point but it has no signed bits so I think it should clamp to a minimum of 0. I have a github branch with all of my SKL fast clear patches here: https://github.com/bpeel/mesa/commits/skl-fast-clear src/mesa/drivers/dri/i965/brw_meta_fast_clear.c | 11 +++ 1 file changed, 11 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c b/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c index 1b2ea42..f1920b2 100644 --- a/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c +++ b/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c @@ -41,6 +41,8 @@ #include "main/api_validate.h" #include "main/state.h" +#include "util/format_srgb.h" + #include "vbo/vbo_context.h" #include "drivers/common/meta.h" @@ -424,6 +426,15 @@ set_fast_clear_color(struct brw_context *brw, override_color.f[3] = 1.0f; } + /* Handle linear→SRGB conversion */ + if (brw->ctx.Color.sRGBEnabled && + _mesa_get_srgb_format_linear(mt->format) != mt->format) { + for (int i = 0; i < 3; i++) { + override_color.f[i] = +util_format_linear_to_srgb_float(override_color.f[i]); + } + } + if (brw->gen >= 9) { mt->gen9_fast_clear_color = override_color; } else { -- 1.9.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 0/5] i965/gen9: Support fast clear on SRGB buffers
Currently fast clears are disabled on SKL with SRGB buffers because they are not marked as losslessly compressible in the formats table. This is less than ideal because by default the window system buffers are created as SRGB so in practice it means an application is unlikely to end up using fast clears except for FBOs. However unless the application enables GL_FRAMEBUFFER_SRGB the surface format given to the hardware is always its linear equivalent so it shouldn't actually be a problem. The aim of this series is to make that use case work. Sadly there is a second obstacle stopping window system buffers from using fast clears which as that they are always x-tiled and supposedly SKL doesn't support that. This is explicitly disabled in intel_tiling_supports_non_msrt_mcs. However if I remove that restriction I couldn't find any problems so it might just work to remove it. All of my SKL fast clear patches are in a branch here: https://github.com/bpeel/mesa/commits/skl-fast-clear ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 3/5] i965/gen9: Resolve SRGB color buffers when GL_FRAMEBUFFER_SRGB enabled
SKL can't cope with the CCS buffer for SRGB buffers. Normally the hardware won't see the SRGB formats because when GL_FRAMEBUFFER_SRGB is disabled these get mapped to their linear equivalents. In order to avoid relying on the CCS buffer when it is enabled this patch now makes it flush the renderbuffers. --- I made a test case for this here: http://patchwork.freedesktop.org/patch/66225/ src/mesa/drivers/dri/i965/brw_context.c | 25 + 1 file changed, 25 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c index 7e2fdcb..a8c1052 100644 --- a/src/mesa/drivers/dri/i965/brw_context.c +++ b/src/mesa/drivers/dri/i965/brw_context.c @@ -197,6 +197,31 @@ intel_update_state(struct gl_context * ctx, GLuint new_state) brw_render_cache_set_check_flush(brw, tex_obj->mt->bo); } + /* If FRAMEBUFFER_SRGB is used on Gen9+ then we need to resolve any of the +* single-sampled color renderbuffers because the CCS buffer isn't +* supported for SRGB formats and the blending wouldn't work. +*/ + if (brw->gen >= 9 && ctx->Color.sRGBEnabled) { + struct gl_framebuffer *fb = ctx->DrawBuffer; + for (int i = 0; i < fb->_NumColorDrawBuffers; i++) { + struct gl_renderbuffer *rb = fb->_ColorDrawBuffers[i]; + + if (rb == NULL) +continue; + + struct intel_renderbuffer *irb = intel_renderbuffer(rb); + struct intel_mipmap_tree *mt = irb->mt; + + if (mt == NULL || + mt->num_samples > 1 || + _mesa_get_srgb_format_linear(mt->format) == mt->format) + continue; + + intel_miptree_resolve_color(brw, mt); + brw_render_cache_set_check_flush(brw, mt->bo); + } + } + _mesa_lock_context_textures(ctx); } -- 1.9.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965: Always use Y-tiled buffers on SKL+
Hi, Has this situation changed at all? It's probably quite important to get this working because we have to disable fast clears for X-tiled buffers on SKL which effectively means we currently can't do it for window system buffers. Regards, - Neil Chris Wilsonwrites: > On Mon, Apr 13, 2015 at 04:31:29PM +0200, Daniel Vetter wrote: >> On Sat, Apr 11, 2015 at 01:16:11PM -0700, Ben Widawsky wrote: >> > Starting with Skylake, the display engine is capable of scanning out from >> > Y-tiled buffers. As such, we can and should use Y-tiling for better >> > efficiency. >> > >> > Note that the buffer allocation done for mipmaps will already never >> > allocate an >> > X-tiled buffer for GEN9. >> > >> > Signed-off-by: Ben Widawsky >> >> You need a recent enough ddx to make use of Y-tiled buffers, which atm >> still doesn't yet exist. This would at least need some kind of handshake >> with the compositor to make sure it understands this, presuming I didn't >> miss something. > > You can send Y-tiled buffers to the DDX. The problem is that the kernel > won't allow us to display them and so we will (and always have been) > copying from them. > -Chris > > -- > Chris Wilson, Intel Open Source Technology Centre > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev