Re: VkRunner ported to Rust

2023-02-21 Thread Neil Roberts
Alejandro Piñeiro  writes:

> I have been trying to find a time slot to learn Rust too. If you have
> issues around with pending features perhaps one of these days I join
> your initiative.

Great! I’ll have to have a brainstorm for things that need implementing.
The main thing at the moment is that there is still no support for
images and textures. That’s probably quite a big task. Otherwise maybe
if you find yourself needing to implement a test for something that
VkRunner can’t support yet that might be a good place to start.

Otherwise for some really small tasks it might be good to have a look
through the Clippy output. Clippy is a rust linter. It’s easier to run
using Cargo so you could make a trivial Cargo.toml build file like this:

[package]
name = "vkrunner"
version = "0.1.0"
edition = "2021"

[lib]
path = "vkrunner/lib.rs"

And then run “cargo clippy” to see the output.

Another option could be to add some more unit tests. I think the
coverage is already quite good but there’s probably some things missing
that could be worth testing.

> If the rust-based vkrunner has the same features and can run the same 
> kind of tests that the original c-based vkrunner, I think that using a 
> memory-safe language is an advantage that goes beyond just a warm fuzzy 
> feeling.
>
> If that is the case, I think that it is a good idea to replace one with 
> the other.

Cool. Yeah, it is feature complete with the old VkRunner so it should be
able to run the same scripts. I tried it with the current Piglit tests
and it works fine. If no one disagrees it might be nice to merge the
branch back into the official repo at Igalia’s github. I’m not sure if
the CI system automatically updates to the lastest version but if so
it’d be worth checking whether it can cope with switching from CMake to
Meson+Rust first.

Maybe in the long run it would be nice to host the repo on Freedesktop’s
gitlab.

Regards,
– Neil


VkRunner ported to Rust

2023-02-20 Thread Neil Roberts
Hi folks,

Does anybody remember VkRunner? It’s a little tool to help write
shader-based tests for Vulkan. It’s the same concept as Piglit’s
shader_runner but for Vulkan instead of OpenGL. There are a couple of
tests using it in Piglit but apart from that it never really got off the
ground.

Anyway, I’ve been trying to learn some Rust lately and in order to get
some experience working with a non-trivial project I decided to port
VkRunner to Rust. The port is now complete and available here:

https://github.com/bpeel/vkrunner/

It’s a drop-in replacement for the original VkRunner so it should be
possible to start using it in a CI system by just changing the git repo,
assuming the rust compiler is installed. I was thinking that now that
Mesa has some Rust code in it anyway it might not be too unreasonable to
expect CI systems to have the Rust compiler available.

Other than that there’s not much advantage to using one or the other
except for the warm fuzzy feeling knowing that you’re using a project
written in a memory-safe language. I also took the opportunity to add a
whole bunch of unit tests so in theory the Rust port might be more
robust.

It’s currently using Meson as the build system. In the beginning this
was necessary because I did the port gradually and it’s probably the
best build system if you have a mix of C and Rust code. Now that the
port is complete it’d probably be trivial to start using Cargo instead.
I’m not sure which would be better. It’s not using any external crates
so it doesn’t really need Cargo for now.

Now that the port is complete it might be nice to start adding more
features. If anyone else hasn’t taken the plunge to start using Rust yet
this might be a nice project to get involved in if you fancy helping.

Kind regards,
– Neil


[Mesa-dev] [MR] glsl: Use default precision on struct/interface members when nothing specified

2019-04-25 Thread Neil Roberts
Mesa already keeps track of the GLES precision for variables and stores
it in the ir_variable. When no precision is explicitly specified it
takes the default precision for the corresponding type. However, when
the variable is a struct or interface, the precision of each individual
member is attached to the glsl_type instead. The code to make it use the
default precision was missing so this branch adds it in.

Only the last patch actually makes this change. The rest of the patches
are to fix regressions in Piglit and CTS. The underlying problem is that
Mesa was considering types that have different precisions to be
different when comparing interstage interfaces (varyings and UBOs).
According to the spec it should be ignored. Presumably this problem
already existed if mismatched precisions were explicitly specified but
we didn’t have any tests to test it. Storing the default precision makes
some tests fail because the default precision for ints is different in
the vertex and fragment stages so it’s easier to accidentally make a
test case that tests this.

The tests that regressed are:

dEQP-GLES31.functional.shaders.opaque_type_indexing.* (12 tests)
piglit.spec.ext_transform_feedback.structs_gles3 basic-struct run
piglit.spec.glsl-es-3_00.execution.varying-struct-centroid_gles3
piglit.spec.ext_transform_feedback.structs_gles3 basic-struct get

https://gitlab.freedesktop.org/mesa/mesa/merge_requests/736


signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] freedreno: Fix loading half-float immediate vectors

2018-12-03 Thread Neil Roberts
Previously the code to load from a constant instruction was always
using the u32 pointer. If the constant is actually a 16-bit source
this would end up with the wrong values because the pointer would be
offset by the wrong size. This fixes it to use the u16 pointer.
---
 src/freedreno/ir3/ir3_compiler_nir.c | 14 +++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/src/freedreno/ir3/ir3_compiler_nir.c 
b/src/freedreno/ir3/ir3_compiler_nir.c
index 445a2b291e9..584c7e83fea 100644
--- a/src/freedreno/ir3/ir3_compiler_nir.c
+++ b/src/freedreno/ir3/ir3_compiler_nir.c
@@ -2525,10 +2525,18 @@ emit_load_const(struct ir3_context *ctx, 
nir_load_const_instr *instr)
 {
struct ir3_instruction **dst = get_dst_ssa(ctx, >def,
instr->def.num_components);
-   type_t type = (instr->def.bit_size < 32) ? TYPE_U16 : TYPE_U32;
 
-   for (int i = 0; i < instr->def.num_components; i++)
-   dst[i] = create_immed_typed(ctx->block, instr->value.u32[i], 
type);
+   if (instr->def.bit_size < 32) {
+   for (int i = 0; i < instr->def.num_components; i++)
+   dst[i] = create_immed_typed(ctx->block,
+   
instr->value.u16[i],
+   
TYPE_U16);
+   } else {
+   for (int i = 0; i < instr->def.num_components; i++)
+   dst[i] = create_immed_typed(ctx->block,
+   
instr->value.u32[i],
+   
TYPE_U32);
+   }
 }
 
 static void
-- 
2.17.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] freedreno: Add .dir-locals to the common directory

2018-12-03 Thread Neil Roberts
The commit aa0fed10d35 moved a bunch of Freedreno code to a common
directory. The previous directory had a .dir-locals file for Emacs.
This patch copies it to the new directory as well.
---
 src/freedreno/.dir-locals.el | 8 
 1 file changed, 8 insertions(+)
 create mode 100644 src/freedreno/.dir-locals.el

diff --git a/src/freedreno/.dir-locals.el b/src/freedreno/.dir-locals.el
new file mode 100644
index 000..b0e90fcbd53
--- /dev/null
+++ b/src/freedreno/.dir-locals.el
@@ -0,0 +1,8 @@
+((prog-mode
+  (indent-tabs-mode . t)
+  (tab-width . 4)
+  (c-basic-offset . 4)
+  (c-file-style . "k")
+  (fill-column . 78)
+  )
+ )
-- 
2.17.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] freedreno: Fix emacs modeline

2018-10-17 Thread Neil Roberts
Eric Engestrom  writes:

> That's absolutely fair :)
>
> I wanted to ack your patch earlier, since fixing it is good regardless,
> but freedreno isn't my area so I didn't feel comfortable doing so;
> I changed my mind in the mean time though, so here you go :P
> Acked-by: Eric Engestrom 
>
> You have push access, right?

Yes, I have push access. But actually Rob already pushed my other patch
to just remove it in the meantime, so there’s no need to do anything.

Thanks anyway.

Regards,
- Neil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] freedreno: Fix the Emacs indentation configuration file

2018-10-17 Thread Neil Roberts

> On Wed, Oct 17, 2018 at 12:45 PM Neil Roberts  wrote:
>
>> I wonder if you have something else in your setup that is setting it?

Ilia Mirkin  writes:

> Perhaps. It's the default, right?

It is the default but the toplevel .dir-locals.el sets it to nil. These
lower-level files are trying to override it back to the default.

> These might have a common source... although, HAH! IT WASN'T ME!
> Michel in 8d0a1a6bc05a set it to true, I probably copied, and am so
> used to emacs errors that I didn't even notice. Indents worked, so I
> was happy.

:)

> Yes, fixing these all is probably a good move. I don't think there are
> a lot of emacs users in mesa.

Lucky for them :)

Regards,
- Neil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2] Fix setting indent-tabs-mode in the Emacs .dir-locals.el files

2018-10-17 Thread Neil Roberts
Some of the .dir-locals.el had the wrong name for the truthy value so
it wasn’t setting indent-tabs-mode.
---
 src/gallium/drivers/freedreno/.dir-locals.el | 2 +-
 src/gallium/drivers/r600/.dir-locals.el  | 2 +-
 src/gallium/drivers/radeon/.dir-locals.el| 2 +-
 src/gallium/drivers/radeonsi/.dir-locals.el  | 2 +-
 src/mesa/drivers/dri/nouveau/.dir-locals.el  | 2 +-
 5 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/gallium/drivers/freedreno/.dir-locals.el 
b/src/gallium/drivers/freedreno/.dir-locals.el
index aa20d495465..b0e90fcbd53 100644
--- a/src/gallium/drivers/freedreno/.dir-locals.el
+++ b/src/gallium/drivers/freedreno/.dir-locals.el
@@ -1,5 +1,5 @@
 ((prog-mode
-  (indent-tabs-mode . true)
+  (indent-tabs-mode . t)
   (tab-width . 4)
   (c-basic-offset . 4)
   (c-file-style . "k")
diff --git a/src/gallium/drivers/r600/.dir-locals.el 
b/src/gallium/drivers/r600/.dir-locals.el
index 4e35c129e70..15cd68edb0a 100644
--- a/src/gallium/drivers/r600/.dir-locals.el
+++ b/src/gallium/drivers/r600/.dir-locals.el
@@ -1,5 +1,5 @@
 ((prog-mode
-  (indent-tabs-mode . true)
+  (indent-tabs-mode . t)
   (tab-width . 8)
   (c-basic-offset . 8)
   (c-file-style . "stroustrup")
diff --git a/src/gallium/drivers/radeon/.dir-locals.el 
b/src/gallium/drivers/radeon/.dir-locals.el
index 4e35c129e70..15cd68edb0a 100644
--- a/src/gallium/drivers/radeon/.dir-locals.el
+++ b/src/gallium/drivers/radeon/.dir-locals.el
@@ -1,5 +1,5 @@
 ((prog-mode
-  (indent-tabs-mode . true)
+  (indent-tabs-mode . t)
   (tab-width . 8)
   (c-basic-offset . 8)
   (c-file-style . "stroustrup")
diff --git a/src/gallium/drivers/radeonsi/.dir-locals.el 
b/src/gallium/drivers/radeonsi/.dir-locals.el
index 4e35c129e70..15cd68edb0a 100644
--- a/src/gallium/drivers/radeonsi/.dir-locals.el
+++ b/src/gallium/drivers/radeonsi/.dir-locals.el
@@ -1,5 +1,5 @@
 ((prog-mode
-  (indent-tabs-mode . true)
+  (indent-tabs-mode . t)
   (tab-width . 8)
   (c-basic-offset . 8)
   (c-file-style . "stroustrup")
diff --git a/src/mesa/drivers/dri/nouveau/.dir-locals.el 
b/src/mesa/drivers/dri/nouveau/.dir-locals.el
index 774f023ae6f..9b3ddf52461 100644
--- a/src/mesa/drivers/dri/nouveau/.dir-locals.el
+++ b/src/mesa/drivers/dri/nouveau/.dir-locals.el
@@ -1,5 +1,5 @@
 ((prog-mode
-  (indent-tabs-mode . true)
+  (indent-tabs-mode . t)
   (tab-width . 8)
   (c-basic-offset . 8)
   (c-file-style . "stroustrup")
-- 
2.17.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] freedreno: Fix the Emacs indentation configuration file

2018-10-17 Thread Neil Roberts
Ilia Mirkin  writes:

> Are you sure? It works fine for me... I'm not against fixing it to be
> "t", but the current contents definitely worked fine for me. (As I
> recall, I may be the one who checked this file in.)

Yes, I’m sure. If you type “true” and then do C-x C-e to evaluate it
then Emacs gives a void-variable error. If I leave it as “true” in the
file then it does indeed indent without tabs. Also if I do C-h v it says
the value is nil, whereas if I change the .dir-local.el to “t” then the
indentation works properly and the variable help says its value comes
from the .dir-locals.el. I wonder if you have something else in your
setup that is setting it?

I notice that there are some other files with the same problem. It might
be worth fixing them all in one patch.

$ git grep 'indent-tabs-mode *\. *true'
src/gallium/drivers/freedreno/.dir-locals.el:  (indent-tabs-mode . true)
src/gallium/drivers/r600/.dir-locals.el:  (indent-tabs-mode . true)
src/gallium/drivers/radeon/.dir-locals.el:  (indent-tabs-mode . true)
src/gallium/drivers/radeonsi/.dir-locals.el:  (indent-tabs-mode . true)
src/mesa/drivers/dri/nouveau/.dir-locals.el:  (indent-tabs-mode . true)

Regards,
- Neil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/2] freedreno: Remove the Emacs mode lines

2018-10-17 Thread Neil Roberts
These are not necessary because the corresponding settings are set via
the .dir-locals.el file anyway. Most of them were missing a ‘:’ after
“tab-width” which was making Emacs display an annoying warning
whenever you open the file.

This patch was made with:

sed -ri '/-\*- mode:/,/^$/d' \
$(find src/gallium/{drivers,winsys} -name \*.\[ch\] \
   -exec grep -l -- '-\*- mode:' {} \+)
---
 src/gallium/drivers/freedreno/a2xx/fd2_blend.c  | 2 --
 src/gallium/drivers/freedreno/a2xx/fd2_blend.h  | 2 --
 src/gallium/drivers/freedreno/a2xx/fd2_compiler.c   | 2 --
 src/gallium/drivers/freedreno/a2xx/fd2_compiler.h   | 2 --
 src/gallium/drivers/freedreno/a2xx/fd2_context.c| 2 --
 src/gallium/drivers/freedreno/a2xx/fd2_context.h| 2 --
 src/gallium/drivers/freedreno/a2xx/fd2_draw.c   | 2 --
 src/gallium/drivers/freedreno/a2xx/fd2_draw.h   | 2 --
 src/gallium/drivers/freedreno/a2xx/fd2_emit.c   | 2 --
 src/gallium/drivers/freedreno/a2xx/fd2_emit.h   | 2 --
 src/gallium/drivers/freedreno/a2xx/fd2_gmem.c   | 2 --
 src/gallium/drivers/freedreno/a2xx/fd2_gmem.h   | 2 --
 src/gallium/drivers/freedreno/a2xx/fd2_program.c| 2 --
 src/gallium/drivers/freedreno/a2xx/fd2_program.h| 2 --
 src/gallium/drivers/freedreno/a2xx/fd2_rasterizer.c | 2 --
 src/gallium/drivers/freedreno/a2xx/fd2_rasterizer.h | 2 --
 src/gallium/drivers/freedreno/a2xx/fd2_screen.c | 2 --
 src/gallium/drivers/freedreno/a2xx/fd2_screen.h | 2 --
 src/gallium/drivers/freedreno/a2xx/fd2_texture.c| 2 --
 src/gallium/drivers/freedreno/a2xx/fd2_texture.h| 2 --
 src/gallium/drivers/freedreno/a2xx/fd2_util.c   | 2 --
 src/gallium/drivers/freedreno/a2xx/fd2_util.h   | 2 --
 src/gallium/drivers/freedreno/a2xx/fd2_zsa.c| 2 --
 src/gallium/drivers/freedreno/a2xx/fd2_zsa.h| 2 --
 src/gallium/drivers/freedreno/a3xx/fd3_blend.c  | 2 --
 src/gallium/drivers/freedreno/a3xx/fd3_blend.h  | 2 --
 src/gallium/drivers/freedreno/a3xx/fd3_context.c| 2 --
 src/gallium/drivers/freedreno/a3xx/fd3_context.h| 2 --
 src/gallium/drivers/freedreno/a3xx/fd3_draw.c   | 2 --
 src/gallium/drivers/freedreno/a3xx/fd3_draw.h   | 2 --
 src/gallium/drivers/freedreno/a3xx/fd3_emit.c   | 2 --
 src/gallium/drivers/freedreno/a3xx/fd3_emit.h   | 2 --
 src/gallium/drivers/freedreno/a3xx/fd3_gmem.c   | 2 --
 src/gallium/drivers/freedreno/a3xx/fd3_gmem.h   | 2 --
 src/gallium/drivers/freedreno/a3xx/fd3_program.c| 2 --
 src/gallium/drivers/freedreno/a3xx/fd3_program.h| 2 --
 src/gallium/drivers/freedreno/a3xx/fd3_query.c  | 2 --
 src/gallium/drivers/freedreno/a3xx/fd3_query.h  | 2 --
 src/gallium/drivers/freedreno/a3xx/fd3_rasterizer.c | 2 --
 src/gallium/drivers/freedreno/a3xx/fd3_rasterizer.h | 2 --
 src/gallium/drivers/freedreno/a3xx/fd3_screen.c | 2 --
 src/gallium/drivers/freedreno/a3xx/fd3_screen.h | 2 --
 src/gallium/drivers/freedreno/a3xx/fd3_texture.c| 2 --
 src/gallium/drivers/freedreno/a3xx/fd3_texture.h| 2 --
 src/gallium/drivers/freedreno/a3xx/fd3_zsa.c| 2 --
 src/gallium/drivers/freedreno/a3xx/fd3_zsa.h| 2 --
 src/gallium/drivers/freedreno/a4xx/fd4_blend.c  | 2 --
 src/gallium/drivers/freedreno/a4xx/fd4_blend.h  | 2 --
 src/gallium/drivers/freedreno/a4xx/fd4_context.c| 2 --
 src/gallium/drivers/freedreno/a4xx/fd4_context.h| 2 --
 src/gallium/drivers/freedreno/a4xx/fd4_draw.c   | 2 --
 src/gallium/drivers/freedreno/a4xx/fd4_draw.h   | 2 --
 src/gallium/drivers/freedreno/a4xx/fd4_emit.c   | 2 --
 src/gallium/drivers/freedreno/a4xx/fd4_emit.h   | 2 --
 src/gallium/drivers/freedreno/a4xx/fd4_format.c | 2 --
 src/gallium/drivers/freedreno/a4xx/fd4_format.h | 2 --
 src/gallium/drivers/freedreno/a4xx/fd4_gmem.c   | 2 --
 src/gallium/drivers/freedreno/a4xx/fd4_gmem.h   | 2 --
 src/gallium/drivers/freedreno/a4xx/fd4_program.c| 2 --
 src/gallium/drivers/freedreno/a4xx/fd4_program.h| 2 --
 src/gallium/drivers/freedreno/a4xx/fd4_query.c  | 2 --
 src/gallium/drivers/freedreno/a4xx/fd4_query.h  | 2 --
 src/gallium/drivers/freedreno/a4xx/fd4_rasterizer.c | 2 --
 src/gallium/drivers/freedreno/a4xx/fd4_rasterizer.h | 2 --
 src/gallium/drivers/freedreno/a4xx/fd4_screen.c | 2 --
 src/gallium/drivers/freedreno/a4xx/fd4_screen.h | 2 --
 src/gallium/drivers/freedreno/a4xx/fd4_texture.c| 2 --
 src/gallium/drivers/freedreno/a4xx/fd4_texture.h| 2 --
 src/gallium/drivers/freedreno/a4xx/fd4_zsa.c| 2 --
 src/gallium/drivers/freedreno/a4xx/fd4_zsa.h| 2 --
 src/gallium/drivers/freedreno/freedreno_context.c   | 2 --
 

[Mesa-dev] [PATCH 1/2] freedreno: Fix the Emacs indentation configuration file

2018-10-17 Thread Neil Roberts
The .dir-locals.el had the wrong name for the truthy value so it
wasn’t setting indent-tabs-mode.
---
 src/gallium/drivers/freedreno/.dir-locals.el | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/freedreno/.dir-locals.el 
b/src/gallium/drivers/freedreno/.dir-locals.el
index aa20d495465..b0e90fcbd53 100644
--- a/src/gallium/drivers/freedreno/.dir-locals.el
+++ b/src/gallium/drivers/freedreno/.dir-locals.el
@@ -1,5 +1,5 @@
 ((prog-mode
-  (indent-tabs-mode . true)
+  (indent-tabs-mode . t)
   (tab-width . 4)
   (c-basic-offset . 4)
   (c-file-style . "k")
-- 
2.17.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] freedreno: Fix emacs modeline

2018-10-17 Thread Neil Roberts
Eric Engestrom  writes:

> You might want to remove these instead, and use the .editorconfig [1]
> already present at src/gallium/drivers/freedreno/.editorconfig This is
> much easier to maintain than per-files settings ;)

Either fixing it or removing it is fine by me. I now notice there is a
.dir-locals.el file that should make it work anyway. (apparently I was
the last person to touch it too!) It has a typo which makes it fail to
set indent-tabs-mode though. I can make everything work locally either
way, I just wanted to get rid of the annoying warning whenever you open
a file.

- Neil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] freedreno: Fix emacs modeline

2018-10-17 Thread Neil Roberts
The modeline was missing a ‘:’ after the tab-width and Emacs was
complaining every time you open a file. This patch was made with:

sed -ri '1 s/; tab-width ([0-9])/; tab-width: \1/' \
$(find -name \*.\[ch\] -exec grep -l -- '-\*- mode:' {} \+)
---
 src/gallium/drivers/freedreno/a2xx/fd2_blend.c  | 2 +-
 src/gallium/drivers/freedreno/a2xx/fd2_blend.h  | 2 +-
 src/gallium/drivers/freedreno/a2xx/fd2_compiler.c   | 2 +-
 src/gallium/drivers/freedreno/a2xx/fd2_compiler.h   | 2 +-
 src/gallium/drivers/freedreno/a2xx/fd2_context.c| 2 +-
 src/gallium/drivers/freedreno/a2xx/fd2_context.h| 2 +-
 src/gallium/drivers/freedreno/a2xx/fd2_draw.c   | 2 +-
 src/gallium/drivers/freedreno/a2xx/fd2_draw.h   | 2 +-
 src/gallium/drivers/freedreno/a2xx/fd2_emit.c   | 2 +-
 src/gallium/drivers/freedreno/a2xx/fd2_emit.h   | 2 +-
 src/gallium/drivers/freedreno/a2xx/fd2_gmem.c   | 2 +-
 src/gallium/drivers/freedreno/a2xx/fd2_gmem.h   | 2 +-
 src/gallium/drivers/freedreno/a2xx/fd2_program.c| 2 +-
 src/gallium/drivers/freedreno/a2xx/fd2_program.h| 2 +-
 src/gallium/drivers/freedreno/a2xx/fd2_rasterizer.c | 2 +-
 src/gallium/drivers/freedreno/a2xx/fd2_rasterizer.h | 2 +-
 src/gallium/drivers/freedreno/a2xx/fd2_screen.c | 2 +-
 src/gallium/drivers/freedreno/a2xx/fd2_screen.h | 2 +-
 src/gallium/drivers/freedreno/a2xx/fd2_texture.c| 2 +-
 src/gallium/drivers/freedreno/a2xx/fd2_texture.h| 2 +-
 src/gallium/drivers/freedreno/a2xx/fd2_util.c   | 2 +-
 src/gallium/drivers/freedreno/a2xx/fd2_util.h   | 2 +-
 src/gallium/drivers/freedreno/a2xx/fd2_zsa.c| 2 +-
 src/gallium/drivers/freedreno/a2xx/fd2_zsa.h| 2 +-
 src/gallium/drivers/freedreno/a3xx/fd3_blend.c  | 2 +-
 src/gallium/drivers/freedreno/a3xx/fd3_blend.h  | 2 +-
 src/gallium/drivers/freedreno/a3xx/fd3_context.c| 2 +-
 src/gallium/drivers/freedreno/a3xx/fd3_context.h| 2 +-
 src/gallium/drivers/freedreno/a3xx/fd3_draw.c   | 2 +-
 src/gallium/drivers/freedreno/a3xx/fd3_draw.h   | 2 +-
 src/gallium/drivers/freedreno/a3xx/fd3_emit.c   | 2 +-
 src/gallium/drivers/freedreno/a3xx/fd3_emit.h   | 2 +-
 src/gallium/drivers/freedreno/a3xx/fd3_gmem.c   | 2 +-
 src/gallium/drivers/freedreno/a3xx/fd3_gmem.h   | 2 +-
 src/gallium/drivers/freedreno/a3xx/fd3_program.c| 2 +-
 src/gallium/drivers/freedreno/a3xx/fd3_program.h| 2 +-
 src/gallium/drivers/freedreno/a3xx/fd3_query.c  | 2 +-
 src/gallium/drivers/freedreno/a3xx/fd3_query.h  | 2 +-
 src/gallium/drivers/freedreno/a3xx/fd3_rasterizer.c | 2 +-
 src/gallium/drivers/freedreno/a3xx/fd3_rasterizer.h | 2 +-
 src/gallium/drivers/freedreno/a3xx/fd3_screen.c | 2 +-
 src/gallium/drivers/freedreno/a3xx/fd3_screen.h | 2 +-
 src/gallium/drivers/freedreno/a3xx/fd3_texture.c| 2 +-
 src/gallium/drivers/freedreno/a3xx/fd3_texture.h| 2 +-
 src/gallium/drivers/freedreno/a3xx/fd3_zsa.c| 2 +-
 src/gallium/drivers/freedreno/a3xx/fd3_zsa.h| 2 +-
 src/gallium/drivers/freedreno/a4xx/fd4_blend.c  | 2 +-
 src/gallium/drivers/freedreno/a4xx/fd4_blend.h  | 2 +-
 src/gallium/drivers/freedreno/a4xx/fd4_context.c| 2 +-
 src/gallium/drivers/freedreno/a4xx/fd4_context.h| 2 +-
 src/gallium/drivers/freedreno/a4xx/fd4_draw.c   | 2 +-
 src/gallium/drivers/freedreno/a4xx/fd4_draw.h   | 2 +-
 src/gallium/drivers/freedreno/a4xx/fd4_emit.c   | 2 +-
 src/gallium/drivers/freedreno/a4xx/fd4_emit.h   | 2 +-
 src/gallium/drivers/freedreno/a4xx/fd4_format.c | 2 +-
 src/gallium/drivers/freedreno/a4xx/fd4_format.h | 2 +-
 src/gallium/drivers/freedreno/a4xx/fd4_gmem.c   | 2 +-
 src/gallium/drivers/freedreno/a4xx/fd4_gmem.h   | 2 +-
 src/gallium/drivers/freedreno/a4xx/fd4_program.c| 2 +-
 src/gallium/drivers/freedreno/a4xx/fd4_program.h| 2 +-
 src/gallium/drivers/freedreno/a4xx/fd4_query.c  | 2 +-
 src/gallium/drivers/freedreno/a4xx/fd4_query.h  | 2 +-
 src/gallium/drivers/freedreno/a4xx/fd4_rasterizer.c | 2 +-
 src/gallium/drivers/freedreno/a4xx/fd4_rasterizer.h | 2 +-
 src/gallium/drivers/freedreno/a4xx/fd4_screen.c | 2 +-
 src/gallium/drivers/freedreno/a4xx/fd4_screen.h | 2 +-
 src/gallium/drivers/freedreno/a4xx/fd4_texture.c| 2 +-
 src/gallium/drivers/freedreno/a4xx/fd4_texture.h| 2 +-
 src/gallium/drivers/freedreno/a4xx/fd4_zsa.c| 2 +-
 src/gallium/drivers/freedreno/a4xx/fd4_zsa.h| 2 +-
 src/gallium/drivers/freedreno/freedreno_context.c   | 2 +-
 src/gallium/drivers/freedreno/freedreno_context.h   | 2 +-
 src/gallium/drivers/freedreno/freedreno_draw.c  | 2 +-
 src/gallium/drivers/freedreno/freedreno_draw.h  | 2 +-
 

Re: [Mesa-dev] [PATCH 04/15] mesa/glspirv: Set last_vert_prog

2018-07-30 Thread Neil Roberts
Timothy Arceri  writes:

>> +
>> +   int last_vert_stage =
>> +  util_last_bit(prog->data->linked_stages &
>> +(((1 << (MESA_SHADER_GEOMETRY + 1)) - 1) ^
>> + ((1 << MESA_SHADER_VERTEX) - 1)));
>
> Isn't this the same as:
>
> int last_vert_stage =
>util_last_bit(prog->data->linked_stages &
>  ((1 << (MESA_SHADER_GEOMETRY + 1)) - 1));
>
> As ((1 << MESA_SHADER_VERTEX) - 1)) == 0
>
> If you use the above simplification this patch is:
>
> Reviewed-by: Timothy Arceri 

This is based on code in link_varyings_and_uniforms which has a loop
over all of the stages from geometry down to vertex to try and find the
last vertex stage. I was trying to mimic this behaviour which is
presumably saying vertex->geometry are the vertex stages and not just
“anything up to and including geometry”.

I don’t really mind either way whether we include the MESA_SHADER_VERTEX
part or not. I guess the cleanest way could be to add a define for the
mask in shader_enums.h to make it most likely not to break if any more
stages are added. But seeing as there is precedent for having this range
be open-coded in the code I guess we can just leave that to another
patch.

Regards,
- Neil


signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 13/21] nir/linker: Add gl_nir_link_uniforms()

2018-06-07 Thread Neil Roberts
Timothy Arceri  writes:

> I still think you might be able to drop the tree without using a hash 
> table. Can't you just add an offset field to the state struct and use 
> this? You would increment it as you recurse down the arrays/structs via 
> nir_link_uniform() and you could  detect the first leaf-node once you 
> reach the inner most elements in a way similar to what link_uniforms() 
> does in with record_type in program_resource_visitor::recursion. When 
> moving to the next uniform you would just reset the offset back to 0.

Let’s walk through an example to demonstrate what needs to happen.

Imagine there is a uniform like this:

layout(location = 0) uniform struct {
sampler2D foo[3];
sampler2D bar[4];
} wibble[2];

This creates a total of 2*(3+4)=14 samplers. We want to group together
all of the indices for a single member so that they will be laid out in
the following order:

Index 0: wibble[0].foo[0]
Index 1: wibble[0].foo[1]
Index 2: wibble[0].foo[2]
Index 3: wibble[1].foo[0]
Index 4: wibble[1].foo[1]
Index 5: wibble[1].foo[2]
Index 6: wibble[0].bar[0]
Index 7: wibble[0].bar[1]
Index 8: wibble[0].bar[2]
Index 9: wibble[0].bar[3]
Index 10: wibble[1].bar[0]
Index 11: wibble[1].bar[1]
Index 12: wibble[1].bar[2]
Index 13: wibble[1].bar[3]

Ie, all of the foos are bunched together before all of the bars.

When we walk the type tree for wibble, it will recognise that the array
elements are not a simple type so it will create a separate uniform for
each element of wibble. That means it will visit each of foo and bar
twice.

With the current approach the first time we visit foo it will look at
all of the parents of the type to work out the total size needed for all
of the foos and reserve a continuous block of 6 indices. Then it will
reserve three of those for this instance of foo by incrementing its own
next_index entry by 3.

Next it will do the same thing for bar by reserving 8 indices out of the
global pool and setting its own next_index to 4.

Then when we handle wibble[1].foo it will recognise that its own
next_index is not zero so it will continue using the pool it already
allocated. It will take the next three out of its pool starting at the
previous value of its next_index, 3. And then it will do the same for
bar taking the next four out of the pool it already reserved.

This creates the following entries in UniformStorage:

0: loc=0, elems=3, storage offset=0, name=wibble[0].foo, fragment=0
1: loc=3, elems=4, storage offset=6, name=wibble[0].bar, fragment=6
2: loc=7, elems=3, storage offset=14, name=wibble[1].foo, fragment=3
3: loc=10, elems=4, storage offset=20, name=wibble[1].bar, fragment=10

‘loc’ is the location of the uniform and ‘fragment’ is the sampler index
allocated.

I understand how you could use something like record_type to recognise
whether it is the first time you visit a leaf node or not, but if it
isn’t the first time I don’t see how you can work out what index to use
if you only have a single counter in the state struct. I think you need
a way to find out what the start of the range was the first time you
encountered the leaf and how many times you have visited it so far.

Regards,
- Neil

>
>> 
>> Thanks for looking at the patch.
>> 
>> Regards,
>> - Neil
>> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev


signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 10/21] anv/nir: Use nir_variable's type if interface_type is null

2018-06-07 Thread Neil Roberts
Timothy Arceri  writes:

>
> glsl_get_sampler_dim() contains the following:
>
> assert(glsl_type_is_sampler(type) || glsl_type_is_image(type));
>
> Which leads me to believe the code above should just be:
>
> const struct glsl_type *glsl_type = glsl_without_array(var->type);
>
> If you agree please squash this patch into the previous patch where you 
> can keep my r-b.

I agree, that sounds sensible.

It looks like var->interface_type was being overloaded as a place to
store the type for the image or sampler without the array type. The
comment above interface_type says:

 “For variables that are in an interface block or are an instance of an
  interface block, this is the GLSL_TYPE_INTERFACE type for that block.”

So it seems a little strange to use it for this purpose. It presumably
works because opaque types can’t be in interfaces so the interface_type
is otherwise unused. I guess with the previous patch interface_type will
always be NULL so you’re right that the check for whether it is NULL in
seems pointless.

Maybe we should however add a comment to the commit message for the
previous commit along these lines:

“The previous code appeared to be using var->interface_type as a place
to store the type of the variable without the enclosing array for images
and samplers. I guess this worked because opaque types can not appear in
interfaces so the interface_type is sort of unused. This patch removes
the overloading of var->interface_type and any places that needed the
type without the array can now just deduce it from var->type.”

Thanks for looking at the patch.

Regards,
- Neil


signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 13/21] nir/linker: Add gl_nir_link_uniforms()

2018-06-06 Thread Neil Roberts
Timothy Arceri  writes:

>> +static struct type_tree_entry *
>> +build_type_tree_for_type(const struct glsl_type *type)
>> +{
>
> Do we really need this? As far as I can tell we walk the types here to 
> build a tree then in nir_link_uniform() we walk the tree. Why not just 
> walk the types directly in nir_link_uniform()?

The tree is needed as a temporary place to store the next_index and
detect whether a node in the type tree is encountered for the first time
or not. The nodes can be encountered multiple times because if there is
an array of structs or array of arrays then we process the subtree of
the array once for each element in the array.

In nir_link_uniform, it effectively _is_ directly walking the glsl_type
tree. However it simultaneously walks the type_tree tree as well for
this extra side-band information.

The GLSL linker handles this instead by using a hash table to store the
next index (see set_opaque_indices in link_uniforms.cpp). The key for
the table is the name of the uniform with all array indices removed. I
suppose we could try to do something similar for the nir linker but it
would have to use a different key because we don’t have the names.
However my gut feeling is that that wouldn’t be any more efficient then
using this side-band tree and the code to generate the hash table key
looks quite involved.

If you can think of a simpler way to handle this then of course I am
open to suggestions.

Thanks for looking at the patch.

Regards,
- Neil


signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] i965: Fix output register sizes when variable ranges are interleaved

2018-05-18 Thread Neil Roberts
In 6f5abf31466aed this code was fixed to calculate the maximum size of
an attribute in a seperate pass and then allocate the registers to
that size. However this wasn’t taking into account ranges that overlap
but don’t have the same starting location. For example:

layout(location = 0, component = 0) out float a[4];
layout(location = 2, component = 1) out float b[4];

Previously, if ‘a’ was processed first then it would allocate a
register of size 4 for location 0 and it wouldn’t allocate another
register for location 2 because it would already be covered by the
range of 0. Then if something tries to write to b[2] it would try to
write past the end of the register allocated for ‘a’ and it would hit
an assert.

This patch changes it to scan for any overlapping ranges that start
within each range to calculate the maximum extent and allocate that
instead.

Fixed Piglit’s arb_enhanced_layouts/execution/component-layout/
vs-fs-array-interleave-range.shader_test
---

For what it’s worth I also made a simplified test for this for
VkRunner here:

https://github.com/Igalia/vkrunner/blob/tests/examples/overlap-outputs.shader_test

 src/intel/compiler/brw_fs_nir.cpp | 25 ++---
 1 file changed, 18 insertions(+), 7 deletions(-)

diff --git a/src/intel/compiler/brw_fs_nir.cpp 
b/src/intel/compiler/brw_fs_nir.cpp
index 1ce89520bf1..b179018e5e8 100644
--- a/src/intel/compiler/brw_fs_nir.cpp
+++ b/src/intel/compiler/brw_fs_nir.cpp
@@ -67,14 +67,25 @@ fs_visitor::nir_setup_outputs()
   vec4s[loc] = MAX2(vec4s[loc], var_vec4s);
}
 
-   nir_foreach_variable(var, >outputs) {
-  const int loc = var->data.driver_location;
-  if (outputs[loc].file == BAD_FILE) {
- fs_reg reg = bld.vgrf(BRW_REGISTER_TYPE_F, 4 * vec4s[loc]);
- for (unsigned i = 0; i < vec4s[loc]; i++) {
-outputs[loc + i] = offset(reg, bld, 4 * i);
- }
+   for (unsigned loc = 0; loc < ARRAY_SIZE(vec4s);) {
+  if (vec4s[loc] == 0) {
+ loc++;
+ continue;
   }
+
+  unsigned reg_size = vec4s[loc];
+
+  /* Check if there are any ranges that start within this range and extend
+   * past it. If so, include them in this allocation.
+   */
+  for (unsigned i = 1; i < reg_size; i++)
+ reg_size = MAX2(vec4s[i + loc] + i, reg_size);
+
+  fs_reg reg = bld.vgrf(BRW_REGISTER_TYPE_F, 4 * reg_size);
+  for (unsigned i = 0; i < reg_size; i++)
+ outputs[loc + i] = offset(reg, bld, 4 * i);
+
+  loc += reg_size;
}
 }
 
-- 
2.14.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] spirv: Fix InterpolateAt* instructions for vecs with dynamic index

2018-05-09 Thread Neil Roberts
If the glsl is something like this:

  in vec4 some_input;
  interpolateAtCentroid(some_input[idx])

then it now gets generated as if it were:

  interpolateAtCentroid(some_input)[idx]

This is necessary because the index will get generated as a series of
nir_bcsel instructions so it would no longer be an input variable. It
is similar to what is done for GLSL in ca63a5ed3e9efb2bd645b42.

Although I can’t find anything explicit in the Vulkan specs to say
this should be allowed, the SPIR-V spec just says “the operand
interpolant must be a pointer to the Input Storage Class”, which I
guess doesn’t rule out any type of pointer to an input.

This was found using the spec/glsl-4.40/execution/fs-interpolateAt*
Piglit tests with the ARB_gl_spirv branch.
---

I’ve made a bunch of tests for this for VkRunner. They can be tested
with:

git clone https://github.com/Igalia/vkrunner.git -b tests
cd vkrunner && ./autogen.sh && make
./src/vkrunner examples/interpolation/*.shader_test

 src/compiler/spirv/vtn_glsl450.c | 56 +---
 1 file changed, 52 insertions(+), 4 deletions(-)

diff --git a/src/compiler/spirv/vtn_glsl450.c b/src/compiler/spirv/vtn_glsl450.c
index 6fa759b1bba..cd39fcf6efb 100644
--- a/src/compiler/spirv/vtn_glsl450.c
+++ b/src/compiler/spirv/vtn_glsl450.c
@@ -773,6 +773,23 @@ handle_glsl450_alu(struct vtn_builder *b, enum GLSLstd450 
entrypoint,
}
 }
 
+static bool
+has_vector_array_deref(nir_deref *deref)
+{
+   while (true) {
+  if (deref->child == NULL)
+ return false;
+
+  if (glsl_type_is_vector(deref->type) &&
+  deref->child->deref_type == nir_deref_type_array) {
+ assert(deref->child->child == NULL);
+ return true;
+  }
+
+  deref = deref->child;
+   }
+}
+
 static void
 handle_glsl450_interpolation(struct vtn_builder *b, enum GLSLstd450 opcode,
  const uint32_t *w, unsigned count)
@@ -799,10 +816,28 @@ handle_glsl450_interpolation(struct vtn_builder *b, enum 
GLSLstd450 opcode,
}
 
nir_intrinsic_instr *intrin = nir_intrinsic_instr_create(b->nb.shader, op);
+   const struct glsl_type *intrin_type;
 
nir_deref_var *deref = vtn_nir_deref(b, w[5]);
intrin->variables[0] = nir_deref_var_clone(deref, intrin);
 
+   /* If the value we are interpolating has an index into a vector then
+* interpolate the vector and index the result of that instead. This is
+* necessary because the index will get generated as a series of nir_bcsel
+* instructions so it would no longer be an input variable.
+*/
+   bool vec_array_deref = has_vector_array_deref(>deref);
+   if (vec_array_deref) {
+  /* Remove the last deref */
+  nir_deref *parent = >variables[0]->deref;
+  while (parent->child->child)
+ parent = parent->child;
+  parent->child = NULL;
+  intrin_type = parent->type;
+   } else {
+  intrin_type = dest_type;
+   }
+
switch (opcode) {
case GLSLstd450InterpolateAtCentroid:
   break;
@@ -814,13 +849,26 @@ handle_glsl450_interpolation(struct vtn_builder *b, enum 
GLSLstd450 opcode,
   vtn_fail("Invalid opcode");
}
 
-   intrin->num_components = glsl_get_vector_elements(dest_type);
+   intrin->num_components = glsl_get_vector_elements(intrin_type);
nir_ssa_dest_init(>instr, >dest,
- glsl_get_vector_elements(dest_type),
- glsl_get_bit_size(dest_type), NULL);
-   val->ssa->def = >dest.ssa;
+ glsl_get_vector_elements(intrin_type),
+ glsl_get_bit_size(intrin_type), NULL);
 
nir_builder_instr_insert(>nb, >instr);
+
+   if (vec_array_deref) {
+  nir_deref_array *deref_array =
+ nir_deref_as_array(nir_deref_tail(>deref));
+  if (deref_array->deref_array_type == nir_deref_array_type_direct) {
+ val->ssa->def = vtn_vector_extract(b, >dest.ssa,
+deref_array->base_offset);
+  } else {
+ val->ssa->def = vtn_vector_extract_dynamic(b, >dest.ssa,
+deref_array->indirect.ssa);
+  }
+   } else {
+  val->ssa->def = >dest.ssa;
+   }
 }
 
 bool
-- 
2.14.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 21/22] i965: Setup glsl uniforms by index rather than name matching

2018-05-03 Thread Neil Roberts
Thanks for the feedback.

I’ve implemented your suggestion as the top two patches on a branch
here:

https://github.com/Igalia/mesa/tree/nroberts/count-uniform-storage

It also fixes a slight bug that it wasn’t passing down the
uniform_storage_locs argument into the recursion.

However I think my preference would be for keeping this as a separate
function because it seems a little confusing to make the function count two
different things depending on a bool argument. Perhaps for the sake of
maintainability it would be good to move it to glsl_types though.

If there is a strong preference to merge it into one function then of
course I don’t mind.

Regards,
- Neil

Timothy Arceri <tarc...@itsqueeze.com> writes:

> On 18/04/18 00:36, Alejandro Piñeiro wrote:
>> From: Neil Roberts <nrobe...@igalia.com>
>> 
>> Previously when setting up a uniform it would try to walk the uniform
>> storage slots and find one that matches the name of the given
>> variable. However, each variable already has a location which is an
>> index into the UniformStorage array so we can just directly jump to
>> the right slot. Some of the variables take up more than one slot so we
>> still need to calculate how many it uses.
>> 
>> The main reason to do this is to support ARB_gl_spirv because in that
>> case the uniforms don’t have names so the previous approach won’t
>> work.
>
> This is a nice improvement away :)
>
> However it might be nicer to just update the existing 
> glsl_type::uniform_locations() helper
> rather than create the custom count_uniform_storage_slots() helper.
>
> e.g
>
> unsigned
> glsl_type::uniform_locations(bool uniform_storage_locs) const
> {
> unsigned size = 0;
>
> switch (this->base_type) {
> case GLSL_TYPE_UINT:
> case GLSL_TYPE_INT:
> case GLSL_TYPE_FLOAT:
> case GLSL_TYPE_FLOAT16:
> case GLSL_TYPE_DOUBLE:
> case GLSL_TYPE_UINT16:
> case GLSL_TYPE_UINT8:
> case GLSL_TYPE_INT16:
> case GLSL_TYPE_INT8:
> case GLSL_TYPE_UINT64:
> case GLSL_TYPE_INT64:
> case GLSL_TYPE_BOOL:
> case GLSL_TYPE_SAMPLER:
> case GLSL_TYPE_IMAGE:
> case GLSL_TYPE_SUBROUTINE:
>return 1;
>
> case GLSL_TYPE_STRUCT:
> case GLSL_TYPE_INTERFACE:
>for (unsigned i = 0; i < this->length; i++)
>   size += this->fields.structure[i].type->uniform_locations();
>return size;
> case GLSL_TYPE_ARRAY:
>if (!uniform_storage_locs ||
>this->fields.array->is_array() ||
>this->fields.array->is_interface() ||
>this->fields.array->is_record()) {
>/* For uniform locations passed to the user via the API we
> * count all array elements.
> */
>   return this->length * this->fields.array->uniform_locations();
>} else {
>   /* For uniform storage the innermost array only uses a
>* single slot.
>*/
>   return 1;
>}
> default:
>return 0;
> }
> }
>
> If you agree this patch is:
>
> Reviewed-by: Timothy Arceri <tarc...@itsqueeze.com>
>
>> ---
>>   src/mesa/drivers/dri/i965/brw_nir_uniforms.cpp | 50 
>> +++---
>>   1 file changed, 37 insertions(+), 13 deletions(-)
>> 
>> diff --git a/src/mesa/drivers/dri/i965/brw_nir_uniforms.cpp 
>> b/src/mesa/drivers/dri/i965/brw_nir_uniforms.cpp
>> index 62b2951432a..cb5e07f74ba 100644
>> --- a/src/mesa/drivers/dri/i965/brw_nir_uniforms.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_nir_uniforms.cpp
>> @@ -118,35 +118,59 @@ brw_setup_image_uniform_values(gl_shader_stage stage,
>>  }
>>   }
>>   
>> +static unsigned
>> +count_uniform_storage_slots(const struct glsl_type *type)
>> +{
>> +   /* gl_uniform_storage can cope with one level of array, so if the
>> +* type is a composite type or an array where each element occupies
>> +* more than one slot than we need to recursively process it.
>> +*/
>> +   if (glsl_type_is_struct(type)) {
>> +  unsigned location_count = 0;
>> +
>> +  for (unsigned i = 0; i < glsl_get_length(type); i++) {
>> + const struct glsl_type *field_type = glsl_get_struct_field(type, 
>> i);
>> +
>> + location_count += count_uniform_storage_slots(field_type);
>> +  }
>> +
>> +  return location_count;
>> +   }
>> +
>> +   if (glsl_type_is_array(type)) {
>> +  const struct glsl_type *element_type = glsl_get_array_element

[Mesa-dev] [PATCH] spirv: Apply OriginUpperLeft to FragCoord

2018-05-02 Thread Neil Roberts
This behaviour was changed in 1e5b09f42f694687ac. The commit message
for that says it is just a “tidy up” so my assumption is that the
behaviour change was a mistake. It’s a little hard to decipher looking
at the diff, but the previous code before that patch was:

  if (builtin == SpvBuiltInFragCoord || builtin == SpvBuiltInSamplePosition)
 nir_var->data.origin_upper_left = b->origin_upper_left;

  if (builtin == SpvBuiltInFragCoord)
 nir_var->data.pixel_center_integer = b->pixel_center_integer;

After the patch the code was:

  case SpvBuiltInSamplePosition:
 nir_var->data.origin_upper_left = b->origin_upper_left;
 /* fallthrough */
  case SpvBuiltInFragCoord:
 nir_var->data.pixel_center_integer = b->pixel_center_integer;
 break;

Before the patch origin_upper_left affected both builtins and
pixel_center_integer only affected FragCoord. After the patch
origin_upper_left only affects SamplePosition and pixel_center_integer
affects both variables.

This patch tries to restore the previous behaviour by changing the
code to:

  case SpvBuiltInFragCoord:
 nir_var->data.pixel_center_integer = b->pixel_center_integer;
 /* fallthrough */
  case SpvBuiltInSamplePosition:
 nir_var->data.origin_upper_left = b->origin_upper_left;
 break;

This change will be important for ARB_gl_spirv which is meant to
support OriginLowerLeft.
---
 src/compiler/spirv/vtn_variables.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/compiler/spirv/vtn_variables.c 
b/src/compiler/spirv/vtn_variables.c
index 9679ff6526c..fd8ab7f247a 100644
--- a/src/compiler/spirv/vtn_variables.c
+++ b/src/compiler/spirv/vtn_variables.c
@@ -1419,11 +1419,11 @@ apply_var_decoration(struct vtn_builder *b, 
nir_variable *nir_var,
   case SpvBuiltInTessLevelInner:
  nir_var->data.compact = true;
  break;
-  case SpvBuiltInSamplePosition:
- nir_var->data.origin_upper_left = b->origin_upper_left;
- /* fallthrough */
   case SpvBuiltInFragCoord:
  nir_var->data.pixel_center_integer = b->pixel_center_integer;
+ /* fallthrough */
+  case SpvBuiltInSamplePosition:
+ nir_var->data.origin_upper_left = b->origin_upper_left;
  break;
   default:
  break;
-- 
2.14.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] NIR issue with SPIRV ops that have operands with different bit-size

2018-04-24 Thread Neil Roberts
Rob Clark  writes:

> Karol hit the same thing (with for example, shift instructions) with
> the work for spv compute/kernel support.  I *think* the number of
> special cases isn't too high, so probably vtn should just insert the
> appropriate conversion instruction (ie. u2u32, etc) so that if the src
> bitsize is incorrect it will be converted.

For what it’s worth, I also added a conversion instruction like this for
the Refract GLSL extension opcode because its eta argument can have a
different size:

https://github.com/Igalia/mesa/commit/608d70bc02a968ee2b21a5db0f54

Regards,
- Neil


signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] spirv: Don’t check for NaN for most OpFOrd* comparisons

2018-04-24 Thread Neil Roberts
For all of the OpFOrd* comparisons except OpFOrdNotEqual the hardware
should probably already return false if one of the operands is NaN so
we don’t need to have an explicit check for it. This seems to at least
work on Intel hardware. This should reduce the number of instructions
generated for the most common comparisons.

For what it’s worth, the original code to handle this was added in
e062eb6415de3a. The commit message for that says that it was to fix
some CTS tests for OpFUnord* opcodes. Even if the hardware doesn’t
handle NaNs this patch shouldn’t affect those tests. At any rate they
have since been moved out of the mustpass list. Incidentally those
tests fail on the nvidia proprietary driver so it doesn’t seem like
handling NaNs correctly is a priority.
---

I made a VkRunner test case for all of the OpFOrd* and OpFUnord*
opcodes with and without NaNs on the test branch. It can be run like
this:

git clone -b tests https://github.com/Igalia/vkrunner.git
cd vkrunner
./autogen.sh && make -j8
./src/vkrunner examples/unordered-comparison.shader_test

 src/compiler/spirv/vtn_alu.c | 17 ++---
 1 file changed, 6 insertions(+), 11 deletions(-)

diff --git a/src/compiler/spirv/vtn_alu.c b/src/compiler/spirv/vtn_alu.c
index 71e743cdd1e..3134849ba90 100644
--- a/src/compiler/spirv/vtn_alu.c
+++ b/src/compiler/spirv/vtn_alu.c
@@ -597,23 +597,18 @@ vtn_handle_alu(struct vtn_builder *b, SpvOp opcode,
   break;
}
 
-   case SpvOpFOrdEqual:
-   case SpvOpFOrdNotEqual:
-   case SpvOpFOrdLessThan:
-   case SpvOpFOrdGreaterThan:
-   case SpvOpFOrdLessThanEqual:
-   case SpvOpFOrdGreaterThanEqual: {
+   case SpvOpFOrdNotEqual: {
+  /* For all the SpvOpFOrd* comparisons apart from NotEqual, the value
+   * from the ALU will probably already be false if the operands are not
+   * ordered so we don’t need to handle it specially.
+   */
   bool swap;
   unsigned src_bit_size = glsl_get_bit_size(vtn_src[0]->type);
   unsigned dst_bit_size = glsl_get_bit_size(type);
   nir_op op = vtn_nir_alu_op_for_spirv_opcode(b, opcode, ,
   src_bit_size, dst_bit_size);
 
-  if (swap) {
- nir_ssa_def *tmp = src[0];
- src[0] = src[1];
- src[1] = tmp;
-  }
+  assert(!swap);
 
   val->ssa->def =
  nir_iand(>nb,
-- 
2.14.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/6] nir/spirv: fix MSVC warning in vtn_align_u32()

2018-03-30 Thread Neil Roberts
Patches 2-6 are:

Reviewed-by: Neil Roberts <nrobe...@igalia.com>

Thanks a lot for fixing this.

Regards,
- Neil

Brian Paul <bri...@vmware.com> writes:

> Fixes warning that "negation of an unsigned value results in an
> unsigned value".
> ---
>  src/compiler/spirv/vtn_private.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/compiler/spirv/vtn_private.h 
> b/src/compiler/spirv/vtn_private.h
> index d8a00f9..269de92 100644
> --- a/src/compiler/spirv/vtn_private.h
> +++ b/src/compiler/spirv/vtn_private.h
> @@ -732,7 +732,7 @@ void vtn_handle_decoration(struct vtn_builder *b, SpvOp 
> opcode,
>  static inline uint32_t
>  vtn_align_u32(uint32_t v, uint32_t a)
>  {
> -   assert(a != 0 && a == (a & -a));
> +   assert(a != 0 && a == (a & -((int32_t) a)));
> return (v + a - 1) & ~(a - 1);
>  }
>  
> -- 
> 2.7.4
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] spirv: Fix building with SCons

2018-03-30 Thread Neil Roberts
The SCons build broke with commit ba975140d3c9 because a SPIR-V
function is called from Mesa main. This adds a convenience library for
SPIR-V and adds it to everything that was including nir. It also adds
both nir and spirv to drivers/x11/SConscript.

---

It would be great if someone who depends on SCons could test this. We
weren’t sure if there was a good reason why nir isn’t already included
in drivers/x11/SConscript and whether adding it might break some
platform.

 src/compiler/Makefile.nir.am  |  3 +-
 src/compiler/SConscript   |  1 +
 src/compiler/SConscript.spirv | 54 +++
 src/gallium/targets/dri/SConscript|  1 +
 src/gallium/targets/haiku-softpipe/SConscript |  1 +
 src/gallium/targets/libgl-xlib/SConscript |  1 +
 src/gallium/targets/osmesa/SConscript |  1 +
 src/mesa/drivers/x11/SConscript   |  2 +
 8 files changed, 63 insertions(+), 1 deletion(-)
 create mode 100644 src/compiler/SConscript.spirv

diff --git a/src/compiler/Makefile.nir.am b/src/compiler/Makefile.nir.am
index 32e4145b70f..27dc129e57b 100644
--- a/src/compiler/Makefile.nir.am
+++ b/src/compiler/Makefile.nir.am
@@ -100,4 +100,5 @@ EXTRA_DIST += \
nir/nir_opt_algebraic.py\
nir/tests \
nir/README \
-   SConscript.nir
+   SConscript.nir \
+   SConstript.spirv
diff --git a/src/compiler/SConscript b/src/compiler/SConscript
index 44509a9a95b..0a0c0737422 100644
--- a/src/compiler/SConscript
+++ b/src/compiler/SConscript
@@ -27,3 +27,4 @@ Export('compiler')
 
 SConscript('SConscript.glsl')
 SConscript('SConscript.nir')
+SConscript('SConscript.spirv')
diff --git a/src/compiler/SConscript.spirv b/src/compiler/SConscript.spirv
new file mode 100644
index 000..49410881d0b
--- /dev/null
+++ b/src/compiler/SConscript.spirv
@@ -0,0 +1,54 @@
+import common
+
+Import('*')
+
+from sys import executable as python_cmd
+
+env = env.Clone()
+
+env.MSVC2013Compat()
+
+env.Prepend(CPPPATH = [
+'#include',
+'#src',
+'#src/mapi',
+'#src/mesa',
+'#src/gallium/include',
+'#src/gallium/auxiliary',
+'#src/compiler/nir',
+'#src/compiler/spirv',
+])
+
+# Make generated headers reachable from the include path.
+env.Prepend(CPPPATH = [Dir('.').abspath, Dir('nir').abspath])
+env.Prepend(CPPPATH = [Dir('.').abspath, Dir('spirv').abspath])
+
+# spirv generated sources
+
+env.CodeGenerate(
+target = 'spirv/spirv_info.c',
+script = 'spirv/spirv_info_c.py',
+source = ['spirv/spirv.core.grammar.json'],
+command = python_cmd + ' $SCRIPT $SOURCE $TARGET'
+)
+
+env.CodeGenerate(
+target = 'spirv/vtn_gather_types.c',
+script = 'spirv/vtn_gather_types_c.py',
+source = ['spirv/spirv.core.grammar.json'],
+command = python_cmd + ' $SCRIPT $SOURCE $TARGET'
+)
+
+# parse Makefile.sources
+source_lists = env.ParseSourceList('Makefile.sources')
+
+spirv_sources = source_lists['SPIRV_FILES']
+spirv_sources += source_lists['SPIRV_GENERATED_FILES']
+
+spirv = env.ConvenienceLibrary(
+target = 'spirv',
+source = spirv_sources,
+)
+
+env.Alias('spirv', spirv)
+Export('spirv')
diff --git a/src/gallium/targets/dri/SConscript 
b/src/gallium/targets/dri/SConscript
index f5c2818d04f..ff6ce3bf4e0 100644
--- a/src/gallium/targets/dri/SConscript
+++ b/src/gallium/targets/dri/SConscript
@@ -45,6 +45,7 @@ env.Prepend(LIBS = [
 mesa,
 glsl,
 nir,
+spirv,
 gallium,
 megadrivers_stub,
 dri_common,
diff --git a/src/gallium/targets/haiku-softpipe/SConscript 
b/src/gallium/targets/haiku-softpipe/SConscript
index f80c167d83b..89792fba132 100644
--- a/src/gallium/targets/haiku-softpipe/SConscript
+++ b/src/gallium/targets/haiku-softpipe/SConscript
@@ -10,6 +10,7 @@ env.Prepend(LIBS = [
 mesa,
 glsl,
 nir,
+spirv,
 gallium
 ])
 
diff --git a/src/gallium/targets/libgl-xlib/SConscript 
b/src/gallium/targets/libgl-xlib/SConscript
index a81ac793251..b94ef350b16 100644
--- a/src/gallium/targets/libgl-xlib/SConscript
+++ b/src/gallium/targets/libgl-xlib/SConscript
@@ -33,6 +33,7 @@ env.Prepend(LIBS = [
 mesa,
 glsl,
 nir,
+spirv,
 gallium,
 ])
 
diff --git a/src/gallium/targets/osmesa/SConscript 
b/src/gallium/targets/osmesa/SConscript
index 7be1b48c0b2..ccf7d5170c4 100644
--- a/src/gallium/targets/osmesa/SConscript
+++ b/src/gallium/targets/osmesa/SConscript
@@ -18,6 +18,7 @@ env.Prepend(LIBS = [
 trace,
 glsl,
 nir,
+spirv,
 mesautil,
 softpipe
 ])
diff --git a/src/mesa/drivers/x11/SConscript b/src/mesa/drivers/x11/SConscript
index 59c8df4b3c2..b097dcc5900 100644
--- a/src/mesa/drivers/x11/SConscript
+++ b/src/mesa/drivers/x11/SConscript
@@ -21,6 +21,8 @@ env.Prepend(LIBS = [
 compiler,
 glsl,
 mesa,
+spirv,
+nir,
 ])
 
 sources = [
-- 
2.14.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org

[Mesa-dev] [PATCH 1/3] spirv: Handle location decorations on block interface members

2018-03-28 Thread Neil Roberts
Previously the code was taking any location decoration on the block
and using that to calculate the member locations for all of the
members. I think this was assuming that there would only be one
location decoration for the entire block. According to the Vulkan spec
it is possible to add location decorations to individual members:

“If the structure type is a Block but without a Location, then each of
 its members must have a Location decoration. If it is a Block with a
 Location decoration, then its members are assigned consecutive
 locations in declaration order, starting from the first member which
 is initially the Block. Any member with its own Location decoration
 is assigned that location. Each remaining member is assigned the
 location after the immediately preceding member in declaration
 order.”

This patch makes it instead keep track of which members have been
assigned an explicit location. It also has a space to store the
location for the struct as a whole. Once all the decorations have been
processed it iterates over each member to fill in the missing
locations using the rules described above.
---

There are tests for this on the tests branch of VkRunner:

https://github.com/Igalia/vkrunner/tree/tests

./src/vkrunner examples/block-*.shader_test

 src/compiler/spirv/vtn_private.h   |  6 
 src/compiler/spirv/vtn_variables.c | 60 ++
 2 files changed, 54 insertions(+), 12 deletions(-)

diff --git a/src/compiler/spirv/vtn_private.h b/src/compiler/spirv/vtn_private.h
index 70f660fbd48..c268b093656 100644
--- a/src/compiler/spirv/vtn_private.h
+++ b/src/compiler/spirv/vtn_private.h
@@ -457,6 +457,12 @@ struct vtn_variable {
nir_variable *var;
nir_variable **members;
 
+   /* If the variable is a struct with a location set on it then this will be
+* stored here. This will be used to calculate locations for members that
+* don’t have their own explicit location.
+*/
+   int base_location;
+
int shared_location;
 
/**
diff --git a/src/compiler/spirv/vtn_variables.c 
b/src/compiler/spirv/vtn_variables.c
index b2897407fb1..29e9838d5d2 100644
--- a/src/compiler/spirv/vtn_variables.c
+++ b/src/compiler/spirv/vtn_variables.c
@@ -1541,18 +1541,14 @@ var_decoration_cb(struct vtn_builder *b, struct 
vtn_value *val, int member,
 */
if (dec->decoration == SpvDecorationLocation) {
   unsigned location = dec->literals[0];
-  bool is_vertex_input;
   if (b->shader->info.stage == MESA_SHADER_FRAGMENT &&
   vtn_var->mode == vtn_variable_mode_output) {
- is_vertex_input = false;
  location += FRAG_RESULT_DATA0;
   } else if (b->shader->info.stage == MESA_SHADER_VERTEX &&
  vtn_var->mode == vtn_variable_mode_input) {
- is_vertex_input = true;
  location += VERT_ATTRIB_GENERIC0;
   } else if (vtn_var->mode == vtn_variable_mode_input ||
  vtn_var->mode == vtn_variable_mode_output) {
- is_vertex_input = false;
  location += vtn_var->patch ? VARYING_SLOT_PATCH0 : VARYING_SLOT_VAR0;
   } else {
  vtn_warn("Location must be on input or output variable");
@@ -1565,14 +1561,10 @@ var_decoration_cb(struct vtn_builder *b, struct 
vtn_value *val, int member,
   } else {
  /* This handles the structure member case */
  assert(vtn_var->members);
- unsigned length =
-glsl_get_length(glsl_without_array(vtn_var->type->type));
- for (unsigned i = 0; i < length; i++) {
-vtn_var->members[i]->data.location = location;
-location +=
-   glsl_count_attribute_slots(vtn_var->members[i]->interface_type,
-  is_vertex_input);
- }
+ if (member == -1)
+vtn_var->base_location = location;
+ else
+vtn_var->members[member]->data.location = location;
   }
   return;
} else {
@@ -1756,6 +1748,40 @@ is_per_vertex_inout(const struct vtn_variable *var, 
gl_shader_stage stage)
return false;
 }
 
+static void
+add_missing_member_locations(struct vtn_variable *var,
+ bool is_vertex_input)
+{
+   unsigned length =
+  glsl_get_length(glsl_without_array(var->type->type));
+   int location = var->base_location;
+
+   for (unsigned i = 0; i < length; i++) {
+  /* From the Vulkan spec:
+   *
+   * “If the structure type is a Block but without a Location, then each
+   *  of its members must have a Location decoration.”
+   */
+  assert(var->base_location != -1 ||
+ var->members[i]->data.location != -1);
+
+  /* From the Vulkan spec:
+   *
+   * “Any member with its own Location decoration is assigned that
+   *  location. Each remaining member is assigned the location after the
+   *  immediately preceding member in declaration order.”
+   */
+  if (var->members[i]->data.location != -1)
+   

[Mesa-dev] [PATCH 3/3] spirv: Don’t use special semantics when counting vertex attribute size

2018-03-28 Thread Neil Roberts
Under Vulkan, the double vertex attributes take up the same size
regardless of whether they are vertex inputs or any other stage
interface.
---

There is a test for this on the tests branch of VkRunner:

https://github.com/Igalia/vkrunner/tree/tests

./src/vkrunner examples/double-vertex-input-block.shader_test

 src/compiler/spirv/vtn_variables.c | 9 +++--
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/src/compiler/spirv/vtn_variables.c 
b/src/compiler/spirv/vtn_variables.c
index 29e9838d5d2..269228d486e 100644
--- a/src/compiler/spirv/vtn_variables.c
+++ b/src/compiler/spirv/vtn_variables.c
@@ -1749,8 +1749,7 @@ is_per_vertex_inout(const struct vtn_variable *var, 
gl_shader_stage stage)
 }
 
 static void
-add_missing_member_locations(struct vtn_variable *var,
- bool is_vertex_input)
+add_missing_member_locations(struct vtn_variable *var)
 {
unsigned length =
   glsl_get_length(glsl_without_array(var->type->type));
@@ -1778,7 +1777,7 @@ add_missing_member_locations(struct vtn_variable *var,
 
   location +=
  glsl_count_attribute_slots(var->members[i]->interface_type,
-is_vertex_input);
+false /* is_gl_vertex_input */);
}
 }
 
@@ -1975,9 +1974,7 @@ vtn_create_variable(struct vtn_builder *b, struct 
vtn_value *val,
if ((var->mode == vtn_variable_mode_input ||
 var->mode == vtn_variable_mode_output) &&
var->members) {
-  bool is_vertex_input = (b->shader->info.stage == MESA_SHADER_VERTEX &&
-  var->mode == vtn_variable_mode_input);
-  add_missing_member_locations(var, is_vertex_input);
+  add_missing_member_locations(var);
}
 
if (var->mode == vtn_variable_mode_image ||
-- 
2.14.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/3] glsl_types: Rename parameter of glsl_count_attribute_slots

2018-03-28 Thread Neil Roberts
glsl_count_attribute_slots takes a parameter to specify whether the
type is being used as a vertex input because on GL double attributes
only take up one slot. Vulkan doesn’t make this distinction so this
patch renames the argument to is_gl_vertex_input in order to make it
more clear that it should always be false on Vulkan.
---
 src/compiler/glsl_types.cpp | 16 ++--
 src/compiler/glsl_types.h   |  5 -
 src/compiler/nir_types.cpp  |  4 ++--
 src/compiler/nir_types.h|  2 +-
 4 files changed, 17 insertions(+), 10 deletions(-)

diff --git a/src/compiler/glsl_types.cpp b/src/compiler/glsl_types.cpp
index 9d853caf721..50aac611981 100644
--- a/src/compiler/glsl_types.cpp
+++ b/src/compiler/glsl_types.cpp
@@ -1942,7 +1942,7 @@ glsl_type::std430_size(bool row_major) const
 }
 
 unsigned
-glsl_type::count_attribute_slots(bool is_vertex_input) const
+glsl_type::count_attribute_slots(bool is_gl_vertex_input) const
 {
/* From page 31 (page 37 of the PDF) of the GLSL 1.50 spec:
 *
@@ -1985,7 +1985,7 @@ glsl_type::count_attribute_slots(bool is_vertex_input) 
const
case GLSL_TYPE_DOUBLE:
case GLSL_TYPE_UINT64:
case GLSL_TYPE_INT64:
-  if (this->vector_elements > 2 && !is_vertex_input)
+  if (this->vector_elements > 2 && !is_gl_vertex_input)
  return this->matrix_columns * 2;
   else
  return this->matrix_columns;
@@ -1993,14 +1993,18 @@ glsl_type::count_attribute_slots(bool is_vertex_input) 
const
case GLSL_TYPE_INTERFACE: {
   unsigned size = 0;
 
-  for (unsigned i = 0; i < this->length; i++)
- size += 
this->fields.structure[i].type->count_attribute_slots(is_vertex_input);
+  for (unsigned i = 0; i < this->length; i++) {
+ const glsl_type *member = this->fields.structure[i].type;
+ size += member->count_attribute_slots(is_gl_vertex_input);
+  }
 
   return size;
}
 
-   case GLSL_TYPE_ARRAY:
-  return this->length * 
this->fields.array->count_attribute_slots(is_vertex_input);
+   case GLSL_TYPE_ARRAY: {
+  const glsl_type *element = this->fields.array;
+  return this->length * element->count_attribute_slots(is_gl_vertex_input);
+   }
 
case GLSL_TYPE_SUBROUTINE:
   return 1;
diff --git a/src/compiler/glsl_types.h b/src/compiler/glsl_types.h
index 88f987fabee..61eb36ca5c3 100644
--- a/src/compiler/glsl_types.h
+++ b/src/compiler/glsl_types.h
@@ -366,8 +366,11 @@ public:
 *
 * For vertex shader attributes - doubles only take one slot.
 * For inter-shader varyings - dvec3/dvec4 take two slots.
+*
+* Vulkan doesn’t make this distinction so the argument should always be
+* false.
 */
-   unsigned count_attribute_slots(bool is_vertex_input) const;
+   unsigned count_attribute_slots(bool is_gl_vertex_input) const;
 
/**
 * Alignment in bytes of the start of this type in a std140 uniform
diff --git a/src/compiler/nir_types.cpp b/src/compiler/nir_types.cpp
index 78b66803f08..3969d51c85b 100644
--- a/src/compiler/nir_types.cpp
+++ b/src/compiler/nir_types.cpp
@@ -119,9 +119,9 @@ glsl_get_aoa_size(const struct glsl_type *type)
 
 unsigned
 glsl_count_attribute_slots(const struct glsl_type *type,
-   bool is_vertex_input)
+   bool is_gl_vertex_input)
 {
-   return type->count_attribute_slots(is_vertex_input);
+   return type->count_attribute_slots(is_gl_vertex_input);
 }
 
 const char *
diff --git a/src/compiler/nir_types.h b/src/compiler/nir_types.h
index 5b441af1486..3b890f88a55 100644
--- a/src/compiler/nir_types.h
+++ b/src/compiler/nir_types.h
@@ -72,7 +72,7 @@ unsigned glsl_get_length(const struct glsl_type *type);
 unsigned glsl_get_aoa_size(const struct glsl_type *type);
 
 unsigned glsl_count_attribute_slots(const struct glsl_type *type,
-bool is_vertex_input);
+bool is_gl_vertex_input);
 
 const char *glsl_get_struct_elem_name(const struct glsl_type *type,
   unsigned index);
-- 
2.14.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2 4/4] spirv: Accept doubles in FaceForward, Reflect and Refract

2018-03-21 Thread Neil Roberts
The SPIR-V spec doesn’t specify a size requirement for these and the
equivalent functions in the GLSL spec have explicit alternatives for
doubles. Refract is a little bit more complicated due to the fact that
the final argument is always supposed to be a scalar 32- or 16- bit
float regardless of the other operands. However in practice it seems
there is a bug in glslang that makes it convert the argument to 64-bit
if you actually try to pass it a 32-bit value while the other
arguments are 64-bit. This adds an optional conversion of the final
argument in order to support any type.

These have been tested against the automatically generated tests of
glsl-4.00/execution/built-in-functions using the ARB_gl_spirv branch
which tests it with quite a large range of combinations.

The issue with glslang has been filed here:
https://github.com/KhronosGroup/glslang/issues/1279

v2: Convert the eta operand of Refract from any size in order to make
it eventually cope with 16-bit floats.
---
 src/compiler/spirv/vtn_glsl450.c | 22 ++
 1 file changed, 18 insertions(+), 4 deletions(-)

diff --git a/src/compiler/spirv/vtn_glsl450.c b/src/compiler/spirv/vtn_glsl450.c
index 50783fbfb4d..0cabedf741d 100644
--- a/src/compiler/spirv/vtn_glsl450.c
+++ b/src/compiler/spirv/vtn_glsl450.c
@@ -628,14 +628,14 @@ handle_glsl450_alu(struct vtn_builder *b, enum GLSLstd450 
entrypoint,
case GLSLstd450FaceForward:
   val->ssa->def =
  nir_bcsel(nb, nir_flt(nb, nir_fdot(nb, src[2], src[1]),
-   nir_imm_float(nb, 0.0)),
+   NIR_IMM_FP(nb, 0.0)),
src[0], nir_fneg(nb, src[0]));
   return;
 
case GLSLstd450Reflect:
   /* I - 2 * dot(N, I) * N */
   val->ssa->def =
- nir_fsub(nb, src[0], nir_fmul(nb, nir_imm_float(nb, 2.0),
+ nir_fsub(nb, src[0], nir_fmul(nb, NIR_IMM_FP(nb, 2.0),
   nir_fmul(nb, nir_fdot(nb, src[0], src[1]),
src[1])));
   return;
@@ -645,8 +645,22 @@ handle_glsl450_alu(struct vtn_builder *b, enum GLSLstd450 
entrypoint,
   nir_ssa_def *N = src[1];
   nir_ssa_def *eta = src[2];
   nir_ssa_def *n_dot_i = nir_fdot(nb, N, I);
-  nir_ssa_def *one = nir_imm_float(nb, 1.0);
-  nir_ssa_def *zero = nir_imm_float(nb, 0.0);
+  nir_ssa_def *one = NIR_IMM_FP(nb, 1.0);
+  nir_ssa_def *zero = NIR_IMM_FP(nb, 0.0);
+  /* According to the SPIR-V and GLSL specs, eta is always a float
+   * regardless of the type of the other operands. However in practice it
+   * seems that if you try to pass it a float then glslang will just
+   * promote it to a double and generate invalid SPIR-V. In order to
+   * support a hypothetical fixed version of glslang we’ll promote eta to
+   * double if the other operands are double also.
+   */
+  if (I->bit_size != eta->bit_size) {
+ nir_op conversion_op =
+nir_type_conversion_op(nir_type_float | eta->bit_size,
+   nir_type_float | I->bit_size,
+   nir_rounding_mode_undef);
+ eta = nir_build_alu(nb, conversion_op, eta, NULL, NULL, NULL);
+  }
   /* k = 1.0 - eta * eta * (1.0 - dot(N, I) * dot(N, I)) */
   nir_ssa_def *k =
  nir_fsub(nb, one, nir_fmul(nb, eta, nir_fmul(nb, eta,
-- 
2.14.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2 3/4] spirv: Add a 64-bit implementation of OpIsInf

2018-03-21 Thread Neil Roberts
The only change neccessary is to change the type of the constant used
to compare against.

This has been tested against the arb_gpu_shader_fp64/execution/
fs-isinf-dvec tests using the ARB_gl_spirv branch.

v2: Use nir_imm_floatN_t for the constant.
---
 src/compiler/spirv/vtn_alu.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/src/compiler/spirv/vtn_alu.c b/src/compiler/spirv/vtn_alu.c
index 01be397e271..3226e5b8739 100644
--- a/src/compiler/spirv/vtn_alu.c
+++ b/src/compiler/spirv/vtn_alu.c
@@ -563,10 +563,11 @@ vtn_handle_alu(struct vtn_builder *b, SpvOp opcode,
   val->ssa->def = nir_fne(>nb, src[0], src[0]);
   break;
 
-   case SpvOpIsInf:
-  val->ssa->def = nir_ieq(>nb, nir_fabs(>nb, src[0]),
-  nir_imm_float(>nb, INFINITY));
+   case SpvOpIsInf: {
+  nir_ssa_def *inf = nir_imm_floatN_t(>nb, INFINITY, src[0]->bit_size);
+  val->ssa->def = nir_ieq(>nb, nir_fabs(>nb, src[0]), inf);
   break;
+   }
 
case SpvOpFUnordEqual:
case SpvOpFUnordNotEqual:
-- 
2.14.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/4] spirv: Use nir_imm_floatN_t for constants for GLSL450 builtins

2018-03-21 Thread Neil Roberts
There is an existing macro that is used to choose between either a
float or a double immediate constant based on the bit size of the
first operand to the builtin. This is now changed to use the new
nir_imm_floatN_t helper function to reduce the number of places that
make this decision.
---
 src/compiler/spirv/vtn_glsl450.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/compiler/spirv/vtn_glsl450.c b/src/compiler/spirv/vtn_glsl450.c
index 7d32914d516..50783fbfb4d 100644
--- a/src/compiler/spirv/vtn_glsl450.c
+++ b/src/compiler/spirv/vtn_glsl450.c
@@ -513,7 +513,7 @@ vtn_nir_alu_op_for_spirv_glsl_opcode(struct vtn_builder *b,
}
 }
 
-#define NIR_IMM_FP(n, v) (src[0]->bit_size == 64 ? nir_imm_double(n, v) : 
nir_imm_float(n, v))
+#define NIR_IMM_FP(n, v) (nir_imm_floatN_t(n, v, src[0]->bit_size))
 
 static void
 handle_glsl450_alu(struct vtn_builder *b, enum GLSLstd450 entrypoint,
-- 
2.14.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/4] nir/builder: Add a nir_imm_floatN_t helper

2018-03-21 Thread Neil Roberts
This lets you easily build float immediates just given the bit size.
If we have this single place here to handle this then it will be
easier to add support for 16-bit floats later.
---
 src/compiler/nir/nir_builder.h | 13 +
 1 file changed, 13 insertions(+)

diff --git a/src/compiler/nir/nir_builder.h b/src/compiler/nir/nir_builder.h
index 36e0ae3ac63..32f86249ad3 100644
--- a/src/compiler/nir/nir_builder.h
+++ b/src/compiler/nir/nir_builder.h
@@ -227,6 +227,19 @@ nir_imm_double(nir_builder *build, double x)
return nir_build_imm(build, 1, 64, v);
 }
 
+static inline nir_ssa_def *
+nir_imm_floatN_t(nir_builder *build, double x, unsigned bit_size)
+{
+   switch (bit_size) {
+   case 32:
+  return nir_imm_float(build, x);
+   case 64:
+  return nir_imm_double(build, x);
+   }
+
+   unreachable("unknown float immediate bit size");
+}
+
 static inline nir_ssa_def *
 nir_imm_vec4(nir_builder *build, float x, float y, float z, float w)
 {
-- 
2.14.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 0/4] spirv: Support doubles in some builtin functions

2018-03-21 Thread Neil Roberts
This adds support for doubles in some of the builtin functions. The
last two patches have been posted already and are a v2 based on
Jason’s feedback.

These patches come out of testing using the ARB_gl_spirv branch of
Mesa and Piglit. However they also affect Vulkan and can be tested
with VkRunner using the test branch here:

https://github.com/Igalia/vkrunner/tree/tests

The corresponding tests can be run with:

./src/vkrunner \
 examples/{face-forward,reflect,refract,isinf}-double.shader_test \
 examples/refract-double-exp32.shader_test

Neil Roberts (4):
  nir/builder: Add a nir_imm_floatN_t helper
  spirv: Use nir_imm_floatN_t for constants for GLSL450 builtins
  spirv: Add a 64-bit implementation of OpIsInf
  spirv: Accept doubles in FaceForward, Reflect and Refract

 src/compiler/nir/nir_builder.h   | 13 +
 src/compiler/spirv/vtn_alu.c |  7 ---
 src/compiler/spirv/vtn_glsl450.c | 24 +++-
 3 files changed, 36 insertions(+), 8 deletions(-)

-- 
2.14.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] spirv: Handle doubles when multiplying a mat by a scalar

2018-03-13 Thread Neil Roberts
The code to handle mat multipication by a scalar tries to pick either
imul or fmul depending on whether the matrix is float or integer.
However it was doing this by checking whether the base type is float.
This was making it choose the int path for doubles (and presumably
float16s).

---

This was discovered from running the arb_gpu_shader_fp64 tests through
the GL_ARB_gl_spirv branch. For example:
arb_gpu_shader_fp64@execution@built-in-functions@vs-op-div-dmat4-double.
Note however these tests are hidden behind a glslang bug:

https://github.com/KhronosGroup/glslang/issues/1278

The fix in that issue along with this patch makes them all pass.

There’s also a little test case for this with VkRunner here:

https://github.com/Igalia/vkrunner/blob/tests/examples/dmat-mul-scalar.shader_test

 src/compiler/spirv/vtn_alu.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/compiler/spirv/vtn_alu.c b/src/compiler/spirv/vtn_alu.c
index d0c9e316935..110fcec2a60 100644
--- a/src/compiler/spirv/vtn_alu.c
+++ b/src/compiler/spirv/vtn_alu.c
@@ -142,10 +142,10 @@ mat_times_scalar(struct vtn_builder *b,
 {
struct vtn_ssa_value *dest = vtn_create_ssa_value(b, mat->type);
for (unsigned i = 0; i < glsl_get_matrix_columns(mat->type); i++) {
-  if (glsl_get_base_type(mat->type) == GLSL_TYPE_FLOAT)
- dest->elems[i]->def = nir_fmul(>nb, mat->elems[i]->def, scalar);
-  else
+  if (glsl_base_type_is_integer(glsl_get_base_type(mat->type)))
  dest->elems[i]->def = nir_imul(>nb, mat->elems[i]->def, scalar);
+  else
+ dest->elems[i]->def = nir_fmul(>nb, mat->elems[i]->def, scalar);
}
 
return dest;
-- 
2.14.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] Vulkan version of shader_runner

2018-03-12 Thread Neil Roberts
Hi,

I’ve made a start on a Vulkan version of Piglit’s shader_runner. I
mostly just wanted it as a quick way to compare problems on Vulkan and
GL_ARB_gl_spirv but maybe one day it could become part of a Vulkan
testing suite. I just thought I’d announce it here in case anyone else
finds it useful too.

https://github.com/igalia/vkrunner

- Neil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/2] spirv: Add a 64-bit implementation of OpIsInf

2018-03-08 Thread Neil Roberts
The only change neccessary is to change the type of the constant used
to compare against.

This has been tested against the arb_gpu_shader_fp64/execution/
fs-isinf-dvec tests using the ARB_gl_spirv branch.
---
 src/compiler/spirv/vtn_alu.c | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/src/compiler/spirv/vtn_alu.c b/src/compiler/spirv/vtn_alu.c
index d0c9e316935..2b03a2e4d09 100644
--- a/src/compiler/spirv/vtn_alu.c
+++ b/src/compiler/spirv/vtn_alu.c
@@ -529,10 +529,15 @@ vtn_handle_alu(struct vtn_builder *b, SpvOp opcode,
   val->ssa->def = nir_fne(>nb, src[0], src[0]);
   break;
 
-   case SpvOpIsInf:
-  val->ssa->def = nir_ieq(>nb, nir_fabs(>nb, src[0]),
-  nir_imm_float(>nb, INFINITY));
+   case SpvOpIsInf: {
+  nir_ssa_def *inf;
+  if (src[0]->bit_size == 64)
+ inf = nir_imm_double(>nb, INFINITY);
+  else
+ inf = nir_imm_float(>nb, INFINITY);
+  val->ssa->def = nir_ieq(>nb, nir_fabs(>nb, src[0]), inf);
   break;
+   }
 
case SpvOpFUnordEqual:
case SpvOpFUnordNotEqual:
-- 
2.14.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/2] spirv: Add a 64-bit implementation of Frexp

2018-03-08 Thread Neil Roberts
The implementation is inspired by
lower_instructions_visitor::dfrexp_sig_to_arith.

This has been tested against the arb_gpu_shader_fp64/fs-frexp-dvec4
test using the ARB_gl_spirv branch.
---

Please also see this related patch which I probably should have
bundled in this series:

https://patchwork.freedesktop.org/patch/208702/

 src/compiler/spirv/vtn_glsl450.c | 60 +---
 1 file changed, 56 insertions(+), 4 deletions(-)

diff --git a/src/compiler/spirv/vtn_glsl450.c b/src/compiler/spirv/vtn_glsl450.c
index 46ef40f5e3f..51c4cd271bc 100644
--- a/src/compiler/spirv/vtn_glsl450.c
+++ b/src/compiler/spirv/vtn_glsl450.c
@@ -380,7 +380,7 @@ build_atan2(nir_builder *b, nir_ssa_def *y, nir_ssa_def *x)
 }
 
 static nir_ssa_def *
-build_frexp(nir_builder *b, nir_ssa_def *x, nir_ssa_def **exponent)
+build_frexp32(nir_builder *b, nir_ssa_def *x, nir_ssa_def **exponent)
 {
nir_ssa_def *abs_x = nir_fabs(b, x);
nir_ssa_def *zero = nir_imm_float(b, 0.0f);
@@ -412,6 +412,51 @@ build_frexp(nir_builder *b, nir_ssa_def *x, nir_ssa_def 
**exponent)
  nir_bcsel(b, is_not_zero, exponent_value, zero));
 }
 
+static nir_ssa_def *
+build_frexp64(nir_builder *b, nir_ssa_def *x, nir_ssa_def **exponent)
+{
+   nir_ssa_def *abs_x = nir_fabs(b, x);
+   nir_ssa_def *zero = nir_imm_double(b, 0.0);
+   nir_ssa_def *zero32 = nir_imm_float(b, 0.0f);
+
+   /* Double-precision floating-point values are stored as
+*   1 sign bit;
+*   11 exponent bits;
+*   52 mantissa bits.
+*
+* We only need to deal with the exponent so first we extract the upper 32
+* bits using nir_unpack_64_2x32_split_y.
+*/
+   nir_ssa_def *upper_x = nir_unpack_64_2x32_split_y(b, x);
+   nir_ssa_def *abs_upper_x = nir_unpack_64_2x32_split_y(b, abs_x);
+
+   /* An exponent shift of 20 will shift the remaining mantissa bits out,
+* leaving only the exponent and sign bit (which itself may be zero, if the
+* absolute value was taken before the bitcast and shift.
+*/
+   nir_ssa_def *exponent_shift = nir_imm_int(b, 20);
+   nir_ssa_def *exponent_bias = nir_imm_int(b, -1022);
+
+   nir_ssa_def *sign_mantissa_mask = nir_imm_int(b, 0x800fu);
+
+   /* Exponent of floating-point values in the range [0.5, 1.0). */
+   nir_ssa_def *exponent_value = nir_imm_int(b, 0x3fe0u);
+
+   nir_ssa_def *is_not_zero = nir_fne(b, abs_x, zero);
+
+   *exponent =
+  nir_iadd(b, nir_ushr(b, abs_upper_x, exponent_shift),
+  nir_bcsel(b, is_not_zero, exponent_bias, zero32));
+
+   nir_ssa_def *new_upper =
+  nir_ior(b, nir_iand(b, upper_x, sign_mantissa_mask),
+ nir_bcsel(b, is_not_zero, exponent_value, zero32));
+
+   return nir_pack_64_2x32_split(b,
+ nir_unpack_64_2x32_split_x(b, x),
+ new_upper);
+}
+
 static nir_op
 vtn_nir_alu_op_for_spirv_glsl_opcode(struct vtn_builder *b,
  enum GLSLstd450 opcode)
@@ -685,15 +730,22 @@ handle_glsl450_alu(struct vtn_builder *b, enum GLSLstd450 
entrypoint,
 
case GLSLstd450Frexp: {
   nir_ssa_def *exponent;
-  val->ssa->def = build_frexp(nb, src[0], );
+  if (src[0]->bit_size == 64)
+ val->ssa->def = build_frexp64(nb, src[0], );
+  else
+ val->ssa->def = build_frexp32(nb, src[0], );
   nir_store_deref_var(nb, vtn_nir_deref(b, w[6]), exponent, 0xf);
   return;
}
 
case GLSLstd450FrexpStruct: {
   vtn_assert(glsl_type_is_struct(val->ssa->type));
-  val->ssa->elems[0]->def = build_frexp(nb, src[0],
->ssa->elems[1]->def);
+  if (src[0]->bit_size == 64)
+ val->ssa->elems[0]->def = build_frexp64(nb, src[0],
+ >ssa->elems[1]->def);
+  else
+ val->ssa->elems[0]->def = build_frexp32(nb, src[0],
+ >ssa->elems[1]->def);
   return;
}
 
-- 
2.14.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] spirv: Accept doubles in FaceForward, Reflect and Refract

2018-03-07 Thread Neil Roberts
The SPIR-V spec doesn’t specify a size requirement for these and the
equivalent functions in the GLSL spec have explicit alternatives for
doubles. Refract is a little bit more complicated due to the fact that
the final argument is always supposed to be a scalar 32-bit float
regardless of the other operands. However in practice it seems there
is a bug in glslang that makes it convert the argument to 64-bit if
you actually try to pass it a 32-bit value while the other arguments
are 32-bit. This adds an optional conversion of the final argument in
order to support either type.

These have been tested against the automatically generated tests of
glsl-4.00/execution/built-in-functions using the ARB_gl_spirv branch
which tests it with quite a large range of combinations.

The issue with glslang has been filed here:
https://github.com/KhronosGroup/glslang/issues/1279
---
 src/compiler/spirv/vtn_glsl450.c | 17 +
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/src/compiler/spirv/vtn_glsl450.c b/src/compiler/spirv/vtn_glsl450.c
index 46ef40f5e3f..b098f082f3e 100644
--- a/src/compiler/spirv/vtn_glsl450.c
+++ b/src/compiler/spirv/vtn_glsl450.c
@@ -583,14 +583,14 @@ handle_glsl450_alu(struct vtn_builder *b, enum GLSLstd450 
entrypoint,
case GLSLstd450FaceForward:
   val->ssa->def =
  nir_bcsel(nb, nir_flt(nb, nir_fdot(nb, src[2], src[1]),
-   nir_imm_float(nb, 0.0)),
+   NIR_IMM_FP(nb, 0.0)),
src[0], nir_fneg(nb, src[0]));
   return;
 
case GLSLstd450Reflect:
   /* I - 2 * dot(N, I) * N */
   val->ssa->def =
- nir_fsub(nb, src[0], nir_fmul(nb, nir_imm_float(nb, 2.0),
+ nir_fsub(nb, src[0], nir_fmul(nb, NIR_IMM_FP(nb, 2.0),
   nir_fmul(nb, nir_fdot(nb, src[0], src[1]),
src[1])));
   return;
@@ -600,8 +600,17 @@ handle_glsl450_alu(struct vtn_builder *b, enum GLSLstd450 
entrypoint,
   nir_ssa_def *N = src[1];
   nir_ssa_def *eta = src[2];
   nir_ssa_def *n_dot_i = nir_fdot(nb, N, I);
-  nir_ssa_def *one = nir_imm_float(nb, 1.0);
-  nir_ssa_def *zero = nir_imm_float(nb, 0.0);
+  nir_ssa_def *one = NIR_IMM_FP(nb, 1.0);
+  nir_ssa_def *zero = NIR_IMM_FP(nb, 0.0);
+  /* According to the SPIR-V and GLSL specs, eta is always a float
+   * regardless of the type of the other operands. However in practice it
+   * seems that if you try to pass it a float then glslang will just
+   * promote it to a double and generate invalid SPIR-V. In order to
+   * support a hypothetical fixed version of glslang we’ll promote eta to
+   * double if the other operands are double also.
+   */
+  if (I->bit_size > eta->bit_size)
+ eta = nir_build_alu(nb, nir_op_f2f64, eta, NULL, NULL, NULL);
   /* k = 1.0 - eta * eta * (1.0 - dot(N, I) * dot(N, I)) */
   nir_ssa_def *k =
  nir_fsub(nb, one, nir_fmul(nb, eta, nir_fmul(nb, eta,
-- 
2.14.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] spirv/nir: Fix the stream ID when emitting a primitive or vertex

2018-01-22 Thread Neil Roberts
According to the SPIR-V spec:

“Stream must be an  of a constant instruction with a scalar
 integer type. That constant is the output-primitive stream number.”

The previous code was treating it as an integer literal.
---

This is part of the GL SPIR-V branch to enable streams for transform
feedback but seeing as it is a standalone fix for existing code I
thought it might be worth posting seperately.

 src/compiler/spirv/spirv_to_nir.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/src/compiler/spirv/spirv_to_nir.c 
b/src/compiler/spirv/spirv_to_nir.c
index c6df764682e..c71ed51e4cf 100644
--- a/src/compiler/spirv/spirv_to_nir.c
+++ b/src/compiler/spirv/spirv_to_nir.c
@@ -2983,8 +2983,12 @@ vtn_handle_barrier(struct vtn_builder *b, SpvOp opcode,
nir_intrinsic_instr *intrin =
   nir_intrinsic_instr_create(b->shader, intrinsic_op);
 
-   if (opcode == SpvOpEmitStreamVertex || opcode == SpvOpEndStreamPrimitive)
-  nir_intrinsic_set_stream_id(intrin, w[1]);
+   if (opcode == SpvOpEmitStreamVertex || opcode == SpvOpEndStreamPrimitive) {
+  struct vtn_value *stream_value =
+ vtn_value(b, w[1], vtn_value_type_constant);
+  unsigned stream = stream_value->constant->values[0].u32[0];
+  nir_intrinsic_set_stream_id(intrin, stream);
+   }
 
nir_builder_instr_insert(>nb, >instr);
 }
-- 
2.14.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2] i965/meta-util: Convert the clear color through the surf format

2018-01-11 Thread Neil Roberts
When programming the fast clear color there was previously a chunk of
code to try to make the color match the constraints of the surface
format such as by filling in missing components and handling luminance
formats. These cases are not handled by the hardware. There are some
additional possible restrictions that the hardware does seem to
handle, such as clamping to [0,1] for normalised formats. However for
whatever reason it doesn't clamp to [0,∞] for the special float
formats that don't have a sign bit.

Rather than having special cases for all of these this patch makes it
instead convert the color to the actual surface format and back again
so that we can be sure it will have all of the possible restrictions.
Additionally this would avoid some other potential surprises such as
getting more precision for the clear color when fast clears are used.

Originally this patch was created to fix the following bug, but it is
no longer neccessary since f1fa4be871e13c68b50685aaf64. However I
think this approach is still worthwhile because it has the advantage
of reducing code redundancy.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93338
v2: Rebase and add support for converting int formats
---

This is a rebase of this patch from 2 years ago:

https://patchwork.freedesktop.org/patch/67912/

 src/mesa/drivers/dri/i965/brw_meta_util.c | 107 +++---
 1 file changed, 37 insertions(+), 70 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_meta_util.c 
b/src/mesa/drivers/dri/i965/brw_meta_util.c
index 54dc6a5..ec727eb 100644
--- a/src/mesa/drivers/dri/i965/brw_meta_util.c
+++ b/src/mesa/drivers/dri/i965/brw_meta_util.c
@@ -29,6 +29,8 @@
 #include "main/blend.h"
 #include "main/fbobject.h"
 #include "util/format_srgb.h"
+#include "main/format_pack.h"
+#include "main/format_unpack.h"
 
 /**
  * Helper function for handling mirror image blits.
@@ -345,6 +347,7 @@ brw_meta_convert_fast_clear_color(const struct brw_context 
*brw,
   const struct intel_mipmap_tree *mt,
   const union gl_color_union *color)
 {
+   mesa_format linear_format = _mesa_get_srgb_format_linear(mt->format);
union isl_color_value override_color = {
   .u32 = {
  color->ui[0],
@@ -354,82 +357,46 @@ brw_meta_convert_fast_clear_color(const struct 
brw_context *brw,
   },
};
 
-   /* The sampler doesn't look at the format of the surface when the fast
-* clear color is used so we need to implement luminance, intensity and
-* missing components manually.
-*/
-   switch (_mesa_get_format_base_format(mt->format)) {
-   case GL_INTENSITY:
-  override_color.u32[3] = override_color.u32[0];
-  /* flow through */
-   case GL_LUMINANCE:
-   case GL_LUMINANCE_ALPHA:
-  override_color.u32[1] = override_color.u32[0];
-  override_color.u32[2] = override_color.u32[0];
-  break;
-   default:
-  for (int i = 0; i < 3; i++) {
- if (!_mesa_format_has_color_component(mt->format, i))
-override_color.u32[i] = 0;
-  }
-  break;
-   }
-
-   switch (_mesa_get_format_datatype(mt->format)) {
-   case GL_UNSIGNED_NORMALIZED:
-  for (int i = 0; i < 4; i++)
- override_color.f32[i] = CLAMP(override_color.f32[i], 0.0f, 1.0f);
-  break;
-
-   case GL_SIGNED_NORMALIZED:
-  for (int i = 0; i < 4; i++)
- override_color.f32[i] = CLAMP(override_color.f32[i], -1.0f, 1.0f);
-  break;
-
-   case GL_UNSIGNED_INT:
-  for (int i = 0; i < 4; i++) {
- unsigned bits = _mesa_get_format_bits(mt->format, GL_RED_BITS + i);
- if (bits < 32) {
-uint32_t max = (1u << bits) - 1;
-override_color.u32[i] = MIN2(override_color.u32[i], max);
- }
-  }
-  break;
-
-   case GL_INT:
-  for (int i = 0; i < 4; i++) {
- unsigned bits = _mesa_get_format_bits(mt->format, GL_RED_BITS + i);
- if (bits < 32) {
-int32_t max = (1 << (bits - 1)) - 1;
-int32_t min = -(1 << (bits - 1));
-override_color.i32[i] = CLAMP(override_color.i32[i], min, max);
- }
-  }
-  break;
-
-   case GL_FLOAT:
-  if (!_mesa_is_format_signed(mt->format)) {
- for (int i = 0; i < 4; i++)
-override_color.f32[i] = MAX2(override_color.f32[i], 0.0f);
-  }
-  break;
-   }
-
-   if (!_mesa_format_has_color_component(mt->format, 3)) {
-  if (_mesa_is_format_integer_color(mt->format))
- override_color.u32[3] = 1;
-  else
- override_color.f32[3] = 1.0f;
-   }
-
/* Handle linear to SRGB conversion */
-   if (brw->ctx.Color.sRGBEnabled &&
-   _mesa_get_srgb_format_linear(mt->format) != mt->format) {
+   if (brw->ctx.Color.sRGBEnabled && linear_format != mt->format) {
   for (int i = 0; i < 3; i++) {
  override_color.f32[i] =
 util_format_linear_to_srgb_float(override_color.f32[i]);
   }
}
 
+   union gl_color_union 

[Mesa-dev] [PATCH] mesa: Tidy up the 4.6 section of GL4x.xml

2018-01-05 Thread Neil Roberts
The enums are moved to the top and indented like the rest of the file.
Comments are added to split up the function aliases by corresponding
extension. This should make no functional difference.
---

This was requested by Ian Romanick as part of the review of the patch
to add the aliases for GL_ARB_indirect_parameters.

 src/mapi/glapi/gen/GL4x.xml | 15 ++-
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/src/mapi/glapi/gen/GL4x.xml b/src/mapi/glapi/gen/GL4x.xml
index 4b2703c..cd2e3b8 100644
--- a/src/mapi/glapi/gen/GL4x.xml
+++ b/src/mapi/glapi/gen/GL4x.xml
@@ -67,15 +67,21 @@
 
 
 
+  
+  
+  
+  
+  
+
+  
+
   
 
 
 
   
-  
 
-  
-  
+  
 
   
 
@@ -85,8 +91,7 @@
 
   
 
-  
-  
+  
 
   
-- 
2.9.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2] mesa: Add GL4.6 aliases of functions from GL_ARB_indirect_parameters

2018-01-04 Thread Neil Roberts
---

This is just a rebase of the patch with a minor conflict resolution.

 src/mapi/glapi/gen/GL4x.xml | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/src/mapi/glapi/gen/GL4x.xml b/src/mapi/glapi/gen/GL4x.xml
index 0a80941..4b2703c 100644
--- a/src/mapi/glapi/gen/GL4x.xml
+++ b/src/mapi/glapi/gen/GL4x.xml
@@ -84,6 +84,28 @@
 
 
   
+
+  
+  
+
+  
+
+
+
+
+
+  
+
+  
+
+
+
+
+
+
+  
 
 
 
-- 
2.9.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 0/9] Fix shader_draw_parameters CTS tests

2017-11-30 Thread Neil Roberts
Kenneth Graunke  writes:

> Neil, in case you were wondering, I suggested the above organization
> of vertex elements because it would let us only upload 1 in the common
> case. Looking in shader-db, there are 3579 shaders that use
> gl_InstanceID, 186 shaders that use gl_VertexID, and 0 shaders that
> use gl_BaseVertex, gl_BaseInstance, or gl_DrawID.
>
> It looks like your patches kicked gl_BaseVertex off to VE2 instead of
> gl_BaseInstance. I suppose that works too, I just figured that keeping
> VertexID/FirstVertex/BaseVertex together would make sense. If it's
> more convenient to move gl_BaseVertex, I suppose I'm fine with that
> too...

Yes, Antia forwarded me your emails with the previous suggestion. The
order we came up with for this patch series is:

VE 1: 
VE 2: 

The “offset for vertex ID” corresponds to what we were previously
(incorrectly) sending for gl_BaseVertex, so effectively the values in
VE1 have not changed at all. This also means we can continue to directly
take these two values from the indirect draw params buffer. I believe
your previous suggestion ended up needing a register store to put the
values in the right place so with this ordering we get to avoid that.
VE1 now contains everything needed for the most common values VertexID
and InstanceID. VE2 is only needed for the rare cases that use gl_DrawID
or gl_BaseVertex. On Vulkan VE2 won’t even be needed for the BaseVertex
builtin because the semantics are different and in that case it
corresponds to “offset for vertex ID”.

I haven’t looked into too much detail about reordering the patches yet,
but it does sound reasonable to try and avoid intermediate breakages.

Thanks for looking at the series.

Regards,
- Neil


signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/9] compiler: Add new system value SYSTEM_VALUE_BASE_VERTEX_ID

2017-11-30 Thread Neil Roberts
Kenneth Graunke  writes:

> We have a number of similar names now:
>
>SYSTEM_VALUE_BASE_VERTEX
>SYSTEM_VALUE_BASE_VERTEX_ID
>SYSTEM_VALUE_VERTEX_ID
>SYSTEM_VALUE_VERTEX_ID_ZERO_BASE
>
> BASE_VERTEX and BASE_VERTEX_ID are really similar names, and honestly
> either one seems like it could be the name for gl_BaseVertex.  I'm
> afraid it would be easy to mix them up by mistake.  IMHO, it would be
> nice to pick a different word, just to keep some distinction between
> the two fairly related concepts...
>
> Perhaps SYSTEM_VALUE_FIRST_VERTEX...?  That's only half the meaning,
> but it at least uses a different word, and makes you think "do I want
> BASE_VERTEX or FIRST_VERTEX?"

Yes, naming this thing is really difficult. I’m not sure if you noticed,
but for Vulkan, the BaseVertex builtin should actually have the value of
BASE_VERTEX_ID unlike GL. So if we rename BASE_VERTEX to something
without “base vertex” in it then it will still be confusing for Vulkan.
So effectively the descriptive names are like:

SYSTEM_VALUE_BASE_VERTEX_ON_GL_BUT_NOT_VULKAN
SYSTEM_VALUE_BASE_VERTEX_ON_VULKAN_OR_OFFSET_FOR_VERTEX_ID_ON_GL

I’m not sure whether that’s enough of an argument against FIRST_VERTEX
though, so personally I don’t really mind either way.

Antia, what do you think?

Regards,
- Neil


signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] mesa: Add GL4.6 aliases of functions from GL_ARB_indirect_parameters

2017-11-24 Thread Neil Roberts
---
 src/mapi/glapi/gen/GL4x.xml | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/src/mapi/glapi/gen/GL4x.xml b/src/mapi/glapi/gen/GL4x.xml
index 88dba5c..ea28d8e 100644
--- a/src/mapi/glapi/gen/GL4x.xml
+++ b/src/mapi/glapi/gen/GL4x.xml
@@ -73,6 +73,28 @@
 
   
   
+
+  
+  
+
+  
+
+
+
+
+
+  
+
+  
+
+
+
+
+
+
+  
 
 
 
-- 
2.9.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/2] st/glsl_to_tgsi: Add support for SYSTEM_VALUE_BASE_VERTEX_ID

2017-11-22 Thread Neil Roberts
SYSTEM_VALUE_BASE_VERTEX has changed to be the correct value for
gl_BaseVertex, which means it will be zero when used with a
non-indexed call. The new BASE_VERTEX_ID value can be used as before
as an offset to calculate a value for gl_VertexID. These values should
be different, but this patch just makes them same for now in order to
at least retain the previous behaviour and not break gl_BaseVertexID
and gl_VertexID entirely on radeonsi.

Note, this hasn’t been tested apart from to verify that it compiles.
---
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 5 +
 1 file changed, 5 insertions(+)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 0772b73..3dfed19 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -5385,6 +5385,11 @@ _mesa_sysval_to_semantic(unsigned sysval)
case SYSTEM_VALUE_VERTEX_ID_ZERO_BASE:
   return TGSI_SEMANTIC_VERTEXID_NOBASE;
case SYSTEM_VALUE_BASE_VERTEX:
+   case SYSTEM_VALUE_BASE_VERTEX_ID:
+  /* FIXME: These two values are actually supposed to be different. The
+   * one used for gl_BaseVertex is supposed to be zero when a non-indexed
+   * draw call is used.
+   */
   return TGSI_SEMANTIC_BASEVERTEX;
case SYSTEM_VALUE_BASE_INSTANCE:
   return TGSI_SEMANTIC_BASEINSTANCE;
-- 
2.9.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/2] freedreno: Update to handle rename of the base vertex ID intrinsic

2017-11-22 Thread Neil Roberts
The old intrinsic called base_vertex that is used to add to
gl_VertexID is now called base_vertex_id so that base_vertex can be
used for the value of gl_BaseVertex, which is different. As far as I
can tell freedreno doesn’t support GL_ARB_shader_draw_parameters so it
won’t need any changes to generate the new base_vertex intrinsic.

I haven’t tested this at all apart from to verify that it compiles.

Cc: Rob Clark 
---
 src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c 
b/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
index da4aeaa..e6fbf45 100644
--- a/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
+++ b/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
@@ -2071,10 +2071,10 @@ emit_intrinsic(struct ir3_context *ctx, 
nir_intrinsic_instr *intr)
ctx->ir->outputs[n] = src[i];
}
break;
-   case nir_intrinsic_load_base_vertex:
+   case nir_intrinsic_load_base_vertex_id:
if (!ctx->basevertex) {
ctx->basevertex = create_driver_param(ctx, 
IR3_DP_VTXID_BASE);
-   add_sysval_input(ctx, SYSTEM_VALUE_BASE_VERTEX,
+   add_sysval_input(ctx, SYSTEM_VALUE_BASE_VERTEX_ID,
ctx->basevertex);
}
dst[0] = ctx->basevertex;
-- 
2.9.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 0/2] Fix radeonsi and freedreno for the BaseVertex patches

2017-11-22 Thread Neil Roberts
Here are two bonus patches to address the problems mentioned that it
might break radeonsi and freedreno. The radeonsi patch doesn’t solve
the problem that the value for gl_BaseVertex is presumably wrong, but
it at least should stop it from breaking gl_VertexID entirely. I
haven’t been able to test either patch so this is more of a request
for comments.

Neil Roberts (2):
  freedreno: Update to handle rename of the base vertex ID intrinsic
  st/glsl_to_tgsi: Add support for SYSTEM_VALUE_BASE_VERTEX_ID

 src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c | 4 ++--
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp   | 5 +
 2 files changed, 7 insertions(+), 2 deletions(-)

-- 
2.9.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 0/9] Fix shader_draw_parameters CTS tests

2017-11-10 Thread Neil Roberts
Just to note, I made some Piglit patches to verify this new behaviour
too:

https://patchwork.freedesktop.org/series/33626/

This might make testing a little easier if someone wants to implement
the behaviour for other drivers too.

- Neil

Antia Puentes <apuen...@igalia.com> writes:

> Hi,
>
> the series sets gl_BaseVertex to zero for gl*DrawArrays* in i965.
> Previously, its value was the value passed as  in the non-indexed
> draw call. This was convinient because the gl_VertexID was calculated as
> gl_VertexIDBaseZero + gl_BaseVertex.
>
> However, as gl_BaseVertex must be zero for non-indexed draw calls, this
> calculation is broken.
>
> Apart from setting the basevertex to zero in i965, the following patches add
> a new VS system value which will contain  or  as needed
> to make the gl_VertexID calculation right.
>
>
> The relevant bits of the OpenGL 4.6 specification involved here are:
>
> * 11.1.3.9 Shader Inputs
>
> "gl_BaseVertex holds the integer value passed to the baseVertex parameter to 
> the
> command that resulted in the current shader invocation. In the case where the
> command has no baseVertex parameter, the value of gl_BaseVertex is zero."
>
> "gl_VertexID holds the integer index i implicitly passed by DrawArrays or
> one of the other drawing commands defined in section 10.4."
>
> * 10.4. Drawing Commands Using Vertex Arrays:
>
>   - Page 352:
> "The index of any element transferred to the GL by DrawArraysOneInstance is
> referred to as its vertex ID, and may be read by a vertex shader as 
> gl_VertexID.
> The vertex ID of the ith element transferred is first + i."
>
>   - Page 355:
> "The index of any element transferred to the GL by DrawElementsOneInstance is
> referred to as its vertex ID, and may be read by a vertex shader as 
> gl_VertexID.
> The vertex ID of the ith element transferred is the sum of basevertex and the
> value stored in the currently bound element array buffer at offset indices + 
> i."
>
> Regards,
> Antia.
>
> Antia Puentes (5):
>   compiler: Add new system value SYSTEM_VALUE_BASE_VERTEX_ID
>   nir: Add SYSTEM_VALUE_BASE_VERTEX_ID instrinsics
>   i965: emit basevertexid as a vertex element component
>   i965: gl_BaseVertex must be zero for non-indexed draw calls
>   intel/compiler: implement the basevertex(id) load intrinsics
>
> Neil Roberts (4):
>   intel/compiler: Add a uses_basevertexid flag
>   nir: Offset vertex_id by base_vertex_id instead of base_vertex
>   i965: Let nir lower gl_VertexID instead of the linker
>   spirv: Lower BaseVertex to BASE_VERTEX_ID instead of BASE_VERTEX
>
>  src/compiler/nir/nir.c|  4 +++
>  src/compiler/nir/nir_gather_info.c|  1 +
>  src/compiler/nir/nir_intrinsics.h |  1 +
>  src/compiler/nir/nir_lower_system_values.c|  2 +-
>  src/compiler/shader_enums.c   |  1 +
>  src/compiler/shader_enums.h   | 15 +++
>  src/compiler/spirv/vtn_variables.c|  2 +-
>  src/intel/compiler/brw_compiler.h |  1 +
>  src/intel/compiler/brw_nir.c  | 12 ++---
>  src/intel/compiler/brw_vec4.cpp   | 18 -
>  src/intel/vulkan/genX_cmd_buffer.c|  8 +++---
>  src/intel/vulkan/genX_pipeline.c  |  4 +--
>  src/mesa/drivers/dri/i965/brw_context.c   |  3 ++-
>  src/mesa/drivers/dri/i965/brw_context.h   | 36 ++---
>  src/mesa/drivers/dri/i965/brw_draw.c  | 33 ---
>  src/mesa/drivers/dri/i965/brw_draw_upload.c   | 13 -
>  src/mesa/drivers/dri/i965/genX_state_upload.c | 39 
> +++
>  17 files changed, 132 insertions(+), 61 deletions(-)
>
> -- 
> 2.14.1
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev


signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/3] mesa: rework how we free gl_shader_program_data

2017-11-08 Thread Neil Roberts
This seems like a nice solution to me. I just have a small comment
below.

> -static struct gl_shader_program_data *
> -create_shader_program_data()
> +struct gl_shader_program_data *
> +_mesa_create_shader_program_data()
>  {
> struct gl_shader_program_data *data;
> data = rzalloc(NULL, struct gl_shader_program_data);
> if (data)
>data->RefCount = 1;
>  
> +   data->InfoLog = ralloc_strdup(data, "");
> +

The addition of this line makes the “if (data)” above look odd because
if the rzalloc fails it’s now going to just crash immediately anyway.
I’m not sure what the policy is supposed to be but a quick grep finds
other places in the compiler that seem to assume that ralloc can not
fail, so it might be nice to just remove the if.

Reviewed-by: Neil Roberts <nrobe...@igalia.com>

Regards,
- Neil


signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] mesa/shaderapi: Do a dry run of linking an in-use program

2017-11-07 Thread Neil Roberts
Timothy Arceri  writes:

>> You’re right, I think that would be a better way to handle it. I guess
>> if this was done then you don’t really need the second link. There are
>> several pointers for the uniform state that you would need to keep and I
>> think there is more state than just the uniforms as well. Perhaps pretty
>> much everything that is freed in _mesa_clear_shader_program_data should
>> be kept?
>
> You only need to keep things that are accessed by the backend post 
> linking (such as the uniforms), things that are queried via the api can 
> be trashed as per the spec. I'm pretty sure we don't need most of
> those.

Right, I guess we could keep a small selection of the pointers that are
cleared by _mesa_clear_shader_program_data. However it might be messy to
maintain as it’s likely that someone could add new members to
gl_shader_program_data and forget to update this function. Already just
preserving the neccessary uniform state is a bit fiddly because the
allocation we want to maintain is owned by UniformStorage but it is
accessed via UniformDataSlots in the i965 driver. I’m not exactly sure
how this works in Gallium.

Just to double check, I made a little Piglit test to check using atomic
counters and sure enough it gets a similar Valgrind error and sporadic
failures due to accessing shProg->data->AtomicBuffers, so we would at
least need to conserve that too.

https://github.com/bpeel/piglit/commit/d95701afbb9367ed1e82af27c98f18

Regards,
- Neil


signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] mesa/shaderapi: Do a dry run of linking an in-use program

2017-11-07 Thread Neil Roberts
Timothy Arceri  writes:

> I think I'd rather see this handled by releasing the uniforms only after 
> the second link is successful and using a temp/fallback pointer to hold 
> it until then. We need to do a similar type of thing with shader source 
> and the shader cache e.g [1].
>
> [1] 
> https://cgit.freedesktop.org/mesa/mesa/commit/?id=e5bb4a0b0f4743fa0a0567991dca751ef49a7200

You’re right, I think that would be a better way to handle it. I guess
if this was done then you don’t really need the second link. There are
several pointers for the uniform state that you would need to keep and I
think there is more state than just the uniforms as well. Perhaps pretty
much everything that is freed in _mesa_clear_shader_program_data should
be kept?

An ideal solution might be to refactor this state into two seperate
structs. One would contain everything the user can edit, such as the
list of shaders to link, AttributeBindings, SSO flag etc, and the other
would be everything that is the result of linking. That way the link
could fill in a newly allocated link-state struct and only replace the
old state after a sucessful link. My (perhaps lazy) hunch is that that
could be a rather intrusive refactor and if the only reason to do it is
to fix this obscure corner case then it might not be worth the hassle.

I might try to have a go at this anyway but it’s probably the sort of
thing that would need some discussion first.

Regards,
- Neil


signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] glsl: Transform fb buffers are only active if a variable uses them

2017-11-06 Thread Neil Roberts
The GL spec will soon be revised to clarify that a buffer binding for
a transform feedback buffer is only required if a variable is actually
defined to use the buffer binding point. Previously a declaration for
the default transform buffer would make it require a binding even if
nothing was declared to use the default buffer.

Affects:
KHR-GL44/45.enhanced_layouts.xfb_stride_of_empty_list
KHR-GL44/45.enhanced_layouts.xfb_stride_of_empty_list_and_api
---
 src/compiler/glsl/link_varyings.cpp | 24 +++-
 1 file changed, 15 insertions(+), 9 deletions(-)

diff --git a/src/compiler/glsl/link_varyings.cpp 
b/src/compiler/glsl/link_varyings.cpp
index 66a20a2..e663930 100644
--- a/src/compiler/glsl/link_varyings.cpp
+++ b/src/compiler/glsl/link_varyings.cpp
@@ -1364,7 +1364,6 @@ store_tfeedback_info(struct gl_context *ctx, struct 
gl_shader_program *prog,
   if (has_xfb_qualifiers) {
  for (unsigned j = 0; j < MAX_FEEDBACK_BUFFERS; j++) {
 if (prog->TransformFeedback.BufferStride[j]) {
-   buffers |= 1 << j;
explicit_stride[j] = true;
xfb_prog->sh.LinkedTransformFeedback->Buffers[j].Stride =
   prog->TransformFeedback.BufferStride[j] / 4;
@@ -1389,10 +1388,24 @@ store_tfeedback_info(struct gl_context *ctx, struct 
gl_shader_program *prog,
 num_buffers++;
 buffer_stream_id = -1;
 continue;
- } else if (tfeedback_decls[i].is_varying()) {
+ }
+
+ if (has_xfb_qualifiers) {
+buffer = tfeedback_decls[i].get_buffer();
+ } else {
+buffer = num_buffers;
+ }
+
+ if (tfeedback_decls[i].is_varying()) {
 if (buffer_stream_id == -1)  {
/* First varying writing to this buffer: remember its stream */
buffer_stream_id = (int) tfeedback_decls[i].get_stream_id();
+
+   /* Only mark a buffer as active when there is a varying
+* attached to it. This behaviour is based on a revised version
+* of section 13.2.2 of the GL 4.6 spec.
+*/
+   buffers |= 1 << buffer;
 } else if (buffer_stream_id !=
(int) tfeedback_decls[i].get_stream_id()) {
/* Varying writes to the same buffer from a different stream */
@@ -1408,13 +1421,6 @@ store_tfeedback_info(struct gl_context *ctx, struct 
gl_shader_program *prog,
 }
  }
 
- if (has_xfb_qualifiers) {
-buffer = tfeedback_decls[i].get_buffer();
- } else {
-buffer = num_buffers;
- }
- buffers |= 1 << buffer;
-
  if (!tfeedback_decls[i].store(ctx, prog,
xfb_prog->sh.LinkedTransformFeedback,
buffer, num_buffers, num_outputs,
-- 
2.9.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] mesa/shaderapi: Do a dry run of linking an in-use program

2017-11-02 Thread Neil Roberts
If an in-use program is unsuccessfully linked, the GL spec requires
that the executable and the uniform state of the program should remain
until a new program is bound. Previously this sort of worked in Mesa
except that it would free the uniform state before attempting to link.
At least on i965 this would usually continue to work because it
accesses the uniform values via a dangling pointer. However it was
causing sporadic failures in one of the CTS tests.

To fix it, when an in-use program is about to be linked, it will now
make a copy of the program and first try to link that instead. If it
works it will let the link continue, otherwise it will copy the status
to the new program and abandon the link. That way the link status and
info log is preserved without freeing the uniform state.

This isn’t very efficient but it would probably quite a big overhaul
to fix it properly and it doesn’t seem worth it because I can’t
imagine that any performance-sensitive application would be hitting
this very strange corner case of the API.

Fixes: KHR-GL46.sepshaderobjs.StateInteraction

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102177
---
 src/mesa/main/shaderapi.c | 58 ++-
 1 file changed, 57 insertions(+), 1 deletion(-)

diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
index 7282435..3f7fe02 100644
--- a/src/mesa/main/shaderapi.c
+++ b/src/mesa/main/shaderapi.c
@@ -1134,6 +1134,58 @@ _mesa_compile_shader(struct gl_context *ctx, struct 
gl_shader *sh)
}
 }
 
+static bool
+try_link_in_use_program(struct gl_context *ctx,
+struct gl_shader_program *shProg)
+{
+   struct gl_shader_program *shProgCopy = _mesa_new_shader_program(0);
+   bool ret;
+   int i;
+
+   /* If the program is in use then try linking a copy of the program first.
+* If it won’t work then we’ll copy the link status to the old program and
+* abandon the linking. Otherwise we let it relink as normal. This is done
+* because relinking destroys the uniform state of the program but the GL
+* spec actually requires that the state be preserved when the program is
+* in use and the link fails. This isn’t terribly efficent but in practice
+* it is probably not worth optimising because it’s a rather strange corner
+* case of the spec.
+*
+* GLSL 4.6 spec section 7.3:
+*
+* “If a program object that is active for any shader stage is re-linked
+*  unsuccessfully, the link status will be set to FALSE, but any existing
+*  executables and associated state will remain part of the current
+*  rendering state until a subsequent call to UseProgram,
+*  UseProgramStages, or BindProgramPipeline removes them from use.”
+*/
+
+   for (i = 0; i < shProg->NumShaders; i++)
+  attach_shader(ctx, shProgCopy, shProg->Shaders[i]);
+
+   shProgCopy->BinaryRetreivableHint = shProg->BinaryRetreivableHint;
+   shProgCopy->SeparateShader = shProg->SeparateShader;
+
+   _mesa_glsl_link_shader(ctx, shProgCopy);
+
+   if (shProgCopy->data->LinkStatus) {
+  ret = true;
+   } else {
+  shProg->data->LinkStatus = shProgCopy->data->LinkStatus;
+  shProg->data->Validated = shProgCopy->data->Validated;
+  shProg->data->Version = shProgCopy->data->Version;
+  shProg->IsES = shProgCopy->IsES;
+  ralloc_free(shProg->data->InfoLog);
+  shProg->data->InfoLog =
+ ralloc_strdup(shProg->data, shProgCopy->data->InfoLog);
+
+  ret = false;
+   }
+
+   _mesa_reference_shader_program(ctx, , NULL);
+
+   return ret;
+}
 
 /**
  * Link a program's shaders.
@@ -1159,12 +1211,16 @@ link_program(struct gl_context *ctx, struct 
gl_shader_program *shProg,
}
 
unsigned programs_in_use = 0;
-   if (ctx->_Shader)
+   if (ctx->_Shader) {
   for (unsigned stage = 0; stage < MESA_SHADER_STAGES; stage++) {
  if (ctx->_Shader->CurrentProgram[stage] &&
  ctx->_Shader->CurrentProgram[stage]->Id == shProg->Name) {
 programs_in_use |= 1 << stage;
  }
+  }
+
+  if (programs_in_use && !try_link_in_use_program(ctx, shProg))
+ return;
}
 
FLUSH_VERTICES(ctx, 0);
-- 
2.9.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/3] nir/opt_intrinsics: Fix values for gl_SubGroupG{e, t}MaskARB

2017-10-31 Thread Neil Roberts
Thanks for the review. I’ve pushed this first patch to master and I’ll
drop the rest. I’ve also pushed the piglit patch.

Regards,
- Neil

Jason Ekstrand <ja...@jlekstrand.net> writes:

> On Tue, Oct 31, 2017 at 10:55 AM, Neil Roberts <nrobe...@igalia.com> wrote:
>
>> Previously the values were calculated by just shifting ~0 by the
>> invocation ID. This would end up including bits that are higher than
>> gl_SubGroupSizeARB. The corresponding CTS test effectively requires that
>> these high bits be zero so it was failing. There is a Piglit test as
>> well but this appears to checking the wrong values so it passes.
>>
>> For the two greater-than bitmasks, this patch adds an extra mask with
>> (~0>>(64-gl_SubGroupSizeARB)) to force these bits to zero.
>>
>
> We had a big discussion about this in the office yesterday. :-)  From my
> reading of the GL_ARB_shader_ballot and GL_NV_shader_thread_group specs, it
> looked like the current behavior was correct.  However, I'm very glad to
> see that I was wrong because it means that Vulkan and GL are consistent
> with each other. :)  It'll be a bit painful to rebase my subgroup work on
> top of this but I think I'd prefer it if we land this first as it will
> actually make some things a bit simpler even though it will conflict madly.
>
>
>> Fixes: KHR-GL45.shader_ballot_tests.ShaderBallotBitmasks
>>
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102680#c3
>> Signed-off-by: Neil Roberts <nrobe...@igalia.com>
>> ---
>>  src/compiler/nir/nir_opt_intrinsics.c | 24 ++--
>>  1 file changed, 22 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/compiler/nir/nir_opt_intrinsics.c
>> b/src/compiler/nir/nir_opt_intrinsics.c
>> index 26a0f96..d5fdc51 100644
>> --- a/src/compiler/nir/nir_opt_intrinsics.c
>> +++ b/src/compiler/nir/nir_opt_intrinsics.c
>> @@ -28,6 +28,26 @@
>>   * \file nir_opt_intrinsics.c
>>   */
>>
>> +static nir_ssa_def *
>> +high_subgroup_mask(nir_builder *b,
>> +   nir_ssa_def *count,
>> +   uint64_t base_mask)
>> +{
>> +   /* group_mask could probably be calculated more efficiently but we
>> want to
>> +* be sure not to shift by 64 if the subgroup size is 64 because the
>> GLSL
>> +* shift operator is undefined in that case.
>
>
> Yeah, I'm pretty sure our hardware will just not shift in that case.
>
>
>> In any case if we were worried
>> +* about efficency this should probably be done further down because
>> the
>> +* subgroup size is likely to be known at compile time.
>>
>
> I wouldn't be worried about efficiency.  As I said in an earlier patch,
> this becomes a constant with my series.
>
> Reviewed-by: Jason Ekstrand <ja...@jlekstrand.net>
> Cc: mesa-sta...@lists.freedesktop.org
>
> Please push soon or maybe I'll just push it myself.
>
>
>> +*/
>> +   nir_ssa_def *subgroup_size = nir_load_subgroup_size(b);
>> +   nir_ssa_def *all_bits = nir_imm_int64(b, ~0ull);
>> +   nir_ssa_def *shift = nir_isub(b, nir_imm_int(b, 64), subgroup_size);
>> +   nir_ssa_def *group_mask = nir_ushr(b, all_bits, shift);
>> +   nir_ssa_def *higher_bits = nir_ishl(b, nir_imm_int64(b, base_mask),
>> count);
>> +
>> +   return nir_iand(b, higher_bits, group_mask);
>> +}
>> +
>>  static bool
>>  opt_intrinsics_impl(nir_function_impl *impl)
>>  {
>> @@ -95,10 +115,10 @@ opt_intrinsics_impl(nir_function_impl *impl)
>> replacement = nir_ishl(, nir_imm_int64(, 1ull), count);
>> break;
>>  case nir_intrinsic_load_subgroup_ge_mask:
>> -   replacement = nir_ishl(, nir_imm_int64(, ~0ull),
>> count);
>> +   replacement = high_subgroup_mask(, count, ~0ull);
>> break;
>>  case nir_intrinsic_load_subgroup_gt_mask:
>> -   replacement = nir_ishl(, nir_imm_int64(, ~1ull),
>> count);
>> +   replacement = high_subgroup_mask(, count, ~1ull);
>> break;
>>  case nir_intrinsic_load_subgroup_le_mask:
>> replacement = nir_inot(, nir_ishl(, nir_imm_int64(,
>> ~1ull), count));
>> --
>> 2.9.5
>>
>> ___
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev


signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/3] i965/fs/nir: Don’t let nir lower nir_intrinsic_load_subgroup_all_mask

2017-10-31 Thread Neil Roberts
Instead of letting nir lower nir_intrinsic_load_subgroup_all_mask this
is now generated directly. This is more efficient because it can be
calculated in the compiler based on the dispatch width.

Sadly it’s still not totally ideal because the constant doesn’t seem
to get propagated and there is still a redundant MOV.
---
 src/intel/compiler/brw_compiler.c | 2 +-
 src/intel/compiler/brw_fs_nir.cpp | 7 ++-
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/src/intel/compiler/brw_compiler.c 
b/src/intel/compiler/brw_compiler.c
index 8df0d2e..f02fceb 100644
--- a/src/intel/compiler/brw_compiler.c
+++ b/src/intel/compiler/brw_compiler.c
@@ -57,7 +57,7 @@ static const struct nir_shader_compiler_options 
scalar_nir_options = {
.lower_unpack_snorm_4x8 = true,
.lower_unpack_unorm_2x16 = true,
.lower_unpack_unorm_4x8 = true,
-   .lower_subgroup_all_mask = true,
+   .lower_subgroup_all_mask = false,
.lower_subgroup_masks = true,
.max_subgroup_size = 32,
.max_unroll_iterations = 32,
diff --git a/src/intel/compiler/brw_fs_nir.cpp 
b/src/intel/compiler/brw_fs_nir.cpp
index 9202b0f..b73edc9 100644
--- a/src/intel/compiler/brw_fs_nir.cpp
+++ b/src/intel/compiler/brw_fs_nir.cpp
@@ -4185,7 +4185,12 @@ fs_visitor::nir_emit_intrinsic(const fs_builder , 
nir_intrinsic_instr *instr
   break;
}
 
-   case nir_intrinsic_load_subgroup_all_mask:
+   case nir_intrinsic_load_subgroup_all_mask: {
+  uint32_t mask = ~UINT32_C(0) >> (32 - dispatch_width);
+  bld.MOV(retype(dest, BRW_REGISTER_TYPE_Q), brw_imm_d(mask));
+  break;
+   }
+
case nir_intrinsic_load_subgroup_eq_mask:
case nir_intrinsic_load_subgroup_ge_mask:
case nir_intrinsic_load_subgroup_gt_mask:
-- 
2.9.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/3] nir: Add an intrinsic for a bitmask of the whole subgroup

2017-10-31 Thread Neil Roberts
Similar to nir_intrinsic_load_subgroup_eq_mask and friends, this adds
an intrinsic which contains a bit for every member of the group. This
doesn’t have a corresponding GLSL builtin but it will be used to
calculate nir_intrinsic_load_subgroup_g{t,e}_mask. It has its own nir
option on whether to lower it. The idea is that this should be much
easier to generate than the other masks because it will likely be a
compile-time constant and if so it will generate more efficient code
for the other masks.
---
 src/compiler/nir/nir.h|  1 +
 src/compiler/nir/nir_intrinsics.h |  1 +
 src/compiler/nir/nir_opt_intrinsics.c | 41 +++
 src/intel/compiler/brw_compiler.c |  1 +
 src/intel/compiler/brw_fs_nir.cpp |  1 +
 5 files changed, 36 insertions(+), 9 deletions(-)

diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
index dd833cf..edb02b9 100644
--- a/src/compiler/nir/nir.h
+++ b/src/compiler/nir/nir.h
@@ -1836,6 +1836,7 @@ typedef struct nir_shader_compiler_options {
bool lower_extract_word;
 
bool lower_vote_trivial;
+   bool lower_subgroup_all_mask;
bool lower_subgroup_masks;
 
/**
diff --git a/src/compiler/nir/nir_intrinsics.h 
b/src/compiler/nir/nir_intrinsics.h
index cefd18b..de362a8 100644
--- a/src/compiler/nir/nir_intrinsics.h
+++ b/src/compiler/nir/nir_intrinsics.h
@@ -350,6 +350,7 @@ SYSTEM_VALUE(layer_id, 1, 0, xx, xx, xx)
 SYSTEM_VALUE(view_index, 1, 0, xx, xx, xx)
 SYSTEM_VALUE(subgroup_size, 1, 0, xx, xx, xx)
 SYSTEM_VALUE(subgroup_invocation, 1, 0, xx, xx, xx)
+SYSTEM_VALUE(subgroup_all_mask, 1, 0, xx, xx, xx)
 SYSTEM_VALUE(subgroup_eq_mask, 1, 0, xx, xx, xx)
 SYSTEM_VALUE(subgroup_ge_mask, 1, 0, xx, xx, xx)
 SYSTEM_VALUE(subgroup_gt_mask, 1, 0, xx, xx, xx)
diff --git a/src/compiler/nir/nir_opt_intrinsics.c 
b/src/compiler/nir/nir_opt_intrinsics.c
index d5fdc51..71d79d7 100644
--- a/src/compiler/nir/nir_opt_intrinsics.c
+++ b/src/compiler/nir/nir_opt_intrinsics.c
@@ -29,23 +29,39 @@
  */
 
 static nir_ssa_def *
-high_subgroup_mask(nir_builder *b,
-   nir_ssa_def *count,
-   uint64_t base_mask)
+subgroup_all_mask(nir_builder *b,
+  nir_ssa_def *count)
 {
/* group_mask could probably be calculated more efficiently but we want to
 * be sure not to shift by 64 if the subgroup size is 64 because the GLSL
-* shift operator is undefined in that case. In any case if we were worried
-* about efficency this should probably be done further down because the
-* subgroup size is likely to be known at compile time.
+* shift operator is undefined in that case. In any case if the driver is
+* worried about efficency this should probably be done further down
+* because the subgroup size is likely to be known at compile time.
 */
nir_ssa_def *subgroup_size = nir_load_subgroup_size(b);
nir_ssa_def *all_bits = nir_imm_int64(b, ~0ull);
nir_ssa_def *shift = nir_isub(b, nir_imm_int(b, 64), subgroup_size);
-   nir_ssa_def *group_mask = nir_ushr(b, all_bits, shift);
-   nir_ssa_def *higher_bits = nir_ishl(b, nir_imm_int64(b, base_mask), count);
+   return nir_ushr(b, all_bits, shift);
+}
 
-   return nir_iand(b, higher_bits, group_mask);
+static nir_ssa_def *
+high_subgroup_mask(nir_builder *b,
+   nir_ssa_def *count,
+   uint64_t base_mask)
+{
+   nir_ssa_def *higher_bits = nir_ishl(b, nir_imm_int64(b, base_mask), count);
+   nir_intrinsic_instr *load_group_mask =
+  nir_intrinsic_instr_create(b->shader,
+ nir_intrinsic_load_subgroup_all_mask);
+   load_group_mask->num_components = 1;
+   nir_ssa_dest_init(_group_mask->instr,
+ _group_mask->dest,
+ 1 /* num_components */,
+ 64 /* bit_size */,
+ NULL /* name */);
+   nir_builder_instr_insert(b, _group_mask->instr);
+
+   return nir_iand(b, higher_bits, _group_mask->dest.ssa);
 }
 
 static bool
@@ -100,6 +116,10 @@ opt_intrinsics_impl(nir_function_impl *impl)
  nir_imm_int(, 0));
 break;
  }
+ case nir_intrinsic_load_subgroup_all_mask:
+if (!b.shader->options->lower_subgroup_all_mask)
+   break;
+/* flow through */
  case nir_intrinsic_load_subgroup_eq_mask:
  case nir_intrinsic_load_subgroup_ge_mask:
  case nir_intrinsic_load_subgroup_gt_mask:
@@ -111,6 +131,9 @@ opt_intrinsics_impl(nir_function_impl *impl)
 nir_ssa_def *count = nir_load_subgroup_invocation();
 
 switch (intrin->intrinsic) {
+case nir_intrinsic_load_subgroup_all_mask:
+   replacement = subgroup_all_mask(, count);
+   break;
 case nir_intrinsic_load_subgroup_eq_mask:
replacement = nir_ishl(, nir_imm_int64(, 1ull), count);
break;
diff --git 

[Mesa-dev] [PATCH 1/3] nir/opt_intrinsics: Fix values for gl_SubGroupG{e, t}MaskARB

2017-10-31 Thread Neil Roberts
Previously the values were calculated by just shifting ~0 by the
invocation ID. This would end up including bits that are higher than
gl_SubGroupSizeARB. The corresponding CTS test effectively requires that
these high bits be zero so it was failing. There is a Piglit test as
well but this appears to checking the wrong values so it passes.

For the two greater-than bitmasks, this patch adds an extra mask with
(~0>>(64-gl_SubGroupSizeARB)) to force these bits to zero.

Fixes: KHR-GL45.shader_ballot_tests.ShaderBallotBitmasks

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102680#c3
Signed-off-by: Neil Roberts <nrobe...@igalia.com>
---
 src/compiler/nir/nir_opt_intrinsics.c | 24 ++--
 1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/src/compiler/nir/nir_opt_intrinsics.c 
b/src/compiler/nir/nir_opt_intrinsics.c
index 26a0f96..d5fdc51 100644
--- a/src/compiler/nir/nir_opt_intrinsics.c
+++ b/src/compiler/nir/nir_opt_intrinsics.c
@@ -28,6 +28,26 @@
  * \file nir_opt_intrinsics.c
  */
 
+static nir_ssa_def *
+high_subgroup_mask(nir_builder *b,
+   nir_ssa_def *count,
+   uint64_t base_mask)
+{
+   /* group_mask could probably be calculated more efficiently but we want to
+* be sure not to shift by 64 if the subgroup size is 64 because the GLSL
+* shift operator is undefined in that case. In any case if we were worried
+* about efficency this should probably be done further down because the
+* subgroup size is likely to be known at compile time.
+*/
+   nir_ssa_def *subgroup_size = nir_load_subgroup_size(b);
+   nir_ssa_def *all_bits = nir_imm_int64(b, ~0ull);
+   nir_ssa_def *shift = nir_isub(b, nir_imm_int(b, 64), subgroup_size);
+   nir_ssa_def *group_mask = nir_ushr(b, all_bits, shift);
+   nir_ssa_def *higher_bits = nir_ishl(b, nir_imm_int64(b, base_mask), count);
+
+   return nir_iand(b, higher_bits, group_mask);
+}
+
 static bool
 opt_intrinsics_impl(nir_function_impl *impl)
 {
@@ -95,10 +115,10 @@ opt_intrinsics_impl(nir_function_impl *impl)
replacement = nir_ishl(, nir_imm_int64(, 1ull), count);
break;
 case nir_intrinsic_load_subgroup_ge_mask:
-   replacement = nir_ishl(, nir_imm_int64(, ~0ull), count);
+   replacement = high_subgroup_mask(, count, ~0ull);
break;
 case nir_intrinsic_load_subgroup_gt_mask:
-   replacement = nir_ishl(, nir_imm_int64(, ~1ull), count);
+   replacement = high_subgroup_mask(, count, ~1ull);
break;
 case nir_intrinsic_load_subgroup_le_mask:
replacement = nir_inot(, nir_ishl(, nir_imm_int64(, 
~1ull), count));
-- 
2.9.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2 1/3] glsl_parser_extra: Add utility to copy symbols between symbol tables

2017-10-27 Thread Neil Roberts
From: Eduardo Lima Mitev <el...@igalia.com>

Some symbols gathered in the symbols table during parsing are needed
later for the compile and link stages, so they are moved along the
process. Currently, only functions and non-temporary variables are
copied between symbol tables. However, the built-in gl_PerVertex
interface blocks are also needed during the linking stage (the last
step), to match re-declared blocks of inter-stage shaders.

This patch adds a new utility function that will factorize current code
that copies functions and variables between two symbol tables, and in
addition will copy explicitly declared gl_PerVertex blocks too.

The function will be used in a subsequent patch.

v2 (Neil Roberts):
Allow the src symbol table to be NULL and explicitly copy the
gl_PerVertex symbols in case they are not referenced in the exec_list.

Signed-off-by: Eduardo Lima Mitev <el...@igalia.com>
Signed-off-by: Neil Roberts <nrobe...@igalia.com>
---
 src/compiler/glsl/glsl_parser_extras.cpp | 43 
 src/compiler/glsl/glsl_parser_extras.h   |  5 
 2 files changed, 48 insertions(+)

diff --git a/src/compiler/glsl/glsl_parser_extras.cpp 
b/src/compiler/glsl/glsl_parser_extras.cpp
index 51d835b..ec4603d 100644
--- a/src/compiler/glsl/glsl_parser_extras.cpp
+++ b/src/compiler/glsl/glsl_parser_extras.cpp
@@ -1863,6 +1863,49 @@ set_shader_inout_layout(struct gl_shader *shader,
shader->bound_image = state->bound_image_specified;
 }
 
+/* src can be NULL if only the symbols found in the exec_list should be
+ * copied
+ */
+void
+_mesa_glsl_copy_symbols_from_table(struct exec_list *shader_ir,
+   struct glsl_symbol_table *src,
+   struct glsl_symbol_table *dest)
+{
+   foreach_in_list (ir_instruction, ir, shader_ir) {
+  switch (ir->ir_type) {
+  case ir_type_function:
+ dest->add_function((ir_function *) ir);
+ break;
+  case ir_type_variable: {
+ ir_variable *const var = (ir_variable *) ir;
+
+ if (var->data.mode != ir_var_temporary)
+dest->add_variable(var);
+ break;
+  }
+  default:
+ break;
+  }
+   }
+
+   if (src != NULL) {
+  /* Explicitly copy the gl_PerVertex interface definitions because these
+   * are needed to check they are the same during the interstage link.
+   * They can’t necessarily be found via the exec_list because the members
+   * might not be referenced. The GL spec still requires that they match
+   * in that case.
+   */
+  const glsl_type *iface =
+ src->get_interface("gl_PerVertex", ir_var_shader_in);
+  if (iface)
+ dest->add_interface(iface->name, iface, ir_var_shader_in);
+
+  iface = src->get_interface("gl_PerVertex", ir_var_shader_out);
+  if (iface)
+ dest->add_interface(iface->name, iface, ir_var_shader_out);
+   }
+}
+
 extern "C" {
 
 static void
diff --git a/src/compiler/glsl/glsl_parser_extras.h 
b/src/compiler/glsl/glsl_parser_extras.h
index fb35813..2e98bc7 100644
--- a/src/compiler/glsl/glsl_parser_extras.h
+++ b/src/compiler/glsl/glsl_parser_extras.h
@@ -948,6 +948,11 @@ extern int glcpp_preprocess(void *ctx, const char 
**shader, char **info_log,
 extern void _mesa_destroy_shader_compiler(void);
 extern void _mesa_destroy_shader_compiler_caches(void);
 
+extern void
+_mesa_glsl_copy_symbols_from_table(struct exec_list *shader_ir,
+   struct glsl_symbol_table *src,
+   struct glsl_symbol_table *dest);
+
 #ifdef __cplusplus
 }
 #endif
-- 
2.9.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2 3/3] glsl/linker: Check that re-declared, inter-shader built-in blocks match

2017-10-27 Thread Neil Roberts
From: Eduardo Lima Mitev <el...@igalia.com>

>From GLSL 4.5 spec, section "7.1 Built-In Language Variables", page 130 of
the PDF states:

"If multiple shaders using members of a built-in block belonging to
 the same interface are linked together in the same program, they must
 all redeclare the built-in block in the same way, as described in
 section 4.3.9 “Interface Blocks” for interface-block matching, or a
 link-time error will result."

Fixes:
* GL45-CTS.CommonBugs.CommonBug_PerVertexValidation

v2 (Neil Roberts):
Explicitly look for gl_PerVertex in the symbol tables instead of
waiting to find a variable in the interface.

Signed-off-by: Eduardo Lima Mitev <el...@igalia.com>
Signed-off-by: Neil Roberts <nrobe...@igalia.com>
---
 src/compiler/glsl/link_interface_blocks.cpp | 29 +
 1 file changed, 29 insertions(+)

diff --git a/src/compiler/glsl/link_interface_blocks.cpp 
b/src/compiler/glsl/link_interface_blocks.cpp
index 7037c77..510d4f7 100644
--- a/src/compiler/glsl/link_interface_blocks.cpp
+++ b/src/compiler/glsl/link_interface_blocks.cpp
@@ -364,6 +364,35 @@ validate_interstage_inout_blocks(struct gl_shader_program 
*prog,
consumer->Stage != MESA_SHADER_FRAGMENT) ||
   consumer->Stage == MESA_SHADER_GEOMETRY;
 
+   /* Check that block re-declarations of gl_PerVertex are compatible
+* across shaders: From OpenGL Shading Language 4.5, section
+* "7.1 Built-In Language Variables", page 130 of the PDF:
+*
+*"If multiple shaders using members of a built-in block belonging
+* to the same interface are linked together in the same program,
+* they must all redeclare the built-in block in the same way, as
+* described in section 4.3.9 “Interface Blocks” for interface-block
+* matching, or a link-time error will result."
+*
+* This is done explicitly outside of iterating the member variable
+* declarations because it is possible that the variables are not used and
+* so they would have been optimised out.
+*/
+   const glsl_type *consumer_iface =
+  consumer->symbols->get_interface("gl_PerVertex",
+   ir_var_shader_in);
+
+   const glsl_type *producer_iface =
+  producer->symbols->get_interface("gl_PerVertex",
+   ir_var_shader_out);
+
+   if (producer_iface && consumer_iface &&
+   interstage_member_mismatch(prog, consumer_iface, producer_iface)) {
+  linker_error(prog, "Incompatible or missing gl_PerVertex re-declaration "
+   "in consecutive shaders");
+  return;
+   }
+
/* Add output interfaces from the producer to the symbol table. */
foreach_in_list(ir_instruction, node, producer->ir) {
   ir_variable *var = node->as_variable();
-- 
2.9.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2 2/3] glsl: Use the utility function to copy symbols between symbol tables

2017-10-27 Thread Neil Roberts
From: Eduardo Lima Mitev <el...@igalia.com>

This effectively factorizes a couple of similar routines.

v2 (Neil Roberts): Non-trivial rebase on master

Signed-off-by: Eduardo Lima Mitev <el...@igalia.com>
Signed-off-by: Neil Roberts <nrobe...@igalia.com>
---
 src/compiler/glsl/glsl_parser_extras.cpp | 25 +++--
 src/compiler/glsl/linker.cpp | 16 +++-
 2 files changed, 10 insertions(+), 31 deletions(-)

diff --git a/src/compiler/glsl/glsl_parser_extras.cpp 
b/src/compiler/glsl/glsl_parser_extras.cpp
index ec4603d..1fea4d6 100644
--- a/src/compiler/glsl/glsl_parser_extras.cpp
+++ b/src/compiler/glsl/glsl_parser_extras.cpp
@@ -1979,6 +1979,7 @@ do_late_parsing_checks(struct _mesa_glsl_parse_state 
*state)
 
 static void
 opt_shader_and_create_symbol_table(struct gl_context *ctx,
+   struct glsl_symbol_table *source_symbols,
struct gl_shader *shader)
 {
assert(shader->CompileStatus != compile_failure &&
@@ -2036,22 +2037,8 @@ opt_shader_and_create_symbol_table(struct gl_context 
*ctx,
 * We don't have to worry about types or interface-types here because those
 * are fly-weights that are looked up by glsl_type.
 */
-   foreach_in_list (ir_instruction, ir, shader->ir) {
-  switch (ir->ir_type) {
-  case ir_type_function:
- shader->symbols->add_function((ir_function *) ir);
- break;
-  case ir_type_variable: {
- ir_variable *const var = (ir_variable *) ir;
-
- if (var->data.mode != ir_var_temporary)
-shader->symbols->add_variable(var);
- break;
-  }
-  default:
- break;
-  }
-   }
+   _mesa_glsl_copy_symbols_from_table(shader->ir, source_symbols,
+  shader->symbols);
 }
 
 void
@@ -2088,7 +2075,9 @@ _mesa_glsl_compile_shader(struct gl_context *ctx, struct 
gl_shader *shader,
  return;
 
   if (shader->CompileStatus == compiled_no_opts) {
- opt_shader_and_create_symbol_table(ctx, shader);
+ opt_shader_and_create_symbol_table(ctx,
+NULL, /* source_symbols */
+shader);
  shader->CompileStatus = compile_success;
  return;
   }
@@ -2149,7 +2138,7 @@ _mesa_glsl_compile_shader(struct gl_context *ctx, struct 
gl_shader *shader,
   lower_subroutine(shader->ir, state);
 
   if (!ctx->Cache || force_recompile)
- opt_shader_and_create_symbol_table(ctx, shader);
+ opt_shader_and_create_symbol_table(ctx, state->symbols, shader);
   else {
  reparent_ir(shader->ir, shader->ir);
  shader->CompileStatus = compiled_no_opts;
diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
index f827b68..0045291 100644
--- a/src/compiler/glsl/linker.cpp
+++ b/src/compiler/glsl/linker.cpp
@@ -1255,21 +1255,11 @@ interstage_cross_validate_uniform_blocks(struct 
gl_shader_program *prog,
  * Populates a shaders symbol table with all global declarations
  */
 static void
-populate_symbol_table(gl_linked_shader *sh)
+populate_symbol_table(gl_linked_shader *sh, glsl_symbol_table *symbols)
 {
sh->symbols = new(sh) glsl_symbol_table;
 
-   foreach_in_list(ir_instruction, inst, sh->ir) {
-  ir_variable *var;
-  ir_function *func;
-
-  if ((func = inst->as_function()) != NULL) {
- sh->symbols->add_function(func);
-  } else if ((var = inst->as_variable()) != NULL) {
- if (var->data.mode != ir_var_temporary)
-sh->symbols->add_variable(var);
-  }
-   }
+   _mesa_glsl_copy_symbols_from_table(sh->ir, symbols, sh->symbols);
 }
 
 
@@ -2288,7 +2278,7 @@ link_intrastage_shaders(void *mem_ctx,
 
link_bindless_layout_qualifiers(prog, shader_list, num_shaders);
 
-   populate_symbol_table(linked);
+   populate_symbol_table(linked, shader_list[0]->symbols);
 
/* The pointer to the main function in the final linked shader (i.e., the
 * copy of the original shader that contained the main function).
-- 
2.9.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2 0/3] Fix for PerVertexValidation CTS test

2017-10-27 Thread Neil Roberts
Hi,

Here is a v2 of Eduardo’s patches to fix the PerVertexValidation CTS
test. They were originally posted here:

https://lists.freedesktop.org/archives/mesa-dev/2017-March/146667.html

The patches needed a non-trivial rebase. The previous discussion was
left with a note saying that there are more cases that should fail but
that weren’t handled by the patches and aren’t checked by the CTS
test. Ie, if the consuming shader doesn’t reference any of the members
of gl_PerVertex then it wouldn’t find the block redefinition. This v2
addresses that by directly fetching the glsl_type for the gl_PerVertex
block from the symbol table rather waiting for an instruction which
references it.

I’ve tested it with Piglit and the public CTS repo and it doesn’t
cause any regressions, and of course it fixes the CTS test.

It might be nice to add a Piglit or CTS test to check this missing
case.

- Neil

Eduardo Lima Mitev (3):
  glsl_parser_extra: Add utility to copy symbols between symbol tables
  glsl: Use the utility function to copy symbols between symbol tables
  glsl/linker: Check that re-declared, inter-shader built-in blocks
match

 src/compiler/glsl/glsl_parser_extras.cpp| 68 +
 src/compiler/glsl/glsl_parser_extras.h  |  5 +++
 src/compiler/glsl/link_interface_blocks.cpp | 29 
 src/compiler/glsl/linker.cpp| 16 ++-
 4 files changed, 87 insertions(+), 31 deletions(-)

-- 
2.9.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 2/3] vulkan/wsi: Report the correct min/maxImageCount

2017-10-02 Thread Neil Roberts
Jason Ekstrand  writes:

> I wish...  Unfortunately, the spec says:
>
> Let *n* be the total number of images in the swapchain, *m* be the
> value of VkSurfaceCapabilitiesKHR::minImageCount, and *a* be the
> number of presentable images that the application has currently
> acquired (i.e. images acquired with vkAcquireNextImageKHR, but not yet
> presented with vkQueuePresentKHR). vkAcquireNextImageKHR *can* always
> succeed if a ≤ n - m at the time vkAcquireNextImageKHR is called.
> vkAcquireNextImageKHR *should* not be called if a > n - m with a
> timeout of UINT64_MAX; in such a case, vkAcquireNextImageKHR *may*
> block indefinitely.
>
> Because this is based on the number of images in the swapchain and the
> VkSurfaceCapabilitiesKHR field, if we return a swapchain with more
> images, n will be larger but not m so that just means that they can
> acquire more images, not that we've reserved more. Arguably, that's a
> spec bug and I'll file one and see where it goes. However, it's
> definitely what the spec says today. :-(

Fair enough. Thanks for the explanation.

Regards,
- Neil

.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 2/3] vulkan/wsi: Report the correct min/maxImageCount

2017-10-01 Thread Neil Roberts
Jason Ekstrand  writes:

> Hey, Neil!

Hey Jason :)

> Yeah... That's a bit unfortunate.  The problem is that we have no way of 
> returning a different number of images depending on the mode.  In theory, 
> we could start out at 2 and return SUBOPTIMAL and force the application to 
> recreate the swapchain with more images until we have enough.  That would 
> be a real pain though...  In not sure what the best option is.

Hm, was there a problem with the previous approach? Ie, in the surface
caps it always returns minImages as two but when it comes to creating
the swapchain if the present mode is MAILBOX then it will go ahead and
create 4 images. The parameter in the create call is also called
“minImages” not “numImages” and the spec implies it’s ok to create more
than were requested. It seems like that would just do the right thing.

Regards,
- Neil

.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 2/3] vulkan/wsi: Report the correct min/maxImageCount

2017-10-01 Thread Neil Roberts
Jason Ekstrand  writes:

> +   /* For true mailbox mode, we need at least 4 images:
> +*  1) One to scan out from
> +*  2) One to have queued for scan-out
> +*  3) One to be currently held by the Wayland compositor
> +*  4) One to render to
> +*/
> caps->minImageCount = 2;
> -   caps->maxImageCount = 4;
> +   /* There is no real maximum */
> +   caps->maxImageCount = 0;

This patch as it was applied seems to leave the minImageCount as 2 on
X11. Is this possibly a mistake? Now it doesn’t match the comment above
it and there’s no explanation of this in the commit message.

It seems a little surprising that it has been changed to 4 in the
Wayland backend. Won’t this mean that every application using Vulkan,
even if it’s just a simple GUI app will end up with four large buffers
just to make the few applications that want true mailbox mode behave
correctly?

Regards,
- Neil

.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/2] intel/compiler: Use emit_emplace in the builders

2017-09-22 Thread Neil Roberts
Replaces all calls of the form emit(instruction(…)) with
emit_emplace(…). This should avoid a redundant copy of the
stack-allocated temporary instruction. Although the temporary would
have been stack-allocated so it might not seem like a big deal, the
instructions do also have an internal allocation for the sources so we
also get to avoid an extra allocation.

I tested this with shader-db over four iterations and the average sum
of the time reported for all of the threads drops from 406 seconds to
366.
---
 src/intel/compiler/brw_fs_builder.h   | 32 
 src/intel/compiler/brw_vec4_builder.h | 28 ++--
 2 files changed, 30 insertions(+), 30 deletions(-)

diff --git a/src/intel/compiler/brw_fs_builder.h 
b/src/intel/compiler/brw_fs_builder.h
index 90d8285..9991dd7 100644
--- a/src/intel/compiler/brw_fs_builder.h
+++ b/src/intel/compiler/brw_fs_builder.h
@@ -275,7 +275,7 @@ namespace brw {
   instruction *
   emit(enum opcode opcode) const
   {
- return emit(instruction(opcode, dispatch_width()));
+ return emit_emplace(opcode, dispatch_width());
   }
 
   /**
@@ -284,7 +284,7 @@ namespace brw {
   instruction *
   emit(enum opcode opcode, const dst_reg ) const
   {
- return emit(instruction(opcode, dispatch_width(), dst));
+ return emit_emplace(opcode, dispatch_width(), dst);
   }
 
   /**
@@ -301,11 +301,11 @@ namespace brw {
  case SHADER_OPCODE_LOG2:
  case SHADER_OPCODE_SIN:
  case SHADER_OPCODE_COS:
-return emit(instruction(opcode, dispatch_width(), dst,
-fix_math_operand(src0)));
+return emit_emplace(opcode, dispatch_width(), dst,
+fix_math_operand(src0));
 
  default:
-return emit(instruction(opcode, dispatch_width(), dst, src0));
+return emit_emplace(opcode, dispatch_width(), dst, src0);
  }
   }
 
@@ -320,12 +320,12 @@ namespace brw {
  case SHADER_OPCODE_POW:
  case SHADER_OPCODE_INT_QUOTIENT:
  case SHADER_OPCODE_INT_REMAINDER:
-return emit(instruction(opcode, dispatch_width(), dst,
-fix_math_operand(src0),
-fix_math_operand(src1)));
+return emit_emplace(opcode, dispatch_width(), dst,
+fix_math_operand(src0),
+fix_math_operand(src1));
 
  default:
-return emit(instruction(opcode, dispatch_width(), dst, src0, 
src1));
+return emit_emplace(opcode, dispatch_width(), dst, src0, src1);
 
  }
   }
@@ -342,14 +342,14 @@ namespace brw {
  case BRW_OPCODE_BFI2:
  case BRW_OPCODE_MAD:
  case BRW_OPCODE_LRP:
-return emit(instruction(opcode, dispatch_width(), dst,
-fix_3src_operand(src0),
-fix_3src_operand(src1),
-fix_3src_operand(src2)));
+return emit_emplace(opcode, dispatch_width(), dst,
+fix_3src_operand(src0),
+fix_3src_operand(src1),
+fix_3src_operand(src2));
 
  default:
-return emit(instruction(opcode, dispatch_width(), dst,
-src0, src1, src2));
+return emit_emplace(opcode, dispatch_width(), dst,
+src0, src1, src2);
  }
   }
 
@@ -361,7 +361,7 @@ namespace brw {
   emit(enum opcode opcode, const dst_reg , const src_reg srcs[],
unsigned n) const
   {
- return emit(instruction(opcode, dispatch_width(), dst, srcs, n));
+ return emit_emplace(opcode, dispatch_width(), dst, srcs, n);
   }
 
   /**
diff --git a/src/intel/compiler/brw_vec4_builder.h 
b/src/intel/compiler/brw_vec4_builder.h
index 413b27c..d802990 100644
--- a/src/intel/compiler/brw_vec4_builder.h
+++ b/src/intel/compiler/brw_vec4_builder.h
@@ -240,7 +240,7 @@ namespace brw {
   instruction *
   emit(enum opcode opcode) const
   {
- return emit(instruction(opcode));
+ return emit_emplace(opcode);
   }
 
   /**
@@ -249,7 +249,7 @@ namespace brw {
   instruction *
   emit(enum opcode opcode, const dst_reg ) const
   {
- return emit(instruction(opcode, dst));
+ return emit_emplace(opcode, dst);
   }
 
   /**
@@ -267,11 +267,11 @@ namespace brw {
  case SHADER_OPCODE_SIN:
  case SHADER_OPCODE_COS:
 return fix_math_instruction(
-   emit(instruction(opcode, dst,
-fix_math_operand(src0;
+   emit_emplace(opcode, dst,
+fix_math_operand(src0)));
 
  default:

[Mesa-dev] [PATCH 1/2] intel/compiler: Add a placement version of emit to the builders

2017-09-22 Thread Neil Roberts
Adds “emit_emplace” to emit an instruction that is constructed inplace
with the parameters given. This will be used instead of the pattern of
doing emit(instruction(…)) because that ends up calling emit(const
instruction&) which implies doing a copy.

The “emplace” terminology comes from std::vector::emplace_back which
does a very similar thing.
---

Sorry if this email ends up getting sent twice. It didn’t seem to work
the first time.

 src/intel/compiler/brw_fs_builder.h   | 14 ++
 src/intel/compiler/brw_vec4_builder.h | 14 ++
 2 files changed, 28 insertions(+)

diff --git a/src/intel/compiler/brw_fs_builder.h 
b/src/intel/compiler/brw_fs_builder.h
index 87394bc..90d8285 100644
--- a/src/intel/compiler/brw_fs_builder.h
+++ b/src/intel/compiler/brw_fs_builder.h
@@ -25,6 +25,7 @@
 #ifndef BRW_FS_BUILDER_H
 #define BRW_FS_BUILDER_H
 
+#include 
 #include "brw_ir_fs.h"
 #include "brw_shader.h"
 
@@ -256,6 +257,19 @@ namespace brw {
   }
 
   /**
+   * Insert an instruction by constructing it directly inplace. This uses
+   * perfect forwarding magic from C++11 to forward the arguments to the
+   * constructor.
+   */
+  template
+  instruction *
+  emit_emplace(Args&&... args) const
+  {
+ return emit(new(shader->mem_ctx)
+ instruction(std::forward(args)...));
+  }
+
+  /**
* Create and insert a nullary control instruction into the program.
*/
   instruction *
diff --git a/src/intel/compiler/brw_vec4_builder.h 
b/src/intel/compiler/brw_vec4_builder.h
index 4c3efe8..413b27c 100644
--- a/src/intel/compiler/brw_vec4_builder.h
+++ b/src/intel/compiler/brw_vec4_builder.h
@@ -25,6 +25,7 @@
 #ifndef BRW_VEC4_BUILDER_H
 #define BRW_VEC4_BUILDER_H
 
+#include 
 #include "brw_ir_vec4.h"
 #include "brw_ir_allocator.h"
 
@@ -221,6 +222,19 @@ namespace brw {
   }
 
   /**
+   * Insert an instruction by constructing it directly inplace. This uses
+   * perfect forwarding magic from C++11 to forward the arguments to the
+   * constructor.
+   */
+  template
+  instruction *
+  emit_emplace(Args&&... args) const
+  {
+ return emit(new(shader->mem_ctx)
+ instruction(std::forward(args)...));
+  }
+
+  /**
* Create and insert a nullary control instruction into the program.
*/
   instruction *
-- 
2.9.5


.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH vulkan] anv/pipeline: Try to create all pipelines

2016-02-26 Thread Neil Roberts
According to the Vulkan 1.0 spec section 9.4:

“When an application attempts to create many pipelines in a single
 command, it is possible that some subset may fail creation. In that
 case, the corresponding entries in the pPipelines output array will
 be filled with VK_NULL_HANDLE values. If any pipeline fails creation
 (for example, due to out of memory errors), the vkCreate*Pipelines
 commands will return an error code. The implementation will attempt
 to create all pipelines, and only return VK_NULL_HANDLE values for
 those that actually failed.”

Previously anv_Create{Graphics,Compute}Pipelines would destroy any
previous pipelines that it created. The problem with this is that if
the application is expecting the driver to behave like the spec then
it may try to free any pipelines that were successfully created by
iterating through the results array. If any of them succeeded then the
pointer will be a stale pointer and the application would probably
crash.
---
 src/intel/vulkan/anv_pipeline.c | 52 +++--
 1 file changed, 29 insertions(+), 23 deletions(-)

diff --git a/src/intel/vulkan/anv_pipeline.c b/src/intel/vulkan/anv_pipeline.c
index 1173b4f..d3a01df 100644
--- a/src/intel/vulkan/anv_pipeline.c
+++ b/src/intel/vulkan/anv_pipeline.c
@@ -1231,24 +1231,28 @@ VkResult anv_CreateGraphicsPipelines(
 const VkAllocationCallbacks*pAllocator,
 VkPipeline* pPipelines)
 {
-   VkResult result = VK_SUCCESS;
+   VkResult this_result;
+   VkResult overall_result = VK_SUCCESS;
 
unsigned i = 0;
for (; i < count; i++) {
-  result = anv_graphics_pipeline_create(_device,
-pipelineCache,
-[i],
-NULL, pAllocator, [i]);
-  if (result != VK_SUCCESS) {
- for (unsigned j = 0; j < i; j++) {
-anv_DestroyPipeline(_device, pPipelines[j], pAllocator);
- }
-
- return result;
+  this_result = anv_graphics_pipeline_create(_device,
+ pipelineCache,
+ [i],
+ NULL,
+ pAllocator,
+ [i]);
+  if (this_result != VK_SUCCESS) {
+ /* According to the spec this should try to create all pipelines and
+  * if any fail then it will set the corresponding pipeline handle to
+  * NULL.
+  */
+ pPipelines[i] = VK_NULL_HANDLE;
+ overall_result = this_result;
   }
}
 
-   return VK_SUCCESS;
+   return overall_result;
 }
 
 static VkResult anv_compute_pipeline_create(
@@ -1287,21 +1291,23 @@ VkResult anv_CreateComputePipelines(
 const VkAllocationCallbacks*pAllocator,
 VkPipeline* pPipelines)
 {
-   VkResult result = VK_SUCCESS;
+   VkResult this_result;
+   VkResult overall_result = VK_SUCCESS;
 
unsigned i = 0;
for (; i < count; i++) {
-  result = anv_compute_pipeline_create(_device, pipelineCache,
-   [i],
-   pAllocator, [i]);
-  if (result != VK_SUCCESS) {
- for (unsigned j = 0; j < i; j++) {
-anv_DestroyPipeline(_device, pPipelines[j], pAllocator);
- }
-
- return result;
+  this_result = anv_compute_pipeline_create(_device, pipelineCache,
+[i],
+pAllocator, [i]);
+  if (this_result != VK_SUCCESS) {
+ /* According to the spec this should try to create all pipelines and
+  * if any fail then it will set the corresponding pipeline handle to
+  * NULL.
+  */
+ pPipelines[i] = VK_NULL_HANDLE;
+ overall_result = this_result;
   }
}
 
-   return VK_SUCCESS;
+   return overall_result;
 }
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 22/22] meta/blit: Use GL_EXT_shader_samples_identical in MSAA-SS resolve blit

2016-02-19 Thread Neil Roberts
Iago Toral  writes:

> I don't know much about this, but using shader_samples_identical
> should only give a benefit if we actually get identical samples,
> otherwise it means more work, right? I noticed that the test renders a
> quad with random colors for each vertex that will be interpolated
> across the region, so could it be that we are hitting few cases of
> identical samples? Shouldn't the results be expected to be better if
> we rendered a flat color instead?

It should all be identical regardless of the color chosen because it is
doing per-pixel shading rather than per-sample which means the fragment
shader is only run once per pixel and the same value is written to all
of the samples. The only case where you get differing samples normally
is at the edges of a polygon where the single color is only written to a
few of the samples.

In this case it is even more identical because the random color is set
on a uniform rather than a vertex so it is actually using the same color
for the whole framebuffer. However, it is rendered as two triangles
which might mean that along the diagonal of the framebuffer where the
two triangles meet there will be a polygon edge which means the hardware
might not recognise that these all the same color, but that should only
be a few pixels and probably won't affect the results very much.

Regards,
- Neil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 22/22] meta/blit: Use GL_EXT_shader_samples_identical in MSAA-SS resolve blit

2016-02-18 Thread Neil Roberts
I made a pathological test case (attached) which repeatedly renders into
an MSAA FBO and then blits it to the screen and measures the framerate.
It checks it with a range of different sample counts. The rendering is
done either by rendering two triangles to fill the framebuffer or by
calling glClear.

The percentage increase in framerate after applying the patch is like
this:

With triangles to fill buffer:

16  62,27%
8   48,59%
4   27,72%
2   5,34%
0   0,58%

With glClear:   

16  -5,20%
8   -7,08%
4   -2,45%
2   -20,76%
0   3,71%

It seems like a pretty convincing win for the triangle case but the
clear case makes it slightly worse. Presumably this is because we don't
do anything to detect the value stored in the MCS buffer when doing a
fast clear so the fast path isn't taken and the shader being more
complicated makes it slower.

Not sure if we want to try and do anything about that because
potentially the cleared pixels aren't very common in a framebuffer from
a real use case so it might not really matter.

Currently we don't use SIMD16 for 16x MSAA because we can't allocate the
registers well enough to make it worthwhile. This patch makes that
problem a bit more interesting because even if we end up spilling a lot
it might still be worth doing SIMD16 because the cases where the spilled
instructions are hit would be much less common.

- Neil

#include 
#include 
#include 
#include 
#include 

#define N_FRAMES_TO_SKIP 1000
#define N_FRAMES_TO_DRAW 1

enum draw_mode {
DRAW_MODE_TRIANGLES,
DRAW_MODE_CLEAR
};

struct data {
SDL_Window *window;
SDL_GLContext gl_context;
};

static SDL_GLContext
create_gl_context(SDL_Window *window)
{
/* First try creating a core context because if we get one it
 * can be more efficient.
 */
SDL_GL_SetAttribute(SDL_GL_CONTEXT_MAJOR_VERSION, 3);
SDL_GL_SetAttribute(SDL_GL_CONTEXT_MINOR_VERSION, 1);
SDL_GL_SetAttribute(SDL_GL_CONTEXT_PROFILE_MASK,
SDL_GL_CONTEXT_PROFILE_CORE);

return SDL_GL_CreateContext(window);
}

static const char
vertex_shader_source[] =
"#version 150\n"
"in vec2 piglit_vertex;\n"
"\n"
"void\n"
"main()\n"
"{\n"
"gl_Position = vec4(piglit_vertex, 0.0, 1.0);\n"
"}\n";

static const char
fragment_shader_source[] =
"#version 150\n"
"uniform vec3 color;\n"
"\n"
"void\n"
"main()\n"
"{\n"
"gl_FragColor = vec4(color, 1.0);\n"
"}\n";

static GLuint
create_shader(const char *name,
  GLenum type,
  const char *source,
  int source_length)
{
GLuint shader;
GLint length, compile_status;
GLsizei actual_length;
GLchar *info_log;

shader = glCreateShader(type);

glShaderSource(shader,
   1, /* n_strings */
   ,
   _length);

glCompileShader(shader);

glGetShaderiv(shader, GL_INFO_LOG_LENGTH, );

if (length > 0) {
info_log = malloc(length);
glGetShaderInfoLog(shader, length,
   _length,
   info_log);
if (*info_log) {
fprintf(stderr,
"Info log for %s:\n%s\n",
name,
info_log);
}
free(info_log);
}

glGetShaderiv(shader, GL_COMPILE_STATUS, _status);

if (!compile_status) {
fprintf(stderr, "%s compilation failed", name);
glDeleteShader(shader);
return 0;
}

return shader;
}

static bool
link_program(GLuint program)
{
GLint length, link_status;
GLsizei actual_length;
GLchar *info_log;

glLinkProgram(program);

glGetProgramiv(program, GL_INFO_LOG_LENGTH, );

if (length > 0) {
info_log = malloc(length);
glGetProgramInfoLog(program, length,
_length,
info_log);
if (*info_log)
fprintf(stderr, "Link info log:\n%s\n", info_log);
free(info_log);
}

glGetProgramiv(program, GL_LINK_STATUS, _status);

if (!link_status) {
fprintf(stderr, "program link failed");
return false;
}

return true;
}

static bool
run_test(struct data *data,
 int n_samples,
 enum draw_mode draw_mode,
 float *result)
{
float verts[] = {
-1.0f, -1.0f,
1.0f, -1.0f,
-1.0f, 1.0f,
1.0f, 1.0f
};
GLuint 

Re: [Mesa-dev] [PATCH 1/4] mesa: gl_NumSamples should always be at least one

2016-02-18 Thread Neil Roberts
Looks good to me. Interestingly we have a test for it in Piglit which
gets run with samples=0 and it is currently passing so presumably it is
broken.

Reviewed-by: Neil Roberts <n...@linux.intel.com>

- Neil

Ilia Mirkin <imir...@alum.mit.edu> writes:

> From ARB_sample_shading:
>
> "gl_NumSamples is the total number of samples in the framebuffer,
>  or one if rendering to a non-multisample framebuffer"
>
> So make sure to always pass in at least 1.
>
> Signed-off-by: Ilia Mirkin <imir...@alum.mit.edu>
> ---
>  src/mesa/program/prog_statevars.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/mesa/program/prog_statevars.c 
> b/src/mesa/program/prog_statevars.c
> index eed2412..489f75f 100644
> --- a/src/mesa/program/prog_statevars.c
> +++ b/src/mesa/program/prog_statevars.c
> @@ -353,7 +353,7 @@ _mesa_fetch_state(struct gl_context *ctx, const 
> gl_state_index state[],
>}
>return;
> case STATE_NUM_SAMPLES:
> -  ((int *)value)[0] = _mesa_geometric_samples(ctx->DrawBuffer);
> +  ((int *)value)[0] = MAX2(1, _mesa_geometric_samples(ctx->DrawBuffer));
>return;
> case STATE_DEPTH_RANGE:
>value[0] = ctx->ViewportArray[0].Near;/* near   */
> -- 
> 2.4.10
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2] Android: fix build break in libmesa_program

2016-02-15 Thread Neil Roberts
Looks good to me. Sorry about breaking it.

Reviewed-by: Neil Roberts <n...@linux.intel.com>

- Neil

Rob Herring <r...@kernel.org> writes:

> Commit 5fd848f6c9ee ("program: Use _mesa_geometric_samples to calculate
> gl_NumSamples") broken Android builds. Add the missing include path "main"
> to framebuffer.h like other includes in prog_statevars.c.
>
> Cc: Neil Roberts <n...@linux.intel.com>
> Cc: Ilia Mirkin <imir...@alum.mit.edu>
> Signed-off-by: Rob Herring <r...@kernel.org>
> ---
> v2: Add main to #include instead of adding the path to Android.mk
>
>  src/mesa/program/prog_statevars.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/mesa/program/prog_statevars.c 
> b/src/mesa/program/prog_statevars.c
> index eed2412..af5fefe 100644
> --- a/src/mesa/program/prog_statevars.c
> +++ b/src/mesa/program/prog_statevars.c
> @@ -40,7 +40,7 @@
>  #include "prog_statevars.h"
>  #include "prog_parameter.h"
>  #include "main/samplerobj.h"
> -#include "framebuffer.h"
> +#include "main/framebuffer.h"
>  
>  
>  #define ONE_DIV_SQRT_LN2 (1.201122408786449815)
> -- 
> 2.5.0
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 4/4] main: Use a derived value for the default sample count

2016-02-04 Thread Neil Roberts
Previously the framebuffer default sample count was taken directly
from the value given by the application. On the i965 driver on HSW if
the value wasn't one that is supported by the hardware it would hit an
assert when it tried to program the state for it. This patch fixes it
by adding a derived sample count to the state for the default
framebuffer. The driver can then quantize this to one of the valid
values in its UpdateState handler when the _NEW_BUFFERS state changes.
_mesa_geometric_samples is changed to use the new derived value.

Fixes the piglit test arb_framebuffer_no_attachments-query

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93957
Cc: Ilia Mirkin 
---
 src/mesa/drivers/dri/i965/brw_context.c | 19 +++
 src/mesa/main/framebuffer.h |  3 ++-
 src/mesa/main/mtypes.h  |  4 
 3 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
b/src/mesa/drivers/dri/i965/brw_context.c
index 1032e5a..44d2fe4 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -167,6 +167,19 @@ intel_viewport(struct gl_context *ctx)
 }
 
 static void
+intel_update_framebuffer(struct gl_context *ctx,
+ struct gl_framebuffer *fb)
+{
+   struct brw_context *brw = brw_context(ctx);
+
+   /* Quantize the derived default number of samples
+*/
+   fb->DefaultGeometry._NumSamples =
+  intel_quantize_num_samples(brw->intelScreen,
+ fb->DefaultGeometry.NumSamples);
+}
+
+static void
 intel_update_state(struct gl_context * ctx, GLuint new_state)
 {
struct brw_context *brw = brw_context(ctx);
@@ -245,6 +258,12 @@ intel_update_state(struct gl_context * ctx, GLuint 
new_state)
}
 
_mesa_lock_context_textures(ctx);
+
+   if (new_state & _NEW_BUFFERS) {
+  intel_update_framebuffer(ctx, ctx->DrawBuffer);
+  if (ctx->DrawBuffer != ctx->ReadBuffer)
+ intel_update_framebuffer(ctx, ctx->ReadBuffer);
+   }
 }
 
 #define flushFront(screen)  ((screen)->image.loader ? 
(screen)->image.loader->flushFrontBuffer : 
(screen)->dri2.loader->flushFrontBuffer)
diff --git a/src/mesa/main/framebuffer.h b/src/mesa/main/framebuffer.h
index ab077ed..fa434d4 100644
--- a/src/mesa/main/framebuffer.h
+++ b/src/mesa/main/framebuffer.h
@@ -97,7 +97,8 @@ static inline GLuint
 _mesa_geometric_samples(const struct gl_framebuffer *buffer)
 {
return buffer->_HasAttachments ?
-  buffer->Visual.samples : buffer->DefaultGeometry.NumSamples;
+  buffer->Visual.samples :
+  buffer->DefaultGeometry._NumSamples;
 }
 
 static inline GLuint
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 58064aa..745eeb8 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -3223,6 +3223,10 @@ struct gl_framebuffer
struct {
  GLuint Width, Height, Layers, NumSamples;
  GLboolean FixedSampleLocations;
+ /* Derived from NumSamples by the driver so that it can choose a valid
+  * value for the hardware.
+  */
+ GLuint _NumSamples;
} DefaultGeometry;
 
/** \name  Drawing bounds (Intersection of buffer size and scissor box)
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/4] program: Use _mesa_geometric_samples to calculate gl_NumSamples

2016-02-04 Thread Neil Roberts
Otherwise it won't take into account the default samples for
framebuffers with no attachments.
---
 src/mesa/program/prog_statevars.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/mesa/program/prog_statevars.c 
b/src/mesa/program/prog_statevars.c
index 12490d0..eed2412 100644
--- a/src/mesa/program/prog_statevars.c
+++ b/src/mesa/program/prog_statevars.c
@@ -40,6 +40,7 @@
 #include "prog_statevars.h"
 #include "prog_parameter.h"
 #include "main/samplerobj.h"
+#include "framebuffer.h"
 
 
 #define ONE_DIV_SQRT_LN2 (1.201122408786449815)
@@ -352,7 +353,7 @@ _mesa_fetch_state(struct gl_context *ctx, const 
gl_state_index state[],
   }
   return;
case STATE_NUM_SAMPLES:
-  ((int *)value)[0] = ctx->DrawBuffer->Visual.samples;
+  ((int *)value)[0] = _mesa_geometric_samples(ctx->DrawBuffer);
   return;
case STATE_DEPTH_RANGE:
   value[0] = ctx->ViewportArray[0].Near;/* near   */
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/4] main: Use _mesa_geometric_samples to calculate GL_SAMPLE_BUFFERS

2016-02-04 Thread Neil Roberts
Otherwise it won't take into account the default samples for
framebuffers with no attachments.
---
 src/mesa/main/get.c  | 3 +++
 src/mesa/main/get_hash_params.py | 2 +-
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
index 3070597..f9b4a3c 100644
--- a/src/mesa/main/get.c
+++ b/src/mesa/main/get.c
@@ -1084,6 +1084,9 @@ find_custom_value(struct gl_context *ctx, const struct 
value_desc *d, union valu
case GL_SAMPLES:
   v->value_int = _mesa_geometric_samples(ctx->DrawBuffer);
   break;
+   case GL_SAMPLE_BUFFERS:
+  v->value_int = _mesa_geometric_samples(ctx->DrawBuffer) > 0;
+  break;
}
 }
 
diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/get_hash_params.py
index dd4aa8c..94b364a 100644
--- a/src/mesa/main/get_hash_params.py
+++ b/src/mesa/main/get_hash_params.py
@@ -80,7 +80,7 @@ descriptor=[
   [ "SAMPLE_COVERAGE_ARB", "CONTEXT_BOOL(Multisample.SampleCoverage), 
NO_EXTRA" ],
   [ "SAMPLE_COVERAGE_VALUE_ARB", 
"CONTEXT_FLOAT(Multisample.SampleCoverageValue), NO_EXTRA" ],
   [ "SAMPLE_COVERAGE_INVERT_ARB", 
"CONTEXT_BOOL(Multisample.SampleCoverageInvert), NO_EXTRA" ],
-  [ "SAMPLE_BUFFERS_ARB", "BUFFER_INT(Visual.sampleBuffers), 
extra_new_buffers" ],
+  [ "SAMPLE_BUFFERS_ARB", "LOC_CUSTOM, TYPE_INT, 0, extra_new_buffers" ],
   [ "SAMPLES_ARB", "LOC_CUSTOM, TYPE_INT, 0, extra_new_buffers" ],
 
 # GL_ARB_sample_shading
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/4] main: Use _mesa_geometric_samples to calculate the value of GL_SAMPLES

2016-02-04 Thread Neil Roberts
Otherwise it won't take into account the default samples for
framebuffers with no attachments.
---
 src/mesa/main/get.c  | 4 
 src/mesa/main/get_hash_params.py | 2 +-
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
index 0434836..3070597 100644
--- a/src/mesa/main/get.c
+++ b/src/mesa/main/get.c
@@ -1080,6 +1080,10 @@ find_custom_value(struct gl_context *ctx, const struct 
value_desc *d, union valu
case GL_DISPATCH_INDIRECT_BUFFER_BINDING:
   v->value_int = ctx->DispatchIndirectBuffer->Name;
   break;
+   /* GL_ARB_multisample */
+   case GL_SAMPLES:
+  v->value_int = _mesa_geometric_samples(ctx->DrawBuffer);
+  break;
}
 }
 
diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/get_hash_params.py
index 04aec03..dd4aa8c 100644
--- a/src/mesa/main/get_hash_params.py
+++ b/src/mesa/main/get_hash_params.py
@@ -81,7 +81,7 @@ descriptor=[
   [ "SAMPLE_COVERAGE_VALUE_ARB", 
"CONTEXT_FLOAT(Multisample.SampleCoverageValue), NO_EXTRA" ],
   [ "SAMPLE_COVERAGE_INVERT_ARB", 
"CONTEXT_BOOL(Multisample.SampleCoverageInvert), NO_EXTRA" ],
   [ "SAMPLE_BUFFERS_ARB", "BUFFER_INT(Visual.sampleBuffers), 
extra_new_buffers" ],
-  [ "SAMPLES_ARB", "BUFFER_INT(Visual.samples), extra_new_buffers" ],
+  [ "SAMPLES_ARB", "LOC_CUSTOM, TYPE_INT, 0, extra_new_buffers" ],
 
 # GL_ARB_sample_shading
   [ "SAMPLE_SHADING_ARB", "CONTEXT_BOOL(Multisample.SampleShading), 
extra_gl40_ARB_sample_shading" ],
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/2] texobj: Fix the completeness checks for cube textures

2016-01-21 Thread Neil Roberts
According to the GL 1.4 spec section 3.8.10, a cubemap texture is only
complete if:

• The level base arrays of each of the six texture images making up
  the cube map have identical, positive, and square dimensions.
• The level base arrays were each specified with the same internal
  format.
• The level base arrays each have the same border width.

Previously the texture completeness code was only checking the first
point. This patch makes it additionally check the other two.

This fixes the following two dEQP tests:

deqp-gles2.functional.texture.completeness.cube.format_mismatch_rgba_rgb_level_0_neg_z
deqp-gles2.functional.texture.completeness.cube.format_mismatch_rgb_rgba_level_0_pos_z

And also the Piglit test posted here:

http://patchwork.freedesktop.org/patch/71333/

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93792
Cc: Ilia Mirkin 
---
 src/mesa/main/texobj.c | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/src/mesa/main/texobj.c b/src/mesa/main/texobj.c
index b107a8f..9ce7b4c 100644
--- a/src/mesa/main/texobj.c
+++ b/src/mesa/main/texobj.c
@@ -769,7 +769,8 @@ _mesa_test_texobj_completeness( const struct gl_context 
*ctx,
}
 
if (t->Target == GL_TEXTURE_CUBE_MAP_ARB) {
-  /* Make sure that all six cube map level 0 images are the same size.
+  /* Make sure that all six cube map level 0 images are the same size and
+   * format.
* Note:  we know that the image's width==height (we enforce that
* at glTexImage time) so we only need to test the width here.
*/
@@ -784,6 +785,15 @@ _mesa_test_texobj_completeness( const struct gl_context 
*ctx,
 incomplete(t, BASE, "Cube face missing or mismatched size");
 return;
  }
+ if (t->Image[face][baseLevel]->InternalFormat !=
+ baseImage->InternalFormat) {
+incomplete(t, BASE, "Cube face format mismatch");
+return;
+ }
+ if (t->Image[face][baseLevel]->Border != baseImage->Border) {
+incomplete(t, BASE, "Cube face border size mismatch");
+return;
+ }
   }
}
 
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/2] texobj: Remove redundant checks that the texture cube faces match size

2016-01-21 Thread Neil Roberts
The texture mipmap completeness checking code was checking whether all
of the faces have the same size. However this is pointless because the
code just above it checks whether the face has the expected size
calculated for the mipmap level anyway so the error condition could
never be reached. This patch just removes it.
---
 src/mesa/main/texobj.c | 10 --
 1 file changed, 10 deletions(-)

diff --git a/src/mesa/main/texobj.c b/src/mesa/main/texobj.c
index 9ce7b4c..e926c7b 100644
--- a/src/mesa/main/texobj.c
+++ b/src/mesa/main/texobj.c
@@ -868,16 +868,6 @@ _mesa_test_texobj_completeness( const struct gl_context 
*ctx,
  img->Depth2);
   return;
}
-
-   /* Extra checks for cube textures */
-   if (face > 0) {
-  /* check that cube faces are the same size */
-  if (img->Width2 != t->Image[0][i]->Width2 ||
-  img->Height2 != t->Image[0][i]->Height2) {
-incomplete(t, MIPMAP, "CubeMap Image[n][i] bad size");
-return;
- }
-   }
 }
  }
 
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965/meta-fast-clear: Convert the clear color through the surf format

2016-01-13 Thread Neil Roberts
Bump. Anyone fancy reviewing this small patch? I think it would be good
to have because it makes the code a bit simpler as well as fixing a
corner case and making it more robust.

- Neil

Neil Roberts <n...@linux.intel.com> writes:

> When programming the fast clear color there was previously a chunk of
> code to try to make the color match the constraints of the surface
> format such as by filling in missing components and handling luminance
> formats. These cases are not handled by the hardware. There are some
> additional possible restrictions that the hardware does seem to
> handle, such as clamping to [0,1] for normalised formats. However for
> whatever reason it doesn't clamp to [0,∞] for the special float
> formats that don't have a sign bit. Rather than adding yet another
> special case for this format this patch makes it instead convert the
> color to the actual surface format and back again so that we can be
> sure it will have all of the possible restrictions. Additionally this
> would avoid some other potential surprises such as getting more
> precision for the clear color when fast clears are used.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93338
> ---
>  src/mesa/drivers/dri/i965/brw_meta_fast_clear.c | 57 
> -
>  1 file changed, 27 insertions(+), 30 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c 
> b/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c
> index cf0e56b..29ae6f0 100644
> --- a/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c
> +++ b/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c
> @@ -37,6 +37,8 @@
>  #include "main/uniforms.h"
>  #include "main/fbobject.h"
>  #include "main/texobj.h"
> +#include "main/format_unpack.h"
> +#include "main/format_pack.h"
>  
>  #include "main/api_validate.h"
>  #include "main/state.h"
> @@ -397,45 +399,40 @@ set_fast_clear_color(struct brw_context *brw,
>   struct intel_mipmap_tree *mt,
>   const union gl_color_union *color)
>  {
> +   mesa_format linear_format = _mesa_get_srgb_format_linear(mt->format);
> union gl_color_union override_color = *color;
> -
> -   /* The sampler doesn't look at the format of the surface when the fast
> -* clear color is used so we need to implement luminance, intensity and
> -* missing components manually.
> -*/
> -   switch (_mesa_get_format_base_format(mt->format)) {
> -   case GL_INTENSITY:
> -  override_color.ui[3] = override_color.ui[0];
> -  /* flow through */
> -   case GL_LUMINANCE:
> -   case GL_LUMINANCE_ALPHA:
> -  override_color.ui[1] = override_color.ui[0];
> -  override_color.ui[2] = override_color.ui[0];
> -  break;
> -   default:
> -  for (int i = 0; i < 3; i++) {
> - if (!_mesa_format_has_color_component(mt->format, i))
> -override_color.ui[i] = 0;
> -  }
> -  break;
> -   }
> -
> -   if (!_mesa_format_has_color_component(mt->format, 3)) {
> -  if (_mesa_is_format_integer_color(mt->format))
> - override_color.ui[3] = 1;
> -  else
> - override_color.f[3] = 1.0f;
> -   }
> +   union gl_color_union tmp_color;
>  
> /* Handle linear→SRGB conversion */
> -   if (brw->ctx.Color.sRGBEnabled &&
> -   _mesa_get_srgb_format_linear(mt->format) != mt->format) {
> +   if (brw->ctx.Color.sRGBEnabled && linear_format != mt->format) {
>for (int i = 0; i < 3; i++) {
>   override_color.f[i] =
>  util_format_linear_to_srgb_float(override_color.f[i]);
>}
> }
>  
> +   /* Convert the clear color to the surface format and back so that the 
> color
> +* returned when sampling is guaranteed to be a value that could be stored
> +* in the surface. For example if the surface is a luminance format and we
> +* clear to 0.5,0.75,0.1,0.2 we want the color to come back as
> +* 0.5,0.5,0.5,1.0. In general the hardware doesn't seem to look at the
> +* surface format when returning the clear color so we need to do this to
> +* implement luminance, intensity and missing components. However it does
> +* seem to look at it in some cases such as to clamp to the range [0,1] 
> for
> +* unorm formats. Suprisingly however it doesn't clamp to [0,∞] for the
> +* special float formats that don't have a sign bit.
> +*/
> +   if (!_mesa_is_format_integer_color(linear_format)) {
> +  _mesa_pack_float_rgba_row(linear_format,
> +1, /* n_pixels */
> +   

[Mesa-dev] [PATCH] texobj: Check completeness with InternalFormat rather than Mesa format

2016-01-13 Thread Neil Roberts
The internal Mesa format used for a texture might not match the one
requested in the internalFormat when the texture was created, for
example if the driver is internally remapping RGB textures to RGBA.
Otherwise it can cause false positives for completeness if one mipmap
image is created as RGBA and the other as RGB because they would both
have an RGBA Mesa format. If we check the InternalFormat instead then
we are directly checking the API usage which I think better matches
the intention of the check.

https://bugs.freedesktop.org/show_bug.cgi?id=93700
---
 src/mesa/main/texobj.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/main/texobj.c b/src/mesa/main/texobj.c
index 547055e..b107a8f 100644
--- a/src/mesa/main/texobj.c
+++ b/src/mesa/main/texobj.c
@@ -835,7 +835,7 @@ _mesa_test_texobj_completeness( const struct gl_context 
*ctx,
   incomplete(t, MIPMAP, "TexImage[%d] is missing", i);
   return;
}
-   if (img->TexFormat != baseImage->TexFormat) {
+   if (img->InternalFormat != baseImage->InternalFormat) {
   incomplete(t, MIPMAP, "Format[i] != Format[baseLevel]");
   return;
}
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] Revert "i965: Use MESA_FORMAT_B8G8R8X8_SRGB for RGB visuals"

2015-12-16 Thread Neil Roberts
This reverts commit 839793680f99b8387bee9489733d5071c10f3ace.

The patch was breaking DRI3 because driGLFormatToImageFormat does not
handle MESA_FORMAT_B8G8R8X8_SRGB which ended up making it fail to
create the renderbuffer and it would later crash. It's not trivial to
add this format because there is no __DRI_IMAGE_FORMAT nor
__DRI_IMAGE_FOURCC define for the format either. I'm not sure how
difficult adding this would be and whether adding a new format would
require some sort of new version for DRI. Seeing as this might take a
while to fix I think it makes sense to just revert the patch in the
meantime in order to avoid regressing master.

It is also not handled in intel_gles3_srgb_workaround and there may be
other cases where it breaks.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93388
---
 src/mesa/drivers/dri/i965/intel_screen.c | 13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_screen.c 
b/src/mesa/drivers/dri/i965/intel_screen.c
index 825a7c1..cc90efe 100644
--- a/src/mesa/drivers/dri/i965/intel_screen.c
+++ b/src/mesa/drivers/dri/i965/intel_screen.c
@@ -999,13 +999,14 @@ intelCreateBuffer(__DRIscreen * driScrnPriv,
   fb->Visual.samples = num_samples;
}
 
-   if (mesaVis->redBits == 5) {
+   if (mesaVis->redBits == 5)
   rgbFormat = MESA_FORMAT_B5G6R5_UNORM;
-   } else {
-  if (mesaVis->alphaBits == 0)
- rgbFormat = MESA_FORMAT_B8G8R8X8_SRGB;
-  else
- rgbFormat = MESA_FORMAT_B8G8R8A8_SRGB;
+   else if (mesaVis->sRGBCapable)
+  rgbFormat = MESA_FORMAT_B8G8R8A8_SRGB;
+   else if (mesaVis->alphaBits == 0)
+  rgbFormat = MESA_FORMAT_B8G8R8X8_UNORM;
+   else {
+  rgbFormat = MESA_FORMAT_B8G8R8A8_SRGB;
   fb->Visual.sRGBCapable = true;
}
 
-- 
1.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: handle stencil_bits parameter for MESA_FORMAT_B8G8R8X8_UNORM format.

2015-12-11 Thread Neil Roberts
Ilia Mirkin  writes:

> I suspect that something like this may be the right thing for the
> intel driver... no reason not to expose sRGB-capable visuals when it
> so happens that alpha == 0. Also probably the same treatment should be
> done for BGR565... sRGB encoding will matter even more there, and from
> the looks of it, all gens support that.
>
> diff --git a/src/mesa/drivers/dri/i965/intel_screen.c
> b/src/mesa/drivers/dri/i965/intel_screen.c
> index cc90efe..75f2cce 100644
> --- a/src/mesa/drivers/dri/i965/intel_screen.c
> +++ b/src/mesa/drivers/dri/i965/intel_screen.c
> @@ -1003,9 +1003,10 @@ intelCreateBuffer(__DRIscreen * driScrnPriv,
>rgbFormat = MESA_FORMAT_B5G6R5_UNORM;
> else if (mesaVis->sRGBCapable)
>rgbFormat = MESA_FORMAT_B8G8R8A8_SRGB;
> -   else if (mesaVis->alphaBits == 0)
> -  rgbFormat = MESA_FORMAT_B8G8R8X8_UNORM;
> -   else {
> +   else if (mesaVis->alphaBits == 0) {
> +  rgbFormat = MESA_FORMAT_B8G8R8X8_SRGB;
> +  fb->Visual.sRGBCapable = true;
> +   } else {
>rgbFormat = MESA_FORMAT_B8G8R8A8_SRGB;
>fb->Visual.sRGBCapable = true;
> }

Thanks for the suggestion. I assume you didn't want to make this into a
proper patch yourself so I went ahead and did it here:

http://patchwork.freedesktop.org/patch/67844/

Along with the other two patches it doesn't cause any regressions in our
CI system.

I left the 565 format alone because there is no enum for
MESA_FORMAT_B5G6R5_SRGB and it seemed like it's better treated as a
separate issue.

We should also fix the fact that the visuals aren't marked as
sRGB-capable even though they are, but again I think that is a separate
issue.

Regards,
- Neil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/3] i965: Add MESA_FORMAT_B8G8R8X8_SRGB to brw_format_for_mesa_format

2015-12-11 Thread Neil Roberts
This will be used in a subsequent patch as the format for RGB visuals.

Cc: "11.0 11.1" 
Cc: Ilia Mirkin 
Suggested-by: Ilia Mirkin 
---
 src/mesa/drivers/dri/i965/brw_surface_formats.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/drivers/dri/i965/brw_surface_formats.c 
b/src/mesa/drivers/dri/i965/brw_surface_formats.c
index ff8aac2..3fc47c3 100644
--- a/src/mesa/drivers/dri/i965/brw_surface_formats.c
+++ b/src/mesa/drivers/dri/i965/brw_surface_formats.c
@@ -408,6 +408,7 @@ brw_format_for_mesa_format(mesa_format mesa_format)
   [MESA_FORMAT_A8R8G8B8_SRGB] = 0,
   [MESA_FORMAT_R8G8B8A8_SRGB] = BRW_SURFACEFORMAT_R8G8B8A8_UNORM_SRGB,
   [MESA_FORMAT_X8R8G8B8_SRGB] = 0,
+  [MESA_FORMAT_B8G8R8X8_SRGB] = BRW_SURFACEFORMAT_B8G8R8X8_UNORM_SRGB,
   [MESA_FORMAT_L_SRGB8] = BRW_SURFACEFORMAT_L8_UNORM_SRGB,
   [MESA_FORMAT_L8A8_SRGB] = BRW_SURFACEFORMAT_L8A8_UNORM_SRGB,
   [MESA_FORMAT_A8L8_SRGB] = 0,
-- 
1.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/3] i965: Use MESA_FORMAT_B8G8R8X8_SRGB for RGB visuals

2015-12-11 Thread Neil Roberts
Previously if the visual didn't have an alpha channel then it would
pick a format that is not sRGB-capable. I don't think there's any
reason not to always have an sRGB-capable visual. Since 28090b30 there
are now visuals advertised without an alpha channel which means that
games that don't request alpha bits in the config would end up without
an sRGB-capable visual. This was breaking supertuxkart which assumes
the winsys buffer is always sRGB-capable.

The previous code always used an RGBA format if the visual config
itself was marked as sRGB-capable regardless of whether the visual has
alpha bits. I think we don't actually advertise any sRGB-capable
visuals (but we just use sRGB formats anyway) so it shouldn't make any
difference. However this patch also changes it to use RGBX if an
sRGB-capable visual is requested without alpha bits for consistency.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92759
Cc: "11.0 11.1" 
Cc: Ilia Mirkin 
Suggested-by: Ilia Mirkin 
---
 src/mesa/drivers/dri/i965/intel_screen.c | 13 ++---
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_screen.c 
b/src/mesa/drivers/dri/i965/intel_screen.c
index cc90efe..825a7c1 100644
--- a/src/mesa/drivers/dri/i965/intel_screen.c
+++ b/src/mesa/drivers/dri/i965/intel_screen.c
@@ -999,14 +999,13 @@ intelCreateBuffer(__DRIscreen * driScrnPriv,
   fb->Visual.samples = num_samples;
}
 
-   if (mesaVis->redBits == 5)
+   if (mesaVis->redBits == 5) {
   rgbFormat = MESA_FORMAT_B5G6R5_UNORM;
-   else if (mesaVis->sRGBCapable)
-  rgbFormat = MESA_FORMAT_B8G8R8A8_SRGB;
-   else if (mesaVis->alphaBits == 0)
-  rgbFormat = MESA_FORMAT_B8G8R8X8_UNORM;
-   else {
-  rgbFormat = MESA_FORMAT_B8G8R8A8_SRGB;
+   } else {
+  if (mesaVis->alphaBits == 0)
+ rgbFormat = MESA_FORMAT_B8G8R8X8_SRGB;
+  else
+ rgbFormat = MESA_FORMAT_B8G8R8A8_SRGB;
   fb->Visual.sRGBCapable = true;
}
 
-- 
1.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/3] i965: Add B8G8R8X8_SRGB to the alpha format override

2015-12-11 Thread Neil Roberts
brw_init_surface_formats overrides the render format for RGBX formats
which aren't supported for rendering so that they internally use RGBA
instead. However, B8G8R8X8_SRGB was missing so it wasn't marked as a
renderable format. This patch just adds it.

Cc: "11.0 11.1" 
Cc: Ilia Mirkin 
---
 src/mesa/drivers/dri/i965/brw_surface_formats.c | 4 
 1 file changed, 4 insertions(+)

This might conflict when ported to the 11.x branches because the code
above doesn't have the if statement there. It should be fine to add it
without the if statement on those branches although it shouldn't
matter if it does have it either.

diff --git a/src/mesa/drivers/dri/i965/brw_surface_formats.c 
b/src/mesa/drivers/dri/i965/brw_surface_formats.c
index 3fc47c3..7bc8b8b 100644
--- a/src/mesa/drivers/dri/i965/brw_surface_formats.c
+++ b/src/mesa/drivers/dri/i965/brw_surface_formats.c
@@ -674,6 +674,10 @@ brw_init_surface_formats(struct brw_context *brw)
  if (gen < tinfo->render_target)
 render = BRW_SURFACEFORMAT_B8G8R8A8_UNORM;
 break;
+  case BRW_SURFACEFORMAT_B8G8R8X8_UNORM_SRGB:
+ if (gen < tinfo->render_target)
+render = BRW_SURFACEFORMAT_B8G8R8A8_UNORM_SRGB;
+ break;
   case BRW_SURFACEFORMAT_R8G8B8X8_UNORM:
  render = BRW_SURFACEFORMAT_R8G8B8A8_UNORM;
  break;
-- 
1.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] i965/meta-fast-clear: Convert the clear color through the surf format

2015-12-11 Thread Neil Roberts
When programming the fast clear color there was previously a chunk of
code to try to make the color match the constraints of the surface
format such as by filling in missing components and handling luminance
formats. These cases are not handled by the hardware. There are some
additional possible restrictions that the hardware does seem to
handle, such as clamping to [0,1] for normalised formats. However for
whatever reason it doesn't clamp to [0,∞] for the special float
formats that don't have a sign bit. Rather than adding yet another
special case for this format this patch makes it instead convert the
color to the actual surface format and back again so that we can be
sure it will have all of the possible restrictions. Additionally this
would avoid some other potential surprises such as getting more
precision for the clear color when fast clears are used.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93338
---
 src/mesa/drivers/dri/i965/brw_meta_fast_clear.c | 57 -
 1 file changed, 27 insertions(+), 30 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c 
b/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c
index cf0e56b..29ae6f0 100644
--- a/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c
+++ b/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c
@@ -37,6 +37,8 @@
 #include "main/uniforms.h"
 #include "main/fbobject.h"
 #include "main/texobj.h"
+#include "main/format_unpack.h"
+#include "main/format_pack.h"
 
 #include "main/api_validate.h"
 #include "main/state.h"
@@ -397,45 +399,40 @@ set_fast_clear_color(struct brw_context *brw,
  struct intel_mipmap_tree *mt,
  const union gl_color_union *color)
 {
+   mesa_format linear_format = _mesa_get_srgb_format_linear(mt->format);
union gl_color_union override_color = *color;
-
-   /* The sampler doesn't look at the format of the surface when the fast
-* clear color is used so we need to implement luminance, intensity and
-* missing components manually.
-*/
-   switch (_mesa_get_format_base_format(mt->format)) {
-   case GL_INTENSITY:
-  override_color.ui[3] = override_color.ui[0];
-  /* flow through */
-   case GL_LUMINANCE:
-   case GL_LUMINANCE_ALPHA:
-  override_color.ui[1] = override_color.ui[0];
-  override_color.ui[2] = override_color.ui[0];
-  break;
-   default:
-  for (int i = 0; i < 3; i++) {
- if (!_mesa_format_has_color_component(mt->format, i))
-override_color.ui[i] = 0;
-  }
-  break;
-   }
-
-   if (!_mesa_format_has_color_component(mt->format, 3)) {
-  if (_mesa_is_format_integer_color(mt->format))
- override_color.ui[3] = 1;
-  else
- override_color.f[3] = 1.0f;
-   }
+   union gl_color_union tmp_color;
 
/* Handle linear→SRGB conversion */
-   if (brw->ctx.Color.sRGBEnabled &&
-   _mesa_get_srgb_format_linear(mt->format) != mt->format) {
+   if (brw->ctx.Color.sRGBEnabled && linear_format != mt->format) {
   for (int i = 0; i < 3; i++) {
  override_color.f[i] =
 util_format_linear_to_srgb_float(override_color.f[i]);
   }
}
 
+   /* Convert the clear color to the surface format and back so that the color
+* returned when sampling is guaranteed to be a value that could be stored
+* in the surface. For example if the surface is a luminance format and we
+* clear to 0.5,0.75,0.1,0.2 we want the color to come back as
+* 0.5,0.5,0.5,1.0. In general the hardware doesn't seem to look at the
+* surface format when returning the clear color so we need to do this to
+* implement luminance, intensity and missing components. However it does
+* seem to look at it in some cases such as to clamp to the range [0,1] for
+* unorm formats. Suprisingly however it doesn't clamp to [0,∞] for the
+* special float formats that don't have a sign bit.
+*/
+   if (!_mesa_is_format_integer_color(linear_format)) {
+  _mesa_pack_float_rgba_row(linear_format,
+1, /* n_pixels */
+(const GLfloat (*)[4]) override_color.f,
+_color);
+  _mesa_unpack_rgba_row(linear_format,
+1, /* n_pixels */
+_color,
+(GLfloat (*)[4]) override_color.f);
+   }
+
if (brw->gen >= 9) {
   mt->gen9_fast_clear_color = override_color;
} else {
-- 
1.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [Mesa-stable] [PATCH] i965: Fix crash when calling glViewport with no surface bound

2015-12-09 Thread Neil Roberts
Emil Velikov  writes:

> Worth throwing in 11.0 as well ?

Yeah, that would probably be sensible.

>> if (_mesa_is_winsys_fbo(ctx->DrawBuffer)) {
>> -  dri2InvalidateDrawable(driContext->driDrawablePriv);
>> -  dri2InvalidateDrawable(driContext->driReadablePriv);
>> +  if (driContext->driDrawablePriv)
>> + dri2InvalidateDrawable(driContext->driDrawablePriv);
>> +  if (driContext->driReadablePriv)
>> + dri2InvalidateDrawable(driContext->driReadablePriv);
>
>  Afaict i915 could use an identical fix ?

Yes, I think you're right. However I don't have any way of testing it so
I feel a bit uncomfortable touching i915 driver.

Regards,
- Neil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] i965: Fix crash when calling glViewport with no surface bound

2015-12-08 Thread Neil Roberts
If EGL_KHR_surfaceless_context is used then glViewport can be called
with NULL for the draw and read surfaces. This was previously causing
a crash because the i965 driver tries to use this point to invalidate
the surfaces and it was derferencing the NULL pointer.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93257
Cc: Nanley Chery 
Cc: "11.1" 
---

I've also posted a Piglit test for this here:

http://patchwork.freedesktop.org/patch/67356/

 src/mesa/drivers/dri/i965/brw_context.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
b/src/mesa/drivers/dri/i965/brw_context.c
index 7d7643c..5374bad 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -159,8 +159,10 @@ intel_viewport(struct gl_context *ctx)
__DRIcontext *driContext = brw->driContext;
 
if (_mesa_is_winsys_fbo(ctx->DrawBuffer)) {
-  dri2InvalidateDrawable(driContext->driDrawablePriv);
-  dri2InvalidateDrawable(driContext->driReadablePriv);
+  if (driContext->driDrawablePriv)
+ dri2InvalidateDrawable(driContext->driDrawablePriv);
+  if (driContext->driReadablePriv)
+ dri2InvalidateDrawable(driContext->driReadablePriv);
}
 }
 
-- 
1.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] nir: Optimize useless comparisons against true/false.

2015-12-04 Thread Neil Roberts
Matt Turner  writes:

>  # Written in the form (, ) where  is an expression
>  # and  is either an expression or a value.  An expression is
>  # defined as a tuple of the form (, , , , )
> @@ -94,6 +97,8 @@ optimizations = [
> (('inot', ('ige', a, b)), ('ilt', a, b)),
> (('inot', ('ieq', a, b)), ('ine', a, b)),
> (('inot', ('ine', a, b)), ('ieq', a, b)),
> +   (('ieq', 'a@bool', true), a),
> +   (('ine', 'a@bool', false), a),

I think this second one is already in the file on line 187 here:

   # Boolean simplifications
   (('ine', 'a@bool', 0), 'a'),
   (('ieq', 'a@bool', 0), ('inot', 'a')),

Maybe you could add the first one near these two? It could be good to
add a line like this as well to complete the set:

   (('ine', 'a@bool', true), ('inot', a)),

Regards,
- Neil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 0.5/5] i965/gen9/fast-clear: Handle linear???SRGB conversion

2015-12-01 Thread Neil Roberts
"Pohjolainen, Topi"  writes:

>> +   /* Handle linear→SRGB conversion */
>> +   if (brw->ctx.Color.sRGBEnabled &&
>> +   _mesa_get_srgb_format_linear(mt->format) != mt->format) {
>
> Patch five disables fast clear for single-sampled if
> brw->ctx.Color.sRGBEnabled is set. How about something like this:
>
>  /* Fast clears are disabled for single-sampled buffers when
>   * sRGBEnabled is set.
>   */
>  assert(mt->num_samples > 1);

That could be a good idea but the thing that makes me a little
uncomfortable is that it should actually be fine to do a fast clear with
GL_FRAMEBUFFER_SRGB enabled and it's only actually a problem when it
comes to rendering. Eg, if we didn't have patch 5 you could clear with
SRGB enabled and then disable it before rendering something else and
everything would be fine.

Patch 5 is really only an optimisation which implements the heuristic
that if GL_FRAMEBUFFER_SRGB is enabled when you do the clear then it's
very likely that it's still going to be enabled when you render
something else. The patch isn't a hard requirement because without it it
would just do the resolve before rendering so it would still work, but
it's just a bit wasteful.

- Neil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/5] i965/gen8+: Don't upload the MCS buffer for single-sampled textures

2015-11-26 Thread Neil Roberts
> On Thu, Nov 26, 2015 at 11:18:34AM +0200, Pohjolainen, Topi wrote:
>> On Wed, Nov 25, 2015 at 11:01:18AM -0800, Ben Widawsky wrote:
>> > On Wed, Nov 25, 2015 at 06:36:36PM +0100, Neil Roberts wrote:
>> > > For single-sampled textures the MCS buffer is only used to implement
>> > > fast clears. However the surface always needs to be resolved before
>> > > being used as a texture anyway so the the MCS buffer doesn't actually
>> > > achieve anything. This is important for Gen9 because in that case SRGB
>> > 
>> > I admit a good deal of ignorance, but why do we have to do a resolve 
>> > before we
>> > sample from it? I thought the whole point of the MCS was that we can 
>> > sample from
>> > it without a resolve.
>> 
>> This is my understanding also, I can't see much benefit from the fast clear
>> if it would need to be anyway resolved before reading it using GPU (reading
>> using CPU is then another story of course).
>
> I know we have this piece of text in bspec:
>
> "If the MCS is enabled on a non-multisampled render target, the render
> target must be resolved before being used for other purposes (display,
> texture, CPU lock)."
>
> But on the other hand the fact that surface state supports definition of fast
> cleared MCS buffer even for sampler engine suggest that "texture" in that
> sentence refers to something else than sampling. Not to mention that we
> already sample fast cleared color buffers with current Mesa driver and seem
> to have no problems with it.

The resolve needs to be done for single-sampled render targets but for
multisample targets we can safely sample from the MCS buffer. The MCS
(or CCS or whatever it should be called in this case) is still useful
even if we have to resolve because it saves memory bandwidth during
rendering. Eg, the initial clear writes less data because it only writes
to the smaller MCS buffer and not the larger color buffer. Any blending
operations during rendering can also read from the CCS instead of the
color buffer. I guess the assumption is that the bandwidth savings
during rendering outweigh the extra cost added by having to resolve it.

To be clear, this patch doesn't add the resolve step needed before
texturing, nor does anything in the series. The resolve step is already
done for all gens in intel_update_state. Look at the bit with the
comment “Resolve depth buffer and render cache of each enabled texture”.
It calls intel_miptree_resolve_color which resolves the CCS buffer for
each singled-sampled surface that is going to be used as a texture. The
only change this patch makes it to not not even set the aux buffer in
the texture surface state.

- Neil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] util: Tiny optimisation for the linear→srgb conversion

2015-11-26 Thread Neil Roberts
When converting 0.0 it would be nice if it didn't do any arithmetic.
---
 src/util/format_srgb.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/util/format_srgb.h b/src/util/format_srgb.h
index 4a8d73f..34b50af 100644
--- a/src/util/format_srgb.h
+++ b/src/util/format_srgb.h
@@ -57,7 +57,7 @@ util_format_linear_to_srgb_helper_table[104];
 static inline float
 util_format_linear_to_srgb_float(float cl)
 {
-   if (cl < 0.0f)
+   if (cl <= 0.0f)
   return 0.0f;
else if (cl < 0.0031308f)
   return 12.92f * cl;
-- 
1.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/5] i965/meta-fast-clear: Disable GL_FRAMEBUFFER_SRGB during clear

2015-11-25 Thread Neil Roberts
Adds MESA_META_FRAMEBUFFER_SRGB to the meta save state so that
GL_FRAMEBUFFER_SRGB will be disabled when performing the fast clear.
That way the render surface state will be programmed with the linear
equivalent format during the clear. This is important for Gen9 because
the SRGB formats are not marked as losslessly compressible so in
theory they aren't support for fast clears. It shouldn't make any
difference whether GL_FRAMEBUFFER_SRGB is enabled for the fast clear
operation because the color is not actually written to the framebuffer
so there is no chance for the hardware to apply the SRGB conversion on
it anyway.
---
 src/mesa/drivers/dri/i965/brw_meta_fast_clear.c | 16 
 1 file changed, 16 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c 
b/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c
index cf0e56b..b32db3f 100644
--- a/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c
+++ b/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c
@@ -505,8 +505,21 @@ fast_clear_attachments(struct brw_context *brw,
uint32_t fast_clear_buffers,
struct rect fast_clear_rect)
 {
+   struct gl_context *ctx = >ctx;
+   bool srgb_enabled = ctx->Color.sRGBEnabled;
+
assert(brw->gen >= 9);
 
+   /* Make sure the GL_FRAMEBUFFER_SRGB is disabled during fast clear so that
+* the surface state will always be uploaded with a linear buffer. SRGB
+* buffers are not supported on Gen9 because they are not marked as
+* losslessly compressible. This shouldn't matter for the fast clear
+* because the color is not written to the framebuffer anyway so the
+* hardware doesn't need to do any SRGB conversion.
+*/
+   if (srgb_enabled)
+  _mesa_set_framebuffer_srgb(ctx, GL_FALSE);
+
brw_bind_rep_write_shader(brw, (float *) fast_clear_color);
 
/* SKL+ also has a resolve mode for compressed render targets and thus more
@@ -533,6 +546,9 @@ fast_clear_attachments(struct brw_context *brw,
}
 
set_fast_clear_op(brw, 0);
+
+   if (srgb_enabled)
+  _mesa_set_framebuffer_srgb(ctx, GL_TRUE);
 }
 
 bool
-- 
1.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 5/5] i965/gen9: Don't do fast clears when GL_FRAMEBUFFER_SRGB is enabled

2015-11-25 Thread Neil Roberts
When GL_FRAMEBUFFER_SRGB is enabled any single-sampled renderbuffers
are resolved in intel_update_state because the hardware can't cope
with fast clears on SRGB buffers. In that case it's pointless to do a
fast clear because it will just be immediately resolved.
---
 src/mesa/drivers/dri/i965/brw_meta_fast_clear.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c 
b/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c
index b32db3f..56c90b7 100644
--- a/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c
+++ b/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c
@@ -603,6 +603,17 @@ brw_meta_fast_clear(struct brw_context *brw, struct 
gl_framebuffer *fb,
   brw->render_target_format[irb->mt->format])
  clear_type = REP_CLEAR;
 
+  /* Gen9 doesn't support fast clear on single-sampled SRGB buffers. When
+   * GL_FRAMEBUFFER_SRGB is enabled any color renderbuffers will be
+   * resolved in intel_update_state. In that case it's pointless to do a
+   * fast clear because it's very likely to be immediately resolved.
+   */
+  if (brw->gen >= 9 &&
+  irb->mt->num_samples <= 1 &&
+  brw->ctx.Color.sRGBEnabled &&
+  _mesa_get_srgb_format_linear(irb->mt->format) != irb->mt->format)
+ clear_type = REP_CLEAR;
+
   if (irb->mt->fast_clear_state == INTEL_FAST_CLEAR_STATE_NO_MCS)
  clear_type = REP_CLEAR;
 
-- 
1.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 4/5] i965/gen9: Allow fast clears for non-MSRT SRGB buffers

2015-11-25 Thread Neil Roberts
SRGB buffers are not marked as losslessly compressible so previously
they would not be used for fast clears. However in practice the
hardware will never actually see that we are using SRGB buffers for
fast clears if we use the linear equivalent format when clearing and
make sure to resolve the buffer as a linear format before sampling
from it.

This is an important use case because by default the window system
framebuffers are created as SRGB so without this fast clears won't be
used there.
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 87e0136..88c0a19 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -259,7 +259,8 @@ intel_miptree_supports_non_msrt_fast_clear(struct 
brw_context *brw,
   return false;
 
if (brw->gen >= 9) {
-  const uint32_t brw_format = brw_format_for_mesa_format(mt->format);
+  mesa_format linear_format = _mesa_get_srgb_format_linear(mt->format);
+  const uint32_t brw_format = brw_format_for_mesa_format(linear_format);
   return brw_losslessly_compressible_format(brw, brw_format);
} else
   return true;
-- 
1.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/5] i965/gen8+: Don't upload the MCS buffer for single-sampled textures

2015-11-25 Thread Neil Roberts
For single-sampled textures the MCS buffer is only used to implement
fast clears. However the surface always needs to be resolved before
being used as a texture anyway so the the MCS buffer doesn't actually
achieve anything. This is important for Gen9 because in that case SRGB
surfaces are not supported for fast clears and we don't want the
hardware to see the MCS buffer in that case.
---
 src/mesa/drivers/dri/i965/gen8_surface_state.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/gen8_surface_state.c 
b/src/mesa/drivers/dri/i965/gen8_surface_state.c
index 80252a5..075424a 100644
--- a/src/mesa/drivers/dri/i965/gen8_surface_state.c
+++ b/src/mesa/drivers/dri/i965/gen8_surface_state.c
@@ -225,7 +225,11 @@ gen8_emit_texture_surface_state(struct brw_context *brw,
   pitch = mt->pitch;
}
 
-   if (mt->mcs_mt) {
+   /* The MCS is not uploaded for single-sampled surfaces because the color
+* buffer should always have been resolved before it is used as a texture
+* so there is no need for it.
+*/
+   if (mt->mcs_mt && mt->num_samples > 1) {
   aux_mt = mt->mcs_mt;
   aux_mode = GEN8_SURFACE_AUX_MODE_MCS;
 
-- 
1.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 0.5/5] i965/gen9/fast-clear: Handle linear→SRGB conversion

2015-11-25 Thread Neil Roberts
If GL_FRAMEBUFFER_SRGB is enabled when writing to an SRGB-capable
framebuffer then the color will be converted from linear to SRGB
before being written. There is no chance for the hardware to do this
itself because it can't modify the clear color that is programmed in
the surface state so it seems pretty clear that the driver should be
handling this itself.

Note that this wasn't a problem before Gen9 because previously we were
only able to do fast clears to 0 or 1 and those values are the same in
linear and SRGB space.
---

I noticed this after posting the patch series but I think it belongs
in this series because it only causes a problem once we enable fast
clears for MSRTs. I made a test case to demonstrate this here:

http://patchwork.freedesktop.org/patch/66222/

I tried to think what other similar problems there could be such as
whether we need to clamp the values to [0,1] for normalised formats.
However this seems to just work so I guess the sampler hardware does
look at the surface format enough to determine it needs to clamp the
clear color. It's a bit odd what the sampler does and doesn't do with
the fast clear color. I made a test case for this as well:

http://patchwork.freedesktop.org/patch/66223/

Annoyingly it doesn't clamp the value correctly for
GL_R11F_G11F_B10F_EXT. That format is floating-point but it has no
signed bits so I think it should clamp to a minimum of 0.

I have a github branch with all of my SKL fast clear patches here:

https://github.com/bpeel/mesa/commits/skl-fast-clear

 src/mesa/drivers/dri/i965/brw_meta_fast_clear.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c 
b/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c
index 1b2ea42..f1920b2 100644
--- a/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c
+++ b/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c
@@ -41,6 +41,8 @@
 #include "main/api_validate.h"
 #include "main/state.h"
 
+#include "util/format_srgb.h"
+
 #include "vbo/vbo_context.h"
 
 #include "drivers/common/meta.h"
@@ -424,6 +426,15 @@ set_fast_clear_color(struct brw_context *brw,
  override_color.f[3] = 1.0f;
}
 
+   /* Handle linear→SRGB conversion */
+   if (brw->ctx.Color.sRGBEnabled &&
+   _mesa_get_srgb_format_linear(mt->format) != mt->format) {
+  for (int i = 0; i < 3; i++) {
+ override_color.f[i] =
+util_format_linear_to_srgb_float(override_color.f[i]);
+  }
+   }
+
if (brw->gen >= 9) {
   mt->gen9_fast_clear_color = override_color;
} else {
-- 
1.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 0/5] i965/gen9: Support fast clear on SRGB buffers

2015-11-25 Thread Neil Roberts
Currently fast clears are disabled on SKL with SRGB buffers because
they are not marked as losslessly compressible in the formats table.
This is less than ideal because by default the window system buffers
are created as SRGB so in practice it means an application is unlikely
to end up using fast clears except for FBOs. However unless the
application enables GL_FRAMEBUFFER_SRGB the surface format given to
the hardware is always its linear equivalent so it shouldn't actually
be a problem. The aim of this series is to make that use case work.

Sadly there is a second obstacle stopping window system buffers from
using fast clears which as that they are always x-tiled and supposedly
SKL doesn't support that. This is explicitly disabled in
intel_tiling_supports_non_msrt_mcs. However if I remove that
restriction I couldn't find any problems so it might just work to
remove it.

All of my SKL fast clear patches are in a branch here:

https://github.com/bpeel/mesa/commits/skl-fast-clear

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/5] i965/gen9: Resolve SRGB color buffers when GL_FRAMEBUFFER_SRGB enabled

2015-11-25 Thread Neil Roberts
SKL can't cope with the CCS buffer for SRGB buffers. Normally the
hardware won't see the SRGB formats because when GL_FRAMEBUFFER_SRGB
is disabled these get mapped to their linear equivalents. In order to
avoid relying on the CCS buffer when it is enabled this patch now
makes it flush the renderbuffers.
---

I made a test case for this here:

http://patchwork.freedesktop.org/patch/66225/

 src/mesa/drivers/dri/i965/brw_context.c | 25 +
 1 file changed, 25 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
b/src/mesa/drivers/dri/i965/brw_context.c
index 7e2fdcb..a8c1052 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -197,6 +197,31 @@ intel_update_state(struct gl_context * ctx, GLuint 
new_state)
   brw_render_cache_set_check_flush(brw, tex_obj->mt->bo);
}
 
+   /* If FRAMEBUFFER_SRGB is used on Gen9+ then we need to resolve any of the
+* single-sampled color renderbuffers because the CCS buffer isn't
+* supported for SRGB formats and the blending wouldn't work.
+*/
+   if (brw->gen >= 9 && ctx->Color.sRGBEnabled) {
+  struct gl_framebuffer *fb = ctx->DrawBuffer;
+  for (int i = 0; i < fb->_NumColorDrawBuffers; i++) {
+ struct gl_renderbuffer *rb = fb->_ColorDrawBuffers[i];
+
+ if (rb == NULL)
+continue;
+
+ struct intel_renderbuffer *irb = intel_renderbuffer(rb);
+ struct intel_mipmap_tree *mt = irb->mt;
+
+ if (mt == NULL ||
+ mt->num_samples > 1 ||
+ _mesa_get_srgb_format_linear(mt->format) == mt->format)
+   continue;
+
+ intel_miptree_resolve_color(brw, mt);
+ brw_render_cache_set_check_flush(brw, mt->bo);
+  }
+   }
+
_mesa_lock_context_textures(ctx);
 }
 
-- 
1.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: Always use Y-tiled buffers on SKL+

2015-11-23 Thread Neil Roberts
Hi,

Has this situation changed at all? It's probably quite important to get
this working because we have to disable fast clears for X-tiled buffers
on SKL which effectively means we currently can't do it for window
system buffers.

Regards,
- Neil

Chris Wilson  writes:

> On Mon, Apr 13, 2015 at 04:31:29PM +0200, Daniel Vetter wrote:
>> On Sat, Apr 11, 2015 at 01:16:11PM -0700, Ben Widawsky wrote:
>> > Starting with Skylake, the display engine is capable of scanning out from
>> > Y-tiled buffers. As such, we can and should use Y-tiling for better 
>> > efficiency.
>> > 
>> > Note that the buffer allocation done for mipmaps will already never 
>> > allocate an
>> > X-tiled buffer for GEN9.
>> > 
>> > Signed-off-by: Ben Widawsky 
>> 
>> You need a recent enough ddx to make use of Y-tiled buffers, which atm
>> still doesn't yet exist. This would at least need some kind of handshake
>> with the compositor to make sure it understands this, presuming I didn't
>> miss something.
>
> You can send Y-tiled buffers to the DDX. The problem is that the kernel
> won't allow us to display them and so we will (and always have been)
> copying from them.
> -Chris
>
> -- 
> Chris Wilson, Intel Open Source Technology Centre
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


  1   2   3   4   5   >