[Mesa-dev] [PATCH v2] drirc: Add allow_glsl_builtin_variable_redeclaration for Dead Island Riptide Definitive Edition

2017-05-21 Thread Benedikt Schemmer

From: Benedikt Schemmer 

This patch sets the allow_glsl_builtin_variable_redeclaration for Dead 
Island Riptide Definitive Edition.


Tested with Mesa git as of today and Dying Light, Dead Island Definitive 
Edition and Dead Island Riptide Definite Edition


v2: set only required option and match option name

Please kindly review and push.

Thanks,
Benedikt

---
 src/mesa/drivers/dri/common/drirc | 4 
 1 file changed, 4 insertions(+)

diff --git a/src/mesa/drivers/dri/common/drirc 
b/src/mesa/drivers/dri/common/drirc

index 8093e8e42a..d820462fad 100644
--- a/src/mesa/drivers/dri/common/drirc
+++ b/src/mesa/drivers/dri/common/drirc
@@ -92,6 +92,10 @@ TODO: document the other workarounds.
 value="true" />

 

+executable="DeadIslandRiptideGame">
+value="true" />

+
+
 
 value="true" />

 
--
2.11.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/3] Fix missing initializer warning in egd_tables.h by adding appropriate default fields in egd_tables.py

2017-06-04 Thread Benedikt Schemmer
Fix missing initializer warning in egd_tables.h by adding appropriate default 
fields in egd_tables.py

---
 src/gallium/drivers/r600/egd_tables.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/r600/egd_tables.py 
b/src/gallium/drivers/r600/egd_tables.py
index 4c606025ba..289981ae18 100644
--- a/src/gallium/drivers/r600/egd_tables.py
+++ b/src/gallium/drivers/r600/egd_tables.py
@@ -270,7 +270,7 @@ struct eg_packet3 {
 strings.add(field.name), field.s_name,
 len(values_offsets), 
strings_offsets.add(values_offsets))
 else:
-print '\t{%s, %s(~0u)},' % (strings.add(field.name), 
field.s_name)
+print '\t{%s, %s(~0u), 0, 0},' % (strings.add(field.name), 
field.s_name)
 fields_idx += 1

 print '};'
@@ -282,7 +282,7 @@ struct eg_packet3 {
 print '\t{%s, %s, %s, %s},' % (strings.add(reg.name), reg.r_name,
 len(reg.fields), reg.fields_idx if reg.own_fields else 
reg.fields_owner.fields_idx)
 else:
-print '\t{%s, %s},' % (strings.add(reg.name), reg.r_name)
+print '\t{%s, %s, 0, 0},' % (strings.add(reg.name), reg.r_name)
 print '};'
 print

-- 
2.11.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 0/3] Fix missing initializer errors in generated tables

2017-06-04 Thread Benedikt Schemmer

I send these patches as a series because they share the same problem:
all have missing fields in the default initialization path for field/register
definitions.

This causes several hundreds (thousands?) of "missing initializer" warnings
from the compiler making it hard to spot less frequent errors.

This also fixes a small copy and paste error in vk_format_table.py where
it isnt clear how it is autogenerated from where.

Please kindly review and push if you find this useful.

Thanks,
Benedikt

Benedikt Schemmer (3):
  Fix missing initializer warning in sid_tables.h by adding appropriate
default fields in sid_tables.py
  Fix missing initializer warning in egd_tables.h by adding appropriate
default fields in egd_tables.py
  Fix missing initializer warning in vk_format_table.h by adding
appropriate default fields in vk_format_table.py

 src/amd/common/sid_tables.py   | 4 ++--
 src/amd/vulkan/vk_format_table.py  | 4 ++--
 src/gallium/drivers/r600/egd_tables.py | 4 ++--
 3 files changed, 6 insertions(+), 6 deletions(-)

-- 
2.11.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/3] Fix missing initializer warning in vk_format_table.h by, adding appropriate default fields in vk_format_table.py

2017-06-04 Thread Benedikt Schemmer
Fix missing initializer warning in vk_format_table.h by adding appropriate 
default fields in vk_format_table.py
and correct the autogenerated from message

---
 src/amd/vulkan/vk_format_table.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/amd/vulkan/vk_format_table.py 
b/src/amd/vulkan/vk_format_table.py
index 36352b108d..139bb9544c 100644
--- a/src/amd/vulkan/vk_format_table.py
+++ b/src/amd/vulkan/vk_format_table.py
@@ -86,7 +86,7 @@ def print_channels(format, func):
 print '#endif'

 def write_format_table(formats):
-print '/* This file is autogenerated by u_format_table.py from 
u_format.csv. Do not edit directly. */'
+print '/* This file is autogenerated by vk_format_table.py from 
vk_format_layout.csv. Do not edit directly. */'
 print
 # This will print the copyright message on the top of this file
 print CopyRight.strip()
@@ -106,7 +106,7 @@ def write_format_table(formats):
 if channel.size:
 print "  {%s, %s, %s, %s, %u, %u}%s\t/* %s = %s */" % 
(type_map[channel.type],
bool_map(channel.norm), bool_map(channel.pure), bool_map(channel.scaled), 
channel.size, channel.shift, sep, "xyzw"[i],
channel.name)
 else:
-print "  {0, 0, 0, 0, 0}%s" % (sep,)
+print "  {0, 0, 0, 0, 0, 0}%s" % (sep,)
 print "   },"

 def do_swizzle_array(channels, swizzles):
-- 
2.11.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/3] Fix missing initializer warning in sid_tables.h by adding appropriate default fields in sid_tables.py

2017-06-04 Thread Benedikt Schemmer
Fix missing initializer warning in sid_tables.h by adding appropriate default 
fields in sid_tables.py

---
 src/amd/common/sid_tables.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/amd/common/sid_tables.py b/src/amd/common/sid_tables.py
index fd88d3c9d5..691d766b08 100644
--- a/src/amd/common/sid_tables.py
+++ b/src/amd/common/sid_tables.py
@@ -270,7 +270,7 @@ struct si_packet3 {
 strings.add(field.name), field.s_name,
 len(values_offsets), 
strings_offsets.add(values_offsets))
 else:
-print '\t{%s, %s(~0u)},' % (strings.add(field.name), 
field.s_name)
+print '\t{%s, %s(~0u), 0, 0},' % (strings.add(field.name), 
field.s_name)
 fields_idx += 1

 print '};'
@@ -282,7 +282,7 @@ struct si_packet3 {
 print '\t{%s, %s, %s, %s},' % (strings.add(reg.name), reg.r_name,
 len(reg.fields), reg.fields_idx if reg.own_fields else 
reg.fields_owner.fields_idx)
 else:
-print '\t{%s, %s},' % (strings.add(reg.name), reg.r_name)
+print '\t{%s, %s, 0, 0},' % (strings.add(reg.name), reg.r_name)
 print '};'
 print

-- 
2.11.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 0/3] Fix missing initializer errors in generated tables

2017-06-04 Thread Benedikt Schemmer
Hi all,

my reasoning wasn't based on how c handles partial initializations,
but on how the generated files look like: like something is missing.

Also I don't want to suppress these warnings, because I don't have
the experience you have with the codebase and would kind of like to
see if my changes have an negative impact or I forgot something.

Benedikt


Am 04.06.2017 um 22:20 schrieb Bas Nieuwenhuizen:
> Hi Marek,
> 
> Do you have any other reasons besides it not improving correctness?
> I'd like to pick at least the radv one, as the code doesn't get less
> clear, and using 5 zeros for struct with 6 members is just plain
> silly.
> 
> - Bas
> 
> 
> On Sun, Jun 4, 2017 at 9:57 PM, Marek Olšák  wrote:
>> NAK.
>>
>> In C/C++, the initializer is used to clear the memory to 0s, thus,
>> adding 0s to the initializer is redundant and unnecessary. Empty
>> initializer {} is also commonly used instead of memset.
>>
>> You need to suppress this warning if you don't want to see it.
>>
>> Marek
>>
>> On Sun, Jun 4, 2017 at 9:45 AM, Benedikt Schemmer  wrote:
>>>
>>> I send these patches as a series because they share the same problem:
>>> all have missing fields in the default initialization path for 
>>> field/register
>>> definitions.
>>>
>>> This causes several hundreds (thousands?) of "missing initializer" warnings
>>> from the compiler making it hard to spot less frequent errors.
>>>
>>> This also fixes a small copy and paste error in vk_format_table.py where
>>> it isnt clear how it is autogenerated from where.
>>>
>>> Please kindly review and push if you find this useful.
>>>
>>> Thanks,
>>> Benedikt
>>>
>>> Benedikt Schemmer (3):
>>>   Fix missing initializer warning in sid_tables.h by adding appropriate
>>> default fields in sid_tables.py
>>>   Fix missing initializer warning in egd_tables.h by adding appropriate
>>> default fields in egd_tables.py
>>>   Fix missing initializer warning in vk_format_table.h by adding
>>> appropriate default fields in vk_format_table.py
>>>
>>>  src/amd/common/sid_tables.py   | 4 ++--
>>>  src/amd/vulkan/vk_format_table.py  | 4 ++--
>>>  src/gallium/drivers/r600/egd_tables.py | 4 ++--
>>>  3 files changed, 6 insertions(+), 6 deletions(-)
>>>
>>> --
>>> 2.11.0
>>>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v3 1/3] Fix missing initializer warning in sid_tables.h by adding appropriate default fields in sid_tables.py

2017-06-11 Thread Benedikt Schemmer
Fix missing initializer warning in sid_tables.h by adding appropriate 
designated initializers in sid_tables.py

---
 src/amd/common/sid_tables.py | 15 ++-
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/src/amd/common/sid_tables.py b/src/amd/common/sid_tables.py
index fd88d3c9d5..691d766b08 100644
--- a/src/amd/common/sid_tables.py
+++ b/src/amd/common/sid_tables.py
@@ -266,11 +266,13 @@ struct si_packet3 {
 while value[1] >= len(values_offsets):
 values_offsets.append(-1)
 values_offsets[value[1]] = 
strings.add(strip_prefix(value[0]))
-print '\t{%s, %s(~0u), %s, %s},' % (
-strings.add(field.name), field.s_name,
+print '\t{.name_offset\t= %s,\r\n\t .mask\t\t\t= %s(~0u), \
+   \r\n\t .num_values\t= %s,\r\n\t .values_offset\t= 
%s},' \
+   % (strings.add(field.name), field.s_name,
 len(values_offsets), 
strings_offsets.add(values_offsets))
 else:
-print '\t{%s, %s(~0u)},' % (strings.add(field.name), 
field.s_name)
+print '\t{.name_offset\t= %s,\r\n\t .mask\t\t\t= 
%s(~0u)},' \
+   % (strings.add(field.name), field.s_name)
 fields_idx += 1

 print '};'
@@ -279,10 +281,13 @@ struct si_packet3 {
 print 'static const struct si_reg sid_reg_table[] = {'
 for reg in regs:
 if len(reg.fields):
-print '\t{%s, %s, %s, %s},' % (strings.add(reg.name), reg.r_name,
+print '\t{.name_offset\t= %s,\r\n\t .offset\t\t= %s, \
+   \r\n\t .num_fields\t= %s,\r\n\t .fields_offset\t= %s},' \
+   % (strings.add(reg.name), reg.r_name,
 len(reg.fields), reg.fields_idx if reg.own_fields else 
reg.fields_owner.fields_idx)
 else:
-print '\t{%s, %s},' % (strings.add(reg.name), reg.r_name)
+print '\t{.name_offset\t= %s,\r\n\t .offset\t\t= %s},' \
+   % (strings.add(reg.name), reg.r_name)
 print '};'
 print

-- 
2.11.0


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v3 2/3] Fix missing initializer warning in egd_tables.h by adding appropriate default fields in egd_tables.py

2017-06-11 Thread Benedikt Schemmer
Fix missing initializer warning in egd_tables.h by adding appropriate 
designated initializers in egd_tables.py

---
 src/gallium/drivers/r600/egd_tables.py | 15 ++-
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/src/gallium/drivers/r600/egd_tables.py 
b/src/gallium/drivers/r600/egd_tables.py
index 4c606025ba..289981ae18 100644
--- a/src/gallium/drivers/r600/egd_tables.py
+++ b/src/gallium/drivers/r600/egd_tables.py
@@ -266,11 +266,13 @@ struct eg_packet3 {
 while value[1] >= len(values_offsets):
 values_offsets.append(-1)
 values_offsets[value[1]] = 
strings.add(strip_prefix(value[0]))
-print '\t{%s, %s(~0u), %s, %s},' % (
-strings.add(field.name), field.s_name,
+print '\t{.name_offset\t= %s,\r\n\t .mask\t\t\t= %s(~0u), \
+   \r\n\t .num_values\t= %s,\r\n\t .values_offset\t= 
%s},' \
+   % (strings.add(field.name), field.s_name,
 len(values_offsets), 
strings_offsets.add(values_offsets))
 else:
-print '\t{%s, %s(~0u)},' % (strings.add(field.name), 
field.s_name)
+print '\t{.name_offset\t= %s,\r\n\t .mask\t\t\t= 
%s(~0u)},' \
+   % (strings.add(field.name), field.s_name)
 fields_idx += 1

 print '};'
@@ -279,10 +281,13 @@ struct eg_packet3 {
 print 'static const struct eg_reg egd_reg_table[] = {'
 for reg in regs:
 if len(reg.fields):
-print '\t{%s, %s, %s, %s},' % (strings.add(reg.name), reg.r_name,
+print '\t{.name_offset\t= %s,\r\n\t .offset\t\t= %s, \
+   \r\n\t .num_fields\t= %s,\r\n\t .fields_offset\t= %s},' \
+   % (strings.add(reg.name), reg.r_name,
 len(reg.fields), reg.fields_idx if reg.own_fields else 
reg.fields_owner.fields_idx)
 else:
-print '\t{%s, %s},' % (strings.add(reg.name), reg.r_name)
+print '\t{.name_offset\t= %s,\r\n\t .offset\t\t= %s},' \
+   % (strings.add(reg.name), reg.r_name)
 print '};'
 print

-- 
2.11.0


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v3 3/3] Fix missing initializer warning in vk_format_table.h by, adding appropriate default fields in vk_format_table.py

2017-06-11 Thread Benedikt Schemmer
Fix missing initializer warning in vk_format_table.h by changing to a default 
initializer in vk_format_table.py
and correct the autogenerated from message

---
 src/amd/vulkan/vk_format_table.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/amd/vulkan/vk_format_table.py 
b/src/amd/vulkan/vk_format_table.py
index 36352b108d..139bb9544c 100644
--- a/src/amd/vulkan/vk_format_table.py
+++ b/src/amd/vulkan/vk_format_table.py
@@ -86,7 +86,7 @@ def print_channels(format, func):
 print '#endif'

 def write_format_table(formats):
-print '/* This file is autogenerated by u_format_table.py from 
u_format.csv. Do not edit directly. */'
+print '/* This file is autogenerated by vk_format_table.py from 
vk_format_layout.csv. Do not edit directly. */'
 print
 # This will print the copyright message on the top of this file
 print CopyRight.strip()
@@ -106,7 +106,7 @@ def write_format_table(formats):
 if channel.size:
 print "  {%s, %s, %s, %s, %u, %u}%s\t/* %s = %s */" % 
(type_map[channel.type],
bool_map(channel.norm), bool_map(channel.pure), bool_map(channel.scaled), 
channel.size, channel.shift, sep, "xyzw"[i],
channel.name)
 else:
-print "  {0, 0, 0, 0, 0}%s" % (sep,)
+print "  {0}%s" % (sep,)
 print "   },"

 def do_swizzle_array(channels, swizzles):
-- 
2.11.0
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v3 0/3] Fix missing initializer errors in generated tables

2017-06-11 Thread Benedikt Schemmer
This series aims to fix hundreds of missing initializer warnings in generated 
header files
when compiling with -Wextra

V1: Fix the old fashioned way by adding 0s where needed
V2: switch to designated initializers (Emil), didnt send
V3: add some layout so its easier to read and create a new version for vk that
just uses {0} instead of {0, 0, 0, 0, 0} (same thing, less zeros)

---

Because this generated such unexpected controversy and out of curiosity i wrote 
a little
test program to show the problem. Compile with -Wextra.
There is no init code generated for any variant on any compiler i tested.

gcc version 5.4.1 20170304 (Ubuntu 5.4.1-8ubuntu1)
gcc version 6.3.0 20170406 (Ubuntu 6.3.0-12ubuntu2)
gcc version 7.0.1 20170407 (experimental) [trunk revision 246759] (Ubuntu 
7-20170407-0ubuntu2)
clang version 4.0.0-1ubuntu1 (tags/RELEASE_400/rc1)
clang version 5.0.0-svn305158-0~z~padoka0 (trunk)
MSVC Compiler Version 19.00.24210

---

struct s {
   int a;
   int b;
   int c;
   int d;
   int e;
};

static const struct s str1 = {};   // gcc 5/6/7, clang 4/5 accept this 
without warning, MSVC2013 doesnt compile
(not Standard, will generate a warning with -Wpendantic, much prettier however)
static const struct s str2 = {0};  // clang 4/5 generate a warning here 
(although ANSI Standard)
static const struct s str3 = {0,}; // clang 4/5 generate a warning here 
(although ANSI Standard)
static const struct s str4 = {1,2,3};  // gcc 5/6/7, clang 4/5 generate a 
warning here
static const struct s str5 = {1,2,3,}; // gcc 5/6/7, clang 4/5 generate a 
warning here
static const struct s str6 = {1,2,3,0,0};  // this is fine with all compilers
static const struct s str7 = {.a = 1, .b = 2, .c = 3}; // this might not 
compile on MSVC <2013 but couldnt test

int main() {
   return 0;
}

---

This is what Rust does:

---

#![allow(unused_variables)]
#![allow(dead_code)]

#[derive(Default)]
struct Test {
   a: i32,
   b: i32,
   c: i32,
   d: i32,
   e: i32
}

fn main() {
//all of these wont work
//let t1 = Test {};
//let t2 = Test {0};
//let t3 = Test {0,};
//let t3b= Test {..}; would be very cool
//let t4 = Test {1,2,3};
//let t5 = Test {1,2,3,};
//let t6 = Test {1,2,3,0,0};
//let t7 = Test {a: 1, b: 1};
//let t8 = Test {a: 1, b: 1, ..}; would be cool

//only this is legal in rust
let t9: Test = Default::default();
let t10 = Test {..Default::default()};
let t11 = Test {a: 1, b: 1, ..Default::default()};
let t12 = Test {a: 1, b: 1, c: 1, d: 1, e: 1};

println!("Hello, world!");
}

---

So in the end I followed Emils suggestion of designated initializers for 
partial initialization,
to make it explicit and get rid of the warnings.

Please kindly review and push.

Thanks,
Benedikt

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] drirc: Add allow_glsl_builtin_redeclaration for Dead Island Riptide Definitive Edition

2017-05-14 Thread Benedikt Schemmer

From: Benedikt Schemmer 

This patch sets the allow_glsl_builtin_redeclaration for Dead Island 
Riptide Definitive Edition.

This requires the patch series:
[Mesa-dev] [PATCH 0/2] Allow redeclaration of GLSL builtins; fixes Dying 
Light and Dead Island Definitive Edition

by John Brooks sent Fri May 12 13:39:46 UTC 2017

Tested with Mesa git as of today and Dying Light, Dead Island Definitive 
Edition and Dead Island Riptide Definite Edition


---
 src/mesa/drivers/dri/common/drirc | 5 +
 1 file changed, 5 insertions(+)

diff --git a/src/mesa/drivers/dri/common/drirc 
b/src/mesa/drivers/dri/common/drirc

index e1b3722e6d..9f2ef6f159 100644
--- a/src/mesa/drivers/dri/common/drirc
+++ b/src/mesa/drivers/dri/common/drirc
@@ -92,6 +92,11 @@ TODO: document the other workarounds.
 value="true" />

 

+executable="DeadIslandRiptideGame">
+value="true" />

+
+
+
 
 value="true" />

 
--
2.11.0


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 0/2] Allow redeclaration of GLSL builtins; fixes Dying Light and Dead Island Definitive Edition

2017-05-14 Thread Benedikt Schemmer

Hi all,

Dying Light really works (Radeon RX460 4GB, Ubuntu 17.04, Mesa git 
patches applied plus some minor changes that i have locally to get rid 
of compiler warnings).

I did have to jump through some additional hoops to get it working:

Dying Light only launches from within Steam, Launch Options for me need 
to be

MESA_GL_VERSION_OVERRIDE=4.4 MESA_GLSL_VERSION_OVERRIDE=440

Also two libraries were missing (there are no packages in the 17.04 
repository so I had to install them manually)


I used

find -name "*.so" -exec ldd {} \; | grep not

from the games directory to find the missing dependencies. links for the 
packages below, glew needs to be renamed because the game expects a 
different version.


http://packages.ubuntu.com/yakkety/libglew1.13

sudo ln -sf /usr/lib/x86_64-linux-gnu/libGLEW.so.1.13.0 
/usr/lib/x86_64-linux-gnu/libGLEW.so.1.10


http://packages.ubuntu.com/de/xenial/amd64/libpng12-0/download

After that the biggest problem was that I died within the first few 
minutes because I havent played in months ;)


Big thanks you've made my day,
Benedikt
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] Fixes: gallium: add PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE

2017-09-18 Thread Benedikt Schemmer
There is a small typo that prevents compilation.
Please kindly review and push.
Cheers,
Benedikt

Signed-off-by: Benedikt Schemmer 

---

diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw.c 
b/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw.c
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw.c
@@ -412,7 +412,7 @@ nvc0_hw_get_query_result_resource(struct nvc0_context *nvc0,
PUSH_REFN (push, buf->bo, buf->domain | NOUVEAU_BO_WR);
BEGIN_1IC0(push, NVC0_3D(MACRO_QUERY_BUFFER_WRITE), 9);
if (q->type == PIPE_QUERY_OCCLUSION_PREDICATE ||
-   q->type ++ PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE) /* XXX what if 
64-bit? */
+   q->type == PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE) /* XXX what if 
64-bit? */
   PUSH_DATA(push, 0x0001);
else if (result_type == PIPE_QUERY_TYPE_I32)
   PUSH_DATA(push, 0x7fff);
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] Fixes: radeonsi: allow out-of-order rasterization in commutative blending cases

2017-09-18 Thread Benedikt Schemmer
Wrong commit, correct one is:
gallium: add PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE

Am 18.09.2017 um 15:27 schrieb Benedikt Schemmer:
> There is a small typo that prevents compilation.
> Please kindly review and push.
> Cheers,
> Benedikt
> 
> Signed-off-by: Benedikt Schemmer 
> 
> diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw.c 
> b/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw.c
> --- a/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw.c
> +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw.c
> @@ -412,7 +412,7 @@ nvc0_hw_get_query_result_resource(struct nvc0_context 
> *nvc0,
> PUSH_REFN (push, buf->bo, buf->domain | NOUVEAU_BO_WR);
> BEGIN_1IC0(push, NVC0_3D(MACRO_QUERY_BUFFER_WRITE), 9);
> if (q->type == PIPE_QUERY_OCCLUSION_PREDICATE ||
> -   q->type ++ PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE) /* XXX what 
> if 64-bit? */
> +   q->type == PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE) /* XXX what 
> if 64-bit? */
>PUSH_DATA(push, 0x0001);
> else if (result_type == PIPE_QUERY_TYPE_I32)
>PUSH_DATA(push, 0x7fff);
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] Fixes: radeonsi: allow out-of-order rasterization in commutative blending cases

2017-09-18 Thread Benedikt Schemmer
There is a small typo that prevents compilation.
Please kindly review and push.
Cheers,
Benedikt

Signed-off-by: Benedikt Schemmer 

diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw.c 
b/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw.c
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw.c
@@ -412,7 +412,7 @@ nvc0_hw_get_query_result_resource(struct nvc0_context *nvc0,
PUSH_REFN (push, buf->bo, buf->domain | NOUVEAU_BO_WR);
BEGIN_1IC0(push, NVC0_3D(MACRO_QUERY_BUFFER_WRITE), 9);
if (q->type == PIPE_QUERY_OCCLUSION_PREDICATE ||
-   q->type ++ PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE) /* XXX what if 
64-bit? */
+   q->type == PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE) /* XXX what if 
64-bit? */
   PUSH_DATA(push, 0x0001);
else if (result_type == PIPE_QUERY_TYPE_I32)
   PUSH_DATA(push, 0x7fff);
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] radeonsi/uvd: fix planar formats broken since f70f6baaa3bb0f8b280ac2eaea69bb

2017-09-29 Thread Benedikt Schemmer
Would it be ok to rewrite the patch like below?
Avoids code duplication and one variable.

Otherwise your original patch works fine so FWIW you have my

Tested-by: Benedikt Schemmer 

Cheers,
Benedikt


diff --git a/src/gallium/drivers/radeonsi/si_uvd.c 
b/src/gallium/drivers/radeonsi/si_uvd.c
--- a/src/gallium/drivers/radeonsi/si_uvd.c
+++ b/src/gallium/drivers/radeonsi/si_uvd.c
@@ -64,30 +64,16 @@ struct pipe_video_buffer *si_video_buffer_create(struct 
pipe_context *pipe,
template.width = align(tmpl->width, VL_MACROBLOCK_WIDTH);
template.height = align(tmpl->height / array_size, 
VL_MACROBLOCK_HEIGHT);

-   vl_video_buffer_template(&templ, &template, resource_formats[0], 1, 
array_size, PIPE_USAGE_DEFAULT, 0);
-   /* TODO: get tiling working */
-   templ.bind = PIPE_BIND_LINEAR;
-   resources[0] = (struct r600_texture *)
-   pipe->screen->resource_create(pipe->screen, &templ);
-   if (!resources[0])
-   goto error;
-
-   if (resource_formats[1] != PIPE_FORMAT_NONE) {
-   vl_video_buffer_template(&templ, &template, 
resource_formats[1], 1, array_size, PIPE_USAGE_DEFAULT, 1);
-   templ.bind = PIPE_BIND_LINEAR;
-   resources[1] = (struct r600_texture *)
-   pipe->screen->resource_create(pipe->screen, &templ);
-   if (!resources[1])
-   goto error;
-   }
-
-   if (resource_formats[2] != PIPE_FORMAT_NONE) {
-   vl_video_buffer_template(&templ, &template, 
resource_formats[2], 1, array_size, PIPE_USAGE_DEFAULT, 2);
-   templ.bind = PIPE_BIND_LINEAR;
-   resources[2] = (struct r600_texture *)
-   pipe->screen->resource_create(pipe->screen, &templ);
-   if (!resources[2])
-   goto error;
+   for (i = 0; i < VL_NUM_COMPONENTS; ++i) {
+   if (resource_formats[i] != PIPE_FORMAT_NONE) {
+   vl_video_buffer_template(&templ, &template,
+resource_formats[i], 1,
+array_size, 
PIPE_USAGE_DEFAULT, i);
+   templ.bind = PIPE_BIND_LINEAR | PIPE_BIND_SHARED;
+   resources[i] = (struct r600_texture *)
+   
pipe->screen->resource_create(pipe->screen, &templ);
+   if (!resources[i]) goto error;
+   }
}

for (i = 0; i < VL_NUM_COMPONENTS; ++i) {

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH V2] radeonsi/uvd: fix planar formats broken since f70f6baaa3bb0f8b280ac2eaea69bb

2017-09-29 Thread Benedikt Schemmer
From: Benedikt Schemmer 
Date: Fri, 29 Sep 2017 21:02:13 +0200
Subject: [PATCH V2] radeonsi/uvd: fix planar formats broken since 
f70f6baaa3bb0f8b280ac2eaea69bb

V2: remove code duplication and one unnessecary variable, minor whitespace fix

---
 src/gallium/drivers/radeonsi/si_uvd.c | 40 +--
 1 file changed, 15 insertions(+), 25 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_uvd.c 
b/src/gallium/drivers/radeonsi/si_uvd.c
index 4e8250664c..c2925ef9cd 100644
--- a/src/gallium/drivers/radeonsi/si_uvd.c
+++ b/src/gallium/drivers/radeonsi/si_uvd.c
@@ -64,30 +64,20 @@ struct pipe_video_buffer *si_video_buffer_create(struct 
pipe_context *pipe,
template.width = align(tmpl->width, VL_MACROBLOCK_WIDTH);
template.height = align(tmpl->height / array_size, 
VL_MACROBLOCK_HEIGHT);

-   vl_video_buffer_template(&templ, &template, resource_formats[0], 1, 
array_size, PIPE_USAGE_DEFAULT, 0);
/* TODO: get tiling working */
-   templ.bind = PIPE_BIND_LINEAR;
-   resources[0] = (struct r600_texture *)
-   pipe->screen->resource_create(pipe->screen, &templ);
-   if (!resources[0])
-   goto error;
-
-   if (resource_formats[1] != PIPE_FORMAT_NONE) {
-   vl_video_buffer_template(&templ, &template, 
resource_formats[1], 1, array_size, PIPE_USAGE_DEFAULT, 1);
-   templ.bind = PIPE_BIND_LINEAR;
-   resources[1] = (struct r600_texture *)
-   pipe->screen->resource_create(pipe->screen, &templ);
-   if (!resources[1])
-   goto error;
-   }
-
-   if (resource_formats[2] != PIPE_FORMAT_NONE) {
-   vl_video_buffer_template(&templ, &template, 
resource_formats[2], 1, array_size, PIPE_USAGE_DEFAULT, 2);
-   templ.bind = PIPE_BIND_LINEAR;
-   resources[2] = (struct r600_texture *)
-   pipe->screen->resource_create(pipe->screen, &templ);
-   if (!resources[2])
-   goto error;
+   
+   for (i = 0; i < VL_NUM_COMPONENTS; ++i) {
+   if (resource_formats[i] != PIPE_FORMAT_NONE) {
+   vl_video_buffer_template(&templ, &template,
+resource_formats[i], 1,
+array_size, 
PIPE_USAGE_DEFAULT, i);
+   /* Set PIPE_BIND_SHARED to avoid reallocation in 
r600_texture_get_handle,
+* which can't handle joined surfaces. */
+   templ.bind = PIPE_BIND_LINEAR | PIPE_BIND_SHARED;
+   resources[i] = (struct r600_texture *)
+   
pipe->screen->resource_create(pipe->screen, &templ);
+   if (!resources[i]) goto error;
+   }
}

for (i = 0; i < VL_NUM_COMPONENTS; ++i) {
@@ -159,8 +149,8 @@ struct pipe_video_codec *si_uvd_create_decoder(struct 
pipe_context *context,
struct si_context *ctx = (struct si_context *)context;
bool vcn = (ctx->b.family == CHIP_RAVEN) ? true : false;

-if (templ->entrypoint == PIPE_VIDEO_ENTRYPOINT_ENCODE)
-return si_vce_create_encoder(context, templ, ctx->b.ws, 
si_vce_get_buffer);
+   if (templ->entrypoint == PIPE_VIDEO_ENTRYPOINT_ENCODE)
+   return si_vce_create_encoder(context, templ, ctx->b.ws, 
si_vce_get_buffer);

return (vcn) ?  radeon_create_decoder(context, templ) :
si_common_uvd_create_decoder(context, templ, si_uvd_set_dtb);
-- 
2.11.0
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH V2] radeonsi/uvd: fix planar formats broken since f70f6baaa3bb0f8b280ac2eaea69bb

2017-09-30 Thread Benedikt Schemmer
>
> It should be handled as an error if resource_formats[0] is PIPE_FORMAT_NONE.
>
> Better write this as:
>
> if (resource_formats[i] == PIPE_FORMAT_NONE) {
> if (i == 0)
> gotot error;
> continue;
> }
> ...
>

I dont think it can be zero.
In the beginning of vl_video_buffer.c formats and return values are defined and 
resource_formats[0] is guranteed to not
be PIPE_FORMAT_NONE. The only other return value is null which caught 
immediately after requesting resource_formats
(line 59, in si_uvd.c).
So maybe
assert(resource_formats[0] == PIPE_FORMAT_NONE)
for debugging purposes?
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH V2] radeonsi/uvd: fix planar formats broken since f70f6baaa3bb0f8b280ac2eaea69bb

2017-09-30 Thread Benedikt Schemmer
yes

Am 30.09.2017 um 17:53 schrieb Marek Olšák:
> On Sat, Sep 30, 2017 at 5:47 PM, Christian König
>  wrote:
>> Am 30.09.2017 um 12:57 schrieb Benedikt Schemmer:
>>>>
>>>> It should be handled as an error if resource_formats[0] is
>>>> PIPE_FORMAT_NONE.
>>>>
>>>> Better write this as:
>>>>
>>>> if (resource_formats[i] == PIPE_FORMAT_NONE) {
>>>>  if (i == 0)
>>>>  gotot error;
>>>>  continue;
>>>> }
>>>> ...
>>>>
>>> I dont think it can be zero.
>>> In the beginning of vl_video_buffer.c formats and return values are
>>> defined and resource_formats[0] is guranteed to not
>>> be PIPE_FORMAT_NONE. The only other return value is null which caught
>>> immediately after requesting resource_formats
>>> (line 59, in si_uvd.c).
>>> So maybe
>>> assert(resource_formats[0] == PIPE_FORMAT_NONE)
>>> for debugging purposes?
> 
> Did you mean:
> assert(resource_formats[0] != PIPE_FORMAT_NONE) ?
> 
> Marek
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH V3] radeonsi/uvd: fix planar formats broken since f70f6baaa3bb0f8b280ac2eaea69bb

2017-09-30 Thread Benedikt Schemmer
From: Benedikt Schemmer 
Date: Sat, 30 Sep 2017 18:05:43 +0200
Subject: [PATCH V3] radeonsi/uvd: fix planar formats broken since 
f70f6baaa3bb0f8b280ac2eaea69bb

 V1: Marek Olsak
 V2: remove code duplication and one unnecessary variable, minor whitespace fix
 V3: ensure valid resource format when debugging (is assumed to be valid in 
release) (suggested by Christian König)

---
 src/gallium/drivers/radeonsi/si_uvd.c | 41 +++
 1 file changed, 17 insertions(+), 24 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_uvd.c 
b/src/gallium/drivers/radeonsi/si_uvd.c
index 4e8250664c..3cb45075c4 100644
--- a/src/gallium/drivers/radeonsi/si_uvd.c
+++ b/src/gallium/drivers/radeonsi/si_uvd.c
@@ -64,30 +64,23 @@ struct pipe_video_buffer *si_video_buffer_create(struct 
pipe_context *pipe,
template.width = align(tmpl->width, VL_MACROBLOCK_WIDTH);
template.height = align(tmpl->height / array_size, 
VL_MACROBLOCK_HEIGHT);

-   vl_video_buffer_template(&templ, &template, resource_formats[0], 1, 
array_size, PIPE_USAGE_DEFAULT, 0);
/* TODO: get tiling working */
-   templ.bind = PIPE_BIND_LINEAR;
-   resources[0] = (struct r600_texture *)
-   pipe->screen->resource_create(pipe->screen, &templ);
-   if (!resources[0])
-   goto error;
-
-   if (resource_formats[1] != PIPE_FORMAT_NONE) {
-   vl_video_buffer_template(&templ, &template, 
resource_formats[1], 1, array_size, PIPE_USAGE_DEFAULT, 1);
-   templ.bind = PIPE_BIND_LINEAR;
-   resources[1] = (struct r600_texture *)
-   pipe->screen->resource_create(pipe->screen, &templ);
-   if (!resources[1])
-   goto error;
-   }

-   if (resource_formats[2] != PIPE_FORMAT_NONE) {
-   vl_video_buffer_template(&templ, &template, 
resource_formats[2], 1, array_size, PIPE_USAGE_DEFAULT, 2);
-   templ.bind = PIPE_BIND_LINEAR;
-   resources[2] = (struct r600_texture *)
-   pipe->screen->resource_create(pipe->screen, &templ);
-   if (!resources[2])
-   goto error;
+   assert(resource_formats[0] != PIPE_FORMAT_NONE);
+
+   for (i = 0; i < VL_NUM_COMPONENTS; ++i) {
+   if (resource_formats[i] != PIPE_FORMAT_NONE) {
+   vl_video_buffer_template(&templ, &template,
+resource_formats[i], 1,
+array_size, 
PIPE_USAGE_DEFAULT, i);
+   /* Set PIPE_BIND_SHARED to avoid reallocation in 
r600_texture_get_handle,
+* which can't handle joined surfaces. */
+   templ.bind = PIPE_BIND_LINEAR | PIPE_BIND_SHARED;
+   resources[i] = (struct r600_texture *)
+   
pipe->screen->resource_create(pipe->screen, &templ);
+   if (!resources[i])
+   goto error;
+   }
}

for (i = 0; i < VL_NUM_COMPONENTS; ++i) {
@@ -159,8 +152,8 @@ struct pipe_video_codec *si_uvd_create_decoder(struct 
pipe_context *context,
struct si_context *ctx = (struct si_context *)context;
bool vcn = (ctx->b.family == CHIP_RAVEN) ? true : false;

-if (templ->entrypoint == PIPE_VIDEO_ENTRYPOINT_ENCODE)
-return si_vce_create_encoder(context, templ, ctx->b.ws, 
si_vce_get_buffer);
+   if (templ->entrypoint == PIPE_VIDEO_ENTRYPOINT_ENCODE)
+   return si_vce_create_encoder(context, templ, ctx->b.ws, 
si_vce_get_buffer);

return (vcn) ?  radeon_create_decoder(context, templ) :
si_common_uvd_create_decoder(context, templ, si_uvd_set_dtb);
-- 
2.11.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v4 00/15] TGSI: improved live range tracking, also including arrays

2018-06-06 Thread Benedikt Schemmer
Hi Gert, Hi Nicolai,

I did play around with this quite a lot (mostly with the previous version) and 
found it to be
stable (doesn't crash deus ex on start up from cold shader cache like NIR) or 
leak memory
like llvm 7 (at least used to leak ~100MB with piglit shaders,
haven't given it another try in the last two weeks).

The arrays-of-arrays piglit shaders I sent you still crash though Gert.

On radeonsi it is mostly (or probably entirely) useless ;)
But it does have an interesting property when I tweak it (see below), results 
of today with v4 and llvm 6:

 BIGGEST IMPROVEMENTS - Max Waves
 Before After Delta Percentage
  3 8 5  166.67 %   
../shader-db/shaders/piglit/988eefdf4bb05e5a3ecc4a5c0b4c4ef54047a5a9_4.shader_test
 [1]
  5 7 2   40.00 %   
../shader-db/shaders/ruiner/0967c5fce7fc456496b1cfa25fbb1d1c4dcf9bed_2958.shader_test
 [0]
  5 7 2   40.00 %   ../shader-db/shaders/cat/1847.shader_test 
[0]
  6 7 1   16.67 %   
../shader-db/shaders/serioussam2017/2160.shader_test [0]
  6 7 1   16.67 %   
../shader-db/shaders/serioussam2017/f1e9a7f8bb8be17b8231116cf68bc10677769709_2149.shader_test
 [0]
  1 2 1  100.00 %   
../shader-db/shaders/deusex_mankind/5b1006a267c95f5c245266d82699d53cad704aab_4008.shader_test
 [0]
  1 2 1  100.00 %   
../shader-db/shaders/deusex_mankind/14060e8c592408bb9b6059b27a72eeb1e66c7480_8288.shader_test
 [0]
  1 2 1  100.00 %   
../shader-db/shaders/deusex_mankind/cb6356f4da76e5a82d6036e811c680ae00249c68_994.shader_test
 [0]

PERCENTAGE DELTASShaders SGPRs VGPRs SpillSGPR SpillVGPR  PrivVGPR  
 Scratch  CodeSize  MaxWaves Waits
All affected 630   -6.28 %   -1.19 %   -0.07 % . .  
   .0.02 %0.42 % .

No regressions with max waves otherwise (but of course I don't own every game 
and
even from those I don't have a shaderdump of every one of them).

But still, from the values I found there may be an opportunity for an 
'intermediate range'
optimization somewhere.

Cheers,
Benedikt


---

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -5533,21 +5533,22 @@ glsl_to_tgsi_visitor::merge_registers(void)
}


-   if (get_temp_registers_required_live_ranges(reg_live_ranges, 
&this->instructions,
+//   if (get_temp_registers_required_live_ranges(reg_live_ranges, 
&this->instructions,
+ get_temp_registers_required_live_ranges(reg_live_ranges, 
&this->instructions,
this->next_temp, 
reg_live_ranges,
-   this->next_array, 
arr_live_ranges)) {
+   this->next_array, 
arr_live_ranges);//) {
   struct rename_reg_pair *renames =
 rzalloc_array(reg_live_ranges, struct rename_reg_pair, 
this->next_temp);
   get_temp_registers_remapping(reg_live_ranges, this->next_temp,
reg_live_ranges, renames);
   rename_temp_registers(renames);

-  this->next_array =  merge_arrays(this->next_array, this->array_sizes,
-   &this->instructions, arr_live_ranges);
+//  this->next_array =  merge_arrays(this->next_array, this->array_sizes,
+//   &this->instructions, arr_live_ranges);

   if (arr_live_ranges)
  delete[] arr_live_ranges;
-   }
+//   }
ralloc_free(reg_live_ranges);
 }

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
--- a/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
@@ -1330,6 +1330,24 @@ static int register_merge_record_compare (const void *a, 
const void *b) {
 }
 #endif

+/* these magic numbers are determined by looking at the results of shader-db */
+bool should_merge (int distance)
+{
+   switch (distance) {
+   case 12 ... 126: //lower bound interfering with llvm?, upper bound here
+   case 244 ... 768: // and lower bound here determined by one regressing 
tombraider shader
+// nothing to see here
+   case 2432 ... 2496: // purely empiricily determined
+   case 2497 ... 2623: // Deus Ex
+   case 2624 ... 2688:
+// case 2689 ... 2943: // causes regressions in ubershaders
+   case 2944 ... 3072: // above isnt used
+  return true;
+   default:
+  return false;
+   }
+}
+
 /* This functions evaluates the register merges by using a binary
  * search to find suitable merge candidates. */
 void get_temp_registers_remapping(void *mem_ctx, int ntemps,
@@ -1361,10 +1379,18 @@ void get_temp_registers_remapping(void *mem_ctx, int 
ntemps,
register_merge_record *first_erase = reg_access_en

Re: [Mesa-dev] [PATCH 0/9] RadeonSI LLVM crash workaround for Ubuntu 18.04

2018-04-17 Thread Benedikt Schemmer
For the series:

Tested-by: Benedikt Schemmer 



Tested with
Series starting with [Mesa-dev] [PATCH 1/7] ac/surface: handle DCC
subresource fast clear restriction on VI
Series starting with [Mesa-dev] [PATCH 1/9] radeonsi: remove
r600_pipe_common.h
Series starting with [Mesa-dev] [PATCH 0/9] RadeonSI LLVM crash
workaround for Ubuntu 18.04

Kernel 4.16.0+ Mareks kernel patches for IB Partial flush i.e. DRM 3.26.0

I did not test Mareks patches isolated for performance gains
This is git since 9.4.18 plus Mareks patches

RX560 4GB OC (but with 50 Watt cap, before: 60W got too hot)

Deus Ex
1. run: 24.8/5.4/36.5
2. run: 24.9/13.8/37.9
before
1. run: 19.5/8.5/29.7
2. run: 19.8/9.2/30.2

Dirt Rallye
1. run 58/44.7/72
2. run 58.5/44/83
before
1. run: 55/42/73
2. run: 55/43/85

Dead Island classic crash with NIR during shader compilation, TGSI works
fine
Dead Island Definitive works fine with TGSI
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] mesa/main: Rework the shader capture naming convention

2018-04-28 Thread Benedikt Schemmer
Change from a purely number.shader_test naming scheme to sha_number.shader_test
because especially games often use the same number for entirely different 
shaders
based on graphics settings etc. and then already captured shaders get 
overwritten.
It is also useful for capturing shaders from applications that reuse program
numbers a lot i.e. piglit. This has helped to identify problems there as well.

Plus minor cleanups, that could be split into a separate patch.
I have used this for approximately two months now, seems to work just fine.

---
 src/mesa/main/shaderapi.c | 315 --
 1 file changed, 168 insertions(+), 147 deletions(-)

diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
index 44b18af492..65cf0a67a2 100644
--- a/src/mesa/main/shaderapi.c
+++ b/src/mesa/main/shaderapi.c
@@ -64,6 +64,122 @@
 #include "util/mesa-sha1.h"
 #include "util/crc32.h"

+
+/**
+ * Generate a SHA-1 hash value string for given source string.
+ */
+static void
+generate_sha1(const char *source, char sha_str[64])
+{
+   unsigned char sha[20];
+   _mesa_sha1_compute(source, strlen(source), sha);
+   _mesa_sha1_format(sha_str, sha);
+}
+
+
+#ifdef ENABLE_SHADER_CACHE
+/**
+ * Construct a full path for shader replacement functionality using
+ * following format:
+ *
+ * /_.glsl
+ */
+static char *
+construct_name(const gl_shader_stage stage, const char *source,
+   const char *path)
+{
+   char sha[64];
+   static const char *types[] = {
+  "VS", "TC", "TE", "GS", "FS", "CS",
+   };
+
+   generate_sha1(source, sha);
+   return ralloc_asprintf(NULL, "%s/%s_%s.glsl", path, types[stage], sha);
+}
+
+/**
+ * Write given shader source to a file in MESA_SHADER_DUMP_PATH.
+ */
+static void
+dump_shader(const gl_shader_stage stage, const char *source)
+{
+   static bool path_exists = true;
+   char *dump_path;
+   FILE *f;
+
+   if (!path_exists)
+  return;
+
+   dump_path = getenv("MESA_SHADER_DUMP_PATH");
+   if (!dump_path) {
+  path_exists = false;
+  return;
+   }
+
+   char *name = construct_name(stage, source, dump_path);
+
+   f = fopen(name, "w");
+   if (f) {
+  fputs(source, f);
+  fclose(f);
+   } else {
+  GET_CURRENT_CONTEXT(ctx);
+  _mesa_warning(ctx, "could not open %s for dumping shader (%s)", name,
+strerror(errno));
+   }
+   ralloc_free(name);
+}
+
+/**
+ * Read shader source code from a file.
+ * Useful for debugging to override an app's shader.
+ */
+static GLcharARB *
+read_shader(const gl_shader_stage stage, const char *source)
+{
+   char *read_path;
+   static bool path_exists = true;
+   int len, shader_size = 0;
+   GLcharARB *buffer;
+   FILE *f;
+
+   if (!path_exists)
+  return NULL;
+
+   read_path = getenv("MESA_SHADER_READ_PATH");
+   if (!read_path) {
+  path_exists = false;
+  return NULL;
+   }
+
+   char *name = construct_name(stage, source, read_path);
+   f = fopen(name, "r");
+   ralloc_free(name);
+   if (!f)
+  return NULL;
+
+   /* allocate enough room for the entire shader */
+   fseek(f, 0, SEEK_END);
+   shader_size = ftell(f);
+   rewind(f);
+   assert(shader_size);
+
+   /* add one for terminating zero */
+   shader_size++;
+
+   buffer = malloc(shader_size);
+   assert(buffer);
+
+   len = fread(buffer, 1, shader_size, f);
+   buffer[len] = 0;
+
+   fclose(f);
+
+   return buffer;
+}
+
+#endif /* ENABLE_SHADER_CACHE */
+
 /**
  * Return mask of GLSL_x flags by examining the MESA_GLSL env var.
  */
@@ -125,11 +241,10 @@ _mesa_init_shader_state(struct gl_context *ctx)
/* Device drivers may override these to control what kind of instructions
 * are generated by the GLSL compiler.
 */
-   struct gl_shader_compiler_options options;
+   struct gl_shader_compiler_options options = {};
gl_shader_stage sh;
int i;

-   memset(&options, 0, sizeof(options));
options.MaxUnrollIterations = 32;
options.MaxIfDepth = UINT_MAX;

@@ -138,7 +253,7 @@ _mesa_init_shader_state(struct gl_context *ctx)

ctx->Shader.Flags = _mesa_get_shader_flags();

-   if (ctx->Shader.Flags != 0)
+   if (ctx->Shader.Flags)
   ctx->Const.GenerateTemporaryNames = true;

/* Extended for ARB_separate_shader_objects */
@@ -655,7 +770,8 @@ get_programiv(struct gl_context *ctx, GLuint program, 
GLenum pname,
   GLint *params)
 {
struct gl_shader_program *shProg
-  = _mesa_lookup_shader_program_err(ctx, program, 
"glGetProgramiv(program)");
+  = _mesa_lookup_shader_program_err(ctx, program,
+"glGetProgramiv(program)");

/* Is transform feedback available in this context?
 */
@@ -837,7 +953,7 @@ get_programiv(struct gl_context *ctx, GLuint program, 
GLenum pname,
   *params = shProg->BinaryRetreivableHint;
   return;
case GL_PROGRAM_BINARY_LENGTH:
-  if (ctx->Const.NumProgramBinaryFormats == 0) {
+  if (!ctx->Const.NumProgramBinaryFormats) {
  *params = 0;
   

Re: [Mesa-dev] [PATCH v3 00/13] TGSI: improved live range tracking, also including arrays

2018-04-28 Thread Benedikt Schemmer
The patches apply cleanly, however I just did a shader-db test run and can't 
find a difference with your patch
applied, am I doing something wrong?

compile times went up though:

before:
Thread 3 took 113.72 seconds and compiled 17899 shaders (not including SIMD16) 
with 2232 GL context switches
Thread 5 took 113.23 seconds and compiled 17767 shaders (not including SIMD16) 
with 2150 GL context switches
Thread 7 took 116.63 seconds and compiled 18030 shaders (not including SIMD16) 
with 2219 GL context switches
Thread 1 took 117.10 seconds and compiled 17966 shaders (not including SIMD16) 
with 2154 GL context switches
Thread 4 took 113.76 seconds and compiled 18097 shaders (not including SIMD16) 
with 2285 GL context switches
Thread 2 took 113.61 seconds and compiled 17111 shaders (not including SIMD16) 
with 1934 GL context switches
Thread 6 took 118.93 seconds and compiled 17887 shaders (not including SIMD16) 
with 2205 GL context switches
Thread 0 took 112.91 seconds and compiled 18232 shaders (not including SIMD16) 
with 2321 GL context switches

with your patch:
Thread 1 took 119.41 seconds and compiled 18495 shaders (not including SIMD16) 
with 2237 GL context switches
Thread 7 took 122.11 seconds and compiled 17228 shaders (not including SIMD16) 
with 2105 GL context switches
Thread 4 took 120.57 seconds and compiled 17989 shaders (not including SIMD16) 
with 2165 GL context switches
Thread 5 took 119.79 seconds and compiled 17709 shaders (not including SIMD16) 
with 2190 GL context switches
Thread 6 took 121.95 seconds and compiled 17804 shaders (not including SIMD16) 
with 2209 GL context switches
Thread 2 took 121.43 seconds and compiled 17819 shaders (not including SIMD16) 
with 2167 GL context switches
Thread 0 took 117.16 seconds and compiled 18180 shaders (not including SIMD16) 
with 2151 GL context switches
Thread 3 took 124.79 seconds and compiled 17765 shaders (not including SIMD16) 
with 2176 GL context switches



Radeon RX 560 Series (POLARIS11, DRM 3.26.0, 4.17.0-rc2+, LLVM 6.0.0)

Mesa git of today

PERCENTAGE DELTASShaders SGPRs VGPRs SpillSGPR SpillVGPR  PrivVGPR  
 Scratch  CodeSize  MaxWaves Waits
 0ad6 . . . . . 
. . . .
 aer  590 . . . . . 
. . . .
 alien_isolation 1414 . . . . . 
. . . .
 anholt10 . . . . . 
. . . .
 bioshock_infinite   2581 . . . . . 
. . . .
 blackmesa584 . . . . . 
. . . .
 cat  573 . . . . . 
. . . .
 csgo1392 . . . . . 
. . . .
 deadisland_definitive   1776 . . . . . 
. . . .
 deadisland_original11602 . . . . . 
. . . .
 deadisland_riptide_..293 . . . . . 
. . . .
 deusex_mankind  5051 . . . . . 
. . . .
 dirtrally787 . . . . . 
. . . .
 dolphin   22 . . . . . 
. . . .
 dyinglight  4012 . . . . . 
. . . .
 eurotruck2   216 . . . . . 
. . . .
 f1_2015  746 . . . . . 
. . . .
 glamor16 . . . . . 
. . . .
 hl2ep1   294 . . . . . 
. . . .
 hl2ep2   154 . . . . . 
. . . .
 hl2lostcoast  66 . . . . . 
. . . .
 hlsl3582 . . . . . 
. . . .
 humus-celshading   4 . . . . . 
. . . .
 humus-domino   6 . . . . . 
. . . .
 humus-dyna

Re: [Mesa-dev] [PATCH v3 00/13] TGSI: improved live range tracking, also including arrays

2018-04-28 Thread Benedikt Schemmer
Hi Gert

Am 28.04.2018 um 23:51 schrieb Gert Wollny:
> Am Samstag, den 28.04.2018, 22:43 +0200 schrieb Benedikt Schemmer:
>> The patches apply cleanly, however I just did a shader-db test run
>> and can't find a difference with your patch
>> applied, am I doing something wrong?
> 
> AFAIK radeonsi doesn't use the register-merge optimizer in TGSI.
> 

Ah, ok. Was wondering why your debug code doesn't output anything.
Makes sense now ;)

So is this useless on radeonsi?
Seemed interesting to me.

>>
>> compile times went up though:
> This is strange, because "see above". Did you compile with debug
> information and c++11 or higher enables? In this case there is one
> access to a static variable (first patch)  that per c++11 standard
> should be thread save, which means that there might be a mutex
> protecting access to that variable, and this would explain the longer
> runtime in a multi-threaded environment.  
> 

not intentionally:

   prefix:  /usr/local
exec_prefix: ${prefix}
libdir:  ${prefix}/lib
includedir:  ${prefix}/include

OpenGL:  yes (ES1: yes ES2: yes)

OSMesa:  libOSMesa

DRI platform:drm
DRI drivers: i915 i965 nouveau r200 radeon swrast
DRI driver dir:  ${prefix}/lib/dri
GLX: DRI-based

EGL: yes
EGL drivers: builtin:egl_dri2 builtin:egl_dri3
GBM: yes
EGL/Vulkan/VL platforms:   x11 wayland drm

Vulkan drivers:  intel radeon
Vulkan ICD dir:  ${datarootdir}/vulkan/icd.d

llvm:yes
llvm-config: llvm-config-6.0
llvm-version:6.0.0

Gallium drivers: nouveau svga r600 r300 i915 virgl radeonsi swrast
Gallium st:  mesa xa xvmc vdpau omx_bellagio va nine clover

HUD extra stats: yes
HUD lmsensors:   yes


Shared libs: yes
Static libs: no
Shared-glapi:yes

CFLAGS:  -O3 -fstack-protector-strong -Wall -Wextra 
-Werror=format-security -fno-omit-frame-pointer -Wall 
-Werror=implicit-function-declaration -Werror=missing-prototypes
-Wmissing-prototypes -fno-math-errno -fno-trapping-math -std=c99
CXXFLAGS:-O3 -fstack-protector-strong -Wall -Wextra 
-Werror=format-security -fno-omit-frame-pointer -Wall -fno-math-errno 
-fno-trapping-math
CXX11_CXXFLAGS:
LDFLAGS: -Bsymbolic-functions -z relro
Macros:  -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS 
-D__STDC_LIMIT_MACROS -D_GNU_SOURCE -DUSE_SSE41 -DUSE_GCC_ATOMIC_BUILTINS 
-DNDEBUG -DTEXTURE_FLOAT_ENABLED -DUSE_X86_64_ASM
-DHAVE_SYS_SYSCTL_H -DHAVE_STRTOF -DHAVE_MKOSTEMP -DHAVE_TIMESPEC_GET 
-DHAVE_STRTOD_L -DHAVE_DL_ITERATE_PHDR -DHAVE_POSIX_MEMALIGN -DHAVE_ZLIB 
-DHAVE_LINUX_FUTEX_H -DHAVE_GALLIUM_EXTRA_HUD=1
-DHAVE_LIBSENSORS=1 -DHAVE_LIBDRM -DGLX_USE_DRM -DGLX_INDIRECT_RENDERING 
-DGLX_DIRECT_RENDERING -DGLX_USE_TLS -DHAVE_X11_PLATFORM 
-DHAVE_WAYLAND_PLATFORM -DWL_HIDE_DEPRECATED -DHAVE_DRM_PLATFORM
-DHAVE_DRI3 -DHAVE_DRI3_MODIFIERS -DENABLE_SHADER_CACHE -DHAVE_MINCORE 
-DHAVE_ST_VDPAU -DHAVE_LLVM=0x0600 -DMESA_LLVM_VERSION_PATCH=0

LLVM_CFLAGS: -I/usr/lib/llvm-6.0/include -D__STDC_CONSTANT_MACROS 
-D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS
LLVM_CXXFLAGS:   -I/usr/lib/llvm-6.0/include -std=c++0x -std=c++11 
-D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS
LLVM_CPPFLAGS:   -I/usr/lib/llvm-6.0/include -D__STDC_CONSTANT_MACROS 
-D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS
LLVM_LDFLAGS:-L/usr/lib/llvm-6.0/lib

PYTHON2: python2.7

Run 'make' to build Mesa

Cheers,
Benedikt


> Best, 
> Gert
> 
>>
>> before:
>> Thread 3 took 113.72 seconds and compiled 17899 shaders (not
>> including SIMD16) with 2232 GL context switches
>> Thread 5 took 113.23 seconds and compiled 17767 shaders (not
>> including SIMD16) with 2150 GL context switches
>> Thread 7 took 116.63 seconds and compiled 18030 shaders (not
>> including SIMD16) with 2219 GL context switches
>> Thread 1 took 117.10 seconds and compiled 17966 shaders (not
>> including SIMD16) with 2154 GL context switches
>> Thread 4 took 113.76 seconds and compiled 18097 shaders (not
>> including SIMD16) with 2285 GL context switches
>> Thread 2 took 113.61 seconds and compiled 17111 shaders (not
>> including SIMD16) with 1934 GL context switches
>> Thread 6 took 118.93 seconds and compiled 17887 shaders (not
>> including SIMD16) with 2205 GL context switches
>> Thread 0 took 112.91 seconds and compiled 18232 shaders (not
>> including SIMD16) with 2321 GL context switches
>>
>> with your patch:
>> Thread 1 took 119

Re: [Mesa-dev] [PATCH v3 00/13] TGSI: improved live range tracking, also including arrays

2018-04-29 Thread Benedikt Schemmer
 
. . . .
 warsow   176 . . . . . 
. . . .
 warzone21004 . . . . . 
. . . .
 witcher2 928   -0.07 %0.06 % . . . 
.0.04 % . .
 x3_albion641 . . . . . 
. . . .
 xblades  208 . . . . . 
.0.42 % . .
 xcom1020   -0.10 % . . . . 
.0.28 % . .
 xcom2   1439 . . . . . 
. . . .
 yofrankie 82 . . . . . 
. . . .
 
--
 All affected63940.04 %0.16 %0.46 % .7.41 % 
   6.67 %0.51 %   -0.09 % .
 
--
 Total  52662 .0.03 %0.26 % .1.34 % 
   1.09 %0.13 %   -0.01 % .


If theres an easy way to figure out when your code makes it worse and when its 
an improvement this would be really nice.

Really interesting.

Cheers, Benedikt

Am 29.04.2018 um 09:55 schrieb Gert Wollny:
> Hello Benedikt, 
> 
> Am Sonntag, den 29.04.2018, 00:06 +0200 schrieb Benedikt Schemmer:
>> Hi Gert
>>
>> Am 28.04.2018 um 23:51 schrieb Gert Wollny:
>>> Am Samstag, den 28.04.2018, 22:43 +0200 schrieb Benedikt Schemmer:
>>>> The patches apply cleanly, however I just did a shader-db test
>>>> run
>>>> and can't find a difference with your patch
>>>> applied, am I doing something wrong?
>>>
>>> AFAIK radeonsi doesn't use the register-merge optimizer in TGSI.
>>>
>>
>> Ah, ok. Was wondering why your debug code doesn't output anything.
>> Makes sense now ;)
> Not exactly, the reason there is no output is because -DNDEBUG is set.
> Without it the statistics should also be printed out on radeonsi, but
> thinking of it I should probably disable it when register_merge is not
> accessed, because without this the numbers will be inflated and don't
> have much meaning.
> 
>> So is this useless on radeonsi?
> Indeed. 
> 
>> Seemed interesting to me.
> :) it certainly helps on r600 
> 
> 
>>>> compile times went up though:
>>>
>>> This is strange, because "see above". Did you compile with debug
>>> information and c++11 or higher enables?
> ...
>>>   
>>>
>> not intentionally:
> 
> Then you should actually not run any code that this series adds to
> mesa. I checked again, apart from the debugging output nothing will
> ever be run if a drivers that report
> PIPE_SHADER_CAP_TGSI_SKIP_MERGE_REGISTERS != 0 (as does radeonsi). 
> 
> Best, 
> Gert 
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v3 00/13] TGSI: improved live range tracking, also including arrays

2018-04-29 Thread Benedikt Schemmer


Am 29.04.2018 um 11:34 schrieb Gert Wollny:
> Am Sonntag, den 29.04.2018, 10:43 +0200 schrieb Benedikt Schemmer:
>> Hi Gert,
>>
>> couldn't resist at least to try what would happen if I enable
>> register merge for radeonsi:
>>
>>  PERCENTAGE DELTASShaders SGPRs VGPRs SpillSGPR
>> SpillVGPR  PrivVGPR   Scratch  CodeSize  MaxWaves Waits
>>  piglit 80732   -0.16 %   -0.02
>> % . .0.87 %0.86 %0.04 % . .
>>  --
>> 
>>  All affected 513  -17.58 %   -2.30
>> % . .4.12 %5.87 %1.73 %0.10 % .
>>  --
>> 
>>  Total  80732   -0.16 %   -0.02
>> % . .0.87 %0.86 %0.04 % . .
>>
>> I had already removed the defines around the debug code so thats also
>> happily outputting data.
>>
>> fails with two piglit shaders:
> Which are the names of these test? I'd like to check this on r600,
> because here I didn't see any regressions last time I checked. 
> 
might of course be different on r600 (is bindless available?),
also shader-db is more sensitive to problems than piglit

1. tests/spec/arb_bindless_texture/compiler/images/arrays-of-struct.frag
2. tests/spec/arb_bindless_texture/compiler/samplers/arrays-of-struct.frag

> 
>> Real world is a little different:
> 
> I guess these tests refer to enabled register_merge - without and with
> this patch set, no? 
> 
> Out of curiosity, did you also look at how enabling register_merge
> (before this series) impacts the result as compared to the normal
> operation of radeonsi? 

no I didn't
but if I do

(old vs new)
nothing vs register merge:

 PERCENTAGE DELTASShaders SGPRs VGPRs SpillSGPR SpillVGPR  PrivVGPR 
  Scratch  CodeSize  MaxWaves Waits
 piglit 80732 . . . . . 
.0.06 % . .
 
--
 All affected 4350.89 %0.36 % . . . 
.3.61 %   -0.06 % .
 
--
 Total  80732 . . . . . 
.0.06 % . .

register merge vs yours:

 PERCENTAGE DELTASShaders SGPRs VGPRs SpillSGPR SpillVGPR  PrivVGPR 
  Scratch  CodeSize  MaxWaves Waits
 piglit 80732   -0.16 %   -0.02 % . .0.87 % 
   0.86 %   -0.02 % . .
 
--
 All affected  83  -56.92 %  -14.22 % . .   11.67 % 
  16.67 %   -2.86 %0.93 % .
 
--
 Total  80732   -0.16 %   -0.02 % . .0.87 % 
   0.86 %   -0.02 % . .

nothing vs yours:

 PERCENTAGE DELTASShaders SGPRs VGPRs SpillSGPR SpillVGPR  PrivVGPR 
  Scratch  CodeSize  MaxWaves Waits
 piglit 80732   -0.16 %   -0.02 % . .0.87 % 
   0.86 %0.04 % . .
 
--
 All affected 513  -17.58 %   -2.30 % . .4.12 % 
   5.87 %1.73 %0.10 % .
 
--
 Total  80732   -0.16 %   -0.02 % . .0.87 % 
   0.86 %0.04 % . .


>  
> 
>> If theres an easy way to figure out when your code makes it worse and
>> when its an improvement this would be really nice.
> 
> My insentive for this series was, that on r600 the arrays are allocated
>  before the final optimization pass on the byte code that requires that
> the number of registers is <= 124. When I started this no spilling was
> implemented, and shaders with too many arrays and registers would
> simply fail. Now spilling is impelmented, but AFAIK reducing the
> numbers of registers in the final optimization pass does not result in
> changed spilling, so bringing down the number of registers before tgsi-
> to-bytecode is still of i

Re: [Mesa-dev] [PATCH v3 00/13] TGSI: improved live range tracking, also including arrays

2018-04-29 Thread Benedikt Schemmer
 . . . . . 
. . . .
 
--
 All affected6226   -0.10 %   -0.11 %   -0.83 % . . 
.   -0.47 %0.05 % .
 
--
 Total  52662   -0.01 %   -0.02 %   -0.45 % . . 
.   -0.12 % . .


Am 29.04.2018 um 11:34 schrieb Gert Wollny:
> Am Sonntag, den 29.04.2018, 10:43 +0200 schrieb Benedikt Schemmer:
>> Hi Gert,
>>
>> couldn't resist at least to try what would happen if I enable
>> register merge for radeonsi:
>>
>>  PERCENTAGE DELTASShaders SGPRs VGPRs SpillSGPR
>> SpillVGPR  PrivVGPR   Scratch  CodeSize  MaxWaves Waits
>>  piglit 80732   -0.16 %   -0.02
>> % . .0.87 %0.86 %0.04 % . .
>>  --
>> 
>>  All affected 513  -17.58 %   -2.30
>> % . .4.12 %5.87 %1.73 %0.10 % .
>>  --
>> 
>>  Total  80732   -0.16 %   -0.02
>> % . .0.87 %0.86 %0.04 % . .
>>
>> I had already removed the defines around the debug code so thats also
>> happily outputting data.
>>
>> fails with two piglit shaders:
> Which are the names of these test? I'd like to check this on r600,
> because here I didn't see any regressions last time I checked. 
> 
> 
>> Real world is a little different:
> 
> I guess these tests refer to enabled register_merge - without and with
> this patch set, no? 
> 
> Out of curiosity, did you also look at how enabling register_merge
> (before this series) impacts the result as compared to the normal
> operation of radeonsi? 
>  
> 
>> If theres an easy way to figure out when your code makes it worse and
>> when its an improvement this would be really nice.
> 
> My insentive for this series was, that on r600 the arrays are allocated
>  before the final optimization pass on the byte code that requires that
> the number of registers is <= 124. When I started this no spilling was
> implemented, and shaders with too many arrays and registers would
> simply fail. Now spilling is impelmented, but AFAIK reducing the
> numbers of registers in the final optimization pass does not result in
> changed spilling, so bringing down the number of registers before tgsi-
> to-bytecode is still of interest. 
> 
> For radeonsi my guess would be that the llvm optimizer works better
> when the registers are not yet merged, and that would be the reason why
> register_merge is disabled. 
> 
> In any case, Timothy wrote in this thread [1] (last message) that he
> had similar patches for NIR. 
> 
> Best, 
> Gert 
> 
> [1] https://patchwork.freedesktop.org/patch/189842/
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v3 00/13] TGSI: improved live range tracking, also including arrays

2018-04-29 Thread Benedikt Schemmer
 . . .
 humus-portals  2 . . . . . 
. . . .
 humus-volumetricfog..  6 . . . . . 
. . . .
 kerbal  1016 . . . . . 
. . . .
 larago   664 . . . . . 
. . . .
 madmax   3540.04 % . . . . 
. . . .
 metro2033redux  4410 .0.03 % . . . 
. .   -0.02 % .
 nexuiz80 . . . . . 
. . . .
 ruiner   685   -0.10 %   -0.06 % . . . 
.0.04 %0.04 % .
 sauerbraten7 . . . . . 
. . . .
 serioussam2017   7360.03 %   -0.07 %   -0.47 % . . 
. .0.06 % .
 soma 436 . . . . . 
. . . .
 specops 1814 . . . . . 
. . . .
 stellaris434 . . . . . 
. . . .
 supertuxkart   4 . . . . . 
. . . .
 talos762 . . . . . 
. . . .
 tesseract430 . . . . . 
. . . .
 tombraider  1012   -0.03 % . . . . 
.0.02 % . .
 total_war_shogun_2   176 . . . . . 
. . . .
 total_war_warhammer  218 . . . . . 
.0.71 % . .
 ubershaders   54 . . . . . 
.0.57 % . .
 ug_gettysburg149 . . . . . 
. . . .
 unigine_heaven   226 . . . . . 
. . . .
 unigine_superposition733 . . . . . 
. . . .
 unigine_valley   288 . . . . . 
. . . .
 unity 72 . . . . . 
. . . .
 w40kdawn2421 . . . . . 
.   -0.03 % . .
 w40kdawn3164 . . . . . 
. . . .
 warsow   176 . . . . . 
. . . .
 warzone21004 . . . . . 
. . . .
 witcher2 928   -0.02 %0.02 % . . . 
. . . .
 x3_albion641 . . . . . 
. . . .
 xblades  208 . . . . . 
.0.02 % . .
 xcom1020 . . . . . 
. . . .
 xcom2   1439 . . . . . 
. . . .
 yofrankie 82 . . . . . 
. . . .
 
--
 All affected 516   -0.52 %0.41 %   -0.58 % .7.41 % 
   6.67 %0.26 %   -0.89 % .
 
--
 Total  52662 .0.01 %   -0.19 % .1.34 % 
   1.09 %0.01 % . .

Am 29.04.2018 um 11:34 schrieb Gert Wollny:
> Am Sonntag, den 29.04.2018, 10:43 +0200 schrieb Benedikt Schemmer:
>> Hi Gert,
>>
>> couldn't resist at least to try what would happen if I enable
>> register merge for radeonsi:
>>
>>  PERCENTAGE DELTASShaders SGPRs VGPRs SpillSGPR
>> SpillVGPR  PrivVGPR   Scratch  CodeSize  MaxWaves

Re: [Mesa-dev] [PATCH v3 00/13] TGSI: improved live range tracking, also including arrays

2018-04-29 Thread Benedikt Schemmer


Am 29.04.2018 um 21:44 schrieb Gert Wollny:
> Hello Benedict,
> 
> thanks for all the testing!

thanks for all the developing ;)

> 
> On 29.04.2018 12:12, Benedikt Schemmer wrote:
>>> Which are the names of these test? I'd like to check this on r600,
>>> because here I didn't see any regressions last time I checked. 
>>>
>> might of course be different on r600 (is bindless available?),
>> also shader-db is more sensitive to problems than piglit
>>
>> 1. tests/spec/arb_bindless_texture/compiler/images/arrays-of-struct.frag
>> 2. tests/spec/arb_bindless_texture/compiler/samplers/arrays-of-struct.frag
> Indeed, bindless testures are not available on r600, so it is quite
> difficult for me to test this. I would guess that parameters related to
> this might be stored in the TGSI declaration that I currently don't check.
> 
> If you have time for it, could you send me a TGSI dump of one of these
> shaders?
> With "ST_DEBUG=tgsi" this should be possible.

I created 'R600_DEBUG=merge' so I can switch without having to recompile.


$ST_DEBUG=tgsi ./run wollny/

ATTENTION: default value of option allow_glsl_extension_directive_midshader 
overridden by environment.
FRAG
DCL TEMP[0..55], ARRAY(1), LOCAL
IMM[0] INT32 {0, 0, 0, 0}
IMM[1] FLT32 {1., 2., 3., 4.}
  0: STORE TEMP[0], IMM[0]., IMM[1], 2D
  1: END

wollny/41d88325fd2a57cb6af40de02dc281ee0683cc40_2.shader_test - LLVM diagnostic 
(remark): :0:0: 12 instructions in function
wollny/41d88325fd2a57cb6af40de02dc281ee0683cc40_2.shader_test - Shader Stats: 
SGPRS: 16 VGPRS: 16 Code Size: 80 LDS: 0 Scratch: 0 Max Waves: 8 Spilled SGPRs: 
0 Spilled VGPRs: 0 PrivMem VGPRs: 0
FRAG
DCL OUT[0], COLOR
DCL TEMP[0..1], LOCAL
DCL TEMP[2..57], ARRAY(1), LOCAL
IMM[0] FLT32 {0., 0., 0., 0.}
  0: MOV TEMP[0].xy, IMM[0].
  1: TEX TEMP[1], TEMP[0], TEMP[2].xyxy, 2D
  2: MOV OUT[0], TEMP[1]
  3: END

wollny/a115868a349cd666b842a0e70f47451b4463903a_2.shader_test - LLVM diagnostic 
(remark): :0:0: 11 instructions in function
wollny/a115868a349cd666b842a0e70f47451b4463903a_2.shader_test - Shader Stats: 
SGPRS: 24 VGPRS: 16 Code Size: 72 LDS: 0 Scratch: 0 Max Waves: 8 Spilled SGPRs: 
0 Spilled VGPRs: 0 PrivMem VGPRs: 0
Thread 0 took 0.13 seconds and compiled 2 shaders (not including SIMD16) with 1 
GL context switches



$R600_DEBUG=merge ST_DEBUG=tgsi ./run wollny/

ATTENTION: default value of option allow_glsl_extension_directive_midshader 
overridden by environment.
run: state_tracker/st_glsl_to_tgsi.cpp:5783: ureg_dst 
dst_register(st_translate*, gl_register_file, unsigned int, unsigned int): 
Assertion `array_id && array_id <= t->num_temp_arrays' failed.

 => CRASHED <= while processing these shaders:

wollny/41d88325fd2a57cb6af40de02dc281ee0683cc40_2.shader_test

>>
>>> For radeonsi my guess would be that the llvm optimizer works better
>>> when the registers are not yet merged, and that would be the reason why
>>> register_merge is disabled. 
>> well at least sometimes it doesn't, low hanging fruit maybe?
> Unfortunately, I can't test on radeonsi

I can, if you dont mind waiting for an answer sometimes.

But maybe even easier: is there an implicit/explicit magic number I can play 
with to see if it changes anything?

ATM it seems like your code improves half the shaders its affecting a lot and 
hurting the other half bad like it hits an invisible wall.
I think one problem could be the relationship between VGPRs and SGPRs used and 
max Wavefronts achieved.

This is somewhat similar to NIR although that changes things all over the place.

> 
> Best,
> Gert
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCHSET] Allow capture of shaders that would otherwise be overwritten

2018-05-10 Thread Benedikt Schemmer
When capturing shaders with MESA_SHADER_CAPTURE_PATH, shaders can and will
be overwritten if they have the same program number.

This happens when you're capturing shaders from games especially after
changing graphics settings and piglit shaders (I use that as a convenient way
for testing radeonsi NIR).

This probably hasn't been noticed before.

The first set of patches changes the way captured shaders are named by adding
the sha as the first part of the name, making shaders uniquely identifiable.

The second set changes the layout in shader-db/si-report to account for the
longer filenames. It could be applied independently, but is kind of required
for the first set.

[PATCH 1/3] mesa/main/shaderapi: Use generate_sha1() unconditionally
[PATCH 2/3] mesa/main/shaderapi: purely non-functional cleanups
 like whitespace errors and cleanups
[PATCH 3/3] Change from programnumber.shader_test naming to
 sha_programnumber.shader_test

[PATCH 1/2] shader-db/si-report: Change layout to allow longer file names
[PATCH 2/2] shader-db/si-report: Show biggest improvements also

P.S. I don't have commit rights so please kindly review and push
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/3] mesa/main/shaderapi: Use sha as part of captured shaders name

2018-05-10 Thread Benedikt Schemmer
It is inconvenient to capture shaders by program name alone because this does
not allow to capture shaders that get overwritten by shaders with the same
program name (ie games when you change settings or piglit).

---
 src/mesa/main/shaderapi.c | 47 ---
 1 file changed, 32 insertions(+), 15 deletions(-)

diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
index 1d0ca5374b..65cf0a67a2 100644
--- a/src/mesa/main/shaderapi.c
+++ b/src/mesa/main/shaderapi.c
@@ -1343,29 +1343,46 @@ link_program(struct gl_context *ctx, struct 
gl_shader_program *shProg,
/* Capture .shader_test files. */
const char *capture_path = _mesa_get_shader_capture_path();
if (shProg->Name != 0 && shProg->Name != ~0 && capture_path != NULL) {
+
   FILE *file;
-  char *filename = ralloc_asprintf(NULL, "%s/%u.shader_test",
-   capture_path, shProg->Name);
+  char *filename = NULL;
+  char *fsource = NULL;
+  char *ftemp = NULL;
+
+  asprintf(&fsource, "[require]\nGLSL%s >= %u.%02u\n",
+   shProg->IsES ? " ES" : "",
+   shProg->data->Version / 100, shProg->data->Version % 100);
+
+  if (shProg->SeparateShader) {
+ ftemp = fsource;
+ asprintf(&fsource, "%sGL_ARB_separate_shader_objects\nSSO ENABLED\n",
+  ftemp);
+ free(ftemp);
+  }
+
+  for (unsigned i = 0; i < shProg->NumShaders; i++) {
+  ftemp = fsource;
+  asprintf(&fsource, "%s\n[%s shader]\n%s\n", ftemp,
+   _mesa_shader_stage_to_string(shProg->Shaders[i]->Stage),
+   shProg->Shaders[i]->Source);
+  free(ftemp);
+  }
+
+  char shabuf[64] = {"mylittlebunny"};
+  generate_sha1(fsource, shabuf);
+
+  asprintf(&filename, "%s/%s_%u.shader_test", capture_path,
+   shabuf, shProg->Name);
   file = fopen(filename, "w");
   if (file) {
- fprintf(file, "[require]\nGLSL%s >= %u.%02u\n",
- shProg->IsES ? " ES" : "",
- shProg->data->Version / 100, shProg->data->Version % 100);
- if (shProg->SeparateShader)
-fprintf(file, "GL_ARB_separate_shader_objects\nSSO ENABLED\n");
- fprintf(file, "\n");
-
- for (unsigned i = 0; i < shProg->NumShaders; i++) {
-fprintf(file, "[%s shader]\n%s\n",
-_mesa_shader_stage_to_string(shProg->Shaders[i]->Stage),
-shProg->Shaders[i]->Source);
- }
+ fprintf(file, "%s", fsource);
  fclose(file);
   } else {
  _mesa_warning(ctx, "Failed to open %s", filename);
   }

-  ralloc_free(filename);
+  free(filename);
+  free(fsource);
}

if (shProg->data->LinkStatus == LINKING_FAILURE &&
-- 
2.14.1
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/3] mesa/main/shaderapi: purely non-functional cleanups, like whitespace errors and cleanups

2018-05-10 Thread Benedikt Schemmer
remove a memset too and yes, this is all functionally identical

---
 src/mesa/main/shaderapi.c | 40 
 1 file changed, 20 insertions(+), 20 deletions(-)

diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
index e8acca4490..1d0ca5374b 100644
--- a/src/mesa/main/shaderapi.c
+++ b/src/mesa/main/shaderapi.c
@@ -241,11 +241,10 @@ _mesa_init_shader_state(struct gl_context *ctx)
/* Device drivers may override these to control what kind of instructions
 * are generated by the GLSL compiler.
 */
-   struct gl_shader_compiler_options options;
+   struct gl_shader_compiler_options options = {};
gl_shader_stage sh;
int i;

-   memset(&options, 0, sizeof(options));
options.MaxUnrollIterations = 32;
options.MaxIfDepth = UINT_MAX;

@@ -254,7 +253,7 @@ _mesa_init_shader_state(struct gl_context *ctx)

ctx->Shader.Flags = _mesa_get_shader_flags();

-   if (ctx->Shader.Flags != 0)
+   if (ctx->Shader.Flags)
   ctx->Const.GenerateTemporaryNames = true;

/* Extended for ARB_separate_shader_objects */
@@ -771,7 +770,8 @@ get_programiv(struct gl_context *ctx, GLuint program, 
GLenum pname,
   GLint *params)
 {
struct gl_shader_program *shProg
-  = _mesa_lookup_shader_program_err(ctx, program, 
"glGetProgramiv(program)");
+  = _mesa_lookup_shader_program_err(ctx, program,
+"glGetProgramiv(program)");

/* Is transform feedback available in this context?
 */
@@ -953,7 +953,7 @@ get_programiv(struct gl_context *ctx, GLuint program, 
GLenum pname,
   *params = shProg->BinaryRetreivableHint;
   return;
case GL_PROGRAM_BINARY_LENGTH:
-  if (ctx->Const.NumProgramBinaryFormats == 0) {
+  if (!ctx->Const.NumProgramBinaryFormats) {
  *params = 0;
   } else {
  _mesa_get_program_binary_length(ctx, shProg, params);
@@ -974,7 +974,7 @@ get_programiv(struct gl_context *ctx, GLuint program, 
GLenum pname,
  "linked)");
  return;
   }
-  if (shProg->_LinkedShaders[MESA_SHADER_COMPUTE] == NULL) {
+  if (!shProg->_LinkedShaders[MESA_SHADER_COMPUTE]) {
  _mesa_error(ctx, GL_INVALID_OPERATION, "glGetProgramiv(no compute "
  "shaders)");
  return;
@@ -1234,7 +1234,7 @@ _mesa_compile_shader(struct gl_context *ctx, struct 
gl_shader *sh)
} else {
   if (ctx->_Shader->Flags & GLSL_DUMP) {
  _mesa_log("GLSL source for %s shader %d:\n",
- _mesa_shader_stage_to_string(sh->Stage), sh->Name);
+   _mesa_shader_stage_to_string(sh->Stage), sh->Name);
  _mesa_log("%s\n", sh->Source);
   }

@@ -1381,13 +1381,13 @@ link_program(struct gl_context *ctx, struct 
gl_shader_program *shProg,
   GLuint i;

   printf("Link %u shaders in program %u: %s\n",
-   shProg->NumShaders, shProg->Name,
-   shProg->data->LinkStatus ? "Success" : "Failed");
+ shProg->NumShaders, shProg->Name,
+ shProg->data->LinkStatus ? "Success" : "Failed");

   for (i = 0; i < shProg->NumShaders; i++) {
  printf(" shader %u, stage %u\n",
-  shProg->Shaders[i]->Name,
-  shProg->Shaders[i]->Stage);
+shProg->Shaders[i]->Name,
+shProg->Shaders[i]->Stage);
   }
}
 }
@@ -1460,7 +1460,7 @@ void
 _mesa_active_program(struct gl_context *ctx, struct gl_shader_program *shProg,
 const char *caller)
 {
-   if ((shProg != NULL) && !shProg->data->LinkStatus) {
+   if ((shProg) && !shProg->data->LinkStatus) {
   _mesa_error(ctx, GL_INVALID_OPERATION,
  "%s(program %u not linked)", caller, shProg->Name);
   return;
@@ -1794,7 +1794,7 @@ void GLAPIENTRY
 _mesa_GetObjectParameterfvARB(GLhandleARB object, GLenum pname,
   GLfloat *params)
 {
-   GLint iparams[1] = {0};  /* XXX is one element enough? */
+   GLint iparams[1] = {};  /* XXX is one element enough? */
_mesa_GetObjectParameterivARB(object, pname, iparams);
params[0] = (GLfloat) iparams[0];
 }
@@ -1912,7 +1912,7 @@ shader_source(struct gl_context *ctx, GLuint shaderObj, 
GLsizei count,
   if (!sh)
  return;

-  if (string == NULL) {
+  if (!string) {
  _mesa_error(ctx, GL_INVALID_VALUE, "glShaderSourceARB");
  return;
   }
@@ -1925,7 +1925,7 @@ shader_source(struct gl_context *ctx, GLuint shaderObj, 
GLsizei count,
 * last element will be set to the total length of the source code.
 */
offsets = malloc(count * sizeof(GLint));
-   if (offsets == NULL) {
+   if (!offsets) {
   _mesa_error(ctx, GL_OUT_OF_MEMORY, "glShaderSourceARB");
   return;
}
@@ -1952,7 +1952,7 @@ shader_source(struct gl_context *ctx, GLuint shaderObj, 
GLsizei count,
 */
totalLength = offsets[count - 1] + 2;
source = malloc(totalLength * sizeof(G

[Mesa-dev] [PATCH 1/3] mesa/main/shaderapi: Use generate_sha1() unconditionally

2018-05-10 Thread Benedikt Schemmer
Move shader-cache code from back to front and make generate_sha1() usable
unconditionally to avoid code duplication in the following patch

---
 src/mesa/main/shaderapi.c | 228 +++---
 1 file changed, 116 insertions(+), 112 deletions(-)

diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
index 44b18af492..e8acca4490 100644
--- a/src/mesa/main/shaderapi.c
+++ b/src/mesa/main/shaderapi.c
@@ -64,6 +64,122 @@
 #include "util/mesa-sha1.h"
 #include "util/crc32.h"

+
+/**
+ * Generate a SHA-1 hash value string for given source string.
+ */
+static void
+generate_sha1(const char *source, char sha_str[64])
+{
+   unsigned char sha[20];
+   _mesa_sha1_compute(source, strlen(source), sha);
+   _mesa_sha1_format(sha_str, sha);
+}
+
+
+#ifdef ENABLE_SHADER_CACHE
+/**
+ * Construct a full path for shader replacement functionality using
+ * following format:
+ *
+ * /_.glsl
+ */
+static char *
+construct_name(const gl_shader_stage stage, const char *source,
+   const char *path)
+{
+   char sha[64];
+   static const char *types[] = {
+  "VS", "TC", "TE", "GS", "FS", "CS",
+   };
+
+   generate_sha1(source, sha);
+   return ralloc_asprintf(NULL, "%s/%s_%s.glsl", path, types[stage], sha);
+}
+
+/**
+ * Write given shader source to a file in MESA_SHADER_DUMP_PATH.
+ */
+static void
+dump_shader(const gl_shader_stage stage, const char *source)
+{
+   static bool path_exists = true;
+   char *dump_path;
+   FILE *f;
+
+   if (!path_exists)
+  return;
+
+   dump_path = getenv("MESA_SHADER_DUMP_PATH");
+   if (!dump_path) {
+  path_exists = false;
+  return;
+   }
+
+   char *name = construct_name(stage, source, dump_path);
+
+   f = fopen(name, "w");
+   if (f) {
+  fputs(source, f);
+  fclose(f);
+   } else {
+  GET_CURRENT_CONTEXT(ctx);
+  _mesa_warning(ctx, "could not open %s for dumping shader (%s)", name,
+strerror(errno));
+   }
+   ralloc_free(name);
+}
+
+/**
+ * Read shader source code from a file.
+ * Useful for debugging to override an app's shader.
+ */
+static GLcharARB *
+read_shader(const gl_shader_stage stage, const char *source)
+{
+   char *read_path;
+   static bool path_exists = true;
+   int len, shader_size = 0;
+   GLcharARB *buffer;
+   FILE *f;
+
+   if (!path_exists)
+  return NULL;
+
+   read_path = getenv("MESA_SHADER_READ_PATH");
+   if (!read_path) {
+  path_exists = false;
+  return NULL;
+   }
+
+   char *name = construct_name(stage, source, read_path);
+   f = fopen(name, "r");
+   ralloc_free(name);
+   if (!f)
+  return NULL;
+
+   /* allocate enough room for the entire shader */
+   fseek(f, 0, SEEK_END);
+   shader_size = ftell(f);
+   rewind(f);
+   assert(shader_size);
+
+   /* add one for terminating zero */
+   shader_size++;
+
+   buffer = malloc(shader_size);
+   assert(buffer);
+
+   len = fread(buffer, 1, shader_size, f);
+   buffer[len] = 0;
+
+   fclose(f);
+
+   return buffer;
+}
+
+#endif /* ENABLE_SHADER_CACHE */
+
 /**
  * Return mask of GLSL_x flags by examining the MESA_GLSL env var.
  */
@@ -1775,119 +1891,7 @@ _mesa_LinkProgram(GLuint programObj)
link_program_error(ctx, shProg);
 }

-#ifdef ENABLE_SHADER_CACHE
-/**
- * Generate a SHA-1 hash value string for given source string.
- */
-static void
-generate_sha1(const char *source, char sha_str[64])
-{
-   unsigned char sha[20];
-   _mesa_sha1_compute(source, strlen(source), sha);
-   _mesa_sha1_format(sha_str, sha);
-}
-
-/**
- * Construct a full path for shader replacement functionality using
- * following format:
- *
- * /_.glsl
- */
-static char *
-construct_name(const gl_shader_stage stage, const char *source,
-   const char *path)
-{
-   char sha[64];
-   static const char *types[] = {
-  "VS", "TC", "TE", "GS", "FS", "CS",
-   };
-
-   generate_sha1(source, sha);
-   return ralloc_asprintf(NULL, "%s/%s_%s.glsl", path, types[stage], sha);
-}
-
-/**
- * Write given shader source to a file in MESA_SHADER_DUMP_PATH.
- */
-static void
-dump_shader(const gl_shader_stage stage, const char *source)
-{
-   static bool path_exists = true;
-   char *dump_path;
-   FILE *f;
-
-   if (!path_exists)
-  return;
-
-   dump_path = getenv("MESA_SHADER_DUMP_PATH");
-   if (!dump_path) {
-  path_exists = false;
-  return;
-   }
-
-   char *name = construct_name(stage, source, dump_path);
-
-   f = fopen(name, "w");
-   if (f) {
-  fputs(source, f);
-  fclose(f);
-   } else {
-  GET_CURRENT_CONTEXT(ctx);
-  _mesa_warning(ctx, "could not open %s for dumping shader (%s)", name,
-strerror(errno));
-   }
-   ralloc_free(name);
-}
-
-/**
- * Read shader source code from a file.
- * Useful for debugging to override an app's shader.
- */
-static GLcharARB *
-read_shader(const gl_shader_stage stage, const char *source)
-{
-   char *read_path;
-   static bool path_exists = true;
-   int len, shader_size = 0;
-   GLcharARB *buffer;
-   FILE *f;
-

[Mesa-dev] [PATCH 2/2] shader-db/si-report: Show biggest improvements also

2018-05-10 Thread Benedikt Schemmer
Sometimes it is nice (and useful) to not just see if you've messed up
but also if you've made an improvement.

---
 si-report.py | 24 
 1 file changed, 24 insertions(+)

diff --git a/si-report.py b/si-report.py
index fba652c..d0e0686 100755
--- a/si-report.py
+++ b/si-report.py
@@ -686,6 +686,30 @@ def print_tables(before_all_results, after_all_results):
 break
 if num > 0:
 print
+
+# biggest improvements
+metrics = si_stats().metrics
+for i in range(len(metrics)):
+field = metrics[i][0]
+num = 0
+more_is_better = metrics[i][0] == 'maxwaves'
+
+if more_is_better:
+sort_key = lambda v: -v[1].diff.__dict__[field]
+else:
+sort_key = lambda v: v[1].diff.__dict__[field]
+
+for name, stats in sorted(shaders.items(), key = sort_key):
+if more_is_better:
+if stats.diff.__dict__[field] <= 0:
+continue
+else:
+if stats.diff.__dict__[field] >= 0:
+continue
+
+if num == 0:
+print_yellow(" BIGGEST IMPROVEMENTS - 
{:64}".format(metrics[i][1]))
+print_yellow(" Before After Delta Percentage")
 stats.print_regression(name, field)
 num += 1
 if num == num_listed:
-- 
2.14.1
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] Subject: [PATCH 1/2] shader-db/si-report: Change layout to allow longer file names

2018-05-10 Thread Benedikt Schemmer
When using sha as part of the captured shader name that name can get so
long that part of it is cut off and not visible anymore (although this
also happens when using long directory names etc.) So this is a general
improvement.

---
 si-report.py | 72 ++--
 1 file changed, 46 insertions(+), 26 deletions(-)

diff --git a/si-report.py b/si-report.py
index 0756081..fba652c 100755
--- a/si-report.py
+++ b/si-report.py
@@ -482,39 +482,53 @@ class grouped_stats:
 def print_vgpr_spilling_app(self, name):
 if (self.after.spilled_vgprs > 0 or
 self.after.privmem_vgprs > 0):
-print " {:22}{:6}{:10}{:10}{:10}".format(
-name,
+print " {:6}{:6} {:6}   {:6}   {:22}".format(
 self.num_shaders,
 self.after.spilled_vgprs,
 self.after.privmem_vgprs,
-self.after.scratch_size)
+self.after.scratch_size,
+name)

 def print_one_shader_vgpr_spill(self, name):
 if (self.after.spilled_vgprs > 0 or
 self.after.privmem_vgprs > 0):
-print " {:65}{:10}{:10}{:10}{:10}".format(
-name,
+print " {:6}{:6}{:6}   {:6}{:22}".format(
 self.after.vgprs,
 self.after.spilled_vgprs,
 self.after.privmem_vgprs,
-self.after.scratch_size)
+self.after.scratch_size,
+name)

 def print_sgpr_spilling_app(self, name):
 if self.after.spilled_sgprs > 0:
-print " {:22}{:6}{:10}{:>9.1f}".format(
-name,
+print " {:6} {:6} {:>5.1f}  {:22}".format(
 self.num_shaders,
 self.after.spilled_sgprs,
-float(self.after.spilled_sgprs) / float(self.num_shaders))
+float(self.after.spilled_sgprs) / float(self.num_shaders),
+name)

 def print_one_shader_sgpr_spill(self, name):
 if self.after.spilled_sgprs > 0:
-print " {:65}{:10}{:10}".format(
-name,
+print " {:6}{:6}   {:90}".format(
 self.after.sgprs,
-self.after.spilled_sgprs)
+self.after.spilled_sgprs,
+name)

 def print_percentages(self, name):
+print " {:6}{:6}{}{}{}{}{}{}{}{}{}".format(
+name,
+self.num_shaders,
+format_percent_change(self.before.sgprs, self.after.sgprs),
+format_percent_change(self.before.vgprs, self.after.vgprs),
+format_percent_change(self.before.spilled_sgprs, 
self.after.spilled_sgprs),
+format_percent_change(self.before.spilled_vgprs, 
self.after.spilled_vgprs),
+format_percent_change(self.before.privmem_vgprs, 
self.after.privmem_vgprs),
+format_percent_change(self.before.scratch_size, 
self.after.scratch_size),
+format_percent_change(self.before.code_size, self.after.code_size),
+format_percent_change(self.before.maxwaves, self.after.maxwaves, 
more_is_better = True),
+format_percent_change(self.before.waitstates, 
self.after.waitstates))
+
+def print_percentages_end(self, name):
 print " {:22}{:6}{}{}{}{}{}{}{}{}{}".format(
 name,
 self.num_shaders,
@@ -530,14 +544,14 @@ class grouped_stats:

 def print_regression(self, name, field):
 more_is_better = field == "maxwaves"
-print " {:65}{:10}{:10}{}{}".format(
-name,
+print " {:6}{:6}{}{}   {:90}".format(
 self.before.__dict__[field],
 self.after.__dict__[field],
 format_table_cell(self.after.__dict__[field] - 
self.before.__dict__[field],
   more_is_better = more_is_better),
 format_percent_change(self.before.__dict__[field], 
self.after.__dict__[field],
-  more_is_better = more_is_better))
+  more_is_better = more_is_better),
+name)

 """
 Return "filename [index]", because files can contain multiple shaders.
@@ -608,8 +622,8 @@ def print_tables(before_all_results, after_all_results):
 sort_key = lambda v: -v[1].after.scratch_size
 for name, stats in sorted(shaders.items(), key = sort_key):
 if num == 0:
-print_yellow(" WORST VGPR SPILLS (not deltas)" + (" " * 40) +
- "VGPRs SpillVGPR  PrivVGPR ScratchSize")
+print_yellow("WORST VGPR SPILLS (not deltas)" + (" " * 64))
+print_yellow("  VGPRs Spills Private Scratch")
 stats.print_one_shader_vgpr_spill(name)
 num += 1
 if num == num_listed:
@@ -618,7 +632,7 @@ def print_tables(before_all_results, after_all_results):
 print

 # VGPR spilling apps
-print_yellow(" VGPR SPILLING APPS   Shaders S

Re: [Mesa-dev] [PATCH 1/3] mesa/main/shaderapi: Use generate_sha1() unconditionally

2018-05-17 Thread Benedikt Schemmer
Thanks for reviewing!

Am 17.05.2018 um 08:42 schrieb Tapani Pälli:
> 
> 
> On 05/10/2018 12:05 PM, Benedikt Schemmer wrote:
>> Move shader-cache code from back to front and make generate_sha1() usable
>> unconditionally to avoid code duplication in the following patch
>>
>> ---
>>   src/mesa/main/shaderapi.c | 228 
>> +++---
>>   1 file changed, 116 insertions(+), 112 deletions(-)
>>
>> diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
>> index 44b18af492..e8acca4490 100644
>> --- a/src/mesa/main/shaderapi.c
>> +++ b/src/mesa/main/shaderapi.c
>> @@ -64,6 +64,122 @@
>>   #include "util/mesa-sha1.h"
>>   #include "util/crc32.h"
>>
>> +
>> +/**
>> + * Generate a SHA-1 hash value string for given source string.
>> + */
>> +static void
>> +generate_sha1(const char *source, char sha_str[64])
>> +{
>> +   unsigned char sha[20];
>> +   _mesa_sha1_compute(source, strlen(source), sha);
>> +   _mesa_sha1_format(sha_str, sha);
>> +}
> 
> There is one potential problem here. The 'ENABLE_SHADER_CACHE' guard for 
> generate_sha1 and others was placed there because the imported sha1 code 
> broke windows build, I'm wondering if this is still
> the case? If so, then generate_sha1 should be inside ENABLE_SHADER_CACHE 
> guard.
> 

I did a quick
gedit $(grep -Rlsi "_mesa_sha1_compute" | grep -E "\.c|\.h")

and it seems radv and anv use _mesa_sha1_compute (and _mesa_sha1_format) 
without a guard
best example from Intel seems to be brw_disk_cache.c which uses it alot outside 
of the ENABLE_SHADER_CACHE guard

so probably safe?

>> +
>> +
>> +#ifdef ENABLE_SHADER_CACHE
>> +/**
>> + * Construct a full path for shader replacement functionality using
>> + * following format:
>> + *
>> + * /_.glsl
>> + */
>> +static char *
>> +construct_name(const gl_shader_stage stage, const char *source,
>> +   const char *path)
>> +{
>> +   char sha[64];
>> +   static const char *types[] = {
>> +  "VS", "TC", "TE", "GS", "FS", "CS",
>> +   };
>> +
>> +   generate_sha1(source, sha);
>> +   return ralloc_asprintf(NULL, "%s/%s_%s.glsl", path, types[stage], sha);
>> +}
>> +
>> +/**
>> + * Write given shader source to a file in MESA_SHADER_DUMP_PATH.
>> + */
>> +static void
>> +dump_shader(const gl_shader_stage stage, const char *source)
>> +{
>> +   static bool path_exists = true;
>> +   char *dump_path;
>> +   FILE *f;
>> +
>> +   if (!path_exists)
>> +  return;
>> +
>> +   dump_path = getenv("MESA_SHADER_DUMP_PATH");
>> +   if (!dump_path) {
>> +  path_exists = false;
>> +  return;
>> +   }
>> +
>> +   char *name = construct_name(stage, source, dump_path);
>> +
>> +   f = fopen(name, "w");
>> +   if (f) {
>> +  fputs(source, f);
>> +  fclose(f);
>> +   } else {
>> +  GET_CURRENT_CONTEXT(ctx);
>> +  _mesa_warning(ctx, "could not open %s for dumping shader (%s)", name,
>> +    strerror(errno));
>> +   }
>> +   ralloc_free(name);
>> +}
>> +
>> +/**
>> + * Read shader source code from a file.
>> + * Useful for debugging to override an app's shader.
>> + */
>> +static GLcharARB *
>> +read_shader(const gl_shader_stage stage, const char *source)
>> +{
>> +   char *read_path;
>> +   static bool path_exists = true;
>> +   int len, shader_size = 0;
>> +   GLcharARB *buffer;
>> +   FILE *f;
>> +
>> +   if (!path_exists)
>> +  return NULL;
>> +
>> +   read_path = getenv("MESA_SHADER_READ_PATH");
>> +   if (!read_path) {
>> +  path_exists = false;
>> +  return NULL;
>> +   }
>> +
>> +   char *name = construct_name(stage, source, read_path);
>> +   f = fopen(name, "r");
>> +   ralloc_free(name);
>> +   if (!f)
>> +  return NULL;
>> +
>> +   /* allocate enough room for the entire shader */
>> +   fseek(f, 0, SEEK_END);
>> +   shader_size = ftell(f);
>> +   rewind(f);
>> +   assert(shader_size);
>> +
>> +   /* add one for terminating zero */
>> +   shader_size++;
>> +
>> +   buffer = malloc(shader_size);
>> +   assert(buffer);
>> +
>> +   len = fread(buffer, 1, shader_size, f);
>> +   buffer[le

Re: [Mesa-dev] [PATCH 3/3] mesa/main/shaderapi: purely non-functional cleanups, like whitespace errors and cleanups

2018-05-17 Thread Benedikt Schemmer


Am 17.05.2018 um 08:59 schrieb Tapani Pälli:
> 
> 
> On 05/10/2018 12:05 PM, Benedikt Schemmer wrote:
>> remove a memset too and yes, this is all functionally identical
>>
>> ---
>>   src/mesa/main/shaderapi.c | 40 
>>   1 file changed, 20 insertions(+), 20 deletions(-)
>>
>> diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
>> index e8acca4490..1d0ca5374b 100644
>> --- a/src/mesa/main/shaderapi.c
>> +++ b/src/mesa/main/shaderapi.c
>> @@ -241,11 +241,10 @@ _mesa_init_shader_state(struct gl_context *ctx)
>>  /* Device drivers may override these to control what kind of 
>> instructions
>>   * are generated by the GLSL compiler.
>>   */
>> -   struct gl_shader_compiler_options options;
>> +   struct gl_shader_compiler_options options = {};
>>  gl_shader_stage sh;
>>  int i;
>>
>> -   memset(&options, 0, sizeof(options));
>>  options.MaxUnrollIterations = 32;
>>  options.MaxIfDepth = UINT_MAX;
>>
>> @@ -254,7 +253,7 @@ _mesa_init_shader_state(struct gl_context *ctx)
>>
>>  ctx->Shader.Flags = _mesa_get_shader_flags();
>>
>> -   if (ctx->Shader.Flags != 0)
>> +   if (ctx->Shader.Flags)
>>     ctx->Const.GenerateTemporaryNames = true;
>>
>>  /* Extended for ARB_separate_shader_objects */
>> @@ -771,7 +770,8 @@ get_programiv(struct gl_context *ctx, GLuint program, 
>> GLenum pname,
>>     GLint *params)
>>   {
>>  struct gl_shader_program *shProg
>> -  = _mesa_lookup_shader_program_err(ctx, program, 
>> "glGetProgramiv(program)");
>> +  = _mesa_lookup_shader_program_err(ctx, program,
>> +    "glGetProgramiv(program)");
>>
>>  /* Is transform feedback available in this context?
>>   */
>> @@ -953,7 +953,7 @@ get_programiv(struct gl_context *ctx, GLuint program, 
>> GLenum pname,
>>     *params = shProg->BinaryRetreivableHint;
>>     return;
>>  case GL_PROGRAM_BINARY_LENGTH:
>> -  if (ctx->Const.NumProgramBinaryFormats == 0) {
>> +  if (!ctx->Const.NumProgramBinaryFormats) {
> 
> Maybe it's just me having some OCD but with these 'Num' constants I find it 
> much easier to read '== 0' than '!' (also below with NumProgramBinaryFormats 
> and NumSubroutineUniformRemapTable).
> 
> I don't feel strong about this though so no need to change this.

I dont have strong feelings about this either, I use a script to replace these 
things.
In my opinion it just helps to see whether these comparisons have meaning. >=1 
or ==0 don't really.
If they do I just use a define to make it clear. Otherwise I find the ! easier 
to read and understand.

> 
>>    *params = 0;
>>     } else {
>>    _mesa_get_program_binary_length(ctx, shProg, params);
>> @@ -974,7 +974,7 @@ get_programiv(struct gl_context *ctx, GLuint program, 
>> GLenum pname,
>>    "linked)");
>>    return;
>>     }
>> -  if (shProg->_LinkedShaders[MESA_SHADER_COMPUTE] == NULL) {
>> +  if (!shProg->_LinkedShaders[MESA_SHADER_COMPUTE]) {
>>    _mesa_error(ctx, GL_INVALID_OPERATION, "glGetProgramiv(no compute 
>> "
>>    "shaders)");
>>    return;
>> @@ -1234,7 +1234,7 @@ _mesa_compile_shader(struct gl_context *ctx, struct 
>> gl_shader *sh)
>>  } else {
>>     if (ctx->_Shader->Flags & GLSL_DUMP) {
>>    _mesa_log("GLSL source for %s shader %d:\n",
>> - _mesa_shader_stage_to_string(sh->Stage), sh->Name);
>> +   _mesa_shader_stage_to_string(sh->Stage), sh->Name);
>>    _mesa_log("%s\n", sh->Source);
>>     }
>>
>> @@ -1381,13 +1381,13 @@ link_program(struct gl_context *ctx, struct 
>> gl_shader_program *shProg,
>>     GLuint i;
>>
>>     printf("Link %u shaders in program %u: %s\n",
>> -   shProg->NumShaders, shProg->Name,
>> -   shProg->data->LinkStatus ? "Success" : "Failed");
>> + shProg->NumShaders, shProg->Name,
>> + shProg->data->LinkStatus ? "Success" : "Failed");
>>
>>     for (i = 0; i < shProg->NumShaders; i++) {
>>    pri

Re: [Mesa-dev] [PATCH 2/3] mesa/main/shaderapi: Use sha as part of captured shaders name

2018-05-17 Thread Benedikt Schemmer


Am 17.05.2018 um 09:11 schrieb Tapani Pälli:
> some nitpicking below .. I'll do some testing to see that things work as 
> before but overall these changes look good to me. Would be nice to hear 
> comments from active shader-db users.
> 
> On 05/10/2018 12:05 PM, Benedikt Schemmer wrote:
>> It is inconvenient to capture shaders by program name alone because this does
>> not allow to capture shaders that get overwritten by shaders with the same
>> program name (ie games when you change settings or piglit).
>>
>> ---
>>   src/mesa/main/shaderapi.c | 47 
>> ---
>>   1 file changed, 32 insertions(+), 15 deletions(-)
>>
>> diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
>> index 1d0ca5374b..65cf0a67a2 100644
>> --- a/src/mesa/main/shaderapi.c
>> +++ b/src/mesa/main/shaderapi.c
>> @@ -1343,29 +1343,46 @@ link_program(struct gl_context *ctx, struct 
>> gl_shader_program *shProg,
>>  /* Capture .shader_test files. */
>>  const char *capture_path = _mesa_get_shader_capture_path();
>>  if (shProg->Name != 0 && shProg->Name != ~0 && capture_path != NULL) {
>> +
> 
> extra space
> 
>>     FILE *file;
>> -  char *filename = ralloc_asprintf(NULL, "%s/%u.shader_test",
>> -   capture_path, shProg->Name);
>> +  char *filename = NULL;
>> +  char *fsource = NULL;
>> +  char *ftemp = NULL;
>> +
>> +  asprintf(&fsource, "[require]\nGLSL%s >= %u.%02u\n",
>> +   shProg->IsES ? " ES" : "",
>> +   shProg->data->Version / 100, shProg->data->Version % 100);
>> +
>> +  if (shProg->SeparateShader) {
>> + ftemp = fsource;
>> + asprintf(&fsource, "%sGL_ARB_separate_shader_objects\nSSO 
>> ENABLED\n",
>> +  ftemp);
>> + free(ftemp);
>> +  }
>> +
>> +  for (unsigned i = 0; i < shProg->NumShaders; i++) {
>> +  ftemp = fsource;
>> +  asprintf(&fsource, "%s\n[%s shader]\n%s\n", ftemp,
>> +   _mesa_shader_stage_to_string(shProg->Shaders[i]->Stage),
>> +   shProg->Shaders[i]->Source);
>> +  free(ftemp);
>> +  }
>> +
>> +  char shabuf[64] = {"mylittlebunny"};
> 
> please remove the initialization, it is unnecessary

That was left from before I used generate_sha1 which sometimes failed for some 
reason.
I have one little thing though:
Most of the time this is defined as shabuf[41] (20 bytes sha plus trailing \0 )
In this file it is 64 for some reason.
Should I change that?

> 
>> +  generate_sha1(fsource, shabuf);
>> +
>> +  asprintf(&filename, "%s/%s_%u.shader_test", capture_path,
>> +   shabuf, shProg->Name);
>>     file = fopen(filename, "w");
>>     if (file) {
>> - fprintf(file, "[require]\nGLSL%s >= %u.%02u\n",
>> - shProg->IsES ? " ES" : "",
>> - shProg->data->Version / 100, shProg->data->Version % 100);
>> - if (shProg->SeparateShader)
>> -    fprintf(file, "GL_ARB_separate_shader_objects\nSSO ENABLED\n");
>> - fprintf(file, "\n");
>> -
>> - for (unsigned i = 0; i < shProg->NumShaders; i++) {
>> -    fprintf(file, "[%s shader]\n%s\n",
>> -    _mesa_shader_stage_to_string(shProg->Shaders[i]->Stage),
>> -    shProg->Shaders[i]->Source);
>> - }
>> + fprintf(file, "%s", fsource);
>>    fclose(file);
>>     } else {
>>    _mesa_warning(ctx, "Failed to open %s", filename);
>>     }
>>
>> -  ralloc_free(filename);
>> +  free(filename);
>> +  free(fsource);
>>  }
>>
>>  if (shProg->data->LinkStatus == LINKING_FAILURE &&
>>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/3] mesa/main/shaderapi: Use generate_sha1() unconditionally

2018-05-17 Thread Benedikt Schemmer
Ok I found the commit.
But it says: "Until we have a proper fix."
So this would be a good time I guess ;)


---

author  Emil Velikov2017-01-18 19:40:31 
+
committer   Emil Velikov  2017-01-18 20:09:01 
+
commit  9f8dc3bf03ec825bae7041858dda6ca2e9a34363 (patch)
treeff9672995474d3c31f027fea8356cb5733e45388
parent  d1efa09d342bff3e5def2978a0bef748d74f9c82 (diff)
utils: build sha1/disk cache only with Android/Autoconf
Earlier commit imported a SHA1 implementation and relaxed the SHA1 and
disk cache handling, broking the Windows builds.

Restrict things for now until we get to a proper fix.

Fixes: d1efa09d342 "util: import sha1 implementation from OpenBSD"
Signed-off-by: Emil Velikov 


Am 17.05.2018 um 12:28 schrieb Tapani Pälli:
> 
> 
> On 05/17/2018 11:38 AM, Benedikt Schemmer wrote:
>> Thanks for reviewing!
>>
>> Am 17.05.2018 um 08:42 schrieb Tapani Pälli:
>>>
>>>
>>> On 05/10/2018 12:05 PM, Benedikt Schemmer wrote:
>>>> Move shader-cache code from back to front and make generate_sha1() usable
>>>> unconditionally to avoid code duplication in the following patch
>>>>
>>>> ---
>>>>    src/mesa/main/shaderapi.c | 228 
>>>> +++---
>>>>    1 file changed, 116 insertions(+), 112 deletions(-)
>>>>
>>>> diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
>>>> index 44b18af492..e8acca4490 100644
>>>> --- a/src/mesa/main/shaderapi.c
>>>> +++ b/src/mesa/main/shaderapi.c
>>>> @@ -64,6 +64,122 @@
>>>>    #include "util/mesa-sha1.h"
>>>>    #include "util/crc32.h"
>>>>
>>>> +
>>>> +/**
>>>> + * Generate a SHA-1 hash value string for given source string.
>>>> + */
>>>> +static void
>>>> +generate_sha1(const char *source, char sha_str[64])
>>>> +{
>>>> +   unsigned char sha[20];
>>>> +   _mesa_sha1_compute(source, strlen(source), sha);
>>>> +   _mesa_sha1_format(sha_str, sha);
>>>> +}
>>>
>>> There is one potential problem here. The 'ENABLE_SHADER_CACHE' guard for 
>>> generate_sha1 and others was placed there because the imported sha1 code 
>>> broke windows build, I'm wondering if this is still
>>> the case? If so, then generate_sha1 should be inside ENABLE_SHADER_CACHE 
>>> guard.
>>>
>>
>> I did a quick
>> gedit $(grep -Rlsi "_mesa_sha1_compute" | grep -E "\.c|\.h")
>>
>> and it seems radv and anv use _mesa_sha1_compute (and _mesa_sha1_format) 
>> without a guard
>> best example from Intel seems to be brw_disk_cache.c which uses it alot 
>> outside of the ENABLE_SHADER_CACHE guard
>>
>> so probably safe?
> 
> AFAIK none of those things are compiled on windows. IIUC windows builds are 
> about Mesa core and gallium side.
> 
>>>> +
>>>> +
>>>> +#ifdef ENABLE_SHADER_CACHE
>>>> +/**
>>>> + * Construct a full path for shader replacement functionality using
>>>> + * following format:
>>>> + *
>>>> + * /_.glsl
>>>> + */
>>>> +static char *
>>>> +construct_name(const gl_shader_stage stage, const char *source,
>>>> +   const char *path)
>>>> +{
>>>> +   char sha[64];
>>>> +   static const char *types[] = {
>>>> +  "VS", "TC", "TE", "GS", "FS", "CS",
>>>> +   };
>>>> +
>>>> +   generate_sha1(source, sha);
>>>> +   return ralloc_asprintf(NULL, "%s/%s_%s.glsl", path, types[stage], sha);
>>>> +}
>>>> +
>>>> +/**
>>>> + * Write given shader source to a file in MESA_SHADER_DUMP_PATH.
>>>> + */
>>>> +static void
>>>> +dump_shader(const gl_shader_stage stage, const char *source)
>>>> +{
>>>> +   static bool path_exists = true;
>>>> +   char *dump_path;
>>>> +   FILE *f;
>>>> +
>>>> +   if (!path_exists)
>>>> +  return;
>>>> +
>>>> +   dump_path = getenv("MESA_SHADER_DUMP_PATH");
>>>> +   if (!dump_path) {
>>>> +  path_exists = false;
>>>> +  return;
>>>> +   }
>>>> +
>>>> +   char *name = construct_name(stage, source, dump_path);
>>>> +
>>>>

Re: [Mesa-dev] [PATCH 1/3] mesa/main/shaderapi: Use generate_sha1() unconditionally

2018-05-17 Thread Benedikt Schemmer
I looked at the appveyor log for that specific build
https://ci.appveyor.com/project/mesa3d/mesa/build/3186

It seems to only complain about
disk_cache.c
src\util\disk_cache.c(28) : fatal error C1083: Cannot open include file: 
'sys/file.h': No such file or directory
scons: *** [build\windows-x86-debug\util\disk_cache.obj] Error 2
scons: building terminated because of errors.


Found this on
https://stackoverflow.com/questions/2537960/windows-sys-file-h-equivalent

For the benefit of posterity,  is the BSD version of low-level file 
I/O routines.
Depending upon your compiler installation and build environment, you'll 
probably want  instead.
Most of the usual I/O routines are in , even setvbuf() which is pretty 
low-level control.
You'll want  or  if you want I/O routines/settings that 
don't normally exist under
Linux (or the other *NICES).

Am 17.05.2018 um 12:28 schrieb Tapani Pälli:
> 
> 
> On 05/17/2018 11:38 AM, Benedikt Schemmer wrote:
>> Thanks for reviewing!
>>
>> Am 17.05.2018 um 08:42 schrieb Tapani Pälli:
>>>
>>>
>>> On 05/10/2018 12:05 PM, Benedikt Schemmer wrote:
>>>> Move shader-cache code from back to front and make generate_sha1() usable
>>>> unconditionally to avoid code duplication in the following patch
>>>>
>>>> ---
>>>>    src/mesa/main/shaderapi.c | 228 
>>>> +++---
>>>>    1 file changed, 116 insertions(+), 112 deletions(-)
>>>>
>>>> diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
>>>> index 44b18af492..e8acca4490 100644
>>>> --- a/src/mesa/main/shaderapi.c
>>>> +++ b/src/mesa/main/shaderapi.c
>>>> @@ -64,6 +64,122 @@
>>>>    #include "util/mesa-sha1.h"
>>>>    #include "util/crc32.h"
>>>>
>>>> +
>>>> +/**
>>>> + * Generate a SHA-1 hash value string for given source string.
>>>> + */
>>>> +static void
>>>> +generate_sha1(const char *source, char sha_str[64])
>>>> +{
>>>> +   unsigned char sha[20];
>>>> +   _mesa_sha1_compute(source, strlen(source), sha);
>>>> +   _mesa_sha1_format(sha_str, sha);
>>>> +}
>>>
>>> There is one potential problem here. The 'ENABLE_SHADER_CACHE' guard for 
>>> generate_sha1 and others was placed there because the imported sha1 code 
>>> broke windows build, I'm wondering if this is still
>>> the case? If so, then generate_sha1 should be inside ENABLE_SHADER_CACHE 
>>> guard.
>>>
>>
>> I did a quick
>> gedit $(grep -Rlsi "_mesa_sha1_compute" | grep -E "\.c|\.h")
>>
>> and it seems radv and anv use _mesa_sha1_compute (and _mesa_sha1_format) 
>> without a guard
>> best example from Intel seems to be brw_disk_cache.c which uses it alot 
>> outside of the ENABLE_SHADER_CACHE guard
>>
>> so probably safe?
> 
> AFAIK none of those things are compiled on windows. IIUC windows builds are 
> about Mesa core and gallium side.
> 
>>>> +
>>>> +
>>>> +#ifdef ENABLE_SHADER_CACHE
>>>> +/**
>>>> + * Construct a full path for shader replacement functionality using
>>>> + * following format:
>>>> + *
>>>> + * /_.glsl
>>>> + */
>>>> +static char *
>>>> +construct_name(const gl_shader_stage stage, const char *source,
>>>> +   const char *path)
>>>> +{
>>>> +   char sha[64];
>>>> +   static const char *types[] = {
>>>> +  "VS", "TC", "TE", "GS", "FS", "CS",
>>>> +   };
>>>> +
>>>> +   generate_sha1(source, sha);
>>>> +   return ralloc_asprintf(NULL, "%s/%s_%s.glsl", path, types[stage], sha);
>>>> +}
>>>> +
>>>> +/**
>>>> + * Write given shader source to a file in MESA_SHADER_DUMP_PATH.
>>>> + */
>>>> +static void
>>>> +dump_shader(const gl_shader_stage stage, const char *source)
>>>> +{
>>>> +   static bool path_exists = true;
>>>> +   char *dump_path;
>>>> +   FILE *f;
>>>> +
>>>> +   if (!path_exists)
>>>> +  return;
>>>> +
>>>> +   dump_path = getenv("MESA_SHADER_DUMP_PATH");
>>>> +   if (!dump_path) {
>>>> +  path_exists = false;
>>>> +  return;
>>>> +   }

Re: [Mesa-dev] [PATCH 1/3] mesa/main/shaderapi: Use generate_sha1() unconditionally

2018-05-18 Thread Benedikt Schemmer
p;fl);
  if (r == -1 && errno == EACCES)
errno = EAGAIN;

  return r;
}

# else /* !HAVE_STRUCT_FLOCK_L_TYPE */

#  error "This platform lacks flock function, and Gnulib doesn't provide a 
replacement. This is a bug in Gnulib."

# endif /* !HAVE_STRUCT_FLOCK_L_TYPE */

#endif /* !Windows */

Am 18.05.2018 um 07:34 schrieb Tapani Pälli:
> 
> 
> On 05/17/2018 02:00 PM, Benedikt Schemmer wrote:
>> Ok I found the commit.
>> But it says: "Until we have a proper fix."
>> So this would be a good time I guess ;)
> 

> 
>>
>> ---
>>
>> author    Emil Velikov     2017-01-18 19:40:31 
>> +
>> committer    Emil Velikov     2017-01-18 20:09:01 
>> +
>> commit    9f8dc3bf03ec825bae7041858dda6ca2e9a34363 (patch)
>> tree    ff9672995474d3c31f027fea8356cb5733e45388
>> parent    d1efa09d342bff3e5def2978a0bef748d74f9c82 (diff)
>> utils: build sha1/disk cache only with Android/Autoconf
>> Earlier commit imported a SHA1 implementation and relaxed the SHA1 and
>> disk cache handling, broking the Windows builds.
>>
>> Restrict things for now until we get to a proper fix.
>>
>> Fixes: d1efa09d342 "util: import sha1 implementation from OpenBSD"
>> Signed-off-by: Emil Velikov 
>>
>>
>> Am 17.05.2018 um 12:28 schrieb Tapani Pälli:
>>>
>>>
>>> On 05/17/2018 11:38 AM, Benedikt Schemmer wrote:
>>>> Thanks for reviewing!
>>>>
>>>> Am 17.05.2018 um 08:42 schrieb Tapani Pälli:
>>>>>
>>>>>
>>>>> On 05/10/2018 12:05 PM, Benedikt Schemmer wrote:
>>>>>> Move shader-cache code from back to front and make generate_sha1() usable
>>>>>> unconditionally to avoid code duplication in the following patch
>>>>>>
>>>>>> ---
>>>>>>     src/mesa/main/shaderapi.c | 228 
>>>>>> +++---
>>>>>>     1 file changed, 116 insertions(+), 112 deletions(-)
>>>>>>
>>>>>> diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
>>>>>> index 44b18af492..e8acca4490 100644
>>>>>> --- a/src/mesa/main/shaderapi.c
>>>>>> +++ b/src/mesa/main/shaderapi.c
>>>>>> @@ -64,6 +64,122 @@
>>>>>>     #include "util/mesa-sha1.h"
>>>>>>     #include "util/crc32.h"
>>>>>>
>>>>>> +
>>>>>> +/**
>>>>>> + * Generate a SHA-1 hash value string for given source string.
>>>>>> + */
>>>>>> +static void
>>>>>> +generate_sha1(const char *source, char sha_str[64])
>>>>>> +{
>>>>>> +   unsigned char sha[20];
>>>>>> +   _mesa_sha1_compute(source, strlen(source), sha);
>>>>>> +   _mesa_sha1_format(sha_str, sha);
>>>>>> +}
>>>>>
>>>>> There is one potential problem here. The 'ENABLE_SHADER_CACHE' guard for 
>>>>> generate_sha1 and others was placed there because the imported sha1 code 
>>>>> broke windows build, I'm wondering if this is still
>>>>> the case? If so, then generate_sha1 should be inside ENABLE_SHADER_CACHE 
>>>>> guard.
>>>>>
>>>>
>>>> I did a quick
>>>> gedit $(grep -Rlsi "_mesa_sha1_compute" | grep -E "\.c|\.h")
>>>>
>>>> and it seems radv and anv use _mesa_sha1_compute (and _mesa_sha1_format) 
>>>> without a guard
>>>> best example from Intel seems to be brw_disk_cache.c which uses it alot 
>>>> outside of the ENABLE_SHADER_CACHE guard
>>>>
>>>> so probably safe?
>>>
>>> AFAIK none of those things are compiled on windows. IIUC windows builds are 
>>> about Mesa core and gallium side.
>>>
>>>>>> +
>>>>>> +
>>>>>> +#ifdef ENABLE_SHADER_CACHE
>>>>>> +/**
>>>>>> + * Construct a full path for shader replacement functionality using
>>>>>> + * following format:
>>>>>> + *
>>>>>> + * /_.glsl
>>>>>> + */
>>>>>> +static char *
>>>>>> +construct_name(const gl_shader_stage stage, const char *source,
>>>>>> +   const char *path)
>>>>>> +{
>>>>>> +   char sha[64];
>>>>&g

Re: [Mesa-dev] [PATCH 2/2] mesa: enable ARB_gpu_shader5 in compat

2018-05-18 Thread Benedikt Schemmer
Hi Timothy,

I was looking into something similar myself.
Great to see someone working on getting legacy extensions available
to the applications that can actually use them!

Another thing I noticed is that extensions aren't advertised at the
correct feature level.
The code below works and does a better job at it, but its probably also
the wrong way to go about it.

Cheers,
Benedikt

P.S. I grepped and sed together a file which describes the requirements
for every extension better, unfortunately the files on khronos
arent exactly machine readable
will send in next reply

---

diff --git a/src/mesa/main/version.c b/src/mesa/main/version.c
--- a/src/mesa/main/version.c
+++ b/src/mesa/main/version.c
@@ -233,8 +233,15 @@ _mesa_override_glsl_version(struct gl_constants *consts)
 }

 /**
- * Examine enabled GL extensions to determine GL version.
+ * Enable GL Extensions according to the minimum version requirements as per 
spec.
+ * Careful: some extensions are implemented for the express purpose of being
+ * used in a lower version context (legacy extensions).
+ * Example:
+ * ARB_gpu_shader5: first implementation in OpenGL 4.0 (GLSL 400) but
+ * supposed to be advertised from OpenGL 3.2 (GLSL 150) upwards,
+ * however early Nvidia FXAA could use it from GLSL 120 so advertise it there
  */
+
 static GLuint
 compute_version(const struct gl_extensions *extensions,
 const struct gl_constants *consts, gl_api api)
@@ -242,25 +249,43 @@ compute_version(const struct gl_extensions *extensions,
GLuint major, minor, version;

const bool ver_1_3 = (extensions->ARB_texture_border_clamp &&
+ extensions->ARB_instanced_arrays &&
+ extensions->ARB_shadow &&
+ extensions->ARB_vertex_shader &&
+ extensions->ARB_fragment_shader &&
+ extensions->ARB_depth_texture &&
+ extensions->ARB_occlusion_query2 &&
  extensions->ARB_texture_cube_map &&
+ extensions->ARB_seamless_cube_map &&
  extensions->ARB_texture_env_combine &&
+ extensions->ARB_texture_rg &&
+ extensions->ARB_texture_compression_rgtc &&
+ extensions->ARB_texture_compression_bptc &&
+ extensions->ARB_texture_float &&
+ extensions->ARB_framebuffer_object &&
+ extensions->ARB_framebuffer_no_attachments &&
  extensions->ARB_texture_env_dot3);
const bool ver_1_4 = (ver_1_3 &&
- extensions->ARB_depth_texture &&
- extensions->ARB_shadow &&
+ extensions->ARB_stencil_texturing &&
  extensions->ARB_texture_env_crossbar &&
+ extensions->ARB_texture_mirror_clamp_to_edge &&
  extensions->EXT_blend_color &&
  extensions->EXT_blend_func_separate &&
  extensions->EXT_blend_minmax &&
  extensions->EXT_point_parameters);
const bool ver_1_5 = (ver_1_4 &&
+ extensions->ARB_query_buffer_object &&
  extensions->ARB_occlusion_query);
const bool ver_2_0 = (ver_1_5 &&
  extensions->ARB_point_sprite &&
- extensions->ARB_vertex_shader &&
- extensions->ARB_fragment_shader &&
+ extensions->ARB_blend_func_extended &&
  extensions->ARB_texture_non_power_of_two &&
  extensions->EXT_blend_equation_separate &&
+ extensions->ARB_shader_texture_lod &&
+ extensions->ARB_transform_feedback2 &&
+ extensions->ARB_uniform_buffer_object &&
+ extensions->ARB_texture_buffer_object &&
+ extensions->ARB_texture_buffer_range &&

  /* Technically, 2.0 requires the functionality of the
   * EXT version.  Enable 2.0 if either extension is
@@ -272,33 +297,44 @@ compute_version(const struct gl_extensions *extensions,
   || extensions->ATI_separate_stencil));
const bool ver_2_1 = (ver_2_0 &&
  extensions->EXT_pixel_buffer_object &&
+ extensions->ARB_map_buffer_range &&
+ extensions->ARB_draw_instanced &&
+ extensions->ARB_shader_bit_encoding &&
+ extensions->ARB_gpu_shader5 &&
  extensions->EXT_texture_sRGB);
const bool ver_3_0 = (ver_2_1 &&
  consts->GLSLVersion >= 130 &&
  (consts->MaxSamples >= 4 || consts->FakeSWMSAA) &&
  

Re: [Mesa-dev] [PATCH 2/2] mesa: enable ARB_gpu_shader5 in compat

2018-05-18 Thread Benedikt Schemmer
A list of extension requirements and interactions extracted from the GL_* files 
on Khronos
so this is messy but relatively complete
I've started to introduce key phrases like "This extension requires" so that 
this
could become machine readable in the future.

---

./WGL_ARB_make_current_read.txt
This extension requires WGL_ARB_extensions_string is required.
./ARB_cull_distance.txt
This extension requires OpenGL 3.0 is required.
The extension is written against the OpenGL 4.4 Specification, Core 
Profile, March 19, 2014.
The extension is written against the OpenGL Shading Language 4.40 
Specification, January 22, 2014.
./ARB_compute_variable_group_size.txt
This extension is written against the OpenGL 4.3 (Compatibility Profile) 
Specification, dated August 6, 2012.
This extension is written against the OpenGL Shading Language 
Specification, Version 4.30, Revision 7, dated September 24, 2012.
This extension requires OpenGL 4.3 or ARB_compute_shader is required.
This extension interacts with NV_compute_program5.
./ARB_texture_query_levels.txt
This extension requires OpenGL 3.0 is required.
This extension requires OpenGL Shading Language 1.30 is required
This extension is written against the OpenGL 4.2 specification and version 
4.20 of the OpenGL Shading Language Specification.
./ARB_depth_clamp.txt
Written based on the wording of the OpenGL 3.1 specification.
ARB_compatibility affects the behavior of this extension.
./WGL_ARB_render_texture.txt
This extension requires OpenGL 1.1 is required.
This extension requires WGL_ARB_extension_string is required.
This extension requires WGL_ARB_pixel_format is required.
This extension requires WGL_ARB_pbuffer is required.
WGL_ARB_make_current_read affects the definition of this extension.
GL_ARB_texture_cube_map affects the definition of this extension
The extension is written against the OpenGL 1.2.1 Specification.
./ARB_copy_image.txt
This extension requires OpenGL 1.1 is required.
The extension is written against the OpenGL 4.2 (Core Profile) 
Specification.
This extension interacts with OpenGL 4.3 and ARB_texture_view.
This extension interacts with the Compatibility Profile.
This extension interacts with EXT_texture_compression_s3tc.
This extension interacts with OpenGL 3.0 and ARB_texture_compression_rgtc.
This extension interacts with OpenGL 4.2 and ARB_texture_compression_bptc.
./ARB_shader_clock.txt
This extension is written against version 4.50 of the OpenGL Shading 
Language Specification.
This interacts with GL_ARB_gpu_shader_int64.
./ARB_texture_env_crossbar.txt
This extension is written against the OpenGL 1.2.1 Specification.
This extension requires OpenGL 1.1
This extension requires ARB_multitexture
This extension requires ARB_texture_env_combine
./WGL_ARB_extensions_string.txt
None
./ARB_shader_group_vote.txt
This extension is written against the OpenGL 4.3 (Compatibility Profile) 
Specification, dated August 6, 2012.
This extension is written against the OpenGL Shading Language 
Specification, Version 4.30, Revision 7, dated September 24, 2012.
This extension requires OpenGL 4.3 or ARB_compute_shader
This extension interacts with NV_gpu_shader5.
./ARB_ES2_compatibility.txt
Written based on the wording of the OpenGL 4.0 Compatibility Profile (March 
11, 2010) specification.
This extension interacts with ARB_tessellation_shader or
OpenGL 4.0.
./ARB_provoking_vertex.txt
This extension is written against the OpenGL 2.1 Specification but can 
apply to any prior specification.
ARB_geometry_shader4,
EXT_geometry_shader4,
NV_geometry_shader4, and
NV_gpu_program4 interact with this extension
EXT_transform_feedback, NV_transform_feedback, and the transform feedback 
functionality made core by OpenGL 3.0 are clarified by this extension.
./ARB_texture_buffer_object_rgb32.txt
This extension is written against the OpenGL 3.2 specification (Core 
Profile).
./ARB_point_parameters.txt
This extension requires OpenGL 1.0 is required.
ARB_multisample affects the definition of this extension.
The extension is written against the OpenGL 1.2.1 Specification.
./ARB_half_float_vertex.txt
This extension is written against the OpenGL 2.1 Specification
Based on the NV_half_float and OES_vertex_half_float extensions.
./ARB_explicit_attrib_location.txt
This extension requires OpenGL 2.0 or GL_ARB_vertex_shader.
This extension interacts with GL_ARB_blend_func_extended.
This extension is written against the OpenGL 3.2 (Core Profile) 
specification dated 7/24/2009 and the
This extension is written against the OpenGL Shading Language 1.50.09 
specification dated 7/24/2009.
This extension interacts with GL_ARB_separate_shader_objects,
This extension interacts with GL_ARB_gpu_shader_fp64
This extension interacts with OpenGL 4.0
This extension int

Re: [Mesa-dev] [PATCH 2/2] mesa: enable ARB_gpu_shader5 in compat

2018-05-18 Thread Benedikt Schemmer
Yes its backwards and it seems it doesn't work anymore ;)
Seems the loophole I was using got removed.

Problem remains that applications that could use an legacy extension
are not aware of it and will simply/usually not request it:

MESA_GL_VERSION_OVERRIDE=3.2 MESA_GLSL_VERSION_OVERRIDE=150 glxinfo | grep 
shader5
-nothing-

MESA_GL_VERSION_OVERRIDE=3.2 MESA_GLSL_VERSION_OVERRIDE=400 glxinfo | grep 
shader5
GL_ARB_gpu_shader5, GL_ARB_gpu_shader_fp64, GL_ARB_gpu_shader_int64,
GL_ARB_gpu_shader5, GL_ARB_half_float_pixel, GL_ARB_half_float_vertex,
GL_EXT_gpu_shader5, GL_EXT_map_buffer_range, GL_EXT_memory_object,
GL_OES_get_program_binary, GL_OES_gpu_shader5, GL_OES_mapbuffer,


If an application knows about GLSL 400 it doesn't need GL_ARB_gpu_shader5 
anymore.



Am 18.05.2018 um 11:25 schrieb Timothy Arceri:
> Hi Benedikt,
> 
> I think you have this backwards. compute_version() computes the version of 
> OpenGL supported by the driver by checking what extensions it advertises. See 
> version 2 of my patch to see where the
> extensions are enabled. I don't think this is something that can be 
> automated, it really depends on the driver, just because an extension can be 
> advertised at a certain level doesn't mean you must (or
> can) do it.
> 
> Tim
> 
> On 18/05/18 19:13, Benedikt Schemmer wrote:
>> Hi Timothy,
>>
>> I was looking into something similar myself.
>> Great to see someone working on getting legacy extensions available
>> to the applications that can actually use them!
>>
>> Another thing I noticed is that extensions aren't advertised at the
>> correct feature level.
>> The code below works and does a better job at it, but its probably also
>> the wrong way to go about it.
>>
>> Cheers,
>> Benedikt
>>
>> P.S. I grepped and sed together a file which describes the requirements
>> for every extension better, unfortunately the files on khronos
>> arent exactly machine readable
>> will send in next reply
>>
>> ---
>>
>> diff --git a/src/mesa/main/version.c b/src/mesa/main/version.c
>> --- a/src/mesa/main/version.c
>> +++ b/src/mesa/main/version.c
>> @@ -233,8 +233,15 @@ _mesa_override_glsl_version(struct gl_constants *consts)
>>   }
>>
>>   /**
>> - * Examine enabled GL extensions to determine GL version.
>> + * Enable GL Extensions according to the minimum version requirements as 
>> per spec.
>> + * Careful: some extensions are implemented for the express purpose of being
>> + * used in a lower version context (legacy extensions).
>> + * Example:
>> + * ARB_gpu_shader5: first implementation in OpenGL 4.0 (GLSL 400) but
>> + * supposed to be advertised from OpenGL 3.2 (GLSL 150) upwards,
>> + * however early Nvidia FXAA could use it from GLSL 120 so advertise it 
>> there
>>    */
>> +
>>   static GLuint
>>   compute_version(const struct gl_extensions *extensions,
>>   const struct gl_constants *consts, gl_api api)
>> @@ -242,25 +249,43 @@ compute_version(const struct gl_extensions *extensions,
>>  GLuint major, minor, version;
>>
>>  const bool ver_1_3 = (extensions->ARB_texture_border_clamp &&
>> + extensions->ARB_instanced_arrays &&
>> + extensions->ARB_shadow &&
>> + extensions->ARB_vertex_shader &&
>> + extensions->ARB_fragment_shader &&
>> + extensions->ARB_depth_texture &&
>> + extensions->ARB_occlusion_query2 &&
>>    extensions->ARB_texture_cube_map &&
>> + extensions->ARB_seamless_cube_map &&
>>    extensions->ARB_texture_env_combine &&
>> + extensions->ARB_texture_rg &&
>> + extensions->ARB_texture_compression_rgtc &&
>> + extensions->ARB_texture_compression_bptc &&
>> + extensions->ARB_texture_float &&
>> + extensions->ARB_framebuffer_object &&
>> + extensions->ARB_framebuffer_no_attachments &&
>>    extensions->ARB_texture_env_dot3);
>>  const bool ver_1_4 = (ver_1_3 &&
>> - extensions->ARB_depth_texture &&
>> - extensions->ARB_shadow &&
>> + 

Re: [Mesa-dev] [PATCH 2/2] mesa: enable ARB_gpu_shader5 in compat

2018-05-18 Thread Benedikt Schemmer
Indeed it does, I've written a little test program where I can request specific 
OpenGL versions.
I think that used to work, now if I request anything <= 3.1 I get 3.1 if 
request >= 3.2 I get 4.5.
GL_ARB_gpu_shader5 is available in all of them.

If I force the GL version the version string changes and I actually have to 
request a version lower than that.
Still GL_ARB_gpu_shader5 is available in all of them.

If I force the GLSL version to anything lower than 400, GL_ARB_gpu_shader5 goes 
missing.

That's not the intended behavior is it?

Am 18.05.2018 um 13:32 schrieb Ilia Mirkin:
> The check is against what the driver max is, not what the application 
> requested.
> 
> Using overrides messes up this logic, it seems.
> 
> On Fri, May 18, 2018, 06:14 Benedikt Schemmer  <mailto:b...@besd.de>> wrote:
> 
> Yes its backwards and it seems it doesn't work anymore ;)
> Seems the loophole I was using got removed.
> 
> Problem remains that applications that could use an legacy extension
> are not aware of it and will simply/usually not request it:
> 
> MESA_GL_VERSION_OVERRIDE=3.2 MESA_GLSL_VERSION_OVERRIDE=150 glxinfo | 
> grep shader5
> -nothing-
> 
> MESA_GL_VERSION_OVERRIDE=3.2 MESA_GLSL_VERSION_OVERRIDE=400 glxinfo | 
> grep shader5
>     GL_ARB_gpu_shader5, GL_ARB_gpu_shader_fp64, GL_ARB_gpu_shader_int64,
>     GL_ARB_gpu_shader5, GL_ARB_half_float_pixel, GL_ARB_half_float_vertex,
>     GL_EXT_gpu_shader5, GL_EXT_map_buffer_range, GL_EXT_memory_object,
>     GL_OES_get_program_binary, GL_OES_gpu_shader5, GL_OES_mapbuffer,
> 
> 
> If an application knows about GLSL 400 it doesn't need GL_ARB_gpu_shader5 
> anymore.
> 
> 
> 
> Am 18.05.2018 um 11:25 schrieb Timothy Arceri:
> > Hi Benedikt,
> >
> > I think you have this backwards. compute_version() computes the version 
> of OpenGL supported by the driver by checking what extensions it advertises. 
> See version 2 of my patch to see where the
> > extensions are enabled. I don't think this is something that can be 
> automated, it really depends on the driver, just because an extension can be 
> advertised at a certain level doesn't mean you
> must (or
> > can) do it.
> >
> > Tim
> >
> > On 18/05/18 19:13, Benedikt Schemmer wrote:
> >> Hi Timothy,
> >>
> >> I was looking into something similar myself.
> >> Great to see someone working on getting legacy extensions available
> >> to the applications that can actually use them!
> >>
> >> Another thing I noticed is that extensions aren't advertised at the
> >> correct feature level.
> >> The code below works and does a better job at it, but its probably also
> >> the wrong way to go about it.
> >>
> >> Cheers,
> >> Benedikt
> >>
> >> P.S. I grepped and sed together a file which describes the requirements
> >> for every extension better, unfortunately the files on khronos
> >> arent exactly machine readable
> >> will send in next reply
> >>
> >> ---
> >>
> >> diff --git a/src/mesa/main/version.c b/src/mesa/main/version.c
> >> --- a/src/mesa/main/version.c
> >> +++ b/src/mesa/main/version.c
> >> @@ -233,8 +233,15 @@ _mesa_override_glsl_version(struct gl_constants 
> *consts)
> >>   }
> >>
> >>   /**
> >> - * Examine enabled GL extensions to determine GL version.
> >> + * Enable GL Extensions according to the minimum version requirements 
> as per spec.
> >> + * Careful: some extensions are implemented for the express purpose 
> of being
> >> + * used in a lower version context (legacy extensions).
> >> + * Example:
> >> + * ARB_gpu_shader5: first implementation in OpenGL 4.0 (GLSL 400) but
> >> + * supposed to be advertised from OpenGL 3.2 (GLSL 150) upwards,
> >> + * however early Nvidia FXAA could use it from GLSL 120 so advertise 
> it there
> >>    */
> >> +
> >>   static GLuint
> >>   compute_version(const struct gl_extensions *extensions,
> >>   const struct gl_constants *consts, gl_api api)
> >> @@ -242,25 +249,43 @@ compute_version(const struct gl_extensions 
> *extensions,
> >>  GLuint major, minor, version;
> >>
> >>  const bool ver_1_3 = (extensions->ARB_texture_borde

Re: [Mesa-dev] [PATCH 2/2] mesa: enable ARB_gpu_shader5 in compat

2018-05-18 Thread Benedikt Schemmer
Am I on the right track to assume that Timothy's patch should read

+EXT(ARB_gpu_shader5 , ARB_gpu_shader5  
  , 32, 40,  x ,  x , 2010)

to give GL_ARB_gpu_shader5 a minimum required legacy OpenGL version of 3.2 and 
core 4.0 as per spec?

I wanted to ask: is ctx->Version >= ext->version[ctx->API] evaluated every time 
a context is created or just once at boot time?
But it seems to be evaluated every time a context is created and this is 
actually more like what I expect from the
version override

MESA_GL_VERSION_OVERRIDE=4.0 glxinfo | grep shader5
GL_ARB_gpu_shader5, GL_ARB_gpu_shader_fp64, GL_ARB_gpu_shader_int64,
GL_ARB_gpu_shader5, GL_ARB_half_float_pixel, GL_ARB_half_float_vertex,
GL_EXT_gpu_shader5, GL_EXT_map_buffer_range, GL_EXT_memory_object,
GL_OES_get_program_binary, GL_OES_gpu_shader5, GL_OES_mapbuffer,

MESA_GL_VERSION_OVERRIDE=3.2 glxinfo | grep shader5
GL_ARB_gpu_shader5, GL_ARB_half_float_pixel, GL_ARB_half_float_vertex,
GL_EXT_gpu_shader5, GL_EXT_map_buffer_range, GL_EXT_memory_object,
GL_OES_get_program_binary, GL_OES_gpu_shader5, GL_OES_mapbuffer,

MESA_GL_VERSION_OVERRIDE=3.0 glxinfo | grep shader5
GL_EXT_gpu_shader5, GL_EXT_map_buffer_range, GL_EXT_memory_object,
GL_OES_get_program_binary, GL_OES_gpu_shader5, GL_OES_mapbuffer,

---
from extensions.h:

/** Checks if the context supports a user-facing extension */
#define EXT(name_str, driver_cap, ...) \
static inline bool \
_mesa_has_##name_str(const struct gl_context *ctx) \
{ \
   return ctx->Extensions.driver_cap && (ctx->Extensions.Version >= \ // equal 
to ctx->Version
  _mesa_extension_table[MESA_EXTENSION_##name_str].version[ctx->API]); \
}
#include "extensions_table.h"
#undef EXT

Am 18.05.2018 um 14:35 schrieb Ilia Mirkin:
> On Fri, May 18, 2018 at 8:20 AM, Benedikt Schemmer  wrote:
>> Indeed it does, I've written a little test program where I can request 
>> specific OpenGL versions.
>> I think that used to work, now if I request anything <= 3.1 I get 3.1 if 
>> request >= 3.2 I get 4.5.
>> GL_ARB_gpu_shader5 is available in all of them.
>>
>> If I force the GL version the version string changes and I actually have to 
>> request a version lower than that.
>> Still GL_ARB_gpu_shader5 is available in all of them.
>>
>> If I force the GLSL version to anything lower than 400, GL_ARB_gpu_shader5 
>> goes missing.
>>
>> That's not the intended behavior is it?
> 
> Forcing versions via env vars isn't well-supported, esp around the
> edges like this. Feel free to send patches to fix it.
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] mesa: enable ARB_gpu_shader5 in compat

2018-05-18 Thread Benedikt Schemmer
so the gll is ok?

my idea with the glc is that this particular extension would no longer be 
advertised as soon as the app
request a context where the functionality is already part of it (ie OpenGL >= 
4.0)

Do you think a minimum GLSL version could be added to this table (maybe 
replacing the year) so that
GLSL version overriding has a similar effect?

Am 18.05.2018 um 19:32 schrieb Ilia Mirkin:
> On Fri, May 18, 2018 at 1:27 PM, Benedikt Schemmer  wrote:
>> Am I on the right track to assume that Timothy's patch should read
>>
>> +EXT(ARB_gpu_shader5 , ARB_gpu_shader5   
>>  , 32, 40,  x ,  x , 2010)
>>
>> to give GL_ARB_gpu_shader5 a minimum required legacy OpenGL version of 3.2 
>> and core 4.0 as per spec?
> 
> It should read something like 32, 32. I was going to point that out,
> but ... meh. It shouldn't matter in practice. Environment
> variable-based overrides aren't a particularly supported mode of
> operation, so I don't think we need to bend over backwards for it.
> 
>   -ilia
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] mesa: enable ARB_gpu_shader5 in compat

2018-05-18 Thread Benedikt Schemmer


Am 18.05.2018 um 20:05 schrieb Ilia Mirkin:
> On Fri, May 18, 2018 at 1:54 PM, Benedikt Schemmer  wrote:
>> so the gll is ok?
>>
>> my idea with the glc is that this particular extension would no longer be 
>> advertised as soon as the app
>> request a context where the functionality is already part of it (ie OpenGL 
>> >= 4.0)
> 
> Why is this desirable? One could interpret that to mean that e.g.
> "#version 150; #extension GL_ARB_gpu_shader5: enable" wouldn't work...
> While legal, I doubt it'd go over well with applications.

I dont think any application would notice. They should just do #version 400 (or 
higher) and dont need GL_ARB_gpu_shader5 anymore
So less useless extensions to sift through.

> 
>> Do you think a minimum GLSL version could be added to this table (maybe 
>> replacing the year) so that
>> GLSL version overriding has a similar effect?
> 
> The minimum GL version seems to do that just fine...

This is of course very specific to this extension, although I do expect more 
have the same aspect.

I found an old (the) FXAA implementation from Nvidia that was used by retro 
(really really old) games
https://gist.github.com/kosua20/0c506b81b3812ac900048059d2383126

and it could use shader5 with GLSL 120. So it would be nice to have a way to 
enable that extension
even in cases where the requested OpenGL version is lower than specified but 
there is some justification
to it.

---

#ifndef FXAA_FAST_PIXEL_OFFSET
//
// Used for GLSL 120 only.
//
// 1 = GL API supports fast pixel offsets
// 0 = do not use fast pixel offsets
//
#ifdef GL_EXT_gpu_shader4
#define FXAA_FAST_PIXEL_OFFSET 1
#endif
#ifdef GL_NV_gpu_shader5
#define FXAA_FAST_PIXEL_OFFSET 1
#endif
#ifdef GL_ARB_gpu_shader5
#define FXAA_FAST_PIXEL_OFFSET 1
#endif
#ifndef FXAA_FAST_PIXEL_OFFSET
#define FXAA_FAST_PIXEL_OFFSET 0
#endif
#endif

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [RFC PATCH 2/3] mesa/main/extensions: first attempt to get the context levels right

2018-05-19 Thread Benedikt Schemmer
These are just a few extensions for which I had the specifications handy, to
get a feel for what this might look like.

It's already been pointed out that applications might fail because of this.

(I haven't found an application that does yet but I can only test on 64-bit)
The argument for this change would be that its ok if the application fails
because it does something unexpected and its better to let it explicitly fail
and create a workaround for it than to just silently hope that everything is
going to be alright.

CC: "Marek Olšák"  
CC: Eric Anholt 
CC: Emil Velikov 
CC: Timothy Arceri 
CC: Ilia Mirkin 

Signed-off-by: Benedikt Schemmer 
---
 src/mesa/main/extensions_table.h | 98 ++--
 1 file changed, 53 insertions(+), 45 deletions(-)

diff --git a/src/mesa/main/extensions_table.h b/src/mesa/main/extensions_table.h
index 931e0d4d33..6e3bdbe99c 100644
--- a/src/mesa/main/extensions_table.h
+++ b/src/mesa/main/extensions_table.h
@@ -6,7 +6,8 @@
 #define ANYES2 0
 #define   x   ~0
 
-//EXT(name_str  , driver_cap   
  ,GLL_ver,GLC_ver, gles_ver, glES2_ver, )
+//highest GLL_ver = 46, lowest GLC_ver = 30
+//EXT(name_str  , driver_cap   
  ,GLL_ver,GLC_ver, gles_ver, glES2_ver, year)
 EXT(3DFX_texture_compression_FXT1   , TDFX_texture_compression_FXT1
  , ANYGLL, ANYGLC,   x   ,   x   , 1999)
 
 EXT(AMD_conservative_depth  , ARB_conservative_depth   
  , ANYGLL, ANYGLC,   x   ,   x   , 2009)
@@ -31,60 +32,62 @@ EXT(APPLE_texture_max_level , dummy_true
 EXT(ARB_ES2_compatibility   , ARB_ES2_compatibility
  , ANYGLL, ANYGLC,   x   ,   x   , 2009)
 EXT(ARB_ES3_1_compatibility , ARB_ES3_1_compatibility  
  ,   x   , ANYGLC,   x   ,   x   , 2014)
 EXT(ARB_ES3_2_compatibility , ARB_ES3_2_compatibility  
  ,   x   , ANYGLC,   x   ,   x   , 2015)
-EXT(ARB_ES3_compatibility   , ARB_ES3_compatibility
  , ANYGLL, ANYGLC,   x   ,   x   , 2012)
-EXT(ARB_arrays_of_arrays, ARB_arrays_of_arrays 
  , ANYGLL, ANYGLC,   x   ,   x   , 2012)
+EXT(ARB_ES3_compatibility   , ARB_ES3_compatibility
  ,  33   ,  33   ,   x   ,   x   , 2012) // requires 
ARB_ES2_compatibility, ARB_invalidate_subdata, ARB_texture_storage, 
OES_compressed_*
+EXT(ARB_arrays_of_arrays, ARB_arrays_of_arrays 
  ,  12   ,  12   ,   x   ,   x   , 2012)
 EXT(ARB_base_instance   , ARB_base_instance
  , ANYGLL, ANYGLC,   x   ,   x   , 2011)
 EXT(ARB_bindless_texture, ARB_bindless_texture 
  , ANYGLL, ANYGLC,   x   ,   x   , 2013)
 EXT(ARB_blend_func_extended , ARB_blend_func_extended  
  , ANYGLL, ANYGLC,   x   ,   x   , 2009)
 EXT(ARB_buffer_storage  , ARB_buffer_storage   
  , ANYGLL, ANYGLC,   x   ,   x   , 2013)
+//EXT(ARB_cl_event, ARB_cl_event   
,  31   ,  31   ,   x   ,   x   , 2013) // requires GL_ARB_sync
 EXT(ARB_clear_buffer_object , dummy_true   
  , ANYGLL, ANYGLC,   x   ,   x   , 2012)
 EXT(ARB_clear_texture   , ARB_clear_texture
  , ANYGLL, ANYGLC,   x   ,   x   , 2013)
 EXT(ARB_clip_control, ARB_clip_control 
  , ANYGLL, ANYGLC,   x   ,   x   , 2014)
 EXT(ARB_color_buffer_float  , ARB_color_buffer_float   
  , ANYGLL, ANYGLC,   x   ,   x   , 2004)
 EXT(ARB_compatibility   , ARB_compatibility
  , ANYGLL,   x   ,   x   ,   x   , 2009)
 EXT(ARB_compressed_texture_pixel_storage, dummy_true   
  , ANYGLL, ANYGLC,   x   ,   x   , 2011)
-EXT(ARB_compute_shader  , ARB_compute_shader   
  , ANYGLL, ANYGLC,   x   ,   x   , 2012)
-EXT(ARB_compute_variable_group_size , ARB_compute_variable_group_size  
  , ANYGLL, ANYGLC,   x   ,   x   , 2013)
+EXT(ARB_compute_shader  , ARB_compute_shader   
  ,  42   ,  42   ,   x   ,   x   , 2012)
+EXT(ARB_compute_variable_group_size , ARB_compute_variable_group_size  
  ,  42   ,  42   ,   x   ,   x   , 2013) //or ARB compute shader
 EXT(ARB_conditional_render_inverted , ARB_conditional_render_inverted  
  , ANYGLL, ANYGLC,   x   ,   x   , 2014)
 EXT(ARB_conservative_depth  , ARB_conservative_depth   
  , ANYGLL, ANYGLC,   x   ,   x   , 2011)
 EXT(ARB_copy_buffer , dummy_true   
  , ANYGLL,

[Mesa-dev] [RFC PATCH 1/3] mesa/main/extensions: make constants slightly more expressive and adjust layout

2018-05-19 Thread Benedikt Schemmer
This just makes the constants more descriptive

CC: "Marek Olšák"  
CC: Eric Anholt 
CC: Emil Velikov 
CC: Timothy Arceri 
CC: Ilia Mirkin 

Signed-off-by: Benedikt Schemmer 
---
 src/mesa/main/extensions_table.h | 833 ---
 1 file changed, 417 insertions(+), 416 deletions(-)

diff --git a/src/mesa/main/extensions_table.h b/src/mesa/main/extensions_table.h
index ffb1caccdd..931e0d4d33 100644
--- a/src/mesa/main/extensions_table.h
+++ b/src/mesa/main/extensions_table.h
@@ -1,439 +1,440 @@
 /* The extension table is alphabetically sorted by the extension name string 
column. */
 
-#define GLL 0
-#define GLC 0
-#define ES1 0
-#define ES2 0
-#define  x ~0
+#define ANYGLL 0
+#define ANYGLC 0
+#define ANYES1 0
+#define ANYES2 0
+#define   x   ~0
 
-EXT(3DFX_texture_compression_FXT1   , TDFX_texture_compression_FXT1
  , GLL, GLC,  x ,  x , 1999)
+//EXT(name_str  , driver_cap   
  ,GLL_ver,GLC_ver, gles_ver, glES2_ver, )
+EXT(3DFX_texture_compression_FXT1   , TDFX_texture_compression_FXT1
  , ANYGLL, ANYGLC,   x   ,   x   , 1999)
 
-EXT(AMD_conservative_depth  , ARB_conservative_depth   
  , GLL, GLC,  x ,  x , 2009)
-EXT(AMD_draw_buffers_blend  , ARB_draw_buffers_blend   
  , GLL, GLC,  x ,  x , 2009)
-EXT(AMD_performance_monitor , AMD_performance_monitor  
  , GLL, GLC,  x ,  x , 2007)
-EXT(AMD_pinned_memory   , AMD_pinned_memory
  , GLL, GLC,  x ,  x , 2013)
-EXT(AMD_seamless_cubemap_per_texture, AMD_seamless_cubemap_per_texture 
  , GLL, GLC,  x ,  x , 2009)
-EXT(AMD_shader_stencil_export   , ARB_shader_stencil_export
  , GLL, GLC,  x ,  x , 2009)
-EXT(AMD_shader_trinary_minmax   , dummy_true   
  , GLL, GLC,  x ,  x , 2012)
-EXT(AMD_vertex_shader_layer , AMD_vertex_shader_layer  
  ,  x , GLC,  x ,  x , 2012)
-EXT(AMD_vertex_shader_viewport_index, AMD_vertex_shader_viewport_index 
  ,  x , GLC,  x ,  x , 2012)
+EXT(AMD_conservative_depth  , ARB_conservative_depth   
  , ANYGLL, ANYGLC,   x   ,   x   , 2009)
+EXT(AMD_draw_buffers_blend  , ARB_draw_buffers_blend   
  , ANYGLL, ANYGLC,   x   ,   x   , 2009)
+EXT(AMD_performance_monitor , AMD_performance_monitor  
  , ANYGLL, ANYGLC,   x   ,   x   , 2007)
+EXT(AMD_pinned_memory   , AMD_pinned_memory
  , ANYGLL, ANYGLC,   x   ,   x   , 2013)
+EXT(AMD_seamless_cubemap_per_texture, AMD_seamless_cubemap_per_texture 
  , ANYGLL, ANYGLC,   x   ,   x   , 2009)
+EXT(AMD_shader_stencil_export   , ARB_shader_stencil_export
  , ANYGLL, ANYGLC,   x   ,   x   , 2009)
+EXT(AMD_shader_trinary_minmax   , dummy_true   
  , ANYGLL, ANYGLC,   x   ,   x   , 2012)
+EXT(AMD_vertex_shader_layer , AMD_vertex_shader_layer  
  ,   x   , ANYGLC,   x   ,   x   , 2012)
+EXT(AMD_vertex_shader_viewport_index, AMD_vertex_shader_viewport_index 
  ,   x   , ANYGLC,   x   ,   x   , 2012)
 
-EXT(ANDROID_extension_pack_es31a, ANDROID_extension_pack_es31a 
  ,  x ,  x ,  x ,  31, 2014)
+EXT(ANDROID_extension_pack_es31a, ANDROID_extension_pack_es31a 
  ,   x   ,   x   ,   x   ,  31   , 2014)
 
-EXT(ANGLE_texture_compression_dxt3  , ANGLE_texture_compression_dxt
  , GLL, GLC, ES1, ES2, 2011)
-EXT(ANGLE_texture_compression_dxt5  , ANGLE_texture_compression_dxt
  , GLL, GLC, ES1, ES2, 2011)
+EXT(ANGLE_texture_compression_dxt3  , ANGLE_texture_compression_dxt
  , ANYGLL, ANYGLC, ANYES1, ANYES2, 2011)
+EXT(ANGLE_texture_compression_dxt5  , ANGLE_texture_compression_dxt
  , ANYGLL, ANYGLC, ANYES1, ANYES2, 2011)
 
-EXT(APPLE_object_purgeable  , APPLE_object_purgeable   
  , GLL, GLC,  x ,  x , 2006)
-EXT(APPLE_packed_pixels , dummy_true   
  , GLL,  x ,  x ,  x , 2002)
-EXT(APPLE_texture_max_level , dummy_true   
  ,  x ,  x , ES1, ES2, 2009)
+EXT(APPLE_object_purgeable  , APPLE_object_purgeable   
  , ANYGLL, ANYGLC,   x   ,   x   , 2006)
+EXT(APPLE_packed_pixels , dummy_true   
  , ANYGLL,   x   ,   x   ,   x   , 2002)
+EXT(APPLE_texture_max_level , dummy_true   
  ,   x   ,   x   , ANYES1, ANYES2, 2009)
 
-EXT(ARB_ES2_compatibility   , ARB_ES2_compatibility
  , GLL, GLC,  x ,  x , 2009)
-EXT(ARB_ES3_1_compatibility , ARB_ES3_1_com

[Mesa-dev] [RFC PATCH 0/3] mesa/main/extensions: approximate feature levels for enabling better

2018-05-19 Thread Benedikt Schemmer
Most of the extensions are currently either always available or not.
A mechanism to enable extensions based on the required OpenGL version is already
in place, its just rarely used.

This might cause problems when an application actually tries to use an extension
in a context for which the extension was not designed.

Also this prevents extensions that might be problematic when they are always
enabled, to be enabled in the appropriate context.

Tested with several 64-bit games.

CC: "Marek Olšák"  
CC: Eric Anholt 
CC: Emil Velikov 
CC: Timothy Arceri 
CC: Ilia Mirkin 

Benedikt Schemmer (3):
  mesa/main/extensions: make constants slightly more expressive and adjust 
layout
  mesa/main/extensions: first attempt to get the context levels right
  mesa/main/extensions: introduce dependencies

 src/mesa/main/extensions.c   |   3 +
 src/mesa/main/extensions.h   |   3 +
 src/mesa/main/extensions_table.c |   5 +-
 src/mesa/main/extensions_table.h | 841 ---
 4 files changed, 434 insertions(+), 418 deletions(-)

-- 
2.14.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [RFC PATCH 1/4] mesa/main/extensions: use GL_ARB_gpu_shader5 in legacy contexts

2018-05-19 Thread Benedikt Schemmer
is the prerequisit for the others I just sent

CC: "Marek Olšák"  
CC: Eric Anholt 
CC: Emil Velikov 
CC: Timothy Arceri 
CC: Ilia Mirkin 

Signed-off-by: Benedikt Schemmer 
---
 src/mesa/main/extensions_table.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/main/extensions_table.h b/src/mesa/main/extensions_table.h
index 38d241db52..ffb1caccdd 100644
--- a/src/mesa/main/extensions_table.h
+++ b/src/mesa/main/extensions_table.h
@@ -74,7 +74,7 @@ EXT(ARB_framebuffer_sRGB, 
EXT_framebuffer_sRGB
 EXT(ARB_get_program_binary  , dummy_true   
  , GLL, GLC,  x ,  x , 2010)
 EXT(ARB_get_texture_sub_image   , dummy_true   
  , GLL, GLC,  x ,  x , 2014)
 EXT(ARB_gl_spirv, ARB_gl_spirv 
  ,  x,  GLC,  x ,  x , 2016)
-EXT(ARB_gpu_shader5 , ARB_gpu_shader5  
  ,  x , GLC,  x ,  x , 2010)
+EXT(ARB_gpu_shader5 , ARB_gpu_shader5  
  , GLL, GLC,  x ,  x , 2010)
 EXT(ARB_gpu_shader_fp64 , ARB_gpu_shader_fp64  
  ,  x , GLC,  x ,  x , 2010)
 EXT(ARB_gpu_shader_int64, ARB_gpu_shader_int64 
  ,  x , GLC,  x ,  x , 2015)
 EXT(ARB_half_float_pixel, dummy_true   
  , GLL, GLC,  x ,  x , 2003)
-- 
2.14.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] shader-db/si-report: Show biggest improvements also

2018-05-20 Thread Benedikt Schemmer
Hello Marek,
thank you very much for reviewing.

I don't have commit rights, could you push for me?
(I only wrote that in the cover letter, should I put that in every patch in the 
future?)

Benedikt

Am 20.05.2018 um 04:58 schrieb Marek Olšák:
> For the series:
> 
> Reviewed-by: Marek Olšák mailto:marek.ol...@amd.com>>
> 
> Marek
> 
> 
> On Thu, May 10, 2018 at 5:06 AM, Benedikt Schemmer  <mailto:b...@besd.de>> wrote:
> 
> Sometimes it is nice (and useful) to not just see if you've messed up
> but also if you've made an improvement.
> 
> ---
>  si-report.py | 24 
>  1 file changed, 24 insertions(+)
> 
> diff --git a/si-report.py b/si-report.py
> index fba652c..d0e0686 100755
> --- a/si-report.py
> +++ b/si-report.py
> @@ -686,6 +686,30 @@ def print_tables(before_all_results, 
> after_all_results):
>                  break
>          if num > 0:
>              print
> +
> +    # biggest improvements
> +    metrics = si_stats().metrics
> +    for i in range(len(metrics)):
> +        field = metrics[i][0]
> +        num = 0
> +        more_is_better = metrics[i][0] == 'maxwaves'
> +
> +        if more_is_better:
> +            sort_key = lambda v: -v[1].diff.__dict__[field]
> +        else:
> +            sort_key = lambda v: v[1].diff.__dict__[field]
> +
> +        for name, stats in sorted(shaders.items(), key = sort_key):
> +            if more_is_better:
> +                if stats.diff.__dict__[field] <= 0:
> +                    continue
> +            else:
> +                if stats.diff.__dict__[field] >= 0:
> +                    continue
> +
> +            if num == 0:
> +                print_yellow(" BIGGEST IMPROVEMENTS - 
> {:64}".format(metrics[i][1]))
> +                print_yellow(" Before After     Delta Percentage")
>              stats.print_regression(name, field)
>              num += 1
>              if num == num_listed:
> -- 
> 2.14.1
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org <mailto:mesa-dev@lists.freedesktop.org>
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev 
> <https://lists.freedesktop.org/mailman/listinfo/mesa-dev>
> 
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [RFC PATCH] Replace an flock with a random filename to evade some very ugly system dependent code

2018-05-20 Thread Benedikt Schemmer
There is exactly one flock in mesa and it caused mesa not to build
on windows when shader cache was enabled.

It should be possible to revert 9f8dc3bf03ec825bae7041858dda6ca2e9a34363
"utils: build sha1/disk cache only with Android/Autoconf" currently
guarding the offending code with ENABLE_SHADER_CACHE

This would allow shader cache to work on windows I think.

I dont have a test system with windows though.
This builds on linux and is tested with Deus Ex:MD and Metro 2033 Redux
both with cold shader cache.

Really
Fixes: d1efa09d342bff3e5def2978a0bef748d74f9c82

CC: Tapani Pälli 
CC: "Marek Olšák"  
CC: Emil Velikov 
CC: Timothy Arceri 
CC: Samuel Pitoiset 
---
This enables the patch
[PATCH 1/3] mesa/main/shaderapi: Use generate_sha1() unconditionally

 src/util/disk_cache.c | 48 +++-
 1 file changed, 35 insertions(+), 13 deletions(-)

diff --git a/src/util/disk_cache.c b/src/util/disk_cache.c
index 4a762eff20..ca47bb15fb 100644
--- a/src/util/disk_cache.c
+++ b/src/util/disk_cache.c
@@ -28,7 +28,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -848,6 +847,29 @@ struct cache_entry_file_data {
uint32_t uncompressed_size;
 };
 
+static char *
+generate_random_string(int length) {
+   static const char a[] = "0123456789abcdef";
+
+   if (length > 16)
+  return NULL;
+
+   char buf[16];
+   char *rndstr;
+
+   for (int i = 0; i < length - 1; ++i) {
+   // assign a random element from the lookup table
+   buf[i] = a[rand() % (sizeof(a) - 1)];
+   }
+
+   buf[length - 1] = 0;
+
+   if (asprintf(&rndstr, "%s", buf) == -1)
+  return NULL;
+
+   return rndstr;
+}
+
 static void
 cache_put(void *job, int thread_index)
 {
@@ -855,7 +877,7 @@ cache_put(void *job, int thread_index)
 
int fd = -1, fd_final = -1, err, ret;
unsigned i = 0;
-   char *filename = NULL, *filename_tmp = NULL;
+   char *filename = NULL, *filename_tmp = NULL, *random = NULL;
struct disk_cache_put_job *dc_job = (struct disk_cache_put_job *) job;
 
filename = get_cache_file(dc_job->cache, dc_job->key);
@@ -873,7 +895,16 @@ cache_put(void *job, int thread_index)
 * final destination filename, (to prevent any readers from seeing
 * a partially written file).
 */
-   if (asprintf(&filename_tmp, "%s.tmp", filename) == -1)
+
+   /* This next part used to be an flock(), which would prevent windows systems
+* to build. 4 hex characters should be enough to prevent filename race
+* conditions for now.
+   */
+   random = generate_random_string(4);
+   if (random == NULL)
+  goto done;
+
+   if (asprintf(&filename_tmp, "%s_%s.tmp", filename, random) == -1)
   goto done;
 
fd = open(filename_tmp, O_WRONLY | O_CLOEXEC | O_CREAT, 0644);
@@ -890,16 +921,7 @@ cache_put(void *job, int thread_index)
  goto done;
}
 
-   /* With the temporary file open, we take an exclusive flock on
-* it. If the flock fails, then another process still has the file
-* open with the flock held. So just let that file be responsible
-* for writing the file.
-*/
-   err = flock(fd, LOCK_EX | LOCK_NB);
-   if (err == -1)
-  goto done;
-
-   /* Now that we have the lock on the open temporary file, we can
+   /* Now that we have the open temporary file, we can
 * check to see if the destination file already exists. If so,
 * another process won the race between when we saw that the file
 * didn't exist and now. In this case, we don't do anything more,
-- 
2.14.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [RFC PATCH] Replace an flock with a random filename to evade some very ugly system dependent code

2018-05-20 Thread Benedikt Schemmer
Thanks, fixed locally.

Am 20.05.2018 um 14:46 schrieb Mariusz Ceier:
> On 20 May 2018 at 14:16, Benedikt Schemmer  wrote:
>> There is exactly one flock in mesa and it caused mesa not to build
>> on windows when shader cache was enabled.
>>
>> It should be possible to revert 9f8dc3bf03ec825bae7041858dda6ca2e9a34363
>> "utils: build sha1/disk cache only with Android/Autoconf" currently
>> guarding the offending code with ENABLE_SHADER_CACHE
>>
>> This would allow shader cache to work on windows I think.
>>
>> I dont have a test system with windows though.
>> This builds on linux and is tested with Deus Ex:MD and Metro 2033 Redux
>> both with cold shader cache.
>>
>> Really
>> Fixes: d1efa09d342bff3e5def2978a0bef748d74f9c82
>>
>> CC: Tapani Pälli 
>> CC: "Marek Olšák" 
>> CC: Emil Velikov 
>> CC: Timothy Arceri 
>> CC: Samuel Pitoiset 
>> ---
>> This enables the patch
>> [PATCH 1/3] mesa/main/shaderapi: Use generate_sha1() unconditionally
>>
>>  src/util/disk_cache.c | 48 +++-
>>  1 file changed, 35 insertions(+), 13 deletions(-)
>>
>> diff --git a/src/util/disk_cache.c b/src/util/disk_cache.c
>> index 4a762eff20..ca47bb15fb 100644
>> --- a/src/util/disk_cache.c
>> +++ b/src/util/disk_cache.c
>> @@ -28,7 +28,6 @@
>>  #include 
>>  #include 
>>  #include 
>> -#include 
>>  #include 
>>  #include 
>>  #include 
>> @@ -848,6 +847,29 @@ struct cache_entry_file_data {
>> uint32_t uncompressed_size;
>>  };
>>
>> +static char *
>> +generate_random_string(int length) {
>> +   static const char a[] = "0123456789abcdef";
>> +
>> +   if (length > 16)
>> +  return NULL;
>> +
>> +   char buf[16];
>> +   char *rndstr;
>> +
>> +   for (int i = 0; i < length - 1; ++i) {
>> +   // assign a random element from the lookup table
>> +   buf[i] = a[rand() % (sizeof(a) - 1)];
>> +   }
>> +
>> +   buf[length - 1] = 0;
>> +
>> +   if (asprintf(&rndstr, "%s", buf) == -1)
>> +  return NULL;
>> +
>> +   return rndstr;
>> +}
>> +
>>  static void
>>  cache_put(void *job, int thread_index)
>>  {
>> @@ -855,7 +877,7 @@ cache_put(void *job, int thread_index)
>>
>> int fd = -1, fd_final = -1, err, ret;
>> unsigned i = 0;
>> -   char *filename = NULL, *filename_tmp = NULL;
>> +   char *filename = NULL, *filename_tmp = NULL, *random = NULL;
>> struct disk_cache_put_job *dc_job = (struct disk_cache_put_job *) job;
>>
>> filename = get_cache_file(dc_job->cache, dc_job->key);
>> @@ -873,7 +895,16 @@ cache_put(void *job, int thread_index)
>>  * final destination filename, (to prevent any readers from seeing
>>  * a partially written file).
>>  */
>> -   if (asprintf(&filename_tmp, "%s.tmp", filename) == -1)
>> +
>> +   /* This next part used to be an flock(), which would prevent windows 
>> systems
>> +* to build. 4 hex characters should be enough to prevent filename race
>> +* conditions for now.
>> +   */
>> +   random = generate_random_string(4);
>> +   if (random == NULL)
>> +  goto done;
>> +
>> +   if (asprintf(&filename_tmp, "%s_%s.tmp", filename, random) == -1)
>>goto done;
>>
> 
> You forgot to free the random string.
> 
>> fd = open(filename_tmp, O_WRONLY | O_CLOEXEC | O_CREAT, 0644);
>> @@ -890,16 +921,7 @@ cache_put(void *job, int thread_index)
>>   goto done;
>> }
>>
>> -   /* With the temporary file open, we take an exclusive flock on
>> -* it. If the flock fails, then another process still has the file
>> -* open with the flock held. So just let that file be responsible
>> -* for writing the file.
>> -*/
>> -   err = flock(fd, LOCK_EX | LOCK_NB);
>> -   if (err == -1)
>> -  goto done;
>> -
>> -   /* Now that we have the lock on the open temporary file, we can
>> +   /* Now that we have the open temporary file, we can
>>  * check to see if the destination file already exists. If so,
>>  * another process won the race between when we saw that the file
>>  * didn't exist and now. In this case, we don't do anything more,
>> --
>> 2.14.1
>>
>> ___
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [RFC PATCH v2] Evade very ugly system dependent code by replacing an flock

2018-05-20 Thread Benedikt Schemmer
There is exactly one flock in mesa and it caused mesa not to build
on windows when shader cache was enabled.

It should be possible to revert 9f8dc3bf03ec825bae7041858dda6ca2e9a34363
"utils: build sha1/disk cache only with Android/Autoconf" currently
guarding the offending code with ENABLE_SHADER_CACHE

This would allow shader cache to work on windows I think.
I dont have a test system with windows though.

This builds on linux and is tested with Deus Ex:MD and Metro 2033 Redux
both with cold shader cache.

Really
Fixes: d1efa09d342bff3e5def2978a0bef748d74f9c82

v2: make the patch ridiculously simple (Jan Vesely) needs testing though

CC: Tapani Pälli 
CC: "Marek Olšák" 
CC: Emil Velikov 
CC: Timothy Arceri 
CC: Samuel Pitoiset 
CC: Jan Vesely 
---
This enables the patch
[PATCH 1/3] mesa/main/shaderapi: Use generate_sha1() unconditionally

 src/util/disk_cache.c | 14 ++
 1 file changed, 2 insertions(+), 12 deletions(-)

diff --git a/src/util/disk_cache.c b/src/util/disk_cache.c
index 4a762eff20..35847af6a6 100644
--- a/src/util/disk_cache.c
+++ b/src/util/disk_cache.c
@@ -28,7 +28,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -876,7 +875,7 @@ cache_put(void *job, int thread_index)
if (asprintf(&filename_tmp, "%s.tmp", filename) == -1)
   goto done;
 
-   fd = open(filename_tmp, O_WRONLY | O_CLOEXEC | O_CREAT, 0644);
+   fd = open(filename_tmp, O_WRONLY | O_CLOEXEC | O_CREAT | O_EXCL, 0644);
 
/* Make the two-character subdirectory within the cache as needed. */
if (fd == -1) {
@@ -890,16 +889,7 @@ cache_put(void *job, int thread_index)
  goto done;
}
 
-   /* With the temporary file open, we take an exclusive flock on
-* it. If the flock fails, then another process still has the file
-* open with the flock held. So just let that file be responsible
-* for writing the file.
-*/
-   err = flock(fd, LOCK_EX | LOCK_NB);
-   if (err == -1)
-  goto done;
-
-   /* Now that we have the lock on the open temporary file, we can
+   /* Now that we have the open temporary file, we can
 * check to see if the destination file already exists. If so,
 * another process won the race between when we saw that the file
 * didn't exist and now. In this case, we don't do anything more,
-- 
2.14.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [RFC PATCH] Replace an flock with a random filename to evade some very ugly system dependent code

2018-05-21 Thread Benedikt Schemmer


Am 21.05.2018 um 01:19 schrieb Timothy Arceri:
> 
> 
> On 20/05/18 22:16, Benedikt Schemmer wrote:
>> There is exactly one flock in mesa and it caused mesa not to build
>> on windows when shader cache was enabled.
>>
>> It should be possible to revert 9f8dc3bf03ec825bae7041858dda6ca2e9a34363
>> "utils: build sha1/disk cache only with Android/Autoconf" currently
>> guarding the offending code with ENABLE_SHADER_CACHE
>>
>> This would allow shader cache to work on windows I think.
> 
> NAK. rand() is not thread safe.
> 
> Why are you trying to make this build on windows? 

Appveyor
https://ci.appveyor.com/project/mesa3d/mesa/build/3186

was the reason why large parts are guarded by ENABLE_SHADER_CACHE
which prevents me from using a simple function in the patch I mentioned

I can of course circumvent that by just duplicating the code I
need.

It just seems cleaner to have no system dependent code here.
But you're right: its quite convoluted and every time I think I got it
something else pops up ;)

There are also other dependencies (dlfcn.h) that need to be
removed. I already have, at least to some degree. I dont know how to eliminate
the runtime check for the llvm build though, but that is only used by amd code
so I moved it to ac_llvm_util.h for the moment which seems to work.

si_pipe.c somewhere around line 800 now looks like:
---
if (ac_get_function_timestamp(LLVMInitializeAMDGPUTargetInfo,
  &llvm_timestamp)) {
res = asprintf(×tamp_str, "%s_%u",
   __TIMESTAMP__, llvm_timestamp);
}
---

>There is no use case for this on windows currently so it won't even be tested. 
>There are also other calls such as getuid() etc that >will fail on windows.
> 
> If someone want this to work on windows they should just write windows 
> specific paths IMO.
> 
>>
>> I dont have a test system with windows though.
>> This builds on linux and is tested with Deus Ex:MD and Metro 2033 Redux
>> both with cold shader cache.
>>
>> Really
>> Fixes: d1efa09d342bff3e5def2978a0bef748d74f9c82
>>
>> CC: Tapani Pälli 
>> CC: "Marek Olšák" 
>> CC: Emil Velikov 
>> CC: Timothy Arceri 
>> CC: Samuel Pitoiset 
>> ---
>> This enables the patch
>> [PATCH 1/3] mesa/main/shaderapi: Use generate_sha1() unconditionally
>>
>>   src/util/disk_cache.c | 48 +++-
>>   1 file changed, 35 insertions(+), 13 deletions(-)
>>
>> diff --git a/src/util/disk_cache.c b/src/util/disk_cache.c
>> index 4a762eff20..ca47bb15fb 100644
>> --- a/src/util/disk_cache.c
>> +++ b/src/util/disk_cache.c
>> @@ -28,7 +28,6 @@
>>   #include 
>>   #include 
>>   #include 
>> -#include 
>>   #include 
>>   #include 
>>   #include 
>> @@ -848,6 +847,29 @@ struct cache_entry_file_data {
>>  uint32_t uncompressed_size;
>>   };
>>   +static char *
>> +generate_random_string(int length) {
>> +   static const char a[] = "0123456789abcdef";
>> +
>> +   if (length > 16)
>> +  return NULL;
>> +
>> +   char buf[16];
>> +   char *rndstr;
>> +
>> +   for (int i = 0; i < length - 1; ++i) {
>> +   // assign a random element from the lookup table
>> +   buf[i] = a[rand() % (sizeof(a) - 1)];
>> +   }
>> +
>> +   buf[length - 1] = 0;
>> +
>> +   if (asprintf(&rndstr, "%s", buf) == -1)
>> +  return NULL;
>> +
>> +   return rndstr;
>> +}
>> +
>>   static void
>>   cache_put(void *job, int thread_index)
>>   {
>> @@ -855,7 +877,7 @@ cache_put(void *job, int thread_index)
>>    int fd = -1, fd_final = -1, err, ret;
>>  unsigned i = 0;
>> -   char *filename = NULL, *filename_tmp = NULL;
>> +   char *filename = NULL, *filename_tmp = NULL, *random = NULL;
>>  struct disk_cache_put_job *dc_job = (struct disk_cache_put_job *) job;
>>    filename = get_cache_file(dc_job->cache, dc_job->key);
>> @@ -873,7 +895,16 @@ cache_put(void *job, int thread_index)
>>   * final destination filename, (to prevent any readers from seeing
>>   * a partially written file).
>>   */
>> -   if (asprintf(&filename_tmp, "%s.tmp", filename) == -1)
>> +
>> +   /* This next part used to be an flock(), which would prevent windows 
>> systems
>> +    * to build. 4 hex characters should be enough to prevent filename race
>> +    * conditions for now.
>> +   */
>> +   random = generat

Re: [Mesa-dev] [RFC PATCH] Replace an flock with a random filename to evade some very ugly system dependent code

2018-05-21 Thread Benedikt Schemmer
Ok, small update. Please ignore this rfc.
I finally got appveyor to build my repo.

There are at least these dependencies that would need to addressed (for anybody 
wanting tor try):

#include  // can be eliminated sometimes by replacing with 
__TIMESTAMP__ or maybe better __DATE__ __TIME__
#include 
#include 
#include 
#include 
#include "zlib.h" // complains about redefinition vsnprintf (I used version 
1.23)


So for now: code duplication it is ;)


appveyor has mingw available, maybe I will give that a try sometime
https://www.appveyor.com/docs/build-environment/

and they even have an ubuntu image available
https://www.appveyor.com/docs/getting-started-with-appveyor-for-linux/

wonder why that isnt being used instead of msvc 2015 with llvm 3.3.1?

Am 21.05.2018 um 01:19 schrieb Timothy Arceri:
> 
> 
> On 20/05/18 22:16, Benedikt Schemmer wrote:
>> There is exactly one flock in mesa and it caused mesa not to build
>> on windows when shader cache was enabled.
>>
>> It should be possible to revert 9f8dc3bf03ec825bae7041858dda6ca2e9a34363
>> "utils: build sha1/disk cache only with Android/Autoconf" currently
>> guarding the offending code with ENABLE_SHADER_CACHE
>>
>> This would allow shader cache to work on windows I think.
> 
> NAK. rand() is not thread safe.
> 
> Why are you trying to make this build on windows? There is no use case for 
> this on windows currently so it won't even be tested. There are also other 
> calls such as getuid() etc that will fail on windows.
> 
> If someone want this to work on windows they should just write windows 
> specific paths IMO.
> 
>>
>> I dont have a test system with windows though.
>> This builds on linux and is tested with Deus Ex:MD and Metro 2033 Redux
>> both with cold shader cache.
>>
>> Really
>> Fixes: d1efa09d342bff3e5def2978a0bef748d74f9c82
>>
>> CC: Tapani Pälli 
>> CC: "Marek Olšák" 
>> CC: Emil Velikov 
>> CC: Timothy Arceri 
>> CC: Samuel Pitoiset 
>> ---
>> This enables the patch
>> [PATCH 1/3] mesa/main/shaderapi: Use generate_sha1() unconditionally
>>
>>   src/util/disk_cache.c | 48 +++-
>>   1 file changed, 35 insertions(+), 13 deletions(-)
>>
>> diff --git a/src/util/disk_cache.c b/src/util/disk_cache.c
>> index 4a762eff20..ca47bb15fb 100644
>> --- a/src/util/disk_cache.c
>> +++ b/src/util/disk_cache.c
>> @@ -28,7 +28,6 @@
>>   #include 
>>   #include 
>>   #include 
>> -#include 
>>   #include 
>>   #include 
>>   #include 
>> @@ -848,6 +847,29 @@ struct cache_entry_file_data {
>>  uint32_t uncompressed_size;
>>   };
>>   +static char *
>> +generate_random_string(int length) {
>> +   static const char a[] = "0123456789abcdef";
>> +
>> +   if (length > 16)
>> +  return NULL;
>> +
>> +   char buf[16];
>> +   char *rndstr;
>> +
>> +   for (int i = 0; i < length - 1; ++i) {
>> +   // assign a random element from the lookup table
>> +   buf[i] = a[rand() % (sizeof(a) - 1)];
>> +   }
>> +
>> +   buf[length - 1] = 0;
>> +
>> +   if (asprintf(&rndstr, "%s", buf) == -1)
>> +  return NULL;
>> +
>> +   return rndstr;
>> +}
>> +
>>   static void
>>   cache_put(void *job, int thread_index)
>>   {
>> @@ -855,7 +877,7 @@ cache_put(void *job, int thread_index)
>>    int fd = -1, fd_final = -1, err, ret;
>>  unsigned i = 0;
>> -   char *filename = NULL, *filename_tmp = NULL;
>> +   char *filename = NULL, *filename_tmp = NULL, *random = NULL;
>>  struct disk_cache_put_job *dc_job = (struct disk_cache_put_job *) job;
>>    filename = get_cache_file(dc_job->cache, dc_job->key);
>> @@ -873,7 +895,16 @@ cache_put(void *job, int thread_index)
>>   * final destination filename, (to prevent any readers from seeing
>>   * a partially written file).
>>   */
>> -   if (asprintf(&filename_tmp, "%s.tmp", filename) == -1)
>> +
>> +   /* This next part used to be an flock(), which would prevent windows 
>> systems
>> +    * to build. 4 hex characters should be enough to prevent filename race
>> +    * conditions for now.
>> +   */
>> +   random = generate_random_string(4);
>> +   if (random == NULL)
>> +  goto done;
>> +
>> +   if (asprintf(&filename_tmp, "%s_%s.tmp", filename, random) == -1)
>>     goto done;
>>    fd = open(filename_tmp, O_WRONLY | O_CLOEXEC | O_CREAT, 0644);
>&

Re: [Mesa-dev] [PATCH 3/3] mesa/main/shaderapi: purely non-functional cleanups, like whitespace errors and cleanups

2018-05-21 Thread Benedikt Schemmer


Am 21.05.2018 um 19:21 schrieb Ian Romanick:
> On 05/10/2018 02:05 AM, Benedikt Schemmer wrote:
>> remove a memset too and yes, this is all functionally identical
>>
>> ---
>>  src/mesa/main/shaderapi.c | 40 
>>  1 file changed, 20 insertions(+), 20 deletions(-)
>>
>> diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
>> index e8acca4490..1d0ca5374b 100644
>> --- a/src/mesa/main/shaderapi.c
>> +++ b/src/mesa/main/shaderapi.c
>> @@ -241,11 +241,10 @@ _mesa_init_shader_state(struct gl_context *ctx)
>> /* Device drivers may override these to control what kind of instructions
>>  * are generated by the GLSL compiler.
>>  */
>> -   struct gl_shader_compiler_options options;
>> +   struct gl_shader_compiler_options options = {};
>> gl_shader_stage sh;
>> int i;
>>
>> -   memset(&options, 0, sizeof(options));
> 
> This is not functionally the same.  The memset will zero padding fields
> added by the compiler, but '= {}' does not.

I did check with
https://godbolt.org/
and at least for clang (which generates a memset for {}) and gcc it is exactly 
the same

void test(void) {
struct gl_shader_compiler_options options = {};
memset(&options, 0, sizeof(options));
}

gcc:
  // = {}
  mov QWORD PTR [rbp-32], 0
  mov QWORD PTR [rbp-24], 0
  mov QWORD PTR [rbp-16], 0

  // memset
  lea rax, [rbp-32]
  mov edx, 24
  mov esi, 0
  mov rdi, rax
  call memset

clang:

  // = {}
  mov rdi, rsi
  mov qword ptr [rbp - 32], rsi # 8-byte Spill
  mov esi, eax
  mov qword ptr [rbp - 40], rdx # 8-byte Spill
  mov dword ptr [rbp - 44], eax # 4-byte Spill
  call memset

  //memset
  mov rdx, qword ptr [rbp - 32] # 8-byte Reload
  mov rdi, rdx
  mov esi, dword ptr [rbp - 44] # 4-byte Reload
  mov rdx, qword ptr [rbp - 40] # 8-byte Reload
  call memset

but your right intel compilers generate something weird:

  // = {}
  lea rax, QWORD PTR [-32+rbp] #71.47
  mov edx, 0 #71.47
  mov ecx, 24 #71.47
  mov rdi, rax #71.47
  mov eax, edx #71.47
  and eax, 65535 #71.47
  mov ah, al #71.47
  mov edx, eax #71.47
  shl eax, 16 #71.47
  or eax, edx #71.47
  mov esi, ecx #71.47
  shr rcx, 2 #71.47
  rep stosd #71.47
  mov ecx, esi #71.47
  and ecx, 3 #71.47
  rep stosb #71.47

  // memset
  lea rax, QWORD PTR [-32+rbp] #72.5
  mov edx, 0 #72.5
  mov ecx, 24 #72.5
  mov rdi, rax #72.5
  mov esi, edx #72.5
  mov rdx, rcx #72.5
  call memset #72.5
  mov QWORD PTR [-8+rbp], rax #72.5


> 
> I'm also not fond of the The '!= 0' and '== NULL' changes.  Pretty much
> everywhere in core Mesa uses those patterns.>
> I feel like most of this is just a bunch of spurious changes that will
> just make cherry picking patches to stable irritating later on.
> 
>> options.MaxUnrollIterations = 32;
>> options.MaxIfDepth = UINT_MAX;
>>
>> @@ -254,7 +253,7 @@ _mesa_init_shader_state(struct gl_context *ctx)
>>
>> ctx->Shader.Flags = _mesa_get_shader_flags();
>>
>> -   if (ctx->Shader.Flags != 0)
>> +   if (ctx->Shader.Flags)
>>ctx->Const.GenerateTemporaryNames = true;
>>
>> /* Extended for ARB_separate_shader_objects */
>> @@ -771,7 +770,8 @@ get_programiv(struct gl_context *ctx, GLuint program, 
>> GLenum pname,
>>GLint *params)
>>  {
>> struct gl_shader_program *shProg
>> -  = _mesa_lookup_shader_program_err(ctx, program, 
>> "glGetProgramiv(program)");
>> +  = _mesa_lookup_shader_program_err(ctx, program,
>> +"glGetProgramiv(program)");
>>
>> /* Is transform feedback available in this context?
>>  */
>> @@ -953,7 +953,7 @@ get_programiv(struct gl_context *ctx, GLuint program, 
>> GLenum pname,
>>*params = shProg->BinaryRetreivableHint;
>>return;
>> case GL_PROGRAM_BINARY_LENGTH:
>> -  if (ctx->Const.NumProgramBinaryFormats == 0) {
>> +  if (!ctx->Const.NumProgramBinaryFormats) {
>>   *params = 0;
>>} else {
>>   _mesa_get_program_binary_length(ctx, shProg, params);
>> @@ -974,7 +974,7 @@ get_programiv(struct gl_context *ctx, GLuint program, 
>> GLenum pname,
>>   "linked)");
>>   return;
>>}
>> -  if (shProg->_LinkedShaders[MESA_SHADER_COMPUTE] == NULL) {
>> +  if (!shProg->_LinkedShaders[MESA_SHADER_COMPUTE]) {
>>   _mesa_error(ctx, GL_INVALID_OPERATION, "glGetProgramiv(no compute "
>>   "shaders)");
>>   return;
>> 

[Mesa-dev] [HOWTO] CI on appveyor with a linux image

2018-05-21 Thread Benedikt Schemmer
I thought this might be interesting because currently appveyor is only used 
with MSVC and LLVM 3.3.1

They can however provide Ubuntu 16.04 LTS. Its free for OSS.
You only need some kind of repository.

Basic setup is easy
https://www.appveyor.com/docs/

The hard part is the appveyor.yml file, which you can find below.
This builds mesa every time you commit.

Probably not the best config options and only for 64-bit, because I dont know 
how to do 32-bit builds on Ubuntu.
So if somebody does, help is more than welcome.

And if the Vmware guys feel like sharing, this can be merged with the current 
appveyor.yml (somehow, doesn't seem to hard though)

p.s. there are line breaks after the autogen that you will have to remove 
manually (should be one long line)
---
# http://www.appveyor.com/doc
#
# To setup AppVeyor for your own personal repositories do the following:
# - Sign up
# - Add a new project
# - Select Git and fill in the Git clone URL
# - Setup a Git hook as explained in
#   https://github.com/appveyor/webhooks#installing-git-hook
# - Check 'Settings > General > Skip branches without appveyor.yml'
# - Check 'Settings > General > Rolling builds'
# - Setup the global or project notifications to your liking
#
# Note that kicking (or restarting) a build via the web UI will not work, as it
# will fail to find appveyor.yml .  The Git hook is the most practical way to
# kick a build.
#
# See also:
# - 
http://help.appveyor.com/discussions/problems/2209-node-grunt-build-specify-a-project-or-solution-file-the-directory-does-not-contain-a-project-or-solution-file
# - 
http://help.appveyor.com/discussions/questions/1184-build-config-vs-appveyoryaml

version: '{build}'

branches:
  except:
  - /^travis.*$/

# Don't download the full Mesa history to speed up cloning.  However the clone
# depth must not be too small, otherwise builds might fail when lots of patches
# are committed in succession, because the desired commit is not found on the
# truncated history.
#
# See also:
# - https://www.appveyor.com/blog/2014/06/04/shallow-clone-for-git-repositories
clone_depth: 100

image: ubuntu
platform: Any CPU
configuration: Release


install:
- ls -al
- sh: sudo sed -i~orig -e 's$# deb-src http://us$deb-src http://us$' 
/etc/apt/sources.list
- sh: sudo add-apt-repository ppa:oibaf/graphics-drivers
- sh: sudo apt-get update
- sh: sudo DEBIAN_FRONTEND=noninteractive apt-get -y -o 
Dpkg::Options::="--force-confdef" -o Dpkg::Options::="--force-confold" upgrade
- sh: sudo apt-get build-dep mesa -y
- sh: sudo apt-get install libxvmc-dev libxcb-xvmc0-dev libomxil-bellagio-dev -y

build_script:
- sh: ./autogen.sh --enable-dri --with-dri-drivers="nouveau i915 i965 r200 
radeon swrast" --enable-osmesa --enable-glx-tls --enable-shared-glapi 
--enable-texture-float --enable-driglx-direct
--enable-dri3 --with-platforms="x11 wayland drm" --enable-xa --enable-llvm 
ac_cv_path_LLVM_CONFIG=llvm-config-5.0 --enable-vdpau --enable-omx-bellagio 
--enable-va --enable-xvmc --enable-opencl
--enable-opencl-icd --enable-nine --enable-gallium-extra-hud --enable-lmsensors 
--with-gallium-drivers=" nouveau svga r600 r300 i915 virgl radeonsi swrast" 
--enable-gles1 --enable-gles2 --enable-gle
--with-vulkan-drivers=intel,radeon
- sh: make
- sh: sudo make install
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Status of radeonsi NIR

2018-03-04 Thread Benedikt Schemmer

Hi all,
I thought I'd do some testing on my machine to see if there are any 
problems with nir (enabled with R600_DEBUG=nir)


Tested on Ubuntu 17.10 with mesa git of today
(oibaf for 32 bit, mine with LLVM5.0 on 64 bit because I cant figure out 
how to build 32bit drivers on ubuntu :( )

Xeon 1260l and 8GB DDR3-10600 with RX460 4GB

works means no obvious visual artifacts etc.
I didn't have time to do detailed benchmarking.

---

Alien Isolation - works

Bioshock Infinite - works, nonir Overall avg 45.58, min 9.81, max 
104.49; nir avg 31.47 min 8.35 max 94.19


CAT Interstellar (UE4) - works

Dead Island Definitive - works

Dead Island Riptide Definitive - works

Deus Ex: Mankind Divided - works, nonir low 8.3 avg 24.2 high 37.6; nir 
low 1 avg 22.8 high 35.8
but shader generation crashes two times: once around half the loader bar 
with complete system freeze and then again around the end but only 
exiting the applicationand this is reproducible by deleting the shader cache


Mad Max[vulcan beta] - works
benchmark 1
nonir 43/45/47
nir 10/46/50
benchmark 2
nonir 14/37/41
nir 24/36/40
benchmark 3
nonir 8/43/61
nir 5/41/61
benchmark 4
nonir 9/49/78
nir 4/47/74

Tomb Raider - TressFX doesn't work: hair flickering and is riddled with 
small green block artifacts, smoothing is missing; speed is the same 
28.6 ish


Hitman GO (Unity) - works

Unity of Command (PyGame, SDL) - works

Talos Principle Benchmark nir 64.7 nonir 67


I have more games if there are any usual suspects that are likely to 
cause problems.


Cheers,
Benedikt
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Status of radeonsi NIR

2018-03-07 Thread Benedikt Schemmer
Hi Timothy,

actually I am a subscriber to the list, dont know why this happens from
time to time

Benchmarkwise: NIR is now on par with TGSI, I put up some new HUD
screenshots and benchmarks

You might actually be too conservative: TGSI shows 2 spills, NIR 0


For now only two things stand out: some crashes during shader generation
in Deus Ex, Dying Light and Dead Island (Original) (see nir test) and
TressFX not working correctly in Tomb Raider.



Am 07.03.2018 um 05:16 schrieb Timothy Arceri:
> On 06/03/18 23:49, Benedikt Schemmer wrote:
>> Hi Timothy,
>>
>> I put some additional benchmark results and data on github for you:
>
> Thanks!
>
>>
>> https://github.com/bendat78/mymesa/tree/mymesa2/testresults
>>
>> I use a resolution of 1920x1080 for bioshock but also custom settings
>> with all options enabled
>
> Ok I'll try some different combos and see if I can find any
> performance regressions with nir.
>
>>
>> I also wrote another email for the list, but it hasnt appeared yet.
>
> If you are not subscribed to the list you will need to wait until
> someone approves it (which doesn't always happen as people are busy),
> it's usually just easier to subscribe to the list and filter the
> emails into a folder.
>
>>
>> A log of my testing can be found in the "nir test" file
>>
>>
>>
>> Am 06.03.2018 um 07:38 schrieb Timothy Arceri:
>>> On 05/03/18 03:43, Benedikt Schemmer wrote:
>>>> Hi all,
>>>> I thought I'd do some testing on my machine to see if there are any
>>>> problems with nir (enabled with R600_DEBUG=nir)
>>>>
>>>> Tested on Ubuntu 17.10 with mesa git of today
>>>> (oibaf for 32 bit, mine with LLVM5.0 on 64 bit because I cant
>>>> figure out how to build 32bit drivers on ubuntu :( )
>>>> Xeon 1260l and 8GB DDR3-10600 with RX460 4GB
>>>>
>>>> works means no obvious visual artifacts etc.
>>>> I didn't have time to do detailed benchmarking.
>>>>
>>>> ---
>>>>
>>>> Alien Isolation - works
>>>>
>>>> Bioshock Infinite - works, nonir Overall avg 45.58, min 9.81, max
>>>> 104.49; nir avg 31.47 min 8.35 max 94.19
>>>
>>> Hi,
>>>
>>> Do you recall what resolution / settings you used?
>>>
>>> With a quick test I'm seeing pretty good results on Ultra.
>>>
>>> 3840 x 2160 - Ultra on RX 580: (tgsi) 90.89, (nir) 107.22
>>>
>>>
>>>>
>>>> CAT Interstellar (UE4) - works
>>>>
>>>> Dead Island Definitive - works
>>>>
>>>> Dead Island Riptide Definitive - works
>>>>
>>>> Deus Ex: Mankind Divided - works, nonir low 8.3 avg 24.2 high 37.6;
>>>> nir low 1 avg 22.8 high 35.8
>>>> but shader generation crashes two times: once around half the
>>>> loader bar with complete system freeze and then again around the
>>>> end but only exiting the applicationand this is reproducible by
>>>> deleting the shader cache
>>>>
>>>> Mad Max[vulcan beta] - works
>>>> benchmark 1
>>>> nonir 43/45/47
>>>> nir 10/46/50
>>>> benchmark 2
>>>> nonir 14/37/41
>>>> nir 24/36/40
>>>> benchmark 3
>>>> nonir 8/43/61
>>>> nir 5/41/61
>>>> benchmark 4
>>>> nonir 9/49/78
>>>> nir 4/47/74
>>>>
>>>> Tomb Raider - TressFX doesn't work: hair flickering and is riddled
>>>> with small green block artifacts, smoothing is missing; speed is
>>>> the same 28.6 ish
>>>>
>>>> Hitman GO (Unity) - works
>>>>
>>>> Unity of Command (PyGame, SDL) - works
>>>>
>>>> Talos Principle Benchmark nir 64.7 nonir 67
>>>>
>>>>
>>>> I have more games if there are any usual suspects that are likely
>>>> to cause problems.
>>>>
>>>> Cheers,
>>>> Benedikt
>>

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] Question: Underutilization of GPU in certain benchmarks - Bug or Feature ?

2018-03-07 Thread Benedikt Schemmer
Hi all,

while looking for problems with NIR, I noticed that in the Bioshock
Infinite and also Dirt Rallye Benchmark there are times when GPU shader
utilization drops to 80% (Bioshock) or 60% (Dirt Rallye).

In Bioshock this directly associated with fps drops, in Dirt Rallye it
is not so obvious but higher utilization means higher fps.

Is this normal? Am I looking at the wrong things?

Cheers,

Benedikt

P.S.

Pictures of the Benchmarks with Gallium HUD traces can be found at

https://github.com/bendat78/mymesa/tree/mymesa2/underutilization

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] Util: fix msvc build

2018-07-05 Thread Benedikt Schemmer
The MSVC preprocessor doesnt understand #warning

---
 src/util/process.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/util/process.c b/src/util/process.c
index 449c7fa77c2..6e6376986f3 100644
--- a/src/util/process.c
+++ b/src/util/process.c
@@ -107,7 +107,7 @@ __getProgramName()
 #define GET_PROGRAM_NAME() __getProgramName()
 #else
 #define GET_PROGRAM_NAME() ""
-#warning "Per application configuration won't work with your OS 
version."
+#pragma message ( "Warning: Per application configuration won't work 
with your OS version." )
 #endif
 #endif

-- 
2.17.1
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Status of radeonsi NIR

2018-03-30 Thread Benedikt Schemmer
Hi all,

I did some more testing with NIR and wanted to share the results.
https://github.com/bendat78/mymesa/tree/mymesa2/testresults

Overall it seems nir could be better than tgsi, but there are some
shaders with significant regressions and I think that introduces
framerate drops that are quite noticable in some games (Total War:
WARHAMMER benchmark for example).

See the run* directory for comparisons of tgsi vs nir for different
versions of llvm and tgsi or nir vs different llvm backends (also
includes complete shader db output (around 50k shaders)).

NIR gets better with every version increase of llvm, which optimizes for
code size (why not speed ?).

I found no serious regressions so far, however I think that NIR is
responsible for some crashes that happen in Metro 2033 Redux (beginning
before actual game) and Dead Island classic (ingame) so probably related
to compilation issues.

Cheers,
Benedikt
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] Disable bindless textures with radeonsi NIR until there is support for them.

2018-04-01 Thread Benedikt Schemmer
From: Benedikt Schemmer 
Date: Sun, 1 Apr 2018 13:18:02 +0200
Subject: [PATCH] Disable bindless textures with radeonsi NIR until there
is support for them.

- Allows to build and use a debug build of mesa with
 shader-db (crashes otherwise with affected shaders)

- Dirt Rally doesnt care about bindless, but Warhammer 40k Dawn 3 does
and crashes with nir when ARB_bindless_texture is reported

---
 src/gallium/drivers/radeonsi/si_get.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/si_get.c
b/src/gallium/drivers/radeonsi/si_get.c
--- a/src/gallium/drivers/radeonsi/si_get.c
+++ b/src/gallium/drivers/radeonsi/si_get.c
@@ -176,7 +176,6 @@ static int si_get_param(struct pipe_screen *pscreen,
enum pipe_cap param)
case PIPE_CAP_DOUBLES:
case PIPE_CAP_TGSI_TEX_TXF_LZ:
case PIPE_CAP_TGSI_TES_LAYER_VIEWPORT:
-   case PIPE_CAP_BINDLESS_TEXTURE:
case PIPE_CAP_QUERY_TIMESTAMP:
case PIPE_CAP_QUERY_TIME_ELAPSED:
case PIPE_CAP_NIR_SAMPLERS_AS_DEREF:
@@ -257,6 +256,11 @@ static int si_get_param(struct pipe_screen
*pscreen, enum pipe_cap param)
return 1;
return 0;

+   case PIPE_CAP_BINDLESS_TEXTURE:
+   if (sscreen->debug_flags & DBG(NIR))
+   return 0;
+   return 1;
+
/* Unsupported features. */
case PIPE_CAP_BUFFER_SAMPLER_VIEW_RGBA_ONLY:
case PIPE_CAP_TGSI_FS_COORD_ORIGIN_LOWER_LEFT:
-- 
2.14.1
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] Status of radeonsi NIR

2018-04-03 Thread Benedikt Schemmer
Hi Timothy,

I put up another run for you with llvm 7 and mesa git of this morning.
Also I pruned duplicate shaders (especially Metro Redux had significant
amounts of those) so we're down to ~24k unique shaders.

I'm going to use release builds for testing because there is no
difference to the debug build and it is significantly faster.

Shader-db run times with NIR are about 35% slower though. 540s vs 398s

Max Increase:

SGPRS: 32 -> 104 (225.00 %) (in
shaders/deadisland_definitive/1833.shader_test)
VGPRS: 48 -> 216 (350.00 %) (in shaders/dirtrallye/732.shader_test)
Spilled SGPRs: 8 -> 78 (875.00 %) (in shaders/f1_2015/1574.shader_test)
Spilled VGPRs: 71 -> 76 (7.04 %) (in shaders/dirtrallye/1264.shader_test)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 48 -> 56 (16.67 %) dwords per thread (in
shaders/dirtrallye/1264.shader_test)
Code Size: 2952 -> 12668 (329.13 %) bytes (in
shaders/alien_isolation/5324.shader_test)
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 4 -> 8 (100.00 %) (in shaders/borderlands2/2608.shader_test)
Wait states: 0 -> 0 (0.00 %)

Max Decrease:

SGPRS: 104 -> 40 (-61.54 %) (in shaders/csgo/1558.shader_test)
VGPRS: 188 -> 136 (-27.66 %) (in shaders/cat/2031.shader_test)
Spilled SGPRs: 24 -> 0 (-100.00 %) (in
shaders/total_war_shogun_2/1108.shader_test)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 48 -> 0 (-100.00 %) (in shaders/soma/98.shader_test)
Scratch size: 52 -> 0 (-100.00 %) dwords per thread (in
shaders/soma/98.shader_test)
Code Size: 17060 -> 7404 (-56.60 %) bytes (in
shaders/dolphin/ubershaders/228.shader_test)
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 8 -> 3 (-62.50 %) (in shaders/f1_2015/1196.shader_test)
Wait states: 0 -> 0 (0.00 %)
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Status of radeonsi NIR

2018-04-04 Thread Benedikt Schemmer
Hi Timothy,

thanks for looking into this.

Dead Island still crashes for me with NIR.

However when I attach apitrace it behaves even more strangely, both TGSI
and NIR crash. TGSI gets a little further. Without apitrace I can play
the game with TGSI and it reproducibly crashes during shader compilation
(Loading Resort screen) with NIR.

Dont spend too much time on this though I think it might be related to
changes in mesa itself (memory leaks?) not necessarily NIR.

I used
apitrace trace --output=/home/nano/deadbeaf5  %command%
in the launch options of steam, otherwise I dont even get this far or
apitrace disattaches itself from the process and no trace is output

tails
==> nircrash <==
186658 glTexSubImage2D(target = GL_TEXTURE_2D, level = 0, xoffset = 0,
yoffset = 0, width = 1280, height = 720, format = GL_RED, type =
GL_UNSIGNED_BYTE, pixels = blob(921600))
186659 glGenerateMipmap(target = GL_TEXTURE_2D)
186660 glXMakeCurrent(dpy = 0xcb62580, drawable = 0, ctx = NULL) = True
186661 glXMakeCurrent(dpy = 0xcb62580, drawable = 39845908, ctx =
0xce13840) = True
186662 glBindTexture(target = GL_TEXTURE_2D, texture = 129)
186663 glTexSubImage2D(target = GL_TEXTURE_2D, level = 0, xoffset = 0,
yoffset = 0, width = 640, height = 360, format = GL_RED, type =
GL_UNSIGNED_BYTE, pixels = blob(230400))
186664 glGenerateMipmap(target = GL_TEXTURE_2D)
186665 glXMakeCurrent(dpy = 0xcb62580, drawable = 0, ctx = NULL) = True
18 glXMakeCurrent(dpy = 0xcb62580, drawable = 39845908, ctx =
0xce13840) = True
186667 glBindTexture(target = GL_TEXTURE_2D, texture = 130)

==> nircrash2 <==
167095 glEnable(cap = GL_FRAMEBUFFER_SRGB)
167096 glClearColor(red = 0, green = 0, blue = 0, alpha = 1)
167097 glClear(mask = GL_COLOR_BUFFER_BIT)
167098 glClearColor(red = 0, green = 0, blue = 0, alpha = 0)
167099 glBindTexture(target = GL_TEXTURE_2D, texture = 128)
167100 glTexSubImage2D(target = GL_TEXTURE_2D, level = 0, xoffset = 0,
yoffset = 0, width = 1280, height = 720, format = GL_RED, type =
GL_UNSIGNED_BYTE, pixels = blob(921600))
167101 glGenerateMipmap(target = GL_TEXTURE_2D)
167102 glXMakeCurrent(dpy = 0xcc99820, drawable = 0, ctx = NULL) = True
167103 glXMakeCurrent(dpy = 0xcc99820, drawable = 39845908, ctx =
0xcf4bd50) = True
167104 glBindTexture(target = GL_TEXTURE_2D, texture = 129)

==> nircrash3 <==
156903 glEnable(cap = GL_FRAMEBUFFER_SRGB)
156904 glClearColor(red = 0, green = 0, blue = 0, alpha = 1)
156905 glClear(mask = GL_COLOR_BUFFER_BIT)
156906 glClearColor(red = 0, green = 0, blue = 0, alpha = 0)
156907 glBindTexture(target = GL_TEXTURE_2D, texture = 128)
156908 glTexSubImage2D(target = GL_TEXTURE_2D, level = 0, xoffset = 0,
yoffset = 0, width = 1280, height = 720, format = GL_RED, type =
GL_UNSIGNED_BYTE, pixels = blob(921600))
156909 glGenerateMipmap(target = GL_TEXTURE_2D)
156910 glXMakeCurrent(dpy = 0xc97f820, drawable = 0, ctx = NULL) = True
156911 glXMakeCurrent(dpy = 0xc97f820, drawable = 39845908, ctx =
0xcc57650) = True
156912 glBindTexture(target = GL_TEXTURE_2D, texture = 129)

==> tgsicrash <==
256272 glXMakeCurrent(dpy = 0xc6d26d0, drawable = 39845908, ctx =
0xc974890) = True
256273 glEnable(cap = GL_CULL_FACE)
256274 glFrontFace(mode = GL_CW)
256275 glDepthMask(flag = GL_FALSE)
256276 glXMakeCurrent(dpy = 0xc6d26d0, drawable = 0, ctx = NULL) = True
256277 glXMakeCurrent(dpy = 0xc6d26d0, drawable = 39845908, ctx =
0xc974890) = True
256278 glXMakeCurrent(dpy = 0xc6d26d0, drawable = 0, ctx = NULL) = True
256279 glXMakeCurrent(dpy = 0xc6d26d0, drawable = 39845908, ctx =
0xc974890) = True
256280 glXSwapIntervalMESA(interval = 0) = 0
256281 glXMakeCurrent(dpy = 0xc6d26d0, drawable = 0, ctx = NULL) = True

==> tgsicrash2 <==
294167 glXMakeCurrent(dpy = 0xcbbc820, drawable = 39845908, ctx =
0xce69b30) = True
294168 glEnable(cap = GL_CULL_FACE)
294169 glFrontFace(mode = GL_CW)
294170 glDepthMask(flag = GL_FALSE)
294171 glXMakeCurrent(dpy = 0xcbbc820, drawable = 0, ctx = NULL) = True
294172 glXMakeCurrent(dpy = 0xcbbc820, drawable = 39845908, ctx =
0xce69b30) = True
294173 glXMakeCurrent(dpy = 0xcbbc820, drawable = 0, ctx = NULL) = True
294174 glXMakeCurrent(dpy = 0xcbbc820, drawable = 39845908, ctx =
0xce69b30) = True
294175 glXSwapIntervalMESA(interval = 0) = 0
294176 glXMakeCurrent(dpy = 0xcbbc820, drawable = 0, ctx = NULL) = True



Am 04.04.2018 um 06:05 schrieb Timothy Arceri:
> On 31/03/18 02:44, Benedikt Schemmer wrote:
>> Hi all,
>>
>> I did some more testing with NIR and wanted to share the results.
>> https://github.com/bendat78/mymesa/tree/mymesa2/testresults
>>
>> Overall it seems nir could be better than tgsi, but there are some
>> shaders with significant regressions and I think that introduces
>> framerate drops that are quite noticable in some games (Total War:
>> WARHAMMER benchmark for example).
> 
> I ran the benchmark for this game and there was no real difference
> between the two backends. C

Re: [Mesa-dev] Status of radeonsi NIR

2018-04-04 Thread Benedikt Schemmer
No I dont.

32-bit is a problem, because Ubuntu wants to literally deinstall itself
before letting me do that (some dependency nonsense).

So I only build 64-bit myself.

For 32-bit & backup 64-bit I use the oibaf ppa which I updated this morning.

I wouldnt spend to much time right now, because mesa git feels a bit
broken. If I switch to stable versions like 17.3.8 most problems
disappear (but so does NIR I guess ;)


Am 04.04.2018 um 14:06 schrieb Timothy Arceri:
> On 04/04/18 21:51, Benedikt Schemmer wrote:
>> Hi Timothy,
>>
>> thanks for looking into this.
>>
>> Dead Island still crashes for me with NIR.
> 
> Just to make sure, are you building a 32bit version of Mesa? If not you
> might be running your system Mesa. If you are definitely building Mesa
> 32bit then I'll take a closer look tomorrow.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Status of radeonsi NIR

2018-04-04 Thread Benedikt Schemmer
Hi Timothy,

another game that is behaving strangly is Metro 2033 Redux, also crashes
earlier with apitrace attached. This time TGSI and NIR crash at about
the same time (press any key to continue screen)
Without apitrace TGSI works fine, NIR crashes like above.

==> metronir <==
2059446 glDisableVertexAttribArray(index = 8)
2059447 glDisableVertexAttribArray(index = 9)
2059448 glDisableVertexAttribArray(index = 10)
2059449 glBindBufferBase(target = GL_UNIFORM_BUFFER, index = 1, buffer = 3)
2059450 glBufferData(target = GL_UNIFORM_BUFFER, size = 352, data =
blob(352), usage = GL_DYNAMIC_DRAW)
2059451 glDrawElementsBaseVertex(mode = GL_TRIANGLES, count = 720, type
= GL_UNSIGNED_SHORT, indices = 0x176a6, basevertex = 15823)
2059452 glBindBufferBase(target = GL_UNIFORM_BUFFER, index = 1, buffer = 3)
2059453 glBufferData(target = GL_UNIFORM_BUFFER, size = 352, data =
blob(352), usage = GL_DYNAMIC_DRAW)
2059454 glDrawElementsBaseVertex(mode = GL_TRIANGLES, count = 711, type
= GL_UNSIGNED_SHORT, indices = 0x17c46, basevertex = 16303)
2059455 glBindBufferBase(target = GL_UNIFORM_BUFFER, index = 1, buffer = 3)

==> metrotgsi <==
1744537 glEnableVertexAttribArray(index = 4)
1744538 glVertexAttribIPointer(index = 4, size = 4, type = GL_SHORT,
stride = 32, pointer = 0x18)
1744539 glVertexAttribDivisor(index = 4, divisor = 0)
1744540 glDisableVertexAttribArray(index = 5)
1744541 glDisableVertexAttribArray(index = 6)
1744542 glDisableVertexAttribArray(index = 7)
1744543 glDisableVertexAttribArray(index = 8)
1744544 glDisableVertexAttribArray(index = 9)
1744545 glDisableVertexAttribArray(index = 10)
1744546 glBindBufferBase(target = GL_UNIFORM_BUFFER, index = 1, buffer = 3)

==> metrotgsi2 <==
2141178 glVertexAttribDivisor(index = 1, divisor = 0)
2141179 glEnableVertexAttribArray(index = 2)
2141180 glVertexAttribPointer(index = 2, size = 4, type =
GL_UNSIGNED_BYTE, normalized = GL_TRUE, stride = 32, pointer = 0x10)
2141181 glVertexAttribDivisor(index = 2, divisor = 0)
2141182 glEnableVertexAttribArray(index = 3)
2141183 glVertexAttribPointer(index = 3, size = 4, type =
GL_UNSIGNED_BYTE, normalized = GL_TRUE, stride = 32, pointer = 0x14)
2141184 glVertexAttribDivisor(index = 3, divisor = 0)
2141185 glEnableVertexAttribArray(index = 4)
2141186 glVertexAttribIPointer(index = 4, size = 4, type = GL_SHORT,
stride = 32, pointer = 0x18)
2141187 glVertexAttribDivisor(index = 4, divisor = 0) // incomplete


Am 04.04.2018 um 14:06 schrieb Timothy Arceri:
> On 04/04/18 21:51, Benedikt Schemmer wrote:
>> Hi Timothy,
>>
>> thanks for looking into this.
>>
>> Dead Island still crashes for me with NIR.
> 
> Just to make sure, are you building a 32bit version of Mesa? If not you
> might be running your system Mesa. If you are definitely building Mesa
> 32bit then I'll take a closer look tomorrow.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Status of radeonsi NIR

2018-04-04 Thread Benedikt Schemmer
Just for baseline:
I have no problem running and creating apitraces from
Bioshock Infinite or TombRaider, I think both 32-bit
CAT Interstellar which is 64-bit

using this command
R600_DEBUG=nir apitrace trace --output=/home/nano/bio2 %command%

and building either the 64 or 32 bit version of apitrace

Am 04.04.2018 um 14:06 schrieb Timothy Arceri:
> On 04/04/18 21:51, Benedikt Schemmer wrote:
>> Hi Timothy,
>>
>> thanks for looking into this.
>>
>> Dead Island still crashes for me with NIR.
> 
> Just to make sure, are you building a 32bit version of Mesa? If not you
> might be running your system Mesa. If you are definitely building Mesa
> 32bit then I'll take a closer look tomorrow.
R600_DEBUG=nir apitrace trace --output=/home/nano/bio2 %command%
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Status of radeonsi NIR

2018-04-04 Thread Benedikt Schemmer
RX460 4GB

Am 04.04.2018 um 15:59 schrieb Timothy Arceri:
> On 04/04/18 22:53, Benedikt Schemmer wrote:
>> Hi Timothy,
>>
>> another game that is behaving strangly is Metro 2033 Redux, also crashes
>> earlier with apitrace attached. This time TGSI and NIR crash at about
>> the same time (press any key to continue screen)
>> Without apitrace TGSI works fine, NIR crashes like above.
> 
> Strange. This game works fine for me on NIR, what card did you say you
> were testing on?
> 
>>
>> ==> metronir <==
>> 2059446 glDisableVertexAttribArray(index = 8)
>> 2059447 glDisableVertexAttribArray(index = 9)
>> 2059448 glDisableVertexAttribArray(index = 10)
>> 2059449 glBindBufferBase(target = GL_UNIFORM_BUFFER, index = 1, buffer
>> = 3)
>> 2059450 glBufferData(target = GL_UNIFORM_BUFFER, size = 352, data =
>> blob(352), usage = GL_DYNAMIC_DRAW)
>> 2059451 glDrawElementsBaseVertex(mode = GL_TRIANGLES, count = 720, type
>> = GL_UNSIGNED_SHORT, indices = 0x176a6, basevertex = 15823)
>> 2059452 glBindBufferBase(target = GL_UNIFORM_BUFFER, index = 1, buffer
>> = 3)
>> 2059453 glBufferData(target = GL_UNIFORM_BUFFER, size = 352, data =
>> blob(352), usage = GL_DYNAMIC_DRAW)
>> 2059454 glDrawElementsBaseVertex(mode = GL_TRIANGLES, count = 711, type
>> = GL_UNSIGNED_SHORT, indices = 0x17c46, basevertex = 16303)
>> 2059455 glBindBufferBase(target = GL_UNIFORM_BUFFER, index = 1, buffer
>> = 3)
>>
>> ==> metrotgsi <==
>> 1744537 glEnableVertexAttribArray(index = 4)
>> 1744538 glVertexAttribIPointer(index = 4, size = 4, type = GL_SHORT,
>> stride = 32, pointer = 0x18)
>> 1744539 glVertexAttribDivisor(index = 4, divisor = 0)
>> 1744540 glDisableVertexAttribArray(index = 5)
>> 1744541 glDisableVertexAttribArray(index = 6)
>> 1744542 glDisableVertexAttribArray(index = 7)
>> 1744543 glDisableVertexAttribArray(index = 8)
>> 1744544 glDisableVertexAttribArray(index = 9)
>> 1744545 glDisableVertexAttribArray(index = 10)
>> 1744546 glBindBufferBase(target = GL_UNIFORM_BUFFER, index = 1, buffer
>> = 3)
>>
>> ==> metrotgsi2 <==
>> 2141178 glVertexAttribDivisor(index = 1, divisor = 0)
>> 2141179 glEnableVertexAttribArray(index = 2)
>> 2141180 glVertexAttribPointer(index = 2, size = 4, type =
>> GL_UNSIGNED_BYTE, normalized = GL_TRUE, stride = 32, pointer = 0x10)
>> 2141181 glVertexAttribDivisor(index = 2, divisor = 0)
>> 2141182 glEnableVertexAttribArray(index = 3)
>> 2141183 glVertexAttribPointer(index = 3, size = 4, type =
>> GL_UNSIGNED_BYTE, normalized = GL_TRUE, stride = 32, pointer = 0x14)
>> 2141184 glVertexAttribDivisor(index = 3, divisor = 0)
>> 2141185 glEnableVertexAttribArray(index = 4)
>> 2141186 glVertexAttribIPointer(index = 4, size = 4, type = GL_SHORT,
>> stride = 32, pointer = 0x18)
>> 2141187 glVertexAttribDivisor(index = 4, divisor = 0) // incomplete
>>
>>
>> Am 04.04.2018 um 14:06 schrieb Timothy Arceri:
>>> On 04/04/18 21:51, Benedikt Schemmer wrote:
>>>> Hi Timothy,
>>>>
>>>> thanks for looking into this.
>>>>
>>>> Dead Island still crashes for me with NIR.
>>>
>>> Just to make sure, are you building a 32bit version of Mesa? If not you
>>> might be running your system Mesa. If you are definitely building Mesa
>>> 32bit then I'll take a closer look tomorrow.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [RFC PATCH] Change the shader capture naming system from purely number based to sha_number

2018-04-06 Thread Benedikt Schemmer
Depending on the application captured shaders may or may not be the same
when they have the same number.
In Games it often depends on the users graphics options.

This is a problem when running shader-db and looking at the output.
There the same shader number can mean entirely different shaders
depending on user settings in an application.
Also different numbers can mean the exact same shader.

A good example of this is Metro 2033 Redux, where on the one hand the
same shader is compiled dozens of times resulting in dozens of identical
shaders with different numbers and on the other hand the same number is
used for entirely different shaders when the graphics options are changed.

examples from metro
ff92ec1631322329e8c6125184b7681f190a2c73_11179.shader_test
ff92ec1631322329e8c6125184b7681f190a2c73_12166.shader_test
ff92ec1631322329e8c6125184b7681f190a2c73_13493.shader_test
ff92ec1631322329e8c6125184b7681f190a2c73_1956.shader_test
ff92ec1631322329e8c6125184b7681f190a2c73_3939.shader_test
ff92ec1631322329e8c6125184b7681f190a2c73_5042.shader_test
ff92ec1631322329e8c6125184b7681f190a2c73_9394.shader_test

639f9dd8dc9ccd6465c203d2483c6f1b1838ae19_1109.shader_test size: 2,2K
f70e06b31f68b96b1d5b0b3b4966cfbe02d51c44_1109.shader_test size: 4,0K

A number at the end seems to be necessary to avoid race conditions when
the application tries to compile the same shader multiple times at the
same time.

Tested with several 64-bit Games like Dirt Rally and Metro 2033 Redux.

I am not sure if this should be a series of smaller patches or if it can
be just one large blob.
Also I dont know where exactly the ENABLE_SHADER_CACHE code should be
put one helper function is used to generate the sha.

Please kindly review and push, if you find this useful.

---
 src/mesa/main/shaderapi.c | 272
--
 1 file changed, 145 insertions(+), 127 deletions(-)

diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
index 76bad7f31e..1e35f055a6 100644
--- a/src/mesa/main/shaderapi.c
+++ b/src/mesa/main/shaderapi.c
@@ -65,6 +65,122 @@
 #include "util/mesa-sha1.h"
 #include "util/crc32.h"

+
+/**
+ * Generate a SHA-1 hash value string for given source string.
+ */
+static void
+generate_sha1(const char *source, char sha_str[64])
+{
+   unsigned char sha[20];
+   _mesa_sha1_compute(source, strlen(source), sha);
+   _mesa_sha1_format(sha_str, sha);
+}
+
+
+#ifdef ENABLE_SHADER_CACHE
+/**
+ * Construct a full path for shader replacement functionality using
+ * following format:
+ *
+ * /_.glsl
+ */
+static char *
+construct_name(const gl_shader_stage stage, const char *source,
+   const char *path)
+{
+   char sha[64];
+   static const char *types[] = {
+  "VS", "TC", "TE", "GS", "FS", "CS",
+   };
+
+   generate_sha1(source, sha);
+   return ralloc_asprintf(NULL, "%s/%s_%s.glsl", path, types[stage], sha);
+}
+
+/**
+ * Write given shader source to a file in MESA_SHADER_DUMP_PATH.
+ */
+static void
+dump_shader(const gl_shader_stage stage, const char *source)
+{
+   static bool path_exists = true;
+   char *dump_path;
+   FILE *f;
+
+   if (!path_exists)
+  return;
+
+   dump_path = getenv("MESA_SHADER_DUMP_PATH");
+   if (!dump_path) {
+  path_exists = false;
+  return;
+   }
+
+   char *name = construct_name(stage, source, dump_path);
+
+   f = fopen(name, "w");
+   if (f) {
+  fputs(source, f);
+  fclose(f);
+   } else {
+  GET_CURRENT_CONTEXT(ctx);
+  _mesa_warning(ctx, "could not open %s for dumping shader (%s)", name,
+strerror(errno));
+   }
+   ralloc_free(name);
+}
+
+/**
+ * Read shader source code from a file.
+ * Useful for debugging to override an app's shader.
+ */
+static GLcharARB *
+read_shader(const gl_shader_stage stage, const char *source)
+{
+   char *read_path;
+   static bool path_exists = true;
+   int len, shader_size = 0;
+   GLcharARB *buffer;
+   FILE *f;
+
+   if (!path_exists)
+  return NULL;
+
+   read_path = getenv("MESA_SHADER_READ_PATH");
+   if (!read_path) {
+  path_exists = false;
+  return NULL;
+   }
+
+   char *name = construct_name(stage, source, read_path);
+   f = fopen(name, "r");
+   ralloc_free(name);
+   if (!f)
+  return NULL;
+
+   /* allocate enough room for the entire shader */
+   fseek(f, 0, SEEK_END);
+   shader_size = ftell(f);
+   rewind(f);
+   assert(shader_size);
+
+   /* add one for terminating zero */
+   shader_size++;
+
+   buffer = malloc(shader_size);
+   assert(buffer);
+
+   len = fread(buffer, 1, shader_size, f);
+   buffer[len] = 0;
+
+   fclose(f);
+
+   return buffer;
+}
+
+#endif /* ENABLE_SHADER_CACHE */
+
 /**
  * Return mask of GLSL_x flags by examining the MESA_GLSL env var.
  */
@@ -1228,29 +1344,43 @@ link_program(struct gl_context *ctx, struct
gl_shader_program *shProg,
/* Capture .shader_test files. */
const char *capture_path = _mesa_get_shader_capture_path();
if (shProg->Name != 0 && shProg->Name != ~0 && capture_path != N

Re: [Mesa-dev] [PATCH 3/3] radeonsi: don't emit partial flushes, for internal CS flushes only

2018-04-07 Thread Benedikt Schemmer
Hi Marek,

your patches dont apply cleanly
(src/gallium/drivers/radeonsi/si_gfx_cs.c in particular)

Rebase?

Other than that:
Dirt Rally (avg,min,max)
before 54/35/73
after 54/41/73

My system is GPU bound. RX560 OC with 4GB (aka reflashed RX460 ;)

For the series:

Tested-by: Benedikt Schemmer 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/3] radeonsi: don't emit partial flushes for internal CS flushes only

2018-04-07 Thread Benedikt Schemmer
Patchwork didnt recognize the message header resend

Hi Marek,

your patches dont apply cleanly
(src/gallium/drivers/radeonsi/si_gfx_cs.c in particular)

Rebase?

Other than that:
Dirt Rally (avg,min,max)
before 54/35/73
after 54/41/73

My system is GPU bound. RX560 OC with 4GB (aka reflashed RX460 ;)

For the series:

Tested-by: Benedikt Schemmer 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [1/3] radeonsi: implement mechanism for IBs without partial flushes at the end (v6)

2018-04-07 Thread Benedikt Schemmer
Hi Marek,

I just tried different combinations: this and with meta

ctx->flags |= SI_CONTEXT_INV_ICACHE |
  SI_CONTEXT_INV_SMEM_L1 |
  SI_CONTEXT_INV_VMEM_L1 |
  SI_CONTEXT_INV_GLOBAL_L2 |
  SI_CONTEXT_START_PIPELINE_STATS |
  SI_CONTEXT_FLUSH_AND_INV_DB |
//SI_CONTEXT_FLUSH_AND_INV_DB_META |
  SI_CONTEXT_FLUSH_AND_INV_CB;

as long as you dont flush db_meta there doesnt seem to be a difference
performancewise and I dont see any visible difference between them at
all cant test this with 32bit applications though

Dirt Rally Benchmark
54/34/71 with meta flush
54/40/74 with db and cb flushes
54/41/73 original

again: gpu limited, all starting with empty shader cache

Cheers,
Benedikt
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] Status of radeonsi NIR

2018-04-15 Thread Benedikt Schemmer
Hi all,

I did some more test runs with NIR and shader-db.
This time they also include piglit shaders.
There was only one crash that was explicitly related to NIR but also
quite a number of regressions that might be worth looking at.
I've put the results up on
https://github.com/bendat78/mymesa/tree/mymesa2/testresults
look at the most current run* directory.

(a)

=> CRASHED <= while processing these shaders:
 shaders/piglit/da1588b9fd1ec5009a02ff9ee244f11923d32855_3.shader_test

[require]
GLSL >= 1.10

[vertex shader]
#version 110
#if __VERSION__ >= 130
in vec4 piglit_vertex;
#else
attribute vec4 piglit_vertex;
#endif
void main() { gl_Position = piglit_vertex; }


[fragment shader]
#version 110
uniform float id;
void main()
{
ivec4 test = ivec4(0, 1, 0, 0);
vec4 test2 = vec4(0.0, 1.0, 0.0, 0.0);
int index = int(id);
float col = test2[test[index]];
gl_FragColor = vec4(col);
}


(b)

Might be worth looking at this one as it is related to Alien Isolation
and has significant regressions

VGPRS: 4 -> 64 (1500.00 %) (in
shaders/piglit/51e07b0ac242a7c22d5d519ea8b3e4f7ac8e51f9_2.shader_test)
Spilled VGPRs: 0 -> 46 (0.00 %) (in
shaders/piglit/51e07b0ac242a7c22d5d519ea8b3e4f7ac8e51f9_2.shader_test)
Scratch size: 0 -> 36 (0.00 %) dwords per thread (in
shaders/piglit/51e07b0ac242a7c22d5d519ea8b3e4f7ac8e51f9_2.shader_test)
Code Size: 108 -> 7148 (6518.52 %) bytes (in
shaders/piglit/51e07b0ac242a7c22d5d519ea8b3e4f7ac8e51f9_2.shader_test)

# Tests linking a shader with a large workgroup and local data set.
#
# This is related to:
# Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93840
#
# There are two important elements to this test:
#
# 1. local workgroup size of 1024 based on the spec requirements.
#
# 2. Use a 'large' size of live local data. In the test this is
#controlled with the SIZE macro. This is hardware specific, and
#the value of 64 was chosen because it was observed to fail on the
#Mesa i965 driver.

[require]
GL >= 3.3
GLSL >= 3.30
GL_ARB_compute_shader

[compute shader]
#version 330
#extension GL_ARB_compute_shader: enable

layout(local_size_x = 1024) in;

#define SIZE 64

shared int sa[SIZE];

void main()
{
int a[SIZE] = sa;
a[int(gl_LocalInvocationIndex) % SIZE] += 1;
if (a[int(gl_LocalInvocationIndex) % SIZE] == 0)
sa[0] = 1;
}

[test]
link success


(c)

there are a number of shaders that show this kind of behavior (dont know
if they do for the same reason):

Max Waves: 8 -> 1 (-87.50 %) (in
shaders/piglit/31a96d9a82c154da664023713921aab10967ef83_1533.shader_test)

from

/** @file glsl-max-varyings.c
 *
 * Tests whether each varying can be used at all numbers of varyings up to
 * GL_MAX_VARYING_FLOATS / 4.
 */



Have fun,
Benedikt
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev