[Mesa-dev] [PATCH v2] drirc: Add allow_glsl_builtin_variable_redeclaration for Dead Island Riptide Definitive Edition
From: Benedikt Schemmer This patch sets the allow_glsl_builtin_variable_redeclaration for Dead Island Riptide Definitive Edition. Tested with Mesa git as of today and Dying Light, Dead Island Definitive Edition and Dead Island Riptide Definite Edition v2: set only required option and match option name Please kindly review and push. Thanks, Benedikt --- src/mesa/drivers/dri/common/drirc | 4 1 file changed, 4 insertions(+) diff --git a/src/mesa/drivers/dri/common/drirc b/src/mesa/drivers/dri/common/drirc index 8093e8e42a..d820462fad 100644 --- a/src/mesa/drivers/dri/common/drirc +++ b/src/mesa/drivers/dri/common/drirc @@ -92,6 +92,10 @@ TODO: document the other workarounds. value="true" /> +executable="DeadIslandRiptideGame"> +value="true" /> + + value="true" /> -- 2.11.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/3] Fix missing initializer warning in egd_tables.h by adding appropriate default fields in egd_tables.py
Fix missing initializer warning in egd_tables.h by adding appropriate default fields in egd_tables.py --- src/gallium/drivers/r600/egd_tables.py | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/gallium/drivers/r600/egd_tables.py b/src/gallium/drivers/r600/egd_tables.py index 4c606025ba..289981ae18 100644 --- a/src/gallium/drivers/r600/egd_tables.py +++ b/src/gallium/drivers/r600/egd_tables.py @@ -270,7 +270,7 @@ struct eg_packet3 { strings.add(field.name), field.s_name, len(values_offsets), strings_offsets.add(values_offsets)) else: -print '\t{%s, %s(~0u)},' % (strings.add(field.name), field.s_name) +print '\t{%s, %s(~0u), 0, 0},' % (strings.add(field.name), field.s_name) fields_idx += 1 print '};' @@ -282,7 +282,7 @@ struct eg_packet3 { print '\t{%s, %s, %s, %s},' % (strings.add(reg.name), reg.r_name, len(reg.fields), reg.fields_idx if reg.own_fields else reg.fields_owner.fields_idx) else: -print '\t{%s, %s},' % (strings.add(reg.name), reg.r_name) +print '\t{%s, %s, 0, 0},' % (strings.add(reg.name), reg.r_name) print '};' print -- 2.11.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 0/3] Fix missing initializer errors in generated tables
I send these patches as a series because they share the same problem: all have missing fields in the default initialization path for field/register definitions. This causes several hundreds (thousands?) of "missing initializer" warnings from the compiler making it hard to spot less frequent errors. This also fixes a small copy and paste error in vk_format_table.py where it isnt clear how it is autogenerated from where. Please kindly review and push if you find this useful. Thanks, Benedikt Benedikt Schemmer (3): Fix missing initializer warning in sid_tables.h by adding appropriate default fields in sid_tables.py Fix missing initializer warning in egd_tables.h by adding appropriate default fields in egd_tables.py Fix missing initializer warning in vk_format_table.h by adding appropriate default fields in vk_format_table.py src/amd/common/sid_tables.py | 4 ++-- src/amd/vulkan/vk_format_table.py | 4 ++-- src/gallium/drivers/r600/egd_tables.py | 4 ++-- 3 files changed, 6 insertions(+), 6 deletions(-) -- 2.11.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 3/3] Fix missing initializer warning in vk_format_table.h by, adding appropriate default fields in vk_format_table.py
Fix missing initializer warning in vk_format_table.h by adding appropriate default fields in vk_format_table.py and correct the autogenerated from message --- src/amd/vulkan/vk_format_table.py | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/amd/vulkan/vk_format_table.py b/src/amd/vulkan/vk_format_table.py index 36352b108d..139bb9544c 100644 --- a/src/amd/vulkan/vk_format_table.py +++ b/src/amd/vulkan/vk_format_table.py @@ -86,7 +86,7 @@ def print_channels(format, func): print '#endif' def write_format_table(formats): -print '/* This file is autogenerated by u_format_table.py from u_format.csv. Do not edit directly. */' +print '/* This file is autogenerated by vk_format_table.py from vk_format_layout.csv. Do not edit directly. */' print # This will print the copyright message on the top of this file print CopyRight.strip() @@ -106,7 +106,7 @@ def write_format_table(formats): if channel.size: print " {%s, %s, %s, %s, %u, %u}%s\t/* %s = %s */" % (type_map[channel.type], bool_map(channel.norm), bool_map(channel.pure), bool_map(channel.scaled), channel.size, channel.shift, sep, "xyzw"[i], channel.name) else: -print " {0, 0, 0, 0, 0}%s" % (sep,) +print " {0, 0, 0, 0, 0, 0}%s" % (sep,) print " }," def do_swizzle_array(channels, swizzles): -- 2.11.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/3] Fix missing initializer warning in sid_tables.h by adding appropriate default fields in sid_tables.py
Fix missing initializer warning in sid_tables.h by adding appropriate default fields in sid_tables.py --- src/amd/common/sid_tables.py | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/amd/common/sid_tables.py b/src/amd/common/sid_tables.py index fd88d3c9d5..691d766b08 100644 --- a/src/amd/common/sid_tables.py +++ b/src/amd/common/sid_tables.py @@ -270,7 +270,7 @@ struct si_packet3 { strings.add(field.name), field.s_name, len(values_offsets), strings_offsets.add(values_offsets)) else: -print '\t{%s, %s(~0u)},' % (strings.add(field.name), field.s_name) +print '\t{%s, %s(~0u), 0, 0},' % (strings.add(field.name), field.s_name) fields_idx += 1 print '};' @@ -282,7 +282,7 @@ struct si_packet3 { print '\t{%s, %s, %s, %s},' % (strings.add(reg.name), reg.r_name, len(reg.fields), reg.fields_idx if reg.own_fields else reg.fields_owner.fields_idx) else: -print '\t{%s, %s},' % (strings.add(reg.name), reg.r_name) +print '\t{%s, %s, 0, 0},' % (strings.add(reg.name), reg.r_name) print '};' print -- 2.11.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 0/3] Fix missing initializer errors in generated tables
Hi all, my reasoning wasn't based on how c handles partial initializations, but on how the generated files look like: like something is missing. Also I don't want to suppress these warnings, because I don't have the experience you have with the codebase and would kind of like to see if my changes have an negative impact or I forgot something. Benedikt Am 04.06.2017 um 22:20 schrieb Bas Nieuwenhuizen: > Hi Marek, > > Do you have any other reasons besides it not improving correctness? > I'd like to pick at least the radv one, as the code doesn't get less > clear, and using 5 zeros for struct with 6 members is just plain > silly. > > - Bas > > > On Sun, Jun 4, 2017 at 9:57 PM, Marek Olšák wrote: >> NAK. >> >> In C/C++, the initializer is used to clear the memory to 0s, thus, >> adding 0s to the initializer is redundant and unnecessary. Empty >> initializer {} is also commonly used instead of memset. >> >> You need to suppress this warning if you don't want to see it. >> >> Marek >> >> On Sun, Jun 4, 2017 at 9:45 AM, Benedikt Schemmer wrote: >>> >>> I send these patches as a series because they share the same problem: >>> all have missing fields in the default initialization path for >>> field/register >>> definitions. >>> >>> This causes several hundreds (thousands?) of "missing initializer" warnings >>> from the compiler making it hard to spot less frequent errors. >>> >>> This also fixes a small copy and paste error in vk_format_table.py where >>> it isnt clear how it is autogenerated from where. >>> >>> Please kindly review and push if you find this useful. >>> >>> Thanks, >>> Benedikt >>> >>> Benedikt Schemmer (3): >>> Fix missing initializer warning in sid_tables.h by adding appropriate >>> default fields in sid_tables.py >>> Fix missing initializer warning in egd_tables.h by adding appropriate >>> default fields in egd_tables.py >>> Fix missing initializer warning in vk_format_table.h by adding >>> appropriate default fields in vk_format_table.py >>> >>> src/amd/common/sid_tables.py | 4 ++-- >>> src/amd/vulkan/vk_format_table.py | 4 ++-- >>> src/gallium/drivers/r600/egd_tables.py | 4 ++-- >>> 3 files changed, 6 insertions(+), 6 deletions(-) >>> >>> -- >>> 2.11.0 >>> ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v3 1/3] Fix missing initializer warning in sid_tables.h by adding appropriate default fields in sid_tables.py
Fix missing initializer warning in sid_tables.h by adding appropriate designated initializers in sid_tables.py --- src/amd/common/sid_tables.py | 15 ++- 1 file changed, 10 insertions(+), 5 deletions(-) diff --git a/src/amd/common/sid_tables.py b/src/amd/common/sid_tables.py index fd88d3c9d5..691d766b08 100644 --- a/src/amd/common/sid_tables.py +++ b/src/amd/common/sid_tables.py @@ -266,11 +266,13 @@ struct si_packet3 { while value[1] >= len(values_offsets): values_offsets.append(-1) values_offsets[value[1]] = strings.add(strip_prefix(value[0])) -print '\t{%s, %s(~0u), %s, %s},' % ( -strings.add(field.name), field.s_name, +print '\t{.name_offset\t= %s,\r\n\t .mask\t\t\t= %s(~0u), \ + \r\n\t .num_values\t= %s,\r\n\t .values_offset\t= %s},' \ + % (strings.add(field.name), field.s_name, len(values_offsets), strings_offsets.add(values_offsets)) else: -print '\t{%s, %s(~0u)},' % (strings.add(field.name), field.s_name) +print '\t{.name_offset\t= %s,\r\n\t .mask\t\t\t= %s(~0u)},' \ + % (strings.add(field.name), field.s_name) fields_idx += 1 print '};' @@ -279,10 +281,13 @@ struct si_packet3 { print 'static const struct si_reg sid_reg_table[] = {' for reg in regs: if len(reg.fields): -print '\t{%s, %s, %s, %s},' % (strings.add(reg.name), reg.r_name, +print '\t{.name_offset\t= %s,\r\n\t .offset\t\t= %s, \ + \r\n\t .num_fields\t= %s,\r\n\t .fields_offset\t= %s},' \ + % (strings.add(reg.name), reg.r_name, len(reg.fields), reg.fields_idx if reg.own_fields else reg.fields_owner.fields_idx) else: -print '\t{%s, %s},' % (strings.add(reg.name), reg.r_name) +print '\t{.name_offset\t= %s,\r\n\t .offset\t\t= %s},' \ + % (strings.add(reg.name), reg.r_name) print '};' print -- 2.11.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v3 2/3] Fix missing initializer warning in egd_tables.h by adding appropriate default fields in egd_tables.py
Fix missing initializer warning in egd_tables.h by adding appropriate designated initializers in egd_tables.py --- src/gallium/drivers/r600/egd_tables.py | 15 ++- 1 file changed, 10 insertions(+), 5 deletions(-) diff --git a/src/gallium/drivers/r600/egd_tables.py b/src/gallium/drivers/r600/egd_tables.py index 4c606025ba..289981ae18 100644 --- a/src/gallium/drivers/r600/egd_tables.py +++ b/src/gallium/drivers/r600/egd_tables.py @@ -266,11 +266,13 @@ struct eg_packet3 { while value[1] >= len(values_offsets): values_offsets.append(-1) values_offsets[value[1]] = strings.add(strip_prefix(value[0])) -print '\t{%s, %s(~0u), %s, %s},' % ( -strings.add(field.name), field.s_name, +print '\t{.name_offset\t= %s,\r\n\t .mask\t\t\t= %s(~0u), \ + \r\n\t .num_values\t= %s,\r\n\t .values_offset\t= %s},' \ + % (strings.add(field.name), field.s_name, len(values_offsets), strings_offsets.add(values_offsets)) else: -print '\t{%s, %s(~0u)},' % (strings.add(field.name), field.s_name) +print '\t{.name_offset\t= %s,\r\n\t .mask\t\t\t= %s(~0u)},' \ + % (strings.add(field.name), field.s_name) fields_idx += 1 print '};' @@ -279,10 +281,13 @@ struct eg_packet3 { print 'static const struct eg_reg egd_reg_table[] = {' for reg in regs: if len(reg.fields): -print '\t{%s, %s, %s, %s},' % (strings.add(reg.name), reg.r_name, +print '\t{.name_offset\t= %s,\r\n\t .offset\t\t= %s, \ + \r\n\t .num_fields\t= %s,\r\n\t .fields_offset\t= %s},' \ + % (strings.add(reg.name), reg.r_name, len(reg.fields), reg.fields_idx if reg.own_fields else reg.fields_owner.fields_idx) else: -print '\t{%s, %s},' % (strings.add(reg.name), reg.r_name) +print '\t{.name_offset\t= %s,\r\n\t .offset\t\t= %s},' \ + % (strings.add(reg.name), reg.r_name) print '};' print -- 2.11.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v3 3/3] Fix missing initializer warning in vk_format_table.h by, adding appropriate default fields in vk_format_table.py
Fix missing initializer warning in vk_format_table.h by changing to a default initializer in vk_format_table.py and correct the autogenerated from message --- src/amd/vulkan/vk_format_table.py | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/amd/vulkan/vk_format_table.py b/src/amd/vulkan/vk_format_table.py index 36352b108d..139bb9544c 100644 --- a/src/amd/vulkan/vk_format_table.py +++ b/src/amd/vulkan/vk_format_table.py @@ -86,7 +86,7 @@ def print_channels(format, func): print '#endif' def write_format_table(formats): -print '/* This file is autogenerated by u_format_table.py from u_format.csv. Do not edit directly. */' +print '/* This file is autogenerated by vk_format_table.py from vk_format_layout.csv. Do not edit directly. */' print # This will print the copyright message on the top of this file print CopyRight.strip() @@ -106,7 +106,7 @@ def write_format_table(formats): if channel.size: print " {%s, %s, %s, %s, %u, %u}%s\t/* %s = %s */" % (type_map[channel.type], bool_map(channel.norm), bool_map(channel.pure), bool_map(channel.scaled), channel.size, channel.shift, sep, "xyzw"[i], channel.name) else: -print " {0, 0, 0, 0, 0}%s" % (sep,) +print " {0}%s" % (sep,) print " }," def do_swizzle_array(channels, swizzles): -- 2.11.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v3 0/3] Fix missing initializer errors in generated tables
This series aims to fix hundreds of missing initializer warnings in generated header files when compiling with -Wextra V1: Fix the old fashioned way by adding 0s where needed V2: switch to designated initializers (Emil), didnt send V3: add some layout so its easier to read and create a new version for vk that just uses {0} instead of {0, 0, 0, 0, 0} (same thing, less zeros) --- Because this generated such unexpected controversy and out of curiosity i wrote a little test program to show the problem. Compile with -Wextra. There is no init code generated for any variant on any compiler i tested. gcc version 5.4.1 20170304 (Ubuntu 5.4.1-8ubuntu1) gcc version 6.3.0 20170406 (Ubuntu 6.3.0-12ubuntu2) gcc version 7.0.1 20170407 (experimental) [trunk revision 246759] (Ubuntu 7-20170407-0ubuntu2) clang version 4.0.0-1ubuntu1 (tags/RELEASE_400/rc1) clang version 5.0.0-svn305158-0~z~padoka0 (trunk) MSVC Compiler Version 19.00.24210 --- struct s { int a; int b; int c; int d; int e; }; static const struct s str1 = {}; // gcc 5/6/7, clang 4/5 accept this without warning, MSVC2013 doesnt compile (not Standard, will generate a warning with -Wpendantic, much prettier however) static const struct s str2 = {0}; // clang 4/5 generate a warning here (although ANSI Standard) static const struct s str3 = {0,}; // clang 4/5 generate a warning here (although ANSI Standard) static const struct s str4 = {1,2,3}; // gcc 5/6/7, clang 4/5 generate a warning here static const struct s str5 = {1,2,3,}; // gcc 5/6/7, clang 4/5 generate a warning here static const struct s str6 = {1,2,3,0,0}; // this is fine with all compilers static const struct s str7 = {.a = 1, .b = 2, .c = 3}; // this might not compile on MSVC <2013 but couldnt test int main() { return 0; } --- This is what Rust does: --- #![allow(unused_variables)] #![allow(dead_code)] #[derive(Default)] struct Test { a: i32, b: i32, c: i32, d: i32, e: i32 } fn main() { //all of these wont work //let t1 = Test {}; //let t2 = Test {0}; //let t3 = Test {0,}; //let t3b= Test {..}; would be very cool //let t4 = Test {1,2,3}; //let t5 = Test {1,2,3,}; //let t6 = Test {1,2,3,0,0}; //let t7 = Test {a: 1, b: 1}; //let t8 = Test {a: 1, b: 1, ..}; would be cool //only this is legal in rust let t9: Test = Default::default(); let t10 = Test {..Default::default()}; let t11 = Test {a: 1, b: 1, ..Default::default()}; let t12 = Test {a: 1, b: 1, c: 1, d: 1, e: 1}; println!("Hello, world!"); } --- So in the end I followed Emils suggestion of designated initializers for partial initialization, to make it explicit and get rid of the warnings. Please kindly review and push. Thanks, Benedikt ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] drirc: Add allow_glsl_builtin_redeclaration for Dead Island Riptide Definitive Edition
From: Benedikt Schemmer This patch sets the allow_glsl_builtin_redeclaration for Dead Island Riptide Definitive Edition. This requires the patch series: [Mesa-dev] [PATCH 0/2] Allow redeclaration of GLSL builtins; fixes Dying Light and Dead Island Definitive Edition by John Brooks sent Fri May 12 13:39:46 UTC 2017 Tested with Mesa git as of today and Dying Light, Dead Island Definitive Edition and Dead Island Riptide Definite Edition --- src/mesa/drivers/dri/common/drirc | 5 + 1 file changed, 5 insertions(+) diff --git a/src/mesa/drivers/dri/common/drirc b/src/mesa/drivers/dri/common/drirc index e1b3722e6d..9f2ef6f159 100644 --- a/src/mesa/drivers/dri/common/drirc +++ b/src/mesa/drivers/dri/common/drirc @@ -92,6 +92,11 @@ TODO: document the other workarounds. value="true" /> +executable="DeadIslandRiptideGame"> +value="true" /> + + + value="true" /> -- 2.11.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 0/2] Allow redeclaration of GLSL builtins; fixes Dying Light and Dead Island Definitive Edition
Hi all, Dying Light really works (Radeon RX460 4GB, Ubuntu 17.04, Mesa git patches applied plus some minor changes that i have locally to get rid of compiler warnings). I did have to jump through some additional hoops to get it working: Dying Light only launches from within Steam, Launch Options for me need to be MESA_GL_VERSION_OVERRIDE=4.4 MESA_GLSL_VERSION_OVERRIDE=440 Also two libraries were missing (there are no packages in the 17.04 repository so I had to install them manually) I used find -name "*.so" -exec ldd {} \; | grep not from the games directory to find the missing dependencies. links for the packages below, glew needs to be renamed because the game expects a different version. http://packages.ubuntu.com/yakkety/libglew1.13 sudo ln -sf /usr/lib/x86_64-linux-gnu/libGLEW.so.1.13.0 /usr/lib/x86_64-linux-gnu/libGLEW.so.1.10 http://packages.ubuntu.com/de/xenial/amd64/libpng12-0/download After that the biggest problem was that I died within the first few minutes because I havent played in months ;) Big thanks you've made my day, Benedikt ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] Fixes: gallium: add PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE
There is a small typo that prevents compilation. Please kindly review and push. Cheers, Benedikt Signed-off-by: Benedikt Schemmer --- diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw.c b/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw.c --- a/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw.c +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw.c @@ -412,7 +412,7 @@ nvc0_hw_get_query_result_resource(struct nvc0_context *nvc0, PUSH_REFN (push, buf->bo, buf->domain | NOUVEAU_BO_WR); BEGIN_1IC0(push, NVC0_3D(MACRO_QUERY_BUFFER_WRITE), 9); if (q->type == PIPE_QUERY_OCCLUSION_PREDICATE || - q->type ++ PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE) /* XXX what if 64-bit? */ + q->type == PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE) /* XXX what if 64-bit? */ PUSH_DATA(push, 0x0001); else if (result_type == PIPE_QUERY_TYPE_I32) PUSH_DATA(push, 0x7fff); ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] Fixes: radeonsi: allow out-of-order rasterization in commutative blending cases
Wrong commit, correct one is: gallium: add PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE Am 18.09.2017 um 15:27 schrieb Benedikt Schemmer: > There is a small typo that prevents compilation. > Please kindly review and push. > Cheers, > Benedikt > > Signed-off-by: Benedikt Schemmer > > diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw.c > b/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw.c > --- a/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw.c > +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw.c > @@ -412,7 +412,7 @@ nvc0_hw_get_query_result_resource(struct nvc0_context > *nvc0, > PUSH_REFN (push, buf->bo, buf->domain | NOUVEAU_BO_WR); > BEGIN_1IC0(push, NVC0_3D(MACRO_QUERY_BUFFER_WRITE), 9); > if (q->type == PIPE_QUERY_OCCLUSION_PREDICATE || > - q->type ++ PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE) /* XXX what > if 64-bit? */ > + q->type == PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE) /* XXX what > if 64-bit? */ >PUSH_DATA(push, 0x0001); > else if (result_type == PIPE_QUERY_TYPE_I32) >PUSH_DATA(push, 0x7fff); > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] Fixes: radeonsi: allow out-of-order rasterization in commutative blending cases
There is a small typo that prevents compilation. Please kindly review and push. Cheers, Benedikt Signed-off-by: Benedikt Schemmer diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw.c b/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw.c --- a/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw.c +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw.c @@ -412,7 +412,7 @@ nvc0_hw_get_query_result_resource(struct nvc0_context *nvc0, PUSH_REFN (push, buf->bo, buf->domain | NOUVEAU_BO_WR); BEGIN_1IC0(push, NVC0_3D(MACRO_QUERY_BUFFER_WRITE), 9); if (q->type == PIPE_QUERY_OCCLUSION_PREDICATE || - q->type ++ PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE) /* XXX what if 64-bit? */ + q->type == PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE) /* XXX what if 64-bit? */ PUSH_DATA(push, 0x0001); else if (result_type == PIPE_QUERY_TYPE_I32) PUSH_DATA(push, 0x7fff); ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] radeonsi/uvd: fix planar formats broken since f70f6baaa3bb0f8b280ac2eaea69bb
Would it be ok to rewrite the patch like below? Avoids code duplication and one variable. Otherwise your original patch works fine so FWIW you have my Tested-by: Benedikt Schemmer Cheers, Benedikt diff --git a/src/gallium/drivers/radeonsi/si_uvd.c b/src/gallium/drivers/radeonsi/si_uvd.c --- a/src/gallium/drivers/radeonsi/si_uvd.c +++ b/src/gallium/drivers/radeonsi/si_uvd.c @@ -64,30 +64,16 @@ struct pipe_video_buffer *si_video_buffer_create(struct pipe_context *pipe, template.width = align(tmpl->width, VL_MACROBLOCK_WIDTH); template.height = align(tmpl->height / array_size, VL_MACROBLOCK_HEIGHT); - vl_video_buffer_template(&templ, &template, resource_formats[0], 1, array_size, PIPE_USAGE_DEFAULT, 0); - /* TODO: get tiling working */ - templ.bind = PIPE_BIND_LINEAR; - resources[0] = (struct r600_texture *) - pipe->screen->resource_create(pipe->screen, &templ); - if (!resources[0]) - goto error; - - if (resource_formats[1] != PIPE_FORMAT_NONE) { - vl_video_buffer_template(&templ, &template, resource_formats[1], 1, array_size, PIPE_USAGE_DEFAULT, 1); - templ.bind = PIPE_BIND_LINEAR; - resources[1] = (struct r600_texture *) - pipe->screen->resource_create(pipe->screen, &templ); - if (!resources[1]) - goto error; - } - - if (resource_formats[2] != PIPE_FORMAT_NONE) { - vl_video_buffer_template(&templ, &template, resource_formats[2], 1, array_size, PIPE_USAGE_DEFAULT, 2); - templ.bind = PIPE_BIND_LINEAR; - resources[2] = (struct r600_texture *) - pipe->screen->resource_create(pipe->screen, &templ); - if (!resources[2]) - goto error; + for (i = 0; i < VL_NUM_COMPONENTS; ++i) { + if (resource_formats[i] != PIPE_FORMAT_NONE) { + vl_video_buffer_template(&templ, &template, +resource_formats[i], 1, +array_size, PIPE_USAGE_DEFAULT, i); + templ.bind = PIPE_BIND_LINEAR | PIPE_BIND_SHARED; + resources[i] = (struct r600_texture *) + pipe->screen->resource_create(pipe->screen, &templ); + if (!resources[i]) goto error; + } } for (i = 0; i < VL_NUM_COMPONENTS; ++i) { ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH V2] radeonsi/uvd: fix planar formats broken since f70f6baaa3bb0f8b280ac2eaea69bb
From: Benedikt Schemmer Date: Fri, 29 Sep 2017 21:02:13 +0200 Subject: [PATCH V2] radeonsi/uvd: fix planar formats broken since f70f6baaa3bb0f8b280ac2eaea69bb V2: remove code duplication and one unnessecary variable, minor whitespace fix --- src/gallium/drivers/radeonsi/si_uvd.c | 40 +-- 1 file changed, 15 insertions(+), 25 deletions(-) diff --git a/src/gallium/drivers/radeonsi/si_uvd.c b/src/gallium/drivers/radeonsi/si_uvd.c index 4e8250664c..c2925ef9cd 100644 --- a/src/gallium/drivers/radeonsi/si_uvd.c +++ b/src/gallium/drivers/radeonsi/si_uvd.c @@ -64,30 +64,20 @@ struct pipe_video_buffer *si_video_buffer_create(struct pipe_context *pipe, template.width = align(tmpl->width, VL_MACROBLOCK_WIDTH); template.height = align(tmpl->height / array_size, VL_MACROBLOCK_HEIGHT); - vl_video_buffer_template(&templ, &template, resource_formats[0], 1, array_size, PIPE_USAGE_DEFAULT, 0); /* TODO: get tiling working */ - templ.bind = PIPE_BIND_LINEAR; - resources[0] = (struct r600_texture *) - pipe->screen->resource_create(pipe->screen, &templ); - if (!resources[0]) - goto error; - - if (resource_formats[1] != PIPE_FORMAT_NONE) { - vl_video_buffer_template(&templ, &template, resource_formats[1], 1, array_size, PIPE_USAGE_DEFAULT, 1); - templ.bind = PIPE_BIND_LINEAR; - resources[1] = (struct r600_texture *) - pipe->screen->resource_create(pipe->screen, &templ); - if (!resources[1]) - goto error; - } - - if (resource_formats[2] != PIPE_FORMAT_NONE) { - vl_video_buffer_template(&templ, &template, resource_formats[2], 1, array_size, PIPE_USAGE_DEFAULT, 2); - templ.bind = PIPE_BIND_LINEAR; - resources[2] = (struct r600_texture *) - pipe->screen->resource_create(pipe->screen, &templ); - if (!resources[2]) - goto error; + + for (i = 0; i < VL_NUM_COMPONENTS; ++i) { + if (resource_formats[i] != PIPE_FORMAT_NONE) { + vl_video_buffer_template(&templ, &template, +resource_formats[i], 1, +array_size, PIPE_USAGE_DEFAULT, i); + /* Set PIPE_BIND_SHARED to avoid reallocation in r600_texture_get_handle, +* which can't handle joined surfaces. */ + templ.bind = PIPE_BIND_LINEAR | PIPE_BIND_SHARED; + resources[i] = (struct r600_texture *) + pipe->screen->resource_create(pipe->screen, &templ); + if (!resources[i]) goto error; + } } for (i = 0; i < VL_NUM_COMPONENTS; ++i) { @@ -159,8 +149,8 @@ struct pipe_video_codec *si_uvd_create_decoder(struct pipe_context *context, struct si_context *ctx = (struct si_context *)context; bool vcn = (ctx->b.family == CHIP_RAVEN) ? true : false; -if (templ->entrypoint == PIPE_VIDEO_ENTRYPOINT_ENCODE) -return si_vce_create_encoder(context, templ, ctx->b.ws, si_vce_get_buffer); + if (templ->entrypoint == PIPE_VIDEO_ENTRYPOINT_ENCODE) + return si_vce_create_encoder(context, templ, ctx->b.ws, si_vce_get_buffer); return (vcn) ? radeon_create_decoder(context, templ) : si_common_uvd_create_decoder(context, templ, si_uvd_set_dtb); -- 2.11.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH V2] radeonsi/uvd: fix planar formats broken since f70f6baaa3bb0f8b280ac2eaea69bb
> > It should be handled as an error if resource_formats[0] is PIPE_FORMAT_NONE. > > Better write this as: > > if (resource_formats[i] == PIPE_FORMAT_NONE) { > if (i == 0) > gotot error; > continue; > } > ... > I dont think it can be zero. In the beginning of vl_video_buffer.c formats and return values are defined and resource_formats[0] is guranteed to not be PIPE_FORMAT_NONE. The only other return value is null which caught immediately after requesting resource_formats (line 59, in si_uvd.c). So maybe assert(resource_formats[0] == PIPE_FORMAT_NONE) for debugging purposes? ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH V2] radeonsi/uvd: fix planar formats broken since f70f6baaa3bb0f8b280ac2eaea69bb
yes Am 30.09.2017 um 17:53 schrieb Marek Olšák: > On Sat, Sep 30, 2017 at 5:47 PM, Christian König > wrote: >> Am 30.09.2017 um 12:57 schrieb Benedikt Schemmer: >>>> >>>> It should be handled as an error if resource_formats[0] is >>>> PIPE_FORMAT_NONE. >>>> >>>> Better write this as: >>>> >>>> if (resource_formats[i] == PIPE_FORMAT_NONE) { >>>> if (i == 0) >>>> gotot error; >>>> continue; >>>> } >>>> ... >>>> >>> I dont think it can be zero. >>> In the beginning of vl_video_buffer.c formats and return values are >>> defined and resource_formats[0] is guranteed to not >>> be PIPE_FORMAT_NONE. The only other return value is null which caught >>> immediately after requesting resource_formats >>> (line 59, in si_uvd.c). >>> So maybe >>> assert(resource_formats[0] == PIPE_FORMAT_NONE) >>> for debugging purposes? > > Did you mean: > assert(resource_formats[0] != PIPE_FORMAT_NONE) ? > > Marek > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH V3] radeonsi/uvd: fix planar formats broken since f70f6baaa3bb0f8b280ac2eaea69bb
From: Benedikt Schemmer Date: Sat, 30 Sep 2017 18:05:43 +0200 Subject: [PATCH V3] radeonsi/uvd: fix planar formats broken since f70f6baaa3bb0f8b280ac2eaea69bb V1: Marek Olsak V2: remove code duplication and one unnecessary variable, minor whitespace fix V3: ensure valid resource format when debugging (is assumed to be valid in release) (suggested by Christian König) --- src/gallium/drivers/radeonsi/si_uvd.c | 41 +++ 1 file changed, 17 insertions(+), 24 deletions(-) diff --git a/src/gallium/drivers/radeonsi/si_uvd.c b/src/gallium/drivers/radeonsi/si_uvd.c index 4e8250664c..3cb45075c4 100644 --- a/src/gallium/drivers/radeonsi/si_uvd.c +++ b/src/gallium/drivers/radeonsi/si_uvd.c @@ -64,30 +64,23 @@ struct pipe_video_buffer *si_video_buffer_create(struct pipe_context *pipe, template.width = align(tmpl->width, VL_MACROBLOCK_WIDTH); template.height = align(tmpl->height / array_size, VL_MACROBLOCK_HEIGHT); - vl_video_buffer_template(&templ, &template, resource_formats[0], 1, array_size, PIPE_USAGE_DEFAULT, 0); /* TODO: get tiling working */ - templ.bind = PIPE_BIND_LINEAR; - resources[0] = (struct r600_texture *) - pipe->screen->resource_create(pipe->screen, &templ); - if (!resources[0]) - goto error; - - if (resource_formats[1] != PIPE_FORMAT_NONE) { - vl_video_buffer_template(&templ, &template, resource_formats[1], 1, array_size, PIPE_USAGE_DEFAULT, 1); - templ.bind = PIPE_BIND_LINEAR; - resources[1] = (struct r600_texture *) - pipe->screen->resource_create(pipe->screen, &templ); - if (!resources[1]) - goto error; - } - if (resource_formats[2] != PIPE_FORMAT_NONE) { - vl_video_buffer_template(&templ, &template, resource_formats[2], 1, array_size, PIPE_USAGE_DEFAULT, 2); - templ.bind = PIPE_BIND_LINEAR; - resources[2] = (struct r600_texture *) - pipe->screen->resource_create(pipe->screen, &templ); - if (!resources[2]) - goto error; + assert(resource_formats[0] != PIPE_FORMAT_NONE); + + for (i = 0; i < VL_NUM_COMPONENTS; ++i) { + if (resource_formats[i] != PIPE_FORMAT_NONE) { + vl_video_buffer_template(&templ, &template, +resource_formats[i], 1, +array_size, PIPE_USAGE_DEFAULT, i); + /* Set PIPE_BIND_SHARED to avoid reallocation in r600_texture_get_handle, +* which can't handle joined surfaces. */ + templ.bind = PIPE_BIND_LINEAR | PIPE_BIND_SHARED; + resources[i] = (struct r600_texture *) + pipe->screen->resource_create(pipe->screen, &templ); + if (!resources[i]) + goto error; + } } for (i = 0; i < VL_NUM_COMPONENTS; ++i) { @@ -159,8 +152,8 @@ struct pipe_video_codec *si_uvd_create_decoder(struct pipe_context *context, struct si_context *ctx = (struct si_context *)context; bool vcn = (ctx->b.family == CHIP_RAVEN) ? true : false; -if (templ->entrypoint == PIPE_VIDEO_ENTRYPOINT_ENCODE) -return si_vce_create_encoder(context, templ, ctx->b.ws, si_vce_get_buffer); + if (templ->entrypoint == PIPE_VIDEO_ENTRYPOINT_ENCODE) + return si_vce_create_encoder(context, templ, ctx->b.ws, si_vce_get_buffer); return (vcn) ? radeon_create_decoder(context, templ) : si_common_uvd_create_decoder(context, templ, si_uvd_set_dtb); -- 2.11.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v4 00/15] TGSI: improved live range tracking, also including arrays
Hi Gert, Hi Nicolai, I did play around with this quite a lot (mostly with the previous version) and found it to be stable (doesn't crash deus ex on start up from cold shader cache like NIR) or leak memory like llvm 7 (at least used to leak ~100MB with piglit shaders, haven't given it another try in the last two weeks). The arrays-of-arrays piglit shaders I sent you still crash though Gert. On radeonsi it is mostly (or probably entirely) useless ;) But it does have an interesting property when I tweak it (see below), results of today with v4 and llvm 6: BIGGEST IMPROVEMENTS - Max Waves Before After Delta Percentage 3 8 5 166.67 % ../shader-db/shaders/piglit/988eefdf4bb05e5a3ecc4a5c0b4c4ef54047a5a9_4.shader_test [1] 5 7 2 40.00 % ../shader-db/shaders/ruiner/0967c5fce7fc456496b1cfa25fbb1d1c4dcf9bed_2958.shader_test [0] 5 7 2 40.00 % ../shader-db/shaders/cat/1847.shader_test [0] 6 7 1 16.67 % ../shader-db/shaders/serioussam2017/2160.shader_test [0] 6 7 1 16.67 % ../shader-db/shaders/serioussam2017/f1e9a7f8bb8be17b8231116cf68bc10677769709_2149.shader_test [0] 1 2 1 100.00 % ../shader-db/shaders/deusex_mankind/5b1006a267c95f5c245266d82699d53cad704aab_4008.shader_test [0] 1 2 1 100.00 % ../shader-db/shaders/deusex_mankind/14060e8c592408bb9b6059b27a72eeb1e66c7480_8288.shader_test [0] 1 2 1 100.00 % ../shader-db/shaders/deusex_mankind/cb6356f4da76e5a82d6036e811c680ae00249c68_994.shader_test [0] PERCENTAGE DELTASShaders SGPRs VGPRs SpillSGPR SpillVGPR PrivVGPR Scratch CodeSize MaxWaves Waits All affected 630 -6.28 % -1.19 % -0.07 % . . .0.02 %0.42 % . No regressions with max waves otherwise (but of course I don't own every game and even from those I don't have a shaderdump of every one of them). But still, from the values I found there may be an opportunity for an 'intermediate range' optimization somewhere. Cheers, Benedikt --- diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp @@ -5533,21 +5533,22 @@ glsl_to_tgsi_visitor::merge_registers(void) } - if (get_temp_registers_required_live_ranges(reg_live_ranges, &this->instructions, +// if (get_temp_registers_required_live_ranges(reg_live_ranges, &this->instructions, + get_temp_registers_required_live_ranges(reg_live_ranges, &this->instructions, this->next_temp, reg_live_ranges, - this->next_array, arr_live_ranges)) { + this->next_array, arr_live_ranges);//) { struct rename_reg_pair *renames = rzalloc_array(reg_live_ranges, struct rename_reg_pair, this->next_temp); get_temp_registers_remapping(reg_live_ranges, this->next_temp, reg_live_ranges, renames); rename_temp_registers(renames); - this->next_array = merge_arrays(this->next_array, this->array_sizes, - &this->instructions, arr_live_ranges); +// this->next_array = merge_arrays(this->next_array, this->array_sizes, +// &this->instructions, arr_live_ranges); if (arr_live_ranges) delete[] arr_live_ranges; - } +// } ralloc_free(reg_live_ranges); } diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp --- a/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp +++ b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp @@ -1330,6 +1330,24 @@ static int register_merge_record_compare (const void *a, const void *b) { } #endif +/* these magic numbers are determined by looking at the results of shader-db */ +bool should_merge (int distance) +{ + switch (distance) { + case 12 ... 126: //lower bound interfering with llvm?, upper bound here + case 244 ... 768: // and lower bound here determined by one regressing tombraider shader +// nothing to see here + case 2432 ... 2496: // purely empiricily determined + case 2497 ... 2623: // Deus Ex + case 2624 ... 2688: +// case 2689 ... 2943: // causes regressions in ubershaders + case 2944 ... 3072: // above isnt used + return true; + default: + return false; + } +} + /* This functions evaluates the register merges by using a binary * search to find suitable merge candidates. */ void get_temp_registers_remapping(void *mem_ctx, int ntemps, @@ -1361,10 +1379,18 @@ void get_temp_registers_remapping(void *mem_ctx, int ntemps, register_merge_record *first_erase = reg_access_en
Re: [Mesa-dev] [PATCH 0/9] RadeonSI LLVM crash workaround for Ubuntu 18.04
For the series: Tested-by: Benedikt Schemmer Tested with Series starting with [Mesa-dev] [PATCH 1/7] ac/surface: handle DCC subresource fast clear restriction on VI Series starting with [Mesa-dev] [PATCH 1/9] radeonsi: remove r600_pipe_common.h Series starting with [Mesa-dev] [PATCH 0/9] RadeonSI LLVM crash workaround for Ubuntu 18.04 Kernel 4.16.0+ Mareks kernel patches for IB Partial flush i.e. DRM 3.26.0 I did not test Mareks patches isolated for performance gains This is git since 9.4.18 plus Mareks patches RX560 4GB OC (but with 50 Watt cap, before: 60W got too hot) Deus Ex 1. run: 24.8/5.4/36.5 2. run: 24.9/13.8/37.9 before 1. run: 19.5/8.5/29.7 2. run: 19.8/9.2/30.2 Dirt Rallye 1. run 58/44.7/72 2. run 58.5/44/83 before 1. run: 55/42/73 2. run: 55/43/85 Dead Island classic crash with NIR during shader compilation, TGSI works fine Dead Island Definitive works fine with TGSI ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] mesa/main: Rework the shader capture naming convention
Change from a purely number.shader_test naming scheme to sha_number.shader_test because especially games often use the same number for entirely different shaders based on graphics settings etc. and then already captured shaders get overwritten. It is also useful for capturing shaders from applications that reuse program numbers a lot i.e. piglit. This has helped to identify problems there as well. Plus minor cleanups, that could be split into a separate patch. I have used this for approximately two months now, seems to work just fine. --- src/mesa/main/shaderapi.c | 315 -- 1 file changed, 168 insertions(+), 147 deletions(-) diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c index 44b18af492..65cf0a67a2 100644 --- a/src/mesa/main/shaderapi.c +++ b/src/mesa/main/shaderapi.c @@ -64,6 +64,122 @@ #include "util/mesa-sha1.h" #include "util/crc32.h" + +/** + * Generate a SHA-1 hash value string for given source string. + */ +static void +generate_sha1(const char *source, char sha_str[64]) +{ + unsigned char sha[20]; + _mesa_sha1_compute(source, strlen(source), sha); + _mesa_sha1_format(sha_str, sha); +} + + +#ifdef ENABLE_SHADER_CACHE +/** + * Construct a full path for shader replacement functionality using + * following format: + * + * /_.glsl + */ +static char * +construct_name(const gl_shader_stage stage, const char *source, + const char *path) +{ + char sha[64]; + static const char *types[] = { + "VS", "TC", "TE", "GS", "FS", "CS", + }; + + generate_sha1(source, sha); + return ralloc_asprintf(NULL, "%s/%s_%s.glsl", path, types[stage], sha); +} + +/** + * Write given shader source to a file in MESA_SHADER_DUMP_PATH. + */ +static void +dump_shader(const gl_shader_stage stage, const char *source) +{ + static bool path_exists = true; + char *dump_path; + FILE *f; + + if (!path_exists) + return; + + dump_path = getenv("MESA_SHADER_DUMP_PATH"); + if (!dump_path) { + path_exists = false; + return; + } + + char *name = construct_name(stage, source, dump_path); + + f = fopen(name, "w"); + if (f) { + fputs(source, f); + fclose(f); + } else { + GET_CURRENT_CONTEXT(ctx); + _mesa_warning(ctx, "could not open %s for dumping shader (%s)", name, +strerror(errno)); + } + ralloc_free(name); +} + +/** + * Read shader source code from a file. + * Useful for debugging to override an app's shader. + */ +static GLcharARB * +read_shader(const gl_shader_stage stage, const char *source) +{ + char *read_path; + static bool path_exists = true; + int len, shader_size = 0; + GLcharARB *buffer; + FILE *f; + + if (!path_exists) + return NULL; + + read_path = getenv("MESA_SHADER_READ_PATH"); + if (!read_path) { + path_exists = false; + return NULL; + } + + char *name = construct_name(stage, source, read_path); + f = fopen(name, "r"); + ralloc_free(name); + if (!f) + return NULL; + + /* allocate enough room for the entire shader */ + fseek(f, 0, SEEK_END); + shader_size = ftell(f); + rewind(f); + assert(shader_size); + + /* add one for terminating zero */ + shader_size++; + + buffer = malloc(shader_size); + assert(buffer); + + len = fread(buffer, 1, shader_size, f); + buffer[len] = 0; + + fclose(f); + + return buffer; +} + +#endif /* ENABLE_SHADER_CACHE */ + /** * Return mask of GLSL_x flags by examining the MESA_GLSL env var. */ @@ -125,11 +241,10 @@ _mesa_init_shader_state(struct gl_context *ctx) /* Device drivers may override these to control what kind of instructions * are generated by the GLSL compiler. */ - struct gl_shader_compiler_options options; + struct gl_shader_compiler_options options = {}; gl_shader_stage sh; int i; - memset(&options, 0, sizeof(options)); options.MaxUnrollIterations = 32; options.MaxIfDepth = UINT_MAX; @@ -138,7 +253,7 @@ _mesa_init_shader_state(struct gl_context *ctx) ctx->Shader.Flags = _mesa_get_shader_flags(); - if (ctx->Shader.Flags != 0) + if (ctx->Shader.Flags) ctx->Const.GenerateTemporaryNames = true; /* Extended for ARB_separate_shader_objects */ @@ -655,7 +770,8 @@ get_programiv(struct gl_context *ctx, GLuint program, GLenum pname, GLint *params) { struct gl_shader_program *shProg - = _mesa_lookup_shader_program_err(ctx, program, "glGetProgramiv(program)"); + = _mesa_lookup_shader_program_err(ctx, program, +"glGetProgramiv(program)"); /* Is transform feedback available in this context? */ @@ -837,7 +953,7 @@ get_programiv(struct gl_context *ctx, GLuint program, GLenum pname, *params = shProg->BinaryRetreivableHint; return; case GL_PROGRAM_BINARY_LENGTH: - if (ctx->Const.NumProgramBinaryFormats == 0) { + if (!ctx->Const.NumProgramBinaryFormats) { *params = 0;
Re: [Mesa-dev] [PATCH v3 00/13] TGSI: improved live range tracking, also including arrays
The patches apply cleanly, however I just did a shader-db test run and can't find a difference with your patch applied, am I doing something wrong? compile times went up though: before: Thread 3 took 113.72 seconds and compiled 17899 shaders (not including SIMD16) with 2232 GL context switches Thread 5 took 113.23 seconds and compiled 17767 shaders (not including SIMD16) with 2150 GL context switches Thread 7 took 116.63 seconds and compiled 18030 shaders (not including SIMD16) with 2219 GL context switches Thread 1 took 117.10 seconds and compiled 17966 shaders (not including SIMD16) with 2154 GL context switches Thread 4 took 113.76 seconds and compiled 18097 shaders (not including SIMD16) with 2285 GL context switches Thread 2 took 113.61 seconds and compiled 17111 shaders (not including SIMD16) with 1934 GL context switches Thread 6 took 118.93 seconds and compiled 17887 shaders (not including SIMD16) with 2205 GL context switches Thread 0 took 112.91 seconds and compiled 18232 shaders (not including SIMD16) with 2321 GL context switches with your patch: Thread 1 took 119.41 seconds and compiled 18495 shaders (not including SIMD16) with 2237 GL context switches Thread 7 took 122.11 seconds and compiled 17228 shaders (not including SIMD16) with 2105 GL context switches Thread 4 took 120.57 seconds and compiled 17989 shaders (not including SIMD16) with 2165 GL context switches Thread 5 took 119.79 seconds and compiled 17709 shaders (not including SIMD16) with 2190 GL context switches Thread 6 took 121.95 seconds and compiled 17804 shaders (not including SIMD16) with 2209 GL context switches Thread 2 took 121.43 seconds and compiled 17819 shaders (not including SIMD16) with 2167 GL context switches Thread 0 took 117.16 seconds and compiled 18180 shaders (not including SIMD16) with 2151 GL context switches Thread 3 took 124.79 seconds and compiled 17765 shaders (not including SIMD16) with 2176 GL context switches Radeon RX 560 Series (POLARIS11, DRM 3.26.0, 4.17.0-rc2+, LLVM 6.0.0) Mesa git of today PERCENTAGE DELTASShaders SGPRs VGPRs SpillSGPR SpillVGPR PrivVGPR Scratch CodeSize MaxWaves Waits 0ad6 . . . . . . . . . aer 590 . . . . . . . . . alien_isolation 1414 . . . . . . . . . anholt10 . . . . . . . . . bioshock_infinite 2581 . . . . . . . . . blackmesa584 . . . . . . . . . cat 573 . . . . . . . . . csgo1392 . . . . . . . . . deadisland_definitive 1776 . . . . . . . . . deadisland_original11602 . . . . . . . . . deadisland_riptide_..293 . . . . . . . . . deusex_mankind 5051 . . . . . . . . . dirtrally787 . . . . . . . . . dolphin 22 . . . . . . . . . dyinglight 4012 . . . . . . . . . eurotruck2 216 . . . . . . . . . f1_2015 746 . . . . . . . . . glamor16 . . . . . . . . . hl2ep1 294 . . . . . . . . . hl2ep2 154 . . . . . . . . . hl2lostcoast 66 . . . . . . . . . hlsl3582 . . . . . . . . . humus-celshading 4 . . . . . . . . . humus-domino 6 . . . . . . . . . humus-dyna
Re: [Mesa-dev] [PATCH v3 00/13] TGSI: improved live range tracking, also including arrays
Hi Gert Am 28.04.2018 um 23:51 schrieb Gert Wollny: > Am Samstag, den 28.04.2018, 22:43 +0200 schrieb Benedikt Schemmer: >> The patches apply cleanly, however I just did a shader-db test run >> and can't find a difference with your patch >> applied, am I doing something wrong? > > AFAIK radeonsi doesn't use the register-merge optimizer in TGSI. > Ah, ok. Was wondering why your debug code doesn't output anything. Makes sense now ;) So is this useless on radeonsi? Seemed interesting to me. >> >> compile times went up though: > This is strange, because "see above". Did you compile with debug > information and c++11 or higher enables? In this case there is one > access to a static variable (first patch) that per c++11 standard > should be thread save, which means that there might be a mutex > protecting access to that variable, and this would explain the longer > runtime in a multi-threaded environment. > not intentionally: prefix: /usr/local exec_prefix: ${prefix} libdir: ${prefix}/lib includedir: ${prefix}/include OpenGL: yes (ES1: yes ES2: yes) OSMesa: libOSMesa DRI platform:drm DRI drivers: i915 i965 nouveau r200 radeon swrast DRI driver dir: ${prefix}/lib/dri GLX: DRI-based EGL: yes EGL drivers: builtin:egl_dri2 builtin:egl_dri3 GBM: yes EGL/Vulkan/VL platforms: x11 wayland drm Vulkan drivers: intel radeon Vulkan ICD dir: ${datarootdir}/vulkan/icd.d llvm:yes llvm-config: llvm-config-6.0 llvm-version:6.0.0 Gallium drivers: nouveau svga r600 r300 i915 virgl radeonsi swrast Gallium st: mesa xa xvmc vdpau omx_bellagio va nine clover HUD extra stats: yes HUD lmsensors: yes Shared libs: yes Static libs: no Shared-glapi:yes CFLAGS: -O3 -fstack-protector-strong -Wall -Wextra -Werror=format-security -fno-omit-frame-pointer -Wall -Werror=implicit-function-declaration -Werror=missing-prototypes -Wmissing-prototypes -fno-math-errno -fno-trapping-math -std=c99 CXXFLAGS:-O3 -fstack-protector-strong -Wall -Wextra -Werror=format-security -fno-omit-frame-pointer -Wall -fno-math-errno -fno-trapping-math CXX11_CXXFLAGS: LDFLAGS: -Bsymbolic-functions -z relro Macros: -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -D_GNU_SOURCE -DUSE_SSE41 -DUSE_GCC_ATOMIC_BUILTINS -DNDEBUG -DTEXTURE_FLOAT_ENABLED -DUSE_X86_64_ASM -DHAVE_SYS_SYSCTL_H -DHAVE_STRTOF -DHAVE_MKOSTEMP -DHAVE_TIMESPEC_GET -DHAVE_STRTOD_L -DHAVE_DL_ITERATE_PHDR -DHAVE_POSIX_MEMALIGN -DHAVE_ZLIB -DHAVE_LINUX_FUTEX_H -DHAVE_GALLIUM_EXTRA_HUD=1 -DHAVE_LIBSENSORS=1 -DHAVE_LIBDRM -DGLX_USE_DRM -DGLX_INDIRECT_RENDERING -DGLX_DIRECT_RENDERING -DGLX_USE_TLS -DHAVE_X11_PLATFORM -DHAVE_WAYLAND_PLATFORM -DWL_HIDE_DEPRECATED -DHAVE_DRM_PLATFORM -DHAVE_DRI3 -DHAVE_DRI3_MODIFIERS -DENABLE_SHADER_CACHE -DHAVE_MINCORE -DHAVE_ST_VDPAU -DHAVE_LLVM=0x0600 -DMESA_LLVM_VERSION_PATCH=0 LLVM_CFLAGS: -I/usr/lib/llvm-6.0/include -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS LLVM_CXXFLAGS: -I/usr/lib/llvm-6.0/include -std=c++0x -std=c++11 -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS LLVM_CPPFLAGS: -I/usr/lib/llvm-6.0/include -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS LLVM_LDFLAGS:-L/usr/lib/llvm-6.0/lib PYTHON2: python2.7 Run 'make' to build Mesa Cheers, Benedikt > Best, > Gert > >> >> before: >> Thread 3 took 113.72 seconds and compiled 17899 shaders (not >> including SIMD16) with 2232 GL context switches >> Thread 5 took 113.23 seconds and compiled 17767 shaders (not >> including SIMD16) with 2150 GL context switches >> Thread 7 took 116.63 seconds and compiled 18030 shaders (not >> including SIMD16) with 2219 GL context switches >> Thread 1 took 117.10 seconds and compiled 17966 shaders (not >> including SIMD16) with 2154 GL context switches >> Thread 4 took 113.76 seconds and compiled 18097 shaders (not >> including SIMD16) with 2285 GL context switches >> Thread 2 took 113.61 seconds and compiled 17111 shaders (not >> including SIMD16) with 1934 GL context switches >> Thread 6 took 118.93 seconds and compiled 17887 shaders (not >> including SIMD16) with 2205 GL context switches >> Thread 0 took 112.91 seconds and compiled 18232 shaders (not >> including SIMD16) with 2321 GL context switches >> >> with your patch: >> Thread 1 took 119
Re: [Mesa-dev] [PATCH v3 00/13] TGSI: improved live range tracking, also including arrays
. . . . warsow 176 . . . . . . . . . warzone21004 . . . . . . . . . witcher2 928 -0.07 %0.06 % . . . .0.04 % . . x3_albion641 . . . . . . . . . xblades 208 . . . . . .0.42 % . . xcom1020 -0.10 % . . . . .0.28 % . . xcom2 1439 . . . . . . . . . yofrankie 82 . . . . . . . . . -- All affected63940.04 %0.16 %0.46 % .7.41 % 6.67 %0.51 % -0.09 % . -- Total 52662 .0.03 %0.26 % .1.34 % 1.09 %0.13 % -0.01 % . If theres an easy way to figure out when your code makes it worse and when its an improvement this would be really nice. Really interesting. Cheers, Benedikt Am 29.04.2018 um 09:55 schrieb Gert Wollny: > Hello Benedikt, > > Am Sonntag, den 29.04.2018, 00:06 +0200 schrieb Benedikt Schemmer: >> Hi Gert >> >> Am 28.04.2018 um 23:51 schrieb Gert Wollny: >>> Am Samstag, den 28.04.2018, 22:43 +0200 schrieb Benedikt Schemmer: >>>> The patches apply cleanly, however I just did a shader-db test >>>> run >>>> and can't find a difference with your patch >>>> applied, am I doing something wrong? >>> >>> AFAIK radeonsi doesn't use the register-merge optimizer in TGSI. >>> >> >> Ah, ok. Was wondering why your debug code doesn't output anything. >> Makes sense now ;) > Not exactly, the reason there is no output is because -DNDEBUG is set. > Without it the statistics should also be printed out on radeonsi, but > thinking of it I should probably disable it when register_merge is not > accessed, because without this the numbers will be inflated and don't > have much meaning. > >> So is this useless on radeonsi? > Indeed. > >> Seemed interesting to me. > :) it certainly helps on r600 > > >>>> compile times went up though: >>> >>> This is strange, because "see above". Did you compile with debug >>> information and c++11 or higher enables? > ... >>> >>> >> not intentionally: > > Then you should actually not run any code that this series adds to > mesa. I checked again, apart from the debugging output nothing will > ever be run if a drivers that report > PIPE_SHADER_CAP_TGSI_SKIP_MERGE_REGISTERS != 0 (as does radeonsi). > > Best, > Gert > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v3 00/13] TGSI: improved live range tracking, also including arrays
Am 29.04.2018 um 11:34 schrieb Gert Wollny: > Am Sonntag, den 29.04.2018, 10:43 +0200 schrieb Benedikt Schemmer: >> Hi Gert, >> >> couldn't resist at least to try what would happen if I enable >> register merge for radeonsi: >> >> PERCENTAGE DELTASShaders SGPRs VGPRs SpillSGPR >> SpillVGPR PrivVGPR Scratch CodeSize MaxWaves Waits >> piglit 80732 -0.16 % -0.02 >> % . .0.87 %0.86 %0.04 % . . >> -- >> >> All affected 513 -17.58 % -2.30 >> % . .4.12 %5.87 %1.73 %0.10 % . >> -- >> >> Total 80732 -0.16 % -0.02 >> % . .0.87 %0.86 %0.04 % . . >> >> I had already removed the defines around the debug code so thats also >> happily outputting data. >> >> fails with two piglit shaders: > Which are the names of these test? I'd like to check this on r600, > because here I didn't see any regressions last time I checked. > might of course be different on r600 (is bindless available?), also shader-db is more sensitive to problems than piglit 1. tests/spec/arb_bindless_texture/compiler/images/arrays-of-struct.frag 2. tests/spec/arb_bindless_texture/compiler/samplers/arrays-of-struct.frag > >> Real world is a little different: > > I guess these tests refer to enabled register_merge - without and with > this patch set, no? > > Out of curiosity, did you also look at how enabling register_merge > (before this series) impacts the result as compared to the normal > operation of radeonsi? no I didn't but if I do (old vs new) nothing vs register merge: PERCENTAGE DELTASShaders SGPRs VGPRs SpillSGPR SpillVGPR PrivVGPR Scratch CodeSize MaxWaves Waits piglit 80732 . . . . . .0.06 % . . -- All affected 4350.89 %0.36 % . . . .3.61 % -0.06 % . -- Total 80732 . . . . . .0.06 % . . register merge vs yours: PERCENTAGE DELTASShaders SGPRs VGPRs SpillSGPR SpillVGPR PrivVGPR Scratch CodeSize MaxWaves Waits piglit 80732 -0.16 % -0.02 % . .0.87 % 0.86 % -0.02 % . . -- All affected 83 -56.92 % -14.22 % . . 11.67 % 16.67 % -2.86 %0.93 % . -- Total 80732 -0.16 % -0.02 % . .0.87 % 0.86 % -0.02 % . . nothing vs yours: PERCENTAGE DELTASShaders SGPRs VGPRs SpillSGPR SpillVGPR PrivVGPR Scratch CodeSize MaxWaves Waits piglit 80732 -0.16 % -0.02 % . .0.87 % 0.86 %0.04 % . . -- All affected 513 -17.58 % -2.30 % . .4.12 % 5.87 %1.73 %0.10 % . -- Total 80732 -0.16 % -0.02 % . .0.87 % 0.86 %0.04 % . . > > >> If theres an easy way to figure out when your code makes it worse and >> when its an improvement this would be really nice. > > My insentive for this series was, that on r600 the arrays are allocated > before the final optimization pass on the byte code that requires that > the number of registers is <= 124. When I started this no spilling was > implemented, and shaders with too many arrays and registers would > simply fail. Now spilling is impelmented, but AFAIK reducing the > numbers of registers in the final optimization pass does not result in > changed spilling, so bringing down the number of registers before tgsi- > to-bytecode is still of i
Re: [Mesa-dev] [PATCH v3 00/13] TGSI: improved live range tracking, also including arrays
. . . . . . . . . -- All affected6226 -0.10 % -0.11 % -0.83 % . . . -0.47 %0.05 % . -- Total 52662 -0.01 % -0.02 % -0.45 % . . . -0.12 % . . Am 29.04.2018 um 11:34 schrieb Gert Wollny: > Am Sonntag, den 29.04.2018, 10:43 +0200 schrieb Benedikt Schemmer: >> Hi Gert, >> >> couldn't resist at least to try what would happen if I enable >> register merge for radeonsi: >> >> PERCENTAGE DELTASShaders SGPRs VGPRs SpillSGPR >> SpillVGPR PrivVGPR Scratch CodeSize MaxWaves Waits >> piglit 80732 -0.16 % -0.02 >> % . .0.87 %0.86 %0.04 % . . >> -- >> >> All affected 513 -17.58 % -2.30 >> % . .4.12 %5.87 %1.73 %0.10 % . >> -- >> >> Total 80732 -0.16 % -0.02 >> % . .0.87 %0.86 %0.04 % . . >> >> I had already removed the defines around the debug code so thats also >> happily outputting data. >> >> fails with two piglit shaders: > Which are the names of these test? I'd like to check this on r600, > because here I didn't see any regressions last time I checked. > > >> Real world is a little different: > > I guess these tests refer to enabled register_merge - without and with > this patch set, no? > > Out of curiosity, did you also look at how enabling register_merge > (before this series) impacts the result as compared to the normal > operation of radeonsi? > > >> If theres an easy way to figure out when your code makes it worse and >> when its an improvement this would be really nice. > > My insentive for this series was, that on r600 the arrays are allocated > before the final optimization pass on the byte code that requires that > the number of registers is <= 124. When I started this no spilling was > implemented, and shaders with too many arrays and registers would > simply fail. Now spilling is impelmented, but AFAIK reducing the > numbers of registers in the final optimization pass does not result in > changed spilling, so bringing down the number of registers before tgsi- > to-bytecode is still of interest. > > For radeonsi my guess would be that the llvm optimizer works better > when the registers are not yet merged, and that would be the reason why > register_merge is disabled. > > In any case, Timothy wrote in this thread [1] (last message) that he > had similar patches for NIR. > > Best, > Gert > > [1] https://patchwork.freedesktop.org/patch/189842/ > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v3 00/13] TGSI: improved live range tracking, also including arrays
. . . humus-portals 2 . . . . . . . . . humus-volumetricfog.. 6 . . . . . . . . . kerbal 1016 . . . . . . . . . larago 664 . . . . . . . . . madmax 3540.04 % . . . . . . . . metro2033redux 4410 .0.03 % . . . . . -0.02 % . nexuiz80 . . . . . . . . . ruiner 685 -0.10 % -0.06 % . . . .0.04 %0.04 % . sauerbraten7 . . . . . . . . . serioussam2017 7360.03 % -0.07 % -0.47 % . . . .0.06 % . soma 436 . . . . . . . . . specops 1814 . . . . . . . . . stellaris434 . . . . . . . . . supertuxkart 4 . . . . . . . . . talos762 . . . . . . . . . tesseract430 . . . . . . . . . tombraider 1012 -0.03 % . . . . .0.02 % . . total_war_shogun_2 176 . . . . . . . . . total_war_warhammer 218 . . . . . .0.71 % . . ubershaders 54 . . . . . .0.57 % . . ug_gettysburg149 . . . . . . . . . unigine_heaven 226 . . . . . . . . . unigine_superposition733 . . . . . . . . . unigine_valley 288 . . . . . . . . . unity 72 . . . . . . . . . w40kdawn2421 . . . . . . -0.03 % . . w40kdawn3164 . . . . . . . . . warsow 176 . . . . . . . . . warzone21004 . . . . . . . . . witcher2 928 -0.02 %0.02 % . . . . . . . x3_albion641 . . . . . . . . . xblades 208 . . . . . .0.02 % . . xcom1020 . . . . . . . . . xcom2 1439 . . . . . . . . . yofrankie 82 . . . . . . . . . -- All affected 516 -0.52 %0.41 % -0.58 % .7.41 % 6.67 %0.26 % -0.89 % . -- Total 52662 .0.01 % -0.19 % .1.34 % 1.09 %0.01 % . . Am 29.04.2018 um 11:34 schrieb Gert Wollny: > Am Sonntag, den 29.04.2018, 10:43 +0200 schrieb Benedikt Schemmer: >> Hi Gert, >> >> couldn't resist at least to try what would happen if I enable >> register merge for radeonsi: >> >> PERCENTAGE DELTASShaders SGPRs VGPRs SpillSGPR >> SpillVGPR PrivVGPR Scratch CodeSize MaxWaves
Re: [Mesa-dev] [PATCH v3 00/13] TGSI: improved live range tracking, also including arrays
Am 29.04.2018 um 21:44 schrieb Gert Wollny: > Hello Benedict, > > thanks for all the testing! thanks for all the developing ;) > > On 29.04.2018 12:12, Benedikt Schemmer wrote: >>> Which are the names of these test? I'd like to check this on r600, >>> because here I didn't see any regressions last time I checked. >>> >> might of course be different on r600 (is bindless available?), >> also shader-db is more sensitive to problems than piglit >> >> 1. tests/spec/arb_bindless_texture/compiler/images/arrays-of-struct.frag >> 2. tests/spec/arb_bindless_texture/compiler/samplers/arrays-of-struct.frag > Indeed, bindless testures are not available on r600, so it is quite > difficult for me to test this. I would guess that parameters related to > this might be stored in the TGSI declaration that I currently don't check. > > If you have time for it, could you send me a TGSI dump of one of these > shaders? > With "ST_DEBUG=tgsi" this should be possible. I created 'R600_DEBUG=merge' so I can switch without having to recompile. $ST_DEBUG=tgsi ./run wollny/ ATTENTION: default value of option allow_glsl_extension_directive_midshader overridden by environment. FRAG DCL TEMP[0..55], ARRAY(1), LOCAL IMM[0] INT32 {0, 0, 0, 0} IMM[1] FLT32 {1., 2., 3., 4.} 0: STORE TEMP[0], IMM[0]., IMM[1], 2D 1: END wollny/41d88325fd2a57cb6af40de02dc281ee0683cc40_2.shader_test - LLVM diagnostic (remark): :0:0: 12 instructions in function wollny/41d88325fd2a57cb6af40de02dc281ee0683cc40_2.shader_test - Shader Stats: SGPRS: 16 VGPRS: 16 Code Size: 80 LDS: 0 Scratch: 0 Max Waves: 8 Spilled SGPRs: 0 Spilled VGPRs: 0 PrivMem VGPRs: 0 FRAG DCL OUT[0], COLOR DCL TEMP[0..1], LOCAL DCL TEMP[2..57], ARRAY(1), LOCAL IMM[0] FLT32 {0., 0., 0., 0.} 0: MOV TEMP[0].xy, IMM[0]. 1: TEX TEMP[1], TEMP[0], TEMP[2].xyxy, 2D 2: MOV OUT[0], TEMP[1] 3: END wollny/a115868a349cd666b842a0e70f47451b4463903a_2.shader_test - LLVM diagnostic (remark): :0:0: 11 instructions in function wollny/a115868a349cd666b842a0e70f47451b4463903a_2.shader_test - Shader Stats: SGPRS: 24 VGPRS: 16 Code Size: 72 LDS: 0 Scratch: 0 Max Waves: 8 Spilled SGPRs: 0 Spilled VGPRs: 0 PrivMem VGPRs: 0 Thread 0 took 0.13 seconds and compiled 2 shaders (not including SIMD16) with 1 GL context switches $R600_DEBUG=merge ST_DEBUG=tgsi ./run wollny/ ATTENTION: default value of option allow_glsl_extension_directive_midshader overridden by environment. run: state_tracker/st_glsl_to_tgsi.cpp:5783: ureg_dst dst_register(st_translate*, gl_register_file, unsigned int, unsigned int): Assertion `array_id && array_id <= t->num_temp_arrays' failed. => CRASHED <= while processing these shaders: wollny/41d88325fd2a57cb6af40de02dc281ee0683cc40_2.shader_test >> >>> For radeonsi my guess would be that the llvm optimizer works better >>> when the registers are not yet merged, and that would be the reason why >>> register_merge is disabled. >> well at least sometimes it doesn't, low hanging fruit maybe? > Unfortunately, I can't test on radeonsi I can, if you dont mind waiting for an answer sometimes. But maybe even easier: is there an implicit/explicit magic number I can play with to see if it changes anything? ATM it seems like your code improves half the shaders its affecting a lot and hurting the other half bad like it hits an invisible wall. I think one problem could be the relationship between VGPRs and SGPRs used and max Wavefronts achieved. This is somewhat similar to NIR although that changes things all over the place. > > Best, > Gert > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCHSET] Allow capture of shaders that would otherwise be overwritten
When capturing shaders with MESA_SHADER_CAPTURE_PATH, shaders can and will be overwritten if they have the same program number. This happens when you're capturing shaders from games especially after changing graphics settings and piglit shaders (I use that as a convenient way for testing radeonsi NIR). This probably hasn't been noticed before. The first set of patches changes the way captured shaders are named by adding the sha as the first part of the name, making shaders uniquely identifiable. The second set changes the layout in shader-db/si-report to account for the longer filenames. It could be applied independently, but is kind of required for the first set. [PATCH 1/3] mesa/main/shaderapi: Use generate_sha1() unconditionally [PATCH 2/3] mesa/main/shaderapi: purely non-functional cleanups like whitespace errors and cleanups [PATCH 3/3] Change from programnumber.shader_test naming to sha_programnumber.shader_test [PATCH 1/2] shader-db/si-report: Change layout to allow longer file names [PATCH 2/2] shader-db/si-report: Show biggest improvements also P.S. I don't have commit rights so please kindly review and push ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/3] mesa/main/shaderapi: Use sha as part of captured shaders name
It is inconvenient to capture shaders by program name alone because this does not allow to capture shaders that get overwritten by shaders with the same program name (ie games when you change settings or piglit). --- src/mesa/main/shaderapi.c | 47 --- 1 file changed, 32 insertions(+), 15 deletions(-) diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c index 1d0ca5374b..65cf0a67a2 100644 --- a/src/mesa/main/shaderapi.c +++ b/src/mesa/main/shaderapi.c @@ -1343,29 +1343,46 @@ link_program(struct gl_context *ctx, struct gl_shader_program *shProg, /* Capture .shader_test files. */ const char *capture_path = _mesa_get_shader_capture_path(); if (shProg->Name != 0 && shProg->Name != ~0 && capture_path != NULL) { + FILE *file; - char *filename = ralloc_asprintf(NULL, "%s/%u.shader_test", - capture_path, shProg->Name); + char *filename = NULL; + char *fsource = NULL; + char *ftemp = NULL; + + asprintf(&fsource, "[require]\nGLSL%s >= %u.%02u\n", + shProg->IsES ? " ES" : "", + shProg->data->Version / 100, shProg->data->Version % 100); + + if (shProg->SeparateShader) { + ftemp = fsource; + asprintf(&fsource, "%sGL_ARB_separate_shader_objects\nSSO ENABLED\n", + ftemp); + free(ftemp); + } + + for (unsigned i = 0; i < shProg->NumShaders; i++) { + ftemp = fsource; + asprintf(&fsource, "%s\n[%s shader]\n%s\n", ftemp, + _mesa_shader_stage_to_string(shProg->Shaders[i]->Stage), + shProg->Shaders[i]->Source); + free(ftemp); + } + + char shabuf[64] = {"mylittlebunny"}; + generate_sha1(fsource, shabuf); + + asprintf(&filename, "%s/%s_%u.shader_test", capture_path, + shabuf, shProg->Name); file = fopen(filename, "w"); if (file) { - fprintf(file, "[require]\nGLSL%s >= %u.%02u\n", - shProg->IsES ? " ES" : "", - shProg->data->Version / 100, shProg->data->Version % 100); - if (shProg->SeparateShader) -fprintf(file, "GL_ARB_separate_shader_objects\nSSO ENABLED\n"); - fprintf(file, "\n"); - - for (unsigned i = 0; i < shProg->NumShaders; i++) { -fprintf(file, "[%s shader]\n%s\n", -_mesa_shader_stage_to_string(shProg->Shaders[i]->Stage), -shProg->Shaders[i]->Source); - } + fprintf(file, "%s", fsource); fclose(file); } else { _mesa_warning(ctx, "Failed to open %s", filename); } - ralloc_free(filename); + free(filename); + free(fsource); } if (shProg->data->LinkStatus == LINKING_FAILURE && -- 2.14.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 3/3] mesa/main/shaderapi: purely non-functional cleanups, like whitespace errors and cleanups
remove a memset too and yes, this is all functionally identical --- src/mesa/main/shaderapi.c | 40 1 file changed, 20 insertions(+), 20 deletions(-) diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c index e8acca4490..1d0ca5374b 100644 --- a/src/mesa/main/shaderapi.c +++ b/src/mesa/main/shaderapi.c @@ -241,11 +241,10 @@ _mesa_init_shader_state(struct gl_context *ctx) /* Device drivers may override these to control what kind of instructions * are generated by the GLSL compiler. */ - struct gl_shader_compiler_options options; + struct gl_shader_compiler_options options = {}; gl_shader_stage sh; int i; - memset(&options, 0, sizeof(options)); options.MaxUnrollIterations = 32; options.MaxIfDepth = UINT_MAX; @@ -254,7 +253,7 @@ _mesa_init_shader_state(struct gl_context *ctx) ctx->Shader.Flags = _mesa_get_shader_flags(); - if (ctx->Shader.Flags != 0) + if (ctx->Shader.Flags) ctx->Const.GenerateTemporaryNames = true; /* Extended for ARB_separate_shader_objects */ @@ -771,7 +770,8 @@ get_programiv(struct gl_context *ctx, GLuint program, GLenum pname, GLint *params) { struct gl_shader_program *shProg - = _mesa_lookup_shader_program_err(ctx, program, "glGetProgramiv(program)"); + = _mesa_lookup_shader_program_err(ctx, program, +"glGetProgramiv(program)"); /* Is transform feedback available in this context? */ @@ -953,7 +953,7 @@ get_programiv(struct gl_context *ctx, GLuint program, GLenum pname, *params = shProg->BinaryRetreivableHint; return; case GL_PROGRAM_BINARY_LENGTH: - if (ctx->Const.NumProgramBinaryFormats == 0) { + if (!ctx->Const.NumProgramBinaryFormats) { *params = 0; } else { _mesa_get_program_binary_length(ctx, shProg, params); @@ -974,7 +974,7 @@ get_programiv(struct gl_context *ctx, GLuint program, GLenum pname, "linked)"); return; } - if (shProg->_LinkedShaders[MESA_SHADER_COMPUTE] == NULL) { + if (!shProg->_LinkedShaders[MESA_SHADER_COMPUTE]) { _mesa_error(ctx, GL_INVALID_OPERATION, "glGetProgramiv(no compute " "shaders)"); return; @@ -1234,7 +1234,7 @@ _mesa_compile_shader(struct gl_context *ctx, struct gl_shader *sh) } else { if (ctx->_Shader->Flags & GLSL_DUMP) { _mesa_log("GLSL source for %s shader %d:\n", - _mesa_shader_stage_to_string(sh->Stage), sh->Name); + _mesa_shader_stage_to_string(sh->Stage), sh->Name); _mesa_log("%s\n", sh->Source); } @@ -1381,13 +1381,13 @@ link_program(struct gl_context *ctx, struct gl_shader_program *shProg, GLuint i; printf("Link %u shaders in program %u: %s\n", - shProg->NumShaders, shProg->Name, - shProg->data->LinkStatus ? "Success" : "Failed"); + shProg->NumShaders, shProg->Name, + shProg->data->LinkStatus ? "Success" : "Failed"); for (i = 0; i < shProg->NumShaders; i++) { printf(" shader %u, stage %u\n", - shProg->Shaders[i]->Name, - shProg->Shaders[i]->Stage); +shProg->Shaders[i]->Name, +shProg->Shaders[i]->Stage); } } } @@ -1460,7 +1460,7 @@ void _mesa_active_program(struct gl_context *ctx, struct gl_shader_program *shProg, const char *caller) { - if ((shProg != NULL) && !shProg->data->LinkStatus) { + if ((shProg) && !shProg->data->LinkStatus) { _mesa_error(ctx, GL_INVALID_OPERATION, "%s(program %u not linked)", caller, shProg->Name); return; @@ -1794,7 +1794,7 @@ void GLAPIENTRY _mesa_GetObjectParameterfvARB(GLhandleARB object, GLenum pname, GLfloat *params) { - GLint iparams[1] = {0}; /* XXX is one element enough? */ + GLint iparams[1] = {}; /* XXX is one element enough? */ _mesa_GetObjectParameterivARB(object, pname, iparams); params[0] = (GLfloat) iparams[0]; } @@ -1912,7 +1912,7 @@ shader_source(struct gl_context *ctx, GLuint shaderObj, GLsizei count, if (!sh) return; - if (string == NULL) { + if (!string) { _mesa_error(ctx, GL_INVALID_VALUE, "glShaderSourceARB"); return; } @@ -1925,7 +1925,7 @@ shader_source(struct gl_context *ctx, GLuint shaderObj, GLsizei count, * last element will be set to the total length of the source code. */ offsets = malloc(count * sizeof(GLint)); - if (offsets == NULL) { + if (!offsets) { _mesa_error(ctx, GL_OUT_OF_MEMORY, "glShaderSourceARB"); return; } @@ -1952,7 +1952,7 @@ shader_source(struct gl_context *ctx, GLuint shaderObj, GLsizei count, */ totalLength = offsets[count - 1] + 2; source = malloc(totalLength * sizeof(G
[Mesa-dev] [PATCH 1/3] mesa/main/shaderapi: Use generate_sha1() unconditionally
Move shader-cache code from back to front and make generate_sha1() usable unconditionally to avoid code duplication in the following patch --- src/mesa/main/shaderapi.c | 228 +++--- 1 file changed, 116 insertions(+), 112 deletions(-) diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c index 44b18af492..e8acca4490 100644 --- a/src/mesa/main/shaderapi.c +++ b/src/mesa/main/shaderapi.c @@ -64,6 +64,122 @@ #include "util/mesa-sha1.h" #include "util/crc32.h" + +/** + * Generate a SHA-1 hash value string for given source string. + */ +static void +generate_sha1(const char *source, char sha_str[64]) +{ + unsigned char sha[20]; + _mesa_sha1_compute(source, strlen(source), sha); + _mesa_sha1_format(sha_str, sha); +} + + +#ifdef ENABLE_SHADER_CACHE +/** + * Construct a full path for shader replacement functionality using + * following format: + * + * /_.glsl + */ +static char * +construct_name(const gl_shader_stage stage, const char *source, + const char *path) +{ + char sha[64]; + static const char *types[] = { + "VS", "TC", "TE", "GS", "FS", "CS", + }; + + generate_sha1(source, sha); + return ralloc_asprintf(NULL, "%s/%s_%s.glsl", path, types[stage], sha); +} + +/** + * Write given shader source to a file in MESA_SHADER_DUMP_PATH. + */ +static void +dump_shader(const gl_shader_stage stage, const char *source) +{ + static bool path_exists = true; + char *dump_path; + FILE *f; + + if (!path_exists) + return; + + dump_path = getenv("MESA_SHADER_DUMP_PATH"); + if (!dump_path) { + path_exists = false; + return; + } + + char *name = construct_name(stage, source, dump_path); + + f = fopen(name, "w"); + if (f) { + fputs(source, f); + fclose(f); + } else { + GET_CURRENT_CONTEXT(ctx); + _mesa_warning(ctx, "could not open %s for dumping shader (%s)", name, +strerror(errno)); + } + ralloc_free(name); +} + +/** + * Read shader source code from a file. + * Useful for debugging to override an app's shader. + */ +static GLcharARB * +read_shader(const gl_shader_stage stage, const char *source) +{ + char *read_path; + static bool path_exists = true; + int len, shader_size = 0; + GLcharARB *buffer; + FILE *f; + + if (!path_exists) + return NULL; + + read_path = getenv("MESA_SHADER_READ_PATH"); + if (!read_path) { + path_exists = false; + return NULL; + } + + char *name = construct_name(stage, source, read_path); + f = fopen(name, "r"); + ralloc_free(name); + if (!f) + return NULL; + + /* allocate enough room for the entire shader */ + fseek(f, 0, SEEK_END); + shader_size = ftell(f); + rewind(f); + assert(shader_size); + + /* add one for terminating zero */ + shader_size++; + + buffer = malloc(shader_size); + assert(buffer); + + len = fread(buffer, 1, shader_size, f); + buffer[len] = 0; + + fclose(f); + + return buffer; +} + +#endif /* ENABLE_SHADER_CACHE */ + /** * Return mask of GLSL_x flags by examining the MESA_GLSL env var. */ @@ -1775,119 +1891,7 @@ _mesa_LinkProgram(GLuint programObj) link_program_error(ctx, shProg); } -#ifdef ENABLE_SHADER_CACHE -/** - * Generate a SHA-1 hash value string for given source string. - */ -static void -generate_sha1(const char *source, char sha_str[64]) -{ - unsigned char sha[20]; - _mesa_sha1_compute(source, strlen(source), sha); - _mesa_sha1_format(sha_str, sha); -} - -/** - * Construct a full path for shader replacement functionality using - * following format: - * - * /_.glsl - */ -static char * -construct_name(const gl_shader_stage stage, const char *source, - const char *path) -{ - char sha[64]; - static const char *types[] = { - "VS", "TC", "TE", "GS", "FS", "CS", - }; - - generate_sha1(source, sha); - return ralloc_asprintf(NULL, "%s/%s_%s.glsl", path, types[stage], sha); -} - -/** - * Write given shader source to a file in MESA_SHADER_DUMP_PATH. - */ -static void -dump_shader(const gl_shader_stage stage, const char *source) -{ - static bool path_exists = true; - char *dump_path; - FILE *f; - - if (!path_exists) - return; - - dump_path = getenv("MESA_SHADER_DUMP_PATH"); - if (!dump_path) { - path_exists = false; - return; - } - - char *name = construct_name(stage, source, dump_path); - - f = fopen(name, "w"); - if (f) { - fputs(source, f); - fclose(f); - } else { - GET_CURRENT_CONTEXT(ctx); - _mesa_warning(ctx, "could not open %s for dumping shader (%s)", name, -strerror(errno)); - } - ralloc_free(name); -} - -/** - * Read shader source code from a file. - * Useful for debugging to override an app's shader. - */ -static GLcharARB * -read_shader(const gl_shader_stage stage, const char *source) -{ - char *read_path; - static bool path_exists = true; - int len, shader_size = 0; - GLcharARB *buffer; - FILE *f; -
[Mesa-dev] [PATCH 2/2] shader-db/si-report: Show biggest improvements also
Sometimes it is nice (and useful) to not just see if you've messed up but also if you've made an improvement. --- si-report.py | 24 1 file changed, 24 insertions(+) diff --git a/si-report.py b/si-report.py index fba652c..d0e0686 100755 --- a/si-report.py +++ b/si-report.py @@ -686,6 +686,30 @@ def print_tables(before_all_results, after_all_results): break if num > 0: print + +# biggest improvements +metrics = si_stats().metrics +for i in range(len(metrics)): +field = metrics[i][0] +num = 0 +more_is_better = metrics[i][0] == 'maxwaves' + +if more_is_better: +sort_key = lambda v: -v[1].diff.__dict__[field] +else: +sort_key = lambda v: v[1].diff.__dict__[field] + +for name, stats in sorted(shaders.items(), key = sort_key): +if more_is_better: +if stats.diff.__dict__[field] <= 0: +continue +else: +if stats.diff.__dict__[field] >= 0: +continue + +if num == 0: +print_yellow(" BIGGEST IMPROVEMENTS - {:64}".format(metrics[i][1])) +print_yellow(" Before After Delta Percentage") stats.print_regression(name, field) num += 1 if num == num_listed: -- 2.14.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] Subject: [PATCH 1/2] shader-db/si-report: Change layout to allow longer file names
When using sha as part of the captured shader name that name can get so long that part of it is cut off and not visible anymore (although this also happens when using long directory names etc.) So this is a general improvement. --- si-report.py | 72 ++-- 1 file changed, 46 insertions(+), 26 deletions(-) diff --git a/si-report.py b/si-report.py index 0756081..fba652c 100755 --- a/si-report.py +++ b/si-report.py @@ -482,39 +482,53 @@ class grouped_stats: def print_vgpr_spilling_app(self, name): if (self.after.spilled_vgprs > 0 or self.after.privmem_vgprs > 0): -print " {:22}{:6}{:10}{:10}{:10}".format( -name, +print " {:6}{:6} {:6} {:6} {:22}".format( self.num_shaders, self.after.spilled_vgprs, self.after.privmem_vgprs, -self.after.scratch_size) +self.after.scratch_size, +name) def print_one_shader_vgpr_spill(self, name): if (self.after.spilled_vgprs > 0 or self.after.privmem_vgprs > 0): -print " {:65}{:10}{:10}{:10}{:10}".format( -name, +print " {:6}{:6}{:6} {:6}{:22}".format( self.after.vgprs, self.after.spilled_vgprs, self.after.privmem_vgprs, -self.after.scratch_size) +self.after.scratch_size, +name) def print_sgpr_spilling_app(self, name): if self.after.spilled_sgprs > 0: -print " {:22}{:6}{:10}{:>9.1f}".format( -name, +print " {:6} {:6} {:>5.1f} {:22}".format( self.num_shaders, self.after.spilled_sgprs, -float(self.after.spilled_sgprs) / float(self.num_shaders)) +float(self.after.spilled_sgprs) / float(self.num_shaders), +name) def print_one_shader_sgpr_spill(self, name): if self.after.spilled_sgprs > 0: -print " {:65}{:10}{:10}".format( -name, +print " {:6}{:6} {:90}".format( self.after.sgprs, -self.after.spilled_sgprs) +self.after.spilled_sgprs, +name) def print_percentages(self, name): +print " {:6}{:6}{}{}{}{}{}{}{}{}{}".format( +name, +self.num_shaders, +format_percent_change(self.before.sgprs, self.after.sgprs), +format_percent_change(self.before.vgprs, self.after.vgprs), +format_percent_change(self.before.spilled_sgprs, self.after.spilled_sgprs), +format_percent_change(self.before.spilled_vgprs, self.after.spilled_vgprs), +format_percent_change(self.before.privmem_vgprs, self.after.privmem_vgprs), +format_percent_change(self.before.scratch_size, self.after.scratch_size), +format_percent_change(self.before.code_size, self.after.code_size), +format_percent_change(self.before.maxwaves, self.after.maxwaves, more_is_better = True), +format_percent_change(self.before.waitstates, self.after.waitstates)) + +def print_percentages_end(self, name): print " {:22}{:6}{}{}{}{}{}{}{}{}{}".format( name, self.num_shaders, @@ -530,14 +544,14 @@ class grouped_stats: def print_regression(self, name, field): more_is_better = field == "maxwaves" -print " {:65}{:10}{:10}{}{}".format( -name, +print " {:6}{:6}{}{} {:90}".format( self.before.__dict__[field], self.after.__dict__[field], format_table_cell(self.after.__dict__[field] - self.before.__dict__[field], more_is_better = more_is_better), format_percent_change(self.before.__dict__[field], self.after.__dict__[field], - more_is_better = more_is_better)) + more_is_better = more_is_better), +name) """ Return "filename [index]", because files can contain multiple shaders. @@ -608,8 +622,8 @@ def print_tables(before_all_results, after_all_results): sort_key = lambda v: -v[1].after.scratch_size for name, stats in sorted(shaders.items(), key = sort_key): if num == 0: -print_yellow(" WORST VGPR SPILLS (not deltas)" + (" " * 40) + - "VGPRs SpillVGPR PrivVGPR ScratchSize") +print_yellow("WORST VGPR SPILLS (not deltas)" + (" " * 64)) +print_yellow(" VGPRs Spills Private Scratch") stats.print_one_shader_vgpr_spill(name) num += 1 if num == num_listed: @@ -618,7 +632,7 @@ def print_tables(before_all_results, after_all_results): print # VGPR spilling apps -print_yellow(" VGPR SPILLING APPS Shaders S
Re: [Mesa-dev] [PATCH 1/3] mesa/main/shaderapi: Use generate_sha1() unconditionally
Thanks for reviewing! Am 17.05.2018 um 08:42 schrieb Tapani Pälli: > > > On 05/10/2018 12:05 PM, Benedikt Schemmer wrote: >> Move shader-cache code from back to front and make generate_sha1() usable >> unconditionally to avoid code duplication in the following patch >> >> --- >> src/mesa/main/shaderapi.c | 228 >> +++--- >> 1 file changed, 116 insertions(+), 112 deletions(-) >> >> diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c >> index 44b18af492..e8acca4490 100644 >> --- a/src/mesa/main/shaderapi.c >> +++ b/src/mesa/main/shaderapi.c >> @@ -64,6 +64,122 @@ >> #include "util/mesa-sha1.h" >> #include "util/crc32.h" >> >> + >> +/** >> + * Generate a SHA-1 hash value string for given source string. >> + */ >> +static void >> +generate_sha1(const char *source, char sha_str[64]) >> +{ >> + unsigned char sha[20]; >> + _mesa_sha1_compute(source, strlen(source), sha); >> + _mesa_sha1_format(sha_str, sha); >> +} > > There is one potential problem here. The 'ENABLE_SHADER_CACHE' guard for > generate_sha1 and others was placed there because the imported sha1 code > broke windows build, I'm wondering if this is still > the case? If so, then generate_sha1 should be inside ENABLE_SHADER_CACHE > guard. > I did a quick gedit $(grep -Rlsi "_mesa_sha1_compute" | grep -E "\.c|\.h") and it seems radv and anv use _mesa_sha1_compute (and _mesa_sha1_format) without a guard best example from Intel seems to be brw_disk_cache.c which uses it alot outside of the ENABLE_SHADER_CACHE guard so probably safe? >> + >> + >> +#ifdef ENABLE_SHADER_CACHE >> +/** >> + * Construct a full path for shader replacement functionality using >> + * following format: >> + * >> + * /_.glsl >> + */ >> +static char * >> +construct_name(const gl_shader_stage stage, const char *source, >> + const char *path) >> +{ >> + char sha[64]; >> + static const char *types[] = { >> + "VS", "TC", "TE", "GS", "FS", "CS", >> + }; >> + >> + generate_sha1(source, sha); >> + return ralloc_asprintf(NULL, "%s/%s_%s.glsl", path, types[stage], sha); >> +} >> + >> +/** >> + * Write given shader source to a file in MESA_SHADER_DUMP_PATH. >> + */ >> +static void >> +dump_shader(const gl_shader_stage stage, const char *source) >> +{ >> + static bool path_exists = true; >> + char *dump_path; >> + FILE *f; >> + >> + if (!path_exists) >> + return; >> + >> + dump_path = getenv("MESA_SHADER_DUMP_PATH"); >> + if (!dump_path) { >> + path_exists = false; >> + return; >> + } >> + >> + char *name = construct_name(stage, source, dump_path); >> + >> + f = fopen(name, "w"); >> + if (f) { >> + fputs(source, f); >> + fclose(f); >> + } else { >> + GET_CURRENT_CONTEXT(ctx); >> + _mesa_warning(ctx, "could not open %s for dumping shader (%s)", name, >> + strerror(errno)); >> + } >> + ralloc_free(name); >> +} >> + >> +/** >> + * Read shader source code from a file. >> + * Useful for debugging to override an app's shader. >> + */ >> +static GLcharARB * >> +read_shader(const gl_shader_stage stage, const char *source) >> +{ >> + char *read_path; >> + static bool path_exists = true; >> + int len, shader_size = 0; >> + GLcharARB *buffer; >> + FILE *f; >> + >> + if (!path_exists) >> + return NULL; >> + >> + read_path = getenv("MESA_SHADER_READ_PATH"); >> + if (!read_path) { >> + path_exists = false; >> + return NULL; >> + } >> + >> + char *name = construct_name(stage, source, read_path); >> + f = fopen(name, "r"); >> + ralloc_free(name); >> + if (!f) >> + return NULL; >> + >> + /* allocate enough room for the entire shader */ >> + fseek(f, 0, SEEK_END); >> + shader_size = ftell(f); >> + rewind(f); >> + assert(shader_size); >> + >> + /* add one for terminating zero */ >> + shader_size++; >> + >> + buffer = malloc(shader_size); >> + assert(buffer); >> + >> + len = fread(buffer, 1, shader_size, f); >> + buffer[le
Re: [Mesa-dev] [PATCH 3/3] mesa/main/shaderapi: purely non-functional cleanups, like whitespace errors and cleanups
Am 17.05.2018 um 08:59 schrieb Tapani Pälli: > > > On 05/10/2018 12:05 PM, Benedikt Schemmer wrote: >> remove a memset too and yes, this is all functionally identical >> >> --- >> src/mesa/main/shaderapi.c | 40 >> 1 file changed, 20 insertions(+), 20 deletions(-) >> >> diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c >> index e8acca4490..1d0ca5374b 100644 >> --- a/src/mesa/main/shaderapi.c >> +++ b/src/mesa/main/shaderapi.c >> @@ -241,11 +241,10 @@ _mesa_init_shader_state(struct gl_context *ctx) >> /* Device drivers may override these to control what kind of >> instructions >> * are generated by the GLSL compiler. >> */ >> - struct gl_shader_compiler_options options; >> + struct gl_shader_compiler_options options = {}; >> gl_shader_stage sh; >> int i; >> >> - memset(&options, 0, sizeof(options)); >> options.MaxUnrollIterations = 32; >> options.MaxIfDepth = UINT_MAX; >> >> @@ -254,7 +253,7 @@ _mesa_init_shader_state(struct gl_context *ctx) >> >> ctx->Shader.Flags = _mesa_get_shader_flags(); >> >> - if (ctx->Shader.Flags != 0) >> + if (ctx->Shader.Flags) >> ctx->Const.GenerateTemporaryNames = true; >> >> /* Extended for ARB_separate_shader_objects */ >> @@ -771,7 +770,8 @@ get_programiv(struct gl_context *ctx, GLuint program, >> GLenum pname, >> GLint *params) >> { >> struct gl_shader_program *shProg >> - = _mesa_lookup_shader_program_err(ctx, program, >> "glGetProgramiv(program)"); >> + = _mesa_lookup_shader_program_err(ctx, program, >> + "glGetProgramiv(program)"); >> >> /* Is transform feedback available in this context? >> */ >> @@ -953,7 +953,7 @@ get_programiv(struct gl_context *ctx, GLuint program, >> GLenum pname, >> *params = shProg->BinaryRetreivableHint; >> return; >> case GL_PROGRAM_BINARY_LENGTH: >> - if (ctx->Const.NumProgramBinaryFormats == 0) { >> + if (!ctx->Const.NumProgramBinaryFormats) { > > Maybe it's just me having some OCD but with these 'Num' constants I find it > much easier to read '== 0' than '!' (also below with NumProgramBinaryFormats > and NumSubroutineUniformRemapTable). > > I don't feel strong about this though so no need to change this. I dont have strong feelings about this either, I use a script to replace these things. In my opinion it just helps to see whether these comparisons have meaning. >=1 or ==0 don't really. If they do I just use a define to make it clear. Otherwise I find the ! easier to read and understand. > >> *params = 0; >> } else { >> _mesa_get_program_binary_length(ctx, shProg, params); >> @@ -974,7 +974,7 @@ get_programiv(struct gl_context *ctx, GLuint program, >> GLenum pname, >> "linked)"); >> return; >> } >> - if (shProg->_LinkedShaders[MESA_SHADER_COMPUTE] == NULL) { >> + if (!shProg->_LinkedShaders[MESA_SHADER_COMPUTE]) { >> _mesa_error(ctx, GL_INVALID_OPERATION, "glGetProgramiv(no compute >> " >> "shaders)"); >> return; >> @@ -1234,7 +1234,7 @@ _mesa_compile_shader(struct gl_context *ctx, struct >> gl_shader *sh) >> } else { >> if (ctx->_Shader->Flags & GLSL_DUMP) { >> _mesa_log("GLSL source for %s shader %d:\n", >> - _mesa_shader_stage_to_string(sh->Stage), sh->Name); >> + _mesa_shader_stage_to_string(sh->Stage), sh->Name); >> _mesa_log("%s\n", sh->Source); >> } >> >> @@ -1381,13 +1381,13 @@ link_program(struct gl_context *ctx, struct >> gl_shader_program *shProg, >> GLuint i; >> >> printf("Link %u shaders in program %u: %s\n", >> - shProg->NumShaders, shProg->Name, >> - shProg->data->LinkStatus ? "Success" : "Failed"); >> + shProg->NumShaders, shProg->Name, >> + shProg->data->LinkStatus ? "Success" : "Failed"); >> >> for (i = 0; i < shProg->NumShaders; i++) { >> pri
Re: [Mesa-dev] [PATCH 2/3] mesa/main/shaderapi: Use sha as part of captured shaders name
Am 17.05.2018 um 09:11 schrieb Tapani Pälli: > some nitpicking below .. I'll do some testing to see that things work as > before but overall these changes look good to me. Would be nice to hear > comments from active shader-db users. > > On 05/10/2018 12:05 PM, Benedikt Schemmer wrote: >> It is inconvenient to capture shaders by program name alone because this does >> not allow to capture shaders that get overwritten by shaders with the same >> program name (ie games when you change settings or piglit). >> >> --- >> src/mesa/main/shaderapi.c | 47 >> --- >> 1 file changed, 32 insertions(+), 15 deletions(-) >> >> diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c >> index 1d0ca5374b..65cf0a67a2 100644 >> --- a/src/mesa/main/shaderapi.c >> +++ b/src/mesa/main/shaderapi.c >> @@ -1343,29 +1343,46 @@ link_program(struct gl_context *ctx, struct >> gl_shader_program *shProg, >> /* Capture .shader_test files. */ >> const char *capture_path = _mesa_get_shader_capture_path(); >> if (shProg->Name != 0 && shProg->Name != ~0 && capture_path != NULL) { >> + > > extra space > >> FILE *file; >> - char *filename = ralloc_asprintf(NULL, "%s/%u.shader_test", >> - capture_path, shProg->Name); >> + char *filename = NULL; >> + char *fsource = NULL; >> + char *ftemp = NULL; >> + >> + asprintf(&fsource, "[require]\nGLSL%s >= %u.%02u\n", >> + shProg->IsES ? " ES" : "", >> + shProg->data->Version / 100, shProg->data->Version % 100); >> + >> + if (shProg->SeparateShader) { >> + ftemp = fsource; >> + asprintf(&fsource, "%sGL_ARB_separate_shader_objects\nSSO >> ENABLED\n", >> + ftemp); >> + free(ftemp); >> + } >> + >> + for (unsigned i = 0; i < shProg->NumShaders; i++) { >> + ftemp = fsource; >> + asprintf(&fsource, "%s\n[%s shader]\n%s\n", ftemp, >> + _mesa_shader_stage_to_string(shProg->Shaders[i]->Stage), >> + shProg->Shaders[i]->Source); >> + free(ftemp); >> + } >> + >> + char shabuf[64] = {"mylittlebunny"}; > > please remove the initialization, it is unnecessary That was left from before I used generate_sha1 which sometimes failed for some reason. I have one little thing though: Most of the time this is defined as shabuf[41] (20 bytes sha plus trailing \0 ) In this file it is 64 for some reason. Should I change that? > >> + generate_sha1(fsource, shabuf); >> + >> + asprintf(&filename, "%s/%s_%u.shader_test", capture_path, >> + shabuf, shProg->Name); >> file = fopen(filename, "w"); >> if (file) { >> - fprintf(file, "[require]\nGLSL%s >= %u.%02u\n", >> - shProg->IsES ? " ES" : "", >> - shProg->data->Version / 100, shProg->data->Version % 100); >> - if (shProg->SeparateShader) >> - fprintf(file, "GL_ARB_separate_shader_objects\nSSO ENABLED\n"); >> - fprintf(file, "\n"); >> - >> - for (unsigned i = 0; i < shProg->NumShaders; i++) { >> - fprintf(file, "[%s shader]\n%s\n", >> - _mesa_shader_stage_to_string(shProg->Shaders[i]->Stage), >> - shProg->Shaders[i]->Source); >> - } >> + fprintf(file, "%s", fsource); >> fclose(file); >> } else { >> _mesa_warning(ctx, "Failed to open %s", filename); >> } >> >> - ralloc_free(filename); >> + free(filename); >> + free(fsource); >> } >> >> if (shProg->data->LinkStatus == LINKING_FAILURE && >> ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/3] mesa/main/shaderapi: Use generate_sha1() unconditionally
Ok I found the commit. But it says: "Until we have a proper fix." So this would be a good time I guess ;) --- author Emil Velikov2017-01-18 19:40:31 + committer Emil Velikov 2017-01-18 20:09:01 + commit 9f8dc3bf03ec825bae7041858dda6ca2e9a34363 (patch) treeff9672995474d3c31f027fea8356cb5733e45388 parent d1efa09d342bff3e5def2978a0bef748d74f9c82 (diff) utils: build sha1/disk cache only with Android/Autoconf Earlier commit imported a SHA1 implementation and relaxed the SHA1 and disk cache handling, broking the Windows builds. Restrict things for now until we get to a proper fix. Fixes: d1efa09d342 "util: import sha1 implementation from OpenBSD" Signed-off-by: Emil Velikov Am 17.05.2018 um 12:28 schrieb Tapani Pälli: > > > On 05/17/2018 11:38 AM, Benedikt Schemmer wrote: >> Thanks for reviewing! >> >> Am 17.05.2018 um 08:42 schrieb Tapani Pälli: >>> >>> >>> On 05/10/2018 12:05 PM, Benedikt Schemmer wrote: >>>> Move shader-cache code from back to front and make generate_sha1() usable >>>> unconditionally to avoid code duplication in the following patch >>>> >>>> --- >>>> src/mesa/main/shaderapi.c | 228 >>>> +++--- >>>> 1 file changed, 116 insertions(+), 112 deletions(-) >>>> >>>> diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c >>>> index 44b18af492..e8acca4490 100644 >>>> --- a/src/mesa/main/shaderapi.c >>>> +++ b/src/mesa/main/shaderapi.c >>>> @@ -64,6 +64,122 @@ >>>> #include "util/mesa-sha1.h" >>>> #include "util/crc32.h" >>>> >>>> + >>>> +/** >>>> + * Generate a SHA-1 hash value string for given source string. >>>> + */ >>>> +static void >>>> +generate_sha1(const char *source, char sha_str[64]) >>>> +{ >>>> + unsigned char sha[20]; >>>> + _mesa_sha1_compute(source, strlen(source), sha); >>>> + _mesa_sha1_format(sha_str, sha); >>>> +} >>> >>> There is one potential problem here. The 'ENABLE_SHADER_CACHE' guard for >>> generate_sha1 and others was placed there because the imported sha1 code >>> broke windows build, I'm wondering if this is still >>> the case? If so, then generate_sha1 should be inside ENABLE_SHADER_CACHE >>> guard. >>> >> >> I did a quick >> gedit $(grep -Rlsi "_mesa_sha1_compute" | grep -E "\.c|\.h") >> >> and it seems radv and anv use _mesa_sha1_compute (and _mesa_sha1_format) >> without a guard >> best example from Intel seems to be brw_disk_cache.c which uses it alot >> outside of the ENABLE_SHADER_CACHE guard >> >> so probably safe? > > AFAIK none of those things are compiled on windows. IIUC windows builds are > about Mesa core and gallium side. > >>>> + >>>> + >>>> +#ifdef ENABLE_SHADER_CACHE >>>> +/** >>>> + * Construct a full path for shader replacement functionality using >>>> + * following format: >>>> + * >>>> + * /_.glsl >>>> + */ >>>> +static char * >>>> +construct_name(const gl_shader_stage stage, const char *source, >>>> + const char *path) >>>> +{ >>>> + char sha[64]; >>>> + static const char *types[] = { >>>> + "VS", "TC", "TE", "GS", "FS", "CS", >>>> + }; >>>> + >>>> + generate_sha1(source, sha); >>>> + return ralloc_asprintf(NULL, "%s/%s_%s.glsl", path, types[stage], sha); >>>> +} >>>> + >>>> +/** >>>> + * Write given shader source to a file in MESA_SHADER_DUMP_PATH. >>>> + */ >>>> +static void >>>> +dump_shader(const gl_shader_stage stage, const char *source) >>>> +{ >>>> + static bool path_exists = true; >>>> + char *dump_path; >>>> + FILE *f; >>>> + >>>> + if (!path_exists) >>>> + return; >>>> + >>>> + dump_path = getenv("MESA_SHADER_DUMP_PATH"); >>>> + if (!dump_path) { >>>> + path_exists = false; >>>> + return; >>>> + } >>>> + >>>> + char *name = construct_name(stage, source, dump_path); >>>> + >>>>
Re: [Mesa-dev] [PATCH 1/3] mesa/main/shaderapi: Use generate_sha1() unconditionally
I looked at the appveyor log for that specific build https://ci.appveyor.com/project/mesa3d/mesa/build/3186 It seems to only complain about disk_cache.c src\util\disk_cache.c(28) : fatal error C1083: Cannot open include file: 'sys/file.h': No such file or directory scons: *** [build\windows-x86-debug\util\disk_cache.obj] Error 2 scons: building terminated because of errors. Found this on https://stackoverflow.com/questions/2537960/windows-sys-file-h-equivalent For the benefit of posterity, is the BSD version of low-level file I/O routines. Depending upon your compiler installation and build environment, you'll probably want instead. Most of the usual I/O routines are in , even setvbuf() which is pretty low-level control. You'll want or if you want I/O routines/settings that don't normally exist under Linux (or the other *NICES). Am 17.05.2018 um 12:28 schrieb Tapani Pälli: > > > On 05/17/2018 11:38 AM, Benedikt Schemmer wrote: >> Thanks for reviewing! >> >> Am 17.05.2018 um 08:42 schrieb Tapani Pälli: >>> >>> >>> On 05/10/2018 12:05 PM, Benedikt Schemmer wrote: >>>> Move shader-cache code from back to front and make generate_sha1() usable >>>> unconditionally to avoid code duplication in the following patch >>>> >>>> --- >>>> src/mesa/main/shaderapi.c | 228 >>>> +++--- >>>> 1 file changed, 116 insertions(+), 112 deletions(-) >>>> >>>> diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c >>>> index 44b18af492..e8acca4490 100644 >>>> --- a/src/mesa/main/shaderapi.c >>>> +++ b/src/mesa/main/shaderapi.c >>>> @@ -64,6 +64,122 @@ >>>> #include "util/mesa-sha1.h" >>>> #include "util/crc32.h" >>>> >>>> + >>>> +/** >>>> + * Generate a SHA-1 hash value string for given source string. >>>> + */ >>>> +static void >>>> +generate_sha1(const char *source, char sha_str[64]) >>>> +{ >>>> + unsigned char sha[20]; >>>> + _mesa_sha1_compute(source, strlen(source), sha); >>>> + _mesa_sha1_format(sha_str, sha); >>>> +} >>> >>> There is one potential problem here. The 'ENABLE_SHADER_CACHE' guard for >>> generate_sha1 and others was placed there because the imported sha1 code >>> broke windows build, I'm wondering if this is still >>> the case? If so, then generate_sha1 should be inside ENABLE_SHADER_CACHE >>> guard. >>> >> >> I did a quick >> gedit $(grep -Rlsi "_mesa_sha1_compute" | grep -E "\.c|\.h") >> >> and it seems radv and anv use _mesa_sha1_compute (and _mesa_sha1_format) >> without a guard >> best example from Intel seems to be brw_disk_cache.c which uses it alot >> outside of the ENABLE_SHADER_CACHE guard >> >> so probably safe? > > AFAIK none of those things are compiled on windows. IIUC windows builds are > about Mesa core and gallium side. > >>>> + >>>> + >>>> +#ifdef ENABLE_SHADER_CACHE >>>> +/** >>>> + * Construct a full path for shader replacement functionality using >>>> + * following format: >>>> + * >>>> + * /_.glsl >>>> + */ >>>> +static char * >>>> +construct_name(const gl_shader_stage stage, const char *source, >>>> + const char *path) >>>> +{ >>>> + char sha[64]; >>>> + static const char *types[] = { >>>> + "VS", "TC", "TE", "GS", "FS", "CS", >>>> + }; >>>> + >>>> + generate_sha1(source, sha); >>>> + return ralloc_asprintf(NULL, "%s/%s_%s.glsl", path, types[stage], sha); >>>> +} >>>> + >>>> +/** >>>> + * Write given shader source to a file in MESA_SHADER_DUMP_PATH. >>>> + */ >>>> +static void >>>> +dump_shader(const gl_shader_stage stage, const char *source) >>>> +{ >>>> + static bool path_exists = true; >>>> + char *dump_path; >>>> + FILE *f; >>>> + >>>> + if (!path_exists) >>>> + return; >>>> + >>>> + dump_path = getenv("MESA_SHADER_DUMP_PATH"); >>>> + if (!dump_path) { >>>> + path_exists = false; >>>> + return; >>>> + }
Re: [Mesa-dev] [PATCH 1/3] mesa/main/shaderapi: Use generate_sha1() unconditionally
p;fl); if (r == -1 && errno == EACCES) errno = EAGAIN; return r; } # else /* !HAVE_STRUCT_FLOCK_L_TYPE */ # error "This platform lacks flock function, and Gnulib doesn't provide a replacement. This is a bug in Gnulib." # endif /* !HAVE_STRUCT_FLOCK_L_TYPE */ #endif /* !Windows */ Am 18.05.2018 um 07:34 schrieb Tapani Pälli: > > > On 05/17/2018 02:00 PM, Benedikt Schemmer wrote: >> Ok I found the commit. >> But it says: "Until we have a proper fix." >> So this would be a good time I guess ;) > > >> >> --- >> >> author Emil Velikov 2017-01-18 19:40:31 >> + >> committer Emil Velikov 2017-01-18 20:09:01 >> + >> commit 9f8dc3bf03ec825bae7041858dda6ca2e9a34363 (patch) >> tree ff9672995474d3c31f027fea8356cb5733e45388 >> parent d1efa09d342bff3e5def2978a0bef748d74f9c82 (diff) >> utils: build sha1/disk cache only with Android/Autoconf >> Earlier commit imported a SHA1 implementation and relaxed the SHA1 and >> disk cache handling, broking the Windows builds. >> >> Restrict things for now until we get to a proper fix. >> >> Fixes: d1efa09d342 "util: import sha1 implementation from OpenBSD" >> Signed-off-by: Emil Velikov >> >> >> Am 17.05.2018 um 12:28 schrieb Tapani Pälli: >>> >>> >>> On 05/17/2018 11:38 AM, Benedikt Schemmer wrote: >>>> Thanks for reviewing! >>>> >>>> Am 17.05.2018 um 08:42 schrieb Tapani Pälli: >>>>> >>>>> >>>>> On 05/10/2018 12:05 PM, Benedikt Schemmer wrote: >>>>>> Move shader-cache code from back to front and make generate_sha1() usable >>>>>> unconditionally to avoid code duplication in the following patch >>>>>> >>>>>> --- >>>>>> src/mesa/main/shaderapi.c | 228 >>>>>> +++--- >>>>>> 1 file changed, 116 insertions(+), 112 deletions(-) >>>>>> >>>>>> diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c >>>>>> index 44b18af492..e8acca4490 100644 >>>>>> --- a/src/mesa/main/shaderapi.c >>>>>> +++ b/src/mesa/main/shaderapi.c >>>>>> @@ -64,6 +64,122 @@ >>>>>> #include "util/mesa-sha1.h" >>>>>> #include "util/crc32.h" >>>>>> >>>>>> + >>>>>> +/** >>>>>> + * Generate a SHA-1 hash value string for given source string. >>>>>> + */ >>>>>> +static void >>>>>> +generate_sha1(const char *source, char sha_str[64]) >>>>>> +{ >>>>>> + unsigned char sha[20]; >>>>>> + _mesa_sha1_compute(source, strlen(source), sha); >>>>>> + _mesa_sha1_format(sha_str, sha); >>>>>> +} >>>>> >>>>> There is one potential problem here. The 'ENABLE_SHADER_CACHE' guard for >>>>> generate_sha1 and others was placed there because the imported sha1 code >>>>> broke windows build, I'm wondering if this is still >>>>> the case? If so, then generate_sha1 should be inside ENABLE_SHADER_CACHE >>>>> guard. >>>>> >>>> >>>> I did a quick >>>> gedit $(grep -Rlsi "_mesa_sha1_compute" | grep -E "\.c|\.h") >>>> >>>> and it seems radv and anv use _mesa_sha1_compute (and _mesa_sha1_format) >>>> without a guard >>>> best example from Intel seems to be brw_disk_cache.c which uses it alot >>>> outside of the ENABLE_SHADER_CACHE guard >>>> >>>> so probably safe? >>> >>> AFAIK none of those things are compiled on windows. IIUC windows builds are >>> about Mesa core and gallium side. >>> >>>>>> + >>>>>> + >>>>>> +#ifdef ENABLE_SHADER_CACHE >>>>>> +/** >>>>>> + * Construct a full path for shader replacement functionality using >>>>>> + * following format: >>>>>> + * >>>>>> + * /_.glsl >>>>>> + */ >>>>>> +static char * >>>>>> +construct_name(const gl_shader_stage stage, const char *source, >>>>>> + const char *path) >>>>>> +{ >>>>>> + char sha[64]; >>>>&g
Re: [Mesa-dev] [PATCH 2/2] mesa: enable ARB_gpu_shader5 in compat
Hi Timothy, I was looking into something similar myself. Great to see someone working on getting legacy extensions available to the applications that can actually use them! Another thing I noticed is that extensions aren't advertised at the correct feature level. The code below works and does a better job at it, but its probably also the wrong way to go about it. Cheers, Benedikt P.S. I grepped and sed together a file which describes the requirements for every extension better, unfortunately the files on khronos arent exactly machine readable will send in next reply --- diff --git a/src/mesa/main/version.c b/src/mesa/main/version.c --- a/src/mesa/main/version.c +++ b/src/mesa/main/version.c @@ -233,8 +233,15 @@ _mesa_override_glsl_version(struct gl_constants *consts) } /** - * Examine enabled GL extensions to determine GL version. + * Enable GL Extensions according to the minimum version requirements as per spec. + * Careful: some extensions are implemented for the express purpose of being + * used in a lower version context (legacy extensions). + * Example: + * ARB_gpu_shader5: first implementation in OpenGL 4.0 (GLSL 400) but + * supposed to be advertised from OpenGL 3.2 (GLSL 150) upwards, + * however early Nvidia FXAA could use it from GLSL 120 so advertise it there */ + static GLuint compute_version(const struct gl_extensions *extensions, const struct gl_constants *consts, gl_api api) @@ -242,25 +249,43 @@ compute_version(const struct gl_extensions *extensions, GLuint major, minor, version; const bool ver_1_3 = (extensions->ARB_texture_border_clamp && + extensions->ARB_instanced_arrays && + extensions->ARB_shadow && + extensions->ARB_vertex_shader && + extensions->ARB_fragment_shader && + extensions->ARB_depth_texture && + extensions->ARB_occlusion_query2 && extensions->ARB_texture_cube_map && + extensions->ARB_seamless_cube_map && extensions->ARB_texture_env_combine && + extensions->ARB_texture_rg && + extensions->ARB_texture_compression_rgtc && + extensions->ARB_texture_compression_bptc && + extensions->ARB_texture_float && + extensions->ARB_framebuffer_object && + extensions->ARB_framebuffer_no_attachments && extensions->ARB_texture_env_dot3); const bool ver_1_4 = (ver_1_3 && - extensions->ARB_depth_texture && - extensions->ARB_shadow && + extensions->ARB_stencil_texturing && extensions->ARB_texture_env_crossbar && + extensions->ARB_texture_mirror_clamp_to_edge && extensions->EXT_blend_color && extensions->EXT_blend_func_separate && extensions->EXT_blend_minmax && extensions->EXT_point_parameters); const bool ver_1_5 = (ver_1_4 && + extensions->ARB_query_buffer_object && extensions->ARB_occlusion_query); const bool ver_2_0 = (ver_1_5 && extensions->ARB_point_sprite && - extensions->ARB_vertex_shader && - extensions->ARB_fragment_shader && + extensions->ARB_blend_func_extended && extensions->ARB_texture_non_power_of_two && extensions->EXT_blend_equation_separate && + extensions->ARB_shader_texture_lod && + extensions->ARB_transform_feedback2 && + extensions->ARB_uniform_buffer_object && + extensions->ARB_texture_buffer_object && + extensions->ARB_texture_buffer_range && /* Technically, 2.0 requires the functionality of the * EXT version. Enable 2.0 if either extension is @@ -272,33 +297,44 @@ compute_version(const struct gl_extensions *extensions, || extensions->ATI_separate_stencil)); const bool ver_2_1 = (ver_2_0 && extensions->EXT_pixel_buffer_object && + extensions->ARB_map_buffer_range && + extensions->ARB_draw_instanced && + extensions->ARB_shader_bit_encoding && + extensions->ARB_gpu_shader5 && extensions->EXT_texture_sRGB); const bool ver_3_0 = (ver_2_1 && consts->GLSLVersion >= 130 && (consts->MaxSamples >= 4 || consts->FakeSWMSAA) &&
Re: [Mesa-dev] [PATCH 2/2] mesa: enable ARB_gpu_shader5 in compat
A list of extension requirements and interactions extracted from the GL_* files on Khronos so this is messy but relatively complete I've started to introduce key phrases like "This extension requires" so that this could become machine readable in the future. --- ./WGL_ARB_make_current_read.txt This extension requires WGL_ARB_extensions_string is required. ./ARB_cull_distance.txt This extension requires OpenGL 3.0 is required. The extension is written against the OpenGL 4.4 Specification, Core Profile, March 19, 2014. The extension is written against the OpenGL Shading Language 4.40 Specification, January 22, 2014. ./ARB_compute_variable_group_size.txt This extension is written against the OpenGL 4.3 (Compatibility Profile) Specification, dated August 6, 2012. This extension is written against the OpenGL Shading Language Specification, Version 4.30, Revision 7, dated September 24, 2012. This extension requires OpenGL 4.3 or ARB_compute_shader is required. This extension interacts with NV_compute_program5. ./ARB_texture_query_levels.txt This extension requires OpenGL 3.0 is required. This extension requires OpenGL Shading Language 1.30 is required This extension is written against the OpenGL 4.2 specification and version 4.20 of the OpenGL Shading Language Specification. ./ARB_depth_clamp.txt Written based on the wording of the OpenGL 3.1 specification. ARB_compatibility affects the behavior of this extension. ./WGL_ARB_render_texture.txt This extension requires OpenGL 1.1 is required. This extension requires WGL_ARB_extension_string is required. This extension requires WGL_ARB_pixel_format is required. This extension requires WGL_ARB_pbuffer is required. WGL_ARB_make_current_read affects the definition of this extension. GL_ARB_texture_cube_map affects the definition of this extension The extension is written against the OpenGL 1.2.1 Specification. ./ARB_copy_image.txt This extension requires OpenGL 1.1 is required. The extension is written against the OpenGL 4.2 (Core Profile) Specification. This extension interacts with OpenGL 4.3 and ARB_texture_view. This extension interacts with the Compatibility Profile. This extension interacts with EXT_texture_compression_s3tc. This extension interacts with OpenGL 3.0 and ARB_texture_compression_rgtc. This extension interacts with OpenGL 4.2 and ARB_texture_compression_bptc. ./ARB_shader_clock.txt This extension is written against version 4.50 of the OpenGL Shading Language Specification. This interacts with GL_ARB_gpu_shader_int64. ./ARB_texture_env_crossbar.txt This extension is written against the OpenGL 1.2.1 Specification. This extension requires OpenGL 1.1 This extension requires ARB_multitexture This extension requires ARB_texture_env_combine ./WGL_ARB_extensions_string.txt None ./ARB_shader_group_vote.txt This extension is written against the OpenGL 4.3 (Compatibility Profile) Specification, dated August 6, 2012. This extension is written against the OpenGL Shading Language Specification, Version 4.30, Revision 7, dated September 24, 2012. This extension requires OpenGL 4.3 or ARB_compute_shader This extension interacts with NV_gpu_shader5. ./ARB_ES2_compatibility.txt Written based on the wording of the OpenGL 4.0 Compatibility Profile (March 11, 2010) specification. This extension interacts with ARB_tessellation_shader or OpenGL 4.0. ./ARB_provoking_vertex.txt This extension is written against the OpenGL 2.1 Specification but can apply to any prior specification. ARB_geometry_shader4, EXT_geometry_shader4, NV_geometry_shader4, and NV_gpu_program4 interact with this extension EXT_transform_feedback, NV_transform_feedback, and the transform feedback functionality made core by OpenGL 3.0 are clarified by this extension. ./ARB_texture_buffer_object_rgb32.txt This extension is written against the OpenGL 3.2 specification (Core Profile). ./ARB_point_parameters.txt This extension requires OpenGL 1.0 is required. ARB_multisample affects the definition of this extension. The extension is written against the OpenGL 1.2.1 Specification. ./ARB_half_float_vertex.txt This extension is written against the OpenGL 2.1 Specification Based on the NV_half_float and OES_vertex_half_float extensions. ./ARB_explicit_attrib_location.txt This extension requires OpenGL 2.0 or GL_ARB_vertex_shader. This extension interacts with GL_ARB_blend_func_extended. This extension is written against the OpenGL 3.2 (Core Profile) specification dated 7/24/2009 and the This extension is written against the OpenGL Shading Language 1.50.09 specification dated 7/24/2009. This extension interacts with GL_ARB_separate_shader_objects, This extension interacts with GL_ARB_gpu_shader_fp64 This extension interacts with OpenGL 4.0 This extension int
Re: [Mesa-dev] [PATCH 2/2] mesa: enable ARB_gpu_shader5 in compat
Yes its backwards and it seems it doesn't work anymore ;) Seems the loophole I was using got removed. Problem remains that applications that could use an legacy extension are not aware of it and will simply/usually not request it: MESA_GL_VERSION_OVERRIDE=3.2 MESA_GLSL_VERSION_OVERRIDE=150 glxinfo | grep shader5 -nothing- MESA_GL_VERSION_OVERRIDE=3.2 MESA_GLSL_VERSION_OVERRIDE=400 glxinfo | grep shader5 GL_ARB_gpu_shader5, GL_ARB_gpu_shader_fp64, GL_ARB_gpu_shader_int64, GL_ARB_gpu_shader5, GL_ARB_half_float_pixel, GL_ARB_half_float_vertex, GL_EXT_gpu_shader5, GL_EXT_map_buffer_range, GL_EXT_memory_object, GL_OES_get_program_binary, GL_OES_gpu_shader5, GL_OES_mapbuffer, If an application knows about GLSL 400 it doesn't need GL_ARB_gpu_shader5 anymore. Am 18.05.2018 um 11:25 schrieb Timothy Arceri: > Hi Benedikt, > > I think you have this backwards. compute_version() computes the version of > OpenGL supported by the driver by checking what extensions it advertises. See > version 2 of my patch to see where the > extensions are enabled. I don't think this is something that can be > automated, it really depends on the driver, just because an extension can be > advertised at a certain level doesn't mean you must (or > can) do it. > > Tim > > On 18/05/18 19:13, Benedikt Schemmer wrote: >> Hi Timothy, >> >> I was looking into something similar myself. >> Great to see someone working on getting legacy extensions available >> to the applications that can actually use them! >> >> Another thing I noticed is that extensions aren't advertised at the >> correct feature level. >> The code below works and does a better job at it, but its probably also >> the wrong way to go about it. >> >> Cheers, >> Benedikt >> >> P.S. I grepped and sed together a file which describes the requirements >> for every extension better, unfortunately the files on khronos >> arent exactly machine readable >> will send in next reply >> >> --- >> >> diff --git a/src/mesa/main/version.c b/src/mesa/main/version.c >> --- a/src/mesa/main/version.c >> +++ b/src/mesa/main/version.c >> @@ -233,8 +233,15 @@ _mesa_override_glsl_version(struct gl_constants *consts) >> } >> >> /** >> - * Examine enabled GL extensions to determine GL version. >> + * Enable GL Extensions according to the minimum version requirements as >> per spec. >> + * Careful: some extensions are implemented for the express purpose of being >> + * used in a lower version context (legacy extensions). >> + * Example: >> + * ARB_gpu_shader5: first implementation in OpenGL 4.0 (GLSL 400) but >> + * supposed to be advertised from OpenGL 3.2 (GLSL 150) upwards, >> + * however early Nvidia FXAA could use it from GLSL 120 so advertise it >> there >> */ >> + >> static GLuint >> compute_version(const struct gl_extensions *extensions, >> const struct gl_constants *consts, gl_api api) >> @@ -242,25 +249,43 @@ compute_version(const struct gl_extensions *extensions, >> GLuint major, minor, version; >> >> const bool ver_1_3 = (extensions->ARB_texture_border_clamp && >> + extensions->ARB_instanced_arrays && >> + extensions->ARB_shadow && >> + extensions->ARB_vertex_shader && >> + extensions->ARB_fragment_shader && >> + extensions->ARB_depth_texture && >> + extensions->ARB_occlusion_query2 && >> extensions->ARB_texture_cube_map && >> + extensions->ARB_seamless_cube_map && >> extensions->ARB_texture_env_combine && >> + extensions->ARB_texture_rg && >> + extensions->ARB_texture_compression_rgtc && >> + extensions->ARB_texture_compression_bptc && >> + extensions->ARB_texture_float && >> + extensions->ARB_framebuffer_object && >> + extensions->ARB_framebuffer_no_attachments && >> extensions->ARB_texture_env_dot3); >> const bool ver_1_4 = (ver_1_3 && >> - extensions->ARB_depth_texture && >> - extensions->ARB_shadow && >> +
Re: [Mesa-dev] [PATCH 2/2] mesa: enable ARB_gpu_shader5 in compat
Indeed it does, I've written a little test program where I can request specific OpenGL versions. I think that used to work, now if I request anything <= 3.1 I get 3.1 if request >= 3.2 I get 4.5. GL_ARB_gpu_shader5 is available in all of them. If I force the GL version the version string changes and I actually have to request a version lower than that. Still GL_ARB_gpu_shader5 is available in all of them. If I force the GLSL version to anything lower than 400, GL_ARB_gpu_shader5 goes missing. That's not the intended behavior is it? Am 18.05.2018 um 13:32 schrieb Ilia Mirkin: > The check is against what the driver max is, not what the application > requested. > > Using overrides messes up this logic, it seems. > > On Fri, May 18, 2018, 06:14 Benedikt Schemmer <mailto:b...@besd.de>> wrote: > > Yes its backwards and it seems it doesn't work anymore ;) > Seems the loophole I was using got removed. > > Problem remains that applications that could use an legacy extension > are not aware of it and will simply/usually not request it: > > MESA_GL_VERSION_OVERRIDE=3.2 MESA_GLSL_VERSION_OVERRIDE=150 glxinfo | > grep shader5 > -nothing- > > MESA_GL_VERSION_OVERRIDE=3.2 MESA_GLSL_VERSION_OVERRIDE=400 glxinfo | > grep shader5 > GL_ARB_gpu_shader5, GL_ARB_gpu_shader_fp64, GL_ARB_gpu_shader_int64, > GL_ARB_gpu_shader5, GL_ARB_half_float_pixel, GL_ARB_half_float_vertex, > GL_EXT_gpu_shader5, GL_EXT_map_buffer_range, GL_EXT_memory_object, > GL_OES_get_program_binary, GL_OES_gpu_shader5, GL_OES_mapbuffer, > > > If an application knows about GLSL 400 it doesn't need GL_ARB_gpu_shader5 > anymore. > > > > Am 18.05.2018 um 11:25 schrieb Timothy Arceri: > > Hi Benedikt, > > > > I think you have this backwards. compute_version() computes the version > of OpenGL supported by the driver by checking what extensions it advertises. > See version 2 of my patch to see where the > > extensions are enabled. I don't think this is something that can be > automated, it really depends on the driver, just because an extension can be > advertised at a certain level doesn't mean you > must (or > > can) do it. > > > > Tim > > > > On 18/05/18 19:13, Benedikt Schemmer wrote: > >> Hi Timothy, > >> > >> I was looking into something similar myself. > >> Great to see someone working on getting legacy extensions available > >> to the applications that can actually use them! > >> > >> Another thing I noticed is that extensions aren't advertised at the > >> correct feature level. > >> The code below works and does a better job at it, but its probably also > >> the wrong way to go about it. > >> > >> Cheers, > >> Benedikt > >> > >> P.S. I grepped and sed together a file which describes the requirements > >> for every extension better, unfortunately the files on khronos > >> arent exactly machine readable > >> will send in next reply > >> > >> --- > >> > >> diff --git a/src/mesa/main/version.c b/src/mesa/main/version.c > >> --- a/src/mesa/main/version.c > >> +++ b/src/mesa/main/version.c > >> @@ -233,8 +233,15 @@ _mesa_override_glsl_version(struct gl_constants > *consts) > >> } > >> > >> /** > >> - * Examine enabled GL extensions to determine GL version. > >> + * Enable GL Extensions according to the minimum version requirements > as per spec. > >> + * Careful: some extensions are implemented for the express purpose > of being > >> + * used in a lower version context (legacy extensions). > >> + * Example: > >> + * ARB_gpu_shader5: first implementation in OpenGL 4.0 (GLSL 400) but > >> + * supposed to be advertised from OpenGL 3.2 (GLSL 150) upwards, > >> + * however early Nvidia FXAA could use it from GLSL 120 so advertise > it there > >> */ > >> + > >> static GLuint > >> compute_version(const struct gl_extensions *extensions, > >> const struct gl_constants *consts, gl_api api) > >> @@ -242,25 +249,43 @@ compute_version(const struct gl_extensions > *extensions, > >> GLuint major, minor, version; > >> > >> const bool ver_1_3 = (extensions->ARB_texture_borde
Re: [Mesa-dev] [PATCH 2/2] mesa: enable ARB_gpu_shader5 in compat
Am I on the right track to assume that Timothy's patch should read +EXT(ARB_gpu_shader5 , ARB_gpu_shader5 , 32, 40, x , x , 2010) to give GL_ARB_gpu_shader5 a minimum required legacy OpenGL version of 3.2 and core 4.0 as per spec? I wanted to ask: is ctx->Version >= ext->version[ctx->API] evaluated every time a context is created or just once at boot time? But it seems to be evaluated every time a context is created and this is actually more like what I expect from the version override MESA_GL_VERSION_OVERRIDE=4.0 glxinfo | grep shader5 GL_ARB_gpu_shader5, GL_ARB_gpu_shader_fp64, GL_ARB_gpu_shader_int64, GL_ARB_gpu_shader5, GL_ARB_half_float_pixel, GL_ARB_half_float_vertex, GL_EXT_gpu_shader5, GL_EXT_map_buffer_range, GL_EXT_memory_object, GL_OES_get_program_binary, GL_OES_gpu_shader5, GL_OES_mapbuffer, MESA_GL_VERSION_OVERRIDE=3.2 glxinfo | grep shader5 GL_ARB_gpu_shader5, GL_ARB_half_float_pixel, GL_ARB_half_float_vertex, GL_EXT_gpu_shader5, GL_EXT_map_buffer_range, GL_EXT_memory_object, GL_OES_get_program_binary, GL_OES_gpu_shader5, GL_OES_mapbuffer, MESA_GL_VERSION_OVERRIDE=3.0 glxinfo | grep shader5 GL_EXT_gpu_shader5, GL_EXT_map_buffer_range, GL_EXT_memory_object, GL_OES_get_program_binary, GL_OES_gpu_shader5, GL_OES_mapbuffer, --- from extensions.h: /** Checks if the context supports a user-facing extension */ #define EXT(name_str, driver_cap, ...) \ static inline bool \ _mesa_has_##name_str(const struct gl_context *ctx) \ { \ return ctx->Extensions.driver_cap && (ctx->Extensions.Version >= \ // equal to ctx->Version _mesa_extension_table[MESA_EXTENSION_##name_str].version[ctx->API]); \ } #include "extensions_table.h" #undef EXT Am 18.05.2018 um 14:35 schrieb Ilia Mirkin: > On Fri, May 18, 2018 at 8:20 AM, Benedikt Schemmer wrote: >> Indeed it does, I've written a little test program where I can request >> specific OpenGL versions. >> I think that used to work, now if I request anything <= 3.1 I get 3.1 if >> request >= 3.2 I get 4.5. >> GL_ARB_gpu_shader5 is available in all of them. >> >> If I force the GL version the version string changes and I actually have to >> request a version lower than that. >> Still GL_ARB_gpu_shader5 is available in all of them. >> >> If I force the GLSL version to anything lower than 400, GL_ARB_gpu_shader5 >> goes missing. >> >> That's not the intended behavior is it? > > Forcing versions via env vars isn't well-supported, esp around the > edges like this. Feel free to send patches to fix it. > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] mesa: enable ARB_gpu_shader5 in compat
so the gll is ok? my idea with the glc is that this particular extension would no longer be advertised as soon as the app request a context where the functionality is already part of it (ie OpenGL >= 4.0) Do you think a minimum GLSL version could be added to this table (maybe replacing the year) so that GLSL version overriding has a similar effect? Am 18.05.2018 um 19:32 schrieb Ilia Mirkin: > On Fri, May 18, 2018 at 1:27 PM, Benedikt Schemmer wrote: >> Am I on the right track to assume that Timothy's patch should read >> >> +EXT(ARB_gpu_shader5 , ARB_gpu_shader5 >> , 32, 40, x , x , 2010) >> >> to give GL_ARB_gpu_shader5 a minimum required legacy OpenGL version of 3.2 >> and core 4.0 as per spec? > > It should read something like 32, 32. I was going to point that out, > but ... meh. It shouldn't matter in practice. Environment > variable-based overrides aren't a particularly supported mode of > operation, so I don't think we need to bend over backwards for it. > > -ilia > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] mesa: enable ARB_gpu_shader5 in compat
Am 18.05.2018 um 20:05 schrieb Ilia Mirkin: > On Fri, May 18, 2018 at 1:54 PM, Benedikt Schemmer wrote: >> so the gll is ok? >> >> my idea with the glc is that this particular extension would no longer be >> advertised as soon as the app >> request a context where the functionality is already part of it (ie OpenGL >> >= 4.0) > > Why is this desirable? One could interpret that to mean that e.g. > "#version 150; #extension GL_ARB_gpu_shader5: enable" wouldn't work... > While legal, I doubt it'd go over well with applications. I dont think any application would notice. They should just do #version 400 (or higher) and dont need GL_ARB_gpu_shader5 anymore So less useless extensions to sift through. > >> Do you think a minimum GLSL version could be added to this table (maybe >> replacing the year) so that >> GLSL version overriding has a similar effect? > > The minimum GL version seems to do that just fine... This is of course very specific to this extension, although I do expect more have the same aspect. I found an old (the) FXAA implementation from Nvidia that was used by retro (really really old) games https://gist.github.com/kosua20/0c506b81b3812ac900048059d2383126 and it could use shader5 with GLSL 120. So it would be nice to have a way to enable that extension even in cases where the requested OpenGL version is lower than specified but there is some justification to it. --- #ifndef FXAA_FAST_PIXEL_OFFSET // // Used for GLSL 120 only. // // 1 = GL API supports fast pixel offsets // 0 = do not use fast pixel offsets // #ifdef GL_EXT_gpu_shader4 #define FXAA_FAST_PIXEL_OFFSET 1 #endif #ifdef GL_NV_gpu_shader5 #define FXAA_FAST_PIXEL_OFFSET 1 #endif #ifdef GL_ARB_gpu_shader5 #define FXAA_FAST_PIXEL_OFFSET 1 #endif #ifndef FXAA_FAST_PIXEL_OFFSET #define FXAA_FAST_PIXEL_OFFSET 0 #endif #endif ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [RFC PATCH 2/3] mesa/main/extensions: first attempt to get the context levels right
These are just a few extensions for which I had the specifications handy, to get a feel for what this might look like. It's already been pointed out that applications might fail because of this. (I haven't found an application that does yet but I can only test on 64-bit) The argument for this change would be that its ok if the application fails because it does something unexpected and its better to let it explicitly fail and create a workaround for it than to just silently hope that everything is going to be alright. CC: "Marek Olšák" CC: Eric Anholt CC: Emil Velikov CC: Timothy Arceri CC: Ilia Mirkin Signed-off-by: Benedikt Schemmer --- src/mesa/main/extensions_table.h | 98 ++-- 1 file changed, 53 insertions(+), 45 deletions(-) diff --git a/src/mesa/main/extensions_table.h b/src/mesa/main/extensions_table.h index 931e0d4d33..6e3bdbe99c 100644 --- a/src/mesa/main/extensions_table.h +++ b/src/mesa/main/extensions_table.h @@ -6,7 +6,8 @@ #define ANYES2 0 #define x ~0 -//EXT(name_str , driver_cap ,GLL_ver,GLC_ver, gles_ver, glES2_ver, ) +//highest GLL_ver = 46, lowest GLC_ver = 30 +//EXT(name_str , driver_cap ,GLL_ver,GLC_ver, gles_ver, glES2_ver, year) EXT(3DFX_texture_compression_FXT1 , TDFX_texture_compression_FXT1 , ANYGLL, ANYGLC, x , x , 1999) EXT(AMD_conservative_depth , ARB_conservative_depth , ANYGLL, ANYGLC, x , x , 2009) @@ -31,60 +32,62 @@ EXT(APPLE_texture_max_level , dummy_true EXT(ARB_ES2_compatibility , ARB_ES2_compatibility , ANYGLL, ANYGLC, x , x , 2009) EXT(ARB_ES3_1_compatibility , ARB_ES3_1_compatibility , x , ANYGLC, x , x , 2014) EXT(ARB_ES3_2_compatibility , ARB_ES3_2_compatibility , x , ANYGLC, x , x , 2015) -EXT(ARB_ES3_compatibility , ARB_ES3_compatibility , ANYGLL, ANYGLC, x , x , 2012) -EXT(ARB_arrays_of_arrays, ARB_arrays_of_arrays , ANYGLL, ANYGLC, x , x , 2012) +EXT(ARB_ES3_compatibility , ARB_ES3_compatibility , 33 , 33 , x , x , 2012) // requires ARB_ES2_compatibility, ARB_invalidate_subdata, ARB_texture_storage, OES_compressed_* +EXT(ARB_arrays_of_arrays, ARB_arrays_of_arrays , 12 , 12 , x , x , 2012) EXT(ARB_base_instance , ARB_base_instance , ANYGLL, ANYGLC, x , x , 2011) EXT(ARB_bindless_texture, ARB_bindless_texture , ANYGLL, ANYGLC, x , x , 2013) EXT(ARB_blend_func_extended , ARB_blend_func_extended , ANYGLL, ANYGLC, x , x , 2009) EXT(ARB_buffer_storage , ARB_buffer_storage , ANYGLL, ANYGLC, x , x , 2013) +//EXT(ARB_cl_event, ARB_cl_event , 31 , 31 , x , x , 2013) // requires GL_ARB_sync EXT(ARB_clear_buffer_object , dummy_true , ANYGLL, ANYGLC, x , x , 2012) EXT(ARB_clear_texture , ARB_clear_texture , ANYGLL, ANYGLC, x , x , 2013) EXT(ARB_clip_control, ARB_clip_control , ANYGLL, ANYGLC, x , x , 2014) EXT(ARB_color_buffer_float , ARB_color_buffer_float , ANYGLL, ANYGLC, x , x , 2004) EXT(ARB_compatibility , ARB_compatibility , ANYGLL, x , x , x , 2009) EXT(ARB_compressed_texture_pixel_storage, dummy_true , ANYGLL, ANYGLC, x , x , 2011) -EXT(ARB_compute_shader , ARB_compute_shader , ANYGLL, ANYGLC, x , x , 2012) -EXT(ARB_compute_variable_group_size , ARB_compute_variable_group_size , ANYGLL, ANYGLC, x , x , 2013) +EXT(ARB_compute_shader , ARB_compute_shader , 42 , 42 , x , x , 2012) +EXT(ARB_compute_variable_group_size , ARB_compute_variable_group_size , 42 , 42 , x , x , 2013) //or ARB compute shader EXT(ARB_conditional_render_inverted , ARB_conditional_render_inverted , ANYGLL, ANYGLC, x , x , 2014) EXT(ARB_conservative_depth , ARB_conservative_depth , ANYGLL, ANYGLC, x , x , 2011) EXT(ARB_copy_buffer , dummy_true , ANYGLL,
[Mesa-dev] [RFC PATCH 1/3] mesa/main/extensions: make constants slightly more expressive and adjust layout
This just makes the constants more descriptive CC: "Marek Olšák" CC: Eric Anholt CC: Emil Velikov CC: Timothy Arceri CC: Ilia Mirkin Signed-off-by: Benedikt Schemmer --- src/mesa/main/extensions_table.h | 833 --- 1 file changed, 417 insertions(+), 416 deletions(-) diff --git a/src/mesa/main/extensions_table.h b/src/mesa/main/extensions_table.h index ffb1caccdd..931e0d4d33 100644 --- a/src/mesa/main/extensions_table.h +++ b/src/mesa/main/extensions_table.h @@ -1,439 +1,440 @@ /* The extension table is alphabetically sorted by the extension name string column. */ -#define GLL 0 -#define GLC 0 -#define ES1 0 -#define ES2 0 -#define x ~0 +#define ANYGLL 0 +#define ANYGLC 0 +#define ANYES1 0 +#define ANYES2 0 +#define x ~0 -EXT(3DFX_texture_compression_FXT1 , TDFX_texture_compression_FXT1 , GLL, GLC, x , x , 1999) +//EXT(name_str , driver_cap ,GLL_ver,GLC_ver, gles_ver, glES2_ver, ) +EXT(3DFX_texture_compression_FXT1 , TDFX_texture_compression_FXT1 , ANYGLL, ANYGLC, x , x , 1999) -EXT(AMD_conservative_depth , ARB_conservative_depth , GLL, GLC, x , x , 2009) -EXT(AMD_draw_buffers_blend , ARB_draw_buffers_blend , GLL, GLC, x , x , 2009) -EXT(AMD_performance_monitor , AMD_performance_monitor , GLL, GLC, x , x , 2007) -EXT(AMD_pinned_memory , AMD_pinned_memory , GLL, GLC, x , x , 2013) -EXT(AMD_seamless_cubemap_per_texture, AMD_seamless_cubemap_per_texture , GLL, GLC, x , x , 2009) -EXT(AMD_shader_stencil_export , ARB_shader_stencil_export , GLL, GLC, x , x , 2009) -EXT(AMD_shader_trinary_minmax , dummy_true , GLL, GLC, x , x , 2012) -EXT(AMD_vertex_shader_layer , AMD_vertex_shader_layer , x , GLC, x , x , 2012) -EXT(AMD_vertex_shader_viewport_index, AMD_vertex_shader_viewport_index , x , GLC, x , x , 2012) +EXT(AMD_conservative_depth , ARB_conservative_depth , ANYGLL, ANYGLC, x , x , 2009) +EXT(AMD_draw_buffers_blend , ARB_draw_buffers_blend , ANYGLL, ANYGLC, x , x , 2009) +EXT(AMD_performance_monitor , AMD_performance_monitor , ANYGLL, ANYGLC, x , x , 2007) +EXT(AMD_pinned_memory , AMD_pinned_memory , ANYGLL, ANYGLC, x , x , 2013) +EXT(AMD_seamless_cubemap_per_texture, AMD_seamless_cubemap_per_texture , ANYGLL, ANYGLC, x , x , 2009) +EXT(AMD_shader_stencil_export , ARB_shader_stencil_export , ANYGLL, ANYGLC, x , x , 2009) +EXT(AMD_shader_trinary_minmax , dummy_true , ANYGLL, ANYGLC, x , x , 2012) +EXT(AMD_vertex_shader_layer , AMD_vertex_shader_layer , x , ANYGLC, x , x , 2012) +EXT(AMD_vertex_shader_viewport_index, AMD_vertex_shader_viewport_index , x , ANYGLC, x , x , 2012) -EXT(ANDROID_extension_pack_es31a, ANDROID_extension_pack_es31a , x , x , x , 31, 2014) +EXT(ANDROID_extension_pack_es31a, ANDROID_extension_pack_es31a , x , x , x , 31 , 2014) -EXT(ANGLE_texture_compression_dxt3 , ANGLE_texture_compression_dxt , GLL, GLC, ES1, ES2, 2011) -EXT(ANGLE_texture_compression_dxt5 , ANGLE_texture_compression_dxt , GLL, GLC, ES1, ES2, 2011) +EXT(ANGLE_texture_compression_dxt3 , ANGLE_texture_compression_dxt , ANYGLL, ANYGLC, ANYES1, ANYES2, 2011) +EXT(ANGLE_texture_compression_dxt5 , ANGLE_texture_compression_dxt , ANYGLL, ANYGLC, ANYES1, ANYES2, 2011) -EXT(APPLE_object_purgeable , APPLE_object_purgeable , GLL, GLC, x , x , 2006) -EXT(APPLE_packed_pixels , dummy_true , GLL, x , x , x , 2002) -EXT(APPLE_texture_max_level , dummy_true , x , x , ES1, ES2, 2009) +EXT(APPLE_object_purgeable , APPLE_object_purgeable , ANYGLL, ANYGLC, x , x , 2006) +EXT(APPLE_packed_pixels , dummy_true , ANYGLL, x , x , x , 2002) +EXT(APPLE_texture_max_level , dummy_true , x , x , ANYES1, ANYES2, 2009) -EXT(ARB_ES2_compatibility , ARB_ES2_compatibility , GLL, GLC, x , x , 2009) -EXT(ARB_ES3_1_compatibility , ARB_ES3_1_com
[Mesa-dev] [RFC PATCH 0/3] mesa/main/extensions: approximate feature levels for enabling better
Most of the extensions are currently either always available or not. A mechanism to enable extensions based on the required OpenGL version is already in place, its just rarely used. This might cause problems when an application actually tries to use an extension in a context for which the extension was not designed. Also this prevents extensions that might be problematic when they are always enabled, to be enabled in the appropriate context. Tested with several 64-bit games. CC: "Marek Olšák" CC: Eric Anholt CC: Emil Velikov CC: Timothy Arceri CC: Ilia Mirkin Benedikt Schemmer (3): mesa/main/extensions: make constants slightly more expressive and adjust layout mesa/main/extensions: first attempt to get the context levels right mesa/main/extensions: introduce dependencies src/mesa/main/extensions.c | 3 + src/mesa/main/extensions.h | 3 + src/mesa/main/extensions_table.c | 5 +- src/mesa/main/extensions_table.h | 841 --- 4 files changed, 434 insertions(+), 418 deletions(-) -- 2.14.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [RFC PATCH 1/4] mesa/main/extensions: use GL_ARB_gpu_shader5 in legacy contexts
is the prerequisit for the others I just sent CC: "Marek Olšák" CC: Eric Anholt CC: Emil Velikov CC: Timothy Arceri CC: Ilia Mirkin Signed-off-by: Benedikt Schemmer --- src/mesa/main/extensions_table.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/mesa/main/extensions_table.h b/src/mesa/main/extensions_table.h index 38d241db52..ffb1caccdd 100644 --- a/src/mesa/main/extensions_table.h +++ b/src/mesa/main/extensions_table.h @@ -74,7 +74,7 @@ EXT(ARB_framebuffer_sRGB, EXT_framebuffer_sRGB EXT(ARB_get_program_binary , dummy_true , GLL, GLC, x , x , 2010) EXT(ARB_get_texture_sub_image , dummy_true , GLL, GLC, x , x , 2014) EXT(ARB_gl_spirv, ARB_gl_spirv , x, GLC, x , x , 2016) -EXT(ARB_gpu_shader5 , ARB_gpu_shader5 , x , GLC, x , x , 2010) +EXT(ARB_gpu_shader5 , ARB_gpu_shader5 , GLL, GLC, x , x , 2010) EXT(ARB_gpu_shader_fp64 , ARB_gpu_shader_fp64 , x , GLC, x , x , 2010) EXT(ARB_gpu_shader_int64, ARB_gpu_shader_int64 , x , GLC, x , x , 2015) EXT(ARB_half_float_pixel, dummy_true , GLL, GLC, x , x , 2003) -- 2.14.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] shader-db/si-report: Show biggest improvements also
Hello Marek, thank you very much for reviewing. I don't have commit rights, could you push for me? (I only wrote that in the cover letter, should I put that in every patch in the future?) Benedikt Am 20.05.2018 um 04:58 schrieb Marek Olšák: > For the series: > > Reviewed-by: Marek Olšák mailto:marek.ol...@amd.com>> > > Marek > > > On Thu, May 10, 2018 at 5:06 AM, Benedikt Schemmer <mailto:b...@besd.de>> wrote: > > Sometimes it is nice (and useful) to not just see if you've messed up > but also if you've made an improvement. > > --- > si-report.py | 24 > 1 file changed, 24 insertions(+) > > diff --git a/si-report.py b/si-report.py > index fba652c..d0e0686 100755 > --- a/si-report.py > +++ b/si-report.py > @@ -686,6 +686,30 @@ def print_tables(before_all_results, > after_all_results): > break > if num > 0: > print > + > + # biggest improvements > + metrics = si_stats().metrics > + for i in range(len(metrics)): > + field = metrics[i][0] > + num = 0 > + more_is_better = metrics[i][0] == 'maxwaves' > + > + if more_is_better: > + sort_key = lambda v: -v[1].diff.__dict__[field] > + else: > + sort_key = lambda v: v[1].diff.__dict__[field] > + > + for name, stats in sorted(shaders.items(), key = sort_key): > + if more_is_better: > + if stats.diff.__dict__[field] <= 0: > + continue > + else: > + if stats.diff.__dict__[field] >= 0: > + continue > + > + if num == 0: > + print_yellow(" BIGGEST IMPROVEMENTS - > {:64}".format(metrics[i][1])) > + print_yellow(" Before After Delta Percentage") > stats.print_regression(name, field) > num += 1 > if num == num_listed: > -- > 2.14.1 > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org <mailto:mesa-dev@lists.freedesktop.org> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev > <https://lists.freedesktop.org/mailman/listinfo/mesa-dev> > > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [RFC PATCH] Replace an flock with a random filename to evade some very ugly system dependent code
There is exactly one flock in mesa and it caused mesa not to build on windows when shader cache was enabled. It should be possible to revert 9f8dc3bf03ec825bae7041858dda6ca2e9a34363 "utils: build sha1/disk cache only with Android/Autoconf" currently guarding the offending code with ENABLE_SHADER_CACHE This would allow shader cache to work on windows I think. I dont have a test system with windows though. This builds on linux and is tested with Deus Ex:MD and Metro 2033 Redux both with cold shader cache. Really Fixes: d1efa09d342bff3e5def2978a0bef748d74f9c82 CC: Tapani Pälli CC: "Marek Olšák" CC: Emil Velikov CC: Timothy Arceri CC: Samuel Pitoiset --- This enables the patch [PATCH 1/3] mesa/main/shaderapi: Use generate_sha1() unconditionally src/util/disk_cache.c | 48 +++- 1 file changed, 35 insertions(+), 13 deletions(-) diff --git a/src/util/disk_cache.c b/src/util/disk_cache.c index 4a762eff20..ca47bb15fb 100644 --- a/src/util/disk_cache.c +++ b/src/util/disk_cache.c @@ -28,7 +28,6 @@ #include #include #include -#include #include #include #include @@ -848,6 +847,29 @@ struct cache_entry_file_data { uint32_t uncompressed_size; }; +static char * +generate_random_string(int length) { + static const char a[] = "0123456789abcdef"; + + if (length > 16) + return NULL; + + char buf[16]; + char *rndstr; + + for (int i = 0; i < length - 1; ++i) { + // assign a random element from the lookup table + buf[i] = a[rand() % (sizeof(a) - 1)]; + } + + buf[length - 1] = 0; + + if (asprintf(&rndstr, "%s", buf) == -1) + return NULL; + + return rndstr; +} + static void cache_put(void *job, int thread_index) { @@ -855,7 +877,7 @@ cache_put(void *job, int thread_index) int fd = -1, fd_final = -1, err, ret; unsigned i = 0; - char *filename = NULL, *filename_tmp = NULL; + char *filename = NULL, *filename_tmp = NULL, *random = NULL; struct disk_cache_put_job *dc_job = (struct disk_cache_put_job *) job; filename = get_cache_file(dc_job->cache, dc_job->key); @@ -873,7 +895,16 @@ cache_put(void *job, int thread_index) * final destination filename, (to prevent any readers from seeing * a partially written file). */ - if (asprintf(&filename_tmp, "%s.tmp", filename) == -1) + + /* This next part used to be an flock(), which would prevent windows systems +* to build. 4 hex characters should be enough to prevent filename race +* conditions for now. + */ + random = generate_random_string(4); + if (random == NULL) + goto done; + + if (asprintf(&filename_tmp, "%s_%s.tmp", filename, random) == -1) goto done; fd = open(filename_tmp, O_WRONLY | O_CLOEXEC | O_CREAT, 0644); @@ -890,16 +921,7 @@ cache_put(void *job, int thread_index) goto done; } - /* With the temporary file open, we take an exclusive flock on -* it. If the flock fails, then another process still has the file -* open with the flock held. So just let that file be responsible -* for writing the file. -*/ - err = flock(fd, LOCK_EX | LOCK_NB); - if (err == -1) - goto done; - - /* Now that we have the lock on the open temporary file, we can + /* Now that we have the open temporary file, we can * check to see if the destination file already exists. If so, * another process won the race between when we saw that the file * didn't exist and now. In this case, we don't do anything more, -- 2.14.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [RFC PATCH] Replace an flock with a random filename to evade some very ugly system dependent code
Thanks, fixed locally. Am 20.05.2018 um 14:46 schrieb Mariusz Ceier: > On 20 May 2018 at 14:16, Benedikt Schemmer wrote: >> There is exactly one flock in mesa and it caused mesa not to build >> on windows when shader cache was enabled. >> >> It should be possible to revert 9f8dc3bf03ec825bae7041858dda6ca2e9a34363 >> "utils: build sha1/disk cache only with Android/Autoconf" currently >> guarding the offending code with ENABLE_SHADER_CACHE >> >> This would allow shader cache to work on windows I think. >> >> I dont have a test system with windows though. >> This builds on linux and is tested with Deus Ex:MD and Metro 2033 Redux >> both with cold shader cache. >> >> Really >> Fixes: d1efa09d342bff3e5def2978a0bef748d74f9c82 >> >> CC: Tapani Pälli >> CC: "Marek Olšák" >> CC: Emil Velikov >> CC: Timothy Arceri >> CC: Samuel Pitoiset >> --- >> This enables the patch >> [PATCH 1/3] mesa/main/shaderapi: Use generate_sha1() unconditionally >> >> src/util/disk_cache.c | 48 +++- >> 1 file changed, 35 insertions(+), 13 deletions(-) >> >> diff --git a/src/util/disk_cache.c b/src/util/disk_cache.c >> index 4a762eff20..ca47bb15fb 100644 >> --- a/src/util/disk_cache.c >> +++ b/src/util/disk_cache.c >> @@ -28,7 +28,6 @@ >> #include >> #include >> #include >> -#include >> #include >> #include >> #include >> @@ -848,6 +847,29 @@ struct cache_entry_file_data { >> uint32_t uncompressed_size; >> }; >> >> +static char * >> +generate_random_string(int length) { >> + static const char a[] = "0123456789abcdef"; >> + >> + if (length > 16) >> + return NULL; >> + >> + char buf[16]; >> + char *rndstr; >> + >> + for (int i = 0; i < length - 1; ++i) { >> + // assign a random element from the lookup table >> + buf[i] = a[rand() % (sizeof(a) - 1)]; >> + } >> + >> + buf[length - 1] = 0; >> + >> + if (asprintf(&rndstr, "%s", buf) == -1) >> + return NULL; >> + >> + return rndstr; >> +} >> + >> static void >> cache_put(void *job, int thread_index) >> { >> @@ -855,7 +877,7 @@ cache_put(void *job, int thread_index) >> >> int fd = -1, fd_final = -1, err, ret; >> unsigned i = 0; >> - char *filename = NULL, *filename_tmp = NULL; >> + char *filename = NULL, *filename_tmp = NULL, *random = NULL; >> struct disk_cache_put_job *dc_job = (struct disk_cache_put_job *) job; >> >> filename = get_cache_file(dc_job->cache, dc_job->key); >> @@ -873,7 +895,16 @@ cache_put(void *job, int thread_index) >> * final destination filename, (to prevent any readers from seeing >> * a partially written file). >> */ >> - if (asprintf(&filename_tmp, "%s.tmp", filename) == -1) >> + >> + /* This next part used to be an flock(), which would prevent windows >> systems >> +* to build. 4 hex characters should be enough to prevent filename race >> +* conditions for now. >> + */ >> + random = generate_random_string(4); >> + if (random == NULL) >> + goto done; >> + >> + if (asprintf(&filename_tmp, "%s_%s.tmp", filename, random) == -1) >>goto done; >> > > You forgot to free the random string. > >> fd = open(filename_tmp, O_WRONLY | O_CLOEXEC | O_CREAT, 0644); >> @@ -890,16 +921,7 @@ cache_put(void *job, int thread_index) >> goto done; >> } >> >> - /* With the temporary file open, we take an exclusive flock on >> -* it. If the flock fails, then another process still has the file >> -* open with the flock held. So just let that file be responsible >> -* for writing the file. >> -*/ >> - err = flock(fd, LOCK_EX | LOCK_NB); >> - if (err == -1) >> - goto done; >> - >> - /* Now that we have the lock on the open temporary file, we can >> + /* Now that we have the open temporary file, we can >> * check to see if the destination file already exists. If so, >> * another process won the race between when we saw that the file >> * didn't exist and now. In this case, we don't do anything more, >> -- >> 2.14.1 >> >> ___ >> mesa-dev mailing list >> mesa-dev@lists.freedesktop.org >> https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [RFC PATCH v2] Evade very ugly system dependent code by replacing an flock
There is exactly one flock in mesa and it caused mesa not to build on windows when shader cache was enabled. It should be possible to revert 9f8dc3bf03ec825bae7041858dda6ca2e9a34363 "utils: build sha1/disk cache only with Android/Autoconf" currently guarding the offending code with ENABLE_SHADER_CACHE This would allow shader cache to work on windows I think. I dont have a test system with windows though. This builds on linux and is tested with Deus Ex:MD and Metro 2033 Redux both with cold shader cache. Really Fixes: d1efa09d342bff3e5def2978a0bef748d74f9c82 v2: make the patch ridiculously simple (Jan Vesely) needs testing though CC: Tapani Pälli CC: "Marek Olšák" CC: Emil Velikov CC: Timothy Arceri CC: Samuel Pitoiset CC: Jan Vesely --- This enables the patch [PATCH 1/3] mesa/main/shaderapi: Use generate_sha1() unconditionally src/util/disk_cache.c | 14 ++ 1 file changed, 2 insertions(+), 12 deletions(-) diff --git a/src/util/disk_cache.c b/src/util/disk_cache.c index 4a762eff20..35847af6a6 100644 --- a/src/util/disk_cache.c +++ b/src/util/disk_cache.c @@ -28,7 +28,6 @@ #include #include #include -#include #include #include #include @@ -876,7 +875,7 @@ cache_put(void *job, int thread_index) if (asprintf(&filename_tmp, "%s.tmp", filename) == -1) goto done; - fd = open(filename_tmp, O_WRONLY | O_CLOEXEC | O_CREAT, 0644); + fd = open(filename_tmp, O_WRONLY | O_CLOEXEC | O_CREAT | O_EXCL, 0644); /* Make the two-character subdirectory within the cache as needed. */ if (fd == -1) { @@ -890,16 +889,7 @@ cache_put(void *job, int thread_index) goto done; } - /* With the temporary file open, we take an exclusive flock on -* it. If the flock fails, then another process still has the file -* open with the flock held. So just let that file be responsible -* for writing the file. -*/ - err = flock(fd, LOCK_EX | LOCK_NB); - if (err == -1) - goto done; - - /* Now that we have the lock on the open temporary file, we can + /* Now that we have the open temporary file, we can * check to see if the destination file already exists. If so, * another process won the race between when we saw that the file * didn't exist and now. In this case, we don't do anything more, -- 2.14.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [RFC PATCH] Replace an flock with a random filename to evade some very ugly system dependent code
Am 21.05.2018 um 01:19 schrieb Timothy Arceri: > > > On 20/05/18 22:16, Benedikt Schemmer wrote: >> There is exactly one flock in mesa and it caused mesa not to build >> on windows when shader cache was enabled. >> >> It should be possible to revert 9f8dc3bf03ec825bae7041858dda6ca2e9a34363 >> "utils: build sha1/disk cache only with Android/Autoconf" currently >> guarding the offending code with ENABLE_SHADER_CACHE >> >> This would allow shader cache to work on windows I think. > > NAK. rand() is not thread safe. > > Why are you trying to make this build on windows? Appveyor https://ci.appveyor.com/project/mesa3d/mesa/build/3186 was the reason why large parts are guarded by ENABLE_SHADER_CACHE which prevents me from using a simple function in the patch I mentioned I can of course circumvent that by just duplicating the code I need. It just seems cleaner to have no system dependent code here. But you're right: its quite convoluted and every time I think I got it something else pops up ;) There are also other dependencies (dlfcn.h) that need to be removed. I already have, at least to some degree. I dont know how to eliminate the runtime check for the llvm build though, but that is only used by amd code so I moved it to ac_llvm_util.h for the moment which seems to work. si_pipe.c somewhere around line 800 now looks like: --- if (ac_get_function_timestamp(LLVMInitializeAMDGPUTargetInfo, &llvm_timestamp)) { res = asprintf(×tamp_str, "%s_%u", __TIMESTAMP__, llvm_timestamp); } --- >There is no use case for this on windows currently so it won't even be tested. >There are also other calls such as getuid() etc that >will fail on windows. > > If someone want this to work on windows they should just write windows > specific paths IMO. > >> >> I dont have a test system with windows though. >> This builds on linux and is tested with Deus Ex:MD and Metro 2033 Redux >> both with cold shader cache. >> >> Really >> Fixes: d1efa09d342bff3e5def2978a0bef748d74f9c82 >> >> CC: Tapani Pälli >> CC: "Marek Olšák" >> CC: Emil Velikov >> CC: Timothy Arceri >> CC: Samuel Pitoiset >> --- >> This enables the patch >> [PATCH 1/3] mesa/main/shaderapi: Use generate_sha1() unconditionally >> >> src/util/disk_cache.c | 48 +++- >> 1 file changed, 35 insertions(+), 13 deletions(-) >> >> diff --git a/src/util/disk_cache.c b/src/util/disk_cache.c >> index 4a762eff20..ca47bb15fb 100644 >> --- a/src/util/disk_cache.c >> +++ b/src/util/disk_cache.c >> @@ -28,7 +28,6 @@ >> #include >> #include >> #include >> -#include >> #include >> #include >> #include >> @@ -848,6 +847,29 @@ struct cache_entry_file_data { >> uint32_t uncompressed_size; >> }; >> +static char * >> +generate_random_string(int length) { >> + static const char a[] = "0123456789abcdef"; >> + >> + if (length > 16) >> + return NULL; >> + >> + char buf[16]; >> + char *rndstr; >> + >> + for (int i = 0; i < length - 1; ++i) { >> + // assign a random element from the lookup table >> + buf[i] = a[rand() % (sizeof(a) - 1)]; >> + } >> + >> + buf[length - 1] = 0; >> + >> + if (asprintf(&rndstr, "%s", buf) == -1) >> + return NULL; >> + >> + return rndstr; >> +} >> + >> static void >> cache_put(void *job, int thread_index) >> { >> @@ -855,7 +877,7 @@ cache_put(void *job, int thread_index) >> int fd = -1, fd_final = -1, err, ret; >> unsigned i = 0; >> - char *filename = NULL, *filename_tmp = NULL; >> + char *filename = NULL, *filename_tmp = NULL, *random = NULL; >> struct disk_cache_put_job *dc_job = (struct disk_cache_put_job *) job; >> filename = get_cache_file(dc_job->cache, dc_job->key); >> @@ -873,7 +895,16 @@ cache_put(void *job, int thread_index) >> * final destination filename, (to prevent any readers from seeing >> * a partially written file). >> */ >> - if (asprintf(&filename_tmp, "%s.tmp", filename) == -1) >> + >> + /* This next part used to be an flock(), which would prevent windows >> systems >> + * to build. 4 hex characters should be enough to prevent filename race >> + * conditions for now. >> + */ >> + random = generat
Re: [Mesa-dev] [RFC PATCH] Replace an flock with a random filename to evade some very ugly system dependent code
Ok, small update. Please ignore this rfc. I finally got appveyor to build my repo. There are at least these dependencies that would need to addressed (for anybody wanting tor try): #include // can be eliminated sometimes by replacing with __TIMESTAMP__ or maybe better __DATE__ __TIME__ #include #include #include #include #include "zlib.h" // complains about redefinition vsnprintf (I used version 1.23) So for now: code duplication it is ;) appveyor has mingw available, maybe I will give that a try sometime https://www.appveyor.com/docs/build-environment/ and they even have an ubuntu image available https://www.appveyor.com/docs/getting-started-with-appveyor-for-linux/ wonder why that isnt being used instead of msvc 2015 with llvm 3.3.1? Am 21.05.2018 um 01:19 schrieb Timothy Arceri: > > > On 20/05/18 22:16, Benedikt Schemmer wrote: >> There is exactly one flock in mesa and it caused mesa not to build >> on windows when shader cache was enabled. >> >> It should be possible to revert 9f8dc3bf03ec825bae7041858dda6ca2e9a34363 >> "utils: build sha1/disk cache only with Android/Autoconf" currently >> guarding the offending code with ENABLE_SHADER_CACHE >> >> This would allow shader cache to work on windows I think. > > NAK. rand() is not thread safe. > > Why are you trying to make this build on windows? There is no use case for > this on windows currently so it won't even be tested. There are also other > calls such as getuid() etc that will fail on windows. > > If someone want this to work on windows they should just write windows > specific paths IMO. > >> >> I dont have a test system with windows though. >> This builds on linux and is tested with Deus Ex:MD and Metro 2033 Redux >> both with cold shader cache. >> >> Really >> Fixes: d1efa09d342bff3e5def2978a0bef748d74f9c82 >> >> CC: Tapani Pälli >> CC: "Marek Olšák" >> CC: Emil Velikov >> CC: Timothy Arceri >> CC: Samuel Pitoiset >> --- >> This enables the patch >> [PATCH 1/3] mesa/main/shaderapi: Use generate_sha1() unconditionally >> >> src/util/disk_cache.c | 48 +++- >> 1 file changed, 35 insertions(+), 13 deletions(-) >> >> diff --git a/src/util/disk_cache.c b/src/util/disk_cache.c >> index 4a762eff20..ca47bb15fb 100644 >> --- a/src/util/disk_cache.c >> +++ b/src/util/disk_cache.c >> @@ -28,7 +28,6 @@ >> #include >> #include >> #include >> -#include >> #include >> #include >> #include >> @@ -848,6 +847,29 @@ struct cache_entry_file_data { >> uint32_t uncompressed_size; >> }; >> +static char * >> +generate_random_string(int length) { >> + static const char a[] = "0123456789abcdef"; >> + >> + if (length > 16) >> + return NULL; >> + >> + char buf[16]; >> + char *rndstr; >> + >> + for (int i = 0; i < length - 1; ++i) { >> + // assign a random element from the lookup table >> + buf[i] = a[rand() % (sizeof(a) - 1)]; >> + } >> + >> + buf[length - 1] = 0; >> + >> + if (asprintf(&rndstr, "%s", buf) == -1) >> + return NULL; >> + >> + return rndstr; >> +} >> + >> static void >> cache_put(void *job, int thread_index) >> { >> @@ -855,7 +877,7 @@ cache_put(void *job, int thread_index) >> int fd = -1, fd_final = -1, err, ret; >> unsigned i = 0; >> - char *filename = NULL, *filename_tmp = NULL; >> + char *filename = NULL, *filename_tmp = NULL, *random = NULL; >> struct disk_cache_put_job *dc_job = (struct disk_cache_put_job *) job; >> filename = get_cache_file(dc_job->cache, dc_job->key); >> @@ -873,7 +895,16 @@ cache_put(void *job, int thread_index) >> * final destination filename, (to prevent any readers from seeing >> * a partially written file). >> */ >> - if (asprintf(&filename_tmp, "%s.tmp", filename) == -1) >> + >> + /* This next part used to be an flock(), which would prevent windows >> systems >> + * to build. 4 hex characters should be enough to prevent filename race >> + * conditions for now. >> + */ >> + random = generate_random_string(4); >> + if (random == NULL) >> + goto done; >> + >> + if (asprintf(&filename_tmp, "%s_%s.tmp", filename, random) == -1) >> goto done; >> fd = open(filename_tmp, O_WRONLY | O_CLOEXEC | O_CREAT, 0644); >&
Re: [Mesa-dev] [PATCH 3/3] mesa/main/shaderapi: purely non-functional cleanups, like whitespace errors and cleanups
Am 21.05.2018 um 19:21 schrieb Ian Romanick: > On 05/10/2018 02:05 AM, Benedikt Schemmer wrote: >> remove a memset too and yes, this is all functionally identical >> >> --- >> src/mesa/main/shaderapi.c | 40 >> 1 file changed, 20 insertions(+), 20 deletions(-) >> >> diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c >> index e8acca4490..1d0ca5374b 100644 >> --- a/src/mesa/main/shaderapi.c >> +++ b/src/mesa/main/shaderapi.c >> @@ -241,11 +241,10 @@ _mesa_init_shader_state(struct gl_context *ctx) >> /* Device drivers may override these to control what kind of instructions >> * are generated by the GLSL compiler. >> */ >> - struct gl_shader_compiler_options options; >> + struct gl_shader_compiler_options options = {}; >> gl_shader_stage sh; >> int i; >> >> - memset(&options, 0, sizeof(options)); > > This is not functionally the same. The memset will zero padding fields > added by the compiler, but '= {}' does not. I did check with https://godbolt.org/ and at least for clang (which generates a memset for {}) and gcc it is exactly the same void test(void) { struct gl_shader_compiler_options options = {}; memset(&options, 0, sizeof(options)); } gcc: // = {} mov QWORD PTR [rbp-32], 0 mov QWORD PTR [rbp-24], 0 mov QWORD PTR [rbp-16], 0 // memset lea rax, [rbp-32] mov edx, 24 mov esi, 0 mov rdi, rax call memset clang: // = {} mov rdi, rsi mov qword ptr [rbp - 32], rsi # 8-byte Spill mov esi, eax mov qword ptr [rbp - 40], rdx # 8-byte Spill mov dword ptr [rbp - 44], eax # 4-byte Spill call memset //memset mov rdx, qword ptr [rbp - 32] # 8-byte Reload mov rdi, rdx mov esi, dword ptr [rbp - 44] # 4-byte Reload mov rdx, qword ptr [rbp - 40] # 8-byte Reload call memset but your right intel compilers generate something weird: // = {} lea rax, QWORD PTR [-32+rbp] #71.47 mov edx, 0 #71.47 mov ecx, 24 #71.47 mov rdi, rax #71.47 mov eax, edx #71.47 and eax, 65535 #71.47 mov ah, al #71.47 mov edx, eax #71.47 shl eax, 16 #71.47 or eax, edx #71.47 mov esi, ecx #71.47 shr rcx, 2 #71.47 rep stosd #71.47 mov ecx, esi #71.47 and ecx, 3 #71.47 rep stosb #71.47 // memset lea rax, QWORD PTR [-32+rbp] #72.5 mov edx, 0 #72.5 mov ecx, 24 #72.5 mov rdi, rax #72.5 mov esi, edx #72.5 mov rdx, rcx #72.5 call memset #72.5 mov QWORD PTR [-8+rbp], rax #72.5 > > I'm also not fond of the The '!= 0' and '== NULL' changes. Pretty much > everywhere in core Mesa uses those patterns.> > I feel like most of this is just a bunch of spurious changes that will > just make cherry picking patches to stable irritating later on. > >> options.MaxUnrollIterations = 32; >> options.MaxIfDepth = UINT_MAX; >> >> @@ -254,7 +253,7 @@ _mesa_init_shader_state(struct gl_context *ctx) >> >> ctx->Shader.Flags = _mesa_get_shader_flags(); >> >> - if (ctx->Shader.Flags != 0) >> + if (ctx->Shader.Flags) >>ctx->Const.GenerateTemporaryNames = true; >> >> /* Extended for ARB_separate_shader_objects */ >> @@ -771,7 +770,8 @@ get_programiv(struct gl_context *ctx, GLuint program, >> GLenum pname, >>GLint *params) >> { >> struct gl_shader_program *shProg >> - = _mesa_lookup_shader_program_err(ctx, program, >> "glGetProgramiv(program)"); >> + = _mesa_lookup_shader_program_err(ctx, program, >> +"glGetProgramiv(program)"); >> >> /* Is transform feedback available in this context? >> */ >> @@ -953,7 +953,7 @@ get_programiv(struct gl_context *ctx, GLuint program, >> GLenum pname, >>*params = shProg->BinaryRetreivableHint; >>return; >> case GL_PROGRAM_BINARY_LENGTH: >> - if (ctx->Const.NumProgramBinaryFormats == 0) { >> + if (!ctx->Const.NumProgramBinaryFormats) { >> *params = 0; >>} else { >> _mesa_get_program_binary_length(ctx, shProg, params); >> @@ -974,7 +974,7 @@ get_programiv(struct gl_context *ctx, GLuint program, >> GLenum pname, >> "linked)"); >> return; >>} >> - if (shProg->_LinkedShaders[MESA_SHADER_COMPUTE] == NULL) { >> + if (!shProg->_LinkedShaders[MESA_SHADER_COMPUTE]) { >> _mesa_error(ctx, GL_INVALID_OPERATION, "glGetProgramiv(no compute " >> "shaders)"); >> return; >>
[Mesa-dev] [HOWTO] CI on appveyor with a linux image
I thought this might be interesting because currently appveyor is only used with MSVC and LLVM 3.3.1 They can however provide Ubuntu 16.04 LTS. Its free for OSS. You only need some kind of repository. Basic setup is easy https://www.appveyor.com/docs/ The hard part is the appveyor.yml file, which you can find below. This builds mesa every time you commit. Probably not the best config options and only for 64-bit, because I dont know how to do 32-bit builds on Ubuntu. So if somebody does, help is more than welcome. And if the Vmware guys feel like sharing, this can be merged with the current appveyor.yml (somehow, doesn't seem to hard though) p.s. there are line breaks after the autogen that you will have to remove manually (should be one long line) --- # http://www.appveyor.com/doc # # To setup AppVeyor for your own personal repositories do the following: # - Sign up # - Add a new project # - Select Git and fill in the Git clone URL # - Setup a Git hook as explained in # https://github.com/appveyor/webhooks#installing-git-hook # - Check 'Settings > General > Skip branches without appveyor.yml' # - Check 'Settings > General > Rolling builds' # - Setup the global or project notifications to your liking # # Note that kicking (or restarting) a build via the web UI will not work, as it # will fail to find appveyor.yml . The Git hook is the most practical way to # kick a build. # # See also: # - http://help.appveyor.com/discussions/problems/2209-node-grunt-build-specify-a-project-or-solution-file-the-directory-does-not-contain-a-project-or-solution-file # - http://help.appveyor.com/discussions/questions/1184-build-config-vs-appveyoryaml version: '{build}' branches: except: - /^travis.*$/ # Don't download the full Mesa history to speed up cloning. However the clone # depth must not be too small, otherwise builds might fail when lots of patches # are committed in succession, because the desired commit is not found on the # truncated history. # # See also: # - https://www.appveyor.com/blog/2014/06/04/shallow-clone-for-git-repositories clone_depth: 100 image: ubuntu platform: Any CPU configuration: Release install: - ls -al - sh: sudo sed -i~orig -e 's$# deb-src http://us$deb-src http://us$' /etc/apt/sources.list - sh: sudo add-apt-repository ppa:oibaf/graphics-drivers - sh: sudo apt-get update - sh: sudo DEBIAN_FRONTEND=noninteractive apt-get -y -o Dpkg::Options::="--force-confdef" -o Dpkg::Options::="--force-confold" upgrade - sh: sudo apt-get build-dep mesa -y - sh: sudo apt-get install libxvmc-dev libxcb-xvmc0-dev libomxil-bellagio-dev -y build_script: - sh: ./autogen.sh --enable-dri --with-dri-drivers="nouveau i915 i965 r200 radeon swrast" --enable-osmesa --enable-glx-tls --enable-shared-glapi --enable-texture-float --enable-driglx-direct --enable-dri3 --with-platforms="x11 wayland drm" --enable-xa --enable-llvm ac_cv_path_LLVM_CONFIG=llvm-config-5.0 --enable-vdpau --enable-omx-bellagio --enable-va --enable-xvmc --enable-opencl --enable-opencl-icd --enable-nine --enable-gallium-extra-hud --enable-lmsensors --with-gallium-drivers=" nouveau svga r600 r300 i915 virgl radeonsi swrast" --enable-gles1 --enable-gles2 --enable-gle --with-vulkan-drivers=intel,radeon - sh: make - sh: sudo make install ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Status of radeonsi NIR
Hi all, I thought I'd do some testing on my machine to see if there are any problems with nir (enabled with R600_DEBUG=nir) Tested on Ubuntu 17.10 with mesa git of today (oibaf for 32 bit, mine with LLVM5.0 on 64 bit because I cant figure out how to build 32bit drivers on ubuntu :( ) Xeon 1260l and 8GB DDR3-10600 with RX460 4GB works means no obvious visual artifacts etc. I didn't have time to do detailed benchmarking. --- Alien Isolation - works Bioshock Infinite - works, nonir Overall avg 45.58, min 9.81, max 104.49; nir avg 31.47 min 8.35 max 94.19 CAT Interstellar (UE4) - works Dead Island Definitive - works Dead Island Riptide Definitive - works Deus Ex: Mankind Divided - works, nonir low 8.3 avg 24.2 high 37.6; nir low 1 avg 22.8 high 35.8 but shader generation crashes two times: once around half the loader bar with complete system freeze and then again around the end but only exiting the applicationand this is reproducible by deleting the shader cache Mad Max[vulcan beta] - works benchmark 1 nonir 43/45/47 nir 10/46/50 benchmark 2 nonir 14/37/41 nir 24/36/40 benchmark 3 nonir 8/43/61 nir 5/41/61 benchmark 4 nonir 9/49/78 nir 4/47/74 Tomb Raider - TressFX doesn't work: hair flickering and is riddled with small green block artifacts, smoothing is missing; speed is the same 28.6 ish Hitman GO (Unity) - works Unity of Command (PyGame, SDL) - works Talos Principle Benchmark nir 64.7 nonir 67 I have more games if there are any usual suspects that are likely to cause problems. Cheers, Benedikt ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Status of radeonsi NIR
Hi Timothy, actually I am a subscriber to the list, dont know why this happens from time to time Benchmarkwise: NIR is now on par with TGSI, I put up some new HUD screenshots and benchmarks You might actually be too conservative: TGSI shows 2 spills, NIR 0 For now only two things stand out: some crashes during shader generation in Deus Ex, Dying Light and Dead Island (Original) (see nir test) and TressFX not working correctly in Tomb Raider. Am 07.03.2018 um 05:16 schrieb Timothy Arceri: > On 06/03/18 23:49, Benedikt Schemmer wrote: >> Hi Timothy, >> >> I put some additional benchmark results and data on github for you: > > Thanks! > >> >> https://github.com/bendat78/mymesa/tree/mymesa2/testresults >> >> I use a resolution of 1920x1080 for bioshock but also custom settings >> with all options enabled > > Ok I'll try some different combos and see if I can find any > performance regressions with nir. > >> >> I also wrote another email for the list, but it hasnt appeared yet. > > If you are not subscribed to the list you will need to wait until > someone approves it (which doesn't always happen as people are busy), > it's usually just easier to subscribe to the list and filter the > emails into a folder. > >> >> A log of my testing can be found in the "nir test" file >> >> >> >> Am 06.03.2018 um 07:38 schrieb Timothy Arceri: >>> On 05/03/18 03:43, Benedikt Schemmer wrote: >>>> Hi all, >>>> I thought I'd do some testing on my machine to see if there are any >>>> problems with nir (enabled with R600_DEBUG=nir) >>>> >>>> Tested on Ubuntu 17.10 with mesa git of today >>>> (oibaf for 32 bit, mine with LLVM5.0 on 64 bit because I cant >>>> figure out how to build 32bit drivers on ubuntu :( ) >>>> Xeon 1260l and 8GB DDR3-10600 with RX460 4GB >>>> >>>> works means no obvious visual artifacts etc. >>>> I didn't have time to do detailed benchmarking. >>>> >>>> --- >>>> >>>> Alien Isolation - works >>>> >>>> Bioshock Infinite - works, nonir Overall avg 45.58, min 9.81, max >>>> 104.49; nir avg 31.47 min 8.35 max 94.19 >>> >>> Hi, >>> >>> Do you recall what resolution / settings you used? >>> >>> With a quick test I'm seeing pretty good results on Ultra. >>> >>> 3840 x 2160 - Ultra on RX 580: (tgsi) 90.89, (nir) 107.22 >>> >>> >>>> >>>> CAT Interstellar (UE4) - works >>>> >>>> Dead Island Definitive - works >>>> >>>> Dead Island Riptide Definitive - works >>>> >>>> Deus Ex: Mankind Divided - works, nonir low 8.3 avg 24.2 high 37.6; >>>> nir low 1 avg 22.8 high 35.8 >>>> but shader generation crashes two times: once around half the >>>> loader bar with complete system freeze and then again around the >>>> end but only exiting the applicationand this is reproducible by >>>> deleting the shader cache >>>> >>>> Mad Max[vulcan beta] - works >>>> benchmark 1 >>>> nonir 43/45/47 >>>> nir 10/46/50 >>>> benchmark 2 >>>> nonir 14/37/41 >>>> nir 24/36/40 >>>> benchmark 3 >>>> nonir 8/43/61 >>>> nir 5/41/61 >>>> benchmark 4 >>>> nonir 9/49/78 >>>> nir 4/47/74 >>>> >>>> Tomb Raider - TressFX doesn't work: hair flickering and is riddled >>>> with small green block artifacts, smoothing is missing; speed is >>>> the same 28.6 ish >>>> >>>> Hitman GO (Unity) - works >>>> >>>> Unity of Command (PyGame, SDL) - works >>>> >>>> Talos Principle Benchmark nir 64.7 nonir 67 >>>> >>>> >>>> I have more games if there are any usual suspects that are likely >>>> to cause problems. >>>> >>>> Cheers, >>>> Benedikt >> ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] Question: Underutilization of GPU in certain benchmarks - Bug or Feature ?
Hi all, while looking for problems with NIR, I noticed that in the Bioshock Infinite and also Dirt Rallye Benchmark there are times when GPU shader utilization drops to 80% (Bioshock) or 60% (Dirt Rallye). In Bioshock this directly associated with fps drops, in Dirt Rallye it is not so obvious but higher utilization means higher fps. Is this normal? Am I looking at the wrong things? Cheers, Benedikt P.S. Pictures of the Benchmarks with Gallium HUD traces can be found at https://github.com/bendat78/mymesa/tree/mymesa2/underutilization ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] Util: fix msvc build
The MSVC preprocessor doesnt understand #warning --- src/util/process.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/util/process.c b/src/util/process.c index 449c7fa77c2..6e6376986f3 100644 --- a/src/util/process.c +++ b/src/util/process.c @@ -107,7 +107,7 @@ __getProgramName() #define GET_PROGRAM_NAME() __getProgramName() #else #define GET_PROGRAM_NAME() "" -#warning "Per application configuration won't work with your OS version." +#pragma message ( "Warning: Per application configuration won't work with your OS version." ) #endif #endif -- 2.17.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Status of radeonsi NIR
Hi all, I did some more testing with NIR and wanted to share the results. https://github.com/bendat78/mymesa/tree/mymesa2/testresults Overall it seems nir could be better than tgsi, but there are some shaders with significant regressions and I think that introduces framerate drops that are quite noticable in some games (Total War: WARHAMMER benchmark for example). See the run* directory for comparisons of tgsi vs nir for different versions of llvm and tgsi or nir vs different llvm backends (also includes complete shader db output (around 50k shaders)). NIR gets better with every version increase of llvm, which optimizes for code size (why not speed ?). I found no serious regressions so far, however I think that NIR is responsible for some crashes that happen in Metro 2033 Redux (beginning before actual game) and Dead Island classic (ingame) so probably related to compilation issues. Cheers, Benedikt ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] Disable bindless textures with radeonsi NIR until there is support for them.
From: Benedikt Schemmer Date: Sun, 1 Apr 2018 13:18:02 +0200 Subject: [PATCH] Disable bindless textures with radeonsi NIR until there is support for them. - Allows to build and use a debug build of mesa with shader-db (crashes otherwise with affected shaders) - Dirt Rally doesnt care about bindless, but Warhammer 40k Dawn 3 does and crashes with nir when ARB_bindless_texture is reported --- src/gallium/drivers/radeonsi/si_get.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/src/gallium/drivers/radeonsi/si_get.c b/src/gallium/drivers/radeonsi/si_get.c --- a/src/gallium/drivers/radeonsi/si_get.c +++ b/src/gallium/drivers/radeonsi/si_get.c @@ -176,7 +176,6 @@ static int si_get_param(struct pipe_screen *pscreen, enum pipe_cap param) case PIPE_CAP_DOUBLES: case PIPE_CAP_TGSI_TEX_TXF_LZ: case PIPE_CAP_TGSI_TES_LAYER_VIEWPORT: - case PIPE_CAP_BINDLESS_TEXTURE: case PIPE_CAP_QUERY_TIMESTAMP: case PIPE_CAP_QUERY_TIME_ELAPSED: case PIPE_CAP_NIR_SAMPLERS_AS_DEREF: @@ -257,6 +256,11 @@ static int si_get_param(struct pipe_screen *pscreen, enum pipe_cap param) return 1; return 0; + case PIPE_CAP_BINDLESS_TEXTURE: + if (sscreen->debug_flags & DBG(NIR)) + return 0; + return 1; + /* Unsupported features. */ case PIPE_CAP_BUFFER_SAMPLER_VIEW_RGBA_ONLY: case PIPE_CAP_TGSI_FS_COORD_ORIGIN_LOWER_LEFT: -- 2.14.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] Status of radeonsi NIR
Hi Timothy, I put up another run for you with llvm 7 and mesa git of this morning. Also I pruned duplicate shaders (especially Metro Redux had significant amounts of those) so we're down to ~24k unique shaders. I'm going to use release builds for testing because there is no difference to the debug build and it is significantly faster. Shader-db run times with NIR are about 35% slower though. 540s vs 398s Max Increase: SGPRS: 32 -> 104 (225.00 %) (in shaders/deadisland_definitive/1833.shader_test) VGPRS: 48 -> 216 (350.00 %) (in shaders/dirtrallye/732.shader_test) Spilled SGPRs: 8 -> 78 (875.00 %) (in shaders/f1_2015/1574.shader_test) Spilled VGPRs: 71 -> 76 (7.04 %) (in shaders/dirtrallye/1264.shader_test) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 48 -> 56 (16.67 %) dwords per thread (in shaders/dirtrallye/1264.shader_test) Code Size: 2952 -> 12668 (329.13 %) bytes (in shaders/alien_isolation/5324.shader_test) LDS: 0 -> 0 (0.00 %) blocks Max Waves: 4 -> 8 (100.00 %) (in shaders/borderlands2/2608.shader_test) Wait states: 0 -> 0 (0.00 %) Max Decrease: SGPRS: 104 -> 40 (-61.54 %) (in shaders/csgo/1558.shader_test) VGPRS: 188 -> 136 (-27.66 %) (in shaders/cat/2031.shader_test) Spilled SGPRs: 24 -> 0 (-100.00 %) (in shaders/total_war_shogun_2/1108.shader_test) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 48 -> 0 (-100.00 %) (in shaders/soma/98.shader_test) Scratch size: 52 -> 0 (-100.00 %) dwords per thread (in shaders/soma/98.shader_test) Code Size: 17060 -> 7404 (-56.60 %) bytes (in shaders/dolphin/ubershaders/228.shader_test) LDS: 0 -> 0 (0.00 %) blocks Max Waves: 8 -> 3 (-62.50 %) (in shaders/f1_2015/1196.shader_test) Wait states: 0 -> 0 (0.00 %) ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Status of radeonsi NIR
Hi Timothy, thanks for looking into this. Dead Island still crashes for me with NIR. However when I attach apitrace it behaves even more strangely, both TGSI and NIR crash. TGSI gets a little further. Without apitrace I can play the game with TGSI and it reproducibly crashes during shader compilation (Loading Resort screen) with NIR. Dont spend too much time on this though I think it might be related to changes in mesa itself (memory leaks?) not necessarily NIR. I used apitrace trace --output=/home/nano/deadbeaf5 %command% in the launch options of steam, otherwise I dont even get this far or apitrace disattaches itself from the process and no trace is output tails ==> nircrash <== 186658 glTexSubImage2D(target = GL_TEXTURE_2D, level = 0, xoffset = 0, yoffset = 0, width = 1280, height = 720, format = GL_RED, type = GL_UNSIGNED_BYTE, pixels = blob(921600)) 186659 glGenerateMipmap(target = GL_TEXTURE_2D) 186660 glXMakeCurrent(dpy = 0xcb62580, drawable = 0, ctx = NULL) = True 186661 glXMakeCurrent(dpy = 0xcb62580, drawable = 39845908, ctx = 0xce13840) = True 186662 glBindTexture(target = GL_TEXTURE_2D, texture = 129) 186663 glTexSubImage2D(target = GL_TEXTURE_2D, level = 0, xoffset = 0, yoffset = 0, width = 640, height = 360, format = GL_RED, type = GL_UNSIGNED_BYTE, pixels = blob(230400)) 186664 glGenerateMipmap(target = GL_TEXTURE_2D) 186665 glXMakeCurrent(dpy = 0xcb62580, drawable = 0, ctx = NULL) = True 18 glXMakeCurrent(dpy = 0xcb62580, drawable = 39845908, ctx = 0xce13840) = True 186667 glBindTexture(target = GL_TEXTURE_2D, texture = 130) ==> nircrash2 <== 167095 glEnable(cap = GL_FRAMEBUFFER_SRGB) 167096 glClearColor(red = 0, green = 0, blue = 0, alpha = 1) 167097 glClear(mask = GL_COLOR_BUFFER_BIT) 167098 glClearColor(red = 0, green = 0, blue = 0, alpha = 0) 167099 glBindTexture(target = GL_TEXTURE_2D, texture = 128) 167100 glTexSubImage2D(target = GL_TEXTURE_2D, level = 0, xoffset = 0, yoffset = 0, width = 1280, height = 720, format = GL_RED, type = GL_UNSIGNED_BYTE, pixels = blob(921600)) 167101 glGenerateMipmap(target = GL_TEXTURE_2D) 167102 glXMakeCurrent(dpy = 0xcc99820, drawable = 0, ctx = NULL) = True 167103 glXMakeCurrent(dpy = 0xcc99820, drawable = 39845908, ctx = 0xcf4bd50) = True 167104 glBindTexture(target = GL_TEXTURE_2D, texture = 129) ==> nircrash3 <== 156903 glEnable(cap = GL_FRAMEBUFFER_SRGB) 156904 glClearColor(red = 0, green = 0, blue = 0, alpha = 1) 156905 glClear(mask = GL_COLOR_BUFFER_BIT) 156906 glClearColor(red = 0, green = 0, blue = 0, alpha = 0) 156907 glBindTexture(target = GL_TEXTURE_2D, texture = 128) 156908 glTexSubImage2D(target = GL_TEXTURE_2D, level = 0, xoffset = 0, yoffset = 0, width = 1280, height = 720, format = GL_RED, type = GL_UNSIGNED_BYTE, pixels = blob(921600)) 156909 glGenerateMipmap(target = GL_TEXTURE_2D) 156910 glXMakeCurrent(dpy = 0xc97f820, drawable = 0, ctx = NULL) = True 156911 glXMakeCurrent(dpy = 0xc97f820, drawable = 39845908, ctx = 0xcc57650) = True 156912 glBindTexture(target = GL_TEXTURE_2D, texture = 129) ==> tgsicrash <== 256272 glXMakeCurrent(dpy = 0xc6d26d0, drawable = 39845908, ctx = 0xc974890) = True 256273 glEnable(cap = GL_CULL_FACE) 256274 glFrontFace(mode = GL_CW) 256275 glDepthMask(flag = GL_FALSE) 256276 glXMakeCurrent(dpy = 0xc6d26d0, drawable = 0, ctx = NULL) = True 256277 glXMakeCurrent(dpy = 0xc6d26d0, drawable = 39845908, ctx = 0xc974890) = True 256278 glXMakeCurrent(dpy = 0xc6d26d0, drawable = 0, ctx = NULL) = True 256279 glXMakeCurrent(dpy = 0xc6d26d0, drawable = 39845908, ctx = 0xc974890) = True 256280 glXSwapIntervalMESA(interval = 0) = 0 256281 glXMakeCurrent(dpy = 0xc6d26d0, drawable = 0, ctx = NULL) = True ==> tgsicrash2 <== 294167 glXMakeCurrent(dpy = 0xcbbc820, drawable = 39845908, ctx = 0xce69b30) = True 294168 glEnable(cap = GL_CULL_FACE) 294169 glFrontFace(mode = GL_CW) 294170 glDepthMask(flag = GL_FALSE) 294171 glXMakeCurrent(dpy = 0xcbbc820, drawable = 0, ctx = NULL) = True 294172 glXMakeCurrent(dpy = 0xcbbc820, drawable = 39845908, ctx = 0xce69b30) = True 294173 glXMakeCurrent(dpy = 0xcbbc820, drawable = 0, ctx = NULL) = True 294174 glXMakeCurrent(dpy = 0xcbbc820, drawable = 39845908, ctx = 0xce69b30) = True 294175 glXSwapIntervalMESA(interval = 0) = 0 294176 glXMakeCurrent(dpy = 0xcbbc820, drawable = 0, ctx = NULL) = True Am 04.04.2018 um 06:05 schrieb Timothy Arceri: > On 31/03/18 02:44, Benedikt Schemmer wrote: >> Hi all, >> >> I did some more testing with NIR and wanted to share the results. >> https://github.com/bendat78/mymesa/tree/mymesa2/testresults >> >> Overall it seems nir could be better than tgsi, but there are some >> shaders with significant regressions and I think that introduces >> framerate drops that are quite noticable in some games (Total War: >> WARHAMMER benchmark for example). > > I ran the benchmark for this game and there was no real difference > between the two backends. C
Re: [Mesa-dev] Status of radeonsi NIR
No I dont. 32-bit is a problem, because Ubuntu wants to literally deinstall itself before letting me do that (some dependency nonsense). So I only build 64-bit myself. For 32-bit & backup 64-bit I use the oibaf ppa which I updated this morning. I wouldnt spend to much time right now, because mesa git feels a bit broken. If I switch to stable versions like 17.3.8 most problems disappear (but so does NIR I guess ;) Am 04.04.2018 um 14:06 schrieb Timothy Arceri: > On 04/04/18 21:51, Benedikt Schemmer wrote: >> Hi Timothy, >> >> thanks for looking into this. >> >> Dead Island still crashes for me with NIR. > > Just to make sure, are you building a 32bit version of Mesa? If not you > might be running your system Mesa. If you are definitely building Mesa > 32bit then I'll take a closer look tomorrow. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Status of radeonsi NIR
Hi Timothy, another game that is behaving strangly is Metro 2033 Redux, also crashes earlier with apitrace attached. This time TGSI and NIR crash at about the same time (press any key to continue screen) Without apitrace TGSI works fine, NIR crashes like above. ==> metronir <== 2059446 glDisableVertexAttribArray(index = 8) 2059447 glDisableVertexAttribArray(index = 9) 2059448 glDisableVertexAttribArray(index = 10) 2059449 glBindBufferBase(target = GL_UNIFORM_BUFFER, index = 1, buffer = 3) 2059450 glBufferData(target = GL_UNIFORM_BUFFER, size = 352, data = blob(352), usage = GL_DYNAMIC_DRAW) 2059451 glDrawElementsBaseVertex(mode = GL_TRIANGLES, count = 720, type = GL_UNSIGNED_SHORT, indices = 0x176a6, basevertex = 15823) 2059452 glBindBufferBase(target = GL_UNIFORM_BUFFER, index = 1, buffer = 3) 2059453 glBufferData(target = GL_UNIFORM_BUFFER, size = 352, data = blob(352), usage = GL_DYNAMIC_DRAW) 2059454 glDrawElementsBaseVertex(mode = GL_TRIANGLES, count = 711, type = GL_UNSIGNED_SHORT, indices = 0x17c46, basevertex = 16303) 2059455 glBindBufferBase(target = GL_UNIFORM_BUFFER, index = 1, buffer = 3) ==> metrotgsi <== 1744537 glEnableVertexAttribArray(index = 4) 1744538 glVertexAttribIPointer(index = 4, size = 4, type = GL_SHORT, stride = 32, pointer = 0x18) 1744539 glVertexAttribDivisor(index = 4, divisor = 0) 1744540 glDisableVertexAttribArray(index = 5) 1744541 glDisableVertexAttribArray(index = 6) 1744542 glDisableVertexAttribArray(index = 7) 1744543 glDisableVertexAttribArray(index = 8) 1744544 glDisableVertexAttribArray(index = 9) 1744545 glDisableVertexAttribArray(index = 10) 1744546 glBindBufferBase(target = GL_UNIFORM_BUFFER, index = 1, buffer = 3) ==> metrotgsi2 <== 2141178 glVertexAttribDivisor(index = 1, divisor = 0) 2141179 glEnableVertexAttribArray(index = 2) 2141180 glVertexAttribPointer(index = 2, size = 4, type = GL_UNSIGNED_BYTE, normalized = GL_TRUE, stride = 32, pointer = 0x10) 2141181 glVertexAttribDivisor(index = 2, divisor = 0) 2141182 glEnableVertexAttribArray(index = 3) 2141183 glVertexAttribPointer(index = 3, size = 4, type = GL_UNSIGNED_BYTE, normalized = GL_TRUE, stride = 32, pointer = 0x14) 2141184 glVertexAttribDivisor(index = 3, divisor = 0) 2141185 glEnableVertexAttribArray(index = 4) 2141186 glVertexAttribIPointer(index = 4, size = 4, type = GL_SHORT, stride = 32, pointer = 0x18) 2141187 glVertexAttribDivisor(index = 4, divisor = 0) // incomplete Am 04.04.2018 um 14:06 schrieb Timothy Arceri: > On 04/04/18 21:51, Benedikt Schemmer wrote: >> Hi Timothy, >> >> thanks for looking into this. >> >> Dead Island still crashes for me with NIR. > > Just to make sure, are you building a 32bit version of Mesa? If not you > might be running your system Mesa. If you are definitely building Mesa > 32bit then I'll take a closer look tomorrow. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Status of radeonsi NIR
Just for baseline: I have no problem running and creating apitraces from Bioshock Infinite or TombRaider, I think both 32-bit CAT Interstellar which is 64-bit using this command R600_DEBUG=nir apitrace trace --output=/home/nano/bio2 %command% and building either the 64 or 32 bit version of apitrace Am 04.04.2018 um 14:06 schrieb Timothy Arceri: > On 04/04/18 21:51, Benedikt Schemmer wrote: >> Hi Timothy, >> >> thanks for looking into this. >> >> Dead Island still crashes for me with NIR. > > Just to make sure, are you building a 32bit version of Mesa? If not you > might be running your system Mesa. If you are definitely building Mesa > 32bit then I'll take a closer look tomorrow. R600_DEBUG=nir apitrace trace --output=/home/nano/bio2 %command% ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Status of radeonsi NIR
RX460 4GB Am 04.04.2018 um 15:59 schrieb Timothy Arceri: > On 04/04/18 22:53, Benedikt Schemmer wrote: >> Hi Timothy, >> >> another game that is behaving strangly is Metro 2033 Redux, also crashes >> earlier with apitrace attached. This time TGSI and NIR crash at about >> the same time (press any key to continue screen) >> Without apitrace TGSI works fine, NIR crashes like above. > > Strange. This game works fine for me on NIR, what card did you say you > were testing on? > >> >> ==> metronir <== >> 2059446 glDisableVertexAttribArray(index = 8) >> 2059447 glDisableVertexAttribArray(index = 9) >> 2059448 glDisableVertexAttribArray(index = 10) >> 2059449 glBindBufferBase(target = GL_UNIFORM_BUFFER, index = 1, buffer >> = 3) >> 2059450 glBufferData(target = GL_UNIFORM_BUFFER, size = 352, data = >> blob(352), usage = GL_DYNAMIC_DRAW) >> 2059451 glDrawElementsBaseVertex(mode = GL_TRIANGLES, count = 720, type >> = GL_UNSIGNED_SHORT, indices = 0x176a6, basevertex = 15823) >> 2059452 glBindBufferBase(target = GL_UNIFORM_BUFFER, index = 1, buffer >> = 3) >> 2059453 glBufferData(target = GL_UNIFORM_BUFFER, size = 352, data = >> blob(352), usage = GL_DYNAMIC_DRAW) >> 2059454 glDrawElementsBaseVertex(mode = GL_TRIANGLES, count = 711, type >> = GL_UNSIGNED_SHORT, indices = 0x17c46, basevertex = 16303) >> 2059455 glBindBufferBase(target = GL_UNIFORM_BUFFER, index = 1, buffer >> = 3) >> >> ==> metrotgsi <== >> 1744537 glEnableVertexAttribArray(index = 4) >> 1744538 glVertexAttribIPointer(index = 4, size = 4, type = GL_SHORT, >> stride = 32, pointer = 0x18) >> 1744539 glVertexAttribDivisor(index = 4, divisor = 0) >> 1744540 glDisableVertexAttribArray(index = 5) >> 1744541 glDisableVertexAttribArray(index = 6) >> 1744542 glDisableVertexAttribArray(index = 7) >> 1744543 glDisableVertexAttribArray(index = 8) >> 1744544 glDisableVertexAttribArray(index = 9) >> 1744545 glDisableVertexAttribArray(index = 10) >> 1744546 glBindBufferBase(target = GL_UNIFORM_BUFFER, index = 1, buffer >> = 3) >> >> ==> metrotgsi2 <== >> 2141178 glVertexAttribDivisor(index = 1, divisor = 0) >> 2141179 glEnableVertexAttribArray(index = 2) >> 2141180 glVertexAttribPointer(index = 2, size = 4, type = >> GL_UNSIGNED_BYTE, normalized = GL_TRUE, stride = 32, pointer = 0x10) >> 2141181 glVertexAttribDivisor(index = 2, divisor = 0) >> 2141182 glEnableVertexAttribArray(index = 3) >> 2141183 glVertexAttribPointer(index = 3, size = 4, type = >> GL_UNSIGNED_BYTE, normalized = GL_TRUE, stride = 32, pointer = 0x14) >> 2141184 glVertexAttribDivisor(index = 3, divisor = 0) >> 2141185 glEnableVertexAttribArray(index = 4) >> 2141186 glVertexAttribIPointer(index = 4, size = 4, type = GL_SHORT, >> stride = 32, pointer = 0x18) >> 2141187 glVertexAttribDivisor(index = 4, divisor = 0) // incomplete >> >> >> Am 04.04.2018 um 14:06 schrieb Timothy Arceri: >>> On 04/04/18 21:51, Benedikt Schemmer wrote: >>>> Hi Timothy, >>>> >>>> thanks for looking into this. >>>> >>>> Dead Island still crashes for me with NIR. >>> >>> Just to make sure, are you building a 32bit version of Mesa? If not you >>> might be running your system Mesa. If you are definitely building Mesa >>> 32bit then I'll take a closer look tomorrow. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [RFC PATCH] Change the shader capture naming system from purely number based to sha_number
Depending on the application captured shaders may or may not be the same when they have the same number. In Games it often depends on the users graphics options. This is a problem when running shader-db and looking at the output. There the same shader number can mean entirely different shaders depending on user settings in an application. Also different numbers can mean the exact same shader. A good example of this is Metro 2033 Redux, where on the one hand the same shader is compiled dozens of times resulting in dozens of identical shaders with different numbers and on the other hand the same number is used for entirely different shaders when the graphics options are changed. examples from metro ff92ec1631322329e8c6125184b7681f190a2c73_11179.shader_test ff92ec1631322329e8c6125184b7681f190a2c73_12166.shader_test ff92ec1631322329e8c6125184b7681f190a2c73_13493.shader_test ff92ec1631322329e8c6125184b7681f190a2c73_1956.shader_test ff92ec1631322329e8c6125184b7681f190a2c73_3939.shader_test ff92ec1631322329e8c6125184b7681f190a2c73_5042.shader_test ff92ec1631322329e8c6125184b7681f190a2c73_9394.shader_test 639f9dd8dc9ccd6465c203d2483c6f1b1838ae19_1109.shader_test size: 2,2K f70e06b31f68b96b1d5b0b3b4966cfbe02d51c44_1109.shader_test size: 4,0K A number at the end seems to be necessary to avoid race conditions when the application tries to compile the same shader multiple times at the same time. Tested with several 64-bit Games like Dirt Rally and Metro 2033 Redux. I am not sure if this should be a series of smaller patches or if it can be just one large blob. Also I dont know where exactly the ENABLE_SHADER_CACHE code should be put one helper function is used to generate the sha. Please kindly review and push, if you find this useful. --- src/mesa/main/shaderapi.c | 272 -- 1 file changed, 145 insertions(+), 127 deletions(-) diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c index 76bad7f31e..1e35f055a6 100644 --- a/src/mesa/main/shaderapi.c +++ b/src/mesa/main/shaderapi.c @@ -65,6 +65,122 @@ #include "util/mesa-sha1.h" #include "util/crc32.h" + +/** + * Generate a SHA-1 hash value string for given source string. + */ +static void +generate_sha1(const char *source, char sha_str[64]) +{ + unsigned char sha[20]; + _mesa_sha1_compute(source, strlen(source), sha); + _mesa_sha1_format(sha_str, sha); +} + + +#ifdef ENABLE_SHADER_CACHE +/** + * Construct a full path for shader replacement functionality using + * following format: + * + * /_.glsl + */ +static char * +construct_name(const gl_shader_stage stage, const char *source, + const char *path) +{ + char sha[64]; + static const char *types[] = { + "VS", "TC", "TE", "GS", "FS", "CS", + }; + + generate_sha1(source, sha); + return ralloc_asprintf(NULL, "%s/%s_%s.glsl", path, types[stage], sha); +} + +/** + * Write given shader source to a file in MESA_SHADER_DUMP_PATH. + */ +static void +dump_shader(const gl_shader_stage stage, const char *source) +{ + static bool path_exists = true; + char *dump_path; + FILE *f; + + if (!path_exists) + return; + + dump_path = getenv("MESA_SHADER_DUMP_PATH"); + if (!dump_path) { + path_exists = false; + return; + } + + char *name = construct_name(stage, source, dump_path); + + f = fopen(name, "w"); + if (f) { + fputs(source, f); + fclose(f); + } else { + GET_CURRENT_CONTEXT(ctx); + _mesa_warning(ctx, "could not open %s for dumping shader (%s)", name, +strerror(errno)); + } + ralloc_free(name); +} + +/** + * Read shader source code from a file. + * Useful for debugging to override an app's shader. + */ +static GLcharARB * +read_shader(const gl_shader_stage stage, const char *source) +{ + char *read_path; + static bool path_exists = true; + int len, shader_size = 0; + GLcharARB *buffer; + FILE *f; + + if (!path_exists) + return NULL; + + read_path = getenv("MESA_SHADER_READ_PATH"); + if (!read_path) { + path_exists = false; + return NULL; + } + + char *name = construct_name(stage, source, read_path); + f = fopen(name, "r"); + ralloc_free(name); + if (!f) + return NULL; + + /* allocate enough room for the entire shader */ + fseek(f, 0, SEEK_END); + shader_size = ftell(f); + rewind(f); + assert(shader_size); + + /* add one for terminating zero */ + shader_size++; + + buffer = malloc(shader_size); + assert(buffer); + + len = fread(buffer, 1, shader_size, f); + buffer[len] = 0; + + fclose(f); + + return buffer; +} + +#endif /* ENABLE_SHADER_CACHE */ + /** * Return mask of GLSL_x flags by examining the MESA_GLSL env var. */ @@ -1228,29 +1344,43 @@ link_program(struct gl_context *ctx, struct gl_shader_program *shProg, /* Capture .shader_test files. */ const char *capture_path = _mesa_get_shader_capture_path(); if (shProg->Name != 0 && shProg->Name != ~0 && capture_path != N
Re: [Mesa-dev] [PATCH 3/3] radeonsi: don't emit partial flushes, for internal CS flushes only
Hi Marek, your patches dont apply cleanly (src/gallium/drivers/radeonsi/si_gfx_cs.c in particular) Rebase? Other than that: Dirt Rally (avg,min,max) before 54/35/73 after 54/41/73 My system is GPU bound. RX560 OC with 4GB (aka reflashed RX460 ;) For the series: Tested-by: Benedikt Schemmer ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 3/3] radeonsi: don't emit partial flushes for internal CS flushes only
Patchwork didnt recognize the message header resend Hi Marek, your patches dont apply cleanly (src/gallium/drivers/radeonsi/si_gfx_cs.c in particular) Rebase? Other than that: Dirt Rally (avg,min,max) before 54/35/73 after 54/41/73 My system is GPU bound. RX560 OC with 4GB (aka reflashed RX460 ;) For the series: Tested-by: Benedikt Schemmer ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [1/3] radeonsi: implement mechanism for IBs without partial flushes at the end (v6)
Hi Marek, I just tried different combinations: this and with meta ctx->flags |= SI_CONTEXT_INV_ICACHE | SI_CONTEXT_INV_SMEM_L1 | SI_CONTEXT_INV_VMEM_L1 | SI_CONTEXT_INV_GLOBAL_L2 | SI_CONTEXT_START_PIPELINE_STATS | SI_CONTEXT_FLUSH_AND_INV_DB | //SI_CONTEXT_FLUSH_AND_INV_DB_META | SI_CONTEXT_FLUSH_AND_INV_CB; as long as you dont flush db_meta there doesnt seem to be a difference performancewise and I dont see any visible difference between them at all cant test this with 32bit applications though Dirt Rally Benchmark 54/34/71 with meta flush 54/40/74 with db and cb flushes 54/41/73 original again: gpu limited, all starting with empty shader cache Cheers, Benedikt ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] Status of radeonsi NIR
Hi all, I did some more test runs with NIR and shader-db. This time they also include piglit shaders. There was only one crash that was explicitly related to NIR but also quite a number of regressions that might be worth looking at. I've put the results up on https://github.com/bendat78/mymesa/tree/mymesa2/testresults look at the most current run* directory. (a) => CRASHED <= while processing these shaders: shaders/piglit/da1588b9fd1ec5009a02ff9ee244f11923d32855_3.shader_test [require] GLSL >= 1.10 [vertex shader] #version 110 #if __VERSION__ >= 130 in vec4 piglit_vertex; #else attribute vec4 piglit_vertex; #endif void main() { gl_Position = piglit_vertex; } [fragment shader] #version 110 uniform float id; void main() { ivec4 test = ivec4(0, 1, 0, 0); vec4 test2 = vec4(0.0, 1.0, 0.0, 0.0); int index = int(id); float col = test2[test[index]]; gl_FragColor = vec4(col); } (b) Might be worth looking at this one as it is related to Alien Isolation and has significant regressions VGPRS: 4 -> 64 (1500.00 %) (in shaders/piglit/51e07b0ac242a7c22d5d519ea8b3e4f7ac8e51f9_2.shader_test) Spilled VGPRs: 0 -> 46 (0.00 %) (in shaders/piglit/51e07b0ac242a7c22d5d519ea8b3e4f7ac8e51f9_2.shader_test) Scratch size: 0 -> 36 (0.00 %) dwords per thread (in shaders/piglit/51e07b0ac242a7c22d5d519ea8b3e4f7ac8e51f9_2.shader_test) Code Size: 108 -> 7148 (6518.52 %) bytes (in shaders/piglit/51e07b0ac242a7c22d5d519ea8b3e4f7ac8e51f9_2.shader_test) # Tests linking a shader with a large workgroup and local data set. # # This is related to: # Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93840 # # There are two important elements to this test: # # 1. local workgroup size of 1024 based on the spec requirements. # # 2. Use a 'large' size of live local data. In the test this is #controlled with the SIZE macro. This is hardware specific, and #the value of 64 was chosen because it was observed to fail on the #Mesa i965 driver. [require] GL >= 3.3 GLSL >= 3.30 GL_ARB_compute_shader [compute shader] #version 330 #extension GL_ARB_compute_shader: enable layout(local_size_x = 1024) in; #define SIZE 64 shared int sa[SIZE]; void main() { int a[SIZE] = sa; a[int(gl_LocalInvocationIndex) % SIZE] += 1; if (a[int(gl_LocalInvocationIndex) % SIZE] == 0) sa[0] = 1; } [test] link success (c) there are a number of shaders that show this kind of behavior (dont know if they do for the same reason): Max Waves: 8 -> 1 (-87.50 %) (in shaders/piglit/31a96d9a82c154da664023713921aab10967ef83_1533.shader_test) from /** @file glsl-max-varyings.c * * Tests whether each varying can be used at all numbers of varyings up to * GL_MAX_VARYING_FLOATS / 4. */ Have fun, Benedikt ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev