Re: [Mesa-dev] [radeonsi] Blender/vsraytrace/fsraytrace/gsraytrace - GPUShader: compile error
I'm guessing your using GCC 8.2.1 to compile Mesa? There was a compiler bug: https://bugzilla.redhat.com/show_bug.cgi?id=1645400 On 12/11/18 2:11 pm, Dieter Nützel wrote: Hello, I get brocken shaders with Blender and the above demos didn't start any longer. NOT NIR related. Have to start bisect. OpenGL renderer string: Radeon RX 580 Series (POLARIS10, DRM 3.27.0, 4.19.0-rc1-1.g7262353-default+, LLVM 8.0.0) OpenGL core profile version string: 4.5 (Core Profile) Mesa 19.0.0-devel (git-590fcb50e7) OpenGL core profile shading language version string: 4.50 mesa-demos/glsl> blender Read prefs: /home/dieter/.config/blender/2.79/config/userpref.blend libGL: Can't open configuration file /usr/local/etc/drirc: No such file or directory. libGL: Can't open configuration file /usr/local/etc/drirc: No such file or directory. libGL: pci id for fd 16: 1002:67df, driver radeonsi libGL: OpenDriver: trying /usr/local/lib64/dri/tls/radeonsi_dri.so libGL: OpenDriver: trying /usr/local/lib64/dri/radeonsi_dri.so libGL: Can't open configuration file /usr/local/etc/drirc: No such file or directory. libGL: Can't open configuration file /usr/local/etc/drirc: No such file or directory. libGL: Can't open configuration file /usr/local/etc/drirc: No such file or directory. usr/share/libdrm/amdgpu.ids version: 1.0.0 libGL: Using DRI3 for screen 0 Read blend: /data/Blender/BMW3v2.blend 2.66 versioning fix: replacing black sky with premultiplied alpha for scene Scene Read blend: /data/Blender/BMW27GE.blend GPUShader: compile error: 0:1177(22): error: invalid input layout qualifier used [-] Read blend: /data/Blender/BMW27.blend skipping driver '100*power', automatic scripts are disabled skipping driver '-100*power', automatic scripts are disabled skipping driver '-90*brake', automatic scripts are disabled skipping driver '90*brake', automatic scripts are disabled libGL: Can't open configuration file /usr/local/etc/drirc: No such file or directory. libGL: Can't open configuration file /usr/local/etc/drirc: No such file or directory. libGL: Can't open configuration file /usr/local/etc/drirc: No such file or directory. libGL: Can't open configuration file /usr/local/etc/drirc: No such file or directory. Read blend: /data/Blender/sanisidro.blend Read blend: /data/Blender/bh.blend Info: Read library: '/projeto.blend', '//../../projeto.blend', parent '' Warning: Cannot find lib '/projeto.blend' Warning: LIB: Group: 'Projeto' missing from '/projeto.blend', parent '' Info: Read library: '/projeto.blend', '//../../projeto.blend', parent '' Warning: Unable to open '/projeto.blend': No such file or directory Warning: Cannot find lib '/projeto.blend' Warning: LIB: Group: 'Projeto' missing from '/projeto.blend', parent '' GPUShader: compile error: 0:1177(22): error: invalid input layout qualifier used [-] mesa-demos/glsl> ./fsraytrace libGL: Can't open configuration file /usr/local/etc/drirc: No such file or directory. libGL: Can't open configuration file /usr/local/etc/drirc: No such file or directory. libGL: pci id for fd 4: 1002:67df, driver radeonsi libGL: OpenDriver: trying /usr/local/lib64/dri/tls/radeonsi_dri.so libGL: OpenDriver: trying /usr/local/lib64/dri/radeonsi_dri.so libGL: Can't open configuration file /usr/local/etc/drirc: No such file or directory. libGL: Can't open configuration file /usr/local/etc/drirc: No such file or directory. libGL: Can't open configuration file /usr/local/etc/drirc: No such file or directory. usr/share/libdrm/amdgpu.ids version: 1.0.0 libGL: Using DRI3 for screen 0 Error: problem compiling shader: 0:48(2): error: invalid input layout qualifier used Same with 'vsraytrace' and 'gsraytrace'. Thanks, Dieter ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 108636] test_optpass has use after free bug, failing with memory testing tools like address sanitizer
https://bugs.freedesktop.org/show_bug.cgi?id=108636 Tapani Pälli changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #4 from Tapani Pälli --- --- 8< --- commit 8dc2085baf954e7d52159797fe3051a554df3f6d Author: Hanno Böck Date: Wed Nov 7 09:01:42 2018 +0100 glsl/test: Fix use after free in test_optpass. The variable state is free'd and afterwards state->error is used as the return value, resulting in a use after free bug detected by memory safety tools like address sanitizer. Signed-off-by: Hanno Böck Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108636 Reviewed-by: Eric Engestrom Reviewed-by: Tapani Pälli -- You are receiving this mail because: You are the assignee for the bug.___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] nir: don't pack varyings ints with floats unless flat
On Sun, Nov 11, 2018 at 9:37 PM Timothy Arceri wrote: > On 12/11/18 2:11 pm, Jason Ekstrand wrote: > > On November 11, 2018 20:30:29 Timothy Arceri > wrote: > > > >> Fixes: 1c9c42d16b4c ("nir: add varying component packing helpers") > >> --- > >> src/compiler/nir/nir_linking_helpers.c | 9 ++--- > >> 1 file changed, 6 insertions(+), 3 deletions(-) > >> > >> > >> diff --git a/src/compiler/nir/nir_linking_helpers.c > >> b/src/compiler/nir/nir_linking_helpers.c > >> index b07bb40894e..125a8d55b3e 100644 > >> --- a/src/compiler/nir/nir_linking_helpers.c > >> +++ b/src/compiler/nir/nir_linking_helpers.c > >> @@ -196,10 +196,13 @@ nir_remove_unused_varyings(nir_shader *producer, > >> nir_shader *consumer) > >> } > >> > >> static uint8_t > >> -get_interp_type(nir_variable *var, bool default_to_smooth_interp) > >> +get_interp_type(nir_variable *var, const struct glsl_type *type, > >> +bool default_to_smooth_interp) > >> { > >>if (var->data.interpolation != INTERP_MODE_NONE) > >> return var->data.interpolation; > >> + else if (glsl_type_is_integer(type)) > >> + return INTERP_MODE_FLAT; > > > > Are we guaranteed that there will be no explititly interpolated integers? > > The GLSL 1.30 spec used to require ints to be marked as flat but the > 4.60 spec doesn't say much at all. I can make "if > (glsl_type_is_integer(type))" the first check, this would match what we > do in GLSL IR. Would this cover your concerns? > Yes, it would. With that, both are Reviewed-by: Jason Ekstrand > > > >>else if (default_to_smooth_interp) > >> return INTERP_MODE_SMOOTH; > >>else > >> @@ -253,7 +256,7 @@ get_slot_component_masks_and_interp_types(struct > >> exec_list *var_list, > >> unsigned comps_slot2 = 0; > >> for (unsigned i = 0; i < slots; i++) { > >> interp_type[location + i] = > >> - get_interp_type(var, default_to_smooth_interp); > >> + get_interp_type(var, type, default_to_smooth_interp); > >> interp_loc[location + i] = get_interp_loc(var); > >> > >> if (dual_slot) { > >> @@ -425,7 +428,7 @@ compact_components(nir_shader *producer, > >> nir_shader *consumer, uint8_t *comps, > >> continue; > >> > >> bool found_new_offset = false; > >> - uint8_t interp = get_interp_type(var, > >> default_to_smooth_interp); > >> + uint8_t interp = get_interp_type(var, type, > >> default_to_smooth_interp); > >> for (; cursor[interp] < 32; cursor[interp]++) { > >> uint8_t cursor_used_comps = comps[cursor[interp]]; > >> > >> -- > >> 2.19.1 > >> > >> > >> ___ > >> mesa-dev mailing list > >> mesa-dev@lists.freedesktop.org > >> https://lists.freedesktop.org/mailman/listinfo/mesa-dev > > > > > > > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] nir: don't pack varyings ints with floats unless flat
On 12/11/18 2:11 pm, Jason Ekstrand wrote: On November 11, 2018 20:30:29 Timothy Arceri wrote: Fixes: 1c9c42d16b4c ("nir: add varying component packing helpers") --- src/compiler/nir/nir_linking_helpers.c | 9 ++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/src/compiler/nir/nir_linking_helpers.c b/src/compiler/nir/nir_linking_helpers.c index b07bb40894e..125a8d55b3e 100644 --- a/src/compiler/nir/nir_linking_helpers.c +++ b/src/compiler/nir/nir_linking_helpers.c @@ -196,10 +196,13 @@ nir_remove_unused_varyings(nir_shader *producer, nir_shader *consumer) } static uint8_t -get_interp_type(nir_variable *var, bool default_to_smooth_interp) +get_interp_type(nir_variable *var, const struct glsl_type *type, + bool default_to_smooth_interp) { if (var->data.interpolation != INTERP_MODE_NONE) return var->data.interpolation; + else if (glsl_type_is_integer(type)) + return INTERP_MODE_FLAT; Are we guaranteed that there will be no explititly interpolated integers? The GLSL 1.30 spec used to require ints to be marked as flat but the 4.60 spec doesn't say much at all. I can make "if (glsl_type_is_integer(type))" the first check, this would match what we do in GLSL IR. Would this cover your concerns? else if (default_to_smooth_interp) return INTERP_MODE_SMOOTH; else @@ -253,7 +256,7 @@ get_slot_component_masks_and_interp_types(struct exec_list *var_list, unsigned comps_slot2 = 0; for (unsigned i = 0; i < slots; i++) { interp_type[location + i] = - get_interp_type(var, default_to_smooth_interp); + get_interp_type(var, type, default_to_smooth_interp); interp_loc[location + i] = get_interp_loc(var); if (dual_slot) { @@ -425,7 +428,7 @@ compact_components(nir_shader *producer, nir_shader *consumer, uint8_t *comps, continue; bool found_new_offset = false; - uint8_t interp = get_interp_type(var, default_to_smooth_interp); + uint8_t interp = get_interp_type(var, type, default_to_smooth_interp); for (; cursor[interp] < 32; cursor[interp]++) { uint8_t cursor_used_comps = comps[cursor[interp]]; -- 2.19.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [radeonsi] Blender/vsraytrace/fsraytrace/gsraytrace - GPUShader: compile error
Hello, I get brocken shaders with Blender and the above demos didn't start any longer. NOT NIR related. Have to start bisect. OpenGL renderer string: Radeon RX 580 Series (POLARIS10, DRM 3.27.0, 4.19.0-rc1-1.g7262353-default+, LLVM 8.0.0) OpenGL core profile version string: 4.5 (Core Profile) Mesa 19.0.0-devel (git-590fcb50e7) OpenGL core profile shading language version string: 4.50 mesa-demos/glsl> blender Read prefs: /home/dieter/.config/blender/2.79/config/userpref.blend libGL: Can't open configuration file /usr/local/etc/drirc: No such file or directory. libGL: Can't open configuration file /usr/local/etc/drirc: No such file or directory. libGL: pci id for fd 16: 1002:67df, driver radeonsi libGL: OpenDriver: trying /usr/local/lib64/dri/tls/radeonsi_dri.so libGL: OpenDriver: trying /usr/local/lib64/dri/radeonsi_dri.so libGL: Can't open configuration file /usr/local/etc/drirc: No such file or directory. libGL: Can't open configuration file /usr/local/etc/drirc: No such file or directory. libGL: Can't open configuration file /usr/local/etc/drirc: No such file or directory. usr/share/libdrm/amdgpu.ids version: 1.0.0 libGL: Using DRI3 for screen 0 Read blend: /data/Blender/BMW3v2.blend 2.66 versioning fix: replacing black sky with premultiplied alpha for scene Scene Read blend: /data/Blender/BMW27GE.blend GPUShader: compile error: 0:1177(22): error: invalid input layout qualifier used [-] Read blend: /data/Blender/BMW27.blend skipping driver '100*power', automatic scripts are disabled skipping driver '-100*power', automatic scripts are disabled skipping driver '-90*brake', automatic scripts are disabled skipping driver '90*brake', automatic scripts are disabled libGL: Can't open configuration file /usr/local/etc/drirc: No such file or directory. libGL: Can't open configuration file /usr/local/etc/drirc: No such file or directory. libGL: Can't open configuration file /usr/local/etc/drirc: No such file or directory. libGL: Can't open configuration file /usr/local/etc/drirc: No such file or directory. Read blend: /data/Blender/sanisidro.blend Read blend: /data/Blender/bh.blend Info: Read library: '/projeto.blend', '//../../projeto.blend', parent '' Warning: Cannot find lib '/projeto.blend' Warning: LIB: Group: 'Projeto' missing from '/projeto.blend', parent '' Info: Read library: '/projeto.blend', '//../../projeto.blend', parent '' Warning: Unable to open '/projeto.blend': No such file or directory Warning: Cannot find lib '/projeto.blend' Warning: LIB: Group: 'Projeto' missing from '/projeto.blend', parent '' GPUShader: compile error: 0:1177(22): error: invalid input layout qualifier used [-] mesa-demos/glsl> ./fsraytrace libGL: Can't open configuration file /usr/local/etc/drirc: No such file or directory. libGL: Can't open configuration file /usr/local/etc/drirc: No such file or directory. libGL: pci id for fd 4: 1002:67df, driver radeonsi libGL: OpenDriver: trying /usr/local/lib64/dri/tls/radeonsi_dri.so libGL: OpenDriver: trying /usr/local/lib64/dri/radeonsi_dri.so libGL: Can't open configuration file /usr/local/etc/drirc: No such file or directory. libGL: Can't open configuration file /usr/local/etc/drirc: No such file or directory. libGL: Can't open configuration file /usr/local/etc/drirc: No such file or directory. usr/share/libdrm/amdgpu.ids version: 1.0.0 libGL: Using DRI3 for screen 0 Error: problem compiling shader: 0:48(2): error: invalid input layout qualifier used Same with 'vsraytrace' and 'gsraytrace'. Thanks, Dieter ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] nir: don't pack varyings ints with floats unless flat
On November 11, 2018 20:30:29 Timothy Arceri wrote: Fixes: 1c9c42d16b4c ("nir: add varying component packing helpers") --- src/compiler/nir/nir_linking_helpers.c | 9 ++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/src/compiler/nir/nir_linking_helpers.c b/src/compiler/nir/nir_linking_helpers.c index b07bb40894e..125a8d55b3e 100644 --- a/src/compiler/nir/nir_linking_helpers.c +++ b/src/compiler/nir/nir_linking_helpers.c @@ -196,10 +196,13 @@ nir_remove_unused_varyings(nir_shader *producer, nir_shader *consumer) } static uint8_t -get_interp_type(nir_variable *var, bool default_to_smooth_interp) +get_interp_type(nir_variable *var, const struct glsl_type *type, +bool default_to_smooth_interp) { if (var->data.interpolation != INTERP_MODE_NONE) return var->data.interpolation; + else if (glsl_type_is_integer(type)) + return INTERP_MODE_FLAT; Are we guaranteed that there will be no explititly interpolated integers? else if (default_to_smooth_interp) return INTERP_MODE_SMOOTH; else @@ -253,7 +256,7 @@ get_slot_component_masks_and_interp_types(struct exec_list *var_list, unsigned comps_slot2 = 0; for (unsigned i = 0; i < slots; i++) { interp_type[location + i] = - get_interp_type(var, default_to_smooth_interp); + get_interp_type(var, type, default_to_smooth_interp); interp_loc[location + i] = get_interp_loc(var); if (dual_slot) { @@ -425,7 +428,7 @@ compact_components(nir_shader *producer, nir_shader *consumer, uint8_t *comps, continue; bool found_new_offset = false; - uint8_t interp = get_interp_type(var, default_to_smooth_interp); + uint8_t interp = get_interp_type(var, type, default_to_smooth_interp); for (; cursor[interp] < 32; cursor[interp]++) { uint8_t cursor_used_comps = comps[cursor[interp]]; -- 2.19.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2] nir: Handle variables dependent on the local work group size.
On Sun, Nov 11, 2018 at 8:46 PM Karol Herbst wrote: > an) > On Mon, Nov 12, 2018 at 12:37 AM Jason Ekstrand > wrote: > > > > On November 11, 2018 16:36:16 Karol Herbst wrote: > > > > > On Sun, Nov 11, 2018 at 10:48 PM Jason Ekstrand > wrote: > > >> > > >> On Sun, Nov 11, 2018 at 3:35 PM Plamena Manolova > > >> wrote: > > >>> > > >>> > > >>> Lowering shader variables which depend on the local work group > > >>> size being available in nir_lower_system_values is only possible > > >>> if the local work group size isn't variable, otherwise this should > > >>> be handled by the native driver (if it supports > > >>> ARB_compute_variable_group_size). > > >>> > > >>> > > >>> Signed-off-by: Plamena Manolova > > >>> --- > > >>> src/compiler/nir/nir_lower_system_values.c | 22 > ++ > > >>> 1 file changed, 22 insertions(+) > > >>> > > >>> > > >>> diff --git a/src/compiler/nir/nir_lower_system_values.c > > >>> b/src/compiler/nir/nir_lower_system_values.c > > >>> index bde7eb1180..6fdf9f9c51 100644 > > >>> --- a/src/compiler/nir/nir_lower_system_values.c > > >>> +++ b/src/compiler/nir/nir_lower_system_values.c > > >>> @@ -77,6 +77,15 @@ convert_block(nir_block *block, nir_builder *b) > > >>>*"The value of gl_GlobalInvocationID is equal to > > >>>*gl_WorkGroupID * gl_WorkGroupSize + gl_LocalInvocationID" > > >>>*/ > > >>> + > > >>> + /* > > >>> + * If the local work group size is variable we can't lower > the global > > >>> + * invocation id here. > > >> > > >> > > >> Why not? > > >> > > >>> > > >>> > > >>> + */ > > >>> + if (b->shader->info.cs.local_size_variable) { > > >> > > >> > > >> I didn't realize we'd added a bit for this. At one point in time, > Karol > > >> had patches which had it keyed off of the size being zero. Having a > > >> separate bit is probably fine, it just surpised me. > > > > > > > > > yeah.. I guess I choose that, because I had nothing better. But I > > > guess having the size being 0 is good enough as long as we sure it is > > > 0 in relevant cases. > > > > It's fine. I just thought we'd chosen a different convention bit there's > > nothing wrong with it. I was just surprised. That's all. > > > > > > > >>> > > >>> > > >>> +break; > > >>> + } > > >>> + > > >>> nir_ssa_def *group_size = build_local_group_size(b); > > >>> nir_ssa_def *group_id = nir_load_work_group_id(b); > > >>> nir_ssa_def *local_id = nir_load_local_invocation_id(b); > > >>> @@ -115,6 +124,11 @@ convert_block(nir_block *block, nir_builder *b) > > >>> } > > >>> > > >>> > > >>> case SYSTEM_VALUE_LOCAL_GROUP_SIZE: { > > >>> + /* If the local work group size is variable we can't lower > it here */ > > >>> + if (b->shader->info.cs.local_size_variable) { > > >>> +break; > > >>> + } > > >>> + > > >>> sysval = build_local_group_size(b); > > >>> break; > > >>> } > > >>> @@ -189,6 +203,14 @@ convert_block(nir_block *block, nir_builder *b) > > >>> break; > > >>> > > >>> > > >>> case SYSTEM_VALUE_GLOBAL_GROUP_SIZE: { > > >>> + /* > > >>> + * If the local work group size is variable we can't lower > the global > > >>> + * group size here. > > >>> + */ > > >>> + if (b->shader->info.cs.local_size_variable) { > > >>> +break; > > >>> + } > > >> > > >> > > >> Why can't we lower the global size? It seems like we would want the > below > > >> multiplication regardless of whether the local size is constant. > > > > > > well I am not sure about ARB_compute_variable_group_size, but at least > > > in CL you know nothing about it at compile time as you specify > > > everything when you enqueue the kernel. Could be that the number of > > > work_groups is fixed with ARB_compute_variable_group_size though > > > > Why can't you multiply two non-constant things together? Maybe the driver > > can do something now efficient but it's not clear to me that this isn't > > what we want. > > > > oh, you meant it that way. Mhh, I actually wrote the patch adding > build_local_group_size to handle that inside that function (lower to > constants if known, intrinsics otherwise) as we need the local_size > for three different system values. I have a patch for that for CL, but > the patch itself is a bit hacky right now (as in, it doesn't handle it > inside build_local_group_size). > > And I think that patch misses handling for > SYSTEM_VALUE_GLOBAL_GROUP_SIZE anyhow. > > For CL we even want to add a global_invocation_id_offset system value > and add it when handling SYSTEM_VALUE_GLOBAL_INVOCATION_ID, but that > we can ignore for now. > > Anyhow, I think it would make sense to move the check into > build_local_group_sized and either return a constant value or an > intrinsic loading the value from the driver. I just remember vaguely > that I think somebody brought that up months ago and there was a good > reason not to do that for every driver? > That's
Re: [Mesa-dev] [PATCH v2] nir: Handle variables dependent on the local work group size.
an) On Mon, Nov 12, 2018 at 12:37 AM Jason Ekstrand wrote: > > On November 11, 2018 16:36:16 Karol Herbst wrote: > > > On Sun, Nov 11, 2018 at 10:48 PM Jason Ekstrand > > wrote: > >> > >> On Sun, Nov 11, 2018 at 3:35 PM Plamena Manolova > >> wrote: > >>> > >>> > >>> Lowering shader variables which depend on the local work group > >>> size being available in nir_lower_system_values is only possible > >>> if the local work group size isn't variable, otherwise this should > >>> be handled by the native driver (if it supports > >>> ARB_compute_variable_group_size). > >>> > >>> > >>> Signed-off-by: Plamena Manolova > >>> --- > >>> src/compiler/nir/nir_lower_system_values.c | 22 ++ > >>> 1 file changed, 22 insertions(+) > >>> > >>> > >>> diff --git a/src/compiler/nir/nir_lower_system_values.c > >>> b/src/compiler/nir/nir_lower_system_values.c > >>> index bde7eb1180..6fdf9f9c51 100644 > >>> --- a/src/compiler/nir/nir_lower_system_values.c > >>> +++ b/src/compiler/nir/nir_lower_system_values.c > >>> @@ -77,6 +77,15 @@ convert_block(nir_block *block, nir_builder *b) > >>>*"The value of gl_GlobalInvocationID is equal to > >>>*gl_WorkGroupID * gl_WorkGroupSize + gl_LocalInvocationID" > >>>*/ > >>> + > >>> + /* > >>> + * If the local work group size is variable we can't lower the > >>> global > >>> + * invocation id here. > >> > >> > >> Why not? > >> > >>> > >>> > >>> + */ > >>> + if (b->shader->info.cs.local_size_variable) { > >> > >> > >> I didn't realize we'd added a bit for this. At one point in time, Karol > >> had patches which had it keyed off of the size being zero. Having a > >> separate bit is probably fine, it just surpised me. > > > > > > yeah.. I guess I choose that, because I had nothing better. But I > > guess having the size being 0 is good enough as long as we sure it is > > 0 in relevant cases. > > It's fine. I just thought we'd chosen a different convention bit there's > nothing wrong with it. I was just surprised. That's all. > > > > >>> > >>> > >>> +break; > >>> + } > >>> + > >>> nir_ssa_def *group_size = build_local_group_size(b); > >>> nir_ssa_def *group_id = nir_load_work_group_id(b); > >>> nir_ssa_def *local_id = nir_load_local_invocation_id(b); > >>> @@ -115,6 +124,11 @@ convert_block(nir_block *block, nir_builder *b) > >>> } > >>> > >>> > >>> case SYSTEM_VALUE_LOCAL_GROUP_SIZE: { > >>> + /* If the local work group size is variable we can't lower it > >>> here */ > >>> + if (b->shader->info.cs.local_size_variable) { > >>> +break; > >>> + } > >>> + > >>> sysval = build_local_group_size(b); > >>> break; > >>> } > >>> @@ -189,6 +203,14 @@ convert_block(nir_block *block, nir_builder *b) > >>> break; > >>> > >>> > >>> case SYSTEM_VALUE_GLOBAL_GROUP_SIZE: { > >>> + /* > >>> + * If the local work group size is variable we can't lower the > >>> global > >>> + * group size here. > >>> + */ > >>> + if (b->shader->info.cs.local_size_variable) { > >>> +break; > >>> + } > >> > >> > >> Why can't we lower the global size? It seems like we would want the below > >> multiplication regardless of whether the local size is constant. > > > > well I am not sure about ARB_compute_variable_group_size, but at least > > in CL you know nothing about it at compile time as you specify > > everything when you enqueue the kernel. Could be that the number of > > work_groups is fixed with ARB_compute_variable_group_size though > > Why can't you multiply two non-constant things together? Maybe the driver > can do something now efficient but it's not clear to me that this isn't > what we want. > oh, you meant it that way. Mhh, I actually wrote the patch adding build_local_group_size to handle that inside that function (lower to constants if known, intrinsics otherwise) as we need the local_size for three different system values. I have a patch for that for CL, but the patch itself is a bit hacky right now (as in, it doesn't handle it inside build_local_group_size). And I think that patch misses handling for SYSTEM_VALUE_GLOBAL_GROUP_SIZE anyhow. For CL we even want to add a global_invocation_id_offset system value and add it when handling SYSTEM_VALUE_GLOBAL_INVOCATION_ID, but that we can ignore for now. Anyhow, I think it would make sense to move the check into build_local_group_sized and either return a constant value or an intrinsic loading the value from the driver. I just remember vaguely that I think somebody brought that up months ago and there was a good reason not to do that for every driver? > > > >>> > >>> > >>> + > >>> nir_ssa_def *group_size = build_local_group_size(b); > >>> nir_ssa_def *num_work_groups = nir_load_num_work_groups(b); > >>> sysval = nir_imul(b, group_size, num_work_groups); > >>> -- > >>> 2.11.0 > >>> > >>> > >>> ___ >
[Mesa-dev] [PATCH 2/2] nir: don't pack varyings ints with floats unless flat
Fixes: 1c9c42d16b4c ("nir: add varying component packing helpers") --- src/compiler/nir/nir_linking_helpers.c | 9 ++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/src/compiler/nir/nir_linking_helpers.c b/src/compiler/nir/nir_linking_helpers.c index b07bb40894e..125a8d55b3e 100644 --- a/src/compiler/nir/nir_linking_helpers.c +++ b/src/compiler/nir/nir_linking_helpers.c @@ -196,10 +196,13 @@ nir_remove_unused_varyings(nir_shader *producer, nir_shader *consumer) } static uint8_t -get_interp_type(nir_variable *var, bool default_to_smooth_interp) +get_interp_type(nir_variable *var, const struct glsl_type *type, +bool default_to_smooth_interp) { if (var->data.interpolation != INTERP_MODE_NONE) return var->data.interpolation; + else if (glsl_type_is_integer(type)) + return INTERP_MODE_FLAT; else if (default_to_smooth_interp) return INTERP_MODE_SMOOTH; else @@ -253,7 +256,7 @@ get_slot_component_masks_and_interp_types(struct exec_list *var_list, unsigned comps_slot2 = 0; for (unsigned i = 0; i < slots; i++) { interp_type[location + i] = - get_interp_type(var, default_to_smooth_interp); + get_interp_type(var, type, default_to_smooth_interp); interp_loc[location + i] = get_interp_loc(var); if (dual_slot) { @@ -425,7 +428,7 @@ compact_components(nir_shader *producer, nir_shader *consumer, uint8_t *comps, continue; bool found_new_offset = false; - uint8_t interp = get_interp_type(var, default_to_smooth_interp); + uint8_t interp = get_interp_type(var, type, default_to_smooth_interp); for (; cursor[interp] < 32; cursor[interp]++) { uint8_t cursor_used_comps = comps[cursor[interp]]; -- 2.19.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/2] nir: add glsl_type_is_integer() helper
--- src/compiler/nir_types.cpp | 5 + src/compiler/nir_types.h | 1 + 2 files changed, 6 insertions(+) diff --git a/src/compiler/nir_types.cpp b/src/compiler/nir_types.cpp index d24f0941519..3cd61f66056 100644 --- a/src/compiler/nir_types.cpp +++ b/src/compiler/nir_types.cpp @@ -301,6 +301,11 @@ glsl_type_is_boolean(const struct glsl_type *type) { return type->is_boolean(); } +bool +glsl_type_is_integer(const struct glsl_type *type) +{ + return type->is_integer(); +} const glsl_type * glsl_void_type(void) diff --git a/src/compiler/nir_types.h b/src/compiler/nir_types.h index 77454fa9fab..70d593b96ab 100644 --- a/src/compiler/nir_types.h +++ b/src/compiler/nir_types.h @@ -142,6 +142,7 @@ bool glsl_type_is_image(const struct glsl_type *type); bool glsl_type_is_dual_slot(const struct glsl_type *type); bool glsl_type_is_numeric(const struct glsl_type *type); bool glsl_type_is_boolean(const struct glsl_type *type); +bool glsl_type_is_integer(const struct glsl_type *type); bool glsl_sampler_type_is_shadow(const struct glsl_type *type); bool glsl_sampler_type_is_array(const struct glsl_type *type); bool glsl_contains_atomic(const struct glsl_type *type); -- 2.19.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2] nir: Handle variables dependent on the local work group size.
On November 11, 2018 16:36:16 Karol Herbst wrote: On Sun, Nov 11, 2018 at 10:48 PM Jason Ekstrand wrote: On Sun, Nov 11, 2018 at 3:35 PM Plamena Manolova wrote: Lowering shader variables which depend on the local work group size being available in nir_lower_system_values is only possible if the local work group size isn't variable, otherwise this should be handled by the native driver (if it supports ARB_compute_variable_group_size). Signed-off-by: Plamena Manolova --- src/compiler/nir/nir_lower_system_values.c | 22 ++ 1 file changed, 22 insertions(+) diff --git a/src/compiler/nir/nir_lower_system_values.c b/src/compiler/nir/nir_lower_system_values.c index bde7eb1180..6fdf9f9c51 100644 --- a/src/compiler/nir/nir_lower_system_values.c +++ b/src/compiler/nir/nir_lower_system_values.c @@ -77,6 +77,15 @@ convert_block(nir_block *block, nir_builder *b) *"The value of gl_GlobalInvocationID is equal to *gl_WorkGroupID * gl_WorkGroupSize + gl_LocalInvocationID" */ + + /* + * If the local work group size is variable we can't lower the global + * invocation id here. Why not? + */ + if (b->shader->info.cs.local_size_variable) { I didn't realize we'd added a bit for this. At one point in time, Karol had patches which had it keyed off of the size being zero. Having a separate bit is probably fine, it just surpised me. yeah.. I guess I choose that, because I had nothing better. But I guess having the size being 0 is good enough as long as we sure it is 0 in relevant cases. It's fine. I just thought we'd chosen a different convention bit there's nothing wrong with it. I was just surprised. That's all. +break; + } + nir_ssa_def *group_size = build_local_group_size(b); nir_ssa_def *group_id = nir_load_work_group_id(b); nir_ssa_def *local_id = nir_load_local_invocation_id(b); @@ -115,6 +124,11 @@ convert_block(nir_block *block, nir_builder *b) } case SYSTEM_VALUE_LOCAL_GROUP_SIZE: { + /* If the local work group size is variable we can't lower it here */ + if (b->shader->info.cs.local_size_variable) { +break; + } + sysval = build_local_group_size(b); break; } @@ -189,6 +203,14 @@ convert_block(nir_block *block, nir_builder *b) break; case SYSTEM_VALUE_GLOBAL_GROUP_SIZE: { + /* + * If the local work group size is variable we can't lower the global + * group size here. + */ + if (b->shader->info.cs.local_size_variable) { +break; + } Why can't we lower the global size? It seems like we would want the below multiplication regardless of whether the local size is constant. well I am not sure about ARB_compute_variable_group_size, but at least in CL you know nothing about it at compile time as you specify everything when you enqueue the kernel. Could be that the number of work_groups is fixed with ARB_compute_variable_group_size though Why can't you multiply two non-constant things together? Maybe the driver can do something now efficient but it's not clear to me that this isn't what we want. + nir_ssa_def *group_size = build_local_group_size(b); nir_ssa_def *num_work_groups = nir_load_num_work_groups(b); sysval = nir_imul(b, group_size, num_work_groups); -- 2.11.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2] nir: Handle variables dependent on the local work group size.
On Sun, Nov 11, 2018 at 10:48 PM Jason Ekstrand wrote: > > On Sun, Nov 11, 2018 at 3:35 PM Plamena Manolova > wrote: >> >> Lowering shader variables which depend on the local work group >> size being available in nir_lower_system_values is only possible >> if the local work group size isn't variable, otherwise this should >> be handled by the native driver (if it supports >> ARB_compute_variable_group_size). >> >> Signed-off-by: Plamena Manolova >> --- >> src/compiler/nir/nir_lower_system_values.c | 22 ++ >> 1 file changed, 22 insertions(+) >> >> diff --git a/src/compiler/nir/nir_lower_system_values.c >> b/src/compiler/nir/nir_lower_system_values.c >> index bde7eb1180..6fdf9f9c51 100644 >> --- a/src/compiler/nir/nir_lower_system_values.c >> +++ b/src/compiler/nir/nir_lower_system_values.c >> @@ -77,6 +77,15 @@ convert_block(nir_block *block, nir_builder *b) >>*"The value of gl_GlobalInvocationID is equal to >>*gl_WorkGroupID * gl_WorkGroupSize + gl_LocalInvocationID" >>*/ >> + >> + /* >> + * If the local work group size is variable we can't lower the >> global >> + * invocation id here. > > > Why not? > >> >> + */ >> + if (b->shader->info.cs.local_size_variable) { > > > I didn't realize we'd added a bit for this. At one point in time, Karol had > patches which had it keyed off of the size being zero. Having a separate bit > is probably fine, it just surpised me. > yeah.. I guess I choose that, because I had nothing better. But I guess having the size being 0 is good enough as long as we sure it is 0 in relevant cases. >> >> +break; >> + } >> + >> nir_ssa_def *group_size = build_local_group_size(b); >> nir_ssa_def *group_id = nir_load_work_group_id(b); >> nir_ssa_def *local_id = nir_load_local_invocation_id(b); >> @@ -115,6 +124,11 @@ convert_block(nir_block *block, nir_builder *b) >>} >> >>case SYSTEM_VALUE_LOCAL_GROUP_SIZE: { >> + /* If the local work group size is variable we can't lower it here >> */ >> + if (b->shader->info.cs.local_size_variable) { >> +break; >> + } >> + >> sysval = build_local_group_size(b); >> break; >>} >> @@ -189,6 +203,14 @@ convert_block(nir_block *block, nir_builder *b) >> break; >> >>case SYSTEM_VALUE_GLOBAL_GROUP_SIZE: { >> + /* >> + * If the local work group size is variable we can't lower the >> global >> + * group size here. >> + */ >> + if (b->shader->info.cs.local_size_variable) { >> +break; >> + } > > > Why can't we lower the global size? It seems like we would want the below > multiplication regardless of whether the local size is constant. > well I am not sure about ARB_compute_variable_group_size, but at least in CL you know nothing about it at compile time as you specify everything when you enqueue the kernel. Could be that the number of work_groups is fixed with ARB_compute_variable_group_size though? >> >> + >> nir_ssa_def *group_size = build_local_group_size(b); >> nir_ssa_def *num_work_groups = nir_load_num_work_groups(b); >> sysval = nir_imul(b, group_size, num_work_groups); >> -- >> 2.11.0 >> >> ___ >> mesa-dev mailing list >> mesa-dev@lists.freedesktop.org >> https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2] nir: Handle variables dependent on the local work group size.
On Sun, Nov 11, 2018 at 3:35 PM Plamena Manolova < plamena.n.manol...@gmail.com> wrote: > Lowering shader variables which depend on the local work group > size being available in nir_lower_system_values is only possible > if the local work group size isn't variable, otherwise this should > be handled by the native driver (if it supports > ARB_compute_variable_group_size). > > Signed-off-by: Plamena Manolova > --- > src/compiler/nir/nir_lower_system_values.c | 22 ++ > 1 file changed, 22 insertions(+) > > diff --git a/src/compiler/nir/nir_lower_system_values.c > b/src/compiler/nir/nir_lower_system_values.c > index bde7eb1180..6fdf9f9c51 100644 > --- a/src/compiler/nir/nir_lower_system_values.c > +++ b/src/compiler/nir/nir_lower_system_values.c > @@ -77,6 +77,15 @@ convert_block(nir_block *block, nir_builder *b) >*"The value of gl_GlobalInvocationID is equal to >*gl_WorkGroupID * gl_WorkGroupSize + gl_LocalInvocationID" >*/ > + > + /* > + * If the local work group size is variable we can't lower the > global > + * invocation id here. > Why not? > + */ > + if (b->shader->info.cs.local_size_variable) { > I didn't realize we'd added a bit for this. At one point in time, Karol had patches which had it keyed off of the size being zero. Having a separate bit is probably fine, it just surpised me. > +break; > + } > + > nir_ssa_def *group_size = build_local_group_size(b); > nir_ssa_def *group_id = nir_load_work_group_id(b); > nir_ssa_def *local_id = nir_load_local_invocation_id(b); > @@ -115,6 +124,11 @@ convert_block(nir_block *block, nir_builder *b) >} > >case SYSTEM_VALUE_LOCAL_GROUP_SIZE: { > + /* If the local work group size is variable we can't lower it > here */ > + if (b->shader->info.cs.local_size_variable) { > +break; > + } > + > sysval = build_local_group_size(b); > break; >} > @@ -189,6 +203,14 @@ convert_block(nir_block *block, nir_builder *b) > break; > >case SYSTEM_VALUE_GLOBAL_GROUP_SIZE: { > + /* > + * If the local work group size is variable we can't lower the > global > + * group size here. > + */ > + if (b->shader->info.cs.local_size_variable) { > +break; > + } > Why can't we lower the global size? It seems like we would want the below multiplication regardless of whether the local size is constant. > + > nir_ssa_def *group_size = build_local_group_size(b); > nir_ssa_def *num_work_groups = nir_load_num_work_groups(b); > sysval = nir_imul(b, group_size, num_work_groups); > -- > 2.11.0 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 0/2] Update on thread_submit
On 10/11/2018 17:09, Axel Davy wrote: Axel Davy (2): st/nine: Allow 'triple buffering' with thread_submit st/nine: Remove thread_submit warning Nice! Series is: Tested-by: Andre Heider ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2] nir: Handle variables dependent on the local work group size.
Lowering shader variables which depend on the local work group size being available in nir_lower_system_values is only possible if the local work group size isn't variable, otherwise this should be handled by the native driver (if it supports ARB_compute_variable_group_size). Signed-off-by: Plamena Manolova --- src/compiler/nir/nir_lower_system_values.c | 22 ++ 1 file changed, 22 insertions(+) diff --git a/src/compiler/nir/nir_lower_system_values.c b/src/compiler/nir/nir_lower_system_values.c index bde7eb1180..6fdf9f9c51 100644 --- a/src/compiler/nir/nir_lower_system_values.c +++ b/src/compiler/nir/nir_lower_system_values.c @@ -77,6 +77,15 @@ convert_block(nir_block *block, nir_builder *b) *"The value of gl_GlobalInvocationID is equal to *gl_WorkGroupID * gl_WorkGroupSize + gl_LocalInvocationID" */ + + /* + * If the local work group size is variable we can't lower the global + * invocation id here. + */ + if (b->shader->info.cs.local_size_variable) { +break; + } + nir_ssa_def *group_size = build_local_group_size(b); nir_ssa_def *group_id = nir_load_work_group_id(b); nir_ssa_def *local_id = nir_load_local_invocation_id(b); @@ -115,6 +124,11 @@ convert_block(nir_block *block, nir_builder *b) } case SYSTEM_VALUE_LOCAL_GROUP_SIZE: { + /* If the local work group size is variable we can't lower it here */ + if (b->shader->info.cs.local_size_variable) { +break; + } + sysval = build_local_group_size(b); break; } @@ -189,6 +203,14 @@ convert_block(nir_block *block, nir_builder *b) break; case SYSTEM_VALUE_GLOBAL_GROUP_SIZE: { + /* + * If the local work group size is variable we can't lower the global + * group size here. + */ + if (b->shader->info.cs.local_size_variable) { +break; + } + nir_ssa_def *group_size = build_local_group_size(b); nir_ssa_def *num_work_groups = nir_load_num_work_groups(b); sysval = nir_imul(b, group_size, num_work_groups); -- 2.11.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] nir: Handle variables dependent on the local work group size.
Lowering the global invocation id and the local work group size in nir_lower_system_values is only possible if the local work group size isn't variable, otherwise this should be handled by the native driver (if it supports ARB_compute_variable_group_size). Signed-off-by: Plamena Manolova --- src/compiler/nir/nir_lower_system_values.c | 14 ++ 1 file changed, 14 insertions(+) diff --git a/src/compiler/nir/nir_lower_system_values.c b/src/compiler/nir/nir_lower_system_values.c index bde7eb1180..74041d98f7 100644 --- a/src/compiler/nir/nir_lower_system_values.c +++ b/src/compiler/nir/nir_lower_system_values.c @@ -77,6 +77,15 @@ convert_block(nir_block *block, nir_builder *b) *"The value of gl_GlobalInvocationID is equal to *gl_WorkGroupID * gl_WorkGroupSize + gl_LocalInvocationID" */ + + /* + * If the local work group size is variable we can't lower the global + * invocation id here. + */ + if (b->shader->info.cs.local_size_variable) { +break; + } + nir_ssa_def *group_size = build_local_group_size(b); nir_ssa_def *group_id = nir_load_work_group_id(b); nir_ssa_def *local_id = nir_load_local_invocation_id(b); @@ -115,6 +124,11 @@ convert_block(nir_block *block, nir_builder *b) } case SYSTEM_VALUE_LOCAL_GROUP_SIZE: { + /* If the local work group size is variable we can't lower it here */ + if (b->shader->info.cs.local_size_variable) { +break; + } + sysval = build_local_group_size(b); break; } -- 2.11.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] nv50/ir/ra: enforce max register requirement, and change spill order
On nv50, certain operations must happen on regs below 64, due to encoding requirements. First of all, we add infrastructure to enforce this. Secondly we change the spill order to first spill RIG nodes that are unconstrained, followed by ones that are. This makes the gamecube logo shadertoy compile properly. Curiously, if we adjust the spill order so that we first spill the constrained RIG nodes instead, the RA also succeeds. However it seems more logical to first spill the unconstrained ones. While we're at it, drop the nv50 max register to reserve r127 as the zero register of last resort (r63 is preferred). Signed-off-by: Ilia Mirkin --- This is probably one of the more intrusive RA patches I've written, so I'm fairly unsure about it all. In practice, it does appear to work on that one example, but who knows what else it'll mess up. Not sure what testing I should do before merging this... probably not going to be much more than ensuring that Heaven/Valley keep working. .../drivers/nouveau/codegen/nv50_ir_ra.cpp| 25 --- .../nouveau/codegen/nv50_ir_target_nv50.cpp | 2 +- .../drivers/nouveau/codegen/nv50_ir_util.cpp | 8 +++--- .../drivers/nouveau/codegen/nv50_ir_util.h| 5 +++- 4 files changed, 25 insertions(+), 15 deletions(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp index afaf59a8cd9..8d0d66cd212 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp @@ -55,7 +55,7 @@ public: void periodicMask(DataFile f, uint32_t lock, uint32_t unlock); void intersect(DataFile f, const RegisterSet *); - bool assign(int32_t& reg, DataFile f, unsigned int size); + bool assign(int32_t& reg, DataFile f, unsigned int size, unsigned int maxReg); void release(DataFile f, int32_t reg, unsigned int size); void occupy(DataFile f, int32_t reg, unsigned int size); void occupy(const Value *); @@ -160,9 +160,9 @@ RegisterSet::print(DataFile f) const } bool -RegisterSet::assign(int32_t& reg, DataFile f, unsigned int size) +RegisterSet::assign(int32_t& reg, DataFile f, unsigned int size, unsigned int maxReg) { - reg = bits[f].findFreeRange(size); + reg = bits[f].findFreeRange(size, maxReg); if (reg < 0) return false; fill[f] = MAX2(fill[f], (int32_t)(reg + size - 1)); @@ -714,6 +714,7 @@ private: public: uint32_t degree; uint16_t degreeLimit; // if deg < degLimit, node is trivially colourable + uint16_t maxReg; uint16_t colors; DataFile f; @@ -851,12 +852,12 @@ GCRA::RIG_Node::init(const RegisterSet& regs, LValue *lval) weight = std::numeric_limits::infinity(); degree = 0; - int size = regs.getFileSize(f); + maxReg = regs.getFileSize(f); // On nv50, we lose a bit of gpr encoding when there's an embedded // immediate. if (regs.restrictedGPR16Range && f == FILE_GPR && (lval->reg.size == 2 || isShortRegVal(lval))) - size /= 2; - degreeLimit = size; + maxReg /= 2; + degreeLimit = maxReg; degreeLimit -= relDegree[1][colors] - 1; livei.insert(lval->livei); @@ -916,6 +917,8 @@ GCRA::coalesceValues(Value *dst, Value *src, bool force) // add val's definitions to rep and extend the live interval of its RIG node rep->defs.insert(rep->defs.end(), val->defs.begin(), val->defs.end()); nRep->livei.unify(nVal->livei); + nRep->degreeLimit = MIN2(nRep->degreeLimit, nVal->degreeLimit); + nRep->maxReg = MIN2(nRep->maxReg, nVal->maxReg); return true; } @@ -1289,13 +1292,17 @@ GCRA::simplify() } else if (!DLLIST_EMPTY()) { RIG_Node *best = hi.next; + unsigned bestMaxReg = best->maxReg; float bestScore = best->weight / (float)best->degree; - // spill candidate + // Spill candidate. First go through the ones with the highest max + // register, then the ones with lower. That way the ones with the + // lowest requirement will be allocated first, since it's a stack. for (RIG_Node *it = best->next; it != it = it->next) { float score = it->weight / (float)it->degree; -if (score < bestScore) { +if (score < bestScore || it->maxReg > bestMaxReg) { best = it; bestScore = score; + bestMaxReg = it->maxReg; } } if (isinf(bestScore)) { @@ -1396,7 +1403,7 @@ GCRA::selectRegisters() LValue *lval = node->getValue(); if (prog->dbgFlags & NV50_IR_DEBUG_REG_ALLOC) regs.print(node->f); - bool ret = regs.assign(node->reg, node->f, node->colors); + bool ret = regs.assign(node->reg, node->f, node->colors, node->maxReg); if (ret) { INFO_DBG(prog->dbgFlags, REG_ALLOC, "assigned reg %i\n", node->reg); lval->compMask = node->getCompMask(); diff --git
Re: [Mesa-dev] [PATCH] util: Fix warning in u_cpu_detect on non-x86
Alright, thank you :) ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [RFC PATCH] gallium/nir: Add shader-based blending helpers
Some mobile GPUs lack fixed-function hardware for blending, instead emulating blending via internal shaders. In particular for us, vc4 lacks most of the fragment pipeline, implementing blending in the epilogue of the fragment shader. Newer Malis supported by Panfrost have limitations in their fixed-function pipeline, introducing a dedicated "blend shader" stage to implement arbitrarily complex operations. This helper, originally from vc4, generates NIR programs from a Gallium blend state. At the moment, it implements only floating-point ES2 blend operations; integer blending and non-ES2 modes like logic ops are planned for the future. Signed-off-by: Alyssa Rosenzweig --- src/gallium/auxiliary/meson.build | 2 + src/gallium/auxiliary/nir/nir_lower_blend.c | 188 src/gallium/auxiliary/nir/nir_lower_blend.h | 36 3 files changed, 226 insertions(+) create mode 100644 src/gallium/auxiliary/nir/nir_lower_blend.c create mode 100644 src/gallium/auxiliary/nir/nir_lower_blend.h diff --git a/src/gallium/auxiliary/meson.build b/src/gallium/auxiliary/meson.build index e1497992b1..4fb3620374 100644 --- a/src/gallium/auxiliary/meson.build +++ b/src/gallium/auxiliary/meson.build @@ -365,6 +365,8 @@ files_libgallium = files( 'util/u_viewport.h', 'nir/tgsi_to_nir.c', 'nir/tgsi_to_nir.h', + 'nir/nir_lower_blend.c', + 'nir/nir_lower_blend.h', ) if dep_libdrm.found() diff --git a/src/gallium/auxiliary/nir/nir_lower_blend.c b/src/gallium/auxiliary/nir/nir_lower_blend.c new file mode 100644 index 00..2cb32ca1e4 --- /dev/null +++ b/src/gallium/auxiliary/nir/nir_lower_blend.c @@ -0,0 +1,188 @@ +/* + * Copyright © 2015 Broadcom + * Copyright © 2018 Alyssa Rosenzweig + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. + */ + +#include "nir_lower_blend.h" + +/* Implements fixed-function blending in software. The standard entrypoint for + * floating point blending is nir_blending_f, called with the Gallium blend + * state and nir_ssa_def's for the various parameters used in blending. These + * routines may be used to construct dedicated blend shaders or appended to + * fragment shaders; accordingly, they do not perform I/O to maximize + * flexibility. + * + * Inputs are assumed to be clamped to [0, 1]. fsat instructions must be added + * by the caller if clamping is not otherwise performed. + * + * TODO: sRGB, logic ops, integers, dual-source blending, advanced blending + */ + +/* src and dst are vec4 */ + +static nir_ssa_def * +nir_blend_channel_f(nir_builder *b, +nir_ssa_def **src, +nir_ssa_def **dst, +nir_ssa_def *constant, +unsigned factor, +int channel) +{ + switch(factor) { + case PIPE_BLENDFACTOR_ONE: + return nir_imm_float(b, 1.0); + case PIPE_BLENDFACTOR_SRC_COLOR: + return src[channel]; + case PIPE_BLENDFACTOR_SRC_ALPHA: + return src[3]; + case PIPE_BLENDFACTOR_DST_ALPHA: + return dst[3]; + case PIPE_BLENDFACTOR_DST_COLOR: + return dst[channel]; + case PIPE_BLENDFACTOR_SRC_ALPHA_SATURATE: + if (channel != 3) { + return nir_fmin(b, + src[3], + nir_fsub(b, + nir_imm_float(b, 1.0), + dst[3])); + } else { + return nir_imm_float(b, 1.0); + } + case PIPE_BLENDFACTOR_CONST_COLOR: + return nir_channel(b, constant, channel); + case PIPE_BLENDFACTOR_CONST_ALPHA: + return nir_channel(b, constant, 3); + case PIPE_BLENDFACTOR_ZERO: + return nir_imm_float(b, 0.0); + case PIPE_BLENDFACTOR_INV_SRC_COLOR: + return nir_fsub(b, nir_imm_float(b, 1.0), src[channel]); + case PIPE_BLENDFACTOR_INV_SRC_ALPHA: + return nir_fsub(b, nir_imm_float(b, 1.0), src[3]); + case PIPE_BLENDFACTOR_INV_DST_ALPHA: +
Re: [Mesa-dev] [PATCH] meson: Don't set -Wall
On Friday, 2018-11-09 13:28:49 -0800, Dylan Baker wrote: > meson does this for you with it's warn levels, so we don't need to set > it ourselves. s/it's/its/ And yeah you're right, I just checked and -Wall (or equivalent) is always passed to compilers that support it. Reviewed-by: Eric Engestrom > > Fixes: d1992255bb29054fa51763376d125183a9f602f3 >("meson: Add build Intel "anv" vulkan driver") > --- > meson.build | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/meson.build b/meson.build > index 18667988bac..dabfb9abddd 100644 > --- a/meson.build > +++ b/meson.build > @@ -787,7 +787,7 @@ endif > > # Check for generic C arguments > c_args = [] > -foreach a : ['-Wall', '-Werror=implicit-function-declaration', > +foreach a : ['-Werror=implicit-function-declaration', > '-Werror=missing-prototypes', '-Werror=return-type', > '-fno-math-errno', > '-fno-trapping-math', '-Qunused-arguments'] > @@ -809,7 +809,7 @@ endif > > # Check for generic C++ arguments > cpp_args = [] > -foreach a : ['-Wall', '-Werror=return-type', > +foreach a : ['-Werror=return-type', > '-fno-math-errno', '-fno-trapping-math', > '-Qunused-arguments'] >if cpp.has_argument(a) > -- > 2.19.1 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] util: Fix warning in u_cpu_detect on non-x86
Thank you. I noticed this as well. Reviewed-by: Matt Turner I'll commit it tomorrow if there are no additional comments. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] util: Fix warning in u_cpu_detect on non-x86
regs is only set and used on x86; on other platforms (like ARM), this code causes a trivial warning, solved by moving the regs declaration to the architecture-dependent usage. Signed-off-by: Alyssa Rosenzweig --- src/util/u_cpu_detect.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/util/u_cpu_detect.c b/src/util/u_cpu_detect.c index 4dbb4d8fb5..52b9ae547d 100644 --- a/src/util/u_cpu_detect.c +++ b/src/util/u_cpu_detect.c @@ -370,14 +370,14 @@ check_os_arm_support(void) static void get_cpu_topology(void) { - uint32_t regs[4]; - /* Default. This is correct if L3 is not present or there is only one. */ util_cpu_caps.cores_per_L3 = util_cpu_caps.nr_cpus; #if defined(PIPE_ARCH_X86) || defined(PIPE_ARCH_X86_64) /* AMD Zen */ if (util_cpu_caps.x86_cpu_type == 0x17) { + uint32_t regs[4]; + /* Query the L3 cache topology information. */ cpuid_count(0x801D, 3, regs); unsigned cache_level = (regs[0] >> 5) & 0x7; -- 2.19.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH mesa 3/3] meson: fix wayland-less builds
On Thursday, 2018-10-11 16:36:15 +0100, Eric Engestrom wrote: > Those empty variables in the !wayland case are useless and running that > meson.build with them breaks the build: > > [287/850] Generating wayland-drm-client-protocol.h with a custom command. > FAILED: src/egl/wayland/wayland-drm/wayland-drm-client-protocol.h > client-header ../src/egl/wayland/wayland-drm/wayland-drm.xml > src/egl/wayland/wayland-drm/wayland-drm-client-protocol.h > /bin/sh: client-header: command not found > ninja: build stopped: subcommand failed. > > Fixes: d1992255bb29054fa5176 "meson: Add build Intel "anv" vulkan driver" > Signed-off-by: Eric Engestrom > --- > Kind of amazed that no one has tried to build with wayland disabled > since meson was introduced, but I guess it's a good thing ^^' Humble ping for this series? It is currently impossible to build Mesa without support for wayland, which I think is quite an important bug :) > --- > meson.build | 7 --- > src/meson.build | 4 +++- > 2 files changed, 3 insertions(+), 8 deletions(-) > > diff --git a/meson.build b/meson.build > index 0276b11e9cf003f6ebde..fcde41387ac2581ccad0 100644 > --- a/meson.build > +++ b/meson.build > @@ -1315,13 +1315,6 @@ if with_platform_wayland > 'linux-dmabuf', 'linux-dmabuf-unstable-v1.xml' >) >pre_args += ['-DHAVE_WAYLAND_PLATFORM', '-DWL_HIDE_DEPRECATED'] > -else > - prog_wl_scanner = [] > - wl_scanner_arg = '' > - dep_wl_protocols = null_dep > - dep_wayland_client = null_dep > - dep_wayland_server = null_dep > - wayland_dmabuf_xml = '' > endif > > dep_x11 = null_dep > diff --git a/src/meson.build b/src/meson.build > index 2b8fcc36be5b0b1688ec..75f642018f03ecfef562 100644 > --- a/src/meson.build > +++ b/src/meson.build > @@ -56,7 +56,9 @@ if with_gles1 or with_gles2 or with_shared_glapi > endif > # TODO: opengl > subdir('compiler') > -subdir('egl/wayland/wayland-drm') > +if with_platform_wayland > + subdir('egl/wayland/wayland-drm') > +endif > if with_any_vk >subdir('vulkan') > endif > -- > Cheers, > Eric > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] nv50/ir/ra: improve condition for short regs, unify with cond for 16-bit
Instead of the size restriction existing in two places, and potentially being applied twice, we move this together. Ops with 16-bit register addresses can only take a short reg, and ops with immediates can only take a short reg. Of course we leave the immediate 0 in place since we know that it will be replaced by r63/r127 down the line, so don't treat zeroes as an immediate. Signed-off-by: Ilia Mirkin --- src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp | 14 +++--- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp index 21cbd955537..afaf59a8cd9 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp @@ -66,10 +66,8 @@ public: inline int getMaxAssigned(DataFile f) const { return fill[f]; } - inline unsigned int getFileSize(DataFile f, uint8_t regSize) const + inline unsigned int getFileSize(DataFile f) const { - if (restrictedGPR16Range && f == FILE_GPR && regSize == 2) - return (last[f] + 1) / 2; return last[f] + 1; } @@ -813,9 +811,11 @@ GCRA::printNodeInfo() const static bool isShortRegOp(Instruction *insn) { - // Immediates are always in src1. Every other situation can be resolved by + // Immediates are always in src1 (except zeroes, which end up getting + // replaced with a zero reg). Every other situation can be resolved by // using a long encoding. - return insn->srcExists(1) && insn->src(1).getFile() == FILE_IMMEDIATE; + return insn->srcExists(1) && insn->src(1).getFile() == FILE_IMMEDIATE && + insn->getSrc(1)->reg.data.u64; } // Check if this LValue is ever used in an instruction that can't be encoded @@ -851,10 +851,10 @@ GCRA::RIG_Node::init(const RegisterSet& regs, LValue *lval) weight = std::numeric_limits::infinity(); degree = 0; - int size = regs.getFileSize(f, lval->reg.size); + int size = regs.getFileSize(f); // On nv50, we lose a bit of gpr encoding when there's an embedded // immediate. - if (regs.restrictedGPR16Range && f == FILE_GPR && isShortRegVal(lval)) + if (regs.restrictedGPR16Range && f == FILE_GPR && (lval->reg.size == 2 || isShortRegVal(lval))) size /= 2; degreeLimit = size; degreeLimit -= relDegree[1][colors] - 1; -- 2.18.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 6/7] intel/fs: Support min_lod parameters on texture instructions
Sagar reminded me via his MSR that this patch series never landed. Everything got reviewed by Ian except this patch which he only acked. Would someone mind doing a more proper review? This seemed like the wrong patch to push with just an ack. --Jason On Thu, Oct 11, 2018 at 4:33 PM Jason Ekstrand wrote: > We have to lower some shadow instructions because they don't exist in > hardware and we have to lower txb+offset+clamp because the message gets > too big and we run into the sampler message length limit of 11 regs. > --- > src/intel/compiler/brw_eu_defines.h | 2 ++ > src/intel/compiler/brw_fs.cpp | 22 +- > src/intel/compiler/brw_fs_nir.cpp | 6 +- > src/intel/compiler/brw_nir.c| 3 +++ > 4 files changed, 31 insertions(+), 2 deletions(-) > > diff --git a/src/intel/compiler/brw_eu_defines.h > b/src/intel/compiler/brw_eu_defines.h > index 52957882b10..affe977835b 100644 > --- a/src/intel/compiler/brw_eu_defines.h > +++ b/src/intel/compiler/brw_eu_defines.h > @@ -811,6 +811,8 @@ enum tex_logical_srcs { > TEX_LOGICAL_SRC_LOD, > /** dPdy if the operation takes explicit derivatives */ > TEX_LOGICAL_SRC_LOD2, > + /** Min LOD */ > + TEX_LOGICAL_SRC_MIN_LOD, > /** Sample index */ > TEX_LOGICAL_SRC_SAMPLE_INDEX, > /** MCS data */ > diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp > index 23a25fedca5..a09451249d8 100644 > --- a/src/intel/compiler/brw_fs.cpp > +++ b/src/intel/compiler/brw_fs.cpp > @@ -4472,6 +4472,7 @@ lower_sampler_logical_send_gen7(const fs_builder > , fs_inst *inst, opcode op, > const fs_reg , > const fs_reg _c, > fs_reg lod, const fs_reg , > +const fs_reg _lod, > const fs_reg _index, > const fs_reg , > const fs_reg , > @@ -4682,6 +4683,15 @@ lower_sampler_logical_send_gen7(const fs_builder > , fs_inst *inst, opcode op, > bld.MOV(sources[length++], offset(coordinate, bld, i)); > } > > + if (min_lod.file != BAD_FILE) { > + /* Account for all of the missing coordinate sources */ > + length += 4 - coord_components; > + if (op == SHADER_OPCODE_TXD) > + length += (3 - grad_components) * 2; > + > + bld.MOV(sources[length++], min_lod); > + } > + > int mlen; > if (reg_width == 2) >mlen = length * reg_width - header_size; > @@ -4713,6 +4723,7 @@ lower_sampler_logical_send(const fs_builder , > fs_inst *inst, opcode op) > const fs_reg _c = inst->src[TEX_LOGICAL_SRC_SHADOW_C]; > const fs_reg = inst->src[TEX_LOGICAL_SRC_LOD]; > const fs_reg = inst->src[TEX_LOGICAL_SRC_LOD2]; > + const fs_reg _lod = inst->src[TEX_LOGICAL_SRC_MIN_LOD]; > const fs_reg _index = inst->src[TEX_LOGICAL_SRC_SAMPLE_INDEX]; > const fs_reg = inst->src[TEX_LOGICAL_SRC_MCS]; > const fs_reg = inst->src[TEX_LOGICAL_SRC_SURFACE]; > @@ -4725,7 +4736,8 @@ lower_sampler_logical_send(const fs_builder , > fs_inst *inst, opcode op) > > if (devinfo->gen >= 7) { >lower_sampler_logical_send_gen7(bld, inst, op, coordinate, > - shadow_c, lod, lod2, sample_index, > + shadow_c, lod, lod2, min_lod, > + sample_index, >mcs, surface, sampler, tg4_offset, >coord_components, grad_components); > } else if (devinfo->gen >= 5) { > @@ -5262,6 +5274,14 @@ static unsigned > get_sampler_lowered_simd_width(const struct gen_device_info *devinfo, > const fs_inst *inst) > { > + /* If we have a min_lod parameter on anything other than a simple > sample > +* message, it will push it over 5 arguments and we have to fall back > to > +* SIMD8. > +*/ > + if (inst->opcode != SHADER_OPCODE_TEX && > + inst->components_read(TEX_LOGICAL_SRC_MIN_LOD)) > + return 8; > + > /* Calculate the number of coordinate components that have to be > present > * assuming that additional arguments follow the texel coordinates in > the > * message payload. On IVB+ there is no need for padding, on ILK-SNB > we > diff --git a/src/intel/compiler/brw_fs_nir.cpp > b/src/intel/compiler/brw_fs_nir.cpp > index 7930205d659..6862abf80bb 100644 > --- a/src/intel/compiler/brw_fs_nir.cpp > +++ b/src/intel/compiler/brw_fs_nir.cpp > @@ -3079,7 +3079,7 @@ fs_visitor::emit_non_coherent_fb_read(const > fs_builder , const fs_reg , > > /* Emit the instruction. */ > const fs_reg srcs[] = { coords, fs_reg(), brw_imm_ud(0), fs_reg(), > - sample, mcs, > + fs_reg(), sample, mcs, > brw_imm_ud(surface), brw_imm_ud(0), >
Re: [Mesa-dev] [PATCH] nir: Allow to skip integer ops in nir_lower_to_source_mods
On November 11, 2018 09:11:26 Gert Wollny wrote: Am Sonntag, den 11.11.2018, 09:07 -0600 schrieb Jason Ekstrand: On November 11, 2018 07:44:54 Gert Wollny wrote: From: Gert Wollny Since some hardware supports source mods only for float operations make it possible to skip this lowering for integer operations. Out of curiosity, what hardware would that be? r600/Evergreen at least. Signed-off-by: Gert Wollny --- I'm a bit unsure about what the best name for the parameter is, i.e. passing in true when one doesn't want all to be lowered seems a bit irritating, but a name like "lower_also_int_ops" looks ugly, and "lower_all" immediately asks for a comment what "not all" includes. I don't think it's worth adding option flags that would be more self-explanatory though. Maybe a bit field: typedef enum { nir_lower_int_source_mods = 1 << 0, nir_lower_float_source_mods = 1 << 1, } nir_lower_to_source_mods_flags; That was my first idea, but then it seemed a bit too much, i.e. I don't know whether there is hardware that can do has source modes for int, but not for float. Meh. It makes it clear and that's the point. I don't really care if it's "too much" as long as the code remains readable. As an aside, Intel hardware has source mods for integers but not all instructions support them and on Broadwell and later, logical instructions (and, or, xor) treat the negation modifier as doing an inot so it doesn't map perfectly for us either. Best, Gert thanks for any comments and reviews, Gert src/compiler/nir/nir.h | 2 +- src/compiler/nir/nir_lower_to_source_mods.c | 36 +- --- src/intel/compiler/brw_nir.c| 2 +- 3 files changed, 24 insertions(+), 16 deletions(-) diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h index dc3c729dee..e2f64c9d44 100644 --- a/src/compiler/nir/nir.h +++ b/src/compiler/nir/nir.h @@ -3013,7 +3013,7 @@ typedef struct nir_lower_bitmap_options { void nir_lower_bitmap(nir_shader *shader, const nir_lower_bitmap_options *options); bool nir_lower_atomics_to_ssbo(nir_shader *shader, unsigned ssbo_offset); -bool nir_lower_to_source_mods(nir_shader *shader); +bool nir_lower_to_source_mods(nir_shader *shader, bool skip_int_ops); bool nir_lower_gs_intrinsics(nir_shader *shader); diff --git a/src/compiler/nir/nir_lower_to_source_mods.c b/src/compiler/nir/nir_lower_to_source_mods.c index 077ca53704..36f2160627 100644 --- a/src/compiler/nir/nir_lower_to_source_mods.c +++ b/src/compiler/nir/nir_lower_to_source_mods.c @@ -34,7 +34,7 @@ */ static bool -nir_lower_to_source_mods_block(nir_block *block) +nir_lower_to_source_mods_block(nir_block *block, bool skip_int_ops) { bool progress = false; @@ -62,6 +62,8 @@ nir_lower_to_source_mods_block(nir_block *block) continue; break; case nir_type_int: +if (skip_int_ops) + continue; if (parent->op != nir_op_imov) continue; break; @@ -102,19 +104,10 @@ nir_lower_to_source_mods_block(nir_block *block) alu->op = nir_op_fmov; alu->dest.saturate = true; break; - case nir_op_ineg: - alu->op = nir_op_imov; - alu->src[0].negate = !alu->src[0].negate; - break; case nir_op_fneg: alu->op = nir_op_fmov; alu->src[0].negate = !alu->src[0].negate; break; - case nir_op_iabs: - alu->op = nir_op_imov; - alu->src[0].abs = true; - alu->src[0].negate = false; - break; case nir_op_fabs: alu->op = nir_op_fmov; alu->src[0].abs = true; @@ -124,6 +117,21 @@ nir_lower_to_source_mods_block(nir_block *block) break; } + if (!skip_int_ops) { + switch (alu->op) { + case nir_op_ineg: +alu->op = nir_op_imov; +alu->src[0].negate = !alu->src[0].negate; +break; + case nir_op_iabs: +alu->op = nir_op_imov; +alu->src[0].abs = true; +alu->src[0].negate = false; +break; + default: +break; + } + } /* We've covered sources. Now we're going to try and saturate the * destination if we can. */ @@ -185,12 +193,12 @@ nir_lower_to_source_mods_block(nir_block *block) } static bool -nir_lower_to_source_mods_impl(nir_function_impl *impl) +nir_lower_to_source_mods_impl(nir_function_impl *impl, bool skip_int_ops) { bool progress = false; nir_foreach_block(block, impl) { - progress |= nir_lower_to_source_mods_block(block); + progress |= nir_lower_to_source_mods_block(block, skip_int_ops); } if (progress) @@ -201,13 +209,13 @@ nir_lower_to_source_mods_impl(nir_function_impl *impl) } bool -nir_lower_to_source_mods(nir_shader *shader) +nir_lower_to_source_mods(nir_shader *shader, bool skip_int_ops) { bool progress = false; nir_foreach_function(function, shader) { if (function->impl) { - progress |= nir_lower_to_source_mods_impl(function- impl); + progress |= nir_lower_to_source_mods_impl(function- impl,
Re: [Mesa-dev] [PATCH] nir: Allow to skip integer ops in nir_lower_to_source_mods
Am Sonntag, den 11.11.2018, 09:07 -0600 schrieb Jason Ekstrand: > On November 11, 2018 07:44:54 Gert Wollny > wrote: > > > From: Gert Wollny > > > > > > Since some hardware supports source mods only for float operations > > make > > it possible to skip this lowering for integer operations. > > Out of curiosity, what hardware would that be? r600/Evergreen at least. > > > Signed-off-by: Gert Wollny > > --- > > I'm a bit unsure about what the best name for the parameter is, > > i.e. passing in > > true when one doesn't want all to be lowered seems a bit > > irritating, but a name > > like "lower_also_int_ops" looks ugly, and "lower_all" immediately > > asks for a > > comment what "not all" includes. I don't think it's worth adding > > option flags > > that would be more self-explanatory though. > > Maybe a bit field: > > typedef enum { >nir_lower_int_source_mods = 1 << 0, >nir_lower_float_source_mods = 1 << 1, > } nir_lower_to_source_mods_flags; That was my first idea, but then it seemed a bit too much, i.e. I don't know whether there is hardware that can do has source modes for int, but not for float. Best, Gert > > > thanks for any comments and reviews, > > Gert > > > > > > > > > > src/compiler/nir/nir.h | 2 +- > > src/compiler/nir/nir_lower_to_source_mods.c | 36 +- > > --- > > src/intel/compiler/brw_nir.c| 2 +- > > 3 files changed, 24 insertions(+), 16 deletions(-) > > > > > > diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h > > index dc3c729dee..e2f64c9d44 100644 > > --- a/src/compiler/nir/nir.h > > +++ b/src/compiler/nir/nir.h > > @@ -3013,7 +3013,7 @@ typedef struct nir_lower_bitmap_options { > > void nir_lower_bitmap(nir_shader *shader, const > > nir_lower_bitmap_options > > *options); > > > > bool nir_lower_atomics_to_ssbo(nir_shader *shader, unsigned > > ssbo_offset); > > -bool nir_lower_to_source_mods(nir_shader *shader); > > +bool nir_lower_to_source_mods(nir_shader *shader, bool > > skip_int_ops); > > > > bool nir_lower_gs_intrinsics(nir_shader *shader); > > > > diff --git a/src/compiler/nir/nir_lower_to_source_mods.c > > b/src/compiler/nir/nir_lower_to_source_mods.c > > index 077ca53704..36f2160627 100644 > > --- a/src/compiler/nir/nir_lower_to_source_mods.c > > +++ b/src/compiler/nir/nir_lower_to_source_mods.c > > @@ -34,7 +34,7 @@ > > */ > > > > static bool > > -nir_lower_to_source_mods_block(nir_block *block) > > +nir_lower_to_source_mods_block(nir_block *block, bool > > skip_int_ops) > > { > >bool progress = false; > > > > @@ -62,6 +62,8 @@ nir_lower_to_source_mods_block(nir_block *block) > >continue; > > break; > > case nir_type_int: > > +if (skip_int_ops) > > + continue; > > if (parent->op != nir_op_imov) > >continue; > > break; > > @@ -102,19 +104,10 @@ nir_lower_to_source_mods_block(nir_block > > *block) > > alu->op = nir_op_fmov; > > alu->dest.saturate = true; > > break; > > - case nir_op_ineg: > > - alu->op = nir_op_imov; > > - alu->src[0].negate = !alu->src[0].negate; > > - break; > > case nir_op_fneg: > > alu->op = nir_op_fmov; > > alu->src[0].negate = !alu->src[0].negate; > > break; > > - case nir_op_iabs: > > - alu->op = nir_op_imov; > > - alu->src[0].abs = true; > > - alu->src[0].negate = false; > > - break; > > case nir_op_fabs: > > alu->op = nir_op_fmov; > > alu->src[0].abs = true; > > @@ -124,6 +117,21 @@ nir_lower_to_source_mods_block(nir_block > > *block) > > break; > > } > > > > + if (!skip_int_ops) { > > + switch (alu->op) { > > + case nir_op_ineg: > > +alu->op = nir_op_imov; > > +alu->src[0].negate = !alu->src[0].negate; > > +break; > > + case nir_op_iabs: > > +alu->op = nir_op_imov; > > +alu->src[0].abs = true; > > +alu->src[0].negate = false; > > +break; > > + default: > > +break; > > + } > > + } > > /* We've covered sources. Now we're going to try and > > saturate the > >* destination if we can. > >*/ > > @@ -185,12 +193,12 @@ nir_lower_to_source_mods_block(nir_block > > *block) > > } > > > > static bool > > -nir_lower_to_source_mods_impl(nir_function_impl *impl) > > +nir_lower_to_source_mods_impl(nir_function_impl *impl, bool > > skip_int_ops) > > { > >bool progress = false; > > > >nir_foreach_block(block, impl) { > > - progress |= nir_lower_to_source_mods_block(block); > > + progress |= nir_lower_to_source_mods_block(block, > > skip_int_ops); > >} > > > >if (progress) > > @@ -201,13 +209,13 @@ > > nir_lower_to_source_mods_impl(nir_function_impl *impl) > > } > > > > bool > >
Re: [Mesa-dev] [PATCH] nir: Allow to skip integer ops in nir_lower_to_source_mods
On November 11, 2018 07:44:54 Gert Wollny wrote: From: Gert Wollny Since some hardware supports source mods only for float operations make it possible to skip this lowering for integer operations. Out of curiosity, what hardware would that be? Signed-off-by: Gert Wollny --- I'm a bit unsure about what the best name for the parameter is, i.e. passing in true when one doesn't want all to be lowered seems a bit irritating, but a name like "lower_also_int_ops" looks ugly, and "lower_all" immediately asks for a comment what "not all" includes. I don't think it's worth adding option flags that would be more self-explanatory though. Maybe a bit field: typedef enum { nir_lower_int_source_mods = 1 << 0, nir_lower_float_source_mods = 1 << 1, } nir_lower_to_source_mods_flags; thanks for any comments and reviews, Gert src/compiler/nir/nir.h | 2 +- src/compiler/nir/nir_lower_to_source_mods.c | 36 + src/intel/compiler/brw_nir.c| 2 +- 3 files changed, 24 insertions(+), 16 deletions(-) diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h index dc3c729dee..e2f64c9d44 100644 --- a/src/compiler/nir/nir.h +++ b/src/compiler/nir/nir.h @@ -3013,7 +3013,7 @@ typedef struct nir_lower_bitmap_options { void nir_lower_bitmap(nir_shader *shader, const nir_lower_bitmap_options *options); bool nir_lower_atomics_to_ssbo(nir_shader *shader, unsigned ssbo_offset); -bool nir_lower_to_source_mods(nir_shader *shader); +bool nir_lower_to_source_mods(nir_shader *shader, bool skip_int_ops); bool nir_lower_gs_intrinsics(nir_shader *shader); diff --git a/src/compiler/nir/nir_lower_to_source_mods.c b/src/compiler/nir/nir_lower_to_source_mods.c index 077ca53704..36f2160627 100644 --- a/src/compiler/nir/nir_lower_to_source_mods.c +++ b/src/compiler/nir/nir_lower_to_source_mods.c @@ -34,7 +34,7 @@ */ static bool -nir_lower_to_source_mods_block(nir_block *block) +nir_lower_to_source_mods_block(nir_block *block, bool skip_int_ops) { bool progress = false; @@ -62,6 +62,8 @@ nir_lower_to_source_mods_block(nir_block *block) continue; break; case nir_type_int: +if (skip_int_ops) + continue; if (parent->op != nir_op_imov) continue; break; @@ -102,19 +104,10 @@ nir_lower_to_source_mods_block(nir_block *block) alu->op = nir_op_fmov; alu->dest.saturate = true; break; - case nir_op_ineg: - alu->op = nir_op_imov; - alu->src[0].negate = !alu->src[0].negate; - break; case nir_op_fneg: alu->op = nir_op_fmov; alu->src[0].negate = !alu->src[0].negate; break; - case nir_op_iabs: - alu->op = nir_op_imov; - alu->src[0].abs = true; - alu->src[0].negate = false; - break; case nir_op_fabs: alu->op = nir_op_fmov; alu->src[0].abs = true; @@ -124,6 +117,21 @@ nir_lower_to_source_mods_block(nir_block *block) break; } + if (!skip_int_ops) { + switch (alu->op) { + case nir_op_ineg: +alu->op = nir_op_imov; +alu->src[0].negate = !alu->src[0].negate; +break; + case nir_op_iabs: +alu->op = nir_op_imov; +alu->src[0].abs = true; +alu->src[0].negate = false; +break; + default: +break; + } + } /* We've covered sources. Now we're going to try and saturate the * destination if we can. */ @@ -185,12 +193,12 @@ nir_lower_to_source_mods_block(nir_block *block) } static bool -nir_lower_to_source_mods_impl(nir_function_impl *impl) +nir_lower_to_source_mods_impl(nir_function_impl *impl, bool skip_int_ops) { bool progress = false; nir_foreach_block(block, impl) { - progress |= nir_lower_to_source_mods_block(block); + progress |= nir_lower_to_source_mods_block(block, skip_int_ops); } if (progress) @@ -201,13 +209,13 @@ nir_lower_to_source_mods_impl(nir_function_impl *impl) } bool -nir_lower_to_source_mods(nir_shader *shader) +nir_lower_to_source_mods(nir_shader *shader, bool skip_int_ops) { bool progress = false; nir_foreach_function(function, shader) { if (function->impl) { - progress |= nir_lower_to_source_mods_impl(function->impl); + progress |= nir_lower_to_source_mods_impl(function->impl, skip_int_ops); } } diff --git a/src/intel/compiler/brw_nir.c b/src/intel/compiler/brw_nir.c index cf5a4a96d6..9c5f2af030 100644 --- a/src/intel/compiler/brw_nir.c +++ b/src/intel/compiler/brw_nir.c @@ -793,7 +793,7 @@ brw_postprocess_nir(nir_shader *nir, const struct brw_compiler *compiler, OPT(nir_opt_algebraic_late); - OPT(nir_lower_to_source_mods); + OPT(nir_lower_to_source_mods, false); OPT(nir_copy_prop); OPT(nir_opt_dce); OPT(nir_opt_move_comparisons); -- 2.18.1
[Mesa-dev] [PATCH] nir: Allow to skip integer ops in nir_lower_to_source_mods
From: Gert Wollny Since some hardware supports source mods only for float operations make it possible to skip this lowering for integer operations. Signed-off-by: Gert Wollny --- I'm a bit unsure about what the best name for the parameter is, i.e. passing in true when one doesn't want all to be lowered seems a bit irritating, but a name like "lower_also_int_ops" looks ugly, and "lower_all" immediately asks for a comment what "not all" includes. I don't think it's worth adding option flags that would be more self-explanatory though. thanks for any comments and reviews, Gert src/compiler/nir/nir.h | 2 +- src/compiler/nir/nir_lower_to_source_mods.c | 36 + src/intel/compiler/brw_nir.c| 2 +- 3 files changed, 24 insertions(+), 16 deletions(-) diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h index dc3c729dee..e2f64c9d44 100644 --- a/src/compiler/nir/nir.h +++ b/src/compiler/nir/nir.h @@ -3013,7 +3013,7 @@ typedef struct nir_lower_bitmap_options { void nir_lower_bitmap(nir_shader *shader, const nir_lower_bitmap_options *options); bool nir_lower_atomics_to_ssbo(nir_shader *shader, unsigned ssbo_offset); -bool nir_lower_to_source_mods(nir_shader *shader); +bool nir_lower_to_source_mods(nir_shader *shader, bool skip_int_ops); bool nir_lower_gs_intrinsics(nir_shader *shader); diff --git a/src/compiler/nir/nir_lower_to_source_mods.c b/src/compiler/nir/nir_lower_to_source_mods.c index 077ca53704..36f2160627 100644 --- a/src/compiler/nir/nir_lower_to_source_mods.c +++ b/src/compiler/nir/nir_lower_to_source_mods.c @@ -34,7 +34,7 @@ */ static bool -nir_lower_to_source_mods_block(nir_block *block) +nir_lower_to_source_mods_block(nir_block *block, bool skip_int_ops) { bool progress = false; @@ -62,6 +62,8 @@ nir_lower_to_source_mods_block(nir_block *block) continue; break; case nir_type_int: +if (skip_int_ops) + continue; if (parent->op != nir_op_imov) continue; break; @@ -102,19 +104,10 @@ nir_lower_to_source_mods_block(nir_block *block) alu->op = nir_op_fmov; alu->dest.saturate = true; break; - case nir_op_ineg: - alu->op = nir_op_imov; - alu->src[0].negate = !alu->src[0].negate; - break; case nir_op_fneg: alu->op = nir_op_fmov; alu->src[0].negate = !alu->src[0].negate; break; - case nir_op_iabs: - alu->op = nir_op_imov; - alu->src[0].abs = true; - alu->src[0].negate = false; - break; case nir_op_fabs: alu->op = nir_op_fmov; alu->src[0].abs = true; @@ -124,6 +117,21 @@ nir_lower_to_source_mods_block(nir_block *block) break; } + if (!skip_int_ops) { + switch (alu->op) { + case nir_op_ineg: +alu->op = nir_op_imov; +alu->src[0].negate = !alu->src[0].negate; +break; + case nir_op_iabs: +alu->op = nir_op_imov; +alu->src[0].abs = true; +alu->src[0].negate = false; +break; + default: +break; + } + } /* We've covered sources. Now we're going to try and saturate the * destination if we can. */ @@ -185,12 +193,12 @@ nir_lower_to_source_mods_block(nir_block *block) } static bool -nir_lower_to_source_mods_impl(nir_function_impl *impl) +nir_lower_to_source_mods_impl(nir_function_impl *impl, bool skip_int_ops) { bool progress = false; nir_foreach_block(block, impl) { - progress |= nir_lower_to_source_mods_block(block); + progress |= nir_lower_to_source_mods_block(block, skip_int_ops); } if (progress) @@ -201,13 +209,13 @@ nir_lower_to_source_mods_impl(nir_function_impl *impl) } bool -nir_lower_to_source_mods(nir_shader *shader) +nir_lower_to_source_mods(nir_shader *shader, bool skip_int_ops) { bool progress = false; nir_foreach_function(function, shader) { if (function->impl) { - progress |= nir_lower_to_source_mods_impl(function->impl); + progress |= nir_lower_to_source_mods_impl(function->impl, skip_int_ops); } } diff --git a/src/intel/compiler/brw_nir.c b/src/intel/compiler/brw_nir.c index cf5a4a96d6..9c5f2af030 100644 --- a/src/intel/compiler/brw_nir.c +++ b/src/intel/compiler/brw_nir.c @@ -793,7 +793,7 @@ brw_postprocess_nir(nir_shader *nir, const struct brw_compiler *compiler, OPT(nir_opt_algebraic_late); - OPT(nir_lower_to_source_mods); + OPT(nir_lower_to_source_mods, false); OPT(nir_copy_prop); OPT(nir_opt_dce); OPT(nir_opt_move_comparisons); -- 2.18.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev