Re: [Mesa-dev] [PATCH 3/3] gallium: Add tgsi_to_nir to get a nir_shader for a TGSI shader.
Kenneth Graunke kenn...@whitecape.org writes: On Friday, March 27, 2015 01:54:32 PM Eric Anholt wrote: This will be used by the VC4 driver for doing device-independent optimization, and hopefully eventually replacing its whole IR. It also may be useful to other drivers for the same reason. Hi Eric! I have a bunch of comments below, but overall this looks great. You should probably have someone who knows TGSI better than I do review it, but for what it's worth, this is: Reviewed-by: Kenneth Graunke kenn...@whitecape.org Thanks! There was definitely useful feedback in here, and I've taken most of it. +/* LOG - Approximate Logarithm Base 2 + * dst.x = \lfloor\log_2{|src.x|}\rfloor + * dst.y = \frac{|src.x|}{2^{\lfloor\log_2{|src.x|}\rfloor}} + * dst.z = \log_2{|src.x|} + * dst.w = 1.0 + */ +static void +ttn_log(nir_builder *b, nir_op op, nir_alu_dest dest, nir_ssa_def **src) +{ + nir_ssa_def *abs_srcx = nir_fabs(b, ttn_channel(b, src[0], X)); + nir_ssa_def *log2 = nir_flog2(b, abs_srcx); + + ttn_move_dest_masked(b, dest, nir_ffloor(b, log2), TGSI_WRITEMASK_X); + ttn_move_dest_masked(b, dest, +nir_fdiv(b, abs_srcx, nir_fexp2(b, nir_ffloor(b, log2))), You're generating two copies of floor(log2) here, which will have to be CSE'd later. In prog_to_nir, I created a temporary and used it in both places: nir_ssa_def *floor_log2 = nir_ffloor(b, log2); We're generating tons of rubbish for NIR to optimize anyway, so it's not a big deal...but...may as well do the trivial improvement. I much more expect the whole mess except for the dst.z computation to get DCEed away, so it's just one extra DCE out of so many. (and we generate lots of copy prop to avoid all throughout this mess, which I've considered short-circuting in nir_builder some day). +static void +ttn_sle(nir_builder *b, nir_op op, nir_alu_dest dest, nir_ssa_def **src) +{ + ttn_move_dest(b, dest, nir_sge(b, src[1], src[0])); +} I've got code here to generate b2f(fge(...)) instead of sge(...) since I didn't want to bother implementing it in my driver, and figured the b2fs might be able to get optimized away. That said, I suppose we could probably just add lowering transformations that turn sge - b2f(fge(...)) when options-native_integers is set, and delete my code... For me an SGE in hardware is: fsub.sf(null, src0, src1) mov.nc(dest, 1.0) mov.ns(dest, 0) while your plan would be... oh wait. I didn't even have a b2f implementation because TGSI doesn't do that (they just AND the bool with 1.0). But an FGE is: fsub.sf(null, src0, src1) mov.nc(dest, ~0) mov.ns(dest, 0) so any more instructions would be worse. +static void +ttn_xpd(nir_builder *b, nir_op op, nir_alu_dest dest, nir_ssa_def **src) +{ + ttn_move_dest_masked(b, dest, +nir_fsub(b, + nir_fmul(b, + ttn_swizzle(b, src[0], Y, Z, X, X), + ttn_swizzle(b, src[1], Z, X, Y, X)), + nir_fmul(b, + ttn_swizzle(b, src[1], Y, Z, X, X), + ttn_swizzle(b, src[0], Z, X, Y, X))), +TGSI_WRITEMASK_XYZ); + ttn_move_dest_masked(b, dest, nir_imm_float(b, 1.0), TGSI_WRITEMASK_W); +} + +static void +ttn_dp2a(nir_builder *b, nir_op op, nir_alu_dest dest, nir_ssa_def **src) +{ + ttn_move_dest(b, dest, + ttn_channel(b, nir_fadd(b, + ttn_channel(b, nir_fdot2(b, src[0], + src[1]), + X), Do you really need to do ttn_channel(b, ..., X) on a fdot2 result? It's already a scalar value. Same comment applies to the below four. I should probably delete that from prog_to_nir as well. Good catch, the cleanups for scalar in the builder have obsoleted this, I think. + src[2]), + X)); +} +static void +ttn_ucmp(nir_builder *b, nir_op op, nir_alu_dest dest, nir_ssa_def **src) +{ + ttn_move_dest(b, dest, nir_bcsel(b, +nir_ine(b, src[0], nir_imm_float(b, 0.0)), Doing nir_imm_int(b, 0) here would make more sense. Yeah, now that I have it :) +static void +ttn_kill_if(nir_builder *b, nir_op op, nir_alu_dest dest, nir_ssa_def **src) +{ + nir_if *if_stmt = nir_if_create(b-shader); + if_stmt-condition = + nir_src_for_ssa(nir_bany4(b, nir_flt(b, src[0], nir_imm_float(b, 0.0; + nir_cf_node_insert_end(b-cf_node_list, if_stmt-cf_node); + + nir_intrinsic_instr *discard = + nir_intrinsic_instr_create(b-shader, nir_intrinsic_discard); + nir_instr_insert_after_cf_list(if_stmt-then_list, discard-instr); A while
[Mesa-dev] [PATCH 3/3] gallium: Add tgsi_to_nir to get a nir_shader for a TGSI shader.
This will be used by the VC4 driver for doing device-independent optimization, and hopefully eventually replacing its whole IR. It also may be useful to other drivers for the same reason. v2: Add all of the instructions I was relying on tgsi_lowering to remove, and more. v3: Rebase on SSA rework of the builder. v4: Use the NIR ineg operation instead of doing a src modifier. v5: Don't use ineg for fnegs. (infer_src_type on MOV doesn't do what I expect, again). v6: Fix handling of multi-channel KILL_IF sources. v7: Make ttn_get_f() return a swizzle of a scalar load_const, rather than a vector load_const. CSE doesn't recognize that srcs out of those channels are actually all the same. v8: Rebase on nir_builder auto-sizing, make the scalar arguments to non-ALU instructions actually be scalars. v9: Add support for if/loop instructions, additional texture targets, and untested support for indirect addressing on temps. v10: Rebase on master, drop bad comment about control flow and just choose the X channel, use int comparison opcodes in LIT for now, drop unused pipe_context argument.. v11: Fix translation of LRP (previously missed because I mis-translated back out), use nir_builder init helpers. v12: Rebase on master, adding explicit include of mtypes.h to get INTERP_QUALIFIER_* v13: Rebase on variables being in lists instead of hash tables, drop use of mtypes.h in favor of util/pipeline.h. Use Ken's nir_builder swizzle and fmov/imov_alu helpers, drop struct in front of nir_builder, use nir_builder directly as the function arg in a lot of cases, drop redundant members of ttn_compile that are also in nir_builder, drop some half-baked malloc failure handling. --- This series is present as part of my vc4-nir-rebase-qir-2 branch. The shader-db results across the branch are: total uniforms in shared programs: 13433 - 13434 (0.01%) uniforms in affected programs: 62 - 63 (1.61%) total instructions in shared programs: 40003 - 39794 (-0.52%) instructions in affected programs: 15494 - 15285 (-1.35%) I don't get to delete my driver's optimization code in that branch yet, notably because of optimization available in the programmable blending I have to generate. src/gallium/auxiliary/Makefile.sources |1 + src/gallium/auxiliary/nir/tgsi_to_nir.c | 1425 +++ src/gallium/auxiliary/nir/tgsi_to_nir.h | 30 + 3 files changed, 1456 insertions(+) create mode 100644 src/gallium/auxiliary/nir/tgsi_to_nir.c create mode 100644 src/gallium/auxiliary/nir/tgsi_to_nir.h diff --git a/src/gallium/auxiliary/Makefile.sources b/src/gallium/auxiliary/Makefile.sources index 09496fa..08e4e4c 100644 --- a/src/gallium/auxiliary/Makefile.sources +++ b/src/gallium/auxiliary/Makefile.sources @@ -69,6 +69,7 @@ C_SOURCES := \ indices/u_indices_priv.h \ indices/u_primconvert.c \ indices/u_primconvert.h \ + nir/tgsi_to_nir.c \ os/os_memory_aligned.h \ os/os_memory_debug.h \ os/os_memory_stdc.h \ diff --git a/src/gallium/auxiliary/nir/tgsi_to_nir.c b/src/gallium/auxiliary/nir/tgsi_to_nir.c new file mode 100644 index 000..e8cf9f4 --- /dev/null +++ b/src/gallium/auxiliary/nir/tgsi_to_nir.c @@ -0,0 +1,1425 @@ +/* + * Copyright © 2014-2015 Broadcom + * Copyright (C) 2014 Rob Clark robcl...@freedesktop.org + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the Software), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. + */ + +#include util/ralloc.h +#include glsl/nir/nir.h +#include glsl/nir/nir_builder.h +#include glsl/list.h +#include util/pipeline.h + +#include nir/tgsi_to_nir.h +#include tgsi/tgsi_parse.h +#include tgsi/tgsi_dump.h +#include tgsi/tgsi_info.h +#include tgsi/tgsi_scan.h + +#define SWIZ(X, Y, Z, W) (unsigned[4]){ \ + TGSI_SWIZZLE_##X, \ + TGSI_SWIZZLE_##Y, \ + TGSI_SWIZZLE_##Z, \ +
Re: [Mesa-dev] [PATCH 3/3] gallium: Add tgsi_to_nir to get a nir_shader for a TGSI shader.
On Fri, Mar 27, 2015 at 4:54 PM, Eric Anholt e...@anholt.net wrote: diff --git a/src/gallium/auxiliary/Makefile.sources b/src/gallium/auxiliary/Makefile.sources index 09496fa..08e4e4c 100644 --- a/src/gallium/auxiliary/Makefile.sources +++ b/src/gallium/auxiliary/Makefile.sources @@ -69,6 +69,7 @@ C_SOURCES := \ indices/u_indices_priv.h \ indices/u_primconvert.c \ indices/u_primconvert.h \ + nir/tgsi_to_nir.c \ os/os_memory_aligned.h \ os/os_memory_debug.h \ os/os_memory_stdc.h \ I believe you're supposed to include .h files in these lists as well nowadays, since Emil uses them to generate the list of all files to include in the distribution tarball. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 3/3] gallium: Add tgsi_to_nir to get a nir_shader for a TGSI shader.
On Friday, March 27, 2015 01:54:32 PM Eric Anholt wrote: This will be used by the VC4 driver for doing device-independent optimization, and hopefully eventually replacing its whole IR. It also may be useful to other drivers for the same reason. v2: Add all of the instructions I was relying on tgsi_lowering to remove, and more. v3: Rebase on SSA rework of the builder. v4: Use the NIR ineg operation instead of doing a src modifier. v5: Don't use ineg for fnegs. (infer_src_type on MOV doesn't do what I expect, again). v6: Fix handling of multi-channel KILL_IF sources. v7: Make ttn_get_f() return a swizzle of a scalar load_const, rather than a vector load_const. CSE doesn't recognize that srcs out of those channels are actually all the same. v8: Rebase on nir_builder auto-sizing, make the scalar arguments to non-ALU instructions actually be scalars. v9: Add support for if/loop instructions, additional texture targets, and untested support for indirect addressing on temps. v10: Rebase on master, drop bad comment about control flow and just choose the X channel, use int comparison opcodes in LIT for now, drop unused pipe_context argument.. v11: Fix translation of LRP (previously missed because I mis-translated back out), use nir_builder init helpers. v12: Rebase on master, adding explicit include of mtypes.h to get INTERP_QUALIFIER_* v13: Rebase on variables being in lists instead of hash tables, drop use of mtypes.h in favor of util/pipeline.h. Use Ken's nir_builder swizzle and fmov/imov_alu helpers, drop struct in front of nir_builder, use nir_builder directly as the function arg in a lot of cases, drop redundant members of ttn_compile that are also in nir_builder, drop some half-baked malloc failure handling. --- This series is present as part of my vc4-nir-rebase-qir-2 branch. The shader-db results across the branch are: total uniforms in shared programs: 13433 - 13434 (0.01%) uniforms in affected programs: 62 - 63 (1.61%) total instructions in shared programs: 40003 - 39794 (-0.52%) instructions in affected programs: 15494 - 15285 (-1.35%) I don't get to delete my driver's optimization code in that branch yet, notably because of optimization available in the programmable blending I have to generate. Hi Eric! I have a bunch of comments below, but overall this looks great. You should probably have someone who knows TGSI better than I do review it, but for what it's worth, this is: Reviewed-by: Kenneth Graunke kenn...@whitecape.org src/gallium/auxiliary/Makefile.sources |1 + src/gallium/auxiliary/nir/tgsi_to_nir.c | 1425 +++ src/gallium/auxiliary/nir/tgsi_to_nir.h | 30 + 3 files changed, 1456 insertions(+) create mode 100644 src/gallium/auxiliary/nir/tgsi_to_nir.c create mode 100644 src/gallium/auxiliary/nir/tgsi_to_nir.h diff --git a/src/gallium/auxiliary/Makefile.sources b/src/gallium/auxiliary/Makefile.sources index 09496fa..08e4e4c 100644 --- a/src/gallium/auxiliary/Makefile.sources +++ b/src/gallium/auxiliary/Makefile.sources @@ -69,6 +69,7 @@ C_SOURCES := \ indices/u_indices_priv.h \ indices/u_primconvert.c \ indices/u_primconvert.h \ + nir/tgsi_to_nir.c \ os/os_memory_aligned.h \ os/os_memory_debug.h \ os/os_memory_stdc.h \ diff --git a/src/gallium/auxiliary/nir/tgsi_to_nir.c b/src/gallium/auxiliary/nir/tgsi_to_nir.c new file mode 100644 index 000..e8cf9f4 --- /dev/null +++ b/src/gallium/auxiliary/nir/tgsi_to_nir.c @@ -0,0 +1,1425 @@ +/* + * Copyright © 2014-2015 Broadcom + * Copyright (C) 2014 Rob Clark robcl...@freedesktop.org + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the Software), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. + */ + +#include util/ralloc.h +#include glsl/nir/nir.h +#include glsl/nir/nir_builder.h