Re: [Mesa-dev] [PATCH 3/3] gallium: Add tgsi_to_nir to get a nir_shader for a TGSI shader.

2015-03-30 Thread Eric Anholt
Kenneth Graunke kenn...@whitecape.org writes:

 On Friday, March 27, 2015 01:54:32 PM Eric Anholt wrote:
 This will be used by the VC4 driver for doing device-independent
 optimization, and hopefully eventually replacing its whole IR.  It also
 may be useful to other drivers for the same reason.

 Hi Eric!

 I have a bunch of comments below, but overall this looks great.

 You should probably have someone who knows TGSI better than I do review
 it, but for what it's worth, this is:

 Reviewed-by: Kenneth Graunke kenn...@whitecape.org

Thanks!  There was definitely useful feedback in here, and I've taken
most of it.

 +/* LOG - Approximate Logarithm Base 2
 + *  dst.x = \lfloor\log_2{|src.x|}\rfloor
 + *  dst.y = \frac{|src.x|}{2^{\lfloor\log_2{|src.x|}\rfloor}}
 + *  dst.z = \log_2{|src.x|}
 + *  dst.w = 1.0
 + */
 +static void
 +ttn_log(nir_builder *b, nir_op op, nir_alu_dest dest, nir_ssa_def **src)
 +{
 +   nir_ssa_def *abs_srcx = nir_fabs(b, ttn_channel(b, src[0], X));
 +   nir_ssa_def *log2 = nir_flog2(b, abs_srcx);
 +
 +   ttn_move_dest_masked(b, dest, nir_ffloor(b, log2), TGSI_WRITEMASK_X);
 +   ttn_move_dest_masked(b, dest,
 +nir_fdiv(b, abs_srcx, nir_fexp2(b, nir_ffloor(b, 
 log2))),

 You're generating two copies of floor(log2) here, which will have to be
 CSE'd later.  In prog_to_nir, I created a temporary and used it in both
 places:

nir_ssa_def *floor_log2 = nir_ffloor(b, log2);

 We're generating tons of rubbish for NIR to optimize anyway, so it's not
 a big deal...but...may as well do the trivial improvement.

I much more expect the whole mess except for the dst.z computation to
get DCEed away, so it's just one extra DCE out of so many.

(and we generate lots of copy prop to avoid all throughout this mess,
which I've considered short-circuting in nir_builder some day).

 +static void
 +ttn_sle(nir_builder *b, nir_op op, nir_alu_dest dest, nir_ssa_def **src)
 +{
 +   ttn_move_dest(b, dest, nir_sge(b, src[1], src[0]));
 +}

 I've got code here to generate b2f(fge(...)) instead of sge(...) since I
 didn't want to bother implementing it in my driver, and figured the b2fs
 might be able to get optimized away.

 That said, I suppose we could probably just add lowering transformations
 that turn sge - b2f(fge(...)) when options-native_integers is set, and
 delete my code...

For me an SGE in hardware is:

fsub.sf(null, src0, src1)
mov.nc(dest, 1.0)
mov.ns(dest, 0)

while your plan would be... oh wait.  I didn't even have a b2f
implementation because TGSI doesn't do that (they just AND the bool with
1.0).  But an FGE is:

fsub.sf(null, src0, src1)
mov.nc(dest, ~0)
mov.ns(dest, 0)

so any more instructions would be worse.

 +static void
 +ttn_xpd(nir_builder *b, nir_op op, nir_alu_dest dest, nir_ssa_def **src)
 +{
 +   ttn_move_dest_masked(b, dest,
 +nir_fsub(b,
 + nir_fmul(b,
 +  ttn_swizzle(b, src[0], Y, Z, X, 
 X),
 +  ttn_swizzle(b, src[1], Z, X, Y, 
 X)),
 + nir_fmul(b,
 +  ttn_swizzle(b, src[1], Y, Z, X, 
 X),
 +  ttn_swizzle(b, src[0], Z, X, Y, 
 X))),
 +TGSI_WRITEMASK_XYZ);
 +   ttn_move_dest_masked(b, dest, nir_imm_float(b, 1.0), TGSI_WRITEMASK_W);
 +}
 +
 +static void
 +ttn_dp2a(nir_builder *b, nir_op op, nir_alu_dest dest, nir_ssa_def **src)
 +{
 +   ttn_move_dest(b, dest,
 + ttn_channel(b, nir_fadd(b,
 + ttn_channel(b, nir_fdot2(b, 
 src[0],
 +  src[1]),
 + X),

 Do you really need to do ttn_channel(b, ..., X) on a fdot2 result?  It's
 already a scalar value.  Same comment applies to the below four.

 I should probably delete that from prog_to_nir as well.

Good catch, the cleanups for scalar in the builder have obsoleted this,
I think.


 + src[2]),
 + X));
 +}

 +static void
 +ttn_ucmp(nir_builder *b, nir_op op, nir_alu_dest dest, nir_ssa_def **src)
 +{
 +   ttn_move_dest(b, dest, nir_bcsel(b,
 +nir_ine(b, src[0], nir_imm_float(b, 
 0.0)),

 Doing nir_imm_int(b, 0) here would make more sense.

Yeah, now that I have it :)

 +static void
 +ttn_kill_if(nir_builder *b, nir_op op, nir_alu_dest dest, nir_ssa_def 
 **src)
 +{
 +   nir_if *if_stmt = nir_if_create(b-shader);
 +   if_stmt-condition =
 +  nir_src_for_ssa(nir_bany4(b, nir_flt(b, src[0], nir_imm_float(b, 
 0.0;
 +   nir_cf_node_insert_end(b-cf_node_list, if_stmt-cf_node);
 +
 +   nir_intrinsic_instr *discard =
 +  nir_intrinsic_instr_create(b-shader, nir_intrinsic_discard);
 +   nir_instr_insert_after_cf_list(if_stmt-then_list, discard-instr);

 A while 

Re: [Mesa-dev] [PATCH 3/3] gallium: Add tgsi_to_nir to get a nir_shader for a TGSI shader.

2015-03-27 Thread Ilia Mirkin
On Fri, Mar 27, 2015 at 4:54 PM, Eric Anholt e...@anholt.net wrote:
 diff --git a/src/gallium/auxiliary/Makefile.sources 
 b/src/gallium/auxiliary/Makefile.sources
 index 09496fa..08e4e4c 100644
 --- a/src/gallium/auxiliary/Makefile.sources
 +++ b/src/gallium/auxiliary/Makefile.sources
 @@ -69,6 +69,7 @@ C_SOURCES := \
 indices/u_indices_priv.h \
 indices/u_primconvert.c \
 indices/u_primconvert.h \
 +   nir/tgsi_to_nir.c \
 os/os_memory_aligned.h \
 os/os_memory_debug.h \
 os/os_memory_stdc.h \

I believe you're supposed to include .h files in these lists as well
nowadays, since Emil uses them to generate the list of all files to
include in the distribution tarball.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/3] gallium: Add tgsi_to_nir to get a nir_shader for a TGSI shader.

2015-03-27 Thread Kenneth Graunke
On Friday, March 27, 2015 01:54:32 PM Eric Anholt wrote:
 This will be used by the VC4 driver for doing device-independent
 optimization, and hopefully eventually replacing its whole IR.  It also
 may be useful to other drivers for the same reason.
 
 v2: Add all of the instructions I was relying on tgsi_lowering to remove,
 and more.
 v3: Rebase on SSA rework of the builder.
 v4: Use the NIR ineg operation instead of doing a src modifier.
 v5: Don't use ineg for fnegs.  (infer_src_type on MOV doesn't do what I
 expect, again).
 v6: Fix handling of multi-channel KILL_IF sources.
 v7: Make ttn_get_f() return a swizzle of a scalar load_const, rather than
 a vector load_const.  CSE doesn't recognize that srcs out of those
 channels are actually all the same.
 v8: Rebase on nir_builder auto-sizing, make the scalar arguments to
 non-ALU instructions actually be scalars.
 v9: Add support for if/loop instructions, additional texture targets, and
 untested support for indirect addressing on temps.
 v10: Rebase on master, drop bad comment about control flow and just choose
  the X channel, use int comparison opcodes in LIT for now, drop unused
  pipe_context argument..
 v11: Fix translation of LRP (previously missed because I mis-translated
  back out), use nir_builder init helpers.
 v12: Rebase on master, adding explicit include of mtypes.h to get
  INTERP_QUALIFIER_*
 v13: Rebase on variables being in lists instead of hash tables, drop use
  of mtypes.h in favor of util/pipeline.h.  Use Ken's nir_builder
  swizzle and fmov/imov_alu helpers, drop struct in front of
  nir_builder, use nir_builder directly as the function arg in a lot of
  cases, drop redundant members of ttn_compile that are also in
  nir_builder, drop some half-baked malloc failure handling.
 ---
 
 This series is present as part of my vc4-nir-rebase-qir-2 branch.  The
 shader-db results across the branch are:
 
 total uniforms in shared programs: 13433 - 13434 (0.01%)
 uniforms in affected programs: 62 - 63 (1.61%)
 total instructions in shared programs: 40003 - 39794 (-0.52%)
 instructions in affected programs: 15494 - 15285 (-1.35%)
 
 I don't get to delete my driver's optimization code in that branch
 yet, notably because of optimization available in the programmable
 blending I have to generate.

Hi Eric!

I have a bunch of comments below, but overall this looks great.

You should probably have someone who knows TGSI better than I do review
it, but for what it's worth, this is:

Reviewed-by: Kenneth Graunke kenn...@whitecape.org

 
  src/gallium/auxiliary/Makefile.sources  |1 +
  src/gallium/auxiliary/nir/tgsi_to_nir.c | 1425 
+++
  src/gallium/auxiliary/nir/tgsi_to_nir.h |   30 +
  3 files changed, 1456 insertions(+)
  create mode 100644 src/gallium/auxiliary/nir/tgsi_to_nir.c
  create mode 100644 src/gallium/auxiliary/nir/tgsi_to_nir.h
 
 diff --git a/src/gallium/auxiliary/Makefile.sources 
b/src/gallium/auxiliary/Makefile.sources
 index 09496fa..08e4e4c 100644
 --- a/src/gallium/auxiliary/Makefile.sources
 +++ b/src/gallium/auxiliary/Makefile.sources
 @@ -69,6 +69,7 @@ C_SOURCES := \
   indices/u_indices_priv.h \
   indices/u_primconvert.c \
   indices/u_primconvert.h \
 + nir/tgsi_to_nir.c \
   os/os_memory_aligned.h \
   os/os_memory_debug.h \
   os/os_memory_stdc.h \
 diff --git a/src/gallium/auxiliary/nir/tgsi_to_nir.c 
b/src/gallium/auxiliary/nir/tgsi_to_nir.c
 new file mode 100644
 index 000..e8cf9f4
 --- /dev/null
 +++ b/src/gallium/auxiliary/nir/tgsi_to_nir.c
 @@ -0,0 +1,1425 @@
 +/*
 + * Copyright © 2014-2015 Broadcom
 + * Copyright (C) 2014 Rob Clark robcl...@freedesktop.org
 + *
 + * Permission is hereby granted, free of charge, to any person obtaining a
 + * copy of this software and associated documentation files (the 
Software),
 + * to deal in the Software without restriction, including without 
limitation
 + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
 + * and/or sell copies of the Software, and to permit persons to whom the
 + * Software is furnished to do so, subject to the following conditions:
 + *
 + * The above copyright notice and this permission notice (including the 
next
 + * paragraph) shall be included in all copies or substantial portions of 
the
 + * Software.
 + *
 + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS 
OR
 + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
 + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR 
OTHER
 + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
 + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 
DEALINGS
 + * IN THE SOFTWARE.
 + */
 +
 +#include util/ralloc.h
 +#include glsl/nir/nir.h
 +#include glsl/nir/nir_builder.h