Re: [Mesa-dev] odd translation from glsl to tgsi for ir_unop_any_nequal

2014-05-08 Thread Ilia Mirkin
On Wed, May 7, 2014 at 10:55 PM, Ilia Mirkin imir...@alum.mit.edu wrote:
 On Wed, May 7, 2014 at 8:38 PM, Ilia Mirkin imir...@alum.mit.edu wrote:
 So... this shader (from
 generated_tests/spec/glsl-1.10/execution/built-in-functions/fs-op-eq-mat2-mat2.shader_test):

 uniform mat2 arg0;
 uniform mat2 arg1;

 void main()
 {
   bool result = (arg0 == arg1);
   gl_FragColor = vec4(result, 0.0, 0.0, 0.0);
 }

 Which becomes the following IR:

 (
 (declare (shader_out ) vec4 gl_FragColor)
 (declare (temporary ) vec4 gl_FragColor)
 (declare (uniform ) mat2 arg0)
 (declare (uniform ) mat2 arg1)
 (function main
   (signature void
 (parameters
 )
 (
   (declare (temporary ) vec4 vec_ctor)
   (assign  (yzw) (var_ref vec_ctor)  (constant vec3 (0.0 0.0 0.0)) )
   (declare (temporary ) bvec2 mat_cmp_bvec)
   (assign  (x) (var_ref mat_cmp_bvec)  (expression bool any_nequal
 (array_ref (var_ref arg1) (constant int (0)) ) (array_ref (var_ref
 arg0) (constant int (0)) ) ) )
   (assign  (y) (var_ref mat_cmp_bvec)  (expression bool any_nequal
 (array_ref (var_ref arg1) (constant int (1)) ) (array_ref (var_ref
 arg0) (constant int (1)) ) ) )
   (assign  (x) (var_ref vec_ctor)  (expression float b2f
 (expression bool ! (expression bool any (var_ref mat_cmp_bvec) ) ) ) )
   (assign  (xyzw) (var_ref gl_FragColor)  (var_ref vec_ctor) )
   (assign  (xyzw) (var_ref gl_FragColor@4)  (var_ref gl_FragColor) )
 ))

 )


 When converted to TGSI, this becomes:

 FRAG
 PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1
 DCL OUT[0], COLOR
 DCL CONST[0..3]
 DCL TEMP[0..2], LOCAL
 IMM[0] FLT32 {0., 1., 0., 0.}
 IMM[1] INT32 {0, 0, 0, 0}
   0: MOV TEMP[0].yzw, IMM[0].
   1: FSNE TEMP[1].xy, CONST[2].xyyy, CONST[0].xyyy
   2: OR TEMP[1].x, TEMP[1]., TEMP[1].
   3: FSNE TEMP[2].xy, CONST[3].xyyy, CONST[1].xyyy
   4: OR TEMP[2].x, TEMP[2]., TEMP[2].
   5: MOV TEMP[1].y, TEMP[2].
   6: DP2 TEMP[1].x, TEMP[1].xyyy, TEMP[1].xyyy
   7: USNE TEMP[1].x, TEMP[1]., IMM[1].
   8: NOT TEMP[1].x, TEMP[1].
   9: AND TEMP[0].x, TEMP[1]., IMM[0].
  10: MOV OUT[0], TEMP[0]
  11: END

 Note that FSNE/OR are used, implying that the integer version of these
 is expected. However, it then goes on to use DP2, which, as I
 understand it, does a floating-point multiply + add. Now, this _happens_
 to work out, since the integer representations of float 0 and int 0
 are the same, and those are really the only possibilities we care
 about.

 However this seems really dodgy... wouldn't it be clearer to use
 either SNE + OR (which would still work!) + DP2, or alternatively AND
 them all together instead of SNE/DP2? This seems to come in via
 ir_unop_any_nequal. IMO the latter would be better since it keeps

 Erm, sorry -- the email subject and this sentence aren't _quite_
 accurate. That should be ir_unop_any. ir_binop_any_nequal is what
 generates the FSNE/OR combos. But everything else still holds :)

 things in integer space, and presumably AND's are cheaper than
 fmul/fadd.

 I noticed this because nouveau's codegen logic isn't able to optimize
 this intelligently and I was trying to figure out why.

 Thoughts?

I sent a patch that implements a native-integer version of
ir_unop_any: http://patchwork.freedesktop.org/patch/25569/

From the overall symmetry of things, it seems like this was just
forgotten when native integer support was added. All the other
any_equal/etc. cases have if (native_integers) do OR + etc. else
DP2 + etc.
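
As a rough illustration of that if/else shape, here is a sketch in Python. It is not the contents of the patch. The function name lower_any, the TRUE_BITS constant, and the choice of 0 / ~0 for integer booleans and 0.0 / 1.0 for float booleans are all assumptions made for the example.

```python
TRUE_BITS = 0xFFFFFFFF  # assumed integer "true" (~0); integer "false" is 0


def lower_any(components, native_integers):
    """Hypothetical model of reducing a boolean vector to one boolean."""
    if native_integers:
        # Integer booleans: OR the components together, staying in
        # integer space the whole time.
        result = 0
        for c in components:
            result |= c
        return result
    # Float booleans (0.0 / 1.0): dot product of the vector with itself,
    # then a float "set on not equal" against 0.0.
    dot = sum(c * c for c in components)
    return 1.0 if dot != 0.0 else 0.0


print(hex(lower_any([0, TRUE_BITS], native_integers=True)))
print(lower_any([0.0, 1.0], native_integers=False))
print(lower_any([0.0, 0.0], native_integers=False))
```

The point of the integer branch is only that no float multiply or add is involved; how a real backend would emit it is a separate question.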

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] odd translation from glsl to tgsi for ir_unop_any_nequal

2014-05-07 Thread Ilia Mirkin
So... this shader (from
generated_tests/spec/glsl-1.10/execution/built-in-functions/fs-op-eq-mat2-mat2.shader_test):

uniform mat2 arg0;
uniform mat2 arg1;

void main()
{
  bool result = (arg0 == arg1);
  gl_FragColor = vec4(result, 0.0, 0.0, 0.0);
}

Which becomes the following IR:

(
(declare (shader_out ) vec4 gl_FragColor)
(declare (temporary ) vec4 gl_FragColor)
(declare (uniform ) mat2 arg0)
(declare (uniform ) mat2 arg1)
(function main
  (signature void
(parameters
)
(
  (declare (temporary ) vec4 vec_ctor)
  (assign  (yzw) (var_ref vec_ctor)  (constant vec3 (0.0 0.0 0.0)) )
  (declare (temporary ) bvec2 mat_cmp_bvec)
  (assign  (x) (var_ref mat_cmp_bvec)  (expression bool any_nequal
(array_ref (var_ref arg1) (constant int (0)) ) (array_ref (var_ref
arg0) (constant int (0)) ) ) )
  (assign  (y) (var_ref mat_cmp_bvec)  (expression bool any_nequal
(array_ref (var_ref arg1) (constant int (1)) ) (array_ref (var_ref
arg0) (constant int (1)) ) ) )
  (assign  (x) (var_ref vec_ctor)  (expression float b2f
(expression bool ! (expression bool any (var_ref mat_cmp_bvec) ) ) ) )
  (assign  (xyzw) (var_ref gl_FragColor)  (var_ref vec_ctor) )
  (assign  (xyzw) (var_ref gl_FragColor@4)  (var_ref gl_FragColor) )
))

)
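
For readers not used to the IR dump, the following is a small Python model of what the IR above computes. The function names mirror the IR opcodes but are otherwise made up for this sketch, and the matrices are assumed to be lists of two columns.

```python
def any_nequal(a, b):
    """True if any pair of corresponding components differs."""
    return any(x != y for x, y in zip(a, b))


def mat2_equal_as_float(arg0, arg1):
    """Model of b2f(!any(mat_cmp_bvec)) for column-major mat2 inputs."""
    # One any_nequal per column, stored in .x and .y of mat_cmp_bvec.
    mat_cmp_bvec = [
        any_nequal(arg1[0], arg0[0]),
        any_nequal(arg1[1], arg0[1]),
    ]
    # b2f: true -> 1.0, false -> 0.0
    return 1.0 if not any(mat_cmp_bvec) else 0.0


identity = [[1.0, 0.0], [0.0, 1.0]]
changed = [[1.0, 0.0], [0.0, 2.0]]
print(mat2_equal_as_float(identity, identity))
print(mat2_equal_as_float(identity, changed))
```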


When converted to TGSI, this becomes:

FRAG
PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1
DCL OUT[0], COLOR
DCL CONST[0..3]
DCL TEMP[0..2], LOCAL
IMM[0] FLT32 {0., 1., 0., 0.}
IMM[1] INT32 {0, 0, 0, 0}
  0: MOV TEMP[0].yzw, IMM[0].
  1: FSNE TEMP[1].xy, CONST[2].xyyy, CONST[0].xyyy
  2: OR TEMP[1].x, TEMP[1]., TEMP[1].
  3: FSNE TEMP[2].xy, CONST[3].xyyy, CONST[1].xyyy
  4: OR TEMP[2].x, TEMP[2]., TEMP[2].
  5: MOV TEMP[1].y, TEMP[2].
  6: DP2 TEMP[1].x, TEMP[1].xyyy, TEMP[1].xyyy
  7: USNE TEMP[1].x, TEMP[1]., IMM[1].
  8: NOT TEMP[1].x, TEMP[1].
  9: AND TEMP[0].x, TEMP[1]., IMM[0].
 10: MOV OUT[0], TEMP[0]
 11: END

Note that FSNE/OR are used, implying that the integer version of these
is expected. However, it then goes on to use DP2, which, as I
understand it, does a floating-point multiply + add. Now, this _happens_
to work out, since the integer representations of float 0 and int 0
are the same, and those are really the only possibilities we care
about.
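
The "happens to work out" part can be checked with a small bit-level experiment. This is a sketch only: it assumes integer booleans are 0 for false and ~0 (0xFFFFFFFF) for true, and it models DP2 followed by USNE in plain Python rather than in any real driver. The helper names are made up.

```python
import struct

TRUE_BITS = 0xFFFFFFFF   # assumed integer "true" (~0)
FALSE_BITS = 0x00000000  # integer "false", same bits as float 0.0


def bits_to_float(bits):
    """Reinterpret a 32-bit pattern as an IEEE-754 single."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def float_to_bits(value):
    """Reinterpret an IEEE-754 single as a 32-bit pattern."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


def dp2_then_usne(x_bits, y_bits):
    """Model of DP2 on (x, y) with itself, then an integer != 0 test."""
    x = bits_to_float(x_bits)
    y = bits_to_float(y_bits)
    dot = x * x + y * y
    return float_to_bits(dot) != 0


def integer_or(x_bits, y_bits):
    """What staying in integer space would compute."""
    return (x_bits | y_bits) != 0


for x in (FALSE_BITS, TRUE_BITS):
    for y in (FALSE_BITS, TRUE_BITS):
        print(hex(x), hex(y), dp2_then_usne(x, y), integer_or(x, y))
```

Under that assumption, ~0 reinterpreted as a float is a NaN, so the float path only agrees with the integer path because the NaN survives the multiply and add and its bit pattern is nonzero. That is a much weaker guarantee than the plain OR gives.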

However this seems really dodgy... wouldn't it be clearer to use
either SNE + OR (which would still work!) + DP2, or alternatively AND
them all together instead of SNE/DP2? This seems to come in via
ir_unop_any_nequal. IMO the latter would be better since it keeps
things in integer space, and presumably AND's are cheaper than
fmul/fadd.

I noticed this because nouveau's codegen logic isn't able to optimize
this intelligently and I was trying to figure out why.

Thoughts?

  -ilia


Re: [Mesa-dev] odd translation from glsl to tgsi for ir_unop_any_nequal

2014-05-07 Thread Ilia Mirkin
On Wed, May 7, 2014 at 8:38 PM, Ilia Mirkin imir...@alum.mit.edu wrote:
 So... this shader (from
 generated_tests/spec/glsl-1.10/execution/built-in-functions/fs-op-eq-mat2-mat2.shader_test):

 uniform mat2 arg0;
 uniform mat2 arg1;

 void main()
 {
   bool result = (arg0 == arg1);
   gl_FragColor = vec4(result, 0.0, 0.0, 0.0);
 }

 Which becomes the following IR:

 (
 (declare (shader_out ) vec4 gl_FragColor)
 (declare (temporary ) vec4 gl_FragColor)
 (declare (uniform ) mat2 arg0)
 (declare (uniform ) mat2 arg1)
 (function main
   (signature void
 (parameters
 )
 (
   (declare (temporary ) vec4 vec_ctor)
   (assign  (yzw) (var_ref vec_ctor)  (constant vec3 (0.0 0.0 0.0)) )
   (declare (temporary ) bvec2 mat_cmp_bvec)
   (assign  (x) (var_ref mat_cmp_bvec)  (expression bool any_nequal
 (array_ref (var_ref arg1) (constant int (0)) ) (array_ref (var_ref
 arg0) (constant int (0)) ) ) )
   (assign  (y) (var_ref mat_cmp_bvec)  (expression bool any_nequal
 (array_ref (var_ref arg1) (constant int (1)) ) (array_ref (var_ref
 arg0) (constant int (1)) ) ) )
   (assign  (x) (var_ref vec_ctor)  (expression float b2f
 (expression bool ! (expression bool any (var_ref mat_cmp_bvec) ) ) ) )
   (assign  (xyzw) (var_ref gl_FragColor)  (var_ref vec_ctor) )
   (assign  (xyzw) (var_ref gl_FragColor@4)  (var_ref gl_FragColor) )
 ))

 )


 When converted to TGSI, this becomes:

 FRAG
 PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1
 DCL OUT[0], COLOR
 DCL CONST[0..3]
 DCL TEMP[0..2], LOCAL
 IMM[0] FLT32 {0., 1., 0., 0.}
 IMM[1] INT32 {0, 0, 0, 0}
   0: MOV TEMP[0].yzw, IMM[0].
   1: FSNE TEMP[1].xy, CONST[2].xyyy, CONST[0].xyyy
   2: OR TEMP[1].x, TEMP[1]., TEMP[1].
   3: FSNE TEMP[2].xy, CONST[3].xyyy, CONST[1].xyyy
   4: OR TEMP[2].x, TEMP[2]., TEMP[2].
   5: MOV TEMP[1].y, TEMP[2].
   6: DP2 TEMP[1].x, TEMP[1].xyyy, TEMP[1].xyyy
   7: USNE TEMP[1].x, TEMP[1]., IMM[1].
   8: NOT TEMP[1].x, TEMP[1].
   9: AND TEMP[0].x, TEMP[1]., IMM[0].
  10: MOV OUT[0], TEMP[0]
  11: END

 Note that FSNE/OR are used, implying that the integer version of these
 is expected. However, it then goes on to use DP2, which, as I
 understand it, does a floating-point multiply + add. Now, this _happens_
 to work out, since the integer representations of float 0 and int 0
 are the same, and those are really the only possibilities we care
 about.

 However this seems really dodgy... wouldn't it be clearer to use
 either SNE + OR (which would still work!) + DP2, or alternatively AND
 them all together instead of SNE/DP2? This seems to come in via
 ir_unop_any_nequal. IMO the latter would be better since it keeps

Erm, sorry -- the email subject and this sentence aren't _quite_
accurate. That should be ir_unop_any. ir_binop_any_nequal is what
generates the FSNE/OR combos. But everything else still holds :)

 things in integer space, and presumably AND's are cheaper than
 fmul/fadd.

 I noticed this because nouveau's codegen logic isn't able to optimize
 this intelligently and I was trying to figure out why.

 Thoughts?

   -ilia