On Wed, May 7, 2014 at 8:38 PM, Ilia Mirkin <imir...@alum.mit.edu> wrote:
> So... this shader (from
> generated_tests/spec/glsl-1.10/execution/built-in-functions/fs-op-eq-mat2-mat2.shader_test):
>
> uniform mat2 arg0;
> uniform mat2 arg1;
>
> void main()
> {
> bool result = (arg0 == arg1);
> gl_FragColor = vec4(result, 0.0, 0.0, 0.0);
> }
>
> Which becomes the following IR:
>
> (
> (declare (shader_out ) vec4 gl_FragColor)
> (declare (temporary ) vec4 gl_FragColor)
> (declare (uniform ) mat2 arg0)
> (declare (uniform ) mat2 arg1)
> (function main
> (signature void
> (parameters
> )
> (
> (declare (temporary ) vec4 vec_ctor)
> (assign (yzw) (var_ref vec_ctor) (constant vec3 (0.0 0.0 0.0)) )
> (declare (temporary ) bvec2 mat_cmp_bvec)
> (assign (x) (var_ref mat_cmp_bvec) (expression bool any_nequal
> (array_ref (var_ref arg1) (constant int (0)) ) (array_ref (var_ref
> arg0) (constant int (0)) ) ) )
> (assign (y) (var_ref mat_cmp_bvec) (expression bool any_nequal
> (array_ref (var_ref arg1) (constant int (1)) ) (array_ref (var_ref
> arg0) (constant int (1)) ) ) )
> (assign (x) (var_ref vec_ctor) (expression float b2f
> (expression bool ! (expression bool any (var_ref mat_cmp_bvec) ) ) ) )
> (assign (xyzw) (var_ref gl_FragColor) (var_ref vec_ctor) )
> (assign (xyzw) (var_ref gl_FragColor@4) (var_ref gl_FragColor) )
> ))
>
> )
>
>
> When converted to TGSI becomes:
>
> FRAG
> PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1
> DCL OUT[0], COLOR
> DCL CONST[0..3]
> DCL TEMP[0..2], LOCAL
> IMM[0] FLT32 { 0.0000, 1.0000, 0.0000, 0.0000}
> IMM[1] INT32 {0, 0, 0, 0}
> 0: MOV TEMP[0].yzw, IMM[0].xxxx
> 1: FSNE TEMP[1].xy, CONST[2].xyyy, CONST[0].xyyy
> 2: OR TEMP[1].x, TEMP[1].xxxx, TEMP[1].yyyy
> 3: FSNE TEMP[2].xy, CONST[3].xyyy, CONST[1].xyyy
> 4: OR TEMP[2].x, TEMP[2].xxxx, TEMP[2].yyyy
> 5: MOV TEMP[1].y, TEMP[2].xxxx
> 6: DP2 TEMP[1].x, TEMP[1].xyyy, TEMP[1].xyyy
> 7: USNE TEMP[1].x, TEMP[1].xxxx, IMM[1].xxxx
> 8: NOT TEMP[1].x, TEMP[1].xxxx
> 9: AND TEMP[0].x, TEMP[1].xxxx, IMM[0].yyyy
> 10: MOV OUT[0], TEMP[0]
> 11: END
>
> Note that FSNE/OR are used, implying that the integer version of these
> is expected. However then it goes on to use DP2, which, as I
> understand, does a floating point multiply + add. Now, this _happens_
> to work out, since the integer representations of float 0 and int 0
> are the same, and those are really the only possibilities we care
> about.
>
> However this seems really dodgy... wouldn't it be clearer to use
> either SNE + OR (which would still work!) + DP2, or alternatively AND
> them all together instead of SNE/DP2? This seems to come in via
> ir_unop_any_nequal. IMO the latter would be better since it keeps
Erm, sorry -- the email subject and this sentence aren't _quite_
accurate. That should be ir_unop_any. ir_binop_any_nequal is what
generates the FSNE/OR combos. But everything else still holds :)

> things in integer space, and presumably ANDs are cheaper than
> fmul/fadd.
>
> I noticed this because nouveau's codegen logic isn't able to optimize
> this intelligently and I was trying to figure out why.
>
> Thoughts?
>
> -ilia
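
To make the "happens to work out" point concrete, here is a small Python sketch that models instructions 6-9 of the TGSI above at the bit level and compares them with an integer-only variant. It is not Mesa code. The helper names are made up for illustration, and it assumes that FSNE/USNE write all-ones (0xFFFFFFFF) for true and 0 for false, which is what the final AND against 1.0 suggests. The integer-only path shown (a plain OR of the two components) is just one reading of the "keep things in integer space" idea.

```python
import struct

# Assumption: integer-style booleans, all bits set for true, zero for false.
TRUE = 0xFFFFFFFF
FALSE = 0x00000000


def bits_to_float(u):
    """Reinterpret a 32-bit pattern as an IEEE-754 single."""
    return struct.unpack("<f", struct.pack("<I", u))[0]


def float_to_bits(f):
    """Reinterpret an IEEE-754 single as a 32-bit pattern."""
    return struct.unpack("<I", struct.pack("<f", f))[0]


def any_via_dp2(x, y):
    """Instructions 6-7: float DP2 on the raw bits, then USNE against int 0."""
    fx, fy = bits_to_float(x), bits_to_float(y)
    dot = fx * fx + fy * fy  # float multiply + add on integer booleans
    return TRUE if float_to_bits(dot) != 0 else FALSE


def any_via_or(x, y):
    """Hypothetical integer-only alternative: OR the components together."""
    return x | y


def not_then_b2f(mask):
    """Instructions 8-9: NOT the mask, then AND with the bits of 1.0."""
    inverted = ~mask & 0xFFFFFFFF
    return bits_to_float(inverted & float_to_bits(1.0))


if __name__ == "__main__":
    for x in (FALSE, TRUE):
        for y in (FALSE, TRUE):
            dp2, ior = any_via_dp2(x, y), any_via_or(x, y)
            print(f"x={x:08x} y={y:08x} dp2={dp2:08x} or={ior:08x} "
                  f"out={not_then_b2f(ior)}")
```

In this model the false case works because float 0.0 and int 0 share a bit pattern. The true case works only because 0xFFFFFFFF read as a float is a NaN, and the NaN that comes out of the multiply-add still has nonzero bits for USNE to see. Real hardware may treat NaNs differently, so treat this as an illustration of the concern rather than a statement about any particular driver.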