Re: [Mesa-dev] odd translation from glsl to tgsi for ir_unop_any_nequal
On Wed, May 7, 2014 at 10:55 PM, Ilia Mirkin imir...@alum.mit.edu wrote:
> On Wed, May 7, 2014 at 8:38 PM, Ilia Mirkin imir...@alum.mit.edu wrote:
>> Note that FSNE/OR are used, implying that the integer version of these
>> is expected. However then it goes on to use DP2, which, as I understand,
>> does a floating point multiply + add. Now, this _happens_ to work out,
>> since the integer representations of float 0 and int 0 are the same, and
>> those are really the only possibilities we care about. However this seems
>> really dodgy... wouldn't it be clearer to use either SNE + OR (which
>> would still work!) + DP2, or alternatively AND them all together instead
>> of SNE/DP2? This seems to come in via ir_unop_any_nequal. IMO the latter
>> would be better since it keeps
>
> Erm, sorry -- the email subject and this sentence aren't _quite_
> accurate. That should be ir_unop_any. ir_binop_any_nequal is what
> generates the FSNE/OR combos. But everything else still holds :)
>
>> things in integer space, and presumably AND's are cheaper than
>> fmul/fadd. I noticed this because nouveau's codegen logic isn't able to
>> optimize this intelligently and I was trying to figure out why.
>>
>> Thoughts?

I sent a patch that implements a native integers version of ir_unop_any:

http://patchwork.freedesktop.org/patch/25569/

From the overall symmetry of things, it seems like this was just forgotten when native integer support was added. All the other any_equal/etc. cases have: if (native_integers) do OR + etc., else DP2 + etc.

-ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
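To make the "if (native_integers) do OR, else DP2" symmetry concrete, here is a rough Python model of the two ways a boolean vector could be reduced for `any`. It is only a sketch and not Mesa code: the function names `any_native_int` and `any_float` are made up for illustration, and the boolean encodings (~0/0 on the integer path, 1.0/0.0 on the float path) are assumptions based on how FSNE and SNE results are usually described.

```python
MASK32 = 0xFFFFFFFF  # keep results in 32-bit range, like a TGSI register channel


def any_native_int(components):
    """Integer path (hypothetical): booleans are ~0 or 0, so OR them together.

    The result is already a valid integer boolean, no compare needed.
    """
    result = 0
    for c in components:
        result |= c
    return result & MASK32


def any_float(components):
    """Float path (hypothetical): booleans are 1.0 or 0.0.

    Dot the vector with itself (what DP2 does for two components),
    then compare against zero (the SNE step) to get 1.0 or 0.0 back.
    """
    dot = sum(c * c for c in components)
    return 1.0 if dot != 0.0 else 0.0


if __name__ == "__main__":
    print(hex(any_native_int([0x00000000, 0xFFFFFFFF])))
    print(any_float([0.0, 1.0]))
```

The integer path needs only bitwise operations, which is the property the patch description is after; the float path needs a multiply-add followed by a compare.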
[Mesa-dev] odd translation from glsl to tgsi for ir_unop_any_nequal
So... this shader (from generated_tests/spec/glsl-1.10/execution/built-in-functions/fs-op-eq-mat2-mat2.shader_test):

uniform mat2 arg0;
uniform mat2 arg1;

void main()
{
  bool result = (arg0 == arg1);
  gl_FragColor = vec4(result, 0.0, 0.0, 0.0);
}

Which becomes the following IR:

(
(declare (shader_out ) vec4 gl_FragColor)
(declare (temporary ) vec4 gl_FragColor)
(declare (uniform ) mat2 arg0)
(declare (uniform ) mat2 arg1)
(function main
  (signature void
    (parameters
    )
    (
      (declare (temporary ) vec4 vec_ctor)
      (assign (yzw) (var_ref vec_ctor) (constant vec3 (0.0 0.0 0.0)) )
      (declare (temporary ) bvec2 mat_cmp_bvec)
      (assign (x) (var_ref mat_cmp_bvec) (expression bool any_nequal (array_ref (var_ref arg1) (constant int (0)) ) (array_ref (var_ref arg0) (constant int (0)) ) ) )
      (assign (y) (var_ref mat_cmp_bvec) (expression bool any_nequal (array_ref (var_ref arg1) (constant int (1)) ) (array_ref (var_ref arg0) (constant int (1)) ) ) )
      (assign (x) (var_ref vec_ctor) (expression float b2f (expression bool ! (expression bool any (var_ref mat_cmp_bvec) ) ) ) )
      (assign (xyzw) (var_ref gl_FragColor) (var_ref vec_ctor) )
      (assign (xyzw) (var_ref gl_FragColor@4) (var_ref gl_FragColor) )
    ))
)

When converted to TGSI, this becomes:

FRAG
PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1
DCL OUT[0], COLOR
DCL CONST[0..3]
DCL TEMP[0..2], LOCAL
IMM[0] FLT32 {0., 1., 0., 0.}
IMM[1] INT32 {0, 0, 0, 0}
  0: MOV TEMP[0].yzw, IMM[0].
  1: FSNE TEMP[1].xy, CONST[2].xyyy, CONST[0].xyyy
  2: OR TEMP[1].x, TEMP[1]., TEMP[1].
  3: FSNE TEMP[2].xy, CONST[3].xyyy, CONST[1].xyyy
  4: OR TEMP[2].x, TEMP[2]., TEMP[2].
  5: MOV TEMP[1].y, TEMP[2].
  6: DP2 TEMP[1].x, TEMP[1].xyyy, TEMP[1].xyyy
  7: USNE TEMP[1].x, TEMP[1]., IMM[1].
  8: NOT TEMP[1].x, TEMP[1].
  9: AND TEMP[0].x, TEMP[1]., IMM[0].
 10: MOV OUT[0], TEMP[0]
 11: END

Note that FSNE/OR are used, implying that the integer version of these is expected. However then it goes on to use DP2, which, as I understand, does a floating point multiply + add.

Now, this _happens_ to work out, since the integer representations of float 0 and int 0 are the same, and those are really the only possibilities we care about. However this seems really dodgy... wouldn't it be clearer to use either SNE + OR (which would still work!) + DP2, or alternatively AND them all together instead of SNE/DP2? This seems to come in via ir_unop_any_nequal. IMO the latter would be better since it keeps things in integer space, and presumably AND's are cheaper than fmul/fadd. I noticed this because nouveau's codegen logic isn't able to optimize this intelligently and I was trying to figure out why.

Thoughts?

-ilia
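For anyone wanting to check the "happens to work out" claim, the following Python sketch emulates the DP2 + USNE pair on raw 32-bit patterns. It rests on assumptions: that FSNE yields ~0 (0xFFFFFFFF) for true and 0 for false, and that the hardware follows ordinary IEEE 754 behaviour without flushing NaNs. The helper names are invented for this example, and Python does the arithmetic in double precision, which makes no difference for the zero and NaN inputs used here.

```python
import math
import struct

TRUE_BITS = 0xFFFFFFFF   # assumed integer-style "true" (~0), as FSNE would produce
FALSE_BITS = 0x00000000  # integer-style "false"


def bits_to_float(bits):
    """Reinterpret a 32-bit pattern as an IEEE 754 single-precision float."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def float_to_bits(value):
    """Reinterpret a float as its 32-bit single-precision pattern."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


def dp2_then_usne(x_bits, y_bits):
    """Emulate DP2 on (x, y) with itself, then USNE of the result against 0."""
    x = bits_to_float(x_bits)
    y = bits_to_float(y_bits)
    dot = x * x + y * y            # the floating point multiply + add
    return float_to_bits(dot) != 0  # integer compare on the raw bits


if __name__ == "__main__":
    for x in (FALSE_BITS, TRUE_BITS):
        for y in (FALSE_BITS, TRUE_BITS):
            print(hex(x), hex(y), dp2_then_usne(x, y))
```

Under these assumptions, all-zero inputs give float 0.0, whose bit pattern is integer 0, while ~0 reinterpreted as a float is a NaN; the NaN propagates through the multiply-add, and any NaN has a nonzero bit pattern, so the integer compare still reports true. That is correct but only by accident of the encodings, which is the point of the complaint.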
Re: [Mesa-dev] odd translation from glsl to tgsi for ir_unop_any_nequal
On Wed, May 7, 2014 at 8:38 PM, Ilia Mirkin imir...@alum.mit.edu wrote:
> Note that FSNE/OR are used, implying that the integer version of these
> is expected. However then it goes on to use DP2, which, as I understand,
> does a floating point multiply + add. Now, this _happens_ to work out,
> since the integer representations of float 0 and int 0 are the same, and
> those are really the only possibilities we care about. However this seems
> really dodgy... wouldn't it be clearer to use either SNE + OR (which
> would still work!) + DP2, or alternatively AND them all together instead
> of SNE/DP2? This seems to come in via ir_unop_any_nequal. IMO the latter
> would be better since it keeps

Erm, sorry -- the email subject and this sentence aren't _quite_ accurate. That should be ir_unop_any. ir_binop_any_nequal is what generates the FSNE/OR combos. But everything else still holds :)

> things in integer space, and presumably AND's are cheaper than
> fmul/fadd. I noticed this because nouveau's codegen logic isn't able to
> optimize this intelligently and I was trying to figure out why.
>
> Thoughts?

-ilia