On Wed, 18 Apr 2012 23:29:31 -0700, Kenneth Graunke <kenn...@whitecape.org> wrote:
> Consider the following code:
> dp4(8) g8<1>.xF g1<4,4,1>F g5<4,4,1>F { align16 WE_normal 1Q };
> mov(8) m3<1>.xF g8<4,4,1>.xF { align16 WE_normal 1Q };
>
> Thanks to our existing compute-to-MRF code, this becomes:
> dp4(8) m3<1>.xF g1<4,4,1>F g5<4,4,1>F { align16 WE_normal 1Q };
>
> However:
> dp4(8) g8<1>.xF g1<4,4,1>F g5<4,4,1>F { align16 WE_normal 1Q };
> mov(8) m3<1>.yF g8<4,4,1>.xF { align16 WE_normal 1Q };
> does not get optimized since the MRF and temporary GRF use different
> components, and the code does not yet support rewriting swizzles in the
> general case. Scalars are an easy special case: since there's only one
> component, you can simply change the writemask to store it in the proper
> component for the MRF.
>
> Reduces a simple shader in Unigine Tropics from 12 instructions to 9
> by eliminating superfluous MOVs for 3 of the 4 vector components.

This looks to me like it would also apply to a series of, say, MULs that each write an individual channel. But it wouldn't reswizzle the source channels as necessary, so you'd get wrong results.
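
To make the concern concrete, here is a rough sketch in Python of a toy vec4 model. It is not Mesa code: the helper names (`execute`, `with_temporary`, `writemask_only`, `reswizzled`) and the register values are invented for illustration, and the model assumes that with an identity swizzle, channel i of a per-channel op reads channel i of each source. Under that assumption, moving the writemask from .x to .y is harmless for dp4 (its result is the same in every channel) but changes the answer for MUL unless the sources are reswizzled too.

```python
# Toy model of align16 vec4 instructions. Illustration only, not Mesa code.
X, Y, Z, W = range(4)
IDENTITY = (X, Y, Z, W)

def swizzle(reg, swz):
    """Return the four channel values of reg as seen through swizzle swz."""
    return [reg[c] for c in swz]

def dp4(a, b):
    """Dot product: one scalar, replicated to every channel."""
    s = sum(p * q for p, q in zip(a, b))
    return [s] * 4

def mul(a, b):
    """Per-channel multiply: result channel i uses channel i of each source."""
    return [p * q for p, q in zip(a, b)]

def execute(op, dst, writemask, a, b, swz=IDENTITY):
    """Run op on the swizzled sources, storing only the channels in writemask."""
    result = op(swizzle(a, swz), swizzle(b, swz))
    for c in writemask:
        dst[c] = result[c]

def with_temporary(op, g1, g5):
    """op g8.x, g1, g5 followed by mov m3.y, g8.x (the unoptimized pair)."""
    g8, m3 = [0] * 4, [0] * 4
    execute(op, g8, [X], g1, g5)
    m3[Y] = g8[X]
    return m3[Y]

def writemask_only(op, g1, g5):
    """op m3.y, g1, g5: only the writemask is changed, sources untouched."""
    m3 = [0] * 4
    execute(op, m3, [Y], g1, g5)
    return m3[Y]

def reswizzled(op, g1, g5):
    """op m3.y, g1.xxxx, g5.xxxx: sources adjusted so channel y reads x.
    Only meaningful for per-channel ops such as mul, not for dp4."""
    m3 = [0] * 4
    execute(op, m3, [Y], g1, g5, swz=(X, X, X, X))
    return m3[Y]

g1 = [1, 2, 3, 4]
g5 = [5, 6, 7, 8]

print(with_temporary(dp4, g1, g5), writemask_only(dp4, g1, g5))  # 70 70
print(with_temporary(mul, g1, g5), writemask_only(mul, g1, g5))  # 5 12
print(reswizzled(mul, g1, g5))                                   # 5
```

In this model the pass would need either to restrict itself to instructions whose result is identical in every channel, or to rewrite the source swizzles along with the writemask.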