On Wed, 18 Apr 2012 23:29:31 -0700, Kenneth Graunke <kenn...@whitecape.org> 
wrote:
> Consider the following code:
>    dp4(8)  g8<1>.xF  g1<4,4,1>F    g5<4,4,1>F { align16 WE_normal 1Q };
>    mov(8)  m3<1>.xF  g8<4,4,1>.xF             { align16 WE_normal 1Q };
> 
> Thanks to our existing compute-to-MRF code, this becomes:
>    dp4(8)  m3<1>.xF  g1<4,4,1>F    g5<4,4,1>F { align16 WE_normal 1Q };
> 
> However:
>    dp4(8)  g8<1>.xF  g1<4,4,1>F    g5<4,4,1>F { align16 WE_normal 1Q };
>    mov(8)  m3<1>.yF  g8<4,4,1>.xF             { align16 WE_normal 1Q };
> does not get optimized since the MRF and temporary GRF use different
> components, and the code does not yet support rewriting swizzles in the
> general case.  Scalars are an easy special case: since there's only one
> component, you can simply change the writemask to store it in the proper
> component for the MRF.
> 
> Reduces a simple shader in Unigine Tropics from 12 instructions to 9
> by eliminating superfluous MOVs for 3 of the 4 vector components.

It looks to me like this would also trigger on a series of, say, MULs
that each write an individual channel.  But it wouldn't reswizzle the
source channels to match the new writemask, so you'd get wrong results.
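
To make the concern concrete, here is a toy Python model of per-channel
vec4 instructions.  It is only a sketch, not Mesa code: the function
names, the swizzle parameter, and the simplified channel semantics are
all assumptions made for illustration.  It compares "write .x to a
temporary, then MOV into .y" against the naive rewrite that only changes
the writemask.

```python
# Toy model of per-channel (align16-style) vec4 instructions.
# Hypothetical helpers for illustration only; this is not the i965 backend.
CHANNELS = "xyzw"

def dp4(src0, src1, writemask):
    """Dot product: every enabled channel receives the same scalar."""
    dot = sum(a * b for a, b in zip(src0, src1))
    return {c: dot for c in writemask}

def mul(src0, src1, writemask, swizzle="xyzw"):
    """Per-channel multiply: destination channel i reads source channel swizzle[i]."""
    out = {}
    for c in writemask:
        s = CHANNELS.index(swizzle[CHANNELS.index(c)])
        out[c] = src0[s] * src1[s]
    return out

g1 = [1.0, 2.0, 3.0, 4.0]
g5 = [5.0, 6.0, 7.0, 8.0]

# Original pair: "op g8.x" followed by "mov m3.y, g8.x".
# The value that must end up in m3.y is whatever the op wrote to g8.x.
dp4_expected = dp4(g1, g5, "x")["x"]
mul_expected = mul(g1, g5, "x")["x"]

# Naive rewrite: keep the sources as they are, only change .x to .y.
dp4_rewritten = dp4(g1, g5, "y")["y"]   # same scalar in any channel
mul_naive = mul(g1, g5, "y")["y"]       # reads g1.y * g5.y instead of g1.x * g5.x

# Rewrite that also reswizzles the sources so channel y reads channel x.
mul_reswizzled = mul(g1, g5, "y", swizzle="xxxx")["y"]

print("dp4:", dp4_expected, dp4_rewritten)
print("mul:", mul_expected, mul_naive, mul_reswizzled)
```

In this model the DP4 rewrite is safe because the result does not depend
on which channel is enabled, while the MUL rewrite only matches the
original once the sources are reswizzled as well.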
