On Sun, Apr 4, 2010 at 10:42 PM, Luca Barbieri <luca.barbi...@gmail.com> wrote:
>> Way back I actually looked into LLVM for R300. I was totally
>> unconvinced by their vector support back then, but that may well have
>> changed. In particular, I'm curious about how LLVM deals with
>> writemasks. Writing to only a select subset of components of a vector
>> is something I've seen in a lot of shaders, but it doesn't seem to be
>> too popular in CPU-bound SSE code, which is probably why LLVM didn't
>> support it well. Has that improved?
>>
>> The trouble with writemasks is that it's not something you can just
>> implement one module for. All your optimization passes, from simple
>> peephole to the smartest loop modifications, need to understand the
>> meaning of writemasks.
>
> You should be able to just use
> shufflevector/insertelement/extractelement to mix the new computed
> values with the previous values in the vector register (as well as
> doing swizzles).

Okay, that looks good.
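
To make that concrete, here is a minimal sketch of how a writemask could
be folded into a single shuffle. It only models the lane-selection
semantics of LLVM's shufflevector in plain Python (indices 0..n-1 pick
from the first operand, n..2n-1 from the second); the helper names
writemask_to_shuffle and masked_write are made up for illustration and
are not LLVM or Mesa APIs.

```python
def shufflevector(a, b, mask):
    """Model of shufflevector: pick lanes from the concatenation of a and b."""
    both = list(a) + list(b)
    return [both[i] for i in mask]


def writemask_to_shuffle(writemask):
    """Hypothetical helper: turn a per-component writemask into a shuffle mask.

    Lane i takes the new value (index width + i) when it is written,
    and keeps the old value (index i) otherwise.
    """
    width = len(writemask)
    return [width + i if written else i for i, written in enumerate(writemask)]


def masked_write(old, new, writemask):
    """Emulate a masked write such as 'MOV dst.xz, new' with one shuffle."""
    return shufflevector(old, new, writemask_to_shuffle(writemask))


old = [1.0, 2.0, 3.0, 4.0]
new = [10.0, 20.0, 30.0, 40.0]

# Write only .x and .z; .y and .w keep their previous contents.
result = masked_write(old, new, [True, False, True, False])

# A swizzle is the same operation with both operands equal: .zyxw here.
swizzled = shufflevector(old, old, [2, 1, 0, 3])
```

The point of the sketch is that the mask becomes ordinary data flow: the
optimizer sees a shuffle of two values rather than a special partial
write.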

> There is also the option of immediately scalarizing, optimizing the
> scalar code, and then revectorizing.
> This risks pessimizing the input code, but might turn out to work well.

This might depend on the target: R600+, for example, is quite
scalar-oriented anyway (modulo a lot of subtle limitations), so just
pretending that everything is scalar could work well there since
revectorizing is almost unnecessary.
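
As a rough illustration of the "pretend everything is scalar" idea, the
sketch below splits one masked vector instruction into per-component
scalar instructions. The tuple-based instruction format and the
scalarize function are assumptions invented for this example; they do
not correspond to any existing Mesa or LLVM data structure.

```python
COMPONENTS = "xyzw"


def scalarize(op, dst, srcs, writemask):
    """Hypothetical helper: split a vector instruction into scalar ones.

    Emits one (op, destination, sources) tuple per written component.
    Components that are masked out produce no instruction at all.
    """
    return [
        (op, f"{dst}.{c}", [f"{s}.{c}" for s in srcs])
        for c, written in zip(COMPONENTS, writemask)
        if written
    ]


# 'MUL r0.xz, r1, r2' becomes two independent scalar multiplies.
scalar_ops = scalarize("MUL", "r0", ["r1", "r2"], [True, False, True, False])
for instruction in scalar_ops:
    print(instruction)
```

After this step the writemask no longer exists as a concept: unwritten
components simply have no instruction, so scalar optimization passes do
not need to know about masks.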

cu,
Nicolai

_______________________________________________
Mesa3d-dev mailing list
Mesa3d-dev@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mesa3d-dev