On Sun, Apr 4, 2010 at 10:42 PM, Luca Barbieri <luca.barbi...@gmail.com> wrote:
>> Way back I actually looked into LLVM for R300. I was totally
>> unconvinced by their vector support back then, but that may well have
>> changed. In particular, I'm curious about how LLVM deals with
>> writemasks. Writing to only a select subsets of components of a vector
>> is something I've seen in a lot of shaders, but it doesn't seem to be
>> too popular in CPU-bound SSE code, which is probably why LLVM didn't
>> support it well. Has that improved?
>>
>> The trouble with writemasks is that it's not something you can just
>> implement one module for. All your optimization passes, from simple
>> peephole to the smartest loop modifications need to understand the
>> meaning of writemasks.
>
> You should be able to just use
> shufflevector/insertelement/extractelement to mix the new computed
> values with the previous values in the vector register (as well as
> doing swizzles).
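
As a rough illustration of the suggestion quoted above, here is a small C
sketch that models a writemask as a two-input shuffle. The vec4 type and
the helpers shuffle2, masked_write and swizzle are hypothetical names made
up for this example; only the index convention (0..3 pick from the first
operand, 4..7 from the second) is meant to mirror LLVM's shufflevector.
It is a model of the idea, not actual LLVM IR or Mesa code.

```c
#include <assert.h>

/* Hypothetical 4-component register, as in a typical shader ISA. */
typedef struct { float c[4]; } vec4;

/* Models a shufflevector over two 4-wide inputs: index 0..3 selects
 * a component of 'a', index 4..7 selects a component of 'b'. */
static vec4 shuffle2(vec4 a, vec4 b, const int idx[4])
{
    vec4 r;
    for (int i = 0; i < 4; i++) {
        assert(idx[i] >= 0 && idx[i] < 8);
        r.c[i] = idx[i] < 4 ? a.c[idx[i]] : b.c[idx[i] - 4];
    }
    return r;
}

/* Emulates "dst.<mask> = val": bit i of 'writemask' set means that
 * component i is taken from 'val', otherwise the old value is kept. */
static vec4 masked_write(vec4 old, vec4 val, unsigned writemask)
{
    int idx[4];
    for (int i = 0; i < 4; i++)
        idx[i] = (writemask & (1u << i)) ? 4 + i : i;
    return shuffle2(old, val, idx);
}

/* A swizzle is the same shuffle with a single source; the second
 * operand is never selected because all indices stay below 4. */
static vec4 swizzle(vec4 v, int x, int y, int z, int w)
{
    assert(x < 4 && y < 4 && z < 4 && w < 4);
    const int idx[4] = { x, y, z, w };
    return shuffle2(v, v, idx);
}
```

With this model, masked_write(old, val, 0x5) corresponds to a write to
.xz: components 0 and 2 come from val, components 1 and 3 keep their old
values. The point is that the result is an ordinary full-vector value, so
passes would only need to understand shuffles rather than writemasks.
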
Okay, that looks good.

> There is also the option of immediately scalarizing, optimizing the
> scalar code, and then revectorizing.
> This risks pessimizing the input code, but might turn out to work well.

This might depend on the target: R600+, for example, is quite
scalar-oriented anyway (modulo a lot of subtle limitations), so just
pretending that everything is scalar could work well there since
revectorizing is almost unnecessary.

cu,
Nicolai