On Sat, Apr 3, 2010 at 2:37 PM, Luca Barbieri <luca.barbi...@gmail.com> wrote:
> Note all "X compiler is bad for VLIW or whatever GPU architecture"
> objections are irrelevant, since almost all optimizations are totally
> architecture independent.
Way back I actually looked into LLVM for R300. I was totally unconvinced by their vector support back then, but that may well have changed. In particular, I'm curious about how LLVM deals with writemasks. Writing to only a select subset of components of a vector is something I've seen in a lot of shaders, but it doesn't seem to be too popular in CPU-bound SSE code, which is probably why LLVM didn't support it well. Has that improved?

The trouble with writemasks is that they are not something you can implement in just one module. All your optimization passes, from simple peephole to the smartest loop transformations, need to understand the meaning of writemasks. (A rough sketch of what I mean is at the end of this mail.)

> This is obviously not achievable if Mesa/Gallium contributors are
> supposed to write the compiler optimizations themselves, since clearly
> there is not even enough manpower to support a relatively up-to-date
> version of OpenGL or, say, to have drivers that can allocate and fence
> GPU memory in a sensible and fast way, or implement hierarchical Z
> buffers, or any of the other things expected from a decent driver
> that the Mesa drivers don't do.

I agree, though if I were to start an LLVM-based compilation project, I would do it for R600+, not for R300. That would be a very different kind of project.

> So, for a GSoC project, I'd kind of suggest:
> (1) Adapt the gallivm/llvmpipe TGSI->LLVM converter to also generate
>     AoS code (i.e. RGBA vectors as opposed to RRRR, GGGG, etc.) if
>     possible, or write one from scratch otherwise
> (2) Write an LLVM->TGSI backend, restricted to programs without any
>     control flow
> (3) Make LLVM->TGSI always work (even with control flow and DDX/DDY)
> (4) Hook up all useful LLVM optimizations

An LLVM->TGSI conversion is not the best way to go, because TGSI doesn't match the hardware all that well, at least in the Radeon family. R300-R500 fragment programs have the weird RGB/A split, and R600+ is yet another beast that looks quite different from TGSI. So at least for Radeon, I believe it would be best to generate hardware-level instructions directly from LLVM, possibly via some Radeon-family-specific intermediate representation.

The thing is, a lot of the optimizations in the r300 compiler are there precisely *because* TGSI (and Mesa instructions) are not a good match for what the hardware looks like. So replacing those optimizations with an LLVM pass whose work is then thrown away by a drop back down to TGSI seems a bit silly.

In a way, this is also what makes the assembly produced by the Mesa GLSL compiler rather frustrating to deal with. That compiler is well-meaning and tries hard to handle scalar values nicely, but those "optimizations" are actually counterproductive for Radeon, because they end up, e.g., using instructions like RCP and RSQ on one of the RGB components, which happens to be a really bad idea. (The second sketch at the end of this mail shows the kind of pattern I mean.)

It would be nice if we could feed e.g. LLVM IR into the Gallium driver instead of TGSI, and let the Gallium driver worry about all optimizations.

Anyway, I'm convinced that LLVM (or something like it) is necessary for the future. For this particular GSoC proposal, however, it's off the mark.

cu,
Nicolai
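P.S.: To make the writemask point concrete, here is a rough sketch (my own illustration, with invented value names, assuming the usual mapping of a TGSI register onto an LLVM <4 x float>) of what a single instruction writing only .xz would have to look like in LLVM IR: the partial write becomes a full-width operation followed by a merge with the old register contents, and every pass then has to see through that merge to know it is really a write to two components.

    ; hypothetical lowering of "MAD TEMP[0].xz, A, B, C" (writes only x and z);
    ; %a, %b, %c, %old_temp0 stand for A, B, C and the previous value of TEMP[0]
    define <4 x float> @mad_xz(<4 x float> %a, <4 x float> %b,
                               <4 x float> %c, <4 x float> %old_temp0) {
      %mul = fmul <4 x float> %a, %b
      %mad = fadd <4 x float> %mul, %c
      ; merge step: lanes 0 and 2 take the new result, lanes 1 and 3 (mask
      ; indices 5 and 7 select from the second operand) keep the old TEMP[0]
      %res = shufflevector <4 x float> %mad, <4 x float> %old_temp0,
                           <4 x i32> <i32 0, i32 5, i32 2, i32 7>
      ret <4 x float> %res
    }

Whether peephole, CSE or loop passes actually recognize that shuffle as "a .xz write" is exactly the question I'm asking above.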
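P.P.S.: And this (again only my own sketch, with invented names) is the kind of scalarized pattern I mean in the RCP/RSQ paragraph: the reciprocal is computed on and broadcast from lane 0, i.e. an RGB component, which is exactly the placement that fits the r300 RGB/A split badly.

    ; hypothetical scalarized code as a scalar-friendly compiler might emit it:
    ; reciprocal of the x component, broadcast back over the RGB lanes
    define <4 x float> @scale_by_rcp_x(<4 x float> %v, <4 x float> %rgb) {
      %x   = extractelement <4 x float> %v, i32 0
      %rcp = fdiv float 1.0, %x     ; roughly, this becomes an RCP whose
                                    ; operand and result sit in an RGB lane
      %ins = insertelement <4 x float> undef, float %rcp, i32 0
      %rep = shufflevector <4 x float> %ins, <4 x float> undef,
                           <4 x i32> zeroinitializer
      %out = fmul <4 x float> %rgb, %rep
      ret <4 x float> %out
    }

Roughly speaking, the hardware's scalar pipe would much rather see such values live in the alpha channel (lane 3) instead.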