On Sat, Apr 3, 2010 at 2:37 PM, Luca Barbieri <luca.barbi...@gmail.com> wrote:
> Note that all "X compiler is bad for VLIW or whatever GPU architecture"
> objections are irrelevant, since almost all optimizations are totally
> architecture-independent.

Way back I actually looked into LLVM for R300. I was totally
unconvinced by their vector support back then, but that may well have
changed. In particular, I'm curious about how LLVM deals with
writemasks. Writing to only a select subset of components of a vector
is something I've seen in a lot of shaders, but it doesn't seem to be
too popular in CPU-bound SSE code, which is probably why LLVM didn't
support it well. Has that improved?

The trouble with writemasks is that they aren't something you can support
in just one module. All your optimization passes, from simple peephole
cleanups to the smartest loop transformations, need to understand what a
writemask means.
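
To make that concrete, here is a minimal C sketch of what a masked write
means; the WRITEMASK_* names and the masked_mov helper are illustrative,
not the actual Mesa/TGSI definitions:

    #include <stdio.h>

    struct vec4 { float x, y, z, w; };

    /* Per-component bits, in the spirit of TGSI writemasks. */
    enum { WRITEMASK_X = 1, WRITEMASK_Y = 2, WRITEMASK_Z = 4, WRITEMASK_W = 8 };

    /* "MOV dst.xz, src": only the masked components are updated; the
     * others keep whatever value dst already had. */
    static void masked_mov(struct vec4 *dst, struct vec4 src, unsigned mask)
    {
        if (mask & WRITEMASK_X) dst->x = src.x;
        if (mask & WRITEMASK_Y) dst->y = src.y;
        if (mask & WRITEMASK_Z) dst->z = src.z;
        if (mask & WRITEMASK_W) dst->w = src.w;
    }

    int main(void)
    {
        struct vec4 dst = { 0, 0, 0, 0 };
        struct vec4 src = { 1, 2, 3, 4 };
        masked_mov(&dst, src, WRITEMASK_X | WRITEMASK_Z);
        printf("%.0f %.0f %.0f %.0f\n", dst.x, dst.y, dst.z, dst.w); /* 1 0 3 0 */
        return 0;
    }

Any pass that reorders, merges or eliminates instructions has to preserve
the fact that the unmasked components keep their previous values.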


> This is obviously not achievable if Mesa/Gallium contributors are
> supposed to write the compiler optimizations themselves, since clearly
> there is not even enough manpower to support a relatively up-to-date
> version of OpenGL or, say, to have drivers that can allocate and fence
> GPU memory in a sensible and fast way, or implement hierarchical Z
> buffers, or any of the other things expected from a decent driver,
> that the Mesa drivers don't do.

I agree, though if I were to start an LLVM-based compilation project,
I would do it for R600+, not for R300. That would be a very different
kind of project.


> So, for a GSoC project, I'd kind of suggest:
> (1) Adapt the gallivm/llvmpipe TGSI->LLVM converter to also generate
> AoS code (i.e. RGBA vectors as opposed to RRRR, GGGG, etc.) if
> possible or write one from scratch otherwise
> (2) Write an LLVM->TGSI backend, restricted to programs without any control flow
> (3) Make LLVM->TGSI always work (even with control flow and DDX/DDY)
> (4) Hook up all useful LLVM optimizations
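
(For what it's worth, here is the AoS/SoA distinction from (1) as a
minimal, hypothetical C sketch; the struct names are made up:

    /* AoS ("RGBA vectors"): one pixel's four channels share a register. */
    struct pixel_aos { float r, g, b, a; };

    /* SoA ("RRRR, GGGG, ..."): one register holds the same channel for a
     * group of pixels, which is the layout gallivm/llvmpipe prefers. */
    struct quad_soa { float r[4], g[4], b[4], a[4]; };

Hardware like r300 consumes the AoS form, hence the request in (1).)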

An LLVM->TGSI conversion is not the best way to go because TGSI doesn't
match the hardware all that well, at least in the Radeon family.
R300-R500 fragment programs have the weird RGB/A split, and R600+ is
yet another beast that looks quite different from TGSI. So at least
for Radeon, I believe it would be best to generate hardware-level
instructions directly from LLVM, possibly via some Radeon-family
specific intermediate representation.
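
As a rough, hypothetical sketch of that split (not the actual r300
compiler structures): each R300-R500 fragment ALU slot co-issues a
3-wide op on the RGB lanes and a separate scalar op on the alpha lane,
which TGSI has no way to express:

    /* Toy model of one R300-R500 fragment ALU slot: a vector op on the
     * RGB lanes paired with a scalar op on the alpha lane. */
    struct fragment_pair_inst {
        unsigned rgb_opcode;      /* executes on .xyz */
        unsigned rgb_writemask;   /* some subset of x, y, z */
        unsigned rgb_src[3];

        unsigned alpha_opcode;    /* executes on .w only */
        unsigned alpha_writemask; /* w or nothing */
        unsigned alpha_src[3];
    };

A Radeon-specific IR would carry this pairing explicitly instead of
having to rediscover it after the fact.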

The thing is, a lot of the optimizations in the r300 compiler are
actually there *because* TGSI (and Mesa instructions) are not a good
match for what the hardware looks like. Rewriting those optimizations as
LLVM passes, only to throw their benefit away again by dropping back
down to TGSI, seems a bit silly.

We already run into this frustration with the assembly produced by the
Mesa GLSL compiler. That compiler means well and tries to handle scalar
values cleverly, but those "optimizations" are actually counterproductive
for Radeon: they end up using instructions like RCP and RSQ on one of
the RGB components, which happens to be a really bad idea on this
hardware. It would be nice if we could feed e.g. LLVM IR into the
Gallium driver instead of TGSI, and let the driver worry about all the
optimizations.
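
As a toy illustration of why the scalar-on-RGB pattern hurts (made-up
numbers, not driver code; it assumes the scalar unit wants its operand
in the alpha lane, which is the point above):

    #include <stdio.h>

    /* A scalar op (RCP, RSQ, ...) whose source sits in an RGB lane needs
     * an extra move to get the value into the alpha lane first.
     * Lane 3 is alpha. */
    static int scalar_op_slots(int src_lane)
    {
        int slots = 1;            /* the RCP/RSQ itself */
        if (src_lane != 3)
            slots += 1;           /* replicate/move into the alpha lane */
        return slots;
    }

    int main(void)
    {
        printf("RCP on .x: %d slots\n", scalar_op_slots(0)); /* 2 */
        printf("RCP on .w: %d slots\n", scalar_op_slots(3)); /* 1 */
        return 0;
    }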

Anyway, I'm convinced that LLVM (or something like it) is necessary
for the future. However, for this particular GSoC proposal, it's off
the mark.

cu,
Nicolai
