On 20.04.2013 13:12, Vadim Girlin wrote:
On 04/20/2013 01:42 PM, Christian König wrote:
On 19.04.2013 18:50, Vadim Girlin wrote:
On 04/19/2013 08:35 PM, Christian König wrote:
Hey Vadim,

On 19.04.2013 18:18, Vadim Girlin wrote:
[SNIP]

In theory, yes, some optimizations in this branch are typically applied
at earlier compilation stages, not on the target machine code. On the
other hand, there are some differences that might make that harder,
e.g. many algorithms require SSA form, and though it's possible to do
similar optimizations without SSA, they would be hard to implement. Also
I wanted to support both the default backend and the llvm backend, for
increased testing coverage and to be able to compare the efficiency of
the algorithms in my experiments, etc.

Yeah I know, the missing SSA implementation is also something that has
always bothered me a bit about both TGSI and GLSL (though I haven't done
much with GLSL, so maybe I misjudge here).

Can you name the different algorithms used?

There is a short description of the algorithms and passes in the
notes.markdown file [1] in that branch; there are also links at the
end to the full descriptions of some algorithms, though some of them
were modified/adapted for this branch.

It's not a strict prerequisite, but I think we both agree that doing
things like LICM on R600 bytecode isn't the best idea overall (when
doing it on GLSL would be beneficial for all drivers, not only r600).

In fact there is no special LICM pass; it's done by GCM (Global
Code Motion, [2]), which could probably also be called a global
scheduler. In my branch this pass is combined with some hw-specific
scheduling logic, e.g. grouping fetch/alu instructions to reduce clause
type switching in the code and the number of required CF instructions;
potentially it could also schedule clauses to expose more parallelism
via the BARRIER bit.
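
For reference, here is a minimal sketch of the placement rule from
Click's paper [2]; the struct and function names are made up for
illustration, they are not the actual r600-sb classes:

  #include <cstdio>

  // Rough sketch of the GCM placement rule from [2]; names are invented.
  struct basic_block {
      basic_block *idom;   // immediate dominator
      int loop_depth;      // loop nesting level
  };

  // Earliest/latest legal placement for one value, as computed by the
  // schedule_early / schedule_late walks described in the paper.
  struct placement {
      basic_block *early;
      basic_block *late;
  };

  // Walk the dominator tree from the latest legal block up to the
  // earliest one and pick the shallowest loop nesting level on the way.
  // Hoisting of loop-invariant code (LICM) falls out of this rule.
  static basic_block *pick_block(const placement &p)
  {
      basic_block *best = p.late;
      for (basic_block *b = p.late;; b = b->idom) {
          if (b->loop_depth < best->loop_depth)
              best = b;
          if (b == p.early)
              break;
      }
      return best;
  }

  int main()
  {
      basic_block entry  = { nullptr, 0 };
      basic_block header = { &entry,  1 };  // loop header
      basic_block body   = { &header, 1 };  // loop body

      // Value is computable right after entry but only used in the loop
      // body: GCM places it in 'entry', i.e. hoists it out of the loop.
      placement p = { &entry, &body };
      std::printf("picked loop_depth %d\n", pick_block(p)->loop_depth);
      return 0;
  }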


Yeah I already thought that you're using something like this.

On one hand that is really good, because it is specialized and so
produces really optimal code for the r600 target. But on the other hand
it's bad, because it is specialized and so produces really optimal code
ONLY for the r600 target....

I think such a pass at a higher level (GLSL IR or TGSI) would at least need some callbacks or caps to be tunable for the target.

Anyway, the result of the GCM pass is affected by the CFG structure, so when the target applies e.g. if-conversion or any other target-specific control flow optimization, you might want to apply a similar pass again at the target instruction level for better results, and then the earlier pass on the higher-level IR doesn't look very useful.

Also there are some high-level operations that are translated into a bunch of target instructions, e.g. integer division on r600. A high-level pass can't hoist "i/5" (where i is the loop counter) out of the loop, but after translation to target instructions it's possible to hoist some of the resulting instructions, producing more efficient code.
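
To illustrate the point, here is a minimal sketch assuming the usual
multiply-by-magic-constant lowering of unsigned division by a constant;
the actual r600 expansion differs in detail, but the split into a
divisor-only part and a per-iteration part is the same:

  #include <cassert>
  #include <cstdint>

  int main()
  {
      // Divisor-only setup: 0xCCCCCCCD = ceil(2^34 / 5). A high-level
      // pass only sees "i / 5" and can't hoist it, but after lowering
      // this part depends on nothing that changes per iteration and can
      // be hoisted (or constant-folded) out of the loop.
      const uint64_t magic = 0xCCCCCCCDull;
      const unsigned shift = 34;

      for (uint32_t i = 0; i < 1000000; ++i) {
          // Per-iteration part: mul_hi + shift, stays inside the loop.
          uint32_t q = (uint32_t)((i * magic) >> shift);
          assert(q == i / 5);
      }
      return 0;
  }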

One more point is that GCM achieves its best efficiency when used together with a GVN (Global Value Numbering) pass, e.g. GCM allows GVN not to care about code placement while eliminating redundant operations, so you'd probably want to implement a high-level GVN pass as well.
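
Again just a minimal sketch of the idea, not the r600-sb code:
hash-based value numbering gives equivalent expressions the same number
wherever they occur, and a later GCM run decides where the single
surviving computation should live:

  #include <cstdio>
  #include <map>
  #include <string>
  #include <tuple>

  int main()
  {
      std::map<std::string, int> var_vn;                 // variable -> value number
      std::map<std::tuple<char, int, int>, int> expr_vn; // (op, vn, vn) -> value number
      int next_vn = 0;

      auto value_of = [&](const std::string &v) {
          if (!var_vn.count(v))
              var_vn[v] = next_vn++;
          return var_vn[v];
      };

      auto number = [&](const std::string &dst, char op,
                        const std::string &a, const std::string &b) {
          std::tuple<char, int, int> key{op, value_of(a), value_of(b)};
          bool redundant = expr_vn.count(key) != 0;
          if (!redundant)
              expr_vn[key] = next_vn++;
          var_vn[dst] = expr_vn[key];
          std::printf("%s = %s %c %s -> vn%d%s\n", dst.c_str(), a.c_str(),
                      op, b.c_str(), var_vn[dst],
                      redundant ? "  (redundant)" : "");
      };

      number("t0", '+', "a", "b");   // new value number for a+b
      number("t1", '*', "t0", "c");
      number("t2", '+', "a", "b");   // same number as t0 -> removable
      return 0;
  }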

I think it's possible to implement GVN-GCM at the GLSL or TGSI level, but I suspect it would require a lot more effort than the implementation of these passes in my branch did, and would be less efficient.


Just speculating, what would it take to make those passes run on the
LLVM Machine Instruction representation instead of your own representation?

The main difference between the IRs is the representation of control flow: r600-sb relies on the fact that the r600 arch doesn't have arbitrary control flow, which renders CFGs superfluous. Implementing these passes on CFGs would be more complicated; it would also require the computation of dominance frontiers, loop detection and analysis, etc. On r600-sb's IR these passes are greatly simplified.
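
A minimal sketch of what such a structured IR looks like; the class
names are made up for illustration, not the actual r600-sb ones.
Instead of a CFG with arbitrary edges, control flow is a tree of nested
containers, so loop membership (and dominance between siblings) is
implied by the nesting:

  #include <cstdio>
  #include <memory>
  #include <vector>

  // Illustrative only: control flow as nested containers, not a CFG.
  enum node_kind { NK_REGION, NK_IF, NK_LOOP, NK_INSTR };

  struct node {
      node_kind kind;
      node *parent = nullptr;
      std::vector<std::unique_ptr<node>> children;

      explicit node(node_kind k) : kind(k) {}

      node *add(node_kind k) {
          children.push_back(std::unique_ptr<node>(new node(k)));
          children.back()->parent = this;
          return children.back().get();
      }

      // Loop depth = number of enclosing loop containers; no separate
      // dominator/loop analysis is needed.
      int loop_depth() const {
          int d = 0;
          for (const node *p = parent; p; p = p->parent)
              if (p->kind == NK_LOOP)
                  ++d;
          return d;
      }
  };

  int main()
  {
      node shader(NK_REGION);              // top-level region
      node *loop = shader.add(NK_LOOP);    // a loop...
      node *branch = loop->add(NK_IF);     // ...containing an if...
      node *op = branch->add(NK_INSTR);    // ...containing one instruction
      std::printf("loop depth of op: %d\n", op->loop_depth()); // prints 1
      return 0;
  }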

Regarding GCM, the original algorithm as described in that pdf works on the CFG, so it shouldn't be hard to implement in LLVM, but I'm not sure how well it would fit into the LLVM infrastructure. LLVM has GVN-PRE, LICM and other passes that together do basically the same thing as GVN-GCM, so if you implemented it you might want to get rid of LLVM's own passes that duplicate that functionality, and I'm not sure whether this would be easy; possibly there are some interdependencies, etc. Also I saw mentions of some plans (e.g. [1],[2]) regarding the implementation of global code motion in LLVM, so it looks like there is already some work in progress.


Oh, I wasn't talking about replacing any LLVM passes, more like extending them to provide the same amount of functionality. Also I didn't have LLVM IR in mind while writing this, but rather the machine instruction representation they use.

Well, you have quite a lot of C++ code here, and a big chunk of it just deals with bringing the bytecode into a representation where you can run your algorithms on it.

Christian.

Vadim

[1] http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20120709/146206.html
[2] http://markmail.org/message/2td3fnnggk6oripp#query:+page:1+mid:2td3fnnggk6oripp+state:results


Christian.

Vadim

[1] http://cgit.freedesktop.org/~vadimg/mesa/tree/src/gallium/drivers/r600/sb/notes.markdown?h=r600-sb
[2] http://www.cs.washington.edu/education/courses/cse501/06wi/reading/click-pldi95.pdf


Regards,
Christian.





