On Mon, Sep 21, 2009 at 12:17 AM, Hugh Fisher <[email protected]> wrote:
> Timothy Normand Miller wrote:
>>
>> One of the design details that seems to be hard to present is the MIMD
>> architecture. At first glance, it looks like a SIMD architecture.
>> But all of you are right to point out that shader workloads are
>> primarily scalar.
>
> I'd like to see some evidence for this.

A challenge! :)

People keep telling me that they're primarily scalar workloads, and I
have accepted what they say. It may be that well-written shader
programs are heavily vector but that typical shader programs written
by typical programmers are not.

Besides the obvious scalar ALU instructions, there are other
instructions that take bandwidth and are also not vector: flow
control, loads, and stores. There are lots of those. No?

Also, if memory load latency dominates, then none of this matters.
Many shader programs will spend most of their time waiting on memory,
making vector optimizations moot.

> Some years ago I wrote a bunch of demonstration GPU shader programs
> in low level ARB/nVidia assembly. You can still find them at:
> <http://cs.anu.edu.au/~Hugh.Fisher/3dstuff/lowlevel.html>
>
> 80% of the instructions are vector, only 20% scalar. The ratio of
> scalar instructions increases very slightly with the more complex
> shaders to perhaps 25%. The single most common instruction is DP,
> Dot Product, of three or four operands from a vertex/color/matrix.

I can see vertex shader programs being DP heavy. But there will be far
fewer vertices than fragments. How DP-heavy are fragment shader
programs, generally?

> If you're using shaders to emulate the original fixed function
> OpenGL/Direct3D pipelines, the ratio of SIMD to scalar will be
> even higher.
>
> OK, my shaders are old, and predate Shader Model 3.0 and widespread
> use of high level languages. They still do what every 3D engine
> spends most of its time doing: multiply vertices by a matrix, and
> RGB/RGBA colors by other colors.
>
> I'm happy to be proved wrong on this, but let's do so on the basis
> of real world shaders written by graphics programmers.

Yeah. I agree. I don't know enough about this myself. We need to do
this right without egos or too much guessing. Others here should be
able to fill in the gaps.
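
To put rough numbers on the two points in play here (the vector/scalar
instruction mix, and memory stalls swamping everything), here is a toy
model in Python. It is only a sketch under loud assumptions: the
function names are made up for illustration, a scalar ALU is assumed
to need one issue slot per component of a vector instruction (crude: a
real DP4 is 4 multiplies plus 3 adds), and memory stalls are modelled
as a fixed number of slots added to both machines. None of this is
measured data.

```python
def dp4(a, b):
    """Four-component dot product, written out as scalar operations."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3]


def transform(matrix_rows, vertex):
    """Multiply a 4x4 matrix by a vertex: one DP4 per matrix row."""
    return [dp4(row, vertex) for row in matrix_rows]


def scalar_slots_per_instruction(vector_fraction, width=4):
    """Average issue slots a scalar ALU spends per shader instruction.

    ASSUMPTION: each vector instruction costs `width` scalar slots and
    each scalar instruction costs one.
    """
    return (1.0 - vector_fraction) + vector_fraction * width


def simd_speedup(vector_fraction, stall_slots=0.0, width=4):
    """Toy speedup of a `width`-wide SIMD ALU over a scalar ALU.

    ASSUMPTION: `stall_slots` is the average memory wait per
    instruction and is identical on both machines.
    """
    scalar_time = stall_slots + scalar_slots_per_instruction(vector_fraction, width)
    simd_time = stall_slots + 1.0
    return scalar_time / simd_time


if __name__ == "__main__":
    identity = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
    print(transform(identity, [1, 2, 3, 4]))
    # Hugh's 80% vector mix, first with no stalls, then stall-dominated.
    print(simd_speedup(0.8))
    print(simd_speedup(0.8, stall_slots=100.0))
```

With an 80% vector mix and no stalls, the model gives the SIMD unit
roughly a 3.4x advantage. Once stalls are large compared to the ALU
work, the ratio collapses toward 1, which is the "none of this
matters" case. Which regime real fragment shaders fall into is exactly
what needs measuring.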

> I'd suggest a MIPS with each floating point reg extended to 128 bits
> as 4 x 32 / 2 x 64 floats with every add/etc instruction now being
> SIMD. For you Intel folk, think of it as using SSE for everything.

This is congruent with one of my early designs. :)

-- 
Timothy Normand Miller
http://www.cse.ohio-state.edu/~millerti
Open Graphics Project
_______________________________________________
Open-graphics mailing list
[email protected]
http://lists.duskglow.com/mailman/listinfo/open-graphics
List service provided by Duskglow Consulting, LLC (www.duskglow.com)
