Timothy Normand Miller wrote:
A challenge!  :)  People keep telling me that they're primarily scalar
workloads.  I have accepted what they say.  It may be that
well-written shader programs are heavily vector but that typical
shader programs written by typical programmers are not.

Besides the obvious scalar ALU instructions, there are other
instructions that take bandwidth that are also not vector:  flow
control, loads, and stores.
There's lots of those.  No?

I downloaded all the vertex and fragment shaders from the Orange
Book, AKA OpenGL Shading Language by Randi J. Rost. You can grab
them as a ZIP file from the bottom of the list at
<http://3dshaders.com/home/index.php?option=com_weblinks&catid=14&Itemid=34>

The Mesa3D source distro includes a utility program that can compile
most GLSL programs into low-level GPU assembler code. It's based on
the original low-level ARB/NVIDIA instruction sets with some extra
directives for if and loop branching.

I wrote a Python program that runs the Mesa3D glslcompiler and then
scans through the assembler output, classifying each line as either
a jump/flow of control instruction, a scalar operation involving
only single values, or a vector operation that requires two or more
values from a 128-bit (4 x 32-bit) vector. If you want to try it yourself:
<http://cs.anu.edu.au/~hugh.fisher/mesa3d/countGLSL.py>
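For readers who can't fetch the script, here is a minimal sketch of the
kind of per-line classification described above. It is not the code in
countGLSL.py. It assumes ARB-style lines such as "MUL R0.xyz, R1, R2;",
and the opcode sets, the classify() helper, and the rule "a one-component
write mask means scalar" are all assumptions made for illustration.

```python
import re

# Assumed opcode lists; the real script and compiler output may differ.
BRANCH_OPS = {"IF", "ELSE", "ENDIF", "BGNLOOP", "ENDLOOP",
              "BRK", "CONT", "BRA", "CAL", "RET"}
# Dot products read several components even when they write only one.
REDUCE_OPS = {"DP2", "DP3", "DP4", "DPH"}
# Program terminator, not a real instruction.
NON_OPS = {"END"}

# Optional "12:" line prefix, then an upper-case opcode, then the
# destination operand (everything up to the first comma or semicolon).
INSTR = re.compile(r"^\s*(?:\d+:\s*)?([A-Z][A-Z0-9_]*)\b\s*([^,;]*)")
# Trailing write mask such as ".x" or ".xyz" on the destination.
MASK = re.compile(r"\.([xyzw]{1,4})$")


def classify(line):
    """Return 'branch', 'scalar', 'vector', or 'other' for one line."""
    m = INSTR.match(line)
    if not m:
        return "other"  # comments, labels, blank lines
    opcode, dest = m.group(1), m.group(2).strip()
    if opcode in NON_OPS:
        return "other"
    if opcode in BRANCH_OPS:
        return "branch"
    if opcode in REDUCE_OPS:
        return "vector"
    mask = MASK.search(dest)
    width = len(mask.group(1)) if mask else 4  # no mask: all 4 components
    return "scalar" if width == 1 else "vector"
```

Calling classify() on every line of the compiler output and tallying
the results in a collections.Counter gives a breakdown by category.
A heuristic like this errs toward calling things scalar, since any
single-component write counts as scalar regardless of its sources.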

Running this program over the Orange Book demos, 9 of the shaders
wouldn't compile because they used features from newer versions of
GLSL that Mesa3D doesn't support. For the 90 shaders that did work,
the average number of lines of code (assembler, not high level)
is 32.

Breaking down by instruction type over those 90 shaders:

Lines   2863
Other     50 =  1.7%     (These are branch targets)
Branch    80 =  2.8%
Scalar  1008 = 35.2%
Vector  1725 = 60.3%
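
As a quick arithmetic check, the percentages and the 32-line average
follow directly from the counts above. The variable names here are
just for illustration.

```python
# Counts copied from the breakdown above.
counts = {"Other": 50, "Branch": 80, "Scalar": 1008, "Vector": 1725}
shaders = 90

total = sum(counts.values())
# Share of each category as a percentage, rounded to one decimal place.
shares = {name: round(100 * n / total, 1) for name, n in counts.items()}
average = total / shaders  # assembler lines per shader

print(total, shares, round(average))
```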

Yes, the ratios have shifted since the early days of Shader Model
1.0 and low-level assembler. But not much. 3D graphics in the GPU
era is still all about vector/matrix crunching.

The only argument I've heard so far in favour of MIMD is that it
would improve the performance for scalar workloads. For a graphics
chip, that doesn't appear to be a smart approach.

--
        Hugh Fisher
        CECS, ANU
_______________________________________________
Open-graphics mailing list
[email protected]
http://lists.duskglow.com/mailman/listinfo/open-graphics
List service provided by Duskglow Consulting, LLC (www.duskglow.com)
