Nicolas Boulay wrote:
2007/4/17, Attila Kinali <[EMAIL PROTECTED]>:
> nVidia and ATI have designed specialised CPUs ('shader units')
> for their cards. I think it is reasonable for the OGF to
> consider using a general purpose CPU on the card because it
> will be quicker and easier than designing our own. We'll end
> up with a bigger chip than we need, but we know it will work
> right away and come with an assembler and debugger.
No, it will not. Shaders are highly parallel SIMD units.
General-purpose CPUs are SISD designs with SIMD extensions.
Very few of them can execute more than 2 SIMD instructions
in parallel. That limits the number of shader operations
done in parallel, which in turn limits the fill rate
(super-linearly, because there are second-order effects
that are not obvious).
If we need a shader, the deal is different. A generic CPU is not needed,
because you don't need a C compiler (except maybe for GPGPU, but ...).
What you do need is to convert the generic ASM of the shading language
to the ASM of your shader.
Many samples of shader code were sent here a few months ago. The generic
ASM was designed around a SIMD instruction set that works on 4-component
vectors. But those samples were not SIMD at all; they were mostly scalar!
With code like that, if we used a 4-way SIMD shader, the 4th FMACC unit
would sit unused 90% of the time. And maybe 50% of the time, only 1 FMACC
would be needed.
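To put rough numbers on that utilisation argument, here is a minimal Python sketch. It assumes a made-up mix of 100 shader instructions chosen to match the two figures above (half need only 1 FMACC, 10% need all 4). The even split of the remaining 40% between 2-wide and 3-wide instructions, and every name in the code, are assumptions for illustration, not measurements from the posted samples.

```python
# Hypothetical instruction mix: {components used: instructions out of 100}.
# 50 scalar and 10 full-width instructions follow the figures quoted above;
# the 20/20 split between 2- and 3-wide instructions is an assumption.
ASSUMED_MIX = {1: 50, 2: 20, 3: 20, 4: 10}


def lane_utilisation(mix, simd_width):
    """Fraction of FMACC lane-slots doing useful work when every
    instruction occupies the whole SIMD unit for one issue slot."""
    for width in mix:
        if width > simd_width:
            raise ValueError("instruction wider than the SIMD unit")
    busy = sum(width * count for width, count in mix.items())
    total = simd_width * sum(mix.values())
    return busy / total


def fourth_lane_idle_fraction(mix):
    """Share of instructions that leave the 4th lane unused."""
    idle = sum(count for width, count in mix.items() if width < 4)
    return idle / sum(mix.values())


if __name__ == "__main__":
    print(f"4-way SIMD lane utilisation: {lane_utilisation(ASSUMED_MIX, 4):.1%}")
    print(f"4th lane idle: {fourth_lane_idle_fraction(ASSUMED_MIX):.0%}")
```

Under this assumed mix, a bit less than half of the lane slots on a 4-way unit do useful work. A different split of the middle 40% moves the figure somewhat, but not the overall picture.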
The problem is that the OpenGL shading language includes matrix multiplies.
The most expensive operation is a 4 by 4 matrix multiply. This is not
common, but 3-component and 4-component dot product instructions are more
common, and these are also quite expensive. So we have the
situation where most of the code executes rapidly but a small amount of
the code takes a long time. I don't have figures on this, but I suspect
that the computationally expensive instructions wind up taking the
majority of the execution time.
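A rough way to see how that could happen, as a Python sketch: count the multiply-accumulates per operation and assume a single scalar MAC unit that retires one per cycle. The MAC counts follow from the arithmetic (a 4-component dot product is 4 multiplies, a 4x4 matrix times a 4-vector is 4 such dot products, and so on). The instruction mix is invented purely for illustration, since no measured figures are available, and all names are hypothetical.

```python
# Multiply-accumulates needed per operation (naive evaluation).
MACS_PER_OP = {
    "scalar": 1,      # one multiply-add
    "dp3": 3,         # 3-component dot product
    "dp4": 4,         # 4-component dot product
    "mat4_vec4": 16,  # 4x4 matrix times 4-vector = 4 dp4
    "mat4_mat4": 64,  # 4x4 matrix times 4x4 matrix = 16 dp4
}

# Invented mix of 100 operations; NOT measured shader data.
ASSUMED_MIX = {"scalar": 80, "dp3": 8, "dp4": 8, "mat4_vec4": 3, "mat4_mat4": 1}


def cycle_share(mix, expensive_ops):
    """Share of total cycles spent in expensive_ops, assuming one
    scalar MAC unit that retires one multiply-accumulate per cycle."""
    total = sum(MACS_PER_OP[op] * n for op, n in mix.items())
    heavy = sum(MACS_PER_OP[op] * n for op, n in mix.items() if op in expensive_ops)
    return heavy / total


if __name__ == "__main__":
    heavy_ops = {"dp3", "dp4", "mat4_vec4", "mat4_mat4"}
    print(f"cycles spent in non-scalar ops: {cycle_share(ASSUMED_MIX, heavy_ops):.0%}")
```

With this invented mix, the 20 non-scalar operations out of 100 account for roughly two thirds of the cycles. Real shader code could look quite different, so this only shows that the suspicion is plausible, not that it is true.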
Besides that, NVIDIA's shaders were 2-way SIMD, but their new cards are
described as using 128 _scalar_ shaders.
IIUC, the new nVidia chips do not have shaders but rather an array of
ALUs (probably MACs) which can be used to implement shaders. How this
is done is something that I don't know. I only know how I would do it.
--
JRT
_______________________________________________
Open-graphics mailing list
http://lists.duskglow.com/mailman/listinfo/open-graphics