2007/4/17, Attila Kinali <[EMAIL PROTECTED]>:
> > nVidia and ATI have designed specialised CPU ('shader units')
> > for their cards. I think it is reasonable for the OGF to
> > consider using a general purpose CPU on the card because it
> > will be quicker and easier than designing our own. We'll end
> > up with a bigger chip than we need, but we know it will work
> > right away and come with an assembler and debugger.
>
> No it will not. Shaders are highly parallel SIMD units. General
> purpose CPUs are SISD constructs with SIMD extensions. Very few of
> them can execute more than 2 SIMD instructions in parallel which
> limits the number of shader operations done in parallel which in
> turn limits the fill rate (super linearly because there are second
> order effects that are not obvious).
If we need a shader, the situation is different. A generic CPU is not
needed, because you don't need a C compiler (except perhaps for GPGPU,
but ...). What you do need is a way to convert the generic ASM of the
shading language into the ASM of your shader.

Many samples of shader code were sent here a few months ago. The generic
ASM was designed around a SIMD instruction set that works on 4-component
vectors. But those samples were not SIMD at all; they were mostly scalar!
In that case, if we use a 4-way SIMD shader, the 4th FMACC unit is unused
90% of the time. And maybe 50% of the time, only 1 FMACC is needed.

Besides that, NVIDIA shaders were 2-way SIMD, but their new cards are
described as using 128 _scalar_ shaders.

Don't forget that shaders have a workload that is "embarrassingly
parallel". Suppose 3 scalar cores are as big as one 4-way core: the
speed will be higher because the FMACC units are used at a higher rate.
I know that the problem then moves to the interconnection network. But
don't forget that small cores will also reach higher clock speeds.

Regards,
Nicolas
_______________________________________________
Open-graphics mailing list
[email protected]
http://lists.duskglow.com/mailman/listinfo/open-graphics
List service provided by Duskglow Consulting, LLC (www.duskglow.com)
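
To put rough numbers on the utilization argument above, here is a small sketch. The instruction mix is an assumption: it only respects the two figures quoted in the mail (about 50% of instructions need a single FMACC, about 10% need all four), and the split of the remaining 40% between 2-wide and 3-wide instructions is made up. The function names are hypothetical, and the model ignores the interconnect cost and any clock-speed difference.

```python
# Hypothetical instruction mix: share of shader instructions that need
# 1, 2, 3 or 4 FMACC lanes. Only the 50% (width 1) and 10% (width 4)
# figures come from the mail; the 20%/20% split is an assumption.
MIX = {1: 0.50, 2: 0.20, 3: 0.20, 4: 0.10}


def useful_ops_per_cycle_simd(mix):
    """Useful FMACC operations per cycle on one SIMD core.

    The core issues one instruction per cycle, and only the lanes that
    the instruction actually needs do useful work.
    """
    return sum(width * share for width, share in mix.items())


def simd_utilization(mix, lanes=4):
    """Average fraction of the SIMD lanes doing useful work."""
    return useful_ops_per_cycle_simd(mix) / lanes


def scalar_speedup(mix, scalar_cores=3):
    """Throughput of N scalar cores relative to one 4-way SIMD core.

    Assumes the scalar cores are always busy (the work is embarrassingly
    parallel) and that they fit in the same area as the SIMD core.
    """
    return scalar_cores / useful_ops_per_cycle_simd(mix)


if __name__ == "__main__":
    print(f"4-way SIMD utilization: {simd_utilization(MIX):.3f}")
    print(f"3 scalar cores vs one 4-way core: {scalar_speedup(MIX):.2f}x")
```

With this made-up mix the 4-way core keeps a little under half of its FMACC units busy, and 3 scalar cores in the same area come out roughly 1.6 times faster, before counting any clock-speed gain or interconnect penalty.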
