> Nicolas Boulay wrote:
>>>>Yep. Why nobody like the idea to keep every thing simple with 4 scalar
>>>>core ?
>>>
>>>The control and routing hardware for doing vectors is simpler than
>>>doing four scalars at the same time.
>>
>> No because you need to access easly each vector member.
>>
>> So only difference, is to manage many core instead of one. But todays
>> Ati and Nvidia use at maximum 48 shader. So we will have to manage
>> many core in parrallel.
>
> No, you have misunderstood how the nVidia and ATI GPUs are
> designed.
>
> Each of those 48 shader units has a four-way vector ALU.

How can you be sure of this? Have you found any microarchitecture
documents from ATI or Nvidia? I have never found any.

> The management problem is dividing up each vertex of a
> triangle, or each fragment within a scan line, to be
> assigned to a different shader unit. Each of the shader
> units is executing it's own program. It's the same kind
> of multiprocessing problem as dividing up, say, SETI or
> similar parallel problems across many CPUs.
>
> What we are discussing is how to execute the instructions
> WITHIN a single shader unit.

We agree on this.

> The argument for vectors is that, with todays GPUs, the
> majority of the instructions are four-way vector ops to
> begin with.

No. Read the code. Most of the proposed ASM instructions are vector
ones. But when you read real code, you see a lot of MOVs, a lot of
partial vector use, scalar multiplies, etc. Maybe we should read more
shader code, but the examples shown here are full of scalar ops (so
this code does not fit well on a vector FPU).

> It's not like regular C/C++/Java code where
> the compiler has to try and identify vector-like sequences.
> In shader code, every vector value and vector operation is
> explicitly declared/written in the source code, so it's
> easy for the assembler/compiler to generate vector ops.

We were not talking about that at all. The question was: "How do you
optimise scalar code using vector arithmetic?" That question is not
easy. I proposed scalar shader units after reading the code posted
here, the two ARB documents, and some GLSL tutorials. The idea did not
come from nowhere.

> This also makes it easy to translate shaders into SSE or
> AltiVec code sequences on conventional CPUs, which is
> how various pure software pixel shader implementations
> get their performance.
>

One of the early posters on this list did that. Maybe we should
contact him again to get some real-world statistics on the code in
use.
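
To make "real-world statistics" concrete, here is a minimal sketch of
how they could be gathered. It assumes ARB-style assembly with one
instruction per line, and it uses the destination write mask as a
rough proxy for how many lanes an instruction uses (a DP4, for
instance, reads four lanes but writes only one). The names lane_stats,
utilization and SAMPLE are hypothetical, and the sample shader is made
up for illustration, not taken from any code posted on the list.

```python
import re
from collections import Counter

# Declarations look like "TEMP R0, R1;" and must not be counted as instructions.
DECLARATIONS = {"TEMP", "PARAM", "ATTRIB", "ADDRESS", "OUTPUT", "ALIAS", "OPTION"}

# Matches "OPCODE dest[.mask]," at the start of a line.
INSTR = re.compile(r"^\s*([A-Z][A-Z0-9_]*)\s+([\w\[\].]+?)(?:\.([xyzw]{1,4}))?\s*,")


def lane_stats(source):
    """Count instructions by the number of destination components written."""
    widths = Counter()
    for line in source.splitlines():
        line = line.split("#", 1)[0]  # drop '#' comments
        match = INSTR.match(line)
        if not match or match.group(1) in DECLARATIONS:
            continue
        mask = match.group(3)
        # No write mask means all four components are written.
        widths[len(mask) if mask else 4] += 1
    return widths


def utilization(widths):
    """Fraction of a 4-wide ALU's lanes that the write masks actually use."""
    total = sum(widths.values())
    if total == 0:
        return 0.0
    used = sum(width * count for width, count in widths.items())
    return used / (4 * total)


# Made-up example shader (an assumption, for illustration only).
SAMPLE = """\
!!ARBvp1.0
TEMP R0, R1;
DP4 R0.x, c[0], v0;
DP4 R0.y, c[1], v0;
MOV R1, v1;
MUL R1.xyz, R1, c[4].x;
ADD R0.zw, R0, R1;
MOV result.position, R0;
END
"""
```

Calling utilization(lane_stats(text)) on each shader posted to the
list would give one number per shader to compare. A value close to 1.0
would support the vector ALU argument, and a low value would support
scalar units. Write masks ignore how many source lanes are read, so
the figure is only a first approximation.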

Nicolas Boulay

> --
> Hugh Fisher
> DCS, ANU
>
_______________________________________________
Open-graphics mailing list
[email protected]
http://lists.duskglow.com/mailman/listinfo/open-graphics
List service provided by Duskglow Consulting, LLC (www.duskglow.com)
