Sun's UltraSPARC T1 (http://www.sun.com/processors/UltraSPARC-T1/) makes some interesting trade-offs. They ditched out-of-order execution and branch prediction, which makes for a much simpler (and thus smaller) processor core, and designed each core with 4 sets of registers and enough logic to keep track of up to 4 threads, executing instructions from one thread at a time, in order. Fetches from cache take 4 cycles, so by the time a thread rolls around again, its associated data is there, ready and waiting. A design like that, especially with multiple cores, trades some memory (in the form of extra registers) and some per-thread latency for nearly 100% CPU utilization, so long as you have enough threads.
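
To make the "enough threads" point concrete, here's a toy model of one such core. It's only a sketch under loud assumptions: every instruction is treated as a cache fetch with the 4-cycle latency mentioned above, the core issues at most one instruction per cycle, and it picks the next ready thread in round-robin order. The function name and parameters are made up for illustration and have nothing to do with Sun's actual pipeline.

```python
def utilization(num_threads: int, latency: int = 4, cycles: int = 1000) -> float:
    """Fraction of cycles in which the toy core issues an instruction.

    Assumption: every instruction is a cache fetch, so a thread that
    issues at cycle c cannot issue again until cycle c + latency.
    """
    ready_at = [0] * num_threads  # earliest cycle each thread may issue
    next_thread = 0               # round-robin pointer
    issued = 0
    for cycle in range(cycles):
        # Scan threads in round-robin order, starting after the last issuer.
        for offset in range(num_threads):
            t = (next_thread + offset) % num_threads
            if ready_at[t] <= cycle:
                ready_at[t] = cycle + latency  # thread now waits for its data
                next_thread = (t + 1) % num_threads
                issued += 1
                break
        # If no thread was ready, the cycle is simply left idle.
    return issued / cycles


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, "threads:", utilization(n))
```

In this model a single thread keeps the core busy only a quarter of the time, and utilization climbs by a quarter per extra thread until the fourth thread covers the whole latency window. Real code mixes in instructions that don't wait on memory and cache misses that wait much longer, so treat the numbers as an illustration of the idea, not a prediction.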
--tim
