Paul Brook wrote:
> By your own numbers, conventional 4-element SIMD gets <70% ALU utilization.
> That leaves quite a lot of scope for improvement, be it full MIMD or several
> ALU threads controlled by a single dispatch unit (a la Larrabee's SIMD with
> software thread combining).
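
For anyone following along, the utilization figure above is plain arithmetic: on a 4-lane SIMD unit that issues one instruction per operation, an N-wide operation keeps N of the 4 lanes busy. Here is a minimal sketch of that calculation. The instruction mix is made up for illustration and is not the measured data from this thread; the names simd_utilization and HYPOTHETICAL_MIX are my own, and operations wider than the lane count are not handled.

```python
def simd_utilization(width_mix, lanes=4):
    """Fraction of ALU lanes doing useful work.

    width_mix maps operand width (1 = scalar, 4 = full vector) to the
    number of operations of that width. Assumes each operation occupies
    one issue slot on a `lanes`-wide unit and no width exceeds `lanes`.
    """
    total_ops = sum(width_mix.values())
    useful_lanes = sum(width * count for width, count in width_mix.items())
    return useful_lanes / (lanes * total_ops)


# Hypothetical mix (NOT measured data): operation counts by vector width.
HYPOTHETICAL_MIX = {1: 30, 2: 10, 3: 30, 4: 30}

print(f"{simd_utilization(HYPOTHETICAL_MIX):.1%}")
```

With this made-up mix, 260 of 400 lane slots do useful work, i.e. 65%. A purely scalar load would sit at 25% and a purely 4-wide load at 100%, so the utilization number says directly how scalar or how vector the workload is.
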
I started this thread because the OGA2 designers were assuming
that GPU loads had become increasingly scalar. They're not, so
the design needs to be reconsidered.
I'm not opposed to MIMD full stop. If y'all can design a MIMD
processor that runs vector code fast, I'll be happy.
--
Hugh Fisher
CECS, ANU