On Mon, Aug 22, 2011 at 11:26 AM, Nicolas Boulay <[email protected]> wrote:
> "They *don't* run in parallel (as much as they
> appear), instead when each thread stalls (say, to do a memory lookup),
> another thread will run. It's just pipelined."
>
> This has the problem of removing access locality; the data should not
> be happy with such a system.
>
> I don't know how fast a cpu could be with lots of registers, lots of
> load/store addressing modes to avoid data realignment, and lots of basic
> types to easily decode pixel packing. The main goal would be to use the
> floating point multiplication unit on every cycle.

many of the "embedded" opengl systems use fixed point. there are plenty
of research papers, as well as real embedded GPUs, that show it's
possible [for opengl ES].

then, also, if i remember correctly, the DEC Alpha CPU didn't have
floating point: it had fixed-point primitives fast enough to emulate
floating point at the same kinds of speeds as dedicated FPUs. the key
primitives are 1s complement add and/or mantissa normalisation.

l.

_______________________________________________
Open-graphics mailing list
[email protected]
http://lists.duskglow.com/mailman/listinfo/open-graphics
List service provided by Duskglow Consulting, LLC (www.duskglow.com)
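
to make "mantissa normalisation" concrete, here is a toy sketch of a floating point multiply done with integer operations only. everything in it is an assumption for illustration: the 24-bit mantissa width, the (sign, mantissa, exponent) tuple layout and the names `normalise`, `soft_mul`, `from_int` and `to_float` are hypothetical, and python's `int.bit_length()` stands in for a hardware count-leading-zeros primitive. it truncates instead of rounding, ignores overflow and underflow, and does not show the 1s complement add. it is not how any particular CPU or soft-float library does it.

```python
MANT_BITS = 24  # hypothetical mantissa width, roughly single precision


def normalise(mant, exp):
    """Shift mant so its top set bit lands on bit MANT_BITS-1.

    exp is adjusted so that mant * 2**exp is unchanged, apart from
    low bits lost when shifting right (truncation, no rounding).
    """
    if mant == 0:
        return 0, 0
    # bit_length() stands in for a count-leading-zeros instruction
    shift = mant.bit_length() - MANT_BITS
    if shift > 0:
        return mant >> shift, exp + shift
    return mant << -shift, exp + shift


def soft_mul(a, b):
    """Multiply two (sign, mant, exp) triples using integer ops only."""
    sa, ma, ea = a
    sb, mb, eb = b
    # integer multiply of the mantissas, add the exponents, renormalise
    mant, exp = normalise(ma * mb, ea + eb)
    return sa ^ sb, mant, exp


def from_int(n):
    """Build a normalised (sign, mant, exp) triple from a Python int."""
    sign = 1 if n < 0 else 0
    mant, exp = normalise(abs(n), 0)
    return sign, mant, exp


def to_float(v):
    """Convert back to a native float, only to check results."""
    sign, mant, exp = v
    return (-1.0 if sign else 1.0) * mant * 2.0 ** exp
```

for example, `to_float(soft_mul(from_int(3), from_int(-5)))` gives `-15.0`: the two 24-bit mantissas multiply into a 47-bit product, and `normalise` shifts it back down to 24 bits while bumping the exponent to compensate.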
