On Wed, May 30, 2012 at 3:56 PM, Nicolas Boulay <[email protected]> wrote:
> If you want a general purpose shader, it's not very different of an
> array of RISC cpus with no memory coherency ?

Yes and no. It's RISC, but the instruction sets are different. Also,
while CPUs are designed to minimize latency, GPUs maximize throughput at
the expense of latency, which is why a GPU is so widely SMP and SMT.
You might want to read up on GPU architecture before continuing.

>
> To raise efficiency, you maybe have to use a LIW design to have the
> possibility to use the big FPU on every cycle. Then you need complex
> load/store unit that use complex addressing scheme to avoid losing
> time in 2D access. You also need enough register to never use RAM for
> temporary variable, and you don't have the task switch problem of cpu.

The drawback of LIW designs is that it is hard to fill all the slots,
which gives you poor utilization.

>
> To help on defining an instruction set :
> * orc instruction should be studied http://code.entropywave.com/orc/
> (it's a generic assembler for any SIMD instruction set)

But we have already argued that SIMD is not what we want to do because
of the poor internal utilization. Most instructions won't be 4vec.

> * the extension of SPARC V9 for the super computer of Fujitsu, it's
> mainly SIMD, but have some good idea like speed up polynomial calculus
> (polynomial are used to approximate any function)
>
> 2012/5/30 Timothy Normand Miller <[email protected]>:
>> To your list, I can add memory addresses, address calculations, and
>> coordinates.
>>
>> Do we need an integer multiply? We'll already have a multiplier block
>> for float, and we can route integers through that, but that still
>> means extra logic in the pipeline.
>>
>> We'll also need instructions for converting to/from float.
>>
>>
>> On Wed, May 30, 2012 at 2:43 PM, <[email protected]> wrote:
>>> On Wed, 30 May 2012 14:26:56 -0400, Timothy Normand Miller wrote:
>>>>
>>>> Someone pointed out to me that integer ops don't have much utility in
>>>> GPUs. Memory may be in integer form, but ARGB in 8888 format, for
>>>> instance, would get converted to/from vec4 format (or our equivalent)
>>>> as part of the memory access. If we have no need for integer ops,
>>>> that would simplify the architecture. So can anyone make any
>>>> arguments as to why it would be strictly necessary to support integer
>>>> operations?
>>>
>>>
>>> loop counters ?
>>> compute array indices ?
>>> helpful for crypto acceleration ?
>>> (we're speaking about making an energy-efficient
>>> system so if it means reusing it for other common
>>> tasks, great)
>>>
>>> just wondering.

--
Timothy Normand Miller
http://www.cse.ohio-state.edu/~millerti
Open Graphics Project
_______________________________________________
Open-graphics mailing list
[email protected]
http://lists.duskglow.com/mailman/listinfo/open-graphics
List service provided by Duskglow Consulting, LLC (www.duskglow.com)
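
For the ARGB 8888 to/from vec4 conversion mentioned in the quoted text,
here is a minimal C sketch of what the memory access path would have to
compute. Everything in it is an assumption made for illustration: the
vec4 struct, the function names, the normalization to [0, 1], and the
round-to-nearest on the way back are not taken from this thread or from
any OGP specification.

```c
#include <stdint.h>

/* Hypothetical four-component float vector, one channel per lane. */
typedef struct { float a, r, g, b; } vec4;

/* Unpack a 32-bit ARGB 8888 pixel into four floats in [0, 1]. */
static vec4 argb8888_to_vec4(uint32_t p)
{
    vec4 v;
    v.a = (float)((p >> 24) & 0xFFu) / 255.0f;
    v.r = (float)((p >> 16) & 0xFFu) / 255.0f;
    v.g = (float)((p >>  8) & 0xFFu) / 255.0f;
    v.b = (float)( p        & 0xFFu) / 255.0f;
    return v;
}

/* Clamp one channel to [0, 1] and round to the nearest 8-bit value.
   NaN and negative inputs map to 0. */
static uint32_t channel_to_u8(float c)
{
    if (!(c > 0.0f)) return 0u;
    if (c > 1.0f) c = 1.0f;
    return (uint32_t)(c * 255.0f + 0.5f);
}

/* Pack four floats back into a 32-bit ARGB 8888 pixel. */
static uint32_t vec4_to_argb8888(vec4 v)
{
    return (channel_to_u8(v.a) << 24) |
           (channel_to_u8(v.r) << 16) |
           (channel_to_u8(v.g) <<  8) |
            channel_to_u8(v.b);
}
```

Note that even this conversion needs shifts and masks somewhere, whether
in the load/store unit or as instructions, which is part of why the
question of integer support is not entirely settled by it.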
