MISC looks like a good idea but is not. Managing pipelined units and their latency is a horrible task: if your FMAC is a pipelined unit, you have exactly one time slot to read the output!
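
To make the timing problem concrete, here is a minimal Python sketch of a pipelined multiply-accumulate unit whose output port is overwritten every cycle. The class name `PipelinedFmac`, the 3-cycle latency and the operand values are assumptions chosen for illustration; they do not describe any real design.

```python
from collections import deque


class PipelinedFmac:
    """Toy model of a fully pipelined FMAC (hypothetical, for illustration)."""

    def __init__(self, latency=3):
        # One slot per pipeline stage; None marks an empty stage (a bubble).
        self.stages = deque([None] * latency)
        # The output port: rewritten on every clock cycle.
        self.output = None

    def tick(self, operands=None):
        """Advance one clock cycle, optionally issuing a new (a, b, c)."""
        # Whatever leaves the last stage replaces the previous output,
        # whether or not anybody read that previous output.
        self.output = self.stages.pop()
        result = None
        if operands is not None:
            a, b, c = operands
            result = a * b + c
        self.stages.appendleft(result)
        return self.output


fmac = PipelinedFmac(latency=3)
trace = [
    fmac.tick((2.0, 3.0, 1.0)),  # cycle 0: issue 2*3+1
    fmac.tick((4.0, 5.0, 0.0)),  # cycle 1: issue 4*5+0
    fmac.tick(),                 # cycle 2: nothing visible yet
    fmac.tick(),                 # cycle 3: first result is on the port
    fmac.tick(),                 # cycle 4: second result has replaced it
    fmac.tick(),                 # cycle 5: pipeline drained
]
print(trace)
```

In this model each result sits on the output port for a single cycle. With a MISC-style "move the result out of the unit" instruction, the program (or the compiler) has to schedule that move on exactly the right cycle, for every unit and every latency.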

If I remember correctly, shader programs use only ~256 instructions. So you could design an instruction set that looks like µcode or VLIW code. Such a shader should be small and fast. If you need more power, put in two of them.

As I see it, the instruction word has three parts: one for JUMP/LOOP management; one for computation (very complete: ALU, dot product, MIN/MAX, CLAMP..., all in 1 cycle), where you could have 4 register reads and 2 writes, since that much ILP is easy to find in typical code; and a third that manages loads and stores with complex addressing modes (for 1D, 2D and 3D data, ...). You could also add a few bits for predicates. This gives a cheap way to implement if-clauses.

In the end, you will have a long instruction word (>100 bits) and a few register sets: one with at least 32*4 32-bit floating-point words, one for address calculation, and maybe one for holding data read from memory (a write port is very costly). One instruction can then perform a load or a store, a computation, and a jump. It is important to keep the computation unit doing useful work all the time. Normally it is the largest unit in the shader design.

Nicolas Boulay

> On 4/17/06, Tom Cook <[EMAIL PROTECTED]> wrote:
>
>> This probably sounds dumb, in fact it probably is dumb, but I don't
>> really see the need for an instruction op-code at all? If EVERYTHING
>> the GPU has access to is memory mapped, why not just make each
>> "instruction" a pair of addresses? (or a set of addresses, however
>> wide you want your processor to be).
>
> That is the basic idea behind MISC. Everything is done via
> special-purpose registers.
>
> But as I said before, all you're really doing is encoding the opcode
> into the register index. And think about this for a moment:
>
> In a MISC design, if you want to add, you specify the source operands
> (probably general purpose registers) to copy into "input registers"
> for your adder. That's two moves (which you can do in one
> instruction).
> Later, you can pop out the result and move it back to a
> GPR. (Another move.) So here's your code:
>
>     mov rA0 <- r1, rA1 <- r2
>     ...
>     mov r3 <- rA2
>
> Let's say you have a register space of 256 registers, so each
> instruction takes 16 bits. The two together require 32 bits.
>
> Now, let's consider a RISC design. In this case, you don't need so
> many registers, just the GPRs:
>
>     add r3 <- r1,r2
>
> If the add opcode is 4 bits and the three operands are each 4 bits,
> then you need 16 bits to encode this.
>
> The point to take is that you need twice as many bits to encode the
> same "instruction". With MISC, all you've done is move the upper
> nybble of the add ports (0xA) from the three operands into the one
> opcode.
>
> It may very well be worth it to use the extra bits (something I've
> hinted at earlier), but keep in mind where your redundancies are and
> make sure they're a net gain.
>
> Oh, one other thing: Compilers have a hard time with special-purpose
> registers.

_______________________________________________
Open-graphics mailing list
[email protected]
http://lists.duskglow.com/mailman/listinfo/open-graphics
List service provided by Duskglow Consulting, LLC (www.duskglow.com)
