MISC is a false good idea. Managing pipelined units and their latency is a
horrible task: if your FMAC is a pipelined unit, you have exactly one
time slot to read the output!
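
To make the timing problem concrete, here is a toy Python model of a
pipelined FMAC with no buffering on its output register. The class name,
the 3-cycle latency and the operand values are assumptions made for
illustration only, not part of any real design.

```python
from collections import deque


class PipelinedFmac:
    """Toy model of a fully pipelined FMAC whose output is not buffered."""

    def __init__(self, latency=3):
        # One entry per pipeline stage; None marks an empty stage (a bubble).
        self.stages = deque([None] * latency)
        self.out = None

    def tick(self, operands=None):
        """Advance one clock. operands is an (a, b, c) tuple or None."""
        result = None
        if operands is not None:
            a, b, c = operands
            result = a * b + c
        self.stages.appendleft(result)
        # The oldest stage falls into the output register and overwrites
        # whatever was there on the previous cycle.
        self.out = self.stages.pop()
        return self.out


def run_demo():
    fmac = PipelinedFmac(latency=3)
    seen = [fmac.tick((2.0, 3.0, 1.0))]      # issue 2*3+1 on cycle 0
    seen += [fmac.tick() for _ in range(4)]  # then let the pipeline drain
    return seen
```

In this model the result is visible on exactly one cycle (cycle 3). A MISC
program has to schedule its "move from output register" on that cycle, or
the value is gone.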

If I remember correctly, shader programs use only ~256 instructions, so you
could design an instruction set that looks like µcode or VLIW code.

Such a shader should be small and fast. If you need more power, put in 2 of
them.

As I see it, the instruction word has a part for JUMP/LOOP management and a
part for computation (very complete: ALU, dot product, MIN/MAX, CLAMP...,
all in 1 cycle). You could put 4 register reads and 2 writes here; that much
ILP is easy to find in typical code. A third part manages load & store with
complex addressing modes (for 1D, 2D and 3D data, ...). You could also add
some bits for predicate calculation, which gives a cheap way to implement
if-clauses.
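
As a rough sketch of how predicate bits replace a branch, here is a small
Python model that computes an absolute value with no jump at all. The four
predicate registers, the (guard, negate, action) tuple layout and the
register names are assumptions made for this example.

```python
def run(program, regs):
    """Execute a straight-line list of predicated instructions."""
    preds = [False] * 4  # assumed: four 1-bit predicate registers
    for guard, negate, action in program:
        # guard is None for unconditional instructions. Otherwise the
        # instruction is squashed when the predicate does not match; it
        # still occupies its slot, but no jump is needed.
        if guard is not None and preds[guard] == negate:
            continue
        action(regs, preds)
    return regs


def set_p0_if_negative(regs, preds):
    preds[0] = regs["r0"] < 0.0


def negate_r0(regs, preds):
    regs["r1"] = -regs["r0"]


def copy_r0(regs, preds):
    regs["r1"] = regs["r0"]


# if (r0 < 0) r1 = -r0; else r1 = r0;
ABS_PROGRAM = [
    (None, False, set_p0_if_negative),  # p0 = (r0 < 0)
    (0, False, negate_r0),              # (p0)  r1 = -r0
    (0, True, copy_r0),                 # (!p0) r1 = r0
]
```

Both arms of the "if" are always fetched, so the cost is fixed and the
pipeline never has to be flushed.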

In the end, you will have a long instruction word (>100 bits) and a few
register sets: one with at least 32*4 32-bit floating-point words, one for
address calculation, and maybe one for holding data read from memory (a
write port is very costly).
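
A quick back-of-the-envelope check of the ">100 bits" figure, in Python.
Only three numbers come from the text above: the 8-bit jump target (~256
instructions), the 5-bit register index (32 vector registers) and the 4
reads / 2 writes. Every other field and width is a guess made for this
sketch.

```python
# Hypothetical field widths, in bits, for the three parts of the word.
SLOTS = {
    "control": {
        "jump_op": 3,       # guess: none / jump / loop / call ...
        "jump_target": 8,   # ~256 instructions
        "predicate": 4,     # guess: predicate select and negate bits
    },
    "compute": {
        "opcode": 6,                # guess
        "src_regs": 4 * 5,          # 4 reads, 32 vector registers
        "dst_regs": 2 * 5,          # 2 writes
        "src_swizzles": 4 * 8,      # guess: 2 bits per component, 4 components
        "dst_write_masks": 2 * 4,   # guess: 1 bit per component
    },
    "memory": {
        "mem_op": 2,        # guess: none / load / store
        "addr_mode": 3,     # guess: 1D, 2D, 3D, ...
        "addr_reg": 4,      # guess: 16 address registers
        "data_reg": 5,
        "offset": 16,       # guess: immediate offset
    },
}


def slot_bits(slot):
    """Width of one part of the instruction word."""
    return sum(SLOTS[slot].values())


def word_bits():
    """Width of the whole instruction word."""
    return sum(slot_bits(slot) for slot in SLOTS)
```

With these guesses the word comes to 121 bits (15 control, 76 compute, 30
memory), and most of it goes to the computation part.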

One instruction could perform a load or a store, a computation and a jump.
It's important to keep the computation unit always doing useful work.
Normally, it's the largest unit of the shader design.
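
Here is a minimal Python sketch of such an instruction at work: a one-word
loop that sums the squares of an array, with the load, the multiply-add and
the loop jump all issued together. The slot names, the state layout and the
rule that every slot reads the state from before the cycle are assumptions
of this toy model.

```python
def run(program, memory):
    """Run a list of (mem_slot, alu_slot, jump_slot) words.

    memory must hold at least one element.
    """
    state = {"pc": 0, "acc": 0.0, "ld": 0.0, "addr": 0, "count": len(memory)}
    cycles = 0
    while state["pc"] < len(program):
        mem_slot, alu_slot, jump_slot = program[state["pc"]]
        old = dict(state)  # every slot reads the state from before this cycle
        state["pc"] = old["pc"] + 1
        if mem_slot == "load":
            state["ld"] = memory[old["addr"]]
            state["addr"] = old["addr"] + 1
        if alu_slot == "mac":
            # Uses the value loaded on the PREVIOUS cycle.
            state["acc"] = old["acc"] + old["ld"] * old["ld"]
        if jump_slot == "loop":
            state["count"] = old["count"] - 1
            if state["count"] > 0:
                state["pc"] = old["pc"]  # repeat this same word
        cycles += 1
    return state["acc"], cycles


PROGRAM = [
    ("load", "mac", "loop"),  # steady state: all three parts busy
    (None, "mac", None),      # drain: consume the last loaded value
]
```

For a 3-element array this takes 4 cycles. The load of element i overlaps
the multiply-add of element i-1, so the computation part is idle only on
the very first cycle.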

Nicolas Boulay

> On 4/17/06, Tom Cook <[EMAIL PROTECTED]> wrote:
>
>> This probably sounds dumb, in fact it probably is dumb, but I don't really
>> see the need for an instruction op-code at all?  If EVERYTHING the GPU has
>> access to is memory mapped, why not just make each "instruction" a pair of
>> addresses?  (or a set of addresses, however wide you want your processor to
>> be).
>
> That is the basic idea behind MISC.  Everything is done via
> special-purpose registers.
>
> But as I said before, all you're really doing is encoding the opcode
> into the register index.  And think about this for a moment:
>
> In a MISC design, if you want to add, you specify the source operands
> (probably general purpose registers) to copy into "input registers"
> for your adder.  That's two moves (which you can do in one
> instruction).  Later, you can pop out the result and move it back to a
> GPR.  (Another move.)  So here's your code:
>
> mov rA0 <- r1, rA1 <- r2
> ...
> mov r3 <- rA2
>
> Let's say you have a register space of 256 registers, so each
> instruction takes 16 bits.  The two together require 32 bits.
>
> Now, let's consider a RISC design.  In this case, you don't need so
> many registers, just the GPRs:
>
> add r3 <- r1,r2
>
> If the add opcode is 4 bits and the three operands are each 4 bits,
> then you need 16 bits to encode this.
>
> The point to take is that you need twice as many bits to encode the
> same "instruction".  With misc, all you've done is move the upper
> nybble of the add ports (0xA) from the three operands into the one
> opcode.
>
> It may very well be worth it to use the extra bits (something I've
> hinted at earlier), but keep in mind where your redundancies are and
> make sure they're a net gain.
>
>
> Oh, one other thing:  Compilers have a hard time with special-purpose
> registers.
> _______________________________________________
> Open-graphics mailing list
> [email protected]
> http://lists.duskglow.com/mailman/listinfo/open-graphics
> List service provided by Duskglow Consulting, LLC (www.duskglow.com)
>

