Re: [Open-graphics] Designing a CPU

Timothy Normand Miller Sat, 17 Mar 2007 11:17:17 -0800

On 3/17/07, Stephen Pollei <[EMAIL PROTECTED]> wrote:

On 3/16/07, Timothy Normand Miller <[EMAIL PROTECTED]> wrote:
> It won't be long before we'll have to design a nanocontroller for OGD1
> to manage VGA and DMA.  I may be able to just go off and design one
> myself, but I think that many of you would fancy observing and
> participating in the design process, and with more brains on it, we'd
> do a better job.


Sounds very interesting, do you want an assembler for it as well? Or
do you just want to use machine code to run it?


We'll definitely want an assembler.  It'll be too hard to code otherwise.

Since you are basing it off a mips design do you want to at least use
a subset of the mips mnemonics?


Yes, absolutely.

http://www.mrc.uidaho.edu/mrc/people/jff/digital/MIPSir.html
http://www.xs4all.nl/~vhouten/mipsel/r3000-isa.html
http://userpages.umbc.edu/~abhishek/cmsc411_slides/mips_instrucitons.pdf
http://en.wikipedia.org/wiki/MIPS_architecture


>  We'll have
> 32 registers, so we need 5-bit fields.  We need fields in the
> instruction for two source operands and one destination operand (that
> will get used in a later pipeline stage).

OK, so the register set is 32 entries. That's 5 bits to address the
whole set. Two source operands and one destination operand make 15
bits for register addressing, leaving 17 bits or so for the rest of
the instruction. You have to play with the bits some if you sometimes
store immediates and sometimes registers in the same bit space. Do you
just want to copy the way MIPS encodes its instructions?


I figured we'd work our way back to there, yes.  Jim Dinan was
suggesting PISA, I believe.  We just need to be sure not to fixate on
that.  I chose a MIPS arch because it's very simple.  A non-pipelined
design could require almost as much logic, so we might as well
pipeline it.
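
As a rough sketch of the field arithmetic above, here is how three 5-bit
register numbers could be packed into a 32-bit word.  The 6-bit opcode
width, the field positions, and the names encode_rrr/decode_rrr are
assumptions for illustration (loosely following the MIPS R-type
ordering), not a decided format.

```python
REG_BITS = 5       # 32 registers -> 5-bit register fields
OPCODE_BITS = 6    # assumed width; nothing in this thread fixes it

def encode_rrr(opcode, rs, rt, rd):
    """Pack an opcode and three register numbers into a 32-bit word."""
    if not 0 <= opcode < (1 << OPCODE_BITS):
        raise ValueError("opcode out of range")
    for reg in (rs, rt, rd):
        if not 0 <= reg < (1 << REG_BITS):
            raise ValueError("register number out of range")
    # Assumed layout: opcode[31:26] rs[25:21] rt[20:16] rd[15:11] spare[10:0]
    return (opcode << 26) | (rs << 21) | (rt << 16) | (rd << 11)

def decode_rrr(word):
    """Unpack the fields written by encode_rrr."""
    return ((word >> 26) & 0x3F,
            (word >> 21) & 0x1F,
            (word >> 16) & 0x1F,
            (word >> 11) & 0x1F)

word = encode_rrr(opcode=3, rs=1, rt=2, rd=31)
print(hex(word), decode_rrr(word))
```

With these assumed widths, 15 register bits plus 6 opcode bits leave 11
bits spare in a 32-bit word.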


> This is also where we need to deal with branches.  If the instruction
> is a branch, the condition needs to be resolved, and the address needs
> to be fed back to stage (1).  This is why RISC processors typically
> have a delayed branch.  The possible branch conditions are reg-value=0
> and reg-value!=0.
BEQ -- Branch on equal
BNE -- Branch on not equal
BGEZ, BGEZAL, BGTZ, BLEZ, BLTZ, BLTZAL from mips not used?


I'm not sure what those do, but I'm working from the simplest MIPS
model from the textbook.  We need to strike a balance between
functionality and logic area.  In the general case, we need to be able
to keep up with the dataflow from PCI at 66MHz.  If the CPU runs at,
say, 100MHz, we can do that if we use lots of unrolled loops for data
movement.  Instructions that don't help us with that are just not
needed.
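
The cycle budget behind that can be worked through with a small
calculation.  This is only a sketch: it assumes one instruction per CPU
cycle, a peak rate of one 32-bit word per PCI clock, and two
instructions of loop overhead (counter update and branch) per trip
around the loop.  The function names are made up for illustration.

```python
from fractions import Fraction

def cycle_budget(cpu_mhz, bus_mhz):
    """CPU cycles available per bus word, assuming one word per bus clock."""
    return Fraction(cpu_mhz, bus_mhz)

def instructions_per_word(unroll, per_word, loop_overhead=2):
    """Average instructions per word for a loop body unrolled `unroll` times."""
    return Fraction(per_word * unroll + loop_overhead, unroll)

budget = cycle_budget(100, 66)
print("budget:", float(budget))
for unroll in (1, 4, 8, 16):
    two = instructions_per_word(unroll, per_word=2)  # separate load and store
    one = instructions_per_word(unroll, per_word=1)  # one move instruction
    print(unroll, float(two), two <= budget, float(one), one <= budget)
```

Under these assumptions the budget is about 1.5 cycles per word.  A
copy that needs a separate load and store never drops below two
instructions per word, while a one-instruction-per-word move fits from
an unroll factor of four.  The numbers depend entirely on the assumed
instruction counts and on the peak-rate assumption.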


J -- Jump
JAL -- Jump and link
JR -- Jump register
You might not want some of these J's, if you want it real simple.


Yeah.  I figured we'd have a CALL-like instruction that puts the
return pointer into a register.  Return would just be to jump to an
address contained in a register.

Since our addresses are only 9 bits, that gives us some freedom that
MIPS didn't have.  We should probably reserve 10 or 11 bits to
future-proof it, but since this is very special-purpose, we shouldn't
be afraid to use a completely different ISA in a future product.
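
A minimal sketch of the CALL/return idea with 9-bit addresses follows.
It assumes word-addressed program memory, ignores branch delay slots,
and uses register 31 as the link register purely by analogy with MIPS;
none of that is decided here.

```python
ADDR_BITS = 9
ADDR_MASK = (1 << ADDR_BITS) - 1   # 512-word program space

def call(pc, target, regs, link_reg=31):
    """Save the address of the next instruction, then jump to target."""
    regs[link_reg] = (pc + 1) & ADDR_MASK   # delay slots ignored here
    return target & ADDR_MASK               # new program counter

def ret(regs, link_reg=31):
    """Return is just a jump to the address held in a register."""
    return regs[link_reg] & ADDR_MASK

regs = [0] * 32
pc = call(pc=10, target=100, regs=regs)
print(pc, regs[31])   # now executing at the subroutine
pc = ret(regs)
print(pc)             # back at the instruction after the call
```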

>
> (3) ALU
> Here, the numbers fetched from registers in stage (2) are combined
> based on an opcode in the instruction.  ALU operations include add,
> subtract, shift, multiply (using dedicated multiplier logic), and
> bitwise logical operations.
ADD -- Add
ADDI -- Add immediate
ADDIU -- Add immediate unsigned
ADDU -- Add unsigned
AND -- Bitwise and
ANDI -- Bitwise and immediate
DIV -- Divide
DIVU -- Divide unsigned
MULT, MULTU, OR, ORI, SLL, SLLV, SRA, SRL, SRLV, SUB, SUBU, XOR, XORI
I suppose you are happy with this list?


Yeah, but ditch the divide.  It's a multi-cycle instruction.  In its
place, we can add some single-cycle instructions that assist with
divides.  Early SPARCs didn't have multiply, but they did have
multiply step instructions.  We can do multiply, though, because the
FPGAs and ASICs have dedicated multiplier circuits.
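
To illustrate what a single-cycle divide-assist instruction might do,
here is a sketch of one restoring-division step, with software issuing
it 32 times.  The step's exact semantics and the names divide_step and
udiv32 are assumptions; this is one textbook scheme, not a proposal for
the actual instruction.

```python
MASK32 = 0xFFFFFFFF

def divide_step(rem, quo, divisor):
    """One restoring-division step on the (rem:quo) register pair."""
    # Shift the pair left one bit: the top bit of quo moves into rem.
    rem = (rem << 1) | (quo >> 31)
    quo = (quo << 1) & MASK32
    # Trial subtract: keep the result only if it does not go negative.
    if rem >= divisor:
        rem -= divisor
        quo |= 1
    return rem, quo

def udiv32(dividend, divisor):
    """Unsigned 32-bit divide built from 32 divide steps."""
    if divisor == 0:
        raise ZeroDivisionError("divide by zero")
    rem, quo = 0, dividend & MASK32
    for _ in range(32):
        rem, quo = divide_step(rem, quo, divisor)
    return quo, rem

print(udiv32(100, 7))
```

In hardware the trial subtract needs one extra bit of width, since the
shifted remainder can briefly exceed 32 bits when the divisor is large.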

BTW, does MIPS have a shift-and-add instruction?

MFHI -- Move from HI
MFLO -- Move from LO
might be considered really simple arithmetic operations as well.
A full move is just an ADDI with an immediate equal to zero.

> Comparisons are done in the ALU.  The subtract instruction is used for
> equal/not-equal comparisons.  In addition, we'll provide signed and
> unsigned less-than instructions.
SLT -- Set on less than (signed)
SLTI -- Set on less than immediate (signed)
SLTIU -- Set on less than immediate unsigned
SLTU -- Set on less than unsigned


Yup.
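
The difference between the signed and unsigned compares can be seen in
a short sketch on 32-bit values.  The function names follow the MIPS
mnemonics; to_signed32 is a helper introduced only for this example.

```python
MASK32 = 0xFFFFFFFF

def to_signed32(x):
    """Interpret the low 32 bits of x as a two's-complement value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def slt(a, b):
    """Set on less than, signed: 1 if a < b, else 0."""
    return 1 if to_signed32(a) < to_signed32(b) else 0

def sltu(a, b):
    """Set on less than, unsigned: 1 if a < b, else 0."""
    return 1 if (a & MASK32) < (b & MASK32) else 0

# 0xFFFFFFFF is -1 when signed but the largest value when unsigned.
print(slt(0xFFFFFFFF, 1), sltu(0xFFFFFFFF, 1))
```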


> (4) Memory access and I/O
> This is the stage where we take an address computed above and read or
> write our local memory.  Our "local" memory is actually another
> 512-word block RAM, that we'll use as scratch space.

LB -- Load byte
LUI -- Load upper immediate
LW -- Load word
SB -- Store byte
SW -- Store word


We don't need any byte or 16-bit instructions.  OGA is spec'd out to
not understand anything but 32-bit pixels anyhow (we'll provide ways
to pretend to do 8-bit, but that's a separate discussion).  So if the
CPU needs to process 8-bit words, we'll just have to act like Alpha
and use extra code.
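
The "extra code" for 8-bit data amounts to shifts and masks on whole
words.  A sketch, assuming byte 0 is the least significant byte (the
byte numbering and function names are assumptions for illustration):

```python
MASK32 = 0xFFFFFFFF

def extract_byte(word, index):
    """Return byte `index` (0 = least significant) of a 32-bit word."""
    return (word >> (8 * index)) & 0xFF

def insert_byte(word, index, value):
    """Return `word` with byte `index` replaced by `value`."""
    shift = 8 * index
    cleared = word & ~(0xFF << shift) & MASK32   # zero the target byte
    return cleared | ((value & 0xFF) << shift)

w = 0x11223344
print(hex(extract_byte(w, 0)), hex(insert_byte(w, 1, 0xAB)))
```

A byte store therefore becomes a load, a mask, an OR, and a word store.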

>
> I believe the MIPS processor uses the ALU to add the contents of one
> register to a short immediate value stored in the instruction, and
> that's used as the address.  We should do the same.  That makes it so
> that the only memory addressing mode is reg-value + offset.

Yes, and immediate values are 16 bits. However, if we have only 512
bytes of RAM then you only really need 9 bits. Only J and JAL take
immediates larger than 16 bits: theirs are 26 bits.


512 32-bit words.

This will either make it easier to code the instruction, or give us
more freedom.
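
A sketch of the single reg-value + offset addressing mode against a
512-word scratch memory follows.  It assumes word addressing (unlike
MIPS byte addressing) and that out-of-range addresses simply wrap; both
are assumptions, as are the function names.

```python
MEM_WORDS = 512
ADDR_MASK = MEM_WORDS - 1   # 9 address bits

def effective_address(base, offset):
    """Register value plus immediate offset, wrapped to 9 bits."""
    return (base + offset) & ADDR_MASK

def load_word(mem, regs, base_reg, offset):
    """Read the word at regs[base_reg] + offset."""
    return mem[effective_address(regs[base_reg], offset)]

def store_word(mem, regs, base_reg, offset, value):
    """Write a 32-bit value at regs[base_reg] + offset."""
    mem[effective_address(regs[base_reg], offset)] = value & 0xFFFFFFFF

mem = [0] * MEM_WORDS
regs = [0] * 32
regs[4] = 500
store_word(mem, regs, 4, 3, 0xDEADBEEF)
print(hex(load_word(mem, regs, 4, 3)))
```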


> In addition, this is also the stage where we'll want to do other
> I/O-related operations, such as providing access to real graphics
> memory and controlling other aspects of the GPU that are accessible by
> this processor.  We'll make that available, to appear as another
> 512-word space (or more or less as necessary) or read-only and
> write-only "memory locations".

OK, I don't know what you want I/O instructions to look like.


They can look like memory load/store instructions.  But there are also
cases where we might want a single instruction to cause a word to be
popped straight from one fifo and pushed into another.

Also, we need ways to handle writing to full fifos and reading from
empty ones.  One way would be to have the instruction block, but being
able to pause the pipeline adds extra logic and complexity that we
might like to avoid.
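
One non-blocking alternative, sketched below, is for the move
instruction to do nothing when it cannot proceed and to report a status
flag that software tests with a branch.  This is only one possible
scheme; the Fifo class and try_move are hypothetical names for this
example.

```python
from collections import deque

class Fifo:
    """Fixed-depth FIFO model with full/empty status."""
    def __init__(self, depth):
        self.depth = depth
        self.items = deque()

    def full(self):
        return len(self.items) >= self.depth

    def empty(self):
        return not self.items

def try_move(src, dst):
    """Pop one word from src and push it to dst, if both can proceed.

    Returns True on success.  On failure neither FIFO changes, so the
    pipeline never has to stall; software retries instead.
    """
    if src.empty() or dst.full():
        return False
    dst.items.append(src.items.popleft())
    return True

src, dst = Fifo(4), Fifo(1)
src.items.extend([10, 20])
print(try_move(src, dst), try_move(src, dst))   # second move finds dst full
```

The cost is an extra branch per word in the retry loop, which trades
against the stall logic a blocking instruction would need.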



--
Timothy Normand Miller
http://www.cse.ohio-state.edu/~millerti
Favorite book:  The Design of Everyday Things, Donald A. Norman, ISBN
0-465-06710-7
_______________________________________________
Open-graphics mailing list
[email protected]
http://lists.duskglow.com/mailman/listinfo/open-graphics
List service provided by Duskglow Consulting, LLC (www.duskglow.com)
