On 3/16/07, Timothy Normand Miller <[EMAIL PROTECTED]> wrote:
It won't be long before we'll have to design a nanocontroller for OGD1 to manage VGA and DMA. I may be able to just go off and design one myself, but I think that many of you would fancy observing and participating in the design process, and with more brains on it, we'd do a better job.
Sounds very interesting. Do you want an assembler for it as well, or do you just want to write machine code for it? Since you are basing it on a MIPS design, do you want to at least use a subset of the MIPS mnemonics?
http://www.mrc.uidaho.edu/mrc/people/jff/digital/MIPSir.html
http://www.xs4all.nl/~vhouten/mipsel/r3000-isa.html
http://userpages.umbc.edu/~abhishek/cmsc411_slides/mips_instrucitons.pdf
http://en.wikipedia.org/wiki/MIPS_architecture
If you want a good textbook on this, look for "Computer Architecture: A Quantitative Approach" by Hennessy and Patterson. Here's how we'll break up our processor pipeline, deviating slightly from the MIPS template described in that book.

(1) Instruction fetch

Here, you have an instruction pointer that indicates the address of the next instruction to execute. In our processor, our instructions are stored in a local static RAM inside of the FPGA, so there is no need for any sort of "cache miss" logic. With an address, you are guaranteed to get an instruction immediately on the next cycle. Our instructions are 32 bits wide. (We could go to 36 bits if we find it helpful.)
We'll have 32 registers, so we need 5-bit fields. We need fields in the instruction for two source operands and one destination operand (that will get used in a later pipeline stage).
OK, so the register set is 32 entries. That's 5 bits to address the whole set. Two source operands and one destination operand come to 15 bits for register addressing, which leaves 17 bits or so for the rest of the instruction. You have to play with the bits some if you sometimes store immediates and sometimes registers in the same bit space. Do you just want to copy the way MIPS encodes its instructions?
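For reference, the standard MIPS R-type layout (6-bit opcode, three 5-bit register fields, 5-bit shift amount, 6-bit function code) can be sketched as below. Whether the nanocontroller copies this layout is still an open question in the thread; the helper names `encode_rtype` and `decode_rtype` are made up for illustration.

```python
# Sketch of the standard MIPS R-type word:
#   opcode(6) | rs(5) | rt(5) | rd(5) | shamt(5) | funct(6)  = 32 bits
# Helper names are hypothetical; the nanocontroller encoding is not settled.

FIELDS = (("opcode", 6), ("rs", 5), ("rt", 5), ("rd", 5), ("shamt", 5), ("funct", 6))


def encode_rtype(opcode, rs, rt, rd, shamt, funct):
    """Pack the six fields into one 32-bit instruction word."""
    values = (opcode, rs, rt, rd, shamt, funct)
    word = 0
    for (name, width), value in zip(FIELDS, values):
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word = (word << width) | value
    return word


def decode_rtype(word):
    """Unpack a 32-bit word back into a dict of fields."""
    fields = {}
    shift = 32
    for name, width in FIELDS:
        shift -= width
        fields[name] = (word >> shift) & ((1 << width) - 1)
    return fields


# ADD $3, $1, $2 in standard MIPS: opcode 0, funct 0x20.
word = encode_rtype(0, 1, 2, 3, 0, 0x20)
print(hex(word), decode_rtype(word))
```

The three register fields take 15 bits, and the opcode, shift amount, and function code share the remaining 17.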
This is also where we need to deal with branches. If the instruction is a branch, the condition needs to be resolved, and the address needs to be fed back to stage (1). This is why RISC processors typically have a delayed branch. The possible branch conditions are reg-value=0 and reg-value!=0.
BEQ -- Branch on equal
BNE -- Branch on not equal
BGEZ, BGEZAL, BGTZ, BLEZ, BLTZ, BLTZAL from MIPS not used?
J -- Jump
JAL -- Jump and link
JR -- Jump register
You might not want some of these J's, if you want it real simple.
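To make the delayed branch concrete, here is a rough software model, assuming a single delay slot: the instruction fetched right after a branch always executes, and only then does the branch target take effect. The mnemonics `beqz` and `bnez` (for the reg-value=0 and reg-value!=0 conditions) and the tuple-based program format are invented for this sketch.

```python
# Toy model of a one-slot delayed branch. Instruction format and the
# "beqz"/"bnez" names are assumptions made for this sketch only.

def run(program, max_steps=1000):
    regs = [0] * 32          # r0 is simply never written in the example below
    pc, next_pc = 0, 1       # next_pc is the instruction already being fetched
    for _ in range(max_steps):
        op = program[pc]
        target = next_pc + 1  # default: fall through past the fetched instruction
        if op[0] == "halt":
            break
        if op[0] == "addi":
            _, rd, rs, imm = op
            regs[rd] = (regs[rs] + imm) & 0xFFFFFFFF
        elif op[0] == "beqz":
            _, rs, addr = op
            if regs[rs] == 0:
                target = addr
        elif op[0] == "bnez":
            _, rs, addr = op
            if regs[rs] != 0:
                target = addr
        # The instruction at next_pc (the delay slot) runs before the target.
        pc, next_pc = next_pc, target
    return regs


PROGRAM = [
    ("addi", 1, 0, 3),    # 0: r1 = 3 (loop counter)
    ("addi", 2, 2, 1),    # 1: loop body: r2 += 1
    ("addi", 1, 1, -1),   # 2: r1 -= 1
    ("bnez", 1, 1),       # 3: branch to 1 while r1 != 0
    ("addi", 3, 3, 1),    # 4: delay slot: runs whether or not the branch is taken
    ("halt",),            # 5
]

regs = run(PROGRAM)
print(regs[1], regs[2], regs[3])
```

The loop body runs three times, and so does the delay-slot instruction, including once after the final not-taken branch.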
(3) ALU

Here, the numbers fetched from registers in stage (2) are combined based on an opcode in the instruction. ALU operations include add, subtract, shift, multiply (using dedicated multiplier logic), and bitwise logical operations.
ADD -- Add
ADDI -- Add immediate
ADDIU -- Add immediate unsigned
ADDU -- Add unsigned
AND -- Bitwise and
ANDI -- Bitwise and immediate
DIV -- Divide
DIVU -- Divide unsigned
MULT, MULTU, OR, ORI, SLL, SLLV, SRA, SRL, SRLV, SUB, SUBU, XOR, XORI
I suppose you are happy with this list?
MFHI -- Move from HI
MFLO -- Move from LO
These might be considered real simple arithmetic operations as well. A full move is just an ADDI with an immediate equal to zero.
Comparisons are done in the ALU. The subtract instruction is used for equal/not-equal comparisons. In addition, we'll provide signed and unsigned less-than instructions.
SLT -- Set on less than (signed)
SLTI -- Set on less than immediate (signed)
SLTIU -- Set on less than immediate unsigned
SLTU -- Set on less than unsigned
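The difference between the signed and unsigned less-than, and the use of subtract for equality, can be illustrated with a small sketch on 32-bit values. The function names are mine and only model the arithmetic, not any actual ALU implementation.

```python
# Sketch of 32-bit comparison semantics; function names are illustrative only.

MASK32 = 0xFFFFFFFF


def to_signed(x):
    """Interpret a 32-bit pattern as a two's-complement signed integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def slt(a, b):
    """Set on less than, treating both operands as signed."""
    return 1 if to_signed(a) < to_signed(b) else 0


def sltu(a, b):
    """Set on less than, treating both operands as unsigned."""
    return 1 if (a & MASK32) < (b & MASK32) else 0


def is_equal(a, b):
    """Equality via subtract: a == b exactly when (a - b) mod 2**32 is zero."""
    return ((a - b) & MASK32) == 0


minus_one = 0xFFFFFFFF  # -1 when signed, 4294967295 when unsigned
print(slt(minus_one, 1), sltu(minus_one, 1), is_equal(minus_one, -1))
```

The same bit pattern compares as less than 1 when signed and greater than 1 when unsigned, which is why both variants are needed.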
(4) Memory access and I/O

This is the stage where we take an address computed above and read or write our local memory. Our "local" memory is actually another 512-word block RAM, that we'll use as scratch space.
LB -- Load byte
LUI -- Load upper immediate
LW -- Load word
SB -- Store byte
SW -- Store word
I believe the MIPS processor uses the ALU to add the contents of one register to a short immediate value stored in the instruction, and that's used as the address. We should do the same. That makes it so that the only memory addressing mode is reg-value + offset.
Yes, and immediate values are 16 bits. However, if we have only 512 words of RAM, then you only really need 9 bits for a word address. Only J and JAL take immediates that are wider than 16 bits; theirs are 26 bits.
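A minimal sketch of the reg-value + offset addressing mode follows. It assumes the 16-bit offset is sign-extended as in MIPS loads and stores, and that the scratch RAM is word-addressed and simply wraps at 512 words; neither the addressing granularity nor the wrap behaviour is decided in this thread.

```python
# Sketch of reg + offset addressing. Word addressing and wrap-around at
# 512 words are assumptions, not something the design has settled.

SCRATCH_WORDS = 512  # size of the local block RAM, in words


def sign_extend16(imm):
    """Sign-extend a 16-bit immediate field to a Python int."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm


def effective_address(base, imm16):
    """ALU result used as the memory address: base register + offset."""
    return (base + sign_extend16(imm16)) & 0xFFFFFFFF


def scratch_index(addr):
    """Keep only the 9 low bits needed to index 512 words."""
    return addr & (SCRATCH_WORDS - 1)


addr = effective_address(0x10, 0xFFFC)  # offset field 0xFFFC encodes -4
print(hex(addr), scratch_index(513))
```

Only the low 9 bits of the computed address reach the RAM, so a 16-bit offset field is wider than the scratch space strictly needs.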
In addition, this is also the stage where we'll want to do other I/O-related operations, such as providing access to real graphics memory and controlling other aspects of the GPU that are accessible by this processor. We'll make that available, to appear as another 512-word space (or more or less as necessary) of read-only and write-only "memory locations".
OK, I don't know what you want I/O instructions to look like.
We'll treat graphics memory access as though we're controlling some other device. Writes involve dropping a pair of words (address, data) into a queue. Reads involve dropping a word (address) into a queue and then, some time later, popping the read data out of another queue. Those queues will show up as "memory addresses" to the CPU. In fact, the CPU will control quite a number of things by writing/reading queues.
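The queue scheme described above might be modelled roughly like this. The class name, the three queue addresses, and the `service` method standing in for the memory controller are all hypothetical; a real implementation would stall the CPU on an empty or full queue rather than raise an exception.

```python
# Rough model of queue-based graphics memory access. The addresses and names
# are invented for illustration; none of them come from the OGD1 design.
from collections import deque


class GraphicsMemoryPort:
    WRITE_QUEUE = 0x200         # write-only: push address, then data
    READ_REQUEST_QUEUE = 0x201  # write-only: push address to read
    READ_DATA_QUEUE = 0x202     # read-only: pop returned data

    def __init__(self):
        self.write_q = deque()
        self.read_req_q = deque()
        self.read_data_q = deque()
        self.vram = {}          # stand-in for real graphics memory

    def store(self, addr, value):
        """CPU store to a queue's memory-mapped location."""
        if addr == self.WRITE_QUEUE:
            self.write_q.append(value)
        elif addr == self.READ_REQUEST_QUEUE:
            self.read_req_q.append(value)
        else:
            raise ValueError(f"address {addr:#x} is not writable")

    def load(self, addr):
        """CPU load from a queue's memory-mapped location."""
        if addr == self.READ_DATA_QUEUE:
            return self.read_data_q.popleft()  # IndexError if nothing is ready
        raise ValueError(f"address {addr:#x} is not readable")

    def service(self):
        """Stand-in for the memory controller draining the queues."""
        while len(self.write_q) >= 2:
            address = self.write_q.popleft()
            data = self.write_q.popleft()
            self.vram[address] = data
        while self.read_req_q:
            address = self.read_req_q.popleft()
            self.read_data_q.append(self.vram.get(address, 0))


port = GraphicsMemoryPort()
port.store(port.WRITE_QUEUE, 0x1000)         # address word
port.store(port.WRITE_QUEUE, 0xCAFE)         # data word
port.store(port.READ_REQUEST_QUEUE, 0x1000)  # ask for the same location
port.service()                               # "some time later"
result = port.load(port.READ_DATA_QUEUE)
print(hex(result))
```

From the CPU's side a graphics memory write is two ordinary stores and a read is one store followed later by one load, so no special I/O instructions are required.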
--
http://dmoz.org/profiles/pollei.html
http://sourceforge.net/users/stephen_pollei/
http://www.orkut.com/Profile.aspx?uid=2455954990164098214
http://stephen_pollei.home.comcast.net/
