Patrick wrote:
> Ok, here are some assumptions I made.
>
> Load/store architecture
> unified instruction, data, registers. In other words the 512 memory
> locations contain both code data and registers
> Looking at the instruction count, I think we can certainly use one of
> the FPGA 512x36 RAM blocks for the nanocontroller.

The XC3S1500 has 32 of the 18Kbit BlockRAMs, and the XC3S4000 has
96 of them, so it's probably reasonable to allocate several to the
nanocontroller to provide flexibility. And after all, it's an FPGA,
so tweaking the number of BlockRAMs assigned to the nanocontroller
should only be a matter of changing a few lines of RTL.

The block RAMs have only two ports, so you can't use a single one
for code, data, and registers.

For a load/store architecture (that doesn't do both simultaneously),
you might be able to share one block RAM between instructions and
data. But if pipelining requires that data written by store
instruction n has to be written at the same time as data read by
load instruction n+1, then a separate block RAM is needed for data
(or a stall/pipeline bubble).
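
To make the port-conflict case concrete, here is a small Python sketch
that counts block RAM accesses per cycle when one dual-port block RAM
is shared between instructions and data. The stage offsets
(LOAD_STAGE, STORE_STAGE) and the one-fetch-per-cycle assumption are
made up for illustration; the real numbers depend on how the
nanocontroller pipeline ends up being designed.

```python
from collections import Counter

PORTS = 2        # a block RAM has two ports
LOAD_STAGE = 2   # assumption: a load reads memory 2 cycles after its fetch
STORE_STAGE = 3  # assumption: a store writes memory 3 cycles after its fetch


def port_demand(program):
    """Count memory accesses per cycle for a list of 'alu'/'load'/'store' ops.

    Instruction i is fetched in cycle i (one fetch per cycle, no stalls).
    """
    demand = Counter()
    for i, op in enumerate(program):
        demand[i] += 1                    # instruction fetch
        if op == "load":
            demand[i + LOAD_STAGE] += 1   # data read
        elif op == "store":
            demand[i + STORE_STAGE] += 1  # data write
    return demand


def conflict_cycles(program):
    """Return the cycles that need more accesses than the RAM has ports."""
    demand = port_demand(program)
    return sorted(cycle for cycle, n in demand.items() if n > PORTS)
```

With these assumed offsets, conflict_cycles(["store", "load", "alu",
"alu"]) returns [3]: the store's write, the following load's read, and
an instruction fetch all land in cycle 3. Each such cycle needs either
a stall or a separate data RAM.
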
For the registers, it looks like you need two read ports and one
write port. For a relatively small number of registers (typically
16, 32, or 64), this is best done using distributed RAM (in the CLBs)
rather than block RAM. Each FPGA "logic element" (LE) has 16 bits of
memory and one flip-flop. The 16-bit memory is normally the function
lookup table, used to compute any combinatorial function of four inputs.
In the Spartan-3, half of the LEs can have their 16-bit memories
configured as dual-port RAMs. To build a three-port RAM for a CPU
register file, you keep two copies of the RAM contents, with each copy
having one read port and one write port. Every write goes to both
copies simultaneously, so either copy can serve a read.

This scheme can be expanded to "n" read ports (one copy per read
port), as long as you only have one write port.
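
The replication trick can be sketched as a behavioral model in Python.
This is only an illustration of the idea, not RTL; the class name and
its parameters (num_regs, read_ports) are hypothetical.

```python
class ReplicatedRegFile:
    """Register file with one write port and several read ports.

    Each read port gets its own copy of the contents, standing in for one
    dual-port RAM (one read port plus the shared write port).
    """

    def __init__(self, num_regs=16, read_ports=2):
        # One full copy of the register contents per read port.
        self.copies = [[0] * num_regs for _ in range(read_ports)]

    def write(self, addr, value):
        # The single write port updates every copy at once,
        # so the copies never disagree.
        for copy in self.copies:
            copy[addr] = value

    def read(self, port, addr):
        # Each read port reads only from its own copy.
        return self.copies[port][addr]


rf = ReplicatedRegFile(num_regs=16, read_ports=2)
rf.write(3, 0xAB)
rf.write(5, 0x11)
operand_a = rf.read(0, 3)  # read port 0
operand_b = rf.read(1, 5)  # read port 1, independent address
```

The cost is that storage grows linearly with the number of read ports,
which is why this is only attractive for small register files.
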
What you're trying to do is slightly more sophisticated than what
PicoBlaze or PacoBlaze can probably do; they don't use three-register
instructions. The Xilinx MicroBlaze is closer, although perhaps it
is more sophisticated than you need. There is a Verilog clone of
MicroBlaze called aeMB.

I've done a lot of work with generating video timing (and even full
video signals) entirely in software on microcontrollers. For instance,
I wrote the software and Rich Ottosen built the hardware that won the
SXlist Video Challenge:

http://www.sxlist.com/techref/ubicom/contest/video.htm
http://www.brouhaha.com/~eric/ubicom/servid/
http://www.brouhaha.com/~eric/ubicom/servid/photos/

Best regards,
Eric