Re: [Open-graphics] VRAM<-->VGA interface

Patrick McNamara Sat, 28 May 2005 09:38:48 -0700

Eric Smith wrote:

Patrick wrote:

Ok, here are some assumptions I made.


Load/store architecture
unified instruction, data, registers.  In other words, the 512 memory
locations contain code, data, and registers

Looking at the instruction count, I think we can certainly use one of
the FPGA 512x36 RAM blocks for the nanocontroller.


The XC3S1500 has 32 of the 18Kbit BlockRAMs, and the XC3S4000 has
96 of them, so it's probably reasonable to allocate several to the
nanocontroller to provide flexibility.  And after all, it's an FPGA,
so tweaking the number of BlockRAMs assigned to the nanocontroller
should only be a matter of changing a few lines of RTL.

The block RAMs have only two ports, so you can't use a single one
for code, data, and registers.

For a load/store architecture (that doesn't do both simultaneously),
you might be able to share one block RAM between instructions and
data.  But if pipelining requires that data written by store
instruction n has to be written at the same time as data read by a
load instruction n+1, then a separate block RAM is needed for data
(or a stall/pipeline bubble).

So we drop the pipelining. As Timothy has pointed out, it doesn't have
to be fast. Rather than try to pipeline the nanocontroller (which will
be constantly stalled waiting on card memory anyway), let's go the other
way. Assume 1 instruction every 5 clocks for the nanocontroller. That
should give enough stages to allow for a single read or write per clock
cycle.
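To make the port argument concrete, here is a rough sketch that counts block RAM accesses per clock. The five-step schedule and the rule that every step touches the RAM once are assumptions made for illustration; the message only fixes 5 clocks per instruction and two ports per block RAM.

```python
# Hypothetical 5-clock schedule: each step makes one block RAM access.
STEPS = ["fetch", "read_reg_a", "read_reg_b", "memory", "write_reg"]
PORTS = 2  # a block RAM has two ports

def accesses_per_clock(num_instructions, issue_interval):
    """Count RAM accesses in each clock when a new instruction starts
    every `issue_interval` clocks."""
    total_clocks = (num_instructions - 1) * issue_interval + len(STEPS)
    counts = [0] * total_clocks
    for i in range(num_instructions):
        start = i * issue_interval
        for offset in range(len(STEPS)):
            counts[start + offset] += 1
    return counts

# One instruction every 5 clocks: no overlap between instructions.
unpipelined = accesses_per_clock(4, issue_interval=5)
# One instruction every clock: steps of neighbouring instructions overlap.
pipelined = accesses_per_clock(4, issue_interval=1)

print("unpipelined peak accesses per clock:", max(unpipelined))
print("pipelined peak accesses per clock:", max(pipelined))
print("pipelined fits in one block RAM:", max(pipelined) <= PORTS)
```

Under these assumptions the unpipelined schedule never needs more than one port in any clock, while full overlap asks for more accesses per clock than a single two-port RAM can serve.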

Now for the math. Timothy said to expect a 20 clock delay for random
access to card memory. I'm going to assume that is 20 clocks in the
200MHz domain. This means the controller, at 100MHz, would have to
stall for 10 clocks for each external memory access. At 149000 memory
accesses per screen update that gives us 1.49M clock cycles for memory
access. For the program there are 62 instructions. One set of 13 is
looped 64 times (the blit of the character bitmap). This gives us
49 + 13*64 = 881 instructions per character output, times 2000
characters, or 1.76M instructions per screen update. At 5 clocks per
instruction we get 8.81M clocks per screen update for the instructions.
A grand total of 10.3M clocks per screen update, or just under 10Hz at
a 100MHz controller clock.
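The arithmetic above can be reproduced with a few lines of Python. The figures are the ones quoted in this message; the constant and function names are invented for the sketch.

```python
# Figures from the message; names are hypothetical.
CONTROLLER_HZ = 100_000_000
MEM_ACCESSES_PER_SCREEN = 149_000
STALL_CLOCKS_PER_ACCESS = 10      # 20 clocks at 200MHz, seen at 100MHz
PROGRAM_INSTRUCTIONS = 62
LOOP_INSTRUCTIONS = 13            # the character bitmap blit
LOOP_ITERATIONS = 64
CHARS_PER_SCREEN = 80 * 25

def instructions_per_char():
    """Instructions executed to output one character."""
    straight_line = PROGRAM_INSTRUCTIONS - LOOP_INSTRUCTIONS
    return straight_line + LOOP_INSTRUCTIONS * LOOP_ITERATIONS

def clocks_per_screen(clocks_per_instruction):
    """Controller clocks for one full screen update."""
    memory_clocks = MEM_ACCESSES_PER_SCREEN * STALL_CLOCKS_PER_ACCESS
    instruction_clocks = (instructions_per_char() * CHARS_PER_SCREEN
                          * clocks_per_instruction)
    return memory_clocks + instruction_clocks

def refresh_hz(clocks_per_instruction):
    """Screen updates per second at the controller clock."""
    return CONTROLLER_HZ / clocks_per_screen(clocks_per_instruction)

print(instructions_per_char())
print(clocks_per_screen(5))
print(round(refresh_hz(5), 2))
```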

Of course that cuts in half on an 80x50 screen... Maybe we do need the
pipelining... At 1.2 clocks per instruction (the .2 I tossed in for
pipeline stalls and flushes), an 80x25 screen refresh rate of slightly
over 27Hz is possible.
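A quick way to compare the cases is to sweep screen height and clocks per instruction. This is only a sketch: it assumes memory accesses scale linearly with the number of characters (the message gives 149000 only for 80x25), and the function name and parameters are made up here.

```python
def refresh_hz(rows, clocks_per_instruction, cols=80,
               controller_hz=100_000_000):
    """Estimated screen updates per second for a cols x rows text screen."""
    chars = cols * rows
    instructions = 881 * chars            # 881 instructions per character
    # Assumption: 149000 accesses for 2000 characters, scaled linearly.
    mem_accesses = 149_000 * chars // 2000
    stall_clocks = 10                     # per external memory access
    total_clocks = (mem_accesses * stall_clocks
                    + instructions * clocks_per_instruction)
    return controller_hz / total_clocks

for rows, cpi in [(25, 5), (50, 5), (25, 1.2), (50, 1.2)]:
    print(f"80x{rows} at {cpi} clocks/instruction: "
          f"{refresh_hz(rows, cpi):.1f} Hz")
```

With these assumptions the memory stalls make up a noticeably larger share of the total once the instruction cost drops, so card memory latency starts to matter more than the instruction rate.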


Food for thought.

Patrick M
