On Monday 30 May 2005 21:46, Timothy Miller wrote:
> The 100MHz figure is totally arbitrary.  If we can run it faster, we
> will.  If it has to run slower, that's okay too.  The key point here
> is that the design needs to be very small, so putting in a
> general-purpose CPU core is not going to do.  What we are designing
> right now is probably over-sized already.

I've been wondering about that. I haven't looked in too much detail at what
you are doing and I'm not an expert on FPGAs or nano-CPUs, but I'm wondering
whether a solution without a nano-CPU would really be that complex. For 80x25
text mode:
RAM:
  Frame buffer: 640x480x32 (xRGB)
  Text data: 80x25x16 (AC)
Block RAM:
  Character map/font (2x2048x9 for 256 characters). Can these be shared with
  e.g. the 3D pipeline's division block RAMs?
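
As a rough sizing check of the figures above, here is a minimal Python sketch. The 16 scanlines per glyph is an assumption inferred from the `& 0xF` row index used in the pseudocode; all other numbers are taken from the list.

```python
# Sizing check for the memory budget listed above.
FB_WIDTH, FB_HEIGHT, FB_BITS = 640, 480, 32        # frame buffer, xRGB
TEXT_COLS, TEXT_ROWS, CELL_BITS = 80, 25, 16       # attribute + character code
FONT_BLOCKS, FONT_DEPTH = 2, 2048                  # two 2048x9 block RAMs

GLYPHS = 256
GLYPH_ROWS = 16  # assumption: 16 scanlines per glyph (row index is `& 0xF`)

framebuffer_bytes = FB_WIDTH * FB_HEIGHT * FB_BITS // 8
text_bytes = TEXT_COLS * TEXT_ROWS * CELL_BITS // 8
font_rows_needed = GLYPHS * GLYPH_ROWS
font_rows_available = FONT_BLOCKS * FONT_DEPTH

print(framebuffer_bytes)                      # 1228800
print(text_bytes)                             # 4000
print(font_rows_needed, font_rows_available)  # 4096 4096
```

Under that assumption the 256 glyphs fill the two block RAMs exactly, with the ninth bit of each row left spare.
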
int19 c_fb = fb_base; // framebuffer address counter
int10 c_fb_x = 0; // framebuffer x
int9 c_fb_y = 0; // framebuffer y
int10 c_td = tb_base; // text buffer address counter
int8 char0a; // even character attributes
int8 char0c; // even character ascii code
int8 char1a; // odd character attributes
int8 char1c; // odd character ascii code
int32 out; // output xRGB
do
    ///// READ stage
    if ((c_fb_x & 0xF) == 0) // one read of text per 16 cycles, prefetch next?
        (char0a,char0c,char1a,char1c) = read32 text[c_td];

    ///// CONVERT stage
    if (c_fb:3) // left or right character? BRAM is dual-ported so we could just
                // read both and select the output instead of the address
                // using a 1-from-16 selector
        bit b = font[(char0c << 4, c_fb_y & 0xF)]:(c_fb_x:0-2) // bit from font
    if (!c_fb:3)
        bit b = font[(char1c << 4, c_fb_y & 0xF)]:(c_fb_x:0-2) // bit from font

    ///// TRANSLATE COLOUR stage
    if (b & c_fb:3)
        out = RGB4_to_RGB24(char0a:0-3)
    if (!b & c_fb:3)
        out = RGB4_to_RGB24(0,char0a:4-6)
    if (b & !c_fb:3)
        out = RGB4_to_RGB24(char1a:0-3)
    if (!b & !c_fb:3)
        out = RGB4_to_RGB24(0,char1a:4-6)

    ///// WRITE stage
    write32 framebuffer[c_fb] out;

    // the following increments the counters and makes them loop over the screen
    // and the text buffer in memory order
    ///// LOOP
    c_fb++;
    c_fb_x++;
    if ((c_fb_x & 0x7) == 0)
        c_td++;
    if (c_fb_x == 640) {
        c_fb_x = 0;
        c_fb_y++;
    }
    if (c_fb_y == 480) {
        c_fb_y = 0;
        c_fb = fb_base;
        c_td = tb_base;
    }
loop
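
To make the READ, CONVERT and TRANSLATE COLOUR stages easier to follow, here is a minimal software model in Python. It is only a sketch, not the hardware design: `text_pixel`, the list-based `text` and `font` layouts, and the choice of the most significant bit as the leftmost pixel are all assumptions made for illustration. It looks up each pixel directly instead of stepping counters, and it only covers the 25 x 16 = 400 scanlines that have text behind them.

```python
def text_pixel(x, y, text, font, cols=80, glyph_w=8, glyph_h=16):
    """Return the 4-bit colour index of pixel (x, y) in text mode.

    text: list of (attribute, code) cells in row-major order (hypothetical layout).
    font: list of 8-pixel glyph rows, indexed by code * glyph_h + row.
    """
    cell = (y // glyph_h) * cols + (x // glyph_w)     # which character cell
    attr, code = text[cell]                           # READ stage
    row_bits = font[code * glyph_h + (y % glyph_h)]   # CONVERT stage: font row
    # Assumption: the most significant bit of a font row is the leftmost pixel.
    lit = (row_bits >> (glyph_w - 1 - (x % glyph_w))) & 1
    if lit:                                           # TRANSLATE COLOUR stage
        return attr & 0x0F                            # foreground: bits 0-3
    return (attr >> 4) & 0x07                         # background: bits 4-6


# Tiny example: glyph 1 has only its leftmost column set.
font = [0x00] * (256 * 16)
for row in range(16):
    font[1 * 16 + row] = 0x80
text = [(0x07, 0)] * (80 * 25)
text[0] = (0x1F, 1)   # top-left cell: foreground 0xF, background 0x1

print(text_pixel(0, 0, text, font))  # 15
print(text_pixel(1, 0, text, font))  # 1
```

The returned index would then go through the colour expansion before being written to the frame buffer.
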
Basically, it's a couple of counters, a few comparators, two RAM blocks and
some loose gates. If you can get it to run at say 2 clocks per iteration (I
don't see why not, the frame buffer RAM accesses are in order, not random,
and the text accesses can be prefetched) then you've got about 80 fps at 50
MHz.
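
The frame rate estimate can be checked with a few lines of Python. The function name is made up for this sketch; the inputs (640x480 pixels, 2 clocks per pixel, 50 MHz) are the ones quoted above.

```python
def frames_per_second(clock_hz, clocks_per_pixel, width=640, height=480):
    """Full-screen redraws per second for a given clock and per-pixel cost."""
    return clock_hz / (width * height * clocks_per_pixel)


fps = frames_per_second(50_000_000, 2)
print(round(fps, 1))  # 81.4
```
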
Several variations are possible of course, depending on what is more efficient
to implement and what features we want.
For example, if we duplicate the 8th bit of each character in the font map
(the BRAM is 9 bits wide) then adding the repeat-8th-bit VGA feature is just
another MUX. We'd have to run in 720x400 mode of course, and increment c_td
every 9th iteration, but that is not that hard to do.
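
A minimal Python sketch of that idea follows. The function names and the bit layout (pixels in bits 8 down to 1, the duplicate in bit 0) are assumptions chosen for illustration; the post only says that the duplicated bit lives in the ninth BRAM bit and that a MUX selects it.

```python
def pack_font_row(row8):
    """Store an 8-pixel glyph row as a 9-bit word.

    Assumed layout: bits 8..1 hold the eight pixels (leftmost first),
    bit 0 holds a copy of the eighth (rightmost) pixel.
    """
    return ((row8 & 0xFF) << 1) | (row8 & 1)


def glyph_pixel(word9, col, repeat_enable):
    """Return the pixel in column 0..8 of a 9-pixel-wide character cell."""
    if col < 8:
        return (word9 >> (8 - col)) & 1
    # Ninth column: the "MUX" picks the stored duplicate or background.
    return (word9 & 1) if repeat_enable else 0


word = pack_font_row(0x0F)
print([glyph_pixel(word, c, True) for c in range(9)])   # [0, 0, 0, 0, 1, 1, 1, 1, 1]
print([glyph_pixel(word, c, False) for c in range(9)])  # [0, 0, 0, 0, 1, 1, 1, 1, 0]
```
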
For 80x50 text mode only a few constants need to change. Can we reuse the BRAM
for palettes in 16 and 256 colour graphics modes?
And speaking of 16 colour modes, RGB4_to_RGB24 logic could be used in those
modes as well. The convert stage would be different to account for the bit
planes, but the rest would be the same. For 256 colour modes the translate
colour stage would be different (using palettes in BRAM), and the convert
stage would drop out, but that's it.
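
RGB4_to_RGB24 is not spelled out above, so the following Python sketch is only one plausible reading. It assumes the 4-bit value is ordered IRGB (bit 3 = intensity) and expands each channel to 0x00, 0x55, 0xAA or 0xFF. It does not reproduce the brown special case that the usual VGA default palette has for colour 6. The 256-entry `palette` list stands in for a palette held in BRAM and is likewise hypothetical.

```python
def rgb4_to_rgb24(irgb):
    """Expand a 4-bit IRGB colour to 24-bit 0xRRGGBB (assumed bit order)."""
    base = 0x55 if irgb & 0x8 else 0x00          # intensity lifts all channels
    r = base + (0xAA if irgb & 0x4 else 0x00)
    g = base + (0xAA if irgb & 0x2 else 0x00)
    b = base + (0xAA if irgb & 0x1 else 0x00)
    return (r << 16) | (g << 8) | b


# 256 colour modes: the translate stage becomes a plain table lookup.
# This palette content is made up; real contents would be set by software.
palette = [rgb4_to_rgb24(i & 0xF) for i in range(256)]


def palette_lookup(index):
    return palette[index & 0xFF]


print(hex(rgb4_to_rgb24(0x4)))  # 0xaa0000
print(hex(rgb4_to_rgb24(0xF)))  # 0xffffff
```
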
Well, food for thought at least.
Lourens
