Re: [Open-graphics] Challenges with async fifo designs holding up OGD1

James Richard Tyrer Tue, 13 Feb 2007 13:29:44 -0800

Timothy Normand Miller wrote:

As I'm sure most of you are aware, we're testing OGD1 by putting a
semi-complete design into it with PCI, video, memory controller, etc.


We've run into a challenge with video, and we could use some
brain-storming help to solve it.

The problem has to do with async fifos.  Check out the existing designs:

https://svn.suug.ch/repos/opengraphics/main/trunk/rtl/fifos/

The fifos of interest are the "async" fifos, which have head and tail
ends at different clock rates, and fifo_DxW.v, which uses one clock
domain and can be mapped to one or more of the large block RAMs on the
chip.

I can give more specific detail later, but the bottom line is that we
need a fast 512-entry async fifo.  That is, the head end needs to run
at 200MHz.  The problem is that we can't both compare two 9-bit
addresses and use that as a control to increment a 9-bit address in
5ns.

So, I'm looking for more novel approaches.

For a frame of reference, the way the 16-entry async fifo works is as
follows:  There are gray-code head and tail pointers.  When something
is enqueued, the tail pointer is "advanced".  We can determine if the
fifo has entries in it by retiming the tail pointer into the head
clock domain and comparing them.  If they differ, the fifo contains
something, and we can dequeue.
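
As a rough illustration of the Gray-code pointer scheme described above,
here is a small behavioral sketch in Python.  It is not the RTL from the
repository; the names (bin_to_gray, advance, not_empty, ADDR_BITS) are
made up for this example, and the retiming of the tail pointer into the
head clock domain is simply assumed to have already happened.

```python
ADDR_BITS = 4                      # 16-entry fifo, as in the existing design
MASK = (1 << ADDR_BITS) - 1


def bin_to_gray(n: int) -> int:
    """Convert a binary count to its reflected Gray code."""
    return n ^ (n >> 1)


def gray_to_bin(g: int) -> int:
    """Convert a reflected Gray code back to a binary count."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def advance(gray_ptr: int) -> int:
    """Step a Gray-coded pointer to the next entry, wrapping at the end."""
    return bin_to_gray((gray_to_bin(gray_ptr) + 1) & MASK)


def not_empty(head_gray: int, tail_gray_retimed: int) -> bool:
    """The fifo holds something when the head and (retimed) tail differ."""
    return head_gray != tail_gray_retimed
```

Only one bit changes per step, including at wraparound, which is why a
pointer sampled in the other clock domain is either the old value or the
new one and never something in between.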

So, what we need are two independent head and tail pointers, each in
its own clock domain.  On the write end, we need to know if the fifo
is full or not.  On the read end, we need to know if an entry in the
RAM is new data (valid) or old data (the fifo is empty).  One idea
I've thought of is to encode validity info into the fifo data itself,
but it's not fully fleshed out.

Thoughts?

512 words seems like a lot.  Is there a reason we need this much?

IAC, a real (shift-register) FIFO buffer is always the fastest as far as
transfer rate is concerned.  The problem with using a real FIFO that
long would be a latency of 512 clocks.

You can shorten the latency of a real FIFO by using multiple shift
registers in parallel, e.g. 4 @ 128 each.  They can be striped or used
in sequence.  This is still a latency of 128 clocks, which might not be
acceptable if they were empty.  You can keep using more, smaller shift
registers till you get to 512 @ 1 each, which means just
registers/memory, and then there is no difference between striped and
sequence.
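
To make the latency numbers above concrete, here is a toy Python model
of an empty shift-register FIFO.  It only counts how many clocks a word
needs to travel from the input stage to the output stage; the function
names are invented for this sketch and nothing here reflects a real
implementation.

```python
def clocks_to_emerge(depth: int) -> int:
    """Clock one word through an empty shift register of `depth` stages."""
    stages = [None] * depth
    pending = "word"               # the single word waiting at the input
    clocks = 0
    while stages[-1] != "word":
        # On each clock every stage takes the value of the one before it.
        stages = [pending] + stages[:-1]
        pending = None
        clocks += 1
    return clocks


def lane_latency(total_words: int, lanes: int) -> int:
    """Latency when `total_words` is split evenly over parallel lanes."""
    if total_words % lanes:
        raise ValueError("lanes must divide total_words evenly")
    return clocks_to_emerge(total_words // lanes)


for lanes in (1, 4, 512):
    print(lanes, "lane(s):", lane_latency(512, lanes), "clocks")
```

With 512 lanes of one word each, the structure is just a bank of
registers, which is the limiting case mentioned above.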


This might work.

A real FIFO uses an extra bit which is set when data is written and
cleared when it is read.  Perhaps this would work with a memory-based
FIFO.  If you had a "dirty" bit, then you wouldn't need to compare
addresses to determine if the FIFO was empty.  If the memory location at
the tail pointer had the dirty bit set, then it would read out that
memory location, clear the dirty bit and increment the counter.  The
head pointer would avoid overrun by not writing to a memory location
till the dirty bit was clear.

In theory, the above will work if at reset all dirty bits are cleared
and the head and tail pointers are both set to the same value (normally
0).  There is an issue of what would happen if this manages to get out
of sync due to a stray cosmic ray, which might need to be addressed.  If
the FIFO empties this is simple: when all dirty bits are reset, then the
two counters are reset to 0.
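
The dirty-bit idea can be sketched as a behavioral model in Python.
This is only an illustration under simplifying assumptions: it uses a
single thread, so it says nothing about how the dirty bits would be
safely shared between two clock domains, and the class and method names
(DirtyBitFifo, write, read, resync_if_empty) are invented here.  The
pointers are called wr_ptr and rd_ptr to avoid the head/tail naming
question.

```python
class DirtyBitFifo:
    """Memory-based FIFO whose full/empty tests use per-entry dirty bits."""

    def __init__(self, depth: int = 512):
        self.depth = depth
        self.reset()

    def reset(self) -> None:
        # At reset all dirty bits are clear and both pointers are equal.
        self.data = [None] * self.depth
        self.dirty = [False] * self.depth
        self.wr_ptr = 0
        self.rd_ptr = 0

    def write(self, word) -> bool:
        """Store a word unless the target entry is still dirty (full)."""
        if self.dirty[self.wr_ptr]:
            return False
        self.data[self.wr_ptr] = word
        self.dirty[self.wr_ptr] = True
        self.wr_ptr = (self.wr_ptr + 1) % self.depth
        return True

    def read(self):
        """Return (valid, word); valid is False when the entry is clean."""
        if not self.dirty[self.rd_ptr]:
            return (False, None)
        word = self.data[self.rd_ptr]
        self.dirty[self.rd_ptr] = False
        self.rd_ptr = (self.rd_ptr + 1) % self.depth
        return (True, word)

    def resync_if_empty(self) -> bool:
        """If no entry is dirty, force both pointers back to 0."""
        if any(self.dirty):
            return False
        self.wr_ptr = 0
        self.rd_ptr = 0
        return True
```

Note that neither write() nor read() compares the two pointers; each
side looks only at its own pointer and the dirty bit of the entry it
points to.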


--
JRT
