On 9 Dec 2007, at 20:49, Timothy Normand Miller wrote:
On Dec 7, 2007 7:50 PM, Michael Meeuwisse <[EMAIL PROTECTED]> wrote:
Good. But for the record, what is exactly your definition of a fifo
interface? In the 'RFC pipeline' you also mentioned it but I'm not
too sure what a fifo has and what it doesn't have. I get that it's
based on dual port block ram so it can do the whole clock domain
transition.
Click on the "FIFO" button on this page:
http://www.traversaltech.com/download.phtml
This describes an interface for moving data through a chip, with flow
control. We use it as the interface for dealing with fifos and
computation pipelines.
Great. I'm missing descriptions of some of the signals, but the rest
makes sense.
This is somewhat in the spirit of the Wishbone interface used by all
the opencores.org blocks. I just like this better, because it seems
simpler, is designed for high throughput with short combinatorial
delays, and doesn't incur any latency to start or stop "bursts".
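To make "an interface for moving data through a chip, with flow control" concrete, here is a minimal behavioural sketch in Python. It is not the Traversal interface itself: the class name, the `full`/`empty` flags and the method names are assumptions chosen for illustration, and clock-domain crossing is not modelled.

```python
from collections import deque


class FlowControlFifo:
    """Bounded queue with producer-side and consumer-side flow control.

    Hypothetical model: names and behaviour are illustrative only.
    """

    def __init__(self, depth):
        self.depth = depth
        self.items = deque()

    @property
    def full(self):
        # The producer must hold its data while this is asserted.
        return len(self.items) >= self.depth

    @property
    def empty(self):
        # The consumer must not dequeue while this is asserted.
        return not self.items

    def enqueue(self, word):
        """Accept a word if there is room; return True when accepted."""
        if self.full:
            return False
        self.items.append(word)
        return True

    def dequeue(self):
        """Return the oldest word, or None when nothing is available."""
        if self.empty:
            return None
        return self.items.popleft()


fifo = FlowControlFifo(depth=2)
accepted = [fifo.enqueue(w) for w in (10, 11, 12)]  # third word is refused
drained = [fifo.dequeue() for _ in range(3)]        # third read finds it empty
```

The point of the sketch is only that both ends can stall each other without any start or stop handshake per burst.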
So instead of pushing addresses into a request fifo (or four), we have
a small fifo that accepts an address and count to transport the
request from the video clock domain into the memory clock domain.
Then these counters appear as read request agents. Each memory
controller spills its return data into the correct return fifo, and
the video controller dequeues them at the right time. (This also
implies that the arbiter has four schedulers. Oh, and we can't forget
the "memory refresh" agent that is the timer for DRAM refresh.)
Wouldn't it make sense to let the arbiter know we want a chunk of
memory starting at an address, and let that be translated to the
correct controllers? Same for the memory refresh, a don't-care
problem from the video fifo perspective. Just keep an eye on the
output-valid line from the arbiter.
When it comes to reads, it may or may not help to have that batch-size
metadata. At SOME point, we have to break things into individual word
requests because that's how the memories and memory controllers work.
Yes. I'm arguing that it's better to do this in the arbiter than in
the agent, because it's easier at a later point to act 'smart'. I
think that figuring out that a dozen addresses can be grouped
together in a single read on a memory controller is much more
expensive than deciding to ungroup a requested block in multiple
reads if it needs to.
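The "ungroup a requested block" direction can be sketched as follows. This assumes four memory controllers (the thread mentions four request fifos and four schedulers) and a simple word interleaving in which the low address bits select the controller; that mapping, and the names `split_block` and `NUM_CONTROLLERS`, are assumptions for illustration, not the project's design.

```python
NUM_CONTROLLERS = 4  # four, as the thread suggests; the interleaving is assumed


def split_block(start_address, word_count):
    """Expand one (address, count) request into per-controller word lists."""
    per_controller = {c: [] for c in range(NUM_CONTROLLERS)}
    for address in range(start_address, start_address + word_count):
        # Assumption: the low address bits pick the controller.
        per_controller[address % NUM_CONTROLLERS].append(address)
    return per_controller


# One block request of 5 words starting at word address 6.
requests = split_block(start_address=6, word_count=5)
```

Splitting is a single pass over a known range, whereas regrouping individual addresses afterwards would require comparing requests against each other.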
If, for some reason, all requests from all agents hit the same memory
row for each access, we could schedule long runs of reads and long
runs of writes, interleaved between different agents, with no penalty.
There are small penalties for switching between reading and writing.
And there's a rather large penalty for having to change memory rows.
Batching or not, a batch could cross a row boundary, and we might want
the scheduler to do something smart when that happens, say switching
to another agent that happens to be wanting to access the exact same
row. (We don't have any data on how likely that is, in order to
determine how much logic it's worth spending on it.)
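One possible form of "something smart" is a scheduler that prefers whichever agent wants the row that is already open. The sketch below is only that idea in Python; the row size, the agent names and the fallback policy are all hypothetical.

```python
ROW_SIZE_WORDS = 1024  # hypothetical row size, for illustration only


def row_of(address):
    """Memory row that a word address falls into."""
    return address // ROW_SIZE_WORDS


def pick_next(pending, open_row):
    """Choose the next agent, preferring one whose request hits the open row.

    pending maps agent name -> next word address that agent wants.
    Falls back to the first pending agent, which costs a row change.
    """
    for agent, address in pending.items():
        if row_of(address) == open_row:
            return agent
    return next(iter(pending), None)


# Hypothetical agents; "video" wants row 4, the other two want row 2.
pending = {"video": 5000, "blitter": 2050, "host": 2100}
choice_same_row = pick_next(pending, open_row=2)  # an agent already on row 2
choice_fallback = pick_next(pending, open_row=9)  # nobody matches, take first
```

Whether this is worth the logic depends on how often agents actually share a row, which the message notes is not yet known.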
For video, due to latencies, we need to make requests well in advance.
Then the data has to be sitting there in a queue, waiting for us to
pull it out at EXACTLY the right time.
My idea was to send out a request for one fifo the moment it runs out
of data and another fifo starts supplying data. The arbiter then has
until the other fifo runs dry to refill it. We can put the
address we want (and the block size somehow, say, another queue) in a
queue from the arbiter, and the arbiter can write data back to us as
if we were a fifo. Internally, we'd pass it on to the correct fifo
(this is all in the arbiter's clock domain).
The tricky part is that the fifos will not be very big. There's only
216KB of block ram available, so say that we give each fifo two
blocks of 18Kbit. In our highest target resolution (2048 * 1600 * 24,
60Hz) the raster scanner will work through 160,000 full fifos per
second. Getting these all filled in time will become quite a strain
on the arbiter.
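The refill rate can be checked with a few lines of arithmetic. The sketch assumes that 1 Kbit is 1024 bits, that all 36 Kbit of each fifo hold pixel data, and that the rate is averaged over a whole frame; under those assumptions the result is 128,000 fills per second, the same order of magnitude as the figure quoted in the message, which may rest on different assumptions.

```python
WIDTH, HEIGHT = 2048, 1600
BITS_PER_PIXEL = 24
REFRESH_HZ = 60
FIFO_BITS = 2 * 18 * 1024  # assumption: two 18 Kbit blocks, 1 Kbit = 1024 bits

# Average pixel data rate the raster scanner consumes.
bits_per_second = WIDTH * HEIGHT * BITS_PER_PIXEL * REFRESH_HZ

# How many times per second a fifo of that size is drained completely.
fifo_fills_per_second = bits_per_second / FIFO_BITS

print(bits_per_second)        # 4718592000
print(fifo_fills_per_second)  # 128000.0
```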
So the idea is roughly like so:
[snip your description of how video works]
You have the general idea. I'll describe our specifics.
We have a memory controller whose function you can look up. What
matters here is that, for a given scanline on the display, it makes
the memory request for the corresponding data one scanline in advance.
That request is broken down into individual word requests and handed
to the memory controllers. When the data comes back, it's put into a
queue. At the time when the video rasterscan needs the data, it's
pulled from the queue.
I'm not sure how (if at all) this differs from my description. The
only point I'm making is that the queue the data sits in is in fact
part of the agent. As for the addresses going to the memory
controllers, that is all arbiter business; the arbiter sits between
us agents and the controllers. When the data comes back, the arbiter
has kept track of which address was associated with it and passes it
on to the relevant agent. Interesting, but not relevant for the video
fifo. :)
Our video controller is a programmable state machine that can be
programmed to do a wide variety of different video modes. The same
download page I mentioned earlier
(http://www.traversaltech.com/download.phtml) has a link to its
documentation. Likewise, the queues I describe are linked to from
there.
I'm going to look into it.
Read requests are, effectively or literally, made by putting addresses
into one queue. The data comes back through another.
Agreed. Does this queue have data relating to the number of bits we
want from that address? Or will we make another queue for that? Or is
it predefined (which is nasty, as I tried to explain earlier)?
Note that there are no tristates inside of an FPGA. (Well, there
could hypothetically be, but we never use them.)
You mean my inout usage?
--
Timothy Normand Miller
http://www.cse.ohio-state.edu/~millerti
Open Graphics Project
A final thing to add: I mentioned sending a signal a cycle early. I
essentially meant the 'empty' from the fifo, only a clock early. This
way we can switch between the fifos driving the bus to the raster
scanner without the raster scanner ever knowing.
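A behavioural sketch of that early signal, in Python rather than HDL: each fifo raises a flag while it holds exactly one word, and the reader uses that flag to point the next read at the following fifo. The names `EarlyEmptyFifo`, `last_word` and `scan_out` are hypothetical, and every fifo is assumed to be non-empty when it becomes active.

```python
from collections import deque


class EarlyEmptyFifo:
    """Fifo with an 'empty, one read early' flag (illustrative model)."""

    def __init__(self, words):
        self.items = deque(words)

    @property
    def last_word(self):
        # Asserted one read before the fifo is actually empty.
        return len(self.items) == 1

    def read(self):
        # Assumes the fifo is not empty when it is read.
        return self.items.popleft()


def scan_out(fifos):
    """Read the fifos in order, switching on the early signal."""
    stream, switches = [], 0
    current = 0
    while current < len(fifos):
        fifo = fifos[current]
        hand_over = fifo.last_word  # sampled before the read
        stream.append(fifo.read())
        if hand_over:
            current += 1            # the next read comes from the next fifo
            switches += 1
    return stream, switches


stream, switches = scan_out([EarlyEmptyFifo([1, 2, 3]), EarlyEmptyFifo([4, 5])])
```

The consumer sees one continuous stream of words; the hand-over between fifos never produces a gap or an empty read.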
Mike
www.projectvga.org