On 9/8/07, Petter Urkedal <[EMAIL PROTECTED]> wrote:
> Here is my attempt to refine the I/O ports. For PCI I'm making wild
> guesses, so nailing down the ports is just an easy way to expose my
> misunderstandings. The point is to gain some understanding of how the
> nanocontroller will interact with the PCI controller (and memory).
>
> Memory read
>     in  MEM_READREQ_FREE       Free slots in command pipe.
>     out MEM_READREQ_ADDR       First address to read.
>     out MEM_READREQ_COUNT      Number of words to read.
>     in  MEM_READREPLY_DATA     Data stream from memory.
>     in  MEM_READREPLY_AVAIL    Number of words in FIFO.

Perfect. BTW, in the symbol name, we may want to add a reminder to the
human programmer as to which ones are "trigger" writes. For instance,
MEM_READREQ_COUNT triggers the read at the address that was programmed in.

>
> Memory write
>     out MEM_WRITE_ADDR         Start address.
>     out MEM_WRITE_COUNT        Defaults to 1. (Needed at all?)
>     in  MEM_WRITE_FREE         Free slots in output FIFO.
>     out MEM_WRITE_DATA         Data stream to memory.

I'm not sure that the count has much meaning. Writes are nice in that in
some cases like this one, we can just "fire and forget." :)

> Master read
>     in  PCI_MASTER_READREQ_FREE      Free slots in command pipe.
>     out PCI_MASTER_READREQ_ADDR      Host-mapped address to read.
>     out PCI_MASTER_READREQ_COUNT     Number of words to receive.
>     in  PCI_MASTER_READREPLY_DATA    Data stream from host.
>     in  PCI_MASTER_READREPLY_AVAIL   Number of words in FIFO.

Excellent!

> Master write
>     out PCI_MASTER_WRITE_ADDR    Host-mapped address to write.
>     out PCI_MASTER_WRITE_COUNT   Number of words to send.
>     in  PCI_MASTER_WRITE_FREE    Free words in output FIFO.
>     out PCI_MASTER_WRITE_DATA    Data stream to host.

Perfect! Now that I think about it, I rather prefer the idea of
indicating the count up front, rather than having to tag the last word.
This way, nothing (besides the master, which really needs to know the
most and soonest) has to keep track of when the last word is going to
come through.

Moreover, I'm also thinking that perhaps the DMA master should have
separate command and write data fifos. This way, some other agent can be
filling the data fifo asynchronously. For instance, some data words come
in from the memory system, but the master doesn't know what to do with
them yet, so it does nothing. Then the nanocontroller gets around to
sending a command to the master, and the master can do something with the
data. More opportunity to make things asynchronous.

I'm tired of typing in "nanocontroller." I keep mistyping it.
I think I'll start just calling it "HQ" and if anyone asks, we'll refer
them to the right place.

> Target of write (we're reading)
>     in  PCI_TARGET_WRITEREQ_ADDR      Target address of write.
>     in  PCI_TARGET_WRITEREQ_COUNT     Number of words to receive.
>     in  PCI_TARGET_WRITEREPLY_AVAIL   Number of words in FIFO.
>     out PCI_TARGET_WRITEREPLY_DATA    Data stream from host.

This one's tricky. With the target, we have absolutely no control here.
For one thing, we need "config" ports that set whether or not we take PIO
accesses. Either they go directly over to the Spartan, or they all come
to us. Or perhaps we want to select by BAR. Not sure exactly yet.

Now, the only time we do intercept PIO transactions is when we're really
going to process them somehow, so the flow control can be as complex as
we can afford in the time available.

So, basically, I think we could make do with one physical fifo. The
target keeps track of addresses for PIO bursts, so we could just push
64-bit entries into a fifo. 4 bits are byte flags. 28 bits are a word
address (1 GiB max space). 32 bits are data.

One way to handle this is to have one I/O port sample (but not dequeue)
the flags/address word. The other I/O port grabs the data and dequeues.
This way, in the unlikely event that you KNEW what the next address would
be, you could just ignore it and grab the data in one cycle.

These would be the I/O ports:

    in  PCI_TARGET_WRITE_COUNT       The number of words in the write queue
    in  PCI_TARGET_WRITE_ADDRFLAGS   Address and flags for a write data word
    in  PCI_TARGET_WRITE_DATA        Data of write word

> Target of read (we're writing)
>     in  PCI_TARGET_READ_ADDR    The requested address.
>     in  PCI_TARGET_READ_COUNT   The requested number of words.
>     in  PCI_TARGET_READ_FREE    Free words in output FIFO.
>     out PCI_TARGET_READ_DATA    Data stream to host.

For reads, we have two queues. One of those queues may in fact be the
write queue in disguise, with some flags set differently.
(They are mutually exclusive, and we do want to process pending writes
before reads anyhow.)

We only ever get one read request at a time. That request may be posted
from an earlier attempt that timed out. The request may try to burst, and
we may want to try to accommodate that, but we have no way of predicting
how many data words we really need to send. We may or may not want to try
to support some sort of dedicated cache line mechanism that allows reads
to certain BARs to be handled automatically except when there's a cache
miss.

In other words, PIO reads are a pain. One simple solution is this:

    in  PCI_TARGET_READ_CMD_VALID   Is there a read command pending?
    in  PCI_TARGET_READ_ADDR        Address of read command
    in  PCI_TARGET_READ_FREE        Number of free words in read output queue
    out PCI_TARGET_READ_DATA        Queue for read data words going out
    in  PCI_TARGET_READ_COMPLETED   How many read data words actually went out

Here's the painful bit. When a transaction terminates (for whatever
reason), the data queue has to be cleared. This is because we don't know
whether we queued too many words or what the address of the next request
will be. Did the burst end because the host controller doesn't want to
hog the bus but will start again at the next address? Or did the burst
end because it got everything it wanted? So we have to check what DID
happen and then adjust what we do based on that, possibly requeueing data
we queued before, which means we have to keep track of it.

...

You know what? We'll never be able to keep up with that. It's too
complicated. There's absolutely no reason why PIO reads have to be fast,
ESPECIALLY in the cases where we would actually intercept requests. PIO
reads suck, and we should not put gobs of logic into trying to make them
not suck.

So, no, I think we should handle one at a time, and each individual
transaction should be only one word. In this state, the target would be
in a mode where it always asserts STOP at the same time as TRDY, on top
of the usual timeout mechanism.
So here are the ports:

    in  PCI_TARGET_READ_PENDING   Is a read pending?
    in  PCI_TARGET_READ_ADDR      The one address that is pending
    out PCI_TARGET_READ_DATA      Where we write the one data word

So, if the read times out, the controller is smart enough to recognize
that the address of a later retry is the same as the posted one and
automatically returns the data (if we've supplied any) or times out (if
we haven't supplied any).

Note that we could probably combine PENDING and ADDR. It's not a queue.
Most of the bits will be the address, and one is a flag indicating if
it's valid. The PENDING flag will be cleared whenever we write to the
DATA port (which is also not a queue).

In the microcode we would do well to do some sort of caching, so that we
can return data before timeout (by my design, we have only 8 PCI cycles,
though). But that's all an implementation detail.

Oh, and don't forget PCI_TARGET_INTERCEPT_CONFIG or whatever.

-- 
Timothy Normand Miller
http://www.cse.ohio-state.edu/~millerti
Open Graphics Project
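
To make the two target-side proposals above concrete, here is a rough Python model of them. It is only a sketch under assumptions, not the actual design. The message gives the field widths of the 64-bit write-queue entry (4 + 28 + 32) but not their order, so the bit positions below are a guess. The position of the valid flag in the combined PENDING/ADDR word is likewise assumed. The class names and the host_write/host_read helpers are hypothetical stand-ins for the PCI target logic, and the model always answers the first read attempt with "retry" instead of waiting out the timeout window.

```python
from collections import deque

# Assumed field order for one 64-bit write-queue entry (only the widths come
# from the message): bits 63..60 byte flags, 59..32 word address, 31..0 data.
ADDR_MASK = (1 << 28) - 1
DATA_MASK = (1 << 32) - 1


def pack_write_entry(byte_flags, word_addr, data):
    """Pack byte flags (4 bits), word address (28 bits), and data (32 bits)."""
    if not 0 <= byte_flags <= 0xF:
        raise ValueError("byte_flags must fit in 4 bits")
    if not 0 <= word_addr <= ADDR_MASK:
        raise ValueError("word_addr must fit in 28 bits")
    if not 0 <= data <= DATA_MASK:
        raise ValueError("data must fit in 32 bits")
    return (byte_flags << 60) | (word_addr << 32) | data


class TargetWriteQueue:
    """Model of the single physical fifo behind the PCI_TARGET_WRITE_* ports."""

    def __init__(self):
        self._fifo = deque()

    def host_write(self, byte_flags, word_addr, data):
        # Hypothetical helper: stands in for the PCI target pushing an entry.
        self._fifo.append(pack_write_entry(byte_flags, word_addr, data))

    def count(self):      # PCI_TARGET_WRITE_COUNT
        return len(self._fifo)

    def addrflags(self):  # PCI_TARGET_WRITE_ADDRFLAGS: sample, do not dequeue
        return self._fifo[0] >> 32

    def data(self):       # PCI_TARGET_WRITE_DATA: return data and dequeue
        return self._fifo.popleft() & DATA_MASK


class TargetReadPort:
    """Model of the one-word-at-a-time PCI_TARGET_READ_* ports."""

    PENDING = 1 << 31  # assumed bit position of the valid flag

    def __init__(self):
        self._addr = None      # posted address, if any
        self._pending = False  # True until HQ writes the DATA port
        self._data = None

    def host_read(self, word_addr):
        """One host read attempt: returns the data word, or None for 'retry'."""
        if self._addr is None:
            # Nothing posted yet: post this request and tell the host to retry.
            self._addr, self._pending = word_addr, True
            return None
        if word_addr == self._addr and not self._pending:
            # Retry of the posted address after HQ supplied the data.
            data = self._data
            self._addr = self._data = None
            return data
        # Data not supplied yet, or a different address (assumed: retry).
        return None

    def pending_addr(self):      # combined PCI_TARGET_READ_PENDING / _ADDR
        if self._pending:
            return self.PENDING | self._addr
        return 0

    def write_data(self, data):  # PCI_TARGET_READ_DATA: clears PENDING
        self._data = data & DATA_MASK
        self._pending = False
```

With this layout, HQ microcode that already knows the next address can skip addrflags() and call data() directly, which is the one-cycle case mentioned above. On the read side, a host retry to the posted address returns the word only after write_data() has been called.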
