* Timothy Miller <[EMAIL PROTECTED]> [2005-08-20 14:45]:
> On 8/20/05, Brian Magnuson <[EMAIL PROTECTED]> wrote:
> > Hi,
> >
> > I've gotten as far as printing out the relevant portions of the PCI spec.
> > I wanted to actually get into it by now but I've been in gate level hell
> > for the past couple of days.  I just got a test inserted netlist and I've
> > been trying to work it into our verification environment.  Tons o' fun. :(
> > Once I get some free time (hopefully next week) I'll start really digging
> > into it.
>
> This is the life of a chip designer.  :)
>
> > Anyway, just thought I'd check in and let you know what's up.  I've got a
> > pretty good model going for the flash interface and I wanted to check some
> > of my assumptions before going any further.  The IOs look like this:
> >
> > input clock;          //The usual suspects
> > input reset_n;        //The usual suspects
> >
> > //System interface
> > input [31:0] wdata;   //Write data from host
> > input [23:0] addr;    //Starting address
> > input [2:0] cmd;      //Command - Non-zero starts a transaction
>
> What are the commands?  read, write, erase?
There were nop, read, write, read config, write config, and erase.  The
various flavors of block erase will add 3 more (4KB, 32KB, and 64KB).

> > input done;           //Stops a running read or write
>
> Is this an abort?

Nope.  Since the read/write transactions were unsized, this was the indication
that the requestor had sent/received all the data it wanted.

> > output [31:0] rdata;  //Read data
> > output rdv;           //Asserted with valid data on rdata
> > output ready;         //Asserted when ready to accept commands
> > output wdacc;         //wdata must be stable until this is asserted
> >
> > //PROM interface
> > input sin;
> > output sout;
> > output sclk;
> > output shold;
> > output sce;
> >
> > I've made all data transfers as multiples of 32 bits.
>
> That should work.  Writes won't happen very much, and reads are sized
> to the 32-bit bus.
>
> > There's no size in the command.  Reads and writes continue until done is
> > asserted.  I did this since it seems like PCI transactions aren't sized
> > either.
>
> See below.  I think they should be atomic.  Request a transaction, and
> off it goes.  All other requests are ignored while it's busy (and
> there will be a busy signal indicating such so nothing is lost).

That's what the ready was for.  Any command that came along when ready was not
asserted would be ignored.

> > wdacc is asserted as an indication that the current wdata can be changed
> >
> > rdv will be asserted for one clock each time 32 bits is read
>
> Unfortunately, as you'll see, we can't insert that many wait states
> atomically, so we'll have to do something slightly different.  See
> below.
>
> > Only full chip erase is supported for now.  Getting the block erases is
> > trivial but takes more command encoding.  Do we care about block erase?
>
> Yes, we'd like block erase.  The PROM will hold BIOS, bitfile, and
> empty space for whatever people want to hack in there.  We'd like to
> be able to program them all separately.

Not a problem.  Just a few more commands.
Should not change my state machine at all.

> > sclk will be a 1/2 version of clock which I'm taking to be the PCI clock
> > (66 MHz?)  At this speed the steady state read bandwidth is 4.125MB/s so
> > the FPGA program will take about 0.33s which should be fine.
>
> PCI will be 33, 66, or 133.  And for a PCI-X to PCIe bridge, I'd like
> to see if we can't do even faster.
>
> > This also means that between each read data word from PCI there will be
> > 64 wait states.  Can we do this?
>
> Sort of.  Most systems should handle this, but what will happen is that
> after 16 wait states, we'll terminate the transaction with retry.
> That'll happen a few times before the data arrives, and then we'll
> grab it when it arrives.  As such, I have some suggestions:
>
> (1) Don't try to allow a stream of accesses.  Make them atomic.  If we
> could do one per clock, that would be one thing, but since it takes so
> long, there's no point in having the logic necessary to pipeline
> anything or whatever.  Just don't accept a new request while one is in
> progress, etc.  If you notice, my assumed interface has "busy" and
> "read valid" signals.  The busy means it's busy doing anything and
> that requests will be ignored.

There wasn't really any pipelining going on.  A read would simply continue
from the starting address until told to stop.  Busy also exists, although I
called it ready.  I rather like this mode for programming.  There's a lot of
overhead in starting and finishing a command, but once started a read gives
data on every clock.  If the other client (PCI) just wants one word it can
assert done along with the command.

> (2) When a read request arrives, start the state machine.  When the
> read is completed, cache it (one word cache), and compare the address
> we're requesting.  When the address matches, assert the "data valid",
> asynchronously, unless we have to register it for speed, which I'll
> figure out and hack later.
> Also, have a timer that invalidates the word after a while, say 128
> cycles after it's arrived.  See my pci_send block's cache logic for an
> example of that.

Not sure that caching is helpful here, especially a single word cache.
Wouldn't the typical access pattern be a streaming read?  This just seems sort
of complicated.

> > 21 state, one-hot state machine.  Lots of states but the next state logic
> > for the most part is dead simple.
> >
> > I don't have much in the way of a verification environment yet.
> > Apparently SST does not provide a model of this part.  You have to go
> > through Denali and use their commercial tool which even if we could get it
> > for free probably won't play all that well with iverilog.  So somebody
> > needs to get a BFM together to see if this thing really works.
>
> I might be coaxed into writing a simulation model of the PROM chip.
> Can you point me to the appropriate reference materials?

http://www.sst.com/downloads/datasheet/S71271.pdf

> > Speaking of verification have you given any thought to a generalized test
> > framework?  Unit tests, regressions, C test interface, etc...
>
> I have thought that it should happen.  :)  Actually, I kinda started
> on it for PCI.  I have some tasks that manipulate PCI signals, and I
> thought we could expand that.  We could even write another state
> machine to act as the host so we can simulate DMA and stuff,
> eventually.

Heh.  I don't have any grand plans either yet.  Just wondering if maybe you
did. :)

> > There will be two clients (PCI and FPGA) but I haven't built any
> > arbitration into the module.  The basic idea is that the FPGA gets
> > priority and the flash is unavailable to PCI while programming is in
> > progress.  Can't see how this would be a problem since it's probably bad
> > form to be programming the flash as it's written into the FPGA. :)
>
> Well, I thought about that a bit too, and you're not going to like the
> answer.
> I think PCI MUST get priority, because the host may try to
> POST the device and map it (being a graphics device) before the FPGA
> is programmed.  As such, we need to be able to read BIOS while
> programming the FPGA.  Sucks, neh?

Yeah, a little more complicated, but that's logic that belongs outside of this
module.  I'm thinking in the arbiter, which will need a way to tell the
requestor (the FPGA programming logic) that it's being preempted.  The FPGA
programmer will then need the smarts to start its next request at the
appropriate address.  Not a problem.

> > There's a single, 8 bit status/control register in the flash.  We could
> > make this available at an address just above the top of the flash.
>
> We could map it to a few different places, such as PCI cfg space.
> Another thing we should consider is to map the first so many registers
> of the engine space to the Lattice.

I'll defer to you here.

> > -Brian
> >
> > P.S. gtkwave *sucks*.  You can't argue with free, but going from Debussy
> > to gtkwave was a bit of a shock.  Whew... want to develop a decent wave
> > viewer while we are at it. :)
>
> It's got its problems.  I can't remember the one we used with
> ncverilog, but it's very similar to that.
>
> Anyhow, looks like you've been doing a lot.  Thank you again very much
> for the help!
>
> _______________________________________________
> Open-graphics mailing list
> [email protected]
> http://lists.duskglow.com/mailman/listinfo/open-graphics
> List service provided by Duskglow Consulting, LLC (www.duskglow.com)
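
For anyone who wants to sanity-check the figures quoted in this thread, here is
a rough back-of-envelope sketch in Python.  The constant and function names are
made up for illustration and are not part of the design.  It assumes one data
bit per sclk cycle and ignores any command/address overhead at the start of a
read, so treat the results as approximate.

```python
# Back-of-envelope check of numbers quoted in the thread.
# All names here are hypothetical; only the figures come from the emails.

PCI_CLOCK_HZ = 66_000_000      # "(66 MHz?)" assumed PCI clock
SCLK_DIVIDER = 2               # sclk is a 1/2 version of clock
WORD_BITS = 32                 # all transfers are multiples of 32 bits
CMD_WIDTH = 3                  # input [2:0] cmd
RETRY_AFTER_WAIT_STATES = 16   # retry after 16 wait states, per the reply

BASE_COMMANDS = ["nop", "read", "write", "read config", "write config", "erase"]
BLOCK_ERASES = ["erase 4KB", "erase 32KB", "erase 64KB"]


def read_bandwidth_bytes_per_s(clock_hz: int) -> float:
    """Steady-state read bandwidth, assuming one data bit per sclk cycle."""
    sclk_hz = clock_hz / SCLK_DIVIDER
    return sclk_hz / 8


def clocks_per_word() -> int:
    """Host clocks needed to shift one 32-bit word over the serial link."""
    return WORD_BITS * SCLK_DIVIDER


def retry_windows_per_word() -> int:
    """How many full 16-wait-state windows fit into one word time."""
    return clocks_per_word() // RETRY_AFTER_WAIT_STATES


def cmd_bits_needed(n_commands: int) -> int:
    """Minimum cmd width that gives every command its own encoding."""
    return max(1, (n_commands - 1).bit_length())


if __name__ == "__main__":
    print("read bandwidth (bytes/s):", read_bandwidth_bytes_per_s(PCI_CLOCK_HZ))
    print("clocks per 32-bit word:", clocks_per_word())
    print("retry windows per word:", retry_windows_per_word())
    all_commands = BASE_COMMANDS + BLOCK_ERASES
    print("cmd bits for base commands:", cmd_bits_needed(len(BASE_COMMANDS)))
    print("cmd bits with block erases:", cmd_bits_needed(len(all_commands)))
    print("fits in cmd[2:0]:", cmd_bits_needed(len(all_commands)) <= CMD_WIDTH)
```

Under those assumptions the numbers line up with the thread: 4.125 MB/s at a
66 MHz clock, 64 clocks per word, and about four retry windows per word.  One
thing it also shows is that the six existing commands fit in cmd[2:0], but
adding the three block erases makes nine, which needs a fourth bit unless one
of the encodings is dropped or shared.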
