On 8/20/05, Brian Magnuson <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I've gotten as far as printing out the relevant portions of the PCI spec. I
> wanted to actually get into it by now but I've been in gate level hell for the
> past couple of days. I just got a test inserted netlist and I've been trying
> to work it into our verification environment. Tons o' fun. :( Once I get some
> free time (hopefully next week) I'll start really digging into it.
This is the life of a chip designer. :)

> Anyway, just thought I'd check in and let you know what's up. I've got a
> pretty good model going for the flash interface and I wanted to check some of
> my assumptions before going any further. The IOs look like this:
>
> input clock; //The usual suspects
> input reset_n; //The usual suspects
>
> //System interface
> input [31:0] wdata; //Write data from host
> input [23:0] addr; //Starting address
> input [2:0] cmd; //Command - Non-zero starts a transaction

What are the commands? Read, write, erase?

> input done; //Stops a running read or write

Is this an abort?

>
> output [31:0] rdata; //Read data
> output rdv; //Asserted with valid data on rdata
> output ready; //Asserted when ready to accept commands
> output wdacc; //wdata must be stable until this is asserted
>
> //PROM interface
> input sin;
> output sout;
> output sclk;
> output shold;
> output sce;
>
> I've made all data transfers as multiples of 32 bits.

That should work. Writes won't happen very much, and reads are sized to
the 32-bit bus.

>
> There's no size in the command. Reads and writes continue until done is
> asserted. I did this since it seems like PCI transactions aren't sized
> either.

See below. I think they should be atomic. Request a transaction, and off
it goes. All other requests are ignored while it's busy (and there will
be a busy signal indicating such so nothing is lost).

> wdacc is asserted as an indication that the current wdata can be changed
>
> rdv will be asserted for one clock each time 32 bits are read

Unfortunately, as you'll see, we can't insert that many wait states
atomically, so we'll have to do something slightly different. See below.

> Only full chip erase is supported for now. Adding block erases is
> trivial but takes more command encoding. Do we care about block erase?

Yes, we'd like block erase. The PROM will hold BIOS, bitfile, and empty
space for whatever people want to hack in there.
We'd like to be able to program them all separately.

>
> sclk will be a 1/2 version of clock which I'm taking to be the PCI clock (66
> MHz?) At this speed the steady state read bandwidth is 4.125MB/s so the FPGA
> program will take about 0.33s which should be fine.

PCI will be 33, 66, or 133 MHz. And for a PCI-X to PCIe bridge, I'd like
to see if we can't do even faster.

> This also means that
> between each read data word from PCI there will be 64 wait states. Can we do
> this?

Sort of. Most systems should handle this, but what will happen is that
after 16 wait states, we'll terminate the transaction with retry.
That'll happen a few times before the data arrives, and then we'll grab
it when it arrives. As such, I have some suggestions:

(1) Don't try to allow a stream of accesses. Make them atomic. If we
could do one per clock, that would be one thing, but since it takes so
long, there's no point in having the logic necessary to pipeline
anything. Just don't accept a new request while one is in progress. If
you notice, my assumed interface has "busy" and "read valid" signals.
Busy means it's busy doing anything and that requests will be ignored.

(2) When a read request arrives, start the state machine. When the read
is completed, cache it (one word cache), and compare its address against
the address being requested. When the address matches, assert "data
valid", asynchronously, unless we have to register it for speed, which
I'll figure out and hack later. Also, have a timer that invalidates the
word after a while, say 128 cycles after it's arrived. See my pci_send
block's cache logic for an example of that.

>
> 21 state, one-hot state machine. Lots of states but the next state logic for
> the most part is dead simple.
>
> I don't have much in the way of a verification environment yet. Apparently
> SST does not provide a model of this part.
> You have to go through Denali and use
> their commercial tool which even if we could get it for free probably won't
> play all that well with iverilog. So somebody needs to get a BFM together to
> see if this thing really works.

I might be coaxed into writing a simulation model of the PROM chip. Can
you point me to the appropriate reference materials?

> Speaking of verification have you given any thought to a generalized test
> framework? Unit tests, regressions, C test interface, etc...

I have thought that it should happen. :) Actually, I kinda started on it
for PCI. I have some tasks that manipulate PCI signals, and I thought we
could expand that. We could even write another state machine to act as
the host so we can simulate DMA and stuff, eventually.

> There will be two clients (PCI and FPGA) but I haven't built any arbitration
> into the module. The basic idea is that the FPGA gets priority and the flash
> is unavailable to PCI while programming is in progress. Can't see how this
> would be a problem since it's probably bad form to be programming the flash
> as it's written into the FPGA. :)

Well, I thought about that a bit too, and you're not going to like the
answer. I think PCI MUST get priority, because the host may try to POST
the device and map it (being a graphics device) before the FPGA is
programmed. As such, we need to be able to read BIOS while programming
the FPGA. Sucks, neh?

> There's a single, 8 bit status/control register in the flash. We could make
> this available at an address just above the top of the flash.

We could map it to a few different places, such as PCI cfg space.
Another thing we should consider is to map the first so many registers
of the engine space to the Lattice.

> -Brian
>
> P.S. gtkwave *sucks*. You can't argue with free, but going from Debussy to
> gtkwave was a bit of a shock. Whew... want to develop a decent wave viewer
> while we are at it. :)

It's got its problems.
I can't remember the one we used with ncverilog, but it's very similar
to that.

Anyhow, looks like you've been doing a lot. Thank you again very much
for the help!

_______________________________________________
Open-graphics mailing list
[email protected]
http://lists.duskglow.com/mailman/listinfo/open-graphics
List service provided by Duskglow Consulting, LLC (www.duskglow.com)
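
A quick sanity check of the bandwidth and wait-state figures discussed in this message, as a rough Python sketch. The function names are made up for illustration, and it assumes the serial PROM delivers one bit per sclk cycle and that the host retries back to back, neither of which is confirmed in the thread.

```python
import math

# Figures taken from the thread; all names here are illustrative only.
SCLK_DIVIDER = 2    # sclk is a 1/2 version of the PCI clock
WORD_BITS = 32      # all transfers are multiples of 32 bits


def serial_read_bandwidth(pci_clock_hz, divider=SCLK_DIVIDER):
    """Steady-state bytes per second, assuming one bit per sclk cycle."""
    sclk_hz = pci_clock_hz / divider
    return sclk_hz / 8


def pci_clocks_per_word(divider=SCLK_DIVIDER, word_bits=WORD_BITS):
    """PCI clocks needed to shift one 32-bit word out of the PROM."""
    return word_bits * divider


def rough_retry_count(clocks_needed, wait_limit=16):
    """Rough number of attempts if each one is cut off after wait_limit
    wait states and the host retries immediately (an assumption)."""
    return math.ceil(clocks_needed / wait_limit)


if __name__ == "__main__":
    for clock_hz in (33_000_000, 66_000_000, 133_000_000):
        mb_per_s = serial_read_bandwidth(clock_hz) / 1e6
        print(f"{clock_hz / 1e6:.0f} MHz PCI clock: {mb_per_s:.3f} MB/s")
    print("PCI clocks per word:", pci_clocks_per_word())
    print("Rough attempts per word:", rough_retry_count(pci_clocks_per_word()))
```

At a 66 MHz clock this reproduces the 4.125 MB/s and 64-clock figures quoted above. The retry count is only a ballpark for "a few times", since real hosts may wait between retries.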

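
Suggestions (1) and (2) above can also be sketched as a small behavioral model. This is not the pci_send cache logic, just a guess at the intended behavior in Python: the class name, the 64-clock read latency, and the dictionary standing in for the flash are all assumptions made for illustration.

```python
class OneWordReadCache:
    """Behavioral model: atomic reads, a one-word cache, and a timeout."""

    def __init__(self, read_latency=64, timeout=128):
        self.read_latency = read_latency  # clocks to fetch one word (assumed)
        self.timeout = timeout            # clocks before the word is dropped
        self.busy = False
        self.valid = False
        self.addr = None
        self.data = None
        self._countdown = 0
        self._age = 0

    def request_read(self, addr):
        """Start a read. Returns False (request ignored) while busy."""
        if self.busy:
            return False
        self.busy = True
        self.valid = False
        self.addr = addr
        self._countdown = self.read_latency
        return True

    def tick(self, flash):
        """Advance one clock. `flash` is a dict standing in for the PROM."""
        if self.busy:
            self._countdown -= 1
            if self._countdown == 0:
                self.data = flash[self.addr]
                self.busy = False
                self.valid = True
                self._age = 0
        elif self.valid:
            self._age += 1
            if self._age >= self.timeout:
                self.valid = False  # stale word, force a fresh read

    def data_valid(self, addr):
        """True only when the cached word matches the requested address."""
        return self.valid and self.addr == addr


# Host side: retry every 16 clocks until the cached word shows up.
flash = {0x10: 0xDEADBEEF}
cache = OneWordReadCache()
attempts = 0
while not cache.data_valid(0x10):
    cache.request_read(0x10)  # ignored while a read is already in flight
    for _ in range(16):
        cache.tick(flash)
    attempts += 1
print("attempts:", attempts, "data:", hex(cache.data))
```

The host loop only models the retry pattern, not real PCI signalling. The point is that a retried read for the same address eventually hits the cached word, and a word nobody claims is invalidated after the timeout.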