On Dec 9, 2007 4:53 PM, Petter Urkedal <[EMAIL PROTECTED]> wrote:
> On 2007-12-09, Timothy Normand Miller wrote:
> > Petter,
> >
> > I had some more caffeine and looked over your oga1hq_io.v file. It's
> > exactly what I had in mind and nicely organized. Excellent job!
>
> Thanks, but I also think you made some excellent points with the code in
> your previous post:
>
> 1. You do reads purely combinatorially. That way the HQ memory stage
> can register it, and we avoid muxing it into the output. I think there
> is enough time within the HQ memory stage for this, and that we should
> avoid incurring delay after its output, since that can spill into more
> critical parts due to forwarding. I guess we won't know before we try,
> but what does your intuition say?

My intuition from earlier (that reads are just a big MUX) left out the
fact that some of these reads cause outputs. Reading from a fifo causes
deq to be pulsed, meaning that while the data being pulled into the CPU
is just coming through a big MUX, there's more to it. You can separate
that logic if you like, although you'll end up with the same logic in
the end, just organized differently in the source code. Do what makes
the most sense and is easiest to maintain.

> 2. As you point out, we don't need separate outputs for each external
> unit, only two ports and individual binary lines. That should save a
> fair amount of logic. This also means we can collapse most of the
> non-trigger port addresses, which reduces muxing for writes. I'd have
> to look closer at what can be shared.

Actually, I was thinking that we could collapse more of the trigger
write ports and fewer of the non-trigger ones. This is particularly an
issue where we want to interleave access to different agents. An
address port, with or without autoincrement, is looked at every time we
access the trigger data port. We can't let accesses to more than one
agent clobber each other's address ports. On the other hand, the data
port is only valid for one cycle and we never care what was written to
it the last time. That last statement is invalidated, however, if we
want to pay attention to "full" signals, where the protocol requires
that we maintain the register contents until full is deasserted.

We need to look at the things that are mutually exclusive. If HQ were
inside of the Spartan, we would give it independent read and write
channels. However, since the bridge isn't full duplex, we can only do
one thing at a time. This may give us room for some sharing. However,
it's conceivable that we'd set up a write address, write a few words,
request a read (with its own address), and then assume we can start
another write where we left off without explicitly updating the
address. SO...
when we decide to combine things, we need to be careful to document all
the side effects that can occur, so that the programmer knows that once
they've done this read, that write address is no longer valid, etc. We
could even make some of this more explicit by giving them the same I/O
address!

Don't take anything I'm saying as discouraging you from combining
things; I'm just trying to work through the usage scenarios and the
tradeoffs.

Note: I made that write address auto-incrementer able to increment the
whole word. We need to restrict the counter to the lower 6 bits. This
is faster and requires less logic. The drawback is that HQ is
responsible for updating the whole address when crossing the 64-word
boundary. (We may change the boundary to something like 256 words
later, depending on how things go with static timing.)

> So, if you agree, I'd try to rewrite more like your suggestion and see
> what simplification we can make.

Although I'm very happy with what you've done, I do encourage
experimentation. On the other hand, I feel that we're under a time
crunch and that if we have something that will do the job well, it may
be better for you to spend your time on something else.

> I'd say we just define these as symbolic constants in the assembler and
> declare each as either read or write and any other use to be undefined.
> If we want a port (like a self-incrementing address) to be both readable
> and writeable we'll make sure the address is the same for both
> directions.

Write addresses would increment for each word. We wouldn't want
auto-inc for reads, because we'd have to have a full adder in there to
increment the address by the read count. Better to add another
instruction and save the extra logic.

> > I added reset logic. The dequeue signal has to come out a cycle
> > earlier (its assertion causes the next data to appear on the next
> > cycle). I put in some perhaps unnecessary checks against full for
> > writes; see the comment.
> > I added byte enable flags for writes. I
> > added auto-increment for write addresses. (Hmmm... a side-effect we
> > have to document carefully!)
>
> Yes, it looks good. I didn't think we needed reset, but of course, when
> adding incrementing address and the conditional reset of enq/deq, it
> becomes stateful.

In these FPGAs, there is a global reset that we take advantage of.
Howard understands this better than I do. But for correctness and to
make simulation work _at all_, we need to be sure to have sane reset
values. Otherwise, those registers start out as x's (undefined) in
simulation and we get something that doesn't work. Plus, you need to
make sure that, in hardware, the enqueue and dequeue signals (at the
very least) are deasserted so you don't get anything spurious going on.

-- 
Timothy Normand Miller
http://www.cse.ohio-state.edu/~millerti
Open Graphics Project
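
P.S. The restricted write-address auto-increment discussed above (only
the lower 6 bits count, and HQ rewrites the whole address when crossing
a 64-word boundary) can be modeled behaviorally. This is only a rough
Python sketch and not the Verilog in oga1hq_io.v: the class and method
names, the 32-bit address width, and the assumption that the counter
simply wraps within its 64-word block are all made up for illustration.

```python
COUNTER_BITS = 6                       # low bits that auto-increment (from the thread)
COUNTER_MASK = (1 << COUNTER_BITS) - 1
ADDR_MASK = 0xFFFFFFFF                 # assumed 32-bit address register


class WriteAddressPort:
    """Hypothetical model of a write-address port with a 6-bit auto-increment."""

    def __init__(self):
        self.addr = 0                  # defined reset value, never 'x'

    def set_address(self, addr):
        """HQ writes the whole address explicitly."""
        self.addr = addr & ADDR_MASK

    def write_word(self):
        """Return the address used for this write, then bump only the low bits."""
        used = self.addr
        low = (self.addr + 1) & COUNTER_MASK          # no carry out of bit 5
        high = self.addr & ~COUNTER_MASK & ADDR_MASK  # upper bits are held
        self.addr = high | low
        return used


port = WriteAddressPort()
port.set_address(0x1003E)              # two words before a 64-word boundary
addrs = [port.write_word() for _ in range(3)]
print([hex(a) for a in addrs])
```

Under these assumptions the third write lands at 0x10000 rather than
0x10040, which is exactly the side effect the programmer has to be told
about: HQ must rewrite the full address before crossing the boundary.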
