Hi Korey,

Thanks for bringing this up. This is a confusing area: we need to think it through, agree on a solution, and document what the expectations are on the Port interface with respect to timing.
I believe the assumption in the SimpleTimingPort case is that the device on the other side of the port is managing things according to its internal clock, so even if you send a packet on a funny tick, the receiver will deal with it appropriately (or close enough). The receiver could do this by accepting the packet at curTick+1 and queuing it internally until it's ready, or it could return false from sendTiming() and use the retry mechanism to tell the port when it's ready. So resending at curTick+1 basically means "send it as soon as the receiver will take it". I'm not saying that's elegant or the best way to do it, though.

This arguably works OK if you're talking to a bus: even though an idle bus will forward a packet through on an arbitrary tick, it calculates its next free time according to its internal clock cycle, so any subsequent packet that arrives before the bus goes idle again is stalled until the next clock edge at which the bus is free, and queued packets are always handled on bus clock edges. So a lightly used bus will let packets through on arbitrary ticks, but never closer together than the bus bandwidth would imply, and a heavily loaded bus where there is queuing will process packets on bus clock edges.

The cache model (as you see) doesn't have a notion of a clock; it just has a latency. Thus it also doesn't have any internal bandwidth constraint. The general expectation is that whatever is feeding the cache (whether a CPU or a bus) imposes the bandwidth constraint by not sending packets in at a higher rate than the cache is supposed to handle. I assume you've got a cache hooked up directly to your CPU.
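To make the bus behavior concrete, here is a minimal sketch of the idea (the names SketchBus, nextEdge, and forward are illustrative, not the real m5 Bus code): an idle bus forwards on an arbitrary tick, but its next free time is derived from its own clock and bandwidth, so queued packets only go out on clock edges.

```cpp
#include <cassert>
#include <cstdint>

using Tick = uint64_t;

// Hypothetical sketch of the bus timing policy described above; not
// the actual m5 Bus implementation.
struct SketchBus {
    Tick clock;         // bus clock period in ticks
    Tick busyUntil = 0; // tick at which the bus becomes free again

    // Round t up to the bus clock edge at or after t.
    Tick nextEdge(Tick t) const {
        return ((t + clock - 1) / clock) * clock;
    }

    // Returns the tick at which a packet is actually forwarded.
    // `occupancy` is the time the packet keeps the bus busy (the
    // bandwidth constraint).
    Tick forward(Tick now, Tick occupancy) {
        // An idle bus forwards immediately, even on an arbitrary tick;
        // a busy bus defers the packet to the clock edge at or after
        // the bus goes idle again.
        Tick start = (now >= busyUntil) ? now : nextEdge(busyUntil);
        busyUntil = start + occupancy;
        return start;
    }
};
```

So the first packet through an idle bus can start on an off-edge tick, but a packet arriving while the bus is busy is pushed out to a clock edge, which is exactly the "lightly used vs. heavily loaded" distinction above.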
I think the solutions to your problem that fit the current model are for the CPU itself to do what I described above: either accept Packet B at 340001 but buffer it internally and process it on the next CPU clock edge, or have your receiveTiming() method return false for Packet B and then issue a retry at the next CPU clock edge. The former seems a little cleaner to me, but I don't have a strong opinion.

I'm also open to thoughts about how to handle this better in general. I agree that scheduling things at "curTick+1" just seems wrong; it is confusing and a potential (and quite possibly current) source of timing model errors. So if there's a clean and easy way to improve it without scheduling lots of extra events, we should consider that. Requiring every SimObject to have a clock and every port to know the clock of its owner is one possibility; it's rather invasive, but maybe it's a good idea anyway. Thoughts from the peanut gallery?

Steve

On Sat, Apr 24, 2010 at 7:02 AM, Korey Sewell <ksew...@umich.edu> wrote:
> Hey all,
> I've recently come across a couple of code segments that worry me a bit.
> They are in different spots in the memory system where you have a
> "curTick + 1" to schedule something.
>
> If you grep around, you'll see a # of these in the code:
>
> src/cpu/inorder/cpu.cc:697:    mainEventQueue.schedule(&tickEvent, nextCycle(curTick + 1));
> src/cpu/inorder/cpu.cc:698:    DPRINTF(InOrderCPU, "Scheduled CPU for next tick @ %i.\n", nextCycle(curTick + 1));
> src/cpu/o3/lsq_unit_impl.hh:757:    cpu->schedule(wb, curTick + 1);
> src/dev/ide_disk.cc:556:    schedule(dmaTransferEvent, curTick + 1);
> src/mem/bridge.cc:286:    schedule(sendEvent, std::max(buf->ready, curTick + 1));
> src/mem/cache/cache_impl.hh:1559:    schedule(sendEvent, std::max(nextReady, curTick + 1));
> src/mem/cache/base.cc:127:    schedule(ev, curTick + 1);
> src/mem/ruby/system/RubyPort.cc:303:    schedSendTiming(pkt, curTick + 1); // minimum latency, must be > 0
> src/mem/ruby/system/RubyPort.cc:310:    schedSendTiming(pkt, curTick + 1); // minimum latency, must be > 0
>
> as well as:
>
> src/mem/tport.cc:155:    schedule(sendEvent, time <= curTick ? curTick+1 : time);
>
> As far as I can tell, you get to scheduling things on invalid cycles when
> you use "curTick + 1" instead of "curTick + cycles(1)", where an assumed
> cycles() function accounts for what the next valid cycle is.
>
> The last example is the one causing buggy results for me: the
> SimpleTimingPort::sendDeferredPacket() function, around line 55. I interpret
> the code snippet below as saying that if you have multiple packets to send
> off the transmit list in one cycle, then send the subsequent packet at
> "curTick+1":
>
> bool success = sendTiming(dp.pkt);
>
> if (success) {
>     if (!transmitList.empty() && !sendEvent->scheduled()) {
>         Tick time = transmitList.front().tick;
>         schedule(sendEvent, time <= curTick ? curTick+1 : time);
>     }
>
>     if (transmitList.empty() && drainEvent) {
>         drainEvent->process();
>         drainEvent = NULL;
>     }
> } ...
>
> The problem here is that, say you have two packets scheduled to come back
> on tick 340000: Packet A will come back on tick 340000, and Packet B will
> come back on tick 340001.
> Isn't 340001 an invalid tick for data to come back on? Shouldn't it be on
> whatever clock the bus/port/memory_object is running on? In my sims, the
> resulting effect is a bug: I want to activate a thread when data comes back
> from the cache, but if I get data back on tick 340001 and the CPU never
> sees tick 340001, the system eventually just deadlocks.
>
> I've fixed some things on the CPU side to only schedule on valid CPU ticks,
> but I don't think I should have to do bookkeeping to account for data
> coming back too early (there's more to it than just scheduling ticks
> correctly).
>
> So the "BIG" fix for the SimpleTimingPort is to figure out how to change
> "curTick + 1" into "curTick + portOwnerCycles(1)" or something of that
> nature. The problem is that there is no obvious way to write that function
> yet.
>
> In my example, I really just want to add the BaseCache latency
> (SimpleTimingPort->CachePort), so I need cycles * cache->hitLatency.
> However, SimpleTimingPort is just a base class of CachePort, so we can't
> assume that SimpleTimingPort will always have a cache for me to call into
> to figure out the right timing. Also, what happens if SimpleTimingPort is
> connected to a Bus and not a CPU? Are there additional ramifications there?
>
> I always get a bit confused about the port connection handling, but I
> think there is a potential problem here. Maybe there is a reason for this
> "+ 1" code that I'm unaware of. If so, please let me know. If not, if
> anyone has any thoughts/ideas on a fix, please advise!
>
> Thanks.
>
> --
> - Korey
>
> _______________________________________________
> m5-dev mailing list
> m5-dev@m5sim.org
> http://m5sim.org/mailman/listinfo/m5-dev
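[Editorial sketch of the "BIG" fix direction discussed in the thread: a hypothetical ownerNextCycle() helper, assuming each port can see its owner's clock period. The name and signature are invented for illustration; this is not existing m5 API. It rounds a tick up to the owner's first clock edge strictly after it, so a deferred send would land on a valid cycle instead of "curTick + 1".]

```cpp
#include <cassert>
#include <cstdint>

using Tick = uint64_t;

// Hypothetical helper in the spirit of nextCycle(): return the port
// owner's first clock edge strictly after `when`. If every port knew
// its owner's clock, SimpleTimingPort could schedule deferred sends at
// ownerNextCycle(curTick, clock) instead of curTick + 1.
Tick ownerNextCycle(Tick when, Tick clock)
{
    // An edge exactly at `when` still advances to the next one,
    // matching the "minimum latency, must be > 0" requirement.
    return (when / clock + 1) * clock;
}
```

With a 1000-tick CPU clock, a packet deferred at tick 340000 would then be sent at 341000 (the next real CPU edge) rather than at the invalid tick 340001 that triggers the deadlock Korey describes.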