In x86 without caches, the page table walker and the TLBs both have ports to the membus. In timing mode, however, I haven't tried getting the classic memory model with caches to work, since we decided the Ruby model (which we have up and running) would be more useful to us in the near term. So this functionality may not have been tested in x86.

This also sounds similar to the bug I encountered with the x86 page table walker: an unaligned memory access that missed in the TLB would cause multiple outstanding requests to the walker in the same cycle, which it wasn't originally designed to handle (although that bug manifested itself in the walker rather than in the caches). We've tested my walker patch fairly thoroughly now. I'm not sure whether it would be enlightening, but I can post it for review if you'd like to see what I did.
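
To give a rough idea of what the patch does (this is only an illustrative sketch, not the actual code; the ToyWalker and WalkRequest names and structure are invented for the example): the walker keeps a queue of translations that arrive while a walk is in flight and only starts the next walk when the current one completes.

#include <cstdio>
#include <queue>

// Toy stand-in for a translation request handed to the walker.
struct WalkRequest {
    unsigned long long vaddr;
};

// Toy page table walker that serializes walks: if a walk is already in
// flight, new requests are queued instead of being issued immediately.
class ToyWalker {
  public:
    // Called by the TLB on a miss; may be called more than once in the
    // same cycle (e.g., an unaligned access that spans two pages).
    void startWalk(const WalkRequest &req) {
        if (busy) {
            pending.push(req);      // defer until the current walk is done
            return;
        }
        busy = true;
        issue(req);
    }

    // Called when the memory system returns the final descriptor.
    void walkComplete() {
        busy = false;
        if (!pending.empty()) {
            WalkRequest next = pending.front();
            pending.pop();
            busy = true;
            issue(next);            // start the next queued walk
        }
    }

  private:
    void issue(const WalkRequest &req) {
        std::printf("issuing walk for vaddr %#llx\n", req.vaddr);
    }

    bool busy = false;
    std::queue<WalkRequest> pending;
};

int main() {
    ToyWalker walker;
    walker.startWalk({0x1000});  // first half of an unaligned access
    walker.startWalk({0x2000});  // second half arrives the same cycle: queued
    walker.walkComplete();       // first walk done; the queued one is issued
    walker.walkComplete();
    return 0;
}

The real walker state obviously carries much more (descriptor levels, fault handling, and so on), but the queue-when-busy structure is the part that matters for the multiple-requests-per-cycle case.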

Joel

On Mon, Sep 27, 2010 at 6:32 AM, Ali Saidi <[email protected]> wrote:
> Actually... x86 does this too, although it might internally delay the
> translation. The fundamental issue here is that we're grabbing the CPU's
> port to use for this kind of thing and there is no rate-limiting mechanism
> there. If the TLB were attached to the L2 cache bus this wouldn't be an
> issue, or if there were some bus-like thing between the L1 and the
> walker + CPU.
>
> Ali
>
> On Sep 27, 2010, at 8:15 AM, Ali Saidi wrote:
>
> > So, the table walker seems to be the problem. Looks like the response
> > from the translation needs to be delayed at least one cycle for this to
> > all work out...
> >
> > Ali
> >
> > #0  SimpleTimingPort::schedSendEvent (this=0x10bf61080, when=507847508000) at tport.hh:111
> > #1  0x0000000100a30b07 in BaseCache::CachePort::requestBus (this=0x10bf61080, cause=BaseCache::Request_MSHR, time=507847508000) at base.hh:136
> > #2  0x0000000100a2e54e in Cache<LRU>::timingAccess (this=0x102257800, pkt=0x102277060) at base.hh:469
> > #3  0x0000000100a2ea93 in Cache<LRU>::CpuSidePort::recvTiming (this=0x10bf60fa0, pkt=<value temporarily unavailable, due to optimizations>) at cache_impl.hh:1532
> > #4  0x00000001007cf4fd in TimingSimpleCPU::handleReadPacket (this=0x10bcefbb0, pkt=0x102277060) at build/ARM_FS/cpu/simple/timing.cc:260
> > #5  0x00000001007cfc0c in TimingSimpleCPU::sendData (this=0x10bcefbb0, req=0x102276850, data=0x10c3c1ad0 "", res=0x0, read=true) at build/ARM_FS/cpu/simple/timing.cc:2
> > #6  0x00000001007cfefe in TimingSimpleCPU::finishTranslation (this=0x10bcefbb0, state=0x10c3c2e20) at build/ARM_FS/cpu/simple/timing.cc:683
> > #7  0x00000001007d1679 in DataTranslation<TimingSimpleCPU>::finish (this=0x10c3c25c0, fau...@0x102277060, req=0xbd0600, tc=<value temporarily unavailable, due to optim
> > #8  0x0000000100070ff3 in RefCountingPtr<FaultBase>::del () at build/ARM_FS/arch/arm/tlb.cc:592
> > #9  ~RefCountingPtr [inlined] () at /Users/alisai01/work/m5/build/ARM_FS/base/refcnt.hh:85
> > #10 0x0000000100070ff3 in ArmISA::TLB::translateTiming (this=<value temporarily unavailable, due to optimizations>, req=0x102276850, tc=0x10bceca00, translation=0x10c3
> > #11 0x000000010008f9ca in ArmISA::TableWalker::doL1DescriptorWrapper (this=0x10bced2a0) at build/ARM_FS/arch/arm/table_walker.cc:601
> > #12 0x00000001008d054a in DmaPort::recvTiming (this=0x10bf61b00, pkt=0x102277000) at build/ARM_FS/dev/io_device.cc:146
> > #13 0x00000001009ae96d in Bus::recvTiming (this=0x10bcad7b0, pkt=0x102277000) at build/ARM_FS/mem/bus.cc:262
> > #14 0x00000001009b462b in Bus::BusPort::recvTiming (this=<value temporarily unavailable, due to optimizations>, pkt=<value temporarily unavailable, due to optimization
> > #15 0x0000000100a042ae in Cache<LRU>::MemSidePort::sendPacket (this=0x10bf61080) at port.hh:186
> > #16 0x0000000100a92090 in Flags<short>::isSet () at build/ARM_FS/sim/eventq.cc:203
> > #17 0x0000000100a92090 in Event::isExitEvent () at /Users/alisai01/work/m5/build/ARM_FS/base/flags.hh:296
> > #18 0x0000000100a92090 in EventQueue::serviceOne (this=<value temporarily unavailable, due to optimizations>) at build/ARM_FS/sim/eventq.cc:204
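
The backtrace shows the re-entrancy nicely: the cache's MemSidePort::sendPacket() (frame 15) sends a response that ends up at the walker's port (frame 12), the walker completes the translation inside that very call (frame 11), the CPU immediately issues the dependent access (frames 6 and 4), and the cache, still inside sendPacket(), ends up trying to schedule its send event again (frames 1 and 0). Delaying the translation completion by a cycle, as Ali suggests, breaks that chain. Just to illustrate the idea (a toy sketch with an invented ToyEventQueue and invented function names, not the M5 code): the walker's response handler would schedule the completion callback for curTick + 1 instead of calling it directly.

#include <cstdio>
#include <functional>
#include <map>

using Tick = unsigned long long;

// Minimal tick-driven event queue standing in for the simulator's.
class ToyEventQueue {
  public:
    void schedule(std::function<void()> fn, Tick when) {
        events.emplace(when, std::move(fn));
    }
    void run() {
        while (!events.empty()) {
            auto it = events.begin();
            curTick = it->first;
            auto fn = it->second;
            events.erase(it);
            fn();
        }
    }
    Tick curTick = 0;
  private:
    std::multimap<Tick, std::function<void()>> events;
};

ToyEventQueue eq;

// What the CPU does once the translation is known (it would then send
// the dependent access to the dcache).
void finishTranslation(unsigned long long paddr) {
    std::printf("tick %llu: finishTranslation(paddr=%#llx)\n", eq.curTick, paddr);
}

// Called when the walker's memory response arrives (analogous to
// doL1DescriptorWrapper in the backtrace). Instead of calling
// finishTranslation() directly, which re-enters the cache while it is
// still inside sendPacket(), defer the completion by one tick.
void walkerResponseArrived(unsigned long long paddr) {
    std::printf("tick %llu: walker response; deferring completion\n", eq.curTick);
    eq.schedule([paddr] { finishTranslation(paddr); }, eq.curTick + 1);
}

int main() {
    eq.schedule([] { walkerResponseArrived(0x73631e4); }, 507847507000ULL);
    eq.run();
    return 0;
}

With the extra cycle, sendPacket() has fully unwound before the CPU comes back with the dependent access, so the requestBus()/schedSendEvent() call in frame 0 never happens re-entrantly.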

> >
> > On Sep 27, 2010, at 1:10 AM, Steve Reinhardt wrote:
> >
> >> That does seem odd... like the dcache is sending a packet, and in the
> >> middle the bus decides to retry the cache to get it to send a second
> >> packet before it completes sending the first. Can you send a stack
> >> backtrace from somewhere in the middle (where the bus has called back
> >> to the cache)?
> >>
> >> Steve
> >>
> >> On Sun, Sep 26, 2010 at 5:04 PM, Ali Saidi <[email protected]> wrote:
> >>> This is what happens (I've added a lot of dprintfs)...
> >>>
> >>> The cache port sends a packet:
> >>> 507847507000: system.cpu.dcache-mem_side_port: sendPacket() called
> >>> 507847507000: system.cpu.dcache-mem_side_port: -- defered Packet Ready:1
> >>> 507847507000: system.cpu.dcache-mem_side_port: --sending packet
> >>>
> >>> This part I don't get: the bus receives the packet and probes the cache
> >>> that made the request? Either way, a sendEvent gets scheduled here for
> >>> the future.
> >>> 507847507000: system.membus: recvTiming: src 6 dst 8 ReadResp 0x73631e4
> >>> 507847507000: system.cpu.dcache: ReadReq 79d2000 miss
> >>> 507847507000: system.cpu.dcache-mem_side_port: Asserting bus request for cause 0
> >>> 507847507000: system.cpu.dcache-mem_side_port: sendEvent being scheduled @ 507847508000
> >>> 507847507000: system.membus: The bus is now occupied from tick 507847507000 to 507847509000
> >>>
> >>> Now sendTiming returns success, so the cache port says that was done, and
> >>> waitingOnRetry is false (although the sendTiming scheduled an event).
> >>> 507847507000: system.cpu.dcache-mem_side_port: --sendTiming returned 1
> >>> 507847507000: system.cpu.dcache-mem_side_port: sendPacket(): WaitingOnRetry:0
> >>>
> >>> Now the cache port attempts to schedule the next packet, since there
> >>> isn't a retry pending:
> >>> 507847507000: system.cpu.dcache-mem_side_port: --sendEvent.scheduled=1 sendEvent.when=507847508000
> >>> 507847507000: system.cpu.dcache-mem_side_port: --more packets to send @ 507847508000
> >>>
> >>> At this point the sendEvent is scheduled, but it's already been
> >>> scheduled above, so we assert.
> >>>
> >>> Any ideas?
> >>>
> >>> Ali
> >>>
> >>> On Sep 24, 2010, at 4:34 PM, Steve Reinhardt wrote:
> >>>
> >>>> Can you provide some more information on where you're seeing this?
> >>>> One simple case where it would happen is if there are actually
> >>>> multiple instances of the cache that are doing this on the same
> >>>> tick...
> >>>>
> >>>> Steve
> >>>>
> >>>> On Wed, Sep 22, 2010 at 4:10 PM, Ali Saidi <[email protected]> wrote:
> >>>>> Anyone ever seen a case where
> >>>>>     schedule(sendEvent, std::max(nextReady, curTick + 1));
> >>>>> is called multiple times in cache_impl.hh?
> >>>>>
> >>>>> I haven't quite been able to track down what is going on. Any ideas
> >>>>> how this could happen, Steve?
> >>>>>
> >>>>> Thanks,
> >>>>> Ali
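
One more observation on the trace above: the assert is really just the symptom. The re-entrant requestBus() call has already scheduled sendEvent for tick 507847508000, and then the outer sendPacket(), seeing WaitingOnRetry:0, tries to schedule it a second time. A defensive option, independent of whether the translation response gets delayed, is to check whether the event is already scheduled and, at most, pull it earlier. Roughly, as a toy sketch (ToyEvent, toySchedule, and schedSendNextPacket are invented names, not the cache code):

#include <algorithm>
#include <cassert>
#include <cstdio>

using Tick = unsigned long long;

// Toy stand-in for an event: it just remembers whether and when it has
// been scheduled.
struct ToyEvent {
    bool scheduledFlag = false;
    Tick when = 0;
    bool scheduled() const { return scheduledFlag; }
};

// Schedule an event, asserting (like the real event queue) that it is
// not already scheduled.
void toySchedule(ToyEvent &ev, Tick when) {
    assert(!ev.scheduled());
    ev.scheduledFlag = true;
    ev.when = when;
}

// Guarded version of the "schedule the next packet" step: if sendEvent is
// already scheduled (e.g., by a re-entrant requestBus() call), don't
// schedule it again; just move it earlier if the new time is sooner.
void schedSendNextPacket(ToyEvent &sendEvent, Tick nextReady, Tick curTick) {
    Tick when = std::max(nextReady, curTick + 1);
    if (!sendEvent.scheduled()) {
        toySchedule(sendEvent, when);
    } else if (when < sendEvent.when) {
        sendEvent.when = when;          // reschedule to the earlier tick
    }                                   // otherwise the existing schedule is fine
    std::printf("sendEvent scheduled for tick %llu\n", sendEvent.when);
}

int main() {
    ToyEvent sendEvent;
    Tick curTick = 507847507000ULL;

    // The re-entrant path (requestBus -> schedSendEvent) schedules it first...
    toySchedule(sendEvent, curTick + 1000);
    // ...then the outer sendPacket() tries again; the guard avoids the assert.
    schedSendNextPacket(sendEvent, curTick + 1000, curTick);
    return 0;
}

Whether masking the double call like this is the right fix, versus delaying the translation response so the re-entrancy never happens, is exactly the open question: the guard hides the re-entrancy rather than removing it.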

--
Joel Hestness
PhD Student, Computer Architecture
Dept. of Computer Science, University of Texas - Austin
http://www.cs.utexas.edu/~hestness

_______________________________________________
m5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/m5-dev
