So, this is what is happening:

CPU -> TLB translation (miss) -> TableWalker -> system.membus
transaction -> L1 dcache hit (dcache mem-side port) -> table walk
complete -> TLB translation finished -> CPU read -> L1 dcache miss ->
dcache mem-side port -> boom!

I think it can still be solved by delaying the translation by one cycle.
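
Here's a minimal standalone sketch of what I mean by delaying the callback
by a cycle. This isn't the actual m5 code; the EventQueue/Cpu/TableWalker
names below are just stand-ins for the real classes, and the real change
would live in the ARM table walker's descriptor-completion path:

#include <cstdint>
#include <functional>
#include <iostream>
#include <map>

using Tick = uint64_t;

// Toy event queue standing in for m5's EventQueue.
class EventQueue
{
    std::multimap<Tick, std::function<void()>> events;
  public:
    Tick curTick = 0;

    void schedule(Tick when, std::function<void()> fn)
    {
        events.emplace(when, std::move(fn));
    }

    void run()
    {
        while (!events.empty()) {
            auto it = events.begin();
            curTick = it->first;
            auto fn = std::move(it->second);
            events.erase(it);
            fn();
        }
    }
};

// Stand-in for the CPU side of the translation interface.
struct Cpu
{
    void finishTranslation(Tick now)
    {
        // In the failing case this ran while the dcache mem-side port was
        // still inside sendPacket(); here it runs from a fresh event one
        // cycle later, after that call chain has fully unwound.
        std::cout << "finishTranslation at tick " << now << "\n";
    }
};

// Stand-in for the table walker.
struct TableWalker
{
    EventQueue &eq;
    Cpu &cpu;
    TableWalker(EventQueue &e, Cpu &c) : eq(e), cpu(c) {}

    // Called from the port's recvTiming() when the descriptor read returns.
    void doDescriptorWrapper()
    {
        // Proposed fix: defer the callback by one tick instead of calling
        // cpu.finishTranslation() directly from this call chain.
        eq.schedule(eq.curTick + 1,
                    [this] { cpu.finishTranslation(eq.curTick); });
    }
};

int main()
{
    EventQueue eq;
    Cpu cpu;
    TableWalker walker(eq, cpu);

    // The descriptor response "arrives" at tick 100.
    eq.schedule(100, [&] { walker.doDescriptorWrapper(); });
    eq.run();  // prints: finishTranslation at tick 101
    return 0;
}

Because the callback runs from its own event, it can never execute in the
middle of the dcache mem-side port's sendPacket() call, which is exactly
the reentrancy that trips the assert.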

Ali


On Mon, 27 Sep 2010 11:49:46 -0700, Steve Reinhardt <[email protected]>
wrote:
> OK, so I understand that the DMAPort/iodevice.cc part is a red
> herring.  But why does the response come out of the cache's
> MemSidePort?  Is the TLB hooked up on the opposite side of the cache
> from the CPU?  I'm not 100% sure that's directly related to the
> problem, but it seems odd.  Basically you're leaving the back door of
> the cache, and then going around the side yard, coming in the front
> door, and trying to leave the back door again in the same cycle.  In
> contrast, if you were leaving through the front door and someone else
> tried to come in the front door at the same time, either the cache
> would be smart enough to issue a retry or if not at least the back
> door would still be free for use.
> 
> Steve
> 
> On Mon, Sep 27, 2010 at 6:15 AM, Ali Saidi <[email protected]> wrote:
>> So, the table walker seems to be the problem. It looks like the response from 
>> the translation needs to be delayed by at least one cycle for this all to work 
>> out....
>>
>> Ali
>>
>>
>> #0  SimpleTimingPort::schedSendEvent (this=0x10bf61080, when=507847508000) 
>> at tport.hh:111
>> #1  0x0000000100a30b07 in BaseCache::CachePort::requestBus 
>> (this=0x10bf61080, cause=BaseCache::Request_MSHR, time=507847508000) at 
>> base.hh:136
>> #2  0x0000000100a2e54e in Cache<LRU>::timingAccess (this=0x102257800, 
>> pkt=0x102277060) at base.hh:469
>> #3  0x0000000100a2ea93 in Cache<LRU>::CpuSidePort::recvTiming 
>> (this=0x10bf60fa0, pkt=<value temporarily unavailable, due to 
>> optimizations>) at cache_impl.hh:1532
>> #4  0x00000001007cf4fd in TimingSimpleCPU::handleReadPacket 
>> (this=0x10bcefbb0, pkt=0x102277060) at build/ARM_FS/cpu/simple/timing.cc:260
>> #5  0x00000001007cfc0c in TimingSimpleCPU::sendData (this=0x10bcefbb0, 
>> req=0x102276850, data=0x10c3c1ad0 "", res=0x0, read=true) at 
>> build/ARM_FS/cpu/simple/timing.cc:2
>> #6  0x00000001007cfefe in TimingSimpleCPU::finishTranslation 
>> (this=0x10bcefbb0, state=0x10c3c2e20) at 
>> build/ARM_FS/cpu/simple/timing.cc:683
>> #7  0x00000001007d1679 in DataTranslation<TimingSimpleCPU>::finish 
>> (this=0x10c3c25c0, fault=@0x102277060, req=0xbd0600, tc=<value temporarily 
>> unavailable, due to optim
>> #8  0x0000000100070ff3 in RefCountingPtr<FaultBase>::del () at 
>> build/ARM_FS/arch/arm/tlb.cc:592
>> #9  ~RefCountingPtr [inlined] () at 
>> /Users/alisai01/work/m5/build/ARM_FS/base/refcnt.hh:85
>> #10 0x0000000100070ff3 in ArmISA::TLB::translateTiming (this=<value 
>> temporarily unavailable, due to optimizations>, req=0x102276850, 
>> tc=0x10bceca00, translation=0x10c3
>> #11 0x000000010008f9ca in ArmISA::TableWalker::doL1DescriptorWrapper 
>> (this=0x10bced2a0) at build/ARM_FS/arch/arm/table_walker.cc:601
>> #12 0x00000001008d054a in DmaPort::recvTiming (this=0x10bf61b00, 
>> pkt=0x102277000) at build/ARM_FS/dev/io_device.cc:146
>> #13 0x00000001009ae96d in Bus::recvTiming (this=0x10bcad7b0, 
>> pkt=0x102277000) at build/ARM_FS/mem/bus.cc:262
>> #14 0x00000001009b462b in Bus::BusPort::recvTiming (this=<value temporarily 
>> unavailable, due to optimizations>, pkt=<value temporarily unavailable, due 
>> to optimization
>> #15 0x0000000100a042ae in Cache<LRU>::MemSidePort::sendPacket 
>> (this=0x10bf61080) at port.hh:186
>> #16 0x0000000100a92090 in Flags<short>::isSet () at 
>> build/ARM_FS/sim/eventq.cc:203
>> #17 0x0000000100a92090 in Event::isExitEvent () at 
>> /Users/alisai01/work/m5/build/ARM_FS/base/flags.hh:296
>> #18 0x0000000100a92090 in EventQueue::serviceOne (this=<value temporarily 
>> unavailable, due to optimizations>) at build/ARM_FS/sim/eventq.cc:204
>>
>>
>>
>>
>> On Sep 27, 2010, at 1:10 AM, Steve Reinhardt wrote:
>>
>>> That does seem odd... like the dcache is sending a packet, and in the
>>> middle the bus decides to retry the cache to get it to send a second
>>> packet before it completes sending the first.  Can you send a stack
>>> backtrace from somewhere in the middle (where the bus has called back
>>> to the cache)?
>>>
>>> Steve
>>>
>>> On Sun, Sep 26, 2010 at 5:04 PM, Ali Saidi <[email protected]> wrote:
>>>> This is what happens (I've added a lot of dprintfs)...
>>>>
>>>> The cache port sends a packet:
>>>> 507847507000: system.cpu.dcache-mem_side_port: sendPacket() called
>>>> 507847507000: system.cpu.dcache-mem_side_port: -- defered Packet Ready:1
>>>> 507847507000: system.cpu.dcache-mem_side_port: --sending packet
>>>>
>>>> This part I don't get. The bus receives the packet and probes the cache 
>>>> that made the request? Either way, a sendEvent gets scheduled here for the 
>>>> future.
>>>> 507847507000: system.membus: recvTiming: src 6 dst 8 ReadResp 0x73631e4
>>>> 507847507000: system.cpu.dcache: ReadReq 79d2000 miss
>>>> 507847507000: system.cpu.dcache-mem_side_port: Asserting bus request for 
>>>> cause 0
>>>> 507847507000: system.cpu.dcache-mem_side_port: sendEvent being scheduled @ 
>>>> 507847508000
>>>> 507847507000: system.membus: The bus is now occupied from tick 
>>>> 507847507000 to 507847509000
>>>>
>>>> Now sendTiming returns success, so the cache port considers that packet done, 
>>>> and waitingOnRetry is false (even though the sendTiming call just scheduled an event).
>>>> 507847507000: system.cpu.dcache-mem_side_port: --sendTiming returned 1
>>>> 507847507000: system.cpu.dcache-mem_side_port: sendPacket(): 
>>>> WaitingOnRetry:0
>>>>
>>>> Now the cache port attempts to schedule the next packet, since there isn't 
>>>> a retry pending:
>>>> 507847507000: system.cpu.dcache-mem_side_port: --sendEvent.scheduled=1 
>>>> sendEvent.when=507847508000
>>>> 507847507000: system.cpu.dcache-mem_side_port: --more packets to send @ 
>>>> 507847508000
>>>>
>>>> At this point the sendEvent gets scheduled again, but it was already scheduled 
>>>> above, so we hit the assert.
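>>>>
>>>> Here's a tiny standalone sketch of the pattern (hypothetical names, not
>>>> the real m5 classes): sendPacket() calls sendTiming(), the bus hands the
>>>> response straight back, the resulting miss schedules sendEvent for the
>>>> next tick, and then the outer sendPacket(), seeing success and no retry
>>>> pending, tries to schedule the same event again:
>>>>
>>>> #include <cassert>
>>>> #include <cstdint>
>>>> #include <iostream>
>>>>
>>>> using Tick = uint64_t;
>>>> Tick curTick = 0;
>>>>
>>>> struct Event
>>>> {
>>>>     bool isScheduled = false;
>>>>     Tick when = 0;
>>>> };
>>>>
>>>> struct Port
>>>> {
>>>>     Event sendEvent;
>>>>
>>>>     void schedSendEvent(Tick when)
>>>>     {
>>>>         // Mirrors the assert in SimpleTimingPort::schedSendEvent.
>>>>         assert(!sendEvent.isScheduled);
>>>>         sendEvent.isScheduled = true;
>>>>         sendEvent.when = when;
>>>>     }
>>>>
>>>>     bool sendTiming()
>>>>     {
>>>>         // The bus delivers the response straight back to the cache's
>>>>         // cpu-side port; the table walk finishes, the CPU issues its
>>>>         // read, it misses, and the cache asks for the bus again, all
>>>>         // inside this call.
>>>>         schedSendEvent(curTick + 1);   // "sendEvent being scheduled @ t+1"
>>>>         return true;                   // and the original send still succeeded
>>>>     }
>>>>
>>>>     void sendPacket()
>>>>     {
>>>>         bool success = sendTiming();
>>>>         bool waitingOnRetry = !success;
>>>>         if (!waitingOnRetry) {
>>>>             // "more packets to send @ t+1": second schedule, assert fires.
>>>>             schedSendEvent(curTick + 1);
>>>>         }
>>>>     }
>>>> };
>>>>
>>>> int main()
>>>> {
>>>>     Port p;
>>>>     p.sendPacket();   // aborts on the second schedSendEvent()
>>>>     return 0;
>>>> }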
>>>>
>>>> Any ideas?
>>>>
>>>> Ali
>>>>
>>>>
>>>>
>>>> On Sep 24, 2010, at 4:34 PM, Steve Reinhardt wrote:
>>>>
>>>>> Can you provide some more information on where you're seeing this?
>>>>> One simple case where it would happen is if there are actually
>>>>> multiple instances of the cache that are doing this on the same
>>>>> tick...
>>>>>
>>>>> Steve
>>>>>
>>>>> On Wed, Sep 22, 2010 at 4:10 PM, Ali Saidi <[email protected]> wrote:
>>>>>> Anyone ever seen a case where
>>>>>>            schedule(sendEvent, std::max(nextReady, curTick + 1));
>>>>>> is called multiple times in cache_impl.hh?
>>>>>>
>>>>>> I haven't quite been able to track down what is going on. Any ideas how this 
>>>>>> could happen, Steve?
>>>>>> Thanks,
>>>>>> Ali
>>>>>>

_______________________________________________
m5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/m5-dev
