On Mar 8, 2010, at 3:50 PM, Stijn Souffriau wrote:
> On Sunday 07 March 2010 01:32:09 am Ali wrote:
>> Hi Stijn,
>>
>> They are just snoop requests from the access filtering through the
>> system. You might be able to stop them by having the CPU return
>> snoop=false to getDeviceAddressRanges(), but I'm not sure that the
>> caches have code to do that implemented (Steve?). In any case, it's
>> certainly OK to ignore them in the atomic simple cpu since it doesn't
>> have any sort of buffering for the requests. The other CPU models
>> ignore them as well, although it's possible that the snoop order could
>> be used to enforce a stronger ordering in the system (although no one
>> has ever tried this).
>>
>> Hope this helps,
>> Ali
>
> Hi again,
>
> Good news this time. I managed to get it working, or at least it seems to
> work fine functionally for some non-trivial benchmarks, though I need to do
> some more tests. As suggested, I simply abort Cache<TagStore>::atomicAccess
> in case of a miss at the time of calling memSidePort->sendAtomic. In the
> case of a writeback, however, I can't abort Cache<TagStore>::atomicAccess
> completely, so I catch deadlocks in the while loop and retry as follows:
>
> while (!writebacks.empty()) {
>     PacketPtr wbPkt = writebacks.front();
>     try {
>         this->enableRollBack();
>         memSidePort->sendAtomic(wbPkt);
>         // Dequeue and free the packet only once the send has succeeded.
>         writebacks.pop_front();
>         delete wbPkt;
>     } catch (const ExclusiveResource::DeadLockException &ex) {
>         // Release and reacquire the lock so the other thread gets
>         // priority, then retry the same writeback.
>         this->unlockOnce();
>         this->lockOnce();
>         //throw ex;
>     }
> }
>
> Aborting atomicAccess completely when a writeback deadlocks results in
> functional errors in the program, so I have to do it like this, which is
> more efficient anyway. Can anyone foresee any simulation errors resulting
> from interrupting writebacks in this way that could lead to inconsistencies?
It seems like the writeback should still be scanned by the cache until
it successfully completes, so that doesn't seem like a big problem.
Nothing else comes to mind... Steve?
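
For reference, here is a minimal standalone sketch of the retry pattern discussed above, outside of M5. Everything in it is a hypothetical stand-in: DeadLockException mimics ExclusiveResource::DeadLockException, Packet mimics the writeback packet, the send callback plays the role of memSidePort->sendAtomic, and std::this_thread::yield() replaces the unlockOnce()/lockOnce() pair. The only point it illustrates is that a writeback stays at the front of the queue until its send succeeds, so nothing is dropped or reordered by a retry.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <memory>
#include <stdexcept>
#include <thread>
#include <vector>

// Hypothetical stand-in for ExclusiveResource::DeadLockException.
struct DeadLockException : std::runtime_error {
    DeadLockException() : std::runtime_error("deadlock detected") {}
};

// Hypothetical stand-in for a writeback packet.
struct Packet {
    int addr;
};

// Drain the writeback queue in order. If the send callback throws
// DeadLockException, the packet is left at the front of the queue and
// the send is retried after yielding to other threads.
// Returns the number of retries that were needed.
std::size_t drainWritebacks(std::deque<std::unique_ptr<Packet>> &writebacks,
                            const std::function<void(const Packet &)> &send)
{
    std::size_t retries = 0;
    while (!writebacks.empty()) {
        try {
            send(*writebacks.front());
            // Dequeue only after the send has completed successfully;
            // unique_ptr frees the packet here.
            writebacks.pop_front();
        } catch (const DeadLockException &) {
            ++retries;
            // Assumption: yielding stands in for unlockOnce()/lockOnce(),
            // i.e. giving the competing thread a chance to make progress.
            std::this_thread::yield();
        }
    }
    return retries;
}
```

Note that this loop, like the original, retries without bound; whether that is acceptable depends on the deadlock detection always letting one side make progress.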
> The deadlock frequency seems to be acceptable, a couple of times a second
> (will measure more accurately later). However, as I expected, there seems
> to be a lot of contention as the number of cores increases, so I'll
> probably have to manage thread priority a bit better for higher core counts.
I assume additional levels of hierarchy in the memory system should
reduce contention as long as the workload scales well?
Ali