> On Feb. 10, 2016, 1:03 a.m., Steve Reinhardt wrote: > > So if you look at > > http://grok.gem5.org/search?q=handleLockedSnoop&defs=&refs=&path=&hist=&type=&project=gem5 > > there are multiple places where TheISA::handleLockedSnoop() is called from > > the various CPU models. Are we consistent in that this is only ever called > > on writes or invalidates? Is there a problem with it being called > > unconditionally, as it apparently is in most places? > > > > Or to put it another way, your problem with AtomicSimpleCPU was the 'if > > (pkt->isInvalidate())' check; an alternative solution would be just to get > > rid of that check entirely rather than adding pkt->isWrite() to the > > condition. And regardless of which way the solution goes, we should be > > sure it's applied consistently at all of the TheISA::handleLockedSnoop() > > call sites. (And maybe even document that somewhere!)
Aye; the fundamental correction being made here is that when multiple Simple* CPUs share a bus and have no caches [a perfectly acceptable/'supported' configuration], they receive actual Writes from the other CPUs, as invalidation is a notion created by the caches themselves. In this case we must treat these packets as invalidations for LLSC functionality. In our tree (with this patch), we are in a situation of only ever calling handleLockedSnoop() on writes and invalidates :-) I think there is a problem calling it unconditionally -- These sorts of suggestions also made the rounds internally; it may seem possible to delete the checks and things would function, but in the case we're 'fixing', the CPU might see packets of any kind and there's no point in doing the LLSC logic on things other than writes and invalidates, and while my memory is a bit hazy here, I think we tried that and it breaks because you really shouldn't clear LLSC flags like that, or you can have trouble ensuring progress. And then for consistency we changed Minor. It is my understanding that O3 either expects or requires being connected to a cache (and would be nonsensical otherwise anyway) and will only see Invalidations at the LSQ, so no changes are needed. I'm not particularly attached to any solution, but in the interest of conservation of energy, these changes are well tested and, in my view, good enough. You mention the various callsites. The threadSnoop() path is only reachable by writes on other threads of the same CPU (AtomicSimpleCPU::writeMem(...), TimingSimpleCPU::sendData(...) in the write path). I think you have a point in Atomic CPU's recvFunctionalSnoop, but I'm having trouble imagining a realistic use case with that. That leaves the callsites in this patch and O3, which I've addressed. - Curtis ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: http://reviews.gem5.org/r/3316/#review7999 ----------------------------------------------------------- On Feb. 10, 2016, 12:31 a.m., Curtis Dunham wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > http://reviews.gem5.org/r/3316/ > ----------------------------------------------------------- > > (Updated Feb. 10, 2016, 12:31 a.m.) > > > Review request for Default. > > > Repository: gem5 > > > Description > ------- > > cpu: Fix LLSC atomic CPU wakeup > > Writes to locked memory addresses (LLSC) did not wake up the locking > CPU. This can lead to deadlocks on multi-core runs. In AtomicSimpleCPU, > recvAtomicSnoop was checking if the incoming packet was an invalidation > (isInvalidate) and only then handled a locked snoop. But, writes are > seen instead of invalidates when running without caches (fast-forward > configurations). As as simple fix, now handleLockedSnoop is also called > even if the incoming snoop packet are from writes. > > Tests, Ran regression on all ISAs, observed acceptable change in 1: > opt/quick/fs/10.linux-boot/arm/linux/realview-simple-timing-dual > > > Diffs > ----- > > src/cpu/minor/lsq.cc c2146e4e20cd30c35b7ff4f4cd15c953663cdc4d > src/cpu/simple/atomic.cc c2146e4e20cd30c35b7ff4f4cd15c953663cdc4d > src/cpu/simple/timing.cc c2146e4e20cd30c35b7ff4f4cd15c953663cdc4d > > Diff: http://reviews.gem5.org/r/3316/diff/ > > > Testing > ------- > > > Thanks, > > Curtis Dunham > > _______________________________________________ gem5-dev mailing list [email protected] http://m5sim.org/mailman/listinfo/gem5-dev
