Sure, I will do that. Let me see how I can make a diff file with all the changes
(changes also need to be made to obey the store-load ordering of a stronger model!)
and post it for review.
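
Until the diff is posted, the store-load constraint described in the quoted message (a load may not retire while older committed stores are still waiting to reach the memory system) can be illustrated with a minimal standalone sketch. This is not gem5 code: StoreBufferSketch and its members are hypothetical names used only for illustration, and the real change would live in the O3 LSQ and commit logic.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>

// Hypothetical model (not a gem5 class): sequence numbers of stores that
// have committed from the ROB but are not yet visible to the memory system.
struct StoreBufferSketch {
    std::deque<std::uint64_t> pending;   // oldest store first

    // A store commits from the ROB and waits to be written back.
    void commitStore(std::uint64_t seqNum) { pending.push_back(seqNum); }

    // The oldest pending store becomes visible to the memory system.
    void writebackOldest()
    {
        assert(!pending.empty());
        pending.pop_front();
    }

    // Store-load ordering: a load may retire only if no committed store
    // that is older than it (smaller sequence number) is still pending.
    bool loadMayRetire(std::uint64_t loadSeqNum) const
    {
        return pending.empty() || pending.front() > loadSeqNum;
    }
};
```

While loadMayRetire() returns false the load stays in the load queue, so a snoop invalidation can still reach it and mark it for re-execution.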
Thanks,
dibakar
On 07/13/12, Ali Saidi wrote:
>
> If you could post it for review it would be a lot easier to understand since
> the email seems to have stripped all indenting.
>
>
>
> Thanks,
>
> Ali
>
>
>
> On 13.07.2012 12:47, Dibakar Gope wrote:
>
> >
> > Hi Nilay,
> >
> > Sorry for the late response, I didn't check my email since last night :).
> >
> > Anyway, the checkViolations() part that we are talking about takes care of
> > avoiding coherence violations in a CMP, but it does not re-execute a load
> > (one that is not at the front of the commit queue) and the younger
> > instructions following it upon receiving a snoop invalidation request.
> > So, in my understanding, it does not enforce the strict load-load ordering
> > of a stronger model. I therefore added a couple of lines to checkSnoop();
> > see the changes below:
> >
> > (1) I guess the first if clause was wrong in the existing code: the
> >     "if (load_idx == loadTail)" clause actually checks the condition
> >     "If there are no loads in the LSQ we don't care". So, with an
> >     additional if clause, I make sure that if the snoop hits the front of
> >     the load queue, nothing needs to be done.
> >
> > (2) Further, I added a clause towards the end of checkSnoop(), guarded by
> >     the needsSC condition: if the snoop hits an executed load that is not
> >     at the front of the queue, the load is re-executed using ReExec
> >     (hopefully ReExec squashes all the younger instructions, including
> >     that load, and re-fetches them, as I understood from Ali's response).
> >
> > The other change I made to maintain SC is to add a few more constraints
> > on the load queue to ensure store-load ordering, i.e. a load in the load
> > queue cannot retire from the ROB until the committed store instructions
> > before it in program order are exposed to the memory system. As a result,
> > a load can still receive snoop invalidates and be re-executed if needed.
> > I can post my changes to enforce SC for review.
> >
> > template <class Impl>
> > void
> > LSQUnit<Impl>::checkSnoop(PacketPtr pkt)
> > {
> >     int load_idx = loadHead;
> >
> >     if (!cacheBlockMask) {
> >         assert(dcachePort);
> >         Addr bs = dcachePort->peerBlockSize();
> >
> >         // Make sure we actually got a size
> >         assert(bs != 0);
> >
> >         cacheBlockMask = ~(bs - 1);
> >     }
> >
> >     // If there are no loads in the LSQ we don't care
> >     if (load_idx == loadTail) {
> >         DPRINTF(LSQUnit, "loadHead: %d, loadTail:%d\n",
> >                 loadHead, loadTail);
> >         //assert(0);
> >         return;
> >     }
> >
> >     // If this is the only load in the LSQ we don't care
> >     if (loadTail == (load_idx + 1)) {
> >         DPRINTF(LSQUnit, "loadHead: %d, loadTail:%d\n",
> >                 loadHead, loadTail);
> >         //assert(0);
> >         return;
> >     }
> >
> >     incrLdIdx(load_idx);
> >
> >     DPRINTF(LSQUnit, "Got snoop for address %#x\n", pkt->getAddr());
> >     Addr invalidate_addr = pkt->getAddr() & cacheBlockMask;
> >     while (load_idx != loadTail) {
> >         DynInstPtr ld_inst = loadQueue[load_idx];
> >
> >         if (!ld_inst->effAddrValid || ld_inst->uncacheable()) {
> >             incrLdIdx(load_idx);
> >             continue;
> >         }
> >
> >         Addr load_addr = ld_inst->physEffAddr & cacheBlockMask;
> >         DPRINTF(LSQUnit,
> >                 "-- inst [sn:%lli] load_addr: %#x to pktAddr:%#x\n",
> >                 ld_inst->seqNum, load_addr, invalidate_addr);
> >
> >         if (load_addr == invalidate_addr) {
> >             if (ld_inst->possibleLoadViolation) {
> >                 DPRINTF(LSQUnit,
> >                         "Conflicting load at addr %#x [sn:%lli]\n",
> >                         ld_inst->physEffAddr, pkt->getAddr(),
> >                         ld_inst->seqNum);
> >
> >                 // Mark the load for re-execution
> >                 ld_inst->fault = new ReExec;
> >             } else {
> >                 // If an older load checks this and it's true
> >                 // then we might have missed the snoop
> >                 // in which case we need to invalidate to be sure
> >                 ld_inst->hitExternalSnoop = true;
> >                 if (needsSC == true) {
> >                     ld_inst->fault = new ReExec;
> >                 }
> >             }
> >         }
> >         incrLdIdx(load_idx);
> >     }
> >     return;
> > }
> >
> > On 07/12/12, Nilay Vaish wrote:
> >
> > > Dibakar, any progress on this front?
> > >
> > > On Wed, 27 Jun 2012, Ali Saidi wrote:
> > > > Hi Dibakar,
> > > >
> > > > I'm not saying that I believe this is correct for x86. It seems like
> > > > x86 does require more ordering than is currently provided by the lsq.
> > > > Hopefully someone with more x86 experience could chime in and confirm
> > > > that. The faulting mechanism needs an overhaul in the o3 cpu. There
> > > > shouldn't be any fundamental difference.
> > > >
> > > > Thanks,
> > > > Ali
> > > >
> > > > On 27.06.2012 18:08, Dibakar Gope wrote:
> > > > > Hi Ali, from this thread,
> > > > > http://www.mail-archive.com/[email protected]/msg00782.html, I get the
> > > > > idea that a snoop invalidate will make a younger load and the
> > > > > younger instructions following it re-execute only if an older load
> > > > > in program order to the same cache block sees an updated value. But
> > > > > I am still not sure whether it obeys the load-load ordering of a
> > > > > consistency model stronger than ARM's. Suppose, for example:
> > > > >
> > > > >     C0      C1
> > > > >     St A    Ld C
> > > > >     St B    Ld A
> > > > >
> > > > > In the above scenario, if the memory order becomes Ld A -> St A ->
> > > > > St B -> Ld C, and C1 receives an invalidation for cache block A
> > > > > before Ld A makes it to the front of the commit queue, the
> > > > > checkViolations() code still won't squash Ld A and any younger
> > > > > instructions to maintain strong consistency.
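
The concern in this example can be checked with a small, self-contained enumeration of interleavings. The sketch below is only an illustration under stated assumptions: it is not gem5 code, all names (Op, Thread, Outcome, reachableOutcomes) are hypothetical, and it assumes that C1's first load (written "Ld C" above) reads the location that C0's "St B" writes, which the thread does not state explicitly. Memory starts at 0 and each store writes 1.

```cpp
#include <array>
#include <cassert>
#include <set>
#include <utility>

// One memory operation of the litmus test (hypothetical types).
struct Op {
    bool isStore;   // true: store the value 1; false: load
    int loc;        // 0 = A, 1 = B
    int reg;        // destination register index for loads (unused for stores)
};

using Thread = std::array<Op, 2>;
using Outcome = std::pair<int, int>;   // (r1, r2) observed by C1

// Collect every (r1, r2) reachable when the operations of c0 and c1 are
// interleaved into one global order, keeping each thread's order as given.
std::set<Outcome> reachableOutcomes(const Thread &c0, const Thread &c1)
{
    std::set<Outcome> outcomes;
    // A 4-bit mask with exactly two bits set picks the global slots used
    // by c0; the remaining two slots are used by c1.
    for (unsigned mask = 0; mask < 16; ++mask) {
        int bits = 0;
        for (int s = 0; s < 4; ++s)
            bits += (mask >> s) & 1;
        if (bits != 2)
            continue;

        int mem[2] = {0, 0};
        int regs[2] = {0, 0};
        int i0 = 0, i1 = 0;
        for (int s = 0; s < 4; ++s) {
            const Op &op = ((mask >> s) & 1) ? c0[i0++] : c1[i1++];
            assert(op.loc == 0 || op.loc == 1);
            if (op.isStore)
                mem[op.loc] = 1;
            else
                regs[op.reg] = mem[op.loc];
        }
        outcomes.insert(Outcome(regs[0], regs[1]));
    }
    return outcomes;
}
```

With C0 = {St A, St B} and C1 = {r1 = Ld B, r2 = Ld A} in program order, the outcome (r1, r2) = (1, 0) is never produced. If C1's two loads are passed in swapped order, which models Ld A performing early and not being squashed, (1, 0) becomes reachable. That is the outcome strict load-load ordering is meant to rule out.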
> > > > > My other doubt is: can we make use of the squashDueToMemOrder()
> > > > > squash mechanism instead of the ReExec fault if I want to squash
> > > > > load A and the younger instructions and re-fetch them in the above
> > > > > scenario? ReExec waits for the faulted instruction to reach the
> > > > > front of the commit queue. Is there any other fundamental
> > > > > difference between ReExec and squashDueToMemOrder() besides this?
> > > > >
> > > > > Thanks,
> > > > > --Dibakar
> > > > >
> > > > > On 06/25/12, Ali Saidi wrote:
> > > > > > ARM just requires load-load ordering (which is stronger than
> > > > > > alpha). x86 to my knowledge requires all stores in the system to
> > > > > > be visible in the same order.
> > > > > >
> > > > > > Ali
> > > > > >
> > > > > > On Jun 22, 2012, at 11:50 PM, Nilay wrote:
> > > > > > > What's the difference between ARM's load-load ordering and TSO?
> > > > > > > I am guessing in ARM not all instructions are flushed from the
> > > > > > > pipe, but only those that are affected by the snoop. My
> > > > > > > understanding is that the O3 CPU flushes the entire pipeline
> > > > > > > when it sees that an instruction needs to execute again. Since
> > > > > > > instructions commit in order, any load that gets squashed would
> > > > > > > mean that all subsequent loads are squashed as well.
> > > > > > >
> > > > > > > -- Nilay
> > > > > > >
> > > > > > > On Fri, June 22, 2012 8:47 am, Ali Saidi wrote:
> > > > > > > > Hi Dibakar,
> > > > > > > >
> > > > > > > > I'd have to think carefully about it, but you may be right
> > > > > > > > about TSO. I'd hope that someone who is more familiar with
> > > > > > > > x86 could respond.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Ali
> > > > > > > >
> > > > > > > > On 22.06.2012 07:46, Dibakar Gope wrote:
> > > > > > > > > Hi Ali,
> > > > > > > > >
> > > > > > > > > Thanks for the response. OK, I got the point. I thought
> > > > > > > > > that since the O3 CPU attempts to support TSO for x86, it
> > > > > > > > > inherently enforces the regular load-load ordering present
> > > > > > > > > in any stronger consistency model. But if it is in line
> > > > > > > > > with ARM's requirements, does it not violate the
> > > > > > > > > conventional load-load ordering of x86 and TSO?
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > > Dibakar
> > > > _______________________________________________
> > > > gem5-users mailing list
> > > > [email protected]
> > > > http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
> > >
> >
> >
_______________________________________________
gem5-users mailing list
[email protected]
http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users