Hey guys,

Just want to double check something.

It looks like you translate FLUSH instructions to NOPs.

In CmpFlex, the shared Piranha cache maintains proper coherence between all the instruction and data caches, and that all seems to make sense.

In the DSMFlex or UniFlex simulator, I don't see how coherence is maintained between the instruction and data caches. I know that IProbes are used, so that a Fetch that reaches the L2 will Probe the L1-D and get data that might have been written. But I don't see how Invalidate messages get propagated to the L1-I caches.

For example, if you fetch an instruction into the L1-I cache and the L1-D cache then does a Write to that same piece of data, it does not look like that write will invalidate the data in the L1-I cache. In a SPARC system the FLUSH instruction would be issued to guarantee coherence between the I and D caches; however, it looks like the v9Decoder translates FLUSH instructions to NOPs, so subsequent fetches might continue to hit in the L1-I cache and never see the updated data.

I realize that in practice this isn't going to make any difference, because the data isn't actually being read from the caches in the simulator. But I know from experience that our workloads do write to some instruction data, and I'm wondering how this is handled in the DSMFlex or UniFlex simulators.

Thanks,

Jason

From twenisch at eecs.umich.edu Fri Sep 12 04:50:51 2008
From: twenisch at eecs.umich.edu (Thomas Wenisch)
List-Post: [email protected]
Date: Fri Sep 12 04:51:11 2008
Subject: [Simflex] FLUSH instructions
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

Hi Jason,

I believe you are correct that UniFlex and DSMFlex do not properly maintain coherence in the I caches. The core assumes the memory system handles it, and the memory system does not bother propagating invalidates to L1I. As you noted, coherence is correctly maintained in CMP simulators.

Also, the IProbe mechanism was added precisely because there are writes to I-cache lines (the writes are mostly the loader creating stubs for dynamically linked libraries). To avoid errors in the DSM coherence protocol, the L1 I-cache must interrogate the L1 D-cache before issuing an off-chip miss. That probe is enough to avoid protocol errors, but it is not sufficient to maintain coherence.

Regards,
-Thomas Wenisch

_______________________________________________
SimFlex mailing list
[email protected]
https://sos.ece.cmu.edu/mailman/listinfo/simflex
SimFlex web page: http://www.ece.cmu.edu/~simflex

From babak at cmu.edu Sat Sep 13 09:58:47 2008
From: babak at cmu.edu (Babak Falsafi)
List-Post: [email protected]
Date: Sun Sep 14 14:13:23 2008
Subject: [Simflex] FLUSH instructions
References: <[email protected]> <[email protected]>
Message-ID: <003b01c915a8$d834fc90$bf00a...@lenovo14b64ca7>

Dear Jason,

If you decide to fix the FLUSH instructions, please send us the patches so we can add them to our next release.

Regards,
_____________________
Babak Falsafi
Professor of I&C, EPFL
Adjunct Professor of ECE & CS, Carnegie Mellon
people.epfl.ch/babak.falsafi
www.c2s2.org
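
The scenario discussed in this thread can be sketched with a toy model. This is only an illustration, not Flexus code: the class `ToyCore`, its methods, and the `flush_is_nop` switch are all hypothetical names, and treating FLUSH as a single-line L1-I invalidate is a simplifying assumption. The sketch shows a fetch that fills the L1-I, a store that lands only in the L1-D, and a second fetch that either hits the stale L1-I copy (FLUSH as NOP) or misses and picks up the new value through a probe of the L1-D.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCore:
    """Toy single-core model with split L1-I / L1-D caches (hypothetical)."""

    flush_is_nop: bool = True
    memory: dict = field(default_factory=dict)  # addr -> value
    l1d: dict = field(default_factory=dict)     # addr -> value, may be newer than memory
    l1i: dict = field(default_factory=dict)     # addr -> value, read-only copies

    def store(self, addr, value):
        # Data-side write: only the L1-D is updated, no invalidate reaches L1-I.
        self.l1d[addr] = value

    def fetch(self, addr):
        # Instruction fetch: an L1-I hit returns whatever copy is cached.
        if addr in self.l1i:
            return self.l1i[addr]
        # On a miss, probe the L1-D before going to memory (the IProbe idea).
        value = self.l1d[addr] if addr in self.l1d else self.memory[addr]
        self.l1i[addr] = value
        return value

    def flush(self, addr):
        # Assumption: a non-NOP FLUSH is modelled as dropping the L1-I line.
        if not self.flush_is_nop:
            self.l1i.pop(addr, None)


def run_scenario(flush_is_nop):
    """Fetch, overwrite via the data side, FLUSH, then fetch again."""
    core = ToyCore(flush_is_nop=flush_is_nop, memory={0x1000: "old"})
    first = core.fetch(0x1000)
    core.store(0x1000, "new")
    core.flush(0x1000)
    second = core.fetch(0x1000)
    return first, second


if __name__ == "__main__":
    print(run_scenario(flush_is_nop=True))   # ('old', 'old'): stale L1-I hit
    print(run_scenario(flush_is_nop=False))  # ('old', 'new'): miss, then L1-D probe
```

In this sketch the probe alone never helps once the line is resident in the L1-I, which matches the point above that the IProbe avoids protocol errors without providing coherence.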
