Excellent explanations, Nikos! Now I got it. Thank you for your time! I believe more users will benefit from your explanations.
Best regards,
gjins

On Wed, Aug 29, 2018 at 8:23 AM, Nikos Nikoleris <[email protected]> wrote:
> Hi Gjins,
>
> On 29/08/2018 10:50, Gongjin Sun wrote:
> > Thank you for the clear explanations, Nikos. But I still have several
> > follow-up questions. Please see them below.
> >
> > On Tue, Aug 28, 2018 at 5:32 AM, Nikos Nikoleris
> > <[email protected]> wrote:
> >
> > Hi Gjins,
> >
> > Please see below for my response.
> >
> > On 27/08/2018 07:28, Gongjin Sun wrote:
> > >
> > > 1. BaseCache::access(PacketPtr pkt, CacheBlk *&blk, Cycles &lat,
> > > PacketList &writebacks) (src/mem/cache/base.cc)
> > >
> > > (1) In the segment "if (pkt->isEviction()) { ... }", if I understand it
> > > correctly, this code segment checks whether arriving requests (Writeback
> > > and CleanEvict) already have copies (for the same block address) in the
> > > write buffer, and handles them accordingly.
> > >
> > > But I notice the comment:
> > > "// We check for presence of block in above caches before issuing
> > > // Writeback or CleanEvict to write buffer. Therefore the only
> > > ..."
> > > It is confusing to say "in above caches" here. Shouldn't it be "for
> > > presence of block in this write buffer"?
> >
> > At this point, a cache above performed an eviction and this cache has
> > received the packet pkt. Before anything else, we search the write
> > buffer of this cache for any packet wbPkt for the same block. If we find
> > a matching wbPkt, then wbPkt has to be a writeback (it can't be a
> > CleanEvict).
> >
> > When we add a packet (wbPkt) to the write buffer, we check if the block
> > is cached above (see Cache::doWritebacks()). If it is cached above and
> > the packet is a CleanEvict or a WritebackClean, then we just squash it
> > and don't add it to the write buffer.
> >
> > In this case, we just received an eviction from a cache above (pkt),
> > which means that wbPkt can't be a CleanEvict, since it would have been
> > squashed.
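The interaction described above (clean evictions are squashed when the block is still cached above, so any write-buffer match seen by an incoming eviction must be a real writeback) can be sketched with a small toy model. This is illustrative only: `ToyCache`, `queueWriteback`, and `findConflict` are made-up names, not gem5's actual classes or methods.

```cpp
#include <cassert>
#include <unordered_map>

// Toy model of the write-buffer check discussed above. Not gem5 code.
enum class Cmd { WritebackDirty, WritebackClean, CleanEvict };

struct Packet {
    Cmd cmd;
    unsigned long addr;
    bool isCleanEviction() const {
        return cmd == Cmd::CleanEvict || cmd == Cmd::WritebackClean;
    }
};

struct ToyCache {
    // Pending writebacks/evictions, keyed by block address.
    std::unordered_map<unsigned long, Packet> writeBuffer;

    // Mirrors the check in Cache::doWritebacks(): clean evictions are
    // squashed (dropped) if the block is still cached above.
    void queueWriteback(const Packet &wbPkt, bool cachedAbove) {
        if (wbPkt.isCleanEviction() && cachedAbove)
            return; // squashed, never enters the write buffer
        writeBuffer[wbPkt.addr] = wbPkt;
    }

    // On receiving an eviction pkt from above, search the write buffer.
    // A match can never be a CleanEvict: a CleanEvict for a block that
    // was still cached above would have been squashed earlier.
    const Packet *findConflict(const Packet &pkt) const {
        auto it = writeBuffer.find(pkt.addr);
        if (it == writeBuffer.end())
            return nullptr;
        assert(it->second.cmd != Cmd::CleanEvict);
        return &it->second;
    }
};
```

The point of the sketch is the invariant, not the data structures: an eviction arriving from above implies the block was cached above, which is exactly the condition under which a local CleanEvict would have been squashed.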
> > I agree though that the comment here is not crystal clear. We should
> > probably update it.
> >
> > Thanks. For example, in "a Writeback generated in this cache peer
> > cache ...", does "this cache peer cache" mean "this cache" or "this
> > cache's peer cache (in another core)"?
>
> I believe this is a typo. It should be:
> "Therefore the only possible cases can be of a CleanEvict or a
> WritebackClean packet coming from above encountering a Writeback
> generated in this cache and waiting in the write buffer."
>
> > In addition, for "Cases of upper level peer caches ... simultaneously",
> > it says two scenarios: 1) upper level peer caches (they should be
> > multiple cores' L1 caches, assuming this cache is a shared L2) generate
> > CleanEvict and Writeback respectively, and 2) upper level peer caches
> > only generate CleanEvict. Is my understanding correct?
>
> 1) There could be more than one cache above.
> 2) A cache above can generate CleanEvict or WritebackClean, if I am not
> missing something.
>
> > > Also, about the comment
> > > "// Dirty writeback from above trumps our clean writeback... discard here",
> > > why is the locally found writeback clean? I think it could be clean or
> > > dirty, so the arriving dirty writeback sees a local writeback in the
> > > write buffer and the former could be (but not necessarily) newer than
> > > the latter. (One such scenario: a cpu core write-hits block A in the L1
> > > data cache and then writes it back to L2. Then the core reads it into L1
> > > again. Next, the dirty A is put into the write buffer in L2. After that,
> > > the cpu core could "write back A to L2 again" or "write A (a second
> > > write) and then write back A to L2 again". The latter makes the arriving
> > > dirty A have a different value from the dirty A in L2's write buffer.)
> >
> > In your example, I believe that the 2nd ReadEx that hits in L2 and finds
> > the block dirty will clear the dirty bit and respond with the flag
> > cacheResponding, which means that the L1 will fill in and mark the block
> > as dirty. In this particular case, I am not sure the L2 can have the
> > block dirty.
> >
> > Yeah, you are right. The 2nd ReadExReq will clear the dirty bit and set
> > the CacheResponding flag (in Cache::satisfyRequest(...), cache.cc). But
> > this block still has dirty data even if it is not marked "dirty" any
> > more ...
>
> Indeed the cache has a more recent version of the data, but another cache
> has the latest version of the data and has the responsibility to perform
> the writeback and provide the data to any request asking for it. For the
> coherence protocol this cache will not respond to any requests, and it
> might as well evict the block without writing it back (if it does, it
> will be a WritebackClean or CleanEvict).
>
> > I think the local writeback has to be clean, but I might be wrong; in
> > any case we should add an assertion here:
> > assert(wbPkt->isCleanEviction());
> > or better:
> > assert(wbPkt->cmd == MemCmd::WritebackClean);
> >
> > I agree with you. I cannot think of any scenario which allows an
> > incoming WritebackDirty from an above cache to see a second local
> > WritebackDirty. Actually, it looks like this is guaranteed by gem5's
> > MOESI implementation, which only allows one dirty copy of a block to
> > exist in the whole cache hierarchy. The scenario I mentioned could only
> > happen if multiple dirty copies were allowed to exist. Speaking of this,
> > I have a relevant question below about gem5's own MOESI.
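The dirty-bit handover discussed above can be reduced to a minimal sketch: on a request that needs a writable copy and hits a dirty block, the responder clears its own dirty bit and sets cacheResponding, and the requester then fills in the block marked dirty, taking over writeback responsibility. `Block`, `Pkt`, `satisfyRequest`, and `fillIn` here are illustrative stand-ins, not gem5's real interfaces.

```cpp
#include <cassert>

// Toy model of the ReadExReq handover described above. Not gem5 code.
struct Block {
    bool valid = false;
    bool dirty = false;
};

struct Pkt {
    bool needsWritable = false;   // e.g. a ReadExReq
    bool cacheResponding = false;
};

// Responder side: hand ownership (and the up-to-date data) to the
// requester. The responder's copy stays valid but is no longer dirty,
// so it can later be evicted with a WritebackClean or CleanEvict.
void satisfyRequest(Block &blk, Pkt &pkt) {
    if (pkt.needsWritable && blk.dirty) {
        blk.dirty = false;          // no longer the owner...
        pkt.cacheResponding = true; // ...the requester will be
    }
}

// Requester side: fill in the block; if a cache responded, the writeback
// responsibility now lives here (block installed dirty).
void fillIn(Block &blk, const Pkt &pkt) {
    blk.valid = true;
    blk.dirty = pkt.cacheResponding;
}
```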
> > (See below: why is only one dirty block allowed?)
> >
> > > About the comment
> > > "// The CleanEvict and WritebackClean snoops into other
> > > // peer caches of the same level while traversing the",
> > > do "peer caches of the same level" here mean the caches of the same
> > > level in other cpus?
> >
> > I think you are right.
> >
> > > (2) About the comment
> > > "// we could get a clean writeback while we are having outstanding
> > > accesses to a block, ..."
> > > How does this happen? I just cannot understand this. If we see an
> > > outstanding access in the local cache, that means it must have missed
> > > in the above caches for the same cpu. How can the above cache still
> > > evict a clean block (it is a miss) and write it back to the next cache
> > > level? Would you like to show one scenario for this?
> >
> > You can have more than one cache above. Take for example a dual core
> > system with private DCaches and a shared L2. Suppose DCache0 has the
> > block shared and clean, and Core1 performs a read. DCache1 doesn't have
> > the block and it will issue a ReadSharedReq. The crossbar will snoop
> > DCache0, but since it has a clean block it won't respond. The
> > ReadSharedReq will be forwarded to the L2, where it misses. The L2 will
> > create an MSHR. While the MSHR is in service in the L2, DCache0 could
> > evict the block and therefore perform a WritebackClean, which will be
> > sent to the L2.
> >
> > This scenario definitely makes sense in terms of gem5's MOESI protocol.
> > However, I just don't understand why gem5's MOESI does not allow an
> > exclusive (also clean) cache line in this core to respond to another
> > core's read request. I did notice that packet.hh has very detailed
> > comments about CacheResponding, where only "modified" or "owned" is
> > allowed to respond. But why is that?
> > I refer to several university slides which have a different MOESI
> > definition from gem5's ("std::string print()" in blk.hh clearly shows
> > the definition of each state in MOESI):
> >
> > https://inst.eecs.berkeley.edu/~cs61c/su13/disc/Disc10Sol.pdf
> > https://www.cs.virginia.edu/~cr4bd/6354/F2016/slides/lec13-slides-1up.pdf
> >
> > In these slides, "exclusive" is allowed to respond to requests from
> > other cores. Additionally, they also allow multiple dirty copies of the
> > same block to exist in multiple cores. But gem5's MOESI (according to
> > the definition in blk.hh) seems not to allow this ("Note that only one
> > cache ever has a block in Modified or Owned state, i.e., only one cache
> > owns the block, or equivalently has the BlkDirty bit set. ..."). So I'm
> > confused by this difference. Is there any special reason for gem5 to
> > design a different MOESI implementation?
>
> In the snooping MOESI protocol we've implemented in gem5, the cache that
> * has the dirty copy of the block (state M or O), or
> * has an outstanding request and expects a writable copy, or
> * has a WritebackDirty for the block
> is the ordering point. All subsequent requests for the same block will
> have a well defined order and, from the software point of view, they
> happen after. As a result, there should always be only one cache in the
> system with the block in a dirty state, a pending modified MSHR or a
> WritebackDirty, to guarantee certain memory ordering requirements.
>
> This is not the only sane design; you will definitely find systems with
> very different protocols.
>
> Nikos
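The dual-core scenario described in the thread above (a WritebackClean arriving at the L2 while the L2 already has an outstanding MSHR for the same block) can be condensed into a toy timeline. `ToyL2`, `allocateMshr`, and `recvWritebackClean` are hypothetical names used only for illustration; gem5's MSHR queue is far more involved.

```cpp
#include <set>

// Toy condensation of the dual-core scenario: the L2 can legally receive
// a clean writeback for a block it has an outstanding access to. Not
// gem5 code; all names are illustrative.
struct ToyL2 {
    std::set<unsigned long> mshrs; // block addresses with misses in service

    // A ReadSharedReq missed here: track the outstanding access.
    void allocateMshr(unsigned long addr) { mshrs.insert(addr); }

    // A WritebackClean arrives from a cache above; returns true when it
    // meets an outstanding access to the same block -- exactly the case
    // the "// we could get a clean writeback ..." comment warns about.
    bool recvWritebackClean(unsigned long addr) const {
        return mshrs.count(addr) != 0;
    }
};
```

The timeline: DCache1's ReadSharedReq snoops DCache0 (clean, so no response), misses in the L2, and allocates an MSHR; before the MSHR is serviced, DCache0 evicts its clean copy and the resulting WritebackClean reaches the L2.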
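The ordering-point rule Nikos states (only one cache may ever hold a given block in Modified or Owned, i.e. dirty, state) can be expressed as a simple system-wide invariant check. The state names follow the blk.hh description quoted in the thread; the checker itself is an illustrative sketch, not part of gem5.

```cpp
#include <vector>

// MOESI states as described in blk.hh; the invariant checker below is a
// sketch for illustration only.
enum class State { Modified, Owned, Exclusive, Shared, Invalid };

// A block is dirty (and its holder is the ordering point) in M or O.
inline bool isDirty(State s) {
    return s == State::Modified || s == State::Owned;
}

// Check the single-owner invariant for one block, given its state in
// every cache of the system: at most one cache may hold it dirty.
bool singleOwner(const std::vector<State> &copies) {
    int owners = 0;
    for (State s : copies)
        if (isDirty(s))
            ++owners;
    return owners <= 1;
}
```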
_______________________________________________
gem5-users mailing list
[email protected]
http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
