Hi Gjins,

On 29/08/2018 10:50, Gongjin Sun wrote:
> Thank you for the clear explanations, Nikos. But I still have several
> follow-up questions. Please see them below.
>
> On Tue, Aug 28, 2018 at 5:32 AM, Nikos Nikoleris
> <[email protected]> wrote:
>
> > Hi Gjins,
> >
> > Please see below for my response.
> >
> > On 27/08/2018 07:28, Gongjin Sun wrote:
> >
> > > 1. BaseCache::access(PacketPtr pkt, CacheBlk *&blk, Cycles &lat,
> > >    PacketList &writebacks) (src/mem/cache/base.cc)
> > >
> > > (1) In the segment "if (pkt->isEviction()) { ... }", if I understand
> > > it correctly, this code segment checks whether arriving requests
> > > (Writeback and CleanEvict) already have copies (for the same block
> > > address) in the write buffer and handles them accordingly.
> > >
> > > But I notice the comments
> > > "// We check for presence of block in above caches before issuing
> > > // Writeback or CleanEvict to write buffer. Therefore the only
> > > ..."
> > > It is confusing to say "in above caches" here. Shouldn't it be "for
> > > presence of block in this Write Buffer"?
> >
> > At this point, a cache above performed an eviction and this cache has
> > received the packet pkt. Before anything else, we search the write
> > buffer of this cache for any packet wbPkt for the same block. If we
> > find a matching wbPkt, then wbPkt has to be a writeback (it can't be
> > a CleanEvict).
> >
> > When we add a packet (wbPkt) to the write buffer we check if the
> > block is cached above (see Cache::doWritebacks()). If it is cached
> > above and the packet is a CleanEvict or a WritebackClean then we just
> > squash it and we don't add it to the write buffer.
> >
> > In this case, we just received an eviction from a cache above (pkt),
> > which means that wbPkt can't be a CleanEvict since it would have been
> > squashed.
> >
> > I agree though that the comment here is not crystal clear. We should
> > probably update it.
>
> Thanks.
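[Editor's note: the squashing rule described above can be sketched as a small standalone predicate. This is an illustrative model of the decision made in Cache::doWritebacks(), not gem5's actual code; the enum and function names are made up for the example.]

```cpp
#include <cassert>

// Minimal sketch of the eviction-squashing rule: when the block being
// evicted is still cached in an upper-level cache, a CleanEvict or
// WritebackClean carries no unique data and is dropped (squashed)
// instead of being queued in the write buffer. A WritebackDirty always
// enters the write buffer. Names are illustrative, not gem5's.

enum class EvictCmd { CleanEvict, WritebackClean, WritebackDirty };

// Returns true if the eviction packet should be squashed rather than
// added to the write buffer.
bool squashEviction(EvictCmd cmd, bool isCachedAbove)
{
    return isCachedAbove && (cmd == EvictCmd::CleanEvict ||
                             cmd == EvictCmd::WritebackClean);
}
```

This is why an eviction arriving from above can never find a CleanEvict for the same block waiting in this cache's write buffer: it would already have been squashed when it was queued.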
> For example, in "a Writeback generated in this cache peer cache ...",
> does "this cache peer cache" mean "this cache" or "this cache's peer
> cache (in another core)"?
I believe this is a typo. It should be:

Therefore the only possible cases can be of a CleanEvict or a
WritebackClean packet coming from above encountering a Writeback
generated in this cache and waiting in the write buffer.

> In addition, for "Cases of upper level peer caches ... simultaneously",
> it describes two scenarios: 1) upper level peer caches (they should be
> multiple cores' L1 caches, assuming this cache is a shared L2) generate
> a CleanEvict and a Writeback respectively, and 2) upper level peer
> caches only generate CleanEvicts. Is my understanding correct?

1) There could be more than one cache above. 2) A cache above can
generate a CleanEvict or a WritebackClean, if I am not missing
something.

> > > Also, about the comments
> > > "// Dirty writeback from above trumps our clean writeback... discard
> > > here", why is the locally found writeback clean? I think it could be
> > > clean or dirty. So an arriving dirty writeback sees a local
> > > writeback in the write buffer, and the former could be (but is not
> > > necessarily) newer than the latter. (One such scenario: the cpu core
> > > write-hits block A in the L1 data cache and then writes it back to
> > > the L2. Then the core reads it into the L1 again. Next, the dirty A
> > > is put into the L2's write buffer. After that, the cpu core could
> > > "write back A to the L2 again" or "write A (a second write) and then
> > > write back A to the L2 again". The latter makes the arriving dirty A
> > > have a different value from the dirty A in the L2's write buffer.)
> >
> > In your example, I believe that the 2nd ReadEx that hits in the L2
> > and finds the block dirty will clear the dirty bit and respond with
> > the flag cacheResponding, which means that the L1 will fill in and
> > mark the block as dirty. In this particular case, I am not sure the
> > L2 can have the block dirty.
>
> Yes, you are right. The 2nd ReadExReq will clear the dirty bit and set
> the CacheResponding flag (in Cache::satisfyRequest(...), cache.cc). But
> this block still has dirty data even though it is not marked "dirty"
> any more ...
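[Editor's note: the dirty-bit hand-off mentioned above, where a ReadExReq hit on a dirty block clears the responder's dirty bit and sets CacheResponding so the requester fills its copy in as dirty, can be sketched as follows. The structs and function are hypothetical simplifications modeled on the behavior of Cache::satisfyRequest(), not gem5's actual types.]

```cpp
#include <cassert>

// Hypothetical miniature of the dirty-ownership hand-off on a
// ReadExReq hit: the responding cache keeps the (now clean) data but
// passes ownership of the dirty state to the requester via the
// cacheResponding flag on the response.

struct Blk { bool valid = true; bool dirty = false; };
struct Pkt { bool cacheResponding = false; };

void satisfyReadEx(Blk &blk, Pkt &resp)
{
    if (blk.dirty) {
        resp.cacheResponding = true; // requester will mark its fill dirty
        blk.dirty = false;           // this copy is no longer the owner
    }
}
```

After the hand-off the responder's copy still holds the latest data, but it is no longer responsible for writing it back, which is exactly the situation Gongjin describes.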
Indeed the cache has a more recent version of the data, but another
cache has the latest version of the data and has the responsibility to
perform the writeback and provide the data to any request asking for
it. From the coherence protocol's point of view, this cache will not
respond to any requests, and it might as well evict the block without
writing it back (if it does write it back, it will be a WritebackClean
or a CleanEvict).

> > I think the local writeback has to be clean, but I might be wrong.
> > In any case we should add an assertion here:
> >
> > assert(wbPkt->isCleanEviction());
> >
> > or better:
> >
> > assert(wbPkt->cmd == MemCmd::WritebackClean);
>
> I agree with you. I cannot think of any scenario which allows an
> incoming WritebackDirty from the cache above to see a second local
> WritebackDirty. Actually, it looks like this is guaranteed by gem5's
> MOESI implementation, which only allows one dirty copy of a block to
> exist in the whole cache hierarchy. The scenario I mentioned could only
> happen if multiple dirty copies were allowed to exist. Speaking of
> this, I have a relevant question below about gem5's own MOESI (see
> below: why is only one dirty copy allowed?).
>
> > > About the comments
> > > "// The CleanEvict and WritebackClean snoops into other
> > > // peer caches of the same level while traversing the",
> > >
> > > do "peer caches of the same level" here mean the caches of the same
> > > level in other cpus?
> >
> > I think you are right.
>
> > > (2) About the comments
> > > "// we could get a clean writeback while we are having outstanding
> > > accesses to a block, ..."
> > > How does this happen? I just cannot understand this. If we see an
> > > outstanding access in the local cache, that means it must have
> > > missed in the caches above for the same cpu. How can the cache
> > > above still evict a clean block (it is a miss) and write it back to
> > > the next cache level? Would you like to show one scenario for this?
> >
> > You can have more than one cache above. Take for example a dual-core
> > system with private DCaches and a shared L2.
> > Suppose DCache0 has the block shared and clean, and Core1 performs a
> > read. DCache1 doesn't have the block and will issue a ReadSharedReq.
> > The crossbar will snoop DCache0, but since it has a clean block it
> > won't respond. The ReadSharedReq will be forwarded to the L2, where
> > it misses. The L2 will create an MSHR. While the MSHR is in service
> > in the L2, DCache0 could evict the block and therefore perform a
> > WritebackClean, which will be sent to the L2.
>
> This scenario definitely makes sense in terms of gem5's MOESI protocol.
> However, I just don't understand why gem5's MOESI does not allow an
> exclusive (also clean) cache line in this core to respond to another
> core's read request. I did notice that packet.hh has very detailed
> comments about CacheResponding, where only "modified" or "owned" is
> allowed to respond. But why is that? I refer to several university
> slides which have a different MOESI definition from gem5's
> ("std::string print() (in blk.hh)" clearly shows the definition of
> each state in MOESI):
>
> https://inst.eecs.berkeley.edu/~cs61c/su13/disc/Disc10Sol.pdf
> https://www.cs.virginia.edu/~cr4bd/6354/F2016/slides/lec13-slides-1up.pdf
>
> In these slides, "exclusive" is allowed to respond to requests from
> other cores. Additionally, they also allow multiple dirty copies of
> the same block to exist in multiple cores. But gem5's MOESI (according
> to the definition in blk.hh) seems not to allow this ("Note that only
> one cache ever has a block in Modified or Owned state, i.e., only one
> cache owns the block, or equivalently has the BlkDirty bit set. ..."),
> so I'm confused by this difference. Is there any special reason for
> gem5 to design a different MOESI implementation?
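[Editor's note: in the dual-core scenario above, DCache0 stays silent on the snoop precisely because its copy is clean. Per the packet.hh comments cited, only a Modified or Owned copy responds to snoops. That rule can be sketched as a standalone predicate; the enum and function names below are illustrative, not gem5 identifiers.]

```cpp
#include <cassert>

// Sketch of the snoop-response rule in gem5's snooping protocol: a
// snooped cache supplies data only when it owns the block, i.e. holds
// the dirty copy (Modified or Owned). Clean copies (Shared, Exclusive)
// stay silent, so the request falls through to the next level.

enum class CohState { Invalid, Shared, Exclusive, Owned, Modified };

// True if a cache holding the block in state s must respond to another
// core's read snoop.
bool snoopResponds(CohState s)
{
    return s == CohState::Modified || s == CohState::Owned;
}
```

Under this rule, DCache0's Shared (or even Exclusive) copy does not respond to the ReadSharedReq snoop, which is why the request reaches the L2 and allocates an MSHR there.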
In the snooping MOESI protocol we've implemented in gem5, the cache that

 * has the dirty copy of the block (state M or O), or
 * has an outstanding request and expects a writable copy, or
 * has a pending WritebackDirty for the block

is the ordering point. All subsequent requests for the same block have
a well-defined order and, from the software point of view, happen after
it. As a result, there should always be only one cache in the system
with the block in a dirty state, a pending modified MSHR, or a
WritebackDirty, to guarantee certain memory ordering requirements. This
is not the only sane design; you will definitely find systems with very
different protocols.

Nikos

_______________________________________________
gem5-users mailing list
[email protected]
http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
