On Sun, Nov 28, 2010 at 5:44 PM, Jeroen DR <[email protected]> wrote:

>  Steve,
>
> First of all, thank you for taking the time to provide an explanation. Much
> appreciated :)
>
> So if I understand correctly, when you say "it absolutely needs to know
> whether that invalidation belongs to a request that precedes or succeeds its
> own request", you mean succeeds or precedes the moment that the downstream
> request arrives on the common bus (since the first common bus is the
> authority for determining order)?
>

Strictly speaking, I'm talking about whether the request that causes the
invalidation succeeds or precedes the cache's request in the global ordering
of requests for the block.  This is typically determined by the order in
which they reach the nearest common bus (e.g., if the block is not cached
anywhere), but if there is an owned copy in an intermediate cache then it is
determined by the first request to reach that cache.  So I think you've got
the right idea; it's just safer to think "global order" than "common bus",
since it's not always a common bus that determines the order.

Actually, maybe a better way to think of it is which request reaches the
owned copy first.  In the case where the owned copy is at main memory or at a
common downstream cache, the first request onto the nearest common bus is
going to get to that copy first, but if the owned copy is at some other cache
then it's not quite so simple.
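
To make that concrete, here is a tiny standalone sketch of that rule of
thumb; the names are made up for illustration and this is not M5 code:

enum class OwnerLocation {
    MainMemory,             // no cache holds an owned copy
    CommonDownstreamCache,  // owned copy sits below the nearest common bus
    OtherCache              // owned copy is off in some other cache
};

// Which event fixes the global order of two requests for the same block?
const char *
orderingAuthority(OwnerLocation owner)
{
    switch (owner) {
      case OwnerLocation::MainMemory:
      case OwnerLocation::CommonDownstreamCache:
        // Both requests have to cross the nearest common bus to reach the
        // owned copy, so arrival order on that bus is the global order.
        return "first request onto the nearest common bus";
      case OwnerLocation::OtherCache:
        // The owned copy is elsewhere; whichever request is seen by that
        // owning cache first is ordered first, regardless of bus order.
        return "first request to reach the owning cache";
    }
    return "unreachable";
}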

> So then, come to think of it, downstreamPending could be interpreted as
> sort of a "still has levels to go until the common bus"-flag for the
> downstream request.
>

Right, or more specifically "hasn't reached the block owner" or equivalently
"hasn't reached the point where the response will be generated".



> One case that then inevitably pops up in my head (although I'm not actually
> sure if it's plausible -- I still have many insights to gain in cache
> coherence) is if a cache snoops an invalidating request from a peer on its
> local bus rather than from a downstream cache, while the miss request is
> still moving downstream long past the local bus.
>
> In this case, the common bus would be the cache's local bus, but when the
> invalidate comes in from the peer, downstreamPending will still be set as
> the request is still pending somewhere downstream. But I guess that would be
> where the difference between an express snoop and a snoop on the local bus
> comes in?
>

Very perceptive, yes... that's exactly it: the snoop coming up from
downstream will be marked as an express snoop, but the one coming from the
peer won't, which is why the term on the first line of MSHR::handleSnoop()
is "(pkt->isExpressSnoop() && downstreamPending)".


> In my particular S-NUCA setup, I know that the L2 will always be the
> last-level cache, so I think I should indeed be able to get away with
> marking the upstream MSHR as downstreamPending as soon as the request is
> received on the main CPU-side port, and then calling
> clearDownstreamPending() only if the request hit in the bank. In case of a
> miss, downstreamPending can simply remain true. If the bank then maintains
> its regular behaviour when sending the miss request further downstream to
> main memory, clearDownstreamPending should also be called after it notices
> that no MSHR was allocated downstream.
>
> That is, of course, provided again that the request leaving the bank can be
> instantly propagated to the local bus and doesn't remain queued in an
> internal port. It wouldn't make a difference for "seeing" whether an MSHR
> was allocated (since it's the LLC anyway), but I imagine the request
> actually has to be seen on the bus so that any other caches peered to it can
> see it before sending any invalidates. In a single-LLC setup, though, this
> argument no longer applies and I think it should be safe to queue outgoing
> downstream requests.
>
> Am I on the right track here?
>

Sounds like it to me... as far as internal queuing goes, the key thing is that
you can clear downstreamPending only after the point where the request is
guaranteed to get handled before any other request to the same block that
could subsequently cause an express snoop invalidation, and it sounds like
you are taking care of that there.
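
In case it's useful, here's a rough sketch of the flow described above, under
the assumption that the L2 banks really are the last-level cache; all of the
names are hypothetical, not actual M5 code:

#include <cassert>

struct UpstreamMSHR {
    bool downstreamPending = false;
    void clearDownstreamPending() {
        assert(downstreamPending);
        downstreamPending = false;
    }
};

struct SNucaBank {
    // Request arrives on the main CPU-side port: pessimistically mark the
    // upstream MSHR as downstreamPending, since we don't yet know whether
    // the target bank will hit.
    void recvRequest(UpstreamMSHR &mshr, bool hitInBank) {
        mshr.downstreamPending = true;

        if (hitInBank) {
            // Hit in the bank: the request has reached the owned copy, so
            // its place in the global order is fixed and the flag can be
            // cleared right away.
            mshr.clearDownstreamPending();
        } else {
            // Miss: leave downstreamPending set until the miss request has
            // actually been issued toward memory, i.e. past the point where
            // another request to the same block could still get ordered
            // ahead of it and cause an express snoop invalidation.
            issueMissDownstream(mshr);
        }
    }

    void issueMissDownstream(UpstreamMSHR &mshr) {
        // ... send the request on toward main memory; since the bank is the
        // last-level cache, no downstream MSHR will be allocated ...
        mshr.clearDownstreamPending();
    }
};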

Steve