On Mon, 27 Oct 2014, Steve Reinhardt wrote:

On Mon, Oct 27, 2014 at 6:58 AM, Nilay Vaish <[email protected]> wrote:

On Sun, 26 Oct 2014, Steve Reinhardt wrote:

 On Sun, Oct 26, 2014 at 2:23 PM, Nilay Vaish <[email protected]> wrote:


Marc Orr asked me the same question last year.  I am pasting the examples
I gave him:

a. The data in the message is stale, but the sender does not know it.
Take a look at the MESI CMP directory protocol. When an L1 controller (A)
sends a PUTX to the L2 controller, it is possible that the L2 controller
has already transferred ownership to some other L1 controller (B). In
this case there may be two message buffers holding messages from A and B
to the L2 controller, but it is the message from B that has the 'right'
data.
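
To make case (a) concrete, here is a rough sketch in plain Python rather
than SLICC. Everything in it (the Message class, first_match, the
controller names, the address and the data values) is hypothetical and is
not gem5 code; it only illustrates why "return the first data found"
depends on which buffer happens to be scanned first.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Message:
    sender: str             # hypothetical controller name
    addr: int               # block address the message refers to
    data: Optional[bytes]   # payload, or None if the message carries none


def first_match(buffers: List[List[Message]], addr: int) -> Optional[bytes]:
    """Naive functional read: return the payload of the first in-flight
    message for addr, in whatever order the buffers are scanned."""
    for buf in buffers:
        for msg in buf:
            if msg.addr == addr and msg.data is not None:
                return msg.data
    return None


# A's PUTX still carries the value A held before L2 moved ownership to B.
buf_from_a = [Message("L1-A", 0x40, b"old")]
# B owns the block now, so its message carries the up-to-date value.
buf_from_b = [Message("L1-B", 0x40, b"new")]

stale = first_match([buf_from_a, buf_from_b], 0x40)  # A's buffer scanned first
fresh = first_match([buf_from_b, buf_from_a], 0x40)  # B's buffer scanned first
```

Both messages match the address and both carry data, so nothing in the
messages themselves says which one to trust.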


Interesting.  I can see how this technically could be a problem, but it
seems like a pretty unlikely corner case. Have you seen it happen in
practice, and if so, what was the functional read for?  I suppose I just
have a hard time imagining an actual program that has a lot of contention
on a block that ends up being used as a parameter to a system call.  I
guess it could happen with a syscall that's specifically for
synchronization, like futex.


About a year ago, I actually committed a patch that returned the first
data value it found (after making sure that no controller had the block in
a stable state).  I ran into the case illustrated above and had to roll
back the patch.


Do you recall any details of how you ran into this?  Was this in the
process of executing an emulated syscall?  How did you detect the problem?


Nope. I might have been running a tester, or I might have been running an
actual application. I probably first figured out which commit was causing
the error and then used the trace of protocol events to figure out the
actual problem.


b. no data is present in the message and the receiver will infer that the
data it has is correct since the message did not have any data.


This seems like it should be pretty easy to fix... if you're querying the
message to see if it has relevant data, then if the address matches but
there is no data, you should just return false.  I'd think there'd be a
protocol-independent way to determine that a message has no data.  It's
similar to the idea that you have to check the valid bit in the cache, you
can't just look for a tag match.
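
Roughly what I have in mind, as a Python sketch rather than real Ruby
code. The Message class and functional_read are made-up names, and
representing "no data" as None is an assumption about how a message might
expose its payload:

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Message:
    addr: int
    data: Optional[bytes] = None   # None models a message with no payload


def functional_read(msg: Message, addr: int) -> Tuple[bool, Optional[bytes]]:
    """Return (hit, data). An address match alone is not a hit, just as a
    tag match alone is not a cache hit without the valid bit."""
    if msg.addr != addr:
        return False, None
    if msg.data is None:
        # Address matches but there is nothing to read: report a miss.
        return False, None
    return True, msg.data
```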


I doubt we can do this in a protocol-independent way.  I think the
presence / absence of data is decided by the type of the message, which is
a protocol-specific enum.
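
As a toy illustration (Python, not SLICC): the enum below and the choice
of which types carry data are invented for the example and do not match
any real protocol definition. The point is only that the lookup table has
to come from the protocol itself.

```python
from enum import Enum, auto


class ToyRequestType(Enum):
    """Hypothetical message types for one made-up protocol."""
    GETS = auto()
    GETX = auto()
    PUTX = auto()
    ACK = auto()


# Assumption for this example only: in the toy protocol, PUTX is the one
# type that carries a data block. Another protocol would need its own set.
CARRIES_DATA = {ToyRequestType.PUTX}


def has_data(msg_type: ToyRequestType) -> bool:
    """Protocol-specific check: does this message type carry data?"""
    return msg_type in CARRIES_DATA
```

Generic code cannot write has_data() without knowing the enum, so each
protocol would have to supply something like it.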


I don't see how this is any harder than having the cache controller know
whether a block is valid or writable in a protocol-independent fashion.
Unless the entire message format is completely protocol defined, I'd think
it would be even easier, since you could directly check whether there's a
data or payload section in the message.


The message format, the data block associated with a cached address, and
the dirty bit associated with a block: all of these are defined by the
protocol.

--
Nilay
_______________________________________________
gem5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/gem5-dev
