I'm only somewhat following this thread, but I think it's worth saying that requiring protocols to actually get the data correct has one huge benefit: it causes incorrect algorithms to produce incorrect answers. Sometimes we want to allow shortcuts, but generally with modeling, it's nice to have the extra checking. One of the big things that we did long ago in M5 was to build an execute-in-execute CPU model. This was a big departure from all of the other simulators at the time, but it meant one thing: it was really hard to cheat in our model. I think that forcing protocols to get data correct simply means forcing protocols to be correct. Not a bad thing.
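To make the two cases quoted below concrete, here is a rough Python sketch of a functional read over in-flight messages. Everything in it is hypothetical: `Message`, `functional_read_candidates`, and `functional_read` are illustration-only names, not gem5 or Ruby APIs, and returning nothing when payloads disagree is an assumption of this sketch, not something proposed in the thread.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Message:
    """Hypothetical in-flight coherence message; not a gem5/Ruby class."""
    sender: str
    addr: int
    data: Optional[bytes] = None  # None models a control-only message


def functional_read_candidates(buffers: Iterable[Iterable[Message]],
                               addr: int) -> list:
    """Collect every in-flight message that carries a payload for addr."""
    hits = []
    for buf in buffers:
        for msg in buf:
            # An address match alone is not enough, just as a tag match
            # without a valid bit is not a cache hit (case b).
            if msg.addr == addr and msg.data is not None:
                hits.append(msg)
    return hits


def functional_read(buffers, addr) -> Optional[bytes]:
    """Return data only when all in-flight payloads for addr agree.

    Assumption of this sketch: if two messages disagree (case a, a stale
    PUTX racing with the new owner's message), refuse to guess.
    """
    values = {m.data for m in functional_read_candidates(buffers, addr)}
    if len(values) == 1:
        return values.pop()
    return None  # no payload found, or conflicting payloads


if __name__ == "__main__":
    # Case (b): address matches, but the message has no payload.
    print(functional_read([[Message("L1-A", 0x40)]], 0x40))
    # Case (a): two buffers hold different data for the same block.
    racing = [[Message("L1-A", 0x40, b"old")],
              [Message("L1-B", 0x40, b"new")]]
    print(functional_read(racing, 0x40))
```

A "return the first value found" policy would hand back `b"old"` or `b"new"` here depending only on buffer order, which is the failure mode described in case (a).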
Nate

On Mon, Oct 27, 2014 at 10:02 AM, Steve Reinhardt via gem5-dev <[email protected]> wrote:

> On Mon, Oct 27, 2014 at 6:58 AM, Nilay Vaish <[email protected]> wrote:
>
> > On Sun, 26 Oct 2014, Steve Reinhardt wrote:
> >
> >> On Sun, Oct 26, 2014 at 2:23 PM, Nilay Vaish <[email protected]> wrote:
> >>
> >>> Marc Orr asked me the same question last year. I am pasting the
> >>> examples I gave him:
> >>>
> >>> a. the data in the message is stale, but the sender does not know about
> >>> it. Take a look at the MESI CMP directory protocol. In the case when an
> >>> L1 controller (A) sends a PUTX to the L2 controller, it is possible that
> >>> the L2 controller has already transferred the ownership to some L1
> >>> controller (B). In this case, it is possible that there are two message
> >>> buffers that contain messages from A and B to the L2 controller, but it
> >>> is message from B which has the 'right' data.
> >>
> >> Interesting. I can see how this technically could be a problem, but it
> >> seems like a pretty unlikely corner case. Have you seen it happen in
> >> practice, and if so, what was the functional read for? I suppose I just
> >> have a hard time imagining an actual program that has a lot of contention
> >> on a block that ends up being used as a parameter to a system call. I
> >> guess it could happen with a syscall that's specifically for
> >> synchronization, like futex.
> >
> > About an year, I had actually committed a patch that returned the first
> > data value it found (after making sure that no controller had the block
> > in a stable state). I ran into the case illustrated above and I had to
> > rollback the patch.
>
> Do you recall any details of how you ran into this? Was this in the
> process of executing an emulated syscall? How did you detect the problem?
>
> >>> b. no data is present in the message and the receiver will infer that
> >>> the data it has is correct since the message did not have any data.
> >>
> >> This seems like it should pretty easy to fix... if you're querying the
> >> message to see if it has relevant data, then if the address matches but
> >> there is no data, you should just return false. I'd think there'd be a
> >> protocol-independent way to determine that a message has no data. It's
> >> similar to the idea that you have to check the valid bit in the cache,
> >> you can't just look for a tag match.
> >
> > I doubt we can do this in protocol independent way. I think the presence
> > / absence of data is decided by the type of the message which is a
> > protocol specific enum.
>
> I don't see how this is any harder than having the cache controller know
> whether a block is valid or writable in a protocol-independent fashion.
> Unless the entire message format is completely protocol defined, I'd think
> it would be even easier, since you could directly check whether there's a
> data or payload section in the message.
>
> Steve
> _______________________________________________
> gem5-dev mailing list
> [email protected]
> http://m5sim.org/mailman/listinfo/gem5-dev
