On Fri, 6 Jan 2012, Beckmann, Brad wrote:

Hi Andreas,

(moving back to gem5-dev since I suspect others will be interested)

I've dug myself out of my email hole and I think I can help out here. I read through your trace and I know what is going on. As Nilay already mentioned, we know that functional accesses, especially functional writes, will not be successful if they race with timing requests. Even though the hello world test uses a single in-order SimpleTiming CPU, a timing request is racing with the functional write. Specifically, the writeback of block 0x89580, with the directory still waiting for the data to be written to DRAM, is racing with the fstat syscall's functional write to the same block. I know it is a little hard to figure all that out from staring at the current trace with all Ruby flags turned on. In the future, I would recommend turning on just the ProtocolTrace flag. It will be much easier to understand what is going on.

Though I think I understand the problem, I'm not quite sure how to fix it. When Nilay added functional access support to Ruby, Nilay and I were hoping this situation would not occur. However, since this is just the simple 1-cpu hello world test, I think it is pretty obvious that we are going to have to deal with this situation somehow. We could just deal with these situations one-by-one by modifying the AccessPermissions for particular states. Specifically, here we can solve this problem by setting the permission of Dir:WB_E_W to Read_Write. However, is that how we want to try to solve all these issues? There are certain races that we simply can't get around by better specifying AccessPermissions.
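
To make the permission idea concrete, here is a minimal toy sketch in Python. It is not Ruby or SLICC code. The names (`AccessPermission`, `Controller`, `functional_write`, `try_write`) are hypothetical, and two things are assumptions made only for illustration: that Dir:WB_E_W currently maps to a Busy permission, and that a functional write is refused whenever any controller reports Busy for the block.

```python
from enum import Enum


class AccessPermission(Enum):
    READ_WRITE = "Read_Write"
    READ_ONLY = "Read_Only"
    BUSY = "Busy"
    INVALID = "Invalid"


class Controller:
    """Hypothetical controller: maps block address -> (state name, data)."""

    def __init__(self, name, state_permissions):
        self.name = name
        self.state_permissions = state_permissions  # state name -> permission
        self.blocks = {}

    def permission(self, addr):
        if addr not in self.blocks:
            return AccessPermission.INVALID
        state, _ = self.blocks[addr]
        return self.state_permissions[state]


def functional_write(controllers, addr, data):
    """Assumed rule: fail if any copy is Busy or no copy is Read_Write."""
    perms = [c.permission(addr) for c in controllers]
    if AccessPermission.BUSY in perms:
        return False
    if AccessPermission.READ_WRITE not in perms:
        return False
    # Update every valid copy so they all stay consistent.
    for c in controllers:
        if c.permission(addr) is not AccessPermission.INVALID:
            state, _ = c.blocks[addr]
            c.blocks[addr] = (state, data)
    return True


def try_write(wb_e_w_permission):
    """Functional write to a block the directory holds in state WB_E_W."""
    directory = Controller("Dir", {"WB_E_W": wb_e_w_permission})
    directory.blocks[0x89580] = ("WB_E_W", b"old")
    return functional_write([directory], 0x89580, b"new")


print(try_write(AccessPermission.BUSY))        # False: the write is refused
print(try_write(AccessPermission.READ_WRITE))  # True: the write goes through
```

In this toy model, changing the permission of the one transient state is enough to let the write succeed, which is also why it only helps for races where a single controller can safely claim the block.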

Nilay, what do you think?

Brad


I think we should try to understand why this problem is occurring in the first place. Andreas, in one of the earlier emails, mentioned that these memory-system patches do not introduce any timing changes. The only other reason I can think of for this test failing is that these accesses did not go through Ruby earlier. This seems strange, but maybe that is true.

Andreas, is it possible for you to figure out what particular change in the memory system is making this test fail?

Whether or not that particular state can have Read_Write permissions depends on the protocol itself. A quick glance tells me that it might be all right to change the permissions in this case. We might want to switch to a single copy of each cache block in order to avoid this problem. Do we really need the data to reside in the interconnection network to carry out a simulation? Can we not have fake data in the network and have the actual data always reside in one single place?
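
A rough sketch of the single-copy idea, in Python rather than actual Ruby code. The classes `BackingStore` and `WritebackMessage` and the function `simulate` are hypothetical names invented for this illustration. The assumption is that network messages model timing only and carry no payload, so a functional write never has to chase a copy that is in flight.

```python
class BackingStore:
    """Hypothetical single authoritative copy of every block."""

    def __init__(self):
        self._data = {}

    def read(self, addr):
        return self._data[addr]

    def write(self, addr, data):
        self._data[addr] = data


class WritebackMessage:
    """In-flight message that models timing only; it carries no data."""

    def __init__(self, addr):
        self.addr = addr


def simulate():
    store = BackingStore()
    store.write(0x89580, b"old")

    # A writeback for the block is still travelling through the network.
    in_flight = [WritebackMessage(0x89580)]

    # The functional write arrives while the writeback is in flight.
    # It updates the one real copy; there is no stale payload to patch.
    store.write(0x89580, b"new")

    # The writeback is delivered later. It carries no data, so it cannot
    # overwrite the functional write with an older value.
    delivered = in_flight.pop()
    return store.read(delivered.addr)


print(simulate())  # b'new'
```

The trade-off in this sketch is that the network no longer holds data values at all, so anything that inspects message payloads would have to look at the single store instead.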

Nilay
_______________________________________________
gem5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/gem5-dev
