On Fri, 6 Jan 2012, Beckmann, Brad wrote:
Hi Andreas,
(moving back to gem5-dev since I suspect others will be interested)
I've dug myself out of my email hole and I think I can help out here.
I read through your trace and I know what is going on. As Nilay already
mentioned, we know that functional accesses, especially functional
writes, will not be successful if they race with timing requests. Even
though the hello world test uses a single in-order SimpleTiming CPU, a
timing request is racing with the functional write. Specifically, the
writeback of block 0x89580, with the directory still waiting for the data
to be written to DRAM, is racing with the fstat syscall's functional write
to the same block. I know it is a little hard to figure all that out from
staring at the current trace with all Ruby flags turned on. In the
future, I would recommend just turning on the ProtocolTrace flag. It
will be much easier to understand what is going on.
Though I think I understand the problem, I'm not quite sure how to fix
it. When Nilay added functional access support to Ruby, Nilay and I
were hoping this situation would not occur. However, since this is just
the simple 1-cpu hello world test, I think it is pretty obvious that we
are going to have to deal with this situation somehow. We could just
deal with these situations one-by-one by modifying the AccessPermissions
for particular states. Specifically, here we can solve this problem by
setting the permission of Dir:WB_E_W to Read_Write. However, is that
how we want to try to solve all these issues? There are certain races
that we simply can't get around by better specifying AccessPermissions.
Nilay, what do you think?
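
To make the AccessPermissions idea above concrete, here is a minimal toy
model in Python. It is not Ruby's actual code: the success rule, the
controller names, and the assumption that Dir:WB_E_W currently reports Busy
are all illustrative guesses. It only shows why a functional write can fail
while the block is in a transient state, and how marking that state
Read_Write would let it through.

```python
from enum import Enum


class AccessPermission(Enum):
    READ_ONLY = "Read_Only"
    READ_WRITE = "Read_Write"
    INVALID = "Invalid"
    BUSY = "Busy"


def functional_write_ok(permissions):
    """Toy rule (an assumption, not Ruby's real check): the write succeeds
    only if no controller is Busy and at least one holds Read_Write."""
    if any(p is AccessPermission.BUSY for p in permissions):
        return False
    return any(p is AccessPermission.READ_WRITE for p in permissions)


# Assumed snapshot during the race: the cache has already given up the
# block, and the directory sits in WB_E_W waiting for the DRAM write.
during_race = {
    "L1Cache": AccessPermission.INVALID,    # hypothetical controller name
    "Dir:WB_E_W": AccessPermission.BUSY,    # assumed current permission
}
print(functional_write_ok(during_race.values()))  # False

# Proposed change: give Dir:WB_E_W the Read_Write permission.
with_fix = dict(during_race)
with_fix["Dir:WB_E_W"] = AccessPermission.READ_WRITE
print(functional_write_ok(with_fix.values()))  # True
```

The sketch also hints at the limit of this approach: it only helps when
some controller can legitimately claim Read_Write at the moment of the
functional access.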
Brad
I think we should try to understand why this problem is occurring in the
first place. Andreas, in one of the earlier emails, mentioned that these
memory-system patches do not introduce any timing changes. The only other
reason I can think of for this test failing is that these accesses did
not go through Ruby earlier. This seems strange, but maybe that is
true.
Andreas, is it possible for you to figure out what particular change in
the memory system is making this test fail?
Whether or not that particular state can have Read_Write permissions
depends on the protocol itself. A quick glance tells me that it might be
all right to change the permissions in this case. We might want to switch
to a single copy of each cache block in order to avoid this problem. Do we
really need the data to reside in the interconnection network to carry out
a simulation? Can we not have fake data in the network and have the actual
data always reside in one single place?
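
As a rough illustration of the single-copy idea, here is a minimal Python
sketch. Everything in it (BackingStore, Message, the 64-byte block size) is
hypothetical and not part of Ruby. Protocol messages carry only an address,
and all data lives in one store, so a functional write cannot be lost to a
stale copy that is in flight.

```python
from dataclasses import dataclass

BLOCK_SIZE = 64  # assumed block size in bytes


@dataclass(frozen=True)
class Message:
    """Protocol message with no data payload (hypothetical)."""
    kind: str
    block_addr: int


class BackingStore:
    """Single authoritative copy of every block (hypothetical)."""

    def __init__(self):
        self._blocks = {}

    def _locate(self, addr, size):
        # Split an address into its block and the offset inside the block.
        offset = addr % BLOCK_SIZE
        if offset + size > BLOCK_SIZE:
            raise ValueError("access crosses a block boundary")
        block = self._blocks.setdefault(addr - offset, bytearray(BLOCK_SIZE))
        return block, offset

    def write(self, addr, data):
        block, offset = self._locate(addr, len(data))
        block[offset:offset + len(data)] = data

    def read(self, addr, size):
        block, offset = self._locate(addr, size)
        return bytes(block[offset:offset + size])


store = BackingStore()

# A writeback for block 0x89580 is still in flight, but it carries no data.
in_flight = [Message("WRITEBACK", 0x89580)]

# The functional write goes straight to the single copy, whatever state the
# controllers are in, so the racing writeback cannot overwrite it.
store.write(0x89588, b"\x2a")
print(store.read(0x89588, 1))  # b'*'
```

The trade-off is that timing messages would no longer model the data they
move, so any check that compares data along the way would have to consult
the store instead.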
Nilay