Is your suggestion to live with the failing regression at the moment? To put it differently: is there something I can/must do to assist with solving this issue or can I keep on going and leave this to a Ruby-expert (read Brad or Nilay) to sort out?
Andreas

-----Original Message-----
From: Beckmann, Brad [mailto:[email protected]]
Sent: 06 January 2012 23:58
To: Nilay Vaish
Cc: Andreas Hansson; Ali Saidi; gem5 Developer List ([email protected])
Subject: RE: [gem5-dev] One failing Ruby regression after memory-system patches

> I think we should try to understand why this problem is occurring in the
> first place. Andreas, in one of the earlier emails, mentioned that these
> memory-system patches do not introduce any timing changes. The only other
> reason I can think of why this test is failing is that these accesses did
> not use to go through Ruby earlier. This seems strange, but maybe that is
> true.

The problem occurs because of a race between timing requests and functional requests that come from an emulated system call that doesn't appear to have been modified in years. I doubt there is anything in Andreas's patches that directly causes this problem. They probably just reorder the requests in a particular way that now causes the rare race to occur with the hammer protocol. Having a functional access race with a timing writeback seems like a very rare situation. I'm not surprised we haven't seen this before.

> Andreas, is it possible for you to figure out what particular change in the
> memory system is making this test fail?
>
> Whether or not that particular state can have Read_Write permissions
> depends on the protocol itself. A quick glance tells me that it might be all
> right to change the permissions in this case. We might want to switch to a
> single copy of each cache block in order to avoid this problem. Do we really
> need the data to reside in the interconnection network to carry out a
> simulation? Can we not have fake data in the network and the actual data
> always resides at one single place?

I'd rather not remove data from the interconnect. That is certainly not in the spirit of "execute at execute".

Having data exist in one single place is what we do today with Ruby's backing copy of physmem. If we have data always reside in one single place, then we might as well remove all of Ruby's functional access support and go back to just sending all functional accesses to physmem.

For the particular problem we're seeing today, data is not stuck in the interconnection network. Rather, it is just stuck in the DRAM request queue that simulates the timing of the DRAM interface. The data itself has already been written to DirectoryMemory.

Overall, I'm not happy with any solution that comes to my mind. I don't like having to deal with these problems one-by-one, nor do I want to remove Ruby's functional access support. I also don't want to have to build some sort of complicated mechanism that tries to identify valid data floating in any Ruby buffer (network, DRAM, etc.) because I don't see how one can do that without putting a lot of burden/restriction on the protocol writer.

Brad
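
For readers following the thread, the race Brad describes can be illustrated with a toy model. This is only a sketch under stated assumptions, not gem5 or Ruby code: the class `ToyDirectory`, its methods, and the `Perm` values are hypothetical names invented for illustration. It assumes, as the message above says, that a timing writeback stores the data in the directory memory right away, while the request stays in a queue that only models DRAM timing, and that a functional read succeeds only when the block is in a state with a valid permission.

```python
from collections import deque
from enum import Enum


class Perm(Enum):
    # Illustrative permission names for this toy model only.
    READ_WRITE = "Read_Write"
    BUSY = "Busy"
    INVALID = "Invalid"


class ToyDirectory:
    """Hypothetical stand-in for a directory controller with a DRAM timing queue."""

    def __init__(self):
        self.memory = {}           # models the directory's data store
        self.dram_queue = deque()  # models the queue that only simulates DRAM timing
        self.perm = {}             # per-block access permission

    def timing_writeback(self, addr, data):
        # The data is written immediately, but the block stays in a
        # transient state until the simulated DRAM request completes.
        self.memory[addr] = data
        self.perm[addr] = Perm.BUSY
        self.dram_queue.append(addr)

    def drain_dram_queue(self):
        # Completing the queued requests moves each block to a stable state.
        while self.dram_queue:
            addr = self.dram_queue.popleft()
            self.perm[addr] = Perm.READ_WRITE

    def functional_read(self, addr):
        # A functional read only succeeds on a block with a valid permission.
        if self.perm.get(addr, Perm.INVALID) is Perm.READ_WRITE:
            return self.memory[addr]
        return None


directory = ToyDirectory()
directory.timing_writeback(0x40, 0xAB)
during = directory.functional_read(0x40)  # fails: writeback still queued
directory.drain_dram_queue()
after = directory.functional_read(0x40)   # succeeds once the queue drains
```

In this sketch the functional read fails while the writeback is queued even though the data is already in `memory`, which mirrors the situation described above. Whether a transient state may safely expose Read_Write permission is, as Nilay notes, a protocol-specific question that this toy does not answer.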
