I agree. Seems pretty dangerous to me.

Nate
On Mon, Jan 9, 2012 at 12:53 PM, Gabriel Michael Black <[email protected]> wrote:
> I think a new thread and a new event queue are independent. I don't like how
> we're already adding something to run time forward and then throw things
> away and roll back time. Time should be monotonically increasing.
>
> Gabe
>
>
> Quoting "Beckmann, Brad" <[email protected]>:
>
>> Make it a separate patch please. There are probably 100s of similar races
>> that still exist in Ruby. Some of them can be fixed by simply better
>> defining the access permissions (like the current fix), but others can't be
>> fixed given the current infrastructure. My email last week was lamenting
>> the fact that this fix is very unsatisfying.
>>
>> So just to throw an idea out here... What do people think about
>> supporting Ruby functional accesses by launching a separate thread? If the
>> eventual plan is to have gem5 multi-threaded with each thread having a
>> separate eventqueue, then one could imagine launching a thread that
>> transforms a functional access into a timing access that utilizes its own
>> eventqueue, thus the original thread's call stack and eventqueue are
>> unperturbed. I know it sounds a little crazy and who knows when
>> multi-threaded support will actually exist. However, it would provide Ruby
>> functional access support without requiring a bunch of access permission
>> fixes or protocol redesign.
>>
>> Brad
>>
>>
>>
>>> -----Original Message-----
>>> From: Andreas Hansson [mailto:[email protected]]
>>> Sent: Monday, January 09, 2012 7:40 AM
>>> To: [email protected]
>>> Cc: Beckmann, Brad; Ali Saidi; gem5 Developer List ([email protected])
>>> Subject: RE: [gem5-dev] One failing Ruby regression after memory-system
>>> patches
>>>
>>> Hi Nilay,
>>>
>>> Thanks! With the suggested change (Busy->Read_Write) the regression
>>> passes without any errors.
>>> Are you suggesting I make this modification a part
>>> of the existing patch (port proxy introduction) or shall we address this
>>> as a separate patch to ensure there are no undesirable side effects?
>>>
>>> Andreas
>>>
>>> -----Original Message-----
>>> From: Nilay [mailto:[email protected]]
>>> Sent: 09 January 2012 15:13
>>> To: Andreas Hansson
>>> Cc: Beckmann, Brad; Ali Saidi; gem5 Developer List ([email protected])
>>> Subject: RE: [gem5-dev] One failing Ruby regression after memory-system
>>> patches
>>>
>>> Andreas, in the file src/mem/protocol/MOESI_hammer-dir.sm, set the
>>> access permission for state WB_E_W to Read_Write, instead of Busy, the
>>> currently set permission. See if this helps in removing the error.
>>>
>>> --
>>> Nilay
>>>
>>> On Mon, January 9, 2012 7:58 am, Andreas Hansson wrote:
>>> > Is your suggestion to live with the failing regression at the moment?
>>> > To put it differently: is there something I can/must do to assist with
>>> > solving this issue or can I keep on going and leave this to a
>>> > Ruby-expert (read Brad or Nilay) to sort out?
>>> >
>>> > Andreas
>>> >
>>> > -----Original Message-----
>>> > From: Beckmann, Brad [mailto:[email protected]]
>>> > Sent: 06 January 2012 23:58
>>> > To: Nilay Vaish
>>> > Cc: Andreas Hansson; Ali Saidi; gem5 Developer List
>>> > ([email protected])
>>> > Subject: RE: [gem5-dev] One failing Ruby regression after
>>> > memory-system patches
>>> >
>>> >>
>>> >> I think we should try to understand why this problem is
>>> >> occurring in the first place. Andreas, in one of the earlier emails,
>>> >> mentioned that these memory-system patches do not introduce any
>>> >> timing changes. The only other
>>> >> reason I can think of why this test is failing is that these
>>> >> accesses did not use to go through Ruby earlier. This seems strange,
>>> >> but maybe that is true.
>>> >>
>>> >
>>> > The problem occurs because of a race between timing requests and
>>> > functional requests that come from an emulated system call that doesn't
>>> > appear to have been modified in years. I doubt there is anything in
>>> > Andreas's patches that directly causes this problem. They probably
>>> > just reorder the requests in a particular way that now causes the rare
>>> > race to occur with the hammer protocol. Having a functional access
>>> > race with a timing writeback seems like a very rare situation. I'm
>>> > not surprised we haven't seen this before.
>>> >
>>> >> Andreas, is it possible for you to figure out what particular change
>>> >> in the memory system is making this test fail?
>>> >>
>>> >> Whether or not that particular state can have Read_Write permissions
>>> >> depends on the protocol itself. A quick glance tells me that it might
>>> >> be all right to change the permissions in this case. We might want to
>>> >> switch to a single copy of each cache block in order to avoid this
>>> >> problem. Do we really need the data to reside in the interconnection
>>> >> network to carry out a simulation? Can we not have fake data in the
>>> >> network and the actual data always reside at one single place?
>>> >>
>>> > I'd rather not remove data from the interconnect. That is certainly
>>> > not in the spirit of "execute at execute". Having data exist in one
>>> > single place is what we do today with Ruby's backing copy of physmem.
>>> > If we have data always reside in one single place, then we might as
>>> > well remove all of Ruby's functional access support and go back to
>>> > just sending all functional accesses to physmem.
>>> >
>>> > For the particular problem we're seeing today, data is not stuck in
>>> > the interconnection network. Rather it is just stuck in the DRAM
>>> > request queue that simulates the timing of the DRAM interface. The
>>> > data itself has already been written to DirectoryMemory.
>>> >
>>> > Overall, I'm not happy with any solution that comes to my mind. I
>>> > don't like having to deal with these problems one-by-one, nor do I
>>> > want to remove Ruby's functional access support. I also don't want to
>>> > have to build some sort of complicated mechanism that tries to
>>> > identify valid data floating in any Ruby buffer (network, DRAM, etc.)
>>> > because I don't see how one can do that without putting a lot of
>>> > burden/restriction on the protocol writer.
>>> >
>>> > Brad
>>> >
>>> > -- IMPORTANT NOTICE: The contents of this email and any attachments
>>> > are confidential and may also be privileged. If you are not the
>>> > intended recipient, please notify the sender immediately and do not
>>> > disclose the contents to any other person, use it for any purpose, or
>>> > store or copy the information in any medium. Thank you.
>>> >
>>
>> _______________________________________________
>> gem5-dev mailing list
>> [email protected]
>> http://m5sim.org/mailman/listinfo/gem5-dev
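
For readers following the thread, the race Brad describes and the effect of Nilay's Busy->Read_Write change can be illustrated with a toy model. This is only a sketch under stated assumptions: it is not gem5 or SLICC code, and the names `AccessPermission`, `functional_read`, and `directory_in_wb_e_w` are hypothetical, introduced here for illustration. It assumes a functional read succeeds only if some controller holds the block with a readable permission, and that the directory in state WB_E_W already holds the written-back data while the DRAM write is still queued.

```python
from enum import Enum


class AccessPermission(Enum):
    # Hypothetical stand-ins for the permission names mentioned in the thread.
    READ_WRITE = "Read_Write"
    READ_ONLY = "Read_Only"
    BUSY = "Busy"
    INVALID = "Invalid"


READABLE = {AccessPermission.READ_WRITE, AccessPermission.READ_ONLY}


def functional_read(copies):
    """Return the data of the first readable copy, or None if none is readable.

    `copies` is a list of (permission, data) pairs, one per controller.
    This is a simplification, not Ruby's actual functional-read logic.
    """
    for permission, data in copies:
        if permission in READABLE:
            return data
    return None


def directory_in_wb_e_w(permission, data):
    """Model the directory in WB_E_W: the data is already in directory
    memory, but the timing write is still waiting in the DRAM queue."""
    return (permission, data)


# The cache has already written the block back, so its copy is invalid.
cache_copy = (AccessPermission.INVALID, None)

# WB_E_W marked Busy: the functional read finds no readable copy.
before = functional_read(
    [cache_copy, directory_in_wb_e_w(AccessPermission.BUSY, 0xAB)])

# WB_E_W marked Read_Write: the functional read sees the directory's data.
after = functional_read(
    [cache_copy, directory_in_wb_e_w(AccessPermission.READ_WRITE, 0xAB)])

print(before, after)
```

In this model the only difference between the failing and passing case is the permission attached to the transient state, which matches the observation above that the data itself has already been written to DirectoryMemory. Whether Read_Write is actually safe for a given transient state still depends on the protocol, as Nilay notes.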
