> On Aug. 14, 2015, 8:03 p.m., Joel Hestness wrote: > > src/mem/slicc/ast/PeekStatementAST.py, line 74 > > <http://reviews.gem5.org/r/2983/diff/1/?file=48301#file48301line74> > > > > This is a very subtle change to already complex code, so I'm not sure I > > understand how it works. The patch description indicates that bad things > > could happen if controller activity occurred between the read and write of > > a RMW, but it doesn't really say what the bad things are. This code appears > > to only block intermediate activity when the write part of the RMW gets to > > the head of a controller queue. > > > > Can you clarify what was causing the deadlock, and why this works to > > fix it? > > Nilay Vaish wrote: > Joel, consider the following operations happen on some address A: > > RMW Read Address A > Some other mem op on Address A > RMW Write Address A > > > Currently, on seeing a RMW Read, the sequencer asks the controller to > block > any operation on address A other than those coming from the mandatory > queue. > This is to prevent losing coherence permissions on address A. My guess > is that > RMW operations in x86 have to always succeed, unlink LL/SC which I > believe can > fail. I think the expectation here is that no operation other than RMW > Write > on the same address would be issued from the manadatory queue. This > implies > the sequencer expects that the processing core would not issue any > operation > other than RMW Write. But Lena must have observed an instruction fetch > request > to the same address between RMW Read and RMW write. > > > Now, I think the proposed fix is a character away from correctness. The > condition should be: > > if (m_is_blocking && > (m_block_map.count(in_msg_ptr->m_$address_field) == 1) && > ((m_block_map[in_msg_ptr->m_$address_field] != &$qcode) || > (!in_msg_ptr->bypassBlocked()))) { > > > I am actually not in favor of fixing the problem the way this patch does > it. > As I said above, the controller expects that the mandatory queue will not > issue any operation on the address other than RMW Write. I think we > should > change the sequencer and put in a check. > > Joel Hestness wrote: > Hi Nilay. We've discussed this patch pretty thoroughly on the gem5-gpu > list (here: https://groups.google.com/forum/#!topic/gem5-gpu-dev/RQv4SxIKv7g > ). Based on some testing, the patch in this review request probably doesn't > work correctly, even if you change the condition as you suggest. I've posted > a preliminary fix on the thread linked here, but I'm not sure that is a good > way to do it either. I'd be happy to hear your take on our gem5-gpu > discussion. > > Nilay Vaish wrote: > I saw your patch. It is essentially what I had in mind when I said that > the check should > go in the sequencer. But I would prefer the check is made at line 577 > just before insertRequest() > is called, rather than at 177 as is being done currently in your patch. > And mark isBlocked as const.
Posted here: http://reviews.gem5.org/r/3149/ - Joel ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: http://reviews.gem5.org/r/2983/#review6955 ----------------------------------------------------------- On July 21, 2015, 4:06 p.m., Lena Olson wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > http://reviews.gem5.org/r/2983/ > ----------------------------------------------------------- > > (Updated July 21, 2015, 4:06 p.m.) > > > Review request for Default. > > > Repository: gem5 > > > Description > ------- > > Changeset 10926:97dc170a3064 > --------------------------- > ruby: Fix protocol deadlock with RMW operations > > Previously, if there was a memory operation between the read part and the > write > part of a RMW transaction, a deadlock could occur. There was logic to allow > the > write part of the RMW transaction to bypass the blocked controller, but it > also > allowed other operations to bypass as well. Thus, only the write parts of a > RMW > operation can bypass. This patch adds a new function to the message class > which > controls whether a block can bypass the blocked controller. > > It is possible that an IFETCH is issued between the read and write parts of > the > RMW transaction. > > > Diffs > ----- > > src/mem/ruby/slicc_interface/Message.hh 5fe05690d03d > src/mem/ruby/slicc_interface/RubyRequest.hh 5fe05690d03d > src/mem/slicc/ast/PeekStatementAST.py 5fe05690d03d > > Diff: http://reviews.gem5.org/r/2983/diff/ > > > Testing > ------- > > Tested with the ruby random tester as well as running some multiprogrammed > workloads, including ones where this deadlock manifested. The multicore > tests I have are for x86, so if those with more knowledge of other ISAs could > comment that would be helpful. > > Many thanks to Jason Power for help in tracking this down and fixing it. > > > Thanks, > > Lena Olson > > _______________________________________________ gem5-dev mailing list [email protected] http://m5sim.org/mailman/listinfo/gem5-dev
