> On Aug. 14, 2015, 8:03 p.m., Joel Hestness wrote:
> > src/mem/slicc/ast/PeekStatementAST.py, line 74
> > <http://reviews.gem5.org/r/2983/diff/1/?file=48301#file48301line74>
> >
> >     This is a very subtle change to already complex code, so I'm not sure I 
> > understand how it works. The patch description indicates that bad things 
> > could happen if controller activity occurred between the read and write of 
> > a RMW, but it doesn't really say what the bad things are. This code appears 
> > to only block intermediate activity when the write part of the RMW gets to 
> > the head of a controller queue.
> >     
> >     Can you clarify what was causing the deadlock, and why this works to 
> > fix it?
> 
> Nilay Vaish wrote:
>     Joel, consider the following operations happen on some address A:
>     
>     RMW Read Address A
>     Some other mem op on Address A
>     RMW Write Address A
>     
>     
>     Currently, on seeing a RMW Read, the sequencer asks the controller to 
> block
>     any operation on address A other than those coming from the mandatory 
> queue.
>     This is to prevent losing coherence permissions on address A.  My guess 
> is that
>     RMW operations in x86 have to always succeed, unlink LL/SC which I 
> believe can
>     fail.  I think the expectation here is that no operation other than RMW 
> Write
>     on the same address would be issued from the manadatory queue.  This 
> implies 
>     the sequencer expects that the processing core would not issue any 
> operation
>     other than RMW Write.  But Lena must have observed an instruction fetch 
> request
>     to the same address between RMW Read and RMW write.
>     
>     
>     Now, I think the proposed fix is a character away from correctness. The 
> condition should be:
>     
>          if (m_is_blocking &&
>             (m_block_map.count(in_msg_ptr->m_$address_field) == 1) &&
>             ((m_block_map[in_msg_ptr->m_$address_field] != &$qcode) ||
>             (!in_msg_ptr->bypassBlocked()))) {
>             
>     
>     I am actually not in favor of fixing the problem the way this patch does 
> it.
>     As I said above, the controller expects that the mandatory queue will not
>     issue any operation on the address other than RMW Write.  I think we 
> should
>     change the sequencer and put in a check.

Hi Nilay. We've discussed this patch pretty thoroughly on the gem5-gpu list 
(here: https://groups.google.com/forum/#!topic/gem5-gpu-dev/RQv4SxIKv7g ). 
Based on some testing, the patch in this review request probably doesn't work 
correctly, even if you change the condition as you suggest. I've posted a 
preliminary fix on the thread linked here, but I'm not sure that is a good way 
to do it either. I'd be happy to hear your take on our gem5-gpu discussion.


- Joel


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2983/#review6955
-----------------------------------------------------------


On July 21, 2015, 4:06 p.m., Lena Olson wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> http://reviews.gem5.org/r/2983/
> -----------------------------------------------------------
> 
> (Updated July 21, 2015, 4:06 p.m.)
> 
> 
> Review request for Default.
> 
> 
> Repository: gem5
> 
> 
> Description
> -------
> 
> Changeset 10926:97dc170a3064
> ---------------------------
> ruby: Fix protocol deadlock with RMW operations
> 
> Previously, if there was a memory operation between the read part and the 
> write
> part of a RMW transaction, a deadlock could occur. There was logic to allow 
> the
> write part of the RMW transaction to bypass the blocked controller, but it 
> also
> allowed other operations to bypass as well. Thus, only the write parts of a 
> RMW
> operation can bypass. This patch adds a new function to the message class 
> which
> controls whether a block can bypass the blocked controller.
> 
> It is possible that an IFETCH is issued between the read and write parts of 
> the
> RMW transaction.
> 
> 
> Diffs
> -----
> 
>   src/mem/ruby/slicc_interface/Message.hh 5fe05690d03d 
>   src/mem/ruby/slicc_interface/RubyRequest.hh 5fe05690d03d 
>   src/mem/slicc/ast/PeekStatementAST.py 5fe05690d03d 
> 
> Diff: http://reviews.gem5.org/r/2983/diff/
> 
> 
> Testing
> -------
> 
> Tested with the ruby random tester as well as running some multiprogrammed 
> workloads, including ones where this deadlock manifested.  The multicore 
> tests I have are for x86, so if those with more knowledge of other ISAs could 
> comment that would be helpful.
> 
> Many thanks to Jason Power for help in tracking this down and fixing it.
> 
> 
> Thanks,
> 
> Lena Olson
> 
>

_______________________________________________
gem5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/gem5-dev

Reply via email to