On Thu, 2007-11-29 at 16:43 -0800, Sean Hefty wrote:
> > In cancel_mads, elements from two different lists are added to the 
> > cancel_list:  wait_list and local_list.  Subsequent processing of the 
> > cancel_list treats all elements as struct ib_mad_send_wr_private, and 
> > uses the send_buf field of that structure.  But it appears to me that 
> > the items from local_list are actually of type struct 
> > ib_mad_local_private, and hence the reference to send_buf for these 
> > elements is incorrect.  Can you help me understand how this works?
> 
> I was looking at the local_list handling in cancel_mads() and the rest 
> of mad code myself.  Hal knows this part of the code better than I do, 
> maybe he can look here and see if there's a definite problem.  This 
> looks like the cause of the bug Dotan just reported.

Sorry for the slow response. I've been consumed with other matters for
the last couple of days.

I started investigating this and found that the emptying of local_list
in cancel_mads() was first introduced over 2 years ago by the following
commit:

commit 2c153b934dca08d58e0aafde18a182e0891aa201
Author: Hal Rosenstock <[EMAIL PROTECTED]>
Date:   Wed Jul 27 11:45:31 2005 -0700

    [PATCH] IB: Eliminate MAD cache leak associated with local completions
    
    Eliminate MAD cache leak associated with local completions.  Also, when
    canceling MAD, empty local completion list as well.
    
    Signed-off-by: Hal Rosenstock <[EMAIL PROTECTED]>
    Cc: Roland Dreier <[EMAIL PROTECTED]>
    Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
    Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]>

More later...

-- Hal

> - Sean