On Thu, Sep 05, 2013 at 10:14:51AM -0500, Steve Wise wrote:
> Roland, what do you think?
> 
> As I've said, I think we should go ahead with using the rtnl lock in
> the core.  Is there a complete patch available for review?  looks
> like the original was a partial fix.

I guess I should realize that when no one jumps at fixing my issues
for me that they probably aren't simple to fix.  The solution that
Cong proposed was to acquire rtnl_lock() before acquiring the
infiniband device_mutex, and his partial patch did that in
ib_register_client().  The problem is that you would also need to do
that in ib_unregister_client(), ib_register_device(), and
ib_unregister_device(), and that brings us back to the original
problem which was that cxgb3 was holding the rtnl_lock() when it
called ib_register_device().  Thus with the proposed fix I believe
cxgb3 would already be holding the rtnl_lock() and then call
ib_register_device() which would try to acquire the rtnl_lock() again
and deadlock for a different reason.

Actually how does this currently work?  ib_register_device() calls
client->add() for each client in the list which should call
ipoib_add_one() which calls register_netdev().  Shouldn't that also
deadlock in the cxgb3 case?

Also while digging through this I think I see another bug which is
that ipoib_dev_cleanup() can be called from ipoib_add_port() but in
the current code ipoib_add_port() is not holding the rtnl_lock() which
appears to be a requirement of ipoib_dev_cleanup().

Sigh...  I'm going to stop looking at this for now and hopefully
someone can propose a better solution to this issue.

Thanks,
Shawn
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to