Roland Dreier wrote:
 > ipoib_stop is called with rtnl_lock, and flushes ipoib_workqueue.
 > the flush operation might wait for mcast_join_task to finish, which
 > in turn might wait for rtnl_lock.

when did we introduce this bug?

http://www.openfabrics.org/git/?p=ofed_1_4/linux-2.6.git;a=commit;h=529024117628d0037644a20b4870c61d63cea2a1


 > +         /* Avoid deadlock with ipoib_stop */
 > +         while (!(ret = rtnl_trylock()) &&
 > +                test_bit(IPOIB_FLAG_ADMIN_UP, &priv->flags))
 > +                 yield();
 > +
 > +         if (ret) {
 > +                 dev_set_mtu(dev, min(priv->mcast_mtu, priv->admin_mtu));
 > +                 rtnl_unlock();
 > +         } else
 > +                 ipoib_dbg_mcast(priv, "ignoring mtu setup because device is down\n");

this is rather horrible looking... is there any way we can avoid the
loop on trylock?


We can just give up if we can't get the lock, as is done in drivers/net/cxgb3/cxgb3_main.c. Other solutions might get messy, because we don't control when the lock is actually taken, so we can't set any flags and such. The alternatives would be: flush the queue sometime later, or set the mtu sometime later on another workqueue.
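
To make the "just give up" option concrete, here is a rough sketch. It is a userspace analogy only: a pthread mutex stands in for rtnl_lock, and fake_rtnl, struct fake_dev and try_update_mtu() are made-up names for illustration, not code from ipoib or cxgb3.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for rtnl_lock (assumption, userspace only). */
static pthread_mutex_t fake_rtnl = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical stand-in for the few fields the patch touches. */
struct fake_dev {
	int mtu;
	int mcast_mtu;
	int admin_mtu;
};

/*
 * Try the lock exactly once. If someone else holds it (for example the
 * path that is flushing the workqueue), skip the mtu update instead of
 * spinning, so the task can finish and the flush can complete.
 * Returns true if the mtu was updated, false if the update was skipped.
 */
static bool try_update_mtu(struct fake_dev *dev)
{
	if (pthread_mutex_trylock(&fake_rtnl) != 0)
		return false;	/* lock busy: give up, no loop */

	dev->mtu = dev->mcast_mtu < dev->admin_mtu ?
		   dev->mcast_mtu : dev->admin_mtu;
	pthread_mutex_unlock(&fake_rtnl);
	return true;
}
```

The obvious cost is that a skipped update is simply lost unless something retries it later, which is where the "another workqueue" idea would come in.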

 - R.

--
--Yossi
_______________________________________________
general mailing list
[email protected]
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
