On Sat, 2013-06-01 at 09:19 -0400, Bob Copeland wrote:
> As of "cfg80211/mac80211: use cfg80211 wdev mutex in mac80211",
> mac80211 expects to be able to take the wdev mutex around sdata
> accesses.  This causes a recursive deadlock since
> __cfg80211_leave_mesh() already holds the wdev mutex.  Removing
> the sdata_lock() calls in ieee80211_stop_mesh() alone won't fix
> this, as the cancel_work_sync() in mesh runs the iface work,
> and various work items also want to take the wdev lock (not
> just in mesh, see e.g.  ieee80211_sta_rx_queued_mgmt().)

Ouch. My mistake, clearly.

> diff --git a/net/wireless/mesh.c b/net/wireless/mesh.c
> index 5dfb289..6344a81 100644
> --- a/net/wireless/mesh.c
> +++ b/net/wireless/mesh.c
> @@ -250,7 +250,9 @@ static int __cfg80211_leave_mesh(struct cfg80211_registered_device *rdev,
>       if (!wdev->mesh_id_len)
>               return -ENOTCONN;
>  
> +     wdev_unlock(wdev);
>       err = rdev_leave_mesh(rdev, dev);
> +     wdev_lock(wdev);

I'm not really happy with this, like you said, and it's also
incomplete because the same can happen in an error path in mac80211 in
rdev_join_mesh().

I also don't really want to think about races with mesh_id_len,
particularly in the join.

However, I think that in mac80211 we can instead just remove the locking
and the cancel_work_sync() since the latter will happen whenever the
interface goes down, in a different code path outside of this. Just need
to make sure the work can cope with running while the interface is not
joined to a mesh, but I guess that's not going to be a big problem.
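
To make the failure mode concrete, here is a rough userspace sketch of the
recursion, not kernel code. Everything named fake_* or demo_* is made up for
illustration, and a pthread error-checking mutex stands in for the wdev mutex
so that the second lock attempt returns EDEADLK instead of hanging the way a
plain kernel mutex would. The second variant assumes the callback simply
relies on the caller's lock, which is roughly the direction suggested above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for a wdev; only the mutex matters here. */
struct fake_wdev {
	pthread_mutex_t mtx;
};

static void fake_wdev_init(struct fake_wdev *w)
{
	pthread_mutexattr_t attr;
	int rc;

	rc = pthread_mutexattr_init(&attr);
	assert(rc == 0);
	/* Error-checking type: relocking by the owner fails with EDEADLK. */
	rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	assert(rc == 0);
	rc = pthread_mutex_init(&w->mtx, &attr);
	assert(rc == 0);
	pthread_mutexattr_destroy(&attr);
}

/* Plays the role of a callback that takes the lock itself. */
static int fake_stop_mesh_locking(struct fake_wdev *w)
{
	int rc = pthread_mutex_lock(&w->mtx);

	if (rc != 0)
		return rc;	/* EDEADLK if the caller already holds it */
	pthread_mutex_unlock(&w->mtx);
	return 0;
}

/* Plays the role of a callback that relies on the caller's lock. */
static int fake_stop_mesh_nolock(struct fake_wdev *w)
{
	(void)w;
	return 0;
}

/* Plays the role of a caller that holds the lock across the callback. */
static int fake_leave_mesh(struct fake_wdev *w,
			   int (*stop)(struct fake_wdev *))
{
	int rc;

	pthread_mutex_lock(&w->mtx);
	rc = stop(w);
	pthread_mutex_unlock(&w->mtx);
	return rc;
}

/* Returns EDEADLK: the callback tries to retake the held mutex. */
int demo_recursive_lock(void)
{
	struct fake_wdev w;
	int rc;

	fake_wdev_init(&w);
	rc = fake_leave_mesh(&w, fake_stop_mesh_locking);
	pthread_mutex_destroy(&w.mtx);
	return rc;
}

/* Returns 0: the callback no longer locks, so nothing recurses. */
int demo_lock_free_callback(void)
{
	struct fake_wdev w;
	int rc;

	fake_wdev_init(&w);
	rc = fake_leave_mesh(&w, fake_stop_mesh_nolock);
	pthread_mutex_destroy(&w.mtx);
	return rc;
}
```

The sketch leaves out the cancel_work_sync() side entirely; it only shows
why dropping the inner locking avoids the recursion without having to
unlock and relock around the callback.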

johannes
