On Mon, Dec 19, 2011 at 11:19 PM, Eli Cohen <[email protected]> wrote:
> I considered this but that means that you serialize attach/detach
> operations at ib core. Using a spinlock to protect the list allows
> more concurrency. After all, we hit this bug since concurrency of such
> operations occur in real life applications.

Looking closer at it, I don't see how we can avoid serializing multicast
operations.

At least, I don't see how your proposed patch can work for all the crazy
things userspace might do.

For example suppose two threads join the same QP to the same MCG at
the same time.  In that case it might happen that both threads check the
list and don't find the group there, since you drop the lock before the
actual low-level verbs attach (and you don't and can't hold the lock across
that sleeping call).

In that case both calls to attach will succeed (underlying verb is
idempotent) and the membership info will be added to the list twice.

Then all sorts of problems ensue, eg a detach will succeed but only
remove one list entry, and so a subsequent re-attach will silently do
nothing.

Also I'm sure racing attach/detach can get into trouble too, eg
detach succeeds but list entry ends up still on the list.

I don't see any obvious way to close all these holes except to
make sure that adding/removing a list entry is done atomically
along with the actual attach/detach operation -- ie hold some
sort of sleepable lock (like an rwsem) across checking the list,
doing the attach/detach and doing the list add/remove.

 - R.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to