Ralph Campbell wrote:
I was looking at the code for multicast.c and noticed that
ib_sa_join_multicast() calls queue_join() which puts the
request at the front of the group->pending_list.  If this
is a second request, it seems like it would interfere with
process_join_error() since group->last_join won't point
to the member at the head of the pending_list. The sequence
would thus be:

Thanks. This does indeed appear to be a bug, which your patch should fix. However, to clarify, the problem is really:

1. ib_sa_join_multicast()
   // puts member1 on head of pending_list and starts work thread
2. mcast_work_handler()
   // calls send_join() which sets group->last_join to member1
3. ib_sa_join_multicast()
   // puts member2 on head of pending_list
4. IB_EVENT_PORT_ERR event calls mcast_groups_lost()
   // sets group->state to MCAST_ERROR

replace 4 above with:

4. Join operation fails with non-zero status. I.e. the problem is related to a failure response from the SA, perhaps due to an invalid setting for the multicast group, and not related to a port event.

5. join_handler() is called with error status
6. process_join_error() fails to process member1 since
   it doesn't match the first entry in the group->pending_list.

The impact is that the failed join request gets tossed. The request at the head of the pending_list now gets processed. After it completes, the original request (from step 1) ends up trying again. So, everything should eventually work out as expected.

Roland, can we please queue this for 2.6.24? Would you like it resubmitted with an updated patch description?

- Sean

Signed-off-by: Ralph Campbell <[EMAIL PROTECTED]>

diff --git a/drivers/infiniband/core/multicast.c 
b/drivers/infiniband/core/multicast.c
index 15b4c4d..1bc1fe6 100644
--- a/drivers/infiniband/core/multicast.c
+++ b/drivers/infiniband/core/multicast.c
@@ -196,7 +196,7 @@ static void queue_join(struct mcast_member *member)
        unsigned long flags;
spin_lock_irqsave(&group->lock, flags);
-       list_add(&member->list, &group->pending_list);
+       list_add_tail(&member->list, &group->pending_list);
        if (group->state == MCAST_IDLE) {
                group->state = MCAST_BUSY;
                atomic_inc(&group->refcount);


_______________________________________________
general mailing list
[email protected]
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general

_______________________________________________
general mailing list
[email protected]
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general

Reply via email to