remove_qp() can execute concurrently with a qib_lookup_qpn() on another CPU,
which in and of itself is fine, given the RCU locking.

The issue is that remove_qp() NULLs out the qp->next field, so a concurrent
qib_lookup_qpn() that is positioned on the qp being deleted sees the end of
the list and can fail to find a qp that follows it in the bucket. This is a
momentary issue: subsequent qib_lookup_qpn() calls will find those qp's,
since the search restarts from the bucket head. At scale, the issue might
cause dropped packets and unnecessary retransmissions.

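The fix just deletes the qp->next NULL assignment, so that remove_qp() no
longer hides qp's from qib_lookup_qpn(). The unlink itself is now published
with rcu_assign_pointer(), and qp->next is read under qpt_lock with
rcu_dereference_protected().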
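
The effect can be illustrated with a small user-space sketch. It is
single-threaded and is not the driver code: struct node, unlink_node(),
lookup_from() and reader_finds_tail() are hypothetical names, and RCU,
locking and grace periods are not modeled. A "reader" is simply parked on
the entry that is about to be removed and then resumes its walk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct qib_qp: only what the sketch needs. */
struct node {
	unsigned int qpn;
	struct node *next;
};

/*
 * Unlink 'victim' from the list at *head. When clear_next is true the
 * victim's next pointer is also set to NULL, mimicking the old behaviour.
 */
static void unlink_node(struct node **head, struct node *victim,
			bool clear_next)
{
	struct node **pp;

	assert(head != NULL && victim != NULL);
	for (pp = head; *pp != NULL; pp = &(*pp)->next) {
		if (*pp == victim) {
			*pp = victim->next;
			if (clear_next)
				victim->next = NULL;
			break;
		}
	}
}

/* Continue a lookup from 'pos', as a reader already inside the list would. */
static struct node *lookup_from(struct node *pos, unsigned int qpn)
{
	for (; pos != NULL; pos = pos->next)
		if (pos->qpn == qpn)
			return pos;
	return NULL;
}

/*
 * Build a -> b -> c, park a reader on b, remove b, then let the reader
 * resume its search for c. Returns true if the reader still finds c.
 */
static bool reader_finds_tail(bool clear_next)
{
	struct node c = { 3, NULL };
	struct node b = { 2, &c };
	struct node a = { 1, &b };
	struct node *head = &a;
	struct node *reader_pos = &b;	/* reader is mid-traversal on b */

	unlink_node(&head, &b, clear_next);
	return lookup_from(reader_pos, 3) == &c;
}
```

With clear_next set, the parked reader runs off the end of the list at the
removed entry and misses the entry behind it; with clear_next unset, the
removed entry still points into the list and the reader reaches the tail.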

Reviewed-by: Dean Luick <[email protected]>
Signed-off-by: Mike Marciniszyn <[email protected]>
---
 drivers/infiniband/hw/qib/qib_qp.c |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/hw/qib/qib_qp.c b/drivers/infiniband/hw/qib/qib_qp.c
index 3527509..a6a2cc2 100644
--- a/drivers/infiniband/hw/qib/qib_qp.c
+++ b/drivers/infiniband/hw/qib/qib_qp.c
@@ -268,8 +268,9 @@ static void remove_qp(struct qib_ibdev *dev, struct qib_qp *qp)
                                qpp = &q->next)
                        if (q == qp) {
                                atomic_dec(&qp->refcount);
-                               *qpp = qp->next;
-                               rcu_assign_pointer(qp->next, NULL);
+                               rcu_assign_pointer(*qpp,
+                                       rcu_dereference_protected(qp->next,
+                                        lockdep_is_held(&dev->qpt_lock)));
                                break;
                        }
        }

