Sean> I'm seeing an issue trying to recover from an error in
    Sean> userspace.  Basically, I allocate a PD, a CQ, and a QP, then
    Sean> destroy the QP because of an unrelated error.  The destroy
    Sean> call takes several seconds to complete, and appears to be
    Sean> hung in mthca_cq_clean: line 551.  Stepping through the
    Sean> while loop there, I'm not falling into the if or else if
    Sean> cases.  The call does eventually complete.

I think I see the problem.  Does this patch fix it for you?
(Basically, you're benchmarking how fast your CPU can go through
the loop 4 billion times ;)
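
To spell out the arithmetic, here is a minimal standalone sketch (not
libmthca code).  It assumes cons_index is an unsigned 32-bit field, which
the casts in the patch suggest, and that the CQ is empty with both indices
at 0.  The function names are made up for illustration, and the new test
is written as a cast of the unsigned difference, a variant of the
expression in the patch rather than a copy of it.

```c
#include <assert.h>
#include <stdint.h>

/* Pre-patch loop test.  prod_index is a signed int, but comparing it
 * with an unsigned cons_index converts it to unsigned first; the cast
 * below only makes that implicit conversion visible. */
int old_keep_sweeping(int prod_index, uint32_t cons_index)
{
        return (uint32_t) prod_index > cons_index;
}

/* Wraparound-safe test in the spirit of the patch: subtract modulo
 * 2^32, then look at the sign of the result. */
int new_keep_sweeping(uint32_t prod_index, uint32_t cons_index)
{
        return (int32_t) (prod_index - cons_index) >= 0;
}

void self_check(void)
{
        /* Empty CQ at index 0: the loop's --prod_index yields -1. */
        assert(old_keep_sweeping(-1, 0) == 1);        /* -1 reads as 0xffffffff */
        assert(new_keep_sweeping(0u - 1u, 0) == 0);   /* difference is -1: skip */

        /* Indices that have wrapped past 2^32: the old test stops too
         * early, the new one still sees entries to sweep. */
        assert(old_keep_sweeping(1, 0xfffffffeu) == 0);
        assert(new_keep_sweeping(1, 0xfffffffeu) == 1);
}
```

With the old test the sweep starts from 0xffffffff and has to count all
the way down to cons_index, which is where the several seconds go.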

 - R.

--- libmthca/src/cq.c   (revision 3989)
+++ libmthca/src/cq.c   (working copy)
@@ -524,7 +524,7 @@ void mthca_arbel_cq_event(struct ibv_cq 
 void mthca_cq_clean(struct mthca_cq *cq, uint32_t qpn, struct mthca_srq *srq)
 {
        struct mthca_cqe *cqe;
-       int prod_index;
+       uint32_t prod_index;
        int nfreed = 0;
 
        pthread_spin_lock(&cq->lock);
@@ -546,7 +546,7 @@ void mthca_cq_clean(struct mthca_cq *cq,
         * Now sweep backwards through the CQ, removing CQ entries
         * that match our QP by copying older entries on top of them.
         */
-       while (--prod_index > cq->cons_index) {
+       while ((int) --prod_index - (int) cq->cons_index >= 0) {
                cqe = get_cqe(cq, prod_index & cq->ibv_cq.cqe);
                if (cqe->my_qpn == htonl(qpn)) {
                        if (srq)