Quoting r. Roland Dreier <[EMAIL PROTECTED]>:
> Subject: Re: race in mthca_cq.c?
> 
>     Michael> Not in the driver I have: mthca_array_clear is at line
>     Michael> 1351, mthca_cq_clean at line 1372.  Isn't
>     Michael> mthca_array_clear freeing the slot in QP table?
> 
> Nope, the bitmap slot isn't freed until mthca_free().

Oh. Right. I see it now.

>     Michael> But there might be more EQEs for this CQN outstanding in
>     Michael> the EQ which we have not seen yet.
> 
> Now that you mention it, that could be a real problem I guess.
> synchronize_irq() isn't enough because the interrupt handler might not
> have even started yet.
> 
> But on the other hand a CQ can't be destroyed until after all
> associated QPs have been destroyed.  So could we really miss EQEs for
> that long?

Yes, I think there might be spurious EQEs, and they might be delayed
in HW for a long time. Destroying QPs does not flush completion events out.

So just this bit?

--

Check that the EQE is not for a stale CQ number.  Since the high bits of the
CQ number are allocated round-robin, we can be reasonably sure the CQ number
is different even for CQs which share a slot in the CQ table.

Signed-off-by: Michael S. Tsirkin <[EMAIL PROTECTED]>


--- openib/drivers/infiniband/hw/mthca/mthca_cq.c       2006-05-09 21:07:28.623383000 +0300
+++ /mswg/work/mst/tmp/infiniband1/hw/mthca/mthca_cq.c  2006-06-08 23:46:52.404499000 +0300
@@ -217,9 +217,9 @@ void mthca_cq_completion(struct mthca_de
 {
        struct mthca_cq *cq;
 
        cq = mthca_array_get(&dev->cq_table.cq, cqn & (dev->limits.num_cqs - 1));
 
-       if (!cq) {
+       if (!cq || cq->cqn != cqn) {
                mthca_warn(dev, "Completion event for bogus CQ %08x\n", cqn);
                return;
        }
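
For illustration only, here is a rough user-space model of what the extra
comparison buys. It is not driver code: MODEL_NUM_CQS, struct model_cq,
model_cq_table and model_cq_lookup() are hypothetical names made up for this
sketch, standing in for dev->limits.num_cqs, struct mthca_cq, the CQ table and
the lookup in mthca_cq_completion(). It assumes the table size is a power of
two, so masking the CQ number gives the slot index.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table size for the model; must be a power of two. */
#define MODEL_NUM_CQS 4u

/* Stand-in for struct mthca_cq: only the full CQ number matters here. */
struct model_cq {
	uint32_t cqn;	/* round-robin high bits | slot index */
};

/* Stand-in for the CQ table: one pointer per slot, NULL when free. */
struct model_cq *model_cq_table[MODEL_NUM_CQS];

/*
 * Look up the CQ an event should be delivered to.
 * Returns NULL when the slot is empty, or when the slot is now owned
 * by a CQ whose full number differs from the one in the event, i.e.
 * the event was generated for an earlier CQ that used the same slot.
 */
struct model_cq *model_cq_lookup(uint32_t cqn)
{
	struct model_cq *cq = model_cq_table[cqn & (MODEL_NUM_CQS - 1)];

	if (!cq || cq->cqn != cqn)
		return NULL;

	return cq;
}
```

With CQ 0x02 in slot 2, an event for 0x02 is delivered. If that CQ is
destroyed and the slot is reused by a CQ numbered MODEL_NUM_CQS | 0x02, a
late event for 0x02 is rejected instead of being handed to the new CQ.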

-- 
MST
