I guess the solution is to merge IPoIB NAPI to avoid overloading the
system with interrupts.  I'll fix up a few last things with my NAPI
patch and we can try to get it in shape to merge for 2.6.22.

 > diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ib.c b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
 > index f2aa923..97ea26f 100644
 > --- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c
 > +++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
 > @@ -301,6 +301,7 @@ void ipoib_ib_completion(struct ib_cq *c
 >              n = ib_poll_cq(cq, IPOIB_NUM_WC, priv->ibwc);
 >              for (i = 0; i < n; ++i)
 >                      ipoib_ib_handle_wc(dev, priv->ibwc + i);
 > +            cond_resched();

Obviously this is wrong, because ipoib_ib_completion() is not
necessarily called in process context, so cond_resched() may end up
being called somewhere it is not allowed to sleep.  (In fact ehca with
its scaling hack is probably the only driver that calls it from a
context where it's safe to reschedule.)

 >      } while (n == IPOIB_NUM_WC);
 >  }
 > 
 > However I still saw that BUG trace occurred on 3-4 cpus after several hrs. 

Right, because this patch is not really doing anything to reduce the
interrupt load.
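
To illustrate what I mean by reducing the interrupt load, here is a
rough userspace sketch of the NAPI-style idea.  This is not my actual
patch and not the kernel NAPI interface: struct fake_cq,
fake_poll_cq() and fake_napi_poll() are made-up names, and the
"completion queue" is just a counter.  The point is only that the
poll routine handles a bounded budget of completions per call and
leaves the interrupt off until the queue has really been drained.

```c
#define NUM_WC 4   /* batch size per poll call; stands in for IPOIB_NUM_WC */

/* Hypothetical model of a completion queue: just a count of pending entries. */
struct fake_cq {
	int pending;      /* completions waiting to be reaped */
	int irq_enabled;  /* 1 if the CQ may raise an interrupt */
};

/* Reap up to max entries and return how many were reaped (models ib_poll_cq()). */
static int fake_poll_cq(struct fake_cq *cq, int max)
{
	int n = cq->pending < max ? cq->pending : max;

	cq->pending -= n;
	return n;
}

/*
 * NAPI-style poll: handle at most budget completions, then return.
 * The interrupt is re-armed only if the queue was drained within the
 * budget; otherwise the caller is expected to poll again later.
 * Returns the number of completions handled.
 */
static int fake_napi_poll(struct fake_cq *cq, int budget)
{
	int done = 0;

	while (done < budget) {
		int want = budget - done < NUM_WC ? budget - done : NUM_WC;
		int n = fake_poll_cq(cq, want);

		done += n;
		if (n < want)
			break;  /* queue is empty */
	}

	if (done < budget)
		cq->irq_enabled = 1;  /* drained: go back to interrupt mode */

	return done;
}
```

With 10 completions pending and a budget of 6, the first call handles
6 and leaves the interrupt off; the second call handles the remaining
4 and re-arms it.  Under load the CQ is serviced by polling rather
than by one interrupt per burst of completions.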