I guess the solution is to merge IPoIB NAPI to avoid overloading the system with interrupts. I'll fix up a few last things with my NAPI patch and we can try to get it in shape to merge for 2.6.22.
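For anyone who has not looked at NAPI: the idea is that the completion interrupt only masks itself and hands the work to a poll routine, which processes a bounded number of completions per call and re-arms the interrupt only once the queue is empty. Below is a rough userspace sketch of that pattern. It is not my NAPI patch and it calls no kernel API; every name in it (mock_cq, mock_poll_cq, mock_napi_poll, mock_completion_irq, BATCH) is made up for illustration.

```c
/* Userspace sketch of NAPI-style budgeted polling.
 * All names are hypothetical; this is not the IPoIB patch. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define BATCH 4  /* stand-in for IPOIB_NUM_WC; the value is arbitrary */

struct mock_cq {
    size_t pending;      /* completions waiting in the queue */
    bool   irq_enabled;  /* whether the "completion interrupt" is armed */
    size_t handled;      /* completions processed so far */
};

/* Pull up to max completions off the mock queue; return how many we got. */
static size_t mock_poll_cq(struct mock_cq *cq, size_t max)
{
    size_t n = cq->pending < max ? cq->pending : max;

    cq->pending -= n;
    return n;
}

/* "Interrupt handler": masks the interrupt and leaves the work to the poll. */
static void mock_completion_irq(struct mock_cq *cq)
{
    cq->irq_enabled = false;
}

/* Process at most budget completions. The interrupt is re-armed only if the
 * queue drains before the budget runs out; otherwise it stays masked and the
 * caller is expected to poll again later. */
static size_t mock_napi_poll(struct mock_cq *cq, size_t budget)
{
    size_t done = 0;

    assert(budget > 0);
    while (done < budget) {
        size_t want = budget - done < BATCH ? budget - done : BATCH;
        size_t n = mock_poll_cq(cq, want);

        cq->handled += n;
        done += n;
        if (n < want) {              /* queue drained */
            cq->irq_enabled = true;  /* safe to take interrupts again */
            break;
        }
    }
    return done;
}
```

With 10 completions pending and a budget of 8, the first poll handles 8 and leaves the interrupt masked; the second handles the remaining 2 and re-arms it. So a burst costs one interrupt rather than one per batch.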

> diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ib.c b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
> index f2aa923..97ea26f 100644
> --- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c
> +++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
> @@ -301,6 +301,7 @@ void ipoib_ib_completion(struct ib_cq *c
>                 n = ib_poll_cq(cq, IPOIB_NUM_WC, priv->ibwc);
>                 for (i = 0; i < n; ++i)
>                         ipoib_ib_handle_wc(dev, priv->ibwc + i);
> +               cond_resched();

obviously this is wrong because ipoib_ib_completion() is not necessarily called in process context (in fact the ehca scaling hack is probably the only driver that does call it when it's safe to reschedule).

>         } while (n == IPOIB_NUM_WC);
>  }
>
> However I still saw that BUG trace occurred on 3-4 cpus after several hrs.

Right, because this patch is not really doing anything to reduce the interrupt load.
