I see the first patch is in OFED-1.2-20070511-0600 now; I'll try it out.

Scott

> -----Original Message-----
> From: Scott Weitzenkamp (sweitzen)
> Sent: Wednesday, May 09, 2007 4:46 PM
> To: Michael S. Tsirkin; Scott Weitzenkamp (sweitzen)
> Cc: Yohad Dickman; Amit Krig; Tziporet Koren;
> [EMAIL PROTECTED]; [email protected]; Roland Dreier
> Subject: RE: [PATCH] ipoib/cm: make stale task actually run once in a while
>
> I see a new patch ipoib_correct_timers.patch in
> OFED-1.2-20070509-0600, which patch should I try?
>
> Scott
>
> > -----Original Message-----
> > From: Michael S. Tsirkin [mailto:[EMAIL PROTECTED]
> > Sent: Monday, May 07, 2007 1:03 PM
> > To: Scott Weitzenkamp (sweitzen)
> > Cc: Yohad Dickman; Amit Krig; Tziporet Koren;
> > [EMAIL PROTECTED]; [email protected]; Roland Dreier
> > Subject: [PATCH] ipoib/cm: make stale task actually run once in a while
> >
> > In the presence of some active passive connections, stale task would never run,
> > since each 4 RX CQEs we repeat queue_delayed_work calls which delays it for some
> > 10 minutes. As a result, on a noisy system with failing ports, we slowly run
> > out of resources - slowing connection setup down and eventually failing.
> >
> > What we actually want to do is - start stale task when a first
> > passive connection is added, rerun it every 10 min as long
> > as there are outstanding passive connections.
> >
> > As a happy side effect, this removes some code from RX data path.
> >
> > Signed-off-by: Michael S. Tsirkin <[EMAIL PROTECTED]>
> >
> > ---
> >
> > Scott, I think this might address bugs 541 and 465: slow IPoIB CM HA failover
> > and eventual failing IPoIB HA. Could you test this please?
> >
> > diff --git a/drivers/infiniband/ulp/ipoib/ipoib_cm.c b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
> > index 2b242a4..b77e8d7 100644
> > --- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c
> > +++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
> > @@ -258,10 +258,11 @@ static int ipoib_cm_req_handler(struct ib_cm_id *cm_id, struct ib_cm_event *even
> >  	cm_id->context = p;
> >  	p->jiffies = jiffies;
> >  	spin_lock_irqsave(&priv->lock, flags);
> > +	if (list_empty(&priv->cm.passive_ids))
> > +		queue_delayed_work(ipoib_workqueue,
> > +				   &priv->cm.stale_task, IPOIB_CM_RX_DELAY);
> >  	list_add(&p->list, &priv->cm.passive_ids);
> >  	spin_unlock_irqrestore(&priv->lock, flags);
> > -	queue_delayed_work(ipoib_workqueue,
> > -			   &priv->cm.stale_task, IPOIB_CM_RX_DELAY);
> >  	return 0;
> >
> >  err_rep:
> > @@ -380,8 +381,6 @@ void ipoib_cm_handle_rx_wc(struct net_device *dev, struct ib_wc *wc)
> >  			if (!list_empty(&p->list))
> >  				list_move(&p->list, &priv->cm.passive_ids);
> >  			spin_unlock_irqrestore(&priv->lock, flags);
> > -			queue_delayed_work(ipoib_workqueue,
> > -					   &priv->cm.stale_task, IPOIB_CM_RX_DELAY);
> >  		}
> >  	}
> >
> > @@ -1104,6 +1103,10 @@ static void ipoib_cm_stale_task(struct work_struct *work)
> >  		kfree(p);
> >  		spin_lock_irqsave(&priv->lock, flags);
> >  	}
> > +
> > +	if (!list_empty(&priv->cm.passive_ids))
> > +		queue_delayed_work(ipoib_workqueue,
> > +				   &priv->cm.stale_task, IPOIB_CM_RX_DELAY);
> >  	spin_unlock_irqrestore(&priv->lock, flags);
> >  }
> >
> > --
> > MST
