I see the first patch is in OFED-1.2-20070511-0600 now. I'll try it out.

Scott 

> -----Original Message-----
> From: Scott Weitzenkamp (sweitzen) 
> Sent: Wednesday, May 09, 2007 4:46 PM
> To: Michael S. Tsirkin; Scott Weitzenkamp (sweitzen)
> Cc: Yohad Dickman; Amit Krig; Tziporet Koren; 
> [EMAIL PROTECTED]; [email protected]; Roland Dreier
> Subject: RE: [PATCH] ipoib/cm: make stale task actually run 
> once in a while
> 
> I see a new patch, ipoib_correct_timers.patch, in 
> OFED-1.2-20070509-0600. Which patch should I try?
> 
> Scott 
> 
> > -----Original Message-----
> > From: Michael S. Tsirkin [mailto:[EMAIL PROTECTED] 
> > Sent: Monday, May 07, 2007 1:03 PM
> > To: Scott Weitzenkamp (sweitzen)
> > Cc: Yohad Dickman; Amit Krig; Tziporet Koren; 
> > [EMAIL PROTECTED]; [email protected]; Roland Dreier
> > Subject: [PATCH] ipoib/cm: make stale task actually run once 
> > in a while
> > 
> > In the presence of some active passive connections, the stale 
> > task would never run, because every 4 RX CQEs we repeat the 
> > queue_delayed_work call, which delays it by some 10 minutes.  
> > As a result, on a noisy system with failing ports, we slowly 
> > run out of resources, slowing connection setup down and 
> > eventually failing.
> > 
> > What we actually want to do is start the stale task when the first
> > passive connection is added, and rerun it every 10 min as long
> > as there are outstanding passive connections.
> > 
> > As a happy side effect, this removes some code from the RX data path.
> > 
> > Signed-off-by: Michael S. Tsirkin <[EMAIL PROTECTED]>
> > 
> > ---
> > 
> > Scott, I think this might address bugs 541 and 465: slow 
> > IPoIB CM HA failover
> > and eventual failing IPoIB HA. Could you test this please?
> > 
> > diff --git a/drivers/infiniband/ulp/ipoib/ipoib_cm.c b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
> > index 2b242a4..b77e8d7 100644
> > --- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c
> > +++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
> > @@ -258,10 +258,11 @@ static int ipoib_cm_req_handler(struct ib_cm_id *cm_id, struct ib_cm_event *even
> >     cm_id->context = p;
> >     p->jiffies = jiffies;
> >     spin_lock_irqsave(&priv->lock, flags);
> > +   if (list_empty(&priv->cm.passive_ids))
> > +           queue_delayed_work(ipoib_workqueue,
> > +                              &priv->cm.stale_task, IPOIB_CM_RX_DELAY);
> >     list_add(&p->list, &priv->cm.passive_ids);
> >     spin_unlock_irqrestore(&priv->lock, flags);
> > -   queue_delayed_work(ipoib_workqueue,
> > -                      &priv->cm.stale_task, IPOIB_CM_RX_DELAY);
> >     return 0;
> >  
> >  err_rep:
> > @@ -380,8 +381,6 @@ void ipoib_cm_handle_rx_wc(struct net_device *dev, struct ib_wc *wc)
> >                     if (!list_empty(&p->list))
> >                             list_move(&p->list, &priv->cm.passive_ids);
> >                     spin_unlock_irqrestore(&priv->lock, flags);
> > -                   queue_delayed_work(ipoib_workqueue,
> > -                                      &priv->cm.stale_task, IPOIB_CM_RX_DELAY);
> >             }
> >     }
> >  
> > @@ -1104,6 +1103,10 @@ static void ipoib_cm_stale_task(struct work_struct *work)
> >             kfree(p);
> >             spin_lock_irqsave(&priv->lock, flags);
> >     }
> > +
> > +   if (!list_empty(&priv->cm.passive_ids))
> > +           queue_delayed_work(ipoib_workqueue,
> > +                              &priv->cm.stale_task, IPOIB_CM_RX_DELAY);
> >     spin_unlock_irqrestore(&priv->lock, flags);
> >  }
> >  
> > -- 
> > MST
> > 
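The scheduling change described in the quoted patch can be illustrated with a small user-space toy model. This is only a sketch of the behavior as the patch description reports it, not the kernel workqueue API: `toy_timer`, `arm`, `count_runs`, and the `DELAY` value of 600 seconds are all hypothetical names and numbers chosen for illustration. In the "old" scheme every RX event pushes the deadline back; in the "patched" scheme the timer is armed once when the first passive connection appears and the task re-arms itself while connections remain.

```c
#include <stdbool.h>

#define DELAY 600 /* hypothetical stale-task delay in seconds (about 10 minutes) */

/* Toy stand-in for a delayed work item: fires once now >= deadline. */
struct toy_timer {
    bool armed;
    long deadline;
};

/* Arm (or push back) the timer so that it fires DELAY seconds from now. */
static void arm(struct toy_timer *t, long now)
{
    t->armed = true;
    t->deadline = now + DELAY;
}

/*
 * Count how many times the stale task runs in `duration` seconds.
 * An RX event arrives every `rx_period` seconds; `connections` is the
 * number of passive connections that stay outstanding throughout.
 * patched == false: the RX path re-arms the timer on every event.
 * patched == true:  armed on first connection, re-armed by the task itself.
 */
int count_runs(bool patched, long duration, long rx_period, int connections)
{
    struct toy_timer t = { false, 0 };
    int runs = 0;

    arm(&t, 0); /* first passive connection is added at time 0 */

    for (long now = 1; now <= duration; now++) {
        if (!patched && now % rx_period == 0)
            arm(&t, now); /* old scheme: deadline keeps moving away */

        if (t.armed && now >= t.deadline) {
            t.armed = false;
            runs++; /* the stale task runs here */
            if (patched && connections > 0)
                arm(&t, now); /* patched scheme: task reschedules itself */
        }
    }
    return runs;
}
```

With one RX event per second over an hour, `count_runs(false, 3600, 1, 1)` returns 0 in this model (the task is starved), while `count_runs(true, 3600, 1, 1)` returns 6 (one run every 600 seconds).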
_______________________________________________
general mailing list
[email protected]
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
