Re: [ofa-general] IPOB CM (NOSRQ) [PATCH V7] patch resubmit

Pradeep Satyanarayana Wed, 18 Jul 2007 17:54:50 -0700

Roland Dreier wrote:
> There's still some rather obvious problems with this patch.  It would
> really help if you would read over your patch again I think... anyway:
> 
>  > +#define CM_PACKET_SIZE (1ul << 16)
> 
> This duplicates IPOIB_CM_MTU I think... certainly it needs to be kept
> in sync with it somehow.


They are not quite the same. How about:
#define CM_PACKET_SIZE (ALIGN(IPOIB_CM_MTU, PAGE_SIZE))

This should keep the two in sync.

> 
>  > @@ -564,10 +574,9 @@ static inline void ipoib_cm_skb_too_long
>  >    dev_kfree_skb_any(skb);
>  >  }
>  >  
>  > -static inline void ipoib_cm_handle_rx_wc(struct net_device *dev, struct 
> ib_wc *wc)
>  > +void ipoib_cm_handle_rx_wc(struct net_device *dev, struct ib_wc *wc)
> 
> Why is this change here?  (This is in the CONFIG_INFINIBAND_IPOIB_CM=n
> part of ipoib.h)
> 
>  >  }
>  > -
>  >  #endif

Will do

> 
>  > -          ipoib_warn(priv, "post srq failed for buf %d (%d)\n", id, ret);
>  > +          ipoib_warn(priv, "post srq failed for buf %ld (%d)\n", id, ret);
> 
> extra noise here (and still wrong -- id might be long long on some
> architectures).

Correct, it should have been %lld

> 
>  > -          .event_handler = ipoib_cm_rx_event_handler,
> 
> why?  seems harmless to just leave this alone for all QPs even if an
> SRQ isn't attached.
> 

If memory serves me right, I tried that and ran into some inexplicable problems.
Maybe it was hang or no traffic went through -don't exactly recollect what it 
was.
After this change the problem went away.


> 
>  > +          spin_unlock_irq(&priv->lock);
>  > +          ipoib_warn(priv, "NOSRQ has reached the configurable limit "
>  > +                     "of either %d RC QPs or, max recv buf size of "
>  > +                     "0x%x MB\n", max_rc_qp, max_recv_buf);
> 
>  > +          ib_send_cm_rej(cm_id, IB_CM_REJ_NO_QP, NULL, 0, NULL, 0);
>  > +          ret = -EINVAL;
>  > +          goto err_alloc_and_post;
> 
> there's a bug here... you never undo the atomic_inc() of the number of
> RC QPs even though you exit without creating a new connection.

The atomic_dec() does happen, but that is in ipoib_cm_req_handler(). There are
several places where allocate_and_post_rbuf_nosrq() could return an error after 
the atomic_inc(). So, there is an atomic_dec() in the calling routine. On the
other hand I could move that to allocate_and_post_rbuf_nosrq() itself.
 
> 
>  > -  if (!p)
>  > +  if (!p) {
>  > +          printk(KERN_WARNING "Failed to allocate RX control block when "
>  > +                 "REQ arrived\n");
>  >            return -ENOMEM;
>  > +  }
> 
> more unrelated changes... (feel free to send these as separate
> patches)
> 

OK

>  >            kfree(p);
>  >    }
>  >  
>  > +
>  >    cancel_delayed_work(&priv->cm.stale_task);
>  >  }
> 
> extra noise in the patch
> 
>  > +          if (!priv->cm.srq) {
>  > +                  atomic_dec(&current_rc_qp);
>  > +          }
> 
> no need for { } here

OK

> 
>  > +  /* We increase the size of the CQ in the NOSRQ case to prevent CQ
>  > +   * overflow. Every new REQ creates a new RX QP and each QP has an
>  > +   * RX ring associated with it. Therefore we could have
>  > +   * NOSRQ_INDEX_TABLE_SIZE*ipoib_recvq_size + ipoib_sendq_size CQEs
>  > +   * in a CQ.
>  > +   */
>  > +  if (!priv->cm.srq)
>  > +          size += (NOSRQ_INDEX_TABLE_SIZE -1) * ipoib_recvq_size;
> 
> only need to do this if CM is enabled
> 
> space after - here please too.
> 
> that's just from a quick skim of the patch...
>

OK


Pradeep


_______________________________________________
general mailing list
[email protected]
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general

Re: [ofa-general] IPOB CM (NOSRQ) [PATCH V7] patch resubmit

Reply via email to