[ewg] ib_rdma test over chelsio

2009-07-22 Thread Shirley Ma
I am having trouble running the ib_rdma test over Chelsio. The client
reports "pp_client_connect: unexpected CM event 1" whether or not a
server is running. The FW is 7.4.0. I tried stack 1.2.5 and ofed-1.4.1;
neither seems to work for me. Does anybody have any idea?


Thanks
Shirley


___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg


Re: [ewg] [PATCH] ipoib: avoid enabling napi when it's already enabled

2008-10-28 Thread Shirley Ma

On Tue, 2008-10-28 at 10:47 +0200, Vladimir Sokolovsky wrote:
 Yossi Etigin wrote:
  ipoib_open() may be called from ipoib_pkey_poll() after napi has
  already been enabled, and try to enable it again. This triggers the
  BUG_ON test in napi_enable().
  
  Signed-off-by: Yossi Etigin [EMAIL PROTECTED]
 
 Applied,
 
 Regards,
 Vladimir

The same fix should be submitted to the mainline kernel. I checked the
code, and the same problem exists there.

Thanks
Shirley
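
For readers following the thread, the double-enable failure can be
modelled in user space. This is only a sketch under assumptions:
struct model_napi, model_napi_enable() and the two model_open_*()
functions are hypothetical stand-ins, not the kernel API, and the guard
shown is one way to make a repeated open harmless, not necessarily what
the applied patch does.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the NAPI state of a net device. */
struct model_napi {
    bool enabled;       /* true once polling has been enabled */
    int  double_enable; /* enable calls that would have hit BUG_ON */
};

/* Models napi_enable(): enabling an already enabled instance is a bug. */
static void model_napi_enable(struct model_napi *n)
{
    if (n->enabled) {
        n->double_enable++; /* the kernel would BUG here */
        return;
    }
    n->enabled = true;
}

/* Unguarded open path: enables NAPI every time it is called. */
static void model_open_unguarded(struct model_napi *n)
{
    model_napi_enable(n);
}

/* Guarded open path: skips the enable when NAPI is already running. */
static void model_open_guarded(struct model_napi *n)
{
    if (!n->enabled)
        model_napi_enable(n);
}
```

Calling the unguarded variant twice records one double enable; the
guarded variant can be called any number of times.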



Re: [ewg] [PATCH] ipoib: avoid enabling napi when it's already enabled

2008-10-27 Thread Shirley Ma
We found the same problem during child interface testing for ofed-1.4-rc3.
Please help get it fixed in the ofed-1.4 daily build.

Thanks
Shirley



[ewg] Re: [ofa-general] [PATCH v2] IB/ipoib: copy small SKBs in CM mode

2008-05-29 Thread Shirley Ma




Hello Eli,

  In this case, how many tx dropped packets does ifconfig report? Should
  the ifconfig tx dropped count plus the successfully transmitted count
  be close to the number of packets netperf sent?
 That's right.

I am looking at ipoib_cm_handle_tx_wc(): the tx drop counter is not
incremented in this situation, so the tx packet count should be close to
the number of packets netperf sent.

void ipoib_cm_handle_tx_wc(struct net_device *dev, struct ib_wc *wc)
{
        ...

        tx_req = &tx->tx_ring[wr_id];

        ib_dma_unmap_single(priv->ca, tx_req->mapping[0], tx_req->skb->len,
                            DMA_TO_DEVICE);

        /* FIXME: is this right? Shouldn't we only increment on success? */
        ++dev->stats.tx_packets;
        dev->stats.tx_bytes += tx_req->skb->len;
        ...
}

  Any TCP STREAM test results to share here?
 TCP won't demonstrate the problem since it uses Nagle's algorithm to
 aggregate data into full sized packets.

So when this RNR retry was hit, the error status returned was a flush
error, so the packets were silently dropped instead of raising a failed
CM send event and clearing the interface up flag?

Please correct me if I am wrong.
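
The FIXME in the snippet above asks whether the counters should only be
bumped on success. A minimal user-space sketch of the two accounting
policies, assuming hypothetical model_stats and model_wc types (not the
kernel structures), could look like this:

```c
#include <stdbool.h>

/* Hypothetical, simplified copies of the interface counters. */
struct model_stats {
    unsigned long tx_packets;
    unsigned long tx_bytes;
    unsigned long tx_errors;
};

/* Hypothetical, simplified send completion. */
struct model_wc {
    bool         success; /* true when the send completed without error */
    unsigned int len;     /* length of the skb that was posted */
};

/* Behaviour of the quoted snippet: every completion counts as sent. */
static void model_handle_tx_always(struct model_stats *s,
                                   const struct model_wc *wc)
{
    ++s->tx_packets;
    s->tx_bytes += wc->len;
}

/* Alternative: count successful and failed completions separately. */
static void model_handle_tx_checked(struct model_stats *s,
                                    const struct model_wc *wc)
{
    if (wc->success) {
        ++s->tx_packets;
        s->tx_bytes += wc->len;
    } else {
        ++s->tx_errors;
    }
}
```

With the first policy a flushed send still shows up as a transmitted
packet in ifconfig, which is why the tx count tracks what netperf sent
even when packets are lost.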

thanks
Shirley

Re: [ewg] Re: IPoIB panics on ipoib_cm_handle_rx_wc()

2008-04-25 Thread Shirley Ma
Hello Or,

We have seen skb_under_panic() in our test environment as well. It is
easy to reproduce by turning tcpdump on and off.

thanks
Shirley

On Thu, 2008-04-24 at 14:11 +0300, Or Gerlitz wrote:
  https://bugs.openfabrics.org/show_bug.cgi?id=989
  --- Comment #13 from [EMAIL PROTECTED]  2008-04-23 13:23 ---
  I think I found the problem and have a fix.
 
  In ipoib_cm_handle_rx_wc(), if the byte_len is < SKB_TSHOLD, a new sk_buff
  is allocated.  The sk_buff's mac.raw is not initialized.  It sends the
  packet up the stack to netif_receive_skb().  In netif_receive_skb(),
  skb->mac_len is computed by subtracting mac.raw from nh.raw.  Since
  mac.raw was not initialized, we get a very large number.  It eventually
  leads to a panic in skb_under_panic.
 
  The diff of the fix:
  drivers/infiniband/ulp/ipoib/ipoib_cm.c.pre_kris
  drivers/infiniband/ulp/ipoib/ipoib_cm.c
  *** drivers/infiniband/ulp/ipoib/ipoib_cm.c.pre_kris    2008-04-23 16:05:26.0 -0400
  --- drivers/infiniband/ulp/ipoib/ipoib_cm.c     2008-04-23 15:16:23.0 -0400
  ***************
  *** 622,627 ****
  --- 622,628 ----
                          skb_copy_from_linear_data_offset(skb, IPOIB_ENCAP_LEN,
                                                           small_skb->data, dlen);
                          skb_put(small_skb, dlen);
  +                       skb_reset_mac_header(small_skb);
                          skb = small_skb;
                          goto copied;
                  }
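
The arithmetic behind the panic can be shown with a small user-space
model. Everything here is a sketch under assumptions: struct model_skb
and its helpers are hypothetical, addresses are plain integers, and 0
stands in for whatever stale value an uninitialized mac.raw holds.

```c
#include <stdint.h>

/* Hypothetical model of the three sk_buff fields involved. */
struct model_skb {
    uintptr_t data; /* start of packet data */
    uintptr_t mac;  /* mac.raw; 0 stands for "never initialized" */
    uintptr_t nh;   /* nh.raw */
};

/* Models skb_reset_mac_header(): point the MAC header at the data. */
static void model_reset_mac_header(struct model_skb *skb)
{
    skb->mac = skb->data;
}

/* Models the receive path: nh.raw is set to data, then mac_len is
 * derived as the distance between nh.raw and mac.raw. */
static unsigned int model_receive_mac_len(struct model_skb *skb)
{
    skb->nh = skb->data;
    return (unsigned int)(skb->nh - skb->mac);
}
```

Without the reset, mac_len is the full distance from the stale pointer
to the data; with the reset it is 0, which is what the one-line fix
restores for the copied skb.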
 Hi Kris,
 
 Good catch. This code does not exist in the mainline kernel and was
 added through the ofed 1.3 (non) process, see the patch below. Does this
 bug hit you for --every-- small packet received in connected mode? If
 not, can you explain why?
 
 The patch itself, as provided by the ofed sources
 (kernel_patches/fixes/ipoib_0320_small_skb_copy.patch), is not even
 documented; I took the change log from the git tree used to store it.
 This unreviewed and undocumented patch, with a bug that could have been
 found in review, is yet another good example of why code should not be
 added through ofed but rather through the mainline cycle, etc, oh well.
 
 Or.
  commit 92557c139fd8329daf1a1bf8beeaa6ae940b055a
  Author: Eli Cohen [EMAIL PROTECTED](none)
  Date:   Mon Feb 18 21:56:03 2008 +0200
 
  IB/ipoib: copy small SKBs in CM mode
 
  CM mode handling of received packets involves iterating through the
  fragments. This is time consuming, and in the case of small packets it
  is better to allocate a new small skb, copy the data, and pass this
  smaller SKB up to the IP stack.
 
  Signed-off-by: Eli Cohen [EMAIL PROTECTED]
 
  Index: ofa_1_3_dev_kernel/drivers/infiniband/ulp/ipoib/ipoib.h
  ===
  --- ofa_1_3_dev_kernel.orig/drivers/infiniband/ulp/ipoib/ipoib.h
  2008-02-18 19:23:23.0 +0200
  +++ ofa_1_3_dev_kernel/drivers/infiniband/ulp/ipoib/ipoib.h 2008-02-18 
  22:20:48.0 +0200
  @@ -99,6 +99,8 @@ enum {
  MAX_SEND_CQE  = 16,
  UD_POST_RCV_COUNT = 16,
  CM_POST_SRQ_COUNT = 16,
  +
  +   SKB_TSHOLD= 256,
   };
 
   #define IPOIB_OP_RECV   (1ul << 31)
  Index: ofa_1_3_dev_kernel/drivers/infiniband/ulp/ipoib/ipoib_cm.c
  ===
  --- ofa_1_3_dev_kernel.orig/drivers/infiniband/ulp/ipoib/ipoib_cm.c 
  2008-02-18 19:23:23.0 +0200
  +++ ofa_1_3_dev_kernel/drivers/infiniband/ulp/ipoib/ipoib_cm.c  
  2008-02-18 22:21:01.0 +0200
  @@ -554,6 +554,7 @@ void ipoib_cm_handle_rx_wc(struct net_de
  u64 mapping[IPOIB_CM_RX_SG];
  int frags;
  int has_srq;
  +   struct sk_buff *small_skb;
 
          ipoib_dbg_data(priv, "cm recv completion: id %d, status: %d\n",
                         wr_id, wc->status);
  @@ -608,6 +609,20 @@ void ipoib_cm_handle_rx_wc(struct net_de
  }
  }
 
  +       if (wc->byte_len < SKB_TSHOLD) {
  +               int dlen = wc->byte_len - IPOIB_ENCAP_LEN;
  +
  +               small_skb = dev_alloc_skb(dlen);
  +               if (small_skb) {
  +                       small_skb->protocol = ((struct ipoib_header *)skb->data)->proto;
  +                       skb_copy_from_linear_data_offset(skb, IPOIB_ENCAP_LEN,
  +                                                        small_skb->data, dlen);
  +                       skb_put(small_skb, dlen);
  +                       skb = small_skb;
  +                       goto copied;
  +               }
  +       }
  +
          frags = PAGE_ALIGN(wc->byte_len - min(wc->byte_len,
                             (unsigned)IPOIB_CM_HEAD_SIZE)) / PAGE_SIZE;
 
  @@ -634,6 +649,7 @@ void ipoib_cm_handle_rx_wc(struct net_de
  skb_reset_mac_header(skb);
  skb_pull(skb, IPOIB_ENCAP_LEN);
 
  +copied:
          dev->last_rx = jiffies;
          ++dev->stats.rx_packets;
          dev->stats.rx_bytes += skb->len;

 
 
 
 
 

[ewg] Re: [PATCH] IPoIB 4K MTU support

2008-04-22 Thread Shirley Ma
Hello Roland,

On Tue, 2008-04-22 at 13:46 -0700, Roland Dreier wrote:
 Thanks, applied with some cleanups as below.
Thanks!

 As an aside, in the case where we need to use a fragment in the receive
 skb, does it make sense to make the initial linear part bigger so the
 TCP and IP headers fit there (and the kernel doesn't have to look into
 the fragment list to handle the packet)?
We can improve this later.

 Also, is there any clean way where a kernel with PAGE_SIZE > 4096 can
 have ud_need_sg evaluate to 0 at compile time, so that all the unneeded
 code can be thrown out by the compiler?
 
   +  return (IPOIB_UD_BUF_SIZE(ib_mtu) > PAGE_SIZE) ? 1 : 0;
 
 I've never understood this style: it makes no sense to do
 
   return bool ? 1 : 0;
 
 instead of just
 
   return bool;
You are right.
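
A user-space sketch of the simplified helper, returning the comparison
directly. The MODEL_* names and the 4096-byte page size are assumptions
for illustration; the kernel takes PAGE_SIZE and IB_GRH_BYTES from its
own headers.

```c
#include <stdbool.h>

/* Assumed values for illustration only. */
#define MODEL_PAGE_SIZE    4096u
#define MODEL_IB_GRH_BYTES 40u

#define MODEL_UD_BUF_SIZE(ib_mtu) ((ib_mtu) + MODEL_IB_GRH_BYTES)

/* The comparison already yields 0 or 1, so no "? 1 : 0" is needed. */
static inline bool model_ud_need_sg(unsigned int ib_mtu)
{
    return MODEL_UD_BUF_SIZE(ib_mtu) > MODEL_PAGE_SIZE;
}
```

With a 4096-byte page, a 4096-byte IB MTU needs the second buffer
(4096 + 40 > 4096) while a 2048-byte MTU does not.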

   +static inline void ipoib_ud_dma_unmap_rx(struct ipoib_dev_priv *priv,
   +                                         u64 mapping[IPOIB_UD_RX_SG])
   +{
   +        if (ipoib_ud_need_sg(priv->max_ib_mtu)) {
   +                ib_dma_unmap_single(priv->ca, mapping[0], IPOIB_UD_HEAD_SIZE, DMA_FROM_DEVICE);
   +                ib_dma_unmap_page(priv->ca, mapping[1], PAGE_SIZE, DMA_FROM_DEVICE);
   +        } else
   +                ib_dma_unmap_single(priv->ca, mapping[0], IPOIB_UD_BUF_SIZE(priv->max_ib_mtu), DMA_FROM_DEVICE);
   +}
   +
   +static inline void ipoib_ud_skb_put_frags(struct ipoib_dev_priv *priv,
   +                                          struct sk_buff *skb,
   +                                          unsigned int length)
   +{
   +        if (ipoib_ud_need_sg(priv->max_ib_mtu)) {
   +                skb_frag_t *frag = &skb_shinfo(skb)->frags[0];
   +                /*
   +                 * There is only two buffers needed for max_payload = 4K,
   +                 * first buf size is IPOIB_UD_HEAD_SIZE
   +                 */
   +                skb->tail += IPOIB_UD_HEAD_SIZE;
   +                frag->size = length - IPOIB_UD_HEAD_SIZE;
   +                skb->data_len += frag->size;
   +                skb->truesize += frag->size;
   +                skb->len += length;
   +        } else
   +                skb_put(skb, length);
   +
   +}
 
 These are pretty big to put in a header file as inlines... I moved them
 to the only .c file where they're used.
 
  - R.
Right. I should have moved them into the .c file after Or's comment. I forgot.

Thanks.
Shirley




[ewg] [PATCH] IPoIB 4K MTU support

2008-04-20 Thread Shirley Ma
Hello Roland,

I recreated the IPoIB 4K MTU patch. The patch below is built against the
2.6.25 kernel for submission to the 2.6.26 kernel. Please review and
integrate it. Please let me know if there is any problem.

Thanks
Shirley

This patch enables IPoIB 4K MTU support by using two S/G buffers when
PAGE_SIZE is less than or equal to HCA IB MTU size. The first buffer is
for IPoIB header + GRH header. The second buffer is IPoIB payload, which
is 4K-4.

Signed-off-by: Shirley Ma [EMAIL PROTECTED]
---
 drivers/infiniband/ulp/ipoib/ipoib.h   |   50 +-
 drivers/infiniband/ulp/ipoib/ipoib_ib.c|   86 +--
 drivers/infiniband/ulp/ipoib/ipoib_main.c  |   19 --
 drivers/infiniband/ulp/ipoib/ipoib_multicast.c |3 +-
 drivers/infiniband/ulp/ipoib/ipoib_verbs.c |   15 -
 drivers/infiniband/ulp/ipoib/ipoib_vlan.c  |1 +
 6 files changed, 125 insertions(+), 49 deletions(-)

diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h 
b/drivers/infiniband/ulp/ipoib/ipoib.h
index 73b2b17..6a05ead 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib.h
+++ b/drivers/infiniband/ulp/ipoib/ipoib.h
@@ -56,11 +56,11 @@
 /* constants */
 
 enum {
-   IPOIB_PACKET_SIZE = 2048,
-   IPOIB_BUF_SIZE= IPOIB_PACKET_SIZE + IB_GRH_BYTES,
-
IPOIB_ENCAP_LEN   = 4,
 
+   IPOIB_UD_HEAD_SIZE= IB_GRH_BYTES + IPOIB_ENCAP_LEN,
+   IPOIB_UD_RX_SG= 2, /* max buffer needed for 4K mtu */
+
        IPOIB_CM_MTU          = 0x10000 - 0x10, /* padding to align header to 16 */
IPOIB_CM_BUF_SIZE = IPOIB_CM_MTU  + IPOIB_ENCAP_LEN,
IPOIB_CM_HEAD_SIZE= IPOIB_CM_BUF_SIZE % PAGE_SIZE,
@@ -139,7 +139,7 @@ struct ipoib_mcast {
 
 struct ipoib_rx_buf {
struct sk_buff *skb;
-   u64 mapping;
+   u64 mapping[IPOIB_UD_RX_SG];
 };
 
 struct ipoib_tx_buf {
@@ -294,6 +294,7 @@ struct ipoib_dev_priv {
 
unsigned int admin_mtu;
unsigned int mcast_mtu;
+   unsigned int max_ib_mtu;
 
struct ipoib_rx_buf *rx_ring;
 
@@ -305,6 +306,9 @@ struct ipoib_dev_priv {
        struct ib_send_wr    tx_wr;
        unsigned             tx_outstanding;
 
+       struct ib_recv_wr    rx_wr;
+       struct ib_sge        rx_sge[IPOIB_UD_RX_SG];
+
struct ib_wc ibwc[IPOIB_NUM_WC];
 
struct list_head dead_ahs;
@@ -366,6 +370,44 @@ struct ipoib_neigh {
        struct list_head    list;
 };
 
+#define IPOIB_UD_MTU(ib_mtu)   (ib_mtu - IPOIB_ENCAP_LEN)
+#define IPOIB_UD_BUF_SIZE(ib_mtu)  (ib_mtu + IB_GRH_BYTES)
+
+static inline int ipoib_ud_need_sg(unsigned int ib_mtu)
+{
+       return (IPOIB_UD_BUF_SIZE(ib_mtu) > PAGE_SIZE) ? 1 : 0;
+}
+
+static inline void ipoib_ud_dma_unmap_rx(struct ipoib_dev_priv *priv,
+                                        u64 mapping[IPOIB_UD_RX_SG])
+{
+       if (ipoib_ud_need_sg(priv->max_ib_mtu)) {
+               ib_dma_unmap_single(priv->ca, mapping[0], IPOIB_UD_HEAD_SIZE, DMA_FROM_DEVICE);
+               ib_dma_unmap_page(priv->ca, mapping[1], PAGE_SIZE, DMA_FROM_DEVICE);
+       } else
+               ib_dma_unmap_single(priv->ca, mapping[0], IPOIB_UD_BUF_SIZE(priv->max_ib_mtu), DMA_FROM_DEVICE);
+}
+
+static inline void ipoib_ud_skb_put_frags(struct ipoib_dev_priv *priv,
+                                         struct sk_buff *skb,
+                                         unsigned int length)
+{
+       if (ipoib_ud_need_sg(priv->max_ib_mtu)) {
+               skb_frag_t *frag = &skb_shinfo(skb)->frags[0];
+               /*
+                * There is only two buffers needed for max_payload = 4K,
+                * first buf size is IPOIB_UD_HEAD_SIZE
+                */
+               skb->tail += IPOIB_UD_HEAD_SIZE;
+               frag->size = length - IPOIB_UD_HEAD_SIZE;
+               skb->data_len += frag->size;
+               skb->truesize += frag->size;
+               skb->len += length;
+       } else
+               skb_put(skb, length);
+
+}
+
 /*
  * We stash a pointer to our private neighbour information after our
  * hardware address in neigh->ha.  The ALIGN() expression here makes
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ib.c 
b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
index 0205eb7..8b3f1b2 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
@@ -92,25 +92,18 @@ void ipoib_free_ah(struct kref *kref)
 static int ipoib_ib_post_receive(struct net_device *dev, int id)
 {
struct ipoib_dev_priv *priv = netdev_priv(dev);
-   struct ib_sge list;
-   struct ib_recv_wr param;
struct ib_recv_wr *bad_wr;
int ret;
 
-       list.addr     = priv->rx_ring[id].mapping;
-       list.length   = IPOIB_BUF_SIZE;
-       list.lkey     = priv->mr->lkey;
+       priv->rx_wr.wr_id    = id | IPOIB_OP_RECV;
+       priv->rx_sge[0].addr = priv->rx_ring[id].mapping[0];
+       priv->rx_sge[1].addr = priv->rx_ring[id].mapping[1];
+   
 
-   param.next= NULL

[ewg] Re: [RFC][1/2] IPoIB UD 4K MTU support

2008-04-08 Thread Shirley Ma
On Fri, 2008-04-04 at 15:36 -0700, Roland Dreier wrote:
   +  unsigned int max_ib_mtu;
 
 I don't see where this is ever set?
 
  - R.

It is set in ipoib_add_port() in ipoib_main.c:

+       if (!ib_query_port(hca, port, &attr))
+               priv->max_ib_mtu = ib_mtu_enum_to_int(attr.max_mtu);
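
For reference, the port attribute carries the MTU as an encoded value
rather than a byte count, which is why the conversion is needed. A
user-space sketch of that conversion, using hypothetical MODEL_* names
modelled on the kernel's enum ib_mtu:

```c
/* Hypothetical copy of the IB MTU encoding, modelled on enum ib_mtu. */
enum model_ib_mtu {
    MODEL_IB_MTU_256  = 1,
    MODEL_IB_MTU_512  = 2,
    MODEL_IB_MTU_1024 = 3,
    MODEL_IB_MTU_2048 = 4,
    MODEL_IB_MTU_4096 = 5
};

/* Converts the encoded MTU to bytes; returns -1 for an unknown encoding. */
static int model_mtu_enum_to_int(enum model_ib_mtu mtu)
{
    switch (mtu) {
    case MODEL_IB_MTU_256:  return 256;
    case MODEL_IB_MTU_512:  return 512;
    case MODEL_IB_MTU_1024: return 1024;
    case MODEL_IB_MTU_2048: return 2048;
    case MODEL_IB_MTU_4096: return 4096;
    default:                return -1;
    }
}
```

A port reporting the 4096 encoding therefore ends up with
max_ib_mtu = 4096, which is the value the S/G decision is based on.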

Thanks
Shirley



[ewg] Re: 4K MTU patch to kernel 2.6.26

2008-03-20 Thread Shirley Ma





Hello Tziporet,

Yes, that is what I am working on. I am going on vacation next week, which
is why I hesitated to submit this patch this week: I would not be able to
respond to the review comments in time. If submitting the patch on April 1
is too late, I can submit it by tomorrow. What do you think?

thanks
Shirley

[ewg] [RFC][0/2] IPoIB UD 4K MTU support

2008-03-20 Thread Shirley Ma
Here is a patchset to enable IPoIB UD 4K MTU support for any IB fabric
where the max IPoIB payload can be up to 4K. This patchset uses two S/G
buffers when the IPoIB payload + IB GRH header size is greater than
PAGE_SIZE. The first buffer size is IB_GRH_BYTES + IPOIB_ENCAP_LEN. The
second buffer is the data.
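
The split described above can be worked through with a small user-space
sketch. The MODEL_* constants and struct model_rx_layout are assumptions
for illustration (a 4096-byte page, a 40-byte GRH and a 4-byte IPoIB
encapsulation header), not code from the patchset.

```c
/* Assumed values for illustration only. */
#define MODEL_PAGE_SIZE       4096u
#define MODEL_IB_GRH_BYTES    40u
#define MODEL_IPOIB_ENCAP_LEN 4u

/* Hypothetical description of how one receive is laid out. */
struct model_rx_layout {
    unsigned int num_sge;  /* number of S/G entries used */
    unsigned int head_len; /* bytes landing in the first buffer */
    unsigned int data_len; /* bytes landing in the second buffer, 0 if unused */
};

/* Computes the receive layout for a given IB MTU. */
static struct model_rx_layout model_rx_layout_for(unsigned int ib_mtu)
{
    struct model_rx_layout l;
    unsigned int buf_size = ib_mtu + MODEL_IB_GRH_BYTES;

    if (buf_size > MODEL_PAGE_SIZE) {
        /* Headers go in the first buffer, payload in the second. */
        l.num_sge  = 2;
        l.head_len = MODEL_IB_GRH_BYTES + MODEL_IPOIB_ENCAP_LEN;
        l.data_len = buf_size - l.head_len;
    } else {
        /* Everything fits in a single linear buffer. */
        l.num_sge  = 1;
        l.head_len = buf_size;
        l.data_len = 0;
    }
    return l;
}
```

For a 4096-byte IB MTU this gives a 44-byte first buffer and a 4092-byte
payload (4K - 4); for a 2048-byte MTU a single 2088-byte buffer is
enough.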

Please review it.

Thanks
Shirley



[ewg] [RFC][1/2] IPoIB UD 4K MTU support

2008-03-20 Thread Shirley Ma
This patch defines some parameters and creates a couple of helper
functions for UD RX S/G, to be used later.

Signed-off-by: Shirley Ma [EMAIL PROTECTED]
---

 drivers/infiniband/ulp/ipoib/ipoib.h |   48 ++
 1 files changed, 48 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h 
b/drivers/infiniband/ulp/ipoib/ipoib.h
index f9b7caa..73a8fe5 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib.h
+++ b/drivers/infiniband/ulp/ipoib/ipoib.h
@@ -61,6 +61,10 @@ enum {
 
IPOIB_ENCAP_LEN   = 4,
 
+   IPOIB_UD_MAX_PAYLOAD  = 4096,
+   IPOIB_UD_HEAD_SIZE= IB_GRH_BYTES + IPOIB_ENCAP_LEN,
+   IPOIB_UD_RX_SG= (IPOIB_UD_MAX_PAYLOAD + IB_GRH_BYTES) / 
PAGE_SIZE,
+
        IPOIB_CM_MTU          = 0x10000 - 0x10, /* padding to align header to 16 */
IPOIB_CM_BUF_SIZE = IPOIB_CM_MTU  + IPOIB_ENCAP_LEN,
IPOIB_CM_HEAD_SIZE= IPOIB_CM_BUF_SIZE % PAGE_SIZE,
@@ -141,6 +145,11 @@ struct ipoib_rx_buf {
u64 mapping;
 };
 
+struct ipoib_ud_rx_buf {
+   struct sk_buff *skb;
+   u64 mapping[IPOIB_UD_RX_SG];
+};
+
 struct ipoib_tx_buf {
struct sk_buff *skb;
u64 mapping[MAX_SKB_FRAGS + 1];
@@ -289,6 +298,7 @@ struct ipoib_dev_priv {
 
unsigned int admin_mtu;
unsigned int mcast_mtu;
+   unsigned int max_ib_mtu;
 
struct ipoib_rx_buf *rx_ring;
 
@@ -359,6 +369,44 @@ struct ipoib_neigh {
        struct list_head    list;
 };
 
+#define IPOIB_UD_MTU(ib_mtu)   (ib_mtu - IPOIB_ENCAP_LEN)
+#define IPOIB_UD_BUF_SIZE(ib_mtu)  (ib_mtu + IB_GRH_BYTES)
+
+static inline int ipoib_ud_need_sg(unsigned int ib_mtu)
+{
+       return (IPOIB_UD_BUF_SIZE(ib_mtu) > PAGE_SIZE) ? 1 : 0;
+}
+
+static inline void ipoib_ud_dma_unmap_rx(struct ipoib_dev_priv *priv,
+                                        u64 mapping[IPOIB_UD_RX_SG])
+{
+       if (ipoib_ud_need_sg(priv->max_ib_mtu)) {
+               ib_dma_unmap_single(priv->ca, mapping[0], IPOIB_UD_HEAD_SIZE, DMA_FROM_DEVICE);
+               ib_dma_unmap_page(priv->ca, mapping[1], PAGE_SIZE, DMA_FROM_DEVICE);
+       } else
+               ib_dma_unmap_single(priv->ca, mapping[0], IPOIB_UD_BUF_SIZE(priv->max_ib_mtu), DMA_FROM_DEVICE);
+}
+
+static inline void ipoib_ud_skb_put_frags(struct ipoib_dev_priv *priv,
+                                         struct sk_buff *skb,
+                                         unsigned int length)
+{
+       if (ipoib_ud_need_sg(priv->max_ib_mtu)) {
+               skb_frag_t *frag = &skb_shinfo(skb)->frags[0];
+               /*
+                * There is only two buffers needed for max_payload = 4K,
+                * first buf size is IPOIB_UD_HEAD_SIZE
+                */
+               skb->tail += IPOIB_UD_HEAD_SIZE;
+               frag->size = length - IPOIB_UD_HEAD_SIZE;
+               skb->data_len += frag->size;
+               skb->truesize += frag->size;
+               skb->len += length;
+       } else
+               skb_put(skb, length);
+
+}
+
 /*
  * We stash a pointer to our private neighbour information after our
  * hardware address in neigh->ha.  The ALIGN() expression here makes




[ewg] [RFC][2/2] IPoIB UD 4K MTU support

2008-03-20 Thread Shirley Ma
This patch enables 4K MTU support for IPoIB UD.
I removed the unnecessary define from the [RFC][1/2] patch,
since only two buffers are needed. I will integrate any
comments on this patchset later and resubmit it. I have
touch tested this patch against the branch-2.6.25 git tree.

Signed-off-by: Shirley Ma [EMAIL PROTECTED]
---

 drivers/infiniband/ulp/ipoib/ipoib.h   |   13 +---
 drivers/infiniband/ulp/ipoib/ipoib_ib.c|   86 +--
 drivers/infiniband/ulp/ipoib/ipoib_main.c  |   19 --
 drivers/infiniband/ulp/ipoib/ipoib_multicast.c |3 +-
 drivers/infiniband/ulp/ipoib/ipoib_verbs.c |   15 -
 drivers/infiniband/ulp/ipoib/ipoib_vlan.c  |1 +
 6 files changed, 83 insertions(+), 54 deletions(-)

diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h 
b/drivers/infiniband/ulp/ipoib/ipoib.h
index 73a8fe5..fcbb618 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib.h
+++ b/drivers/infiniband/ulp/ipoib/ipoib.h
@@ -56,14 +56,11 @@
 /* constants */
 
 enum {
-   IPOIB_PACKET_SIZE = 2048,
-   IPOIB_BUF_SIZE= IPOIB_PACKET_SIZE + IB_GRH_BYTES,
-
IPOIB_ENCAP_LEN   = 4,
 
IPOIB_UD_MAX_PAYLOAD  = 4096,
IPOIB_UD_HEAD_SIZE= IB_GRH_BYTES + IPOIB_ENCAP_LEN,
-   IPOIB_UD_RX_SG= (IPOIB_UD_MAX_PAYLOAD + IB_GRH_BYTES) / 
PAGE_SIZE,
+   IPOIB_UD_RX_SG= 2, /* max buffer needed */
 
        IPOIB_CM_MTU          = 0x10000 - 0x10, /* padding to align header to 16 */
IPOIB_CM_BUF_SIZE = IPOIB_CM_MTU  + IPOIB_ENCAP_LEN,
@@ -142,11 +139,6 @@ struct ipoib_mcast {
 
 struct ipoib_rx_buf {
struct sk_buff *skb;
-   u64 mapping;
-};
-
-struct ipoib_ud_rx_buf {
-   struct sk_buff *skb;
u64 mapping[IPOIB_UD_RX_SG];
 };
 
@@ -310,6 +302,9 @@ struct ipoib_dev_priv {
        struct ib_send_wr    tx_wr;
        unsigned             tx_outstanding;
 
+       struct ib_recv_wr    rx_wr;
+       struct ib_sge        rx_sge[IPOIB_UD_RX_SG];
+
struct ib_wc ibwc[IPOIB_NUM_WC];
 
struct list_head dead_ahs;
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_cm.c 
b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_fs.c 
b/drivers/infiniband/ulp/ipoib/ipoib_fs.c
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ib.c 
b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
index 9d3e778..072acc2 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
@@ -90,25 +90,18 @@ void ipoib_free_ah(struct kref *kref)
 static int ipoib_ib_post_receive(struct net_device *dev, int id)
 {
struct ipoib_dev_priv *priv = netdev_priv(dev);
-   struct ib_sge list;
-   struct ib_recv_wr param;
struct ib_recv_wr *bad_wr;
int ret;
 
-       list.addr     = priv->rx_ring[id].mapping;
-       list.length   = IPOIB_BUF_SIZE;
-       list.lkey     = priv->mr->lkey;
+       priv->rx_wr.wr_id    = id | IPOIB_OP_RECV;
+       priv->rx_sge[0].addr = priv->rx_ring[id].mapping[0];
+       priv->rx_sge[1].addr = priv->rx_ring[id].mapping[1];
+
 
-       param.next    = NULL;
-       param.wr_id   = id | IPOIB_OP_RECV;
-       param.sg_list = &list;
-       param.num_sge = 1;
-
-       ret = ib_post_recv(priv->qp, &param, &bad_wr);
+       ret = ib_post_recv(priv->qp, &priv->rx_wr, &bad_wr);
        if (unlikely(ret)) {
                ipoib_warn(priv, "receive failed for buf %d (%d)\n", id, ret);
-               ib_dma_unmap_single(priv->ca, priv->rx_ring[id].mapping,
-                                   IPOIB_BUF_SIZE, DMA_FROM_DEVICE);
+               ipoib_ud_dma_unmap_rx(priv, priv->rx_ring[id].mapping);
                dev_kfree_skb_any(priv->rx_ring[id].skb);
                priv->rx_ring[id].skb = NULL;
        }
@@ -116,15 +109,22 @@ static int ipoib_ib_post_receive(struct net_device *dev, 
int id)
return ret;
 }
 
-static int ipoib_alloc_rx_skb(struct net_device *dev, int id)
+static struct sk_buff *ipoib_alloc_rx_skb(struct net_device *dev,
+ int id,
+ u64 mapping[IPOIB_UD_RX_SG])
 {
struct ipoib_dev_priv *priv = netdev_priv(dev);
struct sk_buff *skb;
-   u64 addr;
+   int buf_size;
+
+       if (ipoib_ud_need_sg(priv->max_ib_mtu))
+               buf_size = IPOIB_UD_HEAD_SIZE;
+       else
+               buf_size = IPOIB_UD_BUF_SIZE(priv->max_ib_mtu);
 
-   skb = dev_alloc_skb(IPOIB_BUF_SIZE + 4);
-   if (!skb)
-   return -ENOMEM;
+   skb = dev_alloc_skb(buf_size + 4);
+   if (unlikely(!skb))
+   return NULL;
 
/*
 * IB will leave a 40 byte gap for a GRH and IPoIB adds a 4 byte
@@ -133,17 +133,31 @@ static int ipoib_alloc_rx_skb(struct net_device *dev, int 
id)
 */
skb_reserve(skb, 4);
 
-       addr = ib_dma_map_single(priv->ca, skb->data, IPOIB_BUF_SIZE

Re: [ewg] [Fwd: Re: [ofa-general] IPOIB/CM increase retry counts]

2008-02-13 Thread Shirley Ma




 I saw cases where a fast sender consumed the TX ring and I solved this by
 increasing the size of the tx queue. I will try to connect ConnectX with
 Sinai and see if there are such issues.

Which indicates we really need to fix bug 907.

Thanks
Shirley

Re: [ewg] [PATCH]IPOIB/CM fix for bug# 906 -OFED-1.3

2008-02-13 Thread Shirley Ma
On Wed, 2008-02-13 at 10:04 +0200, Or Gerlitz wrote:
 Also here, does this problem exist in the 2.6.25-rc1 upstream code as
 well? From the change log I don't understand the source of the problem
 (only the symptom of failing to destroy ipoib/cm rx QP) and the solution.
 
 Or.

I believe so. This problem is not new in the OFED-1.3 release.

Thanks
Shirley



[ewg] Re: IB/ipoib: ipoib_ib_post_receive: infinite loop in error path

2008-02-08 Thread Shirley Ma





Thanks Nam. I will fix it along with ipoib_sg_skb_put_frags() optimization.

Thanks
Shirley



   
From:    Hoang-Nam Nguyen (hnguyen@linux.vnet.ibm.com)
Date:    02/08/08 07:10 AM
To:      [EMAIL PROTECTED], Shirley Ma/Beaverton/[EMAIL PROTECTED]
cc:      ewg@lists.openfabrics.org, [EMAIL PROTECTED]
Subject: IB/ipoib: ipoib_ib_post_receive: infinite loop in error path




Hello Eli!
I looked at the ipoib code from ofed-1.3-rc4 and saw the following code
snippet in ipoib_ib_post_receive():

        if (++priv->rx_outst == UD_POST_RCV_COUNT) {
                ret = ib_post_recv(priv->qp, priv->rx_wr_draft, &bad_wr);

                if (unlikely(ret)) {
                        ipoib_warn(priv, "receive failed for buf %d (%d)\n", id, ret);
                        while (bad_wr) {
                                id = bad_wr->wr_id & ~IPOIB_OP_RECV;
                                ipoib_sg_dma_unmap_rx(priv,
                                                      priv->rx_ring[i].mapping);
#1/ipoib_0240_4kmtu.patch: should be priv->rx_ring[id].mapping
                                dev_kfree_skb_any(priv->rx_ring[id].skb);
                                priv->rx_ring[id].skb = NULL;
#2/ipoib_0220_ud_post_list.patch: missing iterator forwarding, ie bad_wr = bad_wr->next;
                        }
                }
                priv->rx_outst = 0;
        }

#1: I've talked with Shirley about this.
#2: I thought I had seen you fix it, but I still see it in rc4 after
running the configure script.
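
The two points combine into an error path like the following user-space
sketch. The model_* types and MODEL_* constants are hypothetical
stand-ins for the driver structures; the sketch only shows the corrected
index (#1) and the iterator advance (#2).

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_RING_SIZE 8
#define MODEL_OP_RECV   (1ul << 31)

/* Hypothetical stand-in for a chained receive work request. */
struct model_recv_wr {
    struct model_recv_wr *next;
    unsigned long         wr_id;
};

/* Hypothetical stand-in for the receive ring: true means a buffer is held. */
struct model_ring {
    bool has_skb[MODEL_RING_SIZE];
};

/* Releases every buffer on the failed list; returns how many were freed. */
static int model_release_failed(struct model_ring *ring,
                                struct model_recv_wr *bad_wr)
{
    int released = 0;

    while (bad_wr) {
        unsigned long id = bad_wr->wr_id & ~MODEL_OP_RECV;

        /* #1: index the ring with the id taken from the failed request. */
        if (id < MODEL_RING_SIZE && ring->has_skb[id]) {
            ring->has_skb[id] = false;
            released++;
        }
        /* #2: advance the iterator, otherwise the loop never ends. */
        bad_wr = bad_wr->next;
    }
    return released;
}
```

Given a failed list holding ids 2 and 3, the loop frees exactly those
two ring entries and then terminates.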

Nam


[ewg] Re: [PATCH] IB/ipoib - Problem with latest OFED 1.3 build... IPoIB and iPATH

2008-02-08 Thread Shirley Ma





Hello Ralph,

  I looked at ehca and mthca: in create_ah(), neither driver checks the
dlid the way ipath does here. During port initialization, priv->local_lid
is set to 0, which was introduced by ipoib_0190_unsig_udqp.patch in RC4.
I will let Eli look at this problem.

static struct ib_ah *ipath_create_ah(struct ib_pd *pd,
                                     struct ib_ah_attr *ah_attr)
{
        struct ipath_ah *ah;
        struct ib_ah *ret;
        struct ipath_ibdev *dev = to_idev(pd->device);
        unsigned long flags;

        /* A multicast address requires a GRH (see ch. 8.4.1). */
        if (ah_attr->dlid >= IPATH_MULTICAST_LID_BASE &&
            ah_attr->dlid != IPATH_PERMISSIVE_LID &&
            !(ah_attr->ah_flags & IB_AH_GRH)) {
                ret = ERR_PTR(-EINVAL);
                goto bail;
        }

        if (ah_attr->dlid == 0) {
                ret = ERR_PTR(-EINVAL);
                goto bail;
        }
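
To see why a local LID of 0 makes the address handle creation fail, the
two checks can be modelled in user space. This is a sketch under
assumptions: model_check_ah_attr() is hypothetical, and the MODEL_*
values (0xC000 for the multicast base, 0xFFFF for the permissive LID)
are assumed here rather than taken from the driver headers.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed LID ranges for illustration only. */
#define MODEL_MULTICAST_LID_BASE 0xC000u
#define MODEL_PERMISSIVE_LID     0xFFFFu

/* Mirrors the two validity checks shown above; returns 0 or -EINVAL. */
static int model_check_ah_attr(uint16_t dlid, bool has_grh)
{
    /* A multicast destination requires a GRH. */
    if (dlid >= MODEL_MULTICAST_LID_BASE &&
        dlid != MODEL_PERMISSIVE_LID &&
        !has_grh)
        return -EINVAL;

    /* LID 0 is never a valid destination. */
    if (dlid == 0)
        return -EINVAL;

    return 0;
}
```

A caller that passes dlid 0, as happens when priv->local_lid has not
been filled in yet, is rejected by the second check, while drivers
without that check accept the same attributes.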


Thanks
Shirley

[ewg] Re: [PATCH] IB/ipoib - Problem with latest OFED 1.3 build... IPoIB and iPATH

2008-02-08 Thread Shirley Ma




Hello Ralph,

  This patch looks OK to me. Let's wait for Eli's response.

Thanks
Shirley

Re: [ewg] [Fwd: Re: [ofa-general] Problem with latest OFED 1.3 build... IPoIB and iPATH]

2008-02-07 Thread Shirley Ma
Hello Ralph,

What is the output of ifconfig ib0?

  We can reproduce the problem here.
  We haven't made any ib_ipath driver changes between RC3 and RC4
  so some recent patch has broken us.
  I'm in the process of looking at it.
  
  On Wed, 2008-02-06 at 17:17 -0800, Arlin Davis wrote:
   I cannot ifconfig ib0 on ipath with using the latest build
   (ofed20080206).

   ifup ib0
   SIOCSIFFLAGS: Invalid argument
   Failed to bring up ib0.
   
 ib0: failed to create own ah

int ipoib_ib_dev_open(struct net_device *dev)
{
        struct ipoib_dev_priv *priv = netdev_priv(dev);
        int ret;

        if (ib_find_pkey(priv->ca, priv->port, priv->pkey,
                         &priv->pkey_index)) {
                ipoib_warn(priv, "P_Key 0x%04x not found\n",
                           priv->pkey);
                clear_bit(IPOIB_PKEY_ASSIGNED, &priv->flags);
                return -1;
        }
        set_bit(IPOIB_PKEY_ASSIGNED, &priv->flags);

        ret = create_own_ah(priv);
        if (ret) {
                priv->own_ah = NULL;
                ipoib_warn(priv, "failed to create own ah\n");
                return -1;
        }

It looks like the ipath driver returns an error from the create_own_ah()
call. Are you sure there are no ipath driver changes between RC3 and RC4?

On which kernel did you hit this problem? What is the kernel PAGE_SIZE?

thanks
Shirley



Re: [ewg] [Fwd: Re: [ofa-general] Problem with latest OFED 1.3 build... IPoIB and iPATH]

2008-02-07 Thread Shirley Ma
On Thu, 2008-02-07 at 18:16 -0800, Ralph Campbell wrote:
 # cat /etc/*release
 Red Hat Enterprise Linux Server release 5 (Tikanga)
 # uname -r
 2.6.18-8.el5
 
 4K PAGE_SIZE
I don't have the ipath driver here; otherwise I could try these out myself.

A couple of suggestions; could you please try them out?

1. Try this with a 64K page size, e.g. on RHEL5U1, to see whether you
have the same issue.

2. Can you put a debug message in ipath_create_ah() to see whether this
is a memory allocation failure?

3. How many IB cards are in your system? If you have several, leave just
one ipath card in to see whether you still hit this problem.

Thanks
Shirley



Re: [ewg] traffic jittery, send queue full reports from mthca driver

2008-02-06 Thread Shirley Ma
Hello Or,

I found that if you increase send_queue_size and recv_queue_size, e.g.
to 1K, this problem goes away.

Thanks
Shirley



Re: [ewg] OFED 1.3 rc4 update

2008-02-06 Thread Shirley Ma
On Wed, 2008-02-06 at 18:25 +0200, Tziporet Koren wrote:
 Hi,
 
 We will have OFED 1.3-rc4 tomorrow after one more night of regression
 
 It will include:
 
1. IPoIB: Non-SRQ for CM mode
2. IPOIB: 4K MTU
3. IPoIB - Small messages improvements
 
 Note that today's latest build will include these features too if 
 someone wants to test it today
 
 Tziporet

Thanks Tziporet. We will test it right after it's out.

Thanks
Shirley



Re: [ofa-general] Re: [ewg] [UPDATE][PATCH] IPoIB-UD 4K MTU patch against 2.6.24 ofed-1.3-git tree

2008-02-05 Thread Shirley Ma





Hello Tziporet,

  The problem was caused by the last check-in of the small UDP performance
patch, which changed the receive path completely. I had less than one day
to merge my patch with it and test the result on both Intel and PPC
platforms. My patch was in good, stable shape before that check-in and had
passed the stress test on both Intel and PPC platforms. I tested the new
patch all of last night. It works well and passes the stress test without
any problem.

  Regarding Eli's comments, I have sent out my response. I am sorry for the
minor mistake caused by the rush, but I don't see any risk in my test
results. Please reconsider this patch for OFED-1.3.

thanks
Shirley




   
From:    Tziporet Koren (tziporet@dev.mellanox.co.il)
Sent by: general-b[EMAIL PROTECTED]sts.openfabrics.org
Date:    02/05/08 08:19 AM
To:      [EMAIL PROTECTED]
cc:      Eli Cohen [EMAIL PROTECTED], ewg@lists.openfabrics.org,
         OpenFabrics General [EMAIL PROTECTED]
Subject: [ofa-general] Re: [ewg] [UPDATE][PATCH] IPoIB-UD 4K MTU patch
         against 2.6.24 ofed-1.3-git tree




Shirley Ma wrote:
 I found that one line ended up outside the for loop when merging this
 patch with the current git tree. This made UD_POST_RCV_COUNT = 16 go
 wrong. I have fixed it. This is the updated patch.

 Thanks
 Shirley




Hi Shirley,

It seems to me that the 4K MTU patch is not cooked enough for RC4.
I appreciate your hard work to push it, but so many changes, possible
leaks and not enough time for review and testing mean too high a risk for
now

Tziporet

___
general mailing list
[EMAIL PROTECTED]
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general

To unsubscribe, please visit
http://openib.org/mailman/listinfo/openib-general

Re: [ofa-general] Re: [ewg] [UPDATE][PATCH] IPoIB-UD 4K MTU patch against 2.6.24 ofed-1.3-git tree

2008-02-05 Thread Shirley Ma
Hello Tziporet,

On Tue, 2008-02-05 at 18:56 +0200, Tziporet Koren wrote:
 Shirley Ma wrote:
 
  Hello Tziporet,
 
  The problem was because of the last check in of small UDP performance 
  patch. It changed the receiving path completely. And I only got less 
  than one day to merge/test the patch with that patch on both intel and 
  PPC platform. The patch was in good/stable shape before this patch. It 
  has passed stress test for both intel and PPC platform. I have tested 
  the whole night of the new patch yesterday night. It works well and 
  passes the stress test without any problem.
 
 Which OS have you tested?

The 2.6.24 kernel, and I am going to test the SLES10SP2 kernel. It passed
the stress test all night for the 2K MTU test suites.

  Regarding Eli's comments, I have sent out a response. I am sorry for the
  minor mistake caused by the rush, but I don't see any risk in my test
  results. Please reconsider this patch for OFED-1.3.
 
 OK - we will do this - we will run one set of our regression tests with
 your patch now, and also check that it passes compilation on all kernels.
 If both are OK we will take it.
 
 I'll keep my fingers crossed for you :-)
 
 Tziporet

I appreciate your, Vlad's, and Eli's help here! One line needs to change
for backporting: ++priv->stats vs. ++dev->stats. I didn't create
the backport patch for this.

thanks
Shirley



Re: [ofa-general] Re: [ewg] [UPDATE][PATCH] IPoIB-UD 4K MTU patch against 2.6.24 ofed-1.3-git tree

2008-02-05 Thread Shirley Ma




Tziporet Koren [EMAIL PROTECTED] wrote on 02/05/2008 12:07:28 PM:

 Please test on RHEL 5 too
 What are your stress tests?

OK. The stress test is similar to netperf/netserver, but it runs multiple
bidirectional streams. I have stressed it with up to 150 streams, running
duplex overnight.

 Please send this backport patch and specify for which kernels it's needed

 Tziporet

Ok. It might be out tonight.

Thanks
Shirley

[ewg] [PATCH] IPoIB-UD 4K MTU patch for RC3 against 2.6.24

2008-02-04 Thread Shirley Ma
Hello Vlad,

Here is the IPoIB-4K MTU patch for the OFED-1.3-RC3 release against the
2.6.24 kernel. I have attached it as well since my email has some problems.
Regarding the backport, one line needs to be added for priv->stats vs.
dev->stats. I don't have the backport patch; if you could help me, that
would be nice. If there is any issue, I will ask Nam to help out.

I have touch tested the updated patch on mthca with 2K MTU. More tests
are in progress.
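
As a rough illustration of why 2K and 4K MTU take different receive paths with this patch, here is a minimal user-space sketch of the sizing rule. It mirrors the IPOIB_UD_BUF_SIZE() and ipoib_ud_need_sg() helpers from the diff below. The 40-byte IB_GRH_BYTES value and the 4096-byte page size are assumptions made for the example (page size varies by platform), and split_rx_len() is a hypothetical helper that is not part of the patch.

```c
/* Assumed values, for illustration only. */
#define IB_GRH_BYTES       40    /* InfiniBand global route header */
#define IPOIB_ENCAP_LEN    4     /* IPoIB encapsulation header */
#define SKETCH_PAGE_SIZE   4096  /* assumption: 4 KiB pages */

#define IPOIB_UD_HEAD_SIZE         (IB_GRH_BYTES + IPOIB_ENCAP_LEN)
#define IPOIB_UD_MTU(ib_mtu)       ((ib_mtu) - IPOIB_ENCAP_LEN)
#define IPOIB_UD_BUF_SIZE(ib_mtu)  ((ib_mtu) + IB_GRH_BYTES)

/* A receive buffer needs scatter/gather once it no longer fits in one page. */
static int ipoib_ud_need_sg(int ib_mtu)
{
    return IPOIB_UD_BUF_SIZE(ib_mtu) > SKETCH_PAGE_SIZE;
}

/*
 * Hypothetical helper: split a received length into the bytes that land
 * in the small head buffer and the bytes that land in the page fragment,
 * in the same order ipoib_ud_skb_put_frags() accounts for them.
 */
static void split_rx_len(unsigned int length,
                         unsigned int *head, unsigned int *frag)
{
    unsigned int head_max = (unsigned int)IPOIB_UD_HEAD_SIZE;
    unsigned int page_max = (unsigned int)SKETCH_PAGE_SIZE;

    *head = length < head_max ? length : head_max;
    length -= *head;
    *frag = length < page_max ? length : page_max;
}
```

Under these assumptions a 2048-byte IB MTU needs a 2088-byte buffer and fits in one page, while a 4096-byte IB MTU needs 4136 bytes and therefore takes the two-entry scatter/gather path.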

thanks
Shirley

Signed-off-by: Shirley Ma [EMAIL PROTECTED]
---

 drivers/infiniband/ulp/ipoib/ipoib.h   |   28 +++-
 drivers/infiniband/ulp/ipoib/ipoib_ib.c|  218 +---
 drivers/infiniband/ulp/ipoib/ipoib_main.c  |   19 ++-
 drivers/infiniband/ulp/ipoib/ipoib_multicast.c |3 +-
 drivers/infiniband/ulp/ipoib/ipoib_verbs.c |   16 ++-
 5 files changed, 212 insertions(+), 72 deletions(-)

diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h b/drivers/infiniband/ulp/ipoib/ipoib.h
index 8eb6aa2..cb3aeab 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib.h
+++ b/drivers/infiniband/ulp/ipoib/ipoib.h
@@ -56,11 +56,11 @@
 /* constants */
 
 enum {
-   IPOIB_PACKET_SIZE = 2048,
-   IPOIB_BUF_SIZE= IPOIB_PACKET_SIZE + IB_GRH_BYTES,
-
IPOIB_ENCAP_LEN   = 4,
 
+   IPOIB_UD_HEAD_SIZE= IB_GRH_BYTES + IPOIB_ENCAP_LEN,
+   IPOIB_UD_RX_SG= 2, /* for 4K MTU */ 
+
IPOIB_CM_MTU  = 0x10000 - 0x10, /* padding to align header to 16 */
IPOIB_CM_BUF_SIZE = IPOIB_CM_MTU  + IPOIB_ENCAP_LEN,
IPOIB_CM_HEAD_SIZE= IPOIB_CM_BUF_SIZE % PAGE_SIZE,
@@ -135,9 +135,9 @@ struct ipoib_mcast {
struct net_device *dev;
 };
 
-struct ipoib_rx_buf {
+struct ipoib_sg_rx_buf {
struct sk_buff *skb;
-   u64 mapping;
+   u64 mapping[IPOIB_UD_RX_SG];
 };
 
 struct ipoib_tx_buf {
@@ -286,7 +286,7 @@ struct ipoib_dev_priv {
unsigned int admin_mtu;
unsigned int mcast_mtu;
 
-   struct ipoib_rx_buf *rx_ring;
+   struct ipoib_sg_rx_buf *rx_ring;
 
spinlock_t   tx_lock;
struct ipoib_tx_buf *tx_ring;
@@ -315,6 +315,9 @@ struct ipoib_dev_priv {
struct dentry *mcg_dentry;
struct dentry *path_dentry;
 #endif
+   int max_ib_mtu;
+   struct ib_sge rx_sge[IPOIB_UD_RX_SG];
+   struct ib_recv_wr rx_wr;
 };
 
 struct ipoib_ah {
@@ -355,6 +358,19 @@ struct ipoib_neigh {
struct list_headlist;
 };
 
+#define IPOIB_UD_MTU(ib_mtu)   (ib_mtu - IPOIB_ENCAP_LEN)
+#define IPOIB_UD_BUF_SIZE(ib_mtu)  (ib_mtu + IB_GRH_BYTES)
+static inline int ipoib_ud_need_sg(int ib_mtu)
+{
+   return (IPOIB_UD_BUF_SIZE(ib_mtu) > PAGE_SIZE) ? 1 : 0;
+}
+static inline void ipoib_sg_dma_unmap_rx(struct ipoib_dev_priv *priv,
+u64 mapping[IPOIB_UD_RX_SG])
+{
+   ib_dma_unmap_single(priv->ca, mapping[0], IPOIB_UD_HEAD_SIZE, DMA_FROM_DEVICE);
+   ib_dma_unmap_single(priv->ca, mapping[1], PAGE_SIZE, DMA_FROM_DEVICE);
+}
+
 /*
  * We stash a pointer to our private neighbour information after our
  * hardware address in neigh->ha.  The ALIGN() expression here makes
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ib.c b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
index 5063dd5..6c9eefe 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
@@ -87,32 +87,93 @@ void ipoib_free_ah(struct kref *kref)
spin_unlock_irqrestore(&priv->lock, flags);
 }
 
+/* Adjust length of skb with fragments to match received data */
+static void ipoib_ud_skb_put_frags(struct sk_buff *skb, unsigned int length,
+  struct sk_buff *toskb)
+{
+   unsigned int size;
+   skb_frag_t *frag = &skb_shinfo(skb)->frags[0];
+
+   /* put header into skb */
+   size = min(length, (unsigned)IPOIB_UD_HEAD_SIZE);
+   skb->tail += size;
+   skb->len += size;
+   length -= size;
+
+   if (length == 0) {
+   /* don't need this page */
+   skb_fill_page_desc(toskb, 0, frag->page, 0, PAGE_SIZE);
+   --skb_shinfo(skb)->nr_frags;
+   } else {
+   size = min(length, (unsigned) PAGE_SIZE);
+   frag->size = size;
+   skb->data_len += size;
+   skb->truesize += size;
+   skb->len += size;
+   length -= size;
+   }
+}
+
+static struct sk_buff *ipoib_sg_alloc_rx_skb(struct net_device *dev,
+int id, u64 mapping[IPOIB_UD_RX_SG])
+{
+   struct ipoib_dev_priv *priv = netdev_priv(dev);
+   struct page *page;
+   struct sk_buff *skb;
+
+   skb = dev_alloc_skb(IPOIB_UD_HEAD_SIZE);
+
+   if (unlikely(!skb))
+   return NULL;
+
+   mapping[0] = ib_dma_map_single(priv->ca, skb->data, IPOIB_UD_HEAD_SIZE,
+  DMA_FROM_DEVICE);
+   if (unlikely

[ewg] Re: [PATCH] IPoIB-UD 4K MTU patch for RC3 against 2.6.24

2008-02-04 Thread Shirley Ma
Hello all,

I created the patch and tested it without Eli's patch but with
Pradeep's patch. It works OK. Then I created another patch on top of Eli's
and Pradeep's patches against today's ofed-1.3 git tree. Ping worked for a
while, then stopped. I will try to debug it.

We have also found a crash in today's ofed git tree in IPoIB-CM mode.
Pradeep has narrowed it down to Eli's patch. Please address it promptly
so we can continue our testing.

thanks
Shirley



[ewg] [PATCH] IPoIB-UD 4K MTU patch against 2.6.24 ofed-1.3-git tree

2008-02-04 Thread Shirley Ma
Tziporet,

This IPoIB 4K MTU patch is built against today's 2.6.24 OFED-1.3-Git
tree. The patch tested successfully before Eli's patch went in. This
rebuilt patch is on top of Eli's patch. However, the constant
UD_POST_RCV_COUNT, which Eli's patch defines as 16, does affect the
behavior of this patch. When I define it as 1, everything works OK; if I
change the value to 8 or bigger, the patch does not work well.

We do see a couple of issues after Eli's patch was checked in. So I suggest
checking in this patch; then we can work together to address these issues
tomorrow. In Eli's patch, I would suggest using kzalloc() to allocate the 16
ib_sge and ib_recv_wr entries instead of defining them in ipoib_dev_priv,
since that might cause memory issues. I am working on the patch now to
see whether I get better results.
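
The kzalloc() suggestion above can be sketched in user space as follows. This is only an illustration of the idea, not code from either patch: calloc() stands in for kzalloc(), struct sge and struct recv_wr are simplified hypothetical stand-ins for struct ib_sge and struct ib_recv_wr, and rx_draft_alloc() and rx_draft_free() are made-up names.

```c
#include <stdint.h>
#include <stdlib.h>

#define UD_POST_RCV_COUNT 16
#define IPOIB_UD_RX_SG    2

/* Hypothetical, simplified stand-ins for struct ib_sge / struct ib_recv_wr. */
struct sge {
    uint64_t addr;
    uint32_t length;
    uint32_t lkey;
};

struct recv_wr {
    uint64_t    wr_id;
    struct sge *sg_list;
    int         num_sge;
};

/* The private struct keeps only two pointers instead of two embedded arrays. */
struct rx_draft {
    struct recv_wr *wr;                    /* UD_POST_RCV_COUNT entries */
    struct sge    (*sg)[IPOIB_UD_RX_SG];   /* UD_POST_RCV_COUNT rows */
};

static void rx_draft_free(struct rx_draft *d)
{
    free(d->wr);
    free(d->sg);
    d->wr = NULL;
    d->sg = NULL;
}

/* Allocate zeroed arrays and point each work request at its own SG row. */
static int rx_draft_alloc(struct rx_draft *d)
{
    int i;

    d->wr = calloc(UD_POST_RCV_COUNT, sizeof(*d->wr));
    d->sg = calloc(UD_POST_RCV_COUNT, sizeof(*d->sg));
    if (!d->wr || !d->sg) {
        rx_draft_free(d);
        return -1;
    }
    for (i = 0; i < UD_POST_RCV_COUNT; i++) {
        d->wr[i].sg_list = d->sg[i];
        d->wr[i].num_sge = IPOIB_UD_RX_SG;
    }
    return 0;
}
```

The trade-off is one extra allocation (and its failure path) at setup time in exchange for a smaller private structure.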

Vlad,

One line will need to change for the backport, regarding priv->stats vs.
dev->stats. If you have any problem creating the backport patch, let me
know and I will ask Nam to help. The attachment is there so you can easily
apply the patch, since my email might have issues.

Thanks
Shirley

Signed-off-by: Shirley Ma [EMAIL PROTECTED] 
---

diff -urpN ofed_1_3_a/drivers/infiniband/ulp/ipoib/ipoib.h ofed_1_3_b/drivers/infiniband/ulp/ipoib/ipoib.h
--- ofed_1_3_a/drivers/infiniband/ulp/ipoib/ipoib.h 2008-02-04 15:45:44.0 -0800
+++ ofed_1_3_b/drivers/infiniband/ulp/ipoib/ipoib.h 2008-02-04 15:40:38.0 -0800
@@ -56,11 +56,11 @@
 /* constants */
 
 enum {
-   IPOIB_PACKET_SIZE = 2048,
-   IPOIB_BUF_SIZE= IPOIB_PACKET_SIZE + IB_GRH_BYTES,
-
IPOIB_ENCAP_LEN   = 4,
 
+   IPOIB_UD_HEAD_SIZE= IB_GRH_BYTES + IPOIB_ENCAP_LEN,
+   IPOIB_UD_RX_SG= 2, /* for 4K MTU */ 
+
IPOIB_CM_MTU  = 0x10000 - 0x10, /* padding to align header to 16 */
IPOIB_CM_BUF_SIZE = IPOIB_CM_MTU  + IPOIB_ENCAP_LEN,
IPOIB_CM_HEAD_SIZE= IPOIB_CM_BUF_SIZE % PAGE_SIZE,
@@ -141,9 +141,9 @@ struct ipoib_mcast {
struct net_device *dev;
 };
 
-struct ipoib_rx_buf {
+struct ipoib_sg_rx_buf {
struct sk_buff *skb;
-   u64 mapping;
+   u64 mapping[IPOIB_UD_RX_SG];
 };
 
 struct ipoib_tx_buf {
@@ -337,7 +337,7 @@ struct ipoib_dev_priv {
 
struct net_device  *dev;
struct ib_recv_wr   rx_wr_draft[UD_POST_RCV_COUNT];
-   struct ib_sge   sglist_draft[UD_POST_RCV_COUNT];
+   struct ib_sge   sglist_draft[UD_POST_RCV_COUNT][IPOIB_UD_RX_SG];
unsigned intrx_outst;
 
struct napi_struct napi;
@@ -378,7 +378,7 @@ struct ipoib_dev_priv {
unsigned int admin_mtu;
unsigned int mcast_mtu;
 
-   struct ipoib_rx_buf *rx_ring;
+   struct ipoib_sg_rx_buf *rx_ring;
 
spinlock_t   tx_lock;
struct ipoib_tx_buf *tx_ring;
@@ -412,6 +412,7 @@ struct ipoib_dev_priv {
struct ipoib_ethtool_st etool;
struct timer_list poll_timer;
struct ib_ah *own_ah;
+   int max_ib_mtu;
 };
 
 struct ipoib_ah {
@@ -452,6 +453,19 @@ struct ipoib_neigh {
struct list_headlist;
 };
 
+#define IPOIB_UD_MTU(ib_mtu)   (ib_mtu - IPOIB_ENCAP_LEN)
+#define IPOIB_UD_BUF_SIZE(ib_mtu)  (ib_mtu + IB_GRH_BYTES)
+static inline int ipoib_ud_need_sg(int ib_mtu)
+{
+   return (IPOIB_UD_BUF_SIZE(ib_mtu) > PAGE_SIZE) ? 1 : 0;
+}
+static inline void ipoib_sg_dma_unmap_rx(struct ipoib_dev_priv *priv,
+u64 mapping[IPOIB_UD_RX_SG])
+{
+   ib_dma_unmap_single(priv->ca, mapping[0], IPOIB_UD_HEAD_SIZE, DMA_FROM_DEVICE);
+   ib_dma_unmap_single(priv->ca, mapping[1], PAGE_SIZE, DMA_FROM_DEVICE);
+}
+
 /*
  * We stash a pointer to our private neighbour information after our
  * hardware address in neigh->ha.  The ALIGN() expression here makes
diff -urpN ofed_1_3_a/drivers/infiniband/ulp/ipoib/ipoib_ib.c ofed_1_3_b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
--- ofed_1_3_a/drivers/infiniband/ulp/ipoib/ipoib_ib.c  2008-02-04 15:45:44.0 -0800
+++ ofed_1_3_b/drivers/infiniband/ulp/ipoib/ipoib_ib.c  2008-02-04 15:40:38.0 -0800
@@ -96,14 +96,82 @@ static void clean_pending_receives(struc
 
for (i = 0; i < priv->rx_outst; ++i) {
id = priv->rx_wr_draft[i].wr_id & ~IPOIB_OP_RECV;
-   ib_dma_unmap_single(priv->ca, priv->rx_ring[id].mapping,
-IPOIB_BUF_SIZE, DMA_FROM_DEVICE);
+   if (ipoib_ud_need_sg(priv->max_ib_mtu))
+   ipoib_sg_dma_unmap_rx(priv,
+ priv->rx_ring[i].mapping);
+   else
+   ib_dma_unmap_single(priv->ca, priv->rx_ring[id].mapping[0],
+IPOIB_UD_BUF_SIZE(priv->max_ib_mtu), DMA_FROM_DEVICE);
dev_kfree_skb_any(priv->rx_ring[id].skb);
priv->rx_ring[id].skb = NULL;
}
priv->rx_outst = 0;
 }
 
+static void ipoib_ud_skb_put_frags(struct

Re: [ewg] Oops with today's OFED 1.3

2008-02-04 Thread Shirley Ma
Eli,

Please look at this issue ASAP. Without your patch everything works
well.

Thanks
Shirley 



Re: [ewg] Re: [ofa-general] Please send all patches for OFED 1.3 rc4 by end of Monday (Feb 4)

2008-02-04 Thread Shirley Ma






Tziporet Koren [EMAIL PROTECTED] wrote on 02/04/2008 08:14:08 AM:

 OK - go ahead and regenerate the patch and we will be able to include it in RC4
 BTW - how did you test it with mthca? It does not support 4K MTU. You
 can test it with ConnectX since it does support 4K MTU (with a special
 burning configuration). Please let me know if you have ConnectX and you
 wish to test it with 4K MTU

 Tziporet

Thanks, Tziporet. I would like to test with ConnectX, but I can't test it right
now since the switch connected to the ConnectX is configured for 2K MTU and the
test team has other test tasks to finish. I can suggest that the test team
include a 4K MTU test as part of their system validation. Please send me the
instructions on how to enable it for ConnectX.

Thanks
Shirley

[ewg] [UPDATE][PATCH] IPoIB-UD 4K MTU patch against 2.6.24 ofed-1.3-git tree

2008-02-04 Thread Shirley Ma
I found that one line ended up outside the for loop when merging this patch
with the current git tree. This broke the UD_POST_RCV_COUNT = 16 case. I have
fixed it. This is the updated patch.
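
The message does not say which line had slipped out of the loop, so the following is a purely hypothetical sketch of this class of merge mistake. It shows why a batch size of 1 can hide such a problem while 16 exposes it. struct wr, setup_buggy(), setup_fixed(), and count_configured() are invented for illustration and do not come from the patch.

```c
#include <stddef.h>

/* Hypothetical, simplified work request. */
struct wr {
    int num_sge;
};

/* Buggy shape: the per-entry assignment sits after the loop,
 * so only entry 0 is ever configured. */
static void setup_buggy(struct wr *wrs, size_t count, int num_sge)
{
    size_t i;

    for (i = 0; i < count; i++)
        wrs[i].num_sge = 0;
    wrs[0].num_sge = num_sge;   /* meant to be inside the loop */
}

/* Fixed shape: every entry in the batch is configured. */
static void setup_fixed(struct wr *wrs, size_t count, int num_sge)
{
    size_t i;

    for (i = 0; i < count; i++)
        wrs[i].num_sge = num_sge;
}

/* Count how many entries carry the expected number of SG elements. */
static size_t count_configured(const struct wr *wrs, size_t count, int num_sge)
{
    size_t i, n = 0;

    for (i = 0; i < count; i++)
        if (wrs[i].num_sge == num_sge)
            n++;
    return n;
}
```

With a batch of one entry the buggy and fixed versions behave identically; with a batch of 16 the buggy version leaves 15 entries unconfigured, which would be consistent with the patch working only when the count is 1.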

Thanks
Shirley


Signed-off-by: Shirley Ma [EMAIL PROTECTED]
---

diff -urpN ofed_kernel_a/drivers/infiniband/ulp/ipoib/ipoib.h ofed_kernel_b/drivers/infiniband/ulp/ipoib/ipoib.h
--- ofed_kernel_a/drivers/infiniband/ulp/ipoib/ipoib.h  2008-02-04 20:09:18.0 -0800
+++ ofed_kernel_b/drivers/infiniband/ulp/ipoib/ipoib.h  2008-02-04 20:11:26.0 -0800
@@ -56,11 +56,11 @@
 /* constants */
 
 enum {
-   IPOIB_PACKET_SIZE = 2048,
-   IPOIB_BUF_SIZE= IPOIB_PACKET_SIZE + IB_GRH_BYTES,
-
IPOIB_ENCAP_LEN   = 4,
 
+   IPOIB_UD_HEAD_SIZE= IB_GRH_BYTES + IPOIB_ENCAP_LEN,
+   IPOIB_UD_RX_SG= 2, /* for 4K MTU */ 
+
IPOIB_CM_MTU  = 0x10000 - 0x10, /* padding to align header to 16 */
IPOIB_CM_BUF_SIZE = IPOIB_CM_MTU  + IPOIB_ENCAP_LEN,
IPOIB_CM_HEAD_SIZE= IPOIB_CM_BUF_SIZE % PAGE_SIZE,
@@ -141,9 +141,9 @@ struct ipoib_mcast {
struct net_device *dev;
 };
 
-struct ipoib_rx_buf {
+struct ipoib_sg_rx_buf {
struct sk_buff *skb;
-   u64 mapping;
+   u64 mapping[IPOIB_UD_RX_SG];
 };
 
 struct ipoib_tx_buf {
@@ -337,7 +337,7 @@ struct ipoib_dev_priv {
 
struct net_device  *dev;
struct ib_recv_wr   rx_wr_draft[UD_POST_RCV_COUNT];
-   struct ib_sge   sglist_draft[UD_POST_RCV_COUNT];
+   struct ib_sge   sglist_draft[UD_POST_RCV_COUNT][IPOIB_UD_RX_SG];
unsigned intrx_outst;
 
struct napi_struct napi;
@@ -378,7 +378,7 @@ struct ipoib_dev_priv {
unsigned int admin_mtu;
unsigned int mcast_mtu;
 
-   struct ipoib_rx_buf *rx_ring;
+   struct ipoib_sg_rx_buf *rx_ring;
 
spinlock_t   tx_lock;
struct ipoib_tx_buf *tx_ring;
@@ -412,6 +412,7 @@ struct ipoib_dev_priv {
struct ipoib_ethtool_st etool;
struct timer_list poll_timer;
struct ib_ah *own_ah;
+   int max_ib_mtu;
 };
 
 struct ipoib_ah {
@@ -452,6 +453,19 @@ struct ipoib_neigh {
struct list_headlist;
 };
 
+#define IPOIB_UD_MTU(ib_mtu)   (ib_mtu - IPOIB_ENCAP_LEN)
+#define IPOIB_UD_BUF_SIZE(ib_mtu)  (ib_mtu + IB_GRH_BYTES)
+static inline int ipoib_ud_need_sg(int ib_mtu)
+{
+   return (IPOIB_UD_BUF_SIZE(ib_mtu) > PAGE_SIZE) ? 1 : 0;
+}
+static inline void ipoib_sg_dma_unmap_rx(struct ipoib_dev_priv *priv,
+u64 mapping[IPOIB_UD_RX_SG])
+{
+   ib_dma_unmap_single(priv->ca, mapping[0], IPOIB_UD_HEAD_SIZE, DMA_FROM_DEVICE);
+   ib_dma_unmap_single(priv->ca, mapping[1], PAGE_SIZE, DMA_FROM_DEVICE);
+}
+
 /*
  * We stash a pointer to our private neighbour information after our
  * hardware address in neigh->ha.  The ALIGN() expression here makes
diff -urpN ofed_kernel_a/drivers/infiniband/ulp/ipoib/ipoib_ib.c ofed_kernel_b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
--- ofed_kernel_a/drivers/infiniband/ulp/ipoib/ipoib_ib.c   2008-02-04 20:09:18.0 -0800
+++ ofed_kernel_b/drivers/infiniband/ulp/ipoib/ipoib_ib.c   2008-02-04 20:11:26.0 -0800
@@ -96,14 +96,82 @@ static void clean_pending_receives(struc
 
for (i = 0; i < priv->rx_outst; ++i) {
id = priv->rx_wr_draft[i].wr_id & ~IPOIB_OP_RECV;
-   ib_dma_unmap_single(priv->ca, priv->rx_ring[id].mapping,
-IPOIB_BUF_SIZE, DMA_FROM_DEVICE);
+   if (ipoib_ud_need_sg(priv->max_ib_mtu))
+   ipoib_sg_dma_unmap_rx(priv,
+ priv->rx_ring[i].mapping);
+   else
+   ib_dma_unmap_single(priv->ca, priv->rx_ring[id].mapping[0],
+IPOIB_UD_BUF_SIZE(priv->max_ib_mtu), DMA_FROM_DEVICE);
dev_kfree_skb_any(priv->rx_ring[id].skb);
priv->rx_ring[id].skb = NULL;
}
priv->rx_outst = 0;
 }
 
+static void ipoib_ud_skb_put_frags(struct sk_buff *skb, unsigned int length,
+  struct sk_buff *toskb)
+{
+   unsigned int size;
+   skb_frag_t *frag = &skb_shinfo(skb)->frags[0];
+
+   /* put header into skb */
+   size = min(length, (unsigned)IPOIB_UD_HEAD_SIZE);
+   skb->tail += size;
+   skb->len += size;
+   length -= size;
+
+   if (length == 0) {
+   /* don't need this page */
+   skb_fill_page_desc(toskb, 0, frag->page, 0, PAGE_SIZE);
+   --skb_shinfo(skb)->nr_frags;
+   } else {
+   size = min(length, (unsigned) PAGE_SIZE);
+   frag->size = size;
+   skb->data_len += size;
+   skb->truesize += size;
+   skb->len += size;
+   length -= size;
+   }
+}
+ 
+static struct sk_buff *ipoib_sg_alloc_rx_skb

RE: [ewg] Re: [ofa-general] OFED Jan 28 meeting summary on RC3 readiness

2008-01-31 Thread Shirley Ma
On Wed, 2008-01-30 at 17:10 -0800, Woodruff, Robert J wrote:
 Tziporet wrote,
 * Delay 1.3 release in a week
 * Do RC4 next week - Feb 6
 * Add RC5 on Feb 18 - this will be the GOLD version
 * GA release on Feb 25
 
 
 All - please reply if this is acceptable
 
 I hate to keep slipping this, but I think it is important to get
 what RedHat needs into OFED 1.3, so I am not opposed to this.
 
 I think however that perhaps after 1.3, we should discuss our process
 a bit to try to get a little better at making our original
 release dates. I think we are getting hit with feature creep, allowing
 some pretty major changes after the feature freeze date, late in the
 release cycle.
 
 I also think that we do need to be a little more careful
 and selective about what features go into OFED, as it is supposed to be
 an enterprise release rather than an experimental code release.
 
 For the kernel code, I think that this means keeping things a little
 closer to the kernel.org kernel features and if something is not
 upstream, then press for getting it upstream (or at least queued for
 upstream) rather than allowing big patches into OFED that have not had a
 good review.
 The way we are working now, if it is getting into OFED, people are less
 aggressive at getting things upstream.
 
 Perhaps we can have a discussion about this at the Sonoma workshop.

In addition, we should talk about how to integrate patches that are queued
upstream but not in OFED, like IPoIB noSRQ. There is always a window
between an OFED release and a kernel release, and a window between a Distro
release and an OFED release. Some customers are targeted OFED release, some
customers are targeted OFED release. How to handle these windows to
meet different customers' requirements could be something to be
discussed at the Sonoma workshop as well.

Thanks
Shirley



RE: [ewg] Re: [ofa-general] OFED Jan 28 meeting summary on RC3 readiness

2008-01-31 Thread Shirley Ma

 In addition, we should talk about how to integrate patches that are queued
 upstream but not in OFED, like IPoIB noSRQ. There is always a window
 between an OFED release and a kernel release, and a window between a Distro
 release and an OFED release. Some customers are targeted OFED release, some
 customers are targeted OFED release. How to handle these windows to
 meet different customers' requirements could be something to be
 discussed at the Sonoma workshop as well.

Oops, a typo: I meant that some customers target Distro releases. From a
customer support point of view, it's always better to have OFED releases in
Distros.

Thanks
Shirley



Re: [ewg] Re: [ofa-general] Bonding and hw_csum

2008-01-31 Thread Shirley Ma
Hello Eli,

 ipoib_0030_hw_csum.patch has been removed

Would removing this patch cause any errors when applying the rest of the
patches? If not, I will remove it for our testing as well.

Thanks
Shirley



Re: [ewg] Re: non SRQ patch for OFED 1.3

2008-01-31 Thread Shirley Ma

 Pradeep,
 We tried to apply this patch for OFED 1.3 and it breaks some of the
 backports.
 Please use the makedist script on the ofa server (there is an
 explanation in the developers Wiki) and fix this so we can try to apply it.
 Vlad will help you later today too
 
 Thanks,
 Tziporet

Thanks, Tziporet and Vlad, for helping to get this into OFED-1.3. A long
time ago, Sean suggested comparing noSRQ and SRQ performance in a smaller
cluster environment. That's an interesting suggestion. We are planning to
do that comparison with OFED-1.3.

Thanks
Shirley



Re: [ewg] Re: [ofa-general] OFED Jan 28 meeting summary on RC3readiness

2008-01-31 Thread Shirley Ma
Thanks to everyone here. I appreciate your comments and effort. The
big challenge for us is how to sync features/blockers between OFED releases
and Distro releases. Most of our customers prefer Distro releases so they
can get the same level of support as for other pieces. If OFED could work
with the Distro releases, there would be fewer problems for both end users
and Distros. That's just my personal opinion.

We are here to support, in a timely manner, any issues found with these
patches during the OFED release cycle.

Thanks again!
Shirley 



Re: [ewg] Re: [ofa-general] OFED Jan 28 meeting summary on RC3 readiness

2008-01-30 Thread Shirley Ma





[EMAIL PROTECTED] wrote on 01/30/2008 08:40:10 AM:

 Doug Ledford wrote:
 
  Hmmm...I'd like to put my $.02 in here.  I don't have any visibility
  into what drives the OFED schedule, so I have no clue as to why people
  don't want to slip the schedule for this change.  I'm sure you guys have
  your reasons.  However, I also happen to be a consumer of this code, and
  I know for a fact that no one has gotten my input on this issue.  So,
  the deal is that I'm currently integrating OFED 1.3 into what will be
  RHEL5.2.  The RHEL5.2 freeze date has already passed, but in order to
  keep what finally goes out from being too stale, I'm being allowed to
  submit the OFED-1.3-rc1 code prior to freeze, and then update to
  OFED-1.3 final during our beta test process.  What this means, is that
  anything you punt from 1.3 to 1.3.1, you are also punting out of RHEL5.2
  and RHEL4.7.  So, that being said, there's a whole trickle down effect
  with various groups that would really like to be able to use 5.2 out of
  the box that may prefer a slip in 1.3 so that this can be part of it
  instead of punting to 1.3.1.  I'm not saying this will change your mind,
  but I'm sure it wasn't part of the decision process before, so I'm
  bringing it up.
 
 Thanks for the input (BTW you are welcome to join our weekly meetings
 and give us feedback online)
 I think it is important to make sure RH new versions will include best
 OFED release

 So my suggestion is:

 * Delay 1.3 release in a week
 * Do RC4 next week - Feb 6
 * Add RC5 on Feb 18 - this will be the GOLD version
 * GA release on Feb 25


 All - please reply if this is acceptable
 
 
  760 major   [EMAIL PROTECTED]  UDP performance on Rx is lower
  than Tx  - for 1.3.1
  761 major   [EMAIL PROTECTED]  Poor and jittery UDP
  performance at small messages  - for 1.3.1
 
 
  Ditto for requesting these two be in 1.3.  We've already had customers
  bring up the UDP performance issue in our previous releases.
 
 
 We will push some fixes of these to RC4 if the above plan is accepted

 Tziporet

Is it also possible to include some delayed features that are planned for
a later release? For features like IPoIB noSRQ and 4K MTU, we already have
some customer requests. IPoIB noSRQ is already upstream; it is not in
2.6.24, but it will be in 2.6.25. The 4K MTU patch is under review, and it
has passed our tests. I will post a new version against RC3, and split the
patch into several pieces for 2.6.25 upstream submission.

thanks
Shirley

[ewg] Re: [ofa-general] Bonding and hw_csum

2008-01-30 Thread Shirley Ma




Hello Tziporet,

 the hw checksum patch was removed from OFED 1.3

 Tziporet

Could you please specify which patch has been removed? I can still see a
list of patches under RC3. Here they are:

ipoib_0010_Add-high-dma-support-to-ipoib.patch
ipoib_0020_Add-s-g-support-for-IPOIB.patch
ipoib_0030_hw_csum.patch
ipoib_0040_checksum-offload.patch
ipoib_0050_Add-LSO-support.patch
ipoib_0060_ethtool-support.patch
ipoib_0070_modiy_cq_params.patch
ipoib_0080_broadcast_null.patch
ipoib_0110_set_default_cq_patams.patch

thanks
Shirley