Re: [ofa-general] [PATCH - 11] ipoib - add LSO support

2007-08-22 Thread Or Gerlitz

Eli Cohen wrote:


  Can you provide a pointer to some Mellanox high-level doc/FAQ explaining
  the possible benefits of the stateless offloading features of ConnectX?

The benefits of the stateless offload features can best be described by
numbers. I have been able to push throughput in datagram mode to 814
MB/s. Details of the offload features are in the ConnectX PRM.


Hi Eli,

Thanks for the feedback; the 814 MB/s number is quite impressive.

Dror/Eitan -

The question here is in what scenario this code would be used, taking
into account that connected mode is now available and can provide even
higher throughput. My thinking was that this --can-- be helpful when
there is UDP multicast traffic of big packets, but Eli says you don't
support that (UDP UFO).


Again, I would be happy to get from Mellanox a pointer to a document
which is not under NDA and describes the feature (e.g. marketing material).


Or.



Re: [ofa-general] [PATCH - 11] ipoib - add LSO support

2007-08-21 Thread Or Gerlitz

Eli Cohen wrote:

On Mon, 2007-08-20 at 18:48 +0300, Or Gerlitz wrote:

I see that the patch adds the NETIF_F_TSO flag to the device features 
but not the NETIF_F_UFO flag. Is UDP LSO supported by the ConnectX HW? 
If yes, what would it take SW-wise to support it?


UDP LSO is not supported by the HW.


I see.


Reading http://en.wikipedia.org/wiki/TCP_segmentation_offload and 
thinking about the ConnectX TCP segmentation offload a little further, 
I am somewhat confused:


With IPoIB connected mode, a typical MTU exposed to the OS is 64K, 
where data goes over the IB RC (soon to be UC) transport, meaning that 
the HCA does the fragmentation / reassembly at the QP (IB L4) level, 
allowing it to send one 64K IB message over a path whose link-layer 
(IB L2) MTU is 2K.


So TCP can use segments of up to 64K, and UDP can send datagrams of up 
to 64K, without IP fragmentation.
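
As a back-of-the-envelope illustration of the above (the numbers and
the helper below are mine, for illustration only -- nothing here is
from the patch):

/*
 * Illustrative arithmetic only: how many IB link-layer packets the HCA
 * emits for one RC message in connected mode.  CM_MTU of 65520 is the
 * usual IPoIB-CM MTU; PATH_MTU of 2048 is an assumed IB L2 path MTU.
 */
enum {
        CM_MTU   = 65520,
        PATH_MTU = 2048,
};

static unsigned int rc_link_packets(unsigned int msg_bytes)
{
        /* ceil(msg_bytes / PATH_MTU): one 64K message -> 32 link
         * packets, all produced by the HCA with no per-packet SW work */
        return (msg_bytes + PATH_MTU - 1) / PATH_MTU;
}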


With all this at hand, what does LSO buy you at all?

Or.



Re: [ofa-general] [PATCH - 11] ipoib - add LSO support

2007-08-21 Thread Eli Cohen
On Tue, 2007-08-21 at 12:51 +0300, Or Gerlitz wrote:
 Reading http://en.wikipedia.org/wiki/TCP_segmentation_offload and 
 thinking about the ConnectX TCP segmentation offload a little further, 
 I am somewhat confused:
 
 With IPoIB connected mode, a typical MTU exposed to the OS is 64K, 
 where data goes over the IB RC (soon to be UC) transport, meaning that 
 the HCA does the fragmentation / reassembly at the QP (IB L4) level, 
 allowing it to send one 64K IB message over a path whose link-layer 
 (IB L2) MTU is 2K.
 
 So TCP can use segments of up to 64K, and UDP can send datagrams of up 
 to 64K, without IP fragmentation.
 
 With all this at hand, what does LSO buy you at all?
 

You have the option to work in either connected or datagram mode, and
some might prefer to use datagram mode for various reasons. In these
cases you will want the benefits that LSO provides.
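
To make the datagram-mode benefit concrete, here is a rough sketch of
where the saving comes from (the numbers and the helper are mine, for
illustration -- not from the patch):

/*
 * Illustrative only: send-side work for one 64K TCP payload in
 * datagram mode, ignoring IP/TCP header overhead.  2044 is the
 * classic IPoIB UD MTU (2048-byte IB MTU minus the 4-byte IPoIB
 * encapsulation header).
 */
enum { IPOIB_UD_MTU = 2044 };

struct tx_cost {
        unsigned int wrs;       /* work requests posted */
        unsigned int cqes;      /* completions to process */
};

static struct tx_cost tx_cost_of(unsigned int bytes, int lso)
{
        unsigned int segs = (bytes + IPOIB_UD_MTU - 1) / IPOIB_UD_MTU;
        struct tx_cost c;

        /* without LSO: one WR and one CQE per MTU-sized packet (33 for
         * 64K); with LSO the HW segments, so one WR and one CQE total */
        c.wrs  = lso ? 1 : segs;
        c.cqes = c.wrs;
        return c;
}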


Re: [ofa-general] [PATCH - 11] ipoib - add LSO support

2007-08-21 Thread Eli Cohen

 Can you provide a pointer to some Mellanox high-level doc/FAQ explaining 
 the possible benefits of the stateless offloading features of ConnectX?

The benefits of the stateless offload features can best be described by
numbers. I have been able to push throughput in datagram mode to 814
MB/s. Details of the offload features are in the ConnectX PRM.


Re: [ofa-general] [PATCH - 11] ipoib - add LSO support

2007-08-20 Thread Or Gerlitz

Eli Cohen wrote:

Add LSO support to ipoib

Using LSO improves performance by allowing the software not to
fragment the payload into MTU-sized packets; it also results in a
lower rate of interrupts, since each such work request has just one
CQE.


Hi Eli, Dror,

I see that the patch adds the NETIF_F_TSO flag to the device features 
but not the NETIF_F_UFO flag. Is UDP LSO supported by the ConnectX HW? 
If yes, what would it take SW-wise to support it?


Or.
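
For reference, a minimal sketch of what the SW side might look like if
the HW did support UFO. This is hypothetical only -- as Eli notes
elsewhere in the thread, ConnectX does not support it -- and the helper
names are mine:

#include <linux/netdevice.h>
#include <linux/skbuff.h>

/* Hypothetical: the two SW touch points UDP LSO (UFO) would need. */
static void advertise_ufo(struct net_device *dev)
{
        /* 1. advertise the feature alongside SG/checksum offload */
        dev->features |= NETIF_F_UFO;
}

static int is_udp_lso(struct sk_buff *skb)
{
        /* 2. in the xmit path, steer UDP GSO skbs to a UFO-aware send */
        return skb_is_gso(skb) &&
               (skb_shinfo(skb)->gso_type & SKB_GSO_UDP);
}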



Re: [ofa-general] [PATCH - 11] ipoib - add LSO support

2007-08-20 Thread Eli Cohen
On Mon, 2007-08-20 at 18:48 +0300, Or Gerlitz wrote:

 I see that the patch adds the NETIF_F_TSO flag to the device features 
 but not the NETIF_F_UFO flag. Is UDP LSO supported by the ConnectX HW? 
 If yes, what would it take SW-wise to support it?

UDP LSO is not supported by the HW.


[ofa-general] [PATCH - 11] ipoib - add LSO support

2007-08-15 Thread Eli Cohen
Add LSO support to ipoib

Using LSO improves performance by allowing the software not to
fragment the payload into MTU-sized packets; it also results in a
lower rate of interrupts, since each such work request has just one
CQE.

Signed-off-by: Eli Cohen [EMAIL PROTECTED]

---
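
For context: a minimal sketch, assuming a 2.6.23-era kernel, of the
per-skb GSO metadata that a send path such as the ipoib_send_gso()
added below would consult. The struct and helper names here are
illustrative, not part of the patch:

#include <linux/skbuff.h>
#include <linux/tcp.h>

/* Illustrative: the fields needed to build one LSO work request. */
struct lso_params {
        unsigned int mss;       /* payload bytes per wire segment */
        unsigned int hlen;      /* header bytes the HW replicates */
        unsigned int nsegs;     /* segments the HW will emit */
};

static void fill_lso_params(struct sk_buff *skb, struct lso_params *p)
{
        /* gso_size is the MSS requested by the stack */
        p->mss   = skb_shinfo(skb)->gso_size;
        /* headers end after the TCP header for SKB_GSO_TCPV4 skbs */
        p->hlen  = skb_transport_offset(skb) + tcp_hdrlen(skb);
        /* the stack precomputes the resulting segment count */
        p->nsegs = skb_shinfo(skb)->gso_segs;
}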

Index: linux-2.6.23-rc1/drivers/infiniband/ulp/ipoib/ipoib_main.c
===================================================================
--- linux-2.6.23-rc1.orig/drivers/infiniband/ulp/ipoib/ipoib_main.c	2007-08-15 20:50:33.0 +0300
+++ linux-2.6.23-rc1/drivers/infiniband/ulp/ipoib/ipoib_main.c	2007-08-15 20:50:38.0 +0300
@@ -704,7 +704,13 @@ static int ipoib_start_xmit(struct sk_bu
 			goto out;
 		}
 
-		ipoib_send(dev, skb, neigh->ah, IPOIB_QPN(skb->dst->neighbour->ha));
+		if (skb_is_gso(skb))
+			ipoib_send_gso(dev, skb, neigh->ah,
+				       IPOIB_QPN(skb->dst->neighbour->ha));
+		else
+			ipoib_send(dev, skb, neigh->ah,
+				   IPOIB_QPN(skb->dst->neighbour->ha));
+
 		goto out;
 	}
 
@@ -1152,6 +1158,10 @@ static struct net_device *ipoib_add_port
 		goto event_failed;
 	}
 
+	if (priv->dev->features & NETIF_F_SG)
+		if (priv->ca->flags & IB_DEVICE_TCP_GSO)
+			priv->dev->features |= NETIF_F_TSO;
+
 	result = register_netdev(priv->dev);
 	if (result) {
 		printk(KERN_WARNING "%s: couldn't register ipoib port %d; error %d\n",
Index: linux-2.6.23-rc1/drivers/infiniband/ulp/ipoib/ipoib.h
===================================================================
--- linux-2.6.23-rc1.orig/drivers/infiniband/ulp/ipoib/ipoib.h	2007-08-15 20:50:33.0 +0300
+++ linux-2.6.23-rc1/drivers/infiniband/ulp/ipoib/ipoib.h	2007-08-15 20:50:38.0 +0300
@@ -373,6 +373,10 @@ int ipoib_add_pkey_attr(struct net_devic
 
 void ipoib_send(struct net_device *dev, struct sk_buff *skb,
struct ipoib_ah *address, u32 qpn);
+
+void ipoib_send_gso(struct net_device *dev, struct sk_buff *skb,
+   struct ipoib_ah *address, u32 qpn);
+
 void ipoib_reap_ah(struct work_struct *work);
 
 void ipoib_flush_paths(struct net_device *dev);
Index: linux-2.6.23-rc1/drivers/infiniband/ulp/ipoib/ipoib_ib.c
===================================================================
--- linux-2.6.23-rc1.orig/drivers/infiniband/ulp/ipoib/ipoib_ib.c	2007-08-15 20:50:33.0 +0300
+++ linux-2.6.23-rc1/drivers/infiniband/ulp/ipoib/ipoib_ib.c	2007-08-15 20:50:38.0 +0300
@@ -38,6 +38,7 @@
 #include <linux/delay.h>
 #include <linux/dma-mapping.h>
 #include <linux/ip.h>
+#include <linux/tcp.h>
 
 #include <rdma/ib_cache.h>
 
@@ -249,15 +250,24 @@ repost:
 }
 
 static int dma_unmap_list(struct ib_device *ca, struct ipoib_mapping_st *map,
-			  u16 n)
+			  u16 n, int gso)
 {
 	int i;
 	int len;
+	int first;
 
-	ib_dma_unmap_single(ca, map[0].addr, map[0].size, DMA_TO_DEVICE);
-	len = map[0].size;
+	if (!gso) {
+		ib_dma_unmap_single(ca, map[0].addr, map[0].size,
+				    DMA_TO_DEVICE);
+		len = map[0].size;
+		first = 1;
+	} else {
+		len = 0;
+		first = 0;
+	}
+
+	for (i = first; i < n; ++i) {
 
-	for (i = 1; i < n; ++i) {
 		ib_dma_unmap_page(ca, map[i].addr, map[i].size,
 				  DMA_TO_DEVICE);
 		len += map[i].size;
@@ -276,6 +286,7 @@ static void ipoib_ib_handle_tx_wc(struct
 	ipoib_dbg_data(priv, "send completion: id %d, status: %d\n",
 		       wr_id, wc->status);
 
+
 	if (unlikely(wr_id >= ipoib_sendq_size)) {
 		ipoib_warn(priv, "send completion event with wrid %d (> %d)\n",
 			   wr_id, ipoib_sendq_size);
@@ -283,8 +294,16 @@ static void ipoib_ib_handle_tx_wc(struct
 	}
 
 	tx_req = &priv->tx_ring[wr_id];
-	priv->stats.tx_bytes += dma_unmap_list(priv->ca, tx_req->mapping,
-					skb_shinfo(tx_req->skb)->nr_frags + 1);
+	if (skb_is_gso(tx_req->skb))
+		priv->stats.tx_bytes +=
+			dma_unmap_list(priv->ca, tx_req->mapping,
+				       skb_shinfo(tx_req->skb)->nr_frags, 1);
+	else
+		priv->stats.tx_bytes +=
+			dma_unmap_list(priv->ca, tx_req->mapping,
+				       skb_shinfo(tx_req->skb)->nr_frags + 1,
+				       0);
+
 	++priv->stats.tx_packets;
 
 	dev_kfree_skb_any(tx_req->skb);
@@ -367,7 +386,8 @@ void ipoib_ib_completion(struct ib_cq *c
 static inline int post_send(struct ipoib_dev_priv