Re: [ofa-general] [PATCH - 11] ipoib - add LSO support
Eli Cohen wrote:
>> Can you provide a pointer to some Mellanox high level doc/faq
>> explaining the possible benefits from the stateless offloading
>> features of connectX?
> The benefits of the stateless offload features can best be described
> by numbers. I have been able to push throughput in datagram mode to
> 814 MB/s. Details of the offload features are in the ConnectX PRM.

Hi Eli,

Thanks for the feedback; the 814 MB/s number is quite impressive.

Dror/Eitan - the question here is in what scenario this code would be
used, taking into account that connected mode is now available and can
provide even higher throughput. My thinking was that this can be
helpful when there is UDP multicast traffic of big packets, but Eli
says you don't support it (UDP UFO).

Again, I would be happy to get from Mellanox a pointer to a document
which is not under NDA and describes the feature (e.g. marketing
material).

Or.

___
general mailing list
general@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [ofa-general] [PATCH - 11] ipoib - add LSO support
Eli Cohen wrote:
> On Mon, 2007-08-20 at 18:48 +0300, Or Gerlitz wrote:
>> I see that the patch adds the NETIF_F_TSO flag to the device features
>> but not the NETIF_F_UFO flag. Is UDP LSO supported by the connectX HW?
>> If yes, what would it take SW-wise to support it?
> UDP LSO is not supported by the HW.

I see. Reading http://en.wikipedia.org/wiki/TCP_segmentation_offload
and thinking about the connectx TCP segmentation offloading a little
further, I am somewhat confused: with ipoib connected mode, a typical
MTU exposed to the OS is 64K, where data goes over the IB RC (soon to
be UC) transport, meaning that the HCA does the fragmentation /
reassembly at the QP (IB L4) level, allowing it to send one 64K IB
message over a path whose link-layer (IB L2) MTU is 2K. So TCP can use
segment sizes up to 64K and UDP can send datagrams whose size is up to
64K without IP fragmentation. With all this at hand, what does LSO buy
you at all?

Or.
Re: [ofa-general] [PATCH - 11] ipoib - add LSO support
On Tue, 2007-08-21 at 12:51 +0300, Or Gerlitz wrote:
> Reading http://en.wikipedia.org/wiki/TCP_segmentation_offload and
> thinking about the connectx TCP segmentation offloading a little
> further, I am somewhat confused: with ipoib connected mode, a typical
> MTU exposed to the OS is 64K, where data goes over the IB RC (soon to
> be UC) transport, meaning that the HCA does the fragmentation /
> reassembly at the QP (IB L4) level, allowing it to send one 64K IB
> message over a path whose link-layer (IB L2) MTU is 2K. So TCP can use
> segment sizes up to 64K and UDP can send datagrams whose size is up to
> 64K without IP fragmentation. With all this at hand, what does LSO buy
> you at all?

You have the option to work in either connected or datagram mode, and
some might prefer to use datagram mode for various reasons. In these
cases you will want the benefits that LSO provides.
Re: [ofa-general] [PATCH - 11] ipoib - add LSO support
> Can you provide a pointer to some Mellanox high level doc/faq
> explaining the possible benefits from the stateless offloading
> features of connectX?

The benefits of the stateless offload features can best be described
by numbers. I have been able to push throughput in datagram mode to
814 MB/s. Details of the offload features are in the ConnectX PRM.
Re: [ofa-general] [PATCH - 11] ipoib - add LSO support
Eli Cohen wrote:
> Add LSO support to ipoib
>
> Using LSO improves performance by allowing the software not to
> fragment the payload into MTU-sized packets, and also results in a
> lower rate of interrupts since each such work request has just one
> CQE.

Hi Eli, Dror,

I see that the patch adds the NETIF_F_TSO flag to the device features
but not the NETIF_F_UFO flag. Is UDP LSO supported by the connectX HW?
If yes, what would it take SW-wise to support it?

Or.
Re: [ofa-general] [PATCH - 11] ipoib - add LSO support
On Mon, 2007-08-20 at 18:48 +0300, Or Gerlitz wrote:
> I see that the patch adds the NETIF_F_TSO flag to the device features
> but not the NETIF_F_UFO flag. Is UDP LSO supported by the connectX HW?
> If yes, what would it take SW-wise to support it?

UDP LSO is not supported by the HW.
[ofa-general] [PATCH - 11] ipoib - add LSO support
Add LSO support to ipoib

Using LSO improves performance by allowing the software not to fragment
the payload into MTU-sized packets, and also results in a lower rate of
interrupts since each such work request has just one CQE.

Signed-off-by: Eli Cohen [EMAIL PROTECTED]
---

Index: linux-2.6.23-rc1/drivers/infiniband/ulp/ipoib/ipoib_main.c
===================================================================
--- linux-2.6.23-rc1.orig/drivers/infiniband/ulp/ipoib/ipoib_main.c	2007-08-15 20:50:33.000000000 +0300
+++ linux-2.6.23-rc1/drivers/infiniband/ulp/ipoib/ipoib_main.c	2007-08-15 20:50:38.000000000 +0300
@@ -704,7 +704,13 @@ static int ipoib_start_xmit(struct sk_bu
 			goto out;
 		}
 
-		ipoib_send(dev, skb, neigh->ah, IPOIB_QPN(skb->dst->neighbour->ha));
+		if (skb_is_gso(skb))
+			ipoib_send_gso(dev, skb, neigh->ah,
+				       IPOIB_QPN(skb->dst->neighbour->ha));
+		else
+			ipoib_send(dev, skb, neigh->ah,
+				   IPOIB_QPN(skb->dst->neighbour->ha));
+
 		goto out;
 	}
 
@@ -1152,6 +1158,10 @@ static struct net_device *ipoib_add_port
 		goto event_failed;
 	}
 
+	if (priv->dev->features & NETIF_F_SG)
+		if (priv->ca->flags & IB_DEVICE_TCP_GSO)
+			priv->dev->features |= NETIF_F_TSO;
+
 	result = register_netdev(priv->dev);
 	if (result) {
 		printk(KERN_WARNING "%s: couldn't register ipoib port %d; error %d\n",

Index: linux-2.6.23-rc1/drivers/infiniband/ulp/ipoib/ipoib.h
===================================================================
--- linux-2.6.23-rc1.orig/drivers/infiniband/ulp/ipoib/ipoib.h	2007-08-15 20:50:33.000000000 +0300
+++ linux-2.6.23-rc1/drivers/infiniband/ulp/ipoib/ipoib.h	2007-08-15 20:50:38.000000000 +0300
@@ -373,6 +373,10 @@ int ipoib_add_pkey_attr(struct net_devic
 void ipoib_send(struct net_device *dev, struct sk_buff *skb,
 		struct ipoib_ah *address, u32 qpn);
+
+void ipoib_send_gso(struct net_device *dev, struct sk_buff *skb,
+		    struct ipoib_ah *address, u32 qpn);
+
 void ipoib_reap_ah(struct work_struct *work);
 
 void ipoib_flush_paths(struct net_device *dev);

Index: linux-2.6.23-rc1/drivers/infiniband/ulp/ipoib/ipoib_ib.c
===================================================================
--- linux-2.6.23-rc1.orig/drivers/infiniband/ulp/ipoib/ipoib_ib.c	2007-08-15 20:50:33.000000000 +0300
+++ linux-2.6.23-rc1/drivers/infiniband/ulp/ipoib/ipoib_ib.c	2007-08-15 20:50:38.000000000 +0300
@@ -38,6 +38,7 @@
 #include <linux/delay.h>
 #include <linux/dma-mapping.h>
 #include <linux/ip.h>
+#include <linux/tcp.h>
 
 #include <rdma/ib_cache.h>
 
@@ -249,15 +250,24 @@ repost:
 }
 
 static int dma_unmap_list(struct ib_device *ca, struct ipoib_mapping_st *map,
-			  u16 n)
+			  u16 n, int gso)
 {
 	int i;
 	int len;
+	int first;
 
-	ib_dma_unmap_single(ca, map[0].addr, map[0].size, DMA_TO_DEVICE);
-	len = map[0].size;
+	if (!gso) {
+		ib_dma_unmap_single(ca, map[0].addr, map[0].size,
+				    DMA_TO_DEVICE);
+		len = map[0].size;
+		first = 1;
+	} else {
+		len = 0;
+		first = 0;
+	}
+
+	for (i = first; i < n; ++i) {
-	for (i = 1; i < n; ++i) {
 		ib_dma_unmap_page(ca, map[i].addr, map[i].size, DMA_TO_DEVICE);
 		len += map[i].size;
 
@@ -276,6 +286,7 @@ static void ipoib_ib_handle_tx_wc(struct
 	ipoib_dbg_data(priv, "send completion: id %d, status: %d\n",
 		       wr_id, wc->status);
+
 	if (unlikely(wr_id >= ipoib_sendq_size)) {
 		ipoib_warn(priv, "send completion event with wrid %d (> %d)\n",
 			   wr_id, ipoib_sendq_size);
@@ -283,8 +294,16 @@ static void ipoib_ib_handle_tx_wc(struct
 	}
 
 	tx_req = &priv->tx_ring[wr_id];
 
-	priv->stats.tx_bytes += dma_unmap_list(priv->ca, tx_req->mapping,
-					       skb_shinfo(tx_req->skb)->nr_frags + 1);
+	if (skb_is_gso(tx_req->skb))
+		priv->stats.tx_bytes +=
+			dma_unmap_list(priv->ca, tx_req->mapping,
+				       skb_shinfo(tx_req->skb)->nr_frags, 1);
+	else
+		priv->stats.tx_bytes +=
+			dma_unmap_list(priv->ca, tx_req->mapping,
+				       skb_shinfo(tx_req->skb)->nr_frags + 1,
+				       0);
+
 	++priv->stats.tx_packets;
 
 	dev_kfree_skb_any(tx_req->skb);
 
@@ -367,7 +386,8 @@ void ipoib_ib_completion(struct ib_cq *c
 static inline int post_send(struct ipoib_dev_priv